Lecture Notes in Computer Science 12075

Programming Languages and Systems

29th European Symposium on Programming, ESOP 2020
Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020
Dublin, Ireland, April 25–30, 2020, Proceedings

Founding Editors
Gerhard Goos, Germany
Juris Hartmanis, USA

Editor
Peter Müller
ETH Zurich
Zurich, Switzerland
© The Editor(s) (if applicable) and The Author(s) 2020. This book is an open access publication.
Open Access This book is licensed under the terms of the Creative Commons Attribution 4.0 International
License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution
and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and
the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this book are included in the book’s Creative Commons license,
unless indicated otherwise in a credit line to the material. If material is not included in the book’s Creative
Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use,
you will need to obtain permission directly from the copyright holder.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are
believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors
give a warranty, expressed or implied, with respect to the material contained herein or for any errors or
omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
ETAPS Foreword
Welcome to the 23rd ETAPS! This was the first time that ETAPS took place in Ireland, in its beautiful capital Dublin.
ETAPS 2020 was the 23rd instance of the European Joint Conferences on Theory
and Practice of Software. ETAPS is an annual federated conference established in
1998, and consists of four conferences: ESOP, FASE, FoSSaCS, and TACAS. Each
conference has its own Program Committee (PC) and its own Steering Committee
(SC). The conferences cover various aspects of software systems, ranging from
theoretical computer science to foundations of programming language developments,
analysis tools, and formal approaches to software engineering. Organizing these
conferences in a coherent, highly synchronized conference program enables researchers
to participate in an exciting event, having the possibility to meet many colleagues
working in different directions in the field, and to easily attend talks of different
conferences. On the weekend before the main conference, numerous satellite
workshops took place that attracted many researchers from all over the globe. Also, for
the second time, an ETAPS Mentoring Workshop was organized. This workshop is
intended to help students early in the program with advice on research, career, and life
in the fields of computing that are covered by the ETAPS conference.
ETAPS 2020 received 424 submissions in total, 129 of which were accepted,
yielding an overall acceptance rate of 30.4%. I thank all the authors for their interest in
ETAPS, all the reviewers for their reviewing efforts, the PC members for their
contributions, and in particular the PC (co-)chairs for their hard work in running this
entire intensive process. Last but not least, my congratulations to all authors of the
accepted papers!
ETAPS 2020 featured the unifying invited speakers Scott Smolka (Stony Brook
University) and Jane Hillston (University of Edinburgh) and the conference-specific
invited speakers (ESOP) Işıl Dillig (University of Texas at Austin) and (FASE) Willem
Visser (Stellenbosch University). Invited tutorials were provided by Erika Ábrahám
(RWTH Aachen University) on the analysis of hybrid systems and Madhusudan
Parthasarathy (University of Illinois at Urbana-Champaign) on combining Machine
Learning and Formal Methods. On behalf of the ETAPS 2020 attendants, I thank all the
speakers for their inspiring and interesting talks!
ETAPS 2020 took place in Dublin, Ireland, and was organized by the University of
Limerick and Lero. ETAPS 2020 was further supported by the following associations and
societies: ETAPS e.V., EATCS (European Association for Theoretical Computer
Science), EAPLS (European Association for Programming Languages and Systems),
and EASST (European Association of Software Science and Technology). The local
organization team consisted of Tiziana Margaria (general chair, UL and Lero),
Vasileios Koutavas (Lero@UCD), Anila Mjeda (Lero@UL), Anthony Ventresque
(Lero@UCD), and Petros Stratis (Easy Conferences).
Program Committee
Elvira Albert Universidad Complutense de Madrid, Spain
Sophia Drossopoulou Imperial College London, UK
Jean-Christophe Filliatre LRI, CNRS, France
Arie Gurfinkel University of Waterloo, Canada
Jan Hoffmann Carnegie Mellon University, USA
Ranjit Jhala University of California at San Diego, USA
Woosuk Lee Hanyang University, South Korea
Rustan Leino Amazon Web Services, USA
Rupak Majumdar MPI-SWS, Germany
Roland Meyer Technische Universität Braunschweig, Germany
Antoine Miné LIP6, Sorbonne Université, France
Sasa Misailovic University of Illinois at Urbana-Champaign, USA
Toby Murray University of Melbourne, Australia
Peter Müller ETH Zurich, Switzerland
David Naumann Stevens Institute of Technology, USA
Zvonimir Rakamaric University of Utah, USA
Francesco Ranzato University of Padova, Italy
Sukyoung Ryu KAIST, South Korea
Ilya Sergey Yale-NUS College and National University of Singapore, Singapore
Alexandra Silva University College London, UK
Nikhil Swamy Microsoft Research, USA
Sam Tobin-Hochstadt Indiana University Bloomington, USA
Caterina Urban Inria Paris, France
Viktor Vafeiadis MPI-SWS, Germany
Işıl Dillig (ESOP 2020 Invited Speaker)
Many database applications undergo significant schema changes during their life cycle
due to performance or maintainability reasons. Examples of such schema changes
include denormalization, splitting a single table into multiple tables, and consolidating
multiple tables into a single table. Even though such schema refactorings are quite
common in practice, programmers need to spend significant time and effort to
re-implement parts of the code base that are affected by the schema change. Further-
more, it is not uncommon to introduce bugs during this code transformation process.
In this talk, I will present our recent work on using formal methods to simplify the
schema refactoring process for evolving database applications. Specifically, I will first
propose a definition of equivalence between database applications that operate over
different schemas. Building on this definition, I will then present a fully automated
technique for proving equivalence between a pair of applications. Our verification
technique is capable of automatically synthesizing bisimulation invariants between two
database applications and uses the inferred bisimulation invariant to automatically
prove equivalence.
In the next part of the talk, I will explain how to leverage this verification technique
to completely automate the code migration process. Specifically, given an original
database application P over schema S and a new schema S′, I will discuss a practical
program synthesis technique that can be used to generate a new program P′ over
schema S′ such that P and P′ are provably equivalent. In particular, I will first present a
method for generating a program sketch of the new version; then, I will describe a
novel synthesis algorithm that efficiently explores the space of all programs that are in
the search space of the generated sketch.
Finally, I will describe experimental results on a suite of schema refactoring
benchmarks, including real-world database applications written in Ruby-on-Rails.
I will also outline remaining challenges in this area and motivate future research
directions relevant to research in programming languages and formal methods.
Contents
Runners in Action
Danel Ahman and Andrej Bauer
Trace-Relating Compiler Correctness and Secure Compilation

C. Abate et al.

Abstract. Compiler correctness is, in its simplest form, defined as the inclusion
of the set of traces of the compiled program into the set of traces of the origi-
nal program, which is equivalent to the preservation of all trace properties. Here
traces collect, for instance, the externally observable events of each execution.
This definition requires, however, the set of traces of the source and target lan-
guages to be exactly the same, which is not the case when the languages are far
apart or when observations are fine-grained. To overcome this issue, we study a
generalized compiler correctness definition, which uses source and target traces
drawn from potentially different sets and connected by an arbitrary relation. We
set out to understand what guarantees this generalized compiler correctness defi-
nition gives us when instantiated with a non-trivial relation on traces. When this
trace relation is not equality, it is no longer possible to preserve the trace prop-
erties of the source program unchanged. Instead, we provide a generic charac-
terization of the target trace property ensured by correctly compiling a program
that satisfies a given source property, and dually, of the source trace property one
is required to show in order to obtain a certain target property for the compiled
code. We show that this view on compiler correctness can naturally account for
undefined behavior, resource exhaustion, different source and target values, side-
channels, and various abstraction mismatches. Finally, we show that the same
generalization also applies to many secure compilation definitions, which char-
acterize the protection of a compiled program against linked adversarial code.
1 Introduction
Compiler correctness is an old idea [37, 40, 41] that has seen a significant revival in re-
cent times. This new wave was started by the creation of the CompCert verified C com-
piler [33] and continued by the proposal of many significant extensions and variants of
CompCert [8, 9, 12, 23, 29, 30, 42, 52, 56, 57, 61] and the success of many other mile-
stone compiler verification projects, including Vellvm [64], Pilsner [45], CakeML [58],
CertiCoq [4], etc. Yet, even for these verified compilers, the precise statement of cor-
rectness matters. Since proof assistants are used to conduct the verification, an external
observer does not have to understand the proofs in order to trust them, but one still has
to deeply understand the statement that was proved. And this is true not just for correct
compilation, but also for secure compilation, which is the more recent idea that our
compilation chains should do more to also ensure security of our programs [3, 26].
Basic Compiler Correctness. The gold standard for compiler correctness is semantic
preservation, which intuitively says that the semantics of a compiled program (in the
target language) is compatible with the semantics of the original program (in the source
language). For practical verified compilers, such as CompCert [33] and CakeML [58],
semantic preservation is stated extrinsically, by referring to traces. In these two settings,
a trace is an ordered sequence of events—such as inputs from and outputs to an external
environment—that are produced by the execution of a program.
A basic definition of compiler correctness can be given by the set inclusion of the
traces of the compiled program into the traces of the original program. Formally [33]:

CC= ≡ ∀W. ∀t. W↓ ⇝ t ⇒ W ⇝ t

This definition says that for any whole¹ source program W, if we compile it (denoted
W↓), execute it with respect to the semantics of the target language, and observe a trace
t, then the original W can produce the same trace t with respect to the semantics of
the source language.2 This definition is simple and easy to understand, since it only
references a few familiar concepts: a compiler between a source and a target language,
each equipped with a trace-producing semantics (usually nondeterministic).
Beyond Basic Compiler Correctness. This basic compiler correctness definition as-
sumes that any trace produced by a compiled program can be produced by the source
program. This is a very strict requirement, and in particular implies that the source and
target traces are drawn from the same set and that the same source trace corresponds
to a given target trace. These assumptions are often too strong, and hence in practice
verified compiler efforts use different formulations of compiler correctness:
CompCert [33] The original compiler correctness theorem of CompCert [33] can be
seen as an instance of basic compiler correctness, but it does not provide any guar-
antees for programs that can exhibit undefined behavior [53]. As allowed by the
C standard, such unsafe programs are not even considered to be in the source lan-
guage, so are not quantified over. This has important practical implications, since
undefined behavior often leads to exploitable security vulnerabilities [13, 24, 25]
and serious confusion even among experienced C and C++ developers [32, 53, 59,
60]. As such, since 2010, CompCert provides an additional top-level correctness
theorem3 that better accounts for the presence of unsafe programs by providing
guarantees for them up to the point when they encounter undefined behavior [53].
This new theorem goes beyond the basic correctness definition above, as a target
trace need only correspond to a source trace up to the occurrence of undefined
behavior in the source trace.
CakeML [58] Compiler correctness for CakeML accounts for memory exhaustion in
target executions. Crucially, memory exhaustion events cannot occur in source
traces, only in target traces. Hence, dually to CompCert, compiler correctness only
requires source and target traces to coincide up to the occurrence of a memory
exhaustion event in the target trace.
¹ For simplicity, for now we ignore separate compilation and linking, returning to it in §5.
² Typesetting convention [47]: we use a blue, sans-serif font for source elements, an orange, bold font for target ones, and a black, italic font for elements common to both languages.
³ Stated at the top of the CompCert file driver/Complements.v and discussed by Regehr [53].
Reasoning About Trace Properties. To understand more about a particular CC∼ instance, we propose to also look at how it preserves trace properties—defined as sets of allowed traces [31]—from the source to the target. For instance, it is well known that CC= is equivalent to the preservation of all trace properties (where W |= π reads "W satisfies π" and stands for ∀t. W ⇝ t ⇒ t ∈ π):

CC= ≡ ∀π ∈ 2^Trace. ∀W. W |= π ⇒ W↓ |= π.
However, to the best of our knowledge, similar results have not been formulated for
trace relations beyond equality, when it is no longer possible to preserve the trace prop-
erties of the source program unchanged. For trace-relating compiler correctness, where
source and target traces can be drawn from different sets and related by an arbitrary
trace relation, there are two crucial questions to ask:
1. For a source trace property πS of a program—established for instance by formal
verification—what is the strongest target property that any CC∼ compiler is guar-
anteed to ensure for the produced target program?
2. For a target trace property πT , what is the weakest source property we need to show
of the original source program to obtain πT for the result of any CC∼ compiler?
Far from being mere hypothetical questions, they can help the developer of a verified
compiler to better understand the compiler correctness theorem they are proving, and
we expect that any user of such a compiler will need to ask either one or the other if they
are to make use of that theorem. In this work we provide a simple and natural answer to
these questions, for any instance of CC∼ . Building upon a bijection between relations
and Galois connections [5, 20, 43], we observe that any trace relation ∼ corresponds
to two property mappings τ̃ and σ̃, which are functions mapping source properties to
target ones (τ̃ standing for “to target”) and target properties to source ones (σ̃ standing
for “to source”):
τ̃ (πS ) = {t | ∃s. s ∼ t ∧ s ∈ πS } ; σ̃(πT ) = {s | ∀t. s ∼ t ⇒ t ∈ πT } .
The existential image of ∼, τ̃ , answers the first question above by mapping a given
source property πS to the target property that contains all target traces for which there
exists a related source trace that satisfies πS . Dually, the universal image of ∼, σ̃, an-
swers the second question by mapping a given target property πT to the source property
that contains all source traces for which all related target traces satisfy πT . We intro-
duce two new correct compilation definitions in terms of trace property preservation
(TP): TPτ̃ quantifies over all source trace properties and uses τ̃ to obtain the corre-
sponding target properties. TPσ̃ quantifies over all target trace properties and uses σ̃
to obtain the corresponding source properties. We prove that these two definitions are
equivalent to CC∼ , yielding a novel trinitarian view of compiler correctness (Figure 1).
CC∼  ⇐⇒  TPσ̃ ≡ ∀πT. ∀W. W |= σ̃(πT) ⇒ W↓ |= πT  ⇐⇒  TPτ̃ ≡ ∀πS. ∀W. W |= πS ⇒ W↓ |= τ̃(πS)

Fig. 1: The equivalent compiler correctness definitions forming our trinitarian view.
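To make the two mappings concrete, they can be transcribed directly in Coq, representing a trace property as a function Trace→Prop as in our mechanization; the following is an illustrative sketch, not an excerpt from the accompanying development, and all names (PropertyMappings, tau, sigma, subset) are ours:

  Section PropertyMappings.
    Variables TraceS TraceT : Type.
    (* an arbitrary trace relation between source and target traces *)
    Variable rel : TraceS -> TraceT -> Prop.

    Definition propS := TraceS -> Prop.
    Definition propT := TraceT -> Prop.

    (* existential image: target traces with some related source trace in piS *)
    Definition tau (piS : propS) : propT :=
      fun t => exists s, rel s t /\ piS s.

    (* universal image: source traces all of whose related target traces are in piT *)
    Definition sigma (piT : propT) : propS :=
      fun s => forall t, rel s t -> piT t.

    Definition subset {X : Type} (p q : X -> Prop) : Prop := forall x, p x -> q x.

    (* the two mappings form a Galois connection w.r.t. set inclusion *)
    Lemma tau_sigma_adjunction : forall piS piT,
      subset (tau piS) piT <-> subset piS (sigma piT).
    Proof. unfold subset, tau, sigma; firstorder. Qed.

  End PropertyMappings.

The closing lemma states that τ̃ and σ̃ form a Galois connection with respect to set inclusion, which is precisely the bijection with Galois connections alluded to above.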
Contributions.
We propose a new trinitarian view of compiler correctness that accounts for non-trivial
trace relations. While, as discussed above, specific instances of the CC∼ definition have
already been used in practice, we seem to be the first to propose assessing the meaning-
fulness of CC∼ instances in terms of how properties are preserved between the source
and the target, and in particular by looking at the property mappings σ̃ and τ̃ induced
by the trace relation ∼. We prove that CC∼ , TPσ̃ , and TPτ̃ are equivalent for any
trace relation (§2.2), as illustrated in Figure 1. In the opposite direction, we show that
for every trace relation corresponding to a given Galois connection [20], an analogous
equivalence holds. Finally, we extend these results (§2.3) from the preservation of trace
properties to the larger class of subset-closed hyperproperties (e.g., noninterference).
We use CC∼ compilers of various complexities to illustrate that our view on com-
piler correctness naturally accounts for undefined behavior (§3.1), resource exhaustion
(§3.2), different source and target values (§3.3), and differences in the granularity of
data and observable events (§3.4). We expect these ideas to apply to any other discrep-
ancies between source and target traces. For each compiler we show how to choose
the relation between source and target traces and how the induced property mappings
preserve interesting trace properties and subset-closed hyperproperties. We look at the
way particular σ̃ and τ̃ work on different kinds of properties and how the produced
properties can be expressed for different kinds of traces.
We analyze the impact of correct compilation on noninterference [22], showing what
can still be preserved (and thus also what is lost) when target observations are finer than
source ones, e.g., side-channel observations (§4). We formalize the guarantee obtained
by correct compilation of a noninterfering program as abstract noninterference [21], a
weakening of target noninterference. Dually, we identify a family of declassifications
of target noninterference for which source reasoning is possible.
Finally, we show that the trinitarian view also extends to a large class of secure com-
pilation definitions [2], formally characterizing the protection of the compiled program
against linked adversarial code (§5). For each secure compilation definition we again
propose both a property-free characterization in the style of CC∼ , and two character-
izations in terms of preserving a class of source or target properties satisfied against
arbitrary adversarial contexts. The additional quantification over contexts allows for
finer distinctions when considering different property classes, so we study mapping
classes not only of trace properties and hyperproperties, but also of relational hyper-
properties [2]. An example secure compiler accounting for a target that can produce
additional trace events that are not possible in the source illustrates this approach.
The paper closes with discussions of related (§6) and future work (§7). An online ap-
pendix contains omitted technical details: https://arxiv.org/abs/1907.05320.
The traces considered in our examples are structured, usually as sequences of events.
We notice however that unless explicitly mentioned, all our definitions and results are
more general and make no assumption whatsoever about the structure of traces. Most
of the theorems formally or informally mentioned in the paper were mechanized in the
Coq proof assistant and are marked with a Coq symbol. This development has around 10k lines of
code, is described in the online appendix, and is available at the following address:
https://github.com/secure-compilation/different_traces.
Definition 2.1 (TPσ and TPτ). Given two property mappings, τ : 2^TraceS → 2^TraceT and σ : 2^TraceT → 2^TraceS, for a compilation chain ·↓ we define:

TPτ ≡ ∀πS ∈ 2^TraceS. ∀W. W |= πS ⇒ W↓ |= τ(πS);
TPσ ≡ ∀πT ∈ 2^TraceT. ∀W. W |= σ(πT) ⇒ W↓ |= πT.
Definition 2.2 (Galois connection). Let (X, ≤) and (Y, ⊑) be two posets. A pair of maps, α : X → Y, γ : Y → X, is a Galois connection iff it satisfies the adjunction law: ∀x ∈ X. ∀y ∈ Y. α(x) ⊑ y ⇐⇒ x ≤ γ(y). α (resp. γ) is the lower (resp. upper) adjoint, or abstraction (resp. concretization) function, and Y (resp. X) the abstract (resp. concrete) domain.
If two property mappings, τ and σ, form a Galois connection on trace properties ordered by set inclusion, Lemma 2.3 (with α = τ and γ = σ) tells us that they satisfy the ideal conditions we discussed above, i.e., τ(σ(πT)) ⊆ πT and σ(τ(πS)) ⊇ πS.⁴
The two ideal conditions on τ and σ are sufficient to show the equivalence of the
criteria they define, respectively TPτ and TPσ .
Theorem 2.4 (TPτ and TPσ coincide). Let τ : 2^TraceS ⇆ 2^TraceT : σ be a Galois connection, with τ and σ the lower and upper adjoints (resp.). Then TPτ ⇐⇒ TPσ.
2.2 Trace Relations and Property Mappings
We now investigate the relation between CC∼ , TPτ and TPσ . We show that for a trace
relation and its corresponding Galois connection (Lemma 2.7), the three criteria are
equivalent (Theorem 2.8). This equivalence offers interesting insights for both verifi-
cation and design of a correct compiler. For a CC∼ compiler, the equivalence makes
explicit both the guarantees one has after compilation (τ̃ ) and source proof obligations
to ensure the satisfaction of a given target property (σ̃). On the other hand, a compiler
designer might first determine the target guarantees the compiler itself must provide,
i.e., τ , and then prove an equivalent statement, CC∼ , for which more convenient proof
techniques exist in the literature [7, 58].
Definition 2.5 (Existential and Universal Image [20]). Given any two sets X and Y and a relation ∼ ⊆ X × Y, define its existential (or direct) image, τ̃ : 2^X → 2^Y, and its universal image, σ̃ : 2^Y → 2^X, as follows:

τ̃ = λπ ∈ 2^X. {y | ∃x. x ∼ y ∧ x ∈ π} ;  σ̃ = λπ ∈ 2^Y. {x | ∀y. x ∼ y ⇒ y ∈ π} .
When trace relations are considered, the existential and universal images can be used to
instantiate Definition 2.1 leading to the trinitarian view already mentioned in §1.
Theorem 2.6 (Trinitarian View). For any trace relation ∼ and its existential and universal images τ̃ and σ̃, we have: TPτ̃ ⇐⇒ CC∼ ⇐⇒ TPσ̃.

This result relies both on Theorem 2.4 and on the fact that the existential and universal images of a trace relation form a Galois connection. Below we further generalize
this result (Theorem 2.8) relying on a bijective correspondence between trace relations
and Galois connections on properties.
Lemma 2.7 (Trace relations ≅ Galois connections on trace properties). The function ∼ ↦ (τ̃, σ̃) that maps a trace relation to its existential and universal images is a bijection between trace relations 2^(TraceS×TraceT) and Galois connections on trace properties 2^TraceS ⇆ 2^TraceT. Its inverse is (τ, σ) ↦ ∼̂, where s ∼̂ t ≡ t ∈ τ({s}).
⁴ While target traces are often “more concrete” than source ones, trace properties 2^Trace (which in Coq we represent as the function type Trace→Prop) are contravariant in Trace and thus target properties correspond to the abstract domain.
Proof. Gardiner et al. [20] show that the existential image is a functor from the category of sets and relations to the category of predicate transformers, mapping a set X to 2^X and a relation ∼ ⊆ X × Y to τ̃ : 2^X → 2^Y. They also show that such a functor is an isomorphism (hence bijective) when one considers only monotonic predicate transformers that have a (unique) upper adjoint. The universal image of ∼, σ̃, is the unique upper adjoint of τ̃, hence ∼ ↦ (τ̃, σ̃) is itself bijective.
The bijection just introduced allows us to generalize Theorem 2.6 and switch between
the three views of compiler correctness described earlier at will.
Theorem 2.8 (Correspondence of Criteria). For any trace relation ∼ and corresponding Galois connection τ ⇆ σ, we have: TPτ ⇐⇒ CC∼ ⇐⇒ TPσ.

Proof. For a trace relation ∼ and the Galois connection τ̃ ⇆ σ̃, the result follows from Theorem 2.6. For a Galois connection τ ⇆ σ and ∼̂, use Lemma 2.7 to conclude that the existential and universal images of ∼̂ coincide with τ and σ, respectively; the goal then follows from Theorem 2.6.
We conclude by explicitly noting that sometimes the lifted properties may be trivial:
the target guarantee can be the true property (the set of all traces), or the source obli-
gation the false property (the empty set of traces). This might be the case when source
observations abstract away too much information (§3.2 presents an example).
Formally we are defining two new mappings, this time on hyperproperties, but by a
small abuse of notation we still denote them by τ and σ.
Interestingly, it is not possible to apply the argument used for CC= to show that a
CC∼ compiler guarantees W↓ |= τ̃ (HS ) whenever W |= HS . This is in fact not true
because direct images do not necessarily preserve subset-closure [36, 44]. To fix this
we close the image of τ̃ and σ̃ under subsets (denoted as Cl⊆ ) and obtain:
Theorem 2.11 (Preservation of Subset-Closed Hyperproperties). For any trace relation ∼ and its existential and universal images lifted to hyperproperties, τ̃ and σ̃, and for Cl⊆(H) = {π | ∃π′ ∈ H. π ⊆ π′}, we have:

SCHPCl⊆◦τ̃ ⇐⇒ CC∼ ⇐⇒ SCHPCl⊆◦σ̃ , where
SCHPCl⊆◦τ̃ ≡ ∀W. ∀HS ∈ SCHS. W |= HS ⇒ W↓ |= Cl⊆(τ̃(HS));
SCHPCl⊆◦σ̃ ≡ ∀W. ∀HT ∈ SCHT. W |= Cl⊆(σ̃(HT)) ⇒ W↓ |= HT.
Theorem 2.11 makes us aware of the potential loss of precision when interested in preserving subset-closed hyperproperties through compilation. In §4 we focus on a security-relevant subset-closed hyperproperty, noninterference, and show that such a loss of precision can be understood as a declassification of noninterference.
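The closure operator Cl⊆ used in Theorem 2.11 also admits a one-line transcription in the same Coq style as before; again a sketch with our own names:

  (* hyperproperties as sets of trace properties; Cl_subset closes a set under subsets *)
  Definition hprop (T : Type) : Type := (T -> Prop) -> Prop.

  Definition Cl_subset {T : Type} (H : hprop T) : hprop T :=
    fun pi => exists pi', H pi' /\ (forall t, pi t -> pi' t).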
We proved that the property mappings induced by the relation can be written as:

σ̃(πT) = {s | s ∈ πT ∧ ∀m. s ≠ m·Goes_wrong} ∪ {m·Goes_wrong | ∀t. m ≤ t ⇒ t ∈ πT} ;
τ̃(πS) = {t | t ∈ πS} ∪ {t | ∃m ≤ t. m·Goes_wrong ∈ πS} .
These two mappings explain what a CC∼ compiler ensures for the ∼ relation above. The
target-to-source mapping σ̃ states that to prove that a compiled program has a property
πT using source-level reasoning, one has to prove that any trace produced by the source
program must either be a target trace satisfying πT or have undefined behavior, but only
provided that any continuation of the trace substituted for the undefined behavior satis-
fies πT . The source-to-target mapping τ̃ states that by compiling a program satisfying
a property πS we obtain a program that produces traces that satisfy the same property
or that extend a source trace that ends in undefined behavior.
These definitions can help us reason about programs. For instance, σ̃ specifies that, to prove that an event does not happen in the target, it is not enough to prove that it does not happen in the source: it is also necessary to prove that the source program does not have any undefined behavior (second disjunct). Indeed, if it had an undefined behavior, its continuations could exhibit the unwanted event.
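For illustration, this trace relation can be transcribed in Coq. The sketch below is ours, not taken from the mechanization: it models traces as finite lists of events, and a source trace that goes wrong as a prefix followed by an explicit Goes_wrong marker (a simplification of the CompCert-style trace model):

  Require Import List. Import ListNotations.

  Section UndefinedBehavior.
    Variable event : Type.

    (* source events: ordinary events plus the Goes_wrong marker *)
    Inductive sevent : Type := SEv (e : event) | GoesWrong.

    Definition prefix {A : Type} (m t : list A) : Prop := exists k, t = m ++ k.

    (* s ~ t  iff  s equals t, or s is a prefix m of t followed by Goes_wrong *)
    Definition rel_ub (s : list sevent) (t : list event) : Prop :=
      s = map SEv t \/
      exists m, prefix m t /\ s = map SEv m ++ [GoesWrong].
  End UndefinedBehavior.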
This relation can be easily generalized to other settings. For instance, consider the
setting in which we compile down to a low-level language like machine code. Target
traces can now contain new events that cannot occur in the source: indeed, in modern
architectures like x86 a compiler typically uses only a fraction of the available instruc-
tion set. Some instructions might even perform dangerous operations, such as writing
to the hard drive. Formally, the source and target do not have the same events any more.
Thus, we consider a source alphabet ΣS = Σ ∪ {Goes_wrong}, and a target alphabet ΣT = Σ ∪ Σ′. The trace relation is defined in the same way and we obtain the same property mappings as above, except that target traces now have more events (some of which may be dangerous), and the arbitrary continuations of target traces get more interesting. For instance, consider a new event that represents writing data on the
hard drive, and suppose we want to prove that this event cannot happen for a compiled
program. Then, proving this property requires exactly proving that the source program
exhibits no undefined behavior [11]. More generally, what one can prove about target-
only events can only be either that they cannot appear (because there is no undefined
behavior) or that any of them can appear (in the case of undefined behavior).
In §5.2 we study a similar example, showing that even in a safe language linked ad-
versarial contexts can cause dangerous target events that have no source correspondent.
Example 3.2 (Resource Exhaustion). We consider traces made of events drawn from
ΣS in the source, and ΣT = ΣS ∪ {Resource_Limit_Hit} in the target. Recall the
trace relation for resource exhaustion:
s ∼ t ≡ s = t ∨ ∃m ≤ s. t = m · Resource_Limit_Hit.
Formally, this relation is similar to the one for undefined behavior, except this time it is
the target trace that is allowed to end early instead of the source trace.
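Dually, a Coq sketch of this relation in the same style as the previous one (again our own simplified transcription, with the early-terminating side now on the target):

  Require Import List. Import ListNotations.

  Section ResourceExhaustion.
    Variable event : Type.

    (* target events: ordinary events plus the resource-exhaustion marker *)
    Inductive tevent : Type := TEv (e : event) | ResourceLimitHit.

    Definition prefix {A : Type} (m s : list A) : Prop := exists k, s = m ++ k.

    (* s ~ t  iff  t equals s, or t is a prefix m of s cut short by the marker *)
    Definition rel_re (s : list event) (t : list tevent) : Prop :=
      t = map TEv s \/
      exists m, prefix m s /\ t = map TEv m ++ [ResourceLimitHit].
  End ResourceExhaustion.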
The source language has a standard big-step operational semantics (e ⇓ ⟨is, r⟩) which tells how an expression e generates a trace ⟨is, r⟩, a list of inputs followed by a result. The target language is analogous, except that it is untyped, only has naturals n, and its only inputs are naturals in_n. The semantics of the target language is also given in big-step style. Since we only have naturals and all expressions operate on them, no error result is possible in the target.
The compiler is homomorphic, translating a source expression to the same target expression; the only differences concern booleans, which are encoded as naturals, and conditionals, as noted below.
true↓ = 1    in_b↓ = in_n    (e1 ≤ e2)↓ = if e1↓ ≤ e2↓ then 1 else 0
false↓ = 0   in_n↓ = in_n    (if e1 then e2 else e3)↓ = if e1↓ ≤ 0 then e3↓ else e2↓
When compiling an if-then-else the target condition e1 ↓ ≤ 0 is used to check that e1 is
false, and therefore the then and else branches of the source are swapped in the target.
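A possible Coq rendering of this compiler, with inputs omitted for brevity; the syntax of both languages is our own reconstruction, and the target conditional TIfLe c1 c2 t e reads "if c1 ≤ c2 then t else e":

  Inductive sexp : Type :=
  | SBool (b : bool)
  | SNat (n : nat)
  | SLe (e1 e2 : sexp)
  | SIf (e1 e2 e3 : sexp).

  (* target: untyped expressions over naturals only *)
  Inductive texp : Type :=
  | TNat (n : nat)
  | TIfLe (c1 c2 t e : texp).

  Fixpoint compile (e : sexp) : texp :=
    match e with
    | SBool true   => TNat 1
    | SBool false  => TNat 0
    | SNat n       => TNat n
    | SLe e1 e2    => TIfLe (compile e1) (compile e2) (TNat 1) (TNat 0)
    | SIf e1 e2 e3 =>
        (* branch swap: test "e1 <= 0", i.e., e1 is false, so the else branch comes first *)
        TIfLe (compile e1) (TNat 0) (compile e3) (compile e2)
    end.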
Relating Traces. We relate basic values (naturals and booleans) in a non-injective fash-
ion as noted below. Then, we extend the relation to lists of inputs pointwise (Rules Empty
and Cons) and lift that relation to traces (Rules Nat and Bool).
n ∼ n        true ∼ n if n > 0        false ∼ 0

(Empty) ∅ ∼ ∅
(Cons) if i ∼ i′ and is ∼ is′ then i · is ∼ i′ · is′
(Nat) if is ∼ is′ then ⟨is, n⟩ ∼ ⟨is′, n⟩
(Bool) if is ∼ is′ and b ∼ n then ⟨is, b⟩ ∼ ⟨is′, n⟩
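The non-injective relation on basic values, and its pointwise lifting to input lists, can be transcribed as follows (a sketch; the names are ours):

  Require Import List.

  Inductive sval : Type := VBool (b : bool) | VNat (n : nat).

  (* n ~ n; true ~ any strictly positive n; false ~ 0 *)
  Inductive rel_val : sval -> nat -> Prop :=
  | RelNat   : forall n, rel_val (VNat n) n
  | RelTrue  : forall n, 0 < n -> rel_val (VBool true) n
  | RelFalse : rel_val (VBool false) 0.

  (* pointwise lifting to lists of inputs (Rules Empty and Cons) *)
  Definition rel_inputs : list sval -> list nat -> Prop := Forall2 rel_val.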
Property mappings. The property mappings σ̃ and τ̃ induced by the trace relation ∼
defined above capture the intuition behind encoding booleans as naturals:
– the source-to-target mapping allows true to be encoded by any non-zero number;
– the target-to-source mapping requires that 0 be replaceable by both 0 and false.
Compiler correctness. With the relation above, the compiler is proven to satisfy CC∼ .
Simulations with different traces. The difficulty in proving Theorem 3.3 arises from
the trace-relating compilation setting: For compilation chains that have the same source
and target traces, it is customary to prove compiler correctness using a forward simula-
tion (i.e., a simulation between source and target transition system); then, using deter-
minacy [18, 39] of the target language and input totality [19, 63] (aka receptiveness) of
the source, this forward simulation is flipped into a backward simulation (a simulation
between target and source transition systems), as described by Beringer et al. [7] and Leroy [34]. This flipping is useful because forward simulations are often much easier to prove (by induction on the transitions of the source) than backward ones, as is the case here.
We first give the main idea of the flipping proof, when the inputs are the same in
the source and the target [7, 34]. We only consider inputs, as it is the most interesting
case, since with determinacy, nondeterminism only occurs on inputs. Given a forward
simulation R, and a target program WT that simulates a source program WS , WT is
able to perform an input iff so is WS : otherwise, say for instance that WS performs an
output, by forward simulation WT would also perform an output, which is impossible
because of determinacy. By input totality of the source, WS must be able to perform
the exact same input as WT ; using forward simulation and determinacy, the resulting
programs must be related.
[Diagram: from WS R WT, if WT performs an input i1, then WS can perform the same input i1 (by input totality), and WS cannot perform a different input i2 (by contradiction, using determinacy).]
However, our trace relation is not injective (both 0 and false are mapped to 0),
therefore these arguments do not apply: not all possible inputs of target programs are
accounted for in the forward simulation. We thus have to strengthen the forward sim-
ulation assumption, requiring the following additional property to hold, for any source
program WS and target program WT related by the forward simulation R.
[Diagram (flippable simulation): given WS R WT, where WS performs input iS1 and WT performs input iT1 with iS1 ∼ iT1 and WS1 R WT1, for any other target input iT2 with iS1 ∼ iT2 there must exist iS2 and WS2 such that iS2 ∼ iT2 and WS2 R WT2.]
We say that a forward simulation for which this property holds is flippable. For our
example compiler, a flippable forward simulation works as follows: whenever a boolean
input occurs in the source, the target program must perform every strictly positive input
n (and not just 1, as suggested by the compiler). Using this property, determinacy of
the target, input totality of the source, as well as the fact that any target input has an
inverse image through the relation, we can indeed show that the forward simulation can
be turned into a backward one: starting from WS R WT and an input iT2 , we show
that there are iS1 and iT1 as in the diagram above, using the same arguments as when the
inputs are the same; because the simulation is flippable, we can close the diagram, and
obtain the existence of an adequate iS2 . From this we obtain CC∼ .
In fact, we have proven a completely general ‘flipping theorem’, with this flippable hypothesis on the forward simulation. We have also shown that if the relation ∼
defines a bijection between the inputs of the source and the target, then any forward
simulation is flippable, hence reobtaining the usual proof technique [7, 34] as a special
case. This flipping theorem is further discussed in the online appendix.
We now consider how to relate traces where a single source action is compiled to mul-
tiple target ones. To illustrate this, we take a pure, statically-typed source language that
can output (nested) pairs of arbitrary size, and a pure, untyped target language where
sent values have a fixed size. Concretely, the source is analogous to the language of §3.3,
except that it does not have inputs or booleans and it has an expression send e, which
can emit a (nested) pair e of values in a single action. That is, given that e reduces to a pair, e.g., ⟨v1, v2, v3⟩, expression send ⟨v1, v2, v3⟩ emits action ⟨v1, v2, v3⟩. That expression is compiled into a sequence of individual sends in the target language, send v1; send v2; send v3, since in the target, send e sends the value that e reduces to, but the language has no pairs.
Due to space constraints we omit the full formalization of these simple languages and of the homomorphic compiler. The only interesting bit is the compilation of the send · expression, which relies on the gensend (·) function below. That function takes a source expression of a given type and returns a sequence of target send · instructions that send each element of the expression.

gensend (e : N) = send e
gensend (e : τ1 × τ2) = gensend (e.1 : τ1); gensend (e.2 : τ2)
Relating Traces. We start with the trivial relation between numbers: n ∼0 n, i.e., num-
bers are related when they are the same. We cannot build a relation between single ac-
tions since a single source action is related to multiple target ones. Therefore, we define
a relation between a source action M and a target trace t (a list of numbers), inductively
on the structure of M (which is a pair of values, and values are natural numbers or pairs).
(Trace-Rel-N-N) if n ∼0 n′ and m ∼0 m′ then ⟨n, m⟩ ∼ n′ · m′
(Trace-Rel-N-M) if n ∼0 n′ and M ∼ t then ⟨n, M⟩ ∼ n′ · t
(Trace-Rel-M-N) if M ∼ t and n ∼0 n′ then ⟨M, n⟩ ∼ t · n′
(Trace-Rel-M-M) if M ∼ t and M′ ∼ t′ then ⟨M, M′⟩ ∼ t · t′

A pair of naturals is related to the two actions that send each element of the pair (Rule Trace-Rel-N-N). If a pair is made of sub-pairs, we require all such sub-pairs to be related (Rules Trace-Rel-N-M to Trace-Rel-M-M). We build on these rules to define the s ∼ t relation between source and target traces for which the compiler is correct (Theorem 3.4). Trivially, traces are related when they are both empty. Alternatively, given related traces, we can concatenate a source action and a second target trace provided that they are related (Rule Trace-Rel-Single):

(Trace-Rel-Single) if s ∼ t and M ∼ t′ then s · M ∼ t · t′
Theorem 3.4 (Correctness). The compiler of this section is CC∼.
With our trace relation, the trace property mappings capture the following intuitions:
– The target-to-source mapping states that a source property can reconstruct target actions as it sees fit. For example, trace 4 · 6 · 5 · 7 is related to ⟨4, 6⟩ · ⟨5, 7⟩ and ⟨4, 6, 5, 7⟩ (and many more variations). This gives freedom to the source implementation of a target behavior, which follows from the non-injectivity of ∼.⁵
– The source-to-target mapping "forgets" about the way pairs are nested, but is faithful w.r.t. the values vi contained in a message. Notice that source safety properties are always mapped to target safety properties. For instance, if πS ∈ SafetyS prescribes that some bad number is never sent, then τ̃(πS) prescribes that the same number is never sent in the target, and τ̃(πS) ∈ SafetyT. Of course, if πS ∈ SafetyS prescribes that a particular nested pairing like ⟨4, 6, 5, 7⟩ never happens, then τ̃(πS) is still a target safety property, but the trivial one, since τ̃(πS) = ⊤ ∈ SafetyT.
In a scenario where target observations are strictly more informative than source observa-
tions, the best guarantee one may expect from a correct trace-relating compiler (CC∼ )
is a weakening (or declassification) of target noninterference that matches the noninter-
ference property satisfied in the source. To formalize this reasoning, this section applies
the trinitarian view of trace-relating compilation to the general framework of abstract
noninterference (ANI) [21].
We first define NI and explain the issue of preserving source NI via a CC∼ compiler.
We then introduce ANI, which allows characterizations of various forms of noninterfer-
ence, and formulate a general theory of ANI preservation via CC∼ . We also study how
to deal with cases such as undefined behavior in the target. Finally, we answer the dual
question, i.e., which source NI should be satisfied to guarantee that compiled programs
are noninterfering with respect to target observers.
Intuitively, NI requires that publicly observable outputs do not reveal information about private inputs. To define this formally, we need a few additions to our setup. We indicate the (disjoint) input and output projections of a trace t as t◦ and t•, respectively.⁶ Denote with [t]low the equivalence class of a trace t, obtained using a standard low-equivalence relation that relates low (public) events only if they are equal, and ignores any difference between private events. Then, NI for source traces can be defined as:

NIS = {πS | ∀s1 s2 ∈ πS. [s1◦]low = [s2◦]low ⇒ [s1•]low = [s2•]low} .

That is, source NI comprises the sets of traces that have equivalent low output projections as long as their low input projections are equivalent.
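This definition can be transcribed as a predicate on trace properties; in the following Coq sketch (ours), the projections and the low-equivalences are kept abstract:

  Section Noninterference.
    Variables Trace Inp Outp : Type.
    Variable inputs  : Trace -> Inp.   (* the input projection t° *)
    Variable outputs : Trace -> Outp.  (* the output projection t• *)
    Variable low_in  : Inp -> Inp -> Prop.    (* low-equivalence on inputs *)
    Variable low_out : Outp -> Outp -> Prop.  (* low-equivalence on outputs *)

    (* pi is noninterfering iff low-equal inputs yield low-equal outputs *)
    Definition NI (pi : Trace -> Prop) : Prop :=
      forall s1 s2, pi s1 -> pi s2 ->
        low_in (inputs s1) (inputs s2) ->
        low_out (outputs s1) (outputs s2).
  End Noninterference.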
Trace-Relating Compilation and Noninterference. When additional observations are
possible in the target, it is unclear whether a noninterfering source program is compiled
to a noninterfering target program or not, and if so, whether the notion of NI in the tar-
get is the expected or desired one. We illustrate this issue considering a scenario where
target traces extend source ones by exposing the execution time. While source noninter-
ference NIS requires that private inputs do not affect public outputs, NIT additionally
requires that the execution time is not affected by private inputs.
To model the scenario described, let TraceS denote the set of traces in the source, and TraceT = TraceS × Nω be the set of target traces, where Nω ≜ N ∪ {ω}. Target traces have two components: a source trace, and a natural number that denotes the time spent to produce the trace (ω if infinite). Notice that if two source traces s1, s2 are low-equivalent then {s1, s2} ∈ NIS and {(s1, 42), (s2, 42)} ∈ NIT, but {(s1, 42), (s2, 43)} ∉ NIT and {(s1, 42), (s2, 42), (s1, 43), (s2, 43)} ∉ NIT.
Consider the following straightforward trace relation, which relates a source trace
to any target trace whose first component is equal to it, irrespective of execution time:
s ∼ t ≡ ∃n. t = (s, n).
A compiler is CC∼ if any trace that can be exhibited in the target can be simulated
in the source in some amount of time. For such a compiler Theorem 2.11 says that
if W satisfies NIS , then W↓ satisfies Cl⊆ ◦ τ̃ (NIS ), which however is strictly weaker
than NIT , as it contains, e.g., {(s1 , 42), (s2 , 42), (s1 , 43), (s2 , 43)}, and one cannot
conclude that W↓ is noninterfering in the target. It is easy to prove that ∼ induces a Galois connection τ̃ ⇆ σ̃ between the sets of trace properties, as described in §2.2. Similarly, the pair ∼◦, ∼• induces Galois connections between the sets of input and output properties. In the timing example, time is an output, so we have ∼◦ = =, and ∼• is defined as s• ∼• t• ≡ ∃n. t• = (s•, n).

⁶ Here we only require the projections to be disjoint. Depending on the scenario and the attacker model the projections might record information such as the ordering of events.
Theorem 4.1 (Compiling ANI). Assume traces of source and target languages are related via ∼ ⊆ TraceS × TraceT, decomposed into ∼◦ and ∼•, such that ∼◦ and ∼• are both total maps from target to source traces, and ∼◦ is surjective. Assume ·↓ is a CC∼ compiler, and φS ∈ uco(2^TraceS◦), ρS ∈ uco(2^TraceS•).

If W satisfies ANI^ρS_φS, then W↓ satisfies ANI^ρ#T_φ#T, where φ#T and ρ#T are defined as:

φ#T = g◦ ∘ φS ∘ f◦ ;  ρ#T = g• ∘ ρS ∘ f• , with
f◦(π◦T) = {s◦ | ∃t◦ ∈ π◦T. s◦ ∼◦ t◦} ;  g◦(π◦S) = {t◦ | ∀s◦. s◦ ∼◦ t◦ ⇒ s◦ ∈ π◦S}

(and both f• and g• are defined analogously).
For the example above we recover the definitions we justified intuitively, i.e., φ#T = g◦ ∘ φS ∘ f◦ = φT and ρ#T = g• ∘ ρS ∘ f• = ρT. Moreover, we can prove that if ∼• is also surjective, then ANI^ρ#T_φ#T ⊆ Cl⊆ ∘ τ̃(ANI^ρS_φS). Therefore, the derived guarantee ANI^ρ#T_φ#T is at least as strong as the one that follows by just knowing that the compiler ·↓ is CC∼.
Noninterference and Undefined Behavior. As stated above, Theorem 4.1 does not apply to several scenarios from §3 such as undefined behavior (§3.1), as in those cases the relation ∼• is not a total map. Nevertheless, we can still exploit our framework: Theorem 4.2 guarantees that a correctly compiled noninterfering program satisfies ANI^ρ#T_φ#T, where φ#T is defined as in Theorem 4.1, and ρ#T is such that:

∀s t. s• ∼• t• ⇒ ρ#T(t•) = ρ#T(τ̃•(ρS(s•))).

Intuitively, such a target attacker cannot distinguish a target output trace t• from the source trace s it relates to, up to the observational power of the source level attacker ρS. Therefore, given a source attacker ρS, the theorem characterizes a family of attackers that cannot observe any interference for a correctly compiled noninterfering program. Notice that the target attacker ρ#T = λ_.⊤ satisfies the premise of the theorem, but defines a trivial hyperproperty, so that we cannot prove in general that ANI^ρ#T_φ#T ⊆ Cl⊆ ∘ τ̃(ANI^ρS_φS). The same ρ#T = λ_.⊤ shows that the family of attackers described in Theorem 4.2 is nonempty, and this ensures the existence of a most powerful attacker among them [21], whose explicit characterization we leave for future work.
From Target NI to Source NI. We now explore the dual question: under what hy-
potheses does trace-relating compiler correctness alone allow target noninterference to
be reduced to source noninterference? This is of practical interest, as one would be able
to protect from target attackers by ensuring noninterference in the source. This task can
be made easier if the source language has some static enforcement mechanism [1, 36].
Let us consider the languages from §3.4 extended with inputting of (pairs of) values.
It is easy to show that the compiler described in §3.4 is still CC∼ . Assume that we want
ρ
to satisfy a given notion of target noninterference after compilation, i.e., W↓|=ANI φTT .
Recall that the observational power of the target attacker, ρT , is expressed as a property
of sequences of values. To express the same property (or attacker) in the source, we
have to abstract the way pairs of values are nested. For instance, the source attacker
should not distinguish v1 , v2 , v3 and v1 , v2 , v3 . In general (i.e., when ∼ is not
◦
the identity), this argument is valid only when φT can be represented in the source.
More precisely, φT must consider as equivalent all target inputs that are related to the
same source one, because in the source it is not possible to have a finer distinction of
inputs. This intuitive correspondence can be formalized as follows:
Theorem 4.3 (Target ANI by source ANI). Let φT ∈ uco(2^TraceT◦) and ρT ∈ uco(2^TraceT•), let ∼• be a total and surjective map from source outputs to target ones, and assume that φT considers as equivalent all target inputs related to the same source input.

If ·↓ is a CC∼ compiler and W satisfies ANI^ρ#S_φ#S, then W↓ satisfies ANI^ρT_φT, for

φ#S = σ̃◦ ∘ φT ∘ τ̃◦ ;  ρ#S = σ̃• ∘ ρT ∘ τ̃• .
To wrap up the discussion about noninterference, the results presented in this section
formalize and generalize some intuitive facts about compiler correctness and noninter-
ference. Of course, they all place some restrictions on the shape of the noninterference
instances that can be considered, because compiler correctness alone is in general not a
strong enough criterion for dealing with many security properties [6, 17].
that may not be expressible as the compilation of a source context. Compiler correctness
does not address this case, as it does not consider arbitrary target contexts, looking
instead at whole programs (empty context [33]) or well-behaved target contexts that
behave like source ones (as in compositional compiler correctness [27, 30, 45, 57]).
To account for this scenario, Abate et al. [2] describe several secure compilation
criteria based on the preservation of classes of (hyper)properties (e.g., trace properties,
safety, hypersafety, hyperproperties, etc.) against arbitrary target contexts. For each of
these criteria, they give an equivalent “property-free” criterion, analogous to the equiv-
alence between TP and CC= . For instance, their robust trace property preservation cri-
terion (RTP) states that, for any trace property π, if a source partial program P plugged
into any context CS satisfies π, then the compiled program P↓ plugged into any target
context CT satisfies π. Their equivalent criterion to RTP is RTC, which states that for
any trace produced by the compiled program, when linked with any target context, there
is a source context that produces the same trace. Formally (writing C [P ] to mean the
whole program that results from linking partial program P with context C) they define:
RTP ≡ ∀P. ∀π. (∀CS. ∀t. CS[P] ⇝ t ⇒ t ∈ π) ⇒ (∀CT. ∀t. CT[P↓] ⇝ t ⇒ t ∈ π);
RTC ≡ ∀P. ∀CT. ∀t. CT[P↓] ⇝ t ⇒ ∃CS. CS[P] ⇝ t.
In the following we adopt the notation P |=R π to mean “P robustly satisfies π,” i.e., P
satisfies π irrespective of the contexts it is linked with. Thus, we write more compactly:
RTP ≡ ∀π. ∀P. P |=R π ⇒ P↓ |=R π.
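Criteria of this robust flavor can be stated directly over an abstract compilation chain; the following Coq sketch (ours, with programs, contexts, linking, and semantics kept as parameters) states robust satisfaction and the trace-relating variant RTC∼ of Theorem 5.1 below:

  Section RobustCriteria.
    Variables progS ctxS traceS : Type.
    Variables progT ctxT traceT : Type.
    Variable compile : progS -> progT.
    Variable semS : ctxS -> progS -> traceS -> Prop. (* CS[P] produces s *)
    Variable semT : ctxT -> progT -> traceT -> Prop. (* CT[P] produces t *)
    Variable rel : traceS -> traceT -> Prop.

    (* robust satisfaction: P satisfies pi when linked with any context *)
    Definition rsat (P : progS) (pi : traceS -> Prop) : Prop :=
      forall CS s, semS CS P s -> pi s.

    (* RTC~ : any target trace of CT[P compiled] is matched by a related source trace *)
    Definition RTC : Prop :=
      forall P CT t, semT CT (compile P) t ->
      exists CS s, rel s t /\ semS CS P s.
  End RobustCriteria.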
All the criteria of Abate et al. [2] share this flavor of stating the existence of some
source context that simulates the behavior of any given target context, with some varia-
tions depending on the class of (hyper)properties under consideration. All these criteria
are stated in a setting where source and target traces are the same. In this section, we extend their result to our trace-relating setting, obtaining trinitarian views for secure compilation. Despite the similarities with §2, more challenges show up, in particular when
considering the robust preservation of proper sub-classes of trace properties. For exam-
ple, after application of σ̃ or τ̃ , a property may not be safety anymore, a crucial point for
the equivalence with the property-free criterion for safety properties by Abate et al. [2].
We solve this by interpreting the class of safety properties as an abstraction of the class
of all trace properties induced by a closure operator (§5.1). The remaining subsections
provide example compilation chains satisfying our trace-relating secure compilation criteria for trace properties (§5.2) and for safety properties and hypersafety (§5.3).
Theorem 5.1 (Trinity for Robust Trace Properties). For any trace relation ∼ and induced property mappings τ̃ and σ̃, we have: RTPτ̃ ⇐⇒ RTC∼ ⇐⇒ RTPσ̃, where

RTC∼ ≡ ∀P ∀CT ∀t. CT[P↓] ⇝ t ⇒ ∃CS ∃s ∼ t. CS[P] ⇝ s;
RTPτ̃ ≡ ∀P ∀πS ∈ 2^TraceS. P |=R πS ⇒ P↓ |=R τ̃(πS);
RTPσ̃ ≡ ∀P ∀πT ∈ 2^TraceT. P |=R σ̃(πT) ⇒ P↓ |=R πT.
explains all but one of the asymmetries in Figure 2, the one that concerns the robust
preservation of arbitrary hyperproperties:
Theorem 5.3 (Weak Trinity for Robust Hyperproperties). For a trace relation ∼ ⊆ TraceS × TraceT and induced property mappings σ̃ and τ̃, RHC∼ is equivalent to RHPτ̃; moreover, if τ̃ ⇆ σ̃ is a Galois insertion (i.e., τ̃ ∘ σ̃ = id), RHC∼ implies RHPσ̃, while if σ̃ ⇆ τ̃ is a Galois reflection (i.e., σ̃ ∘ τ̃ = id), RHPσ̃ implies RHC∼, where

RHC∼ ≡ ∀P ∀CT ∃CS ∀t. CT[P↓] ⇝ t ⇐⇒ (∃s ∼ t. CS[P] ⇝ s);
RHPτ̃ ≡ ∀P ∀HS. P |=R HS ⇒ P↓ |=R τ̃(HS);
RHPσ̃ ≡ ∀P ∀HT. P |=R σ̃(HT) ⇒ P↓ |=R HT.
This trinity is weak since extra hypotheses are needed to prove some implications.
While the equivalence RHC∼ ⇐⇒ RHPτ̃ holds unconditionally, the other two im-
plications hold only under distinct, stronger assumptions. For RHPσ̃ it is still possible
and correct to deduce a source obligation for a given target hyperproperty HT when no
information is lost in the composition τ̃ ∘ σ̃ (i.e., the two maps are a Galois inser-
tion). On the other hand, RHPτ̃ is a consequence of RHPσ̃ when no information is lost
in composing in the other direction, σ̃ ◦ τ̃ (i.e., the two maps are a Galois reflection).
Navigating the Diagram. For a given trace relation ∼, Figure 2 orders the generalized
criteria according to their relative strength. If a trinity implies another (denoted by ⇒),
then the former provides stronger security for a compilation chain than the latter.
As mentioned, some property-full criteria regarding proper subclasses (i.e., subset-
closed hyperproperties, safety, hypersafety, 2-relational safety and 2-relational hyper-
properties) quantify over arbitrary (relational) (hyper)properties and compose τ̃ with
an additional operator. We have already presented the Safe operator; other operators
are Cl⊆ , HSafe, and 2rSafe, which approximate the image of τ̃ with a subset-closed
hyperproperty, a hypersafety and 2-relational safety respectively.
As a reading aid, when quantifying over arbitrary trace properties we use the shaded
blue as background color, we use the red when quantifying over arbitrary subset-closed
hyperproperties and green for arbitrary 2-relational properties.
We now describe how to interpret the acronyms in Figure 2. All criteria start with R
meaning they refer to robust preservation. Criteria for relational hyperproperties—here
only arity 2 is shown—contain 2r. Next, criteria names spell the class of hyperproperties
they preserve: H for hyperproperties, SCH for subset-closed hyperproperties, HS for
hypersafety, T for trace properties, and S for safety properties. Finally, property-free
criteria end with a C while property-full ones involving σ̃ and τ̃ end with P. Thus,
robust (R) subset-closed hyperproperty-preserving (SCH) compilation (C) is RSCHC∼ ,
robust (R) two-relational (2r) safety-preserving (S) compilation (C) is R2rSC∼ , etc.
[Fig. 2: Hierarchy of trace-relating robust preservation criteria, ordered by relative strength; arrows denote implications between trinities, with "Ins." and "Refl." marking implications that additionally require a Galois insertion or reflection. Legend: R = robust; 2r = 2-relational; H = hyperproperties; SCH = subset-closed hyperproperties; HS = hypersafety; T = trace properties; S = safety properties; C = property-free criterion; P = property-full criterion based on σ̃ and τ̃.]
the expressions that generate them: outS n, usable both in the source and in the target, and outT n, usable only in the target, which is the only difference between source and target. The extra events in the target model the fact that the target language has an increased ability to perform certain operations, some of them potentially dangerous (such as writing to the hard drive), which cannot be performed by the source language, and against which source-level reasoning can therefore offer no protection.
Both languages and compilation chains now deal with partial programs, contexts
and linking of those two to produce whole programs. In this setting, a whole program
is the combination of a main expression to be evaluated and a set of function definitions
(with distinct names) that can refer to their argument symbolically and can be called by
the main expression and by other functions. The set of functions of a whole program
is the union of the functions of a partial program and a context; the latter also contains
the main expression. The extensions of the typing rules and the operational semantics
for whole programs are unsurprising and therefore elided. The trace model also follows
closely that of §3.3: it consists of a list of regular events (including the new outputs)
terminated by a result event. Finally, a partial program and a context can be linked into
a whole program when their functions satisfy the requirements mentioned above.
Relating Traces. In the present model, source and target traces differ only in the fact
that the target draws (regular) events from a strictly larger set than the source, i.e.,
ΣT ⊃ ΣS . A natural relation between source and target traces essentially maps to a
given target trace t the source trace that erases from t those events that exist only at the
target level. Let t|ΣS indicate trace t filtered to retain only those elements included in ΣS; the relation is then s ∼ t ≡ s = t|ΣS.
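This erasure relation admits a direct transcription; the following Coq sketch (ours) uses two event alphabets, with target-only events filtered away:

  Require Import List. Import ListNotations.

  Section ExtraTargetEvents.
    Variable event : Type. (* the common alphabet, Sigma_S *)

    (* target events: either common events or target-only ones (e.g., out_T n) *)
    Inductive tevent : Type := Common (e : event) | TargetOnly (n : nat).

    (* t|Sigma_S : erase the events that exist only at the target level *)
    Fixpoint restrict (t : list tevent) : list event :=
      match t with
      | []                 => []
      | Common e :: t'     => e :: restrict t'
      | TargetOnly _ :: t' => restrict t'
      end.

    (* s ~ t  iff  s is t with target-only events erased *)
    Definition rel_erase (s : list event) (t : list tevent) : Prop :=
      s = restrict t.
  End ExtraTargetEvents.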
6 Related Work
We already discussed how our results relate to some existing work in correct compila-
tion [33, 58] and secure compilation [2, 49, 50]. We also already mentioned that most
of our definitions and results make no assumptions about the structure of traces. One
result that relies on the structure of traces is Theorem 5.2, which involves some finite
prefix m, suggesting traces should be some sort of sequences of events (or states), as
customary when one wants to refer to safety properties [14]. It is however sufficient
to fix a topology on properties where safety properties coincide with closed sets [46].
Even for reasoning about safety, hypersafety, or arbitrary hyperproperties, traces can
therefore be values, sequences of program states, sequences of input/output events, or
even the recently proposed interaction trees [62]. In the latter case, we believe that
the compilation from IMP to ASM proposed by Xia et al. [62] can be seen as an instance
of CC∼ , for the relation they call “trace equivalence.”
Compilers Where Our Work Could Be Useful. Our work should be broadly applica-
ble to understanding the guarantees provided by many verified compilers. For instance,
Wang et al. [61] recently proposed a CompCert variant that compiles all the way down
to machine code, and it would be interesting to see if the model at the end of §3.1 applies
there too. This and many other verified compilers [12, 29, 42, 56] beyond CakeML [58]
deal with resource exhaustion and it would be interesting to also apply the ideas of §3.2
to them. Hur and Dreyer [27] devised a correct compiler from an ML language to as-
sembly using a cross-language logical relation to state their CC theorem. They do not
have traces, though were one to add them, the logical relation on values would serve as
the basis for the trace relation and therefore their result would attain CC∼ .
Switching to more informative traces capturing the interaction between the program
and the context is often used as a proof technique for secure compilation [2, 28, 48].
Most of these results consider a cross-language relation, so they probably could be
proved to attain one of the criteria from Figure 2.
Generalizations of Compiler Correctness. The compiler correctness definition of
Morris [41] was already general enough to account for trace relations, since it consid-
ered a translation between the semantics of the source program and that of the compiled
program, which he called “decode” in his diagram, reproduced in Figure 3 (left). And
even some of the more recent compiler correctness definitions preserve this kind of flex-
ibility [51]. While CC∼ can be seen as an instance of a definition by Morris [41], we are
not aware of any prior work that investigated the preservation of properties when the
“decode translation” is neither the identity nor a bijection, and source properties need
to be re-interpreted as target ones and vice versa.
Correct Compilation and Galois Connections. Melton et al. [38] and Sabry and
Wadler [55] expressed a strong variant of compiler correctness using the diagram of
Figure 3 (right) [38, 55]. They require that compiled programs parallel the computation
steps of the original source programs, which can be proven by showing the existence of a
decompilation map # that makes the diagram commute, or equivalently, the existence
of an adjoint for ↓ (W↓ ≤ W ⟺ W ≤ W# for both source and target).
[Fig. 3: Morris's [41] diagram (left), relating the source language to source meanings
via the source semantics and to the target language via compile, with decode mapping
target meanings back; Melton et al.'s [38] and Sabry and Wadler's [55] adjunction
diagram (right), relating W, W↓, Z, and Z#.]
The “parallel” intuition can be formalized as an instance of CC∼ . Take source and target
traces to be finite or infinite sequences of program states (maximal trace semantics
[15]), and relate them exactly like Melton et al. [38] and Sabry and Wadler [55].
Translation Validation. Translation validation is an important alternative to proving
that all runs of a compiler are correct. A variant of CC∼ for translation validation can
simply be obtained by specializing the definition to a particular W, and one can obtain
again the same trinitarian view. Similarly for our other criteria, including our extensions
of the secure compilation criteria of Abate et al. [2], which Busi et al. [10] seem to
already be considering in the context of translation validation.
Bibliography
[24] I. Haller, Y. Jeon, H. Peng, M. Payer, C. Giuffrida, H. Bos, and E. van der Kouwe. TypeSan:
Practical type confusion detection. CCS, 2016.
[25] Heartbleed. The Heartbleed bug. http://heartbleed.com/, 2014.
[26] C. Hriţcu, D. Chisnall, D. Garg, and M. Payer. Secure compilation. SIGPLAN PL Perspec-
tives Blog, 2019.
[27] C. Hur and D. Dreyer. A Kripke logical relation between ML and assembly. POPL, 2011.
[28] A. Jeffrey and J. Rathke. Java Jr: Fully abstract trace semantics for a core Java language.
ESOP, 2005.
[29] J. Kang, C. Hur, W. Mansky, D. Garbuzov, S. Zdancewic, and V. Vafeiadis. A formal C
memory model supporting integer-pointer casts. PLDI, 2015.
[30] J. Kang, Y. Kim, C.-K. Hur, D. Dreyer, and V. Vafeiadis. Lightweight verification of sepa-
rate compilation. POPL, 2016.
[31] L. Lamport and F. B. Schneider. Formal foundation for specification and verification. In
Distributed Systems: Methods and Tools for Specification, An Advanced Course, 1984.
[32] C. Lattner. What every C programmer should know about undefined behavior #1/3. LLVM
Project Blog, 2011.
[33] X. Leroy. Formal verification of a realistic compiler. CACM, 52(7), 2009.
[34] X. Leroy. A formally verified compiler back-end. JAR, 43(4), 2009.
[35] X. Leroy. The formal verification of compilers (DeepSpec Summer School 2017), 2017.
[36] I. Mastroeni and M. Pasqua. Verifying bounded subset-closed hyperproperties. SAS, 2018.
[37] J. McCarthy and J. Painter. Correctness of a compiler for arithmetic expressions. In
Mathematical Aspects of Computer Science, volume 19 of Proceedings of Symposia in
Applied Mathematics, 1967.
[38] A. Melton, D. A. Schmidt, and G. E. Strecker. Galois connections and computer science
applications. In Proceedings of a Tutorial and Workshop on Category Theory and Computer
Programming, 1986.
[39] R. Milner. A Calculus of Communicating Systems. Springer-Verlag, Berlin, Heidelberg,
1982.
[40] R. Milner and R. Weyhrauch. Proving compiler correctness in a mechanized logic. In Pro-
ceedings of 7th Annual Machine Intelligence Workshop, volume 7 of Machine Intelligence,
1972.
[41] F. L. Morris. Advice on structuring compilers and proving them correct. POPL, 1973.
[42] E. Mullen, D. Zuniga, Z. Tatlock, and D. Grossman. Verified peephole optimizations for
CompCert. PLDI, 2016.
[43] D. A. Naumann. A categorical model for higher order imperative programming. Mathe-
matical Structures in Computer Science, 8(4), 1998.
[44] D. A. Naumann and M. Ngo. Whither specifications as programs. In International Sympo-
sium on Unifying Theories of Programming. Springer, 2019.
[45] G. Neis, C. Hur, J. Kaiser, C. McLaughlin, D. Dreyer, and V. Vafeiadis. Pilsner: a compo-
sitionally verified compiler for a higher-order imperative language. ICFP, 2015.
[46] M. Pasqua and I. Mastroeni. On topologies for (hyper)properties. CEUR, 2017.
[47] M. Patrignani. Why should anyone use colours? or, syntax highlighting beyond code snip-
pets, 2020.
[48] M. Patrignani and D. Clarke. Fully abstract trace semantics for protected module architec-
tures. Computer Languages, Systems & Structures, 42, 2015.
[49] M. Patrignani and D. Garg. Secure compilation and hyperproperty preservation. CSF, 2017.
[50] M. Patrignani and D. Garg. Robustly safe compilation. ESOP, 2019.
[51] D. Patterson and A. Ahmed. The next 700 compiler correctness theorems (functional pearl).
PACMPL, 3(ICFP), 2019.
[52] T. Ramananandro, Z. Shao, S. Weng, J. Koenig, and Y. Fu. A compositional semantics for
verified separate compilation and linking. CPP, 2015.
28 C. Abate et al.
[53] J. Regehr. A guide to undefined behavior in C and C++, part 3. Embedded in Academia
blog, 2010.
[54] A. Sabelfeld and D. Sands. Dimensions and principles of declassification. CSFW, 2005.
[55] A. Sabry and P. Wadler. A reflection on call-by-value. ACM Transactions on Programming
Languages and Systems, 19(6), 1997.
[56] J. Sevcík, V. Vafeiadis, F. Z. Nardelli, S. Jagannathan, and P. Sewell. CompCertTSO: A
verified compiler for relaxed-memory concurrency. J. ACM, 60(3), 2013.
[57] G. Stewart, L. Beringer, S. Cuellar, and A. W. Appel. Compositional CompCert. POPL,
2015.
[58] Y. K. Tan, M. O. Myreen, R. Kumar, A. Fox, S. Owens, and M. Norrish. The verified
CakeML compiler backend. Journal of Functional Programming, 29, 2019.
[59] X. Wang, H. Chen, A. Cheung, Z. Jia, N. Zeldovich, and M. F. Kaashoek. Undefined
behavior: What happened to my code? APSYS, 2012.
[60] X. Wang, N. Zeldovich, M. F. Kaashoek, and A. Solar-Lezama. Towards optimization-safe
systems: Analyzing the impact of undefined behavior. SOSP, 2013.
[61] Y. Wang, P. Wilke, and Z. Shao. An abstract stack based approach to verified compositional
compilation to machine code. PACMPL, 3(POPL), 2019.
[62] L. Xia, Y. Zakowski, P. He, C. Hur, G. Malecha, B. C. Pierce, and S. Zdancewic. Interaction
trees: representing recursive and impure programs in Coq. PACMPL, 4(POPL), 2020.
[63] A. Zakinthinos and E. S. Lee. A general theory of security properties. S&P, 1997.
[64] J. Zhao, S. Nagarakatte, M. M. K. Martin, and S. Zdancewic. Formalizing the LLVM
intermediate representation for verified program transformations. POPL, 2012.
Runners in action
1 Introduction
One wishes that the ingenuity of the language implementors were better sup-
ported by a more flexible methodology with a sound theoretical footing.
Excessive generality is not as easily discerned, because generality of programming
concepts makes a language expressive and useful: general algebraic effects and handlers,
for example, enable one to implement timeouts, rollbacks, stream redirection [30],
async & await [16], and concurrency [9]. However, the flip side of such
expressive freedom is the lack of any guarantees about how external resources
will actually be used. For instance, consider a simple piece of code, written in
Eff-like syntax, which first opens a file, then writes to it, and finally closes it:
let fh = open "hello.txt" in write (fh, "Hello, world."); close fh
What this program actually does depends on how the operations open, write,
and close are handled. For all we know, an enveloping handler may intercept the
write operation and discard its continuation, so that close never happens and
the file is not properly closed. Telling the programmer not to shoot themselves
in the foot by avoiding such handlers is not helpful, because the handler may
encounter an external reason for not being able to continue, say a full disk.
Even worse, external resources may be misused accidentally when we combine
two handlers, each of which works as intended on its own. For example, if we
combine the above code with a non-deterministic choose operation, as in
let fh = open "greeting.txt" in
let b = choose () in
if b then write (fh, "hello") else write (fh, "good bye"); close fh
then the resulting program attempts to close the file twice, as well as write to it twice,
because the continuation k is invoked twice when handling choose. Of course,
with enough care all such situations can be dealt with, but that is beside the
point. It is worth sacrificing some amount of the generality of algebraic effects
and monads in exchange for predictable and safe usage of external computational
effects, so long as the vast majority of common use cases are accommodated.
Runners are modular in that they can be used not only to model the top-
level interaction with the external environment, but programmers can also use
them to define and nest their own intermediate “virtual machines”. Our runners
are effectful : they may handle operations by calling further outer operations,
and raise exceptions and send signals, through which exceptional conditions and
runtime errors are communicated back to user programs in a safe fashion that
preserves linear usage of external resources and ensures their proper finalisation.
We achieve suitable generality for handling of external resources by showing
how runners provide implementations of algebraic operations together with a
natural notion of finalisation, and a strong guarantee that in the absence of
external kill signals the finalisation code is executed exactly once (Thm. 7). We
argue that for most purposes such discipline is well worth having, and giving up
the arbitrariness of effect handlers is an acceptable price to pay. In fact, as will
be apparent in the denotational semantics, runners are simply a restricted form
of handlers, which apply the continuation at most once in a tail call position.
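As an illustration of this restriction, consider the following Haskell sketch (our own, simplified to a single choose operation): a handler is free to invoke the continuation several times, whereas a runner-style interpretation applies it exactly once, in tail position.

-- Computation trees with a single operation choose, expecting a boolean.
data Choose x = Return x | Choose (Bool -> Choose x)

-- Handler-style: the continuation is invoked twice, collecting both runs.
allResults :: Choose x -> [x]
allResults (Return x) = [x]
allResults (Choose k) = allResults (k True) ++ allResults (k False)

-- Runner-style: the co-operation picks one answer, and the continuation
-- is applied exactly once, as the head of the outermost (tail) call.
runFirst :: Choose x -> x
runFirst (Return x) = x
runFirst (Choose k) = runFirst (k True)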
Runners guarantee linear usage of resources not through a linear or unique-
ness type system (such as in the Clean programming language [15]) or a syntac-
tic discipline governing the application of continuations in handlers, but rather
by a design based on the linear state-passing technique studied by Møgelberg
and Staton [21]. In this approach, a computational resource may be implemented
without restrictions, but is then guaranteed to be used linearly by user code.
We begin with a short overview of the theory of algebraic effects and handlers,
as well as runners. To keep focus on how runners give rise to a programming
concept, we work naively in set theory. Nevertheless, we use category-theoretic
language as appropriate, to make it clear that there are no essential obstacles to
extending our work to other settings (we return to this point in §5.1).
We are abusing notation in a slight but standard way, by using op both as the
name of an operation and as a tree-forming constructor. The elements of TreeΣ(X)
are called computation trees: a leaf return x represents a pure computation
returning a value x, while op(a, κ) represents an effectful computation that calls
op with parameter a and continuation κ, which expects a result from Bop .
An algebraic theory T = (ΣT , EqT) is given by a signature ΣT and a set of
equations EqT . The equations EqT express computational behaviour via interactions
between operations, and are written in a suitable formalism, e.g., [30].
We explain these by way of examples, as the precise details do not matter for
our purposes. Let 0 ≝ { } be the empty set and 1 ≝ {⋆} the standard singleton.
Example 1. Given a set C of possible states, the theory of C-valued state has
two operations, whose somewhat unusual naming will become clear later on,
getenv : 1 ⤳ C,    setenv : C ⤳ 1,
and three equations:
getenv(λc . setenv(c, κ)) ≡ κ,
setenv(c, getenv(κ)) ≡ setenv(c, κ c),
setenv(c, setenv(c′, κ)) ≡ setenv(c′, κ).
For example, the second equation states that reading state right after setting it
to c gives precisely c. The third equation states that setenv overwrites the state.
The free T-model FreeT(X) is the quotient of TreeΣT(X) by the equivalence relation ∼
induced by the equations, with quotient map
X --return--> TreeΣT(X) --[−]-->> FreeT(X).
The Kleisli extension for this monad is then the operation which lifts any map
f : X → FreeT(Y) to the map f† : FreeT(X) → FreeT(Y).
That is, f† traverses a computation tree and replaces each leaf return x with f x.
The preceding construction of free models and the monad may be retrofitted
to an algebraic signature Σ, if we construe Σ as an algebraic theory with
no equations. In this case ∼ is just equality, and so we may omit the quotient
and the pesky equivalence classes. Thus the carrier of the free Σ-model is the
set of well-founded trees TreeΣ(X), with the evident monad structure.
A fundamental insight of Plotkin and Power [25,28] was that many computational
effects may be adequately described by algebraic theories, with the
elements of free models corresponding to effectful computations. For example,
the monads induced by the theories from Examples 1 and 2 are respectively
isomorphic to the usual state monad StC X ≝ (C ⇒ X × C) and the exceptions
monad ExcE X ≝ X + E.
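For instance, here is a minimal Haskell sketch of the free model for the state theory, realising the isomorphism with the state monad (we elide the quotient, which is harmless here because the state-passing interpretation validates the state equations):

-- Computation trees over the state signature getenv/setenv.
data Tree c x
  = Return x
  | Getenv (c -> Tree c x)     -- getenv : 1 ~> C
  | Setenv c (Tree c x)        -- setenv : C ~> 1

-- Interpreting trees as state-passing functions C -> (X, C) exhibits
-- St_C X as the monad induced by the theory of C-valued state.
toState :: Tree c x -> c -> (x, c)
toState (Return x)    c = (x, c)
toState (Getenv k)    c = toState (k c) c
toState (Setenv c' t) _ = toState t c'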
Plotkin and Pretnar [30] further observed that the universal property of free
models may be used to model a programming concept known as handlers. Given
a T-model M and a map f : X → |M|, the universal property of the free
T-model gives us a unique T-homomorphism f‡ : FreeT(X) → |M| satisfying
f‡([return x]) = f x.
2.2 Runners
Much like monads, handlers are useful for simulating computational effects, because
they allow us to transform T-computations to T′-computations. However,
eventually there has to be a “top level” where such transformations cease and
actual computational effects happen. For these we need another concept, known
as runners [35]. Runners are equivalent to the concept of comodels [27,31], which
are “just models in the opposite category”, although one has to apply the motto
correctly by using powers and co-powers where seemingly exponentials and products
would do. Without getting into the intricacies, let us spell out the definition.
Definition 1. A runner R for a signature Σ is given by a carrier set |R| together
with, for each op ∈ Σ, a co-operation opR : Aop → (|R| ⇒ Bop × |R|).
Runners are usually defined to have co-operations in the equivalent uncurried
form opR : Aop × |R| → Bop × |R|, but that is less convenient for our purposes.
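For example, the carrier C itself, with the evident co-operations, is a runner for the signature of C-valued state; in Haskell notation (a sketch of ours, with |R| = c):

-- co-operations op_R : A_op -> (|R| -> (B_op, |R|))
getenvR :: () -> (c -> (c, c))     -- return the configuration, keep it
getenvR () c = (c, c)

setenvR :: c -> (c -> ((), c))     -- replace the configuration
setenvR c' _ = ((), c')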
Runners may be defined more generally for theories T , rather than just sig-
natures, by requiring that the co-operations satisfy EqT . We shall have no use
for these, although we expect no obstacles in incorporating them into our work.
A runner tells us what to do when an effectful computation reaches the
top-level runtime environment. Think of |R| as the set of configurations of
the runtime environment. Given the current configuration c ∈ |R|, the operation
op(a, κ) is executed as the corresponding co-operation opR a c, whose result
(b, c′) ∈ Bop × |R| gives the result b of the operation and the next runtime
configuration c′. The continuation κ b then proceeds in runtime configuration c′.
It is not too difficult to turn this idea into a mathematical model. For any
X, the co-operations induce a Σ-structure M with |M| ≝ St|R| X = (|R| ⇒ X × |R|)
and operations opM(a, κ) ≝ λc . κ b c′, where (b, c′) = opR a c.
We may then use the universal property of the free Σ-model to obtain a
Σ-homomorphism rX : TreeΣ(X) → St|R| X satisfying the equations
rX(return x) = λc . (x, c),    rX(op(a, κ)) = opM(a, rX ∘ κ).
The map rX precisely captures the idea that a runner runs computations by
transforming (static) computation trees into state-passing maps. Note how in
the above definition of opM , the continuation κ is used in a controlled way, as
it appears precisely once as the head of the outermost application. In terms of
programming, this corresponds to linear use in a tail-call position.
Runners are less ad hoc than they may seem. First, notice that opM is just the
composition of the co-operation opR with the state monad's Kleisli extension of
the continuation κ, and so is the standard way of turning generic effects into
Σ-structures [26]. Second, the map rX is the component at X of a monad morphism
r : TreeΣ(−) → St|R| . Møgelberg and Staton [21], as well as Uustalu [35], showed
that the passage from a runner R to the corresponding monad morphism r forms
a one-to-one correspondence between the former and the latter.
As defined, runners are too restrictive a model of top-level computation,
because the only effect available to co-operations is state, but in practice the
runtime environment may also signal errors and perform other effects, by calling
its own runtime environment. We are led to the following generalisation.
Definition 2. For a signature Σ and monad T, a T-runner R for Σ, or just an
effectful runner, is given by, for each op ∈ Σ, a co-operation opR : Aop → T Bop .
The correspondence between runners and monad morphisms still holds.
Proposition 3. For a signature Σ and a monad T, the monad morphisms
TreeΣ(−) → T are in one-to-one correspondence with T-runners for Σ.
Proof. This is an easy generalisation of the correspondence for ordinary runners.
Let us fix a signature Σ, and a monad T with unit η and Kleisli extension (−)† .
Let R be a T-runner for Σ. For any set X, R induces a Σ-structure M
with |M| ≝ T X and opM : Aop × (Bop ⇒ T X) → T X defined as
opM(a, κ) ≝ κ†(opR a). As before, the universal property of the free model TreeΣ(X) provides
a unique Σ-homomorphism rX : TreeΣ(X) → T X, satisfying the equations
rX(return x) = ηX(x),    rX(op(a, κ)) = opM(a, rX ∘ κ).
The maps rX collectively give us the desired monad morphism r induced by R.
Conversely, given a monad morphism θ : TreeΣ(−) → T, we may recover a T-runner
R for Σ by defining the co-operations as opR a ≝ θBop(op(a, λb . return b)).
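A small Haskell sketch of this construction, for an illustrative one-operation signature and with IO playing the role of T (both choices are ours, purely for concreteness):

-- Tree_Σ as a free monad over a signature functor.
data Free f x = Pure x | Op (f (Free f x))

-- One operation put : String ~> 1, as a signature functor.
data PutF k = Put String (() -> k)

-- A T-runner gives a co-operation put_R : String -> T ().
newtype Runner t = Runner { coPut :: String -> t () }

-- The induced monad morphism r_X : Tree_Σ X -> T X of Proposition 3.
run :: Monad t => Runner t -> Free PutF x -> t x
run _ (Pure x)       = pure x
run r (Op (Put s k)) = coPut r s >>= run r . k

-- E.g., running a small computation with a printing co-operation:
example :: IO ()
example = run (Runner putStrLn) (Op (Put "hello" (\() -> Pure ())))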
These equations say that a signal discards state, which makes it unrecoverable.
To summarise, the kernel theory KΣ,E,S,C contains operations from a signature
Σ, as well as state operations getenv : 1 ⤳ C and setenv : C ⤳ 1, exceptions
raise : E ⤳ 0, and signals kill : S ⤳ 0, with equations for state from Example 1,
equations (1) relating state and signals, and for each operation op ∈ Σ, equations
expressing that external operations do not interact with kernel state. It is not
difficult to see that KΣ,E,S,C induces, up to isomorphism, the kernel monad
KΣ,E,S,C X ≝ C ⇒ TreeΣ((X + E) × C + S).
How about user code? It can of course call operations from a signature Σ
(not necessarily the same as the one for kernel code), and because we intend it to handle
exceptions, it might as well have the ability to raise them. However, user code
knows nothing about signals and kernel state. Thus, we choose the user theory
UΣ,E to be the algebraic theory with operations Σ, exceptions raise : E ⤳ 0, and
no equations. This theory induces the user monad UΣ,E X ≝ TreeΣ(X + E).
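In Haskell notation, the shapes of the two monads can be sketched as follows (Free plays the role of TreeΣ; the names are ours):

-- Tree_Σ as a free monad over a signature functor f.
data Free f x = Pure x | Op (f (Free f x))

-- User monad: U_{Σ,E} X = Tree_Σ (X + E).
type User f e x = Free f (Either x e)

-- Kernel monad: K_{Σ,E,S,C} X = C => Tree_Σ ((X + E) × C + S).
type Kernel f e s c x = c -> Free f (Either (Either x e, c) s)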
Analogous considerations apply to kernel code. The familiar binding construct let x = M in N
is simply shorthand for try M with {return x ↦ N, . . . , raise e ↦ raise e, . . .}.
As a programming concept, a runner R takes the form
{(op x ↦ Kop)op∈Σ}C ,
where each Kop is a kernel computation, with the variable x bound in Kop , so
that each clause op x ↦ Kop determines a co-operation for the kernel monad.
The subscript C indicates the type of the state used by the kernel code Kop .
The corresponding elimination form is a handling-like construct
using R @ V run M finally F,
which uses the co-operations of runner R “at” initial kernel state V to run user
code M , and finalises its return value, exceptions, and signals with F , see (3)
below. When user code M calls an operation op, the enveloping run construct
runs the corresponding co-operation Kop of R. While doing so, Kop might raise
exceptions. But not every exception makes sense for every operation, and so
we assign to each operation op a set of exceptions Eop which the co-operations
implementing it may raise, by augmenting its operation signature with Eop , as
in Aop ⤳ Bop ! Eop .
Notice that the user code does not have direct access to the file handle. Instead,
the runner holds it in its state, where it is available to the co-operation that
implements write. The finalisation block gets access to the file handle upon
successful completion or a raised exception, so it can close the file; but when a signal
happens the finalisation cannot close the file, nor should it attempt to do so.
We also mention that the code “cheats” by placing the call to open in a posi-
tion where a value is expected. We should have let-bound the file handle returned
by open outside the run construct, which would make it clear that opening the
file happens before this construct (and that open is not handled by the finalisa-
tion), but would also expose the file handle. Since there are clear advantages to
keeping the file handle inaccessible, a realistic language should accept the above
code and hoist computations from value positions automatically.
Inspired by the semantic notion of runners and the ideas of the previous section,
we now present a calculus for programming with co-operations and runners,
called λcoop . It is a low-level fine-grain call-by-value calculus [19], and as such
could inspire an intermediate language that a high-level language is compiled to.
4.1 Types
The types of λcoop are shown in Fig. 1. The ground types contain base types, and
are closed under finite sums and products. These are used in operation signa-
tures and as types of kernel state. (Allowing arbitrary types in either of these
entails substantial complications that can be dealt with but are tangential to
our goals.) Ground types can also come with corresponding constant symbols f,
each associated with a fixed constant signature f : (A1 , . . . , An) → B.
We assume a supply of operation symbols O, exception names E, and signal
names S. Each operation symbol op ∈ O is equipped with an operation signature
Aop ⤳ Bop ! Eop , which specifies its parameter type Aop and arity type Bop , and
the exceptions Eop that the corresponding co-operations may raise in runners.
The value types extend ground types with two function types, and a type
of runners. The user function type X → Y ! (Σ, E) classifies functions taking
arguments of type X to computations classified by the user (computation)
type Y ! (Σ, E), i.e., those that return values of type Y , and may call
operations Σ and raise exceptions E. Similarly, the kernel function type X →
Y (Σ, E, S, C) classifies functions taking arguments of type X to computations
classified by the kernel (computation) type Y (Σ, E, S, C), i.e., those that return
values of type Y , and may call operations Σ, raise exceptions E, send signals S,
and use state of type C. We note that the ingredients for user and kernel types
correspond precisely to the parameters of the user monad UΣ,E and the kernel
monad KΣ,E,S,C from §3.1. Finally, the runner type Σ ⇒ (Σ′, S, C) classifies runners
that implement co-operations for the operations Σ as kernel computations
which use operations Σ′, send signals S, and use state of type C.
Values Among the values are variables, constants for ground types, and constructors
for sums and products. There are two kinds of functions, for abstracting
over user and kernel computations. A runner is a value of the form
{(op x ↦ Kop)op∈Σ}C .
It implements co-operations for operations op as kernel computations Kop , with
x bound in Kop . The type annotation C specifies the type of the state that Kop
uses. Note that C ranges over ground types, a restriction that allows us to define
a naive set-theoretic semantics. We sometimes omit these type annotations.
User and kernel computations The user and kernel computations both have
pure computations, function application, exception raising and handling, standard
elimination forms, and operation calls, as given by the following grammar.
Values
V, W ::= x                                                   variable
       | f(V1 , . . . , Vn)                                  ground constant
       | ()                                                  unit
       | (V, W)                                              pair
       | inlX,Y V | inrX,Y V                                 injection
       | fun (x : X) ↦ M                                     user function
       | funK (x : X) ↦ K                                    kernel function
       | {(op x ↦ Kop)op∈Σ}C                                 runner

User computations
M, N ::= return V                                            value
       | V W                                                 application
       | try M with {return x ↦ N, (raise e ↦ Ne)e∈E}        exception handler
       | match V with {(x, y) ↦ M}                           product elimination
       | match V with { }X                                   empty elimination
       | match V with {inl x ↦ M, inr y ↦ N}                 sum elimination
       | opX(V, (x . M), (Ne)e∈Eop)                          operation call
       | raiseX e                                            raise exception
       | using V @ W run M finally F                         running user code
       | kernel K @ W finally F                              switch to kernel mode

Kernel computations
K, L ::= returnC V                                           value
       | V W                                                 application
       | try K with {return x ↦ L, (raise e ↦ Le)e∈E}        exception handler
       | match V with {(x, y) ↦ K}                           product elimination
       | match V with { }X@C                                 empty elimination
       | match V with {inl x ↦ K, inr y ↦ L}                 sum elimination
       | opX(V, (x . K), (Le)e∈Eop)                          operation call
       | raiseX@C e                                          raise exception
       | killX@C s                                           send signal
       | getenvC (c . K)                                     get kernel state
       | setenv(V, K)                                        set kernel state
       | user M with {return x ↦ K, (raise e ↦ Le)e∈E}       switch to user mode
Note that the typing annotations on some of these constructs differ according to
their mode. For instance, a user operation call is annotated with the result type X,
whereas the annotation X @ C on a kernel operation call also specifies the kernel
state type C.
The binding construct letX!E x = M in N is not part of the syntax, but is an
abbreviation for try M with {return x ↦ N, (raise e ↦ raiseX e)e∈E}, and there is
an analogous one for kernel computations. We often drop the annotation X!E.
Some computations are specific to one or the other mode. Only the kernel
mode may send a signal with kill, and manipulate state with getenv and setenv,
but only the user mode has the run construct from §3.2. Finally, each mode has
the ability to “context switch” to the other one. The kernel computation
user M with {return x ↦ K, (raise e ↦ Le)e∈E}
runs a user computation M and handles the returned value and leftover exceptions
with kernel computations K and Le . Conversely, the user computation
kernel K @ W finally F
runs kernel computation K with initial state W , and handles the returned value,
and leftover exceptions and signals with user computations N , Ne , and Ns .
The typing rules derive judgements of the forms Γ ⊢ V : X, Γ ⊢ M : X ! U, and
Γ ⊢ K : X K, together with subtyping judgements X ⊑ Y, X ! U ⊑ Y ! V, and
X K ⊑ Y L. Selected rules:

Sub-Ground:   A ⊑ A.

Sub-Runner:   if Σ1′ ⊑ Σ1 , Σ2 ⊑ Σ2′ , S ⊑ S′, and C ≡ C′, then
              Σ1 ⇒ (Σ2 , S, C) ⊑ Σ1′ ⇒ (Σ2′ , S′, C′).

Sub-Kernel:   if X ⊑ X′, Σ ⊑ Σ′, E ⊑ E′, S ⊑ S′, and C ≡ C′, then
              X (Σ, E, S, C) ⊑ X′ (Σ′, E′, S′, C′).

TyUser-Try:   from Γ ⊢ M : X ! (Σ, E), Γ, x : X ⊢ N : Y ! (Σ, E′), and
              (Γ ⊢ Ne : Y ! (Σ, E′))e∈E , conclude
              Γ ⊢ try M with {return x ↦ N, (raise e ↦ Ne)e∈E} : Y ! (Σ, E′).

TyUser-Run:   with F ≡ {return x @ c ↦ N, (raise e @ c ↦ Ne)e∈E , (kill s ↦ Ns)s∈S},
              from Γ ⊢ V : Σ ⇒ (Σ′, S, C), Γ ⊢ W : C, Γ ⊢ M : X ! (Σ, E),
              Γ, x : X, c : C ⊢ N : Y ! (Σ′, E′), (Γ, c : C ⊢ Ne : Y ! (Σ′, E′))e∈E , and
              (Γ ⊢ Ns : Y ! (Σ′, E′))s∈S , conclude
              Γ ⊢ using V @ W run M finally F : Y ! (Σ′, E′).

TyUser-Op:    with U ≡ (Σ, E), from op ∈ Σ, Γ ⊢ V : Aop , Γ, x : Bop ⊢ M : X ! U,
              and (Γ ⊢ Ne : X ! U)e∈Eop , conclude
              Γ ⊢ opX(V, (x . M), (Ne)e∈Eop) : X ! U.

TyKernel-Op:  with K ≡ (Σ, E, S, C), from op ∈ Σ, Γ ⊢ V : Aop ,
              Γ, x : Bop ⊢ K : X K, and (Γ ⊢ Le : X K)e∈Eop , conclude
              Γ ⊢ opX(V, (x . K), (Le)e∈Eop) : X K.

TyUser-Kernel: with F as in TyUser-Run, from Γ ⊢ K : X (Σ, E, S, C), Γ ⊢ W : C,
              Γ, x : X, c : C ⊢ N : Y ! (Σ, E′), (Γ, c : C ⊢ Ne : Y ! (Σ, E′))e∈E , and
              (Γ ⊢ Ns : Y ! (Σ, E′))s∈S , conclude
              Γ ⊢ kernel K @ W finally F : Y ! (Σ, E′).

TyKernel-User: with K ≡ (Σ, E′, S, C), from Γ ⊢ M : X ! (Σ, E),
              Γ, x : X ⊢ K : Y K, and (Γ ⊢ Le : Y K)e∈E , conclude
              Γ ⊢ user M with {return x ↦ K, (raise e ↦ Le)e∈E} : Y K.
equations analogous to Example 1. It has been observed [24,31] that such a lens
in fact amounts to an ordinary runner for C-valued state.
The rules TyUser-Op and TyKernel-Op govern operation calls, where we
have a success continuation which receives a value returned by a co-operation,
and exceptional continuations which receive exceptions raised by co-operations.
The rule TyUser-Run requires that the runner V implements all the opera-
tions M can use, meaning that operations are not implicitly propagated outside
a run block (which is different from how handlers are sometimes implemented).
Of course, the co-operations of the runner may call further external operations,
as recorded by the signature Σ 1 . Similarly, we require the finally block F to in-
tercept all exceptions and signals that might be produced by the co-operations
of V or the user code M . Such strict control is exercised throughout. For ex-
ample, in TyUser-Run, TyUser-Kernel, and TyKernel-User we catch all
the exceptions and signals that the code might produce. One should judiciously
relax these requirements in a language that is presented to the programmer, and
allow re-raising and re-sending clauses to be automatically inserted.
Because Kop is kernel code, it is executed in kernel mode, whose finally clauses
specify what happens afterwards: if Kop returns a value, or raises an exception,
execution continues with a suitable continuation, with R wrapped around it; and
if Kop sends a signal, the corresponding finalisation code from F is evaluated.
The next bundle describes how kernel code is executed within user code:
kernel (returnC V) @ W finally F ≡ N[V/x, W/c],
kernel (raiseX@C e) @ W finally F ≡ Ne[W/c],
kernel (killX@C s) @ W finally F ≡ Ns ,
kernel (getenvC (c . K)) @ W finally F ≡ kernel K[W/c] @ W finally F,
kernel (setenv(V, K)) @ W finally F ≡ kernel K @ V finally F.
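These five equations amount to a small interpreter; the following Haskell sketch (covering only state, exceptions, and signals, with names of our own choosing) mirrors them clause by clause:

-- Kernel computations over state c, exceptions e, signals s.
data Kernel c e s x
  = KReturn x
  | KRaise e
  | KKill s
  | KGetenv (c -> Kernel c e s x)
  | KSetenv c (Kernel c e s x)

-- The finally clauses F: return and raise see the final state, kill does not.
data Finally c e s x y = Finally
  { onReturn :: x -> c -> y
  , onRaise  :: e -> c -> y
  , onKill   :: s -> y }

-- kernel K @ W finally F, one clause per equation above.
kernelWith :: Kernel c e s x -> c -> Finally c e s x y -> y
kernelWith (KReturn x)   w f = onReturn f x w
kernelWith (KRaise e)    w f = onRaise f e w
kernelWith (KKill s)     _ f = onKill f s
kernelWith (KGetenv k)   w f = kernelWith (k w) w f
kernelWith (KSetenv v k) _ f = kernelWith k v f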
We also have an equation stating that an operation called in kernel mode propagates
out to user mode, with its continuations wrapped in kernel mode:
kernel op(V, (x . K), (Le)e∈Eop) @ W finally F ≡
  op(V, (x . kernel K @ W finally F), (kernel Le @ W finally F)e∈Eop).
5 Denotational semantics
We provide a coherent denotational semantics for λcoop , and prove it sound with
respect to the equational theory given in §4.4. Having eschewed all forms of
recursion, we may afford to work simply over the category of sets and functions,
while noting that there is no obstacle to incorporating recursion at all levels and
switching to domain theory, similarly to the treatment of effect handlers in [3].
The skeletal versions of the user and kernel types are P ! and P C, respec-
tively. It is best to think of the skeletal types as ML-style types which implicitly
over-approximate effect information by “any effect is possible”, an idea which is
mathematically expressed by their semantics, as explained below.
First of all, the semantics of ground types is straightforward. One only needs
to provide sets denoting the base types b, after which the ground types receive
the standard set-theoretic meaning, as given in Fig. 4.
Recall that O, S, and E are the sets of all operations, signals, and exceptions,
and that each op ∈ O has a signature op : Aop ⤳ Bop ! Eop . Let us additionally
assume that there is a distinguished operation O ∈ O with signature O : 1 ⤳ 0 ! 0
(otherwise we adjoin it to O). It ensures that the denotations of skeletal user and
kernel types are pointed sets, while operationally O indicates a runtime error.
Next, we define the skeletal user and kernel monads as
Us X ≝ UO,E X = TreeO(X + E),    KsC X ≝ KO,E,S,C X = C ⇒ TreeO((X + E) × C + S),
and Runners C as the set of all skeletal runners R (with state C), which are families
of co-operations {opR : ⟦Aop⟧ → KO,Eop,S,C ⟦Bop⟧}op∈O . Note that KO,Eop,S,C
is a coproduct [11] of the monads C ⇒ TreeO(− × C + S) and ExcEop , and thus the
skeletal runners are the effectful runners for the former monad, so long as we
read the effectful signatures op : Aop ⤳ Bop ! Eop as ordinary algebraic ones
op : Aop ⤳ Bop + Eop . While there is no semantic difference between the two
readings, there is one of intention: KO,Eop,S,C ⟦Bop⟧ is a kernel computation that
(apart from using state and sending signals) returns values of type Bop and raises
exceptions Eop , whereas C ⇒ TreeO((⟦Bop⟧ + Eop) × C + S) returns values of
type Bop + Eop and raises no exceptions. We prefer the former, as it reflects our
treatment of exceptions as a control mechanism rather than exceptional values.
These ingredients suffice for the denotation of skeletal types as sets, as given
in Fig. 4. The user and kernel skeletal types are interpreted using the respective
skeletal monads, and hence the two function types as Kleisli exponentials.
We proceed with the semantics of effectful types. The skeleton of a value
type X is the skeletal type Xˢ obtained by removing all effect information, and
similarly for user and kernel types, see Fig. 5. We interpret a value type X as a
subset ⟪X⟫ ⊆ ⟦Xˢ⟧ of the denotation of its skeleton, and similarly for user and
kernel computation types. In other words, we treat the effectful types as refinements
of their skeletons. For this, we define the operation (X0 , X1) ⊸ (Y0 , Y1), for any
X0 ⊆ X1 and Y0 ⊆ Y1 , as the set of maps X1 → Y1 that restrict to X0 → Y0 :
(X0 , X1) ⊸ (Y0 , Y1) ≝ {f : X1 → Y1 | ∀x ∈ X0 . f(x) ∈ Y0}.
Next, observe that the user and the kernel monads preserve subset inclusions, in
the sense that UΣ,E X ⊆ UΣ′,E′ X′ and KΣ,E,S,C X ⊆ KΣ′,E′,S′,C X′ if Σ ⊆ Σ′,
E ⊆ E′, S ⊆ S′, and X ⊆ X′. In particular, we always have UΣ,E X ⊆ Us X
and KΣ,E,S,C X ⊆ KsC X. Finally, let RunnerΣ,Σ′,S C ⊆ Runners C be the subset
of those runners R whose co-operations for Σ factor through KΣ′,Eop,S,C , i.e.,
opR : ⟦Aop⟧ → KΣ′,Eop,S,C ⟦Bop⟧ ⊆ KO,Eop,S,C ⟦Bop⟧, for each op ∈ Σ.
Ground types receive the standard set-theoretic interpretation (Fig. 4).

Skeletal types:
⟦runner C⟧ ≝ Runners ⟦C⟧,    ⟦P !⟧ ≝ Us ⟦P⟧,    ⟦P C⟧ ≝ Ks⟦C⟧ ⟦P⟧.

Skeletons:
Aˢ ≝ A,    (X × Y)ˢ ≝ Xˢ × Yˢ,    (X + Y)ˢ ≝ Xˢ + Yˢ,
(X → Y ! U)ˢ ≝ Xˢ → (Y ! U)ˢ,    (X → Y K)ˢ ≝ Xˢ → (Y K)ˢ,
(X ! U)ˢ ≝ Xˢ !,    (Σ ⇒ (Σ′, S, C))ˢ ≝ runner C.

Denotations:
⟪A⟫ ≝ ⟦A⟧,    ⟪X × Y⟫ ≝ ⟪X⟫ × ⟪Y⟫,
⟪X → Y ! U⟫ ≝ (⟪X⟫, ⟦Xˢ⟧) ⊸ (⟪Y ! U⟫, ⟦(Y ! U)ˢ⟧),
⟪X → Y K⟫ ≝ (⟪X⟫, ⟦Xˢ⟧) ⊸ (⟪Y K⟫, ⟦(Y K)ˢ⟧),
⟪X ! (Σ, E)⟫ ≝ UΣ,E ⟪X⟫,    ⟪X (Σ, E, S, C)⟫ ≝ KΣ,E,S,⟦C⟧ ⟪X⟫.
For interpreting the run construct, the runner V determines, at each op and
environment γ, the co-operation
op̅ a ≝ (if op ∈ Σ then ρ(⟦Γ, x : Aop ⊢s Kop : Bop C⟧(γ, a)) else O).
Here the map ρ : Ks⟦C⟧ ⟦Bop⟧ → KO,Eop,S,⟦C⟧ ⟦Bop⟧ is the skeletal kernel theory
homomorphism characterised by the equations
ρ(return b) = return b,    ρ(op′(a′, κ, (νe)e∈Eop′)) = op′(a′, ρ ∘ κ, (ρ(νe))e∈Eop′),
ρ(getenv κ) = getenv(ρ ∘ κ),    ρ(raise e) = (if e ∈ Eop then raise e else O),
ρ(setenv(c, κ)) = setenv(c, ρ(κ)),    ρ(kill s) = kill s.
The purpose of O in the definition of op̅ is to model a runtime error when the
runner is asked to handle an unexpected operation, while ρ makes sure that op̅
raises at most the exceptions Eop , as prescribed by the signature of op.
At an environment γ ∈ ⟦Γ⟧, V is interpreted as a skeletal runner with state ⟦C⟧, which
induces a monad morphism r : TreeO(−) → (⟦C⟧ ⇒ TreeO(− × ⟦C⟧ + S)), as
in the proof of Prop. 3. Let f : Ks⟦C⟧ ⟦P⟧ → (⟦C⟧ ⇒ Us ⟦Q⟧) be the skeletal
kernel theory homomorphism characterised by the equations (5).
The interpretation of (4) at γ is f(r⟦P⟧+E(⟦Γ ⊢s M : P !⟧ γ))(⟦Γ ⊢s W : C⟧ γ),
which reads: map the interpretation of M at γ from the skeletal user monad
to the skeletal kernel monad using r (which models the operations of M by the
co-operations of V ), and from there using f to a map ⟦C⟧ ⇒ Us ⟦Q⟧, which is then
applied to the initial kernel state, namely, the interpretation of W at γ.
We interpret the context switch Γ ⊢s kernel K @ W finally F : Q ! at an
environment γ ∈ ⟦Γ⟧ as f(⟦Γ ⊢s K : P C⟧ γ)(⟦Γ ⊢s W : C⟧ γ), where f is the
map (5). Finally, the user context switch is interpreted much like exception handling.
We now define coherent semantics of λcoop 's typing derivations by passing
through the skeletal semantics. Given a derivation D of Γ ⊢ V : X, its skeleton
Dˢ derives Γˢ ⊢s V : Xˢ, and we identify the denotation of V with the skeletal one.
All that remains is to check that ⟪Γ ⊢ V : X⟫ restricts to ⟪Γ⟫ → ⟪X⟫. This
is accomplished by induction on D. The only interesting step is subsumption,
which relies on a further observation that X ⊑ Y implies ⟪X⟫ ⊆ ⟪Y⟫. Typing
derivations for user and kernel computations are treated analogously.
With φ in hand, we may formulate the finalisation theorem for λcoop , stating that
the semantics of using V @ W run M finally F is a computation tree all of whose
branches end with finalisation clauses from F . Thus, unless some enveloping
runner sends a signal, finalisation with F is guaranteed to take place.
Theorem 7 (Finalisation). A well-typed run factors through finalisation:
⟪Γ ⊢ (using V @ W run M finally F) : Y ! (Σ′, E′)⟫ γ = φ†γ t,
for some t ∈ TreeΣ′((⟪X⟫ + E) × ⟪C⟫ + S).
Proof. We first prove that f u c = φ†γ (u c) holds for all u ∈ KΣ′,E,S,⟪C⟫ ⟪X⟫
and c ∈ ⟪C⟫, where f is the map (5). The proof proceeds by computational
induction on u [29]. The finalisation statement is then just the special case with
u = r⟪X⟫+E(⟪Γ ⊢ M : X ! (Σ, E)⟫ γ) and c = ⟪Γ ⊢ W : C⟫ γ. □
6 Runners in action
Let us show examples that demonstrate how runners can be usefully combined
to provide flexible resource management. We implemented these and other ex-
amples in the language Coop and a library Haskell-Coop, see §7.
To make the code more understandable, we do not adhere strictly to the
syntax of λcoop , e.g., we use the generic versions of effects [26], as is customary
in programming, and effectful initialisation of kernel state as discussed in §3.2.
Example 8 (Nesting). In Example 4, we considered a runner fileIO for basic file
operations. Let us suppose that fileIO is implemented by immediate calls to the
operating system. Sometimes, we might prefer to accumulate writes and commit
them all at once, which can be accomplished by interposing between fileIO and
user code the following runner accIO, which accumulates writes in its state:
By nesting the runners, and calling the outer write (the one of fileIO) only in the
finalisation code for accIO, the accumulated writes are committed all at once:
using fileIO @ (open "hello.txt") run
  using accIO @ (return "") run
    write "Hello, world."; write "Hello, again."
  finally { return x @ s → write s; return x }
finally { return x @ fh → ... , raise QuotaExceeded @ fh → ... , kill IOError → ... }
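The same accumulate-then-commit pattern can be sketched in plain Haskell, without the Haskell-Coop library (the names here are ours, and unlike the runner above this sketch does not deal with exceptions or signals): the write co-operation only updates the pending state, and the single real output happens in the finalisation step.

import Data.IORef

-- Run user code against an accumulating write, committing on finalisation.
withAccumulatedWrites :: ((String -> IO ()) -> IO a) -> IO a
withAccumulatedWrites body = do
  acc <- newIORef ""                       -- kernel state: pending text
  let write s = modifyIORef acc (++ s)     -- co-operation: accumulate only
  x <- body write
  readIORef acc >>= putStr                 -- finalisation: commit once
  pure x

main :: IO ()
main = withAccumulatedWrites $ \write -> do
  write "Hello, world."
  write "Hello, again."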
Here the interposed runner implements all operations of some enveloping runner,
by simply forwarding them, while also measuring computational cost by counting
the total number of operation calls, which is then reported during finalisation.
Example 10 (ML-style references). Continuing with the theme of nested runners,
they can also be used to implement abstract and safe interfaces to low-level
resources. For instance, suppose we have a low-level implementation of a memory
heap that potentially allows unsafe memory access, and we would like to
implement ML-style references on top of it. A good first attempt is the runner
{ ref x → let h = getenv () in
          let (r,h') = malloc h x in
          setenv h'; return r,
  get r → let h = getenv () in memread h r,
  put (r, x) → let h = getenv () in memset h r x }heap
which has the desired interface, but still suffers from three deficiencies that can be
addressed with further language support. First, abstract types would let us hide
the fact that references are just memory locations, so that the user code could
never devise invalid references or otherwise misuse them. Second, our simple
typing discipline forces all references to hold the same type, but in reality we
want them to have different types. This could be achieved through quantification
over types in the low-level implementation of the heap, as we have done in the
Haskell-Coop library using Haskell’s forall. Third, user code could hijack
a reference and misuse it out of the scope of the runner, which is difficult to
prevent. In practice the problem does not occur because, so to speak, the runner
for references is at the very top level, from which user code cannot escape.
Example 11 (Monotonic state). Nested runners can also implement access
restrictions to resources, with applications in security [8]. For example, we can
implement monotonic state: references that may only be updated according to
a given preorder.
The runner’s state is a map from references to preorders on integers. The co-
operation mref x rel creates a new reference r initialised with x (by calling ref of
the outer runner), and then adds the pair pr, relq to the map stored in the runner’s
state. Reading is delegated to the outer runner, while assignment first checks that
the new state is larger than the old one, according to the associated preorder. If
the preorder is respected, the runner proceeds with assignment (again delegated
to the outer runner), otherwise it reports a monotonicity violation. We may not
assume that every reference has an associated preorder, because user code could
pass to mput a reference that was created earlier outside the scope of the runner.
If this happens, the runner simply kills the offending user code with a signal.
Example 12 (Pairing). Another form of modularity is achieved by pairing runners.
Given two runners {(op x ↦ Kop)op∈Σ1}C1 and {(op′ x ↦ Kop′)op′∈Σ2}C2 ,
e.g., for state and file operations, we can use them side-by-side by combining
them into a single runner with operations Σ1 + Σ2 and kernel state C1 × C2 , as
follows (the co-operations op′ of the second runner are treated symmetrically):
{ op x → let (c,c') = getenv () in
    user
      kernel (Kop x) @ c finally {
        return y @ c'' → return (inl (inl y, c'')),
        (raise e @ c'' → return (inl (inr e, c'')))e∈Eop,
        (kill s → return (inr s))s∈S1 }
    with {
      return (inl (inl y, c'')) → setenv (c'', c'); return y,
      return (inl (inr e, c'')) → setenv (c'', c'); raise e,
      return (inr s) → kill s },
  op' x → ... , ... }C1×C2
Notice how the inner kernel context switch passes to the co-operation Kop only
its part of the combined state, and how it returns the result of Kop in a reified
form (which requires treating exceptions and signals as values). The outer user
context switch then receives this reified result, updates the combined state, and
forwards the result (return value, exception, or signal) in unreified form.
7 Implementation
We accompany the theoretical development with two implementations of λcoop :
a prototype language Coop [6], and a Haskell library Haskell-Coop [1].
Coop, implemented in OCaml, demonstrates what a more fully-featured
language based on λcoop might look like. It implements a bi-directional variant
of λcoop ’s type system, extended with type definitions and algebraic datatypes,
to provide algorithmic typechecking and type inference. The operational seman-
tics is based on the computation rules of the equational theory from §4.4, but
extended with general recursion, pairing of runners from Example 12, and an in-
terface to the OCaml runtime called containers—these are essentially top-level
runners defined directly in OCaml. They are a modular and systematic way of
offering several possible top-level runtime environments to the programmer.
The Haskell-Coop library is a shallow embedding of λcoop in Haskell. The
implementation closely follows the denotational semantics of λcoop . For instance,
user and kernel monads are implemented as corresponding Haskell monads.
Internally, the library uses the Freer monad of Kiselyov [14] to implement free
model monads for given signatures of operations. The library also provides a
means to run user code via Haskell’s top-level monads. For instance, code
that performs input-output operations may be run in Haskell’s IO monad.
Haskell’s advanced features make it possible to use Haskell-Coop to
implement several extensions to the examples from §6. For instance, we implement
ML-style state that allows references holding arbitrary values (of different types),
and state that uses Haskell’s type system to track which references are alive.
The library also provides pairing of runners from Example 12, e.g., to combine
state and input-output. We also use the library to demonstrate that ambient
functions from the Koka language [18] can be implemented with runners by
treating their binding and application as co-operations. (These are functions
that are bound dynamically but evaluated in the lexical scope of their binding.)
8 Related work
Comodels and (ordinary) runners have been used as a natural model of stateful
top-level behaviour. For instance, Plotkin and Power [27] have given a treatment
of operational semantics using the tensor product of a model and a comodel.
Recently, Katsumata, Rivas, and Uustalu have generalised this interaction of
models and comodels to monads and comonads [13]. An early version of Eff [4]
implemented resources, which were a kind of stateful runners, although they
lacked a satisfactory theory. Uustalu [35] has pointed out that runners are the
additional structure that one has to impose on state to run algebraic effects
statefully. Møgelberg and Staton’s [21] linear-use state-passing translation also
relies on equipping the state with a comodel structure for the effects at hand.
Our runners arise when their setup is specialised to a certain Kleisli adjunction.
Our use of kernel state is analogous to the use of parameters in parameter-
passing handlers [30]: their return clause also provides a form of finalisation, as
the final value of the parameter is available. There is however no guarantee of
finalisation happening because handlers need not use the continuation linearly.
The need to tame the excessive generality of handlers, and willingness to give
it up in exchange for efficiency and predictability, has recently been recognised
by Multicore OCaml’s implementors, who have observed that in practice
most handlers resume continuations precisely once [9]. In exchange for impres-
sive efficiency, they require continuations to be used linearly by default, whereas
discarding and copying must be done explicitly, incurring additional cost. Lei-
jen [17] has extended handlers in Koka with a finally clause, whose semantics
ensures that finalisation happens whenever a handler discards its continuation.
Leijen also added an initially clause to parameter-passing handlers, which is used
to compute the initial value of the parameter before handling, but that gets
executed again every time the handler resumes its continuation.
References
1. Ahman, D.: Library Haskell-Coop. Available at https://github.com/
danelahman/haskell-coop (2019)
2. Ahman, D., Fournet, C., Hritcu, C., Maillard, K., Rastogi, A., Swamy, N.: Recalling
a witness: foundations and applications of monotonic state. PACMPL 2(POPL),
65:1–65:30 (2018)
3. Bauer, A., Pretnar, M.: An effect system for algebraic effects and handlers. Logical
Methods in Computer Science 10(4) (2014)
4. Bauer, A., Pretnar, M.: Programming with algebraic effects and handlers. J. Log.
Algebr. Meth. Program. 84(1), 108–123 (2015)
5. Bauer, A.: What is algebraic about algebraic effects and handlers? CoRR
abs/1807.05923 (2018)
6. Bauer, A.: Programming language coop. Available at https://github.com/
andrejbauer/coop (2019)
7. Benton, N., Kennedy, A.: Exceptional syntax. Journal of Functional Programming
11(4), 395–410 (2001)
8. Delignat-Lavaud, A., Fournet, C., Kohlweiss, M., Protzenko, J., Rastogi, A.,
Swamy, N., Zanella-Béguelin, S., Bhargavan, K., Pan, J., Zinzindohoue, J.K.: Implementing
and proving the TLS 1.3 record layer. In: 2017 IEEE Symp. on Security
and Privacy (SP). pp. 463–482 (2017)
9. Dolan, S., Eliopoulos, S., Hillerström, D., Madhavapeddy, A., Sivaramakrishnan,
K.C., White, L.: Concurrent system programming with effect handlers. In: Wang,
M., Owens, S. (eds.) Trends in Functional Programming. pp. 98–117. Springer
International Publishing, Cham (2018)
10. Foster, J.N., Greenwald, M.B., Moore, J.T., Pierce, B.C., Schmitt, A.: Combinators
for bidirectional tree transformations: A linguistic approach to the view-update
problem. ACM Trans. Program. Lang. Syst. 29(3) (2007)
11. Hyland, M., Plotkin, G., Power, J.: Combining effects: Sum and tensor. Theor.
Comput. Sci. 357(1–3), 70–99 (2006)
12. Kammar, O., Lindley, S., Oury, N.: Handlers in action. In: Proc. of 18th ACM
SIGPLAN Int. Conf. on Functional Programming, ICFP 2013. ACM (2013)
13. Katsumata, S., Rivas, E., Uustalu, T.: Interaction laws of monads and comonads.
CoRR abs/1912.13477 (2019)
14. Kiselyov, O., Ishii, H.: Freer monads, more extensible effects. In: Proc. of 2015
ACM SIGPLAN Symp. on Haskell. pp. 94–105. Haskell ’15, ACM (2015)
15. Koopman, P., Fokker, J., Smetsers, S., van Eekelen, M., Plasmeijer, R.: Functional
Programming in Clean. University of Nijmegen (1998), draft
16. Leijen, D.: Structured asynchrony with algebraic effects. In: Proceedings of
the 2nd ACM SIGPLAN International Workshop on Type-Driven Development,
TyDe@ICFP 2017, Oxford, UK, September 3, 2017. pp. 16–29. ACM (2017)
17. Leijen, D.: Algebraic effect handlers with resources and deep finalization. Tech.
Rep. MSR-TR-2018-10, Microsoft Research (April 2018)
18. Leijen, D.: Programming with implicit values, functions, and control (or, implicit
functions: Dynamic binding with lexical scoping). Tech. Rep. MSR-TR-2019-7,
Microsoft Research (March 2019)
19. Levy, P.B.: Call-By-Push-Value: A Functional/Imperative Synthesis, Semantics
Structures in Computation, vol. 2. Springer (2004)
20. Miltner, A., Maina, S., Fisher, K., Pierce, B.C., Walker, D., Zdancewic, S.: Synthe-
sizing symmetric lenses. Proc. ACM Program. Lang. 3(ICFP), 95:1–95:28 (2019)
21. Møgelberg, R.E., Staton, S.: Linear usage of state. Logical Methods in Computer
Science 10(1) (2014)
22. Moggi, E.: Computational lambda-calculus and monads. In: Proc. of 4th Ann.
Symp. on Logic in Computer Science, LICS 1989. pp. 14–23. IEEE (1989)
23. Moggi, E.: Notions of computation and monads. Inf. Comput. 93(1), 55–92 (1991)
24. O’Connor, R.: Functor is to lens as applicative is to biplate: Introducing multiplate.
CoRR abs/1103.2841 (2011)
25. Plotkin, G., Power, J.: Semantics for algebraic operations. In: Proc. of 17th Conf. on
the Mathematical Foundations of Programming Semantics, MFPS XVII. ENTCS,
vol. 45, pp. 332–345. Elsevier (2001)
26. Plotkin, G., Power, J.: Algebraic operations and generic effects. Appl. Categor.
Struct. (1), 69–94 (2003)
27. Plotkin, G., Power, J.: Tensors of comodels and models for operational semantics.
In: Proc. of 24th Conf. on Mathematical Foundations of Programming Semantics,
MFPS XXIV. ENTCS, vol. 218, pp. 295–311. Elsevier (2008)
28. Plotkin, G.D., Power, J.: Notions of computation determine monads. In: Proc. of
5th Int. Conf. on Foundations of Software Science and Computation Structures,
FOSSACS 2002. LNCS, vol. 2303, pp. 342–356. Springer (2002)
29. Plotkin, G.D., Pretnar, M.: A logic for algebraic effects. In: Proc. of 23th Ann.
IEEE Symp. on Logic in Computer Science, LICS 2008. pp. 118–129. IEEE (2008)
30. Plotkin, G.D., Pretnar, M.: Handling algebraic effects. Logical Methods in Com-
puter Science 9(4:23) (2013)
31. Power, J., Shkaravska, O.: From comodels to coalgebras: State and arrays. Electr.
Notes Theor. Comput. Sci. 106, 297–314 (2004)
32. Power, J.: Enriched Lawvere theories. Theory Appl. Categ 6(7), 83–93 (1999)
33. Pretnar, M.: The Logic and Handling of Algebraic Effects. Ph.D. thesis, School of
Informatics, University of Edinburgh (2010)
34. Saleh, A.H., Karachalias, G., Pretnar, M., Schrijvers, T.: Explicit effect subtyping.
In: Proc. of 27th European Symposium on Programming, ESOP 2018. pp. 327–354.
LNCS, Springer (2018)
35. Uustalu, T.: Stateful runners of effectful computations. Electr. Notes Theor. Com-
put. Sci. 319, 403–421 (2015)
36. Wadler, P.: The essence of functional programming. In: Sethi, R. (ed.) Proc. of 19th
Ann. ACM SIGPLAN-SIGACT Symp. on Principles of Programming Languages,
POPL 1992. pp. 1–14. ACM (1992)
On the Versatility of Open Logical Relations
Continuity, Automatic Differentiation,
and a Containment Theorem
Abstract. Logical relations are one among the most powerful tech-
niques in the theory of programming languages, and have been used
extensively for proving properties of a variety of higher-order calculi.
However, there are properties that cannot be immediately proved by
means of logical relations, for instance program continuity and differen-
tiability in higher-order languages extended with real-valued functions.
Informally, the problem stems from the fact that these properties are
naturally expressed on terms of non-ground type (or, equivalently, on
open terms of base type), and there is no apparent good definition for
a base case (i.e. for closed terms of ground types). To overcome this is-
sue, we study a generalization of the concept of a logical relation, called
open logical relation, and prove that it can be fruitfully applied in sev-
eral contexts in which the property of interest is about expressions of
first-order type. Our setting is a simply-typed λ-calculus enriched with real numbers and real-valued first-order functions drawn from a given set, such as the set of continuous or differentiable functions. We first prove a
containment theorem stating that for any collection of real-valued first-
order functions including projection functions and closed under function
composition, any well-typed term of first-order type denotes a function
belonging to that collection. Then, we show by way of open logical re-
lations the correctness of the core of a recently published algorithm for
forward automatic differentiation. Finally, we define a refinement-based
type system for local continuity in an extension of our calculus with con-
ditionals, and prove the soundness of the type system using open logical
relations.
The second and fourth authors are supported by the ANR project 16CE250011 REPAS, the ERC Consolidator Grant DIAPASoN – DLV-818616, and the MIUR PRIN 201784YSZ5 ASPRA.
1 Introduction
⁵ To avoid misunderstandings, we emphasize that we use “first-order properties” to refer to properties of expressions of first-order types, and not in relation to definability of properties in first-order predicate logic.
2 The Playground
In order to facilitate the communication of the main ideas behind open logical
relations and their applications, this paper deals with several vehicle calculi. All
such calculi can be seen as derived from a unique calculus, denoted by Λ×,→,R ,
which thus provides the common ground for our inquiry. The calculus Λ×,→,R is
obtained by adding to the simply typed λ-calculus with product and arrow types
(which we denote by Λ×,→ ) a ground type R for real numbers and constants r
of type R, for each real number r.
Given a collection F of real-valued functions, i.e. functions f : Rn → R
(with n ≥ 1), we endow Λ×,→,R with an operator f , for any f ∈ F, whose
  τ ::= R | τ × τ | τ → τ        Γ ::= · | x : τ, Γ

  Γ, x : τ ⊢ x : τ        Γ ⊢ r : R

  Γ ⊢ t₁ : R  · · ·  Γ ⊢ tₙ : R          Γ, x : τ₁ ⊢ t : τ₂
  ─────────────────────────────          ──────────────────
     Γ ⊢ f(t₁, . . . , tₙ) : R           Γ ⊢ λx.t : τ₁ → τ₂

  Γ ⊢ s : τ₁ → τ₂    Γ ⊢ t : τ₁          Γ ⊢ t₁ : τ    Γ ⊢ t₂ : σ          Γ ⊢ t : τ₁ × τ₂
  ─────────────────────────────          ────────────────────────          ─────────────── (i ∈ {1, 2})
          Γ ⊢ st : τ₂                      Γ ⊢ (t₁, t₂) : τ × σ              Γ ⊢ t.i : τᵢ
We do not confine ourselves to a fixed operational semantics (e.g. a call-by-value one), but take advantage of the simply-typed nature of Λ^{×,→,R}_F and opt for a set-theoretic denotational semantics. The category of sets and functions being cartesian closed, the denotational semantics of Λ^{×,→,R}_F is standard and associates to any judgment x₁ : τ₁, . . . , xₙ : τₙ ⊢ t : τ a function ⟦x₁ : τ₁, . . . , xₙ : τₙ ⊢ t : τ⟧ : ∏ᵢ ⟦τᵢ⟧ → ⟦τ⟧, where ⟦τ⟧, the semantics of τ, is thus defined:

  ⟦R⟧ = R        ⟦τ₁ × τ₂⟧ = ⟦τ₁⟧ × ⟦τ₂⟧        ⟦τ₁ → τ₂⟧ = ⟦τ₂⟧^{⟦τ₁⟧}.
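To make the set-theoretic reading concrete, here is a small Haskell sketch (ours, with illustrative names: Double stands in for R, and Haskell functions play the role of set-theoretic maps); it is a toy evaluator under these assumptions, not part of the formal development:

  -- Deep embedding of a Λ^{×,→,R}_F-like calculus.
  data Val = VR Double            -- inhabitants of ⟦R⟧
           | VP Val Val           -- inhabitants of ⟦τ₁ × τ₂⟧
           | VF (Val -> Val)      -- inhabitants of ⟦τ₁ → τ₂⟧

  data Tm = Var String
          | Lit Double                       -- numerals r
          | Op ([Double] -> Double) [Tm]     -- f(t₁, ..., tₙ) with f ∈ F
          | Lam String Tm
          | App Tm Tm
          | Pair Tm Tm
          | Proj Int Tm                      -- t.1 and t.2

  -- The semantics of a judgment, as a function of an environment.
  eval :: [(String, Val)] -> Tm -> Val
  eval env (Var x)    = maybe (error ("unbound " ++ x)) id (lookup x env)
  eval _   (Lit r)    = VR r
  eval env (Op f ts)  = VR (f [ r | VR r <- map (eval env) ts ])
  eval env (Lam x t)  = VF (\v -> eval ((x, v) : env) t)
  eval env (App s t)  = case eval env s of
                          VF g -> g (eval env t)
                          _    -> error "not a function"
  eval env (Pair s t) = VP (eval env s) (eval env t)
  eval env (Proj i t) = case eval env t of
                          VP a b -> if i == 1 then a else b
                          _      -> error "not a pair"

For instance, eval [] (App (Lam "x" (Op sum [Var "x", Lit 1])) (Lit 2)) yields VR 3.0.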
3 A Fundamental Gap
to denote essentially the same set of functions, modulo the adjunction between R² → R and R → (R → R). But this is clearly not the case: just consider the
function f in R → (R → R) defined by:

  f(x) = λy.y        if x ≥ 0
  f(x) = λy.y + 1    if x < 0.
Clearly, f maps any fixed real number to a polynomial, but when curried, it is far from being a polynomial. In other words, reducibility seems inadequate to capture situations like the one above, in which the “base case” is not the one of ground types, but rather the one of first-order types.
Before proceeding any further, it is useful to fix the boundaries of our in-
vestigation. We are interested in proving that (the semantics of) programs of
We extend F^Θ_τ to the predicate F^{Γ,Θ}_τ, where Γ ranges over arbitrary environments (possibly containing variables of type R), as follows:

  t ∈ F^{Γ,Θ}_τ  ⟺  (Γ, Θ ⊢ t : τ  ∧  ∀γ. γ ∈ F^Γ_Θ ⟹ tγ ∈ F^Θ_τ).

Here, γ ranges over substitutions⁶, and γ ∈ F^Γ_Θ holds if the support of γ is Γ and γ(x) ∈ F^Θ_τ for any (x : τ) ∈ Γ.

⁶ We write tγ for the result of applying γ to the variables in t.
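For instance (our unfolding), taking Γ = y : R and τ = R, the definition specializes to: t ∈ F^{y:R,Θ}_R if and only if y : R, Θ ⊢ t : R and t[s/y] ∈ F^Θ_R for every s ∈ F^Θ_R. The base predicate is thus lifted to open terms by closing them under all predicate-respecting substitutions.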
5 Automatic Differentiation
In this section, we show how we can use open logical relations to prove the
correctness of (a fragment of) the automatic differentiation algorithm of [50]
(suitably adapted to our calculus).
Automatic differentiation [8,9,35] (AD, for short) is a family of techniques to efficiently compute the numerical (as opposed to symbolic) derivative of a computer program denoting a real-valued function. Roughly speaking, AD
acts on the code of a program by letting variables incorporate values for their
derivative, and operators propagate derivatives according to the chain rule of
differential calculus [52]. Due to its vast applications in machine learning (back-
propagation [49] being an example of an AD technique) and, most notably, in
deep learning [9], AD is rapidly becoming a topic of interest in the programming
language theory community, as witnessed by the new line of research called dif-
ferentiable programming (see, e.g., [28,50,16,1] for some recent results on AD
and programming language theory developed in the latter field).
AD comes in several modes, the two most important ones being the forward mode (also called tangent mode) and the backward mode (also called reverse mode). These can be seen as different ways of computing the chain rule, the former traversing the chain rule from the inside to the outside, the latter from the outside to the inside.
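As a schematic illustration (ours), consider a composite real-valued function h(g(f(x))), whose derivative by the chain rule is

  h′(g(f(x))) · g′(f(x)) · f′(x).

Forward mode computes this product from the inside out, propagating f′(x) forward together with the value f(x); backward mode computes it from the outside in, starting from h′ and propagating sensitivities backwards through g and f.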
Here we are concerned with forward mode AD. More specifically, we consider
the forward mode AD algorithm recently proposed in [50]. The latter is based
on a source-to-source program transformation extracting out of a program t a
new program Dt whose evaluation simultaneously gives the result of computing
t and its derivative. This is achieved by augmenting the code of t in such a way as to handle dual numbers⁷.
The transformation roughly goes as follows: an expression s of type R is transformed into a dual number, i.e. an expression of type R × R whose first component gives the original value of s and whose second component gives the derivative of s. Real-valued function symbols are then extended to handle dual numbers by applying the chain rule, while the other constructors of the language are extended pointwise.
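As a sanity check of this recipe, the following self-contained Haskell sketch (ours, and much cruder than the transformation of [50]) implements dual numbers together with a one-variable derivative operator; lift1 plays the role of the chain-rule clause for a unary f ∈ F whose derivative f′ is known:

  -- A dual number packs a value with its derivative (tangent).
  data Dual = Dual { value :: Double, tangent :: Double } deriving Show

  instance Num Dual where
    Dual x x' + Dual y y' = Dual (x + y) (x' + y')
    Dual x x' - Dual y y' = Dual (x - y) (x' - y')
    Dual x x' * Dual y y' = Dual (x * y) (x' * y + x * y')  -- product rule
    fromInteger n         = Dual (fromInteger n) 0          -- constants: derivative 0
    negate (Dual x x')    = Dual (negate x) (negate x')
    abs    (Dual x x')    = Dual (abs x) (x' * signum x)    -- a.e. derivative of |·|
    signum (Dual x _)     = Dual (signum x) 0

  -- Chain rule for a unary primitive f with known derivative f'.
  lift1 :: (Double -> Double) -> (Double -> Double) -> Dual -> Dual
  lift1 f f' (Dual x x') = Dual (f x) (f' x * x')

  -- Derivative of a univariate program: seed the tangent with 1.
  deriv :: (Dual -> Dual) -> Double -> Double
  deriv g x = tangent (g (Dual x 1))

For example, deriv (\x -> x * x + lift1 sin cos x) 0 evaluates to 1.0, i.e. 2 · 0 + cos 0.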
The algorithm of [50] has been studied by means of benchmarks and, to the best of the authors’ knowledge, the only proof of its correctness available in the literature⁸ has been given, at the time of writing, by Huot et al. in [37]. However, the latter proof relies on denotational semantics, and no operational proof of correctness has been given so far. Differentiability being a first-order concept, open logical relations are thus a perfect candidate for such a job.
  Γ ⊢ t : τ  ⟹  DΓ ⊢ Dt : Dτ.
Let us comment on the definition of D, beginning with its action on types. Following the rationale behind forward-mode AD, the map D associates to the type
⁷ We represent dual numbers [21] as pairs of the form (x, x′), with x, x′ ∈ R. The first component, namely x, is subject to the usual real-number arithmetic, whereas the second component, namely x′, obeys first-order differentiation arithmetic. Dual numbers are usually presented, in analogy with complex numbers, as formal sums of the form x + x′ε, where ε is an abstract number (an infinitesimal) subject to the law ε² = 0.
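For illustration, multiplying two such formal sums shows how the law ε² = 0 encodes the product rule:

  (x + x′ε)(y + y′ε) = xy + (x′y + xy′)ε + x′y′ε² = xy + (x′y + xy′)ε.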
⁸ However, we remark that formal approaches to backward automatic differentiation for higher-order languages have recently been proposed in [1,16] (see Section 7).
  DR = R × R        D(τ₁ × τ₂) = Dτ₁ × Dτ₂        D(τ₁ → τ₂) = Dτ₁ → Dτ₂
  D(·) = ·          D(x : τ, Γ) = dx : Dτ, DΓ

  Dr = (r, 0)
  D(f(t₁, . . . , tₙ)) = (f(Dt₁.1, . . . , Dtₙ.1), Σ^n_{i=1} ∂_{x_i}f(Dt₁.1, . . . , Dtₙ.1) ∗ Dtᵢ.2)
R the product type R × R, the first and second components of its inhabitants being the original expression and its derivative, respectively. The action of D on non-basic types is straightforward, and it is designed so that the automatic differentiation machinery can handle higher-order expressions in such a way as to guarantee correctness at real-valued function types.
The action of D on the usual constructors of the λ-calculus is pointwise,
although it is worth noticing that D associates to any variable x of type τ a new
variable, which we denote by dx, of type Dτ . As we are going to see, if τ = R,
then dx acts as a placeholder for a dual number.
More interesting is the action of D on real-valued constructors. To any nu-
meral r, D associates the pair Dr = (r, 0), the derivative of a number being zero.
Let us now inspect the action of D on an operator f associated to f : Rn → R
(we treat f as a function in the variables x1 , . . . , xn ). The interesting part is the
second component of D(f (t1 , . . . , tn )), namely
  Σ^n_{i=1} ∂_{x_i}f(Dt₁.1, . . . , Dtₙ.1) ∗ Dtᵢ.2
where Σ^n_{i=1} and ∗ denote the operators (of Λ^{×,→,R}_F) associated to summation and (binary) multiplication (for readability, we omit the underline notation), and ∂_{x_i}f is the operator (of Λ^{×,→,R}_F) associated to the partial derivative ∂_{x_i}f of f in the variable x_i. It is not hard to recognize that the above expression is nothing but an instance of the chain rule.
Finally, we notice that if Γ ⊢ t : τ is a (derivable) judgment in Λ^{×,→,R}_D, then indeed DΓ ⊢ Dt : Dτ is a (derivable) judgment in Λ^{×,→,R}_F.
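As a concrete instance of the transformation (our example, assuming sin ∈ F with ∂_x sin = cos), for the judgment x : R ⊢ sin(x) : R the above clauses give

  D(sin(x)) = (sin(dx.1), cos(dx.1) ∗ dx.2)

with dx : R × R; substituting the dual number (x, 1) for dx yields the pair (sin(x), cos(x)), i.e. the value of the original expression together with its derivative.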
Open logical relations for AD. We have claimed that the operation deriv performs automatic differentiation of Λ^{×,→,R}_D expressions. By that we mean that, once applied to expressions of the form x₁ : R, . . . , xₙ : R ⊢ t : R, the operation deriv can be used to compute the derivative of x₁ : R, . . . , xₙ : R ⊢ t : R. We now show how to prove such a statement using open logical relations, thereby obtaining a proof of correctness of our AD program transformation.
We begin by defining a logical relation R between Λ^{×,→,R}_D and Λ^{×,→,R}_F expressions. We design R in such a way that (i) t R Dt, and (ii) if t R s and t inhabits a first-order type, then s indeed corresponds to the derivative of t. While (ii) essentially holds by definition, (i) requires some effort to be proved.
t R^Θ_R s  ⟺  Θ ⊢ t : R  ∧  DΘ ⊢ s : R × R  ∧  ∀y : R.
    ⟦Θ ⊢ s[dual_y(x₁)/dx₁, . . . , dual_y(xₙ)/dxₙ].1 : R⟧ = ⟦Θ ⊢ t : R⟧
  ∧ ⟦Θ ⊢ s[dual_y(x₁)/dx₁, . . . , dual_y(xₙ)/dxₙ].2 : R⟧ = ∂_y⟦Θ ⊢ t : R⟧

t R^Θ_{τ₁→τ₂} s  ⟺  Θ ⊢ t : τ₁ → τ₂  ∧  DΘ ⊢ s : Dτ₁ → Dτ₂
  ∧  ∀p, q. p R^Θ_{τ₁} q ⟹ tp R^Θ_{τ₂} sq

t R^Θ_{τ₁×τ₂} s  ⟺  Θ ⊢ t : τ₁ × τ₂  ∧  DΘ ⊢ s : Dτ₁ × Dτ₂
  ∧  ∀i ∈ {1, 2}. t.i R^Θ_{τᵢ} s.i

Moreover, R is closed under the underlying semantics:

  t R^Θ_τ s  ∧  ⟦Θ ⊢ t : τ⟧ = ⟦Θ ⊢ t′ : τ⟧  ⟹  t′ R^Θ_τ s
  t R^Θ_τ s  ∧  ⟦DΘ ⊢ s : Dτ⟧ = ⟦DΘ ⊢ s′ : Dτ⟧  ⟹  t R^Θ_τ s′.
We are now ready to state and prove the main result of this section.
  ⟦Θ ⊢ dx[dual_y(x)/dx].1 : R⟧ = ⟦Θ ⊢ x : R⟧
  ⟦Θ ⊢ dx[dual_y(x)/dx].2 : R⟧ = ∂_y⟦Θ ⊢ x : R⟧
for any variable y (of type R). The first identity obviously holds as
In the latter case we have dual_y(x) = (x, 0), and thus:
  λx.sγ R^Θ_{τ₁→τ₂} λdx.(Ds)δ,

i.e. (λx.sγ)p R^Θ_{τ₂} (λdx.(Ds)δ)q, for all p R^Θ_{τ₁} q. Let us fix a pair (p, q) as above. By Lemma 2, it is sufficient to show (sγ)[p/x] R^Θ_{τ₂} ((Ds)δ)[q/dx]. Let γ′, δ′ be the substitutions defined as follows:

  γ′(y) = p if y = x, and γ′(y) = γ(y) otherwise;
  δ′(y) = q if y = dx, and δ′(y) = δ(y) otherwise.
for any real-valued variable y, meaning that Dt indeed computes the partial
derivative of t.
The intended dynamic semantics of the term if t then s else p is the same as that of s whenever t evaluates to a real number r ≠ 0, and the same as that of p if t evaluates to 0.
Notice that the crux of the problem we aim to solve is the presence of the if-then-else construct. Indeed, independently of point (i), such a construct breaks the global continuity of programs, as illustrated in Figure 3a. As a consequence, we are forced to look at local continuity properties instead: for instance, we can say that the program of Figure 3a is continuous both on R<0 and on R≥0. Observe that guaranteeing local continuity allows us (up to a certain point) to recover the ability to approximate the output of a program by approximating its input. Indeed, if a program t : R × · · · × R → R is locally continuous on a subset X of Rⁿ, then the value of ts (for some input s) can be approximated
Fig. 3: (a) t = λx.if x < 0 then −x else x + 1;  (b) t = λx.if x < 0 then 1 else x + 1 (plots of t(x) against x omitted).
same in their system for the corresponding imperative program: they require the domain of continuity of each of the two branches to coincide with the domain of continuity of the whole program.
• On the other hand, the system of Chaudhuri et al. allows one to express continuity along a restricted set of variables, which we cannot do. To illustrate this, consider the program λx, y.if (x = 0) then (3 ∗ y) else (4 ∗ y): along the variable y, this program is continuous on the whole of R. Chaudhuri et al. are able to express and prove this statement in their system, while we can only say that for every real a, this program is continuous on the domain {a} × R.
For the sake of simplicity, we slightly restrict our calculus; the ideas we present here would still be valid in a more general setting, but that would make the presentation and the proofs more involved. As usual, let F be a collection of real-valued functions. We consider the restriction of the calculus Λ^{×,→,R}_F obtained by considering only types of the form

  τ ::= R | ρ        ρ ::= ρ₁ × · · · × ρₙ × R × · · · × R → τ    (with m occurrences of R);
Recall that with the connectives in our logic we are able to encode logical disjunction and implication; as customary, we write φ ⇒ ψ for ¬φ ∨ ψ. A real assignment is a partial map σ : V → R. When σ has finite support, we sometimes specify σ by writing (α₁ ↦ σ(α₁), . . . , αₙ ↦ σ(αₙ)). We write σ |= φ when σ is defined on all the variables occurring in φ and, moreover, the real formula obtained by replacing along σ the logical variables of φ is true. We write |= φ when σ |= φ holds for every such σ.
We can associate to every formula the subset of Rⁿ consisting of all points where the formula holds: more precisely, if φ is a formula and X = α₁, . . . , αₙ is a list of logical variables such that Vars(φ) ⊆ X, we call the truth domain of φ w.r.t. X the set

  Dom(φ)_X := {(a₁, . . . , aₙ) ∈ Rⁿ | (α₁ ↦ a₁, . . . , αₙ ↦ aₙ) |= φ}.
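For example, taking X = α, β, the truth domain Dom(α ≥ 0 ∧ β ≥ 0)_X is the closed upper-right quadrant {(a, b) ∈ R² | a ≥ 0 ∧ b ≥ 0}; this is precisely the domain appearing in Examples 4–6 below.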
We are now ready to define the language of refinement types, which can be
seen as simple types annotated by logical formulas. The type R is annotated by
logical variables: this way we obtain refinement real types of the form {α ∈ R}.
The crux of our refinement type system consists in the annotations we put on
the arrows. We introduce two distinct refined arrow constructs, depending on
the shape of the target type: more precisely we annotate the arrow of a type
(T1 , . . . , Tn ) → R with two logical formulas, while we annotate (T1 , . . . , Tn ) → H
(where H is an higher-order type) with only one logical formula. This way, we ob-
ψφ ψ
tain refined arrow types of the form (T1 , . . . , Tn ) → {α ∈ R}, and (T1 , . . . , Tn ) →
H: in both cases the formula ψ specifies the continuity domain, while the formula
φ is an image annotation used only when the target type is ground. The intuition
ψφ
is as follows: a program of type (H1 , . . . , Hn , {α1 ∈ R}, . . . , {αn ∈ R}) → {α ∈ R}
uses its real arguments continuously on the domain specified by the formula ψ
(w.r.t α1 , . . . , αn ), and this domain is sent into the domain specified by the for-
ψ
mula φ (w.r.t. α). Similarly, a program of the type (T1 , . . . , Tn ) → H has its real
arguments used in a continuous way on the domain specified by ψ, but it is not
possible anymore to specify an image domain, because H is higher-order.
The general form of our refined types is thus as follows:
We first consider refinement typing rules for the fragment of our language which
excludes conditionals: they are given in Figure 4. We illustrate them by way of
a series of examples.
Example 4. We first look at the typing rule var-F: if θ implies θ′, then the variable x (which, in semantic terms, projects the context Γ onto one of its components) sends the truth domain of θ continuously into the truth domain of θ′. Using this rule we can, for instance, derive the following judgment:

  x : {α ∈ R}, y : {β ∈ R} ⊢ᵣ^{(α≥0∧β≥0),(α≥0)} x : {α ∈ R}.    (1)
Example 5. We now look at the rule Rf, which deals with functions from F. Using this rule, we can show that:

  x : {α ∈ R}, y : {β ∈ R} ⊢ᵣ^{(α≥0∧β≥0),(γ≥0)} min(x, y) : {γ ∈ R}.    (2)
Before giving the refined typing rule for the if-then-else construct, we also illustrate, on an example, how the rules in Figure 4 allow us to exploit, compositionally, the continuity information we have about the functions in F.
The rules in Figure 4 (refinement typing, conditional-free fragment) are as follows:

var-H:  Γ, x : H ⊢ᵣ^{ψ} x : H

var-F:  from |= θ ⇒ θ′, infer Γ, x : {α ∈ R} ⊢ᵣ^{θ,θ′} x : {α ∈ R}

Rf:  from Γ ⊢ᵣ^{θ,θᵢ} tᵢ : {αᵢ ∈ R} (for each i), f ∈ F continuous on Dom(θ₁ ∧ . . . ∧ θₙ)_{α₁...αₙ}, and f(Dom(θ₁ ∧ . . . ∧ θₙ)_{α₁...αₙ}) ⊆ Dom(θ′)_β, infer Γ ⊢ᵣ^{θ,θ′} f(t₁, . . . , tₙ) : {β ∈ R}

abs:  from Γ, x₁ : T₁, . . . , xₙ : Tₙ ⊢ᵣ^{ψ(η)} t : T and |= ψ₁ ∧ ψ₂ ⇒ ψ, infer Γ ⊢ᵣ^{ψ₂} λ(x₁, . . . , xₙ).t : (T₁, . . . , Tₙ) →^{ψ₁(η)} T

app:  from Γ ⊢ᵣ^{φ} t : (H₁, . . . , Hₘ, F₁, . . . , Fₙ) →^{θ(η)} T, (Γ ⊢ᵣ^{φ} sᵢ : Hᵢ)_{1≤i≤m}, (Γ ⊢ᵣ^{φ,θⱼ} pⱼ : Fⱼ)_{1≤j≤n}, and |= θ₁ ∧ . . . ∧ θₙ ⇒ θ, infer Γ ⊢ᵣ^{φ(η)} t(s₁, . . . , sₘ, p₁, . . . , pₙ) : T

The formula ψ(η) should be read as ψ when T is a higher-order type, and as ψ, η when T is a ground type.
Example 6. Let f : R → R be the function defined as f(x) = −x if x < 0, and f(x) = x + 1 otherwise.
Observe that we can actually regard f as represented by the program in Figure 3a; we consider it as a primitive function in F for the time being, since we have not yet introduced the typing rule for the if-then-else construct. Consider the program:

  t = λ(x, y).f(min(x, y)).

We see that t : R² → R is continuous on the set {(x, y) | x ≥ 0 ∧ y ≥ 0} and that, moreover, the image of f on this set is contained in [1, +∞). Using the rules in Figure 4, the fact that f is continuous on R≥0, and the fact that min is continuous on R², our refined type system allows us to prove t continuous on the considered domain, i.e.:

  ⊢ᵣ t : ({α ∈ R}, {β ∈ R}) →^{(α≥0∧β≥0),(γ≥1)} {γ ∈ R}.
We now look at the rule for the if-then-else construct: as can be seen in the two programs in Figure 3, the use of conditionals may or may not induce discontinuity points. The crux here is the behaviour of the two branches at the
The refinement typing rule for the if-then-else construct is as follows:

If:  from the premises

  Γ ⊢ᵣ^{θ^t, (β=0∨β=1)} t : {β ∈ R}
  Γ ⊢ᵣ^{θ^{(t,0)}, (β=0)} t : {β ∈ R}
  Γ ⊢ᵣ^{θ^{(t,1)}, (β=1)} t : {β ∈ R}
  Γ ⊢ᵣ^{θ^s(η)} s : T        Γ ⊢ᵣ^{θ^p(η)} p : T

together with the side conditions (1), (2), infer Γ ⊢ᵣ^{θ(η)} if t then s else p : T.

Again, the formula ψ(η) should be read as ψ when T is a higher-order type, and as ψ, η when T is a ground type. The side conditions (1), (2) are given as:
1. |= θ ⇒ (θ^s ∨ θ^p) ∧ (θ^{(t,1)} ∨ θ^p) ∧ (θ^{(t,0)} ∨ θ^s) ∧ (θ^t ∨ (θ^s ∧ θ^p)).
2. For every logical assignment σ compatible with GΓ, σ |= θ ∧ ¬θ^t implies HΓ ⊢ sσ_{GΓ} ≡ctx pσ_{GΓ}.
Example 7. Using our if-then-else typing rule, we can indeed type the program in Figure 3b as expected:

  ⊢ᵣ λx.if x < 0 then 1 else x + 1 : {α ∈ R} → {β ∈ R}.
Our goal in this section is to show the correctness of our refinement type system, which we state below.

Theorem 3. Let t be any program such that:

  x₁ : {α₁ ∈ R}, . . . , xₙ : {αₙ ∈ R} ⊢ᵣ^{θ,θ′} t : {β ∈ R}.

Then ⟦t⟧ is continuous over Dom(θ)_{α₁...αₙ} and, moreover, ⟦t⟧(Dom(θ)_{α₁...αₙ}) ⊆ Dom(θ′)_β.
• For F = {α ∈ R} we take:

  C(Θ, Y ▷ ψ, F) := {t | x₁ : R, . . . , xₙ : R ⊢ t : R, ⟦t⟧(Y) ⊆ Dom(ψ)_α ∧ ⟦t⟧ continuous over Y}.

• If H is an arrow type of the form H = (H₁, . . . , Hₘ, {α₁ ∈ R}, . . . , {αₚ ∈ R}) →^{ψ(η)} T:

  C(Θ, Y, H) := {t | x₁ : R, . . . , xₙ : R ⊢ t : H,
      ∀Z, ∀s = (s₁, . . . , sₘ) with sᵢ ∈ C(Θ, Z, Hᵢ),
      ∀p = (p₁, . . . , pₚ), ∀ψⱼ with |= ψ₁ ∧ . . . ∧ ψₚ ⇒ ψ
      and pⱼ ∈ C(Θ, Z ▷ ψⱼ, {αⱼ ∈ R}),
      it holds that t(s, p) ∈ C(Θ, (Y ∩ Z)(η), T)},
Our overall goal, in order to prove Theorem 3, is to show the counterpart of the Fundamental Lemma from Section 4 (i.e. Lemma 1), which states that the logical predicate F^Θ_R contains all well-typed terms. This lemma only talks about the logical predicates for ground typing contexts, so we can state it as of now, but its proof relies on having all three predicates at our disposal. Observe that, from there, Theorem 3 follows just from the definition of the logical predicates at base types. Similarly to what we did for Lemma 1 in Section 4, proving it requires defining the logical predicates for substitutions and higher-order typing contexts. We do this in Definition 5 below. As before, they consist of an adaptation to our refinement-type framework of the open logical predicates F^Γ_Θ and F^{Γ,Θ}_τ of Section 4: as usual, we need to add continuity annotations and to distinguish whether the target type is a ground type or a higher-order type.
Notation 2. We first introduce the following notation: let Γ, Θ be two ground non-refined typing environments of length m and n respectively, with disjoint supports, and let γ : supp(Γ) → {t | Θ ⊢ t : R} be a substitution. We write ⟨γ⟩ for the real-valued function

  ⟨γ⟩ : Rⁿ → R^{n+m},    a ↦ (a, ⟦γ(x₁)⟧(a), . . . , ⟦γ(xₘ)⟧(a)).
Definition 5. Let Θ be a ground typing environment of length n, and Γ an arbitrary typing environment. We write n and m for the lengths of Θ and GΓ, respectively.

• Let Z ⊆ Rⁿ and W ⊆ R^{n+m}. We define C(Θ, Z ▷ W, Γ) as the set of those substitutions γ : supp(Γ) → {t | Θ ⊢ t : R} such that:
  • ∀(x : H) ∈ HΓ, γ(x) ∈ C(Θ, Z, H);
  • ⟨γ|_{GΓ}⟩ : Rⁿ → R^{n+m} sends Z continuously into W.
• Let W ⊆ R^{n+m}, F = {α ∈ R} an annotated real type, and ψ a logical formula with Vars(ψ) ⊆ {α}. We define:

  C((Γ; Θ), W ▷ ψ, F) := {t | Γ, Θ ⊢ t : R ∧ ∀X ⊆ Rⁿ, ∀γ ∈ C(Θ, X ▷ W, Γ). tγ ∈ C(Θ, X ▷ ψ, F)}.

• Let W ⊆ R^{n+m}, and H a higher-order refined type. We define:

  C((Γ; Θ), W, H) := {t | Γ, Θ ⊢ t : H ∧ ∀X ⊆ Rⁿ, ∀γ ∈ C(Θ, X ▷ W, Γ). tγ ∈ C(Θ, X, H)}.
Example 9. We illustrate Definition 5 on an example. We consider the same context Θ as in Example 8, i.e. Θ = x₁ : {α₁ ∈ R}, x₂ : {α₂ ∈ R}, and we take Γ = x₃ : {α₃ ∈ R}, z : H, with H = {β₁ ∈ R} →^{(β₁≥0),(β₂≥0)} {β₂ ∈ R}. We are interested in the following logical predicate for substitutions:

  C(Θ, B° ▷ {(v, |v|) | v ∈ B°}, Γ)

where the norm of a couple (a, b) is taken as |(a, b)| = √(a² + b²). We are going to build a substitution γ : {x₃, z} → Λ^{×,→,R}_F that belongs to this set. We take:

  • γ(z) = λw.f(w, x₁² + x₂²), where f(w, a) = w/(1 − a) if a < 1, and 0 otherwise;
  • γ(x₃) = (√·)(x₁² + x₂²).
Proof Sketch. The proof is by induction on the derivation of the refined typing judgment. Along the way, we need to show that our logical predicates interact well with the underlying denotational semantics, but also with the logic. The details can be found in the extended version [7].
From there we can finally prove the main result of this section, i.e. Theorem 3, which states the correctness of our refinement type system. Indeed, Theorem 3 follows from Lemma 5 as a corollary: it is enough to look at the definition of the logical predicate for first-order programs.
7 Related Work
Logical relations are certainly one of the most well-studied concepts in higher-order programming language theory. In their unary version, they were introduced by Tait [54], and further exploited by Girard [33] and Tait [55] himself in giving strong normalization proofs for second-order type systems. The relational counterpart of realizability, namely logical relations proper, was introduced by Plotkin [48], and further developed along many different axes, in particular towards calculi with fixpoint constructs or recursive types [3,4,2], probabilistic choice [14], or monadic and algebraic effects [34,11]. Without any hope of being comprehensive, we may refer to Mitchell's textbook on programming language theory for an account of the earlier, classic definitions [43], or to the aforementioned papers for more recent developments.
Extensions of logical relations to open terms have been introduced by several authors [39,47,30,53,15] and were explicitly referred to as open logical relations in [59]. However, to the best of the authors' knowledge, all the aforementioned works use open logical relations for specific purposes, and do not investigate their applicability as a general methodology.
We have shown how a mild variation on the concept of a logical relation can be fruitfully used for proving both predicative and relational properties of higher-order programming languages, when such properties have a first-order, rather than a ground, “flavor”. As such, the added value of this contribution lies not so much in the technique itself as in showing how useful it is in heterogeneous contexts, thus witnessing the versatility of logical relations.
The three case studies, and in particular the correctness of automatic differentiation and the refinement-type-based continuity analysis, are given as proofs of concept, but this does not mean they do not deserve to be studied in more depth. An interesting direction for future work is the extension of our correctness proof from Section 5 to backward-propagation differentiation algorithms. Another consists in adapting the refinement type system of Section 6.1 to deal with differentiability. That would of course require a substantial change in the typing rule for conditionals, which should check not only continuity but also differentiability at the critical points. It would also be interesting to implement the refinement type system using standard SMT-based approaches. Finally, the authors plan to investigate extensions of open logical relations to non-normalizing calculi, as well as to non-simply-typed calculi (such as calculi with polymorphic or recursive types).
References
1. Abadi, M., Plotkin, G.D.: A simple differentiable programming language. PACMPL 4(POPL) (2020)
2. Ahmed, A.J.: Step-indexed syntactic logical relations for recursive and quantified types. In: Proc. of ESOP 2006. pp. 69–83 (2006)
3. Appel, A.W., McAllester, D.A.: An indexed model of recursive types for foun-
dational proof-carrying code. ACM Trans. Program. Lang. Syst. 23(5), 657–683
(2001)
4. Appel, A.W., Mellies, P.A., Richards, C.D., Vouillon, J.: A very modal model of
a modern, major, general type system. In: ACM SIGPLAN Notices. vol. 42, pp.
109–122. ACM (2007)
5. Baillot, P., Dal Lago, U.: Higher-order interpretations and program complexity. In:
Proc. of CSL 2012. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2012)
6. Barendregt, H.P.: The lambda calculus: its syntax and semantics. North-Holland
(1984)
7. Barthe, G., Crubillé, R., Dal Lago, U., Gavazzo, F.: On the versatility of open
logical relations: Continuity, automatic differentiation, and a containment theorem
(long version) (2019), available at https://arxiv.org/abs/2002.08489
8. Bartholomew-Biggs, M., Brown, S., Christianson, B., Dixon, L.: Automatic dif-
ferentiation of algorithms. Journal of Computational and Applied Mathematics
124(1), 171–190 (2000). Numerical Analysis 2000, Vol. IV: Optimization and Nonlinear Equations
9. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differen-
tiation in machine learning: a survey. Journal of Machine Learning Research 18,
153:1–153:43 (2017)
10. Benton, N., Hofmann, M., Nigam, V.: Abstract effects and proof-relevant logical
relations. In: Proc. of POPL 2014. pp. 619–632 (2014)
11. Biernacki, D., Piróg, M., Polesiuk, P., Sieczkowski, F.: Handle with care: rela-
tional interpretation of algebraic effects and handlers. PACMPL 2(POPL), 8:1–
8:30 (2018)
12. Birkedal, L., Jaber, G., Sieczkowski, F., Thamsborg, J.: A kripke logical relation
for effect-based program transformations. Inf. Comput. 249, 160–189 (2016)
13. Birkedal, L., Sieczkowski, F., Thamsborg, J.: A concurrent logical relation. In:
Proc. of CSL 2012. pp. 107–121 (2012)
14. Bizjak, A., Birkedal, L.: Step-indexed logical relations for probability. In: Proc. of
FoSSaCS 2015. pp. 279–294 (2015)
15. Bowman, W.J., Ahmed, A.: Noninterference for free. In: Proc. of ICFP 2015. pp.
101–113 (2015)
16. Brunel, A., Mazza, D., Pagani, M.: Backpropagation in the simply typed lambda-
calculus with linear negation. PACMPL 4(POPL), 64:1–64:27 (2020)
17. Brunel, A., Terui, K.: Church => Scott = Ptime: an application of resource sensitive realizability. In: Proc. of DICE 2010. pp. 31–46 (2010)
18. Chaudhuri, S., Gulwani, S., Lublinerman, R.: Continuity analysis of programs. In:
Proc. of POPL 2010. pp. 57–70 (2010)
19. Chaudhuri, S., Gulwani, S., Lublinerman, R.: Continuity and robustness of pro-
grams. Commun. ACM 55(8), 107–115 (2012)
20. Chaudhuri, S., Gulwani, S., Lublinerman, R., NavidPour, S.: Proving programs
robust. In: Proc. of SIGSOFT/FSE 2011. pp. 102–112 (2011)
21. Clifford, W.K.: Preliminary sketch of biquaternions. Proceedings of the London Mathematical Society s1-4(1), 381–395 (1871)
22. Cook, S.A., Kapron, B.M.: Characterizations of the basic feasible functionals of
finite type (extended abstract). In: 30th Annual Symposium on Foundations of
Computer Science, Research Triangle Park, North Carolina, USA, 30 October - 1
November 1989. pp. 154–159 (1989)
23. Crary, K., Harper, R.: Syntactic logical relations for polymorphic and recursive
types. Electr. Notes Theor. Comput. Sci. 172, 259–299 (2007)
24. Crole, R.L.: Categories for Types. Cambridge mathematical textbooks, Cambridge
University Press (1993)
25. Dreyer, D., Neis, G., Birkedal, L.: The impact of higher-order state and control
effects on local relational reasoning. J. Funct. Program. 22(4-5), 477–528 (2012)
26. Edalat, A.: The domain of differentiable functions. Electr. Notes Theor. Comput.
Sci. 40, 144 (2000)
27. Edalat, A., Lieutier, A.: Domain theory and differential calculus (functions of one
variable). In: Proc. of LICS 2002. pp. 277–286 (2002)
28. Elliott, C.: The simple essence of automatic differentiation. PACMPL 2(ICFP),
70:1–70:29 (2018)
29. Escardó, M.H., Ho, W.K.: Operational domain theory and topology of sequential
programming languages. Inf. Comput. 207(3), 411–437 (2009)
30. Fiore, M.P.: Semantic analysis of normalisation by evaluation for typed lambda
calculus. In: Proc. of PPDP 2002. pp. 26–37 (2002)
31. Freeman, T., Pfenning, F.: Refinement types for ML. In: Proceedings of the ACM SIGPLAN 1991 Conference on Programming Language Design and Implementation. pp. 268–277. PLDI '91 (1991)
32. Gianantonio, P.D., Edalat, A.: A language for differentiable functions. In: Proc. of
FOSSACS 2013. pp. 337–352 (2013)
33. Girard, J.Y.: Une extension de l'interprétation de Gödel à l'analyse, et son application à l'élimination des coupures dans l'analyse et la théorie des types. In: Studies in Logic and the Foundations of Mathematics, vol. 63, pp. 63–92. Elsevier (1971)
34. Goubault-Larrecq, J., Lasota, S., Nowak, D.: Logical relations for monadic types.
In: International Workshop on Computer Science Logic. pp. 553–568. Springer
(2002)
35. Griewank, A., Walther, A.: Evaluating Derivatives: Principles and Techniques
of Algorithmic Differentiation. Society for Industrial and Applied Mathematics,
Philadelphia, PA, USA, second edn. (2008)
36. Hofmann, M.: Logical relations and nondeterminism. In: Software, Services, and
Systems - Essays Dedicated to Martin Wirsing on the Occasion of His Retirement
from the Chair of Programming and Software Engineering. pp. 62–74 (2015)
37. Huot, M., Staton, S., Vákár, M.: Correctness of automatic differentiation via diffeologies and categorical gluing (2020), to appear in Proc. of ESOP 2020 (long version available at http://arxiv.org/abs/2001.02209)
38. Jaber, G.: Syteci: automating contextual equivalence for higher-order programs
with references. PACMPL 4(POPL), 59:1–59:28 (2020)
39. Jung, A., Tiuryn, J.: A new characterization of lambda definability. In: Proc. of
TLCA 1993. pp. 245–257 (1993)
40. Kapron, B.M., Cook, S.A.: A new characterization of type-2 feasibility. SIAM J.
Comput. 25(1), 117–132 (1996)
41. Lafont, Y.: Logiques, catégories & machines: implantation de langages de pro-
grammation guidée par la logique catégorique. Institut national de recherche en
informatique et en automatique (1988)
42. Manzyuk, O., Pearlmutter, B.A., Radul, A.A., Rush, D.R., Siskind, J.M.: Pertur-
bation confusion in forward automatic differentiation of higher-order functions. J.
Funct. Program. 29, e12 (2019)
43. Mitchell, J.C.: Foundations for programming languages. Foundation of computing
series, MIT Press (1996)
44. Owens, S., Myreen, M.O., Kumar, R., Tan, Y.K.: Functional big-step semantics.
In: Proc. of ESOP 2016. pp. 589–615 (2016)
45. Pearlmutter, B.A., Siskind, J.M.: Lazy multivariate higher-order forward-mode
AD. In: Proc. of POPL 2007. pp. 155–160 (2007)
46. Pearlmutter, B.A., Siskind, J.M.: Reverse-mode AD in a functional framework:
Lambda the ultimate backpropagator. ACM Trans. Program. Lang. Syst. 30(2),
7:1–7:36 (2008)
47. Pitts, A.M., Stark, I.D.B.: Observable properties of higher order functions that
dynamically create local names, or what’s new? In: Proc. of MFCS 1993. pp. 122–
141 (1993)
48. Plotkin, G.: Lambda-definability and logical relations. Edinburgh University (1973)
49. Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning representations by back-propagating errors. In: Neurocomputing: Foundations of Research, pp. 696–699. MIT Press (1988)
50. Shaikhha, A., Fitzgibbon, A., Vytiniotis, D., Peyton Jones, S.: Efficient differen-
tiable programming in a functional array-processing language. PACMPL 3(ICFP),
97:1–97:30 (2019)
51. Siskind, J.M., Pearlmutter, B.A.: Nesting forward-mode AD in a functional frame-
work. Higher-Order and Symbolic Computation 21(4), 361–376 (2008)
52. Spivak, M.: Calculus On Manifolds: A Modern Approach To Classical Theorems
Of Advanced Calculus. Avalon Publishing (1971)
53. Staton, S., Yang, H., Wood, F.D., Heunen, C., Kammar, O.: Semantics for prob-
abilistic programming: higher-order functions, continuous distributions, and soft
constraints. In: Proc. of LICS 2016. pp. 525–534 (2016)
54. Tait, W.W.: Intensional interpretations of functionals of finite type i. Journal of
Symbolic Logic 32(2), 198–212 (1967)
55. Tait, W.W.: A realizability interpretation of the theory of species. In: Logic Col-
loquium. pp. 240–251. Springer, Berlin, Heidelberg (1975)
56. Turon, A.J., Thamsborg, J., Ahmed, A., Birkedal, L., Dreyer, D.: Logical relations
for fine-grained concurrency. In: Proc. of POPL 2013. pp. 343–356 (2013)
57. Vuillemin, J.: Exact real computer arithmetic with continued fractions. IEEE
Trans. Comput. 39(8), 1087–1105 (1990)
58. Weihrauch, K.: Computable Analysis: An Introduction. Texts in Theoretical Com-
puter Science. An EATCS Series, Springer Berlin Heidelberg (2000)
59. Zhao, J., Zhang, Q., Zdancewic, S.: Relational parametricity for a polymorphic
linear lambda calculus. In: Proc. of APLAS 2010. pp. 344–359 (2010)
Constructive Game Logic

Brandon Bohrer and André Platzer
1 Introduction
Two of the most essential tools in the theory of programming languages are program logics, such as Hoare calculi [29] and dynamic logics [45], and the Curry-Howard correspondence [17,31], wherein propositions correspond to types, proofs to functional programs, and proof-term normalization to program evaluation. Their
intersection, the Curry-Howard interpretation of program logics, has received
surprisingly little study. We undertake such a study in the setting of Game
Logic (GL) [38], because this leads to novel insights, because the Curry-Howard
correspondence can be explained particularly intuitively for games, and because
our first-order GL is a superset of common logics such as first-order Dynamic
Logic (DL).
Constructivity and program verification have met before: Higher-order con-
structive logics [16] obey the Curry-Howard correspondence and are used to
This research was sponsored by the AFOSR under grant number FA9550-16-1-0288.
The authors were also funded by the NDSEG Fellowship and Alexander von Hum-
boldt Foundation, respectively.
develop verified functional programs. Program logics are also often embedded
in constructive proof assistants such as Coq [48], inheriting constructivity from
their metalogic. Both are excellent ways to develop verified software, but we
study something else.
We study the computational content of a program logic itself. Every fundamental concept of computation is expected to manifest in all three of logic, type systems, and category theory [27]. Because dynamic logics (DLs) such as GL have shown that program execution is a first-class construct in modal logic, the theorist has an imperative to explore the underlying notion of computation by developing a constructive GL with a Curry-Howard interpretation.
The computational content of a proof is especially clear in GL, which gen-
eralizes DL to programmatic models of zero-sum, perfect-information games be-
tween two players, traditionally named Angel and Demon. Both normal-play and
misère-play games can be modeled in GL. In classical GL, the diamond modality ⟨α⟩φ and box modality [α]φ say that Angel and Demon respectively have a strategy to ensure φ is true at the end of α, which is a model of a game. The difference
between classical GL and CGL is that classical GL allows proofs that exclude the
middle, which correspond to strategies which branch on undecidable conditions.
CGL proofs can branch only on decidable properties, thus they correspond to
strategies which are effective and can be executed by computer. Effective strate-
gies are crucial because they enable the synthesis of code that implements a
strategy. Strategy synthesis is itself crucial because even simple games can have
complicated strategies, and synthesis provides assurance that the implementa-
tion correctly solves the game. A GL strategy resolves the choices inherent in a
game: a diamond strategy specifies every move made by the Angel player, while
a box strategy specifies the moves the Demon player will make.
In developing Constructive Game Logic (CGL), adding constructivity is a
deep change. We provide a natural deduction calculus for CGL equipped with
proof terms and an operational semantics on the proofs, demonstrating the mean-
ing of strategies as functional programs and of winning strategies as functional
programs that are guaranteed to achieve their objective no matter what counter-
strategy the opponent follows. While the proof calculus of a constructive logic
is often taken as ground truth, we go a step further and develop a realizability
semantics for CGL as programs performing winning strategies for game proofs,
then prove the calculus sound against it. We adopt realizability semantics in
contrast to the winning-region semantics of classical GL because it enables us
to prove that CGL satisfies novel properties (Section 8). The proof of our Strat-
egy Property (Theorem 2) constitutes an (on-paper) algorithm that computes a
player’s (effective) strategy from a proof that they can win a game. This is the
key test of constructivity for CGL, which would not be possible in classical GL. We
show that CGL proofs have two computational interpretations: the operational
semantics interpret an arbitrary proof (strategy) as a functional program which
reduces to a normal-form proof (strategy), while realizability semantics interpret
Angel strategies as programs which defeat arbitrary Demonic opponents.
While CGL has ample theoretical motivation, the practical motivations from
synthesis are also strong. A notable line of work on dGL extends first-order GL
to hybrid games to verify safety-critical adversarial cyber-physical systems [42].
We have designed CGL to extend smoothly to hybrid games, where synthesis
provides the correctness demanded by safety-critical systems and the synthesis
of correct monitors of the external world [36].
2 Related Work
This work is at the intersection of game logic and constructive modal logics.
Individually, they have a rich literature, but little work has been done at their
intersection. Of these, we are the first for GL and the first with a proofs-as-
programs interpretation for a full first-order program logic.
(PDL) [19]. While games are known to exhibit algebraic structure [25], such
laws are not essential to this work. Our semantics are also notable for the seam-
less interaction between a constructive Angel and a classical Demon.
CGL is first-order, so we must address the constructivity of operations that
inspect game state. We consider rational numbers so that equality is decidable,
but our work should generalize to constructive reals [11,13].
Intuitionistic modalities also appear in dynamic-epistemic logic (DEL) [21],
but that work is interested primarily in proof-theoretic semantics while we em-
ploy realizability semantics to stay firmly rooted in computation. Intuitionistic
Kripke semantics have also been applied to multimodal System K with itera-
tion [14], a weak fragment of PDL.
Constructivity and Dynamic Logic. With CGL, we bring to fruition several past
efforts to develop constructive dynamic logics. Prior work on PDL [18] sought
an Existential Property for Propositional Dynamic Logic (PDL), but they ques-
tioned the practicality of their own implication introduction rule, whose side
condition is non-syntactic. One of our results is a first-order Existential Prop-
erty, which Degen cited as an open problem beyond the methods of their day [18].
To our knowledge, only one approach [32] considers Curry-Howard or functional
proof terms for a program logic. While their work is a notable precursor to
ours, their logic is a weak fragment of PDL without tests, monotonicity, or un-
bounded iteration, while we support not only PDL but the much more powerful
first-order GL. Lastly, we are preceded by Constructive Concurrent Dynamic
Logic, [53] which gives a Kripke semantics for Concurrent Dynamic Logic [41],
a proper fragment of GL. Their work focuses on an epistemic interpretation of
constructivity, algebraic laws, and tableaux. We differ in our use of realizability
semantics and natural deduction, which were essential to developing a Curry-
Howard interpretation for CGL. In summary, we are justified in claiming to have
the first Curry-Howard interpretation with proof terms and Existential Proper-
ties for an expressive program logic, the first constructive game logic, and the
only with first-order proof terms.
While constructive natural deduction calculi map most directly to functional
programs, proof terms can be generated for any proof calculus, including a well-
known interpretation of classical logic as continuation-passing style [26]. Proof
terms have been developed [22] for a Hilbert calculus for dL, a dynamic logic
(DL) for hybrid systems. Their work focuses on a provably correct interchange
format for classical dL proofs, not constructive logics.
3 Syntax
We define the language of CGL, consisting of terms, games, and formulas. The
simplest terms are program variables x, y ∈ V where V is the set of variable
identifiers. Globally-scoped mutable program variables contain the state of the
game, also called the position in game-theoretic terminology. All variables and
terms are rational-valued (Q); we also write B for the set of Boolean values {0, 1}
for false and true respectively.
f, g ::= · · · | q | x | f + g | f · g | f /g | f mod g
α, β ::= ?φ | x := f | x := ∗ | α ∪ β | α; β | α∗ | αd
In the test game ?φ, the active player wins if they can exhibit a constructive
proof that formula φ currently holds. If they do not exhibit a proof, the dormant
player wins by default and we informally say the active player “broke the rules”.
In deterministic assignment games x := f, neither player makes a choice, but
the program variable x takes on the value of a term f . In nondeterministic
assignment games x := ∗, the active player picks a value for x : Q. In the choice
game α ∪ β, the active player chooses whether to play game α or game β. In
the sequential composition game α; β, game α is played first, then β from the
resulting state. In the repetition game α∗ , the active player chooses after each
repetition of α whether to continue playing, but loses if they repeat α infinitely.
Notably, the exact number of repetitions can depend on the dormant player’s
moves, so the active player need not know, let alone announce, the exact number
of iterations in advance. In the dual game αd , the active player becomes dormant
and vice-versa, then α is played. We parenthesize games with braces {α} when
necessary. Sequential and nondeterministic composition both associate to the
right, i.e., α ∪ β ∪ γ ≡ {α ∪ {β ∪ γ}}. This does not affect their semantics as both
operators are associative, but aids in reading proof terms.
The defining constructs in CGL (and GL) are the modalities αφ and [α]φ.
These mean that the active or dormant Angel (i.e., constructive) player has a
constructive strategy to play α and achieve postcondition φ. This paper does
not develop the modalities for active and dormant Demon (i.e., classical) players
because by definition those cannot be synthesized to executable code. We assume
the presence of interpreted comparison predicates ∼ ∈ {≤, <, =, ≠, >, ≥}.
The standard connectives of first-order constructive logic can be derived from games and comparisons. Verum (tt) is defined as 1 > 0 and falsum (ff) as 0 > 1. Conjunction φ ∧ ψ is defined as ⟨?φ⟩ψ, disjunction φ ∨ ψ as ⟨?φ ∪ ?ψ⟩tt, implication φ → ψ as [?φ]ψ, universal quantification ∀x φ as [x := ∗]φ, and existential quantification ∃x φ as ⟨x := ∗⟩φ. As usual in logic, equivalence φ ↔ ψ can also be defined as (φ → ψ) ∧ (ψ → φ). As usual in constructive logics, negation ¬φ is defined as φ → ff, and inequality by f ≠ g ≡ ¬(f = g). We will use the derived constructs freely but present semantics and proof rules only for the core constructs to minimize duplication. Indeed, it will aid the understanding of the proof term language to keep the definitions above in mind, because the proof terms for many first-order programs follow those from first-order constructive logic.
For convenience, we also write derived operators where the dormant player is given control of a single choice before returning control to the active player. The dormant choice α ∩ β, defined as {α^d ∪ β^d}^d, says the dormant player chooses which branch to take, but the active player is in control of the subgames. We write φ^y_x (likewise for α and f) for the renaming of x to y and vice versa in formula φ, and write φ^f_x for the substitution of term f for program variable x in φ, if the substitution is admissible (Def. 9 in Section 6).
Coin Toss. Games are perfect-information and do not possess randomness in the
probabilistic sense, only (possibilistic) nondeterminism. This standard limitation
is shown by attempting to express a coin-guessing game:
The Demon player sets the value of a tossed coin, but does so adversarially,
not randomly, since strategies in CGL are pure strategies. The Angel player has
perfect knowledge of coin and can set guess equivalently, thus easily passing
the test guess = coin, unlike a real coin toss. Partial information games are
interesting future work that could be implemented by limiting the variables
visible in a strategy.
The game state consists of a single counter c containing a natural number, which
each player chooses (∪) to reduce by 1, 2, or 3 (c := c − k). The counter is non-
negative, and the game repeats as long as Angel wishes, until some player empties
the counter, at which point that player is declared the loser (?c > 0).
This implies the dormant player wins the game because the active player
violates the rules once c = 1 and no move is valid. We now state the winning
region for an active player.
At that point, the active player will win in one move by setting c = 1 which
forces the dormant player to set c = 0 and fail the test ?c > 0.
Cake-cutting. Another classic 2-player game, from the study of equitable divi-
sion, is the cake-cutting problem [40]: The active player cuts the cake in two,
then the (initially-)dormant player gets first choice of a piece. This is an optimal
protocol for splitting the cake in the sense that the active player is incentivized
to split the cake evenly, else the dormant player could take the larger piece.
Cake-cutting is also a simple use case for fractional numbers. The constant CC
defines the cake-cutting game. Here x is the relative size (from 0 to 1) of the
first piece, y is the size of the second piece, a is the size of the active player’s
piece, and d is the size of dormant player’s piece.
  CC = x := ∗; ?(0 ≤ x ≤ 1); y := 1 − x; {a := x; d := y ∩ a := y; d := x}
The game is played only once. The active player picks the division of the cake,
which must be a fraction 0 ≤ x ≤ 1. The dormant player then picks which slice
goes to whom.
The active player has a tight strategy to achieve a 0.5 cake share, as stated
in Proposition 3.
Proposition 3 (Active winning region). The following formula is valid:

  ⟨CC⟩ a ≥ 0.5
The dormant player also has a computable strategy to achieve exactly 0.5
share of the cake (Proposition 4). Division is fair because each player has a
strategy to get their fair 0.5 share.
Proposition 4 (Dormant winning region). The following formula is valid:
[CC] d ≥ 0.5
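To see both propositions at work, the following Haskell sketch (ours; rationals model Q, and the realizer machinery of Section 4 is elided) plays CC with a computable strategy for each player:

  type Q = Rational

  -- Dormant strategy: observe the cut x and take the larger piece.
  -- The comparison is decidable on Q, so this is a computable strategy.
  dormantTakesX :: Q -> Bool
  dormantTakesX x = x >= 1 - x

  -- Play CC given the active player's cut x in [0,1]; returns (a, d).
  playCC :: Q -> (Q, Q)
  playCC x = let y = 1 - x
             in if dormantTakesX x then (y, x) else (x, y)

  -- Active strategy: cut evenly. Then fst (playCC (1/2)) == 1/2, matching
  -- Proposition 3; and snd (playCC x) == max x (1 - x) >= 1/2 for any cut x,
  -- matching Proposition 4.

The decidable comparison x >= 1 - x is exactly the trichotomy property discussed next.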
Computability and Numeric Types. Perfect fair division is only achieved for a, d ∈
Q because rational equality is decidable. Trichotomy (a < 0.5∨a = 0.5∨a > 0.5)
is a tautology, so the dormant player’s strategy can inspect the active player’s
choice of a. Notably, we intend to support constructive reals in future work, for
which exact equality is not decidable and trichotomy is not an axiom. Future
work on real-valued CGL will need to employ approximate comparison techniques
as is typical for constructive reals [11,13,51]. The examples in this section have
been proven [12] using the calculus defined in Section 5.
4 Semantics
We now develop the semantics of CGL. In contrast to classical GL, whose seman-
tics are well-understood [38], the major semantic challenge for CGL is capturing
the competition between a constructive Angel and classical Demon. We base
our approach on realizability semantics [37,33], because this approach makes the
4.1 Realizers
To define the semantics of games, we first define realizers, the programs which implement strategies. The language of realizers is a higher-order lambda calculus where variables can range over game states, numbers, or realizers which realize a given proposition φ. Gameplay proceeds in continuation-passing style: invoking a realizer returns another realizer which performs any further moves. We describe the typing constraints for realizers informally, and say a is an ⟨α⟩φ-realizer (a ∈ ⟨α⟩φ Rz) if it provides strategic decisions exactly when ⟨α⟩φ demands them.
Definition 5 (Realizers). The syntax of realizers a, b, c ∈ Rz (where Rz is
the set of all realizers) is defined coinductively:
where x is a program (or realizer) variable and f is a term over the state ω. The
Roman a, b, c should not be confused with the Greek α, β, γ which range over
games. Realizers have access to the game state ω, expressed by lambda realizers
(λω : S. a(ω)) which, when applied in a state ν, compute the realizer a with
ν substituted for ω. State lambdas λ are distinguished from propositional and
first-order lambdas Λ. The unit realizer () makes no choices and is understood
as a unit tuple. Units () realize f ∼ g because rational comparisons, in contrast
to real comparisons, are decidable. Conditional strategic decisions are realized
by if (f (ω)) a else b for computable function f : S → B, and execute a if f
returns truth, else b. Realizer (λω : S. f (ω)) is a α ∪ βφ-realizer if f (ω) ∈
({0} × αφ Rz) ∪ ({1} × βφ Rz) for all ω. The first component determines
To define the choice semantics, we use Angelic projections Z⟨0⟩, Z⟨1⟩ and Demonic projections Z[0], Z[1], which represent binary decisions made by a constructive Angel and a classical Demon, respectively. The Angelic projections, which are defined Z⟨0⟩ = {(π_R(a), ω) | π_L(a)(ω) = 0, (a, ω) ∈ Z} and Z⟨1⟩ = {(π_R(a), ω) | π_L(a)(ω) = 1, (a, ω) ∈ Z}, filter by which branch Angel chooses with π_L(a)(ω) ∈ B, then project to the remaining strategy π_R(a). The Demonic projections, which are defined Z[0] ≡ {(π_L(a), ω) | (a, ω) ∈ Z} and Z[1] ≡ {(π_R(a), ω) | (a, ω) ∈ Z}, contain the same states as Z, but project the realizer so as to tell Angel which branch Demon took.
Definition 6 (Formula semantics). ⟦φ⟧ ⊆ Rz × S is defined as:

  ((), ω) ∈ ⟦f ∼ g⟧   iff  ⟦f⟧ω ∼ ⟦g⟧ω
  (a, ω) ∈ ⟦⟨α⟩φ⟧    iff  {(a, ω)}⟨α⟩ ⊆ (⟦φ⟧ ∪ {⊤})
  (a, ω) ∈ ⟦[α]φ⟧    iff  {(a, ω)}[α] ⊆ (⟦φ⟧ ∪ {⊤})

Comparisons f ∼ g defer to the term semantics, so the interesting cases are the game modalities. Both [α]φ and ⟨α⟩φ ask whether Angel wins α by following the given strategy, and differ only in whether Demon or Angel is the active player; thus in both cases every Demonic choice must satisfy Angel's goal, and early Demon wins are counted as Angel losses.
Definition 7 (Angel game forward semantics). We inductively define the region X⟨α⟩ : ℘(Rz × S) in which α can end when the active Angel plays X:

  X⟨?φ⟩     = {(π_R(a), ω) | (π_L(a), ω) ∈ ⟦φ⟧ for some (a, ω) ∈ X}
            ∪ {⊥ | (π_L(a), ω) ∉ ⟦φ⟧ for all (a, ω) ∈ X}
  X⟨x := f⟩  = {(a, ω^{⟦f⟧ω}_x) | (a, ω) ∈ X}
  X⟨x := ∗⟩  = {(π_R(a), ω^{π_L(a)(ω)}_x) | (a, ω) ∈ X}
  X⟨α; β⟩    = (X⟨α⟩)⟨β⟩
  X⟨α ∪ β⟩   = X⟨0⟩⟨α⟩ ∪ X⟨1⟩⟨β⟩
  X⟨α∗⟩      = ⋂{Z⟨0⟩ | Z ⊆ Rz × S, X ∪ (Z⟨1⟩⟨α⟩) ⊆ Z}
  X⟨α^d⟩     = X[α]
Definition 8 (Demon game forward semantics). We inductively define the region X[α] : ℘(Rz × S) in which α can end when the dormant Angel plays X:

  X[?φ]      = {(a b, ω) | (a, ω) ∈ X, (b, ω) ∈ ⟦φ⟧ for some b ∈ Rz}
             ∪ {⊤ | (a, ω) ∈ X, but no (b, ω) ∈ ⟦φ⟧}
  X[x := f]  = {(a, ω^{⟦f⟧ω}_x) | (a, ω) ∈ X}
  X[x := ∗]  = {(a r, ω^r_x) | (a, ω) ∈ X, r ∈ Q}
  X[α; β]    = (X[α])[β]
  X[α ∪ β]   = X[0][α] ∪ X[1][β]
  X[α∗]      = ⋂{Z[0] | Z ⊆ Rz × S, X ∪ (Z[1][α]) ⊆ Z}
  X[α^d]     = X⟨α⟩
Angelic tests ?φ end in the current state ω with remaining realizer π_R(a) if Angel can realize φ with π_L(a), else they end in ⊥. Angelic deterministic assignments consume no realizer and simply update the state, then end. Angelic nondeterministic assignments x := ∗ ask the realizer π_L(a) to compute a new value for x from the current state. Angelic compositions α; β first play α, then β from the resulting state using the resulting continuation. Angelic choice games α ∪ β use the Angelic projections to decide which branch is taken according to π_L(a). The realizer π_R(a) may be reused between α and β, since π_R(a) could just invoke π_L(a) if it must decide which branch has been taken. This definition of Angelic choice (corresponding to constructive disjunction) captures the reality that realizers in CGL, in contrast with most constructive logics, are entitled to observe the game state, but must do so in a computable fashion.
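Although Definition 5 gives the official coinductive grammar of realizers, a rough Haskell rendering (ours; laziness supplies the coinduction, Rational stands in for Q, and several constructors are elided) may help fix intuitions about the shapes realizers can take:

  type GameState = [(String, Rational)]   -- a state ω valuating program variables

  data Rz
    = Unit                          -- (): realizes decidable comparisons f ∼ g
    | Sel (GameState -> Rational)   -- state-observing choice: a value for x := ∗,
                                    -- or 0/1 read as a branch of α ∪ β
    | PairRz Rz Rz                  -- (a, b): π_L gives the decision, π_R the rest
    | NumLam (Rational -> Rz)       -- consume a value assigned by the opponent
    | RzLam (Rz -> Rz)              -- consume an opponent's realizer (e.g. for tests)

  -- E.g. an Angelic x := ∗ style realizer: choose 42 for x, nothing remains.
  choose42 :: Rz
  choose42 = PairRz (Sel (\_ -> 42)) Unit

Here choose42 mirrors the x := ∗ clause of Definition 7: π_L computes the new value from ω, and π_R continues the game.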
Duality Semantics. To play the dual game α^d, the active and dormant players switch roles, then play α. In classical GL, this characterization of duality is interchangeable with the definition of α^d as the game that Angel wins exactly when it is impossible for Angel to lose. The characterizations are not interchangeable in CGL because the Determinacy Axiom (all games have winners) of GL is not valid in CGL:

Remark 1 (Indeterminacy). The classically equivalent determinacy axiom schemata ¬⟨α⟩¬φ → [α]φ and ⟨α⟩¬φ ∨ [α]φ of classical GL are not valid in CGL, because they imply double negation elimination.

Remark 2 (Classical duality). In classical GL, Angelic dual games are characterized by the axiom schema ⟨α^d⟩φ ↔ ¬⟨α⟩¬φ, which is not valid in CGL. It is classically interdefinable with ⟨α^d⟩φ ↔ [α]φ.

The determinacy axiom is not valid in CGL, so we take ⟨α^d⟩φ ↔ [α]φ as primary.
5 Proof Calculus
Having settled on the meaning of a game in Section 4, we proceed to develop a
calculus for proving CGL formulas syntactically. The goal is twofold: the practical
motivation, as always, is that when verifying a concrete example, the realizabil-
ity semantics provide a notion of ground truth, but are impractical for proving
large formulas. The theoretical motivation is that we wish to expose the compu-
tational interpretation of the modalities ⟨α⟩φ and [α]φ as the types of the players'
respective winning strategies for a game α that has φ as its goal condition. Since
CGL is constructive, such a strategy constructs a proof of the postcondition φ.
To study the computational nature of proofs, we write proof terms explicitly:
the main proof judgement Γ ⊢ M : φ says proof term M is a proof of φ in context
Γ, or equivalently a proof of sequent (Γ ⊢ φ). We write M, N, O (sometimes
A, B, C) for arbitrary proof terms, and p, q, r, s, g for proof variables, that is,
variables that range over proof terms of a given proposition. In contrast to the
assignable program variables, the proof variables are given their meaning by
substitution and are scoped locally, not globally. We adapt propositional proof
terms such as pairing, disjoint union, and lambda-abstraction to our context of
game logic. To support first-order games, we include first-order proof terms and
new terms for features: dual, assignment, and repetition games.
We now develop the calculus by starting with standard constructs and work-
ing toward the novel constructs of CGL. The assumptions p in Γ are named,
so that they may appear as variable proof-terms p. We write Γ^y_x and M^y_x for
the renaming of program variable x to y and vice versa in context Γ or proof
term M, respectively. Proof rules for state-modifying constructs explicitly perform
renamings, which both ensures they are applicable as often as possible and also
ensures that references to proof variables support an intuitive notion of lexical
scope. Likewise Γ^f_x and M^f_x are the substitutions of term f for program variable
x. We use distinct notation to substitute proof terms for proof variables while
avoiding capture: [N/p]M substitutes proof term N for proof variable p in proof
term M. Some proof terms, such as pairs, prove both a diamond formula and a
box formula. We write ⟨M, N⟩ and [M, N] respectively to distinguish the terms,
or ⌊M, N⌋ to treat them uniformly. Likewise we abbreviate ⌊α⌋φ when the same
rule works for both diamond and box modalities, using ⌈α⌉φ to denote its dual
modality. The proof terms ⟨x := f^y_x in p. M⟩ and [x := f^y_x in p. M] introduce
an auxiliary ghost variable y for the old value of x, which improves completeness
without requiring manual ghost steps.
The propositional proof rules of CGL are in Fig. 1. Formula [?φ]ψ is construc-
tive implication, so rule [?]E with proof term M N eliminates M by supplying
an N that proves the test condition. Lambda terms (λp : φ. M) are introduced
by rule [?]I by extending the context Γ. While this rule is standard, it is worth
emphasizing that here p is a proof variable for which a proof term (like N in [?]E)
may be substituted, and that the game state is untouched by [?]I. Constructive
disjunction (between the branches ⟨α⟩φ and ⟨β⟩φ) is the choice ⟨α ∪ β⟩φ. The
introduction rules for injections are ⟨∪⟩I1 and ⟨∪⟩I2, and case-analysis is per-
formed with rule ⟨∪⟩E, with two branches that prove a common consequence
from each disjunct. The cases ⟨?φ⟩ψ and [α ∪ β]φ are conjunctive. Conjunctions
are introduced by ⟨?⟩I and [∪]I as pairs, and eliminated by ⟨?⟩E1, ⟨?⟩E2, [∪]E1,
and [∪]E2 as projections. Lastly, rule hyp says formulas in the context hold by
assumption.
We now begin considering non-propositional rules, starting with the simplest
ones. The majority of the rules in Fig. 2, while thoroughly useful in proofs,
are computationally trivial. Reconstructed, the rules of Fig. 2 read:

  ⟨∗⟩C  Γ ⊢ A : ⟨α∗⟩φ    Γ, s : φ ⊢ B : ψ    Γ, g : ⟨α⟩⟨α∗⟩φ ⊢ C : ψ
        ⟹  Γ ⊢ (case∗ A of s ⇒ B | g ⇒ C) : ψ
  M     Γ ⊢ M : ⌊α⌋φ    Γ^y_{BV(α)}, p : φ ⊢ N : ψ
        ⟹  Γ ⊢ M ◦p N : ⌊α⌋ψ
  [∗]E  Γ ⊢ M : [α∗]φ  ⟹  Γ ⊢ [unroll M] : φ ∧ [α][α∗]φ
  ⌊d⌋I  Γ ⊢ M : ⌈α⌉φ  ⟹  Γ ⊢ ⌊yield M⌋ : ⌊α^d⌋φ
  ⟨∗⟩S  Γ ⊢ M : φ  ⟹  Γ ⊢ stop M : ⟨α∗⟩φ
  [∗]R  Γ ⊢ M : φ ∧ [α][α∗]φ  ⟹  Γ ⊢ [roll M] : [α∗]φ
  ⟨∗⟩G  Γ ⊢ M : φ ∨ ⟨α⟩⟨α∗⟩φ  ⟹  Γ ⊢ go M : ⟨α∗⟩φ
  ⌊;⌋I  Γ ⊢ M : ⌊α⌋⌊β⌋φ  ⟹  Γ ⊢ ⌊ι M⌋ : ⌊α; β⌋φ

The repetition rules ([∗]E, [∗]R) fold and unfold the
notion of repetition as iteration. The rolling and unrolling terms are named in
analogy to the iso-recursive treatment of recursive types [50], where an explicit
operation is used to expand and collapse the recursive definition of a type.
Rules ⟨∗⟩C, ⟨∗⟩S, ⟨∗⟩G are the destructor and injectors for ⟨α∗⟩φ, which are
similar to those for ⟨α ∪ β⟩φ. The duality rules (⌊d⌋I) say the dual game is proved
by proving the game where the roles are reversed. The sequencing rules (⌊;⌋I) say a
sequential game is played by playing the first game with the goal of reaching a
state where the second game is winnable.
Among these rules, monotonicity M is especially computationally rich. The
notation Γ^y_{BV(α)} says that in the second premiss, the assumptions in Γ have
all bound variables of α (written BV(α)) renamed to fresh variables y for com-
pleteness. In practice, Γ usually contains some assumptions on variables that
are not bound, which we wish to access without writing them explicitly in φ.
Rule M is used to execute programs right-to-left, giving shorter, more efficient
proofs. It can also be used to derive the Hoare-logical sequential composition
rule, which is frequently used to reduce the number of case splits. Note that like
every GL, CGL is subnormal, so the modal modus ponens axiom K and Gödel
generalization (or necessitation) rule G are not sound, and M takes over much of
the role they usually serve. On the surface, M simply says games are monotonic:
a game’s goal proposition may freely be replaced with a weaker one. From a
computational perspective, Section 7 will show that rule M can be (lazily) elimi-
nated. Moreover, M is an admissible rule, one whose instances can all be derived
from existing rules. When proofs are written right-to-left with M, the normal-
ization relation translates them to left-to-right normal proofs. Note also that in
checking M ◦p N, the context Γ has the bound variables of α renamed freshly to
some y within N, as required to maintain soundness across execution of α.
Next, we consider first-order rules, i.e., those which deal with first-order
programs that modify program variables. The first-order rules are given in Fig. 3.
In ⟨:∗⟩E, FV(ψ) are the free variables of ψ, the variables which can influence
its meaning. Nondeterministic assignment provides quantification over rational-
valued program variables. Reconstructed, the rules of Fig. 3 read:

  [:=]I  Γ^y_x, p : (x = f^y_x) ⊢ M : φ  ⟹  Γ ⊢ [x := f^y_x in p. M] : [x := f]φ   (y fresh)
  ⟨:∗⟩I  Γ^y_x, p : (x = f^y_x) ⊢ M : φ  ⟹  Γ ⊢ ⟨f^y_x :∗ p. M⟩ : ⟨x := ∗⟩φ   (y, p fresh, f computable)
  ⟨:∗⟩E  Γ ⊢ M : ⟨x := ∗⟩φ    Γ^y_x, p : φ ⊢ N : ψ  ⟹  Γ ⊢ unpack(M, py. N) : ψ   (y fresh, x ∉ FV(ψ))
  [:∗]I  Γ^y_x ⊢ M : φ  ⟹  Γ ⊢ (λx : Q. M) : [x := ∗]φ   (y fresh)
  [:∗]E  Γ ⊢ M : [x := ∗]φ  ⟹  Γ ⊢ (M f) : φ^f_x   (φ^f_x admissible)

Rule [:∗]I is universal, with proof term (λx : Q. M).
While this notation is suggestive, the difference vs. the function proof term
(λp : φ. M) is essential: the proof term M is checked (resp. evaluated) in a state
where the program variable x has changed from its initial value. For soundness,
[:∗]I renames x to a fresh program variable y throughout the context Γ, written Γ^y_x.
This means that M can freely refer to all facts of the full context, but they
now refer to the state as it was before x received a new value. Elimination [:∗]E
then allows instantiating x to a term f. Existential quantification is introduced
by ⟨:∗⟩I, whose proof term ⟨f^y_x :∗ p. M⟩ is like a dependent pair plus bound
renaming of x to y. The witness f is an arbitrary computable term, as always.
We write ⟨f :∗ M⟩ for short when y is not referenced in M. It is eliminated in
⟨:∗⟩E by unpacking the pair, with side condition x ∉ FV(ψ) for soundness. The
assignment rules [:=]I do not quantify, per se, but always update x to the value
of the term f, and in doing so introduce an assumption that x and f (suitably
renamed) are now equal. In ⟨:∗⟩I and [:=]I, program variable y is fresh.
Reconstructed, the looping rules of Fig. 4 read:

  ⟨∗⟩I   Γ ⊢ A : ϕ    p : ϕ, q : M0 = M ≻ 0 ⊢ B : ⟨α⟩(ϕ ∧ M0 ≻ M)    p : ϕ, q : M = 0 ⊢ C : φ
         ⟹  Γ ⊢ for(p : ϕ(M) = A; q. B; C){α} : ⟨α∗⟩φ   (M0 fresh)
  [∗]I   Γ ⊢ M : J    p : J ⊢ N : [α]J    p : J ⊢ O : φ
         ⟹  Γ ⊢ (M rep p : J. N in O) : [α∗]φ
  FP     Γ ⊢ A : ⟨α∗⟩φ    s : φ ⊢ B : ψ    g : ⟨α⟩ψ ⊢ C : ψ
         ⟹  Γ ⊢ FP(A, s. B, g. C) : ψ
  split  Γ ⊢ (split [f, g]) : f ≤ g ∨ f > g
The looping rules in Fig. 4, especially ⟨∗⟩I, are arguably the most sophis-
ticated in CGL. Rule ⟨∗⟩I provides a strategy to repeat a game α until the
postcondition φ holds. This is done by exhibiting a convergence predicate ϕ and
termination metric M with terminal value 0 and well-ordering ≻. Proof term A
shows ϕ holds initially. Proof term B guarantees M decreases with every itera-
tion, where M0 is a fresh metric variable which is equal to M at the antecedent
of B and is never modified. Proof term C allows any postcondition φ which fol-
lows from convergence ϕ ∧ M = 0. Proof term for(p : ϕ(M) = A; q. B; C){α}
suggests the computational interpretation as a for-loop: proof A shows the con-
vergence predicate holds in the initial state, B shows that each step reduces the
termination metric while maintaining the predicate, and C shows that the post-
condition follows from the convergence predicate upon termination. The game α
repeats until convergence is reached (M = 0). By the assumption that metrics
are well-founded, convergence is guaranteed in finitely (but arbitrarily) many
iterations.
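As a small illustration (ours, not an example from the paper), consider proving ⟨(x := x − 1)∗⟩(x = 0) in a context where x is assumed to be a natural number. Take convergence predicate ϕ ≡ x ≥ 0, metric M ≡ x with the usual order ≻ on naturals, and postcondition φ ≡ (x = 0). Proof A shows x ≥ 0 initially; B, from p : x ≥ 0 and q : M0 = x ≻ 0, proves ⟨x := x − 1⟩(x ≥ 0 ∧ M0 ≻ x), since decrementing a positive x keeps it non-negative and strictly decreases the metric; C derives x = 0 from x ≥ 0 ∧ x = 0 trivially. The proof term for(p : ϕ(x) = A; q. B; C){x := x − 1} is then literally a loop counting x down to 0.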
A naïve, albeit correct, reading of rule ⟨∗⟩I says M is literally some term f.
If lexicographic or otherwise non-scalar metrics should be needed, it suffices to
interpret ϕ and M0 ≻ M as formulas over several scalar variables.
Rule FP says ⟨α∗⟩φ is a least pre-fixed-point. That is, if we wish to show a
formula ψ holds now, we show that ψ is any pre-fixed-point, then it must hold
as it is no lesser than ⟨α∗⟩φ. Rule [∗]I is the well-understood induction rule for loops,
which applies as well to repeated games. Premiss O ensures [∗]I supports any
provable postcondition, which is crucial for eliminating M in Lemma 7. The elim-
ination form for [α∗ ]φ is simply [∗]E. Like any program logic, reasoning in CGL
consists of first applying program-logic rules to decompose a program until the
program has been entirely eliminated, then applying first-order logic principles
at the leaves of the proof. The constructive theory of rationals is undecidable
because it can express the undecidable [47] classical theory of rationals. Thus
facts about rationals require proof in practice. For the sake of space and since our
focus is on program reasoning, we defer an axiomatization of rational arithmetic
to future work. We provide a (non-effective!) rule FO which says valid first-order
formulas are provable.
  FO   Γ ⊢ M : ρ  ⟹  Γ ⊢ FO[φ](M) : φ   (exists a s.t. {a} × S ⊆ [[ρ → φ]]; ρ, φ first-order)
An effective special case of FO is split (Fig. 4), which says all term comparisons
are decidable. Rule split can be generalized to decide termination metrics (M =
0 ∨ M ≻ 0). Rule iG says the value of term f can be remembered in a fresh ghost
variable x:

  iG   Γ, p : (x = f) ⊢ M : φ  ⟹  Γ ⊢ Ghost[x = f](p. M) : φ   (x fresh except free in M, p fresh)
What’s Novel in the CGL Calculus? CGL extends first-order reasoning with game
reasoning (sequencing [32], assignments, iteration, and duality). The combina-
tion of first-order reasoning with game reasoning is synergistic: for example,
repetition games are known to be more expressive than repetition systems [42].
We give a new natural-deduction formulation of monotonicity. Monotonicity is
admissible and normalization translates monotonicity proofs into monotonicity-
free proofs. In doing so, normalization shows that right-to-left proofs can be
(lazily) rewritten as left-to-right. Additionally, first-order games are rife with
changing state, and soundness requires careful management of the context Γ .
The extended version [12] uses our calculus to prove the example formulas.
6 Theory: Soundness
Full versions of proofs outlined in this paper are given in the extended ver-
sion [12]. We have introduced a proof calculus for CGL which can prove winning
strategies for Nim and CC. For any new proof calculus, it is essential to con-
vince ourselves of its soundness, which can be done within several prominent
schools of thought. In proof-theoretic semantics, for example, the proof rules are
taken as the ground truth, but are validated by showing the rules obey expected
properties such as harmony or, for a sequent calculus, cut-elimination. While we
will investigate proof terms separately (Section 8), we are already equipped to
show soundness by direct appeal to the realizability semantics (Section 4), which
we take as an independent notion of ground truth. We show soundness of CGL
proof rules against the realizability semantics, i.e., that every provable natural-
deduction sequent is valid. An advantage of this approach is that it explicitly
connects the notions of provability and computability! We build up to the proof
of soundness by proving lemmas on structurality, renaming and substitution.
Lemma 4 (Bound effect). Only the bound variables of a game are modified
by execution.
The latter condition can be relaxed in practice [44] to requiring φ does not
mention x under bindings of free variables.
Just as arithmetic terms are substituted for program variables, proof terms
are substituted for proof variables.
Lemma 6 (Proof term substitution). Let [N/p]M substitute N for p in M,
avoiding capture. If Γ, p : ψ ⊢ M : φ and Γ ⊢ N : ψ, then Γ ⊢ [N/p]M : φ.
We have now shown that the CGL proof calculus is sound, the sine qua non
condition of any proof system. Because soundness was w.r.t. a realizability se-
mantics, we have shown CGL is constructive in the sense that provable formulas
correspond to realizable strategies, i.e., imperative programs executed in an ad-
versarial environment. We will revisit constructivity again in Section 8 from the
perspective of proof terms as functional programs.
7 Operational Semantics
of the “stop” and “go” cases, where the “go” case first shows [α]J, for loop in-
variant J, then expands J → [α∗ ]φ in the postcondition. Note the laziness of
[roll] is essential for normalization: when (M rep p : J. N in O) is understood
as a coinductive proof, it is clear that normalization would diverge if repβ were
applied indefinitely. Rule forβ for for(p : ϕ(M) = A; q. B; C){α} checks whether
the termination metric M has reached terminal value 0. If so, the loop stops
and A proves it has converged. Else, we remember M's value in a ghost term
M0 , and go forward, supplying A and r, rr to satisfy the preconditions of
inductive step B, then execute the loop for(p : ϕ(M) = π1 t; q. B; C){α} in the
postcondition. Rule forβ reflects the fact that the exact number of iterations is
state dependent.
We discuss the structural, commuting conversion, and monotonicity conver-
sion rules for left injections as an example, with the full calculus in [12]. Struc-
tural rule ⌊·⌋S evaluates term M under an injector. Commuting conversion rule
⌊·⌋C normalizes an injection of a case to a case with injectors on each branch.
Monotonicity conversion rule ⌊·⌋◦ simplifies a monotonicity proof of an injection
to an injection of a monotonicity proof.

  ⌊·⌋S  M → M′  implies  ⌊ℓ · M⌋ → ⌊ℓ · M′⌋
  ⌊·⌋C  ⌊ℓ · case A of p ⇒ B | q ⇒ C⌋ → case A of p ⇒ ⌊ℓ · B⌋ | q ⇒ ⌊ℓ · C⌋
  ⌊·⌋◦  ⌊ℓ · M⌋ ◦p N → ⌊ℓ · (M ◦p N)⌋
8 Theory: Constructivity
We now complete the study of CGL’s constructivity. We validate the operational
semantics on proof terms by proving that progress and preservation hold, and
thus the CGL proof calculus is sound as a type system for the functional pro-
gramming language of CGL proof terms.
Lemma 7 (Progress). If · ⊢ M : φ, then either M is normal or M → M′ for
some M′.
Summary. By induction on the proof term M . If M is an introduction rule,
by the inductive hypotheses the subterms are well-typed. If they are all simple,
then M simp. If some subterm (not under a binder) steps, then M steps by a
structural rule. Else some subterm is an irreducible case expression not under
a binder, and it lifts by the commuting conversion rule. If M is an elimination rule,
structural and commuting conversion rules are applied as above. Else, by Def. 10,
the subterm is an introduction rule, and M reduces with a β-rule. Lastly, if M
has form A ◦p B and A simp, then by Def. 10 A is an introduction form, thus
reduced by some monotonicity conversion rule.
Disjunction strategies can depend on the state, so naïve DP does not hold.
Example 2 (Naïve DP). When Γ ⊢ M : (φ ∨ ψ), there need not be an N such that
Γ ⊢ N : φ or Γ ⊢ N : ψ.
Consider φ ≡ x > 0 and ψ ≡ x < 1. Then · ⊢ split [x, 0] () : (φ ∨ ψ), but
neither x < 1 nor x > 0 is valid, let alone provable.
Summary. From proof term M and Theorem 1, we have a realizer for formula
⟨α⟩φ or [α]φ, respectively. We proceed by induction on α: the realizer ⌊a⌋ contains
all realizers applied in the inductive cases, composed with their continuations that
prove φ in each base case.
While these proofs, especially EP and DP, are short and direct, we note that
this is by design: the challenge in developing CGL is not so much the proofs of
this section, rather these proofs become simple because we adopted a realizability
semantics. The challenge was in developing the semantics and adapting the proof
calculus and theory to that semantics.
In this paper, we developed a Constructive Game Logic CGL, from syntax and re-
alizability semantics to a proof calculus and operational semantics on the proof
terms. We developed two understandings of proofs as programs: semantically,
every proof of game winnability corresponds to a realizer which computes the
game’s winning strategy, while the language of proof terms is also a functional
References
1. Abramsky, S., Jagadeesan, R., Malacaria, P.: Full abstraction for PCF. Inf. Com-
put. 163(2), 409–470 (2000), https://doi.org/10.1006/inco.2000.2930
2. Aczel, P., Gambino, N.: The generalised type-theoretic interpretation of construc-
tive set theory. J. Symb. Log. 71(1), 67–103 (2006), https://doi.org/10.2178/jsl/
1140641163
3. Alechina, N., Mendler, M., de Paiva, V., Ritter, E.: Categorical and Kripke seman-
tics for constructive S4 modal logic. In: Fribourg, L. (ed.) CSL. LNCS, vol. 2142,
pp. 292–307. Springer (2001), https://doi.org/10.1007/3-540-44802-0_21
4. Altenkirch, T., Dybjer, P., Hofmann, M., Scott, P.J.: Normalization by evaluation
for typed lambda calculus with coproducts. In: LICS. pp. 303–310. IEEE Computer
Society (2001), https://doi.org/10.1109/LICS.2001.932506
25. Goranko, V.: The basic algebra of game equivalences. Studia Logica 75(2), 221–238
(2003), https://doi.org/10.1023/A:1027311011342
26. Griffin, T.: A formulae-as-types notion of control. In: Allen, F.E. (ed.) POPL. pp.
47–58. ACM Press (1990), https://doi.org/10.1145/96709.96714
27. Harper, R.: The holy trinity (2011), https://web.archive.org/web/20170921012554/http://existentialtype.wordpress.com/2011/03/27/the-holy-trinity/
28. Hilken, B.P., Rydeheard, D.E.: A first order modal logic and its sheaf models
29. Hoare, C.A.R.: An axiomatic basis for computer programming. Commun. ACM
12(10), 576–580 (1969). https://doi.org/10.1145/363235.363259
30. van der Hoek, W., Jamroga, W., Wooldridge, M.J.: A logic for strategic reasoning.
In: Dignum, F., Dignum, V., Koenig, S., Kraus, S., Singh, M.P., Wooldridge, M.J.
(eds.) AAMAS. ACM (2005), https://doi.org/10.1145/1082473.1082497
31. Howard, W.A.: The formulae-as-types notion of construction. To HB Curry: essays
on combinatory logic, lambda calculus and formalism 44, 479–490 (1980)
32. Kamide, N.: Strong normalization of program-indexed lambda calculus. Bull. Sect.
Logic Univ. Łódź 39(1-2), 65–78 (2010)
33. Lipton, J.: Constructive Kripke semantics and realizability. In: Moschovakis,
Y. (ed.) Logic from Computer Science. pp. 319–357. Springer (1992).
https://doi.org/10.1007/978-1-4612-2822-6_13
34. Makarov, E., Spitters, B.: The Picard algorithm for ordinary differential equa-
tions in Coq. In: Blazy, S., Paulin-Mohring, C., Pichardie, D. (eds.) ITP. LNCS,
vol. 7998. Springer (2013), https://doi.org/10.1007/978-3-642-39634-2_34
35. Mamouras, K.: Synthesis of strategies using the Hoare logic of angelic and demonic
nondeterminism. Log. Methods Comput. Sci. 12(3) (2016), https://doi.org/10.2168/LMCS-12(3:6)2016
36. Mitsch, S., Platzer, A.: ModelPlex: Verified runtime validation of verified
cyber-physical system models. Form. Methods Syst. Des. 49(1), 33–74 (2016).
https://doi.org/10.1007/s10703-016-0241-z, special issue of selected papers from
RV’14
37. van Oosten, J.: Realizability: A historical essay. Mathematical Structures in Com-
puter Science 12(3), 239–263 (2002), https://doi.org/10.1017/S0960129502003626
38. Parikh, R.: Propositional game logic. In: FOCS. pp. 195–200. IEEE (1983), https:
//doi.org/10.1109/SFCS.1983.47
39. Pauly, M.: A modal logic for coalitional power in games. J. Log. Comput. 12(1),
149–166 (2002), https://doi.org/10.1093/logcom/12.1.149
40. Pauly, M., Parikh, R.: Game logic - an overview. Studia Logica 75(2), 165–182
(2003), https://doi.org/10.1023/A:1027354826364
41. Peleg, D.: Concurrent dynamic logic. J. ACM 34(2), 450–479 (1987), https://doi.
org/10.1145/23005.23008
42. Platzer, A.: Differential game logic. ACM Trans. Comput. Log. 17(1), 1:1–1:51
(2015). https://doi.org/10.1145/2817824
43. Platzer, A.: A complete uniform substitution calculus for differential dynamic logic.
J. Autom. Reas. 59(2), 219–265 (2017). https://doi.org/10.1007/s10817-016-9385-
1
44. Platzer, A.: Uniform substitution for differential game logic. In: Galmiche, D.,
Schulz, S., Sebastiani, R. (eds.) IJCAR. LNCS, vol. 10900, pp. 211–227. Springer
(2018). https://doi.org/10.1007/978-3-319-94205-6_15
45. Pratt, V.R.: Semantical considerations on floyd-hoare logic. In: FOCS. pp. 109–121.
IEEE (1976). https://doi.org/10.1109/SFCS.1976.27
46. Ramanujam, R., Simon, S.E.: Dynamic logic on games with structured strategies.
In: Brewka, G., Lang, J. (eds.) Knowledge Representation. pp. 49–58. AAAI Press
(2008), http://www.aaai.org/Library/KR/2008/kr08-006.php
47. Robinson, J.: Definability and decision problems in arithmetic. J. Symb. Log. 14(2),
98–114 (1949), https://doi.org/10.2307/2266510
48. The Coq development team: The Coq proof assistant reference manual (2019),
https://coq.inria.fr/
49. Van Benthem, J.: Games in dynamic-epistemic logic. Bulletin of Economic Re-
search 53(4), 219–248 (2001)
50. Vanderwaart, J., Dreyer, D., Petersen, L., Crary, K., Harper, R., Cheng, P.: Typed
compilation of recursive datatypes. In: Shao, Z., Lee, P. (eds.) Proceedings of
TLDI’03: 2003 ACM SIGPLAN International Workshop on Types in Languages
Design and Implementation, New Orleans, Louisiana, USA, January 18, 2003. pp.
98–108. ACM (2003), https://doi.org/10.1145/604174.604187
51. Weihrauch, K.: Computable Analysis - An Introduction. Texts in Theoretical
Computer Science. An EATCS Series, Springer (2000), https://doi.org/10.1007/978-3-642-56999-9
52. Wijesekera, D.: Constructive modal logics I. Ann. Pure Appl. Logic 50(3), 271–301
(1990), https://doi.org/10.1016/0168-0072(90)90059-B
53. Wijesekera, D., Nerode, A.: Tableaux for constructive concurrent dynamic logic.
Ann. Pure Appl. Logic (2005), https://doi.org/10.1016/j.apal.2004.12.001
Optimal and Perfectly Parallel Algorithms for
On-demand Data-flow Analysis∗
1 Introduction
f reveals that x might be null in line 6. Hence, if the decision to compile h relies
only on an offline static analysis, h is always compiled, even when not needed.
Now consider the case where the execution of the program is in line 4, and
at this point the compiler decides on whether to compile h. It is clear that
given this information, x cannot be null in line 6 and thus h does not have
to be compiled. As we have seen above, this decision cannot be made based
on offline analysis. On the other hand, an on-demand analysis starting from the
current program location will correctly conclude that x is not null in line 6. Note,
however, that this decision is made by the compiler at runtime. Hence, such
an on-demand analysis is useful only if it can be performed extremely fast. It
is also highly desirable that the time for running this analysis is predictable, so
that the compiler can decide whether to run the analysis or simply compile h
proactively.
The techniques we develop in this paper answer the above challenges rigor-
ously. Our approach exploits a key structural property of flow graphs of pro-
grams, called treewidth.
Treewidth of programs. A very well-studied notion in graph theory is the con-
cept of treewidth of a graph, which is a measure of how similar a graph is to
a tree (a graph has treewidth 1 precisely if it is a tree) [52]. On one hand the
treewidth property provides a mathematically elegant way to study graphs, and
on the other hand there are many classes of graphs which arise in practice and
have constant treewidth. The most important example is that the flow graphs
of goto-free programs in many classic programming languages have constant
treewidth [63]. The low treewidth of flow graphs has also been confirmed exper-
imentally for programs written in Java [34], C [38], Ada [12] and Solidity [15].
Treewidth has important algorithmic implications, as many graph problems
that are hard to solve in general admit efficient solutions on graphs of low
treewidth. In the context of program analysis, this property has been exploited to
develop improvements for register allocation [63,9] (a technique implemented in
the Small Device C Compiler [28]), cache management [18], on-demand algebraic
path analysis [16], on-demand intraprocedural data-flow analysis of concurrent
programs [20] and data-dependence analysis [14].
§
Note that we count the input itself as part of the space usage.
2 Preliminaries
Model of computation. We consider the standard RAM model with word size
W = Θ(log n), where n is the size of our input. In this model, one can store
W bits in one word (aka “word tricks”) and arithmetic and bitwise operations
between pairs of words can be performed in O(1) time. In practice, word size is
a property of the machine and not the analysis. Modern machines have words
of size at least 64. Since the size of real-world input instances never exceeds 2^64,
the assumption of word size W = Θ(log n) is well-realized in practice and no
additional effort is required by the implementer to account for W in the context
of data flow analysis.
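As a concrete illustration of such word tricks (our sketch, independent of the algorithms below), a subset of {0, . . . , n − 1} can be packed into ⌈n/W⌉ words so that unions and membership tests cost O(n/W) word operations; arbitrary-width Python integers stand in for W-bit words here.

    # Pack a subset of {0, ..., n-1} into W-bit words; set operations then
    # touch O(n/W) words instead of O(n) elements.
    W = 64

    def pack(elems, n):
        words = [0] * ((n + W - 1) // W)
        for e in elems:
            words[e // W] |= 1 << (e % W)
        return words

    def union(a, b):
        return [x | y for x, y in zip(a, b)]

    def member(words, e):
        return (words[e // W] >> (e % W)) & 1 == 1

    s = union(pack({1, 5}, 128), pack({64}, 128))
    assert member(s, 64) and member(s, 5) and not member(s, 2)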
Graphs. We consider directed graphs G = (V, E) where V is a finite set of
vertices and E ⊆ V × V is a set of directed edges. We use the term graph to
refer to directed graphs and will explicitly mention if a graph is undirected.
For two vertices u, v ∈ V, a path P from u to v is a finite sequence of vertices
P = (w_i)_{i=0}^{k} such that w_0 = u, w_k = v and for every i < k, there is an edge from
w_i to w_{i+1} in E. The length |P| of the path P is equal to k. In particular, for
every vertex u, there is a path of length 0 from u to itself. We write P : u ⇝ v to
denote that P is a path from u to v, and u ⇝ v to denote the existence of such a
path, i.e. that v is reachable from u. Given a set V′ ⊆ V of vertices, the induced
subgraph of G on V′ is defined as G[V′] = (V′, E ∩ (V′ × V′)). Finally, the graph
G is called bipartite if the set V can be partitioned into two sets V1, V2, so that
every edge has one end in V1 and the other in V2, i.e. E ⊆ (V1 × V2) ∪ (V2 × V1).
a function reaches its end, control returns back to the site of the most recent
call [58].
Flow graphs and supergraphs. In IFDS, a program with k procedures is specified
by a supergraph, i.e. a graph G = (V, E) consisting of k flow graphs G1 , . . . , Gk ,
one for each procedure, and extra edges modeling procedure-calls. Flow graphs
represent procedures in the usual way, i.e. they contain one vertex vi for each
statement i and there is an edge from vi to vj if the statement j may immediately
follow the statement i in an execution of the procedure. The only exception is
that a procedure-call statement i is represented by two vertices, a call vertex
ci and a return-site vertex ri . The vertex ci only has incoming edges, and the
vertex ri only has outgoing edges. There is also a call-to-return-site edge from
ci to ri . The call-to-return-site edges are included for passing intraprocedural
information, such as information about local variables, from ci to ri . Moreover,
each flow graph Gl has a unique start vertex sl and a unique exit vertex el .
The supergraph G also contains the following edges for each procedure-call i
with call vertex ci and return-site vertex ri that calls a procedure l: (i) an inter-
procedural call-to-start edge from ci to the start vertex of the called procedure,
i.e. sl , and (ii) an interprocedural exit-to-return-site edge from the exit vertex
of the called procedure, i.e. el , to ri .
Example 2. Figure 2 shows a simple C++ program on the left and its supergraph
on the right. Each statement i of the program has a corresponding vertex vi in
the supergraph, except for statement 7, which is a procedure-call statement and
hence has a corresponding call vertex c7 and return-site vertex r7 .
Interprocedurally valid paths. Not every path in the supergraph G can potentially
be realized by an execution of the program. Consider a path P in G and let P′
be the sequence of vertices obtained by removing every v_i from P, i.e. P′ only
consists of c_i's and r_i's. Then, P is called a same-context valid path if P′ can be
generated from S in the grammar of balanced calls and returns, S → c_i S r_i S | ε,
with one production for each procedure-call statement i.
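Equivalently (our sketch; the encoding of labels is illustrative), same-context validity of a path's call/return sequence can be checked with a stack of pending call sites:

    # A same-context valid path has balanced calls and returns: every r_i
    # closes the most recent open c_i, and nothing stays open at the end.
    def same_context_valid(labels):
        stack = []
        for kind, i in labels:            # kind is "c" (call) or "r" (return-site)
            if kind == "c":
                stack.append(i)
            elif not stack or stack.pop() != i:
                return False
        return not stack

    assert same_context_valid([("c", 7), ("r", 7)])
    assert not same_context_valid([("c", 7)])             # unfinished call
    assert not same_context_valid([("r", 7), ("c", 7)])   # unmatched return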
The intuition behind MSCP is similar to that of MVP, except that in MSCPv
we consider meet-over-same-context-paths (corresponding to runs that return to
the same stack state).
Rf := {(0, 0)}
∪ {(0, b) | b ∈ f (∅)}
∪ {(a, b) | b ∈ f ({a}) − f (∅)}.
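Operationally (our illustrative sketch; facts are strings and "0" is the special zero fact), R_f can be computed directly from this definition:

    # Representation relation R_f of a distributive f : 2^D -> 2^D.
    def representation(f, D):
        R = {("0", "0")}
        R |= {("0", b) for b in f(set())}
        R |= {(a, b) for a in D for b in f({a}) - f(set())}
        return R

    gen_kill = lambda x: (x - {"a"}) | {"b"}   # a "kill a, gen b" transfer function
    print(sorted(representation(gen_kill, {"a", "b"})))
    # [('0', '0'), ('0', 'b')] -- no edges leave a or b, since b is generated anyway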
[Figure: exploded-graph representations over nodes {0, a, b} of the functions
λx.{a, b}, λx.(x − {a}) ∪ {b}, λx.x, λx.x ∪ {a}, and λx.({a} if x ≠ ∅, ∅ if x = ∅).]
the two valid paths shown in red, both x and y might be null at 8. The pointer
y might be null because it is passed to the function f by value (instead of by
reference) and keeps its local value in the transition from c7 to r7 , hence the
edge ((c7 , ȳ), (r7 , ȳ)) is in G. On the other hand, the function f only initializes
y, which is its own local variable, and does not change x (which is shared with
main).
[Figure: the program below, shown next to its exploded supergraph over the facts
{0, x̄, ȳ}, with statement vertices v1–v9, call vertex c7 and return-site vertex r7.]

1 void f ( int *& x , int * y ) {
2   y = new int (1);
3   y = new int (2);
4 }
5 int main () {
6   int *x , * y ;
7   f (x , y );
8   * x += * y ;
9 }

[Fig. 6: a graph over vertices v1–v7 together with a tree decomposition with bags
b1 = {v1, v2, v5}, b2 = {v2, v3, v5}, b3 = {v3, v4, v5}, and b4 = {v2, v6, v7}.]
It is well-known that flow graphs of programs have typically small treewidth [63].
For example, programs written in Pascal, C, and Solidity have treewidth at most
3, 6 and 9, respectively. This property has also been confirmed experimentally
for programs written in Java [34], C [38] and Ada [12]. The challenge is thus to
exploit treewidth for faster interprocedural on-demand analyses. The first step
in this approach is to compute tree decompositions of graphs. As the follow-
ing lemma states, tree decompositions of low-treewidth graphs can be computed
efficiently.
Lemma 2 ([11]). Given a graph G with constant treewidth t, a binary tree
decomposition with O(n) bags, height O(log n) and width O(t) can be computed
in linear time.
Example 6. Figure 6 shows a graph and one of its tree decompositions with width
2. In this example, we have rb(v5 ) = b1 , rb(v3 ) = b2 , rb(v4 ) = b3 , and rb(v7 ) = b4 .
For the separator property of Lemma 3, consider the edge {b2 , b4 }. By removing
it, T breaks into two parts, one containing the vertices A = {v1 , v2 , v3 , v4 , v5 }
and the other containing B = {v2 , v6 , v7 }. We have A ∩ B = {v2 } = V (b2 ) ∩
V (b4 ). Also, any path from B − A = {v6 , v7 } to A − B = {v1 , v3 , v4 , v5 } or vice
versa must pass through {v2 }. Hence, (A, B) is a separation of G with separator
V (b2 ) ∩ V (b4 ) = {v2 }.
3 Problem definition
We consider same-context IFDS problems in which the flow graphs Gi have a
treewidth of at most t for a fixed constant t. We extend the classical notion of
same-context IFDS solution in two ways: (i) we allow arbitrary start points for
the analysis, i.e. we do not limit our analyses to same-context valid paths that
start at smain ; and (ii) instead of a one-shot algorithm, we consider a two-phase
process in which the algorithm first preprocesses the input instance and is then
provided with a series of queries to answer. We formalize these points below. We
fix an IFDS instance I = (G, D, F, M, ∪) with exploded supergraph G = (V , E).
Meet over same-context valid paths. We extend the definition of MSCP by spec-
ifying a start vertex u and an initial set Δ of data flow facts that hold at u.
Formally, for any vertex v that is in the same flow graph as u, we define:
MSCPu,Δ,v := ⊓_{P ∈ SCVP(u,v)} pfP(Δ).    (2)
The only difference between (2) and (1) is that in (1), the start vertex u is fixed
as smain and the initial data-fact set Δ is fixed as D, while in (2), they are free
to be any vertex/set.
Reduction to reachability. As explained in Section 2.1, computing MSCP is re-
duced to reachability via same-context valid paths in the exploded supergraph
G. This reduction does not depend on the start vertex and initial data flow facts.
Hence, for a data flow fact d ∈ D, we have d ∈ MSCPu,Δ,v iff in the exploded
supergraph G the vertex (v, d) is reachable via same-context valid paths from
a vertex (u, δ) for some δ ∈ Δ ∪ {0}. Hence, we define the following types of
queries:
Pair query. A pair query provides two vertices (u, d1) and (v, d2) of the exploded
supergraph G and asks whether (v, d2) is reachable from (u, d1) by a same-context
valid path. Hence, the answer to a pair query is a single bit. Intuitively, if d2 = 0, then
the query is simply asking if v is reachable from u by a same-context valid
path in G. Otherwise, d2 is a data flow fact and the query is asking whether
d2 ∈ MSCPu,{d1 }∩D,v .
Single-source query. A single-source query provides a vertex (u, d1 ) and asks for
all vertices (v, d2 ) that are reachable from (u, d1 ) by a same-context valid path.
Assuming that u is in the flow graph Gi = (Vi , Ei ), the answer to the single source
query is a sequence of |Vi | · |D∗ | bits, one for each (v, d2 ) ∈ Vi × D∗ , signifying
whether it is reachable by same-context valid paths from (u, d1 ). Intuitively, a
single-source query asks for all pairs (v, d2 ) such that (i) v is reachable from u
by a same-context valid path and (ii) d2 ∈ MSCPu,{d1 }∩D,v ∪ {0}.
Intuition. We note the intuition behind such queries. We observe that since the
functions in F are distributive over ∪, we have MSCPu,Δ,v = ∪δ∈Δ MSCPu,{δ},v ,
hence MSCPu,Δ,v can be computed by O(|Δ|) single-source queries.
In this step, our goal is to compute a new graph Ĝ from the exploded supergraph
G such that there is a path from (u, d1 ) to (v, d2 ) in Ĝ iff there is a same-context
valid path from (u, d1 ) to (v, d2 ) in G. The idea behind this step is the same as
that of the tabulation algorithm in [50].
Summary edges. Consider a call vertex cl in G and its corresponding return-site
vertex rl . For d1 , d2 ∈ D∗ , the edge ((cl , d1 ), (rl , d2 )) is called a summary edge
if there is a same-context valid path from (cl , d1 ) to (rl , d2 ) in the exploded
supergraph G. Intuitively, a summary edge summarizes the effects of procedure
calls (same-context interprocedural paths) on the reachability between cl and
rl . From the definition of summary edges, it is straightforward to verify that the
graph Ĝ obtained from G by adding every summary edge and removing every
interprocedural edge has the desired property, i.e. a pair of vertices are reachable
in Ĝ iff they are reachable by a same-context valid path in G. Hence, we first
find all summary edges and then compute Ĝ. This is shown in Algorithm 1.
We now describe what Algorithm 1 does. Let sp be the start point of a
procedure p. A shortcut edge is an edge ((sp , d1 ), (v, d2 )) such that v is in the
same procedure p and there is a same-context valid path from (sp , d1 ) to (v, d2 ) in
G. The algorithm creates an empty graph H = (V , E ). Note that H is implicitly
represented by only saving E . It also creates a queue Q of edges to be added to
H (initially Q = E) and an empty set S which will store the summary edges.
The goal is to construct H such that it contains (i) intraprocedural edges of G,
(ii) summary edges, and (iii) shortcut edges.
It constructs H one edge at a time. While there is an unprocessed intrapro-
cedural edge e = ((u, d1 ), (v, d2 )) in Q, it chooses one such e and adds it to H
(lines 5–10). Then, if (u, d1 ) is reachable from (sp , d3 ) via a same-context valid
path, then by adding the edge e, the vertex (v, d2 ) also becomes accessible from
(sp , d3 ). Hence, it adds the shortcut edge ((sp , d3 ), (v, d2 )) to Q, so that it is later
added to the graph H. Moreover, if u is the start sp of the procedure p and v is
its end ep , then for every call vertex cl calling the procedure p and its respective
return-site rl , we can add summary edges that summarize the effect of calling p
(lines 14–19). Finally, lines 20–24 compute Ĝ as discussed above.
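The following condensed Python sketch (ours) mirrors the worklist structure just described: reachability from procedure start vertices (the shortcut edges) is grown one edge at a time, and each completed start-to-exit reachability fact is turned into summary edges at the call sites. For brevity it assumes identity fact-mappings across call-to-start and exit-to-return-site edges and omits the bounded-bandwidth bookkeeping and the final construction of Ĝ.

    from collections import deque

    def summaries(intra, proc, start, exit_, callers, facts):
        """intra: exploded intraprocedural edges ((u,d1),(v,d2)); proc[v]: the
        procedure of node v; start/exit_: proc -> start/exit node;
        callers[p]: list of (call, return-site) node pairs; facts: D*."""
        succ = {}
        for src, dst in intra:
            succ.setdefault(src, set()).add(dst)
        seen, work = set(), deque()

        def push(pe):
            if pe not in seen:
                seen.add(pe)
                work.append(pe)

        for p in start:                    # trivial path edges at every start
            for d in facts:
                push(((start[p], d), (start[p], d)))
        summ = set()
        while work:
            sv, tv = work.popleft()        # path edge (s_p, d1) ~> (v, d2)
            for nv in succ.get(tv, ()):    # extend along intra/summary edges
                push((sv, nv))
            (u, d1), (v, d2) = sv, tv
            if v == exit_[proc[u]]:        # start-to-exit: summarize at call sites
                for c, r in callers[proc[u]]:
                    se = ((c, d1), (r, d2))
                    if se not in summ:
                        summ.add(se)
                        succ.setdefault((c, d1), set()).add((r, d2))
                        for sv2, tv2 in list(seen):   # re-extend paths at (c, d1)
                            if tv2 == (c, d1):
                                push((sv2, (r, d2)))
        return summ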
Correctness. As argued above, every edge that is added to H is either intrapro-
cedural, a summary edge or a shortcut edge. Moreover, all such edges are added
to H, because H is constructed one edge at a time and every time an edge e
is added to H, all the summary/shortcut edges that might occur as a result
of adding e to H are added to the queue Q and hence later to H. Therefore,
Algorithm 1 correctly computes summary edges and the graph Ĝ.
Complexity. Note that the graph H has at most O(|E| · |D∗ |2 ) edges. Addition
of each edge corresponds to one iteration of the while loop at line 4 of Algo-
rithm 1. Moreover, each iteration takes O(|D∗ |) time, because the loop at line
11 iterates over at most |D∗ | possible values for d3 and the loops at lines 15
and 16 have constantly many iterations due to the bounded bandwidth assump-
tion (Section 2.1). Since |D∗| = O(|D|) and |E| = O(n), the total runtime of
Algorithm 1 is O(n · |D|³). For a more detailed analysis, see [50, Appendix].
steps taken by Algorithm 2. In each step, a bag is chosen and a local all-pairs
reachability computation is performed over the bag. Local reachability edges are
added to Rlocal and to Ĝ (if they are not already in Ĝ).
We now prove the correctness and establish the complexity of Algorithm 2.
Correctness. We prove that when computeLocalReachability(T ) ends, the set Rlocal
contains all the local reachability edges between vertices that appear in the
same bag in T. The proof is by induction on the size of T. If T consists of a
single bag, then the local reachability computation on Hl (lines 7–9) fills Rlocal
correctly. Now assume that T has n bags. Let H−l = Ĝ[∪_{bi ∈ T, i ≠ l} V(bi) × D∗].
Intuitively, H−l is the part of Ĝ that corresponds to other bags in T , i.e. every
bag except the leaf bag bl . After the local reachability computation at lines 7–
9, (v, d2 ) is reachable from (u, d1 ) in H−l only if it is reachable in Ĝ. This is
because (i) the vertices of Hl and H−l form a separation of Ĝ with separator
(V (bl ) ∩ V (bp )) × D∗ (Lemma 3) and (ii) all reachability information in Hl is
now replaced by direct edges (line 8). Hence, by induction hypothesis, line 11
finds all the local reachability edges for T − bl and adds them to both Rlocal and
Ĝ. Therefore, after line 11, for every u, v ∈ V (bl ), we have (u, d1 ) (v, d2 ) in
Hl iff (u, d1 ) (v, d2 ) in Ĝ. Hence, the final all-pairs reachability computation
of lines 12–14 adds all the local edges in bl to Rlocal .
Complexity. Algorithm 2 performs at most two local all-pair reachability com-
putations over the vertices appearing in each bag, i.e. O(t · |D∗ |) vertices. Each
such computation can be performed in O(t3 · |D∗ |3 ) using standard reachabil-
ity algorithms. Given that the Ti ’s have O(n) bags overall, the total runtime of
Algorithm 2 is O(n · t3 · |D∗ |3 ) = O(n · |D∗ |3 ). Note that the treewidth t is a
constant and hence the factor t3 can be removed.
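Each local computation is an all-pairs reachability pass over the O(t · |D∗|) vertices of a single bag; a standard transitive-closure sketch (ours) of one such pass:

    # All-pairs reachability on one bag's small vertex set: O(m^3) for
    # m = O(t * |D*|) vertices, matching the bound in the text.
    def local_reachability(vertices, edge):
        reach = {(u, v) for u in vertices for v in vertices
                 if u == v or edge(u, v)}
        for w in vertices:                 # Warshall-style closure
            for u in vertices:
                for v in vertices:
                    if (u, w) in reach and (w, v) in reach:
                        reach.add((u, v))
        return reach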
[Fig. 7: A run of Algorithm 2 on the graph and tree decomposition of Fig. 6; at each
step one bag among b1 = {v1, v2, v5}, b2 = {v2, v3, v5}, b3 = {v3, v4, v5}, b4 = {v2, v6, v7}
is chosen and a local all-pairs reachability computation is performed over it.]
We now show how to reduce the time complexity of Algorithm 3 from O(n ·
|D∗|³ · log n) to O(n · |D∗|³) using word tricks. The idea is to pack the F and F′
sets of Algorithm 3 into words, i.e. represent them by a binary sequence.
Given a bag b, we define δb as the sum of sizes of all ancestors of b. The tree
decompositions are balanced, so b has O(log n) ancestors. Moreover, the width
is t, hence δb = O(t · log n) = O(log n) for every bag b. We perform a top-down
pass of each tree decomposition Ti and compute δb for each b.
For every bag b, u ∈ V (b) and d1 ∈ D∗ , we store F (u, d1 , b, −) as a binary
sequence of length δb ·|D∗ |. The first |V (b)|·|D∗ | bits of this sequence correspond
to F(u, d1, b, db). The next |V(bp)| · |D∗| bits correspond to F(u, d1, b, db − 1), and so
on. We use a similar encoding for F′. Using this encoding, Algorithm 3 can be
rewritten by word tricks and bitwise operations as follows:
– Lines 5–6 copy F(u, d, bp, −) into F(u, d, b, −). However, we have to shift and
align the bits, so these lines can be replaced by word-level shift-and-or operations (see the sketch below).
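A sketch of the intended shift-and-or (ours; the paper's exact bit layout is elided above, so the offsets here are illustrative):

    # Per-(u, d) reachability info packed into one big integer: the child's own
    # |V(b)|*|D*| bits sit lowest, with the parent's whole sequence above them,
    # so inheriting the parent's data is one shift instead of a loop over bits.
    def inherit(parent_bits, child_block_width):
        return parent_bits << child_block_width

    def set_bit(bits, index):
        return bits | (1 << index)

    f_parent = set_bit(0, 3)               # some fact recorded at the parent bag
    f_child = inherit(f_parent, 8)         # child block width 8 (illustrative)
    assert f_child >> 8 == f_parent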
We now describe how to answer pair and single-source queries using the data
saved in the preprocessing phase.
Answering a Pair Query. Our algorithm answers a pair query from a vertex
(u, d1 ) to a vertex (v, d2 ) as follows:
(i) If u and v are not in the same flow graph, return 0 (no).
(ii) Otherwise, let Gi be the flow graph containing both u and v. Let bu = rb(u)
and bv = rb(v) be the root bags of u and v in Ti and let b = lca(bu , bv ).
(iii) If there exists a vertex w ∈ V(b) and d3 ∈ D∗ such that (u, d1) ⇝anc (w, d3)
and (w, d3) ⇝anc (v, d2), return 1 (yes), otherwise return 0 (no).
Correctness. If there is a path P : (u, d1) ⇝ (v, d2), then we claim P must pass
through a vertex (w, d3) with w ∈ V(b). If b = bu or b = bv, the claim is obviously
true. Otherwise, consider the path P′ : bu ⇝ bv in the tree decomposition Ti.
This path passes through b (by definition of b). Let e = {b, b′} be an edge of P′.
Applying the cut property (Lemma 3) to e proves that P must pass through a
vertex (w, d3) with w ∈ V(b′) ∩ V(b). Moreover, b is an ancestor of both bu and
bv, hence we have (u, d1) ⇝anc (w, d3) and (w, d3) ⇝anc (v, d2).
Complexity. Computing the LCA takes O(1) time. Checking all possible vertices
(w, d3) takes O(t · |D∗|) = O(|D|) time. This runtime can be decreased to O(⌈|D|/log n⌉)
by word tricks.
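Putting steps (i)–(iii) together (our sketch; the root bags rb, the lca function, bag contents, and the ⇝anc reachability predicate are assumed precomputed):

    # Pair query: (u,d1) reaches (v,d2) by a same-context valid path iff some
    # (w,d3) in the LCA bag b of rb(u), rb(v) lies on such a path.
    def pair_query(u, d1, v, d2, flow_graph_of, rb, lca, bag, facts, anc):
        if flow_graph_of[u] != flow_graph_of[v]:
            return False
        b = lca(rb[u], rb[v])
        return any(anc((u, d1), (w, d3)) and anc((w, d3), (v, d2))
                   for w in bag[b] for d3 in facts)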
Answering a Single-source Query. Consider a single-source query from a vertex
(u, d1) with u ∈ Vi. We can answer this query by performing |Vi| × |D∗| pair
queries, i.e. by performing one pair query from (u, d1) to (v, d2) for each v ∈ Vi
and d2 ∈ D∗. Since |D∗| = O(|D|), the total complexity is O(|Vi| · |D| · ⌈|D|/log n⌉)
for answering a single-source query. Using a more involved preprocessing method,
we can slightly improve this time to O(⌈|Vi| · |D|²/log n⌉). Hence, pair and
single-source queries are answered in O(⌈|D|/log n⌉) and O(⌈n · |D|²/log n⌉) time,
respectively.
We now turn our attention to parallel versions of our query algorithms, as well
as cases where the algorithms are optimal.
Parallelizability. Assume we have k threads at our disposal.
1. Given a pair query of the form (u, d1, v, d2), let bu (resp. bv) be the root
bag of u (resp. v), and b = lca(bu, bv) the lowest common ancestor of bu and
bv. We partition the set V(b) × D∗ into k subsets {Ai}_{1≤i≤k}. Then, thread
i handles the set Ai, as follows: for every pair (w, d3) ∈ Ai, the thread sets
the output to 1 (yes) iff (u, d1) ⇝anc (w, d3) and (w, d3) ⇝anc (v, d2).
2. Recall that a single source query (u, d1 ) is answered by breaking it down to
|Vi | × |D∗ | pair queries, where Gi is the flow graph containing u. Since all
such pair queries are independent, we parallelize them among k threads, and
further parallelize each pair query as described above.
With word tricks, parallel pair and single-source queries require O(⌈|D|/(k · log n)⌉)
and O(⌈n · |D|²/(k · log n)⌉) time, respectively. Hence, for large enough k, each query re-
quires only O(1) time, and we achieve perfect parallelism.
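A direct illustration (ours) of the partitioning scheme for a parallel pair query; CPython threads demonstrate the scheme but do not deliver true parallelism, so a real implementation would use genuinely parallel workers.

    from concurrent.futures import ThreadPoolExecutor

    def parallel_pair_query(candidates, check, k=12):
        # candidates: all pairs (w, d3) in V(b) x D*; check: tests one candidate.
        chunks = [candidates[i::k] for i in range(k)]
        with ThreadPoolExecutor(max_workers=k) as ex:
            return any(ex.map(lambda ch: any(map(check, ch)), chunks))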
Optimality. Observe that when |D| = O(1), i.e. when the domain is small, our
algorithm is optimal: the preprocessing runs in O(n), which is proportional to
the size of the input, and the pair query and single-source query run in times
O(1) and O(n/ log n), respectively, each case being proportional to the size of
the output. Small domains arise often in practice, e.g. in dead-code elimination
or null-pointer analysis.
5 Experimental Results
Experimental setting. The results were obtained on Debian using an Intel Xeon
E5-1650 processor (3.2 GHz, 6 cores, 12 threads) with 128GB of RAM. The
parallel results used all 12 threads.
Fig. 8: Preprocessing times of CPP and SEQ/PAR (over all instances). A dot
above the 300s line denotes a timeout.
Results. We found that, except for the smallest instances, our algorithm consis-
tently outperforms all previous approaches. Our results were as follows:
Note that Figure 9 combines the results of all five mentioned data-flow analy-
ses. However, the observations above hold independently for every single analysis,
as well. See [17] for analysis-specific figures.
Fig. 9: Comparison of pair query time (top row) and single source query time
(bottom row) of the algorithms. Each dot represents one of the 110 instances.
Each row starts with a global picture (left) and zooms into smaller time units
(right) to differentiate between the algorithms. The plots above contain results
over all five analyses. However, our observations hold independently for every
single analysis, as well (See [17]).
6 Conclusion
References
20. Chatterjee, K., Ibsen-Jensen, R., Goharshady, A.K., Pavlogiannis, A.: Algorithms
for algebraic path properties in concurrent systems of constant treewidth com-
ponents. ACM Transactions on Programming Languages and Systems 40(3), 9
(2018)
21. Chatterjee, K., Ibsen-Jensen, R., Pavlogiannis, A.: Optimal reachability and a
space-time tradeoff for distance queries in constant-treewidth graphs. In: ESA
(2016)
22. Chaudhuri, S., Zaroliagis, C.D.: Shortest paths in digraphs of small treewidth. part
i: Sequential algorithms. Algorithmica 27(3-4), 212–226 (2000)
23. Chaudhuri, S.: Subcubic algorithms for recursive state machines. In: POPL (2008)
24. Chen, T., Lin, J., Dai, X., Hsu, W.C., Yew, P.C.: Data dependence profiling for
speculative optimizations. In: CC. pp. 57–72 (2004)
25. Cousot, P., Cousot, R.: Static determination of dynamic properties of recursive
procedures. In: IFIP Conference on Formal Description of Programming Concepts
(1977)
26. Cygan, M., Fomin, F.V., Kowalik, Ł., Lokshtanov, D., Marx, D., Pilipczuk, M.,
Pilipczuk, M., Saurabh, S.: Parameterized algorithms, vol. 4 (2015)
27. Duesterwald, E., Gupta, R., Soffa, M.L.: Demand-driven computation of interpro-
cedural data flow. POPL (1995)
28. Dutta, S.: Anatomy of a compiler. Circuit Cellar 121, 30–35 (2000)
29. Flückiger, O., Scherer, G., Yee, M.H., Goel, A., Ahmed, A., Vitek, J.: Correctness
of speculative optimizations with dynamic deoptimization. In: POPL. pp. 49:1–
49:28 (2017)
30. Giegerich, R., Möncke, U., Wilhelm, R.: Invariance of approximate semantics with
respect to program transformations. In: ECI (1981)
31. Gould, C., Su, Z., Devanbu, P.: Jdbc checker: A static analysis tool for SQL/JDBC
applications. In: ICSE. pp. 697–698 (2004)
32. Grove, D., Torczon, L.: Interprocedural constant propagation: A study of jump
function implementation. In: PLDI (1993)
33. Guarnieri, S., Pistoia, M., Tripp, O., Dolby, J., Teilhet, S., Berg, R.: Saving the
world wide web from vulnerable javascript. In: ISSTA. pp. 177–187 (2011)
34. Gustedt, J., Mæhle, O.A., Telle, J.A.: The treewidth of java programs. In:
ALENEX. pp. 86–97 (2002)
35. Harel, D., Tarjan, R.E.: Fast algorithms for finding nearest common ancestors.
SIAM Journal on Computing 13(2), 338–355 (1984)
36. Horwitz, S., Reps, T., Sagiv, M.: Demand interprocedural dataflow analysis. ACM
SIGSOFT Software Engineering Notes (1995)
37. Hovemeyer, D., Pugh, W.: Finding bugs is easy. ACM SIGPLAN Notices 39(12),
92–106 (Dec 2004)
38. Klaus Krause, P., Larisch, L., Salfelder, F.: The tree-width of C. Discrete Applied
Mathematics (2019)
39. Knoop, J., Steffen, B.: The interprocedural coincidence theorem. In: CC (1992)
40. Krüger, S., Späth, J., Ali, K., Bodden, E., Mezini, M.: CrySL: An Extensible
Approach to Validating the Correct Usage of Cryptographic APIs. In: ECOOP.
pp. 10:1–10:27 (2018)
41. Lee, Y.f., Marlowe, T.J., Ryder, B.G.: Performing data flow analysis in parallel.
In: ACM/IEEE Supercomputing. pp. 942–951 (1990)
42. Lee, Y.F., Ryder, B.G.: A comprehensive approach to parallel data flow analysis.
In: ICS. pp. 236–247 (1992)
43. Lin, J., Chen, T., Hsu, W.C., Yew, P.C., Ju, R.D.C., Ngai, T.F., Chan, S.: A com-
piler framework for speculative optimizations. ACM Transactions on Architecture
and Code Optimization 1(3), 247–271 (2004)
44. Muchnick, S.S.: Advanced Compiler Design and Implementation. Morgan Kauf-
mann (1997)
45. Naeem, N.A., Lhoták, O., Rodriguez, J.: Practical extensions to the ifds algorithm.
CC (2010)
46. Nanda, M.G., Sinha, S.: Accurate interprocedural null-dereference analysis for java.
In: ICSE. pp. 133–143 (2009)
47. Rapoport, M., Lhoták, O., Tip, F.: Precise data flow analysis in the presence of
correlated method calls. In: SAS. pp. 54–71 (2015)
48. Reps, T.: Program analysis via graph reachability. ILPS (1997)
49. Reps, T.: Undecidability of context-sensitive data-dependence analysis. ACM
Transactions on Programming Languages and Systems 22(1), 162–186 (2000)
50. Reps, T., Horwitz, S., Sagiv, M.: Precise interprocedural dataflow analysis via
graph reachability. In: POPL. pp. 49–61 (1995)
51. Reps, T.: Demand interprocedural program analysis using logic databases. In: Ap-
plications of Logic Databases, vol. 296 (1995)
52. Robertson, N., Seymour, P.D.: Graph minors. iii. planar tree-width. Journal of
Combinatorial Theory, Series B 36(1), 49–64 (1984)
53. Rodriguez, J., Lhoták, O.: Actor-based parallel dataflow analysis. In: CC. pp. 179–
197 (2011)
54. Rountev, A., Kagan, S., Marlowe, T.: Interprocedural dataflow analysis in the
presence of large libraries. In: CC. pp. 2–16 (2006)
55. Sagiv, M., Reps, T., Horwitz, S.: Precise interprocedural dataflow analysis with
applications to constant propagation. Theoretical Computer Science (1996)
56. Schubert, P.D., Hermann, B., Bodden, E.: PhASAR: An inter-procedural static
analysis framework for C/C++. In: TACAS. pp. 393–410 (2019)
57. Shang, L., Xie, X., Xue, J.: On-demand dynamic summary-based points-to analy-
sis. In: CGO. pp. 264–274 (2012)
58. Sharir, M., Pnueli, A.: Two approaches to interprocedural data flow analysis. In:
Program flow analysis: Theory and applications. Prentice-Hall (1981)
59. Smaragdakis, Y., Bravenboer, M., Lhoták, O.: Pick your contexts well: Under-
standing object-sensitivity. In: POPL. pp. 17–30 (2011)
60. Späth, J., Ali, K., Bodden, E.: Context-, flow-, and field-sensitive data-flow analysis
using synchronized pushdown systems. In: POPL. pp. 48:1–48:29 (2019)
61. Sridharan, M., Bodík, R.: Refinement-based context-sensitive points-to analysis for
java. ACM SIGPLAN Notices 41(6), 387–400 (2006)
62. Sridharan, M., Gopan, D., Shan, L., Bodík, R.: Demand-driven points-to analysis
for java. In: OOPSLA. pp. 59–76 (2005)
63. Thorup, M.: All structured programs have small tree width and good register
allocation. Information and Computation 142(2), 159–181 (1998)
64. Torczon, L., Cooper, K.: Engineering a Compiler. Morgan Kaufmann, 2nd edn.
(2011)
65. Vallée-Rai, R., Co, P., Gagnon, E., Hendren, L.J., Lam, P., Sundaresan, V.: Soot
- a Java bytecode optimization framework. In: CASCON. p. 13 (1999)
66. Xu, G., Rountev, A., Sridharan, M.: Scaling cfl-reachability-based points-to anal-
ysis using context-sensitive must-not-alias analysis. In: ECOOP (2009)
67. Yan, D., Xu, G., Rountev, A.: Demand-driven context-sensitive alias analysis for
java. In: ISSTA. pp. 155–165 (2011)
68. Yuan, X., Gupta, R., Melhem, R.: Demand-driven data flow analysis for commu-
nication optimization. Parallel Processing Letters 07(04), 359–370 (1997)
69. Zheng, X., Rugina, R.: Demand-driven alias analysis for c. In: POPL. pp. 197–208
(2008)
Concise Read-Only Specifications for
Better Synthesis of Programs with Pointers
Andreea Costea1, Amy Zhu2, Nadia Polikarpova3, and Ilya Sergey4,1

1 School of Computing, National University of Singapore, Singapore
2 University of British Columbia, Vancouver, Canada
3 University of California, San Diego, USA
4 Yale-NUS College, Singapore
Abstract. In program synthesis there is a well-known trade-off between
concise and strong specifications: if a specification is too verbose, it might
be harder to write than the program; if it is too weak, the synthesised
program might not match the user’s intent. In this work we explore the
use of annotations for restricting memory access permissions in program
synthesis, and show that they can make specifications much stronger
while remaining surprisingly concise. Specifically, we enhance Synthetic
Separation Logic (SSL), a framework for synthesis of heap-manipulating
programs, with the logical mechanism of read-only borrows.
We observe that this minimalistic and conservative SSL extension bene-
fits the synthesis in several ways, making it more (a) expressive (stronger
correctness guarantees are achieved with a modest annotation overhead),
(b) effective (it produces more concise and easier-to-read programs),
(c) efficient (faster synthesis), and (d) robust (synthesis efficiency is
less affected by the choice of the search heuristic). We explain the in-
tuition and provide formal treatment for read-only borrows. We sub-
stantiate the claims (a)–(d) by describing our quantitative evaluation of
the borrowing-aware synthesis implementation on a series of standard
benchmark specifications for various heap-manipulating programs.
1 Introduction
Deductive program synthesis is a prominent approach to the generation of correct-
by-construction programs from their declarative specifications [14, 23, 29, 33].
With this methodology, one can represent searching for a program satisfying the
user-provided constraints as a proof search in a certain logic. Following this idea,
it has been recently observed [34] that the synthesis of correct-by-construction
imperative heap-manipulating programs (in a language similar to C) can be im-
plemented as a proof search in a version of Separation Logic (SL)—a program
logic designed for modular verification of programs with pointers [32, 37].
SL-based deductive program synthesis based on Synthetic Separation Logic
(SSL) [34] requires the programmer to provide a Hoare-style specification for a
program of interest. For instance, given the predicate ls(x, S), which denotes a
symbolic heap corresponding to a linked list starting at a pointer x, ending with
null, and containing elements from the set S, one can specify the behaviour of
the procedure for copying a linked list as follows:
{r → x ∗ ls(x, S)} listcopy(r) {r → y ∗ ls(x, S) ∗ ls(y, S)} (1)
[Figure: symbolic heap for the precondition of spec (1): cell r stores the head pointer x; each list node stores its payload at address x and its next-pointer nxt at address x + 1; the tail of the list is described by ls(nxt, S′).]
The precondition of specification (1), defining the shape of the initial heap,
is illustrated by the figure above. It requires the heap to contain a pointer r,
which is taken by the procedure as an argument and whose stored value, x, is the
head pointer of the list to be copied. The list itself is described by the symbolic
heap predicate instance ls(x, S), whose footprint is assumed to be disjoint from
the entry r → x, following the standard semantics of the separating conjunction
operator (∗) [32]. The postcondition asserts that the final heap, in addition to
containing the original list ls(x, S), will contain a new list starting from y whose
contents S are the same as those of the original list, and also that the pointer r will now
point to the head y of the list copy. Our specification is incomplete: it allows, for
example, duplicating or rearranging elements. One hopes that such a program
is unlikely to be synthesised. In synthesis, it is common to provide incomplete
specs: writing complete ones can be as hard as writing the program itself.
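To make the intended behaviour concrete, the following Python sketch (our illustration, not part of the paper's formalism) models the heap as a dictionary from integer addresses to values, with the two-cell node layout from the figure above (payload at x, next-pointer at x + 1, and 0 playing the role of null); alloc is a hypothetical toy allocator.

    def alloc(heap, n):
        # Toy allocator: returns the base address of n fresh, zeroed cells.
        base = max(heap, default=0) + 1
        for i in range(n):
            heap[base + i] = 0
        return base

    def listcopy(heap, r):
        # Copies the null-terminated list whose head pointer is stored at r.
        # Cells reachable from the original head are only read, matching the
        # intent of spec (1): afterwards r -> y, and both ls(x, S) and ls(y, S) hold.
        x = heap[r]
        head, tail_slot = 0, None
        while x != 0:
            y = alloc(heap, 2)          # fresh node: payload at y, next at y + 1
            heap[y] = heap[x]           # the source list is never written
            heap[y + 1] = 0
            if tail_slot is None:
                head = y
            else:
                heap[tail_slot] = y     # link the new node into the copy
            tail_slot = y + 1
            x = heap[x + 1]
        heap[r] = head

Note that a program with spurious writes to the source list would still satisfy spec (1); this is exactly the problem discussed next.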
pointer nxt, once assigned to *(y + 1) on line 13, becomes the tail of the copy!
[Fig. 1: Result program for spec (1) and the shape of its final heap: the copy starts at y and is described by ls(y, S).]
Indeed, the exercise in tail swapping is totally pointless: not only does it produce less "natural" and readable code, but the resulting program's locality properties are unsatisfactory; for instance, this program cannot be plugged into a concurrent setting where multiple threads rely on ls(x, S) to be unchanged.
The issue with the result in Fig. 1 is caused by specification (1) being too
permissive: it does not prevent the synthesised program from modifying the
structure of the initial list, while creating its copy. Luckily, the SL community has
devised a number of SL extensions that allow one to impose such restrictions, like
declaring a part of the provided symbolic heap as read-only [5, 8, 9, 11, 15, 20, 21],
i.e., forbidden from being modified by the specified code.
two mutable (i.e., allowed to be written to) pointers x and r, that point to
unspecified values f and h, correspondingly. With this symbolic heap, is it safe
to call the following function that modifies the contents of r but not of x?

  {x →RO f ∗ r →M h} readX(x, r) {x →RO f ∗ r →M f}    (2)

5 We will be using the words "annotation" and "permission" interchangeably.
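Operationally, spec (2) permits dereferencing both pointers but writing only through r; in the dictionary-heap sketch from above, a compliant implementation is one line:

    def readX(heap, x, r):
        # x carries permission RO: it may be dereferenced but never assigned;
        # r carries permission M: its cell may be overwritten.
        heap[r] = heap[x]     # afterwards: x -> f (unchanged) and r -> f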
Paper Outline. We start by showcasing the intricacies and the virtues of SSL-
based synthesis with read-only specifications in Sec. 2. We provide the formal
account of read-only borrows and present the modified SSL rules, along with
the soundness argument in Sec. 3. We report on the implementation and evalu-
ation of the enhanced synthesis in Sec. 4. We conclude with a discussion on the
limitations of read-only borrows in Sec. 5 and compare to related work in Sec. 6.
rules, which reduce the initial goal to a trivial one, so it can be solved by one of
the terminal rules, such as, e.g., the rule Emp shown below:

                    φ ⇒ ψ
  Emp  ─────────────────────────────────
       Γ; {φ; emp} ⇝ {ψ; emp} | skip

That is, Emp requires (i) that symbolic heaps in both pre- and post-conditions
are empty and (ii) that the pure part φ of the precondition implies the pure
part ψ of the postcondition. As a result, Emp "emits" a trivial program skip.
Some of the SSL rules are aimed at simplifying the goal, bringing it to a shape
that can be solved with Emp. For instance, consider the following rules:
  Frame
  EV(Γ, P, Q) ∩ Vars(R) = ∅      Γ; {φ; P} ⇝ {ψ; Q} | c
  ──────────────────────────────────────────────────────
  Γ; {φ; P ∗ R} ⇝ {ψ; Q ∗ R} | c

  UnifyHeaps
  [σ]R′ = R      ∅ ≠ dom(σ) ⊆ EV(Γ, P, Q)      Γ; {φ; P ∗ R} ⇝ [σ]{ψ; Q ∗ R′} | c
  ──────────────────────────────────────────────────────────────────────────────────
  Γ; {φ; P ∗ R} ⇝ {ψ; Q ∗ R′} | c
Neither of the rules Frame and UnifyHeaps “adds” to the program c being
synthesised. However, Frame reduces the goal by removing a matching part R
(a.k.a. frame) from both the pre- and the post-condition. UnifyHeaps non-
deterministically picks a substitution σ, which replaces existential variables in a
sub-heap R′ of the postcondition so that it matches the corresponding symbolic heap R in
the precondition. Both of these rules make choices with regard to which frame R
to remove or which substitution σ to adopt—a point that will be of importance
for the development described in Sec. 2.2.
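The nondeterminism in UnifyHeaps can be pictured with a small sketch (ours, not SuSLik's implementation): unifying one postcondition heaplet, whose '?'-prefixed names stand for existentials, against the precondition heaplets yields the candidate substitutions the search must choose among.

    def candidate_substitutions(pre_heaplets, post_heaplet):
        # Heaplets are (source, payload) points-to pairs; names starting with
        # '?' are existential variables that the substitution sigma may bind.
        def match(pat, con, sigma):
            if isinstance(pat, str) and pat.startswith('?'):
                if pat in sigma:
                    return sigma[pat] == con   # reject conflicting bindings
                sigma[pat] = con
                return True
            return pat == con                  # rigid parts must match exactly

        candidates = []
        for pre in pre_heaplets:
            sigma = {}
            if all(match(p, c, sigma) for p, c in zip(post_heaplet, pre)) and sigma:
                candidates.append(sigma)
        return candidates

    # candidate_substitutions([('x', 239), ('y', 30)], ('y', '?z')) == [{'?z': 30}]

When an existential occurs in several postcondition heaplets, different matches produce different substitutions; this is precisely the choice point that read-only annotations will later prune.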
Finally, the following (simplified) rule for producing a write command is operational,
as it emits a part of the program to be synthesised, while also modifying
the goal accordingly. The resulting program will, thus, consist of the emitted
store ∗x = e of an expression e to the pointer variable x. The remainder is
synthesised by solving the sub-goal produced by applying the Write rule.

  Write
  Vars(e) ⊆ Γ      e ≠ e′      Γ; {φ; x →M e ∗ P} ⇝ {ψ; x →M e ∗ Q} | c
  ─────────────────────────────────────────────────────────────────────────
  Γ; {φ; x →M e′ ∗ P} ⇝ {ψ; x →M e ∗ Q} | ∗x = e; c

Notice how in the rule above the heaplets of the form x →M e are now annotated
with the access permission M, which explicitly indicates that the code may
modify the corresponding heap location.
Following with the example specification (5), we can imagine a similar scenario
when the rule UnifyHeaps picks the substitution σ = [239/z]. Should this be
the case, the next application of the rule Write will not be possible, due to
the read-only annotation on the heaplet y →RO 239 in the resulting sub-goal:

  {x, y}; {x →M 239 ∗ y →RO 30} ⇝ {239 ≤ 100; x →M 239 ∗ y →RO 239}

As the RO access permission prevents the synthesised code from modifying the
greyed heaplets, the synthesis search is forced to back-track, picking an alternative
substitution σ = [30/z] and converging on the desirable program ∗x = 30.
8 One might argue that it was possible to detect the unsolvable conjunct 239 ≤ 100 in
the postcondition immediately after performing the substitution, thus sparing the need
to proceed with this derivation any further. This is, indeed, a possibility, but it is
hard to predict which of the heuristics for applying the rules will work better in
general. We defer the quantitative argument on this matter until Sec. 4.4.
could be provided to the callee and discarded upon return since the caller re-
tained the full permission of the original heap. Several works on RO permissions
have adopted this approach [9, 11, 13]. While discarding such clones works just
fine for sequential program verification, in the case of synthesis guided by pre-
and postconditions, incomplete postconditions could lead to intractable goals.
Our solution. The key to gaining the necessary expressivity wrt. passing/return-
ing access permissions, while maintaining a sound yet simple logic, is treating
access permissions as first-class values. A natural consequence of this treatment
is that immutability annotations can be symbolic (i.e., variables of a special sort
The only substantial difference with spec (5) is that now the pointer y’s access
permission is given an explicit name a. Such named annotations (a.k.a. borrows)
are treated as RO by the callee, as long as the pure precondition does not con-
strain them to be mutable. However, giving these permissions names achieves
an important goal: performing accurate accounting while composing specifica-
tions with different access permissions. Specifically, we can now emit a call to
pick(u, v) as specified by (7) from the goal (6), keeping in mind the substitution
σ = [u/x, v/y, M/a]. This call now accounts for borrows as well, and makes it
straightforward to restore v’s original permission M upon returning.
Following the same idea, borrows can be naturally composed through capture-
avoiding substitutions. For instance, the same specification (7) of pick could be
used to advance the following modified version of the goal (6):
  {u, v}; {u →M 239 ∗ v →c 30 ∗ P} ⇝ {w ≤ 210; u →M w ∗ v →c w ∗ Q}
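Since permissions are first-class, one substitution can bind program-level and permission-level variables uniformly; a sketch of this bookkeeping (encoding and names are ours):

    # A heaplet is (location, permission, payload); permissions are the
    # constants 'M' and 'RO', or a borrow name such as 'a'.
    def subst(sigma, heaplets):
        return [tuple(sigma.get(part, part) for part in h) for h in heaplets]

    callee_pre = [('x', 'M', 239), ('y', 'a', 30)]
    sigma = {'x': 'u', 'y': 'v', 'a': 'M'}       # sigma = [u/x, v/y, M/a]
    assert subst(sigma, callee_pre) == [('u', 'M', 239), ('v', 'M', 30)]
    # Applying the same sigma to the callee's postcondition restores v's
    # original permission M upon return.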
precondition. Therefore, not being able to follow the derivation path (a), the
synthesiser is forced to explore an alternative one, eventually deriving the version
of listcopy without tail-swapping.
  Write
  Vars(e) ⊆ Γ      e ≠ e′      Γ; {φ; ⟨x, ι⟩ →M e ∗ P} ⇝ {ψ; ⟨x, ι⟩ →M e ∗ Q} | c
  ─────────────────────────────────────────────────────────────────────────────────
  Γ; {φ; ⟨x, ι⟩ →M e′ ∗ P} ⇝ {ψ; ⟨x, ι⟩ →M e ∗ Q} | ∗(x + ι) = e; c
  Alloc
  z ∈ EV(Γ, P, Q)      R ≜ [y, n]M ∗ ∗0≤i<n ⟨y, i⟩ →M ti
  Σ; Γ; {φ; P ∗ R} ⇝ {ψ; Q ∗ R′} | c
  ─────────────────────────────────────────────────────────
  Σ; Γ; {φ; P} ⇝ {ψ; Q ∗ R′} | let y = malloc(n); c
Free
We do not formalise sort-checking of formulae; however, for readability, we will use the
meta-variable α where the intended sort of the pure logic term is “permission”,
and Perm for the set of all permissions. The permission to allocate or deallocate
a memory-block [x, n]α is controlled by α.
3.1 BoSSL rules
New rules of BoSSL are shown in Fig. 4. The figure contains only 3 rules: this
minimal adjustment is possible thanks to our approach to unification and permis-
sion accounting from first principles. Writing to a memory location requires its
corresponding symbolic heap to be annotated as mutable. Note that for a precondition
{a = M; x →a 5}, a normalisation rule like SubstLeft would first
transform it into {M = M; x →M 5}, at which point the Write rule can be
applied. Note also that Alloc does not require specific permissions on the block
in the postcondition; if they turn out to be RO, the resulting goal is unsolvable.
Unsurprisingly, the rule for accessing a memory cell just for reading purposes
requires no adjustments since any permission allows reading. Moreover, the Call
rule for method invocation does not need adjustments either. Below, we describe
how borrow and return seamlessly operate within a method call:
  Call
  f(xi) : {φf ; Pf}{ψf ; Qf} ∈ Σ      R = [σ]Pf      φ ⇒ [σ]φf      ei = [σ]xi      Vars(ei) ⊆ Γ
  φ′ = [σ]ψf      R′ = [σ]Qf      Σ; Γ; {φ ∧ φ′; P ∗ R′} ⇝ {Q} | c
  ─────────────────────────────────────────────────────────────────
  Σ; Γ; {φ; P ∗ R} ⇝ {Q} | f(ei); c
The Call rule fires when a sub-heap R in the precondition of the goal can
be unified with the precondition Pf of a function f from context Σ. Some salient
points are worth mentioning here: (1) the annotation borrowing from R to Pf for
those symbolic sub-heaps in Pf which require read-only permissions is handled
by the unification of Pf with R, namely R = [σ]Pf (i.e., substitution accounts for
borrows: α/a); (2) the annotation recovery in the new precondition is implicit
via R′ = [σ]Qf, where the substitution σ was computed during the unification,
that is, while borrowing; (3) finding a substitution σ for R = [σ]Pf fails if R does
not have sufficient accessibility permissions to call f (i.e., substitutions of the
form a/M are disallowed since the domain of σ may only contain existentials).
We reiterate that read-only specifications only manipulate symbolic borrows,
that is to say, RO constants are not expected in the specification.
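A rough executable reading of this accounting (a sketch under the heaplet encoding above, with caller and callee heaplets assumed to be already aligned pairwise): unification may only bind callee-side names, so a borrow can be instantiated with a caller permission (α/a), but a missing M permission on the caller's side makes unification fail.

    def try_call(caller_heaplets, callee_pre, callee_post):
        # Compute sigma with R = [sigma]Pf; the domain of sigma contains only
        # callee-side names, so substitutions of the form a/M are impossible.
        sigma = {}
        for pat_h, con_h in zip(callee_pre, caller_heaplets):
            for pat, con in zip(pat_h, con_h):
                if pat in ('M', 'RO') or not isinstance(pat, str):
                    if pat != con:           # a required constant is not matched:
                        return None          # insufficient permission or shape
                elif sigma.setdefault(pat, con) != con:
                    return None              # conflicting binding for a variable
        # The caller's new heap fragment is [sigma]Qf: borrowed permissions
        # flow back automatically through sigma.
        return [tuple(sigma.get(part, part) for part in h) for h in callee_post]

    # A callee demanding M on a cell the caller only holds RO cannot be called:
    assert try_call([('u', 'RO', 0)], [('x', 'M', 0)], [('x', 'M', 1)]) is None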
3.2 Memory Model
We closely follow the standard SL memory model [32,37] and assume Loc ⊂ Val.
  (Heap) h ∈ Heaps ≜ Loc ⇀ Val      (Stack) s ∈ Stacks ≜ Var ⇀ Val
To enable C-like accounting of dynamically-allocated memory blocks, we as-
sume that the heap h also stores sizes of allocated blocks in dedicated locations.
Conceptually, this part of the heap corresponds to the meta-data of the mem-
ory allocator. This accounting ensures that only a previously allocated memory
block can be disposed (as opposed to any set of allocated locations), enabling the
free command to accept a single argument, the address of the block. To model
this meta-data, we introduce a function bl : Loc → Loc, where bl(x) denotes
the location in the heap where the block meta-data for the address x is stored, if
x is the starting address of a block. In an actual language implementation, bl(x)
might be, e.g., x − 1 (i.e., the meta-data is stored right before the block).
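A sketch of this meta-data convention in the dictionary-heap model (ours), with bl(x) = x − 1 as suggested above: malloc records the block size in a tagged cell right before the block, and free refuses any address that does not start an allocated block.

    class Heap(dict):
        # Dict-based heap with C-style block meta-data stored at bl(x) = x - 1.
        def __init__(self):
            super().__init__()
            self.next_free = 1

        def malloc(self, n):
            base = self.next_free + 1
            self[base - 1] = ('size', n)   # meta-data cell of the block
            for i in range(n):
                self[base + i] = 0
            self.next_free = base + n
            return base

        def free(self, x):
            meta = self.get(x - 1)
            if not (isinstance(meta, tuple) and meta[0] == 'size'):
                raise RuntimeError('not the start of an allocated block')
            del self[x - 1]
            for i in range(meta[1]):
                del self[x + i]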
Since we have opted for an unsophisticated permission mechanism, where the
heap ownership is not divisible, but some heap locations are restricted to RO,
the definition of the satisfaction relation ⊨_I^{Σ,R} for the annotated assertions in a
particular context Σ and given an interpretation I, is parameterised with a fixed
set of read-only locations, R:

– h, s ⊨_I^{Σ,R} {φ; emp} iff ⟦φ⟧s = true and dom(h) = ∅.
– h, s ⊨_I^{Σ,R} {φ; ⟨e1, ι⟩ →α e2} iff ⟦φ⟧s = true and l = ⟦e1⟧s + ι and dom(h) = {l} and h(l) = ⟦e2⟧s and l ∈ R ⇔ α = RO.
– h, s ⊨_I^{Σ,R} {φ; [e, n]α} iff ⟦φ⟧s = true and l = bl(⟦e⟧s) and dom(h) = {l} and h(l) = n and l ∈ R ⇔ α = RO.
– h, s ⊨_I^{Σ,R} {φ; P1 ∗ P2} iff there exist h1, h2 such that h = h1 ⊎ h2 and h1, s ⊨_I^{Σ,R} {φ; P1} and h2, s ⊨_I^{Σ,R} {φ; P2}.
– h, s ⊨_I^{Σ,R} {φ; p(ψi)} iff ⟦φ⟧s = true and D ≜ p(xi) ⟨ek, {χk; Rk}⟩ ∈ Σ and ⟨h, ⟦ψi⟧s⟩ ∈ I(D) and ⋁k (h, s ⊨_I^{Σ,R} [ψi/xi]{φ ∧ ek ∧ χk; Rk}).
There are two non-standard cases: points-to and block, whose permissions
must agree with R. Note that in the definition of satisfaction, we only need to
consider the case where the permission α is a value (i.e., either RO or M).
Although in a specification α can also be a variable, well-formedness guarantees
that this variable must be logical, and hence will be substituted away in the
definition of validity. We stress the fact that a reference that has RO permissions
to a certain symbolic heap still retains the full ownership of that heap, with the
restriction that it is not allowed to update or deallocate it. Note that deallocation
additionally requires a mutable permission for the enclosing block.
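The two non-standard clauses are easy to check mechanically; the sketch below (an illustration with assertions flattened to Python data and expressions pre-evaluated under the stack s) validates a points-to heaplet against a heap h and a fixed read-only set R, and checks a separating conjunction by searching for a disjoint split:

    import itertools

    def sat_points_to(h, R, loc, alpha, value):
        # h satisfies loc ->^alpha value iff h is exactly {loc: value} and
        # loc is read-only precisely when alpha is RO.
        return set(h) == {loc} and h[loc] == value and ((loc in R) == (alpha == 'RO'))

    def sat_sep(h, R, p1, p2):
        # h satisfies P1 * P2 iff h splits into disjoint h1, h2 with h1 |= P1
        # and h2 |= P2 (p1, p2 are predicates over a sub-heap and R).
        locs = list(h)
        for k in range(len(locs) + 1):
            for dom1 in itertools.combinations(locs, k):
                h1 = {l: h[l] for l in dom1}
                h2 = {l: v for l, v in h.items() if l not in dom1}
                if p1(h1, R) and p2(h2, R):
                    return True
        return False

    # x ->RO f * r ->M h0 over the heap {1: 'f', 2: 'h0'} with R = {1}:
    assert sat_sep({1: 'f', 2: 'h0'}, {1},
                   lambda h, R: sat_points_to(h, R, 1, 'RO', 'f'),
                   lambda h, R: sat_points_to(h, R, 2, 'M', 'h0'))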
3.3 Soundness
The BoSSL operational semantics is in the spirit of the traditional SL [38], and
hence is omitted for the sake of saving space (selected rules are available in
the extended version of the paper). The validity definition and the soundness
proofs of SSL are ported to BoSSL without any modifications, since our current
definition of satisfaction implies the one defined for SSL:
Definition 1 (Validity). We say that a well-formed Hoare-style specification
Σ; Γ; {P} c {Q} is valid wrt. the function dictionary Δ iff whenever dom(s) = Γ,
for all σgv = [xi → di], xi ∈ GV(Γ, P, Q), such that h, s ⊨_I^Σ [σgv]P, and
Δ ⊢ ⟨h, (c, s)⟩ ↝∗ ⟨h′, (skip, s′)⟩, ⟨h, (c, s)⟩ ↝∗ ⟨h′′, (c′, s′′)⟩
and a synthesis goal Σ, F; Γ; {x →M 7 ∗ y →M x} ⇝ {x →M 7 ∗ y →M z} | c, firing
the Call rule for the candidate function f(x, r) would lead to the unsolvable goal
Σ, F; Γ; {x →a2 7 ∗ y →M 8} ⇝ {x →M 7 ∗ y →M z} | f(x, y); c. Frame may never
be fired on this new goal since the permission of reference x in the goal's
precondition has been
permanently weakened. To eliminate such sources of incompleteness we require
the user-provided predicates and specifications to be well-formed:
Definition 2 (Well-Formedness of Spatial Predicates). We say that a
spatial predicate p(xi) ≜ ⟨ek, {χk; Rk}⟩, k ∈ 1..N, is well-formed iff

  (⋃k=1..N (Vars(ek) ∪ Vars(χk) ∪ Vars(Rk)) ∩ Perm) ⊆ ({xi} ∩ Perm).
That is, every accessibility annotation within the predicate’s clause is bound by
the predicate’s parameters.
Definition 3 (Well-Formedness of Specifications). We say that a Hoare-
style specification Σ; Γ ; {P} c {Q} is well-formed iff EV (Γ, P, Q)∩Perm = ∅ and
every predicate instance in P and Q is an instance of a well-formed predicate.
That is, postconditions are not allowed to have existential accessibility annota-
tions in order to avoid permanent weakening of accessibility.
A callee that requires borrows for a symbolic heap always returns back to the
caller its original permission for that respective symbolic heap:
Corollary 2 (Borrows Always Return). A heaplet with permission α, either
(a) retains the same permission α after a call to a function that is decorated with
well-formed specifications and that requires that heaplet to have read-only
permission, or (b) it may be deallocated in case α = M.
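Both well-formedness conditions are purely syntactic and cheap to enforce; an approximate checker over the heaplet encoding used in the sketches above (permission variables are any annotations other than the constants M and RO):

    PERM_CONSTS = {'M', 'RO'}

    def perm_vars(heaplets):
        return {perm for (_, perm, _) in heaplets} - PERM_CONSTS

    def predicate_well_formed(perm_params, clause_bodies):
        # Def. 2: every permission variable occurring in a clause body must be
        # bound by the predicate's (permission-sorted) parameters.
        return all(perm_vars(body) <= set(perm_params) for body in clause_bodies)

    def spec_well_formed(pre, post):
        # Def. 3, approximated: the postcondition introduces no existential
        # permission variables, i.e. all its permission variables already
        # occur in the precondition.
        return perm_vars(post) <= perm_vars(pre)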
4 Implementation and Evaluation
We implemented BoSSL in an enhanced version of the SuSLik tool, which
we refer to as ROBoSuSLik [12].11 The changes to the original SuSLik in-
frastructure affected less than 100 lines of code. The extended synthesis is
backwards-compatible with the original benchmarks. To make this possible, we
treat the original SSL specifications as annotated/instantiated with M permis-
sions, whenever necessary, which is consistent with treatment of access permis-
sions in BoSSL.
We have conducted an extensive experimental evaluation of ROBoSuSLik,
aiming to answer the following research questions:
1. Do borrowing annotations improve the performance of SSL-based synthesis
when using standard search strategy [34, § 5.2]?
2. Do read-only borrows improve the quality of synthesised programs, in terms of
size and comprehensibility, wrt. their counterparts obtained from regular,
“all-mutable” specifications?
3. Do we obtain stronger correctness guarantees for the programs from the stan-
dard SSL benchmark suite [34, § 6.1] by simply adding, whenever reasonable,
read-only annotations to their specifications?
4. Do borrowing specifications enable more robust synthesis? That is, should we
expect to obtain better programs/synthesis performance on average regardless
of the adopted unification and search strategies?
4.1 Experimental Setup
Benchmark Suite. To tackle the above research questions, we have adopted most
of the heap-manipulating benchmarks from the SuSLik suite [34, § 6.1] (with some
variations) into our sets of experiments. In particular we looked at the group
of benchmarks which manipulate singly linked list segments, sorted linked list
segments and binary trees. We did not include the benchmarks concerning binary
search trees (BSTs) for the reasons outlined in the next paragraph.
11
The sources are available at https://github.com/TyGuS/robosuslik.
The Tools. For a fair comparison which accounts for the latest advancements
to SuSLik, we chose to parameterise the synthesis process with a flag that
turns the read-only annotations on and off (off means that they are set to be
mutable). Those values which are the result of having this flag set will be marked
in the experiments with RO, while those marked with Mut ignore the read-only
annotations during the synthesis process. For simplicity, we will refer to the two
instances of the tool, namely RO and Mut, as two different tools. Each tool was
set to timeout after 2 minutes of attempting to synthesise a program.
Criteria. In an attempt to quantify our results, we have looked at the size of
the synthesised program (AST size), the absolute time needed to synthesise the
code given its specification, averaged over several runs (Time), the number of
backtrackings in the proof search due to nondeterminism (#Backtr ), the total
number of rule applications that the synthesis fired during the search (#Rules),
including those that lead to unsolvable goals, and the strength of the guarantees
offered by the specifications (Stronger Guarantees).
Variables. Some benchmarks have shown improvement over the synthesis pro-
cess without the read-only annotations. To emphasise the fact that read-only
annotations’ improvements are not accidental, we have varied the inductive defi-
nitions of the corresponding benchmarks to experiment with different properties
of the underlying structure: the shape of the structure (in all the definitions),
the length of the structure (for those benchmarks tagged with len), the values
stored within the structure (val ), a combination of all these properties (all ) as
well as with the sortedness property for the “Sorted list” group of benchmarks.
Experiment Schema. To measure the performance and the quality of the borrowing-
aware synthesis we ran the benchmarks against the two different tools and did
a one-to-one comparison of the results. We ran each tool three times for each
benchmark, and averaged the resulting synthesis time. All the other evaluation
criteria remain constant within all three runs.
To measure the tools’ robustness we stressed the synthesis algorithm by alter-
ing the default proof search strategy. We prepared 42 such perturbations which
we used to run against the different program variants enumerated above. Each
pair of program variant and proof strategy perturbation has been then analysed
to measure the number of rules that had been fired by RO and Mut.
Hardware Setup. The experiments were conducted on a 64-bit machine running
Ubuntu, with an Intel Xeon CPU (6 cores, 2.40GHz) with 32GB RAM.
4.2 Performance and Quality of the Borrowing-Aware Synthesis
Tab. 1 captures the results of running RO and Mut against the considered bench-
marks. It provides empirical evidence that the borrowing-aware synthesis improves
the performance of the original SSL-based synthesis; in other words, it answers
Research Question 1 positively. RO suffers almost no loss in performance (except
for a few cases, such as the list segment append where there is a negligible
increase in time), while the gain is considerable for those synthesis problems
with complex pointer manipulation. For example, if we consider the number of
fired rules as the performance measurement criterion, in the worst
Table 1: Benchmarks and comparison between the results for synthesis with read-
only annotations (RO) and without them (Mut). For each case study we measure
the AST size of the synthesised program, the Time needed to synthesize the
benchmark, the number of times that the synthesiser had to discard a derivation
branch (#Backtr.), and the total number of fired rules (#Rules).
case, RO behaves the same as Mut, while in the best scenario it buys us a 32-fold
decrease in the number of applied rules. At the same time, synthesising a few
small examples in the RO case is a bit slower, despite the same or smaller num-
ber of rule applications. This is due to the increased number of logical variables
(because of added borrows) when discharging obligations via the SMT solver.
Fig. 5 offers a statistical view of the numbers in the table, where smaller bars
mark a better performance. The barplots indicate that as the complexity of the
problem increases (approximately from left to right), RO outperforms Mut.
Perhaps the most important take-away from this experiment is that the syn-
thesis with read-only borrows often produces a more concise program (light green
cells in the column AST size of Tab. 1), while retaining the same or better
performance wrt. all the evaluated criteria. For instance, RO gets rid of the spurious
write from the motivating example introduced in Sec. 1, reducing the AST size
from 35 nodes down to 32, while at the same time firing fewer rules. That also
means that we secure a positive answer for Research Question 2.
[Fig. 5: Statistics for synthesis with and without Read-Only specifications.]
ones - the results are summarized in the last column of Tab. 1. For instance, a
specification stating that the shape of a linked-list segment is read-only implies
that the size of that segment remains constant through the program’s execution.
In other words, the length property need not be captured separately in the
segment’s definition. If, in addition to the shape, the payload of the segment is
also read-only, then the set of values and their ordering are also invariant.
Consider the goal {lseg(x, y, s, a1, a2, a3)} ⇝ {lseg(x, y, s, a1, a2, a3)}, where
lseg is an inductive definition of a list segment which ends at y and contains
the set of values s. The borrowing-aware synthesiser will produce a program
which is guaranteed to treat the segment pointed by x and ending with y as
read-only (that is, its shape, length, values and orderings are invariant). At the
same time, for a goal {lseg(x, y, s)} ; {lseg(x, y, s)} , the guarantees are that
the returned segment still ends in y and contains values s. Internal modifications
of the segment, such as reordering and duplicating list elements, may still occur.
The few entries marked with same are programs whose specifications did not
become stronger when instrumented with RO annotations (e.g., delete). These
benchmarks require mutation over the entire data structure, hence the read-only
annotations do not influence the offered guarantees. Overall, our observations
that read-only annotations offer stronger guarantees are in agreement with the
works on SL-based program verification [9, 13], but are promoted here to the
more challenging problem of program synthesis.
[Fig. 6: Boxplots of variations in log2(numbers of applied rules) for synthesis perturbations, for lcopy, insert, tcopy, and tcopy-ptr under RO and Mut; numbers of data points for each example are given in parentheses.]
Each boxplot depicts the six-number summary: minimum, first quartile, median,
third quartile, maximum, and outliers. For example, the boxplot for tcopy-ptr
corresponding to RO and containing 90 data points reads as follows: "the synthesis
processes fired between 64 and 256 rules, with most of the processes firing between
64 and 128 rules. There are three exceptions where the synthesiser fired more than
256 rules". Note
that the y-axis represents the binary logarithm of the number of fired rules.
Even though we attempted to synthesise each program 126 times for each tool,
some attempts hit the timeout and therefore their corresponding data points had
to be eliminated from the boxplot. It is of note, though, that whenever RO with
configuration (v, k) hit the timeout for the synthesis problem s ∈ S, so did Mut,
hence both the (RO, s, (v, k)) as well as (Mut, s, (v, k)) are omitted from the
boxplots. But the inverse did not hold: RO hit the timeout fewer times than
Mut, hence RO is measured at a disadvantage (i.e., more data points mean more
opportunities to show worse results). Since insert collected the highest number
of timeouts, we equalised it to remove non-matched entries across the two tools.
Despite RO's potential measurement disadvantage, the boxplots depict it as a
clear winner. Not only does RO fire fewer rules in all the cases but, with the
exception of insert, it is also more stable under the proof search perturbations,
varying a few orders of magnitude less than Mut does for the same configurations. Fig. 7
supports this observation by offering a more detailed view on the distributions
of the numbers of fired rules per synthesis configuration. Taller bars show that
more processes fall in the same range (wrt. the number of fired rules). For lcopy,
tcopy, tcopy-ptr it is clear that Mut has a wider distribution of the number
of fired rules, that is, Mut is more sensitive to the perturbations than RO. We
additionally make some further observations:
[Fig. 7: Frequency histograms of the number of fired rules (log2 scale) per synthesis configuration, for lcopy, insert, tcopy, and tcopy-ptr under RO and Mut; panels contain between 42 and 126 data points.]
6 Related Work
Language design. There is a large body of work on integrating access permissions
into practical type systems [5, 16, 42] (see, e.g., the survey by Clarke et al. [10]).
One notable such system, which is the closest in its spirit to our proposal, is
the borrows type system of the Rust programming language [1] proved safe with
RustBelt [22]. Similar to our approach, borrows in Rust are short-lived: in
Rust they share the scope with the owner; in our approach they do not escape
the scope of a method call. In contrast with our work, Rust’s type system care-
fully manages different references to data by imposing strict sharing constraints,
whereas in our approach the treatment of aliasing is taken care of automatically
by building on Separation Logic. Moreover, Rust allows read-only borrows to be
duplicated, while in the sequential setting of BoSSL this is currently not possible.
Somewhat related to our approach, Naden et al. propose a mechanism for
borrowing permissions, albeit integrated as a fundamental part of a type
system [31]. Their type system comes equipped with change permissions which
enforce the borrowing requirements and describe the effects of the borrowing
upon return. As a result of treating permissions as first-class values, we do not
need to explicitly describe the flow of permissions for each borrow since this is
controlled by a mix of the substitution and unification principles.
Program verification with read-only permissions. Boyland introduced fractional
permissions to statically reason about interference in the presence of shared-
memory concurrency [8]. A permission p denotes full resource ownership (i.e.
read-write access) when p = 1, while p ∈ (0, 1) denotes a partial ownership (i.e.
read-only access). To leverage permissions in practice, a system must support
two key operations: permission splitting and permission borrowing. Permission
splitting (and merging back) follows the split rule: x →p a = x →p1 a ∗ x →p2 a,
with p = p1 + p2 and p, p1, p2 ∈ (0, 1]. Permission borrowing refers to the safe manipulation
of permissions: a callee may remove some permissions from the caller, use them
temporarily, and give them back upon return.
Though it exists, tool support for fractional permissions is still scarce. Leino
and Müller introduced a mechanism for storing fractional permissions in data
structures via dedicated access predicates in the Chalice verification tool [27].
To promote generic specifications, Heule et al. advanced Chalice with
instantiable abstract permissions, allowing automatic firing of the split rule and symbolic
borrowing [20]. VeriFast [21] is guided by contracts written in Separation Logic
and assumes the existence of lemmas to cater for permission splitting. Viper [30]
7 Conclusion
In this work, we have advanced the state of the art in program synthesis by
highlighting the benefits of guiding the synthesis process with information about
memory access permissions. We have designed the logic BoSSL and implemented
the tool ROBoSuSLik, showing that a minimalistic discipline for read-only per-
missions already brings significant improvements wrt. the performance and ro-
bustness of the synthesiser, as well as wrt. the quality of its generated programs.
Acknowledgements. We thank Alexander J. Summers, Cristina David, Olivier
Danvy, and Peter O'Hearn for their comments on the preliminary versions of
the paper. We are very grateful to the ESOP 2020 reviewers for their detailed
feedback, which helped to conduct a more adequate comparison with related
approaches and, thus, better frame the conceptual contributions of this work.
Nadia Polikarpova’s research was supported by NSF grant 1911149. Amy Zhu’s
research internship and stay in Singapore during the Summer 2019 was supported
by Ilya Sergey’s start-up grant at Yale-NUS College, and made possible thanks
to UBC Science Co-op Program.
References

1. The Rust Programming Language: References and Borrowing. https://doc.rust-lang.org/1.8.0/book/references-and-borrowing.html, 2019.
2. Rajeev Alur, Rastislav Bodík, Garvit Juniwal, Milo M. K. Martin, Mukund Raghothaman, Sanjit A. Seshia, Rishabh Singh, Armando Solar-Lezama, Emina Torlak, and Abhishek Udupa. Syntax-guided synthesis. In FMCAD, pages 1–8. IEEE, 2013.
3. Andrew W. Appel. Verified software toolchain - (invited talk). In ESOP, volume 6602 of LNCS, pages 1–17. Springer, 2011.
4. Vytautas Astrauskas, Peter Müller, Federico Poli, and Alexander J. Summers. Leveraging Rust types for modular specification and verification. PACMPL, 3(OOPSLA):147:1–147:30, 2019.
5. Thibaut Balabonski, François Pottier, and Jonathan Protzenko. The Design and Formalization of Mezzo, a Permission-Based Programming Language. ACM Trans. Program. Lang. Syst., 38(4):14:1–14:94, 2016.
6. Josh Berdine, Cristiano Calcagno, and Peter W. O'Hearn. Symbolic execution with separation logic. In APLAS, volume 3780 of LNCS, pages 52–68. Springer, 2005.
7. Richard Bornat, Cristiano Calcagno, Peter W. O'Hearn, and Matthew J. Parkinson. Permission Accounting in Separation Logic. In POPL, pages 259–270. ACM, 2005.
8. John Boyland. Checking Interference with Fractional Permissions. In SAS, volume 2694 of LNCS, pages 55–72. Springer, 2003.
9. Arthur Charguéraud and François Pottier. Temporary Read-Only Permissions for Separation Logic. In ESOP, volume 10201 of LNCS, pages 260–286. Springer, 2017.
10. Dave Clarke, Johan Östlund, Ilya Sergey, and Tobias Wrigstad. Ownership Types: A Survey, pages 15–58. Springer Berlin Heidelberg, 2013.
11. Andreea Costea, Asankhaya Sharma, and Cristina David. HIPimm: verifying granular immutability guarantees. In PEPM, pages 189–194. ACM, 2014.
12. Andreea Costea, Amy Zhu, Nadia Polikarpova, and Ilya Sergey. ROBoSuSLik: ESOP 2020 Artifact. 2020. DOI: 10.5281/zenodo.3630044.
13. Cristina David and Wei-Ngan Chin. Immutable specifications for more concise and precise verification. In OOPSLA, pages 359–374. ACM, 2011.
14. Benjamin Delaware, Clément Pit-Claudel, Jason Gross, and Adam Chlipala. Fiat: Deductive Synthesis of Abstract Data Types in a Proof Assistant. In POPL, pages 689–700. ACM, 2015.
15. Robert Dockins, Aquinas Hobor, and Andrew W. Appel. A fresh look at separation algebras and share accounting. In APLAS, volume 5904 of LNCS, pages 161–177. Springer, 2009.
16. Ronald Garcia, Éric Tanter, Roger Wolff, and Jonathan Aldrich. Foundations of typestate-oriented programming. ACM Trans. Program. Lang. Syst., 36(4):12:1–12:44, 2014.
17. Adrià Gascón, Ashish Tiwari, Brent Carmer, and Umang Mathur. Look for the proof to find the program: Decorated-component-based program synthesis. In CAV, volume 10427 of LNCS, pages 86–103. Springer, 2017.
18. Colin S. Gordon, Matthew J. Parkinson, Jared Parsons, Aleks Bromfield, and Joe Duffy. Uniqueness and reference immutability for safe parallelism. In OOPSLA, pages 21–40. ACM, 2012.
19. Sumit Gulwani, Susmit Jha, Ashish Tiwari, and Ramarathnam Venkatesan. Synthesis of loop-free programs. In PLDI, pages 62–73. ACM, 2011.
20. Stefan Heule, K. Rustan M. Leino, Peter Müller, and Alexander J. Summers. Abstract read permissions: Fractional permissions without the fractions. In VMCAI, volume 7737 of LNCS, pages 315–334. Springer, 2013.
21. Bart Jacobs, Jan Smans, Pieter Philippaerts, Frédéric Vogels, Willem Penninckx, and Frank Piessens. VeriFast: A Powerful, Sound, Predictable, Fast Verifier for C and Java. In NASA Formal Methods, volume 6617 of LNCS, pages 41–55. Springer, 2011.
22. Ralf Jung, Jacques-Henri Jourdan, Robbert Krebbers, and Derek Dreyer. RustBelt: Securing the foundations of the Rust programming language. PACMPL, 2(POPL):66, 2017.
23. Etienne Kneuss, Ivan Kuraj, Viktor Kuncak, and Philippe Suter. Synthesis modulo recursive functions. In OOPSLA, pages 407–426. ACM, 2013.
24. Tristan Knoth, Di Wang, Nadia Polikarpova, and Jan Hoffmann. Resource-guided program synthesis. In PLDI, pages 253–268. ACM, 2019.
25. Xuan Bach Le and Aquinas Hobor. Logical reasoning for disjoint permissions. In ESOP, volume 10801 of LNCS, pages 385–414. Springer, 2018.
26. K. Rustan M. Leino and Aleksandar Milicevic. Program Extrapolation with Jennisys. In OOPSLA, pages 411–430. ACM, 2012.
27. K. Rustan M. Leino and Peter Müller. A Basis for Verifying Multi-threaded Programs. In ESOP, volume 5502 of LNCS, pages 378–393. Springer, 2009.
28. K. Rustan M. Leino, Peter Müller, and Jan Smans. Verification of Concurrent Programs with Chalice. In Foundations of Security Analysis and Design V, FOSAD 2007/2008/2009 Tutorial Lectures, volume 5705 of LNCS, pages 195–222. Springer, 2009.
29. Zohar Manna and Richard J. Waldinger. A deductive approach to program synthesis. ACM Trans. Program. Lang. Syst., 2(1):90–121, 1980.
30. Peter Müller, Malte Schwerhoff, and Alexander J. Summers. Viper: A Verification Infrastructure for Permission-Based Reasoning. In VMCAI, volume 9583 of LNCS, pages 41–62. Springer, 2016.
31. Karl Naden, Robert Bocchino, Jonathan Aldrich, and Kevin Bierhoff. A type system for borrowing permissions. In POPL, pages 557–570. ACM, 2012.
32. Peter W. O'Hearn, John C. Reynolds, and Hongseok Yang. Local reasoning about programs that alter data structures. In CSL, volume 2142 of LNCS, pages 1–19. Springer, 2001.
33. Nadia Polikarpova, Ivan Kuraj, and Armando Solar-Lezama. Program synthesis from polymorphic refinement types. In PLDI, pages 522–538. ACM, 2016.
34. Nadia Polikarpova and Ilya Sergey. Structuring the Synthesis of Heap-Manipulating Programs. PACMPL, 3(POPL):72:1–72:30, 2019.
35. Nadia Polikarpova, Jean Yang, Shachar Itzhaky, and Armando Solar-Lezama. Enforcing information flow policies with type-targeted program synthesis. CoRR, abs/1607.03445, 2016.
36. Xiaokang Qiu and Armando Solar-Lezama. Natural synthesis of provably-correct data-structure manipulations. PACMPL, 1(OOPSLA):65:1–65:28, 2017.
37. John C. Reynolds. Separation logic: A logic for shared mutable data structures. In LICS, pages 55–74. IEEE Computer Society, 2002.
38. Reuben N. S. Rowe and James Brotherston. Automatic cyclic termination proofs for recursive procedures in separation logic. In CPP, pages 53–65. ACM, 2017.
39. Calvin Smith and Aws Albarghouthi. Synthesizing differentially private programs. Proc. ACM Program. Lang., 3(ICFP):94:1–94:29, July 2019.
40. Armando Solar-Lezama. Program sketching. STTT, 15(5-6):475–495, 2013.
41. Saurabh Srivastava, Sumit Gulwani, and Jeffrey S. Foster. From program verification to program synthesis. In POPL, pages 313–326. ACM, 2010.
42. Sven Stork, Karl Naden, Joshua Sunshine, Manuel Mohr, Alcides Fonseca, Paulo Marques, and Jonathan Aldrich. Æminium: A Permission-Based Concurrent-by-Default Programming Language Approach. TOPLAS, 36(1):2:1–2:42, 2014.
43. Alexander J. Summers and Peter Müller. Automating deductive verification for weak-memory programs. In TACAS, volume 10805 of LNCS, pages 190–209. Springer, 2018.
44. Emina Torlak and Rastislav Bodík. A lightweight symbolic virtual machine for solver-aided host languages. In PLDI, pages 530–541. ACM, 2014.
Soundness conditions for big-step semantics
1 Introduction
The semantics of programming languages or software systems specifies, for each
program/system configuration, its final result, if any. In the case of non-existence
of a final result, there are two possibilities:
– either the computation stops with no final result, and there is no means to
compute further: stuck computation,
– or the computation never stops: non-termination.
There are two main styles to define operationally a semantic relation: the
small-step style [34,35], on top of a reduction relation representing single com-
putation steps, or directly by a set of rules as in the big-step style [28]. Within a
small-step semantics it is straightforward to make the distinction between stuck
and non-terminating computations, while a typical drawback of the big-step style
is that they are not distinguished (no judgement is derived in either case).
For this reason, even though big-step semantics is generally more abstract,
and sometimes more intuitive to design and therefore to debug and extend, in the
literature much more effort has been devoted to study the meta-theory of small-
step semantics, providing properties, and related proof techniques. Notably, the
soundness of a type system (typing prevents stuck computation) can be proved
by progress and subject reduction (also called type preservation) [40].
Our quest is then to provide a general proof technique to prove the soundness
of a predicate with respect to an arbitrary big-step semantics. How can we
achieve this result, given that in big-step formulation soundness cannot even
– C is a set of configurations c.
– R ⊆ C is a set of results r. We define judgments j ≡ c ⇒ r, meaning that
  configuration c evaluates to result r. Set C(j) = c and R(j) = r.
– ℛ is a set of rules ρ of shape

    j1 … jn    jn+1
    ────────────────        also written in inline format: rule(j1 … jn, jn+1, c)
       c ⇒ R(jn+1)

  with c ∈ C \ R, where j1 … jn are the dependencies and jn+1 is the continuation.
  Set C(ρ) = c and, for i ∈ 1..n + 1, C(ρ, i) = C(ji) and R(ρ, i) = R(ji).
– For each result r ∈ R, we implicitly assume a single axiom with empty premises
  and conclusion r ⇒ r. Hence, the only derivable judgment for r is r ⇒ r, which
  we will call a trivial judgment.
We will use the inline format, more concise and manageable, for the development
of the meta-theory, e.g., in constructions.
A rule corresponds to the following evaluation process for a non-result con-
figuration: first, dependencies are evaluated in the given order, then the contin-
uation is evaluated and its result is returned as result of the entire computation.
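This evaluation process is directly executable; the following Python sketch (our rendering, not the paper's) instantiates it for the λ-calculus fragment used in the examples, where rule (app) evaluates its two dependencies left-to-right and then the continuation e[v2/x], and a configuration with no applicable rule is stuck:

    def subst(e, x, v):
        # Substitute the closed value v for variable x in expression e.
        if e[0] == 'var':
            return v if e[1] == x else e
        if e[0] == 'lam':
            return e if e[1] == x else ('lam', e[1], subst(e[2], x, v))
        return ('app', subst(e[1], x, v), subst(e[2], x, v))

    def evaluate(e):
        # Expressions: ('var', x) | ('lam', x, body) | ('app', e1, e2).
        if e[0] == 'lam':                 # axiom: results evaluate to themselves
            return e
        if e[0] == 'app':
            _, e1, e2 = e
            v1 = evaluate(e1)             # dependency 1: e1 => v1
            v2 = evaluate(e2)             # dependency 2: e2 => v2
            if v1[0] != 'lam':
                raise RuntimeError('stuck')   # no rule has e1 => v1 as a premise
            return evaluate(subst(v1[2], v1[1], v2))   # continuation: e[v2/x] => v
        raise RuntimeError('stuck')       # e.g. a free variable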
(app-r). As said above, these different choices do not affect the semantic relation
c ⇒ r defined by the inference system, which is always the same. However, they
will affect the way the extended semantics distinguishing stuck computation and
non-termination is constructed. Indeed, if the evaluation of e1 and e2 is stuck
and non-terminating, respectively, we should obtain stuck computation with rule
(app) and non-termination with rule (app-r).
In summary, to see a typical big-step semantics as an instance of our defi-
nition, it is enough to assume an order (or more than one) on premises, make
implicit the axiom for results, and add a dummy continuation when needed. In
the examples (Sect. 5), we will assume a left-to-right order on premises, and
omit dummy continuations to keep a more familiar style. In the technical part
(Sect. 3, Sect. 4 and Sect. 6) we will adopt the inline format.
3 Extended semantics
In the following, we assume a big-step semantics ⟨C, R, ℛ⟩ and describe two
constructions which make the distinction between non-termination and stuck
computation explicit. In both cases, the approach is based on well-known ideas;
the novel contribution is that, thanks to the meta-theory in Sect. 2, we provide
a general construction working on an arbitrary big-step semantics.
3.1 Traces
We denote by C⋆, Cω, and C∞ = C⋆ ∪ Cω, respectively, the sets of finite, infinite,
and possibly infinite traces, that is, sequences of configurations. We write t · t′
for concatenation of t ∈ C⋆ with t′ ∈ C∞.
We derive, from the judgement c ⇒ r, an enriched big-step judgement c ⇒tr t
with t ∈ C∞. Intuitively, t keeps track of all the configurations visited during the
evaluation, starting from c itself. To define the trace semantics, we construct,
starting from ℛ, a new set of rules ℛtr, which are of two kinds:

trace introduction These rules enrich the standard semantics by finite traces:
for each ρ ≡ rule(j1 … jn, jn+1, c) in ℛ and finite traces t1, …, tn+1 ∈ C⋆,
we add the rule

    C(j1) ⇒tr t1 · R(j1)    …    C(jn+1) ⇒tr tn+1 · R(jn+1)
    ─────────────────────────────────────────────────────────
    c ⇒tr c · t1 · R(j1) · … · tn+1 · R(jn+1)

We denote this rule by trace(ρ, t1, …, tn+1), to highlight the relationship
with the original rule ρ. We also add, for each result r, the axiom with
conclusion r ⇒tr r.
Such rules derive judgements c ⇒tr t with t ∈ C⋆, for convergent computations.
divergence propagation These rules propagate divergence, that is, if a
(sub)configuration in the premise of a rule diverges, then the subsequent
premises are ignored and the configuration in the conclusion diverges as
well: for each ρ ≡ rule(j1 … jn, jn+1, c) in ℛ, index i ∈ 1..n + 1, finite traces
t1, …, ti−1 ∈ C⋆, and infinite trace t, we add the rule:

    C(j1) ⇒tr t1 · R(j1)    …    C(ji−1) ⇒tr ti−1 · R(ji−1)    C(ji) ⇒tr t
    ────────────────────────────────────────────────────────────────────────
    c ⇒tr c · t1 · R(j1) · … · ti−1 · R(ji−1) · t
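Operationally, the finite-trace judgement corresponds to instrumenting the evaluator from the sketch in Sect. 2 so that it records every configuration it visits (reusing subst from there); divergence then shows up as a trace that grows forever:

    def evaluate_tr(e, trace):
        # Enriched judgement c =>tr t: `trace` accumulates the configurations
        # visited, starting from c itself, as in the trace-introduction rules.
        trace.append(e)
        if e[0] == 'lam':
            return e                       # axiom r =>tr r
        if e[0] == 'app':
            _, e1, e2 = e
            v1 = evaluate_tr(e1, trace)
            v2 = evaluate_tr(e2, trace)
            if v1[0] != 'lam':
                raise RuntimeError('stuck')
            return evaluate_tr(subst(v1[2], v1[1], v2), trace)
        raise RuntimeError('stuck')

    # For omega = (lam x. x x)(lam x. x x) the call never returns and `trace`
    # grows without bound, mirroring the infinite trace produced by the
    # divergence-propagation rules.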
3.2 Wrong
  (wrong-app)  e1 ⇒ n          (wrong-succ)  e ⇒ λx.e′
               ──────────────                ──────────────
               e1 e2 ⇒ wrong                 succ e ⇒ wrong
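The wrong construction can likewise be read as an evaluator that returns a distinguished result instead of getting stuck; a self-contained sketch (ours) covering the two rules above for a fragment with numerals, succ and application:

    WRONG = ('wrong',)

    def subst_wr(e, x, v):
        tag = e[0]
        if tag == 'var':
            return v if e[1] == x else e
        if tag == 'lam':
            return e if e[1] == x else ('lam', e[1], subst_wr(e[2], x, v))
        if tag == 'succ':
            return ('succ', subst_wr(e[1], x, v))
        if tag == 'app':
            return ('app', subst_wr(e[1], x, v), subst_wr(e[2], x, v))
        return e                                   # numerals ('num', n)

    def evaluate_wr(e):
        # c => wrong exactly when the standard semantics would get stuck.
        if e[0] in ('lam', 'num'):
            return e
        if e[0] == 'succ':
            v = evaluate_wr(e[1])
            return ('num', v[1] + 1) if v[0] == 'num' else WRONG   # (wrong-succ)
        if e[0] == 'app':
            v1 = evaluate_wr(e[1])
            if v1[0] != 'lam':
                return WRONG                       # (wrong-app), or propagation
            v2 = evaluate_wr(e[2])
            if v2[0] == 'wrong':
                return WRONG                       # propagate wrong from e2
            return evaluate_wr(subst_wr(v1[2], v1[1], v2))
        return WRONG                               # free variables are stuck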
notion of strong soundness, introduced by [40]. Strong soundness holds if, for
configurations satisfying Πι (e.g., having a given type), computation cannot be
stuck, and moreover, produces a result satisfying Πι (e.g., of the same type)
if terminating. Note that soundness alone does not even guarantee to obtain a
result satisfying Π (e.g., a well-typed result). The three conditions introduced
in the following section actually ensure strong soundness.
In Sect. 4.2 we provide sufficient conditions for soundness-must, showing that
they actually ensure soundness in the wrong semantics (Theorem 3). Then, in
Sect. 4.3, we provide (weaker) sufficient conditions for soundness-may, and show
that they actually ensure soundness-may in the trace semantics (Theorem 4).
Thinking of the paradigmatic case where the indexes are types, for each rule
ρ, if the configuration c in the consequence has type ι, we have to find types
ι1 , . . . , ιn+1 which can be assigned to (the configurations in) the premises, in
particular the same type as c for the continuation. More precisely, we start find-
ing type ι1 , and successively find the type ιk for (the configuration in) the k-th
premise assuming that the results of all the previous premises have the expected
types. Indeed, if all such previous premises are derivable, then the expected type
should be preserved by their results; if some premise is not derivable, the consid-
ered rule is “useless”. For instance, considering (an instantiation of) meta-rule
(app) rule(e1 ⇒ λx.e  e2 ⇒ v2, e[v2/x] ⇒ v, e1 e2) in Sect. 2, we prove that e[v2/x]
has the type T of e1 e2 under the assumption that λx.e has type T′ → T, and
v2 has type T′ (see the proof example in Sect. 5.1 for more details).
A counter-example to condition S1 is discussed at the beginning of Sect. 5.3.
The following lemma states that local preservation actually implies preser-
vation of the semantic relation as a whole.
Proposition 1. Let ℛ and Π satisfy condition S1. For each rule(j1 … jn, jn+1, c)
and k ∈ 1..n + 1, if c ∈ Π and, for all h < k, ℛ ⊢ jh, then C(jk) ∈ Π.
The second condition, named ∃-progress, ensures that, for configurations sat-
isfying the predicate Π (e.g., well-typed), we can start constructing a proof tree.
Definition 2 (S2: ∃-progress). For each c ∈ Π\R, C (ρ) = c for some rule ρ.
The third condition, named ∀-progress, ensures that, for configurations sat-
isfying Π, we can continue constructing the proof tree. This condition uses the
notion of rules equivalent up-to an index introduced at the beginning of Sect. 3.2.
We have to check, for each rule ρ, the following: if the configuration c in the
consequence satisfies the predicate (e.g., is well-typed), then, for each k, if the
configuration in premise k evaluates to some result r (that is, ℛ ⊢ C(jk) ⇒ r),
then there is a rule (ρ itself or another rule with the same configuration in the
consequence and the first k − 1 premises) with such judgment as k-th premise.
This check can be done under the assumption that all the previous premises
are derivable. For instance, consider again (an instantiation of) the meta-rule
(app) rule(e1 ⇒ λx .e e2 ⇒ v2 , e[v2 /x ] ⇒ v , e1 e2 ). Assuming that e1 evaluates to
some v1 , we have to check that there is a rule with first premise e1 ⇒ v1 , in
practice, that v1 is a λ-abstraction; in general, checking S3 for a (meta-)rule
amounts to show that (sub)configurations in the premises evaluate to results
with the required shape (see also the proof example in Sect. 5.1).
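On a finite enumeration of rule instances the conditions can be checked mechanically; a rough sketch (ours), where a rule is (deps, cont, config) in the inline format, judgments are (configuration, result) pairs, pi is the predicate Π, and derivable decides ℛ ⊢ j:

    def check_S1(rules, pi, derivable):
        # S1 (local preservation): if the conclusion's configuration satisfies
        # Pi and all premises before the k-th are derivable, then the k-th
        # premise's configuration satisfies Pi as well.
        for deps, cont, c in rules:
            if not pi(c):
                continue
            premises = list(deps) + [cont]
            for k, (ck, _) in enumerate(premises):
                if all(derivable(j) for j in premises[:k]) and not pi(ck):
                    return False
        return True

    def check_S2(configs, results, rules, pi):
        # S2 (exists-progress): every non-result configuration satisfying Pi
        # is the conclusion of at least one rule.
        covered = {c for (_, _, c) in rules}
        return all(c in covered for c in configs
                   if pi(c) and c not in results)

    # S3 (forall-progress) is checked analogously, by enumerating the rules
    # that agree with a given one on the first k - 1 premises.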
Lemma 2. Let S ⊆ C be a set. If, for all c ∈ S, there are ρ ≡ rule(j1 … jn, jn+1, c)
and k ∈ 1..n + 1 such that
1. for all h < k, ℛ ⊢ jh, and
2. C(jk) ∈ S,
then, for all c ∈ S, there is t ∈ Cω such that ℛtr ⊢ c ⇒tr t.
Theorem 4. Let ℛ and Π satisfy conditions S1 and S4. If c ∈ Π, then there
is t such that ℛtr ⊢ c ⇒tr t.
Proof. First note that, thanks to Theorem 1, the statement is equivalent to the
following:
If c ∈ Π and ℛ ⊬ c ⇒, then there is t ∈ Cω such that ℛtr ⊢ c ⇒tr t.
Then, the proof follows from Lemma 2. We define S = {c | c ∈ Π and ℛ ⊬ c ⇒},
and show that, for all c ∈ S, there are ρ ≡ rule(j1 … jn, jn+1, c) and k ∈ 1..n + 1
such that, for all h < k, ℛ ⊢ jh, and C(jk) ∈ S.
Consider c ∈ S; then, by S4, there is ρ ≡ rule(j1 … jn, jn+1, c). By definition
of S, we have ℛ ⊬ c ⇒, hence there exists a (first) k ∈ 1..n + 1 such that ℛ ⊬ jk,
since, otherwise, we would have ℛ ⊢ c ⇒ R(jn+1). Then, since k is the first index
with such property, for all h < k, we have ℛ ⊢ jh, hence, again by condition
S4, we have that ℛ ⊬ C(jk) ⇒. Finally, since for all h < k we have ℛ ⊢ jh, by
Prop. 1, we get C(jk) ∈ Π, hence C(jk) ∈ S, as needed.
5 Examples
Sect. 5.1 explains in detail how a typical soundness proof can be rephrased in
terms of our technique, by reasoning directly on big-step rules. Sect. 5.2 shows
a case where this is advantageous, since the property to be checked is not pre-
served by intermediate computation steps, whereas it holds for the final result.
Sect. 5.3 considers a more sophisticated type system, with intersection and union
types. Finally, Sect. 5.4 shows another example where subject reduction is not
preserved, whereas soundness can be proved with our technique. This example
is intended as a preliminary step towards a more challenging case.
  (t-var)   Γ(x) = T            (t-const)  ─────────────
            ───────────                    Γ ⊢ n : Nat
            Γ ⊢ x : T

  (t-abs)   Γ{T′/x} ⊢ e : T     (t-app)  Γ ⊢ e1 : T′ → T    Γ ⊢ e2 : T′
            ──────────────────           ───────────────────────────────
            Γ ⊢ λx.e : T′ → T            Γ ⊢ e1 e2 : T

  (t-succ)  Γ ⊢ e : Nat         (t-choice)  Γ ⊢ e1 : T    Γ ⊢ e2 : T
            ────────────────               ──────────────────────────
            Γ ⊢ succ e : Nat               Γ ⊢ e1 ⊕ e2 : T
Lemma 3 (Inversion).
1. If Γ ⊢ x : T, then Γ(x) = T.
2. If Γ ⊢ n : T, then T = Nat.
3. If Γ ⊢ λx.e : T, then T = T1 → T2 and Γ{T1/x} ⊢ e : T2.
4. If Γ ⊢ e1 e2 : T, then Γ ⊢ e1 : T′ → T, and Γ ⊢ e2 : T′.
5. If Γ ⊢ succ e : T, then T = Nat and Γ ⊢ e : Nat.
6. If Γ ⊢ e1 ⊕ e2 : T, then Γ ⊢ ei : T with i ∈ {1, 2}.

Lemma 4 (Substitution). If Γ{T′/x} ⊢ e : T and Γ ⊢ e′ : T′, then Γ ⊢ e[e′/x] : T.
Lemma 5 (Canonical Forms).
1. If ⊢ v : T′ → T, then v = λx.e.
2. If ⊢ v : Nat, then v = n.
Since the aim of this first example is to illustrate the proof technique, we
provide a proof where we explain the reasoning in detail.
Proof of S1. We should prove this condition for each (instantiation of meta-)rule.
(app): Assume that ⊢ e1 e2 : T holds. We have to find types for the premises,
notably T for the last one. We proceed as follows:
Proof of S2. We should prove that, for each non-result configuration (here,
expression e which is not a value) such that ⊢ e : T holds for some T, there is
a rule with this configuration in the consequence. The expression e cannot be a
variable, since a variable cannot be typed in the empty environment. Applica-
tion, successor and choice appear as consequence in the reduction rules.
Proof of S3. We should prove this condition for each (instantiation of meta-)rule.
(app): Assuming ⊢ e1 e2 : T, again by Lemma 3 (4) we get ⊢ e1 : T′ → T.
1. First premise: if e1 ⇒ v is derivable, then there should be a rule with e1 e2
in the consequence and e1 ⇒ v as first premise. Since we proved S1, by
preservation (Lemma 1)
v : T → T holds. Then, by Lemma 5 (1), v has
shape λx .e, hence the required rule exists. As noted at page 10, in practice
checking S3 for a (meta-)rule amounts to show that (sub)configurations in
the premises evaluate to results which have the required shape (to be a
λ-abstraction in this case).
2. Second premise: if e1 ⇒ λx .e, and e2 ⇒ v2 , then there should be a rule with
e1 e2 in the consequence and e1 ⇒ λx.e, e2 ⇒ v2 as first two premises. This is
trivial since the meta-variable v2 can be freely instantiated in the meta-rule.
(succ): Assuming ⊢ succ e : T, again by Lemma 3 (5) we get ⊢ e : Nat. If e ⇒ v
is derivable, there should be a rule with succ e in the consequence and e ⇒ v as
first premise. Indeed, by preservation (Lemma 1) and Lemma 5 (2), v has shape
n. For the second premise, if n + 1 ⇒ v is derivable, then v is necessarily n + 1.
(choice): Trivial since the meta-variable v can be freely instantiated.
0 0 : Nat, then condition S3 fails for rule (app), since 0 ⇒ 0 is derivable, but
there is no rule with 0 0 in the conclusion and 0 ⇒ 0 as first premise.
5.2 MiniFJ&λ
In this example, the language is a subset of FJ&λ [12], a calculus extending
Featherweight Java (FJ) with λ-abstractions and intersection types, introduced
in Java 8. To keep the example small, we do not consider intersections and focus
on one key typing feature: λ-abstractions can only be typed when occurring in a
context requiring a given type (called the target type). In a small-step semantics,
this poses a problem: reduction can move λ-abstractions into arbitrary contexts,
leading to intermediate terms which would be ill-typed. To maintain subject
reduction, in [12] λ-abstractions are decorated with their initial target type. In
a big-step semantics, there is no need of intermediate terms and annotations.
The syntax is given in the first part of Fig. 5. We assume sets of variables
x , class names C, interface names I, J, field names f, and method names m.
Interfaces which have exactly one method (dubbed functional interfaces) can be
used as target types. Expressions are those of FJ, plus λ-abstractions, and types
are class and interface names. In λxs.e we assume that xs is not empty and e
is not a λ-abstraction. For simplicity, we only consider upcasts, which have no
runtime effect, but are important to allow the programmer to use λ-abstractions,
as exemplified in the discussion of the typing rules.
To be concise, the class table is abstractly modelled by the maps fields(C),
mbody(C, m), and mtype(C, m), giving, respectively, the fields of a class and
the body and type of a method.
The big-step semantics is given in the last part of Fig. 5. MiniFJ&λ shows
an example of instantiation of the framework where configurations include an
auxiliary structure, rather than being just language terms. In this case, the
structure is an environment e (a finite map from variables to values) modelling
the current stack frame. Results are values, which are either objects, of shape
[vs]C , or λ-abstractions.
Rules for FJ constructs are straightforward. Note that, since we only consider
upcasts, casts have no runtime effect. Indeed, they are guaranteed to succeed on
well-typed expressions. Rule (λ-invk) shows that, when the receiver of a method
is a λ-abstraction, the method name is not significant at runtime, and the effect
is that the body of the function is evaluated as in the usual application.
The type system is given in Fig. 6. Method bodies are expected to be well-
typed with respect to method types. Formally, mbody(C, m) and mtype(C, m)
are either both defined or both undefined: in the first case mbody(C, m) =
x1 . . . xn, e, mtype(C, m) = T1 . . . Tn → T, and x1:T1, . . . , xn:Tn, this:C ⊢ e : T.
Moreover, we assume other standard FJ constraints on the class table, such
as no field hiding, no method overloading, and the same parameter and return
types in overriding.
Besides the standard typing features of FJ, the MiniFJ&λ type system en-
sures the following.
(var)
  e(x) = v
  ────────────
  e, x ⇒ v

(new)
  e, ei ⇒ vi  ∀i ∈ 1..n
  ─────────────────────────────────────────
  e, new C(e1, . . . , en) ⇒ [v1, . . . , vn]C

(invk)
  e, e0 ⇒ [vs]C    e, ei ⇒ vi ∀i ∈ 1..n    x1:v1, . . . , xn:vn, this:[vs]C, e' ⇒ v
  ──────────────────────────────────────────────────  mbody(C, m) = x1 . . . xn, e'
  e, e0.m(e1, . . . , en) ⇒ v

(λ-invk)
  e, e0 ⇒ λxs.e'    e, ei ⇒ vi ∀i ∈ 1..n    x1:v1, . . . , xn:vn, e' ⇒ v
  ─────────────────────────────────────────────
  e, e0.m(e1, . . . , en) ⇒ v

(upcast)
  e, e' ⇒ v
  ─────────────
  e, (T)e' ⇒ v

For instance, consider the following class table
interface J {}
interface I extends J { A m(A x); }
class C {
C m(I y) { return new C().n(y); }
C n(J y) { return new C(); }
}
(t-var)
  Γ(x) = T
  ──────────
  Γ ⊢ x : T

(t-field-access)
  Γ ⊢ e : C
  ──────────────  fields(C) = T1 f1; . . . Tn fn;    i ∈ 1..n
  Γ ⊢ e.fi : Ti

(t-new)
  Γ ⊢ ei : Ti  ∀i ∈ 1..n
  ─────────────────────────────  fields(C) = T1 f1; . . . Tn fn;
  Γ ⊢ new C(e1, . . . , en) : C

(t-λ)
  x1:T1, . . . , xn:Tn ⊢ e : T
  ─────────────────────────────  mtype(I) = T1 . . . Tn → T
  Γ ⊢ λxs.e : I
and the main expression new C().n(λx .x ). Here, the λ-abstraction has tar-
get type J, which is not a functional interface, hence the expression is ill-
typed in Java (the compiler has no functional type against which to type-
check the λ-abstraction). On the other hand, in the body of method m, the
parameter y of type I can be passed, as usual, to method n expecting a su-
pertype. For instance, the main expression new C().m(λx .x ) is well-typed,
since the λ-abstraction has target type I, and can be safely passed to method
n, since it is not used as a function there. To formalise this behaviour, it is
forbidden to apply subsumption to λ-abstractions, see rule (t-sub).
– However, λ-abstractions occurring as results rather than in source code (that
is, in the environment and as fields of objects) are allowed to have a sub-
type of the required type, see the explicit side condition in rules (t-conf)
and (t-object). For instance, if C is a class with one field J f, the expression
new C((I)λx.x) is well-typed, whereas new C(λx.x) is ill-typed, since rule
(t-sub) cannot be applied to λ-abstractions. When the expression is evaluated,
the result is [λx.x]C , which is well-typed.
5.3 Intersection and union types
We enrich the type system of Fig. 4 by adding intersection and union type
constructors and the corresponding typing rules, see Fig. 7. As usual we require
an infinite number of arrows in each infinite path for the trees representing types.
Intersection types for the λ-calculus have been widely studied [11]. Union types
naturally model conditionals [26] and non-deterministic choice [22].
(∧I)   Γ ⊢ e : T    Γ ⊢ e : S
       ──────────────────────
       Γ ⊢ e : T ∧ S

(∧E)   Γ ⊢ e : T ∧ S          (∧E)   Γ ⊢ e : T ∧ S
       ──────────────               ──────────────
       Γ ⊢ e : T                    Γ ⊢ e : S

(∨I)   Γ ⊢ e : T              (∨I)   Γ ⊢ e : S
       ──────────────               ──────────────
       Γ ⊢ e : T ∨ S                Γ ⊢ e : T ∨ S
The typing rules for the introduction and the elimination of intersection
and union are standard, except for the absence of the union elimination rule:
(∨E)   Γ{T/x} ⊢ e : V    Γ{S/x} ⊢ e : V    Γ ⊢ e' : T ∨ S
       ──────────────────────────────────────────────────
       Γ ⊢ e[e'/x] : V
As a matter of fact, rule (∨E) is unsound for ⊕. For example, let us split the type
Nat into Even and Odd and add the expected typings for natural numbers. The
prefix addition + has type
(Even → Even → Even) ∧ (Odd → Odd → Even)
and we derive

  ⊢ 1 : Odd             ⊢ 2 : Even
  ──────────────── (∨I) ──────────────── (∨I)
  ⊢ 1 : Even ∨ Odd      ⊢ 2 : Even ∨ Odd
  ─────────────────────────────────────── (⊕)
  ⊢ 1 ⊕ 2 : Even ∨ Odd

  x:Even ⊢ + x x : Even    x:Odd ⊢ + x x : Even    ⊢ 1 ⊕ 2 : Even ∨ Odd
  ───────────────────────────────────────────────────────────────────── (∨E)
  ⊢ + (1 ⊕ 2) (1 ⊕ 2) : Even
We cannot assign the type Even to 3, which is a possible result, so strong sound-
ness is lost. In the small-step approach, we cannot assign Even to the interme-
diate term + 1 2, so subject reduction fails. In the big-step approach, there is no
such intermediate term; however, condition S1 fails for the reduction rule for +.
Indeed, considering the following instantiation of the rule:
  1 ⊕ 2 ⇒ 1    1 ⊕ 2 ⇒ 2    3 ⇒ 3
  ───────────────────────────────── (+)
  + (1 ⊕ 2) (1 ⊕ 2) ⇒ 3
and the type Even for the consequence, we cannot assign this type to the (con-
figuration in) last premise (continuation).
Intersection types allow us to derive meaningful types also for expressions
containing variables applied to themselves; for example, we can derive
⊢ λx.x x : (T → S) ∧ T → S
With union types, all non-deterministic choices between typable expressions can
be typed too, since we can derive Γ ⊢ e1 ⊕ e2 : T1 ∨ T2 from Γ ⊢ e1 : T1 and
Γ ⊢ e2 : T2.
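The counterexample above can also be checked mechanically; the following Python sketch (ours) enumerates the possible results of + (1 ⊕ 2) (1 ⊕ 2), where the two copies of 1 ⊕ 2 evaluate independently:

from itertools import product

branches = [1, 2]                        # the possible results of 1 (+) 2
results = {a + b for a, b in product(branches, repeat=2)}
print(sorted(results))                   # [2, 3, 4]: 3 is Odd, so Even is unsound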
In order to state soundness, let Π3T(e) hold if ⊢ e : T, for T defined in Fig. 7.
Theorem 7 (Soundness). The big-step semantics R1 and the indexed predi-
cate Π3 satisfy the conditions S1, S2 and S3 of Sect. 4.2.
5.4 MiniFJ&O
A well-known example in which proving soundness with respect to small-step
semantics is extremely challenging is the standard type system with intersection
and union types [10] w.r.t. the pure λ-calculus with full reduction. Indeed, the
standard subject reduction technique fails5 , since, for instance, we can derive
the type (T → T → V ) ∧ (S → S → V ) → (U → T ∨ S ) → U → V for both
λx.λy.λz.x((λt.t)(y z))((λt.t)(y z)) and λx.λy.λz.x(y z)(y z), but the intermedi-
ate expressions λx.λy.λz.x((λt.t)(y z))(y z) and λx.λy.λz.x(y z)((λt.t)(y z)) do
not have this type.
As the example shows, the key problem is that rule (∨E) can be applied to an
expression e where the same subexpression e' occurs more than once. In the
non-deterministic case, as shown by the example in the previous section, this
is unsound, since e' can reduce to different values. In the deterministic case,
instead, this is sound, but cannot be proved by subject reduction. Since using
big-step semantics there are no intermediate steps to be typed, our approach
seems very promising to investigate an alternative proof of soundness. While
we leave this challenging problem to future work, here, as a first step, we describe
a (hypothetical) calculus with a much simpler version of the problematic feature.
The calculus is a variant of FJ [27] with intersection and union types. Meth-
ods have intersection types with the same return type and different parameter
types, modelling a form of overloading. Union types enhance typability of condi-
tionals. The more interesting feature is the possibility of replacing an arbitrary
number of parameters with the same expression having an union type. We dub
this calculus MiniFJ&O.
Fig. 8 gives the syntax, big-step semantics and typing rules of MiniFJ&O.
We omit the standard big-step rule for conditional, and typing rules for boolean
5 For this reason, in [10] soundness is proved by an ad-hoc technique, that is, by
considering parallel reduction and an equivalent type system à la Gentzen, which
enjoys the cut elimination property.
(new)
  ei ⇒ vi  ∀i ∈ 1..n
  ─────────────────────────────────────────
  new C(e1, . . . , en) ⇒ new C(v1, . . . , vn)

(invk)
  e0 ⇒ new C(vs)    ei ⇒ vi ∀i ∈ 1..n    e[v1/x1] . . . [vn/xn][new C(vs)/this] ⇒ v
  ───────────────────────────────────────────────────  mbody(C, m) = x1 . . . xn, e
  e0.m(e1, . . . , en) ⇒ v
(t-var)
  Γ(x) = T
  ──────────
  Γ ⊢ x : T

(t-field-access)
  Γ ⊢ e : C
  ──────────────  fields(C) = C1 f1; . . . Cn fn;    i ∈ 1..n
  Γ ⊢ e.fi : Ci

(t-new)
  Γ ⊢ ei : Ci  ∀i ∈ 1..n
  ─────────────────────────────  fields(C) = C1 f1; . . . Cn fn;
  Γ ⊢ new C(e1, . . . , en) : C

(t-invk)
  Γ ⊢ ei : Ci  ∀i ∈ 0..n    Γ ⊢ e : ⋁1≤i≤m Di
  ─────────────────────────────────────────────  mtype(C0, m) <: ⋀1≤i≤m (C1 . . . Cn Di . . . Di → C)
  Γ ⊢ e0.m(e1, . . . , en, e, . . . , e) : C

  (in both the side condition and the conclusion, Di and e are repeated p times)

(t-if)
  Γ ⊢ e : Bool    Γ ⊢ e1 : T    Γ ⊢ e2 : T
  ──────────────────────────────────────────
  Γ ⊢ if e then e1 else e2 : T

(t-sub)
  Γ ⊢ e : T
  ───────────  T <: T'
  Γ ⊢ e : T'
constants. The subtyping relation <: is the reflexive and transitive closure of the
union of the extends relation and the standard rules for union:

  T1 <: T1 ∨ T2        T1 <: T2 ∨ T1

On the other hand, method types (results of the mtype function) are now
intersection types, and the subtyping relation on them is the reflexive and
transitive closure of the standard rules for intersection:

  MT1 ∧ MT2 <: MT1        MT1 ∧ MT2 <: MT2
The functions fields and mbody are defined as for MiniFJ&λ. Instead,
mtype(C, m) gives, for each method m in class C, an intersection type. We
assume mbody(C, m) and mtype(C, m) to be either both defined or both undefined: in
the first case mbody(C, m) = x1 . . . xn, e, mtype(C, m) = ⋀1≤i≤m (C1(i) . . . Cn(i) → D),
and x1:C1(i), . . . , xn:Cn(i), this:C ⊢ e : D for all i ∈ 1..m.
Clearly rule (t-invk) is inspired by rule (∨E), but the restriction to method
calls admits a standard inversion lemma. The subtyping in this rule allows us to
choose the method type best fitting the types of the arguments. Not
surprisingly, subject reduction fails for the expected small-step semantics. For
example, let class C have a field point which contains cartesian coordinates and
class D have a field point which contains polar coordinates. The method eq takes
two objects and compares their point fields returning a boolean value. A type for
this method is (C C → Bool) ∧ (D D → Bool) and we can type eq(e, e), where
e = if false then new C( . . . ) else new D( . . . )
In fact e has type C ∨ D. Notice that in a standard small-step semantics
eq(e, e) −→ eq(new D( . . . ), if false then new C( . . . ) else new D( . . . ))
and this last expression cannot be typed.
In order to state soundness, let R4 be the big-step semantics defined in Fig. 8,
and let Π4T(e) hold if ⊢ e : T, for T defined in Fig. 8.
In this section, our aim is to provide a formal justification that the constructions
in Sect. 3 are correct. For instance, for the wrong semantics we would like to be
sure that all the cases are covered. To this end, we define a third construction,
dubbed pev for “partial evaluation”, which makes explicit the computations of
a big-step semantics, intended as the sequences of execution steps of the natu-
rally associated evaluation algorithm. Formally, we obtain a reduction relation
on approximated proof trees, so non-termination and stuck computation are
distinguished, and both soundness-must and soundness-may can be expressed.
To this end, first of all we introduce a special result ?, so that a judgment
c ⇒ ? (called incomplete, whereas a judgment in R is complete) means that the
evaluation of c is not completed yet. Analogously to the previous constructions,
we define an augmented set of rules R? for the judgment extended with ?:
Finally, we consider the set T of the (finite) proof trees τ in R?. Each τ can
be thought of as a partial proof, or partial evaluation, of the root configuration. In
particular, we say it is complete if it is a proof tree in R (that is, it only contains
complete judgments), incomplete otherwise. We define a reduction relation −R→
Writing [(label): τ1 . . . τn / c ⇒ u] for a proof tree with root judgment c ⇒ u,
obtained by rule label from immediate subtrees τ1 . . . τn, the rules of Fig. 9 are:

(r?)   [r ⇒ ?]  −R→  [(r): r ⇒ r]

(c?)   [c ⇒ ?]  −R→  [(prop(ρ,1,?)): c' ⇒ ? / c ⇒ ?]
       if C(ρ) = c and C(ρ, 1) = c'

[(intro?(ρ,i,r)): τ1 . . . τi / c ⇒ ?]  −R→  [(ρ'): τ1 . . . τi / c ⇒ r]
       if ρ' ∼i ρ, R(ρ', i) = r and #ρ' = i

[(intro?(ρ,i,r)): τ1 . . . τi / c ⇒ ?]  −R→  [(prop(ρ',i+1,?)): τ1 . . . τi (c' ⇒ ?) / c ⇒ ?]
       if ρ' ∼i ρ, R(ρ', i) = r and C(ρ', i+1) = c'

[(prop(ρ,i,?)): τ1 . . . τi / c ⇒ ?]  −R→  [(prop(ρ,i,?)): τ1 . . . τi−1 τi' / c ⇒ ?]
       if τi −R→ τi' and R?(r(τi')) = ?

[(prop(ρ,i,?)): τ1 . . . τi / c ⇒ ?]  −R→  [(intro?(ρ,i,r)): τ1 . . . τi−1 τi' / c ⇒ ?]
       if τi −R→ τi' and R?(r(τi')) = r
on T such that, starting from the initial proof tree [c ⇒ ?], we derive a sequence
where, intuitively, at each step we detail the proof (evaluation). In this way, a
sequence ending with a complete tree with root c ⇒ r models a terminating
computation, whereas an infinite sequence (tending to an infinite proof tree)
models divergence, and a stuck sequence models a stuck computation.
The one-step reduction relation −R→ on T is inductively defined by the rules
in Fig. 9. In this figure, #ρ denotes the number of premises of ρ, and r(τ) the
root of τ. We set R?(c ⇒ u) = u, where u ∈ R ∪ {?}. Finally, ∼i is the equivalence
up to an index of rules, introduced at the beginning of Sect. 3.2. As said above,
each reduction step makes “less incomplete” the proof tree. Notably, reduction
rules apply to nodes with consequence c ⇒ ?, whereas subtrees with root c ⇒ r
represent terminated evaluation. In detail:
– If the last applied rule is an axiom, and the configuration is a result r , then
we can evaluate r to itself. Otherwise, we have to find a rule ρ with c in the
consequence and start evaluating the first premise of such rule.
– If the last applied rule is intro?(ρ, i, r), then all subtrees are complete; hence,
to continue the evaluation, we have to find another rule ρ' having, for each
k ∈ 1..i, the root of τk as its k-th premise. Then there are two possibilities: if
there is an (i+1)-th premise, we start evaluating it; otherwise, we propagate
the result r of τi to the conclusion.
– If the last applied rule is a propagation rule prop(ρ, i, ?), then we simply
propagate the step made by τi .
[(λx.x) n ⇒ ?]
  −R→  [λx.x ⇒ ? / (λx.x) n ⇒ ?]
  −R→  [λx.x ⇒ λx.x / (λx.x) n ⇒ ?]
  −R→  [λx.x ⇒ λx.x    n ⇒ ? / (λx.x) n ⇒ ?]
  −R→  [λx.x ⇒ λx.x    n ⇒ n / (λx.x) n ⇒ ?]
  −R→  [λx.x ⇒ λx.x    n ⇒ n    n ⇒ ? / (λx.x) n ⇒ ?]
  −R→  [λx.x ⇒ λx.x    n ⇒ n    n ⇒ n / (λx.x) n ⇒ ?]
  −R→  [λx.x ⇒ λx.x    n ⇒ n    n ⇒ n / (λx.x) n ⇒ n]
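The construction can be animated directly. The following Python sketch (our encoding under assumed tree and term representations; the authoritative definition is the relation of Fig. 9) refines the leftmost incomplete node of a partial proof tree for the application fragment, replaying exactly the sequence above:

def is_value(e):
    return e[0] in ('lam', 'num')

def subst(e, x, v):
    # naive capture-unaware substitution; fine for the closed example below
    tag = e[0]
    if tag == 'var':
        return v if e[1] == x else e
    if tag == 'lam':
        return e if e[1] == x else ('lam', e[1], subst(e[2], x, v))
    if tag == 'app':
        return ('app', subst(e[1], x, v), subst(e[2], x, v))
    return e

def node(cfg):
    # a partial proof tree: root configuration, result (None = ?), subtrees
    return {'cfg': cfg, 'res': None, 'kids': []}

def step(t):
    """One pev-style step: refine the leftmost incomplete node; False if stuck."""
    if t['res'] is not None:
        return False                        # the tree is already complete
    if t['kids'] and t['kids'][-1]['res'] is None:
        return step(t['kids'][-1])          # (prop): step inside the last premise
    cfg = t['cfg']
    if not t['kids']:
        if is_value(cfg):
            t['res'] = cfg                  # axiom: r => r
            return True
        if cfg[0] == 'app':
            t['kids'].append(node(cfg[1]))  # start evaluating the first premise
            return True
        return False                        # stuck (e.g. a free variable)
    done = [k['res'] for k in t['kids']]    # all premises so far are complete
    if cfg[0] == 'app':
        if len(done) == 1:
            if done[0][0] != 'lam':
                return False                # stuck: the operator is not a function
            t['kids'].append(node(cfg[2]))
        elif len(done) == 2:
            _, x, body = done[0]
            t['kids'].append(node(subst(body, x, done[1])))
        else:
            t['res'] = done[2]              # propagate the continuation's result
        return True
    return False

tree = node(('app', ('lam', 'x', ('var', 'x')), ('num', 7)))
while step(tree):
    pass
print(tree['res'])                          # ('num', 7)

A run that ends with a complete tree models termination; a run in which step keeps succeeding forever models divergence; and a run on which step returns False with the root still incomplete models a stuck computation.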
7 Related work
Acknowledgments. The authors are grateful to the referees: the paper strongly
improved thanks to their useful suggestions and remarks.
6 Available at https://github.com/fdgn/soundness-big-step-semantics.
References
1. Peter Aczel. An introduction to inductive definitions. In Handbook of Mathematical
logic, pages 739–782, Amsterdam, 1977. North Holland.
2. Mads Sig Ager. From natural semantics to abstract machines. In Sandro Etalle,
editor, LOPSTR 2004 - 14th International Symposium on Logic Based Program
Synthesis and Transformation, volume 3573 of Lecture Notes in Computer Science,
pages 245–261, Berlin, 2004. Springer. doi:10.1007/11506676_16.
3. Nada Amin and Tiark Rompf. Type soundness proofs with definitional interpreters.
In Giuseppe Castagna and Andrew D. Gordon, editors, POPL’17 - ACM Symp.
on Principles of Programming Languages, pages 666–679, New York, 2017. ACM
Press. doi:10.1145/3009837.
4. Nada Amin, Tiark Rompf, and Martin Odersky. Foundations of path-dependent
types. In Andrew P. Black and Todd D. Millstein, editors, OOPSLA’14 - ACM
International Conference on Object Oriented Programming Systems Languages and
Applications, pages 233–249, New York, 2014. ACM Press. doi:10.1145/2660193.
2660216.
5. Davide Ancona. Soundness of object-oriented languages with coinductive big-step
semantics. In James Noble, editor, ECOOP’12 - Object-Oriented Programming,
volume 7313 of Lecture Notes in Computer Science, pages 459–483, Berlin, 2012.
Springer. doi:10.1007/978-3-642-31057-7_21.
6. Davide Ancona. How to prove type soundness of Java-like languages without
forgoing big-step semantics. In David J. Pearce, editor, FTfJP’14 - Formal
Techniques for Java-like Programs, pages 1:1–1:6, New York, 2014. ACM Press.
doi:10.1145/2635631.2635846.
7. Davide Ancona, Francesco Dagnino, and Elena Zucca. Generalizing inference sys-
tems by coaxioms. In Hongseok Yang, editor, ESOP 2017 - European Symposium
on Programming, volume 10201 of Lecture Notes in Computer Science, pages 29–
55, Berlin, 2017. Springer. doi:10.1007/978-3-662-54434-1_2.
8. Davide Ancona, Francesco Dagnino, and Elena Zucca. Reasoning on divergent
computations with coaxioms. PACMPL, 1(OOPSLA):81:1–81:26, 2017. doi:10.
1145/3133905.
9. Davide Ancona, Francesco Dagnino, and Elena Zucca. Modeling infinite behaviour
by corules. In Todd D. Millstein, editor, ECOOP’18 - Object-Oriented Program-
ming, volume 109 of LIPIcs, pages 21:1–21:31, Dagstuhl, 2018. Schloss Dagstuhl -
Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.ECOOP.2018.21.
10. Franco Barbanera, Mariangiola Dezani-Ciancaglini, and Ugo de’Liguoro. Inter-
section and union types: Syntax and semantics. Information and Computation,
119(2):202–230, 1995. doi:10.1006/inco.1995.1086.
11. Hendrik Pieter Barendregt, Wil Dekkers, and Richard Statman. Lambda Calculus
with Types. Perspectives in logic. Cambridge University Press, Cambridge, 2013.
12. Lorenzo Bettini, Viviana Bono, Mariangiola Dezani-Ciancaglini, Paola Giannini,
and Betti Venneri. Java & Lambda: a Featherweight story. Logical Methods in
Computer Science, 14(3), 2018. doi:10.23638/LMCS-14(3:17)2018.
13. Martin Bodin, Thomas Jensen, and Alan Schmitt. Certified abstract interpretation
with pretty-big-step semantics. In Xavier Leroy and Alwen Tiu, editors, CPP’15 -
Proceedings of the 2015 Conference on Certified Programs and Proofs, pages 29–40,
New York, 2015. ACM. doi:10.1145/2676724.2693174.
14. James Brotherston. Cyclic proofs for first-order logic with inductive definitions.
In Bernhard Beckert, editor, Automated Reasoning with Analytic Tableaux and
Related Methods, TABLEAUX 2005, volume 3702 of Lecture Notes in Computer
Science. Springer, Berlin, 2005.
Liberate Abstract Garbage Collection from
the Stack by Decomposing the Heap
Kimball Germane1 and Michael D. Adams2
1 Brigham Young University, Provo UT, USA [email protected]
2 University of Michigan, Ann Arbor MI, USA [email protected]
1 Introduction
Among the many enhancements available to improve the precision of control-flow
analysis (CFA), abstract garbage collection and pushdown models of control flow
stand out as particularly effective ones. But their combination is non-trivial.
Abstract garbage collection (GC) [10] is the result of applying standard GC—
which calculates the heap data reachable from a root set derived from a given
environment and continuation—to an abstract semantics. Though it operates in
the same way as concrete GC, abstract GC has a different effect on the semantics
to which it’s applied. Concrete GC is semantically irrelevant in that it has no
effect on a program’s observable behavior.3 Abstract GC, on the other hand,
is semantically relevant in that, by eliminating some merging in the abstract
heap, it prevents a CFA that uses it from conflating some distinct heap data. In the
setting of a higher-order language, where data can represent control, this superior
approximation of data translates to a superior approximation of control as well,
manifest by the CFA exploring fewer infeasible execution paths.
Pushdown models of control flow [16, 3] encode the call–return relation of a
program’s flow of execution as precisely as an unbounded control stack would.
3 It is irrelevant only if space consumption is unobservable, as is typical.
1.1 Examples
Note. In the remainder of the paper, we use the standard term store to refer
to the analysis component which models the heap. Thus, we will describe our
technique as, e.g., treating stores compositionally.
3 Background
In this section, we review abstract garbage collection and the k-CFA context
abstraction. We begin by introducing a small-step concrete semantics which
defines the ground truth of evaluation.
First, we introduce some semantic components that we will use heavily through-
out the rest of the paper.
Machine states come in two variants. An Eval machine state represents a point
in execution in which an expression will be evaluated; it contains registers for
an expression e, its closing environment ρ, the store σ (modelling the heap), the
continuation κ (modelling the stack), and the time t. An Apply machine state
represents a point in execution at which a value is in hand and must be delivered
to the continuation; it contains registers for the value v to deliver, the store σ,
the continuation κ, and the time t.
Figure 1 contains the definitions of two relations over machine states, the
union of which constitutes the small-step relation. The →ev relation transitions
an Eval state to its successor. The Let rule pushes a continuation frame to save
the bound variable, environment, and body expression. The resultant Eval state
is poised to evaluate the bound expression ce. The Call rule first uses the atomic
evaluator aeval to obtain values for the operator and the argument. It then increments
the time, extends the store and environment with the incremented time, and
arranges evaluation of the operator body at the incremented time. The Set! rule
remaps a location in the store designated by a given variable (which is resolved in
the environment) to a value obtained by aeval. It returns the identity function.
Let
  ev(let x = ce in e, ρ, σ, κ, t) →ev ev(ce, ρ, σ, lt(x, ρ, e, κ), t)

Call
  (λx.e, ρ') = aeval(σ, ρ, ae0)    v = aeval(σ, ρ, ae1)    t' = (ae0 ae1) :: t
  σ' = σ[(x, t') → v]    ρ'' = ρ'[x → t']
  ─────────────────────────────────────────────
  ev((ae0 ae1), ρ, σ, κ, t) →ev ev(e, ρ'', σ', κ, t')

Set!
  v = aeval(σ, ρ, ae)    a = (x, ρ(x))    σ' = σ[a → v]
  ─────────────────────────────────────────────
  ev(set! x ae, ρ, σ, κ, t) →ev ap((λx.x, ⊥), σ', κ, t)

Atomic
  v = aeval(σ, ρ, ae)
  ─────────────────────────────────────
  ev(ae, ρ, σ, κ, t) →ev ap(v, σ, κ, t)

Apply
  ρ' = ρ[x → t]    σ' = σ[(x, t) → v]
  ─────────────────────────────────────────────
  ap(v, σ, lt(x, ρ, e, κ), t) →ap ev(e, ρ', σ', κ, t)
The Atomic rule evaluates an atomic expression. The Apply rule applies a
continuation to a value, extending the environment and store and arranging for
the evaluation of the let body.
We inject a program pr into the initial evaluation state ev(pr, ⊥, ⊥, mt, ε),
which arranges evaluation in the empty environment, empty store, halt
continuation, and empty time.
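For concreteness, the following Python sketch (our transcription under assumed data representations, not the authors' artifact) runs the machine of Figure 1: environments map variables to binding times, addresses are variable–time pairs, and times are tuples of call sites, which grow without bound in this concrete semantics.

def aeval(sigma, rho, ae):
    if ae[0] == 'var':                        # x: look up the address (x, rho(x))
        return sigma[(ae[1], rho[ae[1]])]
    if ae[0] == 'lam':                        # lambda: close over the environment
        return ('clo', ae, dict(rho))
    raise ValueError(ae)

def step(state):
    if state[0] == 'ev':
        _, e, rho, sigma, kappa, t = state
        if e[0] == 'let':                     # Let: push an lt frame for the body
            _, x, ce, body = e
            return ('ev', ce, rho, sigma, ('lt', x, rho, body, kappa), t)
        if e[0] == 'app':                     # Call: bind the argument at time t1
            clo = aeval(sigma, rho, e[1])
            v = aeval(sigma, rho, e[2])
            _, (_, x, body), rho0 = clo
            t1 = (e,) + t                     # t' = (ae0 ae1) :: t
            rho1 = dict(rho0); rho1[x] = t1
            sigma1 = dict(sigma); sigma1[(x, t1)] = v
            return ('ev', body, rho1, sigma1, kappa, t1)
        if e[0] == 'set':                     # Set!: remap (x, rho(x)); yield identity
            v = aeval(sigma, rho, e[2])
            sigma1 = dict(sigma); sigma1[(e[1], rho[e[1]])] = v
            return ('ap', ('clo', ('lam', 'x', ('var', 'x')), {}), sigma1, kappa, t)
        return ('ap', aeval(sigma, rho, e), sigma, kappa, t)       # Atomic
    _, v, sigma, kappa, t = state             # Apply: resume the saved let body
    if kappa == 'mt':
        return None                           # halt
    _, x, rho, body, kappa1 = kappa
    rho1 = dict(rho); rho1[x] = t
    sigma1 = dict(sigma); sigma1[(x, t)] = v
    return ('ev', body, rho1, sigma1, kappa1, t)

def run(pr):                                  # inject: ev(pr, bot, bot, mt, eps)
    state, prev = ('ev', pr, {}, {}, 'mt', ()), None
    while state is not None:
        prev, state = state, step(state)
    return prev

pgm = ('let', 'z', ('app', ('lam', 'x', ('var', 'x')), ('lam', 'y', ('var', 'y'))),
       ('var', 'z'))
print(run(pgm)[1][1])                         # ('lam', 'y', ('var', 'y'))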
The root addresses of an environment are simply the variable–time pairs that
define it—that is, the definition of rootρ views its argument ρ extensionally as
a set of addresses. The rootκ metafunction extracts the root addresses from a
continuation. The empty continuation has no root addresses whereas the root
addresses of a non-empty continuation are those of its stored environment (re-
stricted to the free variables of the expression it closes) combined with those of
the continuation it extends.
Next, we define a reachability relation →σ over addresses, parameterized by a
store σ, by

  a0 →σ a1 ⇔ a1 ∈ rootv(σ(a0))
We then define the reachability of a root set A with respect to a store σ as
R(σ, A) = {a1 | a0 →∗σ a1 for some a0 ∈ A}, where →∗σ is the reflexive,
transitive closure of →σ. From here, we obtain the transitions
GC-Eval
  A = rootρ(ρ|e) ∪ rootκ(κ)    σ' = σ|R(σ,A)
  ─────────────────────────────────────────
  ev(e, ρ, σ, κ, t) →GC ev(e, ρ, σ', κ, t)

GC-Apply
  A = rootv(v) ∪ rootκ(κ)    σ' = σ|R(σ,A)
  ─────────────────────────────────────────
  ap(v, σ, κ, t) →GC ap(v, σ', κ, t)
where σ|R(σ,A) is σ restricted to the reachable addresses R(σ, A). We compose
this garbage-collecting transition with each of →ev and →ap . Altogether, the
garbage-collecting semantics are given by →GC ◦[→ev ∪ →ap ].
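As a sketch (ours, reusing the closure and address representations from the machine sketch above), the reachability computation R(σ, A) and the restriction σ|R(σ,A) are a straightforward worklist traversal; it assumes every root address is bound in the store.

def rootv(v):
    # roots of a value: for a closure, the addresses its environment binds
    if v[0] == 'clo':
        return {(x, t) for x, t in v[2].items()}
    return set()

def reachable(sigma, roots):
    """R(sigma, A): the least superset of A closed under a0 ->sigma a1."""
    seen, work = set(), list(roots)
    while work:
        a = work.pop()
        if a in seen:
            continue
        seen.add(a)
        work.extend(rootv(sigma[a]))     # a1 in rootv(sigma(a0))
    return seen

def gc(sigma, roots):
    """sigma restricted to the reachable addresses R(sigma, roots)."""
    live = reachable(sigma, roots)
    return {a: v for a, v in sigma.items() if a in live}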
Now that we have a small-step abstract machine semantics with GC, we are
ready to apply the AAM recipe to obtain a sound, computable CFA with GC.
We apply the AAM recipe in two steps.
First, we refactor the state space so that all inductively-defined components
are redirected through the store. Practically, this refactoring has the effect of
allocating continuations in the store. For our semantics, this refactoring yields
the state space StateSA, defined

  StateSA = EvalSA + ApplySA
  EvalSA  = Exp × Env × StoreSA × ContAddr × Time
  ApplySA = Val × StoreSA × ContAddr × Time

and of stores by

  StoreSA = (Address + ContAddr) → (Val + Cont)
Not reflected in this structure is the typical constraint that an address a will
only ever locate a value and a continuation address α will only ever locate a
continuation.
Second, we finitely partition the unbounded address space of the store and
treat the constituent sets as abstract addresses (via some finite representative).
Practically, this partitioning is achieved by limiting the time t to at most k call
sites where k becomes a parameter of the CFA (leading to the designation k-
CFA). Any addresses which agree on the k-length prefix of their time component
are identified and the finite representative for this set of addresses uses simply
that prefix. Accordingly, we define an abstract time domain T̂ime = Time≤k
and let it reverberate through the state space definitions, obtaining

  Ŝtate = Êval + Âpply
  Êval  = Exp × Ênv × Ŝtore × ĈontAddr × T̂ime
  Âpply = V̂al × Ŝtore × ĈontAddr × T̂ime
(in which we allow the definition of ContAddr to depend, directly or not, on that
of Time).
Finitization of the address space is key to producing a computable CFA.
Practically, however, it means that some values located previously by distinct
addresses will after be located by the same abstract address. When this conflation
occurs, the CFA must behave as if either access was intended; this behavior is
manifested by non-deterministically choosing the value located by a particular
address. Because our language is higher-order, this non-determinism also affects
the control flows the CFA considers. This effect is evident in the Call rule
defined
Call
  (λx.e, ρ̂') ∈ aeval(σ̂, ρ̂, ae0)    v̂ = aeval(σ̂, ρ̂, ae1)    t̂' = ⌊(ae0 ae1) :: t̂⌋k
  σ̂' = σ̂[(x, t̂') → v̂]    ρ̂'' = ρ̂'[x → t̂']
  ─────────────────────────────────────────────
  ev((ae0 ae1), ρ̂, σ̂, α̂, t̂) →ev ev(e, ρ̂'', σ̂', α̂, t̂')
before we abstract it to produce a CFA. (Some CFAs factor the store out of
machine states to be managed globally, part of widening the store. In a sense,
factoring out the continuation is part of widening the continuation.) Without
a continuation component, an Eval PD state is an evaluation configuration and
an Apply PD state is an evaluation result. Except for the presence of the time
component, State PD exhibits precisely the configuration and result shapes one
finds in many stack-precise CFAs [17, 8, 1, 18].
However, factoring the continuation out and ceding control of it to the anal-
ysis presents an obstacle to abstract GC, which needs to extract the root set of
reachable addresses from it. Earl et al. [4] developed a technique whereby the
analysis could introspect the continuation and extract the root set of reachable
addresses from the continuation. Johnson and Van Horn [8] reformulated this
incomplete technique for an operational setting and offered a complete—albeit
theoretically more-expensive—technique capable of more precision. Johnson et
al. [7] unified these techniques within an expanded framework. Darais et al. [1]
then showed that the Abstracting Definitional Interpreters approach—currently
the state of the art—is compatible with the complete technique by including the
set of stack root addresses as a component in the evaluation configuration.
In the concrete semantics, the time component t serves two purposes. The first
purpose is to provide the allocator with a source of freshness, so that when the
allocator must furnish a heap cell for a variable bound previously in execution, it
is able to furnish a distinct one. Were freshness the only constraint on t, the Time
domain could simply consist of N. In anticipation of its role in the downstream
CFA, the time component assumes a second purpose which is to capture some
notion of the context in which execution is occurring. The hope is that the notion
of context it captures is semantically meaningful so that, when an unbounded
set of times are identified by the process of abstraction, each address, which
is qualified by such an abstracted time, locates a semantically-coherent set of
values.
To get a better idea of what notion of context our treatment of time cap-
tures, let’s examine how our concrete semantics treats time, as dictated by k-
CFA. Time begins as the empty sequence ε. It is passed unchanged across all
Eval transitions, save one, and the Apply transition. The exception is the Call
transition, which instead passes the (at-most-)k-length prefix of the application
prepended to the incoming time. Hence, the k-CFA context abstraction is the
k-most-recent calls made in execution history.
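In code (a one-line sketch under the tuple representation used earlier, with call sites abbreviated to strings), the k-CFA treatment of time is just:

def tick(call_site, t, k):
    # the abstract time: the k most recent call sites
    return ((call_site,) + t)[:k]

print(tick('(f y)', tick('(g 42)', (), 2), 2))   # ('(f y)', '(g 42)')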
In Section 6.2, we consider the ramifications of threading the time component
through evaluation and compare it to an alternative treatment.
At this point we can apply systematic abstraction techniques [1, 18] to obtain
corresponding CFAs exhibiting perfect stack precision. Second, they emphasize
the availability of the configuration store at the delivery point of the evalua-
tion result; this availability is crucial to our ability to shift to a compositional
treatment of the store.
To orient ourselves to the big-step setting, we present the reference semantics for
our language in big-step style in Figure 2. This reference semantics is equivalent
to the reference semantics given in small-step style in Section 3.2 except that
there is no corresponding Apply rule; its responsibility—to deliver a value to
a continuation—is handled implicitly by the big-step formulation. In terms of
big-step semantics, this reference semantics is characterized by the threading of
the store through each rule; the resultant store of evaluation is the configuration
store plus the allocation and mutation incurred during evaluation. Hence, we
refer to this semantics as the threaded-store semantics. We use natural numbers
as store subscripts in each rule to emphasize the store’s monotonic increase.
Let
  σ0, ρ, t ⊢ ce ⇓ (v0, σ1)
  ρ' = ρ[x → t]    σ2 = σ1[(x, t) → v0]    σ2, ρ', t ⊢ e ⇓ (v, σ3)
  ─────────────────────────────────────────────
  σ0, ρ, t ⊢ let x = ce in e ⇓ (v, σ3)

Call
  ((λx.e, ρ0), σ1) = aeval(σ0, ρ, ae0)
  (v1, σ2) = aeval(σ1, ρ, ae1)    t' = (ae0 ae1) :: t
  ρ1 = ρ0[x → t']    σ3 = σ2[(x, t') → v1]    σ3, ρ1, t' ⊢ e ⇓ (v, σ4)
  ─────────────────────────────────────────────
  σ0, ρ, t ⊢ (ae0 ae1) ⇓ (v, σ4)

Set!
  (v, σ1) = aeval(σ0, ρ, ae)    σ2 = σ1[(x, ρ(x)) → v]
  ─────────────────────────────────────────────
  σ0, ρ, t ⊢ set! x ae ⇓ ((λx.x, ⊥), σ2)

Atomic
  σ, ρ, t ⊢ ae ⇓ aeval(σ, ρ, ae)
incoming binding context.7 The resultant store is extended with a mapping from
that binding to the resultant value. The body expression is evaluated under
the extended environment and store, and its result becomes that of the overall
expression.
Contrasting the treatment of the environment and the store by the Let rule
is instructive. On the one hand, the environment is treated compositionally: the
incoming environment of evaluation is restored and extended after evaluation of
the bound value. On the other hand, the store is treated non-compositionally:
the store resulting from the evaluation of the bound expression is extended after
it has accumulated the effects of its evaluation.
Under this criterion, we classify the treatment of the binding context as
compositional rather than threaded. This compositional treatment departs from the
typical practice of CFA and is, to our knowledge, the first such treatment in a
stack-precise CFA. In Section 6.2, we examine the ramifications of this treatment.
The Call rule evaluates the atomic expressions ae 0 and ae 1 for the operator
and argument, respectively. It then derives a new binding context, extends the
environment and store with a binding using that context, and evaluates the oper-
ator body under the extended environment, store, and derived binding context.
The result of evaluating the body is that of the overall expression.
The Set! rule evaluates the atomic body expression ae and updates the
binding of the referenced variable in the store. Its result is the identity function
paired with the updated store.
The Atomic rule evaluates an atomic expression ae using the aeval atomic
evaluation metafunction. Foreshadowing the succeeding semantics, we define
aeval to return a pair of its calculated value and the given store. In this seman-
tics, the store is passed through unmodified; in forthcoming semantics, it will be
altered according to the calculated value. Atomic evaluation is unchanged from
the small-step semantics:
The second semantics enhances the reference semantics with an effect log ξ which
explicitly records the allocation and mutation that occurs through evaluation.
The effect log is considered part of the evaluation result; accordingly the effect log
semantics are in terms of judgments of the form σ, ρ, t ⊢ e ⇓! (v, σ'), ξ. Figure 3
presents the effect log semantics, identical to the reference semantics except for
(1) the addition of the effect log and (2) the use of the metavariable a to denote
an address (x,t). (This usage persists in all subsequent semantics as well.)
The effect log is represented by a function from stores to stores. The definition
of each log is given by either a literal identity function, a use of the extendlog
metafunction

  extendlog(a, v, σ') = λσ. σ[a → v] ∪ σ'

or a composition of such logs.
7 Because the program is alphatised, the binding of a let-bound variable in a particular
calling context will not interfere with the binding of any other variable.
Let
  σ0, ρ, t ⊢ ce ⇓! (v0, σ1), ξ0
  ρ' = ρ[x → t]    σ2 = σ1[(x, t) → v0]    σ2, ρ', t ⊢ e ⇓! (v, σ3), ξ1
  ─────────────────────────────────────────────
  σ0, ρ, t ⊢ let x = ce in e ⇓! (v, σ3), ξ1 ∘ extendlog((x, t), v0, σ1) ∘ ξ0

Call
  ((λx.e, ρ0), σ1) = aeval(σ0, ρ, ae0)
  (v1, σ2) = aeval(σ1, ρ, ae1)    t' = (ae0 ae1) :: t
  ρ1 = ρ0[x → t']    σ3 = σ2[(x, t') → v1]    σ3, ρ1, t' ⊢ e ⇓! (v, σ4), ξ
  ─────────────────────────────────────────────
  σ0, ρ, t ⊢ (ae0 ae1) ⇓! (v, σ4), ξ ∘ extendlog((x, t'), v1, σ2)

Set!
  (v, σ1) = aeval(σ0, ρ, ae)    a = (x, ρ(x))    σ2 = σ1[a → v]
  ─────────────────────────────────────────────
  σ0, ρ, t ⊢ set! x ae ⇓! ((λx.x, ⊥), σ2), extendlog(a, v, σ2)

Atomic
  σ, ρ, t ⊢ ae ⇓! aeval(σ, ρ, ae), λσ.σ
Here, the union of the extended store σ[a → v] and the value-associated store
σ' treats each store extensionally as a set of pairs, but the result is always a
function—i.e. any given address is paired with at most one value. The effect
log of the Atomic rule is the identity function, reflecting that no allocation or
mutation is performed when evaluating an atomic expression. The effect log of
the Set! rule is constructed by the metafunction extendlog ; the store argument
to extendlog is the store after the mutation has occurred. The use of this store is
necessary to propagate the mutative effect and ensures that its union with the
store on which this log is replayed agrees on all common bindings. The effect log
of the Call rule is composed of the effect log of evaluation of the body and an
entry for the allocation of the bound variable. Finally, the effect log of the Let
rule is composed of the effect logs of evaluation of both the body and binding
expression interposed by an entry for the allocation of the bound variable.
In this semantics (and the next), the bindings in σ' are redundant: once
extendlog applies the mutative or allocative binding to its argument σ, σ
already contains all the bindings of σ'. Once we introduce GC to the semantics,
however, this will no longer be the case.
The intended role of the effect log is captured by the following lemma, which
states that one may obtain the resultant store by applying the resultant log to
the initial store of evaluation.

Lemma 1. If σ, ρ, t ⊢ e ⇓! (v, σ'), ξ, then ξ(σ) = σ'.
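As an executable illustration (ours, with logs as Python closures and stores as dictionaries), replaying a composed log on the initial store reproduces the resultant store, assuming stores that agree on common bindings:

def extendlog(a, v, sigma_after):
    # record the binding a -> v together with the store in which it occurred
    def replay(sigma0):
        out = dict(sigma0)
        out.update(sigma_after)      # union with the value-associated store
        out[a] = v
        return out
    return replay

def compose(xi2, xi1):               # xi2 composed with xi1
    return lambda sigma: xi2(xi1(sigma))

identity = lambda sigma: sigma       # the log of an atomic evaluation

sigma0 = {}
sigma1 = {('x', ()): 1}
xi = compose(extendlog(('y', ()), 2, sigma1), extendlog(('x', ()), 1, sigma0))
print(xi(sigma0))                    # {('x', ()): 1, ('y', ()): 2}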
The third semantics (seen in Figure 4) shifts the previous semantics from thread-
ing the store to treating it compositionally. Under this treatment, evaluation
results still consist of a value, store, and effect log, but the store is associated
directly to the value—at least conceptually—and not treated as a global effect
repository. This alternative role is particularly apparent in the Let rule: the
store resulting from evaluation of the bound expression is not extended to be
used as the initial store of evaluation of the body. Instead, the effect log resulting
from evaluation of the bound expression is applied to the initial store (of the
overall let expression). We emphasize this compositional treatment by no longer
using numeric subscripts, which suggest “evolution” of the store, and instead
using ticks, which suggest distinct (but related) instances.
Let
  σ, ρ, t ⊢ ce ⇓◦ (v', σ'v), ξ'    σ' = ξ'(σ)
  (ρ', σ'') = extend(ρ, σ', x, t, v', σ'v)
  σ'', ρ', t ⊢ e ⇓◦ (v, σv), ξ
  ─────────────────────────────────────────────
  σ, ρ, t ⊢ let x = ce in e ⇓◦ (v, σv), ξ ∘ extendlog((x, t), v', σ'v) ∘ ξ'

Call
  ((λx.e, ρ0), σ0) = aeval(σ, ρ, ae0)    (v1, σ1) = aeval(σ, ρ, ae1)    t' = (ae0 ae1) :: t
  (ρ', σ') = extend(ρ0, σ0, x, t', v1, σ1)    σ', ρ', t' ⊢ e ⇓◦ (v, σv), ξ
  ─────────────────────────────────────────────
  σ, ρ, t ⊢ (ae0 ae1) ⇓◦ (v, σv), ξ ∘ extendlog((x, t'), v1, σ1)

Set!
  (v, σv) = aeval(σ, ρ, ae)    a = (x, ρ(x))    σ' = σv[a → v]
  ─────────────────────────────────────────────
  σ, ρ, t ⊢ set! x ae ⇓◦ ((λx.x, ⊥), σ'), extendlog(a, v, σ')

Atomic
  σ, ρ, t ⊢ ae ⇓◦ aeval(σ, ρ, ae), λσ.σ
where extend(ρ, σ, x, t, v, σv) = (ρ[x → t], σ[(x, t) → v] ∪ σv): when we extend σ
with a mapping for v, we also copy all of the mappings from σv. This copying
yields a well-formed store, since σ[(x, t) → v] and σv agree on any common
bindings.
Although the role of the store has changed, the same lemma holds in this
semantics as does in the previous. We repeat it in terms of this semantics.
Lemma 2. If σ, ρ, t ⊢ e ⇓◦ (v, σv), ξ, then ξ(σ) = σv.
Like the previous lemma, its proof can be obtained by induction on the
judgment’s derivation.
Let
  (ρce, σce) = restrict(ce, ρ, σ)
  σce, ρce, t ⊢ ce ⇓gc (v', σ'v), ξ'    σ' = ξ'(σ)    (ρ', σ'') = extend(ρ, σ', x, t, v', σ'v)
  (ρe, σe) = restrict(e, ρ', σ'')    σe, ρe, t ⊢ e ⇓gc (v, σv), ξ
  ─────────────────────────────────────────────
  σ, ρ, t ⊢ let x = ce in e ⇓gc (v, σv), ξ ∘ extendlog((x, t), v', σ'v) ∘ ξ'

Call
  ((λx.e, ρ0), σ0) = aevalgc(σ, ρ, ae0)    (v1, σ1) = aevalgc(σ, ρ, ae1)
  t' = (ae0 ae1) :: t    (ρ', σ') = extend(ρ0, σ0, x, t', v1, σ1)
  (ρe, σe) = restrict(e, ρ', σ')    σe, ρe, t' ⊢ e ⇓gc (v, σv), ξ
  ─────────────────────────────────────────────
  σ, ρ, t ⊢ (ae0 ae1) ⇓gc (v, σv), ξ ∘ extendlog((x, t'), v1, σ1)

Set!
  (v, σv) = aevalgc(σ, ρ, ae)    a = (x, ρ(x))    σ' = σv[a → v]
  ─────────────────────────────────────────────
  σ, ρ, t ⊢ set! x ae ⇓gc ((λx.x, ⊥), ⊥), extendlog(a, v, σ')

Atomic
  σ, ρ, t ⊢ ae ⇓gc aevalgc(σ, ρ, ae), λσ.σ
the bindings of its associated store σv . Finally, the extended environment and
store are restricted with respect to the body expression e before e’s evaluation
under them.
The Call rule proceeds by first evaluating the atomic operator and argument
expressions. After calculating the new binding context t , the operator value
environment and store are extended with the new binding. Before evaluation of
the body e commences, the extended environment and store are restricted with
respect to it.
The Set! rule atomically evaluates the expression ae producing the assigned
value. It returns the identity function which, with an empty environment, is
closed by an empty store.
The Atomic rule evaluates an atomic expression with aevalgc .
To connect this semantics to the previous, we show that the addition of GC
has no semantic effect by the following lemma.
In prose, this lemma states that two evaluation configurations, identical ex-
cept that one’s store is the other’s with unreachable bindings pruned, will yield
the same evaluation result: their evaluation will produce the same value and,
modulo unreachable bindings, the same closing store.
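For intuition, restrict can be sketched on top of the reachability code given earlier; fv, the free-variable function, is an assumed helper:

def restrict(e, rho, sigma):
    # restrict the environment to the free variables of e, then prune the
    # store to the addresses reachable from the resulting roots
    rho_e = {x: t for x, t in rho.items() if x in fv(e)}   # fv: assumed helper
    roots = set(rho_e.items())        # root addresses are (variable, time) pairs
    return rho_e, gc(sigma, roots)    # gc from the reachability sketch above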
v̂ ∈ V̂al = P(Lam × Ênv)          ρ̂ ∈ Ênv = Var → T̂ime
t̂ ∈ T̂ime = App≤m                â ∈ Âddress = Var × T̂ime
σ̂ ∈ Ŝtore = Âddress → V̂al      ξ̂ ∈ L̂og = Âddress → V̂al

8 The parameter m is used similarly to the parameter k of k-CFA.
9 The abstraction function is typically accompanied by a complementary
concretization function to complete a Galois connection. For simplicity here, we
leave it incomplete.
aeval(σ̂, ρ̂, x) = (v̂, gc(v̂, σ̂))       where v̂ = σ̂((x, ρ̂(x)))
aeval(σ̂, ρ̂, λx.e) = (v̂, gc(v̂, σ̂))    where v̂ = {(λx.e, ρ̂|λx.e)}
Let
  (ρ̂ce, σ̂ce) = restrict(ce, ρ̂, σ̂)
  σ̂ce, ρ̂ce, t̂ ⊢ ce ⇓̂ (v̂', σ̂'v), ξ̂'    σ̂' = ξ̂'(σ̂)    (ρ̂', σ̂'') = extend(ρ̂, σ̂', x, t̂, v̂', σ̂'v)
  (ρ̂e, σ̂e) = restrict(e, ρ̂', σ̂'')    σ̂e, ρ̂e, t̂ ⊢ e ⇓̂ (v̂, σ̂v), ξ̂
  ─────────────────────────────────────────────
  σ̂, ρ̂, t̂ ⊢ let x = ce in e ⇓̂ (v̂, σ̂v), ξ̂ ∘̂ ξ̂'

Call
  (v̂0, σ̂0) = aeval(σ̂, ρ̂, ae0)    (λx.e, ρ̂0) ∈ v̂0    (v̂1, σ̂1) = aeval(σ̂, ρ̂, ae1)
  t̂' = ⌊(ae0 ae1) :: t̂⌋m    (ρ̂', σ̂') = extend(ρ̂0, σ̂0, x, t̂', v̂1, σ̂1)
  (ρ̂e, σ̂e) = restrict(e, ρ̂', σ̂')    σ̂e, ρ̂e, t̂' ⊢ e ⇓̂ (v̂, σ̂v), ξ̂
  ─────────────────────────────────────────────
  σ̂, ρ̂, t̂ ⊢ (ae0 ae1) ⇓̂ (v̂, σ̂v), ξ̂

Set!
  (v̂, σ̂v) = aeval(σ̂, ρ̂, ae)
  (_, ξ̂) = extend(⊥, ⊥, x, ρ̂(x), v̂, σ̂v)
  ─────────────────────────────────────────────
  σ̂, ρ̂, t̂ ⊢ set! x ae ⇓̂ ({(λx.x, ⊥)}, ⊥), ξ̂

Atomic
  σ̂, ρ̂, t̂ ⊢ ae ⇓̂ aeval(σ̂, ρ̂, ae), ⊥
This theorem states that if the configuration components are related by
abstraction then, for any given derivation in the exact semantics, there is a
derivation in the abstract semantics which yields an abstraction of its results. It can
be proved by induction on the derivation.
6 Discussion
Now we examine the ramifications of a compositional treatment of analysis com-
ponents. We do so in turn, first considering the ramifications of treating the store
compositionally and then of treating the time compositionally.
Consider a [k = 2]CFA of the program

(define (f x) x)
(define (g y) (f y))
(g 42)
(g 35)
In this program, the abstract resource 42 is allocated in the heap twice: first
when the call to g is made and second when the call to f is made. At the point
of the second allocation, the two most-recently-encountered call sites in
evaluation are (f y) and (g 42); hence, these call sites are used to qualify the binding
of 42 to x in the heap. The treatment of the abstract resource 35 is similar,
except that its second allocation is qualified by (f y) and (g 35). For this
program, [k = 2]CFA is able to keep the two allocations distinct.
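Replaying this with the tick sketch from earlier makes the distinction visible:

print(tick('(f y)', tick('(g 42)', (), 2), 2))   # ('(f y)', '(g 42)')
print(tick('(f y)', tick('(g 35)', (), 2), 2))   # ('(f y)', '(g 35)')
# the two bindings of x are qualified by distinct abstract times, so the
# abstract resources 42 and 35 are not merged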
Next, consider a [k = 2]CFA of the similar program
(define (f x) x)
(define (g y)
(displayln y)
(f y))
(g 42)
(g 35)
7 Related Work
References
1. Darais, D., Labich, N., Nguyen, P.C., Van Horn, D.: Abstracting definitional in-
terpreters (functional pearl). Proceedings of the ACM on Programming Languages
1(ICFP), 12:1–12:25 (Aug 2017). https://doi.org/10.1145/3110256
2. Dillig, I., Dillig, T., Aiken, A., Sagiv, M.: Precise and compact modular pro-
cedure summaries for heap manipulating programs. In: Proceedings of the
32nd ACM SIGPLAN Conference on Programming Language Design and Im-
plementation. pp. 567–577. PLDI ’11, ACM, New York, NY, USA (2011).
https://doi.org/10.1145/1993498.1993565
3. Earl, C., Might, M., Van Horn, D.: Pushdown control-flow analysis of higher order
programs. Workshop on Scheme and Functional Programming (2010)
4. Earl, C., Sergey, I., Might, M., Van Horn, D.: Introspective pushdown analysis of
higher-order programs. In: Proceedings of the 17th ACM SIGPLAN International
Conference on Functional Programming. pp. 177–188. ICFP ’12, ACM, New York,
NY, USA (Sep 2012). https://doi.org/10.1145/2364527.2364576
5. Flanagan, C., Sabry, A., Duba, B.F., Felleisen, M.: The essence of compiling with
continuations. In: Proceedings of the ACM SIGPLAN 1993 Conference on Pro-
gramming Language Design and Implementation. pp. 237–247. PLDI ’93, ACM,
New York, NY, USA (1993). https://doi.org/10.1145/155090.155113
6. Gilray, T., Lyde, S., Adams, M.D., Might, M., Van Horn, D.: Pushdown control-
flow analysis for free. In: Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT
Symposium on Principles of Programming Languages. pp. 691–704. POPL ’16,
ACM, New York, NY, USA (Jan 2016). https://doi.org/10.1145/2837614.2837631
7. Johnson, J.I., Sergey, I., Earl, C., Might, M., Van Horn, D.: Pushdown flow analysis
with abstract garbage collection. Journal of Functional Programming 24, 218–283
(May 2014). https://doi.org/10.1017/s0956796814000100
8. Johnson, J.I., Van Horn, D.: Abstracting abstract control. In: Proceedings of the
10th ACM Symposium on Dynamic Languages. pp. 11–22. DLS ’14, ACM, New
York, NY, USA (Oct 2014). https://doi.org/10.1145/2661088.2661098
9. Jones, N.D.: Flow analysis of lambda expressions. In: International Colloquium on
Automata, Languages, and Programming. pp. 114–128. Springer (1981)
10. Might, M., Shivers, O.: Improving flow analyses via Γ CFA: abstract garbage collec-
tion and counting. In: Proceedings of the Eleventh ACM SIGPLAN International
Conference on Functional Programming. pp. 13–25. ICFP ’06, ACM, New York,
NY, USA (Sep 2006). https://doi.org/10.1145/1159803.1159807
11. Might, M., Smaragdakis, Y., Van Horn, D.: Resolving and exploiting the k -CFA
paradox: illuminating functional vs. object-oriented program analysis. In: Proceed-
ings of the 31st ACM SIGPLAN Conference on Programming Language Design and
Implementation. pp. 305–315. PLDI ’10, ACM, New York, NY, USA (Jun 2010).
https://doi.org/10.1145/1806596.1806631
12. Peng, F.: h-CFA: A simplified approach for pushdown control flow analysis. Mas-
ter’s thesis, The University of Wisconsin-Milwaukee (2016)
13. Reynolds, J.C.: Definitional interpreters for higher-order programming languages.
Higher-Order and Symbolic Computation 11(4), 363–397 (1998)
14. Shivers, O.: Control-Flow Analysis of Higher-Order Languages. Ph.D. thesis,
Carnegie Mellon University, Pittsburgh, PA, USA (1991)
15. Van Horn, D., Might, M.: Abstracting abstract machines. In: Proceedings
of the 15th ACM SIGPLAN International Conference on Functional Pro-
gramming. pp. 51–62. ICFP ’10, ACM, New York, NY, USA (Sep 2010).
https://doi.org/10.1145/1863543.1863553
16. Vardoulakis, D., Shivers, O.: CFA2: A context-free approach to control-flow anal-
ysis. In: Gordon, A.D. (ed.) Programming Languages and Systems. pp. 570–589.
Springer Berlin Heidelberg, Berlin, Heidelberg (2010)
17. Vardoulakis, D., Shivers, O.: CFA2: a context-free approach to control-
flow analysis. Logical Methods in Computer Science 7(2) (2011).
https://doi.org/10.2168/LMCS-7(2:3)2011
18. Wei, G., Decker, J., Rompf, T.: Refunctionalization of abstract abstract machines:
bridging the gap between abstract abstract machines and abstract definitional in-
terpreters (functional pearl). Proceedings of the ACM on Programming Languages
2(ICFP), 105:1–105:28 (Jul 2018). https://doi.org/10.1145/3236800
SMT-Friendly Formalization of the Solidity
Memory Model

Ákos Hajdu and Dejan Jovanović
1 Introduction
Ethereum [32] is a public blockchain platform that provides a novel computing
paradigm for developing decentralized applications. Ethereum allows the deploy-
ment of arbitrary programs (termed smart contracts [31]) that operate over the
blockchain state. The public can interact with the contracts via transactions. It
is currently the most popular public blockchain with smart contract functional-
ity. While the nodes participating in the Ethereum network operate a low-level,
stack-based virtual machine (EVM) that executes the compiled smart contracts,
the contracts themselves are mostly written in a high-level, contract-oriented
programming language called Solidity [30].
Even though smart contracts are generally short, they are no less prone
to errors than software in general. In the Ethereum context, any flaws in the
contract code come with potentially devastating financial consequences (such as
the infamous DAO exploit [17]). This has inspired a great interest in applying
formal verification techniques to Ethereum smart contracts (see e.g., [4] or [14] for
surveys). In order to apply formal verification of any kind, be it static analysis or
model checking, the first step is to formalize the semantics of the programming
language that the smart contracts are written in. Such semantics should not
only remain an exercise in formalization, but should preferably be developed,
resulting in precise and automated verification tools.
(The author was also affiliated with SRI International as an intern during this
project. Supported by the ÚNKP-19-3 New National Excellence Program of the
Ministry for Innovation and Technology.)
Early approaches to verification of Ethereum smart contracts focused mostly
on formalizing the low-level virtual machine precisely (see, e.g., [11,19,21,22,2]).
However, the unnecessary details of the EVM execution model make it difficult to
reason about high-level functional properties of contracts (as they were written
by developers) in an effective and automated way. For Solidity-level properties
of smart contracts, Solidity-level semantics are preferred. While some aspects
of Solidity have been studied and formalized [23,10,15,33], the semantics of the
Solidity memory model still lacks a detailed and precise formalization that also
enables automation.
The memory model of Solidity has various unusual and non-trivial behaviors,
providing a fertile ground for potential bugs. Smart contracts have access to two
classes of data storage: a permanent storage that is a part of the global blockchain
state, and a transient local memory used when executing transactions. While the
local memory uses a standard heap of entities with references, the permanent
storage has pure value semantics (although pointers to storage can be declared
locally). This memory model that combines both value and reference semantics,
with all interactions between the two, poses some interesting challenges but
also offers great opportunities for automation. For example, the value semantics
of storage ensures non-aliasing of storage data. This can, if supported by an
appropriate encoding of the semantics, potentially improve both the precision
and effectiveness of reasoning about contract storage.
This paper provides a formalization of the Solidity semantics in terms of a
simple SMT-based intermediate language that covers all features related to man-
aging contract storage and memory. A major contribution of our formalization
is that all but a few of its elements can be encoded in the quantifier-free fragment
of standard SMT theories. Additionally, our formalization captures the value se-
mantics of storage with implicit non-aliasing information of storage entities. This
allows precise and effective verification of Solidity smart contracts using modern
SMT solvers. The formalization is implemented in the open-source solc-verify
tool [20], which is a modular verifier for Solidity based on SMT solvers. We val-
idate the formalization and demonstrate its effectiveness by evaluating it on a
comprehensive set of tests that exercise the memory model. We show that our
formalization significantly improves the precision and soundness compared to
existing Solidity-level verifiers, while remarkably outperforming low-level EVM-
based tools in terms of efficiency.
2 Background
2.1 Ethereum
Ethereum [32,3] is a generic blockchain-based distributed computing platform.
The Ethereum ledger is a storage layer for a database of accounts (identified
by addresses) and the data associated with the accounts. Every account has
an associated balance in Ether (the native cryptocurrency of Ethereum). In
addition, an account can also be associated with the executable bytecode of a
contract and the contract state.
Although Ethereum contracts are deployed to the blockchain in the form
of the bytecode of the Ethereum Virtual Machine (EVM) [32], they are gener-
ally written in a high-level programming language called Solidity [30] and then
compiled to EVM bytecode. After deployment, the contract is publicly acces-
sible and its code cannot be modified. An external user, or another contract,
can interact with a contract through its API by invoking its public functions.
This can be done by issuing a transaction that encodes the function to be called
with its arguments, and contains the contract’s address as the recipient. The
Ethereum network then executes the transaction by running the contract code
in the context of the contract instance.
A contract instance has access to two different kinds of memory during its
lifetime: contract storage and memory.3 Contract storage is a dedicated data
store for a contract to store its persistent state. At the level of the EVM, it is
an array of 256-bit storage slots stored on the blockchain. Contract data that
fits into a slot, or can be sliced into fixed number of slots, is usually allocated
starting from slot 0. More complex data types that do not fit into a fixed number
of slots, such as mappings, or dynamic arrays, are not supported directly by the
EVM. Instead, they are implemented by the Solidity compiler using storage as a
hash table where the structured data is distributed in a deterministic collision-
free manner. Contract memory is used during the execution of a transaction on
the contract, and is deleted after the transaction finishes. This is where function
parameters, return values and temporary data can be allocated and stored.
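To illustrate the hash-table layout, here is a Python sketch of the slot computation that the Solidity compiler documents for mappings (our illustration, not part of this paper's formalization); keccak256 is an assumed helper, for instance from a Keccak library, and differs from hashlib's sha3_256:

def mapping_entry_slot(key, slot, keccak256):
    # the value for key k of a mapping at slot p is stored at
    # keccak256(pad32(k) ++ pad32(p)), for value-type keys
    data = key.to_bytes(32, 'big') + slot.to_bytes(32, 'big')
    return int.from_bytes(keccak256(data), 'big')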
2.2 Solidity
contract DataStorage {
    struct Record {
        bool set;
        int[] data;
    }
    mapping(address => Record) records;
}
Types. Solidity is statically typed and provides two classes of types: value types
and reference types. Value types include elementary types such as addresses,
integers, and Booleans that are always passed by value. Reference types, on the
other hand, are passed by reference and include structs, arrays and mappings.
Example 2. The contract in Figure 1 uses the following types. The records
variable is a mapping from addresses to Record structures which, in turn, consist
of a Boolean value and a dynamically-sized integer array. It is a common practice
to define a struct with a Boolean member (set) to indicate that a mapping value
has been set. This is because Solidity mappings do not store keys: any key can
be queried, returning a default value if no value was associated previously.
Data locations for reference types. Data of reference types resides in a data
location that is either storage or memory. Storage is the persistent store used
for state variables of the contract. In contrast, memory is used during execution
of a transaction to store function parameters, return values and local variables,
and it is deleted after the transaction finishes.
contract C {
    struct T {
        int z;
    }
    struct S {
        int x;
        T[] ta;
    }
    T t;
    S s;
    S[] sa;
}

function f(S memory sm1) public {
    T memory tm = sm1.ta[1];
    S memory sm2 = S(0, sm1.ta);
}
Fig. 3: An example illustrating reference types (structs and arrays) and their
layout in storage and memory: (a) a contract defining types and state variables; (b)
an abstract representation of the contract storage as values; and, (c) a function
using the memory data location and a possible layout of the data in memory.
Example 3. Consider the contract C defined in Figure 3a. The contract defines
two reference struct types S and T, and declares state variables s, t, and sa.
These variables are maintained in storage during the contract lifetime and they
are represented as values with no references within. A potential value of these
variables is shown in Figure 3b. On the other hand, the top of Figure 3c shows a
function with three variables in the memory data location, one as the argument
to the function, and two defined within the function. Because they are in memory,
these variables are references to heap locations. Any data of reference type stored within the structures and arrays is also a reference, and can be reallocated or assigned to point to an existing heap location. This means that the data layout can form arbitrary graphs with arbitrary aliasing. A potential layout
of these variables is shown at the bottom of Figure 3c.
Functions. The functions of a contract can read and modify the contract state and interact with other Ethereum accounts. Besides accessing the storage of the contract through its state variables, functions can also define local variables, including function arguments and return values. Variables of value types are stored as values on a stack. Variables of reference types must be explicitly declared with a data location and are always pointers to an entity in that data location (storage or memory). A pointer to storage is called a local storage pointer. As the storage is not memory in the usual sense, but a value instead, one can view storage pointers as encoding a path to one reference-type entity in the storage.
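The following hedged Solidity sketch (our own, with hypothetical names) declares a local storage pointer and reassigns it; the pointer denotes a path to one of two state arrays rather than a memory address:

pragma solidity ^0.5.0;

contract LocalStoragePointer {
    int[] a;
    int[] b;

    function pickAndWrite(bool useB) public {
        int[] storage p = a;  // local storage pointer: a path into storage
        if (useB) {
            p = b;            // pointers can be (conditionally) reassigned
        }
        p.push(42);           // writes through the pointer into a or b
    }
}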
3 Formalization
3.1 Types
We use T (.) to denote the function that maps a Solidity type to an SMT type.
This function is used in the translation of contract elements and can, as a side
effect, introduce datatype definitions and variable declarations. This is denoted
with [decl ] in the result of the function. To simplify the presentation, we assume
that such side effects are automatically added to the preamble of the SMT pro-
gram. Furthermore, we assume that declarations with the same name are only
added once. We use type(expr) to denote the original (Solidity) type of an ex-
pression (to be used later in the formalization). The definition of T (.) is shown
in Figure 5.
T(bool) ≐ bool
T(address) ≐ T(int) ≐ T(uint) ≐ int
T(mapping(K=>V) storage) ≐ [T(K)]T(V)
T(mapping(K=>V) storptr) ≐ [int]int
T(T[n] storage) ≐ T(T[] storage)
T(T[n] storptr) ≐ T(T[] storptr)
T(T[n] memory) ≐ T(T[] memory)
T(T[] storage) ≐ StorArr_T with [StorArr_T(arr : [int]T(T), length : int)]
T(T[] storptr) ≐ [int]int
T(T[] memory) ≐ int with [MemArr_T(arr : [int]T(T), length : int)]
                         [arrheap_T : [int]MemArr_T]
T(struct S storage) ≐ StorStruct_S with [StorStruct_S(..., m_i : T(S_i), ...)]
T(struct S storptr) ≐ [int]int
T(struct S memory) ≐ int with [MemStruct_S(..., m_i : T(S_i), ...)]
                             [structheap_S : [int]MemStruct_S]

Fig. 5: Definition of the type mapping T(.).
Value types. Booleans are mapped to SMT Booleans, while other value types are mapped to SMT integers. Addresses are also mapped to SMT integers, so that arithmetic comparisons and conversions between integers and addresses are supported. For simplicity, we map all integers (signed or unsigned) to SMT integers.⁶ Solidity also allows function types to store, pass around, and call functions, but this is not yet supported by our encoding.
Reference types. The Solidity syntax does not always require the data location for variable and parameter declarations. However, for reference types it is always required (enforced by the compiler), except for state variables, which are implicitly storage. In our formalization, we assume that the data location of reference types is part of the type. As discussed before, memory entities are always accessed through pointers. However, for storage we distinguish whether an expression is the storage entity itself (e.g., a state variable) or a storage pointer (e.g., a local variable or function parameter). We denote the former with storage and the latter with storptr in the type name. Our modeling of reference types relies on the generalized theory of arrays [16] and the theory of inductive datatypes [8], both of which are supported by modern SMT solvers (e.g., cvc4 [6] and z3 [28]).
Mappings and arrays. For both arrays and mappings, we abstract away the
implementation details of Solidity and model them with the SMT theory of
arrays and inductive datatypes. We formalize Solidity mappings simply as SMT
arrays. Both fixed- and dynamically-sized arrays are translated using the same
SMT type and we only treat them differently in the context of statements and
expressions. Strings and byte arrays are not discussed here, but we support them as particular instances of the array type. To ensure that array sizes are properly modeled, we keep track of the size in the datatype (length) along with the actual elements (arr).
For storage array types with base type T, we introduce an SMT datatype StorArr_T with a constructor that takes two arguments: an inner SMT array (arr) associating integer indexes with the recursively translated base type (T(T)), and an integer length. The advantage of this encoding is that the value semantics of storage data is provided by construction: each array element is a separate entity (no aliasing), and assigning storage arrays in SMT makes a deep copy. This encoding also generalizes to the case where the base type is a reference type.
For memory array types with base type T, we introduce a separate datatype MemArr_T (as a side effect). However, memory arrays are stored as pointer values. Therefore, the memory array type is mapped to integers, and a heap (arrheap_T) is introduced to associate integers (pointers) with the actual memory array datatypes. Note that mixing data locations within a reference type is not possible: the element type of an array has the same data location as the array itself. It is therefore enough to introduce two datatypes per element type T: one for storage and one for memory. In the former case the element type has value semantics, whereas in the latter case elements are stored as pointers.
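The behavioural difference that this encoding must capture can be observed in plain Solidity; here is a hedged sketch of our own (hypothetical names):

pragma solidity ^0.5.0;

contract CopySemantics {
    int[] a;
    int[] b;

    function storageCopy() public {
        a.push(1);
        b = a;                     // storage-to-storage: deep copy
        b[0] = 2;                  // modifies b only
        assert(a[0] == 1);         // a is unaffected: no aliasing
    }

    function memoryAlias() public pure {
        int[] memory m1 = new int[](1);
        int[] memory m2 = m1;      // memory-to-memory: pointer assignment
        m2[0] = 2;
        assert(m1[0] == 2);        // the write is visible through the alias
    }
}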
Structs. For each storage struct type S, the translation introduces an inductive datatype StorStruct_S whose constructor takes an argument for each struct member, with types mapped recursively. Similarly to arrays, this ensures the value semantics of storage, such as non-aliasing and deep-copy assignments. For each memory struct S we also introduce a datatype MemStruct_S with a constructor argument for each member.⁷ However, the memory struct type itself is mapped to integers (pointers), and a heap (structheap_S) is introduced to associate the pointers with the actual memory struct datatypes. Note that if a memory struct has members of reference types, they are also pointers, which is ensured recursively by our encoding.

⁶ Note that this does not capture the precise machine integer semantics, but this is not relevant from the perspective of the memory model. Precise computation can be provided by relying on SMT bitvectors or modular arithmetic (see, e.g., [20]).
An interesting aspect of the storage data location is that, although the stored
data has value semantics, it is still possible to define pointers to an entity in
storage within a local context, e.g., with function parameters or local variables.
These pointers are called local storage pointers.
Since our formalization uses SMT datatypes to encode the contract data in stor-
age, it is not possible to encode these pointers directly. A partial solution would
be to substitute each occurrence of the local pointer with the expression that is
assigned to it when it was defined. However, this approach is too simplistic and
has limitations. Local storage pointers can be reassigned, or assigned condition-
ally, or it might not be known at compile time which definition should be used.
Furthermore, local storage pointers can also be passed in as function arguments:
they can point to different storage entities for different calls.
We propose an approach to encode local storage pointers while overcoming
these limitations. Our encoding relies on the fact that storage data of a contract
can be viewed as a finite-depth tree of values. As such, each element of the stored
data can be uniquely identified by a finite path leading to it.⁸
Example 7. Consider the contract C in Figure 6a. The contract defines structs T and S, and declares state variables of these types. If we are interested in all storage entities of type T, we can consider the sub-tree of the contract storage tree that has leaves of type T, as depicted in Figure 6b. The root of the tree is the contract itself, with indexed sub-nodes for the state variables, in order. For nodes of struct type there are indexed sub-nodes leading to the members, in order. For each node of array type there is a sub-node for the base type.
⁷ Mappings in Solidity cannot reside in memory. If a struct defines a mapping member and it is stored in memory, the mapping is simply inaccessible. Such members could be omitted from the constructor.

⁸ Solidity does support a limited form of recursive datatypes. Such types could make the storage a tree of potentially arbitrary depth. We chose not to support such types, as recursion is practically non-existent in the Solidity types used in practice.
contract C {
    struct T {
        int z;
    }
    struct S {
        int x;
        T t;
        T[] ts;
    }
    T t1;
    S s1;
    S[] ss;
}

unpack(ptr) =
    ite(ptr[0] = 0,
        t1,
        ite(ptr[0] = 1,
            ite(ptr[1] = 0,
                s1.t,
                s1.ts[ptr[2]]),
            ite(ptr[2] = 0,
                ss[ptr[1]].t,
                ss[ptr[1]].ts[ptr[3]])))

Fig. 6: An example of packing and unpacking: (a) a contract with struct definitions and state variables (the listing above); (b) the storage tree of the contract for type T; and (c) the unpacking expression for storage pointers of type T. (The tree diagram of part (b) is not reproduced here.)
Every pointer to a storage T entity can be identified by a path in this tree: by fixing the index of each state variable, member, and array index, as seen in brackets in Figure 6b, such paths can be encoded as arrays of integers. For example, the state variable t1 can be represented as [0], the member s1.t as [1, 0], and ss[8].ts[5] as [2, 8, 1, 5].
This idea allows us to encode storage pointer types (pointing to arrays, structs
or mappings) simply as SMT arrays ([int]int). The novelty of our approach is
that storage pointers can be encoded and passed around, while maintaining the
value semantics of storage data, without the need for quantifiers to describe
non-aliasing. To encode storage pointers, we need to address initialization and
dereference of storage pointers, while assignment is simply an assignment of
array values. When a storage pointer is initialized to a concrete expression, we
pack the indexed path to the storage entity (that the expression references) into
an array value. When a storage pointer is dereferenced (e.g., by indexing into or
accessing a member), the array is unpacked into a conditional expression that
will evaluate to a storage entity by decoding paths in the tree.
Storage tree. The storage tree for a given type T can be obtained easily by filtering the AST nodes of the contract definition to include only state variable declarations and, further, only nodes that lead to a sub-node of type T. We denote the storage tree for type T as tree(T).⁹
Packing. To pack an expression, we first compute its list of base subexpressions: for an identifier the list contains just the identifier itself, while for an index access e[idx] or a member access e.m_i it is, recursively, the base expressions of e. We call the first element of this list (denoted by car) the base expression (the innermost base expression). The base expression is always either a state variable or a storage pointer, and we consider these two cases separately.
If the base expression is a state variable, we simply align the expression along
the storage tree with the packpath function. The packpath function takes the
list of base sub-expressions, and the storage tree to use for alignment, and then
processes the expressions in order. If the current expression is an identifier (state
variable or member access), the algorithm finds the outgoing edge annotated with
the identifier (from the current node) and writes the index into the result array.
If the expression is an index access, the algorithm maps the index expression (symbolically) and writes it into the array. The expression mapping function E(.) is
introduced later in Section 3.6.
If the base expression is a storage pointer, the process is more general, since the “start” of the packing must accommodate any point in storage that the base expression may point to. In this case the algorithm finds all paths to leaves in the tree of the base pointer, identifies the condition for taking each path, and writes the labels on the path to an array. Then it uses packpath to continue writing the array with the rest of the expression (denoted by cdr), as before. Finally, a conditional expression is constructed from all the conditions and packed arrays. Note that the type of this conditional is still an SMT array of integers, as is the case for a single path.
def unpack(ptr):
    return unpack(ptr, tree(type(ptr)), empty, 0)

def unpack(ptr, node, expr, d):
    result := empty
    if node has no outgoing edges then
        result := expr
    if node is the contract then
        foreach edge node --id(i)--> child do
            result := ite(ptr[d] = i, unpack(ptr, child, id, d + 1), result)
    if node is a struct then
        foreach edge node --id(i)--> child do
            result := ite(ptr[d] = i, unpack(ptr, child, expr.id, d + 1), result)
    if node is an array/mapping with edge node --(i)--> child then
        result := unpack(ptr, child, expr[ptr[d]], d + 1)
    return result
For arrays and mappings, the algorithm follows the single outgoing edge by wrapping the subexpression into an index access using the current element (at index d) of the pointer.
Note that with inheritance and libraries [30] it is possible that a contract defines a type T but has no nodes in its storage tree. The contract can still define functions with storage pointers to T, which can be called by derived contracts that define state variables of type T. In such cases we declare an array of type [int]T(T), called the default context, and unpack storage pointers to T as if the default context were a state variable. This allows us to reason about abstract contracts and libraries, modeling that their storage pointers can point to arbitrary entities not yet declared.
The focus of our discussion is the Solidity memory model and, for presentation
purposes, we assume a minimalist setting where the important aspects of storage
and memory can be presented: we assume a single contract and a single function
to translate. Interactions between multiple functions are handled differently de-
pending on the verification approach. For example, in modular verification func-
tions are checked individually against specifications (pre- and post-conditions)
and function calls are replaced by their specification [20].
¹⁰ Note that, due to the “else” branches, unpack is a non-injective surjective function. For example, [a, 8, 1, 5] with any a ≥ 2 would evaluate to the same slot. However, this does not affect our encoding, as pointers cannot be compared and pack always returns the same (unique) values.

¹¹ Generalizing this to multiple contracts can be done directly by using a separate one-dimensional heap for each state variable, indexed by a receiver parameter (this : address) identifying the current contract instance (see, e.g., [20]).
defval(bool) ≐ false
defval(address) ≐ defval(int) ≐ defval(uint) ≐ 0
defval(mapping(K=>V)) ≐ constarr_{[T(K)]T(V)}(defval(V))
defval(T[] storage) ≐ defval(T[0] storage)
defval(T[] memory) ≐ defval(T[0] memory)
defval(T[n] storage) ≐ StorArr_T(constarr_{[int]T(T)}(defval(T)), n)
defval(T[n] memory) ≐ [ref : int] (fresh symbol)
                      {ref := refcnt := refcnt + 1}
                      {arrheap_T[ref].length := n}
                      {arrheap_T[ref].arr[i] := defval(T)} for 0 ≤ i < n
                      ref
defval(struct S storage) ≐ StorStruct_S(..., defval(S_i), ...)
defval(struct S memory) ≐ [ref : int] (fresh symbol)
                          {ref := refcnt := refcnt + 1}
                          {structheap_S[ref].m_i := defval(S_i)} for each m_i
                          ref

Fig. 9: Definition of default values.
Function calls. From the perspective of the memory model, the only important aspects of function calls are the way parameters are passed in and the way return values are treated. Our formalization is general in that it allows us to treat both of the above as plain assignments (explained later in Section 3.5). For each parameter pi and return value ri of a function, we add declarations pi : T(type(pi)) and ri : T(type(ri)) to the SMT program. Note that for reference types appearing as parameters or return values of a function, their types are either memory or storage pointers.
Default values. We use defval(.), defined in Figure 9, to denote the function that maps a Solidity type to its default
value as an SMT expression. Note that, as a side effect, this function can do
allocations for memory entities, introducing extra declarations and statements,
denoted by [decl ] and {stmt}. As expected, the default value is false for Booleans
and 0 for other primitives that map to integers. For mappings from K to V , the
default value is an SMT constant array returning the default value of the value
type V for each key k ∈ K (see, e.g., [16]). The default value of storage arrays
is the corresponding datatype value constructed with a constant array of the
default value for base type T , and a length of n or 0 for fixed- or dynamically-
sized arrays. For storage structs, the default value is the corresponding datatype
value constructed with the default values of each member.
The default value of uninitialized memory pointers is unusual. Since Solidity does not support “null” pointers, a new entity is automatically allocated in memory and initialized to default values (which might include additional recursive initialization). Note that for fixed-size arrays, Solidity enforces that the array size n must be an integer literal or a compile-time constant, so setting each element to its default value is possible without loops or quantifiers. Similarly for structs, each member is recursively initialized, which is again possible by explicitly enumerating each member.
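The automatic allocation just described can be observed at the Solidity level; a hedged sketch of our own (hypothetical names):

pragma solidity ^0.5.0;

contract DefaultValues {
    struct P {
        int x;
        int[3] xs;
    }

    function f() public pure {
        P memory p;            // no "null": a fresh P is allocated in memory
        assert(p.x == 0);      // members are recursively default-initialized
        assert(p.xs[2] == 0);  // fixed-size array elements default to 0
    }
}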
3.4 Statements

We use S(.) to denote the function that translates Solidity statements to a list of statements in the SMT program. It relies on the type mapping function T(.) (presented previously in Section 3.1) and on the expression mapping function E(.) (to be introduced in Section 3.6). Furthermore, we define a helper function A(., .) dedicated to modeling Solidity assignments (to be discussed in Section 3.5). The definition of S(.) is shown in Figure 10. As a side effect, extra declarations can be introduced in the preamble of the SMT program (denoted by [decl]).
The Solidity documentation [30] does not precisely state the order of evaluating subexpressions in statements; it only specifies that subnodes are processed before the parent node. This problem is independent from the discussion of the memory model, so we assume that the side effects of subexpressions are added in the same order as implemented in the compiler. Furthermore, if a subexpression is mapped multiple times, we assume that its side effects are only added once. This makes our presentation simpler by introducing fewer temporary variables.
Local variable declarations introduce a variable declaration with the same identifier in the SMT program by mapping the type.¹² If an initialization expression is given, it is mapped using E(.) and assigned to the variable. Otherwise, the default value is used, as defined by defval(.) in Figure 9. Delete assigns the default value of the type, which is simply mapped to an assignment in our formalization. Solidity also supports multiple assignments in one statement with a tuple-like syntax.

¹² Without loss of generality, we assume that identifiers in Solidity are unique. The compiler handles scoping and assigns a unique identifier to each declaration.
Figures 11 and 12 (fragments): two example contracts, one declaring struct S { int x; } with state variables S s1, s2, s3, and one declaring struct S { int x; } with a state variable S[] a; their function bodies were not recovered here.
The documentation [30] does not specify the behavior of tuple assignments precisely, but the compiler first evaluates the RHS and LHS tuples (in this order) from left to right, and then the assignment is performed component-wise from right to left.
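A hedged sketch of this evaluation order (our own example):

pragma solidity ^0.5.0;

contract TupleAssign {
    int a;
    int b;

    function swap() public {
        (a, b) = (b, a);  // the RHS tuple is evaluated first, then assigned
                          // component-wise: a and b are swapped, no temporary
    }
}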
Array push increases the length and assigns the given expression as the last element. Array pop decreases the length and sets the removed element to its default value. While the removed element can no longer be accessed by indexing into the array (a runtime error occurs), it can still be accessed via local storage pointers (see Figure 12).¹³
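This corner case can be reproduced along the following hedged lines (our own sketch, with hypothetical names):

pragma solidity ^0.5.0;

contract PopAndPointer {
    struct S { int x; }
    S[] a;

    function f() public {
        a.push(S(1));
        S storage p = a[0];  // local storage pointer to the element
        a.pop();             // removes the element and resets its slot
        // indexing a[0] would now cause a runtime error, but the local
        // storage pointer still reaches the (reset) slot:
        assert(p.x == 0);
    }
}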
3.5 Assignments
Assignments between reference types in Solidity can be either pointer assign-
ments or value assignments, involving deep copying and possible new allocations
in the latter case. We use A(lhs, rhs) to denote the function that assigns a rhs
SMT expression to a lhs SMT expression based on their original types and data
locations. The definition of A(., .) is shown in Figure 13. Value type assignments are simply mapped to an SMT assignment. To make the presentation clearer, we subdivide the other cases into separate functions for array, struct, and mapping operands, denoted by A_A(., .), A_S(., .), and A_M(., .), respectively.
Structs and arrays. For structs and arrays, the semantics of assignment is summarized in Figure 14. There are, however, some notable details in the various cases that we expand on below.

Assigning anything to a storage LHS always causes a deep copy. If the RHS is storage, this is simply mapped to a datatype assignment in our encoding (with an additional unpacking if the RHS is a storage pointer).¹⁵ If the RHS is memory, a deep copy for structs can be done member-wise by accessing the heap with the RHS pointer and performing the assignment recursively (as members can be of reference types themselves). For arrays, we access the datatype corresponding to the array via the heap and do an assignment, which performs a deep copy in SMT. Note, however, that this only works if the base type of the array is a value type. For reference types, memory array elements are pointers and would have to be dereferenced during the assignment to storage. As opposed to struct members, the number of array elements is not known at compile time, so loops or quantifiers have to be used (as in traditional software analysis).
¹³ The current version (0.5.x) of Solidity supports resizing arrays by assigning to the length member. However, this behavior is dangerous and has been removed in the next version (0.6.0) (see https://solidity.readthedocs.io/en/v0.6.0/060-breaking-changes.html). Therefore, we do not support it in our encoding.

¹⁴ This is a consequence of the fact that keys are not stored in mappings, and so the assignment is impossible to perform.

¹⁵ This also causes mappings to be copied, which contradicts the current semantics. However, we chose to keep the deep copy, as assignment of mappings is planned to be disallowed in the future (see https://github.com/ethereum/solidity/issues/7739).
A(lhs, rhs) ≐ lhs := rhs            for value type operands
A(lhs, rhs) ≐ A_M(lhs, rhs)         for mapping type operands
A(lhs, rhs) ≐ A_S(lhs, rhs)         for struct type operands
A(lhs, rhs) ≐ A_A(lhs, rhs)         for array type operands

A_M(lhs : sp, rhs : s)  ≐ lhs := pack(rhs)
A_M(lhs : sp, rhs : sp) ≐ lhs := rhs
A_M(lhs, rhs)           ≐ {}        (all other cases)

A_S(lhs : s, rhs : s)   ≐ lhs := rhs
A_S(lhs : s, rhs : m)   ≐ A(lhs.m_i, structheap_{type(rhs)}[rhs].m_i)  for each m_i
A_S(lhs : s, rhs : sp)  ≐ A_S(lhs, unpack(rhs))
A_S(lhs : m, rhs : m)   ≐ lhs := rhs
A_S(lhs : m, rhs : s)   ≐ lhs := refcnt := refcnt + 1
                          A(structheap_{type(lhs)}[lhs].m_i, rhs.m_i)  for each m_i
A_S(lhs : m, rhs : sp)  ≐ A_S(lhs, unpack(rhs))
A_S(lhs : sp, rhs : s)  ≐ lhs := pack(rhs)
A_S(lhs : sp, rhs : sp) ≐ lhs := rhs

A_A(lhs : s, rhs : s)   ≐ lhs := rhs
A_A(lhs : s, rhs : m)   ≐ lhs := arrheap_{type(rhs)}[rhs]
A_A(lhs : s, rhs : sp)  ≐ A_A(lhs, unpack(rhs))
A_A(lhs : m, rhs : m)   ≐ lhs := rhs
A_A(lhs : m, rhs : s)   ≐ lhs := refcnt := refcnt + 1
                          arrheap_{type(lhs)}[lhs] := rhs
A_A(lhs : m, rhs : sp)  ≐ A_A(lhs, unpack(rhs))
A_A(lhs : sp, rhs : s)  ≐ lhs := pack(rhs)
A_A(lhs : sp, rhs : sp) ≐ lhs := rhs

Fig. 13: Formalization of assignment based on different type categories and data locations of the LHS and RHS. We use s, sp, and m after the arguments to denote storage, storage pointer, and memory types, respectively.
However, this is a special case, which can be encoded in the decidable array property fragment [13].
Assigning storage (or a storage pointer) to memory is also a deep copy, but in the other direction. However, instead of overwriting the existing memory entity, a new one is allocated (recursively for elements or members of reference types). We model this by incrementing the reference counter, storing it in the LHS, and then accessing the heap for the deep copy using the new pointer.
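The direction-dependence of assignment is again observable at the Solidity level; a hedged sketch of our own (hypothetical names):

pragma solidity ^0.5.0;

contract CopyDirections {
    int[] s;

    function f() public {
        s.push(1);
        int[] memory m = s;  // storage-to-memory: allocates a fresh copy
        m[0] = 2;            // modifies only the memory copy
        assert(s[0] == 1);   // storage is unchanged
        s = m;               // memory-to-storage: deep copy back to storage
        assert(s[0] == 2);
    }
}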
3.6 Expressions
We use E(.) to denote the function that translates a Solidity expression to an
SMT expression. As a side effect, declarations and statements might be intro-
duced (denoted by [decl ] and {stmt} respectively). The definition of E(.) is shown
in Figure 15. As discussed in Section 3.4 we assume that side effects are added
from subexpressions in the proper order and only once.
Member access is mapped to an SMT member access by mapping the base expression and the member name.
Fig. 14: Semantics of assignment between array and struct operands based on
their data location.
E(id)        ≐ id
E(expr.id)   ≐ E(expr).E(id)                   if type(expr) = struct S storage
E(expr.id)   ≐ unpack(E(expr)).E(id)           if type(expr) = struct S storptr
E(expr.id)   ≐ structheap_S[E(expr)].E(id)     if type(expr) = struct S memory
E(expr.id)   ≐ E(expr).E(id)                   if type(expr) = T[] storage
E(expr.id)   ≐ unpack(E(expr)).E(id)           if type(expr) = T[] storptr
E(expr.id)   ≐ arrheap_T[E(expr)].E(id)        if type(expr) = T[] memory
E(expr[idx]) ≐ E(expr).arr[E(idx)]             if type(expr) = T[] storage
E(expr[idx]) ≐ unpack(E(expr)).arr[E(idx)]     if type(expr) = T[] storptr
E(expr[idx]) ≐ arrheap_T[E(expr)].arr[E(idx)]  if type(expr) = T[] memory
E(expr[idx]) ≐ E(expr)[E(idx)]                 if type(expr) = mapping(K=>V) storage
E(expr[idx]) ≐ unpack(E(expr))[E(idx)]         if type(expr) = mapping(K=>V) storptr

Fig. 15: Definition of the expression mapping E(.) (excerpt).
There is an extra unpacking step for storage pointers, and a heap access for memory. Note that the only valid member access for arrays is length. Index access is mapped to an SMT array read by mapping the base expression and the index, and adding an extra member access for arrays to get the inner array arr of elements from the datatype. Furthermore, similarly to member accesses, an extra unpacking step is needed for storage pointers, and a heap access for memory.
4 Evaluation

The formalization described in this paper serves as the basis of our Solidity verification tool solc-verify [20].¹⁶ In this section we evaluate the presented formalization and our implementation by validating it on a set of relevant test cases. For illustrative purposes, we also compare our tool with other available Solidity analysis tools.¹⁷
“Real-world” contracts currently deployed on Ethereum (e.g., contracts available on Etherscan) have limited value for evaluating memory-model semantics. Many such contracts use old compiler versions with constructs that are no longer supported, and do not use newer features. There are also many toy and trivial contracts that are deployed but not used, while popular contracts (e.g., tokens) are over-represented with many duplicates. Furthermore, the inconsistent usage of assert and require [20] makes evaluation hard. Evaluating the memory semantics requires contracts that exercise diverse features of the memory model. There are larger dApps that do use more complex features (e.g., Augur or ENS), but these contracts also depend on many other features (e.g., inheritance, modifiers, loops) that would skew the results.
Therefore we have manually developed a set of tests that try to capture
the interesting behaviors and corner cases of the Solidity memory semantics.
The tests are targeted examples that do not use irrelevant features. The set
is structured so that every target test behavior is represented with a test case
that sets up the state, exercises a specific feature and checks the correctness
of the behavior with assertions. This way a test should only pass if the tool
provides a correct verification result by modeling the targeted feature precisely.
¹⁶ solc-verify is open source, available at https://github.com/SRI-CSL/solidity. Besides certain low-level constructs (such as inline assembly), solc-verify supports a majority of the Solidity features that we omitted from the presentation, including inheritance, function modifiers, for/while loops, and if-then-else.

¹⁷ All tests, with a Truffle test harness, a docker container with all the tools, and all individual results are available at https://github.com/dddejan/solidity-semantics-tests.
A few features are only implemented partially (such as deep copy of arrays with reference types, and recursive initialization of memory objects). There are no technical difficulties in supporting them, and they are planned for the future.
5 Related Work
Hirai [22] defines the EVM for some interactive theorem provers. Amani et al. [2] extend this work by defining a program logic to reason about EVM bytecode.
6 Conclusion
We presented a high-level SMT-based formalization of the Solidity memory
model semantics. Our formalization covers all aspects of the language related to
managing both the persistent contract storage and the transient local memory.
The novel encoding of storage pointers as arrays allows us to precisely model non-
aliasing and deep copy assignments between storage entities without the need
for quantifiers. The memory model forms the basis of our Solidity-level modular
verification tool solc-verify. We developed a suite of test cases exercising all
aspects of memory management with different combinations of reference types.
Results indicate that our memory model outperforms existing Solidity-level tools
in terms of soundness and precision, and is on par with low-level EVM-based
implementations, while having a significantly lower computational cost for dis-
charging verification conditions.
References

1. Alt, L., Reitwiessner, C.: SMT-based verification of Solidity smart contracts. In: ISoLA 2018, LNCS, vol. 11247, pp. 376–388. Springer (2018). https://doi.org/10.1007/978-3-030-03427-6_28
2. Amani, S., Bégel, M., Bortin, M., Staples, M.: Towards verifying Ethereum smart contract bytecode in Isabelle/HOL. In: Proceedings of the 7th ACM SIGPLAN International Conference on Certified Programs and Proofs, pp. 66–77. ACM (2018)
3. Antonopoulos, A., Wood, G.: Mastering Ethereum: Building Smart Contracts and Dapps. O'Reilly Media, Inc. (2018)
4. Atzei, N., Bartoletti, M., Cimoli, T.: A survey of attacks on Ethereum smart contracts. In: POST 2017, LNCS, vol. 10204, pp. 164–186. Springer (2017). https://doi.org/10.1007/978-3-662-54455-6_8
5. Barnett, M., Chang, B.Y.E., DeLine, R., Jacobs, B., Leino, K.R.M.: Boogie: A modular reusable verifier for object-oriented programs. In: FMCO 2005, LNCS, vol. 4111, pp. 364–387. Springer (2006). https://doi.org/10.1007/11804192_17
6. Barrett, C., Conway, C.L., Deters, M., Hadarean, L., Jovanović, D., King, T., Reynolds, A., Tinelli, C.: CVC4. In: CAV 2011, LNCS, vol. 6806, pp. 171–177. Springer (2011). https://doi.org/10.1007/978-3-642-22110-1_14
7. Barrett, C., Fontaine, P., Tinelli, C.: The Satisfiability Modulo Theories Library (SMT-LIB) (2016), www.SMT-LIB.org
8. Barrett, C., Shikanian, I., Tinelli, C.: An abstract decision procedure for satisfiability in the theory of recursive data types. Journal on Satisfiability, Boolean Modeling and Computation 3, 21–46 (2007)
9. Barrett, C., Tinelli, C.: Satisfiability modulo theories. In: Handbook of Model Checking, pp. 305–343. Springer (2018)
10. Bartoletti, M., Galletta, L., Murgia, M.: A minimal core calculus for Solidity contracts. In: DPM 2019, CBT 2019, LNCS, vol. 11737, pp. 233–243. Springer (2019). https://doi.org/10.1007/978-3-030-31500-9_15
11. Bhargavan, K., Delignat-Lavaud, A., Fournet, C., Gollamudi, A., Gonthier, G., Kobeissi, N., Kulatova, N., Rastogi, A., Sibut-Pinote, T., Swamy, N., Zanella-Béguelin, S.: Formal verification of smart contracts: Short paper. In: ACM Workshop on Programming Languages and Analysis for Security, pp. 91–96. ACM (2016)
12. Biere, A., Heule, M., van Maaren, H.: Handbook of Satisfiability. IOS Press (2009)
13. Bradley, A.R., Manna, Z., Sipma, H.B.: What's decidable about arrays? In: VMCAI 2006, LNCS, vol. 3855, pp. 427–442. Springer (2006). https://doi.org/10.1007/11609773_28
14. Chen, H., Pendleton, M., Njilla, L., Xu, S.: A survey on Ethereum systems security: Vulnerabilities, attacks and defenses (2019), https://arxiv.org/abs/1908.04507
15. Crafa, S., Pirro, M.D., Zucca, E.: Is Solidity solid enough? In: Financial Cryptography Workshops (2019)
16. De Moura, L., Bjørner, N.: Generalized, efficient array decision procedures. In: Formal Methods in Computer-Aided Design, pp. 45–52. IEEE (2009)
17. Dhillon, V., Metcalf, D., Hooper, M.: The DAO hacked. In: Blockchain Enabled Applications, pp. 67–78. Apress (2017)
18. Filliâtre, J.C., Paskevich, A.: Why3 — where programs meet provers. In: ESOP 2013, LNCS, vol. 7792, pp. 125–128. Springer (2013). https://doi.org/10.1007/978-3-642-37036-6_8
19. Grishchenko, I., Maffei, M., Schneidewind, C.: A semantic framework for the security analysis of Ethereum smart contracts. In: POST 2018, LNCS, vol. 10804, pp. 243–269. Springer (2018). https://doi.org/10.1007/978-3-319-89722-6_10
20. Hajdu, Á., Jovanović, D.: solc-verify: A modular verifier for Solidity smart contracts. In: VSTTE 2019, LNCS, vol. 12031. Springer (2019), in press
21. Hildenbrandt, E., Saxena, M., Zhu, X., Rodrigues, N., Daian, P., Guth, D., Rosu, G.: KEVM: A complete semantics of the Ethereum virtual machine. Tech. rep., IDEALS (2017)
22. Hirai, Y.: Defining the Ethereum virtual machine for interactive theorem provers. In: FC 2017, LNCS, vol. 10323, pp. 520–535. Springer (2017). https://doi.org/10.1007/978-3-319-70278-0_33
23. Jiao, J., Kan, S., Lin, S., Sanán, D., Liu, Y., Sun, J.: Executable operational semantics of Solidity (2018), http://arxiv.org/abs/1804.01295
24. Lahiri, S.K., Chen, S., Wang, Y., Dillig, I.: Formal specification and verification of smart contracts for Azure blockchain. In: VSTTE 2019, LNCS, vol. 12031. Springer (2019), in press
25. Leino, K.R.M.: Ecstatic: An object-oriented programming language with an axiomatic semantics. In: Proceedings of the Fourth International Workshop on Foundations of Object-Oriented Languages (1997)
26. Leino, K.R.M.: Dafny: An automatic program verifier for functional correctness. In: LPAR 2010, LNCS, vol. 6355, pp. 348–370. Springer (2010). https://doi.org/10.1007/978-3-642-17511-4_20
27. McCarthy, J.: Towards a mathematical science of computation. In: IFIP Congress, pp. 21–28 (1962)
28. de Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: TACAS 2008, LNCS, vol. 4963, pp. 337–340. Springer (2008). https://doi.org/10.1007/978-3-540-78800-3_24
29. Mueller, B.: Smashing Ethereum smart contracts for fun and real profit. In: Proceedings of the 9th Annual HITB Security Conference (HITBSecConf) (2018)
30. Solidity documentation (2019), https://solidity.readthedocs.io/
31. Szabo, N.: Smart contracts (1994)
32. Wood, G.: Ethereum: A secure decentralised generalised transaction ledger (2017), https://ethereum.github.io/yellowpaper/paper.pdf
33. Zakrzewski, J.: Towards verification of Ethereum smart contracts: A formalization of core of Solidity. In: VSTTE 2018, LNCS, vol. 11294, pp. 229–247. Springer (2018). https://doi.org/10.1007/978-3-030-03592-1_13
Open Access This chapter is licensed under the terms of the Creative Commons
Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/),
which permits use, sharing, adaptation, distribution and reproduction in any medium
or format, as long as you give appropriate credit to the original author(s) and the
source, provide a link to the Creative Commons license and indicate if changes were
made.
The images or other third party material in this chapter are included in the chapter’s
Creative Commons license, unless indicated otherwise in a credit line to the material. If
material is not included in the chapter’s Creative Commons license and your intended
use is not permitted by statutory regulation or exceeds the permitted use, you will need
to obtain permission directly from the copyright holder.
Exploring Type-Level Bisimilarity towards
More Expressive Multiparty Session Types
1 Introduction
Background. To take advantage of modern parallel and distributed comput-
ing platforms, message-passing concurrency is becoming increasingly important.
Modern programming languages, however, offer insufficiently effective linguistic
support to guide programmers towards safe usage of message-passing abstrac-
tions (e.g., to prevent deadlocks or protocol violations).
Multiparty session types (MPST) [34] constitute a static, correct-by-construction approach to simplify concurrent programming, by offering a type-based framework to specify message-passing protocols and ensure deadlock-freedom and protocol conformance. The idea is to use behavioural types [1,37] to enforce protocols (i.e., patterns of admissible communications) between roles (e.g., threads, processes, services) to avoid concurrency bugs. The framework is illustrated in Fig. 1: first, a global type G (protocol specification; written by the programmer) is projected onto every role; then, every resulting endpoint type (local type) Li (role specification) is type-checked against its process implementation Pi.

Fig. 1: MPST framework — project global type G onto each role, yielding local types L1, L2, ..., Ln; type-check each process Pi against its local type Li.
© The Author(s) 2020
P. Müller (Ed.): ESOP 2020, LNCS 12075, pp. 251–279, 2020.
https://doi.org/10.1007/978-3-030-44914-8_10
Several improvements to the original work have been proposed: Honda et al.
managed to allow each role r not involved in a choice to have different behaviour
in different branches [15], so long as r is made aware of which branch is chosen in a
timely and unambiguous fashion (e.g., the previous global type is still forbidden),
while Lange et al., Castagna et al., and Hu & Yoshida managed to allow choices
between different receivers [16,23,36,40]. For instance, the following global type
(the Client directly requests the specialised server) is allowed:
μX. ( c→s1:Add · s1→c:Sum · X  +  c→s2:Mul · s2→c:Prod · X )
But, the following global type (two Clients c1 and c2 use Server S) is forbidden:
μX. ( c1→s:Add · s→c1:Sum · X  +  c1→s:Mul · s→c1:Prod · X  +
      c2→s:Add · s→c2:Sum · X  +  c2→s:Mul · s→c2:Prod · X )
None of the existing works allow the above nondeterministic choices between
different senders. We call this the +-problem: how to add a choice constructor,
denoted by +, to specify choices between disjoint sender-receiver-label triples?
2. No existential quantification: Related to the +-problem is the ∃-
problem: how to add an existential role quantifier, denoted by ∃, to specify
the execution of ∃’s body for some role in ∃’s domain? For instance, instead
of writing a separate global type for 2 Clients, 3 Clients, etc., existential role
quantification allows us to write only one global type for any n>1 Clients.
The ∃-problem was first formulated by Deniélou & Yoshida [22] as the dual of the
∀-problem (i.e., specify the execution of ∀’s body for each role in ∀’s domain):
the ∀-problem was solved in the same paper, but the ∃-problem “raises many
semantic issues” [22] and has remained open for almost a decade.
3. Limited parallel composition: The third open problem related to choice is the ∥-problem: how to add a parallel constructor, denoted by ∥, that allows infinite branching (i.e., non-finite control) through unbounded parallel interleaving? While extensions of the original work with parallel composition exist (e.g., [16,22,23,43]), none of these works supports unbounded interleaving. For instance, one can write a global type that allows an unbounded number of requests to be served by the Server in parallel (instead of sequentializing them).
– For the first time, we provide solutions to the +-problem, the ∃-problem, and the ∥-problem, by presenting expressive syntax for global and local types (formulated as process algebraic terms), a refined notion of projection, and novel well-formedness conditions.
– Our main theoretical result is operational equivalence: a well-formed global type behaves the same as the parallel composition of its projections, modulo weak bisimulation. This implies deadlock-freedom and freedom from protocol violations for the projections. Checking this equivalence is decidable.
To our knowledge, we are the first to use (weak) bisimilarity to prove the
correctness of a projection operator from global to local types. By doing so,
Fig. 2 (fragment): message sequence charts between Client and Server; the recovered labels include 8: Value("x", 5), 9: Barrier, and Lock.
we decouple (a) the act of reasoning about projection and (b) the act of
establishing compliance between local types and process implementations;
until our work, these two concerns have always been conflated.
– Our main practical results are: (1) we provide representative protocols that are typable in our approach; and (2) the well-formedness conditions of (1) can be checked orders of magnitude faster than directly checking weak bisimilarity using mCRL2 [10,20,29], a state-of-the-art model checker.
In Sect. 2, we present an overview of our contribution through a representative
example protocol that is not supported by previous work. In Sect. 3, we present
the details of our theoretical contribution. In Sect. 4, we present the details of our
practical contribution (implementation and evaluation). In Sect. 5, we discuss
related work. We conclude and discuss future work in Sect. 6.
Detailed formal definitions and proofs of all lemmas and theorems can be
found in our supplement [38].
The Key-Value Store protocol involves n Clients, c1 through cn, and a Server that manages the store, represented by role name s. The store has keys of type Str (strings) and values of type Nat (numbers). Fig. 2 shows valid and invalid example executions of the protocol (n=2) as message sequence charts; it works as follows.
First, a Lock-message is communicated from some Client ci (1≤i≤n) to Server
s (Fig. 2a, arrows 1, 5); this grants ci exclusive access to the store. Then, a
sequence of messages to write and/or read values is communicated:
– To write, a Set-message is communicated from ci to s (arrows 2, 3, 11).
– To read, a Get-message is communicated from ci to s (arrows 6, 7). Then,
eventually, a Value-message is communicated from s to ci (arrows 8, 10), but
in the meantime, additional Get-messages can be communicated from ci to
s. In this way, the Client does not need to await the responses of the Server
to perform multiple independent requests. To indicate enough Get-messages
have been sent, a Barrier-message is communicated from ci to s (arrow 9),
which serves as a communication fence: the protocol will only proceed once
all Value-messages for pending Get-messages have been communicated.
The sequence ends with the communication of an Unlock-message from ci to s
(arrow 12). The protocol is then repeated for some Client cj (1≤j≤n); possibly,
but not necessarily, i=j. In this way, the Server atomically processes accesses to
the store between Lock/Unlock-messages.
Global and local types. The corresponding global type G and local types LC1, ..., LCn, LS are inferred via projection (for some n). Local type r1→r2!(t) specifies the send of a (t)-message through the channel from r1 to r2; dually, local type r1→r2?(t) specifies a receive. Because every Client participates in only one branch of the quantification, their local types do not contain ∃ under the recursion. In contrast, because the Server participates in all branches, LS does contain ∃ under the recursion.
By Thm. 3, G and the parallel composition of LC1, ..., LCn, LS are operationally equivalent (weakly bisimilar), which in turn implies deadlock-freedom and absence of protocol violations. Note also that our global type for the Key-Value Store protocol indeed relies on solutions to the +-problem (choice between multiple clients that send a Lock-message), the ∃-problem (existential quantification over clients), and the ∥-problem (unbounded interleaving to support asynchronous responses to a statically unknown number of requests).
We define our languages of global and local types as algebras over sets of (global)
communications and (local) sends/receives. This subsection presents preliminar-
ies on the generic algebraic framework we use, based on the existing algebras
PA [3] and TCP+REC [2]; the next subsection presents our specific instantia-
tions for global and local types.
Let A denote a set of actions, ranged over by α, and let {X1, X2, ..., Y, ...} denote a set of recursion variables. Then, let Term(A) denote the set of (algebraic) terms, ranged over by T, generated by the following grammar:

    T ::= 1 | α | T · T | T + T | T ∥ T | X|E
Fig. 3: Operational semantics of terms.
(a) Reduction:
    α --α--> 1;
    if T1 --α--> T1', then T1 · T2 --α--> T1' · T2;
    if T1↓ and T2 --α--> T2', then T1 · T2 --α--> T2';
    if T1 --α--> T1', then T1 + T2 --α--> T1';
    if T2 --α--> T2', then T1 + T2 --α--> T2';
    if T1 --α--> T1', then T1 ∥ T2 --α--> T1' ∥ T2;
    if T2 --α--> T2', then T1 ∥ T2 --α--> T1 ∥ T2';
    if sub(E, E(X)) --α--> T, then X|E --α--> T.
(b) Termination:
    1↓;
    if T1↓, then T1 + T2↓;  if T2↓, then T1 + T2↓;
    if T1↓ and T2↓, then T1 · T2↓;  if T1↓ and T2↓, then T1 ∥ T2↓;
    if sub(E, E(X))↓, then X|E↓.
Existential role quantification: ∃r∈{ri}i∈I . M expands to G whenever the sum Σ{M[ri/r]}i∈I expands to G.

Fig. 4: Macros (excerpt)
action ε^r_{r1 r2} specifies the idling of role r during a communication between roles r1 and r2. The inclusion of such annotated idling actions in local types is novel; we shortly elaborate on their purpose. We can now define Glob = Term(Ag) and Loc = Term(Al) as the sets of all global and local types, ranged over by G and L.
M1 = a ⇝ b · 1            G1 = a→ab · ab→b
M2 = a ⇝ b · a ⇝ b · 1    G2 = (a→ab · ab→b) ∥ (a→ab · ab→b)
M3 = a ⇝ b · b ⇝ a · 1    G3 = a→ab · ab→b · b→ba · ba→a
M4 = a ⇝ b · a → b        G4 = a→ab · ab→b · a→b
(For brevity, we omit 1 from the resulting global types; this can be incorporated
in the macro expansion rules, at the expense of a more complex formulation.)
Global type G1 specifies an asynchronous communication from Alice to Bob.
Global type G2 specifies two asynchronous communications from Alice to Bob;
Alice can do the second send already before Bob has done the first receive.
Global type G3 specifies an asynchronous communication from Alice to Bob,
followed by one from Bob to Alice; in contrast to G2 , Bob can send only after
he has received (i.e., this encoding of asynchrony preserves causality of messages
sent and received by the same role). Global type G4 specifies an asynchronous
communication from Alice to Bob, followed by a synchronous communication
from Bob to Alice; it highlights that, unlike existing languages of global types,
ours supports mixing synchrony and asynchrony in a single global type.
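As a small worked instance of the rules in Fig. 3 (our own, in the notation above, with message types omitted as in the example), G1 = a→ab · ab→b reduces as

    a→ab · ab→b  --a→ab-->  1 · ab→b  --ab→b-->  1,  and 1↓,

where the second step uses the sequencing rule whose premise requires that the first operand (here 1) has terminated.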
Example 2 (Finite recursion). The Key-Value Store protocol in Sect. 2 does not
terminate: in its global type, the inner recursions (Y and Z) can be exited, but
the outer recursion (X) cannot. A version of this protocol that terminates once
each of the Clients has indicated it has finished using the store (e.g., by sending
an Exit-message) can also be specified.
We illustrate the key idea in a simplified example:
Fig. 5: Operational semantics of groups. Termination: if L(r)↓ for every r ∈ dom L, then L↓. Communication: if L(r1) --r1→r2!U--> L1 and L(r2) --r1→r2?U--> L2, then L --r1→r2:U--> L[r1 → L1][r2 → L2]. Idling: if L(r) --ε^r_{r1 r2}--> L', then L --ε^r_{r1 r2}--> L[r → L'].

G ↾ r = G                              if G ∈ {1} ∪ X
(G1 ∗ G2) ↾ r = (G1 ↾ r) ∗ (G2 ↾ r)    if ∗ ∈ {+, ·, ∥}
(r1→r2:U) ↾ r = r1→r2!U                if r1 = r ≠ r2
(r1→r2:U) ↾ r = r1→r2?U                if r1 ≠ r = r2
(r1→r2:U) ↾ r = ε^r_{r1 r2}            if r1 ≠ r ≠ r2
(X|E) ↾ r = X | (E ↾ r)                where E ↾ r = {X → E(X) ↾ r | X ∈ dom E}
G ↾ R = {r → G ↾ r | r ∈ R}            if r(G) ⊆ R ≠ ∅

Fig. 6: Projection
either Alice or Bob to Carol, or an Exit-message. In the latter case, Carol stops
communicating with a role, while she proceeds communicating with the other
role. Thus, the communications between Alice and Carol, and between Bob and
Carol, are decoupled (i.e., decisions to continue or break recursions are made per
role). Macro μ generalizes this pattern to arbitrary recursion bodies.
Groups. Finally, let R ⇀ Loc denote the set of all groups of local types (i.e., every group is a partial function from role names to local types), ranged over by L. The idea is that while a global type specifies a protocol among n roles from one global perspective, a group of local types specifies the protocol from the n local perspectives. Fig. 5 defines the operational semantics of groups, built on top of the operational semantics of local types; we use the f[x → y] notation to update function f with entry x → y. In words, group L is reduced either by synchronously reducing the local types of a sender r1 and a receiver r2 (yielding a communication from r1 to r2), or by reducing the local type of an idling role.
Fig. 7: Weak termination and weak reduction. (a) Termination: if T↓, then T⇓; if T --τ--> T' and T'⇓, then T⇓. (b) Reduction: if T --α--> T', then T ==α==> T'; if T --τ--> T' and T' ==α==> T'', then T ==α==> T''; if T ==α==> T' and T' --τ--> T'', then T ==α==> T''; if T ==σ==> T', then T ==τ==> T'.
The last equation in Fig. 6 defines groups of projections, where the side condition implies that the group is nonempty and contains a local type for at least every role name that occurs in G. Thus, a group of projections of G is a partial function relative to the set of all roles R, but it is total relative to the set of roles r(G) ⊆ R that occur in G. (We note that we also continue to assume global types are 1-free, closed, and deterministic.)
Our projection operator is similar to existing projection operators in the
MPST literature [34], but it also differs on a fundamental account: it produces
local types with annotated idling actions. These idling actions will be instrumen-
tal in the definition of our well-formedness conditions. We note that no idling
actions occur in the local types for the Key-Value Store protocol in Sect. 2. This
is because after the idling actions have been used to establish well-formedness,
they are of no more use and can be eliminated to simplify the local types.
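As a minimal worked instance of Fig. 6 (our own, in the notation above): projecting the single communication a→b:U onto the roles a, b, and c gives

    (a→b:U) ↾ a = a→b!U,   (a→b:U) ↾ b = a→b?U,   (a→b:U) ↾ c = ε^c_{ab},

and (a→b:U) ↾ {a, b, c} maps each role to its projection; the annotated idling action ε^c_{ab} records that c idles while a and b communicate.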
The following lemmas state key properties about termination and reduction
behaviour of global types and their projections: Lem. 1 states projection is sound
and complete for termination; Lem. 2 states the same for reduction.
Lemma 1. G↓ implies (G ↾ r)↓, and (G ↾ r)↓ implies G↓.

Proof. By induction on G.
Lemma 2. G --g--> G' implies (G ↾ r) --(g ↾ r)--> (G' ↾ r); and (G ↾ r) --(g ↾ r)--> L implies G --g--> G' and L = G' ↾ r, for some G'.

Proof. Both conjuncts are proven by induction on the structure of G, also using Lem. 1 (needed because termination plays a role in the reduction of ·).
The idling actions introduced in local types by our projection operator are inter-
nal, because they never compose into communications that emerge between local
types in groups. Therefore, the operational equivalence relation under which we
prove the correctness of projection should be insensitive to idling actions.
First, let Aτ = {ε^r_{r1 r2} | r1 ≠ r2 and r1 ≠ r ≠ r2} denote the set of all internal actions, ranged over by τ, σ. Second, Fig. 7 defines an extension of our operational semantics (Fig. 3) with relations that assert weak termination and weak reduction (i.e., versions of termination and reduction that are insensitive to internal actions). Third, Fig. 8 defines weak bisimilarity (≈) in terms of weak similarity, which is in turn defined in terms of weak termination and weak reduction; it coincides with the definition found in the literature (e.g., [2]), with the administrative
exception that we need the fourth rule in Fig. 7b to account for the fact that we have multiple different internal actions. We use a double horizontal line in the formulation of rules to indicate that they should be applied coinductively.
The notion of weak reduction allows us to generalize the soundness and completeness of projection from roles (Lem. 2) to groups of roles: Lem. 3 states (1) if G can g-reduce to G' and the projection of G is defined, then the group of projections of G can reduce to the group of projections of G', either directly or with a trailing weak τ-reduction; (2) conversely, if the group of projections of G can g-reduce to L', then G can g-reduce to some G', and either L' equals the group of projections of G', or it can get there with a weak τ-reduction.
Lemma 3. G --g--> G' and G ↾ R is defined implies, for some L',
    (G ↾ R) --g--> (G' ↾ R), or (G ↾ R) --g--> L' ==τ==> (G' ↾ R);
and (G ↾ R) --g--> L' implies, for some G',
    G --g--> G' and (L' = G' ↾ R or L' ==τ==> (G' ↾ R)).

Proof. Both conjuncts are proven by induction on R, also using Lem. 2.
Example 3 (Bad protocols). The following global types (message types omitted) specify “bad” protocols that do not permit “good” concurrent implementations:

    G1 = a→b + a→c        G2 = a→b · c→d
Global type G1 specifies a choice between a communication from Alice to Bob and a communication from Alice to Carol. This is a bad protocol, because Carol can at any time choose to perform idling action ε^c_{ab} (i.e., local type G1 ↾ c has two reductions, neither of which has priority), thereby assuming that Alice has chosen Bob. However, Bob can symmetrically assume that Alice has chosen Carol. As a result, the group projection can reduce as follows: G1 ↾ {a, b, c} --ε^c_{ab}--> L1 --ε^b_{ac}--> L2. Now, L2 cannot reduce further, but Alice has not terminated yet. This sequence of reductions cannot be (weakly) simulated by G1.
Global type G2 specifies a communication from Alice to Bob, followed by a communication from Carol to Dave. This is a bad protocol, because there is no way for Carol and Dave to know when the communication from Alice to Bob has occurred. Formally, this is manifested in the fact that Carol's and Dave's local types can at any time choose to perform idling actions, thereby assuming that the communication from Alice to Bob has occurred. As a result, the group projection can reduce as follows: G2 ↾ {a, b, c, d} --ε^c_{ab}--> L1 --ε^d_{ab}--> L2 --c→d--> L3 --a→b--> L4. This sequence cannot be (weakly) simulated by G2.
C For every r ∈ R, for every choice that local type G ↾ r has between a weak reduction ==l==> (where l is a send, a receive, or an idling action) and a completely unobservable weak reduction ==τ==>, choosing to perform the former does not disable the latter, and vice versa. This can be thought of as a form of commutativity between l and τ.
EC For every r ∈ R, one of the following is true:
1. For every weak reduction ==l==> that local type G ↾ r can perform (where l is a send or a receive, but not an idling action), it can perform a reduction --l-->. That is, if G ↾ r can perform l in the future, after idling actions, it can do l already, eagerly, in the present.
2. Local type G ↾ r is the start of a causal chain: a sequence of τ-reductions, followed by a non-τ-reduction, that are “causally related” to each other. An ε^r_{r1 r2}-reduction is causally related to an ε^r_{r3 r4}-reduction iff {r1, r2} ∩ {r3, r4} ≠ ∅. Globally speaking, this means communication between r3 and r4 must be preceded by communication between r1 and r2.
These conditions must hold coinductively for all local types that G ↾ r can reduce to. Essentially, these conditions state that by performing idling actions, a local type can neither decrease its possible behaviour (C) nor increase it (EC-1), unless it is guaranteed that the added behaviour cannot be exercised yet, because it is causally related to other communications that need to happen first (EC-2).
Fig. 9: Well-formedness conditions; Λ, Λ', Λ'', Λ1, Λ1', Λ2, Λ2' ∈ Loc ∪ (R ⇀ Loc). (The formal rules are not reproduced here.)
Fig. 9 defines C and EC formally. We define C not only for local types, but also
for groups of local types, as this simplifies some notation later on. We prove key
properties of C: Thm. 1 states commutativity of local sends/receives/idling (l) in
local types gets lifted to commutativity of global communications/idling (α) in
groups of local types; Lem. 4 states weak bisimilarity preserves commutativity.
Theorem 1. C^l_τ(L(r)) for all r ∈ dom L, for all l and τ, implies C^α_τ(L) for all α and τ; and C(L(r)) for all r ∈ dom L implies C(L).

Proof. The first conjunct is proven by induction on the rules of ==⇒; the second is proven by coinduction on the rule of C, also using the first conjunct.
Lemma 4. C^{α1}_{α2}(L1) and L1 ≈ L2 implies C^{α1}_{α2}(L2); and C(L1) and L1 ≈ L2 implies C(L2).

Proof. The first conjunct is proven by applying the definitions of C and ≈; the second is proven by coinduction on the rule of C, also using the first conjunct.
We also prove key properties of Chain and EC, both of which work specifically
for groups of projections: Lem. 5 states if the projections of r1 and r2 are both
causal chains, they cannot weakly reduce to local types where they can perform
reciprocal actions (r1 the send; r2 the receive); Thm. 2 states eagerness of lo-
cal sends/receives (not idling) in projections gets lifted to eagerness of global
communications in groups of projections (cf. Thm. 1).
Lemma 5. $\mathit{Chain}((G \upharpoonright R)(r_1))$ and $(G \upharpoonright R)(r_1) \xRightarrow{\tau_1} L'(r_1) \xrightarrow{r_1 r_2\,!\,U} L''(r_1)$, and $\mathit{Chain}((G \upharpoonright R)(r_2))$ and $(G \upharpoonright R)(r_2) \xRightarrow{\tau_2} L'(r_2) \xrightarrow{r_1 r_2\,?\,U} L''(r_2)$, implies false.
Proof (of Thm. 2). The first conjunct is proven by using Lem. 5; the second is proven by coinduction on the rule of EC, also using the first conjunct.
We note that, in contrast to Lem. 4 for C, we do not have a lemma that states
weak bisimilarity preserves EC. Such a lemma would have been highly useful in
our subsequent proofs, but it is unfortunately false, because weak bisimilarity
does not preserve Chain. A simple counterexample, for local types, is this: $L_1 = r_1 r_2\,!\,U$ and $L_2 = \varepsilon^{r_3}_{r_4 r_5} \cdot r_1 r_2\,!\,U$, where $\{r_1, r_2\} \cap \{r_3, r_4, r_5\} = \emptyset$. While $L_1$ and
L2 are weakly bisimilar, L1 is the start of a unary causal chain, but L2 is not.
The problem here is that Chain depends on the role names associated with idling
actions, whereas weak bisimilarity abstracts those role names away.
We call a global type well-formed if each of its projections satisfies C and EC.
In words, L1 ⇒ L2 means L1 has a silent reduction (only τs) to a term that is weakly bisimilar to L2, or L1 is already weakly bisimilar to L2 (without any reductions). Essentially, if C(G↾R) and EC(G↾R), then the relation of Fig. 8 relates G to a set of groups S = {L | G is related to L} that can roughly be characterised as follows:
– (base) G↾R is in S;
– (successors) any group to which G↾R can silently reduce, is in S;
– (predecessors) any group that can silently reduce to G↾R, is in S.
… implies $G \xrightarrow{g} G'$ and $(G' \upharpoonright R) \Longrightarrow L' \Longleftarrow L$ for some $L'$.
Lemma 7. If G is related to L (Fig. 8), then G↓ implies L⇓, and L↓ implies G⇓.
Proof. The first conjunct is proven by induction on the rules of ⇒, also using Lem. 1; the second is proven by contradiction (assume not G↓; derive false; conclude G↓; it implies G⇓).
Lemma 8. Suppose G is related to L (Fig. 8). Then:
– $G \xrightarrow{g} G'$ implies $L \xRightarrow{g} L'$ for some L′ related to G′;
– $L \xrightarrow{g} L'$ implies $G \xrightarrow{g} G'$ for some G′ related to L′;
– $L \xrightarrow{\tau} L'$ implies G is related to L′.
Proof. The first and second conjuncts are proven by induction on the rules of ⇒, also using Lemmas 3–4; the third is proven by induction on the rules of ⇒.
Proof (of Thm. 3). By coinduction on the rule of the relation in Fig. 8, also using Lemmas 7–8.
The rationale behind this proposition (the decidability of checking our well-formedness conditions) is as follows. First, to check C(L) and EC(L), by Thm. 1 and Thm. 2, it suffices to check C(L(r)) and EC(L(r)) for
each r ∈ dom L. For each such local type L(r), there are two possibilities.
If local type L(r) has finite control, its state space can be exhaustively ex-
plored in finite time, so checking C(L(r)) and EC(L(r)) is obviously decidable.
In contrast, if L(r) has non-finite control, we make two observations. The first observation is that the only possible source of infinity is the occurrence of recursion variables under parallel composition. The second observation is that C and EC are true for L1 ∥ L2 if they are true for L1 and L2 separately; this is because C and EC essentially assert a "diamond structure" on the reductions of L1 ∥ L2, which is precisely the operational semantics of ∥ (Fig. 3). Thus, we can check C(L1 ∥ L2) and EC(L1 ∥ L2) by checking C(L1), C(L2), EC(L1), and EC(L2), thereby "avoiding" the possible source of infinity.
We note that splitting the checks for parallel composition in this way not only ensures decidability; it also avoids exponential state explosion (in the number of nested ∥-operators in a single local type) in local types with finite control.
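The splitting argument can be rendered as a one-line recursion; in the hypothetical sketch below (ours; Par and check_finite_control are assumed names, not mpstpp's API), a parallel composition is checked by checking its components, so the product state space is never built.

    from dataclasses import dataclass

    @dataclass
    class Par:
        """Parallel composition L1 || L2 of two local types."""
        left: object
        right: object

    def check_wf(local_type, check_finite_control):
        """Check C/EC-style conditions compositionally: by the argument above,
        L1 || L2 satisfies them whenever L1 and L2 do, so we recurse through
        || and only ever explore the (finite) state spaces of the components."""
        if isinstance(local_type, Par):
            return (check_wf(local_type.left, check_finite_control) and
                    check_wf(local_type.right, check_finite_control))
        return check_finite_control(local_type)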
Our use of (weak) bisimilarity, plus the key insight to annotate silent actions with
additional information to keep track of choices, made the problem of proving the
correctness of projection (Thm. 3) feasible. The major technical challenges to
achieve this were defining the right bisimulation relation (Sect. 3.5) and discov-
ering corresponding well-formedness conditions (Sect. 3.6).
A naive weak bisimulation relation, Rnaive , relates every global type only
with its group of projections. Rnaive is sufficient to prove that every reduction
of a global type can be weakly simulated with one non-silent reduction of the
group (sender and receiver), followed by a number of silent reductions (idling actions).
[Figure: a global type is projected to a group of local types (if well-formed)]
4.1 Implementation
Setup. In the previous section, we formulated and proved the theoretical correctness of our well-formedness conditions (Thm. 3). In this section, we demonstrate their practical usefulness through an experimental evaluation on a set of benchmarks.
Specifically, we show that checking our well-formedness conditions is faster and
more scalable than explicitly checking operational equivalence (which currently
seems the only alternative to attain the same level of expressiveness as our work).
In our benchmarks, we compare three approaches to check operational equiv-
alence between a global type and its group of projected local types:
– mpstpp-seq (baseline): In this approach, the mpstpp tool is used to check our well-formedness conditions (which imply operational equivalence; Thm. 3), without using any form of parallel processing.
– mpstpp-par: In this approach, mpstpp checks the same well-formedness conditions, but using parallel processing (multiple threads).
– explicit: In this approach, the global type and its group of projections are translated to mCRL2 specifications, and mCRL2's ltscompare is then used to check their operational equivalence explicitly on the generated LTSs.
The protocols used in our benchmarks are the following:
Key-Value Store (KVS): This protocol is the same protocol as the one presented in Sect. 2, except each inner parallel composition (∥) is replaced with sequential composition (·). This is because mcrl22lps does not support normalisation of mCRL2 specifications where ∥ occurs under recursion.
Load Balancer (LB): This protocol consists of a Master and a number of Workers. Iteratively, first, a Request-message is communicated from the Master to one of the Workers; then, a Response-message is communicated from that Worker to the Master (see the sketch after this list).
Work Stealing (WS): This protocol consists of a Master and a number of
Workers. Iteratively, a Job-message is communicated from the Master to one
of the Workers. Meanwhile, Workers can try to “steal” jobs from each other:
at any point, first, a Steal-message can be communicated from one Worker
to another Worker; then, either a Job-message (if the former Worker has a
job to spare) or a None-message (otherwise) is communicated from the latter
Worker to the former Worker.
Map/Reduce (MR): This protocol consists of a Master and a number of Work-
ers. First, in no particular order, a Map-message is communicated from the
Master to each Worker; then, in no particular order, a Reduce-message is
communicated from each Worker to the Master.
Peer-to-Peer (PtP): This protocol consists of a number of Peers. In no particular order, a Msg-message is communicated from each Peer to each other Peer.
Pub/Sub (PS): This protocol consists of a Publisher and a number of Sub-
scribers. In no particular order, a Sub-message can be communicated once
from each Subscriber to the Publisher to gain a subscription. Concurrently,
a Pub-message can be communicated from the Publisher to each Subscriber
with a subscription.
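For concreteness, the Load Balancer protocol could be written as a recursive global type roughly as follows; this is our illustrative pseudo-notation (μ for recursion, a big ⊕ for the choice of Worker, · for sequencing), not necessarily the paper's exact syntax:

    $G_{\mathrm{LB}} \;=\; \mu X.\ \bigoplus_{i=1}^{n} \bigl( m \to w_i : \mathit{Request} \cdot w_i \to m : \mathit{Response} \bigr) \cdot X$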
Fig. 11: Speedups (y-axis; y>1E+0 means faster, y<1E+0 means slower) of ex-
plicit relative to mpstpp-seq as the number of roles increases (x-axis)
For each 2 ≤ n ≤ 7, we instantiated the Pub/Sub protocol with 1 Publisher and n Subscribers; we did not instantiate the Pub/Sub protocol with n > 7 Subscribers, as the resulting global types are too large (their size grows exponentially in n).
Benchmark results. Figures 11–12 show the results of our benchmarks. The
x-axis indicates the number of roles; the y-axis indicates relative speed-ups. The
baselines are at y=1E+0 and y=1: above it, a competing approach is faster than
mpstpp-seq; below it, it is slower. We draw two conclusions.
(1) For each protocol and number of roles, mpstpp-seq outperforms
explicit. In the cases of Key-Value Store and Load Balancer, explicit grows
towards mpstpp-seq, but the growth levels off as the number of roles increases,
while explicit is still about two orders of magnitude slower than mpstpp-seq
in the best of circumstances. In the cases of Work Stealing, Peer-to-Peer, and
Pub/Sub, the LTSs generated from the translated mCRL2 specifications were
too large to be compared (i.e., ltscompare produced an error) beyond 7, 5, and
5 roles; this was no issue for mpstpp-seq. In the case of Map/Reduce, the LTSs
were small enough to compare using mCRL2’s ltscompare, but after an initial
upwards slope for 2 ≤ n ≤ 7 roles, explicit starts to perform progressively worse.
(2) Especially for larger numbers of roles, parallelisation can yield
serious performance improvements. In the cases of Key-Value Store and
Load Balancer, mpstpp-par outperforms mpstpp-seq only with 14–16 roles; for
smaller numbers of roles, parallel execution is slower. In the worst case (Load Balancer, 2 roles), the slowdown is roughly 10.9μs/3.2μs ≈ 3.4; we hypothesise that, because of the low absolute execution times, the cost of spawning and synchronising threads outweighs their benefit. However, the ascending gradient indicates that as the number of roles increases, relatively more of the total work can be parallelised, yielding progressive rewards. In the cases of Work Stealing, Map/Reduce, Peer-to-Peer, and Pub/Sub, similar trends can be observed, except y=1 is crossed sooner; the absolute execution times for these protocols and for small numbers of roles are higher than for Key-Value Store and Load Balancer.
Fig. 12: Speedups (y-axis; y>1 means faster, y<1 means slower) of mpstpp-par relative to mpstpp-seq as the number of roles increases (x-axis)
5 Related Work
Session types and model checking. Recently, there has been growing interest
in using model checking to verify properties of (multiparty) session types, similar
to our use of mCRL2 as an alternative to checking well-formedness (Sect. 4.2).
Lange et al. [39] infer behavioural types from Go programs and use mCRL2 to
verify the inferred types, to establish safety properties (combined with another
tool, KITTeL [26], to establish liveness). Hu and Yoshida [36] use a custom model
checker to verify safety and progress properties of local types (represented as
CFSMs) as part of API generation in the Scribble toolchain for MPST [35].
Closest to our use of mCRL2 is the work of Scalas et al. [52,53], where mCRL2
is used to verify properties of local types (e.g., deadlock-freedom), while a form of
dependent type-checking is used to verify conformance of processes against those
types (i.e., actors in Scala); no global types and projection are used, though (pro-
grammers write local types manually). The idea is that properties model-checked
on the types carry over to the processes. Similarly, Scalas and Yoshida [51] use
mCRL2 to model-check session environments, as a more expressive alternative
to the classical consistency condition needed to prove subject reduction. Note
that [51, Theorem 5.15] shows that, in the case that a set of processes is typable
by a single multiparty session (i.e. a single global type), type-level properties
including safety, deadlock-freedom and liveness guarantee the same properties
for multiparty session π-processes. Hence our type-level analysis is directly us-
able to provide decidable procedures to verify session π-calculi with extended
expressiveness [51, Theorem 7.2].
6 Conclusion
A key open problem with multiparty session types (MPST) concerns expressive-
ness: none of the previous languages of global and local types supports arbitrary
choice (e.g., choices between different senders), existential quantification over
roles, and unbounded interleaving of subprotocols (in the same session). In this
paper, we presented the first theory that supports these features. Our main the-
oretical result is operational equivalence under weak bisimilarity: this guarantees
classical MPST properties for groups of local types projected from a global type,
namely deadlock freedom and absence of protocol violations. Our main prac-
tical result is that our well-formedness conditions, which guarantee operational
equivalence, can be checked orders of magnitude faster than directly checking
weak bisimilarity, which is demonstrated by our benchmark results.
We identify several interesting avenues for future work. First, it is useful to
extend our theory with parametrisation along the lines of Castro et al. [18] (which
currently works only for restrictive choices); their proof technique for correctness
seems to offer substantial synergy with our bisimilarity-based approach in this
paper. Second, we aim to investigate extensions of our theory with subtyping
(e.g., in terms of weak similarity). Notably, while asynchronous communication
can be encoded in our current theory, asynchronous subtyping is known to be
undecidable [9,41], so the connection between the two is interesting to explore.
References
1. Ancona, D., Bono, V., Bravetti, M., Campos, J., Castagna, G., Deniélou, P., Gay, S.J., Gesbert, N., Giachino, E., Hu, R., Johnsen, E.B., Martins, F., Mascardi, V., Montesi, F., Neykova, R., Ng, N., Padovani, L., Vasconcelos, V.T., Yoshida, N.: Behavioral types in programming languages. Foundations and Trends in Programming Languages 3(2-3), 95–230 (2016)
2. Baeten, J.C.M., Bravetti, M.: A ground-complete axiomatisation of finite-state processes in a generic process algebra. Mathematical Structures in Computer Science 18(6), 1057–1089 (2008)
3. Bergstra, J.A., Fokkink, W., Ponse, A.: Chapter 5 - Process algebra with recursive operations. In: Bergstra, J., Ponse, A., Smolka, S. (eds.) Handbook of Process Algebra, pp. 333–389. Elsevier Science (2001)
4. Bergstra, J.A., Klop, J.W.: Process algebra for synchronous communication. Information and Control 60(1-3), 109–137 (1984)
5. van Beusekom, R., Groote, J.F., Hoogendijk, P.F., Howe, R., Wesselink, W., Wieringa, R., Willemse, T.A.C.: Formalising the Dezyne modelling language in mCRL2. In: FMICS-AVoCS. Lecture Notes in Computer Science, vol. 10471, pp. 217–233. Springer (2017)
6. Bocchi, L., Lange, J., Yoshida, N.: Meeting deadlines together. In: CONCUR. LIPIcs, vol. 42, pp. 283–296. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik (2015)
7. Bocchi, L., Yang, W., Yoshida, N.: Timed multiparty session types. In: CONCUR. Lecture Notes in Computer Science, vol. 8704, pp. 419–434. Springer (2014)
8. Brand, D., Zafiropulo, P.: On communicating finite-state machines. J. ACM 30(2), 323–342 (1983)
9. Bravetti, M., Carbone, M., Zavattaro, G.: Undecidability of asynchronous session subtyping. Inf. Comput. 256, 300–320 (2017)
10. Bunte, O., Groote, J.F., Keiren, J.J.A., Laveaux, M., Neele, T., de Vink, E.P., Wesselink, W., Wijs, A., Willemse, T.A.C.: The mCRL2 toolset for analysing concurrent systems - improvements in expressivity and usability. In: TACAS (2). Lecture Notes in Computer Science, vol. 11428, pp. 21–39. Springer (2019)
11. Capecchi, S., Castellani, I., Dezani-Ciancaglini, M.: Typing access control and secure information flow in sessions. Inf. Comput. 238, 68–105 (2014)
12. Capecchi, S., Castellani, I., Dezani-Ciancaglini, M.: Information flow safety in multiparty sessions. Mathematical Structures in Computer Science 26(8), 1352–1394 (2016)
13. Capecchi, S., Castellani, I., Dezani-Ciancaglini, M., Rezk, T.: Session types for access and information flow control. In: CONCUR. Lecture Notes in Computer Science, vol. 6269, pp. 237–252. Springer (2010)
14. Carbone, M., Montesi, F.: Deadlock-freedom-by-design: multiparty asynchronous global programming. In: POPL. pp. 263–274. ACM (2013)
15. Carbone, M., Yoshida, N., Honda, K.: Asynchronous session types: Exceptions and multiparty interactions. In: SFM. Lecture Notes in Computer Science, vol. 5569, pp. 187–212. Springer (2009)
16. Castagna, G., Dezani-Ciancaglini, M., Padovani, L.: On global types and multiparty sessions. Logical Methods in Computer Science 8(1) (2012)
17. Castellani, I., Dezani-Ciancaglini, M., Pérez, J.A.: Self-adaptation and secure information flow in multiparty communications. Formal Asp. Comput. 28(4), 669–696 (2016)
18. Castro, D., Hu, R., Jongmans, S., Ng, N., Yoshida, N.: Distributed programming using role-parametric session types in Go: statically-typed endpoint APIs for dynamically-instantiated communication structures. PACMPL 3(POPL), 29:1–29:30 (2019)
19. Coppo, M., Dezani-Ciancaglini, M., Yoshida, N., Padovani, L.: Global progress for dynamically interleaved multiparty sessions. Mathematical Structures in Computer Science 26(2), 238–302 (2016)
20. Cranen, S., Groote, J.F., Keiren, J.J.A., Stappers, F.P.M., de Vink, E.P., Wesselink, W., Willemse, T.A.C.: An overview of the mCRL2 toolset and its recent advances. In: TACAS. Lecture Notes in Computer Science, vol. 7795, pp. 199–213. Springer (2013)
21. Davoudian, A., Chen, L., Liu, M.: A survey on NoSQL stores. ACM Comput. Surv. 51(2), 40:1–40:43 (2018)
22. Deniélou, P., Yoshida, N.: Dynamic multirole session types. In: POPL. pp. 435–446. ACM (2011)
23. Deniélou, P., Yoshida, N.: Multiparty session types meet communicating automata. In: ESOP. Lecture Notes in Computer Science, vol. 7211, pp. 194–213. Springer (2012)
24. Deniélou, P., Yoshida, N.: Multiparty compatibility in communicating automata: Characterisation and synthesis of global session types. In: ICALP (2). Lecture Notes in Computer Science, vol. 7966, pp. 174–186. Springer (2013)
25. Deniélou, P., Yoshida, N., Bejleri, A., Hu, R.: Parameterised multiparty session types. Logical Methods in Computer Science 8(4) (2012)
26. Falke, S., Kapur, D., Sinz, C.: Termination analysis of imperative programs using bitvector arithmetic. In: VSTTE. Lecture Notes in Computer Science, vol. 7152, pp. 261–277. Springer (2012)
27. Gessert, F., Wingerath, W., Friedrich, S., Ritter, N.: NoSQL database systems: a survey and decision guidance. Computer Science - R&D 32(3-4), 353–365 (2017)
28. Groote, J.F., Jansen, D.N., Keiren, J.J.A., Wijs, A.: An O(m log n) algorithm for computing stuttering equivalence and branching bisimulation. ACM Trans. Comput. Log. 18(2), 13:1–13:34 (2017)
29. Groote, J.F., Mousavi, M.R.: Modeling and Analysis of Communicating Systems. MIT Press (2014)
30. Hamers, R., Jongmans, S.S.: Discourje: Runtime verification of communication protocols in Clojure. In: TACAS 2020 (in press)
31. Hoefler, T., Belli, R.: Scientific benchmarking of parallel computing systems: twelve ways to tell the masses when reporting performance results. In: SC. pp. 73:1–73:12. ACM (2015)
32. Honda, K., Tokoro, M.: An object calculus for asynchronous communication. In: ECOOP. Lecture Notes in Computer Science, vol. 512, pp. 133–147. Springer (1991)
33. Honda, K., Yoshida, N., Carbone, M.: Multiparty asynchronous session types. In: POPL. pp. 273–284. ACM (2008)
34. Honda, K., Yoshida, N., Carbone, M.: Multiparty asynchronous session types. J. ACM 63(1), 9:1–9:67 (2016)
35. Hu, R., Yoshida, N.: Hybrid session verification through endpoint API generation. In: FASE. Lecture Notes in Computer Science, vol. 9633, pp. 401–418. Springer (2016)
36. Hu, R., Yoshida, N.: Explicit connection actions in multiparty session types. In: FASE. Lecture Notes in Computer Science, vol. 10202, pp. 116–133. Springer (2017)
37. Hüttel, H., Lanese, I., Vasconcelos, V.T., Caires, L., Carbone, M., Deniélou, P., Mostrous, D., Padovani, L., Ravara, A., Tuosto, E., Vieira, H.T., Zavattaro, G.: Foundations of session types and behavioural contracts. ACM Comput. Surv. 49(1), 3:1–3:36 (2016)
38. Jongmans, S.S., Yoshida, N.: Exploring Type-Level Bisimilarity towards More Expressive Multiparty Session Types. Tech. Rep. TR-OU-INF-2020-01, Open University of the Netherlands (2020)
39. Lange, J., Ng, N., Toninho, B., Yoshida, N.: A static verification framework for message passing in Go using behavioural types. In: ICSE. pp. 1137–1148. ACM (2018)
40. Lange, J., Tuosto, E., Yoshida, N.: From communicating machines to graphical choreographies. In: POPL. pp. 221–232. ACM (2015)
41. Lange, J., Yoshida, N.: On the undecidability of asynchronous session subtyping. In: FoSSaCS. Lecture Notes in Computer Science, vol. 10203, pp. 441–457 (2017)
42. Lange, J., Yoshida, N.: Verifying asynchronous interactions via communicating session automata. In: CAV (1). Lecture Notes in Computer Science, vol. 11561, pp. 97–117. Springer (2019)
43. Mostrous, D., Yoshida, N., Honda, K.: Global principal typing in partially commutative asynchronous sessions. In: ESOP. Lecture Notes in Computer Science, vol. 5502, pp. 316–332. Springer (2009)
44. Neykova, R., Bocchi, L., Yoshida, N.: Timed runtime monitoring for multiparty conversations. Formal Asp. Comput. 29(5), 877–910 (2017)
45. Neykova, R., Hu, R., Yoshida, N., Abdeljallal, F.: A session type provider: compile-time API generation of distributed protocols with refinements in F#. In: CC. pp. 128–138. ACM (2018)
46. Neykova, R., Yoshida, N.: Let it recover: multiparty protocol-induced recovery. In: CC. pp. 98–108. ACM (2017)
47. Ng, N., Yoshida, N.: Pabble: parameterised Scribble. Service Oriented Computing and Applications 9(3-4), 269–284 (2015)
48. Redis Labs: Redis (n.d.), accessed 18 October 2019, https://redis.io
49. Redis Labs: Transactions – Redis (n.d.), accessed 18 October 2019, https://redis.io/topics/transactions
50. Scalas, A., Dardha, O., Hu, R., Yoshida, N.: A linear decomposition of multiparty sessions for safe distributed programming. In: ECOOP. LIPIcs, vol. 74, pp. 24:1–24:31. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik (2017)
51. Scalas, A., Yoshida, N.: Less is more: multiparty session types revisited. PACMPL 3(POPL), 30:1–30:29 (2019)
52. Scalas, A., Yoshida, N., Benussi, E.: Effpi: verified message-passing programs in Dotty. In: SCALA@ECOOP. pp. 27–31. ACM (2019)
53. Scalas, A., Yoshida, N., Benussi, E.: Verifying message-passing programs with dependent behavioural types. In: PLDI. pp. 502–516. ACM (2019)
Verifying Visibility-Based Weak Consistency
1 Introduction
Programming efficient multithreaded programs generally involves carefully organiz-
ing shared memory accesses to facilitate inter-thread communication while avoiding
synchronization bottlenecks. Modern software platforms like Java include reusable
abstractions which encapsulate low-level shared memory accesses and synchronization
into familiar high-level abstract data types (ADTs). These so-called concurrent objects
typically include mutual-exclusion primitives like locks, numeric data types like atomic
integers, as well as collections like sets, key-value maps, and queues; Java’s standard-
edition platform contains many implementations of each. Such objects typically provide
strong consistency guarantees like linearizability [18], ensuring that each operation
appears to happen atomically, witnessing the atomic effects of predecessors according
to some linearization order among concurrently-executing operations.
While such strong consistency guarantees are ideal for logical reasoning about
programs which use concurrent objects, these guarantees are too strong for many oper-
ations, since they preclude simple and/or efficient implementation — over half of Java’s
concurrent collection methods forego atomicity for weak consistency [13]. On the one
hand, basic operations like the get and put methods of key-value maps typically admit
relatively-simple atomic implementations, since their behaviors essentially depend
upon individual memory cells, e.g., where the relevant key-value mapping is stored.
On the other hand, making aggregate operations like size and contains (value) atomic
would impose synchronization bottlenecks, or otherwise-complex control structures,
since their atomic behavior depends simultaneously upon the values stored across
many memory cells. Interestingly, such implementations are not linearizable even
when their underlying memory operations are sequentially consistent, e.g., as is the
case with Java 8’s concurrent collections, whose memory accesses are data-race free.4
For instance, the contains (value) method of Java’s concurrent hash map iterates
through key-value entries without blocking concurrent updates in order to avoid
unreasonable performance bottlenecks. Consequently, in a given execution, a contains-value-v operation o1 will overlook operation o2's concurrent insertion of k1 → v for a
key k1 it has already traversed. This oversight makes it possible for o1 to conclude that
value v is not present, and can only be explained by o1 being linearized before o2 . In the
case that operation o3 removes k2 → v concurrently before o1 reaches key k2 , but only
after o2 completes, then atomicity is violated since in every possible linearization, either
mapping k2 → v or k1 → v is always present. Nevertheless, such weakly-consistent
operations still offer guarantees, e.g., that values never present are never observed, and
initially-present values not removed are observed.
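The interleaving just described can be replayed deterministically; the toy script below (ours; a plain Python dict stands in for Java's concurrent hash map, with the schedule fixed by hand) shows the contains-value scan reporting v absent even though some key maps to v at every moment:

    # Initially k2 -> v; o1 = contains(v) scans keys in order k1, k2.
    table = {'k1': None, 'k2': 'v'}

    saw = table.get('k1') == 'v'         # o1 reads k1: v not there yet
    table['k1'] = 'v'                    # o2: put(k1, v) completes
    table['k2'] = None                   # o3: remove(k2), after o2 returned
    saw = saw or table.get('k2') == 'v'  # o1 resumes: k2 no longer maps to v

    # o1 concludes "v absent", yet in every linearization of o1, o2, o3 some
    # key maps to v at all times, so no atomic placement of o1 explains this.
    assert saw is False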
In this work we develop a methodology for proving that concurrent-object imple-
mentations adhere to the guarantees prescribed by their weak-consistency specifica-
tions. The key salient aspects of our approach are the lifting of existing sequential ADT
specifications via visibility relaxation [13], and the harnessing of simple and mechaniz-
able reasoning based on forward simulation [25] by relaxed-visibility ADTs. Effectively,
our methodology extends the predominant forward-simulation based linearizability-
proof methodology to concurrent objects with weakly-consistent operations, and
enables automation for proving weak-consistency guarantees.
To enable the harnessing of existing sequential ADT specifications, we adopt the
recent methodology of visibility relaxation [13]. As in linearizability [18], the return
value of each operation is dictated by the atomic effects of its predecessors in some
(i.e., existentially quantified) linearization order. To allow consistency weakening,
operations are allowed, to a certain extent, to overlook some of their linearization-order
predecessors, behaving as if they had not occurred. Intuitively, this (also existentially
quantified) visibility captures the inability or unwillingness to atomically observe
the values stored across many memory cells. To provide guarantees, the extent of
4 Java 8 implementations guarantee data-race freedom by accessing individual shared-memory
cells with atomic operations via volatile variables and compare-and-swap instructions. Starting
with Java 9, the implementations of the concurrent collections use the VarHandle mechanism
to specify shared variable access modes. Java’s official language and API specifications do not
clarify whether these relaxations introduce data races.
2 Weak Consistency
Our methodology for verifying weakly-consistent concurrent objects relies both on the
precise characterization of weak consistency specifications, as well as a proof technique
for establishing adherence to specifications. In this section we recall and outline a
characterization called visibility relaxation [13], an extension of sequential abstract
data type (ADT) specifications in which the return values of some operations may not
reflect the effects of previously-effectuated operations.
Notationally, in the remainder of this article, ε denotes the empty sequence, ∅
denotes the empty set, _ denotes an unused binding, and ⊤ and ⊥ denote the Boolean
values true and false, respectively. We write R(x) to denote the inclusion x ∈ R of
a tuple x in the relation R; and R[x → y] to denote the extension R ∪ {xy} of R to
include xy; and R | X to denote the projection R ∩ X* of R to set X; and R̄ to denote the complement {x : x ∉ R} of R; and R(x) to denote the image {y : xy ∈ R} of R on
x; and R−1 (y) to denote the pre-image {x : xy ∈ R} of R on y; whether R(x) refers
to inclusion or an image will be clear from its context. Finally, we write xi to refer to
the ith element of tuple x = x0 x1 . . ..
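Rendered as executable definitions, the relational notation reads as follows (our sketch, with relations as Python sets of pairs; the complement R̄ is omitted since it requires an ambient universe):

    def extend(R, x, y):
        """R[x -> y]: extend R with the pair (x, y)."""
        return R | {(x, y)}

    def project(R, X):
        """R | X: keep only pairs drawn entirely from X."""
        return {(x, y) for (x, y) in R if x in X and y in X}

    def image(R, x):
        """R(x): everything x is related to."""
        return {y for (a, y) in R if a == x}

    def preimage(R, y):
        """R^-1(y): everything related to y."""
        return {x for (a, b) in R if b == y}

    R = {(1, 2), (2, 3)}
    assert image(extend(R, 1, 3), 1) == {2, 3}
    assert project(R, {1, 2}) == {(1, 2)}
    assert preimage(R, 3) == {2}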
– ⟨has, v, b⟩, and b = ⊤ iff no prior ⟨put, kv′, _⟩ nor ⟨rem, k, _⟩ follows some prior ⟨put, kv, _⟩.
The read-only predicate Rm holds for the following cases:
Rm(⟨put, _, b⟩) if ¬b; Rm(⟨rem, _, b⟩) if ¬b; Rm(⟨get, _, _⟩); Rm(⟨has, _, _⟩).
Java’s concurrent hash map appears to be consistent with this specification [13].
A visibility projection vis of lin maps each operation o ∈ O to a subset vis(o) ⊆ lin⁻¹(o) of the operations preceding o in lin; note that ⟨o1, o2⟩ ∈ vis means o1 observes o2. For a given read-only predicate R, we say o's visibility is monotonic when it includes every happens-before predecessor, and every operation visible to a happens-before predecessor, which is not read-only,7 i.e., vis(o) ⊇ (hb⁻¹(o) ∪ vis(hb⁻¹(o))) | R̄. We say o's visibility is absolute when vis(o) = lin⁻¹(o), and vis is itself absolute when each vis(o) is. An abstract execution e = ⟨h, lin, vis⟩ is a history h along with a linearization lin of h, and a visibility projection vis of lin. An abstract execution is sequential when hb is total, complete when h is, and absolute when vis is.
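A checkable rendition of these two visibility classes (our sketch; hb_pred, lin_pred, and vis map each operation to a set of operations, and read_only is the predicate R):

    def monotonic(o, hb_pred, vis, read_only):
        """vis(o) must contain every non-read-only happens-before predecessor
        of o, and every non-read-only operation visible to such a predecessor:
        vis(o) >= (hb^-1(o) U vis(hb^-1(o))) restricted to non-read-only ops."""
        required = {p for p in hb_pred[o] if not read_only(p)}
        for p in hb_pred[o]:
            required |= {q for q in vis[p] if not read_only(q)}
        return required <= vis[o]

    def absolute(o, lin_pred, vis):
        """vis(o) is exactly the set of linearization-order predecessors."""
        return vis[o] == set(lin_pred[o])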
⟨put, ⟨1, 1⟩, _⟩ ⟨get, 1, 1⟩ ⟨put, ⟨0, 1⟩, _⟩ ⟨put, ⟨1, 0⟩, ⊥⟩ ⟨has, 1, ⊥⟩
along with a happens-before order that, compared to the linearization order, keeps ⟨has, 1, ⊥⟩ unordered w.r.t. ⟨put, ⟨0, 1⟩, _⟩ and ⟨put, ⟨1, 0⟩, ⊥⟩.
Remark 1. Consistency models suited for modern software platforms like Java are based
on happens-before relations which abstract away from real-time execution order. Since
happens-before, unlike real-time, is not necessarily an interval order, the composition
of linearizations of two distinct objects in the same execution may be cyclic, i.e., not linearizable. Recovering compositionality in this setting is orthogonal to our work of proving consistency against a given model, and is explored elsewhere [11].
7 For convenience we rephrase Emmi and Enea [13]'s notion to ignore read-only predecessors.
8 For readability, we list linearization sequences with operation labels in place of identifiers.
9 As is standard, adequate labelings of incomplete executions are obtained by completing each linearized yet pending operation with some arbitrarily-chosen return value [18]. It is sufficient that one of these completions be included in the sequential specification.
10 We consider a simplification from prior work [13]: rather than allowing the observers of a given operation to pretend they see distinct return values, we suppose that all observers agree on return values. While this is more restrictive in principle, it is equivalent for the simple specifications studied in this article.
– execution ⟨h′, lin, vis⟩ such that h′ = ⟨O, inv, ret, hb′⟩ and hb′ ⊆ hb; and
– W-consistent execution ⟨h′, lin, vis′⟩ with h′ = ⟨O, inv, ret′, hb⟩ and vis′ ⊆ vis.
Example 5. The abstract executions of Wm include the complete, sequential, and absolute abstract execution defined by the following happens-before order:
⟨put, ⟨1, 1⟩, _⟩ → ⟨get, 1, 1⟩ → ⟨put, ⟨0, 1⟩, _⟩ → ⟨put, ⟨1, 0⟩, ⊥⟩ → ⟨has, 1, ⊤⟩
which implies that it also includes one in which just the happens-before order is modified such that ⟨has, 1, ⊤⟩ becomes unordered w.r.t. ⟨put, ⟨0, 1⟩, _⟩ and ⟨put, ⟨1, 0⟩, ⊥⟩. Since it includes the latter, it also includes the execution in Example 3 where the visibility of has is weakened, which also modifies its return value from ⊤ to ⊥.
An action over operation identifier o is an o-action, and we assume that executions are well formed in the sense that, for a given operation identifier o: at most one call o-action occurs, at most one ret o-action occurs, and no ret nor hb o-actions occur prior to a call o-action. Furthermore, we assume call o-actions are enabled so long as no prior call o-action has occurred. The history of a trace τ is defined inductively by fh(h∅, τ), where h∅ is the empty history, and
fh(h, ε) = h
fh(h, aτ) = fh(gh(h, a), τ)
fh(h, ãτ) = fh(h, τ)
gh(h, call(o, m, x)) = ⟨O ∪ {o}, inv[o → ⟨m, x⟩], ret, hb⟩
gh(h, ret(o, y)) = ⟨O, inv, ret[o → y], hb⟩
gh(h, hb(o, o′)) = ⟨O, inv, ret, hb ∪ {⟨o, o′⟩}⟩
where h = ⟨O, inv, ret, hb⟩, and a is a call, ret, or hb action, and ã is not. An implementation I is a history transition system, and the histories H(I) of I are those of its traces. Finally, we define consistency against specifications via history containment.
Definition 3. Implementation I is consistent with specification W iff H(I) ⊆ H(W ).
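The fold fh is immediate to execute; here is a sketch (ours) with actions encoded as tagged tuples:

    def history(trace):
        """Fold f_h over a trace: call/ret/hb actions build the history
        (O, inv, ret, hb); all other actions leave it unchanged."""
        O, inv, ret, hb = set(), {}, {}, set()
        for act in trace:
            if act[0] == 'call':
                _, o, m, x = act
                O.add(o)
                inv[o] = (m, x)
            elif act[0] == 'ret':
                _, o, y = act
                ret[o] = y
            elif act[0] == 'hb':
                _, o1, o2 = act
                hb.add((o1, o2))
        return O, inv, ret, hb

    O, inv, ret, hb = history([('call', 1, 'put', (0, 1)),
                               ('lin', 1), ('ret', 1, True)])
    assert ret[1] is True and not hb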
Lemma 2. A weak-visibility spec. and its transition system have identical histories.
Our notion of simulation is in some sense complete when the sequential specification S of a weak-consistency specification W = ⟨S, R, V⟩ is return-value deterministic, i.e., there is a single label ⟨m, x, y⟩ such that λ · ⟨m, x, y⟩ ∈ S for any method m, argument value x, and admitted sequence λ ∈ S. In particular, W^s simulates any witnessing implementation I whose abstract executions E(I) are included in E(W^s).11
This completeness, however, extends only to inclusion of abstract executions, and not
all the way to consistency, since consistency is defined on histories, and any given
operation’s return value is not completely determined by the other operation labels
and happens-before relation of a given history: return values generally depend on lin-
earization order and visibility as well. Nevertheless, sequential specifications typically
are return-value deterministic, and we have used simulation to prove consistency of
Java-inspired weakly-consistent objects.
Establishing simulation for an implementation is also helpful when reasoning
about clients of a concurrent object. One can use the specification in place of the
implementation and encode the client invariants using the abstract execution of the
specification in order to prove client properties, following Sergey et al.’s approach [35].
load_m(M, x, _) = ⟨M, M(x)⟩
store_m(M, xy, o) = ⟨M[x → ⟨y, M(x)₁ ∪ {o}⟩], ε⟩
cas_m(M, xyz, o) = ⟨M[x → ⟨z, M(x)₁ ∪ {o}⟩], ⟨true, M(x)₁⟩⟩ if M(x)₀ = y
cas_m(M, xyz, o) = ⟨M, ⟨false, M(x)₁⟩⟩ if M(x)₀ ≠ y
where the compare-and-swap (CAS) operation stores value z at address x and returns true when y was previously stored, and otherwise returns false.
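A sketch of this tagged memory in Python (ours): each cell pairs a value with the set of operation identifiers that wrote it, and cas returns the prior tag set alongside the success flag, mirroring the denotations above.

    def load(M, x):
        return M, M[x]                        # the (value, tags) pair at x

    def store(M, x, y, o):
        _, tags = M[x]
        return {**M, x: (y, tags | {o})}, None

    def cas(M, x, y, z, o):
        val, tags = M[x]
        if val == y:
            return {**M, x: (z, tags | {o})}, (True, tags)
        return M, (False, tags)

    M = {'table[0]': (0, set())}
    M, _ = store(M, 'table[0]', 1, 'o1')
    M, (ok, tags) = cas(M, 'table[0]', 1, 2, 'o2')
    assert ok and tags == {'o1'}              # o2 observes o1's write via tags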
and an update continuation f mapping the memory command's return value y to a pair f(y) = ⟨ℓ₂, α⟩, where ℓ₂ is an updated local state, and α maps an operation o to an LTS action α(o). We assume the denotation ⟦ret x⟧_c(ℓ₁) = ⟨nop, ε, λy.⟨ℓ₂, λo.ret(z)⟩⟩ of the ret command yields a local state ℓ₂ with done(ℓ₂) without executing memory commands, and outputs a corresponding LTS ret action.
Example 7. A simple goto language over variables a, b, . . . for the memory system of Example 6 would include the following commands:
⟦goto a⟧_c(ℓ) = ⟨nop, ε, λy.⟨jump(ℓ, ℓ(a)), λo.ε⟩⟩
⟦assume a⟧_c(ℓ) = ⟨nop, ε, λy.⟨next(ℓ), λo.ε⟩⟩ if ℓ(a) ≠ 0
⟦b, c = load(a)⟧_c(ℓ) = ⟨load, ℓ(a), λ⟨y₁, y₂⟩.⟨next(ℓ[b → y₁][c → y₂]), λo.ε⟩⟩
⟦store(a, b)⟧_c(ℓ) = ⟨store, ℓ(a)ℓ(b), λy.⟨next(ℓ), λo.ε⟩⟩
⟦d, e = cas(a, b, c)⟧_c(ℓ) = ⟨cas, ℓ(a)ℓ(b)ℓ(c), λ⟨y₁, y₂⟩.⟨next(ℓ[d → y₁][e → y₂]), λo.ε⟩⟩
where the jump and next functions update a program counter, and the load command stores the operation identifiers returned from the corresponding memory commands. Linearization and visibility actions are captured as program commands as follows:
⟦lin⟧_c(ℓ) = ⟨nop, ε, λy.⟨next(ℓ), λo.lin(o)⟩⟩
⟦vis(a)⟧_c(ℓ) = ⟨nop, ε, λy.⟨next(ℓ), λo.vis(o, ℓ(a))⟩⟩
Atomic sections can be captured with a lock variable and a pair of program commands,
⟦begin⟧_c(ℓ) = ⟨nop, ε, λy.⟨next(ℓ[lock → true]), λo.ε⟩⟩
⟦end⟧_c(ℓ) = ⟨nop, ε, λy.⟨next(ℓ[lock → false]), λo.ε⟩⟩
such that idle states are identified by not holding the lock, i.e., idle(ℓ) = ¬ℓ(lock), as in the initial state init(m, x)(lock) = false.
Fig. 1. The semantics of program P = ⟨init, cmd, idle, done⟩ as an abstract-execution transition system, where ⟦·⟧_c and ⟦·⟧_m are the denotations of program and memory commands, respectively.
argument x, and emitting action α(o). Besides its effect on shared memory, each step uses the result ⟨M₂, y⟩ of memory command μ to update the local state and emit an action using the continuation f, i.e., f(y) = ⟨ℓ₂, α⟩. Commands which do not access memory are modeled by a no-op memory command. We define the consistency of programs by reduction to their transition systems.
4 Proof Methodology
Fig. 2. An implementation Ichm modeling Java’s concurrent hash map. The command inc(k)
increments counter k, and commands within atomic {. . .} are collectively atomic.
It is without loss of generality because the clients of such implementations can use
auxiliary variables to impose synchronization order constraints between every two
operations ordered by returns-before, e.g., writing a variable after each operation
returns which is read before each other operation is called (under sequential consistency,
every write happens-before every other read which reads the written value).
We illustrate our methodology with the key-value map implementation Ichm of
Figure 2, which models Java’s concurrent hash map. The lines marked in blue and
red represent linearization/visibility commands added by the instrumentation that
will be described below. Key-value pairs are stored in an array table indexed by keys.
The implementations of put and get are obvious, while the implementation of has, which returns true iff the input value is associated to some key, consists of a while loop traversing the array and searching for the input value. To simplify the exposition, the
shared memory reads and writes are already adapted to the memory system described
in Section 3.2 (essentially, this consists in adding new variables storing the set of
operation identifiers returned by a shared memory read). While put and get are
obviously linearizable, has is weakly consistent, with monotonic visibility. For instance,
given the two-thread program {get(1); has(1)} || {put(1, 1); put(0, 1); put(1, 0)} it
is possible that get(1) returns 1 while has(1) returns false. This is possible in an
interleaving where has reads table[0] before put(0,1) writes into it (observing the
initial value 0), and table[1] after put(1,0) writes into it (observing value 0 as well).
The only abstract execution consistent with the weakly-consistent contains-value map
Wm (Example 2) which justifies these return values is given in Example 3. We show
that this implementation is consistent with a simplification of the contains-value map
Wm , without remove key operations, and where put operations return no value.
Given an implementation I, let L(I) be an instrumentation of I with program
commands lin() emitting linearization actions. The execution of lin() in the context
of an operation with identifier o emits a linearization action lin(o). We assume that L(I)
leads to well-formed executions (e.g., at most one linearization action per operation).
Example 9. The blue lines in Figure 2 demonstrate the visibility commands added by
the instrumentation V(·) to the key-value map in Figure 2 (in this case, the modifiers
are put operations). The first visibility command in has precedes the procedure body
to emphasize the fact that it is executed atomically with the procedure call. Also, note
that the read of the array table is the only shared memory read in has.
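To make the instrumentation concrete, here is our rendition of the instrumented has in Python rather than the modeling language of Fig. 2; emit stands in for the lin/vis commands, tags[k] is the writer set the memory system of Sect. 3.2 attaches to table[k], and the placement of the lin command follows our reading of the figure rather than a verbatim transcription.

    def has(v, table, tags, lin, o, emit):
        """Instrumented contains-value scan; `lin` is the history variable
        holding the current linearization sequence (operation identifiers,
        with put identifiers assumed to start with 'put')."""
        # atomic with the call: observe every already-linearized modifier
        for o_put in (p for p in lin if p.startswith('put')):
            emit(('vis', o, o_put))
        for k in range(len(table)):
            tv, read_tags = table[k], tags[k]   # atomic read of entry k
            for o_put in read_tags:             # observe the entry's writers
                emit(('vis', o, o_put))
            if tv == v:
                emit(('lin', o))
                return True
        emit(('lin', o))
        return False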
Proof. Let ⟨h, lin, vis⟩ be the abstract execution of a trace τ of V(L(I)), and let o be an invocation in h of a monotonic method (w.r.t. V). By the definition of V, the call action of o is immediately followed in τ by a sequence of visibility actions vis(o, o′)
12 We rely on retrieving the identifiers of currently-linearized operations. More complex proofs may also require inspecting, e.g., operation labels and happens-before relationships.
for every modifier o′ which has been already linearized. Therefore, any operation
which has returned before o (i.e., happens-before o) has already been linearized and it
will necessarily have a smaller visibility (w.r.t. set inclusion) because the linearization
sequence is modified only by appending new operations. The instrumentation of
shared memory reads may add more visibility actions vis(o, _) but this preserves the
monotonicity status of o’s visibility. The case of absolute methods is obvious.
The consistency of the abstract executions of V(L(I)) with a given sequential
specification S, which completes the proof of consistency with a weak-visibility speci-
fication W = ⟨S, R, V⟩, can be proved by showing that the transition system W^s of
W simulates V(L(I)) (Theorem 1). Defining a simulation relation between the two
systems is in some part implementation specific, and in the following we demonstrate
it for the key-value map implementation V(L(Ichm )).
We show that Wm^s simulates implementation Ichm. A state of Ichm in Figure 2
is a valuation of table and the history variable lin storing the current linearization
sequence, and a valuation of the local variables for each active operation. Let ops(q)
denote the set of operations which are active in an implementation state q. Also, for
a has operation o ∈ ops(q), let index(o) be the maximal index k of the array table such that o has already read table[k] and table[k] ≠ v. We assume index(o) = −1 if o did not read any array cell.
Definition 6. Let Rchm be a relation which associates every implementation state q with a state of Wm^s, i.e., an ⟨S, R, V⟩-consistent abstract execution e = ⟨h, lin, vis⟩ with h = ⟨O, inv, ret, hb⟩, such that:
1. O is the set of identifiers occurring in ops(q) or the history variable lin,
2. for each operation o ∈ ops(q), inv (o) is defined according to its local state, ret(o) is
undefined, and o is maximal in the happens-before order hb,
3. the value of the history variable lin in q equals the linearization sequence lin,
4. every invocation o ∈ ops(q) of an absolute method (put or get) has absolute visibility
if linearized, otherwise, its visibility is empty,
5. table is the array obtained by executing the sequence of operations lin,
6. for every linearized get(k) operation o ∈ ops(q), the put(k,_) operation in vis(o)
which occurs last in lin writes v to key k, where v is the local variable of o,
7. for every has operation o ∈ ops(q), vis(o) consists of:
– all the put operations o which returned before o was invoked,
– for each i ≤ index (o), all the put(i,_) operations from a prefix of lin that
wrote a value different from v,
– all the put(index (o) + 1,_) operations from a prefix of lin that ends with a
put(index (o) + 1,v) operation, provided that tv = v.
Above, the linearization prefix associated to an index j1 < j2 should be a prefix of
the one associated to j2 .
A large part of this definition is applicable to any implementation, only points (5),
(6), and (7) being specific to the implementation we consider. The points (6) and (7)
ensure that the return values of operations are consistent with S and mimic the effect
of the vis commands from Figure 2.
Theorem 3. Rchm is a simulation relation from V(L(Ichm)) to Wm^s.
… ⟨Q, A, q, →_a⟩, where W^s = ⟨Q, A, q, →⟩ is the AETS of W, and e₁ →_a e₂ if and only if e₁ →^a e₂ and a ∈ {call(o, m, x)} ∪ {ret(o, y)} ∪ {hb(o, o′)} ∪ {a₁ lin(o) : a₁ ∈ {vis(o, _)}*}.
We adapted the first-order theory of reachability and footprint sets from the
GRASShopper verifier [30] for dynamically allocated data structures. This fragment is
decidable, but relies on local theory extensions [36], which we implemented by using
the trigger mechanism of the underlying SMT solver [27, 15] to ensure that quantified
axioms were only instantiated for program expressions. For instance, here is the “cycle”
axiom that says that if a node x has a field f[x] that points to itself, then any y that
it can reach via that field (encoded using the between predicate Btwn(f, x, y, y))
must be equal to x:
axiom (forall f: [Ref]Ref, x: Ref, y:Ref :: {known(x), known(y)}
f[x] == x && Btwn(f, x, y, y) ==> x == y);
We use the trigger known(x), known(y) (known is a dummy function that maps every
reference to true) and introduce known(t) terms in our programs for every term t of
type Ref (for instance, by adding assert known(t) to the point of the program where
t is introduced). This ensures that the cycle axiom is only instantiated for terms that
appear in the program, and not for terms that are generated by instantiations of axioms
(like f[x] in the cycle axiom). This process was key to keeping the verification time
manageable.
Since we consider fine-grained concurrent implementations, we also needed to
reason about interference by other threads and show thread safety. civl provides
Owicki-Gries [29] style thread-modular reasoning, by means of demarcating atomic
blocks and providing preconditions for each block that are checked for stability under
all possible modifications by other threads. One of the consequences of this is that
these annotations can only talk about the local state of a thread and the shared global
state, but not other threads. To encode facts such as distinctness of operation identifiers
and ownership of unreachable nodes (e.g. newly allocated nodes) in the shared heap,
we use civl’s linear type system [40].
For instance, the proof of the push method needs to make assertions about the value
of the newly-allocated node x. These assertions would not be stable under interference
of other threads if we didn’t have a way of specifying that the address of the new node
is known only by the push thread. We encode this knowledge by marking the type of
the variable x as linear – this tells civl that all values of x across all threads are distinct,
which is sufficient for the proof. civl ensures soundness by making sure that linear
variables are not duplicated (for instance, they cannot be passed to another method
and then used afterwards).
We evaluate our proof methodology by considering models of two of Java’s weakly-
consistent concurrent objects.
Fig. 3. Case study detail: for each object we show lines of code, lines of proof, total lines, and
verification time in seconds. We also list common definitions and axiomatizations separately.
civl can construct a simulation relation equivalent to the one defined in Definition 6
automatically, given an inductive invariant that relates the state of the implementation
to the abstract execution. A first attempt at an invariant might be that the value stored
at table[k] for every key k is the same as the value returned by adding a get operation
on k by the specification AETS. This invariant is sufficient for civl to prove that the
return value of the absolute methods (put and get) is consistent with the specification.
However, it is not enough to show that the return value of the monotonic has
method is consistent with its visibility. This is because our proof technique constructs
a visibility set for has by taking the union of the memory tags (the set of operations
that wrote to each memory location) of each table entry it reads, but without additional
invariants this visibility set could entail a different return value. We thus strengthen
the invariant to say that tableTags[k], the memory tags associated with hash table
entry k, is exactly the set of linearized put operations with key k. A consequence of
this is that the abstract state encoded by tableTags[k] has the same value for key k as
the value stored at table[k]. civl can then prove, given the following loop invariant,
that the value returned by has is consistent with its visibility set.
(forall i: int :: 0 <= i && i < k ==> Map.ofVis(my_vis, lin)[i] != v)
This loop invariant says that among the entries scanned thus far, the abstract map
given by the projection of lin to the current operation’s visibility my_vis does not
include value v.
[Figure: queue implementation (excerpt): var head, tail: Ref; struct Node { var data: K; var next: Ref; }]
As for the hash map, a basic invariant relates the abstract and concrete states. Once again, we need to strengthen this invariant in order
to verify the monotonic size method, because otherwise we cannot prove that the
visibility set we construct (by taking the union of the memory tags of nodes in the list
during traversal) justifies the return value.
The key additional invariant is that the memory tags for the next field of each node
(denoted x.nextTags for each node x) in the queue contain the operation label of the
operation that pushed the next node into the queue (if it exists). Further, the sequence
of push operations in lin are exactly the operations in the nextTags field of nodes in
the queue, and in the order they are present in the queue.
Figure 5 shows a simplified version of the civl encoding of these invariants. In
it, we use the following auxiliary variables in order to avoid quantifier alternation:
nextInvoc maps nodes to the operation label (type Invoc in civl) contained in the
nextTags field; nextRef maps operations to the nodes whose nextTags field contains
them, i.e. it is the inverse of nextInvoc; and absRefs maps the index of the abstract
queue (represented as a mathematical sequence) to the corresponding concrete heap
node. We omit the triggers and known predicates for readability; the full invariant can
be found in the accompanying proof scripts.
Given these invariants, one can show that the return value s computed by size
is consistent with the visibility set it constructs by picking up the memory tags from
each node that it traverses. The loop invariant is more involved, as due to concurrent
updates size could be traversing nodes that have been popped from the queue; see
our civl proofs for more details.
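The same recipe adapts to the queue; in the sketch below (ours, mirroring the invariants above), size walks the linked nodes and unions each traversed link's nextTags into its visibility set, so the returned count is justified by exactly the observed pushes.

    class Node:
        def __init__(self, data=None):
            self.data = data
            self.next = None
            self.nextTags = set()   # pushes that wrote this node's next field

    def size(head, o, emit):
        """Sketch of the monotonic size: head is a sentinel node; emitting
        ('vis', o, p) adds push p to o's visibility set."""
        count, cur = 0, head
        while cur.next is not None:
            for o_push in cur.nextTags:   # writers of the link being followed
                emit(('vis', o, o_push))
            count += 1
            cur = cur.next
        return count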
Results Figure 3 provides a summary of our case studies. We separate the table into
sections, one for each case study, and a common section at the top that contains the
common theories of sets and sequences and our encoding of the heap. In each case study
section, we separate the definitions of the atomic specification of the ADT (which can
be reused for other implementations) from the code and proof of the implementation
we consider. For each resulting module, we list the number of lines of code, lines of
proof, total lines, and civl’s verification time in seconds. Experiments were conducted
on an Intel Core i7-4470 3.4 GHz 8-core machine with 16GB RAM.
Our two case studies are representative of the weakly-consistent behaviors exhibited
by all the Java concurrent objects studied in [13], both those using fixed-size arrays
and those using dynamic memory. As civl does not directly support dynamic memory
and other Java language features, we were forced to make certain simplifications
to the algorithms in our verification effort. However, the assumptions we make are
orthogonal to the reasoning and proof of weak consistency of the monotonic methods.
The underlying algorithm used by, and hence the proof argument for monotonicity
of, hash map’s has method is the same as that in the other monotonic hash map
operations such as elements, entrySet, and toString. Similarly, the argument used
for the queue’s size can be adapted to other monotonic ConcurrentLinkedQueue
and LinkedTransferQueue operations like toArray and toString. Thus, our proofs
carry over to the full versions of the implementations as the key invariants linking the
memory tags and visibility sets to the specification state are the same.
In addition, civl does not currently have any support for inferring the preconditions
of each atomic block, which currently accounts for most of the lines of proof in our case
studies. However, these problems have been studied and solved in other tools [30, 39],
and in theory can be integrated with civl in order to simplify these kinds of proofs.
6 Related Work
This work develops the first verification methodology for weakly-consistent operations
using sequential specifications and forward simulation, thus reusing existing sequential
ADT specifications and enabling simple reasoning, i.e., without prophecy [1] or back-
ward simulation [25]. This paper demonstrates the application of our methodology to
absolute and monotonic methods on sequentially-consistent memory, as these are the
consistency levels demonstrated in actual Java implementations of which we are aware.
Our formalization is general, and also applicable to the other visibility relaxations,
e.g., the peer and weak visibilities [13], and weaker memory models, e.g., the Java
memory model.
Extrapolating, we speculate that handling other visibilities amounts to adding anno-
tations and auxiliary state which mirrors inter-operation communication. For example,
while monotonic operations on shared-memory implementations observe mutating
linearization-order predecessors – corresponding to a sequence of shared-memory up-
dates – causal operations with message-passing based implementations would observe
operations whose messages have (transitively) propagated. The corresponding anno-
tations may require auxiliary state to track message propagation, similar in spirit to
the getModLin() auxiliary state that tracks mutating linearization-order predecessors
(§4). Since weak memory models essentially alter the mechanics of inter-operation
communication, the corresponding visibility annotations and auxiliary state may simi-
larly reflect this communication. Since this communication is partly captured by the
denotations of memory commands (§3.2), these denotations would be modified, e.g., to
include not one value and tag per memory location, but multiple. While variations are
possible depending on the extent to which the proof of a given implementation relies
on the details of the memory model, in the worst case the auxiliary state could capture
an existing memory model (e.g., operational) semantics exactly.
As with systematic or automated linearizability-proof methodologies, our proof
methodology is susceptible to two potential sources of incompleteness. First, as men-
tioned in Section 3, methodologies like ours based on forward simulation are only
complete when specifications are return-value deterministic. However, data types are
typically designed to be return-value deterministic and this source of incompleteness
does not manifest in practice.
Second, methodologies like ours based on annotating program commands, e.g., with
linearization points, are generally incomplete since the consistency mechanism em-
ployed by any given implementation may not admit characterization according to a
given static annotation scheme; the Herlihy-Wing Queue, whose linearization points
depend on the results of future actions, is a prototypical example [18]. Likewise, our
systematic strategy for annotating implementations with lin and vis commands (§3)
can fail to prove consistency of future-dependent operations. However, we have yet
to observe any practical occurrence of such exotic objects; our strategy is sufficient
for verifying the weakly-consistent algorithms implemented in the Java development
kit. As a theoretical curiosity for future work, investigating the potential for complete
annotation strategies would be interesting, e.g., for restricted classes of data types
and/or implementations.
Lemma 2. A weak-visibility specification and its transition system have identical histo-
ries.
Proof. It follows almost immediately that the abstract executions of W^s are identical to those of W, since W^s's state effectively records the abstract execution of a given
AETS execution, and only enables those returns that are consistent with W . Since
histories are the projections of abstract executions, the corresponding history sets are
also identical.
Theorem 1. A witnessing implementation I is consistent with a weak-visibility specifi-
cation W if the transition system W^s of W simulates I.
Proof. This follows from standard arguments, given that the corresponding SLTSs
include ε transitions to ensure that every move of one system can be matched by
stuttering from the other: since both systems synchronize on the call, ret, hb, lin, and
vis actions, the simulation guarantees that every abstract execution, and thus history,
of I is matched by one of W^s. Then by Lemma 2, the histories of I are included in W.
Theorem 3. Rchm is a simulation relation from Ichm to Wm^s.
Proof Sketch. We show that every step of the implementation, i.e., an atomic section or a program command, is simulated by W^s_m. Given ⟨q, e⟩ ∈ R_chm, we consider the different implementation steps which are possible in q.
The case of commands corresponding to procedure calls of put and get is trivial. Executing a procedure call in q leads to a new state q′ which differs only by having a new active operation o. We have that e −call(o,_,_)→ e′ and ⟨q′, e′⟩ ∈ R_chm, where e′ is obtained from e by adding o with an appropriate value of inv(o) and an empty visibility.
The transition corresponding to the atomic section of put is labeled by a sequence of visibility actions (one for each linearized operation) followed by a linearization action. Let σ denote this sequence of actions. This transition leads to a state q′ where the array table may have changed (unless the same value was written), and the history variable lin is extended with the put operation o executing this step. We define an abstract execution e′ from e by changing lin to the new value of lin, and defining an absolute visibility for o. We have that e −σ→ e′ because e′ is consistent with W_m. Also, ⟨q′, e′⟩ ∈ R_chm because the validity of (3), (4), and (5) follows directly from the definition
of e′. The atomic section of get can be handled in a similar way. The simulation of return actions of get operations is a direct consequence of point (6), which ensures consistency with S.
For has, we focus on the atomic sections containing vis commands and the linearization commands (the other internal steps are simulated by steps of W^s_m, and the simulation of the return step follows directly from (7), which justifies the consistency of the return value). The atomic section around the procedure call corresponds to a transition labeled by a sequence σ of visibility actions (one for each linearized modifier) and leads to a state q′ with a new active has operation o (compared to q). We have that e −σ→ e′ because e′ is consistent with W_m. Indeed, the visibility of o in e′ is not constrained since o has not been linearized, and the W_m-consistency of e′ follows from the W_m-consistency of e. Also, ⟨q′, e′⟩ ∈ R_chm because index(o) = −1 and (7) is clearly valid. The atomic section around the read of table[k] is simulated by W^s_m in a similar way, noticing that (7) models precisely the effect of the visibility commands inside this atomic section. For the simulation of the linearization commands, it is important to notice that any active has operation in e′ has a visibility that contains all modifiers which returned before it was called; as explained above, this visibility is monotonic.
References
[1] Abadi, M., Lamport, L.: The existence of refinement mappings. Theor. Comput.
Sci. 82(2), 253–284 (1991)
[2] Abdulla, P.A., Haziza, F., Holík, L., Jonsson, B., Rezine, A.: An integrated specifica-
tion and verification technique for highly concurrent data structures. STTT 19(5),
549–563 (2017)
[3] Amit, D., Rinetzky, N., Reps, T.W., Sagiv, M., Yahav, E.: Comparison under abstrac-
tion for verifying linearizability. In: CAV. Lecture Notes in Computer Science,
vol. 4590, pp. 477–490. Springer (2007)
[4] Blom, S., Darabi, S., Huisman, M., Oortwijn, W.: The VerCors tool set: Verification
of parallel and concurrent software. In: IFM. Lecture Notes in Computer Science,
vol. 10510, pp. 102–110. Springer (2017)
[5] Bouajjani, A., Emmi, M., Enea, C., Hamza, J.: On reducing linearizability to state
reachability. Inf. Comput. 261(Part), 383–400 (2018)
[6] Bouajjani, A., Emmi, M., Enea, C., Mutluergil, S.O.: Proving linearizability using
forward simulations. In: CAV (2). Lecture Notes in Computer Science, vol. 10427,
pp. 542–563. Springer (2017)
[7] Burckhardt, S., Gotsman, A., Yang, H., Zawirski, M.: Replicated data types: specifi-
cation, verification, optimality. In: POPL. pp. 271–284. ACM (2014)
[8] Chakraborty, S., Henzinger, T.A., Sezgin, A., Vafeiadis, V.: Aspect-oriented lin-
earizability proofs. Logical Methods in Computer Science 11(1) (2015)
[9] Delbianco, G.A., Sergey, I., Nanevski, A., Banerjee, A.: Concurrent data structures
linked in time. In: ECOOP. LIPIcs, vol. 74, pp. 8:1–8:30. Schloss Dagstuhl - Leibniz-
Zentrum fuer Informatik (2017)
[10] Derrick, J., Dongol, B., Schellhorn, G., Tofan, B., Travkin, O., Wehrheim, H.: Qui-
escent consistency: Defining and verifying relaxed linearizability. In: FM. Lecture
Notes in Computer Science, vol. 8442, pp. 200–214. Springer (2014)
[11] Dongol, B., Jagadeesan, R., Riely, J., Armstrong, A.: On abstraction and composi-
tionality for weak-memory linearisability. In: VMCAI. Lecture Notes in Computer
Science, vol. 10747, pp. 183–204. Springer (2018)
[12] Dragoi, C., Gupta, A., Henzinger, T.A.: Automatic linearizability proofs of con-
current objects with cooperating updates. In: CAV. Lecture Notes in Computer
Science, vol. 8044, pp. 174–190. Springer (2013)
[13] Emmi, M., Enea, C.: Weak-consistency specification via visibility relaxation.
PACMPL 3(POPL), 60:1–60:28 (2019)
[14] Haas, A., Henzinger, T.A., Holzer, A., Kirsch, C.M., Lippautz, M., Payer, H., Sezgin,
A., Sokolova, A., Veith, H.: Local linearizability for concurrent container-type
data structures. In: CONCUR. LIPIcs, vol. 59, pp. 6:1–6:15. Schloss Dagstuhl -
Leibniz-Zentrum fuer Informatik (2016)
[15] Hawblitzel, C., Petrank, E.: Automated verification of practical garbage collectors.
Logical Methods in Computer Science 6(3) (2010)
[16] Hawblitzel, C., Petrank, E., Qadeer, S., Tasiran, S.: Automated and modular refine-
ment reasoning for concurrent programs. In: CAV (2). Lecture Notes in Computer
Science, vol. 9207, pp. 449–465. Springer (2015)
[17] Henzinger, T.A., Kirsch, C.M., Payer, H., Sezgin, A., Sokolova, A.: Quantitative
relaxation of concurrent data structures. In: POPL. pp. 317–328. ACM (2013)
[18] Herlihy, M., Wing, J.M.: Linearizability: A correctness condition for concurrent
objects. ACM Trans. Program. Lang. Syst. 12(3), 463–492 (1990)
[19] Jones, C.B.: Specification and design of (parallel) programs. In: IFIP Congress. pp.
321–332. North-Holland/IFIP (1983)
[20] Jung, R., Krebbers, R., Jourdan, J., Bizjak, A., Birkedal, L., Dreyer, D.: Iris from the
ground up: A modular foundation for higher-order concurrent separation logic. J.
Funct. Program. 28, e20 (2018)
[21] Khyzha, A., Dodds, M., Gotsman, A., Parkinson, M.J.: Proving linearizability using
partial orders. In: ESOP. Lecture Notes in Computer Science, vol. 10201, pp. 639–
667. Springer (2017)
[22] Lahav, O., Vafeiadis, V.: Owicki-Gries reasoning for weak memory models. In:
ICALP (2). Lecture Notes in Computer Science, vol. 9135, pp. 311–323. Springer
(2015)
[23] Leino, K.R.M.: Dafny: An automatic program verifier for functional correctness.
In: LPAR (Dakar). Lecture Notes in Computer Science, vol. 6355, pp. 348–370.
Springer (2010)
[24] Liang, H., Feng, X.: Modular verification of linearizability with non-fixed lineariza-
tion points. In: PLDI. pp. 459–470. ACM (2013)
[25] Lynch, N.A., Vaandrager, F.W.: Forward and backward simulations: I. untimed
systems. Inf. Comput. 121(2), 214–233 (1995)
[26] Michael, M.M., Scott, M.L.: Simple, fast, and practical non-blocking and blocking
concurrent queue algorithms. In: PODC. pp. 267–275. ACM (1996)
[27] Moskal, M., Lopuszanski, J., Kiniry, J.R.: E-matching for fun and profit. Electr.
Notes Theor. Comput. Sci. 198(2), 19–35 (2008)
[28] O’Hearn, P.W.: Resources, concurrency and local reasoning. In: CONCUR. Lecture
Notes in Computer Science, vol. 3170, pp. 49–67. Springer (2004)
[29] Owicki, S.S., Gries, D.: Verifying properties of parallel programs: An axiomatic
approach. Commun. ACM 19(5), 279–285 (1976)
[30] Piskac, R., Wies, T., Zufferey, D.: GRASShopper – complete heap verification with
mixed specifications. In: TACAS. Lecture Notes in Computer Science, vol. 8413,
pp. 124–139. Springer (2014)
[31] Raad, A., Doko, M., Rozic, L., Lahav, O., Vafeiadis, V.: On library correctness under
weak memory consistency: specifying and verifying concurrent libraries under
declarative consistency models. PACMPL 3(POPL), 68:1–68:31 (2019)
[32] Reynolds, J.C.: Separation logic: A logic for shared mutable data structures. In:
LICS. pp. 55–74. IEEE Computer Society (2002)
[33] Schellhorn, G., Wehrheim, H., Derrick, J.: How to prove algorithms linearisable. In:
CAV. Lecture Notes in Computer Science, vol. 7358, pp. 243–259. Springer (2012)
[34] Sergey, I., Nanevski, A., Banerjee, A.: Mechanized verification of fine-grained
concurrent programs. In: PLDI. pp. 77–87. ACM (2015)
[35] Sergey, I., Nanevski, A., Banerjee, A., Delbianco, G.A.: Hoare-style specifications
as correctness conditions for non-linearizable concurrent objects. In: OOPSLA.
pp. 92–110. ACM (2016)
[36] Sofronie-Stokkermans, V.: Hierarchic reasoning in local theory extensions. In:
CADE. Lecture Notes in Computer Science, vol. 3632, pp. 219–234. Springer (2005)
[37] Vafeiadis, V.: Shape-value abstraction for verifying linearizability. In: VMCAI.
Lecture Notes in Computer Science, vol. 5403, pp. 335–348. Springer (2009)
[38] Vafeiadis, V.: Automatically proving linearizability. In: CAV. Lecture Notes in
Computer Science, vol. 6174, pp. 450–464. Springer (2010)
[39] Vafeiadis, V.: RGSep action inference. In: VMCAI. Lecture Notes in Computer
Science, vol. 5944, pp. 345–361. Springer (2010)
[40] Wadler, P.: Linear types can change the world! In: Programming Concepts and
Methods. p. 561. North-Holland (1990)
[41] Zhu, H., Petri, G., Jagannathan, S.: Poling: SMT aided linearizability proofs. In:
CAV (2). Lecture Notes in Computer Science, vol. 9207, pp. 3–19. Springer (2015)
Local Reasoning for Global Graph Properties
Abstract. Separation logics are widely used for verifying programs that manipu-
late complex heap-based data structures. These logics build on so-called separation
algebras, which allow expressing properties of heap regions such that modifica-
tions to a region do not invalidate properties stated about the remainder of the heap.
This concept is key to enabling modular reasoning and also extends to concurrency.
While heaps are naturally related to mathematical graphs, many ubiquitous graph
properties are non-local in character, such as reachability between nodes, path
lengths, acyclicity and other structural invariants, as well as data invariants which
combine with these notions. Reasoning modularly about such graph properties
remains notoriously difficult, since a local modification can have side-effects on a
global property that cannot be easily confined to a small region.
In this paper, we address the question: What separation algebra can be used to
avoid proof arguments reverting back to tedious global reasoning in such cases?
To this end, we consider a general class of global graph properties expressed as
fixpoints of algebraic equations over graphs. We present mathematical foundations
for reasoning about this class of properties, imposing minimal requirements on the
underlying theory that allow us to define a suitable separation algebra. Building
on this theory, we develop a general proof technique for modular reasoning about
global graph properties expressed over program heaps, in a way which can be
directly integrated with existing separation logics. To demonstrate our approach,
we present local proofs for two challenging examples: a priority inheritance
protocol and the non-blocking concurrent Harris list.
1 Introduction
Separation logic (SL) [31,37] provides the basis of many successful verification tools that
can verify programs manipulating complex data structures [1, 4, 17, 29]. This success is
due to the logic’s support for reasoning modularly about modifications to heap-based data.
For simple inductive data structures such as lists and trees, much of this reasoning can
be automated [2, 11, 20, 33]. However, these techniques often fail when data structures
are less regular (e.g. multiple overlaid data structures) or provide multiple traversal
patterns (e.g. threaded trees). Such idioms are prevalent in real-world implementations
such as the fine-grained concurrent data structures found in operating systems and
databases. Solutions to these problems have been proposed [14] but remain difficult to
automate. For proofs of general graph algorithms, the situation is even more dire. Despite
substantial improvements in the verification methodology for such algorithms [35, 38],
significant parts of the proof argument still typically need to be carried out using non-
local reasoning [7, 8, 13, 25]. This paper presents a general technique for local reasoning
1 method acquire(p: Node, r: Node) {
2   if (r.next == null) {
3     r.next := p; update(p, -1, r.curr_prio)
4   } else {
5     p.next := r; update(r, -1, p.curr_prio)
6   }
7 }
8 method update(n: Node, from: Int, to: Int) {
9   n.prios := n.prios \ {from}
10   if (to >= 0) n.prios := n.prios ∪ {to}
11   from := n.curr_prio
12   n.curr_prio := max(n.prios ∪ {n.def_prio})
13   to := n.curr_prio;
14   if (from != to && n.next != null) {
15     update(n.next, from, to)
16   }
17 }
[Graph panel of Fig. 1: a bipartite graph over processes p1–p7 and resources r1–r4, each node labelled with its default priority, its prios multiset, and its current priority.]
Fig. 1: Pseudocode of the PIP and a state of the protocol data structure. Round nodes
represent processes and rectangular nodes resources. Nodes are marked with their default
priorities def_prio as well as the aggregate priority multiset prios. A node’s current
priority curr_prio is underlined and marked in bold blue.
about global graph properties that can be used within off-the-shelf separation logics.
We demonstrate our technique on two challenging examples: one for which no fully
local proof existed before, and one whose previous proof required a tailor-made logic.
As a motivating example, we consider an idealized priority inheritance protocol (PIP),
a technique used in process scheduling [39]. The purpose of the protocol is to avoid
priority inversion, i.e. a situation where a low-priority process causes a high-priority
process to be blocked. The protocol maintains a bipartite graph with nodes representing
processes and resources. An example graph is shown in Fig. 1. An edge from a process
p to a resource r indicates that p is waiting for r to be available whereas an edge in
the other direction means that r is currently held by p. Every node has an associated
default priority and a current priority; both are natural numbers. The current priority is used for
scheduling processes. When a process attempts to acquire a resource currently held by
another process, the graph is updated to avoid priority inversion. For example, when
process p1 with current priority 3 attempts to acquire the resource r1 held by process
p2 of priority 1, p1 ’s higher priority is propagated to p2 and, transitively, to any other
process that p2 is waiting for (p3 in this case). As a result, all nodes on the created cycle³
will get current priority 3. The protocol maintains the following invariant: the current
priority of each node is the maximum of its default priority and the current priorities of
all its predecessors. Priority propagation is implemented by the method update shown
in Fig. 1. The implementation represents graph edges by next pointers and handles both
adding an edge (acquire) and removing one (release - code omitted). To recalculate
the current priority of a node (line 12), each node maintains its default priority def_prio
and a multiset prios which contains the priorities of all its immediate predecessors.
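To make the propagation step concrete, the following is a minimal executable sketch of update, assuming a hypothetical Node class with the fields of Fig. 1 and modelling multisets as Python Counters; it illustrates the invariant-restoring recursion, not a verified artifact.

from collections import Counter

class Node:
    def __init__(self, def_prio):
        self.def_prio = def_prio      # fixed default priority
        self.prios = Counter()        # multiset of predecessors' current priorities
        self.curr_prio = def_prio     # invariant: max(prios plus {def_prio})
        self.next = None              # unique outgoing edge, if any

def update(n, frm, to):
    # Mirror lines 9-16 of Fig. 1: swap the old contribution for the new one.
    if frm >= 0:
        n.prios -= Counter({frm: 1})  # n.prios \ {from}
    if to >= 0:
        n.prios += Counter({to: 1})   # n.prios ∪ {to}
    frm = n.curr_prio
    n.curr_prio = max(list(n.prios.elements()) + [n.def_prio])
    to = n.curr_prio
    if frm != to and n.next is not None:  # propagate only on change
        update(n.next, frm, to)

With this in hand, acquire(p, r) from Fig. 1 reduces to the appropriate next assignment followed by a call to update with -1 as the removed contribution.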
Verifying that the PIP maintains its invariant using established separation logic (SL)
techniques is challenging. In general, SL assertions describe resources and express the
fact that the program has permission to access and manipulate these resources. In what
³ The cycle can be used to detect/handle a deadlock; this is not the concern of this data structure.
follows, we stick to the standard model of SL where resources are memory regions
represented as partial heaps. We sometimes view partial heaps more abstractly as partial
graphs (hereafter, simply graphs). Assertions describing larger regions are built from
smaller ones using separating conjunction, φ1 ∗ φ2 . Semantically, the ∗ operator is tied to
a notion of resource composition defined by an underlying separation algebra [5, 6]. In
the standard model, composition enforces that φ1 and φ2 must describe disjoint regions.
The logic and algebra are set up so that changes to the region φ1 do not affect φ2 (and
vice versa). That is, if φ1 ∗ φ2 holds before the modification and φ1 is changed to φ1 ,
then φ1 ∗ φ2 holds afterwards. This so-called frame rule enables modular reasoning
about modifications to the heap and extends well to the concurrent setting when threads
operate on disjoint portions of memory [3, 9, 10, 36]. However, the mere fact that φ2 is
preserved by modifications to φ1 does not guarantee that if a global property such as the
PIP invariant holds for φ1 ∗ φ2, it also still holds for φ1′ ∗ φ2.
For example, consider the PIP scenario depicted in Fig. 1. If φ1 describes the
subgraph containing only node p1, φ2 the remainder of the graph, and φ1′ the graph
obtained from φ1 by adding the edge from p1 to r1, then the PIP invariant will no longer
hold for the new composed graph described by φ1′ ∗ φ2. On the other hand, if φ1 captures
p1 and the nodes reachable from r1 (i.e., the set of nodes modified by update), φ2 the
remainder of the graph, and we reestablish the PIP invariant locally in φ1, obtaining φ1′
(i.e., run update to completion), then φ1′ ∗ φ2 will also globally satisfy the PIP invariant.
The separating conjunction ∗ is not sufficient to differentiate these two cases; both
describe valid partitions of a possible program heap. As a consequence, prior techniques
have to revert back to non-local reasoning to prove that the invariant is maintained.
A first helpful idea towards a solution to this problem is that of iterated separating
conjunction [30, 44], which describes a graph G consisting of a set of nodes X by a
formula Ψ = ∗_{x∈X} N(x), where N(x) is some predicate that holds locally for every
node x ∈ X. Using such node-local conditions one can naturally express non-inductive
properties of graphs (e.g. “G has no outgoing edges” or “G is bipartite”). The advan-
tages of this style of specification are two-fold. First, one can arbitrarily decompose
and recompose Ψ by splitting X into disjoint subsets. For example, if X is partitioned
into X1 and X2, then Ψ is equivalent to ∗_{x∈X1} N(x) ∗ ∗_{x∈X2} N(x). Moreover, it is
very easy to prove that Ψ is preserved under modifications of subgraphs. For instance,
if a program modifies the subgraph induced by X1 such that ∗_{x∈X1} N(x) is preserved
locally, then the frame rule guarantees that Ψ will be preserved in the new larger graph.
Iterated separating conjunction thus yields a simple proof technique for local reasoning
about graph properties that can be described in terms of node-local conditions. However,
this idea alone does not actually solve our problem because general global graph proper-
ties such as “G is a directed acyclic graph”, “G is an overlay of multiple trees”, or “G
satisfies the PIP invariant” cannot be directly described via node-local conditions.
Solution. The key ingredient of our approach is the concept of a flow of a graph: a
function fl from the nodes of the graph to flow values. For the PIP, the flow maps
each node to the multiset of its incoming priorities. In general, a flow is a fixpoint of
a set of algebraic equations induced by the graph. These equations are defined over a
flow domain, which determines how flow values are propagated along the edges of the
graph and how they are aggregated at each node. In the PIP example, an edge between
nodes (n, n′) propagates the multiset containing max(fl(n), n.def_prio) from n to
n′. The multisets arriving at n′ are aggregated with multiset union to obtain fl(n′).
Flows enable capturing global graph properties in terms of node-local conditions. For
example, the PIP invariant can be expressed by the following node-local condition:
n.curr_prio = max(fl (n), n.def_prio). To enable compositional reasoning about
such properties we need an appropriate separation algebra allowing us to prove locally
that modifications to a subgraph do not affect the flow of the remainder of the graph.
To this end, we make the useful observation that a separation algebra induces a
notion of an interface of a resource: we say that two resources a and a′ are equivalent
if they compose with the same resources. The interface of a resource a could then be
defined as a's equivalence class, but more succinct and simpler representations may be
possible. In the standard model of SL where resources are graphs and composition is
disjoint graph union, the interface of a graph G is the set of all graphs G′ that have the
same domain as G; in this model, a graph's domain could be defined to be its interface.
The interfaces of resources described by assertions capture the information that is
implicitly communicated when these assertions are conjoined by separating conjunction.
As we discussed earlier, in the standard model of SL, this information is too weak to
enable local reasoning about global properties of the composed graphs because some
additional information about the subgraphs’ structure other than which nodes they
contain must be communicated. For instance, if the goal is to verify the PIP invariant, the
interfaces must capture information about the multisets of priorities propagated between
the subgraphs. We define a separation algebra achieving exactly this: the induced flow
interface of a graph G in this separation algebra captures how values of the flow domain
must enter and leave G such that, when composed with a compatible graph G′, the
imposed local conditions on the flow of each node are satisfied in the composite graph.
This is the key to enabling SL-style framing for global graph properties. Using iter-
ated separating conjunctions over the new separation algebra, we obtain a compositional
proof technique that yields succinct proofs of programs such as the PIP, whose proofs
with existing techniques would involve non-trivial global reasoning steps.
Flows Redesigned. Our work is inspired by the recent flow framework explored by
some of the authors [22], but was redesigned from the ground up. We revisit the core
algebra behind flow reasoning, and derive a different algebraic foundation by analysing
the minimal requirements for general local reasoning; we call our newly-designed
reasoning framework the foundational flow framework. Our new framework makes
several significant improvements over [22] and eliminates its most stark limitations. We
provide a detailed technical comparison with [22] and discuss other related work in §5.
2.2 Flows
Recursive properties of graphs naturally depend on non-local information; e.g. we cannot
express that a graph is acyclic directly as a conjunction of per-node invariants. Our
foundational flow framework defines flow values at each node that capture non-local
graph properties, and enables local specification and reasoning about such properties.
Flow values are drawn from a flow domain, an algebraic structure which also specifies
the operations used to define a flow via recursive computations over the graph. Our
entire theory is parametric with the choice of a flow domain, whose components will be
explained and motivated in the rest of this section.
Example 1. The path-counting flow domain is (ℕ, +, 0, {λid, λ0}), consisting of the
monoid of natural numbers under addition and the set of edge functions containing only
the identity function λid and the constant zero function λ0. This can be used to define a flow where the
values at each node represent the number of paths to this node from a distinguished node
n. Path-counting provides enough information to express locally per node that e.g. (a)
all nodes are reachable from n (all path counts are non-zero), or (b) that the graph forms
a tree rooted at n (all path counts are exactly 1).
Example 2. We use (ℕ^ℕ, ∪, ∅, {λ0} ∪ {(λm. {max(m ∪ {p})}) | p ∈ ℕ}) as the flow do-
main for the PIP example (Figure 1). This consists of the monoid of multisets of natural
numbers under multiset union and two kinds of edge functions: λ0 and functions map-
ping a multiset m to the singleton multiset containing the maximum value between m
and a fixed value p (used to represent a node’s default priority). This can define a flow
which locally captures the appropriate current node priorities as the graph is modified.
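For concreteness, the two flow domains above can be written down as ordinary data; the record layout below is our own convention (not notation from the paper), with multisets modelled as Python Counters.

from collections import Counter

# A flow domain packages a commutative monoid (zero, plus) with a set of
# permitted edge functions; edge functions map flow values to flow values.

# Example 1: path counting over (N, +, 0, {lam_id, lam_0}).
path_counting = dict(
    zero=0,
    plus=lambda a, b: a + b,
    lam_id=lambda m: m,             # an edge that forwards the path count
    lam_0=lambda m: 0,              # absence of an edge
)

# Example 2: the PIP domain -- multisets of naturals under multiset union.
def pip_edge(p):
    # Edge function for a source node with default priority p.
    return lambda m: Counter({max(list(m.elements()) + [p]): 1})

pip_domain = dict(
    zero=Counter(),
    plus=lambda a, b: a + b,        # Counter + is multiset union
    lam_0=lambda m: Counter(),
    edge_for_default_prio=pip_edge,
)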
Further definitions in this section assume a fixed flow domain (M, +, 0, E) and a
(potentially infinite) universe of nodes 𝔑. For this section, we abstract heaps using directed
partial graphs; integration of our graph reasoning with direct proofs over program heaps
is explained in §3.
Flow Values and Flows. Flow values (taken from M ; the first element of a flow domain)
are used to capture sufficient information to express desired non-local properties of a
graph. In Example 1, flow values are non-negative integers; for the PIP (Example 2)
we instead use multisets of integers, representing relevant non-local information: the
priorities of nodes currently referencing a given node in the graph. Given such flow values,
a node’s correct priority can be defined locally per node in the graph. This definition
requires only the maximum value of these multisets, but as we will see shortly these
multisets enable local recomputation of a correct priority when the graph is changed.
For a graph G = (N, e) we express properties of G in terms of node-local conditions
that may depend on the nodes’ flow. A flow is a function fl : N → M assigning every
node a flow value and must be some fixpoint of the following flow equation:
fl (n ) e(n , n)
∀n ∈ N. fl (n) = in(n) + (FlowEqn)
n ∈N
314 S. Krishna et al.
Intuitively, one can think of the flow as being obtained by a fold computation over the
graph:⁴ the inflow in : N → M defines an initial flow at each node. This initial flow
is then updated recursively for each node n: the current flow value at each predecessor
node n′ is transferred to n via the edge function e(n′, n) : M → M. These flow values are
aggregated using the summation operation + of the flow domain to obtain an updated
flow of n; a flow for the graph is some fixpoint satisfying this equation at all nodes.⁵
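As a sanity check, (FlowEqn) can be iterated directly in benign cases; the sketch below (our own illustration) instantiates naive fixpoint iteration with the path-counting domain of Example 1 on a small acyclic graph of our choosing, where it converges.

def iterate_flow(nodes, edge, inflow, plus, rounds=100):
    # Naive Kleene iteration of (FlowEqn); converges here because the
    # example graph is acyclic.
    fl = dict(inflow)
    for _ in range(rounds):
        new = {}
        for n in nodes:
            acc = inflow[n]
            for m in nodes:
                acc = plus(acc, edge(m, n)(fl[m]))
            new[n] = acc
        fl = new
    return fl

# Path counting from root r on edges r->a, r->b, a->b.
edges = {("r", "a"), ("r", "b"), ("a", "b")}
edge = lambda m, n: (lambda x: x) if (m, n) in edges else (lambda x: 0)
fl = iterate_flow(["r", "a", "b"], edge, {"r": 1, "a": 0, "b": 0},
                  plus=lambda a, b: a + b)
assert fl == {"r": 1, "a": 1, "b": 2}   # b is reachable via two paths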
Definition 3 (Flow Graph). A flow graph H = (N, e, fl ) is a graph (N, e) and function
fl : N → M such that there exists an inflow in : N → M satisfying FlowEqn(in, e, fl ).
We let dom(H) = N , and sometimes identify H and dom(H) to ease notational
burden. For n ∈ H we write Hn for the singleton flow subgraph of H induced by n.
Edge Functions. In any flow graph, the flow value assigned to a node n by a flow
is propagated to its neighbours n′ (and transitively onwards) according to the edge function
e(n, n′) labelling the edge (n, n′). The edge function maps the flow value at the source
node n to the value propagated on this edge to the target node n′. Note that we require such
a labelling for all pairs consisting of a source node n inside the graph and a target
node n′ ∈ 𝔑 (i.e., possibly outside the graph). The 0 flow value (the third element
of our flow domains) is used to represent no flow; the corresponding (constant) zero
function λ0 = (λm. 0) is used as edge function to model the absence of an edge in the
graph. A set of edge functions E from which this labelling is chosen can, other than
the requirement λ0 ∈ E, be chosen as desired. As we will see in §4.4, restrictions to
particular sets of edge functions E can be exploited to further strengthen our overall
technique. Edge functions can depend on the local state of the source node (as in the
following example); dependencies from elsewhere in the graph must be represented by
the node’s flow.
Example 3. Consider the graph in Figure 1 and the flow domain as in Example 2. We
choose the edge functions to be λ0 where no edge exists in the PIP structure, and other-
wise (λm. {max(m ∪ {d})}) where d is the default priority of the source of the edge.
For example, in Figure 1, e(r3 , p2 ) = λ0 and e(r3 , p1 ) = (λm. {max(m ∪ {0})}).
Since the flow value at r3 is {1, 2, 2}, the edge (r3 , p1 ) propagates the value {2} to p1 ,
correctly representing the current priority of r3 .
Flow Aggregation and Inflows. The flow value at a node is defined by those propagated
to it from each node in a graph via edge functions, along with an additional inflow value
explained here. Since multiple non-zero flow values can be propagated to a node, we
require an aggregation of these values via a binary + operator on flow values: the second
element of our flow domains. The edges from which the aggregated values originate
are unordered. Thus, we require + to be commutative and associative, making this
aggregation order-independent. The 0 flow value must act as a unit for +. For example,
in the path-counting flow domain + means addition on natural numbers, while for the
multisets employed for the PIP it means multiset union.
⁴ We note that flows are not generally defined in this manner, as we consider any fixpoint of the flow equation to be a flow. Nonetheless, the analogy helps to build an initial intuition.
⁵ We discuss questions regarding the existence and uniqueness of such fixpoints in §4.
Each node in a flow graph has an inflow, modelling contributions to its flow value
which do not come from inside the graph. Inflows play two important roles: first, since
our graphs are partial, they model contributions from nodes outside of the graph. Second,
inflow can be artificially added as a means of specialising the computation of flow values
to characterise specific graph properties. For example, in the path-counting domain, we
give an inflow of 1 to the node from which we are counting paths, and 0 to all others.
Example 4. Let the edges in the graph in Figure 1 be labelled as described in Example 3.
If the inflow function in assigns the empty multiset to every node n and we let fl (n) be
the multiset labelling every node in the figure, then FlowEqn(in, e, fl ) holds.
The flow equation (FlowEqn) defines the flow of a node n to be the aggregation of
flow values coming from other nodes n′ inside the graph (as given by the respective edge
functions e(n′, n)) as well as the inflow in(n). Preserving solutions to this equation across
updates to the graph structure is a fundamental goal of our technique. The following
lemma (which relies on the fact that + is required to be cancellative) states that any
correct flow values uniquely determine appropriate inflow values:
Lemma 1. Given a flow graph (N, e, fl ), there exists a unique inflow in such that
FlowEqn(in, e, fl ).
We now turn to how solutions of the flow equation can be preserved or appropriately
updated under changes to the underlying graph.
Graph Updates and Cancellativity. Given a flow graph with known flow and inflow
values, suppose we remove an edge from n1 to n2 (replacing the edge function with
λ0). For the same inflow, such an update will potentially affect the flow at n2 and the nodes
to which n2 (transitively) propagates flow. Starting from the simple case that n2 has
no outgoing edges, we need to recompute a suitable flow at n2. Knowing the old flow
value (say, m) and the contribution m′ = e(n1, n2)(fl(n1)) previously provided along
the removed edge, we know that the correct new flow value is some m″ such that
m″ + m′ = m. This constraint has a unique solution (and thus, we can unambiguously
recompute a new flow value) exactly when the aggregation + is cancellative; we therefore
make cancellativity a requirement on the + of any flow domain.
Cancellativity intuitively enforces that the flow domain carries enough information
to enable adaptation to local updates (in particular, removal of edges⁶). Returning to the
PIP example, cancellativity requires us to carry multisets as flow values rather than only
the maximum priority value: + cannot be the maximum operation, as this would not be
cancellative. The resulting multisets (like the prio fields in the actual code) provide the
information necessary to recompute corrected priority values locally.
For example, in the PIP graph shown in Figure 1, removing the edge from p6 to
r4 would not affect the current priority of r4 whereas if p7 had current priority 1 instead
of 2, then the current priority of r4 would have to decrease. In either case, recomputing
the flow value for r4 is simply a matter of subtraction (removing {2} from the multiset at
r4 ); cancellativity guarantees that our flow domains will always provide the information
⁶ As we will show in §2.3, an analogous problem for composition of flow graphs is also directly solved by this choice to force aggregation to be cancellative.
needed for this recomputation. Without this property, the recomputation of a flow value
for the target node n2 would, in general, entail recomputing the incoming flow values
from all remaining edges from scratch. Cancellativity is also crucial for Lemma 1 above,
forcing uniqueness of inflows, given known flow values in a flow graph. This allows us
to define natural but powerful notions of flow graph decomposition and recomposition.
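In the PIP domain this recomputation is literal multiset subtraction; the snippet below (our own illustration, again with Counters as multisets) replays the r4 example.

from collections import Counter

def remove_contribution(m, contrib):
    # Solve m'' + contrib = m for m''; the solution is unique because
    # multiset union is cancellative, and is computed by subtraction.
    m2 = m.copy()
    m2.subtract(contrib)
    assert all(c >= 0 for c in m2.values()), "contrib was not part of m"
    return +m2                        # normalise away zero entries

# r4 receives {2, 2} from p6 and p7; removing p6's edge removes one 2.
r4_flow = Counter({2: 2})
r4_flow = remove_contribution(r4_flow, Counter({2: 1}))
assert r4_flow == Counter({2: 1})     # r4's current priority stays 2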
Definition 4 (Flow Graph Algebra). The flow graph algebra (FG, ⊙, H∅) for the flow
domain (M, +, 0, E) is defined by taking H1 ⊙ H2 to be the union of two flow graphs
with disjoint domains whenever that union is itself a flow graph (and undefined otherwise),
with unit H∅ = (∅, e∅, fl∅), where e∅ and fl∅ are the edge functions and flow on the empty set of nodes N = ∅.
Intuitively, two flow graphs compose to a flow graph if their contributions to each
others’ flow (along edges from one to the other) are reflected in the corresponding inflow
of the other graph. For example, consider the subgraph from Figure 1 consisting of
the single node p7 (with 0 inflow). This will compose with the remainder of the graph
depicted only if this remainder subgraph has an inflow which, at node r4 , includes at
least the multiset {2}, reflecting the propagated value from p7 .
We use this intuition to extract an abstraction of flow graphs which we call flow
interfaces. Given a flow (sub)graph, its flow interface consists of the node-wise inflow
and outflow (the flow contributions its nodes make to all nodes outside of the graph,
defined below). It is thus an abstraction that hides the flow values and edges that are
wholly inside the flow graph. Flow graphs that have the same flow interface “look the
same” to the external graph, as the same values are propagated inwards and outwards.
Definition 5 (Flow Interface). For a given flow domain M, a flow interface is a pair
I = (in, out) where in : N → M and out : 𝔑 \ N → M for some N ⊆ 𝔑.
We write I.in, I.out for the two components of the interface I = (in, out). We will
again sometimes identify I and dom(I.in) to ease notational burden.
Given a flow graph H ∈ FG, we can compute its interface as follows. Recall that
Lemma 1 implies that any flow graph has a unique inflow. Thus, we can define an inflow
function that maps each flow graph H = (N, e, fl) to the unique inflow inf(H) : H →
M such that FlowEqn(inf(H), e, fl). Dually, we define the outflow of H as the function
outf(H) : 𝔑 \ N → M defined by outf(H)(n) := Σ_{n′∈N} e(n′, n)(fl(n′)). The flow
interface of H, written int(H), is the pair (inf(H), outf(H)) consisting of its inflow
and its outflow. Returning to the previous example, if H is the singleton subgraph
consisting of node p7 from Figure 1 with flow and edges as depicted, then int(H) =
(λn. ∅, λn. (n=r4 ? {2} : ∅)).
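Computing int(H) is mechanical once flow values are known; the sketch below (our own encoding: a subgraph is a node set with edge functions and a flow map) splits the flow-equation contributions at the subgraph boundary.

def interface(nodes, edge, fl, universe, zero, plus, minus):
    # Return (inflow, outflow) of the flow subgraph induced by `nodes`.
    # `minus` inverts `plus`; it exists uniquely by cancellativity (Lemma 1).
    inflow = {}
    for n in nodes:
        internal = zero
        for m in nodes:
            internal = plus(internal, edge(m, n)(fl[m]))
        inflow[n] = minus(fl[n], internal)     # inf(H)(n)
    outflow = {}
    for n in universe - nodes:                 # nodes and universe are sets
        out = zero
        for m in nodes:
            out = plus(out, edge(m, n)(fl[m]))
        outflow[n] = out                       # outf(H)(n)
    return inflow, outflow

On the singleton {p7} of Figure 1 this would yield an empty inflow and an outflow sending {2} to r4, matching int(H) above.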
This abstraction, while simple, turns out to be powerful enough to build a separation
algebra over our flow graphs, allowing them to be decomposed, locally modified and
recomposed in ways yielding all the local reasoning benefits of separation logics. In
particular, for graph operations within a subgraph with a certain interface, we need to
prove: (a) that the modified subgraph is still a flow graph (by checking that the flow
equation still has a solution locally in the subgraph) and (b) that it satisfies the same
interface (in other words, the effect of the modification on the flow is contained within
the subgraph); the meta-level results for our technique then justify that we can recompose
the modified subgraph with any graph that the original could be composed with.
We define the corresponding flow interface algebra as follows:
Definition 6 (Flow Interface Algebra). For a given flow domain M, the flow interface
algebra over M is defined to be (FI, ⊕, I∅), where ⊕ composes interfaces with disjoint domains whose inflows account for each other's outflows, and I∅ = int(H∅).
Crucially, this flow interface algebra can be shown to be a separation algebra; this makes
flow interfaces an abstraction directly compatible with existing separation logics.
This result forms the core of our reasoning technique; it enables us to make modifi-
cations within a chosen subgraph and, by proving preservation of its interface, know that
the result composes with any context exactly as the original did. Flow interfaces cap-
ture precisely the information relevant about a flow graph, with respect to composition
with other flow graphs. In Appendix B of the accompanying technical report (hereafter,
TR) [23] we provide additional examples of flow domains that demonstrate the range of
data structures and graph properties that can be expressed using flows, including a notion
of universal flow that in a sense provides a completeness result for the expressivity of
the framework. We now turn to constructing proofs atop these new reasoning principles.
3 Proof Technique
This section shows how to integrate flow reasoning into a standard separation logic,
using the priority inheritance protocol (PIP) algorithm to illustrate our proof techniques.
Since flow graphs and flow interfaces form separation algebras, it is possible in
principle to define a separation logic (SL) using these notions as a custom semantic
model (indeed, this is the proof approach taken in [22]). By contrast, we integrate flow
interfaces with a standard separation logic without modifying its semantics. This has
the important technical advantage that our proof technique can be naturally integrated
with existing separation logics and verification tools supporting SL-style reasoning. We
consider a standard sequential SL in this section, but our technique can also be directly
integrated with a concurrent SL such as RGSep (as we show in §4.5) or frameworks such
as Iris [18] supporting (ghost) resources ranging over user-defined separation algebras.
Proofs using our flow framework can employ a combination of specifications enforced
at the node level and in terms of the flow graphs and interfaces corresponding to larger
heap regions such as entire data structures (henceforth, composite graphs and composite
interfaces). At the node level, we write invariants that every node is intended to satisfy,
typically relating the node’s flow value to its local state (fields). For example, in the PIP,
we use node-local invariants to express that a node’s current priority is the maximum of
the node’s default priority and those in its current flow value. We typically express such
specifications in terms of singleton (flow) graphs, and their singleton interfaces.
Specification in terms of composite interfaces has several important purposes. One
is to define custom inflows: e.g. in the path-counting flow domain, specifying that the
inflow of a composite interface is 1 at some designated node r and 0 elsewhere enforces
in any underlying flow graph that each node n’s flow value will be the number of paths
from r to n.⁷ Composite interfaces can also be used to express that, in two states of
execution, a portion of the heap “looks the same” with respect to composition (it has the
same interface, and so can be composed with the same flow graphs), or to capture by
how much there is an observable difference in inflow or outflow; we employ this idea in
the PIP proof below.
We now define an assertion syntax convenient for capturing both node-level and
composite-level constraints, defined within an SL-style proof system. We assume an intu-
itionistic, garbage-collected SL [6] with standard syntax and semantics;⁸ see Appendix A
of the TR [23] for more details.
Node Predicates. The basic building block of our flow-based specifications is a node
predicate N(x, H), representing ownership of the fields of a single node x, as well as
⁷ Note that the analogous property cannot be captured at the node level; when considering singleton interfaces per node in a tree rooted at r, every singleton interface has an inflow of 1.
⁸ As P ∗ φ ≡ P ∧ φ for pure formulas P in garbage-collected SLs, we use ∗ instead of ∧ throughout this paper.
N(x, H) := ∃ fs, fl. x ↦ fs ∗ H = ({x}, (λy. edge(x, fs, y)), fl) ∗ γ(x, fs, fl(x))
N is implicitly parameterised by fs, edge and γ; these are explained next and are typically
fixed across any given flow-based proof. The N predicate expresses that we have a heap
cell at location x containing fields fs (a list of field-name/value mappings).⁹ It also
says that H is a singleton flow graph with domain {x} with some flow fl , whose edge
functions are defined by a user-defined abstraction function edge(x, fs, y); this function
allows us to define edges in terms of x’s field values. Finally, the node, its fields, and
its flow in this flow graph satisfy the custom predicate γ, used to encode node-local
properties such as constraints in terms of the flow values of nodes.
Graph Predicates. The analogous predicate for composite graphs is Gr. It carries ownership of the nodes making up a potentially unbounded graph, using iterated separating conjunction over a set of nodes X as mentioned in §1:
Gr(X, H) := ∃H̄. (∗_{x∈X} N(x, H̄(x))) ∗ H = ⊙_{x∈X} H̄(x)
Lifting to Interfaces. Flow-based proofs can often be expressed more elegantly and
abstractly using predicates in terms of node- and composite-level interfaces rather than
flow graphs. To this end, we overload both our node and graph predicates with analogues
whose second parameter is a flow interface, existentially quantifying over the underlying
flow graph and constraining its interface: N(x, I) := ∃H. N(x, H) ∗ int(H) = I, and
Gr(X, I) := ∃H. Gr(X, H) ∗ int(H) = I.
We will use these versions in the PIP proof below; interfaces capture all relevant proper-
ties for decomposition and composition of these flow graphs.
Flow Lemmas. We first illustrate our N and Gr predicates (which capture SL ownership
of heap regions and abstract these with flow interfaces) by identifying a number of
lemmas which are generically useful in flow-based proofs. Reasoning at the level of flow
interfaces is entirely in the pure world (mathematics independent of heap-ownership and
⁹ For simplicity, we assume that all fields of a flow graph node are to be handled by our flow-based technique, and that their ownership (via ↦ points-to predicates) is always carried around together; lifting these restrictions would be straightforward.
resources) with respect to the underlying SL reasoning; these lemmas are consequences
of our predicate definitions and the foundational flow framework definitions themselves.
Fig. 2: Some useful lemmas for proving entailments between flow-based specifications.
Examples of these lemmas are shown in Figure 2. (Decomp) shows that we can
always decompose a valid flow graph into subgraphs which are themselves flow graphs.
Recomposition (Comp) is possible only if the subgraphs compose. These rules, as well
as (Sing) and (GrEmp), follow directly from the definition of Gr and standard SL prop-
erties of iterated separating conjunction. The final rule (Repl) is a direct consequence of
rules (Comp), (Decomp), and the congruence relation on flow graphs induced by their
interfaces (cf. Lemma 2). Conceptually, it expresses that after decomposing any flow
graph into two parts H1 and H2, we can replace H1 with a new flow graph H1′ with the
same interface; when recomposing, the overall graph will be a flow graph with the same
overall interface.
Note the connection between rules (Comp)/(Decomp) and the algebraic laws of
standard inductive predicates such as ls describing a segment of a linked list [2]. For
instance, by combining the definition of Gr with these rules and (Sing) we can prove the
following graph analogue of the rule to separate a list into the head node and the tail:
However, crucially (and unlike when using general inductive predicates [32]), this rule
is symmetrical for any node x in X; it works analogously for any desired order of
decomposition of the graph, and for any data structure specified using flows.
When working with our overloaded N and Gr predicates, similar steps to those
described by the above lemmas are useful. Given these overloaded predicates, we simply
apply the lemmas above to the existentially quantified flow-graphs in their definitions and
then lift the consequence of the lemma back to the interface level using the congruence
between our flow graph and interface composition notions (Lemma 2).
Fig. 3: Full PIP code and specifications, with proof sketch for acquire. The comments
and coloured annotations (lines 29 to 32) are used to highlight steps in the proof, and are
explained in detail in the text.
we explain in more detail below.¹⁰ We instantiate our framework in order to capture the
PIP invariants as follows:
fs := (next : y, curr_prio : q, def_prio : q0, prios : Q)
edge(x, fs, z) := (λm. {max(m ∪ {q0})}) if z = y ≠ null, and λ0 otherwise
γ(x, fs, m) := q ≥ 0 ∗ (∀q′ ∈ Q. q′ ≥ 0) ∗ m = Q ∗ q = max(Q ∪ {q0})
ϕ(I) := (I = (λ0, λ0))
Each node has the four fields listed in fs. fs also defines variables such as y to denote
field values that are used in the definitions of edge and γ; these variables are bound to the
heap by N. edge abstracts the heap into a flow graph by letting each node have an edge
to its next successor labelled by a function that passes to it the maximum incoming
priority or the node’s default priority: whichever is larger. With this definition, one can
see that the flow of every node will be the multiset containing exactly the priorities of
its predecessors. The node-local invariant γ says that all priorities are non-negative, the
flow m of each node is stored in the prios field, and its current priority is the maximum
of its default and incoming priorities. Finally, the constraint ϕ on the global interface
expresses that the graph is closed – it has no inflow or outflow.
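The node-local pieces of this instantiation are directly executable; the following check of γ for a single node (our names, with Counters as multisets) mirrors the definition above.

from collections import Counter

def gamma(fields, m):
    # Node-local invariant: non-negative priorities, the flow is stored in
    # prios, and curr_prio is the max of the default and incoming priorities.
    q, q0, Q = fields["curr_prio"], fields["def_prio"], fields["prios"]
    return (q >= 0
            and all(p >= 0 for p in Q.elements())
            and m == Q
            and q == max(list(Q.elements()) + [q0]))

# Node p1 of Fig. 1: default priority 3, one incoming priority 2.
p1 = dict(curr_prio=3, def_prio=3, prios=Counter({2: 1}))
assert gamma(p1, Counter({2: 1}))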
Flow Specifications for the PIP. Our specifications of acquire and release guarantee
that if we start with a valid flow graph (closed, according to ϕ), we are guaranteed to
return a valid flow graph with the same interface (i.e. the graph remains closed). For
clarity of the exposition, we focus here on how we prove that being a flow graph that
satisfies the PIP invariant is preserved (as is the composite flow graph’s interface).
Extending this specification to one which proves, e.g., that acquire adds the expected
edge is straightforward (see Appendix C of the TR [23]).¹¹
The specification for update is somewhat subtle, and exploits the full flexibility
of flow interfaces as a specification medium. The preconditions of update describe an
update to the graph which is not yet completed. There are three complementary aspects
to this specification. Firstly, (as for acquire and release), node-local invariants (γ)
hold for all nodes in the graph (enforced via N and Gr predicates). Secondly, we employ
flow interfaces to express a decomposition of the original top-level interface I into
compatible (primed) sub-interfaces. The key to understanding this specification is that
I_n is in some sense a fake interface; it does not abstract the current state of the heap node
n. Instead, I_n expresses the way in which the node n's current inflow has not yet been
accounted for in the heap: that if n could adjust its inflow according to the propagated
priority change without changing its outflow, then it would compose back with the rest of
the graph, and restore the graph’s overall interface. The shorthand δ defines the required
change to n’s inflow.
In general (except when n's next field is null, or n's flow value is unchanged), it
is not even possible for n's fields to be updated to satisfy I_n; by updating n's inflow,
¹⁰ In specifications, we implicitly quantify at the top level over free variables such as I. λ0 denotes an identically zero function on an unconstrained domain.
¹¹ We also omit acquire's precondition that p.next == null for brevity.
we will necessarily update its outflow. However, we can then construct a corresponding
“fake” interface for the next node in the graph, reflecting the update yet to be accounted
for, and establishing the precondition for the recursive call to update.
The third specification aspect is the connection between heap-level nodes and in-
terfaces. The N(n, I′_n) predicate connects n with a different interface; I′_n is the actual
current abstraction of n's state. Conceptually, the key property which is broken at this
point is this connection between the interface-level specification and the heap at node n,
reflected by the decomposition in the specification between X \ {n} and {n}.
We note that the same specification ideas and proof style can be easily adapted to
other data structure implementations with an update-notify style, including well-known
designs such as Subject-Observer patterns, or the Composite pattern [27].
Proof Outline. To illustrate the application of flows reasoning to our PIP specification
ideas more clearly, we examine in detail the first if-branch in the proof of acquire. Our
intermediate proof steps are shown as purple annotations surrounded by braces. The first
step, as shown in the first line inside the method body, is to apply ((Un)Fold) twice (on
the flow graphs represented by these predicates) and peel off N predicates for each of r
and p. The update to r's next field (line 27) causes the correct singleton interface of r to
change to I′_r: its outflow (previously none, since the next field was null) now propagates
flow to p. We summarise this state in the assertion on line 29 (we omit e.g. repetition
of properties from the function's precondition, focusing on the flow-related steps of
the argument). We now rewrite this state; using the definition of interface composition
(Definition 6) we deduce that although I′_r and I_p do not compose (since the former has
outflow that the latter does not account for as inflow), the alternative “fake” interface
I′_p for p (which artificially accounts for the missing inflow) would do so (cf. line 30).
Essentially, we show I_r ⊕ I_p = I′_r ⊕ I′_p, that the interface of {r, p} would be unchanged
if p could somehow have interface I′_p. Now by setting I_2 = I′_r ⊕ I_1 and using algebraic
properties of interfaces, we assemble the precondition expected by update. After the
call, update's postcondition gives us the desired postcondition.
We focused here on the details of acquire's proof, but very similar manipulations
are required for reasoning about the recursive call in update's implementation.¹² The
main difference there is that if the if-condition wrapping the recursive call is false then
either the last-modified node has no successor (and so there is no outstanding inflow
change needed), or we have from = to which implies that the “fake” interface is actually
the same as the currently correct one.
Despite the property proved for the PIP example being a rather delicate recursive in-
variant over the (potentially cyclic) graph, the power of our framework enables extremely
succinct specifications for the example, and proofs which require the application of rela-
tively few generic lemmas. The integration with standard separation logic reasoning, and
the complementary separation algebras provided by flow interfaces allow decomposition
and recomposition to be simple proof steps. For this proof, we integrated with standard
sequential separation logic, but in the next section we will show that compatibility with
concurrent SL techniques is similarly straightforward.
¹² We provide further proof outlines in Appendix C of the TR [23].
Fig. 4: A potential state of the Harris list with explicit memory management. fnext
pointers are shown with dashed edges, marked nodes are shaded gray, and null pointers
are omitted for clarity.
This section introduces some advanced foundational flow framework theory and demon-
strates its use in the proof of the Harris list. We note that [22] presented a proof of this
data structure in the original flow framework. The proof given here shows that the new
framework eliminates the need for the customized concurrent separation logic defined
in [22]. We start with a recap of Harris’ algorithm adapted from [22].
The power of flow-based reasoning is exhibited in the proof of overlaid data structures
such as the Harris list, a concurrent non-blocking linked list algorithm [12]. This algo-
rithm implements a set data structure as a sorted list, and uses atomic compare-and-swap
(CAS) operations to allow a high degree of parallelism. As with the sequential linked
list, Harris’ algorithm inserts a new key k into the list by finding nodes k1 , k2 such that
k1 < k < k2 , setting k to point to k2 , and using a CAS to change k1 to point to k only
if it was still pointing to k2 . However, a similar approach fails for the delete operation.
If we had consecutive nodes k1 , k2 , k3 and we wanted to delete k2 from the list (say by
setting k1 to point to k3 ), there is no way to ensure with one CAS that k2 and k3 are also
still adjacent (another thread could have inserted/deleted in between them).
Harris’ solution is a two step deletion: first atomically mark k2 as deleted (by setting
a mark bit on its successor field) and then later remove it from the list using a single
CAS. After a node is marked, no thread can insert or delete to its right, hence a thread
that wanted to insert k to the right of k2 would first remove k2 from the list and then
insert k as the successor of k1 .
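As a schematic rendering of this two-step deletion (our own simplification, with single-threaded Python standing in for the lock-free original): marking is a compare-and-swap on the successor field that sets the mark bit, and unlinking is a second compare-and-swap at the predecessor. The cas helper below is a hypothetical stand-in; Python has no hardware CAS.

class HNode:
    def __init__(self, key, nxt=None):
        self.key = key
        self.next = (nxt, False)          # (successor, mark bit)

def cas(node, old, new):
    # Stand-in for an atomic compare-and-swap on node.next.
    if node.next == old:
        node.next = new
        return True
    return False

def delete(k1, k2):
    # Remove k2, assumed to be k1's successor, in two atomic steps.
    succ, marked = k2.next
    # Step 1: mark k2 so no thread can insert or delete to its right.
    if marked or not cas(k2, (succ, False), (succ, True)):
        return False
    # Step 2: unlink k2 from the main list with a single CAS at k1.
    cas(k1, (k2, False), (succ, False))
    return True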
In a non-garbage-collected environment, unlinked nodes cannot be immediately freed
as suspended threads might continue to hold a reference to them. A common solution
is to maintain a second “free list” to which marked nodes are added before they are
unlinked from the main list (this is the so-called drain technique). These nodes are then
labelled with a timestamp, which is used by a maintenance thread to free them when it is
safe to do so. This leads to the kind of data structure shown in Figure 4, where each node
has two pointer fields: a next field for the main list and an fnext field for the free list
(the list from fh to ft via dashed edges). Threads that have been suspended while holding
Fig. 5: Examples of graphs that motivate effective acyclicity. All graphs use the path-
counting flow domain, the flow is displayed inside each node, and the inflow is displayed
as curved arrows to the top-left of nodes. (a) shows a graph and inflow that has no
solution to (FlowEqn); (b) has many solutions. (c) shows a modification that preserves
the interface of the modified nodes, yet goes from a graph that has a unique flow to one
that has many solutions to (FlowEqn).
a reference to a node that was added to the free list can simply continue traversing the
next pointers to find their way back to the unmarked nodes of the main list.
Even for seemingly simple properties such as that the Harris list is memory safe and
not leaking memory, the proof will rely on the following non-trivial invariants:
(a) The data structure consists of two (potentially overlapping) lists: a list on next
edges beginning at mh and one on fnext edges beginning at fh.
(b) The two lists are null terminated and next edges from nodes in the free list point to
nodes in the free list or main list.
(c) All nodes in the free list are marked.
(d) ft is an element of the free list (due to concurrency, it is not always the tail).
Challenges. To prove that Harris’ algorithm maintains the invariants listed above we
must tackle a number of challenges. First, we must construct flow domains that allow us
to describe overlaid data structures, such as the overlapping main and free lists (§4.2).
Second, the flow-based proofs we have seen so far work by showing that the interface of
some modified region is unchanged. However, if we consider a program that allocates
and inserts a new node into a data structure (like the insert method of Harris), then the
interface cannot be the same since the domain has changed (it has increased by the
newly allocated node). We must thus have a means to reason about preservation of flows
by modifications that allocate new nodes (§4.3). The third issue is that in some flow
domains, there exist graphs G and inflows in for which no solutions to the flow equation
(FlowEqn) exist. For instance, consider the path-counting flow domain and the graph
in Figure 5(a). Since we would need to use the path-counting flow in the proof of the
Harris list to encode its structural invariants, this presents a challenge (§4.4).
We will next see how to overcome these three challenges in turn, and then apply
those solutions to the proof of the Harris list in §4.5.
Nilpotent Cycles. Let (M, +, 0, E) be a flow domain where every edge function e ∈ E
is an endomorphism on M . In this case, we can show that the flow of a node n is the
sum of the flow as computed along each path in the graph that ends at n. Suppose we
additionally know that the edge functions are defined such that their composition along
any cycle in the graph eventually becomes the identically zero function. We then need
only consider finitely many paths to compute the flow of a node, which means the flow
equation has a unique solution.
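As a toy illustration of how such flows arise as fixpoints, the following OCaml snippet (our own; the names flow_eqn_step and solve are hypothetical) iterates the flow equation fl(n) = in(n) + Σn′ e(n′, n)(fl(n′)) on a three-node path-counting instance. The iteration is only guaranteed to terminate for instantiations such as the nilpotent (or, later, effectively acyclic) ones:

(* One application of the flow equation: fl'(j) = in(j) + Σ_i e(i)(j)(fl(i)).
   Edge functions are represented as an n×n matrix of int -> int functions. *)
let flow_eqn_step (inflow : int array) (e : (int -> int) array array)
    (fl : int array) : int array =
  let n = Array.length inflow in
  Array.init n (fun j ->
      let s = ref inflow.(j) in
      for i = 0 to n - 1 do
        s := !s + e.(i).(j) fl.(i)
      done;
      !s)

(* Iterate to a fixpoint; diverges if (FlowEqn) has no solution. *)
let rec solve inflow e fl =
  let fl' = flow_eqn_step inflow e fl in
  if fl' = fl then fl else solve inflow e fl'

let () =
  let id x = x and zero _ = 0 in
  (* path-counting domain on a list r -> a -> b: identity on list edges *)
  let e = [| [| zero; id; zero |];
             [| zero; zero; id |];
             [| zero; zero; zero |] |] in
  let inflow = [| 1; 0; 0 |] in          (* inflow 1 at the root r *)
  let fl = solve inflow e [| 0; 0; 0 |] in
  Array.iter (Printf.printf "%d ") fl    (* prints: 1 1 1 *)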
Definition 9. A closed set of endomorphisms E ⊆ End(M ) is called nilpotent if there
exists p > 1 such that e^p ≡ 0 for every e ∈ E.
Example 5. The flow domain (N², +, (0, 0), {(λ(x, y). (0, c · x)) | c ∈ N}) contains
nilpotent edge functions that shift the first component of the flow to the second (with
a scaling factor). This domain can be used to express the property that every node in a
graph is reachable from the root via a single edge (by requiring the flow of every node to
be (0, 1) under the inflow (λn. (n = r ? (1, 0) : (0, 0)))).
Before we prove that nilpotent endomorphisms lead to unique flows, we present a
useful notion when dealing with endomorphic flow domains.
Definition 10. The capacity of a flow graph G = (N, e) is cap(G) : N × N → (M → M), defined inductively as cap(G) := cap^|G|(G), where cap^0(G)(n, n′) := δ_{n=n′} and
cap^{i+1}(G)(n, n′) := δ_{n=n′} + Σ_{n′′ ∈ G} cap^i(G)(n, n′′) ◦ e(n′′, n′).
Effectively Acyclic Flow Graphs. There are some flow domains that compute flows
useful in practice, but which do not guarantee either existence or uniqueness of fixpoints
a priori for all graphs. For example, the path-counting flow from Example 1 is one where
for certain graphs, there exist no solutions to the flow equation (see Figure 5(a)), and for
others, there can exist more than one (in Figure 5(b), the nodes marked with x can have
any path count, as long as they both have the same value).
In such cases, we explore how to restrict the class of graphs we use in our flow-based
proofs such that each graph has a unique fixpoint; the difficulty is that this restriction must be preserved under composition of graphs. Here, we study the class of flow domains
(M, +, 0, E) such that M is a positive monoid and E is a set of reduced endomorphisms
(defined below). In such domains we can decompose the flow computations into the
various paths in the graph, and achieve unique fixpoints by restricting the kinds of cycles
graphs can have.
Definition 11. A flow graph H = (N, e, fl) is effectively acyclic (EA) if for every 1 ≤ k and n1, . . . , nk ∈ N, the composition of the edge functions along the cycle n1, . . . , nk, n1 maps the flow at n1 to zero: (e(n1, n2) ◦ · · · ◦ e(nk−1, nk) ◦ e(nk, n1))(fl(n1)) = 0.
The simplest example of an effectively acyclic graph is one where the edges with
non-zero edge functions form an acyclic graph. However, our semantic condition is
weaker: for example, when reasoning about two overlaid acyclic lists whose union
happens to form a cycle, a product of two path-counting domains will satisfy effective
acyclicity because the composition of different types of edges results in the zero function.
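To make this concrete, take the two shorthand edge functions used later for the Harris list (cf. footnote 13), one propagating only main-list path counts and one propagating only free-list path counts; the following small calculation (ours) shows that composing the two kinds of edges yields the zero function, so a cycle that exists only in the union of the two lists routes no flow:

\[
\lambda_{(1,0)}(m_1, m_2) = (m_1, 0), \qquad \lambda_{(0,1)}(m_1, m_2) = (0, m_2),
\]
\[
(\lambda_{(0,1)} \circ \lambda_{(1,0)})(m_1, m_2) = \lambda_{(0,1)}(m_1, 0) = (0, 0).
\]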
Lemma 5. Let (M, +, 0, E) be a flow domain such that M is a positive monoid and
E is a closed set of endomorphisms. Given a graph (N, e) over this flow domain and
inflow in : N → M , if there exists a flow graph H = (N, e, fl ) that is effectively acyclic,
then fl is unique.
While the restriction to effectively acyclic flow graphs guarantees us that the flow is
the unique fixpoint of the flow equation, it is not easy to show that modifications to the
graph preserve EA while reasoning locally. Even modifying a subgraph to another with
the same flow interface (which we know guarantees that it will compose with any context)
can inadvertently create a cycle in the larger composite graph. For instance, consider
Figure 5(c), which shows a modification to nodes {n3, n4} (the boxed blue region). The interface of this region is ({n3 ↦ 1, n4 ↦ 1}, {n5 ↦ 1, n2 ↦ 1}), and so swapping the edges of n3 and n4 preserves this interface. However, the resulting graph, despite
composing with the context to form a valid flow graph, is not EA (in this case, it has
multiple solutions to the flow equation). This shows that flow interfaces are not powerful
enough to preserve effective acyclicity. For a special class of endomorphisms, we show
that a local property of the modified subgraph can be checked, which implies that the
modified composite graph continues to be EA.
Note that if E is reduced, then no e ∈ E can be nilpotent. In that sense, this class of
instantiations is complementary to the nilpotent class.
Example 6. Examples of flow domains that fall into this class include positive semirings
of reduced rings (with the additive monoid of the semiring being the aggregation monoid
of the flow domain and E being any set of functions that multiply their argument with
a constant flow value). Note that any direct product of integral rings is a reduced ring.
Hence, products of the path counting flow domain are a special case.
This pairwise check, apart from requiring the interface of the modified region to be
unchanged, also permits allocating new nodes as long as no flow is routed via the new
nodes (condition (3)). We now show that it is sufficient to check that a modification is a
subflow-preserving extension to guarantee composition back to an effectively-acyclic
composite graph:
Theorem 3. Let (M, +, 0, E) be a flow domain such that M is a positive monoid and E is a reduced set of endomorphisms. If H = H1 ⊙ H2 and H1 ≼s H1′ are all effectively acyclic flow graphs such that H1′ ∩ H2 = ∅ and ∀n ∈ H1′ \ H1. outf(H2)(n) = 0, then there exists an effectively acyclic flow graph H′ = H1′ ⊙ H2 such that H ≼s H′.
We define effectively acyclic versions of our flow graph predicates, Na(x, H) and Gra(X, H), that additionally constrain H to be effectively acyclic. The above theorem yields a variant of the (REPL) rule for EA graphs in which preservation of the flow interface is replaced by the requirement that the modification is a subflow-preserving extension.
We use the techniques seen in this section in the proof of the Harris list. As the data
structure consists of two potentially overlapping lists, we use Lemma 3 to construct a
product flow domain of two path-counting flows: one tracks the path count from the
head of the main list, and one from the head of the free list. We also work under the
effectively acyclic restriction (i.e. we use the Na and Gra predicates), both in order to
obtain the desired interpretation of the flow as well as to ensure existence of flows in this
flow domain.
We instantiate the framework with three parameters: an edge function edge, a node-local invariant γ, and a constraint ϕ on the global flow interface. Here, edge encodes the edge functions needed to compute the product of two path-counting flows: the first component tracks path-counts from mh on next edges and the second tracks path-counts from fh on fnext edges.13 The node-local invariant γ says: the flow is one of {(1, 0), (0, 1), (1, 1)} (meaning that the node is on at least one of the two lists, invariant (a)); if the flow is not (1, 0) (the node is not only on the main list, i.e. it is on the free list) then the node is marked (indicated by M(y), invariant (c)); and if the node is ft then it must be on the free list (invariant (d)). The constraint on the global interface, ϕ, says that the inflow picks out mh and fh as the roots of the lists, and there is no outgoing flow (thus, all non-null edges must stay within the graph, invariant (b)).
13 We use the shorthands λ(1,0) := (λ(m1, m2). (m1, 0)) and λ(0,1) := (λ(m1, m2). (0, m2)), and denote an anonymous existentially-quantified variable by _.
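In symbols, the node-local invariant just described can be transcribed roughly as follows (our paraphrase of the prose, writing (f1, f2) for the flow of node y and M(y) for its mark bit):

\[
\gamma(y, (f_1, f_2)) \;\triangleq\; (f_1, f_2) \in \{(1,0), (0,1), (1,1)\} \;\wedge\; \bigl((f_1, f_2) \neq (1,0) \Rightarrow M(y)\bigr) \;\wedge\; \bigl(y = \mathit{ft} \Rightarrow f_2 = 1\bigr)
\]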
Since the Harris list is a concurrent algorithm, we perform the proof in rely-guarantee
separation logic (RGSep) [41]. Like in §3, we do not need to modify the semantics of
RGSep in any way; our flow-based predicates can be defined and reasoning using our
lemmas can be performed in the logic out-of-the-box. For space reasons, we defer the full proof to Appendix D of the TR [23].
5 Related Work
As mentioned in §1, the most closely related work is the flow framework developed by
some of the authors in [22]. We here present a simplified and generalized meta theory of
flows that makes the approach much more broadly applicable. There were a number of
limitations of the prior framework that prevented its application to more general classes
of examples.
First, [22] required flow domains to form a semiring; the analogue of edge functions was restricted to multiplication by a constant, which had to come from the same flow value set. This restriction made it complex to encode many graph properties of interest.
For example, one could not easily encode the PIP flow, or a simple flow that counts the
number of incoming edges to each node. Our foundational flow framework decouples
the algebraic structure defining how flow is aggregated from the algebraic structure of
the edge functions. In this way, we obtain a more general framework that applies to many
more examples, and with simpler flow domains.
Second, in [22], a flow graph did not uniquely determine its inflow (cf. Lemma 1).
Correspondingly, [22]’s notion of interface included an equivalence class of inflows (all
those that induce the same flow values). Since, in [22], the interface also determines
which modifications are permitted by the framework, [22] could only handle modifica-
tions that preserve the inflow equivalence class. For example, this prevents one from
reasoning locally about the removal of a single edge from a graph in certain cases (in
particular, like release does in the PIP). Our foundational flow framework solves
this problem by requiring that the aggregation operation on flow values is cancellative,
guaranteeing unique inflows.
Cancellativity is fundamentally incompatible with [22], which requires the flow
domain to form an ω-CPO in order to guarantee the existence of unique flows. For
example, in a graph with two nodes n and n′ with identity edges between them and all other edges zero (in [22], edges labelled with 1 and 0), if we have in(n) = 0 and in(n′) = m for some non-zero m, a solution to the flow equation must satisfy fl(n) = m + fl(n). [22] forces such solutions to exist, ruling out cancellativity. To solve
this problem, we present a new theory which can optionally guarantee unique flows
when desired and show that requiring cancellativity does not limit expressivity.
Next, the proofs of programs shown in [22] depend on a bespoke program logic. This
logic requires new reasoning primitives that are not supported by the logics implemented
in existing SL-based verification tools. Our general proof technique eliminates the need
for a dedicated program logic and can be implemented on top of standard separation log-
ics and existing SL-based tools. Finally, the underlying separation algebra of the original
framework makes it hard to use equational reasoning, which is a critical prerequisite for
enabling proof automation.
An abundance of SL variants provide complementary mechanisms for modular
reasoning about programs (e.g. [18, 36, 38]). Most are parameterized by the underlying
separation algebra; our flow-based reasoning technique easily integrates with these
existing logics.
The most common approach to reason about irregular graph structures in SL is to
use iterated separating conjunction [30, 44] and describe the graph as a set of nodes each
of which satisfies some local invariant. This approach has the advantage of being able to
naturally describe general graphs. However, it is hard to express non-local properties that
involve some form of fixpoint computation over the graph structure. One approach is to
abstract the program state as a mathematical graph using iterated separating conjunction
and then express non-local invariants in terms of the abstract graph rather than the
underlying program state [14, 35, 38]. However, a proof that a modification to the state
maintains a global invariant of the abstract graph must then often revert back to non-local
and manual reasoning, involving complex inductive arguments about paths, transitive
closure, and so on. Our technique also exploits iterated separating conjunction for the
332 S. Krishna et al.
underlying heap ownership, with the key benefit that flow interfaces exactly capture the
necessary conditions on a modified subgraph in order to compose with any context and
preserve desired non-local invariants.
In recent work, Wang et al. present a Coq-mechanised proof of graph algorithms in
C, based on a substantial library of graph-related lemmas, both for mathematical and
heap-based graphs [42]. They prove rich functional properties, integrated with the VST
tool. In contrast to our work, a substantial suite of lemmas and background properties are
necessary, since these specialise to particular properties such as reachability. We believe
that our foundational flow framework could be used to simplify framing lemmas in a
way which remains parametric in the property in question.
Proofs of a number of graph algorithms have been mechanized in various verification
tools and proof assistants, including Tarjan’s SCC algorithm [8], union-find [7], Kruskal’s
minimum spanning tree algorithm [13], and network flow algorithms [25]. These proofs
generally involve non-local reasoning arguments about mathematical graphs.
An alternative approach to using SL-style reasoning is to commit to global reasoning
but remain within decidable logics to enable automation [16, 21, 24, 28, 43]. However,
such logics are restricted to certain classes of graphs and certain types of properties.
For instance, reasoning about reachability in unbounded graphs with two successors
per node is undecidable [15]. Recent work by Ter-Gabrielyan et al. [40] shows how
to deal with modular framing of pairwise reachability specifications in an imperative
setting. Their framing notion has parallels to our notion of interface composition, but
allows subgraphs to change the paths visible to their context. The work is specific to
a reachability relation, and cannot express the rich variety of custom graph properties
available in our technique.
Dynamic frames [19] (e.g., implemented in Dafny [26]) can be used to explicitly
reason about framing of heap information in a first-order logic. However, by itself, this
theory does not enable modular reasoning about global graph properties. We believe that
the flow framework could in principle be adapted to the dynamic frames setting.
6 Conclusion
We have presented the foundational flow framework, enabling local modular reasoning
about recursively-defined properties over general graphs. The core reasoning technique
has been designed to make minimal mathematical requirements, providing great flexi-
bility in terms of potential instantiations and applications. We identified key classes of
these instantiations for which we can provide existence and uniqueness guarantees for
the fixpoint properties our technique addresses, and demonstrated our proof technique on
several challenging examples. As future work, we plan to automate flow-based proofs
in our new framework using existing tools that support SL-style reasoning such as
Viper [29] and GRASShopper [34].
References
1. Appel, A.W.: Verified software toolchain. In: NASA Formal Methods. Lecture Notes in
Computer Science, vol. 7226, p. 2. Springer (2012)
2. Berdine, J., Calcagno, C., O’Hearn, P.W.: A decidable fragment of separation logic. In:
FSTTCS. Lecture Notes in Computer Science, vol. 3328, pp. 97–109. Springer (2004)
3. Brookes, S., O’Hearn, P.W.: Concurrent separation logic. SIGLOG News 3(3), 47–65 (2016)
4. Calcagno, C., Distefano, D., Dubreil, J., Gabi, D., Hooimeijer, P., Luca, M., O’Hearn, P.W.,
Papakonstantinou, I., Purbrick, J., Rodriguez, D.: Moving fast with software verification. In:
NFM. Lecture Notes in Computer Science, vol. 9058, pp. 3–11. Springer (2015)
5. Calcagno, C., O’Hearn, P.W., Yang, H.: Local action and abstract separation logic. In: LICS.
pp. 366–378. IEEE Computer Society (2007)
6. Cao, Q., Cuellar, S., Appel, A.W.: Bringing order to the separation logic jungle. In: APLAS.
Lecture Notes in Computer Science, vol. 10695, pp. 190–211. Springer (2017)
7. Charguéraud, A., Pottier, F.: Verifying the correctness and amortized complexity of a union-
find implementation in separation logic with time credits. J. Autom. Reasoning 62(3), 331–365
(2019)
8. Chen, R., Cohen, C., Lévy, J., Merz, S., Théry, L.: Formal proofs of Tarjan's strongly connected components algorithm in Why3, Coq and Isabelle. In: ITP. LIPIcs, vol. 141, pp. 13:1–13:19.
Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019)
9. Dockins, R., Hobor, A., Appel, A.W.: A fresh look at separation algebras and share accounting.
In: APLAS. Lecture Notes in Computer Science, vol. 5904, pp. 161–177. Springer (2009)
10. Dodds, M., Jagannathan, S., Parkinson, M.J., Svendsen, K., Birkedal, L.: Verifying custom
synchronization constructs using higher-order separation logic. ACM Trans. Program. Lang.
Syst. 38(2), 4:1–4:72 (2016)
11. Enea, C., Lengál, O., Sighireanu, M., Vojnar, T.: SPEN: A solver for separation logic. In:
NFM. Lecture Notes in Computer Science, vol. 10227, pp. 302–309 (2017)
12. Harris, T.L.: A pragmatic implementation of non-blocking linked-lists. In: DISC. Lecture
Notes in Computer Science, vol. 2180, pp. 300–314. Springer (2001)
13. Haslbeck, M.P.L., Lammich, P., Biendarra, J.: Kruskal’s algorithm for minimum spanning
forest. Archive of Formal Proofs 2019 (2019)
14. Hobor, A., Villard, J.: The ramifications of sharing in data structures. In: POPL. pp. 523–536.
ACM (2013)
15. Immerman, N., Rabinovich, A.M., Reps, T.W., Sagiv, S., Yorsh, G.: The boundary between de-
cidability and undecidability for transitive-closure logics. In: CSL. Lecture Notes in Computer
Science, vol. 3210, pp. 160–174. Springer (2004)
16. Itzhaky, S., Banerjee, A., Immerman, N., Nanevski, A., Sagiv, M.: Effectively-propositional
reasoning about reachability in linked data structures. In: CAV. Lecture Notes in Computer
Science, vol. 8044, pp. 756–772. Springer (2013)
17. Jacobs, B., Smans, J., Philippaerts, P., Vogels, F., Penninckx, W., Piessens, F.: Verifast: A
powerful, sound, predictable, fast verifier for C and Java. In: NASA Formal Methods. Lecture
Notes in Computer Science, vol. 6617, pp. 41–55. Springer (2011)
18. Jung, R., Krebbers, R., Jourdan, J., Bizjak, A., Birkedal, L., Dreyer, D.: Iris from the ground
up: A modular foundation for higher-order concurrent separation logic. J. Funct. Program. 28,
e20 (2018)
19. Kassios, I.T.: Dynamic frames: Support for framing, dependencies and sharing without
restrictions. In: FM. Lecture Notes in Computer Science, vol. 4085, pp. 268–283. Springer
(2006)
20. Katelaan, J., Matheja, C., Zuleger, F.: Effective entailment checking for separation logic with
inductive definitions. In: TACAS (2). Lecture Notes in Computer Science, vol. 11428, pp.
319–336. Springer (2019)
334 S. Krishna et al.
21. Klarlund, N., Schwartzbach, M.I.: Graph types. In: POPL. pp. 196–205. ACM Press (1993)
22. Krishna, S., Shasha, D.E., Wies, T.: Go with the flow: compositional abstractions for concur-
rent data structures. PACMPL 2(POPL), 37:1–37:31 (2018)
23. Krishna, S., Summers, A.J., Wies, T.: Local reasoning for global graph properties. CoRR
abs/1911.08632 (2019)
24. Lahiri, S.K., Qadeer, S.: Back to the future: revisiting precise program verification using SMT
solvers. In: POPL. pp. 171–182. ACM (2008)
25. Lammich, P., Sefidgar, S.R.: Formalizing network flow algorithms: A refinement approach in
Isabelle/HOL. J. Autom. Reasoning 62(2), 261–280 (2019)
26. Leino, K.R.M.: Dafny: An automatic program verifier for functional correctness. In: LPAR
(Dakar). Lecture Notes in Computer Science, vol. 6355, pp. 348–370. Springer (2010)
27. Leino, K.R.M., Moskal, M.: Vacid-0: Verification of ample correctness of invariants of
data-structures, edition 0. Microsoft Research Technical Report (2010)
28. Madhusudan, P., Qiu, X., Stefanescu, A.: Recursive proofs for inductive tree data-structures.
In: POPL. pp. 123–136. ACM (2012)
29. Müller, P., Schwerhoff, M., Summers, A.J.: Viper: A verification infrastructure for permission-
based reasoning. In: Jobstmann, B., Leino, K.R.M. (eds.) Verification, Model Checking, and
Abstract Interpretation (VMCAI). LNCS, vol. 9583, pp. 41–62. Springer-Verlag (2016)
30. Müller, P., Schwerhoff, M., Summers, A.J.: Automatic verification of iterated separating
conjunctions using symbolic execution. In: CAV (1). Lecture Notes in Computer Science,
vol. 9779, pp. 405–425. Springer (2016)
31. O’Hearn, P.W., Reynolds, J.C., Yang, H.: Local reasoning about programs that alter data
structures. In: CSL. Lecture Notes in Computer Science, vol. 2142, pp. 1–19. Springer (2001)
32. Parkinson, M.J., Bierman, G.M.: Separation logic and abstraction. In: Palsberg, J., Abadi, M.
(eds.) Principles of Programming Languages (POPL). pp. 247–258. ACM (2005)
33. Piskac, R., Wies, T., Zufferey, D.: Automating separation logic using SMT. In: CAV. Lecture
Notes in Computer Science, vol. 8044, pp. 773–789. Springer (2013)
34. Piskac, R., Wies, T., Zufferey, D.: GRASShopper: Complete heap verification with mixed
specifications. In: TACAS. Lecture Notes in Computer Science, vol. 8413, pp. 124–139.
Springer (2014)
35. Raad, A., Hobor, A., Villard, J., Gardner, P.: Verifying concurrent graph algorithms. In:
APLAS. Lecture Notes in Computer Science, vol. 10017, pp. 314–334 (2016)
36. Raad, A., Villard, J., Gardner, P.: Colosl: Concurrent local subjective logic. In: ESOP. Lecture
Notes in Computer Science, vol. 9032, pp. 710–735. Springer (2015)
37. Reynolds, J.C.: Separation logic: A logic for shared mutable data structures. In: LICS. pp.
55–74. IEEE Computer Society (2002)
38. Sergey, I., Nanevski, A., Banerjee, A.: Mechanized verification of fine-grained concurrent
programs. In: PLDI. pp. 77–87. ACM (2015)
39. Sha, L., Rajkumar, R., Lehoczky, J.P.: Priority inheritance protocols: An approach to real-time
synchronization. IEEE Trans. Computers 39(9), 1175–1185 (1990)
40. Ter-Gabrielyan, A., Summers, A.J., Müller, P.: Modular verification of heap reachability
properties in separation logic. PACMPL 3(OOPSLA), 121:1–121:28 (2019)
41. Vafeiadis, V.: Modular fine-grained concurrency verification. Ph.D. thesis, University of
Cambridge, UK (2008)
42. Wang, S., Cao, Q., Mohan, A., Hobor, A.: Certifying graph-manipulating C programs via
localizations within data structures. PACMPL 3(OOPSLA), 171:1–171:30 (2019)
43. Wies, T., Muñiz, M., Kuncak, V.: An efficient decision procedure for imperative tree data
structures. In: CADE. Lecture Notes in Computer Science, vol. 6803, pp. 476–491. Springer
(2011)
44. Yang, H.: An example of local reasoning in BI pointer logic: the Schorr-Waite graph marking
algorithm. In: Proceedings of the SPACE Workshop (2001)
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0
International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing,
adaptation, distribution and reproduction in any medium or format, as long as you give appropri-
ate credit to the original author(s) and the source, provide a link to the Creative Commons license
and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative
Commons license, unless indicated otherwise in a credit line to the material. If material is not
included in the chapter’s Creative Commons license and your intended use is not permitted by
statutory regulation or exceeds the permitted use, you will need to obtain permission directly from
the copyright holder.
Aneris: A Mechanised Logic for Modular
Reasoning about Distributed Systems
1 Introduction
view of the global state and events are observable changes to this state. State
transition systems are quite versatile and have been used in other verification
applications. However, reasoning based on state transition systems often suffer
from a lack of modularity due to their very global. As a consequence, separate
nodes or components cannot be verified in isolation and the system has to be
verified as a whole.
IronFleet [7] is the first system that supports node-local reasoning for verifying
the implementation of programs that run on different nodes. In IronFleet, a
distributed system is modeled by a transition system. This transition system
is shown to be refined by the composition of a number of transition systems,
each pertaining to one of the nodes in the system. Each node in the distributed
system is shown to be correct and a refinement of its corresponding transition
system. Nevertheless, IronFleet does not allow you to reason compositionally; a
correctness proof for a distributed system cannot be used to show the correctness
of a larger system.
Higher-order concurrent separation logics (CSLs) [3, 4, 13, 15, 18, 26, 27,
28, 33, 34, 36, 39] simplify reasoning about higher-order imperative concurrent
programs by offering facilities for specifying and proving correctness of programs in
a modular way. Indeed, their support for modular reasoning (a.k.a. compositional
reasoning) is the key reason for their success. Disel [35] is a separation logic
that does support compositional reasoning about distributed systems, allowing
correctness proofs of distributed systems to be used for verifying larger systems.
However, Disel struggles with node-local reasoning in that it cannot hide node-
local usage of mutable state. That is, the use of internal state in nodes must be
exposed in the high-level protocol of the system and changes to the internal state
are only possible upon sending and receiving messages over the network.
Finally, both Disel and IronFleet restrict nodes to run only sequential programs; node-level concurrency is not supported.
In this paper we present Aneris, a framework for implementing and reasoning
about functional correctness of distributed systems. Aneris is based on concurrent
separation logic and supports modular reasoning with respect to both nodes
(node-local reasoning) and threads within nodes (thread-local reasoning). The
Aneris framework consists of a programming language, AnerisLang, for writing
realistic, real-world distributed systems and a higher-order concurrent separation
logic for reasoning about these systems. AnerisLang is a concurrent ML-like
programming language with higher-order functions, local state, threads, and
network primitives. The operational semantics of the language, naturally, involves
multiple hosts (each with their own heap and multiple threads) running in a
network. The Aneris logic is built on top of the Iris framework [13, 15, 18]
and supports machine-verified formal proofs in the Coq proof assistant about
distributed systems written in AnerisLang.
Modular Reasoning in Aneris. In general, there are two different ways to support
modular reasoning about distributed systems corresponding to how components
can be composed. Aneris enables both simultaneously:
– Vertical composition: when reasoning about programs within each node, one
is able to compose proofs of different components to prove correctness of the
whole program. For instance, the specification of a verified data structure,
e.g. a concurrent queue, should suffice for verifying programs written against
that data structure, independently of its implementation.
– Horizontal composition: at each node, a verified thread is composable with
other verified threads. Similarly, a verified node is composable with other
verified nodes which potentially engage in different protocols. This naturally
aids implementing and verifying large-scale distributed systems.
Node-local variants of the standard rules of CSLs like, for example, the bind rule
and the frame rule (as explained in Sect. 2) enable vertical reasoning. Sect. 6
showcases vertical reasoning in Aneris using a replicated distributed logging
service that is implemented and verified using a separate implementation and
specification of the two-phase commit protocol.
Horizontal reasoning in Aneris is achieved through the Thread-par-rule and
the Node-par-rule (further explained in Sect. 2), which intuitively say that to
verify a distributed system, it suffices to verify each thread and each node in
isolation. This is analogous to how CSLs allow us to reason about multi-threaded
programs by considering individual threads in isolation; in Aneris we extend
this methodology to include both threads and nodes. Where most variants of
concurrent separation logic use some form of an invariant mechanism to reason
about shared-memory concurrency, we abstract the communication between nodes
over the network through socket protocols that restrict what can be sent and
received on a socket and allow us to share ownership of logical resources among
nodes. Sect. 5 showcases horizontal reasoning in Aneris using an implementation
and a correctness proof for a simple addition service that uses a load balancer to
distribute the workload among several addition servers. Each node is verified in
isolation and composed to form the final distributed system.
Separation logic is a resource logic, in the sense that propositions denote not only
facts about the state, but ownership of resources. Originally, separation logic [32]
was introduced for modular reasoning about the heap—i.e. the notion of resource
was fixed to be logical pieces of the heap. The essential idea is that we can give a
local specification {P } e {v.Q} to a program e involving only the footprint of e.
Hence, while verifying e, we need not consider the possibility that another piece
of code in the program might interfere with e; the program e can be verified
without concern for the environment in which e may occur. Local specifications
can then be lifted to more global specifications by framing and binding:
Node-par
{P1 ∗ IsNode(n1) ∗ FreePorts(ip1, P)} n1; e1 {True}
{P2 ∗ IsNode(n2) ∗ FreePorts(ip2, P)} n2; e2 {True}
────────────────────────────────────────────────────────────
{P1 ∗ P2 ∗ FreeIp(ip1) ∗ FreeIp(ip2)} S; (n1; ip1; e1) ||| (n2; ip2; e2) {True}
where ||| denotes parallel composition of two nodes with identifiers n1 and n2 running expressions e1 and e2 with IP addresses ip1 and ip2.2 The set P = {p | 0 ≤ p ≤ 65535} denotes the finite set of ports.
Note that only a distinguished system node S can start new nodes (as
elaborated on in Sect. 3). In Aneris, the execution of the distributed system
starts with the execution of S as the only node in the system. In order to start
a new node associated with ip address ip one provides the resource FreeIp(ip)
which indicates that ip is not used by other nodes. The node can then rely
on the fact that when it starts, all ports on ip are available. The resource
IsNode(n) indicates that the node n is a node in the system and keeps track of
abstract state related to our modeling of node n’s heap and allocated sockets.
To facilitate modular reasoning, free ports can be split: if A ∩ B = ∅ then FreePorts(ip, A) ∗ FreePorts(ip, B) ⊣⊢ FreePorts(ip, A ∪ B), where ⊣⊢ denotes logical equivalence of Aneris propositions (of type iProp). We will use FreePort(a) as shorthand for FreePorts(ip, {p}) where a = (ip, p).
2 In the same way as the parallel composition rule is derived from a more general fork-based rule, this composition rule is also an instance of a more general rule for spawning nodes shown in Sect. 3.
Finally, observe that the node-local postconditions are simply True, in contrast
to the arbitrary thread-local postconditions in the Thread-par-rule that carry
over to the main thread. In the concurrent setting, shared memory provides
reliable communication and synchronization between the child threads and the
main thread; in the rule for parallel composition, the main thread will wait for
the two child processes to finish. In the distributed setting, there are no such
guarantees and nodes are separate entities that cannot synchronize with the
distinguished system node.
Socket Protocols. Similar to how classical CSLs introduce the concept of resource
invariants for expressing protocols on shared state among multiple threads, we
introduce the simple and novel concept of socket protocols for expressing protocols
among multiple nodes. With each socket address—a pair of an IP address and
a port—a protocol is associated, which restricts what can be communicated on
that socket.
A socket protocol is a predicate Φ : Message → iProp on incoming messages
received on a particular socket. One can think of this as a form of rely-guarantee
reasoning since the socket protocol will be used to restrict the distributed en-
vironment’s interference with a node on a particular socket. In Aneris we write
a ⇒ Φ to mean that socket address a is governed by the protocol Φ. In particular,
if a ⇒ Φ and a ⇒ Ψ then Φ and Ψ are equivalent.3 Moreover, the proposition is duplicable: a ⇒ Φ ⊣⊢ a ⇒ Φ ∗ a ⇒ Φ.
Conceptually, a socket is an abstract representation of a handle for a local
endpoint of some channel. We further restrict channels to use the User Datagram
Protocol (UDP) which is asynchronous, connectionless, and stateless. In accor-
dance with UDP, Aneris provides no guarantee of delivery or ordering although
we assume duplicate protection. We assume duplicate protection to simplify
our examples, as otherwise the code of all of our examples would have to be
adapted to cope with duplication of messages. One can think of sockets in Aneris
as open-ended multi-party communication channels without synchronization.
It is noteworthy that inter-process communication can happen in two ways.
Thread-concurrent programs can communicate both through the shared heap and
by sending messages through sockets. For memory-separated programs running
on different nodes all communication is by message-passing.
In the logic, we consider both static and dynamic socket addresses. This
distinction is entirely abstract and at the level of the logic. Static addresses come
with primordial protocols, agreed upon before starting the distributed system,
whereas dynamic addresses do not. Protocols on static addresses are primarily
intended for addresses pointing to nodes that offer a service.
To distinguish between static and dynamic addresses, we use a resource
Fixed(A) which denotes that the addresses in A are static and should have a fixed interpretation.
3 The predicate equivalence is under a later modality in order to avoid self-referential paradoxes. We omit it for the sake of presentation as this is an orthogonal issue.
Socketbind-static
{Fixed(A) ∗ a ∈ A ∗ FreePort(a) ∗ z →n None}
n; socketbind z a
{x. x = 0 ∗ z →n Some a}
Socketbind-dynamic
{Fixed(A) ∗ a ∉ A ∗ FreePort(a) ∗ z →n None}
n; socketbind z a
{x. x = 0 ∗ z →n Some a ∗ a ⇒ Φ}
In the remainder of the paper we will use the following shorthands in order to
simplify the presentation of our specifications.
to the server address using the socket and waits for a response, projecting out
the result of the addition on arrival and deserializing it.
In order to give the server code a specification we will fix a primordial socket
protocol that will govern the address given to the server. The protocol will spell
out how the server relies on the socket. We will use from(m) and body(m) for
projections of the sender and the message body, respectively, from the message
m. We define Φadd as follows:
Φadd(m) ≜ ∃Ψ, x, y. from(m) ⇒ Ψ ∗ body(m) = serialize(x, y) ∗
(∀m′. body(m′) = serialize(x + y) −∗ Ψ(m′))
Intuitively, the protocol demands that the sender of a message m is governed by
some protocol Ψ and that the message body body(m) must be the serialization
of two numbers x and y. Moreover, the sender’s protocol must be satisfied if the
serialization of x + y is sent as a response.
Using Φadd as the socket protocol, we can give server the specification
{Static(a, A, Φadd ) ∗ IsNode(n)} n; server a {False}.
The postcondition is allowed to be False as the program does not terminate. The
triple guarantees safety which, among others, means that if the server responds
to communication on address a it does so according to Φadd .
Similarly, using Φadd as a primordial protocol for the server address, we can
also give client a specification
{srv ⇒ Φadd ∗ srv ∈ A ∗ Dynamic(a, A) ∗ IsNode(m)}
m; client x y srv a
{v.v = x + y}
that showcases how the client is able to conclude that the response from the
server is the sum of the numbers it sent to it. In the proof, when binding a to
the socket using Socketbind-dynamic, we introduce the proposition a ⇒ Φclient
where
Φclient(m) ≜ body(m) = serialize(x + y)
and use it to instantiate Ψ when satisfying Φadd . Using the two specifications
and the Node-par-rule it is straightforward to specify and verify a distributed
system composed of, e.g., a server and multiple clients.
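For intuition, here is a hypothetical OCaml rendering of such an addition service over Unix UDP sockets (our sketch, not the paper's AnerisLang code; the wire format "x,y" and all names are ours). The server loops forever, matching its False postcondition, and each reply is the serialization of x + y, as Φadd demands:

open Unix

let serialize x y = Printf.sprintf "%d,%d" x y

let deserialize s =
  match String.split_on_char ',' s with
  | [x; y] -> (int_of_string x, int_of_string y)
  | _ -> failwith "malformed request"

(* the server never returns, cf. the False postcondition in its spec *)
let server port =
  let skt = socket PF_INET SOCK_DGRAM 0 in
  bind skt (ADDR_INET (inet_addr_any, port));
  let buf = Bytes.create 1024 in
  while true do
    let len, from = recvfrom skt buf 0 1024 [] in
    let x, y = deserialize (Bytes.sub_string buf 0 len) in
    let reply = string_of_int (x + y) in
    ignore (sendto skt (Bytes.of_string reply) 0 (String.length reply) [] from)
  done

(* the client sends one request and blocks for the response *)
let client skt srv x y =
  let msg = serialize x y in
  ignore (sendto skt (Bytes.of_string msg) 0 (String.length msg) [] srv);
  let buf = Bytes.create 1024 in
  let len, _ = recvfrom skt buf 0 1024 [] in
  int_of_string (Bytes.sub_string buf 0 len)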
Mutual exclusion in distributed systems is often a necessity and there are many
different approaches for providing it. The simplest solution is a centralized
algorithm with a single node acting as the coordinator. We will develop this
example to showcase a more interesting protocol that relies on ownership transfer
of spatial resources between nodes to ensure correctness.
The code for a centralized lock server implementation is shown in Fig. 2.
rec lockserver a =
let lock = ref NONE in
let skt = socket () in
socketbind skt a;
listen skt (rec handler msg from =
(if (msg = "LOCK") then
match !lock with
NONE => lock ← SOME (); sendto skt "YES" from
| SOME _ => sendto skt "NO" from
end
else lock ← NONE; sendto skt "RELEASED" from);
listen skt handler)
The lock server declares a node-local variable lock to keep track of whether
the lock is taken or not. It allocates a socket, binds the input address to the
socket and continuously listens for incoming messages. When a "LOCK" message
arrives and the lock is available, the lock gets taken and the server responds
"YES". If the lock was already taken, the server will respond "NO". Finally, if
the message was not "LOCK", the lock is released and the server responds with
"RELEASED".
Our specification of the lock server will be inspired by how a lock can
be specified in concurrent separation logic. Thus we first recall how such a specification usually looks.
Conceptually, a lock can either be unlocked or locked, as described by a
two-state labeled transition system.
unlocked ⇄ locked
In concurrent separation logic, the lock specification does not describe this
transition system directly, but instead focuses on the resources needed for the
transitions to take place. In the case of the lock, the resources are simply a
non-duplicable resource K, which is needed in order to call the lock’s release
method. Intuitively, this resource corresponds to the key of the lock.
346 M. Krogh-Jespersen et al.
∃ isLock.
  ∀v, K. isLock(v, K) ⊣⊢ isLock(v, K) ∗ isLock(v, K)
∧ ∀v, K. isLock(v, K) ⊢ K ∗ K ⇒ False
∧ {True} newLock () {v. ∃K. isLock(v, K)}
∧ ∀v, K. {isLock(v, K)} acquire v {_. K}
∧ ∀v, K. {isLock(v, K) ∗ K} release v {True}
– Calling newLock will lead to the duplicable knowledge of the return value v
being a lock.
– Knowing that a value is a lock, a thread can try to acquire the lock and when
it eventually succeeds it will get the key K.
– Only a thread holding this key is allowed to call release.
Sharing of the lock among several threads is achieved by the isLock predicate
being duplicable. Mutual exclusion is ensured by the last bullet point together
with the requirement of K being non-duplicable whenever we have isLock(v, K).
For a leisurely introduction to such specifications, the reader may consult Birkedal
and Bizjak [1].
Let us now return to the distributed lock synchronization. To give clients
the possibility of interacting with the lock server as they would with such a
concurrent lock module, the specification for the lock server will look as follows.
This specification simply states that a lock server should have a primordial
protocol Φlock and that it needs the key resource to begin with. To allow for the
desired interaction with the server, we define the socket protocol Φlock as follows:
The protocol Φlock demands that a client of the lock has to be bound to some
protocol Ψ and that the server can receive two types of messages fulfilling either
acq(m, Ψ ) or rel(m, Ψ ). These correspond to the module’s two methods acquire
and release respectively. In the case of a "LOCK" message, the server will answer
either "NO" or "YES" along with the key resource. In either case, the answer should
suffice for fulfilling the client protocol Ψ .
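A hypothetical client-side view of this protocol, again in OCaml over Unix UDP sockets (our sketch; it ignores message loss and retransmission, in line with the duplicate-protection assumption above), mirrors the acquire/release interface of the concurrent lock module:

open Unix

(* send a request and block for the server's reply *)
let send_recv skt srv msg =
  let buf = Bytes.create 1024 in
  ignore (sendto skt (Bytes.of_string msg) 0 (String.length msg) [] srv);
  let len, _ = recvfrom skt buf 0 1024 [] in
  Bytes.sub_string buf 0 len

(* spin until the server answers "YES", i.e. grants the lock *)
let rec acquire skt srv =
  if send_recv skt srv "LOCK" <> "YES" then acquire skt srv

(* any message other than "LOCK" releases; the server replies "RELEASED" *)
let release skt srv = ignore (send_recv skt srv "RELEASE")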
3 AnerisLang
AnerisLang is an untyped functional language with higher-order functions, fork-
based concurrency, higher-order mutable references, and primitives for communi-
cating over network sockets. Selected rules of the operational semantics are shown below.
(e, h) →h (e′, h′)
──────────────────────────────────────────────────────
n; e, (H[n → h], S, P, M) → n; e′, (H[n → h′], S, P, M)

S(n)(z) = None    p ∉ P(ip)
S′ = S[n → S(n)[z → Some (ip, p)]]    P′ = P[ip → P(ip) ∪ {p}]
──────────────────────────────────────────────────────
n; socketbind z (ip, p), (H, S, P, M) → n; 0, (H, S′, P′, M)

S(n)(z) = Some to
M(i) = (from, to, msg, Sent)    M′ = M[i → (from, to, msg, Received)]
──────────────────────────────────────────────────────
n; receivefrom z, (H, S, P, M) → n; Some (msg, from), (H, S, P, M′)

S(n)(z) = Some to
──────────────────────────────────────────────────────
n; receivefrom z, (H, S, P, M) → n; None, (H, S, P, M)
The socket operation allocates a new unbound socket using a fresh handle z for a node n, and socketbind binds a
socket address a to an unbound socket z if the address and port p is not already
in use. Hereafter, the port is no longer available in P (ip). For bound sockets,
sendto sends a message msg to a destination address to from the sender’s address
f rom found in the bound socket. The message is assigned a unique identifier and
tagged with a status flag Sent indicating that the message has been sent and
not received. The operation returns the number of characters sent.
To model possibly dropped or delayed messages we introduce two rules for
receiving messages using the receivefrom operation, which on a bound socket either returns a previously unreceived message or nothing. If a message is received, the status flag of the message is updated to Received.
Third and finally, using standard call-by-value right-to-left evaluation contexts
K ∈ Ectx, we lift the node-local head reduction to a distributed systems reduction, shown below. We write →∗ for its reflexive-transitive closure. The distributed systems relation reduces by picking a thread on any node or forking off a new thread on a node.

(n; e, Σ) → (n; e′, Σ′)
────────────────────────────────────────────────────
(T1 ++ [n; K[e]] ++ T2, Σ) → (T1 ++ [n; K[e′]] ++ T2, Σ′)
Note that in Aneris the usual points-to connective for the heap, ℓ ↦n v, is indexed by a node identifier n ∈ Node, asserting ownership of the singleton heap mapping ℓ to v on node n.
The logic features (impredicative) invariants4 and user-definable ghost state via the proposition a^γ, which asserts ownership of a piece of ghost state a at
ghost location γ. The logical support for user-defined invariants and ghost state
allows one to relate (ghost and physical) resources to each other; this is vital for
our specifications as will become evident in Sect. 5 and Sect. 6. We refer to Jung
et al. [14] for a more thorough treatment of user-defined ghost state.
To reason about AnerisLang programs, the logic features Hoare triples.5 The
intuitive reading of the Hoare triple {P} n; e {x. Q} is that if the program e is executed on node n in a state satisfying P, then the execution does not get stuck and, if it terminates with a value x, the resulting state satisfies Q.
4 To avoid the issue of reentrancy, invariants are annotated with a namespace and Hoare triples with a mask. We omit both for the sake of presentation as they are orthogonal issues.
5 In both Iris and Aneris the notion of a Hoare triple is defined in terms of a weakest precondition, but this will not be important for the remainder of this paper.
The socket resource z →n o keeps track of the address associated with the
socket handle z on node n and takes part in ensuring that the socket is bound
only once. It behaves similarly to the points-to connective for the heap, e.g.,
z →n o ∗ z →n o ⇒ False.
As briefly touched upon in Sect. 2, the logic offers two different rules for
binding an address to a socket depending on whether or not the address has a (at
the level of the logic) primordial, agreed upon protocol. To distinguish between
such static and dynamic addresses, we use a persistent resource Fixed(A) to keep
track of the set of addresses that have a fixed socket protocol.
To reason about a static address binding to a socket z it suffices to show that
the address a being bound has a fixed interpretation (by being in the “fixed” set),
that the port of the address is free, and that the socket is not bound.
Socketbind-static
{Fixed(A) ∗ a ∈ A ∗ FreePort(a) ∗ z →n None}
n; socketbind z a
{x. x = 0 ∗ z →n Some a}
In accordance with the BSD-socket API, the bind operation returns the integer 0
and the socket resource gets updated, reflecting the fact that the binding took
place.
The rule for dynamic address binding is similar but the address a should not
have a fixed interpretation. Moreover, the user of the logic is free to pick the
socket protocol Φ to govern address a.
Socketbind-dynamic
{Fixed(A) ∗ a ∉ A ∗ FreePort(a) ∗ z →n None}
n; socketbind z a
{x. x = 0 ∗ z →n Some a ∗ a ⇒ Φ}
We now state a formal adequacy theorem, which expresses that Aneris guarantees both safety and that all protocols are adhered to.
To state our theorem we introduce a notion of initial state coherence: a set of addresses A ⊆ Address = Ip × Port and a map P : Ip ⇀fin ℘fin(Port) are said to satisfy initial state coherence if the following hold: (1) if (i, p) ∈ A then i ∈ dom(P), and (2) if i ∈ dom(P) then P(i) = ∅.
Suppose that the Hoare triple

{Fixed(A) ∗ (∗a∈A a ⇒ Φa) ∗ (∗i∈dom(P) FreeIp(i))} n1; e {v. ϕ(v)}

is derivable in Aneris.
Given predefined socket protocols for all primordial protocols and the necessary
free IP addresses, this theorem provides the normal adequacy guarantees of Iris-
like logics, namely safety, i.e., that nodes and threads on nodes cannot get stuck
and that the postcondition holds for the resulting value. Notice, however, that
this theorem also implies that all nodes adhere to the agreed upon protocols;
otherwise, a node not adhering to a protocol would be able to cause another
node to get stuck, which the adequacy theorem explicitly guarantees against.
Fig. 4. The architecture of a distributed system with a load balancer and two servers: clients C1, . . . , Cn send requests to the main socket z0, and two threads T1 and T2 running serve relay them through sockets z1 and z2 to servers S1 and S2.
v. The two predicates are application specific and used to give logical accounts
of the client requests and the server responses, respectively. Furthermore, we
parameterize the protocol by a predicate Pval on a meta-language value that
will allows us to maintain ghost state between the request and response as will
become evident in following.
In our specification, the sockets where the load balancer and the servers
receive requests (the blue sockets in Fig. 4) will all be governed by the same
socket protocol Φrel such that the load balancer may seamlessly relay requests
and responses between the main socket and the servers, without invalidating any
socket protocols. We define the generic relay socket protocol Φrel as follows:
Φrel(Pval, Pin, Pout)(m) ≜ ∃Ψ, v. from(m) ⇒ Ψ ∗ Pin(m, v) ∗ Pval(v) ∗
(∀m′. Pval(v) ∗ Pout(m′, v) −∗ Ψ(m′))
When verifying a request, this protocol demands that the sender (corresponding
to the red sockets in Fig. 4) is governed by some protocol Ψ , that the request
fulfills the Pin and Pval predicates, and that Ψ is satisfied given a response that
maintains Pval and satisfies Pout .
When verifying the load balancer receiving a request m from a client, we
obtain the resources Pin (m, v) and Pval (v) for some v according to Φrel . This
suffices for passing the request along to a server. However, to forward the server’s
response to the client we must know that the server behaves faithfully and
gave us the response to the right request value v. Φrel does not give us this
immediately as the v is existentially quantified. Hence we define a ghost resource
LB(π, s, v) that provides fractional ownership for π ∈ (0, 1], which satisfies
LB(1, s, v) ⊣⊢ LB(1/2, s, v) ∗ LB(1/2, s, v), and for which v can only be updated if π = 1; in particular, LB(π, s, v) ∗ LB(π, s, v′) =⇒ v = v′ for any π. Using this resource, the server with address s will have PLB(s) as its instantiation of Pval, where PLB(s)(v) ≜ LB(1/2, s, v).
When verifying the load balancer, we will update this resource to the request
value v when receiving a request (as we have the full fraction) and transfer
LB(1/2, s, v) to the server with address s handling the request and, according to
Φrel , it will be required to send it back along with the result. Since the server
logically only gets half ownership, the value cannot be changed. Together with
the fact that v is also an argument to Pin and Pout , this ensures that the server
fulfills Pout for the same value as it received Pin for. The socket protocol for the
serve function’s socket (z1 and z2 in Fig. 4) that communicates with a server
with address s can now be stated as follows.
Since all calls to the serve function need access to the main socket in order to
receive requests, we will keep the socket resource required in an invariant ILB
which is shared among all the threads.
The specification requires the address amain of the socket main to be governed
by Φrel with a trivial instantiation of Pval and the address s of the server to
be governed by Φrel with Pval instantiated by PLB . The specification moreover
expects resources for a dynamic setup, the invariant that owns the resource
needed to verify use of the main socket, and a full instance of the LB(1, s, v)
resource for some arbitrary v.
With this specification in place the complete specification of our load balancer
is immediate (note that it is parameterized by Pin and Pout ):
{Static((ip, p), A, Φrel(λ_. True, Pin, Pout)) ∗ IsNode(n) ∗
(∗p′∈ports Dynamic((ip, p′), A)) ∗
(∗s∈srvs ∃v. LB(1, s, v) ∗ s ⇒ Φrel(PLB(s), Pin, Pout))}
n; load_balancer ip p srvs
{True}
where ports = [1100, · · · , 1100 + |srvs|]. In addition to the protocol setup for
each server as just described, for each port p ∈ ports which will become the
endpoint for a corresponding server, we need the resources for a dynamic setup,
and we need the resource for a static setup on the main input address (ip, p).
Pin^add(m, (v1, v2)) ≜ body(m) = serialize(v1, v2)
Pout^add(m, (v1, v2)) ≜ body(m) = serialize(v1 + v2)
with serialize being the same serialization function from Sect. 2.3. We build and
verify two distributed systems, (1) one consisting of two clients and an addition
server and (2) one including two clients, a load balancer and three addition servers.
We prove both of these systems safe and the proofs utilize the specifications we
have given for the individual components. Notice that Φrel(λ_. True, Pin^add, Pout^add) and Φadd from Sect. 2.3 are the same. This is why we can use the same client
specification in both system proofs. Hence, we have demonstrated Aneris’ ability
and support for horizontal composition of the same modules in different systems.
While the load balancer demonstrates the use of node-local concurrency, its
implementation does not involve shared memory concurrency, i.e., synchronization
among the node-local threads. The appendix [20] includes an example of a
distributed system, where clients interact with a server that implements a bag.
The server uses multiple threads to handle client requests concurrently and
the threads use a shared bag data structure governed by a lock. This example
demonstrates Aneris’ ability to support both shared-memory concurrency and
distributed networking.
(b) All participants that voted for a commit wait for the final verdict from
the coordinator. If the participant receives a global commit it locally
commits the transaction, otherwise the transaction is locally aborted. All
participants must acknowledge.
Our implementation and specification details can be found in the appendix [20]
and in the accompanying Coq development, but we will emphasize a few key
points.
To provide general, reusable implementations and specifications of the coordi-
nator and participants implementing TPC, we do not define how requests, votes,
nor decisions look like. We leave it to a user of the module to provide decidable
predicates matching the application specific needs and to define the logical, local
pre- and postconditions, P and Q, of participants for the operation in question.
Our specifications use fractional ghost resources to keep track of coordinator
and participant state w.r.t. the coordinator and participant transition systems
indicated in the protocol description above. Similar to our previous case study, we
exploit partial ownership to limit when transitions can be made. When verifying
a participant, we keep track of their state and the coordinator’s state and require
all participants’ view of the coordinator state to be in agreement through an
invariant.
In short, our specification of TPC
– ensures the participants and coordinator act according to the protocol, i.e.,
• the coordinator decides based on all the participant votes,
• participants act according to the global decision,
• if the decision was to commit, we obtain the resources described by Q
for all participants,
• if the decision was to abort, we still have the resources described by P
for all participants,
– does not require the coordinator to be primordial, so the coordinator could change from round to round. A schematic coordinator round is sketched below.
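The following OCaml sketch of one coordinator round (ours, not the paper's Coq implementation; it assumes reliable, non-duplicated delivery and does not match votes to senders) shows the two phases of the protocol:

open Unix

let recv_msg skt =
  let buf = Bytes.create 1024 in
  let len, _from = recvfrom skt buf 0 1024 [] in
  Bytes.sub_string buf 0 len

let send_msg skt msg addr =
  ignore (sendto skt (Bytes.of_string msg) 0 (String.length msg) [] addr)

let coordinate skt participants request =
  (* Phase 1: broadcast the request and collect one vote per participant. *)
  List.iter (send_msg skt request) participants;
  let votes = List.map (fun _ -> recv_msg skt) participants in
  let verdict =
    if List.for_all (fun v -> v = "COMMIT") votes then "COMMIT" else "ABORT"
  in
  (* Phase 2: broadcast the verdict and await all acknowledgements. *)
  List.iter (send_msg skt verdict) participants;
  List.iter (fun _ -> ignore (recv_msg skt)) participants;
  verdict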
Fig. 6. The architecture of a replicated logging system implemented using the TPC modules (the blue parts of the diagram) with a coordinator and two databases (S1 and S2) each storing a copy of the log.
The first resource tracks the log stored in the database at a node at address a, LOG(π, a, l). The second one keeps
track of what the log should be updated to, if the pending round of consensus
succeeds. This is a pair of the existing log l and the (pending) change s proposed
in this round, PEND(π, a, (l, s)). We exploit fractional resource ownership by
letting the coordinator, logically, keep half of the pending log resources at all
times. Together with suitable local pre- and postconditions for the databases,
this prevents the databases from doing arbitrary changes to the log. Concretely,
we instantiate P and Q of the TPC module as follows:
where @ denotes string concatenation. Note how the request message specifies the
proposed change (the string that we would like to add to the log is appended
to the request message) and how we ensure consistency by requiring the two
ghost assertions to hold for the same log. Even though l and s are existentially
quantified, we know the logs cannot be inconsistent, since the coordinator retains
partial knowledge of the log. Due to the guarantees given by the TPC specification,
this implies that if the global decision was to commit a change, this change
will have happened locally on all databases, cf. LOG(1/2, p, l@s) in Qrep, and if
the decision was to abort, then the log remains unchanged on all databases,
cf. LOG(1/2, p, l) in Prep. We refer to the appendix [20] or the Coq development
for further details.
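The fractional resources used here obey the usual laws of Iris-style fractional ghost state. As a reminder (our summary in the paper's notation, not a rule quoted from it):

LOG(π1 + π2, a, l) ⊣⊢ LOG(π1, a, l) ∗ LOG(π2, a, l)    (splitting and joining)
LOG(π1, a, l1) ∗ LOG(π2, a, l2) ⊢ l1 = l2              (agreement)

Updating the logical log requires the full fraction 1. Since the coordinator permanently retains LOG(1/2, p, l), a database alone never owns the full fraction, which is precisely how the half-ownership discipline rules out unilateral changes to the log.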
7 Related Work
Session Types for Giving Types to Protocols. Session types have been studied for
a wide range of process calculi, in particular the typed π-calculus. The idea is to
describe two-party communication protocols as a type to ensure communication
safety and progress [10]. This has been extended to multi-party asynchronous
channels [11], multi-role types [2], which informally model topics of actor-based
message-passing, and dependent session types allowing quantification over mes-
sages [38]. Our socket protocol definitions are quite similar to multi-party
asynchronous session types, with progress encoded by suitable ghost assertions
and uses of the magic wand. Actris [8] is a logic for session-type based
reasoning about message-passing in actor-based languages.
Hoare Style Reasoning About Distributed Systems. Disel [35] is a Hoare Type
Theory for distributed program verification in Coq with ideas from separation
logic. It provides the novel protocol-tailored rules WithInv and Frame, which
enable modular proofs under the condition of an inductive invariant and support
composition of distributed systems. Disel programs can be extracted into
runnable OCaml programs; such extraction is on our agenda for future work.
IronFleet [7] allows for building provably correct distributed systems by
combining TLA-style state-machine refinement with Hoare-logic verification in a
layered approach, all embedded in Dafny [24]. IronFleet also supports liveness
assertions. For a comparison of Disel and IronFleet to Aneris from a modularity
point of view, we refer to the introduction.
Other Distributed Verification Efforts. Verdi [40] is a framework for writing and
verifying implementations of distributed algorithms in Coq, providing a novel
approach to network semantics and fault models. To achieve compositionality, the
authors introduced verified system transformers, that is, functions that trans-
form one implementation into another with different assumptions about its
environment. This makes vertical composition difficult for clients of proven
protocols; in comparison, AnerisLang seems more expressive.
EventML [30, 31] is a functional language in the ML family that can be used
for coding distributed protocols using high-level combinators from the Logic of
Events, and verify them in the Nuprl interactive theorem prover. It is not entirely
clear how modular reasoning works, since one works within the model; however,
the notion of a central main observer is akin to our distinguished system node.
8 Conclusion
Distributed systems are ubiquitous and hence it is essential to be able to verify
them. In this paper we presented Aneris, a framework for writing and verifying
distributed systems in Coq built on top of the Iris framework. From a programming
point of view, the important aspect of AnerisLang is that it is feature-rich: it is a
concurrent ML-like programming language with network primitives. This allows
individual nodes to internally use higher-order heap and concurrency to write
efficient programs.
The Aneris logic provides node-local reasoning through socket protocols. That
is, we can reason about individual nodes in isolation as we reason about indi-
vidual threads. We demonstrate the versatility of Aneris by studying interesting
distributed systems both implemented and verified within Aneris. The adequacy
theorem of Aneris implies that these programs are safe to run.
Table 1. Sizes of implementations, specifications, and proofs in lines of code. When
proving adequacy, the system must be closed.
Relating the verification sizes of the modules from Table 1 to other formal
verification efforts in Coq indicates that it is easier to specify and verify systems
in Aneris. The total work required to prove two-phase commit with replicated
logging is 1,272 lines, which is just half of the lines needed for proving the inductive
invariant for TPC in other works [35]. However, extensive work has gone into
the Iris Proof Mode, so it is hard to conclude that Aneris requires less verification
effort rather than simply having richer tactics.
Acknowledgments
This work was supported in part by the ModuRes Sapere Aude Advanced Grant
from The Danish Council for Independent Research for the Natural Sciences
(FNU); a Villum Investigator grant (no. 25804), Center for Basic Research in
Program Verification (CPV), from the VILLUM Foundation; and the Flemish
research fund (FWO).
Bibliography
[1] Birkedal, L., Bizjak, A.: Lecture notes on Iris: Higher-order concur-
rent separation logic (2017), URL http://iris-project.org/tutorial-pdfs/
iris-lecture-notes.pdf
[2] Deniélou, P., Yoshida, N.: Dynamic multirole session types. In: Ball,
T., Sagiv, M. (eds.) Proceedings of the 38th ACM SIGPLAN-SIGACT
Symposium on Principles of Programming Languages, POPL 2011,
Austin, TX, USA, January 26-28, 2011, pp. 435–446, ACM (2011),
https://doi.org/10.1145/1926385.1926435
[3] Dinsdale-Young, T., Birkedal, L., Gardner, P., Parkinson, M.J., Yang,
H.: Views: compositional reasoning for concurrent programs. In: Gi-
acobazzi, R., Cousot, R. (eds.) The 40th Annual ACM SIGPLAN-
SIGACT Symposium on Principles of Programming Languages, POPL
’13, Rome, Italy - January 23 - 25, 2013, pp. 287–300, ACM (2013),
https://doi.org/10.1145/2429069.2429104
[4] Dinsdale-Young, T., Dodds, M., Gardner, P., Parkinson, M.J., Vafeiadis, V.:
Concurrent abstract predicates. In: D’Hondt, T. (ed.) ECOOP 2010 - Object-
Oriented Programming, 24th European Conference, Maribor, Slovenia, June
21-25, 2010. Proceedings, Lecture Notes in Computer Science, vol. 6183, pp.
504–528, Springer (2010), https://doi.org/10.1007/978-3-642-14107-2_24
[5] Floyd, R.W.: Assigning meanings to programs. Mathematical aspects of
computer science 19(19-32), 1 (1967)
[6] Gray, J.: Notes on data base operating systems. In: Flynn, M.J., Gray, J.,
Jones, A.K., Lagally, K., Opderbeck, H., Popek, G.J., Randell, B., Saltzer,
J.H., Wiehle, H. (eds.) Operating Systems, An Advanced Course, Lec-
ture Notes in Computer Science, vol. 60, pp. 393–481, Springer (1978),
https://doi.org/10.1007/3-540-08755-9_9
[7] Hawblitzel, C., Howell, J., Kapritsos, M., Lorch, J.R., Parno, B., Roberts,
M.L., Setty, S.T.V., Zill, B.: Ironfleet: proving practical distributed systems
correct. In: Miller, E.L., Hand, S. (eds.) Proceedings of the 25th Symposium
on Operating Systems Principles, SOSP 2015, Monterey, CA, USA, October
4-7, 2015, pp. 1–17, ACM (2015), https://doi.org/10.1145/2815400.2815428
[8] Hinrichsen, J.K., Bengtson, J., Krebbers, R.: Actris: session-type
based reasoning in separation logic. PACMPL 4, 6:1–6:30 (2020),
https://doi.org/10.1145/3371074
[9] Holzmann, G.J.: The model checker SPIN. IEEE Trans. Software Eng. 23(5),
279–295 (1997), https://doi.org/10.1109/32.588521
[10] Honda, K., Vasconcelos, V.T., Kubo, M.: Language primitives and type
discipline for structured communication-based programming. In: Hankin,
C. (ed.) Programming Languages and Systems - ESOP’98, 7th European
Symposium on Programming, Held as Part of the European Joint Conferences
on the Theory and Practice of Software, ETAPS’98, Lisbon, Portugal, March
362 M. Krogh-Jespersen et al.
[30] Rahli, V., Guaspari, D., Bickford, M., Constable, R.L.: Formal specification,
verification, and implementation of fault-tolerant systems using EventML.
ECEASST 72 (2015), https://doi.org/10.14279/tuj.eceasst.72.1013
[31] Rahli, V., Guaspari, D., Bickford, M., Constable, R.L.: EventML: Spec-
ification, verification, and implementation of crash-tolerant state ma-
chine replication systems. Sci. Comput. Program. 148, 26–48 (2017),
https://doi.org/10.1016/j.scico.2017.05.009
[32] Reynolds, J.C.: Separation logic: A logic for shared mutable data structures.
In: 17th IEEE Symposium on Logic in Computer Science (LICS 2002), 22-25
July 2002, Copenhagen, Denmark, Proceedings, pp. 55–74, IEEE Computer
Society (2002), https://doi.org/10.1109/LICS.2002.1029817
[33] da Rocha Pinto, P., Dinsdale-Young, T., Gardner, P.: Tada: A logic for time
and data abstraction. In: Jones, R.E. (ed.) ECOOP 2014 - Object-Oriented
Programming - 28th European Conference, Uppsala, Sweden, July 28 -
August 1, 2014. Proceedings, Lecture Notes in Computer Science, vol. 8586,
pp. 207–231, Springer (2014), https://doi.org/10.1007/978-3-662-44202-9_9
[34] Sergey, I., Nanevski, A., Banerjee, A.: Mechanized verification of fine-grained
concurrent programs. In: Grove, D., Blackburn, S. (eds.) Proceedings of the
36th ACM SIGPLAN Conference on Programming Language Design and
Implementation, Portland, OR, USA, June 15-17, 2015, pp. 77–87, ACM
(2015), https://doi.org/10.1145/2737924.2737964
[35] Sergey, I., Wilcox, J.R., Tatlock, Z.: Programming and proving
with distributed protocols. PACMPL 2(POPL), 28:1–28:30 (2018),
https://doi.org/10.1145/3158116
[36] Svendsen, K., Birkedal, L.: Impredicative concurrent abstract predicates.
In: Shao, Z. (ed.) Programming Languages and Systems - 23rd European
Symposium on Programming, ESOP 2014, Held as Part of the European Joint
Conferences on Theory and Practice of Software, ETAPS 2014, Grenoble,
France, April 5-13, 2014, Proceedings, Lecture Notes in Computer Science,
vol. 8410, pp. 149–168, Springer (2014), https://doi.org/10.1007/978-3-642-
54833-8_9
[37] Timany, A., Stefanesco, L., Krogh-Jespersen, M., Birkedal, L.: A logical
relation for monadic encapsulation of state: proving contextual equiva-
lences in the presence of runST. PACMPL 2(POPL), 64:1–64:28 (2018),
https://doi.org/10.1145/3158152
[38] Toninho, B., Caires, L., Pfenning, F.: Dependent session types via intuitionis-
tic linear type theory. In: Schneider-Kamp, P., Hanus, M. (eds.) Proceedings
of the 13th International ACM SIGPLAN Conference on Principles and
Practice of Declarative Programming, July 20-22, 2011, Odense, Denmark,
pp. 161–172, ACM (2011), https://doi.org/10.1145/2003476.2003499
[39] Turon, A., Dreyer, D., Birkedal, L.: Unifying refinement and hoare-style
reasoning in a logic for higher-order concurrency. In: Morrisett, G., Uustalu,
T. (eds.) ACM SIGPLAN International Conference on Functional Program-
ming, ICFP’13, Boston, MA, USA - September 25 - 27, 2013, pp. 377–390,
ACM (2013), https://doi.org/10.1145/2500365.2500600
[40] Wilcox, J.R., Woos, D., Panchekha, P., Tatlock, Z., Wang, X., Ernst, M.D.,
Anderson, T.E.: Verdi: a framework for implementing and formally verifying
distributed systems. In: Grove, D., Blackburn, S. (eds.) Proceedings of the
36th ACM SIGPLAN Conference on Programming Language Design and
Implementation, Portland, OR, USA, June 15-17, 2015, pp. 357–368, ACM
(2015), https://doi.org/10.1145/2737924.2737958
Continualization of Probabilistic Programs
With Correction
Jacob Laurel and Sasa Misailovic
1 Introduction
However, many popular Bayesian models can have distributions which are
discrete or hybrid discrete-continuous mixtures (denoted simply as “hybrid”)
leading to computationally inefficient inference for much the same reason. Par-
ticularly when the observed variable is a discrete-continuous mixture, inference
may fail altogether [65]. Likewise, even if the observed variable and likelihood
are continuous, the prior or important latent variables may be discrete (e.g.,
Binomial), leading to an equally difficult discrete inference problem [61, 50].
In fact, a number of popular inference algorithms such as Hamiltonian Monte
Carlo [48], NUTS [31, 50], or versions of Variational Inference (VI) [9] only work
for restricted classes of programs (e.g., by requiring each latent variable to be
continuous) to avoid these problems. Furthermore, we cannot always marginalize
away the program's discrete component, since it is often precisely the one we are
interested in. Even if a parameter could be safely marginalized out, doing so may
require advanced domain knowledge to analytically derive a new model and
rewrite the program completely, which can be well beyond the abilities of the
average PPL user.
Problem statement: We address the question of how to accurately approx-
imate the semantics of a probabilistic program P whose prior or likelihood is
either discrete or hybrid, with a new program PC , where all variables follow
continuous distributions, so that we can exploit the aforementioned inference
algorithms to improve inference in an easy, off-the-shelf fashion.
While a programmer could manually rewrite the probabilistic program or
model and apply approximations in an ad hoc manner, such as simply adding
Gaussian noise to each variable, this would be neither sufficient nor wise. For
instance, it has been shown that when a model contains Gaussians, how they
are programmatically written and parametrized can impact the inference time and
quality [29, 5]. Also, by not correcting for continuity in the program's branch
conditions, one could significantly alter the probability of executing a particular
program branch, and hence alter the overall distribution represented by the
probabilistic program.
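To see how severe this can be, consider an exact-equality branch on a smoothed variable. A small illustration (SciPy assumed; the 5% and β values mirror the GPA example of Section 2, and we ignore the small contribution of the continuous mixture component):

from scipy import stats

# Original hybrid model: the branch `GPA == 4` is taken with probability 0.05.
p_branch_original = 0.05

# After smoothing the point mass to Gaussian(4, beta), a continuous variable
# hits the single point 4.0 with probability 0, so the branch is never taken.
beta = 0.1
p_branch_naive = 0.0

# Widening the predicate to an interval 4 - t1 < GPA < 4 + t2 restores it.
t1 = t2 = 0.2
g = stats.norm(4, beta)
p_branch_corrected = 0.05 * (g.cdf(4 + t2) - g.cdf(4 - t1))
print(p_branch_naive, p_branch_corrected)   # 0.0 vs. roughly 0.048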
Leios: We introduce a fully automated program analysis framework to continu-
alize probabilistic programs for significantly improved inference performance, es-
pecially in cases where inference was originally intractable or prohibitively slow.
An input to Leios is a probabilistic program, which consists of (1) a model
that specifies the prior distributions and how the latent variables are related,
(2) specifications of observable variables, and (3) specifications of data sets. Leios
transforms the model, given the set of the observable variables. This model is
then substituted back into the original program to produce a fully continuous
probabilistic program, leading to greatly improved inference. Furthermore, the
approximated program can easily be reused with different, unseen data.
Figure 1 presents the main workflow of Leios:
– Distribution transformer and Boolean predicate correction: Leios first finds
individual discrete distribution sample statements to replace with continu-
ous approximations based on known convergence theorems that specifically
match the distributions’ first moments [23]. Leios then performs a dataflow
analysis to identify and then correct Boolean predicates in branches to best
preserve the original program’s probabilistic control flow. To correct Boolean
predicates, we convert the program to a sketch and fill in the predicates with
holes that will then be synthesized with the optimal values. We ensure that
the distribution of the model’s observed variables is fully continuous with
a differentiable density function, by transforming it using an approach that
adapts Smooth Interpretation [14] to probabilistic programs. We describe
the transformations in Section 4.
– Parameter Synthesizer: Leios determines the optimal parameters which min-
imize a numerical approximation of the Wasserstein Distance to fill in the
holes in the program sketch. This step of the algorithm can be thought of as
a “training phase” much like in machine learning, and we need only perform
it once for a given program, regardless of the number of times we will later
perform inference on different data sets. These parameters correspond to
continuity correction factors in classical probability theory [23]. We describe
the synthesizer in Section 5.
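To make the sketch-and-synthesize pipeline concrete before the detailed example, here is a toy Python rendering of a continualized program as a sketch parameterized by its predicate holes θ (our own encoding, anticipating the GPA example of Section 2; Leios' internal representation differs):

import math
import random

def continualized_model(theta, rng=random):
    t1, t2, t3 = theta                    # continuity-correction holes
    prior = rng.uniform(20, 50)
    # max(., 0) guards against negative Gaussian samples (cf. Section 4)
    recruiters = max(rng.gauss(prior, math.sqrt(prior)), 0.0)
    perf = rng.gauss(4, 0.1)              # smoothed point mass at 4.0
    reg = 4 * rng.betavariate(7, 3)
    gpa = perf if rng.random() < 0.05 else reg
    if 4 - t1 < gpa < 4 + t2:             # was: gpa == 4
        p = 0.9
    elif gpa > 3.5 + t3:                  # was: gpa > 3.5
        p = 0.6
    else:
        p = 0.5
    mu = recruiters * p
    interviews = max(rng.gauss(mu, math.sqrt(mu * (1 - p))), 0.0)
    return rng.gauss(interviews * 0.4, math.sqrt(interviews * 0.4 * 0.6))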
Contributions: This paper makes the following main contributions:
– Concept: To the best of our knowledge, Leios is the first technique to auto-
mate program transformations that approximate discrete or hybrid discrete-
continuous probabilistic programs with fully continuous ones to improve in-
ference. It combines insights from probability theory, program analysis, com-
piler autotuning, and machine learning.
– Program Transformation: Leios implements a set of transformations on
distributions and the conditional statements that can produce provably con-
tinuous probabilistic programs that approximate the original ones.
– Parameter Synthesis: We present a synthesis algorithm that corrects the
probabilities of taking specific branches in the probabilistic program and
improves the overall inference accuracy.
– Evaluation: We evaluated Leios on a set of ten benchmarks from existing
literature and two systems, WebPPL (using MCMC sampling) and Pyro
(using stochastic variational inference). The results demonstrate that Leios
can achieve a substantial decrease in inference time compared to the origi-
nal model, while still achieving high inference accuracy. We also show how
a continualized program allows for easy off-the-shelf inference that is not
always readily available to discrete or hybrid models.
(a)

1  Data := [12, 8, ...];
2
3  Model {
4    prior = Uniform(20, 50);
5    Recruiters = Poisson(prior);
6
7    perfGPA = 4;
8    regGPA = 4 * Beta(7, 3);
9    GPA = Mix(perfGPA, .05, regGPA, .95);
10
11   if (GPA == 4) {
12     Interviews = Bin(Recruiters, .9);
13   } else if (GPA > 3.5) {
14     Interviews = Bin(Recruiters, .6);
15   } else {
16     Interviews = Bin(Recruiters, .5);
17   }
18
19   Offers = Bin(Interviews, 0.4);
20 }
21
22 for d in Data {
23   factor(Offers, d);
24 }
25
26 return prior;

(b)

1  Model {
2    prior = Uniform(20, 50);
3    mu_p = prior;
4    sigma_p = sqrt(prior);
5    Recruiters = Gaussian(mu_p, sigma_p);
6
7    perfGPA = Gaussian(4, β);
8    regGPA = 4 * Beta(7, 3);
9    GPA = Mix(perfGPA, .05, regGPA, .95);
10
11   if (4 - θ1 < GPA < 4 + θ2) {
12     mu = Recruiters * 0.9;
13     sigma = sqrt(Recruiters * 0.9 * 0.1);
14     Interviews = Gaussian(mu, sigma);
15   } else if (GPA > 3.5 + θ3) {
16     mu = Recruiters * 0.6;
17     sigma = sqrt(Recruiters * 0.6 * 0.4);
18     Interviews = Gaussian(mu, sigma);
19   } else {
20     mu = Recruiters * 0.5;
21     sigma = sqrt(Recruiters * 0.5 * 0.5);
22     Interviews = Gaussian(mu, sigma);
23   }
24   mu2 = Interviews * 0.4;
25   sigma2 = sqrt(Interviews * 0.4 * 0.6);
26   Offers = Gaussian(mu2, sigma2);
27 }

Fig. 2: (a) the original probabilistic program and (b) its continualized version.
2 Example
Figure 2 (a) presents a program that infers the parameters of the distribution
modeling the number of recruiters coming to a recruiting fair, given the
number of offers multiple students receive (line 1). As the number of recruiters
may vary year to year, we model this count as a Poisson distribution (line 5).
However, to accurately quantify how much this count varies year to year, we
want to estimate the unknown parameter of this Poisson variable. We thus place
a uniform prior over this parameter (line 4).
The example represents the student GPAs in lines 7-9: it is either a perfect
4.0 score or any number between 0 and 4. We model the perfect GPA with a dis-
crete distribution that has all the probability mass at 4.0 (line 7). To model the
imperfect GPA, we use a Beta distribution (line 8), scaled by 4 to lie in the range
[0.0, 4.0]. Finally, the distribution of the GPAs is a mixture of these two compo-
nents (line 9). Our mixture assumes that 5% of students obtain perfect GPAs.
Because the GPA impacts the number of interviews a student receives, our
model incorporates control flow where each branch captures the distribution
of interviews received, conditioned on the GPA being in a certain range (lines
11-17). Each student’s resume is available to all recruiters and each recruiter
can request an interview or not, hence all three of the Interviews distributions
follow a Binomial distribution (here denoted as bin) with the same n (number of
recruiters) but with different probabilities (higher probabilities for higher GPAs).
From the factor statement (line 23) we see that the Offers variable governs the
likelihood of the observed data.
2.1 Continualization
Our approach starts from the observation that inference with continuous distri-
butions is often more efficient for several inference algorithms [53, 52, 56]. Leios
first continualizes discrete and hybrid distributions in the original model. Start-
ing in line 5 in Figure 2 (b), we approximate the Poisson variable with a Gaussian
using a classical result [16], hence relaxing the constraint that the number of re-
cruiters be an integer. (For ease of presentation we created new variables mu_p
and sigma_p corresponding to the parameters of the approximation; Leios sim-
ply inlines these.) We next approximate the discrete component of the GPA
hybrid mixture distribution by a Gaussian centered at 4 and small tunable stan-
dard deviation β (line 7). The GPA is now a mixture of two continuous distri-
butions. We then transform all of the Binomials to Gaussians (lines 14, 18, 22,
and 26) using another classic approximation [23].
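As a sanity check of this classical approximation (our own experiment with SciPy, not from the paper), one can compare the CDFs of Poisson(λ) and the moment-matched Gaussian over the λ range induced by the Uniform(20, 50) prior:

import numpy as np
from scipy import stats

for lam in (20, 35, 50):
    ks = np.arange(0, 121)
    pois = stats.poisson(lam).cdf(ks)
    # classical half-unit continuity correction for comparing at integers
    gaus = stats.norm(lam, np.sqrt(lam)).cdf(ks + 0.5)
    print(lam, np.max(np.abs(pois - gaus)))   # max CDF gap; small for lambda >= 10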
Finally, Leios smooths the observed variables by a Gaussian to ensure the
likelihood function is both fully continuous and differentiable. In this example
we see that the approximation of the Binomial already makes the distribution of
Offers (given all latent values) a Gaussian, hence this final step is not needed.
After continualization, the GPA cannot be exactly 4.0, thus we need to re-
pair the first conditional branch of the continualized program. In line 11, we re-
place the exact equality predicate with the interval predicate 4-θ1 < GPA < 4+θ2
where each θ is a hole whose value Leios will synthesize. Leios finds all such
branching predicates by tracking transitive data dependencies of all continual-
ized variables.
1  Model {
2    prior = Uniform(20, 50);
3    mu_p = prior;
4    sigma_p = sqrt(prior);
5    Recruiters = Gaussian(mu_p, sigma_p);
6
7    perfGPA = Gaussian(4, 0.1);
8    regGPA = 4 * Beta(7, 3);
9    GPA = Mix(perfGPA, .05, regGPA, .95);
10
11   if (3.99999 < GPA < 4.95208) {
12     mu = Recruiters * 0.9;
13     sigma = sqrt(Recruiters * 0.9 * 0.1);
14     Interviews = Gaussian(mu, sigma);
15   } else if (GPA > 3.500122) {
16     mu = Recruiters * 0.6;
17     sigma = sqrt(Recruiters * 0.6 * 0.4);
18     Interviews = Gaussian(mu, sigma);
19   } else {
20     mu = Recruiters * 0.5;
21     sigma = sqrt(Recruiters * 0.5 * 0.5);
22     Interviews = Gaussian(mu, sigma);
23   }
24
25   mu2 = Interviews * 0.4;
26   sigma2 = sqrt(Interviews * 0.4 * 0.6);
27   Offers = Gaussian(mu2, sigma2);
28 }

Fig. 3: (a) the fully continualized model and (b) convergence of the synthesis
step for multiple β.
Fig. 5: (a) Posteriors of each method – the true value is equal to 37. (b) Avg.
Accuracy and Inference time; the bars represent accuracy (left Y-axis), the lines
represent time (right Y-axis).
the point estimate, τ_est, of the parameter's true value τ. Figure 5 (b) presents the
run time and the error ratio, |τ − τ_est|/τ, for each approach (for the given true value
of 37). It shows that our continualized version leads to the fastest inference.
The syntax is similar to those used in [24, 51]. Unlike [51], our syntax does include
exact equality predicates, which introduce difficulties during the approximation. To give
the developer flexibility in selecting which parts of the program to continualize,
we add the CONST annotation. It indicates that the variable's distribution should not
be continualized. Until explicitly noted, we will not use this annotation in the rest
of the paper. For simplicity of exposition, we present only a single DataBlock and
ObserveBlock, but our approach naturally extends to the cases with multiple data and
observed variables.
Definition 3. A measure μ over ℝⁿ is a mapping from B(ℝⁿ) to [0, +∞) such that
μ(∅) = 0 and μ(⋃_{i∈ℕ} Xi) = Σ_{i∈ℕ} μ(Xi) whenever all Xi are mutually disjoint. A
probability measure is a measure that satisfies μ(ℝⁿ) = 1, and a sub-probability measure
is one satisfying μ(ℝⁿ) ≤ 1. The simplest measure is the Dirac measure, defined as
δ_{ai}(S) = 1 if ai ∈ S, else 0. We denote the set of all sub-probability measures as M(ℝⁿ).
Definition 7. The Lebesgue measure on R (denoted Leb) is the measure that maps
any interval to its length, e.g., Leb([a, b]) = b − a. The Lebesgue measure in Rn is
simply the n-fold product measure of n copies of the Lebesgue measure on R.
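The proofs in Section 4 rely on absolute continuity with respect to Leb; recall the standard definition (standard measure theory, stated here for convenience):

μ ≪ Leb  iff  for all B ∈ B(ℝⁿ), Leb(B) = 0 implies μ(B) = 0.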
3.2 Semantics
Expression Level Semantics. Arithmetic expression semantics are standard: they
map states σ ∈ ℝⁿ to values, i.e., ⟦Expr⟧ : ℝⁿ → ℝ. Boolean expression
semantics, denoted ⟦BExpr⟧, simply return the set of states Bi ∈ B(ℝⁿ) satisfying the
Boolean conditional.
⟦c⟧(σ) = c    ⟦xi⟧(σ) = σ[xi]    ⟦t1 op t2⟧(σ) = ⟦t1⟧(σ) op ⟦t2⟧(σ)    ⟦f(t1)⟧(σ) = f(⟦t1⟧(σ))

⟦κ_Disc⟧(σ) = ⟦DiscDist(e1, e2, ...)⟧(σ) = λS. Σ_{v ∈ Supp ∩ S} f_Disc(v; ⟦e1⟧(σ), ⟦e2⟧(σ), ...)

where f_Cont and f_Disc are the density and mass functions, respectively, of the prim-
itive distribution being sampled from (e.g., f_Gauss(x; μ, σ) = (1/(σ√(2π))) · e^{−(x−μ)²/(2σ²)} · 1_{σ>0})
and Supp is the distribution's support; the continuous case
⟦κ_Cont⟧(σ) = λS. ∫_S f_Cont(x; ⟦e1⟧(σ), ...) dx is analogous.
(1) We first locally approximate the program’s prior and latent variables using a series
of program transformations to best preserve the local structural properties of the
program and then apply smoothing globally to ensure that the likelihood function
is both fully continuous and differentiable.
⟦xi := e⟧(μ) = λS. μ({(x1, ..., xn) ∈ ℝⁿ | (x1, ..., xi−1, ⟦e⟧(x1, ..., xn), xi+1, ..., xn) ∈ S})

⟦xi := Dist(e1, ..., ek)⟧(μ) = λS. ∫_{ℝⁿ} μ(dσ) · (δx1 ⊗ ... ⊗ δxi−1 ⊗ ⟦Dist(e1, ..., ek)⟧(σ) ⊗ δxi+1 ⊗ ...)(S)

⟦if (B) {P1} else {P2}⟧(μ) = ⟦P1⟧(condition(B)(μ)) + ⟦P2⟧(condition(not B)(μ))

⟦while (B) {P1}⟧(μ) = Σ_{k=0}^{∞} ⟦(condition(B); P1)^k; condition(not B)⟧(μ)
(2) We next synthesize a set of parameters that (approximately) minimize the distance
metric between the distributions of the original and continualized models and we
use light-weight auto-tuning to ensure the approximations do not introduce run-
time errors.
TEβ[E] =
  Gaussian(λ, √λ)                              if E = Poisson(λ)
  Gamma(λ, 1)                                  if E = Poisson(λ) and Gaussian fails
  Gaussian(np, √(np(1−p)))                     if E = Binomial(n, p)
  Gamma(n, p)                                  if E = Binomial(n, p) and Gaussian fails
  Uniform(a, b)                                if E = DiscUniform(a, b)
  Exponential(p)                               if E = Geometric(p)
  MixOfGauss_β([(1, p), (0, 1−p)])             if E = Bernoulli(p)
  Beta(β, β(1−p)/p)                            if E = Bernoulli(p) and MixOfGauss fails
  Mixture([(TEβ[D1], p1), ..., (TEβ[D2], p2)]) if E = Mixture([(D1, p1), ..., (D2, p2)])
  Gaussian(c, β)                               if E = c (a constant)
  E                                            if E = a·xi + b (a ≠ 0)
  KDE(β)                                       if E ∈ DiscDist and not covered above
  Gaussian(E, β)                               otherwise
The rationale for this definition is that these approximations all preserve key struc-
tural properties of the distributions’ shape (e.g., the number of modes) which have been
shown to strongly affect the quality of inference [25, 45, 17]. Second, these continuous
approximations all match the first moment of their corresponding discrete distributions,
which is another important feature that affects the quality of approximation [53]. We
refer the reader to [54] to see that for each distribution on the left, the corresponding
continuous distribution on the right has the same mean. These approximations are best
when certain limit conditions are satisfied, e.g. λ ≥ 10 for approximating a Poisson dis-
tribution with Gaussian, hence the values in the program itself do affect the overall
approximation accuracy.
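To make the case analysis concrete, here is a partial Python sketch of such a transformer over distribution terms (our own tuple encoding; the actual TEβ is defined over program syntax and also carries the fallback cases discussed next):

import math

def TE(dist, beta):
    # Moment-matched continuous surrogate for a distribution term.
    # `dist` is a (name, *params) tuple; only some of the cases are shown.
    kind = dist[0]
    if kind == "Poisson":
        lam = dist[1]
        return ("Gaussian", lam, math.sqrt(lam))
    if kind == "Binomial":
        n, p = dist[1], dist[2]
        return ("Gaussian", n * p, math.sqrt(n * p * (1 - p)))
    if kind == "Geometric":
        return ("Exponential", dist[1])
    if kind == "Constant":
        return ("Gaussian", dist[1], beta)   # point mass -> narrow Gaussian
    if kind == "Mixture":
        return ("Mixture", [(TE(d, beta), w) for d, w in dist[1]])
    return ("KDE", dist, beta)               # fallback: kernel density estimate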
However, if we are not careful, a statement level transformation could introduce
runtime errors. For example, a Binomial is always non-negative, but its Gaussian ap-
proximation could be negative. This is why TEβ [•] has multiple transformations for the
same distribution. For example, in addition to using a Gaussian to approximate both a
Binomial and a Poisson, we also have a Gamma approximation since a Gamma distri-
bution is always non-negative. Likewise we have a Beta approximation to a Bernoulli
if we require that the approximation also have support in the range [0, 1]. Leios uses
auto-tuning to safeguard against such errors during the synthesis phase, whereby when
sampling the transformed program, if we encounter a run-time error of this nature,
we simply go back and try a safer (but possibly slower) alternative (Algorithm 1 line
12). Since there are only finitely many variables and (safer) transformations to apply,
this process will eventually terminate. For discrete distributions not supported by the
specific approximations, but with fixed parameters, we empirically sample them to get
a set of samples and then use a Kernel Density Estimate (KDE) [62] with a Gaussian
kernel (the KDE bandwidth is precisely β) as the approximation.
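A sketch of this KDE fallback with SciPy (the discrete distribution and the numbers are ours):

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# empirically sample an unsupported discrete distribution with fixed parameters
samples = rng.choice([1, 3, 7], size=1000, p=[0.2, 0.5, 0.3]).astype(float)
beta = 0.1
# Gaussian-kernel KDE; SciPy scales the bw_method factor by the sample std dev
kde = stats.gaussian_kde(samples, bw_method=beta)
print(kde.pdf([1.0, 3.0, 7.0]))   # a smooth, fully continuous density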
Lastly, by default all discrete random variables are approximated with continuous
versions; however, we leave the option to the user to manually specify CONST in
front of a variable if they do not wish for it to be approximated (in which case we no
longer make any theoretical guarantees about continuity).
4.3 Influence Analysis and Control-Flow Correction of Predicates
The abort, factor and skip statements and the DataBlock remain the same after
applying the transformation operator TPβ [•].
Proof. (sketch) To prove the theorem we will show that when any variable xi is initially
defined, it comes from an absolutely continuous distribution and furthermore that the
semantics of each statement in TPβ[P] preserves the absolute continuity of each marginal
measure (where μ_{xi}(Bi) ≡ μ(ℝ × ... × Bi × ... × ℝ)); equivalently, for any statement s,
any (already defined) variable xi and any Borel set Bi ∈ B(ℝ): Leb(Bi) = 0 implies
⟦s⟧(μ)_{xi}(Bi) = 0.
Case 1. skip and abort: Since skip is the identity measure transformer, each defined
marginal measure μ_{xi} that was A.C. before trivially remains so afterward, since it
is unchanged. abort sends each marginal to the 0 sub-measure (which is trivially A.C.).
Case 2. condition and factor: Since factor and condition can only lose measure, we have
condition(B)(μ)(S) ≤ μ(S) and factor(xk, t)(μ)(S) ≤ μ(S) for any Borel set S.
Thus μ(S) = 0 implies condition(B)(μ)(S) = 0, and μ(S) = 0 implies factor(xk, t)(μ)(S) = 0,
since all measures are non-negative. Hence by transitivity, since μ(ℝ × ... × Bi × ... × ℝ) is A.C.,
factor(xk, t)(μ)(ℝ × ... × Bi × ... × ℝ) is A.C., and likewise, for similar reasons,
condition(B)(μ)(ℝ × ... × Bi × ... × ℝ) is A.C.
Case 4. Sequencing, if and while: Intuitively, since the above statements each preserve
A.C. of each marginal, any sequencing of them does too. Since the sum of two measures
that are both A.C. in each marginal is also A.C. in each marginal, if statements
preserve A.C. of each marginal. For this same reason, while loops also preserve A.C.
If we chose the correction parameters by minimizing the distance of the continualized
program to P, we would simply be over-fitting to the data and we would not be able to re-use
TPβ [P ] for new data sets with different true parameters. Instead our objective is to
minimize the distance between the original model M , which is simply the fragment of
P that does not contain the data or observe block (and hence only defines the prior,
likelihood and latent variables), and the corresponding continualized approximation,
TPβ [M ]. To do so, we need to choose the best possible continuity correction factors,
θ, for TPβ [M ]. Thus we define the “optimal” parameters as those which minimize a
distance metric d between probability measures d : M(Rn ) × M(Rn ) → [0, ∞). We
also need to ensure that the metric can (a) compute the distance between discrete and
continuous distributions and (b) is such that if models or likelihoods are close with
respect to d, the posteriors should be as well.
To restrict the search space, we follow common practice [23, 3] by requiring each θi ∈
(0, 1). Such an optimization problem lacks a closed-form solution. Symbolically computing
the Wasserstein Distance is intractable, hence we numerically approximate it via the
empirical Wasserstein Distance (EWD) between observed samples of M and TPβ[Mθ].
Because this step is fully dynamic (we run and sample the model), the samples are
conditioned upon successfully terminating, and hence the model’s sub-measure has
been implicitly renormalized to a full probability measure, thus justifying the use of a
fully renormalized measure in equations (1) and (2).
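For a one-dimensional observed variable, the EWD can be computed with off-the-shelf tools; a minimal sketch (our names; assumes SciPy):

import numpy as np
from scipy.stats import wasserstein_distance

def ewd(sample_orig, sample_cont, theta, n=500, rng=None):
    # Empirical Wasserstein distance between n samples of the original
    # model M and of the continualized model with holes filled by theta.
    rng = rng or np.random.default_rng(0)
    xs = np.array([sample_orig(rng) for _ in range(n)])      # cached in practice
    ys = np.array([sample_cont(theta, rng) for _ in range(n)])
    return wasserstein_distance(xs, ys)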
Though intuitively we would expect that as we apply less smoothing (i.e. β < 1),
the optimal θi should also be smaller (less need for correction) and the continualized
program should become closer to the original, a simple negative result illustrates this
is not always the case and that the dependence between the smoothing and continuity
correction must be non-linear.
Proof. Let X be the constant random variable that is 0 with probability 1 and let
X′ ∼ Gaussian(0, β) be its continualization. Furthermore, let I := (X == 0) and
Ic := (−cβ ≤ X′ ≤ cβ) be two indicator random variables. Intuitively, we want Ic to
have the same probability of being true as I for every β. However, if c is a constant
(such as 1), then Pr(−cβ ≤ X′ ≤ cβ) will always be the same regardless of β (when
c = 1, the probability is always ≈ 0.68), and never 1.
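A quick numeric check of this observation (SciPy; our own verification):

from scipy import stats

for beta in (0.01, 0.1, 1.0):
    g = stats.norm(0, beta)
    print(beta, g.cdf(beta) - g.cdf(-beta))   # ~0.6827 for every beta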
We check for such an error in line 4 and, if one exists, we return immediately with the
flag variable set to false (line 5).
To evaluate the EWD objective function (when there are parameters to synthesize),
Algorithm 2 follows a technique from [14] and uses a Nelder-Mead search (line 11),
due to Nelder-Mead’s well known success in solving non-convex program synthesis
problems. We first extract the fragment of the programs corresponding to the models,
M and TPβ [M ], respectively in line 9. In each step of the Nelder-Mead search we take
n samples (n ≈ 500) of TPβ [M ], but with a fixed value of θi substituted into TPβ [M ],
to compute the EWD with respect to samples of the original model M (which have
been cached to avoid redundant resampling). The Nelder-Mead search steps through
the parameter space (with step size η > 0), substituting different values of θ into
TPβ [M ]. This process continues until the search converges to a minimizing parameter,
p, that is within the stopping threshold ε > 0, or until it encounters a runtime error during
the sampling (which is checked in line 12). As before, if we encounter such an error we
immediately return with the flag set to false (line 13). Following [14], we successively
restart the Nelder-Mead search from k evenly spaced grid points in [0, 1]^d (hence the
loop in line 10), to find the globally optimal parameter (hence our approach is robust
to local minima), which we successively update in lines 15-16. If no runtime error was
ever encountered, we substitute in the parameters with the minimum EWD over all
runs, θ̂, to the fully continuous program TPβ [P ] and return (line 20). Though it can be
argued this sampling is potentially as difficult as the original inference, we reiterate
that we need only do this once offline, hence the cost is easily amortized.
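A minimal sketch of this synthesis loop (our names; SciPy's Nelder-Mead; the runtime-error fallback and sample caching of Algorithm 2 are omitted):

import itertools
import numpy as np
from scipy.optimize import minimize

def synthesize(objective, d, k=3, eps=1e-3):
    # Minimize objective(theta), e.g. the EWD above, over theta in (0, 1)^d,
    # restarting Nelder-Mead from k^d evenly spaced grid points.
    best_theta, best_val = None, np.inf
    grid = np.linspace(0.1, 0.9, k)
    for start in itertools.product(grid, repeat=d):
        res = minimize(objective, np.array(start), method="Nelder-Mead",
                       options={"xatol": eps, "fatol": eps})
        if res.fun < best_val:
            best_theta, best_val = res.x, res.fun
    return best_theta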
6 Methodology
6.1 Benchmarks
Table 1 presents the benchmarks. For each benchmark, Columns 2 and 3 present the
original prior and likelihood type, respectively. Column 4 presents whether the conti-
nuity correction was applied. Column 5 presents the time to continualize the program,
TCont. . As can be seen in Columns 4 and 5 the total continualization time, TCont. ,
depends on whether parameters had to be synthesized. GPAExample had the longest
TCont. at 3.6s, due to the complexity of the multiple predicates, however these times
are amortized as our synthesis step is done only once.
As our problem has received little attention, no standard benchmark suites exist.
In fact, to make inference tractable, for many models, developers would construct
continuous approximations by hand, in an ad hoc fashion. However we wanted a
benchmark suite that showcased all 3 inference scenarios that our approach works
for: (1) discrete/hybrid prior and discrete/hybrid likelihood (2) continuous prior but
discrete/hybrid likelihood and (3) discrete/hybrid prior but a continuous likelihood.
Therefore, we obtained the benchmarks in two ways. First, we looked at variations
of the mixed distributions benchmarks previously published in the machine learning
community, e.g., [65, 58], which served as the inspiration for our GPAExample. Sec-
ond, we took existing benchmarks [27, 30] for which designers modeled certain distri-
butions with continuous approximations, and we retro-fitted these models with the
corresponding discrete distributions. This step was done for Election, Fairness,
SVMfairness, SVE, and TrueSkill. These discretizations were only applied where they
made sense, e.g., the Gauss(np, √(np(1−p))) in the original Election program became dis-
cretized as Binomial(n, p). We also took popular Bayesian models from Cognitive
Science literature which use multiple discrete latent variables [39] and these models
are BetaBinomial and Exam. Lastly we took population models from the mathematical
biology literature [10, 4] to build benchmarks since populations are by nature discrete.
This was done for Plankton and DiscreteDisease. We present the original programs
in the appendix [38].
– Original Program: inference done in standard fashion on the original model, and
– Naive Smoothing : inference done on a KDE style model in which Gaussian smooth-
ing is applied only to the observed variable, but no approximations are applied to
the inner latent variables.
We will refer to these as simply “Original” and “Naive” respectively.
Table 2: Inference Times (s) and Error Ratios for each model, β = 0.1
Program          Original Time  Original Error  Naive Time  Naive Error  Leios Time  Leios Error
GPAExample 0.806 0.090 0.631 0.070 0.605 0.058
Election - - 3.232 0.051 0.616 0.036
Fairness 4.396 0.057 0.563 0.056 0.603 0.093
SVMfairness - - 0.626 0.454 0.980 0.261
TrueSkill 3.668 0.009 0.494 0.059 0.586 0.053
DiscreteDisease 4.944 0.009 1.350 0.013 0.490 0.008
SVE - - 0.522 0.045 0.516 0.091
BetaBinomial 1.224 0.028 0.564 0.024 0.459 0.013
Exam 3.973 0.087 0.504 0.126 0.527 0.133
Plankton 0.570 0.017 0.457 0.080 0.453 0.042
Average 2.797 0.043 0.894 0.098 0.584 0.079
7 Evaluation
We study the following three research questions:
RQ1 Can program continualization make inference faster, while still maintaining a
high degree of accuracy, compared to the original program and naive smoothing?
RQ2 How do performance and accuracy vary for different smoothing factors β?
RQ3 Can program continualization enable running transformed programs with off-
the-shelf inference algorithms that cannot execute the original programs?
Fig. 7: Inference Times and Error ratios for Leios and Naive for different β
For accuracy, inference performed via Leios was on average more accurate than
Naive (E = 0.079 vs. 0.098, respectively). Both were slightly less accurate than infer-
ence performed on Original (E = 0.043). This is not unreasonable as Original has no
approximations applied (which are the main source of inference error). However the
Original failed on Election, SVE, and SVMfairness. For Election, a large Binomial
latent led to a timeout, and it also slowed the Naive version relative to Leios (3.23s vs
0.61s). The Original failed on SVE since it is a hybrid discrete-continuous model (which
can make inference intractable [65, 6]). SVMfairness is a non-linear model where many
latent variables have high variances, leading to inference on the Original failing to con-
verge; Leios and Naive had higher error on this benchmark, for much the same reason
(though Leios was still significantly better than Naive, E = 0.261 vs 0.454).
Although Leios was faster than Original in all cases, for TrueSkill and SVMfairness,
Leios was somewhat slower than Naive. This is likely because the discrete latent vari-
ables in these benchmarks had small enough parameters (Binomial with small n). Sim-
ilarly, for Fairness, Leios was slightly less accurate than Naive because the Gaussian
approximation can be less accurate for smaller n.
Table 3: Variational Inference Times (s) and Error Ratios for selected β
                                              β = 0.25          β = 0.5           β = 0.75
Program          T_org  E_org  T_NS  E_NS    T_Leios E_Leios   T_Leios E_Leios   T_Leios E_Leios
GPAExample - - - - 3.111 0.207 3.341 0.241 3.435 0.321
Election - - - - 1.762 0.070 1.755 0.110 1.764 0.064
Fairness - - - - 1.813 0.722 1.827 0.769 1.830 0.753
SVMfairness - - - - 1.800 0.201 1.806 0.293 1.804 0.301
TrueSkill - - - - 1.809 0.119 1.802 0.062 1.790 0.090
DiscreteDisease - - - - 1.734 0.248 1.731 0.471 1.747 0.553
SVE 0.677 0.684 1.478 3.095 1.471 0.587 1.460 0.566 1.448 0.348
BetaBinomial - - - - 1.605 0.834 1.596 0.708 1.587 0.497
Exam - - - - 0.603 0.222 0.602 0.213 0.603 0.285
Plankton - - - - 3.432 0.297 3.427 0.763 3.434 0.530
Table 3 presents the results for running translated programs in Pyro. Columns 2-5
present the inference times and result errors for the original and naively smoothed pro-
gram. These columns are “-” when Pyro cannot successfully perform inference (i.e. the
model contains a discrete variable that is unsupported by the auto guide). Columns 6-11
present Leios’ time and error for each model, for three different smoothing parameters.
Fully-automated Variational Inference failed on all but one of the examples for
both the Original and Naive. This is because in both cases the program still contains
latent or observed discrete random variables. For most of the benchmarks (Election,
GPA, TrueSkill) the program optimized with Leios had errors comparable to those
computed previously with MCMC in WebPPL. For some the error was over 0.5 for all
β (BetaBinomial, Fairness), which is in part a consequence of limitations of automatic
VI, and hence for certain models manual fine-tuning may be unavoidable. These results
illustrate that Leios can be used to create an efficient program in situations when the
original language does not easily support non-continuous distributions.
8 Related Work
Several existing approaches restrict the class of models that can be
expressed and require specialized inference algorithms. In contrast, Leios can work
with a variety of off-the-shelf inference algorithms that operate on arbitrary models
and does not need to define its own inference algorithm. In [66] the authors explored
a restricted programming language that can statically detect which parameters the
program’s density is discontinuous in. However they did not address the question of
continuous approximation, rather their approach was to develop a custom inference
scheme and restrict the language so that pathological models cannot be written (they
also disallow ‘==’ predicates). In [65], Wu et al. develop a custom inference method for
discrete-continuous mixtures but only for models encodeable as a Bayesian network,
furthermore as pointed out by [47], the specialized inference method of Wu et al. is
restrictive since it cannot be composed with other program transformations.
Additionally, Machine Learning researchers have developed other continuous relax-
ation techniques to address the inherent problems of non-differentiable models. One
other popular method is to reparametrize the gradient estimator during Variational
Inference (VI) computation, commonly called the “reparameterization trick” [42, 61].
However, this approach suffers from the fact that not all distributions support such
gradient reparameterizations, and this method is limited to Variational In-
ference. Conversely, our approach allows one to still use any inference scheme. Further,
even though these techniques have been attempted in the probabilistic programming
setting [40], such work still inherits the aforementioned weaknesses.
We also draw upon Kernel Density Estimation (KDE) [62], a common approxima-
tion scheme in statistics. KDE fits a Kernel density to each observed data point, hence
constructing a smooth approximation. Naive Smoothing is essentially a KDE (with
a Gaussian Kernel) of the original while Leios employs additional continualizations.
Furthermore, our smoothing factor β is analogous to the bandwidth of a KDE.
9 Conclusion
Acknowledgements
We would like to thank the anonymous reviewers for their constructive feedback. We
thank Darko Marinov for his helpful feedback during early stages of the work. We thank
Adithya Murali for valuable feedback about the semantics. We thank Zixin Huang and
Saikat Dutta for helpful discussions about the evaluation and Vimuth Fernando and
Keyur Joshi for helpful proofreads. JL is grateful for support from the Alfred P. Sloan
foundation for a Sloan Scholar award used to support much of this work. The research
presented in this paper has been supported in part by NSF, Grant no. CCF-1846354.
References
1. Aigner, D.J., Amemiya, T., Poirier, D.J.: On the estimation of production fron-
tiers: maximum likelihood estimation of the parameters of a discontinuous density
function. International Economic Review pp. 377–396 (1976)
2. Albarghouthi, A., D’Antoni, L., Drews, S., Nori, A.V.: Fairsquare: Probabilistic
verification of program fairness. Proc. ACM Program. Lang. (OOPSLA) (2017)
3. Bar-Lev, S.K., Fuchs, C.: Continuity corrections for discrete distributions under
the edgeworth expansion. Methodology And Computing In Applied Probability
3(4), 347–364 (2001)
4. Becker, N.: A general chain binomial model for infectious diseases. Biometrics
37(2), 251–258 (1981)
5. Betancourt, M., Girolami, M.: Hamiltonian monte carlo for hierarchical models.
Current trends in Bayesian methodology with applications 79, 30 (2015)
6. Bhat, S., Borgström, J., Gordon, A.D., Russo, C.: Deriving probability density
functions from probabilistic functional programs. In: International Conference on
Tools and Algorithms for the Construction and Analysis of Systems. pp. 508–522.
TACAS’13 (2013)
7. Bichsel, B., Gehr, T., Vechev, M.T.: Fine-grained semantics for probabilistic pro-
grams. In: Programming Languages and Systems - 27th European Symposium on
Programming, ESOP. pp. 145–185 (2018)
8. Bingham, E., Chen, J.P., Jankowiak, M., Obermeyer, F., Pradhan, N., Karalet-
sos, T., Singh, R., Szerlip, P., Horsfall, P., Goodman, N.D.: Pyro: Deep Universal
Probabilistic Programming. arXiv preprint arXiv:1810.09538 (2018)
9. Blei, D.M., Kucukelbir, A., McAuliffe, J.D.: Variational inference: A review for
statisticians. Journal of the American Statistical Association 112(518) (2017)
10. Blumenthal, S., Dahiya, R.C.: Estimating the binomial parameter n. Journal of
the American Statistical Association 76(376), 903–909 (1981)
11. Chasins, S., Phothilimthana, P.M.: Data-driven synthesis of full probabilistic pro-
grams. In: CAV (2017)
12. Chaudhuri, S., Clochard, M., Solar-Lezama, A.: Bridging boolean and quantitative
synthesis using smoothed proof search. In: ACM SIGPLAN-SIGACT Symposium
on Principles of Programming Languages. POPL ’14 (2014)
13. Chaudhuri, S., Gulwani, S., Lublinerman, R.: Continuity and robustness of pro-
grams. In: Communications of the ACM, Research Highlights. vol. 55 (2012)
14. Chaudhuri, S., Solar-Lezama, A.: Smooth interpretation. In: Proceedings of the
31st ACM SIGPLAN Conference on Programming Language Design and Imple-
mentation. pp. 279–291. PLDI ’10 (2010)
15. Chen, Y., Ghahramani, Z.: Scalable discrete sampling as a multi-armed bandit
problem. In: Proceedings of the 33rd International Conference on International
Conference on Machine Learning - Volume 48. pp. 2492–2501. ICML’16 (2016)
16. Cheng, T.T.: The normal approximation to the poisson distribution and a proof
of a conjecture of ramanujan. Bull. Amer. Math. Soc. 55(4), 396–401 (04 1949)
17. Chung, H., Loken, E., Schafer, J.L.: Difficulties in drawing inferences with finite-
mixture models. The American Statistician 58(2), 152–158 (2004)
18. Cooper, G.F.: The computational complexity of probabilistic inference using
bayesian belief networks. Artificial Intelligence 42(2), 393 – 405 (1990)
19. Dahlqvist, F., Kozen, D., Silva, A.: Semantics of probabilistic programming: A
gentle introduction. In: Foundations of Probabilistic Programming (2020)
20. Delon, J., Desolneux, A.: A wasserstein-type distance in the space of gaussian
mixture models. arXiv preprint arXiv:1907.05254 (2019)
21. DeMillo, R.A., Lipton, R.J.: Defining software by continuous, smooth functions.
IEEE Trans. Softw. Eng. 17(4) (Apr 1991)
22. Dutta, S., Zhang, W., Huang, Z., Misailovic, S.: Storm: program reduction for
testing and debugging probabilistic programming systems. In: Proceedings of the
2019 27th ACM Joint Meeting on European Software Engineering Conference and
Symposium on the Foundations of Software Engineering. pp. 729–739 (2019)
23. Feller, W.: On the normal approximation to the binomial distribution. Ann. Math.
Statist. 16(4), 319–329 (12 1945)
24. Gehr, T., Misailovic, S., Vechev, M.T.: PSI: exact symbolic inference for proba-
bilistic programs. In: Computer Aided Verification, CAV. pp. 62–83 (2016)
25. Gelman, A.: Parameterization and bayesian modeling. Journal of the American
Statistical Association 99(466), 537–545 (2004)
26. Goodman, N.D., Stuhlmüller, A.: The Design and Implementation of Probabilistic
Programming Languages (2014)
27. Goodman, N.D., Tenenbaum, J.B., Contributors, T.P.: Probabilistic Models of
Cognition (2016)
28. Gordon, A.D., Henzinger, T.A., Nori, A.V., Rajamani, S.K.: Probabilistic pro-
gramming. In: Proceedings of the on Future of Software Engineering (2014)
29. Gorinova, M.I., Moore, D., Hoffman, M.D.: Automatic reparameterisation in prob-
abilistic programming (2019)
30. Herbrich, R., Minka, T., Graepel, T.: TrueskillTM : A bayesian skill rating system.
In: Proceedings of the 19th International Conference on Neural Information Pro-
cessing Systems. pp. 569–576. NIPS’06 (2006)
31. Hoffman, M.D., Gelman, A.: The no-u-turn sampler: Adaptively setting path
lengths in hamiltonian monte carlo (2011)
32. Huang, Z., Wang, Z., Misailovic, S.: Psense: Automatic sensitivity analysis for
probabilistic programs. In: Automated Technology for Verification and Analysis -
15th International Symposium, ATVA 2018, Los Angeles, California, October 7-10,
2018, Proceedings (2018)
33. Hur, C.K., Nori, A.V., Rajamani, S.K., Samuel, S.: Slicing probabilistic programs.
In: Proceedings of the 35th ACM SIGPLAN Conference on Programming Language
Design and Implementation. pp. 133–144 (2014)
34. Inala, J.P., Gao, S., Kong, S., Solar-Lezama, A.: REAS: combining numerical op-
timization with SAT solving (2018)
35. Kildall, G.A.: A unified approach to global program optimization. In: Proceedings
of the 1st Annual ACM SIGACT-SIGPLAN Symposium on Principles of Program-
ming Languages. pp. 194–206. POPL ’73 (1973)
36. Kozen, D.: Semantics of probabilistic programs. Journal of Computer and System
Sciences 22(3), 328 – 350 (1981)
37. Lan, S., Streets, J., Shahbaba, B.: Wormhole hamiltonian monte carlo. In: Pro-
ceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence. pp.
1953–1959. AAAI’14 (2014)
38. Laurel, J., Misailovic, S.: Continualization of probabilistic programs with correction
(appendix) (2020), https://jsl1994.github.io/papers/ESOP2020_appendix.pdf
39. Lee, M.D., Wagenmakers, E.J.: Bayesian cognitive modeling: A practical course.
Cambridge University Press (2014)
40. Lee, W., Yu, H., Yang, H.: Reparameterization gradient for non-differentiable mod-
els. In: Advances in Neural Information Processing Systems. pp. 5553–5563 (2018)
41. Lew, A.K., Cusumano-Towner, M.F., Sherman, B., Carbin, M., Mansinghka, V.K.:
Trace types and denotational semantics for sound programmable inference in prob-
abilistic languages. Proc. ACM Program. Lang. 4(POPL) (2019)
42. Maddison, C.J., Mnih, A., Teh, Y.W.: The Concrete Distribution: A Continuous
Relaxation of Discrete Random Variables. In: International Conference on Learning
Representations (2017)
43. Marin, J.M., Mengersen, K., Robert, C.P.: Bayesian modelling and inference on
mixtures of distributions. Handbook of statistics 25, 459–507 (2005)
44. Morgan, C., McIver, A., Seidel, K.: Probabilistic predicate transformers. ACM
Trans. Program. Lang. Syst. 18(3), 325–353 (May 1996)
45. Murray, I., Salakhutdinov, R.: Evaluating probabilities under high-dimensional la-
tent variable models. In: Proceedings of the 21st International Conference on Neu-
ral Information Processing Systems. pp. 1137–1144. NIPS’08 (2008)
46. Nandi, C., Grossman, D., Sampson, A., Mytkowicz, T., McKinley, K.S.: Debugging
probabilistic programs. In: Proceedings of the 1st ACM SIGPLAN International
Workshop on Machine Learning and Programming Languages. MAPL 2017 (2017)
47. Narayanan, P., Shan, C.c.: Symbolic disintegration with a variety of base measures
(2019), http://homes.sice.indiana.edu/ccshan/rational/disint2arg.pdf
48. Neal, R.M.: Mcmc using hamiltonian dynamics. In: Handbook of Markov Chain
Monte Carlo, chap. 5 (2012)
49. Nguyen, V.A., Abadeh, S.S., Yue, M.C., Kuhn, D., Wiesemann, W.: Optimistic
distributionally robust optimization for nonparametric likelihood approximation.
In: Advances in Neural Information Processing Systems. pp. 15846–15856 (2019)
50. Nishimura, A., Dunson, D., Lu, J.: Discontinuous hamiltonian monte
carlo for discrete parameters and discontinuous likelihoods (2017),
https://arxiv.org/abs/1705.08510
51. Nori, A.V., Ozair, S., Rajamani, S.K., Vijaykeerthy, D.: Efficient synthesis of prob-
abilistic programs. In: Proceedings of the 36th ACM SIGPLAN Conference on Pro-
gramming Language Design and Implementation. pp. 208–217. PLDI ’15 (2015)
52. Opper, M., Archambeau, C.: The variational gaussian approximation revisited.
Neural Computation 21(3), 786–792 (2009)
53. Opper, M., Winther, O.: Expectation consistent approximate inference. J. Mach.
Learn. Res. 6, 2177–2204 (Dec 2005)
54. Ross, S.: A First Course in Probability. Pearson (2010)
55. Rudin, W.: Real and complex analysis. McGraw-Hill Education (2006)
56. Salimans, T., Kingma, D.P., Welling, M.: Markov chain monte carlo and variational
inference: Bridging the gap. In: Proceedings of the 32nd International Conference
on International Conference on Machine Learning. pp. 1218–1226. ICML (2015)
57. Sankaranarayanan, S., Chakarov, A., Gulwani, S.: Static analysis for probabilistic
programs: inferring whole program properties from finitely many paths. In: Pro-
ceedings of the 34th ACM SIGPLAN conference on Programming language design
and implementation. pp. 447–458 (2013)
58. Sanner, S., Abbasnejad, E.: Symbolic variable elimination for discrete and contin-
uous graphical models. In: Proceedings of the Twenty-Sixth AAAI Conference on
Artificial Intelligence. pp. 1954–1960. AAAI’12 (2012)
59. Smith, J., Croft, J.: Bayesian networks for discrete multivariate data: an algebraic
approach to inference. Journal of Multivariate Analysis 84(2), 387 – 402 (2003)
60. Tolpin, D., van de Meent, J.W., Yang, H., Wood, F.: Design and implementa-
tion of probabilistic programming language anglican. In: Proceedings of the 28th
Symposium on the Implementation and Application of Functional Programming
Languages. IFL 2016 (2016)
61. Tucker, G., Mnih, A., Maddison, C.J., Sohl-Dickstein, J.: REBAR : Low-variance,
unbiased gradient estimates for discrete latent variable models. In: Neural Infor-
mation Processing Systems (2017)
62. Wand, M., Jones, M.: Kernel Smoothing (Chapman & Hall/CRC Monographs on
Statistics and Applied Probability) (1995)
63. Wang, D., Hoffmann, J., Reps, T.: Pmaf: An algebraic framework for static analysis
of probabilistic programs. In: Proceedings of the 39th ACM SIGPLAN Conference
on Programming Language Design and Implementation. PLDI 2018 (2018)
64. Wang, D., Hoffmann, J., Reps, T.: A denotational semantics for low-level proba-
bilistic programs with nondeterminism. Electronic Notes in Theoretical Computer
Science 347 (2019), proceedings of the Thirty-Fifth Conference on the Mathemat-
ical Foundations of Programming Semantics
65. Wu, Y., Srivastava, S., Hay, N., Du, S., Russell, S.: Discrete-continuous mixtures
in probabilistic programming: Generalized semantics and inference algorithms. In:
Proceedings of the 35th International Conference on Machine Learning. Proceed-
ings of Machine Learning Research, vol. 80, pp. 5343–5352 (2018)
66. Zhou, Y., Gram-Hansen, B.J., Kohn, T., Rainforth, T., Yang, H., Wood, F.:
LF-PPL: A low-level first order probabilistic programming language for non-
differentiable models. In: The 22nd International Conference on Artificial Intelli-
gence and Statistics, AISTATS. Proceedings of Machine Learning Research, vol. 89,
pp. 148–157 (2019)
Semantic Foundations for Deterministic
Dataflow and Stream Processing
Konstantinos Mamouras
1 Introduction
Stream processing is the computational paradigm where the input is not pre-
sented in its entirety at the beginning of the computation, but instead it is
given in an incremental fashion as a potentially unbounded sequence of elements
or data items. This paradigm is appropriate in settings where data is created
continually in real-time and has to be processed immediately in order to ex-
tract actionable insights and enable timely decision-making. Examples of such
datasets are streams of business events in an enterprise setting [26], streams
of packets that flow through computer networks [37], time-series data that is
captured by sensors in healthcare applications [33], etc.
Due to the great variety of streaming applications, there are various propos-
als for specialized languages, compilers, and runtime systems that deal with the
until now) to an output stream history (i.e., the fragment of the output stream
produced so far). The monotonicity requirement captures the idea that a stream
transformation cannot retract the output that has already been emitted. We
call such functions stream transductions, and we propose them as a deno-
tational semantic model for stream processing. This model encompasses string
transductions, non-diverging Kahn-computable [59] functions on streams, mono-
tone relational transformations [71], the CQL-definable [16] transformations on
time-varying relations, and transformations of continuous-time signals [27].
We also introduce an abstract model of computation for stream processing.
The considered programs or abstract machines are called stream transduc-
ers, and they are organized using transducer types that specify the input and
output stream types. A stream transducer processes the input stream in an in-
cremental fashion, by consuming it fragment by fragment. The consumption of
an input fragment results in the emission of an output fragment. Our algebraic
setting brings in an unavoidable complication compared to the classical theory
of word transducers: not all stream transducers describe a stream transduction.
This phenomenon has to do with the generalization of the input and output data
streams from sequences of atomic data items to elements of arbitrary monoids.
A stream transducer has to respect its input/output type, which means that the
way in which the input stream is fragmented into pieces and fed to the trans-
ducer does not affect the cumulative output. More concisely, this says that the
cumulative output is independent of the fragmentation of the input. In order
to formalize this notion, we say that a factorization of an input history u is a
sequence of stream fragments u1 , u2 , . . . , un whose concatenation is equal to the
input history, i.e. u1 · u2 · · · un = u. Now, the desired restriction can be described
as follows: for every input history w and any two factorizations u1 , . . . , um and
v1 , . . . , vn of w, the cumulative output that the transducer emits when consum-
ing the fragments u1 , . . . , um in sequence is equal to the cumulative output when
consuming the fragments v1 , . . . , vn . Fortunately, this complex property can be
distilled into an equivalent property on the structure of the stream transducer
that we call the coherence property. Every stream transducer that is coherent has
a well-defined semantics or denotation in terms of a stream transduction.
We have already outlined the basics of our general framework for streaming
computation, which includes: (1) a classification of streams using monoids as
types, (2) a denotational semantic model that employs monotone functions from
input histories to output histories, and (3) a programming model that general-
izes transducers to compute meaningfully on elements of arbitrary monoids. This
already allows us to address important questions about specific computations:
− Does a streaming program (transducer) behave as intended? This amounts
to checking whether the denotation of the transducer is the desired function.
− Are two streaming programs (transducers) equivalent? This means that their
denotations in terms of stream transductions are the same.
The first question is a correctness property. The second question is relevant for
semantics-preserving program optimization. We will turn now to the issue of how
to modularly specify complex stream transductions and transducers.
Outline of paper. In Sect. 2 we introduce the idea that data streams can be
classified using monoids as their types, and in Sect. 3 we propose the semantic
model of stream transductions. Sect. 4 is devoted to the description of an ab-
stract model of streaming computation, called stream transducer, and the main
properties that it satisfies. In Sect. 5 we show that our abstract model is closed
under a fundamental set of dataflow combinators: serial, parallel, and feedback
composition. In Sect. 6 we prove the soundness of a streaming optimizing trans-
formation using denotational arguments and algebraic rewriting. Sect. 7 contains
related work, and Sect. 8 concludes with a brief summary of our proposal.
We write BSig(A) for the set of all these bounded-domain continuous-time sig-
nals. The unit signal is the unique function of type [0, 0) → A, whose domain of
definition is empty. Observe that BSig(A) is a monoid. For signals f : [0, u) → A
and g : [0, v) → A, it holds that f ⊑ g iff u ≤ v and f (t) = g(t) for ev-
ery t ∈ [0, u). There is a unique prefix witness function, because for every
f, g ∈ BSig(A) with f ⊑ g there is a unique h ∈ BSig(A) such that f · h = g.
[0..n] = {0, . . . , n} for some integer n ≥ 0. We also use the notation f : [0..n] →
FBag(A) to convey this information regarding the domain of f . We define the
concatenation operation · for finite time-varying multisets as follows:
f : [0..m] → FBag(A)    g : [0..n] → FBag(A)    f · g : [0..m + n] → FBag(A)

(f · g)(t) = f(t)           if t ∈ [0..m − 1]
(f · g)(t) = f(m) ∪ g(0)    if t = m
(f · g)(t) = g(t − m)       if t ∈ [m + 1..m + n]
We write TFBag(A) to denote the set of all finite time-varying multisets over A.
The unit time-varying multiset Id : [0..0] → FBag(A) is given by Id(0) = ∅. It is
easy to see that f · Id = f and that Id · f = f for every f : [0..n] → FBag(A).
We leave it to the reader to also verify that (f · g) · h = f · (g · h) for finite
time-varying multisets f , g and h. So, the set TFBag(A) together with · and Id
is a monoid. It is not difficult to show that it is left-cancellative.
Let us consider now the prefix preorder on finite time-varying multisets.
For f : [0..m] → FBag(A) and g : [0..n] → FBag(A), it holds that f ⊑ g iff
m ≤ n and f (t) = g(t) for every t ∈ [0..m].
The examples above highlight the variety of mathematical objects that can
be meaningfully viewed as streams. These streams can be organized elegantly
using the structure of monoids. The sequences of Example 1, the multisets of
Example 2, and the finite time-varying multisets of Example 7 can be described
equivalently in terms of the partial orders of [13, 85], which have also been sug-
gested as an approach to unify notions of streams. Using partial orders it is
also possible to model the timed finite sequences of Example 6, but only with a
non-succinct encoding: every time punctuation t ∈ N is encoded with a sequence
11 . . . 1 of t punctuations, one for each time unit. Partial orders cannot encode
the sets of Example 3, the maps of Example 4, or the signals of Example 5. In-
formally, the reason for this is that partial orders can only encode commutation
equations, which are insufficient for objects such as sets and maps.
3 Stream Transductions
In this section we will introduce stream transductions as semantic denotational
models of stream transformations. At any given point in a streaming computa-
tion, we have seen an input history (the part of the stream from the beginning
of the computation until now) and we have produced an output history (the
cumulative output that has been emitted from the beginning until now). As a
first approximation, a streaming computation can be described mathematically
by a function β : A → B, where A and B are monoids that describe the input
and output type respectively, which maps an input history x ∈ A to an output
history β(x) ∈ B. The function β has to be monotone because the output is
cumulative, which means that it can only be extended with more output items
as the computation proceeds. An equivalent way to understand the monotonicity
property is that it captures the idea that any output that has already been emit-
ted cannot be retracted. Since β takes an entire input history as its argument,
it can describe stateful computations, where the output that is emitted at every
step potentially depends on the entire input history.
Definition 8 (Stream Transduction & Incremental Form). Let A and B
be monoids. A function β : A → B is said to be monotone (with respect to the
prefix preorder) if x ⊑ y implies β(x) ⊑ β(y) for all x, y ∈ A. For a monotone
β : A → B, we say that the partial function μ is a monotonicity witness function
if it maps elements x, y ∈ A and z ∈ prefix(x, y) witnessing that x ⊑ y to a
witness μ(x, y, z) ∈ prefix(β(x), β(y)) for β(x) ⊑ β(y). That is, we require that
the type of μ is ∏x,y∈A prefix(x, y) → prefix(β(x), β(y)). So, the defining property
of μ is that for all x, y, z ∈ A with xz = y it holds that β(x) · μ(x, y, z) = β(y).
For brevity, we will sometimes write μ(x, z) to denote μ(x, xz, z). The defining
property of μ is then written as β(x) · μ(x, z) = β(xz) for all x, z ∈ A.
A stream transduction from A to B is a function β : A → B that is mono-
tone with respect to the prefix preorder, together with a monotonicity witness
function μ : ∏x,y∈A prefix(x, y) → prefix(β(x), β(y)). We write STrans(A, B) to
denote the set of all stream transductions from A to B.
The incremental form of a stream transduction ⟨β, μ⟩ ∈ STrans(A, B) is a
function F(β, μ) : A∗ → B∗, which is defined inductively by F(β, μ)(ε) = ⟨β(1)⟩
and F(β, μ)(⟨x1, . . . , xn, xn+1⟩) = F(β, μ)(⟨x1, . . . , xn⟩) · ⟨μ(x1 · · · xn, xn+1)⟩ for
every sequence ⟨x1, . . . , xn+1⟩ ∈ A∗.
Consider the stream transduction ⟨β, μ⟩ : STrans(A, B) and the input frag-
ments x, y ∈ A. Notice that μ(x, y) gives the output increment that the streaming
computation generates when the input history x is extended into xy. For an ar-
bitrary output monoid B, the output increment μ(x, y) is generally not uniquely
determined by β(x) and β(xy). This means that the monotonicity witness func-
tion μ generally provides some additional information about the streaming com-
putation that cannot be obtained purely from β. However, if the output monoid
B is left-cancellative then there is a unique function μ that witnesses the mono-
tonicity of β.
Suppose that ⟨β, μ⟩ : STrans(A, B) is a stream transduction. The incremental
form F(β, μ) of the transduction ⟨β, μ⟩ describes the stream transformation in
explicit input/output increments. For example, F(β, μ)(⟨x1⟩) = ⟨β(1), μ(1, x1)⟩
and F(β, μ)(⟨x1, x2⟩) = ⟨β(1), μ(1, x1), μ(x1, x2)⟩. The key property of the in-
cremental form is that π(F(β, μ)(x̄)) = β(π(x̄)) for every x̄ ∈ A∗. For example,
π(F(β, μ)(⟨x1, x2, x3⟩)) = β(1) · μ(1, x1) · μ(x1, x2) · μ(x1 x2, x3) = β(x1) · μ(x1, x2) ·
μ(x1 x2, x3) = β(x1 x2) · μ(x1 x2, x3) = β(x1 x2 x3).
Example 9 (Counting). Let A be an arbitrary set. We will describe a stream-
ing computation whose input type is the monoid FBag(A) and whose output
type is the monoid FSeq(N). The informal operational description is as follows:
there is no initial output, and every time a new data item arrives the compu-
tation emits the total number of items seen so far. The formal description is
given by the stream transduction β : FBag(A) → FSeq(N), defined by β(∅) = ε
and β(x) = ⟨1, 2, . . . , |x|⟩ for every non-empty x ∈ FBag(A), where |x| denotes
the size of the multiset x. It is easy to see that β is monotone. Since FSeq(N)
4 Model of Computation
We will present an abstract model of computation for stream processing, where
the input and output data streams are elements of monoids A and B respec-
tively. A streaming algorithm is described by a transducer, a kind of automaton
that produces output values. We consider transducers that can have a poten-
tially infinite state space, which we denote by St. The computation starts at a
distinguished initial state init ∈ St, and the initialization triggers some initial
output o ∈ B. The computation then proceeds by consuming the input stream
incrementally, i.e. fragment by fragment. One step of the computation from a
state s ∈ St involves consuming an input fragment x ∈ A, producing an output
increment out(s, x) ∈ B and transitioning to the next state next(s, x) ∈ St.
The proof is by induction on the length of the sequence. For the base case, we
have that gnext(init, ε) = init and next(init, 1) are bisimilar because G is coherent
(recall Property (N1) of Definition 20). For the induction step we have:
which is equal to next(init, π(x̄ · y)). This concludes the proof of the claim (N*).
The proof of the theorem proceeds by induction on x̄ ∈ A∗ . For the base case,
observe that o · gout(init, ε) = o · 1 = o is equal to o · out(init, 1) = o (property
(O1) for G). For the induction step, we have:
When G is coherent, Theorem 21 says that the denotation gives the same cumu-
lative output for any two factorizations of the input. We say that the transducers
G1 and G2 are equivalent if their denotations are equal, i.e., ⟦G1⟧ = ⟦G2⟧.
Proof. Suppose that G = (St, init, o, next, out) : G(A, B) is a coherent transducer.
Define the function β : A → B by β(x) = o · out(init, x) for every x ∈ A, and
the function μ : A × A → B by μ(x, y) = out(next(init, x), y) for all x, y ∈
A. For any x, y ∈ A, we have to establish that β(x) · μ(x, y) = β(xy). This
follows immediately from Part (O2) of the coherence property for G. So, ⟨β, μ⟩
is a stream transduction. It remains to prove that G implements ⟨β, μ⟩, that is,
⟦G⟧(x̄) = F(β, μ)(x̄) for every x̄ ∈ A∗. For the base case, we have ⟦G⟧(ε) = ⟨o⟩
and F(β, μ)(ε) = ⟨β(1)⟩, which are equal because β(1) = o · out(init, 1) = o by
(O1). For the step case, we observe that:
Since G implements ⟨β, μ⟩, we have that ⟦G⟧(x̄ · z) = F(β, μ)(x̄ · z) and there-
fore out(s, z) = μ(π(x̄), z). Similarly, we can obtain that out(t, z) = μ(π(ȳ), z).
From π(x̄) = π(ȳ) we get that μ(π(x̄), z) = μ(π(ȳ), z), and therefore out(s, z) =
out(t, z). Now, observe that s′ = next(s, z) = next(gnext(init, x̄), z) = gnext(init, x̄ ·
z) using Property 1. Similarly, we have that t′ = next(t, z) = gnext(init, ȳ · z).
From π(x̄ · z) = π(x̄)z = π(ȳ)z = π(ȳ · z) we conclude that s′ R t′. We have
thus established that R is a bisimulation.
Now, we are ready to prove that G is coherent. We will only present the cases
of Part (N2) and Part (O2), since they are the most interesting ones. Let x, y ∈ A.
For Part (N2), we have to show that the states s = next(next(init, x), y) and
t = next(init, xy) are bisimilar. Since R (previous paragraph) is a bisimulation, it
suffices to show that (s, t) ∈ R. Indeed, this is true because s = gnext(init, ⟨x, y⟩),
t = gnext(init, ⟨xy⟩) and π(⟨x, y⟩) = xy = π(⟨xy⟩). For Part (O2), we have that
⟦G⟧(⟨xy⟩) = ⟨o, out(init, xy)⟩ and F(β, μ)(⟨xy⟩) = ⟨β(1), μ(1, xy)⟩, as well as
using the definitions of ⟦G⟧ and F. Since G implements ⟨β, μ⟩, we know that
⟦G⟧(⟨x, y⟩) = F(β, μ)(⟨x, y⟩) and ⟦G⟧(⟨xy⟩) = F(β, μ)(⟨xy⟩). Using all the above,
we get that o · out(init, x) · out(next(init, x), y) = β(1) · μ(1, x) · μ(x, y) = β(x) ·
μ(x, y) = β(xy) and o · out(init, xy) = β(1) · μ(1, xy) = β(xy). So, Part (O2) of
the coherence property holds.
Proof. Recall from Definition 8 that the monotonicity witness function μ satisfies
the following property: β(x) · μ(x, y) = β(xy) for every x, y ∈ A. Now, we define
the transducer G = (St, init, o, next, out) as follows: St = A, init = 1, o = β(1),
next(s, x) = s · x and out(s, x) = μ(s, x) for every state s ∈ St and input x ∈ A.
The following properties hold for every s ∈ St and every sequence ⟨x1, . . . , xn⟩ ∈ A∗:
for all x̄ ∈ FSeq(A)∗ and y ∈ FSeq(A). We have thus proved that Flatten(A) is
correct: its denotation is equal to the intended semantics.
for all x̄ ∈ A∗ and y ∈ A. We have thus established that Split(r) is correct: its
denotation is equal to the intended semantics.
Feedback composition:

⟨β, μ⟩ : STrans(A × B, B)
loopB(β, μ) = ⟨γ, ν⟩ : STrans(FSeq(A), FSeq(B)), where
  γ(⟨a1, . . . , an⟩) = ⟨b0, b1, . . . , bn⟩ with
  γ(ε) = ⟨b0⟩, where b0 = β(1A, 1B)
  γ(⟨a1, . . . , an, an+1⟩) = γ(⟨a1, . . . , an⟩) · ⟨bn+1⟩, where
    bn+1 = μ((a1 · · · an, b0 b1 · · · bn−1), (an+1, bn))

G = (St, init, o, next, out) : G(A × B, B)
LoopB(G) = (St′, init′, o′, next′, out′) : G(FSeq(A), FSeq(B)), where
  St′ = St × B (second component: last output batch)
  init′ = (init, o) and o′ = ⟨o⟩
  next′((s, b), ⟨a⟩) = (next(s, (a, b)), out(s, (a, b)))
  out′((s, b), ⟨a⟩) = ⟨out(s, (a, b))⟩

⟨β, μ⟩ : STrans(A × B, B)    splitter r for A
loop(β, μ, r) = serial(split(r), loopB(β, μ), flatten(B)) : STrans(A, B)

G : G(A × B, B)    splitter r for A
Loop(G, r) = Serial(Split(r), LoopB(G), Flatten(B)) : G(A, B)
where ⟨β, μ⟩ = serial(⟨β1, μ1⟩, ⟨β2, μ2⟩). All four claims above are proved by induction
on the sequence x̄. Equations (7) and (8) are needed to prove Equation (9). Now,
we will establish that G implements ⟨β, μ⟩. Indeed, we have that
Each edge of the graph represents a communication channel along which a stream
flows, and it is annotated with the type of the stream. The dataflow graph
above represents the transducer G = Serial(Par(Lift(op), Lift(op)), Merge),
where Merge : G(FBag(A) × FBag(A), FBag(A)) is the transducer of Example 16.
From Propositions 27, 29 and 28 we obtain that G implements the transduction
serial(par(lift(op), lift(op)), merge), where merge is described in Example 11.
We will now consider the feedback combinator, which introduces cycles in
the dataflow graph. One consequence of cyclic graphs in the style of Kahn-
MacQueen [60] is that divergence can be introduced, that is, a finite amount
of input can cause an operator to enter an infinite loop. For example, consider
the transducer Merge : G(FBag(A) × FBag(A), FBag(A)) of Example 16. The
figure below visualizes the dataflow graph, where the output channel of Merge
is connected to one of its input channels, thus forming a feedback loop.
[Figure: dataflow graph in which the output channel of Merge is connected back to its second input channel; all three channels have type FBag(A).]
Suppose that the singleton input {a} is fed to the input of the dataflow graph
above, which corresponds to the first input channel of Merge. This will cause
Merge to emit {a}, which will be sent again to the second input channel of Merge.
Intuitively, this will cause the computation to enter an infinite loop (divergence)
of consuming and emitting {a}. This behavior is undesirable in systems that
process data streams, because divergence can make the system unresponsive. For
this reason, we will consider here a form of feedback that eliminates this problem
by ensuring that the computation of a feedback loop proceeds in a sequence of
rounds. This avoids divergence, because the computation always makes progress
by moving from one round to the next, as dictated by the input data. We describe
this organization in rounds by requiring that the programmer specifies a splitter
(recall Example 18). The splitter decomposes the input stream into batches,
and one round of computation for the feedback loop corresponds to consuming
one batch of data, generating the corresponding output batch, and sending the
output batch along the feedback loop to be available for the next round of
processing. This form of feedback allows flexibility in specifying what constitutes
a single batch (and thus a single round), and therefore generalizes the feedback
combinator of Synchronous Languages such as Lustre [31].
Proposition 30 (Feedback Composition). Let A and B be monoids, ⟨β, μ⟩ :
STrans(A × B, B) be a transduction, G : G(A × B, B) be a transducer, and r = (r1, r2)
be a splitter for A (see Example 13).
(1) Implem.: If G implements ⟨β, μ⟩, then Loop(G, r) implements loop(β, μ, r).
(2) Coherence: If G is coherent, then so is Loop(G, r).
Proof. We leave to the reader the proofs that Split (Example 18) implements
split and that Flatten (Example 17) implements flatten. Given Proposition 29,
it suffices to show that G′ = LoopB(G) implements ⟨γ, ν⟩ = loopB(β, μ). Since
G′ is of type G(FSeq(A), FSeq(B)), it suffices to define the transition and output
functions on singleton sequences (as done in Table 1), because there is a unique
way to extend them so that G′ is coherent. It remains to show that ⟦G′⟧(x̄) =
F(γ, ν)(x̄) for every x̄ ∈ FSeq(A)∗. The base case is easy, and for the step case it
suffices to show that out′(gnext′(init′, x̄), y) = ν(π(x̄), y) for every x̄ ∈ FSeq(A)∗
and y ∈ FSeq(A). As we discussed before, gnext′ and out′ can be viewed as being
defined on elements of A rather than sequences over FSeq(A), so we can equivalently
prove that out′(gnext′(init′, ⟨a1, . . . , an⟩), an+1) = ν(a1 · · · an, an+1) with each
ai an element of A. Given that G implements ⟨β, μ⟩, the key observation to finish
the proof is gnext′(init′, ⟨a1, . . . , an⟩) = (gnext(init, ⟨(a1, b0), . . . , (an, bn−1)⟩), bn),
where γ(⟨a1, . . . , an⟩) = ⟨b0, b1, . . . , bn⟩.
Example 31. For an example of using the feedback combinator, consider the
transduction β, μ which adds two input streams of numbers pointwise. That
is, β : FSeq(N) × FSeq(N) → FSeq(N) is defined by β(x1 x2 . . . xm , y1 y2 . . . yn ) =
0(x1 + y1 )(x2 + y2 ) . . . (xk + yk ) where k = min(m, n). Additionally, consider
the trivial splitter r = (r1, r2) for sequences where each batch is a singleton:
r1(x1 . . . xn) = ⟨x1, . . . , xn⟩ and r2(x1 . . . xn) = ε. We use this splitter to enforce
that each batch is a single element and that each round of the computation
involves consuming one element. Finally, the transduction loop(β, μ, r) = ⟨γ, ν⟩
describes the running sum, that is, γ(x1 . . . xn ) = 0x1 (x1 + x2 ) . . . (x1 + · · · + xn ).
The dataflow combinators of this section could form the basis of query lan-
guage design. The StreamQRE language [10,84] and related formalisms [9,11,12,
14] are based on a set of combinators for efficiently processing linearly-ordered
streams (e.g., time series [3, 4]). Extending a language like StreamQRE to the
typed setting of stream transductions is an interesting research direction.
The above equation illustrates our proposed style of reasoning for establishing
the soundness of optimizing streaming transformations: (1) prove equalities be-
tween transductions using elementary set-theoretic arguments, (2) prove that
the transducers (programs) implement the transductions (denotations) using
induction, (3) translate the equalities between transductions into equivalences
between transducers using the results of Sect. 5, and finally (4) use algebraic
reasoning to establish more complex equivalences.
The example of this section is simple but illustrates two key points: (1) our
data types for streams (monoids) capture important invariants about the streams
that enable transformations, and (2) useful program transformations can be
established with denotational arguments that require an appropriate notion of
transduction. This approach opens up the possibility of formally verifying the
wealth of optimizing transformations that are used in stream processing systems.
The papers [54, 101] describe several of them, but use informal arguments that
rely on the operational intuition about streaming computations. Our approach
here, on the other hand, relies on rigorous denotational arguments.
The equational axiomatizations of arrows [56] and traced monoidal categories
[58] are relevant to our setting, but would require adaptation. An interesting
question is whether a complete axiomatization can be provided for the basic
dataflow combinators of Sect. 5, similarly to how Kleene Algebra (KA) [62, 63]
and its extensions [49,64,79,83] (as well as other program logics [65,66,78,80–82])
capture properties of imperative programs at the propositional level. We also
leave for future work the development of the coalgebraic approach [96–98] for
reasoning about the equivalence of stream transducers. We have already defined
a notion of bisimulation in Sect. 4, which could give an alternative approach for
proving equivalence using coinduction on the transducers.
7 Related Work
The idea of using types to classify streams has been recently explored in [85]
(see also [13]), but only for a restricted class of types that correspond to partial
orders. No general abstract model of computation is presented in [85], and many
of the examples in this paper cannot be adequately accommodated.
The mathematical framework of coalgebras [97] has been used to describe
streams [98]. One advantage of this approach is that proofs of equivalence can
be given using the proof principle of coinduction [96], which in many cases offers
a useful alternative to proofs by induction. This line of work mostly focuses on
infinite sequences of elements, whereas here we focus on the transformation of
streams of data that can be of various different forms (not just sequences).
The idea to model the input/output of automata using monoids has appeared
in the algebraic theory of automata and transducers. Monoids (non-free, e.g.
A∗ × B ∗ ) have been used to generalize automata from recognizers of languages
to recognizers of relations [45], which are sometimes called rational transduc-
ers [100]. Our focus here is on (deterministic) functions, as models that recog-
nize relations can give rise to the Brock-Ackerman anomaly [30]. The automata
models (with inputs from a free monoid A∗ ) most closely related to our stream
transducers are deterministic: Mealy machines [87], Moore machines [90], se-
quential transducers [48, 95], and sub-sequential transducers [102]. The concept
of coherence that we introduce here (Definition 20) does not arise in these mod-
els, because they do not operate on input batches. An algebraic generalization
of a deterministic acceptor is provided by a right monoid action δ : St × A → St
(see page 231 of [100]), which satisfies the following properties for all s ∈ St and
x, y ∈ A: (1) δ(s, 1) = s, and (2) δ(δ(s, x), y) = δ(s, xy). These properties look
similar to (N1) and (N2) of Definition 20. They are, however, too restrictive for
our stream transducers, as they would falsify Theorem 23.
8 Conclusion
References
1. Abadi, D.J., Ahmad, Y., Balazinska, M., Cetintemel, U., Cherniack, M., Hwang,
J.H., Lindner, W., Maskey, A., Rasin, A., Ryvkina, E., Tatbul, N., Xing, Y.,
Zdonik, S.: The design of the Borealis stream processing engine. In: Proceedings
of the 2nd Biennial Conference on Innovative Data Systems Research (CIDR ’05).
pp. 277–289 (2005), http://cidrdb.org/cidr2005/papers/P23.pdf
2. Abadi, D.J., Carney, D., Cetintemel, U., Cherniack, M., Convey, C., Lee, S.,
Stonebraker, M., Tatbul, N., Zdonik, S.: Aurora: A new model and architec-
ture for data stream management. The VLDB Journal 12(2), 120–139 (2003).
https://doi.org/10.1007/s00778-003-0095-z
3. Abbas, H., Alur, R., Mamouras, K., Mangharam, R., Rodionova, A.: Real-time
decision policies with predictable performance. Proceedings of the IEEE, Spe-
cial Issue on Design Automation for Cyber-Physical Systems 106(9), 1593–1615
(2018). https://doi.org/10.1109/JPROC.2018.2853608
4. Abbas, H., Rodionova, A., Mamouras, K., Bartocci, E., Smolka, S.A., Grosu, R.:
Quantitative regular expressions for arrhythmia detection. IEEE/ACM Trans-
actions on Computational Biology and Bioinformatics 16(5), 1586–1597 (2019).
https://doi.org/10.1109/TCBB.2018.2885274
5. Affetti, L., Tommasini, R., Margara, A., Cugola, G., Della Valle, E.: Defining
the execution semantics of stream processing engines. Journal of Big Data 4(1)
(2017). https://doi.org/10.1186/s40537-017-0072-9
6. Akidau, T., Balikov, A., Bekiroğlu, K., Chernyak, S., Haberman, J., Lax, R.,
McVeety, S., Mills, D., Nordstrom, P., Whittle, S.: MillWheel: Fault-tolerant
stream processing at Internet scale. Proceedings of the VLDB Endowment 6(11),
1033–1044 (2013). https://doi.org/10.14778/2536222.2536229
7. Akidau, T., Bradshaw, R., Chambers, C., Chernyak, S., Fernández-Moctezuma,
R.J., Lax, R., McVeety, S., Mills, D., Perry, F., Schmidt, E., Whittle, S.: The
dataflow model: A practical approach to balancing correctness, latency, and cost in
massive-scale, unbounded, out-of-order data processing. Proceedings of the VLDB
Endowment 8(12), 1792–1803 (2015). https://doi.org/10.14778/2824032.2824076
8. Alur, R., Černý, P.: Streaming transducers for algorithmic verification of
single-pass list-processing programs. In: Proceedings of the 38th Annual
ACM SIGPLAN-SIGACT Symposium on Principles of Programming Lan-
guages. pp. 599–610. POPL ’11, ACM, New York, NY, USA (2011).
https://doi.org/10.1145/1926385.1926454
9. Alur, R., Fisman, D., Mamouras, K., Raghothaman, M., Stanford, C.: Stream-
able regular transductions. Theoretical Computer Science 807, 15–41 (2020).
https://doi.org/10.1016/j.tcs.2019.11.018
10. Alur, R., Mamouras, K.: An introduction to the StreamQRE language. Depend-
able Software Systems Engineering 50, 1–24 (2017). https://doi.org/10.3233/978-
1-61499-810-5-1
11. Alur, R., Mamouras, K., Stanford, C.: Automata-based stream processing. In:
Chatzigiannakis, I., Indyk, P., Kuhn, F., Muscholl, A. (eds.) Proceedings of
the 44th International Colloquium on Automata, Languages, and Programming
(ICALP ’17). Leibniz International Proceedings in Informatics (LIPIcs), vol. 80,
pp. 112:1–112:15. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl,
Germany (2017). https://doi.org/10.4230/LIPIcs.ICALP.2017.112
12. Alur, R., Mamouras, K., Stanford, C.: Modular quantitative monitoring. Pro-
ceedings of the ACM on Programming Languages 3(POPL), 50:1–50:31 (2019).
https://doi.org/10.1145/3290363
13. Alur, R., Mamouras, K., Stanford, C., Tannen, V.: Interfaces for stream process-
ing systems. In: Lohstroh, M., Derler, P., Sirjani, M. (eds.) Principles of Modeling:
Essays Dedicated to Edward A. Lee on the Occasion of His 60th Birthday, Lec-
ture Notes in Computer Science, vol. 10760, pp. 38–60. Springer, Cham (2018).
https://doi.org/10.1007/978-3-319-95246-8_3
14. Alur, R., Mamouras, K., Ulus, D.: Derivatives of quantitative regular expressions.
In: Aceto, L., Bacci, G., Bacci, G., Ingólfsdóttir, A., Legay, A., Mardare, R.
(eds.) Models, Algorithms, Logics and Tools: Essays Dedicated to Kim Guldstrand
Larsen on the Occasion of His 60th Birthday, Lecture Notes in Computer Science,
vol. 10460, pp. 75–95. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-
63121-9_4
15. Arasu, A., Babcock, B., Babu, S., Cieslewicz, J., Datar, M., Ito, K., Motwani,
R., Srivastava, U., Widom, J.: STREAM: The Stanford data stream management
system. Tech. Rep. 2004-20, Stanford InfoLab (2004), http://ilpubs.stanford.edu:
8090/641/
16. Arasu, A., Babu, S., Widom, J.: The CQL continuous query language: Seman-
tic foundations and query execution. The VLDB Journal 15(2), 121–142 (2006).
https://doi.org/10.1007/s00778-004-0147-z
17. Arasu, A., Widom, J.: A denotational semantics for continuous queries
over streams and relations. SIGMOD Record 33(3), 6–11 (2004).
https://doi.org/10.1145/1031570.1031572
18. Babcock, B., Babu, S., Datar, M., Motwani, R., Widom, J.: Models
and issues in data stream systems. In: Proceedings of the Twenty-first
ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database
Systems. pp. 1–16. PODS ’02, ACM, New York, NY, USA (2002).
https://doi.org/10.1145/543613.543615
19. Bai, Y., Thakkar, H., Wang, H., Luo, C., Zaniolo, C.: A data stream
language and system designed for power and extensibility. In: Proceedings
of the 15th ACM International Conference on Information and Knowledge
Management. pp. 337–346. CIKM ’06, ACM, New York, NY, USA (2006).
https://doi.org/10.1145/1183614.1183664
20. Benveniste, A., Caspi, P., Edwards, S.A., Halbwachs, N., Guernic, P.L., de Si-
mone, R.: The synchronous languages 12 years later. Proceedings of the IEEE
91(1), 64–83 (2003). https://doi.org/10.1109/JPROC.2002.805826
21. Benveniste, A., Guernic, P.L., Jacquemot, C.: Synchronous programming with
events and relations: The SIGNAL language and its semantics. Science of
Computer Programming 16(2), 103–149 (1991). https://doi.org/10.1016/0167-
6423(91)90001-E
22. Berry, G., Gonthier, G.: The Esterel synchronous programming language: De-
sign, semantics, implementation. Science of Computer Programming 19(2), 87–
152 (1992). https://doi.org/10.1016/0167-6423(92)90005-V
23. Bertot, Y., Castéran, P.: Interactive Theorem Proving and Program Development.
Springer (2013). https://doi.org/10.1007/978-3-662-07964-5
24. Bilsen, G., Engels, M., Lauwereins, R., Peperstraete, J.: Cyclo-static
dataflow. IEEE Transactions on Signal Processing 44(2), 397–408 (1996).
https://doi.org/10.1109/78.485935
25. Botan, I., Derakhshan, R., Dindar, N., Haas, L., Miller, R.J., Tatbul, N.: SE-
CRET: A model for analysis of the execution semantics of stream process-
ing systems. Proceedings of the VLDB Endowment 3(1-2), 232–243 (2010).
https://doi.org/10.14778/1920841.1920874
26. Bouillet, E., Kothari, R., Kumar, V., Mignet, L., Nathan, S., Ranganathan, A.,
Turaga, D.S., Udrea, O., Verscheure, O.: Processing 6 billion CDRs/day: From
research to production (experience report). In: Proceedings of the 6th ACM In-
ternational Conference on Distributed Event-Based Systems. pp. 264–267. DEBS
’12, ACM, New York, NY, USA (2012). https://doi.org/10.1145/2335484.2335513
27. Bourke, T., Pouzet, M.: Zélus: A synchronous language with ODEs. In: Pro-
ceedings of the 16th International Conference on Hybrid Systems: Computa-
tion and Control. pp. 113–118. HSCC ’13, ACM, New York, NY, USA (2013).
https://doi.org/10.1145/2461328.2461348
28. Boussinot, F., de Simone, R.: The ESTEREL language. Proceedings of the IEEE
79(9), 1293–1304 (1991). https://doi.org/10.1109/5.97299
29. Brenna, L., Demers, A., Gehrke, J., Hong, M., Ossher, J., Panda, B., Riedewald,
M., Thatte, M., White, W.: Cayuga: A high-performance event processing engine.
In: Proceedings of the 2007 ACM SIGMOD International Conference on Manage-
ment of Data. pp. 1100–1102. SIGMOD ’07, ACM, New York, NY, USA (2007).
https://doi.org/10.1145/1247480.1247620
30. Brock, J.D., Ackerman, W.B.: Scenarios: A model of non-determinate computa-
tion. In: Díaz, J., Ramos, I. (eds.) Proceedings of the International Colloquium
on the Formalization of Programming Concepts (ICFPC ’81). Lecture Notes in
Computer Science, vol. 107, pp. 252–259. Springer, Berlin, Heidelberg (1981).
https://doi.org/10.1007/3-540-10699-5_102
31. Caspi, P., Pilaud, D., Halbwachs, N., Plaice, J.A.: LUSTRE: A declar-
ative language for real-time programming. In: Proceedings of the 14th
ACM SIGACT-SIGPLAN Symposium on Principles of Programming Lan-
guages. pp. 178–188. POPL ’87, ACM, New York, NY, USA (1987).
https://doi.org/10.1145/41625.41641
32. Chandrasekaran, S., Cooper, O., Deshpande, A., Franklin, M.J., Hellerstein, J.M.,
Hong, W., Krishnamurthy, S., Madden, S., Raman, V., Reiss, F., Shah, M.: Tele-
graphCQ: Continuous dataflow processing for an uncertain world. In: Proceedings
of the First Biennial Conference on Innovative Data Systems Research (CIDR ’03)
(2003), http://cidrdb.org/cidr2003/program/p24.pdf
33. Chen, C.M., Agrawal, H., Cochinwala, M., Rosenbluth, D.: Stream query pro-
cessing for healthcare bio-sensor applications. In: Proceedings of the 20th Inter-
national Conference on Data Engineering. pp. 791–794. ICDE ’04, IEEE (2004).
https://doi.org/10.1109/ICDE.2004.1320048
34. Cooper, G.H., Krishnamurthi, S.: Embedding dynamic dataflow in a call-by-value
language. In: Sestoft, P. (ed.) Proceedings of the 15th European Symposium on
Programming (ESOP ’06). Lecture Notes in Computer Science, vol. 3924, pp. 294–
308. Springer, Berlin, Heidelberg (2006). https://doi.org/10.1007/11693024_20
35. Coquand, T., Huet, G.: The calculus of constructions. Information and Compu-
tation 76(2), 95–120 (1988). https://doi.org/10.1016/0890-5401(88)90005-3
36. Courtney, A.: Frappé: Functional reactive programming in Java. In: Ra-
makrishnan, I.V. (ed.) Proceedings of the 3rd International Symposium on
Practical Aspects of Declarative Languages (PADL ’01). Lecture Notes in
Computer Science, vol. 1990, pp. 29–44. Springer, Berlin, Heidelberg (2001).
https://doi.org/10.1007/3-540-45241-9_3
37. Cranor, C., Johnson, T., Spataschek, O., Shkapenyuk, V.: Gigascope: A stream
database for network applications. In: Proceedings of the 2003 ACM SIGMOD
International Conference on Management of Data. pp. 647–651. SIGMOD ’03,
ACM, New York, NY, USA (2003). https://doi.org/10.1145/872757.872838
38. Czaplicki, E., Chong, S.: Asynchronous functional reactive programming for GUIs.
In: Proceedings of the 34th ACM SIGPLAN Conference on Programming Lan-
guage Design and Implementation. pp. 411–422. PLDI ’13, ACM, New York, NY,
USA (2013). https://doi.org/10.1145/2491956.2462161
39. D’Angelo, B., Sankaranarayanan, S., Sanchez, C., Robinson, W., Finkbeiner,
B., Sipma, H.B., Mehrotra, S., Manna, Z.: LOLA: Runtime monitoring of syn-
chronous systems. In: Proceedings of the 12th International Symposium on Tem-
poral Representation and Reasoning (TIME ’05). pp. 166–174. IEEE (2005).
https://doi.org/10.1109/TIME.2005.26
40. Demers, A., Gehrke, J., Hong, M., Riedewald, M., White, W.: Towards expres-
sive publish/subscribe systems. In: Ioannidis, Y., Scholl, M.H., Schmidt, J.W.,
Matthes, F., Hatzopoulos, M., Boehm, K., Kemper, A., Grust, T., Boehm, C.
(eds.) Proceedings of the 10th International Conference on Extending Database
Technology (EDBT ’06). Lecture Notes in Computer Science, vol. 3896, pp. 627–
644. Springer, Berlin, Heidelberg (2006). https://doi.org/10.1007/11687238_38
41. Demers, A., Gehrke, J., Panda, B., Riedewald, M., Sharma, V., White, W.:
Cayuga: A general purpose event monitoring system. In: Proceedings of the 3rd
Biennial Conference on Innovative Data Systems Research (CIDR ’07). pp. 412–
422 (2007), http://cidrdb.org/cidr2007/papers/cidr07p47.pdf
42. Dennis, J.B.: First version of a data flow procedure language. In: Robinet, B.
(ed.) Programming Symposium. Lecture Notes in Computer Science, vol. 19,
pp. 362–376. Springer, Berlin, Heidelberg (1974). https://doi.org/10.1007/3-540-
06859-7_145
43. Deshmukh, J.V., Donzé, A., Ghosh, S., Jin, X., Juniwal, G., Seshia, S.A.: Robust
online monitoring of signal temporal logic. Formal Methods in System Design
51(1), 5–30 (2017). https://doi.org/10.1007/s10703-017-0286-7
44. Dindar, N., Tatbul, N., Miller, R.J., Haas, L.M., Botan, I.: Modeling the execution
semantics of stream processing engines with SECRET. The VLDB Journal 22(4),
421–446 (2013). https://doi.org/10.1007/s00778-012-0297-3
45. Elgot, C.C., Mezei, J.E.: On relations defined by generalized finite au-
tomata. IBM Journal of Research and Development 9(1), 47–68 (1965).
https://doi.org/10.1147/rd.91.0047
46. Elliott, C., Hudak, P.: Functional reactive animation. In: Proceedings of
the Second ACM SIGPLAN International Conference on Functional Pro-
gramming. pp. 263–273. ICFP ’97, ACM, New York, NY, USA (1997).
https://doi.org/10.1145/258948.258973
47. Elliott, C.M.: Push-pull functional reactive programming. In: Proceedings of the
2nd ACM SIGPLAN Symposium on Haskell. pp. 25–36. Haskell ’09, ACM, New
York, NY, USA (2009). https://doi.org/10.1145/1596638.1596643
48. Ginsburg, S., Rose, G.F.: A characterization of machine mappings. Canadian
Journal of Mathematics 18, 381–388 (1966). https://doi.org/10.4153/CJM-
1966-040-3
49. Grathwohl, N.B.B., Kozen, D., Mamouras, K.: KAT + B! In: Proceedings of
the Joint Meeting of the Twenty-Third EACSL Annual Conference on Computer
Science Logic (CSL) and the Twenty-Ninth Annual ACM/IEEE Symposium on
Logic in Computer Science (LICS). pp. 44:1–44:10. CSL-LICS ’14, ACM, New
York, NY, USA (2014). https://doi.org/10.1145/2603088.2603095
50. Gyllstrom, D., Wu, E., Chae, H.J., Diao, Y., Stahlberg, P., Anderson, G.: SASE:
Complex event processing over streams. In: Proceedings of the 3rd Biennial Con-
ference on Innovative Data Systems Research (CIDR ’07). pp. 407–411 (2007),
http://cidrdb.org/cidr2007/papers/cidr07p46.pdf
51. Halbwachs, N., Caspi, P., Raymond, P., Pilaud, D.: The synchronous data flow
programming language LUSTRE. Proceedings of the IEEE 79(9), 1305–1320
(1991). https://doi.org/10.1109/5.97300
52. Havelund, K., Roşu, G.: Efficient monitoring of safety properties. Interna-
tional Journal on Software Tools for Technology Transfer 6(2), 158–173 (2004).
https://doi.org/10.1007/s10009-003-0117-6
53. Hirzel, M.: Partition and compose: Parallel complex event processing. In:
Proceedings of the 6th ACM International Conference on Distributed Event-
Based Systems. pp. 191–200. DEBS ’12, ACM, New York, NY, USA (2012).
https://doi.org/10.1145/2335484.2335506
54. Hirzel, M., Soulé, R., Schneider, S., Gedik, B., Grimm, R.: A catalog of stream
processing optimizations. ACM Computing Surveys (CSUR) 46(4), 46:1–46:34
(2014). https://doi.org/10.1145/2528412
55. Hudak, P., Courtney, A., Nilsson, H., Peterson, J.: Arrows, robots, and functional
reactive programming. In: Jeuring, J., Jones, S.L.P. (eds.) Revised Lectures of
the 4th International School on Advanced Functional Programming: AFP 2002,
Oxford, UK, August 19-24, 2002., Lecture Notes in Computer Science, vol. 2638,
pp. 159–187. Springer, Berlin, Heidelberg (2003). https://doi.org/10.1007/978-3-
540-44833-4_6
56. Hughes, J.: Generalising monads to arrows. Science of Computer Programming
37(1), 67–111 (2000). https://doi.org/10.1016/S0167-6423(99)00023-4
57. Jain, N., Mishra, S., Srinivasan, A., Gehrke, J., Widom, J., Balakrishnan, H.,
Çetintemel, U., Cherniack, M., Tibbetts, R., Zdonik, S.: Towards a streaming
SQL standard. Proceedings of the VLDB Endowment 1(2), 1379–1390 (2008).
https://doi.org/10.14778/1454159.1454179
58. Joyal, A., Street, R., Verity, D.: Traced monoidal categories. Mathematical
Proceedings of the Cambridge Philosophical Society 119(3), 447–468 (1996).
https://doi.org/10.1017/S0305004100074338
59. Kahn, G.: The semantics of a simple language for parallel programming. Infor-
mation Processing 74, 471–475 (1974)
60. Kahn, G., MacQueen, D.B.: Coroutines and networks of parallel processes. Infor-
mation Processing 77, 993–998 (1977)
61. Karp, R.M., Miller, R.E.: Properties of a model for parallel computations: De-
terminacy, termination, queueing. SIAM Journal on Applied Mathematics 14(6),
1390–1411 (1966). https://doi.org/10.1137/0114108
62. Kozen, D.: A completeness theorem for Kleene algebras and the algebra
of regular events. Information and Computation 110(2), 366–390 (1994).
https://doi.org/10.1006/inco.1994.1037
63. Kozen, D.: Kleene algebra with tests. ACM Transactions on Pro-
gramming Languages and Systems (TOPLAS) 19(3), 427–443 (1997).
https://doi.org/10.1145/256167.256195
64. Kozen, D., Mamouras, K.: Kleene algebra with equations. In: Esparza, J., Fraigni-
aud, P., Husfeldt, T., Koutsoupias, E. (eds.) Proceedings of the 41st International
Colloquium on Automata, Languages and Programming (ICALP ’14). Lecture
Notes in Computer Science, vol. 8573, pp. 280–292. Springer, Berlin, Heidelberg
(2014). https://doi.org/10.1007/978-3-662-43951-7_24
65. Kozen, D., Parikh, R.: An elementary proof of the completeness of PDL. The-
oretical Computer Science 14(1), 113–118 (1981). https://doi.org/10.1016/0304-
3975(81)90019-0
66. Kozen, D., Tiuryn, J.: On the completeness of propositional Hoare logic. In-
formation Sciences 139(3–4), 187–195 (2001). https://doi.org/10.1016/S0020-
0255(01)00164-5
67. Krämer, J., Seeger, B.: Semantics and implementation of continuous sliding win-
dow queries over data streams. ACM Transactions on Database Systems (TODS)
34(1), 4:1–4:49 (2009). https://doi.org/10.1145/1508857.1508861
68. Krishnaswami, N.R.: Higher-order functional reactive programming without
spacetime leaks. In: Proceedings of the 18th ACM SIGPLAN International Con-
ference on Functional Programming. pp. 221–232. ICFP ’13, ACM, New York,
NY, USA (2013). https://doi.org/10.1145/2500365.2500588
69. Krishnaswami, N.R., Benton, N.: Ultrametric semantics of reactive programs. In:
Proceedings of the 26th Annual IEEE Symposium on Logic in Computer Science
(LICS ’11). pp. 257–266. IEEE (2011). https://doi.org/10.1109/LICS.2011.38
70. Kulkarni, S., Bhagat, N., Fu, M., Kedigehalli, V., Kellogg, C., Mittal, S., Patel,
J.M., Ramasamy, K., Taneja, S.: Twitter Heron: Stream processing at scale. In:
Proceedings of the 2015 ACM SIGMOD International Conference on Manage-
ment of Data. pp. 239–250. SIGMOD ’15, ACM, New York, NY, USA (2015).
https://doi.org/10.1145/2723372.2742788
71. Law, Y.N., Wang, H., Zaniolo, C.: Relational languages and data
models for continuous queries on sequences and data streams. ACM
Transactions on Database Systems (TODS) 36(2), 8:1–8:32 (2011).
https://doi.org/10.1145/1966385.1966386
72. Le Guernic, P., Benveniste, A., Bournai, P., Gautier, T.: SIGNAL–
a data flow-oriented language for signal processing. IEEE Transactions
on Acoustics, Speech, and Signal Processing 34(2), 362–374 (1986).
https://doi.org/10.1109/TASSP.1986.1164809
73. Lee, E.A., Messerschmitt, D.G.: Synchronous data flow. Proceedings of the IEEE
75(9), 1235–1245 (1987). https://doi.org/10.1109/PROC.1987.13876
74. Leucker, M., Schallhart, C.: A brief account of runtime verification.
The Journal of Logic and Algebraic Programming 78(5), 293–303 (2009).
https://doi.org/10.1016/j.jlap.2008.08.004
75. Li, J., Maier, D., Tufte, K., Papadimos, V., Tucker, P.A.: Semantics and
evaluation techniques for window aggregates in data streams. In: Proceed-
ings of the 2005 ACM SIGMOD International Conference on Management
of Data. pp. 311–322. SIGMOD ’05, ACM, New York, NY, USA (2005).
https://doi.org/10.1145/1066157.1066193
76. Maier, D., Li, J., Tucker, P., Tufte, K., Papadimos, V.: Semantics of data
streams and operators. In: Eiter, T., Libkin, L. (eds.) Proceedings of the 10th
International Conference on Database Theory (ICDT ’05). Lecture Notes in
Computer Science, vol. 3363, pp. 37–52. Springer, Berlin, Heidelberg (2005).
https://doi.org/10.1007/978-3-540-30570-5_3
77. Maier, I., Odersky, M.: Higher-order reactive programming with incremen-
tal lists. In: Castagna, G. (ed.) Proceedings of the 27th European Confer-
ence on Object-Oriented Programming (ECOOP ’13). Lecture Notes in Com-
puter Science, vol. 7920, pp. 707–731. Springer, Berlin, Heidelberg (2013).
https://doi.org/10.1007/978-3-642-39038-8_29
78. Mamouras, K.: On the Hoare theory of monadic recursion schemes. In: Proceed-
ings of the Joint Meeting of the 23rd EACSL Annual Conference on Computer
Science Logic (CSL) and the 29th Annual ACM/IEEE Symposium on Logic in
Computer Science (LICS). pp. 69:1–69:10. CSL-LICS ’14, ACM, New York, NY,
USA (2014). https://doi.org/10.1145/2603088.2603157
79. Mamouras, K.: Extensions of Kleene Algebra for Program Verification. Ph.D. the-
sis, Cornell University, Ithaca, NY (August 2015), http://hdl.handle.net/1813/
40960
80. Mamouras, K.: Synthesis of strategies and the Hoare logic of angelic nondeter-
minism. In: Pitts, A. (ed.) Proceedings of the 18th International Conference on
Foundations of Software Science and Computation Structures (FoSSaCS ’15). Lec-
ture Notes in Computer Science, vol. 9034, pp. 25–40. Springer, Berlin, Heidelberg
(2015). https://doi.org/10.1007/978-3-662-46678-0_2
81. Mamouras, K.: The Hoare logic of deterministic and nondeterministic monadic
recursion schemes. ACM Transactions on Computational Logic (TOCL) 17(2),
13:1–13:30 (2016). https://doi.org/10.1145/2835491
82. Mamouras, K.: Synthesis of strategies using the Hoare logic of angelic and de-
monic nondeterminism. Logical Methods in Computer Science 12(3) (2016).
https://doi.org/10.2168/LMCS-12(3:6)2016
83. Mamouras, K.: Equational theories of abnormal termination based on Kleene al-
gebra. In: Esparza, J., Murawski, A.S. (eds.) Proceedings of the 20th International
Conference on Foundations of Software Science and Computation Structures (FoS-
SaCS ’17). Lecture Notes in Computer Science, vol. 10203, pp. 88–105. Springer,
Berlin, Heidelberg (2017). https://doi.org/10.1007/978-3-662-54458-7_6
84. Mamouras, K., Raghothaman, M., Alur, R., Ives, Z.G., Khanna, S.: StreamQRE:
Modular specification and efficient evaluation of quantitative queries over stream-
ing data. In: Proceedings of the 38th ACM SIGPLAN Conference on Program-
ming Language Design and Implementation. pp. 693–708. PLDI ’17, ACM, New
York, NY, USA (2017). https://doi.org/10.1145/3062341.3062369
85. Mamouras, K., Stanford, C., Alur, R., Ives, Z.G., Tannen, V.: Data-trace
types for distributed stream processing systems. In: Proceedings of the 40th
ACM SIGPLAN Conference on Programming Language Design and Imple-
mentation. pp. 670–685. PLDI ’19, ACM, New York, NY, USA (2019).
https://doi.org/10.1145/3314221.3314580
86. McSherry, F., Murray, D.G., Isaacs, R., Isard, M.: Differential dataflow. In: Pro-
ceedings of the 6th Biennial Conference on Innovative Data Systems Research
(CIDR ’13) (2013), http://cidrdb.org/cidr2013/Papers/CIDR13_Paper111.pdf
87. Mealy, G.H.: A method for synthesizing sequential circuits. The Bell Sys-
tem Technical Journal 34(5), 1045–1079 (1955). https://doi.org/10.1002/j.1538-
7305.1955.tb03788.x
88. Mei, Y., Madden, S.: ZStream: A cost-based query processor for adaptively detect-
ing composite events. In: Proceedings of the 2009 ACM SIGMOD International
Conference on Management of Data. pp. 193–206. SIGMOD ’09, ACM, New York,
NY, USA (2009). https://doi.org/10.1145/1559845.1559867
89. Meyerovich, L.A., Guha, A., Baskin, J., Cooper, G.H., Greenberg, M., Bromfield,
A., Krishnamurthi, S.: Flapjax: A programming language for Ajax applications.
In: Proceedings of the 24th ACM SIGPLAN Conference on Object Oriented Pro-
gramming Systems Languages and Applications. pp. 1–20. OOPSLA ’09, ACM,
New York, NY, USA (2009). https://doi.org/10.1145/1640089.1640091
90. Moore, E.F.: Gedanken-Experiments on Sequential Machines, Annals of Mathe-
matics Studies, vol. 34, pp. 129–153. Princeton University Press (1956)
91. Motwani, R., Widom, J., Arasu, A., Babcock, B., Babu, S., Datar, M., Manku,
G.S., Olston, C., Rosenstein, J., Varma, R.: Query processing, approximation,
and resource management in a data stream management system. In: Proceedings
of the First Biennial Conference on Innovative Data Systems Research (CIDR
’03) (2003), http://cidrdb.org/cidr2003/program/p22.pdf
92. Murray, D.G., McSherry, F., Isaacs, R., Isard, M., Barham, P., Abadi, M.: Naiad:
A timely dataflow system. In: Proceedings of the Twenty-Fourth ACM Sympo-
sium on Operating Systems Principles. pp. 439–455. SOSP ’13, ACM, New York,
NY, USA (2013). https://doi.org/10.1145/2517349.2522738
93. Nilsson, H., Courtney, A., Peterson, J.: Functional reactive programming,
continued. In: Proceedings of the 2002 ACM SIGPLAN Workshop on
Haskell. pp. 51–64. Haskell ’02, ACM, New York, NY, USA (2002).
https://doi.org/10.1145/581690.581695
94. Noghabi, S.A., Paramasivam, K., Pan, Y., Ramesh, N., Bringhurst, J.,
Gupta, I., Campbell, R.H.: Samza: Stateful scalable stream processing at
LinkedIn. Proceedings of the VLDB Endowment 10(12), 1634–1645 (2017).
https://doi.org/10.14778/3137765.3137770
95. Raney, G.N.: Sequential functions. Journal of the ACM 5(2), 177–180 (1958).
https://doi.org/10.1145/320924.320930
96. Rutten, J.J.M.M.: Automata and coinduction (an exercise in coalgebra). In:
Sangiorgi, D., de Simone, R. (eds.) Proceedings of the 9th International
Conference on Concurrency Theory (CONCUR ’98). Lecture Notes in Com-
puter Science, vol. 1466, pp. 194–218. Springer, Berlin, Heidelberg (1998).
https://doi.org/10.1007/BFb0055624
97. Rutten, J.J.M.M.: Universal coalgebra: A theory of systems. Theoreti-
cal Computer Science 249(1), 3–80 (2000). https://doi.org/10.1016/S0304-
3975(00)00056-6
98. Rutten, J.J.M.M.: A coinductive calculus of streams. Mathe-
matical Structures in Computer Science 15(1), 93–147 (2005).
https://doi.org/10.1017/S0960129504004517
99. Sadri, R., Zaniolo, C., Zarkesh, A., Adibi, J.: Expressing and optimizing sequence
queries in database systems. ACM Transactions on Database Systems 29(2), 282–
318 (2004). https://doi.org/10.1145/1005566.1005568
100. Sakarovitch, J.: Elements of Automata Theory. Cambridge University Press
(2009)
101. Schneider, S., Hirzel, M., Gedik, B., Wu, K.L.: Safe data parallelism for
general streaming. IEEE Transactions on Computers 64(2), 504–517 (2015).
https://doi.org/10.1109/TC.2013.221
102. Schützenberger, M.P.: Sur une variante des fonctions séquentielles. Theo-
retical Computer Science 4(1), 47–57 (1977). https://doi.org/10.1016/0304-
3975(77)90055-X
103. Sculthorpe, N., Nilsson, H.: Safe functional reactive programming through depen-
dent types. In: Proceedings of the 14th ACM SIGPLAN International Conference
on Functional Programming. pp. 23–34. ICFP ’09, ACM, New York, NY, USA
(2009). https://doi.org/10.1145/1596550.1596558
104. Shivers, O., Might, M.: Continuations and transducer composition. In: Proceed-
ings of the 27th ACM SIGPLAN Conference on Programming Language Design
and Implementation. pp. 295–307. PLDI ’06, ACM, New York, NY, USA (2006).
https://doi.org/10.1145/1133981.1134016
105. Thati, P., Roşu, G.: Monitoring algorithms for metric temporal logic specifica-
tions. Electronic Notes in Theoretical Computer Science 113, 145–162 (2005).
https://doi.org/10.1016/j.entcs.2004.01.029
106. The Coq development team: The Coq proof assistant. https://coq.inria.fr (2020),
[Online; accessed February 22, 2020]
107. Thies, W., Karczmarek, M., Amarasinghe, S.: StreamIt: A language for stream-
ing applications. In: Horspool, R.N. (ed.) Proceedings of the 11th Interna-
tional Conference on Compiler Construction (CC ’02). Lecture Notes in Com-
puter Science, vol. 2304, pp. 179–196. Springer, Berlin, Heidelberg (2002).
https://doi.org/10.1007/3-540-45937-5_14
108. Toshniwal, A., Taneja, S., Shukla, A., Ramasamy, K., Patel, J.M., Kulkarni, S.,
Jackson, J., Gade, K., Fu, M., Donham, J., Bhagat, N., Mittal, S., Ryaboy, D.:
Storm@Twitter. In: Proceedings of the 2014 ACM SIGMOD International Con-
ference on Management of Data. pp. 147–156. SIGMOD ’14, ACM, New York,
NY, USA (2014). https://doi.org/10.1145/2588555.2595641
109. Tucker, P.A., Maier, D., Sheard, T., Fegaras, L.: Exploiting punctuation semantics
in continuous data streams. IEEE Transactions on Knowledge and Data Engineer-
ing 15(3), 555–568 (2003). https://doi.org/10.1109/TKDE.2003.1198390
110. Veanes, M., Hooimeijer, P., Livshits, B., Molnar, D., Bjørner, N.: Symbolic
finite state transducers: Algorithms and applications. In: Proceedings of the
39th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Program-
ming Languages. pp. 137–150. POPL ’12, ACM, New York, NY, USA (2012).
https://doi.org/10.1145/2103656.2103674
111. Wu, E., Diao, Y., Rizvi, S.: High-performance complex event processing over
streams. In: Proceedings of the 2006 ACM SIGMOD International Conference on
Management of Data. pp. 407–418. SIGMOD ’06, ACM, New York, NY, USA
(2006). https://doi.org/10.1145/1142473.1142520
112. Zaharia, M., Das, T., Li, H., Hunter, T., Shenker, S., Stoica, I.: Dis-
cretized streams: Fault-tolerant streaming computation at scale. In: Pro-
ceedings of the Twenty-Fourth ACM Symposium on Operating Systems
Principles. pp. 423–438. SOSP ’13, ACM, New York, NY, USA (2013).
https://doi.org/10.1145/2517349.2522737
113. Zaharia, M., Xin, R.S., Wendell, P., Das, T., Armbrust, M., Dave, A., Meng, X.,
Rosen, J., Venkataraman, S., Franklin, M.J., Ghodsi, A., Gonzalez, J., Shenker,
S., Stoica, I.: Apache Spark: A unified engine for big data processing. Communi-
cations of the ACM 59(11), 56–65 (2016). https://doi.org/10.1145/2934664
Connecting Higher-Order Separation Logic to a
First-Order Outside World
1 Introduction
for the purposes of verification. The soundness proof of the logic then relates
these decorated heaps to the simple address-map view of memory used in the
semantics of the target language.
This works well as long as every piece of the system is verified with re-
spect to decorated heaps, but what if we have multiple verification tools, some
of which provide correctness results in terms of undecorated memory (or, still
worse, memory with a different set of decorations)? To take advantage of the
correctness theorem of a function verified with one of these tools, we will need
to translate our decorated memory into an undecorated one, demonstrate that
it meets the function’s undecorated precondition, and then take the memory
output by the function and use it to reconstruct a decorated memory. In this
paper, we demonstrate a technique to do exactly that, allowing higher-order
separation logics (in this instance, the Verified Software Toolchain) to take ad-
vantage of correctness proofs generated by other tools (in this case, the CertiKOS
verified operating system). This allows us to remove the separation-logic-level
specifications of system calls from our trusted computing base, instead relying
on the operating system’s proofs of its own calls. In particular, we are interested
in functions that do more than just manipulate memory (which is separation
logic’s specialty)—they communicate with the outside world, which may not
know anything about program memory or higher-order state.
int main(void) {
unsigned int n, d; char c;
n=0;
c=getchar();
while (n<1000) {
d = ((unsigned)c)-(unsigned)’0’;
if (d>=10) break;
n+=d;
print_int(n);
putchar(’\n’);
c=getchar();
}
return 0;
}
Fig. 1: A simple communicating program
an external read event: the program before the call to getchar must have per-
mission to perform a sequence of operations beginning with a read, and after the
call it has permission to perform the remaining operations (with values that may
depend upon the received value). By adding these specifications as axioms to
VST’s separation logic, we can use standard separation logic techniques to prove
the correctness of programs such as the one above. But when we compile and
run this C program, putchar and getchar are not axiomatized functions; they
are system calls provided by the operating system, which may have an effect
on kernel memory, user memory, and of course the console itself. If we prove
a specification of this C program using the separation logic rules for putchar
and getchar, what does that tell us about the behavior of the program when it
runs? For programs without external calls, we can answer this question with the
soundness proof of the logic. To extend this soundness proof to programs with
external calls, we must relate the pre- and postconditions of the external calls
to both the semantics of C and their implementations in the operating system.
In this paper, we describe a modular approach to proving soundness of a ver-
ification system for communicating programs, including the following elements:
– An extension of VST with support for generic ghost state.
– A generic mechanism for reasoning about external communication in a higher-
order separation logic, built on top of ghost state.
– A technique for relating pre- and postconditions for external functions in
higher-order separation logic to first-order specifications of the same func-
tions in the verified operating system CertiKOS, with a general approach to
“de-step-indexing” a certain class of step-indexed specifications.
– A new notion of correctness of the implementation of external communi-
cation, by relating user-level traces of external behavior to I/O operations
inside the operating system.
The result is the first soundness proof of a separation logic that can be extended
with first-order specifications of system calls. All proofs are formalized in the
Coq proof assistant.
To understand the scope of our results, it is important to clarify exactly
how much of CertiKOS we have brought into our proofs of correctness for C
programs, and how much of a gap remains. The semantics on which we prove
the soundness of our separation logic is the standard CompCert semantics of
C, extended with the specifications of system calls provided by CertiKOS. Our
model does not include the process by which CertiKOS switches from user mode
to kernel mode when executing a system call, but rather assumes that CertiKOS
implements this process so that the user cannot distinguish it from a normal
function call. To prove this assertion rather than assuming it, we would need to
transfer our soundness proof to the whole-system assembly-language semantics
used by CertiKOS, and interface with not just CertiKOS’s system call specifica-
tions but also its top-level correctness theorem. We discuss this last gap further
in Section 7, but in summary, we prove that our client-side programs and OS-side
system calls are correct, while assuming that CertiKOS correctly implements its
transition between user mode and kernel mode.
a1 · a2 = a3
─────────────────────────────────────── (own_op)
own g a3 pp ⇔ own g a1 pp ∗ own g a2 pp

fp_update a b
─────────────────────── (own_update)
own g a pp ⊢ own g b pp

P ⊢ P′    {P′} C {Q′}    Q′ ⊢ Q
─────────────────────────────── (consequence)
{P} C {Q}
x = 0;
acquire(l);   ||   acquire(l);
x++;          ||   x++;
release(l);   ||   release(l);

Fig. 3: Two threads incrementing a shared variable under a lock
Figure 3 shows the canonical example of a program where ghost state in-
creases the verification power of separation logic. Using concurrent separation
logic as originally presented by O’Hearn [17], we can prove that the value of x
at the end of the program is at least 0, but we cannot prove that it is exactly 2.
This limitation comes from the fact that we can associate an invariant with the
lock l, but that invariant cannot express progress properties such as a change
in the value of x. We can get around this limitation by adding ghost state that
captures the contribution of each thread to x, and then use the invariant to en-
sure that the value of x is the sum of all contributions. (This approach is due to
Ley-Wild and Nanevski [16].) We begin with ghost state that models the central
operation of the program:
Definition 1. The sum ghost algebra is the algebra (N, +, λn.True) of natural
numbers with addition, in which every number is a valid element.
Intuitively, the lock invariant should remember every addition to x, while each
individual thread only knows its own contribution. This is actually an instance of
a very general pattern: the reference pattern, in which one party holds a complete
and correct “reference” copy of some ghost state, and one or more other parties
hold possibly incomplete “partial” copies. Because the reference copy must al-
ways be completely up to date, the partial copies cannot be modified without
access to the reference copy. When all the partial copies are gathered together,
they are guaranteed to accurately represent the state of the data structure. The
reference ghost algebra is built as follows:
The positive ghost algebra contains pairs of a nonempty share and an element
of G, with join defined pointwise, representing partial ownership of an element
of G. Total ownership of the element can be recovered by combining all of the
pieces, obtaining a full share, and combining all of the G elements accordingly.
by replacing the sum algebra with one appropriate to the application or data
structure in question. We will also make use of it later to model the state of the
external world as a separation logic resource.
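As a concrete (if toy) model of these constructions, the following Haskell sketch renders a ghost algebra as a partial commutative join plus a validity predicate, instantiated with the sum algebra of Definition 1 and a reference-pattern variant of it. The class and constructor names are ours, chosen for illustration; this only gestures at the Coq definitions.

import Control.Applicative ((<|>))

-- a ghost algebra: a partial commutative join and a validity predicate
class GhostAlgebra a where
  join  :: a -> a -> Maybe a
  valid :: a -> Bool

-- Definition 1: the sum ghost algebra (N, +, \n -> True)
newtype Sum = Sum Integer deriving (Eq, Show)

instance GhostAlgebra Sum where
  join (Sum m) (Sum n) = Just (Sum (m + n))
  valid _              = True

-- the reference pattern over the sum algebra: an element pairs an
-- optional authoritative ("reference") value with a partial contribution
data RefSum = RefSum (Maybe Integer) Integer deriving (Eq, Show)

instance GhostAlgebra RefSum where
  -- two reference copies never join; partial contributions add up
  join (RefSum (Just _) _) (RefSum (Just _) _) = Nothing
  join (RefSum r1 p1) (RefSum r2 p2) = Just (RefSum (r1 <|> r2) (p1 + p2))
  -- the partial copies can never exceed the reference copy
  valid (RefSum (Just r) p) = 0 <= p && p <= r
  valid (RefSum Nothing  p) = 0 <= p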
where (c, h) → (c′, h′) means that the program c executed with starting heap h
may take a step to a new program c′ with heap h′. For a step-indexed logic, it
is more convenient to write this definition inductively:
Definition 4 (Safety). A configuration (c, h) is safe for n steps with postcon-
dition Q if:
– n is 0, or
– c has terminated and Q(h) holds to approximation (step-index) n, or
– (c, h) → (c′, h′) and (c′, h′) is safe for n − 1 steps with Q.
We can then define {P} c {Q} (at step-index n) to mean that ∀h. P(h) ⇒ (c, h)
is safe for n steps with Q.
Once we have added ghost state, our heap h is now a pair (h, g) of physical
and ghost state, and between any two steps the ghost state may change. This
leads us to a ghost-augmented version of safety.
Definition 5 (Safety with Ghost State). A configuration (c, h, g) is safe for
n steps with postcondition Q if:
– n is 0, or
– c has terminated and Q(h, g) holds to approximation n, or
– (c, h) → (c′, h′) and ∀gframe. g · gframe is defined ⇒ ∃g′. (g′ · gframe is defined ∧
(c′, h′, g′) is safe for n − 1 steps with Q).
The program must be able to continue executing under any gframe consistent
with its current ghost state, but its choice of new ghost state g′ may depend on
the frame. This quantifier alternation captures the essence of ghost state: the
ghost state held by the program constrains any other ghost state held by the
notional “rest of the system”, and may be changed arbitrarily in any way that
does not invalidate that other ghost state.
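This quantifier alternation can be made concrete with a brute-force, executable reading of Definition 5 over a toy language and a finite ghost universe. The Haskell sketch below is illustrative only: the real definition quantifies over all ghost states, and validity is algebra-specific; here ghost state is the sum algebra with an artificial validity bound so that framing actually constrains updates.

type Ghost = Integer
type Heap  = Integer
data Config = Config { prog :: [Heap -> Heap], heap :: Heap }

terminated :: Config -> Bool
terminated = null . prog

step :: Config -> Maybe Config
step (Config []     _) = Nothing
step (Config (f:fs) h) = Just (Config fs (f h))

ghostUniverse :: [Ghost]
ghostUniverse = [0 .. 10]

validG :: Ghost -> Bool        -- toy validity: joins must stay below 10
validG g = 0 <= g && g <= 10

safeG :: Int -> Config -> Ghost -> (Heap -> Ghost -> Bool) -> Bool
safeG 0 _ _ _ = True
safeG n c g post
  | terminated c = post (heap c) g
  | otherwise = case step c of
      Nothing -> False
      Just c' ->
        -- for every frame compatible with g, some new g' must remain
        -- compatible: the forall/exists alternation of Definition 5
        and [ or [ validG (g' + gf) && safeG (n - 1) c' g' post
                 | g' <- ghostUniverse ]
            | gf <- ghostUniverse, validG (g + gf) ]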
On the back end, we must still modify VST’s semantics to connect the ghost
state a to the actual external state, and to prevent the “ghost steps” of the
semantics from changing the external state. Recall from Section 2 that in order
for a non-terminated configuration (c, h, g) to be safe for a nonzero number
of steps, it must be the case that (c, h) → (c′, h′) and ∀gframe. g · gframe is defined ⇒
∃g′. g′ · gframe is defined ∧ (c′, h′, g′) is safe. To connect the external ghost state to a real
external state z, we simply extend this definition to require that gframe include
an element (⊥, z) at identifier 0. This enforces the requirement that the value
of the external ghost state always be the same as the value of the external
state, and ensures that frame-preserving updates cannot change the value of the
external state. Re-proving the separation logic rules of Verifiable C with this new
definition of Hoare triple required only minor changes, since internal program
steps never change the external ghost state.
When the semantics reaches an external call, the call is allowed to make
arbitrary changes to the state consistent with its pre- and postcondition, in-
cluding changing the value of the external ghost state (as well as the actual
external state). We can use has ext assertions in the pre- and postcondition of
an external function to describe how that function affects the external state. For
instance, we might give a console write function the “consuming-style” specifica-
tion {has ext(write(v); ; k)} write(v) {has ext(k)}, stating that if before calling
write(v) the program has permission to write the value v and then do the opera-
tions in k, then after the call it is left with permission to do k. (We could reverse
the pre- and postcondition for a “trace-style” specification, in which the external
state records the history of operations performed by the program instead of the
future operations allowed.) In this paper, we use interaction trees [13] as a means
of describing a collection of allowed traces of external events. Interaction trees
can be thought of as “abstract traces with binding”; for instance, we can write
x ← read;; write (x + 1);; k x to mean “read a value, call it x, write the value
x + 1, and then continue to do the actions in k using the same value of x.”
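As a rough model, interaction trees over console events can be rendered as a free monad. The Haskell sketch below is finite and omits the silent (Tau) steps of the Coq ITree library [13]; it is a stand-in for orientation, not the library's coinductive definition.

-- a toy interaction-tree type over integer console events
data ITree a
  = Ret a                      -- return a value
  | Read (Int -> ITree a)      -- request an input, then continue
  | Write Int (ITree a)        -- emit an output, then continue

instance Functor ITree where
  fmap f (Ret x)     = Ret (f x)
  fmap f (Read k)    = Read (fmap f . k)
  fmap f (Write v t) = Write v (fmap f t)

instance Applicative ITree where
  pure = Ret
  tf <*> tx = tf >>= \f -> fmap f tx

instance Monad ITree where
  Ret x     >>= f = f x
  Read k    >>= f = Read ((>>= f) . k)
  Write v t >>= f = Write v (t >>= f)

-- the tree "x <- read ;; write (x + 1) ;; k x" from the text
example :: (Int -> ITree a) -> ITree a
example k = Read (\x -> Write (x + 1) (k x))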
In the end, we have a new assertion has_ext on external state that works in
exactly the way we expect: it can hold external state of any type, it cannot be
modified by user code, it can be freely modified by external calls, it always has
exactly the same value as the external state already present in VST’s semantics,
and it exposes no ghost-state functionality to the user. If the user wants more
fine-grained control over external state (for instance, to split it into pieces so
multiple threads can make concurrent calls to external functions), they can define
their own ghost algebra for the state and pass around part elements explicitly,
but for the common case, has_ext provides seamless separation-logic reasoning
about C programs that interact with an external environment.
Once we have separation logic specifications for external function calls, verifying
a communicating program is no different from verifying any other program. We
demonstrate this with the example program excerpted in Figure 1, shown in
{ITree(write_list(decimal_rep(i));; k)}
void print_intr(unsigned int i) {
  unsigned int q, r;
  if (i != 0) {
    q = i / 10u;
    r = i % 10u;
    print_intr(q);
    putchar(r + '0');
  }
}
{ITree(k)}

{ITree(write_list(decimal_rep(i));; k)}
void print_int(unsigned int i) {
  if (i == 0)
    putchar('0');
  else
    print_intr(i);
}
{ITree(k)}

{ITree(c ← read;; main_loop(0, c))}
int main(void) {
  unsigned int n, d; char c;
  n = 0;
  c = getchar();
  while (n < 1000) {
    d = ((unsigned)c) - (unsigned)'0';
    if (d >= 10) break;
    n += d;
    print_int(n);
    putchar('\n');
    c = getchar();
  }
  return 0;
}
{ITree(done)}

Fig. 4: The example program in full, annotated with its I/O specifications
full in Figure 4. The print_intr function uses external calls to putchar to print
the decimal representation of its argument, as long as that argument is nonzero;
print_int handles the zero case as well. The main function repeatedly reads in
digits using getchar and then prints the running total of the digits read so far.
The ITree predicate is simply a wrapper around the has_ext predicate of the
previous section (i.e., an assertion on the external ghost state), specialized to
interaction trees on I/O operations. We can then write simple specifications for
getchar and putchar, using interaction trees to represent external state:
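In the consuming style used for write above, these specifications take roughly the following shape (a reconstruction; the mechanized statements also account for return values and the C calling convention):

{ITree(write(c);; k)} putchar(c) {ITree(k)}
{ITree(c ← read;; k c)} getchar() {r. ITree(k r)}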
Next, we annotate each function with separation logic pre- and postcon-
ditions; the program does not manipulate memory, so the specifications only
describe the I/O behavior of each function. The effect of print_intr is to make a
series of calls to putchar, printing the digits of the argument i as computed by
the meta-level function decimal_rep (where write_list([i0; i1; ...; in]) is an abbre-
viation for the series of outputs write(i0);; write(i1);; ...;; write(in)). When the
value of i is 0, print_intr assumes that the number has been completely printed,
so print_int adds a special case for 0 as the initial input. The specification for
the main loop is a recursive sequence of read and write operations, taking the
running total (which starts at 0) and the most recent input as arguments:
Using the specifications for putchar and getchar as axioms, we can easily prove
the specifications of print_intr, print_int, and main. (The following sections show
how we substantiate these axioms.)
The soundness proof of VST [1] describes the guarantees that the Hoare-logic
proof of correctness for a C program provides about the actual execution of that
program. A C program P is represented as a list P1 , ..., Pn of function definitions
in CompCert Clight, a Coq representation of the abstract syntax of C. The
program is annotated with a collection of function specifications (i.e., separation
logic pre- and postconditions) Γ = Γ1 , ..., Γn , one for each function. We then
prove that each Pi satisfies its specification Γi, which we write as Γ ⊢ Pi : Γi
(note that each function may call on the specification of any function, including
itself). The soundness theorem of VST without external function calls is then:
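Informally (we paraphrase; this is the Theorem 1 referred to in the sequel, and footnote 9 records a caveat about the postcondition): if Γ ⊢ Pi : Γi holds for every function Pi of P, then for any n, the dry execution of main is safe for n steps, and if it terminates within those steps, it does so in a state that satisfies its postcondition.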
n = 0;
buf = malloc(4);
if (!buf) exit(1);
i = getchars(buf, 4);
while (n < 1000) {
  for (j = 0; j < i; j++) {
    c = buf[j];
    d = ((unsigned)c) - (unsigned)'0';
    if (d >= 10) { free(buf); return 0; }
    n += d;
    print_int(n);
  }
  i = getchars(buf, 4);
}
free(buf);
return 0;
}
{ITree(done)}
Corollary 1. Since null pointer dereferences, integer overflows, etc. are all
stuck in CompCert’s small-step semantics, this means that a verified program
will be free of all of these kinds of errors.
This soundness theorem expresses the relationship between the juicy seman-
tics described by VST’s separation logic and the dry semantics under which
C programs actually execute.⁸ The proof of correctness of a program gives us
enough information to construct a corresponding dry execution for each juicy
execution.⁹ However, we may not have access to the code of external functions,
and in some cases (e.g., system calls) they may not even be implemented in C. In
this section, we generalize the soundness theorem to include external functions.
⁸ Of course, a C program actually executes by running machine code, but the relation-
ship between the dry C semantics and the semantics of assembly language is already
proved in CompCert, as is assembly-to-machine language [20].
⁹ Theorem 1 blurs the line between juicy and dry by saying that a dry execution
“terminates in a state that satisfies its postcondition”, where the postcondition is
stated in separation logic. In the original proof of soundness [1], this is resolved by
assuming that the postcondition of main is always true. The techniques we use in
this section can also be applied to more refined specifications of main.
The pre- and postcondition each make one assertion about memory (that the
buffer buf points to the string of bytes vs) and one assertion about the external
state¹⁰ (that the interaction tree allows write_list(vs) followed by k before the
call, and k afterward). The corresponding first-order specification on dry memory
and external state is:
Pre((vs, k), (buf, n), m, z) ≜ length(vs) = n ∧ z = (write_list(vs);; k) ∧
                               ∀i < n. m(buf + i) = vs[i]
Post((vs, k), (buf, n), m0, m, z) ≜ m0 = m ∧ z = k
where (vs, k) is the witness (i.e., the parameters to the specification), buf and
n are the arguments passed to the function, m is the current memory, z is
the external state, and m0 in the postcondition is the memory before the call
(allowing us to state that memory is unchanged). Of the roughly 210 Linux
system calls that are not Linux- or platform-specific, about 140 fall into this
pattern: they cover socket, console, and file I/O and memory allocation, or are
simpler informational calls like gethostname that do not involve memory.
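Transliterated into executable form, the dry specification above is just a pair of predicates. In the Haskell sketch below, Mem, Ext, and all names are illustrative stand-ins for CompCert memories and interaction trees, not VST's definitions.

import qualified Data.Map as Map

type Addr = Int
type Byte = Int
type Mem  = Map.Map Addr Byte

data Ext                      -- toy external states; Tail n is an opaque
  = WriteList [Byte] Ext      -- continuation standing in for a tree k
  | Tail Int
  deriving (Eq, Show)

preWrite :: ([Byte], Ext) -> (Addr, Int) -> Mem -> Ext -> Bool
preWrite (vs, k) (buf, n) m z =
  length vs == n
    && z == WriteList vs k
    && and [ Map.lookup (buf + i) m == Just (vs !! i) | i <- [0 .. n - 1] ]

postWrite :: ([Byte], Ext) -> (Addr, Int) -> Mem -> Mem -> Ext -> Bool
postWrite (_, k) _ m0 m z = m0 == m && z == k   -- memory is unchanged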
Once we have a juicy and a dry specification for a given external function,
what is the relationship between them? Intuitively, if the juicy specification for a
function f is {Pj} f(args) {Qj}, the Hoare logic proof for a program that calls
¹⁰ ITree is actually an assertion on the external ghost state, which is connected to the
true external state as described in Section 3, and is erased at the dry level.
reconstruction captures the effects of the external call on the program’s memory;
to reflect the changes to the external state, we must also set the external ghost
state of the reconstructed juicy memory to match the external state returned
by the call. We define a reconstruct operation such that reconstruct(jm, m, z) is
a version of the juicy memory jm that has been modified to take into account
the changes in the dry memory m and the external state z.
Second, we need a way to transform a juicy witness into the corresponding
dry witness. When a user adds a new external call to VST, they must provide a
dessicate function that performs this transformation. Fortunately, the dessicate
operation usually follows a simple pattern. Components of the witness that are
not memory objects are generally identical in their juicy and dry versions. The
frame is usually the only memory object in the juicy witness; while it is possible in
VST to write a Hoare triple that quantifies over other memory objects explicitly,
it is very unusual and runs counter to the spirit of separation logic. Similarly, the
postcondition of the dry specification may refer to the memory state before the
call (to express properties such as “this call stored value v at location ”), but
there is rarely a reason to refer to any other memory object. Thus, the dessicate
operation for each function can simply discard the frame (juicy) memory and
replace it with the dry memory from before the call. This standard dessicate
operation works for all external functions shown in this paper.
This leads to the following definition and theorem:
Definition 6 (Juicy-Dry Correspondence). A juicy specification (Pj , Qj )
and a dry specification (Pd , Qd ) for an external function correspond if, for a
suitable dessicate operation:
– for all witnesses w, arguments a, external states z, and juicy memories jm,
if Pj (w, a, z, jm), then Pd (dessicate(jm, w), a, z, dry(jm)); and
– for all witnesses w, arguments a, return values r, external states z, initial
juicy memories jm0, initial external states z0, and dry memories m, if
Pd(dessicate(jm0, w), a, z0, dry(jm0)) and Qd(dessicate(jm0, w), r, z, m), then
Qj(w, r, z, reconstruct(jm0, m, z)).
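The theorem, stated informally (we paraphrase the result cited as Theorem 2 in Section 7): if every external function used by a program has corresponding juicy and dry specifications, then a proof of the program in the juicy logic implies that its executions are safe in the dry semantics extended with those external calls.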
Proof. We extend the juicy semantics of Theorem 1 with a rule for external
calls that uses their juicy pre- and postconditions, and then prove that execu-
tions in this semantics erase to safe executions in the dry semantics, using the
correspondence to relate juicy and dry behaviors of external calls.
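Definition 6 is just a pair of implications, which the following Haskell sketch makes explicit as a brute-force check over finite universes. Every type and name here is an illustrative stand-in, not VST's Coq development.

-- the two bullet points of Definition 6, checked over finite universes
correspond
  :: (jm -> w -> wd)                 -- dessicate
  -> (jm -> m -> z -> jm)            -- reconstruct
  -> (jm -> m)                       -- dry projection
  -> (w  -> a -> z -> jm -> Bool)    -- juicy precondition  Pj
  -> (wd -> a -> z -> m  -> Bool)    -- dry precondition    Pd
  -> (wd -> r -> z -> m  -> Bool)    -- dry postcondition   Qd
  -> (w  -> r -> z -> jm -> Bool)    -- juicy postcondition Qj
  -> [w] -> [a] -> [r] -> [z] -> [jm] -> [m]
  -> Bool
correspond des rec dry pj pd qd qj ws as rs zs jms ms =
  -- first bullet: the juicy precondition implies the dry one
  and [ not (pj w a z jm) || pd (des jm w) a z (dry jm)
      | w <- ws, a <- as, z <- zs, jm <- jms ]
  &&
  -- second bullet: dry pre + dry post imply the juicy postcondition
  and [ not (pd (des jm0 w) a z0 (dry jm0) && qd (des jm0 w) r z m)
          || qj w r z (rec jm0 m z)
      | w <- ws, a <- as, r <- rs, z0 <- zs, z <- zs, jm0 <- jms, m <- ms ]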
Pre(k, c, m, z) ≜ z = (write(c);; k)
Post(k, c, m0, m, z) ≜ m0 = m ∧ z = k

Fig. 9: The core of the putchar system call vs. its dry specification
that blocks fit in the virtual address space and map to nonoverlapping regions,
the exact mapping has no effect on the system call correctness, so it can be com-
pletely arbitrary. To relate a CompCert memory to a CertiKOS one, we define
a relation inj(m, flat(s), ptbl(s)), which states that if a block and offset in the
CompCert memory m is valid, then it contains the same data as the correspond-
ing location (according to Rmem and the page table) in the flat memory of the
OS state s. Note that inj is parameterized by the page table to allow a system
call to alter the address mapping, for example by allocating new memory.
At the user level, the precondition contains an interaction tree (or similar
external specification) that specifies the allowed external behaviors, and the
postcondition contains a smaller tree that continues using the return value of
the “consumed” actions. On the other hand, in CertiKOS, specifications begin
with a trace of the events that have already happened and extend it with new
events by querying the external environment. To reconcile these two views, we
can first relate an interaction tree to a (possibly infinite) set of (possibly infinitely
long) traces, each of which intuitively is the result of following one path in the
tree. Then any trace allowed by the output interaction tree should be a suffix of
a trace allowed by the input tree, and the difference between the two should be
exactly the trace of events generated during the system call:
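One plausible functional reading of consume, guided by this description and by the transitivity property used in the proof of Lemma 2 below (a guess at its shape, not the paper's definition), is sketched here in Haskell:

-- a minimal standalone interaction-tree type and event alphabet
data ITree = Done | Read (Int -> ITree) | Write Int ITree
data Event = EvRead Int | EvWrite Int

-- consume t es == Just t' corresponds to consume(t, t', es): following
-- the events es from the root of t is possible and leaves subtree t'
consume :: ITree -> [Event] -> Maybe ITree
consume t []                         = Just t
consume (Read k)     (EvRead v : es) = consume (k v) es
consume (Write v t') (EvWrite w : es)
  | v == w                           = consume t' es
consume _ _                          = Nothing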
resulting state that satisfies the dry postcondition Qd. The inj relation may
relate multiple CompCert memories to a given OS state (hence the universal
quantification over the resulting memory m′), but all such memories must agree
on the contents of all valid addresses, so the postcondition will usually hold for
all m′ if it holds for any m′.
Theorem 3. Putchar and getchar in CertiKOS correctly implement their dry
specifications.
While this correspondence is specific to CertiKOS, we can adapt it to other
verified operating systems by replacing the CertiKOS system call specification,
user memory model, and external event representation with those of the other
OS. For example, in the case of the seL4 microkernel [12], inj could be redefined to
relate a CompCert memory to certain capability slots that represent the virtual
memory, and the system call might send a message to a device driver running
in another process. Despite these changes, most of the theorems in this paper
aside from Theorem 3 would continue to hold with minor or no alterations.
The C program has states (c, m), where c holds the values of local variables
and the control stack, and m is the memory. Our small-step relation (c, m) →
(c′, m′) characterizes internal C execution, and therefore (c, m) does not step
when c is at a call to an external function. The operating system has states s that
contain the physical memory flat(s) and many other components used internally
by the OS (and its proof of correctness), including a trace of past events; we say
that s is consistent with t when the trace in s is exactly t.
Definition 9 has several important differences from our original definition of
safety in Section 2. First, configurations include the trace t of events performed
so far, as well as T , the high-level specification of the allowed communication
events (here it is taken to be an interaction tree, but it could easily be defined
in another formalism just by changing the definition of consume). Second, our
external functions are not simply axiomatized with pre- and postconditions,
but implemented by the executable specifications Of provided by the operating
system. We use the ideas of the previous section to relate the execution of C
programs to the behavior of system calls: we inject the user memory into the OS
state, extract the resulting memory from the resulting state, and require that the
new interaction tree T′ reflect the communication events tnew performed by the
call. Note the quantification over the current OS state s: the details of the OS
state, such as the buffer of values received, are unknown to the C program (and
may change arbitrarily between steps, for instance, if an interrupt occurs), and
so it must be safe under all possible OS states consistent with the events t. The
set 𝒯 contains all possible communication traces from the program’s execution,
so by proving that every trace in 𝒯 is allowed by the initial interaction tree T,
we show that the program’s communication is always constrained by T.
Lemma 2 (Trace Correctness). If (c, m, T) is safe for n steps with respect
to 𝒯, then for all traces t ∈ 𝒯, there exists some interaction tree T′ such that
consume(T, T′, t).
Proof. By induction on n. Since the consume relation holds for the trace segment
produced by each external call, it suffices to show that it is transitive, i.e., that
consume(a, b, t1 ) and consume(b, c, t2 ) imply consume(a, c, t1 ++ t2 ).
Theorem 4 (Soundness of VST + CertiKOS). Let P be a program with
n functions, calling also upon m external functions. The internal functions have
(juicy) specifications Γ1 . . . Γn and the external functions have (juicy) specifi-
cations Γn+1 . . . Γn+m. Suppose P is proved correct in Verifiable C with initial
interaction tree T. Let Dn+1, . . . , Dn+m be dry specifications that safely evolve
memory and that correspond to Γn+1 . . . Γn+m. Further, let each Di be correctly
implemented by an OS function fi with executable specification Ofi. Then for all
n, the main function of P is safe for n steps with respect to some set of traces
𝒯, and for every trace t ∈ 𝒯, there exists some interaction tree T′ such that
consume(T, T′, t).
Proof. By the combination of the soundness of VST with external functions
(Theorem 2), Lemma 2, and a proof relating our previous definition of safety to
the new definition.
This is our main result: by combining the results of the previous sections, we
obtain a soundness theorem down to the operating system’s implementation of
system calls, one that guarantees that the actual communication operations per-
formed by the program are always a prefix of the initial specification of allowed
operations. By instantiating the theorem with a set of verified system calls, we
obtain a strong correctness result for our VST-verified programs, such as:
Theorem 5. Let P be a program that uses the putchar and getchar system calls
provided by CertiKOS, such as the one in Figure 4. Suppose P is proved correct
with initial interaction tree T. Then for all n, the main function of P is safe
for n steps with respect to some set of traces 𝒯, and for every trace t ∈ 𝒯, there
exists some interaction tree T′ such that consume(T, T′, t).
Thus far, we have assumed that the events in a program’s trace are exactly
the events described in the user-level interaction tree T . In practice, however,
the communication performed by the OS may differ from that observed by the
user. For example, like all operating systems, CertiKOS uses a kernel buffer of
finite size to store characters received from the serial device; if the buffer is
full, incoming characters are discarded without being read. To capture this,
we distinguish between the user-visible events produced by system calls and
the external events, which are generated by the environment oracle and
recorded in the trace at the time that they occur. For the system call events
to be meaningful, they must correspond in some way to the external events,
but this correspondence may not be one-to-one. In the case of console I/O, each
character received by the serial device should be returned by getchar at most
once, and in the order they arrived, but characters may be dropped. This leads us
to the condition that the user events should be a subsequence of the environment
events, which is proved in CertiKOS.
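As a boolean check, this condition is exactly list subsequence; a minimal Haskell rendering (userEventsOK is our name, not CertiKOS's):

import Data.List (isSubsequenceOf)

-- the characters returned by getchar, in order, form a subsequence of
-- the characters delivered by the serial device; dropped characters
-- simply do not appear on the left
userEventsOK :: String -> String -> Bool
userEventsOK userReads serialChars = userReads `isSubsequenceOf` serialChars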
Lemma 3. The getchar system call maintains the invariant that there is an
injective map taking each system call event with value v in the OS trace to an
external event with value v earlier in the trace.
dropped before a getchar call, then there will be external events that do not cor-
respond to anything in the interaction tree, and this is the intended semantics of
buffered communication without flow control. A similar corollary can be proved
for any set of system calls, but the precise correspondence between user events
and external events will depend on the particular system calls involved.
There is one more soundness theorem we might want to prove, asserting
that the combined system of program and operating system executes correctly
according to the assembly-level semantics of the OS. We should be able to obtain
this theorem by connecting Theorem 4 with the soundness theorem of CertiKOS,
which guarantees that the behavior of the operating system running a program
P refines the behavior of a combined system consisting of the program along
with an abstract model of the operating system. However, this connection is
far from trivial: it involves lowering our soundness result from C to assembly
(using the correctness theorem of CompCert), modeling the switch from user to
kernel mode (including the semantics of the trap instruction), and considering
the effects of other OS features on program behavior (e.g., context switching). We
estimate that we have covered more than half of the distance between VST and
CertiKOS with our current result, but there is still work to be done to complete
the connection. We can now remove the OS’s implementation of each system call
from the trusted computing base; it remains to remove the OS entirely.
8 Related Work
The most comprehensive prior work connecting verified programs to the imple-
mentation of I/O operations is that of Férée et al. [5] in CakeML, a functional
language with I/O connected to a verified compiler and verified hardware. As in
our approach, the language is parameterized by functional specifications for ex-
ternal functions, backed by proofs at a lower level. However, while CakeML does
support a separation logic [9], it is not higher-order, so all of the components are
specified in the same basic style. Our approach could enable higher-order sepa-
ration logic reasoning about CakeML programs. Ironclad Apps [10] also includes
verified communicating code, for user-level networking applications running on
the Verve operating system [21]. However, their network stack is implemented
outside of the operating system, so proofs about I/O operations are carried out
within the same framework as the programs that use the operations.
One major category of system calls is file I/O operations. The FSCQ file
system [2] is verified using Crash Hoare Logic, a separation logic which accounts
for possible crashes at any point in a program. File system assertions are similar
to the ordinary points-to assertions of separation logic, but may persist through
crashes while memory is reset. In Crash Hoare Logic, the implementation-level
model of the file state is the same as the user’s model, and the approach does
not obviously generalize to other forms of external communication.
Another related area is the extension of separation logic to distributed sys-
tems, which necessarily involves reasoning about communication with external
entities. The most closely related such logic is Aneris [14], which is built on
Iris, the inspiration for VST’s approach to ghost state. The adequacy theorem
of Aneris proves the connection between higher-order separation logic specifica-
tions of socket operations and a language that includes first-order operational
semantics for those functions. In our approach, this would correspond to directly
adding the “dry” specifications for each operation to the language semantics, and
building the correspondence proof for those particular operations into the sound-
ness theorem of the logic; our more generic style of soundness theorem would
make it easier to plug in new external calls. The bottom half of our approach—
showing that the language-level semantics of the operations are implemented by
an OS such as CertiKOS—could be applied to Aneris more or less as is. Another
interesting feature of Aneris is that the communication allowed on each socket
is specified by a user-provided protocol, an arbitrary separation logic predicate
on messages and resources. In our examples thus far, we have assumed that the
external world does not share any notion of resource with the program, and
so our external state only mentions the messages to be sent and received; how-
ever, the construction of Section 3 does allow the external state to have arbitrary
ghost-state structure, which we could use to define similarly expressive protocols.
References
1. Appel, A.W., Dockins, R., Hobor, A., Beringer, L., Dodds, J., Stewart, G., Blazy,
S., Leroy, X.: Program Logics for Certified Compilers. Cambridge University Press
(2014), http://www.cambridge.org/de/academic/subjects/computer-science/
programming-languages-and-applied-logic/program-logics-certified-compilers?
format=HB
2. Chen, H., Ziegler, D., Chajed, T., Chlipala, A., Kaashoek, M.F., Zeldovich, N.:
Using Crash Hoare Logic for certifying the FSCQ file system. In: Proceedings of
the 25th Symposium on Operating Systems Principles. pp. 18–37. SOSP ’15, ACM,
New York, NY, USA (2015). https://doi.org/10.1145/2815400.2815402
3. Dinsdale-Young, T., Birkedal, L., Gardner, P., Parkinson, M.J., Yang, H.: Views:
compositional reasoning for concurrent programs. In: Giacobazzi, R., Cousot, R.
(eds.) The 40th Annual ACM SIGPLAN-SIGACT Symposium on Principles of
Programming Languages, POPL ’13, Rome, Italy - January 23 - 25, 2013. pp.
287–300. ACM (2013). https://doi.org/10.1145/2429069.2429104
4. Dinsdale-Young, T., Dodds, M., Gardner, P., Parkinson, M.J., Vafeiadis, V.: Con-
current abstract predicates. In: D’Hondt, T. (ed.) ECOOP 2010 - Object-Oriented
Programming, 24th European Conference, Maribor, Slovenia, June 21-25, 2010.
Proceedings. Lecture Notes in Computer Science, vol. 6183, pp. 504–528. Springer
(2010). https://doi.org/10.1007/978-3-642-14107-2_24
5. Férée, H., Pohjola, J.Å., Kumar, R., Owens, S., Myreen, M.O., Ho, S.: Program
verification in the presence of I/O - semantics, verified library routines, and verified
applications. In: Piskac, R., Rümmer, P. (eds.) Verified Software. Theories, Tools,
and Experiments - 10th International Conference, VSTTE 2018, Oxford, UK, July
18-19, 2018, Revised Selected Papers. Lecture Notes in Computer Science, vol.
11294, pp. 88–111. Springer (2018). https://doi.org/10.1007/978-3-030-03592-1_6
6. Gu, R., Koenig, J., Ramananandro, T., Shao, Z., Wu, X.N., Weng, S.C., Zhang,
H., Guo, Y.: Deep specifications and certified abstraction layers. In: Proceedings
of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Pro-
gramming Languages. pp. 595–608. POPL ’15, ACM, New York, NY, USA (2015).
https://doi.org/10.1145/2676726.2676975
7. Gu, R., Shao, Z., Chen, H., Wu, X.N., Kim, J., Sjöberg, V., Costanzo, D.: CertiKOS:
An extensible architecture for building certified concurrent OS kernels. In: 12th
USENIX Symposium on Operating Systems Design and Implementation, OSDI
2016, Savannah, GA, USA, November 2-4, 2016. pp. 653–669 (2016), https://www.
usenix.org/conference/osdi16/technical-sessions/presentation/gu
8. Gu, R., Shao, Z., Kim, J., Wu, X.N., Koenig, J., Sjöberg, V., Chen, H., Costanzo,
D., Ramananandro, T.: Certified concurrent abstraction layers. In: Proceedings
of the 39th ACM SIGPLAN Conference on Programming Language Design and
Implementation, PLDI 2018, Philadelphia, PA, USA, June 18-22, 2018. pp. 646–
661 (2018). https://doi.org/10.1145/3192366.3192381
9. Guéneau, A., Myreen, M.O., Kumar, R., Norrish, M.: Verified characteristic for-
mulae for CakeML. In: Yang, H. (ed.) Programming Languages and Systems. pp.
584–610. Springer Berlin Heidelberg, Berlin, Heidelberg (2017)
10. Hawblitzel, C., Howell, J., Lorch, J.R., Narayan, A., Parno, B., Zhang, D., Zill, B.:
Ironclad apps: End-to-end security via automated full-system verification. In: 11th
USENIX Symposium on Operating Systems Design and Implementation, OSDI
’14, Broomfield, CO, USA, October 6-8, 2014. pp. 165–181 (2014), https://www.
usenix.org/conference/osdi14/technical-sessions/presentation/hawblitzel
11. Jung, R., Krebbers, R., Birkedal, L., Dreyer, D.: Higher-order ghost state. In:
Proceedings of the 21st ACM SIGPLAN International Conference on Functional
Programming. pp. 256–269. ICFP 2016, ACM, New York, NY, USA (2016).
https://doi.org/10.1145/2951913.2951943
12. Klein, G., Elphinstone, K., Heiser, G., Andronick, J., Cock, D., Derrin, P., Elka-
duwe, D., Engelhardt, K., Kolanski, R., Norrish, M., Sewell, T., Tuch, H., Win-
wood, S.: seL4: Formal verification of an OS kernel. In: Proceedings of the ACM
SIGOPS 22nd Symposium on Operating Systems Principles. pp. 207–220. SOSP
’09, ACM, New York, NY, USA (2009). https://doi.org/10.1145/1629575.1629596
13. Koh, N., Li, Y., Li, Y., Xia, L.y., Beringer, L., Honoré, W., Mansky, W., Pierce,
B.C., Zdancewic, S.: From C to interaction trees: Specifying, verifying, and test-
ing a networked server. In: Proceedings of the 8th ACM SIGPLAN International
Conference on Certified Programs and Proofs. pp. 234–248. CPP 2019, ACM, New
York, NY, USA (2019). https://doi.org/10.1145/3293880.3294106
14. Krogh-Jespersen, M., Timany, A., Ohlenbusch, M.E., Birkedal, L.: Aneris: A
logic for node-local, modular reasoning of distributed systems (2019), https://iris-project.org/pdfs/2019-aneris-submission.pdf, unpublished draft
15. Leroy, X., Appel, A.W., Blazy, S., Stewart, G.: The CompCert memory model. In:
Appel, A.W. (ed.) Program Logics for Certified Compilers, chap. 32. Cambridge
University Press (2014)
16. Ley-Wild, R., Nanevski, A.: Subjective auxiliary state for coarse-grained concur-
rency. In: Proceedings of the 40th Annual ACM SIGPLAN-SIGACT Symposium
on Principles of Programming Languages. pp. 561–574. POPL ’13, ACM, New
York, NY, USA (2013). https://doi.org/10.1145/2429069.2429134
17. O’Hearn, P.W.: Resources, concurrency, and local reasoning. Theor. Comput. Sci.
375(1-3), 271–307 (Apr 2007). https://doi.org/10.1016/j.tcs.2006.12.035
18. Penninckx, W., Jacobs, B., Piessens, F.: Sound, modular and compositional ver-
ification of the input/output behavior of programs. In: Programming Languages
and Systems - 24th European Symposium on Programming, ESOP 2015, Held
as Part of the European Joint Conferences on Theory and Practice of Software,
ETAPS 2015, London, UK, April 11-18, 2015. Proceedings. pp. 158–182 (2015).
https://doi.org/10.1007/978-3-662-46669-8_7
19. Sergey, I., Nanevski, A., Banerjee, A.: Specifying and verifying concurrent algo-
rithms with histories and subjectivity. In: Vitek, J. (ed.) Proceedings of the 24th
European Symposium on Programming (ESOP 2015). Lecture Notes in Computer
Science, vol. 9032, pp. 333–358. Springer (2015). https://doi.org/10.1007/978-3-662-46669-8_14
20. Wang, Y., Wilke, P., Shao, Z.: An abstract stack based approach to verified com-
positional compilation to machine code. Proceedings of the ACM on Programming
Languages 3(POPL), 62 (2019)
21. Yang, J., Hawblitzel, C.: Safe to the last instruction: automated verifica-
tion of a type-safe operating system. In: Proceedings of the 2010 ACM SIG-
PLAN Conference on Programming Language Design and Implementation,
PLDI 2010, Toronto, Ontario, Canada, June 5-10, 2010. pp. 99–110 (2010).
https://doi.org/10.1145/1806596.1806610
Modular Inference of Linear Types for
Multiplicity-Annotated Arrows
Kazutaka Matsuda
1 Introduction
interface between the two. Thus, there have been several proposed approaches
for more practical linear type systems [7, 21, 24, 28].
Among these approaches, a system called λq→ , the core type system of Linear
Haskell, stands out for its ability to have linear code in large unrestricted code
bases [7]. With it, existing unrestricted code in Haskell typechecks in Linear
Haskell without modification, and if one desires, some of the unrestricted code
can be replaced with linear code, again without any special programming effort.
For example, one can use the function append in an unrestricted context as
λx.tail (append x x), regardless of whether append is a linear or unrestricted
function. This is made possible by their representation of linearity. Specifically,
they annotate each function type with its argument’s multiplicity (“linearity via
arrows” [7]) as A →m B, where m = 1 means that the function of the type
uses its argument linearly, and m = ω means that there is no restriction in
the use of the argument, which includes all non-linear standard Haskell code.
In this system, linear functions can be used in an unrestricted context if their
arguments are unrestricted. Thus, there is no problem in using append : List A →1
List A →1 List A as above, provided that x is unrestricted. This promotion of
linear expressions to unrestricted ones is difficult in other approaches [21, 24, 28]
(at least in the absence of bounded kind-polymorphism), where linearity is a
property of a type (called “linearity via kinds” in [7]).
However, as far as we are aware, little is known about type inference for
λq→. It is true that Linear Haskell is implemented as a fork¹ of the Glasgow
Haskell Compiler (GHC), which of course comes with type inference. However,
the algorithm has not been formalized and has limitations due to a lack of proper
handling of multiplicity constraints. Indeed, Linear Haskell gives up handling
complex constraints on multiplicities such as those with multiplications p · q; as
a result, Linear Haskell sometimes fails to infer principal types, especially for
higher-order functions.² This limits the reusability of code. For example, Linear
Haskell cannot infer an appropriate type for function composition to allow it to
compose both linear and unrestricted functions.
A classical approach that separates constraint solving from the usual
unification-based typing while retaining principal typing (for a rank-1 fragment)
is qualified typing [15]. In qualified typing, constraints on multiplicities
are collected, and then a type is qualified with them to obtain a principal type.
Complex multiplicities are not a problem in unification as they are handled by a
constraint solver. For example, consider app = λf.λx.f x. Suppose that f has
type a →p b, and x has type a (here we focus only on multiplicities). Let us write
the multiplicities of f and x as pf and px , respectively. Since x is passed to f ,
there is a constraint that the multiplicity px of x must be ω if the multiplicity p
of the f ’s argument also is. In other words, px must be no less than p, which is
represented by inequality p ≤ px under the ordering 1 ≤ ω. (We could represent
the constraint as an equality px = p · px , but using inequality is simpler here.)
¹ https://github.com/tweag/ghc/tree/linear-types
² Confirmed for commit 1c80dcb424e1401f32bf7436290dd698c739d906 (May 14, 2019).
∀q qf qx pf px a b. (q ≤ qx ∧ qf ≤ pf ∧ qx ≤ px ) ⇒ (a →q b) →pf a →px b
Finally, we discuss related work (Sect. 7) and then conclude the paper (Sect. 8).
The prototype implementation is available as a part of a reversible programming
system Sparcl, available from https://bitbucket.org/kztk/partially-reversible-lang-impl/.
Due to space limitation, we omit some proofs from this paper, which can be
found in the full version [20].
In this section, we introduce a qualified-typed [15] variant of λq→ [7] for its
rank-1 fragment, on which we base our type inference. Notable differences from
the original λq→ include: (1) multiplicity abstractions and multiplicity applications
are implicit (as are type abstractions and type applications), (2) this variant uses
qualified typing [15], (3) conditions on multiplicities are inequality based [6],
which gives better handling of multiplicity variables, and (4) local definitions
are excluded as we postpone the discussions to Sect. 5 due to their issues in the
handling of local assumptions in qualified typing [31].
A, B ::= ∀pa. Q ⇒ τ            (polytypes)
σ, τ ::= a | D μ τ | σ →μ τ    (monotypes)
μ    ::= p | 1 | ω             (multiplicities)
Q    ::= ⋀i φi                 (constraints)
φ    ::= M ≤ M′                (predicates)
M, N ::= ∏i μi                 (multiplications)
Fig. 1. Types and related notions: a and p are type and multiplicity variables, respec-
tively, and D represents a type constructor.
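For concreteness, the grammar of Fig. 1 transcribes directly into algebraic datatypes. The following Haskell rendering is ours (including the constructor names), not part of the formal development:

type TyVar  = String
type MulVar = String

data Mult  = MVar MulVar | One | Omega          -- μ ::= p | 1 | ω
  deriving (Eq, Show)
type MMult = [Mult]                             -- M, N ::= ∏i μi
data Pred  = Leq MMult MMult                    -- φ ::= M ≤ M′
  deriving (Eq, Show)
type Constr = [Pred]                            -- Q ::= ⋀i φi

data Mono                                       -- monotypes σ, τ
  = TVar TyVar
  | TData String [Mult] [Mono]                  -- D μ τ
  | Fun Mono Mult Mono                          -- σ →μ τ
  deriving (Eq, Show)

data Poly = Forall [MulVar] [TyVar] Constr Mono -- ∀pa. Q ⇒ τ
  deriving (Eq, Show)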
of this section, we shall postpone the discussions of local bindings (i.e., let) to
Sect. 5. Expressions consist of variables x, applications e1 e2 , λ-abstractions λx.e,
constructor applications C e, and (shallow) pattern matching case e0 of {Ci xi →
ei }i . For simplicity, we assume that constructors are fully-applied and patterns
are shallow. As usual, patterns Ci xi must be linear in the sense that the variables
in xi are pairwise distinct. Programs are assumed to be appropriately α-renamed so that
variables newly introduced by λ and patterns are always fresh. We do not require
the patterns of a case expression to be exhaustive or non-overlapping, following
the original λq→ [7]; the linearity in λq→ cares only for successful computations.
Unlike the original λq→ , we do not annotate λ and case with the multiplicity of
the argument and the scrutinee, respectively.
Constructors play an important role in λq→ . As we will see later, they can be
used to witness unrestrictedness, similarly to ! of !e in a linear type system [33].
2.2 Types
Types and related notations are defined in Fig. 1. Types are separated into
monotypes and polytypes (or, type schemes). Monotypes consist of (rigid) type
variables a, datatypes D μ τ , and multiplicity-annotated function types τ1 →μ τ2 .
Here, a multiplicity μ is either 1 (linear), ω (unrestricted), or a (rigid) multiplicity
variable p. Polytypes have the form ∀pa.Q ⇒ τ , where Q is a constraint that
is a conjunction of predicates. A predicate φ has the form M ≤ M′, where
M and M′ are multiplications of multiplicities. We shall sometimes treat Q as
a set of predicates, which means that we shall rewrite Q according to contexts
by the idempotent commutative monoid laws of ∧. We call both multiplicity (p)
and type (a) variables type-level variables, and write ftv(t) for the set of free
type-level variables in syntactic objects (such as types and constraints) t.
The relation (≤) and operator (·) in predicates denote the corresponding
relation and operator on {1, ω}, respectively. On {1, ω}, (≤) is defined as the
reflexive closure of 1 ≤ ω; note that ({1, ω} , ≤) forms a total order. Multiplication
(·) on {1, ω} is defined by
1 · m = m · 1 = m        ω · m = m · ω = ω.
For simplicity, we shall sometimes omit (·) and write m1 m2 for m1 · m2 . Note
that, for m1 , m2 ∈ {1, ω}, m1 · m2 is the least upper bound of m1 and m2 with
respect to ≤. As a result, m1 · m2 ≤ m holds if and only if (m1 ≤ m) ∧ (m2 ≤ m)
holds; we will use this property for efficient handling of constraints (Sect. 3.2).
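Since the domain is just {1, ω}, this property can be checked exhaustively; a small Haskell sketch (names ours):

data M = One | Omega deriving (Eq, Show, Enum, Bounded)

leq :: M -> M -> Bool           -- the ordering 1 ≤ ω
leq Omega One = False
leq _     _   = True

mul :: M -> M -> M              -- multiplication = least upper bound
mul One m = m
mul m One = m
mul _ _   = Omega

splitProperty :: Bool           -- evaluates to True
splitProperty =
  and [ leq (mul a b) c == (leq a c && leq b c)
      | a <- [minBound ..], b <- [minBound ..], c <- [minBound ..] ]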
Q; Γ; Δ′ ⊢ e : τ′    Q ⊨ Δ = Δ′    Q ⊨ τ ∼ τ′
───────────────────────────────────────────── Eq
Q; Γ; Δ ⊢ e : τ

Γ(x) = ∀pa. Q′ ⇒ τ′    Q ⊨ Q′[p ↦ μ]    Q ⊨ x¹ ≤ Δ
─────────────────────────────────────────────────── Var
Q; Γ; Δ ⊢ x : τ′[p ↦ μ, a ↦ τ]

Q; Γ, x : σ; Δ, x^μ ⊢ e : τ
─────────────────────────── Abs
Q; Γ; Δ ⊢ λx.e : σ →μ τ

Q; Γ; Δ1 ⊢ e1 : σ →μ τ    Q; Γ; Δ2 ⊢ e2 : σ
─────────────────────────────────────────── App
Q; Γ; Δ1 + μΔ2 ⊢ e1 e2 : τ

C : ∀pa. τ →ν D p a    {Q; Γ; Δi ⊢ ei : τi[p ↦ μ, a ↦ σ]}i
─────────────────────────────────────────────────────────── Con
Q; Γ; ωΔ0 + Σi νi[p ↦ μ]Δi ⊢ C e : D μ σ

Q; Γ; Δ0 ⊢ e0 : D μ σ    Ci : ∀pa. τi →νi D p a
{Q; Γ, xi : τi[p ↦ μ, a ↦ σ]; Δi, xi^(μ0·νi[p ↦ μ]) ⊢ ei : τ}i
──────────────────────────────────────────────────────────────── Case
Q; Γ; μ0Δ0 + Σi Δi ⊢ case e0 of {Ci xi → ei}i : τ

Fig. 2. Typing rules for expressions.
Lemma 2. Q ⊨ μΔ ≤ Δ′ implies Q ⊨ Δ ≤ Δ′.
Rule Var says that x is used once in the variable expression x, but it is safe to
regard the expression as using x more than once and using other variables ω times.
At the same time, the type ∀pa.Q′ ⇒ τ of x is instantiated to τ[p ↦ μ, a ↦ σ],
yielding the constraints Q′[p ↦ μ], which must be entailed by Q.
Rule Abs says that λx.e has type σ →μ τ if e has type τ , assuming that
the use of x in e is μ. Unlike the original λq→ [7], in our system, multiplicity
annotations on arrows must be μ, i.e., 1, ω, or a multiplicity variable, instead of
M . This does not limit the expressiveness because such general arrow types can
be represented by type σ →p τ with constraints p ≤ M ∧ M ≤ p.
Rule App sketches an important principle in λq→ ; when an expression with
variable use Δ is used μ-many times, the variable use in the expression becomes
μΔ. Thus, since we pass e2 (with variable use Δ2 ) to e1 , where e1 uses the
argument μ-many times as described in its type σ →μ τ , the use of variables in
e2 of e1 e2 becomes μΔ2 . For example, for (λy.42) x, x is considered to be used
ω times because (λy.42) has type σ →ω Int for any σ.
Rule Con is nothing but a combination of Var and App. The ωΔ0 part is
useful only when C is nullary; otherwise, we can weaken Δ at the leaves.
Rule Case is the most complicated rule in this type system. In this rule, μ0
represents how many times the scrutinee e0 is used in the case. If μ0 = ω, the
pattern bound variables can be used unrestrictedly, and if μ0 = 1, the pattern
bound variables can be used according to the multiplicities of the arguments of the
constructor.4 Thus, in the ith branch, the variables in xi can be used as μ0 νi[p ↦ μ],
where νi[p ↦ μ] represents the multiplicities of the arguments of the constructor
Ci . Other than xi , each branch body ei can contain free variables used as Δi .
Thus, the uses of free variables across the branch bodies are summarized as
Σi Δi . Recall that the case uses the scrutinee μ0 times; thus, the whole use of
variables is estimated as μ0 Δ0 + Σi Δi .
Then, we define the typing judgment for programs, Γ ⊢ prog, which reads that
program prog is well-typed under Γ , by the typing rules in Fig. 3. At this point,
the rules Bind and BindA have no significant differences; their difference will be
clear when we discuss type inference. In the rules Bind and BindA, we assumed
that Γ contains no free type-level variables. Therefore, we can safely generalize
all free type-level variables in Q and τ . We do not check the use Δ in both rules
4
This behavior, inherited from λq→ [7], implies the isomorphism !(A ⊗ B) ≡ !A ⊗ !B,
which is not a theorem in the standard linear logic. The isomorphism intuitively means
that unrestricted products can (only) be constructed from unrestricted components,
as commonly adopted in linearity-via-kind approaches [11, 21, 24, 28, 29].
as bound variables are assumed to be used arbitrarily many times in the rest
of the program; that is, the multiplicity of a bound variable is ω, and its body
is regarded as using variables as ωΔ, which maps each x ∈ dom(Δ) to ω and has
no free type-level variables.
2.4 Metatheories
Lemma 4 is the standard weakening property. Lemma 5 says that we can replace
Q with a stronger one, Lemma 6 says that we can replace Δ with a greater one,
and Lemma 7 says that we can substitute type-level variables in a term-in-context
without violating typability. These lemmas state forms of weakening, and the
last three lemmas clarify the goal of our inference system discussed in Sect. 3.
Lemma 4. Q; Γ ; Δ ⊢ e : τ implies Q; Γ, x : σ; Δ ⊢ e : τ .
Lemma 5. Q; Γ ; Δ ⊢ e : τ and Q′ ⊨ Q implies Q′; Γ ; Δ ⊢ e : τ .
Lemma 6. Q; Γ ; Δ ⊢ e : τ and Q ⊨ Δ ≤ Δ′ implies Q; Γ ; Δ′ ⊢ e : τ .
Lemma 7. Q; Γ ; Δ ⊢ e : τ implies Qθ; Γθ; Δθ ⊢ e : τθ.
We have the following form of the substitution lemma:

Lemma 8 (Substitution). Suppose Q0; Γ, x : σ; Δ0 , xμ ⊢ e : τ , and Qi; Γ ; Δi ⊢
ei : σi for each i. Then, Q0 ∧ ⋀i Qi; Γ ; Δ0 + Σi μi Δi ⊢ e[x ↦ e] : τ .
Subject Reduction. We show the subject reduction property for a simple call-by-
name semantics. Consider the standard small-step call-by-name relation e ⟶ e′
with the following β-reduction rules (we omit the congruence rules):

(λx.e1) e2 ⟶ e1[x ↦ e2]        case Cj e of {Ci xi → ei }i ⟶ ej[xj ↦ e]
Then, by Lemma 8, we have the following subject reduction property:

Lemma 9 (Subject Reduction). Q; Γ ; Δ ⊢ e : τ and e ⟶ e′ implies
Q; Γ ; Δ ⊢ e′ : τ .
Lemma 9 holds even for call-by-value reduction, though with a caveat.
For a program f1 = e1; . . . ; fn = en , it can happen that some ei is typed
only under an unsatisfiable (i.e., conflicting) Qi. Since a conflicting Qi means that ei
is essentially ill-typed, evaluating ei may not be safe. However, the standard
call-by-value strategy evaluates ei even when fi is not used at all, and the
type system does not reject this unsatisfiability. This issue can be addressed
by the standard witness-passing transformation [15], which converts programs so
that Q ⇒ τ becomes WQ → τ , where WQ represents a set of witnesses of Q.
Nevertheless, it would also be reasonable to reject conflicting constraints locally.
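For intuition, witness passing can be sketched in GADT style as follows (the encoding and the names Mult and Leq are ours, purely for illustration; [15] develops the general theory):

{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

import Data.Kind (Type)

data Mult = One | Omega

-- A value of type Leq p q witnesses the predicate p <= q; a qualified type
-- Q => tau is then compiled into W_Q -> tau in dictionary style.
data Leq :: Mult -> Mult -> Type where
  LeqRefl :: Leq m m          -- m <= m
  LeqOne  :: Leq 'One m       -- 1 <= m, for any m

-- No constructor produces Leq 'Omega 'One, so code typed under a
-- conflicting constraint can be compiled but never invoked.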
We then state the correspondence with the original system [7] (assuming the
modification [6] for the variable case5 ) to show that the qualified-typed version
5
In the premise of Var, the original [7] uses ∃Δ′. Δ = x¹ + ωΔ′, which is modified
to x¹ ≤ Δ in [6]. The difference between the two becomes clear when Δ(x) = p, for
which the former does not hold, as we are not able to choose Δ′ depending on p.
captures linearity as the original does. While the original system assumes
call-by-need evaluation, Lemma 9 could be lifted to that case.

Theorem 1. If ⊤; Γ ; Δ ⊢ e : τ , where Γ contains only monotypes, then e is also
well-typed in the original λq→ under some environment.
The main reason for the monotype restriction is that our polytypes are strictly
more expressive than their (rank-1) polytypes. This extra expressiveness comes
from predicates of the form · · · ≤ M · M′. Indeed, f = λx.case x of {MkMany y →
(y, y)} has type ∀p q a. ω ≤ p · q ⇒ MkMany p a →q a ⊗ a in our system, while it
has three incomparable types in the original λq→ .
3 Type Inference
In this section, we give a type inference method for the type system in the
previous section. Following [31, Section 3], we adopt the standard two-phase
approach; we first gather constraints on types and then solve them. As mentioned
in Sect. 1, the inference system described here has the issue of ambiguity, which
will be addressed in Sect. 4.
τ ::= · · · | α        μ ::= · · · | π        ψ ::= φ | τ ∼ τ′        C ::= ⋀i ψi

The constraint-gathering rule for case expressions reads:

  Γ ⊢ e0 : τ0 ; Δ0 ; C0        π0 , πi , αi , β : fresh
  Ci : ∀pa. τi →νi D p a
  {Γ, xi : τi[p ↦ πi , a ↦ αi] ⊢ ei : τi′ ; Δi , xi^Mi ; Ci′}i
  C = C0 ∧ ⋀i (Ci′ ∧ β ∼ τi′ ∧ (τ0 ∼ D πi αi) ∧ ⋀j Mij ≤ π0 νij[p ↦ πi])
  ――――――――――――――――――――――――――――
  Γ ⊢ case e0 of {Ci xi → ei }i : β ; π0 Δ0 + Σi Δi ; C
in some sense to C under the assumption Q. The idea underlying our simplification
is to solve type equality constraints in C as much as possible and then remove
predicates that are implied by Q. Rules S-Fun, S-Data, S-Uni, and S-Triv
are responsible for the former; they decompose type equality constraints and
yield substitutions once either of the two sides becomes a unification variable. Rules
S-Entail and S-Rem are responsible for the latter, which remove predicates
implied by Q and then return the residual constraints. Rule S-Entail checks
Q |= φ; a concrete method for this check will be discussed in Sect. 3.2.
Example 1 (app). Let us illustrate how the system infers a type for app =
λf.λx.f x. We have the following derivation for its body λf.λx.f x (each line is a
gathering judgment, read from the leaves downwards):

  f : αf ⊢ f : αf ; f¹ ; ⊤        x : αx ⊢ x : αx ; x¹ ; ⊤
  f : αf , x : αx ⊢ f x : β ; f¹, x^π ; αf ∼ (αx →π β)
  f : αf ⊢ λx.f x : αx →πx β ; f¹ ; αf ∼ (αx →π β) ∧ π ≤ πx
  ⊢ λf.λx.f x : αf →πf αx →πx β ; ∅ ; αf ∼ (αx →π β) ∧ π ≤ πx ∧ 1 ≤ πf
          Q ⊢simp σ ∼ σ′ ∧ μ ≤ μ′ ∧ μ′ ≤ μ ∧ τ ∼ τ′ ∧ C ; Q′ ; θ
  S-Fun   ―――――――――――――――――――――――――――
          Q ⊢simp (σ →μ τ) ∼ (σ′ →μ′ τ′) ∧ C ; Q′ ; θ

          Q ⊢simp μ ≤ μ′ ∧ μ′ ≤ μ ∧ σ ∼ σ′ ∧ C ; Q′ ; θ
  S-Data  ――――――――――――――――――――――――
          Q ⊢simp (D μ σ) ∼ (D μ′ σ′) ∧ C ; Q′ ; θ

          α ∉ fuv(τ)    Q ⊢simp C[α ↦ τ] ; Q′ ; θ
  S-Uni   ――――――――――――――――――
          Q ⊢simp α ∼ τ ∧ C ; Q′ ; θ ◦ [α ↦ τ]

          Q ⊢simp C ; Q′ ; θ
  S-Triv  ――――――――――――
          Q ⊢simp τ ∼ τ ∧ C ; Q′ ; θ

          Q ∧ Qw ⊨ φ    Q ⊢simp Qw ∧ C ; Q′ ; θ
  S-Entail ――――――――――――――――――
          Q ⊢simp φ ∧ Qw ∧ C ; Q′ ; θ

          no other rules can apply
  S-Rem   ――――――――――――
          Q ⊢simp Q′ ; Q′ ; ∅
– Then, in the third-to-last step, for f x, the system infers the type β with the
constraint αf ∼ (αx →π β). At the same time, the variable use in f x is inferred
as f¹, x^π . Note that the use of x is π because x is passed to f : αx →π β.
– After that, in the last two steps, the system yields the constraints π ≤ πx
and 1 ≤ πf .
As a result, the type τ = αf →πf αx →πx β is inferred, together with the constraint
C = αf ∼ (αx →π β) ∧ π ≤ πx ∧ 1 ≤ πf .
Then, we try to assign a polytype to app by the rules in Fig. 4. By simplifi-
cation, we have ⊤ ⊢simp C ; π ≤ πx ; [αf ↦ (αx →π β)]. Thus, by generalizing
τ[αf ↦ (αx →π β)] = (αx →π β) →πf αx →πx β with π ≤ πx , we obtain the
following type for app:

  app : ∀p pf px a b. p ≤ px ⇒ (a →p b) →pf a →px b
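For comparison, GHC's Linear Haskell supports multiplicity polymorphism but no ordering constraints such as p ≤ px, so a hand-written GHC analogue (a sketch of ours, for GHC 9.0 or later) has to conflate the two multiplicities:

{-# LANGUAGE LinearTypes, DataKinds, KindSignatures, ExplicitForAll #-}

import GHC.Types (Multiplicity)

-- A hand-annotated analogue of app; the inferred constraint p <= px of
-- our system has no GHC counterpart, so one multiplicity variable p is
-- used for both arrows.
app :: forall (p :: Multiplicity) a b. (a %p -> b) %1 -> a %p -> b
app f x = f x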
        π ∉ fuv(Q)    π ≠ μ
        Q ∧ Qw ⊨ π ≤ μ ∧ μ ≤ π    Q ⊢simp (Qw ∧ C)[π ↦ μ] ; Q′ ; θ
  S-Eq  ――――――――――――――――――――――――
        Q ⊢simp Qw ∧ C ; Q′ ; θ ◦ [π ↦ μ]
This rule says that if π = μ must hold for Qw ∧ C to hold, the simplification yields
the substitution [π ↦ μ]. The condition π ∉ fuv(Q) is required for Lemma 10; a
solution cannot substitute variables in Q. Note that this rule essentially finds an
improving substitution [16].
Using the rule is optional. Our prototype implementation actually uses S-Eq
only for Qw for which we can find μ easily: M ≤ 1, ω ≤ μ, and looping chains
μ1 ≤ μ2 ∧ · · · ∧ μn−1 ≤ μn ∧ μn ≤ μ1 .
The simplification rules rely on checking the entailment Q ⊨ φ. For the constraints
in this system, we can perform this check in quadratic time in the worst case, and
in linear time in most cases. Specifically, we reduce checking Q ⊨ φ to the
satisfiability of propositional Horn formulas (Horn SAT), which can be decided in
time linear in the number of occurrences of literals [10]; the reduction (precisely,
the preprocessing of the reduction) may increase the problem size quadratically.
A small illustration of the reduction follows the normalization rules below.
The idea of using Horn SAT for constraint solving in linear typing can be found
in Mogensen [23].
First, as a preprocess, we normalize both given and wanted constraints by
the following rules:
– Replace M1 · M2 ≤ M with M1 ≤ M ∧ M2 ≤ M .
– Replace M · 1 and 1 · M with M , and M · ω and ω · M with ω.
– Remove trivial predicates 1 ≤ M and M ≤ ω.
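As a concrete illustration (a minimal sketch of ours, not the paper's implementation), the following Haskell program decides Q ⊨ φ for normalized predicates. It reads the multiplicity 1 as true and ω as false, so that a predicate μ ≤ μ1 · . . . · μk becomes the Horn clause μ1 ∧ · · · ∧ μk → μ; then Q ⊨ φ holds iff Q ∧ ¬φ is unsatisfiable, which forward chaining (unit propagation) decides for Horn formulas.

import qualified Data.Set as Set

-- Normalized predicates: the left-hand side is omega or a variable; the
-- right-hand side is a product of variables (the empty product is 1).
type Var  = String
data Lhs  = LOmega | LVar Var deriving (Eq, Show)
type Pred = (Lhs, [Var])              -- lhs <= product of variables

-- entails q phi refutes Q /\ ~phi: ~phi forces every variable on phi's
-- right-hand side to 1 ("true") and phi's left-hand side to omega.
entails :: [Pred] -> Pred -> Bool
entails q (lhs, rhs) = go (Set.fromList rhs)
  where
    go true
      | conflict true = True          -- Q /\ ~phi refuted: entailment holds
      | true' == true = False         -- consistent fixed point: no entailment
      | otherwise     = go true'
      where
        -- fire every definite clause whose body is already forced to 1
        true' = true `Set.union`
                Set.fromList [ v | (LVar v, body) <- q
                                 , all (`Set.member` true) body ]
    -- a conflict arises if a clause with head omega fires, or if phi's
    -- left-hand side (asserted omega by ~phi) is forced to 1
    conflict true =
      or [ all (`Set.member` true) body | (LOmega, body) <- q ]
        || case lhs of { LVar v -> v `Set.member` true; LOmega -> False }

For example, entails [(LVar "p", ["r"]), (LVar "r", ["s"])] (LVar "p", ["s"]) evaluates to True, reflecting p ≤ r ∧ r ≤ s ⊨ p ≤ s.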
In this section, we address the issue of ambiguous and leaky types by using
quantifier elimination. The basic idea is simple; we just view the type of app as
existentially quantifying the multiplicity variables that occur only in its constraint,
and then eliminate those quantifiers.
The elimination of existential quantifiers is rather easy; we simply use the well-
known fact that a disjunction of a Horn clause and a definite clause can also be
represented as a Horn clause. Under our encoding of normalized predicates
(Sect. 3.2), which maps μ ≤ M to a Horn clause, this fact can be rephrased as:

Lemma 12. (μ ≤ M ∨ ω ≤ M′) ≡ μ ≤ M · M′.

Here, we extend constraints to include ∨ and write ≡ for logical equivalence;
that is, Q ≡ Q′ if and only if Q ⊨ Q′ and Q′ ⊨ Q.
As a corollary, we obtain the following result:
Corollary 2. There effectively exists a quantifier-free constraint Q′, denoted by
elim(∃π.Q), such that Q′ is logically equivalent to ∃π.Q.
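For instance, elim(∃π. (p ≤ π ∧ π ≤ q ∧ π ≤ r)) can be taken to be p ≤ q ∧ p ≤ r: each predicate placing π on the right is combined with each predicate placing π on the left (via Lemma 12, here degenerating to transitivity), and predicates not mentioning π are kept.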
In the worst case, the size of elim(∃π.Q) can be quadratic in that of Q. Thus,
repeated elimination can make constraints exponentially bigger. We believe
that such blow-ups rarely happen, because π usually occurs only in a few
predicates of Q. Also, recall that non-singleton right-hand sides are caused only
by multiplicity-parameterized constructors. When every right-hand side of ≤ is a
singleton in Q, the same holds in elim(∃π.Q); in that case, the exponential
blow-up cannot happen, because the size of constraints of this form is at most
quadratic in the number of multiplicity variables.
Example 3. Consider (Q, θ) in Sect. 3.3 such that Q = (πf ≤ πf′ ∧ π′ ≤ πx ∧ πx ≤
πx′) and θ = [αf ↦ (α′ →π′ β′), π1 ↦ πf′, β ↦ (α′ →πx′ β′), π2 ↦ πx′, γ ↦ β′],
which is obtained after simplification of the gathered constraint. Following
Example 2, eliminating the variables that are not in τθ = (α′ →π′ β′) →πf′ α′ →πx′ β′
yields the constraint π′ ≤ πx′. As a result, by generalization, we obtain the
polytype

  ∀q pf px a b. (q ≤ px) ⇒ (a →q b) →pf a →px b

for app′, which is equivalent to the inferred type of app.
h = λf.λk.let y = f (λx.k x) in 0
Suppose for simplicity that f and k have types (a →π1 b) →π2 c and a →π3 b,
respectively (here we only focus on the treatment of multiplicities). Then, f (λx.k x)
has type c under the constraint π3 ≤ π1 . Thus, after generalization, y has the type
π3 ≤ π1 ⇒ c, where π3 and π1 are neither generalized nor eliminated because
they escape from the definition of y. As a result, h has type ∀p1 p2 p3 a b c. ((a →p1
b) →p2 c) →ω (a →p3 b) →ω Int; there is no constraint p3 ≤ p1 , because the
definition of y does not yield any constraint. This absence of the constraint
is counter-intuitive: the user wrote f (λx.k x), yet the constraint induced by that
expression is never imposed. In particular, no error is reported even
when f : (a →1 b) →1 c and k : a →ω b, although f (λx.k x) is illegal in that
case. Also, if we change 0 to y, the error happens at the use site instead of the
definition site. Moreover, the type is fragile, as it depends on whether y occurs or
not; for example, if we change 0 to const 0 y where const = λa.λb.a, the type of
h changes to ∀p1 p2 p3 a b c. p3 ≤ p1 ⇒ ((a →p1 b) →p2 c) →ω (a →p3 b) →ω Int.
In this discussion, we did not consider type-equality constraints, but there is no
legitimate reason why type-equality constraints should be solved on the fly in typing y.
As demonstrated by the above example, "let should not be generalized" [30,31]
in our case as well. Thus, we adopt the same principle as OutsideIn(X): a let is
generalized only if the user writes a type annotation for it [31]. This principle is
also adopted in GHC (as of 6.12.1, when the language option MonoLocalBinds is
turned on), with a slight relaxation to generalize closed bindings.
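A minimal GHC illustration of this principle (standard GHC behavior; the example is ours):

{-# LANGUAGE MonoLocalBinds, RankNTypes #-}

-- Without an annotation, the non-closed local binding f is not generalized
-- under MonoLocalBinds, so the following is rejected by GHC:
--
--   bad b = let f x = if b then x else x in (f 'a', f True)
--
-- With a user-supplied signature, f is generalized exactly as annotated:
good :: Bool -> (Char, Bool)
good b = let f :: forall a. a -> a
             f x = if b then x else x
         in (f 'a', f True)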
6.1 Implementation
The implementation follows the present paper except for a few points. Following
the implementation of OutsideIn(X) in GHC, our type checker keeps a natural
number, which we call an implication level, corresponding to the depth of implica-
tion constraints, and a unification variable also accordingly keeps the implication
level at which the variable is introduced. As usual, we represent unification
variables by mutable references. We perform unification on the fly by destructive
assignment, while unification of variables that have smaller implication levels than
the current level is recorded for later checking of implication constraints; such a
variable cannot be in πα of ∃πα.Q |=τ C. The implementation supports GADTs
because they can be implemented rather easily by extending constraints Q to
include type equalities, but does not support type classes because the handling
of them requires another X of OutsideIn(X).
Although we can use a linear-time Horn SAT solving algorithm [10] for
checking Q |= φ, the implementation uses a general SAT solver based on DPLL [8,
9] because the unit propagation in DPLL works efficiently for Horn formulas.
We do not use external solvers, such as Z3, as we conjecture that the sizes of
formulas are usually small, and overhead to use external solvers would be high.
(◦) : (q ≤ s ∧ q ≤ t ∧ p ≤ t) ⇒ (b →q c) →r (a →p b) →s a →t c
curry : (p ≤ r ∧ p ≤ s) ⇒ ((a ⊗ b) →p c) →q a →r b →s c
uncurry : (p ≤ s ∧ q ≤ s) ⇒ (a →p b →q c) →r (a ⊗ b) →s c
either : (p ≤ r ∧ q ≤ r) ⇒ (a →p c) →ω (b →q c) →ω Either a b →r c
foldr : (q ≤ r ∧ p ≤ s ∧ q ≤ s) ⇒ (a →p b →q b) →ω b →r List a →s b
foldl : (p ≤ r ∧ r ≤ s ∧ q ≤ s) ⇒ (b →p a →q b) →ω b →r List a →s b
map : (p ≤ q) ⇒ (a →p b) →ω List a →q List b
filter : (a →p Bool) →ω List a →ω List a
append : List a →p List a →q List a
reverse : List a →p List a
concat : List (List a) →p List a
concatMap : (p ≤ q) ⇒ (a →p List b) →ω List a →q List b
Fig. 7. Inferred types for selected functions from Prelude (quantifications are omitted)
Note that the inference results do not contain →1 ; recall that there is no problem
in using unrestricted inputs linearly, and thus the multiplicity of a linear input
can be arbitrary. The results also show that the inference algorithm successfully
detected that append , reverse, and concat are linear functions.
It is true that these inferred types indeed leak some internal details into their
constraints, but those constraints can be understood from the functions' extensional
behaviors alone, at least for the examined functions. Thus, we believe that the
inferred types are reasonably simple.
(Here, we used our paper's syntax instead of that of the actual examined code.)
Both $ and & are operator versions of app, where the arguments are flipped
in &. Besides the treatment of multiplicities, the disambiguation is crucial for this
expression to have type Int.
The experiments were conducted on a MacBook Pro (13-inch, 2017) with
Mac OS 10.14.6, 3.5 GHz Intel Core i7 CPU, and 16 GB memory. GHC 8.6.5
with -O2 was used for compiling our prototype system.
Table 1 lists the experimental results. Each elapsed time is the average of 1,000
executions for the first two programs, and 10,000 executions for the last two. All
columns are self-explanatory except for the # column, which counts the number of
6
We changed the type of fork to Dual s s′ →ω (Ch s →1 Ch End) →1 (Ch s′ →1
Un r) →1 r, as their type Dual s s′ ⇒ (Ch s →1 Ch End) →1 Ch s′ is incorrect for
the multiplicity-erasing semantics. A minor difference is that we used a GADT to
witness duality, because our prototype implementation does not support type classes.
7 Related Work
Borrowing the terminology from Bernardy et al. [7], there are two approaches to
linear typing: linearity via arrows and linearity via kinds. The former approaches
manage how many times an assumption (i.e., a variable) can be used; for example,
in Wadler [33]'s linear λ-calculus, there are two sorts of variables, linear and
unrestricted, where the latter can only be obtained by decomposing
let !x = e1 in e2 . Since the primitive sources of assumptions are arrow types, it is nat-
ural to annotate them with arguments’ multiplicities [7, 12, 22]. For multiplicities,
we focused on 1 and ω following Linear Haskell [6, 7, 26]. Although {1, ω} would
already be useful for some domains including reversible computation [19, 35]
and quantum computation [2, 25], handling more general multiplicities, such
as {0, 1, ω} and arbitrary semirings [12], is an interesting future direction. Our
discussions in Sect. 2 and 3, similarly to Linear Haskell [7], could be extended
to more general domains with small modifications. In contrast, we rely on the
particular domains {1, ω} of multiplicities for the crucial points of our inference,
i.e., entailment checking and quantifier elimination. Igarashi and Kobayashi [14]’s
linearity analysis for π calculus, which assigns input/output usage (multiplicities)
to channels, has similarity to linearity via arrows. Multiplicity 0 is important in
their analysis to identify input/output only channels. They solve constraints on
multiplicities separately in polynomial time, leveraging monotonicity of multi-
plicity operators with respect to the ordering 0 ≤ 1 ≤ ω. Here, 0 ≤ 1 comes from
the fact that 1 in their system means "at most once" instead of "exactly once".
The “linearity via kinds” approaches distinguish types whose values are
treated linearly from types whose values are not [21,24,28], where the distinction
is usually represented by kinds [21, 28]. Interestingly, they also have two function
types—function types that belong to the linear kind and those that belong to
the unrestricted kind—because the kind of a function type cannot be determined
solely by the argument and return types. Mazurak et al. [21] use subkinding to
avoid explicit conversions from unrestricted values to linear ones. However, due
to the variations of the function types, a function can have multiple incompatible
types; e.g., the function const can have four incompatible types [24] in the system.
Universal types accompanied by kind abstraction [28] address the issue to some
extent; it works well for const, but still gives two incomparable types to the
function composition (◦) [24]. Morris [24] addresses this issue of principality
with qualified typing [15]. Two forms of predicates are considered in the system:
Un τ states that τ belongs to the unrestricted kind, and σ ≤ τ states that
Un σ implies Un τ . This system is considerably simpler than the previous
systems. Turner et al. [29]'s type-based usage analysis bears a similarity
to linearity via kinds; in that system, each type is annotated by a usage (a
multiplicity), as in (List Intω )ω . Wansbrough and Peyton Jones [34] extend the
system to include polymorphic types and subtyping with respect to multiplicities,
and discuss multiplicity polymorphism. Mogensen [23] follows a similar
line of work, which reduces constraint solving on multiplicities to Horn SAT.
His system concerns multiplicities {0, 1, ω} with ordering 0 ≤ 1 ≤ ω, and his
constraints can involve more operations including additions and multiplications
but only in the left-hand side of ≤.
Morris [24] uses improving substitutions [16] in generalization, which some-
times are effective for removing ambiguity, though without showing concrete
algorithms to find them. In our system, as well as S-Eq, elim(∃π.Q) can be
viewed as a systematic way to find improving substitutions. That is, elim(∃π.Q)
improves Q by substituting π with min{Mi | (ω ≤ Mi) ∈ Φω}, i.e., the largest
possible candidate for π. Though the largest solution is usually undesirable,
especially when the right-hand sides of ≤ are all singletons, we can also view that
elim(∃π.Q) substitutes π by the product of {μi | (μi ≤ 1) ∈ Φ1}, i.e., the smallest
possible candidate.
8 Conclusion
We designed a type inference system for a rank 1 fragment of λq→ [7] that can infer
principal types based on the qualified typing system OutsideIn(X) [31]. We
observed that naive qualified typing infers ambiguous types often and addressed
the issue based on quantifier elimination. The experiments suggested that the
proposed inference system infers principal types effectively, and the overhead
compared with unrestricted typing is acceptable, though not negligible.
Since we based our work on the inference algorithm used in GHC, the natural
expectation is to implement the system in GHC. A technical challenge in achieving
this is to combine the disambiguation techniques with other sorts of constraints,
especially type classes, and with arbitrary-rank polymorphism.
Acknowledgments
We thank Meng Wang, Atsushi Igarashi, and the anonymous reviewers of ESOP
2020 for their helpful comments on the preliminary versions of this paper. This
work was partially supported by JSPS KAKENHI Grant Numbers 15H02681
and 19K11892, JSPS Bilateral Program, Grant Number JPJSBP120199913, the
Kayamori Foundation of Informational Science Advancement, and EPSRC Grant
EXHIBIT: Expressive High-Level Languages for Bidirectional Transformations
(EP/T008911/1).
References
1. Aehlig, K., Berger, U., Hofmann, M., Schwichtenberg, H.: An arithmetic for non-
size-increasing polynomial-time computation. Theor. Comput. Sci. 318(1-2), 3–27
(2004). https://doi.org/10.1016/j.tcs.2003.10.023
2. Altenkirch, T., Grattage, J.: A functional quantum programming language. In:
20th IEEE Symposium on Logic in Computer Science (LICS 2005), 26-29 June
2005, Chicago, IL, USA, Proceedings. pp. 249–258. IEEE Computer Society (2005).
https://doi.org/10.1109/LICS.2005.1
3. Baillot, P., Hofmann, M.: Type inference in intuitionistic linear logic. In: Kut-
sia, T., Schreiner, W., Fernández, M. (eds.) Proceedings of the 12th Interna-
tional ACM SIGPLAN Conference on Principles and Practice of Declarative
Programming, July 26-28, 2010, Hagenberg, Austria. pp. 219–230. ACM (2010).
https://doi.org/10.1145/1836089.1836118
4. Baillot, P., Terui, K.: A feasible algorithm for typing in elementary affine
logic. In: Urzyczyn, P. (ed.) Typed Lambda Calculi and Applications, 7th In-
ternational Conference, TLCA 2005, Nara, Japan, April 21-23, 2005, Proceed-
ings. Lecture Notes in Computer Science, vol. 3461, pp. 55–70. Springer (2005).
https://doi.org/10.1007/11417170_6
5. Baillot, P., Terui, K.: Light types for polynomial time computation in lambda cal-
culus. Inf. Comput. 207(1), 41–62 (2009). https://doi.org/10.1016/j.ic.2008.08.005
6. Bernardy, J.P., Boespflug, M., Newton, R., Jones, S.P., Spiwack, A.: Linear mini-
core. GHC Developers Wiki, https://gitlab.haskell.org/ghc/ghc/wikis/uploads/
ceaedb9ec409555c80ae5a97cc47470e/minicore.pdf, visited Oct. 14, 2019.
7. Bernardy, J., Boespflug, M., Newton, R.R., Peyton Jones, S., Spiwack, A.: Linear
Haskell: practical linearity in a higher-order polymorphic language. PACMPL
2(POPL), 5:1–5:29 (2018). https://doi.org/10.1145/3158093
8. Davis, M., Logemann, G., Loveland, D.W.: A machine program for theorem-proving.
Commun. ACM 5(7), 394–397 (1962). https://doi.org/10.1145/368273.368557
9. Davis, M., Putnam, H.: A computing procedure for quantification theory. J. ACM
7(3), 201–215 (1960). https://doi.org/10.1145/321033.321034
10. Dowling, W.F., Gallier, J.H.: Linear-time algorithms for testing the satisfia-
bility of propositional Horn formulae. J. Log. Program. 1(3), 267–284 (1984).
https://doi.org/10.1016/0743-1066(84)90014-1
11. Gan, E., Tov, J.A., Morrisett, G.: Type classes for lightweight substructural types.
In: Alves, S., Cervesato, I. (eds.) Proceedings Third International Workshop on
Linearity, LINEARITY 2014, Vienna, Austria, 13th July, 2014. EPTCS, vol. 176,
pp. 34–48 (2014). https://doi.org/10.4204/EPTCS.176.4
12. Ghica, D.R., Smith, A.I.: Bounded linear types in a resource semiring. In: Shao,
Z. (ed.) Programming Languages and Systems - 23rd European Symposium on
Programming, ESOP 2014, Held as Part of the European Joint Conferences on
Theory and Practice of Software, ETAPS 2014, Grenoble, France, April 5-13, 2014,
Proceedings. Lecture Notes in Computer Science, vol. 8410, pp. 331–350. Springer
(2014). https://doi.org/10.1007/978-3-642-54833-8_18
13. Girard, J., Scedrov, A., Scott, P.J.: Bounded linear logic: A modular approach
to polynomial-time computability. Theor. Comput. Sci. 97(1), 1–66 (1992).
https://doi.org/10.1016/0304-3975(92)90386-T
14. Igarashi, A., Kobayashi, N.: Type reconstruction for linear λ-calculus with I/O sub-
typing. Inf. Comput. 161(1), 1–44 (2000). https://doi.org/10.1006/inco.2000.2872
15. Jones, M.P.: Qualified Types: Theory and Practice. Cambridge University Press,
New York, NY, USA (1995)
16. Jones, M.P.: Simplifying and improving qualified types. In: Williams, J. (ed.)
Proceedings of the seventh international conference on Functional programming
languages and computer architecture, FPCA 1995, La Jolla, California, USA, June
25-28, 1995. pp. 160–169. ACM (1995). https://doi.org/10.1145/224164.224198
17. Lindley, S., Morris, J.G.: A semantics for propositions as sessions. In: Vitek, J.
(ed.) Programming Languages and Systems - 24th European Symposium on Pro-
gramming, ESOP 2015, Held as Part of the European Joint Conferences on Theory
and Practice of Software, ETAPS 2015, London, UK, April 11-18, 2015. Proceed-
ings. Lecture Notes in Computer Science, vol. 9032, pp. 560–584. Springer (2015).
https://doi.org/10.1007/978-3-662-46669-8_23
18. Lindley, S., Morris, J.G.: Embedding session types in Haskell. In: Main-
land, G. (ed.) Proceedings of the 9th International Symposium on Haskell,
Haskell 2016, Nara, Japan, September 22-23, 2016. pp. 133–145. ACM (2016).
https://doi.org/10.1145/2976002.2976018
19. Lutz, C.: Janus: a time-reversible language. Letter to R. Landauer. (1986), available
on: http://tetsuo.jp/ref/janus.pdf
20. Matsuda, K.: Modular inference of linear types for multiplicity-annotated arrows
(2020), http://arxiv.org/abs/1911.00268v2
21. Mazurak, K., Zhao, J., Zdancewic, S.: Lightweight linear types in System F◦.
In: TLDI. pp. 77–88. ACM (2010)
22. McBride, C.: I got plenty o’ nuttin’. In: Lindley, S., McBride, C., Trinder, P.W., San-
nella, D. (eds.) A List of Successes That Can Change the World - Essays Dedicated
to Philip Wadler on the Occasion of His 60th Birthday. Lecture Notes in Computer
Science, vol. 9600, pp. 207–233. Springer (2016). https://doi.org/10.1007/978-3-
319-30936-1_12
23. Mogensen, T.Æ.: Types for 0, 1 or many uses. In: Clack, C., Hammond, K.,
Davie, A.J.T. (eds.) Implementation of Functional Languages, 9th International
Workshop, IFL’97, St. Andrews, Scotland, UK, September 10-12, 1997, Selected
Papers. Lecture Notes in Computer Science, vol. 1467, pp. 112–122. Springer (1997).
https://doi.org/10.1007/BFb0055427
24. Morris, J.G.: The best of both worlds: linear functional programming with-
out compromise. In: Garrigue, J., Keller, G., Sumii, E. (eds.) Proceedings of
the 21st ACM SIGPLAN International Conference on Functional Programming,
ICFP 2016, Nara, Japan, September 18-22, 2016. pp. 448–461. ACM (2016).
https://doi.org/10.1145/2951913.2951925
25. Selinger, P., Valiron, B.: A lambda calculus for quantum computation with classical
control. Mathematical Structures in Computer Science 16(3), 527–552 (2006).
https://doi.org/10.1017/S0960129506005238
26. Spiwack, A., Domínguez, F., Boespflug, M., Bernardy, J.P.: Linear types. GHC
Proposals, https://github.com/tweag/ghc-proposals/blob/linear-types2/proposals/
0000-linear-types.rst, visited Sep. 11, 2019.
27. Stuckey, P.J., Sulzmann, M.: A theory of overloading. ACM Trans. Program. Lang.
Syst. 27(6), 1216–1269 (2005). https://doi.org/10.1145/1108970.1108974
28. Tov, J.A., Pucella, R.: Practical affine types. In: Ball, T., Sagiv, M. (eds.) Proceed-
ings of the 38th ACM SIGPLAN-SIGACT Symposium on Principles of Programming
Languages, POPL 2011, Austin, TX, USA, January 26-28, 2011. pp. 447–458. ACM
(2011). https://doi.org/10.1145/1926385.1926436
29. Turner, D.N., Wadler, P., Mossin, C.: Once upon a type. In: Williams, J. (ed.)
Proceedings of the seventh international conference on Functional programming
languages and computer architecture, FPCA 1995, La Jolla, California, USA, June
25-28, 1995. pp. 1–11. ACM (1995). https://doi.org/10.1145/224164.224168
30. Vytiniotis, D., Peyton Jones, S.L., Schrijvers, T.: Let should not be gener-
alized. In: Kennedy, A., Benton, N. (eds.) Proceedings of TLDI 2010: 2010
ACM SIGPLAN International Workshop on Types in Languages Design and
Implementation, Madrid, Spain, January 23, 2010. pp. 39–50. ACM (2010).
https://doi.org/10.1145/1708016.1708023
31. Vytiniotis, D., Peyton Jones, S.L., Schrijvers, T., Sulzmann, M.: Outsidein(x)
modular type inference with local assumptions. J. Funct. Program. 21(4-5), 333–
412 (2011). https://doi.org/10.1017/S0956796811000098
32. Wadler, P.: Linear types can change the world! In: Broy, M. (ed.) Programming
concepts and methods: Proceedings of the IFIP Working Group 2.2, 2.3 Working
Conference on Programming Concepts and Methods, Sea of Galilee, Israel, 2-5
April, 1990. p. 561. North-Holland (1990)
33. Wadler, P.: A taste of linear logic. In: Borzyszkowski, A.M., Sokolowski, S. (eds.)
Mathematical Foundations of Computer Science 1993, 18th International Sym-
posium, MFCS’93, Gdansk, Poland, August 30 - September 3, 1993, Proceed-
ings. Lecture Notes in Computer Science, vol. 711, pp. 185–210. Springer (1993).
https://doi.org/10.1007/3-540-57182-5_12
34. Wansbrough, K., Peyton Jones, S.L.: Once upon a polymorphic type. In: Appel,
A.W., Aiken, A. (eds.) POPL ’99, Proceedings of the 26th ACM SIGPLAN-SIGACT
Symposium on Principles of Programming Languages, San Antonio, TX, USA, Jan-
uary 20-22, 1999. pp. 15–28. ACM (1999). https://doi.org/10.1145/292540.292545
35. Yokoyama, T., Axelsen, H.B., Glück, R.: Towards a reversible functional language.
In: Vos, A.D., Wille, R. (eds.) RC. Lecture Notes in Computer Science, vol. 7165,
pp. 14–29. Springer (2011). https://doi.org/10.1007/978-3-642-29517-1_2
RustHorn: CHC-based Verification for Rust
Programs
1 Introduction
int mc91(int n) {
if (n > 100) return n - 10; else return mc91(mc91(n + 11));
}
Suppose that we wish to prove mc91(n) returns 91 whenever n ≤ 101 (if it
terminates). The desired property is equivalent to the satisfiability of the following
CHCs, where Mc91 (n, r) means that mc91(n) returns r if it terminates.

Mc91 (n, r) ⇐= n > 100 ∧ r = n − 10
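The recursive branch of mc91 and the target property contribute, by the same reading, clauses of the following shape (r′ names the result of the inner recursive call):

Mc91 (n, r) ⇐= n ≤ 100 ∧ Mc91 (n + 11, r′) ∧ Mc91 (r′, r)
r = 91 ⇐= Mc91 (n, r) ∧ n ≤ 101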
The full version of this paper is available as [47].
1
Free variables are universally quantified. Terms and variables are governed under
sorts (e.g. int, bool), which are made explicit in the formalization of § 3.
It can immediately return true; or it recursively calls itself and checks whether the
target of ma remains unchanged through the recursive call. In effect, this function
does nothing on the allocated memory blocks, although it can possibly modify
some of the unused parts of the memory.
Suppose we wish to verify that just_rec never returns false. The standard
CHC-based verifier for C, SeaHorn [23], generates a CHC system like below:56

JustRec(ma, h, h′, r) ⇐= h′ = h ∧ r = true
JustRec(ma, h, h′, r) ⇐= mb ≠ ma ∧ h″ = h{mb ← b}
                         ∧ JustRec(mb, h″, h′, r′) ∧ r = (h[ma] == h′[ma])
r = true ⇐= JustRec(ma, h, h′, r)

Unfortunately, the CHC system above is not satisfiable, and thus SeaHorn issues
a false alarm. This is because, in this formulation, mb may not necessarily be
completely fresh; it is assumed to be different from the argument ma of the
current call, but may coincide with the ma of some deep ancestor calls.7
The simplest remedy would be to explicitly specify the way of memory allo-
cation. For example, one can represent the memory state as a pair of an array h
and an index sp indicating the maximum index that has been allocated so far.

JustRec+(ma, h, sp, h′, sp′, r) ⇐= h′ = h ∧ sp′ = sp ∧ r = true
JustRec+(ma, h, sp, h′, sp′, r) ⇐= mb = sp″ = sp + 1 ∧ h″ = h{mb ← b}
                                   ∧ JustRec+(mb, h″, sp″, h′, sp′, r′) ∧ r = (h[ma] == h′[ma])
r = true ⇐= JustRec+(ma, h, sp, h′, sp′, r) ∧ ma ≤ sp

The resulting CHC system now has a model, but it involves quantifiers:

JustRec+(ma, h, sp, h′, sp′, r) :⇐⇒ r = true ∧ ∀ i ≤ sp. h[i] = h′[i]
Finding quantified invariants is known to be difficult in general, despite ac-
tive studies on it [41,2,36,26,19], and most current array-supporting CHC solvers
give up on finding quantified invariants. In general, much more complex operations
on pointers can naturally take place, which makes the universally quantified in-
variants highly involved and hard to find automatically. To avoid the complexity of
models, CHC-based verification tools [23,24,37] tackle pointers by pointer anal-
ysis [61,43]. Although it does have some effect, the current applicable scope of
pointer analysis is quite limited.
5
==, !=, >=, && denote binary operations that return boolean values.
6
We omitted the allocation for old_a for simplicity.
7
Precisely speaking, SeaHorn tends to omit even shallow address-freshness checks like
mb ≠ ma.
[Fig. 1 timeline omitted: it plots a, b and the pointers ma, mb, mc over the phases
(i)–(iv), delimited by the call of take_max, the return of take_max, and the end of
the borrowing.]
Fig. 1. Values and aliases of a and b in evaluating inc_max(5,3). Each line shows
each variable’s permission timeline: a solid line expresses the update permission and a
bullet shows a point when the borrowed permission is given back. For example, b has
the update permission to its content during (i) and (iv), but not during (ii) and (iii)
because the pointer mb, created at the call of take_max, borrows b until the end of (iii).
Key Idea. The key idea of our method is to represent a pointer ma as a pair ⟨a, a◦⟩
of the current target value a and the target value a◦ at the end of the borrow.89 This
representation employs access to future information (it is related to prophecy
variables; see § 5). This simple idea turns out to be very powerful.
In our approach, the verification problem “Does inc_max always return true?”
is reduced to the satisfiability of the following CHCs:

TakeMax (⟨a, a◦⟩, ⟨b, b◦⟩, r) ⇐= a ≥ b ∧ b◦ = b ∧ r = ⟨a, a◦⟩
TakeMax (⟨a, a◦⟩, ⟨b, b◦⟩, r) ⇐= a < b ∧ a◦ = a ∧ r = ⟨b, b◦⟩
IncMax (a, b, r) ⇐= TakeMax (⟨a, a◦⟩, ⟨b, b◦⟩, ⟨c, c◦⟩) ∧ c′ = c + 1
                    ∧ c◦ = c′ ∧ r = (a◦ != b◦)
r = true ⇐= IncMax (a, b, r).
The mutable reference ma is now represented as ⟨a, a◦⟩, and similarly for mb and
mc. The first CHC models the then-clause of take_max: the return value is ma,
which is expressed as r = ⟨a, a◦⟩; in contrast, mb is released, which constrains
b◦ , the value of b at the end of the borrow, to the current value b. In the clause on
IncMax , mc is represented as the pair ⟨c, c◦⟩. The constraint c′ = c + 1 ∧ c◦ = c′
models the increment of mc (in phase (iii) in Fig. 1). Importantly, the final
check a != b is simply expressed as a◦ != b◦ ; the updated values of a/b are
available as a◦ /b◦ . Clearly, the CHC system above has a simple model.
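For instance, interpreting TakeMax (⟨a, a◦⟩, ⟨b, b◦⟩, r) as (a ≥ b ∧ b◦ = b ∧ r = ⟨a, a◦⟩) ∨ (a < b ∧ a◦ = a ∧ r = ⟨b, b◦⟩) and IncMax (a, b, r) as r = true gives a model: in the first case of the IncMax clause we get c = a and c◦ = a◦ , hence a◦ = a + 1 and b◦ = b with a ≥ b, so a◦ != b◦ holds; the second case is symmetric.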
Also, the just_rec example in § 1.1 can be encoded as a CHC system

JustRec(⟨a, a◦⟩, r) ⇐= a◦ = a ∧ r = true
JustRec(⟨a, a◦⟩, r) ⇐= mb = ⟨b, b◦⟩ ∧ JustRec(mb, r′)
                       ∧ a◦ = a ∧ r = (a == a◦)
8
Precisely, this is the representation of a pointer with a borrowed update permission
(i.e. mutable reference). Other cases are discussed in § 3.
9
For example, in the case of Fig. 1, when take_max is called, the pointer ma is ⟨5, 6⟩
and mb is ⟨3, 3⟩.
r = true ⇐= JustRec(⟨a, a◦⟩, r).

Now it has a simple model: JustRec(⟨a, a◦⟩, r) :⇐⇒ r = true ∧ a◦ = a. Re-
markably, arrays and quantified formulas are not required to express the model,
which allows the CHC system to be easily solved by many CHC solvers. More
advanced examples are presented in § 3.4, including one with destructive updates
on a singly linked list.
Contributions. Based on the above idea, we formalize the translation from pro-
grams to CHC systems for a core language of Rust, prove correctness (both
soundness and completeness) of the translation, and confirm the effectiveness
of our approach through preliminary experiments. The core language supports,
among others, recursive types. Remarkably, our approach enables us to automat-
ically verify some properties of a program with destructive updates on recursive
data types such as lists and trees.
The rest of the paper is structured as follows. In § 2, we provide a formalized
core language of Rust supporting recursions, lifetime-based ownership and recur-
sive types. In §3, we formalize our translation from programs to CHCs and prove
its correctness. In § 4, we report on the implementation and the experimental
results. In § 5 we discuss related work and in § 6 we conclude the paper.
2.1 Syntax
The following is the syntax of COR.
(program)            Π ::= F0 · · · Fn−1
(function definition) F ::= fn f Σ {L0 : S0 · · · Ln−1 : Sn−1 }
(function signature)  Σ ::= ⟨α0 , . . . , αm−1 | αa0 ≤ αb0 , . . . , αal−1 ≤ αbl−1 ⟩
                            (x0 : T0 , . . . , xn−1 : Tn−1 ) → U
(statement)           S ::= I; goto L | return x
                          | match ∗x {inj0 ∗y0 → goto L0 , inj1 ∗y1 → goto L1 }
(instruction)         I ::= let y = mutborα x | drop x | immut x | swap(∗x, ∗y)
                          | let ∗y = x | let y = ∗x | let ∗y = copy ∗x | x as T
                          | let y = f ⟨α0 , . . . , αm−1 ⟩(x0 , . . . , xn−1 )
                          | intro α | now α | α ≤ β
                          | let ∗y = const | let ∗y = ∗x op ∗x′ | let ∗y = rand()
                          | let ∗y = inj_i^{T0+T1} ∗x | let ∗y = (∗x0 , ∗x1 ) | let (∗y0 , ∗y1 ) = ∗x
(type)                T, U ::= X | μX.T | P T | T0 +T1 | T0 ×T1 | int | unit
(pointer kind)        P ::= own | Rα        (reference kind)  R ::= mut | immut
owning pointer has data in the heap memory, can freely update the data (un-
less it is borrowed), and has the obligation to clean up the data from the heap
memory. In contrast, a mutable/immutable reference (or unique/shared refer-
ence) borrows an update/read permission from an owning pointer or another
reference with the deadline of a lifetime α (introduced later). A mutable ref-
erence cannot be copied, while an immutable reference can be freely copied. A
reference loses the permission at the time when it is released.14
A type T that appears in a program (not just as a substructure of some type)
should satisfy the following condition (if it holds, we say the type is complete):
every type variable X in T is bound by some μ and guarded by a pointer con-
structor (i.e., given a binding of the form μX.U , every occurrence of X in U is a
part of a pointer type, of the form P U ′).
Expressivity and Limitations. COR can express most borrow patterns in the
core of Rust. The set of moments when a borrow is active forms a continuous
time range, even under non-lexical lifetimes [54].16
A major limitation of COR is that it does not support unsafe code blocks and
also lacks type traits and closures. Still, our idea can be combined with unsafe
code and closures, as discussed in §3.5. Another limitation of COR is that, unlike
Rust and λRust , we cannot directly modify/borrow a fragment of a variable (e.g.
an element of a pair). Still, we can eventually modify/borrow a fragment by
borrowing the whole variable and splitting pointers (e.g. ‘let (∗y0 , ∗y1 ) = ∗x’).
This borrow-and-split strategy, nevertheless, yields a subtle obstacle when we
extend the calculus for advanced data types (e.g. get_default in ‘Problem Case
#3’ from [54]). For future work, we pursue a more expressive calculus modeling
Rust and extend our verification method to it.
Example 1 (COR Program). The following program expresses the functions
take_max and inc_max presented in § 1.2. We use shorthands for sequential executions.
14
In Rust, even after a reference loses the permission and the lifetime ends, its address
data can linger in the memory, although dereferencing on the reference is no longer
allowed. We simplify the behavior of lifetimes in COR.
15
In the terminology of Rust, a lifetime often means a time range where a borrow is
active. To simplify the discussions, however, we in this paper use the term lifetime
to refer to a time point when a borrow ends.
16
Strictly speaking, this property is broken by recently adopted implicit two-phase
borrows [59,53]. However, by shallow syntactical reordering, a program with implicit
two-phase borrows can be fit into usual borrow patterns.
fn take_max<'a>(ma: &'a mut i32, mb: &'a mut i32) -> &'a mut i32 {
if *ma >= *mb { drop mb; ma } else { drop ma; mb }
}
fn inc_max(mut a: i32, mut b: i32) -> bool {
{ intro 'a;
let mc = take_max<'a> (&'a mut a, &'a mut b); *mc += 1;
drop mc; now 'a; }
a != b
}
The type system of COR assigns to each label a whole context (Γ, A). We define
below the whole context and the typing judgments.
Program and Function. The rules for typing programs and functions are pre-
sented below. They assign to each label a whole context (Γ, A). ‘S:Π,f (Γ, A) |
(ΓL , AL )L | U ’ is explained later.
  for any F in Π,  F :Π (Γname(F ),L , Aname(F ),L )L∈LabelF
  ――――――――――――――――――――――
  ⊢ Π : (Γf,L , Af,L )(f,L)∈FnLabelΠ

name(F ): the function name of F        LabelF : the set of labels in F
FnLabelΠ : the set of pairs (f, L) such that a function f in Π has a label L

  F = fn f ⟨α0 , . . . , αm−1 | αa0 ≤ αb0 , . . . , αal−1 ≤ αbl−1 ⟩(x0 : T0 , . . . , xn−1 : Tn−1 ) → U {· · ·}
  Γentry = {xi : Ti | i ∈ [n]}    A = {αj | j ∈ [m]}    Aentry = ⟨A, (IdA ∪ {(αak , αbk ) | k ∈ [l]})⁺⟩
  for any L : S ∈ LabelStmtF ,  S :Π,f (ΓL , AL ) | (ΓL′ , AL′ )L′∈LabelF | U
  ――――――――――――――――――――――
  F :Π (ΓL , AL )L∈LabelF

LabelStmtF : the set of labeled statements in F
IdA : the identity relation on A        R⁺ : the transitive closure of R
On the rule for the function, the initial whole context at entry is specified
(the second and third preconditions) and also the contexts for other labels are
checked (the fourth precondition). The context for each label (in each function)
can actually be determined in the order by the distance in the number of goto
jumps from entry, but that order is not very obvious because of unstructured
control flows.
The rule for the return statement ensures that there remain no extra variables
and local lifetime variables.
Instruction. ‘I :Π,f (Γ, A) → (Γ′, A′)’ means that running the instruction I (un-
der Π, f ) updates the whole context (Γ, A) into (Γ′, A′). The rules are designed
so that, for any I, Π, f , (Γ, A), there exists at most one (Γ′, A′) such that
I :Π,f (Γ, A) → (Γ′, A′) holds. Below we present some of the rules; the complete
rules are presented in the full paper. The following is the typing rule for mutable
(re)borrow.
  α ∉ AexΠ,f        P = own, mutα′        for any β ∈ LifetimeP T , α ≤A β
  ――――――――――――――――――――――――
  let y = mutborα x :Π,f (Γ + {x : P T }, A) → (Γ + {y : mutα T, x : †α P T }, A)

LifetimeT : the set of lifetime variables occurring in T
On intro α, it just ensures the new local lifetime variable to be earlier than
any lifetime parameters (which are given by exterior functions). On now α, the
variables frozen with α get active again. Below is the typing rule for dereference
of a pointer to a pointer, which may be a bit interesting.
let y = ∗x :Π,f (Γ + {x : P P′ T }, A) → (Γ + {y : (P ◦ P′) T }, A)

where P ◦ own = own ◦ P := P and Rα ◦ R′β := R″α , with R″ = mut (if R = R′ =
mut) and R″ = immut (otherwise).

The third precondition of the typing rule for mutbor justifies taking just α in
the rule ‘Rα ◦ R′β := R″α ’.
Let us interpret ‘⊢ Π : (Γf,L , Af,L )(f,L)∈FnLabelΠ ’ as “the program Π has the
type (Γf,L , Af,L )(f,L)∈FnLabelΠ ”. The type system ensures that any program
has at most one type (which may be a bit unclear because of the unstructured
control flow). Hereinafter, we implicitly assume that a program has a type.
Here we introduce ‘#T ’, which represents how many memory cells the type T
takes (at the outermost level). #T is defined for every complete type T , because
every occurrence of type variables in a complete type is guarded by a pointer
constructor.

#(T0 +T1 ) := 1 + max{#T0 , #T1 }        #(T0 ×T1 ) := #T0 + #T1
#(μX.T ) := #(T [μX.T /X])        # int = #(P T ) := 1        # unit := 0
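For example, for the list type of Example 4, #(μX. int × own X + unit) = 1 + max{# int + #(own (μX. . . .)), # unit} = 1 + max{1 + 1, 0} = 3: one cell for the tag of the sum, one for the integer, and one for the owning pointer to the tail.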
Sort System. ‘t :Δ σ’ (the term t has the sort σ under Δ) is defined as follows.
Here, Δ is a finite map from variables to sorts. σ ∼ τ is the congruence on sorts
induced by μX.σ ∼ σ[μX.σ/X].

  x :Δ σ  (Δ(x) = σ)        ⟨t⟩ :Δ box σ  (t :Δ σ)        ⟨t∗ , t◦⟩ :Δ mut σ  (t∗ , t◦ :Δ σ)
  inji t :Δ σ0 + σ1  (t :Δ σi )        (t0 , t1 ) :Δ σ0 × σ1  (t0 :Δ σ0 , t1 :Δ σ1 )        const :Δ σconst
  ∗t :Δ σ  (t :Δ C σ)        ◦t :Δ σ  (t :Δ mut σ)        t.i :Δ σi  (t :Δ σ0 × σ1 )
  t op t′ :Δ σop  (t, t′ :Δ int)        t :Δ τ  (t :Δ σ, σ ∼ τ )

σconst : the sort of const        σop : the output sort of op
The CHC system (Φ, Ξ) is said to be well-sorted if wellSortedΞ (Φ) holds for every
Φ ∈ Φ.

When M |= (Φ, Ξ) holds, we say that M is a model of (Φ, Ξ). Every well-
sorted CHC system (Φ, Ξ) has a least model with respect to the pointwise ordering
(which can be proved based on the discussions in [16]), which we write as Mleast(Φ,Ξ) .
if, in COR, a function call f (v0 , . . . , vn−1 ) can return w. Actually, in concrete
operational semantics, such values should be read out from the heap memory.
The formal description and proof of this expected property is presented in § 3.3.
Auxiliary Definitions. The sort corresponding to the type T , (|T |), is defined
as follows. P̌ is a metavariable for a non-mutable-reference pointer kind, i.e.,
own or immutα . Note that the information on lifetimes is all stripped off.

(|X|) := X    (|μX.T |) := μX.(|T |)    (|P̌ T |) := box (|T |)    (|mutα T |) := mut (|T |)
(|int|) := int    (|unit|) := unit    (|T0 +T1 |) := (|T0 |) + (|T1 |)    (|T0 ×T1 |) := (|T0 |) × (|T1 |)
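For example, (|μX. int × own X + unit|) = μX. (int × box X) + unit, and (|mutα int|) = mut int; in particular, the lifetime α is erased, as stated.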
ϕ̌Π,f,L , ΞΠ,f,L and ΔΠ,f,L are defined
as follows, if the items in the variable context for the label are enumerated as
ϕ̌Π,f,L := fL (x0 , . . . , xn−1 , res) ΞΠ,f,L := ((|T0 |), . . . , (|Tn−1 |), (|U |))
ΔΠ,f,L := {(xi , (|Ti |)) | i ∈ [n]} + {(res, (|U |))}
∀(Δ) stands for ∀ x0 : σ0 , . . . , xn−1 : σn−1 , where the items in Δ are enumerated
as (x0 , σ0 ), . . . , (xn−1 , σn−1 ).
CHC Representation. Now we introduce ‘(|L: S|)Π,f ’, the set (in most cases,
singleton) of CHCs modeling the computation performed by the labeled state-
ment L: S in f from Π. Unlike informal descriptions in § 1, we turn to pattern
matching instead of equations, to simplify the proofs. Below we show some of the
rules; the complete rules are presented in the full paper. The variables marked
green (e.g. x◦ ) should be fresh. The following is the rule for mutable (re)borrow.
(|L: let y = mutborα x; goto L′ |)Π,f :=
  ∀(ΔΠ,f,L + {(x◦ , (|T |))}). ϕ̌Π,f,L ⇐= ϕ̌Π,f,L′ [⟨∗x, x◦⟩/y, ⟨x◦⟩/x]        (TyΠ,f,L (x) = own T )
  ∀(ΔΠ,f,L + {(x◦ , (|T |))}). ϕ̌Π,f,L ⇐= ϕ̌Π,f,L′ [⟨∗x, x◦⟩/y, ⟨x◦ , ◦x⟩/x]     (TyΠ,f,L (x) = mutα T )
The value at the end of borrow is represented as a newly introduced variable x◦ .
Below is the rule for release of a variable.
(|L: drop x; goto L′ |)Π,f :=
  ∀(ΔΠ,f,L ). ϕ̌Π,f,L ⇐= ϕ̌Π,f,L′                                               (TyΠ,f,L (x) = P̌ T )
  ∀(ΔΠ,f,L − {(x, mut (|T |))} + {(x∗ , (|T |))}). ϕ̌Π,f,L [⟨x∗ , x∗⟩/x] ⇐= ϕ̌Π,f,L′    (TyΠ,f,L (x) = mutα T )
take-maxL1 (ma, mb, inj1 ∗ou, res) ⇐= take-maxL2 (ma, mb, ou, res)
take-maxL1 (ma, mb, inj0 ∗ou, res) ⇐= take-maxL5 (ma, mb, ou, res)
take-maxL2 (ma, mb, ou, res) ⇐= take-maxL3 (ma, mb, res)
take-maxL3 (ma, ⟨mb∗ , mb∗⟩, res) ⇐= take-maxL4 (ma, res)
take-maxL4 (ma, ma) ⇐= ⊤
take-maxL5 (ma, mb, ou, res) ⇐= take-maxL6 (ma, mb, res)
take-maxL6 (⟨ma∗ , ma∗⟩, mb, res) ⇐= take-maxL7 (mb, res)
take-maxL7 (mb, mb) ⇐= ⊤
The fifth and eighth CHC represent release of mb/ma. The sixth and ninth CHC
represent the determination of the return value res.
Notations. We use {|· · ·|} (instead of {· · ·}) for the intensional description of
a multiset. A ⊕ B (or more generally ⊕λ Aλ ) denotes the multiset sum (e.g.,
{|0, 1|} ⊕ {|1|} = {|0, 1, 1|} ≠ {|0, 1|}).
Here, the ‘no duplicate items’ precondition checks the safety on the ownership.
COS-based Model. Now we introduce the COS-based model (COS stands for
concrete operational semantics) fΠ^COS to formally describe the expected input-
output relation. Here, for simplicity, f is restricted to one that does not take
lifetime parameters (we call such a function simple; the input/output types
of a simple function cannot contain references). We define fΠ^COS as the pred-
icate (on values of sorts (|T0 |), . . . , (|Tn−1 |), (|U |), if f ’s input/output types are
T0 , . . . , Tn−1 , U ) given by the following rule.

  C0 →Π · · · →Π CN    finalΠ (CN )    C0 = [f, entry] F | H    CN = [f, L] F′ | H′
  safeH  F :: ΓΠ,f,entry  {(xi , vi ) | i ∈ [n]}    safeH′  F′ :: ΓΠ,f,L  {(y, w)}
  ――――――――――――――――――――――――
  fΠ^COS (v0 , . . . , vn−1 , w)

ΓΠ,f,L : the variable context for the label L of f in the program Π
Proof. The details are presented in the full paper. We outline the proof below.
First, we introduce abstract operational semantics, where we get rid of heaps
and directly represent each variable in the program simply as a value with ab-
stract variables, which is strongly related to prophecy variables (see § 5). An
abstract variable represents the undetermined value of a mutable reference at
the end of borrow.
Next, we introduce SLDC resolution for CHC systems and find a bisimula-
tion between abstract operational semantics and SLDC resolution, whereby we
show that the AOS-based model, defined analogously to the COS-based model,
is equivalent to the least model of the CHC representation. Moreover, we find
a bisimulation between concrete and abstract operational semantics and prove
that the COS-based model is equivalent to the AOS-based model.
Finally, combining the equivalences, we achieve the proof for the correctness
of the CHC representation.
fn choose<'a>(ma: &'a mut i32, mb: &'a mut i32) -> &'a mut i32 {
if rand() { drop ma; mb } else { drop mb; ma }
}
fn linger_dec<'a>(ma: &'a mut i32) -> bool {
*ma -= 1; if rand() >= 0 { drop ma; return true; }
let mut b = rand(); let old_b = b; intro 'b; let mb = &'b mut b;
let r2 = linger_dec<'b> (choose<'b> (ma, mb)); now 'b;
r2 && old_b >= b
}
Unlike just_rec, the function linger_dec can modify the local variable of an
arbitrarily deep ancestor. Interestingly, each recursive call to linger_dec can
introduce a new lifetime 'b , which yields arbitrarily many layers of lifetimes.
Suppose we wish to verify that linger_dec never returns false. If we use,
like JustRec+ in § 1.1, a predicate taking the memory states h, h′ and the stack
pointer sp, we have to discover the quantified invariant ∀ i ≤ sp. h[i] ≥ h′[i]. In
contrast, our approach reduces this verification problem to the following CHCs:
Choose(⟨a, a◦⟩, ⟨b, b◦⟩, r) ⇐= b◦ = b ∧ r = ⟨a, a◦⟩
Choose(⟨a, a◦⟩, ⟨b, b◦⟩, r) ⇐= a◦ = a ∧ r = ⟨b, b◦⟩
LingerDec(⟨a, a◦⟩, r) ⇐= a′ = a − 1 ∧ a◦ = a′ ∧ r = true
LingerDec(⟨a, a◦⟩, r) ⇐= a′ = a − 1 ∧ oldb = b ∧ Choose(⟨a′, a◦⟩, ⟨b, b◦⟩, mc)
                         ∧ LingerDec(mc, r′) ∧ r = (r′ && oldb >= b◦)
r = true ⇐= LingerDec(⟨a, a◦⟩, r).
This can be solved by many solvers, since it has a very simple model:

Choose(⟨a, a◦⟩, ⟨b, b◦⟩, r) :⇐⇒ (b◦ = b ∧ r = ⟨a, a◦⟩) ∨ (a◦ = a ∧ r = ⟨b, b◦⟩)
LingerDec(⟨a, a◦⟩, r) :⇐⇒ r = true ∧ a ≥ a◦ .
Example 4. Combined with recursive data structures, our method turns out to
be more interesting. Let us consider the following Rust code:22
enum List { Cons(i32, Box<List>), Nil } use List::*;
fn take_some<'a>(mxs: &'a mut List) -> &'a mut i32 {
match mxs {
Cons(mx, mxs2) => if rand() { drop mxs2; mx }
                  else { drop mx; take_some<'a> (mxs2) },
Nil => { take_some<'a> (mxs) }
22
In COR, List can be expressed as μX.int × own X + unit.
}
}
fn sum(xs: &List) -> i32 {
match xs { Cons(x, xs2) => x + sum(xs2), Nil => 0 }
}
fn inc_some(mut xs: List) -> bool {
let n = sum(&xs); intro 'a; let my = take_some<'a> (&'a mut xs);
*my += 1; drop my; now 'a; let m = sum(&xs); m == n + 1
}
This is a program that manipulates singly linked integer lists, defined as a re-
cursive data type. take_some takes a mutable reference to a list and returns
a mutable reference to some element of the list. sum calculates the sum of the
elements of a list. inc_some increments some element of a list via a mutable
reference and checks that the sum of the elements of the list has increased by 1.
Suppose we wish to verify that inc_some never returns false. Our method
translates this verification problem into the following CHCs.23
3.5 Discussions
‘let a' = Random.int(0)’ expresses a random guess and ‘assume (a' = a)’
expresses a check. The original problem “Does inc_max never return false?”
is reduced to the problem “Does main never fail at assertion?” on the OCaml
program.24
This representation allows us to use various verification techniques, including
model checking (higher-order, temporal, bounded, etc.), semi-automated verifi-
cation (e.g. on Boogie [48]) and verification on proof assistants (e.g. Coq [15]).
The property to be verified can be not only partial correctness, but also total
correctness and liveness. Further investigation is left for future work.
Libraries with Unsafe Code. Our translation does not use lifetime information;
the correctness of our method is guaranteed by the nature of borrow. Whereas
24
MoCHi [39], a higher-order model checker for OCaml, successfully verified the safety
property for the OCaml representation above. It also successfully and instantly ver-
ified a similar representation of choose/linger_dec at Example 3.
lifetimes are used for static checks of the borrow discipline, many libraries in Rust
(e.g., RefCell) provide a mechanism for dynamic ownership checks.
We believe that such libraries with unsafe code can be verified, for use with our
method, by a separation logic such as Iris [35,33], as RustBelt [32] does. The good
news is that Iris has recently incorporated prophecy variables [34], which seem to
fit well with our approach. This is an interesting topic for future work.
After the libraries are verified, we can turn to our method. As an easy example, Vec [58] can be represented simply as a functional array; a mutable/immutable slice &mut [T]/&[T] can be represented as an array of mutable/immutable references. As another example, to deal with RefCell [56], we pass around an array that maps each RefCell<T> address to data of type T equipped with an ownership counter; RefCell itself is modeled simply as an address.25,26 Importantly, at the very time we take a mutable reference ⟨a, a◦⟩ from a ref-cell, the corresponding entry of the array should be updated to a◦. Using methods such as pointer analysis [61], we can possibly shrink the array.
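Concretely, writing cells for this array and p for a ref-cell address with cells[p] = a, taking the mutable reference ⟨a, a◦⟩ performs the functional update cells′ = cells[p ↦ a◦] (an illustrative rendering of this paragraph, not a formal definition from the paper), so that later accesses through the ref-cell observe the prophesied value a◦.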
Still, our method does not cope well with memory leaks [52] caused, for example, by a combination of RefCell and Rc [57], because such leaks obfuscate the ownership release of mutable references. We think that the use of Rc etc. should rather be restricted for smooth verification. Further investigation is needed.
False alarms of SeaHorn for the last six groups are mainly due to SeaHorn's problematic approximation of pointers and heap memories, as discussed in § 1.1. On the modified CHC outputs of SeaHorn, five false alarms were erased and four of them became successful. For the last four groups, unboundedly many memory cells can be allocated, which poses a fundamental challenge for SeaHorn's array-based approach, as discussed in § 1.1.31 The combination of RustHorn and HoIce took a relatively long time or reported a timeout for some programs, including unsafe ones, because HoIce is still an unstable tool compared to Spacer; in general, automated CHC solving can be rather unstable.
31 We also tried JustRec+, the stack-pointer-based accurate representation of just_rec presented in § 1.1, on Spacer, but we got a timeout of 180 seconds.
5 Related Work
CHC-based Verification of Pointer-Manipulating Programs. SeaHorn [23] is a
representative existing tool for CHC-based verification of pointer-manipulating
programs. It basically represents the heap memory as an array. Although some
pointer analyses [24] are used to optimize the array representation of the heap,
their approach suffers from the scalability problem discussed in § 1.1, as confirmed
by the experiments in § 4. Still, their approach is quite effective as automated
verification, given that many real-world pointer-manipulating programs do not
follow Rust-style ownership.
Another approach is taken by JayHorn [37,36], which translates Java pro-
grams (possibly using object pointers) to CHCs. They represent store invariants
using special predicates pull and push. Although this allows faster reasoning
about the heap than the array-based approach, it can suffer from more false
alarms. We conducted a small experiment for JayHorn (0.6-alpha) on some of
the benchmarks of § 4.2; unexpectedly, JayHorn reported ‘UNKNOWN’ (instead of
‘SAFE’ or ‘UNSAFE’) for even simple programs such as the programs of the instance
unique-scalar in simple and the instance basic in inc-max.
Verification for Rust. Whereas we have presented the first CHC-based (fully au-
tomated) verification method specially designed for Rust-style ownership, there
have been a number of studies on other types of verification for Rust.
RustBelt [32] aims to formally prove high-level safety properties for Rust
libraries with unsafe internal implementation, using manual reasoning on the
higher-order concurrent separation logic Iris [35,33] on the Coq Proof Assistant
[15]. Although their framework is flexible, automation of reasoning within the framework is hardly discussed. The language design of our COR is influenced by their formal calculus λRust.
Electrolysis [67] translates some subset of Rust into a purely functional programming language, to manually verify functional correctness on the Lean Theorem Prover [49]. Although it clears pointers away to obtain simple models, like our approach does, Electrolysis' applicable scope is quite limited, because it handles mutable references by simple static tracking of addresses based on lenses [20], not
supporting even basic use cases such as dynamic selection of mutable references
(e.g. take_max in § 1.2) [66], which our method can easily handle. Our approach
covers all usages of pointers of the safe core of Rust as discussed in § 3.
A series of studies [27,3,17] conducts (semi-)automated verification of Rust programs using Viper [50], a verification platform based on separation logic with fractional ownership. This approach can, to some extent, deal with unsafe code
[27] and type traits [17]. Astrauskas et al. [3] conduct semi-automated verifi-
cation (manually providing pre/post-conditions and loop invariants) on many
realistic examples. Because Viper is based on fractional ownership, however,
their platforms have to use concrete indexing on the memory for programs like
take_max/inc_max. In contrast, our idea leverages borrow-based ownership, and
it can also be applied to semi-automated verification, as suggested in § 3.5.
Several studies [65,4,44] employ bounded model checking on Rust programs,
especially with unsafe code. Our method can be applied to bounded model check-
ing as discussed in § 3.5.
6 Conclusion
References
1. Abadi, M., Lamport, L.: The existence of refinement mappings. Theor. Comput.
Sci. 82(2), 253–284 (1991). https://doi.org/10.1016/0304-3975(91)90224-P
2. Alberti, F., Bruttomesso, R., Ghilardi, S., Ranise, S., Sharygina, N.: Lazy ab-
straction with interpolants for arrays. In: Bjørner, N., Voronkov, A. (eds.)
Logic for Programming, Artificial Intelligence, and Reasoning - 18th Interna-
tional Conference, LPAR-18, Mérida, Venezuela, March 11-15, 2012. Proceed-
ings. Lecture Notes in Computer Science, vol. 7180, pp. 46–61. Springer (2012).
https://doi.org/10.1007/978-3-642-28717-6_7
3. Astrauskas, V., Müller, P., Poli, F., Summers, A.J.: Leveraging Rust types for modular specification and verification (2018). https://doi.org/10.3929/ethz-b-000311092
4. Baranowski, M.S., He, S., Rakamaric, Z.: Verifying Rust programs with SMACK. In: Lahiri and Wang [42], pp. 528–535. https://doi.org/10.1007/978-3-030-01090-4_32
5. Barnett, M., Fähndrich, M., Leino, K.R.M., Müller, P., Schulte, W., Venter, H.:
Specification and verification: The Spec# experience. Commun. ACM 54(6), 81–91
(2011). https://doi.org/10.1145/1953122.1953145
6. Bjørner, N., Gurfinkel, A., McMillan, K.L., Rybalchenko, A.: Horn clause
solvers for program verification. In: Beklemishev, L.D., Blass, A., Dershowitz,
N., Finkbeiner, B., Schulte, W. (eds.) Fields of Logic and Computation II
- Essays Dedicated to Yuri Gurevich on the Occasion of His 75th Birthday.
Lecture Notes in Computer Science, vol. 9300, pp. 24–51. Springer (2015).
https://doi.org/10.1007/978-3-319-23534-9_2
7. Bornat, R., Calcagno, C., O’Hearn, P.W., Parkinson, M.J.: Permission accounting
in separation logic. In: Palsberg, J., Abadi, M. (eds.) Proceedings of the 32nd
ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages,
POPL 2005, Long Beach, California, USA, January 12-14, 2005. pp. 259–270. ACM
(2005). https://doi.org/10.1145/1040305.1040327
8. Boyapati, C., Lee, R., Rinard, M.C.: Ownership types for safe program-
ming: Preventing data races and deadlocks. In: Ibrahim, M., Matsuoka,
S. (eds.) Proceedings of the 2002 ACM SIGPLAN Conference on Object-
Oriented Programming Systems, Languages and Applications, OOPSLA 2002,
Seattle, Washington, USA, November 4-8, 2002. pp. 211–230. ACM (2002).
https://doi.org/10.1145/582419.582440
9. Boyland, J.: Checking interference with fractional permissions. In: Cousot, R. (ed.)
Static Analysis, 10th International Symposium, SAS 2003, San Diego, CA, USA,
June 11-13, 2003, Proceedings. Lecture Notes in Computer Science, vol. 2694, pp.
55–72. Springer (2003). https://doi.org/10.1007/3-540-44898-5_4
10. Bradley, A.R., Manna, Z., Sipma, H.B.: What’s decidable about arrays? In: Emer-
son, E.A., Namjoshi, K.S. (eds.) Verification, Model Checking, and Abstract In-
terpretation, 7th International Conference, VMCAI 2006, Charleston, SC, USA,
January 8-10, 2006, Proceedings. Lecture Notes in Computer Science, vol. 3855,
pp. 427–442. Springer (2006). https://doi.org/10.1007/11609773_28
11. Champion, A., Chiba, T., Kobayashi, N., Sato, R.: ICE-based refinement type
discovery for higher-order functional programs. In: Beyer, D., Huisman, M. (eds.)
Tools and Algorithms for the Construction and Analysis of Systems - 24th Interna-
tional Conference, TACAS 2018, Held as Part of the European Joint Conferences
on Theory and Practice of Software, ETAPS 2018, Thessaloniki, Greece, April 14-
20, 2018, Proceedings, Part I. Lecture Notes in Computer Science, vol. 10805, pp.
365–384. Springer (2018). https://doi.org/10.1007/978-3-319-89960-2_20
12. Champion, A., Kobayashi, N., Sato, R.: HoIce: An ICE-based non-linear Horn
clause solver. In: Ryu, S. (ed.) Programming Languages and Systems - 16th Asian
Symposium, APLAS 2018, Wellington, New Zealand, December 2-6, 2018, Pro-
ceedings. Lecture Notes in Computer Science, vol. 11275, pp. 146–156. Springer
(2018). https://doi.org/10.1007/978-3-030-02768-1_8
13. Clarke, D.G., Potter, J., Noble, J.: Ownership types for flexible alias protection.
In: Freeman-Benson, B.N., Chambers, C. (eds.) Proceedings of the 1998 ACM
SIGPLAN Conference on Object-Oriented Programming Systems, Languages &
Applications (OOPSLA ’98), Vancouver, British Columbia, Canada, October 18-
22, 1998. pp. 48–64. ACM (1998). https://doi.org/10.1145/286936.286947
14. Cohen, E., Dahlweid, M., Hillebrand, M.A., Leinenbach, D., Moskal, M., Santen,
T., Schulte, W., Tobies, S.: VCC: A practical system for verifying concurrent C. In:
Berghofer, S., Nipkow, T., Urban, C., Wenzel, M. (eds.) Theorem Proving in Higher
Order Logics, 22nd International Conference, TPHOLs 2009, Munich, Germany,
August 17-20, 2009. Proceedings. Lecture Notes in Computer Science, vol. 5674,
pp. 23–42. Springer (2009). https://doi.org/10.1007/978-3-642-03359-9_2
15. Coq Team: The Coq proof assistant (2020), https://coq.inria.fr/
16. van Emden, M.H., Kowalski, R.A.: The semantics of predicate logic as
a programming language. Journal of the ACM 23(4), 733–742 (1976).
https://doi.org/10.1145/321978.321991
17. Erdin, M.: Verification of Rust Generics, Typestates, and Traits. Master’s thesis,
ETH Zürich (2019)
18. Fedyukovich, G., Kaufman, S.J., Bodı́k, R.: Sampling invariants from frequency
distributions. In: Stewart, D., Weissenbacher, G. (eds.) 2017 Formal Methods in
Computer Aided Design, FMCAD 2017, Vienna, Austria, October 2-6, 2017. pp.
100–107. IEEE (2017). https://doi.org/10.23919/FMCAD.2017.8102247
19. Fedyukovich, G., Prabhu, S., Madhukar, K., Gupta, A.: Quantified invariants via
syntax-guided synthesis. In: Dillig, I., Tasiran, S. (eds.) Computer Aided Verifica-
tion - 31st International Conference, CAV 2019, New York City, NY, USA, July
15-18, 2019, Proceedings, Part I. Lecture Notes in Computer Science, vol. 11561,
pp. 259–277. Springer (2019). https://doi.org/10.1007/978-3-030-25540-4_14
20. Foster, J.N., Greenwald, M.B., Moore, J.T., Pierce, B.C., Schmitt, A.: Com-
binators for bidirectional tree transformations: A linguistic approach to the
view-update problem. ACM Trans. Program. Lang. Syst. 29(3), 17 (2007).
https://doi.org/10.1145/1232420.1232424
21. Gondelman, L.: Un système de types pragmatique pour la vérification déductive des
programmes. (A Pragmatic Type System for Deductive Verification). Ph.D. thesis,
University of Paris-Saclay, France (2016), https://tel.archives-ouvertes.fr/tel-01533090
22. Grebenshchikov, S., Lopes, N.P., Popeea, C., Rybalchenko, A.: Synthesizing soft-
ware verifiers from proof rules. In: Vitek, J., Lin, H., Tip, F. (eds.) ACM
SIGPLAN Conference on Programming Language Design and Implementation,
PLDI ’12, Beijing, China - June 11 - 16, 2012. pp. 405–416. ACM (2012).
https://doi.org/10.1145/2254064.2254112
23. Gurfinkel, A., Kahsai, T., Komuravelli, A., Navas, J.A.: The SeaHorn verification
framework. In: Kroening, D., Pasareanu, C.S. (eds.) Computer Aided Verification
- 27th International Conference, CAV 2015, San Francisco, CA, USA, July 18-
24, 2015, Proceedings, Part I. Lecture Notes in Computer Science, vol. 9206, pp.
343–361. Springer (2015). https://doi.org/10.1007/978-3-319-21690-4_20
24. Gurfinkel, A., Navas, J.A.: A context-sensitive memory model for verification of
C/C++ programs. In: Ranzato, F. (ed.) Static Analysis - 24th International Sym-
posium, SAS 2017, New York, NY, USA, August 30 - September 1, 2017, Proceed-
ings. Lecture Notes in Computer Science, vol. 10422, pp. 148–168. Springer (2017).
https://doi.org/10.1007/978-3-319-66706-5_8
25. Gurfinkel, A., Shoham, S., Meshman, Y.: SMT-based verification of parameterized
systems. In: Zimmermann, T., Cleland-Huang, J., Su, Z. (eds.) Proceedings of
the 24th ACM SIGSOFT International Symposium on Foundations of Software
Engineering, FSE 2016, Seattle, WA, USA, November 13-18, 2016. pp. 338–348.
ACM (2016). https://doi.org/10.1145/2950290.2950330
26. Gurfinkel, A., Shoham, S., Vizel, Y.: Quantifiers on demand. In: Lahiri and Wang
[42], pp. 248–266. https://doi.org/10.1007/978-3-030-01090-4_15
27. Hahn, F.: Rust2Viper: Building a Static Verifier for Rust. Master’s thesis, ETH
Zürich (2016). https://doi.org/10.3929/ethz-a-010669150
28. Hoenicke, J., Majumdar, R., Podelski, A.: Thread modularity at many levels: A
pearl in compositional verification. In: Castagna, G., Gordon, A.D. (eds.) Pro-
ceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming
Languages, POPL 2017, Paris, France, January 18-20, 2017. pp. 473–485. ACM
(2017). https://doi.org/10.1145/3009837
29. Hojjat, H., Rümmer, P.: The Eldarica Horn solver. In: Bjørner, N., Gurfinkel,
A. (eds.) 2018 Formal Methods in Computer Aided Design, FMCAD 2018,
Austin, TX, USA, October 30 - November 2, 2018. pp. 1–7. IEEE (2018).
https://doi.org/10.23919/FMCAD.2018.8603013
30. Horn, A.: On sentences which are true of direct unions of algebras. The Journal of
Symbolic Logic 16(1), 14–21 (1951), http://www.jstor.org/stable/2268661
31. Jim, T., Morrisett, J.G., Grossman, D., Hicks, M.W., Cheney, J., Wang, Y.: Cy-
clone: A safe dialect of C. In: Ellis, C.S. (ed.) Proceedings of the General Track:
2002 USENIX Annual Technical Conference, June 10-15, 2002, Monterey, Califor-
nia, USA. pp. 275–288. USENIX (2002), http://www.usenix.org/publications/library/proceedings/usenix02/jim.html
32. Jung, R., Jourdan, J., Krebbers, R., Dreyer, D.: RustBelt: Securing the founda-
tions of the Rust programming language. PACMPL 2(POPL), 66:1–66:34 (2018).
https://doi.org/10.1145/3158154
33. Jung, R., Krebbers, R., Jourdan, J., Bizjak, A., Birkedal, L., Dreyer, D.: Iris from
the ground up: A modular foundation for higher-order concurrent separation logic.
J. Funct. Program. 28, e20 (2018). https://doi.org/10.1017/S0956796818000151
34. Jung, R., Lepigre, R., Parthasarathy, G., Rapoport, M., Timany, A., Dreyer, D.,
Jacobs, B.: The future is ours: Prophecy variables in separation logic. PACMPL
4(POPL), 45:1–45:32 (2020). https://doi.org/10.1145/3371113
35. Jung, R., Swasey, D., Sieczkowski, F., Svendsen, K., Turon, A., Birkedal, L.,
Dreyer, D.: Iris: Monoids and invariants as an orthogonal basis for concurrent
reasoning. In: Rajamani, S.K., Walker, D. (eds.) Proceedings of the 42nd Annual
ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages,
POPL 2015, Mumbai, India, January 15-17, 2015. pp. 637–650. ACM (2015).
https://doi.org/10.1145/2676726.2676980
36. Kahsai, T., Kersten, R., Rümmer, P., Schäf, M.: Quantified heap invariants for
object-oriented programs. In: Eiter, T., Sands, D. (eds.) LPAR-21, 21st Interna-
tional Conference on Logic for Programming, Artificial Intelligence and Reasoning,
Maun, Botswana, May 7-12, 2017. EPiC Series in Computing, vol. 46, pp. 368–384.
EasyChair (2017)
37. Kahsai, T., Rümmer, P., Sanchez, H., Schäf, M.: JayHorn: A framework for ver-
ifying Java programs. In: Chaudhuri, S., Farzan, A. (eds.) Computer Aided Ver-
ification - 28th International Conference, CAV 2016, Toronto, ON, Canada, July
17-23, 2016, Proceedings, Part I. Lecture Notes in Computer Science, vol. 9779,
pp. 352–358. Springer (2016). https://doi.org/10.1007/978-3-319-41528-4_19
38. Kalra, S., Goel, S., Dhawan, M., Sharma, S.: Zeus: Analyzing safety of smart
contracts. In: 25th Annual Network and Distributed System Security Symposium,
NDSS 2018, San Diego, California, USA, February 18-21, 2018. The Internet So-
ciety (2018)
39. Kobayashi, N., Sato, R., Unno, H.: Predicate abstraction and CEGAR for higher-
order model checking. In: Hall, M.W., Padua, D.A. (eds.) Proceedings of the 32nd
ACM SIGPLAN Conference on Programming Language Design and Implementa-
tion, PLDI 2011, San Jose, CA, USA, June 4-8, 2011. pp. 222–233. ACM (2011).
https://doi.org/10.1145/1993498.1993525
40. Komuravelli, A., Gurfinkel, A., Chaki, S.: SMT-based model checking for recursive
programs. In: Biere, A., Bloem, R. (eds.) Computer Aided Verification - 26th Inter-
national Conference, CAV 2014, Held as Part of the Vienna Summer of Logic, VSL
2014, Vienna, Austria, July 18-22, 2014. Proceedings. Lecture Notes in Computer
Science, vol. 8559, pp. 17–34. Springer (2014). https://doi.org/10.1007/978-3-319-08867-9_2
41. Lahiri, S.K., Bryant, R.E.: Constructing quantified invariants via predicate ab-
straction. In: Steffen, B., Levi, G. (eds.) Verification, Model Checking, and Ab-
stract Interpretation, 5th International Conference, VMCAI 2004, Venice, Italy,
January 11-13, 2004, Proceedings. Lecture Notes in Computer Science, vol. 2937,
pp. 267–281. Springer (2004). https://doi.org/10.1007/978-3-540-24622-0_22
42. Lahiri, S.K., Wang, C. (eds.): Automated Technology for Verification and Analysis
- 16th International Symposium, ATVA 2018, Los Angeles, CA, USA, October
7-10, 2018, Proceedings, Lecture Notes in Computer Science, vol. 11138. Springer
(2018). https://doi.org/10.1007/978-3-030-01090-4
43. Lattner, C., Adve, V.S.: Automatic pool allocation: Improving performance by
controlling data structure layout in the heap. In: Sarkar, V., Hall, M.W. (eds.)
Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language
Design and Implementation, Chicago, IL, USA, June 12-15, 2005. pp. 129–142.
ACM (2005). https://doi.org/10.1145/1065010.1065027
44. Lindner, M., Aparicius, J., Lindgren, P.: No panic! Verification of Rust programs
by symbolic execution. In: 16th IEEE International Conference on Industrial Infor-
matics, INDIN 2018, Porto, Portugal, July 18-20, 2018. pp. 108–114. IEEE (2018).
https://doi.org/10.1109/INDIN.2018.8471992
45. Matsakis, N.D.: Introducing MIR (2016), https://blog.rust-lang.org/2016/04/19/MIR.html
46. Matsakis, N.D., Klock II, F.S.: The Rust language. In: Feldman, M., Taft, S.T.
(eds.) Proceedings of the 2014 ACM SIGAda annual conference on High integrity
language technology, HILT 2014, Portland, Oregon, USA, October 18-21, 2014. pp.
103–104. ACM (2014). https://doi.org/10.1145/2663171.2663188
47. Matsushita, Y., Tsukada, T., Kobayashi, N.: RustHorn: CHC-based verification for
Rust programs (full version). CoRR (2020), https://arxiv.org/abs/2002.09002
48. Microsoft: Boogie: An intermediate verification language (2020), https://www.microsoft.com/en-us/research/project/boogie-an-intermediate-verification-language/
49. de Moura, L.M., Kong, S., Avigad, J., van Doorn, F., von Raumer, J.: The
Lean theorem prover (system description). In: Felty, A.P., Middeldorp, A.
(eds.) Automated Deduction - CADE-25 - 25th International Conference on
Automated Deduction, Berlin, Germany, August 1-7, 2015, Proceedings. Lec-
ture Notes in Computer Science, vol. 9195, pp. 378–388. Springer (2015).
https://doi.org/10.1007/978-3-319-21401-6_26
50. Müller, P., Schwerhoff, M., Summers, A.J.: Viper: A verification infrastructure
for permission-based reasoning. In: Jobstmann, B., Leino, K.R.M. (eds.) Verifi-
cation, Model Checking, and Abstract Interpretation - 17th International Con-
ference, VMCAI 2016, St. Petersburg, FL, USA, January 17-19, 2016. Proceed-
ings. Lecture Notes in Computer Science, vol. 9583, pp. 41–62. Springer (2016).
https://doi.org/10.1007/978-3-662-49122-5_2
51. Rust Community: The MIR (Mid-level IR) (2020), https://rust-lang.github.io/rustc-guide/mir/index.html
52. Rust Community: Reference cycles can leak memory - the Rust programming language (2020), https://doc.rust-lang.org/book/ch15-06-reference-cycles.html
53. Rust Community: RFC 2025: Nested method calls (2020), https://rust-lang.github.io/rfcs/2025-nested-method-calls.html
54. Rust Community: RFC 2094: Non-lexical lifetimes (2020), https://rust-lang.github.io/rfcs/2094-nll.html
55. Rust Community: Rust programming language (2020), https://www.rust-lang.org/
56. Rust Community: std::cell::RefCell - Rust (2020), https://doc.rust-lang.org/std/cell/struct.RefCell.html
57. Rust Community: std::rc::Rc - Rust (2020), https://doc.rust-lang.org/std/rc/struct.Rc.html
58. Rust Community: std::vec::Vec - Rust (2020), https://doc.rust-lang.org/std/vec/struct.Vec.html
59. Rust Community: Two-phase borrows (2020), https://rust-lang.github.io/rustc-guide/borrow_check/two_phase_borrows.html
60. Sato, R., Iwayama, N., Kobayashi, N.: Combining higher-order model checking with
refinement type inference. In: Hermenegildo, M.V., Igarashi, A. (eds.) Proceedings
of the 2019 ACM SIGPLAN Workshop on Partial Evaluation and Program Manip-
ulation, PEPM@POPL 2019, Cascais, Portugal, January 14-15, 2019. pp. 47–53.
ACM (2019). https://doi.org/10.1145/3294032.3294081
61. Steensgaard, B.: Points-to analysis in almost linear time. In: Boehm, H., Jr., G.L.S.
(eds.) Conference Record of POPL’96: The 23rd ACM SIGPLAN-SIGACT Sym-
posium on Principles of Programming Languages, Papers Presented at the Sympo-
sium, St. Petersburg Beach, Florida, USA, January 21-24, 1996. pp. 32–41. ACM
Press (1996). https://doi.org/10.1145/237721.237727
62. Stump, A., Barrett, C.W., Dill, D.L., Levitt, J.R.: A decision procedure for an ex-
tensional theory of arrays. In: 16th Annual IEEE Symposium on Logic in Computer
Science, Boston, Massachusetts, USA, June 16-19, 2001, Proceedings. pp. 29–37.
IEEE Computer Society (2001). https://doi.org/10.1109/LICS.2001.932480
63. Suenaga, K., Kobayashi, N.: Fractional ownerships for safe memory dealloca-
tion. In: Hu, Z. (ed.) Programming Languages and Systems, 7th Asian Sym-
posium, APLAS 2009, Seoul, Korea, December 14-16, 2009. Proceedings. Lec-
ture Notes in Computer Science, vol. 5904, pp. 128–143. Springer (2009).
https://doi.org/10.1007/978-3-642-10672-9_11
64. Terauchi, T.: Checking race freedom via linear programming. In: Gupta, R., Ama-
rasinghe, S.P. (eds.) Proceedings of the ACM SIGPLAN 2008 Conference on Pro-
gramming Language Design and Implementation, Tucson, AZ, USA, June 7-13,
2008. pp. 1–10. ACM (2008). https://doi.org/10.1145/1375581.1375583
65. Toman, J., Pernsteiner, S., Torlak, E.: crust: A bounded verifier for Rust.
In: Cohen, M.B., Grunske, L., Whalen, M. (eds.) 30th IEEE/ACM Interna-
tional Conference on Automated Software Engineering, ASE 2015, Lincoln,
NE, USA, November 9-13, 2015. pp. 75–80. IEEE Computer Society (2015).
https://doi.org/10.1109/ASE.2015.77
66. Ullrich, S.: Electrolysis reference (2016), http://kha.github.io/electrolysis/
67. Ullrich, S.: Simple Verification of Rust Programs via Functional Purification. Mas-
ter’s thesis, Karlsruhe Institute of Technology (2016)
68. Vafeiadis, V.: Modular fine-grained concurrency verification. Ph.D. thesis, Univer-
sity of Cambridge, UK (2008), http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.612221
69. Z3 Team: The Z3 theorem prover (2020), https://github.com/Z3Prover/z3
A First-Order Logic with Frames
Abstract. We propose a novel logic, called Frame Logic (FL), that extends first-order logic (with recursive definitions) using a construct Sp(·) that captures the implicit supports of formulas: the precise subset of the universe upon which their meaning depends. Using such supports, we formulate proof rules that facilitate frame reasoning elegantly when the underlying model undergoes change. We show that the logic is expressive by capturing several data-structures, and we exhibit a translation from a precise fragment of separation logic to frame logic. Finally, we design a program logic, based on frame logic, for reasoning with programs that dynamically update heaps; it facilitates local specifications and frame reasoning, and it consists of both localized proof rules and rules that derive the weakest tightest preconditions in FL.
1 Introduction
Program logics for expressing and reasoning with programs that dynamically manipulate heaps are an active area of research. The research on separation logic
has argued convincingly that it is highly desirable to have localized logics that
talk about small states (heaplets rather than the global heap), and the ability
to do frame reasoning. Separation logic achieves this objective by having a tight
heaplet semantics and using special operators, primarily a separating conjunction
operator ∗ and a separating implication operator (the magic wand −∗).
In this paper, we ask a fundamental question: can classical logics (such as
FOL and FOL with recursive definitions) be extended to support localized spec-
ifications and frame reasoning? Can we utilize classical logics for reasoning effec-
tively with programs that dynamically manipulate heaps, with the aid of local
specifications and frame reasoning?
The primary contribution of this paper is to endow a classical logic, namely
first-order logic with recursive definitions (with least fixpoint semantics) with
frames and frame reasoning.
For a formula ϕ, the support operator Sp(ϕ) captures the precise set of elements that the value of ϕ depends on. We then prove a frame theorem (Theorem 1) that
says that changing a model M by changing the interpretation of functions that
are not in the support of ϕ will not affect the truth of the formula ϕ. This theo-
rem then directly supports frame reasoning; if a model satisfies ϕ and the model
is changed so that the changes made are disjoint from the support of ϕ, then
ϕ will continue to hold. We also show that FL formulae can be translated to
vanilla FO-RD logic (without support operators); in other words, the semantics
for the support of a formula can be captured in FO-RD itself. Consequently, we
can use any FO-RD reasoning mechanism (proof systems [19, 20] or heuristic
algorithms such as the natural proof techniques [24, 32, 37, 41]) to reason with
FL formulas.
We illustrate our logic using several examples drawn from program verifica-
tion; we show how to express various data-structure definitions and the elements
they contain and various measures for them using FL formulas (e.g., linked lists,
sorted lists, list segments, binary search trees, AVL trees, lengths of lists, heights
of trees, the set of keys stored in the data-structure, etc.).
While the sensibilities of our logic are definitely inspired by separation logic,
there are some fundamental differences beyond the fact that our logic extends
the syntax and semantics of classical logics with a special support operator
and avoids operators such as ∗ and −∗. In separation logic, there can be many
supports of a formula (also called heaplets): a heaplet for a formula is one that
supports its truth. For example, a formula of the form α ∨ β can have a heaplet
that supports the truth of α or one that supports the truth of β. However,
the philosophy that we follow in our design is to have a single support that
supports the truth value of a formula, whether it be true or false. Consequently,
the support of the formula α ∨ β is the union of the supports of the formulas α
and β.
The above design choice of the support being determined by the formula has
several consequences that lead to a deviation from separation logic. For instance,
the support of the negation of a formula ϕ is the same as the support of ϕ. And
the support of the formula f (x) = y and its negation are the same, namely the
singleton location interpreted for x. In separation logic, the corresponding for-
mula will have the same heaplet but its negation will include all other heaplets.
The choice of having determined supports or heaplets is not new, and there have
been several variants and sublogics of separation logics that have been explored.
For example, the logic Dryad [32, 37] is a separation logic that insists on de-
termined heaplets to support automated reasoning, and the precise fragment of
separation logic studied in the literature [29] defines a sublogic that has (essen-
tially) determined heaplets. The second main contribution in this paper is to
show that this fragment of separation logic (with slight changes for technical
reasons) can be translated to frame logic, such that the unique heaplet that
satisfies a precise separation logic formula is the support of the corresponding formula in frame logic.
The third main contribution of this paper is a program logic based on frame
logic for a simple while-programming language destructively updating heaps. We
present two kinds of proof rules for reasoning with such programs annotated with
pre- and post-conditions written in frame logic. The first set of rules are local
rules that axiomatically define the semantics of the program, using the small-
est supports for each command. We also give a frame rule that allows arguing
preservation of properties whose supports are disjoint from the heaplet modified
by a program. These rules are similar to analogous rules in separation logic.
The second class of rules gives a weakest tightest precondition for any
postcondition with respect to non-recursive programs. In separation logic, the
corresponding rules for weakest preconditions are often expressed using separat-
ing implication (the magic-wand operator). Given a small change made to the
heap and a postcondition β, the formula α −∗ β captures all heaplets H where
if a heaplet that satisfies α is joined with H, then β holds. When α describes
the change effected by the program, α −∗ β captures, essentially, the weakest
precondition. However, the magic wand is a very powerful operator that calls for
quantifications over heaplets and submodels, and hence involves second order
quantification. In our logic, we show that we can capture the weakest precon-
dition with only first-order quantification, and hence first-order frame logic is
closed under weakest preconditions across non-recursive program blocks. This means that when inductive loop invariants are also given in FL, reasoning with
programs reduces to reasoning with FL. By translating FL to pure FO-RD for-
mulas, we can use FO-RD reasoning techniques to reason with FL, and hence
programs.
The base logic upon which we build frame logic is a first-order logic with recursive
definitions (FO-RD), where we allow a foreground sort and several background
sorts, each with their individual theories (like arithmetic, sets, arrays, etc.). The
foreground sort and functions involving the foreground sort are uninterpreted
(not constrained by theories). This hence can be seen as an uninterpreted com-
bination of theories over disjoint domains. This logic has been defined and used
to model heap verification before [23].
We will build frame logic over such a framework where supports are modeled
as subsets of elements of the foreground sort. When modeling heaps in program
verification using logic, the foreground sort will be used to model locations of the
heap, uninterpreted functions from the foreground sort to foreground sort will
be used to model pointers, and uninterpreted functions from the foreground sort
to the background sort will model data fields. Consequently, supports will be
subsets of locations of the heap, which is appropriate as these are the domains
of pointers that change when a program updates a heap.
We define a signature as Σ = (S; C; F ; R; I), where S is a finite non-empty
set of sorts. C is a set of constant symbols, where each c ∈ C has some sort
τ ∈ S. F is a set of function symbols, where each function f ∈ F has a type of
the form τ1 × . . . × τm → τ for some m, with τi , τ ∈ S. The sets R and I are
(disjoint) sets of relation symbols, where each relation R ∈ R ∪ I has a type of
the form τ1 × . . . × τm . The set I contains those relation symbols for which the
corresponding relations are inductively defined using formulas (details are given
below), while those in R are given by the model.
We assume that the set of sorts contains a designated “foreground sort”
denoted by σf . All the other sorts in S are called background sorts, and for
each such background sort σ we allow the constant symbols of type σ, function
symbols that have type σ^n → σ for some n, and relation symbols that have type σ^m for some m, to be constrained using an arbitrary theory Tσ.
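For instance, to model heaps of singly linked integer lists, an illustrative signature (our example, not part of the paper's formal development) is S = {σf, σint}, C = {nil : σf}, and F ⊇ {next : σf → σf, key : σf → σint}, where σint is constrained by the theory of arithmetic and next and key model the pointer field and the data field, respectively.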
A formula in first-order logic with recursive definitions (FO-RD) over such a
signature is of the form (D, α), where D is a set of recursive definitions of the
form R(x) := ρR (x), where R ∈ I and ρR (x) is a first-order logic formula, in
which the relation symbols from I occur only positively. α is also a first-order
logic formula over the signature. We assume D has at most one definition for any
inductively defined relation, and that the formulas ρR and α use only inductive
relations defined in D.
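For example, with the illustrative list signature above, D could consist of the single definition
list(x) := x = nil ∨ list(next(x))
whose least-fixpoint semantics makes list(x) true exactly when nil is reachable from x by finitely many applications of next; α could then be a formula such as list(c) for a foreground constant c.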
The semantics of a formula is standard; the semantics of inductively defined
relations are defined to be the least fixpoint that satisfies the relational equations,
and the semantics of α is the standard one defined using these semantics for
relations. We do not formally define the semantics, but we will formally define
the semantics of frame logic (discussed in the next section and whose semantics
is defined in the Technical Report [25]) which is an extension of FO-RD.
3 Frame Logic
We now define Frame Logic (FL), the central contribution of this paper.
Fig. 1. Syntax of frame logic: γ for guards, tτ for terms of sort τ , and general formulas
ϕ. Guards cannot use inductively defined relations or support expressions.
The syntax of our logic is given in the grammar in Figure 1. This extends FO-RD
with the rule for building support expressions, which are terms of sort σS(f) of
the form Sp(α) for a formula α, or Sp(t) for a term t.
The formulas defined by γ are used as guards in existential quantification and
in the if-then-else-operator, which is denoted by ite. The restriction compared to
general formulas is that guards cannot use inductively defined relations (R ranges
only over R in the rule for γ, and over R ∪ I in the rule for ϕ), nor terms of sort
σS(f) and thus no support expressions (τ ranges over S \ {σS(f) } in the rules for γ
and over S in the rule for ϕ). The requirement that the guard does not use the
inductive relations and support expressions is used later to ensure the existence
of least fixpoints for defining semantics of inductive definitions. The semantics of
an ite-formula ite(γ : α, β) is the same as that of (γ ∧ α) ∨ (¬γ ∧ β); however, the
supports of the two formulas will turn out to be different (i.e., Sp(ite(γ : α, β))
and Sp((γ ∧ α) ∨ (¬γ ∧ β)) are different), as explained in Section 3.2. The same
is true for existential formulas, i.e., ∃y : γ.ϕ has the same semantics as ∃y.γ ∧ ϕ
but, in general, has a different support.
For recursive definitions (throughout the paper, we use the terms recursive
definitions and inductive definitions with the same meaning), we require that
the relation R that is defined does not have arguments of sort σS(f) . This is
another restriction in order to ensure the existence of a least fixpoint model in
the definition of the semantics.1
We discuss the design decisions that go behind the semantics of the support
operator Sp in our logic, and then give an example for the support of an inductive
definition. The formal conditions that the supports should satisfy are stated in
the equations in Figure 2, and are explained in Section 3.3. Here, we start by an
informal discussion.
The first decision is to have every formula uniquely define a support, which
roughly captures the subdomain of mutable functions that a formula ϕ’s truth-
hood depends on, and have Sp(ϕ) evaluate to it.
The choice for supports of atomic formulae are relatively clear. An atomic
formula of the kind f (x)=y, where x is of the foreground sort and f is a mutable
function, has as its support the singleton set containing the location interpreted
1 It would be sufficient to restrict formulas of the form R(t1, . . . , tn) for inductive relations R to not contain support expressions as subterms.
for x. And atomic formulas that do not involve mutable functions over the fore-
ground have an empty support. Supports for terms can also be similarly defined.
The support of a conjunction α ∧ β should clearly be the union of the supports
of the two formulas.
Remark 1. In traditional separation logic, each pointer field is stored in a sep-
arate location, using integer offsets. However, in our work, we view pointers as
references and disallow pointer arithmetic. A more accurate heaplet for such
references can be obtained by taking the heaplet to be the pair (x, f) (see [30]), cap-
turing the fact that the formula depends only on the field f of x. Such accurate
heaplets can be captured in FL as well— we can introduce a non-mutable field
lookup pointer Lf and use x.Lf .f in programs instead of x.f .
What should the support of a formula α ∨ β be? The choice we make here is
that its support is the union of the supports of α and β. Note that in a model
where α is true and β is false, we still include the heaplet of β in Sp(α ∨ β). In a
sense, this is an overapproximation of the support as far as frame reasoning goes,
as surely preserving the model’s definitions on the support of α will preserve the
truth of α, and hence of α ∨ β.
However, we prefer the support to be the union of the supports of α and β.
We think of the support as the subdomain of the universe that determines the
meaning of the formula, whether it be true or false. Consequently, we would like
the support of a formula and its negation to be the same. Given that the support
of the negation of a disjunction, being a conjunction, is the union of the frames
of α and β, we would like this to be the support.
Separation logic makes a different design decision. Logical formulas are not
associated with tight supports, but rather, the semantics of the formula is defined
for models with given supports/heaplets, where the idea of a heaplet is whether
it supports the truthhood of a formula (and not its falsehood). For example,
for a model, the various heaplets that satisfy ¬(f (x) = y) in separation logic
would include all heaplets where the location of x is not present, which does
not coincide with the notion we have chosen for supports. However, for positive
formulas, separation logic handles supports more accurately, as it can associate
several supports for a formula, yielding two heaplets for formulas of the form
α ∨ β when they are both true in a model. The decision to have a single support
for a formula compels us to take the union of the supports to be the support of
a disjunction.
There are situations, however, where there are disjunctions α ∨ β, where only
one of the disjuncts can possibly be true, and hence we would like the support
of the formula to be the support of the disjunct that happens to be true. We
therefore introduce a new syntactical form ite(γ : α, β) in frame logic, whose
heaplet is the union of the supports of γ and α, if γ is true, and the supports
of γ and β if γ is false. While the truthhood of ite(γ : α, β) is the same as that
of (γ ∧ α) ∨ (¬γ ∧ β), its supports are potentially smaller, allowing us to write
formulas with tighter supports to support better frame reasoning. Note that the
support of ite(γ : α, β) and its negation ite(γ : ¬α, ¬β) are the same, as we
desired.
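Schematically, for a model M and assignment ν (a paraphrase of the corresponding clause of Figure 2):
Sp(ite(γ : α, β))^{M,ν} = Sp(γ)^{M,ν} ∪ Sp(α)^{M,ν} if M, ν |= γ, and
Sp(ite(γ : α, β))^{M,ν} = Sp(γ)^{M,ν} ∪ Sp(β)^{M,ν} otherwise.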
Turning to quantification, the support for a formula of the form ∃x.α is hard
to define, as its truthhood could depend on the entire universe. We hence provide
a mechanism for guarded quantification, in the form ∃x : γ. α. The semantics
of this formula is that there exists some location that satisfies the guard γ, for
which α holds. The support for such a formula includes the support of the guard,
and the supports of α when x is interpreted to be a location that satisfies γ. For
example, ∃x : (x = f (y)). g(x) = z has as its support the locations interpreted
for y and f (y) only.
For a formula R(t) with an inductive relation R defined by R(x) := ρR (x),
the support descends into the definition, changing the variable assignment of the
variables in x from the inductive definition to the terms in t. Furthermore, it
contains the elements to which mutable functions are applied in the terms in t.
Recursive definitions are designed such that the evaluation of the equations
for the support expressions is independent of the interpretation of the inductive
relations. The equations mainly depend on the syntactic structure of formulas
and terms. Only the semantics of guards, and the semantics of subterms under
a mutable function symbol play a role. For this reason, we disallow guards to
contain recursively defined relations or support expressions. We also require that
the only functions involving the sort σS(f) are the standard functions involving
sets. Thus, subterms of mutable functions cannot contain support expressions
(which are of sort σS(f) ) as subterms.
These restrictions ensure that there indeed exists a unique simultaneous least
solution of the equations for the inductive relations and the support expressions.
We end this section with an example.
Example 1. Consider the definition of a predicate tree(x) w.r.t. two unary mu-
table functions left and right:
tree(x) := ite(x = nil : true, α) where
α = ∃ℓ, r : (ℓ = left(x) ∧ r = right(x)). tree(ℓ) ∧ tree(r) ∧
    Sp(tree(ℓ)) ∩ Sp(tree(r)) = ∅ ∧ ¬(x ∈ Sp(tree(ℓ)) ∪ Sp(tree(r)))
This inductive definition defines binary trees with pointer fields left and right
for left- and right-pointers, by stating that x points to a tree if either x is equal
to nil (in this case its support is empty), or left(x) and right(x) are trees with
disjoint supports. The last conjunct says that x does not belong to the support
of the left and right subtrees; this condition is, strictly speaking, not required to
define trees (under least fixpoint semantics). Note that the access to the support
of formulas eases defining disjointness of heaplets, like in separation logic. The
support of tree(x) turns out to be precisely the nodes that are reachable from
x using left and right pointers, as one would desire. Consequently, if a pointer
outside this support changes, we would be able to conclude using frame reasoning
that the truth value of tree(x) does not change.
Furthermore, the interpretation function maps each expression of the form Sp(ϕ) to a function Sp(ϕ)^M that assigns to each variable assignment ν a set Sp(ϕ)^M(ν) of foreground elements. The set Sp(ϕ)^M(ν) corresponds to the support of the formula when the free variables are interpreted by ν. Similarly, Sp(t)^M is a function from variable assignments to sets of foreground elements.
Based on such models, we can define the semantics of terms and formulas in the standard way. The only non-standard constructs in our logic are terms of the form Sp(ϕ), for which the semantics is directly given by the interpretation function. We write t^{M,ν} for the interpretation of a term t in M with variable assignment ν. With this convention, Sp(ϕ)^M(ν) denotes the same thing as Sp(ϕ)^{M,ν}. As usual, we write M, ν |= ϕ to indicate that the formula ϕ is true in M with the free variables interpreted by ν, and ϕ(x)^M denotes the relation defined by the formula ϕ with free variables x.
We refer to the above semantics as the uninterpreted semantics of ϕ because
we do not give a specific meaning to inductive definitions and support expres-
sions.
Now let us define the true semantics for FL. The relation symbols R ∈ I
represent inductively defined relations, which are defined by equations of the
form R(x) := ρR (x) (see Figure 1). In the intended meaning, R is interpreted as
the least relation that satisfies the equation
R(x)^M = ρR(x)^M.
The usual requirement for the existence of a unique least fixpoint of the equation
is that the definition of R does not negatively depend on R. For this reason, we
require that in ρR (x) each occurrence of an inductive predicate R ∈ I is either
inside a support expression, or it occurs under an even number of negations.2
Every support expression is evaluated on a model to a set of foreground el-
ements (under a given variable assignment ν). Formally, we are interested in
models in which the support expressions are interpreted to be the sets that cor-
respond to the smallest solution of the equations given in Figure 2. The intuition
behind these definitions was explained in Section 3.2
Example 2. Consider the inductive definition tree(x) from Example 1. To check whether the equations from Figure 2 indeed yield the desired support, note that Sp(x = nil) = Sp(x) = Sp(true) = ∅. Below, we write [u] for a variable assignment that assigns u to the free variable of the formula that we are considering. Then we obtain that Sp(tree(x))[u] = ∅ if u = nil, and Sp(tree(x))[u] = Sp(α)[u] if u ≠ nil. The formula α is existentially quantified with guard ℓ = left(x) ∧ r = right(x). The support of this guard is {u} because mutable functions are applied to x. The support of the remaining part of α is the union of the supports of tree(ℓ)[left(u)] and tree(r)[right(u)] (the assignments for ℓ and r that make the guard true). So we obtain, for the case that u ≠ nil, that the element u enters the support, and the recursion further descends into the subtrees of u, as desired.
2 As usual, it would be sufficient to forbid negative occurrences of inductive predicates in mutual recursion.
Proposition 1. For each model M , there is a unique frame model over the
same universe and the same interpretation of the constants, functions, and non-
inductive relations.
The support of a formula can be used for frame reasoning in the following sense:
if we modify a model M by changing the interpretation of the mutable functions
(e.g., a program modifying pointers), then truth values of formulas do not change
if the change happens outside the support of the formula. This is formalized
below and proven in the Technical Report [25].
Given two models M, M′ over the same universe, we say that M′ is a mutation of M if R^{M′} = R^{M}, c^{M′} = c^{M}, and f^{M′} = f^{M} for all constants c, relations R ∈ R, and functions f ∈ F \ Fm. In other words, M′ can only differ from M on the interpretations of the mutable functions, the inductive relations, and the support expressions.
Given a subset X ⊆ Uσf of the elements from the foreground universe, we say that the mutation is stable on X if the values of the mutable functions did not change on arguments from X, that is, f^{M}(u1, . . . , un) = f^{M′}(u1, . . . , un) for all mutable functions f ∈ Fm and all appropriate tuples u1, . . . , un of arguments with {u1, . . . , un} ∩ X ≠ ∅.
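With these notions in place, the frame theorem can be stated as follows (a paraphrase of Theorem 1; the precise statement and proof are in the Technical Report [25]): if M′ is a mutation of M that is stable on Sp(ϕ)^{M}(ν), then M, ν |= ϕ if and only if M′, ν |= ϕ.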
The only extension of frame logic compared to FO-RD is the operator Sp, which
defines a function from interpretations of free variables to sets of foreground
elements. The semantics of this operator can be captured within FO-RD itself,
so reasoning within frame logic can be reduced to reasoning within FO-RD.
A formula α(y) with y = y1, . . . , ym has one support for each interpretation of the free variables. We capture these supports by an inductively defined relation Spα(y, z) of arity m + 1 such that for each frame model M, we have (u1, . . . , um, u) ∈ Spα^{M} if u ∈ Sp(α)^{M}(ν) for the interpretation ν that interprets yi as ui.
Since the semantics of Sp(α) is defined over the structure of α, we introduce
corresponding inductively defined relations Spβ and Spt for all subformulas β
and subterms t of either α or of a formula ρR for R ∈ I.
The equations for supports from Figure 2 can be expressed by inductive def-
initions for the relations Spβ . The translations are shown in the Technical Re-
port [25]. It is not hard to see that general frame logic formulas can be translated
to FO-RD formulas that make use of these new inductively defined relations.
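For illustration (our paraphrase, following the support equations of Section 3.2): for a mutable unary function f and an atomic formula α(y) of the form f(y) = c, the defining clause is simply Spα(y, z) := (z = y); for a conjunction α of the form β1 ∧ β2, one takes Spα(y, z) := Spβ1(y, z) ∨ Spβ2(y, z), mirroring the fact that the support of a conjunction is the union of the supports of its conjuncts.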
Proposition 2. For every frame logic formula there is an equisatisfiable FO-
RD formula with the signature extended by auxiliary predicates for recursive
definitions of supports.
– it is never the case that the abort state ⊥ is encountered in the execution on S.
– if (M, H, U) transitions to (M′, H′, U′) on S, then M′ |= β and H′ = Sp(β)^{M′}.
The above rules are intuitively clear and are similar to the local rules in
separation logic [38]. The rules for statements capture their semantics using
minimal/tight heaplets, and the frame rule allows proving triples with larger
heaplets. In the rule for alloc, the postcondition says that the newly allocated
location has default values for all pointer fields and datafields (denoted as deff ).
The soundness of the frame rule relies crucially on the frame theorem for FL
(Theorem 1). The full soundness proof can be found in the Technical Report [25].
Theorem 2. The above rules are sound with respect to the operational seman-
tics.
Recall that the MW primitives MWx.f:=y and MWvalloc(x) need to evaluate a formula β in the pre-state as it would evaluate in the post-state after mutation and allocation statements. The definition of MWx.f:=y substitutes for f the function updated at x:
MWx.f:=y(β) = β[λz. ite(z = x : y, f(z))/f]
Theorem 3. The rules above suffixed with -G are sound w.r.t. the operational
semantics. And, each precondition corresponds to the weakest tightest precondi-
tion of β.
4.6 Example
In this section, we will see an example of using our program logic rules that we
described earlier. This will demonstrate the utility of Frame Logic as a logic for
annotating and reasoning with heap manipulating programs, as well as offer some
intuition about how our program logic can be deployed in a practical setting.
The following program performs in-place list reversal:
j := nil ; while (i != nil) do k := i.next ; i.next := j ; j := i ; i := k
For the sake of simplicity, instead of proving that this program reverses a list, we will prove the simpler claim that after executing this program j is a list. The recursive definition of list we use for this proof is the one from Figure 3; it has roughly the following shape (paraphrased):
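list(x) := ite(x = nil : true, ∃y : (y = next(x)). list(y))
Its support is the set of locations reachable from x via next.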
We need to also give an invariant for the while loop, simply stating that i
and j point to disjoint lists: list(i) ∧ list(j) ∧ Sp(list(i)) ∩ Sp(list(j)) = ∅.
We prove that this is indeed an invariant of the while loop below. Our proof
uses a mix of both local and global rules from Sections 4.3 and 4.4 above to
demonstrate how either type of rule can be used. We also apply the consequence rule along with the program rules in several places in order to simplify the presentation. As a result, some detailed analysis is omitted, such as
proving supports are disjoint in order to use the frame rule.
Armed with this, proving j is a list after executing the full program above is
a trivial application of the assignment, while, and consequence rules, which we
omit for brevity.
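For comparison, the reversal routine can be transcribed into Rust as a compilable sketch (illustrative only, reusing an owned list type; this is not part of the paper's development):

enum List { Cons(i32, Box<List>), Nil }
use crate::List::*;

// In-place reversal: detach the head of i and push it onto j,
// mirroring `k := i.next ; i.next := j ; j := i ; i := k`.
fn reverse(mut i: List) -> List {
    let mut j = Nil;                  // j := nil
    while let Cons(x, rest) = i {     // while (i != nil)
        let k = *rest;                // k := i.next
        j = Cons(x, Box::new(j));     // i.next := j ; j := i
        i = k;                        // i := k
    }
    j                                 // j is now a list (in fact, the reversal)
}

fn main() {
    let xs = Cons(1, Box::new(Cons(2, Box::new(Nil))));
    match reverse(xs) { Cons(x, _) => assert_eq!(x, 2), Nil => unreachable!() }
}

Rust's ownership discipline enforces, at the type level, the disjointness of i and j that the loop invariant above states explicitly.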
Observe that in the above proof we were able to apply the frame rule because i belongs neither to Sp(list(k)) nor Sp(list(j)). This can be dispensed with easily using reasoning about first-order formulae with least-fixpoint definitions, techniques for which are discussed in Section 6.
Also note that the invariant of the loop is precisely the intended meaning of list(i) ∗
list(j) in separation logic. In fact, as we will see in Section 6, we can define a
first-order macro Star as Star (ϕ, ψ) = ϕ ∧ ψ ∧ Sp(ϕ) ∩ Sp(ψ) = ∅. We can use
this macro to represent disjoint supports in similar proofs.
These proofs demonstrate what proofs of actual programs look like in our
program logic. They also show that frame logic and our program logic can prove
many results similarly to traditional separation logic. And, by using the derived
operator Star, very little even in terms of verbosity is sacrificed in gaining the flexibility of Frame Logic (see Section 6 for a broader discussion of the ways in which Frame Logic differs from Separation Logic and, in certain situations, offers many advantages in stating and reasoning with specifications/invariants).
most one heaplet for any store. The translation also shows that frame logic can
naturally and compactly capture such separation logic formulas.
6 Discussion
Comparison with Separation Logic. The design of frame logic is, in many ways,
inspired by the design choices of separation logic. Separation logic formulas implicitly hold on tight heaplets: models are defined on pairs (s, h), where s is a store (an interpretation of variables) and h is a heaplet that defines a subset of the heap as the domain for functions/pointers. In Frame Logic, we choose not to define satisfiability with respect to heaplets but define it with respect to the entire heap. However, we give access to the implicitly defined heaplet using the operator Sp, and give a logic over sets to talk about supports. The separating conjunction operator ∗ can then be expressed using normal conjunction and a constraint that says that the supports of the formulas are disjoint.
We do not allow formulas to have multiple supports, which is crucial as Sp is
a function, and this roughly corresponds to precise fragments of separation logic.
Precise fragments of separation logic have already been proposed and accepted in
the separation logic literature for giving robust handling of modular functions,
concurrency, etc. [8, 29]. Section 5 details a translation of a precise fragment
of separation logic (with ∗ but not magic wand) to frame logic that shows the
natural connection between precise formulas in separation logic and frame logic.
Frame logic, through the support operator, facilitates local reasoning much
in the same way as separation logic does, and the frame rule in frame logic
supports frame reasoning in a similar way as separation logic. The key difference
between frame logic and separation logic is the adherence to a first-order logic
(with recursive definitions), both in terms of syntax and expressiveness.
First and foremost, in separation logic, the magic wand is needed to express
the weakest precondition [38]. Consider, for example, computing the weakest
precondition of the formula list(x) with respect to the code y.n := z. The weakest
precondition should essentially describe the (tight) heaplets such that changing
the n-pointer from y to z results in x pointing to a list. In separation logic,
this is typically expressed (see [38]) using the magic wand as (y −n→ z) −∗ list(x).
However, the magic wand operator is inherently a second-order property. The
formula α −∗ β holds on a heaplet h if, for any disjoint heaplet that satisfies α,
β holds on the conjoined heaplet. Expressing this property (for arbitrary α,
whose heaplet can be unbounded) requires quantifying over unbounded heaplets
satisfying α, which is not first-order expressible.
In frame logic, we instead rewrite the recursive definition list(·) to a new
one list′(·) that captures whether x points to a list, assuming that n(y) = z
(see Section 4.4). This property continues to be expressible in frame logic and
can be converted to first-order logic with recursive definitions (see Section 3.5).
Note that we are exploiting the fact that there is only a bounded amount of
change to the heap in straight-line programs in order to express this in FL.
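To make the transformation concrete, here is one plausible shape it can take, assuming list is given by the usual unary recursive definition (the exact definitions of Section 4.4 may differ in detail):

\[
\mathit{list}(u) := (u = \mathit{nil}) \vee \mathit{list}(n(u))
\quad\leadsto\quad
\mathit{list}'(u) := (u = \mathit{nil}) \vee \mathit{list}'\big(\mathrm{ite}(u = y,\, z,\, n(u))\big)
\]

Every read of the mutated pointer n is guarded by a case split on whether its argument is y, which is exactly the bounded change to the heap mentioned above.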
Let us turn to expressiveness and compactness. In separation logic, separa-
tion of structures is expressed using ∗, and in frame logic, such a separation
is expressed using conjunction and an additional constraint that says that the
supports of the two formulas are disjoint. A precise separation logic formula
of the form α1 ∗ α2 ∗ . . . ∗ αn is compact and would get translated to a much
larger formula in frame logic, as it would have to state that the supports of
each pair of formulas are disjoint. We believe this can be tamed using macros
(Star(α, β) = α ∧ β ∧ Sp(α) ∩ Sp(β) = ∅).
There are, however, several situations where frame logic leads to more com-
pact and natural formulations. For instance, consider expressing the property
that x and y point to lists, which may or may not overlap. In Frame Logic,
we simply write list(x) ∧ list(y). The support of this formula is the union of
the supports of the two lists. In separation logic, we cannot use ∗ to write
this compactly (while capturing the tightest heaplet). Note that the formula
(list(x) ∗ true) ∧ (list(y) ∗ true) is not equivalent, as it is true in heaplets that
are larger than the set of locations of the two lists. The simplest formulation we
know is to write a recursive definition lseg(u, v) for list segments from u to v and
use quantification: (∃z. lseg(x, z) ∗ lseg(y, z) ∗ list(z)) ∨ (list(x) ∗ list(y)) where
the definition of lseg is the following: lseg(u, v) ≡ (u = v ∧ emp) ∨ (∃w. u → w ∗ lseg(w, v)).
If we wanted to say x1, . . . , xn all point to lists that may or may not overlap,
then in FL we can say list(x1) ∧ list(x2) ∧ . . . ∧ list(xn). In separation
logic, however, the simplest formulation seems to require lseg, a linear number
of quantified variables, and an exponentially-sized formula. Now consider the
property saying x1 , . . . , xn all point to binary trees, with pointers left and right,
and that can overlap arbitrarily. We can write it in FL as tree(x1 )∧. . .∧tree(xn ),
while a formula in (first-order) separation logic that expresses this property
seems very complex.
In summary, we believe that frame logic is a logic that supports frame rea-
soning built on the same principles as separation logic, but is still translatable
to first-order logic (avoiding the magic wand), and makes different choices for
syntax/semantics that lead to expressing certain properties more naturally and
compactly, and others more verbosely.
In fact, the work on implicit dynamic frames [31, 39] provides translations from
separation logic to regions for reasoning using dynamic frames.
Reasoning with regions using set theory in a first-order logic with recursive
definitions has been explored by many works to support automated reasoning.
Tools like Vampire [20] for first-order logic have been extended in recent work to
handle algebraic datatypes [19]; many data-structures in practice can be modeled
as algebraic datatypes and the schemes proposed in [19] are powerful tools to
reason with them using first-order theorem provers.
A second class of tools are those proposed in the work on natural proofs [23,
32, 37]. Natural proofs explicitly work with first order logic with recursive defi-
nitions (FO-RD), implementing validity through a process of unfolding recursive
definitions, uninterpreted abstractions, and proving inductive lemmas using in-
duction schemes. Natural proofs are currently used primarily to reason with
separation logic by first translating verification conditions arising from Hoare
triples with separation logic specifications (without magic wand) to first-order
logic with recursive definitions. Frame logic reasoning can also be done in a very
similar way by translating it first to FO-RD.
The work in [23] considers natural proofs and quantifier instantiation heuris-
tics for FO-RD (using a similar setup of foreground sort for locations and back-
ground sorts), and the work identifies a fragment of FO-RD (called safe fragment)
for which this reasoning is complete (in the sense that a formula is detected as
unsatisfiable by quantifier instantiation iff it is unsatisfiable with the inductive
definitions interpreted as fixpoints and not least fixpoints). Since FL can be
translated to FO-RD, it is possible to deal with FL using the techniques of [23].
The conditions for the safe fragment of FO-RD are that the quantifiers over
the foreground elements are the outermost ones, and that terms of foreground
type do not contain variables of any background type. As argued in [23], these
restrictions are typically satisfied in heap logic reasoning applications.
7 Related Work
Dynamic-frames approaches [17, 18] make frames explicit in specifications rather
than implicit in the logic, as with ∗ and −∗. However, explicitly writing out
frame annotations can become verbose and tedious.
The work on Implicit Dynamic Frames [22, 39, 40] bridges the worlds of
separation logic (without magic wand) and dynamic frames— it uses separation
logic and fractional permissions to implicitly define frames (reducing annotation
burden), allows annotations to access these frames, and translates them into set
regions for first-order reasoning. Our work is similar in that frame logic also
implicitly defines regions and gives annotations access to these regions, and can
be easily translated to pure FO-RD for first-order reasoning.
One distinction with separation logic involves the non-unique heaplets in
separation logic and the unique heaplets in frame logic. Determined heaplets
have been used [29, 32, 37] as they are more amenable to automated reasoning. In
particular, a separation logic fragment with determined heaplets, known as precise
predicates, is defined in [29]; we capture it using frame logic in Section 5.
There is also a rich literature on reasoning with these heap logics for program
verification. Decidability is an important dimension and there is a lot of work on
decidable logics for heaps with separation logic specifications [4–6, 11, 26, 33].
The work based on EPR (Effectively Propositional Reasoning) for specifying
heap properties [14–16] provides decidability, as does some of the work that
translates separation logic specifications into classical logic [34].
Finally, translating separation logic into classical logics and reasoning with
them is another solution pursued in several recent efforts [10, 23, 24, 32,
34–37, 41]. Other techniques, including recent work on cyclic proofs [9, 42], use
heuristics for reasoning about recursive definitions.
8 Conclusions
Our main contribution is to propose Frame Logic, a classical first-order logic
endowed with an explicit operator that recovers the implicit supports of formulas
and supports frame reasoning. We have argued its expressiveness by capturing
properties of data-structures naturally and succinctly, and by showing that it
can express a precise fragment of separation logic. The program logic built using
frame logic supports local heap reasoning, frame reasoning, and weakest tightest
preconditions across loop-free programs.
We believe that frame logic is an attractive alternative to separation logic,
built using similar principles as separation logic while staying within the first-
order logic world. The first-order nature of the logic makes it potentially amenable
to easier automated reasoning.
The most compelling future work is a practical realization of a tool for verifying
programs in a standard programming language with frame logic annotations,
marrying frame logic with existing automated techniques and tools for first-order
logic (in particular [19, 24, 32, 37, 41]).
Acknowledgements: We thank ESOP’20 reviewers for their comments that
helped improve this paper. This work is based upon research supported by the
National Science Foundation under Grant NSF CCF 1527395.
Bibliography
[1] Banerjee, A., Naumann, D.: Local reasoning for global invariants, Part II:
Dynamic boundaries. Journal of the ACM (JACM) 60 (06 2013)
[2] Banerjee, A., Naumann, D.A., Rosenberg, S.: Regional logic for local rea-
soning about global invariants. In: Vitek, J. (ed.) ECOOP 2008 – Object-
Oriented Programming. pp. 387–411. Springer Berlin Heidelberg, Berlin,
Heidelberg (2008)
[3] Banerjee, A., Naumann, D.A., Rosenberg, S.: Local reasoning for global
invariants, Part I: Region logic. J. ACM 60(3), 18:1–18:56 (Jun 2013), http://doi.acm.org/10.1145/2485982
[4] Berdine, J., Calcagno, C., O’Hearn, P.W.: A decidable fragment of separa-
tion logic. In: Proceedings of the 24th International Conference on Founda-
tions of Software Technology and Theoretical Computer Science. pp. 97–109.
FSTTCS’04 (2004)
[5] Berdine, J., Calcagno, C., O’Hearn, P.W.: Symbolic execution with separa-
tion logic. In: Proceedings of the Third Asian Conference on Programming
Languages and Systems. pp. 52–68. APLAS’05 (2005)
[6] Berdine, J., Calcagno, C., O’Hearn, P.W.: Smallfoot: Modular automatic
assertion checking with separation logic. In: Proceedings of the 4th In-
ternational Conference on Formal Methods for Components and Ob-
jects. pp. 115–137. FMCO'05, Springer-Verlag, Berlin, Heidelberg (2006).
https://doi.org/10.1007/11804192_6
[7] Brinck, K., Foo, N.Y.: Analysis of algorithms on threaded
trees. The Computer Journal 24(2), 148–155 (01 1981).
https://doi.org/10.1093/comjnl/24.2.148
[8] Brookes, S.: A semantics for concurrent separation logic.
Theor. Comput. Sci. 375(1-3), 227–270 (Apr 2007).
https://doi.org/10.1016/j.tcs.2006.12.034
[9] Brotherston, J., Distefano, D., Petersen, R.L.: Automated cyclic en-
tailment proofs in separation logic. In: Proceedings of the 23rd Inter-
national Conference on Automated Deduction. pp. 131–146. CADE’11,
Springer-Verlag, Berlin, Heidelberg (2011), http://dl.acm.org/citation.cfm?id=2032266.2032278
[10] Chin, W.N., David, C., Nguyen, H.H., Qin, S.: Automated verification of
shape, size and bag properties. In: 12th IEEE International Conference
on Engineering Complex Computer Systems (ICECCS 2007). pp. 307–320
(2007)
[11] Cook, B., Haase, C., Ouaknine, J., Parkinson, M., Worrell, J.: Tractable
reasoning in a fragment of separation logic. In: Proceedings of the 22nd In-
ternational Conference on Concurrency Theory. pp. 235–249. CONCUR’11
(2011)
[12] Demri, S., Deters, M.: Separation logics and modalities: a survey. Journal
of Applied Non-Classical Logics 25, 50–99 (2015)
[13] Hayes, P.J.: The frame problem and related problems in artifi-
cial intelligence. In: Webber, B.L., Nilsson, N.J. (eds.) Readings
in Artificial Intelligence, pp. 223 – 230. Morgan Kaufmann (1981).
https://doi.org/10.1016/B978-0-934613-03-3.50020-9
[14] Itzhaky, S., Banerjee, A., Immerman, N., Lahav, O., Nanevski, A., Sagiv,
M.: Modular reasoning about heap paths via effectively propositional for-
mulas. In: Proceedings of the 41st ACM SIGPLAN-SIGACT Symposium
on Principles of Programming Languages. pp. 385–396. POPL ’14, ACM,
New York, NY, USA (2014). https://doi.org/10.1145/2535838.2535854
[15] Itzhaky, S., Banerjee, A., Immerman, N., Nanevski, A., Sagiv, M.:
Effectively-propositional reasoning about reachability in linked data struc-
tures. In: Proceedings of the 25th International Conference on Computer
Aided Verification. pp. 756–772. CAV'13, Springer-Verlag, Berlin, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39799-8_53
[16] Itzhaky, S., Bjørner, N., Reps, T., Sagiv, M., Thakur, A.: Property-directed
shape analysis. In: Proceedings of the 16th International Conference on
Computer Aided Verification. pp. 35–51. CAV'14, Springer-Verlag, Berlin, Heidelberg (2014). https://doi.org/10.1007/978-3-319-08867-9_3
[17] Kassios, I.T.: The dynamic frames theory. Form. Asp. Comput. 23(3), 267–
288 (May 2011). https://doi.org/10.1007/s00165-010-0152-5
[18] Kassios, I.T.: Dynamic frames: Support for framing, dependencies and shar-
ing without restrictions. In: Misra, J., Nipkow, T., Sekerinski, E. (eds.) FM
2006: Formal Methods. pp. 268–283. Springer-Verlag, Berlin, Heidelberg
(2006)
[19] Kovács, L., Robillard, S., Voronkov, A.: Coming to terms with quantified
reasoning. In: Proceedings of the 44th ACM SIGPLAN Symposium on Prin-
ciples of Programming Languages. pp. 260–270. POPL ’17, ACM, New York,
NY, USA (2017). https://doi.org/10.1145/3009837.3009887
[20] Kovács, L., Voronkov, A.: First-order theorem proving and Vampire. In:
CAV '13. pp. 1–35 (2013). https://doi.org/10.1007/978-3-642-39799-8_1
[21] Leino, K.R.M.: Dafny: An automatic program verifier for func-
tional correctness. In: Proceedings of the 16th International Confer-
ence on Logic for Programming, Artificial Intelligence, and Reason-
ing. p. 348–370. LPAR’10, Springer-Verlag, Berlin, Heidelberg (2010).
https://doi.org/10.5555/1939141.1939161
[22] Leino, K.R.M., Müller, P.: A basis for verifying multi-threaded pro-
grams. In: Castagna, G. (ed.) Programming Languages and Systems.
pp. 378–393. Springer Berlin Heidelberg, Berlin, Heidelberg (2009).
https://doi.org/10.1007/978-3-642-00590-9_27
[23] Löding, C., Madhusudan, P., Peña, L.: Foundations for natural proofs
and quantifier instantiation. PACMPL 2(POPL), 10:1–10:30 (2018).
https://doi.org/10.1145/3158098
[24] Madhusudan, P., Qiu, X., Ştefănescu, A.: Recursive proofs for induc-
tive tree data-structures. In: Proceedings of the 39th Annual ACM
SIGPLAN-SIGACT Symposium on Principles of Programming Lan-
guages. pp. 123–136. POPL ’12, ACM, New York, NY, USA (2012).
https://doi.org/10.1145/2103656.2103673
[25] Murali, A., Peña, L., Löding, C., Madhusudan, P.: A first order logic with
frames. CoRR (2019), http://arxiv.org/abs/1901.09089
[26] Navarro Pérez, J.A., Rybalchenko, A.: Separation logic + superposition
calculus = heap theorem prover. In: Proceedings of the 32nd ACM SIG-
PLAN Conference on Programming Language Design and Implementation.
pp. 556–566. PLDI ’11, ACM, New York, NY, USA (2011)
[27] O’Hearn, P.W.: A primer on separation logic (and automatic program ver-
ification and analysis). In: Software Safety and Security (2012)
[28] O’Hearn, P.W., Reynolds, J.C., Yang, H.: Local reasoning about programs
that alter data structures. In: Proceedings of the 15th International Work-
shop on Computer Science Logic. pp. 1–19. CSL ’01, Springer-Verlag, Lon-
don, UK, UK (2001), http://dl.acm.org/citation.cfm?id=647851.737404
[29] O’Hearn, P.W., Yang, H., Reynolds, J.C.: Separation and information hid-
ing. In: Proceedings of the 31st ACM SIGPLAN-SIGACT Symposium on
Principles of Programming Languages. pp. 268–280. POPL ’04, ACM, New
York, NY, USA (2004). https://doi.org/10.1145/964001.964024
[30] Parkinson, M., Bierman, G.: Separation logic and abstraction. In: Proceed-
ings of the 32nd ACM SIGPLAN-SIGACT Symposium on Principles of
Programming Languages. pp. 247–258. POPL ’05, ACM, New York, NY,
USA (2005). https://doi.org/10.1145/1040305.1040326
[31] Parkinson, M.J., Summers, A.J.: The relationship between separation logic
and implicit dynamic frames. In: Barthe, G. (ed.) Programming Languages
and Systems. pp. 439–458. Springer Berlin Heidelberg, Berlin, Heidelberg
(2011). https://doi.org/10.1007/978-3-642-19718-5_23
[32] Pek, E., Qiu, X., Madhusudan, P.: Natural proofs for data structure
manipulation in C using separation logic. In: Proceedings of the 35th
ACM SIGPLAN Conference on Programming Language Design and Im-
plementation. pp. 440–451. PLDI ’14, ACM, New York, NY, USA (2014).
https://doi.org/10.1145/2594291.2594325
[33] Pérez, J.A.N., Rybalchenko, A.: Separation logic modulo theories. In: Pro-
gramming Languages and Systems (APLAS). pp. 90–106. Springer Interna-
tional Publishing, Cham (2013)
[34] Piskac, R., Wies, T., Zufferey, D.: Automating separation logic using
SMT. In: Proceedings of the 25th International Conference on Computer
Aided Verification. pp. 773–789. CAV'13, Springer-Verlag, Berlin, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39799-8_54
[35] Piskac, R., Wies, T., Zufferey, D.: Automating separation logic with trees
and data. In: Proceedings of the 16th International Conference on Computer
Aided Verification. pp. 711–728. CAV’14, Springer-Verlag, Berlin, Heidel-
berg (2014)
[36] Piskac, R., Wies, T., Zufferey, D.: Grasshopper. In: Ábrahám, E., Havelund,
K. (eds.) Tools and Algorithms for the Construction and Analysis of Sys-
tems. pp. 124–139. Springer Berlin Heidelberg, Berlin, Heidelberg (2014)
[37] Qiu, X., Garg, P., Ştefănescu, A., Madhusudan, P.: Natural proofs for
structure, data, and separation. In: Proceedings of the 34th ACM SIG-
PLAN Conference on Programming Language Design and Implemen-
tation. pp. 231–242. PLDI ’13, ACM, New York, NY, USA (2013).
https://doi.org/10.1145/2491956.2462169
[38] Reynolds, J.C.: Separation logic: A logic for shared mutable data structures.
In: Proceedings of the 17th Annual IEEE Symposium on Logic in Computer
Science. pp. 55–74. LICS ’02 (2002)
[39] Smans, J., Jacobs, B., Piessens, F.: Implicit dynamic frames: Combining dy-
namic frames and separation logic. In: Drossopoulou, S. (ed.) ECOOP 2009
– Object-Oriented Programming. pp. 148–172. Springer Berlin Heidelberg,
Berlin, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03013-0_8
[40] Smans, J., Jacobs, B., Piessens, F.: Implicit dynamic frames.
ACM Trans. Program. Lang. Syst. 34(1), 2:1–2:58 (May 2012).
https://doi.org/10.1145/2160910.2160911
[41] Suter, P., Dotta, M., Kunćak, V.: Decision procedures for algebraic
data types with abstractions. In: Proceedings of the 37th Annual ACM
SIGPLAN-SIGACT Symposium on Principles of Programming Lan-
guages. pp. 199–210. POPL ’10, ACM, New York, NY, USA (2010).
https://doi.org/10.1145/1706299.1706325
[42] Ta, Q.T., Le, T.C., Khoo, S.C., Chin, W.N.: Automated mutual explicit
induction proof in separation logic. In: Fitzgerald, J., Heitmeyer, C., Gnesi,
S., Philippou, A. (eds.) FM 2016: Formal Methods. pp. 659–676. Springer
International Publishing, Cham (2016). https://doi.org/10.1007/978-3-319-48989-6_40
Proving the safety of highly-available distributed objects
1 Introduction
Many modern applications serve users accessing shared data in different ge-
ographical regions. Examples include social networks, multi-user games, co-
operative engineering, collaborative editors, source-control repositories, or dis-
tributed file systems. One approach would be to store the application’s data
(which we call object) in a single central location, accessed remotely. However,
users far from the central location would suffer long delays and outages.
Instead, the object is replicated to several locations. A user accesses the
closest available replica. To ensure availability, an update must not synchronise
across replicas; otherwise, when a network partition occurs, the system would
block. Thus, a replica executes both queries and updates locally, and propagates
its updates to other replicas asynchronously.
Updates at different locations are concurrent; this may cause replicas to
diverge, at least temporarily. However, if the system ensures Strong Eventual
Consistency (SEC), replicas that have received the same set of updates have
the same state [25], simplifying the reasoning.
The replicated object may also need to maintain some (application-specific)
invariant, an assertion about the object. We say a state is safe if the invariant
is true in that state; the system is safe if every reachable state is safe. In a se-
quential system, this is straightforward (in principle): if the initial state is safe,
and the final state of every update individually is safe, then the system is safe.
However, these conditions are not sufficient in the replicated case, because con-
current updates at different replicas may interfere with one another. This can be
fixed by synchronising between some or all types of updates. To maximise
availability and minimise latency, such synchronisation should be kept to a
minimum. In this paper, we
propose a proof methodology to ensure that a given object is system-safe, for a
given invariant and a given amount of concurrency control. In contrast to pre-
vious works, we consider state-based objects.1 Indeed, the specific properties of
state-based propagation enable simple modular reasoning despite concurrency,
thanks to the concept of concurrency invariant. Our proof methodology derives
the concurrency invariant automatically from the sequential specification. Now,
if the initial state is safe, and every update maintains both the application in-
variant and the concurrency invariant, then every reachable state is safe, even
in concurrent executions, regardless of network partitions. We have developed
a tool named Soteria, to automate our proof methodology. Soteria analyses the
specification to detect concurrency bugs and provides counterexamples.
The contributions of this paper are as follows:
– We propose a novel proof system specialised to proving the safety of avail-
able objects that converge by propagating state. This specialisation supports
modular reasoning, and thus it enables automation.
– We demonstrate that this proof system is sound. Moreover, we provide a sim-
ple semantics for state-propagating systems that allows us to ignore network
messages altogether.
– We present Soteria, to the best of our knowledge the first tool support-
ing the verification of program invariants for state-based replicated objects.
When Soteria succeeds it ensures that every execution, whether replicas are
partitioned or concurrent, is safe.
– We present a number of representative case studies, which we run through
Soteria.
2 Background
Our running example is an auction object. Its state comprises:
– Its Status, which can move from the initial state INVALID (under preparation)
to ACTIVE (can receive bids) and then to CLOSED (no more bids accepted).
– The Winner of the auction, which is initially ⊥ and can become the bid with
the highest amount. In case of ties, the bid with the lowest id wins.
– The set of Bids placed, which is initially empty. A bid is a tuple composed of
• BidId: A unique identifier
• Placed: A boolean flag to indicate whether the bid has been placed or
not. Initially, it is FALSE. Once placed, a bid cannot be withdrawn.
• The monetary Amount of the bid; this cannot be modified once the bid
is created.
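As an illustration only, the state described by this list can be transcribed into a small Python sketch; the names (Status, Bid, AuctionState) are ours, not part of the paper's formal development:

    from dataclasses import dataclass
    from enum import Enum
    from typing import FrozenSet, Optional

    class Status(Enum):
        INVALID = "INVALID"   # under preparation
        ACTIVE = "ACTIVE"     # can receive bids
        CLOSED = "CLOSED"     # no more bids accepted

    @dataclass(frozen=True)
    class Bid:
        bid_id: int      # unique identifier
        placed: bool     # once placed, never withdrawn
        amount: int      # fixed when the bid is created

    @dataclass(frozen=True)
    class AuctionState:
        status: Status = Status.INVALID
        winner: Optional[int] = None          # BidId of the winner, initially none
        bids: FrozenSet[Bid] = frozenset()    # initially empty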
Figure 1 illustrates how the auction state evolves over time. The state of the
object is geo-replicated at data centers in Adelaide, Brussels, and Calgary. Users
at different locations can start an auction, place bids, close the auction, declare
a winner, inspect the local replica, and observe if a winner is declared and who
it is. The updates are propagated asynchronously to other replicas. All replicas
will eventually agree on the same auction status, the same set of bids and the
same winner.
There are two basic approaches to propagating updates. The operation-based
approach applies an update to some origin replica, then transmits the operation
itself to be replayed at other replicas. If messages are delivered in causal order,
exactly once, and concurrent operations are commutative, then two replicas that
received the same updates reach the same state (this is the Strong Eventual
Consistency guarantee, or SEC) [25].
The state-based approach applies an update to some origin replica. Occasion-
ally, a replica sends its full state to some other replica, which merges the received
state into its own. If the state space forms a monotonic semi-lattice, an update
is an inflation (its output state is not lesser than the input state), and merge
computes the least-upper-bound of the local and received states, then SEC is
guaranteed [25]. As long as every update eventually reaches every replica, mes-
sages may be dropped, re-ordered or duplicated, and the set of replicas may be
unknown. Due to these relaxed requirements, state-based propagation is widely
used in industry. Figure 1 shows the state-based approach with local operations
and merges. Alternatives exist where only a delta of the state —that is, the
portion of the state not known to be part of the other replicas— is sent as a
message [1]; since this is an optimisation, it is of no consequence to the results
of this paper.
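The following minimal Python sketch (ours, using a grow-only set as the simplest possible monotonic semi-lattice) illustrates why these conditions tolerate dropped, re-ordered, and duplicated messages:

    # States are sets ordered by inclusion; an update only grows the
    # state (an inflation), and merge takes the least upper bound (union).
    def add(state: frozenset, elem) -> frozenset:
        return state | {elem}            # inflation: add(s, e) >= s

    def merge(a: frozenset, b: frozenset) -> frozenset:
        return a | b                     # least upper bound under set inclusion

    r1 = add(frozenset(), "bid-100")     # update at one replica
    r2 = add(frozenset(), "bid-105")     # concurrent update at another

    # Duplicated and re-ordered deliveries all converge to the same state:
    assert merge(merge(r1, r2), r2) == merge(r2, r1) == {"bid-100", "bid-105"}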
1
As opposed to operation-based. These terms are defined in Section 2.
Looking back to Figure 1, we can see that replicas diverge temporarily. This
temporary divergence can lead to an unsafe state, in this case declaring a wrong
winner. This correctness problem has been addressed before; however, previous
works mostly consider the operation-based propagation approach [11, 13, 19, 24].
3 System Model
In this section, we first introduce the object components, explain the underlying
system model informally, and then formalise the operational semantics.
Operations. Each replica may perform the operations defined for the object.
To support availability, an operation modifies the local state at some arbitrary
replica, the origin replica for that operation, without synchronising with other
replicas (the cost of synchronisation being significant at scale). An operation
might consist of several changes; these are applied to the replica as a single
atomic unit.
Executing an operation on its origin replica has an immediate effect. However,
the state of the other replicas, called remote replicas, remains unaltered at this
point. The remote replicas get updated when the state is eventually propagated.
An immediate consequence of this execution model is that in the presence of
concurrent operations, replicas can reach different states, i.e. they diverge.
Let us illustrate this with our example in Figure 1. Initially, the auction
is yet to start, the winner is not declared and no bids are placed. By de-
fault, a replica can execute any operation - start auction, place bid, and
close auction - locally without synchronising with other replicas. We see that
the local states of replicas occasionally diverge. For example at the point where
operation close auction completes at the Adelaide replica, the Adelaide replica
is aware of only a $100 bid, the Brussels replica has two bids, and the Calgary
replica observes only one bid for $105.
This condition must hold true in all possible executions of the object.
– The local semantic function takes an operation and a state, and returns the
state after applying the operation. We write op(σ) = σnew for executing
operation op on state σ resulting in a new state σnew .
– Ω denotes a partial function returning the current state of a replica. For
instance Ω(r) = σ means that in global state Ω, replica r is in local state
σ. We will use the notation Ω[r ← σ] to denote the global state resulting
from replacing the local state of replica r with σ. The local state of all other
replicas remains unchanged in the resulting global state.3
– A message propagating states between replicas is denoted r −σ→ r′. This
represents the fact that replica r has sent a message (possibly not yet
received) to replica r′, with the state σ as its payload. The meta-variable M
denotes the messages in transit in the network.
– In the following sub-section, we will utilise a set of states to record the history
of the execution. The set of past states will be ranged over with the variable
S ∈ P(Σ).
– All replicas are assumed to start in the same initial state σi . Formally, for
each replica r ∈ dom(Ωi ) we have Ωi (r) = σi .
In this and the following subsections we will present two semantics for systems
propagating states. Importantly, while the first semantics takes into account
the effects of the network on the propagation of the states, and is hence an
accurate representation of the execution of systems with state propagation, we
will show in the next subsection that reasoning about the network is unnecessary
in this kind of system. We will demonstrate this claim by presenting a much
simpler semantics in which the network is abstracted away. The importance
of this reduction is that the number of events to be considered, both when
conducting proofs and when reasoning about applications, is greatly reduced.
As informal evidence of this claim, we point at the difference in complexity
between the semantic rules presented in Figure 2 and Figure 3. We postpone the
equivalence argument to Theorem 1.
Figure 2 presents the semantic rules describing what we shall call the precise
semantics (we will later present a more abstract version) defining the transition
relations describing how the state of the object evolves.
The figure defines a semantic judgement of the form (Ω, M) −→ (Ωnew, Mnew)
where (Ω, M) is a configuration where the replica states are given by Ω as shown
above, and M is a set of messages that have been transmitted by different replicas
and are pending to be received by their target replicas.
Rule Operation presents the state transition resulting from a replica r
executing an operation op. The operation queries the state of replica r, evaluates
the semantic function for operation op and updates its state with the result. The
3 This notation of a global state is used only to explain and prove our proof rule. In
fact, the rule is based only on the local state of each replica.
Operation:
  Ω(r) = σ    op(σ) = σnew    Ωnew = Ω[r ← σnew]
  ⟹ (Ω, M) −→ (Ωnew, M)

Send:
  Ω(r) = σ    r′ ∈ dom(Ω) \ {r}    Mnew = M ∪ { r −σ→ r′ }
  ⟹ (Ω, M) −→ (Ω, Mnew)

Merge:
  Ω(r′) = σ′    Mnew = M \ { r −σ→ r′ }    merge(σ′, σ) = σnew    Ωnew = Ω[r′ ← σnew]
  ⟹ (Ω, M) −→ (Ωnew, Mnew)

Op & Broadcast:
  Ω(r) = σ    op(σ) = σnew    Ωnew = Ω[r ← σnew]
  Mnew = M ∪ { r −σnew→ r′ | r′ ∈ dom(Ω) \ {r} }
  ⟹ (Ω, M) −→ (Ωnew, Mnew)

Merge & Broadcast:
  Ω(r′) = σ′    M′ = M \ { r −σ→ r′ }    merge(σ′, σ) = σnew    Ωnew = Ω[r′ ← σnew]
  Mnew = M′ ∪ { r′ −σnew→ r′′ | r′′ ∈ dom(Ω) \ {r′} }
  ⟹ (Ω, M) −→ (Ωnew, Mnew)

(Figure 2: the precise semantics, with messages.)
set of messages M does not change. The second rule, Send, represents the non-
deterministic sending of the state of replica r to replica r′. The rule has no other
effect than to add a message to the set of pending messages M. The Merge rule
picks any message r −σ→ r′ in the set of pending messages M and applies the
merge function at the destination replica with the state in the payload of the
message, removing r −σ→ r′ from M.
The final two rules, Op & Broadcast and Merge & Broadcast represent
the specific case when the states are immediately sent to all replicas. These rules
are not strictly necessary since they are subsumed by the application of either
Operation or Merge followed by one Send per replica. We will, however, use
them to simplify a simulation argument in what follows.
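To make the rules concrete, here is a small Python interpreter for the precise semantics (a sketch under our own naming; states are modelled as frozensets with union as merge, standing in for an arbitrary semi-lattice):

    # A configuration is (omega, msgs): omega maps replica ids to local
    # states, msgs is the set of in-flight messages (sender, payload, dest).
    def op_rule(omega, msgs, r, op):          # rule Operation
        return {**omega, r: op(omega[r])}, msgs

    def send_rule(omega, msgs, r, r2):        # rule Send
        return omega, msgs | {(r, omega[r], r2)}

    def merge_rule(omega, msgs, msg):         # rule Merge
        sender, sigma, dest = msg
        omega = {**omega, dest: omega[dest] | sigma}   # merge = union (lub)
        return omega, msgs - {msg}

    omega = {"A": frozenset(), "B": frozenset()}
    msgs = frozenset()
    omega, msgs = op_rule(omega, msgs, "A", lambda s: s | {"bid-100"})
    omega, msgs = send_rule(omega, msgs, "A", "B")
    omega, msgs = merge_rule(omega, msgs, next(iter(msgs)))
    assert omega["B"] == {"bid-100"}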
We remark at this point that no assumptions are made about the duplication
of messages or the order in which messages are delivered. This is in contrast to
other works on the verification of properties of replicated objects [11, 13]. The
reason why this assumption is not a problem in our case is that the least-upper-
bound property of the merge function, together with the inflation property of
the states considered in Item 2 (Section 6.1), means that delayed messages have
no effect when they are merged.
Operation:
  Ω(r) = σ    op(σ) = σnew    Ωnew = Ω[r ← σnew]
  ⟹ (Ω, S) −→ (Ωnew, S ∪ {σnew})

Merge:
  Ω(r) = σ    σ′ ∈ S    merge(σ, σ′) = σnew    Ωnew = Ω[r ← σnew]
  ⟹ (Ω, S) −→ (Ωnew, S ∪ {σnew})

(Figure 3: the history-preserving semantics, without messages.)
As customary, we denote by (Ω, M) −→∗ (Ωnew, Mnew) the repeated application
of the semantic rules zero or more times, from the state (Ω, M) resulting
in the state (Ωnew, Mnew).
It is easy to see how the example in Figure 1 proceeds according to these
rules for the auction.
The following lemma,4 to be used later, establishes that whenever we use
only the broadcast rules, for any intermediate state in the execution and for
any replica, when considering the final state of the trace, either the replica
has already observed a fresher version of the state in the execution, or there
is a message pending for it with that state. This is an obvious consequence of
broadcasting.

Lemma 1. Consider an execution using only the broadcast rules that goes through
an intermediate configuration (Ω, M) and ends in (Ωnew, Mnew). Then for any two
replicas r and r′ and a state σ such that Ω(r) = σ, either:
– Ωnew(r′) ≥ σ, or
– r −σ→ r′ ∈ Mnew.
We now turn our attention to a simpler semantics where we omit messages from
configurations, but instead, we record in a separate set all the states occurring
in any replica throughout the execution.
The semantics in Figure 3 presents a judgement of the form (Ω, S) −→ (Ωnew, Snew)
between configurations of the form (Ω, S) as before, but where the set of messages
is replaced by a set of states denoted with the meta-variable S ∈ P(Σ).
4 The proofs for the lemmas are included in the extended version [23].
Lemma 2. Consider a state (Ω, S) reachable from an initial global state Ωi with
the semantics of Figure 3. Formally: (Ωi, {σi}) −→∗ (Ω, S). We can conclude that
the set of recorded states in the final configuration S includes all of the states
present in any of the replicas:
⋃_{r ∈ dom(Ω)} {Ω(r)} ⊆ S
In other words, two configurations are related if both are reachable from an
initial global state and all the states transmitted by the messages (M) are
present in the history (S).
We can now show that this relation is indeed a bisimulation. We first show
that the semantics of Figure 3 simulates that of Figure 2. That is, all behaviours
produced by the precise semantics with messages can also be produced by the
semantics with history states. This is illustrated in the commutative diagram
of Figure 4a and Figure 4b, where the dashed arrows represent existentially
quantified components that are proven to exist in the theorem.
[Figure 4a and 4b: commutative diagrams relating (Ω, M)-configurations of the precise semantics and (Ω, S)-configurations of the history-preserving semantics via the relation RΩi.]
and consider that there exists a state (Ω, S) of the history-preserving semantics
of Figure 3 such that they are related by the simulation relation: (Ω, M) RΩi (Ω, S).
We can conclude that, as illustrated in Figure 4a, there exists a state (Ωnew, Snew)
such that
(Ω, S) −→ (Ωnew, Snew) and (Ωnew, Mnew) RΩi (Ωnew, Snew)
We will now consider the lemma showing the inverse relation. To that end we
will consider a special case of the semantics of Figure 2 where instead of apply-
ing the Operation rule, we will always apply the Op & Broadcast rule, and
instead of the Merge rule, we will apply Merge & Broadcast. As we men-
tioned before, this is equivalent to the application of the Operation/Merge
rule, followed by a sequence of applications of Send. The reason we will do this
is that we are interested in showing that for any execution of the semantics in
Figure 3 there is an equivalent (simulated) execution of the semantics of Fig-
ure 2. Since all states can be merged in the semantics of Figure 3 we have to
assume that in the semantics of Figure 2 the states have been sent with messages.
Fortunately, we can choose how to instantiate the existential send messages to
apply the rules as necessary, and that justifies this choice.
We can conclude that there exists a state (Ωnew, Mnew) such that
(Ω, M) −→ (Ωnew, Mnew) and (Ωnew, Mnew) RΩi (Ωnew, Snew)
4 Proving Invariants
We call a replica state that satisfies the invariant a safe state. Assuming
the current state is safe, any update (local or merge) must result in a safe state.
To ensure this, every update is equipped with a precondition that disallows any
unsafe execution.5 Thus, a local update executes only when, at the origin replica,
the current state is safe and its precondition currently holds.
Formally, an update u (an operation or a merge) mutates the local state σ to
a new state σnew = u(σ). To preserve the invariant Inv, we require that the local
state respect the precondition of the update Preu: σ ∈ Preu =⇒ u(σ) ∈ Inv.
To illustrate local preconditions, consider an operation close auction(w:
BidId), which sets auction status to CLOSED and the winner to w (of type BidId).
The developer may have written a precondition such as status = ACTIVE be-
cause closing an auction doesn’t make sense otherwise. In order to ensure the
invariant that the winner has the highest amount, one needs to strengthen it
with the clause is highest(Bids, w), defined as
∀b ∈ Bids, b.placed =⇒ b.Amount ≤ w.Amount
This means that if the status is CLOSED in either of the two states, the winner
should be the highest bid in any state. This condition ensures that when a winner
is declared, it is the highest bid among the set of bids in any state at any replica.
Since merge can happen at any time, its precondition must always hold, i.e.,
it constitutes an additional invariant. We call this the concurrency invariant.
Now our global invariant consists of two parts: first, the invariant (Inv), and
second, the concurrency invariant (Invconc).
σi ⊨ Inv   (1)

∀op, σ, σnew: (σ ⊨ Preop ∧ σ ⊨ Inv ∧ op(σ) = σnew) =⇒ σnew ⊨ Inv   (2)

∀σ, σ′, σnew: ((σ, σ′) ⊨ Premerge ∧ σ ⊨ Inv ∧ σ′ ⊨ Inv ∧ merge(σ, σ′) = σnew) =⇒ σnew ⊨ Inv   (3)
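For a finite toy object these three conditions can be checked exhaustively; the Python sketch below (our own example: a bounded counter with merge = max, not from the paper) is meant only to make the shape of the obligations concrete:

    # Toy object: states 0..5, operation inc with precondition t < 5,
    # merge = max with a trivial precondition, invariant t <= 5.
    LIMIT = 5
    states = range(LIMIT + 1)
    inv       = lambda t: t <= LIMIT
    pre_inc   = lambda t: t < LIMIT
    inc       = lambda t: t + 1
    pre_merge = lambda t, u: True
    merge     = max

    assert inv(0)                                  # condition (1)
    for t in states:                               # condition (2)
        if inv(t) and pre_inc(t):
            assert inv(inc(t))
    for t in states:                               # condition (3)
        for u in states:
            if inv(t) and inv(u) and pre_merge(t, u):
                assert inv(merge(t, u))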
– Clearly, the initial state of the object must satisfy the global invariant;
this is checked by conditions (1) and (4).
The rest of the rules perform a kind of inductive reasoning. Assuming that we
start in a state that satisfies the global invariant, we need to check that any
state update preserves the validity of said invariant. Importantly, this reasoning
is not circular, since the initial state is known by the rule above to be safe. 6
– Condition (2) checks that each of the operations, when executed starting
in a state satisfying its precondition and the invariant, is safe. Notice that
we require that the precondition of the operation be satisfied in the starting
state. This is the core of the inductive argument alluded to above: all
operations – which, as we mentioned in Section 3, execute atomically w.r.t.
concurrency – preserve the invariant Inv.
Other than the execution of operations, the other source of local state changes
is the execution of the merge function in a replica. It is not true in general that
for any two given states of an object, the merge should compute a safe state.
In particular, it could be the case that the merge function needs a precondition
that is stronger than the conjunction of the invariants in the two states to be
merged. The following rules deal with these cases.
– We require the merge function to be annotated with a precondition strong
enough to guarantee that merge will result in a safe state. Generally, this
6 Indeed, the proofs of soundness of program logics such as Rely/Guarantee are
typically inductive arguments of this nature.
We remark at this point that there are numerous program logic approaches
to proving invariants of shared-memory concurrent programs, with Rely/Guar-
antee [15] and concurrent separation logic [6] underlying many of them. While
these approaches could be adapted to our use case (propagating-state distributed
systems), this adaptation is not evident. As an indication of this complexity: one
would have to predicate about the different states of the different replicas, re-
state the invariant to talk about these different versions of the state, encode the
non-deterministic behaviour of merge, etc. Instead, we argue that our specialised
rules are much simpler, allowing for a purely sequential and modular verification
that we can mechanise and automate. This reduction in complexity is the main
theoretical contribution of this paper.
Let us apply the proof methodology to the auction object. Its invariant is the
following conjunction:
Computing the weakest precondition of each update operation for this invariant
is straightforward. For instance, as discussed earlier, close auction(w) gets the
precondition is highest(Bids, w), because of invariant Item 2 above.
Although local updates at each replica respect the invariant Inv, Figure 1
showed that the invariant can be violated by merging. This is the case if Bob's
$100 bid in Brussels wins, even though Charles concurrently placed a $105 bid
in Calgary; this occurred because status became CLOSED in Brussels while still
ACTIVE in Calgary. The weakest precondition of merge for safety expresses that,
if status in either state is CLOSED, the winner should be the bid with the highest
amount in both the states. This merge precondition, now called the concurrency
invariant, strengthens the global invariant to be safe in concurrent executions.
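Spelled out as a predicate (a Python sketch with our own field names, matching the state sketched in Section 2), the merge precondition reads:

    # States are assumed to have fields status, winner (a bid or None),
    # and bids (a set of bids with fields placed and amount);
    # these names are illustrative.
    def is_highest(winner, bids):
        # the winner carries the highest amount among all placed bids
        return all(b.amount <= winner.amount for b in bids if b.placed)

    def pre_merge(s1, s2):
        # if either state is CLOSED, its winner must be highest with
        # respect to the bids of *both* states
        all_bids = s1.bids | s2.bids
        return all(s.status != "CLOSED" or is_highest(s.winner, all_bids)
                   for s in (s1, s2))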
Let us now consider how this strengthening impacts the local update opera-
tions. Since starting the auction doesn’t modify any bids, the operation trivially
preserves it. Placing a bid might violate Invconc if the auction is concurrently
closed in some other replica; conversely, closing the auction could also violate
Invconc , if a higher bid is concurrently placed in a remote replica. Thus, the auc-
tion object is safe when executed sequentially, but it is unsafe when updates are
concurrent. This indicates the specification has a bug, which we now proceed to
fix.
5 Case Studies
This section presents three representative examples of different consistency re-
quirements of several distributed applications. The consensus object is one such
example.
Initial state: ∃r, V.r ∧ t = 0
Comparison function: t > t0 ∨ (t = t0 ∧ V = V0)
Invariant: ∃r, V.r ∧ ∀r, r0, (V.r ∧ V.r0) =⇒ r = r0

{Pre_transfer: V.me}
transfer(r0):
  t := t + 1
  V.me := false
  V.r0 := true

{Pre_merge: (t = t0 =⇒ V = V0) ∧ (V.me =⇒ t ≥ t0)}
merge((t, V), (t0, V0)):
  t := max(t, t0)
  V := (t0 < t) ? V : V0
The pseudo code of the consensus example is shown in Figure 7. The design
for consensus can be relaxed, requiring only the majority of replicas to mark
their boxes. The extension for that is trivial.
its origin replica. For state inflation, a timestamp associated with the lock is
incremented during each transfer.
A merge of two states of this distributed lock will preserve the state with
the highest timestamp. In order for the merge function to be the least upper
bound, we must specify that if the timestamps of the two states are equal, their
corresponding boolean arrays are also equal. Also if the origin replica owns the
lock, it has the highest timestamp. The conjunction of these two restrictions
which form the precondition of merge, Premerge , is the concurrency invariant,
Invconc .
Consider the case of three replicas r1 , r2 and r3 sharing a distributed lock.
Assume that initially replica r1 owns the lock. Replicas r2 and r3 concurrently
place a request for the lock. The current owner r1 , has to make a decision on the
priority of the requests based on the business logic. r1 calculates a higher priority
for r3 and transfers the lock to r3 . Since r1 no longer has the lock, it cannot issue
any further transfer operations. We see here clearly that the transfer operation is
safe. In the new state, r3 is the only replica that can perform a transfer operation.
We can also note that this prevents any concurrent transfer operations. This can
guarantee mutual exclusion and hence ensures safety in a concurrent execution
environment.
An interesting property we can observe from this example is total order. Due
to the preconditions imposed in order to be safe, we see that the states progress
through a total order, ordered by the timestamp. The transfer function increases
the timestamp, and the merge function preserves the highest timestamp.
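A Python transcription of the transfer/merge design just described (our sketch; names follow the pseudo code above) shows the inflation and the merge-by-timestamp:

    class Lock:
        def __init__(self, replicas, owner):
            self.t = 0                                # timestamp
            self.V = {r: (r == owner) for r in replicas}

        def geq(self, other):
            # comparison function: higher timestamp, or equal state
            return self.t > other.t or (self.t == other.t and self.V == other.V)

        def transfer(self, me, r0):
            assert self.V[me]     # Pre_transfer: the origin owns the lock
            self.t += 1           # inflation: the timestamp grows
            self.V[me] = False
            self.V[r0] = True

        def merge(self, other):
            # Pre_merge makes this a least upper bound: equal timestamps
            # imply equal V, and the owner has the highest timestamp.
            if other.t > self.t:
                self.t, self.V = other.t, dict(other.V)

    rs = ["r1", "r2", "r3"]
    a, b = Lock(rs, "r1"), Lock(rs, "r1")
    a.transfer("r1", "r3")        # r1 hands the lock to r3
    b.merge(a)                    # a stale replica catches up
    assert b.V["r3"] and not b.V["r1"]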
5.3 Courseware
We now look at an application that allows students to register and enroll in a
course. For space reasons, we elide the pseudocode, which can be found in the
extended version [23]. The state consists of a set of students, a set of courses, and
enrollments of students for different courses. Students can register and deregister,
courses can be created and deleted, and a student can enroll for a course. The
invariant requires enrolled students and courses to be registered and created
respectively.
The set of students and courses consists of two sets - one to track registrations
or creations and another to track deregistrations or deletions. Registration or cre-
ation monotonically adds the student or course to the registered sets respectively
and deregistration or deletion monotonically adds them to the unregistered sets.
The semantics currently doesn’t support re-registration, but that can be fixed
by using a slightly modified data structure that counts the number of times the
student has been registered/unregistered and decides on the status of registra-
tion. Enrollment adds the student-course pair to the set. Currently, we do not
consider canceling an enrollment, but it is a trivial extension. Merging two states
takes the union of the sets.
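A Python sketch of this state (our names; the actual specification is Boogie code in the extended version [23]) makes the monotonic structure explicit:

    from dataclasses import dataclass, field

    @dataclass
    class Courseware:
        registered:   set = field(default_factory=set)   # students ever registered
        deregistered: set = field(default_factory=set)   # students ever deregistered
        created:      set = field(default_factory=set)   # courses ever created
        deleted:      set = field(default_factory=set)   # courses ever deleted
        enrollments:  set = field(default_factory=set)   # (student, course) pairs

        def enroll(self, student, course):
            # precondition keeping the invariant: both parties are active
            assert student in self.registered - self.deregistered
            assert course in self.created - self.deleted
            self.enrollments.add((student, course))

        def merge(self, other):
            # all components only grow, so merge is component-wise union
            for f in ("registered", "deregistered", "created",
                      "deleted", "enrollments"):
                getattr(self, f).update(getattr(other, f))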
Let us consider the safety of each operation. The operations to register a
student and create a course are safe without any restrictions. Therefore they do
not need any precondition. The remaining three operations might violate the
6 Automation
In this section, we present a tool to automate the verification of invariants as
discussed in the previous sections. Our tool, called Soteria, is based on the Boogie
verification framework [5]. The input to Soteria is a specification of the object
written as Boogie procedures, augmented with a number of domain-specific
annotations needed to check the properties described in Section 4.
Let us now consider how a distributed object is specified in Soteria:
– State: We require the programmer to provide a declaration of the state
using the global variables in Boogie. The data types can be either built-in
or user defined.
– Comparison function: Next we require the programmer to provide a com-
parison function. This function determines the partial order on states. Again,
we shall use this comparison function as a basis to check the lattice condi-
tions, and whether each operation is an inflation on the lattice. We use the
keyword @gteq to annotate the comparison function in the tool. This com-
parison function returns true when all the components of the first state are
greater than or equal to the corresponding components in the other state. It
is encoded as a function in Boogie.
– Operations: We require the programmer to provide the implementation of
the operations of the object. Moreover, for each operation op we require the
1. Syntax checks
The first simple checks validate that the specification provided respects Boo-
gie syntax when ignoring Soteria annotations. It also calls Boogie to validate
that the types are correct and that the pre/post conditions provided are
sound.
Then it checks that the specification provides all the elements necessary for a
complete specification. Specifically, it checks the function signatures marked
by @gteq and @invariant and the procedure marked by @merge.
2. Convergence check
This stage checks the convergence of the specification. Specifically, it checks
whether the specification respects Strong Eventual Consistency. The Strong
Eventual Consistency (SEC) property states that any two replicas that re-
ceived the same set of updates are in the same state. To guarantee this,
objects are designed to have certain sufficient properties in the encoding of
the state [3, 4, 25], which can be summarised as follows:
– The state space is equipped with an ordering operator, comparing two
states.
– The ordering forms a join-semilattice.
– Each individual operation is an inflation in the semilattice.
– The merge operation, composing states from two replicas, computes the
least-upper-bound of the given states in the semilattice.
We present the conditions formally in the extended version[23].
An alternative is to make use of the CALM theorem [12]. This allows non-
monotonic operations, but requires them to coordinate. However, our aim is
to provide maximum possible availability with SEC. 7
To ensure these conditions of Strong Eventual Consistency, the tool performs
the following checks:
– That each operation is an inflation. In a nutshell, we prove using Boogie
the following Hoare-logic triple:
assume σ ∈ Preop
call σnew := op(σ)
assert σnew ≥ σ
– Merge computes the least upper bound. The verification condition discharged
is shown below (see also the sketch following this list):
assume (σ, σ′) ∈ Premerge
call σnew := merge(σ, σ′)
assert σnew ≥ σ ∧ σnew ≥ σ′
assert ∀σ∗, σ∗ ≥ σ ∧ σ∗ ≥ σ′ =⇒ σ∗ ≥ σnew
3. Safety check This stage verifies the safety of the specification as discussed
in Section 4. This stage is divided further into two sub-stages:
– Sequential safety: Soteria checks whether each individual operation is
safe. This corresponds to the conditions (2) and (3) in Figure 5. The
verification condition discharged by the tool to ensure sequential safety
of operations is:
assume σ ∈ Preop ∧ σ ∈ Inv
call σnew := op(σ)
assert σnew ∈ Inv
7 Convergence of our running example is discussed in the extended version [23].
The special case of the merge function is verified with the following
verification condition:
assume (σ, σ′) ∈ Premerge ∧ σ ∈ Inv ∧ σ′ ∈ Inv
call σnew := merge(σ, σ′)
assert σnew ∈ Inv
Notice that in this condition we assume that there are two copies of the
state: the state of the replica applying the merge, and the primed state
representing a state arriving from another replica. In case of failure of
the sequential safety check, the designer needs to strengthen the
precondition of the operation (or merge) which was unsafe.
– Concurrent safety: Here we check whether each operation upholds the
precondition of merge. This corresponds to the conditions (5) and (6) in
Figure 5. Notice that while this check relates to the concurrent behaviour
of the distributed object, the check itself is completely sequential; it does
not require reasoning about operations performed by other processes. As
shown in Section 4, this ensures safety during concurrent operation.
The verification conditions are:
assume σ ∈ Preop ∧ σ ∈ Inv ∧ (σ, σ′) ∈ Invconc
call σnew := op(σ)
assert (σnew, σ′) ∈ Invconc
and a similar condition is used to validate a call to merge. If the concurrent
safety check fails, the design of the distributed object needs a replicated
concurrency-control mechanism embedded as part of the state.
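As an illustration of the least-upper-bound check referenced in item 2 above, the following sketch discharges it for a max-based merge over integer states using the z3 Python bindings (an assumption on our part: Soteria itself discharges its conditions through Boogie, not z3py):

    from z3 import And, If, Implies, Int, prove

    s1, s2, ub = Int("s1"), Int("s2"), Int("ub")
    merged = If(s1 >= s2, s1, s2)        # merge = max over integer states

    # merge is an upper bound of both arguments ...
    prove(And(merged >= s1, merged >= s2))
    # ... and is below every other upper bound, hence the least one
    prove(Implies(And(ub >= s1, ub >= s2), ub >= merged))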
When all checks are validated, the tool reports that the specification is safe.
Whenever a check fails, Soteria provides a counterexample8 along with a
failure message tailored to the type of check. This helps the developer identify
issues with the specification and fix them.
Once the invariants and the specification of an application are given, Soteria is
fully automatic, thanks to the Z3 SMT solver. The specification of the application
includes the state and all the operations, including merge, with their pre- and
postconditions. In case the invariant cannot be proven, Soteria provides
counterexamples. The programmer can leverage these to update the specification
with appropriate concurrency control, rerun Soteria, and so on until the
application is correct. As far as the proof system is concerned, no programmer
involvement is required. Currently, the effort of adding the required
synchronization conditions is manual, but as the next step, we are working on
automating this step.
8 Soteria uses the countermodel provided by Boogie.
7 Related Work
Bailis et al. [2] study when coordination can be avoided. In our setting, updates
execute without coordination, but we require that their merge satisfies the
invariant. This is captured in the concurrency invariant, Invconc, which is
synthesised from the user-provided invariant. How to obtain this invariant is
understandably not addressed in Bailis et al. [2], since no proof technique is
provided. Notice that this is a sound approxi-
mation since it guarantees the invariant is satisfied, and we also verify that every
operation preserves this condition, as shown in Corollary 1. In this sense, we say
that the precondition of merge for a given invariant I is itself an invariant of the
system. It is this abstraction step that makes the analysis performed by Soteria
syntax-driven, automated, and machine-checked. The fact that Soteria is
an analysis of a program is in contrast with I-confluence [2] where no means
to link a given program text to the semantical model, let alone rules to show
that the syntax implies invariant preservation, are provided. In other words, I-
confluence [2] does not provide a program logic, but rather a meta-theoretical
proof about lattice-based state-propagating systems.
Our previous work [21] provides an informal proof methodology for ensuring
safety of Convergent Replicated Data Types (CvRDTs), a family of specialised
data structures used to ensure convergence in distributed programming. This
work builds upon it, formalises the proof rules, and proves them sound. We relax
the requirement of CvRDTs by allowing the use of any data types that together
respect the lattice conditions mentioned in Section 3. We also present several
case studies that demonstrate the use of the rule.
A final interesting remark is that we can show how our methodology can
aid in the verification of distributed objects mediated by concurrency control.
Some works [16, 17, 26, 27] have considered this problem from the standpoint of
synthesis, or from the point of view of which mechanisms can be used to check
a certain property of the system.
8 Conclusion
We have presented a sound proof rule to verify invariants of state-based dis-
tributed objects, i.e., objects that propagate state. We present proof obligations
guaranteeing that an implementation is safe in concurrent executions, by reducing
the problem to checking that each operation of the object preserves the
precondition of the merge function on the state.
We presented Soteria, a tool sitting on top of the Boogie verification frame-
work. This tool can be used to identify the concurrency bugs in the design of
a distributed object. Soteria also checks convergence, by verifying the lattice
conditions on the state described in [3]. We have presented several compelling
case studies showing how Soteria can be leveraged to ensure the correctness of
distributed objects that propagate state. It would be an interesting next step
to look into automatic concurrency control synthesis. The synthesised concur-
rency control can be analysed and adapted dynamically to minimise the cost of
synchronisation.
Acknowledgements. This research is supported in part by the RainbowFS project (Agence
Nationale de la Recherche, France, ANR-16-CE25-0013-01) and by the European H2020 project
LightKone (grant 732505, 2017–2020).
Bibliography
[1] Almeida, P.S., Shoker, A., Baquero, C.: Delta state replicated data types.
J. Parallel Distrib. Comput. 111, 162–173 (2018), https://doi.org/10.1016/
j.jpdc.2017.08.003
[2] Bailis, P., Fekete, A., Franklin, M.J., Ghodsi, A., Hellerstein, J.M., Sto-
ica, I.: Coordination avoidance in database systems. Proc. VLDB Endow.
8(3), 185–196 (Nov 2014), http://dx.doi.org/10.14778/2735508.2735509,
Int. Conf. on Very Large Data Bases (VLDB) 2015, Waikoloa, Hawai’i, USA
[3] Baquero, C., Almeida, P.S., Cunha, A., Ferreira, C.: Composition in state-
based replicated data types. Bulletin of the EATCS 123 (2017), http://
eatcs.org/beatcs/index.php/beatcs/article/view/507
[4] Baquero, C., Moura, F.: Using structural characteristics for autonomous
operation. Operating Systems Review 33(4), 90–96 (1999), https://doi.org/
10.1145/334598.334614
[5] Barnett, M., Chang, B.Y.E., DeLine, R., Jacobs, B., Leino, K.R.M.: Boogie:
A modular reusable verifier for object-oriented programs. In: Proceedings
of the 4th International Conference on Formal Methods for Components
and Objects. pp. 364–387. FMCO’05, Springer-Verlag, Berlin, Heidelberg
(2006), http://dx.doi.org/10.1007/11804192_17
[6] Brookes, S., O’Hearn, P.W.: Concurrent separation logic. SIGLOG News
3(3), 47–65 (2016), https://dl.acm.org/citation.cfm?id=2984457
[7] Burckhardt, S.: Principles of eventual consistency. Foundations and Trends
in Programming Languages 1(1-2), 1–150 (2014), https://doi.org/10.1561/
2500000011
[8] Burckhardt, S., Gotsman, A., Yang, H., Zawirski, M.: Replicated data types:
Specification, verification, optimality. In: Symp. on Principles of Prog. Lang.
(POPL). pp. 271–284. San Diego, CA, USA (Jan 2014), http://doi.acm.org/
10.1145/2535838.2535848
[9] Dijkstra, E.: A discipline of programming. Prentice-Hall series in automatic
computation, Prentice-Hall (1976)
[10] Gomes, V.B.F., Kleppmann, M., Mulligan, D.P., Beresford, A.R.: A frame-
work for establishing strong eventual consistency for conflict-free replicated
datatypes. Archive of Formal Proofs 2017 (2017), https://www.isa-afp.org/
entries/CRDT.shtml
[11] Gotsman, A., Yang, H., Ferreira, C., Najafzadeh, M., Shapiro, M.: ’Cause
I’m Strong Enough: Reasoning about consistency choices in distributed sys-
tems. In: Symp. on Principles of Prog. Lang. (POPL). pp. 371–384. St. Pe-
tersburg, FL, USA (2016), http://dx.doi.org/10.1145/2837614.2837625
[12] Hellerstein, J.M., Alvaro, P.: Keeping CALM: when distributed consistency
is easy. CoRR abs/1901.01930 (2019), http://arxiv.org/abs/1901.01930
[13] Houshmand, F., Lesani, M.: Hamsaz: Replication coordination analysis and
synthesis. Proc. ACM Program. Lang. 3(POPL), 74:1–74:32 (Jan 2019),
http://doi.acm.org/10.1145/3290387
[14] Jagadeesan, R., Riely, J.: Eventual consistency for CRDTs. In: Ahmed, A.
(ed.) Programming Languages and Systems - 27th European Symposium
on Programming, ESOP 2018, Held as Part of the European Joint Con-
ferences on Theory and Practice of Software, ETAPS 2018, Thessaloniki,
Greece, April 14-20, 2018, Proceedings. Lecture Notes in Computer Sci-
ence, vol. 10801, pp. 968–995. Springer (2018), https://doi.org/10.1007/978-3-319-89884-1_34
[15] Jones, C.B.: Specification and design of (parallel) programs. In: Mason, R.
(ed.) Information Processing 83. IFIP Congress Series, vol. 9, pp. 321–332.
IFIP, North-Holland/IFIP, Paris, France (Sep 1983)
[16] Kaki, G., Earanky, K., Sivaramakrishnan, K., Jagannathan, S.: Safe repli-
cation through bounded concurrency verification. Proc. ACM Program.
Lang. 2(OOPSLA), 164:1–164:27 (Oct 2018), http://doi.acm.org/10.1145/
3276534
[17] Kaki, G., Nagar, K., Najafzadeh, M., Jagannathan, S.: Alone together:
Compositional reasoning and inference for weak isolation. In: Symp. on
Principles of Prog. Lang. (POPL). Proc. ACM Program. Lang., vol. 2, pp.
27:1–27:34. Assoc. for Computing Machinery, Assoc. for Computing Ma-
chinery, Los Angeles, CA, USA (Dec 2017), http://doi.acm.org/10.1145/
3158115
[18] Leino, K.R.M., Monahan, R.: Reasoning about comprehensions with first-
order smt solvers. In: Proceedings of the 2009 ACM Symposium on Applied
Computing. pp. 615–622. SAC ’09, ACM, New York, NY, USA (2009),
http://doi.acm.org/10.1145/1529282.1529411
[19] Marcelino, G., Balegas, V., Ferreira, C.: Bringing hybrid consistency closer
to programmers. In: W. on Principles and Practice of Consistency for
Distr. Data (PaPoC). pp. 6:1–6:4. PaPoC ’17, Euro. Conf. on Comp. Sys.
(EuroSys), ACM, Belgrade, Serbia (2017), http://doi.acm.org/10.1145/
3064889.3064896
[20] Nair, S., Shapiro, M.: Improving the “Correct Eventual Consistency” tool.
Rapport de recherche RR-9191, Institut National de la Recherche en Infor-
matique et Automatique (Inria), Paris, France (Jul 2018), https://hal.inria.
fr/hal-01832888
[21] Nair, S.S., Petri, G., Shapiro, M.: Invariant safety for distributed appli-
cations. In: W. on Principles and Practice of Consistency for Distr. Data
(PaPoC). pp. 4:1–4:7. Assoc. for Computing Machinery, Assoc. for Com-
puting Machinery, Dresden, Germany (Mar 2019), https://doi.org/10.1145/
3301419.3323970
[22] Nair, S.S., Petri, G., Shapiro, M.: Soteria. https://github.com/sreeja/soteria_tool (2019)
[23] Nair, S.S., Petri, G., Shapiro, M.: Proving the safety of highly-available
distributed objects (Extended version). Tech. rep. (Feb 2020), https://hal.
archives-ouvertes.fr/hal-02492599
[24] Najafzadeh, M., Gotsman, A., Yang, H., Ferreira, C., Shapiro, M.: The
CISE tool: Proving weakly-consistent applications correct. In: W. on Prin-
ciples and Practice of Consistency for Distr. Data (PaPoC). EuroSys 2016
Solving Program Sketches with
Large Integer Values
1 Introduction
The most popular sketching tool, Sketch [21], can efficiently solve complex
program sketches with hundreds of lines of code. However, Sketch often per-
forms poorly if the sketched program manipulates large integer values. Sketch’s
synthesis is based on an algorithm called counterexample-guided inductive syn-
thesis (Cegis) [21]. The Cegis algorithm iteratively considers a finite set I of
inputs for the program and performs SAT queries to identify values for the holes
so that the resulting program satisfies all the assertions for the inputs in I.
Further SAT queries are then used to verify whether the generated solution is
correct on all the possible inputs of the program. Sketch represents integers
using a unary encoding (a variable for each integer value) so that arithmetic
computations such as addition and multiplication can be represented efficiently
in the SAT formulas as lookup operations. This unary encoding, however, results
in huge formulas when solving sketches with larger integer values, as we also
observe in our evaluation. Recently, an SMT-like technique that extends the SAT
solver with native integer variables and integer constraints was proposed to
alleviate this issue in Sketch. It guesses values for the integer variables,
propagates them through the integer constraints, and learns from conflict
clauses. However, this technique does not scale well when the sketches contain
complex arithmetic operations—e.g., non-linear integer arithmetic.
In this paper, we propose a program transformation technique that allows
Sketch to solve program sketches involving large integer values while retain-
ing the unary encoding used by the traditional Sketch solver. Our technique
rewrites a Sketch program into an equivalent one that performs computations
over smaller values. The technique is based on the well-known Chinese
Remainder Theorem, which states that, given distinct prime numbers p1, . . . , pn
with N = p1 · . . . · pn, for every two distinct numbers 0 ≤ k1, k2 < N there
exists a pi such that k1 mod pi ≠ k2 mod pi. Intuitively, this theorem states
that tracking the modular values of a number smaller than N for each pi is
enough to uniquely recover the actual value of the number itself. We use this
idea to replace a variable x in the program with n variables xp1, . . . , xpn,
so that for every i, xpi = x mod pi.
as long as the program uses the operators +, −, ∗, ==, tracking the modular
values of variables and performing the corresponding operations on such values
is enough to ensure correctness. For example, to reflect the variable assignment
x = y + z, we perform the assignment xpi = (ypi + zpi ) mod pi , for every pi . Sim-
ilarly, the Boolean operation x == y will only hold if xpi = ypi , for every pi . To
identify what variables and values in the program can be rewritten, we develop
a data-flow analysis that computes what variables may flow into operations that
are not sound in modular arithmetic—e.g., <, >, ≤, and /.
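To illustrate the transformation on plain values, the following Python sketch
(our own illustration, independent of the Sketch implementation) tracks a
number by its residues modulo a fixed prime set and mirrors the sound
operators residue-wise:

# Track a value by its residues modulo a fixed set of primes, and mirror
# the sound operators (+, -, *, ==) on the residues.
PRIMES = [2, 3, 5, 7, 11, 13, 17]   # product = 510510

def to_residues(x):
    return [x % p for p in PRIMES]

def add(xs, ys):
    return [(a + b) % p for a, b, p in zip(xs, ys, PRIMES)]

def mul(xs, ys):
    return [(a * b) % p for a, b, p in zip(xs, ys, PRIMES)]

def eq(xs, ys):
    # x == y holds iff the residues agree for every prime
    return all(a == b for a, b in zip(xs, ys))

# the assignment x = y + z, performed modularly, agrees with the direct
# computation, and similarly for multiplication
y, z = 1234, 5678
assert eq(add(to_residues(y), to_residues(z)), to_residues(y + z))
assert eq(mul(to_residues(y), to_residues(z)), to_residues(y * z))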
We provide a comprehensive theoretical analysis of the complexity of the
proposed transformation. First, we derive how many prime numbers are needed
to track values in a certain integer range. Second, we analyze the number of bits
required to encode values in the original and rewritten program and show that,
for the unary encoding used by Sketch, our technique offers an exponential
saving in the number of required bits.
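As a rough illustration of this saving (our own back-of-the-envelope count,
not the paper's exact accounting), a unary encoding over an interval covered
by the primes up to 17 needs one indicator variable per value, whereas
tracking residues needs only one indicator per residue per prime:

from math import prod

PRIMES = [2, 3, 5, 7, 11, 13, 17]

full_range_vars = prod(PRIMES)   # 510510 indicators, one per value
residue_vars = sum(PRIMES)       # 58 indicators: one per residue, per prime
print(full_range_vars, residue_vars)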
An extended version containing all proofs and further details has been uploaded
to arXiv as supplementary material.
2 Motivating Example
In this section, we use a simple example to illustrate our technique and its
effectiveness. Consider the Sketch program polyArray presented in Figure 1a.
The goal of this synthesis problem is to synthesize a two-variable quadratic
polynomial (lines 7–8) whose evaluation p on given inputs x and y equals a
given expected-output array z (line 9). Solving the problem amounts to finding
non-negative integer values for the holes (??) and sign values, i.e., -1 or 1,
for the sign holes (??s) such that the assertion becomes true.¹ In this case,
a possible solution is the polynomial:
p[i] = -17*y[i]^2 - 8*x[i]*y[i] - 17*x[i]^2 - 3*x[i];
When attempting to solve this problem, the Sketch synthesizer times out after
300 seconds. To solve the problem, Sketch creates SAT queries whose variables
are the holes. Due to the large numbers involved in the computation of this
program, the unary encoding of Sketch ends up with SAT formulas of
approximately 45 million clauses.
¹ In Sketch, holes can only assume positive values. This is why we need the sign holes,
which are implemented using regular holes as follows: if(??) then 1 else -1.
Fig. 1: Sketch program (a) and rewritten version with values tracked for differ-
ent moduli (b).
Sketch Program with Modular Arithmetic The technique we propose in this paper
aims to reduce the complexity of the synthesis problem by transforming the
program into an equivalent one that manipulates smaller integer values and
thus yields easier SAT queries. Given the Sketch program in Figure 1a, our
technique produces the modified Sketch program pAPrime in Figure 1b. The new
Sketch program has the same control-flow graph as the original one, but
instead of computing the actual values of the expressions x[·] and y[·], it tracks
their remainders for the set of prime numbers {2, 3, 5, 7, 11, 13, 17} using new
variables—e.g., x2[i] tracks the remainder of x[i] modulo 2.
The program pAPrime initializes the modular variables with the corresponding
modular values (lines 5–8). When rewriting a computation over modular
variables, the same computation is performed modularly (lines 12–17). For
example, the term ??s1 * ??1 * y[i]^2, when tracked modulo 2, is rewritten as

(??s1 * (??1 % 2) * ((y2[i] % 2)^2 % 2)) % 2
In the rewritten program, the variables i and n are not tracked modularly,
since such a transformation would make array index accesses incorrect.
Finally, the assertions for different moduli share the same holes, as the
solution to the sketch has to be correct for all modular values. In the rest
of the paper, we develop a data-flow analysis that detects when variables can
be tracked modularly.
Sketch can solve the rewritten program in less than 2 seconds and produces
hole values that are correct solutions for the original program. This speedup
is due to the small integer values manipulated by the modular computations. In
fact, the intermediate SAT formulas generated by Sketch for the program
pAPrime have approximately 120 thousand clauses instead of the 45 million
clauses for polyArray. Due to the complex arithmetic in the formulas, even if
Sketch uses the SMT-like native integer encoding, it still requires more than
300 seconds to solve this problem.
While this technique is quite powerful, it does have some limitations. In
particular, the solution to the rewritten sketch is guaranteed to be a correct
solution only for inputs that cause the intermediate values of the program to
lie in a range [d1, d2] such that d2 − d1 ≤ 2 × 3 × 5 × 7 × 11 × 13 × 17 = 510,510. We
will prove this result in Section 4.
3 Preliminaries
In this section, we describe the IMP language that we will consider through-
out the paper and briefly recall the counter-example guided inductive synthesis
algorithm employed by the Sketch solver.
For simplicity, we consider a simple imperative language IMP with integer
holes for defining the hypothesis space of programs. The syntax and semantics
of IMP are given in the appendix of the extended version. Without loss of
generality, we assume the program consists of a single function
f(v1, · · · , vn, ??1, . . . , ??m) with n integer variables and m integer
holes. The body of f consists of a sequence of statements, where a statement s
can be a variable assignment, a while loop, an if conditional, or an assert
statement. The holes ?? denote unknown integer constant values, and the goal
of the synthesis process is to compute these values such that a set of desired
program assertions is satisfied for every possible input value to f.²
² Our implementation also supports for-loops, recursion, arrays, and complex types.
Consider a sketch over inputs n and h with a single hole ??. The goal of the
synthesizer is to compute the value of ?? such that the assertion is true for
all possible input values of n and h; for this example, ?? = 3 is a valid
solution.
The Sketch solver uses the counter-example guided inductive synthesis
algorithm (Cegis) to find hole values such that the desired assertions hold
for all input values. Formally, the Sketch synthesizer solves the following
constraint:

∃ ?? ∈ Zᵐ. ∀ in ∈ I. ⟦f(in, ??)⟧_IMP ≠ ⊥

where Z denotes the domain of all integer values, ?? denotes the list of
unknown hole values (??1, · · · , ??m) ∈ Zᵐ, I denotes the domain of all input
argument values to the function f, and ⟦f(in, ??)⟧_IMP ≠ ⊥ denotes that the
program satisfies all assertions. The synthesis problem is in general
undecidable for a language with complex operations such as IMP, because of the
infinitely many possible hole and input values. To make the synthesis process
more tractable, Sketch imposes a bound on the sizes of both the input domain
(I_b) and the domain of hole values (Z_b), obtaining the following constraint:

∃ ?? ∈ Z_bᵐ. ∀ in ∈ I_b. ⟦f(in, ??)⟧_IMP ≠ ⊥
The bounded domains make the synthesis problem decidable, but the second-order
quantified formula still yields a huge search space of hole values for any
reasonable bounds. To solve such bounded constraints efficiently, Sketch uses
the Cegis algorithm to incrementally add inputs from the domain until it
obtains hole values ?? that satisfy the assertion predicates for all the input
values in the bounded domain. The algorithm solves the second-order formula by
iteratively solving a series of first-order queries. It first encodes the
existential query (the synthesis query) over a randomly selected input value
in0, finding hole values H that satisfy the predicate for in0 using a SAT
solver in the backend.
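The loop structure can be made concrete in a small self-contained sketch; here
Z3 stands in for Sketch's SAT backend, and the toy specification is an
assumption of ours purely for illustration:

from z3 import Int, Not, Solver, sat, unsat

def cegis(spec, hole, inp, max_iters=100):
    """Find a value for `hole` such that spec(inp, hole) holds for all inp."""
    inputs = [0]                              # seed input set I
    for _ in range(max_iters):
        synth = Solver()                      # synthesis query over finite I
        for i in inputs:
            synth.add(spec(i, hole))
        if synth.check() != sat:
            return None                       # no hole value works even on I
        h = synth.model().eval(hole, model_completion=True)
        verify = Solver()                     # verification query over all inputs
        verify.add(Not(spec(inp, h)))
        if verify.check() == unsat:
            return h                          # candidate is correct everywhere
        inputs.append(verify.model().eval(inp, model_completion=True))
    return None

hole, inp = Int("hole"), Int("inp")
# toy spec: inp + inp + inp == hole * inp must hold for all inp  =>  hole = 3
print(cegis(lambda i, h: i + i + i == h * i, hole, inp))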
Integer Encoding The Sketch solver can efficiently solve the synthesis
constraint in many domains, but it does not scale well for sketches
manipulating large numbers. Sketch uses a unary encoding to represent
integers, where the encoded formula contains a variable for each integer
value. The unary encoding simplifies the representation of complex non-linear
arithmetic operations; for example, a multiplication can be represented simply
as a lookup table. In practice, the unary encoding results in orders of
magnitude faster solving times compared to a logarithmic encoding for many
synthesis problems. However, it also results in huge SAT formulas in the
presence of large integers. Recently, a new SMT-like technique based on
extending the SAT solver with native integer variables and constraints was
proposed to alleviate this issue in Sketch. Similarly to the Boolean
variables, this extended solver guesses integer values and propagates them
through the constraints while also learning from conflict clauses. Note that
Sketch uses these SAT extensions and encodings instead of an SMT solver
because SMT does not scale well for the non-linear constraints typically found
in synthesis problems. Our new technique for handling computations over large
numbers retains the efficient unary encoding of integers and of the
computations over them.
The Chinese Remainder Theorem is a powerful number-theoretic result: given a
set of distinct primes P = {p1, . . . , pk}, any number n in an interval of
size p1 · . . . · pk can be uniquely identified from the remainders
[n mod p1, · · · , n mod pk]. In Section 4.2, we use this idea to define the
semantics of the IMP-MOD language. The main benefit is that the remainders can
be much smaller than the actual program values.
Example 2. For P = [3, 5, 7] and an integer 101, its remainders [2, 1, 3] are much
smaller than 101. However, any number of the form 101 + 105 × n also has
remainders [2, 1, 3] with respect to the same prime set.
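The following small Python check, our own illustration of Example 2, makes the
window of uniqueness explicit: brute-force recovery succeeds within an
interval of size 3 · 5 · 7 = 105, and values that differ by a multiple of 105
are indistinguishable:

from math import prod

primes = [3, 5, 7]

def residues(x):
    return [x % p for p in primes]

def recover(rs, lo):
    """The unique x in [lo, lo + 105) with the given residues (brute force)."""
    n = prod(primes)
    return next(x for x in range(lo, lo + n) if residues(x) == rs)

print(residues(101))           # [2, 1, 3]
print(recover([2, 1, 3], 0))   # 101
print(residues(101 + 105))     # also [2, 1, 3]: ambiguous outside the window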
In general, one cannot uniquely determine an arbitrary integer value from its
remainders for some set P—i.e., the mapping from a number to its remainders is
an abstraction in the sense of abstract interpretation [6]. However, if we are
interested in a limited range of integer values [L, U), one can choose a set
of primes P = {p1, . . . , pk} such that, for values L ≤ x < U, the map
[r1, · · · , rk] ↦ x, where x ≡ ri mod pi, is an injection.
Stmt s := v = a | v^P = a^P | s1; s2
       | while(b) {s} | if(b) s1 else s2 | assert b
Program P := f(v1, · · · , vn, v1^P, · · · , vm^P, ??1, . . . , ??l) {s}

⟦c⟧^P_{σ,σ^P} := c    ⟦v⟧^P_{σ,σ^P} := σ(v)    ⟦a1 op_a a2⟧^P_{σ,σ^P} := ⟦a1⟧^P_{σ,σ^P} op_a ⟦a2⟧^P_{σ,σ^P}
Next, we provide an alternative integer semantics, which applies the IMP integer
semantics to modular expressions and show that, under some assumptions on
the values manipulated by the program, the modular and integer semantics are
equivalent. We will use this result to build our modified synthesis algorithm.
⟦c⟧_{σ1,σ2} := c    ⟦v⟧_{σ1,σ2} := σ1(v)    ⟦a1 op_a a2⟧_{σ1,σ2} := ⟦a1⟧_{σ1,σ2} op_a ⟦a2⟧_{σ1,σ2}
⟦v = a⟧_{σ1,σ2} := (σ1[v ← ⟦a⟧_{σ1,σ2}], σ2)    ⟦v^P = a^P⟧_{σ1,σ2} := (σ1, σ2[v^P ← ⟦a^P⟧_{σ1,σ2}])
Similarly, we show that the two semantics are also equivalent for Boolean
expressions.
We are now ready to show the equivalence between the modular semantics and the
integer semantics for programs P ∈ IMP-MOD. The semantics of a program
P = f(V^Z, V^P, H) {s} is a map from valuations to valuations: given a
valuation σ1 : V^Z → Z for integer variables, a valuation σ2 : V^P → Z for
modular variables and a valuation σ^H : H → Z for holes, we have
⟦P⟧(σ1, σ2, σ^H) = ⟦s⟧_{σ1∪σ^H, σ2} and ⟦P⟧^P(σ1, σ2, σ^H) = ⟦s⟧^P_{σ1∪σ^H, m_P∘σ2}.
Therefore, it is sufficient to show that the two semantics are equivalent for
any statement s.
The two semantics are equivalent for a statement s if, under the same input
valuations, the resulting valuations can be translated to each other.
Formally, given valuations σ1, σ2 and an interval R of size N, we say
⟦s⟧_{σ1,σ2} ≡_P ⟦s⟧^P_{σ1,m_P∘σ2} iff σ1′ = σ1″, m_P ∘ σ2′ = σ2^P and
σ2′ = m_P^{−1,R} ∘ σ2^P, where ⟦s⟧_{σ1,σ2} = (σ1′, σ2′) and
⟦s⟧^P_{σ1,m_P∘σ2} = (σ1″, σ2^P).
R_a(a) =
    v^P                      if a ≡ v and v ∈ V^P
    c^P                      if a ≡ c
    R_a(a1) op_a^P R_a(a2)   if a ≡ a1 op_a a2
    toPrime(a)               otherwise

R_b(b) =
    R_a(a1) == R_a(a2)       if b ≡ a1 == a2
    R_b(b1) and R_b(b2)      if b ≡ b1 and b2
    not R_b(b1)              if b ≡ not b1
    b                        otherwise

R_s(s) =
    R_s(s1); R_s(s2)                  if s ≡ s1; s2
    v = a                             if s ≡ v = a and v ∈ V^Z
    v^P = R_a(a)                      if s ≡ v = a and v ∈ V^P
    if(R_b(b)) R_s(s0) else R_s(s1)   if s ≡ if(b) s0 else s1
    while(R_b(b)) {R_s(s)}            if s ≡ while(b) {s}
    assert R_b(b)                     if s ≡ assert b
Fig. 5: Subset of rules for the translation from IMP to IMP-MOD programs. Rules
are parametric in V^Z, V^P and P, with R_f(f(V, ??){s}) = f(V^Z, V^P, ??){R_s(s)}.
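For concreteness, the following Python sketch renders these rules over a tiny
tuple-encoded AST. The rule names follow the figure; the AST encoding and the
toPrime marker are assumptions of ours:

def R_a(a, VP):
    if a[0] == "var":
        return ("pvar", a[1]) if a[1] in VP else ("toPrime", a)
    if a[0] == "const":
        return ("pconst", a[1])
    if a[0] == "binop":
        _, op, a1, a2 = a
        return ("pbinop", op, R_a(a1, VP), R_a(a2, VP))
    return ("toPrime", a)                     # fall-through case

def R_b(b, VP):
    if b[0] == "eq":
        return ("eq", R_a(b[1], VP), R_a(b[2], VP))
    if b[0] == "and":
        return ("and", R_b(b[1], VP), R_b(b[2], VP))
    if b[0] == "not":
        return ("not", R_b(b[1], VP))
    return b

def R_s(s, VZ, VP):
    if s[0] == "seq":
        return ("seq", R_s(s[1], VZ, VP), R_s(s[2], VZ, VP))
    if s[0] == "assign":
        v, a = s[1], s[2]
        return s if v in VZ else ("passign", v, R_a(a, VP))
    if s[0] == "if":
        return ("if", R_b(s[1], VP), R_s(s[2], VZ, VP), R_s(s[3], VZ, VP))
    if s[0] == "while":
        return ("while", R_b(s[1], VP), R_s(s[2], VZ, VP))
    if s[0] == "assert":
        return ("assert", R_b(s[1], VP))
    raise ValueError(f"unknown statement {s!r}")

# x is tracked modularly (in V^P), i is not (in V^Z):
prog = ("seq",
        ("assign", "x", ("binop", "+", ("var", "x"), ("const", 1))),
        ("assign", "i", ("binop", "+", ("var", "i"), ("const", 1))))
print(R_s(prog, VZ={"i"}, VP={"x"}))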
∃ σ^H ∈ R^H. ∀ (σ1, σ2) ∈ I_R^P. ⟦s⟧_{σ1∪σ^H,σ2} ≠ ⊥.
the choice of prime numbers. In practice, one can use a verifier to check the cor-
rectness of the synthesized solution and add more prime numbers to the modular
synthesizer if needed. In fact, this is the main idea behind the counterexample-
guided inductive synthesis algorithm used by Sketch (Section 3).
The primorial n# captures the size N of the interval covered by the Chinese
Remainder Theorem when using prime numbers up to n. The following
number-theoretic result gives us a closed form for the primorial and shows
that the number N has approximately n bits.

The following number-theoretic result relates the primorial to the Chebyshev
function:

ϑ(n) = log(n#) = log 2^((1+o(1))·n) = (1 + o(1))·n        (3)

Aside from rounding errors, the Chebyshev function captures the number of bits
required to represent the numbers in P_n. To obtain a more precise bound on
this number, we need a bound for the sum Σ_{p∈P_n} log p.
We start by recalling the following fundamental number-theoretic result.

Theorem 4 (Prime number theorem). The set P_n has size approximately n/log n.

Using Theorem 4, we get the following result:

Σ_{p∈P_n} log p ≤ (n / log n) · log n ≈ (1 + o(1))·n        (4)
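As a quick sanity check of these bounds, the helper below (our own, not part
of the paper's tooling) computes the smallest prefix of the primes whose
product covers a given interval size:

def primes_covering(interval_size):
    """Smallest prefix of the primes whose product is >= interval_size."""
    primes, product, candidate = [], 1, 2
    while product < interval_size:
        # trial division by the primes found so far suffices, since we
        # visit the candidates in increasing order
        if all(candidate % p != 0 for p in primes):
            primes.append(candidate)
            product *= candidate
        candidate += 1
    return primes

# seven primes suffice for intervals of size up to 510510 (cf. Section 2)
print(primes_covering(510_510))   # [2, 3, 5, 7, 11, 13, 17]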
8 Evaluation
8.1 Benchmarks
input examples, and coeff ∈ {[−10, 10], [−30, 30], [−50, 50]} denotes the
range of randomly generated coefficients in the polynomial f.
Invariants The second set of benchmarks contains 46 variants of two
invariant-generation problems obtained from a public set of programs that
require polynomial invariants to be verified [8]. We selected the two programs
in which at least one variable could be tracked modularly by our tool (the
other programs involve complex array operations or inequality operators) and
turned the verification problems into synthesis problems by asking Sketch to
find a polynomial equality (over the program variables) that is an invariant
for the loop in the program. To control the magnitudes of the inputs, we only
require the invariants to hold for a fixed set of input examples.
The first problem, mannadiv, iteratively computes the remainder and the
quotient of two numbers given as input. The invariant required to verify
mannadiv is a polynomial equality of degree 2 involving 5 variables. The
Sketch template required to describe the space of all such polynomial
equalities has 32 holes and cannot be handled by any of the Sketch solvers we
consider. We therefore simplify the invariant synthesis problems in two ways.
In the first variant, we reduce the ranges of the hole values in the templates
by considering cbits ∈ {2, 3}. In the second variant, we set cbits ∈ {5, 6, 7},
but reduce the number of missing hole values to 4 (i.e., we provide part of
the invariant). Each benchmark takes two random inputs and we consider the
input ranges {[1, 50], [1, 100]}. In total, we have 10 benchmarks for mannadiv.
The second problem, petter, iteratively computes the sum Σ_{1≤i≤n} i⁵ for a
given input n. The invariant required to verify petter is a polynomial
equality of degree 6 involving 3 variables. The Sketch template required to
describe all such polynomial equalities has 56 holes and cannot be handled by
any of the Sketch solvers we consider. We consider the following simplified
variants of the problem: (i) petter_0 computes Σ_{1≤i≤n} 1 and requires a
polynomial invariant of degree one, (ii) petter_x computes Σ_{1≤i≤n} x for a
given input variable x and requires a polynomial invariant of degree two,
(iii) petter_1 computes Σ_{1≤i≤n} i and requires a polynomial invariant of
degree two, and (iv) petter_10 computes Σ_{1≤i≤n} (i + 1) and requires a
polynomial invariant of degree two. Each benchmark takes two random inputs and
we consider the input ranges {[1, 10], [1, 100], [1, 1000]}. In total, we have
12 variants of petter, each run for values of cbits ∈ {5, 6, 7}—i.e., a total
of 36 benchmarks.
Program Repair The third set of benchmarks contains 54 variants of Sketch
problems from the domain of automatic feedback generation for introductory
programming assignments [7]. Each benchmark corresponds to an incorrect
program submitted by a student, and the goal of the synthesizer is to find a
small variation of the program that behaves correctly on a set of test cases.
We select the 6/11 benchmarks from the tool Qlose [7] for which (i) our
implementation can support all the features in the program, and (ii) our
data-flow analysis identifies at least one variable that can be tracked
modularly. Of the remaining benchmarks, 3/11 do not contain variables that can
be tracked modularly, and 2/11 call auxiliary functions that cannot be
translated into Sketch.
Table 1: Effectiveness of different solvers. SAT (resp. UNSAT) denotes the
number of benchmarks for which the solver could find a solution (resp. prove
that no solution exists), while TO denotes the number of timeouts.

                         Polynomials        Invariants       Program repair
Solver        Solved    SAT UNSAT  TO     SAT UNSAT  TO     SAT UNSAT  TO
Unary         69/181     12     4  65       5     0  41      48     0   6
Binary       127/181     70     6   5      17     0  29      34     0  20
Unary-p      169/181     73     5   3      41     2   3      48     0   6
Unary-p-inc  172/181     73     6   2      41     2   3      50     0   4
For each program, we consider the original problem and two variants where the
integer inputs are multiplied by 10 and 100, respectively. Further, for each
program variant, we impose an assertion specifying that the distance between
the original program and the repaired program is within a certain bound. We
select three different bounds for each program: the minimum cost c, c + 100,
and c + 200. In line with the discussion throughout the paper, these are
benchmarks typically involving complex operations but not overly large
numbers.
We can now answer Q1. First, Unary-p consistently outperforms Unary across all
benchmarks. Second, Unary-p outperforms Binary on hard-to-solve problems and
can solve problems that Binary cannot—e.g., Unary-p solved 28/46 invariant
problems that Sketch could not solve. Unary-p and Binary have similar
performance on easy problems.
Comparison to full SMT encoding For completeness, we also compare our approach
to a tool that uses SMT solvers to model the entire synthesis problem. We
choose the state-of-the-art SMT-based synthesizer Rosette [23] for our
comparison. Rosette is a programming language that encodes verification and
synthesis constraints written in a domain-specific language into SMT formulas
that can be solved using SMT solvers. We only run Rosette on the set of
Polynomials benchmarks because Rosette does support the theories of integers
but does not have native support for loops, so there is no direct way to
encode the Invariants and Program Repair benchmarks. To our knowledge, Rosette
provides a way to specify the number k it uses to model integers and reals as
k-bit words, but the user has no control over how many bits it uses for
unknown holes specifically. We therefore evaluate 27 instead of 81 variants of
the polynomial synthesis problem on Rosette, i.e., we consider different
numbers of cbits.

[Fig. 6: Rosette vs Binary (running times in ms, log scale).]
Figure 6 shows the running times (log scale) for Rosette and Binary with
cbits = 6. Rosette successfully solved 16/27 benchmarks and terminates quickly
(avg. 2.9s) when it can find a solution. However, Rosette times out on 11
benchmarks for which Binary terminates. The timeouts are due to the fact that
Rosette employs full SMT encodings that combine multiple theories, while
Binary uses a SAT solver that is only modified to accommodate SMT-like integer
constraints. Since full SMT encodings are not as general and efficient as the
encodings used in Sketch, we will only evaluate the effectiveness of our
technique by comparison with Binary.

Finally, we tried applying our prime-based technique to Rosette and, as
expected, the technique is not beneficial due to the binary encoding of
numbers in SMT, causing all benchmarks to time out. To summarize, (i) SMT
solvers cannot efficiently handle the synthesis problems considered in this
paper, and (ii) our technique is better suited to SAT solvers than to SMT
solvers.
[Figure: synthesis times (ms, log scale) of Unary-p on the Polynomials, Repair
and Invariants benchmarks for (a) larger sets of primes ([2 3 5], [2 3 5 7],
[2−11], [2−13], [2−17]) and (b) larger primes ([2−17], [11 17 19 23],
[31 41 47], [251 263]).]
In this experiment, we compare the sizes of the intermediate SAT formulas
generated by Unary-p and Unary. Figure 10a shows a scatter plot (log scale) of
the number of clauses of the largest intermediate SAT query generated by the
CEGIS algorithm for the two techniques. We only plot the instances in which
Unary was able to produce at least one SAT formula. Unary produces SAT
formulas that are on average 19.3× larger than those produced by Unary-p. To
answer Q5, as predicted by our theory, Unary-p produces significantly smaller
SAT queries than Unary.
Performance vs Size of SAT Queries We also evaluate the correlation between
synthesis time and the size of SAT queries. Figure 10b plots the synthesis
times of both solvers against the sizes of the SAT queries. It is clear that
the synthesis time increases with larger SAT queries; the plot illustrates how
the solving time strongly depends on the size of the generated formulas.

[Fig. 10: (a) formula size, Unary-p vs Unary; (b) performance vs formula size
(log scale).]
9 Related Work
since the abstraction is used to show that a program does not contain a bug—
i.e., even in the abstract domain, the program behaves correctly. In our
setting, the problem is the opposite: we use the abstraction to simplify the
synthesis problem, and we provide a theory for when the modular and integer
semantics are equivalent.
Pruning Spaces in Program Synthesis Many techniques have been proposed to
prune the large search space of possible programs [14]. Enumerative synthesis
techniques [24, 12, 13, 17] enumerate programs in a search space and avoid
enumerating syntactically and semantically equivalent terms. Some synthesizers such
as Synquid [16] and Morpheus [10] use refinement types and first-order formu-
las over specifications of DSL constructs to refute inconsistent programs. Re-
cently, Wang et al. [25] proposed a technique based on abstraction refinement
for iteratively refining abstractions to construct synthesis problems of increasing
complexity for incremental search over a large space of programs.
Instead of pruning programs in the syntactic space, our technique uses mod-
ular arithmetic to prune the semantic space—i.e., the complexity of verifying the
correctness of the synthesized solution—while maintaining the syntactic space
of programs. Our approach is related to that of Tiwari et al. [22], who present a
technique for component-based synthesis using dual semantics—where syntactic
symbols in a language are provided two different semantics to capture differ-
ent requirements. Our technique is similar in the sense that we also provide an
additional semantics based on modular arithmetic. However, we formalize our
analysis based on number theory results and develop it in the context of general-
purpose Sketch programs that manipulate integer values, unlike Tiwari et al.’s
work that is developed for straight-line programs composed of components.
Synthesis for Large Integer Values Abate et al. propose a modification of the
Cegis algorithm for solving syntax-guided synthesis (SyGuS) problems with
large constants [1]. SyGuS differs from program sketching in how the synthesis
problem is posed and in the type of programs that can be modeled. In particular,
in SyGuS one can only describe programs representing SMT formulas and the
logical specification for the problem can only relate the input and output of the
program—i.e., there cannot be intermediate assertions within the program. The
problem setup and the solving algorithms proposed in this paper are orthogonal
to those of Abate et al. First, we focus on program sketching, which is orthog-
onal to SyGuS as sketching allows for richer and more generic program spaces
as well as richer specifications. While it is true that certain synthesis
problems can be expressed both as sketches and as SyGuS problems, this is not
the case for our benchmark programs, which use loops, arrays and non-linear
integer arithmetic, none of which is supported by SyGuS. Second, our technique is
motivated by how Sketch encodes and solves program sketches through SAT
solving. While the traditional Sketch encoding can explode for large constants,
the same encoding allows Sketch to solve program sketches involving complex
arithmetic and complex programming constructs. The algorithm proposed by
Abate et al. iteratively builds SMT (not SAT) formulas that are required to
be in a decidable logical theory. Such an encoding only works for the restricted
programming models used in SyGuS problems.
References
Modular Relaxed Dependencies in Weak
Memory Concurrency
1 Introduction
It has been a longstanding problem to define the semantics of programming
languages with shared memory concurrency in a way that does not allow un-
wanted behaviours – especially observing thin-air values [8,7] – and that does
not forbid compiler optimisations that are important in practice, as is the case
with Java and Hotspot [30,29]. Recent attempts [16,11,25,15] have abandoned
the style of axiomatic models, which is the de facto paradigm of industrial spec-
ification [8,2,6]. Axiomatic models comprise rules that allow or forbid individual
program executions. While it is impossible to solve all of the problems in an
axiomatic setting [7], abandoning it completely casts aside mature tools for
automatic evaluation [3], automatic test generation [32], and model checking
[23], as well as the hard-won refinements embodied in existing specifications
like C++, where problems have been discovered and fixed [8,7,18]. Furthermore,
the industrial appetite for fundamental change is limited. In this paper we
offer a solution to the thin-air problem that integrates with existing
axiomatic models.

(This work was funded by EPSRC Grants EP/M017176/1, EP/R020566/1 and
EP/S028129/1, the Lloyds Register Foundation, and the Royal Academy of
Engineering.)
The thin-air problem in C++ stems from a failure to account for dependen-
cies [22]: false dependencies are those that optimisation might remove, and real
dependencies must be left in place to forbid unwanted behaviour [7]. A single
execution is not sufficient to discern real and false dependencies. A key insight
from previous work [14,15] is that event structures [33,34] give us a simultane-
ous overview of all traces at once, allowing us to check whether a write is sure
to happen in every branch of execution. Unfortunately, previous work does not
integrate well with axiomatic models, nor lend itself to automatic evaluation.
To address this, we construct a denotational semantics in which the meaning of
an entire program is obtained by combining the meanings of its subcomponents
via a compositional function over the program text. This approach is
particularly amenable to automatic evaluation, reasoning and compiler
certification [19,24], and fits with the prevailing axiomatic approach.
This paper uses this denotational approach to capturing program dependencies
to explore the thin-air problem, resulting in a concrete proposal for fixing
the thin-air problem in the ISO standard for C++.
Contributions. There are two parts to the paper. In the first, we develop a
denotational model called Modular Relaxed Dependencies (MRD) and build
metatheory around it. The model uses a relatively simple account of
synchronisation, but it demonstrates a separation between the calculation of
dependency and the enforcement of synchronisation. In the second, we evaluate
the dependency calculation by combining it with the fully featured axiomatic
models RC11 [18] and IMM [26].
The denotational semantics has the following advantages:
1. It is the first thin-air solution to support fork/join (§2.2).
2. It satisfies the DRF-SC property for a compositional model (§5): programs
without data races behave according to sequential consistency.
3. It comes with a refinement relation that validates program transformations,
including the optimisation that makes Hotspot unsound for Java [30,29], and
a list of others from the Java Causality Tests [27] (§7).
4. It is shown to be equivalent to a global semantics that first performs a
dependency calculation and then applies an axiomatic model.
5. An example in Section 10 illustrates a case in which thin-air values are
observable in the current state-of-the-art models but forbidden in ours.
We adopt the dependency calculation from the global semantics of point 4 as
the basis of our C++ model, which we call MRD-C11. We establish the C++
DRF-SC property described in the standard [13] (§9.1) and we provide several
desirable properties for a solution to the thin-air problem in C++:
5. We show that our dependency calculation is the first that can be applied
to any axiomatic model, and in particular the RC11 and IMM models that
cover C++ concurrency (§8).
6. Our augmented IMM model, which we call MRD+IMM, is provably imple-
mentable over x86, Power, ARMv8, ARMv7 and RISC-V, with the compiler
mappings provided by the IMM [26] (§8.1).
7. These augmented models of C++ are the first that solve the thin-air problem
to have a tool that can automatically evaluate litmus tests (§11).
[Event structure diagram: two conflicting reads of x (values 0 and 1), each
followed in program order by a write of the value read to y.]
Each event has a unique identifier (the number attached to the box). The
straight black arrows represent program order, the curved yellow arrows indicate
a causal dependency between the reads and writes, and the red zigzag represents
a conflict between two events. If two events are in conflict, then their respective
continuations are in conflict too.
If we interpret the program Init; LB1 , as below, we get a program where
the Init event sets the variables to zero.
[Event structure diagram for Init; LB1: the initialisation event followed by
the event structure of LB1.]
r2 := y; x := r2        (LB2)
[Event structure diagram for Init; (LB1 ∥ LB2): the structures of LB1 and LB2
in parallel, prefixed by the initialisation event.]
The program Init; (LB1 ∥ LB2) allows executions of the following three shapes.

[Diagrams of the three execution shapes elided.]
Note that in this example, we are not allowed to read the value 1 – reading a
value that does not appear in the program is one sort of thin-air behaviour,
as described by Batty et al. [7]. For example, the execution {1, 4, 5, 8, 9}
does not satisfy the coherence axiom, as 4 −dp→ 5 −rf→ 8 −dp→ 9 −rf→ 4 forms a
cycle.
r1 := y; x := 1        (LB3)
where the value written to the variable x is a constant. Its generated event
structure is depicted as follows:

[Event structure diagram: conflicting reads of y (values 0 and 1), each
followed by a write x := 1; no dependency edges are drawn.]
In this program, for each branch, we can reach a write of value 1 to location
x. Hence, this will happen no matter which branch is chosen: we say b and d
are independent writes and we draw no dependency edges from their preceding
reads.
Consider now the program (LB3) in parallel with LB1 introduced earlier in this
section. As usual, we interpret the Init program in sequence with
(LB1 ∥ LB3), as follows:

[Event structure diagram for Init; (LB1 ∥ LB3) elided.]
The resulting event structure is very similar to that of (LB1 ∥ LB2), but the
executions permitted in this event structure are different. The dependency
edges calculated when adding the read are preserved, and now the executions
{1, 2, 3, a, b} and {1, a, b, 4, 5} are allowed. However, this event structure
also contains the execution in which d is independent.
In the execution {d −rf→ 4 −dp→ 5 −rf→ c} there is no rf or dp edge between d
and c that can create a cycle, hence this is a valid complete execution in
which we can observe x = 1, y = 1. Note that Init is irrelevant to the
consistency of this execution.

[Execution diagram elided.]

Modularity. It is worthwhile underlining the role that modularity plays here.
In order to compute the behaviour of (LB1 ∥ LB2) and (LB1 ∥ LB3) we did not
have to compute the behaviour of LB1 again. In fact, we computed the semantics
of LB1, LB2 and LB3 in isolation and then observed their behaviour under
parallel composition.
Thin-air values. The program (LB1 ∥ LB3) is a standard example in the weak
memory literature called load buffering. In the program (LB1 ∥ LB2), if event
5 or 9 were allowed in a complete execution, that would be an undesirable
thin-air behaviour: there is no value 1 in the program text, nor does any
operation in the program compute the value 1. The program (LB1 ∥ LB3) is
similar, but now contains a write of value 1 in the program text, so this is
no longer a thin-air value. Note that the execution given for it is not
sequentially consistent, but nonetheless a weak memory model needs to allow it
so that a compiler can, for example, swap the order of the two commands in
LB3, which are completely independent of each other from its perspective.
2 Event Structures
Event structures will form the semantic domain of our denotational semantics
in Section 5. Our presentation follows the essential ideas of Winskel [33] and is
further influenced by the treatment of shared memory by Jeffrey and Riely [15].
2.1 Background
A partial order (E, ≤) is a set E equipped with a reflexive, transitive and
antisymmetric relation ≤. A well-founded partial order is one that has no
infinite descending chains · · · ≤ e_{i+1} ≤ e_i ≤ e_{i−1} ≤ · · · .
A prime event structure is a triple (E, ≤, #). E is a set of events, ≤ is a
well-founded partial order on E, and # is a conflict relation on E. # is
binary, symmetric and irreflexive, and such that, for all c, d, e ∈ E, if
c # d ≤ e then c # e. We write Con(E) for the set of conflict-free subsets of
E, i.e. those subsets C ⊆ E for which there are no c, d ∈ C such that c # d.
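As a small executable illustration (our own encoding, not the paper's), the
following represents a prime event structure with explicit sets and relations,
closes the conflict relation under the inheritance rule c # d ≤ e implies
c # e, and enumerates Con(E):

from itertools import combinations

E = {1, 2, 3, 4}                                  # events
order = {(e, e) for e in E} | {(1, 2), (3, 4)}    # reflexive partial order
conflict = {(1, 3), (3, 1)}                       # symmetric, irreflexive

def close_conflict(conflict, order):
    """Inherit conflict along the order: c # d and d <= e implies c # e."""
    closed, changed = set(conflict), True
    while changed:
        changed = False
        for (c, d) in list(closed):
            for (d2, e) in order:
                if d2 == d and c != e and (c, e) not in closed:
                    closed |= {(c, e), (e, c)}
                    changed = True
    return closed

full = close_conflict(conflict, order)            # adds (1,4), (2,3), (2,4), ...

def conflict_free(C):
    return all((c, d) not in full for c, d in combinations(C, 2))

Con = [set(C) for n in range(len(E) + 1)
       for C in combinations(sorted(E), n) if conflict_free(C)]
print(Con)   # [set(), {1}, {2}, {3}, {4}, {1, 2}, {3, 4}]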
≤₁ ∪ ≤₃ ∪ ⋃_{i∈I} {(e, e′) | e ∈ i ∧ e′ ∈ E₂ⁱ}

The set of events, E, is the set E1 plus all the elements from the copies of
E3. The order, ≤, is constructed by linking every event in the copy E2ⁱ with
all the events in the set i, plus the obvious order from E1 and the order in
the local copy E2ⁱ. Finally, the conflict relation is the union of the
conflicts in E1 and E3.
[…] that w −hb_X→ e −hb_X→ r with v ≠ v′.
A complete execution X is an execution in which every read event r has a write
w that it reads from, i.e. w −rf_X→ r.
r1 := x; r2 := t; if (r1 == 1 ∨ r2 == 1){y := 1}
[Event structure diagram elided: conflicting reads of x (0 and 1), each
followed by conflicting reads of t (0 and 1), with a write y := 1 wherever the
condition holds.]
The rules later in this section will provide us with the justifications
{(6 : R t 1)} ⊢ (9 : W y 1) and {(2 : R x 1)} ⊢ (9 : W y 1) (but not the
independent justification ⊢ (9 : W y 1)). So in this program there are two
minimal justifications of (9 : W y 1). The result of freezing is to duplicate
all partial executions for each choice of write justification. In this case,
we get an execution containing 2 −dp→ 9 […]
Forwarding is forbidden if there exists e in E such that w ≤ e ≤ r, as in the
following example. Here we do not forward 1 to 6. The rules of this section
give us that {1, 3, 6} ⊢ 9: we have preserved program order over the accesses
of x, 1 ≤ 3 ≤ 6, and we do not forward across the intervening read 3.

[Event structure diagram elided.]
C1 ∪ {r} \ LF ⊢ e

with LF being the “Load Forwarded” set of reads, i.e. the set of reads
consecutively following the matching prepended one:

LF = {(r′ : R x v) ∈ C1 | ∄ e′. r ≤X e′ ≤X r′}
where ≤ is built as in the read rule and S contains all coherent executions of
the form

(X ∪ {w}, lk_X, (rf_X ∪ rf_i), dp_X)

where X ∈ S1, and w −rf_i→ r for any set of matching reads r in E1 such that
condition (1.2) of coherence is satisfied. Adding rf_i edges leaves condition
(1.1) satisfied.
The justification relation is the smallest upward-closed relation such that,
for all C ⊢1 e:

1. ⊢ w
2. (C \ SF) ∪ {w} ⊢ e    if there exists e′ ∈ C s.t. w ≤X e′
3. C \ SF ⊢ e            otherwise
with SF being the Store Forwarding set of reads, i.e. the set of reads that we
remove from the justification sets of later events matching the write we are
prepending. This is defined as follows:

SF = {(r′ : R x v) | ∄ e. w ≤X e ≤X r′}
In each case, the event structure is built as the coproduct of the conflicting
events. In (LB3), prior to applying the coproduct we have {a} ⊢ b and
{c} ⊢ d. The writes have the same label for both read values, so, taking C1
and C2 to be empty, the coproduct makes them independent, adding the
independent writes b and d.
In contrast, the values of writes 3 and 5 differ in (LB1), so the coproduct
has {2} ⊢ 3 and {4} ⊢ 5. When ultimately frozen, the justifications of (LB1)
will produce the dependency edges (2, 3) and (4, 5) as described in Section 1.
As for condition (2), if there is an event in the justification set that is ordered
in ≤X with the respective top read, then the top read cannot be erased from the
justification. Doing so would break the ≤X link.
For value sets that contain more than two values, we use Σ_{v∈V} to denote a
simultaneous coproduct (rather than an iterated binary sum). More precisely,
if we coproduct the event structures E0, E1, · · · , En in a pairwise fashion,
as in

(· · · (E0 + E1) + · · · ) + En

we would get liftings that are undesirable. To see this, it suffices to
consider the program

if (r==3){x := 2}{x := 1}

where the write to x of 1 is independent for a coproduct over the values 1 and
2, but not when considering the event structure following (R x 3).
where X ∈ S1 and the lock order lk is such that, for all lock or unlock events
l′ ∈ X, l −lk→ l′. Finally, ≤L is ≤L1 extended with the lock ordered before
all events in E1.
The semantics for the unlock is similar.
where ≤ is built as in the read rule and S contains all executions of the form […]
M := n | r        B := M = M | B ∧ B | B ∨ B | ¬B
P ::= skip | r := x | x := M | P1; P2 | P1 ∥ P2 | if (B){P1}{P2}
    | while(B){P} | L | U
5.1 Compositionality
We define the language of contexts inductively in the standard way.
Definition 3 (Context).
C ::= [−] | P; C | C; P | (C ∥ P) | (P ∥ C)
    | if (B){C}{P} | if (B){P}{C} | while(B){C}
In the base case, the context is a hole, denoted by [−]. The inductive cases
follow the structure of the program syntax. In particular, a context can be a
program P in sequence with a context, a context in sequence with a program P,
and so on. For a context C, we write C[P] for the result of the inductively
defined substitution of the program P into every hole of C.
5
Jeffrey and Riely [15] adopt the same restriction. We conjecture that modelling
blocking locks [4] would not affect the DRF-SC property.
⟦P⟧¹ ρ κ = ∅
⟦skip⟧ⁿ ρ κ = κ(ρ)
⟦r := x⟧ⁿ ρ κ = Σ_{v∈V} (R x v • κ(ρ[r → v]))
⟦x := M⟧ⁿ ρ κ = (W x ⟦M⟧ρ) • κ(ρ)
⟦P1; P2⟧ⁿ ρ κ = ⟦P1⟧ⁿ ρ (λρ′. ⟦P2⟧ⁿ ρ′ κ)
⟦L⟧ⁿ ρ κ = (L • E1, ⊢1)    where (E1, ⊢1) = κ(ρ)
⟦U⟧ⁿ ρ κ = (U • E1, ⊢1)    where (E1, ⊢1) = κ(ρ)
Lemma 1 (Compositionality). For all programs P1 , P2 , if P1 = P2 then
for all contexts C, C[P1 ] = C[P2 ].
The proof is a straightforward induction on the context C and it follows from the
fact that semantics is inductively defined on the program syntax. The attentive
reader may note that to prove P1 = P2 in the first place we have to assume n,
ρ and κ and prove P1 n ρ κ = P2 n ρ κ . It is customary however in denotational
semantics to have programs denoted by functions that are equal if they are equal
at all inputs [31].
Data race freedom ensures that we forbid optimisations which could lead to
unexpected behaviour even in the absence of data races. We first define the
closed semantics for a program P: for all n, the semantics of P, namely ⟦P⟧,
is ⟦Init(P)⟧ⁿ (λx.0) ∅, where Init(P) is the program that takes the global
variables in P and initialises them to 0. We then establish that race-free
programs interpreted in the closed semantics have sequentially consistent
behaviour.
DRF semantics. Rather than proving DRF-SC directly, we prove that race-free
programs behave according to an intermediate semantics ⦅·⦆. This semantics
differs from ⟦·⟧ in only two ways: program order is used in the calculation of
coherence instead of preserved program order, and no dependency edges are
recorded (as these are subsumed by program order). More precisely, the
semantics is calculated as in Figure 1, but we check that (rfe ∪ lk ∪ po) is
acyclic.
Note that race-free executions of the intermediate semantics ⦅·⦆ satisfy the
constraints of the model of Boehm and Adve [10], and the definition of race is
the same between the two models. Boehm and Adve prove that, in the absence of
races, their model provides sequential consistency.
The DRF-SC theorem is stated as follows.
Theorem 1. For any program P, if P is data race free then every execution D
in ⟦P⟧ is a sequentially consistent execution, i.e. D is in ⦅P⦆.
6.1 LB+ctrl-double
In the first example, from Batty et al. [7], the compiler collapses
conditionals to transform P1 into P2:

P1:  r1 := x; if (r1 == 1){ y := 1 } else { y := 1 }
P2:  r1 := x; y := 1

[Shared event structure diagram elided: conflicting reads of x, each followed
by an independent write y := 1.]

The coproduct ensures that the denotations of P1 and P2 are identical, namely
the event structure above together with the independent justifications ⊢ b and
⊢ d. From compositionality (Lemma 1) and equality of the denotations, we have
equal behaviour of P1 and P2 in any context, and the optimisation is allowed.
As noted by Jeffrey and Riely [15], the failure of this test “indicates a
failure to validate the reordering of independent reads”.

[Event structure diagram elided.]
In the definition of prepending writes (equation (3), condition (2)) we state
that, for any given justification, if there is an event in the justification
set that is related via ≤X to the write we are prepending, then that write
must be in the justification set as well.
To see why we made this choice, consider the following program,
Thread 1:
    x := 1;
    r1 := y;
    if (r1 == 0){
        r3 := z;
        x := 0;
        if (z == 1){y := 1}
    } else {
        r3 := x; if (r3 == 1){z := 1}
    }
Thread 2 (in parallel):
    r2 := x; if (r2 == 1){z := 1}
and its associated event structure:

[Event structure diagram elided.]

[…] 2 and 5.
This execution is not sequentially consistent, but under SC, the program is
race free. Without writes in justifications, the model would violate the DRF-SC
property described in Section 5.2.
T3:  r2 := y; if (r2 == 1){ r3 := y; x := r3 } else { x := 1 }
T2:  r2 := y; x := 1
T1:  x := 1; r2 := y

The optimisation proceeds T3 −→ T2 −→ T1.
The optimisation removes the apparently redundant pair of reads (4, 6), then
reorders the now-independent write. This redundancy is reflected in
justification: when prepending the top read of y to the right-hand side of the
event structure, the existing justification 6 ⊢ 7 is replaced by 3 ⊢ 7. When
the coproduct is applied, this matches with the justification 1 ⊢ 2, leading
to the independent writes 2 and 7. In a weak memory context, however, a
parallel thread could write a value to y between the two reads, thereby
changing the value written to x. For this reason, we keep event 4 in the
denotation and create the dependency edge 4 −dp→ 5.
7 Refinement
Note that the refinement relation is defined over a tweaked version of the
semantics, ⟦·⟧_T, a variant of ⟦·⟧ in which the registers are explicit in the
event structure.
Finally, we show that the refinement relation is compositional.
In this section we show that our calculation of relaxed dependencies can
easily be reused to solve the thin-air problem in other state-of-the-art
axiomatic models, carrying the advantages of these models over to ours. In
particular, we augment the IMM and RC11 models of Podkopaev et al. [26]. We
adopt their language, given below. It covers C++ atomics, fences,
fetch-and-add and compare-and-swap operations, but excludes locks. Note that
locks are implementable using compare-and-swap operations.

M := n | r
B := M = M | B ∧ B | B ∨ B | ¬B
P ::= T1 ∥ · · · ∥ Tn
T ::= skip | r :=oR x | x :=oW M | T1; T2
    | if (B){P1}{P2} | while(B){P}
    | fence_oF | r := FADD^oRMW_{oR,oW}(x, M)
    | CAS^oRMW_{oR,oW}(x, M, M)
oR ::= rlx | acq        oW ::= rlx | rel
oF ::= acq | rel | acqrel | sc        oRMW ::= normal | strong
First we provide a model, written (for a program P) as ⟦P⟧_MRD+IMM, that
combines our relaxed dependencies with the axiomatic model of IMM, here
written ⟦P⟧_IMM. We will make these definitions precise shortly. We then show
that ⟦P⟧_MRD+IMM is weaker than ⟦P⟧_IMM, making ⟦P⟧_MRD+IMM implementable over
hardware architectures like x86-TSO, ARMv7, ARMv8 and Power. Secondly, we
relax the RC11 axiomatic model by using our relaxed dependency model MRD to
create a new model ⟦P⟧_MRD-C11, and show this model weaker than RC11.
Modular Relaxed Dependencies in Weak Memory Concurrency 619
8.1 Implementability
We can now state and prove that the MRD model is implementable over IMM,
which gives us that MRD is implementable over x86-TSO, ARMv7, ARMv8,
Power and RISC-V by combining our result with the implementability result of
IMM .
We refer to the RC11 [18] model, as specified in Podkopaev et al. [26]. We call this
model ⟦P⟧RC11. While ⟦P⟧RC11 forbids thin-air executions, it is not weak enough:
it forbids common compiler optimisations by imposing that (po ∪ rf) is acyclic.
We relax this condition by replacing po with our relaxed dependency
relation dp, this time calculated on our preserved program order relation (≤).
We call this model ⟦P⟧MRD-C11. Mathematically, this is done by imposing that
(dp ∪ rf) is acyclic, as illustrated by the sketch below.
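As a toy illustration (not our tooling), the acyclicity check can be rendered in C as a depth-first search over the union of the dp and rf edge lists; the event numbering below is hypothetical and encodes the archetypal thin-air shape:

    #include <stdbool.h>
    #include <stdio.h>

    #define NEVENTS 4
    typedef struct { int from, to; } edge_t;

    // DFS for a cycle in the union of the dp and rf edge lists.
    static bool dfs(int v, const edge_t *es, int n, int *state) {
        state[v] = 1;                                       // on the DFS path
        for (int i = 0; i < n; i++) {
            if (es[i].from != v) continue;
            if (state[es[i].to] == 1) return true;          // back edge: cycle
            if (state[es[i].to] == 0 && dfs(es[i].to, es, n, state)) return true;
        }
        state[v] = 2;                                       // fully explored
        return false;
    }

    int main(void) {
        edge_t edges[] = { {0, 1}, {2, 3},   // dp: read -> dependent write
                           {1, 2}, {3, 0} }; // rf: write -> read on the other thread
        int n = sizeof edges / sizeof edges[0];
        int state[NEVENTS] = {0};
        bool cyclic = false;
        for (int v = 0; v < NEVENTS && !cyclic; v++)
            if (state[v] == 0) cyclic = dfs(v, edges, n, state);
        printf("dp ∪ rf %s\n", cyclic ? "cyclic: execution forbidden"
                                      : "acyclic: execution allowed");
        return 0;
    }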
At this point, we prove the following lemma:
Lemma 3 (Implementability of MRD-C11). For all programs P,
⟦P⟧MRD-C11 ⊇ ⟦P⟧RC11
To show this, it suffices to show that dp ⊆ po always holds. This is straight-
forward by induction on the structure of P, observing that the only place where
dependencies go against program order is when hoisting a write in the coproduct
case. However, in the same construction we always preserve the dependencies
coming from the different branches of the structure, which are, by inductive
hypothesis, always in agreement with program order.
Sketch proof. In the absence of races and relaxed atomics, the no-thin-air guar-
antee of RC11 is made redundant by the guarantee of happens-before acyclicity
shared by RC11 and MRD-C11. The result follows from this observation, Lemma 3,
and Theorem 4 from Lahav et al. [18].
The program is an adaptation6 of a Java test, where the unwanted outcome
represents a violation of type safety [20]. Observing the thin-air behaviour
where a = 1 in the adaptation above is the analogue of the unwanted outcome in
the original test. If in the end a = 1, then the second branch of the conditional
in the rightmost thread must execute. It contains a read of 1 from y, and a
dependent write of x := 1. On the middle thread there is a read of 1 from x, and
a dependent write of y := 1. These dependencies form the archetypal thin-air
shape in the execution where a = 1. MRD correctly identifies these dependencies
and the outcome is prohibited due to its cycle in reads-from and dependency.
The a = 1 outcome is allowed in the Promising Semantics: a promise can be
validated against the write of x := 1 in the true branch of the righthand thread,
and later switched to a validation with x := r0 from the false branch, ignoring
the dependency on the read of y.
In the previous example, Coh-CYC, a stepwise global coherence check caused
weakestmo to forbid the unwanted behaviour allowed by Promising, but that
machinery does not apply here. weakestmo allows the unwanted outcome, and
we conjecture that this deficiency stems from the structure of the model. De-
pendencies are not represented as a relation at the level of the global axiomatic
constraint, so one cannot check that they are consistent with the dynamic exe-
cution of memory, as represented by the other relations. Adopting a coherence
check in the stepwise generation of the event structure mitigates this concern for
Coh-CYC, but not for the test above.
In contrast, MRD does represent dependencies as a relation, allowing us to
check consistency with the rf relation here. The axiom that requires acyclicity
of (dp ∪ rf) forbids the unwanted outcome, as desired.
MRD-C11 is the first weak memory model that solves the thin-air problem for C++
atomics and has a tool for automatically evaluating litmus tests. Our tool, MRDer,
evaluates litmus tests under the base model, RC11 augmented with MRD, and
IMM augmented with MRD. It has been used to check the result of every litmus
test in this paper, together with many tests from the literature, including the
Java Causality Test cases [7,11,15,16,18,25,26,27].
When evaluating whether a particular execution is allowed for a given test, a
model that solves the thin-air problem must take other executions of the program
into account. For example, the semantics of Pichon-Pharabod et al., having
explored one execution path, may ultimately backtrack [25]. Jeffrey and Riely
phrase their semantics as a two player game where at each turn, the player
explores all forward executions of the program [15]. At each operational step, the
Promising Semantics [16] has to run forwards in a limited local way to validate
6
James Riely, Alan Jeffrey and Radha Jagadeesan provided the precise example pre-
sented here [28]. It is based on Fig. 8 of Lochbihler [20], and its problematic execution
under Promising was confirmed with the authors of Promising.
12 Discussion
Four recent papers have presented models that forbid thin-air values and permit
previously challenging compiler optimisations. The key insight from these papers
is that it is necessary to consider multiple program executions simultaneously.
To do this, three of the four [15,25,11] use event structures, while the Promising
Semantics [16] is a small-step operational semantics that explores future traces
in order to take a step.
Although the Promising Semantics [16] is quite different from MRD, its mech-
anism for promising focuses on future writes, and MRD has parallels in its cal-
culation of independent writes. Note also that both Promising’s certification
mechanism and MRD’s lifting are thread-local.
The previous event-structure-based models are superficially similar to MRD,
but all have a fundamentally different approach from ours: Pichon-Pharabod and
Sewell [25] use event structures as the state of a rewriting system; Jeffrey and
Riely [14,15] build whole-program event structures and then use a global mech-
anism to determine which executions are allowed; and Chakraborty et al. [11]
transform an event structure using an operational semantics. In contrast, we fol-
low a more traditional approach [33] where our event structures are used as the
co-domain of a denotational semantics. Further, Jeffrey and Riely [14,15] and
Pichon-Pharabod and Sewell [25] do not cover a significant subset of C++
relaxed concurrency primitives.
MRD does not suffer from known problems with existing models. As noted
by Kang et al. [16], the Pichon-Pharabod and Sewell model produces behaviour
incompatible with the ARM architecture. The Jeffrey and Riely model forbids
the reordering of independent reads, as demonstrated by Java Causality Test 7
(see Section 6.2). The Promising semantics allows the cyclic coherence ordering
of the problematic Coh-CYC example [11]. weakestmo allows the thin-air out-
come in the Java-inspired test of Section 10. In all four cases MRD provides the
correct behaviour.
MRD is also highly compatible with the existing C++ standard text. The
dp relation generated by MRD can be used directly in the axiomatic model to
forbid thin-air behaviour. We are working on standards text with the ISO C++
committee based on this work, and have a current working paper with them [5].
The notion in C++ that data-race free programs should not exhibit observ-
able weak behaviours goes back to Adve and Hill [1], and formed the basis of
the original proposal for C++ [10]. This was formalised by Batty et al. [8] and
adopted into the ISO standard. Despite the pervasiveness of DRF-SC theorems
for weak memory models, these have remained whole-program theorems that
do not support breaking a program into separate DRF and racy components.
Our DRF theorem for our denotational model demonstrates a limited form of
modularity that merits further exploration.
Other denotational approaches to relaxed concurrency have not tackled the
thin-air problem. Dodds et al. [12] build a denotational model based on an
axiomatic model similar to C++. It forms the basis of a sound refinement relation
and is used to validate data-structures and optimisations. Their context language
is too restrictive to support a compositional semantics, and their compromise
to disallow thin-air executions forbids important optimisations. Kavanagh and
Brookes [17] provide a denotational account of TSO concurrency, but their model
is based on pomsets and suffers from the same limitation as axiomatic models [7]:
it cannot be made to recognise false dependencies.
13 Conclusions
We have used the relatively recent insight that to avoid thin-air problems, a
semantics should consider some information about what might happen in other
program executions. We codify that into a modular notion of justification,
leading to a semantic notion of independent writes, and finally of dependency
(dp). We demonstrate the effectiveness of these concepts in three ways. One,
we define a denotational semantics for a weak memory model, show it supports
DRF-SC, and build a compositional refinement relation strong enough to verify
difficult optimisations. Two, we show how to use dp with other axiomatic models,
supporting the first optimal implementability proof for a thin-air solution via
IMM , and showing how to repair the ISO C++ model. Three, we build a tool
for executing litmus tests allowing us to check a large number of examples.
References
1. Adve, S.V., Hill, M.D.: Weak ordering — a new definition. In: ISCA (1990)
2. Alglave, J., Maranget, L., McKenney, P.E., Parri, A., Stern, A.: Frightening small
children and disconcerting grown-ups: Concurrency in the linux kernel. In: ASP-
LOS (2018)
3. Alglave, J., Maranget, L., Tautschnig, M.: Herding cats: modelling, simulation,
testing, and data-mining for weak memory. In: PLDI (2014)
4. Batty, M.: The C11 and C++11 Concurrency Model. Ph.D. thesis, University of
Cambridge, UK (2015)
5. Batty, M., Cooksey, S., Owens, S., Paradis, A., Paviotti, M., Wright, D.: Modular
Relaxed Dependencies: A new approach to the Out-Of-Thin-Air Problem (2019),
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1780r0.html
6. Batty, M., Donaldson, A.F., Wickerson, J.: Overhauling SC atomics in C11 and
OpenCL. In: POPL (2016)
7. Batty, M., Memarian, K., Nienhuis, K., Pichon-Pharabod, J., Sewell, P.: The prob-
lem of programming language concurrency semantics. In: ESOP (2015)
8. Batty, M., Owens, S., Sarkar, S., Sewell, P., Weber, T.: Mathematizing C++ con-
currency. In: POPL (2011)
9. Benton, N., Hur, C.: Step-indexing: The good, the bad and the ugly. In: Modelling,
Controlling and Reasoning About State, 29.08. - 03.09.2010 (2010)
10. Boehm, H.J., Adve, S.V.: Foundations of the C++ concurrency model. In: PLDI
(2008)
11. Chakraborty, S., Vafeiadis, V.: Grounding thin-air reads with event structures. In:
POPL (2019)
12. Dodds, M., Batty, M., Gotsman, A.: Compositional verification of compiler opti-
misations on relaxed memory. In: ESOP (2018)
13. ISO/IEC JTC 1/SC 22 Programming languages, their environments and system
software interfaces: ISO/IEC 14882:2017 Programming languages — C++ (2017)
14. Jeffrey, A., Riely, J.: On thin air reads towards an event structures model of relaxed
memory. In: LICS (2016)
15. Jeffrey, A., Riely, J.: On thin air reads: Towards an event structures model of
relaxed memory. Logical Methods in Computer Science 15(1) (2019)
16. Kang, J., Hur, C.K., Lahav, O., Vafeiadis, V., Dreyer, D.: A promising semantics
for relaxed-memory concurrency. In: POPL (2017)
17. Kavanagh, R., Brookes, S.: A denotational semantics for SPARC TSO. MFPS
(2018)
18. Lahav, O., Vafeiadis, V., Kang, J., Hur, C., Dreyer, D.: Repairing sequential con-
sistency in C/C++11. In: PLDI (2017)
19. Leroy, X., Grall, H.: Coinductive big-step operational semantics. Inf. Comput.
(2009)
20. Lochbihler, A.: Making the Java memory model safe. ACM Trans. Program. Lang.
Syst. (2013)
21. Manson, J., Pugh, W., Adve, S.V.: The Java Memory Model. In: POPL (2005)
22. McKenney, P.E., Jeffrey, A., Sezgin, A., Tye, T.: Out-of-Thin-Air Execution is
Vacuous (2016), http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0422r0.html
23. Kokologiannakis, M., Raad, A., Vafeiadis, V.: Model checking for weakly consis-
tent libraries. In: PLDI (2019)
24. Owens, S., Myreen, M.O., Kumar, R., Tan, Y.K.: Functional big-step semantics. In:
Programming Languages and Systems - 25th European Symposium on Program-
ming, ESOP 2016, Held as Part of the European Joint Conferences on Theory and
Practice of Software, ETAPS 2016, Eindhoven, The Netherlands, April 2-8, 2016,
Proceedings (2016)
25. Pichon-Pharabod, J., Sewell, P.: A concurrency semantics for relaxed atomics that
permits optimisation and avoids thin-air executions. In: POPL (2016)
26. Podkopaev, A., Lahav, O., Vafeiadis, V.: Bridging the gap between programming
languages and hardware weak memory models. PACMPL (POPL) (2019)
27. Pugh, W.: Java causality tests. http://www.cs.umd.edu/~pugh/java/memoryModel/CausalityTestCases.html (2004), accessed 2018-11-17
28. Riely, J., Jagadeesan, R., Jeffrey, A.: private correspondence (2020)
29. Ševčı́k, J.: Program transformations in weak memory models. Ph.D. thesis, Uni-
versity of Edinburgh, UK (2009)
30. Ševčı́k, J., Aspinall, D.: On validity of program transformations in the Java memory
model. In: ECOOP (2008)
31. Streicher, T.: Domain-theoretic foundations of functional programming (2006)
32. Wickerson, J., Batty, M., Sorensen, T., Constantinides, G.A.: Automatically com-
paring memory consistency models. In: POPL (2017)
33. Winskel, G.: Event structures. In: Petri Nets: Central Models and Their Properties,
Advances in Petri Nets 1986, Part II, Proceedings of an Advanced Course, Bad
Honnef, 8.-19. September 1986 (1986)
34. Winskel, G.: An introduction to event structures (1989)
ARMv8-A system semantics: instruction fetch in
relaxed architectures
1 Introduction
Computing relies on the architectural abstraction: the specification of an en-
velope of allowed hardware behaviour that hardware implementations should
lie within, and that software should assume. These interfaces, defined by hard-
ware vendors and relatively stable over time, notionally decouple hardware and
software development; they are also, in principle, the foundation for software ver-
ification. In practice, however, industrial architectures have accumulated great
complexity and subtlety: the ARMv8-A and Intel architecture reference manuals
are now 7476 and 4922 pages [9,26], and hardware optimisations, including out-
of-order and speculative execution, result in surprising and poorly-understood
programmer-observable behaviour. Architecture specifications have historically
also been entirely informal, describing these complex envelopes of allowed be-
haviour solely in prose and pseudocode. This is problematic in many ways: such
specifications do not serve as clear documentation, with the inevitable ambiguity and incompleteness
of informal prose leaving major questions unanswered; without a specification
that is executable as a test oracle (that can decide whether some observed be-
haviour is allowed or not), hardware validation relies on test suites that must be
manually curated; without an architecturally-complete emulator (that can ex-
hibit all allowed behaviour), it is very hard for software developers to “program to
the specification” – they rely on test-and-debug development, and can only test
above the hardware implementation(s) they have; and without a mathematically
rigorous semantics, formal verification of hardware or software is impossible.
Over the last 10 years, much has been done to put architecture specifications
on a more rigorous footing, so that a single specification can serve all those
purposes. There are three main problems, two of which are now largely solved.
The first is the instruction-set architecture (ISA): the specification of the
sequential behaviour of individual instructions. This is chiefly a problem of scale:
modern industrial architectures such as Arm or x86 have large instruction sets,
and each instruction involves many details, including its behaviour at different
privilege levels, virtual-to-physical address translation, and so on – a single Arm
instruction might involve hundreds of auxiliary functions. Recent work by Reid
et al. within Arm [40,41,42] transitioned their internal ISA description into a
mechanised form, used both for documentation and testing, and with him we
automatically translated this into publicly available Sail definitions and thence
into theorem-prover definitions [11,10]. Other related work is in §7.
The second is the relaxed-memory concurrent behaviour of “user-mode” op-
erations: memory writes and reads, and the mechanisms that architectures pro-
vide to enforce ordering and atomicity (dependencies, memory barriers, load-
linked/store-conditional operations, etc.). In 2008, for ARMv7, IBM POWER,
and x86, this was poorly understood, and the architects regarded even their own
prose specifications as inscrutable. Now, following extensive work by many peo-
ple [36,37,19,18,22,8,31,45,7,46,48,35,6,2,47,13,1], ARMv8-A has a well-defined
and simplified model as part of its specification [9, B2.3], including a prose
transcription of a mathematical model [15], and an equivalence proof between
operational and axiomatic presentations [36,37]; RISC-V has adopted a similar
model [52]; and IBM POWER and x86 have well-established de-facto-standard
models. All of these are experimentally validated against hardware, and sup-
ported by tools for exhaustively running tests [17,4]. The combination of these
models and the ISA semantics above is enough to let one reason about or model-
check concurrent algorithms.
That leaves the third part of the problem: the “system” semantics, of
instruction-fetch and cache maintenance, exceptions and interrupts, and ad-
dress translation and TLB (translation lookaside buffer) maintenance. Just as
for “user-mode” relaxed memory, these are all areas where microarchitectural op-
timisations can have surprising programmer-visible effects, especially in the con-
current context. The mechanisms are relied on by all code, but they are explicitly
managed only by systems code, in just-in-time (JIT) compilers, dynamic loaders,
operating-system (OS) kernels, and hypervisors. This is, of course, exactly the
security-critical computing base, currently trusted but not trustworthy, that is
especially in need of verification – which requires a precise and well-validated
definition of the architectural abstraction. Previous work has scarcely touched
on this: none of seL4 [27], CertiKOS [24,23], Komodo [16], or [25,12], address
realistic architecture concurrency, and they use (at best) idealised models of the
sequential systems architecture. The CakeML [51,28] and CompCert [29] verified
compilers target only sequential user-mode ISA fragments.
In this paper we focus on one aspect of system semantics: instruction fetch
and cache maintenance, for ARMv8-A. The ability to execute code that has
previously been written to data memory is fundamental to computing: fine-
grained self-modifying code is now rare, and (rightly) deprecated, but program
loading, dynamic linking, JIT compilation, debugging, and OS configuration all
rely on executing code from data writes. However, because these are relatively
infrequent operations, hardware designers have been able to optimise by partially
separating the instruction and data paths, e.g. with distinct instruction caching,
which by default may not be coherent with data accesses. This can introduce
programmer-visible behaviour analogous to that of user-mode relaxed-memory
concurrency, and require specific additional synchronisation to correctly pick up
code modifications. Exactly what these are is not entirely clear in the current
ARMv8-A architecture text, just as pre-2018 user-mode concurrency was not.
Our main contribution is to clarify this situation, developing precise abstrac-
tions that bring the instruction-fetch part of ARMv8-A system behaviour into
the domain of rigorous semantics. Arm have stated [private communication]
that they intend to incorporate a version of this into their architecture. We aim
thereby to enable future work on system software verification using the tech-
niques of programming languages research: program analysis, model-checking,
program logics, etc. We begin (§2) by recalling the informal architectural guar-
antees that Arm provide, and the ways in which real-world software systems
such as Linux, JavaScript, and WebAssembly change instruction memory. Then:
(1) We explore the fundamental phenomena and architecture de-
sign questions with a series of examples (§3). We explore the interactions
between instruction fetching, cache maintenance and the ‘usual’ relaxed mem-
ory stores and loads, showing that instruction fetches are more relaxed, and
how even fundamental coherence guarantees for data memory do not apply to
instruction fetches. Most of these questions arose during the development of our
models, in detailed ongoing discussion with the Arm Chief Architect and other
Arm staff. They include questions of several different kinds. Six are clear from
the Arm prose specification. Of the others: two are not implied by the prose but
are natural choices; five involved substantive new choices by Arm that had not
previously been considered and/or documented; for two, either choice could be
reasonable, and Arm chose the simpler (and weaker) option; and for one, Arm
were independently already strengthening the architecture to accommodate ex-
isting software.
(2) We give an operational semantics for Arm instruction fetch
and icache maintenance (§4). This is in an abstract-microarchitectural style
that supports an operational intuition for how hardware actually works, while
abstracting from the mass of detail and the microarchitectural variation of actual
hardware implementations. We do so by extending the Flat model [37] with
simple abstractions of instruction caches and the coherent data cache network,
in a way that captures the architectural intent, defining the entire envelope of
behaviours that implementations should be allowed to exhibit.
(3) We give a more concise presentation of the model in an ax-
iomatic style (§5), extending the “user-mode” axiomatic model from previous
work [37,36,15,9], and intended to be functionally equivalent. We discuss how
this too matches the architectural intent.
(4) We validate all this in two ways: by the extensive discussion with
Arm staff mentioned above, and by experimental testing of hardware behaviour,
on a selection of ARMv8-A cores designed by multiple vendors (§6). We run
tests on hardware with a mild extension of the Litmus tool [5,7]. We make the
operational model executable as a test oracle by integrating it into the RMEM
tool and its web interface [17], introducing optimisations that make it possible
to exhaustively execute the examples. We make the axiomatic model executable
as a test oracle with a new tool that takes litmus tests and uses a Sail [11]
definition of a fragment of the ARMv8-A ISA to generate SMT problems for the
model. We then compare hardware and the two models for the handwritten tests
(modulo two tests not supported by the axiomatic checker), compare hardware
and the operational model on a suite of 1456 tests, automatically generated
with an extension of the diy tool [3], and check the operational and axiomatic
models against sets of previous non-ifetch tests. In all this data our models are
equivalent to each other and consistent with hardware observations, except for
one case where our testing uncovered a hardware bug on a Qualcomm device.
Finally, we discuss other related work (§7) and conclude (§8). We do all this
for ARMv8-A, but other relaxed architectures, e.g. IBM POWER and RISC-V,
face similar issues; our tests and tooling should enable corresponding work there.
The models are too large to include or explain in full here, so we focus
on explaining the motivating examples, the main intuition and style of the
operational model, in a prose rendering of its executable mathematics, and
the definition of the axiomatic model. Appendices provide additional exam-
ples, a complete prose description of the operational model, and additional ex-
planation of the axiomatic model. The complete executable mathematics ver-
sion, the web-interface tool for running it, and our test results are at https:
//www.cl.cam.ac.uk/~pes20/iflat/.
Caveats and Limitations Our executable models are integrated with a substan-
tial fragment of the Sail ARMv8-A ISA (similar to that used for CakeML), but
not yet with the full ISA model [11,40,41,42]; this is just a matter of additional
engineering. We only handle the 64-bit AArch64 part of ARMv8-A, not AArch32.
We do not handle the interaction between instruction fetch and mixed-size ac-
cesses, or other variants of the cache maintenance instructions, e.g. those used for
interaction with DMA engines, and variants by set or way instead of by virtual
address. Finally, the equivalence between our operational and axiomatic models
is validated experimentally. A proof of this equivalence is essential in the long
term, but would be a major work in itself: the complexity makes mechanisation
essential, but the operational model (in all its scale and complexity) has not yet
been subject to mechanised proof. Without instruction fetch, a non-mechanised
proof was the main result of an entire PhD thesis [36], and we expect the addition
of instruction fetch to require global changes to the argument.
At first sight, this synchronisation sequence may be entirely mysterious. The
remainder of the paper establishes precise semantics for each instruction,
explaining why each is required, but as a rough intuition (a C rendering of the
full sequence is sketched after the list):
1. The DC CVAU,Xn cleans this core’s data cache for address Xn, pushing the new
write far enough down the hierarchy for an instruction fetch that misses in
the instruction cache to be guaranteed to see the new value. This point is the
Point of Unification (PoU) and is usually the point where the instruction
and data caches become unified (L2 for most modern devices).
2. The DSB ISH waits for the clean to have happened before letting the later
instructions execute (without this, the sequence itself can execute out-of-
order, and the clean might not have pushed the write down far enough before
the instruction cache is updated). The ISH makes this specific to the Inner
Shareable Domain: the processor itself, not the system-on-chip. We do not
model shareability domains in this paper, so this is equivalent to a DSB SY.
3. The IC IVAU,Xn invalidates any entry for that address in the instruction
caches for all cores, forcing any future fetch to miss in the instruction cache,
and instead read the new value from the data memory hierarchy; it also
touches some fetch queue machinery.
4. The second DSB ISH ensures the invalidation completes.
5. The final ISB flushes this core’s pipeline, forcing a re-fetch of all program-
order-later instructions.
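For concreteness, here is a minimal C sketch of the sequence, assuming GCC/Clang extended inline assembly on AArch64 and a modified location confined to a single cache line (the function name is ours):

    #include <stdint.h>

    // Make a modified code location at `addr` visible to instruction fetch,
    // following the DC/DSB/IC/DSB/ISB sequence described above.
    static void sync_icache(void *addr) {
        asm volatile("dc cvau, %0" :: "r"(addr) : "memory"); // 1: clean D-cache to PoU
        asm volatile("dsb ish"     ::: "memory");            // 2: wait for the clean
        asm volatile("ic ivau, %0" :: "r"(addr) : "memory"); // 3: invalidate I-caches
        asm volatile("dsb ish"     ::: "memory");            // 4: wait for invalidation
        asm volatile("isb"         ::: "memory");            // 5: flush this pipeline
    }

A write spanning multiple cache lines would need the DC and IC steps repeated per line.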
Some hardware implementations provide extra guarantees, rendering the DC or
IC instructions unnecessary. Arm allow software to discover this in an archi-
tectural way, by reading the CTR_EL0 register’s DIC and IDC bits. Our mod-
elling handles this, but for brevity we only discuss the weakest case, with
CTR_EL0.DIC=CTR_EL0.IDC=0, that requires full cache maintenance.
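For illustration, reading these bits from C might look as follows, assuming EL0 access to CTR_EL0 is enabled (as it usually is); the helper names are ours:

    #include <stdint.h>

    // CTR_EL0 bit 29 is DIC, bit 28 is IDC; when a bit is set, the IC or DC
    // step (respectively) of the synchronisation sequence can be skipped.
    static inline uint64_t read_ctr_el0(void) {
        uint64_t v;
        asm volatile("mrs %0, ctr_el0" : "=r"(v));
        return v;
    }

    static inline int need_dc(void) { return !((read_ctr_el0() >> 28) & 1); }
    static inline int need_ic(void) { return !((read_ctr_el0() >> 29) & 1); }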
Arm make clear that instructions can be prefetched (perhaps speculatively):
“How far ahead of the current point of execution instructions are fetched from
is IMPLEMENTATION DEFINED. Such prefetching can be either a fixed or a
dynamically varying number of instructions, and can follow any or all possible
future execution paths. For all types of memory, the PE might have fetched the
instructions from memory at any time since the last Context synchronization
event on that PE.”
In this test Thread 0 performs a memory store (with the STR instruction)
to the code that Thread 1 is executing; overwriting the ADD X0,X0,#1 instruc-
tion with the 32-bit encoding of the SUB X0,X0,#1 instruction. If the fetch were
atomic, the outcome of this test would be the result of executing either the ADD
or the SUB instruction, but, since at least one of those is not in the set of the
8 atomically-fetchable instructions given previously, Thread 1 has constrained-
unpredictable behaviour and the final state is very loosely constrained. Note,
however, that this is nonetheless much stronger than the C/C++ whole-program
undefined behaviour in the presence of a data race: unlike C/C++, a hardware
architecture has to define a useful envelope of behaviour for arbitrary code, to
provide guarantees for the rest of the system when one user thread has a race.
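For illustration, Thread 0's store plus the maintenance needed to make such a patch safe might look as follows, assuming the standard A64 encodings for these two instructions and the sync_icache sketch above (the function is ours, not part of the test):

    #include <stdint.h>

    extern void sync_icache(void *addr);  // as sketched earlier

    // A64 encodings: ADD X0,X0,#1 is 0x91000400; SUB X0,X0,#1 is 0xD1000400.
    // Without the synchronisation, a concurrent fetch of *code is
    // constrained-unpredictable, as the test above shows.
    void patch_add_to_sub(uint32_t *code) {
        *code = 0xd1000400u;  // the STR of the test: a plain data store
        sync_icache(code);    // DC/DSB/IC/DSB/ISB; other threads still need
                              // their own context synchronisation (e.g. an ISB)
    }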
Conditional Branches For conditional branches, the Arm architecture pro-
vides a specific non-single-copy-atomic fetch guarantee: the execution will be
consistent with either the old or new target, and either the old or new condition.
For example, this W+F+branches test can overwrite a B.EQ g with a B.NE h, and
end up executing B.NE g or B.EQ h instead of one of those. Our future examples
will only modify NOPs and unconditional branch instructions.

W+F+branches AArch64
Initial state: 0:W0="B.NE h", 0:X1=l
Thread 0        Thread 1
STR W0,[X1]     l: B.EQ g
Allowed: execute "B.NE g"
3.2 Coherence
Data writes and reads are coherent, in Arm and in other major architectures:
in any execution, for each address, the reads of each hardware thread must see
a subsequence of the total coherence order of all writes to that address. The
plain-data CoRR test [46] illustrates one case of this: it is forbidden for a thread
to read a new write of x and then the initial state for x. However, instruction
fetches are not necessarily coherent: one instruction fetch may be inconsistent
with a program-order-previous fetch, and the data and instruction streams can
become out-of-sync with each other. We explore three kinds of coherence:
Edges from the initial state are drawn from a small circle. Since we do not modify
the code of most locations, we usually omit the fetch events for those instructions,
showing only a subgraph of the interesting events, e.g. as on the right above. For
Arm, this execution is both architecturally allowed and experimentally observed.
Here, and in future tests, we assume some common code consisting of a
function at address f which always has the same shape: a branch that might
be overwritten, which selects a block that writes a value to register X10 before
returning.
This is not clear in the existing prose specification, but the architectural
intent that emerged during discussion with Arm is that the given execution
should be forbidden, reflecting microarchitectural choices that (1) instructions
decode in order, so the fetch b must occur before the read d, and (2) fetches that
miss in the instruction cache must read from data storage, so the instruction
cache cannot be ahead of the available data. This ensures that fetching from a
write means that all threads are now guaranteed to read from that write (or
another coherence-after it).
Instruction-to-Data Coherence In the other direction, reading from a par-
ticular write to some location does not imply that later fetches of that location
will see that write (or a coherence successor), as in the following CoRF+ctrl-isb.
CoRF+ctrl-isb AArch64
Initial state: 0:W0="B l1", 0:X1=f, 1:X2=f
Thread 0        Thread 1        Common
STR W0,[X1]     LDR X0,[X2]     f:  B l0
                CBNZ X0,l       l1: MOV X10,#2
                l: ISB              RET
                BL f            l0: MOV X10,#1
                MOV X1,X10          RET
Allowed: 1:X0="B l1", 1:X1=1

[Candidate execution: a: write f=B l1 −rf→ b: read f=B l1 −ctrl+isb→ c: fetch f=B l0, where c reads (irf) from the initial write of f.]
Here, the fetch may be satisfied by an old value from the instruction cache without going out to data memory. The ISB
ensures that f is freshly fetched, but does not ensure that Thread 1’s instruction
cache is up-to-date with respect to data memory.
This is for similar reasons to the above CoFR test: since Thread 1 fetched the
updated value for f, we know that value must have reached at least the data
caches (since that is where the instruction cache reads from) and therefore multi-
copy atomicity guarantees that a normal load instruction will observe it.
The final variant of these MP-shaped tests has both Thread 0 writes be of new
instructions. This idiom is very common in practice; it is currently how Chrome’s
WebAssembly JIT synchronises the modified thread with the new code.
MP.FF+dmb+fpo AArch64
Initial state: 0:W0="B l1", 0:X1=f1, 0:W2="B l1", 0:X3=f2
Thread 0        Thread 1
STR W0,[X1]     BL f2
DMB ISH         MOV X0,X10
STR W2,[X3]     BL f1
                MOV X1,X10
Allowed: 1:X0=2, 1:X1=1

[Candidate execution: a: write f1=B l1 −dmb→ b: write f2=B l1; c: fetch f2=B l1 (irf from b) −fpo→ d: fetch f1=B l0 (irf from the initial write).]
ISA2.F+dc+ic+ctrl-isb AArch64
Initial state: 0:W0="B l1", 0:X1=f, 0:X2=1, 0:X3=x, [x]=0,
               1:X4=f, 1:X1=x, 1:X2=1, 1:X3=y, [y]=0, 2:X2=y
Thread 0        Thread 1        Thread 2
STR W0,[X1]     LDR X0,[X1]     LDR X0,[X2]
DC CVAU, X1     DSB ISH         CBZ X0,l
DSB ISH         IC IVAU, X4     l: ISB
STR X2,[X3]     DSB ISH         BL f
                STR X2,[X3]     MOV X1,X10
Forbidden: 1:X0=1, 1:X1=1

[Candidate execution: a: write f=B l1 −dcsync→ b: write x=1 −rf→ c: read x=1 −icsync→ d: write y=1 −rf→ e: read y=1 −ctrl→ f: ISB −isb→ g: fetch f=B l0, with g ifr-related to a.]
For data accesses, the question of whether they are multi-copy atomic is a crucial
one for relaxed architectures. IBM POWER, ARMv7, and pre-2018 ARMv8-A
are/were non-multi-copy atomic: two writes to different addresses could become
visible to distinct other threads in different orders. Post-2018 ARMv8-A and
RISC-V are multi-copy atomic (or “other multi-copy-atomic” in Arm terminol-
ogy) [37,36,9]: the programmer can assume there is a single shared memory, with
all relaxed-memory effects due to thread-local out-of-order execution.
However, for fetches, due to the lack of any fetch atomicity guarantee for most
instructions (§3.1), and the lack of coherent fetches for the others (§3.2), the
question of multi-copy atomicity is not particularly interesting. Tests are either
trivially forbidden (by data-to-instruction coherence) or are allowed but only the
full cache synchronisation sequence provides enough guarantees to forbid it, and
(§3.3) this ensures all cores will share the same consistent view of memory.
Multiple Points of Unification Cleaning the data cache, using the DC in-
struction, makes a write visible to instruction memory. It does this by pushing
the write past the Point of Unification. However, there may be multiple Points
of Unification: one for each core, where its own instruction and data memory
become unified, and one for the entire system (or shareability domain) where all
the caches unify. Fetching from a write implies that it has reached the closest
PoU, but does not imply it has reached any others, even if the write originated
from a distant core. Consider: Here Thread 0 modifies f, Thread 1 fetches the
new value and performs just an IC and DSB, before signalling Thread 0 which
also fetches f. That IC is not strong enough to ensure that the write is pulled
into the instruction cache of Thread 0.
This is not clear in the existing prose, but the architectural intent is that it
be allowed (i.e., that IC is weak in this respect). We have not so far observed it
in practice. The write may have passed the Point of Unification for Thread 1,
but not the shared Point of Unification for both threads. In other words, the
write might reach Thread 1’s instruction cache without being pushed down from
Thread 0's data cache. Microarchitecturally this can be explained by data moving directly from Thread 0's data cache into Thread 1's instruction cache, without the write passing the shared Point of Unification.
SM.F+ic AArch64
Initial state: 0:W0="B l1", 0:X4=f, 0:X3=x, [x]=0, 1:X4=f, 1:X2=1, 1:X3=x
Thread 0        Thread 1
STR W0,[X4]     BL f
LDR X2,[X3]     MOV X0,X10
CBZ X2,l        IC IVAU, X4
l: ISB          DSB ISH
BL f            STR X2,[X3]
MOV X1,X10
Allowed: 1:X0=2, 0:X2=1, 0:X1=1

[Candidate execution: a: write f=B l1 −po→ b: read x=1 −ctrl→ c: ISB −isb→ d: fetch f=B l0; on Thread 1, e: fetch f=B l1 (irf from a) −icsync→ f: write x=1 −rf→ b.]
Forbidding this would require keeping multiple writes in the coherent part of
the data caches, rather than a single dirty line, which would require more
complex cache coherence protocols. On the other hand, there does not seem to
be any benefit to software from
forbidding it. Arm therefore prefer the choice that gives a simpler and weaker
model (here the two happen to coincide), to make it easier to understand and to
provide more flexibility for future microarchitectural optimisations. We therefore
design our models to allow the above behaviour.
This is similar to the preceding FOW case: it is thought unlikely that hardware
will exhibit this in practice, but the desire for the simpler and weaker option
means the architectural intent is to allow it, and we follow that in our models.
The Flat model [37] has abstract machine states consisting of a tree of instructions for each thread, and
a flat memory subsystem shared by all threads. Each instruction in each thread
corresponds to a sequence of transitions, with some guards and a potential effect
on the shared memory state. The Flat model is made executable in our RMEM
tool, which can exhaustively interleave transitions to enumerate all the possible
behaviours. The tree of instructions for each thread models out-of-order and
speculative execution explicitly. Below we show an example for a thread that is
executing 10 instruction instances.
Some (grey) are finished, no longer
subject to restart; others (pink)
have run some but perhaps not all
of their instruction semantics; in-
structions are not necessarily atomic. Those with multiple children are branch
instructions with multiple potential successors speculated simultaneously.
For each state, the model defines the set of allowed transitions, each of which
steps to a new machine state. Transitions correspond to steps of single instruc-
tions, and individual instructions may give rise to many. Example transitions
include Register Write, Propagate Write to Memory, etc.
iFlat Extension Originally, Flat had a fixed instruction memory, with a single
transition that can speculate the address of any program-order successor of any
instruction in flight, fetch it from the fixed instruction memory, and decode it.
We now remove that fixed instruction memory and add new structures, as shown
in the figure below. These are all of unbounded size, as is appropriate for an
architecture definition.

[Figure: each Thread issues new fetch requests into a per-thread Fetch Queue, whose head is decoded; fetches are satisfied from a per-thread Abstract I$, which may be added to from any write in the shared D$; data writes and reads go through the D$ (reads see the most recent write), which sits above a flat Memory.]
Fetch Queues (per-thread) These are ordered buffers of pre-fetched entries,
waiting to be decoded and begin execution. Entries are either a fetched 32-bit
opcode, or an unfetched request. The fetch queues allow the model to speculate
and pre-fetch many instructions ahead of where the thread is currently executing.
The model’s fetch queues abstract from multiple real-hardware structures: in-
struction queues, line-fill buffers, loop buffers, and slots objects. We keep a close
relation to this underlying microarchitecture by allowing out-of-order fetches,
but we believe this is not experimentally observable on real hardware.
Abstract Instruction Caches (per-thread) These are just sets of writes.
When the fetch queue requests a new entry, it gets satisfied from the instruction
cache, either immediately (a hit) or at some later point in time (a miss). The
instruction cache can contain many possible writes for each location (§3.6), and
it can be spontaneously updated with new writes in the system at any time ([9,
B2.4.4]). To manage IC instructions, each thread keeps a list of addresses yet to
be invalidated by in-flight ICs.
Data Cache (global) Above the single shared flat memory for the entire sys-
tem, which sufficed for the multi-copy-atomic ARMv8-A data memory, we insert
a shared buffer which is just a list of writes; abstracting from the many possible
coherent data cache hierarchies. Data reads must be coherent, reading from the
most recent write to the same address in the buffer, but instruction fetches are
allowed to read from any such write in the buffer (§3.2).
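To illustrate these two read disciplines (a toy rendering, not the Lem definitions), the abstract data-cache buffer might look as follows in C, with addresses and opcodes as plain integers:

    #include <stddef.h>

    #define MAXW 128

    typedef struct { int addr, val; } write_t;

    static write_t dcache[MAXW];  // shared buffer of writes, in propagation order
    static size_t  ndc = 0;

    void propagate_write(int addr, int val) {
        dcache[ndc++] = (write_t){addr, val};
    }

    // Data reads are coherent: the most recent same-address write wins.
    int read_data(int addr) {
        for (size_t i = ndc; i > 0; i--)
            if (dcache[i-1].addr == addr) return dcache[i-1].val;
        return 0;  // initial memory value
    }

    // Instruction fetches may be satisfied by *any* same-address write in
    // the buffer; `choice` picks which, modelling the nondeterminism.
    int fetch_instr(int addr, size_t choice) {
        size_t seen = 0;
        for (size_t i = 0; i < ndc; i++)
            if (dcache[i].addr == addr && seen++ == choice) return dcache[i].val;
        return 0;  // fall back to the initial value
    }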
Transitions To accommodate instruction fetch and cache maintenance, we in-
troduce new transitions: Fetch Request, Fetch Instruction, Fetch Instruction
(Unpredictable), Fetch Instruction (B.cond), Decode Instruction, Begin IC,
Propagate IC to Thread, Complete IC, Perform DC, and Update Instruction
Cache. We also have to modify some Flat transitions: Commit ISB, Wait for
DSB, Commit DSB, Propagate Memory Write, and Satisfy Read from Memory.
These transitions define the lifecycle of each instruction: a request gets issued
for the fetch, then at some later point the fetch gets satisfied from the instruc-
tion cache, the instruction is then decoded (in program-order) and then handed
to the existing semantics to be executed. To give a flavour, we show just one,
the Propagate IC to Thread transition, which is responsible for invalidation of
the abstract instruction caches. This is a prose rendering of the rule in our exe-
cutable mathematical model, which is expressed in the typed functional subset
of Lem [32].
The cache synchronisation sequence can now be re-interpreted w.r.t. our precise model, and used to explain the thread
migration case of §3.3. Given DC Xn; DSB; IC Xn; DSB we can use this model
to give meaning to it (omitting uninteresting transitions): First the DC CVAU
causes a Perform DC transition. This pushes any write that might have been
in the abstract data cache into memory. Now the first DSB’s Commit DSB can
be taken, allowing Begin IC to happen. This creates entries for each thread,
which are discharged by each Propagate IC to Thread (see above). Once all
entries are invalidated, a Complete IC can happen. Now, if any thread decodes
an instruction for that address, it must have been fetched from the write the
DC pushed, or something coherence-after it. If the software thread performing
this sequence is interrupted and migrated (by the OS) to a different hardware
thread, then, so long as the OS includes the DSB to maintain the thread-local DC
ordering, the DC will push the write in an identical way, since it only affects the
global abstract data cache. The IC transitions can all be taken, and the sequence
continues as before, just on a new hardware thread. So when the second DSB
finishes, and the final Commit DSB transitions is taken, the effect of the full
sequence will be seen system-wide even if the thread was migrated.
We write R;S for the composition of relations R and S, R−1 for the inverse of relation R, R|S and R&S for the union and intersection
of R and S, and [A];R;[B] for the restriction of R to the domain A and range B.
Handling instruction fetch requires extending the notion of candidate ex-
ecution. We add new events: an instruction-fetch (IF) event for each executed
instruction; a DC event for each DC CVAU instruction; an IC event for each IC IVAU
and IC IALLU instruction. We replace po with fetch-program-order (fpo) which
orders the IF event of an instruction before any program-order later IF events.
We add a relation same-cache-line (scl), relating reads, writes, fetches, DC and
IC events to addresses in the same cache line. We add an acyclic transitively
closed relation wco, which extends co with orderings for cache maintenance (DC
or IC) events: it includes an ordering (e, e′) or (e′, e) for any cache maintenance
event e and same-cache-line event e′ if e′ is a write or another cache mainte-
nance event; where co = ([W];wco;[W]) & loc. The loc, addr, and ctrl relations are all
extended to include DC and IC events. We add a fetch-to-execute relation (fe),
relating an IF event to any event generated by the execution of that instruction;
and an instruction-read-from relation (irf), which relates a write to any IF event
that fetches from it. Finally, we add a boolean constrained-unpredictable (CU) to
detect badly behaved programs. Now we derive the following relations: the stan-
dard po relation, as po = fe−1;fpo;fe (two events e and e′ are po-related if their
fetch events are fpo-related); and instruction-from-reads (ifr), the analogue of
fr for instruction fetches, relating a fetch to all writes coherence-after the one it
fetched from: ifr = irf−1;co.
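As a toy illustration of these relational definitions (not part of our formal development), the two compositions can be computed over explicit edge lists in C:

    #include <stddef.h>

    typedef struct { int a, b; } pair_t;

    // out := r ; s = { (x,z) | (x,y) in r, (y,z) in s }
    static size_t compose(const pair_t *r, size_t nr,
                          const pair_t *s, size_t ns,
                          pair_t *out, size_t cap) {
        size_t n = 0;
        for (size_t i = 0; i < nr; i++)
            for (size_t j = 0; j < ns; j++)
                if (r[i].b == s[j].a && n < cap)
                    out[n++] = (pair_t){r[i].a, s[j].b};
        return n;
    }

    // out := r^-1 ; s = { (x,z) | (y,x) in r, (y,z) in s }
    // e.g. ifr = inv_compose(irf, co);
    // po is compose(inv_compose(fe, fpo), fe) in two steps.
    static size_t inv_compose(const pair_t *r, size_t nr,
                              const pair_t *s, size_t ns,
                              pair_t *out, size_t cap) {
        size_t n = 0;
        for (size_t i = 0; i < nr; i++)
            for (size_t j = 0; j < ns; j++)
                if (r[i].a == s[j].a && n < cap)
                    out[n++] = (pair_t){r[i].b, s[j].b};
        return n;
    }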
We then make two semantics-preserving rewrites of the existing model to
make adding instruction fetches easier (described in the appendix); and make
the following changes and additions to the model. The full model is shown in
Figure 1, with comments pointing to the relevant locations in the model defini-
tion. For lack of space we only describe the main addition, the iseq relation, in
detail (including its correspondence with the operational model of §4); for the
others we give an overview and refer to the appendix for the full description.
We define the relation iseq, relating some write w to address x to an IC
event completing a cache synchronisation sequence (not necessarily on a single
thread): w is followed by a same-cache-line DC event, which is in turn followed
by a same-cache-line IC event. In operational model terms, this captures traces
that propagated w to memory, subsequently performed a same-cache-line DC,
and then began an IC (and eagerly propagated the IC to all threads). In any
state after this sequence it is guaranteed that w, or a coherence-newer same-
address write, is in the instruction cache of all threads: performing the DC has
cleared the abstract data cache of writes to x, and the subsequent IC has re-
moved old instructions for location x from the instruction caches, so that any
subsequent updates to the instruction caches have been with w, or co-newer
writes. Adding ifr;iseq to the observed-by relation (obs) (4) relates an instruc-
tion fetch i of location x to an IC event ic if: i fetched from a write w to x, some
write w′ to x is coherence-after w, and ic completes a cache synchronisation se-
quence (iseq) starting from w′. Then the irreflexive ob axiom requires that i
must be ordered before ic (because it would otherwise have fetched w′). We now
briefly overview other changes made to the axiomatic model and their intuition.
We include irf in obs (3): for an instruction to be fetched from a write, the
write has to have been done before. We add a relation fetch-ordered-before (fob)
(5-7), which is included in ordered-before. The relation fob includes fpo and fe:
including fpo (5) requires fetches to be ordered according to their position in the
control-flow unfolding of the execution, and including the fe (fetch-to-execute)
relation (6) captures the idea that an instruction must be fetched before it can
execute; fetches program-order-after an ISB happen after the ISB (or else are
restarted) (7). For DSB ISH instructions the edge [R|W|F|DC|IC];po;[dsb.ish]
is included in ob (9): DSB ISHs are ordered with all program-order-preceding
non-fetch events. Symmetrically, all non-IF events are ordered after program-
order-preceding dsb.ish events (10). DCs wait for preceding dmb.sy events (11).
We include the relation cache-op-ordered-before (cob) in ob. This relation orders
DC instructions with program-order previous reads/writes and other DCs to the
same cache line (12,13).
Finally, could-fetch-from (cff) (14) captures, for each fetch i, the writes it
could have fetched from (including the one it did fetch from), which we use to
define the constrained unpredictable axiom cff_bad (not given) (15).
6 Validation
To gain confidence in the presented models we validated the models against the
Arm architectural intent, against each other, and against real hardware.
Validation against the Architecture To ensure our models correctly cap-
tured the architectural intent we engaged in detailed discussions with Arm, in-
cluding the Arm chief architect. These involved inventing litmus tests (including,
those described in §3 and many others) and discussing what the architecture
should allow in each case.
Validating against hardware To run instruction-fetch tests on hardware, we
extended the litmus tool [7]. The most significant extension consists in handling
code that can be modified, and thus has to be restored between experiments. To
that end, code copies are executed; those copies reside in mmap'd memory with
execute permission granted. Copies are made from "master" copies, in effect
C functions whose contents basically consist of gcc extended inline assembly. Of
course, such code has to be position independent, and explicit code addresses in
test initialisation sections (such as 0:X1=l in the test of §3.1) are specific to
each copy. The cache handling instructions used in our experiments are all
allowed to execute at exception level 0 (user mode), and therefore no additional
privilege is needed to run the tests.
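A minimal sketch of that copying step, assuming POSIX mmap and the GCC/Clang __builtin___clear_cache builtin (the function name is ours; litmus itself differs in detail):

    #include <string.h>
    #include <sys/mman.h>

    // Copy a position-independent "master" code fragment into fresh
    // executable memory, then synchronise the instruction caches.
    void *make_code_copy(const void *master, size_t len) {
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE | PROT_EXEC,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return NULL;
        memcpy(p, master, len);
        __builtin___clear_cache((char *)p, (char *)p + len);
        return p;
    }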
To automatically generate families of interesting instruction-fetch tests, we
extended the diy test generation tool [3] to support instruction-fetch reads-
from (irf) and instruction-fetch from-reads (ifr) edges, in both internal (same-
thread) and external (inter-thread) forms, and the cachesync edge. We used this
to generate 1456 tests involving those edges together with po, rf, fr, addr, ctrl,
ctrlisb, and dmb.sy. diy does not currently support bare DC or IC instructions,
locations which are both fetched and read from, or repeated fetches from the
same location.
We then ran the diy-generated test suite on a range of hardware implemen-
tations, to collect a substantial sample of actual hardware behaviour.
Correspondence between the models We experimentally test the equiva-
lence of the operational and axiomatic models on the above hand-written and
diy-generated tests, checking that the models give the same sets of allowed final
states, and that these are consistent with the hardware observations.
Making the models executable as a test oracle To make the operational
model executable as a test oracle, capable of computing the set of all allowed
executions of a litmus test, we must be able to exhaustively enumerate all possible
traces. For the model as presented, doing this naively is infeasible: for each
instruction it is theoretically possible to speculate any of the 264 addresses as
potential next address, and the interleaving of the new fetch transitions with
others leads to an additional combinatorial explosion.
We address these with two new optimisations. First, we extend the fixed-point
optimisation in RMEM (incrementally computing the set of possible branch tar-
gets) [37] to keep track not only of indirect branches but also the successors of
every program location, and only allow speculating from this set of successors.
Additionally, we track during a test which locations were both fetched and mod-
ified during the test, and eagerly take fetch and decode transitions for all other
locations. As before, the search then runs until the set of branch targets and
the set of modified program-locations reaches a fixed point. We also take some
of the transitions eagerly to reduce the search space, in cases where this cannot
remove behaviour: Wait for IC, Complete IC, Fetch Request, and Update
Instruction Cache.
Validation results First, to check for regressions, we ran the operational model
on all the 8950 non-mixed-size tests used for developing the original Flat model
(without instruction fetch or cache maintenance). The results are identical, ex-
cept for 23 tests which did not terminate within two hours. We used a 160
hardware-thread POWER9 server to run the tests.
We have also run the axiomatic model on the 90 basic two-thread tests that
do not use Arm release/acquire instructions (not supported by the ISA semantics
used for this); the results are all as they should be. This takes around 30 minutes
on 8 cores of a Xeon Gold 6140.
Then, for the key handwritten tests mentioned in this paper, together with
some others (that have also been discussed with Arm), we ran them on various
hardware implementations and in the operational and axiomatic models. The
models’ results are identical to the Arm architectural intent in all cases, except
for two tests which are not currently supported by the axiomatic checker.
Test                                  Arm intent  op. model  ax. model    hardware obs.
CoFF                                  allow       =          =            42.6k/13G
CoFR                                  forbid      =          =            0/13G
CoRF+ctrl-isb                         allow       =          =            3.02G/13G
SM                                    allow       =          =            25.8G/25.9G
SM+cachesync-isb                      forbid      =          =            0/25.9G
MP.RF+dmb+ctrl-isb                    allow       =          =            480M/6.36G
MP.RF+cachesync+ctrl-isb              forbid      =          =            0/13G
MP.FR+dmb+fpo-fe                      forbid      =          =            0/13G
MP.FF+dmb+fpo                         allow       =          =            447M/13G
MP.FF+cachesync+fpo                   forbid      =          =            2.3k/13G (F)
ISA2.F+dc+ic+ctrl-isb                 forbid      =          =            0/6.98G
SM.F+ic                               allow       =          unsupported  0/12.9G (U)
FOW                                   allow       =          unsupported  0/7G (U)
MP.RF+dc+ctrl-isb-isb                 allow       =          =            0/12.94G (U)
MP.R.RF+addr-cachesync+dmb+ctrl-isb   forbid      =          =            0/6.97G
MP.RF+dmb+addr-cachesync              allow       =          =            0/6.34G (U)

[The hardware observations are the sum of testing seven devices: a Snapdragon 810
(4x Arm A53 + 4x Arm A57 cores), Tegra K1 (2x NVIDIA Denver cores), Snapdragon
820 (4x Qualcomm Kryo cores), Exynos 8895 (4x Arm A53 + 4x Samsung Mongoose 2
cores), Snapdragon 425 (4x Arm A53 cores), Amlogic 905 (4x Arm A53 cores), and
Amlogic 922X (4x Arm A73 + 2x Arm A53 cores). U: allowed but unobserved. F:
forbidden but observed.]
Our testing revealed a hardware bug in a Snapdragon 820 (4 Qualcomm Kryo
cores). A version of the first cross-thread synchronisation test of §3.3 but with
the full cache synchronisation (MP.RF+cachesync+ctrl-isb) exhibited an illegal
outcome in 84/1.1G runs (not shown in the table), which we have reported. We
have also seen an anomaly for MP.FF+cachesync+fpo, currently under investi-
gation by Arm. Apart from these, the hardware observations are all allowed by
the models. As usual, specific hardware implementations are sometimes stronger.
Finally, we ran the 1456 new instruction-fetch diy tests on a variety of hard-
ware, for around 10M iterations each, and in the operational model. The model
is sound with respect to the observed hardware behaviour except for that same
Snapdragon 820 device.
7 Related Work
To the best of our knowledge, no previous work establishes well-validated rigor-
ous semantics for any systems aspects, of any current production architecture,
in a realistic concurrent setting.
The closest is Raad et al.’s work on non-volatile memory, which models the
required cache maintenance for persistent storage in ARMv8-A [39], as an ex-
tension to the ARMv8-A axiomatic model, and for Intel x86 [38] as an oper-
ational model, but neither are validated against hardware. In the sequential
case, Myreen’s JIT compiler verification [33] models x86 icache behaviour with
an abstract cache that can be arbitrarily updated, cleared on a jmp. For ad-
dress translation, the authoritative Arm-internal ASL model [40,41,42], and Sail
model derived from it [11] cover this, and other features sufficient to boot an OS
(Linux), as do the handwritten Sail models for RISC-V (Linux and FreeBSD)
and MIPS/CHERI-MIPS (FreeBSD, CheriBSD), but without any cache effects.
Goel et al. [21,20] describe an ACL2 model for much of x86 that covers address
translation; and the Forvis [34] and RISCV-PLV [14] Haskell RISC-V ISA mod-
els are also complete enough to boot Linux. Syeda and Klein [49,50] provide
a somewhat idealised model for ARMv7 address translation and TLB mainte-
nance. Komodo [16] uses a handwritten model for a small part of ARMv7, as
do Guanciale et al. [25,12]. Romanescu et al. [44,43] do discuss address trans-
lation in the concurrent setting, but with respect to idealised models. Lustig et
al. [30] describe a concurrent model for address translation based on the Intel
Sandy Bridge microarchitecture, combined with a synopsis of some of the rele-
vant Linux code, but not an architectural semantics for machine-code programs.
8 Conclusion
References
1. Adir, A., Attiya, H., Shurek, G.: Information-flow models for shared memory with
an application to the PowerPC architecture. IEEE Trans. Parallel Distrib. Syst.
14(5), 502–515 (2003). https://doi.org/10.1109/TPDS.2003.1199067
2. Alglave, J., Fox, A., Ishtiaq, S., Myreen, M.O., Sarkar, S., Sewell, P.,
Zappa Nardelli, F.: The semantics of Power and ARM multiprocessor machine
code. In: Proc. DAMP 2009 (Jan 2009)
3. Alglave, J., Maranget, L.: The diy7 tool. http://diy.inria.fr/ (2019), accessed
2019-07-08
4. Alglave, J., Maranget, L.: The herd7 tool. http://diy.inria.fr/doc/herd.html/
(2019), accessed 2019-07-08
5. Alglave, J., Maranget, L., Deplaix, K., Didier, K., Sarkar, S.: The litmus7 tool.
http://diy.inria.fr/doc/litmus.html/ (2019), accessed 2019-07-08
6. Alglave, J., Maranget, L., Sarkar, S., Sewell, P.: Fences in weak memory models.
In: Proc. CAV (2010)
7. Alglave, J., Maranget, L., Sarkar, S., Sewell, P.: Litmus: running tests against
hardware. In: Proceedings of TACAS 2011: the 17th international conference on
Tools and Algorithms for the Construction and Analysis of Systems. pp. 41–44.
Springer-Verlag, Berlin, Heidelberg (2011), http://dl.acm.org/citation.cfm?id=
1987389.1987395
8. Alglave, J., Maranget, L., Tautschnig, M.: Herding Cats: Modelling, Simulation,
Testing, and Data Mining for Weak Memory. ACM TOPLAS 36(2), 7:1–7:74 (Jul
2014). https://doi.org/10.1145/2627752
9. ARM Limited: ARM architecture reference manual. ARMv8, for ARMv8-A archi-
tecture profile (Oct 2018), v8.4. ARM DDI 0487D.a (ID103018)
10. Armstrong, A., Bauereiss, T., Campbell, B., Flur, S., French, J., Gray, K.E., Kerneis, G.,
Krishnaswami, N., Mundkur, P., Norton-Wright, R., Pulte, C., Reid, A., Sewell, P.,
Stark, I., Wassell, M.: Sail. https://www.cl.cam.ac.uk/~pes20/sail/ (2019)
11. Armstrong, A., Bauereiss, T., Campbell, B., Reid, A., Gray, K.E., Norton, R.M.,
Mundkur, P., Wassell, M., French, J., Pulte, C., Flur, S., Stark, I., Krishnaswami,
N., Sewell, P.: ISA semantics for ARMv8-A, RISC-V, and CHERI-MIPS. In: Proc.
46th ACM SIGPLAN Symposium on Principles of Programming Languages (Jan
2019). https://doi.org/10.1145/3290384, proc. ACM Program. Lang. 3, POPL, Ar-
ticle 71
12. Baumann, C., Schwarz, O., Dam, M.: Compositional verification of security prop-
erties for embedded execution platforms. In: PROOFS@CHES 2017, 6th Interna-
tional Workshop on Security Proofs for Embedded Systems, Taipei, Taiwan, Friday
September 29th, 2017. pp. 1–16 (2017), http://www.easychair.org/publications/paper/wkpS
13. Chong, N., Ishtiaq, S.: Reasoning about the ARM weakly consistent memory
model. In: MSPC (2008)
14. Clester, I.J., Bourgeat, T., Wright, A., Gruetter, S., Chlipala, A.: riscv-plv RISC-V
ISA formal specification. https://github.com/mit-plv/riscv-semantics (2019),
accessed 2019-07-01
15. Deacon, W.: The ARMv8 application level memory model. https://github.com/herd/herdtools7/blob/master/herd/libdir/aarch64.cat (2016), accessed 2019-07-01
16. Ferraiuolo, A., Baumann, A., Hawblitzel, C., Parno, B.: Komodo: Using verification
to disentangle secure-enclave hardware from software. In: Proceedings of the 26th
Symposium on Operating Systems Principles, SOSP ’17 (2017)
28. Kumar, R., Myreen, M.O., Norrish, M., Owens, S.: CakeML: a verified imple-
mentation of ML. In: The 41st Annual ACM SIGPLAN-SIGACT Symposium on
Principles of Programming Languages, POPL ’14, San Diego, CA, USA, January
20-21, 2014. pp. 179–192 (2014). https://doi.org/10.1145/2535838.2535841
29. Leroy, X.: A formally verified compiler back-end. J. Autom. Reasoning 43(4), 363–
446 (2009). https://doi.org/10.1007/s10817-009-9155-4
30. Lustig, D., Sethi, G., Martonosi, M., Bhattacharjee, A.: COATCheck: Verifying
memory ordering at the hardware-OS interface. SIGOPS Oper. Syst. Rev. 50(2),
233–247 (Mar 2016). https://doi.org/10.1145/2954680.2872399
31. Maranget, L., Sarkar, S., Sewell, P.: A tutorial introduction to the ARM and
POWER relaxed memory models. Draft available from http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf (2012)
32. Mulligan, D.P., Owens, S., Gray, K.E., Ridge, T., Sewell, P.: Lem: reusable engi-
neering of real-world semantics. In: Proceedings of ICFP 2014: the 19th ACM SIG-
PLAN International Conference on Functional Programming. pp. 175–188 (2014).
https://doi.org/10.1145/2628136.2628143
33. Myreen, M.O.: Verified just-in-time compiler on x86. In: Proceedings of the
37th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Program-
ming Languages. pp. 107–118. POPL ’10, ACM, New York, NY, USA (2010).
https://doi.org/10.1145/1706299.1706313
34. Nikhil, R.S., Sharma, N.N.: Forvis: A formal RISC-V ISA specification. https://github.com/rsnikhil/Forvis_RISCV-ISA-Spec (2019), accessed 2019-07-01
35. Owens, S., Sarkar, S., Sewell, P.: A better x86 memory model: x86-TSO. In: Pro-
ceedings of TPHOLs 2009: Theorem Proving in Higher Order Logics, LNCS 5674.
pp. 391–407 (2009)
36. Pulte, C.: The Semantics of Multicopy Atomic ARMv8 and RISC-V. Ph.D. thesis,
University of Cambridge (2019), https://doi.org/10.17863/CAM.39379
37. Pulte, C., Flur, S., Deacon, W., French, J., Sarkar, S., Sewell, P.: Simplifying ARM
Concurrency: Multicopy-atomic Axiomatic and Operational Models for ARMv8.
In: Proceedings of the 45th ACM SIGPLAN Symposium on Principles of Program-
ming Languages (Jan 2018). https://doi.org/10.1145/3158107
38. Raad, A., Wickerson, J., Neiger, G., Vafeiadis, V.: Persistency seman-
tics of the Intel-x86 architecture. PACMPL 4(POPL), 11:1–11:31 (2020).
https://doi.org/10.1145/3371079
39. Raad, A., Wickerson, J., Vafeiadis, V.: Weak persistency semantics from the
ground up: Formalising the persistency semantics of ARMv8 and transactional
models. Proc. ACM Program. Lang. 3(OOPSLA), 135:1–135:27 (Oct 2019).
https://doi.org/10.1145/3360561
40. Reid, A.: Trustworthy specifications of ARM v8-A and v8-M system level archi-
tecture. In: FMCAD 2016. pp. 161–168 (October 2016), https://alastairreid.github.io/papers/fmcad2016-trustworthy.pdf
41. Reid, A.: ARM releases machine readable architecture specification. https://alastairreid.github.io/ARM-v8a-xml-release/ (Apr 2017)
42. Reid, A., Chen, R., Deligiannis, A., Gilday, D., Hoyes, D., Keen, W., Pathirane,
A., Shepherd, O., Vrabel, P., Zaidi, A.: End-to-end verification of processors with
ISA-Formal. In: Chaudhuri, S., Farzan, A. (eds.) Computer Aided Verification -
28th International Conference, CAV 2016, Toronto, ON, Canada, July 17-23, 2016,
Proceedings, Part II. Lecture Notes in Computer Science, vol. 9780, pp. 42–58.
Springer (2016)
43. Romanescu, B., Lebeck, A., Sorin, D.J.: Address translation aware
memory consistency. IEEE Micro 31(1), 109–118 (Jan 2011).
https://doi.org/10.1109/MM.2010.99
44. Romanescu, B.F., Lebeck, A.R., Sorin, D.J.: Specifying and dynamically verifying
address translation-aware memory consistency. In: Proceedings of the Fifteenth
Edition of ASPLOS on Architectural Support for Programming Languages and
Operating Systems. pp. 323–334. ASPLOS XV, ACM, New York, NY, USA (2010).
https://doi.org/10.1145/1736020.1736057
45. Sarkar, S., Memarian, K., Owens, S., Batty, M., Sewell, P., Maranget, L.,
Alglave, J., Williams, D.: Synchronising C/C++ and POWER. In: Pro-
ceedings of PLDI 2012, the 33rd ACM SIGPLAN conference on Program-
ming Language Design and Implementation (Beijing). pp. 311–322 (2012).
https://doi.org/10.1145/2254064.2254102
46. Sarkar, S., Sewell, P., Alglave, J., Maranget, L., Williams, D.: Understanding
POWER multiprocessors. In: Proceedings of PLDI 2011: the 32nd ACM SIGPLAN
conference on Programming Language Design and Implementation. pp. 175–186
(2011). https://doi.org/10.1145/1993498.1993520
47. Sarkar, S., Sewell, P., Zappa Nardelli, F., Owens, S., Ridge, T., Braibant,
T., Myreen, M., Alglave, J.: The semantics of x86-CC multiprocessor machine
code. In: Proceedings of POPL 2009: the 36th annual ACM SIGPLAN-SIGACT
symposium on Principles of Programming Languages. pp. 379–391 (Jan 2009).
https://doi.org/10.1145/1594834.1480929
48. Sewell, P., Sarkar, S., Owens, S., Zappa Nardelli, F., Myreen, M.O.: x86-TSO: A
rigorous and usable programmer’s model for x86 multiprocessors. Communications
of the ACM 53(7), 89–97 (Jul 2010), (Research Highlights)
49. Syeda, H., Klein, G.: Reasoning about translation lookaside buffers. In: LPAR-21,
21st International Conference on Logic for Programming, Artificial Intelligence and
Reasoning, Maun, Botswana, May 7-12, 2017. pp. 490–508 (2017), http://www.easychair.org/publications/paper/340347
50. Syeda, H.T., Klein, G.: Program verification in the presence of cached address
translation. In: Interactive Theorem Proving - 9th International Conference, ITP
2018, Held as Part of the Federated Logic Conference, FloC 2018, Oxford, UK,
July 9-12, 2018, Proceedings. pp. 542–559 (2018). https://doi.org/10.1007/978-3-319-94821-8_32
51. Tan, Y.K., Myreen, M.O., Kumar, R., Fox, A.C.J., Owens, S., Norrish, M.:
The verified CakeML compiler backend. J. Funct. Program. 29, e2 (2019).
https://doi.org/10.1017/S0956796818000229
52. Waterman, A., Asanović, K. (eds.): The RISC-V Instruction Set Manual Vol-
ume I: Unprivileged ISA (Dec 2018), document Version 20181221-Public-Review-
draft. Contributors: Arvind, Krste Asanović, Rimas Avižienis, Jacob Bachmeyer,
Christopher F. Batten, Allen J. Baum, Alex Bradbury, Scott Beamer, Preston
Briggs, Christopher Celio, Chuanhua Chang, David Chisnall, Paul Clayton, Palmer
Dabbelt, Roger Espasa, Shaked Flur, Stefan Freudenberger, Jan Gray, Michael
Hamburg, John Hauser, David Horner, Bruce Hoult, Alexandre Joannou, Olof
Johansson, Ben Keller, Yunsup Lee, Paul Loewenstein, Daniel Lustig, Yatin Man-
erkar, Luc Maranget, Margaret Martonosi, Joseph Myers, Vijayanand Nagarajan,
Rishiyur Nikhil, Jonas Oberhauser, Stefan O’Rear, Albert Ou, John Ousterhout,
David Patterson, Christopher Pulte, Jose Renau, Colin Schmidt, Peter Sewell,
Susmit Sarkar, Michael Taylor, Wesley Terpstra, Matt Thomas, Tommy Thorn,
Caroline Trippel, Ray VanDeWalker, Muralidaran Vijayaraghavan, Megan Wachs,
Higher-Ranked Annotation Polymorphic
Dependency Analysis
1 Introduction
The typical compiler for a statically typed functional language will perform a
number of analyses for validation, optimisation, or both (e.g., strictness anal-
ysis, control-flow analysis, and binding time analysis). These analyses can be
specified as a type-based static analysis so that vocabulary, implementation and
concepts from the world of type systems can be reused in this setting [19,24].
In that setting the analysis properties are taken from a language of annotations
which adorn the types computed for the program during type inference: the anal-
ysis is specified as an annotated type system, and the payload of the analysis
corresponds to the annotations computed for a given program.
Consider for example binding-time analysis [5,7]. In this case, we have a two-
value lattice of annotations containing S for static and D for dynamic (where
⊥ = S ⊑ D = ⊤, so that whenever an expression is annotated with S, it
can be soundly changed to D, because the latter is a strictly weaker property). An
expression that is known to be static may be evaluated at compile time, because
the analysis has determined that all the values that determine its outcome are
in fact available at compile time; all other expressions are annotated with
D and must be evaluated at run time. The goal of binding-time analysis is then
to (soundly) assign S to as many expressions as possible.
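To make the lattice concrete, the following is a minimal OCaml sketch of ours (not from the paper): the two binding times and their join, under which weakening S to D is always sound.

    (* Two-point binding-time lattice: bot = S (static), top = D (dynamic). *)
    type bt = S | D

    let bot : bt = S

    (* Join is the least upper bound: static only if both inputs are static. *)
    let join (a : bt) (b : bt) : bt =
      match (a, b) with
      | (S, S) -> S
      | _ -> D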
where β1 and β2 can be instantiated independently for each of the two calls to
f in foo, and β3 is universally bound by foo and represents how the argument f
uses its function argument.
Since the argument to f is itself a function, the information that flows out
of, say, the first call to f can be independent of the analysis of the function
that flows into the second call (and vice versa), thereby avoiding unnecessary
poisoning. This means that the binding-time of, say, the second component of
the pair depends only on f and the function λx : int.0, irrespective of f also
receiving λx : int.x as argument to compute the first component.
For the next example, let us consider security flow analysis, in which we have
annotations L and H that designate values (call these L-values and H-values)
of low and high confidentiality, respectively. An important scenario where additional
precision can be achieved is when analyzing Haskell code in which type classes
have been desugared to dictionary-passing functional core. A function like
g x y = (x + y, y + y)
3 The λ⊔-calculus
In order to avoid confusion with the field of (algebraic) effects, we refer to terms
of λ⊔ as dependency terms or dependency annotations. Terms are either of base
sort ⋆, representing values in the underlying lattice L, or of function sort κ1 ⇒ κ2.
On the term level, we allow arbitrary elements of the underlying lattice and
taking binary joins, in addition to the usual variables, function applications and
lambda abstractions. Lattice elements are assumed to be taken from a bounded
join-semilattice L: an algebraic structure ⟨L, ⊔, ⊥⟩ consisting of an underlying set
L and an associative, commutative and idempotent binary operation ⊔, called
join (we usually write ℓ ∈ L, conflating the structure L with its carrier set), and a least element ⊥.
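As an illustration of the interface this parameterisation requires, the following OCaml signature (our sketch; the names are not from the paper) captures exactly a bounded join-semilattice:

    module type BOUNDED_JOIN_SEMILATTICE = sig
      type t
      val bot : t                (* the least element, ⊥ *)
      val join : t -> t -> t     (* associative, commutative, idempotent ⊔ *)
    end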
The sorting rules of λ⊔ are straightforward (see [26]). Values of the underlying
lattice are always of sort ⋆, and the join operator is defined on arbitrary terms
of the same sort:

    Σ ⊢s ξ1 : κ    Σ ⊢s ξ2 : κ
    ─────────────────────────── [S-Join]
    Σ ⊢s ξ1 ⊔ ξ2 : κ

The sorting rules use sort environments, denoted by the letter Σ, that map
annotation variables β to sorts κ. We denote the set of sort environments by
SortEnv. More precisely, a sort environment or sort context Σ is a finite list of
bindings from annotation variables β to sorts κ. The empty context is written
as ∅ (in code as []), and the context Σ extended with the binding of the variable
β to the sort κ is written as Σ, β :: κ.
    V⋆ = L
    Vκ1⇒κ2 = {f : Vκ1 → Vκ2 | f monotone}
    ρ : AnnVar →fin ⋃{Vκ | κ ∈ AnnSort}

    ⟦β⟧ρ = ρ(β)
    ⟦λβ :: κ1. ξ⟧ρ = λv ∈ Vκ1. ⟦ξ⟧ρ[β↦v]
    ⟦ξ1 ξ2⟧ρ = ⟦ξ1⟧ρ (⟦ξ2⟧ρ)
    ⟦ℓ⟧ρ = ℓ
    ⟦ξ1 ⊔ ξ2⟧ρ = ⟦ξ1⟧ρ ⊔ ⟦ξ2⟧ρ

Fig. 2: The semantics of the λ⊔-calculus
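The denotational semantics of figure 2 can be rendered almost literally as an interpreter. The following OCaml sketch (ours) instantiates it with the two-point binding-time lattice; OCaml closures stand in for the function sorts, and monotonicity remains a meta-level invariant that the code does not enforce:

    (* Example lattice: the binding times S and D with S ⊑ D. *)
    type lat = S | D
    let join_lat a b = if a = D || b = D then D else S

    type ann =                        (* dependency terms ξ *)
      | Var of string                 (* β *)
      | Elem of lat                   (* lattice element ℓ *)
      | Lam of string * ann           (* λβ :: κ. ξ, sorts elided *)
      | App of ann * ann              (* ξ1 ξ2 *)
      | Join of ann * ann             (* ξ1 ⊔ ξ2 *)

    type value = Base of lat | Fn of (value -> value)

    (* Pointwise join, defined at every sort as in figure 2. *)
    let rec join_value v1 v2 =
      match (v1, v2) with
      | (Base a, Base b) -> Base (join_lat a b)
      | (Fn f, Fn g) -> Fn (fun v -> join_value (f v) (g v))
      | _ -> invalid_arg "join of values of different sorts"

    let rec eval env = function
      | Var b -> List.assoc b env
      | Elem l -> Base l
      | Lam (b, body) -> Fn (fun v -> eval ((b, v) :: env) body)
      | App (f, a) ->
          (match eval env f with
           | Fn g -> g (eval env a)
           | Base _ -> invalid_arg "application of a base-sort term")
      | Join (x, y) -> join_value (eval env x) (eval env y)

For instance, eval [] (App (Lam ("b", Var "b"), Elem D)) yields Base D.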
The types and syntax of our source language are given in figure 3. The types
of our source language consist of a unit type, and product, sum and function
types. As mentioned earlier, let-polymorphism at the type level is not part of the
type system. The language itself is then hardly surprising and includes variables,
a unit constant, lambda abstraction, function application, projection functions
for product types, sum constructors, a sum eliminator (case), fixpoints, seq for
explicitly forcing evaluation in our call-by-name language, and, finally, a special
operation annℓ(t) that raises the annotation level of t to ℓ. We omit the underly-
ing type system for the source language since it consists mostly of the standard
rules (see [26]). A notable exception is the rule for annℓ(t). Such an explicitly
annotated term has the same underlying type as t:

    Γ ⊢ t : τ
    ──────────────────── [U-Ann]
    Γ ⊢ annℓ(t) : τ
The annotation imposed on t only becomes relevant in the annotated type
system that we discuss next. In the following, we assume the usual definitions
for computing the set of free term variables of a term, ftv(t).
    τ ∈ Ty ::= ∀β :: κ. τ        (annotation quantification)
             | unit              (unit type)
             | τ1^ξ1 + τ2^ξ2     (sum type)
             | τ1^ξ1 × τ2^ξ2     (product type)
             | τ1^ξ1 → τ2^ξ2     (function type)

    t ∈ Tm ::= · · ·
             | λx : τ & ξ. t     (abstraction)
             | μx : τ & ξ. t     (fixpoint)
             | · · ·
             | Λβ :: κ. t        (dependency abstraction)
             | t ξ               (dependency application)
    ───────────── [Sub-Refl]
    Σ ⊢sub τ ≤ τ

    Σ ⊢sub τ1 ≤ τ2    Σ ⊢sub τ2 ≤ τ3
    ───────────────────────────────── [Sub-Trans]
    Σ ⊢sub τ1 ≤ τ3

    Σ, β :: κ ⊢sub τ1 ≤ τ2
    ───────────────────────────────── [Sub-Forall]
    Σ ⊢sub ∀β :: κ. τ1 ≤ ∀β :: κ. τ2

    Σ ⊢sub τ1 ≤ τ1′    Σ ⊢sub τ2 ≤ τ2′    Σ ⊢sub ξ1 ≤ ξ1′    Σ ⊢sub ξ2 ≤ ξ2′
    ───────────────────────────────────────────────────────────────────────── [Sub-Prod]
    Σ ⊢sub τ1^ξ1 × τ2^ξ2 ≤ τ1′^ξ1′ × τ2′^ξ2′

    Σ ⊢sub τ1′ ≤ τ1    Σ ⊢sub τ2 ≤ τ2′    Σ ⊢sub ξ1′ ≤ ξ1    Σ ⊢sub ξ2 ≤ ξ2′
    ───────────────────────────────────────────────────────────────────────── [Sub-Arr]
    Σ ⊢sub τ1^ξ1 → τ2^ξ2 ≤ τ1′^ξ1′ → τ2′^ξ2′

Fig. 5: Subtyping relation (Σ ⊢sub τ1 ≤ τ2); [Sub-Sum] is like [Sub-Prod]
A type τ1 is a subtype of τ2 under the sort environment Σ, written Σ ⊢sub τ1 ≤ τ2, if a value of type τ1 can be used in places where a value
of type τ2 is required. The subtyping relation only relates the annotations inside
the types, using the subsumption relation Σ ⊢sub ξ1 ≤ ξ2 between dependency
terms. Moreover, the subtyping relation implicitly demands that both types are
well-formed under the environment. The [Sub-Forall] rule requires that the
quantified variable has the same name in both types. This is not a restriction,
as we can simply rename the variables in one or both of the types accordingly
in order to make them match and prevent unintentional capturing of previously
free variables. Note that [Sub-Arr] is contravariant for argument positions. We
omitted [Sub-Sum] which can be derived from [Sub-Prod] by replacing × with
+.
the dependency term of t¹. It is implicitly assumed that every type τ is also well-
formed under Σ, i.e. Σ ⊢wft τ, and that the resulting dependency annotation ξ
is of sort ⋆, i.e. Σ ⊢s ξ : ⋆.
We now discuss some of the more interesting rules of figure 6. In [T-Var],
both the annotated type and the dependency annotation are looked up in the
environment. The dependency annotation of the unit value defaults to the least
annotation in [T-Unit]. While we could admit an arbitrary dependency anno-
tation here, the same can be achieved by using the subtyping rule [T-Sub]. We
employ this principle more often, e.g., in [T-Abs], and [T-Pair]. This essentially
means that the context in which such a term is used completely determines the
annotation.
The rule [T-App] may seem overly restrictive by requiring that the types
and dependency annotations of the arguments match, and that the dependency
annotations of the return value and the function itself are the same. However, in
combination with the subtyping rule [T-Sub], this effectively does not restrict
the analysis in any way. We see the same happening in other rules, such as
[T-Case] and [T-Proj]. Note that the dependency annotation of the argument
does not play a role in the resulting dependency annotation of the application.
This is because we are dealing with a call-by-name semantics, which means that
the argument is not necessarily evaluated before the function call. It should be
noted that this does not mean that the dependency annotations of arguments
are ignored completely. If the body of a function makes use of an argument, the
type system makes sure that its dependency annotation is also incorporated into
the result.
When constructing a pair (rule [T-Pair]), the dependency annotations of
the components are stored in the type while the pair itself is assigned the least
dependency annotation. When accessing a component of a pair (rule [T-Proj]),
we require that the dependency annotation of the pair matches the dependency
annotation of the projected component. Again, this is no restriction due to the
subtyping rule.
In [T-Inl/Inr], the argument to the injection constructor only determines
the type and annotation of one component of the sum type while the other
component can be chosen arbitrarily as long as the underlying type matches the
annotation on the constructor. The destruction of sum types happens in a case
statement that is handled by rule [T-Case]. Again, to keep the rule simple and
without loss of precision due to judicious use of rule [T-Sub], we may demand
that the types of both branches match, and that additionally the dependency
annotations of both branches and the scrutinee are equal.
The annotation rule [T-Ann] requires that the dependency annotation of
the term being annotated is at least as large as the lattice element ℓ. In the
fixpoint rule, [T-Fix], not only the types but also the dependency annotations
of the term itself and the bound variables must match. Note that this rule also

¹ Following the literature of type and effect systems we would much like to use the
term “effect” at this point, but decided to use a different term to avoid confusion
with the literature on effect handlers.
    Γ(x) = τ & ξ
    ───────────────────── [T-Var]
    Σ | Γ ⊢te x : τ & ξ

    ───────────────────────── [T-Unit]
    Σ | Γ ⊢te () : unit & ⊥

    Σ | Γ ⊢te t : τ1 & ξ1
    ───────────────────────────────────────── [T-Inl]
    Σ | Γ ⊢te inlτ2(t) : τ1^ξ1 + τ2^ξ2 & ⊥

    Σ | Γ ⊢te t : τ2 & ξ2
    ───────────────────────────────────────── [T-Inr]
    Σ | Γ ⊢te inrτ1(t) : τ1^ξ1 + τ2^ξ2 & ⊥

    Σ | Γ ⊢te t : τ & ξ    Σ ⊢sub ℓ ≤ ξ
    ───────────────────────────────────────── [T-Ann]
    Σ | Γ ⊢te annℓ(t) : τ & ξ

    Σ | Γ ⊢te t : ∀β :: κ. τ & ξ    Σ ⊢s ξ′ : κ
    ───────────────────────────────────────── [T-AnnApp]
    Σ | Γ ⊢te t ξ′ : [ξ′/β]τ & ξ

Fig. 6: Declarative annotated type system (Σ | Γ ⊢te t : τ & ξ)
5 Metatheory
In this section we develop a noninterference proof for our declarative type system,
based on a small-step operational call-by-name semantics for the target language.
Figure 7 defines the values of the target language, i.e. those terms that cannot
be further evaluated. Apart from a technicality related to annotations, they
correspond exactly to the weak head normal forms of terms. The distinction of
a subclass Nf′ ⊂ Nf is made to ensure that there is at most one annotation at top level.
The semantics itself is largely straightforward, except for the handling of
annotations. These are moved just as far outwards as necessary in order to
reach a normal form, thereby computing the least “permission” an evaluator
must possess for computing a certain output. Figure 8 shows two rules: a lifting
rule (for applications) and the rule for merging adjacent annotations (see the
supplemental material for the others).
In the remainder of this section we state the standard progress and subject
reduction theorems that ensure that our small-step semantics is compatible with
    v ∈ Nf′
    ─────────────────────────────────── [E-LiftApp]
    (annℓ(v)) t2 → annℓ(v t2)

    v ∈ Nf′
    ─────────────────────────────────── [E-JoinAnn]
    annℓ1(annℓ2(v)) → annℓ1⊔ℓ2(v)
the annotated type system. The following progress theorem demonstrates that
any well-typed term is in normal form, or an evaluation step can be performed.
Theorem 1 (Progress). If ∅ | ∅ ⊢te t : τ & ξ, then either t ∈ Nf or there is a
t′ such that t → t′.
The subject reduction property says that the reduction of a well-typed term
results in a term of the same type.
Theorem 2 (Subject Reduction). If ∅ | ∅ ⊢te t : τ & ξ and there is a t′ such
that t → t′, then ∅ | ∅ ⊢te t′ : τ & ξ.
As expected, subject reduction extends naturally to a sequence of reductions
by induction on the length of the reduction sequence:
Corollary 1. If we have ∅ | ∅ ⊢te t : τ & ξ and t →∗ v, then ∅ | ∅ ⊢te v : τ & ξ.
where, as usual, we write t →∗ v if there is a finite sequence of terms (ti)0≤i≤n
with t0 = t and tn = v ∈ Nf, and reductions (ti → ti+1)0≤i<n between them. If
there is no such sequence, this is denoted by t ⇑ and t is said to diverge.
Finally, if a term evaluates to an annotated value, this annotation is com-
patible with the dependency annotation that has been assigned to the term:
Theorem 3 (Semantic Soundness). If we have ∅ | ∅ ⊢te t : τ & ξ and t →∗
annℓ(v), then ∅ ⊢sub ℓ ≤ ξ.
    β ∉ {ᾱi}
    ──────────────────────────────────────────────── [P-Unit]
    ᾱi :: κ̄αi ⊢p unit^(β ᾱi) & β ᾱi ⇝ β :: κ̄αi ⇒ ⋆
The definition of a pattern is then extended to annotated types using the rules
from figure 9. Our definition is more precise than the one from previous work in
that it makes explicit which variables are expected to be bound and which are
free. We require that all variables with different names in the definition of these
rules are distinct from each other.
An annotated type and dependency pair τ & ξ is a pattern type under the
sort environment Σ if the judgment Σ ⊢p τ & ξ ⇝ Σ′ holds for some Σ′. We call
the variables in Σ argument variables and the variables in Σ′ pattern variables.
    ∀β1 :: ⋆. unit^β1 → (∀β2 :: ⋆. unit^β2 → unit^(β β1 β2))^(β′ β1)
Note that since β1 is quantified on the function arrow chain, it is passed on to the
second function arrow. However, it is not propagated into the second argument.
In general, annotations on the return type may depend on the annotations of all
previous arguments while annotations of the arguments may not. This prevents
any dependency between the annotations of arguments and guarantees that they
are as permissive as possible. This is also why pattern variables in a covariant
position are passed on to the next higher level while pattern variables in argu-
ments are quantified in the enclosing function arrow. This allows the caller of
a function to instantiate the dependency annotations of the parameters to the
actual arguments.
    β fresh
    ─────────────────────────────────────────────────────── [C-Unit]
    ᾱi :: κ̄αi ⊢c unit : unit^(β ᾱi) & β ᾱi ⇝ β :: κ̄αi ⇒ ⋆
1. τ̂ = unit, or
2. τ̂ = τ̂1^ξ1 + τ̂2^ξ2 and both τ̂1 and τ̂2 are conservative, or
3. τ̂ = τ̂1^ξ1 × τ̂2^ξ2 and both τ̂1 and τ̂2 are conservative, or
4. τ̂ = ∀β̄j :: κ̄j. τ̂1^ξ1 → τ̂2^ξ2 and both (a) ∅ ⊢p τ̂1 & ξ1 ⇝ β̄j :: κ̄j and (b) τ̂2 is
   conservative.
    f : ∀β′ :: ⋆ ⇒ ⋆. ∀β :: ⋆ ⇒ ⋆ ⇒ ⋆. ∀β3 :: ⋆.
        (∀β1 :: ⋆. unit^β1 → (∀β2 :: ⋆. unit^β2 → unit^(β β1 β2))^(β′ β1))^β3
        → unit^(β3 ⊔ β′ ⊥ ⊔ β ⊥ ⊥) & ⊥
Note that the pattern variables of the argument have been bound in the
top-level function type. This allows callers of f to instantiate these patterns.
We can extend the previous definition of pattern types to the type completion
relation shown in figure 10. It relates every underlying type τ with a pattern type
τ̂ such that τ̂ erases to τ. It is defined through judgments Σ ⊢c τ : τ̂ & ξ ⇝ Σ′ with
the meaning that under the sort environment Σ, τ is completed to the annotated
type τ̂ and the dependency annotation ξ containing the pattern variables Σ′.
The completion relation can also be interpreted as a function taking Σ and τ as
arguments and returning τ̂, ξ and Σ′.
Lastly, we revisit the examples from the previous sections and show how a
pattern type can be mechanically derived from an underlying type.
In example 1 we presented a pattern type for the underlying type unit →
unit → unit. Using the type completion relation, we can derive the pattern type

    (∀β1. unit^β1 → (∀β2. unit^β2 → unit^(β β1 β2))^(β′ β1)) & β3

without having to guess. This is because the components τ̂, ξ and Σ′ in a judg-
ment Σ ⊢c τ : τ̂ & ξ ⇝ Σ′ are uniquely determined by Σ and τ from looking at
the syntax alone. The resulting pattern type contains three pattern variables,
β′ :: ⋆ ⇒ ⋆, β :: ⋆ ⇒ ⋆ ⇒ ⋆ and β3 :: ⋆. If the initial sort environment is empty,
these are also the only free variables of the pattern type.
Based on the type completion relation we can define least type completions.
These are conservative types that are subtypes of all other conservative types of
the same shape. Therefore, all annotations occurring in positive positions on the
top level function arrow chain must also be least. We do not need to consider
arguments here because those are by definition equal up to alpha-conversion due
to being pattern types. We define the least annotation term of sort κ as
    ⊥⋆ = ⊥
    ⊥κ1⇒κ2 = λβ :: κ1. ⊥κ2

These least annotation terms correspond to the least elements of our bounded
lattice for a given sort κ. This in turn leads us to the definition of the least
completion of a type τ (see figure 10) by substituting all free variables in the
completion with the least annotation of the corresponding sort, i.e.

    ⊥τ = [⊥κi / βi] τ̂   for   ∅ ⊢c τ : τ̂ & ξ ⇝ β̄i :: κ̄i
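Computing ⊥κ is a one-line recursion on sorts; an OCaml sketch of ours (with fresh-name handling elided):

    type sort = Star | Arrow of sort * sort

    type ann_term = Bot | Lam of string * sort * ann_term  (* fragment used here *)

    let rec bot_of_sort = function
      | Star -> Bot                                        (* ⊥⋆ = ⊥ *)
      | Arrow (k1, k2) -> Lam ("b", k1, bot_of_sort k2)    (* ⊥κ1⇒κ2 = λβ :: κ1. ⊥κ2 *)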
The algorithm. We can now move on to the type reconstruction algorithm that
performs the actual analysis. At its core lies algorithm R shown in figure 11.
The input of the algorithm is a triple (Γ, Σ, t) consisting of a well-typed source
term t, an annotated type environment Γ providing the types and dependency
annotations of the free term variables in t, and a sort environment Σ mapping
each free annotation variable in scope to its sort. It returns a triple t̂ : τ̂ & ξ
consisting of an elaborated term t̂ in the target language (that erases to the
source term t), an annotated type τ̂ and a dependency annotation ξ such that
Σ | Γ ⊢te t̂ : τ̂ & ξ holds. In the definition of R, to avoid clutter, we write Γ
instead of Γ̂ because we are only dealing with one kind of type environment.
The algorithm relies on the invariant that all types in the type environment
and the inferred type must be conservative. In the version of [16], all inferred
dependency annotations (including those nested as annotations in types) had
to be canonically ordered as well. But it turned out that this canonically
ordered form was not enough for deciding semantic equality, so we lifted this
requirement. We still mark those places in the algorithm where canonicalization
would have occurred with ⌈·⌉, but the actual result of this operation does not
matter as long as the dependency terms remain equivalent.
The algorithm for computing the least upper bound of types (⊔ in figure 12)
requires that both types are conservative, have the same shape and use the same
names for bound variables. The latter can be ensured by α-conversion, while the
former two requirements are fulfilled by how this function is used in R.
The restriction to conservative types allows us to ignore function arguments,
because these are always required to be pattern types, which are unique up to
α-equivalence. This alleviates the need for computing a corresponding greatest
lower bound of types, because the algorithm only traverses covariant positions.
[Fig. 11: The type reconstruction algorithm R : AnnTyEnv × SortEnv × Tm → T̂m × T̂y × AnnTm, with cases for variables, unit, annℓ(t), seq, pairs, projections, injections, case, λ-abstraction, application, and fixpoints; the fixpoint case starts from ⊥τ & ⊥ and re-runs R until the inferred type and annotation stabilise (τi−1 ≡ τi ∧ ξi−1 ≡ ξi).]
    ⊔ : T̂y × T̂y → T̂y
    unit ⊔ unit = unit
    (τ̂1^ξ1 × τ̂2^ξ2) ⊔ (τ̂1′^ξ1′ × τ̂2′^ξ2′) = (τ̂1 ⊔ τ̂1′)^(ξ1 ⊔ ξ1′) × (τ̂2 ⊔ τ̂2′)^(ξ2 ⊔ ξ2′)
    (τ̂1^β → τ̂2^ξ2) ⊔ (τ̂1^β → τ̂2′^ξ2′) = τ̂1^β → (τ̂2 ⊔ τ̂2′)^(ξ2 ⊔ ξ2′)
    (∀β :: κ. τ̂) ⊔ (∀β :: κ. τ̂′) = ∀β :: κ. τ̂ ⊔ τ̂′

    I : T̂y → T̂y × SortEnv
    I(∀β :: κ. τ̂) = let τ̂′ ⇝ Σ′ = I(τ̂) in [β′/β]τ̂′ ⇝ β′ :: κ, Σ′    where β′ fresh
    I(τ̂) = τ̂ ⇝ [ ]

    M : SortEnv × T̂y × T̂y → AnnSubst
    M(Σ; unit; unit) = [ ]
    M(Σ; τ̂1^(β β̄i) × τ̂2^(β′ β̄i); τ1^ξ1 × τ2^ξ2) =
        [β ↦ λβ̄i :: Σ(β̄i). ξ1, β′ ↦ λβ̄i :: Σ(β̄i). ξ2] ∘ M(Σ; τ̂1; τ1) ∘ M(Σ; τ̂2; τ2)
    M(Σ; τ̂1^β → τ̂2^(β′ β̄i); τ1^β → τ2^ξ) = [β′ ↦ λβ̄i :: Σ(β̄i). ξ] ∘ M(Σ; τ̂2; τ2)
    M(Σ; ∀β :: κ. τ̂; ∀β :: κ. τ) = M(Σ, β :: κ; τ̂; τ)

Fig. 12: Least upper bound of types (⊔), completion (C), instantiation (I), and
matching (M). Rules for · + · in ⊔ and M are like those for · × ·.
The crucial part here is the termination of the fixpoint iteration. In order to show
the convergence of the fixpoint iteration, we start by defining an equivalence
relation on annotated type and dependency pairs.
Our type reconstruction algorithm handles polymorphic recursion through
Kleene-Mycroft iteration. Such an algorithm is based on fixpoint iteration and
needs a way to decide whether two dependency terms are equal according to the
denotational semantics of λ⊔.
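Schematically, the iteration looks as follows (an OCaml sketch of ours; reconstruct, equiv and bottom are assumed parameters standing for R, the semantic equivalence test, and the least completion ⊥τ & ⊥):

    let kleene_mycroft ~reconstruct ~equiv ~bottom body =
      let rec go approx =
        (* re-analyse the body with the recursive binder bound to the
           current approximation, as in the repeat/until loop of R *)
        let next = reconstruct approx body in
        if equiv approx next then next else go next
      in
      go bottom

Termination hinges on equiv: if semantically equal results are ever judged different, the chain of approximations never appears to stabilise, which is precisely the divergence problem discussed in the related work below.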
A straightforward way to decide semantic equivalence is to enumerate all
possible environments and compare the denotations of the two terms in all of
these (possibly after some semantics preserving normalization). This only works
if the dependency lattice L is finite.
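For a finite lattice and base-sort terms, this brute-force check is directly implementable; a sketch of ours (function sorts would additionally require enumerating the finite, but huge, space of monotone functions):

    (* Compare two terms under every environment mapping their free
       variables to elements of the finite lattice; intended for terms
       whose denotations are base-sort values. *)
    let equivalent eval elems free_vars t1 t2 =
      let rec envs = function
        | [] -> [ [] ]
        | v :: rest ->
            List.concat_map
              (fun env -> List.map (fun l -> (v, l) :: env) elems)
              (envs rest)
      in
      List.for_all (fun env -> eval env t1 = eval env t2) (envs free_vars)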
For some analyses, e.g., a slicing analysis in which the lattice consists of sets of
program locations, L is finite but large, and deciding equality in this fashion becomes imprac-
tical. To alleviate this problem, our prototype implementation applies a partial
canonicalization procedure which, while not complete, can serve as an approxi-
mation of equality: if two canonicalized dependency terms become syntactically
equal, then we can be assured that they are semantically equal, but if they are
not we can still apply the above procedure to the canonicalized dependency
terms. We omit formal details from the paper.
We can now state our completeness results for the type reconstruction al-
gorithm. Here, we write Γ ⊢t t : τ to say that term t has type τ under the
environment Γ in the underlying type system.
In an analysis with monomorphic recursion, the analysis assigns the same anno-
tation to both parameters, large enough to accommodate for both arguments.
This is due to the permutation of the arguments in the else branch. An analysis
with polymorphic recursion is allowed to use a different instantiation for f in
that case. Our algorithm hence infers the following most general type.
    ∀β1 :: ⋆. bool^β1 → (∀β2 :: ⋆. bool^β2 → bool^(β1 ⊔ β2))^⊥ & ⊥
We see that the result of the function indeed depends on the annotations of
both arguments, as both end up in the condition of the if-expression at some
point. Yet, both arguments are completely unrestricted, and unrelated in their
annotations. In contrast, a type system with monomorphic recursion would only
admit a weaker type, possibly similar to
    ∀β1 :: ⋆. bool^β1 → (bool^β1 → bool^β1)^⊥ & ⊥
A real-world example of this kind is Euclid’s algorithm for computing the
greatest common divisor (see [26]).
id : int → int
id = λx : int.x
In the case of both², the function parameter f can be instantiated separately for each
component because our analysis assigns it a type that universally quantifies over
² N.B. both is a simplified instance of a traversal ∀f. Applicative f ⇒ (Int → f Int) →
(Int, Int) → f (Int, Int), in order to fit the restrictions of the source language [6,15].
the annotation of its argument. It is evident from the type signature that the
components of the resulting pair only depend on the corresponding components
of the input pair, and the function and the input pair itself. They do not depend
on the respective other component of the input.
If we again consider the call both id p, we obtain β2 = λβ :: ⋆. β, β1 = β3 =
β5 = S and β4 = D through pattern unification. Normalization of the resulting
dependency terms results in the expected return type int^S × int^D.
The generality provided by the higher-ranked analysis extends to an arbitrar-
ily deep nesting of function arrows. The following example demonstrates this for
two levels of arrows. Functions with more than two levels of arrows can arise
directly in actual programs, but even more so in desugared code, e.g., when type
classes in Haskell are implemented via explicit dictionary passing. Due to limi-
tations of our source language, the examples are syntactically heavily restricted.
Consider the following function that takes a function argument which again
requires a function.
The higher-ranked analysis infers the following type and target term (where we
omitted the type in the argument of the lambda term because it essentially
repeats what is already visible in the top level type signature).
Since the type of f is a pattern type, the argument to f is also a pattern type by
definition. Therefore, the analysis of f depends on the analysis of the function
passed to it. This gives rise to the higher-order effect operator β3 [12]. Thus, f
can be applied to any function with a conservative type of the right shape. As our
algorithm always infers conservative types, the type of f is as general as possible.
This is reflected in the body of the lambda where in both cases f is instantiated
with the dependency annotation corresponding to the function passed to it. The
result of this instantiation can be observed in the returned product type where
β3 is applied to the effect operators λβ0 :: .β0 and λβ0 :: .S corresponding to
the respective functions used as arguments to f .
Only when we finally apply foo, the resulting annotations can be evaluated.
For bar we obtain foo bar : int^D × int^S & S. In this case, β3 = λβ2 ::
⋆. λβ1 :: ⋆ ⇒ ⋆. β1 D ⊔ β2, because bar applies its argument to a value with
dynamic binding time. This causes the first component of the returned pair to
be deemed dynamic as well. On the other hand, in the second component bar
is applied to a constant function. Thus, regardless of the argument’s dynamic
binding time, the resulting binding time is static. In a rank-1 system we would
get int^D × int^D instead of int^D × int^S.
8 Related Work
The basis for most type systems of functional programming languages is the
Hindley-Milner type system [22]. Our algorithm R strongly resembles the well-
known type inference algorithm for the Hindley-Milner type system, Algorithm
W [3], a distinct advantage of our approach. The idea to define an annotated
type system as a means to design static analyses for higher-order languages is
attributed to [19]. The major technical difference compared to a let-polyvariant
analysis is that our annotations form a simply typed lambda-calculus.
Full reconstruction for a higher-ranked polyvariant annotated type system
was first considered by [12] in the context of a control-flow analysis. However,
we found that the (constraint-based) algorithm as presented in [12] generates
constraints free of cycles. Therefore, it cannot faithfully reflect the constraints
necessary for the fixpoint combinator. The algorithm incorrectly concludes for
the following example that only the first and third False terms flow into the
condition x, but not the second one.
(fix (λf . λx . λy. λz . if x then True else f z x y)) False False False
We reproduced this mistake with their implementation and verified that the
mistake was not a simple bug in that implementation.
Close to our formulation is the (unpublished) work of [16] which deals with
exception analysis, which uses a simply typed lambda-calculus with sets to repre-
sent annotations. We have chosen a more modular approach in which we offload
much of the complexity of dealing with lattice values to the lattice. In [16] terms
from the simply typed lambda-calculus with sets are canonicalized and then
checked for alpha equivalence during Kleene-Mycroft iteration. We found how-
ever that two terms can have different canonical forms even though they are
actually semantically equivalent. This causes Koot’s reconstruction algorithm
to diverge on a particular class of programs, because the inferred annotations
continue to grow. The simplest such program we found is the following.
Abadi et al. [1] identified the common structure of a range of dependency analyses, including binding-time analysis ([29]), exception analysis [17,16], secure information flow analysis [9] and static
slicing [27]. They devised the Dependency Core Calculus (DCC) to which each
instance of a dependency analysis can be mapped. This allowed them to compare
different dependency analyses, uncover problems with existing instance analy-
ses and to simplify proofs of noninterference [8,20]. The instance analyses in
[1] were defined as a monovariant type and effect system with subtyping, for a
monomorphic call-by-name language. An implicit, let-polymorphic implementa-
tion of DCC, FlowCaml, was developed by [25]. It is not higher-ranked.
The difference between DCC and our analysis is to a large extent a different
focus: the DCC is a calculus defined in a way that any calculus that elaborates
to DCC has the noninterference property and any other properties proven for
the calculus. On the other hand, our analysis is meant to be implemented in a
compiler (with the added precision), and that implementation (and its associated
meta-theory) can then be reused inside the compiler for a variety of analyses.
Comparable to DCC, we have proven a noninterference property for our generic
higher-rank polyvariant dependency analysis, so that all its instances inherit it.
The Haskell community supports an implementation of DCC in which the
(security) annotations are lifted to the Haskell type level [2]. Since the GHC
compiler supports higher-rank types, the code written with this library can in
fact model security flows with higher-rank. Because of the general undecidability
of full reconstruction for higher-rank types [14], the programmer must however
provide explicit type information. In [18], the authors introduce dependent flow
types, that allows them to express a large variety of security policies. An essential
difference with our work is that our approach is fully automated.
Early on in our research, we observed that the approach of [11] may lead to
similar precision gains as higher-ranked annotations do. Since they deal with a
different analysis, a direct comparison is impossible to make at this time.
References
1. Abadi, M., Banerjee, A., Heintze, N., Riecke, J.G.: A core calculus of dependency.
In: Proceedings of the 26th ACM SIGPLAN-SIGACT symposium on Principles of
programming languages - POPL '99. Association for Computing Machinery (ACM)
(1999). https://doi.org/10.1145/292540.292555
2. Algehed, M., Russo, A.: Encoding DCC in Haskell. In: Proceedings of the 2017 Work-
shop on Programming Languages and Analysis for Security. pp. 77–89. PLAS ’17,
ACM, New York, NY, USA (2017). https://doi.org/10.1145/3139337.3139338
3. Damas, L., Milner, R.: Principal type-schemes for functional programs. In: Pro-
ceedings of the 9th ACM SIGPLAN-SIGACT symposium on Principles of pro-
gramming languages - POPL '82. Association for Computing Machinery (ACM)
(1982). https://doi.org/10.1145/582153.582176
4. Dowek, G.: Handbook of automated reasoning. chap. Higher-order Unification and
Matching, pp. 1009–1062. Elsevier Science Publishers B. V., Amsterdam, The
Netherlands (2001), http://dl.acm.org/citation.cfm?id=778522.778525
5. Dussart, D., Henglein, F., Mossin, C.: Polymorphic recursion and subtype qualifi-
cations: Polymorphic binding-time analysis in polynomial time. In: Static Analysis,
pp. 118–135. Springer Nature (1995). https://doi.org/10.1007/3-540-60360-3_36
6. Foster, J.N., Greenwald, M.B., Moore, J.T., Pierce, B.C., Schmitt, A.: Com-
binators for bidirectional tree transformations: A linguistic approach to the
view-update problem. ACM Trans. Program. Lang. Syst. 29(3) (May 2007).
https://doi.org/10.1145/1232420.1232424
7. Glynn, K., Stuckey, P.J., Sulzmann, M., Söndergaard, H.: Boolean constraints for
binding-time analysis. In: PADO ’01: Proceedings of the Second Symposium on
Programs as Data Objects. pp. 39–62. Springer-Verlag, London, UK (2001)
8. Goguen, J.A., Meseguer, J.: Security policies and security models. In:
1982 IEEE Symposium on Security and Privacy. pp. 11–11 (April 1982).
https://doi.org/10.1109/SP.1982.10014
9. Heintze, N., Riecke, J.G.: The SLam calculus. In: Proceedings of the 25th
ACM SIGPLAN-SIGACT symposium on Principles of programming lan-
guages - POPL '98. Association for Computing Machinery (ACM) (1998).
https://doi.org/10.1145/268946.268976
10. Henglein, F.: Type inference with polymorphic recursion. ACM Transac-
tions on Programming Languages and Systems 15(2), 253–289 (4 1993).
https://doi.org/10.1145/169701.169692
11. Hoffmann, J., Das, A., Weng, S.C.: Towards automatic resource bound analysis for
ocaml. In: Proceedings of the 44th ACM SIGPLAN Symposium on Principles of
Programming Languages. pp. 359–373. POPL 2017, ACM, New York, NY, USA
(2017). https://doi.org/10.1145/3009837.3009842
12. Holdermans, S., Hage, J.: Polyvariant flow analysis with higher-ranked
polymorphic types and higher-order effect operators. In: Proceedings of
the 15th ACM SIGPLAN international conference on Functional program-
ming - ICFP '10. Association for Computing Machinery (ACM) (2010).
https://doi.org/10.1145/1863543.1863554
13. Jones, S.P., Vytiniotis, D., Weirich, S., Shields, M.: Practical type inference for
arbitrary-rank types. Journal of Functional Programming 17(1), 1–82 (2007).
https://doi.org/10.1017/S0956796806006034
14. Kfoury, A., Tiuryn, J.: Type reconstruction in finite rank fragments of the
second-order λ-calculus. Information and Computation 98(2), 228–257 (6 1992).
https://doi.org/10.1016/0890-5401(92)90020-g
682 F. Thorand and J. Hage
15. Kmett, E.: The lens library (2018), http://lens.github.io/, consulted 9/7/2018
16. Koot, R.: Higher-ranked exception types (2015), https://github.com/ruudkoot/
phd/tree/master/higher-ranked-exception-types, accessed 2018-03-09
17. Koot, R., Hage, J.: Type-based exception analysis for non-strict higher-
order functional languages with imprecise exception semantics. In: Proceed-
ings of the 2015 Workshop on Partial Evaluation and Program Manipu-
lation - PEPM '15. Association for Computing Machinery (ACM) (2015).
https://doi.org/10.1145/2678015.2682542
18. Lourenço, L., Caires, L.: Dependent information flow types. In: Proceedings of the
42Nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Program-
ming Languages. pp. 317–328. POPL ’15, ACM, New York, NY, USA (2015).
https://doi.org/10.1145/2676726.2676994
19. Lucassen, J.M., Gifford, D.K.: Polymorphic effect systems. In: POPL ’88:
Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles
of programming languages. pp. 47–57. ACM, New York, NY, USA (1988).
https://doi.org/10.1145/73560.73564
20. McLean, J.: Security Models. Wiley Press (1994).
https://doi.org/10.1002/0471028959
21. Miller, D.: A logic programming language with lambda-abstraction, function vari-
ables, and simple unification. In: Extensions of Logic Programming, pp. 253–281.
Springer Nature (1991). https://doi.org/10.1007/bfb0038698
22. Milner, R.: A theory of type polymorphism in programming. Journal of Computer
and System Sciences 17(3), 348–375 (12 1978). https://doi.org/10.1016/0022-
0000(78)90014-4
23. Mycroft, A.: Polymorphic type schemes and recursive definitions. In: Lec-
ture Notes in Computer Science, pp. 217–228. Springer Nature (1984).
https://doi.org/10.1007/3-540-12925-1_41
24. Nielson, F., Nielson, H., Hankin, C.: Principles of Program Analysis. Springer
Verlag, second printing edn. (2005)
25. Pottier, F., Simonet, V.: Information flow inference for ML. ACM Trans. Program.
Lang. Syst. 25(1), 117–158 (Jan 2003). https://doi.org/10.1145/596980.596983
26. Thorand, F., Hage, J.: Addendum with proofs, definitions and examples for the
ESOP 2020 paper “Higher-Ranked Annotation Polymorphic Dependency Analysis”,
http://www.staff.science.uu.nl/~hage0101/downloads/hrp-addendum.pdf
27. Tip, F.: A survey of program slicing techniques. Tech. rep., Amsterdam, The
Netherlands, The Netherlands (1994)
28. Wansbrough, K., Jones, S.P.: Once upon a polymorphic type. In: Proceedings
of the 26th ACM SIGPLAN-SIGACT symposium on Principles of programming
languages - POPL '99. Association for Computing Machinery (ACM) (1999).
https://doi.org/10.1145/292540.292545
29. Zhang, G.: Binding-Time Analysis: Subtyping versus Subeffecting. MSc thesis
(2008), http://people.cs.uu.nl/jur/downloads/guangyuzhang-msc.pdf
ConSORT: Context- and Flow-Sensitive
Ownership Refinement Types for Imperative
Programs
1 Introduction
Driven by the increasing power of automated theorem provers and recent high-
profile software failures, fully automated program verification has seen a surge
of interest in recent years [5, 10, 15, 29, 38, 66]. In particular, refinement types
[9, 21, 24, 65], which refine base types with logical predicates, have been shown to
be a practical approach for program verification that are amenable to (sometimes
full) automation [47, 61, 62, 63]. Despite promising advances [26, 32, 46], the sound
and precise application of refinement types (and program verification in general)
in settings with mutability and aliasing (e.g., Java, Ruby, etc.) remains difficult.
One of the major challenges is how to precisely and soundly support strong
updates for the invariants on memory cells. In a setting with mutability, a single
invariant may not necessarily hold throughout the lifetime of a memory cell; while
the program mutates the memory the invariant may change or evolve. To model
these changes, a program verifier must support different, incompatible invariants
which hold at different points during program execution. Further, precise program
verification requires supporting different invariants on distinct pieces of memory.
One solution is to use refinement types on the static program names (i.e.,
variables) which point to a memory location. This approach can model evolving
invariants while tracking distinct invariants for each memory cell. For example,
consider the (contrived) example in Figure 1. This program is written in an ML-
like language with mutable references; references are updated with := and allo-
cated with mkref. Variable p can initially be given the type {ν : int | ν = 3} ref ,
indicating it is a reference to the integer 3. Similarly, q can be given the type
{ν : int | ν = 5} ref . We can model the mutation of p’s memory on line 5 by
strongly updating p’s type to {ν : int | ν = 4} ref .
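A small program of the same shape makes this discussion concrete (a sketch of ours, not the actual contents of Figure 1); the helper alloc gives p and q a single shared allocation site:

    let alloc = (fun x -> mkref x) in  (* the single allocation site, "line 1" *)
    let p = alloc 3 in                 (* p : {ν : int | ν = 3} ref *)
    let q = alloc 5 in                 (* q : {ν : int | ν = 5} ref *)
    p := 4;                            (* "line 5": strong update of p's type *)
    assert(*p = 4)                     (* "line 6": needs p's refinement ν = 4 *)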
Unfortunately, the precise application of this technique is confounded by the
existence of unrestricted aliasing. In general, updating just the type of the mutated
reference is insufficient: due to aliasing, other variables may point to the mutated
memory and their refinements must be updated as well. However, in the presence
of conditional, may aliasing, it is impossible to strongly update the refinements on
all possible aliases; given the static uncertainty about whether a variable points to
the mutated memory, that variable’s refinement may only be weakly updated. For
example, suppose we used a simple alias analysis that imprecisely (but soundly)
concluded all references allocated at the same program point might alias. Variables
p and q share the allocation site on line 1, so on line 5 we would have to weakly
update q’s type to {ν : int | ν = 4 ∨ ν = 5}, indicating it may hold either 4 or
5. Under this same imprecise aliasing assumption, we would also have to weakly
update p’s type on line 6, preventing the verification of the example program.
Given the precision loss associated with weak updates, it is critical that
verification techniques built upon refinement types use precise aliasing information
and avoid spuriously applied weak updates. Although it is relatively simple to
conclude that p and q do not alias in Figure 1, consider the example in Figure 2.
(In this example, ⋆ represents non-deterministic values.) Verifying this program
requires proving a and b never alias at the writes on lines 3 and 4. In fact, a
and b may point to the same memory location, but only in different invocations
of loop; this pattern may confound even sophisticated symbolic alias analyses.
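The aliasing pattern just described can be sketched as follows (again a sketch of ours, not the actual Figure 2): within one invocation a and b are distinct, but the recursive call passes b in as the next invocation's a.

    let rec loop a b =
      a := 1;             (* cf. the write on "line 3" *)
      b := 2;             (* cf. the write on "line 4" *)
      assert(*a = 1);     (* holds only if a and b never alias here *)
      if ⋆ then loop b (mkref 0) else 0
    in loop (mkref 0) (mkref 0)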
2 Target Language
This section describes a simple imperative language with mutable references and
first-order, recursive functions.
2.1 Syntax
We assume a set of variables, ranged over by x, y, z, . . . , a set of function names,
ranged over by f , and a set of labels, ranged over by ℓ1, ℓ2, . . . . The grammar of
the language is as follows.
As a convenience, we assume all variable names introduced with let bindings and
function parameters are distinct.
Unlike ML (and like C or Java) we do not allow general expressions on the
right hand side of let bindings. The simplest right hand forms are a variable y or
an integer literal n. mkref y creates a reference cell with value y, and ∗y accesses
the contents of reference y. For simplicity, we do not include an explicit null value;
an extension to support null is discussed in Section 4. Function calls must occur
on the right hand side of a variable binding and take the form f^ℓ(x1, . . . , xn),
where x1, . . . , xn are distinct variables and ℓ is a (unique) label. These labels are
used to make our type system context-sensitive, as discussed in Section 3.3.
The single base case for expressions is a single variable. If the variable
expression is executed in a tail position of a function, then the value of that
variable is the return value of the function, otherwise the value is ignored.
The only intraprocedural control-flow operations in our language are if state-
ments. ifz checks whether the condition variable x equals zero and chooses the
corresponding branch. Loops can be implemented with recursive functions and
we do not include them explicitly in our formalism.
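For example, a simple countdown loop de-sugars into a recursive function (our illustration, using the paper-syntax conventions below, with ℓ a call-site label):

    loop(x) =
      ifz x then 0
      else let y = x - 1 in
           let r = loop^ℓ(y) in r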
Our grammar requires that the side-effecting, result-free statements assert(ϕ),
alias(x = y), alias(x = ∗y), and assignment x := y are followed by a continu-
ation expression. We impose this requirement for technical reasons to ease our
formal presentation; this requirement does not reduce expressiveness as dummy
continuations can be inserted as needed. The assert(ϕ) ; e form executes e if
the predicate ϕ holds in the current state and aborts the program otherwise.
alias(x = y) ; e and alias(x = ∗y) ; e assert a must-aliasing relationship between
x and y (resp. x and ∗y) and then execute e. alias statements are effectively an-
notations that our type system exploits to gain added precision. x := y ; e updates
the contents of the memory cell pointed to by x with the value of y. In addition
to the above continuations, our language supports general sequencing with e1 ; e2 .
A program is a pair ⟨D, e⟩, where D = {d1, . . . , dn} is a set of first-order,
mutually recursive function definitions, and e is the program entry point. A
function definition d maps the function name to a tuple of argument names
x1 , ... , xn that are bound within the function body e.
Paper Syntax. In the remainder of the paper, we will write programs that are
technically illegal according to our grammar, but can be easily “de-sugared” into
an equivalent, valid program. For example, we will write
let x = mkref 4 in assert(*x = 4)
as syntactic sugar for:
let f = 4 in let x = mkref f in
let tmp = *x in assert(tmp = 4); let dummy = 0 in dummy
Reduction is defined over configurations ⟨H, R, F, e⟩, where H is a heap mapping
addresses to values, R is a register file mapping variables to values, F is a stack
of return contexts of the form E[let x = [] in e], and e is the currently reducing
expression. The rules of the reduction relation −→D are:

(R-Var)        ⟨H, R, E′ :: F, x⟩ −→D ⟨H, R, F, E′[x]⟩
               (where E′ is the return context popped from the stack)
(R-Seq)        ⟨H, R, F, E[x ; e]⟩ −→D ⟨H, R, F, E[e]⟩
(R-Let)        if x′ ∉ dom(R):
               ⟨H, R, F, E[let x = y in e]⟩ −→D ⟨H, R{x′ ↦ R(y)}, F, E[[x′/x]e]⟩
(R-LetInt)     if x′ ∉ dom(R):
               ⟨H, R, F, E[let x = n in e]⟩ −→D ⟨H, R{x′ ↦ n}, F, E[[x′/x]e]⟩
(R-Call)       if f ↦ (x1, . . . , xn) e′ ∈ D:
               ⟨H, R, F, E[let x = f^ℓ(y1, . . . , yn) in e]⟩ −→D
               ⟨H, R, E[let x = [] in e] :: F, [y1/x1] · · · [yn/xn]e′⟩
(R-IfTrue)     if R(x) = 0:
               ⟨H, R, F, E[ifz x then e1 else e2]⟩ −→D ⟨H, R, F, E[e1]⟩
(R-IfFalse)    if R(x) ≠ 0:
               ⟨H, R, F, E[ifz x then e1 else e2]⟩ −→D ⟨H, R, F, E[e2]⟩
(R-MkRef)      if a ∉ dom(H) and x′ ∉ dom(R):
               ⟨H, R, F, E[let x = mkref y in e]⟩ −→D ⟨H{a ↦ R(y)}, R{x′ ↦ a}, F, E[[x′/x]e]⟩
(R-Deref)      if R(y) = a, H(a) = v, and x′ ∉ dom(R):
               ⟨H, R, F, E[let x = ∗y in e]⟩ −→D ⟨H, R{x′ ↦ v}, F, E[[x′/x]e]⟩
(R-Assign)     if R(x) = a:
               ⟨H, R, F, E[x := y ; e]⟩ −→D ⟨H{a ↦ R(y)}, R, F, E[e]⟩
(R-Alias)      if R(x) = R(y):
               ⟨H, R, F, E[alias(x = y) ; e]⟩ −→D ⟨H, R, F, E[e]⟩
(R-AliasPtr)   if R(y) = a and H(a) = R(x):
               ⟨H, R, F, E[alias(x = ∗y) ; e]⟩ −→D ⟨H, R, F, E[e]⟩
(R-Assert)     if ⊨ [R]ϕ:
               ⟨H, R, F, E[assert(ϕ) ; e]⟩ −→D ⟨H, R, F, E[e]⟩
(R-AssertFail) if ⊭ [R]ϕ:
               ⟨H, R, F, E[assert(ϕ) ; e]⟩ −→D AssertFail
For function calls (rule R-Call), the return context E[let x = [] in e] is
prepended onto the stack of the input configuration. The substitution of actual
arguments for formal parameters in e′, denoted by [y1/x1] · · · [yn/xn]e′, becomes
the currently reducing expression in the output configuration. Function returns are
handled by R-Var. Our semantics returns values by name; when the currently
executing function fully reduces to a single variable x, x is substituted into the
return context on the top of the stack, denoted by E[let y = [] in e][x].
In rule R-Assert we write ⊨ [R]ϕ to mean that the formula yielded
by substituting the concrete values in R for the variables in ϕ is valid within
some chosen logic (see Section 3.1); in R-AssertFail we write ⊭ [R]ϕ when
the formula is not valid. The substitution operation [R]ϕ is defined inductively
as [∅]ϕ = ϕ, [R{x ↦ n}]ϕ = [R][n/x]ϕ, and [R{x ↦ a}]ϕ = [R]ϕ. In the case of an
assertion failure, the semantics steps to a distinguished configuration AssertFail.
The goal of our type system is to show that no execution of a well-typed program
may reach this configuration. The alias form checks whether the two references
actually alias; i.e., if the must-alias assertion provided by the programmer is
correct. If not, our semantics steps to the distinguished AliasFail configuration.
Our type system does not guarantee that AliasFail is unreachable; aliasing
assertions are effectively trusted annotations that are assumed to hold.
In order to avoid duplicate variable names in our register file due to recursive
functions, we refresh the bound variable x in a let expression to a fresh x′. Take the
expression let x = y in e as an example; we substitute a fresh variable x′ for x in e,
then bind x′ to the value of variable y. We assume this refreshing of variables
preserves our assumption that all variable bindings introduced with let and function
parameters are unique, i.e., x′ does not overlap with variable names that occur in the program.
3 Typing
The syntax of types is given in Figure 6. Our type system has two type con-
structors: references and integers. τ ref r is the type of a (non-null) reference to a
value of type τ . r is an ownership which is a rational number in the range [0, 1].
An ownership of 0 indicates a reference that cannot be written, and for which
there may exist a mutable alias. By contrast, 1 indicates a pointer with exclusive
ownership that can be read and written. Reference types with ownership values
between these two extremes indicate a pointer that is readable but not writable,
and for which no mutable aliases exist. ConSORT ensures that these invariants
hold while aliases are created and destroyed during execution.
Integers are refined with a predicate ϕ. The language of predicates is built using
the standard logical connectives of first-order logic, with (in)equality between
variables and integers, and atomic predicate symbols φ as the basic atoms. We
include a special “value” variable ν representing the value being refined by the
predicate. For simplicity, we omit the connectives ϕ1 ∧ ϕ2 and ϕ1 =⇒ ϕ2 ; they
can be written as derived forms using the given connectives. We do not fix a
particular theory from which φ are drawn, provided a sound (but not necessarily
complete) decision procedure exists. CP are context predicates, which are used
for context sensitivity as explained below.
Example 1. {ν : int | ν > 0} is the type of strictly positive integers. The type
of immutable references to integers exactly equal to 3 can be expressed by
{ν : int | ν = 3} ref 0.5 .
Function Types, Contexts, and Context Polymorphism. Our type system achieves
context sensitivity by allowing function types to depend on where a function is
called, i.e., the execution context of the function invocation. Our system represents
concrete execution contexts with strings of call site labels (or just “call strings”),
defined by c ::= ε | ℓ : c. As is standard (e.g., [49, 50]), the string ℓ : c abstracts an
execution context where the most recent, active function call occurred at call site
ℓ, which itself was executed in a context abstracted by c; ε is the context under
which program execution begins. Context variables, drawn from a finite domain
CVar and ranged over by λ1, λ2, . . ., represent arbitrary, unknown contexts.
A function type takes the form ∀λ. x1 : τ1, . . . , xn : τn → x1 : τ1′, . . . , xn : τn′ | τ.
The arguments of a function are an n-ary tuple of types τi. To model side-effects on
arguments, the function type includes the same number of output types τi′. In ad-
dition, function types have a direct return type τ. The argument and output types
are given names: refinements within the function type may refer to these names.
Function types in our language are context polymorphic, expressed by universal
quantification “∀λ.” over a context variable. Intuitively, this context variable repre-
sents the many different execution contexts under which a function may be called.
Argument and return types may depend on this context variable by including
context query predicates in their refinements. A context query predicate CP
usually takes the form c ≼ λ, and is true iff c is a prefix of the concrete context
represented by λ. Intuitively, a refinement c ≼ λ =⇒ ϕ states that ϕ holds in any
concrete execution context with prefix c, and provides no information in any other
context. In full generality, a context query predicate may be of the form c1 ≼ c2
or c1 ≼ ℓ1 : · · · : ℓn : λ; these forms may be immediately simplified to ⊤, ⊥, or a query c ≼ λ.
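As a worked instance of this simplification (using the prefix reading of ≼ given above): instantiating λ with the concrete context ℓ1 : ε turns the query (ℓ1 : ε) ≼ λ into ⊤ and the query (ℓ2 : ε) ≼ λ into ⊥ (for distinct labels ℓ1 and ℓ2). A refinement (ℓ1 : ε) ≼ λ =⇒ ν > 0 therefore strengthens to ν > 0 under the first instantiation and trivializes to ⊤ under the second.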
As types in our type system may contain context variables, our typing
judgment (introduced below) includes a typing context L, which is either a
single context variable λ or a concrete context c. This typing context represents
the assumptions about the execution context of the term being typed. If the
typing context is a context variable λ, then no assumptions are made about the
execution context of the term, although types may depend upon λ with context
query predicates. Accordingly, function bodies are typed under the context
variable universally quantified over in the corresponding function type; i.e., no
assumptions are made about the exact execution context of the function body.
Function type environments are denoted with Θ and are finite maps from
function names (f ) to function types (σ).
We now introduce the type system for the intraprocedural fragment of our
language. Accordingly, this section focuses on the interplay of mutability and
refinement types. The typing rules are given in Figures 7 and 8. A typing judgment
takes the form Θ | L | Γ ⊢ e : τ ⇒ Γ′, which indicates that e is well-typed under
a function type environment Θ, typing context L, and type environment Γ,
evaluates to a value of type τ, and transforms the input environment into Γ′.
Any valid typing derivation must have L ⊢ Γ WF, L ⊢ Γ′ WF, and L | Γ′ ⊢ τ WF,
i.e., the input and output type environments and the result type must be well-formed.
Three representative rules of Figure 7 are:

(T-Var)     Θ | L | Γ[x : τ1 + τ2] ⊢ x : τ1 ⇒ Γ[x ← τ2]

(T-LetInt)  Θ | L | Γ, x : {ν : int | ν = n} ⊢ e : τ ⇒ Γ′      x ∉ dom(Γ′)
            ----------------------------------------------------------------
            Θ | L | Γ ⊢ let x = n in e : τ ⇒ Γ′

(T-If)      Θ | L | Γ[x ← {ν : int | ϕ ∧ ν = 0}] ⊢ e1 : τ ⇒ Γ′
            Θ | L | Γ[x ← {ν : int | ϕ ∧ ν ≠ 0}] ⊢ e2 : τ ⇒ Γ′
            ----------------------------------------------------------------
            Θ | L | Γ[x : {ν : int | ϕ}] ⊢ ifz x then e1 else e2 : τ ⇒ Γ′
The typing rules in Figure 7 handle the relatively standard features in our
language. The rule T-Seq for sequential composition is fairly straightforward
except that the output type environment for e1 is the input type environment for
e2. T-LetInt is also straightforward; since x is bound to a constant, it is given
type {ν : int | ν = n} to indicate x is exactly n. The output type environment Γ′
cannot mention x (expressed with x ∉ dom(Γ′)) to prevent x from escaping its
scope. This requirement can be met by applying the subtyping rule (see below) to
weaken refinements to no longer mention x. As in other refinement type systems
[47], this requirement is critical for ensuring soundness.
Rule T-Let is crucial to understanding our ownership type system. The
body of the let expression e is typechecked under a type environment where
the type of y in Γ is linearly split into two types: τ1 for y and τ2 for the newly
created binding x . This splitting is expressed using the + operator. If y is a ref-
erence type, the split operation distributes some portion of y’s ownership infor-
mation to its new alias x . The split operation also distributes refinement infor-
mation between the two types. For example, type {ν : int | ν > 0} ref 1 can be
split into (1) {ν : int | ν > 0} ref r and {ν : int | ν > 0} ref (1−r) (for r ∈ (0, 1)),
or (2) {ν : int | ν > 0} ref 1 and {ν : int | ⊤} ref 0, in which case the second type
retains no refinement information.
(T-Sub)     Γ ≤ Γ1      Θ | L | Γ1 ⊢ e : τ1 ⇒ Γ2      Γ2, τ1 ≤ Γ′, τ
            ----------------------------------------------------------------
            Θ | L | Γ ⊢ e : τ ⇒ Γ′

τ1 ≈ τ2 iff • ⊢ τ1 ≤ τ2 and • ⊢ τ2 ≤ τ1 (with • the empty environment).

(S-Int)     Γ ⊨ ϕ1 =⇒ ϕ2
            ----------------------------------------------------------------
            Γ ⊢ {ν : int | ϕ1} ≤ {ν : int | ϕ2}

(S-TyEnv)   ∀x ∈ dom(Γ′). Γ ⊢ Γ(x) ≤ Γ′(x)
            ----------------------------------------------------------------
            Γ ≤ Γ′

(S-Ref)     r1 ≥ r2      Γ ⊢ τ1 ≤ τ2
            ----------------------------------------------------------------
            Γ ⊢ τ1 ref r1 ≤ τ2 ref r2

(S-Res)     Γ, x : τ ≤ Γ′, x : τ′      x ∉ dom(Γ)
            ----------------------------------------------------------------
            Γ, τ ≤ Γ′, τ′
As described thus far, the type system is quite strict: if ownership has been
completely transferred from one reference to another, the refinement information
found in the original reference is effectively useless. Additionally, once a mutable
pointer has been split through an assignment or let expression, there is no
way to recover mutability. The typing rules for must-alias assertions, T-Alias
and T-AliasPtr, overcome this restriction by exploiting the must-aliasing
information to “shuffle” or redistribute ownerships and refinements between two
aliased pointers. The typing rule assigns two fresh types τ1 ref r1 and τ2 ref r2 to
the two operand pointers. The choice of τ1 , r1 , τ2 , and r2 is left open provided
that the sum of the new types, (τ1 ref r1 ) + (τ2 ref r2 ) is equivalent (denoted ≈)
to the sum of the original types. Formally, ≈ is defined as in Figure 8; it implies
that any refinements in the two types must be logically equivalent and that
ownerships must also be equal. This redistribution is sound precisely because the
two references are assumed to alias; the total ownership for the single memory
cell pointed to by both references cannot be increased by this shuffling. Further,
any refinements that hold for the contents of one reference must necessarily hold
for contents of the other and vice versa.
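As a worked instance (under the reading of + as conjoining refinements and summing ownerships): if both operands enter T-Alias with type {ν : int | ν = 3} ref 0.5, the rule may re-assign them {ν : int | ν = 3} ref 1 and {ν : int | ⊤} ref 0. The ownership sums agree (0.5 + 0.5 = 1 + 0) and the conjoined refinements (ν = 3 ∧ ν = 3 versus ν = 3 ∧ ⊤) are logically equivalent, so the two sums are related by ≈.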
The aliasing rules give fine-grained control over ownership information. This
flexibility allows mutation through two or more aliased references within the
same scope. Provided sufficient aliasing annotations, the type system may shuffle
ownerships between one or more live references, enabling and disabling mutability
as required. Although the reliance on these annotations appears to decrease the
practicality of our type system, we expect these aliasing annotations can be
inserted by a conservative must-aliasing analysis. Further, empirical experience
from our prior work [56] indicates that only a small number of annotations are
required for larger programs.
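The following sketch illustrates the shuffling discussed next (a hypothetical example in the sugar of Section 2.1; the ownership annotations in (* ... *) are informal and show one admissible typing):

  let x = mkref 0 in      (* x : {ν : int | ν = 0} ref 1           *)
  let y = x in            (* split: x keeps ownership 1, y gets 0  *)
  alias(x = y) ;          (* shuffle all ownership from x to y     *)
  y := 1 ;                (* permitted: y now has ownership 1      *)
  alias(x = y) ;          (* shuffle again, e.g. splitting 0.5/0.5 *)
  assert(∗x = 1) ; ...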
After the first aliasing statement the type system shuffles the (exclusive) mutability
between x and y to enable the write to y. After the second aliasing statement
the ownership in y is split with x ; note that transferring all ownership from y to
x would also yield a valid typing.
Finally, we describe the subtyping rule. The rules for subtyping types and
environments are shown in Figure 9. For integer types, the rules require that the
refinement of the supertype be a logical consequence of the subtype's refinement
conjoined with the lifting of Γ. The subtyping rule for references is covariant in
the type of reference contents. It is widely known that in a language with un-
restricted aliasing and mutable references such a rule is unsound: after a write
into the coerced pointer, reads from an alias may yield a value disallowed by
the alias’ type [43]. However, as in the assign case, ownership types prevent un-
soundness; a write to the coerced pointer requires the pointer to have ownership
1, which guarantees any aliased pointers have the maximal type and provide no
information about their contents beyond simple types.
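A sketch of the classic counterexample, and of how ownerships block it (hypothetical; the (* ... *) comments are informal):

  let x = mkref 3 in      (* x : {ν : int | ν = 3} ref 1                  *)
  let y = x in            (* suppose y is weakened to {ν : int | ν > 0}   *)
  y := 5 ;                (* the write demands ownership 1 for y ...      *)
  assert(∗x = 3) ; ...    (* ... forcing x to ownership 0 and a trivial
                             refinement, so this false assertion is rejected *)

Without ownerships, x could retain {ν : int | ν = 3} and the final assertion would be (unsoundly) provable even though the cell holds 5.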
In the rule for function calls, the argument types of the callee are instantiated
with a substitution σα and compared against the types of the actual arguments
(post substitution). The body of the let binding is then checked with
the argument types updated to reflect the changes in the function call (again,
post substitution). This update is well-defined because we require all function
arguments to be distinct as described in Section 2.1. Intuitively, the substitution σα
represents incrementally refining the behavior of the callee function with partial
context information. If L is itself a context variable λ′, this substitution effectively
transforms any context prefix queries over λ in the argument/return/output
types into queries over ℓ : λ′. In other words, while the exact concrete execution
context of the callee is unknown, the context must at least begin with ℓ, which
can potentially rule out certain behaviors.
Rule T-FunDef type checks a function definition f ↦ (x1, . . . , xn) e against
the function type given in Θ. As a convenience we assume that the parameter
names in the function type match the formal parameters in the function definition.
The rule checks that under an initial environment given by the argument types the
function body produces a value of the return type and transforms the arguments
according to the output types. As mentioned above, functions may be executed
under many different contexts, so type checking the function body is performed
under the context variable λ that occurs in the function type.
Finally, the rule for typing programs (T-Prog) checks that all function
definitions are well typed under a well-formed function type environment, and
that the entry point e is well typed in an empty type environment and the typing
context ε, i.e., the initial context.
Applying the substitution [ℓ3 : λ/λ′] to the argument type of get_real yields
exactly the type of p. A similar derivation applies to the return type of
get_real and thus to get.
3.4 Soundness

We have proven that any program that type checks according to the rules above
will never experience an assertion failure. We formalize this claim with the
following soundness theorem.

Theorem 1 (Soundness). If the program ⟨D, e⟩ is well typed, then no configuration
reachable from the initial configuration for e is AssertFail.

Proof (Sketch). By standard progress and preservation lemmas; the full proof
has been omitted for space reasons and can be found in the full version [60].
4 Implementation

4.1 Inference
Our tool first runs a standard, simple type inference algorithm to generate type
templates for every function parameter type, return type, and for every live
variable at each program point. For a variable x of simple type τS ::= int | τS ref
at program point p, ConSORT generates a type template ⌊τS⌋x,0,p as follows:

⌊int⌋x,n,p = {ν : int | ϕx,n,p(ν; FVp)}        ⌊τS ref⌋x,n,p = ⌊τS⌋x,n+1,p ref rx,n,p
ϕx ,n,p (ν; FVp ) denotes a fresh relation symbol applied to ν and the free variables
of simple type int at program point p (denoted FVp ). rx ,n,p is a fresh ownership
variable. For each function f, there are two synthetic program points, fb and fe,
for the beginning and end of the function respectively. At both points, ConSORT
generates a type template for each argument, where FVfb and FVfe are the names
of integer-typed parameters. At fe, ConSORT also generates a type template
for the return value. We write Γp to indicate the type environment at point p,
where every variable is mapped to its corresponding type template. The lifting of
Γp is thus equivalent to ⋀x∈FVp ϕx,0,p(x; FVp).
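For instance, unfolding the template definition once, a variable x of simple type int ref live at point p receives the template {ν : int | ϕx,1,p(ν; FVp)} ref rx,0,p: the reference itself gets the ownership variable at index 0 and its integer contents get the relation symbol at index 1, which is exactly the shape the well-formedness constraints below are stated over.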
When generating these type templates, our implementation also generates own-
ership well-formedness constraints. Specifically, for a type template of the form
{ν : int | ϕx,n+1,p(ν; FVp)} ref rx,n,p ConSORT emits the constraint
rx,n,p = 0 =⇒ ϕx,n+1,p(ν; FVp), and for a type template (τ ref rx,n+1,p) ref rx,n,p
ConSORT emits the constraint rx,n,p = 0 =⇒ rx,n+1,p = 0.
ConSORT then walks the program, generating constraints between relation
symbols and ownership variables according to the typing rules. These constraints
take three forms, ownership constraints, subtyping constraints, and assertion
constraints. Ownership constraints are simple linear (in)equalities over ownership
variables and constants, according to conditions imposed by the typing rules.
For example, if variable x has the type template τ ref rx,0,p for the expression
x := y ; e at point p, ConSORT generates the constraint rx,0,p = 1.
ConSORT emits subtyping constraints between the relation symbols at
related program points according to the rules of the type system. For example, for
the term let x = y in e at program point p (where e is at program point p′, and x
has simple type int ref), ConSORT generates subtyping constraints relating the
relation symbols of y at p to those of x and y at p′ (reflecting the + split in rule T-Let).
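Concretely, such subtyping constraints are Horn clauses over the fresh relation symbols; for the let binding above, one generated clause has roughly the form

  ∀ν, x̄. ϕy,0,p(ν; x̄) =⇒ ϕy,0,p′(ν; x̄)

with a second, analogous clause for ϕx,0,p′ (a sketch; the precise bookkeeping of the free variables x̄ depends on which integer variables are live at p and p′).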
Solving Constraints. The results of the above process are two systems of constraints:
real arithmetic constraints over ownership variables and constrained Horn
clauses (CHC) over the refinement relations. Under certain assumptions about the
simple types in a program, the size of the ownership and subtyping constraints
is polynomial in the size of the program. These systems are not independent; the
relation constraints may mention the value of ownership variables due to the well-
formedness constraints described above. The ownership constraints are first solved
with Z3 [16]. These constraints are non-linear but Z3 appears particularly well-
engineered to quickly find solutions for the instances generated by ConSORT. We
constrain Z3 to maximize the number of non-zero ownership variables to ensure as
few refinements as possible are constrained to be ⊤ by ownership well-formedness.
The values of ownership variables inferred by Z3 are then substituted into the
constrained Horn clauses, and the resulting system is checked for satisfiability
with an off-the-shelf CHC solver. Our implementation generates constraints in
the industry standard SMT-Lib2 format [8]; any solver that accepts this format
can be used as a backend for ConSORT. Our implementation currently supports
Spacer [37] (part of the Z3 solver [16]), HoICE [13], and Eldarica [48] (adding a
new backend requires only a handful of lines of glue code). We found that different
solvers are better tuned to different problems; we also implemented a parallel mode
which runs all supported solvers in parallel, using the first available result.
4.2 Extensions
Primitive Operations. As defined in Section 2, our language can compare integers
to zero and load and store them from memory, but can perform no meaningful
computation over these numbers. To promote the flexibility of our type system
and simplify our soundness statement, we do not fix a set of primitive operations
and their static semantics. Instead, we assume any set of primitive operations
used in a program are given sound function types in Θ. For example, under the
assumption that + has its usual semantics and the underlying logic supports +, we
can give + the type ∀λ. x : {ν : int | ⊤}, y : {ν : int | ⊤} → x : {ν : int | ⊤}, y : {ν : int | ⊤} | {ν : int | ν = x + y}.
Interactions with a nondeterministic environment or unknown program inputs
can then be modeled with a primitive that returns integers refined with ⊤.
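For example, under the type for + given above (and writing arithmetic in the paper's sugar, with the call label elided), refinements propagate through primitive calls just as through ordinary ones:

  let a = 3 in            (* a : {ν : int | ν = 3}     *)
  let b = 4 in            (* b : {ν : int | ν = 4}     *)
  let c = a + b in        (* c : {ν : int | ν = a + b} *)
  assert(c = 7) ; ...     (* provable: 3 + 4 = 7       *)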
Recursive Types. Our language also supports some unbounded heap structures
via recursive reference types. To keep inference tractable, we forbid nested recur-
sive types, multiple occurrences of the recursive type variable, and additionally
fix the shape of refinements that occur within a recursive type. For recursive re-
finements that fit the above restriction, our approach for refinements is broadly
similar to that in [35], and we use the ownership scheme of [56] for handling
ownership. We first use simple type inference to infer the shape of the recursive
types, and automatically insert fold/unfold annotations into the source program.
As in [35], the refinements within an unfolding of a recursive type may refer to
dependent tuple names bound by the enclosing type. These recursive types can
express, e.g., the invariants of a mutable, sorted list. As in [56], recursive types
are unfolded once before assigning ownership variables; further unfoldings copy
existing ownership variables.
As in Java or C++, our language does not support sum types, and any
instantiation of a recursive type must use a null pointer. Our implementation
supports an ifnull construct in addition to a distinguished null constant. Our
implementation allows any refinement to hold for the null constant, including
⊥. Currently, our implementation does not detect null pointer dereferences, and
all soundness guarantees are made modulo freedom of null dereferences. As the
lifting of Γ omits refinements under reference types, null pointer refinements do
not affect the verification of programs without null pointer dereferences.
4.3 Limitations
Our current approach is not complete; there are safe programs that will be rejected
by our type system. As mentioned in Section 3.1, our well-formedness condition
forbids refinements that refer to memory locations. As a result, ConSORT
cannot in general express, e.g., that the contents of two references are equal.
Further, due to our reliance on automated theorem provers we are restricted to
logics with sound but potentially incomplete decision procedures. ConSORT
also does not support conditional or context-sensitive ownerships, and therefore
cannot precisely handle conditional mutation or aliasing.
5 Experiments
We now present the results of preliminary experiments performed with the imple-
mentation described in Section 4. The goal of these experiments was to answer
whether ConSORT can verify non-trivial programs and how its precision compares
with a state-of-the-art program verifier.
Table 1. Description of benchmark suite adapted from JayHorn. Java are programs
that test Java-specific features. Inc are tests that cannot be handled by ConSORT,
e.g., tests involving null checking. Bug includes a “safe” program we discovered was actually incorrect.
Remark 2. The original JayHorn paper includes two additional benchmark sets,
Mine Pump and CBMC. Both our tool and recent JayHorn versions time out on
the Mine Pump benchmark. Further, the CBMC tests were either subsumed by
our own test programs, tested Java specific features, or tested program synthesis
functionality. We therefore omitted both of these benchmarks from our evaluation.
Table 2. Results on the JayHorn benchmark suite (top) and on our benchmark programs (bottom).

                      ConSORT          JayHorn
Set       N. Tests    Correct   T/O    Correct   T/O    Imp.
Safe      32          29        3      24        5      3
Unsafe    26          26        0      19        0      7

Name         Safe?  Time(s)  Ann  JH      Name             Safe?  Time(s)  Ann  JH
Array-Inv      ✓     10.07    0   T/O     Array-Inv-BUG      ✗      5.29    0   T/O
Array-List     ✓     16.76    0   T/O     Array-List-BUG     ✗      1.13    0   T/O
Intro2         ✓      0.08    0   T/O     Intro2-BUG         ✗      0.02    0   T/O
Mut-List       ✓      1.45    3   T/O     Mut-List-BUG       ✗      0.41    3   T/O
Shuffle        ✓      0.13    3   ✓       Shuffle-BUG        ✗      0.07    3   ✗
Sorted-List    ✓      1.90    3   T/O     Sorted-List-BUG    ✗      1.10    3   T/O
We introduced unsafe mutations to these programs to check our tool for unsound-
ness and translated these programs into Java for further comparison with JayHorn.
Our benchmarks and JayHorn’s require a small number of trivially identi-
fied alias annotations. The adapted JayHorn benchmarks contain a total of 6
annotations; the most for any individual test was 3. The number of annotations
required for our benchmark suite is shown in column Ann of Table 2.
We first ran ConSORT on each program in our benchmark suite and ran
version 0.7 of JayHorn on the corresponding Java version. We recorded the final
verification result for both our tool and JayHorn. We also collected the end-to-end
runtime of ConSORT for each test; we do not give a performance comparison
with JayHorn given the many differences in target languages. For the JayHorn
suite, we first ran our tool on the adapted version of each test program and ran
JayHorn on the original Java version. We also did not collect runtime information
for this set of experiments because our goal is a comparison of tool precision, not
performance. All tests were run on a machine with 16 GB RAM and 4 Intel i5
CPUs at 2 GHz, with a timeout of 60 seconds (the same timeout was used in
[32]). We used ConSORT’s parallel backend (Section 4) with Z3 version 4.8.4,
HoICE version 1.8.1, and Eldarica version 2.0.1 and JayHorn’s Eldarica backend.
5.1 Results
The results of our experiments are shown in Table 2. On the JayHorn benchmark
suite ConSORT performs competitively with JayHorn, correctly identifying 29
of the 32 safe programs as such. For all 3 tests on which ConSORT timed out
after 60 seconds, JayHorn also timed out (column T/O). For the unsafe programs,
ConSORT correctly identified all programs as unsafe within 60 seconds; JayHorn
answered Unknown for 7 tests (column Imp.).
On our own benchmark set, ConSORT correctly verifies all safe versions of
the programs within 60 seconds. For the unsafe variants, ConSORT was able to
quickly and definitively determine these programs unsafe. JayHorn times out on
all tests except for Shuffle and Shuffle-BUG (column JH). We investigated the
cause of time outs and discovered that after verification failed with an unbounded
heap model, JayHorn attempts verification on increasingly larger bounded heaps.
In every case, JayHorn exceeded the 60 second timeout before reaching a pre-
configured limit on the heap bound. This result suggests JayHorn struggles in
the presence of per-object invariants and unbounded allocations; the only two
tests JayHorn successfully analyzed contain just a single object allocation.
We do not believe this struggle is indicative of a shortcoming in JayHorn’s
implementation, but stems from the fundamental limitations of JayHorn’s memory
representation. Like many verification tools (see Section 6), JayHorn uses a single,
unchanging invariant for every object allocated at the same syntactic location;
effectively, all objects allocated at the same location are assumed to alias with one
another. This representation cannot, in general, handle programs with different
invariants for distinct objects that evolve over time. We hypothesize other tools
that adopt a similar approach will exhibit the same difficulty.
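A minimal sketch of the problematic pattern (hypothetical, in the notation of Section 2, with a pseudo-syntax for the definition of mk):

  def mk(n) = let r = mkref n in r   (* a single syntactic allocation site *)
  let a = mk(1) in                   (* intended invariant: ∗a = 1 *)
  let b = mk(2) in                   (* intended invariant: ∗b = 2 *)
  assert(∗a = 1) ; ...

Under a single per-site invariant, a and b share one abstract location whose invariant must cover both 1 and 2, so the assertion cannot be established; ConSORT's flow-sensitive templates give a and b independent refinements.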
6 Related Work
The difficulty in handling programs with mutable references and aliasing has been
well-studied. Like JayHorn, many approaches model the heap explicitly at ver-
ification time, approximating concrete heap locations with allocation site labels
[14, 20, 32, 33, 46]; each abstract location is also associated with a refinement. As
abstract locations summarize many concrete locations, this approach does not in
general admit strong updates and flow-sensitivity; in particular, the refinement
associated with an abstract location is fixed for the lifetime of the program. The
techniques cited above include various workarounds for this limitation. For exam-
ple, [14, 46] temporarily allow breaking these invariants through a distinguished
program name as long as the abstract location is not accessed through another
name. The programmer must therefore eventually bring the invariant back in
sync with the summary location. As a result, these systems ultimately cannot
precisely handle programs that require evolving invariants on mutable memory.
A similar approach was taken in CQual [23] by Aiken et al. [2]. They used
an explicit restrict binding for pointers. Strong updates are permitted through
pointers bound with restrict, but the program is forbidden from using any pointers
which share an allocation site while the restrict binding is live.
A related technique used in the field of object-oriented verification is to declare
object invariants at the class level and allow these invariants on object fields to be
broken during a limited period of time [7, 22]. In particular, the work on Spec#
[7] uses an ownership system which tracks whether object a owns object b; like
ConSORT’s ownership system, these ownerships contain the effects of mutation.
However, Spec#’s ownership is quite strict and does not admit references to b
outside of the owning object a.
Viper [30, 42] (and its related projects [31, 39]) uses access annotations (ex-
pressed as permission predicates) to explicitly transfer access/mutation permissions
for references between static program names. Like ConSORT, permissions
may be fractionally transferred, allowing temporary shared, immutable access to
a mutable memory cell. However, while ConSORT automatically infers many
ownership transfers, Viper requires extensive annotations for each transfer.
F*, a dependently typed dialect of ML, includes an update/select theory of
heaps and requires explicit annotations summarizing the heap effects of a method
[44, 57, 58]. This approach enables modular reasoning and precise specification of
pre- and post-conditions with respect to the heap, but precludes full automation.
The work on rely–guarantee reference types by Gordon et al. [26, 27] uses refinement
types in a language with mutable references and aliasing. Their approach ex-
tends reference types with rely/guarantee predicates; the rely predicate describes
possible mutations via aliases, and the guarantee predicate describes the admissi-
ble mutations through the current reference. If two references may alias, then the
guarantee predicate of one reference implies the rely predicate of the other and
vice versa. This invariant is maintained with a splitting operation that is similar
to our + operator. Further, their type system allows strong updates to reference
refinements provided the new refinements are preserved by the rely predicate.
Thus, rely–guarantee refinements support multiple mutable, aliased references
with non-trivial refinement information. Unfortunately, this expressiveness comes
at the cost of automated inference and verification; an embedding of this system
into Liquid Haskell [63] described in [27] was forced to sacrifice strong updates.
Work by Degen et al. [17] introduced linear state annotations to Java. To effect
strong updates in the presence of aliasing, like ConSORT, their system requires
that annotated memory locations be mutated only through a distinguished reference.
Further, all aliases of this mutable reference give no information about the state
of the object much like our 0 ownership pointers. However, their system cannot
handle multiple, immutable aliases with non-trivial annotation information; only
the mutable reference may have non-trivial annotation information.
The fractional ownerships in ConSORT and their counterparts in [55, 56]
have a clear relation to linear type systems. Many authors have explored the
use of linear type systems to reason in contexts with aliased mutable references
[18, 19, 52], and in particular with the goal of supporting strong updates [1].
A closely related approach is RustHorn by Matsushita et al. [40]. Much like
ConSORT, RustHorn uses CHC and linear aliasing information for the sound
and—unlike ConSORT—complete verification of programs with aliasing and
mutability. However, their approach depends on Rust’s strict borrowing discipline,
and cannot handle programs where multiple aliased references are used in the
same lexical region. In contrast, ConSORT supports fine-grained, per-statement
changes in mutability and even further control with alias annotations, which
allows it to verify larger classes of programs.
The ownerships of ConSORT also have a connection to separation logic
[45]; the separating conjunction isolates write effects to local subheaps, while
ConSORT’s ownership system isolates effects to local updates of pointer types.
Other researchers have used separation logic to precisely support strong updates
of abstract state. For example, in work by Kloos et al. [36] resources are associated
with static, abstract names; each resource (represented by its static name) may
be owned (and thus, mutated) by exactly one thread. Unlike ConSORT, their
ownership system forbids even temporary immutable, shared ownership, as well as
transferring ownerships at arbitrary program points. An approach proposed by
Bakst and Jhala [4] uses a similar technique, combining separation logic with
refinement types. Their approach gives allocated memory cells abstract names, and
associates these names with refinements in an abstract heap. Like the approach
of Kloos et al. and ConSORT’s ownership 1 pointers, they ensure these abstract
locations are distinct in all concrete heaps, enabling sound, strong updates.
The idea of using a rational number to express permissions to access a refer-
ence dates back to the type system of fractional permissions by Boyland [12]. His
work used fractional permissions to verify race freedom of a concurrent program
without a may-alias analysis. Later, Terauchi [59] proposed a type-inference algo-
rithm that reduces typing constraints to a set of linear inequalities over rational
numbers. Boyland’s idea also inspired a variant of separation logic for a concurrent
programming language [11] to express sharing of read permissions among several
threads. Our previous work [55, 56], inspired by that in [11, 59], proposed meth-
ods for type-based verification of resource-leak freedom, in which a rational num-
ber expresses an obligation to deallocate a certain resource, not just a permission.
The issue of context-sensitivity (sometimes called polyvariance) is well-studied
in the field of abstract interpretation (e.g., [28, 34, 41, 50, 51], see [25] for a recent
survey). Polyvariance has also been used in type systems to assign different behav-
iors to the same function depending on its call site [3, 6, 64]. In the area of refine-
ment type systems, Zhu and Jagannathan developed a context-sensitive dependent
type system for a functional language [67] that indexed function types by unique
labels attached to call-sites. Our context-sensitivity approach was inspired by this
work. In fact, we could have formalized context-polymorphism within the frame-
work of full dependent types, but chose the current presentation for simplicity.
7 Conclusion
We presented ConSORT, a novel type system for safety verification of imperative
programs with mutability and aliasing. ConSORT is built upon the novel combi-
nation of fractional ownership types and refinement types. Ownership types flow-
sensitively and precisely track the existence of mutable aliases. ConSORT admits
sound strong updates by discarding refinement information on mutably-aliased
references as indicated by ownership types. Our type system is amenable to auto-
matic type inference; we have implemented a prototype of this inference tool and
found it can verify several non-trivial programs and outperforms a state-of-the-art
program verifier. As an area of future work, we plan to investigate using fractional
ownership types to soundly allow refinements that mention memory locations.
Acknowledgments The authors would like to thank the reviewers for their thoughtful
feedback and suggestions, and Yosuke Fukuda and Alex Potanin for their feedback on early drafts.
This work was supported in part by JSPS KAKENHI, grant numbers JP15H05706 and
JP19H04084, and in part by the JST ERATO MMSD Project.
Bibliography
[1] Ahmed, A., Fluet, M., Morrisett, G.: L3: a linear language with locations.
Fundamenta Informaticae 77(4), 397–449 (2007)
[2] Aiken, A., Foster, J.S., Kodumal, J., Terauchi, T.: Checking and
inferring local non-aliasing. In: Conference on Programming Lan-
guage Design and Implementation (PLDI). pp. 129–140 (2003).
https://doi.org/10.1145/781131.781146
[3] Amtoft, T., Turbak, F.: Faithful translations between polyvariant flows and
polymorphic types. In: European Symposium on Programming (ESOP). pp.
26–40. Springer (2000). https://doi.org/10.1007/3-540-46425-5_2
[4] Bakst, A., Jhala, R.: Predicate abstraction for linked data struc-
tures. In: Conference on Verification, Model Checking, and Abstract In-
terpretation (VMCAI). pp. 65–84. Springer Berlin Heidelberg (2016).
https://doi.org/10.1007/978-3-662-49122-5_3
[5] Ball, T., Levin, V., Rajamani, S.K.: A decade of software model check-
ing with SLAM. Communications of the ACM 54(7), 68–76 (2011).
https://doi.org/10.1145/1965724.1965743
[6] Banerjee, A.: A modular, polyvariant and type-based closure analysis. In:
International Conference on Functional Programming (ICFP). pp. 1–10
(1997). https://doi.org/10.1145/258948.258951
[7] Barnett, M., Fähndrich, M., Leino, K.R.M., Müller, P., Schulte, W., Venter,
H.: Specification and verification: the Spec# experience. Communications
of the ACM 54(6), 81–91 (2011). https://doi.org/10.1145/1953122.1953145
[8] Barrett, C., Fontaine, P., Tinelli, C.: The Satisfiability Modulo Theories
Library (SMT-LIB). www.SMT-LIB.org (2016)
[9] Bengtson, J., Bhargavan, K., Fournet, C., Gordon, A.D., Maffeis, S.: Re-
finement types for secure implementations. ACM Transactions on Pro-
gramming Languages and Systems (TOPLAS) 33(2), 8:1–8:45 (2011).
https://doi.org/10.1145/1890028.1890031
[10] Bhargavan, K., Bond, B., Delignat-Lavaud, A., Fournet, C., Hawblitzel,
C., Hriţcu, C., Ishtiaq, S., Kohlweiss, M., Leino, R., Lorch, J., Mail-
lard, K., Pan, J., Parno, B., Protzenko, J., Ramananandro, T., Rane, A.,
Rastogi, A., Swamy, N., Thompson, L., Wang, P., Zanella-Béguelin, S.,
Zinzindohoué, J.K.: Everest: Towards a verified, drop-in replacement of
HTTPS. In: Summit on Advances in Programming Languages (SNAPL
2017). pp. 1:1–1:12. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2017).
https://doi.org/10.4230/LIPIcs.SNAPL.2017.1
[11] Bornat, R., Calcagno, C., O’Hearn, P.W., Parkinson, M.J.: Per-
mission accounting in separation logic. In: Symposium on Prin-
ciples of Programming Languages (POPL). pp. 259–270 (2005).
https://doi.org/10.1145/1040305.1040327
[12] Boyland, J.: Checking interference with fractional permissions. In:
Symposium on Static Analysis (SAS). pp. 55–72. Springer (2003).
https://doi.org/10.1007/3-540-44898-5_4
[13] Champion, A., Kobayashi, N., Sato, R.: HoIce: An ICE-based non-linear Horn
clause solver. In: Asian Symposium on Programming Languages and Systems
(APLAS). pp. 146–156. Springer (2018). https://doi.org/10.1007/978-3-030-02768-1_8
[14] Chugh, R., Herman, D., Jhala, R.: Dependent types for JavaScript. In: Confer-
ence on Object Oriented Programming Systems Languages and Applications
(OOPSLA). pp. 587–606 (2012). https://doi.org/10.1145/2384616.2384659
[15] Cousot, P., Cousot, R., Feret, J., Mauborgne, L., Miné, A., Monniaux, D.,
Rival, X.: The ASTRÉE analyzer. In: European Symposium on Programming
(ESOP). pp. 21–30. Springer (2005). https://doi.org/10.1007/978-3-540-31987-0_3
[16] De Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: Conference on
Tools and Algorithms for the Construction and Analysis of Systems (TACAS).
pp. 337–340. Springer (2008). https://doi.org/10.1007/978-3-540-78800-3_24
[17] Degen, M., Thiemann, P., Wehr, S.: Tracking linear and affine resources
with JAVA(X). In: European Conference on Object-Oriented Programming
(ECOOP). pp. 550–574. Springer (2007). https://doi.org/10.1007/978-3-540-73589-2_26
[18] DeLine, R., Fähndrich, M.: Enforcing high-level protocols in low-level soft-
ware. In: Conference on Programming Language Design and Implementation
(PLDI). pp. 59–69 (2001). https://doi.org/10.1145/378795.378811
[19] Fähndrich, M., DeLine, R.: Adoption and focus: Practical linear
types for imperative programming. In: Conference on Programming
Language Design and Implementation (PLDI). pp. 13–24 (2002).
https://doi.org/10.1145/512529.512532
[20] Fink, S.J., Yahav, E., Dor, N., Ramalingam, G., Geay, E.: Effective type-
state verification in the presence of aliasing. ACM Transactions on Soft-
ware Engineering and Methodology (TOSEM) 17(2), 9:1–9:34 (2008).
https://doi.org/10.1145/1348250.1348255
[21] Flanagan, C.: Hybrid type checking. In: Symposium on Prin-
ciples of Programming Languages (POPL). pp. 245–256 (2006).
https://doi.org/10.1145/1111037.1111059
[22] Flanagan, C., Leino, K.R.M., Lillibridge, M., Nelson, G., Saxe, J.B.,
Stata, R.: Extended static checking for Java. In: Conference on Program-
ming Language Design and Implementation (PLDI). pp. 234–245 (2002).
https://doi.org/10.1145/512529.512558
[23] Foster, J.S., Terauchi, T., Aiken, A.: Flow-sensitive type qualifiers. In:
Conference on Programming Language Design and Implementation (PLDI).
pp. 1–12 (2002). https://doi.org/10.1145/512529.512531
[24] Freeman, T., Pfenning, F.: Refinement types for ML. In: Conference on
Programming Language Design and Implementation (PLDI). pp. 268–277
(1991). https://doi.org/10.1145/113445.113468
[25] Gilray, T., Might, M.: A survey of polyvariance in abstract interpretations.
In: Symposium on Trends in Functional Programming. pp. 134–148. Springer
(2013). https://doi.org/10.1007/978-3-642-45340-3_9
[26] Gordon, C.S., Ernst, M.D., Grossman, D.: Rely–guarantee references for
refinement types over aliased mutable data. In: Conference on Program-
ming Language Design and Implementation (PLDI). pp. 73–84 (2013).
https://doi.org/10.1145/2491956.2462160
[27] Gordon, C.S., Ernst, M.D., Grossman, D., Parkinson, M.J.: Verifying invari-
ants of lock-free data structures with rely–guarantee and refinement types.
ACM Transactions on Programming Languages and Systems (TOPLAS)
39(3), 11:1–11:54 (2017). https://doi.org/10.1145/3064850
[28] Hardekopf, B., Wiedermann, B., Churchill, B., Kashyap, V.: Widening for
control-flow. In: Conference on Verification, Model Checking, and Abstract
Interpretation (VMCAI). pp. 472–491 (2014). https://doi.org/10.1007/978-3-642-54013-4_26
[29] Hawblitzel, C., Howell, J., Kapritsos, M., Lorch, J.R., Parno, B., Roberts,
M.L., Setty, S., Zill, B.: IronFleet: proving practical distributed systems
correct. In: Symposium on Operating Systems Principles (SOSP). pp. 1–17.
ACM (2015). https://doi.org/10.1145/2815400.2815428
[30] Heule, S., Kassios, I.T., Müller, P., Summers, A.J.: Verification condition gen-
eration for permission logics with abstract predicates and abstraction func-
tions. In: European Conference on Object-Oriented Programming (ECOOP).
pp. 451–476. Springer (2013). https://doi.org/10.1007/978-3-642-39038-8_19
[31] Heule, S., Leino, K.R.M., Müller, P., Summers, A.J.: Abstract read per-
missions: Fractional permissions without the fractions. In: Conference on
Verification, Model Checking, and Abstract Interpretation (VMCAI). pp.
315–334 (2013). https://doi.org/10.1007/978-3-642-35873-9_20
[32] Kahsai, T., Kersten, R., Rümmer, P., Schäf, M.: Quantified heap invariants
for object-oriented programs. In: Conference on Logic for Programming
Artificial Intelligence and Reasoning (LPAR). pp. 368–384 (2017)
[33] Kahsai, T., Rümmer, P., Sanchez, H., Schäf, M.: JayHorn: A framework for
verifying Java programs. In: Conference on Computer Aided Verification
(CAV). pp. 352–358. Springer (2016). https://doi.org/10.1007/978-3-319-41528-4_19
[34] Kashyap, V., Dewey, K., Kuefner, E.A., Wagner, J., Gibbons, K., Sarracino,
J., Wiedermann, B., Hardekopf, B.: JSAI: a static analysis platform for
JavaScript. In: Conference on Foundations of Software Engineering (FSE).
pp. 121–132 (2014). https://doi.org/10.1145/2635868.2635904
[35] Kawaguchi, M., Rondon, P., Jhala, R.: Type-based data structure verification.
In: Conference on Programming Language Design and Implementation
(PLDI). pp. 304–315 (2009). https://doi.org/10.1145/1542476.1542510
[36] Kloos, J., Majumdar, R., Vafeiadis, V.: Asynchronous liquid separation
types. In: European Conference on Object-Oriented Programming (ECOOP).
pp. 396–420. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2015).
https://doi.org/10.4230/LIPIcs.ECOOP.2015.396
[37] Komuravelli, A., Gurfinkel, A., Chaki, S., Clarke, E.M.: Automatic ab-
straction in SMT-based unbounded software model checking. In: Confer-
ence on Computer Aided Verification (CAV). pp. 846–862. Springer (2013).
https://doi.org/10.1007/978-3-642-39799-8_59
[38] Leino, K.R.M.: Dafny: An automatic program verifier for functional correct-
ness. In: Conference on Logic for Programming Artificial Intelligence and Rea-
soning (LPAR). pp. 348–370. Springer (2010). https://doi.org/10.1007/978-3-642-17511-4_20
[39] Leino, K.R.M., Müller, P., Smans, J.: Deadlock-free channels and locks. In:
European Symposium on Programming (ESOP). pp. 407–426. Springer-
Verlag (2010). https://doi.org/10.1007/978-3-642-11957-6_22
[40] Matsushita, Y., Tsukada, T., Kobayashi, N.: RustHorn: CHC-based verifica-
tion for Rust programs. In: European Symposium on Programming (ESOP).
Springer (2020)
[41] Milanova, A., Rountev, A., Ryder, B.G.: Parameterized object sen-
sitivity for points-to analysis for Java. ACM Transactions on Soft-
ware Engineering and Methodology (TOSEM) 14(1), 1–41 (2005).
https://doi.org/10.1145/1044834.1044835
[42] Müller, P., Schwerhoff, M., Summers, A.J.: Viper: A verification infrastruc-
ture for permission-based reasoning. In: Conference on Verification, Model
Checking, and Abstract Interpretation (VMCAI). pp. 41–62. Springer-Verlag
(2016). https://doi.org/10.1007/978-3-662-49122-5_2
[43] Pierce, B.C.: Types and programming languages. MIT press (2002)
[44] Protzenko, J., Zinzindohoué, J.K., Rastogi, A., Ramananandro, T., Wang,
P., Zanella-Béguelin, S., Delignat-Lavaud, A., Hriţcu, C., Bhargavan, K.,
Fournet, C., Swamy, N.: Verified low-level programming embedded in F*.
Proceedings of the ACM on Programming Languages 1(ICFP), 17:1–17:29
(2017). https://doi.org/10.1145/3110261
[45] Reynolds, J.C.: Separation logic: A logic for shared mutable data structures.
In: Symposium on Logic in Computer Science (LICS). pp. 55–74. IEEE
(2002). https://doi.org/10.1109/LICS.2002.1029817
[46] Rondon, P., Kawaguchi, M., Jhala, R.: Low-level liquid types. In: Symposium
on Principles of Programming Languages (POPL). pp. 131–144 (2010).
https://doi.org/10.1145/1706299.1706316
[47] Rondon, P.M., Kawaguci, M., Jhala, R.: Liquid types. In: Conference on
Programming Language Design and Implementation (PLDI). pp. 159–169
(2008). https://doi.org/10.1145/1375581.1375602
[48] Rümmer, P., Hojjat, H., Kuncak, V.: Disjunctive interpolants for Horn-
clause verification. In: Conference on Computer Aided Verification (CAV).
pp. 347–363. Springer (2013). https://doi.org/10.1007/978-3-642-39799-8_24
[49] Sharir, M., Pnueli, A.: Two approaches to interprocedural data flow analysis.
In: Muchnick, S.S., Jones, N.D. (eds.) Program Flow Analysis: Theory and
Applications, chap. 7, pp. 189–223. Prentice Hall (1981)
[50] Shivers, O.: Control-flow analysis of higher-order languages. Ph.D. thesis,
Carnegie Mellon University (1991)
[51] Smaragdakis, Y., Bravenboer, M., Lhoták, O.: Pick your con-
texts well: Understanding object-sensitivity. In: Symposium on
Principles of Programming Languages (POPL). pp. 17–30 (2011).
https://doi.org/10.1145/1926385.1926390
[52] Smith, F., Walker, D., Morrisett, G.: Alias types. In: European
Symposium on Programming (ESOP). pp. 366–381. Springer (2000).
https://doi.org/10.1007/3-540-46425-5_24
[53] Späth, J., Ali, K., Bodden, E.: Context-, flow-, and field-sensitive
data-flow analysis using synchronized pushdown systems. Proceedings
of the ACM on Programming Languages 3(POPL), 48:1–48:29 (2019).
https://doi.org/10.1145/3290361
[54] Späth, J., Nguyen Quang Do, L., Ali, K., Bodden, E.: Boomerang:
Demand-driven flow- and context-sensitive pointer analysis for Java. In:
European Conference on Object-Oriented Programming (ECOOP). pp.
22:1–22:26. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2016).
https://doi.org/10.4230/LIPIcs.ECOOP.2016.22
[55] Suenaga, K., Fukuda, R., Igarashi, A.: Type-based safe resource dealloca-
tion for shared-memory concurrency. In: Conference on Object Oriented
Programming Systems Languages and Applications (OOPSLA). pp. 1–20
(2012). https://doi.org/10.1145/2384616.2384618
[56] Suenaga, K., Kobayashi, N.: Fractional ownerships for safe memory deal-
location. In: Asian Symposium on Programming Languages and Systems
(APLAS). pp. 128–143. Springer (2009). https://doi.org/10.1007/978-3-642-10672-9_11
[57] Swamy, N., Hriţcu, C., Keller, C., Rastogi, A., Delignat-Lavaud, A., Forest,
S., Bhargavan, K., Fournet, C., Strub, P.Y., Kohlweiss, M., Zinzindohoué,
J.K., Zanella-Béguelin, S.: Dependent types and multi-monadic effects in
F*. In: Symposium on Principles of Programming Languages (POPL). pp.
256–270 (2016). https://doi.org/10.1145/2837614.2837655
[58] Swamy, N., Weinberger, J., Schlesinger, C., Chen, J., Livshits, B.: Verifying
higher-order programs with the Dijkstra monad. In: Conference on Program-
ming Language Design and Implementation (PLDI). pp. 387–398 (2013).
https://doi.org/10.1145/2491956.2491978
[59] Terauchi, T.: Checking race freedom via linear programming. In: Conference
on Programming Language Design and Implementation (PLDI). pp. 1–10
(2008). https://doi.org/10.1145/1375581.1375583
[60] Toman, J., Siqi, R., Suenaga, K., Igarashi, A., Kobayashi, N.: ConSORT:
Context- and flow-sensitive ownership refinement types for imperative pro-
grams. https://arxiv.org/abs/2002.07770 (2020)
[61] Unno, H., Kobayashi, N.: Dependent type inference with interpolants. In:
Conference on Principles and Practice of Declarative Programming (PPDP).
pp. 277–288. ACM (2009). https://doi.org/10.1145/1599410.1599445
[62] Vazou, N., Rondon, P.M., Jhala, R.: Abstract refinement types. In: Euro-
pean Symposium on Programming (ESOP). pp. 209–228. Springer (2013).
https://doi.org/10.1007/978-3-642-37036-6_13
[63] Vazou, N., Seidel, E.L., Jhala, R., Vytiniotis, D., Peyton-Jones, S.: Refine-
ment types for Haskell. In: International Conference on Functional Program-
ming (ICFP). pp. 269–282 (2014). https://doi.org/10.1145/2628136.2628161
[64] Wells, J.B., Dimock, A., Muller, R., Turbak, F.: A calculus with polymorphic
and polyvariant flow types. Journal of Functional Programming 12(3), 183–
227 (2002). https://doi.org/10.1017/S0956796801004245
[65] Xi, H., Pfenning, F.: Dependent types in practical programming. In: Sympo-
sium on Principles of Programming Languages (POPL). pp. 214–227. ACM
(1999). https://doi.org/10.1145/292540.292560
[66] Zave, P.: Using lightweight modeling to understand Chord. ACM
SIGCOMM Computer Communication Review 42(2), 49–57 (2012).
https://doi.org/10.1145/2185376.2185383
[67] Zhu, H., Jagannathan, S.: Compositional and lightweight dependent
type inference for ML. In: Conference on Verification, Model Check-
ing, and Abstract Interpretation (VMCAI). pp. 295–314. Springer (2013).
https://doi.org/10.1007/978-3-642-35873-9_19
Mixed Sessions
1 Introduction
In classical session types, the channel on which the consumer interacts with the
producer can be given (as seen from the consumer's side) a type of the form
rec T. ⊕{enough: end, more: ?int.T},
where ⊕ denotes internal choice (the consumer decides), the two branches in the
choice are labelled with enough and more, type end denotes a channel on which
no further interaction is possible, and ?int denotes the reception of an integer
value. Reception is a prefix to a type, the continuation is T (in this case the “goes
back to the beginning” part). The code for the consumer (and the producer as
well) is unnecessarily complex, featuring parts that exchange messages in both
directions: enough and more selections from the consumer to the producer, and int
messages from the producer to the consumer. In particular, the consumer must
first select option more (outgoing) and then receive an integer (incoming).
Using mixed sessions one can invert the direction of the more selection and
write the type of the channel (again as seen from the side of the consumer) as
⊕{enough!unit.end, more?int.T}
The changes seem merely cosmetic, but label/polarity pairs (polarity is ! or ?)
are now indivisible and constitute the keys of the choice type when seen as a
map. The integer value is piggybacked on top of selection more. As a result, the
classical session primitive operations of selection and branching (that is, internal
and external choice) and communication (output and input) collapse into a single
one: the mixed session. The producer can be safely written as
p (enough?z. 0 + more!n. produce!(p, n+1))
offering a choice on channel end p featuring mixed branches with labels enough?
and more!, where 0 denotes the terminated process and produce!(p, n+1) a recur-
sive call to the producer. The example is further developed in Section 2.
Mixed sessions build on Vasconcelos' presentation of session types, which we
call classical sessions [43], by adapting choice and input/output as needed, but
keeping everything else unchanged as much as possible. The result is a language
with
– a single synchronisation/communication primitive: mixed choice on a given
channel that
– allows for duplicated labels in choice processes, leading to non-determinism
in a pure linear setting, and
– replicated output processes arising naturally from replicated mixed choices,
and that
– enjoys preservation and absence of runtime errors for typable processes, and
– provides for embedding classical sessions in a tight type and operational
correspondence.
The rest of the paper is organised as follows: the next section shows mixed ses-
sions in action; Section 3 introduces the technical development of the language,
and Section 4 proves the main results (preservation and absence of runtime
errors for typable processes). Then Section 5 presents the embedding and the
correspondence proofs, Section 6 discusses implementation details, and Section 7
explores related work. Section 8 concludes the paper.
Consider the producer-consumer problem where the producer produces only
insofar as requested by the consumer. Here is the code for a producer that
writes on channel end x numbers starting from n.
def produce(x, n) =
  lin x (enough?z.0 +
         more!n.produce!(x, n+1))
Suppose that x and y are two ends of the same channel. When choices on x and
on y get together, a pair of matching label-polarity pairs is selected and a value
is transmitted from the output continuation to the input continuation.
Types for the two channel ends ensure that choice synchronisation succeeds.
The type of x is rec a. lin &{enough?unit.end, more!int.a} where the qualifier lin
says that the channel end must be used in exactly one process, & denotes external
choice, and each branch is composed of a label, a polarity mark, the type of the
communication, and that of the continuation. The type end states that no further
interaction is possible at the channel and rec introduces a recursive type. The
type of y is obtained from that of x by inverting views (⊕ and &) and polarities
( ! and ?), yielding rec b. lin ⊕{enough!unit.end, more?int.b}. The choice at x in the
produce process contains all branches in the type and so we select an external
choice view & for x. The choices at y contain only part of the branches, hence
the internal choice view ⊕. This type discipline ensures that processes do not
engage in runtime errors when trying to find a match for two choices at the two
ends of a given channel.
A few type and process abbreviations simplify coding: i) the lin qualifier
can be omitted; ii) the terminated process 0, together with the trailing dot, can
be omitted; iii) the terminated type end, together with the trailing dot, can be
omitted; and iv) we introduce wildcards (_) in variable binding positions (in
input branches).
Process collect sees the channel from the dual viewpoint, obtained by ex-
changing ? with ! and ⊕ with &. Parameter n in this case denotes the number
of messages received. When done, the process writes the result on channel end r ,
global to the collect process.
collect : (rec b.&{msg!unit.b, msg?unit.b, done?unit}, int)
def collect(y, n) =
  y (msg!().collect!(y, n+1) +
     msg?_.collect!(y, n) +
     done?_.r (result!n))
reduces either to (z!true | (νxy)y?_.z!false) or to (z!false | (νxy)y?_.z!true),
leaving to the runtime the garbage collection of the inert residuals. Also note
that, in this case, channel y cannot remain linear.
Duplicated label-polarity pairs in choices lead to elegant and concise code. A
random number generator with a given number n of bits can be written with two
processes. The first process sends n messages on channel end x. The contents of
the messages are irrelevant (we use value () of type unit); what is important is
that n more-labelled messages are sent, followed by a done message, followed by silence.
write : (rec a.⊕{done!unit, more!unit.a}, int)
def write(x, n) =
  if n == 0
  then x (done!())
  else x (more!().write!(x, n-1))
The reader process reads the more messages in two distinct branches and
interprets messages received on one branch as bit 0, and on the other as 1. Upon
the reception of a done message, the accumulated random number is conveyed
on channel end r , a variable global to the read process.
read : (rec b.&{done?unit, more?unit.b}, int)
def read(y, n) =
  y (done?_.r (result!n) +
     more?_.read!(y, 2*n) +
     more?_.read!(y, 2*n+1))
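The arithmetic behind the two duplicated more? branches can be replayed outside the calculus. The following minimal Python sketch is ours, for illustration only; each message non-deterministically lands in one of the two branches, contributing one random bit:

import random

def read_bits(bits: int) -> int:
    # Mimic the read process: each incoming more-message is consumed by one
    # of the two identical more? branches, chosen non-deterministically;
    # one branch doubles the accumulator, the other doubles it and adds one.
    n = 0
    for _ in range(bits):                  # the writer sends `bits` more-messages
        n = 2 * n + random.choice((0, 1))  # the chosen branch decides the next bit
    return n                               # conveyed on r upon the done message

print(read_bits(8))  # an 8-bit random number, e.g. 183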
Mixed sessions allow for replicated output processes. The original version of
the π-calculus [30,31] features recursion on arbitrary processes. Subsequent
versions [29] introduce replication, restricted to input processes. When compared
to languages with unrestricted input only, unrestricted output allows for more
concise programs and fewer message exchanges for the same effect. Here is a
process (call it P ) containing a pair of processes that exchange msg-labelled
messages ad aeternum:

(νxy)(un y (msg!()) | un x (msg?_))
v ::=                     Values:
    x                         variable
    true | false              boolean values
    ()                        unit value

P ::=                     Processes:
    q x Σi∈I Mi               choice
    P | P                     parallel composition
    (νxx)P                    scope restriction
    if v then P else P        conditional
    0                         inaction

M ::=                     Branches:
    l⋆v.P                     branch

⋆ ::=                     Polarities:
    ! | ?                     out and in

q ::=                     Qualifiers:
    lin | un                  linear and unrestricted
Even if unrestricted output can be simulated with unrestricted input, the encod-
ing requires one extra channel (wz) and an extra message exchange (on channel
wz) in order to reestablish the output on channel end y.
It is a fact that unrestricted output can be added to any flavour of the π-
calculus (session-typed or not). In the case of mixed sessions it arises naturally:
there is only one communication primitive—choice—and this can be classified as
lin or un. If an un-choice happens to behave in “output mode”, then we have an un-
output. It is not obvious how to design the language of mixed choices without
allowing unrestricted output, while still allowing unrestricted input (which is
mandatory for unbounded behaviour).
This section introduces the syntax and the semantics of mixed sessions. Inspired
by Vasconcelos' formulation of session types for the π-calculus [43,45], mixed
sessions replace input and output, selection and branching (internal and external
choice), with a single construct which we call choice.
3.1 Syntax
Figure 1 presents the syntax of values and processes. Let x, y, z range over a
(countable) set of variables, and let l range over a set of labels. Metavariable v
ranges over values. Following the tradition of the π-calculus, set up by Milner
et al. [30,31], variables are used both as placeholders for incoming values in
communication and for channels. Linearity constraints, central to session types
but absent in the π-calculus, dictate that the two ends of a channel must be
syntactically distinguished; we use one variable for each end [43]. Different prim-
itive values can be used. Here, we pick the boolean values (so that we may have
a conditional process), and unit that plays its role in the embedding of classical
session types (Section 5).
Metavariables P and Q range over processes. Choices are processes of the
form q x Σi∈I Mi, offering a choice of alternatives Mi on channel end x. Qualifier q
describes how choice behaves with respect to reduction. If q is lin, then the choice
is consumed in reduction, otherwise q must be un, and in this case the choice
persists after reduction. The type system in Figure 8 rejects nullary (empty)
choices. There are two forms of branches: output l!v.P and input l?x.P. An
output branch sends value v and continues as P . An input branch receives a
value and continues as P with the value replacing variable x. The type system
in Figure 8 makes sure that value v in l?v.P is a variable.
The remaining process constructors are standard in the π-calculus. Processes
of the form P | Q denote the parallel composition of processes P and Q. Scope
restriction (νxy)P binds together the two channel ends x and y of a same channel
in process P . The conditional process if v then P else Q behaves as process P if
v is true and as process Q otherwise. Since we do not have nullary choices, we
include 0—called inaction—as primitive to denote the terminated process.
The variable bindings in the language are as follows: variables x and y are bound
in P, in a process of the form (νxy)P; variable x is bound in P in a branch of
the form l?x.P. The sets of bound and free variables, as well as substitution,
P [v/x], are defined accordingly. We work up to alpha-conversion and follow
Barendregt’s variable convention, whereby all variables in binding occurrences
in any mathematical context are pairwise distinct and distinct from the free
variables [2].
Figure 2 summarises the operational semantics of mixed sessions. Following
the tradition of the π-calculus, a binary relation on processes—structural congru-
ence—rearranges processes when preparing for reduction. Such an arrangement
reduces the number of rules included in the operational semantics. Structural
congruence was introduced by Milner [27,29]. It is defined as the least congru-
ence relation closed under the axioms in Figure 2. The first three rules state that
parallel composition is commutative, associative, and takes inaction as the neu-
tral element. The fourth rule is commonly known as scope extrusion [30,31] and
allows extending the scope of channel ends x, y to process Q. The side-condition
requires that x and y do not occur free in Q.
Structural congruence, P ≡ P:

  P | Q ≡ Q | P      (P | Q) | R ≡ P | (Q | R)      P | 0 ≡ P

  (νxy)P | Q ≡ (νxy)(P | Q)  (x, y not free in Q)      (νxy)0 ≡ 0      (νwx)(νyz)P ≡ (νyz)(νwx)P

Reduction, P → P:

  [R-LinLin]
  (νxy)(lin x(M + l!v.P + M′) | lin y(N + l?z.Q + N′) | R) →
  (νxy)(P | Q[v/z] | R)

  [R-LinUn]
  (νxy)(lin x(M + l!v.P + M′) | un y(N + l?z.Q + N′) | R) →
  (νxy)(P | Q[v/z] | un y(N + l?z.Q + N′) | R)

  [R-Res]     if P → Q then (νxy)P → (νxy)Q
  [R-Par]     if P → Q then P | R → Q | R
  [R-Struct]  if P ≡ P′, P′ → Q′, and Q′ ≡ Q then P → Q
T ::=                     Types:
    q ♯{Ui}i∈I                choice
    end                       termination
    unit | bool               unit and boolean
    μa.T                      recursive type
    a                         type variable

U ::=                     Branches:
    l⋆T.T                     branch

♯ ::=                     Views:
    ⊕ | &                     internal and external

Γ ::=                     Contexts:
    ·                         empty
    Γ, x : T                  entry
omitted, for there is no distinction between input and output: choice is the only
(symmetrical) communication primitive.
We have designed mixed choices in such a way that labels may be duplicated
in choices; more: label-polarity pairs may also be duplicated. This allows for
non-determinism in a linear context. For example, a process offering both l!true
and l!false on one end of a channel, with a single branch l?z.lin w(m!z.0) on the
other end, reduces in one step to either lin w(m!true.0) or lin w(m!false.0).
The examples in Section 2 take advantage of a def notation, a derived process
construct inspired by the SePi [12] and Pict [36] languages. A process of the
form def x(z) = P in Q is understood as (νxy)(un y(l?z.P) | Q), and calls to the
recursive procedure, of the form x!v, are interpreted as lin x(l!v), for an
arbitrarily chosen label l. The derived syntax hides channel end y and
simplifies the syntax of calls to the procedure. Procedures with more than one
parameter require tuple passing, a notion that is not primitive to mixed sessions.
Fortunately, tuple passing is easy to encode; see Vasconcelos [43].
3.3 Typing
Figure 3 summarises the syntax of types. We rely on an extra set, that of type
variables, a, b, . . . Types describe values, including boolean and unit values, and
channel ends. A type of the form q ♯{Ui}i∈I denotes a channel end. Qualifier q
states the number of processes that may contain references to the channel end:
exactly one for lin, zero or more for un. View ♯ distinguishes internal (⊕) from
external (&) choice. This distinction is not present in processes but is of paramount
importance for typing purposes, as we shall see.
importance for typing purposes, as we shall see. The branches are either of
output—l!S.T—or of input—l?S.T—nature. In either case, S denotes the ob-
ject of communication and T describes the subsequent behaviour of the channel
end. Type end denotes the channel end on which no more interaction is possible.
Types μa.T and a cater for recursive types.
Types are subject to a few syntactic restrictions: i) choices must have at least
one branch; ii) label-polarity pairs—l⋆—are pairwise distinct in the branches of
a choice type (unlike in processes); iii) recursive types are assumed contractive
(that is, containing no subterm of the form μa1…μan.a1). New variables, new
bindings: type variable a is bound in T in type μa.T. Again, bound and free
names, as well as substitution—S[T/a]—are defined accordingly.
Mixed sessions come equipped with a notion of subtyping. Figure 4 introduces
the rules that allow determining whether a given type is a subtype of another.
The rules must be read coinductively. Base types (end, unit, bool) are subtypes
of themselves. The rules for recursive types are standard. Subtyping behaves
differently in the presence of internal and external choice. For internal choice
we require the branches in the subtype to contain those in the supertype:
exercising fewer options than those declared cannot cause difficulties on the
receiving side. For external choice we require the opposite: here offering more
choices cannot cause runtime errors. For branches we distinguish output from
input: output is contravariant in the contents of the message, input is covariant.
In either case, the continuation is covariant. Choices, input/output, and recursive
types receive no treatment different from that in classical sessions [15]. We can
easily show that the <: relation
is a preorder. Notation S ≡ T abbreviates S <: T and T <: S.
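As an illustration of this discipline, here is a Python sketch of the relation restricted to recursion-free types. The tuple representation, with '+' standing for ⊕ and '&' for &, is our own device and not part of the paper:

# Recursion-free types: ('end',), ('unit',), ('bool',), or
# ('choice', qualifier, view, branches), where branches maps a
# (label, polarity) pair to a (payload, continuation) pair of types.

def subtype(s, t):
    if s[0] != 'choice' or t[0] != 'choice':
        return s == t                           # base types: subtypes of themselves
    _, q_s, view_s, bs = s
    _, q_t, view_t, bt = t
    if q_s != q_t or view_s != view_t:
        return False
    # Internal choice ('+'): the subtype carries the larger branch set;
    # external choice ('&'): the supertype does.
    small, big = (bt, bs) if view_s == '+' else (bs, bt)
    if not set(small) <= set(big):
        return False
    for (label, pol) in small:                  # branches present on both sides
        p_s, c_s = bs[(label, pol)]
        p_t, c_t = bt[(label, pol)]
        # Output is contravariant in the payload, input is covariant;
        # continuations are always covariant.
        payload_ok = subtype(p_t, p_s) if pol == '!' else subtype(p_s, p_t)
        if not (payload_ok and subtype(c_s, c_t)):
            return False
    return True

sub = ('choice', 'lin', '+', {('l', '?'): (('bool',), ('end',)),
                              ('m', '!'): (('unit',), ('end',))})
sup = ('choice', 'lin', '+', {('l', '?'): (('bool',), ('end',))})
print(subtype(sub, sup))  # True: an internal choice may drop branches upward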
Duality is a notion central to session types. In order for channel communi-
cation to proceed smoothly, the two channel ends must be compatible: if one
end says input, the other must say output; if one end says external choice, the
Type duality, T ⊥ T:

  end ⊥ end

  Si ≡ S′i and Ti ⊥ T′i (for all i ∈ I)
  implies  q ⊕{li ⋆i Si.Ti}i∈I ⊥ q &{li ⋆̄i S′i.T′i}i∈I   (⋆̄ the dual of polarity ⋆)

  (recursive types are unfolded exactly as in type equivalence)

Un and lin predicates, un(T) and lin(T):

  un(end)    un(unit)    un(bool)    un(un ♯{Ui})

  un(T) implies un(μa.T)        lin(T) for all T
other must say internal choice. In the presence of recursive types, the problem of
building the dual of a given type has been elusive, as works by Bernardi and
Hennessy, Bono and Padovani, and Lindley and Morris show [5,7,25]. Here we eschew
the problem by working with a duality relation, as in Gay and Hole [15].
The rules in Figure 5 define what we mean for two types to be dual. This
is the coinductive definition of Gay and Hole in rule format (and adapted to
choice). Duality is defined for session types only. Type end is the dual of itself.
The rule for choice types requires dual views (& is the dual of ⊕, and vice-versa)
and dual polarities (? is the dual of !, and vice-versa). Furthermore, the objects
of communications must be equivalent (Si ≡ Si ) and the continuations must be
dual again (Ti ⊥ Ti ). The rules in the second line handle recursion in the exact
same way as in type equivalence. As an example, we can easily show that
μa.lin⊕{l?bool.lin&{m!unit.a}} ⊥ lin&{l!bool.μb.lin⊕{m?unit.lin&{l!bool.b}}}
It can be shown that ⊥ is an involution, that is, if R ⊥ S and S ⊥ T , then
R ≡ T.
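For the recursion-free fragment, the duality relation collapses to a function that can be sketched in a few lines of Python, reusing the hypothetical type representation from the earlier sketch:

DUAL_VIEW = {'+': '&', '&': '+'}
DUAL_POLARITY = {'!': '?', '?': '!'}

def dual(t):
    # end is self-dual; a choice type dualises its view and every branch's
    # polarity, keeps payload types, and dualises continuations.
    if t[0] != 'choice':
        return t
    _, q, view, branches = t
    return ('choice', q, DUAL_VIEW[view],
            {(label, DUAL_POLARITY[pol]): (payload, dual(cont))
             for (label, pol), (payload, cont) in branches.items()})

t = ('choice', 'lin', '&', {('enough', '?'): (('unit',), ('end',)),
                            ('more', '!'): (('bool',), ('end',))})
print(dual(t) == ('choice', 'lin', '+',
                  {('enough', '!'): (('unit',), ('end',)),
                   ('more', '?'): (('bool',), ('end',))}))  # True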
The meaning of the un and lin predicates is defined by the rules in Fig-
ure 6. Basic types—unit, bool, end—are unrestricted; un-annotated choices are
unrestricted; μa.T is unrestricted if T is. Contractivity ensures that the predi-
cate is total. All types are lin, meaning that both lin and non-lin types may be
used in linear contexts.
Before presenting the type system, we need to introduce two notions that
manipulate typing contexts. The rules in Figure 7 define the meaning of context
split and context update. These two relations are taken verbatim from Vasconce-
los [43]; context split is originally from Walker [48] (cf. Kobayashi et al. [22,23]).
Context split is used when type checking processes with two sub-processes. In
Context split, Γ = Γ1 ◦ Γ2:

  · = · ◦ ·

  Γ = Γ1 ◦ Γ2 and un(T)  implies  Γ, x : T = (Γ1, x : T) ◦ (Γ2, x : T)

  Γ = Γ1 ◦ Γ2  implies  Γ, x : lin p = (Γ1, x : lin p) ◦ Γ2

  Γ = Γ1 ◦ Γ2  implies  Γ, x : lin p = Γ1 ◦ (Γ2, x : lin p)

Context update, Γ + x : T = Γ:

  x ∉ dom(Γ)  implies  Γ + x : T = Γ, x : T

  un(T) and T ≡ U  implies  (Γ, x : T) + x : U = (Γ, x : T)
this case we split the context in two, by copying unrestricted entries to both
contexts and linear entries to one only. Context update is used to add to a given
context an entry representing the continuation (after a choice operation) of a
channel. If the variable in the entry is not in the context, then we add the entry
to the context. Otherwise we require the entry to be present in the context and
the type to be unrestricted.
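A Python sketch of the two operations may help fix ideas; the dictionary representation of contexts and the lin_to_left parameter directing linear entries are our own devices, not the paper's:

def is_un(t):
    # un(T): base types and un-qualified choices; recursion omitted here.
    return t[0] in ('end', 'unit', 'bool') or (t[0] == 'choice' and t[1] == 'un')

def split(ctx, lin_to_left):
    # Context split: unrestricted entries are copied to both halves; each
    # linear entry goes to exactly one half, as directed by lin_to_left.
    g1, g2 = {}, {}
    for x, t in ctx.items():
        if is_un(t):
            g1[x], g2[x] = t, t
        elif x in lin_to_left:
            g1[x] = t
        else:
            g2[x] = t
    return g1, g2

def update(ctx, x, t):
    # Context update: add the entry when x is absent; otherwise the existing
    # entry must equal an unrestricted T (type equivalence approximated by
    # syntactic equality in this sketch).
    if x not in ctx:
        return {**ctx, x: t}
    if is_un(t) and ctx[x] == t:
        return ctx
    raise TypeError(f'cannot update {x}')

lin_t = ('choice', 'lin', '+', {('l', '!'): (('bool',), ('end',))})
g1, g2 = split({'x': ('bool',), 'y': lin_t}, lin_to_left={'y'})
print(g1, g2)  # x is copied to both halves, y only to the first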
The rules in Figure 8 introduce the typing system for mixed sessions. Here the
un and lin predicates on types are pointwise extended to typing contexts. Notice
that all contexts are linear and only some contexts are unrestricted. We require
all instances of the axioms to be built from unrestricted contexts, thus ensuring
that linear resources (channel ends) are fully consumed in typing derivations.
The typing rules for values should be straightforward: constants have their
own types, the type for a variable is read from the context, and [T-Sub] is the
subsumption rule, allowing a type to be replaced by a supertype.
The rules for branches—[T-Out] and [T-In]—follow those for output and
input in classical session types. To type an output branch we split the context
in two: one part for the value, the other for the continuation process. To type an
input branch we add an entry with the bound variable x to the context under
which we type the continuation process. Rule [T-In] rejects branches of the form
l?v.P when v is not a variable. The continuation type T is not used in either rule;
instead it is incorporated in the type for the channel in Γ (cf. rule [T-Choice]
below).
The rules for inaction, parallel composition, and conditional are from Vas-
concelos [43]. That for scope restriction is adapted from Gay and Hole [15]. Rule
[T-Inact] follows the general pattern for axioms, requiring a un context. Rule
[T-Par] splits the context in two, providing each subprocess with one part. Rule
[T-If] splits the context and uses one part to type guard v. Because v is unre-
stricted, we know that Γ1 contains exactly the un entries in Γ1 ◦ Γ2 and that Γ2
is equal to Γ1 ◦ Γ2 . Context Γ2 is used to type both branches of the conditional,
for only one of them will ever execute. Rule [T-Res] introduces in the typing
context entries for the two channel ends, x and y, at dual types.
The rule for choice is new. The incoming context is split in two: one for the
subject x of the choice, the other for the various branches in the choice. The
qualifier of the process, q1 , dictates the nature of the incoming context: un or lin.
This allows for a linear choice to contain channels of an arbitrary nature, but
limits unrestricted choices to unrestricted channels only (for one cannot predict
how many times such choices will be exercised). The second premise extracts a
type q2 ♯{li ⋆i Si.Ti} for x. The third premise types each branch: type Sj is used
to type value vj in each branch, and type Tj is used to type the corresponding
continuation. The rule updates context Γ2 with the continuation type of x: if
q2 is lin, then x is not in Γ2 and the update operation simply adds the entry
to the context. If, on the other hand, q2 is un, then x is in Γ2 and the context
update operation (together with rule [T-Sub]) insists that type Tj is a subtype
of un{lj Sj .Tj }, meaning that Tj is a recursive type.
The last premise to rule [T-Choice] insists that the set of labels in the
choice type coincides with that in the choice process. That does not mean that
the label-polarity pairs are in a one-to-one correspondence: label-polarity pairs
are pairwise distinct in types (see the syntactic restrictions in Section 3.3),
but not in processes. For example, process lin x(l?y.0 + l?z.0) can be typed
against context x : lin⊕{l?bool.end}. From the fact that the two sets must
coincide it does not follow that the label-polarity pairs in the type from the
context coincide with those in the process. Taking advantage of subtyping, the
above process can still be typed against context x : lin⊕{l?bool.end, m!unit.end},
because lin⊕{l?bool.end, m!unit.end} <: lin⊕{l?bool.end}. The opposite phenomenon
happens with external choice, where one may remove branches by virtue of subtyping.
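The last premise of [T-Choice] is thus a plain set comparison, as the following Python fragment (ours, for illustration) makes explicit:

def choice_matches_type(process_pairs, type_pairs):
    # The label-polarity pairs of the choice process and those of its choice
    # type coincide as sets; the process may repeat a pair, the type may not.
    return set(process_pairs) == set(type_pairs)

# lin x(l?y.0 + l?z.0) against x : lin +{l?bool.end}:
print(choice_matches_type([('l', '?'), ('l', '?')], {('l', '?')}))  # True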
We complete this section by discussing examples that illustrate options taken
in the typing system (we postpone the formal justification to Section 4). Suppose
we allowed empty choices in the syntax of types. Then the process

(νxy)(x() | y())

would be typable by taking x : ⊕(), y : &(), yet the process would not reduce.
We could add an extra reduction rule to that effect. Suppose instead that we
allowed duplicated label-polarity pairs in the branches of choice types. Then a
process such as (νxy)(lin x(l!true.0 + l!().0) | lin y(l?z.if z then 0 else 0))
could be typed under context x : &{l!bool, l!unit}, y : ⊕{l?bool, l?unit}, yet the
process might reduce to if () then 0 else 0, which is a runtime error.
Deadlocked processes do not count as runtime errors, for they may be typed:
just think of (νxy)lin x(l?z.lin y(l!true.0)), typable under the empty context.
Unlike the interpretations of session types in linear logic by Caires, Pfenning
and Wadler [8,14,46,47], typable mixed session processes can easily deadlock.
Similarly, processes with more than one lin-choice on the same channel end can
be typed. For example, process lin x(l!true.0) | lin x(l?z.0) can be typed under
context x : μa.un⊕{l!bool.a, l?bool.a}. Recall the relationship between qualifiers
in processes (q1) and those in types (q2) in the discussion of the rules for
choice in Section 3.
Theorem 1 (Well-typed processes are not runtime errors). If · ⊢ P,
then P is not a runtime error.

Proof. In view of a contradiction, assume that · ⊢ P and that P is a runtime
error. Suppose first that P is of the form
(νx1y1)…(νxnyn)(q1 xn Σi∈I li⋆i vi.Pi | q2 yn Σj∈J lj⋆j wj.Qj | R)
with {li⋆̄i}i∈I ∩ {lj⋆j}j∈J = ∅, where ⋆̄ denotes the dual of polarity ⋆. From
the typing derivation for P, using [T-Par] and [T-Res], we obtain a context
Γ = Γ1 ◦ Γ2 ◦ Γ3 = x1 : T1, y1 : S1, …, xn : Tn, yn : Sn such that Ti ⊥ Si for all
i = 1, …, n, and Γ1 ⊢ q1 xn Σi∈I li⋆i vi.Pi and Γ2 ⊢ q2 yn Σj∈J lj⋆j wj.Qj and
Γ3 ⊢ R. Without loss of generality, given that xn and yn have dual types and
from the premises of rule [T-Choice], assume that Γ1 ⊢ xn : q1&{lk⋆k Tk.T′k}k∈K
and Γ2 ⊢ yn : q2⊕{lk⋆̄k Sk.S′k}k∈K, with {li⋆i}i∈I = {lk⋆k}k∈K and
{lj⋆j}j∈J ⊆ {lk⋆̄k}k∈K. This also implies that {li⋆̄i}i∈I = {lk⋆̄k}k∈K. Thus, a
label-polarity pair lj⋆j from q2 yn Σj∈J lj⋆j wj.Qj belongs to the set {li⋆̄i}i∈I,
since lj⋆j ∈ {lk⋆̄k}k∈K = {li⋆̄i}i∈I, contradicting {li⋆̄i}i∈I ∩ {lj⋆j}j∈J = ∅.

When P is qz(M + l?v.P′ + N) and v is not a variable, the contradiction is
with rule [T-In], which can only be applied when the value v is a variable.
When P is if v then P′ else Q and v is not a boolean value, the contradiction
immediately arises with rule [T-If].
In order to prepare for the preservation result we introduce a few lemmas.
Lemma 1 (Unrestricted weakening). If Γ ⊢ P and un(T), then Γ, x : T ⊢ P.

Proof. The proof goes by mutual induction on the rules for branches and
processes, but we first need to show the result for the value typing rules: if
Γ ⊢ v : S and un(R), then Γ, x : R ⊢ v : S. This follows by simple inspection of
the rules [T-Unit], [T-True], [T-False], and [T-Var], taking into consideration
that un(R). For rule [T-Sub], use the induction hypothesis to obtain
Γ, x : R ⊢ v : S and conclude, using [T-Sub], that Γ, x : R ⊢ v : T.

For the branch and process typing rules we detail the proof when the last
rule is [T-Out]. Using the result for typing values, we obtain Γ1, x : R ⊢ v : S,
and the induction hypothesis for processes leads to Γ2, x : R ⊢ P. Using the
un context split property, taking into account that un(R), we conclude that
Γ1 ◦ Γ2, x : R ⊢ l!v.P : l!S.T.
For the process rule [T-Inact], the result is a simple consequence of un(T).
For the other rules, the result follows by the induction hypothesis for processes
and branches, together with the result for typing values. We detail the proof for
rule [T-If]. Using the result for typing values, we know that Γ1, x : T ⊢ v : bool.
By the induction hypothesis we also obtain Γ2, x : T ⊢ P and Γ2, x : T ⊢ Q. Using
the un context split property, we conclude Γ1 ◦ Γ2, x : T ⊢ if v then P else Q.
Lemma 2 (Preservation for structural congruence). If Γ ⊢ P and P ≡ Q,
then Γ ⊢ Q.

Proof. As in Vasconcelos [43, Lemma 7.4], since we share the structural
congruence axioms.
Lemma 3 (Substitution). If Γ1 ⊢ v : S and Γ2, x : S ⊢ P, then Γ1 ◦ Γ2 ⊢ P[v/x].

Proof. The proof follows by mutual induction on the rules for processes and
branches.
Theorem 2 (Preservation). If Γ ⊢ P and P → Q, then Γ ⊢ Q.

Proof. The proof is by rule induction on the reduction, making use of the
weakening and substitution lemmas, and of preservation for structural congruence.
We sketch the cases for [R-LinLin] and [R-LinUn].
When reduction ends with rule [R-LinLin], we know that rule [T-Res]
introduces x : X, y : Y with X ⊥ Y in the context Γ. From there, with applications
of [T-Par] and [T-Choice], Γ = Γ1 ◦ Γ2 ◦ Γ3 with Γ1 ⊢ lin x(M + l!v.P + M′),
Γ2 ⊢ lin y(N + l?z.Q + N′), and Γ3 ⊢ R. Furthermore, Γ1 = Γ′1 ◦ Γ″1 with lin(Γ′1),
Γ′1 ⊢ x : lin⊕{M, l!S.T, M′} and Γ″1, x : T ⊢ l!v.P : l!S.T. From the [T-Out]
rule, Γv ⊢ v : S and Γ4 ⊢ P. For the y side, Γ′2 ⊢ y : lin&{N, l?U.V, N′} and
Γ″2, y : Y ⊢ l?z.Q : l?U.V. From the [T-In] rule, Γz, y : V, z : U ⊢ Q. We also have
that S ≡ U from the duality of x and y. Using the substitution Lemma 3,
Γz, y : V, Γv ⊢ Q[v/z]. Using [T-Par] with the remaining contexts and [T-Res]
types the conclusion of [R-LinLin].

When reduction ends with rule [R-LinUn], we know that rule [T-Res]
introduces x : X, y : Y with X ⊥ Y in the context Γ. From there, with applications
of [T-Par] and [T-Choice], Γ = Γ1 ◦ Γ2 ◦ Γ3 with Γ1 ⊢ lin x(M + l!v.P + M′),
Γ2 ⊢ un y(N + l?z.Q + N′), and Γ3 ⊢ R. Furthermore, Γ1 = Γ′1 ◦ Γ″1 with lin(Γ′1)
and Γ′1 ⊢ x : un⊕{M, l!S.T, M′}; here the type of x is un since x and y are dual.
We also have Γ″1, x : T ⊢ l!v.P : l!S.T, from which follow Γ4 ⊢ v : S and Γ5 ⊢ P
by rule [T-Out]. For the y side, Γ′2 ⊢ y : un&{N, l?U.V, N′} and
Γ″2, y : Y ⊢ l?z.Q : l?U.V, which gives Γ6, y : V, z : U ⊢ Q from [T-In].

Types S and U are equivalent due to the duality of x and y, and so
Γ6, y : V, z : S ⊢ Q. Using the substitution Lemma 3, Γ6 ◦ Γ4, y : V ⊢ Q[v/z].
From Γ5 we also type the process P. Using [T-Par] with the remaining contexts
and [T-Res] types the conclusion of [R-LinUn].
P ::= . . .               Processes:
    x!v.P                     output
    q x?x.P                   input
    x◁l.P                     selection
    x▷{li : Pi}i∈I            branching

T ::= . . .               Types:
    q ⋆T.T                    communication
    q ♯{li : Ti}i∈I           choice

  S ≡ T and S′ ⊥ T′  implies  q?S.S′ ⊥ q!T.T′

  Si ⊥ Ti (for all i ∈ I)  implies  q ⊕{li : Si}i∈I ⊥ q &{li : Ti}i∈I
1. If Γ ⊢ v : T, then ⟦Γ⟧ ⊢ ⟦v⟧ : ⟦T⟧.
2. If Γ ⊢ P, then ⟦Γ⟧ ⊢ ⟦P⟧.
Process translation
Following the ideas of Peters et al. [34], the translation from classical to
mixed sessions can be enriched with a renaming policy ϕ, a map from channel
ends to sequences of channel ends. The following theorem states that the proposed
translation is name invariant: ⟦σ(P)⟧ = σ′(⟦P⟧), where σ′ is such that
ϕ(σ(x)) = σ′(ϕ(x)), for every channel end x.
Proof. The translation maps each channel end (x in Figure 10) to itself; thus,
any substitution is preserved. See Figure 10.
1. If P → P′, then ⟦P⟧ → ⟦P′⟧.
2. If ⟦P⟧ → Q, then P → P′ and ⟦P′⟧ = Q, for some P′.

Proof. Straightforward rule induction on the hypotheses, relying on the fact that
⟦P[v/x]⟧ = ⟦P⟧[v/x] and that xi ∉ fv(Pi) in the translation of x▷{li : Pi}i∈I.
The following theorems concern the finite and infinite behaviour of classical
session processes and their corresponding translations.
Theorem 7 (Divergence reflection). The translation ⟦·⟧ : C → M reflects
divergence, i.e., if ⟦P⟧ →ω in M, then P →ω in C, for every process P ∈ C.
[Figures 11 and 12: message diagrams for implementing a mixed choice between
two processes P and Q, first via an independent broker that receives the
label-polarity pairs offered by each side, selects a matching pair (such as
l!1/l?1), and forwards value v1 (Figure 11), and then with the broker located
at P or at Q (Figures 12a and 12b).]
Finally, we observe that the broker need not be an independent process; it can
be located at one of the choice processes. This reduces the number of messages
down to two messages in the general case, as described in Figures 12a and 12b
where either P is the broker or Q is the broker. Even if the value was already
sent by Q in the case that P is the broker, P must still let Q know which choice
was taken, so that Q may proceed with the appropriate branch.
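A hypothetical Python sketch of this two-message protocol with P acting as broker follows; the representation of a choice as a set of label-polarity pairs is our own simplification of the discussion above:

import random

DUAL = {'!': '?', '?': '!'}

def broker_step(p_branches, q_branches):
    # One synchronisation with P as broker: Q first sends its label-polarity
    # pairs (message 1); P picks a pair matching one of its own and replies
    # with the decision (message 2).  Typing rules out the no-match case;
    # duplicated pairs make the pick non-deterministic.
    matches = [(label, pol) for (label, pol) in p_branches
               if (label, DUAL[pol]) in q_branches]
    if not matches:
        raise RuntimeError('no matching label-polarity pair')
    return random.choice(matches)

# P offers enough? and more!; Q offers enough! and more?:
print(broker_step({('enough', '?'), ('more', '!')},
                  {('enough', '!'), ('more', '?')}))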
However, in particular cases one message may be enough. Take, for instance,
a process P = un x(l1!v1.P + l2!v2.P). Independently of which branch is taken,
the process proceeds as P. Thus, if the broker is located in a process Q, then
P need not be informed of the selected choice. The same is true for classical
sessions, where selection is an output choice with a single branch.
Beyond the number of messages exchanged, there are two other aspects that
one should discuss when implementing mixed sessions on a message-passing
architecture.
The first is related to the type of broker used and to which values in a choice
are revealed to the other party. In the case of the basic broker, only the value
of the chosen option is revealed, and never to the broker itself. However, when
we piggyback the values, as in the second type of broker, all values in the
choice branches are revealed to the broker, even if they are not used in the
end. This is even more striking in the case where one of the processes is the
broker: the other party has access to all the possible values, independently of
the choice that is taken.
The second aspect also concerns the values themselves: in order to be presented
in a choice, they must be computed a priori, even if they end up not being used.
When dealing with the privacy of the values, we can choose which type of
broker to use depending on how much we want to reveal to the other party. To
prevent computing values before a branch is chosen, however, one should instead
use classical sessions.
7 Related Work
Mixed choices in the Singularity operating system Concrete syntax apart, the
language of linear mixed choices is quite similar to that of channel contracts in
Sing# [10]. Rather than explicit recursive types, Sing# contracts use named
states (akin to typestates [40]), providing for more legible contracts. In Sing#,
each state in a contract corresponds to a mixed session type lin&{li ⋆i Si.Ti}
(contracts are always written from the consumer side), where each li denotes a
message tag, ⋆i the message direction (! or ?), Si the type of the value in the
message, and Ti the next state.
Stengel and Bultan showed that processes that follow Sing# contracts can
engage in communication errors [39]. They further provide a realizability condi-
tion for contracts that essentially rules out mixed choices. Bono and Padovani
present a calculus and a type system that models Sing# [6,7]. The type system
ensures that well-typed processes are exempt from communication errors, but the
language of types excludes mixed choices. So it seems that Sing#-like languages
only function properly under separated choice, yet our work survives under mixed
choices. Contradiction? No! Sing# features an asynchronous (or buffered)
semantics, whereas mixed sessions run under a synchronous semantics. The
operational semantics makes all the difference in this case.
Synchronicity, asynchronicity, and choice Pierce and Turner identified the prob-
lem: “In an asynchronous language guarded choice should be restricted still fur-
ther since an asynchronous output in a choice is sensitive to buffering” [36] and
Peters et al. state that “a discussion on synchrony versus asynchrony cannot
be separated from a discussion on choice” [34,35]. Based on classical sessions,
mixed sessions are naturally synchronous. The naive introduction of an asyn-
chronous semantics would ruin the main results of the language (see Section 4).
Asynchronous semantics are known to be compatible with classical sessions;
see Honda et al. [20,21] for multiparty asynchronous session types and Fowler
et al. [11] and Gay and Vasconcelos [16] for two examples of functional lan-
guages with session types and asynchronous semantics. So one can ask whether
a language can be designed where mixed choices are handled synchronously and
separated choices asynchronously: a type-guided operational semantics,
asynchronous by default, reverting to a synchronous semantics in the presence
of mixed choices.
Separation results Palamidessi shows that the π-calculus with mixed choice is
more expressive than its subset with separated choice [32]. Gorla provides a
simpler proof [17] of the same result and Peters and Nestmann analyse the
problem from the perspective of breaking initial symmetries in separated-choice
processes [33]. Unlike the π-calculus with separated choices, mixed choices oper-
ate on the same channel and are guided by types. It would be interesting to look
into separation results for classical sessions and mixed sessions. Are mixed ses-
sions more expressive than classical sessions under some widely accepted criteria
(those of Gorla [17], for example)?
The origin of mixed sessions Mixed sessions dawned on us when looking into
an algorithm to decide the equivalence of context-free session types [1,42]. The
algorithm translates types into (simple) context-free grammars. The decision
procedure runs on arbitrary simple grammars: the right-hand sides of grammar
productions may start with a label-output or a label-input pair for the same
non-terminal symbol at the left of the production. We then decided to explore
mixed sessions and picked the simplest possible language for the purpose: the π-
calculus. It would be interesting to look into mixed context-free session types,
given that decidability of type equivalence is guaranteed.
8 Conclusion
We introduce mixed sessions: session types with mixed choice. Classical session
types feature separated choice; in fact all the proposals in the literature we are
aware of provide for choice on the input side only, even if we can easily think
of choice on the output side. Mixed sessions increase flexibility in programming
and are easily realisable in conventional message passing architectures.
Mixed choices come with a type system featuring subtyping. Typability is
preserved by reduction. Furthermore well-typed programs are exempt from run-
time errors. We provide suggestions on how to derive a type checking procedure,
even if we do not formalise it. Classical session types are a particular case of
mixed sessions: we provide for an encoding and show typing and operational
correspondences.
We leave open the problem of looking into a typed separation result (or a
proof of inseparability) between classical sessions and mixed sessions. An
interesting avenue for further development is a hybrid type-guided semantics,
asynchronous by default, that reverts to synchronous in the presence of an
output choice.
References
1. Almeida, B., Mordido, A., Vasconcelos, V.T.: Checking the equivalence of context-
free session types. In: Tools and Algorithms for the Construction and Analysis of
Systems - 26th International Conference, TACAS 2020. Lecture Notes in Computer
Science, Springer (2020)
2. Barendregt, H.P.: The lambda calculus - its syntax and semantics, Studies in logic
and the foundations of mathematics, vol. 103. North-Holland (1985)
3. Bergstra, J.A., Klop, J.W.: Process theory based on bisimulation semantics. In:
Linear Time, Branching Time and Partial Order in Logics and Models for Concur-
rency. Lecture Notes in Computer Science, vol. 354, pp. 50–122. Springer (1988).
https://doi.org/10.1007/BFb0013021
4. Bernardi, G., Dardha, O., Gay, S.J., Kouzapas, D.: On duality relations for session
types. In: Trustworthy Global Computing. Lecture Notes in Computer Science,
vol. 8902, pp. 51–66. Springer (2014). https://doi.org/10.1007/978-3-662-45917-1_4
5. Bernardi, G., Hennessy, M.: Using higher-order contracts to model
session types. Logical Methods in Computer Science 12(2) (2016).
https://doi.org/10.2168/LMCS-12(2:10)2016
6. Bono, V., Messa, C., Padovani, L.: Typing copyless message passing. In: Program-
ming Languages and Systems. Lecture Notes in Computer Science, vol. 6602, pp.
57–76. Springer (2011). https://doi.org/10.1007/978-3-642-19718-5_4
7. Bono, V., Padovani, L.: Typing copyless message passing. Logical Methods in Com-
puter Science 8(1) (2012). https://doi.org/10.2168/LMCS-8(1:17)2012
8. Caires, L., Pfenning, F., Toninho, B.: Linear logic propositions as session
types. Mathematical Structures in Computer Science 26(3), 367–423 (2016).
https://doi.org/10.1017/S0960129514000218
9. Demangeon, R., Honda, K.: Full abstraction in a subtyped pi-calculus with lin-
ear types. In: CONCUR 2011 - Concurrency Theory. Lecture Notes in Computer
Science, vol. 6901, pp. 280–296. Springer (2011). https://doi.org/10.1007/978-3-642-23217-6_19
10. Fähndrich, M., Aiken, M., Hawblitzel, C., Hodson, O., Hunt, G.C., Larus, J.R.,
Levi, S.: Language support for fast and reliable message-based communication in
singularity OS. In: Proceedings of the 2006 EuroSys Conference. pp. 177–190. ACM
(2006). https://doi.org/10.1145/1217935.1217953
11. Fowler, S., Lindley, S., Morris, J.G., Decova, S.: Exceptional asynchronous ses-
sion types: session types without tiers. PACMPL 3(POPL), 28:1–28:29 (2019).
https://doi.org/10.1145/3290341
12. Franco, J., Vasconcelos, V.T.: A concurrent programming language with re-
fined session types. In: Software Engineering and Formal Methods. Lec-
ture Notes in Computer Science, vol. 8368, pp. 15–28. Springer (2013).
https://doi.org/10.1007/978-3-319-05032-4_2
13. Garrigue, J., Keller, G., Sumii, E. (eds.): Proceedings of the 21st ACM SIGPLAN
International Conference on Functional Programming, ICFP 2016, Nara, Japan,
September 18-22, 2016. ACM (2016). https://doi.org/10.1145/2951913
14. Gastin, P., Laroussinie, F. (eds.): CONCUR 2010 - Concurrency Theory, 21th
International Conference, CONCUR 2010, Paris, France, August 31-September 3,
2010. Proceedings, Lecture Notes in Computer Science, vol. 6269. Springer (2010).
https://doi.org/10.1007/978-3-642-15375-4
15. Gay, S.J., Hole, M.: Subtyping for session types in the pi calculus. Acta Inf. 42(2-3),
191–225 (2005). https://doi.org/10.1007/s00236-005-0177-z
16. Gay, S.J., Vasconcelos, V.T.: Linear type theory for asynchronous session types. J.
Funct. Program. 20(1), 19–50 (2010). https://doi.org/10.1017/S0956796809990268
17. Gorla, D.: Towards a unified approach to encodability and separa-
tion results for process calculi. Inf. Comput. 208(9), 1031–1053 (2010).
https://doi.org/10.1016/j.ic.2010.05.002
18. Honda, K.: Types for dyadic interaction. In: CONCUR ’93, 4th International Con-
ference on Concurrency Theory. Lecture Notes in Computer Science, vol. 715, pp.
509–523. Springer (1993). https://doi.org/10.1007/3-540-57208-2_35
19. Honda, K., Vasconcelos, V.T., Kubo, M.: Language primitives and type discipline
for structured communication-based programming. In: Programming Languages
and Systems. Lecture Notes in Computer Science, vol. 1381, pp. 122–138. Springer
(1998). https://doi.org/10.1007/BFb0053567
20. Honda, K., Yoshida, N., Carbone, M.: Multiparty asynchronous session
types. In: Proceedings of the 35th ACM SIGPLAN-SIGACT Sympo-
sium on Principles of Programming Languages. pp. 273–284. ACM (2008).
https://doi.org/10.1145/1328438.1328472
21. Honda, K., Yoshida, N., Carbone, M.: Multiparty asynchronous session types. J.
ACM 63(1), 9:1–9:67 (2016). https://doi.org/10.1145/2827695
22. Kobayashi, N., Pierce, B.C., Turner, D.N.: Linearity and the pi-calculus.
In: Conference Record of POPL’96. pp. 358–371. ACM Press (1996).
https://doi.org/10.1145/237721.237804
23. Kobayashi, N., Pierce, B.C., Turner, D.N.: Linearity and the pi-
calculus. ACM Trans. Program. Lang. Syst. 21(5), 914–947 (1999).
https://doi.org/10.1145/330249.330251
24. Kouzapas, D., Yoshida, N.: Mixed-choice multiparty session types (2020), unpub-
lished
25. Lindley, S., Morris, J.G.: Talking bananas: structural recursion for session types.
In: Garrigue et al. [13], pp. 434–447. https://doi.org/10.1145/2951913.2951921
26. Milner, R.: A Calculus of Communicating Systems, Lecture Notes in Computer
Science, vol. 92. Springer (1980). https://doi.org/10.1007/3-540-10235-3
27. Milner, R.: Functions as processes. In: Automata, Languages and Programming.
Lecture Notes in Computer Science, vol. 443, pp. 167–180. Springer (1990).
https://doi.org/10.1007/BFb0032030
28. Milner, R.: The polyadic pi-calculus: A tutorial. ECS-LFCS-91-180, Laboratory
for Foundations of Computer Science, Department of Computer Science, University
of Edinburgh (1991). This report was published in F. L. Bauer, W. Brauer, and H.
Schwichtenberg, editors, Logic and Algebra of Specification. Springer-Verlag, 1993
29. Milner, R.: Functions as processes. Mathematical Structures in Computer Science
2(2), 119–141 (1992). https://doi.org/10.1017/S0960129500001407
30. Milner, R., Parrow, J., Walker, D.: A calculus of mobile processes, I. Inf. Comput.
100(1), 1–40 (1992). https://doi.org/10.1016/0890-5401(92)90008-4
31. Milner, R., Parrow, J., Walker, D.: A calculus of mobile processes, II. Inf. Comput.
100(1), 41–77 (1992). https://doi.org/10.1016/0890-5401(92)90009-5
32. Palamidessi, C.: Comparing the expressive power of the synchronous and asyn-
chronous pi-calculi. Mathematical Structures in Computer Science 13(5), 685–719
(2003). https://doi.org/10.1017/S0960129503004043
33. Peters, K., Nestmann, U.: Breaking symmetries. Mathemati-
cal Structures in Computer Science 26(6), 1054–1106 (2016).
https://doi.org/10.1017/S0960129514000346
34. Peters, K., Schicke, J., Nestmann, U.: Synchrony vs causality in the asynchronous
pi-calculus. In: Proceedings 18th International Workshop on Expressiveness in Con-
currency. EPTCS, vol. 64, pp. 89–103 (2011). https://doi.org/10.4204/EPTCS.64.7
35. Peters, K., Schicke-Uffmann, J., Goltz, U., Nestmann, U.: Synchrony versus causal-
ity in distributed systems. Mathematical Structures in Computer Science 26(8),
1459–1498 (2016). https://doi.org/10.1017/S0960129514000644
36. Pierce, B.C., Turner, D.N.: Pict: a programming language based on the pi-calculus.
In: Proof, Language, and Interaction, Essays in Honour of Robin Milner. pp. 455–
494. The MIT Press (2000)
37. Sangiorgi, D.: An interpretation of typed objects into typed pi-calculus. Inf. Com-
put. 143(1), 34–73 (1998). https://doi.org/10.1006/inco.1998.2711
38. Sangiorgi, D., Walker, D.: The Pi-Calculus - a theory of mobile processes. Cam-
bridge University Press (2001)
39. Stengel, Z., Bultan, T.: Analyzing singularity channel contracts. In: Proceedings
of the Eighteenth International Symposium on Software Testing and Analysis. pp.
13–24. ACM (2009). https://doi.org/10.1145/1572272.1572275
40. Strom, R.E., Yemini, S.: Typestate: A programming language concept for en-
hancing software reliability. IEEE Trans. Software Eng. 12(1), 157–171 (1986).
https://doi.org/10.1109/TSE.1986.6312929
41. Takeuchi, K., Honda, K., Kubo, M.: An interaction-based language and its
typing system. In: PARLE ’94: Parallel Architectures and Languages Europe.
Lecture Notes in Computer Science, vol. 817, pp. 398–413. Springer (1994).
https://doi.org/10.1007/3-540-58184-7_118
42. Thiemann, P., Vasconcelos, V.T.: Context-free session types. In: Garrigue et al.
[13], pp. 462–475. https://doi.org/10.1145/2951913.2951926
43. Vasconcelos, V.T.: Fundamentals of session types. Inf. Comput. 217, 52–70 (2012).
https://doi.org/10.1016/j.ic.2012.05.002
44. Vasconcelos, V.T.: Typed concurrent objects. In: Object-Oriented Programming.
Lecture Notes in Computer Science, vol. 821, pp. 100–117. Springer (1994).
https://doi.org/10.1007/BFb0052178
45. Vasconcelos, V.T.: Fundamentals of session types. In: Formal Methods for Web
Services. Lecture Notes in Computer Science, vol. 5569, pp. 158–186. Springer
(2009). https://doi.org/10.1007/978-3-642-01918-0_4
46. Wadler, P.: Propositions as sessions. In: ACM SIGPLAN International
Conference on Functional Programming. pp. 273–286. ACM (2012).
https://doi.org/10.1145/2364527.2364568
47. Wadler, P.: Propositions as sessions. J. Funct. Program. 24(2-3), 384–418 (2014).
https://doi.org/10.1017/S095679681400001X
48. Walker, D.: Advanced Topics in Types and Programming Languages, chap. Sub-
structural Type Systems. The MIT Press (2005)
49. Yoshida, N., Vasconcelos, V.T.: Language primitives and type discipline for struc-
tured communication-based programming revisited: Two systems for higher-order
session communication. Electr. Notes Theor. Comput. Sci. 171(4), 73–93 (2007).
https://doi.org/10.1016/j.entcs.2007.02.056
Higher-Order Spreadsheets with Spilled Arrays
1 Introduction
Many spreadsheets contain repeated regions that share the same formatting and
formulas, perhaps with minor variations. The typical method for generating each
variation is to apply the operations copy-paste-modify. That is, the user copies
the region they intend to repeat, pastes it into a new location, and makes local
modifications to the newly pasted region such as altering data values, format-
ting, or formulas. A common problem associated with copy-paste-modify is that
updates to a source region will not propagate to a modified copy. A user must
modify each copy manually—a process that is tedious and error-prone.
Gridlets [12] are a high-level abstraction for re-use in spreadsheets based on
the principle of live copy-paste-modify: a pasted region of a spreadsheet can be
locally modified without severing the link to the source region. Changes to the
source region propagate to the copy.
The central idea of this paper is that we can implement gridlets using a
formula operator G. If a cell a contains the formula
G(r, a1, F1, . . . , an, Fn)
then the behaviour is to copy range r, modify cells ai with formulas Fi , and
paste the computed array in cell a where its elements may be displayed in the
cells below and to the right.
Consider the following example:
Formulas:
      A       B           C
  1   “Edge”  “Len.”
  2   “a”     3           = B2^2
  3   “b”     4           = B3^2
  4   “c”     = SQRT(C4)  = C2 + C3

Values:
      A       B       C
  1   “Edge”  “Len.”
  2   “a”     3       9
  3   “b”     4       16
  4   “c”     5       25
Formulas:
      A                            B   C
  ...
  6   = G(A1:C4, B2, 7, B3, 24)
  7
  8
  9

Values:
      A       B       C
  ...
  6   “Edge”  “Len.”
  7   “a”     7       49
  8   “b”     24      576
  9   “c”     25      625
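The copy-paste-modify bookkeeping behind G on this example can be sketched in Python. The encoding of cells as zero-based (row, column) pairs is our own; evaluation, spilling, and the adjustment of relative references are deliberately elided:

def gridlet(sheet, rng, mods, dest):
    # Copy the formulas in range rng, overwrite the modified cells, and
    # re-anchor the copy at dest.  A sheet is a dict from cells to formulas.
    (r1, c1), (r2, c2) = rng
    copy = {(r - r1, c - c1): f for (r, c), f in sheet.items()
            if r1 <= r <= r2 and c1 <= c <= c2}      # copy the source region
    for (r, c), f in mods.items():
        copy[(r - r1, c - c1)] = f                   # apply local modifications
    dr, dc = dest
    return {(dr + r, dc + c): f for (r, c), f in copy.items()}  # paste at dest

# G(A1:C4, B2, 7, B3, 24) pasted at A6, with A1 encoded as (0, 0):
sheet = {(0, 0): '"Edge"', (0, 1): '"Len."', (1, 1): '3', (2, 1): '4'}
print(gridlet(sheet, ((0, 0), (3, 2)), {(1, 1): '7', (2, 1): '24'}, (5, 0)))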
First, we make sense of array spilling and its subtleties. Two formulas spilling
into the same cell, or colliding, is one problem. Another problem is a formula
spilling into an area on which it depends, triggering a spill cycle. Both problems
make preserving determinism and acyclicity of spreadsheet evaluation a chal-
lenge. We give a semantics of spilling that exploits iteration to determine which
arrays spill successfully, and which do not. Our solution ensures that there is at
most one array that spills into any address, and that the iteration converges.
Second, we develop three new spreadsheet primitives that implement G when
paired with spilled arrays. We present a higher-order spreadsheet calculus, the
grid calculus, that admits sheets as first-class values and provides operations
that manipulate sheet-values. Previous work has drawn connections between
spreadsheets and object-oriented programming [5,8,9,15,17], but we give the first
direct correspondence by showing that the Abadi and Cardelli object calculus [1]
can be embedded in the grid calculus. Our translation constitutes a precise
analogy between objects and sheets, and between methods and cells.
In our semantics for gridlets, we make three distinct technical contributions:
– We develop the spill calculus, the first formalisation of spilled arrays for
spreadsheets. Our first theorem is that the iterative process of spilling we
present converges deterministically (Section 4). Our formal analysis of spilled
arrays, a feature now available in commercial spreadsheet systems, is a sub-
stantial contribution of this work, independent of our gridlet semantics.
– We develop the grid calculus, an extension of the spill calculus with three
higher-order operators: GRID, VIEW, and UPDATE. These correspond to
copy, paste, and modify, and suffice to encode the operator G (Section 5).
– In the course of developing the grid calculus, we realised a close connection
between gridlets and object-oriented programming. We make this precise by
encoding the Abadi and Cardelli object calculus into the grid calculus. Our
second theorem shows the correctness of this encoding (Section 6).
2 Challenges of Spilling
In this section we describe the challenges of implementing spilled arrays. We de-
scribe core design principles for spreadsheet implementations and then illustrate
how spilled arrays challenge these principles.
Formulas:
      A         B
  1   {10, 20}
  2

Values:
      A    B
  1   10   20
  2
Static Collision Every cell in a spill area should be blank except for the spill
root; a blank cell has no formula. A static collision occurs when a spill root spills
into another non-blank cell, and we say the non-blank cell is an obstruction.
The choice between reading the value from the obstruction and reading the
spilled value would violate determinism. We adopt a simple mechanism used by
Excel and Sheets to resolve
static spill collisions: the root evaluates to an error value, not an array, and spills
nowhere. The ambiguity between reading the obstructing cell’s value and the
root’s spilled value is resolved by preventing the root from spilling—we always
read the value from the obstructing cell. Consider the following example:
Formulas:
      A          B
  1   {10, 20}   40
  2   B1 + 2

Values:
      A     B
  1   ERR   40
  2   42
Dynamic Collisions A dynamic collision occurs when a blank cell is a spill target
for two distinct spill roots. Dynamic collisions can be resolved in different ways.
– The conservative approach is to say no colliding spill root spills and each
root evaluates to an error.
– The liberal approach is to say that every colliding spill root spills. This
approach can be non-deterministic because the spill target obtains its value
by choosing one of the multiple colliding spill roots. Google Sheets takes the
liberal approach.
– An intermediate approach enforces what we call the single-spill policy. One
root from the set of colliding roots is permitted to spill and the rest evaluate
to an error. This approach can be non-deterministic because there is a choice
of which root is permitted to spill. Excel takes the single-spill approach.
Consider the following example that uses the single-spill approach:
Formulas:
      A        B
  1   B2       {3; 4}
  2   {1, 2}

Values (two possible outcomes):
      A    B              A     B
  1   2    ERR        1   4     3
  2   1    2          2   ERR   4
Spill Cycles A cell cycle occurs when the value of a formula in a cell depends
on the value of the cell itself. We know that it is never legal for a cell to read
its own value, and therefore it is possible to eagerly detect cell cycles during
evaluation of a cell. In
contrast, a spill cycle only occurs if the cell evaluates to an array that is spilled
into a range the cell depends on, so it is not possible to detect the cycle until
the cell has been evaluated.
We can thus proactively detect cell cycles, but only retroactively detect spill
cycles. To see why, let us consider the following example, wherein we assume
the definition of a conditional operator IF that is lazy in the second and third
arguments, and a function INC that maps over an array, incrementing every
number and converting ε to 0, where ε denotes the value read from a blank cell.
      A     B
  1   42    IF(A1 = 42, SUM(B2:B3), INC(B2:B3))
  2
  3
The evaluation of address B1 returns the sum of the range B2 : B3. While the
value of B1 depends on the values in the range B2:B3, the sum returns a scalar
and therefore no spilling is required.
Consider the case where the value in A1 is changed to 43. The address B1
will evaluate the formula INC(B2:B3), first by dereferencing the range B2:B3
to yield {ε; ε}, and then by applying INC to yield {0; 0}. The array {0; 0} will
attempt to spill into the range B1:B2, a range overlapping the one just read by
the formula. The attempt to spill induces a spill cycle; there is no consistent
value that can be assigned to the addresses B1, B2, and B3.
In Section 4 we give a semantics for spilling that uses dynamic dependency
tracking to ensure that no spill root depends on its own spill area.
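The dependency check itself amounts to a set-disjointness test, as in the following Python sketch (ours, with our own (row, column) address encoding):

def spill_allowed(root, m, n, deps):
    # An m-by-n array rooted at `root` may spill only if its spill area is
    # disjoint from the set of addresses read while evaluating the root.
    area = {(root[0] + i, root[1] + j) for i in range(m) for j in range(n)}
    return area.isdisjoint(deps)

# B1 evaluates INC(B2:B3), reading {B2, B3}; its 2x1 result would spill
# into {B1, B2}, overlapping B2, so the spill is rejected:
print(spill_allowed((1, 2), 2, 1, {(2, 2), (3, 2)}))  # False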
In this section we present a core calculus for spreadsheets that serves as the
foundation of our technical developments.
3.1 Syntax
Figure 1 presents the syntax of the core calculus. Let a and b range over A1-style
addresses, written N m, composed from a column name N and row index m. A
column name is a base-26 numeral written using the symbols A..Z. A row index
is a decimal numeral written as usual. Let m and n range over positive natural
numbers which we typically use to denote row or array indices. We assume a
locale in which rows are numbered from top to bottom, and columns from left to
right, so that A1 is the top-left cell of the sheet. We use the terms address and cell
interchangeably. Let r range over ranges that are pairs of addresses that denote
a rectangular region of a grid. Modern spreadsheet systems do not restrict which
Higher-Order Spreadsheets with Spilled Arrays 749
corners of a rectangle are denoted by a range but will automatically normalise the
range to represent the top-left and bottom-right corners. We implicitly assume
that all ranges are written in the normalised form such that range B1:A2 does
not occur; instead, the range is denoted A1:B2.
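For concreteness, here is a small Python sketch (ours) of column naming under the usual bijective base-26 reading, together with range normalisation:

def col_index(name: str) -> int:
    # Column name to column number in the bijective base-26 scheme:
    # A=1, ..., Z=26, AA=27, AB=28, ...
    n = 0
    for ch in name:
        n = n * 26 + ord(ch) - ord('A') + 1
    return n

def normalise(a, b):
    # Rewrite a range so that it denotes top-left and bottom-right corners;
    # addresses are (column, row) pairs of positive integers.
    (c1, r1), (c2, r2) = a, b
    return (min(c1, c2), min(r1, r2)), (max(c1, c2), max(r1, r2))

print(col_index('B'), col_index('AA'))  # 2 27
print(normalise((2, 1), (1, 2)))        # B1:A2 becomes A1:B2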
A value V is either the blank value ε, a constant c, an error ERR, or a
two-dimensional array {Vi,j}i∈1..m,j∈1..n, short for the array literal
{V1,1, . . . , V1,n; . . . ; Vm,1, . . . , Vm,n}.
Let F range over formulas. A formula is either a value V , a range r, or a
function application f (F1 , . . . , Fn ), where f ranges over names of pre-defined
worksheet functions such as SUM or PRODUCT.
Let S range over sheets, where a sheet is a partial function from addresses
to formulas that has finite domain. We write [] to denote the empty map, and
we write S[a → F ] to denote the extension of S to map address a to formula
F , potentially shadowing an existing mapping. We do not model the maximum
numbers of rows or columns imposed by some implementations. Each finite S
represents an unbounded sheet that is almost everywhere blank: we say a cell a
is blank to mean that a is not in the domain of S.
Let γ range over grids, where a grid is a partial function from addresses to
values that has finite domain. A grid can be viewed as a function that assigns
values to addresses, obtained by evaluating a sheet.
Figure 2 presents the operational semantics of the core calculus. Auxiliary defi-
nitions are present at the top of Figure 2.
Formula evaluation: S ⊢ F ⇓ V

  S ⊢ V ⇓ V

  S ⊢ Fi ⇓ Vi (for all i) and f(V1, . . . , Vn) = V  implies  S ⊢ f(F1, . . . , Fn) ⇓ V

  S ⊢ a ! V  implies  S ⊢ a:a ⇓ V

Address dereferencing: S ⊢ a ! V

Sheet evaluation: S ⇓ γ

  S ⇓ γ  ≝  ∀a ∈ dom(S). S ⊢ a ! γ(a)
The size of a range a1:a2, written (m, n), gives the number of rows m and the
number of columns n. We write a + (i, j) to denote the address i − 1 rows below
and j − 1 columns to the right of a. For example, a + (1, 1) maps to a, and
a + (1, 2) maps to the address immediately to the right of a. Both size(a1:a2)
and a + (i, j) are defined in Figure 2.
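Both functions are one-liners; this Python sketch (ours) fixes addresses as (row, column) pairs of positive integers:

def size(a1, a2):
    # size(a1:a2) = (m, n): rows and columns of a normalised range.
    (r1, c1), (r2, c2) = a1, a2
    return (r2 - r1 + 1, c2 - c1 + 1)

def offset(a, i, j):
    # a + (i, j): the address i - 1 rows below and j - 1 columns to the
    # right of a, so that a + (1, 1) is a itself.
    r, c = a
    return (r + i - 1, c + j - 1)

print(size((1, 1), (3, 2)))   # (3, 2): rows 1..3, columns 1..2
print(offset((1, 1), 1, 2))   # (1, 2): the cell immediately right of A1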
The spill calculus, presented in this section, is the first formalism to explain the
semantics of arrays that spill out of cells in spreadsheets. The spill calculus and
its convergence result, Theorem 1, constitute our first main technical contribution.
4.1 Syntax
Let S ≝ [A1 → {7; 8}, B1 → IF(A2 = 8, {9; 10}, 100)]

Fig. 4. Spill iteration for the sheet S above. Round 1 (empty oracle): A1 holds the
collapsed array {7; 8} and B1 evaluates to 100. Round 2: A1 spills (A1 = 7, A2 = 8)
and B1 holds the collapsed array {9; 10}. Round 3: both roots spill (A1 = 7, A2 = 8,
B1 = 9, B2 = 10). Iteration produces a sequence [] = ω1 −→ ω2 −→ · · · −→ ωn in
which ωn is consistent.
Consider the example in Figure 4. At the top we show the bindings of the sheet;
at the bottom we show the oracle and induced grid for each round of spilling.
We define the initial spill oracle as ω1 = [] and in the first round the oracle
is empty. An empty oracle anticipates no spill roots and therefore no roots are
permitted to spill. The array in A1 remains collapsed and B1 evaluates using the
false branch. Once the sheet has been fully evaluated we determine that ω1 was
not a consistent prediction because there is an array in A1 with no corresponding
entry in ω1 . We compute a new oracle that determines that A1 is allowed to spill
because the area is blank. We define the new oracle as ω2 = [A1 → (2, 1, ✓)].
In the second round the root A1 is permitted to spill by the oracle and as a
consequence B1 now evaluates to the array {9; 10}—this array is not anticipated
by the oracle and remains collapsed. Once the sheet has been fully evaluated we
determine that ω2 was not a consistent prediction because there is an array in
B1 with no corresponding entry in ω2; a third round, with both roots permitted
to spill, then yields a consistent grid (Figure 4).
Spill Rejection Spill oracles explicitly track the anticipated size of the array
to ensure that spill rejections based on incorrect dimensions can be corrected.
Consider the following example:
      A            B                                      C
 1                 IF(C2 = 2, {10; 20}, {10; 20; 30})     {1; 2}
 2
 3    {1, 2, 3}
After the first round using an empty spill oracle there are three spill roots:
A3 = {1, 2, 3}, B1 = {10; 20; 30}, and C1 = {1; 2}. There is sufficient space to
spill C1 but only space to spill one of A3 and B1; the decision is resolved using
the total ordering on addresses. Suppose that we allow A3 to spill such that the
new oracle is: [A3 → (1, 3, ✓), B1 → (3, 1, ×), C1 → (2, 1, ✓)].
After the second round we find that address B1 returns an array of a smaller
size because the root C1 spills into C2. Previously we thought B1 was too big to
spill but with the new oracle we find there is now sufficient room; by explicitly
recording the anticipated size it is possible to identify cases that require further
refinement. We compute the new oracle [A3 → (1, 3, ✓), B1 → (2, 1, ✓), C1 →
(2, 1, ✓)], which is consistent.
An interesting limitation arises if the total ordering places B1 before A3,
which we discuss in Section 4.6.
Figure 5 presents the operational semantics for the spill calculus. The key ad-
ditions to the relations for formula evaluation and address dereferencing are an
oracle ω that is part of the context, and a dependency set D that is part of the
output. We discuss each relation in turn, focusing on the extensions and modifications
relative to Figure 2. Auxiliary definitions are presented at the top of Figure 5.
owners(ω, a) = { (a_r, i, j) | ω(a_r) = (m, n, ✓) and a_r + (i, j) = a and (i, j) ≤ (m, n) }

area(a, m, n) = { a + (i, j) | i ∈ 1..m, j ∈ 1..n }

size(V) = (m, n) if V = {V_{i,j}}^{i∈1..m, j∈1..n}, and size(V) = ⊥ otherwise

Formula evaluation: S, ω ⊢ F ⇓ V, D

  S, ω ⊢ V ⇓ V, ∅

  If S, ω ⊢ F_i ⇓ V_i, D_i for each i ∈ 1..n and f(V_1, . . . , V_n) = V,
  then S, ω ⊢ f(F_1, . . . , F_n) ⇓ V, D_1 ∪ · · · ∪ D_n.

  If S, ω ⊢ a ! V#, V!, D, then S, ω ⊢ a# ⇓ V#, D ∪ {a}
  and S, ω ⊢ a:a ⇓ V!, D ∪ {a}.

  If a_1 ≠ a_2, size(a_1:a_2) = (m, n), and S, ω ⊢ a_1 + (i, j) ! V#_{i,j}, V!_{i,j}, D_{i,j}
  for each i ∈ 1..m, j ∈ 1..n, then
  S, ω ⊢ a_1:a_2 ⇓ {V!_{i,j}}^{i∈1..m, j∈1..n}, ⋃_{i,j} (D_{i,j} ∪ {a_1 + (i, j)}).

Address dereferencing: S, ω ⊢ a ! V#, V!, D

  If (a_r, i, j) ∈ owners(ω, a), ω(a_r) = (m, n, ✓), S(a_r) = F, S, ω\a_r ⊢ F ⇓ V, D,
  and size(V) = (m, n), then S, ω ⊢ a ! (a = a_r ? V : blank), V_{i,j}, (a = a_r ? D : ∅),
  where V_{i,j} is the (i, j) element of V.   (5)

Sheet evaluation: S, ω ⇓ γ

  S, ω ⇓ γ ≝ ∀a ∈ dom(S). S, ω ⊢ a ! γ(a)

Oracle consistency:

  γ ⊨_a ω ≝ ∀m, n, p. (ω(a) = (m, n, p)) ⇔ ∃V#, V!, D. (γ(a) = (V#, V!, D) ∧ size(V#) = (m, n))

  γ ⊨ ω ≝ ∀a. γ ⊨_a ω

Oracle refinement:

  decide(S, ω, []) = ω
  decide(S, ω, γ[a → (V#, V!, D)]) = decide(S, ω[a → (m, n, p)], γ)
    where a is the least element in dom(γ), size(V#) = (m, n), and
    p = ✓ if ∀a_t ∈ area(a, m, n). (a ≠ a_t ⇒ a_t ∉ dom(S)) and owners(ω, a_t) = ∅
    p = × otherwise

Spill step and finality:

  If S, ω ⇓ γ, γ ⊭ ω, and refine(S, ω, γ) = ω′, then ω −→S ω′.
  If S, ω ⇓ γ and γ ⊨ ω, then S ⊢ ω final.
A root is permitted to spill if its potential spill area is blank (excluding
the root itself) and each address in the spill area has no owner, thereby preserving
the single-spill policy.
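The permit computation can be sketched as follows; the dict-based oracle
{root: (m, n, permit)}, the inlined offset helper, and the function names are
our own rendering of owners and decide, not the paper's definitions:

    def offset(a, i, j):  # a + (i, j), as in the earlier sketch
        return (a[0] + i - 1, a[1] + j - 1)

    def owners(oracle, a):
        """owners(ω, a): roots whose predicted spill area covers address a."""
        return [(root, i, j)
                for root, (m, n, permit) in oracle.items() if permit
                for i in range(1, m + 1)
                for j in range(1, n + 1)
                if offset(root, i, j) == a]

    def permitted(sheet, oracle, root, m, n):
        """A root may spill iff its area, excluding the root itself, is blank
        and no address of the area already has an owner."""
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                a = offset(root, i, j)
                if a != root and a in sheet:
                    return False          # obstructed by an existing binding
                if owners(oracle, a):
                    return False          # already owned by another root
        return True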
Spill iteration The relation ω −→S ω′ denotes a single iteration of oracle refinement.
When a computed grid is not consistent with the spill oracle that induced
it, written γ ⊭ ω, a new oracle is produced using the function refine(S, ω, γ). We
write −→∗S for the reflexive and transitive closure of −→S.
Final oracle The relation S ⊢ ω final states that oracle ω is final for sheet S,
and holds when the grid induced by ω is consistent with ω.
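Together, the two relations suggest a simple driver loop; in this sketch,
evaluate_sheet, consistent, and refine are assumed helpers standing in for
S, ω ⇓ γ, the check γ ⊨ ω, and refine(S, ω, γ):

    def spill_iterate(sheet, evaluate_sheet, consistent, refine,
                      max_rounds=100):
        oracle = {}                               # ω1 = [], the empty oracle
        for _ in range(max_rounds):
            grid = evaluate_sheet(sheet, oracle)  # S, ω ⇓ γ
            if consistent(grid, oracle):          # γ ⊨ ω: ω is final
                return oracle, grid
            oracle = refine(sheet, oracle, grid)  # one step of ω −→S ω′
        raise RuntimeError("spill iteration did not converge")

The bound max_rounds is a safeguard for the sketch only; Theorem 1 is what
guarantees convergence for well-behaved sheets.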
This section presents the main technical result of the spill calculus: that iteration
of oracle refinement converges for well-behaved sheets. We begin with prelimi-
nary definitions and results.
To avoid ambiguous evaluation, every spill area must be disjoint and unob-
structed; an oracle is well-formed if it predicts non-blank spill roots, and predicts
disjoint and unobstructed spill areas, defined below:
Definition 1 (Well-formed oracle). We write S ⊢ ω wf if oracle ω is well-
formed for sheet S. An oracle ω is well-formed if, for all addresses a, it predicts
a spill root at a only when a is non-blank in S, and the spill areas it predicts
are pairwise disjoint and unobstructed.

Theorem 1 (Convergence). For every well-behaved sheet S there is a finite
sequence [] = ω1 −→S ω2 −→S · · · −→S ωn such that S ⊢ ωn final.
Proof. (Sketch—see Appendix B of the extended version [21] for the full proof.)
The value of any address with a binding is a function of its dependencies and the
oracle prediction for that address. We inductively define an address as fixed if
the oracle prediction is consistent for the address, and every address in the spill-
dependency set (defined in [21]) is fixed. Lemma 3 states that correct predictions
are always preserved, therefore a fixed address remains fixed through iteration
and its value remains invariant. The dependency graph of the sheet is acyclic;
therefore, if there is a non-fixed address then there must be a non-fixed address
with no dependencies but an inconsistent oracle prediction—we call this a non-
fixed source. Lemma 2 states that every new oracle correctly predicts the size
with respect to the previous grid; therefore, any non-fixed sources will be fixed
in the new oracle. We conclude by observing that the number of fixed addresses
in the sheet strictly increases at each step, and when every address is fixed the
oracle is final.
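The counting argument can be phrased as a termination measure; writing
Fixed(ω) for the set of fixed addresses under oracle ω (our own notation for
the sketch):

    Fixed(ω_k) ⊊ Fixed(ω_{k+1}) whenever ω_k is not final, and Fixed(ω_k) ⊆ dom(S),

so at most |dom(S)| refinement steps can occur before a final oracle is reached.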
After refinement, a root a may newly be permitted to spill, even if the size of the associated array does not change.
This particular interaction arises when a root that was previously preventing a
from spilling changes dimension, freeing a previously occupied spill area. Per-
mitting roots to spill into newly freed regions of the grid is desirable from a user
perspective because it reflects the visual aspect of spreadsheet programming
where an array will spill into any unoccupied cells.
A limitation of our formalism, if implemented directly, is that there exist
spreadsheets whose evaluation prevents an array from spilling despite the
potential spill area being blank. Consider a sheet in which the roots A3 and C1
compete for overlapping spill areas.
When the total ordering used by oracle refinement orders A3 before C1 then
the behaviour is as expected: A3 spills to the right and C1 evaluates to an error
value. When the total ordering used by oracle refinement orders C1 before A3
then the behaviour appears peculiar: A3 evaluates to an error value and C1
evaluates to 0. The root A3 is prevented from spilling despite there appearing
to be room in the grid! The issue is that the array in A3 never changes size; therefore,
the permit × assigned to the root is preserved, despite root C1 relinquishing the
spill area on subsequent spill iterations.
The fundamental problem is one of constraint satisfaction. We would like to
find a well-formed oracle that maximizes the number of roots that can spill in
a deterministic manner. The total order on addresses ensures determinism but
restricts the solution space. Our approach could be modified to deterministically
permute the ordering until an optimal solution is found; however, such a method
would be prohibitively expensive.
Both Sheets and Excel find the best solution to our example sheet. We expect
that their implementations do not permute a total order on addresses but instead
implement a more efficient algorithm that runs in bounded time. Finding an
efficient algorithm that is guaranteed to terminate remains an open challenge.
The limitation we present in our formalism only arises when a spreadsheet
includes dynamic spill collisions and conditional spilling. We anticipate that this
is a rare use case for spilled arrays, and does not arise when using spilled arrays
to implement gridlets for live copy-paste-modify.
      A        B          C
 1    “Edge”   “Len.”
 2    “a”      3          B2^2
 3    “b”      4          B3^2
 4    “c”      SQRT(C4)   C2 + C3

Elsewhere on the sheet, a cell holds the gridlet application G(A1:C4, B2, 7, B3, 24),
which re-instantiates the grid A1:C4 with the formulas of B2 and B3 replaced by
7 and 24.
Syntax Let x range over formula identifiers. Let F range over formulas, which
may additionally be an identifier x; LET(x, F1, F2), which binds the result of
evaluating F1 to x in F2; GRID, which captures the current sheet; UPDATE(F1, a, F2),
which updates a formula binding in a sheet-value; or VIEW(F, r), which extracts
a dereferenced range from a sheet-value. Let V range over values, which may
additionally be a sheet-value ⟨S⟩. Let 𝒱 range over views; a view is a sheet
paired with a range, written (S, r). A view range r delimits the addresses to be
computed in sheet S.
Identifier  x ∈ Ident
Formula     F ::= · · · | x | LET(x, F1, F2) | GRID | UPDATE(F1, a, F2) | VIEW(F, r)
Value       V ::= · · · | ⟨S⟩
View        𝒱 ::= (S, r)
Formula evaluation: S, ω ⊢ F ⇓ V, D

  If S, ω ⊢ F1 ⇓ V1, D1 and S, ω ⊢ F2[x := V1] ⇓ V2, D2,
  then S, ω ⊢ LET(x, F1, F2) ⇓ V2, D1 ∪ D2.

  S, ω ⊢ GRID ⇓ ⟨S⟩, ∅

View evaluation: 𝒱, ω ⇓ γ

  (S, r), ω ⇓ γ ≝ ∀a ∈ dom(S) ∩ area(r). S, ω ⊢ a ! γ(a)
Fig. 7. Syntax and Operational Semantics for the Grid Calculus (extends Figures 3–6)
Let S ≝ [A1 → VIEW(UPDATE(GRID, A1, 10), A2), A2 → A1]
Sheet S evaluates to grid [A1 → 10, A2 → 10]. What are the dependencies of
each address? The value of A2 in the grid depends on the value of A1 in the grid.
In contrast, the value of A1 in the grid does not depend on the value of A2 in the
grid. This is because evaluating the formula in A1 constructs a private grid from
which the value of A2 is obtained. However, A1 does depend on the formula
of A2 in the containing grid. Our semantics only considers value dependence,
therefore the dependency set of A1 is ∅—the address has no dependence on
values in the containing grid.
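To see how GRID, UPDATE, and VIEW produce these values, here is a
deliberately simplified evaluator (no spilling, no dependency sets, views
restricted to a single address; the representation is our own) that reproduces
the example:

    def evaluate(sheet, formula):
        tag = formula[0]
        if tag == 'const':
            return formula[1]
        if tag == 'ref':              # dereference an address in this sheet
            return evaluate(sheet, sheet[formula[1]])
        if tag == 'grid':             # GRID: capture the current sheet
            return dict(sheet)
        if tag == 'update':           # UPDATE(F1, a, F2): rebind a formula
            _, f1, addr, f2 = formula
            new_sheet = dict(evaluate(sheet, f1))
            new_sheet[addr] = f2      # stored as a formula, unevaluated
            return new_sheet
        if tag == 'view':             # VIEW(F, a): evaluate a in the sheet-value
            _, f, addr = formula
            inner = evaluate(sheet, f)
            return evaluate(inner, inner[addr])
        raise ValueError(tag)

    # A1 = VIEW(UPDATE(GRID, A1, 10), A2) and A2 = A1:
    S = {'A1': ('view', ('update', ('grid',), 'A1', ('const', 10)), 'A2'),
         'A2': ('ref', 'A1')}
    assert evaluate(S, S['A1']) == 10  # A2 in the private grid sees the update
    assert evaluate(S, S['A2']) == 10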
Formula dependence is vital for efficient recalculation, though we do not
model that in our semantics and only use dependency tracking to prevent spill
cycles. If an address depends on the value of another address bound in a sheet,
then it also depends on the formula of that address. The converse is not true in
the presence of sheet-values.
Spill iteration: ω −→𝒱 ω′ The definition of spill iteration for views is the same
as spill iteration for sheets, except that we use view evaluation rather than sheet
evaluation.
Final oracle: 𝒱 ⊢ ω final The definition of a final oracle for views is the same as
a final oracle for sheets, except that we use view evaluation rather than sheet
evaluation.
The translation makes our analogy concrete. We use the LET formula to lexically
capture self identifiers. The grid calculus allows the construction of diverging
formulas, as discussed in Section 4.5. We demonstrate this using a diverging
object calculus term.
We give an encoding of the lambda calculus that is inspired by the object calculus
embedding of the lambda calculus. We use ARG1 to hold the argument and
VAL1 to hold the result of a lambda. In spreadsheet languages both ARG1 and
VAL1 are legal cell addresses; for example, address ARG1 denotes the cell at
column 1151 and row 1.
[[x]] = x
[[λx.M]] = UPDATE(GRID, VAL1, LET(x, VIEW(GRID, ARG1), [[M]]))
[[M N]] = VIEW(UPDATE([[M]], ARG1, [[N]]), VAL1)
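The encoding can be exercised mechanically. The following sketch, with our
own tuple representation of lambda terms, emits the grid-calculus formula for
a term:

    def translate(term):
        """term: ('var', x) | ('const', c) | ('lam', x, body) | ('app', m, n)."""
        kind = term[0]
        if kind in ('var', 'const'):
            return str(term[1])
        if kind == 'lam':
            _, x, body = term
            return (f"UPDATE(GRID, VAL1, LET({x}, VIEW(GRID, ARG1), "
                    f"{translate(body)}))")
        if kind == 'app':
            _, m, n = term
            return f"VIEW(UPDATE({translate(m)}, ARG1, {translate(n)}), VAL1)"
        raise ValueError(term)

    # (λx. x) 42 translates to:
    # VIEW(UPDATE(UPDATE(GRID, VAL1, LET(x, VIEW(GRID, ARG1), x)), ARG1, 42), VAL1)
    print(translate(('app', ('lam', 'x', ('var', 'x')), ('const', 42))))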
A sheet-defined function [14, 17, 19, 20] is a mechanism for a user to author a
function using a region of a spreadsheet. We can model a sheet-defined function
f as a triple (S, (a0 , . . . , an ), r) that consists of the moat or sheet-bindings for
the function, the addresses from the moat that denote arguments, and the range
from the moat that denotes the result. The application f (V0 , . . . , Vn ) can be
encoded in the grid calculus as follows, where f = (S, (a0, . . . , an), r):

f(V0, . . . , Vn) = VIEW(UPDATE(· · · UPDATE(⟨S⟩, a0, V0) · · · , an, Vn), r)
7 Related Work
Extending the Spreadsheet Paradigm. Clack and Braine [8] propose a spreadsheet
based on a combination of functional and object-oriented programming. Their
integration is different from our analogy: in their system, a class is a collection
of parameterised worksheets, and a parameterised worksheet corresponds to a
method. In gridlets, the grid corresponds to an object and cells on the grid
correspond to methods of the object.
Error prevention and error detection. Abraham and Erwig propose type systems
for error detection [3] and automatic model inference [2]. Abraham and Erwig [3]
provide an operational semantics for sheets that is similar to the core calculus
in Section 3, but they do not give a semantics for spilled arrays.
Gencel [10] is a typed “template language” that describes the layout of a de-
sired worksheet along with a set of customized update operations that are specific
to the particular template. The type system guarantees that the restricted set
of update operations keeps the desired worksheet free from omission, reference,
and type errors.
Cheng and Rival [7] use abstract interpretation to detect formula errors due
to type mismatches. Their technique also incorporates analysis of associated
programs, such as VBA scripts, along with formulas on the grid.
8 Conclusion
Repetition is common in programming—spreadsheets are no different. The dis-
tinguishing property of spreadsheets is that reuse includes formatting and layout,
and is not limited to formula logic. Gridlets [12] are a high-level re-use abstrac-
tion for spreadsheets. In this work we give the first semantics of gridlets as a
formula. Our approach comes in two stages.
First, we make sense of spilled arrays, a feature that is available in major
spreadsheet implementations but not previously formalised. The concept is
deceptively simple, belying the many subtleties involved in implementing spilled arrays. We
present the spill calculus as a concise description of spilling in spreadsheets.
Second, we extend the spill calculus with the tools to implement gridlets. The
grid calculus introduces the concept of first-class sheet values, and describes the
semantics of three higher-order operators that emulate copy-paste-modify. The
composition of these operators gives the semantics for the gridlet operator G.
Spreadsheet programming bears a resemblance to object-oriented program-
ming, alluded to often in the literature. We show that the resemblance runs deep
by giving an encoding of the object calculus into the grid calculus, with a direct
parallel between objects and sheets.
Acknowledgements
Thank you to the Microsoft Excel team for hosting the second author during his
research internship at Microsoft’s Redmond campus. Thank you to Tony Hoare,
Simon Peyton Jones, Ben Zorn, and members of the Microsoft Excel team for
their feedback and assistance with this work.
References
1. Abadi, M., Cardelli, L.: A Theory of Objects. Monographs in Computer Science,
Springer (1996)
2. Abraham, R., Erwig, M.: Inferring templates from spreadsheets. In: Proceedings
of the 28th International Conference on Software Engineering. pp. 182–191. ICSE
’06, ACM, New York, NY, USA (2006)
3. Abraham, R., Erwig, M.: Type inference for spreadsheets. In: Proceedings of the
8th ACM SIGPLAN International Conference on Principles and Practice of Declar-
ative Programming. pp. 73–84. PPDP ’06, ACM, New York, NY, USA (2006)
4. Bock, A.A., Bøgholm, T., Sestoft, P., Thomsen, B., Thomsen, L.L.: Concrete and
abstract cost semantics for spreadsheets. Tech. Rep. TR–2008–203, IT University
of Copenhagen (2018)
5. Burnett, M., Atwood, J., Djang, R.W., Reichwein, J., Gottfried, H., Yang, S.:
Forms/3: A first-order visual language to explore the boundaries of the spreadsheet
paradigm. Journal of Functional Programming 11(2), 155–206 (2001)
6. Chambers, C., Ungar, D.M.: Customization: Optimizing compiler technology for
Self, a dynamically-typed object-oriented programming language. In: PLDI. pp.
146–160. ACM (1989)
7. Cheng, T., Rival, X.: Static analysis of spreadsheet applications for type-unsafe
operations detection. In: Vitek, J. (ed.) Programming Languages and Systems. pp.
26–52. Springer Berlin Heidelberg, Berlin, Heidelberg (2015)
8. Clack, C., Braine, L.: Object-oriented functional spreadsheets. In: 10th Glasgow
Workshop on Functional Programming. pp. 1–12 (1997)
9. Djang, R.W., Burnett, M.M.: Similarity inheritance: a new model of inheritance
for spreadsheet VPLs. In: Proceedings of the 1998 IEEE Symposium on Visual Languages
(Cat. No. 98TB100254). pp. 134–141. IEEE (1998)
10. Erwig, M., Abraham, R., Cooperstein, I., Kollmansberger, S.: Automatic genera-
tion and maintenance of correct spreadsheets. In: Proceedings of the 27th International
Conference on Software Engineering. pp. 136–145. ACM (2005)
11. Jelen, B.: Excel Dynamic Arrays Straight to the Point. Holy Macro!
Books (2018), see also https://blog-insider.office.com/2019/06/13/dynamic-arrays-and-new-functions-in-excel/
12. Joharizadeh, N., Sarkar, A., Gordon, A.D., Williams, J.: Gridlets: Reusing
spreadsheet grids. In: Extended Abstracts of the 2020 CHI Conference on
Human Factors in Computing Systems. CHI EA ’20, ACM, New York, NY,
USA (2020). https://doi.org/10.1145/3334480.3382806
13. Kay, A.: Computer software. Scientific American 251(3), 52–59 (1984), http://www.jstor.org/stable/24920344
14. McCutchen, M., Borghouts, J., Gordon, A.D., Peyton Jones, S., Sarkar, A.: Elastic
sheet-defined functions: Generalising spreadsheet functions to variable-size input
arrays (2019), unpublished manuscript available at https://aka.ms/calcintel
15. McCutchen, M., Itzhaky, S., Jackson, D.: Object spreadsheets: A new computa-
tional model for end-user development of data-centric web applications. In: Pro-
ceedings of the 2016 ACM International Symposium on New Ideas, New Paradigms,
and Reflections on Programming and Software. pp. 112–127. Onward! 2016, ACM,
New York, NY, USA (2016)
16. Mokhov, A., Mitchell, N., Peyton Jones, S.: Build systems à la carte. PACMPL
2(ICFP), 79:1–79:29 (2018)
17. Peyton Jones, S.L., Blackwell, A.F., Burnett, M.M.: A user-centred approach to
functions in Excel. In: ICFP. pp. 165–176. ACM (2003)
18. Sarkar, A., Gordon, A.D., Jones, S.P., Toronto, N.: Calculation view: multiple-
representation editing in spreadsheets. In: 2018 IEEE Symposium on Visual
Languages and Human-Centric Computing (VL/HCC). pp. 85–93 (Oct 2018).
https://doi.org/10.1109/VLHCC.2018.8506584
19. Sestoft, P.: Implementing function spreadsheets. In: Proceedings of the 4th inter-
national workshop on End-user software engineering. pp. 91–94. ACM (2008)
20. Sestoft, P., Sørensen, J.Z.: Sheet-defined functions: Implementation and initial eval-
uation. In: Dittrich, Y., Burnett, M., Mørch, A., Redmiles, D. (eds.) End-User
Development. pp. 88–103. Springer Berlin Heidelberg, Berlin, Heidelberg (2013)
21. Williams, J., Joharizadeh, N., Gordon, A.D., Sarkar, A.: Higher-order spreadsheets
with spilled arrays (with appendices). Tech. rep., Microsoft Research (2020), https://aka.ms/calcintel
Open Access This chapter is licensed under the terms of the Creative Commons
Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/),
which permits use, sharing, adaptation, distribution and reproduction in any medium
or format, as long as you give appropriate credit to the original author(s) and the
source, provide a link to the Creative Commons license and indicate if changes were
made.
The images or other third party material in this chapter are included in the chapter’s
Creative Commons license, unless indicated otherwise in a credit line to the material. If
material is not included in the chapter’s Creative Commons license and your intended
use is not permitted by statutory regulation or exceeds the permitted use, you will need
to obtain permission directly from the copyright holder.
Author Index