Frap Book
Adam Chlipala
MIT, Cambridge, MA, USA
Email address: [email protected]
Abstract. Briefly, this book is about an approach to bringing software en-
gineering up to speed with more traditional engineering disciplines, providing
a mathematical foundation for rigorous analysis of realistic computer systems.
As civil engineers apply their mathematical canon to reach high certainty that
bridges will not fall down, the software engineer should apply a different canon
to argue that programs behave properly. As other engineering disciplines have
their computer-aided-design tools, computer science has proof assistants, IDEs
for logical arguments. We will learn how to apply these tools to certify that
programs behave as expected.
More specifically: introductions to two intertwined subjects: the Coq
proof assistant, a tool for machine-checked mathematical theorem proving;
and formal logical reasoning about the correctness of programs.
For more information, see the book’s home page:
http://adam.chlipala.net/frap/
CHAPTER 1

Why Prove the Correctness of Programs?

The classic engineering disciplines all have their standard mathematical tech-
niques that are applied to the design of any artifact, before it is deployed, to gain
confidence about its safety, suitability for some purpose, and so on. The engineers
in a discipline more or less agree on what are “the rules” to be followed in vetting
a design. Those rules are specified with a high degree of rigor, so that it isn’t a
matter of opinion whether a design is safe. Why doesn’t software engineering have
a corresponding agreed-upon standard, whereby programmers convince themselves
that their systems are safe, secure, and correct? The concepts and tools may not
quite be ready yet for broad adoption, but they have been under development for
decades. This book introduces one particular tool and a body of ideas for how to
apply it to different tasks in program proof.
As this document is in a very early draft stage, no more will be said here, in
favor of jumping right into the technical material. Eventually, there will no doubt
be some sort of historical overview here, as part of a general placing-in-context of
the particular approach that will come next. There will also be plenty of scholarly
citations (here and throughout the book). In this early version, you get to take the
author’s word for it that we are about to learn a promising approach!
However, one overarching element of our strategy is important enough to de-
serve to be called out here. We will study a variety of different approaches for
formalizing what a program should do and for proving that a program does what it
should. At every step, we will pay close attention to the common foundation that
underlies everything. For one thing, we will be proving all of our theorems with
the Coq proof assistant1, a powerful framework for writing and machine-checking
proofs. Coq itself is based on a relatively small set of core features, much like a
well-designed programming language, and in both we build up increasingly sophis-
ticated abstractions as libraries. Those features can be thought of as the core of all
mathematical reasoning.
We will also apply a recipe specific to program proof. When we encounter a new
challenge, to prove a new kind of property about a new kind of program, we will
generally be considering four broad elements that appear in nearly all techniques.
• Encoding. Every programming language has both syntax, which defines
what programs look like, and semantics, which defines how programs be-
have when run. Even when these elements seem obvious intuitively, we
often find that there are surprisingly subtle choices to be made in defin-
ing syntax and semantics at the highest level of rigor. Seemingly minor
decisions can have big impacts on how smoothly our proofs go.
¹The author only makes an effort to keep the associated Coq code working with the latest
Coq version, which is 8.16 as of this writing.
In the course of the book, we will never quite define any of these meta-techniques
in complete formality. Instead, we’ll meet many examples of each, called out by
eye-catching margin notes. Generalizing from the examples should help the reader
start developing an intuition for when to use each element and for the common
design patterns that apply.
The core subject matter of the book is often grouped under traditional disci-
plinary headers like semantics, programming-languages theory, formal methods, and
verification. Often these different traditions have their own competing terminology
for shared concepts. We’ll follow one particular set of unified terminology and no-
tation, cherry-picked from the conventions of different communities. There really
is a huge amount of commonality across everything that we’ll study, so we don’t
want to distract by constantly translating between notations. It is quite important
to be literate in the standard notational conventions, which are almost always im-
plemented with LaTeX, and we stick entirely to that kind of notation in this book.
However, we follow another, much less usual convention: while we give theorem
and lemma statements, we rarely give their proofs. The reason is that the author
and many other researchers today feel that proofs on paper have outlived their use-
fulness. Instead, the proofs are all found in the parallel world of the accompanying
Coq source code.
That is, each chapter of this book has a corresponding Coq source file, dis-
tributed with the general book source code. The Coq sources are heavily com-
mented and may even, in many cases, be feasible to read without also reading the
book chapters. More importantly, the Coq sources aren’t just meant to be read.
They are meant to be executed. We suggest stepping through them interactively,
seeing intermediate states of proofs as appropriate. The book proper can be read
without the Coq sources, to learn the standard background material of program
proof; and the Coq sources can be read without the book proper, to learn a partic-
ular concrete realization of those ideas. However, they go better together.
CHAPTER 2
define a set. Formally, the set is defined to be the smallest one that satisfies all
the rules. Each rule has premises and a conclusion. We illustrate with four rules
that together are equivalent to the BNF grammar above, for defining a set Exp of
expressions.
The general reading of an inference rule is: if all the facts above the horizontal
line are true, then the fact below the line is true, too. The rule implicitly needs
to hold for all values of the metavariables (like n and e1) that appear within it;
we can model them more explicitly with a sort of top-level universal quantification.
Newcomers to semantics often react negatively to seeing this style of definition, but
very quickly it becomes apparent as a remarkably compact notation for expressing
many concepts. Think of it as a domain-specific programming language for math-
ematical definitions, an analogy that becomes quite concrete in the associated Coq
code!
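To make the analogy concrete right away, here is one way the definition below might be rendered as a Coq inductive type. This is only a hedged sketch; the book's accompanying files are the authority, and the names here are illustrative.

Require Import String.

(* Abstract syntax of arithmetic expressions, one constructor per rule. *)
Inductive exp : Set :=
| Const (n : nat)
| Var (x : string)
| Plus (e1 e2 : exp)
| Times (e1 e2 : exp).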
Const : ℕ → Exp
Var : Strings → Exp
Plus : Exp × Exp → Exp
Times : Exp × Exp → Exp
Note that the "×" here is not the multiplication operator of concrete syntax,
but rather the Cartesian-product operator of set theory, to indicate a type of pairs!
Such a list of constructors defines the set Exp to contain exactly those terms
that can be built up with the constructors. In inference-rule notation:

  n ∈ ℕ                 x ∈ Strings
  ───────────────       ──────────────
  Const(n) ∈ Exp        Var(x) ∈ Exp

  e1 ∈ Exp    e2 ∈ Exp        e1 ∈ Exp    e2 ∈ Exp
  ────────────────────        ─────────────────────
  Plus(e1, e2) ∈ Exp          Times(e1, e2) ∈ Exp
(2) For each premise E ∈ S, add a companion premise P(E). That is, the
obligation allows assuming that P holds of certain terms. Each such
assumption is called an inductive hypothesis (IH).
That mechanical procedure derives the following four proof obligations, associ-
ated with an inductive proof that ∀x ∈ Exp. P(x).

  n ∈ ℕ             x ∈ Strings
  ────────────      ────────────
  P(Const(n))       P(Var(x))

  e1 ∈ Exp    P(e1)    e2 ∈ Exp    P(e2)        e1 ∈ Exp    P(e1)    e2 ∈ Exp    P(e2)
  ──────────────────────────────────────        ──────────────────────────────────────
  P(Plus(e1, e2))                               P(Times(e1, e2))
In other words, to establish ∀x ∈ Exp. P(x), we need to prove that each of these
inference rules is valid.
To see induction in action, we prove a theorem giving a sanity check on our
two recursive definitions from earlier: depth can never exceed size.
Theorem 2.1. For all e ∈ Exp, ⌈e⌉ ≤ |e|.
Proof. By induction on the structure of e. □
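For readers following along in Coq, here is a hedged sketch of how this theorem might be stated and proved, with guessed definitions of size and depth standing in for the originals, which this excerpt elides (they are assumptions, not the book's code):

Require Import Lia.

(* Plausible definitions: each node counts 1 toward size and depth. *)
Fixpoint size (e : exp) : nat :=
  match e with
  | Const _ | Var _ => 1
  | Plus e1 e2 | Times e1 e2 => 1 + size e1 + size e2
  end.

Fixpoint depth (e : exp) : nat :=
  match e with
  | Const _ | Var _ => 1
  | Plus e1 e2 | Times e1 e2 => 1 + max (depth e1) (depth e2)
  end.

Theorem depth_le_size : forall e, depth e <= size e.
Proof.
  induction e; simpl; lia.  (* structural induction; lia closes each case *)
Qed.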
That sort of minimalist proof often surprises and frustrates newcomers. Our
position here is that proof checking is an activity fit for machines, not people, so
we will leave out gory details, which are to be found in the accompanying Coq
code, for this theorem and many others associated with this chapter. Actually,
even published proofs on paper tend to use “proofs” as brief as the one above,
relying on the reader’s experience to “fill in the blanks”! Unsurprisingly, fairly
often there are logical errors in such arguments, leading to acceptance of bogus
theorems. For that reason, we stick to machine-checked proofs here, using the
book chapters to introduce concepts, reasoning principles, and statements of key
theorems and lemmas.
Variables  x ∈ Strings
Functions  f ∈ Strings
Terms  e ::= x | f(e, …, e)
Propositions  φ ::= e = e | ¬φ | φ ∧ φ
In this theory, we know nothing about the detailed properties of the variables
or functions that we use. Instead, we must reason solely from the basic properties
of equality:
  ─────── Reflexivity
  e = e

  e2 = e1
  ─────── Symmetry
  e1 = e2

  e1 = e3    e3 = e2
  ────────────────── Transitivity
  e1 = e2

  f = f′    e1 = e1′    …    en = en′
  ─────────────────────────────────── Congruence
  f(e1, …, en) = f′(e1′, …, en′)
CHAPTER 3

Data Abstraction
All of the fully formal proofs in this book are worked out only in associated
Coq code. Therefore, before proceeding to more topics in program semantics and
proof, it is important to develop some basic Coq competence. Several heavily com-
mented example files are associated with this crucial point in the book. We won't
discuss details of Coq proving in this document, outside Appendix A. However,
one of the possibilities shown off in the Coq code is worth drawing attention to,
as a celebrated semantics idea in its own right, though we don’t yet connect it to
formalized syntax of programming languages. That idea is data abstraction, one
of the most central ideas in program structuring. Let’s consider the mathematical
meaning of encapsulation in data structures.
t(α) : Set
empty : t(α)
enqueue : t(α) × α → t(α)
dequeue : t(α) ⇀ t(α) × α

dequeue(empty) = ·
∀q. dequeue(q) = · ⇒ q = empty
Actually, the inference-rule notation from last chapter also makes algebraic
laws more readable, so here is a restatement.
  ────────────────────          dequeue(q) = ·
  dequeue(empty) = ·            ──────────────
                                q = empty
One more rule suffices to give a complete characterization of behavior, with the
familiar math notation for piecewise functions.

  dequeue(enqueue(q, x)) = ⎧ (empty, x),            if dequeue(q) = ·
                           ⎩ (enqueue(q′, x), y),   if dequeue(q) = (q′, y)
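In Coq, this kind of algebraic interface might be phrased as a module type, with the partial function modeled by option. A hedged sketch (names illustrative, proofs deferred to implementations):

Module Type QUEUE.
  Parameter t : Set -> Set.
  Parameter empty : forall A : Set, t A.
  Parameter enqueue : forall A : Set, t A -> A -> t A.
  Parameter dequeue : forall A : Set, t A -> option (t A * A).

  Axiom dequeue_empty : forall (A : Set), dequeue A (empty A) = None.
  Axiom empty_dequeue : forall (A : Set) (q : t A),
      dequeue A q = None -> q = empty A.
  Axiom dequeue_enqueue : forall (A : Set) (q : t A) (x : A),
      dequeue A (enqueue A q x)
      = Some (match dequeue A q with
              | None => (empty A, x)
              | Some (q', y) => (enqueue A q' x, y)
              end).
End QUEUE.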
t(α) = list(α)
empty = []
enqueue(q, x) = [x] ⧺ q
dequeue([]) = ·
dequeue([x] ⧺ q) = ([], x), when dequeue(q) = ·.
dequeue([x] ⧺ q) = ([x] ⧺ q′, y), when dequeue(q) = (q′, y).
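A hedged Coq sketch of this first implementation (the law proofs live in the book's accompanying code):

Require Import List.

Module ListQueue.
  Definition t (A : Set) : Set := list A.
  Definition empty (A : Set) : t A := nil.
  Definition enqueue (A : Set) (q : t A) (x : A) : t A := cons x q.

  (* Recurse to the back of the list, where the oldest element lives. *)
  Fixpoint dequeue (A : Set) (q : t A) : option (t A * A) :=
    match q with
    | nil => None
    | cons x q' =>
      match dequeue A q' with
      | None => Some (nil, x)
      | Some (q'', y) => Some (cons x q'', y)
      end
    end.
End ListQueue.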
There is also a dual implementation where we enqueue to list backs and dequeue
from list fronts.
t(α) = list(α)
empty = []
enqueue(q, x) = q ⧺ [x]
dequeue([]) = ·
dequeue([x] ⧺ q) = (q, x)
Proofs of the algebraic laws, for both implementations, appear in the associated
Coq code. Both versions actually take quadratic time in practice, assuming con-
catenation takes time linear in the length of its first argument. There is a famous,
more clever implementation that achieves amortized constant time (linear time to
run a whole sequence of operations), but we will need to expand our algebraic style
to accommodate it.
  dequeue(enqueue(q, x)) ≈ ⎧ (empty, x),            if dequeue(q) = ·
                           ⎩ (enqueue(q′, x), y),   if dequeue(q) = (q′, y)
What’s the payoff from this reformulation? Well, first, it passes the sanity
check that the two queue implementations from the last section comply, with ≈
instantiated as simple equality. However, we may now also handle the classic two-
stack queue. Here is its implementation, relying on list-reversal function rev (which
takes linear time).
t(α) = list(α) × list(α)
empty = ([], [])
enqueue((ℓ1, ℓ2), x) = ([x] ⧺ ℓ1, ℓ2)
dequeue(([], [])) = ·
dequeue((ℓ1, [x] ⧺ ℓ2)) = ((ℓ1, ℓ2), x)
dequeue((ℓ1, [])) = (([], q1′), x), when rev(ℓ1) = [x] ⧺ q1′.
The basic trick is to encode a queue as a pair of lists (ℓ1, ℓ2). We try to enqueue
into ℓ1 by adding elements to its front in constant time, and we try to dequeue from
ℓ2 by removing elements from its front in constant time. However, sometimes we
run out of elements in ℓ2 and need to reverse ℓ1 and transfer the result into ℓ2. The
suitable equivalence relation formalizes this plan.
rep((ℓ1, ℓ2)) = ℓ1 ⧺ rev(ℓ2)
q1 ≈ q2  iff  rep(q1) = rep(q2)
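One hedged Coq rendering of the two-stack queue, its representation function, and the equivalence relation:

Require Import List.
Import ListNotations.

Module TwoStacks.
  Definition t (A : Set) : Set := (list A * list A)%type.
  Definition empty (A : Set) : t A := ([], []).
  Definition enqueue (A : Set) (q : t A) (x : A) : t A :=
    let (l1, l2) := q in (x :: l1, l2).

  Definition dequeue (A : Set) (q : t A) : option (t A * A) :=
    match q with
    | (l1, x :: l2) => Some ((l1, l2), x)
    | (l1, [])  =>  (* second list empty: reverse the first and retry *)
      match rev l1 with
      | [] => None
      | x :: q1' => Some (([], q1'), x)
      end
    end.

  (* Conversion back to the simple one-list representation. *)
  Definition rep (A : Set) (q : t A) : list A :=
    let (l1, l2) := q in l1 ++ rev l2.
  Definition equiv (A : Set) (q1 q2 : t A) : Prop := rep A q1 = rep A q2.
End TwoStacks.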
We can prove both that this ≈ is an equivalence relation and that the other
queue laws are satisfied. As a result, client code (and its correctness proofs) can use
this fancy code, effectively viewing it as a simple queue, with the two-stack nature
hidden.
Why did we need to go through the trouble of introducing custom equivalence
relations? Consider the following two queues. Are they equal? (We write π1 for
the function that projects out the first element of a pair.)
  enqueue(empty, 2) ≟ π1(dequeue(enqueue(enqueue(empty, 1), 2)))
No, they aren't equal! The first expression reduces to ([2], []), while the second
reduces to ([], [2]). This data structure is noncanonical, in the sense that the same
logical value may have multiple physical representations. The equivalence relation
lets us indicate which physical representations are equivalent.
Notice that this specification style can also be viewed as giving a reference im-
plementation of the data type, where rep shows how to convert back to the reference
implementation at any point.
A few laws characterize expected behavior, with ⊤ and ⊥ the respective ele-
ments "true" and "false" of 𝔹.

  member(empty, k) = ⊥          member(add(s, k), k) = ⊤

  k1 ≠ k2
  ───────────────────────────────────────
  member(add(s, k1), k2) = member(s, k2)
There is a simple generic implementation of this data type with unsorted lists.
t = list
empty = []
add(s, k) = [k] ⧺ s
member([], k) = ⊥
member([k′] ⧺ s, k) = (k = k′ ∨ member(s, k))
However, we can build specialized finite sets for particular element types and
usage patterns. For instance, assume we are working with sets of natural numbers,
where we know that most sets contain consecutive numbers. In those cases, it suf-
fices to store just the lowest and highest elements of sets, and all the set operations
run in constant time. Assume a fallback implementation of finite sets, with type
t0 and operations empty0, add0, and member0. We implement our optimized set
type like so, assuming an operation fromRange : ℕ × ℕ → t0 to turn a range into
an ad-hoc set.
t = Empty | Range(ℕ × ℕ) | AdHoc(t0)
empty = Empty
add(Empty, k) = Range(k, k)
add(Range(n1, n2), k) = Range(n1, n2), when n1 ≤ k ≤ n2
add(Range(n1, n2), n1 − 1) = Range(n1 − 1, n2), when n1 ≤ n2
add(Range(n1, n2), n2 + 1) = Range(n1, n2 + 1), when n1 ≤ n2
add(Range(n1, n2), k) = AdHoc(add0(fromRange(n1, n2), k)), otherwise
add(AdHoc(s), k) = AdHoc(add0(s, k))
member(Empty, k) = ⊥
member(Range(n1, n2), k) = n1 ≤ k ≤ n2
member(AdHoc(s), k) = member0(s, k)
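A Coq sketch of this optimized implementation, with the fallback implementation abstracted as section variables (all names are illustrative, and fromRange is curried here). Since we work with natural numbers, the test k + 1 = n1 encodes k = n1 − 1 without underflow:

Require Import Arith Bool.

Section OptimizedSet.
  Variable t0 : Set.
  Variable empty0 : t0.
  Variable add0 : t0 -> nat -> t0.
  Variable member0 : t0 -> nat -> bool.
  Variable fromRange : nat -> nat -> t0.

  Inductive t : Set :=
  | Empty
  | Range (n1 n2 : nat)
  | AdHoc (s : t0).

  Definition add (s : t) (k : nat) : t :=
    match s with
    | Empty => Range k k
    | Range n1 n2 =>
      if (n1 <=? k) && (k <=? n2) then Range n1 n2
      else if (k + 1 =? n1) && (n1 <=? n2) then Range k n2
      else if (k =? n2 + 1) && (n1 <=? n2) then Range n1 k
      else AdHoc (add0 (fromRange n1 n2) k)
    | AdHoc s0 => AdHoc (add0 s0 k)
    end.

  Definition member (s : t) (k : nat) : bool :=
    match s with
    | Empty => false
    | Range n1 n2 => (n1 <=? k) && (k <=? n2)
    | AdHoc s0 => member0 s0 k
    end.
End OptimizedSet.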
This implementation can be proven to satisfy the finite-set spec, assuming that
the baseline ad-hoc implementation does, too. For workloads that only build sets
of consecutive numbers, this implementation can be much faster than the generic
list-based implementation, converting quadratic-time algorithms into linear-time.
CHAPTER 4

Semantics via Interpreters
That’s enough about what programs look like. Let’s shift our attention to what
programs mean.
  m[k ↦ v](k) = v          k1 ≠ k2
                           ─────────────────────
                           m[k1 ↦ v](k2) = m(k2)
With these operators in hand, we can write a semantics for arithmetic expres-
sions. This is a recursive function that maps variable valuations to numbers. We
write ⟦e⟧ for the meaning of e; this notation is often referred to as Oxford brackets.
Recall that we allow this kind of notation as syntactic sugar for arbitrary func-
tions, even when giving the equations that define those functions. We write v for a
valuation (finite map).
⟦n⟧v = n
⟦x⟧v = v(x)
⟦e1 + e2⟧v = ⟦e1⟧v + ⟦e2⟧v
⟦e1 × e2⟧v = ⟦e1⟧v × ⟦e2⟧v
Note how parts of the definition feel a little bit like cheating, as we just “push
notations inside the brackets.” It’s important to remember that plus inside the
brackets is syntax, while plus outside the brackets is the normal addition of math!
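As a Coq sketch, the interpreter is an ordinary recursive function over the exp type from before. For simplicity this version models valuations as total functions from strings to numbers, whereas the book uses finite maps:

Definition valuation : Set := string -> nat.

Fixpoint interp (e : exp) (v : valuation) : nat :=
  match e with
  | Const n => n
  | Var x => v x
  | Plus e1 e2 => interp e1 v + interp e2 v   (* syntax on the left, math on the right *)
  | Times e1 e2 => interp e1 v * interp e2 v
  end.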
To test our semantics, we define a variable substitution function. A substitution
[e′/x]e stands for the result of running through the syntax of e, replacing every
occurrence of variable x with expression e′.
[e/x]n = n
[e/x]x = e
[e/x]y = y, when y ≠ x
[e/x](e1 + e2) = [e/x]e1 + [e/x]e2
[e/x](e1 × e2) = [e/x]e1 × [e/x]e2
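A matching Coq sketch of substitution, deciding variable equality with the standard library's String.eqb:

Fixpoint subst (e' : exp) (x : string) (e : exp) : exp :=
  match e with
  | Const n => Const n
  | Var y => if String.eqb y x then e' else Var y
  | Plus e1 e2 => Plus (subst e' x e1) (subst e' x e2)
  | Times e1 e2 => Times (subst e' x e1) (subst e' x e2)
  end.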
We can prove a key compatibility property of these two recursive functions.
Theorem 4.1. For all e, e′, x, and v, ⟦[e′/x]e⟧v = ⟦e⟧(v[x ↦ ⟦e′⟧v]).
That is, in some sense, the operations of interpretation and substitution com-
mute with each other. That intuition gives rise to the common notion of a com-
muting diagram, like the one below for this particular example.
               [e′/x]…
  (e, v) ─────────────────→ ([e′/x]e, v)
     │                           │
  …[x ↦ ⟦e′⟧v]                  ⟦…⟧
     ↓                           ↓
  (e, v[x ↦ ⟦e′⟧v]) ───⟦…⟧───→ (common result)
We start at the top left, with a given expression e and valuation v. The diagram
shows the equivalence of two different paths to the bottom right. Each individual
arrow is labeled with some description of the transformation it performs, to get
from the term at its source to the term at its destination. The right-then-down
path is based on substituting and then interpreting, while the down-then-right path
is based on extending the valuation and then interpreting. Since both paths wind
up at the same spot, the diagram indicates an equality between the corresponding
terms.
It’s a matter of taste whether the theorem statement or the diagram expresses
the property more clearly!
⟦PushConst(n)⟧(v, s) = n ▷ s
⟦PushVar(x)⟧(v, s) = v(x) ▷ s
⟦Add⟧(v, n2 ▷ n1 ▷ s) = (n1 + n2) ▷ s
⟦Multiply⟧(v, n2 ▷ n1 ▷ s) = (n1 × n2) ▷ s
The last two cases require the stack have at least a certain height. Here we'll
ignore what happens when the stack is too short, though it suffices, for our purposes,
to add pretty much any default behavior for the missing cases. We overload ⟦i⟧ to
refer to the composition of the interpretations of the different instructions within
i, in order.
Next, we give our first example of what might be called a compiler, or a trans-
lation from one language to another. Let’s compile arithmetic expressions into
stack programs, which then become easy to map onto the instructions of common
assembly languages. In that sense, with this translation, we make progress toward
efficient implementation on commodity hardware.
Throughout this book, we will use notation ⌊…⌋ for compilation, where the
floor-based notation suggests moving downward to a lower abstraction level. Here
is the compiler that concerns us now, where we write i1 ⧺ i2 for concatenation of
two instruction sequences i1 and i2.

⌊n⌋ = PushConst(n)
⌊x⌋ = PushVar(x)
⌊e1 + e2⌋ = ⌊e1⌋ ⧺ ⌊e2⌋ ⧺ Add
⌊e1 × e2⌋ = ⌊e1⌋ ⧺ ⌊e2⌋ ⧺ Multiply
The first two cases are straightforward: their compilations just push the obvious
values onto the stack. The binary operators are just slightly more tricky. Each first
evaluates its operands in order, where each operand leaves its final result on the
stack. With both of them in place, we run the instruction to pop them, combine
them, and push the result back onto the stack.
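A hedged Coq sketch of the instruction set, a single-instruction interpreter, and the compiler, reusing exp and valuation from the earlier sketches. Following the text's license to pick any default behavior, a too-short stack is simply left unchanged:

Require Import List.
Import ListNotations.

Inductive instr : Set :=
| PushConst (n : nat)
| PushVar (x : string)
| Add
| Multiply.

Definition run1 (i : instr) (v : valuation) (s : list nat) : list nat :=
  match i, s with
  | PushConst n, _ => n :: s
  | PushVar x, _ => v x :: s
  | Add, n2 :: n1 :: s' => n1 + n2 :: s'
  | Multiply, n2 :: n1 :: s' => n1 * n2 :: s'
  | _, _ => s  (* arbitrary default for a too-short stack *)
  end.

Fixpoint compile (e : exp) : list instr :=
  match e with
  | Const n => [PushConst n]
  | Var x => [PushVar x]
  | Plus e1 e2 => compile e1 ++ compile e2 ++ [Add]
  | Times e1 e2 => compile e1 ++ compile e2 ++ [Multiply]
  end.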
The correctness theorem for compilation must refer to both of our interpreters.
From here on, we consider that all unaccounted-for variables in a theorem statement
are quantified universally.
Theorem 4.2. ⟦⌊e⌋⟧(v, ·) = ⟦e⟧v.
Here's a restatement as a commuting diagram.

         ⌊…⌋
   e ──────────→ ⌊e⌋
     ╲            │
   ⟦…⟧ ╲          │ ⟦…⟧
         ↘        ↓
            ⟦e⟧
As usual, we leave proof details for the associated Coq code, but the key insight
of the proof is to strengthen the induction hypothesis via a lemma.
Lemma 4.3. ⟦⌊e⌋ ⧺ i⟧(v, s) = ⟦i⟧(v, ⟦e⟧v ▷ s).
We strengthen the statement by considering both an arbitrary initial stack s
and a sequence of extra instructions i to be run after e.
Now the optimization itself is easy to define. We'll write |…| for this and other
optimizations, which move neither down nor up a tower of program abstraction
levels.

|skip| = skip
|x ← e| = x ← e
|c1; c2| = |c1|; |c2|
|repeat n do c done| = |c|ⁿ
|repeat e do c done| = repeat e do |c| done

Note that, when multiple defining equations apply to some function input, by
convention we apply the earliest equation that matches.
Let’s prove that this optimization preserves program behavior; that is, we prove
that it is semantics-preserving.
Theorem 4.4. ⟦|c|⟧v = ⟦c⟧v.
It all looks so straightforward from that statement, doesn’t it? Indeed, there
actually isn’t so much work to do to prove this theorem. We can also present it as
a commuting diagram much like the prior one.
         |…|
   c ──────────→ |c|
     ╲            │
   ⟦…⟧ ╲          │ ⟦…⟧
         ↘        ↓
            ⟦c⟧
The statement of Theorem 4.4 happens to be already in the right form to do
induction directly, but we need a helper lemma, capturing the interaction of cⁿ and
the semantics.

Lemma 4.5. ⟦cⁿ⟧ = ⟦c⟧ⁿ.
Let us end the chapter with the commuting-diagram version of the lemma
statement.

          …ⁿ
   c ──────────→ cⁿ
   │              │
  ⟦…⟧            ⟦…⟧
   ↓              ↓
  ⟦c⟧ ────…ⁿ───→ ⟦c⟧ⁿ
CHAPTER 5

Inductive Relations and Rule Induction
We should pause here to consider another crucial mathematical tool that is not
in common use outside the study of semantics but which will be essential for almost
all language semantics we define from here on. That tool is similar to the inductive
set or type definitions we met in Chapter 2. However, now we define relations (and
predicates, the colloquial name for single-argument relations) inductively. Let us
take some time to work through simple examples before moving on to cases more
relevant to semantics.
5.3. Permutations
It may not be the most intuitively obvious formulation, but we can use an
inductive relation to explain when one list is a permutation of another, written
here as infix relation ∼.
  ───────
  [] ∼ []

  ℓ1 ∼ ℓ2
  ─────────────────
  x ▷ ℓ1 ∼ x ▷ ℓ2

  ───────────────────────
  y ▷ x ▷ ℓ ∼ x ▷ y ▷ ℓ

  ℓ ∼ ℓ′    ℓ′ ∼ ℓ2
  ──────────────────
  ℓ ∼ ℓ2
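In Coq, such a definition becomes an inductive predicate, one constructor per rule; a hedged sketch:

Inductive perm {A : Set} : list A -> list A -> Prop :=
| PermNil : perm nil nil
| PermCons : forall x l1 l2, perm l1 l2 -> perm (x :: l1) (x :: l2)
| PermSwap : forall x y l, perm (y :: x :: l) (x :: y :: l)
| PermTrans : forall l l' l2, perm l l' -> perm l' l2 -> perm l l2.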
We apply the usual recipe to derive its induction principle, showing
∀ℓ, ℓ′. ℓ ∼ ℓ′ ⇒ Q(ℓ, ℓ′):
Most of the remaining connectives have elimination rules, too. The simplest
case is conjunction, with rules pulling out the conjuncts.
  Γ ⊢ φ1 ∧ φ2          Γ ⊢ φ1 ∧ φ2
  ────────────          ────────────
  Γ ⊢ φ1                Γ ⊢ φ2
We eliminate a disjunction using reasoning by cases, extending the context
appropriately in each case.
  Γ ⊢ φ1 ∨ φ2    φ1 ▷ Γ ⊢ φ    φ2 ▷ Γ ⊢ φ
  ──────────────────────────────────────────
  Γ ⊢ φ
If we manage to prove falsehood, we have a contradiction, and any conclusion
follows.
  Γ ⊢ ⊥
  ──────
  Γ ⊢ φ
Finally, we have the somewhat-controversial law of the excluded middle, which
actually does not hold in general in Coq, though many developments postulate it as
an extra axiom (which we take advantage of in our own Coq proof here). We write
negation ¬φ as shorthand for φ ⇒ ⊥.

  ─────────────
  Γ ⊢ φ ∨ ¬φ

This style of inductive relation definition is called natural deduction. We write
⊢ φ as shorthand for · ⊢ φ, for an empty context.
Note the fundamental new twist introduced compared to last section's language:
it is no longer the case that the top-level connective of the goal formula gives us a
small set of connective-specific rules that are the only ones we need to consider applying.
Instead, we may need to combine the hypothesis rule with elimination rules, taking
advantage of assumptions. The power of inductive relation definitions is clearer
here, since we couldn’t simply use a recursive function over formulas to express
that kind of pattern explicitly.
A simple interpreter sets the stage for proving soundness and completeness.
The most important extension to the interpreter is that it now takes in a valuation
v, just like in the previous chapter, though now the valuation maps variables to
truth values, not numbers.
⟦p⟧v = v(p)
⟦⊤⟧v = ⊤
⟦⊥⟧v = ⊥
⟦φ1 ∧ φ2⟧v = ⟦φ1⟧v ∧ ⟦φ2⟧v
⟦φ1 ∨ φ2⟧v = ⟦φ1⟧v ∨ ⟦φ2⟧v
⟦φ1 ⇒ φ2⟧v = ⟦φ1⟧v ⇒ ⟦φ2⟧v
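A Coq sketch of the formula syntax and this interpreter, here mapping into bool rather than Coq's Prop so that evaluation computes (an encoding choice for the sketch, not necessarily the book's):

Require Import String Bool.

Inductive formula : Set :=
| FVar (p : string)
| FTrue
| FFalse
| FAnd (f1 f2 : formula)
| FOr (f1 f2 : formula)
| FImp (f1 f2 : formula).

Fixpoint finterp (f : formula) (v : string -> bool) : bool :=
  match f with
  | FVar p => v p
  | FTrue => true
  | FFalse => false
  | FAnd f1 f2 => finterp f1 v && finterp f2 v
  | FOr f1 f2 => finterp f1 v || finterp f2 v
  | FImp f1 f2 => implb (finterp f1 v) (finterp f2 v)
  end.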
To indicate that φ is a tautology (that is, true under any values of the variables),
we write ⟦φ⟧, as a kind of abuse of notation expanding to ∀v. ⟦φ⟧v.
Theorem 5.14 (Soundness). If ⊢ φ, then ⟦φ⟧.
Proof. By appeal to Lemma 5.15. □
Lemma 5.15. If Γ ⊢ φ, and if we have ⟦φ′⟧v for every φ′ ∈ Γ, then ⟦φ⟧v.
Proof. By induction on the proof of Γ ⊢ φ, using propositional logic in the
metalanguage to plumb together the case proofs. □
The other direction, completeness, is quite a bit more involved, and indeed
its Coq proof strays outside the range of what is reasonable to ask students to
construct at this point in the book, but it makes for an interesting exercise. The
basic idea is to do a proof by exhaustive case analysis over the truth values of all
propositional variables p that appear in a formula.
Theorem 5.16 (Completeness). If ⟦φ⟧, then ⊢ φ.
Proof. By appeal to Lemma 5.17. □
Say that a context Γ and a valuation v are compatible if they agree on the truth
of any variable included in both. That is, when p ∈ Γ, we have v(p) = ⊤; and when
¬p ∈ Γ, we have v(p) = ⊥.
Lemma 5.17. Given context Γ and formula φ, if
• there is no variable p such that both p ∈ Γ and ¬p ∈ Γ, and
• for any valuation v compatible with Γ, we have ⟦φ⟧v,
then Γ ⊢ φ.
Proof. By induction on the set of variables p appearing in φ such that neither
p ∈ Γ nor ¬p ∈ Γ. If that set is empty, we appeal directly to Lemma 5.18. Otherwise,
choose some variable p in φ that hasn't yet been assigned a truth value in Γ.
Combine excluded middle on p with the ∨ elimination rule of ⊢ to do a case split,
so that it now suffices to prove both p ▷ Γ ⊢ φ and ¬p ▷ Γ ⊢ φ. Each case can be
proved by direct appeal to the induction hypothesis, since assigning p a truth value
in Γ shrinks the set we induct over. □
We write ⟦φ⟧Γ to denote interpreting φ in a valuation that assigns ⊤ to exactly
those variables that appear directly in Γ.
Lemma 5.18. Given context Γ and formula φ, if
• for every variable p appearing in φ, we have either p ∈ Γ or ¬p ∈ Γ; and
• there is no variable p such that both p ∈ Γ and ¬p ∈ Γ,
then if ⟦φ⟧Γ, then Γ ⊢ φ; otherwise, Γ ⊢ ¬φ.
Proof. By induction on φ, with tedious combination of propositional logic in
the metalanguage and in the rules of ⊢. Inductive cases make several appeals to
Lemma 5.19, and it is important that the base case for variables p is able to assume
that either p or ¬p appears in Γ. □
Lemma 5.19 (Weakening). If Γ ⊢ φ, and Γ′ includes a superset of the formulas
from Γ, then Γ′ ⊢ φ.
Proof. By induction on the proof of Γ ⊢ φ. □
CHAPTER 6

Transition Systems and Invariants
Natural numbers  n ∈ ℕ
States  s ::= AnswerIs(n) | WithAccumulator(n, n)

There are two types of states. An AnswerIs(a) state corresponds to the return
statement. It records the final result a of the factorial operation. A WithAccumulator(n, a)
state records an intermediate state, giving the values of the two local variables, just before
a loop iteration begins.
Following the more familiar parts of automata theory, let’s define a set of initial
states for this machine.
  ─────────────────────────────
  WithAccumulator(n0, 1) ∈ F0

For consistency with the notation we will be using later, we define the set F0 using
an inference rule. Equivalently, we could just write F0 = {WithAccumulator(n0, 1)},
essentially reading off the initial variable values from the first lines of the code above.
Similarly, we also define a set of final states.
  ──────────────────
  AnswerIs(a) ∈ Fω

Equivalently: Fω = {AnswerIs(a) | a ∈ ℕ}. Note that this definition only captures
when the program is done, not whether it returns the right answer. It follows from
the last line of the code.
The last and most important ingredient of our state machine is its transition
relation, where we write s → s′ to indicate that state s advances to state s′ in
one step, following the semantics of the program. Here inference rules are more
obviously a good fit.

  ────────────────────────────────────
  WithAccumulator(0, a) → AnswerIs(a)

  ───────────────────────────────────────────────────────────
  WithAccumulator(n + 1, a) → WithAccumulator(n, a × (n + 1))
The first rule corresponds to the case where the program ends, because the loop
test has failed and we now know the final answer. The second rule corresponds to
going once around the loop, following directly from the code in the loop body.
We can fit these ingredients into the general concept of a transition system, the
term we will use throughout this book for this sort of state machine. Actually, the
words “state machine” suggest to many people that the state set must be finite,
hence our preference for “transition system,” which is also used fairly frequently in
semantics.
Definition 6.1. A transition system is a triple ⟨S, S0, →⟩, with S a set of
states, S0 ⊆ S a set of initial states, and → ⊆ S × S a transition relation.
For an arbitrary transition relation →, not just the one defined above for fac-
torial, we define its transitive-reflexive closure →* with two inference rules:

  ─────────          s → s′    s′ →* s″
  s →* s             ──────────────────
                     s →* s″
That is, a formal claim s →* s′ corresponds exactly to the informal claim that
"starting from state s, we can reach state s′."
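As a hedged Coq sketch, the factorial machine and a generic transitive-reflexive closure might be rendered like so (constructor names are illustrative):

Inductive state : Set :=
| AnswerIs (a : nat)
| WithAccumulator (n a : nat).

Inductive step : state -> state -> Prop :=
| StepDone : forall a, step (WithAccumulator 0 a) (AnswerIs a)
| StepLoop : forall n a,
    step (WithAccumulator (S n) a) (WithAccumulator n (a * S n)).

(* Transitive-reflexive closure of any binary relation R. *)
Inductive trc {A : Type} (R : A -> A -> Prop) : A -> A -> Prop :=
| TrcRefl : forall x, trc R x x
| TrcFront : forall x y z, R x y -> trc R y z -> trc R x z.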
Definition 6.2. For transition system ⟨S, S0, →⟩, we say that a state s is
reachable if and only if there exists s0 ∈ S0 such that s0 →* s.
Building on these notations, here is one way to state the correctness of our
factorial program, which, defining S according to the state grammar above, we
model as F = ⟨S, F0, →⟩.
Theorem 6.3. For any state s reachable in F, if s ∈ Fω, then s = AnswerIs(n0!).
That is, whenever the program finishes, it returns the right answer. (Recall
that n0 is the initial value of the input variable.)
We could prove this theorem now in a relatively ad-hoc way. Instead, let’s
develop the general machinery of invariants.
6.2. Invariants
The concept of “invariant” may be familiar from such relatively informal notions
as “loop invariant” in introductory programming classes. Intuitively, an invariant
is a property of program state that starts true and stays true, but let’s make that
idea a bit more formal, as applied to our transition-system formalism.
Definition 6.4. An invariant of a transition system is a property that is
always true, in all of the system's reachable states. That is, for transition system
⟨S, S0, →⟩, where R is the set of all its reachable states, some I ⊆ S is an invariant
iff R ⊆ I. (Note that here we adopt the mathematical convention that "properties"
of states and "sets" of states are synonymous, so that in each case we can use what
terminology seems most natural. The "property" holds of exactly those states that
belong to the "set.")
At first look, the definition may appear a bit silly. Why not always just take
the reachable states R as the invariant, instead of scrambling to invent something
new? The reason is the same as for strengthening induction hypotheses to make
proofs easier. Often it is easier to characterize an invariant that isn’t fully precise,
admitting some states that the system can never actually reach. Additionally, it
can be easier to prove existence of an approximate invariant by induction, by the
method that the next key theorem formalizes.
Theorem 6.5. Consider a transition system ⟨S, S0, →⟩ and its candidate in-
variant I. The candidate is truly an invariant if (1) S0 ⊆ I and (2) for every s ∈ I
where s → s′, we also have s′ ∈ I.
That's enough generalities for now. Let's define a suitable invariant for factorial.

I(AnswerIs(a)) = (n0! = a)
I(WithAccumulator(n, a)) = (n0! = n! × a)
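In Coq, the invariant is naturally a predicate over states; a sketch, using the state type from the earlier sketch, the input n0 as a section variable, and the standard library's fact:

Require Import Coq.Arith.Factorial.

Section FactorialInvariant.
  Variable n0 : nat.  (* the program's input *)

  Definition invariantF (s : state) : Prop :=
    match s with
    | AnswerIs a => fact n0 = a
    | WithAccumulator n a => fact n0 = fact n * a
    end.
End FactorialInvariant.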
It is an almost-routine exercise to prove that I really is an invariant, using
Theorem 6.5. The key new ingredient we need is inversion, a principle for deducing
which inference rules may have been used to prove a fact.
For instance, at one point in the proof, we need to draw a conclusion from a
premise s ∈ F0, meaning that s is an initial state. By inversion, because set F0 is
defined by a single inference rule, that rule must have been used to conclude the
premise, so it must be that s = WithAccumulator(n0, 1).
Similarly, at another point in the proof, we must reason from a premise s → s′.
The relation → is defined by two inference rules, so inversion leads us to two cases to
consider. In the first case, corresponding to the first rule, s = WithAccumulator(0, a)
and s′ = AnswerIs(a). In the second case, corresponding to the second rule, s =
WithAccumulator(n + 1, a) and s′ = WithAccumulator(n, a × (n + 1)). It's worth
checking that these values of s and s′ are read off directly from the rules.
Though a completely formal and exhaustive treatment of inversion is beyond
the scope of this text, generally it follows standard intuitions about “reverse-
engineering” a set of rules that could have been used to derive some premise.
Another important property of invariants formalizes the connection with weak-
ening an induction hypothesis.
((g, ℓ), Write(n)) →L ((n + 1, ℓ), Unlock)        ((g, ℓ), Unlock) →L ((g, ⊥), Done)
Note that these rules will allow a thread to read and write the shared state even
without holding the lock. The rules also allow any thread to unlock the lock, with
no consideration for whether that thread must be the current lock holder. We must
use an invariant-based proof to show that there are, in fact, no lurking violations
of the lock-based concurrency discipline.
Of course, with just a single thread running, there aren’t any interesting viola-
tions! However, we have been careful to describe system L in a generic way, with
its state a pair of shared and private components. We can define a generic notion
of a multithreaded system, with two systems that share some state and maintain
their own private state.
Definition 6.8. Let T¹ = ⟨S × P¹, S0 × P0¹, →¹⟩ and T² = ⟨S × P², S0 × P0², →²⟩
be two transition systems, with a shared-state type S in common between their
state sets, also agreeing on the initial values S0 for that shared state. We define the
parallel composition T¹ | T² as ⟨S × (P¹ × P²), S0 × (P0¹ × P0²), →⟩, defining the new
transition relation → with the following inference rules, which capture the usual
notion of thread interleaving.
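The rules themselves are omitted from this excerpt, but a plausible Coq rendering of interleaving composition is sketched below: either side may step, updating the shared state, while the other side's private state is unchanged (names and packaging are guesses, not the book's code):

Inductive parstep {Sh P1 P2 : Type}
          (step1 : Sh * P1 -> Sh * P1 -> Prop)
          (step2 : Sh * P2 -> Sh * P2 -> Prop)
  : Sh * (P1 * P2) -> Sh * (P1 * P2) -> Prop :=
| StepLeft : forall sh sh' p1 p1' p2,
    step1 (sh, p1) (sh', p1') ->
    parstep step1 step2 (sh, (p1, p2)) (sh', (p1', p2))
| StepRight : forall sh sh' p1 p2 p2',
    step2 (sh, p2) (sh', p2') ->
    parstep step1 step2 (sh, (p1, p2)) (sh', (p1, p2')).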
Note that the operator | is carefully defined so that its output is suitable as
input to a further instance of itself. As a result, while L | L is a transition system
modeling two threads running the code from above, we also have L | pL | Lq as a
three-thread system based on that code, pL | Lq | pL | Lq as a four-thread system
based on that code, etc.
Also note that | constructs transition systems with our first examples of non-
determinism in transition relations. That is, given a particular starting state, there
are multiple different places it may wind up after a given number of execution
steps. In general, with thread-interleaving concurrency, the set of possible final
states grows exponentially in the number of steps, a fact that torments concurrent-
software testers to no end! Rather than consider all possible runs of the program,
we will use an invariant to tame the complexity.
First, we should be clear on what we mean to prove about this program. Let’s
also restrict our attention to the two-thread case for the rest of this section; the
n-thread case is left as an exercise for the reader!
CHAPTER 7

Model Checking
Our analyses so far have been tedious for at least two different reasons. First,
we’ve hand-crafted definitions of transition systems, rather than just writing pro-
grams in conventional programming languages. The next chapter will clear that
obstacle, by introducing operational semantics, for building transition systems au-
tomatically from programs. The other inconvenience we’ve faced is defining invari-
ants manually. There isn’t a silver bullet to get us out of this duty, when working
with Turing-complete languages, where almost all interesting questions, this one
included, are undecidable. However, when we can phrase problems in terms of
transition systems with finitely many reachable states, we can construct invariants
automatically by exhaustive exploration of the state space, an approach otherwise
known as model checking. Surprisingly many real programs can be reduced to finite
state spaces, using the techniques introduced in this chapter. First, though, let’s
formalize our intuitions about exhaustive state-space exploration as a sound way
to find invariants.
Here we call reach(n) a fixed point of the transition system, because it is closed
under further exploration. To find a fixed point with a concrete system, we start
with S0 . We repeatedly take the single-step closure corresponding to composition
with Ñ. At each step, we check whether the expanded set is actually equal to the
previous set. If so, our process of multi-step closure has terminated, and we have
an invariant, by construction. Again, keep in mind that multi-step closure will not
terminate for most transition systems, and there is an art to phrasing a problem in
terms of systems where it will terminate.
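Here is one hedged Coq sketch of that process over explicitly listed states, with a fuel parameter bounding iteration (a real model checker would be far more refined; eqb and successors are assumed inputs):

Require Import List Arith.

Section Explore.
  Variable st : Type.
  Variable eqb : st -> st -> bool.           (* decidable equality on states *)
  Variable successors : st -> list st.       (* all one-step successors *)

  Definition add_new (acc : list st) (s : st) : list st :=
    if existsb (eqb s) acc then acc else s :: acc.

  (* One round of closure: add all one-step successors of known states. *)
  Definition expand (cur : list st) : list st :=
    fold_left add_new (flat_map successors cur) cur.

  (* Iterate, stopping once a round adds nothing new. *)
  Fixpoint reach (fuel : nat) (cur : list st) : list st :=
    match fuel with
    | O => cur
    | S fuel' =>
      let next := expand cur in
      if length next =? length cur then cur else reach fuel' next
    end.
End Explore.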
int global = 0;

thread() {
  int local;

  while (true) {
    local = global;
    global = local + 2;
  }
}
If we assume infinite-precision integers, then the state space is infinite. Consid-
ering just the global variable, every even number is reachable, even if we only run a
single thread. However, there is a high degree of regularity across this state space.
In particular, those values really are all even. Consider this other program, which
is hauntingly similar to the last one, in a way that we will make precise shortly.
bool global = true;

thread() {
  bool local;

  while (true) {
    local = global;
    global = local;
  }
}
We replaced every use of an integer with a Boolean that is true iff the integer
is even. Notice that now the program has a finite state space, and model checking
applies easily! We can formalize such a transformation via the general principle of
abstraction of a transition system.
The key idea is that every state of the concrete system (with relatively many
states) can be associated to one or more states of the abstract system (with rela-
tively few states). We formalize this association via a simulation relation R, and
we define what makes a choice of R sound, via a notion of simulation via a binary
operator <, subscripted by R.
f() {
  int local = 0;

  while (true) {
    local = global;
    local = 3 + local;
    local = 7 + local;
    global = local;
  }
}
Call the transition-system encoding of this code S. We can apply the Boolean-
for-evenness abstraction to model a single thread with finite state, but we are
left needing to account for interference by other threads. However, we can apply
Theorem 7.4 to analyze threads separately.
For instance, we want to show that "global is always even" is an invariant of
S | S. By Theorem 7.3, we can switch to analyzing system (S | S)^I, where I is the
evenness invariant. By Theorem 7.4, we can switch to proving the same invariant
separately for systems S^I and S^I, which are, of course, the same system in this case.
We apply the Boolean-for-evenness abstraction to this system, to get one with a
finite state space, so we can check the invariant automatically by model checking.
Following the chain of reasoning backward, we have proved the invariant for S | S.
Even better, that last proof includes the hardest steps that carry over to the
proof for an arbitrary number of threads. Define an exponentially growing system
of threads Sⁿ by:

S⁰ = S
Sⁿ⁺¹ = Sⁿ | Sⁿ
Theorem 7.5. For any n, it is an invariant of Sⁿ that the global variable is
always even.
Proof. By induction on n, repeatedly using Theorem 7.4 to push the obli-
gation down to the leaves of the tree of concurrent compositions, after applying
Theorem 7.3 at the start to introduce the use of …^I. Every leaf is the same system
S, for which we abstract and apply model checking, appealing to the step above
where we ran the same analysis. □
CHAPTER 8
Operational Semantics
It gets tedious to define a relation from first principles, to explain the behaviors
of any concrete program. We do more things with programs than just reason
about them. For instance, we compile them into other languages. To get the most
mileage out of our correctness proofs, we should connect them to the same program
syntax that we pass to compilers. Operational semantics is a family of techniques
for automatically defining a transition system, or other relational characterization,
from program syntax.
Throughout this chapter, we will demonstrate the different operational-semantics
techniques on a single source language, defined like so.
Numbers n P N
Variables x P Strings
Expressions e ::“ n|x|e`e|e´e|eˆe
Commands c ::“ skip | x Ð e | c; c | if e then c else c | while e do c
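A hedged Coq rendering of this syntax (constructor names are guesses; note that this chapter's expressions include subtraction, unlike the arithmetic language from Chapter 4):

Require Import String.

Inductive arith : Set :=
| AConst (n : nat)
| AVar (x : string)
| APlus (e1 e2 : arith)
| AMinus (e1 e2 : arith)
| ATimes (e1 e2 : arith).

Inductive cmd : Set :=
| Skip
| Assign (x : string) (e : arith)
| Seq (c1 c2 : cmd)
| If (e : arith) (thn els : cmd)
| While (e : arith) (body : cmd).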
Theorem 8.4. There exists a valuation v such that (•[input ↦ 2], factorial) →*
(v, skip) and v(output) = 2.
  ⟦e⟧v ≠ 0                                          ⟦e⟧v = 0
  ─────────────────────────────────────────        ────────────────────────────────
  (v, while e do c1) →0 (v, c1; while e do c1)     (v, while e do c1) →0 (v, skip)
We regain the full coverage of the original rules with a new relation →c, saying
that we may apply →0 at the active subcommand within a larger command.

  (v, c) →0 (v′, c′)
  ────────────────────────
  (v, C[c]) →c (v′, C[c′])
Let’s revisit last section’s example, to see contextual semantics in action, es-
pecially to demonstrate how to express an arbitrary command as an evaluation
context plugged with another command.
Theorem 8.11. There exists valuation v such that (•[input ↦ 2], factorial) →c*
(v, skip) and v(output) = 2.
Proof.
(•[input ↦ 2], output ← 1; factorial_loop)
= (•[input ↦ 2], (□; factorial_loop)[output ← 1])
→c (•[input ↦ 2][output ↦ 1], skip; factorial_loop)
= (•[input ↦ 2][output ↦ 1], □[skip; factorial_loop])
→c (•[input ↦ 2][output ↦ 1], factorial_loop)
= (•[input ↦ 2][output ↦ 1], □[factorial_loop])
→c (•[input ↦ 2][output ↦ 1], (output ← output × input; input ← input − 1); factorial_loop)
= (•[input ↦ 2][output ↦ 1], ((□; input ← input − 1); factorial_loop)[output ← output × input])
→c (•[input ↦ 2][output ↦ 2], (skip; input ← input − 1); factorial_loop)
= (•[input ↦ 2][output ↦ 2], (□; factorial_loop)[skip; input ← input − 1])
→c (•[input ↦ 2][output ↦ 2], input ← input − 1; factorial_loop)
= (•[input ↦ 2][output ↦ 2], (□; factorial_loop)[input ← input − 1])
→c (•[input ↦ 1][output ↦ 2], skip; factorial_loop)
= (•[input ↦ 1][output ↦ 2], □[skip; factorial_loop])
→c* …
→c (•[input ↦ 0][output ↦ 2], skip)

Clearly the final valuation assigns output to 2. □
8.3.1. Equivalence of Small-Step, With and Without Evaluation Contexts.
This new semantics formulation is equivalent to the other two, as we establish now.
Theorem 8.12. If (v, c) → (v′, c′), then (v, c) →c (v′, c′).
Proof. By induction on the derivation of (v, c) → (v′, c′). □
Lemma 8.13. If (v, c) →0 (v′, c′), then (v, c) → (v′, c′).
Proof. By cases on the derivation of (v, c) →0 (v′, c′). □
Lemma 8.14. If (v, c) →0 (v′, c′), then (v, C[c]) → (v′, C[c′]).
Proof. By induction on the structure of evaluation context C, appealing to
the last lemma. □
Theorem 8.15. If (v, c) →c (v′, c′), then (v, c) → (v′, c′).
Proof. By inversion on the derivation of (v, c) →c (v′, c′), followed by an
appeal to the last lemma. □
8.4. Determinism
Our last extension with parallelism introduced intentional nondeterminism in
the semantics: a single starting state can step to multiple different next states.
However, the three semantics for the original language are deterministic, and we
can prove it.
Theorem 8.16. If (v, c) ⇓ v1 and (v, c) ⇓ v2, then v1 = v2.
Proof. By induction on the derivation of (v, c) ⇓ v1 and inversion on the
derivation of (v, c) ⇓ v2. □
Theorem 8.17. If (v, c) → (v1, c1) and (v, c) → (v2, c2), then v1 = v2 and
c1 = c2.
Proof. By induction on the derivation of (v, c) → (v1, c1) and inversion on
the derivation of (v, c) → (v2, c2). □
Theorem 8.18. If (v, c) →c (v1, c1) and (v, c) →c (v2, c2), then v1 = v2 and
c1 = c2.
Proof. Follows from the last theorem and the equivalence we proved between
→ and →c. □
We’ll stop, for now, in our tour of useful properties of operational semantics. All
of the rest of the book is based on small-step semantics, with or without evaluation
contexts. As we study new kinds of programming languages, we will see how
to model them operationally. Almost every new proof technique is phrased as
an approach to establishing invariants of transition systems based on small-step
semantics.
CHAPTER 9
The last two chapters showed us both how to build a transition system from a
program automatically and how to find an invariant for a transition system auto-
matically. Let’s now combine these ideas to find invariants for programs automat-
ically, in a particular way associated with the technique of dataflow analysis used
to drive many compiler optimizations. Throughout, we’ll stick with the example
of the small imperative language whose semantics we studied in the last chapter.
We'll confine our attention to its basic small-step semantics via the → relation.
Model checking builds up increasingly larger finite sets of reachable states in
a system. A state pv, cq of our imperative language combines control state c (the
next command to execute) with data state v (the values of the variables), and so
model checking will find invariants that restrict both components. We say that
model checking is path-sensitive because its invariants can distinguish between the
different data states that can be associated with the same control state, reached
along different paths in the program’s executions. Path-sensitive analyses tend
to be much more computationally expensive than path-insensitive analyses, whose
invariants collapse together all ways of reaching the same control state. Dataflow
analysis is one such path-insensitive approach, and its underlying theory is abstract
interpretation.
• ∼ formalizes the idea of which concrete values are covered by which ab-
stract values.
For a, b ∈ D, define a ⊑ b to mean ∀n ∈ ℕ. (n ∼ a) ⇒ (n ∼ b). That is, b
is at least as general as a. An abstract interpretation must satisfy the following
algebraic laws:
• ∀a ∈ D. a ⊑ ⊤
• ∀n ∈ ℕ. n ∼ C(n)
• ∀n, m ∈ ℕ. ∀a, b ∈ D. n ∼ a ∧ m ∼ b ⇒ (n + m) ∼ (a +̂ b)
• ∀n, m ∈ ℕ. ∀a, b ∈ D. n ∼ a ∧ m ∼ b ⇒ (n − m) ∼ (a −̂ b)
• ∀n, m ∈ ℕ. ∀a, b ∈ D. n ∼ a ∧ m ∼ b ⇒ (n × m) ∼ (a ×̂ b)
• ∀a, b, a′, b′ ∈ D. a ⊑ a′ ∧ b ⊑ b′ ⇒ (a +̂ b) ⊑ (a′ +̂ b′)
• ∀a, b, a′, b′ ∈ D. a ⊑ a′ ∧ b ⊑ b′ ⇒ (a −̂ b) ⊑ (a′ −̂ b′)
• ∀a, b, a′, b′ ∈ D. a ⊑ a′ ∧ b ⊑ b′ ⇒ (a ×̂ b) ⊑ (a′ ×̂ b′)
• ∀a, b ∈ D. a ⊑ (a ⊔ b)
• ∀a, b ∈ D. b ⊑ (a ⊔ b)
As an example, consider this formalization of even-odd analysis, whose proof of
soundness is left as an exercise for the reader. (While the treatment of subtraction
may seem gratuitously imprecise, recall that we are working here with natural
numbers and not integers, such that subtraction “sticks” at zero when the result
would otherwise be negative.)
D = {E, O, ⊤}
C(n) = E or O, depending on the parity of n

E +̂ E = E
E +̂ O = O
O +̂ E = O
O +̂ O = E
a +̂ b = ⊤, otherwise

E −̂ E = E
O −̂ O = E
a −̂ b = ⊤, otherwise

E ×̂ a = E
a ×̂ E = E
O ×̂ O = O
a ×̂ b = ⊤, otherwise

E ⊔ E = E
O ⊔ O = O
a ⊔ b = ⊤, otherwise

n ∼ E = n is even
n ∼ O = n is odd
n ∼ ⊤ = always
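A Coq sketch of this domain and a few of its operators, with catch-all match arms playing the role of the "otherwise" equations (constructor names are invented to avoid clashing with nat's O):

Require Import PeanoNat.

Inductive parity : Set := PEven | POdd | PTop.

Definition pplus (a b : parity) : parity :=
  match a, b with
  | PEven, PEven => PEven
  | PEven, POdd => POdd
  | POdd, PEven => POdd
  | POdd, POdd => PEven
  | _, _ => PTop
  end.

Definition pjoin (a b : parity) : parity :=
  match a, b with
  | PEven, PEven => PEven
  | POdd, POdd => POdd
  | _, _ => PTop
  end.

(* The compatibility relation n ∼ a. *)
Definition compat (n : nat) (a : parity) : Prop :=
  match a with
  | PEven => Nat.Even n
  | POdd => Nat.Odd n
  | PTop => True
  end.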
We generally think of an abstract interpretation as forming a lattice (actually
a semilattice), which is roughly the algebraic structure characterized by operations
like ⊔, when ⊔ truly returns the most specific or least upper bound of its two
arguments. We visualize the even-odd lattice like so.

      ⊤
     / \
    E   O
The idea is that taking the join of two elements moves us up the lattice to their
lowest common ancestor.
An edge going up from a to b indicates that a ⊑ b. As another example,
consider a lattice tracking prime factors of numbers, up to 5. Then the picture
version might go like so:

           {}
        /   |   \
     {2}   {3}   {5}
      | \  / \  / |
      |  \/   \/  |
      |  /\   /\  |
      | /  \ /  \ |
    {2,3} {2,5} {3,5}
        \   |   /
        {2,3,5}
Since Ď is clearly transitive, upward-moving paths across multiple nodes also
imply Ď relationships between their endpoints. It’s worth verifying quickly that
any two nodes in this graph have a unique lowest common ancestor, which is the
proper result of the \ operation on those nodes.
Another worthwhile exercise for the reader is to work out the proper definitions
of +̂, −̂, and ×̂ for this domain.
Next, we model the possible effects of commands. We already said that our
flow-insensitive analysis will forget about control flow in a command, but what
does that mean formally? States of this language, without control flow taken into
account, are just variable valuations, and the only way a command can affect a
valuation is through executing assignments. Therefore, forgetting the control flow
of a command amounts to just recording which assignments it contains syntactically,
losing all context about which Boolean tests would need to pass to reach each
assignment. This simple syntactic extraction process can be formalized with an
assignments-of function A for commands.
A(skip) = {}
A(x ← e) = {(x, e)}
A(c1; c2) = A(c1) ∪ A(c2)
A(if e then c1 else c2) = A(c1) ∪ A(c2)
A(while e do c1) = A(c1)
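As a Coq sketch over the cmd syntax sketched in Chapter 8, using a list where the math uses a set (a simplification that tolerates duplicates):

Require Import List String.

Fixpoint assignments (c : cmd) : list (string * arith) :=
  match c with
  | Skip => nil
  | Assign x e => (x, e) :: nil
  | Seq c1 c2 => assignments c1 ++ assignments c2
  | If _ c1 c2 => assignments c1 ++ assignments c2
  | While _ c1 => assignments c1
  end.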
As a final preliminary ingredient, for abstract states s1 and s2, define s1 ⊔ s2
by (s1 ⊔ s2)(x) = s1(x) ⊔ s2(x).
Now we define the flow-insensitive step relation, over abstract states alone, as:

  ────────────          (x, e) ∈ A(c)
  s →cFI s              ──────────────────────
                        s →cFI s ⊔ s[x ↦ ⟦e⟧s]
We can establish formally how forgetting about the order of assignments is a
valid abstraction technique.

Theorem 9.3. Given command c, initial valuation v, and initial abstract state
s such that v ∼ s, the transition system with initial state s and step relation →cFI
simulates the system with initial state (v, c) and step relation →, according to a
simulation relation enforcing ∼ between the valuation and abstract state.
Now a simple procedure can find an invariant for the abstracted system. In
particular:
(1) Initialize s with the abstract state from the theorem statement.
(2) Compute s′ = s ⊔ ⨆_{(x,e) ∈ A(c)} s[x ↦ ⟦e⟧s].
(3) If s′ ⊑ s, then we're done; s is the invariant.
(4) Otherwise, assign s := s′ and return to step (2).
Every step in this outline is computable, since the abstract states will always
be finite maps.
Theorem 9.4. If the outline above terminates, then it is an invariant of the
flow-insensitive abstracted system that s (its final value from the loop above) is an
upper bound for every reachable state. That is, for every reachable s′, s′ ⊑ s.
To check a concrete program, we first abstract it to a flow-insensitive version
with Theorem 9.3, then we find a guaranteed invariant with Theorem 9.4. One
wrinkle here is that it is not obvious that our informal loop above always terminates.
However, it always terminates if our abstract domain has finite height, meaning that
there is no infinite ascending chain of distinct elements aᵢ such that aᵢ ⊑ aᵢ₊₁ for all
i. Our even-odd example trivially has that property, since it contains only finitely
many distinct elements.
It is worth emphasizing that, when those conditions are met, our invariant-
finding procedure is guaranteed to terminate, even though the underlying language
is Turing-complete, so that most interesting analysis problems are uncomputable!
The catch is that it is always possible that the invariant found is a trivial one,
where the abstract state maps every variable to J.
Here is an example of a program where flow-insensitive even-odd analysis gives
the most precise answer (relative to its simplifying assumption that we must assign
the same description to a variable at every step of execution).

n ← 10; x ← 0; while n > 0 do (x ← x + 2 × n; n ← n − 1)

The abstract state we wind up with is •[n ↦ ⊤][x ↦ E].
Theorem 9.5. Given command c and initial valuation v, the transition system
with initial state (s, c) and step relation →FS simulates the system with initial state
(v, c) and step relation →, according to a simulation relation enforcing equality of
the commands, as well as ∼ between the valuation and abstract state.
Now another simple procedure can find an invariant for the abstracted system.
We write S ⊔ S′ for joining of two flow-sensitive abstract states. When c is in the
domain of exactly one of S or S′, S ⊔ S′ agrees with the corresponding mapping.
When c is in neither domain, it isn't in the domain of S ⊔ S′ either. Finally, when
c is in both domains, we have (S ⊔ S′)(c) = S(c) ⊔ S′(c).
Also define S ⊑ S′ to mean that, whenever S(c) = s, there exists s′ such that
S′(c) = s′ and s ⊑ s′.
9.4. Widening
Consider an abstract interpretation of intervals, where each element of the
domain is either [a, b] or [a, ∞), for a, b ∈ ℕ. Restricting our attention to a and
b values between 0 and 1 for illustration purposes, we have this diagram of the
domain, where the bottom element represents an empty set.

          [0, ∞)
         /      \
     [0, 1]    [1, ∞)
      /    \    /
  [0, 0]   [1, 1]
      \    /
      [1, 0]

The abstract operators have intuitive and simple definitions, flattening the
different kinds of intervals into a common notation; for instance, (a1, b1) ⊔ (a2, b2) =
(min(a1, a2), max(b1, b2)).
CHAPTER 10

Compiler Correctness via Simulation Arguments

  ─────────────────────────────────
  (v, out(e)) →0^{⟦e⟧v} (v, skip)

  ──────────────────────────────────────        ───────────────────────────
  (v, x ← e) →0^ε (v[x ↦ ⟦e⟧v], skip)           (v, skip; c2) →0^ε (v, c2)
  ⟦e⟧v ≠ 0                                      ⟦e⟧v = 0
  ────────────────────────────────────────     ────────────────────────────────────────
  (v, if e then c1 else c2) →0^ε (v, c1)       (v, if e then c1 else c2) →0^ε (v, c2)

  ⟦e⟧v ≠ 0                                           ⟦e⟧v = 0
  ──────────────────────────────────────────────     ──────────────────────────────────
  (v, while e do c1) →0^ε (v, c1; while e do c1)     (v, while e do c1) →0^ε (v, skip)
  (v, c) →0^ℓ (v′, c′)
  ──────────────────────────
  (v, C[c]) →c^ℓ (v′, C[c′])
To reason about infinite executions, we need a new abstraction, compared to
what has worked in our invariant-based proofs so far. That abstraction will be
traces, sequences of outputs (and termination events) that a program might be
observed to generate. We define a command's trace set inductively. Recall that ·
is the empty list, while ⧺ does list concatenation.

  ───────────          ──────────────────────────
  · ∈ Tr(s)            terminate ∈ Tr((v, skip))

  s →c^ε s′    t ∈ Tr(s′)          s →c^n s′    t ∈ Tr(s′)
  ───────────────────────          ─────────────────────────
  t ∈ Tr(s)                        out(n) ⧺ t ∈ Tr(s)
Notice that a trace is allowed to end at any point, even if the program under
inspection hasn’t terminated yet. Also, since our language is deterministic, for any
two traces of one command, one trace is a prefix of the other. Many parts of the
machinery we develop here will, however, work well for nondeterministic systems,
as we will see with labeled transition systems for concurrency in Chapter 21.
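A hedged Coq sketch of trace sets as an inductive predicate over an abstract labeled step relation (the instantiation for our concrete language lives in the accompanying code; names here are illustrative):

Inductive event : Set :=
| OutEv (n : nat)
| TerminateEv.

Section Traces.
  Variable st : Type.
  Variable step : st -> option nat -> st -> Prop.  (* None = silent, Some n = output n *)
  Variable done : st -> Prop.                      (* terminal states, e.g. command skip *)

  Inductive traceOf : st -> list event -> Prop :=
  | TrEmpty : forall s, traceOf s nil
  | TrTerminate : forall s, done s -> traceOf s (TerminateEv :: nil)
  | TrSilent : forall s s' t,
      step s None s' -> traceOf s' t -> traceOf s t
  | TrOut : forall s s' n t,
      step s (Some n) s' -> traceOf s' t -> traceOf s (OutEv n :: t).
End Traces.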
Definition 10.1 (Trace inclusion). For commands c1 and c2, let c1 ≼ c2 iff
Tr(c1) ⊆ Tr(c2).
Definition 10.2 (Trace equivalence). For commands c1 and c2, let c1 ≃ c2 iff
Tr(c1) = Tr(c2).
We will enforce that a correct compiler phase respects trace equivalence. That
is, the output program has the same traces as the input program. For nondeter-
ministic languages, subtler conditions are called for, but we’re happy to stay within
the safe confines of determinism for this chapter.
       R
  s1 ─────── s2
  │ℓ          │ℓ
  ↓           ↓
  s1′ ─────── s2′
       R
As usual, the diagram tells us that when a path along the left exists, a matching
roundabout path exists, too. That is, any step on the left can be matched by a step
on the right. Notice the similarity to the invariant-induction principle that we have
mostly relied on so far. Instead of showing that every step preserves a one-state
predicate, we show that every step preserves a two-state predicate in a particular
way. The simulation approach is as general for relating programs as the invariant
approach is for verifying individual programs.
Theorem 10.4. If there exists a simulation R such that s1 R s2, then s1 ≃ s2.
Proof. We prove the two trace-inclusion directions separately. The left-to-
right direction proceeds by induction over the definition of traces on the left, while
the right-to-left direction proceeds by similar induction on the right. While most of
the proof is generic in details of the labeled transition system, for the right-to-left
direction we do rely on proofs of two important properties of this object language.
First, the semantics is total, in the sense that any state whose command isn’t skip
can take a step. Second, the semantics is deterministic, in that there can be at
most one label/state pair reachable in one step from a particular starting state.
In the inductive step of the right-to-left inclusion proof, we know that the
righthand system has taken a step. The lefthand system might already be a skip,
in which case, by the definition of simulations, the righthand system is already a
skip, contradicting the assumption that the righthand side stepped. Otherwise, by
totality, the lefthand system can take a step. By the definition of simulation, there
exists a matching step on the righthand side. By determinism, the matching step
is the same as the one we were already aware of. Therefore, we have a new R
relationship to connect to that step and apply the induction hypothesis. □
We can apply this very general principle to constant folding.
Theorem 10.5. For any v and c, (v, c) ≈ (v, cfold1(c)).
Proof. By a simulation argument using this relation:

(v1, c1) R (v2, c2)  =  v1 = v2 ∧ c2 = cfold1(c1)
What we have done is translate the original theorem statement into the language
of binary relations, as this simple case needs no equivalent of strengthening the
induction hypothesis. Internally to the proof, we need to define constant folding of
evaluation contexts C, and we need to prove that primitive steps →₀ may be lifted
to apply over constant-folded states, this second proof by case analysis on →₀
derivations. Another more obvious workhorse is a lemma showing that constant
folding of expressions preserves interpretation results. □
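To make that last workhorse concrete, here is a minimal Coq sketch of expression-level constant folding and its correctness lemma, for a cut-down arithmetic syntax; arith, interp, and cfoldExpr are illustrative names, not the book’s exact definitions.

Require Import String Arith.

Inductive arith : Set :=
| Const (n : nat)
| Var (x : string)
| Plus (e1 e2 : arith).

Definition valuation := string -> nat.

Fixpoint interp (e : arith) (v : valuation) : nat :=
  match e with
  | Const n => n
  | Var x => v x
  | Plus e1 e2 => interp e1 v + interp e2 v
  end.

(* Fold an addition only when both subterms fold to constants. *)
Fixpoint cfoldExpr (e : arith) : arith :=
  match e with
  | Const _ | Var _ => e
  | Plus e1 e2 =>
    match cfoldExpr e1, cfoldExpr e2 with
    | Const n1, Const n2 => Const (n1 + n2)
    | e1', e2' => Plus e1' e2'
    end
  end.

(* Constant folding of expressions preserves interpretation results. *)
Lemma cfoldExpr_ok : forall e v, interp (cfoldExpr e) v = interp e v.
Proof.
  induction e; simpl; intros; auto.
  specialize (IHe1 v); specialize (IHe2 v).
  destruct (cfoldExpr e1) eqn:E1; destruct (cfoldExpr e2) eqn:E2;
    rewrite E1 in IHe1; rewrite E2 in IHe2; simpl in *; congruence.
Qed.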
count n will be used up, and the incorrect “optimized” program will be forced to
reveal itself by taking a step that outputs.
Theorem 10.8. If there exists a simulation with skipping R such that s1 Rₙ s2,
then s1 ≈ s2.
Proof. The proof is fairly similar to that of Theorem 10.4. To show termina-
tion preservation in the backward direction, we find ourselves proving a lemma by
induction on n. □
Theorem 10.9. For any v and c, (v, c) ≈ (v, cfold2(c)).
Proof. By a simulation argument (with skipping) using this relation:

(v1, c1) Rₙ (v2, c2)  =  v1 = v2 ∧ c2 = cfold2(c1) ∧ countIfs(c1) < n

We rely on a simple helper function countIfs(c) to count how many If nodes appear
in the syntax of c. This notion turns out to be a conservative upper bound on how
many times in a row we will need to let lefthand steps go unmatched on the right.
The rest of the proof proceeds essentially the same way as in Theorem 10.5. □
The heart of this relation is a subrelation ∼ over valuations, capturing when they
agree on all variables that are not reserved for temporaries, since the flattened
program will feel free to scribble all over the temporaries. The details of ∼ are
especially important to the key lemma, showing that flattening of expressions is
sound, both taking in a ∼ premise and drawing a related ∼ conclusion. The overall
proof is not short, with quite a few lemmas, found in the Coq code. □
It might not be clear why we bothered to define simulation with multiple match-
ing steps, when we already had simulation with skipping. After all, we use simu-
lation to conclude completely symmetric facts about two commands, so why not
just verify this section’s example by applying simulation with skipping, with the
operand order reversed?
Consider the heart of the proof approach that we did adopt. We need to show
that any step of c can be matched suitably by flatten(c). The proof is divided into
cases by inversion on a premise (v, c) →c^ℓ (v′, c′). Each case naturally fixes the
top-level structure of c, from which we can apply straightforward algebraic
simplification to find the top-level structure of flatten(c) and therefore the step
rules that apply to it.

Now consider applying simulation with skipping, with the commands passed as
operands in the reverse order. The crucial inversion is on (v, flatten(c)) →c^ℓ (v′, c′).
Unfortunately, the top-level structure of flatten(c) does not imply the top-level
structure of c, but we need to show that c can take a matching step. We need
to prove a whole set of bothersome special-case inversion lemmas by induction,
essentially to invert the action of what is, in the general case, an arbitrarily complex
compiler.
CHAPTER 11

Lambda Calculus and Simple Type Safety
We’ll now take a break from the imperative language we’ve been studying for
the last three chapters, instead looking at a classic sort of small language that distills
the essence of functional programming. That’s the language paradigm that we’ve
been using throughout this book, as we coded executable versions of algorithms.
Its distinctive characteristics are first, a computation style based on simplifying
terms instead of running step-by-step instructions that modify state; and second,
use of functions as first-class values. Functional programming went mainstream
in the early 21st century, influencing widely adopted languages from JavaScript,
where first-class functions are routinely used as callbacks in asynchronous event
processing; to Scala, a hybrid language that melds functional-programming ideas
with object-oriented programming for the Java platform; to Haskell, a purely func-
tional language that has become popular with programming hobbyists and is seeing
increasing adoption in industry.
The heart of functional programming persists even in λ-calculus (or lambda cal-
culus), the simplest version of which contains just three syntactic forms, but which
provides probably the simplest of the widely known Turing-complete languages that
is (nearly!) pleasant to program in directly.
Intuitively, a variable is free in an expression iff it has at least one occurrence
that is not inside the scope of a λ binding the same variable.
Next we define substitution.
[e′/x] x        = e′
[e′/x] y        = y,  if y ≠ x
[e′/x] (λx. e)  = λx. e
[e′/x] (λy. e)  = λy. [e′/x] e,  if y ≠ x
[e′/x] (e1 e2)  = ([e′/x] e1) ([e′/x] e2)
Notice a peculiar property of this definition when we work with open terms,
whose free-variable sets are nonempty. According to the definition, [x/y] (λx. y) =
λx. x. In this example, we say that λ-bound variable x has been captured uninten-
tionally, where substitution created a reference to that λ where none existed before.
Such a problem can only arise when replacing a variable with an open term. In this
case, that term is x, where FV(x) = {x} ≠ ∅.
More general investigations into λ-calculus will define a more involved notion of
capture-avoiding substitution. Instead, in this book, we carefully steer clear of the
λ-calculus applications that require substituting open terms for variables, letting us
stick with the simpler definition. When it comes to formal encoding of this style of
syntax in proof assistants, surprisingly many complications arise, leading to what
is still an active research area in encodings of language syntax with local variable
binding. Since we aim more for broad than deep coverage of the field of formal
program reasoning, we are happy to avoid those complexities.
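For reference, the simplified (non-capture-avoiding) substitution might be coded in Coq as follows; exp and subst are assumed names, and the definition is only adequate when the term being substituted in is closed, per the discussion above.

Require Import String.

Inductive exp : Set :=
| Var (x : string)
| Abs (x : string) (body : exp)
| App (e1 e2 : exp).

(* subst e' x e computes [e'/x]e. *)
Fixpoint subst (e' : exp) (x : string) (e : exp) : exp :=
  match e with
  | Var y => if string_dec y x then e' else Var y
  | Abs y body =>
    (* Stop at a binder for the same variable; otherwise recurse.
       No renaming is done, so an open e' may be captured here. *)
    if string_dec y x then Abs y body else Abs y (subst e' x body)
  | App e1 e2 => App (subst e' x e1) (subst e' x e2)
  end.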
With substitution in hand, a big-step semantics is easy to define. We use the
syntactic shorthand v for a value, or term that needs no further evaluation, which
in this case includes just the λ-abstractions.

λx. e ⇓ λx. e

e1 ⇓ λx. e    e2 ⇓ v    [v/x] e ⇓ v′
─────────────────────────────────────
e1 e2 ⇓ v′
Ω = (λx. x x) (λx. x x)
An enjoyable (though not entirely trivial) exercise for the reader is to generalize
the methods of Church encoding to encoding of other inductive datatypes, including
the syntax of λ-calculus itself. A hallmark of a Turing-complete language is that it
can host an interpreter for itself, and λ-calculus is no exception!
(λx. e) v → [v/x] e
Note the one subtlety: the function argument is required to be a value. This
innocuous-looking restriction helps enforce call-by-value evaluation order, where,
upon encountering a function application, we must first evaluate the function, then
evaluate the argument, and only then call the function.
Two more rules complete the semantics and the characterization of call-by-
value.
e1 → e1′
────────────────
e1 e2 → e1′ e2

e2 → e2′
────────────────
v e2 → v e2′
Note again from the second rule that we are only allowed to move on to evaluating
the function argument after the function is fully evaluated.
Following a very similar outline to what we used in Chapter 8, we establish
equivalence between the two semantics for λ-calculus.
Theorem 11.6. If e →* v, then e ⇓ v.
Theorem 11.7. If e ⇓ v, then e →* v.
There are a few proof subtleties beyond what we encountered before, and the
Coq formalization may be worth reading, to see those details.
Again as before, we have a natural way to build a transition system from any
λ-term e, where L is the set of λ-terms. We define T(e) = ⟨L, {e}, →⟩. The next
section gives probably the most celebrated λ-calculus result based on the transition-
system perspective.
We add two new bureaucratic rules for addition, mirroring those for function
application.
e1 → e1′
──────────────────
e1 ⊕ e2 → e1′ ⊕ e2

e2 → e2′
──────────────────
v ⊕ e2 → v ⊕ e2′

One more rule for addition is called for. Here we face a classic nuisance in
writing rules that combine explicit syntax with standard mathematical operators,
and we write ⊕ for the syntactic construct and + for the mathematical addition
operator.

n ⊕ m → n + m
What would be a useful property to prove about our new expressions? For
one thing, we don’t want them to “crash,” as in the expression (λx. x) ⊕ 7 that
tries to add a function and a number. No rule of the semantics knows what to do
with that case, but it also isn’t a value, so we shouldn’t consider it as finished with
evaluation. Define an expression as stuck when it is not a value and it cannot take
a small step. For “reasonable” expressions e, we should be able to prove that it is
an invariant of T(e) that no expression is ever stuck.
To define “reasonable,” we formalize the popular idea of a static type system.
Every expression will be assigned a type, capturing which sorts of contexts it may
legally be dropped into. Our language of types is simple.

Types τ ::= N | τ → τ
We have trees of function-space constructors, where all the leaves are instances of
the natural-number type N. Note that, with type assignment, we have yet another
case of abstraction, approximating a potentially complex expression with a type
that only records enough information to rule out crashes.
To assign types to closed terms, we must recursively define what it means for an
open term to have a type. To that end, we use typing contexts Γ, finite maps from
variables to types. To mimic standard notation, we write Γ, x : τ as shorthand for
Γ[x ↦ τ], overriding of key x with value τ in Γ. Now we define typing as a three-
place relation, written Γ ⊢ e : τ, to indicate that, assuming Γ as an assignment of
types to e’s free variables, we conclude that e has type τ.
We define the relation inductively, with one case per syntactic construct.

Γ(x) = τ
─────────
Γ ⊢ x : τ

─────────
Γ ⊢ n : N

Γ ⊢ e1 : N    Γ ⊢ e2 : N
─────────────────────────
Γ ⊢ e1 ⊕ e2 : N

Γ, x : τ1 ⊢ e : τ2
───────────────────────
Γ ⊢ λx. e : τ1 → τ2

Γ ⊢ e1 : τ1 → τ2    Γ ⊢ e2 : τ1
────────────────────────────────
Γ ⊢ e1 e2 : τ2
We write ⊢ e : τ as shorthand for • ⊢ e : τ, meaning that closed term e has type
τ, with no typing context required. Note that this style of typing rules provides
another instance of modularity, since we can separately type-check different subex-
pressions of a large expression, using just their types to coordinate expectations
among subexpressions.
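As a sketch of how such a judgment might be encoded in Coq, here is one plausible rendering, with contexts as partial functions from variables to types and one constructor per rule; all the names (exp, type, hasty, extend) are illustrative, and the book’s code uses its own finite-map library.

Require Import String.

Inductive type : Set :=
| Nat
| Fun (dom ran : type).

Inductive exp : Set :=
| Var (x : string)
| Const (n : nat)
| Plus (e1 e2 : exp)   (* the syntactic ⊕ *)
| Abs (x : string) (body : exp)
| App (e1 e2 : exp).

Definition ctx := string -> option type.
Definition empty : ctx := fun _ => None.
Definition extend (G : ctx) (x : string) (t : type) : ctx :=
  fun y => if string_dec y x then Some t else G y.

(* One case per syntactic construct, mirroring the rules above. *)
Inductive hasty : ctx -> exp -> type -> Prop :=
| HtVar : forall G x t,
    G x = Some t -> hasty G (Var x) t
| HtConst : forall G n,
    hasty G (Const n) Nat
| HtPlus : forall G e1 e2,
    hasty G e1 Nat -> hasty G e2 Nat -> hasty G (Plus e1 e2) Nat
| HtAbs : forall G x e t1 t2,
    hasty (extend G x t1) e t2 -> hasty G (Abs x e) (Fun t1 t2)
| HtApp : forall G e1 e2 t1 t2,
    hasty G e1 (Fun t1 t2) -> hasty G e2 t1 -> hasty G (App e1 e2) t2.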
It should be an invariant of T(e) that every reachable expression has the same
type as the original, so long as the original was well-typed. This observation is
the key to proving that it is also an invariant that no reachable expression is stuck,
using a proof technique called the syntactic approach to type soundness, which turns
out to be just another instance of our general toolbox for invariant proofs.
We work our way through a suite of standard lemmas to support that invariant
proof.
Lemma 11.8 (Progress). If ⊢ e : τ, then e isn’t stuck.
Proof. By induction on the derivation of ⊢ e : τ. □
Lemma 11.9 (Weakening). If Γ ⊢ e : τ and every mapping in Γ is also included
in Γ′, then Γ′ ⊢ e : τ.
Proof. By induction on the derivation of Γ ⊢ e : τ. □
Lemma 11.10 (Substitution). If Γ, x : τ′ ⊢ e : τ and ⊢ e′ : τ′, then Γ ⊢
[e′/x] e : τ.
Proof. By induction on the derivation of Γ, x : τ′ ⊢ e : τ, with appeal to
Lemma 11.9. □
Lemma 11.11 (Preservation). If e1 → e2 and ⊢ e1 : τ, then ⊢ e2 : τ.
Proof. By induction on the derivation of e1 → e2. □
Theorem 11.12 (Type Soundness). If ⊢ e : τ, then ¬stuck is an invariant of
T(e).
Proof. First, we strengthen the invariant to I(e) = ⊢ e : τ, justifying the
implication by Lemma 11.8, Progress. Then we apply invariant induction, where
the base case is trivial. The induction step is a direct match for Lemma 11.11,
Preservation. □
The syntactic approach to type soundness is often presented as a proof tech-
nique in isolation, but what we see here is that it follows very directly from our
general invariant proof technique. Usually syntactic type soundness is presented as
fundamentally about proving Progress and Preservation conditions. The Progress
condition maps to invariant strengthening, and the Preservation condition maps to
invariant induction, which we have used in almost every invariant proof so far. Since
the basic proof structure matches our standard one, the main insight is the usual
one: a good choice of a strengthened invariant. In this case, invariant I(e) = ⊢ e : τ
is that crucial insight, including the original design of the set of types and the typing
relation.
CHAPTER 12
12.4. Exceptions
Next, let’s see how to model exceptions, a good representative of control-flow-
heavy language features. For simplicity, we will encode exception values themselves
as natural numbers. The action is in how exceptions are thrown and caught.
Expressions e ::= . . . | throw(e) | (try e catch x ⇒ e)
Contexts C ::= . . . | throw(C) | (try C catch x ⇒ e)

We also introduce metavariable C⁻ to stand for an evaluation context that does
not use the constructor for catch and is not just a hole □ (though it must contain a
□). It is handy to express the idea of exceptions bubbling up to the nearest enclosing
catch constructs. Specifically, here are three rules to define exception behavior.
To write out the rules that are specific to references, it’s helpful to extend our
language syntax with a form that will never appear in original programs but which
does show up at intermediate execution steps. In particular, let’s add an expression
form for locations, the runtime values of references, and let’s say that locations also
count as values.
Locations ℓ ∈ N
Expressions e ::= n | e ⊕ e | x | λx. e | e e | new(e) | !e | e := e | ℓ
Values v ::= n | λx. e | ℓ
Now we can write the rules for the three reference primitives.
ℓ ∉ dom(h)
──────────────────────────────
(h, new(v)) →₀ (h[ℓ ↦ v], ℓ)

h(ℓ) = v
───────────────────
(h, !ℓ) →₀ (h, v)

h(ℓ) = v
──────────────────────────────────
(h, ℓ := v′) →₀ (h[ℓ ↦ v′], v′)
We prove variants of all of the lemmas behind last chapter’s type-safety proof,
with a few new ones and twists on the originals. Here we give some highlights.
Lemma 13.2 (Heap Weakening). If Σ; Γ ⊢ e : τ and every mapping in Σ is
also included in Σ′, then Σ′; Γ ⊢ e : τ.

Lemma 13.3. If (h, e) →₀ (h′, e′), Σ; • ⊢ e : τ, and Σ ⊢ h, then there exists Σ′
such that Σ′; • ⊢ e′ : τ, Σ′ ⊢ h′, and Σ′ preserves all mappings from Σ.

Lemma 13.4. If Σ; • ⊢ C[e1] : τ, then there exists τ0 such that Σ; • ⊢ e1 : τ0
and, for all e2 and Σ′, if Σ′; • ⊢ e2 : τ0 and Σ′ preserves mappings from Σ, then
Σ′; • ⊢ C[e2] : τ.

Lemma 13.5 (Preservation). If (h, e) → (h′, e′), Σ; • ⊢ e : τ, and Σ ⊢ h, then
there exists Σ′ such that Σ′; • ⊢ e′ : τ and Σ′ ⊢ h′.
Now we add one new top-level rule to the operational semantics, saying un-
reachable locations may be removed at any time.
∀ℓ, v. ℓ ∈ Rₕ(e) ∧ h(ℓ) = v ⇒ h′(ℓ) = v
∀ℓ, v. h′(ℓ) = v ⇒ h(ℓ) = v
h′ ≠ h
────────────────
(h, e) → (h′, e)
Let us explain each premise in more detail. The first premise says that, going
from the old heap h to the new heap h′, the value of every reachable reference is
preserved. The second premise says that the new heap is a subheap of the original,
not spontaneously adding any new mappings. The final premise says that we have
actually done some useful work: the new heap isn’t just the same as the old one.

It may not be clear why we must include the last premise. The reason has to
do with our formulation of type safety, by saying that programs never get stuck.
We defined that e is stuck if it is not a value but it also can’t take a step. If we
omitted from the garbage-collection rule the premise h′ ≠ h, then this rule would
always apply, for any term, simply by setting h′ = h. That is, no term would ever
be stuck, and type safety would be meaningless! Since the rule also requires that h′
be no larger than h (with the second premise), additionally requiring h′ ≠ h forces
h′ to shrink, garbage-collecting at least one location. Thus, in any execution state,
we can “kill time” by running garbage collection only finitely many times before
we need to find some “real” step to run. More precisely, the limit on how many
times we can run garbage collection in a row, starting from heap h, is |dom(h)|, the
number of locations in h.
The type-safety proof is fairly straightforward to update. We prove progress by
ignoring the garbage-collection rule, since the existing rules were already enough
to find a step for every nonvalue. A bit more work is needed to update the proof
of preservation; its cases for the existing rules follow the same way as before, while
we must prove a few lemmas on the way to handling the new rule.
Lemma 13.6 (Transitivity for reachability). If freeloc(e1) ⊆ freeloc(e2), then
Rₕ(e1) ⊆ Rₕ(e2).

Lemma 13.7 (Irrelevance of unreachable locations for typing). If Σ ⊢ h and
Σ; Γ ⊢ e : τ, then Σ′; Γ ⊢ e : τ, if we also know that, for all ℓ and τ′, when ℓ ∈ Rₕ(e)
and Σ(ℓ) = τ′, it follows that Σ′(ℓ) = τ′.

Lemma 13.8 (Reachability sandwich). If ℓ ∈ Rₕ(e), h(ℓ) = v, and ℓ′ ∈ Rₕ(v),
then ℓ′ ∈ Rₕ(e).
To extend the proof of preservation, we need to show that the strengthened
invariant still holds after garbage collection. A key element is choosing the new
heap typing. We pick the restriction of the old heap typing Σ to the domain of the
new heap h′. That is, we drop from the heap typing all locations that have been
garbage collected, preserving the types of the survivors. Some work is required to
show that this strategy is sound, given the definition of reachability, but the lemmas
above work out the details, leaving just a bit of bookkeeping in the preservation
proof. The final safety proof then proceeds in exactly the same way as before.
Our proof here hasn’t quite covered all the varieties of garbage collectors that
exist. In particular, copying collectors may move references to different locations,
while we only allow collectors to delete some references. It may be an edifying
exercise for the reader to extend our proof in a way that also supports reference
relocation.
CHAPTER 14

Hoare Logic: Verifying Imperative Programs
We now take a step away from the last chapters in two dimensions: we switch
back from functional to imperative programs, and we return to proofs of deep cor-
rectness properties, rather than mere absence of type-related crashes. Nonetheless,
the essential proof structure winds up being the same, as we once again prove
invariants of transition systems!
Numbers n ∈ N
Variables x ∈ Strings
Expressions e ::= n | x | e + e | e − e | e × e | *[e]
Boolean expressions b ::= e = e | e < e
Commands c ::= skip | x ← e | *[e] ← e | c; c
             | if b then c else c | {a} while b do c | assert(a)
Beside assertions, we also have memory-read operations *[e] and memory-write
operations *[e1] ← e2, which are written suggestively, as if the memory were a
global array named *. Loops have sprouted an extra assertion in their syntax, which
we will actually ignore in the language semantics, but which becomes important as
part of the proof technique we will learn, especially in automating it.
Expressions have a standard recursive semantics.
⟦n⟧(h, v) = n
⟦x⟧(h, v) = v(x)
⟦e1 + e2⟧(h, v) = ⟦e1⟧(h, v) + ⟦e2⟧(h, v)
⟦e1 − e2⟧(h, v) = ⟦e1⟧(h, v) − ⟦e2⟧(h, v)
⟦e1 × e2⟧(h, v) = ⟦e1⟧(h, v) × ⟦e2⟧(h, v)
⟦*[e]⟧(h, v) = h(⟦e⟧(h, v))
⟦e1 = e2⟧(h, v) = ⟦e1⟧(h, v) = ⟦e2⟧(h, v)
⟦e1 < e2⟧(h, v) = ⟦e1⟧(h, v) < ⟦e2⟧(h, v)
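A Gallina rendering of this interpreter might look as follows; this sketch simplifies heaps and valuations to total functions into N (so nat subtraction truncates at zero), covers only the numeric expressions, and uses illustrative names throughout.

Require Import String Arith.

Definition heap := nat -> nat.
Definition valuation := string -> nat.

Inductive exp : Set :=
| Const (n : nat)
| Var (x : string)
| Plus (e1 e2 : exp)
| Minus (e1 e2 : exp)
| Times (e1 e2 : exp)
| MemRead (e : exp).  (* the syntax *[e] *)

Fixpoint interp (e : exp) (h : heap) (v : valuation) : nat :=
  match e with
  | Const n => n
  | Var x => v x
  | Plus e1 e2 => interp e1 h v + interp e2 h v
  | Minus e1 e2 => interp e1 h v - interp e2 h v
  | Times e1 e2 => interp e1 h v * interp e2 h v
  | MemRead e1 => h (interp e1 h v)
  end.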
We finish up with a big-step semantics in the style of those we’ve seen before,
with the added complication of threading a heap through.
a(h, v)
──────────────────────────
(h, v, assert(a)) ⇓ (h, v)
Reasoning directly about operational semantics can get tedious, so let’s develop
some machinery for proving program correctness automatically.
{P} skip {P}
A rule for assignment is slightly more involved: to state what we know is true
after, we recall that there existed a prestate satisfying the precondition, which then
evolved into the poststate in the expected way.
{P} *[e1] ← e2 {λ(h, v). ∃h′. P(h′, v) ∧ h = h′[⟦e1⟧(h′, v) ↦ ⟦e2⟧(h′, v)]}
To model sequencing, we thread predicates through in an intuitive way.
{P} c1 {Q}    {Q} c2 {R}
─────────────────────────
{P} c1; c2 {R}
For conditional statements, we start from the basic approach of sequencing,
adding two twists. First, since the two subcommands run after different outcomes
of the test expression, we extend their preconditions. Second, since we may reach
the end of the command after running either subcommand, we take the disjunction
of their postconditions.
{λs. P(s) ∧ ⟦b⟧(s)} c1 {Q1}    {λs. P(s) ∧ ¬⟦b⟧(s)} c2 {Q2}
─────────────────────────────────────────────────────────────
{P} if b then c1 else c2 {λs. Q1(s) ∨ Q2(s)}
Coming to loops, we at last have a purpose for the assertion annotated on each
one. We call those assertions loop invariants; one of these is meant to be true every
time a loop iteration begins. We will try to avoid confusion with the more funda-
mental concept of invariant for transition systems, though in fact the two are closely
related formally, which we will see in the last section of this chapter. Essentially, the
loop invariant gives the induction hypothesis that makes the program-correctness
proof go through. We encapsulate the induction reasoning once-and-for-all, in the
proof of soundness for Hoare triples. To verify an individual program, it is only
necessary to prove the premises of the rule, which we give now.
(∀s. P(s) ⇒ I(s))    {λs. I(s) ∧ ⟦b⟧(s)} c {I}
───────────────────────────────────────────────
{P} {I} while b do c {λs. I(s) ∧ ¬⟦b⟧(s)}
In words: the loop invariant is true when we begin the loop, and every iteration
preserves the invariant, given the extra knowledge that the loop test succeeded. If
the loop finishes, we know that the invariant is still true, but now the test is false.
The final command-specific rule, for assertions, is a bit anticlimactic. The
precondition is carried over as postcondition, if it is strong enough to prove the
assertion.
∀s. P(s) ⇒ I(s)
────────────────────
{P} assert(I) {P}
One more essential rule remains, this time not specific to any command form.
The rules we’ve given deduce specific kinds of precondition-postcondition pairs. For
instance, the skip rule forces the precondition and postcondition to match. However,
we expect to be able to prove {λ(h, v). v(x) > 0} skip {λ(h, v). v(x) ≥ 0}, because the
postcondition is weaker than the precondition, meaning the precondition implies the
postcondition. Alternatively, the precondition is stronger than the postcondition,
because the precondition keeps all restrictions from the postcondition while adding
new ones. Hoare Logic’s rule of consequence allows us to build a new Hoare triple
from an old one by strengthening the precondition and weakening the postcondition.
{P} c {Q}    (∀s. P′(s) ⇒ P(s))    (∀s. Q(s) ⇒ Q′(s))
───────────────────────────────────────────────────────
{P′} c {Q′}
These rules together are complete, in the sense that any intuitively correct
precondition-postcondition pair for a command is provable. Here we only go into
detail on a proof of the dual property, soundness.
Lemma 14.1. Assume the following fact: together, (h, v, c) ⇓ (h′, v′), I(h, v),
and ⟦b⟧(h, v) imply I(h′, v′). Then, given (h, v, {I} while b do c) ⇓ (h′, v′), it follows
that I(h′, v′) and ¬⟦b⟧(h′, v′).
Proof. By induction on the derivation of (h, v, {I} while b do c) ⇓ (h′, v′). □
⟦b⟧(h, v)
──────────────────────────────────────────
(h, v, if b then c1 else c2) → (h, v, c1)

¬⟦b⟧(h, v)
──────────────────────────────────────────
(h, v, if b then c1 else c2) → (h, v, c2)

⟦b⟧(h, v)
────────────────────────────────────────────────────────
(h, v, {I} while b do c) → (h, v, c; {I} while b do c)

¬⟦b⟧(h, v)
───────────────────────────────────────
(h, v, {I} while b do c) → (h, v, skip)

a(h, v)
──────────────────────────────────
(h, v, assert(a)) → (h, v, skip)
Next, we apply invariant induction, whose base case follows trivially. The
induction step follows by Lemma 14.5. □
CHAPTER 15

Deep Embeddings, Shallow Embeddings, and Options in Between
Const : N → exp
Var : V → exp
Plus : exp → exp → exp
Times : exp → exp → exp
Let : V → exp → exp → exp
That last example program, with implicit free variables x and y, may now be
redefined in the exp type.

foo′ = Let “u” (Plus (Var “x”) (Var “y”))
         (Let “v” (Times (Var “u”) (Var “y”))
            (Plus (Var “u”) (Var “v”)))
As in Chapter 4, we can define a recursive interpreter, mapping exp programs
and variable valuations to numbers. Using that interpreter, we can prove equiva-
lence of foo and foo′.
We say that foo uses a shallow embedding, because it is coded directly in the
metalanguage, with no extra layer of syntax. Conversely, foo′ uses a deep embed-
ding, since it goes via the inductively defined exp type.
These extremes are not our only options. In higher-order logics like Coq’s, we
may also choose what might be called mixed embeddings, which define syntax-tree
types that allow some use of general functions from the metalanguage. Here’s an
example, as an alternative definition of exp.
Const : N → exp
Var : V → exp
Plus : exp → exp → exp
Times : exp → exp → exp
Let : exp → (N → exp) → exp
The one change is in the type of the Let constructor, where now no variable
name is given, and instead the body of the “let” is represented as a Gallina function
from numbers to expressions. The intent is that the body is called on the number
that results from evaluating the first expression. This style is called higher-order
abstract syntax. Though that term is often applied to a more specific instance of
the technique, which is not exactly the one used here, we will not be so picky.
As an illustration of the technique in action, here’s our third encoding of the
simple example program.
foo″ = Let (Plus (Var “x”) (Var “y”)) (λu.
          Let (Times (Const u) (Var “y”)) (λv.
            Plus (Const u) (Const v)))
With a bit of subtlety, we can define an interpreter for this language, too.
⟦Const n⟧v = n
⟦Var x⟧v = v(x)
⟦Plus e1 e2⟧v = ⟦e1⟧v + ⟦e2⟧v
⟦Times e1 e2⟧v = ⟦e1⟧v × ⟦e2⟧v
⟦Let e1 e2⟧v = ⟦e2(⟦e1⟧v)⟧v
Note how, in the Let case, since the body e2 is a function, before evaluating it,
we call it on the result of evaluating e1 . This language would actually be sufficient
even if we removed the Var constructor and the v argument of the interpreter. Coq’s
normal variable binding is enough to let us model interesting programs and prove
things about them by induction on syntax.
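Here is a self-contained Coq sketch of this mixed embedding and its interpreter; names are illustrative. Note that the recursive call in the Let case first applies e2, which the termination checker accepts because e2 is an argument of a constructor of the inductive type.

Require Import String.
Open Scope string_scope.

Inductive exp : Set :=
| Const (n : nat)
| Var (x : string)
| Plus (e1 e2 : exp)
| Times (e1 e2 : exp)
| Let (e1 : exp) (e2 : nat -> exp).

Definition valuation := string -> nat.

Fixpoint interp (e : exp) (v : valuation) : nat :=
  match e with
  | Const n => n
  | Var x => v x
  | Plus e1 e2 => interp e1 v + interp e2 v
  | Times e1 e2 => interp e1 v * interp e2 v
  | Let e1 e2 => interp (e2 (interp e1 v)) v
  end.

(* The third encoding of the example program. *)
Definition foo'' : exp :=
  Let (Plus (Var "x") (Var "y")) (fun u =>
    Let (Times (Const u) (Var "y")) (fun v =>
      Plus (Const u) (Const v))).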
It is important here that Coq’s induction principles give us useful induction
hypotheses, for constructors whose recursive arguments are functions. The second
it cannot be caught. We associate this exception with program failure, and the
Hoare logic will ensure that programs never actually fail.
The extension to program syntax is easy:
Fail : ∀α. cmd α
That is, a failing program can be considered to return any result type, since it will
never actually return normally, instead throwing an uncatchable exception.
The operational semantics is also easily extended to signal failures, with a new
special system state called Failed. We also add this Hoare-logic rule.
{λ_. ⊥} Fail {λ_, _. ⊥}
That is, failure can only be verified against an unsatisfiable precondition, so that
we know that the failure is unreachable.
With this extension, we can prove a soundness-theorem variant, capturing the
impossibility of failure.

Theorem 15.3. If {P} c {Q} and P(h) for some heap h, then it is an invariant
of (c, h) that the state never becomes Failed.
Note that this version of the theorem still tells us interesting things about
programs that run forever. It is easy to implement runtime assertion checking with
code that performs some test and runs Fail if the test does not pass. An infinite-
looping program may perform such tests infinitely often, and we learn that none of
the tests ever fail.
The accompanying Coq code demonstrates another advantage of this mixed-
embedding style: we can extract our programs to OCaml and run them efficiently.
That is, rather than using functional programming to implement our three kinds of
side effects, we implement them directly with OCaml’s mutable heap, unbounded
recursion, and exceptions, respectively. As a result, our extracted programs achieve
the asymptotic performance that we would expect, thinking of them as C-like code,
where interpreters in a pure functional language like Gallina would necessarily add
at least an extra logarithmic factor in the modeling of unboundedly growing heaps.
CHAPTER 16
Separation Logic
(h, Loop i f) → (h, x ← f(i); match x with Done(a) ⇒ Return a | Again(a) ⇒ Loop a f)
h(a) = v
─────────────────────────────
(h, Read a) → (h, Return v)

h(a) = v
──────────────────────────────────────────
(h, Write a v′) → (h[a ↦ v′], Return ())

dom(h) ∩ [a, a + n) = ∅
────────────────────────────────────────
(h, Alloc n) → (h[a ↦ 0ⁿ], Return a)

─────────────────────────────────────────────
(h, Free a n) → (h − [a, a + n), Return ())
A few remarks about the last four rules: The basic Read and Write operations
now get stuck when accessing unmapped addresses. The premise of the rule for
Alloc enforces that address a denotes a currently unmapped memory region of size
n. We use a variety of convenient notations that we won’t define in detail here,
referring instead to the accompanying Coq code. Another notation uses 0ⁿ to refer
informally to a sequence of n zeroes to write into memory. Similarly, the conclusion
of the Free rule unmaps a whole size-n region, starting at a. We could also have
chosen to enforce in this rule that the region starts out as mapped into h.
emp = {•}
p ↦ v = {•[p ↦ v]}
[ϕ] = {h | ϕ ∧ h = •}
∃x. P(x) = {h | ∃x. h ∈ P(x)}
P * Q = {h1 ⊎ h2 | h1 ∈ P ∧ h2 ∈ Q}
The formula emp accepts only the empty heap, while formula p ↦ v accepts
only the heap whose only address is p, mapped to value v. We overload the ↦
operator in that second line above, to denote “points-to” on the lefthand side of
the equality and finite-map overriding on the righthand side. Notation [ϕ] is lifting
a pure (i.e., regular old mathematical) proposition ϕ into an assertion, enforcing
both that the heap is empty and that ϕ is true. We also adapt the normal existential
quantifier to this setting.

The essential definition is the last one, of the separating conjunction *. We
use the notation h1 ⊎ h2 for disjoint union of heaps h1 and h2, implicitly enforc-
ing dom(h1) ∩ dom(h2) = ∅. The intuition of separating conjunction is that we
partition the overall heap into two subheaps, each of which matches one of the
respective conjuncts P and Q. This connective implicitly enforces lack of aliasing,
leading to separation logic’s famous conciseness of specifications that combine data
structures.
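For concreteness, here is one minimal way the assertion language might be coded in Coq, with heaps as partial functions from naturals to naturals; hprop, ptsto, lift, and star are illustrative names, not the book’s exact library.

Require Import PeanoNat.

Definition heap := nat -> option nat.
Definition hprop := heap -> Prop.

Definition hempty : heap := fun _ => None.

(* Heaps are disjoint when no address is mapped by both. *)
Definition disjoint (h1 h2 : heap) : Prop :=
  forall a, h1 a = None \/ h2 a = None.

Definition hunion (h1 h2 : heap) : heap :=
  fun a => match h1 a with
           | Some v => Some v
           | None => h2 a
           end.

(* emp: only the empty heap. *)
Definition emp : hprop := fun h => h = hempty.

(* ptsto p v: only the singleton heap mapping p to v. *)
Definition ptsto (p v : nat) : hprop :=
  fun h => h = (fun a => if Nat.eq_dec a p then Some v else None).

(* lift phi: a pure proposition, with an empty heap. *)
Definition lift (phi : Prop) : hprop :=
  fun h => phi /\ h = hempty.

(* Separating conjunction: split the heap into two disjoint parts. *)
Definition star (P Q : hprop) : hprop :=
  fun h => exists h1 h2,
    P h1 /\ Q h2 /\ disjoint h1 h2 /\ h = hunion h1 h2.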
We can also define natural comparison operators between assertions, overload-
ing the usual notations for equivalence and implication of propositions.
P ⇔ Q = ∀h. h ∈ P ⇔ h ∈ Q
P ⇒ Q = ∀h. h ∈ P ⇒ h ∈ Q

ϕ ⇒ (P ⇒ Q)
──────────────
P * [ϕ] ⇒ Q

ϕ    P ⇒ Q
──────────────
P ⇒ Q * [ϕ]

ϕ
──────────────
P ⇔ [ϕ] * P

P * Q ⇔ Q * P

P * (Q * R) ⇔ (P * Q) * R

P1 ⇒ P2    Q1 ⇒ Q2
─────────────────────
P1 * Q1 ⇒ P2 * Q2
parts of memory that the command does not touch directly. We might also say that
the footprint of this command is the singleton set {a}. In general, frame predicates
record preserved facts about addresses outside a command’s footprint. The next
few rules don’t have frame predicates baked in; we finish with a rule that adds them
back, in a generic way for arbitrary Hoare triples.
{∃v. a ↦ v} Write a v′ {λ_. a ↦ v′}

This last rule, for Write, is even simpler. We see a straightforward illustration
of overwriting a’s old value v with the new value v′.

{P} c {Q}
─────────────────────────
{P * R} c {λr. Q(r) * R}
In other words, any Hoare triple can be extended by conjoining an arbitrary
predicate R in both precondition and postcondition. Even more intuitively, when
a program satisfies a spec, it also satisfies an extended spec that records the state
of some other part of memory that is untouched (i.e., is outside the command’s
footprint).
For the pragmatics of proving particular programs, we defer to the accompany-
ing Coq code. However, for modular proofs, the frame rule has such an important
role that we want to emphasize it here. It is possible to define (recursively) a
predicate llist(ℓ, p), capturing the idea that the heap contains exactly an imperative
linked list, rooted at pointer p, representing functional linked list ℓ. We can also
prove a general specification for a list-reversal function:

∀ℓ, p. {llist(ℓ, p)} reverse(p) {λr. llist(rev(ℓ), r)}

Now consider that we have the roots p1 and p2 of two disjoint lists, respec-
tively representing ℓ1 and ℓ2. It is easy to instantiate the general theorem and get
{llist(ℓ1, p1)} reverse(p1) {λr. llist(rev(ℓ1), r)} and {llist(ℓ2, p2)} reverse(p2) {λr. llist(rev(ℓ2), r)}.
Applying the frame rule to the former theorem, with R = llist(ℓ2, p2), we get:

{llist(ℓ1, p1) * llist(ℓ2, p2)} reverse(p1) {λr. llist(rev(ℓ1), r) * llist(ℓ2, p2)}

Similarly, applying the frame rule to the latter, with R = llist(rev(ℓ1), r), we get:

{llist(ℓ2, p2) * llist(rev(ℓ1), r)} reverse(p2) {λr′. llist(rev(ℓ2), r′) * llist(rev(ℓ1), r)}

Now it is routine to derive the following spec for a larger program:

{llist(ℓ1, p1) * llist(ℓ2, p2)}
r ← reverse(p1); r′ ← reverse(p2); Return (r, r′)
{λ(r, r′). llist(rev(ℓ1), r) * llist(rev(ℓ2), r′)}
Note that this specification would be incorrect if the two input lists could share
any memory cells! The separating conjunction * in the precondition implicitly
formalizes our expectation of nonaliasing. The proof internals require only the basic
rules for Return and sequencing, in addition to the rule of consequence, whose side
conditions we discharge using the cancellation approach sketched in the previous
section.
Note also that this highly automatable proof style works just as well when
calling functions associated with several different data structures in memory. The
frame rule provides a way to show that any function, in any library, preserves
arbitrary memory state outside its footprint.
In the other direction, we write a Coq function from syntax trees to strings
of concrete code in some widely used language. This function is still trusted, but
it tends to be much shorter and worthy of trust than its inverse. Coq can be
run as part of a build process, printing to the screen the string that has been
computed from a syntax tree. A scripting language can be used to extract the
string from Coq’s output, write it to a file, and call a conventional compiler. A
major challenge of this approach is that only deeply embedded languages have
straightforward printing to concrete syntax, in practice, while shallowly embedded
languages tend to be easier to do proofs about.
v(x) = n
──────────────
v ⊢ n ↪ x

v ⊢ n1 ↪ e1    v ⊢ n2 ↪ e2
─────────────────────────────
v ⊢ n1 + n2 ↪ e1 + e2
So far, the logic is straightforward. When we want to mention a number, it
suffices to find a variable that has already been assigned that number. Translation
recurses through addition in a natural way. (Note that, in the addition rule, the
“+” on the left of the arrow is normal Gallina addition, while the “+” on the right
is syntactic “plus” of the deeply embedded language!)
Another rule may appear at first to be overly powerful.
v ⊢ n ↪ n
That is, any numeric expression may be injected into the deep embedding as
a constant. How can we hope to embed all Gallina expressions in C? The details
of the command-compilation rules reveal why we are safe, so let us turn to those
rules.
The rules we show here are simplified from the full set in the Coq development,
supporting an even smaller subset of the source language, to make the presentation
easier to understand. The rule for lone Return commands is simple, delegating most
of the work to expression compilation, using a designated variable result to hold
the final answer of a command.
v ⊢ n ↪ e
──────────────────────────────
v ⊢ Return n ↪ result ← e
The most interesting rules cover uses of Bind on various primitive operations
directly. Here is the simplest such rule, where the primitive is simple Return.
Here is where we see why it wasn’t problematic earlier to include a rule that
translates any number n into a constant in the deep embedding. Most numeric
expressions in a source program will depend on results of earlier Bind operations.
However, the quantified premise of the last rule enforces that the output statement
s is not allowed to depend on w directly! The values of introduced variables can
only be accessed through deeply embedded variable names. In other words, if we
tried to use the constant rule directly to prove the quantified premise of that last
rule, when the body involves a Return of a complex expression that mentions the
bound variable, we would generate s that includes w, which is not allowed.
Similar rules are present for Read and Write, which interestingly are compiled
the same way as Return. From a syntactic, compilation standpoint, they do not
behave differently from pure computation. More involved and raising new compli-
cations is the rule for loops; see the Coq code for details.
This result can be composed with soundness of any Hoare logic for the source
language. The associated Coq code defines one, essentially following our separation
logic from last chapter.
Theorem 17.2. If {P} c {Q}, P(h), v ⊢ c ↪ s, and result ∉ dom(v), then
it is an invariant of the transition system starting in (h, v, s) that execution never
gets stuck.
(3) For m′ ≠ m, M′_{m′} = λ(s, c), x. (s′, y) ← M_{m′}(s, x); c′ ← {c′ | f(s) = c ⇒
f(s′) = c′}; ret ((s′, c′), y)

Then T ⊇ T′.

Proof. Justified by choosing the simulation relation {(s, (s, f(s))) | s ∈ S}.
□
Intuitively, method m is a pure observer, not changing the state, only returning
some pure function f of it. We change the state set from S to S × N, so that the
second component of a state caches the output of f on the first component. Like
in a change of representation, method bodies are all rewritten automatically, but
pick-from-set operations are inserted, and we must refine them away to arrive at a
final implementation.
Here the crucial such pattern is {c′ | f(s) = c ⇒ f(s′) = c′}. Intuitively, we
are asked to choose a cache value c′ that is correct for the new state s′, while we
are allowed to assume that the prior cache value c was accurate for the old state s.
Therefore, it is natural to give an efficient formula for computing c′ in terms of c.
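As a tiny hypothetical instance: if the state is a list, the cached observer is its sum, and a method pushes an element, then the pick-from-set pattern can be refined to an incremental update. The sketch below (with invented names) checks that the incremental formula agrees with recomputing the observer.

Require Import List Arith.
Import ListNotations.

Definition sum (s : list nat) : nat := fold_right Nat.add 0 s.

(* Incremental cache update for a hypothetical "push x" method: given
   the old cache c (assumed equal to sum s), compute the cache for the
   new state (x :: s) without retraversing the list. *)
Definition push_cache (c x : nat) : nat := c + x.

Lemma push_cache_ok : forall s x c,
  sum s = c -> sum (x :: s) = push_cache c x.
Proof.
  intros s x c H; subst c; unfold push_cache; simpl.
  apply Nat.add_comm.
Qed.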
CHAPTER 19
Separation logic tames sharing of a mutable memory across libraries and data
structures. We will need some additional techniques when we add concurrency
to the mix, resulting in the shared-memory style of concurrency. This chapter
introduces a basic style of operational semantics for shared memory, also studying
its use in model checking, including with an important optimization called partial-
order reduction. The next chapter shows how to prove deeper properties of fancier
programs, by extending the Hoare-logic approach to shared-memory concurrency.
Then the chapter after that shows how to formalize and reason about a different
style of concurrency, message passing.
(h, l, Read a) → (h, l, Return h(a))      (h, l, Write a v) → (h[a ↦ v], l, Return 0)

a ∉ l
──────────────────────────────────────────
(h, l, Lock a) → (h, l ∪ {a}, Return 0)

a ∈ l
──────────────────────────────────────────
(h, l, Unlock a) → (h, l \ {a}, Return 0)
Note that the last two rules are the only source of nondeterminism in this
semantics, where a single state can step to multiple different next states. This non-
determinism corresponds to the freedom we give to a scheduler that may pick which
thread runs next. Though this kind of concurrent programming is very expressive
and often achieves very high performance, it comes at a cost in reasoning, as there
may be exponentially many different schedules for a single program, measured with
respect to the textual length of the program. A popular name for this pitfall is the
state-explosion problem.
Note also that we have omitted any looping constructs from this object lan-
guage, so all programs terminate. The Coq formalization uses the mixed-embedding
style, making it not entirely obvious that all programs really do terminate. In any
case, if we must tame the state-explosion problem, we already have our work cut
out for us, even when the state space rooted at any concrete state is finite!
The base semantics can be used to define transition systems in the usual way,
with T(h, l, c) = ⟨{(h, l, c)}, →⟩. We can also define short-circuiting transition sys-
tems with T_L(h, l, c) = ⟨{(h, l, ⌊c⌋)}, →_L⟩. A theorem shows that the latter overap-
proximates the former.

Theorem 19.1. If natf is an invariant of T_L(h, l, c), then it is also an invariant
of T(h, l, c).
Proof. By induction on a trace (h, l, c) →* (h′, l′, c′), matching each original
step with zero or one alternative steps. We appeal to a number of lemmas, some of
which are summarized below. □

Lemma 19.2. For all c, ⌊⌊c⌋⌋ = ⌊c⌋.
Proof. By induction on the structure of c. □

Lemma 19.3. If (h, l, c) → (h′, l′, c′), then either (h′, l′) = (h, l) and ⌊c′⌋ = ⌊c⌋
(the step was local), or there exists c″ where (h, l, ⌊c⌋) → (h′, l′, c″) and ⌊c″⌋ = ⌊c′⌋
(the step was not local).
Proof. By induction on the derivation of (h, l, c) → (h′, l′, c′), appealing in
places to Lemma 19.2. □

Lemma 19.4. If natf(⌊c⌋), then natf(c).
Proof. By induction on the structure of c. □
a ∈ r
───────────────────────────
summarize(Read a, (r, w, ℓ))

a ∈ w
──────────────────────────────
summarize(Write a v, (r, w, ℓ))

a ∈ ℓ
───────────────────────────
summarize(Lock a, (r, w, ℓ))

a ∈ ℓ
─────────────────────────────
summarize(Unlock a, (r, w, ℓ))

summarize(c1, s)    summarize(c2, s)
────────────────────────────────────
summarize(c1 || c2, s)
Those relations do all we need to do to record which actions a thread might
not commute with. The other key ingredient is an extractor for the next atomic
action in a thread, written as a partial function.
nextAction(Return r) = Return r
nextAction(Fail) = Fail
nextAction(Read a) = Read a
nextAction(Write a v) = Write a v
nextAction(Lock a) = Lock a
nextAction(Unlock a) = Unlock a
nextAction(x ← c1; c2(x)) = nextAction(c1)
Given a next atomic action and a summary of another thread, it is now easy
to define commutativity of the two.
commutes(Return _, _) = ⊤
commutes(Fail, _) = ⊤
commutes(Read a, (_, w, _)) = a ∉ w
commutes(Write a _, (r, w, _)) = a ∉ r ∪ w
commutes(Lock a, (_, _, ℓ)) = a ∉ ℓ
commutes(Unlock a, (_, _, ℓ)) = a ∉ ℓ
commutes(_, _) = ⊥
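A plausible Coq rendering of nextAction and commutes, over a simplified nat-valued command type, might look like this; all names are illustrative. Without parallel composition in the type, nextAction happens to be total here, whereas the book’s version is partial.

Require Import List.
Import ListNotations.

Inductive cmd : Set :=
| Return (r : nat)
| Fail
| Read (a : nat)
| Write (a v : nat)
| Lock (a : nat)
| Unlock (a : nat)
| Bind (c1 : cmd) (c2 : nat -> cmd).

(* Summaries (r, w, l) as lists of addresses read, written, locked. *)
Record summary := {
  sreads : list nat;
  swrites : list nat;
  slocks : list nat
}.

(* The next atomic action of a thread: look under Binds. *)
Fixpoint nextAction (c : cmd) : option cmd :=
  match c with
  | Bind c1 _ => nextAction c1
  | _ => Some c
  end.

(* When does atomic action c commute with anything a thread matching
   summary s might do? *)
Definition commutes (c : cmd) (s : summary) : Prop :=
  match c with
  | Return _ | Fail => True
  | Read a => ~ In a (swrites s)
  | Write a _ => ~ In a (sreads s) /\ ~ In a (swrites s)
  | Lock a | Unlock a => ~ In a (slocks s)
  | Bind _ _ => False
  end.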
With these ingredients, we can define a predicate porSafe that figures out when
a state is eligible for the partial-order-reduction optimization, which is to force the
first thread to run next, ignoring the other threads for now. In working out the
formal details, we will confine ourselves to commands c1 ||c2 with distinguished “first
threads” c1 , though everything can be generalized to other settings (and doing that
generalization could be a worthwhile exercise for the reader, though it requires a
lot of logical bookkeeping). This optimization is only safe when the first thread
can take a step and when that step commutes with any action that other threads
(combined into c2 ) might perform. Formally, we define porSafeph, l, c1 , c2 , sq as
follows, where s should be a valid summary of c2 .
‚ There is some c0 where nextActionpc1 q “ c0 . That is, thread c1 has some
uniquely determined atomic action lined up to run next.
‚ There exist h1 , l1 , and c11 such that ph, l, c1 q Ñ ph1 , l1 , c11 q. That is, thread
c1 is actually able to take a step, which might not be possible if e.g. trying
to take a lock that is already held.
19.3. BASIC PARTIAL-ORDER REDUCTION 115
‚ And the crucial compatibility condition: commutespc0 , sq. That is, all
actions that other threads might perform commute with c0 , the first action
of c1 .
With the applicability condition defined, it is now straightforward to define an
optimized step relation, parameterized on an accurate summary s for c2 .
timeOf(Lock a, n + 1)    timeOf(Unlock a, n + 1)

These facts contradict our assumption that natf is an invariant of T_C(h, l, c1, c2, s).
□

Lemma 19.6. If timeOf(c, n), then there exist h′, l′, and c′ where (h, l, c) →*
(h′, l′, c′), such that (h′, l′, c′) is a stuck state.
Readiness: A(s) ⊆ E(s). That is, we do not select any threads that are not
actually ready to run.
Progress: If A(s) = ∅, then E(s) = ∅. That is, so long as any thread at
all can step, we select at least one thread.
Commutativity: Consider all executions starting at s and taking steps only
with the threads not in A(s). These executions only include actions that
commute with the next actions of the threads in A(s). As a consequence,
any actions that run before elements of the ample set can be reordered to
follow the execution of any ample-set element.
Invisibility: If A(s) ≠ E(s), then no action in A(s) modifies the truth of ϕ.
An optimization in the spirit of our original from the prior section would happily
decree that it is safe always to pick the first thread to run. This reduced state-
transition system never gets around to running the second thread, so exploring the
state space never finds the failure! To plug this soundness hole, we add a final
condition on the ample sets.
Fairness: If there is a cycle in the finite state space where α is enabled at
some point, then α ∈ A(s) for some s in the cycle.
This condition effectively forces the ample set for the example program above
to include the second thread.
CHAPTER 20
(h, l, Loop i f) → (h, l, x ← f(i); match x with Done(a) ⇒ Return a | Again(a) ⇒ Loop a f)

h(a) = v
──────────────────────────────────
(h, l, Read a) → (h, l, Return v)

h(a) = v
───────────────────────────────────────────────
(h, l, Write a v′) → (h[a ↦ v′], l, Return ())

dom(h) ∩ [a, a + n) = ∅
─────────────────────────────────────────────
(h, l, Alloc n) → (h[a ↦ 0ⁿ], l, Return a)

──────────────────────────────────────────────────
(h, l, Free a n) → (h − [a, a + n), l, Return ())

a ∉ l
───────────────────────────────────────────
(h, l, Lock a) → (h, l ∪ {a}, Return ())

a ∈ l
───────────────────────────────────────────
(h, l, Unlock a) → (h, l \ {a}, Return ())
and local regions. This motion is only part of a proof technique; it has no runtime
content reflected in the operational semantics!
With all of that set-up, the final two rules may seem surprisingly simple.
a ∈ L
──────────────────────────
{emp} Lock a {λ_. I(a)}

a ∈ L
──────────────────────────
{I(a)} Unlock a {λ_. emp}
When a thread takes a lock, it appears as if a memory chunk satisfying that
lock’s invariant materializes in the local memory space. Conversely, when a thread
releases a lock, it appears as if the lock grabs a memory chunk satisfying the invari-
ant out of the local memory space. The rules are coordinating conceptual ownership
transfers between local memory and the global lock memory.
The accompanying Coq code shows a few example verifications of interesting
programs.
Lemma 20.3. If {P} c1 || c2 {Q}, then there exist P1, P2, Q1, and Q2 such that
{P1} c1 {Q1}, {P2} c2 {Q2}, and P ⇒ P1 * P2.
Proof. By induction on the derivation of {P} c1 || c2 {Q}. One somewhat sur-
prising case is when the frame rule begins the derivation. We have some predicate
R that is added to both the precondition and postcondition. In picking P1 , P2 ,
Q1 , and Q2 , we have a choice as to where we incorporate R. The two threads
together leave R alone, so clearly either thread individually does, too. Therefore,
we arbitrarily incorporate R in P1 and Q1 . □
Two lemmas express crucial techniques to isolate elements within iterated con-
junction.
Lemma 20.4. If v ∈ S, then *_{x ∈ S} P(x) ⇒ P(v) * *_{x ∈ S \ {v}} P(x).
Proof. By induction on the cardinality of S. □

Lemma 20.5. If v ∉ S, then P(v) * *_{x ∈ S} P(x) ⇒ *_{x ∈ S ∪ {v}} P(x).
Proof. By induction on the cardinality of S. □
Lemma 20.6 (Preservation). If (h, l, c) → (h′, l′, c′), {P} c {Q}, and h satisfies
P * R * *_{ℓ ∈ L} (ℓ ∉ l → I(ℓ)), then there exists P′ such that {P′} c′ {Q}, where h′
satisfies P′ * R * *_{ℓ ∈ L} (ℓ ∉ l′ → I(ℓ)).
Proof. By induction on the derivation of (h, l, c) → (h′, l′, c′). The cases for
lock and unlock respectively use Lemmas 20.4 and 20.5. Note that we include
the parameter R solely to get a strong enough induction hypothesis for steps of
commands c1 || c2. We need to know that a step by one thread does not change
the private heap of the other thread. To draw that conclusion, in appealing to the
induction hypothesis, we extend R with precisely that private state. □
Lemma 20.7. *_{ℓ ∈ L} I(ℓ) ⇒ *_{ℓ ∈ L} (ℓ ∉ ∅ → I(ℓ)).
Proof. By induction on the cardinality of L. □

Lemma 20.8. If {P} c {Q}, and if a heap h satisfies the predicate P * *_{ℓ ∈ L} I(ℓ),
then an invariant of the system starting at state (h, ∅, c) is: for reachable state
(h′, l′, c′), there exists P′ where {P′} c′ {Q}, such that h′ satisfies P′ * *_{ℓ ∈ L} (ℓ ∉ l′ → I(ℓ)).
Proof. By invariant induction, using Lemma 20.7 for the base case and Lemma
20.6 for the induction step, the latter with R = emp. □
Lemma 20.9 (Progress). If {P} c {Q} and c is about to fail, then P is unsatis-
fiable.
Proof. By induction on the derivation of tP uctQu. □
The overall soundness proof proceeds by invariant weakening with the invariant
established by Lemma 20.8. We prove the inclusion of new invariant in old by
Lemma 20.9.
CHAPTER 21

Process Algebra and Refinement
The last two chapters dealt with the most popular sort of concurrent program-
ming, the threads-and-locks shared-memory style. It’s a fundamentally imperative
style, with side effects coordinating synchronization across threads. Another well-
established (and increasingly popular) style is message passing, which is closer in
spirit to functional programming. In that world, there is, in fact, no memory at all,
let alone shared memory. Instead, state is incorporated into the text of thread code,
and information passes from thread to thread by sending messages over channels.
There are two main kinds of message passing. In the asynchronous or mailbox style,
a thread can deposit a message in a channel, even when no one is ready to receive
the message immediately. Later, a thread can come along and effectively dequeue
the message from the channel. In the synchronous or rendezvous style, a message
send only executes when a matching receive, on the same channel, is available im-
mediately. The threads of the two complementary operations rendezvous and pass
the message in one atomic step.
Packages of semantics and proof techniques for such languages are often called
process algebras, as they support an algebraic style of reasoning about the source
code of message-passing programs. That is, we prove laws very similar to the
familiar equations of algebra and use those laws to “rewrite” inside larger processes,
by replacing their subprocesses with others we have shown suitably equivalent. It’s
a powerful technique for highly modular proofs, which we develop in the rest of
this chapter for one concrete synchronous language. Well-known process algebras
include the π-calculus and the Calculus of Communicating Systems; the one we
focus on is idiosyncratic and designed partly to make the Coq proofs manageable.
Channels c
Processes p ::= ν[c⃗](x); p(x) | block(c); p | !c(v); p | ?c(x); p(x) | p || p | dup(p) | done

Here’s the intuitive explanation of each syntax construction.
• Fresh channel generation ν[c⃗](x); p(x) creates a new private channel
to be used by the body process p(x), where we replace x with the channel
that is chosen. Following tradition, we use the Greek letter ν (nu) for
this purpose. Each generation operation takes a parameter c⃗, which we
call the support of the operation. It gives a list of channels already in use
for other purposes, so that the fresh channel must not equal any of them.
(We assume an infinite domain of channels, so that, for any specific list,
it is always possible to find a channel not in that list.)
dup(p) → dup(p) || p
The labeled-transition-system approach may seem a bit unwieldy for just ex-
plaining the behavior of programs. Where it really pays off is in supporting a
modular, algebraic reasoning style about processes, which we turn to next.
RD is precisely the relation we need to finish the current proof. Intuitively, the
challenge is that dup(p) includes infinitely many copies of p, each of which may
evolve in a different way. It is even possible for different copies to interact with each
other through shared channels. However, comparing intermediate states of dup(p)
and dup(p′), we expect to see a shared backbone, where corresponding threads are
related by the original simulation R. The definition of RD formalizes that intuition
of a shared backbone with R connecting corresponding leaves. □
We wrap up the chapter with a few more algebraic properties, which the
Coq code puts to good use in larger examples. We sometimes rely on a predi-
cate neverUses(c, p), to express that, no matter how other threads interact with it,
process p will never perform a send or receive operation on channel c.

Theorem 21.7. If p ≤ p′, then block(c); p ≤ block(c); p′.

Theorem 21.8. block(c1); block(c2); p ≤ block(c2); block(c1); p

Theorem 21.9. If neverUses(c, p2), then block(c); (p1 || p2) ≤ (block(c); p1) || p2.

Theorem 21.10 (Handoff). If neverUses(c, p(v)), then block(c); (!c(v); done || dup(?c(x); p(x))) ≤
p(v).
That last theorem is notable for how it prunes down the space of possibilities
given an infinitely duplicated server, where each thread is trying to receive from a
channel. If server threads never touch that channel after their initial receives, then
most server threads will remain inert. The one send !c(v); done is the only possible
source of interaction with server threads, thanks to the abstraction barrier on c,
and that one send can only awaken one server thread. Thus, the whole composition
behaves just like a single server thread, instantiated with the right input value.
A concrete example of the Handoff theorem in action is a refinement like this
one, applying to a kind of forwarding chain between channels:
p = block(c1); block(c2); (!c1(v); done || dup(?c1(x); !c2(x); done) || dup(?c2(y); !c3(y); done))

p ≤ !c3(v); done
Note that, without the abstraction boundaries at the start, this fact would not
be derivable. We would need to worry about meddlesome threads in our environ-
ment interacting directly with c1 or c2 , spoiling the protocol and forcing us to add
extra cases to the righthand side of the refinement.
CHAPTER 22
Session Types
Process algebra, as we met it last chapter, can be helpful for modeling network
protocols. Here, multiple parties step through a script of exchanging messages and
making decisions based on message contents. A buggy party might introduce a
deadlock, where, say, party A is blocked waiting for a message from party B, while
B is also waiting for A. Session types are a style of static type system that rules out
deadlock while allowing convenient separate checking of each party, given a shared
protocol type.
There is almost unlimited variation among different versions of session types.
We step through a progression of three variants here, and even by the end there
will be obvious protocols that don’t fit the framework. Still, we aim to convey the
core ideas of the approach.
A satisfying soundness theorem applies to our type system. To state it, we first
need the crucial operation of complementing a session type.
complement(!c(σ); τ) = ?c(σ); complement(τ)
complement(?c(σ); τ) = !c(σ); complement(τ)
complement(done) = done
It is apparent that complementation just swaps the sends and receives. When
the original session type tells one party what to do, the complement type tells the
other party what to do. The power of this approach is that we can write one global
protocol description (the session type) and then check two parties’ code against it
separately. A new version of one party can be dropped in without rechecking the
other party’s code.
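Here is a sketch of how session types and complementation might be coded in Coq, with channels as naturals and the payload-type index elided; session and complement are assumed names. An easy induction shows complementation is an involution.

Inductive session : Set :=
| Send (c : nat) (tau : session)   (* !c(σ); τ, payload type elided *)
| Recv (c : nat) (tau : session)   (* ?c(σ); τ *)
| Done.

(* Complementation just swaps the sends and receives, recursively. *)
Fixpoint complement (tau : session) : session :=
  match tau with
  | Send c tau' => Recv c (complement tau')
  | Recv c tau' => Send c (complement tau')
  | Done => Done
  end.

(* Complementing twice gets back the original protocol. *)
Lemma complement_involutive : forall tau,
  complement (complement tau) = tau.
Proof.
  induction tau; simpl; congruence.
Qed.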
Using complementation, we can give succinct conditions for deadlock freedom
of a pair of parties.
Theorem 22.1. If p1 : τ and p2 : complement(τ), then it is an invariant of
p1 || p2 that an intermediate process is either done || done or can take a step.

Proof. By invariant induction, after strengthening the invariant to say that
any intermediate process takes the form p1′ || p2′, where, for some type τ′, we have
p1′ : τ′ and p2′ : complement(τ′). The inductive case of the proof proceeds by simple
inversion on the derivation of p1′ : τ′, where by the definition of complement it is
apparent that any communication p1′ performs has a matching action at the start
of p2′. The choice of τ′ changes during such a step, to the “tail” of the old τ′. □
Why does the last premise of the third rule set the Boolean flag, forcing the
next action to be a receive? Otherwise, at some point in the protocol, we could
have multiple parties trying to send messages. In such a scenario, there might not
be a unique step that the composed parties can take. The proofs are easier if we
can assume deterministic execution within a protocol, which is why we introduced
this static restriction.
To amend our theorem statement, we need to characterize when a process
implements a set of parties correctly. We use the judgment p :α⃗ τ to that end,
where p is the process, α⃗ is a list of all the involved parties, and τ is the type they
must follow collectively.
$$\frac{}{\mathsf{done} :_{[]} \tau}
\qquad
\frac{p_1 :_{\alpha,\bot} \tau \quad p_2 :_{\vec{\beta}} \tau}{p_1 \parallel p_2 :_{\alpha :: \vec{\beta}} \tau}$$
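As a schematic instance of these rules (with made-up party names): if $p_A :_{A,\bot} \tau$ and $p_B :_{B,\bot} \tau$, then two uses of the parallel-composition rule, capped by the axiom for $\mathsf{done}$, derive $p_A \parallel (p_B \parallel \mathsf{done}) :_{[A, B]} \tau$, typing the whole system against the party list $[A, B]$.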
The heart of the proof is demonstrating the existence of a unique sequence of
steps to a point where all parties are done. Here is a sketch of the key lemmas.
Lemma 22.3. If $p :_{\vec{\alpha}} \mathsf{done}$, then $p$ can't take any silent step.
Proof. By induction on any derivation of a silent step, followed by inversion
on $p :_{\vec{\alpha}} \mathsf{done}$. □
Lemma 22.4. If $p :_{\vec{\alpha}} {!c(x : \sigma)}; \tau(x)$ and at least one of the sender or receiver of
channel $c$ is missing from $\vec{\alpha}$, then $p$ can't take any silent step.
Proof. By induction on any derivation of a silent step, followed by inversion
on $p :_{\vec{\alpha}} {!c(x : \sigma)}; \tau(x)$. □
Lemma 22.5. Assume that $\vec{\alpha}$ is a duplicate-free list of parties excluding both
sender and receiver of channel $c$. If $p :_{\vec{\alpha}} {!c(x : \sigma)}; \tau(x)$, then for any $v : \sigma$, we have
$p :_{\vec{\alpha}} \tau(v)$. In other words, when we have well-typed code for a set of parties that do
not participate in the first step of a protocol, that code remains well-typed when we
advance to the next protocol step.
Proof. By induction on the derivation of $p :_{\vec{\alpha}} {!c(x : \sigma)}; \tau(x)$. □
Lemma 22.6. Assume that $\vec{\alpha}$ is a duplicate-free list of parties, at least comprehensive enough to include the sender of channel $c$. However, $\vec{\alpha}$ should exclude the
receiver of $c$. If $p :_{\vec{\alpha}} {!c(x : \sigma)}; \tau(x)$ and $p \xrightarrow{!c(v)} p'$, then $p' :_{\vec{\alpha}} \tau(v)$.
Proof. By induction on steps followed by inversion on multiparty typing. As
we step through elements of $\vec{\alpha}$, we expect to “pass” parties that do not participate
in the current protocol step. Lemma 22.5 lets us justify those passings. □
Theorem 22.7. Assume that $\vec{\alpha}$ is a duplicate-free list of all parties for a protocol. If $p :_{\vec{\alpha}} \tau$, then it is an invariant of $p$ that an intermediate process is either
inert (made up only of $\mathsf{done}$s and parallel compositions) or can take a step.
Proof. By invariant induction, after strengthening the invariant to say that
any intermediate process $p'$ satisfies $p' :_{\vec{\alpha}} \tau'$ for some $\tau'$. The inductive case uses
Lemma 22.3 to rule out steps by finished protocols, and it uses Lemma 22.4 to rule
out cases that are impossible because parties that are scheduled to go next are not
present in $\vec{\alpha}$. Interesting cases are where we find that one of the active parties is
at the head of $\vec{\alpha}$. That party either sends or receives. In the first case, we appeal
to Lemma 22.6 to find a receiver among the remaining parties. In the second case,
we appeal to an analogous lemma (not stated here) to find a sender.
The other crucial case of the proof is showing that the existence of a multiparty
typing implies that, if a process is not inert, it can take a step. The reasoning is
quite similar to the inductive case, but instead of showing that any possible
step preserves typing, we demonstrate that a particular step exists. The head of
the session type telegraphs what step it is: for the communication at the head of
the type, the assigned sending party sends to the assigned receiving party. □
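To illustrate the determinism claim: if the current type is ${!c(x : \sigma)}; \tau(x)$, channel $c$'s assigned sender is $A$, and its assigned receiver is $B$, then the only enabled communication is $A$'s send on $c$, synchronized with $B$'s receive; by Lemma 22.5, every other party's code simply stands by until a later step of $\tau$.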
APPENDIX A
apply H with (x1 := e1) ... (xn := en): Like the last one, supplying
values for quantified variables in H's statement, especially for those variables whose values aren't immediately implied by the current goal.
apply H1 in H2: Like apply H1, but used in a forward direction rather
than backward. For instance, if H1 proves P ⇒ Q and H2 proves P, then
the effect is to change H2 to Q.
assert P : First prove proposition P , then continue with it as a new hy-
pothesis.
assumption: Prove a conclusion that matches a hypothesis exactly.
cases e: Break the proof into one case for each constructor that might have
been used to build the value of expression e. In the special case where e
essentially has a Boolean type, we consider whether e is true or false.
constructor: When proving an instance of an inductive predicate, apply
the first matching rule of that predicate.
eapply H: Like apply but will work even when some quantified variables
from H do not have their values determined immediately by the form of
the goal. Instead, existential variables (with names starting with question
marks) are introduced for those values.
eassumption: Like assumption but will figure out values of existential vari-
ables.
econstructor: When proving an instance of an inductive predicate, eapply
the first matching rule of that predicate.
eexists: To prove ∃x. P(x), switch to proving P(?y), for a new existential
variable ?y.
equality: A complete decision procedure for the theory of equality and un-
interpreted functions. That is, the goal must follow from only reflexivity,
symmetry, transitivity, and congruence of equality, including that func-
tions really do behave as functions. See Section 2.4.
exfalso: From any proof state, switch to proving False. In other words,
indicate a switch to a proof by contradiction.
exists e: Prove ∃x. P(x) by proving P(e).
first_order: Simplify a goal into zero or more new goals, based on the rules
of first-order logic alone. Warning: this tactic is especially likely to run
forever, on complex enough goals! (While entailment for propositional
logic is decidable, entailment for first-order logic isn't.)
f_equal: When the goal is an equality between two applications of the same
function, switch to proving that the function arguments are pairwise equal.
induct x: Where x is a variable in the theorem statement, structure the
proof by induction on the structure of x. You will get one generated
subgoal per constructor in the inductive definition of x. (Indeed, it is
required that x’s type was introduced with Inductive.)
invert H: Replace hypothesis H with other facts that can be deduced from
the structure of H’s statement. More detail to be added here soon!
linear_arithmetic: A complete decision procedure for linear arithmetic.
Relevant formulas are essentially those built up from variables and constant natural numbers and integers using only addition and subtraction,
with equality and inequality comparisons on top. (Multiplication by constants is supported, as a shorthand for repeated addition.) See Section
2.4. Also note that this tactic goes a bit beyond that theory, by (1) converting multivariable terms into a standard polynomial form and then (2)
treating each different product of powers of variables as one variable in
a linear-arithmetic problem. So, for instance, linear_arithmetic can
prove x × y = y × x simply by introducing a new variable z = x × y and
rewriting the goal to z = z, after putting polynomials in canonical form (in
this case, commuting argument order in products to make it consistent).
left: Prove a disjunction by proving its left side.
maps_equal: Prove that two finite maps are equal by considering all the
relevant cases for mappings of different keys.
propositional: Simplify a goal into zero or more new goals, based on the
rules of propositional logic alone.
replace e1 with e2 by tac: Replace occurrences of e1 with e2, proving
e2 = e1 with tactic tac.
rewrite H: Where H is a hypothesis or previously proved theorem, es-
tablishing forall x1 .. xN, e1 = e2, find a subterm of the goal that
equals e1, given the right choices of xi values, and replace that subterm
with e2.
rewrite H1 in H2 : Like rewrite H1 but performs the rewrite in hypothesis
H2 instead of in the conclusion.
right: Prove a disjunction by proving its right side.
ring: Prove goals that are equalities over some registered ring or semiring,
in the sense of algebra, where the goal follows solely from the axioms of
that algebraic structure. See Section 2.4.
simplify: Simplify throughout the goal, applying the definitions of recursive
functions directly. That is, when a subterm matches one of the match cases
in a defining Fixpoint, replace with the body of that case, then repeat.
subst: Remove all hypotheses like x = e for variables x, simply replacing
all uses of x by e.
symmetry: When proving X = Y, switch to proving Y = X.
transitivity X: When proving Y = Z, switch to proving Y = X and
X = Z.
trivial: Coq maintains a database of simple proof steps, such as proving
a fact by direct appeal to a matching hypothesis. trivial asks to try all
such simple steps.
unfold X: Replace X by its definition.
unfold X in *: Like the last one, but unfolds in hypotheses as well as con-
clusion.
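To see several of these tactics working together, here is a small illustrative proof script. It is our own toy example, not drawn from the book's code, and it assumes the book's Frap library is installed and imported; the theorem names are made up.

Require Import Frap.

(* induct generates one subgoal per list constructor; simplify unfolds
   ++ and length; linear_arithmetic finishes both subgoals, using the
   inductive hypothesis in the cons case. *)
Theorem app_length_one : forall ls : list nat,
    length (ls ++ 2 :: nil) = 1 + length ls.
Proof.
  induct ls; simplify; linear_arithmetic.
Qed.

(* cases splits on the two Boolean constructors; simplify computes
   negb; equality closes the resulting trivial equalities. *)
Theorem negb_negb : forall b : bool, negb (negb b) = b.
Proof.
  simplify; cases b; simplify; equality.
Qed.

(* exists picks a witness; simplify reduces 2 + 1, and equality
   closes the goal 3 = 3. *)
Theorem exists_demo : exists n : nat, n + 1 = 3.
Proof.
  exists 2. simplify. equality.
Qed.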
The first of these two, especially, goes in-depth on the automated proof-scripting
principles showcased from time to time in the Coq example code associated with
the present book.
There are also other sources that introduce program-reasoning principles at the
same time, including:
• Benjamin C. Pierce et al., Software Foundations, http://www.cis.upenn.edu/~bcpierce/sf/
Software Foundations generally proceeds at a slower pace than this book does.
Index
target language, 59
tautology, 29
TCB, 101
termination of recursive definitions, 7
theory of equality with uninterpreted
functions, 9
threads and locks, 123
top element of an abstract interpretation,
51
total correctness, 86
total function, 7
trace equivalence, 60
trace inclusion, 60, 125
traces, 60
transition system, 2, 32
transitive closure, 26
transitive-reflexive closure, 32
trusted code base, 101
Turing-completeness, 22, 39
two-stack queue, 15
type system, 129
typing context, 69
uncaught exceptions, 73
unification, 11
unification variables, 97
value, 66
variable binding, 66
variable capture, 66
variants, 72
verification, 2
vertical decomposition, 2
weakening, 30
weakening the postcondition, 85
weaker predicate, 85
widening, 57