Programming and Proving in Isabelle/HOL
1 Introduction
We assume that the reader is used to logical and set-theoretic notation and
is familiar with the basic concepts of functional programming.
Chapter 2 introduces HOL as a functional programming language and ex-
plains how to write simple inductive proofs of mostly equational properties
of recursive functions. Chapter 3 introduces the rest of HOL: the language of
formulas beyond equality, automatic proof tools, single-step proofs, and in-
ductive definitions, an essential specification construct. Chapter 4 introduces
Isar, Isabelle’s language for writing structured proofs.
This introduction to the core of Isabelle is intentionally concrete and
example-based: we concentrate on examples that illustrate the typical cases
without explaining the general case if it can be inferred from the examples.
We cover the essentials (from a functional programming point of view) as
quickly and compactly as possible.
For a comprehensive treatment of all things Isabelle we recommend the
Isabelle/Isar Reference Manual [7], which comes with the Isabelle distribu-
tion. The tutorial by Nipkow, Paulson and Wenzel [6] (in its updated version
that comes with the Isabelle distribution) is still recommended for the wealth
of examples and material, but its proof style is outdated. In particular it does
not cover the structured proof language Isar.
If you want to apply what you have learned about Isabelle we recommend
you download and read the book Concrete Semantics [5], a guided tour of the
wonderful world of programming language semantics formalized in Isabelle.
In fact, Programming and Proving in Isabelle/HOL constitutes part I of
Concrete Semantics. The web pages for Concrete Semantics also provide a set
of LaTeX-based slides and Isabelle demo files for teaching Programming and
Proving in Isabelle/HOL.
Acknowledgements
I wish to thank the following people for their comments on this document:
Florian Haftmann, Peter Johnson, René Thiemann, Sean Seefried, Christian
Sternagel and Carl Witty.
2 Programming and Proving
2.1 Basics
HOL is a typed logic whose type system resembles that of functional pro-
gramming languages. Thus there are
base types, in particular bool, the type of truth values, nat, the type of
natural numbers (ℕ), and int, the type of mathematical integers (ℤ).
type constructors, in particular list, the type of lists, and set, the type of
sets. Type constructors are written postfix, i.e., after their arguments. For
example, nat list is the type of lists whose elements are natural numbers.
function types, denoted by ⇒.
type variables, denoted by ′a, ′b, etc., as in ML.
Note that ′a ⇒ ′b list means ′a ⇒ (′b list), not (′a ⇒ ′b) list: postfix type
constructors have precedence over ⇒.
Terms are formed as in functional programming by applying functions to
arguments. If f is a function of type τ1 ⇒ τ2 and t is a term of type τ1 then
f t is a term of type τ2. We write t :: τ to mean that term t has type τ.
There are many predefined infix symbols like + and ≤. The name of the
corresponding binary function is (+), not just +. That is, x + y is nice surface
syntax (“syntactic sugar”) for (+) x y.
The symbols ⋀ and =⇒ are part of the Isabelle framework, not the logic HOL. Logically, they agree
with their HOL counterparts ∀ and −→, but operationally they behave dif-
ferently. This will become clearer as we go along.
Right-arrows of all kinds always associate to the right. In particular, the formula
A1 =⇒ A2 =⇒ A3 means A1 =⇒ (A2 =⇒ A3). The (Isabelle-specific) notation
[[ A1; . . .; An ]] =⇒ A is short for the iterated implication A1 =⇒ . . . =⇒ An =⇒ A.
Sometimes we also employ inference rule notation:

A1 . . . An
───────────
     A
2.1.2 Theories
Roughly speaking, a theory is a named collection of types, functions, and
theorems. The general format of a theory T is

theory T
imports T1 . . . Tn
begin
definitions, theorems and proofs
end

where T1 . . . Tn are the names of existing theories that T is based on. The
Ti are the direct parent theories of T. Everything defined in the parent
theories (and their parents, recursively) is automatically visible. Each theory
T must reside in a theory file named T.thy.
HOL contains a theory Main, the union of all the basic predefined theories like
arithmetic, lists, sets, etc. Unless you know what you are doing, always include
Main as a direct or indirect parent of all your theories.
The textual definition of a theory follows a fixed syntax with keywords like
begin and datatype. Embedded in this syntax are the types and formulas of
HOL. To distinguish the two levels, everything HOL-specific (terms and types)
must be enclosed in quotation marks: ". . . ". Quotation marks around a single
identifier can be dropped. When Isabelle prints a syntax error message, it
refers to the HOL syntax as the inner syntax and the enclosing theory
language as the outer syntax.
By default Isabelle/jEdit does not show the proof state but this tutorial refers
to it frequently. You should tick the “Proof state” box to see the proof state in
the output window.
2.2 Types bool, nat and list

These are the most important predefined types. We go through them one by
one. Based on examples we learn how to define (possibly recursive) functions
and prove theorems about them by induction and simplification.
2.2.1 Type bool

The type bool of truth values is a predefined datatype

datatype bool = True | False

with the two values True and False and with many predefined functions: ¬,
∧, ∨, −→, etc. Here is how conjunction could be defined by pattern matching:
fun conj :: "bool ⇒ bool ⇒ bool" where
"conj True True = True" |
"conj _ _ = False"
Both the datatype and function definitions roughly follow the syntax of func-
tional programming languages.
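For reference, the function add under discussion and the lemma being proved
can be sketched as follows (a reconstruction following the standard tutorial):

fun add :: "nat ⇒ nat ⇒ nat" where
"add 0 n = n" |
"add (Suc m) n = Suc(add m n)"

lemma add_02: "add m 0 = m"
apply(induction m)

After induction, the proof state shows two subgoals, the base case and the
induction step:

1. add 0 0 = 0
2. ⋀m. add m 0 = m =⇒ add (Suc m) 0 = Suc m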
The ⋀ m means “for an arbitrary but fixed m”. The =⇒ separates assumptions
from the conclusion. The command apply(auto) instructs Isabelle to try and
prove all subgoals automatically, essentially by simplifying them. Because both
subgoals are easy, Isabelle can do it. The base case add 0 0 = 0 holds by
definition of add, and the induction step is almost as simple: add (Suc m) 0
= Suc(add m 0) = Suc m, using first the definition of add and then the
induction hypothesis.
An Informal Proof
Lemma add m 0 = m
Proof by induction on m.
• Case 0 (the base case): add 0 0 = 0 holds by definition of add.
• Case Suc m (the induction step): We assume add m 0 = m, the induction
hypothesis (IH), and we need to show add (Suc m) 0 = Suc m. The proof
is as follows:
add (Suc m) 0 = Suc (add m 0) by definition of add
= Suc m by IH
Although lists are already predefined, we define our own copy for demonstra-
tion purposes:
datatype ′a list = Nil | Cons ′a "′a list"

• Type ′a list is the type of lists over elements of type ′a. Because ′a is a
type variable, lists are in fact polymorphic: the elements of a list can be
of arbitrary type (but must all be of the same type).
• Lists have two constructors: Nil, the empty list, and Cons, which puts an
element (of type ′a) in front of a list (of type ′a list). Hence all lists are
of the form Nil, or Cons x Nil, or Cons x (Cons y Nil), etc.
• datatype requires no quotation marks on the left-hand side, but on the
right-hand side each of the argument types of a constructor needs to be
enclosed in quotation marks, unless it is just an identifier (e.g., nat or ′a).
We also define two standard functions, append and reverse:
fun app :: "′a list ⇒ ′a list ⇒ ′a list" where
"app Nil ys = ys" |
"app (Cons x xs) ys = Cons x (app xs ys)"
theory MyList
imports Main
begin
(* a comment *)
end
Just as for natural numbers, there is a proof principle of induction for lists.
Induction over a list is essentially induction over the length of the list, al-
though the length remains implicit. To prove that some property P holds for
all lists xs, i.e., P xs, you need to prove
1. the base case P Nil and
2. the inductive case P (Cons x xs) under the assumption P xs, for some
arbitrary but fixed x and xs.
This is often called structural induction for lists.
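As a minimal illustration of this principle (using a hypothetical length function
len that is not part of the text), one can prove that app adds lengths:

fun len :: "′a list ⇒ nat" where
"len Nil = 0" |
"len (Cons x xs) = Suc (len xs)"

lemma "len (app xs ys) = len xs + len ys"
apply(induction xs)
apply(auto)
done

The base case len (app Nil ys) = len Nil + len ys and the inductive case, with
its assumption len (app xs ys) = len xs + len ys, are both closed by simplification.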
We will now demonstrate the typical proof process, which involves the for-
mulation and proof of auxiliary lemmas. Our goal is to show that reversing a
list twice produces the original list.
theorem rev_rev [simp]: "rev (rev xs) = xs"
Commands theorem and lemma are interchangeable and merely indicate the
importance we attach to a proposition. Via the bracketed attribute simp we
also tell Isabelle to make the eventual theorem a simplification rule: future
proofs involving simplification will replace occurrences of rev (rev xs) by xs.
The proof is by induction:
apply(induction xs)
As explained above, we obtain two subgoals, namely the base case (Nil) and
the induction step (Cons):
1. rev (rev Nil) = Nil
2. ⋀x1 xs.
     rev (rev xs) = xs =⇒ rev (rev (Cons x1 xs)) = Cons x1 xs
Let us try to solve both goals automatically:
apply(auto)
Subgoal 1 is proved, and disappears; the simplified version of subgoal 2 be-
comes the new subgoal 1:
1. ⋀x1 xs.
     rev (rev xs) = xs =⇒
     rev (app (rev xs) (Cons x1 Nil)) = Cons x1 xs
In order to simplify this subgoal further, a lemma suggests itself.
A First Lemma
A Second Lemma
Associativity of app
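The intervening proof steps are not reproduced here; the lemma statements in
question can be sketched as follows, each provable by induction on xs followed
by auto:

lemma rev_app [simp]: "rev (app xs ys) = app (rev ys) (rev xs)"
lemma app_Nil2 [simp]: "app xs Nil = xs"
lemma app_assoc [simp]: "app (app xs ys) zs = app xs (app ys zs)"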
Didn’t we say earlier that all proofs are by simplification? But in both cases,
going from left to right, the last equality step is not a simplification at all!
In the base case it is app ys zs = app Nil (app ys zs). It appears almost
mysterious because we suddenly complicate the term by appending Nil on
the left. What is really going on is this: when proving some equality s = t ,
both s and t are simplified until they “meet in the middle”. This heuristic
for equality proofs works well for a functional programming context like ours.
In the base case both app (app Nil ys) zs and app Nil (app ys zs) are
simplified to app ys zs, the term in the middle.
Isabelle’s predefined lists are the same as the ones above, but with more
syntactic sugar:
• [] is Nil,
• x # xs is Cons x xs,
• [x 1 , . . ., x n ] is x 1 # . . . # x n # [], and
• xs @ ys is app xs ys.
There is also a large library of predefined functions. The most important ones
are the length function length :: ′a list ⇒ nat (with the obvious definition),
and the map function that applies a function to all elements of a list:
fun map :: "(′a ⇒ ′b) ⇒ ′a list ⇒ ′b list" where
"map f Nil = Nil" |
"map f (Cons x xs) = Cons (f x ) (map f xs)"
From now on lists are always the predefined lists.
In addition to nat there are also the types int and real, the mathematical
integers and real numbers. As mentioned above, numerals and most of the
standard arithmetic operations are overloaded. In particular they are defined
on int and real.
There are two infix exponentiation operators: (^) for nat and int (with exponent
of type nat in both cases) and (powr) for real.
Type int is already part of theory Main, but in order to use real as well, you
have to import theory Complex_Main instead of Main.
There are three coercion functions that are inclusions and do not lose
information:
int :: nat ⇒ int
real :: nat ⇒ real
real_of_int :: int ⇒ real
Isabelle inserts these inclusions automatically once you import Com-
plex_Main. If there are multiple type-correct completions, Isabelle chooses
an arbitrary one. For example, the input (i ::int ) + (n::nat ) has the unique
type-correct completion i + int n. In contrast, ((n::nat ) + n) :: real has two
type-correct completions, real(n+n) and real n + real n.
There are also the coercion functions in the other direction:
nat :: int ⇒ nat
floor :: real ⇒ int
ceiling :: real ⇒ int
Exercises
Exercise 2.1. Use the value command to evaluate the following expressions:
"1 + (2::nat )", "1 + (2::int )", "1 − (2::nat )" and "1 − (2::int )".
Exercise 2.2. Start from the definition of add given above. Prove that add
is associative and commutative. Define a recursive function double :: nat ⇒
nat and prove double m = add m m.
Exercise 2.3. Define a function count :: ′a ⇒ ′a list ⇒ nat that counts the
number of occurrences of an element in a list. Prove count x xs ≤ length xs.
Exercise 2.5. Define a recursive function sum_upto :: nat ⇒ nat such that
sum_upto n = 0 + ... + n and prove sum_upto n = n ∗ (n + 1) div 2.
2.3 Type and Function Definitions

2.3.1 Datatypes

2.3.2 Definitions

2.3.3 Abbreviations

2.3.4 Recursive Functions

Recursive functions are defined with fun by pattern matching over datatype
constructors. The order of equations matters, as in functional programming
languages. However, all HOL functions must be total. This simplifies the logic
— terms are always defined — but means that recursive functions must ter-
minate. Otherwise one could define a function f n = f n + 1 and conclude
0 = 1 by subtracting f n on both sides.
Isabelle’s automatic termination checker requires that the arguments of
recursive calls on the right-hand side must be strictly smaller than the ar-
guments on the left-hand side. In the simplest case, this means that one
fixed argument position decreases in size with each recursive call. The size is
measured as the number of constructors (excluding 0-ary ones, e.g., Nil). Lex-
icographic combinations are also recognized. In more complicated situations,
the user may have to prove termination by hand. For details see [3].
Functions defined with fun come with their own induction schema that
mirrors the recursion schema and is derived from the termination order. For
example,
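the function div2 (a sketch of its standard definition):

fun div2 :: "nat ⇒ nat" where
"div2 0 = 0" |
"div2 (Suc 0) = 0" |
"div2 (Suc(Suc n)) = Suc(div2 n)"

does not merely define div2 but also proves a customized induction rule
div2.induct with one case per defining equation.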
This customized induction rule can simplify inductive proofs. For example,
lemma "div2 n = n div 2"
apply(induction n rule: div2.induct )
(where the infix div is the predefined division operation) yields the subgoals
1. div2 0 = 0 div 2
2. div2 (Suc 0) = Suc 0 div 2
3. ⋀n. div2 n = n div 2 =⇒
      div2 (Suc (Suc n)) = Suc (Suc n) div 2
An application of auto finishes the proof. Had we used ordinary structural
induction on n, the proof would have needed an additional case analysis in
the induction step.
This example leads to the following induction heuristic:
Let f be a recursive function. If the definition of f is more com-
plicated than having one equation for each constructor of some
datatype, then properties of f are best proved via f .induct.
The general case is often called computation induction, because the
induction follows the (terminating!) computation. For every defining equation
f(e) = . . . f(r1) . . . f(rk) . . .
where f(ri), i = 1 . . . k, are all the recursive calls, the induction rule f.induct
contains one premise of the form
P(r1) =⇒ . . . =⇒ P(rk) =⇒ P(e)
If f :: τ1 ⇒ . . . ⇒ τn ⇒ τ then f.induct is applied like this:
apply(induction x1 . . . xn rule: f.induct)
where typically there is a call f x 1 . . . x n in the goal. But note that the
induction rule does not mention f at all, except in its name, and is applicable
independently of f.
Exercises
Exercise 2.6. Starting from the type ′a tree defined in the text, define a
function contents :: ′a tree ⇒ ′a list that collects all values in a tree in a list,
in any order, without removing duplicates. Then define a function sum_tree
:: nat tree ⇒ nat that sums up all values in a tree of natural numbers and
prove sum_tree t = sum_list (contents t) where sum_list is predefined by
the equations sum_list [] = 0 and sum_list (x # xs) = x + sum_list xs.
Exercise 2.7. Define the two functions pre_order and post_order of type
′a tree ⇒ ′a list that traverse a tree and collect all stored values in the
respective order in a list. Prove pre_order (mirror t) = rev (post_order t).
2.4 Induction Heuristics

We have already noted that theorems about recursive functions are proved by
induction. In case the function has more than one argument, we have followed
the following heuristic in the proofs about the append function:
Perform induction on argument number i
if the function is defined by recursion on argument number i.
The key heuristic, and the main point of this section, is to generalize the
goal before induction. The reason is simple: if the goal is too specific, the
induction hypothesis is too weak to allow the induction step to go through.
Let us illustrate the idea with an example.
Function rev has quadratic worst-case running time because it calls ap-
pend for each element of the list and append is linear in its first argument.
A linear time version of rev requires an extra argument where the result is
accumulated gradually, using only #:
fun itrev :: "′a list ⇒ ′a list ⇒ ′a list" where
"itrev [] ys = ys" |
"itrev (x #xs) ys = itrev xs (x #ys)"
The behaviour of itrev is simple: it reverses its first argument by stacking
its elements onto the second argument, and it returns that second argument
when the first one becomes empty. Note that itrev is tail-recursive: it can be
compiled into a loop; no stack is necessary for executing it.
Naturally, we would like to show that itrev reverses its first argument:
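A first attempt can be sketched like this:

lemma "itrev xs [] = rev xs"
apply(induction xs)
apply(auto)

The base case succeeds, but auto gets stuck on the induction step, roughly:

1. ⋀a xs. itrev xs [] = rev xs =⇒ itrev xs [a] = rev xs @ [a]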
The induction hypothesis is too weak. The fixed argument, [], prevents it from
rewriting the conclusion. This example suggests a heuristic:
Generalize goals for induction by replacing constants by variables.
Of course one cannot do this naively: itrev xs ys = rev xs is just not true.
The correct generalization is
lemma "itrev xs ys = rev xs @ ys"
If ys is replaced by [], the right-hand side simplifies to rev xs, as required. In
this instance it was easy to guess the right generalization. Other situations
can require a good deal of creativity.
Although we now have two variables, only xs is suitable for induction, and
we repeat our proof attempt. Unfortunately, we are still not there:
1. ⋀a xs.
     itrev xs ys = rev xs @ ys =⇒
     itrev xs (a # ys) = rev xs @ a # ys
The induction hypothesis is still too weak, but this time it takes no intuition
to generalize: the problem is that the ys in the induction hypothesis is fixed,
but the induction hypothesis needs to be applied with a # ys instead of ys.
Hence we prove the theorem for all ys instead of a fixed one. We can instruct
induction to perform this generalization for us by adding arbitrary: ys.
apply(induction xs arbitrary: ys)
The induction hypothesis in the induction step is now universally quantified
over ys:
1. ⋀ys. itrev [] ys = rev [] @ ys
2. ⋀a xs ys.
     (⋀ys. itrev xs ys = rev xs @ ys) =⇒
     itrev (a # xs) ys = rev (a # xs) @ ys

Both subgoals are now proved automatically by auto, which completes the proof.
Exercises
2.5 Simplification
So far we have talked a lot about simplifying terms without explaining the
concept. Simplification means
• using equations l = r from left to right (only),
• as long as possible.
To emphasize the directionality, equations that have been given the simp
attribute are called simplification rules. Logically, they are still symmetric,
but proofs by simplification use them only in the left-to-right direction. The
proof tool that performs simplifications is called the simplifier. It is the basis
of auto and other related proof methods.
The idea of simplification is best explained by an example. Given the
simplification rules
0 + n = n                        (1)
Suc m + n = Suc (m + n)          (2)
(Suc m ≤ Suc n) = (m ≤ n)        (3)
(0 ≤ m) = True                   (4)

the formula 0 + Suc 0 ≤ Suc 0 + x is simplified to True as follows:

(0 + Suc 0 ≤ Suc 0 + x)
= (Suc 0 ≤ Suc 0 + x)        by (1)
= (Suc 0 ≤ Suc (0 + x))      by (2)
= (0 ≤ 0 + x)                by (3)
= True                       by (4)
Simplification rules can be conditional. Before applying such a rule, the sim-
plifier will first try to prove the preconditions, again by simplification. For
example, given the simplification rules
p 0 = True
p x =⇒ f x = g x,
the term f 0 simplifies to g 0 but f 1 does not simplify because p 1 is not
provable.
2.5.3 Termination
2.5.4 The simp Proof Method

So far we have only used the proof method auto. Method simp is the key
component of auto, but auto can do much more. In some cases, auto is
overeager and modifies the proof state too much. In such cases the more
predictable simp method should be used. Given a goal
1. [[ P1; . . .; Pm ]] =⇒ C
the command
apply(simp add: th1 . . . thn)
simplifies the assumptions Pi and the conclusion C using
• all simplification rules, including the ones coming from datatype and fun,
• the additional lemmas th1 . . . thn, and
• the assumptions.
In addition to or instead of add there is also del for removing simplification
rules temporarily. Both are optional. Method auto can be modified similarly:
apply(auto simp add: . . . simp del: . . .)
Here the modifiers are simp add and simp del instead of just add and del
because auto does not just perform simplification.
Note that simp acts only on subgoal 1, auto acts on all subgoals. There
is also simp_all, which applies simp to all subgoals.
Stating and proving abstract properties of a concept, instead of unfolding its
definition everywhere, also makes proofs more robust: if the definition has to
be changed, only the proofs of the abstract properties will be affected.
The definition of a function f is a theorem named f_def and can be added
to a call of simp like any other theorem:
apply(simp add: f_def )
In particular, let-expressions can be unfolded by making Let_def a simplifi-
cation rule.
Goals containing if-expressions are automatically split into two cases by simp
using the rule
P (if A then s else t ) = ((A −→ P s) ∧ (¬ A −→ P t ))
For example, simp can prove
(A ∧ B ) = (if A then B else False)
because both A −→ (A ∧ B ) = B and ¬ A −→ (A ∧ B ) = False simplify
to True.
We can split case-expressions similarly. For nat the rule looks like this:
P (case e of 0 ⇒ a | Suc n ⇒ b n) =
((e = 0 −→ P a) ∧ (∀ n. e = Suc n −→ P (b n)))
Case expressions are not split automatically by simp, but simp can be
instructed to do so:
apply(simp split: nat.split)
splits all case-expressions over natural numbers. For an arbitrary datatype t
it is t.split instead of nat.split. Method auto can be modified in exactly the
same way. The modifier split: can be followed by multiple names. Splitting
if or case-expressions in the assumptions requires split: if_splits or split:
t.splits.
Exercises
Exercise 2.10. Define a datatype tree0 of binary tree skeletons which do not
store any information, neither in the inner nodes nor in the leaves. Define a
function nodes :: tree0 ⇒ nat that counts the number of all nodes (inner
nodes and leaves) in such a tree. Consider the following recursive function:
fun explode :: "nat ⇒ tree0 ⇒ tree0" where
"explode 0 t = t" |
"explode (Suc n) t = explode n (Node t t )"
Find an equation expressing the size of a tree after exploding it (nodes
(explode n t )) as a function of nodes t and n. Prove your equation. You
may use the usual arithmetic operators, including the exponentiation opera-
tor “^”. For example, 2 ^ 2 = 4.
Hint: simplifying with the list of theorems algebra_simps takes care of
common algebraic properties of the arithmetic operators.
3 Logic and Proof Beyond Equality

3.1 Formulas
The core syntax of formulas (form below) provides the standard logical con-
structs, in decreasing order of precedence:
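In outline, the grammar reads roughly:

form ::= (form) | term = term | ¬ form
       | form ∧ form | form ∨ form | form −→ form
       | ∀x. form | ∃x. form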
Terms are the ones we have seen all along, built from constants, variables,
function application and λ-abstraction, including all the syntactic sugar like
infix symbols, if, case, etc.
Remember that formulas are simply terms of type bool. Hence = also works for
formulas. Beware that = has a higher precedence than the other logical operators.
Hence s = t ∧ A means (s = t ) ∧ A, and A ∧ B = B ∧ A means A ∧ (B = B )
∧ A. Logical equivalence can also be written with ←→ instead of =, where ←→ has
the same low precedence as −→. Hence A ∧ B ←→ B ∧ A really means (A ∧ B )
←→ (B ∧ A).
The most frequent logical symbols and their ASCII representations are shown
in Fig. 3.1. The first column shows the symbols, the other columns ASCII
representations. The \<...> form is always converted into the symbolic form
by the Isabelle interfaces, the treatment of the other ASCII forms depends on
the interface. The ASCII forms /\ and \/ are special in that they are merely
keyboard shortcuts for the interface and not logical symbols by themselves.
∀    \<forall>    ALL
∃    \<exists>    EX
λ    \<lambda>    %
−→   -->
←→   <->
∧    /\    &
∨    \/    |
¬    \<not>    ~
≠    \<noteq>    ~=
3.2 Sets
Sets of elements of type ′a have type ′a set. They can be finite or infinite.
Sets come with the usual notation:
• {}, {e 1 ,. . .,e n }
• e ∈ A, A ⊆ B
• A ∪ B , A ∩ B , A − B, − A
(where A − B and −A are set difference and complement) and much more.
UNIV is the set of all elements of some type. Set comprehension is written
{x . P } rather than {x | P }.
In {x . P } the x must be a variable. Set comprehension involving a proper term
t must be written {t | x y. P }, where x y are those free variables in t that occur
in P. This is just a shorthand for {v . ∃ x y. v = t ∧ P }, where v is a new variable.
For example, {x + y |x . x ∈ A} is short for {v . ∃ x . v = x +y ∧ x ∈ A}.
Here are the ASCII representations of the mathematical symbols:
∈ \<in> :
⊆ \<subseteq> <=
∪ \<union> Un
∩ \<inter> Int
Sets also allow bounded quantifications ∀ x ∈A. P and ∃ x ∈A. P.
For the more ambitious, there are also ⋃ and ⋂:

⋃A = {x. ∃B∈A. x ∈ B}        ⋂A = {x. ∀B∈A. x ∈ B}

The ASCII forms of ⋃ are \<Union> and Union, those of ⋂ are \<Inter>
and Inter. There are also indexed versions:

(⋃x∈A. B x) = {y. ∃x∈A. y ∈ B x}        (⋂x∈A. B x) = {y. ∀x∈A. y ∈ B x}

The ASCII forms are UN x:A. B and INT x:A. B where x may occur in B.
If A is UNIV you can write UN x. B and INT x. B.
Some other frequently useful functions on sets are the following:
set :: ′a list ⇒ ′a set        converts a list to the set of its elements
finite :: ′a set ⇒ bool        is true iff its argument is finite
card :: ′a set ⇒ nat           is the cardinality of a finite set
                               and is 0 for all infinite sets
f ‘ A = {y. ∃x∈A. y = f x}     is the image of a function over a set
See [4] for the wealth of further predefined functions in theory Main.
Exercises
Exercise 3.1. Start from the data type of binary trees defined earlier:
datatype ′a tree = Tip | Node "′a tree" ′a "′a tree"
Define a function set :: ′a tree ⇒ ′a set that returns the elements in a tree
and a function ord :: int tree ⇒ bool that tests if an int tree is ordered.
Define a function ins that inserts an element into an ordered int tree
while maintaining the order of the tree. If the element is already in the tree,
the same tree should be returned. Prove correctness of ins: set (ins x t ) =
{x } ∪ set t and ord t =⇒ ord (ins i t ).
3.3 Proof Automation

So far we have only seen simp and auto: Both perform rewriting, both can
also prove linear arithmetic facts (no multiplication), and auto is also able to
prove simple logical or set-theoretic goals:
lemma "∀ x . ∃ y. x = y"
by auto
by proof-method
is short for
apply proof-method
done
The key characteristics of both simp and auto are
• They show you where they got stuck, giving you an idea how to continue.
• They perform the obvious steps but are highly incomplete.
A proof method is complete if it can prove all true formulas. There is no
complete proof method for HOL, not even in theory. Hence all our proof
methods only differ in how incomplete they are.
A proof method that is still incomplete but tries harder than auto is
fastforce. It either succeeds or fails, it acts on the first subgoal only, and it
can be modified like auto, e.g., with simp add. Here is a typical example of
what fastforce can do:
lemma "[[ ∀ xs ∈ A. ∃ ys. xs = ys @ ys; us ∈ A ]]
=⇒ ∃ n. length us = n+n"
by fastforce
This lemma is out of reach for auto because of the quantifiers. Even fastforce
fails when the quantifier structure becomes more complicated. In a few cases,
its slow version force succeeds where fastforce fails.
The method of choice for complex logical goals is blast . In the following
example, T and A are two binary predicates. It is shown that if T is total,
A is antisymmetric and T is a subset of A, then A is a subset of T :
lemma
"[[ ∀ x y. T x y ∨ T y x ;
∀ x y. A x y ∧ A y x −→ x = y;
∀ x y. T x y −→ A x y ]]
=⇒ ∀ x y. A x y −→ T x y"
by blast
We leave it to the reader to figure out why this lemma is true. Method blast
• is (in principle) a complete proof procedure for first-order formulas, a
fragment of HOL. In practice there is a search bound.
• does no rewriting and knows very little about equality.
• covers logic, sets and relations.
• either succeeds or fails.
Because of its strength in logic and sets and its weakness in equality reasoning,
it complements the earlier proof methods.
3.3.1 Sledgehammer
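In brief: the sledgehammer command calls a number of external automatic
theorem provers on the current subgoal and, if one of them succeeds, suggests
a proof command that you can insert into your text, e.g.:

lemma "[[ xs @ zs = ys @ xs; [] @ xs = [] @ [] ]] =⇒ ys = zs"
sledgehammer

A prover typically reports a one-line proof, for example a suitable metis call.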
3.3.2 Arithmetic
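In brief: linear arithmetic goals that simp and auto cannot handle can often
be dispatched by the method arith, e.g.:

lemma "[[ (a::nat) ≤ x + b; 2∗x < c ]] =⇒ 2∗a + 1 ≤ 2∗b + c"
by arith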
If you want to try all of the above automatic proof methods you simply type
try
There is also a lightweight variant try0 that does not call sledgehammer. If
desired, specific simplification and introduction rules can be added:
try0 simp: . . . intro: . . .
3.4 Single Step Proofs

Although automation is nice, it often fails, at least initially, and you need
to find out why. When fastforce or blast simply fail, you have no clue why.
At this point, the stepwise application of proof rules may be necessary. For
example, if blast fails on A ∧ B, you want to attack the two conjuncts A and
B separately. This can be achieved by applying conjunction introduction
?P    ?Q
─────────  conjI
?P ∧ ?Q
to the proof state. We will now examine the details of this process.
We had briefly mentioned earlier that after proving some theorem, Isabelle re-
places all free variables x by so called unknowns ?x. We can see this clearly in
rule conjI. These unknowns can later be instantiated explicitly or implicitly:
• By hand, using of. The expression conjI [of "a=b" "False"] instantiates
the unknowns in conjI from left to right with the two formulas a=b and
False, yielding the rule
a = b    False
──────────────
a = b ∧ False
In general, th[of string1 . . . stringn] instantiates the unknowns in the
theorem th from left to right with the terms string1 to stringn.
• By unification. Unification is the process of making two terms syntacti-
cally equal by suitable instantiations of unknowns. For example, unifying
?P ∧ ?Q with a = b ∧ False instantiates ?P with a = b and ?Q with
False.
We need not instantiate all unknowns. If we want to skip a particular one we
can write _ instead, for example conjI [of _ "False"]. Unknowns can also be
instantiated by name using where, for example conjI [where ?P = "a=b"
and ?Q = "False"].
Other introduction rules follow the same pattern, for example iffI:

?P =⇒ ?Q    ?Q =⇒ ?P
─────────────────────  iffI
      ?P = ?Q
These rules are part of the logical system of natural deduction (e.g., [2]).
Although we intentionally de-emphasize the basic rules of logic in favour of
automatic proof methods that allow you to take bigger steps, these rules are
helpful in locating where and why automation fails. When applied backwards,
these rules decompose the goal:
• conjI and iffI split the goal into two subgoals,
• impI moves the left-hand side of a HOL implication into the list of as-
sumptions,
• and allI removes a ∀ by turning the quantified variable into a fixed local
variable of the subgoal.
Isabelle knows about these and a number of other introduction rules. The
command
apply rule
automatically selects the appropriate rule for the current subgoal.
You can also turn your own theorems into introduction rules by giving
them the intro attribute, analogous to the simp attribute. In that case blast,
fastforce and (to a limited extent) auto will automatically backchain with
those theorems. The intro attribute should be used with care because it in-
creases the search space and can lead to nontermination. Sometimes it is better
to use it only in specific calls of blast and friends. For example, le_trans,
transitivity of ≤ on type nat, is not an introduction rule by default because
of the disastrous effect on the search space, but can be useful in specific
situations:
lemma "[[ (a::nat) ≤ b; b ≤ c; c ≤ d; d ≤ e ]] =⇒ a ≤ e"
by(blast intro: le_trans)
Of course this is just an example and could be proved by arith, too.
Forward proof means deriving new theorems from old theorems. We have
already seen a very simple form of forward proof: the of operator for instan-
tiating unknowns in a theorem. The big brother of of is OF for applying
one theorem to others. Given a theorem A =⇒ B called r and a theorem
A′ called r′, the theorem r[OF r′] is the result of applying r to r′, where
r should be viewed as a function taking a theorem A and returning B. More
precisely, A and A′ are unified, thus instantiating the unknowns in B, and
the result is the instantiated B. Of course, unification may also fail.
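As a small example, using the predefined reflexivity theorem refl: "?t = ?t",

thm conjI[OF refl[of "a"] refl[of "b"]]

yields the theorem a = a ∧ b = b.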
Application of rules to other rules operates in the forward direction: from the
premises to the conclusion of the rule; application of rules to proof states operates
in the backward direction, from the conclusion to the premises.
3.5 Inductive Definitions

Inductive definitions are the third important definition facility, after datatypes
and recursive functions.
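The running example is the inductive definition of even numbers, a sketch of
which is:

inductive ev :: "nat ⇒ bool" where
ev0: "ev 0" |
evSS: "ev n =⇒ ev (n + 2)"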
Rule Induction
Showing that all even numbers have some property is more complicated. For
example, let us prove that the inductive definition of even numbers agrees
with the following recursive one:
fun evn :: "nat ⇒ bool" where
"evn 0 = True" |
"evn (Suc 0) = False" |
"evn (Suc(Suc n)) = evn n"
We prove ev m =⇒ evn m. That is, we assume ev m and by induction on
the form of its derivation prove evn m. There are two cases corresponding to
the two rules for ev :
Case ev0: ev m was derived by rule ev 0:
=⇒ m = 0 =⇒ evn m = evn 0 = True
Case evSS : ev m was derived by rule ev n =⇒ ev (n + 2):
=⇒ m = n + 2 and by induction hypothesis evn n
=⇒ evn m = evn(n + 2) = evn n = True
What we have just seen is a special case of rule induction. Rule induction
applies to propositions of this form
ev n =⇒ P n
That is, we want to prove a property P n for all even n. But if we assume
ev n, then there must be some derivation of this assumption using the two
defining rules for ev. That is, we must prove
Case ev0: P 0
Case evSS : [[ev n; P n]] =⇒ P (n + 2)
The corresponding rule is called ev.induct and looks like this:

ev n        P 0        ⋀n. [[ev n; P n]] =⇒ P (n + 2)
──────────────────────────────────────────────────────
                        P n
The first premise ev n enforces that this rule can only be applied in situations
where we know that n is even.
Note that in the induction step we may assume not only P n but also
ev n, which is simply the premise of rule evSS. Here is an example where the
local assumption ev n comes in handy: we prove ev m =⇒ ev (m − 2) by
induction on ev m. Case ev0 requires us to prove ev (0 − 2), which follows
from ev 0 because 0 − 2 = 0 on type nat. In case evSS we have m = n + 2
and may assume ev n, which implies ev (m − 2) because m − 2 = (n +
2) − 2 = n. We did not need the induction hypothesis at all for this proof
(it is just a case analysis of which rule was used) but having ev n at our
disposal in case evSS was essential. This case analysis of rules is also called
“rule inversion” and is discussed in more detail in Chapter 4.
In Isabelle
Let us now recast the above informal proofs in Isabelle. For a start, we use
Suc terms instead of numerals in rule evSS :
ev n =⇒ ev (Suc (Suc n))
This avoids the difficulty of unifying n+2 with some numeral, which is not
automatic.
The simplest way to prove ev (Suc (Suc (Suc (Suc 0)))) is in a forward
direction: evSS [OF evSS [OF ev0]] yields the theorem ev (Suc (Suc (Suc
(Suc 0)))). Alternatively, you can also prove it as a lemma in backwards
fashion. Although this is more verbose, it allows us to demonstrate how each
rule application changes the proof state:
lemma "ev (Suc(Suc(Suc(Suc 0))))"
apply(rule evSS )
apply(rule evSS )
1. ev 0
apply(rule ev0)
done
Rule induction is applied by giving the induction rule explicitly via the
rule: modifier:
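For our running example, the proof can be sketched as:

lemma "ev m =⇒ evn m"
apply(induction rule: ev.induct)
by(simp_all)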
In a proof of evn n =⇒ P n by computation induction via evn.induct, one is also
presented with the trivial negative cases. If you want the convenience of both
rewriting and rule induction, you can make two definitions and show their
equivalence (as above) or make one definition and prove additional properties
from it, for example rule induction from computation induction.
But many concepts do not admit a recursive definition at all because
there is no datatype for the recursion (for example, the transitive closure of
a relation), or the recursion would not terminate (for example, an interpreter
for a programming language). Even if there is a recursive definition, if we are
only interested in the positive information, the inductive definition may be
much simpler.
Exercises
Exercise 3.5. Define the following two grammars (where ε is the empty word)

S → ε | aSb | SS
T → ε | TaTb
as two inductive predicates. If you think of a and b as “(” and “)”, the
grammars define balanced strings of parentheses. Prove T w =⇒ S w and
S w =⇒ T w separately and conclude S w = T w.
4 Isar: A Language for Structured Proofs
A proof can either be an atomic by with a single proof method which must
finish off the statement being proved, for example auto, or it can be a proof–qed
block of multiple steps. Such a block can optionally begin with a proof method
that indicates how to start off the proof, e.g., (induction xs).
A step either assumes a proposition or states a proposition together with
its proof. The optional from clause indicates which facts are to be used in the
proof. Intermediate propositions are stated with have, the overall goal is stated
with show. A step can also introduce new local variables with fix. Logically,
fix introduces ⋀-quantified variables, assume introduces the assumption of an
implication (=⇒), and have/show introduce its conclusion.
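A sketch of the overall shape of such a proof–qed block:

proof
  assume "formula0"
  have "formula1" by simp
  ⋮
  show "formulan" by blast
qed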
meaningful names are hard to invent and are often not necessary. Both have
steps are obvious. The second one introduces the diagonal set {x. x ∉ f x},
the key idea in the proof. If you wonder why 2 directly implies False: from 2
it follows that (a ∉ f a) = (a ∈ f a).
Labels should be avoided. They interrupt the flow of the reader who has to
scan the context for the point where the label was introduced. Ideally, the
proof is a linear flow, where the output of one step becomes the input of
the next step, piping the previously proved fact into the next proof, like in
a UNIX pipe. In such cases the predefined name this can be used to refer to
the proposition proved in the previous step. This allows us to eliminate all
labels from our proof (we suppress the lemma statement):
proof
  assume "surj f"
  from this have "∃a. {x. x ∉ f x} = f a" by(auto simp: surj_def)
  from this show "False" by blast
qed
We have also taken the opportunity to compress the two have steps into one.
To compact the text further, Isar has a few convenient abbreviations:
then = from this
thus = then show
hence = then have
With the help of these abbreviations the proof becomes
proof
  assume "surj f"
  hence "∃a. {x. x ∉ f x} = f a" by(auto simp: surj_def)
  thus "False" by blast
qed
There are two further linguistic variations:
(have|show) prop using facts = from facts (have|show) prop
with facts = from facts this
The using idiom de-emphasizes the used facts by moving them behind the
proposition.
lemma
fixes f :: "′a ⇒ ′a set"
assumes s: "surj f"
shows "False"
The optional fixes part allows you to state the types of variables up front
rather than by decorating one of their occurrences in the formula with a type
constraint. The key advantage of the structured format is the assumes part that
allows you to name each assumption; multiple assumptions can be separated
by and. The shows part gives the goal. The actual theorem that will come out
of the proof is surj f =⇒ False, but during the proof the assumption surj f
is available under the name s like any other fact.
proof −
  have "∃a. {x. x ∉ f x} = f a" using s by(auto simp: surj_def)
  thus "False" by blast
qed
Note the hyphen after the proof command. It is the null method that does
nothing to the goal. Leaving it out would be asking Isabelle to try some suitable
introduction rule on the goal False — but there is no such rule and proof would fail.
In the have step the assumption surj f is now referenced by its name s. The
duplication of surj f in the above proofs (once in the statement of the lemma,
once in its proof) has been eliminated.
Stating a lemma with assumes-shows implicitly introduces the name assms
that stands for the list of all assumptions. You can refer to individual as-
sumptions by assms(1), assms(2), etc., thus obviating the need to name
them individually.
4.2 Proof Patterns

4.2.1 Logic
We start with two forms of case analysis: starting from a formula P we have
the two cases P and ¬ P, and starting from a fact P ∨ Q we have the two
cases P and Q:
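The two patterns can be sketched as follows:

show "R"
proof cases
  assume "P"
  ⋮
  show "R" ⟨proof⟩
next
  assume "¬ P"
  ⋮
  show "R" ⟨proof⟩
qed

have "P ∨ Q" ⟨proof⟩
then show "R"
proof
  assume "P"
  ⋮
  show "R" ⟨proof⟩
next
  assume "Q"
  ⋮
  show "R" ⟨proof⟩
qed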
If you want to go beyond merely using the above proof patterns and want
to understand what also and finally mean, read on. There is an Isar theorem
variable called calculation, similar to this. When the first also in a chain is
encountered, Isabelle sets calculation := this. In each subsequent also step,
Isabelle composes the theorems calculation and this (i.e. the two previous
(in)equalities) using some predefined set of rules including transitivity of =,
≤ and < but also mixed rules like [[ x ≤ y; y < z ]] =⇒ x < z. The result of
this composition is assigned to calculation. Consider
have "t 1 6 t 2 " hproof i
also have "... < t 3 " hproof i
also have "... = t 4 " hproof i
finally show "t 1 < t 4 " .
After the first also, calculation is "t1 ≤ t2", and after the second also,
calculation is "t1 < t3". The command finally is short for also from calculation.
Therefore the also hidden in finally sets calculation to t1 < t4 and the final
“.” succeeds.
For more information on this style of proof see [1].
4.3 Streamlining Proofs

4.3.1 Pattern Matching and Quotations

In the proof patterns shown above, formulas are often duplicated. This can
make the text harder to read, write and maintain. Pattern matching is an
abbreviation mechanism to avoid such duplication. Writing
show formula (is pattern)
matches the pattern against the formula, thus instantiating the unknowns in
the pattern for later use. As an example, consider the proof pattern for ←→:
show "formula 1 ←→ formula 2 " (is "?L ←→ ?R")
proof
assume "?L"
..
.
show "?R" hproof i
next
assume "?R"
..
.
show "?L" hproof i
qed
Instead of duplicating formula i in the text, we introduce the two abbrevia-
tions ?L and ?R by pattern matching. Pattern matching works wherever a
formula is stated, in particular with have and lemma.
The unknown ?thesis is implicitly matched against any goal stated by
lemma or show. Here is a typical example:
lemma "formula"
proof −
..
.
show ?thesis hproof i
qed
Unknowns can also be instantiated with let commands
let ?t = "some-big-term"
Later proof steps can refer to ?t :
have ". . . ?t . . . "
Names of facts are introduced with name: and refer to proved theorems. Un-
knowns ?X refer to terms or formulas.
4.3.2 moreover
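In sketch: instead of naming and collecting intermediate facts

have lab1: "P1" ⟨proof⟩
have lab2: "P2" ⟨proof⟩
⋮
have labn: "Pn" ⟨proof⟩
from lab1 lab2 . . . show "P" ⟨proof⟩

one can chain them with moreover and discharge them with ultimately:

have "P1" ⟨proof⟩
moreover have "P2" ⟨proof⟩
moreover
⋮
moreover have "Pn" ⟨proof⟩
ultimately show "P" ⟨proof⟩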
The moreover version is no shorter but expresses the structure a bit more
clearly and avoids new names.
Sometimes one would like to prove some lemma locally within a proof, a
lemma that shares the current context of assumptions but that has its own
assumptions and is generalized over its locally fixed variables at the end. This
is simply an extension of the basic have construct:
have B if name: A1 . . . Am for x1 . . . xn
⟨proof⟩
proves [[ A1; . . .; Am ]] =⇒ B where all xi have been replaced by unknowns
?xi. As an example we prove a simple fact about divisibility on integers. The
definition of dvd is (b dvd a) = (∃ k . a = b ∗ k ).
lemma fixes a b :: int assumes "b dvd (a+b)" shows "b dvd a"
proof −
  have "∃k′. a = b∗k′" if asm: "a+b = b∗k" for k
  proof
    show "a = b∗(k − 1)" using asm by(simp add: algebra_simps)
  qed
  then show ?thesis using assms by(auto simp add: dvd_def)
qed
Exercises
4.4 Case Analysis and Induction

Every function f defined with fun comes with a corresponding computation
induction rule called f.induct. Induction with this rule looks like in
Section 2.3.4, but now with proof instead of apply:
proof (induction x1 . . . xk rule: f.induct)
Just as for structural induction, this creates several cases, one for each defining
equation for f. By default (if the equations have not been named by the user),
the cases are numbered. That is, they are started by
case (i x y ...)
where i = 1,...,n, n is the number of equations defining f, and x y ... are the
variables in equation i. Note the following:
• Although i is an Isar name, i.IH (or similar) is not. You need double
quotes: "i.IH". When indexing the name, write "i.IH"(1), not "i.IH(1)".
• If defining equations for f overlap, fun instantiates them to make them
nonoverlapping. This means that one user-provided equation may lead to
several equations and thus to several cases in the induction rule. These
have names of the form "i_j ", where i is the number of the original equa-
tion and the system-generated j indicates the subcase.
In Isabelle/jEdit, the induction proof method displays a proof skeleton with
all cases. This is particularly useful for computation induction and the follow-
ing rule induction.
Recall the inductive and recursive definitions of even numbers in Section 3.5:
inductive ev :: "nat ⇒ bool" where
ev0: "ev 0" |
evSS : "ev n =⇒ ev (Suc(Suc n))"
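The proof in question can be sketched like this:

lemma "ev m =⇒ evn m"
proof(induction rule: ev.induct)
  case ev0
  show ?case by simp
next
  case evSS
  thus ?case by simp
qed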
The proof resembles structural induction, but the induction rule is given
explicitly and the names of the cases are the names of the rules in the inductive
definition. Let us examine the two assumptions named evSS : ev n is the
premise of rule evSS, which we may assume because we are in the case where
that rule was used; evn n is the induction hypothesis.
Because each case command introduces a list of assumptions named like the case
name, which is the name of a rule of the inductive definition, those rules now
need to be accessed with a qualified name, here ev .ev0 and ev .evSS.
In the case evSS of the proof above we have pretended that the system
fixes a variable n. But unless the user provides the name n, the system will
just invent its own name that cannot be referred to. In the above proof, we
do not need to refer to it, hence we do not give it a specific name. In case one
needs to refer to it one writes
case (evSS m)
like case (Suc n) in earlier structural inductions. The name m is an arbi-
trary choice. As a result, case evSS is derived from a renamed version of rule
evSS : ev m =⇒ ev (Suc(Suc m)). Here is an example with a (contrived)
intermediate step that refers to m:
lemma "ev n =⇒ evn n"
proof(induction rule: ev .induct )
case ev0 show ?case by simp
next
case (evSS m)
have "evn(Suc(Suc m)) = evn m" by simp
thus ?case using ‘evn m‘ by blast
qed
In any induction, case name sets up a list of assumptions also called name,
which is subdivided into three parts:
name.IH contains the induction hypotheses.
name.hyps contains all the other hypotheses of this case in the induction
rule. For rule inductions these are the hypotheses of rule name, for struc-
tural inductions these are empty.
name.prems contains the (suitably instantiated) premises of the statement
being proved, i.e., the Ai when proving [[ A1 ; . . .; An ]] =⇒ A.
Proof method induct differs from induction only in this naming policy: induct
does not distinguish IH from hyps but subsumes IH under hyps.
More complicated inductive proofs than the ones we have seen so far often
need to refer to specific assumptions — just name or even name.prems and
name.IH can be too unspecific. This is where the indexing of fact lists comes
in handy, e.g., name.IH (2) or name.prems(1−2).
Rule inversion is case analysis of which rule could have been used to de-
rive some fact. The name rule inversion emphasizes that we are reasoning
backwards: by which rules could some given fact have been proved? For the
inductive definition of ev, rule inversion can be summarized like this:
ev n =⇒ n = 0 ∨ (∃ k . n = Suc (Suc k ) ∧ ev k )
The realisation in Isabelle is a case analysis. A simple example is the proof
that ev n =⇒ ev (n − 2). We already went through the details informally
in Section 3.5.1. This is the Isar proof:
assume "ev n"
from this have "ev (n − 2)"
proof cases
case ev0 thus "ev (n − 2)" by (simp add: ev .ev0)
next
case (evSS k ) thus "ev (n − 2)" by (simp add: ev .evSS )
qed
The key point here is that a case analysis over some inductively defined pred-
icate is triggered by piping the given fact (here: from this) into a proof by
cases. Let us examine the assumptions available in each case. In case ev0 we
have n = 0 and in case evSS we have n = Suc (Suc k ) and ev k. In each
case the assumptions are available under the name of the case; there is no
fine-grained naming schema like there is for induction.
Sometimes some rules could not have been used to derive the given fact
because constructors clash. As an extreme example consider rule inversion
applied to ev (Suc 0): neither rule ev0 nor rule evSS can yield ev (Suc 0)
because Suc 0 unifies neither with 0 nor with Suc (Suc n). Impossible cases
do not have to be proved. Hence we can prove anything from ev (Suc 0):
assume "ev (Suc 0)" then have P by cases
That is, ev (Suc 0) is simply not provable:
lemma "¬ ev (Suc 0)"
proof
assume "ev (Suc 0)" then show False by cases
qed
Normally not all cases will be impossible. As a simple exercise, prove that
¬ ev (Suc (Suc (Suc 0))).
This advanced form of induction does not support the IH naming schema ex-
plained in Section 4.4.5: the induction hypotheses are instead found under the
name hyps, as they are for the simpler induct method.
Exercises
Exercise 4.4. Give a structured proof of ¬ ev (Suc (Suc (Suc 0))) by rule
inversions. If there are no cases to be proved you can close a proof immediately
with qed.
Exercise 4.5. Recall predicate star from Section 3.5.2 and iter from Exer-
cise 3.4. Prove iter r n x y =⇒ star r x y in a structured style; do not just
sledgehammer each case of the required induction.
Exercise 4.6. Define a recursive function elems :: ′a list ⇒ ′a set and prove
x ∈ elems xs =⇒ ∃ys zs. xs = ys @ x # zs ∧ x ∉ elems ys.
Exercise 4.7. Extend Exercise 3.5 with a function that checks if some
alpha list is a balanced string of parentheses. More precisely, define a recursive
function balanced :: nat ⇒ alpha list ⇒ bool such that balanced n w is
true iff (informally) S (a^n @ w). Formally, prove that balanced n w = S
(replicate n a @ w) where replicate :: nat ⇒ ′a ⇒ ′a list is predefined and
replicate n x yields the list [x, . . ., x] of length n.
References