HOL Theorem-Proving
This volume contains a tutorial on the HOL system. It is one of four documents making
up the documentation for HOL:
(i) LOGIC: a formal description of the higher order logic implemented by the HOL
system;
(ii) TUTORIAL: a self-study introduction to the structure and use of the system (this volume);
(iii) DESCRIPTION: a systematic description of the HOL system and the principles underlying it;
(iv) REFERENCE: reference documentation for the tools and theories supplied with HOL.
These four documents will be referred to by the short names (in small slanted capitals)
given above.
This document, TUTORIAL, is intended to be the first item read by new users of HOL. It
provides a self-study introduction to the structure and use of the system. The tutorial is
intended to give a ‘hands-on’ feel for the way HOL is used, but it does not systematically
explain all the underlying principles (DESCRIPTION and LOGIC explain these). After
working through TUTORIAL the reader should be capable of using HOL for simple tasks,
and should also be in a position to consult the other documents.
Getting started
Chapter 1 explains how to get and install HOL. Once this is done, the potential HOL user
should become familiar with the following subjects:
2. The formal logic supported by the HOL system (higher order logic) and its
manipulation via ML.
demonstrating both the logic and proving properties of one’s definitions with some
high-level tactics.
Chapter 5 features another worked example: the specification and verification of a
simple sequential parity checker. The intention is to accomplish two things: (i) to present
another complete piece of work with HOL; and (ii) to give an idea of what it is like to
use the HOL system for a tricky proof. Chapter 6 is a more extensive example: the proof
of confluence for combinatory logic. Again, the aim is to present a complete piece of
non-trivial work.
Chapter 7 gives an example of implementing a proof tool of one’s own. This
demonstrates the programmability of HOL: the way in which technology for solving specific
problems can be implemented on top of the underlying kernel. With high-powered tools
to draw on, it is possible to write prototypes very quickly.
Chapter 8 briefly discusses some of the examples distributed with HOL in the examples
directory.
TUTORIAL has been kept short so that new users of HOL can get going as fast as possible.
Sometimes details have been simplified. It is recommended that as soon as a topic in
TUTORIAL has been digested, the relevant parts of DESCRIPTION and REFERENCE be studied.
Acknowledgements
Current edition
The current edition of all four volumes (LOGIC, TUTORIAL, DESCRIPTION and REFERENCE)
has been prepared by Michael Norrish and Konrad Slind. Further contributions to these
volumes came from: Hasan Amjad, who developed a model checking library and wrote
sections describing its use; Jens Brandt, who developed and documented a library for
the rational numbers; Anthony Fox, who formalized and documented new word theories
and the associated libraries; Mike Gordon, who documented the libraries for BDDs and
SAT; Peter Homeier, who implemented and documented the quotient library; Joe Hurd,
who added material on first order proof search; and Tjark Weber, who wrote libraries for
Satisfiability Modulo Theories (SMT) and Quantified Boolean Formulae (QBF).
The material in the third edition constitutes a thorough re-working and extension of
previous editions. The only essentially unaltered piece is the semantics by Andy Pitts
(in LOGIC), reflecting the fact that, although the HOL system has undergone continual
development and improvement, the HOL logic is unchanged since the first edition (1988).
Second edition
The second edition of REFERENCE was a joint effort by the Cambridge HOL group.
First edition
The three volumes TUTORIAL, DESCRIPTION and REFERENCE were produced at the
Cambridge Research Center of SRI International with the support of DSTO Australia.
The HOL documentation project was managed by Mike Gordon, who also wrote parts
of DESCRIPTION and TUTORIAL using material based on an early paper describing the
HOL system1 and The ML Handbook.2 Other contributors to DESCRIPTION include Avra
Cohn, who contributed material on theorems, rules, conversions and tactics, and also
composed the index (which was typeset by Juanito Camilleri); Tom Melham, who wrote
the sections describing type definitions, the concrete type package and the ‘resolution’
tactics; and Andy Pitts, who devised the set-theoretic semantics of the HOL logic and
wrote the material describing it.
The original document design used LaTeX macros supplied by Elsa Gunter, Tom Melham
and Larry Paulson. The typesetting of all three volumes was managed by Tom Melham.
The cover design is by Arnold Smith, who used a photograph of a ‘snow watching lantern’
taken by Avra Cohn (in whose garden the original object resides). John Van Tassel
composed the LaTeX picture of the lantern.
Many people other than those listed above have contributed to the HOL documentation
effort, either by providing material, or by sending lists of errors in the first edition.
Thanks to everyone who helped, and thanks to DSTO and SRI for their generous support.
1 M.J.C. Gordon, ‘HOL: a Proof Generating System for Higher Order Logic’, in: VLSI Specification,
Verification and Synthesis, edited by G. Birtwistle and P.A. Subrahmanyam, (Kluwer Academic Publishers,
1988), pp. 73–128.
2 The ML Handbook, unpublished report from Inria by Guy Cousineau, Mike Gordon, Gérard Huet,
Getting and Installing HOL
This chapter describes how to get the HOL system and how to install it. It is generally
assumed that some sort of Unix system is being used, but the instructions that follow
should apply mutatis mutandis to other platforms. Unix is not a pre-requisite for using the
system. HOL may be run on PCs running Windows operating systems from Windows NT
onwards (i.e., Windows 2000, XP and Vista are also supported), as well as Macintoshes
running MacOS X.
1.1 Getting HOL
The HOL system can be downloaded from http://hol-theorem-prover.org. The naming
scheme for HOL releases is ⟨name⟩-⟨number⟩; the release described here is
Kananaskis-12.
1.3 Installing HOL
It is assumed that the HOL sources have been obtained and the tar file unpacked into a
directory hol.1 The contents of this directory are likely to change over time, but it should
contain the following:
The session in the box below shows a typical distribution directory. The HOL
distribution has been placed on a PC running Linux in the directory /home/mn200/hol/.
All sessions in this documentation will be displayed in boxes with a number in the top
right hand corner. This number indicates whether the session is a new one (when the
number will be 1) or the continuation of a session started in an earlier box. Consecutively
numbered boxes are assumed to be part of a single continuous session. The Unix
prompt for the sessions is $, so lines beginning with this prompt were typed by the
user. After entering the HOL system (see below), the user is prompted with - for an
expression or command of the HOL meta-language ML; lines beginning with this are
thus ML expressions or declarations. Lines not beginning with $ or - are system output.
Occasionally, system output will be replaced with a line containing ... when it is of
minimal interest. The meta-language ML is introduced in Chapter 2.
$ pwd 1
/home/mn200/hol
$ ls -F
CONTRIBUTORS README doc/ sigobj/ tools/
COPYRIGHT bin/ examples/ src/ tools-poly/
INSTALL developers/ help/ std.prelude
libpolyml.so and libpolymain.so. If these files are in /usr/lib, nothing will need to
be changed, but other locations may require further system configuration. A sample
LD_LIBRARY_PATH initialisation command (in a file such as .bashrc) might be
declare -x LD_LIBRARY_PATH=/usr/local/lib:$HOME/lib
instead.
Assuming you don’t interrupt the configuration process, this will build the Holmake and
build programs, and move them into the hol/bin directory. If something goes wrong at
this stage, consult Section 1.3.1 below.
The next step is to run the build program. This should result in a great deal of output
as all of the system code is compiled and the theories built. Eventually, a HOL system4 is
produced in the bin/ directory.
$ bin/build 3
...
...
Uploading files to /home/mn200/hol/sigobj
4 Four HOL executables are produced: hol, hol.noquote, hol.bare and hol.bare.noquote. The first of these
will be used for most examples in the TUTORIAL.
At this point, the system is built in your HOL directory, and cannot easily be moved to
other locations. In other words, you should unpack HOL in the location/directory where
you wish to access it for all your future work.
The config-override file need only provide values for those variables that need
overriding.
With this file in place, the smart-configure program will use the values specified
there rather than those it attempts to calculate itself. The value given for the OS variable
must be one of "unix", "linux", "solaris", "macosx" or "winNT".5
In extreme circumstances it is possible to edit the file tools/configure.sml yourself
to set configuration variables directly. (If you are using Poly/ML, you must edit
tools-poly/configure.sml instead.) At the top of this file various incomplete SML
declarations are present, but commented out. You will need to uncomment this
section (remove the (* and *) markers), and provide sensible values. All strings must be
enclosed in double quotes.
The holdir value must be the name of the top-level directory listed in the first session
above. The OS value should be one of the strings specified in the accompanying comment.
When working with Poly/ML, the poly string must be the path to the poly executable
that begins an interactive ML session. The polymllibdir must be a path to a directory
that contains the file libpolymain.a. When working with Moscow ML, the mosmldir
value must be the name of the directory containing the Moscow ML binaries (mosmlc,
mosml, mosmllex etc).
Subsequent values (CC and GNUMAKE) are needed for “optional” components of the
system. The first gives a string suitable for invoking the system’s C compiler, and the
second specifies a make program.
5 The string "winNT" is used for Microsoft Windows operating systems that are at least as recent as
Windows NT. This includes Windows 2000, XP, Vista, Windows 10 etc. Do not use "winNT" when using
Poly/ML via Cygwin or the Linux sub-system.
After editing tools/configure.sml, the lines above will look something like:
$ more configure.sml 5
...
val mosmldir = "/home/mn200/mosml";
val holdir = "/home/mn200/hol";
val OS = "linux" (* Operating system; choices are:
"linux", "solaris", "unix", "winNT" *)
Now, at either this level (in the tools or tools-poly directory) or at the level above, the
script configure.sml must be piped into the ML interpreter (i.e., mosml or poly). For
example,
Introduction to ML
This chapter is a brief introduction to the meta-language ML. The aim is just to give a
feel for what it is like to interact with the language. A more detailed introduction can be
found in numerous textbooks and web-pages; see for example the list of resources on
the MoscowML home-page1 , or the comp.lang.ml FAQ.2
(i) An editor window into which ML commands are initially typed and recorded.
(ii) A shell window (or non-Unix equivalent) which is used to evaluate the
commands.
A common way to achieve this is to work inside Emacs with a text window and a shell
window.
After typing a command into the edit (text) window it can be transferred to the shell
and evaluated in HOL by ‘cut-and-paste’. In Emacs this is done by copying the text into a
buffer and then ‘yanking’ it into the shell. The advantage of working via an editor is that
if the command has an error, then the text can simply be edited and used again; it also
records the commands in a file which can then be used again (via a batch load) later.
In Emacs, the shell window also records the session, including both input from the user
and the system’s response. The sessions in this tutorial were produced this way. These
sessions are split into segments displayed in boxes with a number in their top right hand
corner (to indicate their position in the complete session).
The interactions in these boxes should be understood as occurring in sequence. For
example, variable bindings made in earlier boxes are assumed to persist to later ones. To
1 http://mosml.org
2 http://www.faqs.org/faqs/meta-lang-faq/
enter the HOL system, one types hol at the command-line, possibly preceded by path
information if the HOL system’s bin directory is not in one’s path. The HOL system then
prints a sign-on message and puts one into ML. The ML prompt varies depending on
the implementation. In Poly/ML, the implementation assumed for our sessions here,
the prompt is >, so lines beginning with > are typed by the user, and other lines are the
system’s responses.
$ bin/hol 1
---------------------------------------------------------------------
HOL-4 [Kananaskis 12 (stdknl, built Mon Jun 18 16:18:42 2018)]
> val l = 1 :: [2,3,4,5];
val l = [1, 2, 3, 4, 5]: int list
> tl l;
val it = [2, 3, 4, 5]: int list
> hd it;
val it = 2: int
ML expressions like "a", "b", "foo" etc. are strings and have type string. Any sequence
of ASCII characters can be written between the quotes.3 The function explode splits a
string into a list of single characters, which are written like single character strings, with
a # character prepended.
> explode "a b c"; 4
val it = [#"a", #" ", #"b", #" ", #"c"]: char list
An expression of the form (𝑒1 ,𝑒2 ) evaluates to a pair of the values of 𝑒1 and 𝑒2 . If 𝑒1 has
type 𝜎1 and 𝑒2 has type 𝜎2 then (𝑒1 ,𝑒2 ) has type 𝜎1 *𝜎2 . The first and second components
of a pair can be extracted with the ML functions #1 and #2 respectively. If a tuple has
more than two components, its 𝑛-th component can be extracted with a function #𝑛.
The values (1,2,3), (1,(2,3)) and ((1,2), 3) are all distinct and have types
int * int * int, int * (int * int) and (int * int) * int respectively.
> val triple1 = (1,true,"abc"); 5
val triple1 = (1, true, "abc"): int * bool * string
> #2 triple1;
val it = true: bool
> val triple2 = (1, (true, "abc"));
val triple2 = (1, (true, "abc")): int * (bool * string)
> #2 triple2;
val it = (true, "abc"): bool * string
The ML expressions true and false denote the two truth values of type bool.
ML types can contain the type variables ’a, ’b, ’c, etc. Such types are called polymorphic.
A function with a polymorphic type should be thought of as possessing all the types
obtainable by replacing type variables by types. This is illustrated below with the function
zip.
Functions are defined with declarations of the form fun 𝑓 𝑣1 … 𝑣𝑛 = 𝑒 where each 𝑣𝑖
is either a variable or a pattern built out of variables.
The function zip, below, converts a pair of lists ([𝑥1 ,…,𝑥𝑛 ], [𝑦1 ,…,𝑦𝑛 ]) to a list of
pairs [(𝑥1 ,𝑦1 ),…,(𝑥𝑛 ,𝑦𝑛 )].
3 Newlines must be written as ∖n, and quotes as ∖".
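The ML definition of zip does not survive in this extract; a minimal sketch consistent with
the session below (and with the HOL characterisation discussed in the next chapter) is:
fun zip (l1, l2) = if null l1 orelse null l2 then []
                   else (hd l1, hd l2) :: zip (tl l1, tl l2);
Its inferred type is ('a list * 'b list) -> ('a * 'b) list, an instance of the polymorphic types described above.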
> zip([1,2,3],["a","b","c"]);
val it = [(1, "a"), (2, "b"), (3, "c")]: (int * string) list
Functions may be curried, i.e. take their arguments ‘one at a time’ instead of as a tuple.
This is illustrated with the function curried_zip below:
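The definition itself is not reproduced here; a plausible sketch, together with a use of it, is:
fun curried_zip l1 l2 = zip (l1, l2);
curried_zip [1,2,3] ["a","b","c"];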
The evaluation of an expression either succeeds or fails. In the former case, the
evaluation returns a value; in the latter case the evaluation is aborted and an exception is
raised. This exception is passed to whatever invoked the evaluation. This context can either
propagate the failure (this is the default) or it can trap it. These two possibilities are
illustrated below. An exception trap is an expression of the form 𝑒1 handle _ => 𝑒2 . An
expression of this form is evaluated by first evaluating 𝑒1 . If the evaluation succeeds (i.e.
doesn’t fail) then the value of the whole expression is the value of 𝑒1 . If the evaluation of
𝑒1 raises an exception, then the value of the whole is obtained by evaluating 𝑒2 .4
> 3 div 0; 8
Exception- Div raised
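The box illustrating the trapped case is missing from this extract; an expression of the form
described above would behave roughly as follows (expected output, not a captured session):
> 3 div 0 handle _ => 0;
val it = 0: int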
The sessions above are enough to give a feel for ML. In the next chapter, the syntax of
the logic supported by the HOL system (higher order logic) will be introduced.
4 This description of exception handling is actually a gross simplification of the way exceptions can be
handled in ML; consult a proper text for a better explanation.
Chapter 3
Writing HOL Terms and Types
which is a much more accurate picture of what the term really looks like in memory.
1 Note that the user cannot write theorem values directly; this would break the prover’s guarantee of
soundness!
Table 3.1: Unicode/ASCII equivalents in HOL syntax. Delimiters are the quotation marks
that delimit whole terms or types, separating them from the ML level.
It is possible to turn Unicode printing off and on by setting the PP.avoid_unicode trace:
> set_trace "PP.avoid_unicode" 1; 2
val it = (): unit
> ‘‘x ∈ A’’;
<<HOL message: inventing new type variable names: ’a>>
val it = ‘‘x IN A‘‘: term
HOL looks like ML
One interesting (and also confusing for beginners) aspect of HOL is that its terms and
types look like ML’s. For example, the zip function in ML (from the previous chapter)
might be characterised by the HOL term that can be written:
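The term itself is missing from this extract; judging from the remarks that follow about
NULL and ∨, it is presumably something like this sketch:
``zip (l1,l2) = if NULL l1 \/ NULL l2 then []
               else (HD l1, HD l2) :: zip (TL l1, TL l2)``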
Apart from the fact that some of the relevant constants have different names (NULL vs
null for example), and apart from the use of logical disjunction (∨) instead of orelse,
the text is identical.
The following session shows the (rather involved) way in which this definition can be
made,2 allowing us to see the way the definition theorem is printed back. We can also
ask the system to print the new constant’s type:
Note how the pretty-printer is at liberty to make adjustments to the way the underlying
term is rendered as a string: its placement of newline and space characters is not exactly
the same as the user’s.
HOL’s language of types is also similar but slightly different to ML’s: the # symbol is
used for the pair type rather than *, and the printer uses Greek letters 𝛼 and 𝛽 rather
than ’a and ’b.
HOL vs ML Traps
Lists, sets and other types with syntax for enumerating elements use a semicolon rather
than a comma to separate elements. Thus
ML has three distinct types 𝜏1 *𝜏2 *𝜏3 , (𝜏1 *𝜏2 )*𝜏3 and 𝜏1 *(𝜏2 *𝜏3 ). One might see these as
a flat triple, and two flavours of pair with a nested pair as one or other component. In
2 The usual “HOL” way to define this function, with pattern-matching, wouldn’t be so complicated.
HOL, the concrete syntax 𝜏1 #𝜏2 #𝜏3 maps to 𝜏1 #(𝜏2 #𝜏3 ) (i.e., the infix # type operator is
right-associative).
ML uses the op keyword to remove infix status from function forms. In HOL one can
either “wrap” the operator in parentheses3 or precede it with a $-sign. Further, infixes in
ML take pairs; in HOL they are curried:
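For instance (a sketch, not taken from the original text):
op + (1, 2);    (* ML: + stripped of infix status, applied to a pair *)
``$+ 1 2``      (* HOL: the same operation, curried, written with $ *)
``(+) 1 2``     (* ... or wrapped in parentheses *)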
In HOL:
> Datatype ‘tree = Lf | Nd tree 𝛼 tree’; 7
<<HOL message: Defined type: "tree">>
val it = (): unit
ML uses ~ as the unary negation operator on numeric types. HOL allows it in this role
(as well as for boolean negation), but also allows - for numeric negation. First the ML
behaviour:
3 But watch out for the * operator; one can’t wrap this in parentheses because the result then looks like
comment syntax.
> ~3; 8
val it = ~3: int
> -3;
Exception- (-) has infix status but was not preceded by op.
Type error in function application.
Function: - : int * int -> int
Argument: 3 : int
Reason: Can’t unify int to int * int (Incompatible types)
Fail "Static Errors" raised
In HOL:
> load "intLib"; ...output elided... 9
> EVAL ‘‘~3 + 4’’;
val it = ⊢ -3 + 4 = 1: thm
> EVAL ‘‘-3 * 4’’;
val it = ⊢ -3 * 4 = -12: thm
Chapter 4
Example: Euclid’s Theorem
In this chapter, we prove in HOL that for every number, there is a prime number that
is larger, i.e., that the prime numbers form an infinite sequence. This proof has been
excerpted and adapted from a much larger example due to John Harrison, in which he
proved the 𝑛 = 4 case of Fermat’s Last Theorem. The proof development is intended to
serve as an introduction to performing high-level interactive proofs in HOL.1 Many of the
details may be difficult to grasp for the novice reader; nonetheless, it is recommended
that the example be followed through in order to gain a true taste of using HOL to prove
non-trivial theorems.
Some tutorial descriptions of proof systems show the system performing amazing
feats of automated theorem proving. In this example, we have not taken this approach;
instead, we try to show how one actually goes about the business of proving theorems
in H OL: when more than one way to prove something is possible, we will consider the
choices; when a difficulty arises, we will attempt to explain how to fight one’s way clear.
One ‘drives’ HOL by interacting with the ML top-level loop, perhaps mediated via an
editor such as emacs or vim. In this interaction style, ML function calls are made to bring
in already-established logical context, e.g., via load; to define new concepts, e.g., via
Datatype, Define, and Hol_reln; and to perform proofs using the goalstack interface,
and the proof tools from bossLib (or if they fail to do the job, from lower-level libraries).
Let’s get started. First, we start the system with the command <holdir>/bin/hol. We
then “open” the arithmetic theory; this means that all of the ML bindings from the HOL
theory of arithmetic are made available at the top level.
We now begin the formalization. In order to define the concept of prime number, we
first need to define the divisibility relation:
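The session box with the definition does not survive in this extract; the input was
presumably along these lines (a sketch):
val divides_def = Define `divides a b = ?x. b = a * x`;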
Note how we are using ASCII notation to input our terms (? is the ASCII way to write
the existential quantifier), but the system responds with pleasant Unicode. Unicode
1 The proofs discussed below may be found in examples/euclid.sml of the HOL distribution.
characters can also be used in the input. Also note how equality on booleans gets printed
as the if-and-only-if arrow, while equality on natural numbers stays as an equality. The
underlying constant is the same (equality) (as is implied by the fact that one can use = in
both places in the input), but the system tries to be helpful when printing.
The definition is added to the current theory with the name divides_def, and also
returned from the invocation of Define. We take advantage of this and make an ML
binding of the name divides_def to the definition. In the usual way of interacting
with HOL, such an ML binding is made for each definition and (useful) proved theorem:
the ML environment is thus being used as a convenient place to hold definitions and
theorems for later reference in the session.
We want to treat divides as a (non-associating) infix:
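The corresponding input is missing here; section 4.5 shows the fixity used in the final
script, so interactively one would type something like:
set_fixity "divides" (Infixr 450);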
Next we define the property of a number being prime: a number 𝑝 is prime if and only if
it is not equal to 1 and it has no divisors other than 1 and itself:
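The definition box is not reproduced in this extract; a sketch consistent with the ASCII
syntax described just below is:
val prime_def = Define `prime p <=> p <> 1 /\ !x. x divides p ==> (x=1) \/ (x=p)`;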
There is more ASCII syntax to observe here: <> for not-equals, and ! for the universal
quantifier.
That concludes the definitions to be made. Now we “just” have to prove that there are
infinitely many prime numbers. If we were coming to this problem fresh, then we would
have to go through a not-well-understood and often tremendously difficult process of
finding the right lemmas required to prove our target theorem.2 Fortunately, we are
working from an already completed proof and can devote ourselves to the far simpler
problem of explaining how to prove the required theorems.
Proof tools The development will illustrate that there is often more than one way
to tackle a HOL proof, even if one has only a single (informal) proof in mind. In this
example, we often find proofs by using the rewriter rw to unwind definitions and perform
basic simplifications, often reducing a goal to its essence.
> rw; 5
val it = fn: thm list -> tactic
When rw is applied to a list of theorems, the theorems will be added to HOL’s built-in
database of useful facts as supplementary rewrite rules. We will see that rw is also
somewhat knowledgeable about arithmetic.3 Sometimes simplification with rw proves
the goal immediately. Often however, we are left with a goal that requires some study
before one realizes what lemmas are needed to conclude the proof. Once these lemmas
have been proven, or located in ancestor theories, metis_tac4 can be invoked with
them, with the expectation that it will find the right instantiations needed to finish the
proof. Note that these two operations, simplification and resolution-style automatic
proof search, will not suffice to perform all the proofs in this example; in particular, our
development will also need case analysis and induction.
Finding theorems This raises the following question: how does one find the right
lemmas and rewrite rules to use? This is quite a problem, especially since the number of
ancestor theories, and the theorems in them, is large. There are several possibilities:
• The help system can be used to look up definitions and theorems, as well as proof
procedures; for example, an invocation of
help "arithmeticTheory"
will display all the definitions and theorems that have been stored in the theory of
arithmetic. However, the complete name of the item being searched for must be
known before the help system is useful, so the following two search facilities are
often more useful.
• DB.match allows the use of patterns to locate the sought-for theorem. Any stored
theorem having an instance of the pattern as a subterm will be returned.
• DB.find will use fragments of names as keys with which to look up information.
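For instance, a hypothetical query using a name fragment (not taken from the original text) is:
DB.find "FACT";
which returns the stored theorems, definitions and axioms whose names contain the given fragment.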
4.1 Divisibility
We start by proving a number of theorems about the divides relation. We will see
that each of these initial theorems can be proved with a single invocation of metis_tac.
Both rw and metis_tac are quite powerful reasoners, and the choice of a reasoner in a
particular situation is a matter of experience. The major reason that metis_tac works so
well is that divides is defined by means of an existential quantifier, and metis_tac is
quite good at automatically instantiating existentials in the course of proof. For a simple
example, consider proving ∀𝑥. 𝑥 𝚍𝚒𝚟𝚒𝚍𝚎𝚜 0. A new proposition to be proved is entered
to the proof manager via “g”, which starts a fresh goalstack:
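The input line is not reproduced here; it would be of roughly this form:
g `!x. x divides 0`;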
∀x. x divides 0
The proof manager tells us that it has only one proof to manage, and echoes the given
goal. Now we expand the definition of divides. Notice that 𝛼-conversion takes place in
order to keep distinct the 𝑥 of the goal and the 𝑥 in the definition of divides:
> e (rw[divides_def]); 7
OK..
<<HOL message: Initialising SRW simpset ... done>>
1 subgoal:
val it =
∃x’. (x = 0) ∨ (x’ = 0)
(x = 0) ∨ (0 = 0)
> e (rw[]); 9
OK.. ...output elided...
Goal proved.
⊢ ∃x’. (x = 0) ∨ (x’ = 0)
val it =
Initial goal proved.
⊢ ∀x. x divides 0: proof
What just happened here? The application of rw to the goal decomposed it to an empty
list of subgoals; in other words the goal was proved by rw. Once a goal has been proved,
it is popped off the goalstack, prettyprinted to the output, and the theorem becomes
available for use by the level of the stack. When all the sub-goals required by that level
are proven, the corresponding goal at that level can be proven too. This ‘unwinding’
process continues until the stack is empty, or until it hits a goal with more than one
remaining unproved subgoal. This process may be hard to visualize,5 but that doesn’t
matter, since the goalstack was expressly written to allow the user to ignore such details.
We can sequence tactics with the >> operator (also known as THEN). If our three
interactions above are joined together with >> to form a single tactic, we can try the
proof again from the beginning (using the restart function) and this time it will take
just one step:
> restart(); ...output elided... 10
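The elided step is presumably a single combined tactic; given the intermediate goals shown
above, a plausible sketch is:
e (rw[divides_def] >> qexists_tac `0` >> rw[]);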
We have seen one way to prove the theorem. However, as mentioned earlier, there is
another: one can let metis_tac expand the definition of divides and find the required
instantiation for x’ from the theorem MULT_CLAUSES.6
> restart(); ...output elided... 11
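The elided invocation would have been something like:
e (metis_tac [divides_def, MULT_CLAUSES]);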
As it runs, metis_tac prints out some possibly interesting diagnostics. In any case,
having done our proof inside the goalstack package, we now want to have access to the
5 Perhaps since we have used a stack to implement what is notionally a tree!
6 You might like to try typing MULT_CLAUSES into the interactive loop to see exactly what it states.
theorem value that we have proved. We use the top_thm function to do this, and then
use drop to dispose of the stack:
> val DIVIDES_0 = top_thm(); 12
val DIVIDES_0 = ⊢ ∀x. x divides 0: thm
> drop();
OK..
val it = There are currently no proofs.: proofs
We have used metis_tac in this way to prove the following collection of theorems
about divides. As mentioned previously, the theorems supplied to metis_tac in the
following proofs did not (usually) come from thin air: in most cases some exploratory
work with the simplifier (rw) was done to open up definitions and see what lemmas
would be required by metis_tac.
> e (rw[divides_def]);
OK..
1 subgoal:
val it =
m ≤ m * x ∨ (m * x = 0)
This goal is a disappointing one to have the simplifier produce. Both disjuncts look as
if they should simplify further: the first looks as if we should be able to divide through
by m on both sides of the inequality, and the second looks like something we could attack
with the knowledge that one of two factors must be zero if a multiplication equals zero.
The relevant theorems justifying such steps have already been proved in arithmeticTheory;
something we can confirm with the generally useful DB.match function
DB.match : string list -> term
-> ((string * string) * (thm * class)) list
This function takes a list of theory names, and a pattern, and looks in the list of theories
for any theorem, definition, or axiom that has an instance of the pattern as a subterm.
If the list of theory names is empty, then all loaded theories are included in the search.
Let’s look in the theory of arithmetic for the subterm to be rewritten.
> DB.match ["arithmetic"] ‘‘m <= m * x‘‘; 14
val it =
[(("arithmetic", "LE_MULT_CANCEL_LBARE"),
(⊢ (m ≤ m * n ⟺ (m = 0) ∨ 0 < n) ∧ (m ≤ n * m ⟺ (m = 0) ∨ 0 < n), Thm))]:
DB.data list
This is just the theorem we’d like to use. Using DB.match again, you should now try
to find the theorem that will simplify the other disjunct. Because both are so generally
useful, rw already has both rewrites in its internal database, and all we need to do is
rewrite once more to get those rewrites applied:
> e (rw[]); 15
OK..
Goal proved.
⊢ m ≤ m * x ∨ (m * x = 0)
val it =
Initial goal proved.
⊢ ∀m n. m divides n ⇒ m ≤ n ∨ (n = 0): proof
That was gratifyingly easy! The process of finding the proof has now finished, and
all that remains is for the proof to be packaged up into the single tactic we saw above.
Rather than use top_thm and the goalstack, we can bypass it and use the store_thm
function. This function takes a string, a term and a tactic and applies the tactic to the
term to get a theorem, and then stores the theorem in the current theory under the given
name.
> val DIVIDES_LE = store_thm( 16
"DIVIDES_LE",
‘‘!m n. m divides n ==> m <= n \/ (n = 0)‘‘,
rw[divides_def] >> rw[]);
val DIVIDES_LE = ⊢ ∀m n. m divides n ⇒ m ≤ n ∨ (n = 0): thm
Storing theorems in our script record of the session in this style (rather than with the
goalstack) results in a more concise script, and also makes it easier to turn our script into
a theory file, as we do in section 4.5.
(FACT) (FACT 0 = 1) /\
(!n. FACT (SUC n) = SUC n * FACT n)
A polished proof of DIVIDES_FACT is the following7 :
which ought to be easy; in the inductive case, the inductive hypothesis seems like it
should give us what we need. This strategy for the inductive case is a bit vague, because
we are trying to mentally picture a slightly complicated formula, but we can rely on the
system to accurately calculate the cases of the induction for us. If the inductive case
turns out to be not what we expect, we will have to re-think our approach.
> g ‘!m n. 0 < m /\ m <= n ==> m divides (FACT n)‘; 17
val it =
Proof manager status: 2 proofs.
2. Completed goalstack: ⊢ ∀m n. m divides n ⇒ m ≤ n ∨ (n = 0)
1. Incomplete goalstack:
Initial goal:
We now have two sub-goals to prove: a base case and a step case. The first goal the
system expects us to prove is the lowest one printed (it’s closest to the cursor), the
base-case. This can obviously be simplified:
> e (rw[]); 21
OK..
1 subgoal:
val it =
m divides FACT m
------------------------------------
0 < m
0 divides FACT 0
------------------------------------
0 < 0
Here the first sub-goal has an assumption that is false. We can demonstrate
this to the system by using the DECIDE function to prove a simple fact about arithmetic
(namely, that no number 𝑥 is less than itself), and then passing the resulting theorem to
METIS_TAC, which can combine this with the contradictory assumption.
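The corresponding box is missing; an invocation of roughly this shape is meant:
e (METIS_TAC [DECIDE ``~(x < x)``]);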
Goal proved.
[.] ⊢ 0 divides FACT 0
Remaining subgoals:
val it =
Alternatively, we could trust that HOL’s existing theories somewhere include the fact
that less-than is irreflexive, find that theorem using DB.match (using the pattern x < x),
and then quote that theorem-name to metis_tac.
Another alternative would be to apply the simplifier directly to the sub-goal’s
assumptions. Certainly, the simplifier has already been primed with the irreflexivity of less-than,
so this seems natural. This can be done with the fs tactic:
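A minimal sketch of such an invocation:
e (fs []);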
Goal proved.
[.] ⊢ 0 divides FACT 0
Remaining subgoals:
val it =
Using the theorems identified above, the remaining sub-goal can be proved with the
simplifier rw.
Goal proved.
⊢ ∀m. 0 < m ⇒ m divides FACT (m + 0)
Remaining subgoals:
val it =
Now we have finished the base case of the induction and can move to the step case. An
obvious thing to try is simplification with the definitions of addition and factorial:
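The box is missing from this extract; the tactic was presumably something like (a guess
based on the surrounding text):
e (rw [FACT, ADD_CLAUSES]);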
> e (rw[DIVIDES_RMUL]); 27
OK.. ...output elided...
Goal proved.
⊢ ∀m p. 0 < m ⇒ m divides FACT (m + p)
val it =
Initial goal proved.
⊢ ∀m n. 0 < m ∧ m ≤ n ⇒ m divides FACT n: proof
We have finished the search for the proof, and now turn to the task of making a single
tactic out of the sequence of tactic invocations we have just made. We assume that the
sequence of invocations has been kept track of in a file or a text editor buffer. We would
thus have something like the following:
One obvious step would be to merge the two successive invocations of the simplifier in
the step case:
Now we’ll make the occasionally dangerous assumption that the simplifications of the
step case won’t interfere with what is happening in the base case, and move the step
case’s tactic to precede the first >|, using >>. When the Induct tactic generates two
sub-goals, the step case’s simplification will be applied to both of them:
m divides FACT m
------------------------------------
0 < m
The step case has been dealt with, and as we hoped the base case has not been changed
at all. This means that our tactic can become
In the base case, we have two invocations of the simplifier under the case-split on m. In
general, the two different simplifier invocations do slightly different things in addition to
simplifying the conclusion of the goal:
• rw strips apart the propositional structure of the goal, and eliminates equalities
from the assumptions
However, in this case the goal where we used rw did not include any propositional
structure to strip apart, and so we can be confident that using fs in the same place
would also work. Thus, we can merge the two sub-cases of the base-case into a single
invocation of fs:
We have now finished our exercise in tactic revision. Certainly, it would be hard to
foresee that this final tactic would prove the goal; the required lemmas for the final
invocation of metis_tac have been found by an incremental process of revision.
This is slightly hard to read, so we sequence a call to the simplifier to strip both arms of
the proof. As before, use of >> ensures that the tactic gets applied in both branches of
the induction. (We might also use rpt strip_tac if we didn’t want the simplification to
happen.)
m divides FACT n
------------------------------------
0. ∀n m. (v = n − m) ⇒ 0 < m ∧ m ≤ n ⇒ m divides FACT n
1. SUC v = n − m
2. 0 < m
m divides FACT n
------------------------------------
0. n ≤ m
1. 0 < m
2. m ≤ n
Looking at the first goal, we can see (by the anti-symmetry of ≤) that 𝑚 = 𝑛. We can
prove this fact, using rw and add it to the hypotheses by use of the infix operator “by”:
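The invocation is not reproduced here; it would have been roughly:
e (`m = n` by rw[]);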
m divides FACT n
------------------------------------
0. n ≤ m
1. 0 < m
2. m ≤ n
3. m = n
We can now use simplification again to propagate the newly derived equality throughout
the goal.
> e (rw[]); 33
OK..
1 subgoal:
val it =
m divides FACT m
------------------------------------
0. m ≤ m
1. 0 < m
2. m ≤ m
At this point in the previous proof we did a case analysis on 𝑚. However, we already
have the hypothesis that 𝑚 is positive (along with two other now useless hypotheses).
Thus we know that 𝑚 is the successor of some number 𝑘. We might wish to assert this
fact with an invocation of “by” as follows:
But what is the tactic? If we try rw, it will fail since the embedded arithmetic decision
procedure doesn’t handle existential statements very well. What to do?
In fact, that earlier case analysis will again do the job: but now we hide it away so
that it is only used to prove this sub-goal. When we execute Cases_on ‘m‘, we will get a
case where m has been substituted out for 0. This case will be contradictory given that
we already have an assumption 0 < m, and we can again use fs. In the other case, there
will be an assumption that m is some successor value, and this will make it easy for the
simplifier to prove the goal.
Thus:
> e (‘?k. m = SUC k‘ by (Cases_on ‘m‘ >> fs[])); 34
OK..
1 subgoal:
val it =
m divides FACT m
------------------------------------
0. m ≤ m
1. 0 < m
2. m ≤ m
3. m = SUC k
Goal proved.
[...] ⊢ m divides FACT n
Remaining subgoals:
val it =
m divides FACT n
------------------------------------
0. ∀n m. (v = n − m) ⇒ 0 < m ∧ m ≤ n ⇒ m divides FACT n
1. SUC v = n − m
2. 0 < m
That takes care of the base case. For the induction step, things look a bit more difficult
than in the earlier proof. However, we can make progress by realizing that the hypotheses
imply that 0 < 𝑛 and so we can transform 𝑛 into a successor, thus enabling the unfolding
of FACT, as in the previous proof:
> e (‘0 < n‘ by rw[] >> ‘?k. n = SUC k‘ by (Cases_on ‘n‘ >> fs[])); 36
OK..
1 subgoal:
val it =
m divides FACT n
------------------------------------
0. ∀n m. (v = n − m) ⇒ 0 < m ∧ m ≤ n ⇒ m divides FACT n
1. SUC v = n − m
2. 0 < m
3. 0 < n
4. n = SUC k
The proof now finishes in much the same manner as the previous one:
> e (rw [FACT, DIVIDES_RMUL]); 37
OK.. ...output elided...
Goal proved.
[...] ⊢ m divides FACT n
val it =
Initial goal proved.
⊢ ∀m n. 0 < m ∧ m ≤ n ⇒ m divides FACT n: proof
We leave the details of stitching the proof together to the interested reader.
4.2 Primality
Now we move on to establish some facts about the primality of the first few numbers: 0
and 1 are not prime, but 2 is. Also, all primes are positive. These are all quite simple to
prove.
(NOT_PRIME_0) ~prime 0
rw[prime_def,DIVIDES_0]
(NOT_PRIME_1) ~prime 1
rw[prime_def]
(PRIME_2) prime 2
rw[prime_def] >>
metis_tac [DIVIDES_LE, DIVIDES_ZERO, DECIDE ‘‘2<>0‘‘,
DECIDE ‘‘x <= 2 <=> (x=0) \/ (x=1) \/ (x=2)‘‘]
4.3 Existence of Prime Factors
We start by invoking complete induction. This gives us an inductive hypothesis that holds
at every number 𝑚 strictly smaller than 𝑛:
We can move the antecedent to the hypotheses and make our case split. Notice that the
term given to Cases_on need not occur in the goal:
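The session box is missing; the two steps described amount to something like this sketch:
e (strip_tac >> Cases_on `prime n`);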
In the second case, we can get a divisor of 𝑛 that isn’t 1 or 𝑛 (since 𝑛 is not prime):
At this point, the polished tactic simply invokes metis_tac with a collection of theorems.
We will attempt a more detailed exposition. Given the hypotheses, and by DIVIDES_LE,
we can assert 𝑥 < 𝑛 ∨ 𝑛 = 0 and thus split the proof into two cases:
In the first subgoal, we can see that the antecedents of the inductive hypothesis are met
and so 𝑥 has a prime divisor. We can then use the transitivity of divisibility to get the fact
that this divisor of 𝑥 is also a divisor of 𝑛, thus finishing this branch of the proof:
Goal proved.
[.......] ⊢ ∃p. prime p ∧ p divides n
Remaining subgoals:
val it =
> e (rw[]); 44
OK..
1 subgoal:
val it =
> DIVIDES_0; 45
val it = ⊢ ∀x. x divides 0: thm
Goal proved.
[.] ⊢ n ≠ 1 ⇒ ∃p. prime p ∧ p divides n
val it =
Initial goal proved.
⊢ ∀n. n ≠ 1 ⇒ ∃p. prime p ∧ p divides n: proof
Again, work now needs to be done to compose and perhaps polish a single tactic from the
individual proof steps, but we will not describe it.9 Instead we move forward, because
our ultimate goal is in reach.
4.4 Euclid’s Theorem
Let’s prise this apart and look at it in some detail. A proof by contradiction can be started
by using the bossLib function spose_not_then. With it, one assumes the negation of the
current goal and then uses that in an attempt to prove falsity (F). The assumed negation
¬(∀𝑛. ∃𝑝. 𝑛 < 𝑝 ∧ prime 𝑝) is simplified a bit into ∃𝑛. ∀𝑝. 𝑛 < 𝑝 ⊃ ¬ prime 𝑝 and then is
passed to the tactic strip_assume_tac. This moves its argument to the assumption list
of the goal after eliminating the existential quantification on 𝑛.
9 Indeed, the tactic can be simplified into complete induction followed by an invocation of METIS_TAC
with suitable lemmas.
F
------------------------------------
∀p. n < p ⇒ ¬prime p
Thus we have the hypothesis that all 𝑝 beyond a certain unspecified 𝑛 are not prime, and
our task is to show that this cannot be. At this point we take advantage of Euclid’s great
inspiration and we build an explicit term from 𝑛. In the informal proof we are asked
to ‘consider’ the term FACT 𝑛 + 1.10 This term will have certain properties (i.e., it has a
prime factor) that lead to contradiction. Question: how do we ‘consider’ this term in the
formal HOL proof? Answer: by instantiating a lemma with it and bringing the lemma
into the proof. The lemma and its instantiation are:11
> PRIME_FACTOR; 48
val it = ⊢ ∀n. n ≠ 1 ⇒ ∃p. prime p ∧ p divides n: thm
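The instantiation itself is not shown in this extract; presumably (a sketch):
val th = SPEC ``FACT n + 1`` PRIME_FACTOR;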
It is evident that the antecedent of th can be eliminated. In HOL, one could do this in a
so-called forward proof style (by proving ⊢ ¬(FACT 𝑛 + 1 = 1) and then applying modus
ponens, the result of which can then be used in the proof), or one could bring th into the
proof and simplify it in situ. We choose the latter approach.
The invocation mp_tac (⊢ 𝑀) applied to a goal (Δ, 𝑔) returns the goal (Δ, 𝑀 ⇒ 𝑔). Now
we simplify:
10 The HOL parser thinks FACT 𝑛 + 1 is equivalent to (FACT 𝑛) + 1.
11 The function SPEC implements the rule of universal specialization.
> e (rw[]); 50
OK..
2 subgoals:
val it =
FACT n ≠ 0
------------------------------------
∀p. n < p ⇒ ¬prime p
We recall that zero is less than every factorial, a fact found in arithmeticTheory under
the name FACT_LESS. Thus we can solve the top goal by simplification:
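The box is missing; a tactic of roughly this shape would do the job (note the DECIDE-supplied
rewrite referred to just below):
e (rw [FACT_LESS, DECIDE ``x <> 0 <=> 0 < x``]);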
Goal proved.
⊢ FACT n ≠ 0
Remaining subgoals:
val it =
Notice the ‘on-the-fly’ use of DECIDE to provide an ad hoc rewrite. Looking at the
remaining goal, one might think that our aim, to prove falsity, has been lost. But this is
not so: a goal ¬𝑃 ∨ ¬𝑄 is logically equivalent to 𝑃 ⇒ 𝑄 ⇒ 𝙵. In the following invocation,
we use the equivalence ⊢ (𝐴 ⇒ 𝐵) ⟺ ¬𝐴 ∨ 𝐵 as a rewrite rule oriented right to left by use of
GSYM.12
12 Loosely speaking, GSYM swaps the left and right hand sides of any equations it finds.
> IMP_DISJ_THM; 52
val it = ⊢ ∀A B. A ⇒ B ⟺ ¬A ∨ B: thm
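The rewriting step itself is missing from the extract; it would be along the lines of:
e (rw [GSYM IMP_DISJ_THM]);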
We can quickly proceed to show that 𝑝 𝚍𝚒𝚟𝚒𝚍𝚎𝚜 (𝙵𝙰𝙲𝚃 𝑛), so that if it also divides
FACT n + 1, then 𝑝 divides 1, meaning that 𝑝 = 1. But then 𝑝 is not prime, at which
point we are done. This can all be packaged into a single invocation of METIS_TAC:
Goal proved.
[.] ⊢ F
val it =
Initial goal proved.
⊢ ∀n. ∃p. n < p ∧ prime p: proof
Euclid’s theorem is now proved, and we can rest. However, this presentation of the
final proof will be unsatisfactory to some, because the proof is completely hidden
in the invocation of the automated reasoner. Well then, let’s try another proof, this
time employing the so-called ‘assertional’ style. When used uniformly, this can allow
a readable linear presentation that mirrors the informal proof. The following proves
Euclid’s theorem in the assertional style. We think it is fairly readable, certainly much
more so than the standard tactic proof just given.13
13 Note that CCONTR_TAC, which is used to start the proof, initiates a proof by contradiction by negating
the goal and placing it on the hypotheses, leaving F as the new goal.
4.5 Turning the Script into a Theory
<holdir>/examples/euclid.sml
as our base-line. This file is already close to being in the right form. It has all of the
proofs of the theorems in “sewn-up” form so that when run, it does not involve the
goal-stack at all. In its given form, it can be run as input to hol thus:
$ cd examples/ 1
$ ../bin/hol < euclid.sml
...
However, we now want to create a euclidTheory that we can load in other interactive
sessions. So, our first step is to create a file euclidScript.sml, and to copy the body of
euclid.sml into it.
The first non-comment line opens arithmeticTheory. However, when writing for the
compiler, we need to explicitly mention the other HOL modules that we depend on. We
must add
open HolKernel boolLib Parse bossLib
The next line that poses a difficulty is
set_fixity "divides" (Infixr 450);
While it is legitimate to type expressions directly into the interactive system, the compiler
requires that every top-level phrase be a declaration. We satisfy this requirement by
altering this line into a “do nothing” declaration that does not record the result of the
expression:
val _ = set_fixity "divides" (Infixr 450)
The only extra changes are to bracket the rest of the script text with calls to new_theory
and export_theory. So, before the definition of divides, we add:
val _ = new_theory "euclid";
and at the end of the file:
val _ = export_theory();
Now, we can compile the script we have created using the Holmake tool. To keep things
a little tidier, we first move our script into a new directory.
$ mkdir euclid 2
$ mv euclidScript.sml euclid
$ cd euclid
$ ../../bin/Holmake
Analysing euclidScript.sml
Trying to create directory .HOLMK for dependency files
Compiling euclidScript.sml
Linking euclidScript.uo to produce theory-builder executable
<<HOL message: Created theory "euclid".>>
Definition has been stored under "divides_def".
Definition has been stored under "prime_def".
Meson search level: .....
Meson search level: .................
...
Exporting theory "euclid" ... done.
Analysing euclidTheory.sml
Analysing euclidTheory.sig
Compiling euclidTheory.sig
Compiling euclidTheory.sml
Now we have created four new files, various forms of euclidTheory with four different
suffixes. Only euclidTheory.sig is really suitable for human consumption. While still
in the euclid directory that we created, we can demonstrate:
$ ../../bin/hol 3
[...]
4.6 Summary
The reader has now seen an interesting theorem proved, in great detail, in HOL. The
discussion illustrated the high-level tools provided in bossLib and touched on issues
including tool selection, undo, ‘tactic polishing’, exploratory simplification, and the
‘forking-off’ of new proof attempts. We also attempted to give a flavour of the thought
processes a user would employ. Following is a more-or-less random collection of other
observations.
• Even though the proof of Euclid’s theorem is short and easy to understand when
presented informally, a perhaps surprising amount of support development was
required to set the stage for Euclid’s classic argument.
• The proof support offered by bossLib (rw, metis_tac, DECIDE, Cases_on, Induct_on,
and the “by” construct) was nearly complete for this example: it was rarely
necessary to resort to lower-level tactics.
same thing). This is desirable, since the hypotheses are notionally a set, and
moreover, experience has shown that profligate indexing into hypotheses results in
hard-to-maintain proof scripts.
We also found that we could directly simplify in the assumptions by using the fs
tactic. Nonetheless, it can be clumsy to work with a large set of hypotheses, in
which case the following approaches may be useful.
One can directly refer to hypotheses by using UNDISCH_TAC (makes the designated
hypothesis the antecedent to the goal), ASSUM_LIST (gives the entire hypothesis
list to a tactic), pop_assum (gives the top hypothesis to a tactic), and qpat_assum
(gives the first matching hypothesis to a tactic). (See the REFERENCE for further
details on all of these.) The numbers attached to hypotheses by the proof manager
could likely be used to access hypotheses (it would be quite simple to write such a
tactic). However, starting a new proof is sometimes the most clarifying thing to do.
In some cases, it is useful to be able to delete a hypothesis. This can be accomplished
by passing the hypothesis to a tactic that ignores it. For example, to discard the top
hypothesis, one could invoke pop_assum kall_tac.
• In the example, we didn’t use the more advanced features of bossLib, largely
because they do not, as yet, provide much more functionality than the simple
sequencing of simplification, decision procedures, and automated first order
reasoning. The >> tactical has thus served as an adequate replacement. In the future,
these entrypoints should become more powerful.
• High powered tools like metis_tac, and rw are the principal way of advancing a
proof in bossLib. In many cases, they do exactly what is desired, or even manage
to surprise the user with their power. In the formalization of Euclid’s theorem, the
tools performed fairly well. However, sometimes they are overly aggressive, or
they simply flounder. In such cases, more specialized proof tools need to be used,
or even written, and hence the support underlying bossLib must eventually be
learned.
• Having a good knowledge of the available lemmas, and where they are located, is
an essential part of being successful. Often powerful tools can replace lemmas in a
restricted domain, but in general, one has to know what has already been proved.
We have found that the entrypoints in DB help in quickly finding lemmas.
Chapter 5
Example: A Simple Parity Checker
This chapter consists of a worked example: the specification and verification of a simple
sequential parity checker. The intention is to accomplish two things:
(i) To present a complete piece of work with HOL.
(ii) To give a flavour of what it is like to use the HOL system for a tricky proof.
Concerning (ii), note that although the theorems proved are, in fact, rather simple,
the way they are proved illustrates the kind of intricate ‘proof engineering’ that is typical.
The proofs could be done more elegantly, but presenting them that way would defeat
the purpose of illustrating various features of HOL. It is hoped that the small example
here will give the reader a feel for what it is like to do a big one.
Readers who are not interested in hardware verification should be able to learn
something about the HOL system even if they do not wish to penetrate the details of
the parity-checking example used here. The specification and verification of a slightly
more complex parity checker is set as an exercise (a solution is provided in the directory
examples/parity).
5.1 Introduction
The sessions of this example comprise the specification and verification of a device that
computes the parity of a sequence of bits. More specifically, a detailed verification is given
of a device with an input in, an output out and the specification that the 𝑛th output on
out is T if and only if there have been an even number of T’s input on in. A theory named
PARITY is constructed; this contains the specification and verification of the device. All the
ML input in the boxes below can be found in the file examples/parity/PARITYScript.sml.
It is suggested that the reader interactively input this to get a ‘hands on’ feel for the
example. The goal of the case study is to illustrate detailed ‘proof hacking’ on a small
and fairly simple example.
5.2 Specification
The first step is to start up the HOL system. We again use <holdir>/bin/hol. The ML
prompt is >, so lines beginning with > are typed by the user and other lines are the
system’s response.
To specify the device, a primitive recursive function PARITY is defined so that for 𝑛 > 0,
PARITY 𝑛 𝑓 is true if the number of T’s in the sequence 𝑓 (1), … , 𝑓 (𝑛) is even.
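The definition box is not reproduced here; the definition is presumably of this form (a sketch):
val PARITY_def = Define `
  (PARITY 0 f = T) /\
  (PARITY (SUC n) f = (if f (SUC n) then ~(PARITY n f) else PARITY n f))`;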
The effect of our call to Define is to store the definition of PARITY on the current theory
with name PARITY_def and to bind the defining theorem to the ML variable with the same
name. Notice that there are two name spaces being written into: the names of constants
in theories and the names of variables in ML. The user is generally free to manage these
names however he or she wishes (subject to the various lexical requirements), but a
common convention is (as here) to give the definition of a constant CON the name CON_def
in the theory and also in ML. Another commonly-used convention is to use just CON for the
theory and ML name of the definition of a constant CON. Unfortunately, the HOL system
does not use a uniform convention, but users are recommended to adopt one. In this
case Define has made one of the choices for us, but there are other scenarios where we
have to choose the name used in the theory file.
The specification of the parity checking device can now be given as:
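(The displayed specification is missing from this extract; presumably it is the statement that
the output at every time equals the parity of the inputs received so far.)
!t. out t = PARITY t inp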
It is intuitively clear that this specification will be satisfied if the signal1 functions inp
and out satisfy:2
out(0) = T
and
!t. out(SUC t) = (if inp(SUC t) then ~(out t) else out t)
The uniqueness of any such out is expressed by the following lemma:
!inp out.
(out 0 = T) /\
(!t. out(SUC t) = if inp(SUC t) then ~out t else out t)
==>
(!t. out t = PARITY t inp)
The proof of this is done by Mathematical Induction and, although trivial, is a good
illustration of how such proofs are done. The lemma is proved interactively using HOL’s
subgoal package. The proof is started by putting the goal to be proved on a goal stack
using the function g which takes a goal as argument.
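The invocation is missing here; it would be roughly:
g `!inp out.
     (out 0 = T) /\
     (!t. out(SUC t) = (if inp(SUC t) then ~(out t) else out t)) ==>
     (!t. out t = PARITY t inp)`;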
∀inp out.
(out 0 ⟺ T) ∧
(∀t. out (SUC t) ⟺ if inp (SUC t) then ¬out t else out t) ⇒
∀t. out t ⟺ PARITY t inp
The subgoal package prints out the goal on the top of the goal stack. The top goal is
expanded by stripping off the universal quantifier (with gen_tac) and then making the
two conjuncts of the antecedent of the implication into assumptions of the goal (with
strip_tac). The ML function e takes a tactic and applies it to the top goal; the resulting
subgoals are pushed on to the goal stack. The message ‘OK..’ is printed out just before
the tactic is applied. The resulting subgoal is then printed.
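The corresponding box is missing; the expansion described amounts to something like:
e (rpt gen_tac >> strip_tac);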
Next induction on t is done using Induct, which does induction on the outermost
universally quantified variable.
> e Induct; 4
OK..
2 subgoals:
val it =
The assumptions of the two subgoals are shown numbered underneath the horizontal
lines of hyphens. The last goal printed is the one on the top of the stack, which is the
basis case. This is solved by rewriting with its assumptions and the definition of PARITY.
> e(rw[PARITY_def]); 5
OK..
<<HOL message: Initialising SRW simpset ... done>>
Goal proved.
[.] ⊢ out 0 ⟺ PARITY 0 inp
Remaining subgoals:
val it =
The top goal is proved, so the system pops it from the goal stack (and puts the proved
theorem on a stack of theorems). The new top goal is the step case of the induction. This
goal is also solved by rewriting.
> e(rw[PARITY_def]); 6
OK.. ...output elided...
Goal proved.
[..] ⊢ ∀t. out t ⟺ PARITY t inp
val it =
Initial goal proved.
⊢ ∀inp out.
(out 0 ⟺ T) ∧
(∀t. out (SUC t) ⟺ if inp (SUC t) then ¬out t else out t) ⇒
∀t. out t ⟺ PARITY t inp: proof
The goal is proved, i.e. the empty list of subgoals is produced. The system now applies
the justification functions produced by the tactics to the lists of theorems achieving the
subgoals (starting with the empty list). These theorems are printed out in the order in
which they are generated (note that assumptions of theorems are printed as dots).
The ML function top_thm returns the theorem just proved (i.e. the one on the top of the
theorem stack) in the current theory, and we bind this to the ML name UNIQUENESS_LEMMA.
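The binding itself would look like:

> val UNIQUENESS_LEMMA = top_thm();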
5.3 Implementation
The lemma just proved suggests that the parity checker can be implemented by holding
the parity value in a register and then complementing the contents of the register
whenever T is input. To make the implementation more interesting, it will be assumed
that registers ‘power up’ storing F. Thus the output at time 0 cannot be taken directly
from a register, because the output of the parity checker at time 0 is specified to be T.
Another tricky thing to notice is that if t>0, then the output of the parity checker at time
t is a function of the input at time t. Thus there must be a combinational path from the
input to the output.
The schematic diagram below shows the design of a device that is intended to imple-
ment this specification. (The leftmost input to MUX is the selector.) This works by storing
the parity of the sequence input so far in the lower of the two registers. Each time T is
input at in, this stored value is complemented. Registers are assumed to ‘power up’ in a
state in which they are storing F. The second register (connected to ONE) initially outputs
F and then outputs T forever. Its role is just to ensure that the device works during the
first cycle by connecting the output out to the device ONE via the lower multiplexer. For
all subsequent cycles out is connected to l3 and so either carries the stored parity value
(if the current input is F) or the complement of this value (if the current input is T).
[Schematic diagram: the output out is fed back through a register to give line l2; NOT
inverts l2 to give l1; a MUX with selector in chooses l1 when the input is T and l2 when it
is F, producing l3; the device ONE drives l4, which a second register delays to give l5; a
final MUX with selector l5 connects out to l3 from the second cycle onwards and to l4
during the first cycle.]
The devices making up this schematic will be modelled with predicates [5]. For
example, the predicate ONE is true of a signal out if for all times t the value of out is T.
> val ONE_def = Define ‘ONE(out:num->bool) = !t. out t = T‘; 8
Definition has been stored under "ONE_def"
val ONE_def = ⊢ ∀out. ONE out ⟺ ∀t. out t ⟺ T: thm
Note that, as discussed above, ‘ONE_def’ is used both as an ML variable and as the name
of the definition in the theory. Note also how ‘:num->bool’ has been added to resolve type
ambiguities; without this (or some other type information) the typechecker would not
be able to infer that t is to have type num.
The binary predicate NOT is true of a pair of signals (inp,out) if the value of out is
always the negation of the value of inp. Inverters are thus modelled as having no delay.
This is appropriate for a register-transfer level model, but not at a lower level.
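Its definition (not shown above) would be along these lines:

> val NOT_def = Define ‘NOT(inp,out:num->bool) = !t. out t = ~(inp t)‘;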
The remaining devices in the schematic are registers. These are unit-delay elements;
the values output at time t+1 are the values input at the preceding time t, except at time
0, when the register outputs F.
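The definitions of MUX and REG, and of the predicate PARITY_IMP describing the whole
schematic, are not reproduced above. Reconstructed from the assumptions printed in the
verification section below, they would be roughly:

> val MUX_def = Define
    ‘MUX(sw,in1,in2,out:num->bool) = !t. out t = if sw t then in1 t else in2 t‘;
> val REG_def = Define
    ‘REG(inp,out:num->bool) = !t. out t = if t = 0 then F else inp(t-1)‘;
> val PARITY_IMP_def = Define
    ‘PARITY_IMP(inp,out) =
       ?l1 l2 l3 l4 l5.
         NOT(l2,l1) /\ MUX(inp,l1,l2,l3) /\ REG(out,l2) /\
         ONE l4 /\ REG(l4,l5) /\ MUX(l5,l3,l4,out)‘;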
5.4 Verification
The following theorem will eventually be proved:
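In symbols (the displayed theorem is not reproduced here, so this is a reconstruction from the description that follows):

!inp out. PARITY_IMP(inp,out) ==> !t. out t = PARITY t inp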
This states that if inp and out are related as in the schematic diagram (i.e. as in the
definition of PARITY_IMP), then the pair of signals (inp,out) satisfies the specification.
First, the following lemma is proved; the correctness of the parity checker follows from
this and UNIQUENESS_LEMMA by the transitivity of ==>.
∀inp out.
PARITY_IMP (inp,out) ⇒
(out 0 ⟺ T) ∧
∀t. out (SUC t) ⟺ if inp (SUC t) then ¬out t else out t
The first step in proving this goal is to rewrite with the definitions, followed by a
decomposition of the resulting goal using strip_tac. The rewriting tactic PURE_REWRITE_TAC
is used; this does no built-in simplifications, only those explicitly given in its list of theorems.
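The command, reconstructed from the compound tactic shown at the end of this section, would be:

> e (PURE_REWRITE_TAC [PARITY_IMP_def, ONE_def, NOT_def,
                       MUX_def, REG_def] >>
     rpt strip_tac);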
out 0 ⟺ T
------------------------------------
0. ∀t. l1 t ⟺ ¬l2 t
1. ∀t. l3 t ⟺ if inp t then l1 t else l2 t
2. ∀t. l2 t ⟺ if t = 0 then F else out (t − 1)
3. ∀t. l4 t ⟺ T
4. ∀t. l5 t ⟺ if t = 0 then F else l4 (t − 1)
5. ∀t. out t ⟺ if l5 t then l3 t else l4 t
The top goal is the one printed last; its conclusion is out 0 = T and its assumptions
are equations relating the values on the lines in the circuit. The natural next step would
be to expand the top goal by rewriting with the assumptions. However, if this were done
the system would go into an infinite loop because the equations for out, l2 and l3 are
mutually recursive. Instead we use the first-order reasoner metis_tac to do the work:
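The step itself (again matching the compound tactic at the end of the section) is simply:

> e (metis_tac []);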
Goal proved.
[......] ⊢ out 0 ⟺ T
Remaining subgoals:
val it =
The first of the two subgoals is proved. Inspecting the remaining goal it can be seen that
it will be solved if its left hand side, out(SUC t), is expanded using the assumption:
!t. out t = if l5 t then l3 t else l4 t
However, if this assumption is used for rewriting, then all the subterms of the form
out t will also be expanded. To prevent this, we really want to rewrite with a formula
that is specifically about out (SUC t). That is, we want to pull the relevant assumption out
of the assumption list and rewrite with a specialised version of it. We can do just
this using qpat_x_assum. This tactic is of type term quotation -> (thm -> tactic) -> tactic. It
selects an assumption that is of the form given by its first argument, and passes it to the
second argument, a function which expects a theorem and returns a tactic. Here it is in
action:
> e (qpat_x_assum ‘!t. out t = X t‘ 16
(fn th => REWRITE_TAC [SPEC ‘‘SUC t‘‘ th]));
OK..
1 subgoal:
val it =
The pattern used here exploited something called higher order matching. The actual
assumption that was taken off the assumption stack did not have a RHS that looked like
the application of a function (X in the pattern) to the t parameter, but the RHS could
nonetheless be seen as equal to the application of some function to the t parameter. In
fact, the value that matched X was ‘‘\x. if l5 x then l3 x else l4 x‘‘.
Inspecting the goal above, it can be seen that the next step is to unwind the equations
for the remaining lines of the circuit. We do this using the standard simplifier rw.
> e (rw[]); 17
OK.. ...output elided...
Goal proved.
[......] ⊢ out (SUC t) ⟺ if inp (SUC t) then ¬out t else out t
val it =
Initial goal proved.
⊢ ∀inp out.
PARITY_IMP (inp,out) ⇒
(out 0 ⟺ T) ∧
∀t. out (SUC t) ⟺ if inp (SUC t) then ¬out t else out t: proof
The theorem just proved is named PARITY_LEMMA and saved in the current theory.
> val PARITY_LEMMA = top_thm (); 18
val PARITY_LEMMA =
⊢ ∀inp out.
PARITY_IMP (inp,out) ⇒
(out 0 ⟺ T) ∧
∀t. out (SUC t) ⟺ if inp (SUC t) then ¬out t else out t: thm
PARITY_LEMMA could have been proved in one step with a single compound tactic. Our
initial goal can be expanded with a single tactic corresponding to the sequence of tactics
that were used interactively:
> restart(); ...output elided... 19
> e (PURE_REWRITE_TAC [PARITY_IMP_def, ONE_def, NOT_def,
MUX_def, REG_def] >>
rpt strip_tac >| [
metis_tac [],
qpat_x_assum ‘!t. out t = X t‘
(fn th => REWRITE_TAC [SPEC ‘‘SUC t‘‘ th]) >>
rw[]
]);
OK..
metis: r[+0+17]+0+0+0+0+0+1+0+1+0+0+1#
val it =
Initial goal proved.
⊢ ∀inp out.
PARITY_IMP (inp,out) ⇒
(out 0 ⟺ T) ∧
∀t. out (SUC t) ⟺ if inp (SUC t) then ¬out t else out t: proof
Armed with PARITY_LEMMA, the final theorem is easily proved. This will be done in one
step using the ML function prove.
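The tutorial's one-step proof is not shown here; one plausible way to finish (the theorem name and tactic below are guesses, not the original script) is to let metis_tac chain the two lemmas together:

> val PARITY_CORRECT = prove(
    ‘‘!inp out. PARITY_IMP(inp,out) ==> !t. out t = PARITY t inp‘‘,
    metis_tac [PARITY_LEMMA, UNIQUENESS_LEMMA]);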
5.5 Exercises
Two exercises are given in this section: Exercise 1 is straightforward, but Exercise 2 is
quite tricky and might take a beginner several days to solve.
5.5.1 Exercise 1
Using only the devices ONE, NOT, MUX and REG defined in Section 5.3, design and verify a
register RESET_REG with an input inp, reset line reset, output out and behaviour specified
as follows.
• If reset is T at time t or t+1, then the value output at out at time t+1 is T, otherwise
it is equal to the value input at time t on inp.
RESET_REG(reset,inp,out) <=>
(!t. reset t ==> (out t = T)) /\
(!t. out(t+1) = if reset t \/ reset(t+1) then T else inp t)
Note that this specification is only partial; it doesn’t specify the output at time 0 in the
case that there is no reset.
The solution to the exercise should be a definition of a predicate RESET_REG_IMP as
an existential quantification of a conjunction of applications of ONE, NOT, MUX and REG to
suitable line names, together with a proof of:
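The goal to be proved is presumably of the form

!reset inp out. RESET_REG_IMP(reset,inp,out) ==> RESET_REG(reset,inp,out)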
5.5.2 Exercise 2
1. Formally specify a resetable parity checker that has two boolean inputs reset and
inp, and one boolean output out with the following behaviour:
The value at out is T if and only if there have been an even number of Ts
input at inp since the last time that T was input at reset.
2. Design an implementation of this specification built using only the devices ONE, NOT,
MUX and REG defined in Section 5.3.
Chapter 6
Example: Combinatory Logic
6.1 Introduction
This small case study is a formalisation of (variable-free) combinatory logic. This logic
is of foundational importance in theoretical computer science, and has a very rich
theory. The example builds principally on a development done by Tom Melham. The
complete script for the development is available as clScript.sml in the examples/ind_def
directory of the distribution. It is self-contained and so includes the answers to the
exercises set at the end of this document.
The H OL sessions assume that the Unicode trace is on (as it is by default), meaning
that even though the inputs may be written in pure ASCII, the output still uses nice
Unicode symbols (such as ∀ and ⇒). The Unicode symbols could also be used in
the input.
We also want the # to be an infix, so we set its fixity to be a tight left-associative infix:
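The type of combinators and the fixity declaration are not reproduced above; they would look something like the following (the precedence level is a guess):

> val _ = Hol_datatype ‘cl = S | K | # of cl => cl‘;
> val _ = set_fixity "#" (Infixl 1100);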
Combinatory logic is the study of how values of this type can evolve given various rules
describing how they change. Therefore, our next step is to define the reductions that
combinators can undergo. There are two basic rules:
𝖪 𝑥 𝑦 → 𝑥
𝖲 𝑓 𝑔 𝑥 → (𝑓 𝑥) (𝑔 𝑥)
Here, in our description outside of HOL, we use juxtaposition instead of the #. Further,
juxtaposition is also left-associative, so that 𝖪 𝑥 𝑦 should be read as 𝖪 # 𝑥 # 𝑦 which is in
turn (𝖪 # 𝑥) # 𝑦.
Given a term in the logic, we want these reductions to be able to fire at any point, not
just at the top level, so we need two further congruence rules:
 𝑥 → 𝑥′                    𝑦 → 𝑦′
-----------               -----------
𝑥 𝑦 → 𝑥′ 𝑦                𝑥 𝑦 → 𝑥 𝑦′
In HOL, we can capture this relation with an inductive definition. First we need to set our
arrow symbol up as an infix to make everything that bit prettier. The set_mapped_fixity
function lets the arrow be our surface syntax, but maps to the name redn underneath.
Making constants have pure alphanumeric names is generally a good idea.
We make our arrow symbol non-associative, thereby making it a parse error to write
x --> y --> z. It would be nice to be able to write this and have it mean x --> y /\ y --> z,
but this is not presently possible with the HOL parser.
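The declaration itself is not shown above; it would be something like the following (the precedence level 450 is taken from the similar declaration for -||->* later in the chapter):

> val _ = set_mapped_fixity {fixity = Infix(NONASSOC, 450),
                             term_name = "redn", tok = "-->"};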
Our next step is to actually define the relation with the Hol_reln function. This
function returns three separate theorems, but we will only need to refer to the first:
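The definition is not reproduced here; a sketch, matching the rules described above, is:

> val (redn_rules, redn_ind, redn_cases) = Hol_reln
    ‘(!x y.    K # x # y --> x)                      /\
     (!f g x.  S # f # g # x --> (f # x) # (g # x))  /\
     (!x x' y. x --> x' ==> x # y --> x' # y)        /\
     (!x y y'. y --> y' ==> x # y --> x # y')‘;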
In addition to proving these three theorems for us, the inductive definitions package
will also save them to disk when the theory is exported.
Now, using our theorem redn_rules we can demonstrate single steps of our reduction
relation:
> PROVE [redn_rules] ‘‘S # (K # x # x) --> S # x‘‘; 5
Meson search level: ...
val it = ⊢ S # (K # x # x) --> S # x: thm
The system we have just defined is as powerful as the 𝜆-calculus, Turing machines, and
all the other standard models of computation.
One useful result about the combinatory logic is that it is confluent. Consider the
term 𝖲 𝑧 (𝖪 𝖪) (𝖪 𝑦 𝑥). It can make two reductions, to 𝖲 𝑧 (𝖪 𝖪) 𝑦 and also to
(𝑧 (𝖪 𝑦 𝑥)) (𝖪 𝖪 (𝖪 𝑦 𝑥)). Do these two choices of reduction mean that from this point on
the terms have two completely separate histories? Roughly speaking, to be confluent
means that the answer to this question is no.
So, we begin our abstract digression with another inductive definition. Our new
constant is RTC, such that 𝖱𝖳𝖢 𝑅 𝑥 𝑦 is true if it is possible to get from 𝑥 to 𝑦 with zero
or more “steps” of the 𝑅 relation. (The standard notation for 𝖱𝖳𝖢 𝑅 is 𝑅∗ ; we will see
H OL try to approximate this with the text R^*.) We can express this idea with just two
rules. The first
𝖱𝖳𝖢 𝑅 𝑥 𝑥
says that it’s always possible to get from 𝑥 to 𝑥 in zero or more steps. The second
𝑅 𝑥 𝑦     𝖱𝖳𝖢 𝑅 𝑦 𝑧
--------------------
     𝖱𝖳𝖢 𝑅 𝑥 𝑧
says that if you can take a single step from 𝑥 to 𝑦, and then take zero or more steps
to get 𝑦 to 𝑧, then it’s possible to take zero or more steps to get between 𝑥 and 𝑧. The
realisation of these rules in HOL is again straightforward.
(As it happens, RTC is already a defined constant in the context we’re working in
(it is found in relationTheory), so we’ll hide it from view before we begin. We thus
avoid messages telling us that we are inputting ambiguous terms. The ambiguities
would always be resolved in favour of the more recent definition, but the warnings are
annoying. Our new constant inherits the nice syntax of the old one.)
> val _ = hide "RTC"; 6
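The inductive definition of our own RTC is not reproduced above; it would be roughly as follows (the setup giving the R^* surface syntax is also omitted):

> val (RTC_rules, RTC_ind, RTC_cases) = Hol_reln
    ‘(!x. RTC R x x) /\
     (!x y z. R x y /\ RTC R y z ==> RTC R x z)‘;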
Now let us go back to the notion of confluence. We want this to mean something like:
“though a system may take different paths in the short-term, those two paths can always
end up in the same place”. This suggests that we define confluent thus:
> val confluent_def = Define 7
‘confluent R =
!x y z. RTC R x y /\ RTC R x z ==>
?u. RTC R y u /\ RTC R z u‘;
<<HOL message: inventing new type variable names: ’a>>
Definition has been stored under "confluent_def"
val confluent_def =
⊢ ∀R. confluent R ⟺ ∀x y z. R^* x y ∧ R^* x z ⇒ ∃u. R^* y u ∧ R^* z u: thm
[Two diagrams: first, the hypotheses of confluence — 𝑅∗ paths from 𝑥 to 𝑦 and from 𝑥 to 𝑧;
second, the same picture completed by a 𝑢 with 𝑅∗ paths from both 𝑦 and 𝑧 to 𝑢.]
One nice property of confluent relations is that from any one starting point they
produce no more than one normal form, where a normal form is a value from which no
further steps can be taken.
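The definition itself is not reproduced above; it is presumably of the form:

> val normform_def = Define ‘normform R x = !y. ~R x y‘;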
In other words, a system has an 𝑅-normal form at 𝑥 if there are no connections via 𝑅
to any other values. (We could have written ~?y. R x y as our RHS for the definition
above.)
We can now prove the following:
∀R.
confluent R ⇒
∀x y z. R^* x y ∧ normform R y ∧ R^* x z ∧ normform R z ⇒ (y = z)
> e (rw[confluent_def]); 10
OK..
<<HOL message: Initialising SRW simpset ... done>>
1 subgoal:
val it =
y = z
------------------------------------
0. ∀x y z. R^* x y ∧ R^* x z ⇒ ∃u. R^* y u ∧ R^* z u
1. R^* x y
2. normform R y
3. R^* x z
4. normform R z
Our confluence property is now assumption 0, and we can use it to infer that there is a 𝑢
at the base of the diamond:
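The step, matching the packaged proof given at the end of this proof, is:

> e (‘?u. RTC R y u /\ RTC R z u‘ by metis_tac []);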
y = z
------------------------------------
0. ∀x y z. R^* x y ∧ R^* x z ⇒ ∃u. R^* y u ∧ R^* z u
1. R^* x y
2. normform R y
3. R^* x z
4. normform R z
5. R^* y u
6. R^* z u
So, from 𝑦 we can take zero or more steps to get to 𝑢 and similarly from 𝑧. But, we also
know that we’re at an 𝑅-normal form at both 𝑦 and 𝑧. We can’t take any steps at all from
these values. We can conclude both that 𝑢 = 𝑦 and 𝑢 = 𝑧, and this in turn means that
𝑦 = 𝑧, which is our goal. So we can finish with
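Again matching the packaged proof below, the finishing step is:

> e (metis_tac [normform_def, RTC_cases]);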
Goal proved.
[.....] ⊢ y = z
val it =
Initial goal proved.
⊢ ∀R.
confluent R ⇒
∀x y z. R^* x y ∧ normform R y ∧ R^* x z ∧ normform R z ⇒ (y = z)
Packaged up so as to remove the sub-goal package commands, we can prove and save
the theorem for future use by:
> val confluent_normforms_unique = store_thm( 13
"confluent_normforms_unique",
‘‘!R. confluent R ==>
!x y z. RTC R x y /\ normform R y /\
RTC R x z /\ normform R z ==> (y = z)‘‘,
rw[confluent_def] >>
‘?u. RTC R y u /\ RTC R z u‘ by metis_tac [] >>
metis_tac [normform_def, RTC_cases]);
<<HOL message: inventing new type variable names: ’a>>
metis: r[+0+8]+0+0+0+0+0+0+1+1+1+1#
metis: r[+0+20]+0+0+0+0+0+0+0+0+0+0+0+0+6+0+0+0+0+0+0+2+0 .... #
val confluent_normforms_unique =
⊢ ∀R.
confluent R ⇒
∀x y z. R^* x y ∧ normform R y ∧ R^* x z ∧ normform R z ⇒ (y = z):
thm
⋯⋄⋯
Clearly confluence is a nice property for a system to have. The question is how we
might manage to prove it. Let’s start by defining the diamond property that we used in
the definition of confluence. We’ll again hide the existing definition of “diamond”:
> val _ = hide "diamond"; 14
> val diamond_def = Define
‘diamond R = !x y z. R x y /\ R x z ==> ?u. R y u /\ R z u‘;
<<HOL message: inventing new type variable names: ’a>>
Definition has been stored under "diamond_def"
val diamond_def =
⊢ ∀R. diamond R ⟺ ∀x y z. R x y ∧ R x z ⇒ ∃u. R y u ∧ R z u: thm
Now we clearly have that confluence of a relation is equivalent to the reflexive, transitive
closure of that relation having the diamond property.
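The corresponding theorem and its proof are not shown above; they might look like this (the name and the one-line tactic are guesses):

> val confluent_diamond_RTC = store_thm(
    "confluent_diamond_RTC",
    ‘‘!R. confluent R = diamond (RTC R)‘‘,
    rw[confluent_def, diamond_def]);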
So far so good. How then do we show the diamond property for 𝖱𝖳𝖢 𝑅? The answer
that leaps to mind is to hope that if the original relation has the diamond property, then
maybe the reflexive and transitive closure will too. The theorem we want is
[Two diagrams: the diamond property for 𝑅 itself (single steps from 𝑥 to 𝑦 and 𝑧, completed
by 𝑢), and the property we want, diamond 𝑅 ⇒ diamond (𝖱𝖳𝖢 𝑅), drawn as the corresponding
𝖱𝖳𝖢 𝑅 diamond through intermediate points 𝑝, 𝑞 and 𝑟.]
where the dashed lines indicate that these steps (from 𝑥 to 𝑝, for example) are using
𝖱𝖳𝖢 𝑅. The presence of two instances of 𝖱𝖳𝖢 𝑅 is an indication that this proof will
require two inductions. With the first we will prove
[Diagram: a single 𝑅-step from 𝑥 to 𝑧 and an 𝖱𝖳𝖢 𝑅 path from 𝑥 to 𝑝, completed by a point
𝑟 reachable from both 𝑝 and 𝑧 by 𝖱𝖳𝖢 𝑅 paths.]
In other words, we want to show that if we take one step in one direction (to 𝑧) and
many steps in another (to 𝑝), then the diamond property for 𝑅 will guarantee us the
existence of 𝑟, to which we will be able to take many steps from both 𝑝 and 𝑧.
We take some care to state the goal so that after stripping away the outermost as-
sumption (that 𝑅 has the diamond property), it will match the induction principle for
RTC.1
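The goal, matching the packaged proof given later, is:

> g ‘!R. diamond R ==>
        !x p. RTC R x p ==>
           !z. R x z ==> ?u. RTC R p u /\ RTC R z u‘;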
First, we strip away the diamond property assumption (two things need to be stripped:
the outermost universal quantifier and the antecedent of the implication). If we use rw
at this point, we strip away too much so we have to be more precise and use the lower
level tool strip_tac. This tactic will remove a universal quantification, an implication
or a conjunction:
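The stripping step (again matching the packaged proof) is:

> e (strip_tac >> strip_tac);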
Now we can use the induction principle for reflexive and transitive closure (alternatively,
we perform a “rule induction”). To do this, we use the Induct_on command that is also
used to do structural induction on algebraic data types (such as numbers and lists). We
provide the name of the constant whose induction principle we want to use, and the
tactic does the rest:
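The command, matching the packaged proof, is:

> e (Induct_on ‘RTC‘);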
1 In this and subsequent proofs using the sub-goal package, we will present the proof manager as if
the goal to be proved is the first ever on this stack. In other words, we have done a dropn 1; after every
successful proof to remove the evidence of the old goal. In practice, there is no harm in leaving these goals
on the proof manager’s stack.
Let’s strip the goal as much as possible with the aim of making what remains to be proved
easier to see:
> e (rw[]); 19
OK..
2 subgoals:
val it =
This first goal is easy. It corresponds to the case where the many steps from 𝑥 to 𝑝 are
actually no steps at all, and 𝑝 and 𝑥 are actually the same place. In the other direction, 𝑥
has taken one step to 𝑧, and we need to find somewhere reachable in zero or more steps
from both 𝑥 and 𝑧. Given what we know so far, the only candidate is 𝑧 itself. In fact, we
don’t even need to provide this witness explicitly: metis_tac will find it for us, as long
as we tell it what the rules governing RTC are:
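The step, matching the packaged proof, is:

> e (metis_tac [RTC_rules]);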
Goal proved.
[..] ⊢ ∃u. R^* x u ∧ R^* z u
Remaining subgoals:
val it =
And what of this remaining goal? Assumptions one and four between them are the top
of an 𝑅-diamond. Let’s use the fact that we have the diamond property for 𝑅 and infer
that there exists a 𝑣 to which 𝑦 and 𝑧′ can both take single steps:
Now we can apply our induction hypothesis (assumption 3) to complete the long, lop-
sided strip of the diamond. We will conclude that there is a 𝑢 such that 𝑅∗ 𝑝 𝑢 and 𝑅∗ 𝑣 𝑢.
We actually need a 𝑢 such that 𝖱𝖳𝖢 𝑅 𝑧 𝑢, but because there is a single 𝑅-step between 𝑧
and 𝑣 we have that as well. All we need to provide metis_tac is the rules for RTC:
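Again, the step matching the packaged proof is:

> e (metis_tac [RTC_rules]);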
Goal proved.
[.] ⊢ ∀x p. R^* x p ⇒ ∀z. R x z ⇒ ∃u. R^* p u ∧ R^* z u
val it =
Initial goal proved.
⊢ ∀R. diamond R ⇒ ∀x p. R^* x p ⇒ ∀z. R x z ⇒ ∃u. R^* p u ∧ R^* z u: proof
Again we can (and should) package up the lemma, avoiding the sub-goal package
commands:
val R_RTC_diamond = store_thm( 23
"R_RTC_diamond",
‘‘!R. diamond R ==>
!x p. RTC R x p ==>
!z. R x z ==>
?u. RTC R p u /\ RTC R z u‘‘,
strip_tac >> strip_tac >> Induct_on ‘RTC‘ >> rw[] >| [
metis_tac [RTC_rules],
‘?v. R x’ v /\ R z v‘ by metis_tac [diamond_def] >>
metis_tac [RTC_rules]
]);
⋯⋄⋯
Now we can move on to proving that if 𝑅 has the diamond property, so too does 𝑅∗ .
We want to prove this by induction again. It’s very tempting to state the goal as the
obvious
!R. diamond R ==> diamond (RTC R)
but doing so will actually make it harder to apply the induction principle when the time
is right. Better to start out with a statement of the goal that is very near in form to the
induction principle. So, we manually expand the meaning of diamond and state our next
goal thus:
> g ‘!R. diamond R ==> !x y. RTC R x y ==> 24
!z. RTC R x z ==>
?u. RTC R y u /\ RTC R z u‘;
<<HOL message: inventing new type variable names: ’a>>
val it =
Proof manager status: 1 proof.
1. Incomplete goalstack:
Initial goal:
Again we strip the diamond property assumption, apply the induction principle, and strip
repeatedly:
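The commands themselves are not shown; a guess at their shape, following the description just given, is:

> e (strip_tac >> strip_tac >> Induct_on ‘RTC‘ >> rpt strip_tac);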
The first goal is again an easy one, corresponding to the case where the trip from 𝑥 to 𝑦
has been one of no steps whatsoever.
Goal proved.
[..] ⊢ ∃u. R^* x u ∧ R^* z u
Remaining subgoals:
val it =
This goal is very similar to the one we saw earlier. We have the top of a (“lop-sided”)
diamond in assumptions 1 and 4, so we can infer the existence of a common destination
for 𝑥′ and 𝑧:
At this point in the last proof we were able to finish it all off by just appealing to the
rules for RTC. This time it is not quite so straightforward. When we use the induction
hypothesis (assumption 3), we can conclude that there is a 𝑢 to which both 𝑦 and 𝑣 can
connect in zero or more steps, but in order to show that this 𝑢 is reachable from 𝑧, we
need to be able to conclude 𝑅∗ 𝑧 𝑢 when we know that 𝑅∗ 𝑧 𝑣 (assumption 6 above) and
𝑅∗ 𝑣 𝑢 (our consequence of the inductive hypothesis). We leave the proof of this general
result as an exercise, and here assume that it is already proved as the theorem RTC_RTC.
Goal proved.
[.....] ⊢ ∃u. R^* y u ∧ R^* z u
val it =
Initial goal proved.
⊢ ∀R. diamond R ⇒ ∀x y. R^* x y ⇒ ∀z. R^* x z ⇒ ∃u. R^* y u ∧ R^* z u
We can package this result up as a lemma and then prove the prettier version directly:
Then we can define parallel reduction itself. The rules look very similar to those for →.
The difference is that we allow the reflexive transition, and say that an application of
𝑥 𝑢 can be transformed to 𝑦 𝑣 if there are transformations taking 𝑥 to 𝑦 and 𝑢 to 𝑣. This,
incidentally, is why we must have reflexivity: without it, a term like (𝖪 𝑥 𝑦) 𝖪 couldn't
reduce, because while the LHS of the application (𝖪 𝑥 𝑦) can reduce, its RHS (𝖪) can't.
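The definition of the parallel reduction relation is not reproduced above; assuming -||-> has been set up as an infix mapping to a constant named predn, analogously to -->, it would be roughly:

> val (predn_rules, predn_ind, predn_cases) = Hol_reln
    ‘(!x. x -||-> x) /\
     (!x y u v. x -||-> y /\ u -||-> v ==> x # u -||-> y # v) /\
     (!x y. K # x # y -||-> x) /\
     (!f g x. S # f # g # x -||-> f # x # (g # x))‘;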
We do exactly the same thing for the reflexive and transitive closure of our parallel
reduction.
> set_fixity "-||->*" (Infix(NONASSOC, 450)); 33
val it = (): unit
∀x y. x -->* y ⇒ x -||->* y
We back-chain using our monotonicity result:
> e (match_mp_tac RTC_monotone); 36
OK..
1 subgoal:
val it =
∀x y. x --> y ⇒ x -||-> y
Now we can induct over the rules for →:
> e (Induct_on ‘x --> y‘); 37
OK..
1 subgoal:
val it =
We could split the 4-way conjunction apart into four goals, but there is no real need. It is
quite clear that each follows immediately from the rules for parallel reduction.
> e (metis_tac [predn_rules]); 38
OK..
metis: r[+0+5]#
r[+0+4]#
r[+0+8]+0+0+0+0+0+0+0+1#
r[+0+8]+0+0+0+0+0+0+0+1# ...output elided...
Goal proved.
⊢ ∀x y. x --> y ⇒ x -||-> y
val it =
Initial goal proved.
⊢ ∀x y. x -->* y ⇒ x -||->* y: proof
Packaged into a tidy little sub-goal-package-free parcel, our proof is
val RTCredn_RTCpredn = store_thm( 39
"RTCredn_RTCpredn",
‘‘!x y. x -->* y ==> x -||->* y‘‘,
match_mp_tac RTC_monotone >>
Induct_on ‘x --> y‘ >> metis_tac [predn_rules]);
⋯⋄⋯
Our next proof is in the other direction. It should be clear that we will not just be able
to appeal to the monotonicity of RTC this time: one step of the parallel reduction relation
cannot, in general, be mirrored by a single step of the original reduction relation, but it
can be mirrored by some number of such steps. Let's prove that:
> g ‘!x y. x -||-> y ==> x -->* y‘; 40
val it =
Proof manager status: 1 proof.
1. Incomplete goalstack:
Initial goal:
∀x y. x -||-> y ⇒ x -->* y
This time our induction will be over the rules defining the parallel reduction relation.
> e (Induct_on ‘x -||-> y‘); 41
OK..
1 subgoal:
val it =
(∀x. x -->* x) ∧
(∀x y x’ y’.
x -||-> y ∧ x -->* y ∧ x’ -||-> y’ ∧ x’ -->* y’ ⇒ x # x’ -->* y # y’) ∧
(∀y y’. K # y # y’ -->* y) ∧ ∀f g x. S # f # g # x -->* f # x # (g # x)
There are four conjuncts here, and it should be clear that all but the second can be
proved immediately by appeal to the rules for the transitive closure and for → itself. So,
we split apart the conjunctions and enter a THENL branch, putting in an all_tac in the
2nd position so that this falls through to be dealt with more carefully.
> e (rpt conj_tac >| [metis_tac[RTC_rules, redn_rules], 42
all_tac,
metis_tac[RTC_rules, redn_rules],
metis_tac[RTC_rules, redn_rules] ]);
OK..
metis: r[+0+11]+0+0+0+0+0+0+0+0+0+0+8+1#
metis: r[+0+11]+0+0+0+0+0+0+0+0+0+0+8+1#
metis: r[+0+3]#
1 subgoal:
val it =
∀x y x’ y’.
x -||-> y ∧ x -->* y ∧ x’ -||-> y’ ∧ x’ -->* y’ ⇒ x # x’ -->* y # y’
What of this latest sub-goal? If we look at it for long enough, we should see that it
is another monotonicity fact. More accurately, we need what is called a congruence
result for -->*. In this form, it's not quite right for easy proof. Let's go away and prove
RTCredn_ap_congruence separately. (Another exercise!) Our new theorem should state
val RTCredn_ap_congruence = store_thm( 43
"RTCredn_ap_congruence",
‘‘!x y. x -->* y ==> !z. x # z -->* y # z /\ z # x -->* z # y‘‘,
...);
Now that we have this, our sub-goal is almost immediately provable. Using it, we know
that
𝑥 𝑥′ →∗ 𝑦 𝑥′
𝑦 𝑥′ →∗ 𝑦 𝑦′
All we need to do is “stitch together” the two transitions above and go from 𝑥 𝑥′ to 𝑦 𝑦′ .
We can do this by appealing to our earlier RTC_RTC result.
> e (metis_tac [RTC_RTC, RTCredn_ap_congruence]); 44
OK..
metis: r[+0+9]+0+0+0+0+0+0+0+0+10+1+2+3+1+2+1+7+1# ...output elided...
Goal proved.
⊢ (∀x. x -->* x) ∧
(∀x y x’ y’.
x -||-> y ∧ x -->* y ∧ x’ -||-> y’ ∧ x’ -->* y’ ⇒ x # x’ -->* y # y’) ∧
(∀y y’. K # y # y’ -->* y) ∧ ∀f g x. S # f # g # x -->* f # x # (g # x)
val it =
Initial goal proved.
⊢ ∀x y. x -||-> y ⇒ x -->* y: proof
But given that we can finish off what we thought was an awkward branch with just
another application of metis_tac, we don’t need to use our fancy branching footwork at
the stage before. Instead, we can just merge the theorem lists passed to both invocations,
dispense with the rpt conj_tac and have a very short tactic proof indeed:
val predn_RTCredn = store_thm( 45
"predn_RTCredn",
‘‘!x y. x -||-> y ==> x -->* y‘‘,
Induct_on ‘x -||-> y‘ >>
metis_tac [RTC_rules, redn_rules, RTC_RTC, RTCredn_ap_congruence]);
⋯⋄⋯
Now it’s time to prove that if a number of parallel reduction steps are chained together,
then we can mirror this with some number of steps using the original reduction relation.
Our goal:
> g ‘!x y. x -||->* y ==> x -->* y‘; 46
val it =
Proof manager status: 1 proof.
1. Incomplete goalstack:
Initial goal:
∀x y. x -||->* y ⇒ x -->* y
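The intermediate (elided) step is an induction over the reflexive and transitive closure of the parallel reduction relation, presumably:

> e (Induct_on ‘RTC‘);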
This we can finish off in one step. The first conjunct is obvious, and in the second the
x -||-> y and our last result combine to tell us that x -->* y. Then this can be chained
together with the other assumption in the second conjunct and we’re done.
> e (metis_tac [RTC_rules, predn_RTCredn, RTC_RTC]); 48
OK..
metis: r[+0+12]+0+0+0+0+0+0+0+0+1+0+0+8+12+5+0+0+3+4+4+8+1#
r[+0+3]#
Goal proved.
⊢ (∀x. x -->* x) ∧ ∀x x’ y. x -||-> x’ ∧ x’ -||->* y ∧ x’ -->* y ⇒ x -->* y
val it =
Initial goal proved.
⊢ ∀x y. x -||->* y ⇒ x -->* y: proof
⋯⋄⋯
Our final act is to use what we have so far to conclude that -->* and -||->* are equal.
We state our goal:
$-||->* = $-->*
We want to now appeal to extensionality. The simplest way to do this is to rewrite with
the theorem FUN_EQ_THM:
> FUN_EQ_THM; 51
val it = ⊢ ∀f g. (f = g) ⟺ ∀x. f x = g x: thm
So, we rewrite:
> e (rw[FUN_EQ_THM]); 52
OK..
1 subgoal:
val it =
x -||->* x’ ⟺ x -->* x’
Goal proved.
⊢ x -||->* x’ ⟺ x -->* x’
val it =
Initial goal proved.
⊢ $-||->* = $-->*: proof
The characterise function specialises the theorem predn_cases with the input term,
and then simplifies. The srw_ss() simpset includes information about the injectivity and
disjointness of constructors and eliminates obvious impossibilities. For example,
> val K_predn = characterise ‘‘K‘‘; 56
val K_predn = ⊢ ∀a1. K -||-> a1 ⟺ (a1 = K): thm
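The definition of characterise is not shown above; given the description and the output just printed, it is presumably something like:

fun characterise t = SIMP_RULE (srw_ss()) [] (SPEC t predn_cases);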
Our characterise function will just have to help us in the proofs that follow.
val Kx_predn = prove( 59
‘‘!x y. K # x -||-> y = ?z. (y = K # z) /\ (x -||-> z)‘‘,
rw[characterise ‘‘K # x‘‘, predn_rules, K_predn, EQ_IMP_THM]);
What of 𝖪 𝑥 𝑦? A little thought demonstrates that there really must be two cases this
time.
By way of contrast, there is only one case for 𝖲 𝑥 𝑦 because it is not yet a “redex” at the
top-level.
Last of all, we want a characterisation for 𝑥 𝑦. What characterise gives us this time
can’t be improved upon, for all that we might look upon the four disjunctions and despair.
⋯⋄⋯
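At this point the goal being attacked is the diamond property for -||-> itself; the statement put on the goal stack (not reproduced here, but matching the theorem proved at the end of this proof) is presumably:

> g ‘!x y. x -||-> y ==> !z. x -||-> z ==> ?u. y -||-> u /\ z -||-> u‘;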
We now induct and split the goal into its individual conjuncts:
∀x y x’ y’.
x -||-> y ∧ (∀z. x -||-> z ⇒ ∃u. y -||-> u ∧ z -||-> u) ∧
x’ -||-> y’ ∧ (∀z. x’ -||-> z ⇒ ∃u. y’ -||-> u ∧ z -||-> u) ⇒
∀z. x # x’ -||-> z ⇒ ∃u. y # y’ -||-> u ∧ z -||-> u
The first goal is easily disposed of. The witness we would provide for this case is simply
z, but metis_tac will do the work for us:
> e (metis_tac [predn_rules]); 66
OK..
metis: r[+0+7]+0+0+0+0+1#
Goal proved.
⊢ ∀x z. x -||-> z ⇒ ∃u. x -||-> u ∧ z -||-> u
Remaining subgoals:
val it =
∀x y x’ y’.
x -||-> y ∧ (∀z. x -||-> z ⇒ ∃u. y -||-> u ∧ z -||-> u) ∧
x’ -||-> y’ ∧ (∀z. x’ -||-> z ⇒ ∃u. y’ -||-> u ∧ z -||-> u) ⇒
∀z. x # x’ -||-> z ⇒ ∃u. y # y’ -||-> u ∧ z -||-> u
The next goal includes two instances of terms of the form x # y -||-> z. We can use
our x_ap_y_predn theorem here. However, if we rewrite indiscriminately with it, we
will really confuse the goal. We want to rewrite just the assumption, not the instance
underneath the existential quantifier. Starting everything by repeatedly stripping can’t
lead us too far astray.
> e (rw[]); 67
OK..
1 subgoal:
val it =
We need to split up assumption 4. We can get it out of the assumption list using the
qpat_x_assum theorem-tactical. We will write
qpat_x_assum ‘_ # _ -||-> _‘
(strip_assume_tac o SIMP_RULE std_ss [x_ap_y_predn])
The quotation specifies the pattern that we want to match: we want the term that has
an application term reducing, and as there is just one such, we can use “don’t care”
underscore patterns for the various arguments. The second argument specifies how we
are going to transform the theorem. Reading the compositions from right to left, first we
will simplify with the x_ap_y_predn theorem and then we will assume the result back
into the assumptions, stripping disjunctions and existentials as we go.3
We already know that doing this is going to produce four new sub-goals (there were
four disjuncts in the x_ap_y_predn theorem). We’ll follow up the use of strip_assume_tac
with rw to eliminate any equalities that might appear as assumptions.
So:
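Putting the pieces together, the command would be roughly:

> e (qpat_x_assum ‘_ # _ -||-> _‘
       (strip_assume_tac o SIMP_RULE std_ss [x_ap_y_predn]) >>
     rw[]);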
3 An alternative to using qpat_x_assum is to use by instead: you would have to state the four-way
disjunction yourself, but the proof would be more “declarative” in style, and though wordier, might be
more maintainable.
This first sub-goal is an easy consequence of the rules for parallel reduction. Because
we’ve elided the somewhat voluminous output, we call p() to print the next sub-goal
again:
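The two (elided) steps would be something like:

> e (metis_tac [predn_rules]);  ...output elided...
> p();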
This goal requires application of the two inductive hypotheses as well as the rules for
parallel reduction, but is again straightforward for metis_tac:
Now our next goal (the third of the four) features a term K # z -||-> y in the assump-
tions. We have a theorem that pertains to just this situation. But before applying it
willy-nilly, let us try to figure out exactly what the situation is. A diagram of the current
situation might look like
[Diagram: the term K # z # x’ reduces in one step to z (by the 𝖪 rule) and to y # y’ (by the
congruence rule); we are looking for a common reduct ?u? of z and y # y’.]
Our theorem tells us that y must actually be of the form K # w for some w, and that there
must be an arrow between z and w. Thus:
> e (‘?w. (y = K # w) /\ (z -||-> w)‘ by metis_tac [Kx_predn]); 71
OK..
metis: r[+0+11]+0+0+0+0+0+1+2+0+1+1+6+1+1#
1 subgoal:
val it =
On inspection, it becomes clear that the u must be w. The first conjunct requires
K # w # y’ -||-> w, which we have because this is what Ks do, and the second conjunct
is already in the assumption list. Rewriting (eliminating that equality in the assumption
list first will make metis_tac’s job that much easier), and then first order reasoning will
solve this goal:
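A guess at the (elided) command, following the strategy just described:

> e (rw[] >> metis_tac [predn_rules]);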
Goal proved.
[....] ⊢ ∃u. y # y’ -||-> u ∧ z -||-> u
Remaining subgoals:
val it =
Goal proved.
⊢ ∀x y x’ y’.
x -||-> y ∧ (∀z. x -||-> z ⇒ ∃u. y -||-> u ∧ z -||-> u) ∧ x’ -||-> y’ ∧
(∀z. x’ -||-> z ⇒ ∃u. y’ -||-> u ∧ z -||-> u) ⇒
∀z. x # x’ -||-> z ⇒ ∃u. y # y’ -||-> u ∧ z -||-> u
Remaining subgoals:
val it =
...1 subgoal elided...
This next goal features a K # x # y -||-> z term that we have a theorem for already.
Let’s speculatively use a call to metis_tac to eliminate the simple cases immediately
(Kxy_predn is a disjunct so we’ll get two sub-goals if we don’t eliminate anything).
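Given that the next step is described as “the same strategy”, the command here was presumably of the form:

> e (rw[Kxy_predn] >> metis_tac [predn_rules]);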
Goal proved.
⊢ ∀y y’ z. K # y # y’ -||-> z ⇒ ∃u. y -||-> u ∧ z -||-> u
Remaining subgoals:
val it =
We got both cases immediately, and have moved onto the last case. We can try the same
strategy.
> e (rw[Sxyz_predn] >> metis_tac [predn_rules]); 75
OK..
metis: r[+0+3]#
metis: r[+0+9]+0+0+0+0+0+0+0+2+0+2+1+1+1#
Goal proved.
⊢ ∀f g x z. S # f # g # x -||-> z ⇒ ∃u. f # x # (g # x) -||-> u ∧ z -||-> u
val it =
Initial goal proved.
⊢ ∀x y. x -||-> y ⇒ ∀z. x -||-> z ⇒ ∃u. y -||-> u ∧ z -||-> u: proof
⋯⋄⋯
We are on the home straight. The lemma can be turned into a statement involving the
diamond constant directly:
And now we can prove that our original relation is confluent in similar fashion:
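The two (elided) results are presumably of the form

⊢ diamond $-||->        and        ⊢ confluent $-->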
6.6 Exercises
If necessary, answers to the first three exercises can be found by examining the source
file in examples/ind_def/clScript.sml.
1. Prove that 𝖱𝖳𝖢 is transitive, i.e. that ∀𝑅 𝑥 𝑦 𝑧. 𝑅∗ 𝑥 𝑦 ∧ 𝑅∗ 𝑦 𝑧 ⇒ 𝑅∗ 𝑥 𝑧.
You will need to prove the goal by induction, and will probably need to massage
it slightly first to get it to match the appropriate induction principle. Store the
theorem under the name RTC_RTC.
3. Yet another RTC induction, but where 𝑅 is no longer abstract, and is instead the
original reduction relation. Prove
𝑥 →∗ 𝑦 ⇒ ∀𝑧. 𝑥 𝑧 →∗ 𝑦 𝑧 ∧ 𝑧 𝑥 →∗ 𝑧 𝑦
Call it RTCredn_ap_congruence.
Chapter 7
Proof Tools: Propositional Logic
Users of H OL can create their own theorem proving tools by combining predefined rules
and tactics. The ML type-discipline ensures that only logically sound methods can be
used to create values of type thm. In this chapter, a real example is described.
Two implementations of the tool are given to illustrate various styles of proof pro-
gramming. The first implementation is the obvious one, but is inefficient because of the
‘brute force’ method used. The second implementation attempts to be a great deal more
intelligent. Extensions to the tools to allow more general applicability are also discussed.
The problem to be solved is that of deciding the truth of a closed formula of proposi-
tional logic. Such a formula has the general form
𝑓𝑜𝑟𝑚𝑢𝑙𝑎 ∶∶= ∀⃗𝑣. 𝜑
𝜑 ∶∶= 𝑣 | ¬𝜑 | 𝜑 ∧ 𝜑 | 𝜑 ∨ 𝜑 | 𝜑 ⇒ 𝜑 | 𝜑 = 𝜑
where the variables 𝑣 are all of boolean type, and where the universal quantification at
the outermost level captures all of the free variables.
This is a dreadful algorithm for solving this problem. The system’s built-in function,
tautLib.TAUT_CONV, solves the problem above much faster. The only real merit in this
solution is that it took one line to write. This is a general illustration of the truth that
H OL’s high-level tools, particularly the simplifier, can provide fast prototypes for a variety
of proof tasks.
Preliminaries
To begin, assume that we have code already to convert arbitrary formulas into CNF, and
to then decide the satisfiability of these formulas. Assume further that if the input to the
latter procedure is unsatisfiable, then it will return with a theorem of the form
⊢𝜑=F
or if it is satisfiable, then it will return a satisfying assignment, a map from variables to
booleans. This map will be a function from H OL variables to one of the H OL terms T or F.
Thus, we will assume
datatype result = Unsat of thm | Sat of term -> term
val toCNF : term -> thm
val DPLL : term -> result
(The theorem returned by toCNF will equate the input term to another in CNF.)
Before looking into implementing these functions, we will need to consider
• how to transform our inputs to suit the function; and
• how to use the outputs from the functions to produce our desired results
We are assuming our input is a universally quantified formula. Both the CNF and DPLL
procedures expect formulas without quantifiers. We also want to pass these procedures
the negation of the original formula. Both of the required term manipulations can be
done by functions found in the structure boolSyntax. (In general, important
theories (such as bool) are accompanied by Syntax modules containing functions for
manipulating the term-forms associated with that theory.)
In this case we need the functions
strip_forall : term -> term list * term
mk_neg : term -> term
The function strip_forall strips a term of all its outermost universal quantifications,
returning the list of variables stripped and the body of the quantification. The function
mk_neg takes a term of type bool and returns the term corresponding to its negation.
Using these functions, it is easy to see how we will be able to take ∀⃗𝑣. 𝜑 as input, and
pass the term ¬𝜑 to the function toCNF. A more significant question is how to use the
results of these calls. The call to toCNF will return a theorem
⊢ ¬𝜑 = 𝜑′
The formula 𝜑′ is what will then be passed to DPLL. (We can extract it by using the concl
and rhs functions.) If DPLL returns the theorem ⊢ 𝜑′ = F, an application of TRANS to this
and the theorem displayed above will derive the formula ⊢ ¬𝜑 = 𝐹 . In order to derive
the final result, we will need to turn this into ⊢ 𝜑. This is best done by proving a bespoke
theorem embodying the equality (there isn’t one such already in the system):
The other possibility is that DPLL will return a satisfying assignment demonstrating that
𝜑′ is satisfiable. If this is the case, we want to show that ∀⃗𝑣. 𝜑 is false. We can do this by
assuming this formula, and then specialising the universally quantified variables in line
with the provided map. In this way, it will be possible to produce the theorem (∀⃗𝑣. 𝜑) ⊢ F.
Discharging the assumption gives ⊢ (∀⃗𝑣. 𝜑) ⇒ F, which can be converted into the required
equation using the theorem
⊢ ∀𝑡. (𝑡 ⇒ F) = (𝑡 = F)
Putting all of the above together, we can write our wrapper function, which we will
call DPLL_UNIV, with the UNIV suffix reminding us that the input must be universally
quantified.
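The code for DPLL_UNIV itself is not reproduced here. The sketch below shows the shape it might take; the bespoke theorem (here called NOT_EQ_F) and all other names are reconstructions, and details (for example, exactly where REWR_CONV is used) may differ from the real implementation.

(* |- !t. (~t = F) = t -- the "bespoke theorem" mentioned above *)
val NOT_EQ_F = prove(
  ‘‘!t. (~t = F) = t‘‘,
  gen_tac >> BOOL_CASES_TAC ‘‘t:bool‘‘ >> REWRITE_TAC []);

fun DPLL_UNIV t = let
  val (vs, phi) = strip_forall t
  val cnf_eqn = toCNF (mk_neg phi)              (* |- ~phi = phi' *)
  val phi' = rhs (concl cnf_eqn)
in
  case DPLL phi' of
    Unsat phi'_eq_F =>
      (* |- ~phi = F, hence |- phi, hence |- (!vs. phi) = T *)
      EQT_INTRO (GENL vs
        (CONV_RULE (REWR_CONV NOT_EQ_F)
                   (TRANS cnf_eqn phi'_eq_F)))
  | Sat assignment => let
      (* specialise the assumed formula according to the assignment,
         then rewrite the fully instantiated formula down to F *)
      fun spec th =
        case Lib.total dest_forall (concl th) of
          NONE => REWRITE_RULE [] th
        | SOME (v, _) => spec (SPEC (assignment v) th)
    in
      EQF_INTRO (NOT_INTRO (DISCH t (spec (ASSUME t))))
    end
end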
The auxiliary function spec that is used in the second case relies on the fact that
dest_forall will raise a HOL_ERR exception if the term it is applied to is not universally
quantified. When spec’s argument is not universally quantified, this means that the
recursion has bottomed out, and all of the original formula’s universal variables have
been specialised. Then the resulting formula can be rewritten to false (REWRITE_RULE’s
built-in rewrites will handle all of the necessary cases).
The DPLL_UNIV function also uses REWR_CONV in two places. The REWR_CONV function
applies a single (first-order) rewrite at the top of a term. These uses of REWR_CONV are
done within calls to the CONV_RULE function. This lifts a conversion 𝑐 (a function taking
a term 𝑡 and producing a theorem ⊢ 𝑡 = 𝑡′ ), so that CONV_RULE 𝑐 takes the theorem ⊢ 𝑡 to
⊢ 𝑡′ .
¬(𝜙 ∧ 𝜓) = ¬𝜙 ∨ ¬𝜓
¬(𝜙 ∨ 𝜓) = ¬𝜙 ∧ ¬𝜓
𝜙 ∨ (𝜓 ∧ 𝜉) = (𝜙 ∨ 𝜓) ∧ (𝜙 ∨ 𝜉)
(𝜓 ∧ 𝜉) ∨ 𝜙 = (𝜙 ∨ 𝜓) ∧ (𝜙 ∨ 𝜉)
𝜙 ⇒ 𝜓 = ¬𝜙 ∨ 𝜓
(𝜙 = 𝜓) = (𝜙 ⇒ 𝜓) ∧ (𝜓 ⇒ 𝜙)
Under the existentially-bound x, the code has produced a formula in CNF. With an
example this small, the formula is actually bigger than that produced by the naïve
translation, but with more realistic examples, the difference quickly becomes significant.
The last example used with tautDP is 20 times bigger when translated naïvely than when
using defCNF, and the translation takes 150 times longer to perform.
But what of these extra existentially quantified variables? In fact, we can ignore the
quantification when calling the core DPLL procedure. If we pass the unquantified body
to DPLL, we will either get back an unsatisfiable verdict of the form ⊢ 𝜑′ = F, or a
satisfying assignment for all of the free variables. If the latter occurs, the same satisfying
assignment will also satisfy the original. If the former, we will perform the following
proof
⊢ 𝜑′ = F
⊢ 𝜑′ ⇒ F
⊢ ∀⃗𝑥. 𝜑′ ⇒ F
⊢ (∃⃗𝑥. 𝜑′) ⇒ F
⊢ (∃⃗𝑥. 𝜑′) = F
⊢ 𝜑 = (∃⃗𝑥. 𝜑′)
where the individual variables 𝑥𝑖 of the first formula are replaced by calls to the 𝑣
function 𝑣(𝑖), and there is just one quantified variable, 𝑣. This variation will not affect
the operation of the proof sketched above. And as long as we don’t require literals to be
variables or their negations, but also allow them to be terms of the form 𝑣(𝑖) and ¬𝑣(𝑖) as
well, then the action of the DPLL procedure on the formula 𝜑′ won’t be affected either.
Unfortunately for uniformity, in simple cases, the definitional CNF conversion functions
may not result in any existential quantifications at all. This makes our implementation
of DPLL somewhat more complicated. We calculate a body variable that will be passed
onto the CoreDPLL function, as well as a transform function that will transform an
unsatisfiability result into something of the desired form. If the result of conversion to
CNF produces an existential quantification, we use the proof sketched above. Otherwise,
the transformation can be the identity function, I:
where we have still to implement the core DPLL procedure (called CoreDPLL above). The
above code uses REWR_CONV with the IMP_F_EQ_F theorem to effect two of the proof's
transformations. The GSYM function is used to flip the orientation of a theorem's top-level
equalities. Finally, the FORALL_IMP_CONV conversion takes a term of the form
∀𝑥. 𝑃(𝑥) ⇒ 𝑄
(in which 𝑥 does not occur free in 𝑄) and converts it to the equivalent (∃𝑥. 𝑃(𝑥)) ⇒ 𝑄.
The core procedure itself takes a term (the current
formula) and a context (the current assignment) as parameters. The assignment can be
naturally represented as a set of equations, where each equation is either 𝑣 = T or 𝑣 = F.
This suggests that a natural representation for our program state is a theorem: the
hypotheses will represent the assignment, and the conclusion can be the current formula.
Of course, H OL theorems can’t just be wished into existence. In this case, we can make
everything sound by also assuming the initial formula. Thus, when we begin our initial
state will be 𝜙 ⊢ 𝜙. After splitting on variable 𝑣, we will generate two new states
𝜙, (𝑣 = T) ⊢ 𝜙1 , and 𝜙, (𝑣 = F) ⊢ 𝜙2 , where the 𝜙𝑖 are the result of simplifying 𝜙 under the
additional assumption constraining 𝑣.
The easiest way to add an assumption to a theorem is to use the rule ADD_ASSUM. But
in this situation, we also want to simplify the conclusion of the theorem with the same
assumption. This means that it will be enough to rewrite with the theorem 𝜓 ⊢ 𝜓, where
𝜓 is the new assumption. The action of rewriting with such a theorem will cause the
new assumption to appear among the assumptions of the result.
The casesplit function is thus:
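A sketch along the lines just described (the real code may differ in detail):

(* split the current state on variable v: rewrite with v = T and v = F,
   each assumed, so that the new assumption joins the hypotheses *)
fun casesplit v th = let
  val eqT = ASSUME (mk_eq(v, boolSyntax.T))
  val eqF = ASSUME (mk_eq(v, boolSyntax.F))
in
  (REWRITE_RULE [eqT] th, REWRITE_RULE [eqF] th)
end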
A case-split can result in a formula that has been rewritten all the way to true or false.
These are the recursion’s base cases. If the formula has been rewritten to true, then we
have found a satisfying assignment, one that is now stored for us in the hypotheses of
the theorem itself. The following function, mk_satmap, extracts those hypotheses into a
finite-map, and then returns the lookup function for that finite-map:
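The code is not shown above; a sketch, using a Binarymap keyed on terms (the real implementation may differ), is:

fun mk_satmap th = let
  (* add each hypothesis of the form v = b to the map; the original
     formula is not an equation, so it is skipped by the handler *)
  fun foldthis (eqn, acc) = let
    val (l, r) = dest_eq eqn
  in
    Binarymap.insert(acc, l, r)
  end handle HOL_ERR _ => acc
  val fmap = HOLset.foldl foldthis
                          (Binarymap.mkDict Term.compare)
                          (hypset th)
in
  Sat (fn v => Binarymap.find(fmap, v)
       handle Binarymap.NotFound => boolSyntax.T)
end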
The foldthis function above adds the equations that are stored as hypotheses into
the finite-map. The exception handler in foldthis is necessary because one of the
hypotheses will be the original formula. The exception handler in the function that looks
up values deals with variables that the assignment does not constrain; any value will do
for these.
If both branches of a case-split are refuted, we will be left with two theorems of the form
𝜙0 , Δ, (𝑣 = T) ⊢ F        𝜙0 , Δ, (𝑣 = F) ⊢ F
where 𝜙0 is the original formula, Δ is the rest of the current assignment, and 𝑣 is the
variable on which a split has just been performed. To turn these two theorems into the
desired
𝜙0 , Δ ⊢ F
we can case-split on 𝑣, discharging the assumption 𝑣 = T from one theorem and 𝑣 = F from
the other, and then appeal to the theorem
⊢ ∀𝑡. (𝑡 = T) ∨ (𝑡 = F)
We can put these fragments together and write the top-level CoreDPLL function, in
Figure 7.1.
All that remains to be done is to figure out which variable to case-split on. The most
important variables to split on are those that appear in what are called “unit clauses”, a
clause containing just one literal. If there is a unit clause in a formula then it is of the
form
𝜙 ∧ 𝑣 ∧ 𝜙′
or
𝜙 ∧ ¬𝑣 ∧ 𝜙′
In either situation, splitting on 𝑣 will always result in a branch that evaluates directly
to false. We thus eliminate a variable without increasing the size of the problem. The
process of eliminating unit clauses is usually called “unit propagation”. Unit propagation
is not usually thought of as a case-splitting operation, but doing it this way makes our
code simpler.
If a formula does not include a unit clause, then choice of the next variable to split on
is much more of a black art. Here we will implement a very simple choice: to split on
the variable that occurs most often. Our function find_splitting_var takes a formula
and returns the variable to split on.
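A sketch of its shape (the real code is not shown; count_vars and getBiggest are the helpers discussed in the text that follows):

fun find_splitting_var phi = let
  fun recurse acc [] = getBiggest acc
    | recurse acc (c::cs) =
        (case strip_disj c of
           [lit] => (dest_neg lit handle HOL_ERR _ => lit)   (* unit clause *)
         | lits => recurse (count_vars lits acc) cs)
in
  recurse (Binarymap.mkDict Term.compare) (strip_conj phi)
end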
This function works by handing a list of clauses to the inner recurse function. This strips
each clause apart in turn. If a clause has only one disjunct it is a unit-clause and the
variable can be returned directly. Otherwise, the variables in the clause are counted and
added to the accumulating map by count_vars, and the recursion can continue.
The count_vars function has the following implementation:
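The code itself is not shown; a sketch consistent with the description:

fun count_vars lits acc = let
  fun var_of lit = dest_neg lit handle HOL_ERR _ => lit
  fun inc (v, m) =
    case Binarymap.peek(m, v) of
      NONE => Binarymap.insert(m, v, 1)
    | SOME n => Binarymap.insert(m, v, n + 1)
in
  List.foldl inc acc (map var_of lits)
end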
The use of a binary tree to store variable data makes it efficient to update the data as
it is being collected. Extracting the variable with the largest count is then a linear scan
of the tree, which we can do with the foldl function:
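A sketch of that scan (again, the real code is not shown):

fun getBiggest acc =
  #1 (Binarymap.foldl
        (fn (v, n, (bestv, bestn)) =>
            if n > bestn then (v, n) else (bestv, bestn))
        (boolSyntax.T, 0)     (* dummy initial "best" *)
        acc)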
7.2.3 Performance
Once inputs get even a little beyond the clearly trivial, the function we have written (at the
top-level, DPLL_UNIV) performs considerably better than the truth table implementation.
For example, the generalisation of the following term, with 29 variables, takes our
function multiple seconds to demonstrate as a tautology:
val t0 = ‘‘
(s0_0 = (x_0 = ~y_0)) /\ (c0_1 = x_0 /\ y_0) /\
(s0_1 = ((x_1 = ~y_1) = ~c0_1)) /\
(c0_2 = x_1 /\ y_1 \/ (x_1 \/ y_1) /\ c0_1) /\
(s0_2 = ((x_2 = ~y_2) = ~c0_2)) /\
(c0_3 = x_2 /\ y_2 \/ (x_2 \/ y_2) /\ c0_2) /\
(s1_0 = ~(x_0 = ~y_0)) /\ (c1_1 = x_0 /\ y_0 \/ x_0 \/ y_0) /\
(s1_1 = ((x_1 = ~y_1) = ~c1_1)) /\
(c1_2 = x_1 /\ y_1 \/ (x_1 \/ y_1) /\ c1_1) /\
(s1_2 = ((x_2 = ~y_2) = ~c1_2)) /\
(c1_3 = x_2 /\ y_2 \/ (x_2 \/ y_2) /\ c1_2) /\
(c_3 = ~c_0 /\ c0_3 \/ c_0 /\ c1_3) /\
(s_0 = ~c_0 /\ s0_0 \/ c_0 /\ s1_0) /\
(s_1 = ~c_0 /\ s0_1 \/ c_0 /\ s1_1) /\
(s_2 = ~c_0 /\ s0_2 \/ c_0 /\ s1_2) /\ ~c_0 /\
(s2_0 = (x_0 = ~y_0)) /\ (c2_1 = x_0 /\ y_0) /\
(s2_1 = ((x_1 = ~y_1) = ~c2_1)) /\
(c2_2 = x_1 /\ y_1 \/ (x_1 \/ y_1) /\ c2_1) /\
(s2_2 = ((x_2 = ~y_2) = ~c2_2)) /\
(c2_3 = x_2 /\ y_2 \/ (x_2 \/ y_2) /\ c2_2) ==>
(c_3 = c2_3) /\ (s_0 = s2_0) /\ (s_1 = s2_1) /\ (s_2 = s2_2)‘‘;
val t = list_mk_forall(free_vars t0, t0);
(As is apparent from the above, if you want real speed, the built-in TAUT_PROVE function
works in less than a hundredth of a second, by using an external tool to generate the
proof of unsatisfiability, and then translating that proof back into HOL.)
Relaxing the Quantification Requirement The first step is to allow formulas that are
not closed. In order to hand on a formula that is closed to DPLL_UNIV, we can simply
generalise over the formula’s free variables. If DPLL_UNIV then says that the new, ground
formula is true, then so too will be the original. On the other hand, if DPLL_UNIV says
that the ground formula is false, then we can’t conclude anything further and will have
to raise an exception.
Code implementing this is shown below:
fun nonuniv_wrap t = let
val fvs = free_vars t
val gen_t = list_mk_forall(fvs, t)
val gen_t_eq = DPLL_UNIV gen_t
in
if rhs (concl gen_t_eq) = boolSyntax.T then let
val gen_th = EQT_ELIM gen_t_eq
in
EQT_INTRO (SPECL fvs gen_th)
end
else
raise mk_HOL_ERR "dpll" "nonuniv_wrap" "No conclusion"
end
Allowing Non-Literal Leaves We can do better than nonuniv_wrap: rather than quan-
tifying over just the free variables (which we have conveniently assumed will only be
boolean), we can turn any leaf part of the term that is not a variable or a negated
variable into a fresh variable. We first extract those boolean-valued leaves that are not
the constants true or false.
fun var_leaves acc t = let
val (l,r) = dest_conj t handle HOL_ERR _ =>
dest_disj t handle HOL_ERR _ =>
dest_imp t handle HOL_ERR _ =>
dest_bool_eq t
in
var_leaves (var_leaves acc l) r
end handle HOL_ERR _ =>
if type_of t <> bool then
raise mk_HOL_ERR "dpll" "var_leaves" "Term not boolean"
else if t = boolSyntax.T then acc
else if t = boolSyntax.F then acc
else HOLset.add(acc, t)
Note that we haven’t explicitly attempted to pull apart boolean negations (which one
might do with dest_neg). This is because dest_imp also destructs terms ~p, returning
p and F as the antecedent and conclusion. We have also used a function dest_bool_eq
designed to pull apart only those equalities which are over boolean values. Its definition
is
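A sketch of what it might look like:

fun dest_bool_eq t = let
  val (l, r) = dest_eq t
in
  if type_of l = bool then (l, r)
  else raise mk_HOL_ERR "dpll" "dest_bool_eq" "Equality not between booleans"
end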
fun DPLL_TAUT tm =
let val (univs,tm’) = strip_forall tm
val insts = HOLset.listItems (var_leaves empty_tmset tm’)
val vars = map (fn t => genvar bool) insts
val theta = map2 (curry (op |->)) insts vars
val tm’’ = list_mk_forall (vars,subst theta tm’)
in
EQT_INTRO (GENL univs
(SPECL insts (EQT_ELIM (DPLL_UNIV tm’’))))
end
Note how this code first pulls off all external universal quantifications (with strip_forall),
and then re-generalises (with list_mk_forall). The calls to GENL and SPECL undo these
manipulations, but at the level of theorems. This produces a theorem equating the
original input to true. (If the input term is not an instance of a valid propositional
formula, the call to EQT_ELIM will raise an exception.)
Exercises
1. Extend the procedure so that it handles conditional expressions (both arms of the
terms must be of boolean type).
Chapter 8
More Examples
In addition to the examples already covered in this text, the H OL distribution comes
with a variety of instructive examples in the examples directory. There the following
examples (among others) are to be found:
bmark In this directory, there is a standard HOL benchmark: the proof of correctness of
a multiplier circuit, due to Mike Gordon.
euclid.sml This example is the same as that covered in Chapter 4: a proof of Euclid’s
theorem on the infinitude of the prime numbers, extracted and modified from a
much larger development due to John Harrison. It illustrates the automation of
H OL on a classic proof.
lambda This directory develops theories of a “de Bruijn” style lambda calculus, and also
a name-carrying version. (Both are untyped.) The development is a revision of the
proofs underlying the paper “5 Axioms of Alpha Conversion”, Andy Gordon and Tom
Melham, Proceedings of TPHOLs’96, Springer LNCS 1125.
parity This sub-directory contains the files used in the parity example of Chapter 5.
Thery.sml A very short example due to Laurent Théry, demonstrating a cute inductive
proof.
RSA This directory develops some of the mathematics underlying the RSA cryptography
scheme. The theories have been produced by Laurent Théry of INRIA Sophia-
Antipolis.
References
[1] S.F. Allen, R.L. Constable, D.J. Howe and W.E. Aitken, ‘The Semantics of Reflected
Proof’, Proceedings of the 5th IEEE Symposium on Logic in Computer Science, pp.
95–105, 1990.
[2] R.S. Boyer and J S. Moore, ‘Metafunctions: Proving Them Correct and Using Them
Efficiently as New Proof Procedures’, in: The Correctness Problem in Computer
Science, edited by R.S. Boyer and J S. Moore, Academic Press, New York, 1981.
[3] A.J. Camilleri, T.F. Melham and M.J.C. Gordon, ‘Hardware Verification using
Higher-Order Logic’, in: From HDL Descriptions to Guaranteed Correct Circuit
Designs: Proceedings of the IFIP WG 10.2 Working Conference, Grenoble, September
1986, edited by D. Borrione (North-Holland, 1987), pp. 43–67.
[4] M. Davis, G. Logemann and D. Loveland, ‘A machine program for theorem proving’,
Communications of the ACM, Vol. 5 (1962), pp. 394–397.
[5] M. Gordon, ‘Why higher-order Logic is a good formalism for specifying and verifying
hardware’, in: Formal Aspects of VLSI Design: Proceedings of the 1985 Edinburgh
Workshop on VLSI, edited by G. Milne and P.A. Subrahmanyam (North-Holland,
1986), pp. 153–177.
[7] Saunders Mac Lane and Garrett Birkhoff. Algebra. Collier-MacMillan Limited, Lon-
don, 1967.
[11] L. Paulson, Logic and Computation: Interactive Proof with Cambridge LCF, Cambridge
Tracts in Theoretical Computer Science 2 (Cambridge University Press, 1987).