
[For HOL Kananaskis-12] June 18, 2018

The HOL System


TUTORIAL
Preface

This volume contains a tutorial on the HOL system. It is one of four documents making
up the documentation for HOL:

(i) LOGIC: a formal description of the higher order logic implemented by the HOL
system;

(ii) TUTORIAL: a tutorial introduction to HOL, with case studies;

(iii) DESCRIPTION: a detailed user’s guide for the HOL system;

(iv) REFERENCE: the reference manual for HOL.

These four documents will be referred to by the short names (in small slanted capitals)
given above.
This document, TUTORIAL, is intended to be the first item read by new users of HOL. It
provides a self-study introduction to the structure and use of the system. The tutorial is
intended to give a ‘hands-on’ feel for the way HOL is used, but it does not systematically
explain all the underlying principles (DESCRIPTION and LOGIC explain these). After
working through TUTORIAL the reader should be capable of using HOL for simple tasks,
and should also be in a position to consult the other documents.

Getting started
Chapter 1 explains how to get and install HOL. Once this is done, the potential HOL user
should become familiar with the following subjects:

1. The programming meta-language ML, and how to interact with it.

2. The formal logic supported by the HOL system (higher order logic) and its
manipulation via ML.

3. Goal directed proof, tactics, and tacticals.

Chapter 2 introduces ML. Chapter 4 then develops an extended example—Euclid’s
proof of the infinitude of primes—to illustrate how HOL is used to prove theorems,
demonstrating both the logic and proving properties of one’s definitions with some
high-level tactics.
Chapter 5 features another worked example: the specification and verification of a
simple sequential parity checker. The intention is to accomplish two things: (i) to present
another complete piece of work with HOL; and (ii) to give an idea of what it is like to
use the HOL system for a tricky proof. Chapter 6 is a more extensive example: the proof
of confluence for combinatory logic. Again, the aim is to present a complete piece of
non-trivial work.
Chapter 7 gives an example of implementing a proof tool of one’s own. This
demonstrates the programmability of HOL: the way in which technology for solving
specific problems can be implemented on top of the underlying kernel. With high-powered
tools to draw on, it is possible to write prototypes very quickly.
Chapter 8 briefly discusses some of the examples distributed with HOL in the examples
directory.

TUTORIAL has been kept short so that new users of HOL can get going as fast as possible.
Sometimes details have been simplified. It is recommended that as soon as a topic in
TUTORIAL has been digested, the relevant parts of DESCRIPTION and REFERENCE be studied.
Acknowledgements

The bulk of HOL is based on code written by—in alphabetical order—Hasan Amjad,
Richard Boulton, Anthony Fox, Mike Gordon, Elsa Gunter, John Harrison, Peter Homeier,
Gérard Huet (and others at INRIA), Joe Hurd, Ramana Kumar, Ken Friis Larsen, Tom
Melham, Robin Milner, Lockwood Morris, Magnus Myreen, Malcolm Newey, Michael
Norrish, Larry Paulson, Konrad Slind, Don Syme, Thomas Türk, Chris Wadsworth, and
Tjark Weber. Many others have supplied parts of the system, bug fixes, etc.

Current edition
The current edition of all four volumes (LOGIC, TUTORIAL, DESCRIPTION and REFERENCE)
has been prepared by Michael Norrish and Konrad Slind. Further contributions to these
volumes came from: Hasan Amjad, who developed a model checking library and wrote
sections describing its use; Jens Brandt, who developed and documented a library for
the rational numbers; Anthony Fox, who formalized and documented new word theories
and the associated libraries; Mike Gordon, who documented the libraries for BDDs and
SAT; Peter Homeier, who implemented and documented the quotient library; Joe Hurd,
who added material on first order proof search; and Tjark Weber, who wrote libraries for
Satisfiability Modulo Theories (SMT) and Quantified Boolean Formulae (QBF).
The material in the third edition constitutes a thorough re-working and extension of
previous editions. The only essentially unaltered piece is the semantics by Andy Pitts
(in LOGIC), reflecting the fact that, although the HOL system has undergone continual
development and improvement, the HOL logic is unchanged since the first edition (1988).


Second edition
The second edition of REFERENCE was a joint effort by the Cambridge HOL group.

First edition
The three volumes TUTORIAL, DESCRIPTION and REFERENCE were produced at the
Cambridge Research Center of SRI International with the support of DSTO Australia.
The HOL documentation project was managed by Mike Gordon, who also wrote parts
of DESCRIPTION and TUTORIAL using material based on an early paper describing the
HOL system1 and The ML Handbook 2 . Other contributors to DESCRIPTION include Avra
Cohn, who contributed material on theorems, rules, conversions and tactics, and also
composed the index (which was typeset by Juanito Camilleri); Tom Melham, who wrote
the sections describing type definitions, the concrete type package and the ‘resolution’
tactics; and Andy Pitts, who devised the set-theoretic semantics of the HOL logic and
wrote the material describing it.
The original document design used LaTeX macros supplied by Elsa Gunter, Tom Melham
and Larry Paulson. The typesetting of all three volumes was managed by Tom Melham.
The cover design is by Arnold Smith, who used a photograph of a ‘snow watching lantern’
taken by Avra Cohn (in whose garden the original object resides). John Van Tassel
composed the LaTeX picture of the lantern.
Many people other than those listed above have contributed to the HOL documentation
effort, either by providing material, or by sending lists of errors in the first edition.
Thanks to everyone who helped, and thanks to DSTO and SRI for their generous support.

1 M.J.C. Gordon, ‘HOL: a Proof Generating System for Higher Order Logic’, in: VLSI Specification,
Verification and Synthesis, edited by G. Birtwistle and P.A. Subrahmanyam, (Kluwer Academic Publishers,
1988), pp. 73–128.
2 The ML Handbook, unpublished report from Inria by Guy Cousineau, Mike Gordon, Gérard Huet,
Robin Milner, Larry Paulson and Chris Wadsworth.



In Memory of Mike Gordon


As documented in the academic literature, in material available from his archived
web-pages at the University of Cambridge Computer Laboratory, and in these manuals, Mike
Gordon was HOL’s primary creator and developer. Mike not only created a significant
piece of software, inspiring this and many other projects since, but also built a
world-leading research group in the Computer Laboratory. This research environment was
wonderfully productive for many of the system’s authors, and we all owe Mike an
enormous debt for both the original work on HOL, and for the way he and his group
supported our own development as researchers and HOL hackers.

Mike Gordon, 1948–2017


Contents

1 Getting and Installing HOL 11


1.1 Getting HOL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.2 The hol-info mailing list . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.3 Installing HOL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.3.1 Overriding smart-configure . . . . . . . . . . . . . . . . . . . . 14

2 Introduction to ML 17
2.1 How to interact with ML . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

3 Writing HOL Terms and Types 21

4 Example: Euclid’s Theorem 27


4.1 Divisibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4.1.1 Divisibility and factorial . . . . . . . . . . . . . . . . . . . . . . . 34
4.1.2 Divisibility and factorial (again!) . . . . . . . . . . . . . . . . . . 41
4.2 Primality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
4.3 Existence of prime factors . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.4 Euclid’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.5 Turning the script into a theory . . . . . . . . . . . . . . . . . . . . . . . 53
4.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

5 Example: a Simple Parity Checker 59


5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.2 Specification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.4 Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5.5.1 Exercise 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5.5.2 Exercise 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

6 Example: Combinatory Logic 73


6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.2 The type of combinators . . . . . . . . . . . . . . . . . . . . . . . . . . . 73


6.3 Combinator reductions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74


6.4 Transitive closure and confluence . . . . . . . . . . . . . . . . . . . . . . 75
6.5 Back to combinators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.5.1 Parallel reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.5.2 Using RTC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.5.3 Proving the RTCs are the same . . . . . . . . . . . . . . . . . . . 89
6.5.4 Proving a diamond property for parallel reduction . . . . . . . . 94
6.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

7 Proof Tools: Propositional Logic 105


7.1 Method 1: Truth Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
7.2 Method 2: the DPLL Algorithm . . . . . . . . . . . . . . . . . . . . . . . 106
7.2.1 Conversion to Conjunctive Normal Form . . . . . . . . . . . . . . 109
7.2.2 The Core DPLL Procedure . . . . . . . . . . . . . . . . . . . . . . 111
7.2.3 Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
7.3 Extending our Procedure’s Applicability . . . . . . . . . . . . . . . . . . 116

8 More Examples 119


Chapter 1

Getting and Installing HOL

This chapter describes how to get the HOL system and how to install it. It is generally
assumed that some sort of Unix system is being used, but the instructions that follow
should apply mutatis mutandis to other platforms. Unix is not a pre-requisite for using the
system. HOL may be run on PCs running Windows operating systems from Windows NT
onwards (i.e., Windows 2000, XP and Vista are also supported), as well as Macintoshes
running MacOS X.

1.1 Getting HOL
The HOL system can be downloaded from http://hol-theorem-prover.org. The
naming scheme for HOL releases is ⟨name⟩-⟨number⟩; the release described here is
Kananaskis-12.

1.2 The hol-info mailing list


The hol-info mailing list serves as a forum for discussing HOL and disseminating news
about it. If you wish to be on this list (which is recommended for all users of HOL), visit
http://lists.sourceforge.net/lists/listinfo/hol-info. This web-page can also
be used to unsubscribe from the mailing list.

1.3 Installing HOL
It is assumed that the HOL sources have been obtained and the tar file unpacked into a
directory hol.1 The contents of this directory are likely to change over time, but it should
contain the following:

1 You may choose another name if you want; it is not important.


Principal Files on the HOL Distribution Directory

File name Description File type


README Description of directory hol Text
COPYRIGHT A copyright notice Text
INSTALL Installation instructions Text
tools Source code for building the system Directory
bin Directory for HOL executables Directory
sigobj Directory for ML object files Directory
src ML sources of HOL Directory
help Help files for HOL system Directory
examples Example source files Directory

The session in the box below shows a typical distribution directory. The HOL
distribution has been placed on a PC running Linux in the directory /home/mn200/hol/.
All sessions in this documentation will be displayed in boxes with a number in the top
right hand corner. This number indicates whether the session is a new one (when the
number will be 1) or the continuation of a session started in an earlier box. Consecutively
numbered boxes are assumed to be part of a single continuous session. The Unix
prompt for the sessions is $, so lines beginning with this prompt were typed by the
user. After entering the HOL system (see below), the user is prompted with - for an
expression or command of the HOL meta-language ML; lines beginning with this are
thus ML expressions or declarations. Lines not beginning with $ or - are system output.
Occasionally, system output will be replaced with a line containing ... when it is of
minimal interest. The meta-language ML is introduced in Chapter 2.

$ pwd 1
/home/mn200/hol
$ ls -F
CONTRIBUTORS README doc/ sigobj/ tools/
COPYRIGHT bin/ examples/ src/ tools-poly/
INSTALL developers/ help/ std.prelude

Now you will need to rebuild HOL from the sources.2

Before beginning you must have a current version of Poly/ML or Moscow ML.3 In the
case of Moscow ML, you must have at least version 2.01. Poly/ML is available from
http://polyml.org. Moscow ML is available on the web from http://mosml.org.
When working with Poly/ML, the installation must ensure that dynamic library loading
(typically done by setting the LD_LIBRARY_PATH environment variable) picks up
libpolyml.so and libpolymain.so. If these files are in /usr/lib, nothing will need to
be changed, but other locations may require further system configuration. A sample
LD_LIBRARY_PATH initialisation command (in a file such as .bashrc) might be

declare -x LD_LIBRARY_PATH=/usr/local/lib:$HOME/lib

Do not use the --with-portable option.

2 It is possible that pre-built systems may soon be available from the web-page mentioned above.
3 We recommend using Poly/ML on all operating systems, but it requires Cygwin or the Windows Linux
sub-system on Windows.


When you have your ML system installed, and are in the root directory of the
distribution, the next step is to run smart-configure. With Moscow ML, this looks like:

$ mosml < tools/smart-configure.sml 2


Moscow ML version 2.01 (January 2004)
Enter ‘quit();’ to quit.
- [opening file "tools/smart-configure-mosml.sml"]

HOL smart configuration.

Determining configuration parameters: OS mosmldir holdir


OS: linux
mosmldir: /home/mn200/mosml/bin
holdir: /home/mn200/hol
dynlib_available: true

Configuration will begin with above values. If they are wrong


press Control-C.

If you are using Poly/ML, then write

poly < tools/smart-configure.sml

instead.
Assuming you don’t interrupt the configuration process, this will build the Holmake and
build programs, and move them into the hol/bin directory. If something goes wrong at
this stage, consult Section 1.3.1 below.
The next step is to run the build program. This should result in a great deal of output
as all of the system code is compiled and the theories built. Eventually, a HOL system4 is
produced in the bin/ directory.

$ bin/build 3
...
...
Uploading files to /home/mn200/hol/sigobj

Hol built successfully.


$

4 Four HOL executables are produced: hol, hol.noquote, hol.bare and hol.bare.noquote. The first of these
will be used for most examples in the TUTORIAL.

At this point, the system is built in your HOL directory, and cannot easily be moved to
other locations. In other words, you should unpack HOL in the location/directory where
you wish to access it for all your future work.

1.3.1 Overriding smart-configure


If smart-configure is unable to guess correct values for the various parameters (holdir,
OS etc.) then you can create a file providing the correct values. With Poly/ML, this file
should be poly-includes.ML in the tools-poly directory. With Moscow ML, it should
be config-override in the root directory of the HOL distribution. In this file, specify
the correct value for the appropriate parameter by providing an ML binding for it. All
variables except dynlib_available must be given a string as their value, while
dynlib_available must be either true or false. So, one might write
val OS = "unix"; 4
val holdir = "/local/scratch/myholdir";
val dynlib_available = false;

The config-override file need only provide values for those variables that need
overriding.
With this file in place, the smart-configure program will use the values specified
there rather than those it attempts to calculate itself. The value given for the OS variable
must be one of "unix", "linux", "solaris", "macosx" or "winNT".5
In extreme circumstances it is possible to edit the file tools/configure.sml yourself
to set configuration variables directly. (If you are using Poly/ML, you must edit
tools-poly/configure.sml instead.) At the top of this file various incomplete SML
declarations are present, but commented out. You will need to uncomment this section
(remove the (* and *) markers), and provide sensible values. All strings must be
enclosed in double quotes.
The holdir value must be the name of the top-level directory listed in the first session
above. The OS value should be one of the strings specified in the accompanying comment.
When working with Poly/ML, the poly string must be the path to the poly executable
that begins an interactive ML session. The polymllibdir must be a path to a directory
that contains the file libpolymain.a. When working with Moscow ML, the mosmldir
value must be the name of the directory containing the Moscow ML binaries (mosmlc,
mosml, mosmllex etc).
Subsequent values (CC and GNUMAKE) are needed for “optional” components of the
system. The first gives a string suitable for invoking the system’s C compiler, and the
second specifies a make program.
5 The string "winNT" is used for Microsoft Windows operating systems that are at least as recent as
Windows NT. This includes Windows 2000, XP, Vista, Windows 10 etc. Do not use "winNT" when using
Poly/ML via Cygwin or the Linux sub-system.

After editing tools/configure.sml, the lines above will look something like:

$ more configure.sml 5
...
val mosmldir = "/home/mn200/mosml";
val holdir = "/home/mn200/hol";
val OS = "linux" (* Operating system; choices are:
"linux", "solaris", "unix", "winNT" *)

val CC = "gcc"; (* C compiler (for building quote filter) *)


val GNUMAKE = "gnumake"; (* for robdd library *)
...
$

Now, at either this level (in the tools or tools-poly directory) or at the level above, the
script configure.sml must be piped into the ML interpreter (i.e., mosml or poly). For
example,

$ mosml < tools/configure.sml 6


Moscow ML version 2.01 (January 2004)
Enter ‘quit();’ to quit.
- > val mosmldir = "/home/mn200/mosml" : string
val holdir = "/home/mn200/hol" : string
val OS = "linux" : string
- > val CC = "gcc" : string
...
Beginning configuration.
- Making bin/Holmake.
...
Making bin/build.
- Making hol98-mode.el (for Emacs)
- Setting up the standard prelude.
- Setting up src/0/Globals.sml.
- Generating bin/hol.
- Generating bin/hol.noquote.
- Attempting to compile quote filter ... successful.
- Setting up the muddy library Makefile.
- Setting up the help Makefile.
-
Finished configuration!
-
$
Chapter 2

Introduction to ML

This chapter is a brief introduction to the meta-language ML. The aim is just to give a
feel for what it is like to interact with the language. A more detailed introduction can be
found in numerous textbooks and web-pages; see for example the list of resources on
the MoscowML home-page1 , or the comp.lang.ml FAQ.2

2.1 How to interact with ML


ML is an interactive programming language like Lisp. At top level one can evaluate
expressions and perform declarations. The former results in the expression’s value and
type being printed, the latter in a value being bound to a name.
A standard way to interact with ML is to configure the workstation screen so that there
are two windows:

(i) An editor window into which ML commands are initially typed and recorded.

(ii) A shell window (or non-Unix equivalent) which is used to evaluate the commands.

A common way to achieve this is to work inside Emacs with a text window and a shell
window.
After typing a command into the edit (text) window it can be transferred to the shell
and evaluated in HOL by ‘cut-and-paste’. In Emacs this is done by copying the text into a
buffer and then ‘yanking’ it into the shell. The advantage of working via an editor is that
if the command has an error, then the text can simply be edited and used again; it also
records the commands in a file which can then be used again (via a batch load) later.
In Emacs, the shell window also records the session, including both input from the user
and the system’s response. The sessions in this tutorial were produced this way. These
sessions are split into segments displayed in boxes with a number in their top right hand
corner (to indicate their position in the complete session).
The interactions in these boxes should be understood as occurring in sequence. For
example, variable bindings made in earlier boxes are assumed to persist to later ones. To
1 http://mosml.org
2 http://www.faqs.org/faqs/meta-lang-faq/


enter the HOL system, one types hol at the command-line, possibly preceded by path
information if the HOL system’s bin directory is not in one’s path. The HOL system then
prints a sign-on message and puts one into ML. The ML prompt varies depending on
the implementation. In Poly/ML, the implementation assumed for our sessions here,
the prompt is >, so lines beginning with > are typed by the user, and other lines are the
system’s responses.
$ bin/hol 1

---------------------------------------------------------------------
HOL-4 [Kananaskis 12 (stdknl, built Mon Jun 18 16:18:42 2018)]

For introductory HOL help, type: help "hol";


To exit type <Control>-D
---------------------------------------------------------------------

> 1 :: [2,3,4,5];
val it = [1, 2, 3, 4, 5]: int list

The ML expression 1 :: [2,3,4,5] has the form 𝑒1 𝑜𝑝 𝑒2 where 𝑒1 is the expression 1
(whose value is the integer 1), 𝑒2 is the expression [2,3,4,5] (whose value is a list of
four integers) and 𝑜𝑝 is the infixed operator ‘::’ which is like Lisp’s cons function. Other
list processing functions include hd (𝑐𝑎𝑟 in Lisp), tl (𝑐𝑑𝑟 in Lisp) and null (𝑛𝑢𝑙𝑙 in Lisp).
The semicolon ‘;’ terminates a top-level phrase. The system’s response is shown on the
line starting with the word val. It consists of the value of the expression followed, after
a colon, by its type. The ML type checker infers the type of expressions using methods
invented by Robin Milner [8]. The type int list is the type of ‘lists of integers’; list is
a unary type operator. The type system of ML is very similar to the type system of the
HOL logic.
The value of the last expression evaluated at the top-level not otherwise bound to a
name is always remembered in a variable called it.
> val l = it; 2
val l = [1, 2, 3, 4, 5]: int list

> tl l;
val it = [2, 3, 4, 5]: int list

> hd it;
val it = 2: int

> tl(tl(tl(tl(tl l))));


val it = []: int list

Following standard 𝜆-calculus usage, the application of a function 𝑓 to an argument
𝑥 can be written without brackets as 𝑓 𝑥 (although the more conventional 𝑓 (𝑥) is
also allowed). The expression 𝑓 𝑥1 𝑥2 ⋯ 𝑥𝑛 abbreviates the less intelligible expression
(⋯((𝑓 𝑥1 )𝑥2 )⋯)𝑥𝑛 (function application is left associative).
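Left-associative application can be seen directly in a session (a small sketch in the style of the boxes above; the function sum3 is invented here purely for illustration):

```
> fun sum3 x y z = x + y + z;
val sum3 = fn: int -> int -> int -> int

> sum3 1 2 3;
val it = 6: int

> ((sum3 1) 2) 3;
val it = 6: int
```

The partial application sum3 1 is itself a function, which is what makes the bracket-free notation work.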
Declarations have the form val 𝑥1 =𝑒1 and ⋯ and 𝑥𝑛 =𝑒𝑛 and result in the value of each
expression 𝑒𝑖 being bound to the name 𝑥𝑖 .
> val l1 = [1,2,3] and l2 = ["a","b","c"]; 3
val l1 = [1, 2, 3]: int list
val l2 = ["a", "b", "c"]: string list

ML expressions like "a", "b", "foo" etc. are strings and have type string. Any sequence
of ASCII characters can be written between the quotes.3 The function explode splits a
string into a list of single characters, which are written like single character strings, with
a # character prepended.
> explode "a b c"; 4
val it = [#"a", #" ", #"b", #" ", #"c"]: char list

An expression of the form (𝑒1 ,𝑒2 ) evaluates to a pair of the values of 𝑒1 and 𝑒2 . If 𝑒1 has
type 𝜎1 and 𝑒2 has type 𝜎2 then (𝑒1 ,𝑒2 ) has type 𝜎1 *𝜎2 . The first and second components
of a pair can be extracted with the ML functions #1 and #2 respectively. If a tuple has
more than two components, its 𝑛-th component can be extracted with a function #𝑛.
The values (1,2,3), (1,(2,3)) and ((1,2), 3) are all distinct and have types
int * int * int, int * (int * int) and (int * int) * int respectively.
> val triple1 = (1,true,"abc"); 5
val triple1 = (1, true, "abc"): int * bool * string

> #2 triple1;
val it = true: bool

> val triple2 = (1, (true, "abc"));


val triple2 = (1, (true, "abc")): int * (bool * string)

> #2 triple2;
val it = (true, "abc"): bool * string
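Continuing in the same style, the left-nested flavour behaves differently again (a sketch following the session conventions above):

```
> val triple3 = ((1,2),3);
val triple3 = ((1, 2), 3): (int * int) * int

> #1 triple3;
val it = (1, 2): int * int
```

Here #1 returns the nested pair itself, confirming that ((1,2),3) is a pair whose first component is a pair, not a flat triple.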

The ML expressions true and false denote the two truth values of type bool.
ML types can contain the type variables ’a, ’b, ’c, etc. Such types are called polymorphic.
A function with a polymorphic type should be thought of as possessing all the types
obtainable by replacing type variables by types. This is illustrated below with the function
zip.
Functions are defined with declarations of the form fun 𝑓 𝑣1 … 𝑣𝑛 = 𝑒 where each 𝑣𝑖
is either a variable or a pattern built out of variables.
The function zip, below, converts a pair of lists ([𝑥1 ,…,𝑥𝑛 ], [𝑦1 ,…,𝑦𝑛 ]) to a list of
pairs [(𝑥1 ,𝑦1 ),…,(𝑥𝑛 ,𝑦𝑛 )].
3 Newlines must be written as \n, and quotes as \".

> fun zip(l1,l2) = 6


if null l1 orelse null l2 then []
else (hd l1,hd l2) :: zip(tl l1,tl l2);
val zip = fn: ’a list * ’b list -> (’a * ’b) list

> zip([1,2,3],["a","b","c"]);
val it = [(1, "a"), (2, "b"), (3, "c")]: (int * string) list

Functions may be curried, i.e. take their arguments ‘one at a time’ instead of as a tuple.
This is illustrated with the function curried_zip below:

> fun curried_zip l1 l2 = zip(l1,l2); 7


val curried_zip = fn: ’a list -> ’b list -> (’a * ’b) list

> fun zip_num l2 = curried_zip [0,1,2] l2;


val zip_num = fn: ’a list -> (int * ’a) list

> zip_num ["a","b","c"];


val it = [(0, "a"), (1, "b"), (2, "c")]: (int * string) list

The evaluation of an expression either succeeds or fails. In the former case, the
evaluation returns a value; in the latter case the evaluation is aborted and an exception is
raised. This exception is passed to whatever invoked the evaluation. This context can either
propagate the failure (this is the default) or it can trap it. These two possibilities are
illustrated below. An exception trap is an expression of the form 𝑒1 handle _ => 𝑒2 . An
expression of this form is evaluated by first evaluating 𝑒1 . If the evaluation succeeds (i.e.
doesn’t fail) then the value of the whole expression is the value of 𝑒1 . If the evaluation of
𝑒1 raises an exception, then the value of the whole is obtained by evaluating 𝑒2 .4

> 3 div 0; 8
Exception- Div raised

> 3 div 0 handle _ => 0;


val it = 0: int

The sessions above are enough to give a feel for ML. In the next chapter, the syntax of
the logic supported by the HOL system (higher order logic) will be introduced.

4 This description of exception handling is actually a gross simplification of the way exceptions can be
handled in ML; consult a proper text for a better explanation.
Chapter 3

Writing HOL Terms and Types

The ML language is literally the HOL system’s meta-language. We use it to manipulate
and interact with the terms, types and theorems of higher-order logic. Ultimately this is
done with ML functions and values, but a hugely significant part of the user’s connection
to the logic is via the system’s parser and pretty-printer.
The parser allows the user to write terms and types in a pleasant textual form (rather
than constructing them directly with the kernel’s underlying ML API).1 The pretty-printer
shows the user terms, types and theorems in a pretty way. We believe it greatly helps the
user experience to see 𝑝 ∧ 𝑞 rather than something like

COMB (COMB (CONST ("bool", "/\",
                   TYPE("min", "fun",
                        [TYPE("min", "bool", []),
                         TYPE("min", "fun",
                              [TYPE("min", "bool", []),
                               TYPE("min", "bool", [])])])),
            VAR ("p", TYPE("min", "bool", []))),
      VAR ("q", TYPE("min", "bool", [])))

which is a much more accurate picture of what the term really looks like in memory.

HOL can use Unicode and ASCII


The fundamental logical connectives can usually be parsed and printed in two different
ways, with an ASCII notation, or a generally prettier Unicode form. By default, the parser
will accept either and the printer will choose to use the Unicode form. Thus

> [‘‘p ∧ q’’, ‘‘p /\ q‘‘]; 1


val it = [‘‘p ∧ q’’, ‘‘p ∧ q’’]: term list

> [‘‘∀x:𝛼. P x ⇒ ¬Q x’’, ‘‘!x:’a. P x ⇒ ~Q x’’];


val it = [‘‘∀x. P x ⇒ ¬Q x’’, ‘‘∀x. P x ⇒ ¬Q x’’]: term list

1 Note that the user cannot write theorem values directly; this would break the prover’s guarantee of
soundness!


Boolean connectives      Set operations           Other theories

∀    !                   ∈      IN                ≤    <=
∃    ?                   ∉      NOTIN             ≥    >=
¬    ~                   ∪      UNION
∧    /\                  ∩      INTER             Delimiters
∨    \/                  ⊆      SUBSET            “…”    ‘‘…‘‘
⇒    ==>                 ∅      EMPTY             ‘…’    ‘…‘
⟺    <=>                 𝕌(:𝛼)  univ(:’a)
⟺̸    <=/=>               ×      CROSS

Table 3.1: Unicode/ASCII equivalents in HOL syntax. Delimiters are the quotation marks
that delimit whole terms or types, separating them from the ML level.

It is possible to turn Unicode printing off and on by setting the PP.avoid_unicode trace:
> set_trace "PP.avoid_unicode" 1; 2
val it = (): unit
> ‘‘x ∈ A’’;
<<HOL message: inventing new type variable names: ’a>>
val it = ‘‘x IN A‘‘: term

> set_trace "PP.avoid_unicode" 0;


val it = (): unit
> ‘‘x ∈ A’’;
<<HOL message: inventing new type variable names: ’a>>
val it = ‘‘x ∈ A’’: term

Table 3.1 lists a number of Unicode/ASCII pairs. Generation of Unicode code-points is
up to the user’s environment, but modes assisting this are available for emacs and vim.
Note also that the encoding for both parsing and printing must be UTF-8, which is again
the user’s responsibility.

HOL looks like ML
One interesting (and also confusing for beginners) aspect of HOL is that its terms and
types look like ML’s. For example, the zip function in ML (from the previous chapter)
might be characterised by the HOL term that can be written:

zip (l1, l2) = if NULL l1 ∨ NULL l2 then []
               else (HD l1, HD l2) :: zip (TL l1, TL l2)

Apart from the fact that some of the relevant constants have different names (NULL vs
null for example), and apart from the use of logical disjunction (∨) instead of orelse,
the text is identical.

The following session shows the (rather involved) way in which this definition can be
made,2 allowing us to see the way the definition theorem is printed back. We can also
ask the system to print the new constant’s type:

> val zip_def =                                                              3
    tDefine "zip" ‘zip (l1, l2) = if NULL l1 ∨ NULL l2 then []
                                  else (HD l1, HD l2) :: zip (TL l1, TL l2)’
    (WF_REL_TAC ‘measure (LENGTH o FST)’ >> Cases_on ‘l1’ >> simp[]);
<<HOL message: Initialising SRW simpset ... done>>
<<HOL message: inventing new type variable names: ’a, ’b>>
Equations stored under "zip_def".
Induction stored under "zip_ind".
val zip_def =
⊢ ∀l2 l1.
zip (l1,l2) =
if NULL l1 ∨ NULL l2 then []
else (HD l1,HD l2)::zip (TL l1,TL l2): thm

> type_of ‘‘zip’’;


<<HOL message: inventing new type variable names: ’a, ’b>>
val it = ‘‘:𝛼 list # 𝛽 list -> (𝛼 # 𝛽) list’’: hol_type

Note how the pretty-printer is at liberty to make adjustments to the way the underlying
term is rendered as a string: its placement of newline and space characters is not exactly
the same as the user’s.
HOL’s language of types is also similar to, but slightly different from, ML’s: the # symbol
is used for the pair type rather than *, and the printer uses Greek letters 𝛼 and 𝛽 rather
than ’a and ’b.

HOL vs ML Traps
Lists, sets and other types with syntax for enumerating elements use a semicolon rather
than a comma to separate elements. Thus

> [1,2,3,4] (* ML *); 4


val it = [1, 2, 3, 4]: int list

> ‘‘[1;2;3;4] (* HOL *)’’;


val it = ‘‘[1; 2; 3; 4]’’: term
> type_of it;
val it = ‘‘:num list’’: hol_type

ML has three distinct types 𝜏1 * 𝜏2 * 𝜏3, (𝜏1 * 𝜏2) * 𝜏3 and 𝜏1 * (𝜏2 * 𝜏3). One might see these as
a flat triple, and two flavours of pair with a nested pair as one or other component. In
2 The usual “HOL” way to define this function, with pattern-matching, wouldn’t be so complicated.

HOL, the concrete syntax 𝜏1 # 𝜏2 # 𝜏3 maps to 𝜏1 # (𝜏2 # 𝜏3) (i.e., the infix # type operator is
right-associative).
ML uses the op keyword to remove infix status from function forms. In HOL one can
either “wrap” the operator in parentheses3 or precede it with a $-sign. Further, infixes in
ML take pairs; in HOL they are curried:

> op+ (3,4) (* ML *); 5


val it = 7: int
> map op* [(1,2), (3,4)] (* ML *);
val it = [2, 12]: int list

> EVAL ‘‘(+) 3 4 < $* 3 4 (* HOL *)’’;


val it = ⊢ 3 + 4 < 3 * 4 ⟺ T: thm
> EVAL ‘‘MAP (+) [1;2;3]’’;
val it = ⊢ MAP $+ [1; 2; 3] = [$+ 1; $+ 2; $+ 3]: thm

ML insists that arguments of datatype constructors be tuples (“uncurried”), and that
type arguments be provided to new types. HOL insists that type arguments be omitted,
and allows either form of argument to constructors (though it’s generally better practice
not to use tuples). In ML:

> datatype ’a tree = Lf | Nd of (’a tree * ’a * ’a tree); 6


datatype ’a tree = Lf | Nd of ’a tree * ’a * ’a tree
> fun size Lf = 0 | size (Nd(l,_,r)) = 1 + size l + size r;
val size = fn: ’a tree -> int

In HOL:
> Datatype ‘tree = Lf | Nd tree 𝛼 tree’; 7
<<HOL message: Defined type: "tree">>
val it = (): unit

> type_of ‘‘Nd’’;


<<HOL message: inventing new type variable names: ’a>>
val it = ‘‘:𝛼 tree -> 𝛼 -> 𝛼 tree -> 𝛼 tree’’: hol_type

> Define ‘(size Lf = 0) ∧ (size (Nd l _ r) = 1 + size l + size r)’;


<<HOL message: inventing new type variable names: ’a>>
Definition has been stored under "size_def"
val it = ⊢ (size Lf = 0) ∧ ∀l v0 r. size (Nd l v0 r) = 1 + size l + size r:
thm

ML uses ~ as the unary negation operator on numeric types. HOL allows it in this role
(as well as for boolean negation), but also allows - for numeric negation. First the ML
behaviour:
3 But watch out for the * operator; one can’t wrap this in parentheses because the result then looks like comment syntax.

> ~3; 8
val it = ~3: int
> -3;
Exception- (-) has infix status but was not preceded by op.
Type error in function application.
Function: - : int * int -> int
Argument: 3 : int
Reason: Can’t unify int to int * int (Incompatible types)
Fail "Static Errors" raised

In HOL:
> load "intLib"; ...output elided... 9
> EVAL ‘‘~3 + 4’’;
val it = ⊢ -3 + 4 = 1: thm
> EVAL ‘‘-3 * 4’’;
val it = ⊢ -3 * 4 = -12: thm
Chapter 4

Example: Euclid’s Theorem

In this chapter, we prove in HOL that for every number, there is a prime number that
is larger, i.e., that the prime numbers form an infinite sequence. This proof has been
excerpted and adapted from a much larger example due to John Harrison, in which he
proved the 𝑛 = 4 case of Fermat’s Last Theorem. The proof development is intended to
serve as an introduction to performing high-level interactive proofs in HOL.1 Many of the
details may be difficult to grasp for the novice reader; nonetheless, it is recommended
that the example be followed through in order to gain a true taste of using HOL to prove
non-trivial theorems.
Some tutorial descriptions of proof systems show the system performing amazing
feats of automated theorem proving. In this example, we have not taken this approach;
instead, we try to show how one actually goes about the business of proving theorems
in HOL: when more than one way to prove something is possible, we will consider the
choices; when a difficulty arises, we will attempt to explain how to fight one’s way clear.
One ‘drives’ HOL by interacting with the ML top-level loop, perhaps mediated via an
editor such as emacs or vim. In this interaction style, ML function calls are made to bring
in already-established logical context, e.g., via load; to define new concepts, e.g., via
Datatype, Define, and Hol_reln; and to perform proofs using the goalstack interface
and the proof tools from bossLib (or, if they fail to do the job, from lower-level libraries).
Let’s get started. First, we start the system with the command <holdir>/bin/hol. We
then “open” the arithmetic theory; this means that all of the ML bindings from the HOL
theory of arithmetic are made available at the top level.

> open arithmeticTheory; ...output elided... 1

We now begin the formalization. In order to define the concept of prime number, we
first need to define the divisibility relation:

> val divides_def = Define‘divides a b = ?x. b = a * x‘; 2


Definition has been stored under "divides_def"
val divides_def = ⊢ ∀a b. divides a b ⟺ ∃x. b = a * x: thm

Note how we are using ASCII notation to input our terms (? is the ASCII way to write
the existential quantifier), but the system responds with pleasant Unicode. Unicode
1 The proofs discussed below may be found in examples/euclid.sml of the HOL distribution.


characters can also be used in the input. Also note how equality on booleans gets printed
as the if-and-only-if arrow, while equality on natural numbers stays as an equality. The
underlying constant (equality) is the same in both cases, as is implied by the fact that
one can use = in both places in the input, but the system tries to be helpful when printing.
The definition is added to the current theory with the name divides_def, and also
returned from the invocation of Define. We take advantage of this and make an ML
binding of the name divides_def to the definition. In the usual way of interacting
with H OL, such an ML binding is made for each definition and (useful) proved theorem:
the ML environment is thus being used as a convenient place to hold definitions and
theorems for later reference in the session.
We want to treat divides as a (non-associating) infix:

> set_fixity "divides" (Infix(NONASSOC, 450)); 3


val it = (): unit

Next we define the property of a number being prime: a number 𝑝 is prime if and only if
it is not equal to 1 and it has no divisors other than 1 and itself:

> val prime_def = 4


Define ‘prime p = p <> 1 /\ !x. x divides p ==> (x=1) \/ (x=p)‘;
Definition has been stored under "prime_def"
val prime_def = ⊢ ∀p. prime p ⟺ p ≠ 1 ∧ ∀x. x divides p ⇒ (x = 1) ∨ (x = p):
thm

There is more ASCII syntax to observe here: <> for not-equals, and ! for the universal
quantifier.
That concludes the definitions to be made. Now we “just” have to prove that there are
infinitely many prime numbers. If we were coming to this problem fresh, then we would
have to go through a not-well-understood and often tremendously difficult process of
finding the right lemmas required to prove our target theorem.2 Fortunately, we are
working from an already completed proof and can devote ourselves to the far simpler
problem of explaining how to prove the required theorems.

Proof tools The development will illustrate that there is often more than one way
to tackle a HOL proof, even if one has only a single (informal) proof in mind. In this
example, we often find proofs by using the rewriter rw to unwind definitions and perform
basic simplifications, often reducing a goal to its essence.

> rw; 5
val it = fn: thm list -> tactic

2 This is of course a general problem in doing any kind of proof.



When rw is applied to a list of theorems, the theorems will be added to HOL’s built-in
database of useful facts as supplementary rewrite rules. We will see that rw is also
somewhat knowledgeable about arithmetic.3 Sometimes simplification with rw proves
the goal immediately. Often however, we are left with a goal that requires some study
before one realizes what lemmas are needed to conclude the proof. Once these lemmas
have been proven, or located in ancestor theories, metis_tac4 can be invoked with
them, with the expectation that it will find the right instantiations needed to finish the
proof. Note that these two operations, simplification and resolution-style automatic
proof search, will not suffice to perform all the proofs in this example; in particular, our
development will also need case analysis and induction.

Finding theorems This raises the following question: how does one find the right
lemmas and rewrite rules to use? This is quite a problem, especially since the number of
ancestor theories, and the theorems in them, is large. There are several possibilities:

• The help system can be used to look up definitions and theorems, as well as proof
procedures; for example, an invocation of

help "arithmeticTheory"

will display all the definitions and theorems that have been stored in the theory of
arithmetic. However, the complete name of the item being searched for must be
known in advance for the help system to be of use, so the following two search
facilities are often more convenient.

• DB.match allows the use of patterns to locate the sought-for theorem. Any stored
theorem having an instance of the pattern as a subterm will be returned.

• DB.find will use fragments of names as keys with which to look up information.
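For example, assuming arithmeticTheory has been loaded, one might hunt for the
multiplication clauses by name fragment:

   > DB.find "MULT_CLAUSES";

This returns, roughly, all stored definitions and theorems whose names contain the
string "MULT_CLAUSES"; we elide the output here, since the exact list depends on which
theories are loaded.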

Tactic composition Once a proof of a proposition has been found, it is customary,
although not necessary, to embark on a process of revision, in which the original sequence
of tactics is composed into a single tactic. Sometimes the resulting tactic is much shorter,
and more aesthetically pleasing in some sense. Some users spend a fair bit of time
polishing these tactics, although there doesn’t seem to be much real benefit in doing so,
since they are ad hoc proof recipes, one for each theorem. In the following, we will show
how this is done in a few cases.
3 Linear arithmetic especially: purely universal formulas involving the operators SUC, +, −, numeric literals, <, ≤, >, ≥, =, and multiplication by numeric literals.


4 The metis_tac tactic performs resolution-style first-order proof search.

4.1 Divisibility
We start by proving a number of theorems about the divides relation. We will see
that each of these initial theorems can be proved with a single invocation of metis_tac.
Both rw and metis_tac are quite powerful reasoners, and the choice of a reasoner in a
particular situation is a matter of experience. The major reason that metis_tac works so
well is that divides is defined by means of an existential quantifier, and metis_tac is
quite good at automatically instantiating existentials in the course of proof. For a simple
example, consider proving ∀𝑥. 𝑥 𝚍𝚒𝚟𝚒𝚍𝚎𝚜 0. A new proposition to be proved is entered
to the proof manager via “g”, which starts a fresh goalstack:

> g ‘!x. x divides 0‘; 6


val it =
Proof manager status: 1 proof.
1. Incomplete goalstack:
Initial goal:

∀x. x divides 0

The proof manager tells us that it has only one proof to manage, and echoes the given
goal. Now we expand the definition of divides. Notice that 𝛼-conversion takes place in
order to keep distinct the 𝑥 of the goal and the 𝑥 in the definition of divides:

> e (rw[divides_def]); 7
OK..
<<HOL message: Initialising SRW simpset ... done>>
1 subgoal:
val it =

∃x’. (x = 0) ∨ (x’ = 0)

It is of course quite easy to instantiate the existential quantifier by hand.

> e (qexists_tac ‘0‘); 8


OK..
1 subgoal:
val it =

(x = 0) ∨ (0 = 0)

Then a simplification step finishes the proof.



> e (rw[]); 9
OK.. ...output elided...

Goal proved.
⊢ ∃x’. (x = 0) ∨ (x’ = 0)
val it =
Initial goal proved.
⊢ ∀x. x divides 0: proof

What just happened here? The application of rw to the goal decomposed it to an empty
list of subgoals; in other words the goal was proved by rw. Once a goal has been proved,
it is popped off the goalstack, pretty-printed to the output, and the theorem becomes
available for use by the next level of the stack. When all the sub-goals required by that level
are proven, the corresponding goal at that level can be proven too. This ‘unwinding’
process continues until the stack is empty, or until it hits a goal with more than one
remaining unproved subgoal. This process may be hard to visualize,5 but that doesn’t
matter, since the goalstack was expressly written to allow the user to ignore such details.
We can sequence tactics with the >> operator (also known as THEN). If our three
interactions above are joined together with >> to form a single tactic, we can try the
proof again from the beginning (using the restart function) and this time it will take
just one step:
> restart(); ...output elided... 10

> e (rw[divides_def] >> qexists_tac ‘0‘ >> rw[]);


OK..
val it =
Initial goal proved.
⊢ ∀x. x divides 0: proof

We have seen one way to prove the theorem. However, as mentioned earlier, there is
another: one can let metis_tac expand the definition of divides and find the required
instantiation for x’ from the theorem MULT_CLAUSES.6
> restart(); ...output elided... 11

> e (metis_tac [divides_def, MULT_CLAUSES]);


OK..
metis: r[+0+10]+0+0+0+1+2#
val it =
Initial goal proved.
⊢ ∀x. x divides 0: proof

As it runs, metis_tac prints out some possibly interesting diagnostics. In any case,
having done our proof inside the goalstack package, we now want to have access to the
5 Perhaps since we have used a stack to implement what is notionally a tree!
6 You might like to try typing MULT_CLAUSES into the interactive loop to see exactly what it states.

theorem value that we have proved. We use the top_thm function to do this, and then
use drop to dispose of the stack:
> val DIVIDES_0 = top_thm(); 12
val DIVIDES_0 = ⊢ ∀x. x divides 0: thm
> drop();
OK..
val it = There are currently no proofs.: proofs

We have used metis_tac in this way to prove the following collection of theorems
about divides. As mentioned previously, the theorems supplied to metis_tac in the
following proofs did not (usually) come from thin air: in most cases some exploratory
work with the simplifier (rw) was done to open up definitions and see what lemmas
would be required by metis_tac.

(DIVIDES_0)     !x. x divides 0
                metis_tac [divides_def, MULT_CLAUSES]

(DIVIDES_ZERO)  !x. 0 divides x = (x = 0)
                metis_tac [divides_def, MULT_CLAUSES]

(DIVIDES_ONE)   !x. x divides 1 = (x = 1)
                metis_tac [divides_def, MULT_CLAUSES, MULT_EQ_1]

(DIVIDES_REFL)  !x. x divides x
                metis_tac [divides_def, MULT_CLAUSES]

(DIVIDES_TRANS) !a b c. a divides b /\ b divides c ==> a divides c
                metis_tac [divides_def, MULT_ASSOC]

(DIVIDES_ADD)   !d a b. d divides a /\ d divides b ==> d divides (a+b)
                metis_tac [divides_def, LEFT_ADD_DISTRIB]

(DIVIDES_SUB)   !d a b. d divides a /\ d divides b ==> d divides (a-b)
                metis_tac [divides_def, LEFT_SUB_DISTRIB]

(DIVIDES_ADDL)  !d a b. d divides a /\ d divides (a+b) ==> d divides b
                metis_tac [ADD_SUB, ADD_SYM, DIVIDES_SUB]

(DIVIDES_LMUL)  !d a x. d divides a ==> d divides (x * a)
                metis_tac [divides_def, MULT_ASSOC, MULT_SYM]

(DIVIDES_RMUL)  !d a x. d divides a ==> d divides (a * x)
                metis_tac [MULT_SYM, DIVIDES_LMUL]

We’ll assume that the above proofs have been performed, and that the appropriate
ML names have been given to all of the theorems. Now we encounter a lemma about
divisibility that doesn’t succumb to just a single invocation of metis_tac:

(DIVIDES_LE)    !m n. m divides n ==> m <= n \/ (n = 0)
                rw[divides_def] >> rw[]

Let’s see how this is proved. The easiest way to start is to simplify with the definition of
divides:
> g ‘!m n. m divides n ==> m <= n \/ (n = 0)‘; ...output elided... 13

> e (rw[divides_def]);
OK..
1 subgoal:
val it =

m ≤ m * x ∨ (m * x = 0)
This goal is a disappointing one to have the simplifier produce. Both disjuncts look as
if they should simplify further: the first looks as if we should be able to divide through
by m on both sides of the inequality, and the second looks like something we could attack
with the knowledge that one of two factors must be zero if a multiplication equals zero.
The relevant theorems justifying such steps have already been proved in arithmeticTheory;
something we can confirm with the generally useful DB.match function
DB.match : string list -> term
-> ((string * string) * (thm * class)) list
This function takes a list of theory names, and a pattern, and looks in the list of theories
for any theorem, definition, or axiom that has an instance of the pattern as a subterm.
If the list of theory names is empty, then all loaded theories are included in the search.
Let’s look in the theory of arithmetic for the subterm to be rewritten.
> DB.match ["arithmetic"] ‘‘m <= m * x‘‘; 14
val it =
[(("arithmetic", "LE_MULT_CANCEL_LBARE"),
(⊢ (m ≤ m * n ⟺ (m = 0) ∨ 0 < n) ∧ (m ≤ n * m ⟺ (m = 0) ∨ 0 < n), Thm))]:
DB.data list
This is just the theorem we’d like to use. Using DB.match again, you should now try
to find the theorem that will simplify the other disjunct. Because both are so generally
useful, rw already has both rewrites in its internal database, and all we need to do is
rewrite once more to get those rewrites applied:
> e (rw[]); 15
OK..

Goal proved.
⊢ m ≤ m * x ∨ (m * x = 0)
val it =
Initial goal proved.
⊢ ∀m n. m divides n ⇒ m ≤ n ∨ (n = 0): proof

That was gratifyingly easy! The process of finding the proof has now finished, and
all that remains is for the proof to be packaged up into the single tactic we saw above.
Rather than use top_thm and the goalstack, we can bypass them and use the store_thm
function. This function takes a string, a term and a tactic and applies the tactic to the
term to get a theorem, and then stores the theorem in the current theory under the given
name.
> val DIVIDES_LE = store_thm( 16
"DIVIDES_LE",
‘‘!m n. m divides n ==> m <= n \/ (n = 0)‘‘,
rw[divides_def] >> rw[]);
val DIVIDES_LE = ⊢ ∀m n. m divides n ⇒ m ≤ n ∨ (n = 0): thm

Storing theorems in our script record of the session in this style (rather than with the
goalstack) results in a more concise script, and also makes it easier to turn our script into
a theory file, as we do in section 4.5.

4.1.1 Divisibility and factorial


The next lemma, DIVIDES_FACT, says that every number greater than 0 and ≤ 𝑛 divides
the factorial of 𝑛. Factorial is found at arithmeticTheory.FACT and has been defined by
primitive recursion:

(FACT) (FACT 0 = 1) /\
(!n. FACT (SUC n) = SUC n * FACT n)
A polished proof of DIVIDES_FACT is the following:7

(DIVIDES_FACT)  !m n. 0 < m /\ m <= n ==> m divides (FACT n)

                ‘!p m. 0 < m ==> m divides (FACT (m + p))‘
                  suffices_by metis_tac[LESS_EQ_EXISTS] >>
                Induct_on ‘p‘ >>
                rw[FACT, ADD_CLAUSES, DIVIDES_RMUL] >>
                Cases_on ‘m‘ >>
                fs[FACT, DIVIDES_LMUL, DIVIDES_REFL]
We will examine this proof in detail, so we should first attempt to understand why
the theorem is true. What’s the underlying intuition? Suppose 0 < 𝑚 ≤ 𝑛, and so
FACT 𝑛 = 1 ∗ ⋯ ∗ 𝑚 ∗ ⋯ ∗ 𝑛. To show 𝑚 divides (FACT 𝑛) means exhibiting a 𝑞 such that
𝑞 ∗ 𝑚 = FACT 𝑛. Thus 𝑞 = FACT 𝑛 ÷ 𝑚. If we were to take this approach to the proof, we
would end up having to find and apply lemmas about ÷. This seems to take us a little
out of our way; isn’t there a proof that doesn’t use division? Well yes, we can prove the
theorem by induction on 𝑛−𝑚: in the base case, we will have to prove 𝑛 divides (FACT 𝑛),
7 This and subsequent proofs use the theorems proved on page 32, which were added to the ML environment after being proved.

which ought to be easy; in the inductive case, the inductive hypothesis seems like it
should give us what we need. This strategy for the inductive case is a bit vague, because
we are trying to mentally picture a slightly complicated formula, but we can rely on the
system to accurately calculate the cases of the induction for us. If the inductive case
turns out to be not what we expect, we will have to re-think our approach.
> g ‘!m n. 0 < m /\ m <= n ==> m divides (FACT n)‘; 17
val it =
Proof manager status: 2 proofs.
2. Completed goalstack: ⊢ ∀m n. m divides n ⇒ m ≤ n ∨ (n = 0)
1. Incomplete goalstack:
Initial goal:

∀m n. 0 < m ∧ m ≤ n ⇒ m divides FACT n


Instead of directly inducting on 𝑛 − 𝑚, we will induct on a witness variable, obtained by
use of the theorem LESS_EQ_EXISTS.
> LESS_EQ_EXISTS; 18
val it = ⊢ ∀m n. m ≤ n ⟺ ∃p. n = m + p: thm
Now we want to induct on the 𝑝 that our theorem says exists. This effectively requires
us to prove a slight restatement of the theorem. We might prove the restatement
as a separate lemma, but it is probably just as easy to do this inline with the (infix)
suffices_by tactic:
> e (‘!m p. 0 < m ==> m divides FACT(m + p)‘ 19
suffices_by metis_tac[LESS_EQ_EXISTS]);
OK..
metis: r[+0+7]+0+0+0+0+0+0+2+1#
1 subgoal:
val it =

∀m p. 0 < m ⇒ m divides FACT (m + p)


The tactic that we provide after the suffices_by checks that the first argument does
indeed imply the original goal. If that tactic succeeds (as it does here), we have a new
goal to prove. Now we can perform the induction:
> e (Induct_on ‘p‘); 20
OK..
2 subgoals:
val it =

∀m. 0 < m ⇒ m divides FACT (m + SUC p)


------------------------------------
∀m. 0 < m ⇒ m divides FACT (m + p)

∀m. 0 < m ⇒ m divides FACT (m + 0)



We now have two sub-goals to prove: a base case and a step case. The first goal the
system expects us to prove is the lowest one printed (it’s closest to the cursor): the base
case. This can obviously be simplified:

> e (rw[]); 21
OK..
1 subgoal:
val it =

m divides FACT m
------------------------------------
0 < m

Now we can do a case analysis on 𝑚: if it is 0, we have a trivial goal; if it is a successor,
then we can use the definition of FACT and the theorems DIVIDES_RMUL and
DIVIDES_REFL.

> e (Cases_on ‘m‘); 22


OK..
2 subgoals:
val it =

SUC n divides FACT (SUC n)


------------------------------------
0 < SUC n

0 divides FACT 0
------------------------------------
0 < 0

Here the first sub-goal has an assumption that is false. We can demonstrate
this to the system by using the DECIDE function to prove a simple fact about arithmetic
(namely, that no number 𝑥 is less than itself), and then passing the resulting theorem to
metis_tac, which can combine this with the contradictory assumption.

> e (metis_tac [DECIDE ‘‘!x. ~(x < x)‘‘]); 23


OK..
metis: r[+0+4]#

Goal proved.
[.] ⊢ 0 divides FACT 0

Remaining subgoals:
val it =

SUC n divides FACT (SUC n)


------------------------------------
0 < SUC n

Alternatively, we could trust that HOL’s existing theories somewhere include the fact
that less-than is irreflexive, find that theorem using DB.match (using the pattern x < x),
and then quote that theorem’s name to metis_tac.
Another alternative would be to apply the simplifier directly to the sub-goal’s assump-
tions. Certainly, the simplifier has already been primed with the irreflexivity of less-than,
so this seems natural. This can be done with the fs tactic:

> b(); ...output elided... 24


> e (fs[]);
OK..

Goal proved.
[.] ⊢ 0 divides FACT 0

Remaining subgoals:
val it =

SUC n divides FACT (SUC n)


------------------------------------
0 < SUC n

Using the theorems identified above, the remaining sub-goal can be proved with the
simplifier rw.

> e (rw [FACT, DIVIDES_LMUL, DIVIDES_REFL]); 25


OK.. ...output elided...

Goal proved.
⊢ ∀m. 0 < m ⇒ m divides FACT (m + 0)

Remaining subgoals:
val it =

∀m. 0 < m ⇒ m divides FACT (m + SUC p)


------------------------------------
∀m. 0 < m ⇒ m divides FACT (m + p)

Now we have finished the base case of the induction and can move to the step case. An
obvious thing to try is simplification with the definitions of addition and factorial:

> e (rw [FACT, ADD_CLAUSES]); 26


OK..
1 subgoal:
val it =

m divides FACT (m + p) * SUC (m + p)


------------------------------------
0. ∀m. 0 < m ⇒ m divides FACT (m + p)
1. 0 < m

And now, by DIVIDES_RMUL and the inductive hypothesis, we are done:

> e (rw[DIVIDES_RMUL]); 27
OK.. ...output elided...

Goal proved.
⊢ ∀m p. 0 < m ⇒ m divides FACT (m + p)
val it =
Initial goal proved.
⊢ ∀m n. 0 < m ∧ m ≤ n ⇒ m divides FACT n: proof

We have finished the search for the proof, and now turn to the task of making a single
tactic out of the sequence of tactic invocations we have just made. We assume that the
sequence of invocations has been kept track of in a file or a text editor buffer. We would
thus have something like the following:

e (‘!m p. 0 < m ==> m divides FACT (m + p)‘
     suffices_by metis_tac[LESS_EQ_EXISTS]);
e (Induct_on ‘p‘);
(*1*)
e (rw[]);
e (Cases_on ‘m‘);
(*1.1*)
e (fs[]);
(*1.2*)
e (rw[FACT, DIVIDES_LMUL, DIVIDES_REFL]);
(*2*)
e (rw[FACT, ADD_CLAUSES]);
e (rw[DIVIDES_RMUL]);
We have added a numbering scheme to keep track of the branches in the proof. We
can stitch the above together directly into the following compound tactic:
‘!m p. 0 < m ==> m divides FACT (m + p)‘
suffices_by metis_tac[LESS_EQ_EXISTS] >>
Induct_on ‘p‘ >| [
rw[] >> Cases_on ‘m‘ >| [
fs[],
rw[FACT, DIVIDES_LMUL, DIVIDES_REFL]
],
rw[FACT, ADD_CLAUSES] >> rw[DIVIDES_RMUL]
]
This can be tested to see that we have made no errors:
> restart(); ...output elided... 28
> e (‘!m p. 0 < m ==> m divides FACT (m + p)‘
suffices_by metis_tac[LESS_EQ_EXISTS] >>
Induct_on ‘p‘ >| [
rw[] >> Cases_on ‘m‘ >| [
fs[],
rw[FACT, DIVIDES_LMUL, DIVIDES_REFL]
],
rw[FACT, ADD_CLAUSES] >> rw[DIVIDES_RMUL]
]);
OK..
metis: r[+0+7]+0+0+0+0+0+0+2+1#
val it =
Initial goal proved.
⊢ ∀m n. 0 < m ∧ m ≤ n ⇒ m divides FACT n: proof
For many users, this would be the end of dealing with this proof: the tactic can now
be packaged into an invocation of prove8 or store_thm. However, another user might
notice that this tactic could be shortened.
8 The prove function takes a term and a tactic and attempts to prove the term using the supplied tactic.
It returns the proved theorem if the tactic succeeds. It doesn’t save the theorem to the developing theory.
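As a sketch of the prove option, the DIVIDES_0 result proved earlier could be established
and bound in one step, reusing the tactic from page 32 (the ML binding name is our
choice):

   > val DIVIDES_0 = prove(
       ‘‘!x. x divides 0‘‘,
       metis_tac [divides_def, MULT_CLAUSES]);

Unlike store_thm, this records nothing in the current theory, so it suits lemmas needed
only within the session.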

One obvious step would be to merge the two successive invocations of the simplifier in
the step case:

‘!m p. 0 < m ==> m divides FACT (m + p)‘
  suffices_by metis_tac[LESS_EQ_EXISTS] >>
Induct_on ‘p‘ >| [
Cases_on ‘m‘ >| [
fs[],
rw[FACT, DIVIDES_LMUL, DIVIDES_REFL]
],
rw[FACT, ADD_CLAUSES, DIVIDES_RMUL]
]

Now we’ll make the occasionally dangerous assumption that the simplifications of the
step case won’t interfere with what is happening in the base case, and move the step
case’s tactic to precede the first >|, using >>. When the Induct tactic generates two
sub-goals, the step case’s simplification will be applied to both of them:

> restart(); ...output elided... 29


> e (‘!m p. 0 < m ==> m divides FACT (m + p)‘
suffices_by metis_tac[LESS_EQ_EXISTS] >>
Induct_on ‘p‘ >> rw[FACT, ADD_CLAUSES, DIVIDES_RMUL]);
OK..
metis: r[+0+7]+0+0+0+0+0+0+2+1#
1 subgoal:
val it =

m divides FACT m
------------------------------------
0 < m

The step case has been dealt with, and as we hoped the base case has not been changed
at all. This means that our tactic can become

‘!m p. 0 < m ==> m divides FACT (m + p)‘
  suffices_by metis_tac[LESS_EQ_EXISTS] >>
Induct_on ‘p‘ >>
rw[FACT, ADD_CLAUSES, DIVIDES_RMUL] >>
(* base case only remains *)
Cases_on ‘m‘ >| [
  fs[],
  rw[FACT, DIVIDES_LMUL, DIVIDES_REFL]
]

In the base case, we have two invocations of the simplifier under the case-split on m. In
general, the two different simplifier invocations do slightly different things in addition to
simplifying the conclusion of the goal:

• rw strips apart the propositional structure of the goal, and eliminates equalities
from the assumptions

• fs simplifies the assumptions as well as the conclusion

However, in this case the goal where we used rw did not include any propositional
structure to strip apart, and so we can be confident that using fs in the same place
would also work. Thus, we can merge the two sub-cases of the base-case into a single
invocation of fs:

‘!m p. 0 < m ==> m divides FACT (m + p)‘
  suffices_by metis_tac[LESS_EQ_EXISTS] >>
Induct_on ‘p‘ >>
rw[FACT, ADD_CLAUSES, DIVIDES_RMUL] >>
Cases_on ‘m‘ >>
fs[FACT, DIVIDES_LMUL, DIVIDES_REFL]

We have now finished our exercise in tactic revision. Certainly, it would be hard to
foresee that this final tactic would prove the goal; the required lemmas for the final
invocation of fs have been found by an incremental process of revision.

4.1.2 Divisibility and factorial (again!)


In the previous proof, we made an initial simplification step in order to expose a variable
upon which to induct. However, the proof is really by induction on 𝑛 − 𝑚. Can we express
this directly? The answer is a qualified yes: the induction can be naturally stated, but it
leads to somewhat less natural goals.

> restart(); ...output elided... 30


> e (Induct_on ‘n - m‘);
OK..
2 subgoals:
val it =

∀n m. (SUC v = n − m) ⇒ 0 < m ∧ m ≤ n ⇒ m divides FACT n


------------------------------------
∀n m. (v = n − m) ⇒ 0 < m ∧ m ≤ n ⇒ m divides FACT n

∀n m. (0 = n − m) ⇒ 0 < m ∧ m ≤ n ⇒ m divides FACT n

This is slightly hard to read, so we sequence a call to the simplifier to strip both arms of
the proof. As before, use of >> ensures that the tactic gets applied in both branches of
the induction. (We might also use rpt strip_tac if we didn’t want the simplification to
happen.)

> b(); ...output elided... 31


> e (Induct_on ‘n - m‘ >> rw[]);
OK..
2 subgoals:
val it =

m divides FACT n
------------------------------------
0. ∀n m. (v = n − m) ⇒ 0 < m ∧ m ≤ n ⇒ m divides FACT n
1. SUC v = n − m
2. 0 < m

m divides FACT n
------------------------------------
0. n ≤ m
1. 0 < m
2. m ≤ n

Looking at the first goal, we can see (by the anti-symmetry of ≤) that 𝑚 = 𝑛. We can
prove this fact using rw, and add it to the hypotheses by use of the infix operator “by”:

> e (‘m = n‘ by rw[]); 32


OK..
1 subgoal:
val it =

m divides FACT n
------------------------------------
0. n ≤ m
1. 0 < m
2. m ≤ n
3. m = n

We can now use simplification again to propagate the newly derived equality throughout
the goal.

> e (rw[]); 33
OK..
1 subgoal:
val it =

m divides FACT m
------------------------------------
0. m ≤ m
1. 0 < m
2. m ≤ m

At this point in the previous proof we did a case analysis on 𝑚. However, we already
have the hypothesis that 𝑚 is positive (along with two other now useless hypotheses).
Thus we know that 𝑚 is the successor of some number 𝑘. We might wish to assert this
fact with an invocation of “by” as follows:

‘?k. m = SUC k‘ by <tactic>

But what is the tactic? If we try rw, it will fail since the embedded arithmetic decision
procedure doesn’t handle existential statements very well. What to do?
In fact, that earlier case analysis will again do the job: but now we hide it away so
that it is only used to prove this sub-goal. When we execute Cases_on ‘m‘, we will get a
case where m has been replaced by 0. This case will be contradictory given that
we already have an assumption 0 < m, and we can again use fs. In the other case, there
will be an assumption that m is some successor value, and this will make it easy for the
simplifier to prove the goal.
Thus:
> e (‘?k. m = SUC k‘ by (Cases_on ‘m‘ >> fs[])); 34
OK..
1 subgoal:
val it =

m divides FACT m
------------------------------------
0. m ≤ m
1. 0 < m
2. m ≤ m
3. m = SUC k

Now the tactic we used before can finish this off:


> e (fs[FACT, DIVIDES_LMUL, DIVIDES_REFL]); 35
OK.. ...output elided...

Goal proved.
[...] ⊢ m divides FACT n

Remaining subgoals:
val it =

m divides FACT n
------------------------------------
0. ∀n m. (v = n − m) ⇒ 0 < m ∧ m ≤ n ⇒ m divides FACT n
1. SUC v = n − m
2. 0 < m

That takes care of the base case. For the induction step, things look a bit more difficult
than in the earlier proof. However, we can make progress by realizing that the hypotheses
imply that 0 < 𝑛 and so we can transform 𝑛 into a successor, thus enabling the unfolding
of FACT, as in the previous proof:
> e (‘0 < n‘ by rw[] >> ‘?k. n = SUC k‘ by (Cases_on ‘n‘ >> fs[])); 36
OK..
1 subgoal:
val it =

m divides FACT n
------------------------------------
0. ∀n m. (v = n − m) ⇒ 0 < m ∧ m ≤ n ⇒ m divides FACT n
1. SUC v = n − m
2. 0 < m
3. 0 < n
4. n = SUC k
The proof now finishes in much the same manner as the previous one:
> e (rw [FACT, DIVIDES_RMUL]); 37
OK.. ...output elided...

Goal proved.
[...] ⊢ m divides FACT n
val it =
Initial goal proved.
⊢ ∀m n. 0 < m ∧ m ≤ n ⇒ m divides FACT n: proof
We leave the details of stitching the proof together to the interested reader.
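For readers who want to check their stitching, here is one plausible assembly of the steps above into a single sewn-up tactic. The subgoal ordering (base case first) follows the session output, and the name DIVIDES_FACT' is ours, not taken from the HOL sources:

```sml
(* A plausible sewn-up version of the proof by induction on n - m.
   The subgoal order and the name DIVIDES_FACT' are our assumptions. *)
val DIVIDES_FACT' = prove(
  ``!m n. 0 < m /\ m <= n ==> m divides FACT n``,
  Induct_on `n - m` >> rw[] >| [
    (* base case: the hypotheses n <= m and m <= n force m = n *)
    `m = n` by rw[] >> rw[] >>
    `?k. m = SUC k` by (Cases_on `m` >> fs[]) >>
    fs[FACT, DIVIDES_LMUL, DIVIDES_REFL],
    (* step case: turn n into a successor so FACT can unfold *)
    `0 < n` by rw[] >>
    `?k. n = SUC k` by (Cases_on `n` >> fs[]) >>
    rw[FACT, DIVIDES_RMUL]
  ]);
```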

4.2 Primality
Now we move on to establish some facts about the primality of the first few numbers: 0
and 1 are not prime, but 2 is. Also, all primes are positive. These are all quite simple to
prove.

(NOT_PRIME_0) ~prime 0
rw[prime_def,DIVIDES_0]

(NOT_PRIME_1) ~prime 1
rw[prime_def]

(PRIME_2) prime 2
rw[prime_def] >>
metis_tac [DIVIDES_LE, DIVIDES_ZERO, DECIDE ‘‘2<>0‘‘,
DECIDE ‘‘x <= 2 <=> (x=0) \/ (x=1) \/ (x=2)‘‘]

(PRIME_POS) !p. prime p ==> 0<p


Cases >> rw[NOT_PRIME_0]
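In a script file, each of these statement/tactic pairs becomes a single declaration. For instance, the first might be stored as follows; the statement and tactic are exactly those displayed above, while the use of store_thm is our choice:

```sml
(* NOT_PRIME_0 in sewn-up form, using the statement and tactic
   displayed above. *)
val NOT_PRIME_0 = store_thm(
  "NOT_PRIME_0",
  ``~prime 0``,
  rw[prime_def, DIVIDES_0]);
```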

4.3 Existence of prime factors


Now we are in position to prove a more substantial lemma: every number other than 1
has a prime factor. The proof proceeds by a complete induction on 𝑛. Complete induction
is necessary since a prime factor of 𝑛 will not, in general, be the predecessor of 𝑛. After induction, the proof
splits into cases on whether 𝑛 is prime or not. The first case (𝑛 is prime) is trivial. In the
second case, there must be an 𝑥 that divides 𝑛, and 𝑥 is not 1 or 𝑛. By DIVIDES_LE, 𝑛 = 0
or 𝑥 ≤ 𝑛. If 𝑛 = 0, then 2 is a prime that divides 0. On the other hand, if 𝑥 ≤ 𝑛, there
are two cases: if 𝑥 < 𝑛 then we can use the inductive hypothesis and by transitivity of
divides we are done; otherwise, 𝑥 = 𝑛 and then we have a contradiction with the fact
that 𝑥 is not 1 or 𝑛. The polished tactic is the following:

(PRIME_FACTOR) !n. ~(n = 1) ==> ?p. prime p /\ p divides n


completeInduct_on ‘n‘ >>
rw [] >>
Cases_on ‘prime n‘ >| [
metis_tac [DIVIDES_REFL],
‘?x. x divides n /\ x<>1 /\ x<>n‘
by metis_tac [prime_def] >>
metis_tac [LESS_OR_EQ, PRIME_2,
DIVIDES_LE,DIVIDES_TRANS,DIVIDES_0]]

We start by invoking complete induction. This gives us an inductive hypothesis that holds
at every number 𝑚 strictly smaller than 𝑛:

> g ‘!n. n <> 1 ==> ?p. prime p /\ p divides n‘; 38


val it =
Proof manager status: 1 proof.
1. Incomplete goalstack:
Initial goal:

∀n. n ≠ 1 ⇒ ∃p. prime p ∧ p divides n

> e (completeInduct_on ‘n‘);


OK..
1 subgoal:
val it =

n ≠ 1 ⇒ ∃p. prime p ∧ p divides n


------------------------------------
∀m. m < n ⇒ m ≠ 1 ⇒ ∃p. prime p ∧ p divides m

We can move the antecedent to the hypotheses and make our case split. Notice that the
term given to Cases_on need not occur in the goal:

> e (rw[] >> Cases_on ‘prime n‘); 39


OK..
2 subgoals:
val it =

∃p. prime p ∧ p divides n


------------------------------------
0. ∀m. m < n ⇒ m ≠ 1 ⇒ ∃p. prime p ∧ p divides m
1. n ≠ 1
2. ¬prime n

∃p. prime p ∧ p divides n


------------------------------------
0. ∀m. m < n ⇒ m ≠ 1 ⇒ ∃p. prime p ∧ p divides m
1. n ≠ 1
2. prime n

As mentioned, the first case is proved with the reflexivity of divisibility:

> e (metis_tac [DIVIDES_REFL]); ...output elided... 40

In the second case, we can get a divisor of 𝑛 that isn’t 1 or 𝑛 (since 𝑛 is not prime):

> e (‘?x. x divides n /\ x<>1 /\ x<>n‘ by metis_tac [prime_def]); 41


OK..
metis: r[+0+11]+0+0+0+0+0+0+1+1+1+1+0+1+1#
1 subgoal:
val it =

∃p. prime p ∧ p divides n


------------------------------------
0. ∀m. m < n ⇒ m ≠ 1 ⇒ ∃p. prime p ∧ p divides m
1. n ≠ 1
2. ¬prime n
3. x divides n
4. x ≠ 1
5. x ≠ n

At this point, the polished tactic simply invokes metis_tac with a collection of theorems.
We will attempt a more detailed exposition. Given the hypotheses, and by DIVIDES_LE,
we can assert 𝑥 < 𝑛 ∨ 𝑛 = 0 and thus split the proof into two cases:

> e (‘x < n \/ (n=0)‘ by metis_tac [DIVIDES_LE,LESS_OR_EQ]); 42


OK..
metis: r[+0+14]+0+0+0+0+0+0+0+0+0+0+1+0+1#
2 subgoals:
val it =

∃p. prime p ∧ p divides n


------------------------------------
0. ∀m. m < n ⇒ m ≠ 1 ⇒ ∃p. prime p ∧ p divides m
1. n ≠ 1
2. ¬prime n
3. x divides n
4. x ≠ 1
5. x ≠ n
6. n = 0

∃p. prime p ∧ p divides n


------------------------------------
0. ∀m. m < n ⇒ m ≠ 1 ⇒ ∃p. prime p ∧ p divides m
1. n ≠ 1
2. ¬prime n
3. x divides n
4. x ≠ 1
5. x ≠ n
6. x < n

In the first subgoal, we can see that the antecedents of the inductive hypothesis are met
and so 𝑥 has a prime divisor. We can then use the transitivity of divisibility to get the fact
that this divisor of 𝑥 is also a divisor of 𝑛, thus finishing this branch of the proof:

> e (metis_tac [DIVIDES_TRANS]); 43


OK..
metis: r[+0+11]+0+0+0+0+0+0+0+1+0+4+1+0+3+0+2+2+1#

Goal proved.
[.......] ⊢ ∃p. prime p ∧ p divides n

Remaining subgoals:
val it =

∃p. prime p ∧ p divides n


------------------------------------
0. ∀m. m < n ⇒ m ≠ 1 ⇒ ∃p. prime p ∧ p divides m
1. n ≠ 1
2. ¬prime n
3. x divides n
4. x ≠ 1
5. x ≠ n
6. n = 0

The remaining goal can be clarified by simplification:

> e (rw[]); 44
OK..
1 subgoal:
val it =

∃p. prime p ∧ p divides 0


------------------------------------
0. ∀m. m < 0 ⇒ m ≠ 1 ⇒ ∃p. prime p ∧ p divides m
1. 0 ≠ 1
2. ¬prime 0
3. x divides 0
4. x ≠ 1
5. x ≠ 0

We know that everything divides 0:

> DIVIDES_0; 45
val it = ⊢ ∀x. x divides 0: thm

So any prime will do for p.



> e (metis_tac [PRIME_2, DIVIDES_0]); 46


OK..
metis: r[+0+10]# ...output elided...

Goal proved.
[.] ⊢ n ≠ 1 ⇒ ∃p. prime p ∧ p divides n
val it =
Initial goal proved.
⊢ ∀n. n ≠ 1 ⇒ ∃p. prime p ∧ p divides n: proof

Again, work now needs to be done to compose and perhaps polish a single tactic from the
individual proof steps, but we will not describe it.[9] Instead we move forward, because
our ultimate goal is in reach.
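As noted in the footnote, the whole argument can be compressed into complete induction followed by a single metis_tac invocation. A sketch follows; whether this exact lemma list suffices is our conjecture, and the name PRIME_FACTOR' is ours:

```sml
(* A compressed form of the PRIME_FACTOR proof; the lemma list is
   our guess at what metis_tac needs. *)
val PRIME_FACTOR' = prove(
  ``!n. n <> 1 ==> ?p. prime p /\ p divides n``,
  completeInduct_on `n` >>
  metis_tac [DIVIDES_REFL, DIVIDES_LE, DIVIDES_TRANS, DIVIDES_0,
             LESS_OR_EQ, PRIME_2, prime_def]);
```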

4.4 Euclid’s theorem


Theorem. Every number has a prime greater than it.
Informal proof.
Suppose the opposite; then there’s an 𝑛 such that all 𝑝 greater than 𝑛 are not prime.
Consider FACT(𝑛) + 1: it’s not equal to 1 so, by PRIME_FACTOR, there’s a prime 𝑝 that
divides it. Note that 𝑝 also divides FACT(𝑛) because 𝑝 ≤ 𝑛. By DIVIDES_ADDL, 𝑝 must
then divide 1, and so 𝑝 = 1. But then 𝑝 is not prime, which is a contradiction.
End of proof.
A HOL rendition of the proof may be given as follows:

(EUCLID) !n. ?p. n < p /\ prime p


spose_not_then strip_assume_tac
>> mp_tac (SPEC ‘‘FACT n + 1‘‘ PRIME_FACTOR)
>> rw[FACT_LESS, DECIDE ‘‘~(x=0) = 0<x‘‘]
>> metis_tac [NOT_PRIME_1, NOT_LESS, PRIME_POS,
DIVIDES_FACT, DIVIDES_ADDL, DIVIDES_ONE]

Let’s prise this apart and look at it in some detail. A proof by contradiction can be started
by using the bossLib function spose_not_then. With it, one assumes the negation of the
current goal and then uses that in an attempt to prove falsity (F). The assumed negation
¬(∀𝑛. ∃𝑝. 𝑛 < 𝑝 ∧ prime 𝑝) is simplified a bit into ∃𝑛. ∀𝑝. 𝑛 < 𝑝 ⊃ ¬ prime 𝑝 and then is
passed to the tactic strip_assume_tac. This moves its argument to the assumption list
of the goal after eliminating the existential quantification on 𝑛.

[9] Indeed, the tactic can be simplified into complete induction followed by an invocation of METIS_TAC
with suitable lemmas.

> g ‘!n. ?p. n < p /\ prime p‘; ...output elided... 47

> e (spose_not_then strip_assume_tac);


OK..
1 subgoal:
val it =

F
------------------------------------
∀p. n < p ⇒ ¬prime p

Thus we have the hypothesis that all 𝑝 beyond a certain unspecified 𝑛 are not prime, and
our task is to show that this cannot be. At this point we take advantage of Euclid’s great
inspiration and we build an explicit term from 𝑛. In the informal proof we are asked
to ‘consider’ the term FACT 𝑛 + 1.[10] This term will have certain properties (i.e., it has a
prime factor) that lead to contradiction. Question: how do we ‘consider’ this term in the
formal HOL proof? Answer: by instantiating a lemma with it and bringing the lemma
into the proof. The lemma and its instantiation are:[11]

> PRIME_FACTOR; 48
val it = ⊢ ∀n. n ≠ 1 ⇒ ∃p. prime p ∧ p divides n: thm

> val th = SPEC ‘‘FACT n + 1‘‘ PRIME_FACTOR;


val th = ⊢ FACT n + 1 ≠ 1 ⇒ ∃p. prime p ∧ p divides FACT n + 1: thm

It is evident that the antecedent of th can be eliminated. In H OL, one could do this in a
so-called forward proof style (by proving ⊢ ¬(FACT 𝑛 + 1 = 1) and then applying modus
ponens, the result of which can then be used in the proof), or one could bring th into the
proof and simplify it in situ. We choose the latter approach.

> e (mp_tac (SPEC ‘‘FACT n + 1‘‘ PRIME_FACTOR)); 49


OK..
1 subgoal:
val it =

(FACT n + 1 ≠ 1 ⇒ ∃p. prime p ∧ p divides FACT n + 1) ⇒ F


------------------------------------
∀p. n < p ⇒ ¬prime p

The invocation mp_tac (⊢ 𝑀) applied to a goal (Δ, 𝑔) returns the goal (Δ, 𝑀 ⇒ 𝑔). Now
we simplify:
[10] The HOL parser thinks FACT 𝑛 + 1 is equivalent to (FACT 𝑛) + 1.
[11] The function SPEC implements the rule of universal specialization.

> e (rw[]); 50
OK..
2 subgoals:
val it =

¬prime p ∨ ¬(p divides FACT n + 1)


------------------------------------
∀p. n < p ⇒ ¬prime p

FACT n ≠ 0
------------------------------------
∀p. n < p ⇒ ¬prime p

We recall that zero is less than every factorial, a fact found in arithmeticTheory under
the name FACT_LESS. Thus we can solve the top goal by simplification:

> e (rw[FACT_LESS, DECIDE ‘‘!x. ~(x=0) = 0 < x‘‘]); 51


OK..

Goal proved.
⊢ FACT n ≠ 0

Remaining subgoals:
val it =

¬prime p ∨ ¬(p divides FACT n + 1)


------------------------------------
∀p. n < p ⇒ ¬prime p

Notice the ‘on-the-fly’ use of DECIDE to provide an ad hoc rewrite. Looking at the
remaining goal, one might think that our aim, to prove falsity, has been lost. But this is
not so: a goal ¬𝑃 ∨ ¬𝑄 is logically equivalent to 𝑃 ⇒ 𝑄 ⇒ 𝙵. In the following invocation,
we use the equivalence ⊢ (𝐴 ⇒ 𝐵) ⟺ ¬𝐴 ∨ 𝐵 as a rewrite rule, oriented right to left by use of
GSYM.[12]

[12] Loosely speaking, GSYM swaps the left and right hand sides of any equations it finds.

> IMP_DISJ_THM; 52
val it = ⊢ ∀A B. A ⇒ B ⟺ ¬A ∨ B: thm

> e (rw[GSYM IMP_DISJ_THM]);


OK..
1 subgoal:
val it =

¬(p divides FACT n + 1)


------------------------------------
0. ∀p. n < p ⇒ ¬prime p
1. prime p

We can quickly proceed to show that 𝑝 𝚍𝚒𝚟𝚒𝚍𝚎𝚜 (𝙵𝙰𝙲𝚃 𝑛), so that if it also divides
FACT n + 1, then 𝑝 divides 1, meaning that 𝑝 = 1. But then 𝑝 is not prime, at which
point we are done. This can all be packaged into a single invocation of metis_tac:

> e (metis_tac [DIVIDES_FACT, DIVIDES_ADDL, DIVIDES_ONE, 53


NOT_PRIME_1, NOT_LESS, PRIME_POS]);
OK..
metis: r[+0+12]+0+0+0+0+0+0+0+0+0+0+0+1+1+1+1+1+4# ...output elided...

Goal proved.
[.] ⊢ F
val it =
Initial goal proved.
⊢ ∀n. ∃p. n < p ∧ prime p: proof

Euclid’s theorem is now proved, and we can rest. However, this presentation of the
final proof will be unsatisfactory to some, because the proof is completely hidden
in the invocation of the automated reasoner. Well then, let’s try another proof, this
time employing the so-called ‘assertional’ style. When used uniformly, this can allow
a readable linear presentation that mirrors the informal proof. The following proves
Euclid’s theorem in the assertional style. We think it is fairly readable, certainly much
more so than the standard tactic proof just given.[13]

[13] Note that CCONTR_TAC, which is used to start the proof, initiates a proof by contradiction by negating
the goal and placing it on the hypotheses, leaving F as the new goal.

(AGAIN) !n. ?p. n < p /\ prime p


CCONTR_TAC >>
‘?n. !p. n < p ==> ~prime p‘ by metis_tac [] >>
‘~(FACT n + 1 = 1)‘ by rw[FACT_LESS,
DECIDE‘‘~(x=0)=0<x‘‘] >>
‘?p. prime p /\
p divides (FACT n + 1)‘ by metis_tac [PRIME_FACTOR] >>
‘0 < p‘ by metis_tac [PRIME_POS] >>
‘p <= n‘ by metis_tac [NOT_LESS] >>
‘p divides FACT n‘ by metis_tac [DIVIDES_FACT] >>
‘p divides 1‘ by metis_tac [DIVIDES_ADDL] >>
‘p = 1‘ by metis_tac [DIVIDES_ONE] >>
‘~prime p‘ by metis_tac [NOT_PRIME_1] >>
metis_tac []

4.5 Turning the script into a theory


Having proved our result, we probably want to package it up in a way that makes it
available to future sessions, but which doesn’t require us to go all through the theorem-
proving effort again. Even having a complete script from which it would be possible to
cut-and-paste is an error-prone solution.
In order to do this we need to create a file with the name 𝑥Script.sml, where 𝑥 is the
name of the theory we wish to export. This file then needs to be compiled. In fact, we
really do use the ML compiler, carefully augmented with the appropriate logical context.
However, the language accepted by the compiler is not quite the same as that accepted
by the interactive system, so we will need to do a little work to massage the original
script into the correct form.
We’ll give an illustration of converting to a form that can be compiled using the script

<holdir>/examples/euclid.sml

as our base-line. This file is already close to being in the right form. It has all of the
proofs of the theorems in “sewn-up” form so that when run, it does not involve the
goal-stack at all. In its given form, it can be run as input to hol thus:

$ cd examples/ 1
$ ../bin/hol < euclid.sml
...

> val EUCLID = |- !n. ?p. n < p /\ prime p : thm


...

> val EUCLID_AGAIN = |- !n. ?p. n < p /\ prime p : thm


-

However, we now want to create a euclidTheory that we can load in other interactive
sessions. So, our first step is to create a file euclidScript.sml, and to copy the body of
euclid.sml into it.
The first non-comment line opens arithmeticTheory. However, when writing for the
compiler, we need to explicitly mention the other H OL modules that we depend on. We
must add
open HolKernel boolLib Parse bossLib
The next line that poses a difficulty is
set_fixity "divides" (Infixr 450);
While it is legitimate to type expressions directly into the interactive system, the compiler
requires that every top-level phrase be a declaration. We satisfy this requirement by
altering this line into a “do nothing” declaration that does not record the result of the
expression:
val _ = set_fixity "divides" (Infixr 450)
The only extra changes are to bracket the rest of the script text with calls to new_theory
and export_theory. So, before the definition of divides, we add:
val _ = new_theory "euclid";
and at the end of the file:
val _ = export_theory();
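Putting these edits together, euclidScript.sml has the following overall shape; the definitions and sewn-up proofs copied from euclid.sml are elided:

```sml
(* The skeleton of euclidScript.sml after the changes described above. *)
open HolKernel boolLib Parse bossLib arithmeticTheory;

val _ = new_theory "euclid";

val _ = set_fixity "divides" (Infixr 450);

(* ... the definitions and sewn-up proofs from euclid.sml go here ... *)

val _ = export_theory();
```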
Now, we can compile the script we have created using the Holmake tool. To keep things
a little tidier, we first move our script into a new directory.
$ mkdir euclid 2
$ mv euclidScript.sml euclid
$ cd euclid
$ ../../bin/Holmake
Analysing euclidScript.sml
Trying to create directory .HOLMK for dependency files
Compiling euclidScript.sml
Linking euclidScript.uo to produce theory-builder executable
<<HOL message: Created theory "euclid".>>
Definition has been stored under "divides_def".
Definition has been stored under "prime_def".
Meson search level: .....
Meson search level: .................
...
Exporting theory "euclid" ... done.
Analysing euclidTheory.sml
Analysing euclidTheory.sig
Compiling euclidTheory.sig
Compiling euclidTheory.sml

Now we have created four new files, various forms of euclidTheory with four different
suffixes. Only euclidTheory.sig is really suitable for human consumption. While still
in the euclid directory that we created, we can demonstrate:

$ ../../bin/hol 3
[...]

[closing file "/local/scratch/mn200/Work/hol98/tools/end-init-boss.sml"]


- load "euclidTheory";
> val it = () : unit
- open euclidTheory;
> type thm = thm
val DIVIDES_TRANS =
|- !a b c. a divides b /\ b divides c ==> a divides c
: thm
...
val DIVIDES_REFL = |- !x. x divides x : thm
val DIVIDES_0 = |- !x. x divides 0 : thm

4.6 Summary
The reader has now seen an interesting theorem proved, in great detail, in H OL. The
discussion illustrated the high-level tools provided in bossLib and touched on issues
including tool selection, undo, ‘tactic polishing’, exploratory simplification, and the
‘forking-off’ of new proof attempts. We also attempted to give a flavour of the thought
processes a user would employ. Following is a more-or-less random collection of other
observations.

• Even though the proof of Euclid’s theorem is short and easy to understand when
presented informally, a perhaps surprising amount of support development was
required to set the stage for Euclid’s classic argument.

• The proof support offered by bossLib (rw, metis_tac, DECIDE, Cases_on, Induct_on,
and the “by” construct) was nearly complete for this example: it was rarely neces-
sary to resort to lower-level tactics.

• Simplification is a workhorse tactic; even when an automated reasoner such as


metis_tac is used, its application has often been set up by some exploratory simpli-
fications. It therefore pays to become familiar with the strengths and weaknesses
of the simplifier.

• A common problem with interactive proof systems is dealing with hypotheses.


Often metis_tac and the “by” construct allow the use of hypotheses without
directly resorting to indexing into them (or naming them, which amounts to the

same thing). This is desirable, since the hypotheses are notionally a set, and
moreover, experience has shown that profligate indexing into hypotheses results in
hard-to-maintain proof scripts.
We also found that we could directly simplify in the assumptions by using the fs
tactic. Nonetheless, it can be clumsy to work with a large set of hypotheses, in
which case the following approaches may be useful.
One can directly refer to hypotheses by using UNDISCH_TAC (makes the designated
hypothesis the antecedent to the goal), ASSUM_LIST (gives the entire hypothesis
list to a tactic), pop_assum (gives the top hypothesis to a tactic), and qpat_assum
(gives the first matching hypothesis to a tactic). (See the REFERENCE for further
details on all of these.) The numbers attached to hypotheses by the proof manager
could likely be used to access hypotheses (it would be quite simple to write such a
tactic). However, starting a new proof is sometimes the most clarifying thing to do.
In some cases, it is useful to be able to delete a hypothesis. This can be accomplished
by passing the hypothesis to a tactic that ignores it. For example, to discard the top
hypothesis, one could invoke pop_assum kall_tac.

• In the example, we didn’t use the more advanced features of bossLib, largely
because they do not, as yet, provide much more functionality than the simple
sequencing of simplification, decision procedures, and automated first order rea-
soning. The >> tactical has thus served as an adequate replacement. In the future,
these entrypoints should become more powerful.

• It is almost always necessary to have an idea of the informal proof in order to be


successful when doing a formal proof. However, all too often the following strategy
is adopted by novices: (1) rewrite the goal with a few relevant definitions, and
then (2) rely on the syntax of the resulting goal to guide subsequent tactic selection.
Such an approach constitutes a clear case of the tail wagging the dog, and is a poor
strategy to adopt. Insight into the high-level structure of the proof is one of the
most important factors in successful verification exercises.
The author has noticed that many of the most successful verification experts work
using a sheet of paper to keep track of the main steps that need to be made. Perhaps
looking away to the paper helps break the mesmerizing effect of the computer
screen.
On the other hand, one of the advantages of having a mechanized logic is that the
machine can be used as a formal expression calculator, and thus the user can use it
to quickly and accurately explore various proof possibilities.

• High powered tools like metis_tac, and rw are the principal way of advancing a
proof in bossLib. In many cases, they do exactly what is desired, or even manage

to surprise the user with their power. In the formalization of Euclid’s theorem, the
tools performed fairly well. However, sometimes they are overly aggressive, or
they simply flounder. In such cases, more specialized proof tools need to be used,
or even written, and hence the support underlying bossLib must eventually be
learned.

• Having a good knowledge of the available lemmas, and where they are located, is
an essential part of being successful. Often powerful tools can replace lemmas in a
restricted domain, but in general, one has to know what has already been proved.
We have found that the entrypoints in DB help in quickly finding lemmas.
Chapter 5

Example: a Simple Parity Checker

This chapter consists of a worked example: the specification and verification of a simple
sequential parity checker. The intention is to accomplish two things:

(i) To present a complete piece of work with H OL.

(ii) To give a flavour of what it is like to use the H OL system for a tricky proof.

Concerning (ii), note that although the theorems proved are, in fact, rather simple,
the way they are proved illustrates the kind of intricate ‘proof engineering’ that is typical.
The proofs could be done more elegantly, but presenting them that way would defeat
the purpose of illustrating various features of H OL. It is hoped that the small example
here will give the reader a feel for what it is like to do a big one.
Readers who are not interested in hardware verification should be able to learn
something about the H OL system even if they do not wish to penetrate the details of
the parity-checking example used here. The specification and verification of a slightly
more complex parity checker is set as an exercise (a solution is provided in the directory
examples/parity).

5.1 Introduction
The sessions of this example comprise the specification and verification of a device that
computes the parity of a sequence of bits. More specifically, a detailed verification is given
of a device with an input in, an output out and the specification that the 𝑛th output on
out is T if and only if there have been an even number of T’s input on in. A theory named
PARITY is constructed; this contains the specification and verification of the device. All the
ML input in the boxes below can be found in the file examples/parity/PARITYScript.sml.
It is suggested that the reader interactively input this to get a ‘hands on’ feel for the
example. The goal of the case study is to illustrate detailed ‘proof hacking’ on a small
and fairly simple example.

59

5.2 Specification
The first step is to start up the H OL system. We again use <holdir>/bin/hol. The ML
prompt is >, so lines beginning with > are typed by the user and other lines are the
system’s response.
To specify the device, a primitive recursive function PARITY is defined so that for 𝑛 > 0,
PARITY 𝑛 𝑓 is true if the number of T’s in the sequence 𝑓 (1), … , 𝑓 (𝑛) is even.

> val PARITY_def = Define‘ 1


(PARITY 0 f = T) /\
(PARITY(SUC n) f = if f(SUC n) then ~PARITY n f else PARITY n f)‘;
Definition has been stored under "PARITY_def"
val PARITY_def =
⊢ (∀f. PARITY 0 f ⟺ T) ∧
∀n f. PARITY (SUC n) f ⟺ if f (SUC n) then ¬PARITY n f else PARITY n f:
thm

The effect of our call to Define is to store the definition of PARITY on the current theory
with name PARITY_def and to bind the defining theorem to the ML variable with the same
name. Notice that there are two name spaces being written into: the names of constants
in theories and the names of variables in ML. The user is generally free to manage these
names however he or she wishes (subject to the various lexical requirements), but a
common convention is (as here) to give the definition of a constant CON the name CON_def
in the theory and also in ML. Another commonly-used convention is to use just CON for the
theory and ML name of the definition of a constant CON. Unfortunately, the H OL system
does not use a uniform convention, but users are recommended to adopt one. In this
case Define has made one of the choices for us, but there are other scenarios where we
have to choose the name used in the theory file.
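As an aside (this check is ours, not part of the original script), Define normally installs the defining equations in the global compset, so EVAL should be able to test PARITY on a concrete signal. For the signal that is T exactly at times 1 and 2, the sequence 𝑓(1), 𝑓(2) contains two T’s, so we would expect PARITY 2 of it to be T and PARITY 1 to be F:

```sml
(* A quick sanity check of PARITY by evaluation (our addition). *)
> EVAL ``PARITY 1 (\t. (t = 1) \/ (t = 2))``;
> EVAL ``PARITY 2 (\t. (t = 1) \/ (t = 2))``;
```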
The specification of the parity checking device can now be given as:

!t. out t = PARITY t inp

It is intuitively clear that this specification will be satisfied if the signal[1] functions inp
and out satisfy:[2]

out(0) = T

and

!t. out(t+1) = (if inp(t+1) then ~(out t) else out t)

[1] Signals are modelled as functions from numbers, representing times, to booleans.
[2] We’d like to use in as one of our variable names, but this is a reserved word for let-expressions.

This can be verified formally in H OL by proving the following lemma:

!inp out.
(out 0 = T) /\
(!t. out(SUC t) = if inp(SUC t) then ~out t else out t)
==>
(!t. out t = PARITY t inp)

The proof of this is done by Mathematical Induction and, although trivial, is a good
illustration of how such proofs are done. The lemma is proved interactively using H OL’s
subgoal package. The proof is started by putting the goal to be proved on a goal stack
using the function g which takes a goal as argument.

> g ‘!inp out. 2


(out 0 = T) /\
(!t. out(SUC t) = (if inp(SUC t) then ~(out t) else out t)) ==>
(!t. out t = PARITY t inp)‘;
val it =
Proof manager status: 1 proof.
1. Incomplete goalstack:
Initial goal:

∀inp out.
(out 0 ⟺ T) ∧
(∀t. out (SUC t) ⟺ if inp (SUC t) then ¬out t else out t) ⇒
∀t. out t ⟺ PARITY t inp

The subgoal package prints out the goal on the top of the goal stack. The top goal is
expanded by stripping off the universal quantifier (with gen_tac) and then making the
two conjuncts of the antecedent of the implication into assumptions of the goal (with
strip_tac). The ML function e takes a tactic and applies it to the top goal; the resulting
subgoals are pushed on to the goal stack. The message ‘OK..’ is printed out just before
the tactic is applied. The resulting subgoal is then printed.

> e(rpt gen_tac >> strip_tac); 3


OK..
1 subgoal:
val it =

∀t. out t ⟺ PARITY t inp


------------------------------------
0. out 0 ⟺ T
1. ∀t. out (SUC t) ⟺ if inp (SUC t) then ¬out t else out t

Next induction on t is done using Induct, which does induction on the outermost
universally quantified variable.

> e Induct; 4
OK..
2 subgoals:
val it =

out (SUC t) ⟺ PARITY (SUC t) inp


------------------------------------
0. out 0 ⟺ T
1. ∀t. out (SUC t) ⟺ if inp (SUC t) then ¬out t else out t
2. out t ⟺ PARITY t inp

out 0 ⟺ PARITY 0 inp


------------------------------------
0. out 0 ⟺ T
1. ∀t. out (SUC t) ⟺ if inp (SUC t) then ¬out t else out t

The assumptions of the two subgoals are shown numbered underneath the horizontal
lines of hyphens. The last goal printed is the one on the top of the stack, which is the
basis case. This is solved by rewriting with its assumptions and the definition of PARITY.

> e(rw[PARITY_def]); 5
OK..
<<HOL message: Initialising SRW simpset ... done>>

Goal proved.
[.] ⊢ out 0 ⟺ PARITY 0 inp

Remaining subgoals:
val it =

out (SUC t) ⟺ PARITY (SUC t) inp


------------------------------------
0. out 0 ⟺ T
1. ∀t. out (SUC t) ⟺ if inp (SUC t) then ¬out t else out t
2. out t ⟺ PARITY t inp

The top goal is proved, so the system pops it from the goal stack (and puts the proved
theorem on a stack of theorems). The new top goal is the step case of the induction. This
goal is also solved by rewriting.

> e(rw[PARITY_def]); 6
OK.. ...output elided...

Goal proved.
[..] ⊢ ∀t. out t ⟺ PARITY t inp
val it =
Initial goal proved.
⊢ ∀inp out.
(out 0 ⟺ T) ∧
(∀t. out (SUC t) ⟺ if inp (SUC t) then ¬out t else out t) ⇒
∀t. out t ⟺ PARITY t inp: proof

The goal is proved, i.e. the empty list of subgoals is produced. The system now applies
the justification functions produced by the tactics to the lists of theorems achieving the
subgoals (starting with the empty list). These theorems are printed out in the order in
which they are generated (note that assumptions of theorems are printed as dots).
The ML function

top_thm : unit -> thm

returns the theorem just proved (i.e. on the top of the theorem stack) in the current
theory, and we bind this to the ML name UNIQUENESS_LEMMA.

> val UNIQUENESS_LEMMA = top_thm(); 7


val UNIQUENESS_LEMMA =
⊢ ∀inp out.
(out 0 ⟺ T) ∧
(∀t. out (SUC t) ⟺ if inp (SUC t) then ¬out t else out t) ⇒
∀t. out t ⟺ PARITY t inp: thm
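When the time comes to put this result in a script file, the interactive steps above collapse into a single sewn-up declaration along the following lines (using prove is our choice; since rw[PARITY_def] solved both branches of the induction, it can simply be sequenced after Induct):

```sml
(* A plausible sewn-up form of the interactive proof above. *)
val UNIQUENESS_LEMMA = prove(
  ``!inp out.
      (out 0 = T) /\
      (!t. out (SUC t) = (if inp (SUC t) then ~(out t) else out t)) ==>
      (!t. out t = PARITY t inp)``,
  rpt gen_tac >> strip_tac >> Induct >> rw[PARITY_def]);
```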

5.3 Implementation
The lemma just proved suggests that the parity checker can be implemented by holding
the parity value in a register and then complementing the contents of the register
whenever T is input. To make the implementation more interesting, it will be assumed
that registers ‘power up’ storing F. Thus the output at time 0 cannot be taken directly
from a register, because the output of the parity checker at time 0 is specified to be T.
Another tricky thing to notice is that if t>0, then the output of the parity checker at time
t is a function of the input at time t. Thus there must be a combinational path from the
input to the output.
The schematic diagram below shows the design of a device that is intended to imple-
ment this specification. (The leftmost input to MUX is the selector.) This works by storing
the parity of the sequence input so far in the lower of the two registers. Each time T is
input at in, this stored value is complemented. Registers are assumed to ‘power up’ in a

state in which they are storing F. The second register (connected to ONE) initially outputs
F and then outputs T forever. Its role is just to ensure that the device works during the
first cycle by connecting the output out to the device ONE via the lower multiplexer. For
all subsequent cycles out is connected to l3 and so either carries the stored parity value
(if the current input is F) or the complement of this value (if the current input is T).

[Schematic diagram of the parity checker: the input in feeds a NOT gate and a multiplexer MUX; two registers REG and the constant device ONE are connected by the internal lines l1, l2, l3, l4 and l5, with a second MUX driving the output out.]

The devices making up this schematic will be modelled with predicates [5]. For
example, the predicate ONE is true of a signal out if for all times t the value of out is T.
> val ONE_def = Define ‘ONE(out:num->bool) = !t. out t = T‘; 8
Definition has been stored under "ONE_def"
val ONE_def = ⊢ ∀out. ONE out ⟺ ∀t. out t ⟺ T: thm

Note that, as discussed above, ‘ONE_def’ is used both as an ML variable and as the name
of the definition in the theory. Note also how ‘:num->bool’ has been added to resolve type
ambiguities; without this (or some other type information) the typechecker would not
be able to infer that t is to have type num.
The binary predicate NOT is true of a pair of signals (inp,out) if the value of out is
always the negation of the value of inp. Inverters are thus modelled as having no delay.
This is appropriate for a register-transfer level model, but not at a lower level.

> val NOT_def = Define‘NOT(inp, out:num->bool) = !t. out t = ~(inp t)‘; 9


Definition has been stored under "NOT_def"
val NOT_def = ⊢ ∀inp out. NOT (inp,out) ⟺ ∀t. out t ⟺ ¬inp t: thm

The final combinational device needed is a multiplexer. This is a ‘hardware conditional’;


the input sw selects which of the other two inputs are to be connected to the output out.

> val MUX_def = Define‘ 10


MUX(sw,in1,in2,out:num->bool) =
!t. out t = if sw t then in1 t else in2 t‘;
Definition has been stored under "MUX_def"
val MUX_def =
⊢ ∀sw in1 in2 out.
MUX (sw,in1,in2,out) ⟺ ∀t. out t ⟺ if sw t then in1 t else in2 t:
thm

The remaining devices in the schematic are registers. These are unit-delay elements;
the values output at time t+1 are the values input at the preceding time t, except at time
0 when the register outputs F.3

> val REG_def = 11


Define ‘REG(inp,out:num->bool) =
!t. out t = if (t=0) then F else inp(t-1)‘;
Definition has been stored under "REG_def"
val REG_def =
⊢ ∀inp out. REG (inp,out) ⟺ ∀t. out t ⟺ if t = 0 then F else inp (t − 1):
thm

The schematic diagram above can be represented as a predicate by conjoining the


relations holding between the various signals and then existentially quantifying the
internal lines. This technique is explained elsewhere (e.g. see [3, 5]).

3 Time 0 represents when the device is switched on.



> val PARITY_IMP_def = Define 12


‘PARITY_IMP(inp,out) =
?l1 l2 l3 l4 l5.
NOT(l2,l1) /\ MUX(inp,l1,l2,l3) /\ REG(out,l2) /\
ONE l4 /\ REG(l4,l5) /\ MUX(l5,l3,l4,out)‘;
Definition has been stored under "PARITY_IMP_def"
val PARITY_IMP_def =
⊢ ∀inp out.
PARITY_IMP (inp,out) ⟺
∃l1 l2 l3 l4 l5.
NOT (l2,l1) ∧ MUX (inp,l1,l2,l3) ∧ REG (out,l2) ∧ ONE l4 ∧
REG (l4,l5) ∧ MUX (l5,l3,l4,out): thm
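
Before embarking on the formal proof, the design can be sanity-checked by brute-force simulation outside H OL. The following Python sketch is entirely our own (the names parity_spec and parity_imp are not part of the H OL development); it evaluates the line equations of the schematic one time step at a time and compares the result with the PARITY specification on random inputs.

```python
import random

def parity_spec(inp):
    # PARITY t inp: T at time 0, flipped at each later t where inp[t] is T
    out = [True]
    for t in range(1, len(inp)):
        out.append(not out[-1] if inp[t] else out[-1])
    return out

def parity_imp(inp):
    # Evaluate the circuit's line equations directly, one time step at a time
    out = []
    for t in range(len(inp)):
        l2 = False if t == 0 else out[t - 1]   # REG(out,l2): powers up storing F
        l1 = not l2                            # NOT(l2,l1)
        l3 = l1 if inp[t] else l2              # MUX(inp,l1,l2,l3)
        l4 = True                              # ONE l4
        l5 = False if t == 0 else True         # REG(l4,l5): l4 is always T
        out.append(l3 if l5 else l4)           # MUX(l5,l3,l4,out)
    return out

random.seed(0)
for _ in range(100):
    inp = [random.choice([True, False]) for _ in range(12)]
    assert parity_imp(inp) == parity_spec(inp)
```

Such testing only samples behaviours, of course; the point of the H OL proof is to cover all of them.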

5.4 Verification
The following theorem will eventually be proved:

|- !inp out. PARITY_IMP(inp,out) ==> (!t. out t = PARITY t inp)

This states that if inp and out are related as in the schematic diagram (i.e. as in the
definition of PARITY_IMP), then the pair of signals (inp,out) satisfies the specification.
First, the following lemma is proved; the correctness of the parity checker follows from
this and UNIQUENESS_LEMMA by the transitivity of ==>.

> g ‘!inp out. 13


PARITY_IMP(inp,out) ==>
(out 0 = T) /\
!t. out(SUC t) = if inp(SUC t) then ~(out t) else out t‘;
val it =
Proof manager status: 2 proofs.
2. Completed goalstack:
⊢ ∀inp out.
(out 0 ⟺ T) ∧
(∀t. out (SUC t) ⟺ if inp (SUC t) then ¬out t else out t) ⇒
∀t. out t ⟺ PARITY t inp
1. Incomplete goalstack:
Initial goal:

∀inp out.
PARITY_IMP (inp,out) ⇒
(out 0 ⟺ T) ∧
∀t. out (SUC t) ⟺ if inp (SUC t) then ¬out t else out t

The first step in proving this goal is to rewrite with definitions followed by a decomposi-
tion of the resulting goal using strip_tac. The rewriting tactic PURE_REWRITE_TAC is used;
this does no built-in simplifications, only the ones explicitly given in the list of theorems
supplied as an argument. One of the built-in simplifications used by REWRITE_TAC is
|- (x = T) = x; PURE_REWRITE_TAC is used here precisely to prevent rewriting with it.

> e(PURE_REWRITE_TAC [PARITY_IMP_def, ONE_def, NOT_def, 14


MUX_def, REG_def] >>
rpt strip_tac);
OK..
2 subgoals:
val it =

out (SUC t) ⟺ if inp (SUC t) then ¬out t else out t


------------------------------------
0. ∀t. l1 t ⟺ ¬l2 t
1. ∀t. l3 t ⟺ if inp t then l1 t else l2 t
2. ∀t. l2 t ⟺ if t = 0 then F else out (t − 1)
3. ∀t. l4 t ⟺ T
4. ∀t. l5 t ⟺ if t = 0 then F else l4 (t − 1)
5. ∀t. out t ⟺ if l5 t then l3 t else l4 t

out 0 ⟺ T
------------------------------------
0. ∀t. l1 t ⟺ ¬l2 t
1. ∀t. l3 t ⟺ if inp t then l1 t else l2 t
2. ∀t. l2 t ⟺ if t = 0 then F else out (t − 1)
3. ∀t. l4 t ⟺ T
4. ∀t. l5 t ⟺ if t = 0 then F else l4 (t − 1)
5. ∀t. out t ⟺ if l5 t then l3 t else l4 t

The top goal is the one printed last; its conclusion is out 0 = T and its assumptions
are equations relating the values on the lines in the circuit. The natural next step would
be to expand the top goal by rewriting with the assumptions. However, if this were done
the system would go into an infinite loop because the equations for out, l2 and l3 are
mutually recursive. Instead we use the first-order reasoner metis_tac to do the work:

> e(metis_tac []); 15


OK..
metis: r[+0+17]+0+0+0+0+0+0+1#

Goal proved.
[......] ⊢ out 0 ⟺ T

Remaining subgoals:
val it =

out (SUC t) ⟺ if inp (SUC t) then ¬out t else out t


------------------------------------
0. ∀t. l1 t ⟺ ¬l2 t
1. ∀t. l3 t ⟺ if inp t then l1 t else l2 t
2. ∀t. l2 t ⟺ if t = 0 then F else out (t − 1)
3. ∀t. l4 t ⟺ T
4. ∀t. l5 t ⟺ if t = 0 then F else l4 (t − 1)
5. ∀t. out t ⟺ if l5 t then l3 t else l4 t

The first of the two subgoals is proved. Inspecting the remaining goal it can be seen that
it will be solved if its left hand side, out(SUC t), is expanded using the assumption:
!t. out t = if l5 t then l3 t else l4 t

However, if this assumption is used for rewriting, then all the subterms of the form
out t will also be expanded. To prevent this, we really want to rewrite with a formula
that is specifically about out (SUC t): that is, to pull the assumption we do have off
the assumption list and rewrite with a specialised version of it. We can do just
this using qpat_x_assum. This tactic is of type term quotation -> thm -> tactic. It
selects an assumption that is of the form given by its first argument, and passes it to the
second argument, a function which expects a theorem and returns a tactic. Here it is in
action:
> e (qpat_x_assum ‘!t. out t = X t‘ 16
(fn th => REWRITE_TAC [SPEC ‘‘SUC t‘‘ th]));
OK..
1 subgoal:
val it =

(if l5 (SUC t) then l3 (SUC t) else l4 (SUC t)) ⟺


if inp (SUC t) then ¬out t else out t
------------------------------------
0. ∀t. l1 t ⟺ ¬l2 t
1. ∀t. l3 t ⟺ if inp t then l1 t else l2 t
2. ∀t. l2 t ⟺ if t = 0 then F else out (t − 1)
3. ∀t. l4 t ⟺ T
4. ∀t. l5 t ⟺ if t = 0 then F else l4 (t − 1)

The pattern used here exploited something called higher order matching. The actual
assumption that was taken off the assumption stack did not have a RHS that looked like
the application of a function (X in the pattern) to the t parameter, but the RHS could
nonetheless be seen as equal to the application of some function to the t parameter. In
fact, the value that matched X was ‘‘\x. if l5 x then l3 x else l4 x‘‘.
Inspecting the goal above, it can be seen that the next step is to unwind the equations
for the remaining lines of the circuit. We do so using the standard simplifier rw.
> e (rw[]); 17
OK.. ...output elided...

Goal proved.
[......] ⊢ out (SUC t) ⟺ if inp (SUC t) then ¬out t else out t
val it =
Initial goal proved.
⊢ ∀inp out.
PARITY_IMP (inp,out) ⇒
(out 0 ⟺ T) ∧
∀t. out (SUC t) ⟺ if inp (SUC t) then ¬out t else out t: proof
The theorem just proved is named PARITY_LEMMA and saved in the current theory.
> val PARITY_LEMMA = top_thm (); 18
val PARITY_LEMMA =
⊢ ∀inp out.
PARITY_IMP (inp,out) ⇒
(out 0 ⟺ T) ∧
∀t. out (SUC t) ⟺ if inp (SUC t) then ¬out t else out t: thm
PARITY_LEMMA could have been proved in one step with a single compound tactic. Our
initial goal can be expanded with a single tactic corresponding to the sequence of tactics
that were used interactively:
> restart(); ...output elided... 19
> e (PURE_REWRITE_TAC [PARITY_IMP_def, ONE_def, NOT_def,
MUX_def, REG_def] >>
rpt strip_tac >| [
metis_tac [],
qpat_x_assum ‘!t. out t = X t‘
(fn th => REWRITE_TAC [SPEC ‘‘SUC t‘‘ th]) >>
rw[]
]);
OK..
metis: r[+0+17]+0+0+0+0+0+1+0+1+0+0+1#
val it =
Initial goal proved.
⊢ ∀inp out.
PARITY_IMP (inp,out) ⇒
(out 0 ⟺ T) ∧
∀t. out (SUC t) ⟺ if inp (SUC t) then ¬out t else out t: proof

Armed with PARITY_LEMMA, the final theorem is easily proved. This will be done in one
step using the ML function prove.

> val PARITY_CORRECT = prove( 20


‘‘!inp out. PARITY_IMP(inp,out) ==> (!t. out t = PARITY t inp)‘‘,
rpt strip_tac >> match_mp_tac UNIQUENESS_LEMMA >>
irule PARITY_LEMMA >> rw[]);
val PARITY_CORRECT =
⊢ ∀inp out. PARITY_IMP (inp,out) ⇒ ∀t. out t ⟺ PARITY t inp: thm

This completes the proof of the parity checking device.

5.5 Exercises
Two exercises are given in this section: Exercise 1 is straightforward, but Exercise 2 is
quite tricky and might take a beginner several days to solve.

5.5.1 Exercise 1
Using only the devices ONE, NOT, MUX and REG defined in Section 5.3, design and verify a
register RESET_REG with an input inp, reset line reset, output out and behaviour specified
as follows.

• If reset is T at time t, then the value at out at time t is also T.

• If reset is T at time t or t+1, then the value output at out at time t+1 is T, otherwise
it is equal to the value input at time t on inp.

This is formalized in H OL by the definition:

RESET_REG(reset,inp,out) <=>
(!t. reset t ==> (out t = T)) /\
(!t. out(t+1) = if reset t \/ reset(t+1) then T else inp t)

Note that this specification is only partial; it doesn’t specify the output at time 0 in the
case that there is no reset.
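
It can help to have an executable reference for the specified behaviour before attempting a design. The Python sketch below is our own reading of the specification (reset_reg_ok is not part of the exercise); it checks a triple of finite signal traces against the two conjuncts, deliberately leaving the output at time 0 unconstrained when there is no reset.

```python
def reset_reg_ok(reset, inp, out):
    # First conjunct:  !t. reset t ==> (out t = T)
    for t in range(len(out)):
        if reset[t] and not out[t]:
            return False
    # Second conjunct: !t. out(t+1) = if reset t \/ reset(t+1) then T else inp t
    for t in range(len(out) - 1):
        expected = True if (reset[t] or reset[t + 1]) else inp[t]
        if out[t + 1] != expected:
            return False
    return True

# The specification is partial: with no reset at time 0, either output is allowed
assert reset_reg_ok([False, True], [False, False], [False, True])
assert reset_reg_ok([False, True], [False, False], [True, True])
```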
The solution to the exercise should be a definition of a predicate RESET_REG_IMP as
an existential quantification of a conjunction of applications of ONE, NOT, MUX and REG to
suitable line names,4 together with a proof of:

RESET_REG_IMP(reset,inp,out) ==> RESET_REG(reset,inp,out)


4 i.e. a definition of the same form as that of PARITY_IMP on page 66.

5.5.2 Exercise 2
1. Formally specify a resetable parity checker that has two boolean inputs reset and
inp, and one boolean output out with the following behaviour:

The value at out is T if and only if there have been an even number of Ts
input at inp since the last time that T was input at reset.

2. Design an implementation of this specification built using only the devices ONE, NOT,
MUX and REG defined in Section 5.3.

3. Verify the correctness of your implementation in H OL.


Chapter 6

Example: Combinatory Logic

6.1 Introduction
This small case study is a formalisation of (variable-free) combinatory logic. This logic
is of foundational importance in theoretical computer science, and has a very rich
theory. The example builds principally on a development done by Tom Melham. The
complete script for the development is available as clScript.sml in the examples/ind_-
def directory of the distribution. It is self-contained and so includes the answers to the
exercises set at the end of this document.
The H OL sessions assume that the Unicode trace is on (as it is by default), meaning
that even though the inputs may be written in pure ASCII, the output still uses nice
Unicode output (symbols such as ∀ and ⇒). The Unicode symbols could also be used in
the input.

6.2 The type of combinators


The first thing we need to do is define the type of combinators. There are just two of
these, K and S, but we also need to be able to combine them, and for this we need to
introduce the notion of application. For lack of a better ASCII symbol, we will use the
hash (#) to represent this in the logic. Finally, we will start by “hiding” the names S and
K so that the constants of these names from the existing H OL theories won’t interfere
with parsing.

> hide "K"; hide "S"; ...output elided... 1


> Datatype ‘cl = K | S | # cl cl‘;
<<HOL message: Defined type: "cl">>
val it = (): unit

We also want the # to be an infix, so we set its fixity to be a tight left-associative infix:

> set_fixity "#" (Infixl 1100); 2


val it = (): unit


6.3 Combinator reductions

Combinatory logic is the study of how values of this type can evolve given various rules
describing how they change. Therefore, our next step is to define the reductions that
combinators can undergo. There are two basic rules:

𝖪𝑥𝑦 → 𝑥
𝖲 𝑓 𝑔 𝑥 → (𝑓 𝑥)(𝑔𝑥)

Here, in our description outside of HOL, we use juxtaposition instead of the #. Further,
juxtaposition is also left-associative, so that 𝖪 𝑥 𝑦 should be read as 𝖪 # 𝑥 # 𝑦 which is in
turn (𝖪 # 𝑥) # 𝑦.
Given a term in the logic, we want these reductions to be able to fire at any point, not
just at the top level, so we need two further congruence rules:

        𝑥 → 𝑥′                  𝑦 → 𝑦′
     ────────────            ────────────
      𝑥 𝑦 → 𝑥′ 𝑦              𝑥 𝑦 → 𝑥 𝑦′
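
These rules are easy to prototype outside H OL. In the Python sketch below (the encoding is our own: applications f # x are represented as pairs), reducts returns the set of all one-step successors of a term, with the two congruence rules letting a redex fire anywhere inside it.

```python
K, S = "K", "S"

def app(f, x):
    return (f, x)                      # f # x as a pair

def reducts(term):
    # All one-step successors of term under the four rules
    res = set()
    if isinstance(term, tuple):
        f, x = term
        if isinstance(f, tuple) and f[0] == K:
            res.add(f[1])                                    # K a b --> a
        if isinstance(f, tuple) and isinstance(f[0], tuple) and f[0][0] == S:
            res.add(app(app(f[0][1], x), app(f[1], x)))      # S f g x --> (f x)(g x)
        res |= {app(f2, x) for f2 in reducts(f)}             # congruence (left)
        res |= {app(f, x2) for x2 in reducts(x)}             # congruence (right)
    return res

# K S (K K K) steps to S (top-level K redex) or to K S K (redex inside the argument)
t = app(app(K, S), app(app(K, K), K))
assert reducts(t) == {S, app(app(K, S), K)}
```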

In HOL, we can capture this relation with an inductive definition. First we need to set our
arrow symbol up as an infix to make everything that bit prettier. The set_mapped_fixity
function lets the arrow be our surface syntax, but maps to the name redn underneath.
Making constants have pure alphanumeric names is generally a good idea.

> set_mapped_fixity {fixity = Infix(NONASSOC, 450), 3


tok = "-->", term_name = "redn"}
val it = (): unit

We make our arrow symbol non-associative, thereby making it a parse error to write
x --> y --> z. It would be nice to be able to write this and have it mean x --> y /\ y --> z,
but this is not presently possible with the HOL parser.
Our next step is to actually define the relation with the Hol_reln function. This
function returns three separate theorems, but we will only need to refer to the first:

> val (redn_rules, _, redn_cases) = Hol_reln 4


‘(!x y f. x --> y ==> f # x --> f # y) /\
(!f g x. f --> g ==> f # x --> g # x) /\
(!x y. K # x # y --> x) /\
(!f g x. S # f # g # x --> (f # x) # (g # x))‘;
val redn_cases =
⊢ ∀a0 a1.
a0 --> a1 ⟺
(∃x y f. (a0 = f # x) ∧ (a1 = f # y) ∧ x --> y) ∨
(∃f g x. (a0 = f # x) ∧ (a1 = g # x) ∧ f --> g) ∨
(∃y. a0 = K # a1 # y) ∨
∃f g x. (a0 = S # f # g # x) ∧ (a1 = f # x # (g # x)): thm
val redn_rules =
⊢ (∀x y f. x --> y ⇒ f # x --> f # y) ∧
(∀f g x. f --> g ⇒ f # x --> g # x) ∧ (∀x y. K # x # y --> x) ∧
∀f g x. S # f # g # x --> f # x # (g # x): thm

In addition to proving these three theorems for us, the inductive definitions package
will also save them to disk when the theory is exported.
Now, using our theorem redn_rules we can demonstrate single steps of our reduction
relation:
> PROVE [redn_rules] ‘‘S # (K # x # x) --> S # x‘‘; 5
Meson search level: ...
val it = ⊢ S # (K # x # x) --> S # x: thm

The system we have just defined is as powerful as the 𝜆-calculus, Turing machines, and
all the other standard models of computation.
One useful result about the combinatory logic is that it is confluent. Consider the
term 𝖲 𝑧 (𝖪 𝖪) (𝖪 𝑦 𝑥). It can make two reductions, to 𝖲 𝑧 (𝖪 𝖪) 𝑦 and also to
(𝑧 (𝖪 𝑦 𝑥)) (𝖪 𝖪 (𝖪 𝑦 𝑥)). Do these two choices of reduction mean that from this point on
the terms have two completely separate histories? Roughly speaking, to be confluent
means that the answer to this question is no.

6.4 Transitive closure and confluence


A notion crucial to that of confluence is that of transitive closure. We have defined
a system that evolves by specifying how an algebraic value can evolve into possible
successor values in one step. The natural next question is to ask for a characterisation of
evolution over one or more steps of the → relation.
In fact, we will define a relation that holds between two values if the second can be
reached from the first in zero or more steps. This is the reflexive, transitive closure of our
original relation. However, rather than tie our new definition to our original relation, we
will develop this notion independently and prove a variety of results that are true of any
system, not just our system of combinatory logic.

So, we begin our abstract digression with another inductive definition. Our new
constant is RTC, such that 𝖱𝖳𝖢 𝑅 𝑥 𝑦 is true if it is possible to get from 𝑥 to 𝑦 with zero
or more “steps” of the 𝑅 relation. (The standard notation for 𝖱𝖳𝖢 𝑅 is 𝑅∗; we will see
H OL try to approximate this with the text R^*.) We can express this idea with just two
rules. The first

𝖱𝖳𝖢 𝑅 𝑥 𝑥

says that it’s always possible to get from 𝑥 to 𝑥 in zero or more steps. The second
      𝑅 𝑥 𝑦     𝖱𝖳𝖢 𝑅 𝑦 𝑧
     ─────────────────────
           𝖱𝖳𝖢 𝑅 𝑥 𝑧

says that if you can take a single step from 𝑥 to 𝑦, and then zero or more steps from 𝑦
to 𝑧, then it’s possible to take zero or more steps to get between 𝑥 and 𝑧. The
realisation of these rules in HOL is again straightforward.
(As it happens, RTC is already a defined constant in the context we’re working in
(it is found in relationTheory), so we’ll hide it from view before we begin. We thus
avoid messages telling us that we are inputting ambiguous terms. The ambiguities
would always be resolved in favour of the more recent definition, but the warnings are
annoying. We inherit the nice syntax for the old constant with our new one.)
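
For a finite relation represented as a set of pairs, the reflexive, transitive closure can simply be computed by saturation. The Python sketch below is our own code (quite unlike relationTheory's RTC, which is defined for arbitrary relations); it adds pairs until no new ones appear.

```python
def rtc(r):
    # Close r (a set of pairs) under reflexivity and transitivity,
    # restricted to the elements that r mentions
    elems = {x for pair in r for x in pair}
    closure = {(x, x) for x in elems} | set(r)
    changed = True
    while changed:
        changed = False
        for (x, y) in list(closure):
            for (y2, z) in list(closure):
                if y == y2 and (x, z) not in closure:
                    closure.add((x, z))
                    changed = True
    return closure

assert rtc({(1, 2), (2, 3)}) == {(1, 1), (2, 2), (3, 3), (1, 2), (2, 3), (1, 3)}
```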
> val _ = hide "RTC"; 6

> val (RTC_rules, _, RTC_cases) = Hol_reln ‘


(!x. RTC R x x) /\
(!x y z. R x y /\ RTC R y z ==> RTC R x z)‘;
<<HOL message: inventing new type variable names: ’a>>
<<HOL message: Treating "R" as schematic variable>>
val RTC_cases = ⊢ ∀R a0 a1. R^* a0 a1 ⟺ (a1 = a0) ∨ ∃y. R a0 y ∧ R^* y a1:
thm
val RTC_rules = ⊢ ∀R. (∀x. R^* x x) ∧ ∀x y z. R x y ∧ R^* y z ⇒ R^* x z: thm

Now let us go back to the notion of confluence. We want this to mean something like:
“though a system may take different paths in the short-term, those two paths can always
end up in the same place”. This suggests that we define confluent thus:
> val confluent_def = Define 7
‘confluent R =
!x y z. RTC R x y /\ RTC R x z ==>
?u. RTC R y u /\ RTC R z u‘;
<<HOL message: inventing new type variable names: ’a>>
Definition has been stored under "confluent_def"
val confluent_def =
⊢ ∀R. confluent R ⟺ ∀x y z. R^* x y ∧ R^* x z ⇒ ∃u. R^* y u ∧ R^* z u: thm

This property states of 𝑅 that we can “complete the diamond”; if we have



          𝑥
       ∗↙   ↘∗
      𝑦       𝑧

then we can complete with a fresh value 𝑢:

          𝑥
       ∗↙   ↘∗
      𝑦       𝑧
       ∗↘   ↙∗
          𝑢

One nice property of confluent relations is that from any one starting point they
produce no more than one normal form, where a normal form is a value from which no
further steps can be taken.

> val normform_def = Define‘normform R x = !y. ~(R x y)‘; 8


<<HOL message: inventing new type variable names: ’a, ’b>>
Definition has been stored under "normform_def"
val normform_def = ⊢ ∀R x. normform R x ⟺ ∀y. ¬R x y: thm

In other words, a system has an 𝑅-normal form at 𝑥 if there are no connections via 𝑅
to any other values. (We could have written ~?y. R x y as our RHS for the definition
above.)
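
The theorem about to be proved can be spot-checked on finite relations. In the Python sketch below (all names are ours, and the closure is computed by brute force), confluent and normform transcribe the two definitions directly; the diamond-shaped relation used as a test has 4 as the unique normal form reachable from 1.

```python
from itertools import product

def closure(r, elems):
    # Brute-force reflexive, transitive closure over a finite carrier
    c = {(x, x) for x in elems} | set(r)
    changed = True
    while changed:
        changed = False
        for (x, y), (y2, z) in product(list(c), repeat=2):
            if y == y2 and (x, z) not in c:
                c.add((x, z))
                changed = True
    return c

def confluent(r, elems):
    # RTC R x y /\ RTC R x z ==> ?u. RTC R y u /\ RTC R z u
    c = closure(r, elems)
    return all(any((y, u) in c and (z, u) in c for u in elems)
               for x, y, z in product(elems, repeat=3)
               if (x, y) in c and (x, z) in c)

def normform(r, x):
    # x is an R-normal form if no step leaves x
    return all(a != x for (a, _) in r)

elems = {1, 2, 3, 4}
r = {(1, 2), (1, 3), (2, 4), (3, 4)}   # a diamond-shaped relation
assert confluent(r, elems)
c = closure(r, elems)
assert {y for y in elems if (1, y) in c and normform(r, y)} == {4}
```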
We can now prove the following:

> g ‘!R. confluent R ==> 9


!x y z.
RTC R x y /\ normform R y /\
RTC R x z /\ normform R z ==> (y = z)‘;
<<HOL message: inventing new type variable names: ’a>>
val it =
Proof manager status: 1 proof.
1. Incomplete goalstack:
Initial goal:

∀R.
confluent R ⇒
∀x y z. R^* x y ∧ normform R y ∧ R^* x z ∧ normform R z ⇒ (y = z)

We rewrite with the definition of confluence:



> e (rw[confluent_def]); 10
OK..
<<HOL message: Initialising SRW simpset ... done>>
1 subgoal:
val it =

y = z
------------------------------------
0. ∀x y z. R^* x y ∧ R^* x z ⇒ ∃u. R^* y u ∧ R^* z u
1. R^* x y
2. normform R y
3. R^* x z
4. normform R z

Our confluence property is now assumption 0, and we can use it to infer that there is a 𝑢
at the base of the diamond:

> e (‘?u. RTC R y u /\ RTC R z u‘ by metis_tac []); 11


OK..
metis: r[+0+8]+0+0+0+0+0+0+1+1+1+1#
1 subgoal:
val it =

y = z
------------------------------------
0. ∀x y z. R^* x y ∧ R^* x z ⇒ ∃u. R^* y u ∧ R^* z u
1. R^* x y
2. normform R y
3. R^* x z
4. normform R z
5. R^* y u
6. R^* z u

So, from 𝑦 we can take zero or more steps to get to 𝑢 and similarly from 𝑧. But, we also
know that we’re at an 𝑅-normal form at both 𝑦 and 𝑧. We can’t take any steps at all from
these values. We can conclude both that 𝑢 = 𝑦 and 𝑢 = 𝑧, and this in turn means that
𝑦 = 𝑧, which is our goal. So we can finish with

> e (metis_tac [normform_def, RTC_cases]); 12


OK..
metis: r[+0+20]+0+0+0+0+0+0+0+0+0+0+0+0+6+0+0+0+0+0+0+2+0 .... #
...output elided...

Goal proved.
[.....] ⊢ y = z
val it =
Initial goal proved.
⊢ ∀R.
confluent R ⇒
∀x y z. R^* x y ∧ normform R y ∧ R^* x z ∧ normform R z ⇒ (y = z)

Packaged up so as to remove the sub-goal package commands, we can prove and save
the theorem for future use by:
> val confluent_normforms_unique = store_thm( 13
"confluent_normforms_unique",
‘‘!R. confluent R ==>
!x y z. RTC R x y /\ normform R y /\
RTC R x z /\ normform R z ==> (y = z)‘‘,
rw[confluent_def] >>
‘?u. RTC R y u /\ RTC R z u‘ by metis_tac [] >>
metis_tac [normform_def, RTC_cases]);
<<HOL message: inventing new type variable names: ’a>>
metis: r[+0+8]+0+0+0+0+0+0+1+1+1+1#
metis: r[+0+20]+0+0+0+0+0+0+0+0+0+0+0+0+6+0+0+0+0+0+0+2+0 .... #
val confluent_normforms_unique =
⊢ ∀R.
confluent R ⇒
∀x y z. R^* x y ∧ normform R y ∧ R^* x z ∧ normform R z ⇒ (y = z):
thm
⋯⋄⋯

Clearly confluence is a nice property for a system to have. The question is how we
might manage to prove it. Let’s start by defining the diamond property that we used in
the definition of confluence. We’ll again hide the existing definition of “diamond”:
> val _ = hide "diamond"; 14
> val diamond_def = Define
‘diamond R = !x y z. R x y /\ R x z ==> ?u. R y u /\ R z u‘;
<<HOL message: inventing new type variable names: ’a>>
Definition has been stored under "diamond_def"
val diamond_def =
⊢ ∀R. diamond R ⟺ ∀x y z. R x y ∧ R x z ⇒ ∃u. R y u ∧ R z u: thm

Now we clearly have that confluence of a relation is equivalent to the reflexive, transitive
closure of that relation having the diamond property.

> val confluent_diamond_RTC = store_thm( 15


"confluent_diamond_RTC",
‘‘!R. confluent R = diamond (RTC R)‘‘,
rw[confluent_def, diamond_def]);
<<HOL message: inventing new type variable names: ’a>>
val confluent_diamond_RTC = ⊢ ∀R. confluent R ⟺ diamond R^*: thm

So far so good. How then do we show the diamond property for 𝖱𝖳𝖢 𝑅? The answer
that leaps to mind is to hope that if the original relation has the diamond property, then
maybe the reflexive and transitive closure will too. The theorem we want is

𝖽𝗂𝖺𝗆𝗈𝗇𝖽 𝑅 ⇒ 𝖽𝗂𝖺𝗆𝗈𝗇𝖽 (𝖱𝖳𝖢 𝑅)

Graphically, this is hoping that from

          𝑥
        ↙   ↘
      𝑦       𝑧
        ↘   ↙
          𝑢

we will be able to conclude

          𝑥
       ∗↙   ↘∗
      𝑝       𝑞
       ∗↘   ↙∗
          𝑟

where the starred arrows indicate that these steps (from 𝑥 to 𝑝, for example) are using
𝖱𝖳𝖢 𝑅. The presence of two instances of 𝖱𝖳𝖢 𝑅 is an indication that this proof will
require two inductions. With the first we will prove

          𝑥
       ∗↙   ↘
      𝑝       𝑧
       ∗↘   ↙∗
          𝑟

In other words, we want to show that if we take one step in one direction (to 𝑧) and
many steps in another (to 𝑝), then the diamond property for 𝑅 will guarantee us the
existence of 𝑟, to which we will be able to take many steps from both 𝑝 and 𝑧.
We take some care to state the goal so that after stripping away the outermost as-
sumption (that 𝑅 has the diamond property), it will match the induction principle for
RTC.1

> g ‘!R. diamond R ==> 16


!x p. RTC R x p ==>
!z. R x z ==>
?u. RTC R p u /\ RTC R z u‘;
<<HOL message: inventing new type variable names: ’a>>
val it =
Proof manager status: 1 proof.
1. Incomplete goalstack:
Initial goal:

∀R. diamond R ⇒ ∀x p. R^* x p ⇒ ∀z. R x z ⇒ ∃u. R^* p u ∧ R^* z u

First, we strip away the diamond property assumption (two things need to be stripped:
the outermost universal quantifier and the antecedent of the implication). If we use rw
at this point, we strip away too much so we have to be more precise and use the lower
level tool strip_tac. This tactic will remove a universal quantification, an implication
or a conjunction:

> e (strip_tac >> strip_tac); 17


OK..
1 subgoal:
val it =

∀x p. R^* x p ⇒ ∀z. R x z ⇒ ∃u. R^* p u ∧ R^* z u


------------------------------------
diamond R

Now we can use the induction principle for reflexive and transitive closure (alternatively,
we perform a “rule induction”). To do this, we use the Induct_on command that is also
used to do structural induction on algebraic data types (such as numbers and lists). We
provide the name of the constant whose induction principle we want to use, and the
tactic does the rest:
1 In this and subsequent proofs using the sub-goal package, we will present the proof manager as if
the goal to be proved is the first ever on this stack. In other words, we have done a dropn 1; after every
successful proof to remove the evidence of the old goal. In practice, there is no harm in leaving these goals
on the proof manager’s stack.

> e (Induct_on ‘RTC‘); 18


OK..
1 subgoal:
val it =

(∀x z. R x z ⇒ ∃u. R^* x u ∧ R^* z u) ∧


∀x x’ p.
R x x’ ∧ R^* x’ p ∧ (∀z. R x’ z ⇒ ∃u. R^* p u ∧ R^* z u) ⇒
∀z. R x z ⇒ ∃u. R^* p u ∧ R^* z u
------------------------------------
diamond R

Let’s strip the goal as much as possible with the aim of making what remains to be proved
easier to see:

> e (rw[]); 19
OK..
2 subgoals:
val it =

∃u. R^* p u ∧ R^* z u


------------------------------------
0. diamond R
1. R x x’
2. R^* x’ p
3. ∀z. R x’ z ⇒ ∃u. R^* p u ∧ R^* z u
4. R x z

∃u. R^* x u ∧ R^* z u


------------------------------------
0. diamond R
1. R x z

This first goal is easy. It corresponds to the case where the many steps from 𝑥 to 𝑝 are
actually no steps at all, and 𝑝 and 𝑥 are actually the same place. In the other direction, 𝑥
has taken one step to 𝑧, and we need to find somewhere reachable in zero or more steps
from both 𝑥 and 𝑧. Given what we know so far, the only candidate is 𝑧 itself. In fact, we
don’t even need to provide this witness explicitly: metis_tac will find it for us, as long
as we tell it what the rules governing RTC are:

> e (metis_tac [RTC_rules]); 20


OK..
metis: r[+0+9]+0+0+0+0+0+0+1+0+0+6+1#

Goal proved.
[..] ⊢ ∃u. R^* x u ∧ R^* z u

Remaining subgoals:
val it =

∃u. R^* p u ∧ R^* z u


------------------------------------
0. diamond R
1. R x x’
2. R^* x’ p
3. ∀z. R x’ z ⇒ ∃u. R^* p u ∧ R^* z u
4. R x z

And what of this remaining goal? Assumptions one and four between them are the top
of an 𝑅-diamond. Let’s use the fact that we have the diamond property for 𝑅 and infer
that there exists a 𝑣 to which 𝑥′ and 𝑧 can both take single steps:

> e (‘?v. R x’ v /\ R z v‘ by metis_tac [diamond_def]); 21


OK..
metis: r[+0+16]+0+0+0+0+0+0+0+0+0+0+0+0+0+1+1+1+1+1#
1 subgoal:
val it =

∃u. R^* p u ∧ R^* z u


------------------------------------
0. diamond R
1. R x x’
2. R^* x’ p
3. ∀z. R x’ z ⇒ ∃u. R^* p u ∧ R^* z u
4. R x z
5. R x’ v
6. R z v

Now we can apply our induction hypothesis (assumption 3) to complete the long, lop-
sided strip of the diamond. We will conclude that there is a 𝑢 such that 𝑅∗ 𝑝 𝑢 and 𝑅∗ 𝑣 𝑢.
We actually need a 𝑢 such that 𝖱𝖳𝖢 𝑅 𝑧 𝑢, but because there is a single 𝑅-step between 𝑧
and 𝑣 we have that as well. All we need to provide metis_tac is the rules for RTC:

> e (metis_tac [RTC_rules]); 22


OK..
metis: r[+0+15]+0+0+0+0+0+0+0+0+0+0+1+0+0+1+0+1+0+10+2+0+ .... #
...output elided...

Goal proved.
[.] ⊢ ∀x p. R^* x p ⇒ ∀z. R x z ⇒ ∃u. R^* p u ∧ R^* z u
val it =
Initial goal proved.
⊢ ∀R. diamond R ⇒ ∀x p. R^* x p ⇒ ∀z. R x z ⇒ ∃u. R^* p u ∧ R^* z u: proof

Again we can (and should) package up the lemma, avoiding the sub-goal package
commands:
val R_RTC_diamond = store_thm( 23
"R_RTC_diamond",
‘‘!R. diamond R ==>
!x p. RTC R x p ==>
!z. R x z ==>
?u. RTC R p u /\ RTC R z u‘‘,
strip_tac >> strip_tac >> Induct_on ‘RTC‘ >> rw[] >| [
metis_tac [RTC_rules],
‘?v. R x’ v /\ R z v‘ by metis_tac [diamond_def] >>
metis_tac [RTC_rules]
]);
⋯⋄⋯

Now we can move on to proving that if 𝑅 has the diamond property, so too does 𝑅∗ .
We want to prove this by induction again. It’s very tempting to state the goal as the
obvious

𝖽𝗂𝖺𝗆𝗈𝗇𝖽 𝑅 ⇒ 𝖽𝗂𝖺𝗆𝗈𝗇𝖽 (𝑅∗ )

but doing so will actually make it harder to apply the induction principle when the time
is right. Better to start out with a statement of the goal that is very near in form to the
induction principle. So, we manually expand the meaning of diamond and state our next
goal thus:
> g ‘!R. diamond R ==> !x y. RTC R x y ==> 24
!z. RTC R x z ==>
?u. RTC R y u /\ RTC R z u‘;
<<HOL message: inventing new type variable names: ’a>>
val it =
Proof manager status: 1 proof.
1. Incomplete goalstack:
Initial goal:

∀R. diamond R ⇒ ∀x y. R^* x y ⇒ ∀z. R^* x z ⇒ ∃u. R^* y u ∧ R^* z u



Again we strip the diamond property assumption, apply the induction principle, and strip
repeatedly:

> e (strip_tac >> strip_tac >> Induct_on ‘RTC‘ >> rw[]); 25


OK..
2 subgoals:
val it =

∃u. R^* y u ∧ R^* z u


------------------------------------
0. diamond R
1. R x x’
2. R^* x’ y
3. ∀z. R^* x’ z ⇒ ∃u. R^* y u ∧ R^* z u
4. R^* x z

∃u. R^* x u ∧ R^* z u


------------------------------------
0. diamond R
1. R^* x z

The first goal is again an easy one, corresponding to the case where the trip from 𝑥 to 𝑦
has been one of no steps whatsoever.

> e (metis_tac [RTC_rules]); 26


OK..
metis: r[+0+9]+0+0+0+0+0+0+1#

Goal proved.
[..] ⊢ ∃u. R^* x u ∧ R^* z u

Remaining subgoals:
val it =

∃u. R^* y u ∧ R^* z u


------------------------------------
0. diamond R
1. R x x’
2. R^* x’ y
3. ∀z. R^* x’ z ⇒ ∃u. R^* y u ∧ R^* z u
4. R^* x z

This goal is very similar to the one we saw earlier. We have the top of a (“lop-sided”)
diamond in assumptions 1 and 4, so we can infer the existence of a common destination
for 𝑥′ and 𝑧:

> e (‘?v. RTC R x’ v /\ RTC R z v‘ by metis_tac [R_RTC_diamond]); 27


OK..
metis: r[+0+13]+0+0+0+0+0+0+0+1+0+0+0+1+0+1+1+1+0+1+1#
1 subgoal:
val it =

∃u. R^* y u ∧ R^* z u


------------------------------------
0. diamond R
1. R x x’
2. R^* x’ y
3. ∀z. R^* x’ z ⇒ ∃u. R^* y u ∧ R^* z u
4. R^* x z
5. R^* x’ v
6. R^* z v

At this point in the last proof we were able to finish it all off by just appealing to the
rules for RTC. This time it is not quite so straightforward. When we use the induction
hypothesis (assumption 3), we can conclude that there is a 𝑢 to which both 𝑦 and 𝑣 can
connect in zero or more steps, but in order to show that this 𝑢 is reachable from 𝑧, we
need to be able to conclude 𝑅∗ 𝑧 𝑢 when we know that 𝑅∗ 𝑧 𝑣 (assumption 6 above) and
𝑅∗ 𝑣 𝑢 (our consequence of the inductive hypothesis). We leave the proof of this general
result as an exercise, and here assume that it is already proved as the theorem RTC_RTC.

> e (metis_tac [RTC_rules, RTC_RTC]); 28


OK..
metis: r[+0+16]+0+0+0+0+0+0+0+0+0+0+2+0+0+0+0+1+14+21+1+2 .... #
...output elided...

Goal proved.
[.....] ⊢ ∃u. R^* y u ∧ R^* z u
val it =
Initial goal proved.
⊢ ∀R. diamond R ⇒ ∀x y. R^* x y ⇒ ∀z. R^* x z ⇒ ∃u. R^* y u ∧ R^* z u

We can package this result up as a lemma and then prove the prettier version directly:

val diamond_RTC_lemma = prove( 29


‘‘!R.
diamond R ==>
!x y. RTC R x y ==> !z. RTC R x z ==> ?u. RTC R y u /\ RTC R z u‘‘,
strip_tac >> strip_tac >> Induct_on ‘RTC‘ >> rw[] >| [
metis_tac [RTC_rules],
‘?v. RTC R x’ v /\ RTC R z v‘ by metis_tac [R_RTC_diamond] >>
metis_tac [RTC_RTC, RTC_rules]
]);
val diamond_RTC = store_thm(
"diamond_RTC",
‘‘!R. diamond R ==> diamond (RTC R)‘‘,
metis_tac [diamond_def,diamond_RTC_lemma]);

6.5 Back to combinators


Now, we are in a position to return to the real object of study and prove confluence for
combinatory logic. We have done an abstract development and established that
𝖽𝗂𝖺𝗆𝗈𝗇𝖽 𝑅 ⇒ 𝖽𝗂𝖺𝗆𝗈𝗇𝖽 (𝖱𝖳𝖢 𝑅)

𝖽𝗂𝖺𝗆𝗈𝗇𝖽 (𝖱𝖳𝖢 𝑅) ≡ 𝖼𝗈𝗇𝖿𝗅𝗎𝖾𝗇𝗍 𝑅
(We have also established a couple of other useful results along the way.)
Sadly, it just isn’t the case that →, our one-step relation for combinators, has the
diamond property. A counter-example is 𝖪 𝖲 (𝖪 𝖪 𝖪). Its possible evolution can be
described graphically:
𝖪 𝖲 (𝖪 𝖪 𝖪)
   ↙      ↘
  𝖲       𝖪 𝖲 𝖪
             ↓
             𝖲
If we had the diamond property, it should be possible to find a common destination
for 𝖪 𝖲 𝖪 and 𝖲. However, S doesn’t admit any reductions whatsoever, so there isn’t a
common destination.2
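To see the failure concretely, the one-step relation can be modelled executably outside HOL. The following Python sketch (our own term representation, not part of the HOL sources) enumerates one-step reducts, and confirms that 𝖪 𝖲 (𝖪 𝖪 𝖪) has exactly the two reducts above while 𝖲 has none.

```python
# Terms as nested tuples: ("ap", f, x) is application; names are ours.
K, S = ("K",), ("S",)
def ap(f, x): return ("ap", f, x)

def reducts(t):
    """All one-step reducts of t: the K and S contractions at the root,
    plus reductions inside either half of an application."""
    out = []
    if t[0] == "ap":
        f, x = t[1], t[2]
        if f[0] == "ap" and f[1] == K:            # K # a # b --> a
            out.append(f[2])
        if f[0] == "ap" and f[1][0] == "ap" and f[1][1] == S:
            out.append(ap(ap(f[1][2], x), ap(f[2], x)))   # S rule
        out += [ap(f2, x) for f2 in reducts(f)]   # congruence, left
        out += [ap(f, x2) for x2 in reducts(x)]   # congruence, right
    return out

ksk3 = ap(ap(K, S), ap(ap(K, K), K))              # K S (K K K)
print(set(reducts(ksk3)) == {S, ap(ap(K, S), K)}) # True: reducts are S, K S K
print(reducts(S))                                 # []: S cannot reduce
```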
This is a problem. We are going to have to take another approach. We will define
another reduction strategy (parallel reduction), and prove that its reflexive, transitive
closure is actually the same relation as our original’s reflexive and transitive closure.
Then we will also show that parallel reduction has the diamond property. This will
establish that its reflexive, transitive closure has it too. Then, because they are the same
relation, we will have that the reflexive, transitive closure of our original relation has the
diamond property, and therefore, our original relation will be confluent.
2 In fact, our counter-example is more complicated than necessary. The fact that 𝖪 𝖲 𝖪 has a reduction
to the normal form 𝖲 also acts as a counter-example. Can you see why?

6.5.1 Parallel reduction


Our new relation allows for any number of reductions to occur in parallel. We use the
-||-> symbol to indicate parallel reduction because of its own parallel lines, and use
predn to name the constant:

> set_mapped_fixity {tok = "-||->", fixity = Infix(NONASSOC, 450), 30


term_name = "predn"};
val it = (): unit

Then we can define parallel reduction itself. The rules look very similar to those for →.
The difference is that we allow the reflexive transition, and say that an application
𝑥 𝑢 can be transformed to 𝑦 𝑣 if there are transformations taking 𝑥 to 𝑦 and 𝑢 to 𝑣.
Incidentally, this is why we must have reflexivity: without it, a term like (𝖪 𝑥 𝑦) 𝖪 couldn’t
reduce at all, because while the LHS of the application (𝖪 𝑥 𝑦) can reduce, its RHS (𝖪) can’t.

> val (predn_rules, _, predn_cases) = Hol_reln 31


‘(!x. x -||-> x) /\
(!x y u v. x -||-> y /\ u -||-> v
==>
x # u -||-> y # v) /\
(!x y. K # x # y -||-> x) /\
(!f g x. S # f # g # x -||-> (f # x) # (g # x))‘; ...output elided...
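The same executable modelling outside HOL suggests why this definition is a good idea. The sketch below (again with invented Python names, not HOL code) enumerates parallel reducts according to the four rules just given, and checks that on the earlier counter-example 𝖪 𝖲 (𝖪 𝖪 𝖪) every pair of parallel reducts can be rejoined in a single parallel step.

```python
K, S = ("K",), ("S",)
def ap(f, x): return ("ap", f, x)

def preducts(t):
    """All parallel reducts of t: t itself (the reflexive rule), any
    combination of parallel reducts of the two halves of an application,
    and the K and S contractions at the root."""
    out = {t}
    if t[0] == "ap":
        f, x = t[1], t[2]
        for f2 in preducts(f):                    # reduce both sides at once
            for x2 in preducts(x):
                out.add(ap(f2, x2))
        if f[0] == "ap" and f[1] == K:            # K # a # b -||-> a
            out.add(f[2])
        if f[0] == "ap" and f[1][0] == "ap" and f[1][1] == S:
            out.add(ap(ap(f[1][2], x), ap(f[2], x)))      # S rule
    return out

t = ap(ap(K, S), ap(ap(K, K), K))                 # the earlier counter-example
diamond_ok = all(preducts(y) & preducts(z)
                 for y in preducts(t) for z in preducts(t))
print(diamond_ok)  # True: every divergence closes in one parallel step
```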

6.5.2 Using RTC


Now we can set up nice syntax for the reflexive and transitive closures of our two
relations. We will use ASCII symbols for both that consist of the original symbol followed
by an asterisk. Note also how, in defining the two relations, we have to use the $
character to “escape” the symbols’ usual fixities. This is exactly analogous to the way
in which ML’s op keyword is used. First, we create the desired symbol for the concrete
syntax, and then we “overload” it so that the parser will expand it to the desired form.

> set_fixity "-->*" (Infix(NONASSOC, 450)); 32


val it = (): unit

> overload_on ("-->*", ‘‘RTC redn‘‘);


val it = (): unit

We do exactly the same thing for the reflexive and transitive closure of our parallel
reduction.
> set_fixity "-||->*" (Infix(NONASSOC, 450)); 33
val it = (): unit

> overload_on ("-||->*", ‘‘RTC predn‘‘);


val it = (): unit

Incidentally, in conjunction with PROVE we can now automatically demonstrate relatively
long chains of reductions:
> PROVE [RTC_rules, redn_rules] ‘‘S # K # K # x -->* x‘‘; 34
Meson search level: ......
val it = ⊢ S # K # K # x -->* x: thm

> PROVE [RTC_rules, redn_rules]


‘‘S # (S # (K # S) # K) # (S # K # K) # f # x -->*
f # (f # x)‘‘;
Meson search level: ...........................
val it = ⊢ S # (S # (K # S) # K) # (S # K # K) # f # x -->* f # (f # x): thm
(The latter sequence is seven reductions long.)

6.5.3 Proving the RTCs are the same


We start with the easier direction, and show that everything in →∗ is in −∣∣→∗ . Because
RTC is monotone (which fact is left to the reader to prove), we can reduce this to
showing that 𝑥 → 𝑦 ⇒ 𝑥 −∣∣→ 𝑦.
Our goal:
> g ‘!x y. x -->* y ==> x -||->* y‘; 35
val it =
Proof manager status: 1 proof.
1. Incomplete goalstack:
Initial goal:

∀x y. x -->* y ⇒ x -||->* y
We back-chain using our monotonicity result:
> e (match_mp_tac RTC_monotone); 36
OK..
1 subgoal:
val it =

∀x y. x --> y ⇒ x -||-> y
Now we can induct over the rules for →:
> e (Induct_on ‘x --> y‘); 37
OK..
1 subgoal:
val it =

(∀x y f. x --> y ∧ x -||-> y ⇒ f # x -||-> f # y) ∧


(∀f g x. f --> g ∧ f -||-> g ⇒ f # x -||-> g # x) ∧
(∀x y. K # x # y -||-> x) ∧ ∀f g x. S # f # g # x -||-> f # x # (g # x)

We could split the 4-way conjunction apart into four goals, but there is no real need. It is
quite clear that each follows immediately from the rules for parallel reduction.
> e (metis_tac [predn_rules]); 38
OK..
metis: r[+0+5]#
r[+0+4]#
r[+0+8]+0+0+0+0+0+0+0+1#
r[+0+8]+0+0+0+0+0+0+0+1# ...output elided...

Goal proved.
⊢ ∀x y. x --> y ⇒ x -||-> y
val it =
Initial goal proved.
⊢ ∀x y. x -->* y ⇒ x -||->* y: proof
Packaged into a tidy little sub-goal-package-free parcel, our proof is
val RTCredn_RTCpredn = store_thm( 39
"RTCredn_RTCpredn",
‘‘!x y. x -->* y ==> x -||->* y‘‘,
match_mp_tac RTC_monotone >>
Induct_on ‘x --> y‘ >> metis_tac [predn_rules]);
⋯⋄⋯

Our next proof is in the other direction. It should be clear that we will not simply be
able to appeal to the monotonicity of RTC this time: one step of the parallel reduction
relation cannot, in general, be mirrored by a single step of the original reduction
relation. Rather, mirroring one step of the parallel relation may take many steps of
the original. Let’s prove exactly that:
> g ‘!x y. x -||-> y ==> x -->* y‘; 40
val it =
Proof manager status: 1 proof.
1. Incomplete goalstack:
Initial goal:

∀x y. x -||-> y ⇒ x -->* y
This time our induction will be over the rules defining the parallel reduction relation.
> e (Induct_on ‘x -||-> y‘); 41
OK..
1 subgoal:
val it =

(∀x. x -->* x) ∧
(∀x y x’ y’.
x -||-> y ∧ x -->* y ∧ x’ -||-> y’ ∧ x’ -->* y’ ⇒ x # x’ -->* y # y’) ∧
(∀y y’. K # y # y’ -->* y) ∧ ∀f g x. S # f # g # x -->* f # x # (g # x)

There are four conjuncts here, and it should be clear that all but the second can be
proved immediately by appeal to the rules for the transitive closure and for → itself. So,
we split apart the conjunctions and enter a THENL branch, putting in an all_tac in the
2nd position so that this falls through to be dealt with more carefully.
> e (rpt conj_tac >| [metis_tac[RTC_rules, redn_rules], 42
all_tac,
metis_tac[RTC_rules, redn_rules],
metis_tac[RTC_rules, redn_rules] ]);
OK..
metis: r[+0+11]+0+0+0+0+0+0+0+0+0+0+8+1#
metis: r[+0+11]+0+0+0+0+0+0+0+0+0+0+8+1#
metis: r[+0+3]#
1 subgoal:
val it =

∀x y x’ y’.
x -||-> y ∧ x -->* y ∧ x’ -||-> y’ ∧ x’ -->* y’ ⇒ x # x’ -->* y # y’
What of this latest sub-goal? If we look at it for long enough, we should see that it
is another monotonicity fact. More accurately, we need what is called a congruence
result for -->*. In this form, it’s not quite right for easy proof. Let’s go away and prove
RTCredn_ap_congruence separately. (Another exercise!) Our new theorem should state
val RTCredn_ap_congruence = store_thm( 43
"RTCredn_ap_congruence",
‘‘!x y. x -->* y ==> !z. x # z -->* y # z /\ z # x -->* z # y‘‘,
...);
Now that we have this, our sub-goal is almost immediately provable. Using it, we know
that
𝑥 𝑥′ →∗ 𝑦 𝑥′
𝑦 𝑥′ →∗ 𝑦 𝑦′
All we need to do is “stitch together” the two transitions above and go from 𝑥 𝑥′ to 𝑦 𝑦′ .
We can do this by appealing to our earlier RTC_RTC result.
> e (metis_tac [RTC_RTC, RTCredn_ap_congruence]); 44
OK..
metis: r[+0+9]+0+0+0+0+0+0+0+0+10+1+2+3+1+2+1+7+1# ...output elided...

Goal proved.
⊢ (∀x. x -->* x) ∧
(∀x y x’ y’.
x -||-> y ∧ x -->* y ∧ x’ -||-> y’ ∧ x’ -->* y’ ⇒ x # x’ -->* y # y’) ∧
(∀y y’. K # y # y’ -->* y) ∧ ∀f g x. S # f # g # x -->* f # x # (g # x)
val it =
Initial goal proved.
⊢ ∀x y. x -||-> y ⇒ x -->* y: proof

But given that we can finish off what we thought was an awkward branch with just
another application of metis_tac, we don’t need to use our fancy branching footwork at
the stage before. Instead, we can just merge the theorem lists passed to both invocations,
dispense with the rpt conj_tac and have a very short tactic proof indeed:
val predn_RTCredn = store_thm( 45
"predn_RTCredn",
‘‘!x y. x -||-> y ==> x -->* y‘‘,
Induct_on ‘x -||-> y‘ >>
metis_tac [RTC_rules, redn_rules, RTC_RTC, RTCredn_ap_congruence]);
⋯⋄⋯

Now it’s time to prove that if a number of parallel reduction steps are chained together,
then we can mirror this with some number of steps using the original reduction relation.
Our goal:
> g ‘!x y. x -||->* y ==> x -->* y‘; 46
val it =
Proof manager status: 1 proof.
1. Incomplete goalstack:
Initial goal:

∀x y. x -||->* y ⇒ x -->* y

We use the appropriate induction principle to get to:


> e (Induct_on ‘RTC‘); 47
OK..
1 subgoal:
val it =

(∀x. x -->* x) ∧ ∀x x’ y. x -||-> x’ ∧ x’ -||->* y ∧ x’ -->* y ⇒ x -->* y

This we can finish off in one step. The first conjunct is obvious, and in the second the
x -||-> y and our last result combine to tell us that x -->* y. Then this can be chained
together with the other assumption in the second conjunct and we’re done.
> e (metis_tac [RTC_rules, predn_RTCredn, RTC_RTC]); 48
OK..
metis: r[+0+12]+0+0+0+0+0+0+0+0+1+0+0+8+12+5+0+0+3+4+4+8+1#
r[+0+3]#

Goal proved.
⊢ (∀x. x -->* x) ∧ ∀x x’ y. x -||-> x’ ∧ x’ -||->* y ∧ x’ -->* y ⇒ x -->* y
val it =
Initial goal proved.
⊢ ∀x y. x -||->* y ⇒ x -->* y: proof

Packaged up, this proof is:

val RTCpredn_RTCredn = store_thm( 49


"RTCpredn_RTCredn",
‘‘!x y. x -||->* y ==> x -->* y‘‘,
Induct_on ‘RTC‘ >> metis_tac [predn_RTCredn, RTC_RTC, RTC_rules]);

⋯⋄⋯

Our final act is to use what we have so far to conclude that →∗ and −∣∣→∗ are equal.
We state our goal:

> g ‘$-||->* = $-->*‘; 50


val it =
Proof manager status: 1 proof.
1. Incomplete goalstack:
Initial goal:

$-||->* = $-->*

We want to now appeal to extensionality. The simplest way to do this is to rewrite with
the theorem FUN_EQ_THM:
> FUN_EQ_THM; 51
val it = ⊢ ∀f g. (f = g) ⟺ ∀x. f x = g x: thm

So, we rewrite:
> e (rw[FUN_EQ_THM]); 52
OK..
1 subgoal:
val it =

x -||->* x’ ⟺ x -->* x’

This goal is an easy consequence of our two earlier implications.

> e (metis_tac [RTCpredn_RTCredn, RTCredn_RTCpredn]); 53


OK..
metis: r[+0+5]+0+0+0+1#
r[+0+5]+0+0+0+0+1#

Goal proved.
⊢ x -||->* x’ ⟺ x -->* x’
val it =
Initial goal proved.
⊢ $-||->* = $-->*: proof

Packaged, the proof is:



val RTCpredn_EQ_RTCredn = store_thm( 54


"RTCpredn_EQ_RTCredn",
‘‘$-||->* = $-->*‘‘,
rw [FUN_EQ_THM] >>
metis_tac [RTCpredn_RTCredn, RTCredn_RTCpredn]);

6.5.4 Proving a diamond property for parallel reduction


Now we just have one substantial proof to go. Before we can even begin, there are a
number of minor lemmas we will need to prove first. These are basically specialisations
of the theorem predn_cases. We want exhaustive characterisations of the possibilities
when the following terms undergo a parallel reduction: 𝑥 𝑦, K, S, 𝖪 𝑥, 𝖲 𝑥, 𝖪 𝑥 𝑦, 𝖲 𝑥 𝑦
and 𝖲 𝑥 𝑦 𝑧.
To do this, we will write a little function that derives characterisations automatically:
> fun characterise t = SIMP_RULE (srw_ss()) [] (SPEC t predn_cases); 55
val characterise = fn: term -> thm

The characterise function specialises the theorem predn_cases with the input term,
and then simplifies. The srw_ss() simpset includes information about the injectivity and
disjointness of constructors and eliminates obvious impossibilities. For example,
> val K_predn = characterise ‘‘K‘‘; 56
val K_predn = ⊢ ∀a1. K -||-> a1 ⟺ (a1 = K): thm

> val S_predn = characterise ‘‘S‘‘;


val S_predn = ⊢ ∀a1. S -||-> a1 ⟺ (a1 = S): thm

Unfortunately, what we get back from other inputs is not so good:


> val Sx_predn0 = characterise ‘‘S # x‘‘; 57
val Sx_predn0 =
⊢ ∀a1.
S # x -||-> a1 ⟺
(a1 = S # x) ∨ ∃y v. (a1 = y # v) ∧ S -||-> y ∧ x -||-> v: thm

That first disjunct is redundant, as the following demonstrates:


val Sx_predn = prove( 58
‘‘!x y. S # x -||-> y = ?z. (y = S # z) /\ (x -||-> z)‘‘,
rw[EQ_IMP_THM, Sx_predn0, predn_rules, S_predn]);

Our characterise function will just have to help us in the proofs that follow.
val Kx_predn = prove( 59
‘‘!x y. K # x -||-> y = ?z. (y = K # z) /\ (x -||-> z)‘‘,
rw[characterise ‘‘K # x‘‘, predn_rules, K_predn, EQ_IMP_THM]);

What of 𝖪 𝑥 𝑦? A little thought demonstrates that there really must be two cases this
time.

val Kxy_predn = prove( 60


‘‘!x y z.
K # x # y -||-> z =
(?u v. (z = K # u # v) /\ (x -||-> u) /\ (y -||-> v)) \/
(z = x)‘‘,
rw[EQ_IMP_THM, characterise ‘‘K # x # y‘‘, predn_rules, Kx_predn]);

By way of contrast, there is only one case for 𝖲 𝑥 𝑦 because it is not yet a “redex” at the
top-level.

val Sxy_predn = prove( 61


‘‘!x y z. S # x # y -||-> z =
?u v. (z = S # u # v) /\ (x -||-> u) /\ (y -||-> v)‘‘,
rw[characterise ‘‘S # x # y‘‘, predn_rules, EQ_IMP_THM, Sx_predn]);

Next, the characterisation for 𝖲 𝑥 𝑦 𝑧:

val Sxyz_predn = prove( 62


‘‘!w x y z. S # w # x # y -||-> z =
(?p q r. (z = S # p # q # r) /\
w -||-> p /\ x -||-> q /\ y -||-> r) \/
(z = (w # y) # (x # y))‘‘,
rw[characterise ‘‘S # w # x # y‘‘, predn_rules, EQ_IMP_THM, Sxy_predn]);

Last of all, we want a characterisation for 𝑥 𝑦. What characterise gives us this time
can’t be improved upon, for all that we might look upon the four disjunctions and despair.

> val x_ap_y_predn = characterise ‘‘x # y‘‘; 63


val x_ap_y_predn =
⊢ ∀a1.
x # y -||-> a1 ⟺
(a1 = x # y) ∨ (∃y’ v. (a1 = y’ # v) ∧ x -||-> y’ ∧ y -||-> v) ∨
(x = K # a1) ∨ ∃f g. (x = S # f # g) ∧ (a1 = f # y # (g # y)): thm

⋯⋄⋯

Now we are ready to prove the final goal. It is

> g ‘!x y. x -||-> y ==> 64


!z. x -||-> z ==> ?u. y -||-> u /\ z -||-> u‘;
val it =
Proof manager status: 2 proofs.
2. Completed goalstack: ⊢ $-||->* = $-->*
1. Incomplete goalstack:
Initial goal:

∀x y. x -||-> y ⇒ ∀z. x -||-> z ⇒ ∃u. y -||-> u ∧ z -||-> u

We now induct and split the goal into its individual conjuncts:

> e (Induct_on ‘x -||-> y‘ >> rpt conj_tac); 65


OK..
4 subgoals:
val it =

∀f g x z. S # f # g # x -||-> z ⇒ ∃u. f # x # (g # x) -||-> u ∧ z -||-> u

∀y y’ z. K # y # y’ -||-> z ⇒ ∃u. y -||-> u ∧ z -||-> u

∀x y x’ y’.
x -||-> y ∧ (∀z. x -||-> z ⇒ ∃u. y -||-> u ∧ z -||-> u) ∧
x’ -||-> y’ ∧ (∀z. x’ -||-> z ⇒ ∃u. y’ -||-> u ∧ z -||-> u) ⇒
∀z. x # x’ -||-> z ⇒ ∃u. y # y’ -||-> u ∧ z -||-> u

∀x z. x -||-> z ⇒ ∃u. x -||-> u ∧ z -||-> u

The first goal is easily disposed of. The witness we would provide for this case is simply
z, but metis_tac will do the work for us:
> e (metis_tac [predn_rules]); 66
OK..
metis: r[+0+7]+0+0+0+0+1#

Goal proved.
⊢ ∀x z. x -||-> z ⇒ ∃u. x -||-> u ∧ z -||-> u

Remaining subgoals:
val it =

∀f g x z. S # f # g # x -||-> z ⇒ ∃u. f # x # (g # x) -||-> u ∧ z -||-> u

∀y y’ z. K # y # y’ -||-> z ⇒ ∃u. y -||-> u ∧ z -||-> u

∀x y x’ y’.
x -||-> y ∧ (∀z. x -||-> z ⇒ ∃u. y -||-> u ∧ z -||-> u) ∧
x’ -||-> y’ ∧ (∀z. x’ -||-> z ⇒ ∃u. y’ -||-> u ∧ z -||-> u) ⇒
∀z. x # x’ -||-> z ⇒ ∃u. y # y’ -||-> u ∧ z -||-> u

The next goal includes two instances of terms of the form x # y -||-> z. We can use
our x_ap_y_predn theorem here. However, if we rewrite indiscriminately with it, we

will really confuse the goal. We want to rewrite just the assumption, not the instance
underneath the existential quantifier. Starting out by repeatedly stripping the goal
(with rw[]) can’t lead us too far astray.

> e (rw[]); 67
OK..
1 subgoal:
val it =

∃u. y # y’ -||-> u ∧ z -||-> u


------------------------------------
0. x -||-> y
1. ∀z. x -||-> z ⇒ ∃u. y -||-> u ∧ z -||-> u
2. x’ -||-> y’
3. ∀z. x’ -||-> z ⇒ ∃u. y’ -||-> u ∧ z -||-> u
4. x # x’ -||-> z

We need to split up assumption 4. We can get it out of the assumption list using the
qpat_x_assum theorem-tactical. We will write

qpat_x_assum ‘_ # _ -||-> _‘
(strip_assume_tac o SIMP_RULE std_ss [x_ap_y_predn])

The quotation specifies the pattern that we want to match: we want the term that has
an application term reducing, and as there is just one such, we can use “don’t care”
underscore patterns for the various arguments. The second argument specifies how we
are going to transform the theorem. Reading the compositions from right to left, first we
will simplify with the x_ap_y_predn theorem and then we will assume the result back
into the assumptions, stripping disjunctions and existentials as we go.3
We already know that doing this is going to produce four new sub-goals (there were
four disjuncts in the x_ap_y_predn theorem). We’ll follow up the use of strip_assume_tac
with rw to eliminate any equalities that might appear as assumptions.
So:

3 An alternative to using qpat_x_assum is to use by instead: you would have to state the four-way
disjunction yourself, but the proof would be more “declarative” in style, and though wordier, might be
more maintainable.

> e (qpat_x_assum ‘_ # _ -||-> _‘ 68


(strip_assume_tac o SIMP_RULE (srw_ss()) [x_ap_y_predn]) >>
rw[]);
OK..
4 subgoals:
val it =
...3 subgoals elided...

∃u. y # y’ -||-> u ∧ x # x’ -||-> u


------------------------------------
0. x -||-> y
1. ∀z. x -||-> z ⇒ ∃u. y -||-> u ∧ z -||-> u
2. x’ -||-> y’
3. ∀z. x’ -||-> z ⇒ ∃u. y’ -||-> u ∧ z -||-> u

This first sub-goal is an easy consequence of the rules for parallel reduction. Because
we’ve elided the somewhat voluminous output, we call p() to print the next sub-goal
again:

> e (metis_tac[predn_rules]); ...output elided... 69


> p();
val it =
...2 subgoals elided...

∃u. y # y’ -||-> u ∧ y’’ # v -||-> u


------------------------------------
0. x -||-> y
1. ∀z. x -||-> z ⇒ ∃u. y -||-> u ∧ z -||-> u
2. x’ -||-> y’
3. ∀z. x’ -||-> z ⇒ ∃u. y’ -||-> u ∧ z -||-> u
4. x -||-> y’’
5. x’ -||-> v

This goal requires application of the two inductive hypotheses as well as the rules for
parallel reduction, but is again straightforward for metis_tac:

> e (metis_tac[predn_rules]); ...output elided... 70


> p();
val it =
...1 subgoal elided...

∃u. y # y’ -||-> u ∧ z -||-> u


------------------------------------
0. K # z -||-> y
1. ∀z’. K # z -||-> z’ ⇒ ∃u. y -||-> u ∧ z’ -||-> u
2. x’ -||-> y’
3. ∀z. x’ -||-> z ⇒ ∃u. y’ -||-> u ∧ z -||-> u

Now our next goal (the third of the four) features a term K # z -||-> y in the assump-
tions. We have a theorem that pertains to just this situation. But before applying it
willy-nilly, let us try to figure out exactly what the situation is. A diagram of the current
situation might look like

        K # z # x’
         ↙      ↘
     y # y’      z
         ↘      ↙
          ?u?
Our theorem tells us that y must actually be of the form K # w for some w, and that there
must be an arrow between z and w. Thus:
> e (‘?w. (y = K # w) /\ (z -||-> w)‘ by metis_tac [Kx_predn]); 71
OK..
metis: r[+0+11]+0+0+0+0+0+1+2+0+1+1+6+1+1#
1 subgoal:
val it =

∃u. y # y’ -||-> u ∧ z -||-> u


------------------------------------
0. K # z -||-> y
1. ∀z’. K # z -||-> z’ ⇒ ∃u. y -||-> u ∧ z’ -||-> u
2. x’ -||-> y’
3. ∀z. x’ -||-> z ⇒ ∃u. y’ -||-> u ∧ z -||-> u
4. y = K # w
5. z -||-> w

On inspection, it becomes clear that the u must be w. The first conjunct requires
K # w # y’ -||-> w, which we have because this is what Ks do, and the second conjunct
is already in the assumption list. Rewriting (eliminating that equality in the assumption
list first makes metis_tac’s job that much easier) and then first-order reasoning will
solve this goal:

> e (rw [] >> metis_tac [predn_rules]); 72


OK..
metis: r[+0+13]+0+0+0+0+0+0+0+0+0+0+1# ...output elided...

Goal proved.
[....] ⊢ ∃u. y # y’ -||-> u ∧ z -||-> u

Remaining subgoals:
val it =

∃u. y # y’ -||-> u ∧ f # x’ # (g # x’) -||-> u


------------------------------------
0. S # f # g -||-> y
1. ∀z. S # f # g -||-> z ⇒ ∃u. y -||-> u ∧ z -||-> u
2. x’ -||-> y’
3. ∀z. x’ -||-> z ⇒ ∃u. y’ -||-> u ∧ z -||-> u

This case involving S is analogous. Here’s the tactic to apply:

> e (‘?p q. (y = S # p # q) /\ (f -||-> p) /\ (g -||-> q)‘ 73


by metis_tac [Sxy_predn] >>
rw [] >> metis_tac [predn_rules]);
OK..
metis: r[+0+12]+0+0+0+0+0+2+0+1+5+0+7+0+5+1+4+7+1+1#
metis: r[+0+14]+0+0+0+0+0+0+0+0+0+0+0+0+4+2+1+1+1+1+1# ...output elided...

Goal proved.
⊢ ∀x y x’ y’.
x -||-> y ∧ (∀z. x -||-> z ⇒ ∃u. y -||-> u ∧ z -||-> u) ∧ x’ -||-> y’ ∧
(∀z. x’ -||-> z ⇒ ∃u. y’ -||-> u ∧ z -||-> u) ⇒
∀z. x # x’ -||-> z ⇒ ∃u. y # y’ -||-> u ∧ z -||-> u

Remaining subgoals:
val it =
...1 subgoal elided...

∀y y’ z. K # y # y’ -||-> z ⇒ ∃u. y -||-> u ∧ z -||-> u

This next goal features a K # x # y -||-> z term that we have a theorem for already.
Let’s speculatively use a call to metis_tac to eliminate the simple cases immediately
(the right-hand side of Kxy_predn is a disjunction, so we’ll get two sub-goals if nothing
is eliminated).

> e (rw[Kxy_predn] >> metis_tac[predn_rules]); 74


OK..
metis: r[+0+3]#
metis: r[+0+8]+0+0+0+0+0+0+1#

Goal proved.
⊢ ∀y y’ z. K # y # y’ -||-> z ⇒ ∃u. y -||-> u ∧ z -||-> u

Remaining subgoals:
val it =

∀f g x z. S # f # g # x -||-> z ⇒ ∃u. f # x # (g # x) -||-> u ∧ z -||-> u

We got both cases immediately, and have moved on to the last case. We can try the same
strategy.
> e (rw[Sxyz_predn] >> metis_tac [predn_rules]); 75
OK..
metis: r[+0+3]#
metis: r[+0+9]+0+0+0+0+0+0+0+2+0+2+1+1+1#

Goal proved.
⊢ ∀f g x z. S # f # g # x -||-> z ⇒ ∃u. f # x # (g # x) -||-> u ∧ z -||-> u
val it =
Initial goal proved.
⊢ ∀x y. x -||-> y ⇒ ∀z. x -||-> z ⇒ ∃u. y -||-> u ∧ z -||-> u: proof

The final goal proof can be packaged into:


val predn_diamond_lemma = prove( 76
‘‘!x y. x -||-> y ==>
!z. x -||-> z ==> ?u. y -||-> u /\ z -||-> u‘‘,
Induct_on ‘x -||-> y‘ >> rpt conj_tac >| [
metis_tac [predn_rules],
rw[] >>
qpat_x_assum ‘_ # _ -||-> _‘
(strip_assume_tac o SIMP_RULE std_ss [x_ap_y_predn]) >>
rw[] >| [
metis_tac[predn_rules],
metis_tac[predn_rules],
‘?w. (y = K # w) /\ (z -||-> w)‘ by metis_tac [Kx_predn] >>
rw[] >> metis_tac [predn_rules],
‘?p q. (y = S # p # q) /\ (f -||-> p) /\ (g -||-> q)‘ by
metis_tac [Sxy_predn] >>
rw [] >> metis_tac [predn_rules]
],
rw[Kxy_predn] >> metis_tac [predn_rules],
rw[Sxyz_predn] >> metis_tac [predn_rules]
]);

⋯⋄⋯

We are on the home straight. The lemma can be turned into a statement involving the
diamond constant directly:

val predn_diamond = store_thm( 77


"predn_diamond",
‘‘diamond predn‘‘,
metis_tac [diamond_def, predn_diamond_lemma]);

And now we can prove that our original relation is confluent in similar fashion:

val confluent_redn = store_thm( 78


"confluent_redn",
‘‘confluent redn‘‘,
metis_tac [predn_diamond, confluent_diamond_RTC,
RTCpredn_EQ_RTCredn, diamond_RTC]);

6.6 Exercises
If necessary, answers to the first three exercises can be found by examining the source
file in examples/ind_def/clScript.sml.

1. Prove that

𝖱𝖳𝖢 𝑅 𝑥 𝑦 ∧ 𝖱𝖳𝖢 𝑅 𝑦 𝑧 ⇒ 𝖱𝖳𝖢 𝑅 𝑥 𝑧

You will need to prove the goal by induction, and will probably need to massage
it slightly first to get it to match the appropriate induction principle. Store the
theorem under the name RTC_RTC.

2. Another induction. Show that

(∀𝑥 𝑦. 𝑅1 𝑥 𝑦 ⇒ 𝑅2 𝑥 𝑦) ⇒ (∀𝑥 𝑦. 𝖱𝖳𝖢 𝑅1 𝑥 𝑦 ⇒ 𝖱𝖳𝖢 𝑅2 𝑥 𝑦)

Call the resulting theorem RTC_monotone.

3. Yet another RTC induction, but where 𝑅 is no longer abstract, and is instead the
original reduction relation. Prove

𝑥 →∗ 𝑦 ⇒ ∀𝑧. 𝑥 𝑧 →∗ 𝑦 𝑧 ∧ 𝑧 𝑥 →∗ 𝑧 𝑦

Call it RTCredn_ap_congruence.

4. Come up with a counter-example for the following property:


(∀𝑥 𝑦 𝑧. 𝑅 𝑥 𝑦 ∧ 𝑅 𝑥 𝑧 ⇒ ∃𝑢. 𝖱𝖳𝖢 𝑅 𝑦 𝑢 ∧ 𝖱𝖳𝖢 𝑅 𝑧 𝑢) ⇒ 𝖽𝗂𝖺𝗆𝗈𝗇𝖽 (𝖱𝖳𝖢 𝑅)
Chapter 7

Proof Tools: Propositional Logic

Users of H OL can create their own theorem proving tools by combining predefined rules
and tactics. The ML type-discipline ensures that only logically sound methods can be
used to create values of type thm. In this chapter, a real example is described.
Two implementations of the tool are given to illustrate various styles of proof pro-
gramming. The first implementation is the obvious one, but is inefficient because of the
‘brute force’ method used. The second implementation attempts to be a great deal more
intelligent. Extensions to the tools to allow more general applicability are also discussed.
The problem to be solved is that of deciding the truth of a closed formula of proposi-
tional logic. Such a formula has the general form
𝜑 ∶∶= 𝑣 | ¬𝜑 | 𝜑 ∧ 𝜑 | 𝜑 ∨ 𝜑 | 𝜑 ⇒ 𝜑 | 𝜑 = 𝜑
𝑓𝑜𝑟𝑚𝑢𝑙𝑎 ∶∶= ∀𝑣⃗. 𝜑
where the variables 𝑣 are all of boolean type, and where the universal quantification at
the outermost level captures all of the free variables.

7.1 Method 1: Truth Tables


The first method to be implemented is the brute force method of trying all possible
boolean combinations. This approach’s only real virtue is that it is exceptionally easy to
implement. First we will prove the motivating theorem:
val FORALL_BOOL = prove(
‘‘(!v. P v) = P T /\ P F‘‘,
rw[EQ_IMP_THM] >> Cases_on ‘v‘ >> rw[]);
The proof proceeds by splitting the goal into two halves, showing
(∀𝑣. 𝑃 (𝑣)) ⇒ 𝑃 (⊤) ∧ 𝑃 (⊥)
(which goal is automatically shown by the simplifier), and
𝑃 (⊤) ∧ 𝑃 (⊥) ⇒ 𝑃 (𝑣)
for an arbitrary boolean variable 𝑣. After case-splitting on 𝑣, the assumptions are then
enough to show the goal. (This theorem is actually already proved in the theory bool.)
The next, and final, step is to rewrite with this theorem:


val tautDP = SIMP_CONV bool_ss [FORALL_BOOL]

This enables the following

> tautDP ‘‘!p q. p /\ q /\ ~p‘‘; 1


val it = ⊢ (∀p q. p ∧ q ∧ ¬p) ⟺ F: thm

> tautDP ‘‘!p. p \/ ~p‘‘


val it = ⊢ (∀p. p ∨ ¬p) ⟺ T: thm

and even the marginally more intimidating

> time tautDP 2


‘‘!p q c a. ~(((~a \/ p /\ ~q \/ ~p /\ q) /\
(~(p /\ ~q \/ ~p /\ q) \/ a)) /\
(~c \/ p /\ q) /\ (~(p /\ q) \/ c)) \/
~(p /\ q) \/ c /\ ~a‘‘;
runtime: 0.02058s, gctime: 0.00559s, systime: 0.00237s.
val it =
⊢ (∀p q c a.
¬(((¬a ∨ p ∧ ¬q ∨ ¬p ∧ q) ∧ (¬(p ∧ ¬q ∨ ¬p ∧ q) ∨ a)) ∧
(¬c ∨ p ∧ q) ∧ (¬(p ∧ q) ∨ c)) ∨ ¬(p ∧ q) ∨ c ∧ ¬a) ⟺ T: thm
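For comparison, the brute-force enumeration that this rewriting performs can be sketched directly in a few lines of Python (illustrative only; is_tautology is our own name, not a HOL function):

```python
# Brute-force tautology checking by truth-table enumeration -- the same
# idea as rewriting with FORALL_BOOL: try every boolean assignment.
from itertools import product

def is_tautology(vars_, phi):
    """phi maps a dict of boolean values (one per variable) to bool."""
    return all(phi(dict(zip(vars_, vals)))
               for vals in product([True, False], repeat=len(vars_)))

print(is_tautology(["p", "q"], lambda e: e["p"] and e["q"] and not e["p"]))  # False
print(is_tautology(["p"], lambda e: e["p"] or not e["p"]))                   # True
```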

This is a dreadful algorithm for solving this problem. The system’s built-in function,
tautLib.TAUT_CONV, solves the problem above much faster. The only real merit in this
solution is that it took one line to write. This is a general illustration of the truth that
H OL’s high-level tools, particularly the simplifier, can provide fast prototypes for a variety
of proof tasks.

7.2 Method 2: the DPLL Algorithm


The Davis–Putnam–Logemann–Loveland method [4] for deciding the satisfiability of
propositional formulas in CNF (Conjunctive Normal Form) is a powerful technique,
still used in state-of-the-art solvers today. If we strip the universal quantifiers from
our input formulas, our task can be seen as determining the validity of a propositional
formula. Testing the negation of such a formula for satisfiability is a test for validity: if
the formula’s negation is satisfiable, then it is not valid (the satisfying assignment will
make the original false); if the formula’s negation is unsatisfiable, then the formula is
valid (no assignment can make it false).
(The source code for this example is available in the file examples/dpll.sml.)

Preliminaries
To begin, assume that we have code already to convert arbitrary formulas into CNF, and
to then decide the satisfiability of these formulas. Assume further that if the input to the
latter procedure is unsatisfiable, then it will return with a theorem of the form
⊢ 𝜑 = F
or if it is satisfiable, then it will return a satisfying assignment, a map from variables to
booleans. This map will be a function from H OL variables to one of the H OL terms T or F.
Thus, we will assume
datatype result = Unsat of thm | Sat of term -> term
val toCNF : term -> thm
val DPLL : term -> result
(The theorem returned by toCNF will equate the input term to another in CNF.)
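The internals of DPLL are not our concern here, but to make the assumed interface plausible, here is a sketch of its core, unit propagation plus case-splitting, in Python rather than ML (the clause and literal representations are invented for the sketch). It returns None for unsatisfiable input, or a satisfying assignment, mirroring the Unsat/Sat result type:

```python
# A sketch of the core DPLL loop: a literal is a (variable-name, polarity)
# pair, a clause is a list of literals, and the input is a list of clauses.
def dpll(clauses, assignment=None):
    assignment = dict(assignment or {})
    while True:                                   # unit propagation
        units = [c[0] for c in clauses if len(c) == 1]
        if not units:
            break
        var, val = units[0]
        assignment[var] = val
        new = []
        for c in clauses:
            if (var, val) in c:
                continue                          # clause satisfied; drop it
            c2 = [l for l in c if l[0] != var]    # remove falsified literals
            if not c2:
                return None                       # empty clause: conflict
            new.append(c2)
        clauses = new
    if not clauses:
        return assignment                         # every clause satisfied
    var = clauses[0][0][0]                        # case-split on a variable
    for val in (True, False):
        result = dpll(clauses + [[(var, val)]], assignment)
        if result is not None:
            return result
    return None
```

For example, `dpll([[("p", True)], [("p", False)]])` reports unsatisfiability, while `dpll([[("p", True), ("q", True)], [("p", False)]])` finds the assignment making p false and q true.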
Before looking into implementing these functions, we will need to consider
• how to transform our inputs to suit these functions; and

• how to use the outputs from these functions to produce our desired results.
We are assuming our input is a universally quantified formula. Both the CNF and DPLL
procedures expect formulas without quantifiers. We also want to pass these procedures
the negation of the original formula. Both of the required term manipulations can be
done by functions found in the structure boolSyntax. (In general, important
theories (such as bool) are accompanied by Syntax modules containing functions for
manipulating the term-forms associated with that theory.)
In this case we need the functions
strip_forall : term -> term list * term
mk_neg : term -> term
The function strip_forall strips a term of all its outermost universal quantifications,
returning the list of variables stripped and the body of the quantification. The function
mk_neg takes a term of type bool and returns the term corresponding to its negation.
Using these functions, it is easy to see how we will be able to take ∀𝑣⃗. 𝜑 as input, and
pass the term ¬𝜑 to the function toCNF. A more significant question is how to use the
results of these calls. The call to toCNF will return a theorem
⊢ ¬𝜑 = 𝜑′
The formula 𝜑′ is what will then be passed to DPLL. (We can extract it by using the concl
and rhs functions.) If DPLL returns the theorem ⊢ 𝜑′ = F, an application of TRANS to this
and the theorem displayed above will derive the theorem ⊢ ¬𝜑 = F. In order to derive
the final result, we will need to turn this into ⊢ 𝜑. This is best done by proving a bespoke
theorem embodying the equality (there isn’t one such already in the system):

val NEG_EQ_F = prove(‘‘(~p = F) = p‘‘, REWRITE_TAC []);

To turn ⊢ 𝜑 into ⊢ (∀𝑣⃗. 𝜑) = T, we will perform the following proof:

⊢ 𝜑
─────────────── GENL 𝑣⃗
⊢ ∀𝑣⃗. 𝜑
─────────────── EQT_INTRO
⊢ (∀𝑣⃗. 𝜑) = T

The other possibility is that DPLL will return a satisfying assignment demonstrating that
𝜑′ is satisfiable. If this is the case, we want to show that ∀𝑣⃗. 𝜑 is false. We can do this by
assuming this formula, and then specialising the universally quantified variables in line
with the provided map. In this way, it will be possible to produce the theorem

∀𝑣⃗. 𝜑 ⊢ 𝜑[𝑣⃗ ∶= satisfying assignment]

Because there are no free variables in ∀𝑣⃗. 𝜑, the substitution will produce a completely
ground boolean formula. This will straightforwardly rewrite to F (if the assignment
makes ¬𝜑 true, it must make 𝜑 false). Turning 𝜙 ⊢ F into ⊢ 𝜙 = F is a matter of calling
DISCH and then rewriting with the built-in theorem IMP_F_EQ_F:

⊢ ∀𝑡. (𝑡 ⇒ F) = (𝑡 = F)

Putting all of the above together, we can write our wrapper function, which we will
call DPLL_UNIV, with the UNIV suffix reminding us that the input must be universally
quantified.

fun DPLL_UNIV t = let


val (vs, phi) = strip_forall t
val cnf_eqn = toCNF (mk_neg phi)
val phi’ = rhs (concl cnf_eqn)
in
case DPLL phi’ of
Unsat phi’_eq_F => let
val negphi_eq_F = TRANS cnf_eqn phi’_eq_F
val phi_thm = CONV_RULE (REWR_CONV NEG_EQ_F) negphi_eq_F
in
EQT_INTRO (GENL vs phi_thm)
end
| Sat f => let
val t_assumed = ASSUME t
fun spec th =
spec (SPEC (f (#1 (dest_forall (concl th)))) th)
handle HOL_ERR _ => REWRITE_RULE [] th
in
CONV_RULE (REWR_CONV IMP_F_EQ_F) (DISCH t (spec t_assumed))
end
end
7.2. METHOD 2: THE DPLL ALGORITHM 109

The auxiliary function spec that is used in the second case relies on the fact that
dest_forall will raise a HOL_ERR exception if the term it is applied to is not universally
quantified. When spec’s argument is not universally quantified, this means that the
recursion has bottomed out, and all of the original formula’s universal variables have
been specialised. Then the resulting formula can be rewritten to false (REWRITE_RULE’s
built-in rewrites will handle all of the necessary cases).
The DPLL_UNIV function also uses REWR_CONV in two places. The REWR_CONV function
applies a single (first-order) rewrite at the top of a term. These uses of REWR_CONV are
done within calls to the CONV_RULE function. This lifts a conversion 𝑐 (a function taking
a term 𝑡 and producing a theorem ⊢ 𝑡 = 𝑡′ ), so that CONV_RULE 𝑐 takes the theorem ⊢ 𝑡 to
⊢ 𝑡′ .

7.2.1 Conversion to Conjunctive Normal Form


A formula in Conjunctive Normal Form is a conjunction of disjunctions of literals (either
variables, or negated variables). It is possible to convert formulas of the form we are
expecting into CNF by simply rewriting with the following theorems

¬(𝜙 ∧ 𝜓) = ¬𝜙 ∨ ¬𝜓
¬(𝜙 ∨ 𝜓) = ¬𝜙 ∧ ¬𝜓
𝜙 ∨ (𝜓 ∧ 𝜉) = (𝜙 ∨ 𝜓) ∧ (𝜙 ∨ 𝜉)
(𝜓 ∧ 𝜉) ∨ 𝜙 = (𝜙 ∨ 𝜓) ∧ (𝜙 ∨ 𝜉)

𝜙 ⇒ 𝜓 = ¬𝜙 ∨ 𝜓
(𝜙 = 𝜓) = (𝜙 ⇒ 𝜓) ∧ (𝜓 ⇒ 𝜙)

Unfortunately, using these theorems as rewrites can result in an exponential increase in
the size of a formula. (Consider using them to convert an input in Disjunctive Normal
Form, a disjunction of conjunctions of literals, into CNF.)
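To see the exponential behaviour concretely, here is a small Python sketch, entirely independent of HOL, that applies the distribution rules above to a DNF input. (The tuple representation and the function names are invented for illustration.)

```python
# Formulas as nested tuples: ("var", name), ("and", a, b), ("or", a, b).
def to_cnf(f):
    """Naive CNF by distributing OR over AND; returns a list of clauses."""
    op = f[0]
    if op == "var":
        return [[f[1]]]                       # a single one-literal clause
    a, b = to_cnf(f[1]), to_cnf(f[2])
    if op == "and":
        return a + b                          # union of the two clause sets
    return [ca + cb for ca in a for cb in b]  # "or": distribute clause-wise

def dnf(n):
    """Build (x1 AND y1) OR (x2 AND y2) OR ... OR (xn AND yn)."""
    f = ("and", ("var", "x1"), ("var", "y1"))
    for i in range(2, n + 1):
        f = ("or", f, ("and", ("var", "x%d" % i), ("var", "y%d" % i)))
    return f

print(len(to_cnf(dnf(3))))    # 8 clauses
print(len(to_cnf(dnf(10))))   # 1024 clauses
```

Each disjunction doubles the clause count, so an 𝑛-disjunct DNF input produces 2ⁿ clauses.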
A better approach is to convert to what is known as “definitional CNF”. HOL includes
functions to do this in the structure defCNF. Unfortunately, this approach adds extra,
existential, quantifiers to the formula. For example

> defCNF.DEF_CNF_CONV ‘‘p \/ (q /\ r)‘‘;


val it = ⊢ p ∨ q ∧ r ⟺ ∃x. (x ∨ ¬q ∨ ¬r) ∧ (r ∨ ¬x) ∧ (q ∨ ¬x) ∧ (p ∨ x): thm

Under the existentially-bound x, the code has produced a formula in CNF. With an
example this small, the formula is actually bigger than that produced by the naïve
translation, but with more realistic examples, the difference quickly becomes significant.
The last example used with tautDP is 20 times bigger when translated naïvely than when
using defCNF, and the translation takes 150 times longer to perform.
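The idea behind definitional CNF can also be sketched outside HOL. The following Python fragment is a simplified Tseitin-style transformation handling only conjunction and disjunction; all names are invented for illustration, and it is not the algorithm defCNF actually implements. It introduces one fresh variable per connective, so the output grows linearly in the size of the input:

```python
def def_cnf(f):
    """Definitional (Tseitin-style) CNF for formulas built from
    ("var", v), ("and", a, b) and ("or", a, b)."""
    clauses, n = [], [0]

    def neg(lit):
        return lit[1:] if lit.startswith("~") else "~" + lit

    def walk(g):
        if g[0] == "var":
            return g[1]
        a, b = walk(g[1]), walk(g[2])
        n[0] += 1
        x = "_d%d" % n[0]              # fresh definitional variable
        if g[0] == "or":               # clauses for x <=> a \/ b
            clauses.extend([[neg(x), a, b], [x, neg(a)], [x, neg(b)]])
        else:                          # clauses for x <=> a /\ b
            clauses.extend([[neg(x), a], [neg(x), b], [x, neg(a), neg(b)]])
        return x

    clauses.append([walk(f)])          # assert the top-level definition
    return clauses

# p \/ (q /\ r), as in the session above: three clauses per connective, plus
# one unit clause asserting the variable that names the whole formula.
print(len(def_cnf(("or", ("var", "p"), ("and", ("var", "q"), ("var", "r"))))))
```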

But what of these extra existentially quantified variables? In fact, we can ignore the
quantification when calling the core DPLL procedure. If we pass the unquantified body
to DPLL, we will either get back an unsatisfiable verdict of the form ⊢ 𝜑′ = F, or a
satisfying assignment for all of the free variables. If the latter occurs, the same satisfying
assignment will also satisfy the original. If the former, we will perform the following
proof

⊢ 𝜑′ = F
⊢ 𝜑′ ⇒ F
⊢ ∀𝑥⃗. 𝜑′ ⇒ F
⊢ (∃𝑥⃗. 𝜑′) ⇒ F
⊢ (∃𝑥⃗. 𝜑′) = F

producing a theorem of the form expected by our wrapper function.


In fact, there is an alternative function in the defCNF API that we will use in preference
to DEF_CNF_CONV. The problem with DEF_CNF_CONV is that it can produce a big quantifi-
cation, involving lots of variables. We will rather use DEF_CNF_VECTOR_CONV. Instead of
output of the form

⊢ 𝜑 = (∃𝑥⃗. 𝜑′)

this second function produces

⊢ 𝜑 = (∃(𝑣 ∶ num → bool). 𝜑′ )

where the individual variables 𝑥𝑖 of the first formula are replaced by applications 𝑣(𝑖)
of the function 𝑣, and there is just one quantified variable, 𝑣. This variation will not affect
the operation of the proof sketched above. And as long as we don’t require literals to be
variables or their negations, but also allow them to be terms of the form 𝑣(𝑖) and ¬𝑣(𝑖) as
well, then the action of the DPLL procedure on the formula 𝜑′ won’t be affected either.
Unfortunately for uniformity, in simple cases, the definitional CNF conversion functions
may not result in any existential quantifications at all. This makes our implementation
of DPLL somewhat more complicated. We calculate a body variable that will be passed
on to the CoreDPLL function, as well as a transform function that will transform an
unsatisfiability result into something of the desired form. If the result of conversion to
CNF produces an existential quantification, we use the proof sketched above. Otherwise,
the transformation can be the identity function, I:

fun DPLL t = let


val (transform, body) = let
val (vector, body) = dest_exists t
fun transform body_eq_F = let
val body_imp_F = CONV_RULE (REWR_CONV (GSYM IMP_F_EQ_F)) body_eq_F
val fa_body_imp_F = GEN vector body_imp_F
val ex_body_imp_F = CONV_RULE FORALL_IMP_CONV fa_body_imp_F
in
CONV_RULE (REWR_CONV IMP_F_EQ_F) ex_body_imp_F
end
in
(transform, body)
end handle HOL_ERR _ => (I, t)
in
case CoreDPLL body of
Unsat body_eq_F => Unsat (transform body_eq_F)
| x => x
end

where we have still to implement the core DPLL procedure (called CoreDPLL above). The
above code uses REWR_CONV with the IMP_F_EQ_F theorem to effect two of the proof’s
transformations. The GSYM function is used to flip the orientation of a theorem’s top-level
equalities. Finally, the FORALL_IMP_CONV conversion takes a term of the form

∀𝑥. 𝑃 (𝑥) ⇒ 𝑄

and returns the theorem

⊢ (∀𝑥. 𝑃 (𝑥) ⇒ 𝑄) = ((∃𝑥. 𝑃 (𝑥)) ⇒ 𝑄)

7.2.2 The Core DPLL Procedure


The DPLL procedure can be seen as a slight variation on the basic “truth table” technique
we have already seen. As with that procedure, the core operation is a case-split on a
boolean variable. There are two significant differences though: DPLL can be seen as a
search for a satisfying assignment, so that if picking a variable to have a particular value
results in a satisfying assignment, we do not need to also check what happens if the
same variable is given the opposite truth-value. Secondly, DPLL takes some care to pick
good variables to split on. In particular, unit propagation is used to eliminate variables
that will not cause branching in the search-space.
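Abstracting away from theorems for the moment, the search just described can be sketched in a few lines of standalone Python. Clauses are lists of string literals such as "p" and "~p"; the function names are invented for this illustration and have nothing to do with the HOL implementation that follows:

```python
def simplify(clauses, assignment):
    """Drop satisfied clauses and falsified literals; None signals a conflict."""
    out = []
    for clause in clauses:
        kept = []
        for lit in clause:
            var, want = lit.lstrip("~"), not lit.startswith("~")
            if var not in assignment:
                kept.append(lit)
            elif assignment[var] == want:
                break                        # clause satisfied: drop it
        else:
            if not kept:
                return None                  # every literal falsified
            out.append(kept)
    return out

def dpll(clauses, assignment=None):
    """Return a satisfying assignment (a dict) or None if unsatisfiable."""
    assignment = dict(assignment or {})
    while True:                              # unit propagation loop
        clauses = simplify(clauses, assignment)
        if clauses is None:
            return None
        if not clauses:
            return assignment                # all clauses satisfied
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            break                            # no unit clause left
        assignment[unit.lstrip("~")] = not unit.startswith("~")
    v = clauses[0][0].lstrip("~")            # split on some remaining variable
    for value in (True, False):
        result = dpll(clauses, {**assignment, v: value})
        if result is not None:
            return result                    # no need to try the other value
    return None

print(dpll([["p", "q"], ["~p"]]))            # {'p': False, 'q': True}
```

Note that the second truth-value for a split variable is only explored when the first branch fails, exactly as described above.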
Our implementation of the core DPLL procedure is a function that takes a term and
returns a value of type result: either a theorem equating the original term to false, or
a satisfying assignment (in the form of a function from terms to terms). As the DPLL
search for a satisfying assignment proceeds, an assignment is incrementally constructed.
This suggests that the recursive core of our function will need to take a term (the current
formula) and a context (the current assignment) as parameters. The assignment can be
naturally represented as a set of equations, where each equation is either 𝑣 = T or 𝑣 = F.
This suggests that a natural representation for our program state is a theorem: the
hypotheses will represent the assignment, and the conclusion can be the current formula.
Of course, HOL theorems can’t just be wished into existence. In this case, we can make
everything sound by also assuming the initial formula. Thus, when we begin, our initial
state will be 𝜙 ⊢ 𝜙. After splitting on variable 𝑣, we will generate two new states
𝜙, (𝑣 = T) ⊢ 𝜙1 , and 𝜙, (𝑣 = F) ⊢ 𝜙2 , where the 𝜙𝑖 are the result of simplifying 𝜙 under the
additional assumption constraining 𝑣.
The easiest way to add an assumption to a theorem is to use the rule ADD_ASSUM. But
in this situation, we also want to simplify the conclusion of the theorem with the same
assumption. This means that it will be enough to rewrite with the theorem 𝜓 ⊢ 𝜓, where
𝜓 is the new assumption. The action of rewriting with such a theorem will cause the
new assumption to appear among the assumptions of the result.
The casesplit function is thus:

fun casesplit v th = let


val eqT = ASSUME (mk_eq(v, boolSyntax.T))
val eqF = ASSUME (mk_eq(v, boolSyntax.F))
in
(REWRITE_RULE [eqT] th, REWRITE_RULE [eqF] th)
end

A case-split can result in a formula that has been rewritten all the way to true or false.
These are the recursion’s base cases. If the formula has been rewritten to true, then we
have found a satisfying assignment, one that is now stored for us in the hypotheses of
the theorem itself. The following function, mk_satmap, extracts those hypotheses into a
finite-map, and then returns the lookup function for that finite-map:

fun mk_satmap th = let


val hyps = hypset th
fun foldthis (t,acc) = let
val (l,r) = dest_eq t
in
Binarymap.insert(acc,l,r)
end handle HOL_ERR _ => acc
val fmap = HOLset.foldl foldthis (Binarymap.mkDict Term.compare) hyps
in
Sat (fn v => Binarymap.find(fmap,v)
handle Binarymap.NotFound => boolSyntax.T)
end

The foldthis function above adds the equations that are stored as hypotheses into
the finite-map. The exception handler in foldthis is necessary because one of the
hypotheses will be the original formula. The exception handler in the function that looks
up variable bindings is necessary because a formula may be reduced to true without
every variable being assigned a value. In this case, it is irrelevant what value we
give to the variable, so we arbitrarily map such variables to T.
If the formula has been rewritten to false, then we can just return this theorem directly.
Such a theorem is not quite in the right form for the external caller, which is expecting
an equation, so if the final result is of the form 𝜙 ⊢ F, we will have to transform this to
⊢ 𝜙 = F.
The next question to address is what to do with the results of recursive calls. If a
case-split returns a satisfying assignment, this can be returned unchanged. But if a
recursive call returns a theorem equating the input to false, more needs to be done. If
this is the first call, then the other branch needs to be checked. If this branch is also
unsatisfiable, we will have two theorems:

𝜙0 , Δ, (𝑣 = T) ⊢ F 𝜙0 , Δ, (𝑣 = F) ⊢ F

where 𝜙0 is the original formula, Δ is the rest of the current assignment, and 𝑣 is the
variable on which a split has just been performed. To turn these two theorems into the
desired

𝜙0 , Δ ⊢ F

we will use the rule of inference DISJ_CASES:


Γ ⊢ 𝜓 ∨ 𝜉        Δ1 ∪ {𝜓} ⊢ 𝜙        Δ2 ∪ {𝜉} ⊢ 𝜙
Γ ∪ Δ1 ∪ Δ2 ⊢ 𝜙

and the theorem BOOL_CASES_AX:

⊢ ∀𝑡. (𝑡 = T) ∨ (𝑡 = F)

We can put these fragments together and write the top-level CoreDPLL function, in
Figure 7.1.
All that remains to be done is to figure out which variable to case-split on. The most
important variables to split on are those that appear in what are called “unit clauses”:
clauses containing just one literal. If there is a unit clause in a formula then it is of the
form

𝜙 ∧ 𝑣 ∧ 𝜙′

or

𝜙 ∧ ¬𝑣 ∧ 𝜙′

In either situation, splitting on 𝑣 will always result in a branch that evaluates directly
to false. We thus eliminate a variable without increasing the size of the problem. The

fun CoreDPLL form = let


val initial_th = ASSUME form
fun recurse th = let
val c = concl th
in
if c = boolSyntax.T then
mk_satmap th
else if c = boolSyntax.F then
Unsat th
else let
val v = find_splitting_var c
val (l,r) = casesplit v th
in
case recurse l of
Unsat l_false => let
in
case recurse r of
Unsat r_false =>
Unsat (DISJ_CASES (SPEC v BOOL_CASES_AX) l_false r_false)
| x => x
end
| x => x
end
end
in
case (recurse initial_th) of
Unsat th => Unsat (CONV_RULE (REWR_CONV IMP_F_EQ_F) (DISCH form th))
| x => x
end

Figure 7.1: The core of the DPLL function


process of eliminating unit clauses is usually called “unit propagation”. Unit propagation
is not usually thought of as a case-splitting operation, but doing it this way makes our
code simpler.
If a formula does not include a unit clause, then choice of the next variable to split on
is much more of a black art. Here we will implement a very simple choice: to split on
the variable that occurs most often. Our function find_splitting_var takes a formula
and returns the variable to split on.

fun find_splitting_var phi = let


fun recurse acc [] = getBiggest acc
| recurse acc (c::cs) = let
val ds = strip_disj c
in
case ds of
[lit] => (dest_neg lit handle HOL_ERR _ => lit)
| _ => recurse (count_vars ds acc) cs
end
in
recurse (Binarymap.mkDict Term.compare) (strip_conj phi)
end

This function works by handing a list of clauses to the inner recurse function. This strips
each clause apart in turn. If a clause has only one disjunct it is a unit-clause and the
variable can be returned directly. Otherwise, the variables in the clause are counted and
added to the accumulating map by count_vars, and the recursion can continue.
The count_vars function has the following implementation:

fun count_vars ds acc =


case ds of
[] => acc
| lit::lits => let
val v = dest_neg lit handle HOL_ERR _ => lit
in
case Binarymap.peek (acc, v) of
NONE => count_vars lits (Binarymap.insert(acc,v,1))
| SOME n => count_vars lits (Binarymap.insert(acc,v,n + 1))
end

The use of a binary tree to store variable data makes it efficient to update the data as
it is being collected. Extracting the variable with the largest count is then a linear scan
of the tree, which we can do with the foldl function:

fun getBiggest acc =
    #1 (Binarymap.foldl (fn (v,cnt,a as (bestv,bestcnt)) =>
                             if cnt > bestcnt then (v,cnt) else a)
                        (boolSyntax.T, 0)
                        acc)
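For comparison, the same two-level heuristic (prefer a unit clause; otherwise take the most frequently occurring variable) can be sketched in a few lines of HOL-independent Python, with literals as strings such as "p" and "~p". The function name mirrors the ML one, but everything here is purely illustrative:

```python
from collections import Counter

def find_splitting_var(cnf):
    """Prefer a unit-clause variable; otherwise the most frequent variable."""
    counts = Counter()
    for clause in cnf:
        if len(clause) == 1:
            return clause[0].lstrip("~")     # unit clause: split here first
        counts.update(lit.lstrip("~") for lit in clause)
    return counts.most_common(1)[0][0]

print(find_splitting_var([["p", "q"], ["~q"], ["q", "r"]]))   # q (unit clause)
```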

7.2.3 Performance
Once inputs get even a little beyond the clearly trivial, the function we have written (at the
top-level, DPLL_UNIV) performs considerably better than the truth table implementation.
For example, the generalisation of the following term, with 29 variables, takes our
function multiple seconds to demonstrate as a tautology:

val t0 = ‘‘
(s0_0 = (x_0 = ~y_0)) /\ (c0_1 = x_0 /\ y_0) /\
(s0_1 = ((x_1 = ~y_1) = ~c0_1)) /\
(c0_2 = x_1 /\ y_1 \/ (x_1 \/ y_1) /\ c0_1) /\
(s0_2 = ((x_2 = ~y_2) = ~c0_2)) /\
(c0_3 = x_2 /\ y_2 \/ (x_2 \/ y_2) /\ c0_2) /\
(s1_0 = ~(x_0 = ~y_0)) /\ (c1_1 = x_0 /\ y_0 \/ x_0 \/ y_0) /\
(s1_1 = ((x_1 = ~y_1) = ~c1_1)) /\
(c1_2 = x_1 /\ y_1 \/ (x_1 \/ y_1) /\ c1_1) /\
(s1_2 = ((x_2 = ~y_2) = ~c1_2)) /\
(c1_3 = x_2 /\ y_2 \/ (x_2 \/ y_2) /\ c1_2) /\
(c_3 = ~c_0 /\ c0_3 \/ c_0 /\ c1_3) /\
(s_0 = ~c_0 /\ s0_0 \/ c_0 /\ s1_0) /\
(s_1 = ~c_0 /\ s0_1 \/ c_0 /\ s1_1) /\
(s_2 = ~c_0 /\ s0_2 \/ c_0 /\ s1_2) /\ ~c_0 /\
(s2_0 = (x_0 = ~y_0)) /\ (c2_1 = x_0 /\ y_0) /\
(s2_1 = ((x_1 = ~y_1) = ~c2_1)) /\
(c2_2 = x_1 /\ y_1 \/ (x_1 \/ y_1) /\ c2_1) /\
(s2_2 = ((x_2 = ~y_2) = ~c2_2)) /\
(c2_3 = x_2 /\ y_2 \/ (x_2 \/ y_2) /\ c2_2) ==>
(c_3 = c2_3) /\ (s_0 = s2_0) /\ (s_1 = s2_1) /\ (s_2 = s2_2)‘‘;
val t = list_mk_forall(free_vars t0, t0);

> val _ = time DPLL_UNIV t;
runtime: 7.2s, gctime: 0.43424s, systime: 0.35291s.

> val _ = time tautLib.TAUT_PROVE t;
runtime: 0.00865s, gctime: 0.00275s, systime: 0.00453s.

(As is apparent from the above, if you want real speed, the built-in TAUT_PROVE function
works in less than a hundredth of a second, by using an external tool to generate the
proof of unsatisfiability, and then translating that proof back into HOL.)

7.3 Extending our Procedure’s Applicability


The function DPLL_UNIV requires its input to be universally quantified, with all free
variables bound, and for each literal to be a variable or the negation of a variable.
This makes DPLL_UNIV a little unfriendly when it comes to using it as part of the proof
of a goal. In this section, we will write one further “wrapper” layer to wrap around
DPLL_UNIV, producing a tool that can be applied to many more goals.

Relaxing the Quantification Requirement The first step is to allow formulas that are
not closed. In order to hand on a formula that is closed to DPLL_UNIV, we can simply
generalise over the formula’s free variables. If DPLL_UNIV then says that the new, ground
formula is true, then so too will be the original. On the other hand, if DPLL_UNIV says
that the ground formula is false, then we can’t conclude anything further and will have
to raise an exception.
Code implementing this is shown below:
fun nonuniv_wrap t = let
val fvs = free_vars t
val gen_t = list_mk_forall(fvs, t)
val gen_t_eq = DPLL_UNIV gen_t
in
if rhs (concl gen_t_eq) = boolSyntax.T then let
val gen_th = EQT_ELIM gen_t_eq
in
EQT_INTRO (SPECL fvs gen_th)
end
else
raise mk_HOL_ERR "dpll" "nonuniv_wrap" "No conclusion"
end

Allowing Non-Literal Leaves We can do better than nonuniv_wrap: rather than quan-
tifying over just the free variables (which we have conveniently assumed will only be
boolean), we can turn any leaf part of the term that is not a variable or a negated
variable into a fresh variable. We first extract those boolean-valued leaves that are not
the constants true or false.
fun var_leaves acc t = let
val (l,r) = dest_conj t handle HOL_ERR _ =>
dest_disj t handle HOL_ERR _ =>
dest_imp t handle HOL_ERR _ =>
dest_bool_eq t
in
var_leaves (var_leaves acc l) r
end handle HOL_ERR _ =>
if type_of t <> bool then
raise mk_HOL_ERR "dpll" "var_leaves" "Term not boolean"
else if t = boolSyntax.T then acc
else if t = boolSyntax.F then acc
else HOLset.add(acc, t)
Note that we haven’t explicitly attempted to pull apart boolean negations (which one
might do with dest_neg). This is because dest_imp also destructs terms of the form ~p,
returning p and F as the antecedent and conclusion. We have also used a function dest_bool_eq
designed to pull apart only those equalities which are over boolean values. Its definition
is

fun dest_bool_eq t = let


val (l,r) = dest_eq t
val _ = type_of l = bool orelse
raise mk_HOL_ERR "dpll" "dest_bool_eq" "Eq not on bools"
in
(l,r)
end

Now we can finally write our final DPLL_TAUT function:

fun DPLL_TAUT tm =
let val (univs,tm’) = strip_forall tm
val insts = HOLset.listItems (var_leaves empty_tmset tm’)
val vars = map (fn t => genvar bool) insts
val theta = map2 (curry (op |->)) insts vars
val tm’’ = list_mk_forall (vars,subst theta tm’)
in
EQT_INTRO (GENL univs
(SPECL insts (EQT_ELIM (DPLL_UNIV tm’’))))
end

Note how this code first pulls off all external universal quantifications (with strip_forall),
and then re-generalises (with list_mk_forall). The calls to GENL and SPECL undo these
manipulations, but at the level of theorems. This produces a theorem equating the
original input to true. (If the input term is not an instance of a valid propositional
formula, the call to EQT_ELIM will raise an exception.)

Exercises
1. Extend the procedure so that it handles conditional expressions (both arms of such
conditionals must be of boolean type).
Chapter 8

More Examples

In addition to the examples already covered in this text, the HOL distribution comes
with a variety of instructive examples in the examples directory. There the following
examples (among others) are to be found:

autopilot.sml This example is a HOL rendition (by Mark Staples) of a PVS example
due to Ricky Butler of NASA. The example shows the use of the record-definition
package, as well as illustrating some aspects of the automation available in HOL.

bmark In this directory, there is a standard HOL benchmark: the proof of correctness of
a multiplier circuit, due to Mike Gordon.

euclid.sml This example is the same as that covered in Chapter 4: a proof of Euclid’s
theorem on the infinitude of the prime numbers, extracted and modified from a
much larger development due to John Harrison. It illustrates the automation of
HOL on a classic proof.

ind_def This directory contains some examples of an inductive definition package in
action. Featured are an operational semantics for a small imperative programming
language, a small process algebra, and combinatory logic with its type system.
The files were originally developed by Tom Melham and Juanito Camilleri and are
extensively commented. The last is the basis for Chapter 6.
Most of the proofs in these theories can now be done much more easily by using
some of the recently developed proof tools, namely the simplifier and the first order
prover.

fol.sml This file illustrates John Harrison’s implementation of a model-elimination style
first order prover.

lambda This directory develops theories of a “de Bruijn” style lambda calculus, and also
a name-carrying version. (Both are untyped.) The development is a revision of the
proofs underlying the paper “5 Axioms of Alpha Conversion”, Andy Gordon and Tom
Melham, Proceedings of TPHOLs’96, Springer LNCS 1125.

parity This sub-directory contains the files used in the parity example of Chapter 5.


MLsyntax This sub-directory contains an extended example of a facility for defining
mutually recursive types, due to Elsa Gunter of Bell Labs. In the example, the type
of abstract syntax for a small but not totally unrealistic subset of ML is defined,
along with a simple mutually recursive function over the syntax.

Thery.sml A very short example due to Laurent Théry, demonstrating a cute inductive
proof.

RSA This directory develops some of the mathematics underlying the RSA cryptography
scheme. The theories have been produced by Laurent Théry of INRIA Sophia-
Antipolis.
