From LCF to Isabelle/HOL
Abstract. Interactive theorem provers have developed dramatically over the past four decades, from prim-
itive beginnings to today’s powerful systems. Here, we focus on Isabelle/HOL and its distinctive strengths.
They include automatic proof search, borrowing techniques from the world of first-order theorem proving,
but also the automatic search for counterexamples. They include a highly readable structured language of
proofs and a unique interactive development environment for editing live proof documents. Everything rests
on the foundation conceived by Robin Milner for Edinburgh LCF: a proof kernel, using abstract types to
ensure soundness and eliminate the need to store proofs. Compared with the research prototypes of the
1970s, Isabelle is a practical and versatile tool. It is used by system designers, mathematicians and many
others.
Keywords: LCF, HOL, Isabelle, interactive theorem proving.
1. Introduction
Today’s interactive theorem provers originated in research undertaken during the 1970s on the verification
of functional programs. Two quite different tools were built: the Boyer/Moore theorem prover (now ACL2,
described elsewhere in this volume) and Edinburgh LCF [GMW79, Pau18].
Descendants of the latter include every member of the HOL family (HOL4, HOL Light, ProofPower)
[GM93] as well as Coq [BC04] and Isabelle [NPW02]. As we shall see in the sequel, the achievements of
the past 40 years lie on several dimensions, including type systems, proof languages and user interfaces.
Automation is the key to usability; this includes automated search for counterexamples as well as for proofs.
These developments are reflected in the size of the tools themselves. In 1977, the Edinburgh LCF dis-
tribution was 900 KB, including an implementation of the ML programming language in Lisp. In 1986, the
Isabelle distribution was a mere 324 KB (not counting ML, now a separate distribution). By 1991, Isabelle
exceeded LCF: 937 KB. In 2019, the Isabelle distribution had reached 133 MB! And by way of comparison,
HOL Light (which is much more closely related to LCF) was 84 MB. Much of this bulk consists of proof
libraries rather than executable code, though libraries make a crucial contribution to a system’s capabilities.
Today’s mature systems also include documentation and examples.1
Correspondence and offprint requests to: Lawrence C. Paulson, Computer Laboratory, University of Cambridge, 15 JJ Thomson
Avenue, Cambridge CB3 0FD, England. e-mail: [email protected]
1 The figures are for uncompressed distribution directories containing no binaries, but possibly PDF files.
Isabelle is a leading interactive theorem prover. It is generic, supporting a number of different formal cal-
culi, but by far the most important of these is its instantiation to higher-order logic: Isabelle/HOL. Already
during the 1990s, Isabelle/HOL was being applied with great success to the task of verifying cryptographic
protocols [Pau98]. Turning to mathematics, it played a critical role in Hales’s Flyspeck project, which verified
his proof of the Kepler conjecture [HAB+ 17]. It is the basis for the seL4 project, under which an entire oper-
ating system kernel was verified, proving full functional correctness [KAE+ 10]. It was adopted by researchers
outside the verification milieu for specifying and verifying algorithms for replicated datatypes that provide
“eventual consistency” [GKMB17]. Numerous other projects are underway around the world. Like other
proof assistants, Isabelle is not directly concerned with program verification, i.e. with verifying code written
in a programming language, but it can be used as a back end to prove verification conditions. Isabelle can
also be used as a verified programming environment, where mathematical functions can be proved correct
and then automatically translated to executable code in one of several different programming languages. The
translation process itself is currently unverified, but even this is likely to change in the near future.
This essay focuses on Isabelle/HOL and therefore has little to say about techniques common to most sys-
tems. For example, simplification by rewriting — coupled with recursive simplification to handle conditional
rewrite rules — was already realised in both the Boyer/Moore theorem prover [BM79] and Edinburgh LCF
by the end of the 1970s. Recursive datatype and function definitions, as well as inductive definitions, were
commonplace by around 1995. Linear arithmetic decision procedures were also widely available by then.
Our title echoes Mike Gordon’s paper “From LCF to HOL: a Short History” [Gor00]. Like Mike, we
begin with LCF, the source of the most fundamental ideas. But we pass over this material quickly in order
to focus on Isabelle. There is no way to surpass Mike’s account of the early years. He starts in 1969, with
Dana Scott’s Logic for Computable Functions, and covers the original LCF project at Stanford University.
He describes in detail the development of the successor LCF systems at Edinburgh and then at Cambridge.
He names all the people involved and finally outlines his own development of HOL.
We begin with LCF because of its seminal importance, continuing to HOL because it is so strongly
linked (§2). Then we focus exclusively on Isabelle. We begin with the core ideas of a generic reasoner built
around unification and backtracking (§3). Then we consider the task of supporting higher-order logic, which
required the introduction of type classes (§4). There follows an account of automatic proof search and its
dual, the search for counterexamples (§5–6). We also discuss the generation of code from logical functions
(§7). We then turn to Isabelle’s most distinctive features: its structured proof language (§8) and the powerful
user interface architecture supporting it (§9). The next section describes Isabelle’s Archive of Formal Proofs
(§10). To conclude, we discuss the synergy among these ideas, making them more powerful in combination
than individually (§11).
Coq and other systems built around constructive type theories represent a distinctive strand of develop-
ment and fall outside our scope.
type, thm, was the type of theorems and ML’s type checker was our guarantee that anything of type thm had
definitely been created exclusively by the application of inference rules. Just as a dictionary doesn’t need to
keep a record of the operations applied to it but only the dictionary itself, type thm doesn’t need to store
the proofs of theorems but only their statements. The resultant savings of space are as important now as
they were in the 1970s, for although today we have more memory, we also have vastly bigger proofs.
Edinburgh LCF also introduced proof tactics and tacticals to express backward, goal-directed reasoning.
An LCF tactic typically applies a specific inference rule, while tacticals denote control structures, e.g., THEN
(one tactic followed by another), ORELSE (which allows one of several tactics to be attempted) and REPEAT
(repeated execution of a tactic until it fails). Remarkably, even tactics lie outside the proof kernel. An LCF
tactic is a function that takes a goal (a formula with its assumptions) and returns a list of subgoals that it
claims are logically sufficient. To justify this claim, it returns a function operating on values of type thm,
but we have no guarantee that this function will deliver the promised theorem in the end. We see that the
LCF architecture makes proof procedures harder to implement, while reducing the amount of code that has
to be trusted.
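To make the idea concrete, here is a minimal sketch in Isabelle/ML notation, with simplified stand-in types of our own rather than Milner's actual signatures; it can be evaluated inside an ML block of a theory:

ML ‹
  (* a goal pairs a list of assumptions with a conclusion (simplified) *)
  type goal = term list * term;

  (* a tactic returns subgoals plus a justification that turns their
     theorems into a theorem for the original goal *)
  type tactic0 = goal -> goal list * (thm list -> thm);

  (* the ORELSE tactical: try one tactic, falling back to another on failure *)
  fun ORELSE0 (tac1 : tactic0) (tac2 : tactic0) : tactic0 =
    fn g => tac1 g handle ERROR _ => tac2 g;
›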
We see incidentally two meanings of the word proof :
1. formal deductions of theorems from axioms using the inference rules of a logical calculus;
2. executable code written using tactics or other primitives, expressing the search for such deductions.
To resolve this ambiguity, the former are sometimes called proof objects or proof terms and the latter, proof
scripts or proof texts. Thus we see that the LCF approach eliminates the need to store proof terms and
allows proof scripts to be coded in ML using tactics and tacticals. Nevertheless, proof assistants based on
constructive type theories retain proof objects, as they are intrinsic to such formalisms.
The most fundamental question in the design of a theorem prover is what calculus to support. Boyer and
Moore made the inspired choice of Pure Lisp, which was sufficient for verifying simple functional programs and
formalising elementary number theory in the 1970s, and which has grown to support advanced applications
today. For Edinburgh LCF, Milner chose the Logic for Computable Functions, which was perfect for the
domain-theoretic investigations topical at the time but proved to be too quirky for general adoption.
Mike Gordon, who was one of the designers of Edinburgh LCF, ultimately adopted higher-order logic
as the basis for his research into hardware verification [Gor86, Gor00]. Higher-order logic is most easily
understood as a typed predicate calculus including function and Boolean types, and therefore also set types.
Gordon launched his HOL system in 1986 [Gor86, GM93], presenting both the logic and its application to
hardware specification and verification. HOL turned out to be extremely versatile and it soon attracted a
global user community. New versions emerged, in particular HOL Light [Har96]. Members of the HOL family
have been used in verification projects of every description, including the formalisation of great bodies of
mathematics.
However, it wasn’t obvious in the 1980s that one or two formalisms could be sufficient for the needs of
verification. Type theories, dynamic logics and many other formalisms were being proposed. Gordon himself
had reworked LCF twice in his hardware verification research. This was the origin of the idea that a theorem
prover could be generic: supporting a spectrum of formalisms through a common framework. Today this
includes syntactic tools (parsing, pretty printing), inference tools (rewriting, unification), a common proof
language (Isar) and user interface foundation (PIDE).
and backtracking would allow the use of logic programming techniques, as in Prolog [CM87]. For example,
in Martin-Löf’s type theory we have the following (derived) rule:
c ∈ A × B
─────────
fst(c) ∈ A.
Backward chaining, using this rule to prove something of the form fst(c) ∈ A, leaves us with a subgoal of
the form c ∈ A × ?B, where the question mark indicates that ?B is a hole to be filled: in Prolog parlance,
a logical variable. Unification fills in the holes later, while backtracking can orchestrate alternative ways of
filling the holes. In this example, the form of c will determine the choice of ?B. The ability to defer choices
until they become clear is as valuable in theorem proving as it is in Prolog programming.
It may be worth mentioning that Prolog was a hot topic in the 1980s. However, standard first-order
unification was simply not applicable to Martin-Löf type theory, which is based on a form of typed λ-
calculus. Higher-order unification would be necessary. Although the problem is undecidable in its full
generality, a reasonably practical unification procedure had recently been published by Gérard Huet [Hue75].
Higher-order unification is strikingly different from first-order unification. The latter is easily implemented
and delivers a unique answer modulo variable renaming. Higher-order unification allows even functions to
be variables and can instantiate such variables with λ-abstractions generated on the fly, yielding multiple
results. For example, it is possible to unify F M with 3 in two different ways: F = λx. 3 (with no constraint
on M ) and F = λx. x, M = 3. Unifying F 3 with 3 + 3 could yield four different results: F = λx. x + x,
F = λx. x + 3, F = λx. 3 + x and F = λx. 3 + 3. The search for such unifiers works by descending through the
terms one level at a time, attempting two kinds of steps: projections (use of a bound variable) and imitations
(copying the opposite term). The examples above show how it can find different ways of abstracting one
term from another.
With such a sophisticated syntactic mechanism built in at the lowest level, it became clear that the
LCF approach to inference rules could be radically changed [Pau86]. Consider the proof rule of universal
elimination:

∀x. φ(x)
────────   (1)
φ(a)
Its LCF representation is an ML function taking two arguments: a theorem (which must have the form
∀x. φ(x)) and a term a (which must have the same type as x). It then generates the desired conclusion,
namely φ(a), raising an exception if any of the preconditions is violated. This approach is general; the
drawback is the tedium and attendant risk of error when there are dozens of rules. With our λ-calculus
framework, inference rules such as the one above can simply be written out in the form of a template
and instantiated using unification. Better still, unification could be applied either to the premise or to the
conclusion, yielding forward or backward proof through a single mechanism. This is Isabelle’s central idea
[Pau86].
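In Isabelle/HOL, for example, rule (1) is simply the theorem spec, the template ∀x. ?P x =⇒ ?P ?x, and unification instantiates the schematic variables ?P and ?x. A minimal example of our own:

lemma
  assumes all : ∀x. P x
  shows P a
  using all by (rule spec)   (* unification matches ?P ?x against P a *)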
The precise nature of these templates remained to be determined, and the best approach turned out to be
a sort of logical framework. These are specialised formalisms whose purpose is to encode other formalisms. The
Edinburgh Logical Framework [HHP93] is the best known of these. It differs from Isabelle’s by incorporating
proof objects into the calculus itself, neutralising one of the key advantages of the LCF architecture: that
proofs do not have to be stored.
So Isabelle can be seen as an instance of the traditional LCF approach, but one where type thm formalises
a logical framework or meta-logic, in which other formalisms (the object-logics) can later be encoded. Such
encodings can also be proved correct [Pau89]. Our logical framework approach is less general than the
original LCF representation, where an inference rule can undertake an arbitrary computation. Nevertheless,
it captures a variety of possibilities, and moreover, it is open-ended: for example, Isabelle defines intuitionistic
first-order logic, which in a succession of formal theories is extended with classical logic, then with Zermelo–
Fraenkel set theory, then with the axiom of choice. With the original LCF approach, once you define the
type thm, it can never be extended.
The first object-logic was constructive type theory (Isabelle/CTT). It was followed by a classical first-
order sequent calculus (Isabelle/LK) and by natural deduction calculi for intuitionistic and classical first-
order logic (Isabelle/IFOL and Isabelle/FOL) [Pau93]. Later, a substantial formalisation of Zermelo–Fraenkel
set theory was developed on top of Isabelle/FOL [PG96]. This is one of the leading tools for formal reasoning
in axiomatic set theory.
Although Isabelle took a declarative approach to defining logics, the system architecture was still based
on LCF’s philosophy that everything was done in ML. All interactions with Isabelle took place at the ML
toplevel. For example, the declaration of new types, constants with their definitions, and axioms all required
calling appropriate ML functions with any necessary data as arguments. This began to change in the early
1990s [Pau94] when logics could be defined in a theory file with types, constants and proof rules declared
using a natural syntax.2 Proofs, however, were still expressed in ML. The structured proof language Isar
(Section 8.1) came later. Of today’s LCF-based systems, the HOL family has remained the most faithful to
the original conception, with proofs coded in ML.
Paulson’s original implementation of higher-order unification [Pau86] was practical, but still slow. Yet
in Isabelle practice, many unification problems are first-order, or almost so. Dale Miller [Mil91] discovered
a subclass of λ-terms, later called (higher-order) patterns [Nip91a], which behave like first-order terms:
unification is decidable and if two terms are unifiable, they have a most-general unifier. A term (in β-
normal form) is called a higher-order pattern if every free occurrence of a variable has as arguments a list
of distinct bound variables (for example, λx y. ?F x y is a pattern, whereas λx. ?F x x and ?F (?G a) are
not). Most unification problems in Isabelle are of this form. Nipkow gave a succinct
implementation of pattern unification and added it to Isabelle [Nip93a]. Full higher-order unification is
invoked only if pattern unification encounters a non-pattern.
A special case of unification is matching where the variables of only one of the two terms are instantiated.
Isabelle’s rewrite engine (aka the simplifier ) is based on higher-order pattern matching. Thus the simplifier
can deal with many standard transformations of quantified terms, for example the following ones:
(∀x. P (x) ∧ Q(x)) = (∀x. P (x)) ∧ (∀x. Q(x))
(∀x. P ∨ Q(x)) = P ∨ (∀x. Q(x))
(∀x. x = t ∧ P (x)) = P (t)
It appears that Isabelle was the first theorem prover to support higher-order rewrite rules [NP98].
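Each equation above is a theorem of HOL whose left-hand side is a higher-order pattern, so the simplifier can apply it as a rewrite rule. As a small example of our own, the first can be stated and proved outright:

lemma (∀x. P x ∧ Q x) = ((∀x. P x) ∧ (∀x. Q x))
  by blast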
2 This represented a return to the original Edinburgh LCF, which also supported theory files. The Isabelle version was inspired
by Goguen’s OBJ system [Gog79, FGJM85], in particular concerning the declaration of new mixfix syntax.
3 Beware of terminological confusion regarding the word sort, which for first-order logic is synonymous with type. Our use of
sort below will be entirely different.
be unsuitable to a particular object-logic. In first-order logic, the type of x in ∀x. φ(x) must not involve
functions or Booleans.4 Order-sorted polymorphism solves such difficulties by introducing a hierarchy of
sorts on types [Nip91b]. A sort is a finite intersection of type classes, which are essentially collections of
types. For first-order logic, we may introduce the type class FO for all types for which quantification is
permitted. For higher order logic, we would have a different type class (say HO), containing Boolean and
function types. We write τ :: C if type τ has class C.
Now we need to express how (possibly nullary) type constructors such as nat (the type of natural num-
bers), Cartesian product, function space and list (the type of lists) act on type classes. For example, for
first-order logic we want that nat :: FO and that if τ :: FO then τ list :: FO. These assertions can be codified
in Isabelle as instance declarations:
instance nat :: FO
instance list :: (FO) FO
Thus nat list :: FO holds, but in the absence of corresponding instances, bool :: FO and bool list :: FO do not:
truth values, or lists of them, do not belong to FO. Technically, these instance declarations form the signature
of the (term) algebra of types. Under certain natural conditions on the interaction of the class hierarchy
with the instance declarations, order-sorted unification of types, like unsorted unification, is still unitary:
solvable unification problems have most-general unifiers. Therefore these conditions guarantee that we still
have principal types. This theory, combined with extensions to higher-order unification [Nip91b], allows full
support for higher-order logic. And so the “templates” of our framework contain explicit type variables, in
addition to the ordinary variables that we had before. This version was first released in 1991 [NP92]. At the
same time, the correspondence to Haskell’s type classes was worked out [NS91, NP93].
4 Quantification over Booleans (and therefore, relations) requires at least second-order logic [Bar77, page 7].
as the extended reals (R∪{+∞, −∞}) and the nonnegative extended reals; both of these belong to a number
of ordering and topological type classes.
An exposition of axiomatic type classes with fully worked out examples is available in a paper by Paulson
[Pau04]. It’s slightly outdated — we have a much more elaborate type class hierarchy now and a somewhat
different syntax — but the principles are the same.
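In current syntax, a class bundles a signature with axioms, and an instantiation supplies the operations and discharges the axioms for a given type constructor. A minimal sketch of our own (the class semigroup and its nat instance are illustrative, not library material):

class semigroup =
  fixes mult :: ′a ⇒ ′a ⇒ ′a  (infixl ⊗ 70)
  assumes assoc: (x ⊗ y) ⊗ z = x ⊗ (y ⊗ z)

instantiation nat :: semigroup
begin
definition m ⊗ n = m + (n :: nat)
instance by standard (simp add : mult_nat_def add.assoc)
end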
Wenzel [Wen97] reduced type classes to (1) overloaded constant definitions (OCDs) for the signature and
(2) predicate definitions over a single type variable for the specification. Getting the notion of definition
with type-polymorphism right is notoriously tricky. For example, in an early version of Gordon’s HOL
system, constant definitions could introduce inconsistencies [Art16]. The problem is simply to detect ill-
formed definitions, in particular circularities, automatically. Circularities can lead to inconsistencies. Type
classes link the level of types with the level of terms, opening up new opportunities to create highly obscure
circularities.
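The flavour of such a circularity can be sketched as follows, after Kunčar and Popescu, in schematic old-style syntax that current Isabelle rejects:

consts c :: ′a                               (* overloaded constant, not yet defined *)
typedef T = {True, c :: bool}                (* the type T depends on c at type bool *)
defs c :: bool ≡ ¬ (∀x y :: T . x = y)       (* c at type bool depends on T *)

If c is True then T has a single element, forcing c to be False, and vice versa; each step looks innocuous in isolation, yet together they are inconsistent.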
Wenzel analysed the impact of OCDs on the consistency of an arbitrary theory. He sketched conditions
under which they were meta-safe, which roughly means that they can be removed without affecting provabil-
ity. Meta-safety implies conservativity, which in turn implies consistency preservation. However, his proof
sketch considered an idealized version of Isabelle’s actual definition facilities, which were only partially im-
plemented and in conflict with existing application theories. In particular—in order to stay within Gordon’s
HOL and the Isabelle logical framework—he excluded the interleaving of OCDs with type definitions.
Obua [Obu06] found that Isabelle accepted OCDs that introduced inconsistencies. He gave a more rigorous
and general formulation of Wenzel’s conditions [Wen97] and implemented them using an external termination
checker, outside of the inference kernel. He also sketched a proof that these conditions ensured conservativity.
As an alternative, Wenzel and Haftmann proposed a much simpler (and stronger) check on OCDs [HW06]
inside the logical kernel, but this excluded a few application theories with ambitious overloading.
Several years later Kunčar and Popescu [KP19, KP18] rediscovered that combining overloaded constant
definitions with type definitions could introduce circularities that were missed by the previous analyses.
They gave extended checks to avoid these circularities and proved that these checks ensure consistency of
theories that extend the initial HOL theory (comprising the standard HOL axioms) with overloaded constant
definitions and type definitions. In the end [KP18], they strengthened the result to show that such definitional
extensions are in fact meta-safe over the initial HOL theory.
These particular concerns were specific to detecting circular definitions, but all proof assistants potentially
contain errors, like any other software. A notable case is PVS [Owr06], which, lacking an LCF-style proof
kernel, was particularly vulnerable to soundness errors (at least in the 1990s):
PVS still seems to contain a lot of bugs and frequently new bugs show up. . . . It would be desirable
that the bugs in PVS would only influence completeness and not soundness. Unfortunately, this is
not the case, as some recent proofs of true==false have shown [15, bug numbers 71, 82, 113 and
160]. . . . It is reasonable to assume that PVS will continue to contain soundness bugs. The obvious
question thus arises, why use a proof tool that probably contains soundness bugs? Our answer is
threefold: PVS is still a very critical reader of proofs. PVS lets fewer mistakes slip through than many
of our human colleagues. . . . Furthermore, history tells us that the fixed soundness bugs are hardly
ever unintentionally explored, we know of only a single case. Thirdly, most mistakes in a system that
is to be verified are detected in the process of making a formal specification. [GH98, pages 134–5]
Notwithstanding such forbearance, users deserve a much higher standard of correctness than this. At the
same time, it’s vital to stress that users need to take responsibility for their own definitions: we know how
to detect circularity, but no system can check whether a definition conforms to the user’s true intentions.
As mentioned at the start of this section, the purpose of the foregoing work was to make possible an Isa-
belle instantiation of higher-order logic. Isabelle/HOL emerged in 1991 [NP92] and was soon a fully capable
alternative to HOL [NPW02]. In some ways it emulated the latter, particularly in its axiom system. Never-
theless, it is thoroughly Isabelle in its treatment of theorems, tactics, etc; we cannot regard Isabelle/HOL
as a member of the HOL family the way that HOL Light is.
Both Isabelle and HOL implement Milner’s idea of a proof kernel implementing a formal calculus as an
abstract type called thm. The essential difference is that Isabelle’s formal calculus is a logical framework in
which other formalisms can be defined, while HOL’s is simply higher-order logic. Either way, a function of
type thm->thm implements an inference rule. In the case of Isabelle, this would be an operation at the level
of the logical framework, such as the resolution rule through which proofs are constructed. In the case of
HOL, this would be an inference in higher-order logic itself, like the quantifier rule (1). In HOL, a tactic must
be coded separately corresponding to each rule of inference, while in Isabelle, inference rules such as (1) are
all expressed declaratively and applied using the generic resolution rule.
Isabelle’s generic nature can be seen in how many of its capabilities are shared among its various instances.
The common libraries include a general representation of syntax, with parsing, pretty printing, type checking,
simplification and other proof procedures, as well as the proof language and user interface. They are available
in other instances of Isabelle, such as Isabelle/ZF, which has been the basis for substantial developments in
Zermelo–Fraenkel set theory [Pau03].
5. Automation
It was clear from the outset that machine proof was extremely laborious and could only be feasible if the
machine itself provided as much automation as possible. But it was also clear that fully automatic theorem
proving was not practically achievable. Significant advances in automation had already been made by the
end of the 1970s:
1. decision procedures for arithmetic, arrays, lists, etc., as well as methods for combining decision procedures
[NO80];
2. resolution for first-order logic [Rob65, Ove75], complete in principle but frequently disappointing in
practice;
3. the signature automation of the Boyer/Moore theorem prover [BM79]: conditional rewriting plus powerful
heuristics for induction.
Of these, simplifiers based on rewriting (by previously proved theorems of the form t = u) quickly found
their way into LCF, HOL, etc. All such simplifiers eventually supported conditional rewriting, for rewrites
of the form φ =⇒ t = u; in such a case, the necessary instance of φ would be proved recursively by the
simplifier itself. Arithmetic decision procedures were eventually adopted in many systems. On the other
hand, resolution had acquired a bad reputation. As late as 2002, Shankar could write
The popularity of uniform proof methods like resolution stems from the simple dogma that since first-
order logic is a generic language for expressing statements, generic first-order proof search methods
must also be adequate for finding proofs. This central dogma seems absurd on the face of it. . . . A more
sophisticated version of the dogma is that a uniform proof method can serve as the basic structure
for introducing domain-specific automation. There is little empirical evidence that even this dogma
has any validity. . . . Automated reasoning has for too long been identified with uniform proof search
procedures in first-order logic. This approach shows very little promise. [Sha02, pages 3–4]
Of the various proof assistants, only Isabelle saw a sustained effort to incorporate ideas from resolution.
As mentioned in Section 3 above, Isabelle supported both unification and backtracking from the start, with
the aim of incorporating ideas from first-order automatic proof procedures. In the context of interactive
proof, unification provided the ability to prove a subgoal of the form ∃x. φ(x) by removing the quantifier and
proving φ(?t), where ?t stood as a placeholder for a concrete term to be supplied later. Through unification,
this term could even be built up incrementally. Dually, unification provided a means of using a universally
quantified fact ∀x. φ(x), when the required instances were not immediately obvious.
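A tiny example in today's Isabelle/HOL: applying the introduction rule for ∃ leaves a schematic placeholder, which a later step fills in by unification.

lemma ∃x. x = (2 :: nat)
  apply (rule exI)    (* subgoal ?x = 2, with placeholder ?x *)
  apply (rule refl)   (* unification instantiates ?x to 2 *)
  done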
5.2. Sledgehammer
By the early 2000s, resolution theorem provers [RV02, Sch04, Wei01] were demonstrating power far beyond
anything the classical reasoner could ever achieve. The idea of some sort of interface between Isabelle and
these systems was beguiling. This idea wasn't new: such linkups had been attempted on a number of past
occasions, always unsuccessfully. The key was to make such a linkup useful:
The guiding idea is that user interaction should be minimal. The system should invoke automatic
provers spontaneously or in response to a trivial gesture such as a mouse click. These proof attempts
should run in the background, not disturbing the user unless a proof is found. Proofs should refer to
a large library of known lemmas: users should not have to select the relevant ones. The automatic
prover should not be trusted; instead, proofs should be translated back into the formalism of the
interactive prover. Proofs should be delivered in source form to the user, who can simply paste them
into her proof script. [MQP06, p. 1576]
A major difficulty with building an interface between Isabelle/HOL and first-order automatic theorem provers
is that they operate on quite different formalisms. Isabelle/HOL has λ-abstractions, types and type classes,
while first-order logic has none of these. One-click invocation could only be achieved if the system itself
took care of everything: the translation of higher-order syntax, some representation of type constraints, etc.
An additional requirement was relevance filtering: to identify the most suitable of the thousands of facts
available in an Isabelle session, since providing too many would overwhelm the first-order provers.
The final difficulty was to translate the proofs discovered by the external reasoning tools into Isabelle’s
proof kernel. This translation moreover had to be expressed as a source-level proof so that the expensive
proof search would not have to be repeated. Resolution theorem provers typically print a formal justification
of their reasoning, but it is difficult to interpret and frequently ambiguous. The approach eventually adopted
was to extract from their output nothing but the list of axioms—each a known Isabelle theorem—used in the
proof. This list would typically contain no more than eight axioms, yielding a problem simple enough to be
proved by simple tactics integrated with Isabelle's kernel [PS07]. This tool became known as Sledgehammer.

Fig. 1. Sledgehammer GUI in Isabelle/jEdit: clicking on highlighted output inserts the proposed proof snippet into the text
The original version was developed at Cambridge, but Sledgehammer was comprehensively rewritten at
Munich, particularly by Blanchette [BBP13], who greatly increased its scope and power.
Sledgehammer helps beginners by identifying and using theorems that they didn’t know existed. It puts
the latest theorem provers, such as Vampire and Z3 [dMB08], at their disposal with a single click (Figure 1).
But even advanced users are often surprised by the proofs Sledgehammer comes up with. It is now seen as
indispensable, and similar subsystems are being developed for other interactive theorem provers.
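A typical interaction looks as follows (schematic; the proof actually found depends on the installed provers): the user states a lemma and invokes the tool, which replies with a one-line proof to paste into the text.

lemma length (xs @ ys) = length xs + length ys
  sledgehammer
  (* a typical reply: Try this: by simp *)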
6. Counterexample Search
Isabelle’s proof methods and Sledgehammer are effective for proving theorems, but given an invalid conjecture
they normally fail to detect the invalidity, let alone produce an informative counterexample. Novices and
experts alike state invalid theorems and find themselves wasting hours on impossible proofs. To make proving
more enjoyable and productive, Isabelle includes counterexample generators that complement the proof tools.
The main ones are Quickcheck and Nitpick. As a simple example, suppose that the user types in this lemma
statement (where rev reverses a list and @ appends two lists):
lemma rev (xs @ ys) = rev xs @ rev ys
Quickcheck is invoked automatically and displays the counterexample xs = [a₁], ys = [a₂].
6.1. Quickcheck
As the name suggests, Quickcheck is Isabelle’s counterpart to the QuickCheck testing tool [CH00] for Haskell.
The original Quickcheck [BN04] combined Isabelle’s code generation infrastructure (Section 7) with random
testing, covering both recursive functions and inductive predicates. It aimed at providing fully automatic
counterexample search, in contrast to Haskell’s QuickCheck, which is an infrastructure for building special-
ized random generators for testing. Therefore it worked automatically whenever all functions and inductive
predicates were executable, but, depending on the property, it might take forever to find a counterexample.
Consider especially conditional properties φ =⇒ ψ where most of the random values simply falsify φ; this
situation requires a special random generator that yields only values that satisfy φ.
Therefore Bulwahn [Bul12a, Bul12b, Bul12c, BBN11] refined Quickcheck in three directions:
• Exhaustive enumeration of small values, an idea due to Runciman et al. [RNL08].
• Generating values not randomly but synthesizing generators from the premises of conditional properties.
• Symbolic testing with narrowing, a technique from functional-logic programming [AH10].
Isabelle’s Quickcheck was inspired by the work of Dybjer et al. [DHT03] in the theorem prover Agda: they
had followed the original QuickCheck design, which expected users to set up specialized random generators
in the host language/logic. Over the next decade, other major theorem provers adopted analogous checkers:
PVS [Owr06], ACL2 [CDKM11] and Coq [PHD+ 15].
Quickcheck is one of Isabelle’s best-loved tools (after Sledgehammer), partly because it is invoked auto-
matically and silently every time the user types in a lemma. Users may not even be aware of its existence
until it suddenly announces that it has found a counterexample to their purported lemma. The result is a
surprised and grateful user. Quickcheck is most effective in the context of functional programming combined
with inductive predicates.
6.2. Nitpick
A radically different approach from that of Quickcheck is based on systematic model enumeration using a SAT solver.
This approach was pioneered by the tool Refute [Web05, Web08] and is now embodied by Nitpick [BN10,
Bla12]. Nitpick looks for finite fragments (substructures) of infinite countermodels, soundly approximating
problematic constructs. Common Isabelle idioms, such as inductive and coinductive predicates and datatypes
as well as recursive and corecursive functions, are treated specially to ensure efficient SAT solving [Bla13].
The actual reduction to SAT is performed by the Kodkod library [TJ07]. Given a conjecture, Nitpick (via
Kodkod and the SAT solver) searches for a standard set-theoretic model that falsifies it while satisfying any
relevant axioms and definitions. Nitpick is innately better suited to problems from set theory and logic than
Quickcheck. Nitpick revels in particular in finite combinatorial problems.
The first tool to exploit SAT solving for finding counterexamples in a theorem prover (ACL2) seems
to be due to Sumners [Sum02]. A second prototype tool, again for ACL2, was developed by Spiridonov and
Khurshid [SK07] and was based on Kodkod. Blanchette was unaware of this work when developing Nitpick.
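For example, given a plausible-looking but invalid set identity (a toy conjecture of ours), Nitpick finds a countermodel:

lemma (A ∪ B) - B = A
  nitpick
  (* a counterexample might be A = {a₁}, B = {a₁} *)
  oops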
7. Code Generation
Code generation is the process of generating efficiently executable code from definitions in the logic of a
theorem prover. It serves three main purposes: to obtain actual runnable software, to validate definitions by
executing them on concrete values, and to search for counterexamples to properties by testing.
Most major theorem provers are based on logics that have an executable sublanguage. In the Boyer and
Moore theorem prover, from its earliest incarnation [BM79] to present-day ACL2 [HKMS17], the logic is
itself a purely functional fragment of Lisp: all expressions can be executed according to Lisp semantics. In
Martin-Löf type theory [ML84], the term language — including proof terms — has a well-defined operational
semantics. Terms are also executable, subject to certain conditions, in other type-theory based systems such
as Coq [BC04]. Note that the types (which correspond to the formulas of predicate logic) are not executable.
In the case of HOL, a sublanguage can be identified that corresponds to a functional programming
language. Then we superimpose an operational semantics on this fragment: programs are sets of equations
that are to be used as rewrite rules, from left to right. This idea goes back to term rewriting and was
expressed succinctly by titles like Computing in Systems Described by Equations [O’D77] and Programming
with Equations [HO82]. The step-by-step execution of an equational program by rewriting corresponds to
performing a proof in equational logic. If the rewrite rules come from recursive function definitions in the first
place, it is natural to translate them to programs in an ML-like functional language with pattern-matching.
The first such tool was created by Rajan [Raj93] for Gordon’s HOL system [GM93].
7.1. History
Berghofer and Nipkow [BN02] realised the approach above, creating a compiler in Isabelle from a subset
of HOL into ML. Datatypes and recursive functions defined by pattern matching are translated directly
into their ML counterparts. They also extended the approach by making a subset of inductive predicates
executable. The idea is to view them as Prolog programs and to perform a mode analysis [DW88]. Each
possible mode partitions the arguments of an inductive predicate into inputs and outputs, and for each mode
the inductive predicate is compiled into an ML function that maps an input tuple to a stream of output
tuples. A subset of HOL terms and certain queries involving inductive predicates can now be compiled and
executed in ML. This formed the basis of Isabelle’s initial Quickcheck (Section 6.1).
Isabelle’s code generator is part of the trusted kernel. Work by Haftmann [Haf09] and others improved
its reliability and functionality significantly: code generation for inductive predicates was moved out of the
kernel by translating inductive predicates into recursive functions inside HOL, where the equivalence is
proved [BBH09]. Reliability of the compilation of the purely functional sublanguage was improved by a
pen-and-paper correctness proof [HN10]. Code generation was extended to type classes by eliminating them
in a first step. Code generation was extended to support data refinement [HKKN13], i.e. the automatic
implementation of abstract types like sets by concrete types like search trees. Due to a new modular design,
further target languages are easier to add: as of this writing, the code generator supports Standard ML,
OCaml, Haskell and Scala.
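For example, a recursive function can be exported directly; a minimal sketch (the function name is ours), where the generated Haskell mirrors the defining equations:

fun total :: nat list ⇒ nat where
  total [] = 0
| total (x # xs) = x + total xs

export_code total in Haskell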
Most recently, Hupel [HN18] has provided an alternative verified code generator that translates HOL into
CakeML, an ML-like functional language with a verified compiler [KMNO14]. CakeML is the only backend
for code generation that doesn’t have to be trusted, since the CakeML compiler has itself been verified
formally. This yields a verified tool chain from HOL to machine code, except that the CakeML compiler was
verified in HOL4 [SN08] rather than Isabelle/HOL. Eliminating this gap is the subject of current research.
7.2. Applications
The code generator has been an enabling technology for a large number of applications. The following are
some representative examples:
Imperative HOL Bulwahn et al. [BKH+ 08] developed a monadic embedding of imperative programs in
HOL. That is, on the HOL level everything is still purely functional, but in a monadic style. They
extended the code generator such that it translates these monadic definitions into actual imperative code
in the target language (SML etc.). This leads to substantially improved performance and is used in a
number of applications below.
Refinement Framework Lammich [Lam13, Lam16, Lam19] has developed an HOL framework for the
stepwise refinement of (possibly nondeterministic) algorithms down to (possibly imperative) executable
code. This framework is used in a number of applications below. Lammich and others [LT12, Lam14, LS19]
have used the framework extensively for the verification of efficient graph algorithms.
Model Checking and SAT Solvers Esparza et al. [ELN+ 13] developed an executable verified LTL
model checker (10–50 times slower than SPIN [Hol97] on standard benchmarks) that was later extended
with partial order reduction [BL18]. Brunner and Lammich built on work by Peled [Pel96] but found
that one of his lemmas was incorrect; thus they were unable to use his actual reduction algorithm.
Siegel [Sie19] found a counterexample to the correctness of Peled’s algorithm with the help of the Alloy
analyzer [Jac06].
Wimmer and Lammich [WL18] developed an executable model checker for timed automata whose through-
put is about one order of magnitude lower than Uppaal’s [LPY97] on standard benchmarks (but degen-
erates for large state spaces).
Lammich [Lam17] verified an executable checker for unsatisfiability certificates emitted by SAT solvers
which is twice as fast as the standard unverified checker.
Term Rewriting Thiemann has been developing a huge formalisation of the theory of term rewriting
called IsaFoR/CeTA5 over more than a decade now [TS09]. Initially, IsaFoR/CeTA was aimed primarily
at checking termination proofs found by automatic tools like AProVE [GAB+ 17]. The code generator
produces these proof checkers from their verified HOL formalizations. Today, IsaFoR/CeTA can also
check proofs of a term rewriting system’s complexity [DJK+ 18] and confluence [NM16].
Computer Algebra As representative examples we mention decision procedures for univariate real polyno-
mials (based on Sturm [Ebe15] and on cylindrical algebraic decomposition [LPP19]) and the Berlekamp-
Zassenhaus factorization algorithm [DJTY19].
5 http://cl-informatik.uibk.ac.at/software/ceta/
Programming Languages One key application of theorem provers has been the formalization of program-
ming languages and compilers [Ler09, KMNO14, NK14]. The code generator was used by Lochbihler and
Bulwahn [LB11] to generate an interpreter directly from the semantics of a Java-like language with
threads.
Isabelle framework, allows independent checking of sub-proofs—even with parallel checking on multiple cores
enabled by default, due to Matthews and Wenzel [MW10, Wen13b].
The Isar proof language reuses some ideas from Mizar, a legendary software tool for doing mathematics by
machine [GKN15]. But the many impressive facilities of the Mizar language are intertwined with its unusual
set theoretic formalism, making the key ideas difficult to extract, especially given Mizar’s notorious lack of
documentation and closed sources. In contrast, Isar’s generic principles of proof—fixing local variables and
making local assumptions—are identified with the corresponding elements of Isabelle’s logical framework.
Thus it works with any object-logic that uses the built-in natural deduction paradigm of Isabelle, e.g. HOL,
FOL, ZF. Various derived Isar language elements provide explicit proof structure for typical reasoning seen
in object-logics: existential elimination and case-splitting, calculational chains of equalities and inequalities
(including substitution), structured induction etc. All of this is parameterised by declarations in the library.
The Isar proof language is parametrised by user-defined Isar proof methods, which are the old idea of ML
tactics fitted into the richer structure of the Isar engine. Thus new reasoning patterns may be added to a
theory library without revisiting the design of the proof language itself. For example, the Isabelle standard
library defines method rule for declarative forward/backward chaining explained in Section 3, and method
auto for a combination of the simplifier and the classical reasoner explained in Section 5.1. Proof methods are
either defined in ML (as in LCF or HOL), or in the Eisbach language [MMW16] (similar to Ltac definitions
in Coq): Eisbach uses the source notation of existing proof methods to define new ones via simple recursion
and pattern matching.
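For instance, assuming the theory imports HOL-Eisbach.Eisbach, a toy method of our own combines existing methods by alternation and repetition:

method obvious = (assumption | rule conjI refl)+

lemma A =⇒ A ∧ x = x
  by obvious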
Despite its importance for the Isabelle ecosystem, the Isar proof language has not been adopted by other
interactive theorem provers. There are no fundamental obstacles to doing so, but it takes some effort to do
properly. Some isolated aspects of Isar have made it into the SSReflect language for Coq [GM10], notably the
have keyword for local claims within a proof. A bit more Isar syntax made it into the Lean prover [dMKA+ 15]
with slightly different meaning, though. Lean’s fix, assume, have, show, obtain construct certain λ-terms in
a more elementary manner than the Isabelle/Isar proof context export and goal refinement operations; e.g.
see the treatment of fix-assume-show in [Wen07, §2.2]. The experiments by Wiedijk towards supporting
structured proofs in HOL Light [Wie01] are not directly related to Isar. They are simply a family of HOL
Light tactics that allow a proof script written in ML to have some similarity to a Mizar text. This style of
working has not caught on in the HOL world.
Although some detailed aspects of Isar are specific to Isabelle—forward/backwards refinement via higher-
order unification (Section 3) and local proof contexts (Section 8.3)—the language design as a whole is generic.
It requires nothing from the underlying calculus but primitive notions of terms and types and logical notions
equivalent to “implies” and “for all”. It’s not too hard to envisage a common language to be shared among
proof assistants, offering portability of proofs exactly as today’s programming languages offer portability
between different machine architectures. The crucial missing ingredient is the ability to write assertions and
prove subgoals while minimising explicit references to a particular calculus.
The original LCF approach (Section 2) declares an abstract ML type thm of theorems, but the background
theory is implicit in the system state: during a session there is only one big theory under development,
which grows monotonically. There are primitives to introduce new types, terms, definitions and axioms
which augment the theory, but without any way to undo such a change. HOL provides some tricks to hide
unwanted constants via name-space manipulation, to help interactive development.
In contrast, as early as 1986, Isabelle declared the abstract ML type theory alongside the type of
theorems. Type theory makes available the initial theory of the logical framework and allows for its extension
as an acyclic graph of application theories: there are operations to extend and merge theories. Multiple
theories can coexist in a single Isabelle/ML session, but end-users only work with a single theory document
at a time. This may import other theory documents—in foundational order from bottom to top. The concrete
syntax looks like this:
theory Test
imports Main HOL-Library.Finite_Map HOL-Library.Finite_Lattice
begin
definition constant = term
[Figure: the theory graph of session HOL-Number_Theory, reaching from Pure and HOL through HOL-Library and HOL-Computational_Algebra to theories such as Fib, Mod_Exp, Totient, Residues, Euler_Criterion, Pocklington, Gauss, Residue_Primitive_Roots and Quadratic_Reciprocity.]
to exist and can be returned to. This also includes add-on content like the ML environment or hints for proof
tools: thus the theory context provides a default set of parameters, according to the imports from the library.
In summary, Isabelle theories provide large-scale structure to formalisation projects, with a built-in notion
of monotonic reasoning over an acyclic graph of theory nodes.
LCF and the HOL family lack an explicit notion of proof context. There are usually some auxiliary structures
to manage the current proof state (maybe just a list of subgoals) for interactive theorem proving. However,
proof tools cannot refer to those; they merely see an isolated goal. Such a goal is essentially a sequent: a
list of formulas (the assumptions) paired with another formula (the goal itself). Proof assistants based on
dependent type theories (like Coq) do have a formal context Γ that declares variables, but this again is
essentially a sequent. When applying HOL tactics it is frequently necessary to re-state the types of variables
present in the statement of the theorem being proved, and one can even introduce two variables called x
with different types. So at its most basic, we need a context to associate a type with each variable involved
in the current proof. In fact, they do much more.
A genuine Local Proof Context implemented by the ML type Proof.context first appeared in Isabelle99,
as infrastructure for the then emerging Isar proof language (Section 8.1). The idea is to support a block-
structured notepad with recursive nesting of local declarations (e.g. type parameters, term parameters,
assumptions) and local conclusions that are generalized when leaving a nested block: there is a generic
export operation to move results from a nested context into the enclosing context: this usually turns context
elements into the rule structure of the Isabelle logical framework (the quantifier ⋀ and the connective =⇒). In
concrete syntax, this looks as follows:
notepad
begin
  {
    fix x y z
    assume A x and B y
    have C x y z ⟨proof⟩
  }
  note ⟨⋀x y z. A x =⇒ B y =⇒ C x y z⟩
end
Here the final note recalls the result from the preceding proof block (enclosed in the curly brackets). This
block introduces three bound variables, assumed to be fixed and to satisfy A x and B y; from these we prove
C x y z. The effect of this block is to prove the formula shown in the note.
The notepad leaves nothing behind in the theory context: it is merely an experiment. In contrast, a
theorem statement produces an initial proof context with a pending claim: the theorem to be proved. The
subsequent proof body needs to solve that in the context, potentially with further nesting of local contexts
and auxiliary claims. The final result extends the enclosing theory (Section 8.2) by the new fact.
Proof tools may access the Proof.context value at each point. It contains the logical content (pa-
rameters, assumptions, proved facts), local syntax, and hints to guide proof strategies. Thus the universal
Proof.context replaces Isabelle’s earlier tool-specific contexts like simpset for the simplifier or claset for
the classical reasoner. This uniformity is important to combine tools: e.g. the simplifier uses the context to
extract its own information (rewrite rules and auxiliary proof procedures); during the simplification process it
augments the context and passes it on to other procedures, which in turn extract their own information from
it (e.g. for classical reasoning). Users can even extend the context with additional components to support
proof procedures specific to their own applications. Contexts provide extensibility along with modularity.
In summary, a proof context in Isabelle represents a local situation derived from the enclosing theory.
Nested contexts appear and disappear; only exported results remain. For proof tools, the context is a universal
environment for storing tool-specific information.
Isabelle locales provide an infrastructure for structured specifications: definitions, statements and proofs of
a theory may depend on local parameters (type and term variables) and local premises (hypotheses). The
subsequent example specifies partial orders axiomatically and defines a derived operation and theorem in
that specification context:
locale partial_order =
  fixes le :: ′a ⇒ ′a ⇒ bool  (infixl ⊑ 50)
  assumes refl : x ⊑ x
    and trans: x ⊑ y =⇒ y ⊑ z =⇒ x ⊑ z
    and antisym: x ⊑ y =⇒ y ⊑ x =⇒ x = y
This introduces a new locale, called partial_order. The locale declares le, effectively a constant to which we
attach infix syntax. The locale asserts what are effectively axioms. But in reality, a locale abbreviates a
predicate taking in this case a single argument, asserting that it satisfies the given assumptions. The point
is to allow this small specification to be imported simply by quoting the name partial_order.
definition (in partial_order )
  less :: ′a ⇒ ′a ⇒ bool  (infixl ⊏ 50)
  where x ⊏ y ←→ x ⊑ y ∧ x ≠ y
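A locale can later be interpreted by a concrete structure, discharging its assumptions and making the derived operations and theorems available. A sketch (this particular interpretation is our own example):

interpretation nat_order : partial_order (≤) :: nat ⇒ nat ⇒ bool
  by unfold_locales (auto intro: order_trans antisym)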
HOL. The original locale concept was inspired by experiments (since abandoned) that had been done in
HOL using higher-order predicates. Coq has a richer logic with a built-in notion of “sections”: the original
Isabelle locales from 1999 [KWP99] were inspired by that, but without augmenting the logical framework
of Isabelle. This design principle of building higher concepts without extending the existing logic was later
adopted by Coq: its type classes are built on top of existing concepts like records, predicates, and implicit
arguments.
LCF and HOL can be seen as nothing but libraries of proof procedures written in ML. The user can call
those procedures via the ML toplevel but does not have to prove theorems and could instead, say, calculate
π to 10,000 digits. In this sense, using LCF or HOL is the same as working with any other subroutine library.
This type of ML toplevel no longer exists in Isabelle. Instead, ML has become a sub-language of the
framework. Syntactically, ML expressions (or whole modules) may appear within the theory and proof
language of Isabelle/Isar: for example, in the command method_setup to define a new proof method.
Semantically, program snippets may depend on symbolic entities from the logical context: types, terms,
facts, etc. The ML compiler is invoked at run-time within an augmented environment; it refers to logical
entities as well as its own environment (for ML types, values, modules). The static result is an updated ML
environment within the context: thus it also conforms to the parallel evaluation model of Isabelle/ML.
Specialised proof procedures can be implemented in this manner within the normal theory document,
using the regular Prover IDE (see Section 9). This integration of programming with logic works without
augmenting the logic: like in original LCF, ML has access to the implementation of the object-logic, but is
not part of the logical formalism. This is in contrast to Coq, where users often implement proof tools inside
the logical language itself (with correctness proofs), but genuine extensions in OCaml need to be assembled
outside the system as “plugins”.
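For example (a sketch; the method name is ours), a new proof method can be defined in the middle of a theory document, with the ML compiler invoked at run-time:

method_setup my_simp = ‹
  Scan.succeed (fn ctxt => SIMPLE_METHOD' (simp_tac ctxt))
› "rewrite with the default simpset"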
The original LCF proof assistant from 1979 used a line-oriented terminal or teletype. This model is known
as a read-eval-print loop (or REPL): the user types one command after another, reacting to output printed
by the prover. When computer screens and multiple windows arrived, there was often a split into two areas:
the editor to work on a growing “proof script” and the terminal with the REPL to update its state, using
manual copy-paste operations from the editor.
Around 1998, the highly influential Proof General interface for Emacs was released for the first time
[Asp00], including support for Coq and Isabelle. Here a refined model of copy-paste and state-synchronization
is baked into the editor (which is freely programmable in Lisp): the user can move a frontier of already checked
text either forwards (apply command) or backwards (undo command); only the unchecked part may be freely
edited. Proof General requires suitable undo operations of the prover, and for robustness it is better to have
a restricted command-language instead of arbitrary ML. Both are missing in HOL (any version), which
consequently still uses the original prover REPL, with some support through Emacs.
Today, Coq remains the main back-end for Proof General Emacs; there is also a popular Proof General
clone written in OCaml: CoqIde. In contrast, Isabelle discontinued both the REPL and its Proof General
mode in 2014: interaction now works exclusively via the document-oriented Prover IDE.
From 2008, multi-threaded ML programming became routinely available in Isabelle, for parallel processing
of theories and proofs within a single Poly/ML process. This posed some challenges to the robustness and
performance of the prover engine, addressed subsequently by Matthews and Wenzel [Wen13b].
Parallel processing is also in conflict with the traditional interaction model: the REPL acts as a single
focus of single-threaded command application. In order to remove the many built-in assumptions of sequential
evaluation from the interaction model, and to provide rich semantic information during the proof authoring
process, Wenzel introduced the document-oriented Prover IDE (PIDE) approach [Wen11, Wen13a, Wen14, Wen19].
The main principles of PIDE are as follows:
• The prover supports document edits and markup reports natively. Interaction works via protocol com-
mands (like Document.update) that take regular prover commands as data (e.g. definition, theorem).
The prover has its own policies for processing proof documents in parallel, according to the structure of the text.
• The editor connects the physical world of editor input events and GUI painting to the mathematical
document-model of the prover. There are pipelines to stream input and output events asynchronously,
with explicit identification of document versions.
• Add-on tools may participate in the ongoing document processing by conventional means, as isolated
functions from input to output that are managed by PIDE. External tools merely need to ensure that
interrupts work correctly: this is required when the user continues editing and old versions of the document
are eventually discontinued.
Unlike Proof General, PIDE never locks the source text: edits by the user may lead to instantaneous
updates by the prover, or significant delays for slow proof tools. Thanks to overall performance improvements
of Isabelle and its underlying Poly/ML implementation, this ambitious interaction model works smoothly.
Isabelle/PIDE is delivered to end-users as a fully integrated desktop application called Isabelle/jEdit: it is
based on the Java text editor jEdit (see http://www.jedit.org). This explains the initial motivation
to use the Java platform for the outward-facing side of PIDE, which is implemented in Scala (on the JVM).
Non-Java editors (e.g. VSCode) may be connected to PIDE via an extra socket connection that exchanges
JSON records. There is also a headless PIDE server with a similar protocol; this allows PIDE to run under
the control of another program.
Generally speaking, Isabelle/ML works best for pure applications of mathematical logic inside the prover,
whereas Isabelle/Scala allows us to connect to the physical world: IDE front-ends, database engines, TCP
services, etc. Such technologies are not available at the same level of quality in Standard ML, nor even in
OCaml (which underlies HOL Light and Coq).
Entries in the Archive of Formal Proofs (AFP) are published with the aim of long-term preservation and
with the purpose of providing a stable interface to users. However, true preservation requires ensuring
that entries continue to work as Isabelle itself evolves. The AFP deals with this conflict as follows. For each
Isabelle release there is a corresponding AFP release. There is also a separate development version of the AFP
that is updated by Isabelle developers and AFP authors: Isabelle developers maintain all entries to be up
to date with the current Isabelle development version; authors can update their articles monotonically by
adding further material, while ensuring that all entries that depend on theirs still work.
As Isabelle evolves, the self-imposed requirement to maintain all AFP articles in working order puts a
burden on the Isabelle developers. At the same time, it acts as a reality check, helping developers to evaluate
the impact of their changes on the user community. Isar’s structured proofs assist the maintenance effort by
localising the impact of changes. Sledgehammer is also invaluable when fixing unfamiliar proofs.
The model for the AFP is the Mizar Mathematical Library [Miz, BBG+ 18], which was started in 1989.
The statistics today are similar: 250 authors, 1300 articles, 60,000 theorems and 3,000,000 lines (97 MB) of text.
Of course, one has to take into account that the languages and logics are different and that Mizar has less
proof automation.
Acknowledgement. We thank the referees, Jasmin Blanchette, Cliff Jones, Michael Norrish and Andrei
Popescu for valuable comments on drafts of this paper. The work reported above was funded by the British
EPSRC, the German DFG and various European Union funding agencies.
6 Included in each release is a file entitled CONTRIBUTORS, but it only goes back to 2005 and has many omissions.
References
[AH10] Sergio Antoy and Michael Hanus. Functional logic programming. Commun. ACM, 53(4):74–85, 2010.
[Art16] Rob Arthan. On definitions of constants and types in HOL. J. Automated Reasoning, 56(3):205–219, 2016.
[Asp00] David Aspinall. Proof General: A generic tool for proof development. In Susanne Graf and Michael Schwartzbach,
editors, European Joint Conferences on Theory and Practice of Software (ETAPS), volume 1785 of LNCS. Springer,
2000.
[Bal06] Clemens Ballarin. Interpretation of locales in Isabelle: Theories and proof contexts. In Jonathan M. Borwein and
William M. Farmer, editors, Mathematical Knowledge Management, 5th International Conference, MKM 2006,
volume 4108 of Lecture Notes in Computer Science, pages 31–43. Springer, 2006.
[Bal14] Clemens Ballarin. Locales: A module system for mathematical theories. J. Autom. Reasoning, 52(2):123–153,
2014.
[Bar77] J. Barwise. An introduction to first-order logic. In J. Barwise, editor, Handbook of Mathematical Logic, pages
5–46. North-Holland, 1977.
[BBG+ 18] Grzegorz Bancerek, Czeslaw Bylinski, Adam Grabowski, Artur Kornilowicz, Roman Matuszewski, Adam Naumowicz,
and Karol Pak. The role of the Mizar Mathematical Library for interactive proof development in Mizar. J.
Autom. Reasoning, 61(1-4):9–32, 2018.
[BBH09] Stefan Berghofer, Lukas Bulwahn, and Florian Haftmann. Turning inductive into equational specifications. In
Stefan Berghofer, Tobias Nipkow, Christian Urban, and Makarius Wenzel, editors, Theorem Proving in Higher
Order Logics, volume 5674 of LNCS, pages 131–146. Springer, 2009.
[BBN11] Jasmin Christian Blanchette, Lukas Bulwahn, and Tobias Nipkow. Automatic proof and disproof in Isabelle/HOL.
In C. Tinelli and V. Sofronie-Stokkermans, editors, Frontiers of Combining Systems (FroCoS 2011), volume 6989
of LNCS, pages 12–27. Springer, 2011.
[BBP13] Jasmin Christian Blanchette, Sascha Böhme, and Lawrence C. Paulson. Extending Sledgehammer with SMT
solvers. Journal of Automated Reasoning, 51(1):109–128, 2013.
[BC04] Yves Bertot and Pierre Castéran. Interactive Theorem Proving and Program Development: Coq’Art: The Calculus
of Inductive Constructions. Springer, 2004.
[BKH+ 08] Lukas Bulwahn, Alexander Krauss, Florian Haftmann, Levent Erkök, and John Matthews. Imperative functional
programming with Isabelle/HOL. In Otmane Aït Mohamed, César A. Muñoz, and Sofiène Tahar, editors, Theorem
Proving in Higher Order Logics, 21st International Conference, TPHOLs 2008, volume 5170 of Lecture Notes in
Computer Science, pages 134–149. Springer, 2008.
[BL18] Julian Brunner and Peter Lammich. Formal verification of an executable LTL model checker with partial order
reduction. J. Autom. Reasoning, 60(1):3–21, 2018.
[Bla12] Jasmin Christian Blanchette. Automatic Proofs and Refutations for Higher-Order Logic. PhD thesis, Technical
University Munich, 2012.
[Bla13] Jasmin Christian Blanchette. Relational analysis of (co)inductive predicates, (co)algebraic datatypes, and
(co)recursive functions. Software Quality Journal, 21(1):101–126, 2013.
[BM79] Robert S. Boyer and J Strother Moore. A Computational Logic. Academic Press, 1979.
[BN02] Stefan Berghofer and Tobias Nipkow. Executing higher order logic. In P. Callaghan, Z. Luo, J. McKinna, and
R. Pollack, editors, Types for Proofs and Programs (TYPES 2000), volume 2277 of LNCS, pages 24–40. Springer,
2002.
[BN04] Stefan Berghofer and Tobias Nipkow. Random testing in Isabelle/HOL. In J. Cuellar and Z. Liu, editors, Software
Engineering and Formal Methods (SEFM 2004), pages 230–239. IEEE Computer Society, 2004.
[BN10] Jasmin Christian Blanchette and Tobias Nipkow. Nitpick: A counterexample generator for higher-order logic based
on a relational model finder. In Matt Kaufmann and Lawrence C. Paulson, editors, Interactive Theorem Proving,
volume 6172 of LNCS, pages 131–146. Springer, 2010.
[Bul12a] Lukas Bulwahn. Counterexample Generation for Higher-Order Logic using Functional and Logic Programming.
PhD thesis, Technical University Munich, 2012.
[Bul12b] Lukas Bulwahn. The new Quickcheck for Isabelle: Random, exhaustive and symbolic testing under one roof. In
Chris Hawblitzel and Dale Miller, editors, Certified Programs and Proofs, volume 7679 of LNCS, pages 92–108.
Springer, 2012.
[Bul12c] Lukas Bulwahn. Smart testing of functional programs in Isabelle. In Nikolaj Bjørner and Andrei Voronkov, editors,
Logic for Programming, Artificial Intelligence, and Reasoning, volume 7180 of LNCS, pages 153–167. Springer,
2012.
[CDKM11] Harsh Raju Chamarthi, Peter C. Dillinger, Matt Kaufmann, and Panagiotis Manolios. Integrating testing and
interactive theorem proving. In David Hardin and Julien Schmaltz, editors, 10th International Workshop on the
ACL2 Theorem Prover and its Applications, ACL2 2011, volume 70 of EPTCS, pages 4–19, 2011.
[CH00] Koen Claessen and John Hughes. QuickCheck: a lightweight tool for random testing of Haskell programs. In Martin
Odersky and Philip Wadler, editors, Fifth ACM SIGPLAN International Conference on Functional Programming
(ICFP ’00), pages 268–279. ACM, 2000.
[CM87] W. F. Clocksin and C. S. Mellish. Programming in Prolog. Springer, 3rd edition, 1987.
[DHT03] P. Dybjer, Q. Haiyan, and M. Takeyama. Combining testing and proving in dependent type theory. In Theorem
Proving in Higher Order Logics, volume 2758 of LNCS, pages 188–203. Springer, 2003.
[DJK+ 18] Jose Divasón, Sebastiaan J. C. Joosten, Ondrej Kuncar, René Thiemann, and Akihisa Yamada. Efficient
certification of complexity proofs: formalizing the Perron-Frobenius theorem (invited talk paper). In June Andronick
and Amy P. Felty, editors, 7th ACM SIGPLAN International Conference on Certified Programs and Proofs, CPP
2018, pages 2–13. ACM, 2018.
[DJTY19] Jose Divasón, Sebastiaan J. C. Joosten, René Thiemann, and Akihisa Yamada. A verified implementation of the
Berlekamp-Zassenhaus factorization algorithm. J. Autom. Reasoning, 2019. Published online.
[dMB08] Leonardo de Moura and Nikolaj Bjørner. Z3: An efficient SMT solver. In C. Ramakrishnan and Jakob Rehof,
editors, Tools and Algorithms for the Construction and Analysis of Systems, volume 4963 of Lecture Notes in
Computer Science, pages 337–340. Springer, 2008.
[dMKA+ 15] Leonardo Mendonça de Moura, Soonho Kong, Jeremy Avigad, Floris van Doorn, and Jakob von Raumer. The
Lean theorem prover (system description). In Amy P. Felty and Aart Middeldorp, editors, Automated Deduction
— CADE-25, volume 9195 of Lecture Notes in Computer Science, pages 378–388. Springer, 2015.
[DW88] Saumya K. Debray and David S. Warren. Automatic mode inference for logic programs. The Journal of Logic
Programming, 5(3):207–229, 1988.
[Ebe15] Manuel Eberl. A decision procedure for univariate real polynomials in Isabelle/HOL. In 2015 Conference on
Certified Programs and Proofs, CPP ’15, pages 75–83. ACM, 2015.
[ELN+ 13] Javier Esparza, Peter Lammich, René Neumann, Tobias Nipkow, Alexander Schimpf, and Jan-Georg Smaus. A
fully verified executable LTL model checker. In Natasha Sharygina and Helmut Veith, editors, Computer Aided
Verification - 25th International Conference, CAV 2013, volume 8044 of Lecture Notes in Computer Science, pages
463–478. Springer, 2013.
[FGJM85] Kokichi Futatsugi, Joseph A. Goguen, Jean-Pierre Jouannaud, and José Meseguer. Principles of OBJ2. In 12th
ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL), pages 52–66, New York,
NY, USA, 1985. ACM.
[GAB+ 17] Jürgen Giesl, Cornelius Aschermann, Marc Brockschmidt, Fabian Emmes, Florian Frohn, Carsten Fuhs, Jera
Hensel, Carsten Otto, Martin Plücker, Peter Schneider-Kamp, Thomas Ströder, Stephanie Swiderski, and René
Thiemann. Analyzing program termination and complexity automatically with AProVE. J. Autom. Reasoning,
58(1):3–31, 2017.
[GH98] David Griffioen and Marieke Huisman. A comparison of PVS and Isabelle/HOL. In Jim Grundy and Malcolm
Newey, editors, Theorem Proving in Higher Order Logics: TPHOLs ’98, pages 123–142. Springer, 1998.
[GKMB17] Victor B. F. Gomes, Martin Kleppmann, Dominic P. Mulligan, and Alastair R. Beresford. Verifying strong eventual
consistency in distributed systems. Proc. ACM Program. Lang., 1(OOPSLA):109:1–109:28, October 2017.
[GKN15] Adam Grabowski, Artur Kornilowicz, and Adam Naumowicz. Four decades of Mizar. Journal of Automated
Reasoning, 55(3):191–198, Oct 2015.
[GM93] M. J. C. Gordon and T. F. Melham, editors. Introduction to HOL: A Theorem Proving Environment for Higher
Order Logic. Cambridge University Press, 1993.
[GM10] Georges Gonthier and Assia Mahboubi. An introduction to small scale reflection in Coq. J. Formalized Reasoning,
3(2), 2010.
[GMW79] Michael J. C. Gordon, Robin Milner, and Christopher P. Wadsworth. Edinburgh LCF: A Mechanised Logic of
Computation. LNCS 78. Springer, 1979.
[Gog79] Joseph A. Goguen. Some design principles and theory for OBJ-O, a language to express and execute algebraic
specification for programs. In Edward K. Blum, Manfred Paul, and Satoru Takasu, editors, Mathematical Studies
of Information Processing, volume 75 of LNCS, pages 425–473. Springer, 1979.
[Gor86] Michael J. C. Gordon. Why higher-order logic is a good formalism for specifying and verifying hardware. In
G. Milne and P. A. Subrahmanyam, editors, Formal Aspects of VLSI Design, pages 153–177. North-Holland, 1986.
[Gor00] Michael J. C. Gordon. From LCF to HOL: a short history. In Gordon Plotkin, Colin Stirling, and Mads Tofte,
editors, Proof, Language, and Interaction: Essays in Honor of Robin Milner, pages 169–185. MIT Press, 2000.
[Gor15] M. J. C. Gordon. Tactics for mechanized reasoning: A commentary on Milner (1984) ‘The use of machines to assist
in rigorous proof’. Philosophical Transactions of the Royal Society. Series A, 373(2039), 2015.
[HAB+ 17] Thomas Hales, Mark Adams, Gertrud Bauer, Tat Dat Dang, John Harrison, Le Truong Hoang, Cezary Kaliszyk,
Victor Magron, Sean Mclaughlin, Tat Thang Nguyen, et al. A formal proof of the Kepler conjecture. Forum of
Mathematics, Pi, 5:e2, 2017.
[Haf09] Florian Haftmann. Code Generation from Specifications in Higher Order Logic. PhD thesis, Technische Universität
München, 2009.
[Har96] John Harrison. HOL Light: A tutorial introduction. In Mandayam K. Srivas and Albert John Camilleri, editors,
Formal Methods in Computer-Aided Design: FMCAD ’96, LNCS 1166, pages 265–269. Springer, 1996.
[HHP93] Robert Harper, Furio Honsell, and Gordon Plotkin. A framework for defining logics. Journal of the ACM,
40(1):143–184, 1993.
[HIH13] Johannes Hölzl, Fabian Immler, and Brian Huffman. Type classes and filters for mathematical analysis in
Isabelle/HOL. In S. Blazy, C. Paulin-Mohring, and D. Pichardie, editors, Interactive Theorem Proving — 4th
International Conference, LNCS 7998, pages 279–294. Springer, 2013.
[HKKN13] Florian Haftmann, Alexander Krauss, Ondřej Kunčar, and Tobias Nipkow. Data refinement in Isabelle/HOL.
In S. Blazy, C. Paulin-Mohring, and D. Pichardie, editors, Interactive Theorem Proving — 4th International
Conference, volume 7998 of LNCS, pages 100–115. Springer, 2013.
[HKMS17] Warren A. Hunt, Jr., Matt Kaufmann, J Strother Moore, and Anna Slobodova. Industrial hardware and software
verification with ACL2. Philosophical Transactions of the Royal Society. Series A, 375(2104), 2017.
[HN10] Florian Haftmann and Tobias Nipkow. Code generation via higher-order rewrite systems. In M. Blume,
N. Kobayashi, and G. Vidal, editors, Functional and Logic Programming (FLOPS 2010), volume 6009 of LNCS,
pages 103–117. Springer, 2010.
[HN18] Lars Hupel and Tobias Nipkow. A verified compiler from Isabelle/HOL to CakeML. In A. Ahmed, editor, European
Symposium on Programming (ESOP 2018), volume 10801 of LNCS, pages 999–1026. Springer, 2018.
[HO82] Christoph M. Hoffmann and Michael J. O’Donnell. Programming with equations. ACM Trans. Program. Lang.
Syst., 4(1):83–112, 1982.
[Hol97] Gerard J. Holzmann. The model checker SPIN. IEEE Trans. Software Eng., 23(5):279–295, 1997.
[Hue75] G. P. Huet. A unification algorithm for typed λ-calculus. Theoretical Computer Science, 1:27–57, 1975.
[HW06] Florian Haftmann and Makarius Wenzel. Constructive type classes in Isabelle. In Thorsten Altenkirch and Conor
McBride, editors, Types for Proofs and Programs, volume 4502 of LNCS, pages 160–174. Springer, 2006.
[HW09] Florian Haftmann and Makarius Wenzel. Local theory specifications in Isabelle/Isar. In Stefano Berardi, Ferruccio
Damiani, and Ugo de Liguoro, editors, Types for Proofs and Programs, TYPES 2008, volume 5497 of LNCS.
Springer, 2009.
[Jac06] Daniel Jackson. Software Abstractions. Logic, Language, and Analysis. MIT Press, 2006.
[KAE+ 10] Gerwin Klein, June Andronick, Kevin Elphinstone, Gernot Heiser, David Cock, Philip Derrin, Dhammika Elkaduwe,
Kai Engelhardt, Rafal Kolanski, Michael Norrish, Thomas Sewell, Harvey Tuch, and Simon Winwood. seL4: Formal
verification of an operating-system kernel. Communications of the ACM, 53(6):107–115, June 2010.
[Kal91] Sara Kalvala. HOL around the world. In M. Archer, J. J. Joyce, K. N. Levitt, and P. J. Windley, editors,
International Workshop on the HOL Theorem Proving System and its Applications, pages 4–12. IEEE Computer
Society, 1991.
[KMNO14] Ramana Kumar, Magnus O. Myreen, Michael Norrish, and Scott Owens. CakeML: a verified implementation of
ML. In Suresh Jagannathan and Peter Sewell, editors, The 41st Annual ACM SIGPLAN-SIGACT Symposium on
Principles of Programming Languages, POPL ’14, pages 179–192. ACM, 2014.
[KP18] Ondřej Kunčar and Andrei Popescu. Safety and conservativity of definitions in HOL and Isabelle/HOL. PACMPL,
2(POPL):24:1–24:26, 2018.
[KP19] Ondřej Kunčar and Andrei Popescu. A consistent foundation for Isabelle/HOL. J. Automated Reasoning,
62(4):531–555, 2019.
[KWP99] Florian Kammüller, Markus Wenzel, and Lawrence C. Paulson. Locales: A sectioning concept for Isabelle. In
Y. Bertot, G. Dowek, A. Hirschowitz, C. Paulin, and L. Thery, editors, Theorem Proving in Higher Order Logics:
TPHOLs ’99, volume 1690 of LNCS. Springer, 1999.
[Lam13] Peter Lammich. Automatic data refinement. In Sandrine Blazy, Christine Paulin-Mohring, and David Pichardie,
editors, Interactive Theorem Proving - 4th International Conference, ITP 2013, volume 7998 of Lecture Notes in
Computer Science, pages 84–99. Springer, 2013.
[Lam14] Peter Lammich. Verified efficient implementation of Gabow’s strongly connected component algorithm. In Gerwin
Klein and Ruben Gamboa, editors, Interactive Theorem Proving — 5th International Conference, ITP 2014,
volume 8558 of Lecture Notes in Computer Science, pages 325–340. Springer, 2014.
[Lam16] Peter Lammich. Refinement based verification of imperative data structures. In Jeremy Avigad and Adam Chlipala,
editors, 5th ACM SIGPLAN Conference on Certified Programs and Proofs, pages 27–36. ACM, 2016.
[Lam17] Peter Lammich. Efficient verified (UN)SAT certificate checking. In Leonardo de Moura, editor, Automated
Deduction — CADE-26, volume 10395 of Lecture Notes in Computer Science, pages 237–254. Springer, 2017.
[Lam19] Peter Lammich. Refinement to imperative HOL. J. Autom. Reasoning, 62(4):481–503, 2019.
[LB11] Andreas Lochbihler and Lukas Bulwahn. Animating the formalised semantics of a Java-like language. In Marko C.
J. D. van Eekelen, Herman Geuvers, Julien Schmaltz, and Freek Wiedijk, editors, Interactive Theorem Proving -
Second International Conference, ITP 2011, volume 6898 of Lecture Notes in Computer Science, pages 216–232.
Springer, 2011.
[Ler09] Xavier Leroy. A formally verified compiler back-end. J. Automated Reasoning, 43:363–446, 2009.
[LPP19] Wenda Li, Grant Olney Passmore, and Lawrence C. Paulson. Deciding univariate polynomial problems using
untrusted certificates in Isabelle/HOL. J. Autom. Reasoning, 62(1):69–91, 2019.
[LPY97] Kim Guldstrand Larsen, Paul Pettersson, and Wang Yi. UPPAAL in a nutshell. STTT, 1(1-2):134–152, 1997.
[LS19] Peter Lammich and S. Reza Sefidgar. Formalizing network flow algorithms: A refinement approach in Isabelle/HOL.
J. Autom. Reasoning, 62(2):261–280, 2019.
[LT12] Peter Lammich and Thomas Tuerk. Applying data refinement for monadic programs to Hopcroft’s algorithm. In
Lennart Beringer and Amy P. Felty, editors, Interactive Theorem Proving - Third International Conference, ITP
2012, volume 7406 of Lecture Notes in Computer Science, pages 166–182. Springer, 2012.
[Mil85] Robin Milner. The use of machines to assist in rigorous proof. In C. A. R. Hoare and J. C. Shepherdson, editors,
Mathematical Logic and Programming Languages, pages 77–88. Prentice-Hall, 1985.
[Mil91] Dale Miller. A logic programming language with lambda-abstraction, function variables, and simple unification.
J. Log. Comput., 1(4):497–536, 1991.
[Miz] The Mizar Mathematical Library. http://mizar.org.
[ML84] P. Martin-Löf. Constructive mathematics and computer programming. Philosophical Transactions of the Royal
Society. Series A, 312(1522):501–518, 1984.
[MMW16] Daniel Matichuk, Toby C. Murray, and Makarius Wenzel. Eisbach: A proof method language for Isabelle. Journal
of Automated Reasoning, 56(3), 2016.
[MQP06] Jia Meng, Claire Quigley, and Lawrence C. Paulson. Automation for interactive proof: First prototype. Information
and Computation, 204(10):1575–1596, 2006.
[MW10] D. Matthews and M. Wenzel. Efficient parallel programming in Poly/ML and Isabelle/ML. In ACM SIGPLAN
Workshop on Declarative Aspects of Multicore Programming (DAMP 2010), 2010.
[Nip91a] Tobias Nipkow. Higher-order critical pairs. In Proc. 6th IEEE Symp. Logic in Computer Science, pages 342–349.
IEEE Press, 1991.
[Nip91b] Tobias Nipkow. Higher-order unification, polymorphism, and subsorts. In S. Kaplan and M. Okada, editors, Proc.
2nd Int. Workshop Conditional and Typed Rewriting Systems, volume 516 of LNCS. Springer, 1991.
[Nip93a] Tobias Nipkow. Functional unification of higher-order patterns. In Proc. 8th IEEE Symp. Logic in Computer
Science, pages 64–74, 1993.
[Nip93b] Tobias Nipkow. Order-sorted polymorphism in Isabelle. In Gérard Huet and Gordon Plotkin, editors, Logical
Environments, pages 164–188. Cambridge University Press, 1993.
[NK14] Tobias Nipkow and Gerwin Klein. Concrete Semantics with Isabelle/HOL. Springer, 2014. 298 pp.
http://concrete-semantics.org.
[NM16] Julian Nagele and Aart Middeldorp. Certification of classical confluence results for left-linear term rewrite
systems. In Jasmin Christian Blanchette and Stephan Merz, editors, Interactive Theorem Proving - 7th International
Conference, ITP 2016, volume 9807 of Lecture Notes in Computer Science, pages 290–306. Springer, 2016.
[NO80] Greg Nelson and Derek C. Oppen. Fast decision procedures based on congruence closure. Journal of the ACM,
27(2):356–364, 1980.
[NP92] Tobias Nipkow and Lawrence C. Paulson. Isabelle-91. In Deepak Kapur, editor, Automated Deduction —
CADE-11, volume 607 of LNCS, pages 673–676. Springer, 1992.
[NP93] Tobias Nipkow and Christian Prehofer. Type checking type classes. In Principles of Programming Languages,
POPL ’93, pages 409–418, New York, NY, USA, 1993. ACM.
[NP98] Tobias Nipkow and Christian Prehofer. Higher-order rewriting and equational reasoning. In W. Bibel and
P. Schmitt, editors, Automated Deduction — A Basis for Applications. Volume I: Foundations, volume 8 of
Applied Logic Series, pages 399–430. Kluwer, 1998.
[NPW02] Tobias Nipkow, Lawrence C. Paulson, and Markus Wenzel. Isabelle/HOL: A Proof Assistant for Higher-Order
Logic. Springer, 2002. Online at http://isabelle.in.tum.de/dist/Isabelle/doc/tutorial.pdf.
[NS91] Tobias Nipkow and Gregor Snelting. Type classes and overloading resolution via order-sorted unification. In
J. Hughes, editor, Proc. 5th ACM Conf. Functional Programming Languages and Computer Architecture, volume
523 of LNCS, pages 1–14. Springer, 1991.
[Obu06] Steven Obua. Checking conservativity of overloaded definitions in higher-order logic. In Frank Pfenning, editor,
Term Rewriting and Applications, volume 4098 of LNCS, pages 212–226. Springer, 2006.
[O’D77] Michael J. O’Donnell. Computing in Systems Described by Equations, volume 58 of LNCS. Springer, 1977.
[Ove75] R. Overbeek. An implementation of hyper-resolution. Computers and Mathematics with Applications, 1:201–214,
1975.
[Owr06] Sam Owre. Random testing in PVS. In Workshop on Automated Formal Methods (AFM), 2006. Online at
http://fm.csl.sri.com/AFM06/papers/5-Owre.pdf.
[Pau86] Lawrence C. Paulson. Natural deduction as higher-order resolution. Journal of Logic Programming, 3:237–258,
1986.
[Pau89] Lawrence C. Paulson. The foundation of a generic theorem prover. J. Automated Reasoning, 5(3):363–397, 1989.
[Pau90] Lawrence C. Paulson. Isabelle: The next 700 theorem provers. In P. Odifreddi, editor, Logic and Computer Science,
pages 361–386. Academic Press, 1990.
[Pau93] Lawrence C. Paulson. Isabelle’s object-logics. Technical Report 286, Cambridge University Computer Laboratory,
1993.
[Pau94] Lawrence C. Paulson. Isabelle - A Generic Theorem Prover (with contributions by T. Nipkow), volume 828 of
Lecture Notes in Computer Science. Springer, 1994.
[Pau98] Lawrence C. Paulson. The inductive approach to verifying cryptographic protocols. Journal of Computer Security,
6(1-2):85–128, 1998.
[Pau99] Lawrence C. Paulson. A generic tableau prover and its integration with Isabelle. Journal of Universal Computer
Science, 5(3):73–87, 1999.
[Pau03] Lawrence C. Paulson. The relative consistency of the axiom of choice — mechanized using Isabelle/ZF. LMS
Journal of Computation and Mathematics, 6:198–248, 2003. http://www.lms.ac.uk/jcm/6/lms2003-001/.
[Pau04] Lawrence C. Paulson. Organizing numerical theories using axiomatic type classes. J. Automated Reasoning,
33(1):29–49, 2004.
[Pau18] Lawrence C. Paulson. Computational logic: Its origins and applications. Proceedings of the Royal Society of London
A: Mathematical, Physical and Engineering Sciences, 474(2210), 2018.
[Pel96] Doron A. Peled. Combining partial order reductions with on-the-fly model-checking. Formal Methods in System
Design, 8(1):39–64, 1996.
[PG96] Lawrence C. Paulson and Krzysztof Grabczewski. Mechanizing set theory: Cardinal arithmetic and the axiom of
choice. J. Automated Reasoning, 17(3):291–323, December 1996.
[PHD+ 15] Zoe Paraskevopoulou, Catalin Hritcu, Maxime Dénès, Leonidas Lampropoulos, and Benjamin C. Pierce.
Foundational property-based testing. In Christian Urban and Xingyuan Zhang, editors, Interactive Theorem Proving,
volume 9236 of LNCS, pages 325–343. Springer, 2015.
[PS07] Lawrence C. Paulson and Kong Woei Susanto. Source-level proof reconstruction for interactive theorem proving.
In Klaus Schneider and Jens Brandt, editors, Theorem Proving in Higher Order Logics: TPHOLs 2007, LNCS
4732, pages 232–245. Springer, 2007.
[Raj93] Sreeranga P. Rajan. Executing HOL specifications: Towards an evaluation semantics for classical higher order
logic. In Luc J. M. Claesen and Michael J. C. Gordon, editors, Higher Order Logic Theorem Proving and its
Applications, volume A-20 of IFIP Transactions, pages 527–536. North-Holland/Elsevier, 1993.
[RNL08] Colin Runciman, Matthew Naylor, and Fredrik Lindblad. SmallCheck and lazy SmallCheck: Automatic exhaustive
testing for small values. In Andy Gill, editor, Proc. 1st ACM SIGPLAN Symposium on Haskell, pages 37–48.
ACM, 2008.
[Rob65] J. A. Robinson. A machine-oriented logic based on the resolution principle. Journal of the ACM, 12:23–41, 1965.
[RV02] Alexandre Riazanov and Andrei Voronkov. The design and implementation of VAMPIRE. AI Communications,
15(2):91–110, 2002.
[Sch04] Stephan Schulz. System description: E 0.81. In David Basin and Michaël Rusinowitch, editors, Automated
Reasoning — Second International Joint Conference, IJCAR 2004, LNAI 3097, pages 223–228. Springer, 2004.
[Sha02] Natarajan Shankar. Little engines of proof. In L.-H. Eriksson and P. Lindsay, editors, FME 2002: Formal Methods
— Getting IT Right: International Symposium of Formal Methods Europe, LNCS 2391, pages 1–20. Springer, 2002.
[Sie19] Stephen F. Siegel. What’s wrong with on-the-fly partial order reduction. In I. Dillig and S. Tasiran, editors,
Computer Aided Verification (CAV 2019), LNCS. Springer, 2019.
[SK07] Alexander Spiridonov and Sarfraz Khurshid. Automatic generation of counterexamples for ACL2 using Alloy. In
Seventh Int. Workshop on the ACL2 Theorem prover and its Applications, 2007.
[SN08] Konrad Slind and Michael Norrish. A brief overview of HOL4. In Otmane Aït Mohamed, César Muñoz, and Sofiène
Tahar, editors, Theorem Proving in Higher Order Logics, TPHOLs 2008, pages 28–32, 2008.
[Sum02] Rob Sumners. Checking ACL2 theorems via SAT checking. In Third Int. Workshop on the ACL2 Theorem prover
and its Applications, 2002.
[TJ07] Emina Torlak and Daniel Jackson. Kodkod: A relational model finder. In Orna Grumberg and Michael Huth,
editors, Tools and Algorithms for the Construction and Analysis of Systems, volume 4424 of LNCS, pages 632–647.
Springer, 2007.
[TS09] René Thiemann and Christian Sternagel. Certification of termination proofs using CeTA. In Stefan Berghofer,
Tobias Nipkow, Christian Urban, and Makarius Wenzel, editors, Theorem Proving in Higher Order Logics, 22nd
International Conference, TPHOLs 2009, volume 5674 of Lecture Notes in Computer Science, pages 452–468.
Springer, 2009.
[Web05] Tjark Weber. Bounded model generation for Isabelle/HOL. In Wolfgang Ahrendt, Peter Baumgartner, Hans
de Nivelle, Silvio Ranise, and Cesare Tinelli, editors, Selected Papers from the Workshops on Disproving and the
Second International Workshop on Pragmatics of Decision Procedures (PDPAR 2004), volume 125(3) of Electronic
Notes in Theoretical Computer Science, pages 103–116. Elsevier, July 2005.
[Web08] Tjark Weber. SAT-based finite model generation for higher-order logic. PhD thesis, Technical University Munich,
Germany, 2008.
[Wei01] Christoph Weidenbach. Combining superposition, sorts and splitting. In Alan Robinson and Andrei Voronkov,
editors, Handbook of Automated Reasoning, volume II, chapter 27, pages 1965–2013. Elsevier Science, 2001.
[Wen97] Markus Wenzel. Type classes and overloading in higher-order logic. In Theorem Proving in Higher Order Logics,
volume 1275 of LNCS, pages 307–322. Springer, 1997.
[Wen07] Makarius Wenzel. Isabelle/Isar — a generic framework for human-readable proof documents. Studies in Logic,
Grammar, and Rhetoric, 10(23):277–297, 2007. From Insight to Proof — Festschrift in Honour of Andrzej Trybulec.
[Wen11] M. Wenzel. Isabelle as document-oriented proof assistant. In J. H. Davenport et al., editors, Conference on
Intelligent Computer Mathematics (CICM 2011), volume 6824 of LNAI. Springer, 2011.
[Wen13a] Makarius Wenzel. READ-EVAL-PRINT in parallel and asynchronous proof-checking. In Cezary Kaliszyk and
Christoph Lüth, editors, User Interfaces for Theorem Provers (UITP 2012), volume 118 of Electronic Proceedings
in Theoretical Computer Science, 2013.
[Wen13b] Makarius Wenzel. Shared-memory multiprocessing for interactive theorem proving. In Sandrine Blazy, Christine
Paulin-Mohring, and David Pichardie, editors, Interactive Theorem Proving (ITP 2013), volume 7998 of Lecture
Notes in Computer Science. Springer, 2013.
[Wen14] Makarius Wenzel. Asynchronous user interaction and tool integration in Isabelle/PIDE. In Gerwin Klein and
Ruben Gamboa, editors, Interactive Theorem Proving (ITP 2014), volume 8558 of LNCS. Springer, 2014.
[Wen19] Makarius Wenzel. Interaction with formal mathematical documents in Isabelle/PIDE. In Cezary Kaliszyk, Edwin
Brady, Andrea Kohlhase, and Claudio Sacerdoti Coen, editors, Intelligent Computer Mathematics (CICM 2019),
volume 11617 of LNAI. Springer, 2019. https://arxiv.org/abs/1905.01735.
[Wie01] Freek Wiedijk. Mizar Light for HOL Light. In Richard J. Boulton and Paul B. Jackson, editors, Theorem Proving
in Higher Order Logics, TPHOLs 2001, pages 378–393. Springer, 2001.
[WL18] Simon Wimmer and Peter Lammich. Verified model checking of timed automata. In Dirk Beyer and Marieke
Huisman, editors, Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2018, volume
10805 of Lecture Notes in Computer Science, pages 61–78. Springer, 2018.
[Woo18] Charlie Wood. The strange numbers that birthed modern algebra. Online at https://www.quantamagazine.org/the-strange-numbers-that-birthed-modern-algebra-20180906/, September 2018.