Hindley: Basic Simple Type Theory
J. Roger Hindley
University of Wales, Swansea
CAMBRIDGE
UNIVERSITY PRESS
PUBLISHED BY THE PRESS SYNDICATE OF THE UNIVERSITY OF CAMBRIDGE
The Pitt Building, Trumpington Street, Cambridge CB2 1RP, United Kingdom
A catalogue record for this book is available from the British Library
Contents
Introduction
1 The type-free λ-calculus
7C The converse PT proof
7D Condensed detachment
This book is not about type theories in general but about one very neat and special
system called "TA" for "type-assignment". Its types contain type-variables and
arrows but nothing else, and its terms are built by λ-abstraction and application
from term-variables and nothing else. Its expressive power is close to that of the
system called simple type theory that originated with Alonzo Church.
TA is polymorphic in the sense that a term can have more than one type, indeed
an infinite number of types. On the other hand the system has no ∀-types and
hence it is weaker than the strong polymorphic theories in current use in logic and
programming. However, it lies at the core of nearly every one of them and its
properties are so distinctive and even enjoyable that I believe the system is worth
isolating and studying on its own. That is the aim of this book. In it I hope
to pass on to the reader the pleasure the system's properties have given me.
TA is also an excellent training ground for learning the techniques of type-theory
as a whole. Its methods and algorithms are not trivial but the main lines of most
of them become clear once the basic concepts have been understood. Many ideas
that are complicated and tedious to formulate for stronger type-theories, and many
complex techniques for analysing structures in these theories, appear in TA in a very
clean and neat stripped-down form. This book will take advantage of this neatness
to introduce some of the most important type-theoretic techniques with particular
emphasis on explaining why things happen the way they do.
Thus the reader who learns the basic techniques of type-theory from TA will
acquire a very good foundation for the study of other type-systems.
Type theories in general date back to the philosopher Bertrand Russell and
beyond.¹ They were used in the early 1900's for the very specific purpose of getting
round the paradoxes that had shaken the foundations of mathematics at that time,
but their use was later widened until they came to be part of the logicians' standard
bag of technical tools, especially in proof-theory. (Their use in combinatory logic
dates back to Curry 1934 and in λ-calculus to Church 1940.) However, they remained
a relatively specialist tool until around the 1970's.
¹ See for example Russell 1903 Appendix B and the comments in Gandy 1977 and Church 1976.
About that time the need for stronger programming languages brought type-
theories to the attention of computer scientists, and several of the new languages
developed in the 1970's and 80's were built on a type-theory base. These languages
have proved themselves in many applications and have now become well established
in the research community (and are even becoming known outside it!). The chief
example is ML, developed at Edinburgh University by the group led by Robin
Milner, but others include HOL (Cambridge University), Miranda (Regd trademark,
Research Software Ltd.) and Nuprl (Cornell University).
The system TA is, with slight modifications, a common part of all of these. Indeed,
in its early days it was studied mainly as a prelude to studies of stronger systems
and this is the way it was treated in Curry and Feys 1958.¹ But from the 1960's
onward it gradually became clear that TA was not as trivial as it had at first seemed
and was worth isolating and studying in its own right. Natural questions about
TA turned out to be much harder to answer than expected. Their answers are not
completed even today, but from them have come some very interesting techniques
that have had applications elsewhere, such as type-checking algorithms and filter
λ-models.
In fact more is now known about TA than can fit into a book of reasonable
length. The present book will therefore be very selective. Although covering all the
main basic properties of TA it will focus on the following three algorithms. The first
is well known but the other two are scarcely known at all and their consequences are
still not nearly fully understood, though they were first discovered well over fifteen
years ago.
(1) Type-checking or principal-type algorithm. This algorithm is the core of the
type-checking algorithm used in ML. It takes a λ-term and decides whether a type
can be assigned to it and, if so, outputs the most general such type (the principal
type of the term). The method behind the algorithm was sketched in Curry and
Feys 1958 and versions of the algorithm itself have appeared in Morris 1968, Curry
1969, Hindley 1969 and Milner 1978.
(2) Converse principal-type algorithm. This algorithm takes a closed λ-term M
and any type τ which can be assigned to M, and outputs a closed λ-term M′ such
that τ is the principal type of M′. Via the formulae-as-types correspondence with
propositional logic it leads to a completeness proof for a variant of the Resolution
rule called the rule of condensed detachment in a system of implicational logic. In
fact several converse principal-type algorithms are known, each producing an M'
with slightly different properties and giving completeness for a slightly different
logic. (Hindley 1969, Meyer and Bunder 1988, Mints and Tammet 1991.)
(3) Inhabitant-counting algorithm. A normal inhabitant of a type τ is a closed
λ-term M in β-normal form to which τ can be assigned. The counting-algorithm
takes a type τ and outputs the number of its normal inhabitants (0, 1, 2, ... or
infinity, modulo changes of bound variables), and then lists these one by one; in
particular it decides in a finite time whether this list will be infinite or not. It is
like the known algorithm for deciding whether a regular language is infinite but
with extra procedures to deal with bound variables. It is also like known algorithms
for deciding provability in Intuitionist propositional logic but with extra procedures
added to do the counting. It originated in Ben-Yelles 1979 though part of the proof
that it works is due to Hirokawa 1993c.
¹ Barendregt 1992 is a good survey of type-theories which shows the relations between TAλ and others;
see especially §3.1 where TAλ is called λ→-Curry. Other modern introductions to type theories in
general are Andrews 1986, Girard et al. 1989, Krivine 1990, Mitchell 1990 and 1996, Constable 1991,
Gallier 1993, Nerode and Odifreddi 199-, Scedrov 1990.
Each of the above algorithms will be presented in full with a proof of correctness
included.
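As a rough illustration of what algorithm (1) computes, the following Python sketch finds a principal type of a closed λ-term by first-order unification. It is a common textbook-style formulation and is not claimed to be the presentation used in the later chapters. The tuple encoding of terms and types, the generated type-variable names "t0", "t1", ... and the function names principal_type, unify and infer are all assumptions made for this example only.

```python
import itertools

# Encoding assumed for this sketch only:
#   terms: ("var", x), ("app", M, N), ("lam", x, M)
#   types: a type-variable is a str, an arrow type s->t is the pair (s, t)

def principal_type(term):
    """Return a principal type of a closed term, or None if it has no type."""
    counter = itertools.count()
    binding = {}  # type-variable -> type, built up by unification

    def fresh():
        return f"t{next(counter)}"

    def walk(ty):
        # Follow bindings until an unbound variable or an arrow type is reached.
        while isinstance(ty, str) and ty in binding:
            ty = binding[ty]
        return ty

    def occurs(v, ty):
        ty = walk(ty)
        if isinstance(ty, str):
            return ty == v
        return occurs(v, ty[0]) or occurs(v, ty[1])

    def unify(s, t):
        s, t = walk(s), walk(t)
        if isinstance(s, str):
            if s == t:
                return True
            if occurs(s, t):  # e.g. a = a->b has no solution
                return False
            binding[s] = t
            return True
        if isinstance(t, str):
            return unify(t, s)
        return unify(s[0], t[0]) and unify(s[1], t[1])

    def infer(t, env):
        if t[0] == "var":
            return env[t[1]]  # closed terms only: every variable is bound
        if t[0] == "lam":
            a = fresh()
            body = infer(t[2], {**env, t[1]: a})
            return None if body is None else (a, body)
        fun = infer(t[1], env)
        arg = None if fun is None else infer(t[2], env)
        if arg is None:
            return None
        result = fresh()
        return result if unify(fun, (arg, result)) else None

    def resolve(ty):
        ty = walk(ty)
        return ty if isinstance(ty, str) else (resolve(ty[0]), resolve(ty[1]))

    ty = infer(term, {})
    return None if ty is None else resolve(ty)

I = ("lam", "x", ("var", "x"))
K = ("lam", "x", ("lam", "y", ("var", "x")))
SELF_APP = ("lam", "x", ("app", ("var", "x"), ("var", "x")))

print(principal_type(I))         # ('t0', 't0'), i.e. t0->t0
print(principal_type(K))         # ('t0', ('t1', 't0')), i.e. t0->t1->t0
print(principal_type(SELF_APP))  # None
```

Every other type of the term is a substitution-instance of the type returned; the failure on λx.xx comes from the occurs check, which rejects an equation of the form a = a→b.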
Acknowledgements
I am very grateful to all who have helped in producing this book, especially the
following.
For invitations to give the lecture-courses from which this book has (very slowly)
developed: Institut d'Informatique, Université de Tizi-Ouzou, Algeria (1988), and
the U.K. Science and Engineering Research Council's "Logic for I.T." Initiative
(1990).
For a period of reduced teaching duties without which the book would not have
been possible: my colleagues in the Mathematics Department, University of Wales
Swansea.
For finance and accommodation which facilitated some of the meetings and
discussions involved: the European Community's "Esprit" Basic Research Actions
3230 and 7232, the Australian Research Grants Scheme, and the Mathematics
Department of the University of Wollongong.
For useful criticisms, discussions, and helpful suggestions: Yohji Akama, Choukri-
Bey Ben-Yelles, Martin Bunder, Naim Çağman, Felice Cardone, Mariangiola
Dezani-Ciancaglini, Roy Dyckhoff, Fritz Henglein, Sachio Hirokawa, Hans Leiss,
Mohamed Mezghiche, Şeref Mirasyedioğlu, Gordon Plotkin, Adrian Rezus, Jon
Seldin, Masako Takahashi, Anne Troelstra, Werner Wolff, Marek Zaionc and
anonymous referees. (Any errors remaining in the work are my own responsibility,
however.)
For perseverance: the staff of Cambridge University Press, especially David
Tranah.
And last but not least, for a healthy combination of support and cynicism, my
wife Carol.
Chapter 1 contains a very short summary of all the basic facts about λ-calculus
needed in this book, though the reader is assumed to have met λ-calculus before.
Further information can be found in standard text-books; for example (in English)
Barendregt 1984, Curry and Feys 1958, Hindley and Seldin 1986, Revesz 1988,
Hankin 1994, or (in French or English) Krivine 1990, or (in Japanese) Takahashi
1991.
Chapter 9 at the end of the book collects together some technical details that are
needed in the correctness-proofs of the main algorithms: readers who prefer to omit
these proofs can omit this chapter too.
Some exercises are provided in the text. Answers to starred ones are at the end of
the book.
Throughout this book "HS 86" refers to Hindley and Seldin 1986.
1A1.1 Notation Term-variables are denoted by "u", "v", "w", "x", "y", "z", with or
without number-subscripts. Distinct letters denote distinct variables unless otherwise
stated.
Arbitrary λ-terms are denoted by "L", "M", "N", "P", "Q", "R", "S", "T", with
or without number-subscripts. For "λ-term" we shall usually say just "term".
Syntactic identity: "M ≡ N" will mean that M is the same expression as N (if M
and N are terms or other expressions). But for identity of numbers, sets, etc. we
shall say "=" as usual.
Parentheses and repeated λ's will often be omitted in such a way that, for example,
λxyz.M ≡ (λx.(λy.(λz.M))), MNPQ ≡ (((MN)P)Q).
(The rule for restoring parentheses omitted from MNPQ is called association to the
left.)
1A2 Definition The length, |M|, of a λ-term M is the number of occurrences of
variables in M; in detail, define
|x| = 1, |MN| = |M| + |N|, |λx.M| = 1 + |M|.
1A2.1 Example |(λx.yx)(λz.x)| = 5.
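The three clauses of 1A2 translate directly into a recursive function. The sketch below is only an illustration: the encoding of terms as nested tuples and the name length are assumptions of this example, and the sample term is chosen here for the purpose.

```python
# Terms as nested tuples (an encoding assumed for this sketch only):
#   ("var", x), ("app", M, N), ("lam", x, M)

def length(term):
    """Return |M|: the number of variable occurrences, binding ones included."""
    tag = term[0]
    if tag == "var":
        return 1
    if tag == "app":
        return length(term[1]) + length(term[2])
    if tag == "lam":
        return 1 + length(term[2])
    raise ValueError(f"not a term: {term!r}")

# Sample term (λx.yx)(λz.x): 3 occurrences in the left part, 2 in the right.
example = ("app",
           ("lam", "x", ("app", ("var", "y"), ("var", "x"))),
           ("lam", "z", ("var", "x")))
print(length(example))  # 5
```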
1A6.1 Warning Two distinct concepts have been defined here, free/bound occur-
rences and free/bound variables. A variable x may be both free and bound in M,
for example if M ≡ x(λx.x), but a particular occurrence of x in M cannot be both
free and bound.
Also note that x is said to be bound in λx.y even though its only occurrence there
is a binding one.
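The distinction in this warning can be checked mechanically. In the sketch below the tuple encoding of terms and the names free_vars and bound_vars are assumptions of the example; bound_vars follows the convention just noted, under which x counts as bound in a term as soon as the term contains an occurrence of λx.

```python
# Terms as nested tuples (assumed encoding):
#   ("var", x), ("app", M, N), ("lam", x, M)

def free_vars(term):
    """Set of variables having at least one free occurrence in term."""
    if term[0] == "var":
        return {term[1]}
    if term[0] == "app":
        return free_vars(term[1]) | free_vars(term[2])
    return free_vars(term[2]) - {term[1]}

def bound_vars(term):
    """Set of variables x such that term contains an occurrence of λx."""
    if term[0] == "var":
        return set()
    if term[0] == "app":
        return bound_vars(term[1]) | bound_vars(term[2])
    return {term[1]} | bound_vars(term[2])

# x(λx.x): the variable x is both free and bound, via different occurrences.
m = ("app", ("var", "x"), ("lam", "x", ("var", "x")))
print(free_vars(m), bound_vars(m))    # {'x'} {'x'}

# λx.y: x is bound although its only occurrence is the binding one.
n = ("lam", "x", ("var", "y"))
print(free_vars(n), bound_vars(n))    # {'y'} {'x'}
```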
1A7.1 Notation (Simultaneous substitution) For any N1,...,Nn and any distinct
x1,...,xn, the result of simultaneously substituting N1 for x1, N2 for x2, ... in M, and
changing bound variables to avoid clashes, is defined similarly to [N/x]M. (For a
neat definition see Stoughton 1988 §2.) It is called
[N1/x1,...,Nn/xn]M.
P ≡α Q.
1A8.1 Note Some basic lemmas about α-conversion and substitution are given in
HS 86 §1B. Two simple properties that will be needed here are
(i) P ≡α Q ⟹ |P| = |Q|,
(ii) P ≡α Q ⟹ FV(P) = FV(Q).
1A9.1 Lemma Every term can be α-converted to a term without bound-variable clashes.
1A10.1 Example The following closed terms will be used in examples and results
throughout this book.
B ≡ λxyz.x(yz), B′ ≡ λxyz.y(xz),
C ≡ λxyz.xzy, I ≡ λx.x,
K ≡ λxy.x, S ≡ λxyz.xz(yz),
W ≡ λxy.xyy, Y ≡ λx.(λy.x(yy))(λy.x(yy)),
0̄ ≡ λxy.y, 1̄ ≡ λxy.xy,
n̄ ≡ λxy.xⁿy ≡ λxy.x(x(...(xy)...)) (n x's applied to y).
(Y is Curry's fixed-point combinator, see HS 86 Ch.3 §3B for background; the terms
n̄ are the Church numerals for n = 0, 1, 2,..., see HS 86 Def. 4.2.)
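For readers who like to experiment, the combinators B, C, I, K, S, W and the Church numerals can be mimicked by curried Python functions, using their standard definitions (repeated in the comments). This is an informal sketch only: Python evaluates arguments eagerly, so Y is left out because it would loop, and the helper names church and to_int are assumptions of this example.

```python
B = lambda x: lambda y: lambda z: x(y(z))      # B = λxyz.x(yz)
C = lambda x: lambda y: lambda z: x(z)(y)      # C = λxyz.xzy
I = lambda x: x                                # I = λx.x
K = lambda x: lambda y: x                      # K = λxy.x
S = lambda x: lambda y: lambda z: x(z)(y(z))   # S = λxyz.xz(yz)
W = lambda x: lambda y: x(y)(y)                # W = λxy.xyy

def church(n):
    """Church numeral for n: λxy.x(x(...(xy)...)) with n x's applied to y."""
    def numeral(x):
        def apply_n_times(y):
            for _ in range(n):
                y = x(y)
            return y
        return apply_n_times
    return numeral

def to_int(numeral):
    """Read a Church numeral back by applying 'add one' to 0."""
    return numeral(lambda k: k + 1)(0)

print(S(K)(K)(7))         # 7, since SKK behaves like I
print(to_int(church(3)))  # 3
```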
This section outlines the definition and main properties of the term-rewriting pro-
cedure called β-reduction. Further details can be found in many other books, for
example HS 86 Chs. 1-6 and Barendregt 1984 Chs. 3 and 11-14.
1B3 Definition The length of a β-reduction is the number of its β-contractions (finite
or ∞). A reduction with maximal length is one that continues as long as there are
redexes to contract (i.e. one that either is infinite or ends at a term containing no
redexes).
1B4.1 Exercise For every term F let X_F ≡ YF where Y is the fixed-point combinator
defined in 1A10.1; show that
F X_F =β X_F.
1B5 Church-Rosser Theorem for β (i) If M ▷β P and M ▷β Q (see Fig. 1B5a) then
there exists T such that
P ▷β T, Q ▷β T.
Proof of 1B5 (i) See HS 86 Appendix 1 or Barendregt 1984 §3.2. (ii) This is deduced
from (i) as suggested in Fig. 1B5b.
1B6 Definition (β-normal forms) A β-normal form is a term that contains no β-
redexes. The class of all β-nf's is called β-nf. We say a term M has β-nf N
if
M ▷β N and N ∈ β-nf.
1B7 NF-Uniqueness Lemma Modulo α-conversion, a term M has at most one β-nf.
such that R_i is the leftmost β-redex-occurrence in P_i for all i ≥ 1 (and P_1 α-converts
to P and P_{i+1} α-converts to Q_i for all i ≥ 1).
Proof See Curry and Feys 1958 §4E Cor. 1.1. (In fact this result is an immediate
corollary of a slightly deeper result called the standardization theorem; for the latter
see Curry and Feys 1958 §4E Thm. 1 or Barendregt 1984 Thm. 11.4.7, or the
particularly clear proof in Mitschke 1979 Thm. 7.)
1B9.2 Note (Seeking β-normal forms) The leftmost reduction of a term M is com-
pletely determined by M, so by 1B9 it gives an algorithm for seeking M*β: if M*β
exists the leftmost reduction of M will end at M*β, and if not, this reduction will
be infinite. Of course this algorithm does not decide in finite time whether M has a
β-nf; and in fact this cannot be done, as the set of terms with normal forms is not
recursive. (See e.g. HS 86 Cor 5.6.2 or Barendregt 1984 Thm. 6.6.5.)
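The search procedure of this note can be sketched in Python as follows. The tuple encoding of terms, the fresh-name scheme "v0", "v1", ... and the function names are assumptions made for this illustration. Since the search cannot be bounded in general, the sketch simply gives up after max_steps contractions, which is an artificial cut-off and not part of the method itself.

```python
import itertools

# Terms as nested tuples (assumed encoding):
#   ("var", x), ("app", M, N), ("lam", x, M)

def free_vars(t):
    if t[0] == "var":
        return {t[1]}
    if t[0] == "app":
        return free_vars(t[1]) | free_vars(t[2])
    return free_vars(t[2]) - {t[1]}

def fresh(avoid):
    """Return a variable name of the form v0, v1, ... not in `avoid`."""
    for i in itertools.count():
        if f"v{i}" not in avoid:
            return f"v{i}"

def subst(n, x, m):
    """[n/x]m, renaming bound variables where needed to avoid capture."""
    if m[0] == "var":
        return n if m[1] == x else m
    if m[0] == "app":
        return ("app", subst(n, x, m[1]), subst(n, x, m[2]))
    y, body = m[1], m[2]
    if y == x:
        return m                      # x is not free in λx.body
    if y in free_vars(n) and x in free_vars(body):
        z = fresh(free_vars(n) | free_vars(body) | {x})
        body, y = subst(("var", z), y, body), z
    return ("lam", y, subst(n, x, body))

def leftmost_step(t):
    """Contract the leftmost β-redex of t; return None if t is a β-nf."""
    if t[0] == "var":
        return None
    if t[0] == "lam":
        r = leftmost_step(t[2])
        return None if r is None else ("lam", t[1], r)
    fun, arg = t[1], t[2]
    if fun[0] == "lam":               # t itself is the leftmost redex
        return subst(arg, fun[1], fun[2])
    r = leftmost_step(fun)
    if r is not None:
        return ("app", r, arg)
    r = leftmost_step(arg)
    return None if r is None else ("app", fun, r)

def seek_beta_nf(t, max_steps=1000):
    """Follow the leftmost reduction; None means 'gave up', not 'no β-nf'."""
    for _ in range(max_steps):
        nxt = leftmost_step(t)
        if nxt is None:
            return t
        t = nxt
    return None

I = ("lam", "z", ("var", "z"))
K = ("lam", "x", ("lam", "y", ("var", "x")))
w = ("lam", "x", ("app", ("var", "x"), ("var", "x")))
OMEGA = ("app", w, w)

print(seek_beta_nf(("app", ("app", K, I), OMEGA)))  # ('lam', 'z', ('var', 'z'))
print(seek_beta_nf(OMEGA, max_steps=50))            # None
```

The first call succeeds even though the argument (λx.xx)(λx.xx) has no normal form, because the leftmost strategy discards it before it is ever reduced.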
1B10.1 Note The following special cases of 1B10 are worth mention:
m = n = 0: N ≡ y (an atom);
m = 0, n ≥ 1: N ≡ yN1...Nn (an application);
m ≥ 1: N ≡ λx1...xm.yN1...Nn (an abstract);
m ≥ 1, n = 0: N ≡ λx1...xm.y (called an abstracted atom).
1B10.2 Exercise Prove that β-nf is the smallest class of terms satisfying (i) and (ii)
below:
(i) all variables are in β-nf;
(ii) for all m, n ≥ 0 with m+n ≥ 1, and all x1,...,xm, y,
N1,...,Nn ∈ β-nf ⟹ λx1...xm.yN1...Nn ∈ β-nf.
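The shape λx1...xm.yN1...Nn gives a direct membership test for β-nf. The following sketch assumes the tuple encoding of terms shown in its comments, and the name is_beta_nf is introduced here for illustration: it strips the leading abstractions, walks down the application spine, and then requires a variable at the head and β-nf's as arguments.

```python
# Terms as nested tuples (assumed encoding):
#   ("var", x), ("app", M, N), ("lam", x, M)

def is_beta_nf(term):
    """True iff term has the form λx1...xm.yN1...Nn with every Ni a β-nf."""
    while term[0] == "lam":           # strip λx1 ... λxm
        term = term[2]
    args = []
    while term[0] == "app":           # collect N1, ..., Nn (in reverse order)
        args.append(term[2])
        term = term[1]
    # An abstraction at the head would be the function part of a β-redex.
    return term[0] == "var" and all(is_beta_nf(a) for a in args)

identity = ("lam", "x", ("var", "x"))
print(is_beta_nf(identity))                                # True
print(is_beta_nf(("app", ("var", "x"), identity)))         # True
print(is_beta_nf(("app", identity, ("var", "y"))))         # False
```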
1C η- and βη-reductions
This section sketches the most basic properties of η- and βη-reductions. For more
details see HS 86 Ch. 7 and Barendregt 1984 §15.1.
1C3 Definition The η-family {P}η of a term P is the set of all terms Q such that
P ▷η Q.
1C4 Church-Rosser Theorem for η If P =η Q then there exists T such that
P ▷η T, Q ▷η T.
1C5.2 Note A βη-reduction may have α-steps as well as β and η. The following
theorem says that all its η-steps can be postponed to the end of the reduction.
1C6 η-Postponement Theorem If M ▷βη N then there exists a term P such that
M ▷β P ▷η N.
1C7 Commuting Lemma If M ▷β P and M ▷η Q (see Fig. 1C7a) then there exists
a term T such that
P ▷η T, Q ▷β T.
1C7.1 Corollary If M ▷βη P and M ▷β Q then there exists a term T such that
P ▷β T, Q ▷βη T.
1C8 Church-Rosser Theorem for βη (i) If M ▷βη P and M ▷βη Q then there exists
T such that
P ▷βη T, Q ▷βη T.
[Fig. 1C7a: diagram not reproduced.]
P ▷βη T, Q ▷βη T.
Proof (i) From 1B5, 1C4, 1C6, 1C7. (ii) From (i) as in Fig. 1B5b.
1C9 Definition (βη- and η-normal forms) A βη-normal form (βη-nf) is a term
without βη-redexes. The class of all βη-nf's is called βη-nf. We say M has βη-nf N
if
M ▷βη N, N ∈ βη-nf.
Similarly we define η-normal form, η-nf, and M has η-nf N.
1C9.1 Notation The βη-nf and η-nf of a term M are unique modulo ≡α, by the
Church-Rosser theorems for βη and η; they will be called
M*βη, M*η.
1C9.2 Lemma (i) An η-reduction of a β-nf cannot create new β-redexes; more precisely
Proof (i) It is easy to check all possible cases. (ii) By 1C2, M*β has an η-nf (M*β)*η
and this is a βη-nf by (i).
1C9.3 Corollary If N is a β-nf then all the members of its η-family are β-nf's and
exactly one of them is a βη-nf, namely N*η.
Proof For "only if", see Curry et al. 1972 §11E Lemma 13.1 or Barendregt 1984
Cor. 15.1.5. For "if", see 1C9.2. (By the way, do not confuse the present lemma with
a claim that a term is in β-nf iff it is in βη-nf, which is of course false!)
1C9.5 Note (Seeking βη-normal forms) To seek for M*βη, reduce M by its leftmost
β-reduction. If this is finite, it must end at M*β and then the leftmost η-reduction
will reach an η-nf in < |M*β|/2 steps, by 1C2. If the leftmost β-reduction of M is
infinite, M*β does not exist and hence by 1C9.4 neither does M*βη. Of course this
procedure does not decide in finite time whether M*βη exists; see the comment in
1B9.2.
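The η-stage of this procedure always terminates, since contracting an η-redex λx.Mx (with x not free in M) removes two variable occurrences. A possible sketch of that stage follows. It works bottom-up for brevity and does not follow the leftmost order, which does not affect the result because η-normal forms are unique; the tuple encoding and the name eta_nf are assumptions of the example.

```python
# Terms as nested tuples (assumed encoding):
#   ("var", x), ("app", M, N), ("lam", x, M)

def free_vars(t):
    if t[0] == "var":
        return {t[1]}
    if t[0] == "app":
        return free_vars(t[1]) | free_vars(t[2])
    return free_vars(t[2]) - {t[1]}

def eta_nf(t):
    """Return the η-normal form of t, contracting η-redexes bottom-up."""
    if t[0] == "var":
        return t
    if t[0] == "app":
        return ("app", eta_nf(t[1]), eta_nf(t[2]))
    x, body = t[1], eta_nf(t[2])
    # λx.Mx is an η-redex exactly when x does not occur free in M.
    if body[0] == "app" and body[2] == ("var", x) and x not in free_vars(body[1]):
        return body[1]
    return ("lam", x, body)

# The Church numeral for 1, λxy.xy, η-reduces to λx.x.
one = ("lam", "x", ("lam", "y", ("app", ("var", "x"), ("var", "y"))))
print(eta_nf(one))   # ('lam', 'x', ('var', 'x'))

# λx.xx is not an η-redex, because x occurs free in the function part.
w = ("lam", "x", ("app", ("var", "x"), ("var", "x")))
print(eta_nf(w) == w)  # True
```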
1D Restricted λ-terms
The following restricted classes of λ-terms will play a role later in the correspondence
between type-assignment and propositional logic.
1D1 Definition (λI-terms) A λ-term P is called a λI-term iff, for each subterm with
form λx.M in P, x occurs free in M at least once.
1D1.1 Note The λI-terms are the terms that were originally studied by Church.
They have the property that if a λI-term has a normal form, so have all its
subterms (Church 1941 §7, Thm. 7 XXXII). Church restricted his system to λI-terms
because he regarded terms without normal forms as meaningless and preferred that
meaningful terms did not have meaningless subterms. The λI-terms are discussed in
detail in Barendregt 1984 Ch. 9.
The standard example of a non-λI-term is K ≡ λxy.x.
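Definition 1D1 can be checked by a short recursion over the structure of a term. In the sketch below the tuple encoding and the names free_vars and is_lambda_I are assumptions made for illustration.

```python
# Terms as nested tuples (assumed encoding):
#   ("var", x), ("app", M, N), ("lam", x, M)

def free_vars(t):
    if t[0] == "var":
        return {t[1]}
    if t[0] == "app":
        return free_vars(t[1]) | free_vars(t[2])
    return free_vars(t[2]) - {t[1]}

def is_lambda_I(t):
    """True iff in every subterm λx.M of t, x occurs free in M."""
    if t[0] == "var":
        return True
    if t[0] == "app":
        return is_lambda_I(t[1]) and is_lambda_I(t[2])
    return t[1] in free_vars(t[2]) and is_lambda_I(t[2])

I = ("lam", "x", ("var", "x"))
K = ("lam", "x", ("lam", "y", ("var", "x")))
S = ("lam", "x", ("lam", "y", ("lam", "z",
     ("app", ("app", ("var", "x"), ("var", "z")),
             ("app", ("var", "y"), ("var", "z"))))))

print(is_lambda_I(I), is_lambda_I(S), is_lambda_I(K))  # True True False
```

K fails the test because of its subterm λy.x, in which y does not occur free.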
1D1.2 Notation Sometimes unrestricted λ-terms are called λK-terms, and the unre-
stricted λ-calculus the λK-calculus, to contrast with λI-terms and to emphasise the
absence of restriction.
1D2.1 Examples Of the terms in the list in 1A10.1 the following are BCKλ-terms:
B ≡ λxyz.x(yz), B′ ≡ λxyz.y(xz), C ≡ λxyz.xzy,
I ≡ λx.x, K ≡ λxy.x, n̄ ≡ λxy.xⁿy (n = 0 or 1).
And the following are not:
S ≡ λxyz.xz(yz), W ≡ λxy.xyy,
n̄ ≡ λxy.xⁿy (n ≥ 2).
1D2.2 Lemma The class of all BCKλ-terms is closed under abstraction, i.e. if M is a
BCKλ-term then so is λx.M for every variable x.
1D2.3 Notes (i) In contrast to the above lemma the class of all λI-terms is only
closed under abstractions λx.M such that x occurs free in M.
(ii) The BCKλ-terms are so called because the closed terms in this class correspond
to combinations of three combinators called "B", "C" and "K" in combinatory logic
(see 9F for details). They have also sometimes been called linear λ-terms but this
name is nowadays usually applied to the following class.
1D4 Lemma Each of the three classes (λI-terms, BCKλ-terms and BCIλ-terms) is
closed under βη-reduction, i.e. every term obtained by βη-reducing a member of the
class is also in the class.
Proof Straightforward.
The topic of this book is one of the simplest current type-theories. It was called TA
in the Introduction but in fact it comes in two forms, TA_C for combinatory logic
and TAλ for λ-calculus. Since most readers probably know λ-calculus better than
combinatory logic, only TAλ will be described here. (The reader who wishes to see
an outline of TA_C can find one in HS 86 Ch.14; most of its properties are parallel
to those of TAλ.)
The present chapter consists of a definition and description of TAλ. It is close to
the treatment in HS 86 Ch. 15 but differs in some technical details.
2A1.1 Notation Type-variables are denoted by "a", "b", "c", "d", "e", "f", "g", with
or without number-subscripts, and distinct letters denote distinct variables unless
otherwise stated.
Arbitrary types are denoted by lower-case Greek letters except "λ".
Parentheses will often (but not always) be omitted from types, and the reader
should restore omitted ones in such a way that, for example,
ρ→σ→τ ≡ (ρ→(σ→τ)).
This restoration rule is called association to the right.
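The convention of association to the right is easy to make concrete. In the sketch below a type-variable is represented by a Python string and an arrow type σ→τ by the pair (σ, τ); this representation and the name show are assumptions of the example. Parentheses are printed only around an arrow type that stands to the left of an arrow.

```python
# Types (assumed encoding): a type-variable is a str,
# an arrow type s->t is the pair (s, t).

def show(ty, left_of_arrow=False):
    """Print a type, omitting parentheses by association to the right."""
    if isinstance(ty, str):
        return ty
    text = f"{show(ty[0], left_of_arrow=True)}->{show(ty[1])}"
    return f"({text})" if left_of_arrow else text

print(show(("a", ("b", "c"))))   # a->b->c
print(show((("a", "b"), "c")))   # (a->b)->c
```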
The structure of an arbitrary type is analysed in detail in 9D-E. The lemmas there
will be used in some later chapters but not in this one.
2A3 Discussion (The Church and Curry approaches) In current use there are two
main ways of introducing types into λ-calculus, one attributable to Alonzo Church
and the other to Haskell Curry.
The former goes back to a type-system introduced in Church 1940. In it, the
definition of "λ-term" is restricted by giving each term a unique type as part of
its structure and saying that an application PQ is only defined when P has a
function-type σ→τ and Q the appropriate argument-type σ.¹
The effect of Church's restriction can be seen on λx.xx, which is a well-formed
term in type-free λ-calculus but represents the abstract concept of self-application, a
concept whose meaningfulness may well be questioned. Self-application was involved
in most of the paradoxes that were discovered in mathematics in the early 1900's,
and Bertrand Russell devised the first of all type-theories specifically as a language
in which these paradoxes could not be expressed. In Church's typed λ-calculus each
variable has a unique type, so if x has a function-type σ→τ it cannot also have type
σ, and so the application xx cannot be defined as a typed term. Hence also λx.xx
cannot be a typed term.
Curry took a different approach. He pointed out that if we wish to ask questions
about the meaningfulness of λx.xx then we need a language in which these questions
can be expressed. And Church's type-theory by itself is not adequate for this, because
we have just seen that λx.xx is excluded from it. Curry proposed a language which
would include all the type-free λ-terms, and a type-theory which would contain rules
assigning types to some of these terms but not to others. The term λx.xx would not
be given a type by these rules, but would still remain in the system and hence be
discussable. (Curry and Feys 1958 §0B, p.5.)
Along with this change Curry proposed another, which is best understood by
looking at the identity-combinator as an example.
In Church's type-theory there is no term λx.x. Instead, for each type σ there is
a variable x^σ with type σ and a term λx^σ.x^σ with type σ→σ. Informally, this term
denotes the identity function on whatever set S may be denoted by σ. Call this
function I_S; the only objects it accepts as inputs are members of S, and I_S(x) = x for
all x ∈ S. Thus Church's theory has an infinite number of identity functions, one for
each set S. This agrees with the view of functions taken by most mathematicians:
each function is seen as a set of ordered pairs with a domain and range built into
its definition, and the identity functions I_S and I_T on two distinct sets S and T are
viewed as different functions.
¹ For a definition of typed λ-term and a few examples see HS 86 §13A; for another version, with
motivation and more details, see Barendregt 1992 §3.2.
But this view is not entirely satisfying; an alternative and perhaps more natural
view is to see all the separate identity-functions I_S, I_T, etc. as special cases of one
intuitive concept, the operation of doing nothing. If we admit that such a concept
exists, even though only in an imprecise sense, then a type-theory that tries to make
it precise by splitting it into an infinite number of different special cases at the
beginning will seem at the very best inefficient.
Curry's aim was a type-theory in which the identity-concept would be expressed
by just one term λx.x, to which an infinite number of types would be assigned
by suitable formal rules. Types would contain variables, and if a term M received
a type τ it would also receive all substitution-instances of τ. This kind of theory
will be called here a type-assignment theory or a Curry-style type-theory. (It is the
ancestor of polymorphic type-theories.) In contrast, a theory in which each term has
a unique built-in type will be called a typed-term theory or a Church-style theory.¹
TAλ will be a type-assignment theory.
M:τ
where M is a λ-term and τ is a type; we call M its subject and τ its predicate.
("M:τ" should be read informally as "assign to M the type τ" or "M has type τ"
or "M denotes a member of whatever set τ denotes".)
Subjects(Γ) = {x1,...,xm}.
¹ Curry's and Church's lines of thought were not really as distinct as the above seems to imply. In
particular Church did not ignore the possibility that a single identity-concept might be formalizable
instead of a multitude of particular identity-functions. Indeed his first systems of λ-calculus in the
1930's were part of an attempt to formalize exactly this single-identity view of functions in a type-free
theory, and one of the best available expositions of this view is in the introduction to his book Church
1941. Only after his attempt to do this in an extremely general setting proved inconsistent did Church
turn to type-theory and a more restricted approach to functions. Also Curry's type-theories began
their development in some of his earliest work and were not simply a response to Church's; see Curry
1934.
2A5.1 Notation The result of removing from Γ the assignment whose subject is x (if
Γ has one) is called
Γ − x.
Γ↾M
Subjects(Γ) = FV(M).
2A5.2 Note A type-context Γ is a set, not a sequence. Hence it does not change
when its members are permuted or repeated. To implement TAλ as a practical
system we would have to represent Γ by an expression in some language and
include rewrite-rules to permute Γ's members and make and remove repetitions.
Such rules would obscure the main themes of this book so they have been avoided
here by simply assuming that contexts are sets.¹
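In an informal implementation the rewrite-rules mentioned above can often be sidestepped by choosing a data structure that already ignores order and repetition. The sketch below assumes one such choice, a Python dict from term-variables to types, which also guarantees that no variable is the subject of two assignments. The helper names is_consistent_with, union and remove are hypothetical, introduced only for this example.

```python
# A context is modelled as a dict {term-variable: type} (an assumption of
# this sketch). Types: a type-variable is a str, an arrow s->t is (s, t).

def is_consistent_with(context, var, ty):
    """True iff adding var:ty would not give var two different types."""
    return var not in context or context[var] == ty

def union(ctx1, ctx2):
    """Return the union of two contexts, or None if it is inconsistent."""
    for var, ty in ctx2.items():
        if not is_consistent_with(ctx1, var, ty):
            return None
    return {**ctx1, **ctx2}

def remove(context, var):
    """Drop the assignment whose subject is var, if there is one."""
    return {v: t for v, t in context.items() if v != var}

g1 = {"x": ("a", "b"), "y": "a"}
g2 = {"y": "a", "x": ("a", "b")}
print(g1 == g2)                      # True: order of members is irrelevant
print(union({"x": "a"}, {"x": "b"})) # None: x would receive two types
print(remove(g1, "x"))               # {'y': 'a'}
```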
2A7 Definition (TAλ-formulae) For any Γ, M and τ the triple (Γ, M, τ) is called a
TAλ-formula and is written as
Γ ↦ M:τ
(or just ↦ M:τ when Γ is empty). We shall call M the subject of this formula and τ
its predicate (despite the fact that in general it contains other subjects and predicates
too, namely those of the assignments in Γ).
2A8 Definition (The system TAλ) TAλ has an infinite set of axioms and two
deduction-rules (called (→E) or →-elimination and (→I) or →-introduction), as
follows.
Axioms of TAλ: for every term-variable x and every type τ, TAλ has an axiom
x:τ ↦ x:τ.
Deduction-rules of TAλ:
        Γ1 ↦ P:(σ→τ)    Γ2 ↦ Q:σ
(→E)    --------------------------    [if Γ1 ∪ Γ2 is consistent]
        Γ1 ∪ Γ2 ↦ (PQ):τ,

        Γ ↦ P:τ
(→I)    --------------------------    [if Γ is consistent with x:σ]
        Γ − x ↦ (λx.P):(σ→τ).
Γ ↦ M:τ
¹ Type-contexts are also called environments in the literature. They play a different role from the sets
called bases in HS 86 Chs.14-15: there a basis was a set of axioms for a theory, whereas here a context
will be used as a set of assumptions for a particular deduction in a theory.
2A8.1 Note (Rule (→I)) The condition in (→I) that Γ be consistent with x:σ means
that either Γ contains x:σ or Γ contains no assignment at all whose subject is x. In
the first case the rule is said to discharge or cancel x from Γ. In the second case it is
said to discharge x vacuously.
In these two cases the rule takes two slightly different forms which may be
displayed as follows (using "Γ1" below to correspond to "Γ − x" above).
            Γ1, x:σ ↦ P:τ
(→I)main    ----------------------    [if x ∉ Subjects(Γ1)]
            Γ1 ↦ (λx.P):(σ→τ),

            Γ1 ↦ P:τ
(→I)vac     ----------------------    [if x ∉ Subjects(Γ1)]
            Γ1 ↦ (λx.P):(σ→τ).
Γ − z ↦ (λz.x(yz)):c→b
Γ − z − y ↦ (λyz.x(yz)):(c→a)→c→b
↦ (λxyz.x(yz)):(a→b)→(c→a)→c→b
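To see the axioms and the rules (→E) and (→I) at work, the following Python sketch represents a formula as a triple (context, term, type) and rebuilds a deduction of a type for B ≡ λxyz.x(yz). Contexts as dicts, terms and types as tuples, and the function names axiom, arrow_elim and arrow_intro are all assumptions made for this illustration.

```python
# Assumed encodings, for this sketch only:
#   terms:    ("var", x), ("app", M, N), ("lam", x, M)
#   types:    a type-variable is a str, an arrow type s->t is the pair (s, t)
#   formulae: (context, term, type), with context a dict {variable: type}

def axiom(x, ty):
    """The axiom x:ty |-> x:ty."""
    return ({x: ty}, ("var", x), ty)

def arrow_elim(left, right):
    """Rule (->E): from G1 |-> P:(s->t) and G2 |-> Q:s infer G1 u G2 |-> (PQ):t."""
    (g1, p, p_ty), (g2, q, q_ty) = left, right
    if not (isinstance(p_ty, tuple) and p_ty[0] == q_ty):
        raise ValueError("types do not fit rule (->E)")
    if any(v in g2 and g2[v] != t for v, t in g1.items()):
        raise ValueError("union of contexts is inconsistent")
    return ({**g1, **g2}, ("app", p, q), p_ty[1])

def arrow_intro(premise, x, sigma):
    """Rule (->I): from G |-> P:t infer G - x |-> (λx.P):(sigma->t)."""
    g, p, tau = premise
    if x in g and g[x] != sigma:
        raise ValueError("context is not consistent with x:sigma")
    rest = {v: t for v, t in g.items() if v != x}
    return (rest, ("lam", x, p), (sigma, tau))

# Deduction for B, starting from x:a->b, y:c->a, z:c.
yz = arrow_elim(axiom("y", ("c", "a")), axiom("z", "c"))
x_yz = arrow_elim(axiom("x", ("a", "b")), yz)
step_z = arrow_intro(x_yz, "z", "c")              # discharges z
step_y = arrow_intro(step_z, "y", ("c", "a"))     # discharges y
conclusion = arrow_intro(step_y, "x", ("a", "b")) # discharges x

print(conclusion[0])  # {}: no undischarged assumptions remain
print(conclusion[2])  # (('a', 'b'), (('c', 'a'), ('c', 'b')))
```

The final type is (a→b)→(c→a)→c→b. An attempt to apply arrow_elim to axiom("x", ("a", "b")) and axiom("x", "a") raises an error, because the union of the two contexts would give x two types.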
2A8.6 Remark (Self-application) The above example gave a type to a term involving
self-application, namely II. This was done by giving a different type to each of the
two occurrences of I, and to do this we had to give two different types to the one
variable x; but there was no inconsistency problem when (→E) was applied because
the two applications of (→I) above (→E) removed x from the contexts on the left
of "↦". Similarly it is possible to give types to several other self-applications in
TAλ, for example KK and BB.
This may seem surprising, in view of the claim in 2A3 that the original aim
of a type-theory was to avoid self-application. But in fact the "dangerous" self-
application to be avoided is not any one simple particular case like II, but the overall
general concept of self-application as represented by the term λx.xx. And λx.xx
does not receive a type in TAλ.
To see this, suppose there were a TAλ-deduction of
↦ (λx.xx):τ
for some τ. Then its last step would have to be an application of (→I) to a deduction
of
2A8.9 Note (Comparison with HS 86) The format of TAλ is what is known as the
"Natural Deduction" style and was originated by Gerhard Gentzen in his thesis
Gentzen 1935. The system called "TAλ" in HS 86 §15B is another variant of the
same style; its main differences from the above system TAλ are as follows.
(i) In HS 86 the discharging of assumptions by rule (→I) was shown by enclosing
the assumption in brackets at the top of the deduction-tree. But here the set of
undischarged assumptions at each stage of the deduction is displayed on the left
of the "↦" symbol and when rule (→I) is used this set is simply reduced. This
notation is perhaps more explicit than that in HS 86 and is in common use in recent
literature. In both notations deductions have the same tree-structure.
(ii) The version in HS 86 included an α-rule that is not in the present version. This
was to ensure that the set of provable formulae would be closed under α-conversion
even when the basis of axioms was not. But there are no axioms here in the sense of
HS 86 so α-closure will turn out to be provable without adding an α-rule; see 2B6.
In the special case Γ = ∅ we shall say M has type τ in TAλ, or τ is a type of M
in TAλ, or
⊢λ M:τ.
There is at least one interesting type-theory in which this consistency condition is relaxed, the theory
of intersection-types that originated in Coppo and Dezani 1978 and Sallé 1978. In this theory xx
receives a type and types play a significantly more complex role than in TAλ, see for example the
comment in Hindley 1992 §1.1.
2A11 Lemma (i) Γ ⊢λ M:τ iff Subjects(Γ) ⊇ FV(M) and there exists a TAλ-
deduction of the formula Γ↾M ↦ M:τ.
(ii) (∃Γ)(Γ ⊢λ M:τ) ⟺ (∃Γ){Γ is an M-context and Γ ⊢λ M:τ}.
(iii) For closed terms M,
(∃Γ)(Γ ⊢λ M:τ) ⟺ ⊢λ M:τ.
Deductions in TAλ have one very important property that is not shared by de-
ductions in many more complex type-theories; the tree-structure of a deduction of
Γ ↦ M:τ follows the tree-structure of M exactly. To make this correspondence
precise we really need the detailed definition of construction-tree of a term given in
9A4 and that of a deduction given in 9C1; but the following example gives a very
good idea of what it means.
[Fig. 2B1a: the construction-tree of λxyz.x(yz), built up from the atoms x, y, z through
yz, x(yz), λz.x(yz) and λyz.x(yz) to λxyz.x(yz).]
2B2 Subject-construction Theorem (Seldin 1968 §3D Thm. 1, Curry et al. 1972 §14D
Thm. 1.) Let Δ be a TAλ-deduction of a formula Γ ↦ M:τ.
(i) If we remove from each formula in Δ everything except its subject, Δ changes to
a tree of terms which is exactly the construction-tree for M.
(ii) If M is an atom, say M ≡ x, then Γ = {x:τ} and Δ contains only one formula,
namely the axiom
x:τ ↦ x:τ.
Γ ↦ P:σ.
Proof Induction on |M|. Parts (i)-(iii) follow immediately from the full definition
of deduction in 9C1. For (iv): if M ≡ λx.P then by 2A8.1 the last step in Δ must
have one of the forms
Γ, x:ρ ↦ P:σ              Γ ↦ P:σ
----------------    or    ----------------
Γ ↦ λx.P:ρ→σ              Γ ↦ λx.P:ρ→σ
and by 2A10 (→I)main is used when x ∈ FV(P) and (→I)vac is used otherwise. Hence
result.
and consider the deduction in Fig. 2B2.1a; the type σ in that figure can be arbitrary.
[Fig. 2B2.1a: deduction not reproduced.]
However, if M is a normal form or a λI-term this freedom will disappear and
Δ will be completely determined by M, as we shall see in the next lemma and the
exercise below it.
2B3 Lemma (Uniqueness of deductions for nf's) (Ben-Yelles 1979 Cor. 3.2.) Let
M be a β-nf and Δ a TAλ-deduction of Γ ↦ M:τ. Then
(i) every type in Δ has an occurrence in τ or in a type in Γ,
(ii) Δ is unique, i.e. if Δ′ is also a deduction of Γ ↦ M:τ then Δ′ ≡ Δ.
Proof Use induction on |M|. The cases M ≡ y and M ≡ λx.P are easy. Since M is
a β-nf, by 1B10 the only other possible case is
M ≡ yP1...Pn.
In this case any deduction Δ of Γ ↦ M:τ must contain an axiom
y:(ρ1→...→ρn→τ) ↦ y:(ρ1→...→ρn→τ),
as well as n deductions Δ1,...,Δn giving
Γ1 ↦ P1:ρ1, ..., Γn ↦ Pn:ρn
followed by n applications of (→E) to deduce
{y:(ρ1→...→ρn→τ)} ∪ Γ1 ∪ ... ∪ Γn ↦ (yP1...Pn):τ.
And Γ must be
{y:(ρ1→...→ρn→τ)} ∪ Γ1 ∪ ... ∪ Γn.
To prove (i): by part (i) of the induction hypothesis every type in a Δi occurs in
ρi or Γi, and hence in Γ; also the type of y occurs in Γ. Hence (i) holds.
To prove (ii): the argument above shows that Δ′ must use the same rules at the
same positions as in Δ. And the type assigned to y in Δ′ is determined by Γ and
the assumption that type-contexts are consistent; then the types of P1,...,Pn are
determined by the type of y.
2B3.1 Note (Subformula property) Part (i) of 2B3 corresponds to what is usually
called in logic the subformula property; this says that in a Natural Deduction system
every formula in an irreducible deduction occurs in either the conclusion or an
undischarged assumption. (The correspondence between types and propositional
logic will be fully described in Chapter 6.)
In contrast the TA$_\lambda$-deduction in Fig. 2B2.1a contains a type $\sigma$ that does not
occur in an undischarged assumption or the conclusion.
The following three lemmas will be needed in the next section. The first is a
special case of the third but is stated separately because it is needed in the proof of
the third.
2B4 First Substitution Lemma for Deductions Let $\Gamma \vdash_\lambda M:\tau$ and let $[y/x]\Gamma$ be
the result of substituting $y$ for a term-variable $x$ in $\Gamma$. If either of the following holds:
(i) $y \notin \mathrm{Subjects}(\Gamma)$,
(ii) $y$ and $x$ receive the same type in $\Gamma$,
then
$[y/x]\Gamma \vdash_\lambda ([y/x]M):\tau$.
Proof First, in both cases (i) and (ii) $[y/x]\Gamma$ satisfies the consistency condition for
contexts. Next, by 2A11 there is a deduction of
$\Gamma^- \mapsto M:\tau$
for some $\Gamma^- \subseteq \Gamma$ with $\mathrm{Subjects}(\Gamma^-) = FV(M)$. Then $[y/x]\Gamma^-$ is consistent. An
induction on $|M|$ then shows that
$[y/x]\Gamma^- \vdash_\lambda ([y/x]M):\tau$.
Then the weakening lemma (2A9.1) gives the result.
Proof [Depends on Section 9C] It is enough to prove the result for one change of
bound variable, say the replacement of a component $\lambda x.M$ of $P$ by $\lambda y.[y/x]M$ with
$y \notin FV(M)$. If $P \equiv \lambda x.M$ the result follows using 2B4. If $\lambda x.M$ is a proper part of
$P$, use 9C5 (a replacement lemma).
2B6 Second Substitution Lemma for Deductions Let $\Gamma_1$ be consistent with $\Gamma_2$ and let
$\Gamma_1, x:\sigma \vdash_\lambda M:\tau, \qquad \Gamma_2 \vdash_\lambda N:\sigma$.
Then
$\Gamma_1 \cup \Gamma_2 \vdash_\lambda [N/x]M:\tau$.
Proof Assume $x \in FV(M)$. (If not, the result holds trivially.) By 2B5 we can assume
no variable bound in $M$ is free in $xN$. In this case $[N/x]M$ is simply the result of
replacing each free $x$ in $M$ by $N$ with no accompanying changes of bound variables.
And by 2A11 we can assume that
$\mathrm{Subjects}(\Gamma_1) \cup \{x\} = FV(M)$,
$\mathrm{Subjects}(\Gamma_2) = FV(N)$.
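2C1 Subject-reduction Theorem (Morris 1968 §4D Thm. 1, Seldin 1968 §3D Thm. 2.)
If $\Gamma \vdash_\lambda P:\tau$ and $P \triangleright_{\beta\eta} Q$ then
$\Gamma \vdash_\lambda Q:\tau$.
where $\Gamma_1 \cup \Gamma_2 = \Gamma^-$. Then 2B6 applied to the deductions for $M$ and $N$ gives
$\Gamma_1 \cup \Gamma_2 \vdash_\lambda [N/x]M:\tau$.
If $x \notin FV(M)$ the proof is similar.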
Proof Exercise. This theorem is a special case of Curry et al. 1972 p.315, §14D
Thm. 3 (= Seldin 1968 §3D Thm. 3).
2C Subject reduction and expansion 25
Proof For use 1D6 and 2C2; for "=" use 2C1.
The subject-expansion theorem can be extended to some cancelling contractions
under suitable restrictions. (For example see Curry et al. 1972 §14D Thm. 3 or
Hindley 1989 Thm. 3.3.) But it cannot be extended to arbitrary contractions, as the
following examples show.
2C2.2 Example $P \triangleright_{1\beta} Q$ by a cancelling contraction and $Q$ has a type but $P$ has no
type:
$P \equiv (\lambda uv.v)(\lambda x.xx), \qquad Q \equiv \lambda v.v$.
We have $\vdash_\lambda Q: a\to a$ by 2A8.3. But no TA$_\lambda$-deduction has a conclusion with
form $\mapsto P:\tau$, because such a deduction would have to contain a deduction
of $\mapsto \lambda x.xx:\sigma$ for some $\sigma$ and this is impossible by 2A8.6.
2C2.3 Example $P \triangleright_{1\beta} Q$ by a duplicating contraction and $Q$ has a type but $P$ has
none:
$P \equiv (\lambda x.xx)I, \qquad Q \equiv II$.
We have $\vdash_\lambda Q: a\to a$ by 2A8.5. But $P$ has no type because $\lambda x.xx$ has none (by 2A8.6).
2C2.4 Example $P \triangleright_{1\beta} Q$ by a cancellation, $P$ and $Q$ both have types, but $Q$ has more
types than $P$:
$P \equiv \lambda xyz.(\lambda u.y)(xz), \qquad Q \equiv \lambda xyz.y$.
It is easy to prove that
$\vdash_\lambda P: (c\to d)\to b\to c\to b, \qquad \vdash_\lambda Q: a\to b\to c\to b$;
and an application of the principal-type algorithm (3E1) will show that the types
possessed by $P$ are exactly the substitution-instances of the one shown above, and
similarly for $Q$. Hence $P$ cannot have the type displayed for $Q$. (Roughly speaking,
the underlying reason is that $x$ has a function position in $P$ and must therefore be
assumed to have a function-type $c\to d$; since $x$ does not occur at all in $Q$ the type
of $Q$ has no such limitation.)
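2C2.5 Example $P \triangleright_{1\beta} Q$ by a duplication, $P$ and $Q$ both have types, but $Q$ has more
types than $P$:
$P \equiv (\lambda vxyz.v(y(vxz)))I, \qquad Q \equiv \lambda xyz.I(y(Ixz))$.
By 2A8.8 we have
$\vdash_\lambda P: (a\to b)\to(b\to a\to b)\to a\to a\to b$,
$\vdash_\lambda Q: (a\to b)\to(b\to c)\to a\to c$;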
and an application of the principal-type algorithm (3E1) will show that P cannot
have the type displayed for Q. (The underlying reason is that the two v's in P must
receive the same type whereas the two I's in Q are not so limited.)
2C2.6 Example $P$ $\eta$-contracts to $Q$, $P$ and $Q$ both have types, but $Q$ has more types
than $P$:
$P \equiv \lambda xy.xy, \qquad Q \equiv \lambda y.y$.
It is easy to see that
$\vdash_\lambda P: (a\to b)\to a\to b, \qquad \vdash_\lambda Q: a\to a$,
We shall see in Chapter 3 that if Types(M) is not empty its members are exactly
the substitution-instances of one type, the principal type of M; hence Types(M) is
either empty or infinite.
However, Chapter 4 will describe the effect of adding a new rule to TA$_\lambda$ to
overcome this defect, and it will give theoretical evidence to suggest that perhaps
the price is not so high after all.
In practice too the conversion-sensitivity of Types(M) has turned out to be a
very small problem. Indeed, if one views an assignment $M:\sigma\to\tau$ as saying that the
application of $M$ to every term with type $\sigma$ is "safe" in some sense, then the most
important practical property of a type-system is the subject-reduction theorem, which
says that if $M$ has type $\sigma\to\tau$ it will not lose this safety-feature during a reduction.
If Types(M) happens to increase as M is reduced this is not a drawback but simply
means that M is becoming safer. In particular, practical programming languages
like ML and its relatives operate very successfully without conversion-invariance.
The system TA$_\lambda$ divides the $\lambda$-terms in a natural way into two complementary
classes: those which can receive types, such as $\lambda xyz.x(yz)$, and those which cannot,
such as $\lambda x.xx$. The former may be regarded as "safe" in the sense that if a term has
a type we know there is a way of assigning types to all its components that avoids
mis-matches of types. The following is a precise definition of this class.
2D1 Definition A term $M$ is called (TA$_\lambda$-) typable or stratified if there exist $\Gamma$ and
$\tau$ such that
$\Gamma \vdash_\lambda M:\tau$.
2D2 Lemma The class of all TA$_\lambda$-typable terms is closed under the following operations:
(i) taking subterms (i.e. all subterms of a typable term are typable);
(ii) $\beta\eta$-reduction;
(iii) non-cancelling and non-duplicating $\beta$-expansion;
(iv) $\lambda$-abstraction (i.e. if $M$ is typable so is $\lambda x.M$).
Proof (i) by 2B2. (ii) by 2C1. (iii) by 2C2. (iv) by rule $(\to I)$.
2D3 Theorem The class of all TA$_\lambda$-typable terms is decidable; that is, there is an
algorithm which decides whether a given term is typable in TA$_\lambda$.
2D5 Weak Normalization (WN) Theorem (Turing 1942, Curry and Feys 1958, etc.)
Every TA$_\lambda$-typable term has both a $\beta$-nf and a $\beta\eta$-nf.
Proof See 5C1 and 5C1.1 for a proof (from Turing 1942), and 5C1.2 for historical
notes.
2D6 Strong Normalization (SN) Theorem (Sanchis 1967, Diller 1968, etc.) If $M$ is
a TA$_\lambda$-typable term, every $\beta\eta$-reduction that starts at $M$ is finite.
Proof There are many proofs in the literature besides those of Sanchis and Diller;
for example HS 86 Appendix 2 contains an accessible one for $\beta$ in Thm. A2.3 and
one for $\beta\eta$ in Thm. A2.4. For references to some others see 5C2.2.
2D6.1 Note Since SN implies WN there is no real need for a separate treatment of
WN. But the Turing proof of WN in 5C1 is both simpler and older than any proof
of SN. Further, most applications of normalization turn out to be of WN rather
than SN. The following are a couple of such applications.
Proof Reduce $P$ and $Q$ to their $\beta$-nf's (which exist by WN, and can be found using
leftmost reductions, by 1B9) and see whether they differ.
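The procedure in this proof can be sketched in Python. This is only an illustration under stated assumptions, not part of the original text: terms are written with de Bruijn indices (an integer is a variable, `("lam", body)` an abstraction, `("app", f, a)` an application), so that $\alpha$-convertible terms are syntactically identical; the names `shift`, `subst`, `leftmost_step`, `normalize` and `beta_equal` are hypothetical; and a step limit is included because WN gives no guarantee for untypable inputs.

```python
# Terms in de Bruijn notation (an assumption of this sketch):
#   n (int)          variable with index n
#   ("lam", body)    abstraction
#   ("app", f, a)    application

def shift(t, d, cutoff=0):
    """Add d to every variable index >= cutoff (the free variables of t)."""
    if isinstance(t, int):
        return t + d if t >= cutoff else t
    if t[0] == "lam":
        return ("lam", shift(t[1], d, cutoff + 1))
    return ("app", shift(t[1], d, cutoff), shift(t[2], d, cutoff))

def subst(t, j, s):
    """Replace variable j in t by the term s."""
    if isinstance(t, int):
        return s if t == j else t
    if t[0] == "lam":
        return ("lam", subst(t[1], j + 1, shift(s, 1)))
    return ("app", subst(t[1], j, s), subst(t[2], j, s))

def leftmost_step(t):
    """Contract the leftmost beta-redex of t; return None if t is a beta-nf."""
    if isinstance(t, int):
        return None
    if t[0] == "lam":
        r = leftmost_step(t[1])
        return None if r is None else ("lam", r)
    f, a = t[1], t[2]
    if not isinstance(f, int) and f[0] == "lam":
        return shift(subst(f[1], 0, shift(a, 1)), -1)
    r = leftmost_step(f)
    if r is not None:
        return ("app", r, a)
    r = leftmost_step(a)
    return None if r is None else ("app", f, r)

def normalize(t, max_steps=10_000):
    """Leftmost reduction to beta-nf; the step limit guards untypable inputs."""
    for _ in range(max_steps):
        nxt = leftmost_step(t)
        if nxt is None:
            return t
        t = nxt
    raise RuntimeError("no beta-nf found within the step limit")

def beta_equal(p, q):
    """Decide beta-equality of two normalizable terms by comparing their nf's."""
    return normalize(p) == normalize(q)

I = ("lam", 0)
K = ("lam", ("lam", 1))
S = ("lam", ("lam", ("lam", ("app", ("app", 2, 0), ("app", 1, 0)))))
SKK = ("app", ("app", S, K), K)
print(beta_equal(SKK, I))  # True
print(beta_equal(K, I))    # False
```

For typable terms the step limit is never the deciding factor, since by WN the leftmost reduction reaches a normal form.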
2D8.1 Note The BCK$\lambda$-terms are terms without multiple occurrences of variables
(except possibly for binding occurrences), so the above theorem connects untypability
with multiple occurrences of variables. On the other hand not every term with
multiple occurrences is untypable; consider $S \equiv \lambda xyz.xz(yz)$ in 2A8.7(iii) for example.
3 The principal-type algorithm
A typable term has in general an infinite set of types in TA$_\lambda$. For example if $I \equiv \lambda x.x$
it is possible to assign to $I$ every type with form $\sigma\to\sigma$, by the following deduction:
$$\frac{x:\sigma \mapsto x:\sigma}{\mapsto (\lambda x.x):\sigma\to\sigma}\ (\to I)$$
and it is easy to see that $I$ has no other types than these. (In fact every deduction
for $I$ must have the simple form shown above, by the subject-construction theorem.)
The type $a\to a$ is called a principal type for $I$.
The aim of the present chapter is to show that the existence of a principal type
is a property of all typable terms, not just I. In effect the principal type of a term
is the most general type it can receive in TA$_\lambda$, and the principal type theorem will
say that every typable term has one. Further, and most important in practice, an
algorithm will be described for finding it.
This algorithm will decide whether a given term M is typable and, if the answer
is "yes", will output a principal type for M. Such algorithms are usually called
type-checking or principal-type or PT algorithms. The existence of a PT algorithm
is what gives TA$_\lambda$ and its extensions such as ML their practical value, since if
the typability of a program is regarded as a safety criterion the programmer will
want to be able to decide effectively whether a newly created program satisfies this
criterion.
The PT algorithm below will be easy to describe and even easier to apply in
practice. But to prove the PT theorem a bald statement of the algorithm will not
be enough; we shall need also a proof that the algorithm is correct, i.e. that it does
what it claims to do. In the account below a correctness proof will be included with
the algorithm in the form of comments to each of its steps, explaining the purpose
and effect of each step as the reader meets it.
A little knowledge of substitution, unification and most general unifiers will
be needed so introductions to these will be included before the statement of the
algorithm.
3A Principal types and their history
3A1.1 Notation Letters "r", "s", "t", "u", "d" will denote type-substitutions. If
$s \equiv [\sigma_1/a_1, \ldots, \sigma_n/a_n]$ a frequent alternative notation to $s(\tau)$ will be
Recall that the set of all variables occurring in a type $\tau$ is called $\mathrm{Vars}(\tau)$. The sets of
all type-variables occurring in a finite sequence $(\tau_1,\ldots,\tau_n)$ of types, or in a deduction
$\Delta$, are called respectively
$\mathrm{Vars}(\tau_1,\ldots,\tau_n), \qquad \mathrm{Vars}(\Delta)$.
3A2.2 Warning Two distinct concepts of substitution into deductions have now
been mentioned, for term-variables in 2B4 and for type-variables above. Note that
$\mathrm{Vars}(\Delta)$ is a set of type-variables not term-variables, and when $s$ is applied to $\Delta$ the
terms in $\Delta$ are completely unchanged.
32 3 The principal-type algorithm
3A3 Definition (Principal types) In TA$_\lambda$, a principal type or PT of a term $M$ is a
type $\tau$ such that
(i) $\Gamma \vdash_\lambda M:\tau$ for some $\Gamma$,
(ii) if $\Gamma' \vdash_\lambda M:\sigma$ for some $\Gamma'$ and $\sigma$ then $\sigma$ is an instance of $\tau$.
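Condition (ii) rests on the relation "is an instance of", which can be tested by one-way matching. The following Python sketch is an illustration only: the encoding of types (a string for a type-variable, a pair `(l, r)` for an arrow type) and the names `apply`, `match` and `is_instance` are assumptions of the sketch, not notation from the text.

```python
# Types (assumed encoding): a string is a type-variable,
# a pair (l, r) stands for the arrow type l -> r.

def apply(s, t):
    """Apply the substitution s (dict: variable -> type) to the type t."""
    if isinstance(t, str):
        return s.get(t, t)
    return (apply(s, t[0]), apply(s, t[1]))

def match(general, specific, s=None):
    """Return a substitution s with s(general) == specific, or None."""
    s = {} if s is None else s
    if isinstance(general, str):
        # A variable must be sent to the same type at every occurrence.
        if s.setdefault(general, specific) != specific:
            return None
        return s
    if isinstance(specific, str):
        return None          # an arrow type cannot be mapped onto a variable
    if match(general[0], specific[0], s) is None:
        return None
    return match(general[1], specific[1], s)

def is_instance(sigma, tau):
    """True iff sigma is a substitution-instance of tau."""
    return match(tau, sigma) is not None

pt_I = ("a", "a")                                    # a -> a
print(is_instance((("b", "c"), ("b", "c")), pt_I))   # True
print(is_instance(("b", "c"), pt_I))                 # False
```

With this test, checking a candidate type against a known principal type amounts to a single call of `is_instance`.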
3A3.2 Notation It will be shown in 3B8.2 that a term's principal type is unique
(modulo substitutions of distinct variables for distinct variables), so we shall often
say "the principal type of M" or
PT(M).
3A4 Definition (Principal pairs) A principal pair for a term $M$ is a pair $(\Gamma, \tau)$ such
that the formula $\Gamma \mapsto M:\tau$ is TA$_\lambda$-deducible and every other TA$_\lambda$-deducible formula
$\Gamma' \mapsto M:\sigma$ is an instance of $\Gamma \mapsto M:\tau$.
3A5.1 Notes (i) If $\Delta$ is a principal deduction of $\Gamma \mapsto M:\tau$ then clearly $\tau$ is a principal
type of $M$ and $(\Gamma, \tau)$ is a principal pair for $M$. In fact the PT theorem will prove that
every typable term has not only a principal type but a principal deduction, and the
PT algorithm will be seen to construct principal deductions as well as types. This
observation will slightly simplify the algorithm's correctness-proof. Thus a typable
term M will be shown to have not only a most general type but a deduction whose
every step is most general.
(ii) Although principal types have been studied since 1969, principal deductions
were almost entirely neglected until about 1990 when the structure of a principal
deduction was characterized by Sachio Hirokawa (Hirokawa 1991a Thms. 1 and
2). This work is beyond the scope of the present book but it is one of the more
interesting recent developments in the study of TA$_\lambda$ and has led to new results on
principal types (for example those in Hirokawa 1991b-c, 1993a) and to simplified
proofs of some old results.¹
¹ To be precise, Hirokawa 1991a uses a weaker definition of principal deduction than that in 3A5 above:
Hirokawa calls a deduction of $\Gamma \mapsto M:\tau$ principal if the formula $\Gamma \mapsto M:\tau$ is a principal pair in
the sense of 3A4. But his characterization theorems can easily be modified to fit the definition in 3A5
by extending the conditions in them to apply to all type-variables in a deduction, not just those in its
conclusion and undischarged assumptions.
3A6 Principal Type (PT) Theorem Every typable term has a principal deduction
and a principal type in TA$_\lambda$. Further, there is an algorithm that will decide whether a
given $\lambda$-term $M$ is typable in TA$_\lambda$, and if the answer is "yes" will output a principal
deduction and principal type for $M$.
$(a\to b)\to(c\to a)\to c\to b$.
Using the subject-construction theorem (2B2), show that this type is principal for $B$.
That is, show that every type assigned to $B$ in TA$_\lambda$ must have form
$(\rho\to\sigma)\to(\tau\to\rho)\to\tau\to\sigma$,
where $\rho$, $\sigma$, $\tau$ are arbitrary types.
3A7.1 Note The PT algorithm in this chapter will use the method of Hindley 1969
and Milner 1978 rather than equation-solving.
As mentioned above this method depends on a unification algorithm given in
advance, but unification algorithms are widely available as packages in practice, so
this feature makes a PT algorithm easy to fit into an already given system in a
practical implementation.
This method will also turn out to be well suited to deal with the case where
the term whose PT is being computed is a combination of other terms $P_1,\ldots,P_n$
whose PT's are already known. This situation is common in practice, where a
library of terms and their PT's can be built up and used in determining the PT's of
new terms, and one of the original motivations for the method was a belief that it
would probably use such accumulated information more efficiently than a straight
equation-solving algorithm.
However, the account in this chapter will not be concerned with maximizing
efficiency, but only with the (usually incompatible) aim of making the PT algorithm's
structure and motivation as clear as possible.
3B Type-substitutions
$s \cup t = [\sigma_1/a_1,\ldots,\sigma_n/a_n,\ \tau_1/b_1,\ldots,\tau_p/b_p]$
(with repetitions omitted).
3B5.2 Exercise* (i) Write out $s \circ t$ in the special case that $s \equiv [a/b]$ and $t \equiv [b/a]$,
and verify that $(s \circ t)(\tau) \equiv s(t(\tau))$ in the case $\tau \equiv a\to b$.
(ii) Show that the action of any $s$ on a given type $\tau$ can be expressed as a
composition $s_1 \circ \ldots \circ s_k$ of single substitutions, in the sense that
$(s_1 \circ \ldots \circ s_k)(\tau) \equiv s(\tau)$.
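Part (i) can be checked mechanically. The Python sketch below is an illustration under stated assumptions: a substitution $[\sigma/a]$ is encoded as the dictionary `{"a": sigma}`, a type-variable is a string, an arrow type is a pair, and the names `apply` and `compose` are hypothetical. Composition is taken in the usual sense (apply $t$ first, then $s$).

```python
# Types (assumed encoding): string = type-variable, pair (l, r) = l -> r.
# A substitution [sigma/a] is written as the dict {"a": sigma}.

def apply(s, t):
    """Apply the substitution s simultaneously to all variables of t."""
    if isinstance(t, str):
        return s.get(t, t)
    return (apply(s, t[0]), apply(s, t[1]))

def compose(s, t):
    """The composition s o t: first t, then s."""
    return {v: apply(s, t.get(v, v)) for v in set(s) | set(t)}

s = {"b": "a"}        # [a/b]
t = {"a": "b"}        # [b/a]
tau = ("a", "b")      # a -> b

st = compose(s, t)
print(apply(st, tau) == apply(s, apply(t, tau)))  # True
print(apply(st, tau))                             # ('a', 'a')
```

In this encoding `st` sends both `a` and `b` to `a`, so on $a\to b$ the composite acts like $[a/b]$ alone.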
The next lemma will play an important role in the correctness proof of the PT
algorithm: it says that if a composition $s \circ t$ is "extended" to $r \cup (s \circ t)$, the extended
substitution can also be expressed as a composition with $t$ (under certain conditions
on $r$ to prevent clashes).
Then $r \cup (s \circ t)$ is
$s^{-1} = [a_1/b_1,\ldots,a_n/b_n]$
3B7.1 Lemma If $s$ is a renaming in $\tau$ then $s^{-1}$ is a renaming in $s(\tau)$ and
$s^{-1}(s(\tau)) \equiv \tau$.
3B7.2 Warning A one-to-one substitution may be a renaming in one type but not
in another. For example $[b/a]$ is a renaming in $a\to a$ but not in $a\to b$.
3B8.1 Lemma (i) $\sigma$ is an alphabetic variant of $\tau$ iff $\sigma$ and $\tau$ are instances of each
other.
(ii) Part (i) also holds for deductions and finite type-sequences.
Proof For "only if" in (i) use 3B7.1. For "if", let $\mathrm{Vars}(\tau) = \{a_1,\ldots,a_n\}$ where
$a_1,\ldots,a_n$ are distinct, and suppose $\sigma \equiv s(\tau)$ and $\tau \equiv t(\sigma)$, for substitutions $s$ and $t$
with
$\mathrm{Dom}(s) = \mathrm{Vars}(\tau), \qquad \mathrm{Dom}(t) = \mathrm{Vars}(\sigma)$.
Then $t(s(a_i)) \equiv a_i$ for $i = 1,\ldots,n$. Hence $s(a_i)$ cannot be composite, otherwise
$t(s(a_i))$ would be composite too. Thus $s$ has form
$s \equiv [b_1/a_1,\ldots,b_n/a_n]$,
and $\mathrm{Vars}(\sigma) = \{b_1,\ldots,b_n\}$. Also $b_1,\ldots,b_n$ are distinct, because if $b_i \equiv b_j$ then
$a_i \equiv t(s(a_i)) \equiv t(b_i) \equiv t(b_j) \equiv t(s(a_j)) \equiv a_j$.
3B8.3 Lemma For each finite set of type-variables $a_1,\ldots,a_m$ and each type $\tau$ there is
an alphabetic variant of $\tau$ that contains none of $a_1,\ldots,a_m$.
$s_1(\rho) \equiv s_2(\tau)$,
we will be able to deduce a type for $PQ$ by rule $(\to E)$, thus:
$$\frac{\mapsto P: s_1(\rho)\to s_1(\sigma) \qquad \mapsto Q: s_2(\tau)}{\mapsto PQ: s_1(\sigma)}\ (\to E)$$
Conversely, by the subject-construction theorem (2B2(iii)), every type deduced for
$PQ$ must have been obtained from instances of $\rho\to\sigma$ and $\tau$ by $(\to E)$ in this way.
Thus the problem of deciding whether $PQ$ is typable reduces to that of finding $s_1$
and $s_2$ such that $s_1(\rho) \equiv s_2(\tau)$. This suggests the next two definitions.
3C2 Definition (Common instances) (i) Iff $\nu \equiv s_1(\rho) \equiv s_2(\tau)$ we call $\nu$ a common
instance (c.i.) of the pair $(\rho, \tau)$, and we call $(s_1, s_2)$ a pair of converging substitutions
for $(\rho, \tau)$.
3C Motivating the PT algorithm 39
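3C2.1 Example A common instance of the pair $(a\to(b\to c),\ (a\to b)\to a)$ is the type
$((\beta\to\gamma)\to\delta)\to(\beta\to\gamma)$ (where $\beta$, $\gamma$, $\delta$ are any given types), and the corresponding
converging substitutions are
$s_1 = [((\beta\to\gamma)\to\delta)/a,\ \beta/b,\ \gamma/c], \qquad s_2 = [(\beta\to\gamma)/a,\ \delta/b]$.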
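A pair of converging substitutions can be verified by direct computation. The Python sketch below is an illustration only; the encoding of types (string for a variable, pair for an arrow), the helper names `apply` and `converges`, and the sample choices made for the arbitrary types $\beta$, $\gamma$, $\delta$ are all assumptions of the sketch.

```python
# Types (assumed encoding): string = type-variable, pair (l, r) = l -> r.

def apply(s, t):
    """Apply the substitution s (dict: variable -> type) to the type t."""
    if isinstance(t, str):
        return s.get(t, t)
    return (apply(s, t[0]), apply(s, t[1]))

def converges(s1, s2, rho, tau):
    """True iff (s1, s2) is a pair of converging substitutions for (rho, tau)."""
    return apply(s1, rho) == apply(s2, tau)

rho = ("a", ("b", "c"))     # a -> (b -> c)
tau = (("a", "b"), "a")     # (a -> b) -> a

# Sample choices for the arbitrary types beta, gamma, delta.
beta, gamma, delta = "p", ("q", "q"), "r"

s1 = {"a": ((beta, gamma), delta), "b": beta, "c": gamma}
s2 = {"a": (beta, gamma), "b": delta}

print(converges(s1, s2, rho, tau))                               # True
print(apply(s1, rho) == (((beta, gamma), delta), (beta, gamma)))  # True
```

Any other choice of `beta`, `gamma`, `delta` gives another common instance of the same shape.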
3C2.2 Note Not every pair of types has a common instance. For example the pair
$(a\to a,\ (b\to b)\to b)$ has none, because if $s_1(a\to a) \equiv s_2((b\to b)\to b)$ we would have
3C3 Definition (M.g.c.i.) (i) A most general common instance (m.g.c.i.) of $(\rho, \tau)$ is
a common instance $\nu_0$ such that every other common instance is an instance of $\nu_0$.
If $\nu_0$ is an m.g.c.i. of $(\rho, \tau)$ we shall call any pair $(s_1, s_2)$ such that $s_1(\rho) \equiv s_2(\tau) \equiv \nu_0$
an m.g.c.i.-generator for $(\rho, \tau)$.
(ii) M.g.c.i.'s of pairs of type-sequences and pairs of deductions are defined
similarly.
3C3.1 Exercise* Show that the pair $(a\to(b\to c),\ (a\to b)\to a)$ in 3C2.1 has the following
as an m.g.c.i.:
$\nu_0 \equiv ((b\to c)\to d)\to b\to c$.
3C3.2 Lemma (i) M.g.c.i.'s are unique modulo renaming. That is, if $\nu$ is an m.g.c.i. of
$(\rho, \tau)$ the other m.g.c.i.'s of $(\rho, \tau)$ are alphabetic variants of $\nu$.
(ii) If $\rho'$ and $\tau'$ are alphabetic variants of $\rho$ and $\tau$ respectively, then $(\rho', \tau')$ has the
same common instances and the same m.g.c.i.'s as $(\rho, \tau)$.
(iii) Similarly for m.g.c.i.'s of deductions and finite type-sequences.
3C4 Discussion It will be shown later that every pair $(\rho, \tau)$ with a common instance
has an m.g.c.i. Given this fact, the discussion in 3C1 suggests that if we know
$PT(P) \equiv \rho\to\sigma$ and $PT(Q) \equiv \tau$ and we know somehow that $(\rho, \tau)$ has a common
instance, we can compute $PT(PQ)$ by just constructing the m.g.c.i. of $(\rho, \tau)$, say
$\nu \equiv s_1(\rho) \equiv s_2(\tau)$,
and then letting $PT(PQ) \equiv s_1(\sigma)$.
And this is indeed true, provided we avoid one small snag. Suppose $\sigma$ contains
some variables $b_1,\ldots,b_k$ that do not occur in $\rho$, and by bad luck the m.g.c.i. $\nu$ we
have constructed also contains some of these variables. Then $s_1(\sigma)$ might contain
two occurrences of one variable $b_i$, one originally in $\sigma$ and the other introduced into
it by $s_1$. In this case $s_1(\sigma)$ would not be the most general type assignable to $PQ$,
because we could change $\nu$ to an alphabetic variant $\nu' \equiv s_1'(\rho)$ with no variables in
common with $\sigma$, and then the corresponding $s_1'(\sigma)$ would be a type of $PQ$ that was
not an instance of $s_1(\sigma)$.
As a concrete example, let $P \equiv \lambda xy.x$, $Q \equiv \lambda x.x$; it will be shown in 3E that
$PT(P) \equiv a\to(b\to a), \qquad PT(Q) \equiv b\to b$.
Thus in this case $\rho \equiv a$, $\sigma \equiv b\to a$, $\tau \equiv b\to b$. Clearly an m.g.c.i. of $(\rho, \tau)$ is $\nu \equiv b\to b$,
obtained from $\rho$ by the substitution $s_1 \equiv [(b\to b)/a]$. And
$s_1(\sigma) \equiv s_1(b\to a) \equiv b\to(b\to b)$.
It is easy to see that $b\to(b\to b)$ is assignable to $PQ$. But it is not the principal type
of $PQ$, because if we change $s_1$ to a new substitution $s_1'$ by replacing $b$ by a new
variable $c$ that does not already occur in $\sigma$, we get
$\vdash_\lambda P: (c\to c)\to(b\to(c\to c)), \qquad \vdash_\lambda Q: c\to c$,
and hence $PQ$ has a type $b\to(c\to c)$ that is not an instance of $b\to(b\to b)$.
To avoid this snag the PT algorithm will be careful to choose an m.g.c.i. $\nu$ of
$(\rho, \tau)$ such that
(1) $\mathrm{Vars}(\nu) \cap (\mathrm{Vars}(\sigma) - \mathrm{Vars}(\rho)) = \emptyset$.
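Condition (1) and the snag it prevents can be checked on the concrete example with $\rho \equiv a$ and $\sigma \equiv b\to a$. The Python sketch below is an illustration under stated assumptions: types are encoded as strings (variables) and pairs (arrows), and the names `apply`, `tvars`, `match`, `condition_1`, `bad` and `good` are hypothetical.

```python
# Types (assumed encoding): string = type-variable, pair (l, r) = l -> r.

def apply(s, t):
    """Apply the substitution s (dict: variable -> type) to the type t."""
    if isinstance(t, str):
        return s.get(t, t)
    return (apply(s, t[0]), apply(s, t[1]))

def tvars(t):
    """Set of type-variables occurring in t."""
    return {t} if isinstance(t, str) else tvars(t[0]) | tvars(t[1])

def match(general, specific, s=None):
    """Return s with s(general) == specific, or None if there is none."""
    s = {} if s is None else s
    if isinstance(general, str):
        return s if s.setdefault(general, specific) == specific else None
    if isinstance(specific, str):
        return None
    if match(general[0], specific[0], s) is None:
        return None
    return match(general[1], specific[1], s)

rho, sigma = "a", ("b", "a")          # PT(P) = rho -> sigma = a -> (b -> a)

def condition_1(nu):
    """Vars(nu) must avoid the variables of sigma that are not in rho."""
    return not (tvars(nu) & (tvars(sigma) - tvars(rho)))

bad = {"a": ("b", "b")}    # gives the m.g.c.i. b -> b, which reuses b
good = {"a": ("c", "c")}   # gives the variant c -> c, which avoids b

print(condition_1(apply(bad, rho)), condition_1(apply(good, rho)))  # False True
# b -> (b -> b) is an instance of b -> (c -> c), but not conversely:
print(match(apply(good, sigma), apply(bad, sigma)) is not None)     # True
print(match(apply(bad, sigma), apply(good, sigma)) is not None)     # False
```

Only the choice satisfying (1) yields a type of which the other is an instance.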
Given this precaution, we have now reduced the problem of finding PT(PQ) to
that of finding an m.g.c.i. of $(\rho, \tau)$. More precisely, we need an algorithm to decide
whether $(\rho, \tau)$ has a common instance and, if the answer is "yes", to output a most
general one. A suitable algorithm will be given in the next section. It will not be
direct, but will apply the unification algorithm, which has the advantage that its main
properties are so widely known that they will only need to be outlined below.
3D Unification
Most readers have probably met unification before. This section merely summarizes
the relevant definitions and basic properties for the reader who has not. The account
is based on the classical one in Robinson 1965. (An alternative account is in Aho et
al. 1986 §6.7, a thorough survey of major results is in Baader and Siekmann 1994,
and a survey of the various applications of unification is in Knight 1989.)
3D1 Definition (Unifiers) (i) Iff there is a substitution $s$ such that $s(\rho) \equiv s(\tau)$ we
say $(\rho, \tau)$ is unifiable; we call any such $s$ a unifier of $(\rho, \tau)$ and call $s(\rho)$ a unification
of $(\rho, \tau)$.
(ii) A unifier of a pair of sequences $((\rho_1,\ldots,\rho_n), (\tau_1,\ldots,\tau_n))$, both with the same
length, is an $s$ such that
$s((\rho_1,\ldots,\rho_n)) \equiv s((\tau_1,\ldots,\tau_n))$.
this pair was shown in 3C2.1 to have a common instance, but no $s$ can exist such
that $s(\rho) \equiv s(\tau)$, because the latter would imply the impossible identity
$s(a) \equiv s(a)\to s(b)$.
3D1.2 Note The problem of finding unifiers for pairs of type-sequences can be
reduced to that for pairs of types as follows. Given two sequences $(\rho_1,\ldots,\rho_n)$ and
$(\tau_1,\ldots,\tau_n)$, choose a variable $b$ not occurring in any of these types and define
$\rho^* \equiv \rho_1\to\ldots\to\rho_n\to b, \qquad \tau^* \equiv \tau_1\to\ldots\to\tau_n\to b$;
then the given pair of sequences is unified by a substitution $s$ if $(\rho^*, \tau^*)$ is unified
by $s \restriction \mathrm{Vars}(\rho_1,\ldots,\rho_n,\tau_1,\ldots,\tau_n)$.
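The reduction in this note is easy to express in code. The following Python sketch is an illustration only: the encoding of types (string for a variable, pair for an arrow) and the names `apply` and `seq_to_type` are assumptions, and the two sample sequences are invented for the example.

```python
# Types (assumed encoding): string = type-variable, pair (l, r) = l -> r.

def apply(s, t):
    """Apply the substitution s (dict: variable -> type) to the type t."""
    if isinstance(t, str):
        return s.get(t, t)
    return (apply(s, t[0]), apply(s, t[1]))

def seq_to_type(seq, b):
    """Build rho_1 -> ... -> rho_n -> b from the sequence (rho_1, ..., rho_n)."""
    t = b
    for r in reversed(seq):
        t = (r, t)
    return t

# Two sample sequences of length 2 (invented for this illustration).
rhos = ["a", ("a", "c")]              # (a, a -> c)
taus = [("d", "d"), ("e", "c")]       # (d -> d, e -> c)
s = {"a": ("d", "d"), "e": ("d", "d")}

# "b" occurs in none of the types above, as the note requires.
rho_star, tau_star = seq_to_type(rhos, "b"), seq_to_type(taus, "b")

unifies_seq = all(apply(s, r) == apply(s, t) for r, t in zip(rhos, taus))
unifies_star = apply(s, rho_star) == apply(s, tau_star)
print(unifies_seq, unifies_star)      # True True
```

Because `b` is fresh and is the final component of both types, a unifier of the two sequences is also a unifier of the two single types, so one unification routine for types covers sequences as well.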
3D2 Definition (M.g.u.) (i) A most general unifier (m.g.u.) of $(\rho, \tau)$ is a unifier $u$
such that for every other unifier $s$ of $(\rho, \tau)$ we have
$s(\rho) \equiv s'(u(\rho))$
for some $s'$. If $\nu \equiv u(\rho)$ for some m.g.u. $u$ of $(\rho, \tau)$ we shall call $\nu$ a most general
unification (m.g.u.) of $(\rho, \tau)$.
(ii) M.g.u.'s of pairs of type-sequences or deductions are defined similarly.
3D2.1 Exercise* Prove that the pair $(a\to(b\to b),\ (c\to c)\to a)$ is unifiable with a most
general unifier $u = [(b\to b)/a,\ b/c]$, and with the corresponding most general unification
being
$(b\to b)\to(b\to b)$.
3D2.4 Notation From now on we shall often speak of "the" m.g.u. of $(\rho, \tau)$ as if
m.g.u.'s were unique. (By 3D2.2, 3B8.1 and 3D2.3 they are unique modulo renaming.)
3D2.5 Lemma (Avoiding variables) (i) Let $V$ be any finite set of type-variables. If
$(\rho, \tau)$ has an m.g.u. $u$, then it has an m.g.u. $u'$ such that
$\mathrm{Dom}(u') = \mathrm{Vars}(\rho) \cup \mathrm{Vars}(\tau), \qquad \mathrm{Range}(u') \cap V = \emptyset$.
(ii) Similarly for m.g.u.'s of pairs of type-sequences or deductions.
3D3 M.G.U.-M.G.C.I. Lemma (i) If $\rho$ and $\tau$ have no common variables, $(\rho, \tau)$ has
an m.g.u. iff it has an m.g.c.i., and the two are identical.
(ii) For all $\rho$ and $\tau$: if we change $\tau$ to an alphabetic variant $\tau^*$ with no variables
in common with $\rho$, the unifications of $(\rho, \tau^*)$ will be exactly the common instances of
$(\rho, \tau)$ and the m.g.u. of $(\rho, \tau^*)$ will be the m.g.c.i. of $(\rho, \tau)$.
(iii) Similarly for pairs of type-sequences or deductions.
Thus the problem of finding m.g.c.i.'s has now been reduced to that of finding
m.g.u.'s of pairs $(\rho, \tau)$ with no variables in common. But searching for m.g.u.'s of
pairs, with or without variables in common, can be done by an algorithm as follows.
3D4 Unification Theorem (J. A. Robinson) (i) There is an algorithm which decides
whether a pair of types $(\rho, \tau)$ has a unifier, and, if the answer is "yes", constructs its
m.g.u.
(ii) If a pair $(\rho, \tau)$ has a unifier it has an m.g.u.
(iii) Parts (i)-(ii) hold also for pairs of deductions and for pairs of finite type-
sequences.
Proof (i) For Robinson's algorithm see 3D5 below; for a proof of its correctness
see Robinson 1965 §5 pp. 32-33.
(ii)-(iii) Like (i).
3D5 Unification Algorithm (Robinson 1965 §5.) Input: any pair $(\rho, \tau)$ of types.
Intended output: either a correct statement that $(\rho, \tau)$ is not unifiable or an m.g.u. $u$
of $(\rho, \tau)$. [The algorithm will build $u$ in stages $u_0, u_1, \ldots$, each $u_k$ being a composition
of $u_{k-1}$ with a new substitution: at the $k$-th stage it will test whether $u_k(\rho) \equiv u_k(\tau)$,
and if the answer is "yes" it will choose $u = u_k$ and stop; but if not, it will "extend"
$u_k$ and go to the next stage.]
Step 0. Choose $k = 0$ and $u_0 = e$ (the empty substitution).
Step $k+1$. Given $k$ and $u_k$, construct $\rho_k \equiv u_k(\rho)$ and $\tau_k \equiv u_k(\tau)$, and apply the
comparison procedure below to $(\rho_k, \tau_k)$. That procedure will output either a correct
statement that $\rho_k \equiv \tau_k$ or a disagreement pair $(a, \alpha)$ (see below) such that $a \not\equiv \alpha$.
If $\rho_k \equiv \tau_k$, choose $u = u_k$.
If $\rho_k \not\equiv \tau_k$ and the output of the comparison procedure is $(a, \alpha)$, decide whether
$a \in \mathrm{Vars}(\alpha)$.¹
If $a \in \mathrm{Vars}(\alpha)$, state that $(\rho, \tau)$ is not unifiable and stop.
If $a \notin \mathrm{Vars}(\alpha)$, then replace $k$ by $k + 1$, choose $u_{k+1} = [\alpha/a] \circ u_k$, and go to
Step $k + 2$.
Comparison Procedure. Given a pair $(\mu, \nu)$ of types, write $\mu$ and $\nu$ as symbol-strings,
say
$\mu \equiv s_1 \ldots s_m, \qquad \nu \equiv t_1 \ldots t_n \qquad (m, n \ge 1)$
where each of $s_1,\ldots,s_m,t_1,\ldots,t_n$ is an occurrence of a parenthesis, arrow or
variable.
If $\mu \equiv \nu$, state that $\mu \equiv \nu$ and stop.
If $\mu \not\equiv \nu$, choose the least $p \le \mathrm{Min}\{m, n\}$ such that $s_p \not\equiv t_p$; it is not hard
to show that one of $s_p$, $t_p$ must be a variable and the other must be a left
parenthesis or a different variable. Further, $s_p$ can be shown to be the leftmost
symbol of a unique subtype $\mu^*$ of $\mu$. (If $s_p$ is a variable, $\mu^* \equiv s_p$.) Similarly
$t_p$ is the leftmost symbol of a unique subtype $\nu^*$ of $\nu$. Choose one of $\mu^*$,
$\nu^*$ that is a variable and call it "$a$". (If both are variables, choose the one
that is first in the sequence given in Definition 2A1.) Then call the remaining
member of $(\mu^*, \nu^*)$ "$\alpha$"; the pair $(a, \alpha)$ is called the disagreement pair for
$(\mu, \nu)$.
3D5.1 Note To prove that $p$ exists in the case that $\mu \not\equiv \nu$ in the comparison
procedure we must show that it is not possible to have
$t_1 \ldots t_n \equiv s_1 \ldots s_m t_{m+1} \ldots t_n$
with $n > m$. This is left as a (rather dull) exercise for the reader.
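Algorithm 3D5 can be sketched in Python as follows. This is a hedged illustration rather than a transcription: types are encoded as strings (variables) and pairs (arrows), so the comparison procedure walks the two type trees from the left instead of scanning symbol-strings, which should locate the same leftmost disagreement; alphabetical order stands in for the variable sequence of Definition 2A1; and the names `apply`, `tvars`, `compose`, `disagreement` and `unify` are assumptions of the sketch.

```python
# Types (assumed encoding): string = type-variable, pair (l, r) = l -> r.

def apply(s, t):
    """Apply the substitution s (dict: variable -> type) to the type t."""
    if isinstance(t, str):
        return s.get(t, t)
    return (apply(s, t[0]), apply(s, t[1]))

def tvars(t):
    """Set of type-variables occurring in t."""
    return {t} if isinstance(t, str) else tvars(t[0]) | tvars(t[1])

def compose(s, t):
    """The composition s o t (apply t first, then s)."""
    return {v: apply(s, t.get(v, v)) for v in set(s) | set(t)}

def disagreement(mu, nu):
    """Leftmost disagreement pair (a, alpha), or None if mu and nu are identical."""
    if mu == nu:
        return None
    if isinstance(mu, str) and isinstance(nu, str):
        return tuple(sorted((mu, nu)))    # assumption: alphabetical order
    if isinstance(mu, str):
        return (mu, nu)
    if isinstance(nu, str):
        return (nu, mu)
    return disagreement(mu[0], nu[0]) or disagreement(mu[1], nu[1])

def unify(rho, tau):
    """Return an m.g.u. of (rho, tau) as a dict, or None if not unifiable."""
    u = {}                                # Step 0: u_0 is the empty substitution
    while True:
        pair = disagreement(apply(u, rho), apply(u, tau))
        if pair is None:
            return u                      # u_k(rho) and u_k(tau) are identical
        a, alpha = pair
        if a in tvars(alpha):             # the occurs check
            return None
        u = compose({a: alpha}, u)        # u_{k+1} = [alpha/a] o u_k

# The pair of Exercise 3D2.1: (a -> (b -> b), (c -> c) -> a).
rho, tau = ("a", ("b", "b")), (("c", "c"), "a")
u = unify(rho, tau)
print(apply(u, rho) == apply(u, tau))     # True
print(apply(u, rho))                      # (('c', 'c'), ('c', 'c'))
# The pair (a -> (b -> c), (a -> b) -> a) fails the occurs check.
print(unify(("a", ("b", "c")), (("a", "b"), "a")))   # None
```

The unification printed here uses `c` where Exercise 3D2.1 uses `b`; the two answers are alphabetic variants, in line with the uniqueness of m.g.u.'s modulo renaming.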
3D5.2 Note (History) As mentioned earlier, the above unification algorithm is due
to J. A. Robinson, see Robinson 1965 §5; it was the first to be published complete
with a correctness-proof. Robinson dates its initial implementation to 1962, but he
credits a less smooth algorithm implicit in Prawitz 1960 with influence on its origin
and mentions that the history of unification algorithms goes back at least as far
as Herbrand 1930.² Another early unification algorithm is implicit in Maslov 1964.
For more on the early history see the start of §3 in Baader and Siekmann 1994.
Although Robinson's algorithm is easy to describe it is not particularly efficient
to run, and many better algorithms have been published and implemented since the
1960's. Useful surveys are in Knight 1989 and Baader and Siekmann 1994. A good
discussion of efficiency is in §3.3 of the latter; here it is enough just to mention that
there are algorithms in Paterson and Wegman 1978 §4 and Martelli and Montanari
1982 that run in linear time if their inputs $(\rho, \tau)$ and outputs $u$ are coded in a suitably
compact way, but that without some coding even the mere printing-out of $u(\rho)$ may
take exponential time in the worst cases. The problem of deciding whether a given
pair $(\rho, \tau)$ can be unified is known to be PTIME-complete (Dwork et al. 1984).
¹ This decision is called the "occurs check".
² Robinson 1966, Robinson 1979 p. 292.
3D5.3 Corollary (i) There exists an algorithm which decides whether a pair $(\rho, \tau)$ has
a common instance, and if the answer is "yes", outputs an m.g.c.i. and a pair $(s_1, s_2)$
of m.g.c.i.-generators for $(\rho, \tau)$.
(ii) Every pair of types with a common instance has an m.g.c.i.
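One way to realize this corollary, following Lemma 3D3(ii), is to rename $\tau$ apart from $\rho$ and then unify. The Python sketch below is an illustration under stated assumptions: types are strings (variables) and pairs (arrows); renaming is done by appending a prime to each variable of $\tau$, which assumes no variable of $\rho$ already ends in a prime; the unifier is a compact worklist variant rather than the stage-by-stage formulation of 3D5; and the names `apply`, `tvars`, `unify` and `mgci` are hypothetical.

```python
# Types (assumed encoding): string = type-variable, pair (l, r) = l -> r.

def apply(s, t):
    """Apply the substitution s (dict: variable -> type) to the type t."""
    if isinstance(t, str):
        return s.get(t, t)
    return (apply(s, t[0]), apply(s, t[1]))

def tvars(t):
    """Set of type-variables occurring in t."""
    return {t} if isinstance(t, str) else tvars(t[0]) | tvars(t[1])

def unify(eqs):
    """Idempotent m.g.u. of a list of type equations, or None."""
    u, eqs = {}, list(eqs)
    while eqs:
        l, r = eqs.pop()
        l, r = apply(u, l), apply(u, r)
        if l == r:
            continue
        if isinstance(l, str) or isinstance(r, str):
            a, alpha = (l, r) if isinstance(l, str) else (r, l)
            if a in tvars(alpha):         # occurs check
                return None
            u = {v: apply({a: alpha}, t) for v, t in u.items()}
            u[a] = alpha
        else:
            eqs += [(l[0], r[0]), (l[1], r[1])]
    return u

def mgci(rho, tau):
    """Return (nu, s1, s2) with nu = s1(rho) = s2(tau) most general, or None."""
    ren = {v: v + "'" for v in tvars(tau)}      # rename tau apart from rho
    u = unify([(rho, apply(ren, tau))])
    if u is None:
        return None
    s2 = {v: apply(u, w) for v, w in ren.items()}   # u after the renaming
    return apply(u, rho), u, s2

rho, tau = ("a", ("b", "c")), (("a", "b"), "a")     # the pair of 3C2.1
nu, s1, s2 = mgci(rho, tau)
print(nu == apply(s1, rho) == apply(s2, tau))       # True
print(nu)            # ((('b', 'c'), "b'"), ('b', 'c'))
print(mgci(("a", "a"), (("b", "b"), "b")))          # None, as in 3C2.2
```

The pair `(s1, s2)` returned here plays the role of an m.g.c.i.-generator; the primed variable `b'` is simply a fresh variable.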
3E The PT algorithm
The PT theorem (3A6) stated that there is an algorithm that decides whether a given
untyped term $M$ is typable in TA$_\lambda$, and outputs a principal deduction and principal
type for M if the answer is "yes". A suitable algorithm will now be described. The
proof that it satisfies 3A6 will be given as a series of notes between the algorithm's
steps; these notes will also help to motivate the steps.
3E1 Principal Type (PT) Algorithm (Hindley 1969 §3.) Input: any $\lambda$-term $M$, closed
or not.
Intended output: either a principal deduction $\Delta_M$ for $M$ or a correct statement
that $M$ is not typable.
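$x:\alpha,\ x_1:\alpha_1, \ldots, x_t:\alpha_t \mapsto P:\beta$
for some types $\alpha, \alpha_1, \ldots, \alpha_t, \beta$. Apply rule $(\to I)_{\mathrm{main}}$ to make a deduction of
Justification of Case II. We must show that the above $\Delta_{\lambda x.P}$ is principal for $\lambda x.P$.
Let $\Delta$ be any other deduction of a type for $\lambda x.P$. By the subject-construction
theorem (2B2) the last step in $\Delta$ must be by rule $(\to I)_{\mathrm{main}}$, with form
Consequently $\Delta \equiv$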
Case III. If $M \equiv \lambda x.P$ and $x \notin FV(P)$, say $FV(P) = \{x_1,\ldots,x_t\}$, apply the
algorithm to $P$. If $P$ is not typable then $M$ is not. If $P$ has a principal deduction
$\Delta_P$ its conclusion must have form
$x_1:\alpha_1,\ldots,x_t:\alpha_t \mapsto P:\beta$
for some types $\alpha_1, \ldots, \alpha_t, \beta$. Choose a new type-variable $d$ not in $\Delta_P$ and apply
$(\to I)_{\mathrm{vac}}$, vacuously discharging $x:d$, to get a deduction of
$x_1:\alpha_1,\ldots,x_t:\alpha_t \mapsto (\lambda x.P): d\to\beta$.
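Then apply $u$ to $\Delta_P$ and $\Delta_Q$; this changes their conclusions to, respectively,
$u_1:\theta_1^*,\ldots,u_p:\theta_p^*,\ w_1:\psi_1^*,\ldots,w_r:\psi_r^* \mapsto P:\rho^*\to\sigma^*$,
$v_1:\phi_1^*,\ldots,v_q:\phi_q^*,\ w_1:\chi_1^*,\ldots,w_r:\chi_r^* \mapsto Q:\tau^*$,
(7) $u_1:\theta_1^*, \ldots, u_p:\theta_p^*$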
Justification of Subcase IVa. For IVa1 we must prove that if $PQ$ is typable then
$(\psi_1,\ldots,\psi_r,\rho)$ and $(\chi_1,\ldots,\chi_r,\tau)$ have a unifier, and for IVa2 we must prove that the
above $\Delta_{PQ}$ is principal for $PQ$.
Justification of IVa1. If $PQ$ is typable, there is a deduction $\Delta$ whose conclusion
has form
for some types $\pi_1, \ldots, \pi_p$, etc. By the subject-construction theorem (2B2), $\Delta$ must have
been built by applying rule $(\to E)$ to two deductions $\Delta_1$ and $\Delta_2$ whose conclusions
are
for some type a. But $\Delta_P$ and $\Delta_Q$ are principal deductions for $P$ and $Q$, so
$\Delta_1 \equiv r_1(\Delta_P), \qquad \Delta_2 \equiv r_2(\Delta_Q)$
for some substitutions $r_1$ and $r_2$ such that
$\mathrm{Dom}(r_1) = \mathrm{Vars}(\Delta_P), \qquad \mathrm{Dom}(r_2) = \mathrm{Vars}(\Delta_Q)$.
Roughly speaking, (5) and (6) say that when $u$ is applied to $\Delta_P$ and $\Delta_Q$ any new variables it introduces
will differ from all those already in $\Delta_P$ and $\Delta_Q$, and hence no unnecessary identifications of variables
will be made. (Cf. the motivation in 3C4.) By the way, if the aim of the algorithm had been to
construct merely a principal type and not a principal deduction, (5) could have been weakened by
re-defining $V$ to consist of just the variables (if any) in $\sigma$ that do not occur in the types in (3). (Cf. (1)
in 3C4.)
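Therefore by (11),
$\Delta_1 \equiv (r' \cup s)(u(\Delta_P)), \qquad \Delta_2 \equiv (r' \cup s)(u(\Delta_Q))$.
But $\Delta$ is a combination of $\Delta_1$ and $\Delta_2$ by rule $(\to E)$ and $\Delta_{PQ}$ is a similar combination
of $u(\Delta_P)$ and $u(\Delta_Q)$. Hence
$\Delta \equiv (r' \cup s)(\Delta_{PQ})$.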
Subcase IVb: $M \equiv PQ$ and $PT(P)$ is atomic, say $PT(P) \equiv b$. Then the conclusions
of $\Delta_P$ and $\Delta_Q$ have form, respectively,
Choose any variable $c \notin \mathrm{Vars}(\Delta_P) \cup \mathrm{Vars}(\Delta_Q)$ and apply the unification algorithm
to the pair of sequences
Subsubcase IVb1: the pair (21) has no unifier. Then $PQ$ is not typable. [See the
justification below.]
Subsubcase IVb2: the pair (21) has a unifier. Then the unification algorithm gives
an m.g.u. $u$; apply 3D2.5 to ensure that
$\psi_1 \equiv \chi_1, \ldots, \psi_r \equiv \chi_r$
Choose $\Delta_{PQ}$ to be the deduction obtained by applying rule $(\to E)$ to $u(\Delta_P)$ and
$u(\Delta_Q)$; its conclusion is
$(\psi_1,\ldots,\psi_r,b), \qquad (\chi_1,\ldots,\chi_r,\tau\to c)$.
Hence, just as in the justification of IVa2, $r'' =_{\mathrm{ext}} s \circ u$ for some $s$ such that
(27) $\mathrm{Dom}(s) \subseteq \mathrm{Range}(u)$.
Thus $r =_{\mathrm{ext}} r' \cup (s \circ u)$. Using (23) and (27) it is easy to see that $r'$, $s$, $u$ satisfy the
conditions in the composition-extension lemma (3B6), so by that lemma,
$r =_{\mathrm{ext}} (r' \cup s) \circ u$.
3E1.1 Notes (i) The PT algorithm could have been shortened by combining Sub-
cases IVa and IVb as was done in Hindley 1969: in IVa the unification algorithm
could have been applied to
$(\psi_1,\ldots,\psi_r,\rho\to\sigma), \qquad (\chi_1,\ldots,\chi_r,\tau\to c)$
where c is a new variable, and this modification would have made IVa exactly
parallel to IVb and the latter could have been omitted. But this chapter's aim is
to explain clearly what is involved in computing a PT and not just to give the
slickest presentation, and IVa seems easier to understand when not combined with
IVb.
(ii) At the start of Case IV the algorithm could have been made more direct
by omitting the renaming of the type-variables in Δ_P and Δ_Q and using common
instances instead of unifiers in Subcases IVa and IVb. For example in IVa, instead
of seeking an m.g.u. u of the sequences
⟨ψ₁,…,ψ_r, ρ⟩, ⟨χ₁,…,χ_r, τ⟩,
we could have sought an m.g.c.i. The justification of the algorithm would then have
rested on 3D5.3, the analogue of the unification theorem for common instances.
3E2 Exercise Show that the closed terms in Table 3E2a have the principal types
shown there. (The answers to (1)-(7) are displayed in 2A8.2-4 and in the answer to
Exercise 2A8.7; these deductions can easily be shown to be principal. The others
can be checked by applying the PT algorithm.)
By the way, the terms (10) and (11) in Table 3E2a are not in β-normal form; we
shall see in 8B4 and 8B7 that there are no closed β-nfs with the same PT's as these
terms.
An extended table of terms and their principal types is given at the end of the
book for ease of reference.
Table 3E2a.
      Term                        Principal type
(1)   B ≡ λxyz.x(yz)              (a→b)→(c→a)→c→b
(2)   B' ≡ λxyz.y(xz)             (a→b)→(b→c)→a→c
(3)   C ≡ λxyz.xzy                (a→b→c)→b→a→c
(4)   I ≡ λx.x                    a→a
(5)   K ≡ λxy.x                   a→b→a
(6)   S ≡ λxyz.xz(yz)             (a→b→c)→(a→b)→a→c
(7)   W ≡ λxy.xyy                 (a→a→b)→a→b
(8)   0 ≡ λxy.y                   a→b→b
(9)   1 ≡ λxy.xy                  (a→b)→a→b
(10)  (λxyz.K(xy)(xz))I           a→a→a
(11)  λxy.(λz.x)(yx)              a→(a→b)→a
(12)  λxyz.xy(xz)                 [untypable]
(13)                              [untypable]
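The entries of Table 3E2a can be checked mechanically. The following is a minimal sketch, not the book's PT algorithm verbatim: it computes only a principal type (not a principal deduction) for closed terms, and it follows the shortened variant mentioned in Note 3E1.1(i), in which every application is handled by unifying the type of the function part with τ→c for a new variable c. The term encoding and the names `lam`, `app`, `unify`, `infer` and `principal_type` are assumptions made for this illustration.

```python
import itertools

# Types: a type-variable is a str; an arrow type is ("->", dom, cod).
# Terms: a variable is a str; ("lam", x, body); ("app", fun, arg).

def lam(names, body):
    """lam("xyz", M) builds the abstraction of x, y, z over M."""
    for v in reversed(names):
        body = ("lam", v, body)
    return body

def app(*ts):
    """app(M, N, P) builds (M N) P (association to the left)."""
    f = ts[0]
    for a in ts[1:]:
        f = ("app", f, a)
    return f

def walk(t, s):
    """Follow the bindings of substitution s starting from t."""
    while isinstance(t, str) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    t = walk(t, s)
    if isinstance(t, str):
        return t == v
    return occurs(v, t[1], s) or occurs(v, t[2], s)

def unify(t1, t2, s):
    """Extend s to a unifier of t1 and t2, or return None."""
    t1, t2 = walk(t1, s), walk(t2, s)
    if isinstance(t1, str):
        if t1 == t2:
            return s
        return None if occurs(t1, t2, s) else {**s, t1: t2}
    if isinstance(t2, str):
        return unify(t2, t1, s)
    s = unify(t1[1], t2[1], s)
    return None if s is None else unify(t1[2], t2[2], s)

def infer(term, env, s, fresh):
    """Return (type, substitution), or None if the term is untypable."""
    if isinstance(term, str):
        return env[term], s              # closed terms only: every variable is bound
    if term[0] == "lam":
        a = next(fresh)
        r = infer(term[2], {**env, term[1]: a}, s, fresh)
        return None if r is None else (("->", a, r[0]), r[1])
    r1 = infer(term[1], env, s, fresh)
    if r1 is None:
        return None
    r2 = infer(term[2], env, r1[1], fresh)
    if r2 is None:
        return None
    c = next(fresh)                      # the new variable c of Note 3E1.1(i)
    s2 = unify(r1[0], ("->", r2[0], c), r2[1])
    return None if s2 is None else (c, s2)

def resolve(t, s):
    t = walk(t, s)
    return t if isinstance(t, str) else ("->", resolve(t[1], s), resolve(t[2], s))

def show(t, names):
    """Print a type, renaming variables to a, b, c, ... in order of occurrence."""
    if isinstance(t, str):
        if t not in names:
            names[t] = "abcdefghijklmnopqrstuvwxyz"[len(names)]
        return names[t]
    left = show(t[1], names)
    if not isinstance(t[1], str):
        left = "(" + left + ")"
    return left + "->" + show(t[2], names)

def principal_type(term):
    fresh = (f"t{i}" for i in itertools.count())
    r = infer(term, {}, {}, fresh)
    return None if r is None else show(resolve(r[0], r[1]), {})

B = lam("xyz", app("x", app("y", "z")))
K = lam("xy", "x")
S = lam("xyz", app("x", "z", app("y", "z")))
TERM_12 = lam("xyz", app("x", "y", app("x", "z")))
```

Under these assumptions `principal_type(S)` returns the string for (a→b→c)→(a→b)→a→c, and `principal_type(TERM_12)` returns `None` because the occurs-check fails.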
3E3 Theorem The relation ⊢_λ M:τ is decidable, i.e. there is an algorithm which
accepts any M and τ and decides whether or not ⊢_λ M:τ.
3E4 Further Reading For an alternative introduction to PT's and type-checking see
Aho et al. 1986 Ch. 6. The early literature on PT algorithms was mentioned in 3A7
above. The more recent literature on their uses and properties is fairly extensive and
perhaps the best place to start would be the survey Tiuryn 1990 and the papers in
its bibliography, as well as Giannini et al. 1993.
Accounts of PT algorithms that include correctness-proofs are relatively rare,
however; besides the proofs in Curry 1969 and Hindley 1969 §3 (and the unpublished
ones in Morris 1968 and Damas 1984) there is one in Wand 1987.
4 Type assignment with equality
Let M be a closed term and Types(M) be the set of all types assignable to M in
TA_λ. As we saw in Chapter 2 this set might not stay invariant during a β-conversion
of M. In fact Examples 2C2.2-6 and Note 2C3.2 showed that this set may increase
during a reduction of M and sometimes M can even be converted to a term with
no types at all in common with M.
As noted in 2C3.2 this lack of equality-invariance has not hindered TA_λ from a
practical point of view, nor its descendant ML, as the main need in both systems
has been simply for the subject-reduction theorem. However, on the theoretical side,
equality-invariance of Types(M) seems a desirable property from the viewpoint of
any of the standard λ-calculus semantics (except possibly an operational semantics):
if we believe terms represent functions of some sort and equal terms represent the
same function, then equal terms should have the same types.
So this chapter will describe a system obtained by adding an equality-invariance
rule to TA_λ. More precisely, two systems will be described in parallel, one for =_β
and the other for =_βη; both will have similar properties.
4A The equality rule
4A1.1 Notation When confusion is unlikely (Eq_β) and (Eq_βη) may both be called
just "(Eq)".
Strictly speaking, to make the (Eq)-rules meaningful we must also add the axioms and rules for =_β
and =_βη to the definitions of TA_λ+β and TA_λ+βη in some way; the details of this are left to the reader.
The name "TA_λ+β[η]" will be used to denote TA_λ+β and TA_λ+βη simultaneously
when stating results that hold for both systems. Deducibility notation in such
statements will be
⊢_β[η].
To emphasize the contrast with ⊢_β and ⊢_βη we shall call ⊢_λ, in the present chapter
only,
⊢_no eq.
4A1.2 Example Some terms become typable in TA_λ+β[η] that were not typable in
TA_λ. For example, consider the term
    P ≡ (λuv.v)(λx.xx)
which was shown in 2C2.2 to have no type although it reduces to I which has type
a→a. In TA_λ+β[η] we can assign a type to P by the following deduction.
    v:a ↦ v:a
    -------------- (→I)
    ↦ λv.v : a→a          λv.v =_β (λuv.v)(λx.xx)
    --------------------------------------------- (Eq_β)
    ↦ (λuv.v)(λx.xx) : a→a
4A1.3 Remark By adding (Eq) to TA_λ we have trivially solved the problem of
making the set of a term's types invariant with respect to conversion. But for this
easy gain we pay at least two prices.
(i) Rule (Eq) is undecidable. That is, there is no algorithm that will apply to
arbitrary Γ, τ, M, N and decide whether the formula Γ ↦ N:τ is deducible from
Γ ↦ M:τ by (Eq). (Because the relations =_β and =_βη are undecidable.) This defect
can be remedied by replacing (Eq) by a series of rules, each one corresponding
to an axiom in a suitable definition of =_β or =_βη, but this solution is far from
neat.
(ii) The subject-construction theorem (2B2) breaks down: when there are (Eq)-steps
present the structure of a deduction is no longer dictated by that of its subject.
Also when the proof of a formula Γ ↦ M:τ contains (Eq) we cannot infer that
Subjects(Γ) = FV(M) as we did in Lemma 2A10.
At first sight (i) and (ii) are serious drawbacks. However, the following theorems
will show that (ii), at least, is not as bad as it might seem; we shall see that in fact
rule (Eq) plays an unexpectedly small role in deductions and that TA_λ+β[η] is tied to
TA_λ very closely indeed.
r, u r2 ti (P'Q'):Q.
Replace this part by the following, noting that the assumptions and the conclusion
stay the same.
Ti F-+ r2 i-- Q:p
4A2.1 Note The above proof depends on little more than the very simple fact that
if P =_β[η] P' and Q =_β[η] Q' then
In TA_λ, x occurred in the context on the top line of (→I) iff x ∈ FV(P), but as remarked in 4A1.3(ii),
this does not necessarily hold when (Eq) is present.
4A3 Weak Normalization Theorem for TA_λ+β[η] Every TA_λ+β[η]-typable term M has
a β[η]-nf M*, and furthermore
Γ ⊢_β[η] M:τ ⟹ Γ ⊢_no eq M*:τ.
Proof Let Γ ⊢_β[η] M:τ. Then by 4A2 there exists M' =_β[η] M such that
(1) Γ ⊢_no eq M':τ.
By the weak normalization theorem for TA_λ (2D5), M' has a nf M*; and by the
Church-Rosser theorem M' reduces to M* and M* is also the nf of M. Then by (1)
and the subject-reduction theorem (2C1),
(2) Γ ⊢_no eq M*:τ.
4A3.2 Corollary The relation Γ ⊢_β[η] M:τ is equivalent to each of the following:
(i) Γ ⊢_no eq M*:τ,
(ii) Γ ↾ M* ⊢_no eq M*:τ,
(iii) Γ ⊢_β[η] M*:τ,
(iv) Γ ↾ M* ⊢_β[η] M*:τ,
(v) Γ ↾ M* ⊢_β[η] M:τ,
(vi) Γ ↾ M ⊢_β[η] M:τ.
Proof Each of (i)-(vi) implies Γ ⊢_β[η] M:τ by (Eq) and the weakening property of
"⊢" (which holds for ⊢_β[η] just as for ⊢_λ, see 2A9.1).
For the converse, the relation Γ ⊢_β[η] M:τ implies (i) by 4A3. And (i) implies (ii)
by 2A11. Next, (ii) implies (iv) trivially, and (iv) implies (iii), (v) and (vi) by (Eq)
and weakening.
4A3.3 Note When Γ ⊢_β[η] M:τ we cannot infer that Subjects(Γ) ⊇ FV(M), but by
the above corollary and 2A11 we can infer that
Subjects(Γ) ⊇ FV(M*).
The next theorem will express the content of Corollary 4A3.2 very neatly for the
case that M is closed.
4A4 Definition If M is closed, the set of all τ such that ⊢_β[η] M:τ holds will be
called
Types_β[η](M).
To emphasize the contrast, the set Types(M) defined in 2C3 will be called here
Types_no eq(M).
4A5 Theorem If M is closed and has β- and βη-nf's M*_β and M*_βη respectively,
then
(i) Types_β(M) = Types_β(M*_β) = Types_no eq(M*_β),
(ii) Types_βη(M) = Types_βη(M*_βη) = Types_no eq(M*_βη).
Proof By 4A3.2.
4A7 PT Theorem for TA_λ+β[η] In TA_λ+β[η] every typable term M has a principal
type, and it is the same as the principal type in TA_λ of the β[η]-nf of M.
4A8 Remark Results 4A2-7 show that the effect of adding rule (Eq) has been much
less than we might have feared: (Eq) simply transfers types from normal forms to
the terms that reduce to them, and consequently TA_λ+β[η] is little more than the
theory of assigning types to β[η]-normal forms in TA_λ.
However, this is not quite the whole story. Rule (Eq) was motivated at the start
of this section by semantic considerations and the semantics of TA_λ+β[η] will be
investigated in the next section.
But first two more syntactic results will be stated. The first is an analogue for
TA_λ+β of the subject-reduction theorem for TA_λ: if it merely asserted closure under
▷_β it would be trivial but it says slightly more.
Proof Let Γ = ∅ and τ ≡ a→a. For any M, if ⊢_β[η] M:a→a then M is closed and
by 4A3, M has a nf M* which is closed and has the property
⊢_no eq M*:a→a.
But by the subject-construction theorem (2B2) it is easy to see that a closed β[η]-nf
with type a→a in TA_λ must have form λx.x for some x. Hence
⊢_β[η] M:a→a ⟺ M =_β[η] I.
Thus a test for ⊢_β[η] would give a test for convertibility to I, contrary to standard
undecidability results (e.g. Barendregt 1984 Thm. 6.6.2 or HS 86 Cor. 5.6.1).
4B Semantics and completeness
4B3 Definition A λ-model 𝔻 = ⟨D, ·, ⟦ ⟧⟩ is called extensional if, for all d₁ and
d₂ ∈ D,
(∀e ∈ D)(d₁·e = d₂·e) ⟹ d₁ = d₂.
4B4 Lemma (i) Every λ-model 𝔻 satisfies the theory of β-equality in the sense
that
M =_β N ⟹ (∀E) ⟦M⟧_E = ⟦N⟧_E.
(ii) A λ-model 𝔻 is extensional iff it satisfies βη in the sense that
4B4.1 Note A λ-model is a model of untyped λ-calculus, i.e. every term receives
an interpretation in D whether it is typable or not. This agrees with the Curry
approach to type-theory described in 2A3. In contrast, in a model of Church's
type-theory only typed terms are interpreted, and instead of one domain D the
model has a distinct domain for each type. The definition of model for Church's
system can be found in Henkin 1950.
Types are interpreted in an arbitrary λ-model as follows.
4B5 Definition (Interpreting types) Let 𝔻 = ⟨D, ·, ⟦ ⟧⟩ be any λ-model. An interpretation
of the type-variables is any function V that assigns a subset of D to each
type-variable. Each such V generates an interpretation of all the types, ⟦ ⟧_V, defined
as follows:
(i) ⟦a⟧_V = V(a) for all type-variables a,
(ii) ⟦σ→τ⟧_V = {d ∈ D : (∀d' ∈ ⟦σ⟧_V)(d·d' ∈ ⟦τ⟧_V)}.
⟦M⟧_E ∈ ⟦τ⟧_V.
A set Γ of type-assignments is said to be satisfied if all its members are satisfied. A
formula Γ ↦ M:τ is said to be valid in 𝔻 if every pair E, V satisfying Γ in 𝔻 also
satisfies M:τ in 𝔻.
If Γ ↦ M:τ is valid in all λ-models (respectively, all extensional λ-models), we
say
Γ ⊨_β M:τ,   Γ ⊨_βη M:τ.
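The two clauses of Definition 4B5 that interpret types are directly computable when D is finite. The sketch below only illustrates how the clause for arrow types works; the three-element structure with d·e = (d+e) mod 3 is a hypothetical toy applicative structure chosen for this example and is not a λ-model, and the names `interpret`, `arrow`, `D`, `dot` and `V` are likewise assumptions.

```python
# Types: a type-variable is a str; an arrow type is ("->", sigma, tau).

def arrow(sigma, tau):
    return ("->", sigma, tau)

def interpret(ty, V, D, dot):
    """Compute the interpretation of ty under V as a frozenset of elements of D."""
    if isinstance(ty, str):
        return frozenset(V[ty])                      # clause (i)
    _, sigma, tau = ty
    dom = interpret(sigma, V, D, dot)
    cod = interpret(tau, V, D, dot)
    # clause (ii): d belongs iff d maps every member of dom into cod
    return frozenset(d for d in D if all(dot(d, e) in cod for e in dom))

# Hypothetical toy structure (NOT a lambda-model): D = {0, 1, 2}, d.e = (d + e) mod 3.
D = (0, 1, 2)

def dot(d, e):
    return (d + e) % 3

V = {"a": {0}, "b": {0, 1}}
```

Note that when the interpretation of σ is empty, clause (ii) makes every element of D a member of the interpretation of σ→τ; in the toy structure this happens for (b→a)→a.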
4B6.1 Note The above definition of satisfaction is called the simple semantics for
type-assignment. There are several other semantics-definitions in the literature, for
example see Hindley 1983 §§4-5 and Mitchell 1988 §3.
Many different particular λ-models are described in the literature (for some
examples see Barendregt 1984 Ch. 18 and HS 86 Ch. 12), but we shall not need to
know about them here. In fact the only model used in the completeness-proof below
will be the simplest of all kinds of model, a term-model.
4B7 Definition (The term-models TM_β, TM_βη) The domain D of TM_β is the set
of all β-equality-classes of λ-terms: in more detail, for each M we define
[M]_β = {P : P =_β M},
[M]_β·[N]_β = [MN]_β.
Terms are interpreted in TM_β thus: if FV(M) = {x₁,…,x_n} and E(x_i) = [Q_i]_β for
i = 1,…,n (for some terms Q₁,…,Q_n), define
⟦M⟧_E = [[Q₁/x₁,…,Q_n/x_n]M]_β.
4B7.2 Lemma (i) TM_β and TM_βη are λ-models and TM_βη is extensional.
(ii) In TM_β[η] we have ⟦M⟧_{E₀} = [M]_β[η] for all M.
Proof Routine. (For (i) see, e.g., Barendregt 1984 Prop. 5.2.12.)
4B8 Soundness Theorem Let Γ be any type-context, M any λ-term and τ any type.
Then
4B9 Completeness Theorem Let Γ be any type-context, M any λ-term and τ any
type. Then
where x₁,…,x_m are distinct. Extend Γ to an infinite set Γ⁺ in which every type in
the language of TA_λ is assigned to an infinite number of term-variables (and no
variable receives more than one type).
This can be done as follows. First list all the types as an infinite sequence, say
τ₁, τ₂, …. Then note that in 1A1 we assumed the λ-language to contain an infinite
number of term-variables, so there are an infinite number of variables distinct from
x₁,…,x_m and from the free variables of M; choose from these an infinite number
of disjoint infinite sequences, say
Then define
Γ⁺ = Γ ∪ {v_{i,j} : τ_i : i ≥ 1, j ≥ 1}.
In what follows, the notation "Γ⁺ ⊢_β P:σ" (for any given P and σ) will mean
that there is a finite subset Γ* of Γ⁺ such that Γ* ⊢_β P:σ.
In TM_β define an interpretation V₀ of the type-variables by setting, for each
type-variable a,
V₀(a) = {[P]_β : Γ⁺ ⊢_β P:a}.
Proof of (2) For "⟸", use rule (→E). To prove "⟹", suppose the upper clause
in (2) holds and, as a special case, take Q to be a variable z not occurring in P
and such that r+ contains the assignment z exists.)
Then by the upper clause in (2),
I"+ Hp Pz:tl.
Hence by rule (-+1),
F+-z I-p
This means that there is a finite subset F* of F+ - z such that
F* I-p
hence by 4B9
F* I-p P: ->rl,
which gives the "=>" part of (2). Hence (2) holds and the proof of (1) is complete.
Deduction of the theorem from (1) If Γ ⊨_β M:τ then in particular, for TM_β and E₀
and V₀ we have
[M]_β ∈ ⟦τ⟧_{V₀}.
Hence by (1),
Γ⁺ ⊢_β M:τ.
That is Γ* ⊢_β M:τ for some finite Γ* ⊆ Γ⁺. Hence by 4A3.2(vi),
(3) Γ* ↾ M ⊢_β M:τ.
But Γ* ↾ M = Γ ↾ M because Γ* ⊆ Γ⁺ and the extra term-variables in Γ⁺ were
chosen to be distinct from those in M. Hence by the weakening-property of ⊢,
Γ ⊢_β M:τ.
5 A version using typed terms
In Chapter 2 some care was taken to distinguish the Curry and Church approaches
to type-theory from each other. Curry's approach involved assigning types to pre-
existing untyped terms with each term receiving either an infinite set of types or
none at all, whereas in Church's the terms were defined with built-in types with
each term having a single type (see 2A3). In Curry's approach the types contained
variables, in Church's they contained only constants.
This book focuses on the Curry approach. However, even in this approach it turns
out to be very useful to introduce a typed-term language as an alternative notation
for TA_λ-deductions. Although the tree-notation introduced in Chapter 2 shows very
clearly what assumptions are needed in deducing what conclusions, it takes up a lot
of space and is hard to visualise when the deduction is in any way complicated. And
when manipulations and reductions of deductions are under discussion it is almost
unmanageable. A much more compact alternative notation is needed, and this is
what the typed terms in the present chapter will give.
We shall also define reduction of typed terms; typed terms will be shown in the
next chapter to encode deductions in propositional logic as well as in TA_λ, and
their reduction will be essentially the same as the reduction of deductions that is a
standard tool in proof theory.
By the way, a cynical reader might think we are simply abandoning Curry's
approach here and replacing it by Church's, but this is not so; the main positive
feature of Curry's approach is the expressive power gained from its use of type-
variables and the presence of an underlying language of untyped terms, and we shall
not abandon these. All we shall do is replace space-hungry deduction-tree diagrams
by neat and compact typed terms. And these terms will differ in several important
ways from those in a true Church-style system (see 5A1.5 for example); they will
simply be codes for TA_λ-deductions, nothing more.
5A Typed terms
It might be useful to read the detailed definition of a TA_λ-deduction in 9C in
parallel with this section.
5A1 Definition (Typed terms, TT(Γ)) Given a type-context Γ the set TT(Γ) of typed
terms relative to Γ is a set of expressions defined thus:
(i) if Γ contains x:σ then the expression x^σ is in TT(Γ) and is called a typed
variable;
(ii) if Γ₁ ∪ Γ₂ is consistent and M^{σ→τ} ∈ TT(Γ₁) and N^σ ∈ TT(Γ₂), then
5A1.1 Notation Typed terms will be abbreviated using the same conventions as
for untyped terms. Also some type-superscripts may be omitted. For example,
depending on which of its types are to be emphasised in a particular discussion, a
typed term (x^{ρ→σ}y^ρ)^σ may be called any of
5A1.2 Exercise (Compare 2A8.2-4.) Show that TT(∅) contains the typed terms
(λx^{a→b}y^{c→a}z^c.(x^{a→b}(y^{c→a}z^c)^a)^b)^{(a→b)→(c→a)→c→b},
(λx^a.x^a)^{a→a},
(λx^a y^b.x^a)^{a→b→a}.
5A1.3 Note (i) In the following term x appears decorated with two different types:
Q^b ≡ (x^{(a→a)→b}(λx^a.x^a)^{a→a})^b.
Despite this, Q^b is a genuine typed term, and in fact it is easy to verify that
Q^b ∈ TT(Γ), where Γ = {x:(a→a)→b}. Further, Q^b translates into a genuine
TA_λ-deduction in an obvious way.
(ii) In contrast, the following expression is not a typed term (and does not translate
into a TA_λ-deduction):
(x^{a→b}x^a)^b.
5A1.4 Warning The subterms of a term in TT(Γ) need not be in TT(Γ). For
example we have
(λx^a.x^a)^{a→a} ∈ TT(∅)
5A1.5 Warning If M^{σ→τ} and N^σ are typed terms we cannot always say that
(M^{σ→τ}N^σ)^τ is one. (Because if M^{σ→τ} ∈ TT(Γ₁) and N^σ ∈ TT(Γ₂) we cannot
apply 5A1(ii) unless Γ₁ ∪ Γ₂ is consistent.)
This is a crucial difference between the present typed terms and those of Church,
for example in Church 1940; the latter are typed in an "absolute" sense and have
the property that if M^{σ→τ} and N^σ are typed then so is (M^{σ→τ}N^σ)^τ, but the present
terms are only typed relatively to a given Γ.
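The consistency requirement behind Warnings 5A1.3 to 5A1.5 can be made concrete with a small checker. This is a minimal sketch under an assumed tuple encoding of typed terms; the names `check` and `is_typed_term` are hypothetical. The checker returns the type of an expression together with the type-assignments to its free variables, and rejects an application whose two parts assign different types to the same free variable.

```python
# Types: a type-variable is a str; an arrow type is ("->", sigma, tau).
# Typed expressions (assumed encoding):
#   ("var", x, sigma)        the typed variable x^sigma
#   ("app", M, N)            application
#   ("lam", x, rho, body)    abstraction of x^rho over body

def arrow(sigma, tau):
    return ("->", sigma, tau)

def check(term):
    """Return (type, context of free variables); raise TypeError if not a typed term."""
    if term[0] == "var":
        _, x, ty = term
        return ty, {x: ty}
    if term[0] == "app":
        fty, fctx = check(term[1])
        aty, actx = check(term[2])
        if isinstance(fty, str) or fty[1] != aty:
            raise TypeError("function part does not have a type sigma->tau matching the argument")
        for x in fctx.keys() & actx.keys():
            if fctx[x] != actx[x]:
                raise TypeError(f"contexts are inconsistent at {x}")
        return fty[2], {**fctx, **actx}
    _, x, ty, body = term
    bty, bctx = check(body)
    if x in bctx and bctx[x] != ty:
        raise TypeError(f"bound variable {x} occurs free in the body with another type")
    return arrow(ty, bty), {y: t for y, t in bctx.items() if y != x}

def is_typed_term(term):
    try:
        check(term)
        return True
    except TypeError:
        return False

# Q^b from Note 5A1.3(i): x carries two types, one free and one bound.
Qb = ("app", ("var", "x", arrow(arrow("a", "a"), "b")),
             ("lam", "x", "a", ("var", "x", "a")))
# The expression from Note 5A1.3(ii): x occurs free with types a->b and a.
bad = ("app", ("var", "x", arrow("a", "b")), ("var", "x", "a"))
```

With this encoding `check(Qb)` yields type `b` and the context {x:(a→a)→b}, while `is_typed_term(bad)` is false.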
The translation mappings between typed terms and TA_λ-deductions will now be
defined; they are very straightforward but will be given here in full for the sake of
precision.
5A5 Definition (Translating deductions to typed terms) Let Δ be a TA_λ-deduction
with conclusion Γ ↦ M:τ; define a typed term T_Δ ∈ TT(Γ) by following the clauses
of the definition of Δ (see 9C1) thus:
(i) if Δ is an axiom x:τ ↦ x:τ, define T_Δ ≡ x^τ;
(ii) if Δ is the result of applying (→E) to deductions Δ₁, Δ₂ of
Γ₁ ↦ P:σ→τ,   Γ₂ ↦ Q:σ,
define
T_Δ ≡ (T_Δ₁ T_Δ₂)^τ;
(iii) if Δ is the result of applying rule (→I) to a deduction Δ₁ of
Γ' ↦ P:σ,
define
T_Δ ≡ (λx^ρ.T_Δ₁)^{ρ→σ}.
for some Γ' ⊆ Γ₁ and Γ'' ⊆ Γ₂, define Δ(M^τ) to be the deduction obtained by
applying (→E) to Δ(P^{σ→τ}) and Δ(Q^σ);
(iii) if M^τ ≡ (λx^ρ.P^σ)^{ρ→σ}, τ ≡ ρ→σ, P^σ ∈ TT(Γ), Γ is consistent with {x:ρ}, and
Δ(P^σ) is a deduction of
Γ' ↦ P^f:σ
for some Γ' ⊆ Γ, define Δ(M^τ) to be the deduction obtained by applying (→I)
to Δ(P^σ), discharging x.
5B1 Definition The length, |M^τ|, of a typed term M^τ is the same as |M^f|, see 1A2.
5B2.1 Lemma (i) If 5B2(i) holds then {T^σ/P^σ}_p M^τ is a well-defined typed term and
Con({T^σ/P^σ}_p M^τ) ⊆ Con(M^τ).
(ii) If 5B2(ii) holds then {T^σ/P^σ}_p M^τ is well-defined and
Con({T^σ/P^σ}_p M^τ) ⊆ Con(M^τ) ∪ Con(T^σ).
5B2.2 Note If neither (i) nor (ii) holds in 5B2 we do not define {T^σ/P^σ}_p M^τ. In
this case the replacement of P^σ by T^σ in M^τ will still produce an expression of
some kind but it might not be a typed term. For example let
M^τ ≡ (x^{a→b}y^a)^b,   P^σ ≡ y^a,   T^σ ≡ x^a.
5B3 Definition Free and bound variable-occurrences, and the set FV(M^τ), are defined
just as in 1A6. (But all occurrences are now typed.)
5B5.1 Notes The restrictions in the above clauses may seem over-strong in that they
involve untyped variables and terms; but if they were weakened the last part of the
useful lemma 5B5.2 below would fail.
Clause (iv) includes the case ρ ≢ σ. But in this case x^σ cannot occur free in P^τ,
because if it did then λx^ρ.P^τ would not be a typed term (see Definition 5A1(iii)).
Thus substituting for x^σ should change nothing. And this is exactly what (iv) says.
No attempt has been made to define [N^ρ/x^σ] when ρ ≢ σ or [N^σ/x^σ]x^τ when
τ ≢ σ.
5B5.2 Lemma If ((λx^σ.M^τ)N^σ)^τ is a typed term with no bound-variable clashes, then
(i) [N^σ/x^σ]M^τ is defined and is a typed term with type τ,
(ii) [N^σ/x^σ]M^τ ≡ M^τ if x^σ ∉ FV(M^τ),
(iii) Con([N^σ/x^σ]M^τ) ⊆ (Con(M^τ) − x) ∪ Con(N^σ),
(iv) ([N^σ/x^σ]M^τ)^f ≡_α [N^f/x]M^f.
Proof Parts (i)-(iv) are proved together by a straightforward but boring induction on
|M^τ|. (The assumption about ((λx^σ.M^τ)N^σ)^τ implies in particular that Con(M^τ) − x
is consistent with Con(N^σ) and that x does not occur in N^σ.)
5B6 Definition (Typed α-conversion) The relation ≡_α is defined just as for untyped
terms (see 1A8), using replacements with form
(α)   λx^σ.M^τ ≡_α λy^σ.[y^σ/x^σ]M^τ   (y ∉ FV(M^f)).
5B6.1 Lemma If λx^σ.M^τ is a typed term and y ∉ FV(M^f), then [y^σ/x^σ]M^τ is defined
and both sides of (α) are typed terms with the same type and minimum context. Hence
the class of all typed terms is closed under α-conversion, and
5B6.2 Warning The condition "y ∉ FV(M^f)" in (α) cannot be weakened to "y^σ ∉
FV(M^τ)". Because if it were, we could α-convert a typed term to an expression that
was not one, thus:
λx^{a→b}.x^{a→b}y^a ≡_α λy^{a→b}.y^{a→b}y^a.
5B7 Definition (Typed redexes and reduction) Typed ▷_β and ▷_βη are defined just like
the untyped relations in 1B and 1C. In particular typed β- and η-redexes have form
((λx^σ.M^τ)^{σ→τ}N^σ)^τ,   (λx^σ.(P^{σ→τ}x^σ)^τ)^{σ→τ}
with x ∉ FV(P^f), and their contracta are, respectively,
[N^σ/x^σ]M^τ,   P^{σ→τ}.
Proof Prove (i) and (ii) together: for β use 5B5.2 and 5B2.1(i), and for η the proof
is straightforward.
5B7.2 Note (i) When typed terms are interpreted as TA_λ-deductions a β-reduction
of a typed term corresponds to a reduction of a deduction. (The idea of reducing
deductions originated in Prawitz 1965 and there is a modern account in Troelstra
and van Dalen 1988 Vol. 2 Ch. 10 §2; cf. also HS86 Def. 15.30.) In this interpretation
the above lemma says that if we reduce a deduction of Γ ↦ P:τ, the result will
still be a genuine deduction and will have conclusion Γ* ↦ Q:τ for some Γ* ⊆ Γ
and some Q obtained by β-reducing P.
(ii) Part (ii) of the above lemma is rather like the subject-reduction theorem
for TA_λ-deductions (2C1), but is not an exact analogue. A closer analogue is the
following lemma which will also be used later.
5B7.3 Lemma (i) Let P^τ be a typed term and P^f ▷_1β Q by contracting a β-redex R
at a position r in P^f. Then P^τ contains at position r a typed β-redex R^σ such that
R^f ≡ R, and contracting this R^σ changes P^τ to a typed term Q^τ such that
Q^f ≡ Q.
(ii) The same holds true for η-redexes.
Proof Same as the proof of 2C1 but with typed terms instead of TA_λ-deductions.
5B7.4 Lemma (i) A typed term P^τ has a β-reduction with length n iff P^f has a
β-reduction with length n.
(ii) A typed term P^τ is a β-nf iff P^f is a β-nf.
(iii) Both (i) and (ii) hold for η.
Proof Note first that if a typed term P^τ contains a β-redex then so does P^f. Then
(i)-(iii) follow from this and 5B7.3.
5B8 Typed Church-Rosser Theorem (i) If M^τ ▷_β P^τ and M^τ ▷_β Q^τ then there exists
a typed term T^τ such that P^τ ▷_β T^τ and Q^τ ▷_β T^τ.
(ii) The same holds for η- and βη-reductions.
5B8.1 Warning (No typed convertibility relation) No attempt is made here to define
a convertibility relation =_β or =_βη on typed terms by expansions and contractions
of redexes. First, we shall not need one. Second, since the typed terms in this
chapter correspond exactly to TA_λ-deductions, any attempt to define such a typed
equality would meet the same type-variation problems as were discussed for TA_λ in
2C. In fact, β-expanding a typed term may lead to an expression which is no longer
a typed term (cf. 2C2.2-6).
5C Normalization theorems
Along with most other type-theories, TA_λ has what is called the weak normalization
property:
(WN) every typable term has a normal form.
And many of them, including TA_λ, also have the strong normalization property:
(SN) all reductions of a typable term are finite.
These properties were stated for TA_λ in Theorems 2D5 and 2D6 without proof. In
the present section a proof will be given of the analogue of WN for typed terms, and
WN for typable terms will be deduced from it. No proof of SN will be given here
as there is already a detailed proof in our main reference (see HS86 Appendix 2).
5C1 Weak Normalization (WN) Theorem (Turing 1942, Curry and Feys 1958
Cor. 9F9.2, etc.) Every typed term P^ρ has both a β-nf and a βη-nf.
Proof The existence of a βη-nf follows from that of a β-nf by the typed analogue of
1C9.4. It remains to prove that P^ρ has a β-nf, i.e. to give a strategy for reducing P^ρ
and to prove this reduction terminates. This will be done by the method of Turing
1942, the earliest proof known and also the simplest. Some knowledge of 9B will be
needed.
Let P^ρ ∈ TT(Γ). Then each β-redex-occurrence R^τ in P^ρ has form
R^τ ≡ ((λx^σ.M^τ)^{σ→τ}N^σ)^τ.
We shall call (λx^σ.M^τ)^{σ→τ} the function part of R^τ and its type σ→τ the dominant type
of R^τ, and the number of atom-occurrences in σ→τ the degree of R^τ:
Deg(R^τ) = |σ→τ|.
Now let the contraction of R^τ change P^ρ to Q^ρ, and consider the residuals in
Q^ρ of β-redex-occurrences in P^ρ. (Residuals are defined in 9B2.) By 9B2.2(iii) the
degree of each residual is the same as that of the residual's parent in P^ρ.
Further, by the conclusion of 9B5.1, for each newly created β-redex-occurrence in
Q^ρ (i.e. one that is not a residual of one in P^ρ) its function-part is an occurrence
of either N^σ or [N^σ/x^σ]M^τ; thus its type is either σ or τ, and so the degree of each
newly created redex-occurrence is strictly less than the degree of R^τ.
Now let d(P^ρ) be the maximum of the degrees of all β-redex-occurrences in P^ρ
and let S be the set of all β-redex-occurrences in P^ρ with this degree. Reduce P^ρ
by a minimal complete development of S as defined in 9B4. By 9B4 and 9B4.1 this
reduction is finite and its result is a term P₁^ρ containing no residuals of members of
S. All the redexes in P₁^ρ are either newly created or residuals of redexes not in S;
hence
d(P₁^ρ) < d(P^ρ).
Continue this procedure by making further minimal complete developments of sets
of redexes with maximum degree; since each development strictly decreases d(P^ρ)
we must eventually obtain a β-normal form of P^ρ.
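The measure used in this proof is easy to compute. The sketch below illustrates only the degree of a redex and the maximum degree d; it does not implement minimal complete developments, and the two contractions in the example were carried out by hand. The tuple encoding of typed terms and the names `size`, `type_of`, `redex_degrees` and `d` are assumptions, and the input is assumed to be a genuine typed term (no checking is done).

```python
# Types: a type-variable is a str; an arrow type is ("->", sigma, tau).
# Typed terms (assumed encoding): ("var", x, sigma), ("app", M, N), ("lam", x, sigma, body).

def size(ty):
    """Number of atom-occurrences in a type."""
    return 1 if isinstance(ty, str) else size(ty[1]) + size(ty[2])

def type_of(t):
    """Type of a typed term that is assumed to be correctly formed."""
    if t[0] == "var":
        return t[2]
    if t[0] == "lam":
        return ("->", t[2], type_of(t[3]))
    return type_of(t[1])[2]              # application: result type of the function part

def redex_degrees(t):
    """Degrees of all beta-redex-occurrences in t."""
    if t[0] == "var":
        return []
    if t[0] == "lam":
        return redex_degrees(t[3])
    out = redex_degrees(t[1]) + redex_degrees(t[2])
    if t[1][0] == "lam":                 # t itself is a redex; its dominant type is type_of(t[1])
        out.append(size(type_of(t[1])))
    return out

def d(t):
    """Maximum redex degree, or 0 if t is a beta-normal form."""
    return max(redex_degrees(t), default=0)

a2a = ("->", "a", "a")
ident = ("lam", "y", "a", ("var", "y", "a"))
# P = (\f^{a->a}. f z)(\y^a. y): one redex, dominant type (a->a)->a, degree 3.
P = ("app", ("lam", "f", a2a, ("app", ("var", "f", a2a), ("var", "z", "a"))), ident)
# Q = (\y^a. y) z, obtained from P by hand; its redex is newly created, degree 2.
Q = ("app", ident, ("var", "z", "a"))
# R = z, obtained from Q by hand; no redexes remain.
R = ("var", "z", "a")
```

In this example the newly created redex in Q has a function part whose type is the σ of the contracted redex, so its degree is strictly smaller, as the proof asserts.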
5C1.2 Historical notes The first known proof of WN for a type-theory equivalent
to TA_λ was written around 1941 or '42 in unpublished notes by Alan Turing. (See
Gandy 1980a for a transcript of Turing's notes, and Gandy 1977 pp. 178-180 for
some comments on Turing's work in type-theory.)
But the first proof to be actually published was carried out by Curry in the late
1950's by a completely different technique (Curry and Feys 1958 §9F, especially
Cor. 9F9.2). Its key step was a cut-elimination theorem for a particular formulation
of TA_CL.¹
Also there is a WN theorem in Prawitz 1965 for reducing proofs in logic which is
essentially equivalent to WN for TA_λ.
From the 1960's onwards the Turing method of proof was re-discovered, probably
independently, several times (Morris 1968 §4F Thm. 2, Andrews 1971 Prop. 2.7.3,
for example), and WN theorems were proved by various methods for many stronger
type-theories than TA_λ.
In particular, the mid-60's saw a spate of proofs of WN for typed λ-calculi
enhanced by primitive recursion operators (for example those in Tait 1965, Hanatani
1966, Hinata 1967, Sanchis 1967, Tait 1967, Diller 1968, Dragalin 1968 and Howard
1970); see Troelstra 1973 §§2.2.1-2.3.13 or HS86 Ch. 18 for descriptions of the
background setting. At least two of these also included proofs of SN (see below).
5C2 Strong Normalization (SN) Theorem (Sanchis 1967 Thm. 8, Diller 1968 §6,
etc.) Let P^ρ ∈ TT(Γ) for some Γ; then, for β- and for βη-reductions,
(i) all reductions of P^ρ are finite,
(ii) there is an algorithm which accepts P^ρ as input and outputs a number k(P^ρ) such
that all reductions of P^ρ have length ≤ k(P^ρ).
Proof For (i) there is an accessible proof in HS86 Appendix 2: see Thm. A2.3 for
β, and Thm. A2.4 for βη. Alternatively, see Barendregt 1992 Thm. 4.3.6.
¹ Cut-elimination originated in the study of predicate logic in Gentzen 1935, and the relation between
cut-elimination and normalization is explored in Zucker 1974. It depends on the correspondence
between formulae and types to be described in the next chapter. Cut-elimination theorems for versions
of TA_λ can be found in Seldin 1977 and 1978.
For (ii), simply construct a finitely branching tree of reductions starting at P^ρ by
doing all possible contractions at each step. By (i) this tree's branches are all finite
in length, and the famous König's Lemma on trees states that in this case the whole
tree must be finite. Hence its branches can all be measured and the maximum of
their lengths determined. Call this number k(P^ρ).
But this algorithm is of course inefficient and the proof that k(P^ρ) is well-defined
depends on König's Lemma which is usually regarded as non-constructive. To
remedy these defects several workers have devised constructive proofs of (ii); see
Mints 1979, Gandy 1980b, de Vrijer 1987 and Schwichtenberg 1991. These proofs
give more efficient ways of computing suitable bounds k(P^ρ) (though the bounds
they give are not always the least possible, see comments in de Vrijer 1987).
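The naive algorithm described in the proof of (ii) can be sketched as follows. This is an illustration under stated assumptions, not one of the efficient constructions cited above: it works on type-erased terms (by 5B7.4 a typed term and its erasure have reductions of the same lengths), it uses de Bruijn indices instead of named variables to avoid bound-variable clashes, and it covers β-reduction only. The function names are hypothetical, and the search diverges if it is applied to a term that is not strongly normalizing.

```python
from functools import lru_cache

# Untyped terms with de Bruijn indices (assumed encoding):
#   an int n is a variable; ("lam", body); ("app", fun, arg).

def shift(t, delta, cutoff=0):
    """Add delta to every index in t that is >= cutoff."""
    if isinstance(t, int):
        return t + delta if t >= cutoff else t
    if t[0] == "lam":
        return ("lam", shift(t[1], delta, cutoff + 1))
    return ("app", shift(t[1], delta, cutoff), shift(t[2], delta, cutoff))

def subst(t, j, s):
    """Replace index j in t by the term s."""
    if isinstance(t, int):
        return s if t == j else t
    if t[0] == "lam":
        return ("lam", subst(t[1], j + 1, shift(s, 1)))
    return ("app", subst(t[1], j, s), subst(t[2], j, s))

def beta(body, arg):
    """Contractum of the redex (lam body) arg."""
    return shift(subst(body, 0, shift(arg, 1)), -1)

def reducts(t):
    """All results of contracting exactly one beta-redex-occurrence in t."""
    if isinstance(t, int):
        return []
    if t[0] == "lam":
        return [("lam", r) for r in reducts(t[1])]
    f, a = t[1], t[2]
    out = [("app", r, a) for r in reducts(f)] + [("app", f, r) for r in reducts(a)]
    if not isinstance(f, int) and f[0] == "lam":
        out.append(beta(f[1], a))
    return out

@lru_cache(maxsize=None)
def k(t):
    """Length of the longest beta-reduction starting at t."""
    return max((1 + k(r) for r in reducts(t)), default=0)

I = ("lam", 0)
TWO = ("lam", ("lam", ("app", 1, ("app", 1, 0))))      # erasure of \f x. f(f x)
M1 = ("app", TWO, I)
M2 = ("app", ("lam", ("lam", 0)), ("app", I, I))       # erasure of (\x y. y)(I I)
```

For `M2` the two reduction paths have lengths 1 and 2, so the maximum is 2; the memoisation via `lru_cache` only avoids re-exploring shared subtrees and does not change the result.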
5C2.2 Note The history of SN began over twenty years later than that of WN. The
first known explicit SN proofs were in Sanchis 1967 Thm. 8 and Diller 1968 §6, the
former for weak reduction in combinatory logic enhanced by primitive recursion
operators and the latter for λβ-reduction similarly enhanced.¹
Most published proofs have depended on defining what is usually known as a
computability predicate by a suitable induction, and proving first the SN property
for all computable terms and then the computability of all terms P^ρ by induction
on |P^ρ|. (The proof in HS86 Appendix 2 is typical of this approach, cf. also the WN
proof in Tait 1967.) This method has been successfully adapted to so many different
type-theories over the years that it is now in effect the standard one. In the early
1970's it was applied by Jean-Yves Girard in a generalised and strengthened form
to prove SN for his strong second-order type-theory known as System F (Girard et
al. 1989 Ch. 14), and it has since been applied to other second-order type-theories.
An analysis with references is in Gallier 1990. Another account is in Barendregt
1992; Barendregt's System λ2 is equivalent to Girard's System F.
¹ In each case manuscript versions were made available a couple of years before publication.
6 The correspondence with implication
One of the most interesting facts about TA_λ is that there is a very close correspondence
between this system and propositional logic, in which the types assignable
to closed terms in TA_λ turn out to be exactly the formulae provable in a certain
formal logic of implication. This correspondence is often called the Curry-Howard
isomorphism or the formulae-as-types isomorphism, and will be studied in this chapter.
The logic involved in this correspondence is not the classical logic of truth-tables
however, but that of the intuitionist philosophers; it will be defined in the first
section below.
The Curry-Howard isomorphism was first hinted at in print in Curry 1934 p.588,
and was made explicit in Curry 1942 p.60 footnote 28 and Curry and Feys 1958 §9E.
But it was viewed there as no more than a curiosity. The first people to see that it
could be extended to other connectives and quantifiers and used as a technical tool to
derive results were N. G. de Bruijn, William Howard and H. Läuchli in the 1960's.
See Howard 1969, de Bruijn 1980 (an introduction to de Bruijn's AUTOMATH
project which began in the 1960's), and Läuchli 1965.
This chapter will also define three rather interesting subsystems of intuitionist logic
and show that they correspond to the three restricted classes of λ-terms defined in
Section 1D. This correspondence was first noted by Carew Meredith in unpublished
work around 1951 and was explored in detail in the thesis Rezus 1981.
This similarity of style is no coincidence; when the earliest formulations of TA_λ were written Gentzen's
methods were just beginning to be known more widely; indeed Curry was one of the first to understand
their importance.
6A Intuitionist implicational logic
As mentioned earlier, the formulae provable in the logic defined below will turn
out to be exactly the types of closed A-terms. However, we can also go one step
deeper than this and get a correspondence between the proofs of these formulae and
the A-terms themselves. But to make this deeper correspondence work we must be
very careful about the definition of "proof" and "deduction" in a Natural Deduction
system, and we must clarify some features of such deductions that are not often
emphasised in the standard literature on the topic.
Therefore some of the definitions below will involve position-labels and other
tedious details that have so far been confined to Chapter 9. These will be needed
when determining exactly which deductions correspond to which λ-terms.
6A1.1 Note (i) Of course implicational formulae are exactly the same as types.
Parentheses will be omitted using the same association-to-the-right convention as
for types, see 2A1.1.
(PL) ((a→b)→a)→a,
which is known as Peirce's law. The logic developed by the intuitionists has attracted
continuing interest over the years from many logicians and computer scientists quite
independently of their philosophical views. For example many of the polymorphic
type-theories in the current literature have intuitionist logic as their basis, not
classical logic.
The formulae provable by the rules in the next definition can be shown to
coincide exactly with the implicational formulae that the intuitionists accept as
universally valid. (See Troelstra and van Dalen 1988, Ch.2 (System IPC) and Ch.10
§5 (separation theorem).)
76 6 The correspondence with implication
6A2 Definition Intuitionist logic (or more precisely, its implicational fragment) has
the following two rules:

(→E)   σ→τ    σ          (→I)    [σ]
       ----------                  :
           τ                       τ
                                -------
                                  σ→τ

Each application of (→I) is said to discharge (or cancel) some, all, or none of the
occurrences of σ above τ, and must be accompanied by a discharge label that lists
all the occurrences of σ it discharges (see below for details). If none are discharged
the application of (→I) is called vacuous. Discharged occurrences of σ at the tops
of branches must be marked by enclosing them in brackets.
Before formally defining deductions and proofs in this system it will be helpful
to look at some examples of how the rules are used. In these examples a position-
notation like that in 9A1-4 will be included in the deductions to allow the discharge
labels to say precisely which occurrences have been discharged. The position of each
formula will be written in parentheses beside it, and discharge labels will be written
between braces "{ }".
6A3.2 Note Comparing 6A2.2 with 6A3.1 might suggest that a succession of
partial discharges of the same formula can always be replaced by one complete
discharge followed by some vacuous ones. And this suggestion is correct. (Though
it would fail in restricted logics that forbid vacuous discharging.) Thus if partial
discharging were forbidden the class of all provable formulae would not be reduced.
However, the class of proofs would then become more restricted than the class of
λ-terms.
6A3.3 Example The following proof illustrates an earlier remark that the partial
discharge convention allows vacuous discharging of a even when the set of available
occurrences of a is not empty.
Hopefully this and the earlier examples have given the reader some feeling for
the concepts involved in a natural deduction. The formal definition of "natural
deduction" will now be given for precision's sake, though the reader may omit
it without much loss. In it, a natural deduction will be a tree in which each
node will carry either two or three labels: (1) a formula, (2) a position, and
(3) either a discharge-label (if the node represents the conclusion of (→I)) or a
pair of brackets (if the node represents a discharged assumption), or no third
label.
6A4 Definition Natural deductions in the Intuitionist logic of implication (or logic
deductions for short) are trees, defined thus.
(i) An atomic deduction is a single node with two labels τ, ∅, where τ is a formula
and ∅ is the empty position.
(ii) If Δ1 and Δ2 are natural deductions whose bottom nodes are labelled respectively
by σ→τ, ∅ and σ, ∅ (and possibly a third label each), define a new deduction
by first putting "1" on the left end of each position-label in Δ1 and "2" on the
left end of each position-label in Δ2, next doing the same to the positions in any
discharge-labels in Δ1 and Δ2, and finally placing an extra node beneath the two
modified deductions, as shown below.

   Modified Δ1        Modified Δ2
     σ→τ, 1              σ, 2
          \             /
             τ, ∅

(iii) If Δ1 is a natural deduction whose bottom node is labelled by τ, ∅ (and
possibly a third label), and σ is any formula, define a new deduction thus. First
choose a (possibly empty) set of unbracketed occurrences of σ at the tops of branches
in Δ1 and label them with brackets, then put "0" on the left end of every position
in Δ1, then place an extra node beneath the modified Δ1 as shown in the diagram
below, and finally give this node three labels: (1) σ→τ, (2) ∅, and (3) a discharge-label
listing the positions of the chosen occurrences of σ. (If the set of chosen occurrences
is empty we make the discharge-label say "vacuous".)
6B The Curry-Howard isomorphism 79
      τ, 0
       |
    σ→τ, ∅   {discharging assumptions at p1, ..., pk}
6A4.1 Notation In practice natural deductions are never displayed in the space-
wasting form shown in these diagrams, but in the horizontal-line form used in the
examples earlier.
6A5 Definition (Deducibility and provability) Iff there is a deduction whose conclusion
is τ and whose undischarged assumptions are occurrences of members of
{σ1, ..., σn}, we say
    σ1, ..., σn ⊢_Int τ.
(This notation may be used when σ1, ..., σn are not distinct.) Iff n = 0 the deduction
is called a proof and τ a provable formula or theorem, and we say
    ⊢_Int τ.
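The tree-building clauses of 6A4 and the notion of deducibility in 6A5 can be mirrored in a few lines of Python. This is only a sketch under assumptions that are not in the text: formulae are strings (atoms) or tuples ("->", σ, τ), a deduction records just its conclusion and its branch-top assumptions keyed by position, the empty position ∅ is written as the empty string, and the names imp, Deduction, assume, elim, intro and undischarged are hypothetical. Positions passed to intro are those in the deduction before the "0" is prefixed.

```python
def imp(s, t):
    """The implicational formula s -> t (atoms are plain strings)."""
    return ("->", s, t)


class Deduction:
    def __init__(self, formula, leaves, label=None):
        self.formula = formula  # conclusion at the bottom node
        self.leaves = leaves    # position -> (assumption, discharged?)
        self.label = label      # discharge-label of the last (->I), if any


def assume(f):
    # 6A4(i): an atomic deduction, a single node at the empty position.
    return Deduction(f, {"": (f, False)})


def _prefixed(d, digit):
    return {digit + p: v for p, v in d.leaves.items()}


def elim(major, minor):
    # 6A4(ii): rule (->E); positions get "1" (left) or "2" (right) prefixed.
    f = major.formula
    if not (isinstance(f, tuple) and f[1] == minor.formula):
        raise ValueError("(->E) does not apply")
    return Deduction(f[2], {**_prefixed(major, "1"), **_prefixed(minor, "2")})


def intro(sigma, d, positions=()):
    # 6A4(iii): rule (->I), discharging the chosen occurrences of sigma.
    leaves = dict(d.leaves)
    for p in positions:
        f, discharged = leaves[p]
        if discharged or f != sigma:
            raise ValueError("not an undischarged occurrence of sigma")
        leaves[p] = (f, True)
    new = Deduction(imp(sigma, d.formula), leaves)
    new.leaves = _prefixed(new, "0")
    new.label = tuple("0" + p for p in positions) or "vacuous"
    return new


def undischarged(d):
    # 6A5: the assumptions sigma_1, ..., sigma_n; a proof has none.
    return [f for _, (f, done) in sorted(d.leaves.items()) if not done]


# A proof of (W), (a->a->b)->a->b, using one multiple discharge.
major = imp("a", imp("a", "b"))
body = elim(elim(assume(major), assume("a")), assume("a"))
w_proof = intro(major, intro("a", body, ["12", "2"]), ["011"])
```

Here undischarged(w_proof) is empty, so w_proof counts as a proof in the sense of 6A5, while body on its own only shows a→a→b, a, a ⊢_Int b.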
        (Δ1)                 (Δ2)
   Γ1 ↦ P:σ→τ           Γ2 ↦ Q:σ
   --------------------------------- (→E)
        Γ1 ∪ Γ2 ↦ (PQ):τ,
construct Δ_L by applying rule (→E) of 6A2 to Δ1_L and Δ2_L.
(iii) If M ≡ λx.P, τ ≡ ρ→σ, Γ = Γ' − x and the last step in Δ is
        (Δ')
     Γ' ↦ P:σ
   ------------------------
   Γ' − x ↦ (λx.P):(ρ→σ),
6B1.3 Note The λ-terms to which Δ1 and Δ2 assign types in the two above examples
are distinct, although they have the same tree-structure. Correspondingly (Δ1)_L and
(Δ2)_L are distinct, though they only differ in their discharge-labels.
(B) (a→b)→(c→a)→c→b,
(B') (a→b)→(b→c)→a→c,
(I) a→a,
(K) a→b→a,
(C) (a→b→c)→b→a→c,
(W) (a→a→b)→a→b.
6B3 Warning The Curry-Howard mapping is not one-to-one. For example consider
the following TA_λ-deductions Δ3 and Δ4 which assign types to xyz and xyy.

Δ3:   x:a→a→c ↦ x:a→a→c      y:a ↦ y:a
      ----------------------------------- (→E)
         x:a→a→c, y:a ↦ xy:a→c            z:a ↦ z:a
         -------------------------------------------- (→E)
                x:a→a→c, y:a, z:a ↦ xyz:c
It is easy to check that the Curry-Howard mapping assigns the same logic
deduction to both, namely
    a→a→c    a
    ----------- (→E)
       a→c        a
       ------------ (→E)
            c
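The collapse described in 6B3 can be checked mechanically. The sketch below is an illustration only, under assumptions that are not in the text: a TA_λ-deduction is represented by its subject together with a context, bound variables carry their types explicitly, positions are strings of digits with the empty string standing for ∅, and the name to_logic is hypothetical. The function erases subjects and records, at each (→I) step, the positions of the assumptions it discharges.

```python
def imp(s, t):
    return ("->", s, t)


def to_logic(term, ctx, pos=""):
    """Return (formula, logic tree, positions of free variable occurrences)."""
    kind = term[0]
    if kind == "var":
        f = ctx[term[1]]
        return f, ("assume", f, pos), {term[1]: [pos]}
    if kind == "app":
        f1, t1, o1 = to_logic(term[1], ctx, pos + "1")
        f2, t2, o2 = to_logic(term[2], ctx, pos + "2")
        if not (isinstance(f1, tuple) and f1[1] == f2):
            raise TypeError("(->E) does not apply")
        occ = {v: o1.get(v, []) + o2.get(v, []) for v in {**o1, **o2}}
        return f1[2], ("E", f1[2], pos, t1, t2), occ
    if kind == "lam":
        _, x, sigma, body = term
        f, t, occ = to_logic(body, {**ctx, x: sigma}, pos + "0")
        # The discharge-label lists where x occurred free in the body.
        label = tuple(occ.pop(x, []))
        return imp(sigma, f), ("I", imp(sigma, f), pos, label, t), occ
    raise ValueError("unknown term")


def var(name):
    return ("var", name)


def app(m, n):
    return ("app", m, n)


ctx = {"x": imp("a", imp("a", "c")), "y": "a", "z": "a"}
xyz = app(app(var("x"), var("y")), var("z"))
xyy = app(app(var("x"), var("y")), var("y"))

# Two closed terms with the same type a->a->a (an example chosen here).
k1 = ("lam", "x", "a", ("lam", "y", "a", var("x")))
k2 = ("lam", "x", "a", ("lam", "y", "a", var("y")))
```

With these definitions the trees for xyz and xyy coincide, since the term-variables that distinguished them have been erased, whereas the trees for k1 and k2 have the same shape and differ only in their discharge-labels.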
x:τ ↦ x:τ.
(ii) If the last step in Δ is (→E) applied to the conclusions of deductions Δ' and
Δ'', and Δ'_λ and Δ''_λ have been defined and are deductions of
    Γ' ↦ M:σ→τ,    Γ'' ↦ N:σ,
then replace all term-variables in Δ''_λ by distinct new ones (to make it have no
term-variables in common with Δ'_λ, neither free nor bound), and apply the TA_λ-rule
(→E); call the resulting deduction Δ_λ.
(iii) If the last step in Δ is an occurrence of (→I) with form
    a deduction Δ'
    -------------- {discharging k ≥ 0 occurrences p1,...,pk of ρ},
         ρ→σ
where v1,...,vk are distinct from each other and from the variables in Γ, and v1,...,vk
occur free in P at the same positions as p1,...,pk have in Δ' (each vi occurring only
once in P), then proceed as follows.
If k ≥ 1, replace all of v1,...,vk in Δ'_λ by one new variable x that does not occur
elsewhere in Δ'_λ; this changes Δ'_λ to a deduction of
    Γ, x:ρ ↦ P*:σ    (P* ≡ [x/v1,...,x/vk]P)
where x occurs in P* at exactly the same positions as p1,...,pk have in Δ'. Then
apply (→I)_main to this modified Δ'_λ and call the resulting deduction Δ_λ. Its conclusion
will be
6B4.1 Note In the above definition clause (ii) makes as many variables as possible
distinct, and then (iii) identifies some of these again. Lemma 6B5 will show that Δ_λ
is well-defined for every Δ.
6B4.2 Examples (i) If Δ is the following logic deduction, then Δ_λ is the TA_λ-deduction
Δ3 in 6B3 (modulo replacements of distinct term-variables by distinct
term-variables).

    a→a→c    a
    ----------- (→E)
       a→c        a
       ------------ (→E)
            c
[a] (00)
(→I)
a→a (0) a
(→I)
a vacuously}
(Δ_λ)_L ≡ Δ.
6B5.1 Note The Curry-Howard mapping from Δ to Δ_L is usually called the Curry-Howard
isomorphism: how far can it justify this description? There are three levels
to consider:
(i) provable formulae ↔ types of closed terms,
(ii) logic proofs ↔ TA_λ-proofs,
(iii) logic deductions ↔ TA_λ-deductions.
By 6B5, the Δ_L-mapping has a one-sided inverse Δ_λ such that (Δ_λ)_L ≡ Δ, but for it
to be an isomorphism in any real sense we should also have (Δ_L)_λ ≡ Δ. We have
seen in 6B3 that this breaks down on level (iii). But on levels (i) and (ii) it holds
true, as the next theorem will show.
But before that theorem a lemma will describe exactly how far the identity
(Δ_L)_λ ≡ Δ breaks down.
6B6 Lemma Let Δ be a TA_λ-deduction of Γ ↦ P:τ. Then (Δ_L)_λ differs from Δ only
in that some term-variables distinct in (Δ_L)_λ may be identical in Δ. In more detail:
(i) (Δ_L)_λ is a TA_λ-deduction of a formula with form
    Γ' ↦ M:τ
where, just as in 6B5, M has no bound-variable clashes, FV(M) = {x1,...,xn}, and
each xi occurs just once in M; also, for some v1,...,vn not necessarily distinct,
(ii) if Δ is a proof of ↦ P:τ then (Δ_L)_λ is a proof of ↦ P':τ for some P' ≡_α P.
Proof Part (i) is proved by induction on |M| with three cases following the clauses
of 6B1. (The notation "[v1/x1]...[vn/xn](Δ_L)_λ" means the deduction obtained by
doing the substitutions [v1/x1],...,[vn/xn] in all subjects in (Δ_L)_λ.)
Part (ii) is a special case of (i).
    (Δ_L)_λ ≡ Δ
(modulo ≡_α in subjects in (Δ_L)_λ), and for all logic proofs Δ,
    (Δ_λ)_L ≡ Δ.
Proof For (i) and (iii) use 6B2 and 6B6. For (ii) "if", use 6B2.
For (ii) "only if", let σ1,...,σn ⊢ τ be obtained by a logic deduction Δ, and apply
rule (→I) n times to change Δ into a proof, Δ*, of
for some closed λ-term N with no bound-variable clashes. Hence by the subject-construction
theorem, N must have form λx1...xn.M for some M and some distinct
x1,...,xn, and (Δ*)_λ must contain the formula
    x1:σ1,...,xn:σn ↦ M:τ.
(By the way, M has no bound-variable clashes because N has none.)
6B7.3 Note An algorithm to decide whether there exists a closed typed term M^τ will
be described in 8D5.2. Via the above corollary this will give a decision-algorithm
for Intuitionist implicational logic. Of course decision-procedures for this logic have
been known for a long time, but the one in 8D5.2 will also count the number of
closed terms M^τ in β-normal form and will generate them one by one. This can be
regarded as counting and generating all the irreducible Natural Deduction proofs
of τ if we think of each M^τ as representing a proof via the Curry-Howard mapping.
Proof By 8B6, Peirce's Law is not the type of a closed normal form; hence by the
SN theorem it cannot be the type of any closed term.
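Provability in the logic of 6A2 can also be decided by a simple backward search. The following is a minimal sketch of one such procedure, a loop-checked search for proofs in long normal form. It is not the algorithm of 8D5.2 and it neither counts nor generates proofs; the names split and provable and the tuple encoding of formulae are assumptions made for illustration.

```python
def imp(s, t):
    return ("->", s, t)


def split(f):
    """Write f as p1 -> ... -> pn -> atom and return ([p1, ..., pn], atom)."""
    premises = []
    while isinstance(f, tuple):
        premises.append(f[1])
        f = f[2]
    return premises, f


def provable(goal, hyps=frozenset(), seen=frozenset()):
    # Move the premises of the goal into the assumptions, leaving an atom.
    premises, atom = split(goal)
    hyps = frozenset(hyps) | frozenset(premises)
    state = (hyps, atom)
    if state in seen:
        # The same problem already occurs lower on this branch: prune it.
        return False
    seen = seen | {state}
    # Use an assumption whose final atom matches, then prove its premises.
    for h in hyps:
        h_premises, h_atom = split(h)
        if h_atom == atom and all(provable(p, hyps, seen) for p in h_premises):
            return True
    return False


PEIRCE = imp(imp(imp("a", "b"), "a"), "a")
K = imp("a", imp("b", "a"))
W = imp(imp("a", imp("a", "b")), imp("a", "b"))
```

The search terminates because every assumption set consists of subformulae of the original goal and no state is visited twice on a branch. On these inputs it accepts (K) and (W) and rejects Peirce's law, in agreement with the result just proved.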
6B8 Note The Curry-Howard theorem has shown that the Intuitionist logic of pure
implication corresponds in a very neat and natural way to the pure λ-calculus. Over
the years this correspondence has been extended to more expressive logics and even
to quite strong mathematical systems, and this has led to some useful techniques in
proof theory, for example methods for extracting programs from proofs of existence.
(See Crossley and Shepherdson 1993 for a modern account.)
But although each of these extensions is very important from a practical point of
view, few of them have the directness and neatness of the original "core" system.
6C Some weaker logics
6C1 Definition (The relevance logic R_→.) The definition of R_→ is exactly like that
of Intuitionist implicational logic in 6A2 except that vacuous discharging is not
allowed. That is, when σ→τ is the conclusion of rule (→I) its discharge-label must
show at least one occurrence of σ.
6C1.1 Note (Motivation for R_→.) In one important view of implication, a formula
σ→τ should not be provable unless σ is in some way relevant to τ. In this view the
formula
    a→b→a
is not universally valid, because it says in essence that if a statement a is true then
every other statement b implies it, even when b has no connection with the meaning
of a. There have been many different attempts to capture the notion of relevant
implication by formal systems and R_→ is one of the earliest and simplest. In it,
roughly speaking, we can only prove σ→τ when σ has actually been used in the
deduction of τ.
More about R_→ can be found in Anderson and Belnap 1975, especially §§3, 7, 8.3,
8.4, 8.18 and 9-14, and in Anderson et al. 1992, especially §§47.1, 63.2 and 71.2. The
system has been invented independently at several different times; see Moh 1950,
Church 1951 and the discussion in Došen 1992b. It is known to be decidable (i.e.
there is an algorithm to decide whether a given formula is provable in R_→); see
Anderson and Belnap 1975 §13 and Anderson et al. 1992 §63.2
6C1.2 Examples Of the proofs in the examples in 6A, those in 6A2.1 and 6A2.2 are
correct R_→-proofs. In particular, by 6A2.2, the formula
6C2 Definition (BCK-logic) The system that will be called here BCK-logic (or more
precisely, the implicational fragment of BCK-logic) is defined exactly like Intuitionist
logic in 6A2, except that multiple discharging is not allowed. That is, when (→I) is
used its discharge-label must either be vacuous or contain only one occurrence of σ.
The logics defined in the present section and some others like them have come to be known as
substructural logics.
Though in the literature R_→ is often presented as the implicational fragment of a system R which has
other connectives besides "→" and has been proved undecidable (Urquhart 1984 or Anderson et al.
1992 §65).
6C2.2 Examples Of the proofs constructed in Exercise 6B2.1, it is easy to see that
those of (B), (B'), (I), (K) and (C) are correct BCK-proofs.
But the proof of (W) in 6A2.1 is not, because it contains a multiple discharge.
Of the two proofs of (a→a→c)→a→a→c in 6A2.2 and 6A3.1, the former is a
BCK-proof but the latter is not.
The proof of a→a→a in 6A3.3 is a BCK-proof.
6C3 Definition (BCI-logic) The definition of BCI-logic (to be precise, its implicational
fragment) is like 6A2 except that both vacuous and multiple discharging are
forbidden; i.e., when rule (→I) is used its discharge-label must contain exactly one
occurrence of σ.
6C3.2 Examples Of the proofs constructed in Exercise 6B2.1, those of (B), (B'), (I),
and (C) are correct BCI-proofs.
But the proofs of (K) and (W) are not.
Of the two proofs of (a→a→c)→a→a→c in 6A2.2 and 6A3.1, the former is a
BCI-proof but the latter is not.
The proof of a→a→a in 6A3.3 is not a BCI-proof.
6C4 Lemma The sets of provable formulae of the four logics defined in 6B-C are
related as follows:
(i) BCI ⊂ BCK ⊂ Intuitionist,
(ii) BCI ⊂ R_→ ⊂ Intuitionist.
6C5 Refined Curry-Howard Theorem (i) The provable formulae of the three logics
defined in 6C1-3 are exactly the types of the following λ-terms:
    R_→ : types of the closed λI-terms;
    BCK-logic : types of the closed BCKλ-terms;
    BCI-logic : types of the closed BCIλ-terms.
(ii) The relation σ1,...,σn ⊢ τ holds in R_→, BCK- or BCI-logic iff there exist M
and x1,...,xn (distinct) such that
    x1:σ1,...,xn:σn ⊢_λ M:τ
6C5.1 Lemma Let Δ be a TA_λ-deduction of Γ ↦ M:τ and let Δ_L be defined as in
6B1. Then
(i) if M is a λI-term, Δ_L is an R_→-deduction,
(ii) if M is a BCKλ-term, Δ_L is a BCK-deduction,
(iii) if M is a BCIλ-term, Δ_L is a BCI-deduction.
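By 6C5 the three logics can be told apart by counting how often each bound variable is used. The sketch below assumes the usual reading of the definitions in Section 1D for closed terms (λI: every subterm λx.M has x free in M at least once; BCKλ: at most once; BCIλ: exactly once). Section 1D is not reproduced here, so this reading is an assumption, as are the term encoding and the names count_free, binder_counts and classify.

```python
def count_free(term, x):
    """Number of free occurrences of the variable x in term."""
    kind = term[0]
    if kind == "var":
        return 1 if term[1] == x else 0
    if kind == "app":
        return count_free(term[1], x) + count_free(term[2], x)
    _, y, body = term
    return 0 if y == x else count_free(body, x)


def binder_counts(term):
    """For each subterm \\x.M, how often x occurs free in M."""
    kind = term[0]
    if kind == "var":
        return []
    if kind == "app":
        return binder_counts(term[1]) + binder_counts(term[2])
    _, x, body = term
    return [count_free(body, x)] + binder_counts(body)


def classify(term):
    # Intended for closed terms only; free variables are not examined.
    counts = binder_counts(term)
    return {
        "lambda_I": all(c >= 1 for c in counts),  # no vacuous discharge
        "BCK": all(c <= 1 for c in counts),       # no multiple discharge
        "BCI": all(c == 1 for c in counts),       # neither
    }


def lam(x, body):
    return ("lam", x, body)


def app(m, n):
    return ("app", m, n)


x, y, z = ("var", "x"), ("var", "y"), ("var", "z")
B = lam("x", lam("y", lam("z", app(x, app(y, z)))))
C = lam("x", lam("y", lam("z", app(app(x, z), y))))
K = lam("x", lam("y", x))
W = lam("x", lam("y", app(app(x, y), y)))
I = lam("x", x)
```

On these five terms the classification agrees with Examples 6C2.2 and 6C3.2: K is a BCKλ-term but not a λI-term, W is a λI-term but not a BCKλ-term, and B, C and I belong to all three classes.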
6D Axiom-based versions
As a matter of historical fact the four logics discussed in this chapter all made their
first appearance not in the Natural Deduction form given above but in an axiom-
based version. These axiom-based versions will play a role in the next chapter, so
as a preparation they will be defined and discussed in this section.
But before defining the four logics separately some of the properties of axiom-
based systems in general will be described, relative to an arbitrary set A of axioms.
Axiomatic systems are often called Hilbert-style systems to contrast with Natural
Deduction.
6D1 Definition (A-logics) Let A be any set of implicational formulae that are
tautologies in the classical truth-table sense. Then A generates the following Hilbert-
style system, which will be called the corresponding A-logic.
Axioms: the members of A.
Deduction-rules:

(→E)   σ→τ    σ
       ----------
           τ

(Sub)     τ       [if s is a substitution and no variable in Dom(s) occurs
        ------     in a non-axiom assumption in the deduction above the line]
         s(τ)

Deductions in an A-logic are trees, with axioms and assumptions at the tops of
branches and the conclusion at the bottom of the tree. The notation
    σ1,...,σn ⊢_A τ
means that there is a deduction whose non-axiom assumptions are some or all of
σ1,...,σn and whose conclusion is τ. (σ1,...,σn need not all be distinct.)
When n = 0 the deduction is called a proof of τ and we call τ a provable formula
or theorem of the A-logic in question, and we say
    ⊢_A τ.
6D1.1 Note (i) Rule (→E) is often called modus ponens or the detachment rule. In
it, σ→τ is called its major premise and σ its minor premise.
(ii) Rule (Sub) is the substitution rule. Its side-condition is standard in propositional
logic and says, roughly speaking, that substitutions may be made for variables
occurring only in axioms. It excludes such deductions as
      a
    ----- [b/a].
      b
6D1.2 Example Let A contain the formulae (C) and (K) shown in 6B2.1. Then the
following deduction gives ⊢_A a→a. (Both its substitution-steps use the substitution
[(a→b→a)/b, a/c].)

           (C)                               (K)
     (a→b→c)→b→a→c                          a→b→a
  --------------------------- (Sub)     ------------- (Sub)
  (a→(a→b→a)→a)→(a→b→a)→a→a              a→(a→b→a)→a
  ---------------------------------------------------- (→E)       (K)
                  (a→b→a)→a→a                                    a→b→a
                  ------------------------------------------------------ (→E)
                                      a→a
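Example 6D1.2 can be replayed step by step. The sketch below is illustrative only: formulae are strings (atoms) or tuples ("->", σ, τ), and the helper names subst and mp are hypothetical. Rule (Sub) is applied here only to axioms, so its side-condition holds trivially and is not checked.

```python
def imp(s, t):
    return ("->", s, t)


def subst(s, f):
    """Apply the simultaneous substitution s (a dict) to the formula f."""
    if isinstance(f, str):
        return s.get(f, f)
    return imp(subst(s, f[1]), subst(s, f[2]))


def mp(major, minor):
    """Rule (->E): from sigma->tau and sigma infer tau."""
    if isinstance(major, tuple) and major[1] == minor:
        return major[2]
    raise ValueError("(->E) does not apply")


C = imp(imp("a", imp("b", "c")), imp("b", imp("a", "c")))
K = imp("a", imp("b", "a"))

s = {"b": K, "c": "a"}                      # the substitution [(a->b->a)/b, a/c]
step1 = mp(subst(s, C), subst(s, K))        # (a->b->a)->a->a
conclusion = mp(step1, K)                   # a->a
```

The two uses of mp correspond to the two (→E) steps of the example, and conclusion is the formula (I).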
Proof Suppose rule (Sub) is applied below an application of rule (→E), as follows:

    σ→τ    σ
    ---------- (→E)
        τ
    ---------- (Sub)
       s(τ)

Then (Sub) can be moved up above (→E), thus:

       σ→τ                  σ
    ----------- (Sub)    ------ (Sub)
    s(σ)→s(τ)             s(σ)
    --------------------------- (→E)
              s(τ)

Also two successive (Sub)'s can be combined into one. The moving-up procedure
ends when all (Sub)'s are at the tops of branches in the deduction-tree. By the
restriction on (Sub) in 6D1, the top formula of each of these branches cannot be a
non-axiom assumption, so it must be an axiom. □
6D3 Definition Hilbert-style Intuitionist logic (of implication only) is the A-logic
whose A has just the following four members:
(B) (a→b)→(c→a)→c→b,
(C) (a→b→c)→b→a→c,
(K) a→b→a,
(W) (a→a→b)→a→b.
(I) a→a.
6D5 Definition Hilbert-style BCK-logic (of implication only) is the A-logic for the
set A = {(B), (C), (K)}.
6D6 Definition Hilbert-style BCI-logic (of implication only) is the A-logic for the
set A = {(B), (C), (I)}.
6D6.1 Note (i) By Example 6D1.2, (I) is provable in Hilbert-style BCK-logic and
Intuitionist logic.
(ii) Of course (B), (C), (I), (K) and (W) are the principal types of the λ-terms B,
C, I, K and W, see Table 3E2a. But each one also expresses a property of implication
that has its own interest quite independently of type-theory. Roughly speaking, (I) is
the reflexivity property of implication, (C) states that hypotheses can be commuted,
(K) that redundant hypotheses can be added, (W) that duplicates can be removed.
Finally (B) can be viewed either as a transitivity property of implication or as a
"right-handed" replacement property which says that if a→b holds, then a may be
replaced by b in the formula c→a.
(iii) The next theorem will describe the precise connection between A-logics and
λ-calculi; it will be a Hilbert-style analogue of the Curry-Howard theorem at the
end of the previous section. As mentioned earlier, the type-system that corresponds
most naturally to a Hilbert-style logic is combinatory logic not λ-calculus, but we
can build a correspondence with λ-calculus if we are careful; the key will be the
concept of applicative combination defined in 9F1.
6D7 Theorem (Curry-Howard for Hilbert systems) Let {C1, C2, ...} be a finite or
infinite set of typable closed λ-terms and let A = {γ1, γ2, ...} where γi ≡ PT(Ci). Then
(i) the theorems of A-logic are exactly the types of the typable applicative combinations
of C1, C2, ...,
(ii) the relation σ1,...,σn ⊢_A τ holds iff there exist an applicative combination M of
C1, C2, ... and some distinct term-variables x1,...,xn, such that
    x1:σ1,...,xn:σn ⊢_λ M:τ.
Proof Part (i) is the case n = 0 of (ii). To prove (ii) we first prove "if". Let M be
an applicative combination of x1,...,xn, C1, C2, ..., and let Δ be a TA_λ-deduction of
(1) x1:σ1,...,xn:σn ↦ M:τ.
Corresponding to each occurrence of a Ci in M there will be an occurrence of
↦ Ci:s(γi) in Δ for some substitution s. Remove from Δ all steps above these
occurrences of C1, C2, ..., and replace each formula ↦ Ci:s(γi) by the type s(γi).
Then replace every other formula in Δ, say Γ ↦ N:ρ, by the type ρ. (Cf. 6B1(i)-(ii).)
The result is a Hilbert-style deduction giving
(2) σ1,...,σn ⊢_A τ.
To prove "only if" in (ii), let σ1,...,σn ⊢ τ in A-logic. Then by 6D2.1 there
is a deduction Δ of τ in which (Sub) is only applied to axioms. Change Δ to a
TA_λ-deduction thus. First choose some distinct term-variables x1,...,xn and replace
each undischarged branch-top occurrence of each σi in Δ by
    xi:σi ↦ xi:σi.
Next, note that each application of (Sub) in Δ will be applied to an axiom γj to give,
say, s(γj); replace it by a TA_λ-proof of ↦ Cj:s(γj). Then replace the logic rule
(→E) by the TA_λ-rule (→E) throughout. The result is a TA_λ-deduction of (1) for
some term M as required.
6D7.1 Corollary The relation σ1,...,σn ⊢ τ holds in the Hilbert version of Intuitionist
logic, R_→, BCK- or BCI-logic iff
    x1:σ1,...,xn:σn ⊢_λ M:τ
holds for some distinct x1,...,xn and some M which is an applicative combination of
x1,...,xn and, respectively, {B, C, K, W}, {B, C, I, W}, {B, C, K} or {B, C, I}.
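Part (i) of 6D7 suggests a direct way to obtain theorems of an A-logic: compute the principal types of closed applicative combinations of the terms whose principal types are the axioms. The sketch below does this by first-order unification. It is a simplified stand-in for the PT algorithm of 3E1, not a transcription of it; the encoding of a combination as nested pairs, the renaming scheme and all function names are assumptions made for illustration.

```python
import itertools


def imp(s, t):
    return ("->", s, t)


def walk(t, s):
    # Follow variable bindings in the substitution s.
    while isinstance(t, str) and t in s:
        t = s[t]
    return t


def occurs(v, t, s):
    t = walk(t, s)
    if isinstance(t, str):
        return t == v
    return occurs(v, t[1], s) or occurs(v, t[2], s)


def unify(t1, t2, s):
    """Extend s to a most general unifier of t1 and t2, or return None."""
    t1, t2 = walk(t1, s), walk(t2, s)
    if t1 == t2:
        return s
    if isinstance(t1, str):
        return None if occurs(t1, t2, s) else {**s, t1: t2}
    if isinstance(t2, str):
        return unify(t2, t1, s)
    s = unify(t1[1], t2[1], s)
    return None if s is None else unify(t1[2], t2[2], s)


def resolve(t, s):
    t = walk(t, s)
    return t if isinstance(t, str) else imp(resolve(t[1], s), resolve(t[2], s))


def rename(t, n):
    # Give each use of an axiom its own copy of the type-variables.
    return f"{t}_{n}" if isinstance(t, str) else imp(rename(t[1], n), rename(t[2], n))


def principal_type(comb, basis, counter=None):
    """comb is a name in basis or a pair (M, N) standing for the application MN."""
    counter = itertools.count() if counter is None else counter
    if isinstance(comb, str):
        return rename(basis[comb], next(counter))
    fun = principal_type(comb[0], basis, counter)
    arg = principal_type(comb[1], basis, counter)
    if fun is None or arg is None:
        return None
    result = f"r_{next(counter)}"
    s = unify(fun, imp(arg, result), {})
    return None if s is None else resolve(result, s)


BASIS = {
    "C": imp(imp("a", imp("b", "c")), imp("b", imp("a", "c"))),
    "K": imp("a", imp("b", "a")),
    "W": imp(imp("a", imp("a", "b")), imp("a", "b")),
    "I": imp("a", "a"),
}
ckk = principal_type((("C", "K"), "K"), BASIS)
```

The value ckk has the shape v→v for a single type-variable v, which matches the proof of a→a from (C) and (K) in Example 6D1.2, while an untypable combination such as WI yields None.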
6D8 Theorem (Hilbert-Gentzen link) For the Intuitionist, R_→-, BCK- and BCI-logics
the relation
    σ1,...,σn ⊢ τ
holds in the Natural Deduction version iff it holds in the Hilbert version.
Proof By the Curry-Howard theorems for Natural Deduction (6B7(ii), 6C5(ii)), the
combinatory completeness theorems (9F3, 9F5), and Corollary 6D7.1.
6D8.1 Note The Hilbert-Gentzen link is usually proved directly without going
through λ-calculus. The key step in a direct proof is a result usually called the
deduction theorem which is a close analogue of the combinatory completeness the-
orems in 9F. (The deduction theorem is treated in many standard introductions
to classical logic, see for example Hamilton 1988 Proposition 2.8; a study of the
theorem in some different logics is in Bunder 1982.)
7 The converse principal-type algorithm
This chapter combines the theme of propositional logic from Chapter 6 with that of
principal types from Chapter 3. We saw in the Curry-Howard theorem (6B7) that
the types of the closed terms are exactly the theorems of the intuitionist logic of
implication: hence the principal types of these terms must form a subset of these
theorems, and the very natural question arises of just how large this subset is. Do its
members form an aristocracy distinguished in some structural way from the general
rabble of theorems, or can every theorem be a principal type?
The main result of the chapter will show that there is in fact no aristocracy: if a
type T is assignable to a closed term M but is not the principal type of M, then it is
the principal type of another closed term M*.
The proof will include an algorithm to construct M* when T and M are given.
To build M* an occurrence of M will be combined with some extra terms chosen
from a certain carefully defined stock of "building blocks", closed terms with known
principal types; and the main aim of this chapter will be to build M* from as
restricted a set of building blocks as possible.
The algorithm in the earliest known proof needed full λ-calculus (Hindley 1969),
but two later ones used only λI-terms as building blocks (Mints and Tammet 1991,
Hirokawa 1992a §3) and another used an even more restricted class (Meyer and
Bunder 1988 §9). The algorithm below will be based on the latter very economical
one.
When it was first proved the existence of M* seemed nothing more than a mere
technical curiosity, but several years later it was discovered by David Meredith to
be equivalent to the completeness of a neat form of resolution rule for Intuitionist
implicational logic. This rule and its completeness problem will be described at the
end of the chapter.
7A1 Definition A λK-PT [respectively, λI-PT, BCKλ-PT, BCIλ-PT] is the principal
type of a closed λK-term [λI-term, BCKλ-term, BCIλ-term].
A BCKW-PT [BCIW-PT, etc.] is the principal type of a BCKW-combination
[BCIW-combination, etc.] as defined in 9F1.
7A1.1 Lemma A type is a λK-PT, λI-PT, BCKλ-PT or BCIλ-PT iff it is, respectively,
a BCKW-, BCIW-, BCK- or BCI-PT.
7A2 Converse PT Theorem for λK (i) Every type of a closed λK-term is also a
λK-PT.
(ii) Further, there is an algorithm which accepts any typable closed λK-term M and
any type τ of M, and constructs a closed λK-term M* whose PT is τ.
(iii) Furthermore, M* ▷_β M.
Proof There is a proof in Hindley 1969 Thm. 3 of the corresponding result for
combinatory logic. This translates straightforwardly into λ-calculus. It contains an
algorithm that constructs, for every type σ, a closed λK-term I_σ such that
    PT(I_σ) ≡ σ→σ,    I_σ ▷_β I.
7A2.1 Corollary There exist two typable closed λK-terms P and P' such that P =_β P'
but P and P' have no types in common at all.
7A3 Converse PT Theorem for λI (i) Every type of a closed λI-term is also a λI-PT.
(ii) Further, there is an algorithm which accepts any typable closed λI-term M and
any type τ of M, and builds a closed λI-term M* whose PT is τ.
(iii) Furthermore, M* ▷_βη M.
7A3.1 Proof-note An algorithm and correctness proof will be given in 7C. Note
that (iii) is weaker than the corresponding result for λK in 7A2(iii): in λK we have
▷_β but in λI the proof will only give ▷_βη.
The first known algorithm for λI was devised in 1986 by Robert Meyer (Meyer
and Bunder 1988 §9), and another was invented independently by Grigori Mints
and Tanel Tammet (Mints and Tammet 1991 §2). Both of these originated in the
context of Hilbert-style axiom-based logics and although they can easily be translated
into λ-calculus they lose some of their directness when this is done. In 1990
a direct and natural λ-algorithm was described by Sachio Hirokawa (Hirokawa
1992a §3).
The most economical of these algorithms is the Meyer one, in the sense that it
does not need all the λI-terms as building blocks but only a certain well-defined
proper subset of them. So this algorithm is the one that will be described in
7C.
All three algorithms produce terms M* satisfying (iii) as well as (ii). Though (iii)
was first proved for the Hirokawa algorithm (Hirokawa 1992a §3), a proof will be
given below for the Meyer algorithm and a careful reading of Mints and Tammet
1991 shows that (iii) holds for that algorithm too.
7A3.2 Exercise* If the answer to the above question is "yes" then as a consequence
there exist two β-equal typable λI-terms with no types in common (cf. Corollary
7A2.1 for λK). Show that this consequence holds even if the answer to the
above question is "no", by constructing two suitable λI-terms directly. (Hint (Martin
Bunder): consider the term P in 2A8.8.)
7B Identifications
The construction of M* from M and T in the proof of 7A3 will depend on viewing T
as the result of applying certain substitutions to a particularly simple form of type.
The present section introduces the notation needed for this.
First recall the substitution notation introduced in 3B. In particular (from 3B7) a
variables-for-variables substitution is one with form
    s ≡ [b1/a1,...,bn/an]
where b1,...,bn are any variables; and a renaming in τ (where τ is a given type) is
a variables-for-variables substitution such that Vars(τ) = {a1,...,an} and b1,...,bn
are distinct.
7B1 Definition For any type τ, an identification in τ (sometimes called a contraction
in τ) is any substitution [b/a] such that a, b both occur in τ. A type σ is obtained
from τ by identifications if
    σ ≡ [bn/an](...([b1/a1](τ))...)
where [b1/a1] is an identification in τ, [b2/a2] is an identification in [b1/a1]τ, etc.
    ρ ≡ [bn/an](...([b1/a1](v(τ)))...)
where n ≥ 0, v is a renaming in τ, and each [bi/ai] is an identification in
7B2 Definition A type τ is skeletal if each variable in τ occurs exactly once. (The
property of being skeletal is sometimes called the 1-property.)
7B3 Lemma Every type τ can be obtained by identifications from a skeletal type τ°
(which will be called a skeleton of τ).
Proof If τ is skeletal choose τ° ≡ τ. If not, then for each variable b with two or more
occurrences in τ, replace all but one of these occurrences by distinct new variables.
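The construction in the proof of 7B3 is easy to carry out mechanically. The following sketch is an illustration under assumptions not in the text: types are strings (variables) or tuples ("->", σ, τ), an identification [b/a] is recorded as the pair (b, a), new variables are named by appending a number to the old name, and the function names are hypothetical.

```python
def variables(t):
    """Variables of a type, left to right, with repetitions."""
    return [t] if isinstance(t, str) else variables(t[1]) + variables(t[2])


def is_skeletal(t):
    # 7B2: each variable occurs exactly once.
    vs = variables(t)
    return len(vs) == len(set(vs))


def skeleton(t):
    """Return a skeleton of t and identifications that lead back to t."""
    used = set(variables(t))
    seen = set()
    idents = []  # pairs (b, a) standing for the identification [b/a]

    def fresh(base):
        n = 1
        while f"{base}{n}" in used:
            n += 1
        used.add(f"{base}{n}")
        return f"{base}{n}"

    def go(u):
        if isinstance(u, str):
            if u in seen:
                # A repeated occurrence: replace it by a new variable.
                a = fresh(u)
                idents.append((u, a))
                return a
            seen.add(u)
            return u
        left = go(u[1])
        return ("->", left, go(u[2]))

    return go(t), idents


def identify(b, a, t):
    """Apply the substitution [b/a] to the type t."""
    if isinstance(t, str):
        return b if t == a else t
    return ("->", identify(b, a, t[1]), identify(b, a, t[2]))


tau = ("->", ("->", "a", "b"), ("->", "a", "b"))   # (a->b)->a->b
tau0, steps = skeleton(tau)
back = tau0
for b, a in steps:
    back = identify(b, a, back)
```

Each recorded step is an identification in the sense of 7B1, since both variables occur in the type it is applied to, and applying the steps in order turns the skeleton back into the original type.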
This section contains a proof of Theorem 7A3 obtained by translating the proof
in Meyer and Bunder 1988 §9 into λ-calculus. The strategy will be to show first
that every type of a closed λI-term can be obtained from a λI-PT by a series of
identifications, and then that identifications preserve the property of being a λI-PT.
But a preliminary step will be to prove the special case of the theorem in which τ
has form θ→θ where θ is skeletal.
7C1 Lemma Let θ be any skeletal type. Then θ→θ is the PT of a closed λI-term I_θ
such that
Next, suppose θ ≡ ρ→σ and I_ρ and I_σ have already been built such that
Choose
(4) I_{ρ→σ} ≡ λxy.I_σ(x(I_ρ y)).
From (2) and the PT algorithm (3E1) it is easy to deduce that
(5) PT(I_{ρ→σ}) ≡ (ρ→σ)→(ρ→σ).
(Note that the proof of (5) depends on the fact that Vars(ρ) ∩ Vars(σ) = ∅, which
holds because ρ→σ is skeletal.) Also, by (3),
Let ⊢_λ M:τ. Then M has a PT, call it α. And by renaming variables in α we can
ensure that
(2) Vars(α) ∩ Vars(τ°) = ∅.
Define
(3) M+ ≡ I_{τ°}M.
Then (ii) holds by (1). To prove (i), apply the PT algorithm to M+. The first step
is to find an m.g.u. of (τ°, α). To see that one exists, note that τ is an instance of
τ° by the definition of τ°, and τ is an instance of α because α is principal, so (τ°, α)
has a common instance, namely τ; and by (2) this instance is a unification of (τ°, α);
hence (τ°, α) has an m.g.u. τ+ and τ is an instance of τ+, say τ ≡ s(τ+). Then
(4) PT(M+) ≡ τ+.
To complete the proof of (i) the next step is to show that s is a variables-for-variables
substitution. For this it is enough to show that
(5) |τ| ≤ |τ+|,
because if s substituted a composite type for a variable in τ+ it would increase |τ+|.
But |τ| = |τ°| because τ is obtained from τ° by identifications; and τ+ is an instance
of τ° since τ+ unifies (τ°, α), so |τ+| ≥ |τ°|; hence (5) holds.
Thus τ is obtained from a λI-PT τ+ by a variables-for-variables substitution s.
By 7B1.2, s can be split into a renaming (which changes τ+ to another PT of M+)
followed by some identifications. Hence (i) holds.
To prove (iii) note that if τ is skeletal then τ° ≡ τ, so M+ ≡ I_τ M and the PT of
the latter is clearly τ.
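The terms I_θ of 7C1, and the reduction of M+ ≡ I_{τ°}M back to M, can be explored with a small evaluator. The sketch below is an illustration under assumptions not in the text: terms use de Bruijn indices instead of named variables, types are strings or tuples ("->", ρ, σ), the normaliser is a plain normal-order βη-reducer, and the names identity_for, shift, subst, free_in and normalize are hypothetical. It checks reduction only and does not compute principal types.

```python
def identity_for(theta):
    """Build I_theta: \\x.x for a variable, \\xy.I_sigma(x(I_rho y)) for rho->sigma."""
    if isinstance(theta, str):
        return ("lam", ("var", 0))
    _, rho, sigma = theta
    inner = ("app", ("var", 1), ("app", identity_for(rho), ("var", 0)))
    return ("lam", ("lam", ("app", identity_for(sigma), inner)))


def shift(t, d, cutoff=0):
    if t[0] == "var":
        return ("var", t[1] + d) if t[1] >= cutoff else t
    if t[0] == "lam":
        return ("lam", shift(t[1], d, cutoff + 1))
    return ("app", shift(t[1], d, cutoff), shift(t[2], d, cutoff))


def subst(t, j, s):
    if t[0] == "var":
        return s if t[1] == j else t
    if t[0] == "lam":
        return ("lam", subst(t[1], j + 1, shift(s, 1)))
    return ("app", subst(t[1], j, s), subst(t[2], j, s))


def free_in(k, t):
    if t[0] == "var":
        return t[1] == k
    if t[0] == "lam":
        return free_in(k + 1, t[1])
    return free_in(k, t[1]) or free_in(k, t[2])


def normalize(t):
    """Beta-eta normal form; terminates on the typable terms used here."""
    if t[0] == "var":
        return t
    if t[0] == "lam":
        body = normalize(t[1])
        # eta: \\x.Mx reduces to M when x is not free in M
        if body[0] == "app" and body[2] == ("var", 0) and not free_in(0, body[1]):
            return shift(body[1], -1)
        return ("lam", body)
    fun = normalize(t[1])
    if fun[0] == "lam":
        # beta: (\\x.P)N reduces to [N/x]P
        return normalize(shift(subst(fun[1], 0, shift(t[2], 1)), -1))
    return ("app", fun, normalize(t[2]))


I = ("lam", ("var", 0))
K = ("lam", ("lam", ("var", 1)))
theta = ("->", ("->", "a", "b"), "c")
m_plus = ("app", identity_for(theta), K)
```

Here normalize(identity_for(theta)) is I and normalize(m_plus) is K, illustrating that prefixing a term with I_θ changes its principal type without changing its βη-normal form.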
7C2.1 Note By the above lemma, the proof of Theorem 7A3 will be complete if we
can prove that identifications preserve the property of being a λI-PT. This will be
done in the next three lemmas.
In them the following notation will be useful: if a1,...,an are distinct and each
has exactly one occurrence in a given type τ, we may say
    τ(a1,...,an),    τ(σ1,...,σn)
(7) Q ≡ λxyz.Q_{σ+}x(y(I_ρ z)).
Then QI ▷_βη I easily. Now (a→b)* ≡ a→b since a occurs positively in τ; and by the
PT algorithm and perhaps some renaming we get
(8) PT(Q) ≡
Case 3: the occurrence of a in τ is negative and is in ρ. Then a occurs positively
in ρ, so by the induction hypothesis there exists Q_{ρ+} such that
(9) PT(Q_{ρ+}) ≡ (a→b)→ρ(a)→ρ(b),
and Q_{ρ+}I ▷_βη I. Choose
(10) Q ≡ λxyz.I_σ(y(Q_{ρ+}xz)).
Then QI ▷_βη I. Now (a→b)* ≡ b→a since a occurs negatively in τ; and by the PT
algorithm and perhaps some renaming, we get
(11) PT(Q) ≡
Case 4: the occurrence of a in τ is negative and is in σ. Then a occurs negatively
in σ, so the induction hypothesis gives Q_{σ−} such that
(12) PT(Q_{σ−}) ≡ (b→a)→σ(a)→σ(b)
and Q_{σ−}I ▷_βη I. Choose
(13) Q
(i) (a1→b1)*→(a2→b2)*→τ(a1,a2)→τ(b1,b2),
where, for each i ≤ 2, (ai→bi)* ≡ ai→bi if the occurrence of ai in τ is positive, but
(ai→bi)* ≡ bi→ai if it is negative. Also
(ii) RII ▷_βη I.
Proof By 7C3 there exist λI-terms P and Q such that
PT(P) ≡ (a₁→b₁)*→τ(a₁,a₂)→τ(b₁,a₂),
PT(Q) ≡ (a₂→b₂)*→τ(b₁,a₂)→τ(b₁,b₂),
and PI ▷βη I and QI ▷βη I. Choose
(1) R ≡ λxyz.Qy(Pxz).
7C5 Identification Lemma If τ is a λI-PT then so are all types obtained from τ
by identifications. Further, there is an algorithm which, when given a λI-term M and
τ ≡ PT(M) and σ ≡ [b/a]τ with a, b ∈ Vars(τ), will construct a λI-term N such that
PT(N) ≡ σ, N ▷βη M.
Proof Let τ ≡ PT(M) for some closed λI-term M and let σ ≡ [b/a]τ where a,
b ∈ Vars(τ). We must prove that σ is a λI-PT. Let
Vars(τ) = {a, b, c₁,...,cₖ}
where a, b, c₁,...,cₖ are distinct, and let p, q, n₁,...,nₖ respectively be the number
of occurrences of each of these variables in τ.
Define τ° to be the result of replacing each occurrence of each variable in τ by a
distinct new variable; say
and τ is
τ°(a,a,...,a, b,b,...,b, c₁,...,c₁, ..., cₖ,...,cₖ).
The following lemma and algorithm complete the proof of Theorem 7A3.
7C6 Lemma Algorithm 7C7 below accepts any τ and any M such that ⊢λ M:τ, and
outputs a closed term M* such that PT(M*) ≡ τ and M* ▷βη M. Further, if M is a
λI-term so is M*.
7C7 Converse PT Algorithm (Meyer and Bunder 1988 §9.) Input: any pair (τ, M)
such that ⊢λ M:τ. (The relation ⊢λ M:τ is decidable by 3E3.)
Step 1. Follow the proof of 7C2 to construct M+ and τ+ ≡ PT(M+) and to find a
series of identifications s₁, ..., sₖ (k ≥ 0) such that
7C7.1 Note By the proofs of 7C1-5 we can see that the M* produced by the above
algorithm has form
M* ≡ Tₖ(Tₖ₋₁(...(T₁(I_{τ°}M))...))
where T₁,...,Tₖ and I_{τ°} are typable closed λI-terms that βη-reduce to I.
(ii) τ ≡ b→b→b, M ≡ K.
(By the way, in (ii) M is not a λI-term; but the algorithm works despite this, though
in such a case it produces an M* that is also not a λI-term.)
7C7.3 Note As mentioned earlier the above converse PT proof is not quite the
one in Meyer and Bunder 1988 §9 but a λ-adaptation of it. The unadapted Meyer-Bunder
proof is slightly stronger, and shows that the converse PT theorem holds
not only for the class of all λI-terms but for a more restricted class too, namely the
BB'IW-combinations as defined in 9F1.¹ For completeness' sake this result will now
be stated as a formal theorem.
7C8 Theorem (Meyer and Bunder 1988 §9.) (i) Every type of a BB'IW-combination
is also a BB'IW-PT.
(ii) In detail: there is an algorithm which accepts any typable closed term M and any
type τ of M, and outputs an applicative combination M* of M and the closed terms
B, B', I, W, such that
PT(M*) ≡ τ, M* ▷βη M.
Proof Using the PT algorithm and some patience, it is easy to show that the proofs
of Lemmas 7C1-5 stay valid if the definitions of I_{ρ→σ}, Q, R and N in them are
replaced by the following:
in 7C1 (4): I_{ρ→σ} ≡ B(B'B)(BBB')I_ρ I_σ,
in 7C3 (4): Q ≡ BB(B(B'B')(BBB))I_σ Q_{ρ−},
in 7C3 (7): Q ≡ BB(B(B'B)(BBB'))I_ρ Q_{σ+},
in 7C3 (10): Q ≡ BB(B(B'B')(BBB))I_σ Q_{ρ+},
in 7C3 (13): Q ≡ BB(B(B'B)(BBB'))I_ρ Q_{σ−},
in 7C4 (1): R ≡ B(B'B')(BB')PQ,
in 7C5 (2): N ≡ WRIM.
7D Condensed detachment
We now come to the connection with the resolution-style rule for propositional logic
mentioned in the introduction to this Chapter. This rule was originally devised in
the early 1950's under the name of the condensed detachment rule. It will be
described here in the context of the A-logic generated by an arbitrary set A of axioms,
see 6D1.
7D1 Definition Rule (D), the condensed detachment rule, makes the deduction
        ρ→σ     τ
        ─────────── (D)
         D(ρ→σ)τ
where D(ρ→σ)τ is an implicational formula defined thus: if ρ and τ have a common
¹ This class is a proper subclass of the closed λI-terms because it does not contain C. (When
Cxyz is β-reduced the occurrence of z moves away from the rightmost position, and it is easy to see
that no BB'IW-combination has this property.)
7D1.1 Lemma If D(ρ→σ)τ is defined, it is the most general conclusion that can be
obtained by (→E) from instances of ρ→σ and τ using an instance of ρ→σ as major
premise. More precisely, if v ≡ s₁(ρ) ≡ s₂(τ) is an m.g.c.i. of (ρ, τ) satisfying 7D1(i)-(ii)
and r₁ and r₂ are any substitutions such that r₁(ρ) ≡ r₂(τ), then r₁(σ) is an instance
of s₁(σ).
7D1.2 Notes (i) D(ρ→σ)τ is unique modulo renaming of variables (by 3C3.2), so in
future we may think of it as uniquely defined.
(ii) D(ρ→σ)τ is also independent of renaming in ρ→σ and τ; that is, if D(ρ→σ)τ
is defined and ρ'→σ' and τ' are alphabetic variants of ρ→σ and τ respectively then
D(ρ'→σ')τ' is defined and is an alphabetic variant of D(ρ→σ)τ. (By 3C3.2.)
(iii) In any system whose rules include (→E) and (Sub) we can derive rule (D) as
follows.
        ρ→σ                      τ
   ───────────── (Sub)       ─────── (Sub)
    s₁(ρ)→s₁(σ)               s₂(τ)        (s₁(ρ) ≡ s₂(τ))
   ─────────────────────────────────── (→E)
                 s₁(σ)
The following lemma expresses formally the fact that the definition of D(ρ→σ)τ
is the same as the procedure in 3C4 for computing PT(PQ) when PT(P) ≡ ρ→σ
and PT(Q) ≡ τ.
7D2 Lemma (D. Meredith.) If P and Q are typable closed terms with PT(P) ≡ ρ→σ
and PT(Q) ≡ τ, then
(i) PQ is typable iff D(ρ→σ)τ is defined,
(ii) PT(PQ) ≡ D(ρ→σ)τ modulo permutation of variables.
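The computation behind rule (D) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the book's formal definition: types are modelled as strings (atoms) or tuples ("->", left, right); the names unify, rename, canon and condensed_detachment are hypothetical; and the side-conditions 7D1(i)-(ii) are approximated by renaming the minor premise apart from the major one before unifying.

```python
from itertools import count

def variables(t):
    """Atoms of a type, in order of first occurrence."""
    if isinstance(t, str):
        return [t]
    out = []
    for v in variables(t[1]) + variables(t[2]):
        if v not in out:
            out.append(v)
    return out

def rename(m, t):
    """Replace atoms by atoms according to the dict m (one simultaneous pass)."""
    if isinstance(t, str):
        return m.get(t, t)
    return ("->", rename(m, t[1]), rename(m, t[2]))

def subst(s, t):
    """Apply substitution s (atom -> type), following chains of bindings."""
    if isinstance(t, str):
        return subst(s, s[t]) if t in s else t
    return ("->", subst(s, t[1]), subst(s, t[2]))

def unify(t1, t2, s=None):
    """Return a most general unifier of t1 and t2 as a dict, or None."""
    s = {} if s is None else s
    t1, t2 = subst(s, t1), subst(s, t2)
    if t1 == t2:
        return s
    if isinstance(t1, str):
        # occurs check: an atom cannot be unified with a type containing it
        return None if t1 in variables(t2) else {**s, t1: t2}
    if isinstance(t2, str):
        return unify(t2, t1, s)
    s = unify(t1[1], t2[1], s)
    return None if s is None else unify(t1[2], t2[2], s)

def canon(t):
    """Rename atoms to a, b, c, ... so results can be compared up to renaming."""
    return rename({v: chr(ord("a") + i) for i, v in enumerate(variables(t))}, t)

def condensed_detachment(major, minor):
    """D(major)(minor) up to renaming, or None when it is undefined."""
    if isinstance(major, str):
        return None                      # the major premise must have form rho -> sigma
    rho, sigma = major[1], major[2]
    used = set(variables(major))
    fresh = (f"v{i}" for i in count())
    apart = {v: next(n for n in fresh if n not in used) for v in variables(minor)}
    s = unify(rho, rename(apart, minor))
    return None if s is None else canon(subst(s, sigma))
```

For instance, with K_type = ("->", "a", ("->", "b", "a")) and I_type = ("->", "a", "a"), the call condensed_detachment(K_type, I_type) computes a→b→b, which agrees with PT(KI) as Lemma 7D2 predicts.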
7D3 Historical Note Although rule (D) is so close to the PT algorithm each of the
two concepts was developed in total ignorance of work on the other until around
1978, when David Meredith was the first to realise the parallel. (Hindley and
Meredith 1990.)
Rule (D) was invented by Carew Meredith, and David Meredith (his cousin)
recalled learning it from him in 1954. Its first appearance in print was in Lemmon et
al. 1957 §9. A definition of the rule with historical comments appears in D. Meredith
1977 and further historical comments are in Kalman 1983.
Precursors of the rule date back perhaps as far as the 1920's, however. In Poland, a
particular D-computation was published in Łukasiewicz 1939 (p. 276 of the English
translation); and there is indirect evidence that Alfred Tarski, then a colleague of
Łukasiewicz, might have formulated the D-concept (and therefore perhaps also an
algorithm for computing D(ρ→σ)τ) even before 1930.
Carew Meredith's work on propositional logics involved computing D(ρ→σ)τ in
many special cases, for example those in Lemmon et al. 1957, but he does not seem
to have written out his method as a formal algorithm, at least not in print. However,
in 1957 an algorithm for constructing D(ρ→σ)τ was formalized by David Meredith
and implemented by him as a program for the computer UNIVAC 1.
Now by 7D2 the D-construction is the same as the core of the PT algorithm;
hence David Meredith's 1957 D-program was probably in essence the first formal
PT algorithm. It was also the first one to be run on a machine. (Cf. Comment 3A7
on the history of PT algorithms.)
7D4 Definition (Condensed logics) For any set A of formulae, condensed A-logic is
obtained by replacing the two rules (→E) and (Sub) in the definition of "A-logic"
in 6D1 by the single rule (D). We shall call its deductions D-deductions; they are
trees constructed from axioms and assumptions by rule (D). For deducibility, etc.
in a condensed logic we shall say D-deducible, D-provable, D-proof, D-theorem, and
shall use the notation
σ₁,...,σₙ ⊢_AD τ, ⊢_AD τ.
7D5 Definition Rule (D) is said to be complete for a given set A of axioms (or A is
said to be D-complete) if
A^D⊢ = A^⊢.
7D5.1 Note The concepts of condensed logic and D-completeness have not been
defined for Natural Deduction systems but only for Hilbert-style (axiom-based)
logics.
7D6 Meredith's Curry-Howard Theorem (D. Meredith.) Let {C₁, C₂, ...} be a finite
or infinite set of typable closed λ-terms, let PT(Cᵢ) ≡ γᵢ, and let A = {γ₁, γ₂, ...}.
Then the theorems of condensed A-logic are exactly the PTs of the typable applicative
combinations of C₁, C₂, ....
Proof By 7D2.
7D7 Theorem The axiom-set {(B), (C), (I), (W)} for R→ given in 6D4 is D-complete.
It is straightforward to check that this combination has the same PT as B' (in fact
it also reduces to B'); so PT(M**) ≡ PT(M*) ≡ τ. Hence τ is a BCIW-PT. But by
7D6 every BCIW-PT is a D-theorem of the logic whose axioms are the PT's of B, C,
I, W, and this logic is R→.
The next step in answering Question 7D5.2 is to prove the D-completeness of the
axiom-set {(B), (C), (K), (W)} for Intuitionist logic. This logic is stronger than R→,
and if we view the D-completeness of a set of axioms as saying that deductions
obtained by rule (Sub) can be imitated using rule (D) in combination with some
of the axioms, it is natural to conjecture that if we strengthen a D-complete set it
will remain D-complete. The following definition and lemma make this conjecture
precise.
7D8 Definition For any sets A and B of formulae: B-logic is called an extension of
A-logic if B^⊢ ⊇ A^⊢; it is called a D-extension of A-logic if
AD-
s(τ)→s(τ) ∈ B^D⊢.
D(s(τ)→s(τ), τ) ≡ s(τ).
Hence by rule (D) applied to the D-theorems s(τ)→s(τ) and τ we get
s(τ) ∈ B^D⊢.
7D10 Theorem The axiom-set {(B), (C), (K), (W)} for Intuitionist logic given in 6D3
is D-complete.
7D11 Note (Other D-complete logics) (i) Classical logic. The implicational fragment
of classical logic is a Hilbert-style system defined by the axioms (B), (C), (K), (W)
and
(PL) ((a→b)→a)→a.
((PL) is called Peirce's law, see 6A1.2 and 6B7.4.) It can be shown that a formula
is provable in this system if it is a tautology in the usual truth-table sense. (Prior
1955 Part I, Ch. III.) By 7D9-10 this system is D-complete.
(ii) Ticket entailment. A Hilbert-style logic defined by the axioms (B), (B'), (I) and
(W) listed in 6B2.1 was introduced in Anderson and Belnap 1975 (see especially
Ch. 1 §§6, 8.3.2), where it was called T→, or the logic of ticket entailment. The details
of T→ and its motivation are not the concern of this book, but in a sense (B) and
(B') are "right-handed" and "left-handed" replacement properties: if a→b holds, (B)
says that a can be replaced by b in the formula c→a and (B') says that b can be
replaced by a in a→c. (Meanings for (I) and (W) were suggested in 6D6.1(ii).) This
logic can be shown to be weaker than R→, but by 7D6 and 7C8 it is nevertheless
D-complete.¹
(iii) A set of axioms whose logic is strictly weaker than T→ was proved D-complete
by N. Megill in unpublished notes in 1993, and an infinite series of ever weakening
D-complete axiom-sets has since been constructed (Megill and Bunder 1996).
But not all axiom-sets are D-complete, as the next theorem will show.
7D12 Theorem The axiom-sets {(B), (C), (K)} and {(B), (C), (I)} given in 6D5-6 are
D-incomplete.
¹ By the way, it is not yet known whether there is a decision-procedure for provability in T→.
Proof By 7D9 it is enough to prove the result for {(B), (C), (K)}. And by 7D6 it
is enough to find a type τ that can be assigned to an applicative combination of
B, C, K but is not the PT of such a combination. One such type (pointed out by
A. Wronski, cf. Bunder 1986) is
τ ≡ ((a→a)→a)→a.
It is easy to deduce from the types of C and K in Table 3E2a that
⊢λ (C(CKK)(CKK)):τ.
Given a type τ, how many closed terms can receive type τ in TAλ? As stated this
question is trivial, since if the answer is not zero it is always infinite; for example
the type a→a can be assigned to all members of the sequence I, II, III, etc. But if we
change the question to ask only for terms in normal form, the answer is often finite
and interesting patterns show up which are still not completely understood.
The aim of this chapter is to describe an algorithm from Ben-Yelles 1979 that
answers the "how many" question for normal forms. For each τ it will decide in a
finite number of steps whether the number of closed β-normal forms that receive
type τ is finite or infinite, will compute this number in the finite case, and will list
all the relevant terms in both cases.
Ben-Yelles' algorithm can be used in particular to test whether the number of
terms with type τ is zero or not, and as mentioned in 6B7.3 this gives a test for
provability in Intuitionist implicational logic.
The first section below will describe the sets to be counted. The next will show
some examples of the algorithm's strategy in action. Then in 8C-D the algorithm
will be stated formally, and the rest of the chapter will be occupied by a proof that
the algorithm does what it claims to do.
8A Inhabitants
This section gives precise definitions and notations for the sets to be counted.
A βη-normal inhabitant of a type is an inhabitant in βη-nf. The sets of all typed and
untyped βη-normal inhabitants of τ will both be called
Nhabs_η(τ).
Proof Let M ∈ β-nf; then M inhabits τ if there exists a proof Δ of ⊢λ M:τ; and
by 2B3 this Δ is uniquely determined by M. And such proofs correspond one-to-one
with typed closed terms by 5A7.
8A2.1 Notation All terms in this chapter will be typed unless explicitly stated
otherwise. But for ease of reading they will often be written with some or all of
their types omitted.
8A3 Definition The number (0, 1, 2, ... or ∞) of members of a set S, counted modulo
≡α if S is a set of λ-terms, is called the cardinality of S or
#(S).
For #(Nhabs(τ)) and #(Nhabs_η(τ)) we shall usually say just
#(τ), #η(τ).
(i) (λx₁^τ₁ ... xₘ^τₘ.(v^(ρ₁→...→ρₙ→τ*) M₁^ρ₁ ... Mₙ^ρₙ)^τ*)^(τ₁→...→τₘ→τ*)
where m ≥ 0, n ≥ 0, and
(ii) τ ≡ τ₁→...→τₘ→τ*
for some τ*, possibly composite, and each Mⱼ is a β-nf that is typed relative to
(iii) Γ ∪ {x₁:τ₁,...,xₘ:τₘ}
(and the set displayed in (iii) is consistent). Further, if M^τ is closed then m ≥ 1 and
there is an i ≤ m such that
(iv) v ≡ xᵢ, τᵢ ≡ ρ₁→...→ρₙ→τ*.
λx₁^τ₁, ..., λxₘ^τₘ,   v,   M₁, ..., Mₙ
are called respectively the term's initial abstractors, head and arguments.
8A6 Definition (Depth(M^τ)) Define the depth of a typed or untyped β-nf thus:
(i) Depth(y) = Depth(λx₁...xₘ.y) = 0;
(ii) Depth(λx₁...xₘ.yM₁...Mₙ) = 1 + Max{Depth(Mⱼ) : 1 ≤ j ≤ n} if n > 0.
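Definition 8A6 translates directly into a recursive function. The sketch below is only illustrative and rests on an assumed encoding that is not the book's: a β-nf λx₁...xₘ.yM₁...Mₙ is represented as a triple (bound variables, head, list of arguments), with each argument encoded the same way.

```python
def depth(nf):
    """Depth of a β-normal form encoded as (bound_vars, head, args)."""
    _bound, _head, args = nf
    if not args:                      # clause (i): no arguments
        return 0
    return 1 + max(depth(m) for m in args)   # clause (ii): n > 0

# λxyz.xz(yz) in the assumed encoding
z = ([], "z", [])
s_term = (["x", "y", "z"], "x", [z, ([], "y", [z])])
```

With this encoding, depth(s_term) is 2: the argument z has depth 0 and the argument yz has depth 1.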
8A6.1 Examples
Depth 2.
8A7 Definition (Long β-nf's) A typed β-nf M^τ is called long or maximal if every
variable-occurrence z in M^τ is followed by the longest sequence of arguments allowed
by its type, i.e. if each component with form (zP₁...Pₙ) (n ≥ 0) that is not in a
function position has atomic type.¹ An untyped β-nf M is called long relative to a
type τ if it is the type-erasure of a typed long β-nf M^τ. (By 8A2 M^τ is unique.)
The sets of all long normal inhabitants of τ (typed or untyped) will both be called
Long(τ).
¹ Sometimes "long βη-normal form" is used in the literature for long β-nf's but this is misleading, as these
terms are β-nf's but not necessarily η-nf's.
(because y^(a→b) has a type which "demands" an argument but none is provided). On
the other hand the following one is long:
N^τ ≡
8A7.2 Definition The sets of all long normal inhabitants of τ (typed or untyped)
with depth ≤ d will both be called
Long(τ, d).
The next lemma shows that if we could enumerate long normal inhabitants the
others would be obtainable from these by η-reduction.
8A8 Lemma (Completeness of Long(τ)) (Ben-Yelles 1979 Lemma 3.9.) Every normal
inhabitant of τ can be η-expanded to a long normal inhabitant of τ. And this long
inhabitant is unique (modulo ≡α); i.e.
{M^τ, N^τ ∈ Long(τ) and M^τ =η N^τ} ⟹ M^τ ≡α N^τ.
Proof Let P^τ ∈ Nhabs(τ). First we must η-expand P^τ to a term P^τ+ ∈ Long(τ).
Then we must show that P^τ+ is unique, i.e. that
(1) M^τ ∈ Long(τ), M^τ ▷η P^τ ⟹ M^τ ≡α P^τ+.
Suppose P^τ contains a short component, i.e. a component with form
Choose distinct new variables z₁, ..., zₖ not occurring in P^τ and replace this
component by
(λz₁^σ₁ ... zₖ^σₖ.((yQ₁ ... Qₙ) z₁^σ₁ ... zₖ^σₖ)^e)
Fig. 8A10a.
8A9 Definition (η-family) The set of all terms obtained by η-reducing M^τ will be
called the η-family of M^τ (just as for untyped terms in 1C3), or
{M^τ}η.
8A9.1 Note Let M^τ ∈ TT(Γ) for some Γ. Then {M^τ}η is finite (by the typed
analogue of 1C3.1) and all its members are in TT(Γ) (by 5B7.1). If M^τ is a β-nf
then so are all the members of {M^τ}η (by the typed analogue of 1C9.3). Hence
(i) M^τ ∈ Nhabs(τ) ⟹ {M^τ}η ⊆ Nhabs(τ).
Also, if M^τ is a β-nf its η-family contains exactly one βη-nf (by the typed analogue
of 1C9.3). Finally, each normal inhabitant of τ is in the η-family of exactly one long
normal inhabitant, by 8A8.
The following lemma summarises this situation.
8A10 Lemma (i) The η-families of the long typed normal inhabitants of τ partition
Nhabs(τ) into non-overlapping finite subsets, each η-family containing just one long
member and just one βη-nf. (See Fig. 8A10a.)
(ii) #(τ) is finite or infinite or zero according as #η(τ) is finite or infinite or zero.
(iii) #η(τ) = #(Long(τ)).
The sets of all principal β-normal inhabitants of τ (typed or untyped) will both
be called
Nprinc(τ).
Fig. 8A12a.
8A11.2 Lemma Let M^τ+ be the unique member of Long(τ) to which M^τ η-expands
(see 8A8). Then
M^τ ∈ Nprinc(τ) ⟹ M^τ+ ∈ Nprinc(τ).
Proof The η-expansion in the proof of 8A8 preserves principality because of the
way the types given to z₁, ..., zₖ are determined by the type of the component that
is replaced.
8A12 Comment (Sets of nf's) Three sets of β-nf's have been defined so far in this
section, namely
Nhabs(τ), Long(τ), Nprinc(τ),
and the sets of all βη-nf's in these sets will be called respectively
Nhabs_η(τ), Long_η(τ), Nprinc_η(τ).
To clarify the relations between these six sets consider the type
τ ≡ (a→a→a)→a→a→a.
(See Fig. 8A12a.) For this τ the six sets are all distinct and in fact there is a term in
every space in Fig. 8A12a except one. In detail:
(i) λx^(a→a→a).x^(a→a→a) ∈ Nhabs_η − (Nprinc ∪ Long);
(ii) λx^(a→a→a) y^a.(xy)^(a→a) ∈ Nhabs − Nhabs_η − (Nprinc ∪ Long);
(iii) λx^(a→a→a) y^a z^a.(xyz)^a ∈ Long − Nhabs_η − Nprinc;
(iv) λx^(a→a→a) y^a.(x(xyy))^(a→a) ∈ Nprinc_η − Long_η;
8A12.1 Exercise* In Fig. 8A12a one of the eight spaces is empty: show that if τ is
changed to the following type then every space in that figure will contain a term:
τ ≡
8A13 Remark (Sets containing non-nf's) This chapter is concerned mainly with
counting normal forms so the sets Habs(τ) and Princ(τ) will play almost no role.
However, it is worth noting their relation to Nhabs(τ) and Nprinc(τ) before going on.
(i) If τ has an inhabitant M then M has a β-nf M*β by WN (5C1) and M*β is
also an inhabitant of τ by subject-reduction (2C1); hence
Habs(τ) ≠ ∅ ⟹ Nhabs(τ) ≠ ∅.
(ii) Next, by the converse PT theorem (7A2) every type with an inhabitant has a
principal one, so
Habs(τ) ≠ ∅ ⟹ Princ(τ) ≠ ∅.
(iii) In contrast it is not true that Habs(τ) ≠ ∅ ⟹ Nprinc(τ) ≠ ∅. It is not even true
that Princ(τ) ≠ ∅ ⟹ Nprinc(τ) ≠ ∅. This is because τ may have an inhabitant M, even a
principal one, such that PT(M) changes when M is reduced to M*β. An example is
τ ≡ a→a→a;
by 8B4 this type's only normal inhabitants are λxy.x and λxy.y and it is easy to
check that neither of these is principal; on the other hand Table 3E2a(10) showed
that τ has a non-normal principal inhabitant
(λxyz.K(xy)(xz))I.
8B1 Lemma Every type τ can be expressed uniquely in the following form, where
m ≥ 0 and e is an atom:
τ ≡ τ₁→...→τₘ→e.
Proof Easy induction on |τ|.
8B2 Comments (Long typed β-nf's) Let τ be any type; say τ has form
τ ≡ τ₁→...→τₘ→e (m ≥ 0, e an atom)
where 0 ≤ k ≤ m and τ* ≡ τₖ₊₁→...→τₘ→e. If M^τ is long (see 8A7), then
(i) k = m,
(ii) τ* ≡ e,
(iii) the types of x₁,...,xₘ coincide with the premises of τ,
(iv) the tail of the type of v is isomorphic to that of τ,
(v) if M^τ is closed then m ≥ 1 and v is an xᵢ (1 ≤ i ≤ m) and
τᵢ ≡ ρ₁→...→ρₙ→e.
The following examples show how the above comments are applied.
8B3 Example (A type τ with #(τ) = 1) (Ben-Yelles 1979 p. 42.) The following type
has exactly one normal inhabitant:
τ ≡ (a→b→c)→(a→b)→a→c.
And its normal inhabitant (which is also both long and principal) is
S^τ ≡ λx^(a→b→c) y^(a→b) z^a.xz(yz).
where w is an xᵢ whose type's tail is isomorphic to the tail of the type of U. This
tail is an occurrence of a, so the only possibility is
w ≡ x₃.
In fact its normal inhabitants are the following m terms (called projectors or selectors):
8B5 Example (Types τ with #(τ) = 0) (i) No atomic type has inhabitants.
(ii) No type that is skeletal (i.e. in which each atom occurs at most once, cf. 7B2)
has inhabitants.
Proof (i) Every type with an inhabitant has a normal one by 8A13(i). But every
type with a normal inhabitant has arity m ≥ 1 by 8B2(iii).
(ii) Let τ be skeletal and let τ ≡ τ₁→...→τₘ→e (m ≥ 1). If τ had inhabitants it
would have at least one long normal one by 8A8, and by 8B2 this inhabitant would
have form
λx₁...xₘ.xᵢM₁...Mₙ
with xᵢ having type τᵢ and the tail of τᵢ being an occurrence of e. But τ is skeletal,
so e cannot occur in any τᵢ. Hence τ has no inhabitants.
8B5.1 Corollary Intuitionist implicational logic is consistent in the sense that not all
formulae are provable.
Proof If an atomic formula e were provable it would be the type of a closed term
M by 6B7.1, contradicting 8B5.
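The two criteria in 8B5 give a cheap sufficient test for emptiness that can be tried before any search. A minimal sketch, assuming types are encoded as strings (atoms) or tuples ("->", left, right); the function names are illustrative and the test is one-sided, so a False answer means only that 8B5 gives no verdict.

```python
def atom_occurrences(t):
    """List the atom-occurrences of a type from left to right."""
    if isinstance(t, str):
        return [t]
    return atom_occurrences(t[1]) + atom_occurrences(t[2])

def is_skeletal(t):
    """True when every atom occurs at most once in t."""
    occ = atom_occurrences(t)
    return len(occ) == len(set(occ))

def empty_by_8B5(t):
    """True when 8B5 applies (t atomic or skeletal), so t has no inhabitants.
    False does not mean that t is inhabited."""
    return isinstance(t, str) or is_skeletal(t)
```

For example, empty_by_8B5 holds for a→b→c but not for Peirce's law ((a→b)→a)→a, which is nevertheless uninhabited; that case needs the finer argument of 8B6.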
8B6 Example (Peirce's law, cf. 6A1.2) The type τ ≡ ((a→b)→a)→a has no inhabitants.
for some U^(a→b). Since a→b has just one premise, U^(a→b) must have form
U^(a→b) ≡ λy^a.(wV₁...Vᵣ)^b (r ≥ 0).
118 8 Counting a type's inhabitants
Since M^τ is closed, w must be x or y. But w must have a type whose tail is an
occurrence of b and neither x nor y has such a type, so no suitable U^(a→b) exists.
Thus Long(τ) = ∅ and hence Nhabs(τ) = ∅ by 8A8.1. Hence Habs(τ) = ∅ by 8A13(i).
□
8B7 Exercise* Verify the information in Table 8B7a by proving that the types
shown there have no other inhabitants than those displayed. The notation used in
the table's last row is:
PO AX-XI,
Pn = (n z 1),
Qn,i = (Ayn x(Rz Yi)) ...)) (n > 1,1 < i < n).
3   a→b→a                   K   K   K   K
4   (a→b)→(c→a)→c→b         B   B   B   B
5   (a→b→c)→b→a→c           C   C   C   λxyz.xzy
6   (a→b→c)→(a→b)→a→c       S   S   S   S   (see Ex. 8B3)
7   (a→a→b)→a→b             W   W   W   W   λxy.xyy
8 a-* ... -+a-+a II.,....n II; None i - xl ... Xm'xi
m
Long(τ) with depth d = 0, then d = 1, etc. The search will output a sequence
𝒜(τ, 0), 𝒜(τ, 1), 𝒜(τ, 2), ... of finite sets of expressions that will serve as
"approximations" to these members. Those in 𝒜(τ, d) will look like typed β-nf's with depth ≤ d
but may have some "holes" to be filled in by the algorithm at a later stage. These
holes will be represented by new symbols called meta-variables and the approximations
will be called nf-schemes. (In Example 8B3 the expression on the right-hand
side of (1) was in effect a nf-scheme and "U" and "V" were its meta-variables.)
To construct 𝒜(τ, d+1) the approximations in 𝒜(τ, d) will be extended by replacing
their meta-variables by certain chosen nf-schemes with depth 1.
Nf-schemes are defined like terms in 1A1 except that they may contain meta-variables
or term-variables or both, and must satisfy the following restrictions:
(i) each nf-scheme is a β-nf without bound-variable clashes;
(ii) meta-variables do not bind, i.e. λV is forbidden;
(iii) in a composite nf-scheme meta-variables only occur in argument positions (as
defined in 9A1);
(iv) each meta-variable in a nf-scheme occurs only once.
Proper nf-schemes are those that contain at least one meta-variable.
8C1.2 Notes (i) The reader familiar with "context" notation will see that a nf-scheme
with k meta-variables is essentially a context containing k different kinds of
hole.
(ii) The restrictions in 8C1 are imposed simply because no wider class of expressions
will be needed below.
(iii) Non-proper nf-schemes are just terms (in β-nf without bound-variable clashes).
(iv) In what follows, a new variable or meta-variable will be one that has not been
used earlier in the chapter.
(v) Most of the term-notation introduced in Chapter 1 extends to nf-schemes in
an obvious way, and the same holds for the definition of depth in 8A6. But a few
concepts need defining afresh and the next three definitions will cover these.
8C2 Definition A closed nf-scheme is one containing no free variables. (But possibly
containing meta-variables, for example λx.xV and V are closed.)
8C4.1 Example The nf-scheme (i) below is long, though in contrast the term (ii) is
not:
(i) x^((a→b)→c) V^(a→b)
(ii) x^((a→b)→c) z^(a→b)
The description of the search algorithm will now be given: it will begin with a
precise statement of what the algorithm does (in the following theorem), followed
by the algorithm itself.
8C5 Search Theorem for Long(τ) (Ben-Yelles 1979 §§3.14-3.17.) The search algorithm
in 8C6 below accepts as input any composite type τ and outputs a finite or
infinite sequence of sets 𝒜(τ, d) (d = 0, 1, 2, ...) such that, for all d ≥ 0,
(i) each member of 𝒜(τ, d) is a closed long typed nf-scheme with type τ, and is either
(a) a proper nf-scheme with depth d, or
(b) a term with depth d − 1;
Note. This 𝒜(τ, 0) trivially satisfies 8C5(i)-(ii). (The algorithm may be seen as
building approximations to an unknown term M^τ; V^τ is the weakest approximation
and represents the fact that at this stage we know nothing at all about M^τ other
than its type.)
Step d + 1. Assume 𝒜(τ, d) has been defined and satisfies 8C5(i)-(ii).
Substep I. If 𝒜(τ, d) = ∅ or no member of 𝒜(τ, d) contains meta-variables then
stop. (In this case 𝒜(τ, d+1) is undefined and the algorithm's output is just the
finite sequence 𝒜(τ, 0), ..., 𝒜(τ, d).)
Substep II. Otherwise, begin the construction of 𝒜(τ, d+1) by listing the proper
nf-schemes in 𝒜(τ, d) and applying IIa-IIb below to each one.
Subsubstep IIa. Given any proper X^τ ∈ 𝒜(τ, d), list the meta-variables in X^τ;
say they are
(1) ρ ≡ σ₁→...→σₘ→a (m ≥ 0);
first list all i ≤ m for which Tail(σᵢ) ≡ a ≡ Tail(ρ). (If there are none or
m = 0, go direct to IIa2.) For each such i, σᵢ has form
Define
(3) Yᵢ^ρ ≡ λx₁^σ₁...xₘ^σₘ.(xᵢ^σᵢ V₁^σᵢ,₁ ... V_nᵢ^σᵢ,nᵢ)^a
where the x's and V's are distinct new variables and meta-variables. (Yᵢ^ρ
is called a suitable replacement for V^ρ.)
Part IIa2. List the abstractors that cover the (unique) occurrence of V^ρ in
X^τ, in the order they occur in X^τ from left to right; say they are
(5)
Define
(hⱼ ≥ 0).
where the x's and V's are distinct and new. (Zⱼ^ρ is called a suitable
replacement for V^ρ.)
Notes. (i) It is easy to see that each Yᵢ^ρ defined in IIa1 and each Zⱼ^ρ defined in
IIa2 is a long nf-scheme with depth ≤ 1 and the same type as V^ρ, so V^ρ can be
replaced by Yᵢ^ρ or Zⱼ^ρ in X^τ without violating type-restrictions or the restrictions
in the definition of long nf-scheme (8C1, 8C3-4). And the result of making such a
replacement will clearly have depth ≤ d + 1.
(ii) Yᵢ^ρ and Zⱼ^ρ need not contain meta-variables. In fact Yᵢ^ρ is without meta-variables
if σᵢ is an atom, and in this case
8C6.1 Example (cf. Example 8B3.) Applying the search algorithm to the type
τ ≡ (a→b→c)→(a→b)→a→c
produces the following sets:
𝒜(τ, 0) = {V^τ},
𝒜(τ, 1) = {λx₁^(a→b→c) x₂^(a→b) x₃^a.x₁^(a→b→c) V₁^a V₂^b},
𝒜(τ, 2) = {λx₁x₂x₃.x₁x₃(x₂V₃^a)},
8C6.2 Example (cf. Example 8B4.) Applying the search algorithm to the type
τ ≡ a→...→a→a
8C6.3 Example (cf. Example 8B6.) Applying the search algorithm to the type
τ ≡ ((a→b)→a)→a
produces the following sets:
𝒜(τ, 0) = {V^τ},
𝒜(τ, 1) = {λx₁^((a→b)→a).x₁^((a→b)→a) V₁^(a→b)},
𝒜(τ, 2) = ∅.
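The outcome of the search in these examples can be cross-checked with a direct depth-bounded recursion. The sketch below is not Ben-Yelles' nf-scheme procedure: it skips the meta-variable bookkeeping and simply builds every closed long normal form of the given type up to a depth bound, using the shape described in 8B2 (abstract over all premises, then pick a head variable whose type has the same tail). The type encoding (strings for atoms, tuples ("->", left, right)) and the names arrow, split and long_nfs are assumptions made for this illustration; terms are printed as strings with fresh names x1, x2, ....

```python
from itertools import product

def arrow(*ts):
    """Right-nested arrow type: arrow(a, b, c) stands for a→(b→c)."""
    *premises, tail = ts
    for p in reversed(premises):
        tail = ("->", p, tail)
    return tail

def split(t):
    """Return (premises, tail atom) of a type, as in Lemma 8B1."""
    premises = []
    while isinstance(t, tuple):
        premises.append(t[1])
        t = t[2]
    return premises, t

def long_nfs(tau, bound, ctx=()):
    """Yield (term, depth) for every long normal form of type tau whose
    free variables come from ctx and whose depth is at most bound."""
    premises, e = split(tau)
    new = tuple((f"x{len(ctx) + i + 1}", p) for i, p in enumerate(premises))
    ctx2 = ctx + new
    binder = "λ" + " ".join(v for v, _ in new) + "." if new else ""
    for y, sigma in ctx2:
        arg_types, tail = split(sigma)
        if tail != e:                 # the head's type must end in the same atom
            continue
        if not arg_types:
            yield binder + y, 0
        elif bound > 0:
            choices = [list(long_nfs(r, bound - 1, ctx2)) for r in arg_types]
            for combo in product(*choices):
                body = " ".join(f"({m})" for m, _ in combo)
                yield f"{binder}{y} {body}", 1 + max(d for _, d in combo)
```

Running list(long_nfs(arrow(arrow("a", "b", "c"), arrow("a", "b"), "a", "c"), 5)) yields the single term λx1 x2 x3.x1 (x3) (x2 (x3)) with depth 2, in agreement with Example 8B3, while the type of Peirce's law yields nothing at any bound.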
8C6.4 Exercise* List members of the sets 𝒜(τ, 0), 𝒜(τ, 1), ... for all the types in
Table 8B7a.
8C6.5 Comments (i) Besides Ben-Yelles' 1979 search algorithm there are others
in Zaionc 1985 and 1988 and in Takahashi, Akama and Hirokawa 1994. Also
procedures essentially equivalent to the main parts of 8C6 are used in some
decision-algorithms for provability in intuitionist logic, cf. 6B7.3, and in algorithms that seek
to unify pairs of typed λ-terms, for example the original one in Huet 1975 §3.
(ii) The algorithm in 8C6 can be viewed as a context-free grammar for generating
a language whose expressions are exactly the inhabitants of τ. In the standard
notation for context-free grammars (see for example Hopcroft and Ullman 1979
Ch. 4) the meta-variables in 8C6 would be called non-terminal variables and each
suitable replacement defined in Subsubstep IIa of 8C6 would determine a production
in the grammar. The set of all such productions need not be finite, however, so the
original definition of "grammar" must be relaxed if this view is to be made precise.
This view was taken in Huet 1975 §3, Zaionc 1985 §3 and 1988 §5, and was used
and analysed in detail in Takahashi, Akama and Hirokawa 1994.
(iii) A very smooth description of a search algorithm in terms of solving sets of
polynomial equations over the domain {0, 1, 2, ..., ∞} is in Zaionc 199-.
8D The Counting algorithm
In this section the search algorithm will be extended to count and enumerate normal
inhabitants. The extension will be very simple indeed to state; but the proof that it
works will need some care and will be postponed to Sections 8E-F.
As noted in 8A2, it does not matter whether typed or untyped inhabitants are
counted since there is a one-to-one correspondence between the two: in this section
"Long(T)" and "Nhabs(T)" will denote sets of typed terms.
8D1 Definition (i) For the sets 𝒜(τ, d) and 𝒜terms(τ, d) introduced in 8C5, define
𝒜(τ, ≤ d) = 𝒜(τ, 0) ∪ ... ∪ 𝒜(τ, d),
𝒜terms(τ, ≤ d) = 𝒜terms(τ, 0) ∪ ... ∪ 𝒜terms(τ, d).
(ii) For any τ, recall from 2A2 that |τ| is the number of atom-occurrences in τ
and ‖τ‖ is the number of distinct atoms in τ; define
8D2 Stretching Lemma If Long(τ) has a member M^τ with depth d ≥ ‖τ‖ then it has
members with depths greater than any given integer, and hence is infinite.
Proof-note The proof will be given in 8F2 (cf. Ben-Yelles 1979 §§3.20-3.21). It will
show that if M^τ is in Long(τ) and has depth d ≥ ‖τ‖ then M^τ must contain two
distinct components B and B' with the same type and with one inside the other. Then
replacing the smaller one by the larger will change M^τ to a new term M'^τ deeper than
M^τ, and it will be easy to check that M'^τ is a genuine typed term and is in Long(τ).
Then a similar replacement in M'^τ will change it to a still deeper term, and so on.¹
8D3 Shrinking Lemma If Long(τ) has a member M^τ with depth ≥ D(τ) then it has
a member N^τ with
D(τ) − ‖τ‖ ≤ Depth(N^τ) < D(τ).
Proof-note The proof will be given in 8F3 (cf. Ben-Yelles 1979 Thm. 3.28). Just
as for the previous lemma, M^τ will be seen to contain two components B and B'
with the same type and one inside the other; but now M*^τ will be constructed by
replacing the larger by the smaller. A problem is that if B and B' are not chosen
carefully this replacement risks deleting some abstractors and giving a non-closed
M*^τ: to prevent this, and to ensure that the depth of M*^τ lies in the range required,
B and B' will be chosen only after a close analysis of the structure of M^τ.²
¹ This lemma is very like the pumping lemma in the theory of context-free grammars and regular
languages, see for example Hopcroft and Ullman 1979 §3.1 and Comment 8C6.5(ii) above. And the
next lemma will also be very like one in that theory, see Hopcroft and Ullman 1979 §3.3 Theorem 3.7,
though its proof here will be more complicated due to the presence of bound variables. For a
precise statement of the analogy between typed λ-calculus and the theory of context-free grammars
see Takahashi et al. 1994, where a generalised concept of context-free grammar is introduced and
generalizations of the classical results are proved which imply the corresponding λ-results.
² Ben-Yelles 1979 Thm. 3.28, whose lines the proof in 8F3 will follow, was actually slightly weaker than
the above lemma. The first proof of a full-strength shrinking lemma was outlined in Hirokawa 1993c;
it was different from the one to be given in 8F3.
8D3.1 Corollary If Long(T) has a member Mt with depth > D(T) then it has a
member Nt with
IITII <_ Depth(Nt) < D(T).
Proof If Long(r) has a member then T is composite by 8B5. Hence ITI > 2, so
8D4 Theorem (Counting long normal inhabitants) (Ben-Yelles 1979 Cor. 3.33.)
When given a type τ the algorithm in 8D5 below outputs #(Long(τ)) and an
enumeration of Long(τ).
Stop the search algorithm at d = D(τ) and enumerate 𝒜terms(τ, ≤ D(τ)).
[By 8C5 this set is finite and contains all members of Long(τ) with
depth < D(τ).]
Case I: 𝒜terms(τ, ≤ D(τ)) = ∅. Then Long(τ) = ∅.
[By 8D3, if Long(τ) had a member with depth ≥ D(τ) it would have
one with depth < D(τ).]
Case II: 𝒜terms(τ, ≤ D(τ)) has a member with depth ≥ ‖τ‖. Then by 8D2 Long(τ) is
infinite. To enumerate Long(τ), apply the search algorithm to enumerate 𝒜terms(τ, d)
for d = 0, 1, 2, ....
[By 8C5(iv) the union of these sets is Long(τ).]
Case III: 𝒜terms(τ, ≤ D(τ)) has members but they all have depth < ‖τ‖. Then
Long(τ) = 𝒜terms(τ, ≤ D(τ)), which is finite.
[The only way for Long(τ) to differ from this set would be for Long(τ)
to have members with depth d ≥ D(τ), but by 8D3.1 it would then
have a member with ‖τ‖ ≤ d < D(τ), contrary to the assumption of
the present case.]
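The three cases above can be sketched in Python. This is only an illustration under stated assumptions: the function name `count_long_inhabitants` and its inputs are hypothetical, and the search algorithm itself is not modelled. The caller is assumed to supply the depths of all the terms found by the search up to d = D(τ), together with ‖τ‖.

```python
import math

def count_long_inhabitants(depths_found, norm):
    """Case analysis of the counting step (hypothetical helper).

    depths_found: depths of the terms enumerated by the search up to d = D(tau)
    norm:         ||tau||, the number of distinct atoms in tau
    Returns math.inf when Long(tau) is infinite, otherwise its finite size.
    """
    if not depths_found:
        # Case I: the search found nothing, so Long(tau) is empty.
        return 0
    if any(depth >= norm for depth in depths_found):
        # Case II: a member with depth >= ||tau|| can be stretched repeatedly.
        return math.inf
    # Case III: every member found is shallow, and the set found is all of Long(tau).
    return len(depths_found)
```

For instance, two terms of depth 0 with ‖τ‖ = 1 fall under Case III and give the count 2, whereas a single term of depth 1 with ‖τ‖ = 1 falls under Case II.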
Proof By 8A10 the members of Nhabs_η(τ) are the η-nf's of those of Long(τ). And
by 8A10(iii), #_η(τ) = #(Nhabs_η(τ)) = #(Long(τ)).
8D5.2 Corollary (Emptiness test) The algorithm in 8D5 can be used to decide whether
a type T has no inhabitants.
126 8 Counting a type's inhabitants
Proof By 8A13(i), Habs(τ) = ∅ ⟺ #(Nhabs(τ)) = #(τ) = 0. And by 8A10(ii) and
(iii), #(τ) = 0 ⟺ #(Long(τ)) = 0.
8D6 Theorem (Counting all normal inhabitants) (Ben-Yelles 1979 Ch. 3.) When
given a type τ the algorithm in 8D7 below outputs #(τ) and an enumeration of the set
Nhabs(τ) of all the β-normal inhabitants of τ.
8D8 Definition A type τ will be called monatomic if ‖τ‖ = 1, i.e. if only one atom
occurs in τ.
8D9 Theorem (Ben-Yelles 1979 §3.25.) Let τ be a monatomic type with the form
τ ≡ τ1→...→τm→a (m > 0). Then
(i) if at least one τi is composite, #(τ) is either ∞ or 0;
(ii) if τ1 ≡ ... ≡ τm ≡ a then #(τ) = m.
Proof Part (ii) is just 8B4. To prove (i), assume Nhabs(τ) is finite and non-empty
and assume τ has form
$\tau \equiv \tau_1 \to \cdots \to \tau_m \to a$
with m ≥ 1 and at least one premise composite. By 8A8 Long(τ) is also finite and
has a member M^τ. By 8D2, Depth(M^τ) < ‖τ‖ = 1, so M^τ must have form
$M^\tau \equiv \lambda x_1^{\tau_1} \ldots x_m^{\tau_m} \cdot x_p^{\tau_p}$ $(1 \le p \le m)$.
And τp ≡ a since M^τ is long. Choose a composite premise of τ; say it is τi and has
form
$\tau_i \equiv \tau_{i,1} \to \cdots \to \tau_{i,m_i} \to a$ $(m_i \ge 1)$.
For each j < m, choose distinct new variables yl,... , ym J and define
P.+,i
- T+i,l
Y1 .. Ym; T 1,i,m;' j . Xap
8E The structure of a nf-scheme 127
where xP is the rightmost variable in MT (and is therefore distinct from the y's).
Then define
8D11 Comment (Efficiency) At first glance the algorithms in this section may seem
remarkably efficient, as they terminate in only |τ| × ‖τ‖ steps. But this impression
is totally misleading, as each one of these steps involves listing a set 𝒜(τ,d) which,
although finite, may be very large indeed. In fact any counting algorithm will decide
as a special case whether Nhabs(τ) is non-empty, which is equivalent to deciding
whether τ is provable in Intuitionist logic (by 6B7.2), and this decision problem is
polynomial-space complete (Statman 1979a Prop. 4).
[Fig. 8E1a. Construction-tree of the nf-scheme λx1...xm.vY1...Yn, built from the trees for Y1, ..., Yn.]
will lay the foundation by analysing the structure of an arbitrary long typed nf-
scheme.
A key role will be played by a slightly strengthened form of the subformula
property (2B3.1). That property says in effect that the types of all the components
of a closed β-nf M^τ are subtypes of τ, and this implies that all the successes
produced by the search algorithm, growing deeper and deeper, have the types of
their components drawn from the same fixed finite set. This limitation is the source
of the bounds in the stretching and shrinking lemmas.
8E1 Notation Recall that a nf-scheme is essentially a β-nf that may contain meta-
variables under certain restrictions, see Definitions 8C1 (untyped) and 8C3 (typed).
The early parts of the present section will apply to both typed and untyped nf-
schemes, so types will be omitted when nf-schemes are written. But later parts will
apply only to typed nf-schemes and types will then be displayed.
We shall need the notation for positions, components and construction-trees
introduced in 9A1-4.
In writing positions a sequence of n 0's may be written as 0^n (with 0^0 = ∅), and
similarly for 1's and 2's.
Recall from 8A5 that every non-atomic nf-scheme X can be expressed uniquely
in the form
(1) $X \equiv \lambda x_1 \ldots x_m \cdot v Y_1 \ldots Y_n$ $(m + n \ge 1)$,
where v is one of x1,...,xm if X is closed. The construction-tree of such an X is
shown in Fig. 8E1a. The head and arguments of X are v and Y1,...,Yn. (If X is an
atom its head is X and it has no arguments.) Note that the position of Yi is
(2) $0^m 1^{n-i} 2$ $(1 \le i \le n)$.
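The position formula for an argument can be checked mechanically. The following Python sketch is an illustration only: the tuple encoding of terms and the names `build`, `subterm_at` and `argument_position` are assumptions made for this example, not notation from the text. It builds λx1...xm.vY1...Yn as a nested application and confirms that following the position 0^m 1^(n−i) 2 from the root leads to Yi.

```python
def build(m, n):
    """Build \\x1...xm. v Y1 ... Yn with atoms written as strings.

    Applications are ("app", function, argument);
    abstractions are ("lam", variable, body).
    """
    term = "v"
    for i in range(1, n + 1):
        term = ("app", term, f"Y{i}")
    for j in range(m, 0, -1):
        term = ("lam", f"x{j}", term)
    return term

def subterm_at(term, position):
    """Follow a position string: 0 = body, 1 = function part, 2 = argument part."""
    for step in position:
        if step == "0":
            assert term[0] == "lam"
            term = term[2]
        elif step == "1":
            assert term[0] == "app"
            term = term[1]
        else:
            assert term[0] == "app"
            term = term[2]
    return term

def argument_position(m, n, i):
    """Position of Y_i in \\x1...xm. v Y1 ... Yn, for 1 <= i <= n."""
    if not 1 <= i <= n:
        raise ValueError("i must satisfy 1 <= i <= n")
    return "0" * m + "1" * (n - i) + "2"
```

With m = 2 and n = 3 the last argument Y3 sits at position 002 and the first argument Y1 at position 00112.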
[Figure: construction-tree of the nf-scheme λxy.x(λu.uV1)V2.]
8E3.1 Lemma Let X be a typed or untyped nf-scheme with Depth(X) > 1. (Depth is
defined in 8A6.) Then
(i) Depth(X) is the maximum of the depths in X of all subarguments of X,
(ii) X has at least one subargument whose depth in X is the same as Depth(X), and
each such subargument is an atom or abstracted atom.
(Z0, Z1, ..., Zk) (k ≥ 1)
such that Z0 ≡ X and Zi is an argument of Z_{i−1} for i = 1,...,k, and Zk ≡ Z. It is
called unextendable if Z is an atom or abstracted atom. Its length is k (not k + 1).
Proof For (i) use induction on |X|; for (ii) use 8E3.1.
8E5.1 Note (i) If X has no bound-variable clashes the members of IA(Z) are distinct
and so are those of CA(Z, X).
(ii) IA(Z) and CA(Z, X) are sequences of variables not components.
(iii) For typed nf-schemes each variable in IA(Z) or CA(Z, X) is typed.
(iv) If the argument-branch from X to Z is (Z0,...,Zk) (k ≥ 1), then
$\mathrm{CA}(Z, X) = \mathrm{IA}(Z_0) * \cdots * \mathrm{IA}(Z_{k-1})$
where "*" denotes concatenation of sequences. (Because the abstractors whose
scopes contain Z are exactly the initial abstractors of Z0, ..., Z_{k−1}.)
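Part (iv) of the note can be illustrated by a short Python sketch. The encoding is an assumption made for this example: a nf-scheme is a triple (initial abstractors, head, list of arguments), a meta-variable is a triple with no abstractors and no arguments, and a subargument Z is located by the list of argument indices leading down to it from X. The function names `IA` and `CA` merely echo the notation of the text.

```python
def IA(scheme):
    """Initial abstractors of a nf-scheme, as a list of variable names."""
    return list(scheme[0])

def CA(X, path):
    """Concatenate IA(Z0), ..., IA(Z_{k-1}) along the argument-branch from X to Z.

    path lists, for each step of the branch, the index of the argument taken;
    the empty path means Z is X itself.
    """
    covering = []
    node = X
    for index in path:
        covering += IA(node)     # abstractors of the current branch member
        node = node[2][index]    # move to the chosen argument
    return covering

# The nf-scheme \xy. x (\u. u V1) V2 in this encoding.
V1 = ((), "V1", [])
V2 = ((), "V2", [])
inner = (("u",), "u", [V1])
X = (("x", "y"), "x", [inner, V2])
```

Here `CA(X, [0, 0])` collects x, y, u for the meta-variable V1, while `CA(X, [1])` collects only x, y for V2.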
The next definition and lemma will have meaning for typed nf-schemes only. The
lemma will be the strengthened form of the subformula property mentioned at the
start of the section, and will lead to a computation of upper bounds for several key
sequences and sets. Part of it will use notation from 9E.
Length(IAT(Z^σ)) = m.
Proof Since Z^σ is long, IAT(Z^σ) coincides with the sequence of all premises of σ,
so (ii) holds. Also if σ is composite we have
(1) $\mathrm{IAT}(Z^\sigma) \in \mathrm{NSS}(\sigma)$
by the definition of NSS(σ) in 9E9. Now (i) implies (iv) by 9E9.2(iii), and (iv) and
(1) imply (iii). Hence only (i) remains to be proved.
The proof of (i) is an induction on |X^τ|. To make this work we shall prove
(2) If X^τ is a long member of TNS(Γ) (defined in 8C3) and
Γ = {u1:θ1,...,up:θp, V1:φ1,...,Vq:φq}
and Z^σ is a subargument of X^τ, then σ occurs as a positive
subpremise of
$\theta_1 \to \cdots \to \theta_p \to \tau$.
Basis. If X^τ is an atom the conclusion of (2) holds vacuously.
Induction step. Let X^τ have form
(3) $(\lambda x_1^{\tau_1} \ldots x_m^{\tau_m} \cdot (y^{(\rho_1 \to \cdots \to \rho_n \to e)} X_1^{\rho_1} \ldots X_n^{\rho_n})^{e})^{(\tau_1 \to \cdots \to \tau_m \to e)}$
where m, n ≥ 0 and τ ≡ τ1→...→τm→e. Then either y ≡ xi for some i ≤ m or y ≡ ui
for some i ≤ p. If y ≡ xi then
(4) $\tau_i \equiv \rho_1 \to \cdots \to \rho_n \to e$
and if y ≡ ui then
(5) $\theta_i \equiv \rho_1 \to \cdots \to \rho_n \to e$.
In both cases each of ρ1,...,ρn occurs as a positive subpremise of
(6) $\theta_1 \to \cdots \to \theta_p \to \tau$.
Now Z^σ must be in X_j^{ρj} for some j ≤ n. If Z^σ ≡ X_j^{ρj} then σ ≡ ρj and the
conclusion of (2) follows by the above. Next, suppose Z^σ is a subargument of X_j^{ρj}.
Note that
$X_j^{\rho_j} \in \mathrm{TNS}(\{x_1{:}\tau_1, \ldots, x_m{:}\tau_m\} \cup \Gamma)$.
Hence, by the induction hypothesis, σ occurs as a positive subpremise of
$\tau_1 \to \cdots \to \tau_m \to \theta_1 \to \cdots \to \theta_p \to \rho_j$.
Thus σ occurs as a positive subpremise of (6), giving (2). □
8E7.1 Corollary If X^τ is a closed long typed nf-scheme, the type of each meta-variable
in X^τ either occurs as a positive subpremise of τ or is τ itself.
The main effect of 8E7 is to connect IAT(Z^σ), which in general depends on the
structure of Z^σ and hence implicitly on that of X^τ, with NSS(τ) which depends on τ
and nothing else. The next corollary will use this to deduce reasonably neat bounds
for IA(Z^σ) and CA(Z, X^τ).
Proof For (i): Length(IAT(Z^σ)) ≤ |τ| − 1 by 8E7(iii) and 9E9.3(iv).
For (ii): If Z ≡ X the left side of (ii) is 0. If Z ≢ X let (Z0,...,Zk) (k ≥ 1) be the
argument-branch from X to Z; then by 8E5.1(iv)
$\mathrm{Length}(\mathrm{CA}(Z, X)) = \mathrm{Length}(\mathrm{IA}(Z_0)) + \cdots + \mathrm{Length}(\mathrm{IA}(Z_{k-1})) \le k(|\tau| - 1)$
by (i). But Depth(X) ≥ k by 8E4.1(ii), so (ii) holds.
For (iii): Each ρi is in IAT(X^τ) or in IAT(Y^θ) for some subargument Y^θ of X^τ;
and in both cases ρi ∈ ⋃NSS(τ) (in the former case trivially, and in the latter case
by 8E7(iii)). Then use 9E9.3(iii).
8E7.3 Exercise* Show that if τ is composite and d ≥ 1 and 𝒜(τ,d) is defined, then
(i) each X^τ ∈ 𝒜(τ,d) contains ≤ |τ|^d meta-variables,
(ii) #(_q/(T, d)) < (d x
ITI)(IT1"
This section fills in the three gaps that were left in the verification of the counting
algorithm in 8C-D: the stretching and shrinking lemmas and the "completeness"
part of the search theorem.
8F1 Search-Completeness Lemma Part (iii) of the search theorem 8C5 holds; i.e. if
τ is composite and d ≥ 0, then
$\mathrm{Long}(\tau, d) \subseteq \mathcal{A}(\tau, \le d + 1)$.
8F Stretching, shrinking and completeness 133
Proof The lemma will be proved by induction on d but to make the induction work
we must prove something slightly stronger. Recall that Long(T,d) is the set of all
long inhabitants of T with depth < d (8A7.2). Let L*(T,d) be the set of all long typed
closed nf-schemes X^τ such that Depth(X^τ) = d and
(1) (a) X^τ is proper and all its meta-variables have depth d in X^τ,
    (b) all subarguments with depth d in X^τ are meta-variables.
We shall prove both the inclusions
(2) $L^*(\tau, d) \subseteq \mathcal{A}(\tau, \le d)$,
(3) $\mathrm{Long}(\tau, d) \subseteq \mathcal{A}(\tau, \le d + 1)$,
where (2) is understood modulo renaming of meta-variables (i.e. (2) says that if
X^τ ∈ L*(τ, d) then 𝒜(τ, ≤ d) contains either X^τ or a nf-scheme that differs from X^τ
only by replacing its meta-variables by distinct others.)
Basis: d = 0. For (2): the only nf-schemes with depth 0 are meta-variables;
also 𝒜(τ,0) = {V^τ} by Step 0 of the search algorithm (8C6), so (2) holds modulo
renaming.
For (3): let τ ≡ τ1→...→τm→e (m ≥ 1), and let M^τ ∈ Long(τ,0). Then M^τ is a
term and has form
(4) $\lambda y_1^{\tau_1} \ldots y_m^{\tau_m} \cdot y_i^{\tau_i}$ $(1 \le i \le m,\ \tau_i \equiv e)$.
Now 𝒜(τ,0) = {V^τ}. To construct 𝒜(τ,1) apply 8C6 Step 1: Part IIa1 therein will
output (4) as a suitable replacement for V^τ because the tail of τi is isomorphic to
that of τ. Hence 𝒜(τ,1) is defined and contains M^τ.
Induction step: d to d+1. For (2): let X ∈ L*(τ,d+1). (Types will not be displayed
from now on.) Then Depth(X) = d + 1, so by 8E3.1(ii) X has a subargument whose
depth in X is d + 1, and so by 8E4.1 X has one whose depth in X is d. List all
such subarguments (without repetitions); say they are W1,...,Wr (r ≥ 1). Clearly
W1,...,Wr are disjoint components since they all have the same depth d in X; and
since Depth(X) = d + 1 we have Depth(Wi) ≤ 1 for each i. Also since X satisfies
(1a) and (1b) relative to d + 1, no Wi can be a meta-variable; hence each Wi must
have form
(5) $W_i \equiv \lambda x_{i,1} \ldots x_{i,m_i} \cdot y_i V_{i,1} \ldots V_{i,n_i}$ $(m_i, n_i \ge 0)$.
Let X' be the result of replacing each Wi in X by a distinct new meta-variable Vi
with the same type as Wi.
Then X' is a nf-scheme. (It satisfies restriction (iii) in Definition 8C1 because each
Wi is a subargument.) And it is clearly long and closed and has depth d. Also, by its
construction X' satisfies condition (1b) for membership of L*(τ,d). It also satisfies
(1a), because if it contained a meta-variable-occurrence V at a depth < d such a V
could not be a Vi and hence would occur also in X at a depth < d, contrary to the
assumption that X satisfies (1a) relative to d + 1.
Hence X' ∈ L*(τ,d). Therefore by the induction hypothesis there is an X'' ∈
𝒜(τ, ≤ d) that is identical to X' except perhaps for alphabetic variation of meta-variables.
Apply Step d + 1 of Algorithm 8C6 to each Vi in X''. Since Wi has form
(5) and is part of a closed nf-scheme, namely X, it is easy to see that Wi is a suitable
replacement for Vi (modulo renamings in Wi). Hence the algorithm will give X as
an extension of X'', so X ∈ 𝒜(τ, ≤ d + 1), giving (2) for d + 1.
For (3): let M ∈ Long(τ,d + 1). Then by 8E3.1 M has a subargument whose
depth in M is d + 1. List all such subarguments without repetitions; say they are
U1,...,Ur (r ≥ 1). Clearly U1,...,Ur are disjoint; and since Depth(M) = d + 1 each
Ui must have depth 0 and hence must have form
(6) $U_i \equiv \lambda x_{i,1} \ldots x_{i,m_i} \cdot y_i$ $(m_i \ge 0)$.
Define M' to be the result of replacing each Ui in M by a distinct new meta-variable
Vi with the same type as Ui. Then M' is a genuine nf-scheme. And it is clearly long
and closed and has depth d + 1. Also M' satisfies condition (1b) for membership of
L*(τ, d + 1), and satisfies (1a) because all its meta-variables have been introduced by
the above replacements. Hence M' ∈ L*(τ, d + 1).
Therefore by (2) for d + 1 there is an M'', differing from M' only by renaming
meta-variables, such that
$M'' \in \mathcal{A}(\tau, \le d + 1)$.
8F2 Detailed Stretching Lemma (cf. 8D2.) If Long(τ) has a member M^τ with depth
≥ ‖τ‖ then
(i) there exists M*^τ ∈ Long(τ) with Depth(M*^τ) ≥ Depth(M^τ) + 1,
(ii) Long(τ) is infinite.
Proof [Ben-Yelles 1979 §§3.20-3.21.] We shall prove (i) and then (ii) will follow by
repetition. Types of typed terms will be omitted.
To prove (i), let M be a typed closed long β-nf with type τ and without bound-variable
clashes. Let d = Depth(M) ≥ ‖τ‖ ≥ 1. By 8E4.1, M has at least one
argument-branch with length d. Choose any such branch, call it
(1) (N0,...,Nd),
where N0 ≡ M and N_{i+1} is an argument of Ni for i = 0,...,d − 1. Each of N0,...,Nd
must have form
(2) $N_i \equiv \lambda x_{i,1} \ldots x_{i,m_i} \cdot y_i P_{i,1} \ldots P_{i,n_i}$ $(m_i, n_i \ge 0)$,
and for i ≤ d − 1 we have N_{i+1} ≡ P_{i,k_i} for some k_i ≤ n_i (and hence n_i ≥ 1). Since
d = Depth(M) the last member of (1) has no arguments, so n_d = 0.
For i = 0,...,d let Bi be the body of Ni; that is
(3) $B_i \equiv y_i P_{i,1} \ldots P_{i,n_i}$.
Since Ni is long, the type of Bi is an atom. And this atom occurs in τ by 2B3(i). But
‖τ‖ ≤ d and there are d + 1 components B0,...,Bd, so at least two of these must
have the same type.
Choose any pair (p, p + r) such that B_p and B_{p+r} have the same type (and r ≥ 1).
Note that B_p properly contains B_{p+r} and
$\mathrm{Depth}(B_p) \ge r + \mathrm{Depth}(B_{p+r})$.
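The pigeonhole step of this proof, choosing a pair (p, p + r) of bodies with the same atomic type, can be sketched in Python. The function name `first_repeated_type` and the list encoding are assumptions made for illustration: the input is the list of atoms that are the types of B0, ..., Bd along the chosen branch.

```python
def first_repeated_type(body_types):
    """Return (p, r) with r >= 1 such that bodies p and p + r have the same type.

    body_types: the atomic types of B_0, ..., B_d, in branch order.
    Returns None if all the types are distinct.
    """
    first_seen = {}
    for index, atom in enumerate(body_types):
        if atom in first_seen:
            p = first_seen[atom]
            return (p, index - p)
        first_seen[atom] = index
    return None
```

Whenever the list has more entries than there are distinct atoms available, as happens when d ≥ ‖τ‖, the function is guaranteed to return a pair rather than `None`.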
8F3 Detailed Shrinking Lemma (cf. 8D3.) If Long(τ) has a member M^τ with depth
≥ D(τ) then
(i) it has a member M*^τ with
Proof [cf. Ben-Yelles 1979 Thm. 3.28.] Part (ii) is proved by repeating (i) and taking
the first output with depth < D(τ).
Part (i) is proved as follows. (Types of typed terms will be omitted.) Let M be
a member of Long(τ) without bound-variable clashes. Let d = Depth(M) ≥ D(τ).
Incidentally D(τ) ≥ 2 because τ is composite since atomic types have no inhabitants.
By 8E4.1, M has at least one argument-branch with length d; to reduce the depth
of M we must shrink all these branches. Consider any such branch; just as in the
proof of the stretching lemma 8F2 it has form
(1) (N0,...,Nd),
where N0 ≡ M and N_{i+1} is an argument of Ni for i = 0,...,d − 1. And
(2) $N_i \equiv \lambda x_{i,1} \ldots x_{i,m_i} \cdot y_i P_{i,1} \ldots P_{i,n_i}$ $(m_i, n_i \ge 0)$
for 0 ≤ i ≤ d − 1. Let the type of Ni be
(3) $\rho_i \equiv \rho_{i,1} \to \cdots \to \rho_{i,m_i} \to a_i$.
Then since Ni is long, the types of x_{i,1}, x_{i,2}, ... are exactly ρ_{i,1}, ρ_{i,2}, ...; that is, using
the "IAT" notation introduced in 8E6,
$\mathrm{IAT}(N_i) = \langle \rho_{i,1}, \ldots, \rho_{i,m_i} \rangle$.
Just as in the proof of 8F2 let Bi be the body of Ni for i = 0,...,d. Then the type
of Bi is the tail of the type of Ni, namely a_i.
Define a sequence of integers d_0, d_1, ... thus: d_0 = 0 and d_{j+1} is the least i > d_j
such that IAT(Ni) differs from all of
(4) $\mathrm{IAT}(N_{d_0}), \mathrm{IAT}(N_{d_1}), \ldots, \mathrm{IAT}(N_{d_j})$.
Let n be the greatest integer such that d_n is defined. The branch (1) has only d
members after N0, so n ≤ d and d_n ≤ d. Then
(5) $0 = d_0 < d_1 < \cdots < d_n \le d$,
and for 0 ≤ i ≤ d, IAT(Ni) is identical to one of
(6) $\mathrm{IAT}(N_{d_0}), \mathrm{IAT}(N_{d_1}), \ldots, \mathrm{IAT}(N_{d_n})$.
Also the n+1 IAT's in (6) are distinct, and by 8E7 they are either empty or members
of NSS(τ). Hence by 9E9.3(ii),
(7) $n + 1 \le |\tau|$.
Now d_0,...,d_n partition the set {0, 1, ..., d} into the following n + 1 non-empty sets
which will be called IAT-intervals:
$\theta_j = \{d_j, d_j + 1, \ldots, d_{j+1} - 1\}$ $(0 \le j \le n - 1)$,
$\theta_n = \{d_n, d_n + 1, \ldots, d\}$.
If θ_j contains two numbers p, p + r such that r ≥ 1 and B_p and B_{p+r} have the same
type (i.e. a_p = a_{p+r}) we shall call (p, p + r) a tail-repetition. It will be called minimal
iff there is no other tail-repetition (p', q') with
$p \le p' < q' \le p + r$.
Now a θ_j that contains no tail-repetitions must have ≤ ‖τ‖ members. Because
for such a θ_j the atoms
$a_{d_j}, \ldots, a_{d_{j+1} - 1}$
must all be distinct, and by (3) each a_i occurs in ρ_i, which occurs in τ by 8E7, and
there are only ‖τ‖ distinct atoms in τ. This argument also shows that for a minimal
tail-repetition (p, p + r) we have
(8) $r \le \|\tau\|$.
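The bookkeeping in this part of the proof can be made concrete with a small Python sketch. It is an illustration under assumptions: the function names are hypothetical, each IAT(Ni) is encoded as a tuple of type names, and each tail a_i as a string. The sample data at the end encode the branch of the Church numeral for four with type (a→a)→a→a that is discussed in Example 8F3.1.

```python
def breakpoints(iats):
    """d_0 = 0, and d_{j+1} is the least i > d_j whose IAT differs from all earlier ones."""
    ds, seen = [0], [iats[0]]
    for i in range(1, len(iats)):
        if iats[i] not in seen:
            ds.append(i)
            seen.append(iats[i])
    return ds

def iat_intervals(ds, d):
    """Partition {0, ..., d} into the intervals starting at d_0, ..., d_n."""
    ends = ds[1:] + [d + 1]
    return [list(range(lo, hi)) for lo, hi in zip(ds, ends)]

def minimal_tail_repetitions(interval, tails):
    """Pairs (p, q) in one interval with equal tails and no other such pair inside."""
    reps = [(p, q) for p in interval for q in interval
            if p < q and tails[p] == tails[q]]
    return [(p, q) for (p, q) in reps
            if not any((p2, q2) != (p, q) and p <= p2 and q2 <= q
                       for (p2, q2) in reps)]

# Branch N_0, ..., N_4 of the numeral four: only N_0 has initial abstractors.
iats = [("a->a", "a"), (), (), (), ()]
tails = ["a", "a", "a", "a", "a"]
ds = breakpoints(iats)
intervals = iat_intervals(ds, 4)
```

On this data the breakpoints are 0 and 1, the intervals are {0} and {1, 2, 3, 4}, and the second interval has the three minimal tail-repetitions (1, 2), (2, 3) and (3, 4).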
Now there are n + 1 IAT-intervals in the given branch and n + 1 ≤ |τ| by (7), so if
no interval contained a tail-repetition the branch would have ≤ |τ| × ‖τ‖ members.
But the branch has d + 1 members and
But M is closed, so the free v in B_{p+r} in M must be in the scope of a λv in one of
IA(N0),...,IA(N_{p+r}). Hence v occurs in IA(N_h) for some h with p + 1 ≤ h ≤ p + r;
in the notation of (2) we have
$v \equiv x_{h,k}$
for some k ≤ m_h. And the type of v is ρ_{h,k} ∈ IAT(N_h). Now by the definition of
d_0,...,d_n, and the fact that the tail-repetition (p, p + r) we are eliminating is in an
interval θ_j, IAT(N_h) coincides with one of
$\mathrm{IAT}(N_{d_0}), \ldots, \mathrm{IAT}(N_{d_j})$;
say IAT(N_h) = IAT(N_{d_q}) for some q ≤ j. Hence there is a variable
$x_{d_q,k} \in \mathrm{IA}(N_{d_q})$
with the same type as v. Replace v by this variable throughout M'. The result will
be a long β-nf with the same type and depth as M' and containing one less free
variable.
Similarly replace every variable of B_{p+r} that is free in M' by a new one which has
the same type but is bound in M'. The result will be a long β-nf M'' with the same
type and depth as M' and which is closed.
Now M'' has been obtained by removing a type-repetition from an argument-branch
in M which originally contained d subarguments. And by (8) the number of
subarguments removed is ≤ ‖τ‖. Hence
(10) $d - \|\tau\| \le \mathrm{Depth}(M'') \le d$.
If Depth(M'') < d, define M* ≡ M''. If not, select a branch in M'' with length
d and apply the above removal procedure to it, then continue shortening branches
with length d until there are none left. (This process must terminate because each
removal strictly reduces |M|.) Define M* to be the first term produced by this
procedure whose depth is less than d. Then
8F3.1 Example Let τ ≡ (a→a)→a→a, and let M^τ be a typed version of the Church
numeral for the number four, namely
$M^\tau \equiv \lambda u^{a \to a} v^{a} \cdot u(u(u(uv)))$.
Then
‖τ‖ = 1, |τ| = 4, D(τ) = |τ| × ‖τ‖ = 4,
n = 1, d_0 = 0, d_1 = 1, θ_0 = {0}, θ_1 = {1, 2, 3, 4}.
There are three minimal tail-repetitions in θ_1 and the last one is (3, 4). The proof of
8F3 removes this by replacing uv by v; this changes M to
$M^* \equiv \lambda u^{a \to a} v^{a} \cdot u(u(uv))$.
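The numerical part of this example is easy to recompute. In the Python sketch below the encoding is an assumption made for illustration: an atom is a string and a composite type ρ→σ is the pair (ρ, σ). The helper names `size`, `norm` and `bound_D` are hypothetical stand-ins for |τ| (the number of occurrences of atoms), ‖τ‖ (the number of distinct atoms) and D(τ) = |τ| × ‖τ‖.

```python
def atom_occurrences(t):
    """List every occurrence of an atom in the type t, from left to right."""
    if isinstance(t, str):
        return [t]
    left, right = t
    return atom_occurrences(left) + atom_occurrences(right)

def size(t):
    """|t|: the number of occurrences of atoms in t."""
    return len(atom_occurrences(t))

def norm(t):
    """||t||: the number of distinct atoms in t."""
    return len(set(atom_occurrences(t)))

def bound_D(t):
    """D(t) = |t| x ||t||."""
    return size(t) * norm(t)

# (a -> a) -> a -> a, with the arrow associating to the right.
tau = (("a", "a"), ("a", "a"))
```

For this τ the three functions give 4, 1 and 4, agreeing with the values stated in the example.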
8F3.2 Warning As mentioned in 8D10(iii) the proof of the shrinking lemma does
not necessarily apply to restricted systems of λ-terms, for example the λI-calculus.
In fact there is no guarantee that if we shrink a λI-term the result will still be a
λI-term, since shrinking may cut out some variables.
9
Technical details
To avoid interrupting the main lines of thought in the earlier chapters some concepts
were defined there only in outline and their main properties were stated without
proof. This chapter gives the full definitions and proofs. It should be read only
as required to follow the arguments in the other chapters. Its sections are largely
independent of each other.
9A1 Definition (Positions) A position p = i1...im is any finite (perhaps empty) string
of symbols such that i1,...,i_{m−1} are integers and i_m is either an integer or an asterisk,
*. Its length is m, and if m = 0 we say p = ∅.
If m ≥ 1 and i_m = 1 we call p a function position;
if m ≥ 1 and i_m = 2 we call p an argument position;
if m ≥ 1 and i_m = 0 we call p a body position; and
if m ≥ 1 and i_m = * we call p an abstractor position.
(Positions containing integers ≥ 3 will be used later but not in the present section.)
The concatenation pq of positions p = i1...im and q = j1...jn is defined thus:
p∅ = p, ∅q = q, and if m, n ≥ 1 and i_m ≠ * define
$pq = i_1 \ldots i_m j_1 \ldots j_n$.
(pq is undefined if m, n ≥ 1 and i_m = *.)
A refinement of p is any position with form pq; it is proper if q ≠ ∅.
Two positions p = i1...im and q = j1...jn are said to diverge if neither is a
refinement of the other, i.e. if there exists h such that
$i_h \ne j_h$, $1 \le h \le \mathrm{Min}\{m, n\}$.
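The operations on positions can be sketched in Python as follows. The encoding is an assumption made for illustration: a position is a tuple whose entries are integers, except that the last entry may be the string "*", and the empty tuple plays the role of the empty position. The function names are hypothetical.

```python
def concat(p, q):
    """Concatenation pq; returns None in the case the definition leaves undefined."""
    if p and q and p[-1] == "*":
        return None
    return p + q

def is_refinement(r, p):
    """True if r has the form pq for some position q (possibly empty)."""
    return r[:len(p)] == p and concat(p, r[len(p):]) is not None

def diverge(p, q):
    """True if neither position is a refinement of the other."""
    return not is_refinement(p, q) and not is_refinement(q, p)
```

For example `(1, 2)` and `(1, 0)` diverge because they differ at the second place, whereas `(1,)` and `(1, 2)` do not, since the second is a proper refinement of the first.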
9A The structure of a term 141
9A2 Definition (Occurs, occurrence) The phrase "P occurs in M at position p" (or
"M contains P at position p") is defined by induction on the length of p, thus:
9A2.1 Note It is easy to prove that at most one term can occur in M at one position.
(We assume of course that a term cannot be simultaneously an application and an
abstraction, or a composite term and an atom; also that
9A2.2 Notation An occurrence (P, p, M) may be called simply "P" when no confusion
is likely. A binding occurrence of x may be called either of
9A3.1 Note The reason that binding occurrences of variables are denied the name
"components" in the above definition is that they play a very different role from
other occurrences. In a term such as we shall think of x and y as being the
material from which the term is built, but think of z as being part of one of the
operations which do the building.
x.0.
(ii) If M = PQ, its tree is obtained by first concatenating "1" onto the left end of
each position-label in the tree for P, then concatenating "2" onto the left of each
position-label in the tree for Q, and then placing an extra node beneath the two
modified trees, as shown below.
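[Diagram: the modified trees for P and Q above a new node labelled PQ at position ∅.]
(iii) If M ≡ λx.P, its tree is obtained by first concatenating "0" onto the left end
of each position-label in the tree for P, and then placing an extra node beneath the
modified tree, as shown below.
[Diagram: the modified tree for P above a new node labelled λx.P at position ∅.]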
9A4.1 Example The construction-tree of (λx.yx)(λz.x(yx)) is shown in Fig. 9A4.1a.
[Fig. 9A4.1a. Construction-tree of (λx.yx)(λz.x(yx)).]
9A4.2 Note Each component has a node on the construction-tree; but binding
occurrences of variables are not components and do not have their own nodes.
P contains R R is a part of P R is in P;
9A5.1 Notes (i) If P is not disjoint from R it is easy to see that either P is in R
or R is in P. Hence if P1,...,Pn are components of a term M and no two are
disjoint they must be linearly ordered by the relation "is in"; i.e. there must exist a
permutation (i1,...,in) of (1,...,n) such that
$P_{i_1}$ is in $P_{i_2}$ is in $P_{i_3}$ is in ... is in $P_{i_n}$.
9B Residuals
This section summarises some properties of β-contractions needed in the proof of
the weak normalisation theorem in 5C. The full theory of β-reduction is in fact
quite deep (see Barendregt 1984 Chapters 3 and 11-14), but none of it is used in
this book except the following few very basic ideas.
Everything in this section is valid for both the untyped terms of Chapter 1 and
the typed terms of Chapter 5.
9B1 Notation Recall the definitions of β-redex and β-contraction in 1B1. (In this
section "β" will usually be omitted.) A redex-occurrence
$R \equiv (\lambda x.M)N$
is a particular occurrence of a redex in a term. The notations function part and
argument part of R will be used here for λx.M and N respectively.
Recall from 1B2 that a reduction is a finite or infinite sequence of contractions
(P1, R1, Q1), (P2, R2, Q2), ...
where P1 ≡α P and Qi ≡α P_{i+1} for i = 1,2,.... This definition allows a reduction
to make α-conversions before or after each of its contractions, but the reader may
safely ignore these and concentrate on the contractions; the next lemma will say
contractions are unaffected by α-conversions in a certain precise sense.
Recall from 1B3 that the length of a reduction is the number of its contractions.
(And α-conversions are not counted.)
In this section "c" will denote an arbitrary contraction and "r" an arbitrary
reduction.
If r1 is a finite reduction from a term P to a term Q and r2 is a reduction of Q,
the reduction consisting of r1 followed by r2 will be called
r1 + r2.
Proof-note Two cases must be considered: (i) P' comes from P by replacing λx.M
by λy.[y/x]M, (ii) P' comes from P by replacing λy.[y/x]M by λx.M (where
y ∉ FV(M)). The proof is boring but can be slightly simplified by restricting rule
(α) as follows (cf. HS 86 Remark 1.21):
9B1.2 Warning The above lemma does not claim that R' ≡α R. In fact the following
is an example where this fails:
P - Ax-R, R =_ P' _
The next definition will describe what happens to a redex-occurrence S in a term
P when another redex-occurrence R in P is contracted.
(The second form will apply if some changes of bound variables are needed during
the substitution [N/x]M; the third if S is in the scope of an occurrence of λx in
M.) We call this S' the residual of S.
Subcase 4b: S is in N. Let M contain k ≥ 0 free occurrences of x. The contraction
of R changes (λx.M)N to [N/x]M which contains k substituted occurrences of
N, each containing an occurrence of S. We call these occurrences of S the residuals
of S.
9B2.1 Notation The set of all the residuals of S with respect to R will be called
"S/R" (or "S/c" if c is the contraction (P,R,P')).
A redex-occurrence S will be called the parent of its residuals.
9B2.2 Note (i) If S is not a part of R then S/R has exactly one member (because
then Case 2 and Subcase 4b cannot happen).
(ii) Each member of S/R is an occurrence of a term (λz.H')L' where z ≡ y except
possibly in Case 4a if changes of bound variables are needed, and H' and L' are
either H and L or terms obtained from them by substitutions.
(iii) The definition of residuals is meaningful for the typed-term system in Chap-
ter 5 as well as for untyped terms. If P is a typed term then by (ii) each residual
of S in P' has the same type as S. Further, its function-part has the
same type as the function-part of S. This fact is needed in the proof of the
weak normalisation theorem in 5C.
[N/x] M.
Let S be any newly created redex-occurrence in P'; say
S =_
This section should be read in parallel with the outline definition of "TA_λ-deduction"
in 2A8. That outline is filled in here and some lemmas are proved for use in
Chapters 5 and 8.
A TA_λ-deduction as defined below will be a slightly more elaborate object than the
trees shown in the examples in Chapter 2, and each deduction-tree in that chapter
should be regarded as an abbreviation for a deduction as defined below.
(ii) If Δ1 and Δ2 are deductions whose bottom nodes are labelled by the position
∅ and the formulae, respectively,
$\Gamma_1 \mapsto P : (\sigma \to \tau)$, $\Gamma_2 \mapsto Q : \sigma$,
and Γ1 ∪ Γ2 is consistent, then a new deduction called (Δ1Δ2) is constructed by first
putting "1" on the left end of each position-label in Δ1, then putting "2" on the left
end of each position-label in Δ2, and then placing an extra node beneath the two
modified deductions, as shown below.
[Diagram: the modified Δ1 and Δ2 above a new node labelled Γ1 ∪ Γ2 ↦ PQ:τ at position ∅.]
(iii) If Δ1 is a deduction whose bottom node is labelled by the position ∅ and the
formula Γ ↦ P:τ, and Γ is consistent with x:σ, then a new deduction called (λx.Δ1)
9C The structure of a TAR-deduction 149
is constructed by first putting "0" on the left end of each position-label in Δ1 and
then placing an extra node beneath the modified deduction, as shown below.
9C1.1 Example For examples of deductions see 2A8.2-5 and the answers to Exer-
cises 2A8.7-8. (The deductions in those examples omit all position-labels and are
displayed in the standard space-saving way using horizontal lines instead of the
node-and-line way shown in the diagrams above, but this is merely a matter of
representation and is not intended to conflict with the definition.)
9C1.2 Note If we remove contexts and types from all the TA_λ-formulae in a
deduction-tree whose conclusion is Γ ↦ M:τ, it will be transformed into the
construction-tree of M as defined in 9A4.
9C2 Definition The length, |Δ|, of a TA_λ-deduction Δ is defined thus: |Δ| = 1 for
atomic Δ, and for composite Δ
Clearly |Δ| = |M| if the conclusion of Δ is Γ ↦ M:τ. (For |M| see 1A2.)
9C3 Definition The set of all type-variables occurring in a deduction Δ will be called
Vars(Δ).¹
$\Gamma \mapsto M : \tau$.
Then from 9C1 it is easy to see that Δ contains a deduction Δ_P of a formula
(1) $\Gamma_P \mapsto P : \sigma_P$
for some Γ_P and σ_P.
It is natural to expect that replacing P by a new term T with the same type
will leave the type of M unchanged. But the types of P, T and M all depend on
contexts: Γ for M, Γ_P for P, and an unspecified one for T. So our expectation is
imprecise. It will be made precise and proved in the two replacement lemmas below.
Warning: deductions Δ contain term-variables as well as type-variables but only the latter are included
in Vars(Δ).
The first step towards precision is to clarify the relationship between Γ_P and Γ.
If P is not in the body of a λ-abstract in M then clearly
(2) $\Gamma_P \subseteq \Gamma$.
But now suppose P is in the bodies of some λ-abstracts in M; say there are n
distinct such abstracts:
λx1.N1, ..., λxn.Nn,
for some Γ_P and σ_P; let {T/P}_p M be the result of replacing P at p by a term T such
that
$\Gamma_T \mapsto T : \sigma_P$
for some Γ_T ⊆ Γ_P. Then
$\Gamma \mapsto \{T/P\}_p M : \tau$.
9C6.1 Corollary The conclusion of Lemma 9C6 also holds if instead of 9C6(ii) we
assume that Γ ∪ Γ_T is consistent and none of x1,...,xn occurs free in T.
Proof If Subjects(Γ_T) = FV(T) the above assumption implies that Γ_T satisfies 9C6
(ii).
9D1 Definition (Occurs, occurrence) Here positions are the same as in 9A1; however,
positions containing * are not needed in the present section. The phrase "σ occurs
in τ at position p" is defined thus (cf. 9A2):
(i) τ occurs in τ at position ∅;
(ii) if σ1→σ2 occurs in τ at p, then σi occurs in τ at pi (i = 1, 2).
A triple (σ, p, τ) such that σ occurs in τ at p is called an occurrence of σ in τ, or a
component of τ. A type that occurs in τ is called a subtype of τ.
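The inductive definition of occurrences translates directly into a recursive enumeration. In this Python sketch the encoding is an assumption made for illustration: an atom is a string, a composite type is a pair (left, right) standing for left→right, and a position is a tuple of the integers 1 and 2. The names `occurrences` and `subtypes` are hypothetical.

```python
def occurrences(t, position=()):
    """Yield every pair (component type, position) of the type t."""
    yield (t, position)                                # clause (i)
    if not isinstance(t, str):
        left, right = t
        yield from occurrences(left, position + (1,))   # clause (ii), i = 1
        yield from occurrences(right, position + (2,))  # clause (ii), i = 2

def subtypes(t):
    """The set of all types that occur in t."""
    return {component for component, _ in occurrences(t)}
```

For the type a→(b→a) the enumeration yields five occurrences, among them two distinct occurrences of the atom a at positions 1 and 22, while the set of subtypes has only four members.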
9D1.1 Notation An occurrence (σ, p, τ) may be called simply "σ" when no confusion
is likely.
Recall from 2A2 that the number of occurrences of variables in τ is called |τ| and
the set of these variables is called Vars(T). The set of all variables occurring in a
finite sequence (ti,. .. , will be called
p-*a 0
9D2.1 Example Let τ ≡ (a→(b→c))→((a→b)→(a→c)); its tree is as shown in
Fig. 9D2.1a.
[Fig. 9D2.1a. Construction-tree of (a→(b→c))→((a→b)→(a→c)).]
9D3.1 Example The positive occurrences in the type τ shown in Example 9D2.1 are
The reader familiar with the usual concept of positive and negative occurrences
in propositional logic can easily check that the above definition agrees with it.
The present section analyses the structure of an arbitrary type in a more compact
way than the last, and computes a bound on the number of its components of a
certain kind. This bound is used in the proof of 8F3.
The starting point is a remark in 8B1 that every type τ can be expressed uniquely
in the form
(1) $\tau \equiv \tau_1 \to \cdots \to \tau_m \to e$
where m ≥ 0 and e is an atom. In this section we shall view a type as having
been built up from its atoms by the operation of constructing (1) from τ1,...,τm
and e instead of the more usual operation of constructing ρ→σ from ρ and σ.
Corresponding to this new view a new construction-tree of a type can be defined
that is more condensed than the usual one and in which there is no bound on the
number of branches that may start from a node. The following definitions formalize
this view.
9E1 Notation Just as in 9A1, a position is any finite (perhaps empty) string p =
i1...im of integers and *'s such that i1,..., The present section will use the
notation of 9A1, with the following exceptions:
(ii) If τ ≡ τ1→...→τm→e (m ≥ 1), construct its condensed tree from the condensed
trees of τ1,...,τm by first replacing each position-label p in the tree of τi by
(m + 1 - i)p (for i = 1, ... , m), and then combining the modified trees as follows:
Modified Modified
tree for tree for
9E2.1 Note The use of (m + 1- i)p in (ii) above has the effect of assigning positions
to Tl,...,Tm backwards, giving position 1 to Tm, 2 to and m to TI. This
makes it easier to relate positions in the condensed tree of a--*r to those of a and
T, though we shall not need this facility in this book.
As an example, Fig. 9E2.1a shows the condensed tree for the type
$\tau \equiv (a\to(b\to c))\to((a\to b)\to(a\to c))$
$(a\to b\to c)\to(a\to b)\to a\to c$.
$a$, $b$, $c$, $a\to b\to c$, $a\to b$, $\tau$.
Note that although $b\to c$ is a subtype of $\tau$ in the usually accepted sense (9D1) it is
not an s-subtype because it does not correspond to a node on the condensed tree.
[Fig. 9E2.1a: condensed tree of $(a\to b\to c)\to(a\to b)\to a\to c$.]
9E4 Definition (S-components) Iff a node on the condensed tree of $\tau$ is labelled with
a type $\sigma$ and a position $p$ we call the triple $(\sigma, p, \tau)$ an s-component of $\tau$. (Thus an
s-component is a particular occurrence of an s-subtype.)
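The condensed tree and its s-components can be enumerated with a short recursive function. The following Python sketch is illustrative only. It assumes that the tail atom of a composite node is given the position label `*` (the text says positions may contain `*`'s, but the exact clause is not reproduced here), and it treats a subpremise as an s-component whose position ends in an integer, which agrees with the six subpremises listed in Example 9E6.1. The names `unfold` and `s_components` are hypothetical.

```python
# Types: an atom is a string; an arrow type is the tuple ('->', left, right).

def arrow(*ts):
    """Right-associated arrow: arrow(a, b, c) stands for a->(b->c)."""
    t = ts[-1]
    for s in reversed(ts[:-1]):
        t = ('->', s, t)
    return t

def unfold(t):
    """Split t = t1->...->tm->e into ([t1, ..., tm], e), with e an atom."""
    args = []
    while not isinstance(t, str):
        args.append(t[1])
        t = t[2]
    return args, t

def s_components(t, prefix=()):
    """Yield (s-subtype, position) for each node of the condensed tree of t.

    The i-th argument of t1->...->tm->e gets label m+1-i, so positions
    are assigned backwards as in Note 9E2.1.
    """
    yield t, prefix
    args, tail = unfold(t)
    m = len(args)
    for i, arg in enumerate(args, start=1):
        yield from s_components(arg, prefix + (str(m + 1 - i),))
    if m >= 1:
        yield tail, prefix + ('*',)  # assumed label for the tail atom

tau = arrow(arrow('a', 'b', 'c'), arrow('a', 'b'), 'a', 'c')
components = list(s_components(tau))
s_subtypes = {t for t, _ in components}
subpremises = [(t, p) for t, p in components if p and p[-1] != '*']
```

Positions are kept as tuples of labels rather than digit strings, so the sketch still works when a node has ten or more branches.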
9E5.1 Lemma Two distinct s-components of a type $\tau$ cannot have the same tail-component.
Proof Induction on $|\tau|$, using the fact that if $\tau \equiv \tau_1\to\ldots\to\tau_m\to e$ the only
s-components containing $e$ are $\tau$ and $e$, and $e$ is not the tail of itself because atoms
do not have tails (by 9E5).
9E5.2 Warning The above lemma does not say that the tails of two distinct s-
components p and g cannot be occurrences of the same atom. That is, using
the =-notation of (iv), the lemma forbids Tail(p) = Tail(g) but does not forbid
Tail(p) = Tail(g).
9E6.1 Example The type $\tau \equiv (a\to b\to c)\to(a\to b)\to a\to c$ in Fig. 9E2.1a has six
subpremises, namely all three $a$'s and
$(a\to b\to c, 3, \tau)$, $(a\to b, 2, \tau)$, $(b, 31, \tau)$.
Proof For (i): use 9E5.1. For (ii): each composite s-component is either a subpremise
or $\tau$ itself. For (iii): use 9E6.2(ii). For (iv): use 9E6.2(iii). For (v): subtract (iii) from
(ii). For (vi): use 9E6.2(ii), adding (iv) to (v) and adding 1 for $\tau$ itself.
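9E7 Definition $\mathrm{Order}(\tau)$, the order of $\tau$, is 1 + the length of the longest position on
the condensed tree of $\tau$. In detail: $\mathrm{Order}(e) = 1$ for atoms $e$, and for composite types
$\mathrm{Order}(\tau_1\to\ldots\to\tau_m\to e) = 1 + \mathrm{Max}\{\mathrm{Order}(\tau_1),\ldots,\mathrm{Order}(\tau_m)\}$.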
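The recursive clause of this definition translates directly into code. The Python sketch below is an illustration under an assumed encoding (atoms as strings, arrow types as tuples); the function name `order` is hypothetical.

```python
# Types: an atom is a string; an arrow type is the tuple ('->', left, right).

def order(t):
    """Order(e) = 1 for atoms; Order(t1->...->tm->e) = 1 + max Order(ti)."""
    args = []
    while not isinstance(t, str):
        args.append(t[1])   # collect t1, ..., tm
        t = t[2]            # move towards the tail atom e
    if not args:
        return 1
    return 1 + max(order(a) for a in args)

a_to_b = ('->', 'a', 'b')
# (a->b->c)->(a->b)->a->c, the type used in Fig. 9E2.1a.
s_type = ('->', ('->', 'a', ('->', 'b', 'c')),
          ('->', ('->', 'a', 'b'), ('->', 'a', 'c')))
orders = (order('a'), order(a_to_b), order(s_type))
```

For the type of Fig. 9E2.1a the longest positions have length 2 (for example 31), so the order is 3, in agreement with the first sentence of the definition.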
9E9 Definition ($\mathrm{NSS}(\tau)$) (cf. Ben-Yelles 1979 Def. 3.36.) If $\tau$ is composite, $\mathrm{NSS}(\tau)$
is the set of all finite sequences $\langle\sigma_1,\ldots,\sigma_n\rangle$ ($n \geq 1$) such that $\tau$ contains a positive
composite s-component with form
$\sigma_1\to\ldots\to\sigma_n\to a$
for some atom $a$. Each member of $\mathrm{NSS}(\tau)$ is called a negative subpremise-sequence
(because it is a sequence of terms that have occurrences as negative subpremises in
$\tau$).
The set of all the members of the sequences in $\mathrm{NSS}(\tau)$ will be called
$\bigcup \mathrm{NSS}(\tau)$.
9E9.2 Notes (i) If $\tau$ is composite, say $\tau \equiv \tau_1\to\ldots\to\tau_m\to e$ ($m \geq 1$), and each $\tau_i$ has
form
$\tau_i \equiv \tau_{i,1}\to\ldots\to\tau_{i,m_i}\to e_i$ ($m_i \geq 0$),
then
Proof For (i): each $\langle\sigma_1,\ldots,\sigma_n\rangle \in \mathrm{NSS}(\tau)$ is obtained from a distinct positive
composite s-component of $\tau$. For (ii): use 9E6.3 (i) and (iv). For (iii): note that
$\bigcup \mathrm{NSS}(\tau) \subseteq \mathrm{Subpremises}(\tau)$
and use 9E6.3 (v). For (iv): use the definition of $\mathrm{NSS}(\tau)$. $\Box$
K= S= W=
9F1 Definition (S-combinations) If S is a set of $\lambda$-terms, an S-combination, or
applicative combination of members of S, is a $\lambda$-term built from some or all of the
members of S by application only. An S-and-variables combination is an applicative
combination of members of S and variables.
For subsets of {B, B', C, I, K, S, W} the S-combinations will be called BCK-combinations,
BCIW-combinations, etc.
9F1.1 Example If S = {B, C, K} then CKK and B are S-combinations and CKx, xy
and CKK are S-and-variables combinations. But $\lambda x.\mathrm{BC}$ is not an S-combination or
an S-and-variables combination.
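The two notions can be checked mechanically once terms are given a concrete representation. The Python sketch below is an illustration under a simplifying assumption: members of S are represented as named constants rather than as the $\lambda$-terms they abbreviate, so an explicit abstraction node always disqualifies a term. The constructors `con`, `var`, `app`, `lam` and the function `is_combination` are hypothetical names.

```python
# Terms: ('con', name) for a member of the basis, ('var', name) for a
# variable, ('app', M, N) for application, ('lam', x, M) for abstraction.

def con(name):
    return ('con', name)

def var(name):
    return ('var', name)

def app(f, *args):
    """Left-associated application: app(C, K, K) stands for (CK)K."""
    for a in args:
        f = ('app', f, a)
    return f

def lam(x, body):
    return ('lam', x, body)

def is_combination(term, basis, allow_variables=False):
    """True if term is built from members of basis (and, optionally,
    variables) by application only."""
    kind = term[0]
    if kind == 'con':
        return term[1] in basis
    if kind == 'var':
        return allow_variables
    if kind == 'app':
        return (is_combination(term[1], basis, allow_variables)
                and is_combination(term[2], basis, allow_variables))
    return False  # an abstraction is never an applicative combination

S = {'B', 'C', 'K'}
B, C, K = con('B'), con('C'), con('K')
x, y = var('x'), var('y')
CKK = app(C, K, K)
CKx = app(C, K, x)
xy = app(x, y)
abstraction = lam('x', app(B, C))
```

With this encoding `is_combination(CKK, S)` holds, `is_combination(CKx, S)` fails unless `allow_variables=True` is passed, and the abstraction is rejected in both modes, matching Example 9F1.1.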
The theory called combinatory logic was originally developed from the idea
that $\lambda$-abstraction can be imitated by building combinations of a very limited set
of operators called basic combinators. (See HS 86 Ch. 2.) In many accounts of
the theory just two basic combinators called S and K are assumed, with similar
reduction-properties to the $\lambda$-terms S and K above. But the version below uses B, C,
K and W instead; this makes it easier to discuss particular subsystems.
9F2 Definition For $\lambda$-terms $P$ and $Q$, we shall say that $P$ $\beta$-reduces to $Q$ with strong
type-invariance if
(i) $P \triangleright_\beta Q$,
(ii) $FV(P) = FV(Q)$,
(iii) for all $\Gamma$ and $\tau$, $\Gamma \vdash_\lambda P:\tau \iff \Gamma \vdash_\lambda Q:\tau$.
We shall say $P$ $\beta$-converts to $Q$ with strong type-invariance if $P =_\beta Q$ and (ii) and
(iii) hold.
I*;
(d) if $N^* \equiv PQ$, $x \notin FV(P)$, $x \in FV(Q)$:
(e) if $N^* \equiv PQ$, $x \in FV(P)$, $x \notin FV(Q)$:
(f) if $N^* \equiv PQ$, $x \in FV(P)$, $x \in FV(Q)$:
It is easy to see that is defined for every applicative combination N* of B,
C, K, W and variables, and a routine induction on $|N^*|$ shows that (3)-(5) hold.
9F4 Definition A set S of typable closed $\lambda$-terms is called a typable basis for a set $\mathcal{L}$
of $\lambda$-terms if there is an algorithm that accepts any member $M$ of $\mathcal{L}$ and constructs an
S-and-variables combination $M^*$ which $\beta$-reduces to $M$ with strong type-invariance.
9F5 Partial Completeness Theorem (i) {B, C, I, W} is a typable basis for the set of
all $\lambda$I-terms (defined in 1D1).
(ii) {B, C, K} is a typable basis for the set of all BCK$\lambda$-terms (1D2).
(iii) {B, C, I} is a typable basis for the set of all BCI$\lambda$-terms (1D3).
Proof (i) Modify the algorithm given in 9F3 by omitting (a) and replacing I* by I
in (b).
(ii) Modify the algorithm in 9F3 by omitting (f).
(iii) In 9F3, omit (a) and (f) and replace I* by I in (b).
Answers to starred exercises
2A8.8 The deductions are too wide to show in full here, so the context and arrow
will be omitted from each formula. (This cannot be done for arbitrary deductions
without ambiguity but those below are simple enough.) First note that, by deductions
like 2A8.3,
$\vdash \mathrm{I}:(a\to b)\to a\to b$, $\vdash \mathrm{I}:c\to c$
2B3.2 (T. Coquand, informal correspondence 1993.) Let $\Gamma \mapsto M:\tau$ be the conclusion
of $\Delta$ and $\Delta'$. Then $M$ has a $\beta$-nf $M^*$ by WN (2D5), and by 1B9 there is a leftmost
reduction $\rho$ from $M$ to $M^*$. Such a reduction consists of the following successive
parts (each of which may be empty):
(1) A reduction (called a head reduction of order 0) in which each step has form
$(\lambda y.P)QR_1\ldots R_k \triangleright_{1\beta} ([Q/y]P)R_1\ldots R_k$ ($k \geq 0$).
where N > N' by a head reduction of order 0. (We call pi a head reduction of
order i.)
(3) A reduction (called an internal reduction) with form
$\triangleright\ \lambda x_1\ldots x_m.yN_1^*\ldots N_n^* \equiv M^*$,
where $N_i \triangleright N_i^*$ by a leftmost reduction, for $i = 1,\ldots,n$.
By the proof of the Subject-reduction theorem 2C1, $\Delta$ and $\Delta'$ reduce to deductions
of
$\Gamma^* \mapsto M^*:\tau$, $\Gamma'^* \mapsto M^*:\tau$,
for some $\Gamma^* \subseteq \Gamma$ and $\Gamma'^* \subseteq \Gamma$. (Reducing a deduction is like reducing a typed term,
5B8.) But $\Gamma^* = \Gamma'^* = \Gamma$, because $FV(M^*) = FV(M)$ since reductions of $\lambda$I-terms
do not cancel, and so
$\mathrm{Subjects}(\Gamma^*) = \mathrm{Subjects}(\Gamma'^*) = FV(M^*) = FV(M) = \mathrm{Subjects}(\Gamma)$.
Hence by 2B4, $\Delta$ and $\Delta'$ both reduce to the same deduction.
Now prove $\Delta = \Delta'$ by induction on $|M^*|$ with an induction on the length of the
reduction $\rho$ in both the basis and induction steps. (The case $\rho = \emptyset$ is 2B4.)
Basis: $M^* \equiv y$. Then (2) and (3) are empty. For each step (1) in $\rho$, if the deduction
for the right side of (1) is unique, then the types and deductions for $Q, P, R_1,\ldots,R_k$
are uniquely determined. (Note that $Q$ occurs in $[Q/y]P$ because $y$ occurs in $P$.)
Hence the deduction for the left side is uniquely determined.
Induction step: $M^* \equiv \lambda x_1\ldots x_m.yN_1^*\ldots N_n^*$, $m + n \geq 1$. For steps (1) use the
above argument and for (2) and (3) use the hypothesis of the induction on $|M^*|$.
3A6.1 Let B - Axyz.x(yz) and let A be any TAx-deduction of a type l; for B. By
2B2(iv) the last three steps in A must be applications of rule (-*1), with form
for some a, /3, a, r. Then by 2B2(iii) the step above these must be (-*E), and its
premises must have form
H x:P-*T, Y:$,z:a H yz:p
for some p. Hence a - p-nT. Also by 2B2(iii) the step above yz:p must be (-*E),
with premises of form
y:a--.p H y:a-'p, z:a --> z:Q.
3B5.2 (i) $[a/b] \circ [b/a] =_{ext} [a/b, a/a]$. For example if $\tau \equiv a\to b$, we have
$([a/b] \circ [b/a])(\tau) \equiv a\to a$, $[a/b]([b/a](\tau)) \equiv [a/b](b\to b) \equiv a\to a$.
(Warning: $[a/b] \circ [b/a] \neq_{ext} [a/a]$! To see this, apply $[a/a]$ to $\tau$.)
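The calculation in (i) can be replayed mechanically. The Python sketch below is illustrative: a simultaneous substitution $[\sigma_1/a_1,\ldots,\sigma_n/a_n]$ is encoded as a dictionary from variables to types, and the helper names `subst` and `compose` are hypothetical.

```python
# Types: an atom is a string; an arrow type is the tuple ('->', left, right).

def subst(s, t):
    """Apply the simultaneous substitution s (dict: variable -> type) to t."""
    if isinstance(t, str):
        return s.get(t, t)
    return ('->', subst(s, t[1]), subst(s, t[2]))

def compose(s2, s1):
    """Return a single substitution acting like 'first s1, then s2'."""
    result = {v: subst(s2, t) for v, t in s1.items()}
    for v, t in s2.items():
        result.setdefault(v, t)  # variables moved by s2 but untouched by s1
    return result

tau = ('->', 'a', 'b')
b_for_a = {'a': 'b'}          # [b/a]
a_for_b = {'b': 'a'}          # [a/b]

stepwise = subst(a_for_b, subst(b_for_a, tau))   # [a/b]([b/a](tau))
combined = compose(a_for_b, b_for_a)             # [a/b] o [b/a]
unchanged = subst({'a': 'a'}, tau)               # [a/a](tau)
```

Here `combined` comes out as the dictionary form of $[a/b, a/a]$, while `unchanged` is still $a\to b$, which is the point of the warning.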
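(ii) Let $s \equiv [\sigma_1/a_1,\ldots,\sigma_n/a_n]$. If $\mathrm{Range}(s) \cap \mathrm{Dom}(s) = \emptyset$, define
$s_\tau = [\sigma_1/a_1] \circ \ldots \circ [\sigma_n/a_n]$
If $\mathrm{Range}(s) \cap \mathrm{Dom}(s) \neq \emptyset$, choose distinct $b_1,\ldots,b_n \notin \mathrm{Range}(s) \cup \mathrm{Dom}(s) \cup \mathrm{Vars}(\tau)$,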
and define c = [bi/a,]o, for each i < n; then define
ST =ext [al/bl] o o [an/bn] o [al /al] ° ° [Qn /an]-
(Warning: The dependence of $s_\tau$ on $\tau$ is real; the definition of $s_\tau$ includes the clause
"$b_1,\ldots,b_n \notin \mathrm{Vars}(\tau)$" and this clause is more important than it looks: no claim is
made that there is a composition $s'$ of single substitutions such that $s'(\tau) \equiv s(\tau)$
for all $\tau$.)
3C3.1 Let p - a-+(b-+c), r (a->b)--+a, vo - By 3C2.1, vo is
a c.i. of p and T. To show vo is most general, let v be any other c.i., say
$v \equiv s_1(a)\to(s_1(b)\to s_1(c)) \equiv (s_2(a)\to s_2(b))\to s_2(a)$
Then
$s_1(a) \equiv s_2(a)\to s_2(b)$, $s_2(a) \equiv s_1(b)\to s_1(c)$,
so
$s_1(a) \equiv (s_1(b)\to s_1(c))\to s_2(b)$,
and hence $v$ is an instance of $v_0$.
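The equations above amount to a small unification problem: rename the variables of the second type apart, then unify it with the first. The Python sketch below illustrates this with a textbook Robinson-style unifier; it is not the book's algorithm, and the names `unify`, `rename` and `common_instance` are hypothetical. The result is one most general common instance, determined only up to renaming of variables.

```python
# Types: an atom (type-variable) is a string; an arrow is ('->', left, right).

def apply(s, t):
    """Apply an idempotent substitution s (dict: variable -> type) to t."""
    if isinstance(t, str):
        return s.get(t, t)
    return ('->', apply(s, t[1]), apply(s, t[2]))

def occurs(v, t):
    """True if variable v occurs in type t."""
    if isinstance(t, str):
        return v == t
    return occurs(v, t[1]) or occurs(v, t[2])

def unify(t1, t2):
    """Return a most general unifier of t1 and t2 as a dict, or None."""
    s = {}
    todo = [(t1, t2)]
    while todo:
        x, y = todo.pop()
        x, y = apply(s, x), apply(s, y)
        if x == y:
            continue
        if isinstance(x, str):
            if occurs(x, y):
                return None  # occurs check: no finite unifier
            # keep s idempotent: push the new binding into old values
            s = {v: apply({x: y}, t) for v, t in s.items()}
            s[x] = y
        elif isinstance(y, str):
            todo.append((y, x))
        else:
            todo.append((x[1], y[1]))
            todo.append((x[2], y[2]))
    return s

def rename(t, mark="'"):
    """Rename every variable of t apart by appending a mark."""
    if isinstance(t, str):
        return t + mark
    return ('->', rename(t[1], mark), rename(t[2], mark))

def common_instance(rho, tau):
    """A most general common instance of rho and tau, or None."""
    u = unify(rho, rename(tau))
    return None if u is None else apply(u, rho)

rho = ('->', 'a', ('->', 'b', 'c'))      # a->(b->c)
tau = ('->', ('->', 'a', 'b'), 'a')      # (a->b)->a
v = common_instance(rho, tau)
```

For these two types the sketch yields $((b\to c)\to b')\to(b\to c)$, which has the shape forced by the equation $s_1(a) \equiv (s_1(b)\to s_1(c))\to s_2(b)$.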
3D2.1 Let $\rho \equiv a\to(b\to b)$, $\tau \equiv (c\to c)\to a$, $u \equiv [(b\to b)/a, b/c]$. First, $u$ clearly unifies
$\{\rho, \tau\}$. To prove that $u$ is most general, let $s$ be any other unifier, say
$PT(R_1) \equiv (d\to f)\to(g\to b)\to((a\to b)\to c\to d)\to((a\to g)\to c\to f)$,
$PT(R_2) \equiv (f\to c)\to(g\to b_1)\to((a\to b_1)\to c\to b_2)\to((a\to g)\to f\to b_2)$,
$PT(R_3) \equiv (a\to f)\to(g\to b_1)\to((a\to b_1)\to b_2\to b_3)\to((f\to g)\to b_2\to b_3)$.
NI - N2 - N3 -
Finally it defines M* - N3. (By the way, the algorithm is not claimed to be efficient!
There exists a much shorter M* than the one above, namely M*
(ii) Let $M \equiv \mathrm{K}$, $\tau \equiv b\to b\to b$. Define $\tau^\circ \equiv a\to b\to c$; then $\tau^\circ$ has the 1-property and
changes to $\tau$ under the identifications $[b/a]$, $[b/c]$. The algorithm begins by applying
the proof of 7C1 to build a term whose PT is $\tau^\circ\to\tau^\circ$; this term is
$I_{\tau^\circ} \equiv$
Then, following the proof of 7C2, the algorithm defines
$M^+ \equiv I_{\tau^\circ}M \equiv I_{\tau^\circ}\mathrm{K}$
and computes $\tau^+ \equiv PT(M^+)$; in this case $\tau^+ \equiv a\to b\to a$ and for the identifications
$s_1,\ldots,s_k$ in 7C7 we have $k = 1$ and $s_1 \equiv [b/a]$. The next step is to apply 7C5's
proof to build $N$ such that
$PT(N) \equiv [b/a]\tau^+ \equiv b\to b\to b$.
To do this, the algorithm first applies 7C4 to obtain a term $R$ such that
$PT(R) \equiv (f\to a_1)\to(g\to b)\to(a_1\to b\to a_2)\to(f\to g\to a_2)$,
and then defines $N \equiv (\lambda x.Rxx)\mathrm{I}(I_{\tau^\circ}\mathrm{K})$.
8A12.1 The eight regions contain the following terms in order from left to right.
Top row: Axy'xy, Axyz'x(xyyy)z, 2xyzu'x(xyyy)zu, Axyzu'xyzu;
bottom row: Ax-x, Axyz'xz(xyyy), Axyzu'xu(xyyy)z, Axyzu'uxyz.
8B7 For items 1, 6, 8, 11 in Table 8B7a see 8B3-8B6. For the rest, see the answer to
8C6.4. (In item 12, $P_0 \notin \mathrm{Nprinc}(\tau)$ since the PT of $P_0$ is $((a\to a)\to b)\to b$ by the PT
algorithm, 3E1.)
8C6.4 For rows 6, 8 and 11 of Table 8B7a see Examples 8C6.1-3. The other rows
are dealt with below. (For ease of reading, types are omitted and $x^dM$ is used for
$x(x(\ldots(xM)\ldots))$ with $d$ $x$'s.)
1. d(T,0) = {V}, ,q/(T,1) = 0.
2. &/('r,0) _ {V}, .sJI(T,1) _ {Ax1'x1}.
3. .Q/' (T,0) _ {V}, Ql(T,1) _ {Ax1x2'xl}.
4. d(T,0) _ {V}, d(T,1) _ {Ax1x2x3'x1 VI 1, .9/(T,2) = {Ax1x2x3'x1(x2V2)},
,SV(T, 3)= {Ax1x2x3'x1(x2x3)}
5. .Ql(T,0) = {V}, Q/(T,1) = {Ax1x2x3'x1 V1 V2}, JV(T,2) = {Axlx2x3'xlx3x2}.
7. crl(T,0) = {V}, d(T,1) = {Axix2'x1 V1 V2}, 4(T,2)= {Ax1x2'x1x2x2}.
9. 5z/(T,O) = {V}, .21(T, 1) = {Axix2'xl V1, Axtx2'x2}
srl(T,d) = {2x1x2'xiVd, Axlx2'xi-1x2} for all d > 2.
10. Al(T,O) = {V}, .sal(T, 1) = V1}, sl(T,2) = {Ax1x2'xlx2}.
12. d(T,0) = {V}, s.V/(T,1) = {Ax'xVl}
d(T,2) = Ax'x(Ayryi)}
d(T,3) = 2x'x(2y1'x(2y2'xV3))
2x'x(AY1'x(2Y2'Y1)), Ax'x(2Y1'x(2Y2'Y2))}
,(V(T,d) _ {2x'x(Ayl'x(... (2yd-1'xVd)...)),
Ax'x(Ay1'x(... (AYd-2'x(AYd-1'Y1))...)), ...
2x'x(Ay1'x(... (AYd-2'x(AYd-1'Yd-1))...))} for all d 4.
8E7.3 For (iii), use (ii) and the fact that $\#(\mathcal{A}(\tau,0)) = 1$ (since $\mathcal{A}(\tau,0) = \{V^\tau\}$ by
Step 0 of Algorithm 8C6).
For (i), use induction on $d$. The basis is trivial since $\mathcal{A}(\tau,0) = \{V^\tau\}$.
For the induction step ($d$ to $d+1$), let $X^\tau \in \mathcal{A}(\tau,d)$ contain $q$ metavariables where
$1 \leq q \leq |\tau|^d$, and let $V^\rho$ be one of these.
Consider Part IIa1 of Step $d+1$ of Algorithm 8C6: using the notation of IIa1,
note that each suitable replacement $Y_i^\rho$ generated by IIa1 for $V^\rho$ contains $\leq n_i$
metavariables, where $n_i$ is the arity of $\sigma_i$. But $\sigma_i$ occurs in $\rho$ which occurs in $\tau$ by
8E7.1, so
$n_i \leq |\sigma_i| - 1 < |\tau| - 1 < |\tau|$.
Next consider Part IIa2 in 8C6. Using the notation of IIa2, note that each suitable
replacement Zj generated by IIa2 for VP contains < h metavariables, where hj is
the arity of Cj. But Cj occurs in r by 2B3(i), so
hj < K,I-1 < ITI-1 < ITI.
Thus each metavariable in $X^\tau$ is replaced by $< |\tau|$ new ones, so the total number
of metavariables in the resulting extension of $X^\tau$ is $< q|\tau|$. Hence (i) holds.
To prove (ii), look at the above induction step in more detail. When IIa1 in 8C6
is applied to $V^\rho$ the number of suitable replacements it generates is $\leq m$, where $m$
is the arity of $\rho$. We have
$m \leq |\rho| - 1 \leq |\tau| - 1$.
When IIa2 in 8C6 is applied to $V^\rho$ the number of suitable replacements it generates
is $\leq t$, and by 8E7.2(ii),
$t \leq (|\tau| - 1)\,\mathrm{Depth}(X^\tau)$.
Thus the total number of suitable replacements for $V^\rho$ is less than or equal to
$|\tau| - 1 + (|\tau| - 1)d < |\tau|(d+1)$.
But there are $q$ metavariables in $X^\tau$, so when we apply IIa2 the total number of
resulting extensions of $X^\tau$ is less than $(|\tau|(d+1))^q$. Hence
$\#(\mathcal{A}(\tau,d+1)) < (|\tau|(d+1))^q \times \#(\mathcal{A}(\tau,d))$
Then (ii) follows by (i).
Bibliography
AHO, A. V., SETHI, R., ULLMAN, J. D. [1986] Compilers, Addison-Wesley Co., USA 1986.
ANDERSON, A. R., BELNAP, N. D. [1975] Entailment, Vol. I, Princeton University Press,
USA 1975.
ANDERSON, A. R., BELNAP, N. D., DUNN, J. M. [1992] Entailment, Vol. II, Princeton
University Press, USA 1992.
ANDREWS, P. B. [1965] A transfinite type theory with type variables, North-Holland Co.,
Netherlands 1965.
ANDREWS, P. B. [1971] Resolution in type theory, J. Symbolic Logic 36 (1971), 414-432.
ANDREWS, P. B. [1986] An Introduction to Mathematical Logic and Type Theory; to Truth
through Proof, Academic Press, USA and UK 1986.
AVRON, A. [1988] The semantics and proof theory of linear logic, Theoretical Computer
Science 57 (1988), 161-184.
AVRON, A. [1992] Axiomatic systems, deduction and implication, J. Logic and
Computation 2 (1992), 51-98.
BAADER, F., SIEKMANN, J. H. [1994] Unification theory, in Handbook of Logic in
Artificial Intelligence, Vol. 2: Deduction Methodologies, ed. D. M. Gabbay, C. J. Hogger,
J. A. Robinson, Oxford University Press, UK 1994, pp. 41-126.
BARENDREGT, H. P. [1984] The Lambda Calculus, North-Holland Co., Netherlands, 2nd.
edition 1984.
BARENDREGT, H. P. [1992] Lambda calculi with types, in Handbook of Logic in Computer
Science, Vol. 2, ed. S. Abramsky et al., Clarendon Press, UK 1992, pp. 117-309.
BARENDREGT, H. P., COPPO, M., DEZANI, M. [1983] A filter lambda model and the
completeness of type assignment, J. Symbolic Logic 48 (1983), 931-940.
BEN-YELLES, C. B. [1979] Type-assignment in the lambda-calculus; syntax and semantics,
thesis 1979, Mathematics Dept., University of Wales Swansea, Swansea SA2 8PP, UK.
BLOK, W. J., PIGOZZI, D. [1989] Algebraizable Logics, Memoirs of the American
Mathematical Society No. 396 (1989), Amer. Math. Soc., Providence, R.I. 02901, USA.
de BRUIJN, N. G. [1980] A survey of the project AUTOMATH, in To H. B. Curry, ed.
J. P. Seldin, J. R. Hindley, Academic Press, UK 1980, pp. 579-606.
BUNDER, M. W. [1982] Deduction theorems for weak implicational logics, Studia Logica 41
(1982), 95-108.
BUNDER, M. W. [1986] Review no. 03004, in Zentralblatt für Mathematik 574 (1986),
10-11, of Bunder, M. W. and Meyer, R. K. A result for combinators, BCK logics and
BCK algebras, Logique et Analyse 28 (1985), 33-40.
BUNDER, M. W. [1991] Corrections to some results for BCK logics and algebras, Logique et
Analyse 31 (1991), 115-122.
HINATA, S. [1967] Calculability of primitive recursive functionals of finite type, Sci. Report
Tokyo Kyoiku Daigaku, Section A, 9 (226) (1967), 42-59.
HINDLEY, J. R. [1969] The principal type-scheme of an object in combinatory logic, Trans.
American Math. Soc. 146 (1969), 29-60.
HINDLEY, J. R. [1983a] The completeness theorem for typing λ-terms, Theoretical Computer
Science 22 (1983), 1-17.
HINDLEY, J. R. [1983b] Curry's type-rules are complete with respect to the F-semantics too,
Theoretical Computer Science 22 (1983), 127-133.
HINDLEY, J. R. [1989] BCK-combinators and linear λ-terms have types, Theoretical
Computer Science 64 (1989), 97-106.
HINDLEY, J. R. [1992] Types with intersection, an introduction, Formal Aspects of
Computing 4 (1992), 470-486.
HINDLEY, J. R. [1993] BCK and BCI logics, condensed detachment and the 2-property,
Notre Dame J. Formal Logic 34 (1993), 231-250.
HINDLEY, J. R., MEREDITH, D. [1990] Principal type-schemes and condensed detachment,
J. Symbolic Logic 55 (1990), 90-105.
HINDLEY, J. R., SELDIN, J. P. [1986] [HS 86] Introduction to Combinators and λ-calculus,
Cambridge University Press, UK 1986.
HIROKAWA, S. [1991a] Principal type assignment to lambda terms, International Journal of
the Foundations of Computer Science 2 (1991), 149-162.
HIROKAWA, S. [1991b] Principal type-schemes of BCI-lambda-terms, in Theoretical Aspects
of Computer Software, ed. by T. Ito, A. R. Meyer, Lecture Notes in Computer Science,
Springer-Verlag, Germany, No. 526 (1991), 633-650.
HIROKAWA, S. [1991c] BCK-formulas having unique proofs, in Category Theory and
Computer Science, ed. by D. H. Pitt et al., Lecture Notes in Computer Science,
Springer-Verlag, Germany, No. 530 (1991), 106-120.
HIROKAWA, S. [1992a] The converse principal type-scheme theorem in lambda calculus,
Studia Logica 51 (1992), 83-95.
HIROKAWA, S. [1992b] Balanced formulas, BCK-minimal formulas and their proofs, in
Logical Foundations of Computer Science - Tver '92, ed. by A. Nerode, M. Taitslin,
Lecture Notes in Computer Science, Springer-Verlag, Germany, No. 620 (1992), 198-208.
HIROKAWA, S. [1993a] Principal types of BCK-lambda-terms, Theoretical Computer
Science 107 (1993), 253-276.
HIROKAWA, S. [1993b] The relevance graph of a BCK-formula, J. Logic and Computation
3 (1993), 269-285.
HIROKAWA, S. [1993c] The number of proofs for an implicational formula, J. Symbolic
Logic 58 (1993), 1117 (abstract only; fuller MS informally circulated 1991).
HOPCROFT, J. E., ULLMAN, J. D. [1979] Introduction to Automata Theory, Languages,
and Computation, Addison-Wesley, USA 1979.
HOWARD, W. [1969] The formulae-as-types notion of construction, MS 1969, publ. in To
H. B. Curry, ed. J. R. Hindley, J. P. Seldin, Academic Press, UK 1980, pp. 479-490.
HOWARD, W. [1970] Assignment of ordinals to terms for primitive recursive functionals of
finite type, in Intuitionism and Proof Theory (Proc. Buffalo Conference 1968), ed.
A. Kino, J. Myhill, R. Vesley, North-Holland Co., Netherlands, 1970, pp. 443-458.
HUDELMAIER, J. [1993] An O(n log n)-space decision procedure for intuitionistic
propositional logic, J. Logic and Computation 3 (1993), 63-75.
HUET, G. [1975] A unification algorithm for typed λ-calculus, Theoretical Computer Science
1 (1975), 27-57.
ISEKI, K., TANAKA, S. [1978] An introduction to the theory of BCK-algebras,
Mathematica Japonica 23 (1978), 1-26.
JAŚKOWSKI, S. [1963] Über Tautologien, in welchen keine Variable mehr als zweimal
vorkommt, Zeitschrift für Mathematische Logik 9 (1963), 219-228.
KALMAN, I. A. [1982] The two-property and condensed detachment, Studia Logica 41
(1982), 173-179.
This table gives a (not necessarily unique) principal inhabitant of each of some
types. It gives a β-nf where possible, but if there is no normal principal inhabitant
it gives a non-normal one. It includes most of the PT information in Tables 3E2a
and 8B7a.
Type A principal inhabitant
a-*a
a-*b-*a K
a-*a-*a (Axyz K(xy)(xz))I [Nprinc = 0]
(a-.b)-*a-*b
(a-.b)-.a->a Axy Ky(xy) [Nprinc = 0]
(a-.a)-.a-*a [and
a-*(a-*b)-*a 2xy Kx(yx) [Nprinc = 0]
a-*a-4b-*b [Nprinc = 0]
((a-a)-+a)-a , others see 8B7a(12)]
(a-.a-*b)-*a-.b )xY'xYY = W
(a-*a-*a)-*a-+a
(a-.b)-*a-+a-*b 2xyz K(xy)(xz) [Nprinc = 0]
C
(a1-+...
b- aj -* ...
(a-.b)-.(c-*a)-*c-b 2xyz x(yz) = B
(a-.b)-.(b-.c)-*a-+c 2xyzy(xz) = B'
S
(a- b- 1xyz xyzz
Auvwx u(vx)w
a3 b
(a-*b1-+... b,-+a--+d Auvwl ...
(a-+b)-+((c1 -+ ...
(c1-+...
(a-+b-+c)-+((d, -+... f )-+ ...Y,,z'xY1 ... yn(uzw))
b-+(d1-+...