Proofs and Types
JEAN-YVES GIRARD
Translated and with appendices by
PAUL TAYLOR
YVES LAFONT
Published by the Press Syndicate of the University of Cambridge
The Pitt Building, Trumpington Street, Cambridge CB2 1RP
32 East 57th Street, New York, NY 10022, USA
10 Stamford Road, Oakleigh, Melbourne 3166, Australia
© Cambridge University Press, 1989
First Published 1989
Reprinted with minor corrections 1990
Reprinted for the Web 2003
Originally printed in Great Britain at the University Press, Cambridge
British Library Cataloguing in Publication Data available
Library of Congress Cataloguing in Publication Data available
ISBN 0 521 37181 3
Preface
This little book comes from a short graduate course on typed λ-calculus given at
the Université Paris VII in the autumn term of 1986–7. It is not intended to be
encyclopedic (the Church-Rosser theorem, for instance, is not proved) and
the selection of topics was really quite haphazard.
Some very basic knowledge of logic is needed, but we will never go into tedious
details. Some book in proof theory, such as [Gir], may be useful afterwards to
complete the information on those points which are lacking.
The notes would never have reached the standard of a book without the
interest taken in translating (and in many cases reworking) them by Yves Lafont
and Paul Taylor. For instance Yves Lafont restructured chapter 6 and Paul Taylor
chapter 8, and some sections have been developed into detailed appendices.
The translators would like to thank Luke Ong, Christine Paulin-Mohring,
Ramon Pino, Mark Ryan, Thomas Streicher, Bill White and Liz Wolf for their
suggestions and detailed corrections to earlier drafts and also Samson Abramsky
for his encouragement throughout the project.
In the reprinting an open problem on page 140 has been resolved.
Contents

1 Sense, Denotation and Semantics
  1.1 Sense and denotation in logic
    1.1.1 The algebraic tradition
    1.1.2 The syntactic tradition
  1.2 The two semantic traditions
    1.2.1 Tarski
    1.2.2 Heyting
2 Natural Deduction
  2.1 The calculus
    2.1.1 The rules
  2.2 Computational significance
    2.2.1 Interpretation of the rules
    2.2.2 Identification of deductions
3 The Curry-Howard Isomorphism
4 The Normalisation Theorem
  4.1 The Church-Rosser property
  4.2 The weak normalisation theorem
  4.3 Proof of the weak normalisation theorem
    4.3.1 Degree and substitution
    4.3.2 Degree and conversion
    4.3.3 Conversion of maximal degree
    4.3.4 Proof of the theorem
  4.4 The strong normalisation theorem
5 Sequent Calculus
  5.1 The calculus
    5.1.1 Sequents
    5.1.2 Structural rules
    5.1.3 The intuitionistic case
    5.1.4 The identity group
    5.1.5 Logical rules
  5.2 Some properties of the system without cut
    5.2.1 The last rule
    5.2.2 Subformula property
    5.2.3 Asymmetrical interpretation
  5.3 Sequent Calculus and Natural Deduction
  5.4 Properties of the translation
6 Strong Normalisation Theorem
  6.1 Reducibility
  6.2 Properties of reducibility
    6.2.1 Atomic types
    6.2.2 Product type
    6.2.3 Arrow type
  6.3 Reducibility theorem
    6.3.1 Pairing
    6.3.2 Abstraction
    6.3.3 The theorem
7 Gödel's system T
  7.1 The calculus
    7.1.1 Types
    7.1.2 Terms
    7.1.3 Intended meaning
    7.1.4 Conversions
  7.2 Normalisation theorem
  7.3 Expressive power: examples
    7.3.1 Booleans
    7.3.2 Integers
  7.4 Expressive power: results
    7.4.1 Canonical forms
    7.4.2 Representable functions
8 Coherence Spaces
  8.1 General ideas
  8.2 Coherence Spaces
    8.2.1 The web of a coherence space
    8.2.2 Interpretation
  8.3 Stable functions
    8.3.1 Stable functions on a flat space
    8.3.2 Parallel Or
  8.4 Direct product of two coherence spaces
  8.5 The Function-Space
    8.5.1 The trace of a stable function
    8.5.2 Representation of the function space
    8.5.3 The Berry order
    8.5.4 Partial functions
9 Denotational Semantics of T
  9.1 Simple typed calculus
    9.1.1 Types
    9.1.2 Terms
  9.2 Properties of the interpretation
  9.3 Gödel's system
    9.3.1 Booleans
    9.3.2 Integers
    9.3.3 Infinity and fixed point
10 Sums in Natural Deduction
11 System F
12 Coherence Semantics of the Sum
13 Cut Elimination (Hauptsatz)
  13.1 The key cases
  13.2 The principal lemma
  13.3 The Hauptsatz
  13.4 Resolution
14 Strong Normalisation for F
15 Representation Theorem
  15.1 Representable functions
    15.1.1 Numerals
    15.1.2 Total recursive functions
    15.1.3 Provably total functions
  15.2 Proofs into programs
    15.2.1 Formulation of HA2
    15.2.2 Translation of HA2 into F
    15.2.3 Representation of provably total functions
    15.2.4 Proof without undefined objects
A Semantics of System F
  A.1 Terms of universal type
    A.1.1 Finite approximation
    A.1.2 Saturated domains
    A.1.3 Uniformity
  A.2 Rigid Embeddings
    A.2.1 Functoriality of arrow
  A.3 Interpretation of Types
    A.3.1 Tokens for universal types
    A.3.2 Linear notation for tokens
    A.3.3 The three simplest types
  A.4 Interpretation of terms
    A.4.1 Variable coherence spaces
    A.4.2 Coherence of tokens
    A.4.3 Interpretation of F
  A.5 Examples
    A.5.1 Of course
    A.5.2 Natural Numbers
    A.5.3 Linear numerals
  A.6 Total domains
B What is Linear Logic?
Bibliography
Index
Chapter 1
Sense, Denotation and Semantics
Theoretical Computing is not yet a science. Many basic concepts have not been
clarified, and current work in the area obeys a kind of "wedding cake" paradigm:
for instance language design is reminiscent of Ptolemaic astronomy, forever
in need of further corrections. There are, however, some limited topics such as
complexity theory and denotational semantics which are relatively free from this
criticism.
In such a situation, methodological remarks are extremely important, since we
have to see methodology as strategy and concrete results as of a tactical nature.
In particular what we are interested in is to be found at the source of the
logical whirlpool of the 1900s, illustrated by the names of Frege, Löwenheim,
Gödel and so on. The reader not acquainted with the history of logic should
consult [van Heijenoort].
1.1 Sense and denotation in logic

1.1.1 The algebraic tradition
This tradition (begun by Boole well before the time of Frege) is based on a radical
application of Ockham's razor: we quite simply discard the sense, and consider
only the denotation. The justification of this mutilation of logic is its operational
side: it works!
The essential turning point which established the predominance of this tradition
was Löwenheim's theorem of 1916. Nowadays, one may see Model Theory as
the rich pay-off from this epistemological choice which was already very old. In
fact, considering logic from the point of view of denotation, i.e. the result of
operations, we discover a slightly peculiar kind of algebra, but one which allows
us to investigate operations unfamiliar to more traditional algebra. In particular,
it is possible to avoid the limitation to, shall we say, equational varieties, and
consider general definable structures. Thus Model Theory rejuvenates the ideas
and methods of algebra in an often fruitful way.
1.1.2 The syntactic tradition

1.2 The two semantic traditions

1.2.1 Tarski
A  B  |  A ∧ B  A ∨ B  A ⇒ B  ¬A
t  t  |    t      t      t     f
t  f  |    f      t      f     f
f  t  |    f      t      t     t
f  f  |    f      f      t     t
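In Tarski's tradition the denotation of a compound sentence is computed truth-functionally from the denotations of its parts. A minimal sketch in Python; the tuple encoding of formulas and the name `evaluate` are ours, not from the text:

```python
# Truth-functional evaluation: the value of a formula is a function of the
# values of its parts. Formulas are nested tuples (encoding is illustrative).

def evaluate(formula, valuation):
    """Truth value of `formula` under `valuation` (atom name -> bool)."""
    op = formula[0]
    if op == "atom":
        return valuation[formula[1]]
    if op == "not":
        return not evaluate(formula[1], valuation)
    if op == "and":
        return evaluate(formula[1], valuation) and evaluate(formula[2], valuation)
    if op == "or":
        return evaluate(formula[1], valuation) or evaluate(formula[2], valuation)
    if op == "implies":
        return (not evaluate(formula[1], valuation)) or evaluate(formula[2], valuation)
    raise ValueError(op)

A, B = ("atom", "A"), ("atom", "B")
# Recomputing a column of the table above: A ⇒ B is false only when
# A is true and B is false.
table = [(a, b, evaluate(("implies", A, B), {"A": a, "B": b}))
         for a in (True, False) for b in (True, False)]
```

Running the comprehension over all four valuations reproduces the A ⇒ B column of the truth table.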
1.2.2 Heyting
Heyting's idea is less well known, but it is difficult to imagine a greater disparity
between the brilliance of the original idea and the mediocrity of subsequent
developments. The aim is extremely ambitious: to model not the denotation, but
the proofs.

Instead of asking the question "when is a sentence A true?", we ask "what is
a proof of A?". By proof we understand not the syntactic formal transcript, but
the inherent object of which the written form gives only a shadowy reflection. We
take the view that what we write as a proof is merely a description of something
which is already a process in itself. So the reply to our extremely ambitious
question (and an important one, if we read it computationally) cannot be a formal
system.
1. For atomic sentences, we assume that we know intrinsically what a proof is;
for example, pencil and paper calculation serves as a proof of 27 × 37 = 999.

2. A proof of A ∧ B is a pair (p, q) consisting of a proof p of A and a proof
q of B.

3. A proof of A ∨ B is a pair (i, p) with:

i = 0, and p is a proof of A, or

i = 1, and p is a proof of B.
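Heyting's clauses read naturally as a recursive definition of a data-type of proofs: a proof of a conjunction is a pair, a proof of a disjunction a tagged pair. A minimal sketch with our own encoding (formulas and proofs as tuples; the table of accepted atomic proofs is a toy stand-in for "pencil and paper calculation"):

```python
# Heyting's first three clauses as a proof-checker. Formulas are tuples:
# ("atom", name), ("and", A, B), ("or", A, B). Encoding is illustrative.

# For atoms we "know intrinsically what a proof is": here, a toy table.
KNOWN_ATOMIC_PROOFS = {"27*37=999": {"pencil-and-paper"}}

def check(proof, formula):
    """Is `proof` a proof of `formula`, in the sense of Heyting's clauses?"""
    op = formula[0]
    if op == "atom":
        return proof in KNOWN_ATOMIC_PROOFS.get(formula[1], set())
    if op == "and":                 # clause 2: a pair (p, q)
        p, q = proof
        return check(p, formula[1]) and check(q, formula[2])
    if op == "or":                  # clause 3: a pair (i, p), i = 0 or 1
        i, p = proof
        return i in (0, 1) and check(p, formula[1 + i])
    raise ValueError(op)

ATOM = ("atom", "27*37=999")
conj = ("and", ATOM, ATOM)
disj = ("or", ("atom", "nonsense"), ATOM)
```

Note how the tag i in a disjunction proof records which disjunct is actually proved, exactly as in clause 3.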
A[a/ξ] is meta-notation for A where all the (free) occurrences of ξ have been replaced
by a. In defining this formally, we have to be careful about bound variables.
Chapter 2
Natural Deduction
As we have said, the syntactic point of view shows up some profound symmetries
of Logic. Gentzen's sequent calculus does this in a particularly satisfying
manner. Unfortunately, the computational significance is somewhat obscured by
syntactic complications that, although certainly immaterial, have never really been
overcome. That is why we present Prawitz' natural deduction before we deal with
sequent calculus.
Natural deduction is a slightly paradoxical system: it is limited to the
intuitionistic case (in the classical case it has no particularly good properties) but
it is only satisfactory for the (∧, ⇒, ∀) fragment of the language: we shall defer
consideration of ∨ and ∃ until chapter 10. Yet disjunction and existence are the
two most typically intuitionistic connectors!
The basic idea of natural deduction is an asymmetry: a proof is a vaguely
tree-like structure (this view is more a graphical illusion than a mathematical
reality, but it is a pleasant illusion) with one or more hypotheses (possibly none)
but a single conclusion. The deep symmetry of the calculus is shown by the
introduction and elimination rules which match each other exactly. Observe,
incidentally, that with a tree-like structure, one can always decide uniquely what
was the last rule used, which is something we could not say if there were several
conclusions.
2.1 The calculus

We use the notation

⋮
A
to designate a deduction of A, that is, ending at A. The deduction will be written
as a finite tree, and in particular, the tree will have leaves labelled by sentences.
For these sentences, there are two possible states, dead or alive.
In the usual state, a sentence is alive, that is to say it takes an active part in
the proof: we say it is a hypothesis. The typical case is illustrated by the first
rule of natural deduction, which allows us to form a deduction consisting of a
single sentence:
A
Here A is both the leaf and the root; logically, we deduce A, but that was easy
because A was assumed!
Now a sentence at a leaf can be dead, when it no longer plays an active part in
the proof. Dead sentences are obtained by killing live ones. The typical example
is the ⇒-introduction rule:

[A]
 ⋮
 B
------ ⇒I
A ⇒ B
2.1.1 The rules

Hypothesis:    A

Introductions:

  A    B
  ------ ∧I
  A ∧ B

  [A]
   ⋮
   B
  ------ ⇒I
  A ⇒ B

    A
  ------ ∀I
  ∀ξ. A

Eliminations:

  A ∧ B          A ∧ B
  ------ ∧1E     ------ ∧2E
    A              B

  A ⇒ B    A
  ----------- ⇒E
       B

  ∀ξ. A
  ------- ∀E
  A[a/ξ]
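The rules above can be read as a recipe for checking deductions: compute the conclusion of a tree bottom-up, tracking which hypotheses are alive. A sketch for the (∧, ⇒) fragment, with our own encoding (only the rule names follow the text; the tuples and the helper `conclusion` are illustrative):

```python
# A deduction is a tree encoded as nested tuples; atomic sentences are
# strings and compound ones tuples like ("and", A, B), ("imp", A, B).
# `hyps` is the set of live (undischarged) hypotheses.

def conclusion(d, hyps):
    """Return the formula proved by deduction d under hypotheses hyps, or None."""
    rule = d[0]
    if rule == "hyp":                      # a live sentence proves itself
        return d[1] if d[1] in hyps else None
    if rule == "andI":
        a, b = conclusion(d[1], hyps), conclusion(d[2], hyps)
        return ("and", a, b) if a and b else None
    if rule == "and1E":
        c = conclusion(d[1], hyps)
        return c[1] if c and c[0] == "and" else None
    if rule == "and2E":
        c = conclusion(d[1], hyps)
        return c[2] if c and c[0] == "and" else None
    if rule == "impI":                     # discharge (kill) hypothesis d[1]
        b = conclusion(d[2], hyps | {d[1]})
        return ("imp", d[1], b) if b else None
    if rule == "impE":
        f, a = conclusion(d[1], hyps), conclusion(d[2], hyps)
        return f[2] if f and f[0] == "imp" and f[1] == a else None
    raise ValueError(rule)

# A deduction of A ⇒ (B ⇒ (A ∧ B)), with no live hypotheses left at the root.
d = ("impI", "A", ("impI", "B", ("andI", ("hyp", "A"), ("hyp", "B"))))
```

The ⇒I case shows the "killing" of hypotheses: the discharged sentence is live only inside the sub-deduction.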
2.2 Computational significance

We shall re-examine the natural deduction system in the light of Heyting semantics;
we shall suppose fixed the interpretation of atomic formulae and also the range
of the quantifiers. A formula A will be seen as the set of its possible deductions;
instead of saying "δ proves A", we shall say "δ ∈ A".

1
The variable ξ belongs to the object language (it may stand for a number, a data-record,
an event). We reserve x, y, z for λ-calculus variables, which we shall introduce in the next
section.
2.2.1 Interpretation of the rules

π1⟨u, v⟩ = u        π2⟨u, v⟩ = v        ⟨π1 t, π2 t⟩ = t
These equations (and the similar ones we shall have occasion to write down)
are the essence of the correspondence between logic and computer science.
4. If a deduction ends in ⇒I, let v be the term associated with the immediate
sub-deduction; this immediate sub-deduction is unambiguously determined
at the level of parcels of hypotheses, by saying that a whole A-parcel has
been discharged. If x is a variable associated to this parcel, then we have
a function v[x, x1, . . . , xn]. We shall associate to our deduction the function
x ↦ v[x, x1, . . . , xn].
The rules for ∀ echo those for ⇒: they do not add much, so we shall in
future omit them from our discussion. On the other hand, we shall soon replace
the boring first-order quantifier by a second-order quantifier with more novel
properties.
2.2.2 Identification of deductions
  A    B
  ------ ∧I
  A ∧ B           equals    A
  ------ ∧1E
    A

  A    B
  ------ ∧I
  A ∧ B           equals    B
  ------ ∧2E
    B

  [A]
   ⋮
   B                         A
  ------ ⇒I                  ⋮
  A ⇒ B       A    equals    B
  -------------- ⇒E
        B
What we have written is clear, provided that we observe carefully what happens
in the last case: all the discharged hypotheses are replaced by (copies of) the
deduction ending in A.
Chapter 3
The Curry-Howard Isomorphism
We have seen that Heyting's ideas perform very well in the framework of natural
deduction. We shall exploit this remark by establishing a formal system of typed
terms for discussing the functional objects which lie behind the proofs. The
significance of the system will be given by means of the functional equations we
have written down. In fact, these equations may be read in two different ways,
which re-iterate the dichotomy between sense and denotation:
as the equations which define the equality of terms, in other words the
equality of denotations (the static viewpoint).
as rewrite rules which allow us to calculate terms by reduction to a normal
form. That is an operational, dynamic viewpoint, the only truly fruitful
view for this aspect of logic.
Of course the second viewpoint is under-developed by comparison with the
first one, as was the case in Logic! For example denotational semantics of
programs (Scott's semantics, for example) abound: for this kind of semantics,
nothing changes throughout the execution of a program. On the other hand,
there is hardly any civilised operational semantics of programs (we exclude ad
hoc semantics which crudely paraphrase the steps toward normalisation). The
establishment of a truly operational semantics of algorithms is perhaps the most
important problem in computer science.
The correspondence between types and propositions was set out in [Howard].
3.1 Lambda Calculus

3.1.1 Types
3.1.2
Terms
Proofs become terms; more precisely, a proof of A (as a formula) becomes a term
of type A (as a type). Specifically:
1. The variables x0^T, . . . , xn^T, . . . are terms of type T.

2. If u and v are terms of types respectively U and V, then ⟨u, v⟩ is a term of
type U × V.

3. If t is a term of type U × V then π1 t and π2 t are terms of types respectively
U and V.

4. If v is a term of type V and xn^U is a variable of type U then λxn^U. v is a term
of type U → V. In general we shall suppose that we have settled questions of
the choice of bound variables and of substitution, by some means or other,
which allows us to disregard the names of bound variables, the idea being
that a bound variable has no individuality.

5. If t and u are terms of types respectively U → V and U, then t u is a term
of type V.
3.2 Denotational significance

Types represent the kind of object under discussion. For example an object of
type U → V is a function from U to V, and an object of type U × V is an ordered
pair consisting of an object of U and an object of V. The meaning of atomic
types is not important; it depends on the context.

The terms follow very precisely the five schemes which we have used for
Heyting semantics and natural deduction.

1. A variable x^T of type T represents any term t of type T (provided that x^T
is replaced by t).

2. ⟨u, v⟩ is the ordered pair of u and v.

3. π1 t and π2 t are respectively the first and second projection of t.

4. λx^U. v is the function which to any u of type U associates v[u/x], that is v
in which x^U is regarded as an abbreviation for u.

5. t u is the result of applying the function t to the argument u.

Denotationally, we have the following (primary) equations

π1⟨u, v⟩ = u        π2⟨u, v⟩ = v        (λx^U. v) u = v[u/x]

together with the secondary equations

⟨π1 t, π2 t⟩ = t        λx^U. t x = t    (x not free in t)
3.3 Operational significance
For want of a clearer idea of how to explain the terms operationally, we have
an ad hoc notion, which is not so bad: we shall make the equations of 3.2
asymmetric and turn them into rewrite rules. This rewriting may be seen as
an embryonic program calculating the terms in question. That is not too bad,
because the operational semantics which we lack is surely very close to this process
of calculation, itself based on the fundamental symmetries of logic.
So one could hope to make progress at the operational level by a close study
of normalisation.
3.4 Conversion

A term t converts to a term t′ when one of the following three cases holds:

t = π1⟨u, v⟩        t = π2⟨u, v⟩        t = (λx^U. v) u
t′ = u              t′ = v              t′ = v[u/x]
t is called the redex and t′ the contractum; they are always of the same type.
A term u reduces1 to a term v when there is a sequence of conversions from u
to v, that is a sequence u = t0, t1, . . . , tn−1, tn = v such that for i = 0, 1, . . . , n−1,
ti+1 is obtained from ti by replacing a redex by its contractum. We write u ⇝ v
for "u reduces to v": ⇝ is reflexive and transitive.

A normal form for t is a term u such that t ⇝ u and which is normal, i.e.
contains no redex. We shall see in the following chapter that normal forms exist
and are unique.
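The three conversion rules can be run as a program, as section 3.3 suggests: apply a conversion somewhere in the term and repeat until no redex remains. A sketch using our own tuple encoding of terms (substitution here ignores variable capture, so it should only be fed closed arguments or fresh names):

```python
# Terms: ("var", x), ("pair", u, v), ("pi1", t), ("pi2", t),
# ("lam", x, ty, body), ("app", t, u). Encoding is illustrative.

def subst(t, x, u):
    """t[u/x], with no capture-avoidance (toy)."""
    tag = t[0]
    if tag == "var":
        return u if t[1] == x else t
    if tag == "lam":
        _, y, ty, body = t
        return t if y == x else ("lam", y, ty, subst(body, x, u))
    return (tag,) + tuple(subst(s, x, u) if isinstance(s, tuple) else s
                          for s in t[1:])

def convert(t):
    """Apply one conversion at the root if t is a redex, else None."""
    if t[0] == "pi1" and t[1][0] == "pair":
        return t[1][1]                      # pi1<u, v>  converts to  u
    if t[0] == "pi2" and t[1][0] == "pair":
        return t[1][2]                      # pi2<u, v>  converts to  v
    if t[0] == "app" and t[1][0] == "lam":
        _, x, _ty, body = t[1]
        return subst(body, x, t[2])         # (lam x. v) u  converts to  v[u/x]
    return None

def step(t):
    """One conversion step anywhere in t (outermost first); None if t is normal."""
    c = convert(t)
    if c is not None:
        return c
    for i, s in enumerate(t):
        if isinstance(s, tuple):
            c = step(s)
            if c is not None:
                return t[:i] + (c,) + t[i + 1:]
    return None

def normal_form(t):
    """Iterate conversion steps until a normal term is reached."""
    while True:
        c = step(t)
        if c is None:
            return t
        t = c
```

This is exactly the "embryonic program" obtained by orienting the equations of 3.2 into rewrite rules.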
We shall want to discuss normal forms in detail, and for this purpose the
following definition, which is essential to the study of untyped -calculus, is useful:
Lemma A term t is normal iff it is in head normal form:

λx1. λx2. . . . λxn. y u1 u2 . . . um

(where y may, but need not, be one of the xi), and moreover the uj are also
normal.
1
A term converts in one step, reduces in many. In chapter 6 we shall introduce a more
abstract notion called reducibility, and the reader should be careful to avoid confusion.
Corollary If the types of the free variables of t are strictly simpler than the type
of t, or in particular if t is closed, then it is an abstraction.
3.5 Description of the isomorphism
This is nothing other than the precise statement of the correspondence between
proofs and functional terms, which can be done in a precise way, now that
functional terms have a precise status. On one side we have proofs with parcels
of hypotheses, these parcels being labelled by integers, on the other side we have
the system of typed terms:
1. To the deduction consisting of the single hypothesis A, belonging to parcel i,
corresponds the variable xi^A.

2. To the deduction ending in ∧I, giving A ∧ B, corresponds ⟨u, v⟩, where u and v
correspond to the deductions of A and B.

3. To the deduction ending in ∧1E (respectively ∧2E), giving A (respectively B),
corresponds π1 t (respectively π2 t), where t corresponds to the deduction of A ∧ B.

4. To the deduction ending in ⇒I, giving A ⇒ B, corresponds λxi^A. v, if the deleted
hypotheses form parcel i, and v corresponds to the deduction of B.

5. To the deduction ending in ⇒E, giving B, corresponds t u, where t
and u correspond to the deductions of A ⇒ B and A.
3.6 Relevance of the isomorphism

Under this isomorphism, the conversions of 3.4 correspond exactly to the
identifications of deductions of 2.2.2: ∧I followed by ∧1E or ∧2E matches the
conversion of π1⟨u, v⟩ or π2⟨u, v⟩, and ⇒I followed by ⇒E matches the conversion
of (λx. v) u.
Basically, the two sides of the isomorphism are undoubtedly the same
object, accidentally represented in two different ways. It seems, in the light of
recent work, that the proof aspect is less tied to contingent intuitions, and is
the way in which one should study algorithms. The functional aspect is more
eloquent, more immediate, and should be kept to a heuristic role.
Chapter 4
The Normalisation Theorem
This chapter concerns the two results which ensure that the typed λ-calculus
behaves well computationally. The Normalisation Theorem provides for the
existence of a normal form, whilst the Church-Rosser property guarantees its
uniqueness. In fact we shall simply state the latter without proof, since it
is not really a matter of type theory and is well covered in the literature,
e.g. [Barendregt].
The normalisation theorem has two forms:
a weak one (there is some terminating strategy for normalisation), which we
shall prove in this chapter,
a strong one (all possible strategies for normalisation terminate), proved in
chapter 6.
4.1 The Church-Rosser property
This property states the uniqueness of the normal form, independently of its
existence. In fact, it has a meaning for calculi, such as untyped λ-calculus,
where the normalisation theorem is false.
Theorem If t ⇝ u and t ⇝ v, then there exists w such that u ⇝ w and v ⇝ w.

Corollary If the equation u = v is provable, then there exists w such that
u ⇝ w and v ⇝ w: iterating the theorem pastes together the zig-zag of reductions
u = t0, t1, . . . , t2n = v joining u to v.
Now, if u and v are two distinct normal forms of the same type (for example
two distinct variables) no such w exists, so the equation u = v cannot be
proved. So Church-Rosser shows the denotational consistency of the system.
4.2 The weak normalisation theorem
This result states the existence of a normal form which is necessarily unique
for every term. Its immediate corollary is the decidability of denotational equality.
Indeed we have seen that the equation u = v is provable exactly when u ⇝ w
and v ⇝ w for some w; but such a w has a normal form, which then becomes the
common normal form for u and v. To decide the denotational equality of u and v we
proceed thus:
in the first step, calculate the normal forms of u and v,
in the second step, compare them.
There is perhaps a small difficulty hidden in calculating the normal forms,
since the reduction is not a deterministic algorithm. That is, for fixed t, many
conversions (but only a finite number) are possible on the subterms of t. So
the theorem states the possibility of finding the normal form by appropriate
conversions, but does not exclude the possibility of bad reductions, which do not
lead to a normal form. That is why one speaks of weak normalisation.
Having said that, it is possible to find the normal form by enumerating all the
reductions in one step, all the reductions in two steps, and so on until a normal
form is found. This inelegant procedure is justified by the fact that there are only
finitely many reductions of length n starting from a fixed term t.
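This inelegant enumeration can be written as a breadth-first search: list all one-step reducts, then all two-step ones, and so on, until a normal term appears. A self-contained sketch, restricted to the pair/projection fragment so that no substitution machinery is needed (term encoding is ours):

```python
# Terms: ("var", x), ("pair", u, v), ("pi1", t), ("pi2", t). Because each
# term has finitely many subterms, each level of the search is finite.

def reducts(t):
    """All terms obtainable from t by exactly one conversion (any redex)."""
    out = []
    if t[0] == "pi1" and t[1][0] == "pair":
        out.append(t[1][1])                 # convert the root redex
    if t[0] == "pi2" and t[1][0] == "pair":
        out.append(t[1][2])
    for i, s in enumerate(t[1:], 1):        # convert inside a subterm
        if isinstance(s, tuple):
            out.extend(t[:i] + (c,) + t[i + 1:] for c in reducts(s))
    return out

def find_normal_form(t):
    """Enumerate reductions of length 1, 2, ... until a normal term appears."""
    frontier = [t]
    while frontier:
        nxt = []
        for u in frontier:
            rs = reducts(u)
            if not rs:                      # u has no redex: it is normal
                return u
            nxt.extend(rs)
        frontier = nxt

t = ("pi1", ("pair", ("var", "a"),
             ("pi2", ("pair", ("var", "b"), ("var", "c")))))
```

In this fragment every strategy terminates, so the search is guaranteed to succeed; the point of the enumeration argument is that it works even when only weak normalisation is known.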
The strong normalisation theorem will simplify the situation by guaranteeing
that all normalisation strategies are good, in the sense they all lead to the normal
form. Obviously, some are more efficient than others, in terms of the number of
steps, but if one ignores this (essential) aspect, one always gets to the result!
4.3 Proof of the weak normalisation theorem
The degree d(t) of a term is the sup of the degrees of the redexes it contains. By
convention, a normal term (i.e. one containing no redex) has degree 0.
NB A redex r has two degrees: one as redex, another as term, for the redex may
contain others; the second degree is greater than or equal to the first: ∂(r) ≤ d(r).
4.3.1 Degree and substitution

4.3.2 Degree and conversion

Lemma If u is obtained from t by conversion, then d(u) ≤ d(t).
Proof We need only consider the case where there is only one conversion step:
u is obtained from t by replacing r by c. The situation is very close to that of
lemma 4.3.1, i.e. in u we find:
redexes which were in t but not in r, modified by the replacement of r by c
(which does not affect the degree),
redexes of c. But c is obtained by simplification of r, or by an internal
substitution in r: (λx. s) s′ becomes s[s′/x] and lemma 4.3.1 tells us that
d(c) ≤ max(d(s), d(s′), ∂(T)), where T is the type of x. But ∂(T) < d(r), so
d(c) ≤ d(r).

redexes which come from the replacement of r by c. The situation is the
same as in lemma 4.3.1: these redexes have degree equal to ∂(T) where T
is the type of r, and ∂(T) < ∂(r).
4.3.3 Conversion of maximal degree
Lemma Let r be a redex of maximal degree n in t, and suppose that all the
redexes strictly contained in r have degree less than n. If u is obtained from t by
converting r to c then u has strictly fewer redexes of degree n.
Proof When the conversion is made, the following things happen:
The redexes outside r remain.
The redexes strictly inside r are in general conserved, but sometimes
proliferated: for example if one replaces (λx. ⟨x, x⟩) s by ⟨s, s⟩, the redexes
of s are duplicated. The hypothesis made does not exclude duplication, but
it is limited to degrees less than n.
The redex r is destroyed and possibly replaced by some redexes of strictly
smaller degree.
4.3.4 Proof of the theorem
Lemma 4.3.3 says that it is possible to choose a redex r of t in such a way that,
after conversion of r to c, the result t′ satisfies µ(t′) < µ(t) for the lexicographic
order, i.e. if µ(t′) = (n′, m′) then n′ < n or (n′ = n and m′ < m). So the result is
established by a double induction.
4.4 The strong normalisation theorem
The weak normalisation theorem is in fact a bit better than its statement leads
us to believe, because we have a simple algorithm for choosing at each step an
appropriate redex which leads us to the normal form. Having said this, it is
interesting to ask whether all normalisation strategies converge.
A term t is strongly normalisable when there is no infinite reduction sequence
beginning with t.
Lemma t is strongly normalisable iff there is a number ν(t) which bounds the
length of every normalisation sequence beginning with t.

Proof From the existence of ν(t), it follows immediately that t is strongly
normalisable.

The converse uses König's lemma¹: one can represent a sequence of conversions
by specifying a redex r0 of t0 , then a redex r1 of t1 , and so on. The possible
sequences can then be arranged in the form of a tree, and the fact that a term
has only a finite number of subterms assures us that the tree is finitely-branching.
Now, the strong normalisation hypothesis tells us that the tree has no infinite
branch, and by König's lemma, the whole tree must be finite, which gives us the
existence of ν(t).
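This equivalence has a computational reading: over a finitely-branching reduction relation, ν(t) can be computed by exhaustive search of the tree of reductions, and the search terminates exactly when t is strongly normalisable. A minimal sketch in Python, written over an abstract one-step function (the toy reduction below is a stand-in for illustration, not λ-calculus reduction):

```python
def nu(t, step):
    """Length of the longest reduction sequence from t.

    step(t) returns the (finite) list of one-step reducts of t.
    The recursion explores the finitely-branching tree of reduction
    sequences; by Konig's lemma it terminates iff t is strongly
    normalisable.
    """
    reducts = step(t)
    if not reducts:              # t is normal
        return 0
    return 1 + max(nu(u, step) for u in reducts)

# Toy strongly-normalising reduction on numbers: n reduces to every
# smaller number in one step.
toy_step = lambda n: list(range(n))
print(nu(3, toy_step))   # 3: the longest sequence is 3 -> 2 -> 1 -> 0
```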
There are several methods to prove that every term (of the typed λ-calculus)
is strongly normalisable:

• internalisation: this consists of a tortuous translation of the calculus into
itself in such a way as to prove strong normalisation by means of weak
normalisation. Gandy was the first to use this technique [Gandy].

• reducibility: we introduce a property of hereditary calculability which
allows us to manipulate complex combinatorial information. This is the
method we shall follow, since it is the only one which generalises to very
complicated situations. This method will be the subject of chapter 6.
¹ A finitely branching tree with no infinite branch is finite. Unless the branches are labelled
(as they usually are), this requires the axiom of Choice.
Chapter 5
Sequent Calculus
The sequent calculus, due to Gentzen, is the prettiest illustration of the symmetries
of Logic. It presents numerous analogies with natural deduction, without being
limited to the intuitionistic case.
This calculus is generally ignored by computer scientists¹. Yet it underlies
essential ideas: for example, PROLOG is an implementation of a fragment of sequent
calculus, and the tableaux used in automatic theorem-proving are just a special
case of this calculus. In other words, it is used unwittingly by many people, but
mixed with control features, i.e. programming devices. What makes everything
work is the sequent calculus with its deep symmetries, and not particular tricks.
So it is difficult to consider, say, the theory of PROLOG without knowing thoroughly
the subtleties of sequent calculus.
From an algorithmic viewpoint, the sequent calculus has no Curry-Howard
isomorphism, because of the multitude of ways of writing the same proof. This
prevents us from using it as a typed λ-calculus, although we glimpse some deep
structure of this kind, probably linked with parallelism. But it requires a new
approach to the syntax, for example natural deductions with several conclusions.
¹ An exception is [Gallier].
5.1 The calculus

5.1.1 Sequents

5.1.2 Structural rules
These rules, which seem not to say anything at all, impose a certain way of
managing the slots in which one writes formulae. They are:
1. The exchange rules:

   A, C, D, A′ ⊢ B               A ⊢ B, C, D, B′
   ──────────────── LX           ──────────────── RX
   A, D, C, A′ ⊢ B               A ⊢ B, D, C, B′

2. The weakening rules:

   A ⊢ B                A ⊢ B
   ────────── LW        ────────── RW
   A, C ⊢ B             A ⊢ C, B

3. The contraction rules:

   A, C, C ⊢ B          A ⊢ C, C, B
   ──────────── LC      ──────────── RC
   A, C ⊢ B             A ⊢ C, B
In fact, contrary to popular belief, these rules are the most important of
the whole calculus, for, without having written a single logical symbol, we have
practically determined the future behaviour of the logical operations. Yet these
rules, if they are obvious from the denotational point of view, should be examined
closely from the operational point of view, especially the contraction.
It is possible to envisage variants on the sequent calculus, in which these rules
are abolished or extremely restricted. That seems to have some very beneficial
effects, leading to linear logic [Gir87]. But without going that far, certain
well-known restrictions on the sequent calculus seem to have no purpose apart
from controlling the structural rules, as we shall see in the following sections.
5.1.3
5.1.4 The identity group
1. For every formula C there is the identity axiom C ⊢ C. In fact one could
limit it to the case of atomic C, but this is rarely done.

2. The cut rule

   A ⊢ C, B    A′, C ⊢ B′
   ──────────────────────── Cut
   A, A′ ⊢ B, B′

is another way of expressing the identity. The identity axiom says that C
(on the left) is stronger than C (on the right); this rule states the converse
truth, i.e. C (on the right) is stronger than C (on the left).
The identity axiom is absolutely necessary to any proof, to start things off.
That is undoubtedly why the cut rule, which represents the dual, symmetric
aspect, can be eliminated, by means of a difficult theorem (proved in chapter 13)
which is related to the normalisation theorem. The deep content of the two results
is the same; they only differ in their syntactic dressing.
5.1.5 Logical rules
There is a tradition which would have it that Logic is a formal game, a succession
of more or less arbitrary axioms and rules. Sequent calculus (and natural deduction
as well) shows this is not at all so: one can amuse oneself by inventing one's own
logical operations, but they have to respect the left/right symmetry, otherwise
one creates a logical atrocity without interest. Concretely, the symmetry is the
fact that we can eliminate the cut rule.
1. Negation: the rules for negation allow us to pass from the right hand side
of ⊢ to the left, and conversely:

   A ⊢ C, B             A, C ⊢ B
   ─────────── L¬       ─────────── R¬
   A, ¬C ⊢ B            A ⊢ ¬C, B

2. Conjunction: on the left, two unary rules; on the right, one binary rule:

   A, C ⊢ B              A, D ⊢ B
   ──────────── L1∧      ──────────── L2∧
   A, C ∧ D ⊢ B          A, C ∧ D ⊢ B

   A ⊢ C, B    A′ ⊢ D, B′
   ──────────────────────── R∧
   A, A′ ⊢ C ∧ D, B, B′

3. Disjunction: obtained from conjunction by interchanging left and right:

   A, C ⊢ B    A′, D ⊢ B′
   ──────────────────────── L∨
   A, A′, C ∨ D ⊢ B, B′

   A ⊢ C, B              A ⊢ D, B
   ──────────── R1∨      ──────────── R2∨
   A ⊢ C ∨ D, B          A ⊢ C ∨ D, B

   Special case: the intuitionistic rule

   A, C ⊢ B    A′, D ⊢ B
   ─────────────────────── L∨
   A, A′, C ∨ D ⊢ B

where B contains zero or one formula. This rule is not a special case of its
classical analogue, since a classical L∨ leads to B, B on the right. This is
the only case where the intuitionistic rule is not simply a restriction of the
classical one.
4. Implication: here we have on the left a rule with two premises and on the
right a rule with one premise. They match again, but in a different way
from the case of conjunction: the rule with one premise uses two occurrences
in the premise:

   A ⊢ C, B    A′, D ⊢ B′
   ──────────────────────── L⇒
   A, A′, C ⇒ D ⊢ B, B′

   A, C ⊢ D, B
   ───────────── R⇒
   A ⊢ C ⇒ D, B

5. Universal quantification: two unary rules which match in the sense that one
uses a variable and the other a term:

   A, C[a/ξ] ⊢ B           A ⊢ C, B
   ────────────── L∀       ────────────── R∀
   A, ∀ξ. C ⊢ B            A ⊢ ∀ξ. C, B

(in R∀ the variable ξ must not be free in A, B).

6. Existential quantification: symmetrically, one rule uses a variable and the
other a term:

   A, C ⊢ B                A ⊢ C[a/ξ], B
   ────────────── L∃       ─────────────── R∃
   A, ∃ξ. C ⊢ B            A ⊢ ∃ξ. C, B

(in L∃ the variable ξ must not be free in A, B).
5.2 Some properties of the system without cut

5.2.1 The last rule
If we can prove A in the predicate calculus, then it is possible to show the sequent
⊢ A without cut. What is the last rule used? Surely not RW, because the empty
sequent is not provable. Perhaps it is the logical rule Rs where s is the principal
symbol of A, and this case is very important. But it may also be RC, in which
case we are led to ⊢ A, A and all is lost! That is why the intuitionistic case, with
its special management which forbids contraction on the right, is very important:
if A is provable in the intuitionistic sequent calculus by a cut-free proof, then the
last rule is a right logical rule.
Two particularly famous cases:
• If A is a disjunction A′ ∨ A′′, the last rule must be R1∨, in which case
⊢ A′ is provable, or R2∨, in which case ⊢ A′′ is provable: this is what is
called the Disjunction Property.

• If A is an existence ∃ξ. A′, the last rule must be R∃, which means that the
premise is of the form ⊢ A′[a/ξ] ; in other words, a term a can be found
such that ⊢ A′[a/ξ] is provable: this is the Existence Property.
These two examples fully justify the interest of limiting the use of the structural
rules, a limitation which leads to linear logic.
5.2.2 Subformula property
Let us consider the last rule of a proof: can one somehow predict the premises?
The cut rule is absolutely unpredictable, since an arbitrary formula C disappears:
it cannot be recovered from the conclusions. It is the only rule which behaves so
badly. Indeed, all the other rules have the property that the unspecified context
part (written A, B, etc.) is preserved intact. The rule actually concerns only
a few of the formulae. But the formulae in the premises are simpler than the
corresponding ones in the conclusions. For example, for A B as a conclusion,
A and B must have been used as premises, or for . A as a conclusion, A[a/]
must have been used as a premise. In other words, one has to use subformulae as
premises:
The immediate subformulae of A B, A B and A B are A and B.
The only immediate subformula of A is A.
The immediate subformulae of . A and . A are the formulae A[a/]
where a is any term.
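The clauses above translate into a short recursive function. A sketch in Python, with a naive tuple encoding of formulae (the encoding and all names are ours, purely for illustration):

```python
def substitute(formula, var, term):
    """Replace the free variable var by term (naive: assumes no
    variable capture, which is enough for this illustration)."""
    kind = formula[0]
    if kind == 'atom':
        name, args = formula[1], formula[2]
        return ('atom', name, tuple(term if x == var else x for x in args))
    if kind == 'not':
        return ('not', substitute(formula[1], var, term))
    if kind in ('and', 'or', 'imp'):
        return (kind, substitute(formula[1], var, term),
                      substitute(formula[2], var, term))
    if formula[1] == var:        # quantifier rebinding var: stop
        return formula
    return (kind, formula[1], substitute(formula[2], var, term))

def immediate_subformulae(formula, term=None):
    """Immediate subformulae; a quantified formula has one for each
    term, so the witnessing term is taken as a parameter."""
    kind = formula[0]
    if kind in ('and', 'or', 'imp'):
        return [formula[1], formula[2]]
    if kind == 'not':
        return [formula[1]]
    if kind in ('forall', 'exists'):
        return [substitute(formula[2], formula[1], term)]
    return []                    # an atom has no subformulae

print(immediate_subformulae(('forall', 'x', ('atom', 'P', ('x',))), 'a'))
# [('atom', 'P', ('a',))]
```

For a quantified sentence the set of immediate subformulae is infinite (one per term), which is why the parameter is needed; this is exactly the point made below about decidability.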
Now it is clear that all the rules except the cut have the property
that the premises are made up of subformulae of the conclusion. In particular,
a cut-free proof of a sequent uses only subformulae of its formulae. We shall
prove the corresponding result for natural deduction in section 10.3.1. This is
very interesting for automated deduction. Of course, it is not enough to make
the predicate calculus decidable, since we have an infinity of subformulae for the
sentences with quantifiers.
5.2.3 Asymmetrical interpretation
We have described the identity axiom and the cut rule as the two faces of
A is A. Now, in the absence of cut, the situation is suddenly very different:
we can no longer express that A (on the right) is stronger than A (on the left).
Then there arises the possibility of splitting A into two interpretations A^L and
A^R , which need not necessarily coincide. Let us be more precise.
In a sentence, we can define the signature of an occurrence of an atomic
predicate, +1 or −1: the signature is the parity of the number of times that this
symbol has been negated. Concretely, P retains the signature which it had in A,
when it is considered in A ∧ B, B ∧ A, A ∨ B, B ∨ A, B ⇒ A, ∀ξ. A and ∃ξ. A,
and reverses it in ¬A and A ⇒ B.
In a sequent too, we can define the signature of an occurrence of a predicate:
if P occurs in A on the left of `, the signature is reversed, if P occurs on the
right, it is conserved.
The rules of the sequent calculus (apart from the identity axiom and the cut)
preserve the signature: in other words, they relate occurrences with the same
signature. The identity axiom says that the negative occurrences (signature −1)
are stronger than the positive ones; the cut says the opposite. So in the absence
of cut, there is the possibility of giving asymmetric interpretations to sequent
calculus: A does not have the same meaning when it is on the right as when it is
on the left of `.
• A^R is obtained by replacing the positive occurrences of the predicate P by
P^R and the negative ones by P^L .

• A^L is obtained by replacing the positive occurrences of the predicate P by
P^L and the negative ones by P^R .

The atomic symbols P^R and P^L are tied together by a condition, namely
P^L ⇒ P^R .
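The signature computation is a direct recursion on formulae, reversing the sign under negation and in the antecedent of an implication. A sketch in Python (the encoding is ours, purely for illustration):

```python
def signatures(formula, sign=+1):
    """Yield (predicate, signature) for each atomic occurrence; start
    with sign=-1 for a formula on the left of the turnstile."""
    kind = formula[0]
    if kind == 'atom':
        yield (formula[1], sign)
    elif kind == 'not':
        yield from signatures(formula[1], -sign)
    elif kind == 'imp':
        yield from signatures(formula[1], -sign)  # antecedent reversed
        yield from signatures(formula[2], sign)
    elif kind in ('and', 'or'):
        yield from signatures(formula[1], sign)
        yield from signatures(formula[2], sign)
    else:                                         # 'forall' / 'exists'
        yield from signatures(formula[2], sign)

# In not(P => Q), P is negated twice and Q once:
print(list(signatures(('not', ('imp', ('atom', 'P'), ('atom', 'Q'))))))
# [('P', 1), ('Q', -1)]
```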
5.3 Sequent calculus and natural deduction

We shall consider here the noble part of natural deduction, that is, the fragment
without ∨, ∃ or ¬. We restrict ourselves to sequents of the form A ⊢ B ; the
correspondence with natural deduction is given as follows:
• To a proof of A ⊢ B corresponds a deduction of B under the hypotheses,
or rather parcels of hypotheses, A.

• Conversely, a deduction of B under the (parcels of) hypotheses A can be
represented in the sequent calculus, but unfortunately not uniquely.
From a proof of A ⊢ B , we build a deduction of B, of which the hypotheses
are parcels, each parcel corresponding in a precise way to a formula of A.

1. The axiom A ⊢ A becomes the deduction consisting of A alone, which is at
once hypothesis and conclusion.

2. If the last rule is the cut

   A ⊢ B    A′, B ⊢ C
   ──────────────────── Cut
   A, A′ ⊢ C

and δ, δ′ are the deductions associated to the sub-proofs above the two
premises, then we associate to our proof the deduction δ′ in which all the
occurrences of B in the parcel it represents are replaced by the deduction δ
of B from A.
3. The exchange rule LX is interpreted as the identity: the same deduction
before and after the rule.
4. The rule LW

   A ⊢ B
   ────────── LW
   A, C ⊢ B

becomes the addition of a mock parcel: the deduction is unchanged, but we
regard it as having a parcel formed of zero occurrences of the hypothesis C.

5. The rule LC

   A, C, C ⊢ B
   ──────────── LC
   A, C ⊢ B

becomes the amalgamation of two parcels of hypotheses C into a single one.

6. The rule R∧

   A ⊢ B    A′ ⊢ C
   ───────────────── R∧
   A, A′ ⊢ B ∧ C

becomes the introduction applied to the two deductions:

   A         A′
   ⋮          ⋮
   B         C
   ──────────── ∧I
      B ∧ C

7. The rule R⇒

   A, B ⊢ C
   ────────── R⇒
   A ⊢ B ⇒ C

becomes

   A, [B]
     ⋮
     C
   ─────── ⇒I
   B ⇒ C

8. The rule R∀

   A ⊢ B
   ────────── R∀
   A ⊢ ∀ξ. B

becomes

   A
   ⋮
   B
   ────── ∀I
   ∀ξ. B
9. With the left rules appears one of the hidden properties of natural deduction,
namely that the elimination rules (which correspond grosso modo to the left
rules of sequents) are written backwards! This is nowhere seen better than
in linear logic, which makes the lost symmetries reappear. Here concretely,
this is reflected in the fact that the left rules are translated by actions on
parcels of hypotheses.
The rule L1∧ becomes ∧1E:

   A, B ⊢ D
   ──────────── L1∧
   A, B ∧ C ⊢ D

is interpreted by ∧1E: in the deduction of D, each hypothesis B of the
parcel is replaced by the deduction

   B ∧ C
   ────── ∧1E
     B

so that the parcel of hypotheses B becomes a parcel of hypotheses B ∧ C.

The rule L⇒

   A ⊢ B    A′, C ⊢ D
   ───────────────────── L⇒
   A, A′, B ⇒ C ⊢ D

is interpreted by ⇒E: each hypothesis C of the parcel is replaced by the
deduction

              A
              ⋮
   B ⇒ C      B
   ───────────── ⇒E
        C

Similarly, the rule L∀

   A, B[a/ξ] ⊢ C
   ─────────────── L∀
   A, ∀ξ. B ⊢ C

is interpreted by ∀E: each hypothesis B[a/ξ] of the parcel is replaced by

   ∀ξ. B
   ─────── ∀E
   B[a/ξ]

5.4 Properties of the translation
The translation from sequent calculus into natural deduction is not 1–1: different
proofs give the same deduction. For example the proofs

   A ⊢ A    B ⊢ B                          A ⊢ A    B ⊢ B
   ───────────────── R∧                    ───────────────── R∧
   A, B ⊢ A ∧ B                            A, B ⊢ A ∧ B
   ─────────────────── L1∧                 ─────────────────── L1∧
   A ∧ A′, B ⊢ A ∧ B                       A, B ∧ B′ ⊢ A ∧ B
   ───────────────────────── L1∧           ───────────────────────── L1∧
   A ∧ A′, B ∧ B′ ⊢ A ∧ B                  A ∧ A′, B ∧ B′ ⊢ A ∧ B

which differ only in the order of the rules, have the same translation:

   A ∧ A′               B ∧ B′
   ─────── ∧1E          ─────── ∧1E
      A                    B
   ──────────────────────────── ∧I
             A ∧ B
In some sense, we should think of the natural deductions as the true proof
objects. The sequent calculus is only a system which enables us to work on these
objects: A ⊢ B tells us that we have a deduction of B under the hypotheses A.
A rule such as the cut

   A ⊢ C    A′, C ⊢ B
   ──────────────────── Cut
   A, A′ ⊢ B
allows us to construct a new deduction from two others, in a sense made explicit
by the translation.
In other words, the system of sequents is not primitive, and the rules of
the calculus are in fact more or less complex combinations of rules of natural
deduction:
1. The logical rules on the right correspond to introductions.
2. Those on the left to eliminations. Here the direction of the rules is inverted
in the case of natural deduction, since in fact, the tree of natural deduction
grows by its leaves at the elimination stage.
The correspondence R = I, L = E is extremely precise, for example we have
R∧ = ∧I and L1∧ = ∧1E.
3. The contraction rule LC corresponds to the formation of parcels, and LW,
in some cases, to the formation of mock parcels.
4. The exchange rule corresponds to nothing at all.
5. The cut rule does not correspond to a rule of natural deduction, but to the
need to make deductions grow at the root. Let us give an example: the
strict translation of L⇒ gives us, from a deduction of A and one of C (with
a B-parcel as hypothesis), the deduction in which each hypothesis B of the
parcel is replaced by

   A ⇒ B     A
   ───────────── ⇒E
        B

which grows in the wrong direction (towards the leaves). Yet,
the full power of the calculus is only obtained with the top-down rule
⇒E. For example, the deduction

   A ⇒ B     A
   ───────────── ⇒E
        B

(in which A ⇒ B and A come from deductions with hypotheses B′ and A′
respectively) is translated by the proof

                    A′ ⊢ A    B ⊢ B
                    ──────────────── L⇒
   B′ ⊢ A ⇒ B      A′, A ⇒ B ⊢ B
   ──────────────────────────────── Cut
            A′, B′ ⊢ B
Chapter 6
Strong Normalisation Theorem
In this chapter we shall prove the strong normalisation theorem for the simple
typed λ-calculus, but since we have already discussed this topic at length, and
in particular proved weak normalisation, the purpose of the chapter is really to
introduce the technique which we shall later apply to system F.
For the simple typed λ-calculus, there are proof-theoretic techniques which make
it possible to express the argument of the proof in arithmetic, and even in a very
weak system. However our method extends straightforwardly to Gödel's system
T, which includes a type of integers and hence codes Peano Arithmetic. As a
result, strong normalisation implies the consistency of PA, which means that it
cannot itself be proved in PA (Second Incompleteness Theorem).
Accordingly we have to use a strong induction hypothesis, for which we
introduce an abstract notion called reducibility, originally due to [Tait]. Some of
the technical improvements, such as neutrality, are due to [Gir72]. Besides proving
strong normalisation, we identify the three important properties (CR 1-3) of
reducibility which we shall use for system F in chapter 14.
6.1
Reducibility
The deep reason why reducibility works where combinatorial intuition fails is
its logical complexity. Indeed, we have:

   t ∈ RED_{U→V}   iff   ∀u (u ∈ RED_U ⇒ t u ∈ RED_V )
6.2 Properties of reducibility

The reducible terms of each type T satisfy three conditions:

(CR 1) If t ∈ RED_T , then t is strongly normalisable.

(CR 2) If t ∈ RED_T and t ⟶ t′ , then t′ ∈ RED_T .

(CR 3) If t is neutral, and t′ ∈ RED_T for every t′ obtained from t by
converting a redex, then t ∈ RED_T .

(A term is neutral if it is not of the form ⟨u, v⟩ or λx. v.)
6.2.1 Atomic types

6.2.2 Product type

6.2.3 Arrow type
A term of arrow type is reducible iff all its applications to reducible terms are
reducible.
(CR 1) If t is reducible of type U → V , let x be a variable of type U ; the
induction hypothesis (CR 3) for U says that the term x, which is neutral
and normal, is reducible. So t x is reducible. Just as in the case of
the product type, we remark that ν(t) ≤ ν(t x). The induction hypothesis
(CR 1) for V guarantees that ν(t x) is finite, and so ν(t) is finite, and t is
strongly normalisable.
(CR 2) If t ⟶ t′ and t is reducible, take u reducible of type U ; then t u is
reducible and t u ⟶ t′ u. The induction hypothesis (CR 2) for V gives that
t′ u is reducible. So t′ is reducible.
(CR 3) Let t be neutral and suppose all the t′ one step from t are reducible. Let u
be a reducible term of type U ; we want to show that t u is reducible. By
induction hypothesis (CR 1) for U , we know that u is strongly normalisable;
so we can reason by induction on ν(u).
In one step, t u converts to:

• t′ u with t′ one step from t; but t′ is reducible, so t′ u is.

• t u′ with u′ one step from u; u′ is reducible by (CR 2) and ν(u′) < ν(u),
so t u′ is reducible by the induction hypothesis on ν(u).

There is no other possibility, since t, being neutral, is not of the form λx. v
and so t u is not itself a redex. In every case the neutral term t u converts
to reducible terms only, so it is reducible.
6.3 Reducibility theorem

6.3.1 Pairing

6.3.2 Abstraction

6.3.3 The theorem
Chapter 7

Gödel's system T
The extremely rudimentary type system we have studied has very little expressive
power. For example, can we use it to represent the integers or the booleans, and
if so can we represent sufficiently many functions on them? The answer is clearly
no.
To obtain more expressivity, we are inexorably led to the consideration of other
schemes: new types, or new terms, often both together. So it is quite natural
that systems such as that of Gödel appear, which we shall look at briefly. That
said, we come up against a two-fold difficulty:
Systems like T are a step backwards from the logical viewpoint: the new
schemes do not correspond to proofs in an extended logical system. In
particular, that makes it difficult to study them.
By proposing improvements of expressivity, these systems suggest the
possibility of further improvements. For example, it is well known that the
language PASCAL does not have the type of lists built in! So we are led to
endless improvement, in order to be able to consider, besides the booleans,
the integers, lists, trees, etc. Of course, all this is done to the detriment of
conceptual simplicity and modularity.
The system F resolves these questions in a very satisfying manner, as it will
be seen that the addition of a new logical scheme allows us to deal with common
data types. But first, let us concentrate on the system T, which already has
considerable expressive power.
46
7.1 The calculus

7.1.1 Types
In chapter 3 we allowed for given additional constant types; we shall now specify
two such types, namely Int (integers) and Bool (booleans).
7.1.2 Terms
Besides the usual five, there are schemes specific to the types Int and Bool.
We have retained the introduction/elimination terminology, as these schemes will
appear later in F:
1. Int-introduction:
O is a constant of type Int;
if t is of type Int, then S t is of type Int.
2. Int-elimination: if u, v, t are of types respectively U , U → (Int → U ) and Int,
then R u v t is of type U .
3. Bool-introduction: T and F are constants of type Bool.
4. Bool-elimination: if u, v, t are of types respectively U , U and Bool, then
D u v t is of type U .
7.1.3 Intended meaning
7.1.4 Conversions

   R u v O ⟶ u
   R u v (S t) ⟶ v (R u v t) t
   D u v T ⟶ u
   D u v F ⟶ v
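These conversion schemes can be read directly as a functional program. A minimal sketch in Python (the encoding of numerals as Python ints and of T, F as Python booleans is ours, purely for illustration):

```python
def R(u, v, t):
    """Recursion operator of T: R u v O converts to u, and
    R u v (S t) converts to v (R u v t) t.  Numerals are coded as
    Python ints (n stands for S^n O); v is curried, as in T."""
    if t == 0:
        return u
    return v(R(u, v, t - 1))(t - 1)

def D(u, v, t):
    """Conditional: D u v T converts to u, D u v F converts to v."""
    return u if t else v

# Addition defined through R: x + y = R x (\z. \n. S z) y.
add = lambda x, y: R(x, lambda z: lambda n: z + 1, y)
print(add(3, 4))          # 7
print(D('u', 'v', True))  # u
```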
7.2 Normalisation theorem
In T, all the reduction sequences are finite and lead to the same normal form.
Proof Part of the result is the extension of Church-Rosser; it is not difficult to
extend the proof for the simple system to this more complex case. The other part
is a strong normalisation result, for which reducibility is well adapted (it was for
T that Tait invented the notion).
First, the notion of neutrality is extended: a term is called neutral if it is not
of the form ⟨u, v⟩, λx. v, O, S t, T or F. Then, without changing anything, we
show successively:
1. O, T and F are reducible: they are normal terms of atomic type.

2. If t of type Int is reducible (i.e. strongly normalisable), then S t is reducible:
this comes from ν(S t) = ν(t).
3. If u, v, t are reducible, then D u v t is reducible: u, v, t are strongly
normalisable by (CR 1), and so one can reason by induction on the number
ν(u) + ν(v) + ν(t). The neutral term D u v t converts to one of the following
terms:

• D u′ v′ t′ with u, v, t reduced respectively to u′ , v′ , t′ . In this case,
we have ν(u′ ) + ν(v′ ) + ν(t′ ) < ν(u) + ν(v) + ν(t), and by induction
hypothesis, the term is reducible.

• u or v if t is T or F; these two terms are reducible.

We conclude by (CR 3) that D u v t is reducible.
4. If u, v, t are reducible, then R u v t is reducible: here also we reason
by induction, but on ν(u) + ν(v) + ν(t) + ℓ(t), where ℓ(t) is the number of
symbols of the normal form of t. In one step, R u v t converts to:

• R u′ v′ t′ with u, v, t reduced respectively to u′ , v′ , t′ : reducible by
induction hypothesis.

• u (if t = O): reducible.

• v (R u v w) w, where S w = t; since ν(w) = ν(t) and ℓ(w) < ℓ(t), the
induction hypothesis tells us that R u v w is reducible. As v and w are,
v (R u v w) w is reducible by the definition for U → V .
The use of the induction hypothesis in the final case is really essential: it is
the only occasion, in all the uses so far made of reducibility, where we truly use
an induction on reducibility. For the other cases, the cognoscenti will see that
we really have no need for induction on a complex predicate, by reformulating
(CR 3) in an appropriate way.
7.3 Expressive power: examples

7.3.1 Booleans

   disj(u, v) = D T v u
   conj(u, v) = D v F u

   G ⟨x, T⟩
   G ⟨F, F⟩

7.3.2 Integers
Addition satisfies x + O = x together with

   x + S y = S (x + y)

and the test for zero satisfies null(O) = T together with

   null(S x) = F

each scheme of equations being realised by means of the recursion operator R,
for example x + y def= R x (λz Int . λz′ Int . S z) y.
None of these examples makes serious use of higher types. However, as the
types used in the recursion increase, more and more functions become expressible.
For example, if f is of type Int → Int, one can define it(f ) of type Int → Int by

   it(f ) x = R 1 (λz Int . λz′ Int . f z) x

(it(f ) n is f^n 1). One can also use a simple iterator It, with the conversions
It u v O ⟶ u and It u v (S t) ⟶ v (It u v t).
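Iteration at higher types can be sketched in the same style. Below, a Python encoding (numerals as Python ints; the encoding and all names are ours) of it(f ), whose recursion step simply ignores its numeric argument:

```python
def R(u, v, t):
    """Recursion operator of T, numerals coded as Python ints:
    R u v O = u and R u v (S t) = v (R u v t) t."""
    return u if t == 0 else v(R(u, v, t - 1))(t - 1)

# it(f) x = R 1 (\z. \z'. f z) x, so it(f) n computes f^n(1).
def it(f):
    return lambda x: R(1, lambda z: lambda n: f(z), x)

double = lambda z: 2 * z
print(it(double)(5))   # 32, i.e. doubling applied five times to 1
```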
7.4 Expressive power: results

7.4.1 Canonical forms
First a question: what guarantee do we have that Int represents the integers, Bool
the booleans, etc.? It is not because we have represented the integers in the type
Int that this type can immediately claim to represent the integers. The answer
lies in the following lemma:
Lemma Let t be a closed normal term:

• If t is of type Int, then t is a numeral: S S . . . S O.

• If t is of type Bool, then t is T or F.

• If t is of type U × V , then t is of the form ⟨u, v⟩.

• If t is of type U → V , then t is of the form λx. v.

Proof By induction on the number of symbols of t. If t is S w, the induction
hypothesis applied to w makes w a numeral, hence t is one too. So we suppose
that t is not of the form O, T, F, ⟨u, v⟩ or λx. v:
• If t is R u v w, then the induction hypothesis says that w is a numeral,
and then t is not normal.

• If t is D u v w, then by the induction hypothesis w is T or F, and then t is
not normal.

• If t is πi w, then again w is of the form ⟨u, v⟩, and t is not normal.

• If t is w u, then w is of the form λx. v, and t is not normal.
7.4.2 Representable functions

A closed term t of type Int → Int defines a function |t| from integers to integers:

   |t|(n) = m   iff   t n normalises to the numeral m
The functions |t| are clearly calculable: the normalisation algorithm gives |t|(n)
as a function of n. So those functions representable in T are recursive. Can we
characterise the class of such functions?
In the sense of complexity. Thus for instance hyperexponential algorithms, such as the
proof of cut elimination, are not feasible.
Chapter 8
Coherence Spaces
The earliest work in denotational semantics was done by [Scott69] for the untyped
λ-calculus, and much has been written since then. His approach is characterised by
continuity, i.e. the preservation of directed joins. In this chapter, a novel kind of
domain theory is introduced, in which we also have (and preserve) meets bounded
above (pullbacks). This property, called stability, was originally introduced by
[Berry] in an attempt to give a semantic characterisation of sequential algorithms.
We shall find that this semantics is well adapted to system F and leads us towards
linear logic.
8.1 General ideas
This interpretation is all very well, but it does not explain anything. The
computationally interesting objects just get drowned in a sea of set-theoretic
functions. The function spaces also quickly become enormous.
Kreisel had the following idea (hereditarily effective operations):

• type = partial equivalence relation on N.

• U → V is the set of (codes of) partial recursive functions f such that, if
x U y, then f (x) V f (y), subject to the equivalence relation:

   f (U → V ) g   iff   ∀x, y (x U y ⇒ f (x) V g(y))
The most common (but by no means the universal) answer to this question is to use the
compact-open topology, in which a function lies in a basic open set if, when restricted to a
specified compact set, its values lie in a specified open set. This topology is only well-behaved
when the spaces are locally compact (every point has a base of compact neighbourhoods), and
even then the function space need not itself be locally compact.
² There is, however, a logical view of topology, which has been set out in a computer
science context by [Abr88, ERobinson, Smyth, Vickers].
8.2 Coherence Spaces

A coherence space is a set A (of subsets of some set of tokens) satisfying:

i) Down-closure: if a ∈ A and a′ ⊆ a, then a′ ∈ A.

ii) Binary completeness: if M ⊆ A and if a1 ∪ a2 ∈ A for all a1 , a2 ∈ M ,
then ∪M ∈ A.

For example

   ∅, {t}, {f }        and        ∅, {0}, {1}, {2}, . . .

are (very basic) coherence spaces, respectively called Bool and Int, but the set
of all subsets of {0, 1, 2} apart from {0, 1, 2} itself
is not. However we shall see that coherence spaces are better regarded as
undirected graphs.
8.2.1 The web of a coherence space

Consider |A| def= ∪A = {α : {α} ∈ A}. The elements of |A| are called tokens, and
the coherence relation modulo A is defined between tokens by

   α ⌣ α′ (mod A)   iff   {α, α′ } ∈ A
The term espace coherent is used in the French text, and indeed Plotkin has also used the
word coherent to refer to this binary condition. However coherent space is established, albeit
peculiar, usage for a space with a basis of compact open sets, also called a spectral space.
Consequently, the term was modified in translation.
For example, the web of Bool consists of the tokens t and f , which are
incoherent; similarly the web of Int is a discrete graph whose points are the
integers. Such domains we call flat.
The construction of the web of a coherence space is a bijection between
coherence spaces and (reflexive-symmetric) graphs. From the web we recover the
coherence space by:
   a ∈ A   iff   a ⊆ |A| ∧ ∀α1 , α2 ∈ a (α1 ⌣ α2 (mod A))
So in the terminology of Graph Theory, a point is exactly a clique, i.e. a
complete subgraph.
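Checking that a subset is a point is thus just checking that it is a clique of the web. A small sketch in Python, with the flat web of Bool as example (the encoding is ours, purely for illustration):

```python
from itertools import combinations

def is_point(a, web, coherent):
    """A point of a coherence space: a subset of the web whose
    tokens are pairwise coherent, i.e. a clique of the graph."""
    return set(a) <= set(web) and all(
        coherent(x, y) for x, y in combinations(a, 2))

# Web of Bool: tokens t and f, incoherent with each other.
web_bool = {'t', 'f'}
coh_bool = lambda x, y: x == y    # only the reflexive edges

print(is_point(set(), web_bool, coh_bool))       # True: the empty point
print(is_point({'t'}, web_bool, coh_bool))       # True: a total object
print(is_point({'t', 'f'}, web_bool, coh_bool))  # False: t, f incoherent
```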
8.2.2 Interpretation
The aim is to interpret a type by a coherence space A, and a term of this type
by a point of A (coherent subset of |A|, infinite in general: we write A_fin for the
set of finite points).
To work in an effective manner with points of A, it is necessary to introduce
a notion of finite approximation. An approximant of a ∈ A is any subset a′ of a.
Condition (i) for coherence spaces ensures that approximants are still in A. Above
all, there are enough finite approximants to a:

• a is the union of its set I of finite approximants.

• The set I of finite approximants is directed. In other words,

i) I is nonempty (∅ ∈ I).

ii) If a′ , a′′ ∈ I, one can find a ∈ I such that a′ , a′′ ⊆ a (take a = a′ ∪ a′′ ).
This comes from the following idea:
• On the one hand we have the true (or total) objects of A. For example,
in Bool , the singletons {t} and {f }; in Int, {0}, {1}, {2}, etc.

• On the other hand, the approximants, of which, in the two simplistic cases
considered, ∅ is the only example. They are partial objects.
The addition of partial objects has much the same significance as in recursion
theory, where we shift from total to partial functions: for example, to the integers
(represented by singletons) we add the undefined, ∅.
One should not, however, attach too much importance to this first intuition.
For example, it is misguided to seek to identify the total points of an arbitrary
coherence space A. One might perhaps think that the total points of A are the
maximal points, i.e. such that:
   ∀α′ ∈ |A|  ((∀α ∈ a  α′ ⌣ α (mod A)) ⇒ α′ ∈ a)
which indeed they are in the simple cases (integers, booleans, etc.). However
we would like to define totality in coherence spaces which are the interpretations
of complex types, using formulae analogous to the ones for reducibility (see 6.1).
These are of greater and greater logical complexity, and altogether unpredictable,
whilst the notion of maximality remains desperately Π⁰₂ , so one cannot hope for
a coincidence. In fact, for any given coherence space there are many notions of
totality, just as there are many reducibility candidates (chapter 14) for the same
type. In fact the semantics partialises everything, and the total objects get a bit
lost inside it.
The functions from A to B will be seen as functions defined uniquely by
their approximants, and in this way continuous. Here it is possible to use a
topological language where the subsets {a′ : a ⊆ a′ } of A, for a finite, are open.
However whereas in Scott-style domain theory the functions between domains are
exactly those which are continuous for this topology, this will no longer be so
here.
8.3 Stable functions

A stable function F from A to B maps points of A to points of B in such a way
that it is monotone (a′ ⊆ a ⇒ F (a′ ) ⊆ F (a)), continuous:

   F (a) = ∪ {F (a0 ) : a0 ⊆ a, a0 finite}

and satisfies the stability condition

   (St)   if a1 ∪ a2 ∈ A then F (a1 ∩ a2 ) = F (a1 ) ∩ F (a2 )

in other words the pullback

        a1          a2
          ↖        ↗
           a1 ∩ a2
must be preserved. The intention is that this should hold for any set {a1 , a2 , . . .}
which is bounded above, not just finite ones, but in the context of strongly finite
approximation (i.e. the fact that the approximating elements have only finitely
many elements below them, which is not in general true in Scott's theory) we
don't need to say this.
Let us give an example to show that the hypothesis of coherence between a1
and a2 cannot be lifted. We want to be able to represent all functions from N to
N as stable functions from Int to Int, in particular f (0) = f (1) = 0, f (n + 2) = 1.
This forces F ({0}) = F ({1}) = {0}, F ({n + 2}) = {1}, and by monotonicity,
F (∅) = ∅. Now, F ({0} ∩ {1}) = F (∅) = ∅ ≠ F ({0}) ∩ F ({1}); we are saved by
the incoherence of 0 and 1, which makes {0} ∪ {1} ∉ Int.
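The role of coherence in this counterexample can be checked mechanically. A sketch in Python of the function F above on the flat web of Int, with points coded as Python sets (the encoding is ours):

```python
def F(a):
    """The stable function representing f(0) = f(1) = 0, f(n+2) = 1
    on flat Int: points are the empty set or singletons {n}."""
    if not a:
        return set()
    n = next(iter(a))
    return {0} if n in (0, 1) else {1}

# On the incoherent pair {0}, {1} the stability equation would fail:
print(F({0} & {1}))       # set(): F of the intersection is empty
print(F({0}) & F({1}))    # {0}:   the intersection of the images
# But {0} and {1} are incoherent, so {0} | {1} is not a point of Int
# and (St) does not apply: the mismatch is harmless.
```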
We shall see that this property forces the existence of a least approximant in
certain cases, simply by taking the intersection of a set which is bounded above.
8.3.1

• the constant functions, with value {n};

• the functions f̃ , amongst which are the constants f̃ (∅) = ∅, f̃ ({m}) = {n},
which only differ from the first by the value at ∅.
8.3.2 Parallel Or

Let us look for all the stable functions of two arguments from Bool , Bool to Bool
which represent the disjunction in the sense that F ({ε}, {ε′ }) = {ε ∨ ε′ } for every
substitution of t and f for ε and ε′ .
We must have F (a′ , b′ ) ⊆ F (a, b) when a′ ⊆ a and b′ ⊆ b. In particular,
if F (∅, ∅) = {t} (or {f }), then F takes constantly the value t (or f ),
which is impossible. Similarly we have F ({f }, ∅) = F (∅, {f }) = ∅ because
F ({f }, ∅) ⊆ F ({f }, {t}) = {t} and F ({f }, ∅) ⊆ F ({f }, {f }) = {f }.
F ({t}, ∅) = {t} is possible, but then F (∅, {t}) = ∅: indeed, if we write the
third condition for two arguments:

   a1 ∪ a2 ∈ Bool ∧ b1 ∪ b2 ∈ Bool ⇒ F (a1 ∩ a2 , b1 ∩ b2 ) = F (a1 , b1 ) ∩ F (a2 , b2 )

and apply it for a1 = {t}, a2 = ∅, b1 = ∅, b2 = {t}, then F (∅, {t}) = {t} would
give us F (∅, ∅) = {t}.
By symmetry, we have obtained two functions:

   F1 ({t}, ∅) = F1 ({t}, {t}) = F1 ({t}, {f }) = F1 ({f }, {t}) = {t}
   F1 ({f }, {f }) = {f }
   F1 (∅, ∅) = F1 ({f }, ∅) = F1 (∅, {t}) = F1 (∅, {f }) = ∅

and F2 (a, b) = F1 (b, a).
What have we got against this example? It violates a principle of least data:
we have F0 ({t}, {t}) = {t}; we seek to find a least approximant to the pair of
arguments {t}, {t} which already gives {t}; now we have at our disposal (∅, {t})
and ({t}, ∅), which are minimal ((∅, ∅) does not work) and distinct.
Of course, knowing that we always have a distinguished (least) solution (rather
than many minimal solutions) for a problem of this kind radically simplifies a lot
of calculations.
8.4 Direct product of two coherence spaces

For a stable function of two arguments we similarly require that pullbacks of
the form

           (a, b)
          ↗      ↖
   (a′, b)        (a, b′ )
          ↖      ↗
          (a′ , b′ )

(where a′ ⊆ a ∈ A and b′ ⊆ b ∈ B) be preserved.
We would like to avoid studying stable functions of two (or more) variables
and so reduce them to the unary case. For this we shall introduce the (direct)
product A N B of two coherence spaces. The notation comes from linear logic.
The web is |A N B| = ({1} × |A|) ∪ ({2} × |B|), with coherence given by:

   (1, α) ⌣ (1, α′ ) (mod A N B)   iff   α ⌣ α′ (mod A)
   (2, β) ⌣ (2, β′ ) (mod A N B)   iff   β ⌣ β′ (mod B)
   (1, α) ⌣ (2, β) (mod A N B)    for all α, β
8.5 The Function-Space
We started with the idea that type = coherence space. The previous section
defines a product of coherence spaces corresponding to the product of types, but
what do we do with the arrow? We would like to define A → B as the set of
stable functions from A to B, but this is not presented as a coherence space.
So we shall give a particular representation of the set of stable functions in such
a way as to make it a coherence space.
8.5.1 The trace of a stable function

Proof

i) Write a = ∪_{i∈I} ai , where the ai are the finite subsets of a. Then
F (a) = ∪_{i∈I} F (ai ), and if β ∈ F (a), β ∈ F (ai0 ) for some i0 ∈ I.

Then

   F (a) = {β : ∃a0 ⊆ a, (a0 , β) ∈ Tr(F )}

which results immediately from the lemma. In particular the function F ↦ Tr(F )
is 1–1.
Consider for example the stable function F₁ from Bool & Bool to Bool introduced
in 8.3.2. The elements of its trace Tr(F₁) are:

({(1, t)}, t)    ({(1, f), (2, t)}, t)    ({(1, f), (2, f)}, f)
8.5.2 Representation of the function space
Proposition  As F varies over the stable functions from A to B, their traces give
the points of a coherence space, written A → B.

Proof  Let us define the coherence space C by |C| = A_fin × |B| (A_fin is the set of
finite points of A) where (a₁, β₁) ⌣ (a₂, β₂) (mod C) if

i) a₁ ∪ a₂ ∈ A ⇒ β₁ ⌣ β₂ (mod B)
ii) a₁ ∪ a₂ ∈ A ∧ a₁ ≠ a₂ ⇒ β₁ ≠ β₂
In 12.3, we shall see a more symmetrical way of writing this.
If F is stable, then Tr(F) is a subset of |C| by construction. We verify the
coherence modulo C of (a₁, β₁) and (a₂, β₂) ∈ Tr(F):

i) If a₁ ∪ a₂ ∈ A then {β₁, β₂} ⊆ F(a₁ ∪ a₂), so β₁ ⌣ β₂ (mod B).
ii) If β₁ = β₂ and a₁ ∪ a₂ ∈ A, then the lemma applied to β₁ ∈ F(a₁ ∪ a₂) gives
us a₁ = a₂.
Conversely, let f be a point of C. We define a function from A to B by the
formula:

(App)    F(a) = {β : ∃a′ ⊆ a  (a′, β) ∈ f}
i) F is monotone: immediate.

ii) If a = ∪_{i∈I} a_i with I directed, then ∪_{i∈I} F(a_i) ⊆ F(a) by monotonicity.
Conversely, if β ∈ F(a), this means there is an a₀ finite, a₀ ⊆ a, such that
β ∈ F(a₀); but since a₀ ⊆ ∪_{i∈I} a_i, we have a₀ ⊆ a_k for some k (that is why I
was chosen directed!) so β ∈ F(a_k) and the converse inclusion is established.

iii) If a₁ ∪ a₂ ∈ A, then F(a₁ ∩ a₂) ⊆ F(a₁) ∩ F(a₂) by monotonicity. Conversely,
if β ∈ F(a₁) ∩ F(a₂), this means that (a′₁, β), (a′₂, β) ∈ f for some
appropriate a′₁ ⊆ a₁ and a′₂ ⊆ a₂. But (a′₁, β) and (a′₂, β) are coherent
and a′₁ ∪ a′₂ ⊆ a₁ ∪ a₂ ∈ A, so a′₁ = a′₂, a′₁ ⊆ a₁ ∩ a₂ and β ∈ F(a₁ ∩ a₂).
iv) We nearly forgot to show that F maps A into B: F(a), for a ∈ A, is a
subset of |B|, of which it is again necessary to show coherence! Now, if
β′, β″ ∈ F(a), this means that (a′, β′), (a″, β″) ∈ f for appropriate a′, a″ ⊆ a;
but then a′ ∪ a″ ⊆ a ∈ A, so, as (a′, β′) and (a″, β″) are coherent, β′ ⌣ β″
(mod B).
Finally, it is easy to check that these constructions are mutually inverse.
In fact, the same application formula occurs in Scott's domain theory [Scott76],
but the corresponding notion of trace is more complicated.
8.5.3 The Berry order

Proposition  F ≤_B G iff Tr(F) ⊆ Tr(G), where ≤_B is the Berry order: F ≤_B G
iff for all a′ ⊆ a ∈ A we have F(a′) = F(a) ∩ G(a′).

Proof  If F ≤_B G then F(a) ⊆ G(a) for all a (take a′ = a). Let (a, β) ∈ Tr(F);
then β ∈ F(a) ⊆ G(a). We seek to show that (a, β) ∈ Tr(G). Let a′ ⊆ a be such that
β ∈ G(a′); then β ∈ F(a) ∩ G(a′) = F(a′), which forces a′ = a.

Conversely, if Tr(F) ⊆ Tr(G), it is easy to see that F(a) ⊆ G(a) for all a. In
particular if a′ ⊆ a, then F(a′) ⊆ F(a) ∩ G(a′). Now, if β ∈ F(a) ∩ G(a′), one can
find a₀ ⊆ a, a₀′ ⊆ a′ such that

(a₀, β) ∈ Tr(F) ⊆ Tr(G) ∋ (a₀′, β)

so (a₀, β) and (a₀′, β) are coherent, and since a₀ ∪ a₀′ ⊆ a ∈ A, we have a₀ = a₀′,
and β ∈ F(a₀′) = F(a₀) ⊆ F(a′).
As an example, it is easy to see (using one of the characterisations of ≤_B) that
F₃ ≰_B F₁ (see 8.3.2) although F₃(a, b) ⊆ F₁(a, b) for all a, b ∈ Bool. The reader is
also invited to show that the identity is maximal.
The Berry order says that evaluation preserves the pullback (cf. the one in
section 8.4)

         (G, a)
        /      \
   (G, a')    (F, a)
        \      /
        (F, a')

for a′ ⊆ a and F ≤_B G in (A → B) & A, so this is exactly the order relation we need on A → B
to make evaluation stable.
8.5.4 Partial functions
Let us see how this construction works by calculating Int → Int. We have
Int_fin ≃ N ∪ {∅} and |Int| = N, so |Int → Int| ≃ (N ∪ {∅}) × N where

i) (n, m) ⌣ (n′, m′)  if  n = n′ ⇒ m = m′
ii) (∅, m) ⌣ (∅, m)

with incoherence otherwise. This is the direct sum (see section 12.1) of the
coherence space which represents partial functions with the space which represents
the constants by vocation. Let us ignore the latter part and concentrate on the
space PF defined on the web N × N by condition (i).
What is the order relation on PF? Well, f ∈ PF is a set of pairs (n, m)
such that if (n, m), (n, m′) ∈ f then m = m′, which is just the usual graph
representation of a partial function. Since the Berry order corresponds simply to
containment, it is the usual extension order on partial functions.
In the Berry order, the partial functions f̃ and the constants by vocation
ñ are incomparable. However pointwise we have f̃ < 0̃ for any partial function f
which takes no value other than zero, and there are infinitely many such f. One
advantage of our semantics is that it avoids this phenomenon of compact⁵ objects
with infinitely many objects below them.
Another consequence of the Berry order arises at an even simpler type: in
the function-space Sgl Sgl , where Sgl is the coherence space with just one
token (section 12.6). In the pointwise (Scott) order, the identity function is below
the constant by vocation {•}, whilst in the Berry order they are incomparable.
This means that in the stable semantics, unlike the Scott semantics, it is possible
for a test program to succeed on the identity (which reads its input) but fail on
the constant (which does not).
⁵ The notion of compactness in topology is purely order-theoretic: if a ⊆ ∪I for some
directed set I then a ⊆ b for some b ∈ I. Besides Scott's domain theory, this also arises in
ring theory as Noetherianness and in universal algebra as finite presentability.
Chapter 9
Denotational Semantics of T
The constructions of chapter 8 provide a nice denotational semantics of the
systems we have already considered.
9.1 Simple typed calculus

We propose here to interpret the simple typed calculus, based on → and ×. The
essential idea is that:

• λ-abstraction turns a function (x ↦ t[x]) into an object;
• application associates to an object t of type U → V a function u ↦ t u.

In other words, application and λ-abstraction are two mutually inverse
operations which identify objects of type U → V and functions from U to V.
So we shall interpret them as follows:

• λ-abstraction by the operation which maps a stable function from A to B
to its trace, a point of A → B;
• application by the operation which maps a point of A → B to the function
of which it is the trace.
9.1.1 Types

Suppose we have fixed for each atomic type Si a coherence space [[Si]]; then we
define [[T]] for each type T by:

[[U × V]] = [[U]] & [[V]]        [[U → V]] = [[U]] → [[V]]
9.1.2 Terms

The second projection, for instance, is interpreted by

π²(c) = {β : (2, β) ∈ c}
As an exercise, one can calculate the traces of Pair, π¹, π², App and the
function in 4 which takes F to G.
9.2 Properties of the interpretation

App(Tr(F), a) = F(a)
Categorically, what we have shown is that & and → are the product and
exponential for a cartesian closed category whose objects are coherence spaces
and whose morphisms are stable maps. However, we have forgotten one thing:
composition! But it is easy to show that the trace of G ∘ F is

{(a₁ ∪ ... ∪ a_k, γ) : ({β₁, ..., β_k}, γ) ∈ Tr(G), (a₁, β₁), ..., (a_k, β_k) ∈ Tr(F)}

where F and G are stable functions from A to B and from B to C respectively.
9.3 Gödel's system T
9.3.1 Booleans

[[T]] def= T = {t}        [[F]] def= F = {f}

and D is interpreted by the stable function given by

D(a, b, {t}) = a
D(a, b, {f}) = b
t hx, Ti
t hF, Fi
[[t]](, T ) = T
[[t]](F, F) = F
9.3.2 Integers

The obvious idea for interpreting Int is the coherence space Int introduced in the
previous chapter:

[[O]] def= O = {0}
R u v (S x)
F (S()) = {f }
p⁺ ⌣ q (mod Int⁺)  iff  p < q
p⁺ ⌣ q⁺ (mod Int⁺)  for all p, q
To see how it all works out, let us look for the maximal points. If a ∈ Int⁺ is
maximal, either:

• some p ∈ a; then a contains no other q, nor does it contain any q⁺ with
p ≤ q. So a ⊆ p̃ def= {0⁺, ..., (p−1)⁺, p}; but this set is coherent, and as a is
maximal it must be equal to p̃.

• a contains no p; then a ⊆ ∞̃ def= {0⁺, 1⁺, 2⁺, ...}, which is coherent, so a is
equal to this infinite set.
The interpretation is as follows:

O = {0}
G(S(a)) = F(G(a), a)
G(a) = ∅ if 0, 0⁺ ∉ a
These lazy natural numbers are rather more complicated than the usual ones, which do
not form a coherence space but a dI-domain (section 12.2.1). The difference is that we admit
the token 1+ in the absence of 0+ , although it is difficult to see what this might mean.
p̄ = {0⁺, 1⁺, ..., (p−1)⁺} = S^p(∅)
then G(a0 ) = G(a) (assuming F has this property), so (by induction) no term of
T involves p or p⁺ in its semantics without {0⁺, ..., (p−1)⁺} as well.
As an exercise, one can try to calculate directly a stable function from Int⁺
to Int⁺ which represents the predecessor.
9.3.3
v (It u v )
Chapter 10
Sums in Natural Deduction
This chapter gives a brief description of those parts of natural deduction whose
behaviour is not so pretty, although they show precisely the features which are
most typical of intuitionism. For this fragment, our syntactic methods are frankly
inadequate, and only a complete recasting of the ideas could allow us to progress.
In terms of syntax, there are three connectors to put back: ∨, ∃ and ⊥. For ¬,
it is common to add a symbol ⊥ (absurdity) and interpret ¬A as A ⇒ ⊥.
The rules are:

     A                B                A ∨ B   [A]···C   [B]···C
  ------- ∨1I      ------- ∨2I      ---------------------------- ∨E
   A ∨ B            A ∨ B                        C

   A[a/ξ]            ∃ξ. A   [A]···C            ⊥
  -------- ∃I       ----------------- ∃E      ----- ⊥E
   ∃ξ. A                   C                    C

The variable ξ must no longer be free in the hypotheses or the conclusion after
use of the rule ∃E. There is, of course, no rule ⊥I.
10.1 Defects of the system

The introduction rules (two for ∨, none for ⊥ and one for ∃) are excellent!
Moreover, if you mentally turn them upside-down, you will find the same structure
as ∧1E, ∧2E, ∀E (in linear logic, there is only one rule in each case, since they
are actually turned over).
The elimination rules are very bad. What is catastrophic about them is the
parasitic presence of a formula C which has no structural link with the formula
which is eliminated. C plays the role of a context, and the writing of these rules
is a concession to sequent calculus.
In fact, the adoption of these rules (and let us repeat that there is currently
no alternative) contradicts the idea that natural deductions are the real objects
behind the proofs. Indeed, we cannot decently work with the full fragment
without identifying a priori different deductions, for example:
  A ∨ B   [A]···C   [B]···C           A ∨ B   [A]···C···D   [B]···C···D
  ------------------------- ∨E   and  --------------------------------- ∨E
            C                                        D
           --- r
            D

One might instead imagine writing the elimination as a rule
with two conclusions. Later, these two conclusions would have to be brought back
together into one. But we have no way of bringing them back together, apart
from writing ∨E as we did, which forces us to choose the moment of reunification.
The commutation rules express the fact that this moment can fundamentally be
postponed.
10.2
Standard conversions
The standard conversions are:

     A
  ------- ∨1I
   A ∨ B   [A]···C   [B]···C                    A
  -------------------------- ∨E   converts to   ···   (the deduction of A replacing the hypothesis [A])
             C                                  C

     B
  ------- ∨2I
   A ∨ B   [A]···C   [B]···C                    B
  -------------------------- ∨E   converts to   ···   (the deduction of B replacing the hypothesis [B])
             C                                  C

   A[a/ξ]
  -------- ∃I
   ∃ξ. A   [A]···C                            A[a/ξ]
  ----------------- ∃E            converts to   ···   (with a substituted for ξ throughout)
          C                                     C
10.3 The need for extra conversions

10.3.1 Subformula Property
A ∨ B: it is not possible that the proof above the principal premise ends in an
introduction, so it ends in an elimination and has a principal branch, which
can be extended to a principal branch of δ.
10.3.2 Extension to the full fragment
For the full calculus, we come against an enormous difficulty: it is no longer true
that the conclusion of an elimination is a subformula of its principal premise: the
C of the three elimination rules has nothing to do with the eliminated formula.
So we are led to restricting the notion of principal branch to those eliminations
which are well-behaved (∧1E, ∧2E, ⇒E and ∀E) and we can try to extend our
theorem. Of course it will be necessary to restrict part (ii) to the case where
δ ends in a good elimination.
The theorem is proved as before in the case of introductions, but the case of
eliminations is more complex:
If δ ends in a good elimination, look at its principal premise A: we shall
be embarrassed in the case where A is the conclusion of a bad elimination.
Otherwise we conclude the existence of a principal branch.
If δ ends in a bad elimination, look again at its principal premise A: it is not
the conclusion of an introduction. If A is a hypothesis or the conclusion of
a good elimination, it is a subformula of a hypothesis, and the result follows
easily. There still remains the case where A comes from a bad elimination.
To sum up, it would be necessary to get rid of configurations formed from
a succession of two rules: a bad elimination of which the conclusion C is the
principal premise of an elimination, good or bad. Once we have done this, we can
recover the Subformula Property. A quick calculation shows that the number of
configurations is 3 × 7 = 21 and there is no question of considering them one by
one. In any case, the removal of these configurations is certainly necessary, as the
following example shows:
            [A]   [A]         [A]   [A]
            --------- ∧I      --------- ∧I
   A ∨ A      A ∧ A             A ∧ A
  --------------------------------------- ∨E
                  A ∧ A
                 ------- ∧1E
                    A
10.4
Commuting conversions
In what follows,

    C ···
   ------- r
      D

denotes an elimination r of the principal premise C; the conclusion is D and the
ellipsis represents some possible secondary premises with the corresponding
deductions. This symbolic notation covers the seven cases of elimination.
1. commutation of ⊥E:

    ⊥
   --- ⊥E
    C ···                     ⊥
   ------- r   converts to   --- ⊥E
      D                       D
2. commutation of ∨E:

   A ∨ B   [A]···C   [B]···C
  -------------------------- ∨E
             C ···
            ------- r
               D

converts to

   A ∨ B   [A]···C···D   [B]···C···D
  ---------------------------------- ∨E
                  D
3. commutation of ∃E:

   ∃ξ. A   [A]···C
  ----------------- ∃E
         C ···
        ------- r
           D

converts to

   ∃ξ. A   [A]···C···D
  --------------------- ∃E
            D
For example, when r is itself an ∨E with principal premise C ∨ D:

   A ∨ B   [A]···C∨D   [B]···C∨D
  ------------------------------ ∨E
              C ∨ D                [C]···E   [D]···E
  -------------------------------------------------- ∨E
                        E

converts to

           [A]···C∨D   [C]···E   [D]···E        [B]···C∨D   [C]···E   [D]···E
           ----------------------------- ∨E     ----------------------------- ∨E
   A ∨ B                E                                    E
  --------------------------------------------------------------------------- ∨E
                                       E
10.5
Properties of conversion
First of all, the normal form, if it exists, is unique: that follows again from a
Church-Rosser property. The result remains true in this case, since the conflicts
of the kind
     A
  ------- ∨1I
   A ∨ B   [A]···C   [B]···C
  -------------------------- ∨E
             C ···
            ------- r
               D

converting standardly to

   A
   ···
   C ···
  ------- r
     D

and by commutation to

     A
  ------- ∨1I
   A ∨ B   [A]···C···D   [B]···C···D
  ---------------------------------- ∨E
                  D

are easily resolved, because the second deduction converts to the first.
It is possible to extend the results obtained for the (∧, ⇒, ∀) fragment to the
full calculus, at the price of boring complications. [Prawitz] gives all the technical
details for doing this. The abstract properties of reducibility for this case are
in [Gir72], and there are no real problems when we extend this to existential
quantification over types.
Having said this, we shall give no proof, because the theoretical interest is
limited. One tends to think that natural deduction should be modified to correct
such atrocities: if a connector has such bad rules, one ignores it (a very common
attitude) or one tries to change the very spirit of natural deduction in order to
be able to integrate it harmoniously with the others. It does not seem that the
(∨, ∃, ⊥) fragment of the calculus is etched on tablets of stone.
Moreover, the extensions are long and difficult, and for all that you will not
learn anything new apart from technical variations on reducibility. So it will suffice
to know that the strong normalisation theorem also holds in this case. In the
unlikely event that you want to see the proof, you may consult the references
above.
10.6 The associated functional calculus

10.6.1 Empty type
Emp is considered to be the empty type. For this reason, there will be a canonical
function εU from Emp to any type U: if t is of type Emp, then εU t is of type U.
The commutation for εU is set out in five cases:

π¹ (εU×V t) ↝ εU t
π² (εU×V t) ↝ εV t
(εU→V t) u ↝ εV t
εU (εEmp t) ↝ εU t
δ x. u y. v (εR+S t) ↝ εU t
10.6.2
Sum type
First the standard conversions:

δ x. u y. v (ι1 r) ↝ u[r/x]
δ x. u y. v (ι2 s) ↝ v[s/y]

Then the commuting conversions, according to the form of U:

U = V × W:
π¹ (δ x. u y. v t) ↝ δ x. (π¹ u) y. (π¹ v) t
π² (δ x. u y. v t) ↝ δ x. (π² u) y. (π² v) t

U = V → W:
(δ x. u y. v t) w ↝ δ x. (u w) y. (v w) t

U = Emp:
εW (δ x. u y. v t) ↝ δ x. (εW u) y. (εW v) t

U = V + W:
δ x′. u′ y′. v′ (δ x. u y. v t) ↝ δ x. (δ x′. u′ y′. v′ u) y. (δ x′. u′ y′. v′ v) t
10.6.3
Additional conversions
By analogy with ⟨π¹ t, π² t⟩ ≃ t and λx. t x ≃ t, one may also consider the
conversion

δ x. (ι1 x) y. (ι2 y) t ≃ t

Clearly the terms on both sides of the ≃ are denotationally equal. However
the direction in which the conversion should work is not very clear: the opposite
one is in fact much more natural.
Chapter 11
System F
System F [Gir71] arises as an extension of the simple typed calculus, obtained by
adding an operation of abstraction on types. This operation is extremely powerful
and in particular all the usual data-types (integers, lists, etc.) are definable.
The system was introduced in the context of proof theory [Gir71], but it was
independently discovered in computer science [Reynolds].
The most primitive version of the system is set out here: it is based on
implication and universal quantification. We shall content ourselves with defining
the system and giving some illustrations of its expressive power.
11.1
The calculus
(ΛX. v) U ↝ v[U/X]
11.2
Comments
11.3 Representation of simple types
11.3.1 Booleans

Bool def= ΠX. X → X → X

T def= ΛX. λx^X. λy^X. x
F def= ΛX. λx^X. λy^X. y

and if u, v are of type U, D u v t def= t U u v.
11.3.2 Product of types

U × V def= ΠX. (U → V → X) → X

⟨u, v⟩ def= ΛX. λx^{U→V→X}. x u v

The projections are defined as follows:

π¹ t def= t U (λx^U. λy^V. x)
π² t def= t V (λx^U. λy^V. y)
Note that ⟨π¹ t, π² t⟩ ≠ t and ΛX. t X ≠ t.

11.3.3 Empty type

Emp def= ΠX. X

εU t def= t U
11.3.4
Sum type
def
1 u = X. xU X . y V X . x u
def
2 v = X. xU X . y V X . y v
x. u y. v t = t U (xU . u) (y V . v)
Let us calculate x. u y. v (1 r):
x. u y. v (1 r) = (X. xRX . y SX . x r) U (xR . u) (y S . v)
(xRU . y SU . x r) (xR . u) (y S . v)
(y SU . (xR . u) r) (y S . v)
(xR . u) r
u[r/x]
and similarly x. u y. v (2 s)
v[s/y].
On the other hand, the translation does not interpret the commuting or
secondary conversions associated with the sum type; the same remark applies to
the type Emp and also to the type Bool which has a sum structure and for which
it is possible to write commutation rules.
11.3.5 Existential type

ΣX. V def= ΠY. (ΠX. (V → Y)) → Y

If U is a type and v a term of type V[U/X], then we define ⟨U, v⟩ of type
ΣX. V by

⟨U, v⟩ def= ΛY. λx^{ΠX. V→Y}. x U v

Corresponding to the introduction of Σ, there is an elimination: if w is of
type W and t of type ΣX. V, X is a type variable, x a variable of type V and
the only free occurrences of X in the type of a free variable of w are in the type
of x, one can form ∇X. x. w t of type W (the occurrences of X and x in w are
bound by this construction):

∇X. x. w t def= t W (ΛX. λx^V. w)

Let us calculate (∇X. x. w) ⟨U, v⟩:

(∇X. x. w) ⟨U, v⟩ = (ΛY. λx^{ΠX. V→Y}. x U v) W (ΛX. λx^V. w)
  ↝ (λx^{ΠX. V→W}. x U v) (ΛX. λx^V. w)
  ↝ (ΛX. λx^V. w) U v
  ↝ (λx^{V[U/X]}. w[U/X]) v
  ↝ w[U/X][v/x]
This gives a conversion rule which was for example in the original version of the
system.
11.4 Representation of a free structure
We have translated some simple types; we shall continue with some inductive
types: integers, trees, lists, etc. Undoubtedly the possibilities are endless and we
shall give the general solution to this kind of question before specialising to more
concrete situations.
11.4.1
Free structure
11.4.2 Representation of the constructors
We have to find an object fi for each type Si[T/X]. In other words, we are
looking for a function fi which takes ki arguments of types T_j^i[T/X] and returns
a value of type T.

Let x1, ..., x_{ki} be the arguments of fi. As X occurs positively in T_j^i, the
canonical function h of type T → X defined by

h x = x X y1^{S1} ... yn^{Sn}

induces a function T_j^i[h] from T_j^i[T/X] to T_j^i depending on X, y1, ..., yn. This
function could be defined formally, but we shall see it much better with examples.
Finally we put tj = T_j^i[h] xj for j = 1, ..., ki and we define

fi x1 ... x_{ki} def= ΛX. λy1^{S1}. ... λyn^{Sn}. yi t1 ... t_{ki}
11.4.3
Induction
The question of knowing whether the only objects of type T which one can
construct are indeed those generated from the fi is hard; the answer is yes,
almost! We shall come back to this in 15.1.1.
A preliminary indication of this fact is the possibility of defining a function
by induction on the construction of . We start off with a type U and functions
g1 , . . . , gn of types Si [U/X] (i = 1, . . . , n). We would like to define a function h of
type T U satisfying:
h (fi x1 . . . xki ) = gi u1 . . . uki
11.5 Representation of inductive types
All the definitions given in 11.3 (except the existential type) are particular cases
of what we describe in 11.4: they do not come out of a hat.
1. The boolean type has two constants, which will then give f1 and f2 of type
boolean: so S1 = S2 = X and Bool = ΠX. X → X → X. It is easy to show
that T and F are indeed the 0-ary functions defined in 11.4 and that the
induction operation is nothing other than D.

2. The product type has a function f1 of two arguments, one of type U and
one of type V. So we have S1 = U → V → X, which explains the translation.
The pairing function fits in well with the general case of 11.4, but the two
projections go outside this treatment: they are in fact easier to handle
than the indirect scheme resulting from a mechanical application of 11.4.

3. The sum type has two functions (the canonical injections), so S1 = U → X
and S2 = V → X. The interpretation of 11.3.4 matches faithfully the general
scheme.
4. The empty type has nothing, so n = 0; εU is the induction operator.
11.5.1 Integers

The integer type has two functions: O of type integer and S from integers to
integers, which gives S1 = X and S2 = X → X, so

Int def= ΠX. X → (X → X) → X

In the type Int, the integer n will be represented by

n̄ def= ΛX. λx^X. λy^{X→X}. y (y (y ... (y x) ...))    (n occurrences of y)
O def= ΛX. λx^X. λy^{X→X}. x
S t def= ΛX. λx^X. λy^{X→X}. y (t X x y)

and it is easy to see that S n̄ normalises to the numeral for n+1.
As to the induction operator, it is in fact the iterator It, which takes an object
u of type U, a function f of type U → U and returns a result of type U:

It u f t def= t U u f

It u f O = (ΛX. λx^X. λy^{X→X}. x) U u f
  ↝ (λx^U. λy^{U→U}. x) u f
  ↝ (λy^{U→U}. u) f
  ↝ u

It u f (S t) = (ΛX. λx^X. λy^{X→X}. y (t X x y)) U u f
  ↝ (λx^U. λy^{U→U}. y (t U x y)) u f
  ↝ (λy^{U→U}. y (t U u y)) f
  ↝ f (t U u f)
  = f (It u f t)
It is not true that It u f n+1‾ converts to f (It u f n̄); but both terms reduce to

f (f (f ... (f u) ...))    (n+1 occurrences of f)
R u f n+1‾ ↝ f (R u f n̄) n̄
The second equation for recursion is satisfied by values only, i.e. for each n
separately. We make no secret of the fact that this is a defect of system F.
Indeed, if we program the predecessor function
pred O = O
pred (S x) = x
the second equation will only be satisfied for x of the form n̄, which means
that the program decomposes the argument x completely into S S S . . . S O, then
reconstructs it leaving out the last symbol S. Of course it would be more
economical to remove the first instead!
11.5.2
Lists
U being a type, we want to form the type List U , whose objects are finite sequences
(u1 , . . . , un ) of type U . We have two functions:
• the sequence () of type List U, and hence S1 = X;
• the function which maps an object u of type U and a sequence (u1, ..., un)
to (u, u1, ..., un), so S2 = U → X → X.
Mechanically applying the general scheme, we get

List U def= ΠX. X → (U → X → X) → X

nil def= ΛX. λx^X. λy^{U→X→X}. x
cons u t def= ΛX. λx^X. λy^{U→X→X}. y u (t X x y)

So the sequence (u1, ..., un) is represented by

ΛX. λx^X. λy^{U→X→X}. y u1 (y u2 ... (y un x) ...)

which we recognise, replacing y by cons and x by nil, as

cons u1 (cons u2 ... (cons un nil) ...)

This last term could be obtained by reducing (u1, ..., un) (List U) nil cons.
The iterator takes w of type W and f of type U → W → W:

It w f t def= t W w f

which satisfies

It w f nil ↝ w
It w f (cons u t) ↝ f u (It w f t)
Examples: It nil cons t reduces to the representation of the same sequence;
mapping g over a list gives

map g (u1, ..., un) = (g u1, ..., g un)

and one can define tail, with

tail nil = nil        tail (cons u t) = t

where the second equation is only satisfied for t of the form (u1, ..., un).
As an exercise, define by iteration:
concatenation: (u1 , . . . , un ) @ (v1 , . . . , vm ) = (u1 , . . . , un , v1 , . . . , vm )
reversal : reverse (u1 , . . . , un ) = (un , . . . , u1 )
List U depends on U , but the definition we have given is in fact uniform in it,
so we can define
Nil def= ΛX. nil[X]   of type ΠX. List X
Cons def= ΛX. cons[X]  of type ΠX. X → List X → List X
11.5.3
Binary trees
We are interested in finite binary trees. For this, we have two functions:
the tree consisting only of its root, so S1 = X;
• the construction of a tree from two trees, so S2 = X → X → X.

Bintree def= ΠX. X → (X → X → X) → X

nil def= ΛX. λx^X. λy^{X→X→X}. x
couple u v def= ΛX. λx^X. λy^{X→X→X}. y (u X x y) (v X x y)

The iterator It w f t def= t W w f satisfies

It w f nil ↝ w
It w f (couple u v) ↝ f (It w f u) (It w f v)

11.5.4 Trees of branching type U
Tree U def= ΠX. X → ((U → X) → X) → X

nil def= ΛX. λx^X. λy^{(U→X)→X}. x
collect f def= ΛX. λx^X. λy^{(U→X)→X}. y (λz^U. f z X x y)

and the iterator satisfies

It w h nil ↝ w
It w h (collect f) ↝ h (λx^U. It w h (f x))
Notice that Bintree could be treated as the type of trees with boolean branching
type, without substantial alteration.
Just as we can abstract on U in List U , the same thing is possible with trees.
This potential for abstraction shows up the modularity of F very well: for example,
one can define the module Collect = ΛX. collect[X], which can subsequently be
used by specifying the type X. Of course, we see the value of this in more
complicated cases: we only write the program once, but it can be applied (plugged
into other modules) in a great variety of situations.
11.6 The Curry-Howard isomorphism
The types in F are nothing other than propositions quantified at the second order,
and the isomorphism we have already established for the arrow extends to these
quantifiers:
     A                  ∀X. A
  -------- ∀2I        --------- ∀2E
   ∀X. A               A[B/X]

and

     A
  -------- ∀2I
   ∀X. A
  --------- ∀2E      converts to      A[B/X]
   A[B/X]
Chapter 12
Coherence Semantics of the Sum
Here we consider the denotational semantics of Emp and + (corresponding to
and ) introduced in chapter 10.
Emp is naturally interpreted as the coherence space Emp whose web is empty,
and the interpretation of εU follows immediately¹.
The sum, on the other hand, poses some delicate problems. When A and B
are two coherence spaces, there is just one obvious notion of sum, namely the
direct sum introduced below. Unfortunately, the scheme is not interpreted. This
objection also holds for other kinds of semantics, for example Scott domains.
After examining and rejecting a certain number of fudged alternatives, we
are led back to the original solution, which would work with linear functions
(i.e. preserving unions), and we arrive at a representation of the sum type as:

!A ⊕ !B

It is this decomposition which is the origin of linear logic: the operations ⊕ (direct
sum) and ! (linearisation) are in fact logical operations in their own right.
The reader familiar with category theory should notice that Emp is not an initial object.
This is to be expected in any reasonable category of domains, because there can be no initial
object in a non-degenerate Cartesian closed category where every object is inhabited (as it
will be if there are fixpoints). With linear logic, the problem vanishes because we do not
require a Cartesian closed category.
12.1 Direct sum
The problem with sum types arises from the impossibility of defining the
interpretation by means of the direct sum:
|A ⊕ B| = |A| + |B| = ({1} × |A|) ∪ ({2} × |B|)

(1, α) ⌣ (1, α′) (mod A ⊕ B)  iff  α ⌣ α′ (mod A)
(2, β) ⌣ (2, β′) (mod A ⊕ B)  iff  β ⌣ β′ (mod B)
Every object of the coherence space A ⊕ B can be written Inj₁(a) for some a ∈ A
or Inj₂(b) for some b ∈ B. This expression is unique, except in the case of the
empty set: ∅ = Inj₁(∅) = Inj₂(∅). This non-uniqueness of the decomposition makes
it impossible to define a function casewise

H(Inj₁(a)) = F(a)
H(Inj₂(b)) = G(b)

from two stable functions F from A to C and G from B to C. Indeed this fails
for the argument ∅, since F(∅) has no reason to be equal to G(∅).
12.2
Lifted sum
In other words, in order to know whether α ∈ H(c), we look inside c for a tag
1 or 2, then if we find one (say 1), we write c = q1(a) and ask whether α ∈ F(a).
This solution interprets the standard conversion schemes:

δ x. u y. v (ι1 r) ↝ u[r/x]
δ x. u y. v (ι2 s) ↝ v[s/y]

However, the function H interpreting δ x. (ι1 x) y. (ι2 y) does not always satisfy
H(c) = c. In fact this equation is satisfied only for c of the form q1(a), q2(b) or ∅.
On the other hand, the commuting conversions do hold: let t ↦ E t be an
elimination of the form π¹ t, or π² t, or t w, or εU t, or δ x′. u′ y′. v′ t. We want
to check that E (δ x. u y. v t) and δ x. (E u) y. (E v) t have the same interpretation.
In the case where (semantically) t is q1 a, the two expressions give [[E u]](a). In
the case where c ∩ {1, 2} = ∅, we get on the one hand E(∅), where E is the
stable function corresponding to E, and on the other ∅; but it is easy to see that
E(∅) = ∅ (E is strict) in all the cases in question.
Having said this, the presence of an equation (however minor) which is not
interpreted means we must reject the semantics. Even if we are unsure how to
use it, the equation
δ x. (ι1 x) y. (ι2 y) t = t
plays a part in the implicit symmetries of the disjunction. Once again, we are not
looking for a model at any price, but for a convincing one. For that, even the
secondary connectors (such as ∨) and the marginal equations are precious, because
they show up some points of discord between syntax and semantics. By trying to
analyse this discord, one can hope to find some properties hidden in the syntax.
12.2.1 dI-domains
where Sgl is the coherence space with just one token (section 12.6). This may be
used as an alternative way of defining inductive data types.
The damage caused by this interpretation is limited, because one can require
that for all α ∈ |A|, the set of α′ < α be finite, which ensures that the down-closure
of a finite set is always finite, and so we are saved from one of our objections to
Scott domains.
Semantically, there is nothing else to quarrel with about this interpretation,
which accounts for all reasonable constructions. But on the other hand, it forces
us to leave the class of coherence spaces, and uses an order relation which
compromises the conceptual simplicity of the system.
This leads us to look for something else, which does preserve this class. The
price will be a more complicated interpretation of the sum (although we are
basically only interested in the sum as a test for our semantic ideas) but we shall
be rewarded with a novel idea: linearity.
The interpretation we shall give is manifestly not associative. It is interesting
to remark that Winskel's interpretation is not either: indeed, if A, B, C are
coherence spaces considered as event structures (with a trivial order relation) then
(A q B) q C and A q (B q C) are not the same:
(1, 1)   (1, 2)        (2, •)
    \      /
      •

      (A q B) q C

(1, •)        (2, 1)   (2, 2)
                  \      /
                    •

      A q (B q C)

12.3 Linearity
Let us work out Tr(E): we have to find all the β ∈ E(f) with f minimal.
Now β ∈ E(f) = f(a) iff there exists some a′ ⊆ a such that (a′, β) ∈ f. So the
minimal f are the singletons {(a′, β)} with a′ ⊆ a, a′ finite, and the objects of
Tr(E) are of the form

({(a′, β)}, β)
12.3.1 Characterisation in terms of preservation
These properties characterise the stable functions which are linear; indeed,
if β ∈ F(a) with a minimal, a must be a singleton:

i) F(∅) = ∅, so a ≠ ∅.
ii) if a = a′ ∪ a″, then F(a) = F(a′) ∪ F(a″), so β ∈ F(a′) or β ∈ F(a″); so, if a
is not a singleton, we can find a decomposition a = a′ ∪ a″ which contradicts
the minimality of a.

Properties (i) and (ii) combine with preservation of directed unions into (Lin):

if 𝒜 is a set of points of A such that a₁ ∪ a₂ ∈ A for all a₁, a₂ ∈ 𝒜,
then F(∪𝒜) = ∪{F(a) : a ∈ 𝒜}

Observe that this condition is in the spirit of coherence spaces, which must be
closed under pairwise-bounded unions. So we can define linear stable functions
from A to B by (Lin) and (St):

if a₁ ∪ a₂ ∈ A then F(a₁ ∩ a₂) = F(a₁) ∩ F(a₂)

the monotonicity of F being a consequence of (Lin).
12.3.2 Linear implication

Linear negation is defined by |A⊥| = |A|, with coherence and incoherence
exchanged:

α ⌣ α′ (mod A⊥)  iff  α ⌢ α′ (mod A)

In other words, (α, β) ⌣ (α′, β′) (mod A ⊸ B) iff (β, α) ⌣ (β′, α′) (mod B⊥ ⊸ A⊥).
What is the meaning of this? A stable function takes an input of A and
returns an output of B. When the function is linear, this process can be seen
dually as returning an input of A (i.e. an output of A⊥) from an output of B
(i.e. an input of B⊥). So the linear implication introduces a symmetrical form of
functional dependence, the duality of rôles of the argument and the result being
expressed by the linear negation A ↦ A⊥. This is analogous to transposition (not
inversion) in linear algebra.
To make this relevant, we have to show that linearity is not an exceptional
phenomenon, and we shall be able to symmetrise the functional situations.
12.4 Linearisation

The coherence space !A has as web the set of finite points of A, with

a₁ ⌣ a₂ (mod !A)  iff  a₁ ∪ a₂ ∈ A
Delin(G)(a) = G(!a)
It is easy to see that Lin and Delin are mutually inverse operations2 , and in
particular the equation Lin(F )(!a) = F (a) characterises Lin(F ).
We can now see very well how the reversibility works for ordinary implication:
A → B = !A ⊸ B ≃ B⊥ ⊸ (!A)⊥ = B⊥ ⊸ ?(A⊥)

where ?C def= (!(C⊥))⊥
In other words the (non-linear) implication is reversible, but this requires some
complicated constructions which have no connection with the functional intuition
we started off with.
All this is side-tracking us, towards linear logic, and we shall stick to concluding
the interpretation of the sum.
2
Categorically, this says that ! is the left adjoint to the forgetful functor from coherence
spaces and linear maps to coherence spaces and stable maps.
12.5 Linearised sum

We put

q1 a = {1} × !a
q2 b = {2} × !b

and define H casewise by

H({1} × A) = Lin(F)(A)
H({2} × B) = Lin(G)(B)

without conflict at ∅, since Lin(F) and Lin(G) are linear and so H(∅) = ∅.
The interpretation is not particularly economical but it has the merit of
making use of the direct sum, and not any less intelligible considerations. Above
all, it suggests a decomposition of the sum which shows up the more primitive
operations: ! which we found in the decomposition of the arrow, and which
is the truly disjunctive part of the sum.
Let us check the equations we want to interpret.
If F, G and a are the interpretations of u[x], v[y] and r, then the interpretation
of δ x. u y. v (ι1 r) is Lin(F)(!a), which is equal to the interpretation F(a) of
u[r/x]. Similarly, we shall interpret the conversion δ x. u y. v (ι2 s) ↝ v[s/y].

Now we shall turn to the equation δ x. (ι1 x) y. (ι2 y) t = t. First, we see that
Lin(q1)(A) = {1} × A, because it is the unique linear solution F of F(!a) = {1} × !a.
In particular, if t is interpreted by {1} × A, then δ x. (ι1 x) y. (ι2 y) t is interpreted
by Lin(q1)(A) = {1} × A, and similarly, if t is interpreted by {2} × B, then
δ x. (ι1 x) y. (ι2 y) t is interpreted by Lin(q2)(B) = {2} × B.

Finally, the commuting conversions are of the form

E (δ x. u y. v t) ↝ δ x. (E u) y. (E v) t
12.6
Tensor product and units

The direct sum forms the disjoint union of the webs of two coherence spaces, so
what is the meaning of the graph product?
We define A ⊗ B to be the coherence space whose tokens are the pairs ⟨α, β⟩,
where α ∈ |A| and β ∈ |B|, with the coherence relation

⟨α, β⟩ coherent with ⟨α′, β′⟩ (mod A ⊗ B) iff α coherent with α′ (mod A) and β coherent with β′ (mod B)
This is called the tensor product. The dual (linear negation) of the tensor product
is called the par or tensor sum:

(A ⊗ B)⊥ = A⊥ ⅋ B⊥

Comparing this with the linear implication we have

A ⊸ B = A⊥ ⅋ B = (A ⊗ B⊥)⊥
Chapter 13
Cut Elimination (Hauptsatz)
Gentzen's theorem, one of the most important in logic, is not very far removed
from normalisation in natural deduction, which is to a large extent inspired by it.
In a slightly modified form, it is at the root of languages such as PROLOG. In other
words, it is a result which everyone should see proved at least once. However the
proof is very delicate and fiddly. So we shall begin by pointing out the key cases
which it is important to understand. Afterwards we shall develop the detailed
proof, whose intricacies are less interesting.
13.1
The key cases

Consider a cut

A ⊢ C, B    A′, C ⊢ B′
────────────────────── Cut
A, A′ ⊢ B, B′

where the left premise is a right logical rule and the right premise a left logical
rule, so that both introduce the main symbol of C. These cases bring out the
deep symmetries of logical rules, which match each other exactly.
1. R∧ and L1∧

A ⊢ C, B    A′ ⊢ D, B′              A″, C ⊢ B″
────────────────────── R∧      ─────────────── L1∧
A, A′ ⊢ C∧D, B, B′              A″, C∧D ⊢ B″
────────────────────────────────────────────── Cut
A, A′, A″ ⊢ B, B′, B″

is replaced by

A ⊢ C, B    A″, C ⊢ B″
────────────────────── Cut
A, A″ ⊢ B, B″
═════════════════════
A, A′, A″ ⊢ B, B′, B″

where the double bar denotes a certain number of structural rules, in this
case weakening and exchange.
2. R∧ and L2∧

A ⊢ C, B    A′ ⊢ D, B′              A″, D ⊢ B″
────────────────────── R∧      ─────────────── L2∧
A, A′ ⊢ C∧D, B, B′              A″, C∧D ⊢ B″
────────────────────────────────────────────── Cut
A, A′, A″ ⊢ B, B′, B″

is replaced similarly by

A′ ⊢ D, B′    A″, D ⊢ B″
──────────────────────── Cut
A′, A″ ⊢ B′, B″
═════════════════════
A, A′, A″ ⊢ B, B′, B″
3. R1∨ and L∨

A ⊢ C, B                 A′, C ⊢ B′    A″, D ⊢ B″
─────────── R1∨      ─────────────────────────── L∨
A ⊢ C∨D, B               A′, A″, C∨D ⊢ B′, B″
──────────────────────────────────────────────── Cut
A, A′, A″ ⊢ B, B′, B″

is replaced by

A ⊢ C, B    A′, C ⊢ B′
────────────────────── Cut
A, A′ ⊢ B, B′
═════════════════════
A, A′, A″ ⊢ B, B′, B″

This is the dual of case 1.
4. R2∨ and L∨

A ⊢ D, B                 A′, C ⊢ B′    A″, D ⊢ B″
─────────── R2∨      ─────────────────────────── L∨
A ⊢ C∨D, B               A′, A″, C∨D ⊢ B′, B″
──────────────────────────────────────────────── Cut
A, A′, A″ ⊢ B, B′, B″

is replaced by

A ⊢ D, B    A″, D ⊢ B″
────────────────────── Cut
A, A″ ⊢ B, B″
═════════════════════
A, A′, A″ ⊢ B, B′, B″
5. R¬ and L¬

A, C ⊢ B                A′ ⊢ C, B′
────────── R¬          ─────────── L¬
A ⊢ ¬C, B              A′, ¬C ⊢ B′
──────────────────────────────── Cut
A, A′ ⊢ B, B′

is replaced by

A′ ⊢ C, B′    A, C ⊢ B
────────────────────── Cut
A′, A ⊢ B′, B
═════════════
A, A′ ⊢ B, B′
6. R⇒ and L⇒

A, C ⊢ D, B               A′ ⊢ C, B′    A″, D ⊢ B″
──────────── R⇒       ──────────────────────────── L⇒
A ⊢ C⇒D, B                 A′, A″, C⇒D ⊢ B′, B″
──────────────────────────────────────────────── Cut
A, A′, A″ ⊢ B, B′, B″

is replaced by

A′ ⊢ C, B′    A, C ⊢ D, B
───────────────────────── Cut
A′, A ⊢ B′, D, B
════════════════
A, A′ ⊢ D, B, B′              A″, D ⊢ B″
──────────────────────────────────────── Cut
A, A′, A″ ⊢ B, B′, B″
7. R∀ and L∀

A ⊢ C, B                  A′, C[a/ξ] ⊢ B′
─────────── R∀           ──────────────── L∀
A ⊢ ∀ξ. C, B               A′, ∀ξ. C ⊢ B′
──────────────────────────────────────── Cut
A, A′ ⊢ B, B′

is replaced by

A ⊢ C[a/ξ], B    A′, C[a/ξ] ⊢ B′
──────────────────────────────── Cut
A, A′ ⊢ B, B′
8. R∃ and L∃

A ⊢ C[a/ξ], B               A′, C ⊢ B′
────────────── R∃          ─────────────── L∃
A ⊢ ∃ξ. C, B                A′, ∃ξ. C ⊢ B′
──────────────────────────────────────── Cut
A, A′ ⊢ B, B′

is replaced by

A ⊢ C[a/ξ], B    A′, C[a/ξ] ⊢ B′
──────────────────────────────── Cut
A, A′ ⊢ B, B′

This is the dual of case 7.
13.2
The principal lemma

ϖ is a variant of π, not of π′.
The last rule r of π has premises Ai ⊢ Bi proved by πi, and the last rule r′
of π′ has premises A′j ⊢ B′j proved by π′j. There are several cases to consider:

1. π is an axiom. There are two subcases:
π proves C ⊢ C. Then a proof ϖ of C, A′∖C ⊢ B′ is obtained from
π′ by means of structural rules.
π proves D ⊢ D. Then a proof ϖ of D, A′∖C ⊢ D, B′ is obtained
from π by means of structural rules.

2. π′ is an axiom. This case is handled as 1; but notice that if π and π′ are
both axioms, we have arbitrarily privileged π.

3. r is a structural rule. The induction hypothesis for π1 and π′ gives a proof
ϖ1 of A1, A′∖C ⊢ B1∖C, B′. Then ϖ is obtained from ϖ1 by means of
structural rules. Notice that in the case where the last rule of π is RC on
C, we have more occurrences of C in B1 than in B.

4. r′ is a structural rule (dual of 3).

5. r is a logical rule, other than a right one with principal formula C. The
induction hypothesis for πi and π′ gives a proof ϖi of Ai, A′∖C ⊢ Bi∖C, B′.
The same rule r is applicable to the ϖi, and since r does not create any new
occurrence of C on the right side, this gives a proof ϖ of A, A′∖C ⊢ B∖C, B′.

6. r′ is a logical rule, other than a left one with principal formula C (dual of 5).

7. Both r and r′ are logical rules, r a right one and r′ a left one, of principal
formula C. This is the only important case, and it is symmetrical.

First, apply the induction hypothesis to
πi and π′, giving a proof ϖi of Ai, A′∖C ⊢ Bi∖C, B′;
π and π′j, giving a proof ϖ′j of A, A′j∖C ⊢ B∖C, B′j.

Second, apply r (and some structural rules) to the ϖi to give a proof ρ of
A, A′∖C ⊢ C, B∖C, B′. Likewise apply r′ (and some structural rules) to
the ϖ′j to give a proof ρ′ of A, A′∖C, C ⊢ B∖C, B′.

There is one occurrence of C too many on the right of the conclusion to ρ
and on the left of that to ρ′. Using the cut rule we have a proof of
A, A′∖C, A, A′∖C ⊢ B∖C, B′, B∖C, B′.

However the degree of this cut is d, which is too much. But we observe
that this is precisely one of the key cases presented in 13.1, so we can
replace this cut by others of degree < d. Finally ϖ is obtained by structural
manipulations.
13.3
The Hauptsatz

A ⊢ C, B    A′, C ⊢ B′
────────────────────── Cut
A, A′ ⊢ B, B′
One should have some idea of how the process of eliminating cuts explodes
the height of proofs. We shall just give an overall estimate which does not take
into account the structural rules.
The principal lemma is linear: the elimination of a cut at worst multiplies the
height by the constant k = 4.
The proposition is exponential: reducing the degree by 1 increases the height
from h to 4^h at worst, since in using the lemma we multiply by 4 for each unit of
height.
Altogether, the Hauptsatz is hyperexponential: a proof of height h and degree d
becomes, at worst, one of height H(d, h), where:

H(0, h) = h
H(d + 1, h) = 4^H(d, h)
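To get a feel for how quickly this tower grows, the bound can be tabulated for small arguments (a quick sketch; the function name `H` is ours, following the recurrence above):

```python
def H(d, h):
    """Height bound after cut elimination: H(0, h) = h, H(d+1, h) = 4**H(d, h)."""
    for _ in range(d):
        h = 4 ** h
    return h

# Even tiny degrees explode the bound:
print(H(0, 5))  # 5
print(H(1, 2))  # 16
print(H(2, 2))  # 4**16 = 4294967296
```

Already H(3, 1) = 4^256 exceeds the number of atoms in the observable universe, which is why the Hauptsatz, though effective, is of no direct computational use.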
13.4
Resolution
Gentzen's result does not say anything about the case where we have non-trivial
axioms. Nevertheless, by close examination of the proof, we can see that the only
case in which we would be unable to eliminate a cut is that in which one of
the two premises is an axiom, and that it is necessary to extend the axioms by
substitution.
In other words, the Hauptsatz remains applicable, but in the form of a
restriction of the cut rule to those sequents which are obtained from proper
axioms by substitution.
As a consequence, if we confine ourselves to atomic sequents (built from atomic
formulae) as proper axioms, and as the conclusion, there is no need for the logical
rules.
Let us turn straight to the case of PROLOG. The axioms are of a very special
form, namely atomic intuitionistic sequents (also called Horn clauses) A ` B .
The aim is to prove goals, i.e. atomic sequents of the form ` B . In doing this
we have at our disposal
instances (by substitution) A ` B of the proper axioms,
identity axioms A ` A with A atomic,
cut, and
the structural rules.
But the contraction and weakening are redundant:

Lemma If the atomic sequent A ⊢ B is provable using these rules, there is an
intuitionistic sequent A′ ⊢ B′ provable without weakening or contraction, such
that:
A′ is built from formulae of A;
B′ is in B.

Proof By induction on the proof of A ⊢ B.

1. If it is an axiom the result is immediate, as the axioms, proper or identity,
are intuitionistic.

2. If it ends in a structural rule applied to A1 ⊢ B1, the induction hypothesis
gives an intuitionistic sequent A′1 ⊢ B′1 and we put A′ = A′1, B′ = B′1.

3. If it ends in a cut

A1 ⊢ C, B1    A2, C ⊢ B2
──────────────────────── Cut
A1, A2 ⊢ B1, B2

then the induction hypothesis provides A′1 ⊢ B′1 and A′2 ⊢ B′2 and two
cases arise:
B′1 ≠ C: we can take A′ = A′1 and B′ = B′1;
B′1 = C, which occurs, say, n times in A′2: by making exchanges and
n cuts with A′1 ⊢ C we obtain the result with A′ = A′1, ..., A′1, A′2∖C
and B′ = B′2.
This lemma is immediately applicable to a goal ⊢ B, which gives A′ empty
and B′ = B. Notice that the deduction necessarily lies in the intuitionistic
fragment. But in this case, it is possible to eliminate exchange too, by permuting
the order of application of cuts. Furthermore, cut with an identity axiom

A ⊢ C    C ⊢ C
────────────── Cut
A ⊢ C

is useless, so we have:
Proposition In order to prove a goal, we only need to use cut with instances (by
substitution) of proper axioms.
Robinson's resolution method (1965) gives a reasonable strategy for finding
such proofs. The idea is to try all possible combinations of cuts and substitutions,
the latter being limited by unification. However that would lead us too far afield.
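Although the book does not develop the method, the unification step that limits the substitutions can be sketched in a few lines. This toy first-order unifier is our own illustration, not the book's algorithm: variables are strings, compound terms are tuples, and the occurs-check is omitted for brevity.

```python
def walk(u, subst):
    """Chase variable bindings until a non-bound value is reached."""
    while isinstance(u, str) and u in subst:
        u = subst[u]
    return u

def unify(s, t, subst=None):
    """Return a substitution making s and t equal, or None if impossible.
    Variables are strings; compound terms are tuples (functor, arg1, ...)."""
    if subst is None:
        subst = {}
    s, t = walk(s, subst), walk(t, subst)
    if s == t:
        return subst
    if isinstance(s, str):              # s is an unbound variable
        return {**subst, s: t}
    if isinstance(t, str):              # t is an unbound variable
        return {**subst, t: s}
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and len(s) == len(t) and s[0] == t[0]):
        for a, b in zip(s[1:], t[1:]):  # unify arguments pairwise
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None                         # functor clash: no unifier

# unifying f(X, g(Y)) with f(a, g(b)) binds X to a and Y to b
print(unify(('f', 'X', ('g', 'Y')), ('f', ('a',), ('g', ('b',)))))
```

A real PROLOG engine combines such unification with a backtracking search over the clauses, which is exactly the restricted cut described above.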
Chapter 14
Strong Normalisation for F
The aim of this chapter is to prove:
Theorem All terms of F are strongly normalisable, and the normal form is
unique.
The uniqueness is not problematic: it comes from an extension of the
Church-Rosser theorem. Existence is much more delicate; in fact, we shall see
in chapter 15 that the normalisation theorem for F implies the consistency of
second order arithmetic PA2. The classic result of logic, if anything deserves that
name, is Gödel's second incompleteness theorem, which says (assuming that it
is not contradictory) that the consistency of PA2 cannot be proved within PA2.
Consequently, since consistency can be deduced from normalisation within PA2 , the
normalisation theorem cannot be proved within PA2 . That gives us an essential
piece of information for the proof: we must look for a strategy which goes outside
PA2 .
Essentially, PA2 contains the Axiom (scheme) of comprehension

∃X. ∀ξ. (ξ ∈ X ⇔ A[ξ])

where A is a formula in which the variable X does not occur free. A may contain
first order (∀ξ, ∃ξ) and second order (∀X, ∃X) quantification. Intuitively,
the first order variables range over integers and the second order ones over sets
of integers. This system suffices for everyday mathematics: for instance, real
numbers may be coded as sets of integers.
So we seek to use all possible axioms of comprehension, or at least a large
class of them. For this, we shall look back at Taits proof (using reducibility) and
try to extend it to system F.
14.1

We would like to say that t of type ∀X. T is reducible iff for all types U, t U is
reducible (of type T[U/X]). For example, t of type ∀X. X would be reducible iff
t U is reducible for all U. But U is arbitrary (it may be ∀X. X) and we need
to know the meaning of reducibility of type U before we can define it! We shall
never get anywhere like this. Moreover, if this method were practicable, it would
be applicable to variants of system F for which normalisation fails.
14.1.1
Reducibility candidates
14.1.2
Remarks
The choice of (CR 1-3) is crucial. We need to identify some useful induction
hypotheses on a set of terms which is otherwise arbitrary, and they must be
preserved by the construction of the true reducibility. These conditions were
originally found by trial and error. In linear logic, reducibility candidates appear
much more naturally, from a notion of orthogonality on terms [Gir87].
14.1.3
Definitions
A term t is neutral if it does not start with an abstraction symbol, i.e. if it has
one of the following forms:

x
t u
t U

… if t ∈ R and t ⇝ t′, then t′ ∈ R.

t ∈ R → S iff for all u, (u ∈ R implies t u ∈ S)
14.2

Let T[X] be a type, where we understand that X contains (at least) all the
free variables of T. Let U be a sequence of types, of the same length; then
we can define by simultaneous substitution a type T[U/X]. Now let R be a
sequence of reducibility candidates of corresponding types; then we can define a
set REDT[R/X] (parametric reducibility) of terms of type T[U/X] as follows:

1. If T = Xi, then REDT[R/X] = Ri;

2. If T = V → W, then REDT[R/X] = REDV[R/X] → REDW[R/X];

3. If T = ∀Y. W, then REDT[R/X] is the set of terms t of type T[U/X] such
that, for every type V and reducibility candidate S of this type,
t V ∈ REDW[R/X, S/Y].
Lemma REDT[R/X] is a reducibility candidate of type T[U/X].

Proof By induction on T; the only case to consider is T = ∀Y. W.

(CR 1) If t ∈ REDT[R/X], take an arbitrary type V and an arbitrary candidate
S of type V (for example, the strongly normalisable terms of type V).
Then t V ∈ REDW[R/X, S/Y], and so, by induction hypothesis (CR 1), we
know that t V is strongly normalisable. But ν(t) ≤ ν(t V), so t is strongly
normalisable.

(CR 2) If t ∈ REDT[R/X] and t ⇝ t′, then for all types V and candidates S,
we have t V ∈ REDW[R/X, S/Y] and t V ⇝ t′ V. By induction hypothesis
(CR 2) we know that t′ V ∈ REDW[R/X, S/Y]. So t′ ∈ REDT[R/X].

(CR 3) Let t be neutral and suppose all the t′ one step from t are in REDT[R/X].
Take V and S: applying a conversion inside t V, the result is a t′ V since t is
neutral, and t′ V is in REDW[R/X, S/Y] since t′ is. By induction hypothesis
(CR 3) we see that t V ∈ REDW[R/X, S/Y], and so t ∈ REDT[R/X].
14.2.1
Substitution
The following lemma says that parametric reducibility behaves well with respect
to substitution:
14.2.2
Universal abstraction
Lemma If for every type V and candidate S, w[V/Y] ∈ REDW[R/X, S/Y], then
ΛY. w ∈ RED∀Y.W[R/X].

Proof We have to show that (ΛY. w) V ∈ REDW[R/X, S/Y] for every type V
and candidate S of type V. We argue by induction on ν(w). Converting a redex
of (ΛY. w) V gives:

(ΛY. w′) V with ν(w′) < ν(w), which is in REDW[R/X, S/Y] by the induction
hypothesis.

w[V/Y], which is in REDW[R/X, S/Y] by assumption.

So the result follows from (CR 3).
14.2.3
Universal application
14.3
Reducibility theorem
Chapter 15
Representation Theorem
In this chapter we aim to study the strength of system F with a view to
identifying the class of algorithms which are representable. For example, if f is a
closed term of type Int→Int, it gives rise to a function (in the set-theoretic sense)
|f| from N to N by

f n ⇝ |f|(n)

that is, f applied to the numeral n normalises to the numeral for |f|(n).
15.1
Representable functions
15.1.1
Numerals

The integer n is represented by the numeral

n = ΛX. λx^X. λy^(X→X). y (y (… (y x) …))

with n occurrences of y.
Suppose that v is w u or w U, where w ≠ y. Since v is normal, w must be of
the form w′ u′ or w′ U′. But the types of x and y are simpler than that of w′, so
w′ is an abstraction and w is a redex: contradiction. So v is x, in which case our
result holds with n = 0, or v is y v′ and we apply the induction hypothesis to v′
of type X.
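Stripped of their types, these numerals are the familiar Church numerals, and their behaviour can be imitated directly in an untyped setting (a sketch; the helper names are ours):

```python
def church(n):
    """The numeral n: take a start value x, then a step y, and apply y n times."""
    def take_x(x):
        def take_y(y):
            value = x
            for _ in range(n):
                value = y(value)
            return value
        return take_y
    return take_x

three = church(3)
print(three(0)(lambda k: k + 1))               # 3: counts the applications of y
print(three('x')(lambda s: '(y ' + s + ')'))   # (y (y (y x)))
```

The second call rebuilds, as a string, exactly the shape y (y (y x)) of the normal form characterised above.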
Remark If we had taken the variant ∀X. (X→X)→(X→X) we would have
obtained almost the same result, but in addition there is a variant for 1:

ΛX. λy^(X→X). y
This phenomenon is one of the little imperfections of the syntax. Similar
features arise with inductive data types, i.e. the closed normal forms of type T
are almost the terms obtained by combining the functions fi , but in general
only almost.
Having said this, the recursion scheme for inductive types, defined (morally) in
terms of the fi , shows that (in a sense to be made precise) the terms constructed
from the fi are dense among the others. To return to our pet subject, the
syntax seems to be too rigid and much too artificial to allow a satisfactory study
of such difficulties. Undoubtedly they cannot be resolved otherwise than by
means of an operational semantics which would allow us to identify (or distinguish
between) algorithms beyond what can be done with normalisation, which is only
an approximation to that semantics.
15.1.2
Let us return to the original question, which was to characterise the functions
which are representable in F. We have seen that such functions are recursive,
i.e. calculable.
Proposition There is a total recursive function which is not representable in F.
Proof The function which we shall take is the normalisation operation. We
represent terms in a formal language as a string of symbols from a fixed finite
alphabet and hence as an integer. Then this function takes one term (represented
by an integer) and yields another. This function is universal (in the sense of
Turing) with respect to the functions representable in F, and so cannot itself be
represented in F.
More precisely:
N (n) = m if n codes the term t, m codes u and u is the normal form of t.
N (n) = 0 if n does not code any term of F.
On the other hand we have the functions:
A(m, n) = p if m, n, p are the codes of t, u, v such that v = t u, with
A(m, n) = 0 otherwise.
♯(n) = m if m codes the numeral n.

♭(m) = n if m is the code of the numeral n, with ♭(m) = 0 otherwise.

Now consider:

D(n) = ♭(N(A(n, ♯(n)))) + 1

This is certainly a total recursive function, but it cannot be represented
in F. Indeed, suppose that t of type Int→Int represents D and let n be the
code of t. Then A(n, ♯(n)) is the code of t n, and N(A(n, ♯(n))) that of its
normal form. But by definition of t, t n ⇝ D(n), so N(A(n, ♯(n))) = ♯(D(n)) and
♭(N(A(n, ♯(n)))) = D(n), whence D(n) = D(n) + 1: contradiction.

For any reasonable coding, A, ♯ and ♭ are obviously representable in F,
so N itself is not representable in F.
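The diagonal step in this proof is completely generic: any device that enumerates only total functions misses its own diagonal. A schematic illustration (our own toy coding, not the book's):

```python
def diagonal(enumeration):
    """Given enumeration: code -> total function (int -> int),
    return a total function differing from every enumerated one at its own code."""
    return lambda n: enumeration(n)(n) + 1

# A toy 'language' whose program n computes x -> n * x; every program is total.
toy = lambda n: (lambda x: n * x)
d = diagonal(toy)
print(d(3))  # toy(3)(3) + 1 = 10
# If d were program m of the toy language, we would need m*m == m*m + 1: absurd.
```

The function D above is exactly this construction applied to the (hypothetical) enumeration of F-representable functions via N, A, ♯ and ♭.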
This result is of course a variant of a very famous result in Recursion Theory
(due to Turing), namely that the set of total recursive functions cannot be
enumerated by a single total recursive function. In particular it applies to all
sorts of calculi, typed or untyped, which satisfy the normalisation theorem.
15.1.3
Remark Let us point out briefly the status of functions which are provably total
in a system of arithmetic which is not too weak:
If A is 1-consistent, i.e. proves no false Σ⁰₁ formula (as we hope is the
case for PA, PA2 and the axiomatic set theory of Zermelo-Fraenkel) then
a diagonalisation argument shows that there are total recursive functions
which are not provably total in A.
Otherwise (and notice that A can be consistent without being 1-consistent,
e.g. A = PA + ¬consis(PA)) A proves the totality of recursive functions
which are in fact partial. It can even prove the totality of all recursive
functions (but for wrong reasons, and after modification of the programs).
15.2
15.2.1
Formulation of HA2
∀ξ. ∀η. (Sξ = Sη ⇒ ξ = η)        ∀ξ. ¬(Sξ = O)

A                    ∀X. A
────── ∀²I          ─────────── ∀²E
∀X. A               A[{ξ. C}/X]

In the last rule, A[{ξ. C}/X] means that we replace all the atoms a ∈ X by
C[a/ξ] (so {ξ. C} is not part of the syntax).
To illustrate the strength of this formalism (second order ∀ à la Takeuti), observe
that ∀²E is nothing but the principle

∀X. A ⇒ A[{ξ. C}/X]

and in particular, with A the provable formula

∃Y. ∀ξ. (ξ ∈ X ⇔ ξ ∈ Y)

we get ∃Y. ∀ξ. (C ⇔ ξ ∈ Y), which is the Comprehension Scheme.
Nat(ξ) = ∀X. (O ∈ X ⇒ ∀η. (η ∈ X ⇒ Sη ∈ X) ⇒ ξ ∈ X)

then it is easy to prove that

A[O/ξ] ⇒ ∀η. (A[η/ξ] ⇒ A[Sη/ξ]) ⇒ ∀ξ. (Nat(ξ) ⇒ A[ξ])

In other words the induction scheme holds provided all first order quantifiers are
relativised to Nat.
15.2.2
15.2.3
⋮
Sⁿ⁻¹O ∈ X        [∀η. (η ∈ X ⇒ Sη ∈ X)]
──────────────────────────────────── ∀E, ⇒E
Sⁿ O ∈ X
⋮
O ∈ X ⇒ ∀η. (η ∈ X ⇒ Sη ∈ X) ⇒ Sⁿ O ∈ X
──────────────────────────────────────── ∀²I
∀X. (O ∈ X ⇒ ∀η. (η ∈ X ⇒ Sη ∈ X) ⇒ Sⁿ O ∈ X)
This fact is similar to 15.1.1, but the proof is more delicate, because of the
axioms (especially the negative one ∀ξ. ¬(Sξ = O)) which, a priori, could appear in
the deduction. The fact that S a = O is not provable (consistency of HA2) must
be exploited.
Now let A[n, m] be a formula expressing the fact that an algorithm, if given
input n, terminates with output m = f(n). Suppose we can prove

∀n ∈ N. ∃m ∈ N. A[n, m]

by means of a deduction δ in HA2 of

∀ξ. (Nat(ξ) ⇒ ∃η. (Nat(η) ∧ A[ξ, η]))

Then we get a term [[δ]] of type

[[ ∀ξ. (Nat(ξ) ⇒ ∃η. (Nat(η) ∧ A[ξ, η])) ]] = Int → (Int × [[A]])

and the term t = λx. π¹([[δ]] x) of type Int→Int yields an object that keeps the
algorithmic content of the theorem:

∀n ∈ N. ∃m ∈ N. A[n, m]

Indeed, for any n ∈ N, the normal form of the deduction

δ        Nat(Sⁿ O)
──────────────────────── ∀E, ⇒E
∃η. (Nat(η) ∧ A[Sⁿ O, η])
⋮
Nat(Sᵐ O) ∧ A[Sⁿ O, Sᵐ O]
15.2.4
Instead of adding this junk term, we can interpret it into pure system F, by a
coding which maps every type to an inhabited one while preserving normalisation.
Proposition For any (closed) term t of type Int→Int in system F with junk,
there is a (closed) term t′ of pure system F such that, if t n normalises to m, then
t′ n normalises to m.

In particular, if t represents a function f, so does t′, and the representation
theorem is (correctly) proved.
Proof By induction, we define:

⟨⟨X⟩⟩ = X
⟨⟨U → V⟩⟩ = ⟨⟨U⟩⟩ → ⟨⟨V⟩⟩
⟨⟨∀X. V⟩⟩ = ∀X. X → ⟨⟨V⟩⟩

so that:

⟨⟨T[U/X]⟩⟩ = ⟨⟨T⟩⟩[⟨⟨U⟩⟩/X]
and

∅_{∀X.X} = ΛX. λx^X. x
If t is a term of type T with free type variables X1, ..., Xp and free first order
variables y1, ..., yq of types U1, ..., Uq, we define inductively a term ⟨⟨t⟩⟩ (without
junk) of type ⟨⟨T⟩⟩ with free type variables X1, ..., Xp and free first order variables
x1, ..., xp, y1, ..., yq of types X1, ..., Xp, ⟨⟨U1⟩⟩, ..., ⟨⟨Uq⟩⟩:

⟨⟨y^T⟩⟩ = y^⟨⟨T⟩⟩
⟨⟨λy^U. v⟩⟩ = λy^⟨⟨U⟩⟩. ⟨⟨v⟩⟩
⟨⟨t u⟩⟩ = ⟨⟨t⟩⟩ ⟨⟨u⟩⟩
⟨⟨ΛX. v⟩⟩ = ΛX. λx^X. ⟨⟨v⟩⟩ (note that x may occur in ⟨⟨v⟩⟩)
⟨⟨t U⟩⟩ = ⟨⟨t⟩⟩ ⟨⟨U⟩⟩ ∅_⟨⟨U⟩⟩
⟨⟨Ω⟩⟩ = ∅_Emp = ΛX. λx^X. x

Again the reader can check the following properties

⟨⟨t[u/y^U]⟩⟩ = ⟨⟨t⟩⟩[⟨⟨u⟩⟩/y^⟨⟨U⟩⟩]
∅_{T[U/X]} = ∅_T[⟨⟨U⟩⟩/X][∅_U/x^⟨⟨U⟩⟩]
⟨⟨t[U/X]⟩⟩ = ⟨⟨t⟩⟩[⟨⟨U⟩⟩/X][∅_U/x^⟨⟨U⟩⟩]

which are needed for the preservation of conversions:

if t ⇝ u then ⟨⟨t⟩⟩ ⇝ ⟨⟨u⟩⟩
weaken n ⇝ ⟨⟨n⟩⟩

and

contract ⟨⟨n⟩⟩ ⇝ n
Appendix A
Semantics of System F
by Paul Taylor
In this appendix we shall give a semantics for system F in terms of coherence
spaces. In particular we shall interpret universal abstraction by means of a kind of
trace, showing that the primary and secondary equations hold. We shall examine
the way in which its terms are uniform over all types. Finally we shall attempt
to calculate some universal types such as Emp = X. X, Sgl = X. X X,
Bool = X. X X X and Int = X. X (X X) X.
A.1
A.1.1
A.1.2
Saturated domains
As an exercise, the reader is invited to construct a countable coherence space into which
any other can be rigidly embedded (A.3.1).
A.1.3
Uniformity
A.2
Rigid Embeddings
There are reasons for weakening this to 1_A ≤ p∘e. We may consider that a domain is
a better approximation than another if it can express more data, and this gives rise to an
embedding. However we may also consider that a domain is inferior if its representation makes
a priori distinctions between things which subsequently turn out to be the same, and such
a comparison is of this more general form. On the other hand the limit-colimit coincidence
and other important constructions such as Π and Σ types remain valid. However for rigid
adjunctions 1_A = p∘e is forced because the identity is maximal in the Berry order.
3
In fact ⊴ is not a partial order but a category, because it depends on e. Applying this to
a functor T, we obtain a category with objects the pairs (A, b) for b ∈ T(A) and morphisms
given in this way by embeddings; this is called the total category or Grothendieck fibration of
ΣX. T(X), and is written ∫T.
Observe then that for inclusions the embedding is just the identity and the
projection is the restriction:

e(a) = a        p(b) = b ∩ |A|

A.2.1
Functoriality of arrow
The reason for using pairs of maps for approximations is that we need to make
the function-space functorial (positive) in its first argument: if A′ approximates
A then we need A′ → B to approximate A → B and not vice versa.
Indeed if e : A′ ⊴ A and f : B′ ⊴ B then we have e→f : (A′ → B′) ⊴ (A → B)
by

(e→f)⁺(t′)(a) = f⁺(t′(e⁻ a))
(e→f)⁻(t)(a′) = f⁻(t(e⁺ a′))

for a ∈ A, a′ ∈ A′, t : A → B and t′ : A′ → B′. (We leave the reader to check the
inequalities.)

Recall that the tokens of A → B are of the form (a, β) where a is a clique
(finite coherent subset) of |A| and β is a token of |B|. If e : |A′| ⊴ |A| and
f : |B′| ⊴ |B| are rigid embeddings then the effect on the token (a′, β′) of A′ → B′
is simply the corresponding renaming throughout, i.e. (e⁺ a′, f β′).
A.3
Interpretation of Types
We can use this to express any type T of F with n free type variables X1, ..., Xn
as a functor [[T]] : Gemⁿ → Gem as follows:

1. If T is a constant type then we assign to it a coherence space T and
A.3.1
We may calculate |[[T]](A)| from these tokens as follows. For every embedding
e : A′ ⊴ A and every token τ ∈ |[[T]](A′)|, we have a token [[T]](e)(τ) ∈ |[[T]](A)|.
However the fact that there may be several such embeddings (and hence several
copies of the token, which must be coherent) gives rise to additional (uniformity)
conditions on the tokens of |[[∀X. T]]|. For instance we shall see that ⟨Sgl, σ⟩ is
not a token for [[∀X. X]].
A.3.2
We can use the linear logic introduced in chapter 12 to choose a good notation
for the tokens and express the conditions on them. Recall that
The tokens of !A are the cliques (finite complete subgraphs) of |A|, and two
cliques are coherent iff their union is a clique; we write cliques as enumerated
sets.

B⊥ is the linear negation of B, whose web is the complementary graph to
that of B; it is convenient to write its tokens as β̄. Then β̄ is coherent with β̄′
iff β is incoherent with β′; this avoids saying mod B or mod B⊥.

|C ⊗ D| is the graph product of |C| and |D|; its tokens are pairs ⟨γ, δ⟩, and
this is coherent with ⟨γ′, δ′⟩ iff γ is coherent with γ′ and δ is coherent with δ′.
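These conventions are easy to experiment with if a web is represented as an undirected graph (a small sketch with our own encoding; `coherent` is the coherence relation between distinct tokens):

```python
from itertools import combinations

def is_clique(tokens, coherent):
    """A clique: a finite set of pairwise coherent tokens."""
    return all(coherent(a, b) for a, b in combinations(tokens, 2))

def bang_coherent(coherent):
    """Coherence on !A: two cliques of A cohere iff their union is a clique."""
    return lambda c1, c2: is_clique(set(c1) | set(c2), coherent)

# Web of A: tokens 1, 2, 3 with 1 coherent to 2 only.
coh = lambda a, b: {a, b} == {1, 2}
c = bang_coherent(coh)
print(c({1}, {1, 2}))  # True: the union {1, 2} is a clique
print(c({1}, {3}))     # False: 1 and 3 are incoherent
```

Linear negation is just as direct: complementing the graph swaps `is_clique` between a space and its dual's anticliques.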
The token of the identity, ΛX. λx^X. x, is therefore written

⟨Sgl, ⟨{σ̄}, σ⟩⟩

In this notation it is easy to see how we can ascribe a meaning to the phrase
σ occurs positively (or negatively) in τ. Informally, a particular occurrence is
positive or negative according as it is over-lined evenly or oddly.
We can obtain a very useful criterion for whether a potential token can actually
occur.
Lemma Let σ ∈ |A| and τ ∈ |[[T]](A)|. Define a coherence space A⁺ by adjoining
an additional token σ′ to |A| which bears the same coherence relation to the
other tokens (besides σ) as does σ, and is coherent with σ. There are two rigid
embeddings A ⊴ A⁺ (in which σ is taken to σ and σ′ respectively), so write
τ, τ′ ∈ |[[T]](A⁺)| for the images of τ under these embeddings. Similarly we have
A⁻, in which σ′ is incoherent with σ. Then

if σ does not occur in τ then τ = τ′ in both [[T]](A⁺) and [[T]](A⁻);

if σ occurs positively but not negatively then τ is coherent with τ′ in [[T]](A⁺)
and incoherent with τ′ in [[T]](A⁻).
A.3.3
Any token for X → X is of the form ⟨A, ⟨a, α⟩⟩, in which only the token α appears
positively, so a = {ᾱ}. Hence the only token for this type is the one given, and
[[∀X. X→X]] ≃ Sgl. This means that the only uniform functions of type X → X
are the identity and the undefined function.

The case of T = X is even simpler. No token of A can appear negatively,
and so there is no token at all: [[∀X. X]] ≃ Emp has the empty web and only
the totally undefined term, ∅. The reason for this is that if a term is defined
uniformly for all types then it must be coherent with any term; since there are
incoherent terms this must be trivial.
It is clear that no model of F of a domain-theoretic nature can exclude the
undefined function, simply because ∅ is semantically definable. For higher types
this leads to the same logical complexities as in section 8.2.2.
Unfortunately, even accepting partiality, coherence spaces do not behave as we
might wish. The tokens for the interpretation of
Bool = ∀X. X→X→X

are of the form ⟨Sgl, ⟨a, ⟨b, σ⟩⟩⟩ such that a ∪ b = {σ̄}. This admits not two but
three (incoherent) solutions:

⟨Sgl, ⟨{σ̄}, ⟨∅, σ⟩⟩⟩
A.4
Interpretation of terms
Having sketched the notation we shall now interpret terms and give the formal
semantics of F using coherence spaces.
Recall that a type T with n free type variables X1, ..., Xn is interpreted by a
stable functor [[T]] : Gemⁿ → Gem. Let t be a term of type T with free variables
x1, ..., xm of types U1, ..., Um, where the free variables of the U are included among
the X. Then t likewise assigns to every n-tuple A in Gemⁿ and every m-tuple
bj ∈ [[Uj]](A) a point c ∈ [[T]](A). Of course the function b ↦ c must be stable,
and we may simplify matters by replacing t by λx. t and T by U1 → ... → Um → T
to make m = 0. We must consider what happens when we vary the Ai.
A.4.1
Let T : Gem → Gem be any stable functor and Φ(A) ∈ T(A) a choice of points.
Let e : A′ ⊴ A be a rigid embedding; we want to make Φ monotone with
respect to it. We can use the idea from section A.3.1 to do this: we want

Φ(A′) ⊑ T(e)⁻(Φ(A))

which becomes, when the embeddings are subspace inclusions,

Φ(A′) ⊆ Φ(A) ∩ |T(A′)|
6
These two hitherto unpublished observations have been made by the author of this
appendix since the original edition of this book.
We shall use the separability property to show that stability forces equality here.
The following is due to Eugenio Moggi.

Lemma Let e : A′ ⊴ A be a rigid embedding. Let A +_{A′} A be the coherence
space whose web consists of two incoherent copies of |A| with the subgraphs |A′|
identified. Then A has two canonical rigid embeddings into A +_{A′} A and their
intersection is A′.

What does it mean for Φ to be a stable function from Gem? We have not
given the codomain7, but we can still work out intersections using the definition
of a ∧ b as a ∩ e⁻b for e : A ⊴ B. Write A1 and A2 for the two copies of A
inside A +_{A′} A, whose intersection is A′.

Using the projection form of the inequality, we have

A″ ⊑ A1 ∧ A2
Φ(A1) ∩ |T(A″)| = Φ(A) ∩ |T(A″)|
Φ(A2) ∩ |T(A″)| = Φ(A) ∩ |T(A″)|

The intersection of the values at A1 and A2 is therefore just

Φ(A) ∩ |T(A′)|

By stability this must be the value at A′. This proves the

Proposition Let Φ be an object of the variable coherence space T(X1, ..., Xn),
and ei : A′i ⊴ Ai be rigid embeddings. Then8

Φ(A′) = Φ(A) ∩ |T(A′)|

and indeed if Φ satisfies this condition then it is stable.
A.4.2
Coherence of tokens
To test this condition we only need to consider graphs up to twice the size of
|A|, and so it is a finite9 calculation to determine whether ⟨A, τ⟩ satisfies it. For
any given type these tokens are recursively enumerable. Because ⟨A, τ⟩ is atomic,
we must have just one token for ∀X. T(X), so ⟨A, τ⟩ and ⟨A′, τ′⟩ are identified
for any e : A ≃ A′ with T(e)(τ) = τ′.
We still have to say when these tokens are coherent.
Lemma Let τ1 ∈ |T(A1)| and τ2 ∈ |T(A2)| each satisfy these conditions. Then
Φ⟨A1,τ1⟩(B) is coherent with Φ⟨A2,τ2⟩(B) at every coherence space B iff for every pair of embeddings
e1 : A1 ⊴ C, e2 : A2 ⊴ C, we have T(e1)(τ1) coherent with T(e2)(τ2).
Finally this enables us to calculate the universal abstraction of any variable
coherence space.
Proposition Let T : Gem → Gem be a stable functor. Then its universal
abstraction, ∀X. T(X), is the coherence space whose tokens are equivalence classes
of pairs ⟨A, τ⟩ such that

τ ∈ |T(A)|;

A is minimal for this, i.e. if A′ ⊴ A and τ ∈ |T(A′)| then A′ = A (so A is
finite);

for any two rigid embeddings e1, e2 : A ⊴ B, we have

T(e1)(τ) coherent with T(e2)(τ)

in T(B).

⟨A, τ⟩ is identified with ⟨A′, τ′⟩ iff e : A ≃ A′ and T(e)(τ) = τ′ (so |A| may
be taken to be a subset of N).
A.4.3
Interpretation of F
Let us sum up by setting out in full the coherence space semantics of F. The type
U in n free variables X is interpreted as a stable functor [[U]] : Gemⁿ → Gem as
in A.3, with the additional clause

4. If U = ∀X. T then the web of [[U]](A) is given as in the preceding proposition,
where T(X) = [[T]](A, X). The embedding induced by e : A′ ⊴ A takes
tokens of [[U]](A′) to the corresponding tokens with α′i replaced by ei α′i.

The term t of type T with m free variables x of types U (the free type
variables of T, U being X) is interpreted as an assignment to each A of a stable
function

[[t]](A) : [[U1]](A) × ... × [[Um]](A) → [[T]](A)

such that for e : A′ ⊴ A and bj ∈ [[Uj]](A) the uniformity equation holds:

[[T]](e)⁻([[t]](A)(b)) = [[t]](A′)([[U]](e)⁻(b))
In detail,
1. The variable xj is interpreted by the jth product projection:

[[xj]](A)(b) = bj

2. The interpretation of λ-abstraction λx. u is given in terms of that of u by
the trace:

[[λx. u]](A)(b) = {⟨c, δ⟩ : δ ∈ [[u]](A)(b, c), with c minimal}
The conversion rules are satisfied because they amount to the bijection between objects of ΠX. T(X) and variable objects of T (we need to prove a substitution lemma similar to that in section 9.2).
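The trace used in clause 2 above can be computed by brute force over a finite web. A Python sketch under simplifying assumptions (every subset of the web is treated as a clique, the function f is assumed stable, and the toy function is ours):

```python
from itertools import combinations

def trace(f, web):
    """Trace of a function on cliques: the pairs (a, alpha) with
    alpha in f(a) and a minimal, i.e. no proper subset of a
    already yields alpha.  For simplicity every subset of the
    finite web is treated as a clique; f is assumed stable."""
    cliques = [frozenset(xs)
               for k in range(len(web) + 1)
               for xs in combinations(sorted(web), k)]
    return {(a, alpha)
            for a in cliques
            for alpha in f(a)
            if not any(alpha in f(b) for b in cliques if b < a)}

# toy stable function: emits 't' as soon as token 1 is present
f = lambda a: {'t'} if 1 in a else set()
print(trace(f, {1, 2}))        # {(frozenset({1}), 't')}
```

Note that ⟨{1, 2}, 't'⟩ is excluded: {1} is a strict subset that already yields 't', so minimality fails.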
A.5
Examples
A.5.1
Of course
[[Bool]] is also a special case if we admit the two-element discrete poset (not a coherence
space) for the domain U, in a category with coproducts. The other three examples which
we are about to consider are derived by means of the identities U → V → X ≃ (U N V) → X, (A → X) N (B → X) ≃ (A + B) → X and ΠX. (V(X) → Y) ≃ (ΣX. V(X)) → Y.
where the uᵢ range over finite cliques of U, i.e. tokens of !U. However, although there is only one token, namely •, available to tag the uᵢ's, it may occur repeatedly; the token is therefore given by a finite (pairwise incoherent) set of tokens of !U.
In other words, denotationally,
A.5.2
Natural Numbers
!A ⊸ !(!A ⊸ A) ⊸ A

whose tokens are of the form ⟨a, ⟨{⟨b₁, β₁⟩, . . . , ⟨bₖ, βₖ⟩}, α⟩⟩, where

b₁ ∪ · · · ∪ bₖ = a ∪ {β₁, . . . , βₖ}
It is more enlightening to turn to the syntax and find the tokens of the numeral 1. Calculating [[ΛX. λx. λy. yx]] using section A.4.3, we get tokens of the form

⟨A, ⟨a, ⟨{⟨a, α⟩}, α⟩⟩⟩

where |A| consists of the clique a and the token α.
• If a = ∅ we have the program which ignores the starting value stream and everything on the transition function stream apart from the constant part of its value, which is copied to the output.

• If a has m elements, the program reads that part of the transition function which reads its input exactly m times, and applies this to the starting value (which it reads m times). But,

• If α ∈ a then the program outputs only that part of the result of the transition function which is contained in the input.

• If α ∉ a then it only outputs that part which is not contained in the input. But,

• If α ≍ β, where β ranges over r of the m tokens of the clique a, then α is only output in those cases where the input and output are coherent in this way.
So even the numeral 1 is a very complex beast: it amounts to a resolution of the transition function into a polynomial, the mth term of which reads its input exactly m times. It further resolves the terms according to the relationship between the input and output.
Clearly these complications multiply as we consider larger numerals. Do they, along with the features just listed, provide a complete classification of the tokens of Int? What does Int → Int look like?
A.5.3
Linear numerals
We can try to bring some order to this chaos by considering a linear version of
the natural numbers analogous to the linear booleans.
LInt = ΠX. X ⊸ ((X ⊸ X) → X)

(we leave one classical implication behind!) The effect of this is to replace a by {α} and bᵢ by {βᵢ}, and then the positive and negative criterion gives
which are not necessarily distinct. Besides the undirected graph structure given by coherence, the pairing ⟨βᵢ, βᵢ′⟩ induces a transition relation on A.
The linear numeral k consists of the tokens of the form

α = β₁,  β₁′ = β₂,  . . . ,  βₖ₋₁′ = βₖ,  βₖ′ = ω

subject only to βᵢ ≍ βⱼ ⇒ βᵢ₊₁ ≍ βⱼ₊₁, so there are still quite a lot of them!
More generally, the transition relation preserves coherence, reflects incoherence, and contains a path from the input token to the output token via any given token. The reader is invited to verify this characterisation and also determine when two such tokens are coherent.
A.6
Total domains
Proposition The total objects in the denotation of Bool and Int are exactly the
truth values and the numerals.
Appendix B
What is Linear Logic?
by Yves Lafont
Linear logic was originally discovered in coherence semantics (see chapter 12). It
appears now as a promising approach to fundamental questions arising in proof
theory and in computer science.
In ordinary (classical or intuitionistic) logic, you can use a hypothesis as many times as you want: this feature is expressed by the rules of weakening and contraction of Sequent Calculus. There are good reasons for considering a logic without these rules:
• From the viewpoint of proof theory, it removes pathological situations from classical logic (see next section) and introduces a new kind of invariant (proof nets).

• From the viewpoint of computer science, it gives a new approach to questions of laziness, side effects and memory allocation [GirLaf, Laf87, Laf88], with promising applications to parallelism.
B.1
Classical logic is not constructive
Consider first a cut in which the cut formula C is introduced on both sides by a weakening:

      A ⊢ B                D ⊢ E
   ----------- RW       ----------- LW
    A ⊢ C, B             D, C ⊢ E
   ------------------------------- Cut
            A, D ⊢ B, E

reduces to

      A ⊢ B                               D ⊢ E
   ============ (weakenings)           ============ (weakenings)
   A, D ⊢ B, E            or to        A, D ⊢ B, E

Now consider two proofs of ⊢ B, padded by weakenings and joined by a cut followed by a contraction:

      ⊢ B                 ⊢ B
   -------- RW         -------- LW
    ⊢ C, B              C ⊢ B
   --------------------------- Cut
             ⊢ B, B
            -------- RC
              ⊢ B

reduces to

     ⊢ B                  ⊢ B
    =====      or to     =====
     ⊢ B                  ⊢ B

where the double bar is a weakening (with an exchange in the first case) followed by a contraction.
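This non-confluence can also be seen mechanically. Encoding the second example with proofs as nested tuples (a toy encoding of ours, not the book's), cut elimination may keep either premise, and the two reducts are different proofs of the same sequent:

```python
# Toy encoding of the second example: two different proofs of |- B,
# padded by RW and LW, joined by a Cut and then a contraction RC.
pi1 = ('proof-1', '|- B')
pi2 = ('proof-2', '|- B')
detour = ('RC', ('Cut', ('RW', pi1), ('LW', pi2)))

def reduce_cut(proof, keep):
    """Eliminate the weakening/contraction detour: the cut formula C
    was introduced by weakening on both sides, so the reduct simply
    keeps one of the two subproofs (plus structural rules)."""
    _rc, (_cut, (_rw, left), (_lw, right)) = proof
    return left if keep == 'left' else right

print(reduce_cut(detour, 'left'))    # ('proof-1', '|- B')
print(reduce_cut(detour, 'right'))   # ('proof-2', '|- B') -- not confluent
```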
B.2
Linear Sequent Calculus
We simply discard weakening and contraction. Exchange, identity and cut are left unchanged, but logical rules need some adjustments: for example, the rules for ∧ are now inadequate (since cut elimination in 13.1 requires weakenings). In fact, we need two conjunctions: a tensor product (or cumulative conjunction) ⊗, with the rules

    A, C, D ⊢ B                  A ⊢ C, B     A′ ⊢ D, B′
   -------------- L⊗           -------------------------- R⊗
   A, C ⊗ D ⊢ B                  A, A′ ⊢ C ⊗ D, B, B′

and a direct product (or alternative conjunction) N, with the rules

    A, C ⊢ B               A, D ⊢ B               A ⊢ C, B     A ⊢ D, B
   ------------- L1N      ------------- L2N      ---------------------- RN
   A, C N D ⊢ B           A, C N D ⊢ B                A ⊢ C N D, B

Dually, we shall have a tensor sum O (dual of ⊗) and a direct sum ⊕ (dual of N), with symmetrical rules: left becoming right and vice versa. There is an easy way to avoid this boring repetition, by using asymmetrical sequents.
(A N B)⊥ = A⊥ ⊕ B⊥
(A ⊕ B)⊥ = A⊥ N B⊥

A sequent A₁, . . . , Aₙ ⊢ B₁, . . . , Bₘ is then replaced by the right-handed sequent

⊢ A₁⊥, . . . , Aₙ⊥, B₁, . . . , Bₘ

and the rules become:

    ⊢ C, A     ⊢ C⊥, B                ⊢ C, A     ⊢ D, B
   --------------------- Cut        ---------------------
         ⊢ A, B                        ⊢ C ⊗ D, A, B

    ⊢ C, A     ⊢ D, A                   ⊢ C, D, A
   ---------------------              -------------
       ⊢ C N D, A                      ⊢ C O D, A

       ⊢ C, A                            ⊢ D, A
   ---------------                   ---------------
     ⊢ C ⊕ D, A                        ⊢ C ⊕ D, A
Units (1 for ⊗, ⊥ for O, ⊤ for N and 0 for ⊕) are also introduced:

   1⊥ = ⊥          ------
                    ⊢ 1

   ⊥⊥ = 1           ⊢ A
                  --------
                   ⊢ ⊥, A

   ⊤⊥ = 0         --------
                   ⊢ ⊤, A

   0⊥ = ⊤         (no rule for 0)
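These dualities lend themselves to a direct recursive implementation. A Python sketch (the tuple encoding of formulas is ours; the clauses for N, ⊕ and the units are the ones displayed above, while the clauses for ⊗ and O are their standard multiplicative companions, (A ⊗ B)⊥ = A⊥ O B⊥ and (A O B)⊥ = A⊥ ⊗ B⊥):

```python
DUAL_OP = {'tensor': 'par', 'par': 'tensor',
           'with': 'plus', 'plus': 'with'}
DUAL_UNIT = {'1': 'bot', 'bot': '1', 'top': '0', '0': 'top'}

def dual(formula):
    """Linear negation pushed to the atoms by De Morgan: 'with'/'plus'
    and the units are swapped as in the text; 'tensor'/'par' by the
    standard multiplicative clauses.  We write A^ for A-perp."""
    if isinstance(formula, str):
        if formula in DUAL_UNIT:
            return DUAL_UNIT[formula]
        return formula[:-1] if formula.endswith('^') else formula + '^'
    op, left, right = formula
    return (DUAL_OP[op], dual(left), dual(right))

f = ('with', 'A', ('plus', 'B', '0'))
print(dual(f))              # ('plus', 'A^', ('with', 'B^', 'top'))
print(dual(dual(f)) == f)   # linear negation is involutive: True
```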
Finally, the lost structural rules come back with a logical dressing, via the modalities ! A (of course A) and ? A (why not A):

(!A)⊥ = ?A⊥
(?A)⊥ = !A⊥

    ⊢ B, ?A           ⊢ A           ⊢ ?B, ?B, A          ⊢ B, A
   ---------- !     --------- W?   ------------- C?     --------- D?
    ⊢ !B, ?A         ⊢ ?B, A          ⊢ ?B, A            ⊢ ?B, A

(in the first rule, ?A stands for a sequence of formulae all prefixed by ?).
Intuitionistic connectives are then translated as follows:

A ∧ B = A N B
A ∨ B = !A ⊕ !B
A ⇒ B = !A ⊸ B
¬A = !A ⊸ 0

in such a way that an intuitionistic formula is valid iff its translation is provable in Linear Sequent Calculus (so, for example, dereliction expresses that !B ⊸ B).
This translation is in fact used for the coherence semantics of typed lambda calculus (chapters 8, 9, 12 and appendix A).
It is also possible to add (first and second order) quantifiers, but the main
features of linear logic are already contained in the propositional fragment.
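The four translation clauses can be transcribed directly, keeping ⊸ and ! as primitive constructors. A Python sketch (the tuple encoding and constructor names are ours):

```python
def translate(formula):
    """Translation of intuitionistic formulas into linear logic:
    A and B -> A N B ('with'); A or B -> !A (+) !B ('plus');
    A => B -> !A -o B ('lolli'); not A -> !A -o 0.
    Atoms are strings and are left unchanged."""
    if isinstance(formula, str):
        return formula
    op = formula[0]
    if op == 'and':
        return ('with', translate(formula[1]), translate(formula[2]))
    if op == 'or':
        return ('plus', ('!', translate(formula[1])),
                        ('!', translate(formula[2])))
    if op == 'implies':
        return ('lolli', ('!', translate(formula[1])),
                translate(formula[2]))
    if op == 'not':
        return ('lolli', ('!', translate(formula[1])), '0')
    raise ValueError('unknown connective: %r' % op)

print(translate(('implies', ('and', 'A', 'B'), 'A')))
# ('lolli', ('!', ('with', 'A', 'B')), 'A')
```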
B.3
Proof nets
In these rules the context A plays a passive role: it is rewritten without any change. By expelling all those boring contexts, we obtain the substantifique moelle of the proof, called the proof net.
For example, the proof

    ⊢ A, A⊥    ⊢ B, B⊥
   --------------------
    ⊢ A ⊗ B, A⊥, B⊥          ⊢ C, C⊥
   ----------------------------------
      ⊢ (A ⊗ B) ⊗ C, A⊥, B⊥, C⊥
   ============================== (exchanges)
      ⊢ A⊥, B⊥, (A ⊗ B) ⊗ C, C⊥
   ------------------------------
      ⊢ A⊥ O B⊥, (A ⊗ B) ⊗ C, C⊥
becomes
(figure: the resulting proof net, with axiom links A—A⊥, B—B⊥, C—C⊥ and conclusions A⊥ O B⊥, (A ⊗ B) ⊗ C, C⊥)
The idea is to use, not a fixed logic, but an extensible one. The program declares its own
connectors (i.e. polymorphic types) and rules (i.e. constructors and destructors), and describes
the conversions (i.e. the program). Cut elimination is in fact parallel communication between
processes. In this language, logic does not ensure termination, but absence of deadlock.
The same proof net is obtained from the proof

    ⊢ A, A⊥    ⊢ B, B⊥
   --------------------
    ⊢ A ⊗ B, A⊥, B⊥
   ================== (exchanges)
    ⊢ A⊥, B⊥, A ⊗ B
   ------------------ O
    ⊢ A⊥ O B⊥, A ⊗ B
   ================== (exchange)
    ⊢ A ⊗ B, A⊥ O B⊥         ⊢ C, C⊥
   ----------------------------------
      ⊢ (A ⊗ B) ⊗ C, A⊥ O B⊥, C⊥
A proof structure is made of the following components:

link:  the two dual conclusions A and A⊥ of an axiom;
cut:  a pair of dual formulae A and A⊥, both already conclusions;
logical rules:  from premises A and B, form the conclusion A ⊗ B or A O B.
Each formula must be the conclusion of exactly one rule and a premise of at
most one rule. Formulae which are not premises are called conclusions of the proof
structure: these conclusions are not ordered. Links and cuts are symmetrical.
Proof nets are proof structures which are constructed according to the rules of Linear Sequent Calculus:

• Links are proof nets.

• If A is a conclusion of a proof net π and A⊥ is a conclusion of a proof net π′, then joining π and π′ by a cut between A and A⊥ gives a proof net.

• If A is a conclusion of a proof net π and B is a conclusion of a proof net π′, then joining π and π′ by the rule forming A ⊗ B gives a proof net.

• If A and B are conclusions of the same proof net π, then extending π by the rule forming A O B gives a proof net.

• The axiom ⊢ 1 is a proof net, and if π is a proof net, then π with a new conclusion ⊥ is a proof net.
There is a funny correctness criterion (the long trip condition, see [Gir87]) to
characterise proof nets among proof structures. For example, the following proof
structure
(figure: a proof structure with a conclusion A O B that cannot be obtained by the rules above)
is not a proof net, and indeed, does not satisfy the long trip condition.
Unfortunately, this criterion works only for the (⊗, O, 1) fragment of the logic (not ⊥).
B.4
Cut elimination
Proof nets provide a very nice framework for describing cut elimination.
Conversions are purely local:

• a cut between A ⊗ B and A⊥ O B⊥ is replaced by two cuts, one between A and A⊥ and one between B and B⊥;

• a cut between the two conclusions of a link disappears (nothing is left of it).
(figures: in the example above, the cut between the conclusion (A ⊗ B) ⊗ C and its dual is eliminated step by step, first splitting into cuts on A ⊗ B and on C, and then into cuts on A and on B)
and similarly for 1 and ⊥. So we can also restrict links to atomic formulae.
Consider now a cut free proof net with fixed conclusions. Since the logical rules
follow faithfully the structure of these conclusions, our proof net is completely
determined by its (atomic) links. So our first example comes to
(figures: the cut free proof nets of the example, each completely determined by its atomic links between the conclusions A⊥ O B⊥, (A ⊗ B) ⊗ C and C⊥)
This turbo cut elimination mechanism is the basic idea for generalising proof
nets to non-multiplicative connectives (geometry of interaction).
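On atomic links, this turbo mechanism amounts to composing matchings: each path that alternates between links and cut pairs is contracted to a single link. A Python sketch (the dictionary encoding is ours, and it assumes the net is correct, so that no cut path is cyclic):

```python
def eliminate_cuts(links, cuts):
    """links: symmetric pairing of atomic occurrences (the linking
    of the net); cuts: symmetric pairing of the occurrences consumed
    by cuts.  Chasing each path link-cut-link-... until it leaves
    the cut region yields the linking of the cut free net."""
    def chase(x):
        y = links[x]
        while y in cuts:              # partner is cut: jump across
            y = links[cuts[y]]
        return y
    return {x: chase(x) for x in links if x not in cuts}

links = {'a1': 'a2', 'a2': 'a1',      # first net:  link a1 -- a2
         'b1': 'b2', 'b2': 'b1'}      # second net: link b1 -- b2
cuts = {'a2': 'b1', 'b1': 'a2'}       # cut a2 against b1
print(eliminate_cuts(links, cuts))    # {'a1': 'b2', 'b2': 'a1'}
```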
B.5
Proof nets and natural deduction
It is fair to say that proof nets are the natural deductions of linear logic, but with
two notable differences:
• Thanks to linearity, there is no need for parcels of hypotheses.

• Thanks to linear negation, there is no need for discharge or for elimination rules.
For example, if we follow the obvious analogy between the intuitionistic implication A ⇒ B and the linear one A ⊸ B = A⊥ O B, the introduction

   [A]
    ⋮
    B
  -------
  A ⇒ B

corresponds to the O rule forming A⊥ O B, and the elimination (modus ponens)

  A ⇒ B    A
  -----------
       B

to a cut between A⊥ O B and A ⊗ B⊥.
Bibliography
[Abr87] S. Abramsky, Domain theory and the logic of observable properties, Ph.D.
thesis (Queen Mary College, University of London, 1987).
[Abr88] S. Abramsky, Domain theory in logical form, Annals of Pure and Applied Logic 51 (1991) 1–77.
[AbrVick] S. Abramsky and S.J. Vickers, Quantales, Observational Logic and Process Semantics, Mathematical Structures in Computer Science 3 (1993) 161–227.
[Barendregt] H. Barendregt, The lambda calculus: its syntax and semantics, North-Holland (1980).
[Gir] J.Y. Girard, Proof theory and logical complexity, Bibliopolis (Napoli, 1987).
[HylPit] J.M.E. Hyland and A.M. Pitts, The theory of constructions: categorical semantics and topos-theoretic models, in [GrSc], 137–199.
[Jung] A. Jung, Cartesian closed categories of domains, Ph. D. thesis (Technische
Hochschule Darmstadt, 1988).
[Kowalski] R. Kowalski, Logic for problem solving [PROLOG], North-Holland (1979).
[Koymans] C.P.J. Koymans, Models of the λ-calculus, Centrum voor Wiskunde en Informatica, 9 (1984).
[KrLev] G. Kreisel and A. Levy, Reflection principles and their use for establishing
the complexity of axiomatic systems, Z. Math. Logik Grundlagen Math.
33 (1968).
[KriPar] J.L. Krivine and M. Parigot, Programming with proofs, Sixth symposium on computation theory (Wendisch-Rietz, 1987).
[Laf87] Y. Lafont, Logiques, catégories et machines, Thèse de doctorat (Université Paris VII, 1988).
[Laf88] Y. Lafont, The linear abstract machine, Theoretical Computer Science 59 (1988) 157–180.
[LamSco] J. Lambek and P.J. Scott, An introduction to higher order categorical
logic, Cambridge University Press (1986).
[Lei83] D. Leivant, Reasoning about functional programs and complexity classes associated with type disciplines, Twenty fourth annual symposium on foundations of computer science, IEEE Computer Society Press (Washington DC, 1983).
[Lei90] D. Leivant, Contracting proofs to programs, in: Pergiorgio Odifreddi (ed.),
Logic in Computer Science, Academic Press (1990).
[ML70] P. Martin-Löf, A construction of the provable well-ordering of the theory of species (unpublished).
[ML84] P. Martin-Löf, Intuitionistic type theory, Bibliopolis (Napoli, 1984).
[Prawitz] D. Prawitz, Ideas and results in proof-theory, in: Proceedings of the second Scandinavian logic symposium, North-Holland (1971) 237–309.
[Reynolds] J.C. Reynolds, Towards a theory of type structure, Paris colloquium on programming, Springer-Verlag LNCS 19 (1974).
[ReyPlo] J.C. Reynolds and G. Plotkin, On functors expressible in the polymorphic
lambda calculus.
[ERobinson] E. Robinson, Logical aspects of denotational semantics, in: D.H. Pitt, A. Poigné and D.E. Rydeheard (eds.), Category theory and computer science, LNCS 283, Springer-Verlag (Edinburgh, 1987).
Index
Notation
variables, x, y, z
object language, ,
second order, X, Y , Z
terms, t, u, v, w
object language, a, b
types, S, T , U , V , W
propositions, A, B, C, D
coherence spaces, A, B, C
points, a, b, c
tokens, α, β, γ
numbers, m, n, p, q
Brackets
denotation, [[T ]], [[t]]
pairing, (a, b) and ha, bi
set, {n : P [n]}
substitution, t[u/x]
web, |A|
Connectives on types
conjunction, ∧
direct product, N
direct sum, ⊕
disjunction, ∨
function-space, →
implication, ⇒
linear implication, ⊸
product, ×
sum, +
tensor product, ⊗
tensor sum or par, O
Quantifiers
existential, ∃
existential type, Σ
universal, ∀
universal type, Π
Relations
coherence, ≍
definitional equality, =def
embedding and projection,
if and only if (iff), ⇔
incoherence, ≭
interconvertible with,
isomorphism, ≃
reduces (converts) to,
result of function, ↦
sequent, ⊢
Miscellaneous
composition, ∘
directed union and join, ⋃, ⋁
negation, ¬ (linear, ⊥)
of course and why not, !, ?
sequence, A
Abramsky, i, 55
abstraction (λ)
conversion, 13
introduction, 12, 20
realisability, 127
reducibility, 45
semantics, 68, 144
syntax, 15, 82
absurdity (⊥), 6, 95
commuting conversion, 78
denotation (f ), see booleans and
undefined object ()
empty type (Emp and U ), 80
linear logic (⊥ and 0), 154
natural deduction (⊥E), 73
realisability, 129
sequent calculus ( ` ), 29
Ackermann's function, 51
algebraic domain, 56
alive hypothesis, 9
all (∀), see universal quantifier
alternations of quantifiers, 58, 124
alternative conjunction (N), see
direct product
amalgamated sum, 96, 134
analysis (second order arithmetic),
114
and, see conjunction
application
conversion, 13
elimination, 12, 20
realisability, 127
reducibility, 43
semantics (App), 69
stability=Berry order, 65
syntax, 15, 82
trace formula (App), 63, 64, 144
approximation of points and domains,
57, 134
arrow type, see implication and
function-space
associativity of sum type, 98
asymmetrical interpretation, 34
atomic formulae, 4, 5, 30, 112, 160
atomic points, see tokens
atomic sequents, 112
atomic types, 15, 48
automated deduction, 28, 34
automorphisms, 134
axiom
comprehension, 114, 118, 123
excluded middle, 6, 156
hypothesis, 10
identity, 30
link, 156
proper, 112
Bad elimination, 77
Barendregt, 22
Berry, 54
Berry order (B ), 65, 66, 135, 146
beta () rule, see conversion
binary completeness, 56
binary trees (Bintree), 93
binding variables, 5, 12, 15, 83, 161
Boole, 3
booleans, 4
coherence space
Bool , 56, 60, 70
[[ΠX. X → X → X]], 140
commuting conversion, 86
conversion, 48
denotation (t and f ), 4
in F (ΠX. X → X → X), 84, 140
in T (Bool, T, F), 48, 50, 70
in system F, 84
totality, 149
bounded meet, see pullback
boundedly complete domains, 140
Brouwer, 6
by values, 51, 70, 91, 133
C, C?, see contraction
camembert, 3
CAML, 81
candidate
reducibility, 43, 115, 116
totality, 58, 149
Cantor, 1
Cartesian closed category, 54, 62, 67,
69, 95, 152
Cartesian natural transformation, see
Berry order
Cartesian product, see product
casewise definition (D), 48, 83, 97
category, 59, 95, 133, 135
characteristic subgroup, 134
Church-Rosser property, 16, 22, 49,
74, 79, 90, 114, 152, 159
clique, 57, 62, 101, 138
closed normal form, 19, 52, 121
coclosure, 135
coherence space, 56
booleans
Bool , 56, 60, 70
[[ΠX. X → X → X]], 140
coherence relation (≍), 56
direct product (N), 62
direct sum (), 96, 103
empty type (Emp), 104, 139
function space (), 64, 102, 138
integers
flat (Int), 56, 60, 66, 70
lazy (Int + ), 71, 98
[[ΠX. X → (X → X) → X]], 147
linear implication ((), 100, 104,
138
linear negation (A ), 100, 138
of course (!), 101, 138, 145
Pair , 1 , 2 , 68
partial functions (PF), 66
types, 143
semantics, 67, 132
singleton (Sgl ), 104, 139
tensor product (), 104, 138
tensor sum or par (O), 104
tokens and web, 56
coherent or spectral space, 56
collect (forming trees), 94
communicating processes, 155
commutativity of logic, 29
commuting conversion of sum, see
conversion
compact, 59, 66
compact-open topology, 55
complementary graph, see linear
negation
complete subgraph, see clique
complexity
algorithmic, 53, 111, 143
logical, 42, 58, 114, 122, 124, 140
components (π1 and π2)
elimination, 19
reducibility, 43
composition
stable functions, 69
comprehension scheme, 114, 118, 123,
126
computational significance, 1, 11, 17,
112, 120
confluent relation, see Church-Rosser
property
conjunction, 5
and product, 11, 15, 19
conj in Bool, 50
conversion, 13
cut elimination, 105
in F: X. (U V X)X, 84
linear logic, 152
natural deduction
∧I, ∧1E and ∧2E, 10
realisability, 126
sequent calculus
L1, L2 and R, 31
cons (add to list), 91
consistency
equational, 16, 23, 152
logical (consis), 42, 114, 124
constants by vocation, 60, 66
constructors, see data types in F
continuity, 54, 59, 137
contraction
LC and RC, 29
linear logic (C?), 154
contractum, 18, see redex
control features of PROLOG, 28
conversion, 18
bogus example, 75
booleans (D), 48
commuting, 74, 78, 85, 97, 103
conjunction (∧), 13
degree, 25
denotational equality, 69, 132
disjunction (∨), 75, 97
existential quantifier (∃), 75
implication (⇒), 13
in F, 83, 94
infinity (∞), 72
integers (R, It), 48, 51
λ-calculus, 11, 16, 18, 69
linear logic (proof nets), 158
natural deduction, 13, 20
reducibility, 43, 116
rewrite rules, 14
second order, 94
Coquand, 116, 133
correctness criterion
for proof nets, 158
for tokens of types, 139, 142
couple (forming trees), 93
(CR 13), see reducibility
cumulative conjunction, see tensor
product
Curry-Howard isomorphism, 5, 150
conjunction and product, 14
disjunction and sum, 80
implication and functions, 14
none in sequent calculus, 28
second order, 94
cut rule
Cut, 30
elimination of, 3, 105, 151, 158
linear logic, 153, 156, 158
natural deduction, 35, 40
not Church-Rosser, 150
proofs without, 33, 159
restriction of, 112
D, D, see casewise definition
data types in F, 87, 89
dead hypothesis, 9
deadlock, 155
deduction (natural), 9
degree, 24, 109
∂( ), of formula or type
d( ), of cut or redex
Delin, see linearisation
denotation, 1
denotational semantics, 14, 54, 67,
95, 132
dereliction (D?), 154
dI-domains, 71, 98
direct product (N)
coherence space, 61, 67
linear logic, 152
direct sum ()
coherence space, 96, 103, 146
example, 66
linear logic, 152
directed joins, 57, 59, 66
discharge, 9, 12, 37, 73, 161
discrete graph, see flat domain
disjunction, 5, 6, 95
and sum, 81
commuting conversion, 78
conversion, 75
cut elimination, 106
disj in Bool, 50
intuitionistic property, 8, 33
linear logic (⊕ and O), 152
natural deduction
∨1I, ∨2I and ∨E, 73
sequent calculus
L, R1 and R2, 31
intuitionistic L, 32
domain theory, 56, 132
dI-domains, 71, 98
domain equations, 98
Girard versus Scott, 54, 66, 98
L-domains, 140
lifted sum, 96
donkey, 134
dynamic, 2, 14, 54
∃, see existential quantifier
elimination, 8, 48
∧1E, ∧2E, ⇒E and ∀E, 10
R and D in T, 48
elimination
∨E, ∃E and ⊥E, 73
2 E, 94, 125
application and components, 19
good and bad, 77
left logical rules, 37, 40
linear logic, 161
linearity of, 99, 103
embedding-projection pair, 133, 134
empty type
and absurdity, 80
coherence space (Emp), 95, 104,
139
Emp and U , 80
in F: Emp = X. X, 85, 139
linear logic ( and 0), 154
realisability, 129
equalisers, 137
equations between terms and proofs,
see conversion
espace cohérent, 56
eta rule, see secondary equations
evaluation, see application
event structures, 98
exchange
LX and RX, 29
linear logic (X), 153
existential quantifier, 5, 6
commuting conversion, 78
conversion, 75
cut elimination, 108
intuitionistic property, 8, 33
natural deduction
∃I and ∃E, 73
sequent calculus
L and R, 32
existential type in F (, ), 86, 145
exponential
object, see implication and
function-space
process, see complexity,
algorithmic
expressive power, 50, 89, 155
extraction, see universal application
F (Girard-Reynolds system)
representable functions, 120
semantics, 132
strong normalisation, 42, 114
syntax, 82
F0 (parallel or), 61, 70
false
denotation (f , F, F), see booleans
proposition (), see absurdity
feasible, see complexity, algorithmic
fields of numbers, 134
filtered colimits, 59, 137
finite
approximation, 57, 66, 132, 134
branching tree, 27
normalisation, 24
points (Afin ), 57
presentability, 66
sense and denotation, 2
very, 59, 66
fixed point, 72, 95
flat domain, 57, 60, 66, 70, 140
for all (), see universal quantifier
for some (), see existential quantifier
Frege, 1, 2
function, 1, 17
Berry order (B ), 65
composition, 69
continuous, 55, 58
fixed point, 72
graph, 1, 66
linear, 99, 148
not representable, 122
on proofs, 6, 11
on types, 83, 132, 136
partial, 60, 66
partial recursive, 55
pointwise order, 66
polynomial resolution, 147
provably total, 52, 123
recursion, 50, 90, 120
representable, 52, 121
sequential, 54
stable, 58, 62, 68
total recursive, 122
trace (Tr), 62
two arguments, 61
function-space
and implication, 12, 15, 20
in F, 82
λ-calculus, 12, 15
linear decomposition, 101
semantics, 54, 62, 64, 67, 136
functor, 59, 134, 136, 141
Gallier, 28
Galois Theory, 134
Gandy, 27
garbage collection, 150
Gem, 136
general recursion, 72
Gentzen, 3, 28, 105
geometry of interaction, 4, 160
Girard, 30, 42, 80, 82, 114, 124, 150
goals in PROLOG, 112
Gödel, 1, 6, 47, 54
incompleteness theorem, 42, 114
numbering, 53
¬¬-translation, 124
good elimination, 77
graph
embedding, 133, 134
function, 1, 66
product, 104, 138
web, 56
Grothendieck fibration, 135, 137, 141
I, see introduction
idempotence of logic, 29
identification of terms and proofs, see
conversion
identity
axiom, 30, 112, 156
hypothesis, 10
maximal in Berry order, 65, 135
polymorphic, 83, 132, 136, 138
proof of A ⇒ A, 6
if, see casewise definition
implication, 5
and function-space, 12, 15, 20
conversion, 13
cut elimination, 107
linear ((), 100, 153
natural deduction
⇒I and ⇒E, 10
realisability, 127
semantics, 54
sequent calculus
L and R, 32
inclusion order, 56
incoherence (≭), 100
incompleteness theorem, 6, 42, 114,
124
inductive data types, 87, 121
inductive definition of +, , etc., 50
infinite static denotation, 2
infinity (f
and ), 71
initial object, 95, 152
input, 1, 17
integers, 1
coherence space
flat (Int), 56, 60, 66, 70
lazy (Int + ), 71, 98
[[ΠX. X → (X → X) → X]], 147
conversion, 48
dI-domain (Int < ), 98
in F, 89, 121, 147
in HA2 , 125
in T (Int, O, S, R), 48, 70
iteration (It), 51, 70, 90
linear type (LInt), 148
normal form, 52, 121
realisability (Nat), 126, 127
recursor (R), 48, 91
totality, 149
internalisation, 27
intersection, see conjunction
bounded above, see pullback
in [[Bool]], 140
introduction, 8, 48
∧I, ⇒I and ∀I, 10
O, S, T and F in T, 48
∨1I, ∨2I and ∃I, 73
2 I, 94, 125
linear logic, 161
pairing and -abstraction, 11, 12,
19
right logical rules, 37, 40
sums, 81
intuitionism, 6, 150
intuitionistic sequent, 8, 30, 32, 33,
112, 152
inversion in linear algebra, 101
isomorphisms, 132134
iteration (It), 51, 70, 90
Join, see disjunction
preservation (linearity), 99
joint continuity and stability, 61
Jung, 133, 140
Kleene, 123
König's lemma, 27
Koymans, 133
Kreisel, 55
linear logic
direct product (N), 61, 104, 152
direct sum (), 96, 103, 146, 152
implication ((), 100, 104, 153
integers (LInt), 148
intuitionistic logic, 154
linear maps, 99, 101
link rule, 156
natural deduction, 161
negation (⊥), 100, 138, 153
notation for tokens, 138
of course (!), 101, 145, 154
polynomial resolution, 147
proof nets, 155
reducibility, 115
semantics, 95
sequent calculus, 152
sum decomposition, 95, 103, 146
syntax, 150
tensor product (), 104, 146, 152,
156
tensor sum or par (O), 104, 152,
156
trace (Trlin), 100
units (1, , > and 0), 104, 154
why not (?), 102, 154
link axiom, 156
lists, 47, 91
locally compact space, 55
logical rules, 31
cut elimination, 105, 112
linear logic, 153, 156
natural deduction, 37
logical view of topology, 55
long trip condition, 158
Löwenheim, 1, 3
Martin-Löf, 88
match (patterns), 81
maximally consistent extensions, 54
meet, see conjunction
bounded above, see pullback
memory allocation, 150
mod, coherence relation, 56
modalities, 101, 154
model theory, 3
modules, 17, 47
modus ponens (E), see elimination
Moggi, 141
N = {0, 1, 2, ...}, set of integers
Nat predicate in HA2 , 126
natural deduction, 8
∧, ⇒ and ∀, 10
∨, ∃ and ⊥, 73
conversion, 13, 20, 75, 78
λ-calculus, 11, 19
linear logic, 161
normalisation, 24, 42
second order, 94, 125
sequent calculus, 35
subformula property, 76
natural numbers, see integers
negation, 5
A A, 6
¬A =def A ⇒ ⊥, 6, 73
cut elimination, 107
linear (⊥), 100, 138, 153
neg in Bool, 50
sequent calculus (L, R), 31
neutrality, 43, 49, 116
nil (empty tree or list), 91
NJ (Prawitz system), see natural
deduction
Noetherian, 66
nonconvergent rewriting, 72
nondeterminism, 150
normal closure of a field, 134
normal form, 18, 76
cut-free, 41
existence, 24
head normal form, 19
integers, 52, 121
linear logic, 159
uniqueness, 22
normalisation theorem
for F, 114
for T, 49
length of process (), 27, 43, 49
linear logic, 155
strong, 42
weak, 22, 24
not (¬), see negation
O, O (zero), see integers
object language, 10
Ockham's razor, 3
of course (!), 101, 138, 145, 154
operational semantics, 14, 121
or, see disjunction and parallel or
orthogonal (⊥), see linear negation
output, 17
PA, PA2 , see Peano arithmetic
pairing
conjunction, 11
in F, 84
introduction, 19
semantics (Pair ), 61, 68
par (O), see tensor sum
parallel or, 50, 60, 70, 81
parallel process, 150, 159
parcels of hypotheses, 11, 36, 40, 161
partial equivalence relation, 55
partial function, 60
coherence space (PF), 66
recursive, 55
partial objects, 57
PASCAL, 47
pattern matching, 81
Peano arithmetic (PA), 42, 53
second order (PA2 ), 114, 123
peculiar, 134
permutations, 134
, see system F
1 , 2 , 1 , 2 , see components
Pitts, 133
Plotkin, 56
plugging, 17
polynomial, 148
Pω, see Scott
positive negative, 34, 87, 139, 142
potential tokens, 138
Prawitz, 8, 80
predecessor (pred), 51, 72, 91
preservation of joins (linearity), 99
primary equations, see conversion
principal branch or premise, 75, 76
product
and conjunction, 11, 15, 19
coherence space (N), 61, 68
in F, 84, 145
linear logic ( and N), 152
projection, 11, 61, 68, 84
programs, 17, 53, 72, 84, 124
projection (embedding), 135, 142
PROLOG, 28, 105, 112
proof, 5
proof net, 155
proof of program, 53
proof structure, 156
provably total, 53, 124
Ptolomeic astronomy, 1
pullback, 54, 59, 61, 65, 137, 141
Quantifiers, see universal, existential
and second order
R (right logical rules), see sequent
calculus
R
, see asymmetrical interpretation
real numbers, 114
realisability, 7
recursion
and iteration, 51, 70, 90
recurrence relation, 50
recursor (R), 48
semantics, 70
redex, 18, 24, 48
reducibility, 27, 58, 123
in F, 115
in T, 49
λ-calculus, 42
reduction, 18
reflexive symmetric relation, 57
representable functions, 52, 120
resolution method, 112
rewrite rules, see conversion
Reynolds, 82
right logical rules, see sequent
calculus
rigid embeddings, 134
ring theory, 66
Robinson, 112
Rosser, see Church-Rosser property
S, S (successor), see integers
saturated domains, 133
Scott, 54, 55, 64, 66, 133
second incompleteness theorem, 42
second order logic, 94, 114, 123
secondary equations, 16, 69, 81, 85,
97, 132
semantic definability, 140
sense, 1
separability, 134
sequent calculus, 28
Cut rule, 30
linear logic, 150
logical rules, 31
natural deduction, 35
PROLOG, 112
structural rules, 29
sequential algorithm, 54
side effects, 150
Σ (existential type) in F, 86, 145
(total category), 135
SN (strongly normalisable terms),
119
specification, 17
spectral space, 56
stability, 54, 134, 137
definition (St), 58, 100
static, 2, 14, 54
strict (preserve ∅), 97
strong finiteness, 59, 66
strong normalisation, 26, 42, 114
structural rules
cut elimination, 106, 112
linear logic, 152, 153
natural deduction, 36
sequent calculus, 29
subdomain, 134
subformula property
λ-calculus, 19
natural deduction, 76
sequent calculus, 33
substantifique moelle, 155
substitution, 5, 25, 69, 112, 118
subuniformity, 134
successor (S and S), see integers
sum type, 95
+, ι1, ι2 and δ, 81
and disjunction, 81
coherence space, 103, 146
linear decomposition, 96, 103
linear logic (⊕ and O), 152
symmetry, 10, 18, 28, 30, 31, 97, 105,
134
T, t, T , see true
T (Godels system), 47, 67, 70, 123
tableaux, 28
Tait, 42, 49, 114
Takeuti, 125
Tarski, 4
tautology
linear logic (1 and ⊤), 154
Taylor, 133
tensor product ()
coherence space, 104, 138, 146
linear logic, 152
tensor sum or par (O)
coherence space, 104
linear logic, 152
terminal object, 104
terminating relation, see
normalisation
terms
in F, 82
in HA2 , 125
in T, 68
λ-calculus, 15
object language, 5
theory of constructions, 116, 133
token, 56, 64, 137
topological space, 55
total category, 135, 137, 141, 146
total objects, 57, 149
total recursive function, 53, 122
trace (Tr), 62, 67, 144
linear (Trlin), 100
transposition in linear algebra, 101
trees, 8, 47, 93
true
denotation (t, T , T), see
booleans
proposition (⊤), see tautology
turbo cut elimination, 160
Turing, 122
type variables, 82
types, 15, 54, 67
Undefined object (∅), 56, 96, 129,
139, 146, 149
unification, 113
uniform continuity, 55
uniformity of types, 83, 132, 134,
143
units (0, >, 1 and ), 104
universal algebra, 66
universal domain, 133
universal program (Turing), 122
universal quantifier, 5, 6
cut elimination, 108
natural deduction
∀I and ∀E, 10
sequent calculus
L and R, 32
universal quantifier (second order),
82, 126
2
I and 2 E, 94, 125
reducibility, 118
semantics, 132, 143, 149
Variable coherence spaces, 141
variables
hypotheses, 11, 15, 19
object language, 10, 125
type, 82, 125
very finite, 59, 66
Vickers, 55
Weak normalisation, 24
weakening
LW and RW, 29
linear logic (W?), 154
web, 56, 135
why not (?), 102, 154
Winskel, 98, 133
X, LX, RX (exchange), 29, 153
Y (fixed point), 72
Zero
0, unit of ⊕, 154
O and O, see integers