Programming in Standard ML
Robert Harper
Carnegie Mellon University
Spring Semester, 2011
Copyright © 2011 by Robert Harper.
All Rights Reserved.
Preface
This book is an introduction to programming with the Standard ML programming language. It began life as a set of lecture notes for Computer
Science 15-212: Principles of Programming, the second semester of the introductory sequence in the undergraduate computer science curriculum at
Carnegie Mellon University. It has subsequently been used in many other
courses at Carnegie Mellon, and at a number of universities around the
world. It is intended to supersede my Introduction to Standard ML, which
has been widely circulated over the last ten years.
Standard ML is a formally defined programming language. The Definition of Standard ML (Revised) is the formal definition of the language. It
is supplemented by the Standard ML Basis Library, which defines a common basis of types that are shared by all implementations of the language.
Commentary on Standard ML discusses some of the decisions that went into
the design of the first version of the language.
There are several implementations of Standard ML available for a wide
variety of hardware and software platforms. The best-known compilers
are Standard ML of New Jersey, MLton, Moscow ML, MLKit, and PolyML.
These are all freely available on the worldwide web. Please refer to The
Standard ML Home Page for up-to-date information on Standard ML and
its implementations.
Numerous people have contributed directly and indirectly to this text.
I am especially grateful to the following people for their helpful comments and suggestions: Brian Adkins, Nels Beckman, Marc Bezem, James
Bostock, Terrence Brannon, Franck van Breugel, Chris Capel, Matthew
William Cox, Karl Crary, Yaakov Eisenberg, Matt Elder, Mike Erdmann,
Matthias Felleisen, Andrei Formiga, Stephen Harris, Nils Jahnig, Joel Jones,
David Koppstein, John Lafferty, Johannes Laire, Flavio Lerda, Daniel R.
Licata, Adrian Moos, Bryce Nichols, Michael Norrish, Arthur J. O'Dwyer,
Frank Pfenning, Chris Stone, Dave Swasey, Michael Velten, Johan Wallen,
Scott Williams, and Jeannette Wing. Richard C. Cobbe helped with font selection. I am also grateful to the many students of 15-212 who used these
notes and sent in their suggestions over the years.
These notes are a work in progress. Corrections, comments and suggestions are most welcome.
Contents

Preface

I Overview

1 Programming in Standard ML
  1.1 A Regular Expression Package
  1.2 Sample Code

II The Core Language

2 Types, Values, and Effects
3 Declarations
4 Functions
  4.1 Functions as Templates
  4.2 Functions and Application
  4.3 Binding and Scope, Revisited
  4.4 Sample Code
5 Products and Records
6 Case Analysis
7 Recursive Functions
...
11 Higher-Order Functions
  11.1 Functions as Values
  11.2 Binding and Scope
  11.3 Returning Functions
  11.4 Patterns of Control
  11.5 Staging
  11.6 Sample Code
12 Exceptions
  12.1 Exceptions as Errors
    12.1.1 Primitive Exceptions
    12.1.2 User-Defined Exceptions
  12.2 Exception Handlers
  12.3 Value-Carrying Exceptions
  12.4 Sample Code
13 Mutable Storage
  13.1 Reference Cells
  13.2 Reference Patterns
  13.3 Identity
  13.4 Aliasing
  13.5 Programming Well With References
    13.5.1 Private Storage
    13.5.2 Mutable Data Structures
  13.6 Mutable Arrays
...

III The Module Language
...
20 Signature Ascription
  20.1 Ascribed Structure Bindings
  20.2 Opaque Ascription
...

IV Programming Techniques
...
27 Proof-Directed Debugging
  27.1 Regular Expressions and Languages
  27.2 Specifying the Matcher
  27.3 Sample Code
28 Persistent and Ephemeral Data Structures
  28.1 Persistent Queues
  28.2 Amortized Analysis
  28.3 Sample Code
29 Options, Exceptions, and Continuations
  29.1 The n-Queens Problem
  29.2 Solution Using Options
  29.3 Solution Using Exceptions
  29.4 Solution Using Continuations
  29.5 Sample Code
30 Higher-Order Functions
  30.1 Infinite Sequences
  30.2 Circuit Simulation
  30.3 Regular Expression Matching, Revisited
  30.4 Sample Code
31 Memoization
  31.1 Cacheing Results
  31.2 Laziness
  31.3 Lazy Data Types in SML/NJ
  31.4 Recursive Suspensions
  31.5 Sample Code
32 Data Abstraction
  32.1 Dictionaries
  32.2 Binary Search Trees
  32.3 Balanced Binary Search Trees
  32.4 Abstraction vs. Run-Time Checking
  32.5 Sample Code

Appendices
B Compilation Management
  B.1 Overview of CM
  B.2 Building Systems with CM
  B.3 Sample Code
C Sample Programs
Part I
Overview
Standard ML is a type-safe programming language that embodies many
innovative ideas in programming language design. It is a statically typed
language, with an extensible type system. It supports polymorphic type
inference, which all but eliminates the burden of specifying types of variables and greatly facilitates code re-use. It provides efficient automatic
storage management for data structures and functions. It encourages functional (effect-free) programming where appropriate, but allows imperative (effect-ful) programming where necessary. It facilitates programming
with recursive and symbolic data structures by supporting the definition
of functions by pattern matching. It features an extensible exception mechanism for handling error conditions and effecting non-local transfers of
control. It provides a richly expressive and flexible module system for
structuring large programs, including mechanisms for enforcing abstraction, imposing hierarchical structure, and building generic modules. It is
portable across platforms and implementations because it has a precise
definition. It provides a portable standard basis library that defines a rich
collection of commonly-used types and routines.
Many implementations go beyond the standard to provide experimental language features, extensive libraries of commonly-used routines, and
useful program development tools. Details can be found with the documentation for your compiler, but here's some of what you may expect.
Most implementations provide an interactive system supporting on-line
program development, including tools for compiling, linking, and analyzing the behavior of programs. A few implementations are batch compilers that rely on the ambient operating system to manage the construction
of large programs from compiled parts. Nearly every compiler generates
native machine code, even when used interactively, but some also generate code for a portable abstract machine. Most implementations support separate compilation and provide tools for managing large systems
and shared libraries. Some implementations provide tools for tracing and
stepping programs; many provide tools for time and space profiling. Most
implementations supplement the standard basis library with a rich collection of handy components such as dictionaries, hash tables, or interfaces to
the ambient operating system. Some implementations support language
extensions such as support for concurrent programming (using message-passing or locking), richer forms of modularity, and support for
lazy data structures.
Chapter 1
Programming in Standard ML
1.1 A Regular Expression Package
To develop a feel for the language and how it is used, let us consider the
implementation of a package for matching strings against regular expressions. We'll structure the implementation into two modules, an implementation of regular expressions themselves and an implementation of a
matching algorithm for them.
These two modules are concisely described by the following signatures.
signature REGEXP = sig
  datatype regexp =
    Zero | One | Char of char |
    Plus of regexp * regexp |
    Times of regexp * regexp |
    Star of regexp
  exception SyntaxError of string
  val parse : string -> regexp
  val format : regexp -> string
end

signature MATCHER = sig
  structure RegExp : REGEXP
  val accepts : RegExp.regexp -> string -> bool
end
The signature REGEXP describes a module that implements regular expressions. It consists of a description of the abstract syntax of regular expres-
sions, together with operations for parsing and unparsing them. The signature MATCHER describes a module that implements a matcher for a given
notion of regular expression. It contains a function accepts that, when
given a regular expression, returns a function that determines whether or
not that expression accepts a given string. Obviously the matcher is dependent on the implementation of regular expressions. This is expressed
by a structure specification that specifies a hierarchical dependence of an implementation of a matcher on an implementation of regular expressions: any implementation of the MATCHER signature must include an implementation of regular expressions as a constituent module. This ensures that
the matcher is self-contained, and does not rely on implicit conventions
for determining which implementation of regular expressions it employs.
The definition of the abstract syntax of regular expressions in the signature REGEXP takes the form of a datatype declaration that is reminiscent
of a context-free grammar, but which abstracts from matters of lexical presentation (such as precedences of operators, parenthesization, conventions
for naming the operators, etc.). The abstract syntax consists of six clauses, corresponding to the regular expressions 0, 1, a, r1+r2, r1 r2, and r*. The
functions parse and format specify the parser and unparser for regular
expressions. The parser takes a string as argument and yields a regular
expression; if the string is ill-formed, the parser raises the exception SyntaxError with an associated string describing the source of the error. The
unparser takes a regular expression and yields a string that parses to that
regular expression. In general there are many strings that parse to the
same regular expression; the unparser generally tries to choose one that
is easiest to read.
The implementation of the matcher consists of two modules: an implementation of regular expressions and an implementation of the matcher
itself. An implementation of a signature is called a structure. The implementation of the matching package consists of two structures, one implementing the signature REGEXP, the other implementing MATCHER. Thus the
overall package is implemented by the following two structure declarations:
structure RegExp :> REGEXP = ...
structure Matcher :> MATCHER = ...
The structure identifier RegExp is bound to an implementation of the REGEXP signature, and Matcher to an implementation of the MATCHER signature.
It might seem that one can apply Matcher.accepts to the output of RegExp.parse,
since Matcher.RegExp.parse is just RegExp.parse. However, this relationship is not
stated in the interface, so there is a pro forma distinction between the two. See Chapter 22
for more information on the subtle issue of sharing.
The use of long identifiers can get tedious at times. There are two typical methods for alleviating the burden. One is to introduce a synonym for
a long package name. Here's an example:
structure M = Matcher
structure R = M.RegExp
val regexp = R.parse "((a + %).(b + %))*"
val matches = M.accepts regexp
val ex1 = matches "aabba"
val ex2 = matches "abac"
Another is to open the structure, incorporating its bindings into the current environment:
open Matcher Matcher.RegExp
val regexp = parse "(a+b)*"
val matches = accepts regexp
val ex1 = matches "aabba"
val ex2 = matches "abac"
It is advisable to be sparing in the use of open because it is often hard to
anticipate exactly which bindings are incorporated into the environment
by its use.
Now let's look at the internals of the structures RegExp and Matcher. Here's a bird's-eye view of RegExp:
structure RegExp :> REGEXP = struct

  datatype regexp =
    Zero | One | Char of char |
    Plus of regexp * regexp |
    Times of regexp * regexp |
    Star of regexp

  ...

  fun tokenize s = ...

  ...

  fun parse s =
    let
      val (r, s) = ...
    in
      ...
    end

  ...

end
The symbol @ stands for the empty regular expression and the symbol
% stands for the regular expression accepting only the null string. Concatenation is indicated by ., alternation by +, and iteration by *.
We use a datatype declaration to introduce the type of tokens corresponding to the symbols of the input language. The function tokenize
has type char list -> token list; it transforms a list of characters into
a list of tokens. It is defined by a series of clauses that dispatch on the first
character of the list of characters given as input, yielding a list of tokens.
The correspondence between characters and tokens is relatively straightforward, the only non-trivial case being to admit the use of a backslash
to quote a reserved symbol as a character of input. (More sophisticated
languages have more sophisticated token structures; for example, words
(consecutive sequences of letters) are often regarded as a single token of
input.) Notice that it is quite natural to look ahead in the input stream
in the case of the backslash character, using a pattern that dispatches on
the first two characters (if there are such) of the input, and proceeding accordingly. (It is a lexical error to have a backslash at the end of the input.)
Lets turn to the parser. It is a simple recursive-descent parser implementing the precedence conventions for regular expressions given earlier.
These conventions may be formally specified by the following grammar,
which not only enforces the precedence conventions, but also allows for explicit parenthesization to override them:
rexp ::= rtrm | rtrm+rexp
rtrm ::= rfac | rfac.rtrm
rfac ::= ratm | ratm*
ratm ::= @ | % | a | (rexp)
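For instance, under this grammar iteration binds most tightly, then concatenation, then alternation, so the string a+b.c* stands for the abstract syntax tree

Plus (Char #"a", Times (Char #"b", Star (Char #"c")))

(a worked instance in terms of the datatype regexp above).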
Note that we incorporate the structure RegExp into the structure Matcher,
in accordance with the requirements of the signature. The function accepts
explodes the string into a list of characters (to facilitate sequential processing of the input), then calls match with an initial continuation that ensures that the remaining input is empty to determine the result. The type of match is

RegExp.regexp -> char list -> (char list -> bool) -> bool
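The code of the matcher itself does not survive in this excerpt. The following sketch reconstructs it in the continuation-passing style just described; the clause-by-clause details are a plausible rendering, not the verbatim original:

structure Matcher :> MATCHER = struct

  structure RegExp = RegExp

  open RegExp

  (* match r cs k: try to match a prefix of cs against r, passing
     the unconsumed characters to the continuation k *)
  fun match Zero _ _ = false
    | match One cs k = k cs
    | match (Char _) nil _ = false
    | match (Char c) (d::cs) k = (c = d) andalso k cs
    | match (Plus (r1, r2)) cs k =
        match r1 cs k orelse match r2 cs k
    | match (Times (r1, r2)) cs k =
        match r1 cs (fn cs' => match r2 cs' k)
    | match (r as Star r1) cs k =
        (* zero iterations, or one iteration followed by the rest *)
        k cs orelse match r1 cs (fn cs' => match r cs' k)

  fun accepts r s =
    match r (String.explode s) (fn nil => true | _ => false)

end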
Iteration is matched either by succeeding on the input as it stands, or by matching the body of the iteration once and then following up by the iteration once again. This neatly captures the "zero or more times" interpretation of iteration of a regular expression.
Important: the code given above contains a subtle error. Can
you find it? If not, see chapter 27 for further discussion!
This completes our brief overview of Standard ML. The remainder of
these notes is structured into three parts. The first part is a detailed introduction to the core language, the language in which we write programs in
ML. The second part is concerned with the module language, the means
by which we structure large programs in ML. The third is about programming techniques, methods for building reliable and robust programs. I
hope you enjoy it!
1.2 Sample Code
Part II
The Core Language
All Standard ML is divided into two parts. The first part, the core language, comprises the fundamental programming constructs of the language: the primitive types and operations, the means of defining and using functions, mechanisms for defining new types, and so on. The
second part, the module language, comprises the mechanisms for structuring programs into separate units and is described in Part III. Here we
introduce the core language.
Chapter 2
Types, Values, and Effects
2.1
2.2
For an expression to have type int is to say that its value (should it have one) is a number, and for an expression
to have type real is to say that its value (if any) is a floating point number.
In general we can think of the type of an expression as a prediction of
the form of the value that it has, should it have one. Every expression is
required to have at least one type; those that do are said to be well-typed.
Those without a type are said to be ill-typed; they are considered ineligible
for evaluation. The type checker determines whether or not an expression
is well-typed, rejecting with an error those that are not.
A well-typed expression is evaluated to determine its value, if indeed
it has one. An expression can fail to have a value because its evaluation
never terminates or because it raises an exception, either because of a runtime fault such as division by zero or because some programmer-defined
condition is signalled during its evaluation. If an expression has a value,
the form of that value is predicted by its type. For example, if an expression evaluates to a value v and its type is bool, then v must be either true
or false; it cannot be, say, 17 or 3.14. The soundness of the type system
ensures the accuracy of the predictions made by the type checker.
Evaluation of an expression might also engender an effect. Effects include such phenomena as raising an exception, modifying memory, performing input or output, or sending a message on the network. It is important to note that the type of an expression says nothing about its possible
effects! An expression of type int might well display a message on the
screen before returning an integer value. This possibility is not accounted
for in the type of the expression, which classifies only its value. For this
reason effects are sometimes called side effects, to stress that they happen "off to the side" during evaluation, and are not part of the value of the
expression. We will ignore effects until chapter 13. For the time being we
will assume that all expressions are effect-free, or pure.
2.2.1 Type Checking
What is a type? What types are there? Generally speaking, a type is defined by specifying three things:
a name for the type,
the values of the type, and
the operations that may be performed on values of the type.
Often the division of labor into values and operations is not completely
clear-cut, but it nevertheless serves as a very useful guideline for describing types.
Let's consider first the type of integers. Its name is int. The values of type int are the numerals 0, 1, ~1, 2, ~2, and so on. (Note that negative numbers are written with a prefix tilde, rather than a minus sign!)
Operations on integers include addition, +, subtraction, -, multiplication,
*, quotient, div, and remainder, mod. Arithmetic expressions are formed
in the familiar manner, for example, 3*2+6, governed by the usual rules
of precedence. Parentheses may be used to override the precedence conventions, just as in ordinary mathematical practice. Thus the preceding
expression may be equivalently written as (3*2)+6, but we may also write
3*(2+6) to override the default precedences.
The formation of expressions is governed by a set of typing rules that
define the types of expressions in terms of the types of their constituent expressions (if any). The typing rules are generally quite intuitive since they
are consistent with our experience in mathematics and in other programming languages. In their full generality the rules are somewhat involved,
but we will sneak up on them by first considering only a small fragment
of the language, building up additional machinery as we go along.
Here are some simple arithmetic expressions, written using infix notation for the operations (meaning that the operator comes between the
arguments, as is customary in mathematics):
3
3 + 4
4 div 3
4 mod 3
Each of these expressions is well-formed; in fact, they each have type
int. This is indicated by a typing assertion of the form exp : typ, which
states that the expression exp has the type typ. A typing assertion is said to
be valid iff the expression exp does indeed have the type typ. The following
are all valid typing assertions:
3 : int
3 + 4 : int
4 div 3 : int
4 mod 3 : int
Why are these typing assertions valid? In the case of the value 3, it
is an axiom that integer numerals have integer type. What about the expression 3+4? The addition operation takes two arguments, each of which
must have type int. Since both arguments in fact have type int, it follows that the entire expression is of type int. For more complex cases we
reason analogously, for example, deducing that (3+4) div (2+3): int by
observing that (3+4): int and (2+3): int.
The reasoning involved in demonstrating the validity of a typing assertion may be summarized by a typing derivation consisting of a nested
sequence of typing assertions, each justified either by an axiom, or a typing rule for an operation. For example, the validity of the typing assertion
(3+7) div 5 : int is justified by the following derivation:
1. (3+7): int, because
(a) 3 : int because it is an axiom
(b) 7 : int because it is an axiom
(c) the arguments of + must be integers, and the result of + is an
integer
2. 5 : int because it is an axiom
3. the arguments of div must be integers, and the result is an integer
The outermost steps justify the assertion (3+7) div 5 : int by demonstrating that the arguments each have type int. Recursively, the inner steps justify that (3+7): int.
2.2.2 Evaluation
Evaluation of expressions is defined by a set of evaluation rules that determine how the value of a compound expression is determined as a function
of the values of its constituent expressions (if any). Since the value of an
operator is determined by the values of its arguments, ML is sometimes
said to be a call-by-value language. While this may seem like the only sensible way to define evaluation, we will see in chapter 15 that this need not
be the case: some operations may yield a value without evaluating their
arguments. Such operations are sometimes said to be lazy, to distinguish
2.3
What types are there besides the integers? Here are a few useful base types
of ML:
Type name: real
Values: 3.14, 2.17, 0.1E6, . . .
Operations: +, -, *, /, =, <, . . .

Type name: char
Values: #"a", #"b", . . .
Operations: ord, chr, =, <, . . .

Type name: string
Values: "abc", "1234", . . .
Operations: ^, size, =, <, . . .

Type name: bool
Values: true, false
Operations: if exp then exp1 else exp2
There are many, many (in fact, infinitely many!) others, but these are
enough to get us started. (See Appendix A for a complete description of
the primitive types of ML, including the ones given above.)
Notice that some of the arithmetic operations for real numbers are written the same way as for the corresponding operation on integers. For example, we may write 3.1+2.7 to perform a floating point addition of two floating point numbers. This is called overloading; the addition operation is said to be overloaded at the types int and real. In an expression involving addition the type checker tries to resolve which form of addition (fixed point or floating point) you mean. If the arguments are ints, then fixed point addition is used; if the arguments are reals, then floating addition is used; otherwise an error is reported. (If the type of the arguments cannot be determined, the type defaults to int.) Note that ML does not perform any implicit conversions between types!
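For instance (an illustrative reconstruction; the original example expression does not survive in this excerpt):

val i : int = 3 + 4        (* fixed point addition *)
val r : real = 3.1 + 2.7   (* floating point addition *)
(* val x : real = 3 + 2.7   would be rejected: no implicit conversion *)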
2.4 Type Errors

Now that we have more than one type, we have enough rope to hang ourselves by forming ill-typed expressions. For example, the following expressions are not well-typed:

size 45
#"1" + 1
#"2" ^ "1"
3.14 + 2
2.5 Sample Code
Chapter 3
Declarations
3.1 Variables
By the same token a value binding might also be called a value abbreviation, but for
some reason it never is.
3.2 Basic Bindings

3.2.1 Type Bindings
Any type may be given a name using a type binding. At this stage we have
so few types that it is hard to justify binding type names to identifiers, but
we'll do it anyway because we'll need it later. Here are some examples of
type bindings:
type float = real
type count = int and average = real
The first type binding introduces the type constructor float, which subsequently is synonymous with real. The second introduces two type constructors, count and average, which stand for int and real, respectively.
In general a type binding introduces one or more new type constructors simultaneously, in the sense that the definitions of the type constructors
may not involve any of the type constructors being defined. Thus a binding such as
type float = real and average = float
is nonsensical (in isolation) since the type constructors float and average
are introduced simultaneously, and hence cannot refer to one another.
The syntax for type bindings is
type tycon1 = typ1
and ...
and tyconn = typn
where each tyconi is a type constructor and each typi is a type expression.
3.2.2 Value Bindings
A value may be given a name using a value binding. Here are some examples:
val m : int = 3+2
val pi : real = 3.14 and e : real = 2.17
The first binding introduces the variable m, specifying its type to be int
and its value to be 5. The second introduces two variables, pi and e, simultaneously, both having type real, and with pi having value 3.14 and
e having value 2.17. Notice that a value binding specifies both the type
and the value of a variable.
The syntax of value bindings is
val var1 : typ1 = exp1
and ...
and varn : typn = expn ,
where each vari is a variable, each typi is a type expression, and each expi
is an expression.
A value binding of the form
val var : typ = exp
is type-checked by ensuring that the expression exp has type typ. If not,
the binding is rejected as ill-formed. If so, the binding is evaluated using
the bind-by-value rule: first exp is evaluated to obtain its value val, then val
is bound to var. If exp does not have a value, then the declaration does not
bind anything to the variable var.
The purpose of a binding is to make a variable available for use within
its scope. In the case of a type binding we may use the type variable introduced by that binding in type expressions occurring within its scope. For
example, in the presence of the type bindings above, we may write
val pi : float = 3.14
since the type constructor float is bound to the type real, the type of the
expression 3.14. Similarly, we may make use of the variable introduced
by a value binding in value expressions occurring within its scope.
Continuing from the preceding binding, we may use the expression
Math.sin pi
to stand for 0.0 (approximately), and we may bind this value to a variable
by writing
val x : float = Math.sin pi
As these examples illustrate, type checking and evaluation are context
dependent in the presence of type and value bindings since we must refer
to these bindings to determine the types and values of expressions. For
example, to determine that the above binding for x is well-formed, we
must consult the binding for pi to determine that it has type float, consult
the binding for float to determine that it is synonymous with real, which
is necessary for the binding of x to have type float.
The rough-and-ready rule for both type-checking and evaluation is that
a bound variable or type constructor is implicitly replaced by its binding
prior to type checking and evaluation. This is sometimes called the substitution principle for bindings. For example, to evaluate the expression
Math.cos x in the scope of the above declarations, we first replace the occurrence of x by its value (approximately 0.0), then compute as before,
yielding (approximately) 1.0. Later on we will have to refine this simple
principle to take account of more sophisticated language features, but it is
useful nonetheless to keep this simple idea in mind.
3.3 Compound Declarations
The scopes of the declarations in a sequence are nested within one another: the scope of dec1 includes dec2 , . . . , decn , the
scope of dec2 includes dec3 , . . . , decn , and so on.
One thing to keep in mind is that binding is not assignment. The binding
of a variable never changes; once bound to a value, it is always bound to
that value (within the scope of the binding). However, we may shadow a
binding by introducing a second binding for a variable within the scope
of the first binding. Continuing the above example, we may write
val n : real = 2.17
to introduce a new variable n with both a different type and a different
value than the earlier binding. The new binding eclipses the old one,
which may then be discarded since it is no longer accessible. (Later on, we
will see that in the presence of higher-order functions shadowed bindings
are not always discarded, but are preserved as private data in a closure.
One might say that old bindings never die, they just fade away.)
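For instance, in the following pair of bindings (an illustration with arbitrarily chosen values):

val n : int = 17      (* first binding of n *)
val n : real = 2.17   (* shadows it: different type and value *)

After the second binding, n stands for the real number 2.17; the earlier integer binding is inaccessible.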
3.4 Limiting Scope
let
  val m : int = 3
  val n : int = m*m
in
  m*n
end
This expression has type int and value 27, as you can readily verify by
first calculating the bindings for m and n, then computing the value of m*n
relative to these bindings. The bindings for m and n are local to the expression m*n, and are not accessible from outside the expression.
If the declaration part of a let expression eclipses earlier bindings, the
ambient bindings are restored upon completion of evaluation of the let
expression. Thus the following expression evaluates to 54:
val m : int = 2
val r : int =
  let
    val m : int = 3
    val n : int = m*m
  in
    m*n
  end * m
The binding of m is temporarily overridden during the evaluation of the
let expression, then restored upon completion of this evaluation.
3.5
When a compound declaration is processed, a type environment records the types of the variables it introduces, and a value environment records their values. For example, after processing the compound declaration
val m : int = 0
val x : real = Math.sqrt(2.0)
val c : char = #"a"
the type environment contains the information
val m : int
val x : real
val c : char
and the value environment contains the information
val m = 0
val x = 1.414
val c = #"a"
In a sense the value declarations have been divided in half, separating
the type from the value information.
Thus we see that value bindings have significance for both type checking and evaluation. In contrast type bindings have significance only for
type checking, and hence contribute only to the type environment. A type
binding such as
type float = real
is recorded in its entirety in the type environment, and no change is made
to the value environment. Subsequently, whenever we encounter the type
constructor float in a type expression, it is replaced by real in accordance
with the type binding above.
In chapter 2 we said that a typing assertion has the form exp : typ, and that an evaluation assertion has the form exp ⇓ val. While two-place typing
and evaluation assertions are sufficient for closed expressions (those without variables), we must extend these relations to account for open expressions (those with variables). Each must be equipped with an environment
recording information about type constructors and variables introduced
by declarations.
Typing assertions are generalized to have the form

typenv ⊢ exp : typ
The turnstile symbol, ⊢, is simply a punctuation mark separating the type environment from the expression and its type.
. . . val var : typ . . . ⊢ var : typ

In words, if the specification val var : typ occurs in the type environment,
then we may conclude that the variable var has type typ. This rule glosses
over an important point. In order to account for shadowing we require
that the rightmost specification govern the type of a variable. That way
re-binding of variables with the same name but different types behaves as
expected.
Similarly, the evaluation relation must take account of the value environment. Evaluation of variables is governed by the following axiom:
. . . val var = val . . . ⊢ var ⇓ val
Here again we assume that the val specification is the rightmost one governing the variable var to ensure that the scoping rules are respected.
The role of the type equivalence assertion is to ensure that type constructors always stand for their bindings. This is expressed by the following axiom:
. . . type typvar = typ . . . ⊢ typvar ≡ typ
Once again, the rightmost specification for typvar governs the assertion.
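For example, combining the variable axiom with the typing rule for addition from chapter 2 justifies the assertion

. . . val m : int . . . ⊢ m + 3 : int

since m has type int by the axiom, 3 : int is an axiom, and the arguments and result of + are integers.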
3.6 Sample Code
Chapter 4
Functions
4.1 Functions as Templates
So far we just have the means to calculate the values of expressions, and to
bind these values to variables for future reference. In this chapter we will
introduce the ability to abstract the data from a calculation, leaving behind
the bare pattern of the calculation. This pattern may then be instantiated
as often as you like so that the calculation may be repeated with specified
data values plugged in.
For example, consider the expression 2*(3+4). The data might be taken to be the values 2, 3, and 4, leaving behind the pattern _ * (_ + _), with holes where the data used to be. We might equally well take the data to just be 2 and 3, and leave behind the pattern _ * (_ + 4). Or we might even regard * and + as the data, leaving 2 _ (3 _ 4) as the pattern! What is important is that a complete expression can be recovered by filling in the holes with chosen data.
Since a pattern can contain many different holes that can be independently instantiated, it is necessary to give names to the holes so that instantiation consists of plugging in a given value for all occurrences of a name
in an expression. These names are, of course, just variables, and instantiation is just the process of substituting a value for all occurrences of a
variable in a given expression. A pattern may therefore be viewed as a
function of the variables that occur within it; the pattern is instantiated by
applying the function to argument values.
This view of functions is similar to our experience from high school
4.2 Functions and Application
In ML a function is a kind of value, namely a value of function type of the form typ -> typ'. The type typ is the domain type (the type of arguments) of the function, and typ' is its range type (the type of its results). We compute with a function by applying it to an argument value of its domain type and calculating the result, a value of its range type. Function application is indicated by juxtaposition: we simply write the argument next to the function.
The values of function type consist of primitive functions, such as addition and square root, and function expressions, which are also called lambda expressions, of the form

fn var : typ => exp

The variable var is called the parameter, and the expression exp is called the body. It has type typ->typ' provided that exp has type typ' under the assumption that the parameter var has the type typ.
To apply such a function expression to an argument value val, we add
the binding
val var = val
to the value environment, and evaluate exp, obtaining a value val'. Then the value binding for the parameter is removed, and the result value, val', is returned as the value of the application.
For example, Math.sqrt is a primitive function of type real->real that
may be applied to a real number to obtain its square root. For example, the
expression Math.sqrt 2.0 evaluates to 1.414 (approximately). We can,
if we wish, parenthesize the argument, writing Math.sqrt (2.0) for the
sake of clarity; this is especially useful for expressions such as Math.sqrt
(Math.sqrt 2.0). The square root function is built in. We may write the
fourth root function as the following function expression:
fn x : real => Math.sqrt (Math.sqrt x)
It may be applied to an argument by writing an expression such as
(fn x : real => Math.sqrt (Math.sqrt x)) (16.0),
which calculates the fourth root of 16.0. The calculation proceeds by binding the variable x to the argument 16.0, then evaluating the expression
Math.sqrt (Math.sqrt x) in the presence of this binding. When evaluation completes, we drop the binding of x from the environment, since it is
no longer needed.
Notice that we did not give the fourth root function a name; it is an
anonymous function. We may give it a name using the declaration
forms introduced in chapter 3. For example, we may bind the fourth root
function to the variable fourthroot using the following declaration:
val fourthroot : real -> real =
fn x : real => Math.sqrt (Math.sqrt x)
We may then write fourthroot 16.0 to compute the fourth root of 16.0.
This notation for defining functions quickly becomes tiresome, so ML
provides a special syntax for function bindings that is more concise and
natural. Instead of using the val binding above to define fourthroot, we
may instead write
fun fourthroot (x:real):real = Math.sqrt (Math.sqrt x)
This declaration has the same meaning as the earlier val binding, namely
it binds fn x:real => Math.sqrt(Math.sqrt x) to the variable fourthroot.
It is important to note that function applications in ML are evaluated
according to the call-by-value rule: the arguments to a function are evaluated before the function is called. Put in other terms, functions are defined
to act on values, rather than on unevaluated expressions. Thus, to evaluate
an expression such as fourthroot (2.0+2.0), we proceed as follows:
1. Evaluate fourthroot to the function value fn x : real => Math.sqrt
(Math.sqrt x).
2. Evaluate the argument 2.0+2.0 to its value 4.0
3. Bind x to the value 4.0.
4. Evaluate Math.sqrt (Math.sqrt x) to 1.414 (approximately).
(a) Evaluate Math.sqrt to a function value (the primitive square
root function).
(b) Evaluate the argument expression Math.sqrt x to its value, approximately 2.0.
i. Evaluate Math.sqrt to a function value (the primitive square
root function).
ii. Evaluate x to its value, 4.0.
iii. Compute the square root of 4.0, yielding 2.0.
(c) Compute the square root of 2.0, yielding 1.414.
5. Drop the binding for the variable x.
Notice that we evaluate both the function and argument positions of an
application expression: both the function and argument are expressions
yielding values of the appropriate type. The value of the function position
must be a value of function type, either a primitive function or a lambda
expression, and the value of the argument position must be a value of the
domain type of the function. In this case the result value (if any) will be of
the range type of the function. Functions in ML are first-class, meaning that
they may be computed as the value of an expression. We are not limited to
applying only named functions, but rather may compute new functions
on the fly and apply these to arguments. This is a source of considerable
expressive power, as we shall see in the sequel.
Using similar techniques we may define functions with arbitrary domain and range. For example, the following are all valid function declarations:
fun pal (s:string):string = s ^ implode (rev (explode s))
fun is_even (n:int):bool = (n mod 2 = 0)
Thus pal "ot" evaluates to the string "otto", and is_even 4 evaluates to true.
4.3 Binding and Scope, Revisited
4.4 Sample Code
Chapter 5
Products and Records
5.1 Product Types

5.1.1 Tuples
This chapter is concerned with the simplest form of aggregate data structure, the n-tuple. An n-tuple is a finite ordered sequence of values of the
form
(val1 ,...,valn ),
where each vali is a value. A 2-tuple is usually called a pair, a 3-tuple a
triple, and so on.
An n-tuple is a value of a product type of the form
typ1 * ... * typn.
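For example (illustrative bindings in the notation just introduced, chosen arbitrarily):

val pair : int * string = (7, "seven")
val triple : int * real * string = (1, 2.0, "three")

Here (7, "seven") is a 2-tuple of the product type int * string.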
A tuple expression (exp1 ,...,expn ) is evaluated from left to right: it is equivalent to the compound expression

let
  val x1 = exp1
  ...
  val xn = expn
in
  (x1,...,xn)
end
5.1.2 Tuple Patterns
If we'd like we can even give names to the first and second components of
the pair, without decomposing them into constituent parts:
val (is:int*string,rc:real*char) = val
The general form of a value binding is
val pat = exp,
where pat is a pattern and exp is an expression. A pattern is one of three
forms:
1. A variable pattern of the form var:typ.
2. A tuple pattern of the form (pat1 ,...,patn ), where each pati is a pattern. This includes as a special case the null-tuple pattern, ().
3. A wildcard pattern of the form _.
The type of a pattern is determined by an inductive analysis of the form
of the pattern:
1. A variable pattern var:typ is of type typ.
2. A tuple pattern (pat1 ,...,patn ) has type typ1 * ... * typn , where each
pati is a pattern of type typi . The null-tuple pattern () has type unit.
3. The wildcard pattern _ has any type whatsoever.
A value binding of the form
val pat = exp
is well-typed iff pat and exp have the same type; otherwise the binding is
ill-typed and is rejected.
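For example, bindings such as the following are well-typed (illustrative reconstructions consistent with the rules just given; the original examples do not survive in this excerpt):

val (m:int, n:int) = (7+1, 4 div 2)
val (i:int, s:string) = (7, "seven")
val (x:real, _) = (Math.pi, "pi")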
Such a binding is evaluated by matching the tuple value componentwise against the tuple pattern, yielding one binding per component; a binding of the form _ = val is discarded. These simplifications are repeated until all bindings are irreducible, which leaves us with a set of variable bindings that constitute the result of pattern matching.
For example, evaluation of the binding
val ((m:int,n:int), (r:real, s:real)) = ((2,3),(2.0,3.0))
proceeds as follows. First, we decompose this binding into the following two bindings:

val (m:int, n:int) = (2,3)
val (r:real, s:real) = (2.0,3.0)

Decomposing each of these in turn leaves us with the four irreducible bindings

val m : int = 2
val n : int = 3
val r : real = 2.0
val s : real = 3.0
5.2 Record Types
Tuples are most useful when the number of positions is small. When the
number of components grows beyond a small number, it becomes difficult
to remember which position plays which role. In that case it is more natural to attach a label to each component of the tuple that mediates access to
it. This is the notion of a record type.
A record type has the form
{lab1 :typ1 ,...,labn :typn },
where n ≥ 0, and all of the labels labi are distinct. A record value has the
form
{lab1 =val1 ,...,labn =valn },
where vali has type typi . A record pattern has the form
{lab1 =pat1 ,...,labn =patn }
which has type
{lab1 :typ1 ,...,labn :typn }
provided that each pati has type typi .
A record value binding of the form
val
{lab1 =pat1 ,...,labn =patn } =
{lab1 =val1 ,...,labn =valn }
is decomposed into the following set of bindings
val pat1 = val1
and ...
and patn = valn .
Since the components of a record are identified by name, not position,
the order in which they occur in a record value or record pattern is not
important. However, in a record expression (in which the components may
not be fully evaluated), the fields are evaluated from left to right in the
order written, just as for tuple expressions.
Here are some examples to help clarify the use of record types. First,
let us define the record type hyperlink as follows:
type hyperlink =
{ protocol : string,
address : string,
display : string }
The record binding
val mailto_rwh : hyperlink =
{ protocol="mailto",
address="[email protected]",
display="Robert Harper" }
defines a variable of type hyperlink. The record binding
val { protocol=prot, display=disp, address=addr } = mailto_rwh
decomposes into the three variable bindings
val prot = "mailto"
val addr = "[email protected]"
val disp = "Robert Harper"
which extract the values of the fields of mailto_rwh.
Using wild cards we can extract selected fields from a record. For example, we may write
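val { protocol = prot, address = _, display = _ } = mailto_rwh

to bind prot to "mailto" while ignoring the remaining fields. (This display is a reconstruction consistent with the pattern forms above; the original example does not survive in this excerpt.)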
5.3
A function may bind more than one argument by using a pattern, rather
than a variable, in the argument position. Function expressions are generalized to have the form
fn pat => exp
where pat is a pattern and exp is an expression. Application of such a function proceeds much as before, except that the argument value is matched
against the parameter pattern to determine the bindings of zero or more
variables, which are then used during the evaluation of the body of the
function.
For example, we may make the following definition of the Euclidean
distance function:
val dist : real * real -> real =
  fn (x:real, y:real) => Math.sqrt (x*x + y*y)
This function may then be applied to a pair (a two-tuple!) of arguments to
yield the distance between them. For example, dist (2.0,3.0) evaluates
to (approximately) 3.61.
Using fun notation, the distance function may be defined more concisely as follows:
fun dist (x:real, y:real):real = Math.sqrt (x*x + y*y)
The meaning is the same as the more verbose val binding given earlier.
Keyword parameter passing is supported through the use of record patterns. For example, we may define the distance function using keyword
parameters as follows:
fun dist {x=x:real, y=y:real} = Math.sqrt (x*x + y*y)
The expression dist {x=2.0,y=3.0} invokes this function with the indicated x and y values.
Functions with multiple results may be thought of as functions yielding tuples (or records). For example, we might compute two different notions of distance between two points at once as follows:
5.4 Sample Code
Chapter 6
Case Analysis
6.1
Tuple types have the property that all values of that type have the same
form (n-tuples, for some n determined by the type); they are said to be
homogeneous. For example, all values of type int*real are pairs whose
first component is an integer and whose second component is a real. Any
type-correct pattern will match any value of that type; there is no possibility of failure of pattern matching. The pattern (x:int,y:real) is of
type int*real and hence will match any value of that type. On the other
hand the pattern (x:int,y:real,z:string) is of type int*real*string
and cannot be used to match against values of type int*real; attempting
to do so fails at compile time.
Other types have values of more than one form; they are said to be heterogeneous types. For example, a value of type int might be 0, 1, ~1, . . . or
a value of type char might be #"a" or #"z". (Other examples of heterogeneous types will arise later on.) Corresponding to each of the values of
these types is a pattern that matches only that value. Attempting to match
any other value against that pattern fails at execution time with an error
condition called a bind failure.
Here are some examples of pattern-matching against values of a heterogeneous type:
val 0 = 1-1
val (0,x) = (1-1, 34)
val (0, #"0") = (2-1, #"0")
The first two bindings succeed, the third fails. In the case of the second,
the variable x is bound to 34 after the match. No variables are bound in
the first or third examples.
6.2
case exp
of pat1 => exp1
| ...
| patn => expn
6.3
The type bool of booleans is perhaps the most basic example of a heterogeneous type. Its values are true and false. Functions may be defined
on booleans using clausal definitions that match against the patterns true
and false.
For example, the negation function may be defined clausally as follows:
fun not true = false
| not false = true
The conditional expression
if exp then exp1 else exp2
is short-hand for the case analysis
case exp
of true => exp1
| false => exp2
which is itself short-hand for the application
(fn true => exp1 | false => exp2 ) exp.
The short-circuit conjunction and disjunction operations are defined
as follows. The expression exp1 andalso exp2 is short for
if exp1 then exp2 else false
and the expression exp1 orelse exp2 is short for
if exp1 then true else exp2 .
You should expand these into case expressions and check that they behave
as expected. Pay particular attention to the evaluation order, and observe
that the call-by-value principle is not violated by these expressions.
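For instance, exp1 andalso exp2 unfolds step by step:

exp1 andalso exp2
  == if exp1 then exp2 else false
  == case exp1 of true => exp2 | false => false
  == (fn true => exp2 | false => false) exp1

The final form evaluates its argument exp1 first, then dispatches on the resulting boolean, evaluating exp2 only in the true case; so short-circuiting is consistent with call-by-value.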
6.4
Matches are subject to two forms of static checking. The first, called exhaustiveness checking, ensures that every value of the domain matches one of its clauses. The second, called redundancy checking, ensures that no clause of a match is subsumed by the clauses that precede it. This means that the set of values covered by a clause in a match
must not be contained entirely within the set of values covered by the preceding clauses of that match.
Redundant clauses are always a mistake: such a clause can never be
executed. Redundant rules often arise accidentally. For example, the second rule of the following clausal function definition is redundant:
fun not True = false
| not False = true
By capitalizing True we have turned it into a variable, rather than a constant pattern. Consequently, every value matches the first clause, rendering
the second redundant.
Since the clauses of a match are considered in the order they are written, redundancy checking is correspondingly order-sensitive. In particular, changing the order of clauses in a well-formed, irredundant match can
make it redundant, as in the following example:
fun recip (n:int) = 1 div n
| recip 0 = 0
The second clause is redundant because the first matches any integer value,
including 0.
Inexhaustive matches may or may not be in error, depending on whether
the match might ever be applied to a value that is not covered by any
clause. Here is an example of a function with an inexhaustive match that
is plausibly in error:
fun is numeric #"0" =
| is numeric #"1"
| is numeric #"2"
| is numeric #"3"
| is numeric #"4"
| is numeric #"5"
| is numeric #"6"
| is numeric #"7"
| is numeric #"8"
| is numeric #"9"
R EVISED 11.02.11
true
= true
= true
= true
= true
= true
= true
= true
= true
= true
D RAFT
V ERSION 1.2
When applied to, say, #"a", this function fails. Indeed, the function never
returns false for any argument!
Perhaps what was intended here is to include a catch-all clause at the
end:
fun is numeric #"0" = true
| is numeric #"1" = true
| is numeric #"2" = true
| is numeric #"3" = true
| is numeric #"4" = true
| is numeric #"5" = true
| is numeric #"6" = true
| is numeric #"7" = true
| is numeric #"8" = true
| is numeric #"9" = true
| is numeric = false
The addition of a final catch-all clause renders the match exhaustive, because any value not matched by the first ten clauses will surely be matched
by the eleventh.
Having said that, it is a very bad idea to simply add a catch-all clause
to the end of every match to suppress inexhaustiveness warnings from the
compiler. The exhaustiveness checker is your friend! Each such warning is
a suggestion to double-check that match to be sure that you've not made
a silly error of omission, but rather have intentionally left out cases that
are ruled out by the invariants of the program. In chapter 10 we will see
that the exhaustiveness checker is an extremely valuable tool for managing
code evolution.
6.5 Sample Code
Chapter 7
Recursive Functions
So far we've only considered very simple functions (such as the reciprocal
function) whose value is computed by a simple composition of primitive
functions. In this chapter we introduce recursive functions, the principal
means of iterative computation in ML. Informally, a recursive function is
one that computes the result of a call by possibly making further calls to
itself. Obviously, to avoid infinite regress, some calls must return their
results without making any recursive calls. Those that do must ensure
that the arguments are, in some sense, smaller so that the process will
eventually terminate.
This informal description obscures a central point, namely the means
by which we may convince ourselves that a function computes the result
that we intend. In general we must prove that for all inputs of the domain type, the body of the function computes the correct value of result
type. Usually the argument imposes some additional assumptions on the
inputs, called the pre-conditions. The correctness requirement for the result is called a post-condition. Our burden is to prove that for every input
satisfying the pre-conditions, the body evaluates to a result satisfying the
post-condition. In fact we may carry out such an analysis for many different pre- and post-condition pairs, according to our interest. For example,
the ML type checker proves that the body of a function yields a value of
the range type (if it terminates) whenever it is given an argument of the
domain type. Here the domain type is the pre-condition, and the range
type is the post-condition. In most cases we are interested in deeper properties, examples of which we shall consider below.
To prove the correctness of a recursive function (with respect to given
pre- and post-conditions) it is typically necessary to use some form of inductive reasoning. The base cases of the induction correspond to those
cases that make no recursive calls; the inductive step corresponds to those
that do. The beauty of inductive reasoning is that we may assume that the
recursive calls work correctly when showing that a case involving recursive calls is correct. We must separately show that the base cases satisfy
the given pre- and post-conditions. Taken together, these two steps are
sufficient to establish the correctness of the function itself, by appeal to an
induction principle that justifies the particular pattern of recursion.
No doubt this all sounds fairly theoretical. The point of this chapter is
to show that it is also profoundly practical.
7.1 Self-Reference and Recursion
In order for a function to call itself, it must have a name by which it can
refer to itself. This is achieved by using a recursive value binding, an
ordinary value binding qualified by the keyword rec. The simplest form
of a recursive value binding is as follows:
val rec var:typ = val.
As in the non-recursive case, the left-hand side is a pattern, but here the
right-hand side must be a value. In fact the right-hand side must be a
function expression, since only functions may be defined recursively in ML.
The function may refer to itself by using the variable var.
Here's an example of a recursive value binding:
val rec factorial : int->int =
fn 0 => 1 | n:int => n * factorial (n-1)
Using fun notation we may write the definition of factorial much more
clearly and concisely as follows:
fun factorial 0 = 1
| factorial (n:int) = n * factorial (n-1)
There is obviously a close correspondence between this formulation of
factorial and the usual textbook definition of the factorial function in
terms of the recurrence

0! = 1
n! = n * (n-1)!    (n > 0)
To see how applications of recursively defined functions are evaluated,
suppose that var is bound by a recursive value binding to a function given
by a sequence of clauses, and that we wish to apply var to the value val of
type typ. As before, we consider each clause in turn, until we find the first
pattern pati matching val. We proceed, as before, by evaluating expi,
replacing the variables in pati by the bindings determined by pattern
matching, but, in addition, we replace all occurrences of var by its binding
in expi before continuing evaluation.
For example, to evaluate factorial 3, we proceed by retrieving the
binding of factorial and evaluating
(fn 0=>1 | n:int => n*factorial(n-1))(3).
Considering each clause in turn, we find that the first doesn't match, but
the second does. We therefore continue by evaluating its right-hand side,
the expression n * factorial(n-1), after replacing n by 3 and factorial
by its definition. We are left with the sub-problem of evaluating the expression
3 * (fn 0 => 1 | n:int => n*factorial(n-1))(2)
Proceeding as before, we reduce this to the sub-problem of evaluating
3 * (2 * (fn 0=>1 | n:int => n*factorial(n-1))(1)),
which reduces to the sub-problem of evaluating
3 * (2 * (1 * (fn 0=>1 | n:int => n*factorial(n-1))(0))),
which reduces to
3 * (2 * (1 * 1)),
which then evaluates to 6, as desired.
Observe that the repeated substitution of factorial by its definition
ensures that the recursive calls really do refer to the factorial function itself.
Also observe that the size of the sub-problems grows until there are no
more recursive calls, at which point the computation can complete. In
broad outline, the computation proceeds as follows:
1. factorial 3

2. 3 * factorial 2

3. 3 * 2 * factorial 1

4. 3 * 2 * 1 * factorial 0

5. 3 * 2 * 1 * 1

6. 3 * 2 * 1

7. 3 * 2

8. 6
Notice that the size of the expression first grows (in direct proportion to
the argument), then shrinks as the pending multiplications are completed.
This growth in expression size corresponds directly to a growth in runtime storage required to record the state of the pending computation.
7.2 Iteration

The recursive definition of factorial given above builds up a chain of
pending multiplications. An alternative is to compute the factorial
iteratively, accumulating the running product in an additional argument to
a helper function that is kept local to the definition:
local
fun helper (0,r:int) = r
| helper (n:int,r:int) = helper (n-1,n*r)
in
fun factorial (n:int) = helper (n,1)
end
This way the helper function is not visible; only the function of interest is
exported by the declaration.
The important thing to observe about helper is that it is iterative, or tail
recursive, meaning that the recursive call is the last step of evaluation of an
application of it to an argument. This means that the evaluation trace of a
call to helper with arguments (3,1) has the following general form:
1. helper (3, 1)
2. helper (2, 3)
3. helper (1, 6)
4. helper (0, 6)
5. 6
Notice that there is no growth in the size of the expression because there
are no pending computations to be resumed upon completion of the recursive call. Consequently, there is no growth in the space required for an
application, in contrast to the first definition given above. Tail recursive
definitions are analogous to loops in imperative languages: they merely
iterate a computation, without requiring auxiliary storage.
7.3 Inductive Reasoning
Time and space usage are important, but what is more important is that
the function compute the intended result. The key to the correctness of
a recursive function is an inductive argument establishing its correctness.
The critical ingredients are these:
1. An input-output specification of the intended behavior stating pre-conditions
on the arguments and a post-condition on the result.
2. A proof that the specification holds for each clause of the function,
assuming that it holds for any recursive calls.
3. An induction principle that justifies the correctness of the function as
a whole, given the correctness of its clauses.
We'll illustrate the use of inductive reasoning by a graduated series
of examples. First consider the simple, non-tail recursive definition of
factorial given in section 7.1. One reasonable specification for factorial
is as follows:
1. Pre-condition: n >= 0.
2. Post-condition: factorial n evaluates to n!.
We are to establish the following statement of correctness of factorial:
if n >= 0, then factorial n evaluates to n!.
That is, we show that the pre-conditions imply the post-condition holds
for the result of any application. This is called a total correctness assertion
because it states not only that the post-condition holds of any result of
application, but, moreover, that every application in fact yields a result
(subject to the pre-condition on the argument).
In contrast, a partial correctness assertion does not insist on termination,
only that the post-condition holds whenever the application terminates.
This may be stated as the assertion
if n >= 0 and factorial n evaluates to p, then p = n!.
Notice that this statement is true of a function that diverges whenever it is
applied! In this sense a partial correctness assertion is weaker than a total
correctness assertion.
Let us establish the total correctness of factorial using the pre- and
post-conditions stated above. To do so, we apply the principle of mathematical induction on the argument n. Recall that this means we are to
establish the specification for the case n = 0, and, assuming it to hold for
n >= 0, show that it holds for n + 1. The base case, n = 0, is trivial:
by definition factorial n evaluates to 1, which is 0!. Now suppose that
n = m + 1 for some m >= 0. By the inductive hypothesis we have that
factorial m evaluates to m!, and so factorial n evaluates to n * m! =
(m+1) * m!, which is (m+1)! = n!, as required. This completes the
induction, and hence the proof of total correctness of factorial.
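A second example is the Fibonacci function, determined by the recurrence
F0 = F1 = 1 and Fn = Fn-1 + Fn-2. A sketch of the doubly recursive
definition that the recurrence immediately suggests:

fun fib 0 = 1
  | fib 1 = 1
  | fib (n:int) = fib (n-1) + fib (n-2)

Computing fib n requires computing fib (n-1) and fib (n-2); computing
fib (n-1) in turn requires computing fib (n-2) and fib (n-3).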
Computing fib (n-2) requires computing fib (n-3) again, and fib (n-4).
As you can see, there is considerable redundancy here. It can be shown
that the running time of fib is exponential in its argument, which is
quite awful.
Here's a better solution: for each n >= 0 compute not only the nth
Fibonacci number, but also the (n-1)st as well. (For n = 0 we define the
-1st Fibonacci number to be zero.) That way we can avoid redundant
recomputation, resulting in a linear-time algorithm. Here's the code:
(* for n>=0, fib n evaluates to (a, b), where
a is the nth Fibonacci number, and
b is the (n-1)st *)
fun fib 0 = (1, 0)
| fib 1 = (1, 1)
| fib (n:int) =
let
val (a:int, b:int) = fib (n-1)
in
(a+b, a)
end
You might feel satisfied with this solution since it runs in time linear in
n. It turns out (see Graham, Knuth, and Patashnik, Concrete Mathematics
(Addison-Wesley 1989) for a derivation) that the recurrence
F0 = 1
F1 = 1
Fn = Fn-1 + Fn-2    (n > 1)
has a closed-form solution over the real numbers. This means that the
nth Fibonacci number can be calculated directly, without recursion, by using floating point arithmetic. However, this is an unusual case. In most
instances recursively-defined functions have no known closed-form solution, so that some form of iteration is inevitable.
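For instance, here is a sketch of the direct calculation via the closed
form (Binet's formula), using the indexing F0 = F1 = 1 from the
recurrence above. The name fib_closed is introduced here; the real
result may be rounded with Real.round to recover the integer value for
small n:

fun fib_closed (n:int) : real =
  let
    val sqrt5 = Math.sqrt 5.0
    val phi = (1.0 + sqrt5) / 2.0   (* the golden ratio *)
    val psi = (1.0 - sqrt5) / 2.0
  in
    (Math.pow (phi, real (n+1)) - Math.pow (psi, real (n+1))) / sqrt5
  end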
7.4 Mutual Recursion

It sometimes happens that two functions must be defined in terms of one
another. The standard example is the pair of functions even and odd on
non-negative integers:
fun even 0 = true
  | even n = odd (n-1)
and odd 0 = false
  | odd n = even (n-1)
Notice that even calls odd and odd calls even, so they are not definable
separately from one another. We join their definitions using the keyword
and to indicate that they are defined simultaneously by mutual recursion.
7.5 Sample Code
Chapter 8
Type Inference and Polymorphism
8.1 Type Inference
So far we've mostly written our programs in what is known as the explicitly typed style. This means that whenever we've introduced a variable,
we've assigned it a type at its point of introduction. In particular every
variable in a pattern has a type associated with it. As you may have noticed, this gets a little tedious after a while, especially when you're using
clausal function definitions. A particularly pleasant feature of ML is that
it allows you to omit this type information whenever it can be determined
from context. This process is known as type inference since the compiler is
inferring the missing type information based on context.
For example, there is no need to give a type to the variable s in the
function
fn s:string => s ^ "\n".
The reason is that no other type for s makes sense, since s is used as an
argument to string concatenation. Consequently, you may write simply
fn s => s ^ "\n",
leaving ML to insert :string for you.
When is it allowable to omit this information? Almost always, with
very few exceptions. It is a deep, and important, result about ML that
the compiler can almost always reconstruct the omitted type information.
For example, in the function fn (x,y) => x+1, the use of integer addition
constrains the type of x to be int, while leaving the type of y unconstrained. This may be expressed using type schemes by writing this function in the explicitly-typed form fn (x:int,y:'a)=>x+1 with
type int*'a->int.
In these examples we needed only one type variable to express the
polymorphic behavior of a function, but usually we need more than one.
For example, the function fn (x,y) => x constrains neither the type of x
nor the type of y. Consequently we may choose their types freely and independently of one another. This may be expressed by writing this function in the form fn (x:'a,y:'b)=>x with type scheme 'a*'b->'a. Notice
that while it is correct to assign the type 'a*'a->'a to this function, doing
so would be overly restrictive since the types of the two parameters need
not be the same. However, we could not assign the type 'a*'b->'c to this
function because the type of the result must be the same as the type of
the first parameter: it returns its first parameter when invoked! The type
scheme 'a*'b->'a precisely captures the constraints that must be satisfied for the function to be type correct. It is said to be the most general or
principal type scheme for the function.
It is a remarkable fact about ML that every expression (with the exception of a few pesky examples that we'll discuss below) has a principal type
scheme. That is, there is (almost) always a best or most general way to infer
types for expressions that maximizes generality, and hence maximizes flexibility in the use of the expression. Every expression seeks its own depth
in the sense that an occurrence of that expression is assigned a type that is
an instance of its principal type scheme determined by the context of use.
For example, if we write
(fn x=>x)(0),
the context forces the type of the identity function to be int->int, and if
we write
(fn x=>x)(fn x=>x)(0)
the context forces the instance (int->int)->(int->int) of the principal
type scheme for the identity at the first occurrence, and the instance int->int
for the second.
How is this achieved? Type inference is a process of constraint satisfaction. First, the expression determines a set of equations governing the relationships among the types of its subexpressions. For example, if a function
is applied to an argument, then an equation is generated relating the domain type of the function to the type of the argument. Second, the equations are solved by a process known as unification; the solution, if one exists, determines the types of the subexpressions, and if no solution exists, the expression is rejected as ill-typed.
8.2 Polymorphic Definitions
When a variable is introduced by a val binding with a polymorphic right-hand side, to a first approximation the type checker behaves as though all uses of the bound variable are implicitly replaced by its binding before type checking. Since this may involve replication of the binding, the meaning of a program is not necessarily preserved by this transformation. (Think, for example, of any expression that opens a window on your screen: if you replicate the expression
and evaluate it twice, it will open two windows. This is not the same as
evaluating it only once, which results in one window.)
To ensure semantic consistency, variables introduced by a val binding
are allowed to be polymorphic only if the right-hand side is a value. This
is called the value restriction on polymorphic declarations. For fun bindings this restriction is always met since the right-hand side is implicitly a
lambda expression, which is a value. However, it might be thought that
the following declaration, in which I stands for the identity function
fn x => x, introduces a polymorphic variable of type 'a -> 'a; but in fact
it is rejected by the compiler:
val J = I I
The reason is that the right-hand side is not a value; it requires computation to determine its value. It is therefore ruled out as inadmissible for
polymorphism; the variable J may not be used polymorphically in the remainder of the program. In this case the difficulty may be avoided by
writing instead
fun J x = I I x
because now the binding of J is a lambda, which is a value.
In some rare circumstances this is not possible, and some polymorphism is lost. For example, the following declaration of a value of list
type (lists are introduced in chapter 9)
val l = nil @ nil
does not introduce an identifier with a polymorphic type, even though the
almost equivalent declaration
val l = nil
does do so. Since the right-hand side is a list, we cannot apply the trick
of defining l to be a function; we are stuck with a loss of polymorphism in
this case. This particular example is not very impressive, but occasionally
similar examples do arise in practice.
Why is the value restriction necessary? Later on, when we study mutable storage, we'll see that some restriction on polymorphism is essential if the language is to be type safe. The value restriction is an easily-remembered sufficient condition for soundness, but as the examples above
illustrate, it is by no means necessary. The designers of ML were faced
with a choice of simplicity vs. flexibility; in this case they opted for simplicity at the expense of some expressiveness in the language.
8.3 Overloading
Type information cannot always be omitted. There are a few corner cases
that create problems for type inference, most of which arise because of
concessions that are motivated by long-standing, if dubious, notational
practices.
The main source of difficulty stems from overloading of arithmetic operators. As a concession to long-standing practice in informal mathematics
and in many programming languages, the same notation is used for both
integer and floating point arithmetic operations. As long as we are programming in an explicitly-typed style, this convention creates no particular problems. For example, in the function
fn x:int => x+x
it is clear that integer addition is called for, whereas in the function
fn x:real => x+x
it is equally obvious that floating point addition is intended.
However, if we omit type information, then a problem arises. What are
we to make of the function
fn x => x+x ?
Does + stand for integer or floating point addition? There are two distinct reconstructions of the missing type information in this example, corresponding to the preceding two explicitly-typed programs. Which is the
compiler to choose?
When presented with such a program, the compiler has two choices:
1. Declare the expression ambiguous, and force the programmer to provide enough explicit type information to resolve the ambiguity.
2. Arbitrarily choose a default interpretation, say integer arithmetic, thereby forcing one interpretation or the other.

Each approach has its advantages and disadvantages. Many compilers
choose the second approach, but issue a warning indicating that they have
done so. To avoid ambiguity, explicit type information is required from
the programmer.
The situation is actually a bit more subtle than the preceding discussion implies. The reason is that the type inference process makes use of
the surrounding context of an expression to help resolve ambiguities. For
example, if the expression fn x=>x+x occurs in the following, larger expression, there is in fact no ambiguity:
(fn x => x+x)(3).
Since the function is applied to an integer argument, there is no question
that the only possible resolution of the missing type information is to treat
x as having type int, and hence to treat + as integer addition.
The important question is how much context is considered before the
situation is considered ambiguous? The rule of thumb is that context is
considered up to the nearest enclosing function declaration. For example,
consider the following example:
let
val double = fn x => x+x
in
(double 3, double 4)
end
The function expression fn x=>x+x will be flagged as ambiguous, even
though its only uses are with integer arguments. The reason is that value
bindings are considered to be units of type inference for which all ambiguity must be resolved before type checking continues. If your compiler
adopts the integer interpretation as default, the above program will be accepted (with a warning), but the following one will be rejected:
let
val double = fn x => x+x
in
(double 3.0, double 4.0)
end
Finally, note that the following program must be rejected because no
resolution of the overloading of addition can render it meaningful:
let
val double = fn x => x+x
in
(double 3, double 3.0)
end
The ambiguity must be resolved at the val binding, which means that the
compiler must commit at that point to treating the addition operation as
either integer or floating point. No single choice can be correct, since we
subsequently use double at both types.
A closely related source of ambiguity arises from the record elision
notation described in chapter 5. Consider the function #name, defined by
fun #name {name=n:string, ...} = n
which selects the name field of a record. This definition is ambiguous because the compiler cannot uniquely determine the domain type of the
function! Any of the following types are legitimate domain types for
#name, none of which is best:
{name:string}
{name:string,salary:real}
{name:string,salary:int}
{name:string,address:string}
Of course there are infinitely many such examples, none of which is clearly
preferable to the others. This function definition is therefore rejected as
ambiguous by the compiler: there is no one interpretation of the function
that suffices for all possible uses.
In chapter 5 we mentioned that functions such as #name are pre-defined
by the ML compiler, yet we just now claimed that such a function definition is rejected as ambiguous. Isn't this a contradiction? Not really: a
use of #name is permitted only in a context that determines the full record
type of its argument, so that no ambiguity remains; if the context leaves
the record type undetermined, the use is rejected.
8.4 Sample Code
Chapter 9
Programming with Lists
9.1 List Primitives
A list type is formed by applying the postfix type constructor list to an
element type typ to get another type, written typ list. The forms nil and :: are the value
constructors of type typ list. The nullary (no argument) constructor nil
may be thought of as the empty list. The binary (two argument) constructor :: constructs a non-empty list from a value h of type typ and another
value t of type typ list; the resulting value, h::t, of type typ list, is pronounced "h cons t" (for historical reasons). We say that h is cons'd onto t,
that h is the head of the list, and that t is its tail.
The definition of the values of type typ list given above is an example
of an inductive definition. The type is said to be recursive because this definition is self-referential in the sense that the values of type typ list are
defined in terms of (other) values of the same type. This is especially clear
if we examine the types of the value constructors for the type typ list:
val nil : typ list
val (op ::) : typ * typ list -> typ list
The notation op :: is used to refer to the :: operator as a function, rather
than to use it to form a list, which requires infix notation.
Two things are notable here:
1. The :: operation takes as its second argument a value of type typ
list, and yields a result of type typ list. This self-referential aspect
is characteristic of an inductive definition.
2. Both nil and op :: are polymorphic in the type of the underlying elements of the list. Thus nil is the empty list of type typ list for
any element type typ, and op :: constructs a non-empty list independently of the type of the elements of that list.
It is easy to see that a value val of type typ list has the form

val1 :: (val2 :: ( ... (valn :: nil) ... ))

for some n >= 0, where vali is a value of type typ for each 1 <= i <= n.
For according to the inductive definition of the values of type typ list,
the value val must either be nil, which is of the above form, or val1 :: val',
where val' is a value of type typ list. By induction val' has the form

(val2 :: ( ... (valn :: nil) ... ))

and hence val itself has the required form.
9.2 Computing With Lists
How do we compute with values of list type? Since the values are defined
inductively, it is natural that functions on lists be defined recursively, using
a clausal definition that analyzes the structure of a list. Here's a definition
of the function length that computes the number of elements of a list:

fun length nil = 0
  | length (_::t) = 1 + length t

The definition is given by induction on the structure of the list argument.
The base case is the empty list, nil. The inductive step is the non-empty
list _::t (notice that we do not need to give a name to the head). Its definition is given in terms of the tail of the list t, which is smaller than the
list _::t. The type of length is 'a list -> int; it is defined for lists of
values of any type whatsoever.
We may define other functions following a similar pattern. Here's the
function to append two lists:
fun append (nil, l) = l
| append (h::t, l) = h :: append (t, l)
This function is built into ML; it is written using infix notation as exp1 @
exp2 . The running time of append is proportional to the length of the first
list, as should be obvious from its definition.
Here's a function to reverse a list.
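A direct definition reverses the tail and appends the head at the end;
because append is linear in its first argument, this takes quadratic time.
A tail-recursive variant accumulates the reversed prefix in an extra
argument, in the style of the factorial helper above. The following is a
sketch of both definitions (rev' is a name introduced here for the
accumulator version; the names rev and helper match the correctness
argument below):

fun rev nil = nil
  | rev (h::t) = rev t @ [h]

local
  fun helper (nil, a) = a
    | helper (h::t, a) = helper (t, h::a)
in
  fun rev' l = helper (l, nil)
end

The claim to be proved is that helper (l, a) evaluates to (rev l) @ a
for every list l and accumulator a, from which the correctness of rev'
follows by taking a to be nil. We proceed by structural induction on l.
The base case, l = nil, is immediate: helper (nil, a) evaluates to a,
which is (rev nil) @ a. If l is the non-empty list h::t,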
then the application helper (l, a) reduces to the value of helper (t,
(h::a)). By the inductive hypothesis this is just (rev t) @ (h :: a),
which is equivalent to (rev t) @ [h] @ a. But this is just rev (h::t) @
a, which was to be shown.
The principle of structural induction may be summarized as follows.
To show that a function works correctly for every list l, it suffices to show
1. The correctness of the function for the empty list, nil, and
2. The correctness of the function for h::t, assuming its correctness for
t.
As with mathematical induction over the natural numbers, structural induction over lists allows us to focus on the basic and incremental behavior
of a function to establish its correctness for all lists.
9.3 Sample Code
Chapter 10
Concrete Data Types
10.1 Datatype Declarations
Lists are one example of the general notion of a recursive type. ML provides
a general mechanism, the datatype declaration, for introducing programmer-defined recursive types. Earlier we introduced type declarations as an abbreviation mechanism. Types are given names as documentation and as
a convenience to the programmer, but doing so is semantically inconsequential: one could replace all uses of the type name by its definition
and not affect the behavior of the program. In contrast the datatype declaration provides a means of introducing a new type that is distinct from
all other types and that does not merely stand for some other type. It is the
means by which the ML type system may be extended by the programmer.
The datatype declaration in ML has a number of facets. A datatype
declaration introduces
1. One or more new type constructors. The type constructors introduced may, or may not, be mutually recursive.
2. One or more new value constructors for each of the type constructors
introduced by the declaration.
The type constructors may take zero or more arguments; a zero-argument,
or nullary, type constructor is just a type. Each value constructor may
also take zero or more arguments; a nullary value constructor is just a
constant. The type and value constructors introduced by the declaration
are new in the sense that they are distinct from all other type and value
constructors.
10.2 Non-Recursive Datatypes
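The example below presupposes a datatype of card suits; a minimal sketch
of such a declaration, with the ranking Spades > Hearts > Diamonds >
Clubs assumed by the code that follows:

datatype suit = Spades | Hearts | Diamonds | Clubs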
fun outranks (Spades, Spades) = false
  | outranks (Spades, _) = true
  | outranks (Hearts, Spades) = false
  | outranks (Hearts, Hearts) = false
  | outranks (Hearts, _) = true
  | outranks (Diamonds, Clubs) = true
  | outranks (Diamonds, _) = false
  | outranks (Clubs, _) = false
This defines a function of type suit * suit -> bool that determines whether
or not the first suit outranks the second.
Data types may be parameterized by a type. For example, the following declaration (a sketch in the style of the option type of the Standard ML Basis Library)
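datatype 'a option = NONE | SOME of 'a

introduces the unary type constructor option with value constructors NONE
and SOME. For each type typ there is a type typ option whose values are
NONE and SOME val, where val is a value of type typ; for example, the
values of type int option are NONE, SOME 0, SOME 1, and so on.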
10.3 Recursive Datatypes
The next level of generality is the recursive type definition. For example,
one may define a type typ tree of binary trees with values of type typ at
the nodes using the following declaration:

datatype 'a tree =
  Empty |
  Node of 'a tree * 'a * 'a tree
This declaration corresponds to the informal definition of binary trees with
values of type typ at the nodes:
1. The empty tree Empty is a binary tree.
2. If tree 1 and tree 2 are binary trees, and val is a value of type typ, then
Node (tree 1, val, tree 2) is a binary tree.
3. Nothing else is a binary tree.
The distinguishing feature of this definition is that it is recursive in the
sense that binary trees are constructed out of other binary trees, with the
empty tree serving as the base case.
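The code below presupposes a pair of mutually recursive type constructors
for trees and forests; a sketch of the declaration assumed here (the
constructor names Node, None, and Tree match the code that follows):

datatype 'a tree =
  Empty |
  Node of 'a * 'a forest
and 'a forest =
  None |
  Tree of 'a tree * 'a forest

The keyword and joins the two type definitions, indicating that they are
simultaneously, and mutually, recursive. The size of a tree and of a
forest may then be computed by a pair of mutually recursive functions: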
fun size_tree Empty = 0
  | size_tree (Node (_, f)) = 1 + size_forest f
and size_forest None = 0
  | size_forest (Tree (t, f)) = size_tree t + size_forest f
Notice that we define the size of a tree in terms of the size of a forest, and
vice versa, just as the type of trees is defined in terms of the type of forests.
Many other variations are possible. Suppose we wish to define a notion
of binary tree in which data items are associated with branches, rather than
nodes. Here's a datatype declaration for such trees:

datatype 'a tree =
  Empty |
  Node of 'a branch * 'a branch
and 'a branch =
  Branch of 'a * 'a tree
In contrast to our first definition of binary trees, in which the branches
from a node to its children were implicit, we now make the branches themselves explicit, since data is attached to them.
For example, we can collect into a list the data items labelling the branches
of such a tree using the following code:
fun collect Empty = nil
| collect (Node (Branch (ld, lt), Branch (rd, rt))) =
ld :: rd :: (collect lt) @ (collect rt)
10.4 Heterogeneous Data Structures
Returning to the original definition of binary trees (with data items at the
nodes), observe that the type of the data items at the nodes must be the
same for every node of the tree. For example, a value of type int tree has
an integer at every node, and a value of type string tree has a string at
every node. Therefore an expression such as
Node (Empty, 43, Node (Empty, "43", Empty))
is ill-typed. The type system insists that trees be homogeneous in the sense
that the type of the data items is the same at every node.
It is quite rare to encounter heterogeneous data structures in real programs. For example, a dictionary with strings as keys might be represented as a binary search tree with strings at the nodes; there is no need
for heterogeneity to represent such a data structure. But occasionally one
might wish to work with a heterogeneous tree, whose data values at each
node are of different types. How would one represent such a thing in ML?
To discover the answer, first think about how one might manipulate
such a data structure. When accessing a node, we would need to check
at run-time whether the data item is an integer or a string; otherwise we
would not know whether to, say, add 1 to it, or concatenate "1" to the
end of it. This suggests that the data item must be labelled with sufficient
information so that we may determine the type of the item at run-time. We
must also be able to recover the underlying data item itself so that familiar
operations (such as addition or string concatenation) may be applied to it.
The required labelling and discrimination is neatly achieved using a
datatype declaration. Suppose we wish to represent the type of integer-or-string trees. First, we define the type of values to be integers or strings,
marked with a constructor indicating which:

datatype int_or_string =
  Int of int |
  String of string

Then we define the type of interest as follows:

type int_or_string_tree =
  int_or_string tree
Voila! Perfectly natural and easy: heterogeneity is really a special case of
homogeneity!
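As a brief illustration, values of type int_or_string are discriminated by
pattern matching, recovering the underlying item together with the
operations appropriate to it; a sketch (to_string is a name introduced
here):

fun to_string (Int n) = Int.toString n
  | to_string (String s) = s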
10.5 Abstract Syntax
Datatype declarations and pattern matching are extremely useful for defining and manipulating the abstract syntax of a language. For example, we
may define a small language of arithmetic expressions using the following
declaration:
datatype expr =
Numeral of int |
Plus of expr * expr |
Times of expr * expr
This definition has only three clauses, but one could readily imagine adding
others. Here is the definition of a function to evaluate expressions of the
language of arithmetic expressions written using pattern matching:
fun eval (Numeral n) = Numeral n
| eval (Plus (e1, e2)) =
let
val Numeral n1 = eval e1
val Numeral n2 = eval e2
in
Numeral (n1+n2)
end
| eval (Times (e1, e2)) =
let
val Numeral n1 = eval e1
val Numeral n2 = eval e2
in
Numeral (n1*n2)
end
The combination of datatype declarations and pattern matching contributes enormously to the readability of programs written in ML. A less
obvious, but more important, benefit is the error checking that the compiler can perform for you if you use these mechanisms in tandem. As an
example, suppose that we extend the type expr with a new component for
the reciprocal of a number, yielding the following revised definition:
datatype expr =
Numeral of int |
Plus of expr * expr |
Times of expr * expr |
Recip of expr
First, observe that the old definition of eval is no longer applicable to
values of type expr! For example, the expression Recip (Numeral 2) is a
well-formed value of the extended type, yet the old eval provides no clause
for Recip. The compiler's exhaustiveness checker flags the match in the old
definition as inexhaustive, pinpointing exactly the case that must be added.
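Here is a sketch of the extended evaluator, with one plausible
interpretation of Recip as integer reciprocal (so that a zero argument
raises the pre-defined exception Div):

fun eval (Numeral n) = Numeral n
  | eval (Plus (e1, e2)) =
      let
        val Numeral n1 = eval e1
        val Numeral n2 = eval e2
      in
        Numeral (n1+n2)
      end
  | eval (Times (e1, e2)) =
      let
        val Numeral n1 = eval e1
        val Numeral n2 = eval e2
      in
        Numeral (n1*n2)
      end
  | eval (Recip e) =
      let
        val Numeral n = eval e
      in
        Numeral (1 div n)
      end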
The value of the checks provided by the compiler in such cases cannot be
overestimated. When recompiling a large program after making a change
to a datatype declaration the compiler will automatically point out every
line of code that must be changed to conform to the new definition; it is
impossible to forget to attend to even a single case. This is a tremendous
help to the developer, especially if she is not the original author of the code
being modified, and is another reason why the static type discipline of ML
is a positive benefit, rather than a hindrance, to programmers.
10.6 Sample Code
Chapter 11
Higher-Order Functions
11.1 Functions as Values
Values of function type are first-class, which means that they have the same
rights and privileges as values of any other type. In particular, functions
may be passed as arguments and returned as results of other functions,
and functions may be stored in and retrieved from data structures such
as lists and trees. We will see that first-class functions are an important
source of expressive power in ML.
Functions which take functions as arguments or yield functions as results are known as higher-order functions (or, less often, as functionals or
operators). Higher-order functions arise frequently in mathematics. For
example, the differential operator is the higher-order function that, when
given a (differentiable) function on the real line, yields its first derivative
as a function on the real line. We also encounter functionals mapping functions to real numbers, and real numbers to functions. An example of the
former is provided by the definite integral viewed as a function of its integrand, and an example of the latter is the definite integral of a given
function on the interval [a, x], viewed as a function of a, that yields the
area under the curve from a to x as a function of x.
Higher-order functions are less familiar tools for many programmers
since the best-known programming languages have only rudimentary mechanisms to support their use. In contrast higher-order functions play a
prominent role in ML, with a variety of interesting applications. Their
use may be classified into two broad categories:
1. Abstracting patterns of control. Higher-order functions are design patterns that abstract out the details of a computation to lay bare the
skeleton of the solution. The skeleton may be fleshed out to form a
solution of a problem by applying the general pattern to arguments
that isolate the specific problem instance.
2. Staging computation. It arises frequently that computation may be
staged by expending additional effort early to simplify the computation of later results. Staging can be used both to improve efficiency and, as we will see later, to control sharing of computational
resources.
11.2 Binding and Scope, Revisited
Before discussing these programming techniques, we will review the critically important concept of scope as it applies to function definitions. Recall
that Standard ML is a statically scoped language, meaning that identifiers
are resolved according to the static structure of the program. A use of the
variable var is considered to be a reference to the nearest lexically enclosing
declaration of var. We say nearest because of the possibility of shadowing; if we re-declare a variable var, then subsequent uses of var refer to the
most recent (lexically!) declaration of it; any previous declarations are
temporarily shadowed by the latest one.
This principle is easy to apply when considering sequences of declarations. For example, it should be clear by now that the variable y is bound
to 32 after processing the following sequence of declarations:
val x = 2      (* x=2 *)
val y = x*x    (* y=4 *)
val x = y*x    (* x=8 *)
val y = x*y    (* y=32 *)
In the presence of function definitions the situation is the same, but it can
be a bit tricky to understand at first.
Here's an example to test your grasp of the lexical scoping principle:
val x = 2
fun f y = x+y
val x = 3
val z = f 4

(The value of z is 6, not 7: the occurrence of x in the body of f refers to
the binding of x to 2 that was in scope when f was declared, not to the
later re-declaration.)
11.3 Returning Functions
While seemingly very simple, the principle of lexical scope is the source of
considerable expressive power. We'll demonstrate this through a series of
examples.
To warm up let's consider some simple examples of passing functions
as arguments and yielding functions as results. The standard example of
passing a function as argument is the map function, which applies a given
function to every element of a list. It is defined as follows:
fun map (f, nil) = nil
| map (f, h::t) = (f h) :: map (f, t)
For example, the application
map (fn x => x+1, [1,2,3,4])
evaluates to the list [2,3,4,5].
Functions may also yield functions as results. What is surprising is that
we can create new functions during execution, not just return functions
that have been previously defined. The most basic (and deceptively simple) example is the function constantly that creates constant functions:
given a value k, the application constantly k yields a function that yields
k whenever it is applied. Here's a definition of constantly:
val constantly = fn k => (fn a => k)
The function constantly has type 'a -> ('b -> 'a). We used the fn notation for clarity, but the declaration of the function constantly may also
be written using fun notation as follows:
fun constantly k a = k
Note well that a white space separates the two successive arguments to
constantly! The meaning of this declaration is precisely the same as the
earlier definition using fn notation.
The value of the application constantly 3 is the function that is constantly 3; i.e., it always yields 3 when applied. Yet nowhere have we defined the function that always yields 3. The resulting function is created
by the application of constantly to the argument 3, rather than merely
retrieved off the shelf of previously-defined functions. In implementation terms the result of the application constantly 3 is a closure consisting
of the function fn a => k with the environment val k = 3 attached to it.
The closure is a data structure (a pair) that is created by each application of
constantly to an argument; the closure is the representation of the new
function yielded by the application. Notice, however, that the only difference between any two results of applying the function constantly lies in
the attached environment; the underlying function is always fn a => k. If
we think of the lambda as the executable code of the function, then this
amounts to the observation that no new code is created at run-time, just
new instances of existing code.
This also points out why functions in ML are not the same as code
pointers in C. You may be familiar with the idea of passing a pointer to
a C function to another C function as a means of passing functions as arguments or yielding functions as results. This may be considered to be a
form of higher-order function in C, but it must be emphasized that code
pointers are significantly less powerful than closures because in C there
are only statically many possibilities for a code pointer (it must point to one
of the functions defined in your code), whereas in ML we may generate dynamically many different instances of a function, differing in the bindings
of the variables in its environment. The non-varying part of the closure,
the code, is directly analogous to a function pointer in C, but there is no
counterpart in C of the varying part of the closure, the dynamic environment.
The definition of the function map given above takes a function and a list
as arguments, yielding a new list as result. Often it occurs that we wish to
map the same function across several different lists. It is inconvenient (and
a tad inefficient) to keep passing the same function to map, with the list
argument varying each time. Instead we would prefer to create an instance
of map specialized to the given function that can then be applied to many
different lists. This leads to the following definition of the function map:
fun map f nil = nil
| map f (h::t) = (f h) :: (map f t)
The function map so defined has type ('a->'b) -> 'a list -> 'b list.
It takes a function of type 'a -> 'b as argument, and yields another function of type 'a list -> 'b list as result.
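For instance, partial application yields a specialized instance of map that
may then be applied to many different lists (incr is a name introduced
here for illustration):

val incr = map (fn x => x + 1)
val l1 = incr [1,2,3]    (* yields [2,3,4] *)
val l2 = incr [4,5,6]    (* yields [5,6,7] *)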
11.4 Patterns of Control

Consider the following two functions, which sum, respectively multiply,
the elements of a list of integers; they clearly exhibit a common pattern:
fun add_up nil = 0
  | add_up (h::t) = h + add_up t

fun mul_up nil = 1
  | mul_up (h::t) = h * mul_up t
What precisely is the similarity? We will look at it from two points of view.
One view is that in each case we have a binary operation and a unit
element for it. The result on the empty list is the unit element, and the
result on a non-empty list is the operation applied to the head of the list
and the result on the tail. This pattern can be abstracted as the function
reduce defined as follows:
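Here is a sketch of that definition (the parameter names unit and opn are
those used in the discussion of staging below):

fun reduce (unit, opn, nil) = unit
  | reduce (unit, opn, h::t) = opn (h, reduce (unit, opn, t))

With this definition, add_up l may be computed as reduce (0, op +, l),
and mul_up l as reduce (1, op *, l).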
11.5 Staging
An interesting variation on reduce may be obtained by staging the computation. The motivation is that unit and opn often remain fixed for many
different lists (e.g., we may wish to sum the elements of many different
lists). In this case unit and opn are said to be early arguments and the
list is said to be a late argument. The idea of staging is to perform as
much computation as possible on the basis of the early arguments, yielding a function of the late arguments alone.
In the case of the function reduce this amounts to building red on the
basis of unit and opn, yielding it as a function that may be later applied to
many different lists. Here's the code:
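A sketch of staged_reduce, which consumes the early arguments and builds
the recursive function red before any list is supplied:

fun staged_reduce (unit, opn) =
  let
    fun red nil = unit
      | red (h::t) = opn (h, red t)
  in
    red
  end

The result may be bound once and applied to many different lists; for
example, val sum = staged_reduce (0, op +) yields a summing function of
type int list -> int.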
The time saved by staging the computation in the definition of staged_reduce
is admittedly minor. But consider the following definition of an append
function for lists that takes both arguments at once:
fun append (nil, l) = l
| append (h::t, l) = h :: append(t,l)
Suppose that we will have occasion to append many lists to the end of a
given list. What we'd like is to build a specialized appender for the first
list that, when applied to a second list, appends the second to the end of
the first. Here's a naive solution that merely curries append:
fun curried_append nil l = l
  | curried_append (h::t) l = h :: curried_append t l
Unfortunately this solution doesn't exploit the fact that the first argument
is fixed for many second arguments. In particular, each application of the
result of applying curried_append to a list results in the first list being
traversed so that the second can be appended to it.
We can improve on this by staging the computation as follows:
fun staged_append nil = (fn l => l)
  | staged_append (h::t) =
      let
        val tail_appender = staged_append t
      in
        fn l => h :: tail_appender l
      end
Notice that the first list is traversed once for all applications to a second argument. When applied to a list [v1 ,...,vn ], the function staged_append
yields a function that is equivalent to, but not quite as efficient as, the
function
fn l => v1 :: v2 :: ... :: vn :: l.
This still takes time proportional to n, but a substantial savings accrues
from avoiding the pattern matching required to destructure the original
list argument on each call.
11.6 Sample Code
Chapter 12
Exceptions
In chapter 2 we mentioned that expressions in Standard ML always have a
type, may have a value, and may have an effect. So far weve concentrated
on typing and evaluation. In this chapter we will introduce the concept of
an effect. While it's hard to give a precise general definition of what we
mean by an effect, the idea is that an effect is any action resulting from
evaluation of an expression other than returning a value. From this point
of view we might consider non-termination to be an effect, but we don't
usually think of failure to terminate as a positive action in its own right,
rather as a failure to take any action.
The main examples of effects in ML are these:
1. Exceptions. Evaluation may be aborted by signaling an exceptional
condition.
2. Mutation. Storage may be allocated and modified during evaluation.
3. Input/output. It is possible to read from an input source and write to
an output sink during evaluation.
4. Communication. Data may be sent to and received from communication channels.
This chapter is concerned with exceptions; the other forms of effects will
be considered later.
12.1 Exceptions as Errors
ML is a safe language in the sense that its execution behavior may be understood entirely in terms of the constructs of the language itself. Behaviors such as dumping core or incurring a bus error are extra-linguistic
notions that may only be explained by appeal to the underlying implementation of the language. These cannot arise in ML. This is ensured by
a combination of a static type discipline, which rules out expressions that
are manifestly ill-defined (e.g., adding a string to an integer or casting an
integer as a function), and by dynamic checks that rule out violations that
cannot be detected statically (e.g., division by zero or arithmetic overflow).
Static violations are signalled by type checking errors; dynamic violations
are signalled by raising exceptions.
12.1.1 Primitive Exceptions
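For instance, the pre-defined exceptions of the Standard ML Basis Library
include Div, raised on division by zero, and Overflow, raised when the
result of an arithmetic operation exceeds the representable range. A
minimal illustration:

val n = 3 div 0    (* well-typed, but aborts with the uncaught exception Div *)

The violation cannot be detected statically; it is signalled, by raising the
exception, only at run time.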
12.1.2 User-Defined Exceptions
So far we have considered examples of pre-defined exceptions that indicate fatal error conditions. Since the built-in exceptions have a built-in
meaning, it is generally inadvisable to use these to signal program-specific error conditions. Instead we introduce a new exception using an
exception declaration, and signal it using a raise expression when a run-time violation occurs. That way we can associate specific exceptions with
specific pieces of code, easing the process of tracking down the source of
the error.
Suppose that we wish to define a checked factorial function that ensures that its argument is non-negative. Here's a first attempt at defining
such a function:

exception Factorial

fun checked_factorial n =
  if n < 0 then
    raise Factorial
  else if n = 0 then
    1
  else
    n * checked_factorial (n-1)
The declaration exception Factorial introduces an exception Factorial,
which we raise in the case that checked_factorial is applied to a negative
number.
The definition of checked_factorial is unsatisfactory in at least two
respects. One, relatively minor, issue is that it does not make effective use
of pattern matching, but instead relies on explicit comparison operations.
To some extent this is unavoidable since we wish to check explicitly for
negative arguments, which cannot be done using a pattern. A more significant problem is that checked_factorial repeatedly checks the validity
of its argument on each recursive call, even though we can prove that if
the initial argument is non-negative, then so must be the argument on each
recursive call. This fact is not reflected in the code. We can improve the
definition by introducing an auxiliary function:
exception Factorial

local
  fun fact 0 = 1
    | fact n = n * fact (n-1)
in
  fun checked_factorial n =
    if n >= 0 then
      fact n
    else
      raise Factorial
end
Notice that we perform the range check exactly once, and that the auxiliary
function makes effective use of pattern-matching.
12.2 Exception Handlers

An exception handler, written exp handle match, allows a computation to
recover from a raised exception: if evaluation of exp raises an exception,
the clauses of match are consulted, in order, for the first whose pattern
matches the exception. A typical example is the following driver loop,
sketched here on the assumption of a helper read_integer that reads an
integer from the input, raising EndOfFile at the end of the input and
SyntaxError on malformed input:
fun factorial_driver () =
  let
    val n = read_integer ()
    val _ = print (Int.toString (checked_factorial n))
  in
    factorial_driver ()
  end
  handle EndOfFile => print "Done."
       | SyntaxError =>
           let
             val _ = print "Syntax error."
           in
             factorial_driver ()
           end
       | Factorial =>
           let
             val _ = print "Out of range."
           in
             factorial_driver ()
           end
We will return to a more detailed discussion of input/output later in these
notes. The point to notice here is that the code is structured with a completely uncluttered normal path that reads an integer, computes its factorial, formats it, prints it, and repeats. The exception handler takes care
of the exceptional cases: end of file, syntax error, and domain error. In the
latter two cases we report an error, and resume reading. In the former we
simply report completion and we are done.
The reader is encouraged to imagine how one might structure this program without the use of exceptions. The primary benefits of the exception
mechanism are as follows:
1. They force you to consider the exceptional case (if you don't, you'll
get an uncaught exception at run-time), and
2. They allow you to segregate the special case from the normal case in
the code (rather than clutter the code with explicit checks).
These aspects work hand-in-hand to facilitate writing robust programs.
A typical use of exceptions is to implement backtracking, a programming technique based on exhaustive search of a state space. A very simple, if somewhat artificial, example is provided by the following function
to compute change from an arbitrary list of coin values. What is at issue
is that the obvious greedy algorithm for making change, which always
uses the largest available coin value first, does not always succeed: with
coin values 5 and 2, it makes change for 16 as 5 and 5 and 5, and then
gets stuck, even though change can be made as 5, 5, 2, 2, and 2.
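Here is a sketch of a backtracking change function in this style (the
exception Change signals failure of the current line of search, triggering
backtracking in the handler):

exception Change

fun change _ 0 = nil
  | change nil _ = raise Change
  | change (coin::coins) amt =
      if coin > amt then
        change coins amt
      else
        (coin :: change (coin::coins) (amt-coin))
        handle Change => change coins amt

For example, change [5,2] 16 evaluates to [5,5,2,2,2]: the attempt to use
a third 5 fails deeper in the recursion, the handler catches Change, and the
search resumes without that coin.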
12.3 Value-Carrying Exceptions
So far exceptions are just signals that indicate that an exceptional condition has arisen. Often it is useful to attach additional information that
is passed to the exception handler. This is achieved by attaching values to
exceptions.
For example, we might associate with a SyntaxError exception a string
indicating the precise nature of the error. In a parser for a language we
might write something like
raise SyntaxError "Integer expected"
to indicate a malformed expression in a situation where an integer is expected, and write
raise SyntaxError "Identifier expected"
to indicate a badly-formed identifier.
To associate a string with the exception SyntaxError, we declare it as
follows:

exception SyntaxError of string

An exception declared with of typ carries a value of type typ when raised,
and that value is recovered by pattern matching in a handler, as in the
clause SyntaxError msg => print msg. More generally, the raise
expression takes a value of type
exn as argument and raises an exception with that value. The clauses of
a handler may be applied to any value of type exn using the rules of pattern matching described earlier; if an exception constructor is no longer in
scope, then the handler cannot catch it (other than via a wild-card pattern).
The type exn may be thought of as a kind of built-in datatype, except
that the constructors of this type are not determined once and for all (as
they are with a datatype declaration), but rather are incrementally introduced as needed in a program. For this reason the type exn is sometimes
called an extensible datatype.
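As a sketch of how such values are analyzed, a function over type exn may
be written by ordinary pattern matching (describe is a name introduced
here; it assumes the value-carrying SyntaxError declared above):

fun describe (SyntaxError msg) = "syntax error: " ^ msg
  | describe Factorial = "factorial of a negative number"
  | describe _ = "some other exception"

The wild-card clause is required for exhaustiveness, precisely because the
type exn may always be extended with further constructors.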
12.4 Sample Code
Chapter 13
Mutable Storage
In this chapter we consider a second form of effect, called a storage effect,
the allocation or mutation of storage during evaluation. The introduction
of storage effects has profound consequences, not all of which are desirable. (Indeed, one connotation of the phrase side effect is an unintended
consequence of a medication!) While it is excessive to dismiss storage effects as completely undesirable, it is advantageous to minimize the use of
storage effects except in situations where the task clearly demands them.
We will explore some techniques for programming with storage effects
later in this chapter, but first we introduce the primitive mechanisms for
programming with mutable storage in ML.
13.1 Reference Cells
To support mutable storage the execution model that we described in chapter 2 is enriched with a memory consisting of a finite set of mutable cells. A
mutable cell may be thought of as a container in which a data value of a
specified type is stored. During execution of a program the contents of
a cell may be retrieved or replaced by any other value of the appropriate
type. Since cells are used by issuing commands to modify and retrieve
their contents, programming with cells is called imperative programming.
Changing the contents of a mutable cell introduces a temporal aspect
to evaluation. We speak of the current contents of a cell, meaning the value
most recently assigned to it. We also speak of previous and future values of a
reference cell when discussing the behavior of a program. This is in sharp
contrast to the effect-free fragment of ML, for which no such concepts apply. For example, the binding of a variable does not change while evaluating within the scope of that variable, lending a permanent quality
to statements about variables: the current binding is the only binding
that variable will ever have.
The type typ ref is the type of reference cells containing values of type
typ. Reference cells are, like all values, first class: they may be bound
to variables, passed as arguments to functions, returned as results of functions, appear within data structures, and even be stored within other reference cells.
A reference cell is created, or allocated, by the function ref of type typ ->
typ ref. When applied to a value val of type typ, ref allocates a new cell,
initializes its content to val, and returns a reference to the cell. By "new"
we mean that the allocated cell is distinct from all other cells previously
allocated, and does not share storage with them.
The contents of a cell of type typ is retrieved using the function ! of type
typ ref -> typ. Applying ! to a reference cell yields the current contents
of that cell. The contents of a cell is changed by applying the assignment
operator op :=, which has type typ ref * typ -> unit. Assignment is
usually written using infix syntax. When applied to a cell and a value, it
replaces the content of that cell with that value, and yields the null-tuple
as result.
Here are some examples:
val r = ref 0
val s = ref 0
val _ = r := 3
val x = !s + !r
val t = r
val _ = t := 5
val y = !s + !r
val z = !t + !r
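Reading these bindings in order: after r := 3, the binding of x is
!s + !r = 0 + 3 = 3. The binding val t = r makes t and r refer to the
same cell, so t := 5 changes the contents of that shared cell; hence y is
0 + 5 = 5, and z is !t + !r = 5 + 5 = 10.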
A binding of the form val _ = exp, as used twice above, is appropriate when the
expression exp has type unit, so that its value is guaranteed to be the null-tuple, (), if it has a value at all.
A wildcard binding is used to define sequential composition of expressions in ML. The expression
exp1 ; exp2
is shorthand for the expression
let
  val _ = exp1
in
  exp2
end
that first evaluates exp1 for its effect, then evaluates exp2 .
Functions of type typ->unit are sometimes called procedures, because
they are executed purely for their effect. This is apparent from the type: it
is assured that the value of applying such a function is the null-tuple, (),
so the only point of applying it is for its effects on memory.
13.2 Reference Patterns

The contents of a reference cell may be retrieved by pattern matching as
well as by the function !: the value constructor ref may appear in a
pattern, where it matches a reference cell and binds its contents.
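For instance, here is a sketch of a dereferencing function written with a
ref pattern (deref is a name introduced here; it behaves just like the
pre-defined operation !):

fun deref (ref x) = x

Applying deref to a cell of type typ ref binds x to the current contents of
the cell and yields it as the result.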
13.3 Identity
Reference cells raise delicate issues of equality that considerably complicate reasoning about programs. In general we say that two expressions (of
the same type) are equal iff they cannot be distinguished by any operation
in the language. That is, two expressions are distinct iff there is some way
within the language to tell them apart. This is called Leibniz's Principle:
the identity of indiscernibles (we equate everything that we cannot tell apart)
and the indiscernibility of identicals (that which we deem equal cannot
be told apart).
What makes Leibniz's Principle tricky to grasp is that it hinges on what
we mean by a way to tell expressions apart. The crucial idea is that we
can tell two expressions apart iff there is a complete program containing one
of the expressions whose observable behavior changes when we replace that
expression by the other. That is, two expressions are considered equal iff
there is no such scenario that distinguishes them. But what do we mean by
complete program? And what do we mean by observable behavior?
For the present purposes we will consider a complete program to be
any expression of basic type (say, int or bool or string). The idea is that a
complete program is one that computes a concrete result such as a number.
The observable behavior of a complete program includes at least these
aspects:
1. Its value, or lack thereof, either by non-termination or by raising an
uncaught exception.
2. Its visible side effects, including visible modifications to mutable storage or any input/output it may perform.
In contrast here are some behaviors that we will not count as observations:
1. Execution time or space usage.
2. Private uses of storage (e.g., internally-allocated reference cells).
3. The names of uncaught exceptions (i.e., we will not distinguish between terminating with the uncaught exception Bind and the uncaught exception Match).
With these ideas in mind, it should be plausible that if we evaluate
these bindings
val r = ref 0
val s = ref 0
then r and s are not equivalent. Consider the following usage of r to compute an integer result:
(s := 1 ; !r)
Clearly this expression evaluates to 0, and mutates the binding of s. Now
replace r by s to obtain
(s := 1 ; !s)
This expression evaluates to 1, and mutates s as before. These two complete programs distinguish r from s, and therefore must be considered
distinct.
Had we replaced the binding for s by the binding

val s = r

then the two expressions that formerly distinguished r from s no longer
do so: they are, after all, bound to the same reference cell! In fact, no
program can be concocted that would distinguish them. In this case r and
s are equivalent.
Now consider a third, very similar scenario. Let us declare r and s as
follows:
val r = ref ()
val s = ref ()
Are r and s equivalent or not? We might first try to distinguish them by
a variant of the experiment considered above. This breaks down because
there is only one possible value we can assign to a variable of type unit
ref! Indeed, one may suspect that r and s are equivalent in this case,
but in fact there is a way to distinguish them! Here's a complete program
involving r that we will use to distinguish r from s:
if r=r then "it's r" else "it's not"
Now replace the first occurrence of r by s to obtain
if s=r then "it's r" else "it's not"
and the result is different.
This example hinges on the fact that ML defines equality for values
of reference type to be reference equality (or, occasionally, pointer equality).
Two reference cells (of the same type) are equal in this sense iff they both
arise from the exact same use of the ref operation to allocate that cell;
otherwise they are distinct. Thus the two cells bound to r and s above are
observably distinct (by testing reference equality), even though they can
only ever hold the value (). Had equality not been included as a primitive,
any two reference cells of unit type would have been equal.
Why does ML provide such a fine-grained notion of equality? True
equality, as defined by Leibniz's Principle, is, unfortunately, undecidable:
there is no computer program that determines whether two expressions
are equivalent in this sense. ML provides a useful, conservative approximation to true equality that in some cases is not defined (you cannot test
two functions for equality) and in other cases is too picky (it distinguishes
reference cells that are otherwise indistinguishable). Such is life.
13.4 Aliasing
Two variables are aliases when they are bound to the same reference cell. For example, after the declarations
val r = ref 0
val s = ref 0
the variables r and s are not aliases, but after the declaration
val r = ref 0
val s = r
the variables r and s are aliases for the same reference cell.
These examples show that we must be careful when programming
with variables of reference type. This is particularly problematic in the
case of functions, because we cannot assume that two different argument
variables are bound to different reference cells. They might, in fact, be
bound to the same reference cell, in which case we say that the two variables are aliases for one another. For example, in a function of the form
fn (x:typ ref, y:typ ref) => exp
we may not assume that x and y are bound to different reference cells. We
must always ask ourselves whether we've properly considered aliasing
when writing such a function. This is harder to do than it sounds. Aliasing
is a huge source of bugs in programs that work with reference cells.
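To see the danger concretely, consider the following function (a hypothetical example), whose result reveals whether its arguments are aliased:

fun f (x : int ref, y : int ref) = (x := 0; y := 1; !x)

val r = ref 0
val s = ref 0
val a = f (r, s)   (* yields 0: r and s are distinct cells *)
val b = f (r, r)   (* yields 1: both parameters are aliases for r *)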
13.5 Programming Well With References
13.5.1 Private Storage
local
val counter = ref 0
in
fun tick () = (counter := !counter + 1; !counter)
fun reset () = (counter := 0)
end
This declaration introduces two functions, tick of type unit -> int and
reset of type unit -> unit. Their definitions share a private variable
counter that is bound to a mutable cell containing the current value of
a shared counter. The tick operation increments the counter and returns
its new value, and the reset operation resets its value to zero. The types
of the operations suggest that implicit state is involved. In the absence
of exceptions and implicit state, there is only one useful function of type
unit->unit, namely the function that always returns its argument (and
it's debatable whether this is really useful!).
The declaration above defines two functions, tick and reset, that share
a single private counter. Suppose now that we wish to have several different instances of a counter: different pairs of functions tick and reset that
share different state. We can achieve this by defining a counter generator (or
constructor) as follows:
fun new_counter () =
let
val counter = ref 0
fun tick () = (counter := !counter + 1; !counter)
fun reset () = (counter := 0)
in
{ tick = tick, reset = reset }
end
The type of new_counter is
unit -> { tick : unit->int, reset : unit->unit }.
We've packaged the two operations into a record containing two functions that share private state. There is an obvious analogy with class-based
object-oriented programming. The function new_counter may be thought
of as a constructor for a class of counter objects. Each object has a private instance variable counter that is shared between the methods tick and reset
of the object represented as a record with two fields.
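To illustrate, here is one possible interaction, with c1 and c2 as hypothetical bindings:

val c1 = new_counter ()
val c2 = new_counter ()
val n1 = #tick c1 ()   (* yields 1 *)
val n2 = #tick c1 ()   (* yields 2 *)
val n3 = #tick c2 ()   (* yields 1: c2 has its own private counter *)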
13.5.2 Mutable Data Structures
A possibly-circular list (pcl) is a list whose tails are held in reference cells, so that they may be backpatched after creation:

datatype 'a pcell = Nil | Cons of 'a * 'a pcl
and 'a pcl = Pcl of 'a pcell ref

fun cons (h, t) = Pcl (ref (Cons (h, t)))
fun nill () = Pcl (ref Nil)
fun phd (Pcl (ref (Cons (h, _)))) = h
fun ptl (Pcl (ref (Cons (_, t)))) = t
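The construction of infinite can be sketched as follows (names illustrative): build a pcl around a cell, then backpatch the cell so that the tail points back to the list itself.

val c = ref Nil                  (* contents to be backpatched *)
val infinite = Pcl c             (* a one-cell pcl *)
val _ = c := Cons (1, infinite)  (* the tail is now the list itself *)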
The last step backpatches the tail of the last cell of infinite to be infinite
itself, creating a circular list.
Now let us define the size of a pcl to be the number of distinct nodes
occurring in it. It is an interesting problem to define a size function
for pcls that makes no use of auxiliary storage (e.g., no set of previously-encountered nodes) and runs in time proportional to the number of cells in
the pcl. The idea is to think of running a long race between a tortoise and a
hare. If the course is circular, then the hare, which quickly runs out ahead
of the tortoise, will eventually come from behind and pass it! Conversely,
if this happens, the course must be circular.
local
  fun race (Nil, Nil) = 0
    | race (Cons (_, Pcl (ref c)), Nil) =
        1 + race (c, Nil)
    | race (Cons (_, Pcl (ref c)), Cons (_, Pcl (ref Nil))) =
        1 + race (c, Nil)
    | race (Cons (_, l), Cons (_, Pcl (ref (Cons (_, m))))) =
        1 + race' (l, m)
  and race' (Pcl (r as ref c), Pcl (s as ref d)) =
        if r=s then 0 else race (c, d)
in
  fun size (Pcl (ref c)) = race (c, c)
end
The hare runs twice as fast as the tortoise. We let the tortoise do the counting; the hare's job is simply to detect cycles. If the hare reaches the finish
line, it simply waits for the tortoise to finish counting. This covers the
first three clauses of race. If the hare has not yet finished, we must continue with the hare running at twice the pace, checking whether the hare
catches the tortoise from behind. Notice that it can never arise that the
tortoise reaches the end before the hare does! Consequently, the definition
of race is inexhaustive.
13.6 Mutable Arrays
In addition to reference cells, ML also provides mutable arrays as a primitive data structure. The type typ array is the type of arrays carrying values of type typ.
The function array creates a new array of a given length, with the given
value as the initial value of every element of the array. The function length
returns the length of an array. The function sub performs a subscript operation, returning the ith element of an array A, where 0 ≤ i < length(A).
(These are just the basic operations on arrays; please see Appendix A for
complete information.)
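To make these operations concrete, here is a small illustrative interaction (Array.update, used again below, destructively replaces an element):

val a = Array.array (3, 0)       (* a three-element array, each element 0 *)
val n = Array.length a           (* yields 3 *)
val x = Array.sub (a, 0)         (* yields 0 *)
val _ = Array.update (a, 0, 7)   (* element 0 is now 7 *)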
One simple use of arrays is for memoization. Here's a function to compute the nth Catalan number, which may be thought of as the number
of distinct ways to parenthesize an arithmetic expression consisting of a
sequence of n consecutive multiplications. It makes use of an auxiliary
summation function that you can easily define for yourself. (Applying
sum to f and n computes the sum f 1 + ... + f n.)
fun C 1 = 1
| C n = sum (fn k => (C k) * (C (n-k))) (n-1)
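The auxiliary sum might be defined as follows (a minimal sketch; any definition computing f 1 + ... + f n will do):

fun sum f 0 = 0
  | sum f n = f n + sum f (n-1)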
This definition of C is hugely inefficient because a given computation may
be repeated exponentially many times. For example, to compute C 10 we
must compute C 1, C 2, ..., C 9, and the computation of C i engenders the
computation of C 1, ..., C (i-1) for each 1 ≤ i ≤ 9. We can do better by
caching previously-computed results in an array, leading to an enormous
improvement in execution speed. Here's the code:
local
val limit : int = 100
val memopad : int option array =
Array.array (limit, NONE)
in
fun C' 1 = 1
| C' n = sum (fn k => (C k)*(C (n-k))) (n-1)
and C n =
if n < limit then
case Array.sub (memopad, n)
of SOME r => r
| NONE =>
let
val r = C' n
in
Array.update (memopad, n, SOME r);
r
end
else
C' n
end
Note carefully the structure of the solution. The function C is a memoized
version of the Catalan number function. When called, it consults the memopad to determine whether or not the required result has already been
computed. If so, the answer is simply retrieved from the memopad; otherwise the result is computed, stored in the cache, and returned. The function C' looks superficially similar to the earlier definition of C, with the
important difference that the recursive calls are to C, rather than to C' itself.
This ensures that sub-computations are properly cached and that the cache
is consulted whenever possible.
The main weakness of this solution is that we must fix an upper bound
on the size of the cache. This can be alleviated by implementing a more
sophisticated cache management scheme that dynamically adjusts the size
of the cache based on the calls made to it.
13.7 Sample Code
Chapter 14
Input/Output
The Standard ML Basis Library (described in Appendix A) defines a three-layer input and output facility for Standard ML. These modules provide
a rudimentary, platform-independent text I/O facility that we summarize
briefly here. The reader is referred to Appendix A for more details. Unfortunately, there is at present no standard library for graphical user interfaces; each implementation provides its own package. See your compiler's
documentation for details.
14.1 Textual Input/Output
The text I/O primitives are based on the notions of an input stream and an
output stream, which are values of type instream and outstream, respectively. An input stream is an unbounded sequence of characters arising
from some source. The source could be a disk file, an interactive user, or
another program (to name a few choices). Any source of characters can
be attached to an input stream. An input stream may be thought of as
a buffer containing zero or more characters that have already been read
from the source, together with a means of requesting more input from the
source should the program require it. Similarly, an output stream is an unbounded sequence of characters leading to some sink. The sink could be a
disk file, an interactive user, or another program (to name a few choices).
Any sink for characters can be attached to an output stream. An output
stream may be thought of as a buffer containing zero or more characters
that have been produced by the program but have yet to be flushed to the
sink.
Each program comes with one input stream and one output stream,
called stdIn and stdOut, respectively. These are ordinarily connected to
the user's keyboard and screen, and are used for performing simple text
I/O in a program. The output stream stdErr is also pre-defined, and is
used for error reporting. It is ordinarily connected to the user's screen.
Textual input and output are performed on streams using a variety of
primitives. The simplest are inputLine and print. To read a line of input
from a stream, use the function inputLine of type instream -> string. It
reads a line of input from the given stream and yields that line as a string
whose last character is the line terminator. If the source is exhausted, it returns the empty string. To write a line to stdOut, use the function print
of type string -> unit. To write to a specific stream, use the function
output of type outstream * string -> unit, which writes the given string
to the specified output stream. For interactive applications it is often important to ensure that the output stream is flushed to the sink (e.g., so
that it is displayed on the screen). This is achieved by calling flushOut of
type outstream -> unit. The print function is a composition of output
(to stdOut) and flushOut.
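As a small illustration, here is a sketch of a prompt-and-echo procedure, assuming the string-valued inputLine described above (later revisions of the Basis Library give inputLine the type instream -> string option):

fun echo () =
    (print "Enter a line: ";
     TextIO.flushOut TextIO.stdOut;        (* make the prompt visible *)
     print (TextIO.inputLine TextIO.stdIn)) (* echo the line back *)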
A new input stream may be created by calling the function openIn of
type string -> instream. When applied to a string, the system attempts
to open a file with that name (according to operating system-specific naming conventions) and attaches it as a source to a new input stream. Similarly, a new output stream may be created by calling the function openOut
of type string -> outstream. When applied to a string, the system attempts to create a file with that name (according to operating systemspecific naming conventions) and attaches it as a sink for a new output
stream. An input stream may be closed using the function closeIn of type
instream -> unit. A closed input stream behaves as if there is no further input available; requests for input from a closed input stream yield
the empty string. An output stream may be closed using closeOut of type
outstream -> unit. A closed output stream is unavailable for further output; an attempt to write to a closed output stream raises the exception
TextIO.IO.
The function input of type instream -> string is a blocking read operation that returns a string consisting of the characters currently available
from the source. If none are currently available, but the end of source has
not been reached, then the operation blocks until at least one character is
available from the source. If the source is exhausted or the input stream
is closed, input returns the null string. To test whether an input operation would block, use the function canInput of type instream * int ->
int option. Given a stream s and a bound n, the function canInput determines whether or not a call to input on s would immediately yield up
to n characters. If the input operation would block, canInput yields NONE;
otherwise it yields SOME k, with 0 ≤ k ≤ n being the number of characters
immediately available on the input stream. If canInput yields SOME 0, the
stream is either closed or exhausted. The function endOfStream of type
instream -> bool tests whether the input stream is currently at the end
(no further input is available from the source). This condition is transient
since, for example, another process might append data to an open file in
between calls to endOfStream.
The function output of type outstream * string -> unit writes a string
to an output stream. It may block until the sink is able to accept the entire string. The function flushOut of type outstream -> unit forces any
pending output to the sink, blocking until the sink accepts the remaining
buffered output.
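These primitives suffice for simple file processing. For example, here is a sketch of a file-copying procedure (the name copyFile is hypothetical):

fun copyFile (src, dst) =
    let
      val ins  = TextIO.openIn src
      val outs = TextIO.openOut dst
      fun loop () =
          if TextIO.endOfStream ins then ()
          else (TextIO.output (outs, TextIO.input ins); loop ())
    in
      loop ();
      TextIO.closeIn ins;
      TextIO.closeOut outs
    end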
This collection of primitive I/O operations is sufficient for performing
rudimentary textual I/O. For further information on textual I/O, and support for binary I/O and Posix I/O primitives, see the Standard ML Basis
Library.
14.2 Sample Code
Chapter 15
Lazy Data Structures
In ML all variables are bound by value, which means that the bindings of
variables are fully evaluated expressions, or values. This general principle
has several consequences:
1. The right-hand side of a val binding is evaluated before the binding
is effected. If the right-hand side has no value, the val binding does
not take effect.
2. In a function application the argument is evaluated before being passed
to the function by binding that value to the parameter of the function. If the argument does not have a value, then neither does the
application.
3. The arguments to value constructors are evaluated before the constructed value is created.
According to the by-value discipline, the bindings of variables are evaluated, regardless of whether that variable is ever needed to complete execution. For example, to compute the result of applying the function fn
x => 1 to an argument, we never actually need to evaluate the argument,
but we do anyway. For this reason ML is sometimes said to be an eager
language.
An alternative is to bind variables by name, which means that the binding of a variable is an unevaluated expression, known as a computation or a
suspension or a thunk. This principle has several consequences:
1. The right-hand side of a val binding is not evaluated before the binding is effected. The variable is bound to a computation (unevaluated
expression), not a value.
2. In a function application the argument is passed to the function in
unevaluated form by binding it directly to the parameter of the function. This holds regardless of whether the argument has a value or
not.
3. The arguments to value constructors are left unevaluated when the
constructed value is created.
According to the by-name discipline, the bindings of variables are only
evaluated (if ever) when their values are required by a primitive operation.
For example, to evaluate the expression x+x, it is necessary to evaluate the
binding of x in order to perform the addition. Languages that adopt the
by-name discipline are, for this reason, said to be lazy.
This discussion glosses over another important aspect of lazy evaluation, called memoization. In actual fact laziness is based on a refinement
of the by-name principle, called the by-need principle. According to the by-name principle, variables are bound to unevaluated computations, and are
evaluated only as often as the value of that variable's binding is required
to complete the computation. In particular, to evaluate the expression x+x
the value of the binding of x is needed twice, and hence it is evaluated
twice. According to the by-need principle, the binding of a variable is
evaluated at most once: not at all if it is never needed, and exactly once if
it is ever needed at all. Re-evaluation of the same computation is avoided by
memoization. Once a computation is evaluated, its value is saved for future
reference should that computation ever be needed again.
The advantages and disadvantages of lazy versus eager languages have
been hotly debated. We will not enter into this debate here, but rather content ourselves with the observation that laziness is a special case of eagerness.
(Recent versions of) ML have lazy data types that allow us to treat unevaluated computations as values of such types, allowing us to incorporate
laziness into the language without disrupting its fundamental character
on which so much else depends. This affords the benefits of laziness, but
on a controlled basis: we can use it when it is appropriate, and ignore it
when it is not.
The main benefit of laziness is that it supports demand-driven computation. This is useful for representing on-line data structures that are created
only insofar as we examine them. Infinite data structures, such as the sequence of all prime numbers in order of magnitude, are one example of
an on-line data structure. Clearly we cannot ever finish creating the sequence of all prime numbers, but we can create as much of this sequence
as we need for a given run of a program. Interactive data structures, such
as the sequence of inputs provided by the user of an interactive system,
are another example of on-line data structures. In such a system the user's
inputs are not pre-determined at the start of execution, but rather are created on demand in response to the progress of computation up to that
point. The demand-driven nature of on-line data structures is precisely
what is needed to model this behavior.
Note: Lazy evaluation is a non-standard feature of ML that is supported
only by the SML/NJ compiler. The lazy evaluation features must be enabled by executing the following at top level:
Compiler.Control.lazysml := true;
open Lazy;
15.1 Lazy Data Types
where val is of type typ, and val' is another such computation. Notice how
this description captures the incremental nature of lazy data structures.
The computation is not evaluated until we examine it. When we do, its
structure is revealed as consisting of an element val together with another
suspended computation of the same type. Should we inspect that computation, it will again have this form, and so on ad infinitum.
Values of type typ stream are created using a val rec lazy declaration that provides a means for building a circular data structure. Here
is a declaration of the infinite stream of 1s as a value of type int stream:
val rec lazy ones = Cons (1, ones)
The keyword lazy indicates that we are binding ones to a computation,
rather than a value. The keyword rec indicates that the computation is
recursive (or self-referential or circular). It is the computation whose underlying value is constructed using Cons (the only possibility) from the integer
1 and the very same computation itself.
We can inspect the underlying value of a computation by pattern matching. For example, the binding
val Cons (h, t) = ones
extracts the head and tail of the stream ones. This is performed by
evaluating the computation bound to ones, yielding Cons (1, ones), then
performing ordinary pattern matching to bind h to 1 and t to ones.
Had the pattern been deeper, further evaluation would be required,
as in the following binding:
val Cons (h, Cons (h', t)) = ones
To evaluate this binding, we evaluate ones to Cons (1, ones), binding h
to 1 in the process, then evaluate ones again to Cons (1, ones), binding
h' to 1 and t to ones. The general rule is that pattern matching forces evaluation
of a computation to the extent required by the pattern. This is the means by
which lazy data structures are evaluated only insofar as required.
15.2 Lazy Function Definitions
The combination of (recursive) lazy function definitions and decomposition by pattern matching is the core mechanism required to support lazy
evaluation. However, there is a subtlety about function definitions that requires careful consideration, and a third new mechanism, the lazy function
declaration.
Using pattern matching we may easily define functions over lazy data
structures in a familiar manner. For example, we may define two functions
to extract the head and tail of a stream as follows:
fun shd (Cons (h, _)) = h
fun stl (Cons (_, s)) = s
These are functions that, when applied to a stream, evaluate it, and match
it against the given patterns to extract the head and tail, respectively.
While these functions are surely very natural, there is a subtle issue that
deserves careful discussion. The issue is whether these functions are lazy
enough. From one point of view, what we are doing is decomposing a
computation by evaluating it and retrieving its components. In the case
of the shd function there is no other interpretation: we are extracting a
value of type typ from a value of type typ stream, which is a computation
of a value of the form Cons (exph, expt). We can adopt a similar viewpoint about stl, namely that it is simply extracting a component value
from a computation of a value of the form Cons (exph, expt).
However, in the case of stl, another point of view is also possible.
Rather than think of stl as extracting a value from a stream, we may instead think of it as creating a stream out of another stream. Since streams
are computations, the stream created by stl (according to this view) should
also be suspended until its value is required. Under this interpretation the
argument to stl should not be evaluated until its result is required, rather
than at the time stl is applied. This leads to a variant notion of tail that
may be defined as follows:
fun lazy lstl (Cons (_, s)) = s
The keyword lazy indicates that an application of lstl to a stream does
not immediately perform pattern matching on its argument, but rather sets
up a stream computation that, when forced, forces the argument and extracts the tail of the stream.
The behavior of the two forms of tail function can be distinguished
using print statements as follows:
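For instance, one might instrument each version with a print effect (a sketch; stl' and lstl' are illustrative variants of the functions above):

fun stl' (Cons (_, s)) = (print "forced\n"; s)
fun lazy lstl' (Cons (_, s)) = (print "forced\n"; s)

val _ = stl' ones    (* prints "forced" at once: the match is performed eagerly *)
val t = lstl' ones   (* prints nothing: the match is suspended *)
val Cons _ = t       (* inspecting t performs the match: prints "forced" *)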
15.3 Programming with Streams
the stream, rather than merely set up a future computation of the same
result.
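A definition of smap along these lines might run as follows (a sketch, patterned on sfilter below):

fun smap f =
    let
      fun lazy loop (Cons (x, s)) = Cons (f x, loop s)
    in
      loop
    end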
To illustrate the use of smap, here's a definition of the infinite stream
of natural numbers:
val one_plus = smap (fn n => n+1)
val rec lazy nats = Cons (0, one_plus nats)
Now let's define a function sfilter of type
('a -> bool) -> 'a stream -> 'a stream
that filters out all elements of a stream that do not satisfy a given predicate.
fun sfilter pred =
let
fun lazy loop (Cons (x, s)) =
if pred x then
Cons (x, loop s)
else
loop s
in
loop
end
We can use sfilter to define a function sieve that, when applied to a
stream of numbers, retains only those numbers that are not divisible by a
preceding number in the stream:
fun m mod n = m - n * (m div n)
fun divides m n = n mod m = 0
fun lazy sieve (Cons (x, s)) =
Cons (x, sieve (sfilter (not o (divides x)) s))
(This example uses o for function composition.)
We may now define the infinite stream of primes by applying sieve to
the natural numbers greater than or equal to 2:
val nats2 = stl (stl nats)
val primes = sieve nats2
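The first few primes may then be inspected by repeated decomposition (illustrative bindings):

val Cons (p1, ps) = primes   (* p1 = 2 *)
val Cons (p2, qs) = ps       (* p2 = 3 *)
val Cons (p3, _)  = qs       (* p3 = 5 *)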
15.4 Sample Code
Chapter 16
Equality and Equality Types
16.1 Sample Code
Chapter 17
Concurrency
Concurrent ML (CML) is an extension of Standard ML with mechanisms
for concurrent programming. It is available as part of the Standard ML of
New Jersey compiler. The eXene Library for programming the X Window System is based on CML.
17.1 Sample Code
Part III
The Module Language
The Standard ML module language comprises the mechanisms for structuring programs into separate units. Program units are called structures. A
structure consists of a collection of components, including types and values, that constitute the unit. Composition of units to form a larger unit
is mediated by a signature, which describes the components of that unit.
A signature may be thought of as the type of a unit. Large units may be
structured into hierarchies using substructures. Generic, or parameterized,
units may be defined as functors.
Chapter 18
Signatures and Structures
The fundamental constructs of the ML module system are signatures and
structures. A signature may be thought of as a description of a structure,
and a structure may correspondingly be thought of as an implementation
of a signature. Many languages have similar constructs, often with different names. Signatures are often called interfaces or package specifications,
and structures are often called implementations or packages. Whatever
the terminology, the main idea is to assign a type to a body of code as a
whole.
Unlike many languages, however, the relationship between signatures
and structures is many-to-many, rather than many-to-one or one-to-one. A
signature may describe many different structures, and a structure may satisfy many different signatures. Thus, strictly speaking, it does not make
sense to speak of the signature of a structure or the structure matching a
signature, because there can be more than one in each case. By contrast
many languages impose much more stringent conditions such as requiring that each structure have a unique signature, or that each signature arise
from a unique structure. This is not the case for ML.
18.1 Signatures
A signature is a specification, or a description, of a program unit, or structure. Structures consist of declarations of type constructors, exception constructors, and value bindings. A signature specifies some requirements
on a structure, such as what type components it must have, and what
value components it must have and what must be their types. A structure matches, or implements, a signature iff it meets these requirements in a
sense that will be made precise below. The requirements are to be thought
of as descriptive in that the structure may meet more stringent requirements
than are specified by a signature by, for example, having more components than are specified, but the structure nevertheless matches any less
stringent specification. (As a limiting case, any structure matches the null
signature that imposes no requirements on it!)
18.1.1 Basic Signatures
A basic signature expression has the form sig specs end, where specs is a
sequence of specifications. There are four basic forms of specification that
may occur in specs:1
1. A type specification of the form
type (tyvar1 ,...,tyvarn ) tycon [ = typ ],
where the definition typ of tycon may or may not be present.
2. A datatype specification, which has precisely the same form as a datatype
declaration.
3. An exception specification of the form
exception excon [ of typ ],
where the type typ of excon may or may not be present.
4. A value specification of the form
val id : typ.
Each specification may refer to the type constructors introduced earlier in
the sequence. No component may be specified more than once.
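For instance, the following hypothetical signature exhibits all four forms of specification:

signature STACK =
  sig
    type 'a stack                          (* type specification *)
    datatype dir = Front | Back            (* datatype specification *)
    exception Empty                        (* exception specification *)
    val push : 'a * 'a stack -> 'a stack   (* value specification *)
  end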
Signatures may be given names using a signature binding
signature sigid = sigexp,
1 There are two other forms of specification beyond these four: substructure specifications (chapter 21) and sharing specifications (chapter 22).
18.1.2 Signature Inheritance
Signatures may be built up from one another using two principal tools,
signature inclusion and signature specialization. Each is a form of inheritance
in which a new signature is created by enriching another signature with
additional information.
Signature inclusion is used to add more components to an existing signature. For example, if we wish to add an emptiness test to the signature
QUEUE we might define the augmented signature, QUEUE_WITH_EMPTY, using
the following signature binding:
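signature QUEUE_WITH_EMPTY =
  sig
    include QUEUE
    val is_empty : 'a queue -> bool
  end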
18.2 Structures
18.2.1 Basic Structures
A structure binding is evaluated by evaluating the right-hand side, and binding the resulting structure value
to strid.
Here is an example of a structure binding:
structure Queue =
struct
type 'a queue = 'a list * 'a list
exception Empty
val empty = (nil, nil)
fun insert (x, (b,f)) = (x::b, f)
fun remove (nil, nil) = raise Empty
| remove (bs, nil) = remove (nil, rev bs)
| remove (bs, f::fs) = (f, (bs, fs))
end
Recall that a fun binding is really an abbreviation for a val rec binding,
and hence constitutes a value binding (of function type).
18.2.2
In chapter 21, we will generalize this to admit an arbitrary sequence of strids separated by a dot.
18.3 Sample Code
Chapter 19
Signature Matching
When does a structure implement a signature? The structure must provide all of the components and satisfy all of the type definitions required by
the signature. Type components must be provided with the same number
of arguments and with an equivalent definition (if any) to that given in
the signature. Value components must be present with the type specified
in the signature, up to the definitions of any preceding types. Exception
components must be present with the type of argument (if any) specified
in the signature.
These simple principles have a number of important consequences.
To minimize bureaucracy, a structure may provide more components
than are strictly required by the signature. If a signature requires
components x, y, and z, it is sufficient for the structure to provide x,
y, z, and w.
To enhance reuse, a structure may provide values with more general
types than are required by the signature. If a signature demands a
function of type int->int, it is enough to provide a function of type
'a->'a.
To avoid over-specification, a datatype may be provided where a type
is required, and a value constructor may be provided where a value
is required.
To increase flexibility, a structure may consist of declarations presented
in any sensible order, not just the order specified in the signature,
provided that the requirements of the specification are met.
19.1 Principal Signatures
There is a most stringent, or most precise, signature for a structure, called its
principal signature. A structure may be considered to match a signature exactly when the specified signature is no more restrictive than the principal
signature of the structure. To determine whether a structure matches a signature, it is enough to check whether the principal signature of that structure is no weaker than the specified signature. For the purposes of type
checking, the principal signature is the official proxy for the structure. We
need never examine the code of the structure during type checking, once
its principal signature has been determined.
A structure expression is assigned a principal signature by a component-by-component analysis of its constituent declarations. The principal signature of a structure is obtained as follows (these rules gloss over some technical complications that arise only in unusual circumstances; see The Definition of Standard ML [3] for complete details):
1. Corresponding to a declaration of the form
type (tyvar1 ,...,tyvarn ) tycon = typ,
the principal signature contains the specification
type (tyvar1 ,...,tyvarn ) tycon = typ
The principal signature includes the definition of tycon.
2. Corresponding to a declaration of the form
datatype (tyvar1 ,...,tyvarn ) tycon =
con1 of typ1 | ... | conk of typk
the principal signature contains the specification
datatype (tyvar1 ,...,tyvarn ) tycon =
con1 of typ1 | ... | conk of typk
The specification is identical to the declaration.
3. Corresponding to a declaration of the form
exception id of typ
the principal signature contains the specification
exception id of typ
4. Corresponding to a declaration of the form
val id = exp
the principal signature contains the specification
val id : typ
where typ is the principal type of the expression exp (relative to the
preceding declarations).
In brief, the principal signature contains all of the type definitions, datatype
definitions, and exception bindings of the structure, plus the principal
types of its value bindings.
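For example, under these rules the hypothetical structure

structure S =
  struct
    type t = int
    exception Fail of string
    val x = 5
    fun f y = y + x
  end

has a principal signature containing the specifications

sig
  type t = int
  exception Fail of string
  val x : int
  val f : int -> int
end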
19.2 Matching
The candidate may have additional components not mentioned in the target, or satisfy additional type equations not required in the target, but it
cannot have fewer of either. The target signature may therefore be seen
as a weakening of the candidate signature, since every property guaranteed by
the target also holds of the candidate.
The matching relation is reflexive (every signature matches itself)
and transitive (if sigexp1 matches sigexp2 and sigexp2 matches sigexp3, then
sigexp1 matches sigexp3). Two signatures are equivalent (in the sense of
chapter 18) iff each matches the other, which is to say that they
are equivalently restrictive.
It will be helpful to consider some examples. Recall the following signatures from chapter 18.
signature QUEUE =
sig
type 'a queue
exception Empty
val empty : 'a queue
val insert : 'a * 'a queue -> 'a queue
val remove : 'a queue -> 'a * 'a queue
end
signature QUEUE_WITH_EMPTY =
sig
include QUEUE
val is_empty : 'a queue -> bool
end
signature QUEUE_AS_LISTS =
QUEUE where type 'a queue = 'a list * 'a list
The signature QUEUE_WITH_EMPTY matches the signature QUEUE, because
all of the requirements of QUEUE are met by QUEUE_WITH_EMPTY. The converse
does not hold, because QUEUE lacks the component is_empty, which is required by QUEUE_WITH_EMPTY.
The signature QUEUE_AS_LISTS matches the signature QUEUE. It is identical to QUEUE, apart from the additional specification of the type 'a queue.
The converse fails, because the signature QUEUE does not satisfy the requirement that 'a queue be equivalent to 'a list * 'a list.
Matching does not distinguish between equivalent signatures. For example, consider the following signature:
signature QUEUE_AS_LIST = sig
type 'a queue = 'a list
exception Empty
val empty : 'a list
val insert : 'a * 'a list -> 'a list
val remove : 'a list -> 'a * 'a list
val is_empty : 'a list -> bool
end
At first glance you might think that this signature does not match the signature QUEUE, since the components of QUEUE_AS_LIST have superficially
dissimilar types from those in QUEUE. However, the signature QUEUE_AS_LIST
is equivalent to the signature QUEUE_WITH_EMPTY where type 'a queue = 'a list, which
matches QUEUE for reasons noted earlier. Therefore, QUEUE_AS_LIST matches
QUEUE as well.
Signature matching may also involve instantiation of polymorphic types.
The types of values in the candidate may be more general than required
by the target. For example, the signature
signature MERGEABLE_QUEUE =
sig
include QUEUE
val merge : 'a queue * 'a queue -> 'a queue
end
matches the signature
signature MERGEABLE_INT_QUEUE =
sig
include QUEUE
val merge : int queue * int queue -> int queue
end
because the polymorphic type of merge in MERGEABLE_QUEUE instantiates to
its type in MERGEABLE_INT_QUEUE.
Finally, a datatype specification matches a signature that specifies a
type with the same name and arity (but no definition), and zero or more
value specifications corresponding to its value constructors.
19.3 Satisfaction
19.4 Sample Code
Chapter 20
Signature Ascription
Signature ascription imposes the requirement that a structure implement a
signature and, in so doing, weakens the signature of that structure for all
subsequent uses of it. There are two forms of ascription in ML. Both require that a structure implement a signature; they differ in the extent to
which the assigned signature of the structure is weakened by the ascription.
1. Transparent, or descriptive ascription. The structure is assigned the
signature obtained by propagating type definitions from the principal
signature to the candidate signature.
2. Opaque, or restrictive ascription. The structure is assigned the target
signature as is, without propagating any type definitions.
In either case the components of the structure are cut down to those specified in the signature; the only difference is whether type definitions are
propagated from the principal signature or not.
20.1
20.2 Opaque Ascription
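Recall the list-pair implementation of queues from chapter 18, here opaquely ascribed the signature QUEUE of chapter 19 (a reconstruction of the running example):

structure Queue :> QUEUE =
  struct
    type 'a queue = 'a list * 'a list
    exception Empty
    val empty = (nil, nil)
    fun insert (x, (b, f)) = (x::b, f)
    fun remove (nil, nil) = raise Empty
      | remove (bs, nil) = remove (nil, rev bs)
      | remove (bs, f::fs) = (f, (bs, fs))
  end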
The use of opaque ascription ensures that the type 'a Queue.queue is abstract. No definition is provided for it in the signature QUEUE, and therefore it has no definition in terms of other types of the language; the type
'a Queue.queue is abstract.
For the type 'a Queue.queue to be abstract means that the only operations that may be performed on values of that type are empty, insert, and
remove. Importantly, we may not make use of the fact that a queue is really
a pair of lists on the grounds that it is implemented this way. We have
obscured this fact by opaquely ascribing a signature that does not provide
a definition for the type a queue. All clients of the structure Queue are
insulated from the details of how queues are implemented. Consequently,
the implementation of queues can be changed without breaking any client
code, as long as the new implementation satisfies the same signature.
Hiding the representation of a type allows us to isolate the enforcement
of representation invariants on a data structure. We may think of the type
'a Queue.queue as the type of states of an abstract machine whose sole
instructions are empty (the initial state), insert, and remove. Internally
to the structure Queue we may wish to impose invariants on the internal
state of the machine. The beauty of data abstraction is that it provides
an elegant means of enforcing such invariants, called the assume-ensure,
or rely-guarantee, method. It reduces the enforcement of representation
invariants to these two requirements:
1. All initialization instructions must ensure that the invariant holds
true of the machine state after execution.
2. All state transition instructions may assume that the invariant holds
of the input states, and must ensure that it holds of the output state.
By induction on the number of instructions executed, the invariant must
hold for all states: it must really be invariant!
Suppose that we wish to implement an abstract type of priority queues
for an arbitrary element type. The queue operations are no longer polymorphic in the element type because they actually touch the elements to
determine their relative priorities. Here is a possible signature for priority
queues that expresses this dependency:1
1 In chapter 21 we'll introduce better means for structuring this module, but the central ideas remain the same.
signature PQ =
sig
type elt
val lt : elt * elt -> bool
type queue
exception Empty
val empty : queue
val insert : elt * queue -> queue
val remove : queue -> elt * queue
end
Now let us consider an implementation of priority queues in which
the elements are taken to be strings. Since priority queues form an abstract type, we would expect to use opaque ascription to ensure that its
representation is hidden. This suggests an implementation along these
lines:
structure PrioQueue :> PQ =
struct
type elt = string
val lt : string * string -> bool = (op <)
type queue = ...
...
end
But not only is the type PrioQueue.queue abstract, so is PrioQueue.elt!
This leaves us no means of creating a value of type PrioQueue.elt, and
hence we can never call PrioQueue.insert. The problem is that the interface is too abstract: it should only obscure the identity of the type
queue, and not that of the type elt.
The solution is to augment the signature PQ with a definition for the
type elt, then opaquely ascribe this to PrioQueue:
signature STRING_PQ = PQ where type elt = string
structure PrioQueue :> STRING_PQ = ...
Now the type PrioQueue.elt is equivalent to string, and we may call
PrioQueue.insert with a string, as expected.
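For instance, the following binding (illustrative) is now well-typed:

val q = PrioQueue.insert ("hello", PrioQueue.empty)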
The moral is that there is always an element of judgement involved in
deciding which types to hold abstract, and which to specify concretely. In the
case of priority queues, the determining factor is that we specified only the
operations on elt that were required for the implementation of priority
queues, and no others. This means that elt could not usefully be held
abstract, but must instead be specified in the signature. On the other hand
the operations on queues are intended to be complete, and so we hold the
type abstract.
20.3 Transparent Ascription
20.4 Sample Code
Chapter 21
Module Hierarchies
So far we have confined ourselves to considering flat modules consisting of a linear sequence of declarations of types, exceptions, and values.
As programs grow in size and complexity, it becomes important to introduce further structuring mechanisms to support their growth. The ML
module language also supports module hierarchies, tree-structured configurations of modules that reflect the architecture of a large system.
21.1 Substructures
A substructure is a structure within a structure. Structure bindings (either opaque or transparent) are admitted as components of other structures. Structure specifications of the form
structure strid : sigexp
may appear in signatures. There is no distinction between transparent
and opaque specifications in a signature, because there is no structure to
ascribe!
The type checking and evaluation rules for structures are extended
to substructures recursively. The principal signature of a substructure
binding is determined according to the rules given in chapter 19. A substructure specification in one signature matches the corresponding one in another iff their signatures match according to the rules in chapter 19. Evaluation of a substructure binding consists of evaluating the structure expression, then binding the resulting structure value to that identifier.
To see how substructures arise in practice, consider the following programming scenario. The first version of a system makes use of a polymorphic dictionary data structure whose search keys are strings. The signature
for such a data structure might be as follows:
signature MY_STRING_DICT =
sig
type 'a dict
val empty : 'a dict
val insert : 'a dict * string * 'a -> 'a dict
val lookup : 'a dict * string -> 'a option
end
The return type of lookup is 'a option, since there may be no entry in the
dictionary with the specified key.
The implementation of this abstraction looks approximately like this:
structure MyStringDict :> MY_STRING_DICT =
struct
datatype 'a dict =
Empty |
Node of 'a dict * string * 'a * 'a dict
val empty = Empty
fun insert (d, k, v) = ...
fun lookup (d, k) = ...
end
The omitted implementations of insert and lookup make use of the built-in lexicographic ordering of strings.
The second version of the system requires another dictionary whose
keys are integers, leading to another signature and implementation for
dictionaries.
signature MY_INT_DICT =
sig
type 'a dict
val empty : 'a dict
val insert : 'a dict * int * 'a -> 'a dict
val lookup : 'a dict * int -> 'a option
end
Notice that we required an auxiliary function, divides, to implement the
comparison in the required sense.
With this in mind, let us re-consider our initial attempt to consolidate
the signatures of the various versions of dictionaries in play. In one sense
there is nothing to do: the signature MY_GEN_DICT suffices. However,
as we've just seen, the instances of this signature, which are ascribed to
particular implementations, do not determine the interpretation. What
we'd like to do is to package the type with its interpretation so that the
dictionary module is self-contained. Not only does the dictionary module
carry with it the type of its keys, but it also carries the interpretation used
on that type.
This is achieved by introducing a substructure binding in the dictionary structure. To begin with we first isolate the notion of an ordered
type.
signature ORDERED =
sig
type t
val lt : t * t -> bool
val eq : t * t -> bool
end
This signature describes modules that contain a type t equipped with an
equality and comparison operation on it.
An implementation of this signature specifies the type and the interpretation, as in the following examples.
(* Lexicographically ordered strings. *)
structure LexString : ORDERED =
struct
type t = string
val eq = (op =)
val lt = (op <)
end
(* Integers ordered conventionally. *)
structure LessInt : ORDERED =
struct
type t = int
val eq = (op =)
val lt = (op <)
end
(* Integers ordered by divisibility.*)
structure DivInt : ORDERED =
struct
type t = int
fun lt (m, n) = (n mod m = 0)
fun eq (m, n) = lt (m, n) andalso lt (n, m)
end
Notice that the use of transparent ascription is very natural here, since
ORDERED is not intended as a self-contained abstraction.
The signature of dictionaries is re-structured as follows:
signature DICT =
sig
structure Key : ORDERED
type 'a dict
val empty : 'a dict
val insert : 'a dict * Key.t * 'a -> 'a dict
val lookup : 'a dict * Key.t -> 'a option
end
The signature DICT includes as a substructure the key type together with
its interpretation as an ordered type.
To enforce abstraction we introduce specialized versions of this signature that specify the key type using a where type clause.
signature STRING_DICT =
DICT where type Key.t=string
signature INT_DICT =
DICT where type Key.t=int
These are, respectively, signatures for the abstract type of dictionaries whose
keys are strings and integers.
How are these signatures to be implemented? Corresponding to the
layering of the signatures, we have a layering of the implementation.
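A dictionary with lexicographically ordered string keys may be implemented along these lines (a sketch patterned on the divisibility version below):

structure StringDict :> STRING_DICT =
  struct
    structure Key : ORDERED = LexString
    datatype 'a dict =
      Empty |
      Node of 'a dict * Key.t * 'a * 'a dict
    val empty = Empty
    fun insert (Empty, k, v) = Node (Empty, k, v, Empty)
      | insert (Node (dl, l, v', dr), k, v) =
          if Key.lt (k, l) then Node (insert (dl, k, v), l, v', dr)
          else if Key.lt (l, k) then Node (dl, l, v', insert (dr, k, v))
          else Node (dl, k, v, dr)
    fun lookup (Empty, _) = NONE
      | lookup (Node (dl, l, v, dr), k) =
          if Key.lt (k, l) then lookup (dl, k)
          else if Key.lt (l, k) then lookup (dr, k)
          else SOME v
  end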
Similarly, dictionaries with integer keys ordered by divisibility may be
implemented as follows:
structure IntDivDict :> INT_DICT =
struct
structure Key : ORDERED = DivInt
datatype 'a dict =
Empty |
Node of 'a dict * Key.t * 'a * 'a dict
val empty = Empty
fun insert (Empty, k, v) = Node (Empty, k, v, Empty)
  | insert (Node (dl, l, v', dr), k, v) =
      if Key.lt (k, l) then Node (insert (dl, k, v), l, v', dr)
      else if Key.lt (l, k) then Node (dl, l, v', insert (dr, k, v))
      else Node (dl, k, v, dr)
fun lookup (Empty, _) = NONE
  | lookup (Node (dl, l, v, dr), k) =
      if Key.lt (k, l) then
        lookup (dl, k)
      else if Key.lt (l, k) then
        lookup (dr, k)
      else
        SOME v
end
Taking stock of the development, what we have done is to structure
the signature of dictionaries to allow the type of keys, together with its
interpretation, to vary from one implementation to another. The Key substructure may be viewed as a parameter of the signature DICT that is instantiated by specialization to specific types of interest. In this sense substructures subsume the notion of a parameterized signature found in some
languages. There are several advantages to this:
1. A signature with one or more substructures is still a complete signature. Parameterized signatures, in contrast, are incomplete signatures that must be completed to be used.
2. Any substructure of a signature may play the role of a parameter.
There is no need to designate in advance which are arguments and
which are results.
In chapter 23 we will introduce the mechanisms needed to build a
generic implementation of dictionaries that may be instantiated by the key
type and its ordering.
21.2 Sample Code
Chapter 22
Sharing Specifications
In chapter 21 we illustrated the use of substructures to express the dependence of one abstraction on another. In this chapter we will consider the
problem of symmetric combination of modules to form larger modules.
22.1 Combining Abstractions
the package. To support this it is necessary to constrain the implementation to use the same notion of vector throughout. This is achieved using
a type sharing constraint. The revised signatures for the geometry package
look like this:
signature SPHERE =
sig
structure Vector : VECTOR
structure Point : POINT
sharing type Point.Vector.vector = Vector.vector
type sphere
val sphere : Point.point * Vector.vector -> sphere
end
signature GEOMETRY =
sig
structure Point : POINT
structure Sphere : SPHERE
sharing type Point.point = Sphere.Point.point
and Point.Vector.vector = Sphere.Vector.vector
end
These equations specify that the two copies of the point abstraction and
the three copies of the vector abstraction must coincide. In the presence
of the above sharing specification, the ill-typed expression above becomes
well-typed, since now the required type equation holds by explicit specification in the signature.
As a notational convenience we may use a structure sharing constraint
instead to express the same requirements:
signature SPHERE =
sig
structure Vector : VECTOR
structure Point : POINT
sharing Point.Vector = Vector
type sphere
val sphere : Point.point * Vector.vector -> sphere
end
signature GEOMETRY =
sig
The required sharing between Sphere.Point and Point does not hold, because
Sphere2D.Point is distinct from Point3D.
structure Geom3D :> GEOMETRY =
struct
structure Point = Point3D
structure Sphere = Sphere2D
end
It is natural to wonder whether it might be possible to restructure the
GEOMETRY signature so that the duplication of the point and vector components is avoided, thereby obviating the need for sharing specifications.
One can re-structure the code in this manner, but doing so would do violence to the overall structure of the program. This is why sharing specifications are so important.
Let's try to re-organize the signature GEOMETRY so that duplication of
the point and vector structures is avoided. One step is to eliminate the substructure Vector from SPHERE, replacing uses of Vector.vector by Point.Vector.vector.
signature SPHERE =
sig
structure Point : POINT
type sphere
val sphere :
Point.point * Point.Vector.vector -> sphere
end
After all, since the structure Point comes equipped with a notion of vector,
why not use it?
This cuts down the number of sharing specifications to one:
signature GEOMETRY =
sig
structure Point : POINT
structure Sphere : SPHERE
sharing Point = Sphere.Point
end
If we could further eliminate the substructure Point from the signature
SPHERE we would have only one copy of Point and no need for a sharing
specification.
But what would the signature SPHERE look like in this case?
signature SPHERE =
sig
type sphere
val sphere :
Point.point * Point.Vector.vector -> sphere
end
The problem now is that the signature SPHERE is no longer self-contained.
It makes reference to a structure Point, but which Point are we talking
about? Any commitment would tie the signature to a specific structure,
and hence a specific dimension, contrary to our intentions. Rather, the
notion of point must be a generic concept within SPHERE, and hence Point
must appear as a substructure. The substructure Point may be thought of
as a parameter of the signature SPHERE in the sense discussed earlier.
The only other move available to us is to eliminate the structure Point
from the signature GEOMETRY. This is indeed possible, and would eliminate
the need for any sharing specifications. But it only defers the problem,
rather than solving it. A full-scale geometry package would contain more
abstractions that involve points, so that there will still be copies in the
other abstractions. Sharing specifications would then be required to ensure that these copies are, in fact, identical.
Here is an example. Let us introduce another geometric abstraction,
the semi-space.
signature SEMI_SPACE =
sig
structure Point : POINT
type semispace
val side : Point.point * semispace -> bool option
end
The function side determines (if possible) whether a given point lies in
one half of the semi-space or the other.
The expanded GEOMETRY signature would look like this (with the elimination of the Point structure in place).
signature EXTD_GEOMETRY =
sig
structure Sphere : SPHERE
structure SemiSpace : SEMI_SPACE
sharing Sphere.Point = SemiSpace.Point
end
22.2 Sample Code
Chapter 23
Parameterization
To support code re-use it is useful to define generic, or parameterized, modules that leave unspecified some aspects of the implementation of a module. The unspecified parts may be instantiated to determine specific instances of the module. The common part is thereby implemented once and
shared among all instances.
In ML such generic modules are called functors. A functor is a module-level function that takes a structure as argument and yields a structure
as result. Instances are created by applying the functor to an argument
specifying the interpretation of the parameters.
23.1
Functors are defined using a functor binding. There are two forms, the
opaque and the transparent. A transparent functor has the form
functor funid(decs):sigexp = strexp
where the result signature, sigexp, is transparently ascribed; an opaque functor has the form
functor funid(decs):>sigexp = strexp
where the result signature is opaquely ascribed. A functor is a module-level function whose argument is a sequence of declarations, and whose
result is a structure.
The alternative, called applicativity, means that there is one abstract type shared by all
instances of that functor.
23.2
In chapter 22 we developed a signature of geometric primitives that contained sharing specifications to ensure that the constituent abstractions
may be combined properly. The signature GEOMETRY is defined as follows:
signature GEOMETRY =
sig
structure Point : POINT
structure Sphere : SPHERE
sharing Point = Sphere.Point
and Point.Vector = Sphere.Vector
and Sphere.Vector = Sphere.Point.Vector
end
The sharing clauses ensure that the Point and Sphere components are
compatible with each other.
Since we expect to define vectors, points, and spheres of various dimensions, it makes sense to implement these as functors, according to the
following scheme:
functor PointFun
(structure V : VECTOR) : POINT = ...
functor SphereFun
(structure V : VECTOR
structure P : POINT) : SPHERE =
struct
structure Vector = V
structure Point = P
...
end
functor GeomFun
(structure P : POINT
structure S : SPHERE) : GEOMETRY =
struct
structure Point = P
structure Sphere = S
end
A two-dimensional geometry package may then be defined as follows:
structure Vector2D : VECTOR = ...
structure Point2D : POINT =
PointFun (structure V = Vector2D)
structure Sphere2D : SPHERE =
SphereFun (structure V = Vector2D and P = Point2D)
structure Geom2D : GEOMETRY =
GeomFun (structure P = Point2D and S = Sphere2D)
A three-dimensional version is defined similarly.
There is only one problem: the functors SphereFun and GeomFun are
not well-typed! The reason is that in both cases their result signatures
require type equations that are not true of their parameters! For example,
23.3
functor SphereFun (structure P : POINT) : SPHERE =
struct
structure Vector = P.Vector
structure Point = P
...
end
functor SemiSpaceFun
(structure P : POINT) : SEMI_SPACE =
struct
...
end
functor ExtdGeomFun1
  (structure P : POINT) : EXTD_GEOMETRY =
struct
  structure Sphere =
    SphereFun (structure P = P)
  structure SemiSpace =
    SemiSpaceFun (structure P = P)
end
The problems with this solution are these:
1. The body of ExtdGeomFun1 makes use of the functors SphereFun and SemiSpaceFun. In effect we are limiting the geometry functor to arguments that are built from these specific functors, and no others. This is a significant loss of the generality that is otherwise present in the functor ExtdGeomFun, which may be applied to any implementations of SPHERE and SEMI_SPACE.
2. The functor ExtdGeomFun1 must take as parameter the common element(s) of the components of its body, which are then used to build up the appropriate substructures in a manner consistent with the required sharing. This approach does not scale well when many abstractions are layered atop one another. We must reconstruct the entire hierarchy, starting with the components that are conceptually furthest away as arguments.
3. There is no inherent reason why ExtdGeomFun1 must take an implementation of POINT as argument. It does so only so that it can reconstruct the shared substructures required by its result signature.
This solution has all of the advantages of the direct use of sharing specifications, and no further disadvantages. However, we are forced to violate
arbitrarily the inherent symmetry of the situation. We could just as well
have written
functor ExtdGeomFun4
  (structure Ss : SEMI_SPACE
   structure Sp : SPHERE where Point = Ss.Point) =
struct
  structure Sphere = Sp
  structure SemiSpace = Ss
end
without changing the meaning.
Here is the point: sharing specifications allow a symmetric situation to be
treated in a symmetric manner. The compiler breaks the symmetry by choosing representatives arbitrarily in the manner illustrated above. Sharing
specifications off-load the burden of making such tedious (because arbitrary) decisions to the compiler, rather than imposing it on the programmer.
23.4
Sample Code
Part IV
Programming Techniques
In this part of the book we will explore the use of Standard ML to build
elegant, reliable, and efficient programs. The discussion takes the form of
a series of worked examples illustrating various techniques for building
programs.
Chapter 24
Specifications and Correctness
The most important tools for getting programs right are specification and verification. In this chapter we review the main ideas in preparation for their subsequent use in the rest of the book.
24.1
Specifications
A specification is a description of the behavior of a piece of code. Specifications take many forms:
Typing. A type specification describes the form of the value of an
expression, without saying anything about the value itself.
Effect Behavior. An effect specification resembles a type specification,
but instead of describing the value of an expression, it describes the
effects it may engender when evaluated.
Input-Output Behavior. An input-output specification is a mathematical formula, usually an implication, that describes the output of a
function for all inputs satisfying some assumptions.
Time and Space Complexity. A complexity specification states the time
or space required to evaluate an expression. The specification is most
often stated asymptotically in terms of the number of execution steps
or the size of a data structure.
Consider the following two definitions of the Fibonacci function:

fun fib 0 = 1
  | fib 1 = 1
  | fib n = fib (n-1) + fib (n-2)

fun fib' 0 = (1, 0)
  | fib' 1 = (1, 1)
  | fib' n =
    let
      val (a, b) = fib' (n-1)
    in
      (a+b, a)
    end
Here are some specifications pertaining to these functions:
Type specifications:
val fib : int -> int
val fib' : int -> int * int
Effect specifications:
The application fib n may raise the exception Overflow.
The application fib' n may raise the exception Overflow.
Input-output specifications:
If n ≥ 0, then fib n evaluates to the nth Fibonacci number.
If n ≥ 0, then fib' n evaluates to the nth and (n-1)st Fibonacci numbers, in that order.
Time complexity specifications:
If n ≥ 0, then fib n terminates in O(2^n) steps.
If n ≥ 0, then fib' n terminates in O(n) steps.
Equivalence specification:
For all n ≥ 0, fib n is equivalent to #1(fib' n).
24.2
Correctness Proofs
This misconception is encouraged by the C assert macro, which introduces an executable test that a certain computable condition holds. This is a fine thing, but from this many people draw the conclusion that assertions (specifications) are simply boolean tests. This is false.
24.3
means that the pre-conditions impose obligations on the caller, the user of the code, in order for the callee, the code itself, to be well-behaved. A conditional specification is a contract between the caller and the callee: if the caller meets the pre-conditions, the callee promises to fulfill the post-condition.
In the case of type specifications the compiler enforces this obligation
by ruling out as ill-typed any attempt to use a piece of code in a context that does not fulfill its typing assumptions. Returning to the example
above, if one attempts to use the expression x+1 in a context where x is not
an integer, one can hardly expect that x+1 will yield an integer. Therefore
it is rejected by the type checker as a violation of the stated assumptions
governing the types of its free variables.
What about specifications that are not mechanically enforced? For example, if x is negative, then we cannot infer anything about x+1 from the specification given above.² To make use of the specification in reasoning about its use in a larger program, it is essential that this pre-condition be met in the context of its use.
Lacking mechanical enforcement of these obligations, it is all too easy to neglect them when writing code. Many programming mistakes can be traced to violations of assumptions made by the callee that are not met by the caller.³ What can be done about this?
A standard method, called bullet-proofing, is to augment the callee with
run-time checks that ensure that its pre-conditions are met, raising an exception if they are not. For example, we might write a bullet-proofed
version of fib that ensures that its argument is non-negative as follows:
local
  exception PreCond
  fun unchecked_fib 0 = 1
    | unchecked_fib 1 = 1
    | unchecked_fib n =
      unchecked_fib (n-1) + unchecked_fib (n-2)
in
  fun fib n =
    if n < 0 then raise PreCond else unchecked_fib n
end
²There are obviously other specifications that carry more information, but we're only concerned here with the one given. Moreover, if f is an unknown function, then we will, in general, only have the specification, and not the code, to reason about.
³Sadly, these assumptions are often unstated and can only be culled from the code with great effort, if at all.
Chapter 25
Induction and Recursion
This chapter is concerned with the close relationship between recursion and induction in programming. If a function is recursively defined, an inductive proof is required to show that it meets a specification of its behavior. The motto is: when programming recursively, think inductively. Doing so significantly reduces the time spent debugging, and often leads to more efficient, robust, and elegant programs.
25.1
Exponentiation
Let's start with a very simple series of examples, all involving the computation of the integer exponential function. Our first example is to compute 2^n for integers n ≥ 0. We seek to define the function exp of type int -> int satisfying the specification
if n ≥ 0, then exp n evaluates to 2^n.
The precondition, or assumption, is that the argument n is non-negative. The postcondition, or guarantee, is that the result of applying exp to n is the number 2^n. The caller is required to establish the precondition before applying exp; in exchange, the caller may assume that the result is 2^n.
Here's the code:

fun exp 0 = 1
  | exp n = 2 * exp (n-1)
Does this function satisfy the specification? It does, and we can prove this by induction on n. If n = 0, then exp n evaluates to 1 (as you can see from the first line of its definition), which is, of course, 2^0. Otherwise, assume that exp is correct for n-1 ≥ 0, and consider the value of exp n. From the second line of its definition we can see that this is the value of 2 × p, where p is the value of exp (n-1). Inductively, p = 2^(n-1), so 2 × p = 2 × 2^(n-1) = 2^n, as desired. Notice that we need not consider arguments n < 0 since the precondition of the specification requires that this be so. We must, however, ensure that each recursive call satisfies this requirement in order to apply the inductive hypothesis.
That was pretty simple. Now let us consider the running time of exp
expressed as a function of n. Assuming that arithmetic operations are executed in constant time, then we can read off a recurrence describing its
execution time as follows:
T(0) = O(1)
T(n+1) = O(1) + T(n)
We are interested in solving a recurrence by finding a closed-form expression for it. In this case the solution is easily obtained:
T(n) = O(n)
Thus we have a linear time algorithm for computing the integer exponential function.
What about space? This is a much more subtle issue than time because it is much more difficult in a high-level language such as ML to see
where the space is used. Based on our earlier discussions of recursion and
iteration we can argue informally that the definition of exp given above
requires space given by the following recurrence:
S(0) = O(1)
S(n+1) = O(1) + S(n)
The justification is that the implementation requires a constant amount of
storage to record the pending multiplication that must be performed upon
completion of the recursive call.
Solving this simple recurrence yields the equation
S(n) = O(n)
expressing that exp is also a linear space algorithm for the integer exponential function.
Can we do better? Yes, on both counts! Here's how. Rather than count down by ones, multiplying by two at each stage, we use successive squaring to achieve logarithmic time and space requirements. The idea is that if the exponent is even, we square the result of raising 2 to half the given power; otherwise, we reduce the exponent by one and double the result, ensuring that the next exponent will be even. Here's the code:

fun square (n:int) = n*n
fun double (n:int) = n+n
fun fast_exp 0 = 1
  | fast_exp n =
    if n mod 2 = 0 then
      square (fast_exp (n div 2))
    else
      double (fast_exp (n-1))
Its specification is precisely the same as before. Does this code satisfy the specification? Yes, and we can prove this by using complete induction, a form of mathematical induction in which we may prove that n > 0 has a desired property by assuming not only that its predecessor has it, but that all preceding numbers have it, and arguing that therefore n must have it. Here's how it's done. For n = 0 the argument is exactly as before. Suppose, then, that n > 0. If n is even, the value of fast_exp n is the result of squaring the value of fast_exp (n div 2). Inductively this value is 2^(n div 2), so squaring it yields 2^(n div 2) × 2^(n div 2) = 2^(2 × (n div 2)) = 2^n (since n is even), as required. If, on the other hand, n is odd, the value is the result of doubling fast_exp (n-1). Inductively the latter value is 2^(n-1), so doubling it yields 2^n, as required.
Here's a recurrence governing the running time of fast_exp as a function of its argument:
T(0) = O(1)
T(2n) = O(1) + T(n)
T(2n+1) = O(1) + T(2n) = O(1) + T(n)
Solving this recurrence using standard techniques yields the solution
T(n) = O(lg n)
You should convince yourself that fast_exp also requires logarithmic space usage.
Can we do better? Well, it's not possible to improve the time requirement (at least not asymptotically), but we can reduce the space required to O(1) by putting the function into iterative (tail recursive) form. However, this may not be achieved in this case by simply adding an accumulator argument, without also increasing the running time! The obvious approach is to attempt to satisfy the specification
if n ≥ 0, then skinny_fast_exp (n, a) evaluates to 2^n × a.
Here's some code that achieves this specification:

fun skinny_fast_exp (0, a) = a
  | skinny_fast_exp (n, a) =
    if n mod 2 = 0 then
      skinny_fast_exp (n div 2,
        skinny_fast_exp (n div 2, a))
    else
      skinny_fast_exp (n-1, 2*a)
It is easy to see that this code works properly for n = 0 and for n > 0 when n is odd, but what if n > 0 is even? Then by induction we compute 2^(n div 2) × 2^(n div 2) × a by two recursive calls to skinny_fast_exp.
This yields the desired result, but what is the running time? Here's a recurrence to describe its running time as a function of n:
T(0) = O(1)
T(2n) = O(1) + 2T(n)
T(2n+1) = O(1) + T(2n) = O(1) + 2T(n)
Here again we have a standard recurrence whose solution is
T(n) = O(n).
Can we do better? The key is to recall the following important fact:
2^(2n) = (2^2)^n = 4^n.
We can achieve a logarithmic time and constant space bound by a change of base. Here's the specification:
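Here is one way to realize this idea; the function name gen_exp and the exact formulation are used for illustration. The specification is: if n ≥ 0 and b ≥ 1, then gen_exp (b, n, a) evaluates to b^n × a.

(* tail-recursive exponentiation by change of base:
   gen_exp (b, n, a) computes b^n * a; squaring the base
   halves the exponent without leaving a pending multiply *)
fun gen_exp (b, 0, a) = a
  | gen_exp (b, n, a) =
    if n mod 2 = 0 then
      gen_exp (b*b, n div 2, a)
    else
      gen_exp (b, n-1, b*a)

fun exp2 n = gen_exp (2, n, 1)

Since every call is a tail call, only constant storage is required, and since the exponent is at least halved every two steps, the running time is O(lg n).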
The iterative solutions are obtained by strengthening the specification and adding accumulator arguments. What we observe is the apparent paradox that it is often easier to do something (superficially) harder! In terms of proving, it is often easier to push through an inductive argument for a stronger specification, precisely because we get to assume the result as the inductive hypothesis when arguing the inductive step(s). We are limited only by the requirement that the specification be proved outright at the base case(s); no inductive assumption is available to help us along here. In terms of programming, it is often easier to compute a more complicated function involving accumulator arguments, precisely because we get to exploit the accumulator when making recursive calls. We are limited only by the requirement that the result be defined outright for the base case(s); no recursive calls are available to help us along here.
25.2
The Greatest Common Divisor
Let's consider a more complicated example: the computation of the greatest common divisor of a pair of non-negative integers. Recall that m is a divisor of n, written m | n, iff n is a multiple of m, which is to say that there is some k ≥ 0 such that n = k × m. The greatest common divisor of non-negative integers m and n is the largest p such that p | m and p | n. (By convention the g.c.d. of 0 and 0 is taken to be 0.) Here's the specification of the gcd function:
if m, n ≥ 0, then gcd (m, n) evaluates to the g.c.d. of m and n.
Euclid's algorithm for computing the g.c.d. of m and n is defined by complete induction on the product m × n. Here's the algorithm, written in ML:
fun gcd (m:int, 0):int = m
  | gcd (0, n:int):int = n
  | gcd (m:int, n:int):int =
    if m > n then
      gcd (m mod n, n)
    else
      gcd (m, n mod m)
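For example, gcd (12, 8) evaluates as follows: since 12 > 8, it calls gcd (12 mod 8, 8) = gcd (4, 8); since 4 ≤ 8, that calls gcd (4, 8 mod 4) = gcd (4, 0), which yields 4 by the first clause.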
Why is this algorithm correct? We may prove that gcd satisfies the specification by complete induction on the product m × n. If m × n is zero, then either m or n is zero, in which case the answer is, correctly, the other number. Otherwise the product is positive, and we proceed according to whether m > n or m ≤ n. Suppose that m > n. Observe that m mod n = m − (m div n) × n, so that (m mod n) × n = m × n − (m div n) × n² < m × n, and so by induction we return the g.c.d. of m mod n and n. It remains to show that this is the g.c.d. of m and n. Call it d. Since d divides both m mod n and n, we have k × d = m mod n = m − (m div n) × n and l × d = n for some non-negative k and l. Consequently, k × d = m − (m div n) × l × d, so m = (k + (m div n) × l) × d, which is to say that d divides m. Now if d' is any other divisor of m and n, then it is also a divisor of m mod n and n, so d ≥ d'. That is, d is the g.c.d. of m and n. The other case, m ≤ n, follows similarly. This completes the proof.
At this point you may well be thinking that all this inductive reasoning is surely helpful, but it's no replacement for good old-fashioned bullet-proofing: conditional tests inserted at critical junctures to ensure that key invariants do indeed hold at execution time. Sure, you may be thinking, these checks have a run-time cost, but they can be turned off once the code is in production, and anyway the cost is minimal compared to, say, the time required to read and write from disk. It's hard to complain about this attitude, provided that sufficiently cheap checks can be put into place and provided that you know where to put them to maximize their effectiveness. For example, there's no use checking i > 0 at the start of the then clause of a test for i > 0. Barring compiler bugs, it can't possibly be anything other than the case at that point in the program. Or it may be possible to insert a check whose computation is more expensive (or more complicated) than the one we're trying to perform, in which case we're defeating the purpose by including it!
This raises the questions of where we should put such checks, and what checks should be included to help ensure the correct operation (or, at least, graceful malfunction) of our programs. This is an instance of the general problem of writing self-checking programs. We'll illustrate the idea by elaborating on the g.c.d. example a bit further. Suppose we wish to write a self-checking g.c.d. algorithm that computes the g.c.d., and then checks the result to ensure that it really is the greatest common divisor of the two given non-negative integers before returning it as result. The code might look something like this:
The idea is to generalize gcd to a function ggcd that yields not only the g.c.d. d of m and n, but also coefficients a and b such that d = a × m + b × n, so that the result may be checked directly:

fun ggcd (m, 0) = (m, 1, 0)
  | ggcd (0, n) = (n, 0, 1)
  | ggcd (m, n) =
    if m > n then
      let
        val (d, a, b) = ggcd (m mod n, n)
      in
        (d, a, b - a * (m div n))
      end
    else
      let
        val (d, a, b) = ggcd (m, n mod m)
      in
        (d, a - b * (n div m), b)
      end
We may easily check that this code satisfies the specification by induction on the product m × n. If m × n = 0, then either m or n is 0, in which case the result follows immediately. Otherwise assume the result for smaller products, and show it for m × n > 0. Suppose m > n; the other case is handled analogously. Inductively we obtain d, a, and b such that d is the g.c.d. of m mod n and n, and hence is the g.c.d. of m and n, and d = a × (m mod n) + b × n. Since m mod n = m − (m div n) × n, it follows that d = a × m + (b − a × (m div n)) × n, from which the result follows.
Now we can write a self-checking g.c.d. as follows:
exception GCD_ERROR

fun checked_gcd (m, n) =
  let
    val (d, a, b) = ggcd (m, n)
  in
    if m mod d = 0 andalso
       n mod d = 0 andalso d = a*m + b*n
    then
      d
    else
      raise GCD_ERROR
  end
This algorithm takes no more time (asymptotically) than the original, and,
moreover, ensures that the result is correct. This illustrates the power of
the interplay between mathematical reasoning methods such as induction
and number theory and programming methods such as bulletproofing to
achieve robust, reliable, and, what is more important, elegant programs.
25.3
Sample Code
Chapter 26
Structural Induction
The importance of induction and recursion is not limited to functions defined over the integers. Rather, the familiar concept of mathematical induction over the natural numbers is an instance of the more general notion of structural induction over values of an inductively-defined type. Rather than develop a general treatment of inductively-defined types, we will rely on a few examples to illustrate the point. Let's begin by considering the natural numbers as an inductively defined type.
26.1
Natural Numbers
The set of natural numbers, N, may be thought of as the smallest set containing 0 and closed under the formation of successors. In other words, n
is an element of N iff either n = 0 or n = m + 1 for some m in N. Still
another way of saying it is to define N by the following clauses:
1. 0 is an element of N.
2. If m is an element of N, then so is m + 1.
3. Nothing else is an element of N.
(The third clause is sometimes called the extremal clause; it ensures that
we are talking about N and not just some superset of it.) All of these
definitions are equivalent ways of saying the same thing.
The type checker ensures that we have covered all cases, but it does not ensure that the pattern of structural recursion is strictly followed: we may accidentally define f(m+1) in terms of itself or some f(k) where k > m, breaking the pattern. The reason this is admitted is that the ML compiler cannot always follow our reasoning: we may have a clever algorithm in mind that isn't easily expressed by a simple structural induction. To avoid restricting the programmer, the language assumes the best and allows any form of definition.
Using the principle of structural induction for the natural numbers, we may prove properties of functions defined over the naturals. For example, we may easily prove by structural induction over the type nat that for every n ∈ N, exp n evaluates to a positive number. (In previous chapters we carried out proofs of more interesting program properties.)
26.2
Lists
Generalizing a bit, we may think of the type 'a list as inductively defined by the following clauses:
1. nil is a value of type 'a list.
2. If h is a value of type 'a, and t is a value of type 'a list, then h::t is a value of type 'a list.
3. Nothing else is a value of type 'a list.
26.3
Trees
26.4
Does this pattern apply to every datatype declaration? Yes and no. No matter what the form of the declaration, it always makes sense to define a function over it by a clausal function definition with one clause per constructor. Such a definition is guaranteed to be exhaustive (to cover all cases), and serves as a valuable guide to structuring your code. (It is especially valuable if you change the datatype declaration, because then the compiler will inform you of which clauses need to be added to or removed from functions defined over that type in order to restore it to a sensible definition.) The slogan is:
To define functions over a datatype, use a clausal definition with one clause per constructor.
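For instance, with an invented datatype of shapes (not drawn from the text), the slogan dictates one clause of area per constructor:

(* a hypothetical datatype of simple shapes *)
datatype shape =
    Circle of real
  | Square of real
  | Rect of real * real

(* one clause per constructor: exhaustive by construction *)
fun area (Circle r) = Math.pi * r * r
  | area (Square s) = s * s
  | area (Rect (w, h)) = w * h

If a Triangle constructor were later added to shape, the compiler would flag area as inexhaustive, pointing directly at the clause that must be added.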
26.5
Abstracting Induction
26.6
Sample Code
Chapter 27
Proof-Directed Debugging
In this chapter we'll put specification and verification techniques to work in devising a regular expression matcher. The code is similar to that sketched in chapter 1, but we will use verification techniques to detect and correct a subtle error that may not be immediately apparent from inspecting or even testing the code. We call this process proof-directed debugging.
The first task is to devise a precise specification of the regular expression matcher. This is a difficult problem in itself. We then attempt to verify that the matching program developed in chapter 1 satisfies this specification. The proof attempt breaks down. Careful examination of the failure reveals a counterexample to the specification: the program does not satisfy it. We then consider how best to resolve the problem, not by change of implementation, but instead by change of specification.
27.1
Regular Expressions and Languages
Before we begin work on the matcher, let us first define the set of regular expressions and their meaning as sets of strings. The set of regular expressions is given by the following grammar:
r ::= 0 | 1 | a | r1 r2 | r1 + r2 | r1*
Here a ranges over a given alphabet, a set of primitive letters that may be used in a regular expression. A string is a finite sequence of letters of the alphabet. We write ε for the null string, the empty sequence of letters. We write s1 s2 for the concatenation of the strings s1 and s2, the string consisting of the letters of s1 followed by those of s2.
Each regular expression r denotes a language (a set of strings) L(r), defined as follows:
L(0) = 0
L(1) = 1
L(a) = {a}
L(r1 r2) = L(r1) L(r2)
L(r1 + r2) = L(r1) + L(r2)
L(r*) = L(r)*
The operations on languages used here are defined by these equations:
0 = {}
1 = {ε}
L1 + L2 = L1 ∪ L2
L1 L2 = { s1 s2 | s1 ∈ L1, s2 ∈ L2 }
L^(0) = 1
L^(i+1) = L L^(i)
L* = ⋃_{i ≥ 0} L^(i)
27.2
Our goal is to define a function match of type regexp -> string -> bool that determines whether or not a given string matches a given regular expression. More precisely, we wish to satisfy the following specification:
For every regular expression r and every string s, match r s terminates, and evaluates to true iff s ∈ L(r).
We saw in chapter 1 that a natural way to define the procedure match is to use a technique called continuation passing. We defined an auxiliary function match_is with the type
regexp -> char list -> (char list -> bool) -> bool
that takes a regular expression, a list of characters (essentially a string, but in a form suitable for incremental processing), and a continuation, and yields a boolean. The idea is that match_is takes a regular expression r, a character list cs, and a continuation k, and determines whether or not some initial segment of cs matches r, passing the remaining characters cs' to k in the case that there is such an initial segment, and yields false otherwise. Put more precisely,
For every regular expression r, character list cs, and continuation k, if cs = cs' @ cs'' with cs' ∈ L(r) and k cs'' evaluates to true, then match_is r cs k evaluates to true; otherwise, match_is r cs k evaluates to false.
Unfortunately, this specification is too strong to ever be satisfied by any program! Can you see why? The difficulty is that if k is not guaranteed to terminate for all inputs, then there is no way that match_is can behave as required. For example, if there is no input on which k terminates, the specification requires that match_is return false. It should be intuitively clear that we can never implement such a function. Instead, we must restrict attention to total continuations, those that always terminate with true or false on any input. This leads to the following revised specification:
For every regular expression r, character list cs, and total continuation k, if cs = cs' @ cs'' with cs' ∈ L(r) and k cs'' evaluates to true, then match_is r cs k evaluates to true; otherwise, match_is r cs k evaluates to false.
Observe that this specification makes use of an implicit existential quantification. Written out in full, we might say "For all ..., if there exist cs' and cs'' such that cs = cs' @ cs'' with ..., then ....". This observation makes clear that we must search for a suitable splitting of cs into two parts such that the first part is in L(r) and the second is accepted by k. There may, in general, be many ways to partition the input so as to satisfy both of these requirements; we need only find one such way. Note, however, that if cs = cs' @ cs'' with cs' ∈ L(r) but k cs'' yields false, we must reject this partitioning and search for another. In other words we cannot simply accept any partitioning whose initial segment matches r, but rather only those that also induce k to accept the corresponding final segment. We may return false only if there is no such splitting, not merely if a particular splitting fails to work.
Suppose for the moment that match_is satisfies this specification. Does it follow that match satisfies the original specification? Recall that the function match is defined as follows:

fun match r s =
  match_is r
    (String.explode s)
    (fn nil => true | _ => false)

Notice that the initial continuation is indeed total, and that it yields true (accepts) iff it is applied to the null string. Therefore match satisfies the following property, obtained from the specification of match_is by plugging in the initial continuation:
For every regular expression r and string s, if s ∈ L(r), then match r s evaluates to true, and otherwise match r s evaluates to false.
This is precisely the property that we desire for match. Thus match is correct (satisfies its specification) if match_is is correct.
So far so good. But does match_is satisfy its specification? If so, we are done. How might we check this? Recall the definition of match_is given in the overview:

fun match_is Zero _ k = false
  | match_is One cs k = k cs
  | match_is (Char c) nil k = false
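  (* The remaining clauses do not survive in this text; they are
     reconstructed here along the lines of the matcher sketched in
     chapter 1. The constructor names Times, Plus, and Star are
     assumed from that presentation. *)
  | match_is (Char c) (d::cs) k =
      c = d andalso k cs
  | match_is (Times (r1, r2)) cs k =
      match_is r1 cs (fn cs' => match_is r2 cs' k)
  | match_is (Plus (r1, r2)) cs k =
      match_is r1 cs k orelse match_is r2 cs k
  | match_is (Star r) cs k =
      k cs orelse match_is r cs (fn cs' => match_is (Star r) cs' k)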
true, and hence match_is r1 (cs1' @ cs2' @ cs'') k' evaluates to true, as required.
If, however, no such partitioning exists, then one of three situations occurs:
1. either no initial segment of cs matches r1, in which case the outer recursive call yields false, as required, or
2. for every initial segment matching r1, no initial segment of the corresponding final segment matches r2, in which case the inner recursive call yields false on every call, and hence the outer call yields false, as required, or
3. every pair of successive initial segments of cs matching r1 and r2 successively results in k evaluating to false, in which case the inner recursive call always yields false, and hence the continuation k' always yields false, and hence the outer recursive call yields false, as required.
Be sure you understand the reasoning involved here; it is quite tricky to get right!
We seem to be on track, with one more case to consider, r = r1*. This case would appear to be a combination of the preceding two cases for alternation and concatenation, with a similar argument sufficing to establish correctness. But there is a snag: the second recursive call to match_is leaves the regular expression unchanged! Consequently we cannot apply the inductive hypothesis to establish that it behaves correctly in this case, and the obvious proof attempt breaks down.
What to do? A moment's thought suggests that we proceed by an inner induction on the length of the string, based on the idea that if some initial segment of cs matches r1*, then either that initial segment is the null string (base case), or cs = cs' @ cs'' with cs' ∈ L(r1) and cs'' ∈ L(r1*) (induction step). We then handle the base case directly, and handle the inductive case by assuming that match_is behaves correctly for cs'' and showing that it behaves correctly for cs. But there is a flaw in this argument: the string cs'' need not be shorter than cs in the case that cs' is the null string! In that case the inductive hypothesis does not apply, and we are once again unable to complete the proof.
This time we can use the failure of the proof to obtain a counterexample to the specification! For if r = 1*, for example, then match_is r cs k does not terminate! In general if r = r1* with ε ∈ L(r1), then match_is r cs k fails to terminate. In other words, match_is does not satisfy the specification we have given for it. Our conjecture is false!
Our failure to establish that match_is satisfies its specification led to a counterexample that refuted our conjecture and uncovered a genuine bug in the program: the matcher may not terminate for some inputs. What to do? One approach is to explicitly check for looping behavior during matching by ensuring that each recursive call matches some non-empty initial segment of the string. This will work, but at the expense of cluttering the code and imposing additional run-time overhead. You should write out a version of the matcher that works this way, and check that it indeed satisfies the specification we've given above.
An alternative is to observe that the proof goes through under the additional assumption that no iterated regular expression matches the null string.
To make this precise, define δ(r) to be 1 if ε ∈ L(r) and 0 otherwise, and define r† to be a regular expression denoting the same language as r, but not accepting the null string. These may be defined as follows:
δ(0) = 0
δ(1) = 1
δ(a) = 0
δ(r1 r2) = δ(r1) δ(r2)
δ(r1 + r2) = δ(r1) + δ(r2)
δ(r*) = 1

0† = 0
1† = 0
a† = a
(r1 + r2)† = r1† + r2†
(r1 r2)† = δ(r1) r2† + r1† δ(r2) + r1† r2†
(r*)† = r† (r†)*
The only tricky case is the one for concatenation, which must take account of the possibility that r1 or r2 accepts the null string.
Exercise 3
Show that L(r†) = L(r) \ 1.
27.3
Sample Code
Chapter 28
Persistent and Ephemeral Data
Structures
This chapter is concerned with persistent and ephemeral abstract types. The
distinction is best explained in terms of the logical future of a value. Whenever a value of an abstract type is created it may be subsequently acted
upon by the operations of the type (and, since the type is abstract, by no
other operations). Each of these operations may yield (other) values of that
abstract type, which may themselves be handed off to further operations
of the type. Ultimately a value of some other type, say a string or an integer, is obtained as an observable outcome of the succession of operations
on the abstract value. The sequence of operations performed on a value of an abstract type constitutes a logical future of that type: a computation that starts with that value and ends with a value of some observable type.
We say that a type is ephemeral iff every value of that type has at most one
logical future, which is to say that it is handed off from one operation of
the type to another until an observable value is obtained from it. This is
the normal case in familiar imperative programming languages because
in such languages the operations of an abstract type destructively modify
the value upon which they operate; its original state is irretrievably lost by
the performance of an operation. It is therefore inherent in the imperative
programming model that a value have at most one logical future. In contrast, values of an abstract type in functional languages such as ML may
have many different logical futures, precisely because the operations do
not destroy the value upon which they operate, but rather create fresh
values of that type to yield as results. Such values are said to be persistent
because they persist after application of an operation of the type, and in
fact may serve as arguments to further operations of that type.
Some examples will help to clarify the distinction. The primitive list
types of ML are persistent because the performance of an operation such
as consing, appending, or reversing a list does not destroy the original
list. This leads naturally to the idea of multiple logical futures for a given
value, as illustrated by the following code sequence:
val l = [1,2,3]         (* original list *)
val m1 = hd l           (* first future of l *)
val n1 = rev (tl l)
val m2 = l @ [4,5,6]    (* second future of l *)
Notice that the original list value, [1,2,3], has two distinct logical futures,
one in which we remove its head, then reverse the tail, and the other in
which we append the list [4,5,6] to it. The ability to easily handle multiple logical futures for a data structure is a tremendous source of flexibility
and expressive power, alleviating the need to perform tedious bookkeeping to manage versions or copies of a data structure to be passed to
different operations.
The prototypical ephemeral data structure in ML is the reference cell.
Performing an assignment operation on a reference cell changes it irrevocably; the original contents of the cell are lost, even if we keep a handle on
it.
val r = ref 0      (* original cell *)
val s = r
val _ = (s := 1)
val x = !r         (* x is 1! *)
Notice that the contents of (the cell bound to) r changes as a result of performing an assignment to the underlying cell. There is only one future
for this cell; a reference to its original binding does not yield its original
contents.
More elaborate forms of ephemeral data structures are certainly possible. For example, the following declaration defines a type of lists whose tails are mutable. It is therefore a singly-linked list, one whose predecessor relation may be changed dynamically by assignment:

datatype 'a mutable_list =
    Nil
  | Cons of 'a * 'a mutable_list ref
Values of this type are ephemeral in the sense that some operations on values of this type are destructive, and hence are irreversible (so to speak!). For example, here's an implementation of a destructive reversal of a mutable list. Given a mutable list l, this function reverses the links in the cells so that the elements occur in reverse order of their occurrence in l.
local
  fun ipr (Nil, a) = a
    | ipr (this as (Cons (_, r as ref next)), a) =
      ipr (next, (r := a; this))
in
  (* destructively reverse a list *)
  fun inplace_reverse l = ipr (l, Nil)
end
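For example (an illustrative interaction), the reversal consumes the original list:

val l = Cons (1, ref (Cons (2, ref Nil)))
val r = inplace_reverse l
(* r is Cons (2, ref (Cons (1, ref Nil))), but l is now
   Cons (1, ref Nil): its tail cell was destructively updated *)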
As you can see, the code is quite tricky to understand! The idea is the
same as the iterative reverse function for pure lists, except that we re-use
the nodes of the original list, rather than generate new ones, when moving
elements onto the accumulator argument.
The distinction between ephemeral and persistent data structures is essentially the distinction between functional (effect-free) and imperative (effect-ful) programming: functional data structures are persistent; imperative data structures are ephemeral. However, this characterization is oversimplified in two respects. First, it is possible to implement a persistent data structure that exploits mutable storage. Such a use of mutation is an example of what is called a benign effect, because for all practical purposes the data structure is "purely functional" (i.e., persistent), but is in fact implemented using mutable storage. As we will see later, the exploitation of benign effects is crucial for building efficient implementations of persistent data structures. Second, it is possible for a persistent data type to be used in such a way that persistence is not exploited: rather, every value of the type has at most one future in the program. Such a type is said to be single-threaded, reflecting the linear, as opposed to branching, structure of the future uses of values of that type. The significance of a single-threaded type is that it may as well have been implemented as an ephemeral data structure (e.g., by having observable effects on values) without changing the behavior of the program.
28.1
Persistent Queues
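One implementation that validates the analysis in the next section is the classic two-list representation, sketched here (the structure name and details are illustrative):

structure Queue =
struct
  (* a queue is a pair (front, back): elements are removed from
     front and inserted on back; when front is exhausted, back is
     reversed to become the new front *)
  type 'a queue = 'a list * 'a list
  val empty : 'a queue = (nil, nil)
  fun insert (x, (f, b)) = (f, x::b)
  (* Empty is the Basis library's top-level exception *)
  fun remove (nil, nil) = raise Empty
    | remove (nil, b) = remove (rev b, nil)
    | remove (x::f, b) = (x, (f, b))
end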
28.2
Amortized Analysis
How can we prove this claim? First we give an informal argument, then we tighten it up with a more rigorous analysis. We are to account for the total work performed by a sequence of n operations by showing that any sequence of n operations can be executed in cn steps for some constant c. Dividing by n, we obtain the result that each operation takes c steps when amortized over the entire sequence. The key is to observe first that
the work required to execute a sequence of queue operations may be apportioned to the elements themselves, then that only a constant amount of
work is expended on each element. The life of a queue element may be
divided into three stages: its arrival in the queue, its transit time in the
queue, and its departure from the queue. In the worst case each element
passes through each of these stages (but may die young, never participating in the second or third stage). Arrival requires constant time to
add the element to the back of the queue. Transit consists of being moved
from the back to the front by a reversal, which takes constant time per
element on the back. Departure takes constant time to pattern match and
extract the element. Thus at worst we require three steps per element to account for the entire effort expended to perform a sequence of queue operations. This is in fact a conservative upper bound in the sense that we may need fewer than 3n steps for the sequence, but asymptotically the bound is optimal: we cannot do better than constant time per operation! (You might reasonably wonder whether there is a worst-case, non-amortized constant-time implementation of persistent queues. The answer is yes, but the code is far more complicated than the simple implementation we are sketching here.)
This argument can be made rigorous as follows. The general idea is to introduce the notion of a charge scheme that provides an upper bound on the actual cost of executing a sequence of operations. An upper bound on the charge will then provide an upper bound on the actual cost. Let T(n) be the cumulative time required (in the worst case) to execute a sequence of n queue operations. We will introduce a charge function, C(n), representing the cumulative charge for executing a sequence of n operations, and show that T(n) ≤ C(n) = O(n). It is convenient to express this in terms of a function R(n) = C(n) − T(n) representing the cumulative residual, or overcharge, which is the amount that the charge for n operations exceeds the actual cost of executing them. We will arrange things so that R(n) ≥ 0 and that C(n) = O(n), from which the result follows immediately.
Down to specifics. By charging 2 for each insert operation and 1 for each remove, it follows that C(n) ≤ 2n for any sequence of n inserts and removes. Thus C(n) = O(n). After any sequence of n ≥ 0 operations have been performed, the queue contains 0 ≤ b ≤ n elements on the back half and 0 ≤ f ≤ n elements on the front half. We claim that for every n ≥ 0, R(n) = b. We prove this by induction on n ≥ 0. The condition clearly holds after performing 0 operations, since T(0) = 0, C(0) = 0, and b = 0.
28.3
Sample Code
Chapter 29
Options, Exceptions, and
Continuations
In this chapter we discuss the close relationships between option types,
exceptions, and continuations. They each provide the means for handling
failure to produce a value in a computation. Option types provide the
means of explicitly indicating in the type of a function the possibility that
it may fail to yield a normal result. The result type of the function forces
the caller to dispatch explicitly on whether or not it returned a normal
value. Exceptions provide the means of implicitly signalling failure to return a normal result value, without sacrificing the requirement that an application of such a function cannot ignore failure to yield a value. Continuations provide another means of handling failure by providing a function
to invoke in the case that normal return is impossible.
29.1
The n-Queens Problem
We will explore the trade-offs between these three approaches by considering three different implementations of the n-queens problem: find a way to place n queens on an n × n chessboard in such a way that no two queens attack one another. The general strategy is to place queens in successive columns in such a way that each newly placed queen is not attacked by a previously placed queen. Unfortunately it's not possible to do this in one pass; we may find that we can safely place k < n queens on the board, only to discover that there is no way to place the next one. To find a solution we must reconsider earlier decisions.
29.2
Solution Using Options
29.3
Solution Using Exceptions
The explicit check on the result of each recursive call can be replaced by the use of exceptions. Rather than have addqueen return a value of type Board.board option, we instead have it return a value of type Board.board, if possible, and otherwise raise an exception indicating failure. The case analysis on the result is replaced by a use of an exception handler. Here's the code:
exception Fail

(* addqueen bd yields bd', where bd' is a complete safe
   placement extending bd, if one exists, and raises Fail
   otherwise *)
fun addqueen bd =
  let
    fun try j =
      if j > Board.size bd then
        raise Fail
      else if Board.safe (bd, j) then
        addqueen (Board.place (bd, j))
          handle Fail => try (j+1)
      else
        try (j+1)
  in
    if Board.complete bd then
      bd
    else
      try 1
  end

fun queens n =
  SOME (addqueen (Board.new n))
    handle Fail => NONE
The main difference between this solution and the previous one is that
both calls to addqueen must handle the possibility that it raises the exception Fail. In the outermost call this corresponds to a complete failure to
find a safe placement, which means that queens must return NONE. If a safe
placement is indeed found, it is wrapped with the constructor SOME to indicate success. In the recursive call within try, an exception handler is
required to handle the possibility of there being no safe placement starting in the current position. This check corresponds directly to the case
analysis required in the solution based on option types.
What are the trade-offs between the two solutions?
1. The solution based on option types makes explicit in the type of
the function addqueen the possibility of failure. This forces the programmer to explicitly test for failure using a case analysis on the result of the call. The type checker will ensure that one cannot use a
29.4
Solution Using Continuations
The third solution replaces exceptions with an explicit failure continuation, fc, which is invoked when no safe placement exists:

fun addqueen (bd, fc) =
  let
    fun try j =
      if j > Board.size bd then
        fc ()
      else if Board.safe (bd, j) then
        addqueen
          (Board.place (bd, j),
           fn () => try (j+1))
      else
        try (j+1)
  in
    if Board.complete bd then
      SOME bd
    else
      try 1
  end
fun queens n =
addqueen (Board.new n, fn () => NONE)
Here again the differences are small, but significant. The initial continuation simply yields NONE, reflecting the ultimate failure to find a safe placement. On a recursive call we pass to addqueen a continuation that resumes search at the next row of the current column. Should we exceed the number of rows on the board, we invoke the failure continuation of the most recent call to addqueen.
The solution based on continuations is very close to the solution based on exceptions, both in form and in terms of efficiency. Which is preferable? Here again there is no easy answer; we can only offer general advice. First off, as we've seen in the case of regular expression matching, failure continuations are more powerful than exceptions; there is no obvious way to replace the use of a failure continuation with a use of exceptions in the matcher. However, in the case that exceptions would suffice, it is generally preferable to use them, since one may then avoid passing an explicit failure continuation. More significantly, the compiler ensures that an uncaught exception aborts the program gracefully, whereas failure to invoke a continuation is not in itself a run-time fault. Using the right tool for the right job makes life easier.
29.5
Sample Code
Chapter 30
Higher-Order Functions
Higher-order functions, those that take functions as arguments or return functions as results, are powerful tools for building programs. An interesting application of higher-order functions is to implement infinite sequences of values as (total) functions from the natural numbers (non-negative integers) to the type of values of the sequence. We will develop a small package of operations for creating and manipulating sequences, all of which are higher-order functions since they take sequences (functions!) as arguments and/or return them as results. A natural way to define many sequences is by recursion, or self-reference. Since sequences are functions, we may use recursive function definitions to define such sequences. Alternatively, we may think of such a sequence as arising from a "loopback" or "feedback" construct. We will explore both approaches.
Sequences may be used to simulate digital circuits by thinking of a wire as a sequence of bits developing over time. The ith value of the sequence corresponds to the signal on the wire at time i. For simplicity we will assume a perfect waveform: the signal is always either high or low (or is undefined); we will not attempt to model electronic effects such as attenuation or noise. Combinational logic elements (such as and-gates or inverters) are operations on wires: they take in one or more wires as input and yield one or more wires as results. Digital logic elements (such as flip-flops) are obtained from combinational logic elements by feedback, or recursion: a flip-flop is a recursively-defined wire!
30.1
Infinite Sequences
Let us begin by developing a sequence package. Here is a suitable signature defining the type of sequences:
signature SEQUENCE =
sig
  type 'a seq = int -> 'a
  (* constant sequence *)
  val constantly : 'a -> 'a seq
  (* alternating values *)
  val alternately : 'a * 'a -> 'a seq
  (* insert at front *)
  val insert : 'a * 'a seq -> 'a seq
  val map : ('a -> 'b) -> 'a seq -> 'b seq
  val zip : 'a seq * 'b seq -> ('a * 'b) seq
  val unzip : ('a * 'b) seq -> 'a seq * 'b seq
  (* fair merge *)
  val merge : ('a * 'a) seq -> 'a seq
  val stretch : int -> 'a seq -> 'a seq
  val shrink : int -> 'a seq -> 'a seq
  val take : int -> 'a seq -> 'a list
  val drop : int -> 'a seq -> 'a seq
  val shift : 'a seq -> 'a seq
  val loopback : ('a seq -> 'a seq) -> 'a seq
end
Observe that we expose the representation of sequences as functions. This is done to simplify the definition of recursive sequences as recursive functions. Alternatively we could have hidden the representation type, at the expense of making it a bit more awkward to define recursive sequences. In the absence of this exposure of representation, recursive sequences may only be built using the loopback operation, which constructs a recursive sequence by "looping back" the output of a sequence transformer to its input. Most of the other operations of the signature are adaptations of familiar operations on lists. Two exceptions to this rule are the functions stretch and shrink, which dilate and contract the sequence by a given time factor.
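The definition of loopback may be written as follows (this form is reconstructed to agree with the expanded version displayed below):

fun loopback loop n = loop (loopback loop) n

Two remarks about this definition are in order. First, one might be tempted to simplify it as follows: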
(* bad definition *)
fun loopback loop = loop (loopback loop)

This definition is incorrect: any application of loopback would immediately loop forever! In contrast, the original definition is arranged so that an application of loopback immediately returns a function. This may be made more apparent by writing it in the following form, which is entirely equivalent to the definition given above:

fun loopback loop =
  fn n => loop (loopback loop) n

This format makes it clear that loopback immediately returns a function when applied to a loop functional.
Second, for an application of loopback to a loop to make sense, it must be the case that the loop returns a sequence without "touching" the argument sequence (i.e., without applying the argument to a natural number). Otherwise accessing the sequence resulting from an application of loopback would immediately loop forever. Some examples will help to illustrate the point.
First, let's build a few sequences without using the loopback function, just to get familiar with using sequences:

val nats : int seq = fn n => n
val l1 = take 10 nats           (* [0,1,2,3,4,5,6,7,8,9] *)
val l2 = take 5 (drop 5 nats)   (* [5,6,7,8,9] *)
fun fibs n =
  if n <= 1 then 1 else fibs (n-1) + fibs (n-2)
(* take 5 fibs evaluates to [1,1,2,3,5] *)
Now let's consider an alternative definition of fibs that uses the loopback operation:
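One way to write it is as follows (a sketch; the name fibs_loop is illustrative):

(* the loop functional returns a sequence immediately, without
   touching its argument s, so loopback may safely be applied *)
fun fibs_loop (s : int seq) : int seq =
  fn 0 => 1
   | 1 => 1
   | n => s (n-1) + s (n-2)

val fibs = loopback fibs_loop

Notice that fibs_loop returns a sequence without sampling its argument s; s is applied only when the resulting sequence is itself sampled at some n ≥ 2, as the discussion above requires.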
30.2
Circuit Simulation
With these ideas in mind, we may apply the sequence package to build an implementation of digital circuits. Let's start with wires, which are represented as sequences of levels:
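The declarations themselves do not appear here, but their likely shape can be sketched as follows, with names matching the code below (and assuming the sequence operations are in scope):

datatype level = High | Low | Undef
type wire = level seq
type unary_gate = wire -> wire
type binary_gate = (level * level) seq -> wire

(* combinational behavior of the basic gates on levels *)
fun logical_nop l = l
fun logical_not High = Low
  | logical_not Low = High
  | logical_not Undef = Undef
fun logical_nor (High, _) = Low
  | logical_nor (_, High) = Low
  | logical_nor (Low, Low) = High
  | logical_nor _ = Undef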
fun gate f w i = f (w (i-1))    (* unit delay *)
val delay : unary_gate = gate logical_nop
val inverter : unary_gate = gate logical_not
val nor_gate : binary_gate = gate logical_nor

These behaviors may be observed by using take and drop to inspect the values on the circuit.
It is a good exercise to derive a system of equations governing the RS flip-flop from the definition we've given here, using the implementation of the sequence operations given above. Observe that the delays arising from the combinational logic elements ensure that a solution exists, by ensuring that the next element of the output refers only to the previous elements, and not the current element.
Finally, we consider a variant implementation of an RS flip-flop using the loopback operation:

fun loopback2 (f : wire * wire -> wire * wire) =
  unzip (loopback (zip o f o unzip))

fun RS_ff (S : wire, R : wire) =
  let
    fun RS_loop (X, Y) =
      (nor_gate (zip (S, Y)),
       nor_gate (zip (X, R)))
  in
    loopback2 RS_loop
  end
Here we must define a binary loopback function to implement the flip-flop. This is achieved by reducing binary loopback to unary loopback by composing with zip and unzip.
30.3
30.4
Sample Code
Chapter 31
Memoization
In this chapter we will discuss memoization, a programming technique for
cacheing the results of previous computations so that they can be quickly
retrieved without repeated effort. Memoization is fundamental to the implementation of lazy data structures, either by hand or using the provisions of the SML/NJ compiler.
31.1
Cacheing Results
Consider the following definitions of a general summation function, sum, and a function p defined in terms of it:

fun sum f 0 = 0
  | sum f n = (f n) + sum f (n-1)

fun p 1 = 1
  | p n = sum (fn k => (p k) * (p (n-k))) (n-1)
By tabulating already-computed results in a memo pad, we may avoid recomputing them:

local
  val limit = 100
  val memopad : int option Array.array =
    Array.array (limit, NONE)
in
  fun p' 1 = 1
    | p' n = sum (fn k => (p k) * (p (n-k))) (n-1)
  and p n =
    if n < limit then
      (case Array.sub (memopad, n)
         of SOME r => r
          | NONE =>
            let
              val r = p' n
            in
              Array.update (memopad, n, SOME r);
              r
            end)
    else
      p' n
end
The main idea is to modify the original definition so that the recursive calls consult and update the memo pad. The exported version of the function is the one that refers to the memo pad. Notice that the definitions of p and p' are mutually recursive!
31.2
Laziness
the memo pad. However, the contents of the memo pad change as a result of forcing it, so that subsequent forces exhibit different behavior. Specifically, the first time dt is forced, it forces the thunk t, binds its value to r, zaps the memo pad, and returns r. The second time dt is forced, it forces the contents of the memo pad, as before, but this time it contains the constant function that immediately returns r. Altogether we have ensured that t is forced at most once, by using a form of self-modifying code.
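A package with this behavior can be sketched as follows (the structure name Susp matches its uses below; the details are illustrative):

structure Susp =
struct
  type 'a susp = unit -> 'a
  fun force t = t ()
  fun delay (t : 'a susp) : 'a susp =
    let
      exception Impossible
      (* the memo pad holds the thunk to run on the next force *)
      val memo : 'a susp ref = ref (fn () => raise Impossible)
      fun t' () =
        let val r = t ()
        in
          memo := (fn () => r);  (* zap the memo pad *)
          r
        end
    in
      memo := t';
      fn () => (!memo) ()
    end
end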
Here's an example to illustrate the effect of delaying a thunk:

val t = Susp.delay (fn () => print "hello")
val _ = Susp.force t    (* prints "hello" *)
val _ = Susp.force t    (* silent *)

Notice that "hello" is printed once, not twice! The reason is that the suspended computation is evaluated at most once, so the message is printed at most once on the screen.
31.3
See chapter 15 for a description of the SML/NJ lazy data type mechanism.
31.4
Recursive Suspensions
31.5
Sample Code
Chapter 32
Data Abstraction
An abstract data type (ADT) is a type equipped with a set of operations for manipulating values of that type. An ADT is implemented by providing a representation type for the values of the ADT and an implementation for the operations defined on values of the representation type. What makes an ADT abstract is that the representation type is hidden from clients of the ADT. Consequently, the only operations that may be performed on a value of the ADT are the given ones. This ensures that the representation may be changed without affecting the behavior of the client: since the representation is hidden from it, the client cannot depend on it. This also facilitates the implementation of efficient data structures by imposing a condition, called a representation invariant, on the representation that is preserved by the operations of the type. Each operation that takes a value of the ADT as argument may assume that the representation invariant holds. In compensation, each operation that yields a value of the ADT as result must guarantee that the representation invariant holds of it. If the operations of the ADT preserve the representation invariant, then it must truly be invariant: no other code in the system could possibly disrupt it. Put another way, any violation of the representation invariant may be localized to the implementation of one of the operations. This significantly reduces the time required to find an error in a program.
32.1
Dictionaries
To make these ideas concrete we will consider the abstract data type of dictionaries. A dictionary is a mapping from keys to values. For simplicity we take keys to be strings, but it is possible to define a dictionary for any ordered type; the values associated with keys are completely arbitrary. Viewed as an ADT, a dictionary is a type 'a dict of dictionaries mapping strings to values of type 'a, together with empty, insert, and lookup operations that create a new dictionary, insert a value with a given key, and retrieve the value associated with a key (if any). In short a dictionary is an implementation of the following signature:
signature DICT =
sig
  type key = string
  type 'a entry = key * 'a
  type 'a dict
  exception Lookup of key
  val empty : 'a dict
  val insert : 'a dict * 'a entry -> 'a dict
  val lookup : 'a dict * key -> 'a
end
Notice that the type 'a dict is not specified in the signature, whereas the types key and 'a entry are defined to be string and string * 'a, respectively.
32.2
Binary Search Trees
The representation type is a binary tree with values at the nodes; the representation invariant isolates a set of structures satisfying some additional, more stringent, conditions. We may use a binary search tree to implement a dictionary as follows:
structure BinarySearchTree :> DICT =
struct
  type key = string
  type 'a entry = key * 'a

  (* Rep invariant: a tree is a binary search tree *)
  datatype 'a tree =
    Empty |
    Node of 'a tree * 'a entry * 'a tree
  type 'a dict = 'a tree

  exception Lookup of key

  val empty = Empty

  fun insert (Empty, entry) =
        Node (Empty, entry, Empty)
    | insert (n as Node (l, e' as (k', _), r), e as (k, _)) =
        (case String.compare (k, k')
           of LESS => Node (insert (l, e), e', r)
            | GREATER => Node (l, e', insert (r, e))
            | EQUAL => n)

  fun lookup (Empty, k) = raise (Lookup k)
    | lookup (Node (l, (k', v), r), k) =
        (case String.compare (k, k')
           of EQUAL => v
            | LESS => lookup (l, k)
            | GREATER => lookup (r, k))
end
Notice that empty is defined to be a valid binary search tree, that insert
yields a binary search tree if its argument is one, and that lookup relies
on its argument being a binary search tree (if not, it might fail to find a
key that in fact occurs in the tree!). The structure BinarySearchTree is
sealed with the signature DICT to ensure that the representation type is
held abstract.
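For instance, a client might use the structure as follows (the bindings here are illustrative):

val d = BinarySearchTree.insert (BinarySearchTree.empty, ("one", 1))
val one = BinarySearchTree.lookup (d, "one")    (* ==> 1 *)

(* Because BinarySearchTree is sealed opaquely, d has the abstract
   type int BinarySearchTree.dict; the constructors Empty and Node
   are not visible to clients, so no client code can depend on, or
   disrupt, the underlying tree representation. *)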
32.3 Balanced Binary Search Trees
The difficulty with binary search trees is that they may become unbalanced. In particular, if we insert keys in ascending order, the representation is essentially just a list! The left child of each node is empty; the right child is the rest of the dictionary. Consequently, it takes O(n) time in the worst case to perform a lookup on a dictionary containing n elements. Such a tree is said to be unbalanced because the children of a node have widely varying heights. Were it to be the case that the children of every node had roughly equal height, then lookup would take O(lg n) time, a considerable improvement.
Can we do better? Many approaches have been suggested. The one we will consider here is an instance of what is called a self-adjusting tree: the red-black tree (the reason for the name will be apparent shortly). The general idea of a self-adjusting tree is that operations on the tree may cause a reorganization of its structure to ensure that some invariant is maintained. In our case we will arrange things so that the tree is self-balancing, meaning that the children of any node have roughly the same height. As we just remarked, this ensures that lookup is efficient.
How is this achieved? By imposing a clever representation invariant on
the binary search tree, called the red-black tree condition. A red-black tree
is a binary search tree in which every node is colored either red or black
(with the empty tree being regarded as black) and such that the following
properties hold:
1. The children of a red node are black.
2. For any node in the tree, the number of black nodes on any two paths
from that node to a leaf is the same. This number is called the black
height of the node.
These two conditions ensure that a red-black tree is a balanced binary search tree. Here's why. First, observe that a red-black tree of black height h has at least 2^h − 1 nodes. We may prove this by induction on the structure of the red-black tree. The empty tree has black height 1 (since we consider it to be black), which is at least 2^1 − 1, as required. Suppose we have a red node. The black height of both children must be h, hence each has at least 2^h − 1 nodes, yielding a total of 2 × (2^h − 1) + 1 = 2^(h+1) − 1 nodes, which is at least 2^h − 1. If, on the other hand, we have a black node, then the black height of both children is h − 1, and each has at least 2^(h−1) − 1 nodes, for a total of 2 × (2^(h−1) − 1) + 1 = 2^h − 1 nodes. Now, observe that a red-black tree of height h with n nodes has black height at least h/2, and hence has at least 2^(h/2) − 1 nodes. Consequently, lg(n + 1) ≥ h/2, so h ≤ 2 lg(n + 1). In other words, its height is logarithmic in the number of nodes, which implies that the tree is height balanced. (For example, a red-black tree with one million nodes has height at most 2 lg(1,000,001), which is about 40.)
To ensure logarithmic behavior, all we have to do is to maintain the red-black invariant. The empty tree is a red-black tree, so the only question is how to perform an insert operation. First, we insert the entry as usual for a binary search tree, with the fresh node starting out colored red. In doing so we do not disturb the black height condition, but we might introduce a red-red violation, a situation in which a red node has a red child. We then remove the red-red violation by propagating it upwards towards the root by a constant-time transformation on the tree (one of several possibilities, which we'll discuss shortly). These transformations either eliminate the red-red violation outright or, in logarithmic time, push the violation to the root, where it is neatly resolved by recoloring the root black (which preserves the black-height invariant!).
The violation is propagated upwards by one of four rotations. We will maintain the invariant that there is at most one red-red violation in the tree. The insertion may or may not create such a violation, and each propagation step will preserve this invariant. It follows that the parent of a red-red violation must be black. Consequently, the situation must look like this. This diagram represents four distinct situations, according to whether the uppermost red node is a left or right child of the black node, and whether the red child of the red node is itself a left or right child. In each case the red-red violation is propagated upwards by transforming it to look like this. Notice that by making the uppermost node red we may be introducing a red-red violation further up the tree (since the black node's parent might have been red), and that we are preserving the black-height invariant, since the great-grand-children of the black node in the original situation will appear as children of the two black nodes in the re-organized situation. Notice as well that the binary search tree conditions are also preserved by this transformation. As a limiting case, if the red-red violation is propagated to the root of the entire tree, we re-color the root black, which preserves the black-height condition, and we are done re-balancing the tree.
Let's look in detail at two of the four cases of removing a red-red violation, those in which the uppermost red node is the left child of the black node; the other two cases are handled symmetrically. If the situation looks like this, we reorganize the tree to look like this. You should check that the black-height and binary search tree invariants are preserved by this transformation. Similarly, if the situation looks like this, then we reorganize the tree to look like this (precisely as before). Once again, the black-height and binary search tree invariants are preserved by this transformation, and the red-red violation is pushed further up the tree.
Here is the ML code to implement dictionaries using a red-black tree.
Notice that the tree rotations are neatly expressed using pattern matching.
structure RedBlackTree :> DICT =
struct
  type key = string
  type 'a entry = string * 'a

  (* Inv: binary search tree + red-black conditions *)
  datatype 'a dict =
    Empty |
    Red of 'a entry * 'a dict * 'a dict |
    Black of 'a entry * 'a dict * 'a dict

  val empty = Empty

  exception Lookup of key

  fun lookup (dict, key) =
      let
        fun lk (Empty) = raise (Lookup key)
          | lk (Red tree) = lk' tree
          | lk (Black tree) = lk' tree
        and lk' ((key1, datum1), left, right) =
            (case String.compare (key, key1)
               of EQUAL => datum1
                | LESS => lk left
                | GREATER => lk right)
      in
        lk dict
      end

  fun restoreLeft
        (Black (z, Red (y, Red (x, d1, d2), d3), d4)) =
          Red (y, Black (x, d1, d2), Black (z, d3, d4))
    | restoreLeft
        (Black (z, Red (x, d1, Red (y, d2, d3)), d4)) =
          Red (y, Black (x, d1, d2), Black (z, d3, d4))
    | restoreLeft dict = dict

  fun restoreRight
        (Black (x, d1, Red (y, d2, Red (z, d3, d4)))) =
          Red (y, Black (x, d1, d2), Black (z, d3, d4))
    | restoreRight
        (Black (x, d1, Red (z, Red (y, d2, d3), d4))) =
          Red (y, Black (x, d1, d2), Black (z, d3, d4))
    | restoreRight dict = dict
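The structure must also provide insert to match the DICT signature, but the text breaks off before showing it. Here is one way to complete the structure, a sketch following the insertion method described above: insert as for a binary search tree with the fresh node colored red, restore the invariant at each black node on the way back up, and re-color the root black if a violation reaches it. (Unlike the binary search tree version above, this sketch replaces the stored value when the key is already present.)

  fun insert (dict, entry as (key, _)) =
      let
        (* insert as for a BST; the fresh node starts out red *)
        fun ins (Empty) = Red (entry, Empty, Empty)
          | ins (Red (entry1 as (key1, _), left, right)) =
              (case String.compare (key, key1)
                 of EQUAL => Red (entry, left, right)
                  | LESS => Red (entry1, ins left, right)
                  | GREATER => Red (entry1, left, ins right))
          | ins (Black (entry1 as (key1, _), left, right)) =
              (case String.compare (key, key1)
                 of EQUAL => Black (entry, left, right)
                  | LESS => restoreLeft (Black (entry1, ins left, right))
                  | GREATER => restoreRight (Black (entry1, left, ins right)))
      in
        (* a violation reaching the root is resolved by re-coloring *)
        case ins dict
          of Red (t as (_, Red _, _)) => Black t
           | Red (t as (_, _, Red _)) => Black t
           | dict => dict
      end
end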
32.4 Abstraction versus Run-Time Checks
You might wonder whether we could equally well use run-time checks to enforce representation invariants. The idea would be to introduce a debug flag that, when set, causes the operations of the dictionary to check that the representation invariant holds of their arguments and results. In the case of a binary search tree this is surely possible, but at considerable expense, since the time required to check the binary search tree invariant is proportional to the size of the tree itself, whereas an insert (for example) can be performed in logarithmic time. But wouldn't we turn off the debug flag before shipping the production copy of the code? Yes, indeed, but then the benefits of checking are lost for the code we care about most! By using the type system to enforce abstraction, we can confine the possible violations of the representation invariant to the dictionary package itself, and, moreover, we need not turn off the check for production code because there is no run-time penalty for it.
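For comparison, here is a sketch of what such a run-time check might look like, written against the tree datatype of BinarySearchTree above; the names entries, sorted, isBST, debug, and check are all introduced just for this sketch:

  fun entries Empty = []
    | entries (Node (l, (k, _), r)) = entries l @ (k :: entries r)

  fun sorted [] = true
    | sorted [_] = true
    | sorted (x :: (rest as y :: _)) =
        String.compare (x, y) = LESS andalso sorted rest

  (* checking the invariant takes time proportional to the tree's size *)
  fun isBST t = sorted (entries t)

  val debug = ref true

  fun check t =
      if !debug andalso not (isBST t)
      then raise Fail "representation invariant violated"
      else t

  (* each operation would then be wrapped, e.g.
     fun checkedInsert arg = check (insert arg) *)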
A more subtle point is that it may not always be possible to enforce data abstraction at run-time. Efficiency considerations aside, you might think that we can always replace static localization of representation errors by dynamic checks for violations of them. But this is false! One reason is that the representation invariant might not be computable. As an example, consider an abstract type of total functions on the integers, those that are guaranteed to terminate when called, without performing any I/O or having any other computational effect. No run-time check can be defined that ensures that a given integer-valued function is total. Yet we can define an abstract type of total functions that, while not admitting every possible total function on the integers as values, provides a useful set of such functions as elements of a structure. By using these specified operations to create a total function, we are in effect encoding a proof of totality in the program itself.
Here's a sketch of such a package:
signature TIF = sig
  type tif
  val apply : tif -> (int -> int)
  val id : tif
  val compose : tif * tif -> tif
  val double : tif
  ...
end
structure Tif :> TIF = struct
  type tif = int -> int
  fun apply t n = t n
  fun id x = x
  fun compose (f, g) = f o g
  fun double x = 2 * x
  ...
end
Should the application of some value of type Tif.tif fail to terminate, we know where to look for the error. No run-time check can assure us that an arbitrary integer function is in fact total.
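For instance (quadruple is a name introduced here for illustration):

val quadruple = Tif.compose (Tif.double, Tif.double)
val twelve = Tif.apply quadruple 3    (* ==> 12 *)

(* quadruple is total because it is built only from the given
   operations, each of which preserves totality *)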
Another reason why a run-time check to enforce data abstraction is impossible is that it may not be possible to tell from looking at a given value whether or not it is a legitimate value of the abstract type. Here's an example. In many operating systems processes are named by integer-valued process identifiers. Using the process identifier we may send messages to the process, cause it to terminate, or perform any number of other operations on it. The thing to notice here is that any integer at all is a possible process identifier; we cannot tell by looking at the integer whether it is indeed valid. No run-time check on the value will reveal whether a given integer is a real or bogus process identifier. The only way to know is to consider the history of how that integer came into being, and what operations were performed on it. Using the abstraction mechanisms just described, we can enforce the requirement that a value of type pid, whose underlying representation is int, is indeed a process identifier. You are invited to imagine how this might be achieved in ML.
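One hypothetical way to do so: seal a structure so that pid is abstract, making the operations of the structure the only way to obtain one. Every name here (PROCESS, fork, kill, and the primitive placeholders) is invented for this sketch, not drawn from any actual library:

signature PROCESS = sig
  type pid                            (* abstract; represented by int *)
  val fork : (unit -> unit) -> pid
  val kill : pid -> unit
end

structure Process :> PROCESS = struct
  type pid = int
  (* placeholders standing in for the underlying system calls *)
  fun primFork (f : unit -> unit) : int = 0
  fun primKill (p : int) = ()
  fun fork f = primFork f
  fun kill p = primKill p
end

(* A client can obtain a pid only from fork, so every value of type
   Process.pid carries, in its type, a history of legitimate creation. *)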
32.5 Sample Code
Chapter 33
Representation Independence and ADT Correctness
This chapter is concerned with proving the correctness of ADT implementations by exhibiting a simulation relation between a reference implementation (taken, or known, to be correct) and a candidate implementation (whose correctness is to be established). The methodology generalizes Hoare's notion of abstraction functions to arbitrary relations, and relies on Reynolds's notion of parametricity to conclude that related implementations engender the same observable behavior in all clients.
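As a small illustration of the setup, one might take an association list as the reference implementation of the DICT signature of Chapter 32, with a red-black tree as the candidate; a suitable simulation relation relates a list and a tree exactly when lookup behaves the same on both. A sketch of such a reference implementation (the structure name is invented here):

structure AssocList :> DICT = struct
  type key = string
  type 'a entry = key * 'a
  type 'a dict = 'a entry list
  exception Lookup of key
  val empty = []
  fun member (k, []) = false
    | member (k, (k', _) :: rest) = k = k' orelse member (k, rest)
  (* like the tree versions, insert leaves an existing key's entry alone *)
  fun insert (d, e as (k, _)) = if member (k, d) then d else e :: d
  fun lookup ([], k) = raise (Lookup k)
    | lookup ((k', v) :: rest, k) = if k = k' then v else lookup (rest, k)
end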
33.1 Sample Code
Chapter 34
Modularity and Reuse
1. Naming conventions.
2. Exploiting structural subtyping (type t convention).
3. Impedance-matching functors (a sketch follows this list).
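By way of illustration of the third item, an impedance-matching functor adapts a structure matching one signature to the signature expected by existing client code. Both signatures and all names below are hypothetical:

signature ORD_KEY = sig
  type key
  val compare : key * key -> order
end

signature LT_KEY = sig
  type key
  val lt : key * key -> bool
  val eq : key * key -> bool
end

(* the functor matches the "impedance" between the two interfaces *)
functor LtFromOrd (K : ORD_KEY) : LT_KEY = struct
  type key = K.key
  fun lt (x, y) = K.compare (x, y) = LESS
  fun eq (x, y) = K.compare (x, y) = EQUAL
end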
34.1 Sample Code
Chapter 35
Dynamic Typing and Dynamic Dispatch
This chapter is concerned with dynamic typing in a statically typed language. It is commonly thought that there is an opposition between statically typed languages (such as Standard ML) and dynamically typed languages (such as Scheme). In fact, dynamically typed languages are a special case of statically typed languages! We will demonstrate this by exhibiting a faithful representation of Scheme inside of ML.
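A sketch of the core of such a representation, with all names invented here: Scheme values become values of a single ML datatype, and the checks Scheme performs at run time become pattern matches.

datatype value =
    Nil
  | Int of int
  | Bool of bool
  | Fun of value -> value

exception RuntimeTypeError

(* application checks at run time that the operator is a function *)
fun apply (Fun f, arg) = f arg
  | apply _ = raise RuntimeTypeError

(* the Scheme expression ((lambda (x) x) 3) becomes: *)
val three = apply (Fun (fn x => x), Int 3)    (* ==> Int 3 *)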
35.1 Sample Code
Chapter 36
Concurrency
In this chapter we consider some fundamental techniques for concurrent
programming using CML.
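As a taste of what is to come, here is a minimal sketch using the CML library distributed with SML/NJ: one thread sends on a channel while another receives. (RunCML.doit is CML's entry point; the function name example is invented here.)

fun example () =
    let
      val ch : int CML.chan = CML.channel ()
      (* spawn a thread that sends on the channel *)
      val _ = CML.spawn (fn () => CML.send (ch, 42))
      val n = CML.recv ch            (* blocks until the send occurs *)
    in
      print (Int.toString n ^ "\n")
    end

(* under SML/NJ one would run: RunCML.doit (example, NONE) *)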
36.1 Sample Code
Part V
Appendices
Appendix A
The Standard ML Basis Library
The Standard ML Basis Library is a collection of modules providing a basic collection of abstract types that are shared by all implementations of Standard ML. All of the primitive types of Standard ML are defined in structures in the Standard Basis. It also defines a variety of other commonly-used abstract types.
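For instance, the following bindings all use structures from the Standard Basis:

val n = String.size "hello"                  (* ==> 5 *)
val l = List.map (fn x => x + 1) [1, 2, 3]   (* ==> [2, 3, 4] *)
val x = Real.fromInt 3                       (* ==> 3.0 *)
val s = Int.toString 42                      (* ==> "42" *)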
Most implementations of Standard ML include module libraries implementing a wide variety of services. These libraries are usually not portable across implementations, particularly those that are concerned with the internals of the compiler or its interaction with the host computer system. Please refer to the documentation of your compiler for information on its libraries.
Appendix B
Compilation Management
All program development environments provide tools to support building systems out of collections of separately developed modules. These tools usually provide services such as:

1. Source code management, such as version and revision control.

2. Separate compilation and linking, to support simultaneous development and to reduce build times.

3. Libraries of re-usable modules, with consistent conventions for identifying modules and their components.

4. Release management for building and disseminating systems for general use.
Different languages, and different vendors, support these activities in different ways. Some rely on generic tools, such as the familiar Unix tools; others provide proprietary tools, commonly known as IDEs (integrated development environments).
Most implementations of Standard ML rely on a combination of generic program development tools and tools specific to that implementation of the language. Rather than attempt to summarize all of the known implementations, we will instead consider the SML/NJ Compilation Manager (CM) as a representative program development framework for ML. Other compilers provide similar tools; please consult your compiler's documentation for details of how to use them.
B.1 Overview of CM
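By way of illustration, a CM description file lists the sources making up a system; the member file names below are hypothetical. Evaluating CM.make "sources.cm" at the SML/NJ prompt compiles the listed modules (recompiling only those whose sources have changed) and loads the result:

(* sources.cm *)
Group is
  $/basis.cm
  dict.sig
  binary-search-tree.sml
  red-black-tree.sml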
B.2

B.3 Sample Code
Appendix C
Sample Programs
A number of example programs illustrating the concepts discussed in the
preceding chapters are available in the Sample Code directory on the worldwide web.
Revision History

Revision  Date      Author(s)  Description
1.0       21.01.11  RH         Created
1.1       10.02.11  RH         Expanded HOFs techniques
1.2       11.02.11  RH         Added matching combinators to higher-order function techniques
Bibliography

[1] Emden R. Gansner and John H. Reppy, editors. The Standard ML Basis Library. Cambridge University Press, 2000.

[2] Peter Lee. Standard ML at Carnegie Mellon. Available within CMU at http://www.cs.cmu.edu/afs/cs/local/sml/common/smlguide.

[3] Robin Milner, Mads Tofte, Robert Harper, and David MacQueen. The Definition of Standard ML (Revised). MIT Press, 1997.