SML Chapter 1
Standard ML
The first ML compiler was built in 1974. As the user community grew, various
dialects began to appear. The ML community then got together to develop and
promote a common language, Standard ML — sometimes called SML, or just
ML. Good Standard ML compilers are available.
Standard ML has become remarkably popular in a short time. Universities
around the world have adopted it as the first programming language to teach to
students. Developers of substantial applications have chosen it as their
implementation language. One could explain this popularity by saying that ML makes
it easy to write clear, reliable programs. For a more satisfying explanation, let
us examine how we look at computer systems.
Computers are enormously complex. The hardware and software found in a
typical workstation are more than one mind can fully comprehend. Different
people understand the workstation on different levels. To the user, the
workstation is a word processor or spreadsheet. To the repair crew, it is a box
containing a power supply, circuit boards, etc. To the machine language programmer,
the workstation provides a large store of bytes, connected to a processor that
can perform arithmetic and logical operations. The applications programmer
understands the workstation through the medium of the chosen programming
language.
Here we take ‘spreadsheet’, ‘power supply’ and ‘processor’ as ideal, abstract
concepts. We think of them in terms of their functions and limitations, but not in
terms of how they are built. Good abstractions let us use computers effectively,
without being overwhelmed by their complexity.
Conventional ‘high level’ programming languages do not provide a level of
abstraction significantly above machine language. They provide convenient
notations, but only those that map straightforwardly to machine code. A minor
error in the program can make it destroy other data or even itself. The resulting
behaviour can be explained only at the level of machine language — if at all!
ML is well above the machine language level. It supports functional
programming, where programs consist of functions operating on simple data structures.
Functional programming is ideal for many aspects of problem solving, as argued
briefly below and demonstrated throughout the book. Programming tasks can be
approached mathematically, without preoccupation with the computer’s internal
workings. ML also provides mutable variables and arrays. Mutable objects can
be updated using an assignment command; using them, any piece of
conventional code can be expressed easily. For structuring large systems, ML provides
modules: parts of the program can be specified and coded separately.
Most importantly of all, ML protects programmers from their own errors.
Before a program may run, the compiler checks that all module interfaces agree
and that data are used consistently. For example, an integer may not be used as
a store address. (It is a myth that real programs must rely on such tricks.) As the
program executes, further checking ensures safety: even a faulty ML program
continues to behave as an ML program. It might run forever, or it might return
to the user with an error message. But it cannot crash.
ML supports a level of abstraction that is oriented to the requirements of the
programmer, not those of the hardware. The ML system can preserve this
abstraction, even if the program is faulty. Few other languages offer such
assurances.
Functional Programming
Programming languages come in several varieties. Languages like
Fortran, Pascal and C are called procedural: their main programming unit is the
procedure. A popular refinement of this approach centres on objects that carry
their own operations about with them. Such object-oriented languages include
C++ and Modula-3. Both approaches rely on commands that act upon the
machine state; they are both imperative approaches.
Just as procedural languages are oriented around commands, functional lan-
guages are oriented around expressions. Programming without commands may
seem alien to some readers, so let us see what lies behind this idea. We begin
with a critique of imperative programming.
Let us consider the advantages
of expressions in detail. Expressions in Fortran can have side effects: they can
change the state. We shall focus on pure expressions, which merely compute a
value.
Expressions have a recursive structure. A typical expression like
f(E1 + E2) − g(E3)
is built out of other expressions E1, E2 and E3, and may itself form part of a
larger expression.
The value of an expression is given recursively in terms of the values of its
subexpressions. The subexpressions can be evaluated in any order, or even in
parallel.
Expressions can be transformed using mathematical laws. For instance,
replacing E1 + E2 by E2 + E1 does not affect the value of the expression above,
thanks to the commutative law of addition. This ability to substitute equals
for equals is called referential transparency. In particular, an expression may
safely be replaced by its value.
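These properties can be observed directly at the ML top level. A small illustration (the particular numbers are arbitrary):

```sml
(* A pure expression: its value is determined solely by its subexpressions. *)
val x = (1 + 2) * (3 + 4);    (* 21 *)

(* By the commutative law, swapping the operands of + cannot change the value. *)
val y = (2 + 1) * (4 + 3);    (* 21 *)
```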
Commands share most of these advantages. In modern languages, commands
are built out of other commands. The meaning of a compound command such as
C1; C2 can be given in terms of the meanings of its parts. Commands even enjoy
referential transparency: laws such as (C1; C2); C3 = C1; (C2; C3) let us
substitute one command for another.
Lists and trees. Collections of data can be processed as lists of the form
[a, b, c, d, e, ...].
Lists support sequential access: scanning from left to right. This suffices for
most purposes, even sorting and matrix operations. A more flexible way of
organizing data is as a tree:
[figure: a balanced binary tree, with internal nodes b and f above the
leaves a, c, e and g]
Balanced trees permit random access: any part can be reached quickly. In theory,
trees offer the same efficiency as arrays; in practice, arrays are often faster. Trees
play key rôles in symbolic computation, representing logical terms and formulæ
in theorem provers. Lists and trees are represented using references, so the run-
time system must include a garbage collector.
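In ML such a tree can be declared as a recursive datatype; the constructor names Lf and Br below are illustrative choices, not fixed by the language:

```sml
(* A binary tree: either empty, or a label with two subtrees. *)
datatype 'a tree = Lf
                 | Br of 'a * 'a tree * 'a tree;

(* A small balanced tree with leaves "a", "c", "e" and "g". *)
val t = Br ("d", Br ("b", Br ("a", Lf, Lf), Br ("c", Lf, Lf)),
                 Br ("f", Br ("e", Lf, Lf), Br ("g", Lf, Lf)));
```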
1 Recursion does have its critics. Backus (1978) recommends providing iteration
primitives to replace most uses of recursion in function definitions. However,
his style of functional programming has not caught on.
1.4 Elements of a functional language
fun length []      = 0
  | length (x::xs) = 1 + length xs;
We instantly see that the length of the empty list ([]) is zero, and that the length
of a list consisting of the element x prefixed to the list xs is the length of xs plus
one. Here is the equivalent definition in Lisp, which lacks pattern-matching:
(define (length x)
(if (null? x)
0
(+ 1 (length (cdr x)))))
ML function declarations often consider half a dozen cases, with patterns much
more complicated than x::xs. Expressing such functions without using patterns
is terribly cumbersome. The ML compiler does this internally, and can do
a better job than the programmer could.
Polymorphic type checking. Programmers, being human, often err. Using a
nonexistent part of a data structure, supplying a function with too few arguments,
or confusing a reference to an object with the object itself are serious errors:
they could make the program crash. Fortunately, the compiler can detect them
before the program runs, provided the language enforces a type discipline. Types
classify data as integers, reals, lists, etc., and let us ensure that they are used
sensibly.
Some programmers resist type checking because it can be too restrictive. In
Pascal, a function to compute the length of a list must specify the — completely
irrelevant! — type of the list’s elements. Our ML length function works for all
lists because ML’s type system is polymorphic: it ignores the types of irrelevant
components. Our Lisp version also works for all lists, because Lisp has no
compile-time type checking. Lisp is more flexible than ML; a single list can mix
elements of different types. The price of this freedom is hours spent hunting
errors that might have been caught automatically.
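The point can be seen in the type that ML infers for the length function defined earlier:

```sml
fun length []      = 0
  | length (x::xs) = 1 + length xs;
(* Inferred type:  'a list -> int
   The type variable 'a stands for the irrelevant element type. *)

length [1, 2, 3];        (* an integer list ... *)
length ["a", "b"];       (* ... or a string list: one definition serves both *)
```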
Higher-order functions. Functions in ML are values: they can be passed to
other functions, capturing common patterns of computation once and for all.
For example, a single function that combines list elements with a given
operator and initial value can compute either the sum
x1 + (x2 + · · · + (xn + 0) · · · )
or the product
x1 × (x2 × · · · × (xn × 1) · · · ).
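Both computations are instances of one higher-order function; here is a sketch using the library function foldr:

```sml
(* foldr f e [x1, ..., xn] computes f(x1, f(x2, ..., f(xn, e) ...)). *)
fun sum xs  = foldr op+ 0 xs;    (* x1 + (x2 + ... + (xn + 0) ...) *)
fun prod xs = foldr op* 1 xs;    (* x1 * (x2 * ... * (xn * 1) ...) *)
```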
Infinite data structures. Infinite lists like [1, 2, 3, . . .] can be given a
computational meaning. They can be of great help when tackling sophisticated problems.
Infinite lists are processed using lazy evaluation, which ensures that no value —
or part of a value — is computed until it is actually needed to obtain the final
result. An infinite list never exists in full; it is rather a process for computing
successive elements upon demand.
The search space in a theorem prover may form an infinite tree, whose success
nodes form an infinite list. Different search strategies produce different lists of
success nodes. The list can be given to another part of the program, which need
not know how it was produced.
Infinite lists can also represent sequences of inputs and outputs. Many of us
have encountered this concept in the pipes of the Unix operating system. A
chain of processes linked by pipes forms a single process. Each process
consumes its input when available and passes its output along a pipe to the next
process. The outputs of intermediate processes are never stored in full. This
saves storage, but more importantly it gives us a clear notation for combining
processes. Mathematically, every process is a function from inputs to outputs,
and the chain of processes is their composition.
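ML evaluates eagerly, but the effect of infinite lists can be obtained by wrapping each tail in a function, so that it is computed only on demand. The datatype below is a standard encoding, not a built-in:

```sml
(* A sequence whose tail is delayed: forcing it means calling the function. *)
datatype 'a seq = Nil
                | Cons of 'a * (unit -> 'a seq);

(* The infinite sequence k, k+1, k+2, ... *)
fun from k = Cons (k, fn () => from (k + 1));

(* Demand the first n elements, yielding an ordinary (finite) list. *)
fun take (_, 0)            = []
  | take (Nil, _)          = []
  | take (Cons (x, xf), n) = x :: take (xf (), n - 1);

take (from 1, 5);    (* [1, 2, 3, 4, 5] *)
```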
Input and output. Communication with the outside world, which has state, is
hard to reconcile with functional programming. Infinite lists can handle
sequential input and output (as mentioned above), but interactive programming and
process communication are thorny issues. Many functional approaches have
been investigated; monads are one of the most promising (Peyton Jones and
Wadler, 1993). ML simply provides commands to perform input and output;
thus, ML abandons functional programming here.
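The basic commands belong to the library structure TextIO. A minimal sketch (in the current library, inputLine returns an option, with NONE signalling end of input):

```sml
(* Output is a command: it changes the state of the output stream. *)
val () = print "What is your name? ";

(* Input likewise depends on the state of the input stream. *)
val greeting =
    case TextIO.inputLine TextIO.stdIn of
        NONE   => "hello, whoever you are\n"
      | SOME s => "hello, " ^ s;
```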
1.5 The efficiency of functional programming
utility for systems programming. A major natural language processing system, called
LOLITA, has been written in Haskell (Smith et al., 1994); the authors adopted functional
programming in order to manage the complexity of their system. Hartel and Plasmeijer
(1996) describe six major projects, involving diverse applications. Wadler and Gill
(1995) have compiled a list of real world applications; these cover many domains and
involve all the main functional languages.
Standard ML
Every successful language was designed for some specific purpose: Lisp
for artificial intelligence, Fortran for numerical computation, Prolog for natural
language processing. Conversely, languages designed to be general purpose —
such as the ‘algorithmic languages’ Algol 60 and Algol 68 — have succeeded
more as sources of ideas than as practical tools.
ML was designed for theorem proving. This is not a broad field, and ML was
intended for the programming of one particular theorem prover — a specific
purpose indeed! This theorem prover, called Edinburgh LCF (Logic for
Computable Functions), spawned a host of successors, all of which were coded in ML.
And just as Lisp, Fortran and Prolog have applications far removed from their
origins, ML is being used in diverse problem areas.
The ML system of Edinburgh LCF was slow: programs were translated into Lisp
and then interpreted. Luca Cardelli wrote an efficient compiler for his version of
ML, which included a rich set of declaration and type structures. At Cambridge
University and INRIA, the ML system of LCF was extended and its performance
improved. ML also influenced HOPE; this purely functional language adopted
polymorphism and added recursive type definitions and pattern-matching.
Robin Milner led a standardization effort to consolidate the dialects into
Standard ML. Many people contributed. The module language — the language’s
most complex and innovative feature — was designed by David MacQueen and
refined by Milner and Mads Tofte. In 1987, Milner won the British Computer
Society Award for Technical Excellence for his work on Standard ML. The first
compilers were developed at the Universities of Cambridge and Edinburgh; the
excellent Standard ML of New Jersey appeared shortly thereafter.
Several universities teach Standard ML as the students’ first programming
language. ML provides a level base for all students, whether they arrive knowing
C, Basic, machine language or no language at all. Using ML, students can learn
how to analyse problems mathematically, breaking the bad habits learned from
low-level languages. Significant computations can be expressed in a few lines.
Beginners especially appreciate that the type checker detects common errors,
and that nothing can crash the system!
Section 1.5 has mentioned applications of Standard ML to networking,
compiler construction, etc. Theorem proving remains ML’s most important
application area, as we shall see below.
Further reading. Gordon et al. (1979) describe LCF. Landin (1966) discusses
the language ISWIM, upon which ML was originally based. The formal
definition of Standard ML has been published as a book (Milner et al., 1990), with a separate
volume of commentary (Milner and Tofte, 1990).
Standard ML has not displaced all other dialects. The French, typically, have gone
their own way. Their language CAML provides broadly similar features with the
traditional ISWIM syntax (Cousineau and Huet, 1990). It has proved useful for experiments
in language design; extensions over Standard ML include lazy data structures and
dynamic types. CAML Light is a simple byte-code interpreter that is ideal for small
computers. Lazy dialects of ML also exist, as mentioned previously. HOPE continues to be
used and taught (Bailey, 1990).
proving, proof checking, soon becomes intolerable. Most proofs involve long,
repetitive combinations of rules.
Edinburgh LCF represented a new kind of theorem prover, where the level of
automation was entirely up to the user. It was basically a programmable proof
checker. Users could write proof procedures in ML — the Meta Language —
rather than typing repetitive commands. ML programs could operate on
expressions of the Object Language, namely Scott’s Logic of Computable Functions.
Edinburgh LCF introduced the idea of representing a logic as an abstract type
of theorems. Each axiom was a primitive theorem while each inference rule was
a function from theorems to theorems. Type checking ensured that theorems
could be made only by axioms and rules. Applying inference rules to already
known theorems constructed proofs, rule by rule, in the forward direction.
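The idea can be sketched in ML notation; the names below are invented for illustration and are not those of LCF:

```sml
(* A logic packaged as an abstract type: outside this signature, the only
   ways to obtain a thm are the axioms and rules it exports, so every
   value of type thm really is a theorem. *)
signature LOGIC =
sig
  type thm
  val assume      : string -> thm        (* a primitive axiom scheme *)
  val modusPonens : thm * thm -> thm     (* an inference rule: theorems to theorems *)
end;
```

Forward proof then consists of applying functions like modusPonens to theorems already constructed.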
Tactics permitted a more natural style, backward proof. A tactic was a
function from goals to subgoals, justified by the existence of an inference rule going
the other way. The tactic actually returned this inference rule (as a function) in
its result: tactics were higher-order functions.
Tacticals provided control structures for combining simple tactics into
complex ones. The resulting tactics could be combined to form still more complex
tactics, which in a single step could perform hundreds of primitive inferences.
Tacticals were even more ‘higher-order’ than tactics. New uses for higher-order
functions turned up in rewriting and elsewhere.
Further reading. Automated theorem proving originated as a task for artificial
intelligence. Later research applied it to reasoning tasks such as planning (Rich
and Knight, 1991). Program verification aims to prove software correct. Hardware
verification, although a newer field, has been more successful; Graham (1992) describes
the verification of a substantial VLSI chip and surveys other work.
Offshoots of Edinburgh LCF include HOL 88, which uses higher-order logic (Gordon
and Melham, 1993), and Nuprl, which supports constructive reasoning (Constable et al.,
1986).
Other recent systems adopt Standard ML. LAMBDA is a hardware synthesis tool,
for designing circuits and simultaneously proving their correctness using higher-order
logic. ALF is a proof editor for constructive type theory (Magnusson and Nordström,
1994).
• Operations on lists and lists of pairs belong to the structures List and
ListPair. Some of these will be described in later chapters.
• Integer operations belong to the structure Int. Integers may be available
in various precisions. These may include the usual hardware integers
(structure FixedInt), which are efficient but have limited size. They
could include unlimited precision integers (structure IntInf), which are
essential for some tasks.
• Real number operations belong to the structure Real, while functions
such as sqrt, sin and cos belong to Math. The reals may also be
available in various precisions. Structures have names such as Real32 or
Real64, which specify the number of bits used.
• Unsigned integer arithmetic is available. This includes bit-level
operations such as logical ‘and’, which are normally found only in low-level
languages. The ML version is safe, as it does not allow the bits to be
converted to arbitrary types. Structures have names such as Word8.
• Arrays of many forms are provided. They include mutable arrays like
those of imperative languages (structure Array), and immutable arrays
(structure Vector). The latter are suitable for functional programming,
since they cannot be updated. Their initial value is given by some
calculation — one presumably too expensive to perform repeatedly.
• Operations on characters and character strings belong to structures Char
and String among others. The conversion between a type and its textual
representation is defined in the type’s structure, such as Int.
• Input/output is available in several forms. The main ones are text I/O,
which transfers lines of text, and binary I/O, which transfers arbitrary
streams of bytes. The structures are TextIO and BinIO.
• Operating system primitives reside in structure OS. They are concerned
with files, directories and processes. Numerous other operating system
and input/output services may be provided.
• Calendar and time operations, including processor time measurements,
are provided in structures Date, Time and Timer.
• Declarations needed by disparate parts of the library are collected into
structure General.
Many other packages and tools, though not part of the library, are widely avail-
able. The resulting environment provides ample support for the most demanding
projects.
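A few of these structures in action; each function named below is part of the standard basis:

```sml
Int.toString 42;              (* the textual representation "42" *)
Math.sqrt 2.0;                (* real square root *)
String.size "hello";          (* 5 *)
Char.toUpper #"a";            (* #"A" *)
Word8.andb (0wx3F, 0wx0F);    (* bit-level 'and' on 8-bit words *)
Vector.fromList [1, 2, 3];    (* an immutable array *)
```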
This quote, from a lecture first given in 1973, has seldom been heeded. Typical
compilers omit checks unless specifically commanded to include them. The C
language is particularly unsafe: as its arrays are mere storage addresses,
checking their correct usage is impractical. The standard C library includes many
procedures that risk corrupting the store; they are given a storage area but not
told its size! In consequence, the Unix operating system has many security
loopholes. The Internet Worm exploited these, causing widespread network
disruption (Spafford, 1989).
ML supports the development of reliable software in many ways. Compilers
do not allow checks to be omitted. Appel (1993) cites its safety, automatic
storage allocation, and compile-time type checking; these eliminate some major
errors altogether, and ensure the early detection of others. Appel shares the view
that functional programming is valuable, even in major projects.
Moreover, ML is defined formally. Milner et al. (1990) is not the first formal
definition of a programming language, but it is the first one that compiler writers
can understand. Because the usual ambiguities are absent, compilers agree to a
remarkable extent. The new standard library will strengthen this agreement. A
program ought to behave identically regardless of which compiler runs it; ML is
close to this ideal.
A key advantage of ML is its module system. System components, however
large, can be specified and coded independently. Each component can supply its
specified services, protected from external tampering. One component can take
other components as parameters, and be compiled separately from them. Such
components can be combined in many ways, configuring different systems.
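In outline, the module language looks like this; the queue component below is invented purely for illustration:

```sml
(* A specification: what a component promises, with no implementation. *)
signature QUEUE =
sig
  type 'a queue
  exception Empty
  val empty   : 'a queue
  val enqueue : 'a * 'a queue -> 'a queue
  val dequeue : 'a queue -> 'a * 'a queue    (* raises Empty if the queue is empty *)
end;

(* An implementation, its representation hidden by the opaque ascription :> *)
structure Queue :> QUEUE =
struct
  type 'a queue = 'a list
  exception Empty
  val empty = []
  fun enqueue (x, q) = q @ [x]
  fun dequeue []     = raise Empty
    | dequeue (x::q) = (x, q)
end;

(* A component taking another component as a parameter. *)
functor Buffer (Q : QUEUE) =
struct
  fun singleton x = Q.enqueue (x, Q.empty)
end;
```

Because Queue matches its signature opaquely, client code cannot depend on the list representation; a more efficient implementation could later be substituted without change elsewhere.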
Viewed from a software engineering perspective, ML is an excellent language
for large systems. Its modules allow programmers to work in teams, and to reuse
components. Its types and overall safety contribute to reliability. Its exceptions
allow programs to respond to failures. Comparing ML with C, Appel admits
that ML programs need a great deal of space, but run acceptably fast. Software
developers have a choice of commercially supported compilers.
We cannot soon expect to have ML programs running in our digital watches.
With major applications, however, reliability and programmer productivity are
basic requirements. Is the age of C drawing to a close?