SML Chapter4
Chapter outline
This chapter describes datatypes, pattern-matching, exception handling
and trees. It contains the following sections:
The datatype declaration. Datatypes, constructors and pattern-matching
are illustrated through examples. To represent the King and his subjects, a
single type person comprises four classes of individual and associates appropriate
information with each.
Exceptions. These represent a class of error values. Exceptions can be
declared for each possible error. Raising an exception signals an error; handling
the exception allows an alternative computation to be performed.
Trees. A tree is a branching structure. Binary trees are a generalization of
lists and have many applications.
Tree-based data structures. Dictionaries, flexible arrays and priority queues
are implemented easily using binary trees. The update operation creates a new
data structure, with minimal copying.
124 4 Trees and Concrete Data
Unfortunately, these do not all have the same type! No ML function could handle
both Kings and Peasants with this representation.
Five things are declared, namely the type person and its four constructors King,
Peer , Knight and Peasant.
The type person consists precisely of the values built by its constructors.
Note that King has type person, while the other constructors are functions that
return something of that type. Thus the following have type person:
King
Peer ("Earl","Carlisle",7) Peer ("Duke","Norfolk",9)
Knight "Gawain" Knight "Galahad"
Peasant "Jack Cade" Peasant "Wat Tyler"
Furthermore, these values are distinct. No person can be both a Knight and a
Peasant; no Peer can have two different degrees.
Values of type person, like other ML values, may be arguments and results of
functions and may belong to data structures such as lists:
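As a minimal sketch, the declaration below is reconstructed from the constructor applications shown above. The list members and the function describe are illustrative assumptions, not taken from the text.

```sml
(* Declaration reconstructed from the constructor uses shown above *)
datatype person = King
                | Peer of string * string * int
                | Knight of string
                | Peasant of string;

(* A list of persons; the particular members are illustrative *)
val persons = [King, Peasant "Jack Cade", Knight "Gawain"];

(* A hypothetical function with one case per constructor *)
fun describe King = "the King"
  | describe (Peer (deg, terr, _)) = "the " ^ deg ^ " of " ^ terr
  | describe (Knight name) = "Sir " ^ name
  | describe (Peasant name) = name;

val descriptions = map describe persons;
```

Here descriptions is a string list with one entry per person, showing that values of type person pass through map like any other ML values.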
Each case is governed by a pattern with its own set of pattern variables. The
Knight and Peasant cases each involve a variable called name, but these vari-
ables have separate scopes.
Patterns may be as complicated as necessary, combining tuples and the list con-
structors with datatype constructors. The function sirs returns the names of all
the Knights in a list of persons:
fun sirs [] = []
| sirs ((Knight s) :: ps) = s :: (sirs ps)
| sirs (p :: ps) = sirs ps;
> val sirs = fn : person list -> string list
sirs persons;
> ["Gawain"] : string list
The cases in a function are considered in order. The third case (with pattern
p::ps) is not considered if p is a Knight, and therefore must not be taken out
of context. Some people prefer that the cases should be disjoint in order to assist
mathematical reasoning. But replacing the case for p::ps with separate cases
for King, Peer and Peasant would make the function longer, slower and less
readable. The third case of sirs makes perfect sense as a conditional equation,
holding for all p not of the form Knight(s).
The ordering of cases is even more important when one person is compared
with another. Rather than testing 16 cases, we test for each true case and take
all the others for false, a total of 7 cases. Note the heavy use of wildcards in
patterns.
fun superior (King, Peer _) = true
| superior (King, Knight _) = true
| superior (King, Peasant _) = true
| superior (Peer _, Knight _) = true
| superior (Peer _, Peasant _) = true
| superior (Knight _, Peasant _) = true
| superior _ = false;
> val superior = fn : person * person -> bool
Exercise 4.2 Modify type person to add the constructor Esquire, whose ar-
guments are a name and a village (both represented by strings). What is the type
of this constructor? Modify function title to generate, for instance,
Functions on type degree are defined by case analysis. What is the title of a lady
of quality?
fun lady Duke = "Duchess"
| lady Marquis = "Marchioness"
| lady Earl = "Countess"
| lady Viscount = "Viscountess"
| lady Baron = "Baroness";
> val lady = fn : degree -> string
Accuracy being paramount in the Court and Social column, we cannot overesti-
mate the importance of this example for electronic publishing.
A type like degree, consisting of a finite number of constants, is an enumer-
ation type. Another example is the built-in type bool , which is declared by
datatype bool = false | true;
The library also declares the type order:
datatype order = LESS | EQUAL | GREATER;
This captures the three possible outcomes of a comparison. The library
structures for strings, integers, reals, times, dates, etc., each include a function compare
that returns one of these three outcomes:
String.compare ("York", "Lancaster");
> GREATER : order
Relations that return a boolean value are more familiar. But we need two calls
of < to get as much information as we can get from one call to String.compare.
The first version of Fortran provided three-way comparisons! The more things
change, the more they stay the same . . .
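The point about a single three-way comparison can be sketched as follows. The function name relation is an assumption made for illustration; String.compare and Int.compare are library functions.

```sml
(* One call to String.compare distinguishes all three outcomes *)
fun relation (s, t) =
    case String.compare (s, t) of
        LESS    => s ^ " precedes " ^ t
      | EQUAL   => s ^ " equals " ^ t
      | GREATER => s ^ " follows " ^ t;

val r = relation ("York", "Lancaster");

(* Other library structures offer the same interface *)
val c = Int.compare (3, 7);
```

Using < alone, telling EQUAL apart from GREATER would need a second test.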
Exercise 4.4 Declare an enumeration type consisting of the names of six dif-
ferent countries. Write a function to return the capital city of each country as a
string.
Exercise 4.5 Write functions of type bool × bool → bool for boolean
conjunction and disjunction. Use pattern-matching rather than andalso, orelse or
if. How many cases have to be tested explicitly?
The ‘optional’ type. The standard library declares the type operator option:
datatype 'a option = NONE | SOME of 'a;
> datatype 'a option
> con NONE : 'a option
> con SOME : 'a -> 'a option
The type operator option takes one argument. Type τ option contains a copy of
type τ , augmented with the extra value NONE. It can be used to supply optional
data to a function, but its most obvious use is to indicate errors. For example,
the library function Real.fromString interprets its string argument as a real
number, but it does not accept every way of expressing 60,000:
Real.fromString "6.0E4";
> SOME 60000.0 : real option
Real.fromString "full three score thousand";
> NONE : real option
1 The correct term is type constructor (Milner et al., 1990). I avoid it here to
prevent confusion with a constructor of a datatype.
4.3 Polymorphic datatypes 129
You can use a case expression to consider the alternatives separately; see Sec-
tion 4.4 below.
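A minimal sketch of such a case analysis follows. The function realOr and its default argument are assumptions introduced for illustration.

```sml
(* Convert a string to a real, falling back on a default if it cannot be read *)
fun realOr default s =
    case Real.fromString s of
        SOME x => x
      | NONE   => default;

val a = realOr 0.0 "6.0E4";
val b = realOr 0.0 "full three score thousand";
```

The caller must say what happens in the NONE case, so the error cannot be overlooked.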
The disjoint sum type. A fundamental operator forms the disjoint sum or union
of two types:
datatype ('a,'b)sum = In1 of 'a | In2 of 'b;
The type operator sum takes two arguments. Its constructors are
In1 : 'a -> ('a,'b)sum
In2 : 'b -> ('a,'b)sum
The type (σ, τ )sum is the disjoint sum of the types σ and τ . Its values have the
form In1(x ) for x of type σ , or In2(y) for y of type τ . The type contains a
copy of σ and a copy of τ . Observe that In1 and In2 can be viewed as labels
that distinguish σ from τ .
The disjoint sum allows values of several types to be present where normally
only a single type is allowed. A list’s elements must all have the same type.
If this type is (string, person)sum then an element could contain a string or a
person, while type (string, int)sum comprises strings and integers.
Pattern-matching for the disjoint sum tests whether In1 or In2 is present. The
function concat1 concatenates all the strings included by In1 in a list:
fun concat1 [] = ""
  | concat1 ((In1 s)::l) = s ^ concat1 l
  | concat1 ((In2 _)::l) = concat1 l;
> val concat1 = fn : (string, 'a) sum list -> string
concat1 [ In1 "O!", In2 (1040,1057), In1 "Scotland" ];
> "O!Scotland" : string
The expression In1 "Scotland" has appeared with two different types, namely
(string, int × int)sum and (string, person)sum. This is possible because its
type is polymorphic:
In1 "Scotland";
> In1 "Scotland" : (string, ’a) sum
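Pattern-matching on both constructors can be sketched with a function that separates a mixed list into its two component lists. The name split is an assumption, not part of the text.

```sml
datatype ('a, 'b) sum = In1 of 'a | In2 of 'b;

(* Separate a list of sums into the In1 values and the In2 values *)
fun split [] = ([], [])
  | split (In1 x :: l) = let val (xs, ys) = split l in (x :: xs, ys) end
  | split (In2 y :: l) = let val (xs, ys) = split l in (xs, y :: ys) end;

val (strings, pairs) = split [In1 "O!", In2 (1040, 1057), In1 "Scotland"];
```

The result type is 'a list * 'b list, so each component list is again homogeneous.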
Representing other datatypes. The disjoint sum can express all other datatypes
that are not recursive. The type person can be represented by
((unit, string × string × int)sum, (string, string)sum)sum
with constructors
King = In1(In1())
Peer (d , t, n) = In1(In2(d , t, n))
Knight(s) = In2(In1(s))
Peasant(s) = In2(In2(s))
These are valid as both expressions and patterns. Needless to say, type person
is pleasanter. Observe how unit, the type whose sole element is (), represents
the one King.
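The correspondence can be checked with a pair of conversion functions. This is a sketch: the names personRep, encode and decode are assumptions, and the datatype person is reconstructed from its constructors.

```sml
datatype ('a, 'b) sum = In1 of 'a | In2 of 'b;
datatype person = King
                | Peer of string * string * int
                | Knight of string
                | Peasant of string;

(* The representing type given in the text *)
type personRep = ((unit, string * string * int) sum, (string, string) sum) sum;

fun encode King : personRep = In1 (In1 ())
  | encode (Peer (d, t, n)) = In1 (In2 (d, t, n))
  | encode (Knight s) = In2 (In1 s)
  | encode (Peasant s) = In2 (In2 s);

(* The same four equations, used as patterns *)
fun decode (In1 (In1 ())) = King
  | decode (In1 (In2 (d, t, n))) = Peer (d, t, n)
  | decode (In2 (In1 s)) = Knight s
  | decode (In2 (In2 s)) = Peasant s;

val p = Peer ("Earl", "Carlisle", 7);
val same = (decode (encode p) = p);
```

Both functions are exhaustive, which reflects the fact that the two types contain corresponding values.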
Storage requirements. Datatypes require a surprising amount of space, at least
with current compilers. A typical value takes four bytes for the tag (which
identifies the constructor) and four bytes for each component of the associated tuple.
The garbage collector requires a header consisting of a further four bytes. The total
comes to twelve bytes for a Knight or Peasant and twenty bytes for a Peer . This
would include the integer in a Peer , but the strings would be stored as separate objects.
The internal values of an enumeration type require no more space than integers, espe-
cially on those ML systems where integers have unlimited precision. List cells typically
occupy eight to twelve bytes. With a generational garbage collector, the amount of
space taken by an object can vary with its age!
Optimizations are possible. If the datatype has only one constructor, no tag needs
to be stored. If all but one of the constructors are constants then sometimes the non-
constant constructor does not require a tag; this holds for list but not for option, since
the operand of SOME can be anything at all. Compared with Lisp, not having types at
run-time saves storage. Appel (1992) discusses such issues. With advances in run-time
systems, we can expect storage requirements to decrease.
Exercise 4.6 What are the types of King, Peer , Knight and Peasant as de-
clared above?
Exercise 4.7 Exhibit a correspondence between values of type (σ, τ )sum and
certain values of type (σ list) × (τ list) — those of the form ([x ], []) or ([], [y]).
In a pattern, all names except constructors are variables. Any meaning they
may have outside the pattern is insignificant. The variables in a pattern must be
distinct. These conditions ensure that values can be matched efficiently against
the pattern and analysed uniquely to bind the variables.
Constructors absolutely must be distinguished from variables. In this book,
constructors begin with a capital letter while most variables begin with a small
letter.2 However, the standard constructors nil , true and false are also in lower
case. A constructor name may be symbolic or infix, such as :: for lists. The
standard library prefers constructor names consisting of all capitals, such as
NONE.
The first error is the misspelling of the constructor King as Kong. This is a variable
and matches all values, preventing further cases from being considered. ML compilers
warn if a function has a redundant case; this warning must be heeded!
The second error is Knightname: the omission of a space again reduces a pattern
to a variable. Since the error leaves the variable name undefined, the compiler should
complain.
The third error is the omission of parentheses around Peasant name. The resulting
error messages could be incomprehensible.
Misspelled constructor functions are quickly detected, for
fun f (g x ) = ...
is allowed only if g is a constructor. Other misspellings may not provoke any warning.
Omitted spaces before a wildcard, as in Peer_, are particularly obscure.
Exercise 4.8 Which simple mistake in superior would alter the function’s
behaviour without making any case redundant?
The declaration
val P = E
defines the variables in the pattern P to have the corresponding values of ex-
pression E . We have used this in Chapter 2 to select components from tuples:
val (xc,yc) = scalevec(4.0, a);
> val xc = 6.0 : real
> val yc = 27.2 : real
The declaration fails (raising an exception) if the value of the expression does
not match the pattern. When the pattern is a tuple, type checking eliminates this
danger.
The following declarations are valid: the values of their expressions match
their patterns. They declare no variables.
val King = King;
val [1,2,3] = upto(1,3);
Constructor names cannot be declared for another purpose using val. In the
scope of type person, the names King, Peer , Knight and Peasant are re-
served as constructors. Declarations like these, regarded as attempts at pattern-
matching, will be rejected with a type error message:
val King = "Henry V";
val Peer = 925;
Id as P
If the entire pattern (which includes the pattern P as a part) matches, then the
value that matches P is also bound to the identifier Id . This value is viewed both
through the pattern and as a whole. The function nextrun (from Section 3.21)
can be coded
4.4 Pattern-matching with val, as, case 133
Here run and r ::_ are the same list. We now may refer to its head as r instead
of hd run. Whether it is more readable than the previous version is a matter for
debate.
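A layered pattern of this kind can be sketched as follows. This is an illustrative version over integers, assuming that run holds the current run in reverse order; it is not necessarily the text's own definition.

```sml
(* run is bound to the whole list and r to its head by the layered pattern *)
fun nextrun (run, []) = (rev run, [])
  | nextrun (run as r :: _, x :: xs) =
      if x < r then (rev run, x :: xs)
      else nextrun (x :: run, xs)
  | nextrun ([], x :: xs) = nextrun ([x], xs);

val (run1, rest) = nextrun ([1], [2, 3, 1, 5]);
```

The last clause is an addition that makes the match exhaustive when the run starts empty.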
The case expression. This is another vehicle for pattern-matching and has the
form
The function merge (also from Section 3.21) can be recoded using case to test
the first argument before the second:
fun merge(xlist,ylist) : real list =
case xlist of
[] => ylist
| x ::xs => (case ylist of
[] => xlist
| y::ys => if x <=y then x ::merge(xs, ylist)
else y::merge(xlist, ys));
In the recursive call, xlist and x::xs denote the same list, an effect also
obtainable through the pattern xlist as x::xs.
The scope of case. No symbol terminates the case expression, so enclose it
in parentheses unless you are certain there is no ambiguity. Below, the second
line is part of the inner case expression, although the programmer may have intended
it to belong to the outer:
case x of 1 => case y of 0 => true | 1 => false
| 2 => true;
The following declaration is not syntactically ambiguous, but many ML compilers parse
it incorrectly. The case expression should be enclosed in parentheses:
fun f [x ] = case g x of 0 => true | 1 => false
| f xs = true;
Exercise 4.9 Express the function title using a case expression to distinguish
the four constructors of type person.
Exercise 4.10 Describe a simple method for removing all case expressions
from a program. Explain why your method does not affect the meaning of the
program.
Exceptions
A hard problem may be tackled by various methods, each of which
succeeds in a fraction of the cases. There may be no better way of choosing
a method than to try one and see if it succeeds. If the computation reaches
a dead end, then the method fails — or perhaps determines that the problem
is impossible. A proof method may make no progress or may reduce its goal
to 0 = 1. A numerical algorithm may suffer overflow or division by zero.
These outcomes can be represented by a datatype whose values are Success(s),
where s is a solution, Failure and Impossible. Dealing with multiple outcomes
is complicated, as we saw with the topological sorting functions of Section 3.17.
The function cyclesort, which returns information about success or failure,
is more complex than pathsort, which expresses failure by (horribly!) call-
ing hd [].
ML deals with failure through exceptions. An exception is raised where the
failure is discovered and handled elsewhere — possibly far away.
then methodB is tried, while if either reports that the problem is impossible then
it is abandoned. In all cases, the result has the same type: string.
case methodA(problem) of
Success s => show s
| Failure => (case methodB (problem) of
Success s => show s
| Failure => "Both methods failed"
| Impossible => "No solution exists")
| Impossible => "No solution exists"
Functions methodA and methodB — and any functions they call within the
scope of these exception declarations — can signal errors by code such as
if ... then raise Impossible
else if ... then raise Failure
else (*compute successful result*)
The attempts to apply methodA and methodB involve two exception handlers:
show (methodA(problem)
handle Failure => methodB (problem))
handle Failure => "Both methods failed"
| Impossible => "No solution exists"
The first handler traps Failure from methodA, and tries methodB . The sec-
ond handler traps Failure from methodB and Impossible from either method.
Function show is given the result of methodA, if successful, or else methodB .
Even in this simple example, exceptions give a shorter, clearer and faster
program. Error propagation does not clutter our code.
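The two nested handlers can be tried out with stand-in methods. In this sketch methodA, methodB and show are hypothetical: method A halves even numbers, method B divides multiples of three, and negative problems are declared impossible.

```sml
exception Failure;
exception Impossible;

(* Hypothetical method A: succeeds on non-negative even numbers *)
fun methodA n =
    if n < 0 then raise Impossible
    else if n mod 2 <> 0 then raise Failure
    else n div 2;

(* Hypothetical method B: succeeds on multiples of three *)
fun methodB n =
    if n mod 3 <> 0 then raise Failure
    else n div 3;

fun show k = "Solution: " ^ Int.toString k;

(* Inner handler: try B if A fails.  Outer handler: report the outcome. *)
fun solve problem =
    show (methodA problem handle Failure => methodB problem)
    handle Failure => "Both methods failed"
         | Impossible => "No solution exists";

val results = map solve [8, 9, 7, ~2];
```

Neither method mentions the other, and solve contains no explicit test of the outcome of methodA.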
Constructor Failedbecause has the type string → exn while Badvalue has
type int → exn. They create exceptions Failedbecause(msg), where msg is a
message to be displayed, and Badvalue(k ), where k may determine the method
to be tried next.
Exceptions can be declared locally using let, even inside a recursive func-
tion. This can result in different exceptions having the same name and other
complications. Whenever possible, declare exceptions at top level. The type of
a top level exception must be monomorphic.3
Values of type exn can be stored in lists, returned by functions, etc., like
values of other types. In addition, they have a special rôle in the operations
raise and handle.
Dynamic types and exn. Because type exn can be extended with new con-
structors, it potentially includes the values of any type. We obtain a weak form
of dynamic typing. This is an accidental feature of ML; CAML treats dynamics in a more
sophisticated manner (Leroy and Mauny, 1993).
For example, suppose we wish to provide a uniform interface for expressing arbitrary
data as strings. All the conversion functions can have type exn → string. To extend
the system with a new type, say Complex .t, we declare a new exception for that type,
and write a new conversion function of type exn → string:
exception ComplexToString of Complex.t;
fun convert_complex (ComplexToString z) = ...
The library structure List declares the exception Empty. Functions hd and tl
raise this exception if they are applied to the empty list:
exception Empty;
fun hd (x ::_) = x
| hd [] = raise Empty;
fun tl (_::xs) = xs
| tl [] = raise Empty;
Less trivial is the library function to return the nth element in a list, counting
from 0:
exception Subscript;
fun nth(x ::_, 0) = x
| nth(x ::xs, n) = if n>0 then nth(xs,n-1)
else raise Subscript
| nth _ = raise Subscript;
The declaration val P = E raises exception Bind if the value of E does not
match pattern P . This is usually poor style (but see Figure 8.4 on page 348).
If there is any possibility that the value will not match the pattern, consider the
alternatives explicitly using a case expression:
If E returns a normal value, then the handler simply passes this value on. On
the other hand, if E returns a packet then its contents are matched against the
patterns. If Pi is the first pattern to match then the result is the value of Ei , for
i = 1, . . . , n.
There is one major difference from case. If no pattern matches, then the
handler propagates the exception packet rather than raising exception Match. A
typical handler does not consider every possible exception.
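Both behaviours can be observed with the library exceptions Empty and Subscript. This is a small sketch; the names a and b are arbitrary.

```sml
(* The inner handler mentions only Subscript, so the Empty packet passes
   through it and is caught by the outer handler *)
val a = (hd ([] : int list) handle Subscript => 0) handle Empty => ~1;

(* A normal value is passed on unchanged *)
val b = hd [7] handle Empty => ~1;
```

A case expression in place of the inner handler would have raised Match instead of propagating the original packet.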
In Section 3.7 we considered the problem of making change. The greedy
algorithm presented there (function change) could not express 16 using 5 and 2
because it always took the largest coin. Another function, allChange, treated
such cases by returning the list of all possible results.
Using exceptions, we can easily code a backtracking algorithm. We declare
the exception Change and raise it in two situations: if we run out of coins with
a non-zero amount or if we cause amount to become negative. We always try
the largest coin, undoing the choice if it goes wrong. The exception handler
always undoes the most recent choice; recursion makes sure of that.
exception Change;
fun backChange (coinvals, 0) = []
| backChange ([], amount) = raise Change
| backChange (c::coinvals, amount) =
if amount<0 then raise Change
else c :: backChange(c::coinvals, amount-c)
handle Change => backChange(coinvals, amount);
> val backChange = fn : int list * int -> int list
Unlike allChange, this function returns at most one solution. Let us compare
the two functions by redoing the examples from Section 3.7:
There are none, two and 25 solutions; we get at most one of them. Similar ex-
amples of exception handling occur later in the book, as we tackle problems like
parsing and unification. Lazy lists (Section 5.19) are an alternative to exception
handling. Multiple solutions can be computed on demand.
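The backtracking can be followed on small inputs. The sketch below repeats the declaration so that it runs on its own; the particular coin lists are chosen for illustration and may differ from those of Section 3.7.

```sml
exception Change;

fun backChange (coinvals, 0) = []
  | backChange ([], amount) = raise Change
  | backChange (c :: coinvals, amount) =
      if amount < 0 then raise Change
      else c :: backChange (c :: coinvals, amount - c)
           handle Change => backChange (coinvals, amount);

(* Three fives overshoot, so the third choice of 5 is undone *)
val found = backChange ([5, 2], 16);

(* An odd amount cannot be made from even coins: the exception escapes *)
val none = (backChange ([10, 2], 27); false) handle Change => true;
```

When every alternative has been exhausted, the packet reaches the caller, who must be prepared to handle Change.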
Pitfalls in exception handling. An exception handler must be written with care,
as with other forms of pattern matching. Never misspell an exception name; it
will be taken as a variable and match all exceptions.
Be careful to give exception handlers the correct scope. In the expression
Writing N for the exception packet, the evaluation of len[1] goes like this:
len[1] ⇒ 1 + len(tl [1]) handle _ => 0
⇒ 1 + len[] handle _ => 0
⇒ 1 + (1 + len(tl []) handle _ => 0) handle _ => 0
⇒ 1 + (1 + len N handle _ => 0) handle _ => 0
⇒ 1 + (1 + N handle _ => 0) handle _ => 0
⇒ 1 + (N handle _ => 0) handle _ => 0
⇒ 1 + 0 handle _ => 0
⇒ 1
4.9 Objections to exceptions 141
This evaluation is more complicated than one for the obvious length function
defined by pattern-matching. Test for different cases in advance, if possible,
rather than trying them willy-nilly by exception handling.
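The two styles can be compared directly. The declaration of len below is reconstructed from the evaluation trace, so treat it as an assumption; lengthPM is a hypothetical name for the pattern-matching version.

```sml
(* Length by exception handling: tl raises Empty on the empty list,
   and the handler turns that packet into 0 *)
fun len l = 1 + len (tl l) handle _ => 0;

(* Length by testing the cases in advance *)
fun lengthPM [] = 0
  | lengthPM (_ :: xs) = 1 + lengthPM xs;

val n1 = len [1, 2, 3];
val n2 = lengthPM [1, 2, 3];
```

Both return the same answers, but the wildcard handler in len would also hide any unrelated exception raised during the computation.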
Most proponents of lazy evaluation object to exception handling. Exceptions
complicate the theory and can be abused, as we have just seen. The conflict
is deeper. Exceptions are propagated under the call-by-value rule, while lazy
evaluation follows call-by-need.
ML includes assignments and other commands, and exceptions can be haz-
ardous in imperative programming. It is difficult to write correct programs when
execution can be interrupted in arbitrary places. Restricted to the functional
parts of a program, exceptions can be understood as dividing the value space
into ordinary values and exception packets. They are not strictly necessary in a
programming language and could be abused, but they can also promote clarity
and efficiency.
Exercise 4.11 Type exn does not admit the ML equality operator. Is this re-
striction justified?
Trees
A tree is a branching structure consisting of nodes with branches lead-
ing to subtrees. Nodes may carry values, called labels. Despite the arboreal
terminology, trees are usually drawn upside down:
[Figure: a tree drawn with its root A at the top and the nodes B to H below it]
The node labelled A is the root of the tree, while nodes C , D, F and H (which
have no subtrees) are its leaves.
The type of a node determines the type of its label and how many subtrees it
may have. The type of a tree determines the types of its nodes. Two types of
tree are especially important. The first has labelled nodes, each with one branch,
terminated by an unlabelled leaf. Such trees are simply lists. The second type
of tree differs from lists in that each labelled node has two branches instead of
one. These are called binary trees.
When functional programmers work with lists, they can draw on a body of
techniques and a library of functions. When they work with trees, they usually
make all their own arrangements. This is a pity, for binary trees are ideal for
many applications. The following sections apply them to efficient table lookup,
arrays and priority queues. We develop a library of polymorphic functions for
binary trees.
Recursive datatypes are understood exactly like non-recursive ones. Type τ tree
consists of all the values that can be made by Lf and Br . There is at least
one τ tree, namely Lf ; and given two trees and a label of type τ , we can make
another tree. Thus Lf is the base case of the recursion.
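A declaration consistent with this description is sketched below. It is inferred from the constructors Lf and Br as they are used in the rest of the chapter; the values one and three are illustrative.

```sml
(* A tree is either a leaf or a branch with a label and two subtrees *)
datatype 'a tree = Lf
                 | Br of 'a * 'a tree * 'a tree;

(* The base case, then trees built from smaller trees *)
val none : int tree = Lf;
val one = Br (1, Lf, Lf);
val three = Br (2, one, Br (3, Lf, Lf));

(* Functions on trees follow the same recursion as the datatype *)
fun size Lf = 0
  | size (Br (_, t1, t2)) = 1 + size t1 + size t2;

val n = size three;
```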
Here is a tree labelled with strings:
val birnam =
Br ("The", Br ("wood", Lf ,
Br ("of", Br ("Birnam", Lf , Lf ),
Lf )),
Lf );
> val birnam = Br ("The", ..., Lf) : string tree
Here are some trees labelled with integers. Note how trees can be combined to
form bigger ones.
val tree2 = Br (2, Br (1,Lf ,Lf ), Br (3,Lf ,Lf ));
> val tree2 = Br (2, Br (1, Lf, Lf),
> Br (3, Lf, Lf)) : int tree
val tree5 = Br (5, Br (6,Lf,Lf), Br (7,Lf,Lf));
> val tree5 = Br (5, Br (6, Lf, Lf),
>                    Br (7, Lf, Lf)) : int tree
val tree4 = Br (4, tree2, tree5);
4.10 A type for binary trees 143
[Figure: the trees birnam (labels The, wood, of, Birnam) and tree4 (labels 1 to 7)]
fun size Lf = 0
| size (Br (v ,t1,t2)) = 1 + size t1 + size t2;
> val size = fn : ’a tree -> int
size birnam;
> 4 : int
size tree4;
> 7 : int
Another measure of the size of a tree is its depth: the length of the longest path
from the root to a leaf.
fun depth Lf = 0
| depth (Br (v ,t1,t2)) = 1 + Int.max (depth t1, depth t2);
> val depth = fn : ’a tree -> int
depth birnam;
> 4 : int
depth tree4;
> 3 : int
Observe that birnam is rather deep for its size while tree4 is as shallow as
size(t) ≤ 2^{depth(t)} − 1
Another function over trees, reflect, forms the mirror image of a tree by exchanging
left and right subtrees all the way down:
fun reflect Lf = Lf
| reflect (Br (v ,t1,t2)) = Br (v , reflect t2, reflect t1);
> val reflect = fn : ’a tree -> ’a tree
reflect tree4;
> Br (4, Br (5, Br (7, Lf, Lf),
> Br (6, Lf, Lf)),
> Br (2, Br (3, Lf, Lf),
> Br (1, Lf, Lf))) : int tree
Exercise 4.15 Write a function that determines whether two arbitrary trees t
and u satisfy t = reflect(u). The function should not build any new trees, so it
should not call reflect or Br , although it may use Br in patterns.
Exercise 4.16 Lists need not have been built into ML. Give a datatype
declaration of a type equivalent to α list.
Exercise 4.17 Declare a datatype (α, β)ltree of labelled binary trees, where
branch nodes carry a label of type α and leaves carry a label of type β.
Exercise 4.18 Declare a datatype of trees where each branch node may have
any finite number of branches. (Hint: use list.)
An inorder list places the label between the labels from the left and right sub-
trees, giving a strict left-to-right traversal:
fun inorder Lf = []
| inorder (Br (v ,t1,t2)) = inorder t1 @ [v ] @ inorder t2;
> val inorder = fn : ’a tree -> ’a list
inorder birnam;
> ["wood", "Birnam", "of", "The"] : string list
inorder tree4;
> [1, 2, 3, 4, 6, 5, 7] : int list
fun postorder Lf = []
| postorder (Br (v ,t1,t2)) = postorder t1 @ postorder t2 @ [v ];
> val postorder = fn : ’a tree -> ’a list
postorder birnam;
> ["Birnam", "of", "wood", "The"] : string list
postorder tree4;
> [1, 3, 2, 6, 7, 5, 4] : int list
Although these functions are clear, they take quadratic time on badly unbalanced
trees. The culprit is the appending (@) of long lists. It can be eliminated using
an extra argument vs to accumulate the labels. The following versions perform
exactly one cons (::) operation per branch node:
fun preord (Lf , vs) = vs
| preord (Br (v ,t1,t2), vs) = v :: preord (t1, preord (t2,vs));
These definitions are worth study; many functions are declared similarly. For
instance, logical terms are essentially trees. The list of all the constants in a
term can be built as above.
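The same accumulator technique extends to the other two traversals. The sketch below assumes the names inord and postord by analogy with preord, and assumes that tree4 is Br (4, tree2, tree5), as the outputs shown earlier suggest.

```sml
datatype 'a tree = Lf | Br of 'a * 'a tree * 'a tree;

(* Each function puts the labels of the tree in front of the list vs *)
fun preord (Lf, vs) = vs
  | preord (Br (v, t1, t2), vs) = v :: preord (t1, preord (t2, vs));

fun inord (Lf, vs) = vs
  | inord (Br (v, t1, t2), vs) = inord (t1, v :: inord (t2, vs));

fun postord (Lf, vs) = vs
  | postord (Br (v, t1, t2), vs) = postord (t1, postord (t2, v :: vs));

val tree2 = Br (2, Br (1, Lf, Lf), Br (3, Lf, Lf));
val tree5 = Br (5, Br (6, Lf, Lf), Br (7, Lf, Lf));
val tree4 = Br (4, tree2, tree5);

val pre  = preord (tree4, []);
val ino  = inord (tree4, []);
val post = postord (tree4, []);
```

In each function the right subtree is processed first, so that its labels are already in the accumulator when the left subtree's labels are put in front of them.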
Exercise 4.19 Describe how inorder (birnam) and inord (birnam, []) are eval-
uated, reporting how many cons operations are performed.
Exercise 4.20 Complete the following equations and explain why they are cor-
rect.
preorder (reflect(t)) =?
inorder (reflect(t)) =?
postorder (reflect(t)) =?
[Figure: five binary trees, each with the labels 1, 2 and 3]
Only one of these trees is balanced. To construct balanced trees, divide the list
of labels roughly in half. The subtrees may differ in size (number of nodes) by
at most 1.
To make a balanced tree from a preorder list of labels, the first label is attached
to the root of the tree:
fun balpre [] = Lf
| balpre(x ::xs) =
let val k = length xs div 2
in Br (x , balpre(List.take(xs,k )), balpre(List.drop(xs,k )))
end;
> val balpre = fn : ’a list -> ’a tree
To make a balanced tree from an inorder list, the label is taken from the middle.
This resembles the top-down merge sort of Section 3.21:
fun balin [] = Lf
| balin xs =
let val k = length xs div 2
val y::ys = List.drop(xs,k )
in Br (y, balin (List.take(xs,k )), balin ys)
end;
> val balin = fn : ’a list -> ’a tree
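A quick check of both constructions is sketched below: traversing the result should give back the original list, and the depth should be as small as the number of labels allows. The list labels and the traversal definitions are repeated here only to make the example self-contained.

```sml
datatype 'a tree = Lf | Br of 'a * 'a tree * 'a tree;

fun balpre [] = Lf
  | balpre (x :: xs) =
      let val k = length xs div 2
      in  Br (x, balpre (List.take (xs, k)), balpre (List.drop (xs, k)))  end;

fun balin [] = Lf
  | balin xs =
      let val k = length xs div 2
          val y :: ys = List.drop (xs, k)   (* never fails: xs is non-empty *)
      in  Br (y, balin (List.take (xs, k)), balin ys)  end;

fun preorder Lf = []
  | preorder (Br (v, t1, t2)) = [v] @ preorder t1 @ preorder t2;

fun inorder Lf = []
  | inorder (Br (v, t1, t2)) = inorder t1 @ [v] @ inorder t2;

fun depth Lf = 0
  | depth (Br (_, t1, t2)) = 1 + Int.max (depth t1, depth t2);

val labels = [1, 2, 3, 4, 5, 6, 7];
val okPre = (preorder (balpre labels) = labels);
val okIn  = (inorder (balin labels) = labels);
val depths = (depth (balpre labels), depth (balin labels));
```

Seven labels fit exactly into a tree of depth 3, which is what both functions produce here.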
Exercise 4.22 The function balpre constructs one tree from a preorder list of
labels. Write a function that, given a list of labels, constructs the list of all trees
that have those labels in preorder.
structure Tree =
struct
fun size Lf = 0
| size (Br (v ,t1,t2)) = 1 + size t1 + size t2;
Exercise 4.23 Let us put the datatype declaration inside the structure, then
make the constructors available outside using these declarations:
val Lf = Tree.Lf ;
val Br = Tree.Br ;
4.14 Dictionaries
A dictionary is a collection of items, each identified by a unique key
(typically a string). It supports the following operations:
• Lookup a key and return the item associated with it.
• Insert a new key (not already present) and an associated item.
• Update the item associated with an existing key (insert it if the key is
not already present).
We can make this description more precise by writing an ML signature:
signature DICTIONARY =
  sig
  type key
  type 'a t
  exception E of key
  val empty  : 'a t
  val lookup : 'a t * key -> 'a
  val insert : 'a t * key * 'a -> 'a t
  val update : 'a t * key * 'a -> 'a t
  end;
The signature has more than the three operations described above. What are the
other things for?
• key is the type of search keys.5
• α t is the type of dictionaries whose stored items have type α.
• E is the exception raised when errors occur. Lookup fails if the key is
not found, while insert fails if the key is already there. The exception
carries the rejected key.
• empty is the empty dictionary.
A structure matching this signature must declare the dictionary operations with
appropriate types. For instance, the function lookup takes a dictionary and a
key, and returns an item. Nothing in signature DICTIONARY indicates that trees
are involved. We may adopt any representation.
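To illustrate that the signature does not commit us to trees, here is a sketch of a structure that matches it using an association list. The structure name ListDict and the helper member are assumptions introduced for this example; the signature is repeated so that the code stands alone.

```sml
signature DICTIONARY =
  sig
  type key
  type 'a t
  exception E of key
  val empty  : 'a t
  val lookup : 'a t * key -> 'a
  val insert : 'a t * key * 'a -> 'a t
  val update : 'a t * key * 'a -> 'a t
  end;

structure ListDict : DICTIONARY =
  struct
  type key = string
  type 'a t = (key * 'a) list
  exception E of key
  val empty = []

  fun lookup ([], b) = raise E b
    | lookup ((a, x) :: rest, b) = if a = b then x else lookup (rest, b)

  (* Helper, hidden by the signature constraint *)
  fun member ([], b : key) = false
    | member ((a, _) :: rest, b) = (a = b) orelse member (rest, b)

  fun insert (d, b, y) = if member (d, b) then raise E b else (b, y) :: d

  fun update ([], b, y) = [(b, y)]
    | update ((a, x) :: rest, b, y) =
        if a = b then (b, y) :: rest else (a, x) :: update (rest, b, y)
  end;

val d = ListDict.insert (ListDict.empty, "France", 33);
val n = ListDict.lookup (d, "France");
```

A program that uses only the names in the signature could switch between this structure and a tree-based one without other changes.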
A binary search tree can implement a dictionary. A reasonably balanced tree
(Figure 4.1) is considerably more efficient than an association list of (key, item)
pairs. The time required to search for a key among n items is order n for lists
and order log n for binary search trees. The time required to update the tree is
also of order log n. An association list can be updated in constant time, but this
does not compensate for the long search time.
In the worst case, binary search trees are actually slower than association lists.
A series of updates can create a highly unbalanced tree. Search and update can
take up to n steps for a tree of n items.
The keys in an association list may have any type that admits equality, but
the keys in a binary search tree must come with a linear ordering. Strings, with
5 Here it is string; Section 7.10 will generalize binary search trees to take the
type of search keys as a parameter.
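[Figure 4.1: a binary search tree. The root is (Hungary, 36); its left subtree is
(France, 33) with left child (Egypt, 20); its right subtree is (Mexico, 52) with
left child (Japan, 81)]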
alphabetic ordering, are an obvious choice. Each branch node of the tree carries
a (string, item) pair; its left subtree holds only lesser strings; the right subtree
holds only greater strings. The inorder list of labels puts the strings in alphabetic
order.
Unlike tree operations that might be coded in Pascal, the update and insert
operations do not modify the current tree. Instead, they create a new tree. This
is less wasteful than it sounds: the new tree shares most of its storage with the
existing tree.
Figure 4.2 presents the structure Dict, which is an instance of signature
DICTIONARY. It starts by declaring types key and α t, and exception E. The type
declarations are only abbreviations, but they must be present in order to satisfy
the signature.
Lookup in a binary search tree is simple. At a branch node, look left if the
key being sought is less than the current label, and right if it is greater. If
the key is not found, the function raises exception E. Observe the use of the
datatype order.
Insertion of a (string, item) pair involves locating the correct position for the
string, then inserting the item. As with lookup, comparing the string with the
current label determines whether to look left or right. Here the result is a new
branch node; one subtree is updated and the other borrowed from the original
tree. If the string is found in the tree, an exception results.
In effect, insert copies the path from the root of the tree to the new node.
Function update is identical apart from its result if the string is found in the
tree.
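A minimal sketch of insert and update along the lines just described may help. It is not the code of Figure 4.2: it assumes the chapter's tree datatype, string keys compared with String.compare, and an exception E that carries the key. Only the nodes on the path to the affected position are rebuilt; the other subtree at each step is reused unchanged.

```sml
datatype 'a tree = Lf | Br of 'a * 'a tree * 'a tree;
exception E of string;

(* insert: the key must be absent; copies only the path to the new node. *)
fun insert (Lf, b, y) = Br ((b, y), Lf, Lf)
  | insert (Br ((a, x), t1, t2), b, y) =
      (case String.compare (a, b) of
           GREATER => Br ((a, x), insert (t1, b, y), t2)    (* t2 is shared *)
         | EQUAL   => raise E b                              (* key present  *)
         | LESS    => Br ((a, x), t1, insert (t2, b, y)));   (* t1 is shared *)

(* update: like insert, but replaces the item if the key is present. *)
fun update (Lf, b, y) = Br ((b, y), Lf, Lf)
  | update (Br ((a, x), t1, t2), b, y) =
      (case String.compare (a, b) of
           GREATER => Br ((a, x), update (t1, b, y), t2)
         | EQUAL   => Br ((a, y), t1, t2)
         | LESS    => Br ((a, x), t1, update (t2, b, y)));

val ctree1 = insert (insert (Lf, "France", 33), "Egypt", 20);
```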
[Figure 4.2, surviving fragment of structure Dict]
exception E of key;
val empty = Lf ;
end;
The exception in lookup is easily eliminated because that function is iterative.
It could return a result of type α option, namely SOME x if the key is
found and NONE otherwise. The exception in insert is another matter: since
the recursive calls construct a new tree, returning SOME t or NONE would be
cumbersome. Function insert could call lookup and update, eliminating the
exception but doubling the number of comparisons.
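As an illustration of the first suggestion, here is a minimal sketch of an exception-free lookup. The name lookupOpt is hypothetical; the sketch assumes the chapter's tree datatype and string keys compared with String.compare.

```sml
datatype 'a tree = Lf | Br of 'a * 'a tree * 'a tree;

(* lookupOpt: returns SOME x if the key is found and NONE otherwise. *)
fun lookupOpt (Lf, _) = NONE
  | lookupOpt (Br ((a, x), t1, t2), b) =
      (case String.compare (a, b) of
           GREATER => lookupOpt (t1, b)
         | EQUAL   => SOME x
         | LESS    => lookupOpt (t2, b));

val ctree1 = Br (("France", 33), Br (("Egypt", 20), Lf, Lf), Lf);
```

The caller then inspects the result by pattern-matching on SOME and NONE instead of installing a handler.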
Binary search trees are built from the empty tree (Lf ) by repeated updates or
inserts. We construct a tree ctree1 containing France and Egypt:
Dict.insert(Lf , "France", 33);
> Br (("France", 33), Lf, Lf) : int Dict.t
val ctree1 = Dict.insert(it, "Egypt", 20);
> val ctree1 = Br (("France", 33),
> Br (("Egypt", 20), Lf, Lf),
> Lf) : int Dict.t
Note that ctree1 still exists, even though ctree2 has been constructed from it.
Dict.lookup(ctree1, "France");
> 33 : int
Dict.lookup(ctree2, "Mexico");
> 52 : int
Dict.lookup(ctree1, "Mexico");
> Exception: E
Inserting items at random can create unbalanced trees. If most of the insertions
occur first, followed by many lookups, then it pays to balance the tree before the
lookups. Since a binary search tree corresponds to a sorted inorder list, it can be
balanced by converting it to inorder, then constructing a new tree:
Tree.inord (ctree2, []);
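One way to rebuild a balanced tree from the inorder list can be sketched as follows. The functions build and rebalance are hypothetical names, not library functions, and inord is written out here on the assumption that Tree.inord accumulates the labels in front of its second argument. The value chain is a deliberately degenerate tree used only for illustration.

```sml
datatype 'a tree = Lf | Br of 'a * 'a tree * 'a tree;

(* Inorder list of labels, accumulated in front of vs. *)
fun inord (Lf, vs) = vs
  | inord (Br (v, t1, t2), vs) = inord (t1, v :: inord (t2, vs));

fun depth Lf = 0
  | depth (Br (_, t1, t2)) = 1 + Int.max (depth t1, depth t2);

(* build (n, xs): make a balanced tree from the first n elements of xs,
   returning it together with the unused elements. *)
fun build (0, xs) = (Lf, xs)
  | build (n, xs) =
      (case build (n div 2, xs) of
           (t1, x :: rest) =>
             let val (t2, rest') = build (n - 1 - n div 2, rest)
             in  (Br (x, t1, t2), rest')  end
         | (_, []) => raise Empty);      (* fewer than n elements supplied *)

fun rebalance t =
  let val xs = inord (t, [])
  in  #1 (build (length xs, xs))  end;

(* A degenerate tree: every left subtree is empty. *)
val chain = foldr (fn (x, t) => Br (x, Lf, t)) Lf [1, 2, 3, 4, 5, 6, 7];
```

Since the inorder list is already sorted, the rebuilt tree is again a binary search tree with the same labels, and the whole operation is linear in the number of items.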
Exercise 4.24 Give four examples of a binary search tree whose depth equals 5
and that contains only the 5 labels of ctree2. For each tree, show a sequence of
insertions that creates it.
A[k] := x,
changing the machine state such that A[k] = x. The previous contents of A[k]
are lost. Updating in place is highly efficient, both in time and space, but it is
hard to reconcile with functional programming.
4.15 Functional and flexible arrays 155
[Figure: a binary tree labelled with array subscripts, level by level:
1; 2, 3; 4, 6, 5, 7; 8, 12, 10, 14, 9, 13, 11, 15.]
A flexible array augments the usual lookup and update with operations to insert
or delete elements from either end of the array. A program starts with an empty
array and inserts elements as required. Gaps are forbidden: element n + 1 must
be defined after element n, for n > 0. Let us examine the underlying tree
operations, which have been credited to W. Braun.
The lookup function, sub, divides the subscript by 2 until 1 is reached. If the
remainder is 0 then the function follows the left subtree, otherwise the right. If
it reaches a leaf, it signals error by raising a standard exception.
fun sub (Lf, _) = raise Subscript
  | sub (Br (v,t1,t2), k) =
      if k = 1 then v
      else if k mod 2 = 0
           then sub (t1, k div 2)
           else sub (t2, k div 2);
> val sub = fn : 'a tree * int -> 'a
The update function, update, also divides the subscript repeatedly by 2. When
it reaches 1 it replaces the branch node by another branch with the new label. A
leaf may be replaced by a branch, extending the array, provided no intervening
nodes have to be generated. This suffices for arrays without gaps.
fun update (Lf, k, w) =
      if k = 1 then Br (w, Lf, Lf)
      else raise Subscript
  | update (Br (v,t1,t2), k, w) =
      if k = 1 then Br (w, t1, t2)
      else if k mod 2 = 0
           then Br (v, update (t1, k div 2, w), t2)
           else Br (v, t1, update (t2, k div 2, w));
> val update = fn : 'a tree * int * 'a -> 'a tree
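A short usage sketch shows the two functions working together. It repeats the definitions of sub and update so that it is self-contained, and assumes the chapter's tree datatype; the names arr5 and arr5' are hypothetical.

```sml
datatype 'a tree = Lf | Br of 'a * 'a tree * 'a tree;

fun sub (Lf, _) = raise Subscript
  | sub (Br (v, t1, t2), k) =
      if k = 1 then v
      else if k mod 2 = 0 then sub (t1, k div 2)
      else sub (t2, k div 2);

fun update (Lf, k, w) =
      if k = 1 then Br (w, Lf, Lf) else raise Subscript
  | update (Br (v, t1, t2), k, w) =
      if k = 1 then Br (w, t1, t2)
      else if k mod 2 = 0 then Br (v, update (t1, k div 2, w), t2)
      else Br (v, t1, update (t2, k div 2, w));

(* Fill positions 1 to 5 in order, storing 10*k at position k. *)
val arr5 = foldl (fn (k, t) => update (t, k, 10 * k)) Lf [1, 2, 3, 4, 5];

(* Updating yields a new array; arr5 itself is unchanged. *)
val arr5' = update (arr5, 3, 0);
```

Position 4 ends up below position 2, and position 5 below position 3, as the subscript-halving rule predicts.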
Letting a flexible array grow and shrink from above is simple. Just store the
upper bound with the binary tree and use update and delete. But how can we
let the array grow and shrink from below? As the lower bound is fixed, this
seems to imply shifting all the elements.
Consider extending a tree from below with the element w . The result has w
at position 1, replacing the previous element v . Its right subtree (positions 3, 5,
. . .) is simply the old left subtree (positions 2, 4, . . .). By a recursive call, its left
subtree has v at position 2 and takes the rest (positions 4, 6, . . .) from the old
right subtree (positions 3, 5, . . .).
fun loext (Lf, w) = Br (w, Lf, Lf)
  | loext (Br (v,t1,t2), w) = Br (w, loext (t2,v), t1);
> val loext = fn : 'a tree * 'a -> 'a tree
loext(Lf ,"A");
> Br ("A", Lf, Lf) : string tree
loext(it,"B");
> Br ("B", Br ("A", Lf, Lf), Lf) : string tree
loext(it,"C");
> Br ("C", Br ("B", Lf, Lf), Br ("A", Lf, Lf))
> : string tree
loext(it,"D");
> Br ("D", Br ("C", Br ("A", Lf, Lf), Lf),
> Br ("B", Lf, Lf)) : string tree
val tlet = loext(it,"E");
> val tlet = Br ("E", Br ("D", Br ("B", Lf, Lf), Lf),
> Br ("C", Br ("A", Lf, Lf), Lf))
> : string tree
[Figure: the tree tlet. The root is E; its children are D and C; B lies below D
and A lies below C.]
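Removal from below reverses this construction. The following is a minimal sketch under the assumption that lorem is simply the inverse of loext; the text names lorem but this definition is an illustration, not necessarily the code of Figure 4.3. It repeats loext so that it is self-contained.

```sml
datatype 'a tree = Lf | Br of 'a * 'a tree * 'a tree;

fun loext (Lf, w) = Br (w, Lf, Lf)
  | loext (Br (v, t1, t2), w) = Br (w, loext (t2, v), t1);

(* Assumed inverse of loext: discard the element at position 1.
   In these trees an empty left subtree implies an empty right subtree. *)
fun lorem Lf = raise Size
  | lorem (Br (_, Lf, _)) = Lf
  | lorem (Br (_, t1 as Br (v, _, _), t2)) = Br (v, t2, lorem t1);

(* The five-element array built in the session above. *)
val tlet = foldl (fn (w, t) => loext (t, w)) Lf ["A", "B", "C", "D", "E"];
```

The new root is the old element 2, the new left subtree is the old right subtree, and the recursive call removes the first element of the old left subtree.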
Updating elements of tlet does not affect that array, but creates a new array:
val tdag = update(update (tlet, 5, "Amen"),
2, "dagger");
> val tdag =
> Br ("E", Br ("dagger", Br ("B", Lf, Lf), Lf),
> Br ("C", Br ("Amen", Lf, Lf), Lf))
> : string tree
sub(tdag,5);
> "Amen" : string
sub(tlet,5);
> "A" : string
The binary tree remains balanced after each operation. Lookup and update with
subscript k take order log k steps, the best possible time complexity for any data
structure of unbounded size. Access to a million-element array will be twice as
slow as access to a thousand-element array.
The standard library structure Array provides imperative arrays. They will
be used in Chapter 8 to implement functional (but not flexible) arrays. That
implementation gives fast, constant access time — if the array is used in an
imperative style. Imperative arrays are so prevalent that functional applications
require some imagination.
[Figure 4.3, outline]
structure Braun =
struct
fun sub ...
fun update ...
fun delete ...
fun loext ...
fun lorem ...
end;
end;
Here is a signature for flexible arrays. It is based on the structure Array,
but includes extension and removal from below (loext, lorem) and from above
(hiext, hirem).
signature FLEXARRAY =
sig
type 'a array
val empty : 'a array
val length : 'a array -> int
val sub : 'a array * int -> 'a
val update : 'a array * int * 'a -> 'a array
val loext : 'a array * 'a -> 'a array
val lorem : 'a array -> 'a array
val hiext : 'a array * 'a -> 'a array
val hirem : 'a array -> 'a array
end;
Figure 4.3 presents the implementation. The basic tree manipulation functions
are packaged as the structure Braun, to prevent name clashes with the analogous
functions in structure Flex. Incidentally, Braun subscripts range from 1
to n while Flex subscripts range from 0 to n − 1. The former arises from the
representation, the latter from ML convention.
Structure Flex represents a flexible array by a binary tree paired with an
integer, its size. It might have declared type array as a type abbreviation:
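The pairing of tree and size can be illustrated with a minimal sketch. The structure name FlexSketch, its use of a plain pair, and the bounds checks are assumptions made for illustration; this is not the code of Figure 4.3, and only a few of the operations of the signature are shown. Subscripts are shifted by one to pass from the 0-based convention to the 1-based tree functions.

```sml
datatype 'a tree = Lf | Br of 'a * 'a tree * 'a tree;

structure Braun =
struct
  fun sub (Lf, _) = raise Subscript
    | sub (Br (v, t1, t2), k) =
        if k = 1 then v
        else if k mod 2 = 0 then sub (t1, k div 2)
        else sub (t2, k div 2);

  fun update (Lf, k, w) =
        if k = 1 then Br (w, Lf, Lf) else raise Subscript
    | update (Br (v, t1, t2), k, w) =
        if k = 1 then Br (w, t1, t2)
        else if k mod 2 = 0 then Br (v, update (t1, k div 2, w), t2)
        else Br (v, t1, update (t2, k div 2, w));
end;

structure FlexSketch =
struct
  type 'a array = 'a tree * int;          (* a tree paired with its size *)

  val empty = (Lf, 0);

  fun length (_, n) = n;

  (* Subscripts here run from 0 to n-1; Braun subscripts from 1 to n. *)
  fun sub ((t, n), k) =
    if 0 <= k andalso k < n then Braun.sub (t, k + 1)
    else raise Subscript;

  fun update ((t, n), k, w) =
    if 0 <= k andalso k < n then (Braun.update (t, k + 1, w), n)
    else raise Subscript;

  (* Extend from above: the new element takes Braun position n+1. *)
  fun hiext ((t, n), w) = (Braun.update (t, n + 1, w), n + 1);
end;

val a3 = foldl (fn (w, a) => FlexSketch.hiext (a, w))
               FlexSketch.empty ["a", "b", "c"];
```

Keeping the size alongside the tree is what lets sub and update reject out-of-range subscripts before they reach the tree.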
Exercise 4.27 Write a function to convert the array consisting of the elements
x1 , x2 , . . . , xn (in subscript positions 1 to n) to a list. Operate directly on the
tree, without repeated subscripting.
Exercise 4.28 Implement sparse arrays, which may have large gaps between
elements, by allowing empty labels in the tree.
[Figure: a binary tree of subscripts, level by level:
1; 2, 3; 4, 5, 6, 7; 8, 9, 10, 11, 12, 13, 14, 15.]
condition. It may end up higher in the tree, forcing larger items downwards.
Function insert works on the same principle as loext, but maintains the heap
condition while inserting the item. Unless the new item w exceeds the current
label v , it becomes the new label and v is inserted further down; because v does
not exceed any item in the subtrees, neither does w .
fun insert(w : real , Lf ) = Br (w , Lf , Lf )
| insert(w , Br (v , t1, t2)) =
if w <= v then Br (w , insert(v , t2), t1)
else Br (v , insert(w , t2), t1);
> val insert = fn : real * real tree -> real tree
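A small sketch can confirm that insert preserves the heap condition. It repeats insert so that it is self-contained; the checker isHeap and its helper atMost are hypothetical names introduced only for this illustration, as is the sample list.

```sml
datatype 'a tree = Lf | Br of 'a * 'a tree * 'a tree;

fun insert (w: real, Lf) = Br (w, Lf, Lf)
  | insert (w, Br (v, t1, t2)) =
      if w <= v then Br (w, insert (v, t2), t1)
      else Br (v, insert (w, t2), t1);

(* atMost (v, t): v does not exceed the root label of t, if t has one. *)
fun atMost (_: real, Lf) = true
  | atMost (v, Br (u, _, _)) = v <= u;

(* Heap condition: no label exceeds the labels of its subtrees. *)
fun isHeap Lf = true
  | isHeap (Br (v, t1, t2)) =
      atMost (v, t1) andalso atMost (v, t2)
      andalso isHeap t1 andalso isHeap t2;

val heap = foldl insert Lf [9.0, 3.0, 7.0, 1.0];
```

Each insertion goes into the old right subtree and then exchanges the subtrees, exactly as loext does, so the tree stays balanced as it grows.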
Function siftdown makes a heap from the displaced item and the two subtrees
of the old heap. The item moves down the tree. At each branch node, it follows
the smaller of the subtrees’ labels. It stops when no subtree has a smaller label.
Thanks to the indexing scheme, the cases considered below are the only ones
possible. If the left subtree is empty then so is the right; if the right subtree is
empty then the left subtree can have only one label.
fun siftdown (w:real, Lf, Lf) = Br (w,Lf,Lf)
  | siftdown (w, t as Br (v,Lf,Lf), Lf) =
      if w <= v then Br (w, t, Lf)
      else Br (v, Br (w,Lf,Lf), Lf)
  | siftdown (w, t1 as Br (v1,p1,q1), t2 as Br (v2,p2,q2)) =
      if w <= v1 andalso w <= v2 then Br (w,t1,t2)
      else if v1 <= v2 then Br (v1, siftdown (w,p1,q1), t2)
      (*v2 < v1*) else Br (v2, t1, siftdown (w,p2,q2));
> val siftdown = fn
>   : real * real tree * real tree -> real tree
Now we can perform deletions. Function delmin calls leftrem to delete and
return an item, then siftdown to put the item back in some suitable place. Deletion
from the empty heap is an error, and the one-element heap is treated separately.
fun delmin Lf = raise Size
| delmin (Br (v ,Lf ,_)) = Lf
| delmin (Br (v ,t1,t2)) =
let val (w ,t) = leftrem t1
in siftdown (w ,t2,t) end;
> val delmin = fn : real tree -> real tree
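To see delmin at work, the sketch below supplies a definition of leftrem, which the text names but does not show here. That definition is an assumption: it removes the leftmost item and exchanges subtrees on the way back up so that the tree stays balanced. The functions insert, siftdown and delmin are repeated from the text; toList is a hypothetical helper that drains a heap in increasing order.

```sml
datatype 'a tree = Lf | Br of 'a * 'a tree * 'a tree;

fun insert (w: real, Lf) = Br (w, Lf, Lf)
  | insert (w, Br (v, t1, t2)) =
      if w <= v then Br (w, insert (v, t2), t1)
      else Br (v, insert (w, t2), t1);

(* Assumed: remove and return the leftmost item, exchanging subtrees. *)
fun leftrem (Br (v, Lf, Lf)) = (v, Lf)
  | leftrem (Br (v, t1, t2)) =
      let val (w, t) = leftrem t1
      in  (w, Br (v, t2, t))  end;

fun siftdown (w: real, Lf, Lf) = Br (w, Lf, Lf)
  | siftdown (w, t as Br (v, Lf, Lf), Lf) =
      if w <= v then Br (w, t, Lf)
      else Br (v, Br (w, Lf, Lf), Lf)
  | siftdown (w, t1 as Br (v1, p1, q1), t2 as Br (v2, p2, q2)) =
      if w <= v1 andalso w <= v2 then Br (w, t1, t2)
      else if v1 <= v2 then Br (v1, siftdown (w, p1, q1), t2)
      else Br (v2, t1, siftdown (w, p2, q2));

fun delmin Lf = raise Size
  | delmin (Br (v, Lf, _)) = Lf
  | delmin (Br (v, t1, t2)) =
      let val (w, t) = leftrem t1
      in  siftdown (w, t2, t)  end;

(* Hypothetical helper: repeatedly take the root, then delete it. *)
fun toList Lf = []
  | toList (t as Br (v, _, _)) = v :: toList (delmin t);

val heap = foldl insert Lf [4.0, 2.0, 6.0, 1.0, 5.0, 8.0, 5.0];
```

The compiler will warn that leftrem and siftdown do not cover every pattern; the omitted cases cannot arise for trees built by these functions, for the reasons given in the text.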
Our signature for priority queues specifies the primitives discussed above. It also
specifies operations to convert between heaps and lists, and the corresponding
sorting function. It specifies item as the type of items in the queue; for now this
is real , but a functor can take any ordered type as a parameter (see Section 7.10).
signature PRIORITY_QUEUE =
sig
type item
type t
val empty : t
val null : t -> bool
val insert : item * t -> t
val min : t -> item
val delmin : t -> t
val fromList : item list -> t
val toList : t -> item list
val sort : item list -> item list
end;
Figure 4.4 displays the structure for heaps. It uses the obvious definitions of
empty and the predicate null . The function min merely returns the root.
Priority queues easily implement heap sort. We sort a list by converting it to
a heap and back again. Function heapify converts a list into a heap in a manner
reminiscent of top-down merge sort (Section 3.21). This approach, using
siftdown, builds the heap in linear time; repeated insertions would require
order n log n time. Heap sort's time complexity is optimal: it takes order n log n
time to sort n items in the worst case. In practice, heap sort tends to be slower
than other n log n algorithms. Recall our timing experiments of Chapter 3.
Quick sort and merge sort can process 10,000 random numbers in 200 msec
or less, but Heap.sort takes 500 msec.
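The top-down construction can be sketched as follows. The text names heapify and fromList but does not show them here, so these definitions are assumptions: heapify (n, vs) takes the first item as a provisional root, builds heaps of sizes n div 2 and (n-1) div 2 from the following items, and lets siftdown place the root. The checker isHeap and the helper size are hypothetical names used only to inspect the result.

```sml
datatype 'a tree = Lf | Br of 'a * 'a tree * 'a tree;

fun siftdown (w: real, Lf, Lf) = Br (w, Lf, Lf)
  | siftdown (w, t as Br (v, Lf, Lf), Lf) =
      if w <= v then Br (w, t, Lf)
      else Br (v, Br (w, Lf, Lf), Lf)
  | siftdown (w, t1 as Br (v1, p1, q1), t2 as Br (v2, p2, q2)) =
      if w <= v1 andalso w <= v2 then Br (w, t1, t2)
      else if v1 <= v2 then Br (v1, siftdown (w, p1, q1), t2)
      else Br (v2, t1, siftdown (w, p2, q2));

(* Assumed: build a heap from the first n items of the list,
   returning it together with the unused items. *)
fun heapify (0, vs) = (Lf, vs)
  | heapify (n, v :: vs) =
      let val (t1, vs1) = heapify (n div 2, vs)
          val (t2, vs2) = heapify ((n - 1) div 2, vs1)
      in  (siftdown (v, t1, t2), vs2)  end;

fun fromList vs = #1 (heapify (length vs, vs));

(* Hypothetical helpers for inspecting the result. *)
fun size Lf = 0
  | size (Br (_, t1, t2)) = 1 + size t1 + size t2;

fun atMost (_: real, Lf) = true
  | atMost (v, Br (u, _, _)) = v <= u;

fun isHeap Lf = true
  | isHeap (Br (v, t1, t2)) =
      atMost (v, t1) andalso atMost (v, t2)
      andalso isHeap t1 andalso isHeap t2;

val heap = fromList [4.0, 2.0, 6.0, 1.0, 5.0, 8.0, 5.0];
```

Each item is sifted down at most the depth of the subtree it heads, and most subtrees are shallow, which is where the linear bound comes from.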
Although there are better ways of sorting, heaps make ideal priority queues.
To see how heaps work, let us build one and remove some items from it.
Heap.fromList [4.0, 2.0, 6.0, 1.0, 5.0, 8.0, 5.0];
4.16 Priority queues 163
val empty = Lf ;
end;
Observe that the smallest item is removed first. Let us apply delmin twice more:
Heap.delmin it;
> Br (5.0, Br (5.0, Br (8.0, Lf, Lf), Lf),
> Br (6.0, Lf, Lf)) : Heap.t
Heap.delmin it;
> Br (5.0, Br (6.0, Lf, Lf), Br (8.0, Lf, Lf)) : Heap.t
ML's response has been indented to emphasize the structure of the binary trees.
Other forms of priority queues. The heaps presented here are sometimes called
binary or implicit heaps. Algorithms textbooks such as Sedgewick (1988)
describe them in detail. Other traditional representations of priority queues can be coded
in a functional style. Leftist heaps (Knuth, 1973, page 151) and binomial heaps (Cormen
et al., 1990, page 400) are more complicated than binary heaps. However, they allow
heaps to be merged in logarithmic time. The merge operation for binomial heaps works
by a sort of binary addition. Chris Okasaki, who supplied most of the code for this
section, has implemented many other forms of priority queues. Binary heaps appear to
be the simplest and the fastest, provided we do not require merge.
Exercise 4.29 Draw diagrams of the heaps created by starting with the empty
heap and inserting 4, 2, 6, 1, 5, 8 and 5 (as in the call to Heap.fromList above).
Exercise 4.30 Describe the functional array indexing scheme in terms of the
binary notation for subscripts. Do the same for the conventional indexing scheme
of heap sort.
Exercise 4.31 Write ML functions for lookup and update on functional arrays,
represented by the conventional indexing scheme of heap sort. How do they
compare with Braun.sub and Braun.update?
4.17 Propositional Logic 165
A tautology checker
This section introduces elementary theorem proving. We define propositions
and functions to convert them into various normal forms, obtaining a
tautology checker for Propositional Logic. Rather than using binary trees, we
declare a datatype of propositions.
¬p a negation, ‘not p’
p∧q a conjunction, ‘p and q’
p∨q a disjunction, ‘p or q’
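A datatype along these lines can be sketched as follows. The constructor names Atom, Neg, Conj and Disj are those used by the functions later in this section; the exact form of the declaration, and the definition of implies as a derived connective, are assumptions made for illustration.

```sml
(* Assumed declaration: atoms are named by strings. *)
datatype prop = Atom of string
              | Neg  of prop
              | Conj of prop * prop
              | Disj of prop * prop;

(* Assumed encoding: p -> q is represented as (not p) or q. *)
fun implies (p, q) = Disj (Neg p, q);
```

Because implication is expressed through negation and disjunction, functions over prop need only four cases.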
Our example is based on some important attributes — being rich, landed and
saintly:
val rich = Atom "rich"
and landed = Atom "landed"
and saintly = Atom "saintly";
Here are two assumptions about the rich, the landed and the saintly.
• Assumption 1 is landed → rich: the landed are rich.
• Assumption 2 is ¬(saintly ∧ rich): one cannot be both saintly and rich.
A plausible conclusion is landed → ¬saintly: the landed are not saintly.
Let us give these assumptions and desired conclusion to ML:
val assumption1 = implies(landed , rich)
and assumption2 = Neg(Conj (saintly,rich));
> val assumption1 = Disj (Neg (Atom "landed"),
If the conclusion follows from the assumptions, then the following proposition is
a propositional theorem — a tautology. Let us declare it as a goal to be proved:
val goal = implies(Conj (assumption1,assumption2), concl );
> val goal =
> Disj (Neg (Conj (Disj (Neg (Atom "landed"),
> Atom "rich"),
> Neg (Conj (Atom "saintly",
> Atom "rich")))),
> Disj (Neg (Atom "landed"), Neg (Atom "saintly")))
> : prop
For a more readable display, let us declare a function for converting a proposition
to a string.
fun show (Atom a) = a
  | show (Neg p) = "(~" ^ show p ^ ")"
  | show (Conj (p,q)) = "(" ^ show p ^ " & " ^ show q ^ ")"
  | show (Disj (p,q)) = "(" ^ show p ^ " | " ^ show q ^ ")";
> val show = fn : prop -> string
Spaces and line breaks have been inserted in the output above to make it more
legible, as elsewhere in this book.
¬¬p by p
¬(p ∧ q) by (¬p) ∨ (¬q)
¬(p ∨ q) by (¬p) ∧ (¬q)
Such replacements are sometimes called rewrite rules. First, consider whether
they make sense. Are they unambiguous? Yes, because the left sides of the
rules cover distinct cases. Will the replacements eventually stop? Yes; although
they can create additional negations, the negated parts shrink. How do we know
when to stop? Here a single sweep through the proposition suffices.
Function nnf applies these rules literally. Where no rule applies, it simply
makes recursive calls.
fun nnf (Atom a) = Atom a
| nnf (Neg (Atom a)) = Neg (Atom a)
| nnf (Neg (Neg p)) = nnf p
| nnf (Neg (Conj (p,q))) = nnf (Disj (Neg p, Neg q))
| nnf (Neg (Disj (p,q))) = nnf (Conj (Neg p, Neg q))
| nnf (Conj (p,q)) = Conj (nnf p, nnf q)
| nnf (Disj (p,q)) = Disj (nnf p, nnf q);
> val nnf = fn : prop -> prop
Making the function evaluate this expression directly saves a recursive call —
similarly for ¬(p ∨ q).
It can be faster still. A separate function to compute nnf (Neg p) avoids the
needless construction of negations. In mutual recursion, function nnfpos p
computes the normal form of p while nnfneg p computes the normal form of Neg p.
fun nnfpos (Atom a) = Atom a
| nnfpos (Neg p) = nnfneg p
| nnfpos (Conj (p,q)) = Conj (nnfpos p, nnfpos q)
| nnfpos (Disj (p,q)) = Disj (nnfpos p, nnfpos q)
and nnfneg (Atom a) = Neg (Atom a)
| nnfneg (Neg p) = nnfpos p
| nnfneg (Conj (p,q)) = Disj (nnfneg p, nnfneg q)
| nnfneg (Disj (p,q)) = Conj (nnfneg p, nnfneg q);
p ∨ (q ∧ r ) by (p ∨ q) ∧ (p ∨ r )
(q ∧ r ) ∨ p by (q ∨ p) ∧ (r ∨ p)
These replacements are less straightforward than those that yield negation
normal form. They are ambiguous (both of them apply to (a ∧ b) ∨ (c ∧ d)) but
the resulting normal forms are logically equivalent. Termination is assured;
although each replacement makes the proposition bigger, it replaces a disjunction
by smaller disjunctions, and this cannot go on forever.
A disjunction may contain buried conjunctions; take for instance
a ∨ (b ∨ (c ∧ d)). Our replacement strategy, given p ∨ q, first puts p and q
into CNF. This
4.19 Conjunctive normal form 169
brings any conjunctions to the top. Then applying the replacements distributes
the disjunctions into the conjunctions.
Calling distrib(p, q) computes the disjunction p ∨ q in CNF, given p and q
in CNF. If neither is a conjunction then the result is p ∨ q, the only case where
distrib makes a disjunction. Otherwise it distributes into a conjunction.
fun distrib (p, Conj (q,r )) = Conj (distrib(p,q), distrib(p,r ))
| distrib (Conj (q,r ), p) = Conj (distrib(q,p), distrib(r ,p))
| distrib (p, q) = Disj (p,q) (*no conjunctions*);
> val distrib = fn : prop * prop -> prop
The first two cases overlap: if both p and q are conjunctions then distrib(p, q)
takes the first case, because ML matches patterns in order. This is a natural way
to express the function. As we can see, distrib makes every possible disjunction
from the parts available:
distrib (Conj (rich,saintly), Conj (landed , Neg rich));
> Conj (Conj (Disj (Atom "rich", Atom "landed"),
> Disj (Atom "saintly", Atom "landed")),
> Conj (Disj (Atom "rich", Neg (Atom "rich")),
> Disj (Atom "saintly", Neg (Atom "rich"))))
> : prop
show it;
> "(((rich | landed) & (saintly | landed)) &
> ((rich | (~rich)) & (saintly | (~rich))))" : string
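The strategy described above, putting both parts of a disjunction into CNF and then distributing, can be sketched as a function cnf. The text uses cnf but does not show it here, so this definition is an assumption; it expects its argument to be in negation normal form already. The datatype declaration is likewise assumed, and distrib is repeated from the text.

```sml
datatype prop = Atom of string
              | Neg  of prop
              | Conj of prop * prop
              | Disj of prop * prop;

fun distrib (p, Conj (q, r)) = Conj (distrib (p, q), distrib (p, r))
  | distrib (Conj (q, r), p) = Conj (distrib (q, p), distrib (r, p))
  | distrib (p, q) = Disj (p, q);     (* no conjunctions *)

(* Assumed definition; the argument must be in negation normal form. *)
fun cnf (Conj (p, q)) = Conj (cnf p, cnf q)
  | cnf (Disj (p, q)) = distrib (cnf p, cnf q)
  | cnf p = p;                        (* an atom or a negated atom *)

(* The buried conjunction a | (b | (c & d)) mentioned above. *)
val buried = Disj (Atom "a", Disj (Atom "b", Conj (Atom "c", Atom "d")));
```

Applied to buried, the inner call first turns b ∨ (c ∧ d) into a conjunction, which the outer call to distrib can then see and distribute over.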
Finally, we convert the desired goal into CNF using cnf and nnf :
val cgoal = cnf (nnf goal );
> val cgoal = Conj (..., ...) : prop
show cgoal;
> "((((landed | saintly) | ((~landed) | (~saintly))) &
> (((~rich) | saintly) | ((~landed) | (~saintly)))) &
> (((landed | rich) | ((~landed) | (~saintly))) &
> (((~rich) | rich) | ((~landed) | (~saintly)))))"
> : string
This is indeed a tautology. Each of the four disjunctions contains some Atom
and its negation: landed , saintly, landed and rich, respectively. To detect this,
Function taut performs the tautology check on any CNF proposition, using inter
(see Section 3.15) to form the intersection of the positive and negative atoms.
The final outcome is perhaps an anticlimax.
fun taut (Conj (p,q)) = taut p andalso taut q
| taut p = not (null (inter (positives p, negatives p)));
> val taut = fn : prop -> bool
taut cgoal ;
> true : bool
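The whole pipeline can be assembled into one self-contained sketch. The functions nnf, distrib and taut are repeated from the text. The datatype declaration, implies, cnf, positives, negatives, inter and the exception NonCNF are assumptions written for this illustration: positives and negatives are taken to collect the atoms that occur unnegated and negated in a disjunction of literals, and inter to form a list intersection.

```sml
datatype prop = Atom of string | Neg of prop
              | Conj of prop * prop | Disj of prop * prop;

fun implies (p, q) = Disj (Neg p, q);           (* assumed encoding *)

fun nnf (Atom a) = Atom a
  | nnf (Neg (Atom a)) = Neg (Atom a)
  | nnf (Neg (Neg p)) = nnf p
  | nnf (Neg (Conj (p, q))) = nnf (Disj (Neg p, Neg q))
  | nnf (Neg (Disj (p, q))) = nnf (Conj (Neg p, Neg q))
  | nnf (Conj (p, q)) = Conj (nnf p, nnf q)
  | nnf (Disj (p, q)) = Disj (nnf p, nnf q);

fun distrib (p, Conj (q, r)) = Conj (distrib (p, q), distrib (p, r))
  | distrib (Conj (q, r), p) = Conj (distrib (q, p), distrib (r, p))
  | distrib (p, q) = Disj (p, q);

fun cnf (Conj (p, q)) = Conj (cnf p, cnf q)     (* assumed definition *)
  | cnf (Disj (p, q)) = distrib (cnf p, cnf q)
  | cnf p = p;

exception NonCNF;                               (* hypothetical *)

(* Atoms occurring unnegated in a disjunction of literals. *)
fun positives (Atom a) = [a]
  | positives (Neg (Atom _)) = []
  | positives (Disj (p, q)) = positives p @ positives q
  | positives _ = raise NonCNF;

(* Atoms occurring negated in a disjunction of literals. *)
fun negatives (Atom _) = []
  | negatives (Neg (Atom a)) = [a]
  | negatives (Disj (p, q)) = negatives p @ negatives q
  | negatives _ = raise NonCNF;

(* List intersection (a stand-in for inter of Section 3.15). *)
fun inter ([], _) = []
  | inter (x :: xs, ys) =
      if List.exists (fn y => x = y) ys then x :: inter (xs, ys)
      else inter (xs, ys);

fun taut (Conj (p, q)) = taut p andalso taut q
  | taut p = not (null (inter (positives p, negatives p)));

val rich = Atom "rich" and landed = Atom "landed" and saintly = Atom "saintly";
val goal = implies (Conj (implies (landed, rich), Neg (Conj (saintly, rich))),
                    implies (landed, Neg saintly));
```

A disjunction of literals is a tautology exactly when some atom occurs in it both negated and unnegated, so the check reduces to testing that the intersection is non-empty.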
Exercise 4.35 Modify the definition of distrib so that no two cases overlap.