Prog 2021
Functional Programming
and the Structure of
Programming Languages
using OCaml
Lecture Notes
Version of April 5, 2022
Gert Smolka
Saarland University
Contents

Preface

1 Getting Started
1.1 Programs and Declarations
1.2 Functions and Let Expressions
1.3 Conditionals, Comparisons, and Booleans
1.4 Recursive Power Function
1.5 Integer Division
1.6 Mathematical Level versus Coding Level
1.7 More about Mathematical Functions
1.8 Linear Search and Higher-Order Functions
1.9 Partial Applications
1.10 Inversion of Strictly Increasing Functions
1.11 Tail Recursion
1.12 Tuples
1.13 Exceptions and Spurious Arguments
1.14 Summary
4 Lists
4.1 Nil and Cons
4.2 Basic List Functions
4.3 List Functions in OCaml
4.4 Fine Points About Lists
4.5 Membership and List Quantification
4.6 Head and Tail
4.7 Position Lookup
4.8 Option Types
4.9 Generalized Match Expressions
4.10 Sublists
4.11 Folding Lists
4.12 Insertion Sort
4.13 Generalized Insertion Sort
4.14 Lexical Order
4.15 Prime Factorization
4.16 Key-Value Maps
6 Parsing
6.1 Lexing
6.2 Recursive Descent Parsing
6.3 Infix with Associativity
6.4 Infix with Precedence and Juxtaposition
6.5 Postfix Parsing
6.6 Mini-OCaml
10 Arrays
10.1 Basic Array Operations
10.2 Conversion between Arrays and Lists
10.3 Binary Search
10.4 Reversal
10.5 Sorting
10.6 Equality for Arrays
10.7 Execution Order
10.8 Cells
10.9 Mutable Objects Mathematically
Preface
realization with stack machines with jumps (see the German textbook)
would be interesting, but this extra material may not fit into the time budget of a one-semester course.
At Saarland University, the course spans 15 weeks of teaching in the
winter semester. Each week comes with two lectures (90 minutes each),
an exercises assignment, office hours, tutorials, and a test (15 minutes).
There is a midterm exam in December and a final exam (offered twice)
after the lecture period has finished. The 2021 iteration of the course also
came with a take-home project over the holiday break asking the students
to write an interpreter for Mini-OCaml. The take-home project should
be considered an important part of the course, given that it requires
the students to write and debug a larger program (about 250 lines),
which is quite different from the small programs (up to 10 lines) the
weekly assignments ask for.
1 Getting Started
has checked and executed a program, the user can extend it with further
declarations. This way one can write the declarations of a program one
by one.
At this point you want to start working with the interpreter. You
will learn the exact rules of the language through experiments with the
interpreter guided by the examples and explanations given in this text.
Here is a program redeclaring the identifier a:
let a = 2 * 3 + 2
let b = 5 * a
let a = 5
let c = a + b
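Evaluating these declarations in order gives the following values (a sketch; the asserts just record what the interpreter computes):

```ocaml
let a = 2 * 3 + 2        (* a = 8: '*' binds before '+' *)
let b = 5 * a            (* b = 40, computed with a = 8 *)
let a = 5                (* redeclares a; b keeps its value *)
let c = a + b            (* c = 5 + 40 = 45 *)

let () = assert (a = 5 && b = 40 && c = 45)
```

Note that the redeclaration of a does not change b: b was computed with the old value of a and keeps it.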
declares a function
let d in e
The declaration
let abs x = if x < 0 then -x else x
There is a type bool having two values true and false called booleans.
The comparison operator “≤” used for max (written “<=” in OCaml) has
the functional type int → int → bool, saying that an expression
e1 ≤ e2 where both subexpressions e1 and e2 have type int has type bool.
The declaration
let test (x : int) y z = if x <= y then y <= z else false
declares a function test : int → int → int → bool testing whether its
three arguments appear in order. The type specification given for
the first argument x is needed so that test receives the correct type.
Alternatively, type specifications could be given for one or all of the other
arguments.
Exercise 1.3.1 Declare minimum functions analogous to the maximum
functions declared above. For each declared function give its type (before
you check it with the interpreter). Also, write the declarations with and
without redundant parentheses to understand the nesting.
Exercise 1.3.2 Declare functions int → int → bool providing the com-
parisons x = y, x ≠ y, x < y, x ≤ y, x > y, and x ≥ y. Do this by
just using conditionals and comparisons x ≤ y. Then realize x ≤ y with
x ≤ 0 and subtraction.
x⁰ = 1
xⁿ⁺¹ = x · xⁿ

2³ = 2 · 2²           2nd equation
   = 2 · 2 · 2¹       2nd equation
   = 2 · 2 · 2 · 2⁰   2nd equation
   = 2 · 2 · 2 · 1    1st equation
   = 8
The trick is that the 2nd equation reduces larger powers to smaller powers,
so that after repeated application of the 2nd equation the power x⁰
appears, which can be computed with the 1st equation. What we see
here is a typical example of a recursive computation.
Recursive computations can be captured with recursive functions.
To arrive at a function computing powers, we merge the two equations
for powers into a single equation using a conditional:
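Written out in OCaml, the merged declaration could read as follows (the rendering is mine, following the two equations for powers):

```ocaml
let rec pow x n =
  if n < 1 then 1          (* 1st equation: x^0 = 1 *)
  else x * pow x (n - 1)   (* 2nd equation: x^(n+1) = x * x^n *)

let () = assert (pow 2 3 = 8)
```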
The identifier pow receives the type pow : int → int → int. We say
that the function pow applies itself. Recursive function applications are
admitted if the declaration of the function uses the keyword rec.
We have demonstrated an important point about programming with the
example of the power function: Recursive functions are designed at a
mathematical level using equations. Once we have the right equations,
we can implement the function in any programming language.
x = k·y+r
and
r<y
the shelf is the quotient k and the length of the space remaining is the
remainder r = x − k · y < y. For instance, if the shelf has length 11
and each box has length 3, we can place at most 3 boxes into the shelf,
with 2 units of length remaining.⁴
Given that x and y uniquely determine k and r, we are justified in
using the notations x/y and x % y for k and r. By definition of k and r,
we have
x = (x/y) · y + x % y
and
x%y < y
for all x ≥ 0 and y > 0.
From our explanations it is clear that we can compute x/y and x % y
given x and y. In fact, the resulting operations x/y and x % y are
essential for programming and are realized efficiently for machine integers
on computers. We refer to the operations as division and modulo, or
just as “div” and “mod”. Accordingly, we read the applications x/y and
x % y as “x div y” and “x mod y”. OCaml provides both operations as
primitive operations using the notations x/y and x mod y.
Digit sum
With div and mod we can decompose the decimal representation of
numbers. For instance, 367 % 10 = 7 and 367/10 = 36. More generally,
x % 10 yields the last digit of the decimal representation of x, and x/10
cuts off the last digit of the decimal representation of x.
Knowing these facts, we can declare a recursive function computing
the digit sum of a number:
let rec digit_sum x =
if x < 10 then x
else digit_sum (x / 10) + (x mod 10)
For instance, we have digit_sum 367 = 16. We note that digit_sum
terminates since the argument gets smaller upon recursion.
Exercise 1.5.1 (First digit) Declare a function that yields the first
digit of the decimal representation of a number. For instance, the first
digit of 367 is 3.
Digit reversal
We now write a function rev that given a number computes the number
represented by the reversed digital representation of the number. For
instance, we want rev 76 = 67, rev 67 = 76, and rev 7600 = 67. To
write rev, we use an important algorithmic idea. The trick is to have an
additional accumulator argument that is initially 0 and that collects
the digits we cut off at the right of the main argument. For instance,
we want the trace
We refer to rev′ as the worker function for rev and to the argument a
of rev′ as the accumulator argument of rev′. We note that rev′ terminates
since the first argument gets smaller upon recursion.
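A sketch of rev and its worker rev′ (written rev' in OCaml; the concrete declarations are my rendering of the description above):

```ocaml
(* the accumulator a collects the digits cut off at the right of x *)
let rec rev' x a =
  if x = 0 then a
  else rev' (x / 10) (10 * a + x mod 10)

let rev x = rev' x 0

let () = assert (rev 76 = 67)
let () = assert (rev 7600 = 67)   (* leading zeros of the reversal disappear *)
```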
Greatest common divisors
Recall the notion of greatest common divisors. For instance, the greatest
common divisor of 34 and 85 is the number 17. In general, two numbers
x, y ≥ 0 such that x + y > 0 always have a unique greatest common
divisor. We assume the following rules for greatest common divisors
(gcds for short):⁵
1. The gcd of x and 0 is x.
2. If y > 0, the gcd of x and y is the gcd of y and x % y.
The two rules suffice to declare a function gcd : int → int → int com-
puting the gcd of two numbers x, y ≥ 0 such that x + y > 0:
let rec gcd x y =
if y < 1 then x
else gcd y (x mod y)
⁵ We will prove the correctness of the rules in a later chapter.
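A quick check of the declaration, tracing the two rules (the code is repeated here so the check is self-contained):

```ocaml
let rec gcd x y =
  if y < 1 then x
  else gcd y (x mod y)

(* gcd 34 85 = gcd 85 34 = gcd 34 17 = gcd 17 0 = 17 *)
let () = assert (gcd 34 85 = 17)
```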
We remark that both functions terminate for x ≥ 0 and y > 0 since the
first argument gets smaller upon recursion.
We remark that the functions rev′, gcd, and my_mod employ a special
form of recursion known as tail recursion. We will discuss tail recursion
in §1.11.
N : 0, 1, 2, 3, . . . natural numbers
N+ : 1, 2, 3, 4, . . . positive integers
Z : . . . , −2, −1, 0, 1, 2, . . . integers
B : false, true booleans
equations the function should satisfy. The goal here is to come up with
a collection of equations that is sufficient for computing the function.
For instance, when we design a power function, we may start with
the mathematical type
pow : Z → N → Z
pow x n = xⁿ
Together, the type and the equation specify the function we want to
define. Next we need equations that can serve as defining equations
for pow. The specifying equation is not good enough since we assume,
for the purpose of the example, that the programming language we want
to code in doesn’t have a power operator. We now recall that powers xn
satisfy the equations
x⁰ = 1
xⁿ⁺¹ = x · xⁿ
In §1.4 we have already argued that rewriting with the two equations
suffices to compute all powers we can express with the type given for pow.
Next we adapt the equations to the function pow we are designing:
pow x 0 = 1
pow x n = x · pow x (n − 1) if n > 0
pow : Z → N → Z
pow x 0 := 1
pow x n := x · pow x (n − 1) if n > 0
Note that we write defining equations with the symbol “:=” to mark
them as defining.
We observe that the defining equations for pow are terminating.
The termination argument is straightforward: Each recursion step
We use the opportunity and give mathematical definitions for some func-
tions we already discussed. A mathematical function definition consists
of a type and a system of defining equations.
Remainder
% : N → N+ → N
x % y := x if x < y
x % y := (x − y) % y if x ≥ y
D : N → N
D(x) := x if x < 10
D(x) := D(x/10) + (x % 10) if x ≥ 10
Digit Reversal
R : N → N → N
R 0 a := a
R x a := R (x/10) (10 · a + (x % 10)) if x > 0
f : N → N
f (n) := f (n) if n < 10
f (n) := n if n ≥ 10
Graph of a function
Abstractly, we may see a function f : X → Y as the set of all argument-
result pairs (x, f (x)) where x is taken from the argument type X. We
speak of the graph of a function. The graph of a function completely
forgets about the definition of the function.
When we speak of a mathematical function in this text, we always
include the definition of the function with a type and a system of defining
equations. This information is needed so that we can compute with the
function.
In noncomputational mathematics, one usually means by a function
just a set of argument-result pairs.
GCD function
It is interesting to look at the mathematical version of the gcd function
from §1.5:
G : N → N → N
G x 0 := x
G x y := G y (x % y) if y > 0
⁶ The Greek letter “λ” is pronounced “lambda”. Sometimes lambda expressions λx.e
are written with the more suggestive notation x ↦ e.
Note that lambda expressions with two argument variables are nota-
tion for nested lambda expressions with single arguments. We can also
describe the function test with a nested lambda expression:
e1 e2 e3 = (e1 e2) e3
t1 → t2 → t3 = t1 → (t2 → t3)
Note that applications group to the left and function types group to the
right.
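The grouping rules can be checked with the function test from §1.3; the partial application below is my example:

```ocaml
let test (x : int) y z = if x <= y then y <= z else false

(* application groups to the left: test 3 5 7 means ((test 3) 5) 7 *)
let () = assert (test 3 5 7 = ((test 3) 5) 7)

(* a partial application fixes the first argument *)
let from3 = test 3        (* from3 : int -> int -> bool *)
let () = assert (from3 5 7)
```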
We have considered equations between functions in the discussion of
partial applications of test. We consider two functions as equal if they
agree on all arguments. So functions with very different definitions may
be equal. In fact, two functions are equal if and only if they have the
same graph. We remark that there is no algorithm deciding equality of
functions in general.
Exercise 1.9.1
a) Write λxyk. (k + 1) · y > x as a nested lambda expression.
b) Write test 11 3 10 as a nested application.
c) Write int → int → int → bool as a nested function type.
Exercise 1.10.4 Let y > 0. Convince yourself that λx. x/y inverts the
strictly increasing function λn. n · y.
Tail recursive functions have the property that their execution can
be traced in a simple way. For instance, we have the tail recursive trace
For functions where the recursion is not tail recursive, traces look more
complicated, for instance
pow 2 3 = 2 · pow 2 2
= 2 · (2 · pow 2 1)
= 2 · (2 · (2 · pow 2 0))
= 2 · (2 · (2 · 1))
= 8
pow′ 2 3 1 = pow′ 2 2 2
           = pow′ 2 1 4
           = pow′ 2 0 8
           = 8
n! = n · (n − 1) · ⋯ · 2 · 1
For instance,
5! = 5 · 4 · 3 · 2 · 1 = 120
! : N → N
0! := 1
(n + 1)! := (n + 1) · n!
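The defining equations for the factorial function translate to OCaml in the by now familiar way (a sketch; the name fac is my choice):

```ocaml
let rec fac n =
  if n < 1 then 1          (* 0! = 1 *)
  else n * fac (n - 1)     (* (n+1)! = (n+1) * n! *)

let () = assert (fac 5 = 120)
```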
1.12 Tuples
Sometimes we want a function that returns more than one value. For
instance, the time for a marathon may be given with three numbers in
the format h : m : s, where h is the number of hours, m is the number
of minutes, and s is the number of seconds the runner needed. The time
of Eliud Kipchoge’s world record in 2018 in Berlin was 2 : 01 : 39. There
is the constraint that m < 60 and s < 60.
OCaml has tuples to represent collections of values as single values.
To represent the marathon time of Kipchoge in Berlin, we can use the
tuple
consisting of three integers. The product symbol “×” in the tuple type
is written as “*” in OCaml. The component types of a tuple are not
restricted to int, and there can be n ≥ 2 positions, where n is called
the length of the tuple. We may have tuples as follows:
Note that the last example nests tuples into tuples. We mention that
tuples of length 2 are called pairs, and that tuples of length 3 are called
triples.
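In OCaml, the marathon time and a few further tuples could be written as follows (the concrete examples are mine):

```ocaml
let time = (2, 1, 39)             (* int * int * int, a triple *)
let pair = (1, true)              (* int * bool, a pair *)
let nested = (1, (true, (2, 3)))  (* tuples nest into tuples *)

(* a tuple pattern takes a tuple apart *)
let (h, m, s) = time
let () = assert (h = 2 && m = 1 && s = 39)
```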
Exercise 1.12.1
a) Give a tuple of length 5 where the components are the values 2 and 3.
b) Give a tuple of type int × (int × (bool × bool)).
c) Give a pair whose first component is a pair and whose second com-
ponent is a triple.
rem : N → N+ → N
rem x y := x if x < y
rem x y := rem (x − y) y if x ≥ y
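Translated to OCaml, the equations could be realized as follows (a sketch; the excerpt does not show the original declaration the wrapper below refers to):

```ocaml
let rec rem x y =
  if x < y then x          (* rem x y := x            if x < y *)
  else rem (x - y) y       (* rem x y := rem (x-y) y  if x >= y *)

let () = assert (rem 11 3 = 2)
```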
receiving the type int → int → int. In OCaml we now have the spurious
situation that rem x 0 diverges for all x ≥ 0. There are other spurious
situations whose analysis is tedious since machine arithmetic needs to
be taken into account. Using the wrapper function
let rem_checked x y =
if x >= 0 && y > 0 then rem x y
else invalid_arg "rem_checked"
1.14 Summary
After working through this chapter you should be able to design and code
functions computing powers, integer quotients and remainders, digit
sums, digit reversals, and integer roots. You should understand that
the design of functions happens at a mathematical level using mathe-
matical types and equations. A given design can then be refined into
a program in a given programming language. In this text we are using
OCaml as the programming language, assuming that this is the first
programming language you see.8
⁷ OCaml says “invalid argument” for “spurious argument”.
⁸ Most readers will have done some programming in some programming language
before starting with this text. Readers of this group often face the difficulty that
they invest too much energy on mapping back the new things they see here to the
form of programming they already understand. Since functional programming is
rather different from other forms of programming, it is essential that you open
yourself to the new ideas presented here. Keep in mind that a good programmer
quickly adapts to new ways of thinking and to new programming languages.
You also saw a first higher-order function first, which can be used to
obtain integer quotients and integer roots with a general scheme known
as linear search. You will see many more examples of higher-order
functions expressing basic computational schemes.
The model of computation we have assumed in this chapter is rewrit-
ing with defining equations. In this model, recursion appears in a natural
way. You will have noticed that recursion is the feature where things get
really interesting. We also have discussed tail recursion, a restricted form
of recursion with nice properties we will study more carefully as we go
on. All recursive functions we have seen in this chapter have equivalent
tail recursive formulations (often using an accumulator argument).
Finally, we have seen tuples, which are compound values combining
several values into a single value.
2 Syntax and Semantics
(* ... *)
counting as white space. Comments are only visible at the lexical level.
Comments are useful for remarks the programmer wants to write into a
program. Here is a text where the words identified by the lexical syntax
are marked:
(* Linear Search *)
let rec first f k =
if f k then k (* base case *)
else first f ( k + 1 ) (* recursive case *)
Note that the characters of the three comments in the text count as
white space and don’t contribute to the words.
Consult the OCaml manual if you want to know more about the
lexical syntax of OCaml.
+ - / mod     + − / %
*             · or ×
= <>          = ≠
< <= > >=     < ≤ > ≥
->            →
The keyword “*” is special in that it translates into the symbol × if
used to form tuple types, and into the symbol · if used as multiplication
operator.
2.2.1 Types
We have seen the base types int, bool, and string. Types can be
combined into function types and tuple types using the symbols →
and ×. As example we consider the nested function type
int → (int → bool)
[in the original, this type is drawn as a tree with the outer “→” at the root]
making explicit the nesting of the outer function type. The nesting can
also be indicated with redundant parentheses in the lexical representation:
Redundant parentheses are always possible. If you like, you can write
the base type int as ((int)).
The binary structure of function types explains functions with several
arguments as functions taking single arguments and returning functions.
Partial applications are a necessary feature of the binary structure of
function types.
Next we look at a nested tuple type
whose result type is a tuple type. The tree representation of this type
has “→” at the root and a “×” node for the result type [tree diagram
not reproduced].
The grammar defines the syntactic classes ⟨type⟩ and ⟨base type⟩ by
means of grammar rules, where each rule offers three alternatives
separated by a vertical bar “|”. Function types and tuple types are com-
pound types coming with constituents that are types again. A func-
tion type ⟨type⟩ → ⟨type⟩, for instance, has two constituents, the argu-
ment type and the result type. A tuple type ⟨type⟩₁ × · · · × ⟨type⟩ₙ
has n ≥ 2 constituents called component types. Tuple types are also
known as product types.
Grammars like the above grammar for types are often called BNFs.
Google BNF (Backus-Naur form) to learn more about their origin.
Exercise 2.2.1 Draw syntax trees for the following phrases in lexical
representation.
a) int × int → bool
b) int × (int × int) × int → bool
2.2.2 Expressions
Following the ideas we have seen for types, we now describe the phrasal
syntax of a class of expressions with the grammar shown in Figure 2.1.
To keep things manageable, we only consider a small sublanguage of
OCaml. In particular, we do not consider type specifications in lambda
expressions and let expressions.
Note that there are binary operator applications (e.g., e1 + e2 )
and unary operator applications (e.g., −e and +e).
let f x y = x + y in f 5 2
[syntax tree of the let expression: a “let” node whose subtrees are the
pattern f x y, the operator application x + y, and the application f 5 2]
Note that the let expression shown contains patterns, a binary operator
application, and two function applications.
One thing we need to explain about the syntactic structure of expres-
sions in addition to what is explained by the grammar is how parentheses
can be omitted in lexical representations. For instance, the expression
[syntax tree of 3 − 2 < 2 · x + y, with “<” at the root, “−” above 3
and 2, and “+” above “·” (over 2 and x) and y]
⟨expression⟩ ::=
| ⟨atomic expression⟩
| ⟨operator application⟩
| ⟨function application⟩
| ⟨conditional⟩
| ⟨tuple expression⟩
| ⟨lambda expression⟩
| ⟨let expression⟩
⟨pattern⟩ ::=
| ⟨identifier⟩
| (⟨pattern⟩₁, . . . , ⟨pattern⟩ₙ)   (n ≥ 2)
(3 − 2) < ((2 · x) + y)
making explicit all nestings through parentheses, but can also be
described with the sequence
3−2<2·x+y
e1 e2 e3   ⇝   (e1 e2) e3
The rule says that function applications group to the left, which ensures
that arguments are taken from left to right. The lexical representation
of the let expression shown above already made use of this rule.
The parentheses rules for binary operators distinguish 3 levels:
= ≠ < ≤ > ≥
+ −
· / %
The operators at the lowest level take their arguments first, followed by
the operators at the mid-level, followed by the comparison operators at
the top level. This rule explains why we have
If we have several operators at the same level, they take their arguments
from left to right. For instance,
e1 + e2 + e3 − e4 − e5   ⇝   (((e1 + e2) + e3) − e4) − e5
The operators “+” and “−” can also be used as unary operators.
When used as unary operators, “+” and “−” take their arguments before
all binary operators. For instance,
1 + + + 5 − − + −7   ⇝   (1 + (+(+5))) − (−(+(−7)))
Note that the grammar in Figure 2.1 allows for free nesting of ex-
pressions. If you like, you can nest a parenthesized conditional into a
function application, or a let expression into the first constituent of a
conditional.
λx.e1 e2   ⇝   λx.(e1 e2)
λx.e1 o e2   ⇝   λx.(e1 o e2)
a) x + 3 · f (x) − 4
b) 1 + (2 + (3 + 4))
c) 1 + 2 + 3 + 4
d) 1 + (2 + 3) + 4
e) 1 + 2 · x − y · 3 + 4
f) (1 + 2) · x − (y · 3) + 4
g) if x < 3 then 3 else p(3)
h) let x = 2 + y in x − y
i) let p x n = if n > 0 then x · p x (n − 1) else 1 in p 2 10
Once we have the syntactic class ⟨declaration⟩, we could model let
expressions as trees with only two subtrees
What we see here is that there are many design choices for the grammar
of a language, even after the lexical representation of phrases is fixed.
Once the grammar is fixed, the structure of syntax trees is fixed, up to
the exact choices of labels.
[two derivations of the judgment 3 <̇ 6, built from the initial judgments
3 <̇ 4, 4 <̇ 5, and 5 <̇ 6; the tree layout is not reproduced]
Every line in the derivations represents the application of one of the
two inference rules. Note that the initial judgments of the derivation
are justified by the first rule, and that the non-initial judgments of the
derivation are justified by the second rule.
In general, an inference rule has the format
P1 ··· Pn
C
where the conclusion C is a judgment and the premises P1 , . . . , Pn
are either judgments or side conditions. A rule says that a derivation
of the conclusion judgment can be obtained from the derivations of the
premise judgments provided the side conditions appearing as premises
are satisfied. The inference rules of our example system don’t have side
conditions.
Returning to our example derivation system, it is clear that whenever
a judgment x <̇ y is derivable, the comparison x < y is true. This
follows from the fact that both inference rules express valid properties
of comparisons x < y. Moreover, given two numbers x and y such that x < y,
we can always construct a derivation of the judgment x <̇ y following a
recursive algorithm:
1. If y = x + 1, the first rule yields a derivation of x <̇ y.
2. Otherwise, obtain a derivation of x <̇ y − 1 by recursion. Moreover,
obtain a derivation of y − 1 <̇ y by the first rule. Now obtain a
derivation of x <̇ y using the second rule.
The example derivation system considered here was chosen for the
purpose of explanation. In practice, derivation systems are used for
describing computational relations that cannot be described otherwise.
Typing and evaluation of expressions are examples for such relations,
and we are going to use derivation systems for this purpose.
Exercise 2.3.1 Give all derivations for 5 <̇ 9.
Exercise 2.3.2 Give two inference rules for judgments x <̇ y such that
your rules derive the same judgments the rules given above derive, but
have the additional property that no judgment has more than one
derivation.
Exercise 2.3.3 Give two inference rules for judgments x <̇ y such that
a judgment x <̇ y is derivable if and only if x + k · 5 = y for some k ≥ 1.
e ::= c | x | e1 o e2 | e1 e2
| if e1 then e2 else e3
| (e1 , . . . , en ) n≥2
| πn e n≥1
| let x = e1 in e2
| λx.e
| let rec f x = e1 in e2
E ⊢ e : t
Note that the subderivations are established by the rules for variables
and constants.
The rules for lambda expressions and let expressions update the
existing environment E with a binding x : t. If E doesn’t contain a
binding for x, the binding x : t is added. Otherwise, the existing binding
for x is replaced with x : t. For instance,
[x : t1 , y : t2 ], z : t3 = [x : t1 , y : t2 , z : t3 ]
[x : t1 , y : t2 ], x : t3 = [y : t2 , x : t3 ]
c : t                      (x : t) ∈ E
E ⊢ c : t                  E ⊢ x : t

o : t1 → t2 → t    E ⊢ e1 : t1    E ⊢ e2 : t2
E ⊢ e1 o e2 : t

E ⊢ e1 : t2 → t    E ⊢ e2 : t2
E ⊢ e1 e2 : t

E ⊢ e1 : bool    E ⊢ e2 : t    E ⊢ e3 : t
E ⊢ if e1 then e2 else e3 : t

E ⊢ e1 : t1    · · ·    E ⊢ en : tn
E ⊢ (e1 , . . . , en ) : t1 × · · · × tn

E ⊢ e : t1 × · · · × tn    1 ≤ i ≤ n
E ⊢ πi e : ti

E ⊢ e1 : t1    E, x : t1 ⊢ e2 : t
E ⊢ let x = e1 in e2 : t

E, x : t1 ⊢ e : t2
E ⊢ λx.e : t1 → t2

E, f : t1 → t2 , x : t1 ⊢ e1 : t2    E, f : t1 → t2 ⊢ e2 : t
E ⊢ let rec f x = e1 in e2 : t
λx.e   ⇝   λx : t. e
let rec f x = e1 in e2   ⇝   let rec f (x : t1) : t2 = e1 in e2
The derivation uses the typing rules for constants, tuple expressions,
and projections.
Here is an example of a type derivation where t1 and t2 are place-
holders for arbitrary types:
x : t2 ⊢ x : t2
x : t1 ⊢ λx.x : t2 → t2
[] ⊢ λx.λx.x : t1 → t2 → t2
The derivation uses the typing rules for variables and lambda expres-
sions. Note how elegantly the typing rules handle the shadowing of the
first argument variable in λx.λx.x.
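The shadowing can also be observed by running the expression (my example):

```ocaml
(* the inner x shadows the outer one, so the result is the second argument *)
let f = fun x -> fun x -> x

let () = assert (f 1 2 = 2)
let () = assert (f true 2 = 2)   (* f : 'a -> 'b -> 'b, arguments independent *)
```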
a) λx.x
b) λxy.x
c) λxx.x
d) λf g x. f x (g x)
e) let rec f x = f x in f
f) λxyz. if x then y else z
Exercise 2.5.2 Extend the grammar for abstract expressions with rules
for declarations and programs and design corresponding typing rules.
Hint: Use judgments E ⊢ D ⇒ E′ and E ⊢ P ⇒ E′ and model programs
with the grammar rule P ::= ∅ | D P (∅ is the empty program).
a) f x
b) f x (g x) y
c) λx : t1 . λy : t2 . x
d) λx : t1 . λx : t2 . x
e) λx : t. f x (g x) y
f) if x then y else z
g) if x then y else x
h) if x then y + 0 else z
Exercise 2.6.4 Test your understanding by giving typing rules for
judgments E ⊢ e : t and expressions e ::= x | λx.e | e e without looking
at Figure 2.2.
let f = λx.f x in f
let rec f x = f x in f
let f = λx.f x in f
the first occurrences of f and x are binding, and all other occurrences
of f and x are using.
The typing rules for expressions observe the binding rules for variables.
If E ⊢ e : t is derivable, we know that at most the variables that
are assigned a type by the environment E are free in e. Hence we know
that e is closed if [] ⊢ e : t is derivable for some type t.
Let the letter X denote finite sets of variables. We say that an
expression e is closed in X if every variable that is free in e is an element
of X. It is helpful to consider a derivation system for judgments X ⊢ e
such that X ⊢ e is derivable if and only if e is closed in X. Figure 2.3
gives such a derivation system. We refer to the rules of the system as
binding rules. The notation X, x stands for the set obtained from X
by adding x (i.e., X ∪ {x}). Note that the binding rules can be seen as
simplifications of the typing rules, and that the algorithmic reading of
the rules is obvious (check whether e is closed in X). Only the rule for
variables has a side condition.
x ∈ X                    X ⊢ e1    X ⊢ e2       X ⊢ e1    X ⊢ e2
X ⊢ c       X ⊢ x        X ⊢ e1 o e2            X ⊢ e1 e2

X ⊢ e1    X ⊢ e2    X ⊢ e3        X ⊢ e1    · · ·    X ⊢ en        X ⊢ e
X ⊢ if e1 then e2 else e3         X ⊢ (e1 , . . . , en )           X ⊢ πi e

X ⊢ e1    X, x ⊢ e2         X, x ⊢ e       X, f, x ⊢ e1    X, f ⊢ e2
X ⊢ let x = e1 in e2        X ⊢ λx.e       X ⊢ let rec f x = e1 in e2
tion λx.e. The name closure is motivated by the fact that a lambda
expression becomes a self-contained function once we add an environ-
ment providing values for the free variables. Closures don’t appear in the
rewriting model since there the values for free variables are substituted
in as part of rewriting.
Closures for recursive functions take the form
(f, x, e, V )
V ⊢ e ▷ v
(x ▷ v) ∈ V
V ⊢ c ▷ c        V ⊢ x ▷ v

V ⊢ e1 ▷ v1    V ⊢ e2 ▷ v2    v1 o v2 = v
V ⊢ e1 o e2 ▷ v

V ⊢ e1 ▷ true    V ⊢ e2 ▷ v          V ⊢ e1 ▷ false    V ⊢ e3 ▷ v
V ⊢ if e1 then e2 else e3 ▷ v        V ⊢ if e1 then e2 else e3 ▷ v

V ⊢ e1 ▷ v1    · · ·    V ⊢ en ▷ vn
V ⊢ (e1 , . . . , en ) ▷ (v1 , . . . , vn )

V ⊢ e ▷ (v1 , . . . , vn )    1 ≤ i ≤ n
V ⊢ πi e ▷ vi

V ⊢ e1 ▷ v1    V, x ▷ v1 ⊢ e2 ▷ v
V ⊢ let x = e1 in e2 ▷ v

V ⊢ λx.e ▷ (x, e, V )

V ⊢ e1 ▷ (x, e, V′)    V ⊢ e2 ▷ v2    V′, x ▷ v2 ⊢ e ▷ v
V ⊢ e1 e2 ▷ v

V, f ▷ (f, x, e1 , V ) ⊢ e2 ▷ v
V ⊢ let rec f x = e1 in e2 ▷ v

V ⊢ e1 ▷ v1    V ⊢ e2 ▷ v2    v1 = (f, x, e, V′)    V′, f ▷ v1 , x ▷ v2 ⊢ e ▷ v
V ⊢ e1 e2 ▷ v
Note that the evaluation rules for function applications can yield a
value for an application e1 e2 only if both subexpressions e1 and e2 are
evaluable. This is in contrast to conditionals if e1 then e2 else e3 ,
which may be evaluable although one of the constituents e2 and e3 is
not evaluable. This relates to the problem underlying Exercise 1.13.1.
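The evaluation rules can be turned directly into an evaluator. Here is a sketch in OCaml for a fragment with constants, variables, conditionals, lambda expressions, and applications; the type and constructor names are my choice, and environments are realized as association lists as in the rules:

```ocaml
type exp =
  | Con of bool
  | Var of string
  | If of exp * exp * exp
  | Lam of string * exp
  | App of exp * exp

type value =
  | Bool of bool
  | Closure of string * exp * env   (* a lambda paired with its environment *)
and env = (string * value) list

let rec eval (v : env) (e : exp) : value =
  match e with
  | Con b -> Bool b
  | Var x -> List.assoc x v                    (* rule (x > v') in V *)
  | If (e1, e2, e3) ->
      (match eval v e1 with
       | Bool true -> eval v e2
       | Bool false -> eval v e3
       | _ -> failwith "if: condition not a boolean")
  | Lam (x, e) -> Closure (x, e, v)            (* lambdas evaluate to closures *)
  | App (e1, e2) ->
      (match eval v e1 with
       | Closure (x, e, v') -> eval ((x, eval v e2) :: v') e
       | _ -> failwith "application of a non-function")

(* (λx. if x then true else false) false evaluates to false *)
let () =
  assert (eval [] (App (Lam ("x", If (Var "x", Con true, Con false)), Con false))
          = Bool false)
```

Note how the evaluator fails on an application of a non-function exactly where the rules have no matching premise; this is the kind of erroneous situation the next paragraph discusses.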
Type Safety
The evaluation rules do not mention types at all. Thus we can execute
ill-typed expressions. This may lead to erroneous situations that cannot
appear with well-typed expressions, for instance a function application
e1 e2 where e1 evaluates to a number, an addition of a closure and
a string, or the application of a projection to a value that is not a tuple.
In contrast, if we evaluate a well-typed expression, such erroneous
situations are impossible. This fundamental property is known
as type safety. Type safety provides for more efficient execution of
programming languages, and also makes it easier for the programmer to
write correct programs. OCaml is designed as a type-safe language.
Exercise 2.8.1 For the following expressions e find values v such that
[] ` e . v is derivable. Draw the derivations.
e1 || e2   (lazy or)   ⇝   if e1 then true else e2
 lazy or                  ||                right
 lazy and                 &&                right
 comparisons              = ≠ < ≤ > ≥      none
 arithmetic operations    + −               left
                          · / %             left
 unary minus and plus     − +               right
 function application     e1 e2             left
the transport variable a must be fresh (i.e., does not occur otherwise).
The freshness condition is needed so that a does not shadow an outer
variable used in e1 . Moreover, the freshness condition disallows a = x.
Syntactically, the lazy boolean operations && (and) and || (or) act
as operators that take their arguments after comparisons. Moreover,
|| takes its arguments after && . Both lazy boolean operators group to
the right. Figure 2.6 gives the precedence (in taking arguments) and
associativity (i.e., grouping direction) of all operators we have seen
so far. In OCaml, operators are realized following the table, except
that comparisons group to the left.
The grouping rules for the type-forming operators are as follows:
function type former → right
tuple type former × none
Abstract expressions can express a form of programs that comes close
to programs as sequences of declarations. A program now takes the form
D1 · · · Dn in e
which we read as the nested form
D1 in · · · in Dn in e
Exercise 2.9.2 (Tuple patterns) The translation rule for tuple pat-
terns in Figure 2.5 is just given for pairs so that the translation fits into
one line. Give the translation rule for triples. Try to give the translation
rule for general patterns using the dots notation (· · · ).
e ::= c | x | e1 o e2 | e1 e2 | λx.e | · · ·
t ::= int | t1 → t2 | · · ·
make use of types. The static semantics defines a type discipline such
that the evaluation of well-typed expressions doesn’t lead to obviously
erroneous situations. The static semantics determines well-typed ex-
pressions and their types, and the dynamic semantics determines evalu-
able expressions and their values. Both semantics employ environments
mapping free variables to either types or values. The dynamic semantics
represents functional values as so-called closures combining the syntactic
description of a function with a value environment providing values for
the free variables.
The semantic systems of a language are described as so-called deriva-
tion systems consisting of inference rules for syntactic objects called
judgments. We speak of typing judgments and typing rules for the static
semantics, and of evaluation judgments and evaluation rules for the dy-
namic semantics. Typing rules and evaluation rules have a declarative
reading (concerning the naive construction of derivations) and an algo-
rithmic reading. The algorithmic reading of the inference rules gives us
abstract algorithms that given an expression and an environment com-
pute either the type or the value of the expression provided it exists.
The type checking algorithm given by the typing rules needs to guess
the types of the argument variables of lambda expressions. The guessing
can be eliminated by using lambda expressions carrying the type of the
argument variable:
e ::= · · · | λx : t.e
A good way to think about type checking is to see it as abstract
evaluation of an expression. While evaluation computes values,
type checking computes types. The connection is that an expression
of type t evaluates to a value of type t (so-called type safety).
As a consequence, well-typed expressions have the property that their
evaluation cannot lead to type conflicts (e.g., addition of a function and
a number, or application of a number to a number).
The type checking algorithm given by the typing rules always terminates.
This is in contrast to the evaluation algorithm given by the evaluation
rules, which may diverge on expressions not identified as evaluable
by the evaluation rules. Note however that the evaluation algorithm will
always terminate on expressions identified as evaluable by the evaluation
rules.
3 Polymorphic Functions and Iteration
So far, some functions don’t have unique types, as for instance the pro-
jection functions for pairs. The problem is solved by giving such func-
tions schematic types called polymorphic types.
Using polymorphic types, we can declare a tail-recursive higher-order
function iterating a function on a given value. It turns out that many
functions can be obtained as instances of iteration. Our key example
will be a function computing the nth prime number.
What type does fst have? Clearly, int × int → int and bool × int → bool
are both types admissible for fst. In fact, every type t1 × t2 → t1
is admissible for fst. Thus there are infinitely many types admissible
for fst.
OCaml solves the situation by typing fst with the polymorphic type
∀αβ. α × β → α
A polymorphic type is a type scheme whose quantified variables (α
and β in the example) can be instantiated with all types. The instances
of the polymorphic type above are all types t1 × t2 → t1 where the
types t1 and t2 can be freely chosen. When a polymorphically typed
identifier is used in an expression, it can be used with any instance of its
polymorphic type. Thus fst (1 , 2 ) and fst (true, 5 ) are both well-typed
expressions.
Here is a polymorphic swap function for pairs:
let swap (x,y) = (y,x)
3.2 Iteration
f^0(x) = x        f^(n+1)(x) = f^n(f x)
Given an iteration f n (x), we call f the step function and x the start
value of the iteration.
With iteration we can compute sums, products, and powers of non-
negative integers just using additions x + 1:
succ x := x + 1
add x n := succ^n (x)
mul x n := (add x)^n (0)
pow x n := (mul x)^n (1)
iter : ∀α. (α → α) → N → α → α
iter f 0 x := x
iter f (n + 1) x := iter f n (f x)
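In OCaml, iter and the iterated arithmetic operations can be sketched as follows (the treatment of nonpositive n as the base case is our choice):

```ocaml
(* Tail-recursive iteration: iter f n x computes f^n(x) *)
let rec iter f n x =
  if n < 1 then x else iter f (n - 1) (f x)

(* Arithmetic on nonnegative integers as instances of iteration *)
let add x n = iter (fun a -> a + 1) n x   (* succ^n (x)    *)
let mul x n = iter (add x) n 0            (* (add x)^n (0) *)
let pow x n = iter (mul x) n 1            (* (mul x)^n (1) *)
```

For instance, pow 2 10 iterates mul 2 ten times on 1.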
pred : N+ → N
pred (n + 1) := n
f (a, k) = (k, k + 1)
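The predecessor function can then be obtained by iterating the step function on pairs; a sketch (with iter repeated so the snippet is self-contained):

```ocaml
let rec iter f n x =
  if n < 1 then x else iter f (n - 1) (f x)

(* Iterate f (a, k) = (k, k + 1) starting from (0, 0):
   after n steps the first component is the predecessor of n *)
let pred n = fst (iter (fun (_, k) -> (k, k + 1)) n (0, 0))
```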
¹ The predecessor of an integer x is x − 1.
fib : N → N
fib(0) := 0
fib(1) := 1
fib (n + 2) := fib (n) + fib (n + 1)
Incidentally, this is the first recursive function we see where the recursion
is binary (two recursive applications) rather than linear (one recursive
application). Termination follows with the usual argument that each
recursion step decreases the argument.
If we look again at the rule generating the Fibonacci sequence, we
see that we can compute the sequence by starting with the pair (0, 1)
and iterating with the step function f (a, b) = (b, a + b). For instance,
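A sketch of a Fibonacci function obtained with iteration (iter repeated for self-containedness):

```ocaml
let rec iter f n x =
  if n < 1 then x else iter f (n - 1) (f x)

(* Iterate f (a, b) = (b, a + b) starting from (0, 1);
   after n steps the first component is the nth Fibonacci number *)
let fib n = fst (iter (fun (a, b) -> (b, a + b)) n (0, 1))
```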
0, 1, 1, 2, 4, 7, 13, . . .
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, · · ·
Note that next prime x yields the first prime greater than x. Also note
that the elegant declarations of next prime and nth prime are made
possible by the higher-order functions first and iter.
It remains to come up with a primality test (a test checking
whether an integer is a prime number). Here are four equivalent
characterizations of primality of x assuming that x, k, and n are natural
numbers:
1. x ≥ 2 ∧ ∀ k ≥ 2. ∀ n ≥ 2. x ≠ k · n
2. x ≥ 2 ∧ ∀ k, 2 ≤ k < x. x % k > 0
3. x ≥ 2 ∧ ∀ k > 1, k² ≤ x. x % k > 0
4. x ≥ 2 ∧ ∀ k, 1 < k ≤ √x. x % k > 0
Characterization (1) is close to the informal definition of prime numbers.
Characterizations (2), (3), and (4) are useful for our purposes since they
can be realized algorithmically. Starting from (1), we see that x − 1 can
be used as an upper bound for k and n. Thus we have to test x = k · n
only for finitely many k and n, which can be done algorithmically. The
next thing we see is that it suffices to have k because n can be kept
implicit using the remainder operation. Finally, we see that it suffices
to test for k such that k² ≤ x since we can assume n ≥ k. Thus we can
sharpen the upper bound from x − 1 to √x.
Here we choose the second characterization to declare a primality
test in OCaml and leave the realization of the computationally faster
fourth characterization as an exercise.
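A sketch of the resulting test, assuming a bounded universal quantifier forall as described next (the names are our choices):

```ocaml
(* forall m n f tests whether f k holds for all k with m <= k <= n *)
let rec forall m n f =
  m > n || (f m && forall (m + 1) n f)

(* Characterization (2): x >= 2 and no k with 2 <= k < x divides x *)
let prime x =
  x >= 2 && forall 2 (x - 1) (fun k -> x mod k > 0)
```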
To test the bounded universal quantification in the second charac-
terization, we declare a higher-order function
such that
Exercise 3.4.1 Declare a test exists : int → int → (int → bool) → bool
such that exists m n f = true ←→ ∃k, m ≤ k ≤ n. f k = true in two
ways:
a) Directly following the design of forall.
b) Using forall and boolean negation not : bool → bool.
Exercise 3.4.3 Explain why the following functions are primality tests:
a) λx. x ≥ 2 && first (λk. x % k = 0) 2 = x
b) λx. x ≥ 2 && (first (λk. k² ≥ x || x % k = 0) 2)² > x
Hint for (b): Let k be the number the application of first yields. Distin-
guish three cases: k² < x, k² = x, and k² > x.
 (x : T) ∈ E    t is an instance of T
 ─────────────────────────────────────
              E ⊢ x : t
 E ⊢ e1 : t1    E, x : T ⊢ e2 : t    T = ∀ ᾱ. t1    e1 = λ · · ·
 ────────────────────────────────────────────────────────────────
                  E ⊢ let x = e1 in e2 : t

where ᾱ are the type variables occurring in t1 but not in E.
let f = λx.x in f f
 E, f : ∀α. α → α ⊢ f : (α → α) → (α → α)      E, f : ∀α. α → α ⊢ f : α → α
 ────────────────────────────────────────────────────────────────────────────
                       E, f : ∀α. α → α ⊢ f f : α → α
 E, x : α ⊢ x : α              ···      ···
 ────────────────      ────────────────────────────────
 E ⊢ λx.x : α → α      E, f : ∀α. α → α ⊢ f f : α → α
 ──────────────────────────────────────────────────────
           E ⊢ let f = λx.x in f f : α → α
The derivation assumes that E does not contain the type variable α.
 E, f : t1 → t2 , x : t1 ⊢ e1 : t2    E, f : T ⊢ e2 : t    T = ∀ ᾱ. t1 → t2
 ────────────────────────────────────────────────────────────────────────────
                      E ⊢ let rec f x = e1 in e2 : t

where ᾱ are the type variables occurring in t1 → t2 but not in E.
Recall the predefined function invalid arg discussed in §1.13. Like every
function in OCaml, invalid arg must be accommodated with a type. It
turns out that the natural type for invalid arg is a polymorphic type:
With this type an application of invalid arg can be typed with whatever
type is required by the context of the application. Since evaluation of
the application raises an exception and doesn’t yield a value, the return
type of the function doesn’t matter for evaluation.
³ In principle, f can be typed polymorphically for both constituents e1 and e2 of a
recursive let expression, but this generalization of the polymorphic typing rule for
recursive let expressions ruins the type inference algorithm for polymorphic types.
3.7 Summary
4 Lists
[1, 2, 3] : L(Z)
[true, true, false] : L(B)
[(true, 1), (true, 2), (false, 3)] : L(B × Z)
[[1, 2], [3], []] : L(L(Z))
All lists are obtained from the empty list [] using the binary constructor
cons written as “::”:
[1] = 1 :: []
[1, 2] = 1 :: (2 :: [])
[1, 2, 3] = 1 :: (2 :: (3 :: []))
The empty list [] also counts as a constructor and is called nil. Given a
nonempty list x :: l (i.e., a list obtained with cons), we call x the head
and l the tail of the list.
It is important to see lists as trees. For instance, the list [1, 2, 3] may
be depicted as the tree
      ::
     /  \
    1    ::
        /  \
       2    ::
           /  \
          3    []
The tree representation shows how lists are obtained with the construc-
tors nil and cons. It is important to keep in mind that the bracket
notation [x1 , . . . , xn ] is just notation for a list obtained with n applica-
tions of cons from nil. Also keep in mind that every list is obtained with
either the constructor nil or the constructor cons.
Notationally, cons acts as an infix operator grouping to the right.
Thus we can omit the parentheses in 1 :: (2 :: (3 :: [])). Moreover, we
have [x1 , . . . , xn ] = x1 :: · · · :: xn :: [].
Given a list [x1 , . . . , xn ], we call the values x1 , . . . , xn the elements
or the members of the list.
Despite the fact that tuples and lists both represent sequences, tuple
types and list types are quite different:
• A tuple type t1 × · · · × tn admits only tuples of length n, but may
fix different types for different components.
• A list type L(t) admits lists [x1 , . . . , xn ] of any length but fixes a
single type t for all elements.
Exercise 4.1.1 Give the types of the following lists and tuples.
a) [1, 2, 3]
b) (1, 2, 3)
c) [(1, 2), (2, 3)]
d) ((1, 2), (2, 3))
e) [[1, 2], [2, 3]]
The fact that all lists are obtained with nil and cons facilitates the
definition of basic operations on lists. We start with the definition of a
polymorphic function
For instance, rev [1, 2, 3] = [3, 2, 1]. To define rev, we need defining
equations for nil and cons. The defining equation for nil is obvious,
since the reversal of the empty list is the empty list. For cons we use
the equation rev (x :: l) = rev (l) @ [x] which expresses a basic fact about
reversal. This brings us to the following definition of list reversal:
We have rev (l) = rev append l []. The following trace shows the defining
equations of rev append at work:
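The declarations elided above might look as follows (a sketch using OCaml's append operator @):

```ocaml
(* Naive reversal: quadratic because of the appends *)
let rec rev l =
  match l with
  | [] -> []
  | x :: l -> rev l @ [x]

(* Tail-recursive reversal with an accumulator:
   rev_append l1 l2 prepends the reversal of l1 to l2 *)
let rec rev_append l1 l2 =
  match l1 with
  | [] -> l2
  | x :: l1 -> rev_append l1 (x :: l2)
```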
The functions defined so far are all defined by list recursion.1 List
recursion means that there is no recursion for the empty list, and that
¹ List recursion is better known as structural recursion on lists.
for every nonempty list x :: l the recursion is on the tail l of the list.
List recursion always terminates since lists are obtained from nil with
finitely many applications of cons.
Another prominent list operation is
seq : Z → Z → L(Z)
seq m n := if m > n then [] else m :: seq (m + 1) n
For instance, seq −1 5 = [−1, 0, 1, 2, 3, 4, 5]. This time the recursion is not
on a list but on the number m. The recursion terminates since every
recursion step decreases n − m and recursion stops once n − m < 0.
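A sketch of seq in OCaml, following the mathematical definition verbatim:

```ocaml
(* seq m n yields the list [m; m+1; ...; n]; empty if m > n *)
let rec seq m n =
  if m > n then [] else m :: seq (m + 1) n
```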
• The operators “::” and “@” take their arguments before comparisons
and after arithmetic operations.
flatten [l1 , . . . , ln ] = l1 @ · · · @ ln @ []
For instance, we want flatten [[1, 2], [], [3], [4, 5]] = [1, 2, 3, 4, 5].
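A sketch of flatten by list recursion:

```ocaml
(* Concatenate a list of lists, appending from left to right *)
let rec flatten l =
  match l with
  | [] -> []
  | x :: l -> x @ flatten l
```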
such that
[] : ∀α. L(α)
(::) : ∀α. α → L(α) → L(α)
Note that mem tail-recurses on the list argument. Also recall the discus-
sion of OCaml’s polymorphic equality test in §3.6. We will write x ∈ l
to say that x is an element of the list l.
The structure of the membership test can be generalized with a poly-
morphic function that for a test and a list checks whether some element
of the list satisfies the test:
The expression
now gives us a test that checks whether a list of numbers contains the
number 5. More generally, we have
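The declarations elided above might be filled in along these lines (a sketch; the names are our choices):

```ocaml
(* exists p l tests whether some element of l satisfies the test p *)
let rec exists p l =
  match l with
  | [] -> false
  | x :: l -> p x || exists p l

(* A test for membership of the number 5 *)
let has5 = exists (fun x -> x = 5)
```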
Exercise 4.5.1 Declare mem and exists in OCaml. For mem consider
two possibilities, one with exists and one without helper function.
testing whether all elements of a list satisfy a given test. Consider two
possibilities, one without a helper function, and one with exists exploit-
ing the equivalence (∀x ∈ l. p(x)) ←→ (¬∃x ∈ l. ¬p(x)).
which tests whether all elements of the first list are elements of the
second list.
hd : ∀α. L(α) → α
tl : ∀α. L(α) → L(α)
let tl l =
match l with
| [] -> failwith "tl"
| _ :: l -> l
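A matching sketch of hd:

```ocaml
(* Yields the head of a nonempty list; raises an exception on [] *)
let hd l =
  match l with
  | [] -> failwith "hd"
  | x :: _ -> x
```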
Both functions raise exceptions when applied to the empty list. Note
the use of the underline symbol “_” for pattern variables that are not
used in a rule. Also note the use of the predefined function
failwith : ∀α. string → α
since we don’t have a value we can return for the empty list over α.
The positions of a list are counted from left to right starting with
the number 0. For instance, the list [5, 6, 5] has the positions 0, 1, 2;
moreover, the element at position 1 is 6, and the value 5 appears at the
positions 0 and 2. The empty list has no position. More generally, a list
of length n has the positions 0, . . . , n − 1.
We declare a tail-recursive lookup function
that given a list and a position returns the element at the position:
let rec nth l n =
match l with
| [] -> failwith "nth"
| x :: l -> if n < 1 then x else nth l (n-1)
raises an exception if the position argument is not valid for the given
list. There is the possibility to avoid the exception by changing the
result type of nth to an option type
that in addition to the values of α has an extra value None that can be
used to signal that the given position is not valid. In fact, OCaml comes
with a type family
such that Some injects the values of a type t into O(t) and None rep-
resents the extra value. We can declare a lookup function returning
options as follows:
let rec nth_opt l n =
match l with
| [] -> None
| x :: l -> if n < 1 then Some x else nth_opt l (n-1)
that agrees with nth opt but returns a list with at most one element.
OCaml provides pattern matching in more general form than the ba-
sic list matches we have seen so far. A good example for explaining
generalized match expressions is a function
testing equality of lists using an equality test for the base type given as
argument:
let rec eq (p: 'a -> 'a -> bool) l1 l2 =
match l1, l2 with
| [], [] -> true
| x::l1, y::l2 -> p x y && eq p l1 l2
| _, _ -> false
match l1 with
| [] ->
begin match l2 with
| [] -> true
| _ :: _ -> false
end
| x::l1 ->
begin match l2 with
| [] -> false
| y::l2 -> p x y && eq p l1 l2
end
The keywords begin and end provide a notational variant for a pair
( · · · ) of parentheses.
The notions of disjointness and exhaustiveness established for defin-
ing equations in §1.6 carry over to generalized match expressions. We
will only use exhaustive match expressions but occasionally use non-
disjoint match expressions where the order of the rules matters (e.g.,
catch-all rules). We remark that simple match expressions are always
disjoint and exhaustive.
We see generalized match expressions as derived forms that compile
into simple match expressions.
OCaml also has match expressions for tuples. For instance,
let fst a =
match a with
| (x, _) -> x
Exercise 4.9.1 Declare a function testing whether a list starts with the
numbers 1 and 2 just using simple match expressions for lists.
4.10 Sublists
We observe that the empty list [] has only itself as a sublist, and that
a sublist of a nonempty list x :: l is either a sublist of l, or a list x :: l0
where l0 is a sublist of l. Using this observation, it is straightforward to
define a function that yields a list of all sublists of a list:
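Following the observation, a sketch of such a function (the name sublists is our choice):

```ocaml
(* The sublists of x :: l are the sublists of l together with
   the lists x :: l' where l' is a sublist of l *)
let rec sublists l =
  match l with
  | [] -> [[]]
  | x :: l ->
    let s = sublists l in
    s @ List.map (fun l' -> x :: l') s
```

A list of length n has 2^n sublists, so the result doubles in length with every element.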
Note that isort inserts the elements of the input list reversing the order
in which they appear in the input list.
² More elegantly, we may say that the insertion function preserves sortedness.
This follows from the fact that OCaml accommodates comparisons with
the polymorphic type
∀α. α → α → B
it also uses for the equality test (§3.6). We will explain later how com-
parisons behave on tuples and lists. For boolean values, OCaml realizes
the order false < true. Thus we have
Insertion order
Sorting by insertion inserts the elements of the input list one by one
into the empty list. The order in which this is done does not matter for
the result. The function isort defined above inserts the elements of the
input list reversing the order of the input list. If we define isort as
such that (x, n) ∈ table l if and only if x occurs n > 0 times in l. For
instance, we want
Make sure table lists the count pairs for the elements of l in the order
the elements appear in l, as in the example above.
Rather than sorting lists using the predefined order ≤, we may sort lists
using an order given as argument:
Now the function gisort (≤) sorts as before in ascending order, while the
function gisort (≥) sorts in descending order:
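A sketch of gisort with an insertion function taking the order as an argument:

```ocaml
(* Insert x into a list sorted with respect to the order p *)
let rec insert p x l =
  match l with
  | [] -> [x]
  | y :: l' -> if p x y then x :: l else y :: insert p x l'

(* Generalized insertion sort taking the order as an argument *)
let rec gisort p l =
  match l with
  | [] -> []
  | x :: l -> insert p x (gisort p l)
```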
We now explain how we obtain an order for lists over t from an order
for the base type t following the principle used for ordering words in
dictionaries. We speak of a lexical ordering. Examples for the lexical
ordering of lists of integers are
[] < [−1] < [−1, −2] < [0] < [0, 0] < [0, 1] < [1]
The general principle behind the lexical ordering can be formulated with
two rules:
• [] < x :: l
• x1 :: l1 < x2 :: l2 if either x1 < x2 , or x1 = x2 and l1 < l2 .
Following the rules, we define a function that yields a test for the lexical
order ≤ of lists given a test for an order ≤ of the base type:
lex : ∀α. (α → α → B) → L(α) → L(α) → B
lex p [] l2 := true
lex p (x1 :: l1 ) [] := false
lex p (x1 :: l1 ) (x2 :: l2 ) := p x1 x2 &&
if p x2 x1 then lex p l1 l2 else true
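In OCaml, lex can be sketched with a generalized match on both lists:

```ocaml
(* Lexical order test for lists, given a test p for an order on the base type *)
let rec lex p l1 l2 =
  match l1, l2 with
  | [], _ -> true
  | _ :: _, [] -> false
  | x1 :: l1, x2 :: l2 ->
    p x1 x2 && (if p x2 x1 then lex p l1 l2 else true)
```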
Exercise 4.14.1 (Lexical order for pairs) The idea of lexical order
extends to pairs and to tuples in general.
a) Explain the lexical order of pairs of type t1 × t2 given orders for the
component types t1 and t2 .
b) Declare a function
lexP : ∀αβ. (α → α → B) → (β → β → B) → α × β → α × β → B
testing the lexical order of pairs. For instance, we want
lexP (≤) (≥) (1, 2) (1, 3) = false
and lexP (≤) (≥) (0, 2) (1, 3) = true.
60 = 2 · 2 · 3 · 5
147 = 3 · 7 · 7
735 = 3 · 5 · 7 · 7
declared as follows:
let rec prime_fac x =
if x < 2 then []
else let k = first (fun k -> x mod k = 0) 2 in
if k = x then [x]
else k :: prime_fac (x / k)
We have prime fac 735 = [3, 5, 7, 7], for instance.
As is, the algorithm is slow on large prime numbers (try 479,001,599).
The algorithm can be made much faster by stopping the linear search
for the first k dividing x once k 2 > x since then the first k dividing x is
x. Moreover, if we have a least prime factor k < x, it suffices to start
the search for the next prime factor at k since we know that no number
smaller than k divides x. We realize the optimized algorithm as follows:
let rec prime_fac' k x =
if k * k > x then [x]
else if x mod k = 0 then k :: prime_fac' k (x / k)
else prime_fac' (k + 1) x
We have prime_fac' 2 735 = [3, 5, 7, 7], for instance. Interestingly, the
optimized algorithm is simpler than the naive algorithm we started from.
The correctness of the algorithm also relies on the fact that the safety
condition
• 2 ≤ k ≤ x and no number 2 ≤ n < k divides x
propagates from every initial application of prime_fac' to all recursive
applications. We say that the safety condition is an invariant for the
applications of prime_fac'.
It suffices to argue the termination of prime_fac' for the case that
the safety condition is satisfied. In this case the measure x − k ≥ 0 is
decreased by every recursion step.
Exercise 4.15.4 Dieter Schlau has simplified the naive prime factor-
ization function:
let rec prime_fac x =
if x < 2 then []
else let k = first (fun k -> x mod k = 0) 2 in
k :: prime_fac (x / k)
Recall the use of environments in the typing and evaluation systems for
expressions we discussed in Chapter 2. There environments are modeled
as sequences of pairs binding variables to types or values, as for instance
in the value environment [x . 5, y . 7, z . 2]. Abstractly, we can see
an environment as a list of pairs
(key, value)
map α β := L(α × β)
Recall that the typing and evaluation rules need only two operations
on environments, where lookup yields the value for a given key provided
the environment contains a pair for the key, and update updates an
environment with a given key-value pair. For instance,
lookup [] a := None
lookup ((a0 , b) :: l) a := if a0 = a then Some b
else lookup l a
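A sketch of lookup and the prepending update in OCaml, using options as in §4.8:

```ocaml
(* lookup yields Some b for the first pair (a, b) with key a, None otherwise *)
let rec lookup l a =
  match l with
  | [] -> None
  | (a', b) :: l -> if a' = a then Some b else lookup l a

(* update prepends a new binding, shadowing any earlier one *)
let update l a b = (a, b) :: l
```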
that checks whether a map binds a given key. Note that you can define
bound using lookup.
deleting the entry for a given key. We want lookup (delete l a) a = None
for all environments l and all keys a.
update l a b := (a, b) :: l
Redo the previous exercises for the new definition of update. Also define
a function
with update and the only thing that matters is that lookup yields the
correct results. Assume the definition
map α β := α → O(β)
and define empty and the operations update and lookup accordingly. Test
your solution with
lookup (update (update (update empty ”y” 7) ”x” 2) ”y” 5) ”y” = Some 5
Note that you can still define the operations bound and delete from
Exercises 4.16.1 and 4.16.2.
5 Constructor Types, Trees, and Linearization
Recall that the elements of list types are obtained with two constructors
nil and cons. Similarly, the elements of option types are obtained with
the constructors None and Some. We say that list types and option
types are constructor types. We can describe both type families with
grammars:
L(α) ::= [] | α :: L(α)
O(α) ::= None | Some α
We can also see the type B as a constructor type:
B ::= false | true
Moreover, we can see the family of pair types as a family of constructor
types:
α × β ::= (α, β)
Simple matches are available for every constructor type. We have
seen simple matches for list types and option types before. It turns out
that we can express conditionals with simple matches for booleans:
if e1 then e2 else e3 match e1 with
| true → e2
| false → e3
Note that the translation to boolean matches explains nicely that not
both e2 and e3 are evaluated and that e1 must be evaluated first. We
can also reduce pair patterns to simple matches for pairs:
5.2 AB-Trees
A : tree
B : tree → tree → tree
(drawings of three example AB-trees)
Based on the graphical notation one speaks of the nodes and the
edges of a tree. The nodes are the occurrences of the constructors in
the drawing, and the edges are the lines between the nodes. The root
of a tree is the topmost node of the tree, and the leaves of a tree are
the bottom-most nodes of the tree. The size of a tree is the number of
nodes in its drawing, and the depth of a tree is the maximal number of
edges on a path from the root to a leaf (only proceeding downwards).
Here is a function yielding the size of an AB-tree, defined both math-
ematically and in OCaml:
size : tree → N
size A := 1
size (B t1 t2 ) := 1 + size t1 + size t2
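The OCaml declaration might look as follows (a sketch, with the constructor type as described above):

```ocaml
type tree = A | B of tree * tree

(* The size of a tree is the number of its nodes *)
let rec size t =
  match t with
  | A -> 1
  | B (t1, t2) -> 1 + size t1 + size t2
```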
Exercise 5.2.3 Define and declare a function mirror : tree → tree that
yields a tree whose graphical representation is the mirror image of the
given tree. For instance, we want mirror (B(BAA)A) = BA(BAA) and
mirror (B(BA(BAA))A) = BA(B(BAA)A).
        B
      /   \
     B     B
    / \   / \
   A   A A   A
a) Convince yourself that the two subtrees of a maximal tree are max-
imal and identical.
b) Declare a function mtree : N → tree that yields the unique maximal
tree of depth n. Use a let expression to avoid binary recursion. Also
give a tail-recursive function such that mtree′ n A yields the maximal
tree of depth n.
c) Give a non-maximal tree t such that mirror(t) = t.
d) Declare a function check : tree → O(N) such that check(t) = Some(n)
if t is a maximal tree of depth n, and check(t) = None if t is not
maximal.
Exercise 5.2.6 (Minimal Trees)
A tree is minimal if its size is minimal for all trees of the same depth.
The two minimal trees of depth 2 look as follows:
      B             B
     / \           / \
    A   B         B   A
       / \       / \
      A   A     A   A
for all trees t, where lin : tree → string is a linearization function and
parse : string → tree is the inverting parsing function.
Here are two mutually recursive functions tree → string linearizing AB-
trees with a minimum of parentheses given that B is accommodated as
a left-associative infix operator:
We have tree (B(B(A, A), A)) = ”ABABA” and tree (B(A, B(A, A))) =
”AB(ABA)”. Both functions linearize trees, where B-trees are linearized
either with parentheses (ptree) or without parentheses (tree). In either
case, the left subtree of a B-tree is linearized without parentheses and
the right subtree is linearized with parentheses.
5.5 ABC-Trees
¹ In mathematics, products are often written with juxtaposition; e.g., 2n for 2 · n.
5.6 Mini-OCaml
The grammar follows the conventions of OCaml except that the com-
parison operator is not associative:
• Conditionals, lambda expressions, and let expressions take their ar-
guments after operator and function applications. Hence they must
be enclosed in parentheses when they appear as argument expressions
of operator and function applications.
• Function applications take their arguments before operator applica-
tions.
• Multiplication takes its arguments before addition and subtraction,
and all three operators are left-associative.
• Addition and subtraction are at the same level and take their argu-
ments before the comparison operator ’≤’, which is neither left nor
right-associative.
Note that the grammar has 6 levels. The top level accommodates
prefix constructs (conditionals and lambda and let expressions), and the
bottom level accommodates primitive expressions (variables and con-
stants). Level 2 accommodates function applications, and the remaining
3 levels accommodate the infix operators.
In contrast to the grammars we have seen so far, the linearization
functions for the linearization grammar for expressions must put white
space characters between the words so that identifiers, keywords, and
numbers are properly separated.
OCaml’s concrete syntax doesn’t have constants for negative integers
but represents them with a unary negation operator. There is the
subtlety that one can write ”2 + -7” but cannot drop the parentheses
in ”f (-7)”. For our purposes it seems best to always linearize negative
integers with parentheses.
Taken together, we now see that we can define the natural numbers as
a constructor type, and the operations on natural numbers as recursive
functions. The idea for this setup goes back to the work of Dedekind
and Peano. Defining the natural numbers with a constructor type is
logically of great importance. On the practical side, however, the naive
constructor representation of natural numbers is not helpful since it is
hopelessly inefficient.
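A minimal sketch of this setup (the constructor names O and S are our choices; the full set of declarations is the subject of Exercise 5.7.1):

```ocaml
(* Natural numbers as a constructor type, following Dedekind and Peano *)
type nat = O | S of nat

(* Addition by structural recursion on the first argument *)
let rec add m n =
  match m with
  | O -> n
  | S m -> S (add m n)

(* Conversion to machine integers *)
let rec to_int n =
  match n with
  | O -> 0
  | S n -> 1 + to_int n
```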
Exercise 5.7.1 Declare the constructor type nat in OCaml and de-
clare functions for addition, multiplication, and exponentiation. Also
declare functions converting between nonnegative machine integers and
constructor numbers.
5.8 Summary
6 Parsing
6.1 Lexing
The first step in parsing a string translates the string into a list of tokens
representing words. Most examples in this chapter take ABC-trees
type tree = A | B of tree * tree | C of tree * tree
as syntactic objects (often we will just use the constructors A and B).
We accommodate the tokens needed for ABC-trees with a constructor
type definition
type token = AT | BT | CT | LP | RP
providing tokens for the words ’A’, ’B’, ’C’ and the parentheses ’(’ and ’)’.
To translate a string into a list of tokens, we need to know a little
bit more about strings in OCaml.
Strings provide an efficient and hardware-oriented representation of
sequences of characters, where characters are represented as sequences
of 8 booleans called bytes. Altogether there are 256 = 2⁸ characters.
OCaml has types for both strings and characters and provides a function
that yields the length of a string (the number of positions a string has).
OCaml has constants for characters, for instance ’A’, ’B’, ’C’, ’(’,
and ’)’. There are also three constants ’ ’, ’\n’, and ’\t’ corresponding to
the white space characters associated with the space key, enter key, and
tabulator key on a keyboard.
For our examples dealing with ABC-trees, we will use a tail-recursive
lexer
declared as follows:
let lex s =
let n = String.length s in
let rec lex i l =
if i >= n then List.rev l
else match String.get s i with
| 'A' -> lex (i+1) (AT::l)
| 'B' -> lex (i+1) (BT::l)
| 'C' -> lex (i+1) (CT::l)
| '(' -> lex (i+1) (LP::l)
| ')' -> lex (i+1) (RP::l)
| ' ' | '\n' | '\t' -> lex (i+1) l
| _ -> failwith "lex: illegal character"
in lex 0 []
convert between strings and character lists such that the equations
implode (explode s) = s
explode (implode l) = l
hold for all strings s and all character lists l. Note the use of OCaml’s
function String.make providing for the translation of characters into
strings (String.make n c yields the string consisting of n ≥ 0 occurrences
of the character c).
a) Try explode "Saarbrücken" and try to explain what you see.
b) Try implode ['\240'; '\159'; '\152'; '\138'] and enjoy.
c) Try implode ['\240'; '\159'; '\167'; '\144'] and enjoy.
d) Convince yourself that OCaml's order on characters yields true for
the following comparisons: 'A' < 'B', 'Z' < 'a', and 'a' < 'b'.
Look up "ASCII" on the web to learn more.
e) Convince yourself that OCaml's order on strings is the lexical order
on character lists obtained with OCaml's order on characters.
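The two conversion functions of this exercise can be sketched as follows (one possible implementation satisfying the stated equations; the definitions in the notes may differ):

```ocaml
(* string -> char list *)
let explode s = List.init (String.length s) (String.get s)

(* char list -> string, using String.make to turn characters into strings *)
let implode l = String.concat "" (List.map (String.make 1) l)
```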
Exercise 6.1.4 (ASCII Codes) OCaml has two functions Char.code and Char.chr
converting between characters and ASCII codes, which are numbers be-
tween 0 and 255. We have Char.chr (Char.code c) = c for all charac-
ters c. Moreover, c < c′ if and only if Char.code c < Char.code c′. Look
up an ASCII table on the web to learn more. Using conversion to ASCII
codes, declare functions as follows:
a) A function checking whether a character is a digit.
b) A function converting characters that are digits into numbers.
c) A function checking whether a character is a lower case letter
between ’a’ and ’z’.
d) A function checking whether a character is an upper case letter
between ’A’ and ’Z’.
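A possible sketch of the requested functions via Char.code (the function names are our choice):

```ocaml
(* all four checks reduce to comparisons of ASCII codes *)
let is_digit c = Char.code '0' <= Char.code c && Char.code c <= Char.code '9'
let digit_value c = Char.code c - Char.code '0'
let is_lower c = Char.code 'a' <= Char.code c && Char.code c <= Char.code 'z'
let is_upper c = Char.code 'A' <= Char.code c && Char.code c <= Char.code 'Z'
```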
Note the systematic use of shadowing for the local variable l. The pars-
ing function terminates since both recursive applications are done with
smaller token lists. Here is an example:
tree (lex "BBAABABAA") = (B (B (A, A), B (A, B (A, A))), [])
The recursive calls of the parsing function on the given token list can be
visualized with parentheses in the lexical representation:
BBAABABAA
B(BAA)(B(A)(BAA))
in the parsing function, where the helper function checking the presence
of the right parenthesis is declared as follows:
let verify c l = match l with
| [] -> failwith "verify: no token"
| c'::l -> if c'=c then l else failwith "verify: wrong token"
Note that we use the letter c for variables ranging over tokens (the letter t
is already taken for trees).
Exercise 6.2.2 Declare recursive descent parsers for AB-trees and lin-
earizations as follows:
a) OCaml linearization, with and without extra parentheses.
b) Fully parenthesized infix linearization, with and without extra paren-
theses.
Do the same for ABC-trees.
Note that the rule for tree′ has a nil alternative which is matched by
the empty token list. The recursion of the grammar terminates since
every recursion step takes at least one token from the token list. Here
are the parsing functions for the grammar treating B as a left-associative
infix operator:
let rec tree l =
let (t,l) = ptree l in tree' t l
and tree' t l = match l with
| BT::l -> let (t',l) = ptree l in tree' (B(t,t')) l
| _ -> (t,l)
and ptree l = match l with
| AT::l -> (A,l)
| LP::l -> let (t,l) = tree l in (t, verify RP l)
| _ -> failwith "tree"
gets the tree that has been parsed before as an argument. Note that
realizing the nil alternative of tree′ is straightforward.
If we want to treat B as a right-associative infix operator, the gram-
mar remains unchanged and the rule for the token BT of the parsing
function tree′ is changed as follows:
| BT::l -> let (t',l) = ptree l in
let (t'',l) = tree' t' l in
(B(t,t''), l)
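Assembling the declarations of this section into a self-contained sketch (the code repeats the lexer, verify, and the left-associative parser from above) lets us check the left-associative reading of B and the effect of parentheses:

```ocaml
type tree = A | B of tree * tree | C of tree * tree
type token = AT | BT | CT | LP | RP

let lex s =
  let n = String.length s in
  let rec lex i l =
    if i >= n then List.rev l
    else match String.get s i with
      | 'A' -> lex (i+1) (AT::l)
      | 'B' -> lex (i+1) (BT::l)
      | 'C' -> lex (i+1) (CT::l)
      | '(' -> lex (i+1) (LP::l)
      | ')' -> lex (i+1) (RP::l)
      | ' ' | '\n' | '\t' -> lex (i+1) l
      | _ -> failwith "lex: illegal character"
  in lex 0 []

let verify c l = match l with
  | [] -> failwith "verify: no token"
  | c' :: l -> if c' = c then l else failwith "verify: wrong token"

(* B as left-associative infix operator *)
let rec tree l =
  let (t, l) = ptree l in tree' t l
and tree' t l = match l with
  | BT :: l -> let (t', l) = ptree l in tree' (B (t, t')) l
  | _ -> (t, l)
and ptree l = match l with
  | AT :: l -> (A, l)
  | LP :: l -> let (t, l) = tree l in (t, verify RP l)
  | _ -> failwith "tree"
```

The string "ABABA" parses as B(B(A,A),A), while "AB(ABA)" forces the right-nested tree B(A,B(A,A)).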
Exercise 6.3.1 Declare a parser for ABC-trees where B and C are left-
associative infix operators at the same precedence level.
We now come to the case where we have two infix operators B and C
and B takes its arguments before C (we say that B takes precedence
over C). As in the linearization grammars the precedence hierarchy is
taken care of in the parsing grammar by having rules for every prece-
dence level:
When it comes to the parsing function for this rule, we are confronted
with the problem that we need to decide whether the first alternative
or the nil alternative is to be taken. We base the decision on the first
token and take the first alternative if the token is one a ptree can
start with ("A" or "(").
For instance, the tree B(B(A, A), A) has the postfix linearization
AABAB. As with the prefix linearization, there is no need for paren-
theses.
There is an elegant tail-recursive method for parsing postfix lineariza-
tions that is different from recursive descent parsing. The key idea is
to work with a stack of already parsed subtrees so that when the oper-
ator B is encountered, the associated subtrees t1 and t2 can be taken
from the stack and be replaced by the compound tree B(t1 , t2 ). We
demonstrate the method with a trace for the string "AABAB":
"AABAB"  []
"ABAB"   [A]
"BAB"    [A, A]
"AB"     [B(A, A)]
"B"      [A, B(A, A)]
""       [B(B(A, A), A)]
such that depost (lex (post(t))) [] = [t], and more generally the equation
holds for all trees t, character lists l1 , and stacks l2 (i.e., lists of trees).
The defining equations for depost are now straightforward:
depost [] l2 := l2
depost (AT :: l1 ) l2 := depost l1 (A :: l2 )
depost (BT :: l1 ) (t2 :: t1 :: l2 ) := depost l1 (B(t1 , t2 ) :: l2 )
depost (CT :: l1 ) (t2 :: t1 :: l2 ) := depost l1 (C(t1 , t2 ) :: l2 )
depost := []
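The defining equations of depost transcribe directly to OCaml (a sketch; only the three letter tokens are needed for postfix linearizations, and we add an explicit failure case for ill-formed inputs):

```ocaml
type tree = A | B of tree * tree | C of tree * tree
type token = AT | BT | CT

(* l1 is the remaining token list, l2 the stack of parsed subtrees *)
let rec depost l1 l2 = match l1, l2 with
  | [], l2 -> l2
  | AT :: l1, l2 -> depost l1 (A :: l2)
  | BT :: l1, t2 :: t1 :: l2 -> depost l1 (B (t1, t2) :: l2)
  | CT :: l1, t2 :: t1 :: l2 -> depost l1 (C (t1, t2) :: l2)
  | _, _ -> failwith "depost: ill-formed input"
```

The trace shown above corresponds to depost on the token list for "AABAB" with an empty initial stack.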
6.6 Mini-OCaml
The techniques we have seen for recursive descent parsing suffice to
construct a parser for Mini-OCaml. We adapt the linearization grammar
in §5.6 by adding auxiliary categories as follows:
The lexer for Mini-OCaml needs work. We start with the constructor
type for tokens:
7 Mini-OCaml Interpreter
Our starting point is the abstract grammar for the types and expressions
of Mini-OCaml:
Environments
For the evaluator and the type checker we need environments. We im-
plement environments as key-value maps:
type ('a,'b) env = ('a * 'b) list
let empty : ('a,'b) env = []
let update (env : ('a,'b) env) a b : ('a,'b) env = (a,b) :: env
let rec lookup (env : ('a,'b) env) a = match env with
| (a',b) :: env -> if a = a' then Some b else lookup env a
| [] -> None
Note that env is a second name for list. Alternatively, environments
could be realized with a custom constructor type different from lists.
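A small usage example of the environment operations (repeating the declarations above), illustrating that update shadows earlier bindings:

```ocaml
type ('a,'b) env = ('a * 'b) list
let empty : ('a,'b) env = []
let update (env : ('a,'b) env) a b : ('a,'b) env = (a,b) :: env
let rec lookup (env : ('a,'b) env) a = match env with
  | (a',b) :: env -> if a = a' then Some b else lookup env a
  | [] -> None

(* binding "x" twice: lookup finds the most recent binding *)
let env = update (update empty "x" 1) "x" 2
```

lookup env "x" yields Some 2 since the newer binding shadows the older one, while lookup env "y" yields None.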
Note that we are using helper functions for operator and function appli-
cations for better readability.
7.3 Evaluator
For the evaluator we first define a constructor type providing the values
of Mini-OCaml as OCaml values:
As with the type checker, we use helper functions for operator and func-
tion applications for better readability. This time the helper function for
function applications is mutually recursive with the master evaluation
function.
7.4 Lexer
The lexer for Mini-OCaml needs work, mainly because it needs to rec-
ognize identifiers (i.e., variables) and number constants. The first step
consists in declaring a constructor type for tokens:
type const = BCON of bool | ICON of int
type token = LP | RP | EQ | COL | ARR | ADD | SUB | MUL | LEQ
| IF | THEN | ELSE | LAM | LET | IN | REC
| CON of const | VAR of string | BOOL | INT
The tokens LP, RP, EQ, COL, ARR, and LAM are realized with the
strings "(", ")", "=", ":", "->", and "fun".
We remark that the lexer follows the longest munch rule when it
reads identifiers and numbers; that is, it reads as many characters as
possible for an identifier or number. Thus "ifx" is recognized
as a single variable token VAR "ifx" rather than the keyword token IF
followed by the variable token VAR "x".
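A sketch of such a lexer (the code in the notes may differ; the helper names word, num, and classify are ours). The workers word and num realize the longest munch rule by scanning as far as possible before emitting a token:

```ocaml
type const = BCON of bool | ICON of int
type token = LP | RP | EQ | COL | ARR | ADD | SUB | MUL | LEQ
           | IF | THEN | ELSE | LAM | LET | IN | REC
           | CON of const | VAR of string | BOOL | INT

let is_letter c = ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z')
let is_digit c = '0' <= c && c <= '9'
let digit c = Char.code c - Char.code '0'

(* keywords take precedence over identifiers *)
let classify s = match s with
  | "if" -> IF | "then" -> THEN | "else" -> ELSE
  | "fun" -> LAM | "let" -> LET | "in" -> IN | "rec" -> REC
  | "bool" -> BOOL | "int" -> INT
  | "true" -> CON (BCON true) | "false" -> CON (BCON false)
  | _ -> VAR s

let lex s =
  let n = String.length s in
  (* longest munch: scan an identifier starting at i, currently at j *)
  let rec word i j =
    if j < n && (is_letter (String.get s j) || is_digit (String.get s j))
    then word i (j+1)
    else (classify (String.sub s i (j - i)), j) in
  (* longest munch: scan a number, accumulating its value in a *)
  let rec num j a =
    if j < n && is_digit (String.get s j)
    then num (j+1) (10 * a + digit (String.get s j))
    else (CON (ICON a), j) in
  let rec loop i l =
    if i >= n then List.rev l
    else match String.get s i with
      | ' ' | '\n' | '\t' -> loop (i+1) l
      | '(' -> loop (i+1) (LP::l)
      | ')' -> loop (i+1) (RP::l)
      | '=' -> loop (i+1) (EQ::l)
      | ':' -> loop (i+1) (COL::l)
      | '+' -> loop (i+1) (ADD::l)
      | '*' -> loop (i+1) (MUL::l)
      | '-' -> if i+1 < n && String.get s (i+1) = '>'
               then loop (i+2) (ARR::l) else loop (i+1) (SUB::l)
      | '<' -> if i+1 < n && String.get s (i+1) = '='
               then loop (i+2) (LEQ::l)
               else failwith "lex: expected '='"
      | c when is_digit c -> let (t, j) = num i 0 in loop j (t::l)
      | c when is_letter c -> let (t, j) = word i i in loop j (t::l)
      | _ -> failwith "lex: illegal character"
  in loop 0 []
```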
Exercise 7.4.1 Rewrite the function lex so that no patterns with when
conditions are used. Use nested conditionals instead.
7.5 Parser
The top level takes care of the prefix phrases, which recurse to the top
level. The top level delegates to the comparison level in case no keyword
for a prefix phrase is present. We then have 4 infix levels taking care of
operator applications and function applications. Function applications
are at the lowest infix level and are realized with juxtaposition. Addi-
tive and multiplicative operators and function applications are realized
with left-associativity, and comparisons are realized without associativ-
ity, deviating from OCaml. The bottom level takes care of variables,
constants, and parenthesized expressions. For the 3 levels realizing left-
associativity the parsing grammar has auxiliary categories sexp′, mexp′,
and aexp′.
The parsing function for the top level is interesting in that it can
make good use of pattern matching:
ty ::= pty ty′
ty′ ::= "->" ty | []
pty ::= "bool" | "int" | "(" ty ")"
Exercise 7.5.2 Write the parsing functions for Mini-OCaml types and
Mini-OCaml expressions.
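For the type part of the exercise, a possible solution sketch follows the grammar above (the constructor type ty with constructors Bool, Int, and Arrow is our choice; note that the grammar makes "->" right-associative):

```ocaml
type token = LP | RP | ARR | BOOL | INT
type ty = Bool | Int | Arrow of ty * ty

let verify c l = match l with
  | [] -> failwith "verify: no token"
  | c' :: l -> if c' = c then l else failwith "verify: wrong token"

(* one parsing function per grammar category *)
let rec ty l = let (t, l) = pty l in ty' t l
and ty' t l = match l with
  | ARR :: l -> let (t', l) = ty l in (Arrow (t, t'), l)
  | _ -> (t, l)
and pty l = match l with
  | BOOL :: l -> (Bool, l)
  | INT :: l -> (Int, l)
  | LP :: l -> let (t, l) = ty l in (t, verify RP l)
  | _ -> failwith "ty"
```

For instance, int -> int -> bool parses right-associatively, while parentheses force the left-nested arrow type.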
8 Running Time
A function that has linear running time has the property that the ex-
ecution time grows at most linearly with the argument size. Similarly,
functions that have logarithmic or quadratic or exponential running time
have the property that the execution time grows at most logarithmically
or quadratically or exponentially with the argument size. Linearithmic
time is almost like linear time and behaves the same in practice. Fi-
nally, the execution time of a function that has constant running time
is bounded for all arguments by a fixed constant. In practice, constant,
linear, and linearithmic running time is what one looks for, quadratic
running time signals that there will be performance problems for larger
arguments, and exponential running time signals that there are argu-
ments of reasonable size for which the execution time is too high for
practical purposes.
The running times for list sorting algorithms are interesting. We will
see that insertion sort has quadratic running time, and that there is a
sorting algorithm (merge sort) with linearithmic running time that is
much faster in practice.
Logarithmic running time O(log n) is much faster than linear time.
The standard example for logarithmic running time is binary search, an
algorithm for searching ordered sequences we will discuss in this chap-
ter. Logarithmic running time occurs if the input size is halved at each
recursion step, and only a constant number of function calls is needed
to prepare a recursion step.
A good measure of the abstract running time of an expression is the
number of function calls needed for its execution. Expressions whose ex-
ecution doesn’t involve function calls receive the abstract running time 0.
As an example, we consider a list concatenation function:
[] @ l2 := l2
(x :: l1 ) @ l2 := x :: (l1 @ l2 )
Counting the function calls for a call l1 @ l2 by recursion on the length n
of l1 yields the recursive running time function
r(0) := 1
r(n + 1) := 1 + r(n)
giving us the running time for every call l1 @ l2 where l1 has length n.
A little bit of discrete mathematics gives us an explicit characterization
of the running time function:
r(n) = n + 1
We can now say that the list concatenation function @ has linear running
time in the length of the first list.
We remark that we distinguish between function applications and
function calls. A function application is a syntactic expression e e1 . . . en .
A function call is a tuple (v, v1 , . . . , vn ) of values that is obtained by
evaluating the expressions of a function application.
We will now look at two functions for list reversal giving us two interest-
ing examples for abstract running times. We start with a tail-recursive
function
rev append [] l2 := l2
rev append (x :: l1 ) l2 := rev append l1 (x :: l2 )
that has linear running time in the length of the first argument (following
the argumentation we have seen for list concatenation). We can now
reverse a list l with a function call (rev append l []) taking running time
linear in the length of l. We remark at this point that rev append gives
us an optimal list reversal function that cannot be further improved.
Another list reversal function we call naive reversal is
rev [] := []
rev (x :: l) := rev (l) @ [x]
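In OCaml, the two reversal functions read as follows (direct transcriptions of the equations; rev_append corresponds to OCaml's List.rev_append):

```ocaml
(* tail-recursive: linear running time in the length of the first list *)
let rec rev_append l1 l2 = match l1 with
  | [] -> l2
  | x :: l1 -> rev_append l1 (x :: l2)

(* naive reversal: the appended singleton makes it quadratic *)
let rec rev l = match l with
  | [] -> []
  | x :: l -> rev l @ [x]
```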
Tail-recursive list reversal has running time n+1 for lists of length n,
which can be shown with the same method we used for list concatenation.
For naive reversal we have to keep in mind that list concatenation
is not an operation (as suggested by the symbol @) but a function. This
means we have to count the function calls needed for concatenation for
every recursive call of rev. This gives us the recursive definition of the
running time function for rev:
r(0) := 1
r(n + 1) := 1 + r(n) + (n + 1)
Solving the recurrence yields the explicit formula r(n) = (n + 1)(n + 2)/2,
so naive reversal has quadratic running time. One speaks of recurrence
solving. We will not explain this topic here. For practical purposes it is
often convenient to use one of the recurrence solvers on the Internet.
Exercise 8.2.1 Define a linear time function concatenating two lists us-
ing only tail recursion. Hint: Exploit the equations l1 @r l2 = rev l1 @ l2
and rev (rev l) = l.
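One possible solution to the exercise, following the hint (the names rev_append and append are our choice):

```ocaml
let rec rev_append l1 l2 = match l1 with
  | [] -> l2
  | x :: l1 -> rev_append l1 (x :: l2)

(* l1 @ l2 = rev (rev l1) @ l2, and both reversals are tail-recursive *)
let append l1 l2 = rev_append (rev_append l1 []) l2
```

The function reverses l1 twice, so it takes two linear passes instead of one, but stays tail-recursive and thus linear overall.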
From the code it is clear that there is no dependence on the order of the
input list. It is also clear that the running time of split for an input list
of length n is at most n + 1 (for larger n it is at most 1 + n/2). The
running time of merge is 1 + n1 + n2 where n1 and n2 are the lengths
of the input lists l1 and l2 . Thus the running time of msort is linear
in the length of the input list plus twice the running time of msort for
input lists of half the length. This results in linearithmic running time
O(n log n) for msort and input lists of length n.
Note that in the above code split is tail-recursive but merge is not.
A tail-recursive variant of merge is straightforward:
let rec merge l1 l2 l = match l1, l2 with
| [], l2 -> List.rev_append l l2
| l1, [] -> List.rev_append l l1
| x::l1, y::l2 when x <= y -> merge l1 (y::l2) (x::l)
| x::l1, y::l2 -> merge (x::l1) l2 (y::l)
With the tail-recursive merge function only the recursion depth of msort
remains, which is logarithmic in the length of the input list.
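The complete sorting function can be assembled with the tail-recursive merge (a sketch; split is the usual alternating split distributing the input into two halves):

```ocaml
let rec split l = match l with
  | [] -> ([], [])
  | [x] -> ([x], [])
  | x :: y :: l -> let (l1, l2) = split l in (x :: l1, y :: l2)

(* tail-recursive merge with accumulator l *)
let rec merge l1 l2 l = match l1, l2 with
  | [], l2 -> List.rev_append l l2
  | l1, [] -> List.rev_append l l1
  | x :: l1', _ :: _ when x <= List.hd l2 -> merge l1' l2 (x :: l)
  | l1, y :: l2' -> merge l1 l2' (y :: l)

let rec msort l = match l with
  | [] | [_] -> l
  | _ -> let (l1, l2) = split l in merge (msort l1) (msort l2) []
```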
takes the value x for some k (that is, f (k) = x). Without further
information about f , we can perform a linear search for the first k such
that f (k) = x.
We can search faster if we assume that f is increasing:
f (l) ≤ · · · ≤ f (r)
declared as follows:
let find f x l r =
let rec aux l r =
if r < l then None
else let m = (l+r)/2 in
let y = f m in
if x = y then Some m
else if x < y then aux l (m-1) else aux (m+1) r
in aux l r
8.6 Trees
depth A := 0
depth (B(t1 , t2 )) := 1 + max (depth t1 ) (depth t2 )
Note that the function size computes its own running time; that is, the
running time of size for a tree t is size (t).
Here is a function mtree : N → tree that yields an AB-tree of depth n
that has maximal size (see also Exercise 5.2.5):
mtree (0) := A
mtree (n + 1) := B (mtree (n), mtree (n))
The running time of mtree is
r(0) := 1
r(n + 1) := 1 + r(n) + r(n)
r(n) = 2^(n+1) − 1
mtree (0) := A
mtree (n + 1) := let t = mtree (n) in B (t, t)
The insight is that a maximal tree for a given depth has two identical
subtrees. Thus it suffices to compute the subtree once and use it twice.
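In OCaml, with the shared subtree realized by a let expression (a sketch over AB-trees):

```ocaml
type tree = A | B of tree * tree

let rec size t = match t with
  | A -> 1
  | B (t1, t2) -> 1 + size t1 + size t2

let rec depth t = match t with
  | A -> 0
  | B (t1, t2) -> 1 + max (depth t1) (depth t2)

(* linear-time variant: the subtree is computed once and used twice *)
let rec mtree n = if n < 1 then A else let t = mtree (n - 1) in B (t, t)
```

Note that size (mtree n) still takes exponentially many calls in n even though mtree n itself is now cheap to construct.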
For maximal trees the running time of size is exponential in the depth
of the input tree. Consequently, the running time of λn. size (mtree n)
is exponential in n even if the linear-time variant of mtree is used.
A key idea in this chapter is taking the number of function calls needed
for the execution of an expression as running time of the expression.
This definition of running time works amazingly well in practice, in par-
ticular as it comes to identifying functions whose execution time may
be problematic in practice. As one would expect, in a functional pro-
gramming language significant running times can only be obtained with
recursion.
Another basic idea is looking at the maximal running time for argu-
ments of a given size. For lists, typically their length is taken as size.
The idea of taking the maximal running time for all arguments of a given
size is known as worst-case assumption. A good example for the worst-
case assumption is insertion sort, where the best-case running time is
linear and the worst-case running time is quadratic in the length of the
input list.
Once a numeric argument size is fixed, one can usually define the
worst-case running time function for a given function as a recursive
function N → N following the defining equations of the given function.
Given a recursive running time function, one can usually obtain an
explicit formula for the running time using so-called recurrence solving
techniques from discrete mathematics. Once one has an explicit formula
for the running time, one can obtain the asymptotic running time in
big O notation (e.g., O(n), O(n log n), O(n2 )). Big O notation is also
known as Bachmann–Landau notation or asymptotic notation.
Big O notation is based on an asymptotic dominance relation f ≼ g
for functions f, g : N → N defined as follows:
f ≼ g := ∃ n0 , k, c. ∀ n ≥ n0 . f (n) ≤ k · g(n) + c
f ≈ g := f ≼ g ∧ g ≼ f
2n + 13 ≈ n
7n^2 + 2n + 13 ≈ n^2
2^(n+3) + 7n^2 + 2n + 13 ≈ 2^n
We also have
n^1 ≺ n^2 ≺ n^3 ≺ · · · ≺ 2^n ≺ 3^n ≺ 4^n ≺ · · ·
where the functions n^1, n^2, n^3, . . . are polynomial and the functions
2^n, 3^n, 4^n, . . . are exponential.
a) λn. 2n + 13
b) λn. 5n^2 + 2n + 13
c) λn. 5n^3 + n(n + 1) + 13n + 2
d) λn. 2^n + n^4
e) λn. 2^(n+5) + n^2
When a call is executed, all calls in its call tree must be executed.
When an interpreter executes a call that is a descendant of the initial
call, it must store the path from the root to the current call so that it
can return from the current call to the parent call. The path from the
initial call to the current call is stored in the so-called call stack.
The length of the path from the initial call to a lower call is known as
call depth (often called recursion depth). In contrast to functional pro-
gramming languages, most programming languages severely limit the
available call depth, typically to a few thousand calls. At this point
tail-recursion and tail-recursive calls turn out to be important since tail-
recursive calls don’t count for the call depth in some programming sys-
tems limiting the call depth. The reason is that a tail-recursive call
directly yields the result of the parent call so that it can replace the
parent call on the call stack when its execution is initiated.
In contrast to native OCaml interpreters, the browser-based
TryOCaml interpreter (realized with JavaScript) limits the recursion
depth to a few thousand calls. If you want to attack computationally
demanding problems with TryOCaml, it is necessary that you limit
the necessary recursion depth by replacing full recursion with tail
recursion. A good example for this problem is merge sort, where the
helper functions split and merge can be realized with tail recursion and
only the main function requires full recursion. Since the main function
halves the input size upon recursion, it requires call depth that is only
logarithmic in the size of the initial input (for instance, call depth 20
for a list of length 106 ).
9 Inductive Correctness Proofs
(l1 @ l2 ) @ l3 = l1 @ (l2 @ l3 )
rev (l1 @ l2 ) = rev l2 @ rev l1
rev (rev l) = l
The letters l1 , l2 , l3 , and l act as variables ranging over lists over some
base type α. With all typed quantifications made explicit, the second
equation, for instance, becomes
∀α : T. ∀ l1 : L(α). ∀ l2 : L(α). rev (l1 @ l2 ) = rev l2 @ rev l1
where the symbol T represents the type of types. We will usually not give
the types of the variables, but for the correctness of the mathematical
reasoning it is essential that types can be assigned consistently.
The equations can be shown by equational reasoning using the defin-
ing equations of the functions and a recursive proof technique called list
induction.
Fact 9.2.1 (List induction) To show that a proposition p(l) holds for
all lists l, it suffices to show the following propositions:
1. Nil case: p([])
2. Cons case: ∀x. ∀l. p(l) → p(x :: l)
Proof We assume proofs of the nil case and the cons case. Let l be a
list. We prove p(l) by case analysis and recursion on l. If l = [], the nil
case gives us a proof. If l = x :: l0 , recursion gives us a proof of p(l0 ).
Now the function proving the cons case gives us a proof of p(l).
Fact 9.2.2 l @ [] = l.
(x :: l) @ [] = x :: l simplification
x :: (l @ []) induction
x :: l
The compact format arranges things such that the inductive hypoth-
esis is identical with the claim that is proven. This way there is no need
to write down the inductive hypothesis.
Our second example for an inductive proof concerns the associativity
of list concatenation.
Cons case:
Next we show that naive lists reversal agrees with tail-recursive list
reversal. Recall the definition of reversing concatenation (rev append):
(x :: l1 ) @r l2 = rev (x :: l1 ) @ l2 simplification
l1 @r (x :: l2 ) = (rev l1 @ [x]) @ l2 induction
rev l1 @ (x :: l2 ) Fact 9.2.3
rev l1 @ ([x] @ l2 ) simplification
Exercise 9.3.6 Give the induction predicate for the proof of Fact 9.3.2.
Exercise 9.3.9 In this exercise we will write |l| for length l for better
mathematical readability. Prove the following properties of the length
of lists:
a) |l1 @ l2 | = |l1 | + |l2 |.
b) |rev l| = |l|.
c) |map f l| = |l|.
Exercise 9.3.11 Give the running time function for con l1 l2 in the
length of l1 , not using recursion.
a) Prove |pow l| = 2^|l| where |l| is notation for length l. The equation
says that a list of length n has 2^n sublists.
b) Give the running time function for function pow.
c) Give the running time function for a variant of pow which avoids the
binary recursion with a let expression.
len [] a = a
len (x :: l) a = len l (a + 1)
Inductive proofs are not limited to lists. The most basic induction prin-
ciple is induction on natural numbers, commonly known as mathematical
induction.
sum : N → N
sum 0 := 0
sum (n + 1) := sum n + (n + 1)
a) 0 + 1 + · · · + n = n · (n + 1)/2
b) 0 + 1 + · · · + n = sum n
Equation (a) is known as the little Gauss formula and gives us an efficient
method to compute the sum of all numbers up to n by hand. For in-
stance, the sum of all numbers up to 100 is 100 · 101/2 = 5050. The
second equation tells us that the mathematical notation 0 + 1 + · · · + n
may be defined with the recursive function sum.
r1 (0) := 1
r1 (n + 1) := 1 + r1 (n)
r2 (0) := 1
r2 (n + 1) := 1 + r2 (n) + (n + 1)
r3 (0) := 1
r3 (n + 1) := 1 + r3 (n) + r3 (n)
Given a recursive function that is not tail-recursive, one can often come
up with a tail-recursive version collecting intermediate results in addi-
tional accumulator arguments. One then would like to prove that the
tail-recursive version agrees with the original version. We already saw
such a proof in §9.3 for the tail-recursive version of list-reversal. For list
reversal we have to prove
rev l = l @r []
l1 @r l2 = rev l1 @ l2
fib : N → N
fib(0) := 0
fib(1) := 1
fib (n + 2) := fib (n) + fib (n + 1)
One speaks of a binary recursion since there are two parallel recursive
applications in the 3rd defining equation. It is known that the running
time of fib is exponential in the sense that it is not bounded by any
polynomial.
To come up with a tail-recursive version of fib, we observe that a
Fibonacci number is obtained as the sum of the preceding two Fibonacci
numbers starting from the initial Fibonacci numbers 0 and 1. In §3.3
this fact is used to obtain the Fibonacci numbers with iteration. We now
realize the iteration using two accumulator arguments for the preceding
Fibonacci numbers:
fibo : N → N → N → N
fibo 0 a b := a
fibo (n + 1) a b := fibo n b (a + b)
We would like to show
fib n = fibo n 0 1
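Transcribed to OCaml (a sketch), the claimed agreement can be tested before it is proved:

```ocaml
(* binary recursion: exponential running time *)
let rec fib n = if n < 2 then n else fib (n - 2) + fib (n - 1)

(* tail-recursive, with accumulators for the two preceding Fibonacci numbers *)
let rec fibo n a b = if n < 1 then a else fibo (n - 1) b (a + b)
```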
fac : N → N → N
fac 0 a := a
fac (n + 1) a := fac n ((n + 1) · a)
pow : N → N → N → N
pow x 0 a := a
pow x (n + 1) a := pow x n (x · a)
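The two accumulator functions transcribe directly to OCaml (a sketch):

```ocaml
(* fac n 1 computes n!, collecting the product in the accumulator a *)
let rec fac n a = if n < 1 then a else fac (n - 1) (n * a)

(* pow x n 1 computes x^n *)
let rec pow x n a = if n < 1 then a else pow x (n - 1) (x * a)
```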
it : ∀α. (α → α) → N → α → α
it f 0 x := x
it f (n + 1) x := f (it f n x)
Note that it is defined differently from the tail-recursive iteration func-
tion iter in §3.2. We will eventually show that both iteration functions
agree.
We first show that factorials
0! := 1
(n + 1)! := (n + 1) · n!
f (it f (n + 1) x) = it f (n + 1) (f x) simplification
f (f (it f n x)) = f (it f n (f x)) induction
iter f 0 x := x
iter f (n + 1) x := iter f n (f x)
it f (n + 1) x = iter f (n + 1) x simplification
f (it f n x) = iter f n (f x) shift law
it f n (f x) induction
Exercise 9.6.5 Recall the Fibonacci function fib from §9.5. Prove
(fib n, fib (n + 1)) = it f n (0, 1) for f (a, b) := (b, a + b).
Exercise 9.6.6 Prove the shift law for iter in two ways: (1) by a direct
inductive proof, and (2) using the agreement with it and the shift law
for it.
% : N → N+ → N
x % y := x if x < y
x % y := (x − y) % y if x ≥ y
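The equations transcribe directly to OCaml (a sketch; we write the operator as %% since OCaml's built-in modulo is the keyword mod, and as in the mathematical definition, termination requires y > 0):

```ocaml
(* repeated subtraction, following the two defining equations *)
let rec ( %% ) x y = if x < y then x else (x - y) %% y
```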
Fact x % y < y.
Proof By induction on x % y.
1. x % y = x and x < y. The claim follows.
2. x % y = (x − y) % y and x ≥ y. By induction we have (x − y) % y < y.
The claim follows.
Note that either the first or the second rule is applicable, and that
application of the second rule terminates since the second rule decreases
the sum of the two numbers.
We now recall that x % y is obtained by subtracting y from x as long
as y ≤ x. This gives us the following function for computing greatest
common divisors:
gcd : N → N → N
gcd x 0 := x
gcd x y := gcd y (x % y) if y > 0
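In OCaml the function reads as follows (a direct transcription, using the built-in mod for %):

```ocaml
(* Euclid's algorithm: recursion decreases the second argument *)
let rec gcd x y = if y = 0 then x else gcd y (x mod y)
```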
Our goal here is to carefully explain how the algorithm can be derived
from basic principles so that the correctness of the algorithm is obvious.
In this section, all numbers will be nonnegative integers. We first
define the divides relation for numbers:
k | x := ∃n. x = n · k k divides x
Proof Follows with the subtraction rule (Fact 9.8.2) since x % y is ob-
tained from x by repeated subtraction of y. More formally, we argue the
claim by function induction on x % y.
1. x % y = x and x < y. The claim follows.
2. x % y = (x−y) % y and x ≥ y. By induction we have that the common
divisors of x − y and y are the common divisors of (x − y) % y and y.
The claim follows with the subtraction rule (Fact 9.8.2).
Euclid’s algorithm now simply applies the modulo rule until the zero
rule applies, using the symmetry rule when needed. Recall our formula-
tion of Euclid’s algorithm with the function
gcd : N → N → N
gcd x 0 := x
gcd x y := gcd y (x % y) if y > 0
Recall that gcd terminates since recursion decreases the second argument
(ensured by Fact 9.7.1).
Running Time
The running time of gcd x y is at most y since each recursion step de-
creases y. In fact, one can show that the asymptotic running time of
gcd x y is logarithmic in y. For both considerations we exploit that x % y
is realized as a constant-time operation on usual computers. One can
also show that gcd at least halves its second argument after two recursion
steps. For an example, see the trace at the beginning of this section.
Proof We assume H1 : ∀n. (∀k < n. p(k)) → p(n). To show ∀n. p(n),
we show more generally
H2 : ∀k < n. p(k)
∀k′ < k. p(k′)
Proof For n = 0, the claim is obvious. For n > 0, the list has the form
x :: l. Every sublist of x :: l now either keeps x or deletes x. A sublist
of x :: l that deletes x is a sublist of l. By complete induction we know
that l has 2^(n−1) sublists. We now observe that the sublists of x :: l that
keep x are exactly the sublists of l with x added in front. This gives us
that x :: l has 2^(n−1) + 2^(n−1) = 2^n sublists.
Note that the proof suggests a function that given a list obtains a
list of all sublists.
Recall that a prime number is an integer greater than 1 that cannot be
obtained as the product of two integers greater than 1 (§3.4).
Note that the proof suggests a function that given a number greater
than 1 obtains a prime factorization of the number. The function relies
on a helper function that given x > 1 decides whether there are x1 , x2 > 1
such that x = x1 · x2 .
k | x := ∃n. x = n · k k divides x
prime x := x ≥ 2 ∧ ¬∃ k, n ≥ 2. x = k · n
pfac : N → L(N)
pfac x := [] if x < 2
pfac x := let k = first (λk. k | x) 2 in k :: pfac (x/k) if x ≥ 2
We first argue the termination of pfac. The linear search realized with
first terminates since x ≥ 2, the search starts with 2, and x divides x.
The linear search yields a k such that 2 ≤ k ≤ x and thus x/k < x.
Thus pfac terminates since it decreases its argument.
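An OCaml transcription of pfac (a sketch; first realizes the linear search from the definition):

```ocaml
(* linear search for the least k' >= k satisfying p *)
let rec first p k = if p k then k else first p (k + 1)

(* prime factorization by repeated division by the least divisor *)
let rec pfac x =
  if x < 2 then []
  else let k = first (fun k -> x mod k = 0) 2 in
       k :: pfac (x / k)
```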
We now show
pfac′ : N → N → L(N)
pfac′ k x := [x] if k^2 > x
pfac′ k x := k :: pfac′ k (x/k) if k^2 ≤ x and k | x
pfac′ k x := pfac′ (k + 1) x if k^2 ≤ x and ¬(k | x)
safe1 k x := 2 ≤ k ≤ x
safe2 k x := 2 ≤ k ≤ x ∧ ∀ m ≥ 2. m | x → m ≥ k
Exercise 9.10.1 For each of the recursive applications of pfac′ give the
condition one has to verify to establish safe2 as an invariant.
Give a predicate p(x) holding exactly for the numbers f terminates on.
Convince yourself that p is an invariant for f .
Give an invariant pxy for f holding exactly for the numbers f terminates
on.
10 Arrays
x_0 | x_1 | · · · | x_(n−1)     (fields at indices 0, 1, . . . , n − 1)
The length and the element type of an array are fixed at allocation time
(the time an array comes into existence). The fields or cells of an array
are boxes holding values known as elements of the array. The fields
are numbered starting from 0. One speaks of the indices or positions
of an array. The key difference between lists and arrays is that arrays
are mutable in that the values in their fields can be overwritten by
an update operation. Arrays are represented by special array values
acting as immutable names for the mutable array objects in existence.
Each time an array is allocated in the computer’s memory, a fresh name
identifying the new array is chosen.
OCaml provides arrays through the standard structure Array. The
operation Array.make
allocates a fresh array of a given length where all fields are initialized
with a given value. An array type has the form array(t) and covers all
arrays whose elements have type t, no matter what the length of the
array is. There is also a special constructor type unit
unit ::= ()
e1 .(e2 )          derived form for   Array.get e1 e2
e1 .(e2 ) <- e3    derived form for   Array.set e1 e2 e3
The 1st declaration allocates an array a whose initial state is [7, 7, 7].
The 2nd declaration binds the identifier v before to 7 (the 1st element
of a at this point). The 3rd declaration updates a to the state [7, 5, 7].
The 4th declaration binds the identifier v after to 5 (the 1st element
of a at this point). Note that the identifier v before is still bound to 7.
There is also a basic array operation that yields the length of an
array:
let a = [|1;2;3|]
This declaration binds a to a new array of length 3 whose fields hold the
values 1, 2, 3. The declaration may be seen as derived syntax for
let a = Array.make 3 1
let _ = a.(1) <- 2
let _ = a.(2) <- 3
Sequentializations
OCaml offers expressions of the form
e1 ; e2
let to_list a =
let rec loop i l =
if i < 0 then l else loop (i-1) (a.(i)::l)
in loop (Array.length a - 1) []
The loop function shifts a pointer i from right to left through the array
and collects the values of the scanned fields in a list l.
The conversion from lists to arrays can be defined with the operations
Array.make and Array.set:
let of_list l =
match l with
| [] -> [||]
| x::l ->
let a = Array.make (List.length l + 1) x in
let rec loop l i =
match l with
| [] -> a
| x::l -> (a.(i) <- x; loop l (i + 1))
in loop l 1
Here the loop function recurses on the list and shifts a pointer from left
to right through the array to copy the values from the list into the array.
There is the subtlety that given an empty list we can obtain the corresponding empty array only with the expression [| |] since Array.make
requires a default value even if we ask for an empty array. Thus we
consider [| |] as a native constant rather than a derived form. In fact,
the empty array [| |] is an immutable value.
Note that the worker functions in both conversion functions are
tail-recursive. We reserve the identifier loop for tail-recursive worker
functions.
Exercise 10.2.3 Declare a function that yields the sum of the elements
of an array over int.
is x, and in case it is not, continue with either the left half or the right
half of the array. The calculation of the running time assumes that field
access is constant time, which is the case for arrays. One speaks of
binary search to distinguish the algorithm from linear search, which has
linear running time.¹ We have seen binary search before in generalized
form (§8.5).
Binary search is usually realized such that in the positive case it
returns an index where the element searched for occurs.
To ease the presentation of binary search and related algorithms, we
work with a three-valued constructor type
type comparison = LE | EQ | GR
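The function bsearch discussed below can be sketched as follows (a reconstruction; the helper comp and the exact bounds are assumptions consistent with the invariants stated below):

```ocaml
type comparison = LE | EQ | GR

(* comp compares two integers into the three-valued type *)
let comp x y = if x < y then LE else if x = y then EQ else GR

(* bsearch a x yields Some i with a.(i) = x if the sorted array a
   contains x, and None otherwise *)
let bsearch a x =
  let rec loop l r =
    if l > r then None
    else
      let m = (l + r) / 2 in
      match comp x a.(m) with
      | EQ -> Some m
      | LE -> loop l (m - 1)     (* continue with the left half *)
      | GR -> loop (m + 1) r     (* continue with the right half *)
  in
  loop 0 (Array.length a - 1)
```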
Note that the worker function loop is tail-recursive and that the segment
of the array to be considered is at least halved upon recursion.
Binary search satisfies two invariants:
1. The state of the array agrees with the initial state (that is, remains
unchanged).
2. All elements to the left of l are smaller than x and all elements to
the right of r are greater than x.
Given the invariants, the correctness of bsearch is easily verified.
¹ Linear search generalizes binary search in that it doesn’t assume the array is sorted.
10.4 Reversal
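In-place reversal swaps elements from the two ends of the array towards the middle; a sketch of the idea (a reconstruction, not necessarily the text's original code; the helper swap is also used by the partitioning code in §10.5):

```ocaml
(* exchange the fields at positions i and j of array a *)
let swap a i j =
  let x = a.(i) in
  a.(i) <- a.(j);
  a.(j) <- x

(* reverse a in place by swapping from both ends *)
let reverse a =
  let rec loop i j =
    if i < j then (swap a i j; loop (i + 1) (j - 1))
  in
  loop 0 (Array.length a - 1)
```

Note that the worker function loop is tail-recursive, in keeping with the conventions of this chapter.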
10.5 Sorting
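The discussion below refers to a quicksort function qsort built on a partitioning function. A sketch of the overall structure (a reconstruction; partition mirrors the version given below, and swap exchanges two fields):

```ocaml
(* exchange the fields at positions i and j of a *)
let swap a i j =
  let x = a.(i) in a.(i) <- a.(j); a.(j) <- x

(* sort a in place: partition a segment around a pivot, then sort
   the two subsegments recursively *)
let qsort a =
  let partition l r =
    let x = a.(r) in
    let rec loop i j =
      if j < i then (swap a i r; i)
      else if a.(i) < x then loop (i + 1) j
      else if a.(j) >= x then loop i (j - 1)
      else (swap a i j; loop (i + 1) (j - 1))
    in
    loop l (r - 1)
  in
  let rec sort l r =
    if l < r then
      let m = partition l r in
      (sort l (m - 1); sort (m + 1) r)
  in
  sort 0 (Array.length a - 1)
```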
It is easy to verify that qsort terminates and sorts the given array if
partition has the properties specified. Moreover, if partition splits a
segment in two segments of the same size (plus/minus 1) and runs in
linear time in the length of the given segment, the running time of qsort
is O(n log n) (following the argument for merge sort).
We give a simple partitioning algorithm that chooses the rightmost
element of the segment as pivot element that will be swapped to the
dividing position m once it has been determined:
let partition l r =
let x = a.(r) in
let rec loop i j =
if j < i
then (swap a i r; i)
else if a.(i) < x then loop (i+1) j
else if a.(j) >= x then loop i (j-1)
else (swap a i j; loop (i+1) (j-1))
in loop l (r-1)
The function expects a segment l < r in a and works with two pointers i
and j initialized as l and r − 1 and satisfying the following invariants for
the pivot element x = a.(r) :
1. l ≤ i ≤ r and l − 1 ≤ j < r
2. a.(k) < x for all positions k left of i (meaning l ≤ k < i)
3. x ≤ a.(k) for all positions k right of j (meaning j < k ≤ r)
The situation may be described with the following diagram:

  l          i→         ←j          r
  |   < x   |     ?     |   ≥ x   | x |
The pointers are moved further inside the segment as long as the in-
variants are preserved. If this leads to j < i, the dividing position is
chosen as i and the elements at i and r are swapped. If i ≤ j, we have
a.(j) < x and x ≤ a.(i). Now the elements at i and j are swapped and
both pointers are moved inside by one position.
The partitioning algorithm is subtle in that it cannot be understood
without the invariants. The algorithm moves the pointers i and j inside
where the moves must respect the invariants and the goal is j < i. There
is one situation where a swap preserves the invariants and enables a move
of both i and j. As long as i ≤ j, a move is always possible.
The given partitioning algorithm may yield an empty segment and
a segment shorter by only one position than the given segment, which
means that the worst-case running time of qsort will be quadratic. Better partitioning algorithms can be obtained by first swapping a cleverly
determined pivot element into the rightmost position of the given segment.
The polymorphic equality test

= : ∀α. α → α → B

considers two arrays as equal if their states are equal. However, different
arrays may have the same state. There is a second polymorphic equality
test
== : ∀α. α → α → B
testing the equality of two array values without looking at their state:
let a = Array.make 2 1
let b = Array.make 2 1
let test2 = (a = b) (* true *)
let test1 = (a == b) (* false *)
let test3 = (a.(1) <- 2; a=b) (* false *)
let test4 = (a.(1) <- 1; a=b) (* true *)
We will refer to = as structural equality and to == as physical
equality. The name physical equality refers to the fact that the operator
== when applied to arrays tests whether they are the same physical
object (i.e., are stored at the same address in the computer memory). For
arrays and other mutable objects, physical equality implies structural
equality. We will consider physical equality only for arrays.
fixed and the interpreter can choose which order it realizes. This design
decision gives compilers more freedom in generating efficient machine
code.
You can find out more about the execution order your interpreter
realizes by executing the following declarations:
let test = invalid_arg "1" + invalid_arg "2"
let test = (invalid_arg "1", invalid_arg "2")
let test = (invalid_arg "1"; invalid_arg "2")
10.8 Cells
OCaml provides single-field arrays called cells using a type family ref (t)
and three operations:

ref : ∀α. α → ref(α)
! : ∀α. ref(α) → α
:= : ∀α. ref(α) → α → unit

The allocation operation ref creates a new cell initialized with a given
value; the dereference operation ! yields the value stored in a cell; and
the assignment operation := updates a cell with a given value (i.e., the
value in the cell is replaced with a given value). Cell values (i.e., the
names of cells) are called references.
With cells we can declare a function
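For instance, with a cell we can declare an enumerator counting from 0 (a sketch; the function declared at this point in the text may differ):

```ocaml
(* enum : unit -> int  yields 0, 1, 2, ... on successive calls;
   the counter lives in a cell shared by all calls *)
let enum =
  let c = ref 0 in
  fun () ->
    let x = !c in
    c := x + 1;
    x
```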
11 Data Structures
The keywords in the first line are explained by the fact that OCaml refers
to structures also as modules, and to signatures also as module types.
We also note that the signature CELL keeps the type constructor cell
abstract, making it possible to give different implementations for cells.
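For reference, the signature under discussion has roughly the following shape (a sketch inferred from the structure Cell given below):

```ocaml
(* the type constructor cell is kept abstract *)
module type CELL = sig
  type 'a cell
  val make : 'a -> 'a cell
  val get : 'a cell -> 'a
  val set : 'a cell -> 'a -> unit
end
```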
An obvious implementation of the signature CELL would just mirror
OCaml’s native cells. For the purpose of demonstration, we choose an
implementation of CELL with arrays of length 1. We realize the implementation with a declaration of a structure Cell realizing the signature
CELL.
module Cell : CELL = struct
type 'a cell = 'a array
let make x = Array.make 1 x
let get c = c.(0)
let set c x = c.(0) <- x
end
There is type checking that ensures that the structure implements the
fields the signature requires with the required types. Before checking
compliance with the signature, the declarations of the structure are
checked as if they appeared at the top level.
Once the structure Cell is declared, we can use its fields with the dot
notation:
let enum = let c = Cell.make 0 in
fun () -> let x = Cell.get c in Cell.set c (x+1); x
Exercise 11.1.2 Arrays can be implemented with cells and lists. Declare a signature with a type family array(α) and the operations make,
get, set, and length, and implement the signature with a structure not
using arrays. We remark that expressing arrays with cells and lists comes
at the expense that get and set cannot be realized as constant-time operations.
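The stack signature discussed here can be sketched as follows (a reconstruction; the operation names follow the discussion below):

```ocaml
(* a possible stack signature; pop removes and yields the topmost
   element, top yields it without removing it *)
module type STACK = sig
  type 'a stack
  exception Empty
  val make : 'a -> 'a stack
  val push : 'a -> 'a stack -> unit
  val pop : 'a stack -> 'a
  val top : 'a stack -> 'a
  val height : 'a stack -> int
end
```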
The operation make allocates a stack with a given single element. One
says that push puts an additional element on the stack, pop removes
the topmost element of the stack, top yields the topmost element of the
stack, and height yields the height of the stack.
A stack may become empty after sufficiently many pop operations.
Note that the structure Stack declares an exception Empty that is raised
by pop if applied to an empty stack.
We require that make is given an initial element so that the element
type of the stack is fixed. Alternatively one can have a make operation
that throws away the given element.
For queues we may have the signature
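A queue signature analogous to the stack signature might look as follows (a sketch; the operation names insert, remove, and length are assumptions):

```ocaml
(* a possible queue signature: insert adds at the rear,
   remove takes from the front *)
module type QUEUE = sig
  type 'a queue
  exception Empty
  val make : 'a -> 'a queue
  val insert : 'a -> 'a queue -> unit
  val remove : 'a queue -> 'a
  val length : 'a queue -> int
end
```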
We will now realize a stack with an array. The elements of the stack are
written into the array, and the height of the stack is bounded by the length
of the array. This approach has the advantage that a push operation
does not require new memory but reuses space in the existing array.
Such memory-aware techniques are important when data structures are
realized close to the machine level.
We will realize a bounded stack with a structure and a single array.
To make this possible, we fix the element type of the stack to machine
integers. We declare the signature
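A sketch of such a bounded integer stack (a reconstruction; the signature's names, the bound, and the use of a cell for the height are assumptions):

```ocaml
(* a bounded stack of machine integers realized with a single array *)
module type BSTACK = sig
  exception Empty
  exception Full
  val push : int -> unit
  val pop : unit -> int
  val height : unit -> int
end

module BStack : BSTACK = struct
  exception Empty
  exception Full
  let maxHeight = 1000
  let a = Array.make maxHeight 0   (* reused storage for the stack *)
  let h = ref 0                    (* current height *)
  let push x =
    if !h >= maxHeight then raise Full
    else (a.(!h) <- x; h := !h + 1)
  let pop () =
    if !h = 0 then raise Empty
    else (h := !h - 1; a.(!h))
  let height () = !h
end
```

Note that push reuses space in the existing array instead of allocating new memory, which is the point of the bounded realization.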
10 | •──→ 11 | •──→ 12 | −1
 4    5    2    3    0    1
Each of the 3 blocks has two fields, where the first field represents a
number in the list and the second field holds a link, which is either a
link to the next block or −1 representing the empty list. The numbers
below the fields are the indices in the array where the fields are allocated.
The fields of a block are allocated consecutively in the heap array. The
address of a block is the index of its first field in the heap array. Links
always point to blocks and are represented through the addresses of the
blocks. Thus the above linked block representation of the list [10, 11, 12]
is represented with the following segment in the heap:
value:  12  −1  11   0  10   2  · · ·
index:   0   1   2   3   4   5
Note that the representation of the empty list as −1 is safe since the
addresses of blocks are nonnegative numbers (since they are indices of
the heap array).
Given a linked block representation of a list
10 | •──→ 11 | •──→ 12 | −1
we can freely choose where the blocks are allocated in the heap. There
is no particular order to observe, except that the fields of a block must
be allocated consecutively. We might for instance choose the following
addresses:
 10 | •──→  11 | •──→  12 | −1
104   105   62   63    95   96
Exercise 11.5.1 How many blocks and how many fields are required to
store the list [1, 2, 3, 4] in a heap using the linked block representation?
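The heap signature discussed here can be sketched as follows (inferred from the structure H given below; address and index are made manifest so that addresses can be compared with −1):

```ocaml
module type HEAP = sig
  type address = int
  type index = int
  exception Address
  exception Full
  val alloc : int -> address
  val get : address -> index -> int
  val set : address -> index -> int -> unit
  val release : address -> unit
end
```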
The function alloc allocates a new block of the given length in the heap
and returns the address of the block. Blocks must have at least length 1
(that is, at least one field). The function get yields the value in the field
of a block, where the field is given by its index in the block, and the
block is given by its address in the heap. The function set replaces the
value in the field of a block. Thus we may think of blocks as mutable
objects. The function release is given the address of an allocated block
and deallocates this block and all blocks allocated after this block. We
implement the heap with the following structure:
module H : HEAP = struct
let maxSize = 1000
let h = Array.make maxSize (-1)
let s = ref 0 (* current size of heap *)
exception Address
exception Full
type address = int
type index = int
let alloc n = if n < 1 then raise Address
else if !s + n > maxSize then raise Full
else let a = !s in s:= !s + n; a
let check a = if a < 0 || a >= !s then raise Address else a
let get a i = h.(check(a+i))
let set a i x = h.(check(a+i)) <- x
let release a = s:= check a
end
Note that we use the helper function check to ensure that only allocated
addresses are used by set, get, and release. The exception Address signals
an address error that has occurred while allocating or accessing a block.
Given the heap H, we declare a high-level allocation function
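A sketch of this high-level allocation function, together with the function putlist referred to below (names and details are assumptions; the structure H is repeated from above so that the sketch is self-contained):

```ocaml
(* repeats the heap structure H from the text *)
module H = struct
  let maxSize = 1000
  let h = Array.make maxSize (-1)
  let s = ref 0
  exception Address
  exception Full
  let alloc n = if n < 1 then raise Address
    else if !s + n > maxSize then raise Full
    else let a = !s in s := !s + n; a
  let check a = if a < 0 || a >= !s then raise Address else a
  let get a i = h.(check (a + i))
  let set a i x = h.(check (a + i)) <- x
  let release a = s := check a
end

(* alloc : int list -> address
   allocates a block and initializes its fields with the values of
   the given list; for the empty list, H.alloc 0 raises Address *)
let alloc l =
  let a = H.alloc (List.length l) in
  List.iteri (H.set a) l;
  a

(* putlist : int list -> address
   stores a list in the heap as linked blocks and returns the
   address of its first block; the empty list is represented by -1 *)
let rec putlist l =
  match l with
  | [] -> -1
  | x :: l ->
      let b = putlist l in     (* store the tail first *)
      alloc [x; b]
```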
Note that the low-level allocation function ensures that an allocation
attempt for the empty list fails with an address exception.
It is now straightforward to declare a function

getlist : address → L(int)

that given the address of the first block of a list stored in the heap
recovers the list as a value:
let rec getlist a =
if a = -1 then []
else H.get a 0 :: getlist (H.get a 1)
Note that putlist and getlist also work for the empty list. We have
getlist (putlist l) = l
for every list l whose block representation fits into the heap.
At this point we may ask why the tree B(A, A) is represented with
two boxes rather than just one. [diagrams: the block representation of
the example tree with maximal structure sharing, and the exponentially
larger representation without sharing, where every leaf A appears as its
own −1 field]
We now see that the block representation with no structure sharing is ex-
ponentially larger than the block representation with maximal structure
sharing.
with a block representation employing the tags 0 and 1 for the construc-
tors B and C.
a) Draw a block representation without structure sharing.
b) Draw a block representation with structure sharing.
c) Give an expression allocating the tree with structure sharing.
12 Appendix: Example Exams
Midterm
First 8P
Declare a function foo that for a non-negative integer x yields the largest
number n such that n³ ≤ x. For instance, foo (26) = 2 and foo (27) = 3.
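A possible solution sketch (a simple tail-recursive linear search suffices):

```ocaml
(* foo x yields the largest n with n*n*n <= x, for x >= 0 *)
let foo x =
  let rec loop n =
    if (n + 1) * (n + 1) * (n + 1) <= x then loop (n + 1) else n
  in
  loop 0
```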
Fibonacci numbers with iteration 8P
Recall the sequence of Fibonacci numbers: 0, 1, 1, 2, 3, 5, 8, . . . Using
iteration, declare a function fib that for n ≥ 0 yields the nth Fibonacci
number. For instance, fib(0) = 0, fib(1) = 1, and fib(4) = 3. Hint:
fib(n + 2) = fib(n) + fib(n + 1).
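A possible solution sketch: iterate the step (a, b) ↦ (b, a + b) n times starting from (0, 1) with a tail-recursive loop (the course's iteration combinator would serve the same purpose):

```ocaml
(* fib n yields the nth Fibonacci number; the pair (a, b) holds
   two consecutive Fibonacci numbers *)
let fib n =
  let rec loop n (a, b) =
    if n = 0 then a else loop (n - 1) (b, a + b)
  in
  loop n (0, 1)
```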
List reversal 8P
Declare a polymorphic function rev reversing lists. For instance,
rev [1, 2, 3] = [3, 2, 1]. Use only tail recursion. Do not use list
concatenation @.
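A possible solution sketch using an accumulator:

```ocaml
(* rev reverses a list with a tail-recursive worker; each element
   is moved from the remaining list onto the accumulator *)
let rev l =
  let rec loop l acc =
    match l with
    | [] -> acc
    | x :: l -> loop l (x :: acc)
  in
  loop l []
```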
List construction 8P
Declare a polymorphic function init : ∀α. N → (N → α) → L(α) such
that init n f = [f (0), . . . , f (n − 1)]. For instance, init 0 f = [] and
init 2 f = [f (0), f (1)]. Use only tail recursion.
Decimal representation 8P
Declare a function dec that yields the decimal representation of a number
as a list. For instance, dec(456) = [4, 5, 6] and dec(0) = [].
Prime factorization 8P
Declare a function pfac that yields the prime factorization of a number
x ≥ 2 as a list. For instance, pfac(60) = [2, 2, 3, 5].
Insertion sort 8P
Declare a polymorphic function sort sorting lists following the insertion
sort algorithm. For instance, sort [5, 3, 2, 3, 2] = [2, 2, 3, 3, 5].
List prefixes 8P
Declare a polymorphic function pre that given a list yields a list
containing all prefixes of the list. For instance, pre [1, 2, 3] =
[[], [1], [1, 2], [1, 2, 3]].
AB-tree construction 8P
Declare a function ctree that for a number n ≥ 0 yields an AB-tree of
depth n and size 2n + 1. For instance, ctree(1) = B(A, A).
AB-tree infix linearization 8P
Declare a function lin linearizing AB trees such that B is treated as a left-
associative infix operator. For instance, lin(B(B(A, A), B(A, A))) =
"ABAB(ABA)".
Typing and evaluation rules 10P
Complete the typing and evaluation rules for lambda expressions and
function applications.
E ⊢ λx.e : ________        V ⊢ λx.e ▷ ________

E ⊢ e1 e2 : ________       V ⊢ e1 e2 ▷ ________
Early Endterm
Euclid’s algorithm 6P
Declare a function gcd : N → N → N computing the GCD of two
numbers using the remainder operation x % y. Use only tail recursion.
List construction 6P
Declare a function make : ∀α. N → (N → α) → L(α) such that
make n f = [f (0), . . . , f (n − 1)]. Use only tail recursion.
Fibonacci numbers 6P
Declare a constant-time function enum : unit → int enumerating the
sequence of Fibonacci numbers:
fib 0 := 0
fib 1 := 1
fib (n + 2) := fib n + fib (n + 1)
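A possible solution sketch: two cells hold the current pair of consecutive Fibonacci numbers, so each call takes constant time:

```ocaml
(* enum () yields 0, 1, 1, 2, 3, 5, ... on successive calls *)
let enum =
  let a = ref 0 in
  let b = ref 1 in
  fun () ->
    let x = !a in
    let s = !a + !b in
    a := !b; b := s;
    x
```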
Note that the expressions considered can be type checked and evaluated
in environments without making assumptions about types and values.
a) Realize environments with three declarations as follows:
type 'a env
val lookup : 'a env -> var -> 'a option
val update : 'a env -> var -> 'a -> 'a env
b) Declare a type checker
val check : 'a env -> exp -> 'a option
Do not raise exceptions.
c) Can the type checker be used as an evaluator? Answer yes or no.
reading the fields of a block. As usual, address and index are names for
the type int, and the fields of a block are numbered starting from 0.
a) Declare a function putMaxTree : int → address that for n ≥ 0 stores
the maximal AB-tree of depth n in the heap using blocks of length 2
and taking running time O(n).
b) Declare a function getTree : address → tree that reads AB-trees
from the heap preserving structure sharing between siblings. A call
getTree (putMaxTree n) should take running time O(n).
Late Endterm
Strictly ascending lists 7P
Declare a tail-recursive function test : ∀α. L(α) → B testing whether
a list is strictly ascending. A list [x1 , x2 , . . . , xn ] is strictly ascending if
x1 < x2 < · · · < xn .
Enumerators 8P
Declare a function enum : ∀α. (int → α) → unit → α such that each call
enum f yields a new enumerator for the sequence f (0), f (1), f (2), . . . .
Linear search 8P
Declare a function find : ∀α. α → L(α) → O(N) that for x and l returns
the first position at which x occurs in l. For instance, find 3 [3, 2, 3] =
Some 0 and find 1 [3, 2, 3] = None. Use only tail recursion.
Expressions 15P
We consider expressions e ::= c | x | λx.e | e1 e2 implemented with the
declarations
type var = string
type exp = Con of int | Var of var | Lam of var * exp | App of exp * exp
a) Declare a function
closed : var list -> exp -> bool
that checks whether an expression is closed in a list of variables. For
instance, λx.f x y is closed in [f, y].
b) Declare a function
eval : value env -> exp -> value
evaluating an expression in an environment. We assume that values
and environments are implemented as follows:
type 'a env
val lookup : 'a env -> var -> 'a option
val update : 'a env -> var -> 'a -> 'a env
type value = Int of int | Clo of var * exp * value env
If an expression cannot be evaluated, eval should raise an exception
with failwith ”eval”.
it f 0 x := x                        it′ f 0 x := x
it f (n + 1) x := it f n (f x)       it′ f (n + 1) x := f (it′ f n x)