Introduction To Programming Languages
The contents of this book are based on the KAIST Programming Languages course. We
thank PLT1 since the course referred to many materials from PLT in the beginning. We
also thank every student who took the course before. We have learned many things
from the interaction with the students, and those lessons have affected various parts of
the book. In addition, we thank all the previous and current teaching assistants of the
course. They gave opinions on the course and wrote some of the exercises in the book.
In particular, Jihyeok Park contributed greatly to the course, and Jihee Park helped us
edit the exercises.
1: https://racket-lang.org/people.html
We would be delighted to receive comments and corrections, which may be sent to
[email protected]. We thank in advance everyone who will contribute to the
book in the future.
Contents
Acknowledgement
Contents
1 Introduction
  1.1 Exercises
Scala
2 Introduction to Scala
  2.1 Functional Programming
  2.2 Installation
  2.3 REPL
    Variables
    Functions
    Conditionals
    Lists
    Tuples
    Maps
    Classes and Objects
  2.4 Interpreter
  2.5 Compiler
  2.6 SBT
3 Immutability
  3.1 Advantages
  3.2 Recursion
  3.3 Tail Call Optimization
  3.4 Exercises
4 Functions
  4.1 First-Class Functions
  4.2 Anonymous Functions
  4.3 Closures
  4.4 First-Class Functions and Lists
  4.5 For Loops
  4.6 Exercises
5 Pattern Matching
  5.1 Algebraic Data Types
  5.2 Advantages
    Conciseness
    Exhaustivity Checking
    Reachability Checking
  5.3 Patterns in Scala
    Constant and Wildcard Patterns
    Or Patterns
    Nested Patterns
    Patterns with Binders
    Type Patterns
    Tuple Patterns
    Pattern Guards
    Patterns with Backticks
  5.4 Applications of Pattern Matching
    Variable Definitions
    Anonymous Functions
    For Loops
  5.5 Options
Untyped Languages
6 Syntax and Semantics
  6.1 Concrete Syntax
  6.2 Abstract Syntax
  6.3 Parsing
  6.4 Semantics
  6.5 Syntactic Sugar
  6.6 Exercises
7 Identifiers
  7.1 Identifiers
  7.2 Syntax
  7.3 Semantics
  7.4 Interpreter
  7.5 Exercises
8 First-Order Functions
  8.1 Syntax
  8.2 Semantics
  8.3 Interpreter
  8.4 Scope
  8.5 Exercises
9 First-Class Functions
  9.1 Syntax
  9.2 Semantics
  9.3 Interpreter
  9.4 Syntactic Sugar
  9.5 Exercises
10 Recursion
  10.1 Syntax
  10.2 Semantics
  10.3 Interpreter
  10.4 Recursion as Syntactic Sugar
  10.5 Exercises
11 Boxes
  11.1 Syntax
  11.2 Semantics
  11.3 Interpreter
  11.4 Exercises
14 Continuations
  14.1 Redexes and Continuations
  14.2 Continuation-Passing Style
  14.3 Interpreter in CPS
  14.4 Small-Step Operational Semantics
Appendix
A Solutions to Selected Exercises
Bibliography
List of Terms
distinguish syntax and semantics know that semantics remains the same
even if syntax may vary. They know that understanding the principles of
semantics is important to learn languages. Becoming familiar with the
new syntax is all they need to use a new language fluently.
This book explains the semantics of principal concepts in programming
languages. Chapters 2, 3, 4, and 5 introduce the Scala programming
language. This book uses Scala to implement interpreters and type
checkers of languages introduced in the book. Chapter 6 explains syntax
and semantics. Then, the book finally introduces various features of
programming languages.
- Chapter 7 introduces identifiers.
- Chapters 8, 9, and 10 introduce functions.
- Chapters 11 and 12 introduce mutation.
- Chapter 13 introduces lazy evaluation.
- Chapters 14, 15, and 16 introduce continuations.
- Chapter 17 introduces De Bruijn indices.
- Chapters 18 and 19 introduce basic type systems.
- Chapter 20 introduces algebraic data types.
- Chapter 21 introduces parametric polymorphism.
- Chapter 22 introduces subtype polymorphism.
Each chapter explains a feature by defining a small language providing
the feature. Those languages may seem inconvenient in practice because
they are too small. However, the simplicity will allow us to focus on the
topic of each chapter.
1.1 Exercises
1. Write the name of a programming language that you have used.
What are the pros and cons of the language?
2. Write the names of two programming languages you know and
compare them.
Scala
2 Introduction to Scala
This book uses Scala as an implementation language, and this chapter
thus introduces the Scala programming language. Scala stands for a
scalable language [OSV16]. It is a multi-paradigm language that allows
both functional and object-oriented styles. This book focuses on the
functional nature of Scala. In this chapter, we will see what functional
programming is and why this book uses functional programming. In
addition, we will install Scala and write simple programs in Scala.
2.1 Functional Programming
What is functional programming? According to Wikipedia,
int x = 1;
int y = 2;
if (y < 3)
x = x + 4;
else
x = x - 5;
The state keeps changing throughout the execution of the program. Each
line modifies the current state rather than producing a value.
let x = 1 in
let y = 2 in
if y < 3 then x + 4 else x - 5
2.2 Installation
As Scala programs are compiled to Java bytecode, which runs on the Java
Virtual Machine (JVM), you must install Java before installing Scala. Java
has various versions. Scala 2.13, which is used in this book, needs JDK
8 or higher. JDK 8 is the most recommended one. The Scala website10
discusses compatibility issues regarding the other versions.
10: https://docs.scala-lang.org/overviews/jdk-compatibility/overview.html
The Oracle website11 provides an installation file for JDK 8.
11: https://www.oracle.com/java/technologies/javase/javase-jdk8-downloads.html
You can download an installation file for Scala 2.13 from the Scala
website.12 Note that you need a file in the “Other resources” section at
the bottom of the page. On macOS, you may use Homebrew instead.
12: https://www.scala-lang.org/download/
By installing Scala, you can use the Scala REPL, interpreter, and
compiler. Section 2.3, Section 2.4, and Section 2.5 will discuss their usage,
respectively.
Another thing to install is SBT. SBT is a build tool for Scala. An installation
file for SBT is available at the SBT website.13 Section 2.6 will discuss the
usage of SBT.
13: https://www.scala-sbt.org/download.html
2.3 REPL
Once you install Scala, you can launch Scala REPL by typing scala in
your command line.
$ scala
Welcome to Scala 2.13.5.
Type in expressions for evaluation. Or try :help.
scala>
The term REPL stands for read, eval, print, and loop. It is a program
that iteratively reads code from a user, evaluates the code, and prints the
result. The REPL is not a place to write a whole program but is a good place
to write short code and see how it works.
If you input an integer to REPL, it will evaluate the integer and show the
result.
scala> 0
val res0: Int = 0
It means that the expression 0 evaluates to the value 0 and the type of 0
is Int. You can try some arithmetic expressions as well.
scala> 1 + 2
val res1: Int = 3
scala> true
val res2: Boolean = true
scala> "hello"
val res4: String = hello
scala> "hello".length
val res5: Int = 5
scala> "hello".substring(0, 4)
val res6: String = hell
Strings in Scala provide the same methods as those in Java.14
14: https://docs.oracle.com/javase/8/docs/api/java/lang/String.html
The println function prints a given message to the console.
Variables
If the type of the result does not match a given type, the variable will not
be defined due to a type mismatch.
You can omit the : [type] part and use the following syntax instead:
In this case, a type mismatch never happens, and the type of the variable
becomes the same as the type of its value. People usually omit the type
annotations of local variables.
scala> val x = 3
val x: Int = 3
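As a minimal sketch of the two forms described above, a definition can either carry an explicit type annotation or leave the type to be inferred. The names and values here are illustrative assumptions and do not come from the original examples.

```scala
// Illustrative names; not taken from the original text.

// Form 1: explicit type annotation. The type of the right-hand side
// must match the annotated type.
val width: Int = 1 + 2

// Form 2: annotation omitted. The compiler infers Int from the value.
val depth = 3

// A mismatching annotation such as
//   val flag: Int = true
// would be rejected by the compiler with a type mismatch error.
```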
Variables defined by val cannot be mutated, i.e. their values never change.
Reassignment will incur an error. We call such variables immutable
variables.
scala> x = 4
^
error: reassignment to val
Sometimes, mutable variables, i.e. variables whose values can change, are
useful. Scala provides mutable variables as well as immutable variables.
You need to use var instead of val to define mutable variables. You may
or may not write the type of a variable.
scala> var z = 5
var z: Int = 5
scala> z = 6
// mutated z
scala> z
val res8: Int = 6
scala> z = true
^
error: type mismatch;
found : Boolean(true)
required: Int
Functions
scala> add(3, 7)
val res9: Int = 10
Inside quadruple, the variable y is defined and used for the computation
of the return value.15
15: Vertical bars (|) at the beginning of lines are not part of code. They
have been automatically inserted by REPL.
Multiple expressions inside curly braces are collectively treated as a single
expression. We call such an expression a sequenced expression. Like
any other expression, a sequenced expression can occur anywhere an
expression is needed. For example, it can be used to define a variable.
scala> val a = {
| val x = 1 + 1
| x + x
| }
val a: Int = 4
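The definitions of add and quadruple are not shown above. The sketch below gives one plausible form, assumed only from how the text uses them: add(3, 7) evaluates to 10, and quadruple defines a local variable y inside a sequenced expression.

```scala
// Hypothetical definitions, reconstructed from how the text uses them.
def add(x: Int, y: Int): Int = x + y

// The body is a sequenced expression: y is defined first, and the value
// of the last expression, y + y, becomes the return value.
def quadruple(x: Int): Int = {
  val y = x + x
  y + y
}
```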
Conditionals
The first expression is the condition; the second expression is the true
branch; the last expression is the false branch.
On the other hand, people write code like below in languages like C.
int x;
if (true)
x = 1;
else
x = 2;
scala> if (true) {
| val x = 2
| x + x
| } else {
| val x = 3
| x * x
| }
val res11: Int = 4
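Because a conditional is an expression that evaluates to a value, it can initialize an immutable variable directly, without the assign-in-each-branch pattern of the C code. A small sketch with illustrative names:

```scala
// Illustrative names; not taken from the original text.
val n = 5

// Each conditional expression evaluates to the value of the chosen branch.
val sign = if (n < 0) -1 else 1
val parity = if (n % 2 == 0) "even" else "odd"
```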
Lists
scala> List(1, 2, 3)
val res12: List[Int] = List(1, 2, 3)
scala> 1 :: 2 :: 3 :: Nil
val res13: List[Int] = List(1, 2, 3)
List(...) is more convenient than :: for creating a new list from scratch.
However, :: is more flexible since it can prepend a new element in front
of an existing list.16
16: It does not mutate the existing list to prepend the new element. It
creates a new list with the element and the list.
scala> val l = List(1, 2, 3)
val l: List[Int] = List(1, 2, 3)
scala> 0 :: l
val res14: List[Int] = List(0, 1, 2, 3)
The length method computes the length of a list; parentheses are used
to fetch the element at a specific index.17
17: The first index is 0.
scala> l.length
val res15: Int = 3
scala> l(0)
val res16: Int = 1
scala> headOrZero(List())
val res18: Int = 0
Chapter 3 will show the use of pattern matching for lists in recursive
functions, and Chapter 5 will discuss pattern matching in detail.
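The definition of headOrZero does not appear above. Judging from its name and the call headOrZero(List()), which evaluates to 0, it could be written with pattern matching roughly as follows. This is an assumed reconstruction, not the original code.

```scala
// Hypothetical definition of headOrZero, inferred from its name and
// from the REPL call shown above.
def headOrZero(l: List[Int]): Int = l match {
  case Nil    => 0   // empty list: fall back to zero
  case h :: _ => h   // nonempty list: h is the head, the tail is ignored
}
```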
Tuples
A tuple contains two or more elements and maintains the order between
its elements. We use parentheses to create a new tuple:
To fetch the i-th element of a tuple, we can use ._i.21
21: The first index is 1.
Tuples look similar to lists but have important differences from lists. First,
a tuple’s elements can have different types, while a list’s elements cannot.
For example, a tuple of the type (Int, Boolean) has one integer and one
boolean, while a list of the type List[Int] can have only integers. We
say that tuples are heterogeneous, while lists are homogeneous. Second, a
list allows accessing an arbitrary index of a list, while a tuple does not.
For example, l(f()) is possible where l is a list and f returns an integer,
while there is no way to access the f()-th element of a tuple since the
return value of f is unknown before execution.
We use lists and tuples for different purposes. Lists are appropriate
when the number of elements can vary and an arbitrary index should be
accessible. For instance, a list should be used to represent a collection of
the heights of students in a certain class.
We can use ._1 to find the name, ._2 to find the height, and ._3 to check
whether one has paid.
We call a length-2 tuple a pair and a length-3 tuple a triple. Also, we can
consider unit as a length-0 tuple.
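The tuple operations described in this section can be sketched as follows. The concrete values are assumptions. Only the element types, a name, a height, and a payment flag, follow from the text.

```scala
// Illustrative values; the element types follow the description in the text.
val student: (String, Int, Boolean) = ("John Doe", 180, true)

val name = student._1     // first element: the name
val height = student._2   // second element: the height
val paid = student._3     // third element: whether one has paid

// A pair is a length-2 tuple; its elements may have different types.
val pair: (Int, String) = (1, "one")
```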
Maps
The type of a map whose keys have type T and values have type S is
Map[T, S].
scala> m(2)
val res21: String = two
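The definition of m is not shown above. A map for which m(2) evaluates to "two" might look like the following. The entries are assumptions made for this sketch.

```scala
// Hypothetical definition of m; only the lookup m(2) appears in the original.
val m: Map[Int, String] = Map(1 -> "one", 2 -> "two")

// Applying a map to a key returns the associated value.
val two = m(2)

// Maps are immutable by default: + builds a new map and leaves m unchanged.
val m2 = m + (3 -> "three")
```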
Classes and Objects
An object is a value with fields and methods. Fields store values, and
methods are operations related to the object. A class is a blueprint of
objects. We can easily create multiple objects of the same structure by
defining a single class. This book uses only “case” classes of Scala. Case
classes are similar to classes but more convenient; e.g., they automatically
support pretty printing and pattern matching.
The syntax of a class definition is as follows:
The first name is the name of a new class. The names inside the paren-
theses are the names of the fields of the class. A class definition must
specify the types of its fields.
scala> s.name
val res22: String = John Doe
scala> s.height
val res23: Int = 180
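A class definition and an object creation consistent with the field accesses above can be sketched as follows. The Student class matches the one given in the exercises of Chapter 3. The construction of s is an assumption based on the values printed by the REPL.

```scala
// Student mirrors the case class used in the exercises of Chapter 3.
case class Student(name: String, height: Int)

// A case class instance is created without the `new` keyword.
// The values are assumed from the REPL output shown above.
val s = Student("John Doe", 180)
```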
2.4 Interpreter
An interpreter is a program that takes source code as input and runs the
code. The Scala interpreter takes Scala source code as input. To use the
interpreter, we need to save source code into a file. Make a file with the
following code, and save it as Hello.scala.
println("Hello world!")
You can execute the interpreter by typing scala with the name of a file in
your command line. Here, we need to type scala Hello.scala.
$ scala Hello.scala
Hello world!
You can write multiple lines in a single file. Modify Hello.scala as below.
val x = 2
println(x)
val y = x * x
println(y)
$ scala Hello.scala
2
4
2.5 Compiler
object Hello {
def main(args: Array[String]): Unit = {
println("Hello world!")
}
}
You can make the compiler compile the code by typing scalac with the
name of the file in your command line.
$ scalac Hello.scala
After compilation, you will be able to find the Hello.class file in the
same directory. The file contains Java bytecode.
You can run the bytecode with the JVM by using the scala command. This
time, you should write only the class name.
$ scala Hello
Hello world!
You can change the behavior of a program by modifying the main method.
Each time you modify it, you need to recompile the program to regenerate
the bytecode.
Running bytecode is much more efficient than interpreting Scala source
code. You can easily notice that scala Hello takes much less time than
scala Hello.scala even though their results are the same.
Scala has two sorts of errors: compile-time errors and run-time errors.
Compile-time errors occur during compilation, i.e. while running scalac.
If the compiler finds things that might go wrong at run time, it raises
errors and aborts the compilation. For example, an expression adding
an integer to a boolean results in a compile-time error because such an
addition cannot succeed at run time.
true + 1
true + 1
^
Compile-time error
1 / 0
java.lang.ArithmeticException: / by zero
Run-time error
2.6 SBT
SBT is a build tool for Scala. Build tools help programmers work on large
projects with many files and libraries by tracking dependencies between
files and managing libraries. There are various build tools in the world,
and SBT is the most popular one for Scala.
You can create a new Scala project by the sbt new command.
The build.sbt file configures the project. It manages the version of Scala
used for the project, third-party libraries used in the project, and many
other things. Source files are in the src directory. Files in main are main
source files, while files in test are only for testing. You can add files into
the src/main/scala directory and edit them to write code.
An SBT console can be started by the sbt command. The current working
directory of your shell should be the base directory of the project.
$ sbt
[info] welcome to sbt 1.4.7
[info] loading global plugins from ~/.sbt/1.0/plugins
[info] loading project definition from ~/hello/project
[info] loading settings for project root from build.sbt ...
[info] set current project to hello (in build file:~/hello/)
You can compile, run, and test the project by executing SBT commands
in the console.
sbt:hello> compile
[info] compiling 1 Scala source to ~/hello/target/scala-2.13
3.1 Advantages
We will focus on the first two advantages: easier reasoning and no need
for defensive copies.
First, let us see why immutability makes things easy to reason about.
val x = 1
...
f(x)
var x = 1
...
f(x)
On the other hand, if x is a mutable variable, one should read every line
of code in the middle to find the value of x at the time when the function
call happens.
3 Immutability 21
val x = List(1, 2)
...
f(x)
...
x
import scala.collection.mutable.ListBuffer
val x = ListBuffer(1, 2)
...
f(x)
...
x
val x = ListBuffer(1, 2)
...
f(x)
...
x
val x = ListBuffer(1, 2)
val y = x.clone
...
f(y)
...
x
In cases where x has many elements and the code is executed multiple
times, copying x increases the execution time significantly.
In the code, using the clone method is enough to copy the list because
the list contains only integers. However, to pass lists containing mutable
objects safely to functions, defining additional methods for deep copy is
inevitable.
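The effect of a defensive copy can be sketched as follows. The function f is a hypothetical stand-in for any function that mutates its argument, since the original definition of f is not shown.

```scala
import scala.collection.mutable.ListBuffer

// f is a hypothetical function that mutates the list it receives.
def f(l: ListBuffer[Int]): Unit = {
  l += 3
}

val x = ListBuffer(1, 2)

f(x.clone())                 // f mutates only the copy
val afterCopy = x.toList     // x still holds 1 and 2

f(x)                         // without a copy, f changes x itself
val afterDirect = x.toList   // x now also holds 3
```

With an immutable List, neither the copy nor the caution is needed, because no function can change the list it receives.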
Immutability has several clear advantages and is an important
concept in functional programming. Functional programs use immutable
variables and data structures in most cases. If you write a large program
whose logic is complex and whose correctness is important, you should adopt
the functional paradigm. However, mind that immutability is not a
silver bullet for every program. For example, implementing algorithms
in a functional style is usually less efficient. It would be better to use
mutable data structures like arrays, mutable variables, and loops to
implement such algorithms, since they make programs considerably
faster. Choosing a proper programming paradigm for the purpose of a
program is the key to writing good code.
3.2 Recursion
}
res
}
Note that recursive functions always require explicit return types in Scala,
unlike non-recursive functions, whose return types can be omitted.
The recursive version is preferred over the imperative version since its
correctness is easily verified.
To check the correctness of the imperative factorial function, one should
find a loop invariant, which is a proposition that is always true at the loop
head. The loop invariant of this case is
$((i - 1)! = \mathit{res}) \land (i \le n + 1)$.
By using this invariant, we can conclude that $i = n + 1$ and, therefore,
$\mathit{res} = (i - 1)! = n!$ at the last line of the function, which implies
that it correctly implements factorial. It is nontrivial to find a proper loop
invariant and show that the loop invariant holds at the beginning of each
iteration.
On the other hand, recursive functions usually reveal their mathemat-
ical definitions more clearly than functions using loops. Consider the
following mathematical definition of factorial:
\[
n! =
\begin{cases}
1 & \text{if } n = 0 \\
n \times (n - 1)! & \text{otherwise}
\end{cases}
\]
You can see that the implementation of the factorial function using
recursion is identical to the mathematical definition of factorial. It is
almost trivial to show that the recursive factorial function is correct.
Recursion allows concise and intuitive descriptions of mathematical
functions. In many cases, functions with recursion are much easier to
verify, formally or informally, than functions with loops.
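A recursive factorial that follows the mathematical definition case by case can be sketched as below. The exact form of the original function is not shown, so the condition n <= 0 is an assumption, chosen so that the function also terminates for negative arguments.

```scala
// Recursive factorial; each branch corresponds to one case of the
// mathematical definition.
def factorial(n: Int): Int =
  if (n <= 0) 1                 // base case: 0! = 1
  else n * factorial(n - 1)     // recursive case: n! = n × (n − 1)!
```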
Recursive functions are also good at treating recursive data structures like
lists. A list is recursive since a nonempty list consists of a head element
and a tail list, which means that a nonempty list has another list as its
component. Writing some functions on lists helps in understanding
and practicing recursion.
The following function takes a list as an argument and returns a list
whose elements are one larger than the elements of the given list.
When a given list is empty, the function returns the empty list. Otherwise,
the return value is a list whose head is one larger than the head of the
given list and whose tail has elements that are one larger than the
elements of the tail of the given list.
Similarly, square takes a list of integers as an argument and returns a list
whose elements are the squares of the elements of the given list.
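The two list-transforming functions described above might be written as follows. The name inc1 is a hypothetical choice for the first function, while square is the name used in the text. Both bodies are assumed reconstructions.

```scala
// inc1 is a hypothetical name; square is the name used in the text.
def inc1(l: List[Int]): List[Int] = l match {
  case Nil    => Nil                    // empty list: nothing to increase
  case h :: t => (h + 1) :: inc1(t)     // increase the head, recur on the tail
}

def square(l: List[Int]): List[Int] = l match {
  case Nil    => Nil
  case h :: t => (h * h) :: square(t)   // square the head, recur on the tail
}
```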
For a nonempty list, the function checks whether the head is odd or not.
If the head is odd, the resulting list contains the head, and its tail has
only odd integers. Otherwise, the head is removed.
Similarly, positive takes a list of integers as an argument and returns a
list whose elements are positive.
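The filtering functions described above can be sketched in the same recursive style. The name odd is a hypothetical choice, positive is the name used in the text, and the bodies are assumptions.

```scala
// odd is a hypothetical name; positive is the name used in the text.
def odd(l: List[Int]): List[Int] = l match {
  case Nil => Nil
  case h :: t =>
    if (h % 2 != 0) h :: odd(t)   // keep an odd head
    else odd(t)                   // drop an even head
}

def positive(l: List[Int]): List[Int] = l match {
  case Nil => Nil
  case h :: t =>
    if (h > 0) h :: positive(t)   // keep a positive head
    else positive(t)              // drop zero and negative heads
}
```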
The sum of elements in the empty list is zero as there are no elements.
When a list is nonempty, the sum of its elements can be calculated by
adding the value of the head to the sum of its tail’s elements.
Similarly, product calculates the product of the elements of a given list.
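The two folding functions can be sketched as follows. The name sum is an assumption, product is the name used in the text, and the bodies are reconstructions. Returning 1 for the empty list in product is the usual convention for an empty product.

```scala
// sum is a hypothetical name; product is the name used in the text.
def sum(l: List[Int]): Int = l match {
  case Nil    => 0                // no elements: the sum is zero
  case h :: t => h + sum(t)       // head plus the sum of the tail
}

def product(l: List[Int]): Int = l match {
  case Nil    => 1                // assumed convention for the empty product
  case h :: t => h * product(t)   // head times the product of the tail
}
```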
3.3 Tail Call Optimization
If the last action of a function is a function call, then the call is a tail call.
When a tail call happens, the callee does every computation, and thus
the local variables of the caller have no need to remain after the call. The
stack frame of the caller can be destroyed. Most functional languages
exploit this fact to optimize tail calls. This optimization is called tail call
optimization. At compile time, compilers check whether calls are tail calls.
If a call is a tail call, the compilers generate code that eliminates the stack
frame of the caller before the call. They do not optimize non-tail function
calls because the local variables of the callers can be used after the callees
return. If every function call in a program is a tail call, the stack never
grows so that the program is safe from stack overflow.
The previous factorial function multiplies n and the return value of the
recursive factorial(n - 1) call. The multiplication is the last action.
The recursive call is not a tail call. The stack frame of the caller must
remain. The following process computes factorial(3):
- factorial(3)
- 3 * factorial(2)
- 3 * (2 * factorial(1))
- 3 * (2 * (1 * factorial(0)))
- 3 * (2 * (1 * 1))
- 3 * (2 * 1)
- 3 * 2
- 6
At most four stack frames coexist. For a large argument, the stack keeps
growing and finally overflows.
factorial(10000)
java.lang.StackOverflowError
at .factorial
Run-time error
- factorial(3)
- factorial(2, intermediate result = 3)
- factorial(1, intermediate result = 3 * 2)
- factorial(1, intermediate result = 6)
- factorial(0, intermediate result = 6 * 1)
- factorial(0, intermediate result = 6)
- 6
There is no need to return to the caller. The code below shows the
factorial function with a tail call. The function needs one more
parameter that takes an intermediate result as an argument.
factorial(n, i) computes n! × i.
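A tail-recursive factorial with an accumulator parameter can be sketched as below. It mirrors the BigInt version that appears later in this section but uses Int, which is an assumption based on the later remark about integer overflow.

```scala
// inter accumulates the intermediate result, so that
// factorial(n, i) computes n! × i.
def factorial(n: Int, inter: Int): Int =
  if (n <= 0) inter
  else factorial(n - 1, inter * n)   // tail call: nothing remains to do afterwards
```

Calling factorial(3, 1) follows the steps listed above, and the initial accumulator 1 makes the result equal to 3!.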
The function uses a tail call. More precisely, the function is tail-recursive:
its last action is calling itself. Unlike most functional languages, Scala
cannot optimize general tail calls. Scala optimizes only tail-recursive
calls. The Scala compiler generates Java bytecode, which is executed by
the JVM. The JVM does not allow bytecode to jump to the beginning
of another function. In the JVM, functions can only either return or
call functions. Therefore, the Scala compiler cannot generate optimized
code by removing the stack frame of a caller. Instead, it transforms
tail-recursive calls into loops. The factorial function is compiled to the
following bytecode:
16: istore_1
17: goto 0
20: ireturn
We can check that there is no function call at all.2 The function just jumps
to instructions inside the function. Due to the tail call optimization, the
function never incurs stack overflow.
2: invokevirtual is a function call instruction.
Even with tail recursion, the result is still incorrect because of integer
overflow.
assert(factorial(10000, 1) > 0)
The optimization of the Scala compiler not only prevents stack overflow
but also removes the overheads of function calls. The downside is that
mutually recursive functions using tail calls lie beyond the scope of
the optimization. Mutual recursion is recursion involving two or more
definitions. The following functions can cause stack overflow in Scala
even though they use tail calls because they are not tail-recursive:
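A typical pair of mutually recursive functions is sketched below. The names isEven and isOdd are assumptions, and the functions are intended for non-negative arguments only. Every recursive call is a tail call, but neither function calls itself, so the calls remain ordinary calls and each one adds a stack frame.

```scala
// Hypothetical mutually recursive functions, intended for n >= 0.
// Each call is in tail position, yet neither function is tail-recursive,
// because the callee is always the other function.
def isEven(n: Int): Boolean =
  if (n == 0) true else isOdd(n - 1)

def isOdd(n: Int): Boolean =
  if (n == 0) false else isEven(n - 1)
```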
import scala.annotation.tailrec
@tailrec def factorial(n: BigInt, inter: BigInt): BigInt =
if (n <= 0)
inter
else
factorial(n - 1, inter * n)
^
error:
could not optimize @tailrec annotated method factorial:
it contains a recursive call not in tail position
Compile-time error
The annotation does not affect the behavior of the resulting bytecode.
Regardless of the existence of the annotation, the compiler always opti-
mizes tail-recursive functions. Still, using the annotation is desirable to
prevent mistakes.
Calling the tail-recursive version of factorial needs the unnecessary
second argument. The code below defines a new factorial function
with one parameter and uses the tail-recursive one as a local function
inside the function.
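One way to hide the accumulator is sketched below. The local helper's name aux is an assumption, and its body follows the annotated BigInt version shown earlier in this section.

```scala
import scala.annotation.tailrec

def factorial(n: BigInt): BigInt = {
  // aux is a hypothetical name for the local tail-recursive helper.
  @tailrec def aux(n: BigInt, inter: BigInt): BigInt =
    if (n <= 0) inter
    else aux(n - 1, inter * n)

  aux(n, 1)   // start with 1 as the intermediate result
}
```

Callers now pass only n, and the annotation still makes the compiler check that the helper is tail-recursive.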
3.4 Exercises
1. Consider the following definition of Student:
case class Student(name: String, height: Int)
Implement a function names:
assert(g(f) == 0)
The function g has one parameter h. The type of h is Int => Int. An
argument passed to g is a function that receives one integer and returns
an integer. In Scala, => expresses the types of functions. Functions without
4 Functions 31
assert(f(0)(0) == 0)
The function f returns the function g. Since the return type of f is Int
=> Int, its return value must be a function that takes an integer as an
argument and returns an integer. g satisfies the condition. f(0) is the
same as g and therefore is a function. f(0)(0) equals g(0), which returns
0.
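Definitions consistent with this description can be sketched as follows. The original code is not shown, so the parameter names and the choice to define g locally inside f are assumptions.

```scala
// Hypothetical definitions consistent with the description above.
def f(y: Int): Int => Int = {
  def g(x: Int): Int = x
  g   // g is used where the function type Int => Int is expected
}

// The result of f(0) is a function value and can be stored and called.
val h0 = f(0)
```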
val h0 = f(0)
assert(h0(0) == 0)
A variable can refer to f(0). h0 refers to the return value of f(0) and
has type Int => Int. Calling variables referring to function values is
possible. h0(0) is a valid expression and results in 0.
val h1 = f
^
error: missing argument list for method f
val h1 = f _
assert(h1(0)(0) == 0)
Compiling the above code succeeds. The type of h1 is Int => (Int
=> Int). Int => Int => Int denotes the same type because => is a
right-associative type operator. h1(0)(0) is valid and yields 0.
When programmers use function names as values, they usually place the
names where function types are expected. In these cases, underscores
and explicit type annotations are unnecessary. Code rarely needs
underscores or type annotations like the above to force the
transformation.
How does the compiler create function values from function names?
If the parameter type of function f is Int, the corresponding function
value is (x: Int) => f(x). The transformation is called eta expansion.
(x: Int) => f(x) is a function value without a name and does the same
thing as f. The following section covers functions without names.
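The transformation can be sketched as follows. The names f, viaName, and viaEta are chosen here only for illustration.

```scala
def f(x: Int): Int = x + 1

// A function type is expected here, so the method name f is
// converted to a function value automatically.
val viaName: Int => Int = f

// The same function value written out by hand.
val viaEta: Int => Int = (x: Int) => f(x)

assert(viaName(1) == 2)
assert(viaEta(1) == 2)
```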
The code does similar things to the previous code but uses anonymous
functions.
Since g has a parameter of type Int => Int, the compiler expects x =>
x to have the type Int => Int. It infers the type of x as Int.
h has an explicit type annotation. Int => Int is the expected type of x
=> x. The compiler infers the type of x as Int.
val h = x => x
^
error: missing parameter type
Compile-time error
As intended, f(_) becomes x => f(x), whose type is Int => Int. [1]
[1] Actually, there is no need to write g(f(_)) because it is equal to g(f).
g1(f(_ + 1))
^
error: missing parameter type for expanded function
((<x$1: error>) => x$1.$plus(1))
Compile-time error
On the other hand, f(_ + 1) becomes f(x => x + 1) but not x => f(x
+ 1). As f takes an integer, not a function, it results in a compile-time
error.
f(_) + _ becomes (x, y) => f(x) + y, whose type is (Int, Int) =>
Int, and the compilation succeeds.
g2(f(_ + 1) + _)
^
error: missing parameter type for expanded function
((<x$2: error>) => f(((<x$1: error>) => x$1.$plus(1)))
.<$plus: error>(x$2))
Compile-time error
As with the inference of parameter types, novices may not be sure how
anonymous functions with underscores are transformed. Those who are not
confident about the mechanism of underscores are advised to use ordinary
anonymous functions without underscores.
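For reference, two simple cases where the expansion is unambiguous are sketched here. The names l and add are illustrative.

```scala
val l = List(1, 2, 3)

// _ + 1 is shorthand for x => x + 1.
assert(l.map(_ + 1) == l.map(x => x + 1))

// Each underscore stands for a distinct parameter:
// _ + _ is shorthand for (x, y) => x + y.
val add: (Int, Int) => Int = _ + _
assert(add(1, 2) == 3)
```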
4.3 Closures
Closures are function values that capture environments, which store the
values of existing variables, when they are defined. The bodies of closures
may have variables not defined in themselves, and the environments
store the values of those variables.
add1 and add2 refer to the same adder function, but the former returns
an integer one larger than an argument, and the latter returns an integer
two larger than an argument. The results of add1(2) and add2(2) are
3 and 4, respectively. It is possible because the closures capture the
environments when they are created. add1 refers to a thing like (adder,
x = 1) instead of just adder. Similarly, add2 is actually (adder, x = 2).
Since the environment of add1 stores the fact that x is 1, add1(2) results
in 3. Under the environment of add2, x denotes 2, and thus x + y is 4
when y is 2.
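The situation described above can be sketched as follows. The enclosing function name makeAdder is an assumption made for illustration; adder, add1, and add2 follow the names used in the text.

```scala
def makeAdder(x: Int): Int => Int = {
  // adder mentions x, which is not defined inside adder itself.
  def adder(y: Int): Int = x + y
  adder
}

val add1 = makeAdder(1) // captures x = 1
val add2 = makeAdder(2) // captures x = 2

assert(add1(2) == 3)
assert(add2(2) == 4)
```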
inc1 increases every element of a given list by one, and square squares
every element. The two functions are remarkably similar. To make the
similarity clearer, let us rename the functions to g.
This function is called map. The returned list has elements obtained by
mapping a given function to the elements of a given list.
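A sketch of map for lists of integers, with inc1 and square rewritten on top of it, may look like this. It is one possible formulation, not necessarily the exact code of the original.

```scala
def map(l: List[Int], f: Int => Int): List[Int] = l match {
  case Nil => Nil
  case h :: t => f(h) :: map(t, f)
}

def inc1(l: List[Int]): List[Int] = map(l, h => h + 1)
def square(l: List[Int]): List[Int] = map(l, h => h * h)

assert(inc1(List(1, 2, 3)) == List(2, 3, 4))
assert(square(List(1, 2, 3)) == List(1, 4, 9))
```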
They look similar. They can become identical by renaming and adding
parameters.
def foldRight(
l: List[Int],
n: Int,
f: (Int, Int) => Int
): Int = l match {
case Nil => n
case h :: t => f(h, foldRight(t, n, f))
}
def foldLeft(
l: List[Int],
n: Int,
The order of traversing a list does not affect the results of sum and product.
Both foldRight and foldLeft can express the functions.
On the other hand, the order is important for some functions. Consider a
function that takes a list of digits as arguments and returns the decimal
number obtained by concatenating the digits. foldLeft is the easiest
way to implement this function.
foldLeft(List(1, 2, 3), 0, f)
= f(f(f(0, 1), 2), 3)
= ((0 * 10 + 1) * 10 + 2) * 10 + 3
= (1 * 10 + 2) * 10 + 3
= 12 * 10 + 3
= 123
Using foldRight with the same arguments yields a completely different
result.
foldRight(List(1, 2, 3), 0, f)
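The two folds can be compared directly in a small sketch. The definition of foldLeft and the function f (which appends a digit d to the number r built so far) are reconstructed here from the expansion shown above, so they should be read as assumptions.

```scala
def foldRight(l: List[Int], n: Int, f: (Int, Int) => Int): Int = l match {
  case Nil => n
  case h :: t => f(h, foldRight(t, n, f))
}

def foldLeft(l: List[Int], n: Int, f: (Int, Int) => Int): Int = l match {
  case Nil => n
  case h :: t => foldLeft(t, f(n, h), f)
}

def f(r: Int, d: Int): Int = r * 10 + d

// f(f(f(0, 1), 2), 3)
assert(foldLeft(List(1, 2, 3), 0, f) == 123)
// f(1, f(2, f(3, 0))) = f(1, f(2, 30)) = f(1, 50)
assert(foldRight(List(1, 2, 3), 0, f) == 60)
```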
map, filter, foldRight, and foldLeft are powerful functions. The four
functions offer concise implementation for many procedures dealing
with lists. Since they are so useful, the Scala standard library provides
map, filter, foldRight, and foldLeft as the methods of the List class.
You do not need to implement map, filter, foldRight, and foldLeft
by yourself.
map(l, f) can be rewritten to l.map(f) by using the map method in-
stead.
The methods in the standard library are polymorphic, i.e. they can take
arguments of various types. In contrast, our map function takes only a
list of integers. To use map for a list of students, we need to define a new
version of map. However, the map method in the standard library can take
lists of any type as arguments.
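As a brief illustration, the same standard library methods work on a list of integers and on a list of students. The sample students are made-up data.

```scala
case class Student(name: String, height: Int)

val students = List(Student("Alice", 170), Student("Bob", 180))

// The same map method works for lists of any element type.
assert(List(1, 2, 3).map(_ * 2) == List(2, 4, 6))
assert(students.map(_.name) == List("Alice", "Bob"))

// filter and foldLeft are methods of List as well.
assert(students.filter(_.height > 175).map(_.name) == List("Bob"))
assert(List(1, 2, 3).foldLeft(0)(_ + _) == 6)
```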
The standard library provides many other useful methods for lists. [2]
[2] https://www.scala-lang.org/api/current/scala/collection/immutable/List.html
Scala has for loops. In Scala, a for loop is an expression, which evaluates
to a value. For expressions are highly expressive. Unlike while, which
works with mutable variables or objects, for in Scala helps programmers
write code in a functional and readable way.
The syntax of a for expression is as follows:
For this reason, for expressions are powerful. Any user-defined types can
appear in for expressions if the types define map.
For expressions can replace use of the filter method as well.
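The correspondence between for expressions and the map and filter methods can be sketched as follows, with an arbitrary sample list l.

```scala
val l = List(1, 2, 3, 4)

// A for expression with yield corresponds to map.
val doubled = for (n <- l) yield n * 2
assert(doubled == l.map(_ * 2))

// A guard (if ...) plays the role of filter.
val evens = for (n <- l if n % 2 == 0) yield n
assert(evens == l.filter(_ % 2 == 0))
```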
4.6 Exercises
1. Implement a function incBy:
def incBy(l: List[Int], n: Int): List[Int] = ???
that takes a list of integers and an integer as arguments and increases
every element of the list by the given integer. Use the map method.
2. Implement a function gt:
def gt(l: List[Int], n: Int): List[Int] = ???
that takes a list of integers and an integer as arguments and filters
elements less than or equal to the given integer out from the list.
Use the filter method.
3. Implement a function append:
def append(l: List[Int], n: Int): List[Int] = ???
that takes a list of integers and an integer as arguments and returns
a list obtained by appending the integer at the end of the list. Use
the foldRight method.
4. Implement a function reverse:
def reverse(l: List[Int]): List[Int] = ???
that takes a list of integers and returns a list obtained by reversing
the order between the elements. Use the foldLeft method.
Pattern Matching 5
This section explains pattern matching of Scala. Pattern matching is one
of the key features of functional programming. It helps programmers
handle complex, but structured data. We have already used a simple
form of pattern matching for lists. This section discusses the benefits of
pattern matching and various patterns available in Scala. In addition,
it will introduce the option type, which is widely-used in functional
programming.
5.1 Algebraic Data Types
It is common to include values of various shapes in a single type.
A natural number is
- zero or
- the successor of a natural number.
A list is
- the empty list or
- a pair of an element and a list.
A binary tree is
An arithmetic expression is
- a number,
- the sum of two arithmetic expressions, or
- the difference of two arithmetic expressions.
- a variable,
- a function, which is a pair of a variable and an expression, or
- a function application, which is a pair of two expressions.
- a number,
- the sum of two arithmetic expressions, or
- the difference of two arithmetic expressions.
type ae =
| Num of int
| Add of ae * ae
| Sub of ae * ae
Scala does not provide a direct way to define ADTs. Instead, Scala
provides traits and classes, which are more general mechanisms to define
new types, and programmers can express ADTs with traits and classes.
A new type can be defined as a trait. The syntax of a trait definition is as
follows:
trait [name]
It defines a type whose name is [name]. The following code defines the
AE type, which is the type of arithmetic expressions:
sealed trait AE
The sealed modifier prevents AE from being extended outside the file that
defines AE. We will get back to this point when we discuss the exhaustivity
checking of pattern matching.
Once a type is defined as a trait, the type can be used just like any other
types. For example, we can define an identity function for arithmetic
expressions.
However, traits do not have the ability to construct new values. It means that
there is no way to create a value of the type AE yet. We need to define the
variants of AE as case classes by extending AE.
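A sketch of such definitions follows. It uses the same class and field names that appear later in the book for AE.

```scala
sealed trait AE
case class Num(value: Int) extends AE
case class Add(left: AE, right: AE) extends AE
case class Sub(left: AE, right: AE) extends AE

// Fields are accessible when the static type is the case class itself.
assert(Num(10).value == 10)
assert(Add(Num(1), Num(2)).left == Num(1))
```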
val n = Num(10)
val m = Num(5)
val e1 = Add(n, m)
val e2 = Sub(e1, Num(3))
Like traits, case classes also define types. The name of each class is the
name of the defined type. Every instance of a class belongs to the type
corresponding to the class.
In addition, because of the extends keyword, Num, Add, and Sub are
subtypes of AE. It means that any value of the types Num, Add, or Sub is
also a value of the type AE.
val n: AE = Num(10)
val m: AE = Num(5)
val e1: AE = Add(n, m)
val e2: AE = Sub(e1, Num(3))
We know that we can access the fields of objects with their names.
However, we cannot access the fields of an object when its type becomes
AE.
val m: AE = Num(10)
m.value
^
error: value value is not a member of AE
Compile-time error
The reason is that m can be Add or Sub, which do not have the field value,
as AE consists of not only Num but also Add and Sub. The compiler thinks
that m may not have the field value and considers m.value as an unsafe
expression, which should be rejected.
The best way to use ADTs is pattern matching. The following function
evaluates a given arithmetic expression and returns the number denoted
by the arithmetic expression.
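A minimal sketch of such an evaluator is shown here. The AE definitions are repeated so that the snippet is self-contained, and the pattern variable names n, l, and r are arbitrary.

```scala
sealed trait AE
case class Num(value: Int) extends AE
case class Add(left: AE, right: AE) extends AE
case class Sub(left: AE, right: AE) extends AE

def eval(e: AE): Int = e match {
  case Num(n) => n
  case Add(l, r) => eval(l) + eval(r)
  case Sub(l, r) => eval(l) - eval(r)
}

// (10 + 5) - 3
assert(eval(Sub(Add(Num(10), Num(5)), Num(3))) == 12)
```

Each case both checks the variant and binds its fields, so no explicit type test or cast is needed.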
The list type is another good example of an ADT. The Scala standard
library defines lists similar to the following code:
This code omits some details but clearly shows the high-level idea to
define lists. [1] A list is either the empty list or a nonempty list, which is a
pair of its head and tail. Nil is defined as a case object, not a case class,
since there is only one empty list. Every empty list is identical. We use a
case object to express this idea. Nil is created only once during the entire
execution, and every Nil is identical. The name :: looks a bit weird,
but it is for readability of pattern matching. Scala allows writing class
names as infix operators in patterns. It means that both case ::(h, t)
=> and case h :: t => are allowed. Due to the class name ::, we can
write case h :: t => in pattern matching.
[1] We will not see what [+A] and Nothing are here. You can understand the
overall ADT structure without knowing those concepts.
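The same structure can be imitated with a simplified list of integers. This is only an analogy to the standard library definition: the names IntList, Empty, Cons, and length are hypothetical, and type parameters are deliberately left out.

```scala
sealed trait IntList
// There is only one empty list, so a case object suffices.
case object Empty extends IntList
// A nonempty list is a pair of its head and tail.
case class Cons(head: Int, tail: IntList) extends IntList

def length(l: IntList): Int = l match {
  case Empty => 0
  case Cons(_, t) => 1 + length(t)
}

assert(length(Cons(1, Cons(2, Empty))) == 2)
```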
5.2 Advantages
Conciseness
else if (e.isInstanceOf[Add]) {
val e0 = e.asInstanceOf[Sub]
eval(e0.left) + eval(e0.right)
}
Exhaustivity Checking
eval(Num(3))
^
warning: match may not be exhaustive.
It would fail on the following input: Num(_)
Compile-time warning
The compiler warns programmers that the patterns are not exhaustive.
Moreover, it reports precisely which patterns are missing, which helps
debugging. Exhaustivity checking is beneficial for complex programs.
It helps programmers avoid missing cases and thus is a crucial
strength of pattern matching.
For exhaustivity checking, the sealed modifier of traits is necessary.
Without sealed, a trait can be extended outside the file that defines
it. The unit of compilation is a single file, so it is impossible to find
all the variants by scanning a single file when a trait is not sealed.
Exhaustivity checking during pattern matching would then be impossible.
The sealed keyword resolves the problem. Since sealed traits cannot be
extended outside the files that define them, it is enough to check only the
file that defines a sealed trait to find every variant of the trait. It is
why we use sealed traits to define ADTs.
Reachability Checking
When code is simple and short, it is easy to check whether there are un-
reachable patterns. However, in complex code, programmers often insert
unreachable patterns by mistake and make critical bugs. Reachability
checking of the compiler is an important feature to prevent such bugs.
assert(grade(85) == "B")
Or Patterns
assert(grade(100) == "A")
Nested Patterns
Type Patterns
Type patterns are useful for dynamic type checking. The following
function takes any value as an argument and checks whether it is a string
or not. [2]
[2] Every type is a subtype of Any, i.e. every value belongs to Any.
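A sketch of a function of this kind is given here. The name isString is an assumption made for illustration.

```scala
def isString(x: Any): Boolean = x match {
  case _: String => true // matches only when x is a String at run time
  case _ => false
}

assert(isString("abc"))
assert(!isString(42))
```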
^
warning: non-variable type argument String in type pattern
List[String] is unchecked since it is eliminated by erasure
Compile-time warning
Tuple Patterns
Pattern Guards
The function add takes a tree and an integer as arguments and returns a
tree obtained by adding the integer to the tree. If the integer is an element
of the given tree, the tree itself is the return value.
The patterns in the above code are not exhaustive, but the compiler does
not warn programmers about the inexhaustivity.
The function remove takes a tree and an integer as arguments and returns
a tree obtained by removing the integer from the tree. If the integer is not
an element of the tree, the given tree itself is the return value. removeMin
is a helper function used by remove. It returns the pair of the smallest
element of a given tree and a tree obtained by removing the element
from the tree.
Variable Definitions
assert(n == 1 && m == 2)
val h :: t = List(1, 2, 3, 4)
assert(h == 1 && t == List(2, 3, 4))
Anonymous Functions
The function toSum takes a list of pairs of two integers as arguments and
returns a list whose elements are the sums of the integers in the pairs.
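One way to write toSum with a pattern-matching anonymous function is sketched here.

```scala
def toSum(l: List[(Int, Int)]): List[Int] =
  // { case (n, m) => ... } is an anonymous function that destructures
  // each pair directly.
  l.map { case (n, m) => n + m }

assert(toSum(List((1, 2), (3, 4))) == List(3, 7))
```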
For Loops
5.5 Options
try {
get(List(1, 2), 2)
} catch {
case e: Exception =>
// prints "index out of bounds"
println(e.getMessage)
}
get(List(1, 2), 2)
The Scala compiler does not check whether exceptions are handled
properly. It means that there will not be any compile-time error even if
there is a possibility of unhandled exceptions.
null
^
error: an expression of type Null is ineligible
for implicit conversion
else
get(t, n - 1)
}
The strategy has an obvious problem. The caller cannot distinguish two
situations:
- The list contains -1.
- The index is invalid.
An option that may have a value of type T has type Option[T]. An option
is either None or Some. None is a value that does not denote any value
and is similar to null. It indicates a problematic situation. Like Nil, it is
defined as a case object because every None is identical. Some constructs
a value that denotes that a value exists. It is similar to a reference to a
real object and indicates that computation has succeeded.
The following code defines getOption, which returns an option.
For an invalid index, the return value is None. The caller can tell
from None that the operation has failed. Otherwise, the function packs an
element inside Some to make the return value.
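A sketch of getOption along these lines follows. Its structure mirrors the recursive get discussed earlier, but the exact body is an assumption.

```scala
def getOption(l: List[Int], n: Int): Option[Int] =
  if (n < 0) None
  else l match {
    case Nil => None
    case h :: t =>
      if (n == 0) Some(h)
      else getOption(t, n - 1)
  }

assert(getOption(List(1, 2), 1) == Some(2))
assert(getOption(List(1, 2), 2) == None)
// A list containing -1 is no longer confused with a failure.
assert(getOption(List(-1), 0) == Some(-1))
```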
The Scala standard library uses options in many places. Various methods
return options. For example, headOption of a list returns None when the
list is empty. Otherwise, Some containing the head of the list is returned.
assert(List().headOption == None)
assert(List(1).headOption == Some(1))
Also, get of a map returns None when the map does not have a given
key. Otherwise, Some containing the value corresponding to the key is
returned.
def getHeight(
m: Map[String, Student],
name: String
): Option[Int] =
m.get(name).map(_.height)
def getStudent(
l: List[String],
m: Map[String, Student]
): Option[Student] =
l.headOption.flatMap(m.get)
The standard library provides many other useful methods for options. [5]
[5] https://www.scala-lang.org/api/current/scala/Option.html
Untyped Languages
Syntax and Semantics 6
This chapter is about syntax and semantics.
Syntax of a programming language decides the appearance of the
language. Syntax consists of concrete syntax and abstract syntax. While
concrete syntax describes programs as strings, abstract syntax describes
the structures of programs as trees. Parsing is the process bridging the
gap between concrete syntax and abstract syntax. A string is transformed
to a tree by parsing. This chapter explains concrete syntax, abstract syntax,
and parsing.
Semantics of a programming language determines the behavior of each
program. This chapter explains how we can define the semantics of a
language. In addition, we will see what syntactic sugar is.
People write programs with strings. Some strings are valid programs,
while other strings are not. For example, consider the following code:
println()
It is a valid Scala program. On the other hand, the following code is not
a valid Scala program.
println(
P⊆S
The vertical bars in the right hand side separate distinct expressions.
The expressions define the set denoted by the nonterminal. The union
of the sets denoted by the expressions equals the set denoted by the
nonterminal. For example, the following definition makes <digit> denote
{"0" , "1" , "2" , "3" , "4" , "5" , "6" , "7" , "8", "9"}:
From now on, we are going to define the concrete syntax of a tiny language
named AE to show example usage of BNF. AE stands for arithmetic
expressions. Its features are limited to addition and subtraction of decimal
integers.
AE programs should be able to express decimal integers. Thus, the
following strings should be programs of AE:
- "0"
- "1"
- "-10"
- "42"
- "0+1"
- "-2-1"
- "1+-3+42"
- "4-3+2-1"
First, we can define the set of every string that represents a decimal
integer in BNF. It can be done with the following definitions:
We know that <digit> denotes {"0", "1", "2", "3", "4", "5", "6", "7",
"8", "9"}. Since <digit> is one way to construct an element of <nat>,
every string denoted by <digit> is also a string of <nat>. Hence, {"0",
"1", "2", "3", "4", "5", "6", "7", "8", "9"} is a subset of the set denoted
by <nat>. At the same time, <digit> <nat> is the other way to construct
an element. We can make a new element of <nat> by selecting strings
from <digit> and <nat>, respectively. For example, <digit> can denote
"1", and <nat> can denote "0". Therefore, "10" is an element of <nat>.
By repeating this process, we can construct infinitely many strings, e.g.
"110" by concatenating "1" and "10", "1110" by concatenating "1" and
"110", and so on. In the end, we can conclude that <nat> denotes the set
of every nonempty string that consists of the characters from ’0’ to ’9’,
i.e. every string that represents a decimal natural number. [1]
[1] This book considers zero as a natural number.
Finding the set denoted by <number> is easier. Since <nat> is one way to
construct an element of <number>, the set denoted by <nat> is a subset
of the set denoted by <number>. The expression "-" <nat> is the other
way to construct an element. It implies that if we concatenate "-" and
a string denoted by <nat>, we can get a new element of <number>. In
conclusion, <number> denotes the set of every string that represents a
decimal integer.
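The sets described above can be mirrored by membership tests on strings. The following is a sketch based on the prose description of <digit>, <nat>, and <number>. The function names isDigit, isNat, and isNumber are hypothetical.

```scala
// <digit>: one of the characters '0' to '9'
def isDigit(c: Char): Boolean = '0' <= c && c <= '9'

// <nat>: a nonempty sequence of digits
def isNat(s: String): Boolean = s.nonEmpty && s.forall(isDigit)

// <number>: a <nat>, or "-" followed by a <nat>
def isNumber(s: String): Boolean =
  isNat(s) || (s.startsWith("-") && isNat(s.drop(1)))

assert(isNumber("42"))
assert(isNumber("-10"))
assert(!isNumber("--1"))
```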
It is enough to add only addition and subtraction to complete the
definition of the concrete syntax.
In a similar way, we can figure out which set is denoted by <expr>. The
set includes every string that represents an arithmetic expression consisting
of decimal integers, addition, and subtraction. We can say that <expr>
defines P of AE, and the concrete syntax of AE is defined now.
Defining syntax solely with concrete syntax has problems from both
language users’ and language designers’ perspectives.
Programmers usually learn multiple languages. Languages considerably
vary in their concrete syntax. Consider a function that takes two integers
as arguments and returns the sum of the integers. We can implement the
function in four different languages like below.
- Python
- JavaScript
function add(n, m) {
  return n + m;
}
- Racket
- OCaml
let add n m = n + m
They look so different even though they define the same function. The
keyword to define a function is def in Python, function in JavaScript,
define in Racket, and let in OCaml. It is not the only difference. Python
and JavaScript need parentheses and commas for parameters, while
Racket and OCaml do not. JavaScript puts function bodies inside curly
braces. Racket treats + as a prefix operator, while the others treat +
as an infix operator. These differences are the differences between the
concrete syntax of each language. Various forms of concrete syntax hinder
programmers from learning multiple languages easily.
However, their structures are quite the same. In every language, a function
definition consists of a name, parameters, and a body expression. The
above example defines a function whose name is add, parameters are n
and m, and body is an expression that adds n and m. In every language in
the example, an addition expression consists of two operands. The body
expression uses n and m as the operands of the addition.
Thus, programmers should focus on the structures of programs, rather
than strings per se, to learn multiple languages easily. The structures
remain the same even when the strings vary.
At the same time, concrete syntax cares about tedious details that language
designers want to ignore. For example, both "2+1" and "02+001" are AE
programs. They are different strings but represent the same arithmetic
expression: 2 + 1. When the designers of AE define logic to evaluate
arithmetic expressions, distinction between 2 + 1 and 2 − 1 is important,
but distinction between "2+1" and "02+001" is completely unnecessary.
The designers want to focus only on the structures of programs but not
strings.
For both programmers and designers, concrete syntax is problematic
because it describes only strings and does not give good abstraction
of structures even though people want to focus on the structures. Of
course, we cannot discard the notion of concrete syntax. Everyone writes
programs as strings, and concrete syntax is essential for that step. At the
same time, we need a way to describe the structures of programs without
being affected by differences in strings. To meet the need, we introduce
another notion of syntax: abstract syntax. Concrete syntax and abstract
syntax are complementary. They collectively construct the syntax of a
language.
Abstract syntax describes the structure of a program as a tree. A pro-
gram consists of multiple components. Each component consists of
subcomponents again. Trees formally express such recursive structures.
A component can be represented as a tree whose root describes the sort
of a component and children are the trees representing the subcompo-
nents.
As an example, let us express the function add as a tree. The function
definition has four components: the name add, the first parameter n,
the second parameter m, and the body expression. The following tree
represents the function definition:
FunDef
add n m
The root of the tree is the symbol FunDef, which explains that this tree
represents a function definition. The tree has four children: add, n, m, and
the body expression. We do not know how to draw the tree representing
the body expression yet.
The body expression is an addition expression. It has two components:
the operands of the addition.
FunDef
add n m Add
The root of the tree is Add as the expression is addition. It has two
children: the operand expressions.
The first operand expression consists of a single component: n.
FunDef
add n m Add
Name
The root of the tree is Name, since the expression is just a name. The only
child is n.
The second operand expression can be similarly represented as a tree.
FunDef
add n m Add
Name Name
n m
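If n ∈ Z, then the tree with root Num and the single child n is in E.
If e1, e2 ∈ E, then the tree with root Add and the children e1 and e2 is in E.
Subtraction is similar.
If e1, e2 ∈ E, then the tree with root Sub and the children e1 and e2 is in E.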
By collecting all the above facts, we can define the abstract syntax of AE
as the smallest set E satisfying the following conditions.
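- If n ∈ Z, then the tree with root Num and the single child n is in E.
- If e1, e2 ∈ E, then the tree with root Add and the children e1 and e2 is in E.
- If e1, e2 ∈ E, then the tree with root Sub and the children e1 and e2 is in E.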
        Add
       /   \
     Sub    Num
    /   \     |
  Num   Num   3
   |     |
   5     1
sealed trait AE
case class Num(value: Int) extends AE
case class Add(left: AE, right: AE) extends AE
case class Sub(left: AE, right: AE) extends AE
Num(n) corresponds to the tree with root Num and the single child n.
Add(e1, e2) corresponds to the tree with root Add and the children e1 and e2.
Sub(e1, e2) corresponds to the tree with root Sub and the children e1 and e2.
- If n ∈ Z, then n ∈ E.
- If e1, e2 ∈ E, then e1 + e2 ∈ E.
- If e1, e2 ∈ E, then e1 − e2 ∈ E.
Even though the notations themselves do not look like trees at all, they
still represent ASTs. Also, symbols like + and − do not have any meaning.
It is extremely important to keep these points in your mind. Otherwise,
you will mix abstract syntax using notations up with concrete syntax in
the end.
Notations are just notations. You can define different notations and use
them. For example, one may use ADD e1 e2 instead of e1 + e2 to represent
addition. You can freely choose notations, but once you define them, you
should consistently use them not to make other people confused.
To make the definition of abstract syntax more concise, we apply BNF to
the definition of abstract syntax. We can redefine the abstract syntax of
AE with BNF:
e ::= n | e + e | e − e
Add Sub
or
Num Num 2 3 Num Num
3 1 1 2
6.3 Parsing
parse : S ⇀ E
Partial functions
A partial function from a set A to a set B is a function from a subset S
of A to B. S is called the domain of definition, or just domain in short,
of the partial function. While A → B is the set of functions from A to B,
A ⇀ B is the set of partial functions from A to B.
Let f be a partial function from A to B. Then, there can be a ∈ A such
that f(a) is undefined. From a programmer's perspective, f can be
interpreted as a function from A to Option[B], where None means
that the image is undefined and Some(b) means that the image is b.
The result of parse is undefined when an input does not belong to P (the
set of every program). That is why parse is a partial function. When an
input belongs to P , parse results in its corresponding AST.
Consider the parser of AE. The results of parse are undefined for the
following strings as they are not AE programs:
- 1+
- 2*4
- 0++3
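Viewing parse as a function into Option, a small hand-written parser for AE can be sketched as follows. This is an illustrative sketch rather than the parser used in the course: the helper names number and rest are hypothetical, operators are grouped to the left (a choice, since the grammar alone does not fix the grouping), and every number is assumed to fit in Int.

```scala
sealed trait AE
case class Num(value: Int) extends AE
case class Add(left: AE, right: AE) extends AE
case class Sub(left: AE, right: AE) extends AE

// Reads a <number> starting at index i.
// Returns the AST and the index right after the number.
def number(s: String, i: Int): Option[(AE, Int)] = {
  val start = if (i < s.length && s(i) == '-') i + 1 else i
  val end = s.indexWhere(c => c < '0' || c > '9', start) match {
    case -1 => s.length
    case j => j
  }
  if (end == start) None // no digit was found
  else Some((Num(s.substring(i, end).toInt), end))
}

// Extends the tree left with each following "+ number" or "- number".
def rest(s: String, left: AE, i: Int): Option[AE] =
  if (i == s.length) Some(left)
  else s(i) match {
    case '+' =>
      number(s, i + 1).flatMap { case (r, j) => rest(s, Add(left, r), j) }
    case '-' =>
      number(s, i + 1).flatMap { case (r, j) => rest(s, Sub(left, r), j) }
    case _ => None
  }

def parse(s: String): Option[AE] =
  number(s, 0).flatMap { case (e, i) => rest(s, e, i) }

assert(parse("-1+2") == Some(Add(Num(-1), Num(2))))
assert(parse("1+") == None)
assert(parse("2*4") == None)
assert(parse("0++3") == None)
```

Strings outside P yield None, which corresponds to the result of parse being undefined.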
Add
−1 2
6.4 Semantics
Rule Num
n evaluates to n.
- 1 evaluates to 1.
- 5 evaluates to 5.
Rule Add
If e1 evaluates to n1, and e2 evaluates to n2,
then e1 + e2 evaluates to n1 +Z n2.
Rule Sub
If e1 evaluates to n1, and e2 evaluates to n2,
then e1 − e2 evaluates to n1 −Z n2.
These three rules are all of the semantics of AE. We now know the behavior
of every AE program. For example, consider (3 − 1) + 2. The following
steps show why (3 − 1) + 2 evaluates to 4.
eval : E → Z
Binary relations
⇒ ⊆ E × Z
Rule Num
n ⇒ n.
Rule Add
If e1 ⇒ n1 and e2 ⇒ n2,
then e1 + e2 ⇒ n1 +Z n2.
Rule Sub
If e1 ⇒ n1 and e2 ⇒ n2,
then e1 − e2 ⇒ n1 −Z n2.
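\[ n \Rightarrow n \quad [\textsc{Num}] \]
\[ \frac{e_1 \Rightarrow n_1 \quad e_2 \Rightarrow n_2}{e_1 + e_2 \Rightarrow n_1 +_{\mathbb{Z}} n_2} \quad [\textsc{Add}] \]
\[ \frac{e_1 \Rightarrow n_1 \quad e_2 \Rightarrow n_2}{e_1 - e_2 \Rightarrow n_1 -_{\mathbb{Z}} n_2} \quad [\textsc{Sub}] \]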
As you can see, the rules are much clearer and more concise than the
rules written in a natural language.
We can prove (3 − 1) + 2 ⇒ 4 with the rules. We usually draw a proof
tree when we prove a proposition with inference rules. A proof tree is a
tree whose root is the proposition to be proven. Each node of the tree is a
proposition, and the child nodes of a node are evidence supporting
that the proposition of the node is true. Unlike most trees in computer
science, we place the root of a proof tree at the bottom. Every node is
placed below its children.
The following proof tree proves 3 ⇒ 3.
3⇒3
The tree has only the root node because Rule Num does not have any
premises.
Similarly, the following proof tree proves 1 ⇒ 1.
1⇒1
We draw the following proof tree with Rule Sub and the above trees to
prove 3 − 1 ⇒ 2.
\[ \frac{3 \Rightarrow 3 \quad 1 \Rightarrow 1}{3 - 1 \Rightarrow 2} \]
2⇒2
\[ \frac{\dfrac{3 \Rightarrow 3 \quad 1 \Rightarrow 1}{3 - 1 \Rightarrow 2} \quad 2 \Rightarrow 2}{(3 - 1) + 2 \Rightarrow 4} \]
To explain what proof trees are, we have drawn the proof tree from its
leaf nodes. However, we usually draw a proof tree from the root node.
We start by drawing a horizontal line and writing the program we want
to evaluate.
\[ \frac{\dfrac{3 \Rightarrow 3 \quad 1 \Rightarrow 1}{3 - 1 \Rightarrow 2} \quad 2 \Rightarrow 2}{(3 - 1) + 2 \Rightarrow 4} \]
Then, we find which inference rule can be applied. In this case, we can
use Rule Add since the program is addition.
\[ \frac{\dfrac{3 \Rightarrow 3 \quad 1 \Rightarrow 1}{3 - 1 \Rightarrow 2} \quad 2 \Rightarrow 2}{(3 - 1) + 2 \Rightarrow 4} \]
Similarly, 1 ⇒ 1.
\[ \frac{\dfrac{3 \Rightarrow 3 \quad 1 \Rightarrow 1}{3 - 1 \Rightarrow 2} \quad 2 \Rightarrow 2}{(3 - 1) + 2 \Rightarrow 4} \]
e ::= n | e + e | e − e | −e
Rule Neg
If e evaluates to n ,
then −e evaluates to −Z n .
\[ \frac{e \Rightarrow n}{-e \Rightarrow -_{\mathbb{Z}} n} \quad [\textsc{Neg}] \]
6.6 Exercises
1. Consider the following concrete syntax:
e1 ∈ <milk> e2 ∈ <coffee>
espresso ∈ <coffee>
e1 "on" e2 ∈ <coffee>
val x = 3
println(x)
7.1 Identifiers
f(0)
def f(x: Int): Int = {
val y = 2
x + y
}
f(1)
x - z
- f at line 2
  It relates f to a function.
- x at line 2
  It relates x to the value of an argument given to f.
- y at line 3
  It relates y to the value 2.
Every binding occurrence has its own scope. The scope of a binding
occurrence means a code region where the identifier defined by the
binding occurrence is alive, i.e. usable. The scope of each identifier in the
program is as follows:
- f
  A function can be used in its body (as Scala allows recursive
  function definitions) and at the lines below its definition. The scope
  of f is from line 3 to line 7.
- x
  A parameter of a function can be used only in the function body.
  The scope of x is line 3 and line 4.
- y
  A variable can be used at the lines below its definition. The scope
  of y is line 4.
- f at line 6
  It denotes the function defined at line 2.
- x at line 4
  It denotes the value of an argument passed to f.
- y at line 4
  It denotes the value 2.
- f at line 1
  It is outside the scope of f.
- x at line 7
  It is outside the scope of x.
- z at line 7
  The program never defines z.
7 Identifiers 82
7.2 Syntax
val x = 3
println(x)
To add variables to AE, we need two kinds of expressions. The first kind is
expressions defining a variable, i.e. binding an identifier. In the example,
val x = 3; println(x) is such an expression. It defines the variable x
and starts the scope of x so that x can be used in println(x). We can
conclude that an expression defining a variable consists of three parts:
the name of the variable, an expression determining the value of the
variable, and an expression that can use the variable. These parts are
x, 3, and println(x), respectively, in the example. The second kind is
expressions using a variable, i.e. a bound occurrence. In the example, x at
the second line is such an expression. It uses the variable x to denote the
value 3. Based on this observation, we can define the syntax of VAE.
First, we need to add a new syntactic element: identifiers. The metavari-
able x ranges over identifiers. Let Id be the set of every identifier.
x ∈ Id
e ::= · · · | val x = e in e | x
- val x = e1 in e2
  It defines a new variable whose name is x. Therefore, the occurrence
  of x is a binding occurrence. e1 decides the value denoted by the
  variable. The scope of the variable includes e2 but excludes e1.
- x
  It uses a variable; it is either a bound occurrence of x or a free
  identifier. If it belongs to the scope of a binding occurrence of
  the same name, then it is a bound occurrence and denotes the
  value associated with the identifier. Otherwise, it is a free identifier,
  which denotes nothing.
7.3 Semantics
\[ \mathit{Env} = \mathit{Id} \xrightarrow{\text{fin}} \mathbb{Z} \]
\[ \sigma \in \mathit{Env} \]
Rule Id
If x is in the domain of σ ,
then x evaluates to σ(x) under σ .
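\[ \Rightarrow \subseteq \mathit{Env} \times E \times \mathbb{Z} \]
\[ \frac{x \in \mathit{Domain}(\sigma)}{\sigma \vdash x \Rightarrow \sigma(x)} \quad [\textsc{Id}] \]
\[ \sigma[x \mapsto n](x') = \begin{cases} n & \text{if } x = x' \\ \sigma(x') & \text{if } x \neq x' \end{cases} \]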
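Rule Val
If e1 evaluates to n1 under σ, and e2 evaluates to n2 under σ[x ↦ n1],
then val x = e1 in e2 evaluates to n2 under σ.
\[ \frac{\sigma \vdash e_1 \Rightarrow n_1 \quad \sigma[x \mapsto n_1] \vdash e_2 \Rightarrow n_2}{\sigma \vdash \mathsf{val}\ x = e_1\ \mathsf{in}\ e_2 \Rightarrow n_2} \quad [\textsc{Val}] \]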
The remaining cases are n , e1 + e2 , and e1 − e2 . Rules for those cases are
basically identical to the rules of AE. However, we need to additionally
take environments into account.
Rule Num
n evaluates to n under σ.
Rule Add
If e1 evaluates to n 1 under σ , and e2 evaluates to n 2 under σ ,
then e1 + e2 evaluates to n 1 + n 2 under σ .
Rule Sub
If e1 evaluates to n 1 under σ , and e2 evaluates to n 2 under σ ,
then e1 − e2 evaluates to n 1 − n 2 under σ .
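\[ \sigma \vdash n \Rightarrow n \quad [\textsc{Num}] \]
\[ \frac{\sigma \vdash e_1 \Rightarrow n_1 \quad \sigma \vdash e_2 \Rightarrow n_2}{\sigma \vdash e_1 + e_2 \Rightarrow n_1 + n_2} \quad [\textsc{Add}] \]
\[ \frac{\sigma \vdash e_1 \Rightarrow n_1 \quad \sigma \vdash e_2 \Rightarrow n_2}{\sigma \vdash e_1 - e_2 \Rightarrow n_1 - n_2} \quad [\textsc{Sub}] \]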
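The following proof tree proves that val x=1 in x + x evaluates to 2 under
the empty environment. Note that [x1 ↦ n1, . . . , xm ↦ nm] denotes
an environment whose domain is {x1, . . . , xm} and that maps each xi
to ni.
\[ \dfrac{\emptyset \vdash 1 \Rightarrow 1 \quad \dfrac{\dfrac{x \in \mathit{Domain}([x \mapsto 1])}{[x \mapsto 1] \vdash x \Rightarrow 1} \quad \dfrac{x \in \mathit{Domain}([x \mapsto 1])}{[x \mapsto 1] \vdash x \Rightarrow 1}}{[x \mapsto 1] \vdash x + x \Rightarrow 2}}{\emptyset \vdash \mathsf{val}\ x = 1\ \mathsf{in}\ x + x \Rightarrow 2} \]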
7.4 Interpreter
We can add a pair of a key and a value to a map with the + operator. For
example, where m is Map(1 -> "one"), m + (2 -> "two") is the same
as Map(1 -> "one", 2 -> "two").
Since the structure of the code is almost identical to the semantics rules,
there is nothing much to explain. In the Id case, when x is a key in env,
the corresponding value becomes the result of interp. Otherwise, an
exception is thrown, and the execution terminates without producing
any results.
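As a minimal sketch of the interpreter described above (not the book's listing), the rules of VAE can be written as follows. The names VaeSketch, Expr, Env, and the field names are assumptions made for illustration; only interp, env, and the Id case are mentioned in the text.

```scala
object VaeSketch {
  // Abstract syntax of VAE (constructor names are assumptions).
  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Add(l: Expr, r: Expr) extends Expr
  case class Sub(l: Expr, r: Expr) extends Expr
  case class Val(x: String, i: Expr, b: Expr) extends Expr
  case class Id(x: String) extends Expr

  // An environment is a finite map from identifiers to integers.
  type Env = Map[String, Int]

  def interp(e: Expr, env: Env): Int = e match {
    case Num(n)       => n                                   // Rule Num
    case Add(l, r)    => interp(l, env) + interp(r, env)     // Rule Add
    case Sub(l, r)    => interp(l, env) - interp(r, env)     // Rule Sub
    // Rule Val: evaluate i under env, then b under env extended with x.
    case Val(x, i, b) => interp(b, env + (x -> interp(i, env)))
    // Rule Id: look x up; a free identifier raises an exception.
    case Id(x)        => env.getOrElse(x, sys.error(s"free identifier: $x"))
  }
}
```

For example, VaeSketch.interp(Val("x", Num(1), Add(Id("x"), Id("x"))), Map.empty) follows the proof tree for val x=1 in x + x.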
7.5 Exercises
1. Which of the following are examples of shadowing?
a) val x=(val x=3 in 5 − x) in 1 + x
b) val x=3 in val y=5 in 1 + x
c) val x=3 in val x=5 in 1 + x
2. For each of the following expressions:
I val x=(val x=3 in 5 − x) in 1 + x
I val x=3 in val y=5 in 1 + x
I val x=3 in val x=5 in 1 + x
println(twice(3) + twice(5))
It defines a function, twice. The function takes one argument and returns
twice the argument. The program can call the function whenever we
want. twice(3) passes 3 as an argument to twice. Its result is 6, which
is twice 3. Similarly, twice(5) results in 10. Therefore, the program
prints 16.
This chapter defines F1VAE by adding first-order functions to VAE. Every
function in F1VAE is top-level. It means that a function definition cannot
be a part of an expression. We assume that an F1VAE program is a single
expression that is evaluated under an environment and a list of function
definitions. This design prevents us from exploring interesting topics
like closures but enables us to focus on the semantics of function calls.
The next chapter will introduce first-class functions and closures, which
make functions more expressive.
8.1 Syntax
We can figure out the components of a function definition from the above
example. If we ignore the type annotations, the definition consists of three
parts: twice, x, and x + x. twice is the name of the function; x is the
parameter of the function; x + x is the body of the function. Therefore,
we can define the syntax of a function definition as follows:
e ::= · · · | x(e)
8.2 Semantics
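\[ \mathit{FEnv} = \mathit{Id} \xrightarrow{\text{fin}} \mathit{FunDef} \]
\[ \Lambda \in \mathit{FEnv} \]
\[ \Rightarrow \subseteq \mathit{Env} \times \mathit{FEnv} \times E \times \mathbb{Z} \]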
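Rule Call
If
e evaluates to n′ under σ and Λ,
x is in the domain of Λ,
Λ(x) is def x(x′)=e′, and
e′ evaluates to n under [x′ ↦ n′] and Λ,
then
x(e) evaluates to n under σ and Λ.
\[ \frac{\sigma, \Lambda \vdash e \Rightarrow n' \quad x \in \mathit{Domain}(\Lambda) \quad \Lambda(x) = \mathsf{def}\ x(x') = e' \quad [x' \mapsto n'], \Lambda \vdash e' \Rightarrow n}{\sigma, \Lambda \vdash x(e) \Rightarrow n} \quad [\textsc{Call}] \]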
Rule Num
n evaluates to n under σ and Λ.
Rule Add
If e1 evaluates to n 1 under σ and Λ, and e2 evaluates to n 2 under σ and Λ,
then e1 + e2 evaluates to n 1 + n 2 under σ and Λ.
Rule Sub
If e1 evaluates to n 1 under σ and Λ, and e2 evaluates to n 2 under σ and Λ,
then e1 − e2 evaluates to n 1 − n 2 under σ and Λ.
Rule Val
If e1 evaluates to n1 under σ and Λ, and e2 evaluates to n2 under σ[x ↦ n1] and Λ,
then val x = e1 in e2 evaluates to n2 under σ and Λ.
Rule Id
If x is in the domain of σ ,
then x evaluates to σ(x) under σ and Λ.
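\[ \sigma, \Lambda \vdash n \Rightarrow n \quad [\textsc{Num}] \]
\[ \frac{\sigma, \Lambda \vdash e_1 \Rightarrow n_1 \quad \sigma, \Lambda \vdash e_2 \Rightarrow n_2}{\sigma, \Lambda \vdash e_1 + e_2 \Rightarrow n_1 + n_2} \quad [\textsc{Add}] \]
\[ \frac{\sigma, \Lambda \vdash e_1 \Rightarrow n_1 \quad \sigma, \Lambda \vdash e_2 \Rightarrow n_2}{\sigma, \Lambda \vdash e_1 - e_2 \Rightarrow n_1 - n_2} \quad [\textsc{Sub}] \]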
8 First-Order Functions 90
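\[ \frac{\sigma, \Lambda \vdash e_1 \Rightarrow n_1 \quad \sigma[x \mapsto n_1], \Lambda \vdash e_2 \Rightarrow n_2}{\sigma, \Lambda \vdash \mathsf{val}\ x = e_1\ \mathsf{in}\ e_2 \Rightarrow n_2} \quad [\textsc{Val}] \]
\[ \frac{x \in \mathit{Domain}(\sigma)}{\sigma, \Lambda \vdash x \Rightarrow \sigma(x)} \quad [\textsc{Id}] \]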
8.3 Interpreter
The following Scala code expresses the abstract syntax of F1VAE (we omit the part common to VAE):
The implementation reflects the semantics exactly. You can easily check
its correctness with the case-wise comparison.
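A minimal sketch of such an interpreter, assuming a Call constructor, a FunDef case class, and a function environment fEnv of type Map[String, FunDef] (all names here are illustrative assumptions), might look as follows. The Call case mirrors Rule Call: the body is evaluated under an environment that contains only the parameter.

```scala
object F1vaeSketch {
  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Add(l: Expr, r: Expr) extends Expr
  case class Sub(l: Expr, r: Expr) extends Expr
  case class Val(x: String, i: Expr, b: Expr) extends Expr
  case class Id(x: String) extends Expr
  case class Call(f: String, a: Expr) extends Expr

  // A top-level function definition: name, parameter, body.
  case class FunDef(f: String, x: String, b: Expr)

  type Env = Map[String, Int]
  type FEnv = Map[String, FunDef]

  def interp(e: Expr, env: Env, fEnv: FEnv): Int = e match {
    case Num(n)       => n
    case Add(l, r)    => interp(l, env, fEnv) + interp(r, env, fEnv)
    case Sub(l, r)    => interp(l, env, fEnv) - interp(r, env, fEnv)
    case Val(x, i, b) => interp(b, env + (x -> interp(i, env, fEnv)), fEnv)
    case Id(x)        => env.getOrElse(x, sys.error(s"free identifier: $x"))
    case Call(f, a)   =>
      val fd = fEnv.getOrElse(f, sys.error(s"unknown function: $f"))
      // Static scope: the body sees only the parameter, not the caller's env.
      interp(fd.b, Map(fd.x -> interp(a, env, fEnv)), fEnv)
  }
}
```

Because function names live in fEnv and variables live in env, a variable and a function may share a name without interfering with each other.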
8.4 Scope
The current semantics is called static scope. Static scope allows the scope
of a binding occurrence to be determined statically, i.e. only by looking
at the code, without executing it. In other words, a function body can
use only variables that have been defined already when the function is
defined. For example, consider the following code:
def f(x)=x + y
During the first function call, y in f is bound to the first y and denotes 1.
However, during the second function call, it is bound to the second one
and denotes 2. The scope of the first y includes not only f(0), which is
normal, but also the body of f. It is the same for the second y. As you can
see, under dynamic scope, the scope of a binding identifier is not fixed; it
becomes extended at run time due to function calls.
To adopt dynamic scope in F1VAE, we need to change the function call
semantics as follows:
Rule Call-Dyn
If
e evaluates to n′ under σ and Λ,
x is in the domain of Λ,
Λ(x) is def x(x′)=e′, and
e′ evaluates to n under σ[x′ ↦ n′] and Λ,
then
x(e) evaluates to n under σ and Λ.
\[ \frac{\sigma, \Lambda \vdash e \Rightarrow n' \quad x \in \mathit{Domain}(\Lambda) \quad \Lambda(x) = \mathsf{def}\ x(x') = e' \quad \sigma[x' \mapsto n'], \Lambda \vdash e' \Rightarrow n}{\sigma, \Lambda \vdash x(e) \Rightarrow n} \quad [\textsc{Call-Dyn}] \]
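The difference from Rule Call is a single environment: the body is evaluated under the caller's environment extended with the parameter. The following sketch illustrates this under assumed names (DynScopeSketch, Call, FunDef, fEnv); the example expression is written in the spirit of the two calls to f discussed above and is an assumption, not the book's program.

```scala
object DynScopeSketch {
  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Add(l: Expr, r: Expr) extends Expr
  case class Val(x: String, i: Expr, b: Expr) extends Expr
  case class Id(x: String) extends Expr
  case class Call(f: String, a: Expr) extends Expr
  case class FunDef(f: String, x: String, b: Expr)

  type Env = Map[String, Int]
  type FEnv = Map[String, FunDef]

  def interp(e: Expr, env: Env, fEnv: FEnv): Int = e match {
    case Num(n)       => n
    case Add(l, r)    => interp(l, env, fEnv) + interp(r, env, fEnv)
    case Val(x, i, b) => interp(b, env + (x -> interp(i, env, fEnv)), fEnv)
    case Id(x)        => env.getOrElse(x, sys.error(s"free identifier: $x"))
    case Call(f, a)   =>
      val fd = fEnv.getOrElse(f, sys.error(s"unknown function: $f"))
      // Rule Call-Dyn: extend the caller's environment instead of an empty one.
      interp(fd.b, env + (fd.x -> interp(a, env, fEnv)), fEnv)
  }

  // def f(x)=x + y
  val fEnv: FEnv = Map("f" -> FunDef("f", "x", Add(Id("x"), Id("y"))))

  // (val y=1 in f(0)) + (val y=2 in f(0))
  val example: Expr =
    Add(Val("y", Num(1), Call("f", Num(0))), Val("y", Num(2), Call("f", Num(0))))
}
```

In this sketch, y in the body of f denotes 1 during the first call and 2 during the second, so the two calls yield different results even though both are f(0).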
8.5 Exercises
1. With the following list of function definitions in F1VAE:
def twice(x)=x + x
def x(y)=y
def f(x)=x + 1
def g(g)=g
Show the results of evaluating the following expressions under the
empty environment. When it is an error, describe which error it is.
a) twice(twice)
b) val x=5 in x(x)
c) g(3)
d) g(f)
e) g(g)
9 First-Class Functions
9.1 Syntax . . . 93
9.2 Semantics . . . 94
9.3 Interpreter . . . 97
9.4 Syntactic Sugar . . . 98
9.5 Exercises . . . 99
First-class functions are functions that can be used as values. They are
much more expressive than first-order functions, which are the topic of
the previous chapter. This chapter explains the semantics of first-class
functions. We need to introduce the notion of a closure to make first-class
functions work properly. We will see what closures are and why they are
necessary.
This chapter defines FVAE by extending VAE with first-class functions.
The only way to create a function in FVAE is to make an anonymous
function, which is a function without a name. However, we can add
named functions as syntactic sugar. In addition, we will see that even
variable definitions can be considered as syntactic sugar.
9.1 Syntax
determines the function to be called and the other determines the value
of the argument.
We have used the term function call so far. In the context of functional
programming, we use the term function application more frequently. When
we see f(1), we say “f is applied to 1” instead of “f is called with the
argument 1.” Applications sound more natural than calls especially when
we are talking about first-class functions. For example, we usually say
“makeAdder(3) is applied to 5” rather than “makeAdder(3) is called with
the argument 5.”
From the above observation on anonymous functions and function
applications, we can define the syntax of FVAE. The following is the
syntax of FVAE (we omit the part common to VAE):
e ::= · · · | λx.e | e e
I λx.e
It is called an anonymous function or a lambda abstraction. It denotes
a function whose parameter is x and body is e . x is a binding occur-
rence, and its scope is e . A function has zero or more parameters
in many real-world languages, but we restrict a function in FVAE
to have one and only one parameter for simplicity as before.
I e1 e2
It is a function application, or just an application in short. e1 denotes
the function; e2 denotes the argument.
9.2 Semantics
Integers are the only values in VAE. It is not true in FVAE. Since first-class
functions are values, a value of FVAE is either an integer or a function.
Thus, we define a new kind of semantic element, value. The metavariable
v ranges over values. Also, let V be the set of every value.
v ::= n | ⟨λx.e, σ⟩
(λx.λy.x + y) 1 2
\[ \mathit{Env} = \mathit{Id} \xrightarrow{\text{fin}} V \]
\[ \Rightarrow \subseteq \mathit{Env} \times E \times V \]
Rule Fun
λx.e evaluates to ⟨λx.e, σ⟩ under σ.
Rule App
9 First-Class Functions 96
If
e1 evaluates to ⟨λx.e, σ′⟩ under σ,
e2 evaluates to v′ under σ, and
e evaluates to v under σ′[x ↦ v′],
then
e1 e2 evaluates to v under σ.
We can reuse Rule Num, Rule Add, Rule Sub, and Rule Id of VAE. However,
it is important to note that FVAE has more cases in which evaluation can
fail than VAE. For example, consider Rule Add.
Rule Add
If e1 evaluates to n1 under σ, and e2 evaluates to n2 under σ,
then e1 + e2 evaluates to n1 + n2 under σ.
\[ \frac{\sigma \vdash e_1 \Rightarrow n_1 \quad \sigma \vdash e_2 \Rightarrow n_2}{\sigma \vdash e_1 + e_2 \Rightarrow n_1 + n_2} \quad [\textsc{Add}] \]
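Rule Val
If e1 evaluates to v1 under σ, and e2 evaluates to v2 under σ[x ↦ v1],
then val x = e1 in e2 evaluates to v2 under σ.
\[ \frac{\sigma \vdash e_1 \Rightarrow v_1 \quad \sigma[x \mapsto v_1] \vdash e_2 \Rightarrow v_2}{\sigma \vdash \mathsf{val}\ x = e_1\ \mathsf{in}\ e_2 \Rightarrow v_2} \quad [\textsc{Val}] \]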
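Here σ1 = [x ↦ 1] and σ2 = σ1[y ↦ 2], as determined by Rule App.
\[ \dfrac{\dfrac{x \in \mathit{Domain}(\sigma_2)}{\sigma_2 \vdash x \Rightarrow 1} \quad \dfrac{y \in \mathit{Domain}(\sigma_2)}{\sigma_2 \vdash y \Rightarrow 2}}{\sigma_2 \vdash x + y \Rightarrow 3} \]
\[ \dfrac{\emptyset \vdash (\lambda x.\lambda y.x + y)\ 1 \Rightarrow \langle \lambda y.x + y, \sigma_1 \rangle \quad \emptyset \vdash 2 \Rightarrow 2 \quad \sigma_2 \vdash x + y \Rightarrow 3}{\emptyset \vdash (\lambda x.\lambda y.x + y)\ 1\ 2 \Rightarrow 3} \]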
9.3 Interpreter
The following Scala code implements the syntax of FVAE (we omit the part common to VAE):
In the Num case, the return value is NumV(n), not n, since the function
must return a value of the type Value.
In the Add and Sub cases, we cannot assume that the operands are
integers any longer. We use pattern matching to discriminate integers
from closures. If both operands are integers, addition or subtraction
succeeds. Otherwise, at least one of them is a closure, and the interpreter
crashes due to a pattern matching failure. Note that this code is equivalent
to the following code:
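As a hedged illustration (a sketch, not the book's listing), an FVAE interpreter in which Add and Sub match explicitly on NumV and raise an error otherwise could be written as below. NumV appears in the text; CloV, numOp, and the remaining names are assumptions.

```scala
object FvaeSketch {
  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Add(l: Expr, r: Expr) extends Expr
  case class Sub(l: Expr, r: Expr) extends Expr
  case class Val(x: String, i: Expr, b: Expr) extends Expr
  case class Id(x: String) extends Expr
  case class Fun(x: String, b: Expr) extends Expr
  case class App(f: Expr, a: Expr) extends Expr

  // A value is an integer or a closure (function plus captured environment).
  sealed trait Value
  case class NumV(n: Int) extends Value
  case class CloV(p: String, b: Expr, e: Env) extends Value
  type Env = Map[String, Value]

  // Explicit match: succeeds only when both operands are integers.
  def numOp(op: (Int, Int) => Int)(l: Value, r: Value): Value = (l, r) match {
    case (NumV(n), NumV(m)) => NumV(op(n, m))
    case _                  => sys.error("operands must be integers")
  }

  def interp(e: Expr, env: Env): Value = e match {
    case Num(n)       => NumV(n)
    case Add(l, r)    => numOp(_ + _)(interp(l, env), interp(r, env))
    case Sub(l, r)    => numOp(_ - _)(interp(l, env), interp(r, env))
    case Val(x, i, b) => interp(b, env + (x -> interp(i, env)))
    case Id(x)        => env.getOrElse(x, sys.error(s"free identifier: $x"))
    case Fun(x, b)    => CloV(x, b, env)             // Rule Fun: capture env
    case App(f, a)    => interp(f, env) match {      // Rule App
      // The body runs under the captured environment, not the caller's.
      case CloV(x, b, fenv) => interp(b, fenv + (x -> interp(a, env)))
      case _                => sys.error("not a closure")
    }
  }
}
```

The closure's captured environment is what keeps the semantics statically scoped: a later redefinition of x at the call site does not affect a function that captured an earlier x.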
We can add named local functions to FVAE with the following change in
the syntax:
the three features. This book does not discuss how integers, addition,
and subtraction can be desugared into the lambda calculus.
9.5 Exercises
1. Consider the following expression:
Describe the semantics of the App case in prose when we use each
of the following for the blank above:
I env
I Map()
I fEnv
3. This exercise extends FVAE to support multiple parameters. Con-
sider the following language:
∅ ⊢ (λ f m.f(m))(λ x.x , 8) ⇒
Field f ∈ Field
Record ρ ∈ Record = Field →fin Value
Expression e ::= · · · | {f = e, . . . , f = e} | e.f | e ; e
Value v ::= · · · | ρ
language:
∅ ⊢ (8 , (320, 42).1).2 ⇒
trait Expr
trait Value extends Expr
case class Num(n: Int) extends Expr with Value
case class Add(l: Expr, r: Expr) extends Expr
case class Sub(l: Expr, r: Expr) extends Expr
case class Val(x: String, e: Expr, b: Expr) extends Expr
case class Id(x: String) extends Expr
case class Fun(x: String, b: Expr) extends Expr with Value
case class App(f: Expr, a: Expr) extends Expr
interp(subst(b, x, interp(i)))
case Id(x) => error("free identifier")
case Fun(x, b) => Fun(x, b)
case App(f, a) =>
val Fun(x, b) = interp(f)
interp(subst(b, x, interp(a)))
}
v ::= n | λx.e
a) Write the operational semantics of the above implementation
of the form e ⇒ v where e[x/v] denotes subst( e , x , v ).
b) Write the definition of the substitution e[x/v] of the form
e[x/v] e :
c) Consider the following expression:
σ ⊢ a ↪ v    σ ⊢ e ⇒ ↑
σ ⊢ fn m ⇒ ⟨m, σ⟩
σ ⊢ a ⇒ v    σ ⊢ e a ⇒ ↑
σ ⊢ e ⇒ ⟨m, σ′⟩    σ ⊢ a ⇒ v    (σ′, v) ⊢ m ⇒ v′
σ ⊢ e a ⇒ v′
σ ⊢ e ⇒ ⟨m, σ′⟩    σ ⊢ a ⇒ v    (σ′, v) ⊢ m ⇒ ↑
σ ⊢ e a ⇒ ↑
σ ⊢ a ↪ v
x ∈ Domain(σ)
σ ⊢ n ↪ n
σ ⊢ x ↪ σ(x)
println(sum(10))
The function sum takes an integer n as an argument and returns the sum
of the integers from 0 to n (including n). Therefore, the program
prints 55. (We ignore the case when the input is negative.)
However, it is wrong since the scope of sum includes sum 10 but excludes
λ x.if0 x 0 (x + sum (x − 1)). sum in the body of the function is a free
identifier. Evaluation of the expression terminates with a run-time error.
It is nontrivial to define recursive functions in FAE.
10.1 Syntax
I if0 e1 e2 e3
It is a conditional expression. e1 is the condition; e2 is the true
branch; e3 is the false branch. We consider the condition to be true
when it denotes 0. All the other values, i.e. nonzero integers and
closures, are considered as false.
10 Recursion 106
I def x1(x2)=e1 in e2
It defines a recursive function whose name is x1, parameter is x2,
and body is e1. Both x1 and x2 are binding occurrences. The scope
of x1 is e1 and e2; the scope of x2 is e1. If x1 occurs in e1, it is a bound
occurrence, which implies that the function can be recursive.
10.2 Semantics
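Rule If0-Z
If e1 evaluates to 0 under σ and e2 evaluates to v under σ,
then if0 e1 e2 e3 evaluates to v under σ.
\[ \frac{\sigma \vdash e_1 \Rightarrow 0 \quad \sigma \vdash e_2 \Rightarrow v}{\sigma \vdash \mathsf{if0}\ e_1\ e_2\ e_3 \Rightarrow v} \quad [\textsc{If0-Z}] \]
Rule If0-Nz
If e1 evaluates to v′ under σ, where v′ ≠ 0, and e3 evaluates to v under σ,
then if0 e1 e2 e3 evaluates to v under σ.
\[ \frac{\sigma \vdash e_1 \Rightarrow v' \quad v' \neq 0 \quad \sigma \vdash e_3 \Rightarrow v}{\sigma \vdash \mathsf{if0}\ e_1\ e_2\ e_3 \Rightarrow v} \quad [\textsc{If0-Nz}] \]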
following rule:
Rule Rec
If e2 evaluates to v under σ′, where σ′ = σ[x1 ↦ ⟨λx2.e1, σ′⟩],
then def x1(x2)=e1 in e2 evaluates to v under σ.
\[ \frac{\sigma' = \sigma[x_1 \mapsto \langle \lambda x_2.e_1, \sigma' \rangle] \quad \sigma' \vdash e_2 \Rightarrow v}{\sigma \vdash \mathsf{def}\ x_1(x_2) = e_1\ \mathsf{in}\ e_2 \Rightarrow v} \quad [\textsc{Rec}] \]
The environment of the closure is σ′, which maps x1 to the closure. σ′
is recursively defined at the meta-level. This is not that surprising. We
are defining a recursive function, so the defined function value itself
should be recursive. When the body of the closure, e1, is evaluated, the
environment is σ′[x2 ↦ v] for some v, which contains x1. The function
x1 can be used in its body and is thus recursive.
We can reuse the rules of FAE for the other expressions.
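The following proof trees prove that def f(x)=if0 x 0 (x + f (x − 1)) in f 1
evaluates to 1 under the empty environment. The proof splits into three
trees for readability. Suppose the following facts:
\[ e_f = \mathsf{if0}\ x\ 0\ (x + f\ (x - 1)) \]
\[ v_f = \langle \lambda x.e_f, \sigma_1 \rangle \]
\[ \sigma_1 = [f \mapsto v_f] \]
\[ \sigma_2 = \sigma_1[x \mapsto 1] \]
\[ \sigma_3 = \sigma_1[x \mapsto 0] \]
\[ \dfrac{\dfrac{f \in \mathit{Domain}(\sigma_2)}{\sigma_2 \vdash f \Rightarrow v_f} \quad \dfrac{\dfrac{x \in \mathit{Domain}(\sigma_2)}{\sigma_2 \vdash x \Rightarrow 1} \quad \sigma_2 \vdash 1 \Rightarrow 1}{\sigma_2 \vdash x - 1 \Rightarrow 0} \quad \dfrac{\dfrac{x \in \mathit{Domain}(\sigma_3)}{\sigma_3 \vdash x \Rightarrow 0} \quad \sigma_3 \vdash 0 \Rightarrow 0}{\sigma_3 \vdash e_f \Rightarrow 0}}{\sigma_2 \vdash f\ (x - 1) \Rightarrow 0} \]
\[ \dfrac{\dfrac{x \in \mathit{Domain}(\sigma_2)}{\sigma_2 \vdash x \Rightarrow 1} \quad 1 \neq 0 \quad \dfrac{\dfrac{x \in \mathit{Domain}(\sigma_2)}{\sigma_2 \vdash x \Rightarrow 1} \quad \sigma_2 \vdash f\ (x - 1) \Rightarrow 0}{\sigma_2 \vdash x + f\ (x - 1) \Rightarrow 1}}{\sigma_2 \vdash e_f \Rightarrow 1} \]
\[ \dfrac{\sigma_1 = [f \mapsto v_f] \quad \dfrac{\dfrac{f \in \mathit{Domain}(\sigma_1)}{\sigma_1 \vdash f \Rightarrow v_f} \quad \sigma_1 \vdash 1 \Rightarrow 1 \quad \sigma_2 \vdash e_f \Rightarrow 1}{\sigma_1 \vdash f\ 1 \Rightarrow 1}}{\emptyset \vdash \mathsf{def}\ f(x) = e_f\ \mathsf{in}\ f\ 1 \Rightarrow 1} \]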
10.3 Interpreter
The following Scala code implements the syntax of RFAE (we omit the part common to FAE):
Values are defined in a similar way to FAE. The only difference is that
the field e, which denotes the captured environment, of CloV is now
mutable. Using mutation is the easiest way to make recursive values in
Scala, though we can do it without mutation.
In the If0 case, the condition is evaluated first. According to the condition,
one of the branches is evaluated.
In the Rec case, we construct a closure first. However, the closure is not
complete at this point. We next create a new environment: the environ-
ment with the closure. The closure must capture the new environment.
To achieve this, we change the environment of the closure to the new en-
vironment. Now, both closure and environment have recursive structures,
and e can be evaluated under the environment.
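The If0 and Rec cases described above can be sketched as follows. This is an illustrative sketch under assumed constructor names (If0, Rec, CloV with a mutable field e), not the book's exact listing. Note that the closure and its environment refer to each other after the assignment, so the sketch never prints or compares closures.

```scala
object RfaeSketch {
  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Add(l: Expr, r: Expr) extends Expr
  case class Sub(l: Expr, r: Expr) extends Expr
  case class Id(x: String) extends Expr
  case class Fun(x: String, b: Expr) extends Expr
  case class App(f: Expr, a: Expr) extends Expr
  case class If0(c: Expr, t: Expr, f: Expr) extends Expr
  case class Rec(f: String, x: String, b: Expr, e: Expr) extends Expr

  sealed trait Value
  case class NumV(n: Int) extends Value
  // The captured environment is mutable so that it can be patched in Rec.
  case class CloV(p: String, b: Expr, var e: Env) extends Value
  type Env = Map[String, Value]

  def numOp(op: (Int, Int) => Int)(l: Value, r: Value): Value = (l, r) match {
    case (NumV(n), NumV(m)) => NumV(op(n, m))
    case _                  => sys.error("operands must be integers")
  }

  def interp(e: Expr, env: Env): Value = e match {
    case Num(n)    => NumV(n)
    case Add(l, r) => numOp(_ + _)(interp(l, env), interp(r, env))
    case Sub(l, r) => numOp(_ - _)(interp(l, env), interp(r, env))
    case Id(x)     => env.getOrElse(x, sys.error(s"free identifier: $x"))
    case Fun(x, b) => CloV(x, b, env)
    case App(f, a) => interp(f, env) match {
      case CloV(x, b, fenv) => interp(b, fenv + (x -> interp(a, env)))
      case _                => sys.error("not a closure")
    }
    // Rule If0-Z / If0-Nz: only 0 selects the true branch.
    case If0(c, t, f) => interp(c, env) match {
      case NumV(0) => interp(t, env)
      case _       => interp(f, env)
    }
    // Rule Rec: build the closure, extend the environment, then tie the knot.
    case Rec(f, x, b, body) =>
      val cloV = CloV(x, b, env)
      val nenv = env + (f -> cloV)
      cloV.e = nenv
      interp(body, nenv)
  }
}
```

With this sketch, the sum function from the beginning of the chapter can be expressed with Rec and applied to 10.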
Let us call the function f . Suppose that sum is given to the function as an
which evaluates to
(λ f.λv.e f ) (λ v.F F v)
By expanding e f , we get
which evaluates to
Let us call this function g . Note that we now know that F F evaluates to
g . If we apply g to 0, the result is 0. If we apply g to a positive integer n ,
the result is
n + (λ v.F F v) (n − 1)
which evaluates to
n + F F (n − 1)
Since F F evaluates to g , the above expression evaluates to
n + g (n − 1)
It implies that g equals sum. Actually, this evaluation has started from
Z f . Therefore, Z f equals sum.
One may ask if we can use the following expression Z′ instead of Z:
λx.(λf.λv.e f ) (x x)
(λx.(λf.λv.e f ) (x x)) F′
which evaluates to
(λf.λv.e f ) (F′ F′)
To evaluate this expression, we need to evaluate the argument. However,
the argument is F′ F′, which implies that we need the value of F′ F′ to
compute the value of F′ F′. Thus, at this point, the execution starts to
evaluate F′ F′ forever and never terminates. For this reason, we should
use Z, not Z′, as a fixed point combinator.
Finally, we can define the syntactic transformation rule to desugar recursive
functions: def x1(x2)=e1 in e2 is transformed into val x1=Z (λx1.λx2.e1) in e2.
If x1 in def x1(x2)=e1 in e2 denotes a function h, h is a fixed point of
λx1.λx2.e1. Therefore, Z (λx1.λx2.e1) is equal to h. Both def x1(x2)=e1 in e2
and val x1=Z (λx1.λx2.e1) in e2 evaluate e2 under an environment in which x1
denotes h. Therefore, they have the same semantics, and the desugaring
is correct.
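The same idea can be tried out in Scala itself. The sketch below is an illustration under assumptions: Scala is typed, so self-application needs a wrapper type (here the hypothetical Self), which the untyped language of this chapter does not need. The eta-expansion v => x.run(x)(v) plays the role of λv.F F v and is what prevents the infinite evaluation discussed above.

```scala
object ZSketch {
  // Wrapper type that lets a function receive itself as an argument.
  final case class Self[A, B](run: Self[A, B] => A => B)

  // z(f) is a fixed point of f; the inner lambda delays the self-application.
  def z[A, B](f: (A => B) => (A => B)): A => B = {
    val bigF = Self[A, B](x => f(v => x.run(x)(v)))
    bigF.run(bigF)
  }

  // sum defined without any explicit recursion.
  val sum: Int => Int =
    z[Int, Int](f => v => if (v == 0) 0 else v + f(v - 1))
}
```

Replacing v => x.run(x)(v) with x.run(x) corresponds to using Z′: Scala evaluates arguments eagerly, so the self-application would be evaluated again and again before f is ever applied.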
10.5 Exercises
1. Explain why the following expression does not terminate and
describe how to fix it.
val z=(λ f.
val x=(λ y.
val g=y y in
f g
) in
x x
) in
val f=z (λ f.λ v.if0 v 0 (v + f (v − 1))) in
f 10
2. Consider the following definition of z and its use to define the
recursive function f.
val z=(λ f.
val x=(λ y.
val g= λ a.y y a in
f g
) in
x x
) in
val f=z (λ f.λ v.if0 v 0 (v + f (v − 1))) in
f 10
Describe conditions that an argument given to z must satisfy so
Expression e ::= n | e − e | b | e ∧ e | ¬e | if e e e | x
| λx · · · x.e | e(e, · · · , e) | def x(x, · · · , x)= e in e
Value v ::= n | b | ⟨λx · · · x.e, σ⟩
Note that b ranges over boolean values. The language does not
support the short-circuiting semantics, i.e. e2 must be evaluated in
the expression e1 ∧ e2 regardless of whether e1 evaluates to true. Write the
operational semantics of the form σ ⊢ e ⇒ v for the expressions.
5. What are the results of the following expression:
a) Dynamic scope
b) Static scope
11 Boxes
11.1 Syntax . . . 112
11.2 Semantics . . . 113
11.3 Interpreter . . . 118
11.4 Exercises . . . 120
Mutation is a widely used feature. It is an important concept in imperative
languages. Even functional languages support mutation. Few languages
are purely functional, i.e. do not allow any mutation; Haskell and Coq are examples.
Mutation is important since many programs can be implemented concisely
and efficiently with mutation. At the same time, mutation often makes
programs error-prone and difficult to reason about. While binding
of identifiers works modularly and allows local reasoning, mutation has a
global effect on execution and enables uncontrolled interference between
distinct parts of a program. Programmers should use mutation with
extreme care.
This chapter introduces mutation by defining BFAE, which extends FAE
with boxes. A box is a cell in memory that contains a single value. The
value contained in a box can be modified anytime after the creation of
the box. Boxes in BFAE are higher-order. Each box can contain any value,
which can be a box or a closure, rather than only an integer. A box itself is
rarely found in real-world languages: it is almost the same as a reference in
OCaml (ref). However, it is a good abstraction of more general mutation
mechanisms including mutable objects and data structures. We can find
such concepts in most languages, and boxes are useful to understand
those concepts.
We can consider mutable objects in Scala as a generalization of boxes.
Going in the opposite direction, we can say that boxes can be represented as
objects. Consider the following class definition in Scala:
The class Box has one field: value. As with any other class, we can construct
instances of Box and read the fields of the instances.
val b = Box(3)
println(b.value)
b.value = 10
println(b.value)
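One class definition consistent with this usage is sketched below. It is an assumption that Box is a case class with a single mutable field of type Any; the text only states that the class has one field named value.

```scala
object BoxSketch {
  // Assumed definition: one mutable field that can hold any value.
  case class Box(var value: Any)

  // Mirrors the usage above, returning the two values that would be printed.
  def demo(): (Any, Any) = {
    val b = Box(3)        // create a box whose initial value is 3
    val before = b.value  // read the box
    b.value = 10          // update the box
    (before, b.value)
  }
}
```

Under this definition, the first println shows 3 and the second shows 10, since the assignment changes the content of the same object.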
11.1 Syntax
e ::= · · · | box e | ! e | e := e | e ; e
I box e
It creates a new box, cf. Box(3) in the example. e determines the
initial value of the box.
I !e
It reads the value of a box, cf. b.value in the example. e determines
the box to be read.
I e 1 := e 2
It changes the value of a box, cf. b.value = 10 in the example. e1
determines the box to be updated; e2 determines the new value.
I e1 ; e2
It is a sequencing expression. e1 is the first expression to be evaluated;
e2 is the second. Many real-world languages allow sequencing
of an arbitrary number of expressions. For brevity, BFAE allows
only sequencing of two expressions. Sequencing of multiple expressions
can be easily expressed by nested sequencing. For example,
e1 ; e2 ; e3 can be expressed as (e1 ; e2 ); e3 .
11.2 Semantics
a ∈ Addr
\[ \mathit{Sto} = \mathit{Addr} \xrightarrow{\text{fin}} V \]
11 Boxes 114
M ∈ Sto
The semantics does not require a concrete notion of a box. Since every
box is uniquely identified by an address, the semantics can consider each
address as a box. Thus, we treat an address as a value of BFAE, instead of
introducing a new semantic element denoting boxes. For example, an
expression creating a box evaluates to an address. We need to revise the
definition of a value to include addresses (we omit the part common to FAE).
v ::= · · · | a
Note that we keep using the concept of a box for explanation. Even
though the semantics abstracts boxes with addresses, boxes do exist from
the programmers’ perspective. The term box and the term address will
be interchangeably used.
How are stores used in the semantics? First, consider an expression
reading a box. Evaluating ! e needs not only an environment but also a
store. If e denotes a box, the store has the value of the box. The value
becomes the result of ! e . Without a store, there is no way to find the value
of a box and yield a result. It implies that evaluation requires a store to
be given.
Now, let us consider the other kinds of expressions related to boxes. box e
creates a new box; e1 := e2 changes the content of a box. Both modify stores.
Modifying a store differs from extending an environment with a new
identifier.
A change in an environment is propagated to the subexpressions of an
expression that has caused the change. Consider val x = e1 in e2 . It extends
the environment with x , but only e2 uses the extended environment
because the scope of x is e2 but nowhere else. A variable definition can
affect only its subexpressions. For instance, in (val x = e1 in e2 ) + e3 , e3 does
not belong to the scope of x . The extended environment must be used
for only e2 , but not e3 . Therefore, we say that binding and environments
are local and modular.
On the other hand, the modified store is unnecessary for the subex-
pressions of an expression that modifies the store, while other parts of
the program need the modified one. Consider (x:=2); !x as an example.
Assume that x denotes a box. !x must be aware that x:=2 has changed
the value of the box to 2. Otherwise, !x will get the previous value of
the box and produce a wrong result. Note that !x is not a subexpression
of x:=2. However, in x:=2, the evaluation of 2, which is a subexpression
of x:=2, must not be affected by the change in the value of the box since
the change happens after the evaluation of 2. Therefore, how stores
change due to expressions is important. If an expression contains two
subexpressions, the store obtained by evaluating the first subexpression
has to be passed to the evaluation of the second subexpression. Stores are
completely different from environments. Any change in a store affects
the entire remaining computation. Stores are global and not modular.
From the observation, we can conclude that evaluation of an expression
needs to take a store in addition to an environment as input and output
a new store along with a result value. We can define the semantics as a
relation over Env, Sto, E , V , and Sto. The former store is input, and the
latter store is output.
Rule Num
n evaluates to n and changes the store from M to M under σ .
\[ \sigma, M \vdash n \Rightarrow n, M \quad [\textsc{Num}] \]
Rule Id
If x is in the domain of σ ,
then x evaluates to σ(x) and changes the store from M to M under σ .
\[ \frac{x \in \mathit{Domain}(\sigma)}{\sigma, M \vdash x \Rightarrow \sigma(x), M} \quad [\textsc{Id}] \]
Rule Fun
λx.e evaluates to ⟨λx.e, σ⟩ and changes the store from M to M under σ.
Rule Seq
If
e1 evaluates to v1 and changes the store from M to M1 under σ, and
e2 evaluates to v2 and changes the store from M1 to M2 under σ,
then
e1 ; e2 evaluates to v2 and changes the store from M to M2 under σ .
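\[ \frac{\sigma, M \vdash e_1 \Rightarrow v_1, M_1 \quad \sigma, M_1 \vdash e_2 \Rightarrow v_2, M_2}{\sigma, M \vdash e_1 ; e_2 \Rightarrow v_2, M_2} \quad [\textsc{Seq}] \]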
Rule Add
If
e1 evaluates to n1 and changes the store from M to M1 under σ, and
e2 evaluates to n2 and changes the store from M1 to M2 under σ,
then
e1 + e2 evaluates to n1 + n2 and changes the store from M to M2 under σ.
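\[ \frac{\sigma, M \vdash e_1 \Rightarrow n_1, M_1 \quad \sigma, M_1 \vdash e_2 \Rightarrow n_2, M_2}{\sigma, M \vdash e_1 + e_2 \Rightarrow n_1 + n_2, M_2} \quad [\textsc{Add}] \]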
Rule Sub
If
e1 evaluates to n1 and changes the store from M to M1 under σ, and
e2 evaluates to n2 and changes the store from M1 to M2 under σ,
then
e1 − e2 evaluates to n1 − n2 and changes the store from M to M2 under σ.
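\[ \frac{\sigma, M \vdash e_1 \Rightarrow n_1, M_1 \quad \sigma, M_1 \vdash e_2 \Rightarrow n_2, M_2}{\sigma, M \vdash e_1 - e_2 \Rightarrow n_1 - n_2, M_2} \quad [\textsc{Sub}] \]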
Rule App
If
e1 evaluates to ⟨λx.e, σ′⟩ and changes the store from M to M1 under σ,
e2 evaluates to v′ and changes the store from M1 to M2 under σ, and
e evaluates to v and changes the store from M2 to M3 under σ′[x ↦ v′],
then
e1 e2 evaluates to v and changes the store from M to M3 under σ.
\[ \frac{\sigma, M \vdash e_1 \Rightarrow \langle \lambda x.e, \sigma' \rangle, M_1 \quad \sigma, M_1 \vdash e_2 \Rightarrow v', M_2 \quad \sigma'[x \mapsto v'], M_2 \vdash e \Rightarrow v, M_3}{\sigma, M \vdash e_1\ e_2 \Rightarrow v, M_3} \quad [\textsc{App}] \]
Note that the evaluation of the body of a closure can modify the store as
well.
Now, let us define the semantics of expressions treating boxes. box e is an
expression creating a new box. The result of e becomes the initial value
of the box. The result of box e is the new box.
Rule NewBox
If
e evaluates to v and changes the store from M to M1 under σ, and
a is not in the domain of M1 ,
then
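box e evaluates to a and changes the store from M to M1[a ↦ v] under σ.
\[ \frac{\sigma, M \vdash e \Rightarrow v, M_1 \quad a \notin \mathit{Domain}(M_1)}{\sigma, M \vdash \mathsf{box}\ e \Rightarrow a, M_1[a \mapsto v]} \quad [\textsc{NewBox}] \]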
To get the initial value, e is evaluated first. The address of the new box
must not belong to M1, the store obtained by evaluating e. There is no
additional condition the address must satisfy, so we can freely choose
any address that is not in M1. Note that if we checked the domain of M
instead of M1, multiple boxes could end up sharing the same address when
e also creates boxes, which is certainly wrong. The result is the address of
the box. Also, we add a mapping from the address of the box to the value
of the box to the final store.
! e is an expression reading the value of a box. e determines the box to be
read. If e does not evaluate to a box, a run-time error occurs. Otherwise,
e is some box, and the final result is the value of the box.
Rule OpenBox
If
e evaluates to a and changes the store from M to M1 under σ, and
a is in the domain of M1 ,
then
! e evaluates to M1 (a) and changes the store from M to M1 under σ .
\[ \frac{\sigma, M \vdash e \Rightarrow a, M_1 \quad a \in \mathit{Domain}(M_1)}{\sigma, M \vdash\ !e \Rightarrow M_1(a), M_1} \quad [\textsc{OpenBox}] \]
Rule SetBox
If
e1 evaluates to a and changes the store from M to M1 under σ, and
e2 evaluates to v and changes the store from M1 to M2 under σ,
then
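e1 := e2 evaluates to v and changes the store from M to M2[a ↦ v] under σ.
\[ \frac{\sigma, M \vdash e_1 \Rightarrow a, M_1 \quad \sigma, M_1 \vdash e_2 \Rightarrow v, M_2}{\sigma, M \vdash e_1 := e_2 \Rightarrow v, M_2[a \mapsto v]} \quad [\textsc{SetBox}] \]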
Like all the other expressions, an expression modifying a box uses the
left-to-right order. If e1 evaluates to an address a , the value associated
with a in the store changes into the value denoted by e2 . Also, the
value is the result of the whole expression. This semantics follows the
semantics of many real-world imperative languages. For example, x =
1 in C changes the value of x to 1 and results in 1. On the other hand,
functional languages usually use unit as the results of expressions for
mutation. We can easily adopt the semantics in BFAE by adding unit to
the language.
11.3 Interpreter
The following Scala code implements the syntax of BFAE (we omit the part common to FAE):
In addition, we add a new variant of Value to represent boxes (again omitting the part common to FAE).
Box( a ) represents a .
The Num, Id, and Fun cases use given stores as the results.
The Seqn, Add, Sub, and App cases do not directly modify or read stores,
but pass the stores returned from the recursive calls to the next recursive
calls or use them as results.
The NewBox case computes the initial value of the box first. Then, it com-
putes an address not used in the store. We use the method maxOption.
If a collection is empty, the method returns None. Otherwise, the re-
sult is Some(n), where n is the greatest value in the collection. By
.getOrElse(0), we can get n from Some(n) and 0 from None. Con-
sequently, sto.keys.maxOption.getOrElse(0) results in the maximum
key in the store when the store is nonempty and 0 otherwise. a is one
greater than that value and thus does not belong to the store. Therefore,
we can use a as the address of the box. The result of the function consists
of the address and the extended store.
The SetBox case evaluates both subexpressions and modifies the store.
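Putting the cases together, a store-passing interpreter for BFAE can be sketched as follows. This is a hedged illustration: the constructor names NewBox, OpenBox, SetBox, and Seqn follow the case names used in the text, while BoxV, CloV, numOp, and the type aliases are assumptions.

```scala
object BfaeSketch {
  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Add(l: Expr, r: Expr) extends Expr
  case class Sub(l: Expr, r: Expr) extends Expr
  case class Id(x: String) extends Expr
  case class Fun(x: String, b: Expr) extends Expr
  case class App(f: Expr, a: Expr) extends Expr
  case class NewBox(i: Expr) extends Expr
  case class OpenBox(b: Expr) extends Expr
  case class SetBox(b: Expr, i: Expr) extends Expr
  case class Seqn(l: Expr, r: Expr) extends Expr

  type Addr = Int
  sealed trait Value
  case class NumV(n: Int) extends Value
  case class CloV(p: String, b: Expr, e: Env) extends Value
  case class BoxV(a: Addr) extends Value
  type Env = Map[String, Value]
  type Sto = Map[Addr, Value]

  def numOp(op: (Int, Int) => Int)(l: Value, r: Value): Value = (l, r) match {
    case (NumV(n), NumV(m)) => NumV(op(n, m))
    case _                  => sys.error("operands must be integers")
  }

  // Takes a store as input and returns the result value with the output store.
  def interp(e: Expr, env: Env, sto: Sto): (Value, Sto) = e match {
    case Num(n)    => (NumV(n), sto)
    case Id(x)     => (env.getOrElse(x, sys.error(s"free identifier: $x")), sto)
    case Fun(x, b) => (CloV(x, b, env), sto)
    case Seqn(l, r) =>
      val (_, s1) = interp(l, env, sto)
      interp(r, env, s1)                  // the second expression sees s1
    case Add(l, r) =>
      val (lv, s1) = interp(l, env, sto)
      val (rv, s2) = interp(r, env, s1)
      (numOp(_ + _)(lv, rv), s2)
    case Sub(l, r) =>
      val (lv, s1) = interp(l, env, sto)
      val (rv, s2) = interp(r, env, s1)
      (numOp(_ - _)(lv, rv), s2)
    case App(f, a) =>
      val (fv, s1) = interp(f, env, sto)
      val (av, s2) = interp(a, env, s1)
      fv match {
        case CloV(x, b, fenv) => interp(b, fenv + (x -> av), s2)
        case _                => sys.error("not a closure")
      }
    case NewBox(i) =>
      val (v, s1) = interp(i, env, sto)
      val a = s1.keys.maxOption.getOrElse(0) + 1   // fresh with respect to s1
      (BoxV(a), s1 + (a -> v))
    case OpenBox(b) =>
      val (bv, s1) = interp(b, env, sto)
      bv match {
        case BoxV(a) => (s1.getOrElse(a, sys.error("unallocated address")), s1)
        case _       => sys.error("not a box")
      }
    case SetBox(b, i) =>
      val (bv, s1) = interp(b, env, sto)
      val (v, s2) = interp(i, env, s1)
      bv match {
        case BoxV(a) => (v, s2 + (a -> v))
        case _       => sys.error("not a box")
      }
  }
}
```

Each case threads the store from left to right, which is exactly the order in which the stores M, M1, and M2 appear in the rules.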
11.4 Exercises
1. Consider the following expression:
interp(Add(Id("x"), Id("y")),
Map("x" -> NumV(3), "y" -> NumV(4)),
Map.empty)
exp: x + y
env: {x -> NumV(3), y -> NumV(4)}
sto: {}
res: NumV(7) {}
exp: x
env: {x -> NumV(3), y -> NumV(4)}
sto: {}
res: NumV(3) {}
exp: y
env: {x -> NumV(3), y -> NumV(4)}
sto: {}
res: NumV(4) {}
12 Mutable Variables
12.1 Syntax . . . 122
12.2 Semantics . . . 122
12.3 Interpreter . . . 124
12.4 Call-by-Reference . . . 125
12.5 Exercises . . . 128
BFAE of the previous chapter provides boxes. Boxes are a good abstraction
of mutable objects and data structures but do not explain mutable
variables well. Boxes, mutable objects, and mutable data structures are values,
while mutable variables are names. Mutable variables allow the values
associated with names to change. We can find the notion of a mutable
variable in many real-world languages except a few functional languages
including OCaml and Haskell.
The semantics of mutable variables seems trivial. We can change the
values of mutable variables. However, if we use mutable variables with
closures, we can do many interesting things. Consider the following Scala
program:
println(counter1())
println(counter2())
println(counter1())
println(counter2())
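One program consistent with these calls is sketched below. The function makeCounter and the way counter1 and counter2 are created are assumptions made for illustration. Each call to makeCounter creates a fresh mutable variable x, and the returned closure captures that variable rather than its current value.

```scala
object CounterSketch {
  // Each call allocates a new mutable variable captured by the closure.
  def makeCounter(): () => Int = {
    var x = 0
    () => { x += 1; x }
  }

  // Returns the four values in the order the println calls would show them.
  def demo(): List[Int] = {
    val counter1 = makeCounter()
    val counter2 = makeCounter()
    List(counter1(), counter2(), counter1(), counter2())
  }
}
```

Under these assumptions the two counters advance independently, because each closure has its own x.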
12.1 Syntax
e ::= · · · | x := e
12.2 Semantics
Since MFAE provides mutation, its semantics uses store-passing just like
BFAE. Therefore, a store is a finite partial function from addresses to
values.
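\[ \mathit{Sto} = \mathit{Addr} \xrightarrow{\text{fin}} V \]
\[ M \in \mathit{Sto} \]
\[ \mathit{Env} = \mathit{Id} \xrightarrow{\text{fin}} \mathit{Addr} \]
\[ \sigma \in \mathit{Env} \]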
Rule Id
If
x is in the domain of σ , and
σ(x) is in the domain of M ,
then
x evaluates to M(σ(x)) and changes the store from M to M under σ.
Like boxes in BFAE, each variable of MFAE has its own address. New
variables can be defined only by function applications. Hence, function
applications are the only expressions that create new addresses. Let us
see the semantics of function applications.
Rule App
12 Mutable Variables 124
If
e1 evaluates to ⟨λx.e, σ′⟩ and changes the store from M to M1 under σ,
e2 evaluates to v′ and changes the store from M1 to M2 under σ,
a is not in the domain of M2, and
e evaluates to v and changes the store from M2[a ↦ v′] to M3 under σ′[x ↦ a],
then
e1 e2 evaluates to v and changes the store from M to M3 under σ.
\[ \frac{\sigma, M \vdash e_1 \Rightarrow \langle \lambda x.e, \sigma' \rangle, M_1 \quad \sigma, M_1 \vdash e_2 \Rightarrow v', M_2 \quad a \notin \mathit{Domain}(M_2) \quad \sigma'[x \mapsto a], M_2[a \mapsto v'] \vdash e \Rightarrow v, M_3}{\sigma, M \vdash e_1\ e_2 \Rightarrow v, M_3} \quad [\textsc{App}] \]
Rule Set
If
x is in the domain of σ, and
e evaluates to v and changes the store from M to M1 under σ,
then
x := e evaluates to v and changes the store from M to M1[σ(x) ↦ v] under σ.
\[
\frac{x \in \mathit{Domain}(\sigma) \quad \sigma, M \vdash e \Rightarrow v, M_1}{\sigma, M \vdash x := e \Rightarrow v, M_1[\sigma(x) \mapsto v]}\ [\textsc{Set}]
\]
12.3 Interpreter
The following Scala code implements the syntax of MFAE (the part in common with FAE is omitted):
Set( x , e ) represents x := e .
In the Id case, the function finds the address of the variable first and
then the value at the address.
In the App case, we use the same strategy as the interpreter of BFAE to
compute a new address. The body of the function is evaluated under the
extended environment and the extended store.
The Set case uses the environment to find the address of the variable.
Then, it updates the store to change the value of the variable.
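The listing that these paragraphs describe is not reproduced here. As a rough illustration of the Id, App, and Set cases, the following is a minimal sketch of a store-passing interpreter. All names (the object MFAE, Expr, Value, malloc, interp) are assumptions made for this sketch, and malloc picks a fresh address by taking the largest address in the store plus one, which may differ from the strategy used for BFAE.

```scala
object MFAE {
  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Add(l: Expr, r: Expr) extends Expr
  case class Id(x: String) extends Expr
  case class Fun(x: String, b: Expr) extends Expr
  case class App(f: Expr, a: Expr) extends Expr
  case class Set(x: String, e: Expr) extends Expr // x := e

  type Addr = Int
  type Env = Map[String, Addr] // names to addresses
  type Sto = Map[Addr, Value]  // addresses to values

  sealed trait Value
  case class NumV(n: Int) extends Value
  case class CloV(p: String, b: Expr, env: Env) extends Value

  // Assumed allocation strategy: one more than the largest used address.
  def malloc(sto: Sto): Addr = sto.keys.foldLeft(0)(_ max _) + 1

  def interp(e: Expr, env: Env, sto: Sto): (Value, Sto) = e match {
    case Num(n) => (NumV(n), sto)
    case Add(l, r) =>
      val (lv, s1) = interp(l, env, sto)
      val (rv, s2) = interp(r, env, s1)
      (lv, rv) match {
        case (NumV(n), NumV(m)) => (NumV(n + m), s2)
        case _ => sys.error("not both numbers")
      }
    // Rule Id: find the address first, then the value at the address.
    case Id(x) => (sto(env(x)), sto)
    case Fun(x, b) => (CloV(x, b, env), sto)
    // Rule App: allocate a fresh address for the parameter.
    case App(f, a) =>
      val (fv, s1) = interp(f, env, sto)
      val (av, s2) = interp(a, env, s1)
      fv match {
        case CloV(x, b, fenv) =>
          val addr = malloc(s2)
          interp(b, fenv + (x -> addr), s2 + (addr -> av))
        case _ => sys.error("not a closure")
      }
    // Rule Set: update the store at the address of the variable.
    case Set(x, e1) =>
      val (v, s1) = interp(e1, env, sto)
      (v, s1 + (env(x) -> v))
  }
}
```

For example, under this sketch the expression (λx.(x := 5) + x) 1 evaluates to NumV(10), because the assignment is evaluated before the read of x.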
12.4 Call-by-Reference
int a = 1, b = 2;
swap(a, b);
std::cout << a << " " << b << std::endl;
They expect the program to print 2 1 as swap has been called. On the
contrary, their expectation is wrong. The result is 1 2. We can explain
the reason based on the content of this chapter. When swap is called, two
new fresh addresses are allocated for x and y. The values of a and b are
copied and stored in the addresses, respectively. The function affects only
the values in the addresses of x and y. It never touches the addresses of a
and b. As a consequence, while the values of x and y are swapped, the
values of a and b are not.
This is the usual semantics of function applications. The values of
arguments are copied and saved at fresh addresses. This semantics is
called call-by-value (CBV) as function calls pass the values of arguments.
People have explored another semantics for function applications to
implement functions like swap easily. The semantics is called call-by-
reference (CBR). In this semantics, function calls pass the references, i.e.
addresses, when variables are used as arguments.
The following rule defines the semantics of a function application using
CBR when its argument is a variable:
Rule App-Cbr
If
e1 evaluates to ⟨λx′.e′, σ′⟩ and changes the store from M to M1 under σ,
x is in the domain of σ, and
e′ evaluates to v and changes the store from M1 to M2 under σ′[x′ ↦ σ(x)],
then
e1 x evaluates to v and changes the store from M to M2 under σ.
\[
\frac{\sigma, M \vdash e_1 \Rightarrow \langle \lambda x'.e', \sigma' \rangle, M_1 \quad x \in \mathit{Domain}(\sigma) \quad \sigma'[x' \mapsto \sigma(x)], M_1 \vdash e' \Rightarrow v, M_2}{\sigma, M \vdash e_1\ x \Rightarrow v, M_2}\ [\textsc{App-Cbr}]
\]
The rule does not evaluate the argument to get a value. It simply uses
the address of the variable. Then, the parameter of the function has
exactly the same address as the argument. Any change in the parameter that
happens in the function body affects the variable outside the function.
We say that the parameter is an alias of the argument as they share the
same address.
Even if we want to adopt the CBR semantics in MFAE, we cannot use it
when the argument is not a variable. We cannot get an address from an
expression that is not a variable. In such cases, we fall back to the CBV
semantics. The following rule specifies such cases:
Rule App-Cbv
If
e1 evaluates to ⟨λx.e, σ′⟩ and changes the store from M to M1 under σ,
e2 is not an identifier,
e2 evaluates to v′ and changes the store from M1 to M2 under σ,
a is not in the domain of M2, and
e evaluates to v and changes the store from M2[a ↦ v′] to M3 under σ′[x ↦ a],
then
e1 e2 evaluates to v and changes the store from M to M3 under σ.
It is the same as Rule App except that it has one more premise to ensure
that the argument is not a variable.
The interpreter needs the following change:
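The change itself is not reproduced here. The following is a minimal sketch of how Rule App-Cbr and Rule App-Cbv might be combined in the App case, restricted to a small subset of the language. All names (the object MFAECbr, malloc, interp) are assumptions made for this sketch rather than the definitions of the book.

```scala
object MFAECbr {
  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Id(x: String) extends Expr
  case class Fun(x: String, b: Expr) extends Expr
  case class App(f: Expr, a: Expr) extends Expr
  case class Set(x: String, e: Expr) extends Expr // x := e

  type Addr = Int
  type Env = Map[String, Addr]
  type Sto = Map[Addr, Value]

  sealed trait Value
  case class NumV(n: Int) extends Value
  case class CloV(p: String, b: Expr, env: Env) extends Value

  // Assumed allocation strategy: one more than the largest used address.
  def malloc(sto: Sto): Addr = sto.keys.foldLeft(0)(_ max _) + 1

  def interp(e: Expr, env: Env, sto: Sto): (Value, Sto) = e match {
    case Num(n) => (NumV(n), sto)
    case Id(x) => (sto(env(x)), sto)
    case Fun(x, b) => (CloV(x, b, env), sto)
    case Set(x, e1) =>
      val (v, s1) = interp(e1, env, sto)
      (v, s1 + (env(x) -> v))
    case App(f, a) =>
      val (fv, s1) = interp(f, env, sto)
      fv match {
        case CloV(x, b, fenv) => a match {
          // Rule App-Cbr: the parameter becomes an alias of the variable.
          case Id(y) => interp(b, fenv + (x -> env(y)), s1)
          // Rule App-Cbv: evaluate the argument and copy its value
          // to a fresh address.
          case _ =>
            val (av, s2) = interp(a, env, s1)
            val addr = malloc(s2)
            interp(b, fenv + (x -> addr), s2 + (addr -> av))
        }
        case _ => sys.error("not a closure")
      }
  }
}
```

In this sketch, applying λx.(x := 5) to a variable y changes the value stored at the address of y, since x and y share the address.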
int a = 1, b = 2;
swap(a, b);
std::cout << a << " " << b << std::endl;
It is enough to fix only the first line to make the parameters use CBR.
When swap is applied to a and b, the addresses of a and b are passed.
The address of x is the same as that of a, and the address of y is the same
as that of b. Therefore, the function swaps not only the values of x and y
but also the values of a and b. The program prints 2 1 as intended.
12.5 Exercises
1. This exercise extends MFAE with pointers. Consider the following
language:
e ::= · · · | ∗e | &x | ∗e := e
v ::= · · · | a
The semantics of some constructs are as follows:
• The value of ∗e is the value in the store at the address denoted
by the expression.
• The value of &x is the address denoted by the identifier in
the environment.
• The evaluation of ∗e1 := e2 evaluates e2 first, which is the value
of the whole expression. Then, it evaluates e1, and it maps the
address denoted by e1 to the value of e2.
Write the operational semantics of the form σ, M ⊢ e ⇒ v, M for
the expressions.
2. The following code is an excerpt from the implementation of the
interpreter for MFAE:
x * x
}
13.1 Semantics
v ::= · · · | (e, σ)
⇓ ⊆ V × V
Rule Strict-Num
n strictly evaluates to n.
n ⇓ n [Strict-Num]
Rule Strict-Clo
⟨λx.e, σ⟩ strictly evaluates to ⟨λx.e, σ⟩.
Rule Strict-Expr
If e evaluates to v1 under σ, and v1 strictly evaluates to v2, then (e, σ)
strictly evaluates to v2.
\[
\frac{\sigma \vdash e \Rightarrow v_1 \quad v_1 \Downarrow v_2}{(e, \sigma) \Downarrow v_2}\ [\textsc{Strict-Expr}]
\]
Rule App
If
e1 evaluates to v1 under σ,
v1 strictly evaluates to ⟨λx.e, σ′⟩, and
e evaluates to v under σ′[x ↦ (e2, σ)],
then
e1 e2 evaluates to v under σ.
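Rule Add
If
e1 evaluates to v1 under σ,
v1 strictly evaluates to n1,
e2 evaluates to v2 under σ, and
v2 strictly evaluates to n2,
then
e1 + e2 evaluates to n1 + n2 under σ.
\[
\frac{\sigma \vdash e_1 \Rightarrow v_1 \quad v_1 \Downarrow n_1 \quad \sigma \vdash e_2 \Rightarrow v_2 \quad v_2 \Downarrow n_2}{\sigma \vdash e_1 + e_2 \Rightarrow n_1 + n_2}\ [\textsc{Add}]
\]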
Rule Sub
If
e1 evaluates to v1 under σ,
v1 strictly evaluates to n1,
e2 evaluates to v2 under σ, and
v2 strictly evaluates to n2,
then
e1 − e2 evaluates to n1 − n2 under σ.
\[
\frac{\sigma \vdash e_1 \Rightarrow v_1 \quad v_1 \Downarrow n_1 \quad \sigma \vdash e_2 \Rightarrow v_2 \quad v_2 \Downarrow n_2}{\sigma \vdash e_1 - e_2 \Rightarrow n_1 - n_2}\ [\textsc{Sub}]
\]
There is nothing difficult. They are similar to the rules of FAE but
additionally require strict evaluation since addition and subtraction are
possible only by using integers, not expression-values.
The semantics is a correct instance of CBN but has a flaw from a practical
perspective. Consider (λx.x) (1 + 1). It results in (1 + 1, ∅), not 2. Most
programmers are likely to prefer 2 as a result. We need to apply one last
strict evaluation at the end of the evaluation to resolve the problem. It is
to say that “the result of a program e is v when ∅ ⊢ e ⇒ v′ and v′ ⇓ v.”
Note that it is different from applying strict evaluation to the evaluation
of every expression in the program. Strict evaluation is applied to only
the result of the whole expression, which is the program. In this way, we
can make the result of the above expression 2 and eliminate the flaw.⁵
5: It is not a flaw in real-world programming languages like Haskell. A
program shows its result by output operations (e.g. to files) rather than
the value of a single expression. Each output operation applies strict
evaluation to its argument (like Rule Add, Rule Sub, and Rule App in LFAE),
and the value of each expression does not need to be a normal value.
If evaluating an expression in the CBV semantics results in a value, then
the CBN semantics yields the same value. It is known as a corollary of
the standardization theorem of lambda calculus [Rey09]. Note that it is
true only in languages without side effects. The result of an expression
with side effects varies in the order of the evaluation. For example, if an
argument is an expression changing the value of a box, and the body of
the function reads the value of the box without using the argument, the
program can behave differently in CBV and CBN. In CBV, the read value
will be the value after the update. On the other hand, in CBN, the update
never happens, and the read value will be the original value of the box.
While the CBN semantics preserves the results of the CBV semantics, the
converse is false even without mutation, i.e. there are expressions that
yield results only in CBN. For instance, consider a function application
whose argument is a nonterminating expression. If the function returns
zero without using the argument, evaluation with CBN results in zero,
while evaluation with CBV does not terminate.
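The last point can be observed in Scala, whose by-name parameters behave like CBN. The following is a minimal sketch; the names CbvVsCbn, loop, and zeroByName are hypothetical and chosen only for this illustration.

```scala
object CbvVsCbn {
  // A nonterminating computation: calling it never returns.
  def loop(): Int = loop()

  // By-name parameter: the argument is evaluated only where x is used.
  // Since the body never uses x, the argument is never evaluated.
  def zeroByName(x: => Int): Int = 0

  // With an ordinary (by-value) parameter, e.g.
  //   def zeroByValue(x: Int): Int = 0
  // the call zeroByValue(loop()) would not terminate.
  val result: Int = zeroByName(loop()) // 0
}
```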
13.2 Interpreter
13.3 Call-by-Need
The way to solve this problem is to store the value of an argument and
use the value again. This strategy is as optimal as CBV when a parameter
appears multiple times; it is as optimal as CBN when a parameter is
not used at all. For programmers, it is tedious to implement such logic
in their programs by themselves. Instead, programming languages can
provide the optimization. This optimization is called call-by-need as each
argument is evaluated based on need for its value. It is evaluated once if
needed and is not otherwise.
Call-by-need is not a different semantics from CBN in purely functional
languages. The behaviors of a program in call-by-need and CBN are
exactly the same. Call-by-need is just an optimization strategy of interpreters
and compilers. On the other hand, call-by-need is a different semantics
from CBN in languages with side effects. In such languages, the number
of times a certain expression is computed can affect the result. For example,
consider an argument that is an expression that increases the value of
a box by one. Suppose that its value is used twice in the function body.
Then, the value of the box increases by two in CBN, while it increases by
one in call-by-need.
Since LFAE lacks side effects, we can adopt call-by-need in the language
as an optimization of the interpreter. There is no need to newly define the
call-by-need version of the semantics.
To store the strict value of an expression-value, we add a new field to the
class ExprV.
It checks whether there exists a cached value. If it is the case, the function
simply returns the cached value. Otherwise, e is evaluated under env
like before. In addition, the function stores the value in v.
The function interp needs only one fix. When a new ExprV instance is
created in the App case, one additional argument is required to initialize
the field v.
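The listings for these changes are not reproduced here. The following is a minimal sketch of how the cache might look, assuming the field v has type Option[Value] and the strict evaluation function is called strict. The object name LFAE and the counter evalCount are hypothetical; the counter exists only to make the caching observable.

```scala
object LFAE {
  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Add(l: Expr, r: Expr) extends Expr
  case class Id(x: String) extends Expr
  case class Fun(x: String, b: Expr) extends Expr
  case class App(f: Expr, a: Expr) extends Expr

  type Env = Map[String, Value]

  sealed trait Value
  case class NumV(n: Int) extends Value
  case class CloV(p: String, b: Expr, env: Env) extends Value
  // v caches the strict value; None means "not evaluated yet".
  case class ExprV(e: Expr, env: Env, var v: Option[Value]) extends Value

  // Hypothetical counter: how many expression-values were actually evaluated.
  var evalCount = 0

  def strict(value: Value): Value = value match {
    case ev @ ExprV(e, env, None) =>
      evalCount += 1
      val cache = strict(interp(e, env))
      ev.v = Some(cache) // store the value for later uses
      cache
    case ExprV(_, _, Some(cache)) => cache // reuse the cached value
    case _ => value
  }

  def interp(e: Expr, env: Env): Value = e match {
    case Num(n) => NumV(n)
    case Add(l, r) =>
      (strict(interp(l, env)), strict(interp(r, env))) match {
        case (NumV(n), NumV(m)) => NumV(n + m)
        case _ => sys.error("not both numbers")
      }
    case Id(x) => env(x)
    case Fun(x, b) => CloV(x, b, env)
    case App(f, a) => strict(interp(f, env)) match {
      // The extra argument None initializes the cache field.
      case CloV(x, b, fenv) => interp(b, fenv + (x -> ExprV(a, env, None)))
      case _ => sys.error("not a closure")
    }
  }
}
```

In this sketch, evaluating (λx.x + x) (1 + 2) uses x twice but evaluates 1 + 2 only once.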
lazy val x = {
println(1)
1
}
val y = x + x
The program prints 1 only once. By using both a by-name parameter and a
lazy variable, we can simulate the call-by-need semantics in Scala.
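As a minimal sketch of this combination, the following compares a plain by-name parameter with a by-name parameter cached in a lazy variable. The names CallByNeedDemo, evaluations, arg, twiceByName, twiceByNeed, and ignore are hypothetical and exist only for this illustration.

```scala
object CallByNeedDemo {
  // Counts how many times the argument expression is evaluated.
  var evaluations = 0
  def arg(): Int = { evaluations += 1; 21 }

  // CBN: the argument is re-evaluated at every use of x.
  def twiceByName(x: => Int): Int = x + x

  // Call-by-need: the lazy variable evaluates x at most once and caches it.
  def twiceByNeed(x: => Int): Int = {
    lazy val cached = x
    cached + cached
  }

  // The argument is never used, so it is never evaluated.
  def ignore(x: => Int): Int = 0
}
```

Here, twiceByName(arg()) evaluates the argument twice, twiceByNeed(arg()) evaluates it once, and ignore(arg()) does not evaluate it at all.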
13.4 Exercises
1. Which of the following produce different results in a CBV language
and a CBN language? Both produce the same result if they both
produce the same number or they both produce closures (even if
they do not behave exactly the same when applied).
a) (λy.y 3) (λ x.1 2)
b) (λy.y λ x.10) (λ x.x (1 2))
c) (λy.y λ x.10) (λ x.1 2)
d) (λy.y) (1 + λ x.x)
e) (λy.1 + 2) (1 + λx.x)
2. Show the results of each expression in a CBV language and a CBN
language.
a) (λx.8) + 10
b) (λx.8) (1 2)
c) λx.((λy.42) (9 2))
d) 1 + ((λ x.x + 13) (1 + λ y.7))
e) 1 + ((λ x.1 + 13) (1 + λ y.7))
3. Note that there is a recursive call in the following function:
Write an example LFAE expression showing the need for the recursive
strict call.
4. Consider the following expression:
val f= λ x.y + 7 in
val y=5 in
f (42 + λ y.3)
foo(3)
Control diverters are useful for writing programs with complex control
flows. For instance, consider a function numOfWordsInFile that takes the
name of a file and a string as arguments and returns how many times
the string occurs in the file.
If such a file does not exist, the function returns -1. When the file is read
for the first time, its content is cached, so the function must check the
cache first to reuse the cached result if available.
Assume that we have the following helper functions:
The code looks fine, but we cannot directly express the idea that the
function needs to call numOfWords except in one erroneous case, where
neither the cache nor the file is found. It is not a big flaw in the current
implementation of numOfWordsInFile. However, if we write a function
with a large number of conditions, we would prefer using return (Figure 14.1)
to calling numOfWords multiple times (Figure 14.2).
Like mutation, control diverters make languages impure. In pure
languages, the order of evaluation does not matter. Each expression only
produces a result; there is no other side effect. On the other hand, in impure
languages, the order of evaluation matters. Expressions can perform
side effects, including mutation and control flow changes. Evaluating a
certain expression can change the result of other expressions or prevent
other expressions from being evaluated. Therefore, programs written in impure
languages require global reasoning, while programs written in pure
languages require only local, modular reasoning. Mutation and control
diverters make reasoning about programs difficult despite their usefulness.
Programmers must use control diverters with extra care.
...
  else if (F)
    f
  else
    return -1
  numOfWords(content, word)
}
Figure 14.1: numOfWordsInFile with return
...
  else if (F)
    numOfWords(f, word)
  else
    -1
}
Figure 14.2: numOfWordsInFile without return
value 10.
We can split N steps of computation into two parts: the former n steps
and the remaining N − n steps. We call the expression evaluated by the
former n steps a redex² and the remaining computation described by the
latter N − n steps the continuation of the redex.
2: The term redex stands for a reducible expression. However, we introduce
the notion of a redex without explaining what “reduction” or “reducible” is.
The notion can be understood without knowing what reduction is, so we do
not care about the origin of the term.
For example, if we split the above steps into step 1 and steps 2-7, then the
redex is the expression 1, and the continuation consists of the remaining
steps. The important point is that the continuation requires the result of
the redex to complete the evaluation. Without 1, the result of step 1, the
continuation cannot proceed beyond step 3. Step 3 can be accomplished
only when the result of the redex is provided. Therefore, we can consider
the continuation as an expression with a hole that must be filled with
the result of a redex. Intuitively, the continuation of 1 can be written
as (□ + 2) + (3 + 4), where □ denotes the place in which the result of
the redex is used. Since the continuation takes the result of a redex as
input and completes the remaining computation, the continuation can be
interpreted as a function. Following this interpretation, we can express
the continuation of 1 as λx.(x + 2) + (3 + 4).
There are multiple ways to split the steps. The following table shows
three different ways of splitting the evaluation of (1 + 2) + (3 + 4) to find
a redex and the continuation.
Redex                        Continuation
Steps  Expression            Steps  Hole                Function
1      1                     2-7    (□ + 2) + (3 + 4)   λx.(x + 2) + (3 + 4)
1-3    1 + 2                 4-7    □ + (3 + 4)         λx.x + (3 + 4)
1-7    (1 + 2) + (3 + 4)     ·      □                   λx.x
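The three splits in the table can be checked with a minimal Scala sketch, in which each continuation is written as a function and applied to the value of its redex. The names ContinuationSplit, k1, k2, k3, and results are hypothetical.

```scala
object ContinuationSplit {
  // Each continuation is the function form shown in the table.
  val k1: Int => Int = x => (x + 2) + (3 + 4) // continuation of the redex 1
  val k2: Int => Int = x => x + (3 + 4)       // continuation of the redex 1 + 2
  val k3: Int => Int = x => x                 // continuation of the whole expression

  // Applying each continuation to the value of its redex gives the same result.
  val results: List[Int] = List(k1(1), k2(1 + 2), k3((1 + 2) + (3 + 4)))
}
```

Every element of results is 10, the value of (1 + 2) + (3 + 4), regardless of where the evaluation is split.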
Note that 2, 3, 4, and 3 + 4 are not redexes, while 1, 1 + 2, and (1 + 2) + (3 + 4)
are redexes. A redex is an expression that can be evaluated first. Since
2 cannot be evaluated until 1 is evaluated, 2 is not a redex. Similarly, 3,
4, and 3 + 4 cannot be evaluated until 1 + 2 is evaluated, so they are not
redexes. On the other hand, 1 + 2 is a redex because nothing needs
to be done before the evaluation of 1 + 2.
Since a continuation also consists of multiple steps of computation, it can
be split again into a redex and the continuation of the redex. For example,
consider the continuation of 1, which consists of steps 2-7. If we split it into
step 2 and steps 3-7, the redex is the expression 2, and the continuation
is (1̲ + □) + (3 + 4). Here, the line below 1 expresses that 1 is an integer
value, not an expression.
Therefore, evaluation of an expression repeats evaluation of a redex and
application of a continuation. A given expression splits into a redex and
a continuation. The redex evaluates to a value, and the continuation is
applied to the value. Then, the continuation splits again into a redex and
a continuation, and the redex is evaluated. This process repeats until
there is no more remaining computation, i.e. the continuation becomes
the identity function.
The variable sto denotes a mutable map. The function interp depends
on sto, instead of passing stores as an argument and a return value. The
Add case does not pass the resulting store from the evaluation of l to the
evaluation of r. The NewBox case simply mutates sto to create a new box.
Note that sto += (a -> v) mutates the map by adding a mapping from
a to v. Store passing is unnecessary since there is a global, mutable map,
which records every update.
Two code snippets clearly compare interpreters with and without store
passing. In the former, with store passing, the current store at each point
of execution is explicit. When we see the Add case, it is clear that sto is
used for the evaluation of l, which may change the store, and the resulting
store of l is used for the evaluation of r. However, in the latter, without
store passing, the current store at each point is implicit. The code does not
reveal the fact that interp(l, env) can change the store and, therefore,
affect the result of interp(r, env). Implementation with store-passing
style explicitly shows the use and flow of stores by passing stores from
functions to functions, while implementation without store-passing style
hides the use and flow of stores and makes the code shorter.
CPS is similar to store-passing style. The difference is that CPS passes
continuations, while store-passing style passes stores. Just as
store-passing style exposes the store used by each function application, CPS
exposes the continuation of each function application.
This section illustrates how we can write programs in CPS by giving
factorial as an example. Consider a function calculating the factorial of a
given integer. The following function does not use CPS:
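The listing is not reproduced here. A minimal sketch of such a function, assuming the name factorial used in the surrounding text and a base case for inputs not greater than one, could be as follows; the wrapper object FactorialDirect is hypothetical.

```scala
object FactorialDirect {
  // Direct style: the result of the recursive call is returned to the
  // caller, which then multiplies it by n. The continuation is implicit.
  def factorial(n: Int): Int =
    if (n <= 1) 1
    else n * factorial(n - 1)
}
```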
Since factorial does not use CPS, the continuation is implicit. For exam-
ple, in factorial(5) + 1, the continuation of factorial(5) is to add 1
to the result, i.e. x => x + 1. Although the continuation of factorial(5)
does exist and is executed during the evaluation of factorial(5) +
1, we cannot find x => x + 1 in the code per se. The reason is that
factorial does not use CPS.
Let us transform this function to use CPS. Since each function in CPS
takes a continuation as an argument, the first thing to do is to add a
parameter to a function. The continuation of a function application uses
the return value of certain computation. Therefore, a continuation can
be interpreted as a function that takes the return value as input. In the
case of factorial, the continuation takes an integer as input. On the
other hand, there is no restriction on what the continuation computes;
it can do whatever it wants. In factorial(5) + 1, the continuation of
factorial(5) results in an integer. At the same time, factorial(5) +
1 results in an integer, too. In 120 == factorial(5), the continuation of
factorial(5), which is x => 120 == x, results in a boolean. The whole
expression 120 == factorial(5) also results in a boolean. Therefore,
the output of a continuation can have any type, but the type must be the
same as the type of the whole expression.
Based on these observations, we can define the type of the continuation
of factorial. It is a function type whose parameter type is Int. The
return type can be any type, but for brevity, we fix the return type to
Int.
Note that f(if (e1) e2 else e3) is the same as if (e1) f(e2) else
f(e3). Therefore, the above code is equivalent to the following code:
The function uses CPS because its recursive call explicitly passes the con-
tinuation as an argument. When n is greater than one, factorialCps(n,
k) computes (n − 1)!, multiplies the result by n, and applies k to the result
of the multiplication. The first step, computing (n − 1)!, is done by calling
factorialCps itself. The subsequent two steps are the continuation of
the recursive call. In the implementation, the continuation is x => k(n *
x). It exactly coincides with the aforementioned steps: multiplying the
result by n and applying k.
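The description above can be sketched as follows, assuming the continuation type is fixed to Int => Int as stated earlier and that the base case applies k to 1. The wrapper object FactorialCpsInt is hypothetical.

```scala
object FactorialCpsInt {
  // k is the continuation: what to do with the factorial of n.
  def factorialCps(n: Int, k: Int => Int): Int =
    if (n <= 1) k(1) // nothing left to compute: pass 1 to the continuation
    else
      // Compute (n - 1)! first; its continuation multiplies the result
      // by n and then applies k.
      factorialCps(n - 1, x => k(n * x))
}
```

Under these assumptions, factorialCps(5, x => x) yields 120, and factorialCps(5, x => x + 1) yields 121.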
Now, we can compute 5! with factorialCps by writing factorialCps(5,
x => x). The continuation is x => x because there is nothing more to do
with 5!, which is the desired result. In factorial(5), the continuation
is implicit since x => x is not written in the code. On the other hand, x
=> x is explicitly written in factorialCps(5, x => x), which clearly
illustrates the main characteristic of CPS. Similarly, to compute 5! + 1, we
can write factorialCps(5, x => x + 1) instead of factorial(5) +
1. To obtain 5! + 1 from 5!, the only thing to do is adding 1. Therefore, the
continuation is x => x + 1. Just like before, the code with factorialCps
directly shows the continuation, while the code with factorial does
not.
Since the output type of a continuation is T, any code using factorial can
be rewritten with factorialCps. For example, factorial(5) % 2 == 0
checks whether 5! is an even integer. It is equivalent to factorialCps(5,
x => x % 2 == 0), which explicitly shows the continuation. Similarly,
println(factorial(5)) prints 120, which is 5!. It is the same as factorialCps(5,
println), which also reveals the continuation.
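A minimal sketch of a version whose continuation has a generic output type T is shown below. The type parameter follows the text, while the wrapper object FactorialCpsGeneric and the base case are assumptions.

```scala
object FactorialCpsGeneric {
  // The continuation takes an Int and may produce a value of any type T,
  // which is also the type of the whole computation.
  def factorialCps[T](n: Int, k: Int => T): T =
    if (n <= 1) k(1)
    else factorialCps[T](n - 1, x => k(n * x))

  // Is 5! an even integer? The continuation results in a boolean.
  val isEven: Boolean = factorialCps(5, x => x % 2 == 0) // true

  // The identity continuation recovers the plain result.
  val plain: Int = factorialCps(5, x => x) // 120
}
```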
These are not individual ones; they are connected and express the same
idea. Since a continuation is given as an argument, the only way to finish a
computation is calling the continuation. Therefore, a continuation is used
once and only once in a function body. Also, there is no need to do
additional computation with the return value of a function application.
The continuation does every additional computation with the return
value, so return values are not used at all. Since we do not use return
values, every call is a tail call. Once a function calls another function, the
result of the callee is the result of the caller. Moreover, there is no way of
returning from a function without calling any function. If the function is
the last step of a computation, it must call its continuation. Otherwise,
it needs to call another function to proceed with the computation. Therefore,
every function ends with a function call.
While CPS may seem to be needlessly complex, it is useful in various
cases. If we compare factorial and factorialCps, the former looks
more concise. It is difficult to implement programs correctly in CPS.
One benefit of CPS is that it makes every function call a tail call.
If implementation languages support tail-call optimization, CPS can
be used to avoid stack overflow. However, this book uses Scala, which
optimizes only tail-recursive calls. Scala programs written in CPS can
suffer from stack overflow despite the use of CPS. Then, why does this
section introduce CPS? The first reason is to help readers understand the
notion of a continuation. The other reason is that the characteristic of
CPS, passing a continuation explicitly as a value, is sometimes useful.
The next chapter shows such an example: an interpreter of a language
with first-class continuations. We will see how CPS can contribute to the
implementation of an interpreter in the next chapter.
The remaining cases are Add, Sub, and App. They are similar in the sense
that each sort of expression consists of two subexpressions, so if you
understand one of them, the others are straightforward. Let us consider
the Add case first. The previous implementation is as follows:
where add(v1, v2) denotes val NumV(n) = v1; val NumV(m) = v2;
NumV(n + m). Since interpCps(e, env, k) equals k(interp(e, env)),
we can start from the following code:
(v1 => {
val v2 = interp(r, env)
k(add(v1, v2))
})(interp(l, env))
(v2 =>
k(add(v1, v2))
)(interp(r, env))
Then, we can use this new expression as the body of the continuation.
The App case is also similar but needs extra care. The previous implemen-
tation is as follows:
Unlike Add and Sub, an interp function call still exists. It is not CPS
because the result of the function call is used by being passed to k.
Replacing interp with interpCps resolves the problem.
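Putting the cases together, the following is a minimal sketch of what an interpreter of FAE in CPS might look like. The names add and interpCps follow the text; the object FAECps, the type alias Cont, and the other definitions are assumptions made for this sketch.

```scala
object FAECps {
  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Add(l: Expr, r: Expr) extends Expr
  case class Sub(l: Expr, r: Expr) extends Expr
  case class Id(x: String) extends Expr
  case class Fun(x: String, b: Expr) extends Expr
  case class App(f: Expr, a: Expr) extends Expr

  type Env = Map[String, Value]
  type Cont = Value => Value // a continuation takes the value of a redex

  sealed trait Value
  case class NumV(n: Int) extends Value
  case class CloV(p: String, b: Expr, env: Env) extends Value

  def add(v1: Value, v2: Value): Value = (v1, v2) match {
    case (NumV(n), NumV(m)) => NumV(n + m)
    case _ => sys.error("not both numbers")
  }

  def sub(v1: Value, v2: Value): Value = (v1, v2) match {
    case (NumV(n), NumV(m)) => NumV(n - m)
    case _ => sys.error("not both numbers")
  }

  def interpCps(e: Expr, env: Env, k: Cont): Value = e match {
    case Num(n) => k(NumV(n))
    case Id(x) => k(env(x))
    case Fun(x, b) => k(CloV(x, b, env))
    // Evaluate l, then r, then apply k to the sum.
    case Add(l, r) =>
      interpCps(l, env, v1 => interpCps(r, env, v2 => k(add(v1, v2))))
    case Sub(l, r) =>
      interpCps(l, env, v1 => interpCps(r, env, v2 => k(sub(v1, v2))))
    // The body is evaluated with the same continuation k, so the result
    // of a call is never used by the caller after it returns.
    case App(f, a) =>
      interpCps(f, env, fv =>
        interpCps(a, env, av => fv match {
          case CloV(x, b, fenv) => interpCps(b, fenv + (x -> av), k)
          case _ => sys.error("not a closure")
        }))
  }
}
```

For instance, interpCps applied to (λx.λy.x + y) 1 2 with the empty environment and the identity continuation yields NumV(3) in this sketch.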
\[
\frac{\sigma \vdash e_1 \Rightarrow n_1 \quad \sigma \vdash e_2 \Rightarrow n_2}{\sigma \vdash e_1 + e_2 \Rightarrow n_1 + n_2}
\]
Reflexive relations
Let A be a set and R be a binary relation over A and A.
If (a, a) ∈ R for every a ∈ A, R is reflexive.
Transitive relations
Let A be a set and R be a binary relation over A and A.
If (a, b), (b, c) ∈ R implies (a, c) ∈ R for every a, b, c ∈ A, R is
transitive.
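\[
k \mathbin{||} s \rightarrow^{*} k \mathbin{||} s
\qquad
\frac{k_1 \mathbin{||} s_1 \rightarrow^{*} k_2 \mathbin{||} s_2 \quad k_2 \mathbin{||} s_2 \rightarrow k_3 \mathbin{||} s_3}{k_1 \mathbin{||} s_1 \rightarrow^{*} k_3 \mathbin{||} s_3}
\]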
σ ⊢ n :: k || s → k || n :: s [Red-Num]
σ ⊢ x :: k || s → k || σ(x) :: s [Red-Id]
σ ⊢ e1 + e2 :: k || s → σ ⊢ e1 :: σ ⊢ e2 :: (+) :: k || s [Red-Add1]
(+) :: k || n2 :: n1 :: s → k || n1 + n2 :: s [Red-Add2]
σ ⊢ e1 + e2 :: k || s
→ σ ⊢ e1 :: σ ⊢ e2 :: (+) :: k || s
→∗ σ ⊢ e2 :: (+) :: k || n1 :: s
→∗ (+) :: k || n2 :: n1 :: s
→ k || n1 + n2 :: s
We can define the rules for the Sub case in a similar way:
σ ⊢ e1 − e2 :: k || s → σ ⊢ e1 :: σ ⊢ e2 :: (−) :: k || s [Red-Sub1]
(−) :: k || n2 :: n1 :: s → k || n1 − n2 :: s [Red-Sub2]
σ ⊢ e1 e2 :: k || s → σ ⊢ e1 :: σ ⊢ e2 :: (@) :: k || s [Red-App1]
However, the last step is a bit different. In Add and Sub, the last step
applies the continuation to a certain value, which is obtained by addition
or subtraction. In App, the body of the function must be evaluated. Thus,
we define the rule to evaluate the body with the same continuation
instead of directly applying the continuation to a particular value.
σ ⊢ e1 e2 :: k || s
→ σ ⊢ e1 :: σ ⊢ e2 :: (@) :: k || s
→∗ σ ⊢ e2 :: (@) :: k || ⟨λx.e, σ′⟩ :: s
→∗ (@) :: k || v2 :: ⟨λx.e, σ′⟩ :: s
→ σ′[x ↦ v2] ⊢ e :: k || s
→∗ k || v :: s
∅ ⊢ (1 + 2) − (3 + 4) :: □ || ■
→ ∅ ⊢ 1 + 2 :: ∅ ⊢ 3 + 4 :: (−) :: □ || ■
→ ∅ ⊢ 1 :: ∅ ⊢ 2 :: (+) :: ∅ ⊢ 3 + 4 :: (−) :: □ || ■
→ ∅ ⊢ 2 :: (+) :: ∅ ⊢ 3 + 4 :: (−) :: □ || 1 :: ■
→ (+) :: ∅ ⊢ 3 + 4 :: (−) :: □ || 2 :: 1 :: ■
→ ∅ ⊢ 3 + 4 :: (−) :: □ || 3 :: ■
→ ∅ ⊢ 3 :: ∅ ⊢ 4 :: (+) :: (−) :: □ || 3 :: ■
→ ∅ ⊢ 4 :: (+) :: (−) :: □ || 3 :: 3 :: ■
→ (+) :: (−) :: □ || 4 :: 3 :: 3 :: ■
→ (−) :: □ || 7 :: 3 :: ■
→ □ || −4 :: ■
∅ ⊢ e 1 2 :: □ || ■
→ ∅ ⊢ e 1 :: ∅ ⊢ 2 :: (@) :: □ || ■
→ ∅ ⊢ e :: ∅ ⊢ 1 :: (@) :: ∅ ⊢ 2 :: (@) :: □ || ■
→ ∅ ⊢ 1 :: (@) :: ∅ ⊢ 2 :: (@) :: □ || ⟨e, ∅⟩ :: ■
→ (@) :: ∅ ⊢ 2 :: (@) :: □ || 1 :: ⟨e, ∅⟩ :: ■
→ σ1 ⊢ λy.x + y :: ∅ ⊢ 2 :: (@) :: □ || ■
→ ∅ ⊢ 2 :: (@) :: □ || ⟨λy.x + y, σ1⟩ :: ■
→ (@) :: □ || 2 :: ⟨λy.x + y, σ1⟩ :: ■
→ σ2 ⊢ x + y :: □ || ■
→ σ2 ⊢ x :: σ2 ⊢ y :: (+) :: □ || ■
→ σ2 ⊢ y :: (+) :: □ || 1 :: ■
→ (+) :: □ || 2 :: 1 :: ■
→ □ || 3 :: ■
where
e = λx.λy.x + y
σ1 = [x ↦ 1]
σ2 = [x ↦ 1, y ↦ 2]
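The reduction rules for numbers, addition, and subtraction can be tried out with a minimal Scala sketch of the machine. It is restricted to expressions without identifiers and functions, so the environment is omitted; the names SmallStep, Work, Eval, DoAdd, DoSub, step, run, and eval are hypothetical. A continuation is represented as a list of pending work items and the value stack as a list of integers.

```scala
object SmallStep {
  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Add(l: Expr, r: Expr) extends Expr
  case class Sub(l: Expr, r: Expr) extends Expr

  // Items of a continuation.
  sealed trait Work
  case class Eval(e: Expr) extends Work // corresponds to "σ ⊢ e", with σ omitted
  case object DoAdd extends Work        // corresponds to (+)
  case object DoSub extends Work        // corresponds to (−)

  type Cont = List[Work]
  type Stack = List[Int]

  // One reduction step on a state "k || s".
  def step(k: Cont, s: Stack): (Cont, Stack) = (k, s) match {
    case (Eval(Num(n)) :: rest, _) => (rest, n :: s)                            // Red-Num
    case (Eval(Add(l, r)) :: rest, _) => (Eval(l) :: Eval(r) :: DoAdd :: rest, s) // Red-Add1
    case (Eval(Sub(l, r)) :: rest, _) => (Eval(l) :: Eval(r) :: DoSub :: rest, s) // Red-Sub1
    case (DoAdd :: rest, n2 :: n1 :: tail) => (rest, (n1 + n2) :: tail)         // Red-Add2
    case (DoSub :: rest, n2 :: n1 :: tail) => (rest, (n1 - n2) :: tail)         // Red-Sub2
    case _ => sys.error("stuck state")
  }

  // Repeat reduction until the continuation is empty.
  @annotation.tailrec
  def run(k: Cont, s: Stack): Stack =
    if (k.isEmpty) s
    else {
      val (k1, s1) = step(k, s)
      run(k1, s1)
    }

  def eval(e: Expr): Int = run(List(Eval(e)), Nil).head
}
```

With this sketch, evaluating (1 + 2) − (3 + 4) ends with −4 on the stack, matching the reduction sequence shown above.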
15 First-Class Continuations
The previous chapter defines the small-step semantics of FAE and implements
the interpreter of FAE in CPS. Conceptually, continuations exist
during the evaluation of FAE programs. However, they are not exposed to
programmers. Programmers cannot utilize continuations directly while
writing programs in FAE.
A first-class entity of a language is an entity treated as a value. Since it
is a value, it can be the value denoted by a variable, an argument for a
function call, and the return value of a function. For example, first-class
functions are functions used as values.
First-class continuations are continuations used as values. If a language
supports first-class continuations, continuations can be the value of a
variable, an argument for a function call, and the return value of a function.
A continuation can be considered as a function since it takes a value
and performs computation. Programmers can call continuations like
calling functions. However, continuations are different from functions. A
function call returns a value, and the execution continues with the return
value. On the other hand, a continuation call does not come back to its
call site. The continuation at some point of execution is the remaining
computation. Once a continuation is called and evaluated, the execution
finishes. Calling a continuation changes the current continuation to
the called one. It changes the control flow of the execution. First-class
continuations allow programmers to express computations with complex
control flows concisely.
This chapter defines KFAE by extending FAE with first-class continuations.
It defines the small-step semantics of KFAE and implements an interpreter
of KFAE in CPS. While implementing the interpreter, you will see why
CPS is required. In addition, this chapter shows utilization of first-class
continuations in programming.
15.1 Syntax
e ::= · · · | vcc x in e
15.2 Semantics
result of function application fills the hole, and the evaluation continues.
However, x is a continuation, not a function. The evaluation of x 2
completely ignores the original continuation 1 + (□ + 3). It replaces the
continuation with the continuation denoted by x and fills the hole with
the argument, 2. Thus, x 2 results in evaluating 1 + 2. Since the original
continuation is ignored, there is nothing more to do after the evaluation
of 1 + 2. The result of the whole expression is 3.
To compare first-class continuations and functions, consider the following
expression:
1 + (val x= λ y.1 + y in (x 2) + 3)
In the previous expression, x denotes a continuation, but in this
expression, x denotes a function. The continuation and the function are
almost the same. Both take an argument and add 1 to the argument.
However, continuations change the control flow, while functions do not.
Therefore, in this case, x 2 preserves its continuation, 1 + (□ + 3). The
return value of the function application is 3, and it fills the hole in the
original continuation. After the function returns, 1 + (3 + 3) is evaluated,
and the whole expression results in 7.
Let us consider another example:
vcc x in (vcc y in x (1 + (vcc z in y z))) 3
What is the result of this expression? The first thing that happens during
the evaluation is the binding of x. x denotes the continuation of the whole
expression, which is the identity function, i.e. □. Then, (vcc y in x (1 +
(vcc z in y z))) 3 is evaluated. Any function application evaluates the
expression at the function position first. Thus, the redex is vcc y in x (1 +
(vcc z in y z)), and the continuation is □ 3. The redex defines y, which
denotes the continuation, □ 3. Under the environment containing x and
y, x (1 + (vcc z in y z)) is evaluated. x directly evaluates to a continuation,
and the argument expression becomes the redex. At this point, the
continuation is (x □) 3. The argument expression is 1 + (vcc z in y z), and 1
evaluates to 1. Then, vcc z in y z becomes the redex, and the continuation
is (x (1 + □)) 3. Therefore, z denotes (x (1 + □)) 3. When y is applied
to z, the original continuation is ignored, and z fills the hole in the
continuation denoted by y. Now, the remaining computation is z 3, which
is obtained by filling the hole of □ 3 with z. Applying z to 3 ignores the
continuation again, and (x (1 + 3)) 3 is obtained by filling the hole of
(x (1 + □)) 3 with 3. Since 1 + 3 evaluates to 4, x is applied to 4. Then, 4
fills the hole of □ and becomes the final result.
Now, we define the semantics of KFAE. First, since continuations are
values, values must be extended.
v ::= · · · | ⟨k, s⟩
(@) :: k || v :: ⟨k′, s′⟩ :: s → k′ || v :: s′ [Red-App2-Cont]
The following shows that 1 + (vcc x in ((x 2) + 3)) evaluates to 3 by applying
reduction according to the semantics:
where
v1 = ⟨□, ■⟩
v2 = ⟨σ1 ⊢ 3 :: (@) :: □, ■⟩
v3 = ⟨(+) :: (@) :: σ1 ⊢ 3 :: (@) :: □, 1 :: v1 :: ■⟩
σ1 = [x ↦ v1]
σ2 = σ1[y ↦ v2]
σ3 = σ2[z ↦ v3]
15.3 Interpreter
The following Scala code implements the syntax of KFAE. (We omit the part common to FAE.)
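As a minimal sketch, and not necessarily the exact code of the course, the syntax and a CPS interpreter in the style described in this chapter might look as follows. The FAE part is repeated so that the block is self-contained. The object name KFAE, the field names, and the error messages are our assumptions; continuations are Scala functions of type Value => Value, and continue simply applies one.

```scala
object KFAE {
  // Syntax: FAE extended with vcc (field names are illustrative).
  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Add(l: Expr, r: Expr) extends Expr
  case class Id(x: String) extends Expr
  case class Fun(x: String, b: Expr) extends Expr
  case class App(f: Expr, a: Expr) extends Expr
  case class Vcc(x: String, b: Expr) extends Expr // vcc x in b

  // Values: continuations are values too.
  sealed trait Value
  case class NumV(n: Int) extends Value
  case class CloV(x: String, b: Expr, env: Env) extends Value
  case class ContV(k: Cont) extends Value

  type Env = Map[String, Value]
  type Cont = Value => Value // a continuation is a Scala function here

  def add(v1: Value, v2: Value): Value = (v1, v2) match {
    case (NumV(n1), NumV(n2)) => NumV(n1 + n2)
    case _ => sys.error(s"not both numbers: $v1, $v2")
  }

  // Continuing with a function-based continuation is plain application.
  def continue(k: Cont, v: Value): Value = k(v)

  def interp(e: Expr, env: Env, k: Cont): Value = e match {
    case Num(n) => continue(k, NumV(n))
    case Id(x) => continue(k, env.getOrElse(x, sys.error(s"free identifier: $x")))
    case Fun(x, b) => continue(k, CloV(x, b, env))
    case Add(l, r) =>
      interp(l, env, v1 => interp(r, env, v2 => continue(k, add(v1, v2))))
    case App(f, a) =>
      interp(f, env, v1 => interp(a, env, v2 => v1 match {
        case CloV(x, b, fenv) => interp(b, fenv + (x -> v2), k)
        case ContV(k1) => continue(k1, v2) // the current continuation k is ignored
        case _ => sys.error(s"not applicable: $v1")
      }))
    // vcc binds x to the current continuation and evaluates the body.
    case Vcc(x, b) => interp(b, env + (x -> ContV(k)), k)
  }
}
```

Under these assumptions, calling interp with an empty environment and the identity function as the initial continuation evaluates 1 + (vcc x in ((x 2) + 3)) to NumV(3) and vcc x in (vcc y in x (1 + (vcc z in y z))) 3 to NumV(4), in line with the results discussed in this chapter.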
By doing so, the designers can make their language convenient for
programmers while preventing the language from being complicated.
Return
which is the result of the loop. Since break terminates the loop, it applies
the continuation of the loop to 0. Thus, when b denotes the continuation
of the loop, break can be desugared to b 0. (Assume that the rest of the
expression does not use b at all.) To make b the continuation
of a loop, vcc that binds b should enclose the loop. Therefore, every
While break terminates the loop, continue just skips the current iteration.
It makes the program jump to the condition expression of the loop.
Evaluating the condition expression is the continuation of the body of the
loop because the condition is evaluated after the evaluation of the body.
Therefore, an expression while0 e1 e2 is desugared to while0 e1 (vcc c in e2)
when e2 contains continue, and continue is desugared to c 0. (Assume that
the rest of the expression does not use c at all.)
For example, consider the following expression:
while0 0 (continue; (1 + λ x.x))
At each iteration, when c 0 is evaluated, the result of the whole vcc c in ((c 0); (1 +
λ x.x)) becomes 0 without evaluating 1 + λ x.x. Then, the loop proceeds to
the next iteration without incurring a run-time error. Thus, the expression
never terminates. It is what we expect from continue. Without continue,
the expression causes a run-time error because it is impossible to add a
number to a function. However, continue prevents the addition from
being evaluated, so the expression never terminates.
Note that the selection of 0 in c 0 is completely arbitrary since the result
of the loop body is never used. We may desugar continue to c 42 instead.
It is different from the case of break, which must apply b to 0 to make
the result of the loop 0.
15.5 Exercises
1. What is the result of the following expression?
vcc x in (vcc y in x (2 + (vcc z in y z))) 8
2. Write the reduction steps of the following expression:
(vcc x in 42 + (x 2)) + 8
First-Order Representation of Continuations 16
The previous chapter implements an interpreter of KFAE with first-class
functions in Scala. The interpreter treats continuations as Scala functions.
Since the interpreter uses CPS, continuations are passed from functions
to functions. In addition, continuations sometimes need to be stored in
ContV because KFAE supports first-class continuations. ContV instances
are stored inside environments or returned from interp. Therefore, the
interpreter relies on the fact that Scala provides first-class functions. Since
functions are values in Scala, the interpreter can represent continuations
with functions and use them as values.
The use of first-class functions is problematic in some cases. First,
low-level languages, such as C, lack first-class functions. (C provides
function pointers but not closures. Closures are necessary to represent
continuations.) There must be another way to implement an interpreter of
KFAE for those who use low-level languages. Second, functions do not give
useful information.
The only ability of functions is being applied to arguments. However, in
particular programs like debuggers, it is necessary to figure out what a
given first-class continuation does. The current implementation disallows
such analysis on continuations. On the other hand, a CloV instance, which
represents a closure, can give the exact information about the parameter,
body, and environment of the closure. Alas, ContV instances do not have
such capabilities.
This chapter shows how we can represent continuations without first-class
functions. By avoiding first-class functions, an interpreter of a language
with first-class continuations can be written in low-level languages. In
addition, if continuations are not functions and have specific structures
instead, debuggers can analyze what a given continuation denotes.
value of the left operand. The function body contains three free variables:
e2, env, and k. e2 is the right operand; env is the current environment; k
is the continuation of the addition. If the values of the free variables are
determined, the behavior of the continuation is also determined. There-
fore, (e2, env, k), which is a triple of an expression, an environment,
and a continuation, can represent the continuation.
Currently, continue continues the evaluation with a given function,
which represents the continuation. Since the continuation is a Scala
function, it can be directly applied to a given value. However, if we use
a triple to represent the continuation instead of a function, it cannot be
applied to a value. We need a new way to continue evaluation when a
continuation and a value are given. The clue already exists—look at the
body of the function representing a continuation. When the function
is applied to v1, the result is interp(e2, env, v2 => continue(k,
add(v1, v2))). Now, (e2, env, k) and v1 are provided instead of
the function and v1. It is enough to evaluate interp(e2, env, v2 =>
continue(k, add(v1, v2))) with v1, e2, env, and k. It evaluates the
same thing as the original function application.
The following compares the previous and current strategies:
sion at the function position. The body of the function representing the
continuation contains three free variables: e2, env, and k (k is in ...). e2 is the
expression at the argument position; env is the current environment; k is
the continuation of the application. Therefore, e2, env, and k determine
what the continuation does, and (e2, env, k), a triple of an expression,
an environment, and a continuation, can represent the continuation. Con-
tinuing the evaluation is evaluating interp(e2, env, v2 => v1 match
...), which can be done with (e2, env, k) and v1.
The fourth continuation, v2 => v1 match ..., is used after the evalua-
tion of the argument of a function application. It applies a function (or
a continuation) to the argument. v2 denotes the value of the argument.
The body of the function representing the continuation contains two free
variables: v1 and k (k is in ...). v1 is the value of the expression at the function
position; k is the continuation of the application. Therefore, (v1, k),
a pair of a value and a continuation, can represent the continuation.
Continuing the evaluation is evaluating v1 match ... with (v1, k)
and v2.
In fact, there is one more continuation, which does not appear in the
implementation of interp. It is the one that is represented as the identity
function and is passed to interp in the beginning. The identity function
returns a given argument without any changes. No additional information
is necessary to determine the behavior of the continuation. Therefore,
(), the zero-length tuple (the Unit value in Scala) can represent the
continuation. To continue the evaluation with the continuation and a
value v, it is enough to give v as the result.
Note that the first and the third are different even though they look
the same. The first continuation computes interp(e2, env, v2 =>
continue(k, add(v1, v2))) with its data, while the third continua-
tion computes interp(e2, env, v2 => v1 match ...) with its data.
Similarly, the second and the fourth are different as well. The second
computes continue(k, add(v1, v2)), while the fourth computes v1
match ....
The names of the classes do not matter, though they are named carefully
so that the names can reflect what they are for. The important things are
data carried by each continuation. Following the Scala convention, the
last sort, which can be represented by the empty tuple, is now represented
by a singleton object. One may use case class MtK() extends Cont
instead without changing the semantics, but the singleton object is
more efficient than the case class from the implementation perspective.
Now, the implementation of continuations does not require first-class
functions.
Now, we need to revise the continue function. The previous implemen-
tation is def continue(k: Cont, v: Value): Value = k(v). It works
because Cont is a function type before our change. However, Cont is
not a function now, and continue needs a fix. In fact, we already know
everything to make a correct fix. Previously, continue applies k to v
when k and v are given. Now, it should check k and do the correct compu-
tation according to the data in k. Below is the repetition of the previous
explanations, but with the names of the case classes and object.
The first and third explanations still pass functions to interp even
though continuations are not functions anymore. They need small
changes. Now, DoAddK(v1, k) represents v2 => continue(k, add(v1,
v2)), and DoAppK(v1, k) represents v2 => v1 match ....
Only the Add and App cases are different from before. The Add case uses
AddSecondK(e2, env, k) to represent v1 => interp(e2, env, v2 =>
continue(k, add(v1, v2))); the App case uses AppArgK(e2, env, k)
to represent v1 => interp(e2, env, v2 => v1 match ...).
Note that MtK does not appear in interp. MtK is used to call interp in
the beginning. One should write interp(e, Map(), MtK) to evaluate
e.
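The pieces described above can be put together as the following sketch. The class names AddSecondK, DoAddK, AppArgK, DoAppK, and MtK come from the text; the object name, the field names, the add helper, and the error messages are our assumptions, so the block should be read as one possible rendering rather than the definitive implementation. Every continuation is now plain data, and continue inspects that data to decide what to do.

```scala
object FirstOrderKFAE {
  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Add(l: Expr, r: Expr) extends Expr
  case class Id(x: String) extends Expr
  case class Fun(x: String, b: Expr) extends Expr
  case class App(f: Expr, a: Expr) extends Expr
  case class Vcc(x: String, b: Expr) extends Expr

  sealed trait Value
  case class NumV(n: Int) extends Value
  case class CloV(x: String, b: Expr, env: Env) extends Value
  case class ContV(k: Cont) extends Value

  type Env = Map[String, Value]

  // Each class carries exactly the free variables of the function it replaces.
  sealed trait Cont
  case class AddSecondK(e2: Expr, env: Env, k: Cont) extends Cont
  case class DoAddK(v1: Value, k: Cont) extends Cont
  case class AppArgK(e2: Expr, env: Env, k: Cont) extends Cont
  case class DoAppK(v1: Value, k: Cont) extends Cont
  case object MtK extends Cont // the identity continuation

  def add(v1: Value, v2: Value): Value = (v1, v2) match {
    case (NumV(n1), NumV(n2)) => NumV(n1 + n2)
    case _ => sys.error(s"not both numbers: $v1, $v2")
  }

  // continue dispatches on the structure of the continuation.
  def continue(k: Cont, v: Value): Value = k match {
    case AddSecondK(e2, env, k1) => interp(e2, env, DoAddK(v, k1))
    case DoAddK(v1, k1) => continue(k1, add(v1, v))
    case AppArgK(e2, env, k1) => interp(e2, env, DoAppK(v, k1))
    case DoAppK(v1, k1) => v1 match {
      case CloV(x, b, fenv) => interp(b, fenv + (x -> v), k1)
      case ContV(k2) => continue(k2, v) // k1 is ignored
      case _ => sys.error(s"not applicable: $v1")
    }
    case MtK => v
  }

  def interp(e: Expr, env: Env, k: Cont): Value = e match {
    case Num(n) => continue(k, NumV(n))
    case Id(x) => continue(k, env.getOrElse(x, sys.error(s"free identifier: $x")))
    case Fun(x, b) => continue(k, CloV(x, b, env))
    case Add(e1, e2) => interp(e1, env, AddSecondK(e2, env, k))
    case App(e1, e2) => interp(e1, env, AppArgK(e2, env, k))
    case Vcc(x, b) => interp(b, env + (x -> ContV(k)), k)
  }
}
```

Because a continuation is now a tree of case class instances, a tool such as a debugger could pattern-match on it or print it, which is not possible with an opaque Scala function.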
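\[ \frac{n \mapsto \kappa \Downarrow v}{\sigma, \kappa \vdash n \Rightarrow v} \quad [\text{Interp-Num}] \]
\[ \frac{x \in \mathit{Domain}(\sigma) \qquad \sigma(x) \mapsto \kappa \Downarrow v}{\sigma, \kappa \vdash x \Rightarrow v} \quad [\text{Interp-Id}] \]
\[ \frac{\langle \lambda x.e, \sigma \rangle \mapsto \kappa \Downarrow v}{\sigma, \kappa \vdash \lambda x.e \Rightarrow v} \quad [\text{Interp-Fun}] \]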
The rules for variables and functions are similar to the rule for integers.
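\[ \frac{\sigma, [\square + (e_2, \sigma)] :: \kappa \vdash e_1 \Rightarrow v}{\sigma, \kappa \vdash e_1 + e_2 \Rightarrow v} \quad [\text{Interp-Add}] \]
\[ \frac{\sigma, [\square\ (e_2, \sigma)] :: \kappa \vdash e_1 \Rightarrow v}{\sigma, \kappa \vdash e_1\ e_2 \Rightarrow v} \quad [\text{Interp-App}] \]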
The rule for function application is similar to the rule for addition.
\[ \frac{\sigma[x \mapsto \kappa], \kappa \vdash e \Rightarrow v}{\sigma, \kappa \vdash \mathsf{vcc}\ x;\ e \Rightarrow v} \quad [\text{Interp-Vcc}] \]
Each case of the continue function also produces a single inference rule.
The only exception is the DoAppK case because it requires two rules: one
for the CloV case and the other for the ContV case.
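\[ \frac{\sigma, [v_1 + \square] :: \kappa \vdash e_2 \Rightarrow v_2}{v_1 \mapsto [\square + (e_2, \sigma)] :: \kappa \Downarrow v_2} \quad [\text{Continue-AddSecondK}] \]
\[ \frac{n_1 + n_2 \mapsto \kappa \Downarrow v}{n_2 \mapsto [n_1 + \square] :: \kappa \Downarrow v} \quad [\text{Continue-DoAddK}] \]
\[ \frac{\sigma, [v_1\ \square] :: \kappa \vdash e \Rightarrow v_2}{v_1 \mapsto [\square\ (e, \sigma)] :: \kappa \Downarrow v_2} \quad [\text{Continue-AppArgK}] \]
This rule is similar to the rule when the continuation is [□ + (e2, σ)].
\[ \frac{\sigma[x \mapsto v_2], \kappa \vdash e \Rightarrow v}{v_2 \mapsto [\langle \lambda x.e, \sigma \rangle\ \square] :: \kappa \Downarrow v} \quad [\text{Continue-DoAppK-CloV}] \]
\[ \frac{v_2 \mapsto \kappa_1 \Downarrow v}{v_2 \mapsto [\kappa_1\ \square] :: \kappa \Downarrow v} \quad [\text{Continue-DoAppK-ContV}] \]
\[ v \mapsto [] \Downarrow v \quad [\text{Continue-MtK}] \]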
val f = λx.x;
val g = λy.y;
(f 1) + (g 2)
The above expression defines the functions f and g and then evaluates
(f 1) + (g 2). f and g are semantically equivalent, but the names of their
parameters are different. If a compiler is aware of their equivalence, it
can reduce the size of the program by modifying the expression like
below:
17 Nameless Representation of Expressions 174
val f = λx.x;
(f 1) + (f 2)
It is important to notice that different indices can denote the same variable,
and the same indices can denote different variables. Consider the second
example from the bottom. The first 0 in λ.(0 λ.(1 0)) denotes x of the
original expression. At the same time, 1 also denotes x of the original
expression. On the other hand, the second 0 denotes y of the original
expression. The distance from the definition depends on the location of a
variable. Since de Bruijn indices represent variables with their distances,
the indices of a single variable can vary among places.
Note that expressions should be treated as trees, not strings, to calculate
the distances. Consider the last example. There are two λ ’s between
the last x and its definition when the expression is written as a string.
However, when the abstract syntax tree representing the expression is
considered, there is only one λ in between. Therefore, the index of the
last x is 1, not 2. We usually write expressions as strings for convenience,
but they always have tree structures in fact.
De Bruijn indices successfully resolve the issues arising from names.
Consider the comparison of expressions. λ x.x and λ y.y are semantically
equivalent but syntactically different expressions. Both become λ.0 when
de Bruijn indices are used. With the help of de Bruijn indices, a simple
syntactic check is enough to find out that the two expressions are equal.
Now, let us define the procedure that transforms named expressions into
nameless expressions. It helps readers understand de Bruijn indices. At
the same time, the procedure is practically valuable. Use of names is the
best way to denote variables for programmers. Therefore, expressions
written by programmers have names. On the other hand, programs
like interpreters and compilers sometimes need to use de Bruijn indices
to represent variables. In such cases, the procedure is a part of the
interpreter/compiler implementation.
First, we define indices as follows:
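\[ i \in \mathbb{N} \]
\[ \chi \in \mathit{Id} \xrightarrow{\text{fin}} \mathbb{N} \]
\[
\begin{aligned}
[x]\chi &= i \quad \text{if } \chi(x) = i \\
[\lambda x.e]\chi &= \lambda.[e]\chi' \quad \text{where } \chi' = ({\uparrow}\chi)[x \mapsto 0] \\
[e_1\ e_2]\chi &= [e_1]\chi\ [e_2]\chi \\
[n]\chi &= n \\
[e_1 + e_2]\chi &= [e_1]\chi + [e_2]\chi
\end{aligned}
\]
\[
\begin{aligned}
& [\lambda x.\lambda y.x + y]\emptyset \\
={} & \lambda.[\lambda y.x + y][x \mapsto 0] \\
={} & \lambda.\lambda.[x + y][x \mapsto 1, y \mapsto 0] \\
={} & \lambda.\lambda.[x][x \mapsto 1, y \mapsto 0] + [y][x \mapsto 1, y \mapsto 0] \\
={} & \lambda.\lambda.1 + [y][x \mapsto 1, y \mapsto 0] \\
={} & \lambda.\lambda.1 + 0
\end{aligned}
\]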
object Nameless {
sealed trait Expr
case class Num(n: Int) extends Expr
case class Add(l: Expr, r: Expr) extends Expr
case class Id(i: Int) extends Expr
case class Fun(e: Expr) extends Expr
case class App(f: Expr, a: Expr) extends Expr
}
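The transformation from named to nameless expressions can be sketched in Scala as follows. This is a minimal sketch under our own assumptions: the enclosing object DeBruijn, the named Expr classes, the Ctx alias, and the helper shift (which plays the role of ↑) are illustrative names, and the nested Nameless object repeats the definition above only to keep the block self-contained.

```scala
object DeBruijn {
  // Named expressions: the input of the transformation.
  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Add(l: Expr, r: Expr) extends Expr
  case class Id(x: String) extends Expr
  case class Fun(x: String, b: Expr) extends Expr
  case class App(f: Expr, a: Expr) extends Expr

  // Nameless expressions: the output of the transformation.
  object Nameless {
    sealed trait Expr
    case class Num(n: Int) extends Expr
    case class Add(l: Expr, r: Expr) extends Expr
    case class Id(i: Int) extends Expr
    case class Fun(e: Expr) extends Expr
    case class App(f: Expr, a: Expr) extends Expr
  }

  // A context maps each visible name to its current index.
  type Ctx = Map[String, Int]

  // Increases every index by one, because one more lambda now lies in between.
  def shift(ctx: Ctx): Ctx = ctx.map { case (x, i) => x -> (i + 1) }

  def transform(e: Expr, ctx: Ctx): Nameless.Expr = e match {
    case Num(n) => Nameless.Num(n)
    case Add(l, r) => Nameless.Add(transform(l, ctx), transform(r, ctx))
    case Id(x) => Nameless.Id(ctx.getOrElse(x, sys.error(s"free identifier: $x")))
    // The new parameter gets index 0; all other names move one step away.
    case Fun(x, b) => Nameless.Fun(transform(b, shift(ctx) + (x -> 0)))
    case App(f, a) => Nameless.App(transform(f, ctx), transform(a, ctx))
  }
}
```

With this sketch, transforming λx.λy.x + y under the empty context yields the nameless expression λ.λ.1 + 0, and λx.x and λy.y are transformed into the same value, so comparing them reduces to the ordinary == on case classes.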
\[ v ::= n \mid \langle \lambda.e, \sigma \rangle \]
\[ \sigma \in \mathbb{N} \xrightarrow{\text{fin}} V \]
Environments are finite partial functions from indices, which are natural
numbers, to values.
Now, let us define the inference rules.
Rule Id
If i is in the domain of σ,
then i evaluates to σ(i) under σ.
\[ \frac{i \in \mathit{Domain}(\sigma)}{\sigma \vdash i \Rightarrow \sigma(i)} \quad [\text{Id}] \]
Rule Fun
λ.e evaluates to ⟨λ.e, σ⟩ under σ.
Rule App
If
e1 evaluates to ⟨λ.e, σ′⟩ under σ,
e2 evaluates to v2 under σ, and
e evaluates to v under (↑σ′)[0 ↦ v2],
then
e1 e2 evaluates to v under σ.
between the use and the definition. Its index is 0. Therefore, the value of
the argument has the index 0 in the new environment. In addition, every
index in the environment of the closure needs a change. Let a value v
correspond to the index 0. The value is not the value of the argument,
so it cannot correspond to the index 0 anymore. As λ from the closure
exists between the use and the definition, the index should increase by
one. By the same principle, every index in the environment increases by
one. Since ↑σ′ denotes the same environment as σ′ but with every index
increased by one, the body of the closure is evaluated under (↑σ′)[0 ↦ v2].
The rules for integers and addition are omitted because they are the same
as those of FAE.
This new semantics for nameless expressions is equivalent to the previous
semantics for named expressions. Let e be a named expression. The result
of evaluating e is the same as evaluating e′ where e′ is the nameless
expression obtained by transforming e. (Assume that the equality of
closures is defined properly.) Mathematically, the following
proposition is true:
The App case is the only interesting case. The others are the same as before.
Since a closure lacks its parameter name and an environment does not
need the name, it is enough to prepend the value of the argument in
front of the list.
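The following sketch shows one way such an interpreter could look when environments are Scala lists. It assumes that the value at position i of the list is the value of index i, so prepending the argument shifts every existing index by one, which corresponds to (↑σ′)[0 ↦ v2]. The object name and the Value classes are our assumptions; the expression classes repeat the Nameless definition for self-containment.

```scala
object NamelessInterp {
  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Add(l: Expr, r: Expr) extends Expr
  case class Id(i: Int) extends Expr
  case class Fun(e: Expr) extends Expr
  case class App(f: Expr, a: Expr) extends Expr

  sealed trait Value
  case class NumV(n: Int) extends Value
  case class CloV(b: Expr, env: Env) extends Value // no parameter name is stored

  // Position i in the list holds the value of de Bruijn index i.
  type Env = List[Value]

  def interp(e: Expr, env: Env): Value = e match {
    case Num(n) => NumV(n)
    case Add(l, r) => (interp(l, env), interp(r, env)) match {
      case (NumV(n1), NumV(n2)) => NumV(n1 + n2)
      case (v1, v2) => sys.error(s"not both numbers: $v1, $v2")
    }
    case Id(i) => env.lift(i).getOrElse(sys.error(s"free index: $i"))
    case Fun(b) => CloV(b, env)
    case App(f, a) => interp(f, env) match {
      // Prepending puts the argument at index 0 and shifts the rest by one.
      case CloV(b, fenv) => interp(b, interp(a, env) :: fenv)
      case v => sys.error(s"not a closure: $v")
    }
  }
}
```

For instance, applying λ.λ.1 + 0 to 1 and then to 2 under the empty list yields NumV(3) in this sketch.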
Typed Languages
Type Systems 18
This chapter is the first chapter about typed languages. This chapter
explains the motivation of type checking and introduces a simple type
system by defining TFAE, a typed variant of FAE.
18.1 Run-Time Errors
In FAE, expressions can be classified into three groups according to their
behaviors. Let us see what those three groups are. Note that in most
languages, expressions can be classified into three groups in the same
manner. Thus, the discussion of this section can be applied to various
real-world languages. Just for brevity, this section uses FAE.
The first group includes every expression that evaluates to a value. For
example, (1 + 2) − 3 and (λ x.λ y.x + y) 1 2 belong to the first group
because (1 + 2) − 3 evaluates to 0, and (λ x.λ y.x + y) 1 2 evaluates to 3.
Expressions in this group correspond to programs that terminate without
any problem. When we write a program, the program usually belongs to
the first group.
The second group includes every expression that never terminates. For
instance, (λ x.x x) (λ x.x x) belongs to the second group. The expression
is a function application. The first λ x.x x is a function, and the second
λ x.x x is an argument. To evaluate the function application, the body, x x,
is evaluated under the environment that maps x to λ x.x x. Following the
content of the environment, evaluating x x is equivalent to evaluating
(λ x.x x) (λ x.x x), which is the original expression. Thus, we can say that
the evaluation of (λ x.x x) (λ x.x x) leads to the evaluation of exactly the
same expression. The evaluation runs forever and never terminates. There
are many nonterminating programs in the real world. If a language supports
recursive functions or loops, writing nonterminating programs becomes
much easier. Some of them are created by programmers’ mistakes. Wrong
use of recursive functions or loops makes programs run forever, contrary
to the expectation of the programmers. However, programmers some-
times intentionally write nonterminating programs. Consider operating
systems, web servers, and shells. They do not finish their execution unless
a user inputs a termination command. If an operating system terminates
although a user has not given any commands, such a behavior should
be considered as a bug. These examples clearly show the necessity of
writing nonterminating programs.
The third group includes every expression that terminates but fails
to produce a result. For example, (λ x.x) + 1, 1 0, and 2 − x belong to
the third group. The first example, (λ x.x) + 1 adds a function to an
integer. Since such addition is impossible, the evaluation cannot proceed
beyond the addition. Thus, the evaluation terminates at the middle of the
computation rather than reaching the final stage and producing a result.
String s = null;
s.length();
Now, let us say that “the type of e is τ ” when the following conditions
are true:
• e does not incur a type error.
• e evaluates to a value of τ or does not terminate.
where the metavariable τ ranges over types. Then, we can restate the
finding of the above paragraph: when the type of e1 is num and the type
of e2 is num, the type of e1 + e2 is num.
This example shows what a type checker does. A type checker computes
the type of an expression. When the type is successfully computed, it
ensures that the expression does not incur type errors. In this case, we
say that the expression is well-typed. Then, the type can be used to check
whether an expression containing the previously checked expression can
cause type errors. This process is repeated until the whole program is
checked. We call this process type checking.
A type checker requires different strategies to predict the types of
different sorts of expressions. In the above example, addition requires
both subexpressions to have num as their types. However, it is clear that
function application requires different types. It requires the first
subexpression to have fun as its type because only functions can be applied to
values. These examples show that a type checker needs a separate rule
for each sort of expression to predict the type of the expression. We
call such rules typing rules.
There are multiple typing rules in a single language, and we call the
collection of all the typing rules in a language the type system of the
language. Static semantics is another name of a type system since type
systems explain the behaviors of expressions by predicting their types
without execution. To distinguish the semantics so far, which explains
the behaviors of expressions by defining their values from execution,
from static semantics, we use the term dynamic semantics.
The following table compares dynamic semantics and static semantics:
                              Dynamic semantics   Static semantics
What it is for                Evaluation          Type checking
Which program implements it   Interpreter         Type checker
Result                        Value               Type
Dynamic semantics defines how expressions are evaluated. By evaluation,
expressions result in values. An interpreter is a program that takes
an expression and computes its result. Static semantics defines how
expressions are type-checked. By type checking, the types of expressions
are computed. A type checker is a program that takes an expression,
predicts its type, and checks whether run-time errors are possible. We
can consider static semantics as an overapproximation of dynamic semantics.
For example, dynamic semantics lets us know that 1 + 2 results in 3, while
static semantics lets us know that 1 + 2 results in an integer without any
run-time errors or does not terminate.
As mentioned before, the goal of a type checker, P , is soundness. Therefore,
the most important property of type systems is type soundness, or simply,
just soundness. If a type checker says OK for a given program, then
the program must never incur type errors. In this case, we say that
the program passes type checking or that the type checker accepts the
program. On the other hand, if a type checker says NOT OK for a given
program, we cannot conclude anything, but the program might incur a
type error. In this case, we say that the type checker rejects the program.
It is nontrivial to design a sound type system for a given language.
Proving the soundness of a type system is more challenging. Proving type
soundness is beyond the scope of this book. This book introduces various
type systems whose type soundness has been proved by researchers
already.
Since designing a type system and implementing a type checker are
difficult tasks, those tasks are the jobs of language designers, not language
users in most cases. Some languages come with type systems. We
call such languages typed languages or statically typed languages. The terms
imply that the languages have the notion of a type whose correct use
is verified statically. In such languages, only programs that pass type
checking can be executed. Programs rejected by the type checker are
not allowed to be executed because their safety is not ensured. Therefore,
any execution is guaranteed to be free of type errors. Java, Scala, and Rust
are well-known statically typed languages in the real world.
On the other hand, some languages do not provide type systems. We call
such languages untyped languages or dynamically typed languages. The term
untyped languages implies that they do not have type checking. The term
dynamically typed languages implies that they have the notion of a type
only at run time. Note that a type is a natural concept that exists anywhere
because values can be classified according to their characteristics in any
language. However, in dynamically typed languages, types exist only
during execution since there is no static type checking. In such languages,
programs may incur type errors during execution. Python and JavaScript
are well-known dynamically typed languages in the real world.
Statically typed languages and dynamically typed languages have their
own pros and cons. Statically typed languages have the following advan-
tages:
• Errors can be detected early. Programmers can find errors before
  execution.
• Static type checking gives type information to compilers, and the
  compilers can optimize programs with the information. For this
  reason, programs in statically typed languages usually outperform
  programs in dynamically typed languages.
• Some statically typed languages require programmers to write
  types explicitly in the source code. Such types in the code are
  called type annotations. Type checkers verify the correctness of the
  type annotations. Thus, type annotations are automatically verified
  comments, which never become outdated, and help programmers
  understand the programs easily.
On the other hand, statically typed languages have the following disad-
vantages:
• Statically typed languages attain type soundness by giving up
  completeness. Type checkers may reject programs that never incur
  type errors. Therefore, programmers may waste their time in
  making type checkers agree that given programs do not result in
  type errors.
18.5 TFAE
Syntax
\[ e ::= n \mid e + e \mid e - e \mid x \mid \lambda x{:}\tau.e \mid e\ e \]
The only difference from FAE is the type annotation of a lambda ab-
straction. λx : τ.e is a function whose parameter is x , parameter type is τ ,
and body is e . The parameter type annotation is required during type
checking, which will be explained soon.
Now, we need to define types. Classifying values into num and fun like so
far is too imprecise. We need more fine-grained types for functions for
a few reasons. First, functions require arguments to belong to specific
types. Consider λ x.x + x. When the function is applied to a value, the
value must be a number to avoid a type error. If a function is given as an
argument, the evaluation of the body incurs a type error. Each function
has its own requirement. Therefore, the type of a function must describe
the type of an argument expected by the function. Second, different
functions return different values. Some functions return numbers, while
others return functions. To predict the type of a function application
expression, the type checker must be able to predict the type of the return
value. Thus, the type of a function must describe the type of the return
value as well.
Based on the above observations, we define types as follows:
\[ \tau ::= \mathsf{num} \mid \tau \to \tau \]
The type num is the type of every number. A type τ1 → τ2 is the type
of a function that takes a value of τ1 as an argument and returns a
value of τ2 . For example, λ x:num.x takes a value of num and returns
the value. Its type is num → num. λ x:num.λ y:num.x + y takes a value of
num and returns λ y:num.x + y. λ y:num.x + y also takes a value of num.
Since both x and y are numbers, x + y also is a number, whose type is
num. Therefore, the type of λ y:num.x + y is num → num, and the type
of λ x:num.λ y:num.x + y is num → (num → num). Because arrows in
function types are right associative, we can write num → num → num
instead.
A type is either num or τ1 → τ2 for some τ1 and τ2 . Every value belongs
to at most one type. No value is an integer and a function at the same
time. No function takes an integer as an argument and a function as
an argument at the same time. In this chapter, every value has at most
one type, and, therefore, every expression has at most one type as well.
However, in some type systems, a single value or a single expression can
have multiple types. Chapter 22 shows such an example.
Dynamic Semantics
The dynamic semantics of TFAE is similar to that of FAE. The only differ-
ence is type annotations in lambda abstractions. Since type annotations
are used only for type checking and do not have any role at run time,
they are simply ignored when closures are constructed.
Rule Fun
λx:τ.e evaluates to ⟨λx.e, σ⟩ under σ.
Interpreter
The interp function needs only one fix in the Fun case.
Static Semantics
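\[ \mathit{TEnv} = \mathit{Id} \xrightarrow{\text{fin}} T \]
\[ \Gamma \in \mathit{TEnv} \]
\[ {:} \subseteq \mathit{TEnv} \times E \times T \]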
Rule Typ-Num
The type of n is num under Γ.
\[ \Gamma \vdash n : \mathsf{num} \quad [\text{Typ-Num}] \]
Rule Typ-Add
\[ \frac{\Gamma \vdash e_1 : \mathsf{num} \qquad \Gamma \vdash e_2 : \mathsf{num}}{\Gamma \vdash e_1 + e_2 : \mathsf{num}} \quad [\text{Typ-Add}] \]
If the types of e1 and e2 are both num, then the type of e1 + e2 is num.
Rule Typ-Sub
If the type of e1 is num under Γ and the type of e2 is num under Γ,
then the type of e1 − e2 is num under Γ.
\[ \frac{\Gamma \vdash e_1 : \mathsf{num} \qquad \Gamma \vdash e_2 : \mathsf{num}}{\Gamma \vdash e_1 - e_2 : \mathsf{num}} \quad [\text{Typ-Sub}] \]
Rule Typ-Id
If x is in the domain of Γ,
then the type of x is Γ(x) under Γ.
\[ \frac{x \in \mathit{Domain}(\Gamma)}{\Gamma \vdash x : \Gamma(x)} \quad [\text{Typ-Id}] \]
Rule Typ-Fun
If the type of e is τ2 under Γ[x : τ1],
then the type of λx:τ1.e is τ1 → τ2 under Γ.
(Since ↦ looks similar to arrows in types, we use : instead of ↦ to prevent
confusion.)
\[ \frac{\Gamma[x : \tau_1] \vdash e : \tau_2}{\Gamma \vdash \lambda x{:}\tau_1.e : \tau_1 \to \tau_2} \quad [\text{Typ-Fun}] \]
The rule for a lambda abstraction needs to compute the type of a closure
created by the lambda abstraction. The type of an argument is given as
τ1 by the type annotation. The rule should determine the type of the
return value of the function as well. The return type equals the type of e ,
the function body. The value of an argument is unknown, but the type is
known as τ1 . It shows why a lambda abstraction needs a parameter type
annotation. It gives information to compute the type of the body. Since
a closure captures the environment when it is created, evaluation of its
body can use variables in the environment. Thus, computation of the
type of e needs all the information in Γ and the fact that the type of x is τ1. The
computation uses Γ[x : τ1]. If the type of e is τ2, the return type of the
function also is τ2 . Finally, the type of the lambda abstraction becomes
τ1 → τ2 .
Rule Typ-App
If the type of e1 is τ1 → τ2 under Γ and the type of e2 is τ1 under Γ,
then the type of e1 e2 is τ2 under Γ.
\[ \frac{\Gamma \vdash e_1 : \tau_1 \to \tau_2 \qquad \Gamma \vdash e_2 : \tau_1}{\Gamma \vdash e_1\ e_2 : \tau_2} \quad [\text{Typ-App}] \]
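\[
\dfrac{
  \dfrac{
    \dfrac{
      \dfrac{
        \dfrac{x \in \mathit{Domain}(\Gamma_2)}{\Gamma_2 \vdash x : \mathsf{num}}
        \qquad
        \dfrac{y \in \mathit{Domain}(\Gamma_2)}{\Gamma_2 \vdash y : \mathsf{num}}
      }{\Gamma_2 \vdash x + y : \mathsf{num}}
    }{\Gamma_1 \vdash \lambda y{:}\mathsf{num}.x + y : \mathsf{num} \to \mathsf{num}}
  }{\emptyset \vdash e : \mathsf{num} \to \mathsf{num} \to \mathsf{num}}
  \qquad
  \emptyset \vdash 1 : \mathsf{num}
}{\emptyset \vdash e\ 1 : \mathsf{num} \to \mathsf{num}}
\]
\[
\dfrac{\emptyset \vdash e\ 1 : \mathsf{num} \to \mathsf{num} \qquad \emptyset \vdash 2 : \mathsf{num}}{\emptyset \vdash e\ 1\ 2 : \mathsf{num}}
\]
where
\[
\begin{aligned}
e &= \lambda x{:}\mathsf{num}.\lambda y{:}\mathsf{num}.x + y \\
\Gamma_1 &= [x : \mathsf{num}] \\
\Gamma_2 &= [x : \mathsf{num}, y : \mathsf{num}]
\end{aligned}
\]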
We call a proof tree that proves the type of an expression a type derivation.
This type system is sound; it rejects every expression producing a type
error. For example, consider (λ x:num → num.x 1) 1. Evaluation of the
expression results in evaluation of 1 1, which causes a type error. Since
the type of x 1 is num, the type of the function is (num → num) → num.
The function takes an argument of type num → num. However, 1, the
argument, has the type num, which differs from num → num. Therefore,
the type checker rejects the expression, which is a correct decision.
Any sound type system is incomplete. Therefore, this type system is
incomplete. The type system can reject a type-error-free expression.
Various such expressions exist. Consider (λ x:num.x) (λ x:num.x). The
expression evaluates to ⟨λx.x, ∅⟩ without any type error. However, the
type system rejects the expression. λ x:num.x takes an argument of the
type num. However, λ x:num.x, the argument, has the type num → num,
which differs from num. As a result, the type system rejects the expression
even though it evaluates to a value without any type error.
Type Checker
To implement a type checker of TFAE, we first define the TEnv type, which
is the type of a type environment.
If type checking succeeds, the function returns the type of the expression.
Otherwise, it throws an exception. Therefore, if the function throws
an exception for a given expression, the expression is ill-typed. If the
function terminates without throwing an exception, the expression is
well-typed.
Each case of the pattern matching coincides with the corresponding
typing rule. In the Num case, the type is NumT. In the Add and Sub cases, the
subexpressions of the expression must have the type NumT. The type of
the expression also is NumT. The Fun case checks the type of the function
body under the extended type environment. The type of the function is
a function type. The parameter type is the same as the type annotation,
and the return type is the type of the body. The App case checks the types
of the function and the argument positions. The parameter type of the
function position must equal the type of the argument position. The type
of the application expression is the return type of the function position.
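The cases described above can be sketched in Scala as follows. This is a minimal sketch rather than the exact code of the course: the constructor names follow the text, while the field names, the use of an immutable Map for TEnv, and the helpers error and mustSame are assumptions made for illustration.

```scala
object TfaeSketch {
  // Types of TFAE.
  sealed trait Type
  case object NumT extends Type
  case class ArrowT(p: Type, r: Type) extends Type

  // Expressions of TFAE; Fun carries the parameter type annotation.
  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Add(l: Expr, r: Expr) extends Expr
  case class Sub(l: Expr, r: Expr) extends Expr
  case class Id(x: String) extends Expr
  case class Fun(x: String, t: Type, b: Expr) extends Expr
  case class App(f: Expr, a: Expr) extends Expr

  // Assumption: a type environment is a map from variable names to types.
  type TEnv = Map[String, Type]

  // Hypothetical helper: signals that type checking failed.
  def error(msg: String): Nothing = throw new Exception(msg)

  // Hypothetical helper: returns the type if both types are equal, throws otherwise.
  def mustSame(t1: Type, t2: Type): Type =
    if (t1 == t2) t1 else error(s"$t1 is not equal to $t2")

  def typeCheck(e: Expr, env: TEnv): Type = e match {
    case Num(_) => NumT
    case Add(l, r) =>
      mustSame(typeCheck(l, env), NumT)
      mustSame(typeCheck(r, env), NumT)
    case Sub(l, r) =>
      mustSame(typeCheck(l, env), NumT)
      mustSame(typeCheck(r, env), NumT)
    case Id(x) => env.getOrElse(x, error(s"free identifier: $x"))
    case Fun(x, t, b) =>
      // The body is checked under the environment extended with x : t.
      ArrowT(t, typeCheck(b, env + (x -> t)))
    case App(f, a) =>
      typeCheck(f, env) match {
        case ArrowT(p, r) =>
          mustSame(typeCheck(a, env), p) // argument type must equal parameter type
          r
        case t => error(s"$t is not a function type")
      }
  }
}
```

With this sketch, typeCheck(Fun("x", NumT, Id("x")), Map.empty) returns ArrowT(NumT, NumT), while applying that function to itself throws an exception, matching the incompleteness example above.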
Rule Typ-Val
If the type of e1 is τ1 under Γ and the type of e2 is τ2 under Γ[x : τ1 ],
then the type of val x = e1 in e2 is τ2 under Γ.
$$\frac{\Gamma \vdash e_1 : \tau_1 \qquad \Gamma[x : \tau_1] \vdash e_2 : \tau_2}{\Gamma \vdash \mathsf{val}\ x = e_1\ \mathsf{in}\ e_2 : \tau_2}\ [\text{Typ-Val}]$$
Note that local variable definitions do not require type annotations, while
lambda abstractions do. Therefore, local variable definitions are more
convenient than lambda abstractions for binding.
Pairs
Pairs are the first kind of an extension. In FAE, we can desugar pairs to
functions:
• (e1, e2), which creates a new pair, is desugared to λf.f e1 e2.⁶
• e.1, which acquires the first element of a pair, is desugared to e λx.λy.x.
• e.2, which acquires the second element of a pair, is desugared to e λx.λy.y.
6: Strictly speaking, the correct desugaring is (λx.λy.λf.f x y) e1 e2 in eager languages like FAE, but we use the simpler one here.
However, such expressions are ill-typed in TFAE. When the type of e1
is num and the type of e2 is num → num, λ f.f e1 e2 is a function that
returns num in some cases and num → num in some other cases. There is
no way to represent the type of such a function. Thus, programs using
pairs cannot be written in TFAE.
To overcome the limitation, we extend TFAE to support pairs. The syntax
and dynamic semantics of pairs follow Exercise 7 of Chapter 9. We add
pair types as follows:
τ ::= ⋯ | τ × τ
Rule Typ-Pair
If the type of e1 is τ1 under Γ and the type of e2 is τ2 under Γ,
then the type of (e1 , e2 ) is τ1 × τ2 under Γ.
$$\frac{\Gamma \vdash e_1 : \tau_1 \qquad \Gamma \vdash e_2 : \tau_2}{\Gamma \vdash (e_1, e_2) : \tau_1 \times \tau_2}\ [\text{Typ-Pair}]$$
Rule Typ-Fst
If the type of e is τ1 × τ2 under Γ,
then the type of e.1 is τ1 under Γ.
$$\frac{\Gamma \vdash e : \tau_1 \times \tau_2}{\Gamma \vdash e.1 : \tau_1}\ [\text{Typ-Fst}]$$
Rule Typ-Snd
If the type of e is τ1 × τ2 under Γ,
then the type of e.2 is τ2 under Γ.
$$\frac{\Gamma \vdash e : \tau_1 \times \tau_2}{\Gamma \vdash e.2 : \tau_2}\ [\text{Typ-Snd}]$$
18.7 Exercises
1. This exercise extends TFAE with lists.
e ::= ⋯ | box e | ! e | e := e | e ; e
τ ::= ⋯ | box τ
e ::= ⋯ | x := e
e ::= ⋯ | ∗ e | & x | ∗ e := e
τ ::= ⋯ | τ∗
19.1 Syntax
Rule Rec
If e2 evaluates to v under σ′, where σ′ = σ[x1 ↦ ⟨λx2.e1, σ′⟩],
then def x1(x2:τ1):τ2 = e1 in e2 evaluates to v under σ.
19.3 Interpreter
Rule Typ-If0
If
the type of e1 is num under Γ,
the type of e2 is τ under Γ, and
the type of e3 is τ under Γ,
then
the type of if0 e1 e2 e3 is τ under Γ
19 Typing Recursive Functions 199
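$$\frac{\Gamma \vdash e_1 : \mathsf{num} \qquad \Gamma \vdash e_2 : \tau \qquad \Gamma \vdash e_3 : \tau}{\Gamma \vdash \mathsf{if0}\ e_1\ e_2\ e_3 : \tau}\ [\text{Typ-If0}]$$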
Rule Typ-If0’
If
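the type of e1 is τ′ under Γ,
the type of e2 is τ under Γ, and
the type of e3 is τ under Γ,
then
the type of if0 e1 e2 e3 is τ under Γ
$$\frac{\Gamma \vdash e_1 : \tau' \qquad \Gamma \vdash e_2 : \tau \qquad \Gamma \vdash e_3 : \tau}{\Gamma \vdash \mathsf{if0}\ e_1\ e_2\ e_3 : \tau}\ [\text{Typ-If0'}]$$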
Both rules make the type system sound, but they are different from each
other. Rule Typ-If0 rejects more expressions than Rule Typ-If0’ because
the former allows only integers to be conditions, while the latter allows
functions as well. Therefore, from the perspective of reducing false
positives, Rule Typ-If0’ is better than Rule Typ-If0. However, if the type of
the condition is a function type, it is highly likely to be a mistake of the
programmer. When the condition evaluates to a function, the conditional
expression always evaluates its false branch. The use of a conditional
expression is totally meaningless. Thus, from the perspective of detecting
programmers’ mistakes, Rule Typ-If0 is better than Rule Typ-If0’.
Now, we define the typing rule of a recursive function.
Rule Typ-Rec
If
the type of e1 is τ2 under Γ[x1 : τ1 → τ2, x2 : τ1] and
the type of e2 is τ under Γ[x1 : τ1 → τ2],
then
the type of def x1(x2:τ1):τ2 = e1 in e2 is τ under Γ.
$$\frac{\Gamma[x_1 : \tau_1 \to \tau_2,\ x_2 : \tau_1] \vdash e_1 : \tau_2 \qquad \Gamma[x_1 : \tau_1 \to \tau_2] \vdash e_2 : \tau}{\Gamma \vdash \mathsf{def}\ x_1(x_2{:}\tau_1){:}\tau_2 = e_1\ \mathsf{in}\ e_2 : \tau}\ [\text{Typ-Rec}]$$
The principle of the rule is the same as the typing rule of a lambda
abstraction: given type annotations are used during the type checking of
the function body. Since the function itself can be used in the body, the
type checking of the body requires the type of the function. This is the
reason that the expression needs the return type annotation in addition
to the parameter type annotation. To type-check the body, the return type
must be known. For the type checking of the body, the type environment
is extended with the type of the function, τ1 → τ2 , and the type of the
parameter, τ1 . Since the type of the body is the return type, it must be τ2 .
After the type checking of the body, e2 is type-checked. For e2 , only the
type of the function is required; the parameter can be used only in the
body.
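The following proof trees prove that the type of def f(x:num):num = if0 x 0 (x + f (x − 1)) in f 3 is num:
$$\frac{\dfrac{x \in Domain(\Gamma_1)}{\Gamma_1 \vdash x : \mathsf{num}} \qquad \dfrac{\dfrac{f \in Domain(\Gamma_1)}{\Gamma_1 \vdash f : \mathsf{num} \to \mathsf{num}} \qquad \dfrac{\dfrac{x \in Domain(\Gamma_1)}{\Gamma_1 \vdash x : \mathsf{num}} \qquad \Gamma_1 \vdash 1 : \mathsf{num}}{\Gamma_1 \vdash x - 1 : \mathsf{num}}}{\Gamma_1 \vdash f\ (x - 1) : \mathsf{num}}}{\Gamma_1 \vdash x + (f\ (x - 1)) : \mathsf{num}}$$
$$\frac{\dfrac{x \in Domain(\Gamma_1)}{\Gamma_1 \vdash x : \mathsf{num}} \qquad \Gamma_1 \vdash 0 : \mathsf{num} \qquad \Gamma_1 \vdash x + (f\ (x - 1)) : \mathsf{num}}{\Gamma_1 \vdash \mathsf{if0}\ x\ 0\ (x + f\ (x - 1)) : \mathsf{num}}$$
$$\frac{\Gamma_1 \vdash \mathsf{if0}\ x\ 0\ (x + f\ (x - 1)) : \mathsf{num} \qquad \dfrac{\dfrac{f \in Domain(\Gamma_2)}{\Gamma_2 \vdash f : \mathsf{num} \to \mathsf{num}} \qquad \Gamma_2 \vdash 3 : \mathsf{num}}{\Gamma_2 \vdash f\ 3 : \mathsf{num}}}{\emptyset \vdash \mathsf{def}\ f(x{:}\mathsf{num}){:}\mathsf{num} = \mathsf{if0}\ x\ 0\ (x + f\ (x - 1))\ \mathsf{in}\ f\ 3 : \mathsf{num}}$$
where, following Rule Typ-Rec, Γ1 = [f : num → num, x : num] and Γ2 = [f : num → num].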
To implement a type checker, we need to add the If0 and Rec cases to
the typeCheck function for TFAE.
The parameter type is p, and the return type is r. Thus, the type of f is
the function type from p to r. The type of x is p. To type-check b, the type
environment must have the types of f and x. The type of b must equal
r. The mustSame function compares the types. The function can be used
not only in b, which is the body of the function, but also in e. On the
other hand, the parameter x cannot be used in e. Therefore, it is enough
to add only the type of f to the type environment used to type-check e.
The type of the whole expression is equal to the type of e.
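The two added cases can be sketched as follows. This is a minimal sketch under assumptions: the names f, x, p, r, b, e, and mustSame come from the description above, whereas the field order of Rec, the Map-based TEnv, and the omission of the Fun case are choices made only to keep the example short.

```scala
object TrfaeSketch {
  sealed trait Type
  case object NumT extends Type
  case class ArrowT(p: Type, r: Type) extends Type

  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Add(l: Expr, r: Expr) extends Expr
  case class Sub(l: Expr, r: Expr) extends Expr
  case class Id(x: String) extends Expr
  case class App(f: Expr, a: Expr) extends Expr
  case class If0(c: Expr, t: Expr, f: Expr) extends Expr
  // def f(x: p): r = b in e
  case class Rec(f: String, x: String, p: Type, r: Type, b: Expr, e: Expr) extends Expr

  type TEnv = Map[String, Type]

  def error(msg: String): Nothing = throw new Exception(msg)

  def mustSame(t1: Type, t2: Type): Type =
    if (t1 == t2) t1 else error(s"$t1 is not equal to $t2")

  def typeCheck(e: Expr, env: TEnv): Type = e match {
    case Num(_) => NumT
    case Add(l, r) =>
      mustSame(typeCheck(l, env), NumT)
      mustSame(typeCheck(r, env), NumT)
    case Sub(l, r) =>
      mustSame(typeCheck(l, env), NumT)
      mustSame(typeCheck(r, env), NumT)
    case Id(x) => env.getOrElse(x, error(s"free identifier: $x"))
    case App(f, a) =>
      typeCheck(f, env) match {
        case ArrowT(p, r) =>
          mustSame(typeCheck(a, env), p)
          r
        case t => error(s"$t is not a function type")
      }
    case If0(c, t, f) =>
      // Rule Typ-If0: the condition must be a number, and both branches must agree.
      mustSame(typeCheck(c, env), NumT)
      mustSame(typeCheck(t, env), typeCheck(f, env))
    case Rec(f, x, p, r, b, e) =>
      // Rule Typ-Rec: the body sees both f and x; e sees only f.
      val fenv = env + (f -> ArrowT(p, r))
      mustSame(typeCheck(b, fenv + (x -> p)), r)
      typeCheck(e, fenv)
  }
}
```

Type-checking the expression of the proof trees, def f(x:num):num = if0 x 0 (x + f (x − 1)) in f 3, with this sketch yields NumT.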
19.6 Exercises
1. Write a TRFAE expression e such that only one of e and λ x:num.(e x)
terminates, while both e and λ x:num.(e x) are well-typed.
2. Consider the following language:
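$$\frac{\sigma \vdash e_1 \Rightarrow n_1 \qquad \sigma \vdash e_2 \Rightarrow n_2}{\sigma \vdash e_1 + e_2 \Rightarrow n_1 + n_2} \qquad \frac{\sigma \vdash e_1 \Rightarrow n_1 \qquad \sigma \vdash e_2 \Rightarrow n_2}{\sigma \vdash e_1 < e_2 \Rightarrow n_1 < n_2}$$
$$\frac{\sigma \vdash e \Rightarrow \mathsf{true} \qquad \sigma \vdash c_1 \Rightarrow \sigma_1}{\sigma \vdash \mathsf{if}\ e\ c_1\ c_2 \Rightarrow \sigma_1} \qquad \frac{\sigma \vdash e \Rightarrow \mathsf{false} \qquad \sigma \vdash c_2 \Rightarrow \sigma_1}{\sigma \vdash \mathsf{if}\ e\ c_1\ c_2 \Rightarrow \sigma_1}$$
$$\frac{\sigma \vdash e \Rightarrow \mathsf{true} \qquad \sigma \vdash c \Rightarrow \sigma_1 \qquad \sigma_1 \vdash \mathsf{while}\ e\ c \Rightarrow \sigma_2}{\sigma \vdash \mathsf{while}\ e\ c \Rightarrow \sigma_2}$$
$$\frac{\sigma \vdash e \Rightarrow \mathsf{false}}{\sigma \vdash \mathsf{while}\ e\ c \Rightarrow \sigma} \qquad \frac{\sigma \vdash c_1 \Rightarrow \sigma_1 \qquad \sigma_1 \vdash c_2 \Rightarrow \sigma_2}{\sigma \vdash c_1 ; c_2 \Rightarrow \sigma_2}$$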
Scala code:
Note that 1 can be replaced with any nonzero integer, and 0 can be
replaced with any integer.
It is tedious and error-prone to make fruit values like the above, while
Scala provides a simple way to construct fruit values. In TFAE, we can
define functions to mimic constructors in Scala.
val Apple= λ x:num.(0 , (x , (0 , 0))) in
val Banana= λ x:(num × num).(1 , (0 , x)) in
...
Apple is a function that takes an integer as an argument and returns an
apple whose radius is the given integer. Similarly, Banana is a function
that takes a pair of integers as an argument and returns a banana whose
size is represented by the given pair. We can now easily create fruit values
with Apple and Banana.
val apple=Apple 5 in
val banana=Banana (6 , 2) in
...
In Scala, a typical way to use a value of an ADT is pattern matching. For
instance, consider a function that computes the radius of a given fruit.
The function can be implemented like below.
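A minimal Scala sketch of such a function is shown here. The field names radius and height are assumptions chosen to match the description of apples and bananas in this chapter; only the overall shape (a sealed trait with two case classes and a pattern match) is the point.

```scala
object FruitSketch {
  // An ADT with two variants: an apple has a radius,
  // and a banana has a radius and a height (assumed field names).
  sealed trait Fruit
  case class Apple(radius: Int) extends Fruit
  case class Banana(radius: Int, height: Int) extends Fruit

  // Pattern matching selects the branch by variant and binds the fields.
  def radius(f: Fruit): Int = f match {
    case Apple(r) => r
    case Banana(r, _) => r
  }
}
```

For instance, FruitSketch.radius(FruitSketch.Banana(6, 2)) selects the second branch and returns the first field of the banana.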
TFAE does not have pattern matching, but we can exploit the fact that the
first value of a given pair indicates which fruit it is. We use a conditional
expression to perform a certain operation when the fruit is an apple, i.e.
the first value is 0, and another operation when the fruit is a banana, i.e.
the first value is nonzero. The following expression defines the radius
function:
val radius= λ x:(num × (num × (num × num))).if0 x.1 x.2.1 x.2.2.1 in . . .
This example shows that we can desugar ADTs and pattern matching to
pairs, functions, and conditional expressions in TFAE. The ADT of the
example has only two variants, which have one or two parameters. ADTs
can have any number of variants, and variants can have any number of
parameters. The same strategy can be used to desugar ADTs with more
variants and variants with more parameters.
Although nonrecursive ADTs can be desugared in TFAE, there are a
few flaws. First, desugared programs have unnecessary values. Even
when we make an apple, we need (0 , 0), which is a pair for the size of a
banana. Similarly, a banana value requires the size of an apple. They add
unessential complexity and computation to the code. Second, a single
type may represent conceptually different types when a single program
uses multiple ADTs. In practice, it is common to use multiple ADTs in a
single program. Recall that the type of a fruit is num ×(num ×(num × num)).
The same type may represent other types as well. For example, the type
of an electronic product can also be num × (num × (num × num)). In this
case, the type system allows a function intended to take a fruit to take an
electronic product as an argument. It does not incur any type errors at
20 Algebraic Data Types 204
which implements an integer list type. A list is one of the most famous
recursive types. Look at the definition of Cons. Cons is one variant of
List, so it defines List. At the same time, the definition uses List as the
type of the second parameter. Thus, List is a recursively defined type,
whose definition depends on itself.
Can we desugar the definition of a list in TFAE? For desugaring, the first
thing to do is to determine the type of a list. Let the type of a list be
τ. Then, τ equals (num, τ′) for some τ′. The first element is an integer
that indicates which variant the value denotes. When the integer is 0,
the value is Nil; otherwise, the value is Cons. When the value is Nil, no
other data is required since Nil does not have any parameters. Thus, the
second element of a type τ′ is for Cons. Since Cons has two parameters,
an integer and a list, τ′ equals (num, τ). Then, we obtain the equation
τ = (num, (num, τ)). However, as discussed in the previous chapter, no
type in TFAE can be the same as a part of itself. Therefore, there is no
such τ . We can conclude that we cannot desugar lists in TFAE. In general,
recursive ADTs cannot be expressed in TFAE.
This chapter defines TVFAE by extending TFAE² with ADTs, each of
which can be either nonrecursive or recursive. It allows programmers to
represent ADTs efficiently and concisely. In addition, many interesting
recursive data types become able to be used in programs.
2: For the rest of the chapter, local variable definitions and types are not parts of TFAE. However, we may keep using them in examples.
20.1 Syntax
First, we introduce type identifiers, which are the names of types defined
by programmers. For example, Fruit of the previous example is a type
identifier. Let TId be the set of every type identifier.
t ∈ TId
The metavariable t ranges over type identifiers. Since TId includes only
the names of user-defined types, num is not a member of TId.
The names of user-defined types can be used as types. For example, Fruit,
which is a type name, is used as a type in def radius(f: Fruit): Int
= .... Therefore, we extend the syntax of types as follows:
τ ::= ⋯ | t
Note that each type can have only two variants, and each variant can have
only one parameter. This restriction can be easily removed. For brevity,
this chapter keeps the restriction.
Let us see some example expressions. The following expression defines
the Fruit type and the radius function:
type Fruit = Apple@num + Banana@(num × num) in
val radius= λ x:Fruit.x match Apple(y) → y , Banana(y) → y.1 in
...
The following expression defines the List type:
type List = Nil@num + Cons@(num × List) in . . .
Note that Nil has one parameter since every variant of TVFAE must have
a parameter. Nil can have any value because the value is not used at all
anyway.
Recursive data types are typically used with recursive functions. If we
add recursive functions of TRFAE to the language, we can implement the
following sum function, which calculates the sum of every integer in a
given list:
def sum(x:List):num=x match Nil(y) → 0 , Cons(y) → y.1+(sum y.2) in . . .
For example, Apple(5) is an apple value whose radius is 5, and Banana((2, 6))
is a banana value whose radius is 2 and height is 6. Both ⟨Apple⟩
and ⟨Banana⟩ are constructors. When ⟨Apple⟩ is applied to 5, the result
is Apple(5), and when ⟨Banana⟩ is applied to (2, 6), the result is
Banana((2, 6)).
Now, let us define the dynamic semantics of the added expressions. First,
consider an expression that defines a new type.
Rule TypeDef
If e evaluates to v under σ[x1 ↦ ⟨x1⟩, x2 ↦ ⟨x2⟩],
then type t = x1@τ1 + x2@τ2 in e evaluates to v under σ.
Rule Match-L
If e evaluates to x1(v′) under σ and e1 evaluates to v under σ[x3 ↦ v′],
then e match x1(x3) → e1, x2(x4) → e2 evaluates to v under σ.
$$\frac{\sigma \vdash e \Rightarrow x_1(v') \qquad \sigma[x_3 \mapsto v'] \vdash e_1 \Rightarrow v}{\sigma \vdash e\ \mathsf{match}\ x_1(x_3) \to e_1,\ x_2(x_4) \to e_2 \Rightarrow v}\ [\text{Match-L}]$$
Rule Match-R
If e evaluates to x2(v′) under σ and e2 evaluates to v under σ[x4 ↦ v′],
then e match x1(x3) → e1, x2(x4) → e2 evaluates to v under σ.
$$\frac{\sigma \vdash e \Rightarrow x_2(v') \qquad \sigma[x_4 \mapsto v'] \vdash e_2 \Rightarrow v}{\sigma \vdash e\ \mathsf{match}\ x_1(x_3) \to e_1,\ x_2(x_4) \to e_2 \Rightarrow v}\ [\text{Match-R}]$$
Rule App-Cnstr
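If e1 evaluates to ⟨x⟩ under σ and e2 evaluates to v under σ,
then e1 e2 evaluates to x(v) under σ.
$$\frac{\sigma \vdash e_1 \Rightarrow \langle x \rangle \qquad \sigma \vdash e_2 \Rightarrow v}{\sigma \vdash e_1\ e_2 \Rightarrow x(v)}\ [\text{App-Cnstr}]$$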
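$$\frac{\dfrac{\dfrac{\dfrac{\mathsf{Apple} \in Domain(\sigma_1)}{\sigma_1 \vdash \mathsf{Apple} \Rightarrow \langle \mathsf{Apple} \rangle} \qquad \sigma_1 \vdash 5 \Rightarrow 5}{\sigma_1 \vdash \mathsf{Apple}\ 5 \Rightarrow \mathsf{Apple}(5)} \qquad \dfrac{y \in Domain(\sigma_2)}{\sigma_2 \vdash y \Rightarrow 5}}{\sigma_1 \vdash (\mathsf{Apple}\ 5)\ \mathsf{match}\ \mathsf{Apple}(y) \to y,\ \mathsf{Banana}(z) \to z.1 \Rightarrow 5}}{\emptyset \vdash e \Rightarrow 5}$$
where
σ1 = [Apple ↦ ⟨Apple⟩, Banana ↦ ⟨Banana⟩]
σ2 = σ1[y ↦ 5]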
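Rules TypeDef, Match-L, Match-R, and App-Cnstr can be sketched as an interpreter in Scala. This is a minimal sketch under assumptions: the names ConstructorV and VariantV are hypothetical, functions and arithmetic are left out to keep the example short, and type annotations are dropped from TypeDef because they play no role at run time.

```scala
object TvfaeInterpSketch {
  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Id(x: String) extends Expr
  case class App(f: Expr, a: Expr) extends Expr
  // type t = x1@τ1 + x2@τ2 in e (parameter types omitted)
  case class TypeDef(t: String, x1: String, x2: String, e: Expr) extends Expr
  // e match x1(x3) → e1, x2(x4) → e2
  case class Match(e: Expr, x1: String, x3: String, e1: Expr,
                   x2: String, x4: String, e2: Expr) extends Expr

  sealed trait Value
  case class NumV(n: Int) extends Value
  case class ConstructorV(x: String) extends Value       // ⟨x⟩
  case class VariantV(x: String, v: Value) extends Value // x(v)

  type Env = Map[String, Value]

  def error(msg: String): Nothing = throw new Exception(msg)

  def interp(e: Expr, env: Env): Value = e match {
    case Num(n) => NumV(n)
    case Id(x) => env.getOrElse(x, error(s"free identifier: $x"))
    case App(f, a) =>
      interp(f, env) match {
        case ConstructorV(x) => VariantV(x, interp(a, env)) // Rule App-Cnstr
        case v => error(s"not applicable: $v")
      }
    case TypeDef(_, x1, x2, b) =>
      // Rule TypeDef: both constructors are added to the environment.
      interp(b, env + (x1 -> ConstructorV(x1)) + (x2 -> ConstructorV(x2)))
    case Match(t, x1, x3, e1, x2, x4, e2) =>
      interp(t, env) match {
        case VariantV(x, v) if x == x1 => interp(e1, env + (x3 -> v)) // Rule Match-L
        case VariantV(x, v) if x == x2 => interp(e2, env + (x4 -> v)) // Rule Match-R
        case v => error(s"no matching variant: $v")
      }
  }
}
```

Evaluating a type definition of Fruit whose body is (Apple 5) match Apple(y) → y, Banana(z) → z follows the same steps as the proof tree above and yields NumV(5).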
20.3 Interpreter
Well-Formed Types
⊢ ⊆ TEnv × T
Rule Wf-NumT
num is well-formed under Γ.
Γ ⊢ num [Wf-NumT]
The first well-formedness rule states that num is always well-formed. num
is neither a type identifier nor the name of a user-defined type. It is a
built-in type, which always exists. Thus, num is well-formed under any
type environment.
Rule Wf-ArrowT
If τ1 is well-formed under Γ and τ2 is well-formed under Γ,
then τ1 → τ2 is well-formed under Γ.
$$\frac{\Gamma \vdash \tau_1 \qquad \Gamma \vdash \tau_2}{\Gamma \vdash \tau_1 \to \tau_2}\ [\text{Wf-ArrowT}]$$
Rule Wf-IdT
If t is in the domain of Γ,
then t is well-formed under Γ.
$$\frac{t \in Domain(\Gamma)}{\Gamma \vdash t}\ [\text{Wf-IdT}]$$
If a type identifier can be found in the type environment, the type identifier
is a well-formed type. For example, if λ x:Fruit.x is the whole expression,
Fruit is ill-formed since there is no Fruit in the type environment. However,
in type Fruit = Apple@num + Banana@(num × num) in λ x:Fruit.x,
Fruit is well-formed since the expression puts the definition of Fruit
into the type environment.
Typing Rules
Rule Typ-TypeDef
If
Γ′ denotes Γ[t = x1@τ1 + x2@τ2, x1 : τ1 → t, x2 : τ2 → t],
τ1 is well-formed under Γ′,
τ2 is well-formed under Γ′, and
the type of e is τ under Γ′,
then
the type of type t = x1@τ1 + x2@τ2 in e is τ under Γ.
$$\frac{\Gamma' = \Gamma[t = x_1@\tau_1 + x_2@\tau_2,\ x_1 : \tau_1 \to t,\ x_2 : \tau_2 \to t] \qquad \Gamma' \vdash \tau_1 \qquad \Gamma' \vdash \tau_2 \qquad \Gamma' \vdash e : \tau}{\Gamma \vdash \mathsf{type}\ t = x_1@\tau_1 + x_2@\tau_2\ \mathsf{in}\ e : \tau}\ [\text{Typ-TypeDef}]$$
Rule Typ-Match
If
the type of e is t under Γ,
t is in the domain of Γ,
Γ(t) equals x1@τ1 + x2@τ2,
the type of e1 is τ under Γ[x3 : τ1], and
the type of e2 is τ under Γ[x4 : τ2],
then
the type of e match x1(x3) → e1, x2(x4) → e2 is τ under Γ.
$$\frac{\Gamma \vdash e : t \qquad t \in Domain(\Gamma) \qquad \Gamma(t) = x_1@\tau_1 + x_2@\tau_2 \qquad \Gamma[x_3 : \tau_1] \vdash e_1 : \tau \qquad \Gamma[x_4 : \tau_2] \vdash e_2 : \tau}{\Gamma \vdash e\ \mathsf{match}\ x_1(x_3) \to e_1,\ x_2(x_4) \to e_2 : \tau}\ [\text{Typ-Match}]$$
Rule Typ-Fun
If τ1 is well-formed under Γ and the type of e is τ2 under Γ[x : τ1],
then the type of λx:τ1.e is τ1 → τ2 under Γ.
$$\frac{\Gamma \vdash \tau_1 \qquad \Gamma[x : \tau_1] \vdash e : \tau_2}{\Gamma \vdash \lambda x{:}\tau_1.e : \tau_1 \to \tau_2}\ [\text{Typ-Fun}]$$
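$$\frac{\dfrac{\mathsf{Apple} \in Domain(\Gamma_1)}{\Gamma_1 \vdash \mathsf{Apple} : \mathsf{num} \to \mathsf{Fruit}} \qquad \Gamma_1 \vdash 5 : \mathsf{num}}{\Gamma_1 \vdash \mathsf{Apple}\ 5 : \mathsf{Fruit}}$$
$$\frac{y \in Domain(\Gamma_2)}{\Gamma_2 \vdash y : \mathsf{num}} \qquad \frac{\dfrac{z \in Domain(\Gamma_3)}{\Gamma_3 \vdash z : \mathsf{num} \times \mathsf{num}}}{\Gamma_3 \vdash z.1 : \mathsf{num}}$$
$$\frac{\Gamma_1 \vdash \mathsf{Apple}\ 5 : \mathsf{Fruit} \qquad \mathsf{Fruit} \in Domain(\Gamma_1) \qquad \Gamma_1(\mathsf{Fruit}) = \mathsf{Apple}@\mathsf{num} + \mathsf{Banana}@(\mathsf{num} \times \mathsf{num}) \qquad \Gamma_2 \vdash y : \mathsf{num} \qquad \Gamma_3 \vdash z.1 : \mathsf{num}}{\Gamma_1 \vdash (\mathsf{Apple}\ 5)\ \mathsf{match}\ \mathsf{Apple}(y) \to y,\ \mathsf{Banana}(z) \to z.1 : \mathsf{num}}$$
$$\frac{\Gamma_1 = \Gamma_1 \qquad \Gamma_1 \vdash \mathsf{num} \qquad \dfrac{\Gamma_1 \vdash \mathsf{num} \qquad \Gamma_1 \vdash \mathsf{num}}{\Gamma_1 \vdash \mathsf{num} \times \mathsf{num}} \qquad \Gamma_1 \vdash (\mathsf{Apple}\ 5)\ \mathsf{match}\ \mathsf{Apple}(y) \to y,\ \mathsf{Banana}(z) \to z.1 : \mathsf{num}}{\emptyset \vdash e : \mathsf{num}}$$
where
Γ1 = [Fruit = Apple@num + Banana@(num × num), Apple : num → Fruit, Banana : (num × num) → Fruit]
Γ2 = Γ1[y : num]
Γ3 = Γ1[z : num × num]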
First, we extend the definition of a type since TVFAE has a new sort of a
type.
IdT(t) represents t.
TEnv has two fields: vars and tbinds. The field vars, which is a map
from strings to TVFAE types, contains the types of variables. The field
tbinds, which is a map from strings to maps, contains type definitions.
Each map in tbinds maps strings, which are the names of variants, to
TVFAE types, which are the parameter types of variants. For example,
tbinds containing the Fruit type is as follows:
For the ease of adding type definitions and variables to type environments,
the TEnv class has two methods named add. Adding that the type of a
variable x is num to env can be written like below.
env.add("x", NumT)
The contains method of the TEnv class checks whether a particular type
identifier is a bound type identifier. For instance, the following code
checks whether Fruit is bound:
env.contains("Fruit")
Now let us define a function that checks the well-formedness of a type. The
following wfType function checks whether a given type is well-formed
under a given type environment:
If the type is ill-formed under the type environment, the function throws
an exception.
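The type environment and the well-formedness check described above might look as follows. This is a minimal sketch under assumptions: the names TEnv, vars, tbinds, add, contains, wfType, and IdT come from the text, while PairT, the default arguments, and the choice to return the checked type are illustrative.

```scala
object WfTypeSketch {
  sealed trait Type
  case object NumT extends Type
  case class ArrowT(p: Type, r: Type) extends Type
  case class PairT(f: Type, s: Type) extends Type // assumed, for num × num in examples
  case class IdT(t: String) extends Type

  case class TEnv(
    vars: Map[String, Type] = Map.empty,               // types of variables
    tbinds: Map[String, Map[String, Type]] = Map.empty // type definitions
  ) {
    // Adds the type of a variable.
    def add(x: String, t: Type): TEnv = copy(vars = vars + (x -> t))
    // Adds a type definition: variant names mapped to parameter types.
    def add(x: String, cs: Map[String, Type]): TEnv = copy(tbinds = tbinds + (x -> cs))
    // Checks whether a type identifier is bound.
    def contains(x: String): Boolean = tbinds.contains(x)
  }

  // Returns the type if it is well-formed under env and throws otherwise.
  def wfType(t: Type, env: TEnv): Type = t match {
    case NumT => t
    case ArrowT(p, r) => ArrowT(wfType(p, env), wfType(r, env))
    case PairT(f, s) => PairT(wfType(f, env), wfType(s, env))
    case IdT(x) =>
      if (env.contains(x)) t else throw new Exception(s"$x is a free type")
  }
}
```

Under an environment built with TEnv().add("Fruit", Map[String, Type]("Apple" -> NumT, "Banana" -> PairT(NumT, NumT))), the type IdT("Fruit") is well-formed, whereas under the empty environment wfType throws an exception.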
Now, we add the TypeDef and Match cases to the typeCheck function.
First, the function adds the type definition and the constructors to the
type environment. Then, it checks the well-formedness of the parameter
types of the variants under the new type environment. If both are well-
formed, it type-checks the body expression. The type of the body is the
type of the whole type-defining expression.
First, the function type-checks the target expression. The type must be a
type identifier. The definition of the type should be found in the type
environment. The definition gives the parameter type of each variant.
The function type-checks e1 and e2 under the type environments with
the type of x1 and with the type of x2, respectively. The types must be
the same, and if it is the case, the common type is the type of the whole
pattern matching expression.
The Id case also has a small change. Due to the new definition of a type
environment, a way to find the type of a variable is a bit different.
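The TypeDef, Match, and Id cases can be put together as in the following sketch. It is a minimal sketch under assumptions: field names and helper functions (error, mustSame) are illustrative, pairs and arithmetic are omitted, and the TypeDef case implements Rule Typ-TypeDef exactly as stated above, without any additional side conditions.

```scala
object TvfaeTypeSketch {
  sealed trait Type
  case object NumT extends Type
  case class ArrowT(p: Type, r: Type) extends Type
  case class IdT(t: String) extends Type

  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Id(x: String) extends Expr
  case class Fun(x: String, t: Type, b: Expr) extends Expr
  case class App(f: Expr, a: Expr) extends Expr
  // type t = x1@t1 + x2@t2 in b
  case class TypeDef(t: String, x1: String, t1: Type, x2: String, t2: Type, b: Expr) extends Expr
  // e match x1(x3) → e1, x2(x4) → e2
  case class Match(e: Expr, x1: String, x3: String, e1: Expr,
                   x2: String, x4: String, e2: Expr) extends Expr

  case class TEnv(vars: Map[String, Type] = Map.empty,
                  tbinds: Map[String, Map[String, Type]] = Map.empty) {
    def add(x: String, t: Type): TEnv = copy(vars = vars + (x -> t))
    def add(x: String, cs: Map[String, Type]): TEnv = copy(tbinds = tbinds + (x -> cs))
    def contains(x: String): Boolean = tbinds.contains(x)
  }

  def error(msg: String): Nothing = throw new Exception(msg)
  def mustSame(t1: Type, t2: Type): Type =
    if (t1 == t2) t1 else error(s"$t1 is not equal to $t2")

  def wfType(t: Type, env: TEnv): Type = t match {
    case NumT => t
    case ArrowT(p, r) => ArrowT(wfType(p, env), wfType(r, env))
    case IdT(x) => if (env.contains(x)) t else error(s"$x is a free type")
  }

  def typeCheck(e: Expr, env: TEnv): Type = e match {
    case Num(_) => NumT
    case Id(x) => env.vars.getOrElse(x, error(s"free identifier: $x")) // lookup in vars
    case Fun(x, t, b) =>
      wfType(t, env)
      ArrowT(t, typeCheck(b, env.add(x, t)))
    case App(f, a) =>
      typeCheck(f, env) match {
        case ArrowT(p, r) => mustSame(typeCheck(a, env), p); r
        case t => error(s"$t is not a function type")
      }
    case TypeDef(t, x1, t1, x2, t2, b) =>
      // Add the definition and both constructors, then check the parameter types.
      val nenv = env
        .add(t, Map(x1 -> t1, x2 -> t2))
        .add(x1, ArrowT(t1, IdT(t)))
        .add(x2, ArrowT(t2, IdT(t)))
      wfType(t1, nenv)
      wfType(t2, nenv)
      typeCheck(b, nenv)
    case Match(e0, x1, x3, e1, x2, x4, e2) =>
      typeCheck(e0, env) match {
        case IdT(t) =>
          val cs = env.tbinds.getOrElse(t, error(s"$t is a free type"))
          val t1 = cs.getOrElse(x1, error(s"$x1 is not a variant of $t"))
          val t2 = cs.getOrElse(x2, error(s"$x2 is not a variant of $t"))
          mustSame(typeCheck(e1, env.add(x3, t1)), typeCheck(e2, env.add(x4, t2)))
        case t => error(s"$t is not a type identifier")
      }
  }
}
```

The two add methods are distinguished by the type of their second argument, so adding a variable type and adding a type definition read the same at the call site.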
The expression defines Fruit twice in a nested manner. The outer Fruit
has Apple and Banana as variants, and the inner Fruit has Cherry
and Durian as variants. The expression applies a function of (Fruit →
num) → num to a value of Fruit → num, so it is well-typed. However, the
expression causes a run-time error. The function of (Fruit → num) →
num applies a given function to a value of the Cherry variant because
Fruit has Cherry and Durian inside the function. However, the inner
definition of Fruit is unavailable outside the function, so the argument
given to the function is a function that expects a value of Apple or Banana.
Thus, at run time, the pattern matching fails and incurs a run-time
error.
The reason for the broken type soundness is that the language allows multiple
different types of the same name, while its type checking depends solely
on the names of types to distinguish different types. Two different types
may be incorrectly considered the same type when they have the same
name.
There are multiple ways to fix the problem. The first solution is to prohibit
local type definitions. Every type definition should be at top level, just like
functions in F1VAE. Then, types cannot be nested, and every type must
have a different name from each other. Since there cannot be different
types of the same name, the problem is resolved.
The second solution is to prevent interaction between different types of
the same name. It can be achieved by changing Rule Typ-TypeDef like
below.
Rule Typ-TypeDef’
If
t is not in the domain of Γ,
Γ′ denotes Γ[t = x1@τ1 + x2@τ2, x1 : τ1 → t, x2 : τ2 → t],
τ1 is well-formed under Γ′,
τ2 is well-formed under Γ′,
the type of e is τ under Γ′, and
τ is well-formed under Γ,
then
the type of type t = x1@τ1 + x2@τ2 in e is τ under Γ.
The first one prevents nested types from having the same name. The
second one prevents each locally defined type from escaping its scope.
In this way, we can effectively solve the issue. A program still can have
different types of the same name, but different types of the same name
cannot meet each other, i.e. they cannot be used in the same place.
The third solution is to rename types before type checking in order to
remove any duplication in type names. Since the bound-bind relation
between identifiers can be easily determined with simple syntactic
checking, it is possible to rename types without changing the semantics of
a given program. For instance, the above example becomes the following
expression after renaming:
20.7 Exercises
1. What does each of the following expressions evaluate to? If it is a
run-time error, describe where the error occurs.
a) type Fruit = Apple@num + Banana@num in
type Animal = Apple@(num → num) + Banana@(num → num) in
(λx:Fruit.x match Apple(y) → y, Banana(y) → y) (Banana 10)
b) type Fruit = Apple@num + Banana@num in
type Fruit = Apple@(num → num) + Banana@(num → num) in
(λx:Fruit.x match Apple(y) → y, Banana(y) → y) (Banana 10)
2. Consider the following expression:
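$$\frac{\Gamma' = \Gamma[t = x_1@\tau_1 + x_2@\tau_2,\ x_1 : \tau_1 \to t,\ x_2 : \tau_2 \to t] \qquad \Gamma' \vdash \tau_2 \qquad \Gamma' \vdash e : \tau}{\Gamma \vdash \mathsf{type}\ t = x_1@\tau_1 + x_2@\tau_2\ \mathsf{in}\ e : \tau}$$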
On the other hand, the following TFAE expression is rejected by the type
system:
val f= λ x:num.x in
val y=f 1 in
f true
entity as multiple types. For example, it may allow λ x.x to have mul-
tiple types: num → num and bool → bool. There are three widely-used
ways to realize polymorphism in a language: parametric polymorphism,
subtype polymorphism, and ad-hoc polymorphism. The topic of this
chapter is parametric polymorphism. Chapter 22 introduces subtype
polymorphism, and ad-hoc polymorphism is beyond the scope of this
book.
To introduce parametric polymorphism, we first need to discuss what
parameterization is. Functions are well-known examples of parameter-
izing entities. Each function parameterizes an expression with a value
(or an expression in the case of lazy languages). Consider λ x.x + x. In
this function, x is the parameter. The body, x + x, is parameterized by
x. This function is the most general form of adding a value to the same
value. By applying the function, we can express any expression that adds
a value to the same value. For example, 1 + 1 is equivalent to (λ x.x + x) 1,
and 42 + 42 is equivalent to (λ x.x + x) 42. A function abstracts an expres-
sion by replacing some portion of the expression with a parameter. By
applying a function to values, multiple expressions can be expressed
without repeating the common constituents. Only different parts should
be written as an argument in each case.
Parametric polymorphism allows entities to be parameterized by types.
It is a new form of parameterization, which functions do not provide.
Parametric polymorphism allows parameterizing an expression with a
type, instead of a value. To distinguish this new notion of parameterization
from functions, we use the term type abstraction. While functions are
applied to values to replace their parameters with real values, type
abstractions are applied to types to replace their type parameters with real
types. To differentiate application of type abstractions from application
of functions, we use the term type application. Since type abstractions
parameterize expressions, the results of type application are values,
just like functions. The following table compares functions and type
abstractions:
Consider λ x:num.x and λ x:bool.x. The only difference is the type annota-
tion: num and bool. We can parameterize both expressions with a type by
introducing a type parameter α . By replacing num with α in λ x:num.x, we
obtain λ x: α.x. Similarly, by replacing bool with α in λ x:bool.x, we obtain
λ x: α.x. The results are exactly identical to each other. We can make a
type abstraction that takes a type τ as a type argument and returns λ x: τ.x
as a result. This book uses Λ to denote type abstractions. Thus, the type
abstraction we want is Λα.λ x: α.x. The type abstraction can be applied
to types to recover the original expressions. This book uses [] to denote
type application. Then, (Λα.λ x: α.x)[num] is equivalent to λ x:num.x, and
(Λα.λ x:α.x)[bool] is equivalent to λ x:bool.x.
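Scala's generic methods give a concrete counterpart of this idea. In the following minimal sketch, the names PolyIdSketch, id, n, and b are chosen only for illustration: the type parameter A plays the role of α, and the explicit type arguments in square brackets correspond to type application.

```scala
object PolyIdSketch {
  // Counterpart of Λα.λx:α.x: A is a type parameter.
  def id[A](x: A): A = x

  // Counterpart of (Λα.λx:α.x)[num] 1.
  val n: Int = id[Int](1)
  // Counterpart of (Λα.λx:α.x)[bool] true.
  val b: Boolean = id[Boolean](true)
}
```

A single definition of id is used at two different types, which is exactly what the type abstraction provides.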
After adding parametric polymorphism, we can make the previous
example well-typed while defining a function only once.
21 Parametric Polymorphism 220
It is still more complex than the FAE version but defines a function only
once, unlike the TFAE version.
Traditionally, parametric polymorphism was supported only by functional
languages. For example, OCaml and Haskell have long been well known
for their support for parametric polymorphism. On the other hand,
object-oriented languages provided only subtype polymorphism. For
instance, Java lacked parametric polymorphism through Java 4. However,
programmers these days require languages to provide more advanced
features because their programs have become more complicated. For this
reason, Java has supported parametric polymorphism since Java
5. Many recent languages, such as Scala, provide both parametric and
subtype polymorphism. In the context of object-oriented programming,
parametric polymorphism is often called generics since it allows generic
programming.
This chapter defines PTFAE by extending TFAE with parametric polymorphism.
PTFAE is known as System F in the programming language
community. System F was first discovered by Girard in the context of logic
in 1972 [Gir72]. Later, Reynolds independently discovered the equivalent
system in the context of computer science in 1974 [Rey74]. System F, or PTFAE,
is the most foundational formulation of parametric polymorphism,
and its metatheory and variants are still widely studied today.
21.1 Syntax
α ∈ TId
τ ::= ⋯ | α | ∀α.τ
v ::= ⋯ | ⟨Λα.e, σ⟩
Rule TyAbs
Λα.e evaluates to ⟨Λα.e, σ⟩ under σ.
Rule TyApp
If
e evaluates to ⟨Λα.e′, σ′⟩ under σ and
e′[α ← τ] evaluates to v under σ′,
then
e[τ] evaluates to v under σ.
not have any roles at run time. We can omit substitution in the dynamic
semantics of PTFAE. However, if we extend the language to support any
form of dynamic type testing, substitution is mandatory. In addition, if
we want to prove type soundness, we should prove that every expression
of a certain type evaluates to a value of the same type when the evaluation
terminates. This property is called type preservation, and evaluation will
not preserve the type of an expression if the rule omits substitution. For
these reasons, Rule TyApp requires substitution.
The following proof trees prove that (Λα.λ x: α.x)[num] 1 evaluates to
1:
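$$\frac{\emptyset \vdash (\Lambda\alpha.\lambda x{:}\alpha.x)[\mathsf{num}] \Rightarrow \langle \lambda x.x, \emptyset \rangle \qquad \emptyset \vdash 1 \Rightarrow 1 \qquad \dfrac{x \in Domain(\sigma_1)}{\sigma_1 \vdash x \Rightarrow 1}}{\emptyset \vdash (\Lambda\alpha.\lambda x{:}\alpha.x)[\mathsf{num}]\ 1 \Rightarrow 1}$$
where σ1 = [x ↦ 1].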
Well-Formed Types
Rule Wf-NumT
num is well-formed under Γ.
Γ ⊢ num [Wf-NumT]
Rule Wf-ArrowT
If τ1 is well-formed under Γ and τ2 is well-formed under Γ,
then τ1 → τ2 is well-formed under Γ.
$$\frac{\Gamma \vdash \tau_1 \qquad \Gamma \vdash \tau_2}{\Gamma \vdash \tau_1 \to \tau_2}\ [\text{Wf-ArrowT}]$$
Rule Wf-IdT
If α is in the domain of Γ,
then α is well-formed under Γ.
$$\frac{\alpha \in Domain(\Gamma)}{\Gamma \vdash \alpha}\ [\text{Wf-IdT}]$$
Rule Wf-ForallT
If τ is well-formed under Γ[α],
then ∀ α.τ is well-formed under Γ.
$$\frac{\Gamma[\alpha] \vdash \tau}{\Gamma \vdash \forall\alpha.\tau}\ [\text{Wf-ForallT}]$$
Typing Rules
Rule Typ-TyAbs
If the type of e is τ under Γ[α],
then the type of Λα.e is ∀ α.τ under Γ.
$$\frac{\Gamma[\alpha] \vdash e : \tau}{\Gamma \vdash \Lambda\alpha.e : \forall\alpha.\tau}\ [\text{Typ-TyAbs}]$$
Rule Typ-TyApp
If τ is well-formed under Γ and the type of e is ∀α.τ′ under Γ,
then the type of e[τ] is τ′[α ← τ] under Γ.
$$\frac{\Gamma \vdash \tau \qquad \Gamma \vdash e : \forall\alpha.\tau'}{\Gamma \vdash e[\tau] : \tau'[\alpha \leftarrow \tau]}\ [\text{Typ-TyApp}]$$
If the type of e is ∀α.τ′, the type of e[τ] is τ′[α ← τ], which is a type
obtained by substituting τ for α in τ′. Since τ is a user-written type
annotation, the well-formedness of τ must be checked.
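The substitution τ′[α ← τ] used by Rule Typ-TyApp can be sketched in Scala as follows. This is a minimal sketch under assumptions: the names subst and tyApp are hypothetical, the well-formedness premise is left out, and the substitution is not capture-avoiding, so it is adequate only when the substituted type contains no free type parameter that is bound inside the target type.

```scala
object PtfaeSubstSketch {
  sealed trait Type
  case object NumT extends Type
  case class ArrowT(p: Type, r: Type) extends Type
  case class IdT(a: String) extends Type
  case class ForallT(a: String, t: Type) extends Type

  // subst(t, a, ty) computes t[a ← ty].
  def subst(t: Type, a: String, ty: Type): Type = t match {
    case NumT => t
    case ArrowT(p, r) => ArrowT(subst(p, a, ty), subst(r, a, ty))
    case IdT(b) => if (a == b) ty else t
    case ForallT(b, body) =>
      // An inner binder of the same name shadows a, so substitution stops there.
      if (a == b) t else ForallT(b, subst(body, a, ty))
  }

  // Result type of e[ty] when the type of e is te (the conclusion of Typ-TyApp).
  def tyApp(te: Type, ty: Type): Type = te match {
    case ForallT(a, body) => subst(body, a, ty)
    case t => throw new Exception(s"$t is not a polymorphic type")
  }
}
```

For example, applying tyApp to the representation of ∀α.α → α and NumT yields ArrowT(NumT, NumT), which corresponds to the type num → num of (Λα.λx:α.x)[num].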
In addition, like in TVFAE, Rule Typ-Fun has to check the well-formedness
of the parameter type annotation.
Rule Typ-Fun
If τ1 is well-formed under Γ and the type of e is τ2 under Γ[x : τ1 ],
then the type of λx : τ1 .e is τ1 → τ2 under Γ.
$$\frac{\Gamma \vdash \tau_1 \qquad \Gamma[x : \tau_1] \vdash e : \tau_2}{\Gamma \vdash \lambda x{:}\tau_1.e : \tau_1 \to \tau_2}\ [\text{Typ-Fun}]$$
The following proof tree proves that the type of (Λα.λ x: α.x)[num] 1 is
num:
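$$\frac{\dfrac{\emptyset \vdash \mathsf{num} \qquad \dfrac{\dfrac{[\alpha] \vdash \alpha \qquad [\alpha, x : \alpha] \vdash x : \alpha}{[\alpha] \vdash \lambda x{:}\alpha.x : \alpha \to \alpha}}{\emptyset \vdash \Lambda\alpha.\lambda x{:}\alpha.x : \forall\alpha.\alpha \to \alpha}}{\emptyset \vdash (\Lambda\alpha.\lambda x{:}\alpha.x)[\mathsf{num}] : \mathsf{num} \to \mathsf{num}} \qquad \emptyset \vdash 1 : \mathsf{num}}{\emptyset \vdash (\Lambda\alpha.\lambda x{:}\alpha.x)[\mathsf{num}]\ 1 : \mathsf{num}}$$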
The current type system of PTFAE has two problems. First, multiple type
parameters of the same name break the type soundness, as multiple
type definitions of the same name do in TVFAE. Second, syntactic
comparison of types makes the type checking too restrictive. For example,
if Λβ.λ x: β.x is given to a function that expects a value of ∀ α.α → α ,
syntactically comparing ∀ α.α → α and ∀ β.β → β judges them to be
different and makes the type checking reject the program. However,
∀α.α → α and ∀β.β → β denote the same type semantically.
The best solution to both of the problems is de Bruijn indices, introduced
in Chapter 17. Chapter 17 shows use of de Bruijn indices for expressions.
However, de Bruijn indices are not limited to expressions; they can be
applied to types. For instance, both ∀ α.α → α and ∀ β.β → β can be
represented with ∀.0 → 0, so their semantic equivalence can be verified
with syntactic comparison. In addition, de Bruijn indices prevent multiple
types from being considered as the same type because of their names.
De Bruijn indices seem to be the best solution, but, still, other solutions
can be used. The three solutions described in Chapter 20 can be applied
to PTFAE in the same manner to resolve the first issue. The second issue
can be fixed by renaming type parameters before the comparison. For
example, simple syntactic transformation can transform ∀ α.α → α into
∀β.β → β.
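The comparison of types up to renaming can be sketched by converting named types to de Bruijn form and comparing the results syntactically. This is a minimal sketch under assumptions: all names prefixed with D, the function toDeBruijn, and the function sameType are hypothetical, and free type parameters are simply kept by name.

```scala
object DeBruijnTypeSketch {
  // Types with named type parameters.
  sealed trait Type
  case object NumT extends Type
  case class ArrowT(p: Type, r: Type) extends Type
  case class IdT(a: String) extends Type
  case class ForallT(a: String, t: Type) extends Type

  // Nameless types: a bound type parameter is the distance to its binder.
  sealed trait DType
  case object DNumT extends DType
  case class DArrowT(p: DType, r: DType) extends DType
  case class DIdxT(i: Int) extends DType
  case class DFreeT(a: String) extends DType
  case class DForallT(t: DType) extends DType

  // ctx lists the enclosing binders, innermost first.
  def toDeBruijn(t: Type, ctx: List[String] = Nil): DType = t match {
    case NumT => DNumT
    case ArrowT(p, r) => DArrowT(toDeBruijn(p, ctx), toDeBruijn(r, ctx))
    case IdT(a) =>
      val i = ctx.indexOf(a) // first occurrence is the innermost binder
      if (i >= 0) DIdxT(i) else DFreeT(a)
    case ForallT(a, body) => DForallT(toDeBruijn(body, a :: ctx))
  }

  // Two types are considered the same if their nameless forms coincide.
  def sameType(t1: Type, t2: Type): Boolean = toDeBruijn(t1) == toDeBruijn(t2)
}
```

With this sketch, the representations of ∀α.α → α and ∀β.β → β both become DForallT(DArrowT(DIdxT(0), DIdxT(0))), so sameType returns true for them.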
21.4 Exercises
1. Draw the type derivation of each of the following expressions:
a) (λ f:∀ α.α → α.f[num] 10) (Λα.λ x: α.x)
b) (Λα.Λβ.λ f: α → β.λ x: α.f x)[num][num] (λ y:num.17 − y) 9
2. Rewrite the following code with type abstractions and type appli-
cations to replace all the occurrences of ? with types and to make
function calls take explicit type arguments.
val f= λ g:?.λ x:?.g x in
val g= λ x:?.x in
f g 10
3. Consider the following language:
Syntax
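$$\begin{array}{rcl} e & ::= & n \mid b \mid x \mid \lambda x.e \mid e\ e \mid \mathsf{val}\ x = e\ \mathsf{in}\ e \mid e; e \\ b & ::= & \mathsf{true} \mid \mathsf{false} \\ \tau & ::= & \mathsf{num} \mid \mathsf{bool} \mid \tau \to \tau \mid \alpha \\ \sigma & ::= & \tau \mid \forall\alpha.\sigma \end{array}$$
$$\sigma \in TScheme \qquad \Gamma \in Id \xrightarrow{\text{fin}} TScheme$$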
Typing rules
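$$\Gamma \vdash n : \mathsf{num} \qquad \Gamma \vdash b : \mathsf{bool} \qquad \frac{x \in Domain(\Gamma) \qquad \Gamma(x) \succ \tau}{\Gamma \vdash x : \tau}$$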
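$$\frac{\Gamma \vdash e_1 : \tau_1 \qquad \Gamma \vdash e_2 : \tau_2}{\Gamma \vdash e_1; e_2 : \tau_2} \qquad \frac{\Gamma[x : \tau] \vdash e : \tau'}{\Gamma \vdash \lambda x.e : \tau \to \tau'}$$
$$\frac{\Gamma \vdash e_1 : \tau \to \tau' \qquad \Gamma \vdash e_2 : \tau}{\Gamma \vdash e_1\ e_2 : \tau'} \qquad \frac{\Gamma \vdash e_1 : \tau \qquad \tau \prec_\Gamma \sigma \qquad \Gamma[x : \sigma] \vdash e_2 : \tau'}{\Gamma \vdash \mathsf{val}\ x = e_1\ \mathsf{in}\ e_2 : \tau'}$$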
Miscellaneous definitions
$$\sigma \succ \tau: \qquad \forall\alpha_1.\cdots\forall\alpha_n.\tau \succ \tau[\alpha_1 \leftarrow \tau_1, \ldots, \alpha_n \leftarrow \tau_n]$$
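$$\tau \prec_\Gamma \sigma: \qquad \frac{FTV(\tau) \setminus FTV(\Gamma) = \{\alpha_1, \ldots, \alpha_n\}}{\tau \prec_\Gamma \forall\alpha_1.\cdots\forall\alpha_n.\tau}$$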
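$$\begin{array}{rcl} FTV(\mathsf{num}) & = & \emptyset \\ FTV(\mathsf{bool}) & = & \emptyset \\ FTV(\tau_1 \to \tau_2) & = & FTV(\tau_1) \cup FTV(\tau_2) \\ FTV(\alpha) & = & \{\alpha\} \\ FTV(\forall\alpha.\sigma) & = & FTV(\sigma) \setminus \{\alpha\} \\ FTV(\Gamma) & = & \bigcup_{x \in Domain(\Gamma)} FTV(\Gamma(x)) \end{array}$$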
object Expr {
sealed trait Expr
case class Num(n: Int) extends Expr
case class Add(l: Expr, r: Expr) extends Expr
case class Sub(l: Expr, r: Expr) extends Expr
case class Id(x: String) extends Expr
case class Fun(p: String, b: Expr) extends Expr
case class App(f: Expr, a: Expr) extends Expr
}
E ::= n | E + E | E − E | x | λx.E | E E
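$$\frac{\sigma \vdash e_1 \Rightarrow \langle x \rangle \qquad \sigma \vdash e_2 \Rightarrow v}{\sigma \vdash e_1\ e_2 \Rightarrow x(v)}$$
$$\frac{\sigma \vdash e \Rightarrow x_1(v') \qquad \sigma[x_3 \mapsto v'] \vdash e_1 \Rightarrow v}{\sigma \vdash e\ \mathsf{match}\ x_1(x_3) \to e_1,\ x_2(x_4) \to e_2 \Rightarrow v}$$
$$\frac{\sigma \vdash e \Rightarrow x_2(v') \qquad \sigma[x_4 \mapsto v'] \vdash e_2 \Rightarrow v}{\sigma \vdash e\ \mathsf{match}\ x_1(x_3) \to e_1,\ x_2(x_4) \to e_2 \Rightarrow v}$$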
For example, programmers can write the following code, which
defines a polymorphic option type, in this language:
type option[α] = None@num + Some@α;
val getOrElse = Λα.λ x:option[α].λ y:α.(
x match
None(z) → y ,
Some(z) → z
);
getOrElse[num] (Some[num] 1) 2
On the other hand, the following code is not well-typed since types
are not recursive in this language:
type foo[α] = bar@num + baz@foo[α];
...
Note that foo appears in the definition of itself, which implies that
foo is a recursive type.
22.1 Records
Syntax
First, we introduce labels, which are the names of fields in records. Let
L be the set of all labels, and let the metavariable l range over labels.
22 Subtype Polymorphism 229
l∈L
e ::= ··· | {l = e, ···, l = e} | e.l
Dynamic Semantics
v ::= ··· | {l = v, ···, l = v}
Rule Record
If e1 evaluates to v1 under σ, ···, and en evaluates to vn under σ,
then {l1 = e1, ···, ln = en} evaluates to {l1 = v1, ···, ln = vn} under σ.
σ ⊢ e1 ⇒ v1    ···    σ ⊢ en ⇒ vn
────────────────────────────────────────────────────── [Record]
σ ⊢ {l1 = e1, ···, ln = en} ⇒ {l1 = v1, ···, ln = vn}
Rule Proj
If e evaluates to {···, l = v, ···} under σ,
then e.l evaluates to v under σ.
σ ⊢ e ⇒ {···, l = v, ···}
────────────────────────── [Proj]
σ ⊢ e.l ⇒ v
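Rule Record and Rule Proj can be read as two cases of an interpreter. The following is a minimal sketch under simplifying assumptions: it covers only numbers, records, and projections, so the environment σ is omitted. The names Record, Proj, RecordV, and interp are chosen here for illustration.

```scala
sealed trait Expr
case class Num(n: Int) extends Expr
case class Record(fields: Map[String, Expr]) extends Expr
case class Proj(e: Expr, label: String) extends Expr

sealed trait Value
case class NumV(n: Int) extends Value
case class RecordV(fields: Map[String, Value]) extends Value

def interp(e: Expr): Value = e match {
  case Num(n) => NumV(n)
  // Rule Record: evaluate every field expression to a value.
  case Record(fs) => RecordV(fs.map { case (l, fe) => l -> interp(fe) })
  // Rule Proj: evaluate the record, then look up the label.
  case Proj(r, l) => interp(r) match {
    case RecordV(fs) => fs.getOrElse(l, sys.error(s"no such field: $l"))
    case v => sys.error(s"not a record: $v")
  }
}
```

The two sys.error calls correspond to the run-time errors that the static semantics is meant to rule out.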
Static Semantics
Since records are a new kind of value, existing types cannot include
records. We need to add new types that records can belong to.
τ ::= ··· | {l : τ, ···, l : τ}
Rule Typ-Record
If the type of e1 is τ1 under Γ, ···, and the type of en is τn under Γ,
then the type of {l1 = e1, ···, ln = en} is {l1 : τ1, ···, ln : τn} under Γ.
Γ ⊢ e1 : τ1    ···    Γ ⊢ en : τn
────────────────────────────────────────────────────── [Typ-Record]
Γ ⊢ {l1 = e1, ···, ln = en} : {l1 : τ1, ···, ln : τn}
Rule Typ-Proj
If the type of e is {···, l : τ, ···} under Γ,
then the type of e.l is τ under Γ.
Γ ⊢ e : {···, l : τ, ···}
────────────────────────── [Typ-Proj]
Γ ⊢ e.l : τ
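The two typing rules translate into a type checker in the same way. This is a sketch only: it handles numbers, records, and projections, so the type environment Γ is omitted, and the names RecordT and typeCheck are assumptions for illustration.

```scala
sealed trait Expr
case class Num(n: Int) extends Expr
case class Record(fields: Map[String, Expr]) extends Expr
case class Proj(e: Expr, label: String) extends Expr

sealed trait Type
case object NumT extends Type
case class RecordT(fields: Map[String, Type]) extends Type

def typeCheck(e: Expr): Type = e match {
  case Num(_) => NumT
  // Rule Typ-Record: the record type lists the type of every field.
  case Record(fs) => RecordT(fs.map { case (l, fe) => l -> typeCheck(fe) })
  // Rule Typ-Proj: the record type must contain the projected label.
  case Proj(r, l) => typeCheck(r) match {
    case RecordT(fs) => fs.getOrElse(l, sys.error(s"no such field: $l"))
    case t => sys.error(s"not a record type: $t")
  }
}
```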
The current type system is sound but not expressive enough. It rejects
many expressions that do not cause any run-time errors. Consider the
following expression:
(λ x:{a : num}.x.a) {a = 1, b = 2}
The expression evaluates {a = 1, b = 2}.a, which yields 1 without any
error. However, the type system rejects the expression. The type of
{a = 1, b = 2} is {a : num, b : num}, while the parameter type of the
function is {a : num}. Since the argument type is different from the
designated parameter type, the application expression is ill-typed.
Currently, the type {a : num} denotes the set of every record that has
only the integer-valued field a. However, this definition is too restrictive.
The type implies that its value can be used in any place that accesses the
field a and expects the value of the field to be an integer. Thus, the type
does not need to exclude values that have other fields in addition to the
field a.
To resolve the problem, we extend the meaning of {a : num}. Now, the
type includes any record that has an integer-valued field a. Records
that have additional fields can also be values of {a : num}. This change
can be attained by modifying Rule Typ-Record as follows.
Rule Typ-Record′
If the type of e1 is τ1 under Γ, ···, the type of en is τn under Γ, ···, and the
type of en+m is τn+m under Γ,
then the type of {l1 = e1, ···, ln = en, ···, ln+m = en+m} is
{l1 : τ1, ···, ln : τn} under Γ.
The rule allows forgetting the types of some fields if they are unnecessary.
Now, {a = 1, b = 2} is a value of {a : num}. Thus, the previous example,
(λ x:{a : num}.x.a) {a = 1, b = 2}, is well-typed.
Does this fix solve all the problems? Unfortunately, no. Consider the
following expression:
val x={a = 1, b = 2} in
val y=(λ x:{a : num}.x.a) x in
(λ x:{a : num , b : num}.x.a + x.b) x
This expression is still ill-typed though it does not incur any run-time
errors. If we say the type of x is {a : num}, the first function application
is well-typed. However, the second function application is ill-typed. If
we say the type of x is {a : num , b : num} instead, the second function
application becomes well-typed. However, the first function application
becomes ill-typed. There is no way to make both application expressions
well-typed. We need a way to consider x as an expression of {a : num}
and as an expression of {a : num , b : num} at the same time. In other
words, we should assign multiple types to a single entity, and this is the
notion of polymorphism.
Subtype polymorphism is a form of polymorphism based on
the notion of subtyping. Recall that a type is a set of values. Sometimes,
one type is a subset of another type. For example, any value that belongs
to {a : num, b : num} is a value of {a : num}, so {a : num, b : num} is a
subset of {a : num}. When τ1 is a subset of τ2 , we say that τ1 is a subtype
of τ2 and τ2 is a supertype of τ1 . For example, {a : num , b : num} is a
subtype of {a : num} and {a : num} is a supertype of {a : num , b : num}.
This is the notion of subtyping.
Roughly speaking, subtyping is an “A is a B” relation between types. As
an example, consider Animal and Cat, which denote the type of every
animal and the type of every cat, respectively. We know that a cat is an
animal. Then, we can say that Cat is a subtype of Animal. On the other
hand, can we say that an animal is a cat? No, because there is an animal
that is not a cat. For instance, a dog is an animal, but not a cat. Thus,
Animal is not a subtype of Cat. We can do the same thing for record types.
A record that has fields a and b is a record that has a. (For brevity, ignore
the types of the fields here.) Therefore, {a : num , b : num} is a subtype
of {a : num}. On the other hand, a record that has a is not a record that
has both a and b since it may lack b. As a consequence, {a : num} is not a
subtype of {a : num , b : num}.
Mathematically, subtyping is a binary relation over types.
<: ⊆ T × T
Now, let us see how subtyping induces polymorphism. The key insight
is that any expression of τ1 can be treated as an expression of τ2 without
breaking type soundness when τ1 is a subtype of τ2 . For example, suppose
that there is an animal hospital that cures any animal. We can consider
the hospital as a function that takes a value of Animal. A cat is an animal,
so any cat can be cured in the hospital. Thus, if an expression evaluates
to a value of Cat, it can be considered as an expression that evaluates to a
value of Animal and safely given to the function representing the hospital.
On the other hand, the inverse is false. If a hospital cures only cats and we
know only that someone has an animal, then we cannot tell them
to take the animal to the hospital. There is no guarantee that the hospital
will be able to cure the animal. Thus, the fact that τ1 is a subtype of τ2
does not imply that any expression of τ2 can be treated as an expression
of τ1. In a similar fashion, any expression of {a : num, b : num} can be
treated as an expression of {a : num}, but the inverse is false. We can
express this idea with the following typing rule:
Rule Typ-Sub
If the type of e is τ′ under Γ and τ′ is a subtype of τ,
then the type of e is τ under Γ.
Γ ⊢ e : τ′    τ′ <: τ
────────────────────── [Typ-Sub]
Γ ⊢ e : τ
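Scala itself has subtype polymorphism, so the animal hospital analogy can be written down directly. The following is an illustrative sketch in Scala rather than in the language of this chapter. The class names follow the analogy above, while cure, kitty, and result are names assumed here.

```scala
class Animal(val name: String)
class Cat(name: String) extends Animal(name) // Cat is a subtype of Animal

// The hospital cures any animal, so its parameter type is Animal.
def cure(a: Animal): String = s"${a.name} is cured"

val kitty: Cat = new Cat("Kitty")
// An expression of type Cat is accepted where an Animal is expected,
// which is the effect of Rule Typ-Sub.
val result: String = cure(kitty)

// The inverse direction is rejected at compile time: a function whose
// parameter type is Cat cannot be applied to an arbitrary Animal.
```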
Rule Sub-Refl
τ is a subtype of τ.
τ <: τ [Sub-Refl]
Rule Sub-Trans
If τ1 is a subtype of τ2 and τ2 is a subtype of τ3,
then τ1 is a subtype of τ3.
τ1 <: τ2    τ2 <: τ3
───────────────────── [Sub-Trans]
τ1 <: τ3
Consider the previous example again. The type system should be able to
prove {a : num , b : num} <: {a : num}. To achieve the goal, we define the
following subtyping rule:
Rule Sub-Width
{l1 : τ1, ···, ln : τn, l : τ} is a subtype of {l1 : τ1, ···, ln : τn}.
{l1 : τ1, ···, ln : τn, l : τ} <: {l1 : τ1, ···, ln : τn} [Sub-Width]
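∅ ⊢ 1 : num    ∅ ⊢ 2 : num
───────────────────────────────────────
∅ ⊢ {a = 1, b = 2} : {a : num, b : num}    {a : num, b : num} <: {a : num}
──────────────────────────────────────────────────────────────────────────
∅ ⊢ {a = 1, b = 2} : {a : num}

{a : num, b : num, c : num} <: {a : num, b : num}    {a : num, b : num} <: {a : num}
────────────────────────────────────────────────────────────────────────────────────
{a : num, b : num, c : num} <: {a : num}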
By the same principle, {}, which is the empty record type, is a supertype
of every record type. In other words, every record type is a subtype of
{}.
Alas, the type system is still restrictive. The following expression is
ill-typed but does not cause run-time errors:
val x={a = 1, b = 2} in
val y=(λ x:{b : num , a : num}.x.a + x.b) x in
(λ x:{a : num , b : num}.x.a + x.b) x
The type of x is {a : num , b : num}. Therefore, the second function
application is well-typed, while the first function application is not.
We need to make x be a value of {a : num , b : num} and a value of
{b : num , a : num} at the same time. Like before, fixing Rule Typ-Record
cannot be a proper solution. The correct solution is to add a new subtyping
rule.
The key idea to define a new subtyping rule is that the order between
fields does not matter at all. For example, a record that has fields a
and b is a record that has fields b and a. Thus, it is safe to consider
{a : num , b : num} as a subtype of {b : num , a : num}. By generalizing
this observation, we define the following subtyping rule:
Rule Sub-Perm
If (l1, τ1), ···, (ln, τn) is a permutation of (l1′, τ1′), ···, (ln′, τn′),
then {l1 : τ1, ···, ln : τn} is a subtype of {l1′ : τ1′, ···, ln′ : τn′}.
The rule states that altering the order between the fields of a record type
results in a subtype of the record type.
Even after the addition of Rule Sub-Width and Rule Sub-Perm, the type
system can still be improved. Consider the following expression:
val x={a = {a = 1, b = 2}} in
val y=(λ x:{a : {a : num}}.x.a.a) x in
(λ x:{a : {a : num , b : num}}.x.a.a + x.a.b) x
The above expression does not incur any run-time errors. However,
the first function application is ill-typed, while the second function
application is well-typed. We need to make {a = {a = 1, b = 2}} be a
value of {a : {a : num}} and a value of {a : {a : num , b : num}} at the
same time by adding a subtyping rule.
The current type system is too strict about the types of fields in records.
For example, consider {a : {a : num , b : num}} and {a : {a : num}}. A
value of {a : {a : num , b : num}} has at least one field, whose name is a.
The value of the field is a value of {a : num , b : num}. We already know
that any value of {a : num , b : num} is a value of {a : num}. Therefore,
we can say that a value of {a : {a : num , b : num}} has at least one field,
whose name is a and type is {a : num}. In fact, it is the characteristic of a
value of {a : {a : num}}. As a result, any value of {a : {a : num , b : num}}
is a value of {a : {a : num}} at the same time, so {a : {a : num , b : num}}
must be a subtype of {a : {a : num}}. By generalizing this observation,
we define the following subtyping rule:
Rule Sub-Depth
If τ1 is a subtype of τ1′, ···, and τn is a subtype of τn′,
then {l1 : τ1, ···, ln : τn} is a subtype of {l1 : τ1′, ···, ln : τn′}.
The rule states that strengthening1 the type of each field in a record type
results in a subtype of the record type.
1: If we strengthen a type, a subtype is obtained in the sense that a subtype imposes
a stronger condition on its value than the original type.
By using Rule Sub-Width and Rule Sub-Depth together, we can prove
that {a : {a : num, b : num}} is a subtype of {a : {a : num}}.
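The three record subtyping rules can be combined into a single check. The sketch below is one possible algorithmic reading and not the declarative rules themselves: a record type is a subtype of another if every field required by the supertype is present in the subtype (Sub-Width), in any position (Sub-Perm), with a field type that is itself a subtype (Sub-Depth). The names RecordT and subtype and the sample types are assumptions for illustration.

```scala
sealed trait Type
case object NumT extends Type
case class RecordT(fields: List[(String, Type)]) extends Type

def subtype(t1: Type, t2: Type): Boolean = (t1, t2) match {
  case (NumT, NumT) => true
  case (RecordT(fs1), RecordT(fs2)) =>
    val m1 = fs1.toMap
    // Every field of the supertype must occur in the subtype with a
    // field type that is a subtype of the expected one.
    fs2.forall { case (l, ft2) =>
      m1.get(l).exists(ft1 => subtype(ft1, ft2))
    }
  case _ => false
}

val a = RecordT(List("a" -> NumT))
val ab = RecordT(List("a" -> NumT, "b" -> NumT))
val ba = RecordT(List("b" -> NumT, "a" -> NumT))
val nestedAB = RecordT(List("a" -> ab))
val nestedA = RecordT(List("a" -> a))
```

Here subtype(ab, a), subtype(ab, ba), and subtype(nestedAB, nestedA) all hold, while subtype(a, ab) does not.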
Rule Sub-Ret
If τ2 is a subtype of τ2′,
then τ1 → τ2 is a subtype of τ1 → τ2′.
τ2 <: τ2′
───────────────────── [Sub-Ret]
τ1 → τ2 <: τ1 → τ2′
Rule Sub-Param
If τ1′ is a subtype of τ1,
then τ1 → τ2 is a subtype of τ1′ → τ2.
τ1′ <: τ1
───────────────────── [Sub-Param]
τ1 → τ2 <: τ1′ → τ2
Rule Sub-ArrowT
If τ1′ is a subtype of τ1 and τ2 is a subtype of τ2′,
then τ1 → τ2 is a subtype of τ1′ → τ2′.
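Scala's function types follow the same pattern as Rule Sub-ArrowT: a function type is contravariant in its parameter type and covariant in its return type. The sketch below illustrates this in Scala, not in the language of this chapter. The classes and the names adopt and describe are assumptions for illustration.

```scala
class Animal { def sound: String = "..." }
class Cat extends Animal { override def sound: String = "meow" }

// A function that accepts any Animal and always returns a Cat.
val adopt: Animal => Cat = _ => new Cat

// Animal => Cat is a subtype of Cat => Animal because
// Cat <: Animal holds for the parameter (the direction is reversed)
// and Cat <: Animal holds for the result.
val describe: Cat => Animal = adopt

// The opposite assignment is rejected by the compiler:
// val bad: Animal => Cat = describe
```

Calling describe(new Cat).sound yields "meow", since the function behind describe still returns a Cat.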
top is the top type. The type denotes the set of every value. The set is
a superset of any set of values. Thus, the top type is a supertype of
every type. In other words, every type is a subtype of the top type. The
following is the subtyping rule for the top type:
Rule Sub-TopT
τ is a subtype of top.
The top type can be used to give a single type to two or more completely
unrelated expressions. Suppose that the language has conditional
expressions. Then, the type of the following expression is {a : num}:
if0 0 {a = 1} {a = 1, b = 2}
By extending STFAE with the top type, the type of the above expression
can be top.
bottom is the bottom type, which is the dual of the top type. The bottom
type denotes the empty set. Since the empty set is a subset of any set, the
bottom type is a subtype of every type, and every type is a supertype
of the bottom type. The following is the subtyping rule for the bottom
type:
Rule Sub-BottomT
bottom is a subtype of τ .
Even though no value is a value of the bottom type, the bottom type
is useful. It can be the type of expressions that throw exceptions or call
first-class continuations. Those expressions do not evaluate to any values.
They just change control flows. Thus, it is quite natural to say that the
type of such an expression is the bottom type.
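Scala provides both kinds of types: Any is a supertype of every type, and Nothing is a subtype of every type. The sketch below shows them in Scala, not in STFAE. The names mixed, fail, and safeDiv are assumptions for illustration.

```scala
// Any plays the role of top: unrelated values share it as a type.
val mixed: List[Any] = List(1, true, Map("a" -> 1))

// Nothing plays the role of bottom: `fail` never returns a value,
// it only changes the control flow by throwing an exception.
def fail(msg: String): Nothing = throw new IllegalStateException(msg)

// Since Nothing is a subtype of Int, both branches have type Int.
def safeDiv(n: Int, d: Int): Int =
  if (d == 0) fail("division by zero") else n / d
```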
22.6 Exercises
1. Write whether each expression is well-typed in STFAE without the
top type. If so, draw the type derivation. Otherwise, explain why.
a) if0 1 {} 2
b) if0 1 {} {a = 2}
2. Consider TFAE with lists in Exercise 1 of Chapter 18.
a) When can list τ1 be a subtype of list τ2 ? Write a new subtyping
rule for list types.
b) Suppose that we extend the language as follows:
e ::= ··· | e[e] := e
τ ::= ··· | top
The typing rule for the new expression, which mutates an
element of a list, is as follows:
Γ ⊢ e1 : list τ    Γ ⊢ e2 : num    Γ ⊢ e3 : τ
──────────────────────────────────────────────
Γ ⊢ e1[e2] := e3 : τ
Suppose that the operational semantics and the typing rules are
the same as those of TVFAE except that some rules are revised to
handle zero or more variants properly.
Some expressions are rejected by the type system even though they
do not cause run-time errors. The following expression is such an
example:
type abc = apple@num + banana@num + cherry@num;
val f = λ x:abc.(
x match
apple(a) → a,
banana(b) → b,
cherry(c) → c
);
type ab = apple@num + banana@num;
f (apple 42)
We want to add subtyping to the language to allow more expressions,
including the above one. Add subtyping rule(s) of the
form Γ ⊢ τ <: τ to the language. Assume that the following rules
already exist:
Γ ⊢ τ <: τ

Γ ⊢ τ1 <: τ2    Γ ⊢ τ2 <: τ3
──────────────────────────────
Γ ⊢ τ1 <: τ3
[OSV16] Martin Odersky, Lex Spoon, and Bill Venners. ‘Programming in Scala: Updated for Scala 2.12’. In:
Artima Incorporation, USA, (2016) (cited on pages 5, 20).
[CB14] Paul Chiusano and Rnar Bjarnason. Functional programming in Scala. Manning Publications Co.,
2014 (cited on page 5).
[Rey09] John C Reynolds. Theories of programming languages. Cambridge University Press, 2009 (cited on
page 133).
[Tai67] William W Tait. ‘Intensional interpretations of functionals of finite type I’. In: The journal of symbolic
logic 32.2 (1967), pp. 198–212 (cited on page 197).
[Pie02] Benjamin C Pierce. Types and programming languages. MIT press, 2002 (cited on page 197).
[Gir72] Jean-Yves Girard. ‘Interprétation fonctionnelle et élimination des coupures de l’arithmétique
d’ordre supérieur’. PhD thesis. Éditeur inconnu, 1972 (cited on page 220).
[Rey74] John C Reynolds. ‘Towards a theory of type structure’. In: Programming Symposium. Springer. 1974,
pp. 408–425 (cited on page 220).
Special Terms
A
ADT algebraic data type. 43
AST abstract syntax tree. 69
B
BNF Backus-Naur form. 63
C
CBN call-by-name. 129
CBR call-by-reference. 126
CBV call-by-value. 126
CPS continuation-passing style. 142
J
JVM Java Virtual Machine. 7
R
REPL read-eval-print loop. 8
Alphabetical Index