
Grammars and Parsing

Johan Jeuring
Doaitse Swierstra
Copyright © 2001 by Johan Jeuring and Doaitse Swierstra.
Contents

1 Goals
1.1 History
1.2 Grammar analysis of context-free grammars
1.3 Compositionality
1.4 Abstraction mechanisms
2 Context-Free Grammars
2.1 Languages
2.2 Grammars
2.2.1 Notational conventions
2.3 The language of a grammar
2.3.1 Some basic languages
2.4 Parse trees
2.4.1 From context-free grammars to datatypes
2.5 Grammar transformations
2.6 Concrete and abstract syntax
2.7 Constructions on grammars
2.7.1 SL: an example
2.8 Parsing
2.9 Exercises
3 Parser combinators
3.1 The type Parser
3.2 Elementary parsers
3.3 Parser combinators
3.3.1 Matching parentheses: an example
3.4 More parser combinators
3.4.1 Parser combinators for EBNF
3.4.2 Separators
3.5 Arithmetical expressions
3.6 Generalised expressions
3.7 Exercises
4 Grammar and Parser design
4.1 Step 1: A grammar for the language
4.2 Step 2: Analysing the grammar
4.3 Step 3: Transforming the grammar
4.4 Step 4: Deciding on the types
4.5 Step 5: Constructing the basic parser
4.5.1 Basic parsers from strings
4.5.2 A basic parser from tokens
4.6 Step 6: Adding semantic functions
4.7 Step 7: Did you get what you expected
4.8 Exercises
5 Regular Languages
5.1 Finite-state automata
5.1.1 Deterministic finite-state automata
5.1.2 Nondeterministic finite-state automata
5.1.3 Implementation
5.1.4 Constructing a DFA from an NFA
5.1.5 Partial evaluation of NFAs
5.2 Regular grammars
5.2.1 Equivalence of Regular grammars and Finite automata
5.3 Regular expressions
5.4 Proofs
5.5 Exercises
6 Compositionality
6.1 Lists
6.1.1 Built-in lists
6.1.2 User-defined lists
6.1.3 Streams
6.2 Trees
6.2.1 Binary trees
6.2.2 Trees for matching parentheses
6.2.3 Expression trees
6.2.4 General trees
6.2.5 Efficiency
6.3 Algebraic semantics
6.4 Expressions
6.4.1 Evaluating expressions
6.4.2 Adding variables
6.4.3 Adding definitions
6.4.4 Compiling to a stack machine
6.5 Block structured languages
6.5.1 Blocks
6.5.2 Generating code
6.6 Exercises
7 Computing with parsers
7.1 Insert a semantic function in the parser
7.2 Apply a fold to the abstract syntax
7.3 Deforestation
7.4 Using a class instead of abstract syntax
7.5 Passing an algebra to the parser
8 Programming with higher-order folds
8.1 The rep min problem
8.1.1 A straightforward solution
8.1.2 Lambda lifting
8.1.3 Tupling computations
8.1.4 Merging tupled functions
8.2 A small compiler
8.2.1 The language
8.2.2 A stack machine
8.2.3 Compiling to the stack machine
8.3 Attribute grammars
9 Pumping Lemmas: the expressive power of languages
9.1 The Chomsky hierarchy
9.1.1 Type-0 grammars
9.1.2 Type-1 grammars
9.1.3 Type-2 grammars
9.1.4 Type-3 grammars
9.2 The pumping lemma for regular languages
9.3 The pumping lemma for context-free languages
9.4 Proofs of pumping lemmas
9.5 Exercises
10 LL Parsing
10.1 LL Parsing: Background
10.1.1 A stack machine for parsing
10.1.2 Some example derivations
10.1.3 LL(1) grammars
10.2 LL Parsing: Implementation
10.2.1 Context-free grammars in Haskell
10.2.2 Parse trees in Haskell
10.2.3 LL(1) parsing
10.2.4 Implementation of isLL(1)
10.2.5 Implementation of lookahead
10.2.6 Implementation of empty
10.2.7 Implementation of first and last
10.2.8 Implementation of follow
A The Stack module
B Answers to exercises
Preface

The following lecture notes are based on texts from previous years, which were written, among other things, as part of the project Kwaliteit en Studeerbaarheid.

The notes have been improved over the past years, but we warmly welcome suggestions for further improvement, especially where it concerns pointing out connections with other courses.

Many people have contributed to the creation of these notes, by writing a part of them or by commenting on (parts of) the notes. Special mention is deserved by Jeroen Fokker, Rik van Geldrop, and Luc Duponcheel, who helped by writing one or more chapters of the notes. Comments were provided by, among others: Arthur Baars, Arnoud Berendsen, Gijsbert Bol, Breght Boschker, Martin Bravenboer, Pieter Eendebak, Rijk-Jan van Haaften, Graham Hutton, Daan Leijen, Andres Löh, Erik Meijer, and Vincent Oostindië.

Finally, we would like to take this opportunity to give some advice on studying:

It is our own experience that explaining the material to someone else is often what first makes clear which parts you do not yet master yourself. So if you think you understand a chapter well, try to explain it in your own words.

Practice makes perfect. The more attention is paid to the presentation of the material, and the more examples are given, the more tempting it is to conclude after reading a chapter that you have actually mastered it. Understanding, however, is not the same as knowing, knowing is something different from mastering, and mastering is again something different from being able to apply it. So do the exercises included in the notes yourself, and do not do so by merely checking whether you understand the solutions that others have found. Try to keep track for yourself of the stage you have reached with respect to each of the stated learning goals. Ideally, you should be able to put together a nice exam for your fellow students!

Make sure you stay up to date. In contrast to some other courses, in this course it is easy to lose the firm ground under your feet. It is not a course with new chances every week. We have tried to remedy this somewhat through the organisation of the material, but the overall structure does not leave much freedom here. If you have missed a week, it is almost impossible to understand the new material of the week after. The time you then spend at lectures and exercise classes is hardly effective, with the result that around the exam you often need a great deal of time (which is not there) to study everything on your own.

We use the language Haskell to present many concepts and algorithms. If you still have difficulties with the language Haskell, do not hesitate to do something about it immediately, and to ask for help if necessary. Otherwise you will make life very hard for yourself. Good tools are half the work, and Haskell is our tool here.

Good luck, and hopefully also a lot of fun,
Johan Jeuring and Doaitse Swierstra
Chapter 1
Goals
introduction
Courses on Grammars, Parsing and Compilation of programming languages have
always been some of the core components of a computer science curriculum. The
reason for this is that from the very beginning of these curricula it has been one
of the few areas where the development of formal methods and the application
of formal techniques in actual program construction come together. For a long
time the construction of compilers has been one of the few areas where we had a
methodology available, where we had tools for generating parts of compilers out of
formal descriptions of the tasks to be performed, and where such program generators
were indeed generating programs which would have been impossible to create by
hand. For many practicing computer scientists the course on compiler construction
still is one of the highlights of their education.
One of the things which were not so clear, however, is where exactly this joy originated from: the techniques taught definitely had a certain elegance, we could construct programs someone else could not, thus giving us the feeling we had 'the right stuff', and when completing the practical exercises, which invariably consisted of constructing a compiler for some toy language, we had the usual satisfied feeling. This feeling was augmented by the fact that we would not have had the foggiest idea how to complete such a product a few months before, and now we knew how to do it.
This situation has remained so for years, and it is only in recent years that we have started to discover and make explicit the reasons why this area attracted so much interest. Many of the techniques which were taught on a 'this is how you solve this kind of problems' basis have since been provided with a theoretical underpinning which explains why the techniques work. As a beneficial side-effect we also gradually learned to see where the discovered concepts further played a role, thus linking the area with many other areas of computer science; and not only that, but also giving us a means to explain such links, stress their importance, show correspondences and transfer insights from one area of interest to the other.
goals
The goals of these lecture notes can be split into primary goals, which are associated with the specific subject studied, and secondary (but not less important) goals, which have to do with developing skills which one would expect every educated computer scientist to have. The primary, somewhat more traditional, goals are:

- to understand how to describe structures (i.e. formulas) using grammars;
- to know how to parse, i.e. how to recognise (build) such structures in (from) a sequence of symbols;
- to know how to analyse grammars to see whether or not specific properties hold;
- to understand the concept of compositionality;
- to be able to apply these techniques in the construction of all kinds of programs;
- to familiarise oneself with the concept of computability.

The secondary, more far-reaching, goals are:

- to develop the capability to abstract;
- to understand the concepts of abstract interpretation and partial evaluation;
- to understand the concept of domain-specific languages;
- to show how proper formalisations can be used as a starting point for the construction of useful tools;
- to improve general programming skills;
- to show a wide variety of useful programming techniques;
- to show how to develop programs in a calculational style.
1.1 History
When at the end of the fifties the use of computers became more and more widespread, and their reliability had increased enough to justify applying them to a wide range of problems, it was no longer the actual hardware which posed most of the problems. Writing larger and larger programs by more and more people sparked the development of the first more or less machine-independent programming language FORTRAN (FORmula TRANslator), which was soon to be followed by ALGOL-60 and COBOL.
For the developers of the FORTRAN language, of which John Backus was the prime architect, the problem of how to describe the language was not a hot issue: much more important problems were to be solved, such as what should be in the language and what not, how to construct a compiler for the language that would fit into the small memories which were available at that time (kilobytes instead of megabytes), and how to generate machine code that would not be ridiculed by programmers who had thus far written such code by hand. As a result the language was very much implicitly defined by what was accepted by the compiler and what not.
Soon after the development of FORTRAN an international working group started to work on the design of a machine-independent high-level programming language, which was to become known under the name ALGOL-60. As a remarkable side-effect of this undertaking, and probably caused by the need to exchange proposals in writing, not only a language standard was produced, but also a notation for describing programming languages was proposed by Naur and used to describe the language in the famous Algol-60 report. Ever since it was introduced, this notation, which soon came to be known as the Backus-Naur formalism (BNF), has been used as the primary tool for describing the basic structure of programming languages.
It did not take long before computer scientists, and especially people writing compilers, discovered that the formalism was not only useful to express what language should be accepted by their compilers, but could also be used as a guideline for structuring their compilers. Once this relationship between a piece of BNF and a compiler became well understood, programs emerged which take such a piece of language description as input, and produce a skeleton of the desired compiler. Such programs are now known under the name parser generators.
Besides these very mundane goals, i.e., the construction of compilers, the BNF formalism soon also became a subject of study for the more theoretically oriented. It appeared that the BNF formalism actually was a member of a hierarchy of grammar classes which had been formulated a number of years before by the linguist Noam Chomsky in an attempt to capture the concept of a language. Questions arose about the expressibility of BNF, i.e., which classes of languages can be expressed by means of BNF and which cannot, and consequently how to express restrictions and properties of languages for which the BNF formalism is not powerful enough. In the lectures we will see many examples of this.
1.2 Grammar analysis of context-free grammars
Nowadays the use of the word Backus-Naur is gradually diminishing, and, inspired by the Chomsky hierarchy, we most often speak of context-free grammars. For the construction of everyday compilers for everyday languages it appears that this class is still a bit too large. If we use the full power of the context-free languages we get compilers which in general are inefficient, and probably not so good at handling erroneous input. This latter fact may not be so important from a theoretical point of view, but it is from a pragmatic point of view. Most invocations of compilers still have as their primary goal to discover mistakes made when typing the program, and not so much generating actual code. This aspect is even more strongly present in strongly typed languages, such as Java and Hugs, where the type checking performed by the compilers is one of the main contributions to the increase in efficiency in the programming process.
When constructing a recogniser for a language described by a context-free grammar one often wants to check whether or not the grammar has specific desirable properties. Unfortunately, for a human being it is not always easy, and quite often practically impossible, to see whether or not a particular property holds. Furthermore, it may be very expensive to check whether or not such a property holds. This has led to a whole hierarchy of classes of context-free grammars, some of which are more powerful, some of which are easy to check by machine, and some of which are easily checked by a simple human inspection. In this course we will see many examples of such classes. The general observation is that the more precise the answer to a specific question one wants to have, the more computational effort is needed, and the sooner this question cannot be answered by a human being anymore.
1.3 Compositionality
As we will see, the structure of many compilers follows directly from the grammar that describes the language to be compiled. Once this phenomenon was recognised it went under the name syntax directed compilation. Under closer scrutiny, and under the influence of the more functionally oriented style of programming, it was recognised that compilers are actually a special form of homomorphisms, a concept thus far only familiar to mathematicians and more theoretically oriented computer scientists who study the description of the meaning of a programming language. This should not come as a surprise, since this recognition is a direct consequence of the tendency that ever greater parts of compilers are more or less automatically generated from a formal description of some aspect of a programming language; e.g. by making use of a description of their outer appearance or by making use of a description of the semantics (meaning) of a language. We will see many examples of such mappings. As a side effect you will acquire a special form of writing functional programs, which often makes it surprisingly simple to solve at first sight rather complicated programming assignments. We will see that the concept of lazy evaluation plays an important role in making these efficient and straightforward implementations possible.
1.4 Abstraction mechanisms
One of the main reasons why what used to be an endeavour for a large team in the past can now easily be done by a couple of first-year students in a matter of days or weeks, is that over the last thirty years we have discovered the right kind of abstractions to be used, and an efficient way of partitioning a problem into smaller components. Unfortunately there is no simple way to teach the techniques which have led us thus far. The only way we see is to take a historian's view and to compare the old and the new situations.
Fortunately, however, there have also been some developments in programming language design, of which we want to mention in particular the developments in the area of functional programming. We claim that the combination of a modern, albeit quite elaborate, type system with the concept of lazy evaluation provides an ideal platform to develop and practice one's abstraction skills. There does not exist another readily executable formalism which may serve as an equally powerful tool. We hope that by presenting many algorithms, and fragments thereof, in a modern functional language, we can show the real power of abstraction, and even find some inspiration for further developments in language design: i.e. find clues about how to extend such languages to enable us to make explicit common patterns which thus far have only been demonstrated by giving examples.
Chapter 2
Context-Free Grammars
introduction
We often want to recognise a particular structure hidden in a sequence of symbols. For example, when reading this sentence, you automatically structure it by means of your understanding of the English language. Of course, not any sequence of symbols is an English sentence. So how do we characterise English sentences? This is an old question, which was posed long before computers were widely used; in the area of natural language research the question has often been posed what actually constitutes a language. The simplest definition one can come up with is to say that the English language equals the set of all grammatically correct English sentences, and that a sentence consists of a sequence of English words. This terminology has been carried over to computer science: the programming language Java can be seen as the set of all correct Java programs, whereas a Java program can be seen as a sequence of Java symbols, such as identifiers, reserved words, specific operators, etc.
This chapter introduces the most important notions of this course: the concepts of a language and a grammar. A language is a, possibly infinite, set of sentences, and sentences are sequences of symbols taken from a finite set (e.g. sequences of characters, which are referred to as strings). Just as whether or not a sentence belongs to the English language is determined by the English grammar (remember that before we used the phrase 'grammatically correct'), we have a grammatical formalism for describing artificial languages.
A difference with the grammars for natural languages is that this grammatical formalism is a completely formal one. This property may enable us to mathematically prove that a sentence belongs to some language, and often such proofs can be constructed automatically by a computer in a process called parsing. Notice that this is quite different from the grammars for natural languages, where one may easily disagree about whether something is correct English or not. This completely formal approach, however, also comes with a disadvantage: the expressiveness of the class of grammars we are going to describe in this chapter is rather limited, and there are many languages one might want to describe but which cannot be described, given the limitations of the formalism.
goals
The main goal of this chapter is to introduce and show the relation between the main concepts for describing the parsing problem: languages and sentences, and grammars.
In particular, after you have studied this chapter you will:

- know the concepts of language and sentence;
- know how to describe languages by means of context-free grammars;
- know the difference between a terminal symbol and a nonterminal symbol;
- be able to read and interpret the BNF notation;
- understand the derivation process used in describing languages;
- understand the role of parse trees;
- understand the relation between context-free grammars and datatypes;
- understand the EBNF formalism;
- understand the concepts of concrete and abstract syntax;
- be able to convert a grammar from EBNF notation into BNF notation by hand;
- be able to construct a simple context-free grammar in EBNF notation;
- be able to verify whether or not a simple grammar is ambiguous;
- be able to transform a grammar, for example for removing left recursion.
2.1 Languages
The goal of this section is to introduce the concepts of language and sentence.
In conventional texts about mathematics it is not uncommon to encounter a definition of sequences that looks as follows:

Definition 1: Sequence
Let X be a set. The set of sequences over X, called X*, is defined as follows:

- ε is a sequence, called the empty sequence, and
- if xs is a sequence and x is an element of X, then x:xs is also a sequence, and
- nothing else is a sequence over X.
□

There are two important things to note about this definition of the set X*.
1. It is an instance of a very common definition pattern: it is defined by induction, i.e. the definition of the concept refers to the concept itself.

2. It corresponds almost exactly to the definition of the type [x] of lists of elements of a type x in Haskell, except for the final clause 'nothing else is a sequence over X' (in Haskell you can define infinite lists; sequences are always finite). Since this pattern of definition is so common when defining a recursive datatype, the last part of the definition is always implicitly understood: if it cannot be generated, it does not exist.

We will use Haskell datatypes and functions for the implementation of X* in the sequel. Functions on X* such as reverse, length and concatenation (++ in Haskell) are inductively defined, and these definitions often follow a recursion pattern which is similar to the definition of X* itself. (Recall the foldrs that you have used in the course on Functional Programming.)
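To connect this definition with Haskell right away, here is a minimal sketch (the names Sequence, len and cat are ours, not from the notes) of Definition 1 as a datatype, together with two inductively defined functions on it:

-- Definition 1 as a datatype; the clause "nothing else is a sequence"
-- is implicit in the data declaration.
data Sequence x = Empty | Cons x (Sequence x)

-- the length of a sequence, following the recursion pattern of X*
len :: Sequence x -> Int
len Empty       = 0
len (Cons _ xs) = 1 + len xs

-- concatenation, the analogue of (++) on built-in lists
cat :: Sequence x -> Sequence x -> Sequence x
cat Empty       ys = ys
cat (Cons x xs) ys = Cons x (cat xs ys)

Both functions follow the recursion pattern of the definition of X* itself; on built-in lists this is exactly the pattern captured by foldr.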
One final remark on sequences is on notation. In Haskell one writes

[x1, x2, ..., xn]  for  x1 : x2 : ... : xn : []
"abba"             for  ['a', 'b', 'b', 'a']

When discussing languages and grammars traditionally one uses

abba  for  a : b : b : a : []
xy    for  x ++ y

So letters from the beginning of the alphabet represent single symbols, and letters from the end of the alphabet represent sequences of symbols. Note that the distinction between single elements (like a) and sequences (like aa) is not explicit in this notation; it is traditional, however, to let characters from the beginning of the alphabet stand for single symbols (a, b, c, ...) and characters from the end of the alphabet stand for sequences of symbols (x, y, z). As a consequence, ax should be interpreted as a sequence which starts with a single symbol called a and has a tail called x.
Now we move from individual sequences to finite or infinite sets of sequences. We start with some terminology:

Definition 2: Alphabet, Language, Sentence

- An alphabet is a finite set of symbols.
- A language is a subset of T*, for some alphabet T.
- A sentence is an element of a language.
□

Some examples of alphabets are:

- the conventional Roman alphabet: {a, b, c, ..., z};
- the binary alphabet {0, 1};
- sets of reserved words, such as {if, then, else};
- the set of characters l = {a, b, c, d, e, i, k, l, m, n, o, p, r, s, t, u, w, x};
- the set of English words {course, practical, exercise, exam}.

Examples of languages are:

- T*, ∅ (the empty set), {ε} and T are languages over alphabet T;
- the set {course, practical, exercise, exam} is a language over the alphabet l of characters, and exam is a sentence in it.
The question that now arises is how to specify a language. Since a language is a set we immediately see three different approaches:

- enumerate all the elements of the set explicitly;
- characterise the elements of the set by means of a predicate;
- define which elements belong to the set by means of induction.

We have just seen some examples of the first (the Roman alphabet) and third (the set of sequences over an alphabet) approach. Examples of the second approach are:

- the even natural numbers { n | n ∈ {0, 1, ..., 9}*, n mod 2 = 0 };
- PAL, the palindromes, sequences which read the same forward as backward, over the alphabet {a, b, c}: { s | s ∈ {a, b, c}*, s = s^R }, where s^R denotes the reverse of sequence s.
One of the fundamental differences between the predicative and the inductive approach to defining a language is that the latter approach is constructive, i.e., it provides us with a way to enumerate all elements of a language. If we define a language by means of a predicate we only have a means to decide whether or not an element belongs to a language. A famous example of a language which is easily defined in a predicative way, but for which the membership test is very hard, is the set of prime numbers.
Quite often we want to prove that a language L, which is defined by means of an inductive definition, has a specific property P. If this property is of the form P(L) = ∀x ∈ L : P(x), then we want to prove that L ⊆ P.
Since languages are sets, the usual set operators such as union, intersection and difference can be used to construct new languages from existing ones. The complement of a language L over alphabet T is defined by L̄ = { x | x ∈ T*, x ∉ L }.
In addition to these set operators, there are more specific operators, which apply only to sets of sequences. We will use these operators mainly in the chapter on regular languages, Chapter 5. Note that ∪ denotes set union, so {1, 2} ∪ {1, 3} = {1, 2, 3}.
Definition 3: Language operations
Let L and M be languages over the same alphabet T. Then

L̄    = T* − L                       complement of L
L^R  = { s^R | s ∈ L }              reverse of L
LM   = { st | s ∈ L, t ∈ M }        concatenation of L and M
L^0  = { ε }                        0th power of L
L^n  = L L ... L (n times)          nth power of L
L*   = L^0 ∪ L^1 ∪ L^2 ∪ ...        star closure of L
L^+  = L^1 ∪ L^2 ∪ ...              positive closure of L
□

The following equations follow immediately from the above definitions.

L*  = { ε } ∪ L L*
L^+ = L L*
Exercise 2.1  Let L = {ab, aa, baa}, where a and b are the terminals. Which of the following strings are in L*: abaabaaabaa, aaaabaaaa, baaaaabaaaab, baaaaabaa?

Exercise 2.2  What are the elements of ∅*?

Exercise 2.3  For any language L, prove
1. ∅ L = L ∅ = ∅
2. {ε} L = L {ε} = L

Exercise 2.4  Can you motivate our choice for L^0 = {ε}?
Hint: Establish an inductive definition for the powers of a language.

Exercise 2.5  In this section we defined two star operators: one for arbitrary sets (Definition 1) and one for languages (Definition 3). Is there a difference between these operators?
2.2 Grammars

The goal of this section is to introduce the concept of context-free grammars.
Working with sets might be fun, but it is complicated to manipulate sets, and to prove properties of sets. For these purposes we introduce syntactical definitions of sets, called grammars. This section will only discuss so-called context-free grammars, a kind of grammar that is convenient for automatic processing, and that can describe a large class of languages. But the class of languages that can be described by context-free grammars is limited.
In the previous section we defined PAL, the language of palindromes, by means of a predicate. Although this definition defines the language we want, it is hard to use in proofs and programs. An important observation is the fact that the set of palindromes can be defined inductively as follows.
Definition 4: Palindromes by induction

- The empty string, ε, is a palindrome;
- the strings consisting of just one character, a, b, and c, are palindromes;
- if P is a palindrome, then the strings obtained by prepending and appending the same character, a, b, or c, to it are also palindromes, that is, the strings aPa, bPb, and cPc are palindromes.
□

The first two parts of the definition cover the basic cases. The last part of the definition covers the inductive cases. All strings which belong to the language PAL inductively defined using the above definition read the same forwards and backwards. Therefore this definition is said to be sound (every string in PAL is a palindrome). Conversely, if a string consisting of a's, b's, and c's reads the same forwards and backwards, then it belongs to the language PAL. Therefore this definition is said to be complete (every palindrome is in PAL).
Finding an inductive definition for a language which is described by a predicate (like the one for palindromes) is often a nontrivial task. Very often it is relatively easy to find a definition that is sound, but you also have to convince yourself that the definition is complete. A typical method for proving soundness and completeness of an inductive definition is mathematical induction.
Now that we have an inductive definition for palindromes, we can proceed by giving a formal representation of this inductive definition.
Inductive definitions like the one above can be represented formally by making use of deduction rules which look like

a1, a2, ..., an ⊢ a    or    ⊢ a

The first kind of deduction rule has to be read as follows: if a1, a2, ..., and an are true, then a is true. The second kind of deduction rule, called an axiom, has to be read as follows: a is true.
Using these deduction rules we can now write down the inductive definition for PAL as follows:

⊢ ε ∈ PAL
⊢ a ∈ PAL
⊢ b ∈ PAL
⊢ c ∈ PAL
P ∈ PAL ⊢ aPa ∈ PAL
P ∈ PAL ⊢ bPb ∈ PAL
P ∈ PAL ⊢ cPc ∈ PAL
Although the definition of PAL is completely formal, it is still laborious to write. Since in computer science we use many definitions which follow this pattern, we introduce a shorthand for it, called a grammar. A grammar consists of production rules for constructing palindromes. The rule with which the empty string is constructed is:

P → ε

This rule corresponds to the axiom that states that the empty string ε is a palindrome. A rule of the form s → α, where s is a symbol and α is a sequence of symbols, is called a production rule, or production for short. A production rule can be considered as a possible way to rewrite the symbol s. The symbol P to the left of the arrow is a symbol which denotes palindromes. Such a symbol is an example of a nonterminal symbol, or nonterminal for short. Nonterminal symbols are also called auxiliary symbols: their only purpose is to denote structure; they are not part of the alphabet of the language. Three other basic production rules are the rules for constructing palindromes consisting of just one character. Each of the one-character strings a, b, and c is a palindrome, and gives rise to a production:

P → a
P → b
P → c

These production rules correspond to the axioms that state that the one-character strings a, b, and c are palindromes. If a string α is a palindrome, then we obtain a new palindrome by prepending and appending an a, b, or c to it, that is, aαa, bαb, and cαc are also palindromes. To obtain these palindromes we use the following recursive productions:

P → aPa
P → bPb
P → cPc
These production rules correspond to the deduction rules that state that, if P is a palindrome, then one can deduce that aPa, bPb and cPc are also palindromes. The grammar we have presented so far consists of three components:

- the set of terminals {a, b, c};
- the set of nonterminals {P};
- and the set of productions (the seven productions that we have introduced so far).

Note that the intersection of the set of terminals and the set of nonterminals is empty. We complete the description of the grammar by adding a fourth component: the nonterminal start-symbol P. In this case we have only one choice for a start-symbol, but a grammar may have many nonterminal symbols.
This leads to the following grammar for PAL:

P → ε
P → a
P → b
P → c
P → aPa
P → bPb
P → cPc

The definition of the set of terminals, {a, b, c}, and the set of nonterminals, {P}, is often implicit. Also the start-symbol is implicitly defined, since there is only one nonterminal.
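The inductive definition of palindromes also transcribes directly into a Haskell predicate. The following is a minimal sketch (the name pal is ours); each clause mirrors one group of productions of the grammar:

pal :: String -> Bool
pal []  = True                          -- P -> epsilon
pal [c] = c `elem` "abc"                -- P -> a | b | c
pal s   = head s == last s              -- P -> aPa | bPb | cPc
       && head s `elem` "abc"
       && pal (init (tail s))

For example, pal "bacab" evaluates to True, peeling off matching outer characters just as the recursive productions do.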
We conclude this example with the formal definition of a context-free grammar.

Definition 5: Context-Free Grammar
A context-free grammar G is a four-tuple (T, N, R, S) where

- T is a finite set of terminal symbols;
- N is a finite set of nonterminal symbols (T and N are disjoint);
- R is a finite set of production rules. Each production has the form A → α, where A is a nonterminal and α is a sequence of terminals and nonterminals;
- S is the start-symbol, S ∈ N.
□

The adjective 'context-free' in the above definition comes from the specific production rules that are considered: exactly one nonterminal on the left hand side. Not every language can be described via a context-free grammar. The standard example here is { a^n b^n c^n | n ∈ ℕ }. We will encounter this example again later in these lecture notes.
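Chapter 10 (Section 10.2.1) develops a proper Haskell representation of context-free grammars; as a first approximation, here is a minimal sketch of Definition 5, in which we assume that nonterminals are upper-case characters and terminals are lower-case characters (this encoding is our simplification, not part of the definition):

type Production = (Char, String)    -- A -> alpha

data Grammar = Grammar
  { terminals    :: [Char]
  , nonterminals :: [Char]
  , productions  :: [Production]
  , start        :: Char
  }

-- the grammar for PAL as a value of this type
palGrammar :: Grammar
palGrammar = Grammar "abc" "P"
  [ ('P', ""),    ('P', "a"),   ('P', "b"),  ('P', "c")
  , ('P', "aPa"), ('P', "bPb"), ('P', "cPc") ]
  'P'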
2.2.1 Notational conventions

In the definition of the grammar for PAL we have written every production on a single line. Since this takes up a lot of space, and since the production rules form the heart of every grammar, we introduce the following shorthand. Instead of writing

S → α
S → β

we combine the two productions for S on one line, using the symbol |:

S → α | β

We may rewrite any number of rewrite rules for one nonterminal in this fashion, so the grammar for PAL may also be written as follows:

P → ε | a | b | c | aPa | bPb | cPc

The notation we use for grammars is known as BNF (Backus Naur Form), after Backus and Naur, who first used this notation for defining grammars.
Another notational convention concerns names of productions. Sometimes we want to give names to production rules. The names will be written in front of the production. So, for example,

Alpha rule: S → α
Beta rule:  S → β

Finally, if we give a context-free grammar just by means of its productions, the start-symbol is usually the nonterminal on the left hand side of the first production, and the start-symbol is usually called S.
Exercise 2.6  Give a context-free grammar for the set of sentences over alphabet X where
1. X = {a}
2. X = {a, b}

Exercise 2.7  Give a grammar for palindromes over the alphabet {a, b}.

Exercise 2.8  Give a grammar for the language

L = { s s^R | s ∈ {a, b}* }

This language is known as the mirror-palindromes language.

Exercise 2.9  A parity-sequence is a sequence consisting of 0s and 1s that has an even number of ones. Give a grammar for parity-sequences.

Exercise 2.10  Give a grammar for the language

L = { w | w ∈ {a, b}*, nr(a, w) = nr(b, w) }

where nr(c, w) is the number of c-occurrences in w.
2.3 The language of a grammar

The goal of this section is to describe the relation between grammars and languages: to show how to derive sentences of a language given its grammar.
We have seen how to obtain a grammar from a given language. Now we consider the reverse question: how to obtain a language from a given grammar? Before we can answer this question we first have to say what we can do with a grammar. The answer is simple: we can derive sequences with it.
How do we construct a palindrome? A palindrome is a sequence of terminals, in our case the characters a, b and c, that can be derived in zero or more direct derivation steps from the start-symbol P using the productions of the grammar for palindromes given above.
For example, the sequence bacab can be derived using the grammar for palindromes as follows:

P ⇒ bPb ⇒ baPab ⇒ bacab

Such a construction is called a derivation. In the first step of this derivation, production P → bPb is used to rewrite P into bPb. In the second step, production P → aPa is used to rewrite bPb into baPab. Finally, in the last step, production P → c is used to rewrite baPab into bacab. Constructing a derivation can be seen as a constructive proof that the string bacab is a palindrome.
We will now describe derivation steps more formally.

Definition 6: Derivation
Suppose X → β is a production of a grammar, where X is a nonterminal symbol and β is a sequence of (nonterminal or terminal) symbols. Let αXγ be a sequence of (nonterminal or terminal) symbols. We say that αXγ directly derives the sequence αβγ, which is obtained by replacing the left hand side X of the production by the corresponding right hand side β. We write αXγ ⇒ αβγ, and we also say that αXγ rewrites to αβγ in one step. A sequence φn is derived from a sequence φ1, written φ1 ⇒* φn, if there exist sequences φ1, ..., φn with n ≥ 1 such that

∀i, 1 ≤ i < n : φi ⇒ φi+1

If n = 1 this statement is trivially true, and it follows that we can derive each sentence from itself in zero steps:

α ⇒* α

A partial derivation is a derivation of a sequence that still contains nonterminals.
□
Finding a derivation φ1 ⇒* φn is, in general, a nontrivial task. A derivation is only one branch of a whole search tree which contains many more branches. Each branch represents a (successful or unsuccessful) direction in which a possible derivation may proceed. Another important challenge is to arrange things in such a way that finding a derivation can be done in an efficient way.
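To make this search space concrete, here is a sketch (the helpers are hypothetical, reusing the Production representation and palGrammar from the sketch after Definition 5) that enumerates sentential forms level by level, rewriting the leftmost nonterminal in every possible way:

import Data.Char (isUpper)

type Production = (Char, String)   -- as in the earlier sketch

-- one step: rewrite the leftmost nonterminal (an upper-case character)
-- in every possible way; a terminal string yields no successors
step :: [Production] -> String -> [String]
step ps s = case break isUpper s of
  (_, [])         -> []
  (pre, n : post) -> [ pre ++ rhs ++ post | (lhs, rhs) <- ps, lhs == n ]

-- all sentential forms, level by level: level i holds the forms
-- derivable from s in exactly i steps
derivable :: [Production] -> String -> [String]
derivable ps s =
  concat (takeWhile (not . null) (iterate (concatMap (step ps)) [s]))

-- the sentences are the derivable forms without nonterminals
sentences :: [Production] -> String -> [String]
sentences ps s = filter (all (not . isUpper)) (derivable ps s)

With the palindrome productions, take 10 (sentences (productions palGrammar) "P") lazily produces the first few palindromes.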
From the example derivation above it follows that

P ⇒* bacab

Because this derivation begins with the start-symbol of the grammar and results in a sequence consisting of terminals only (a terminal string), we say that the string bacab belongs to the language generated by the grammar for palindromes. In general, we define:

Definition 7: Language of a grammar (or language generated by a grammar)
The language of a grammar G = (T, N, R, S), usually denoted by L(G), is defined as

L(G) = { s | S ⇒* s, s ∈ T* }
□

We sometimes also talk about the language of a nonterminal A, which is defined by

L(A) = { s | A ⇒* s, s ∈ T* }

The language of a grammar could have been defined as the language of its start-symbol.
Note that different grammars may have the same language. For example, if we extend the grammar for PAL with the production P → bacab, we obtain a grammar with exactly the same language as PAL. Two grammars that generate the same language are called equivalent. So for a particular grammar there exists a unique language, but the reverse is not true: given a language we can usually construct many grammars that generate the language. In mathematical language: the mapping between a grammar and its language is not a bijection.

Definition 8: Context-free language
A context-free language is a language that is generated by a context-free grammar.
□

All palindromes can be derived from the start-symbol P. Thus, the language of our grammar for palindromes is PAL, the set of all palindromes over the alphabet {a, b, c}, and PAL is context-free.
2.3.1 Some basic languages

Digits occur in a lot of programming and other languages, and so do letters. In this subsection we define grammars that specify some basic languages such as digits and letters. These grammars will be used often in later sections.

- The language of single digits is specified by a grammar with 10 production rules for the nonterminal Dig:

  Dig → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

- We obtain sequences of digits by means of the following grammar:

  Digs → ε | Dig Digs

- Natural numbers are sequences of digits that start with a nonzero digit. So in order to specify natural numbers, we first define the language of nonzero digits:

  Dig-0 → 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

  Now we can define natural numbers as follows:

  Nat → 0 | Dig-0 Digs

- Integers are natural numbers preceded by a sign. If a natural number is not preceded by a sign, it is supposed to be a positive number.

  Sign → + | -
  Z → Sign Nat | Nat

- The languages of lowercase letters and capital letters are each specified by a grammar with 26 productions:

  ULetter → a | b | ... | z
  CLetter → A | B | ... | Z

  In the real definitions of these grammars we have to write out each of the 26 letters, of course. A letter is now either a lowercase or a capital letter:

  Letter → ULetter | CLetter

- Variable names, function names, data types, etc., are all represented by identifiers in programming languages. The following grammar for identifiers might be used in a programming language:

  Identifier → Letter SoS
  SoS → ε | Letter SoS | Dig SoS

  An identifier starts with a letter, and is followed by a sequence of letters and digits. We might want to allow more symbols, such as for example underscores and dollar signs, but then we have to adjust the grammar, of course.

- Dutch zip codes consist of four digits, of which the first digit is nonzero, followed by two capitals. So:

  ZipCode → Dig-0 Dig Dig Dig CLetter CLetter
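A minimal sketch (the function names are ours) of how some of these grammars can be read off as Haskell predicates:

import Data.Char (isAlpha, isDigit, isUpper)

isNat :: String -> Bool            -- Nat -> 0 | Dig-0 Digs
isNat "0"      = True
isNat (d : ds) = d `elem` ['1' .. '9'] && all isDigit ds
isNat _        = False

isIdentifier :: String -> Bool     -- Identifier -> Letter SoS
isIdentifier (c : cs) = isAlpha c && all (\x -> isAlpha x || isDigit x) cs
isIdentifier _        = False

isZipCode :: String -> Bool        -- ZipCode -> Dig-0 Dig Dig Dig CLetter CLetter
isZipCode [d1, d2, d3, d4, c1, c2] =
  d1 `elem` ['1' .. '9'] && all isDigit [d2, d3, d4]
  && isUpper c1 && isUpper c2
isZipCode _ = False

(Note that isAlpha is slightly more liberal than the 26-letter grammars above, since it also accepts accented letters.)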
Exercise 2.11  A terminal string is always derived in one or more steps from the start-symbol. Why?

Exercise 2.12  What language is generated by the grammar with the single production rule

S → ε

Exercise 2.13  What language does the grammar with the following productions generate?

S → Aa
A → B
B → Aa

Exercise 2.14  Give a simple description of the language generated by the grammar with productions

S → aA
A → bS
S → ε

Exercise 2.15  Is the language L defined in Exercise 2.1 context-free?
2.4 Parse trees

The goal of this section is to introduce parse trees, and to show how parse trees relate to derivations. Furthermore, this section defines (non)ambiguous grammars.
For any partial derivation, i.e. a derivation that contains nonterminals in its right hand side, there may be several productions of the grammar that can be used to proceed the partial derivation with and, as a consequence, there may be different derivations for the same sentence. There are two reasons why derivations for a specific sentence differ:

- Only the order in which the derivation steps are chosen differs. All such derivations are considered to be equivalent.
- Different derivation steps have been chosen. Such derivations are considered to be different.

Here is a simple example. Consider the grammar SequenceOfS with productions:

S → SS
S → s

Using this grammar we can derive the sentence sss as follows:

S ⇒ SS ⇒ SSS ⇒ SsS ⇒ ssS ⇒ sss

S ⇒ SS ⇒ sS ⇒ sSS ⇒ sSs ⇒ sss
These derivations are the same up to the order in which derivation steps are taken. However, the following derivation does not use the same derivation steps:

S ⇒ SS ⇒ SSS ⇒ sSS ⇒ ssS ⇒ sss

In this derivation, the first S is rewritten to SS instead of s.
The set of all equivalent derivations can be represented by selecting a so-called canonical element. A good candidate for such a canonical element is the leftmost derivation. In a leftmost derivation, only the leftmost nonterminal is rewritten. If there exists a derivation of a sentence x using the productions of a grammar, then there exists a leftmost derivation of x. The leftmost derivation corresponding to the two equivalent derivations above is

S ⇒ SS ⇒ sS ⇒ sSS ⇒ ssS ⇒ sss

There exists another convenient way for representing equivalent derivations: they all have the same parse tree (or derivation tree). A parse tree is a representation of a derivation which abstracts from the order in which derivation steps are chosen. The internal nodes of a parse tree are labelled with a nonterminal N, and the children of such a node are the parse trees for the symbols of the right hand side of a production for N. The parse tree of a terminal symbol is a leaf labelled with the terminal symbol. The resulting parse tree of the first two derivations of the sentence sss looks as follows:
      S
     / \
    S   S
    |  / \
    s S   S
      |   |
      s   s
The third derivation of the sentence sss results in a different parse tree:
        S
       / \
      S   S
     / \  |
    S   S s
    |   |
    s   s
As another example, all derivations of the string abba using the productions of the grammar

P → ε
P → APA
P → BPB
A → a
B → b
are represented by the following derivation tree:
         P
      /  |  \
     A   P   A
     |  /|\  |
     a B P B a
       | | |
       b ε b
A derivation tree can be seen as a structural interpretation of the derived sentence. Note that there might be more than one structural interpretation of a sentence with respect to a given grammar. Such grammars are called ambiguous.

Definition 9: ambiguous, unambiguous grammar
A grammar is unambiguous if every sentence has a unique leftmost derivation or, equivalently, if every sentence has a unique derivation tree. Otherwise it is called ambiguous.
□

The grammar SequenceOfS for constructing sequences of s's is an example of an ambiguous grammar, since there exist two parse trees for the string sss.
It is in general undecidable whether or not an arbitrary context-free grammar is ambiguous. This implies that it is impossible to write a program that determines the (non)ambiguity of an arbitrary context-free grammar.
It is usually rather difficult to translate languages with ambiguous grammars. Therefore, you will find that most grammars of programming languages and other languages that are used in processing information are unambiguous.
Grammars have proved very successful in the specification of artificial languages (such as programming languages). They have proved less successful in the specification of natural languages (such as English), partly because it is extremely difficult to construct an unambiguous grammar that specifies a nontrivial part of the language. Take for example the sentence 'They are flying planes'. This sentence can be read in two ways, with different meanings: 'They - are - flying planes', and 'They - are flying - planes'. While the ambiguity of natural languages may perhaps be considered an advantage for their users (e.g. politicians), it certainly is considered a disadvantage for language translators, because it is usually impossible to maintain an ambiguous meaning in a translation.
2.4.1 From context-free grammars to datatypes

For each context-free grammar we can define a corresponding datatype in Haskell. Values of these datatypes represent parse trees of the context-free grammar. As an example we take the grammar used in the beginning of this section:

S → SS
S → s

First, we give each of the productions of this grammar a name:

Beside: S → SS
Single: S → s

And now we interpret the start-symbol of the grammar S as a datatype, using the names of the productions as constructors:

data S = Beside S S
       | Single Char

Note that this datatype is too general: the type Char should really be the single character 's'. This datatype can be used to represent parse trees of sentences of S. For example, the parse tree that corresponds to the first two derivations of the sequence sss is represented by the following value of the datatype S:

Beside (Single 's') (Beside (Single 's') (Single 's'))

The third derivation of the sentence sss produces the following parse tree:

Beside (Beside (Single 's') (Single 's')) (Single 's')
Because the datatype S is too general, we will reconsider the construction of datatypes
from context-free grammars in Section 2.6.
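Parse trees are ordinary Haskell values, so we can already compute with them. For instance, a hypothetical helper that counts the occurrences of s in the represented sentence (an idea we return to at the end of Section 2.6):

size :: S -> Int
size (Beside l r) = size l + size r
size (Single _)   = 1

Both parse trees above have size 3, as they should: they are trees for the same sentence sss.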
Exercise 2.16  Consider the grammar for palindromes that you have constructed in Exercise 2.7. Give parse trees for the palindromes cPal1 = "abaaba" and cPal2 = "baaab". Define a datatype Palc corresponding to the grammar and represent the parse trees for cPal1 and cPal2 as values of Palc.
2.5 Grammar transformations

The goal of this section is to discuss properties of grammars and grammar transformations, and to show how grammar transformations can be used to obtain grammars that satisfy particular properties.
Grammars satisfy properties. Examples of properties are:

- a grammar may be unambiguous, that is, every sentence of its language has a unique parse tree;
- a grammar may have the property that only the start-symbol can derive the empty string; no other nonterminal can derive the empty string;
- a grammar may have the property that every production either has a single terminal, or two nonterminals in its right hand side. Such a grammar is said to be in Chomsky normal form.

So why are we interested in such properties? Some of these properties imply that it is possible to build parse trees for sentences of the language of the grammar in only one way. Some other properties imply that we can build these parse trees very fast. Other properties are used to prove facts about grammars. Yet other properties are used to efficiently compute some other information from parse trees of a grammar.
For example, suppose we have a program p that builds parse trees for sentences of grammars in Chomsky normal form, and that we can prove that each grammar can be transformed into a grammar in Chomsky normal form. (When we say that a grammar G can be transformed into another grammar G', we mean that there exists some procedure to obtain G' from G, and that G and G' generate the same language.) Then we can use this program p for building parse trees for any grammar.
Since it is sometimes convenient to have a grammar that satisfies a particular property for a language, we would like to be able to transform grammars into other grammars that generate the same language, but that possibly satisfy different properties. This section describes a number of grammar transformations:

- Removing duplicate productions.
- Substituting right hand sides for nonterminals.
- Left factoring.
- Removing left recursion.
- Associative separator.
- Introduction of priorities.

There are many more transformations than we describe here; we will only show a small but useful set of grammar transformations. In the following transformations we will assume that u, v, w, x, y, and z denote sequences of terminals and nonterminals, i.e., are elements of (N ∪ T)*.
Removing duplicate productions

This grammar transformation can be applied to any grammar of the correct form. If a grammar contains two occurrences of the same production rule, one of these occurrences can be removed. For example,

A → u | u | v

can be transformed into

A → u | v

Substituting right hand sides for nonterminals

If a nonterminal N occurs in a right hand side of a production, the production may be replaced by just as many productions as there exist productions for N, in which N has been replaced by its right hand sides. For example,

A → uBv | z
B → x | w

may be transformed into

A → uxv | uwv | z
B → x | w
Left factoring

Left factoring a grammar is a grammar transformation that is useful when two productions for the same nonterminal start with the same sequence of (terminal and/or nonterminal) symbols. These two productions can then be replaced by a single production that ends with a new nonterminal, replacing the part of the sequence after the common start sequence. Two productions for the new nonterminal are added: one for each of the two different end sequences of the two productions. For example,

A → xy | xz | v

where x ∈ (N ∪ T)*, and x ≠ ε, may be transformed into

A → xZ | v
Z → y | z

where Z is a new nonterminal.
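As a concrete instance of this scheme, consider the productions A → ab | ac | v: the first two alternatives share the prefix x = a, so left factoring yields A → aZ | v with Z → b | c.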
Removing left recursion

A left recursive production is a production in which the right hand side starts with the nonterminal of the left hand side. For example, the production

A → Az

is left recursive. A grammar is left recursive if we can derive A ⇒+ Az for some nonterminal A of the grammar. Left recursive grammars are sometimes undesirable. The following transformation removes left recursive productions.
To remove the left recursive productions of a nonterminal A, divide the productions for A into sets of left recursive and non left recursive productions. Factorise A as follows:

A → A x1 | A x2 | ... | A xn
A → y1 | y2 | ... | ym

with xi, yj ∈ (N ∪ T)*, head yj ≠ A (where head returns the first element of a list of elements), and 1 ≤ i ≤ n, 1 ≤ j ≤ m. Add a new nonterminal Z, and replace A's productions by:

A → y1 | y1 Z | ... | ym | ym Z
Z → x1 | x1 Z | ... | xn | xn Z

This procedure only works for a grammar that is directly left recursive, i.e., a grammar that contains a production of the form A → Ax. Removing left recursion in general left recursive grammars, which for example contain productions like A → Bx, B → Ay, is a bit more complicated; see [1].
For example, the grammar SequenceOfS (see Section 2.4), with the productions

S → SS
S → s

is a left recursive grammar. The above procedure for removing left recursion gives the following productions:

S → s | sZ
Z → S | SZ
Associative separator

The following grammar generates a list of declarations, separated by a semicolon ';':

Decls → Decls ; Decls
Decls → Decl

where the productions for Decl, which generates a single declaration, are omitted. This grammar is ambiguous, for the same reason as SequenceOfS is ambiguous. The operator ';' is an associative separator in the generated language, that is: d1 ; (d2 ; d3) = (d1 ; d2) ; d3, where d1, d2, and d3 are declarations. Therefore, we may use the following unambiguous grammar for generating a language of declarations:

Decls → Decl ; Decls
Decls → Decl

An alternative grammar for the same language is

Decls → Decls ; Decl
Decls → Decl

This grammar transformation can only be applied because the semicolon is associative in the generated language; it is not a grammar transformation that can be applied blindly to any grammar.
The same transformation can be applied to grammars with productions of the form

A → A a A

where a is an associative operator in the generated language. As an example you may think of natural numbers with addition.
Introduction of priorities

Another form of ambiguity often arises in the part of a grammar for a programming language which describes expressions. For example, the following grammar generates arithmetic expressions:

E → E + E
E → E * E
E → ( E )
E → Digs

where Digs generates a list of digits; see Section 2.3.1. This grammar is ambiguous: the sentence 2+4*6 has two parse trees: one corresponding to (2+4)*6, and one corresponding to 2+(4*6). If we make the usual assumption that * has higher priority than +, the latter expression is the intended reading of the sentence 2+4*6. In order to obtain parse trees that respect these priorities, we transform the grammar as follows:

E → T
E → E + T
T → F
T → T * F
F → ( E )
F → Digs

This grammar generates the same language as the previous grammar for expressions, but it respects the priorities of the operators.
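Following Section 2.4.1, the stratified grammar can be read off as Haskell datatypes, so that the priorities are built into the very shape of the parse trees. In this sketch the constructor names are ours, and Digs is simplified to an Int:

data E = Plus E T  | ET T      -- E -> E + T | T
data T = Times T F | TF F      -- T -> T * F | F
data F = Paren E   | Num Int   -- F -> ( E ) | Digs

evalE :: E -> Int
evalE (Plus e t) = evalE e + evalT t
evalE (ET t)     = evalT t

evalT :: T -> Int
evalT (Times t f) = evalT t * evalF f
evalT (TF f)      = evalF f

evalF :: F -> Int
evalF (Paren e) = evalE e
evalF (Num n)   = n

The sentence 2+4*6 now has exactly one parse tree, Plus (ET (TF (Num 2))) (Times (TF (Num 4)) (Num 6)), and evalE maps it to 26, the intended reading 2+(4*6).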
In practice, often more than two levels of priority are used. Then, instead of writing a large number of identically formed production rules, we use parameterised nonterminals. For 1 ≤ i < n:

E_i → E_{i+1}
E_i → E_i OP_i E_{i+1}

The operator OP_i is a parameterised nonterminal that generates operators of priority i. In addition to the above productions, there should also be a production for expressions of the highest priority, for example:

E_n → ( E_1 ) | Digs
A grammar transformation transforms a grammar into another grammar that generates the same language. For each of the above transformations we should prove that the generated language remains the same. Since the proofs are too complicated at this point, they are omitted. The proofs can be found in any of the theoretical books on language and parsing theory [12].
There exist many more grammar transformations, but the ones given in this section suffice for now. Note that everywhere we use 'left' (left recursion, left factoring), we can replace it by 'right' and obtain a dual grammar transformation. We will discuss a larger example of a grammar transformation after the following section.
Exercise 2.17  The standard example of ambiguity in programming languages is the dangling else. Let G be a grammar with terminal set {if, b, then, else, a} and productions

S → if b then S else S
S → if b then S
S → a

1. Give two derivation trees for the sentence if b then if b then a else a.
2. Give an unambiguous grammar that generates the same language as G.
3. How does Java prevent this dangling else problem?
32 Context-Free Grammars
Exercise 2.18 A bit-list is a nonempty list of bits separated by commas. A grammar for
bit-lists is given by
L → B
L → L , L
B → 0 | 1
Remove the left recursion from this grammar.
Exercise 2.19 Consider the grammar with start symbol S
S → AB
A → ε | aaA
B → ε | Bb
1 What language does this grammar generate?
2 Give an equivalent non left recursive grammar.
2.6 Concrete and abstract syntax
The goal of this section is to introduce abstract syntax, and to show how to obtain
an abstract syntax from a concrete syntax.
Recall the grammar SequenceOfS for producing sequences of s's:
Beside: S → SS
Single: S → s
As explained in Section 2.4.1, the following datatype can be used to represent parse
trees of sentences of the language of S.
data S = Beside S S
       | Single Char
For example, the sequence sss may be represented by the parse tree
Beside (Beside (Single 's') (Single 's')) (Single 's')
The function s2string constructs the sentence that corresponds to a value of the
datatype S:
s2string :: S -> String
s2string x = case x of
  Beside l r -> s2string l ++ s2string r
  Single x   -> "s"
Since in each parse tree for a sentence of the language of S Single will always be fol-
lowed by the character 's', we do not have to include the type Char in the datatype
definition S. We refine the datatype for representing parse trees of SequenceOfS as
follows:
data SA = BesideA SA SA
        | SingleA
Note that the type Char, representing the terminal symbol, has disappeared now.
The sequence sss is represented by the parse tree
BesideA (BesideA SingleA SingleA) SingleA
A concrete syntax of a language describes the appearance of the sentences of a
language. So the concrete syntax of the language of S is given by the grammar
SequenceOfS. An abstract syntax of a language describes the parse trees of a
language. Parse trees are therefore sometimes also called abstract syntax trees. The
datatype SA is an example of an abstract syntax for the language of SequenceOfS.
The adjective abstract says that values of the abstract syntax do not need to have
all information about particular sentences, as long as the information is recoverable.
For example, function sa2string takes a value of the abstract syntax for
SequenceOfS, and returns the sentence represented by the abstract syntax tree.
sa2string :: SA -> String
sa2string x = case x of
  BesideA l r -> sa2string l ++ sa2string r
  SingleA     -> "s"
Such a function is often called a semantic function. A semantic function is a
function that is defined on an abstract syntax of a language. Semantic functions are
used to give semantics (meaning) to values. Here, the meaning of a more abstract
representation is expressed in terms of a concrete representation.
Using the removing left recursion grammar transformation, the grammar SequenceOfS
can be transformed into the grammar with the following productions:
S → sZ | s
Z → SZ | S
An abstract syntax of this grammar may be given by
data SA2 = ConsS Z | SingleS
data Z   = ConsSA2 SA2 Z | SingleSA2 SA2
In fact, the only important information about sequences of s's is how many occur-
rences of s there are. So the ultimate abstract syntax for SequenceOfS is
data SA3 = Size Int
The sequence sss is represented by the parse tree Size 3.
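The different representations are related by semantic functions. For example, the
following sketch (the name size is ours, not taken from the notes) computes the
SA3 representation of an SA tree:
-- Sketch: compute the 'Size' representation of an SA parse tree.
size :: SA -> SA3
size (BesideA l r) = Size (m + n)
  where Size m = size l
        Size n = size r
size SingleA = Size 1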
The SequenceOfS example shows that one may choose between many different ab-
stract syntaxes for a given grammar. The application determines which abstract
syntax is most convenient.
Exercise 2.20 A datatype in Haskell describes an inductively defined set. The following
datatype represents a limited form of arithmetic expressions
data Expr = Add Expr Expr | Mul Expr Expr | Con Int
Give a grammar that corresponds to this datatype.
Exercise 2.21 Consider your answer to exercise 2.7, which describes the concrete syntax for
palindromes over {a, b}.
1 Define a datatype Pal that describes the abstract syntax corresponding to your
grammar. Give the two abstract palindromes aPal1 and aPal2 that corre-
spond to the concrete palindromes cPal1 = "abaaba" and cPal2 = "baaab".
2 Write a (semantic) function that transforms an abstract representation of a
palindrome into a concrete one. Test your function with the abstract palin-
dromes aPal1 and aPal2.
3 Write a function that counts the number of a's occurring in a palindrome. Test
your function with the abstract palindromes aPal1 and aPal2.
Exercise 2.22 Consider your answer to exercise 2.8, which describes the concrete syntax for
mirror-palindromes.
1 Define a datatype Mir that describes the abstract syntax corresponding to
your grammar. Give the two abstract mirror-palindromes aMir1 and aMir2
that correspond to the concrete mirror-palindromes cMir1 = "abaaba" and
cMir2 = "abbbba".
2 Write a (semantic) function that transforms an abstract representation of a
mirror-palindrome into a concrete one. Test your function with the abstract
mirror-palindromes aMir1 and aMir2.
3 Write a function that transforms an abstract representation of a mirror-palin-
drome into the corresponding abstract representation of a palindrome. Test
your function with the abstract mirror-palindromes aMir1 and aMir2.
Exercise 2.23 Consider your answer to exercise 2.9, which describes the concrete syntax for
parity-sequences.
1 Describe the abstract syntax corresponding to your grammar. Give the two
abstract parity-sequences aEven1 and aEven2 that correspond to the concrete
parity-sequences cEven1 = "00101" and cEven2 = "01010".
2 Write a (semantic) function that transforms an abstract representation of a
parity-sequence into a concrete one. Test your function with the abstract
parity-sequences aEven1 and aEven2.
Exercise 2.24 Consider your answer to exercise 2.18, which describes the concrete syntax for
bit-lists by means of a grammar that is not left recursive.
1 Define a datatype BitList that describes the abstract syntax corresponding to
your grammar. Give the two abstract bit-lists aBitList1 and aBitList2 that
correspond to the concrete bit-lists cBitList1 = "0,1,0" and cBitList2 =
"0,0,1".
2 Write a function that transforms an abstract representation of a bit-list into
a concrete one. Test your function with the abstract bit-lists aBitList1 and
aBitList2.
3 Write a function that concatenates two abstract representations of bit-lists
into a single bit-list. Test your function with the abstract bit-lists aBitList1
and aBitList2.
2.7 Constructions on grammars
This section introduces some constructions on grammars that are useful when spec-
ifying larger grammars, for example for programming languages. Furthermore, it
gives an example of a larger grammar that is transformed in several steps.
The BNF notation, introduced in section 2.2.1, was first used in the early sixties
when the programming language ALGOL 60 was defined, and it is still the standard
way of defining programming languages. See for instance the Java Language Grammar.
You may object that the Java grammar contains more syntactical sugar than the
grammars that we considered thus far (and to be honest, this also holds for the
ALGOL 60 grammar): one encounters nonterminals with postfixes '?', '+' and '*'.
This extended BNF notation, EBNF, is introduced because the definition of a pro-
gramming language requires a lot of nonterminals, and adding (superfluous) nonter-
minals for standard constructions such as:
- one or zero occurrences of nonterminal P (P?),
- one or more occurrences of nonterminal P (P+),
- and zero or more occurrences of nonterminal P (P∗),
decreases the readability of the grammar. In other texts you will sometimes find [P]
instead of P?, and {P} instead of P∗. The same notation can be used for languages,
grammars, and sequences of terminal and nonterminal symbols instead of single
nonterminals. This section defines the meaning of these constructs.
We introduced grammars as an alternative for the description of languages. Design-
ing a grammar for a specific language may not be a trivial task. One approach is to
decompose the language and to find grammars for each of its constituent parts.
In definition 3 we defined a number of operations on languages using operations on
sets. Here we redefine these operations using context-free grammars.
Definition 10: Language operations
Suppose we have grammars for the languages L and M, say G_L = (T, N_L, R_L, S_L)
and G_M = (T, N_M, R_M, S_M). We assume that the sets N_L and N_M are disjoint.
Then
- L ∪ M is generated by the grammar (T, N, R, S) where S is a fresh nonterminal,
  N = N_L ∪ N_M ∪ {S} and R = R_L ∪ R_M ∪ {S → S_L, S → S_M}
- L M is generated by the grammar (T, N, R, S) where S is a fresh nonterminal,
  N = N_L ∪ N_M ∪ {S} and R = R_L ∪ R_M ∪ {S → S_L S_M}
- L∗ is generated by the grammar (T, N, R, S) where S is a fresh nonterminal,
  N = N_L ∪ {S} and R = R_L ∪ {S → ε, S → S_L S}
- L+ is generated by the grammar (T, N, R, S) where S is a fresh nonterminal,
  N = N_L ∪ {S} and R = R_L ∪ {S → S_L, S → S_L S}
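To make the constructions concrete, here is a sketch in Haskell, with a naive
representation of grammars in which all symbols are strings; this representation
and the names are ours, and the fresh nonterminal is simply assumed to be "S'":
type Symbol  = String
type Prod    = (Symbol, [Symbol])                    -- left- and right-hand side
type Grammar = ([Symbol], [Symbol], [Prod], Symbol)  -- (T, N, R, S)

-- The grammar generating the union of L and M, following Definition 10.
unionG :: Grammar -> Grammar -> Grammar
unionG (t, nl, rl, sl) (_, nm, rm, sm) =
  (t, s : nl ++ nm, (s, [sl]) : (s, [sm]) : rl ++ rm, s)
  where s = "S'"   -- assumed fresh, i.e. not occurring in nl or nm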
The nice thing about the above definitions is that the set-theoretic operation at
the level of languages (i.e. sets of sentences) has a direct counterpart at the level
of grammatical descriptions. A straightforward question to ask is now: can I also
define languages as the difference between two languages, or as the intersection of
two languages? Unfortunately there are no equivalent operators for composing
grammars that correspond to such intersection and difference operators.
Two of the above constructions are important enough to also define them as grammar
operations. Furthermore, we add a new construction for choice.
Definition 11: Grammar operations
Let G = (T, N, R, S) be a context-free grammar and let S' be a fresh nonterminal.
Then
G∗ = (T, N ∪ {S'}, R ∪ {S' → ε, S' → SS'}, S')        star G
G+ = (T, N ∪ {S'}, R ∪ {S' → S, S' → SS'}, S')        plus G
G? = (T, N ∪ {S'}, R ∪ {S' → ε, S' → S}, S')          optional G
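In the same naive representation of grammars sketched after Definition 10, the star
operation might be written as follows (again our own sketch, not library code):
-- star G: add a fresh start symbol S' with S' -> epsilon and S' -> S S'.
starG :: Grammar -> Grammar
starG (t, n, r, s) = (t, s' : n, (s', []) : (s', [s, s']) : r, s')
  where s' = "S'"  -- assumed fresh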
The definition of P?, P+, and P∗ is very similar to the definitions of the operations
on grammars. For example, P∗ denotes zero or more occurrences of nonterminal P,
so Dig∗ denotes the language consisting of zero or more digits.
Definition 12: EBNF for sequences
Let P be a sequence of nonterminals and terminals, then
L(P∗) = L(Z) with Z → ε | PZ
L(P+) = L(Z) with Z → P | PZ
L(P?) = L(Z) with Z → ε | P
where Z is a new nonterminal in each definition.
Because the concatenation operator for sequences is associative, the operators + and
∗ can also be defined symmetrically:
L(P∗) = L(Z) with Z → ε | ZP
L(P+) = L(Z) with Z → P | ZP
There are many variations possible on this theme:
L(P∗ Q) = L(Z) with Z → Q | PZ                        (2.1)
2.7.1 SL: an example
To illustrate EBNF and some of the grammar transformations given in the previous
section, we give a larger example. The following grammar generates expressions in
a very small programming language, called SL.
Expr → if Expr then Expr else Expr
Expr → Expr where Decls
Expr → AppExpr
AppExpr → AppExpr Atomic | Atomic
Atomic → Var | Number | Bool | (Expr)
Decls → Decl
Decls → Decls ; Decls
Decl → Var = Expr
where the nonterminals Var, Number, and Bool generate variables, number expres-
sions, and boolean expressions, respectively. Note that the brackets around the
Expr in the production for Atomic, and the semicolon in between the Decls in the
second production for Decls are also terminal symbols. The following program is a
sentence of this language:
if true then funny true else false where funny = 7
It is clear that this is not a very convenient language to write programs in.
The above grammar is ambiguous (why?), and we introduce priorities to resolve
some of the ambiguities. Application binds stronger than if, and both application
and if bind stronger than where. Using the introduction of priorities grammar trans-
formation, we obtain:
Expr → Expr1
Expr → Expr1 where Decls
Expr1 → Expr2
Expr1 → if Expr1 then Expr1 else Expr1
Expr2 → Atomic
Expr2 → Expr2 Atomic
where Atomic and Decls have the same productions as before.
The nonterminal Expr2 is left recursive. Removing left recursion gives the following
productions for Expr2.
Expr2 → Atomic | Atomic Z
Z → Atomic | Atomic Z
Since the new nonterminal Z has exactly the same productions as Expr2, these
productions can be replaced by
Expr2 → Atomic | Atomic Expr2
So Expr2 generates a nonempty sequence of atomics. Using the +-notation intro-
duced in this section, we may replace Expr2 by Atomic+.
Another source of ambiguity is formed by the productions for Decls. Decls generates
a nonempty list of declarations, and the separator ; is assumed to be associative.
Hence we can apply the associative separator transformation to obtain
Decls → Decl | Decl ; Decls
or, according to (2.1),
Decls → (Decl ;)∗ Decl
which, using a rule for the ∗ operator that we have omitted here, may be transformed
into
Decls → Decl (; Decl)∗
The last grammar transformation we apply is left factoring. This transformation
applies to Expr, and gives
Expr → Expr1 Z
Z → ε | where Decls
Since nonterminal Z generates either nothing or a where clause, we may replace Z
by an optional where clause in the production for Expr.
Expr → Expr1 (where Decls)?
After all these grammar transformations, we obtain the following grammar.
Expr → Expr1 (where Decls)?
Expr1 → Atomic+
Expr1 → if Expr1 then Expr1 else Expr1
Atomic → Var | Number | Bool | (Expr)
Decls → Decl (; Decl)∗
Exercise 2.25 Give the EBNF notation for each of the basic languages defined in section
2.3.1.
Exercise 2.26 What language is generated by G? ?
Exercise 2.27 In case we define a language by means of a predicate it is almost trivial to
define the intersection and the difference of two languages. Show how.
Exercise 2.28 Let L1 = { a^n b^n c^m | n, m ∈ ℕ } and L2 = { a^n b^m c^m | n, m ∈ ℕ }.
- Give grammars for L1 and L2.
- Is L1 ∩ L2 context-free, i.e. can you give a context-free grammar for this
language?
2.8 Parsing
This section formulates the parsing problem, and discusses some of the future topics
of the course.
Definition 13: Parsing problem
Given the grammar G and a string s, the parsing problem answers the question
whether or not s ∈ L(G). If s ∈ L(G), the answer to this question may be either a
parse tree or a derivation.
This question may not be easy to answer given an arbitrary grammar. Until now
we have only seen simple grammars for which it is easy to determine whether or not
a string is a sentence of the grammar. For more complicated grammars this may
be more difficult. However, in the first part of this course we will show how, given
a grammar with certain reasonable properties, we can easily construct parsers by
hand. At the same time we will show how the parsing process can quite often be
combined with the algorithm we actually want to perform on the recognised object
(the semantic function). As such it provides a simple, although surprisingly efficient,
introduction into the area of compiler construction.
A compiler for a programming language consists of several parts. Examples of such
parts are a scanner, a parser, a type checker, and a code generator. Usually, a parser
is preceded by a scanner, which divides an input sentence into a list of so-called
tokens. For example, given the sentence
if true then funny true else false where funny = 7
a scanner might return the following list of tokens:
["if","true","then","funny","true","else","false"
,"where","funny","=","7"]
So a token is a syntactical entity. A scanner usually performs the first step towards
an abstract syntax: it throws away layout information such as spacing and newlines.
In this course we will concentrate on parsers, but some of the concepts of scanners
will sometimes be used.
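For the example sentence above, in which all tokens happen to be separated by
spaces, a scanner can be sketched in one line (a real scanner must of course also
split tokens that are not surrounded by layout):
-- A naive scanner: split the input on white space.
scanner :: String -> [String]
scanner = words
Applied to the example sentence, scanner returns exactly the list of tokens shown
above.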
In the second part of this course we will take a look at more complicated grammars,
which do not always conform to the restrictions just referred to. By analysing the
grammar we may nevertheless be able to generate parsers as well. Such generated
parsers will be in such a form that it will be clear that writing such parsers by hand
is far from attractive, and actually impossible for all practical cases.
One of the problems we have not referred to yet in this rather formal chapter is of
a more practical nature. Quite often the sentence presented to the parser will not
be a sentence of the language, since mistakes were made when typing the sentence.
This raises another interesting question: what are the minimal changes that have
to be made to the sentence in order to convert it into a sentence of the language?
It goes almost without saying that this is an important question to be answered in
practice; one would not be very happy with a compiler which, given an erroneous
input, would just reply that the input could not be recognised. One of the most
important aspects here is to define a metric for deciding about the minimality of a
change; humans usually make certain mistakes more often than others. A semicolon
can easily be forgotten, but the chance that an if symbol is missing is much smaller.
This is where grammar engineering starts to play a role.
summary
Starting from a simple example, the language of palindromes, we have introduced
the concept of a context-free grammar. Associated concepts, such as derivations and
parse trees, were introduced.
2.9 Exercises
Exercise 2.29 Do there exist languages L such that (L∗) = (L)∗?
Exercise 2.30 Give a language L such that L = L∗.
Exercise 2.31 Under what circumstances is L+ = L∗?
Exercise 2.32 Let L be a language over alphabet {a, b, c} such that L = L^R. Does L
contain only palindromes?
Exercise 2.33 Consider the grammar with productions
S → AA
A → AAA
A → a
A → bA
A → Ab
1. Which terminal strings can be produced by derivations of four or fewer steps?
2. Give at least four distinct derivations for the string babbab.
3. For any m, n, p ≥ 0, describe a derivation of the string b^m a b^n a b^p.
Exercise 2.34 Consider the grammar with productions
S → aaB
A → bBb
A → ε
B → Aa
Show that the string aabbaabba cannot be derived from S.
Exercise 2.35 Give a grammar for the language
L = { w c w^R | w ∈ {a, b}∗ }
This language is known as the center-marked palindromes language. Give a deriva-
tion of the sentence abcba.
Exercise 2.36 Describe the language generated by the grammar:
S → ε
S → A
A → aAb
A → ab
Can you find another (preferably simpler) grammar for the same language?
Exercise 2.37 Describe the languages generated by the grammars:
S → ε
S → A
A → Aa
A → a
and
S → ε
S → A
A → AaA
A → a
Can you find other (preferably simpler) grammars for the same languages?
Exercise 2.38 Show that the languages generated by the grammars G1, G2 and G3 are the
same.
G1:         G2:         G3:
S → ε       S → ε       S → ε
S → aS      S → Sa      S → a
                        S → SS
Exercise 2.39 Consider the following property of grammars:
1. the start symbol is the only nonterminal which may have an empty production
(a production of the form X → ε),
2. the start symbol does not occur in any alternative.
A grammar having this property is called non-contracting. The grammar A = aAb | ε
does not have this property. Give a non-contracting grammar which describes the
same language as A = aAb | ε.
Exercise 2.40 Describe the language L of the grammar A = AaA | a. Give a grammar for
L that has no left recursive productions. Give a grammar for L that has no right
recursive productions.
Exercise 2.41 Describe the language L of the grammar X = a | Xb. Give a grammar
for L that has no left recursive productions. Give a grammar for L that has no left
recursive productions and which is non-contracting.
Exercise 2.42 Consider the language L of the grammar
S = T | US
T = aSa | Ua
U = S | SUT
Give a grammar for L which uses only alternatives of length at most 2. Give a
grammar for L which uses only 2 nonterminals.
Exercise 2.43 Give a grammar for the language of all sequences of 0's and 1's which start
with a 1 and contain exactly one 0.
Exercise 2.44 Give a grammar for the language consisting of all nonempty sequences of
brackets,
{ (, ) }
in which the brackets match. ()( () )() is a sentence of the language; give a derivation
tree for it.
Exercise 2.45 Give a grammar for the language consisting of all nonempty sequences of two
kinds of brackets,
{ (, ), [, ] }
in which the brackets match. [ ( ) ] ( ) is a sentence of the language.
Exercise 2.46 This exercise shows an example (attributed to Noam Chomsky) of an ambigu-
ous English sentence. Consider the following grammar for a part of the English
language:
Sentence → Subject Predicate .
Subject → they
Predicate → Verb NounPhrase
Predicate → AuxVerb Verb Noun
Verb → are
Verb → flying
AuxVerb → are
NounPhrase → Adjective Noun
Adjective → flying
Noun → planes
Give two different leftmost derivations for the sentence
they are flying planes.
Exercise 2.47 Try to find some ambiguous sentences in your own natural language. Here are
some ambiguous Dutch sentences seen in the newspapers:
Vliegen met hartafwijking niet gevaarlijk
Jessye Norman kan niet zingen
Alcohol is voor vrouwen schadelijker dan mannen
Exercise 2.48 Is your grammar of exercise 2.45 unambiguous? If not, find one which is
unambiguous.
Exercise 2.49 This exercise deals with a grammar that uses unusual terminal and nonterminal
symbols. [The grammar, built from pictorial symbols, is not reproducible in this
text-only version.]
Find a derivation for the sentence 3.
Exercise 2.50 Prove, using induction, that the grammar G for palindromes of section 2.2
does indeed generate the language of palindromes.
Exercise 2.51 Prove that the language generated by the grammar of exercise 2.33 contains
all strings over {a, b} with a number of a's that is even and greater than zero.
Exercise 2.52 Consider the natural numbers in unary notation, where only the symbol I is
used; thus 4 is represented as IIII. Write an algorithm that, given a string w of
I's, determines whether or not w is divisible by 7.
Exercise 2.53 Consider the natural numbers in reverse binary notation; thus 4 is represented
as 001. Write an algorithm that, given a string w of zeros and ones, determines
whether or not w is divisible by 7.
Exercise 2.54 Let w be a string consisting of a's and b's only. Write an algorithm that de-
termines whether or not the number of a's in w equals the number of b's in w.
Chapter 3
Parser combinators
introduction
This chapter is an informal introduction to writing parsers in a lazy functional lan-
guage using parser combinators. Parsers can be written using a small set of basic
parsing functions, and a number of functions that combine parsers into more com-
plicated parsers. The functions that combine parsers are called parser combinators.
The basic parsing functions do not combine parsers, and are therefore not parser
combinators in this sense, but they are usually also called parser combinators.
Parser combinators are used to write parsers that are very similar to the grammar of
a language. Thus writing a parser amounts to translating a grammar to a functional
program, which is often a simple task.
Parser combinators are built by means of standard functional language constructs
like higher-order functions, lists, and datatypes. List comprehensions are used in
a few places, but they are not essential, and could easily be rephrased using the
map, filter and concat functions. Type classes are only used for overloading the
equality and arithmetic operators.
We will start by motivating the definition of the type of parser functions. Using
that type, we can build parsers for the language of (possibly ambiguous) grammars.
Next, we will introduce some elementary parsers that can be used for parsing the
terminal symbols of a language.
In Section 3.3 the first parser combinators are introduced, which can be used for
sequentially and alternatively combining parsers, and for calculating so-called se-
mantic functions during the parse. Semantic functions are used to give meaning to
syntactic structures. As an example, we construct a parser for strings of match-
ing parentheses in Section 3.3.1. Different semantic values are calculated for the
matching parentheses: a tree describing the structure, and an integer indicating the
nesting depth.
In Section 3.4 we introduce some new parser combinators. Not only do these make
life easier later, but their definitions are also nice examples of using parser combi-
nators. A real application is given in Section 3.5, where a parser for arithmetical
expressions is developed. Finally, the expression parser is generalised to expressions
with an arbitrary number of precedence levels. This is done without coding the
priorities of operators as integers, and we will avoid using indices and ellipses.
It is not always possible to directly construct a parser from a context-free grammar
using parser combinators. If the grammar is left recursive, it has to be transformed
into a non left recursive grammar before we can construct a combinator parser. An-
other limitation of the parser combinator technique as described in this chapter is
that it is not trivial to write parsers for complex grammars that perform reason-
ably efficiently. However, there do exist implementations of parser combinators that
perform remarkably well, see [11, 13]. For example, there exist good parser
combinator libraries for Haskell.
Most of the techniques introduced in this chapter have been described by Burge [4],
Wadler [14] and Hutton [7]. Recently, the use of so-called monads has become quite
popular in connection with parser combinators [15, 8]. We will not use them in
this article, however, to show that no magic is involved in using parser combinators.
You are nevertheless encouraged to study monads at some time, because they form
a useful generalisation of the techniques described here.
This chapter is a revised version of [5].
goals
This chapter introduces the rst programs for parsing in these lecture notes. Parsers
are composed from simple parsers by means of parser combinators. Hence, important
primary goals of this chapter are:
to understand how to parse, i.e. how to recognise structure in a sequence of
symbols, by means of parser combinators;
to be able to construct a parser when given a grammar;
to understand the concept of semantic functions.
Two secondary goals of this chapter are:
to develop the capability to abstract;
to understand the concept of domain specific language.
required prior knowledge
To understand this chapter, you should be able to formulate the parsing problem,
and you should understand the concept of context-free grammar. Furthermore, you
should be familiar with functional programming concepts such as type, class, and
higher-order functions.
3.1 The type Parser
The goals of this section are:
develop a type Parser that is used to give the type of parsing functions;
show how to obtain this type by means of several abstraction steps.
The parsing problem is (see Section 2.8): Given a grammar G and a string s, de-
termine whether or not s ∈ L(G). If s ∈ L(G), the answer to this question may
be either a parse tree or a derivation. For example, in Section 2.4 you have seen a
grammar for sequences of s's:
S → SS | s
A parse tree of an expression of this language is a value of the datatype SA (or a
value of S, SA2, SA3, see Section 2.6, depending on what you want to do with the
result of the parser), which is defined by
data SA = BesideA SA SA | SingleA
A parser for expressions could be implemented as a function of the following type:
type Parser = String -> SA
For parsing substructures, a parser can call other parsers, or call itself recursively.
These calls do not only have to communicate their result, but also the part of the
input string that is left unprocessed. For example, when parsing the string sss, a
parser will first build a parse tree BesideA SingleA SingleA for ss, and only then
build a complete parse tree
BesideA (BesideA SingleA SingleA) SingleA
using the unprocessed part s of the input string. As this cannot be done using
a global variable, the unprocessed input string has to be part of the result of the
parser. The two results can be grouped in a tuple. A better definition for the type
Parser is hence:
type Parser = String -> (SA,String)
Any parser of type Parser returns an SA and a String. However, for different
grammars we want to return different parse trees: the type of tree that is returned
depends on the grammar for which we want to parse sentences. Therefore it is better
to abstract from the type SA, and to make the parser type into a polymorphic type.
The type Parser is parametrised with a type a, which represents the type of parse
trees.
type Parser a = String -> (a,String)
For example, a parser that returns a structure of type Oak (whatever that is) now
has type Parser Oak. A parser that parses sequences of s's has type Parser SA.
We might also dene a parser that does not return a value of type SA, but instead
the number of ss in the input sequence. This parser would have type Parser Int.
Another instance of a parser is a parse function that recognises a string of digits, and
returns the number represented by it as a parse tree. In this case the function is
also of type Parser Int. Finally, a recogniser that either accepts or rejects sentences
of a grammar returns a boolean value, and will have type Parser Bool.
Until now, we have assumed that every string can be parsed in exactly one way. In
general, this need not be the case: it may be that a single string can be parsed in
various ways, or that there is no way to parse a string. For example, the string sss
has the following two parse trees:
BesideA (BesideA SingleA SingleA) SingleA
BesideA SingleA (BesideA SingleA SingleA)
As another refinement of the type definition, instead of returning one parse tree
(and its associated rest string), we let a parser return a list of trees. Each element
of the result consists of a tree, paired with the rest string that was left unprocessed
after parsing. The type denition of Parser therefore becomes:
type Parser a = String -> [(a,String)]
If there is just one parsing, the result of the parse function is a singleton list. If no
parsing is possible, the result is an empty list. In case of an ambiguous grammar,
the result consists of all possible parsings.
This method for parsing is called the list of successes method, described by Wadler
[14]. It can be used in situations where in other languages you would use so-called
backtracking techniques. In the Bird and Wadler textbook it is used to solve combi-
natorial problems like the eight queens problem [3]. If only one solution is required
rather than all possible solutions, you can take the head of the list of successes.
Thanks to lazy evaluation, not all elements of the list are computed if only the first
value is needed, so there will be no loss of efficiency. Lazy evaluation provides a
backtracking approach to finding the first solution.
Parsers with the type described so far operate on strings, that is lists of characters.
There is however no reason for not allowing parsing strings of elements other than
characters. You may imagine a situation in which a preprocessor prepares a list of
tokens (see Section 2.8), which is subsequently parsed. To cater for this situation
we refine the parser type once more: we let the type of the elements of the input
string be an argument of the parser type. Calling it b, and as before the result type
a, the type of parsers is now dened by:
type Parser b a = [b] -> [(a,[b])]
or, if you prefer meaningful identiers over conciseness:
type Parser symbol result = [symbol] -> [(result,[symbol])]
We will use this type definition in the rest of this chapter. This type is defined in
listing 1, the first part of our parser library. The list of successes appears in the result
type of a parser. Each element of this list is a possible parsing of (an initial part of)
the input. We will hardly use the generality provided by the Parser type: the type
of the input b (or symbol) will almost always be Char.
-- The type of parsers
type Parser symbol result = [symbol] -> [(result,[symbol])]
Listing 1: ParserType.hs
3.2 Elementary parsers
The goals of this section are:
introduce some very simple parsers for parsing sentences of grammars with
rules of the form:
A → ε
A → a
A → x
where x is a sequence of terminals;
show how one can construct useful functions from simple, trivially correct
functions by means of generalisation and partial parametrisation.
This section defines parsers that can only be used to parse fixed sequences of terminal
symbols. For a grammar with a production that contains nonterminals in its right-
hand side we need techniques that will be introduced in the following section.
We will start with a very simple parse function that just recognises the terminal
symbol 'a'. The type of the input string symbols is Char in this case, and as a parse
tree we also simply use a Char:
symbola :: Parser Char Char
symbola []     = []
symbola (x:xs) | x == 'a'  = [('a',xs)]
               | otherwise = []
The list of successes method immediately pays off, because now we can return an
empty list if no parsing is possible (because the input is empty, or does not start
with an 'a').
In the same fashion, we can write parsers that recognise other symbols. As always,
rather than defining a lot of closely related functions, it is better to abstract from the
symbol to be recognised by making it an extra argument of the function. Further-
more, the function can operate on lists of characters, but also on lists of symbols of
other types, so that it can be used in other applications than character oriented ones.
The only prerequisite is that the symbols to be parsed can be tested for equality. In
Hugs, this is indicated by the Eq predicate in the type of the function.
Using these generalisations, we obtain the function symbol that is given in listing 2.
The function symbol is a function that, given a symbol, returns a parser for that
symbol. A parser on its turn is a function too. This is why two arguments appear
in the denition of symbol.
-- Elementary parsers
symbol :: Eq s => s -> Parser s s
symbol a [] = []
symbol a (x:xs) | x == a = [(x,xs)]
| otherwise = []
satisfy :: (s -> Bool) -> Parser s s
satisfy p [] = []
satisfy p (x:xs) | p x = [(x,xs)]
| otherwise = []
token :: Eq s => [s] -> Parser s [s]
token k xs | k == take n xs = [(k,drop n xs)]
| otherwise = []
where n = length k
failp :: Parser s a
failp xs = []
succeed :: a -> Parser s a
succeed r xs = [(r,xs)]
-- Applications of elementary parsers
digit :: Parser Char Char
digit = satisfy isDigit
Listing 2: ParserType.hs
We will now define some elementary parsers that can do the work traditionally taken
care of by lexical analysers (see Section 2.8). For example, a useful parser is one
that recognises a fixed string of symbols, such as while or switch. We will call this
function token; it is defined in listing 2. As in the case of the symbol function we
have parametrised this function with the string to be recognised, effectively making
it into a family of functions. Of course, this function is not confined to strings of
characters. However, we do need an equality test on the type of values in the input
string; the type of token is:
token :: Eq s => [s] -> Parser s [s]
The function token is a generalisation of the symbol function, in that it recognises
a list of symbols instead of a single symbol. Note that we cannot define symbol in
terms of token: the two functions have incompatible types.
Another generalisation of symbol is a function which may, depending on the in-
put, return different parse results. Instead of specifying a specific symbol, we can
parametrise the function with a condition that the symbol should fulfil. Thus the
function satisfy has a function s -> Bool as argument. Where symbol tests for
equality to a specific value, the function satisfy tests for compliance with this
predicate. It is defined in listing 2. This generalised function is for example useful
when we want to parse digits (characters between '0' and '9'):
digit :: Parser Char Char
digit = satisfy isDigit
where the function isDigit is the standard predicate that tests whether or not a
character is a digit:
isDigit :: Char -> Bool
isDigit x = '0' <= x && x <= '9'
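A session with these elementary parsers may look like this:
? symbol 'a' "any"
[('a',"ny")]
? symbol 'a' "none"
[]
? token "if" "if p then q"
[("if"," p then q")]
? digit "123"
[('1',"23")]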
In books on grammar theory an empty string is often called epsilon. In this tradi-
tion, we will define a function epsilon that parses the empty string. It does not
consume any input, and hence always returns an empty parse tree and unmodified
input. A zero-tuple can be used as a result value: () is the only value of the type
().
epsilon :: Parser s ()
epsilon xs = [((),xs)]
A more useful variant is the function succeed, which also doesn't consume input,
but always returns a given, fixed value (or parse tree, if you can call the result of
processing zero symbols a parse tree). It is defined in listing 2.
Dual to the function succeed is the function failp, which fails to recognise any
symbol on the input string. As the result list of a parser is a list of successes,
and in the case of failure there are no successes, the result list should be empty.
Therefore the function failp always returns the empty list of successes. It is defined
in listing 2. Note the difference with epsilon, which does have one element in its
list of successes (albeit an empty one).
52 Parser combinators
Do not confuse failp with epsilon: there is an important difference between re-
turning one solution (which contains the unchanged input as rest string) and not
returning a solution at all!
Exercise 3.1 Define a function capital :: Parser Char Char that parses capital letters.
Exercise 3.2 Since satisfy is a generalisation of symbol, the function symbol can be defined
as an instance of satisfy. How can this be done?
Exercise 3.3 Define the function epsilon using succeed.
3.3 Parser combinators
Using the elementary parsers from the previous section, parsers can be constructed
for terminal symbols from a grammar. More interesting are parsers for nonterminal
symbols. It is convenient to construct these parsers by partially parametrising
higher-order functions.
The goals of this section are:
show how parsers can be constructed directly from the productions of a gram-
mar. The kind of productions for which parsers will be constructed are
A → x | y
A → x y
where x, y ∈ (N ∪ T)∗;
show how we can construct a small, powerful combinator language (a domain
specific language) for the purpose of parsing;
understand and use the concept of semantic functions in parsers.
Let us have a look at the grammar for expressions again, see also Section 2.5:
E → T+E | T
T → F*T | F
F → Digs | (E)
where Digs is a nonterminal that generates the language of sequences of digits,
see Section 2.3.1. An expression can be parsed according to any of the two rules
for E. This implies that we want to have a way to say that a parser consists of
several alternative parsers. Furthermore, the rst rule says that in order to parse
an expression, we should first parse a term, then a terminal symbol +, and then an
expression. This implies that we want to have a way to say that a parser consists of
several parsers that are applied sequentially.
So important operations on parsers are sequential and alternative composition: a
more complex construct can consist of a simple construct followed by another con-
struct (sequential composition), or by a choice between two constructs (alternative
composition). These operations correspond directly to their grammatical counter-
parts. We will develop two functions for this, which for notational convenience
are defined as operators: <*> for sequential composition, and <|> for alternative
composition. The names of these operators are chosen so that they can be easily
remembered: <*> 'multiplies' two constructs together, and <|> can be pronounced
as 'or'. Be careful, though, not to confuse the <|>-operator with Hugs' built-in
construct |, which is used to distinguish cases in a function definition.
Priorities of these operators are defined so as to minimise parentheses in practical
situations:
infixl 6 <*>
infixr 4 <|>
Both operators take two parsers as argument, and return a parser as result. By
again combining the result with other parsers, you may construct even more involved
parsers.
In the definitions in listing 3, the functions operate on parsers p and q. Apart from
the arguments p and q, the function operates on a string, which can be thought of
as the string that is parsed by the parser that is the result of combining p and q.
We start with the definition of operator <*>. For sequential composition, p must be
applied to the input first. After that, q is applied to the rest string of the result.
The rst parser, p, returns a list of successes, each of which contains a value and a
rest string. The second parser, q, should be applied to the rest string, returning a
second value. Therefore we use a list comprehension, in which the second parser is
applied in all possible ways to the rest string of the rst parser:
(p <*> q) xs = [(combine r1 r2, zs)
               |(r1,ys) <- p xs
               ,(r2,zs) <- q ys
               ]
The rest string of the parser for the sequential composition of p and q is whatever
the second parser q leaves behind as rest string.
Now, how should the results of the two parsings be combined? We could, of course,
parametrise the whole thing with an operator that describes how to combine the
parts (as is done in the zipWith function). However, we have chosen a different
approach, which nicely exploits the ability of functional languages to manipulate
functions. The function combine should combine the results of the two parse trees
recognised by p and q. In the past, we have interpreted the word tree liberally:
simple values, like characters, may also be used as a parse tree. We will now also
accept functions as parse trees. That is, the result type of a parser may be a function
type.
If the first parser that is combined by <*> would return a function of type b -> a,
and the second parser a value of type b, a straightforward choice for the combine
function would be function application. That is exactly the approach taken in the
definition of <*> in listing 3. The first parser returns a function, the second parser
a value, and the combined parser returns the value that is obtained by applying the
function to the value.
Apart from sequential composition we need a parser combinator for representing
choice. For this, we have the parser combinator operator <|>. Thanks to the list
of successes method, both p1 and p2 return lists of possible parsings. To obtain all
-- Parser combinators
(<|>) :: Parser s a -> Parser s a -> Parser s a
(p <|> q) xs = p xs ++ q xs
(<*>) :: Parser s (b -> a) -> Parser s b -> Parser s a
(p <*> q) xs = [(f x,zs)
|(f ,ys) <- p xs
,( x,zs) <- q ys
]
(<$>) :: (a -> b) -> Parser s a -> Parser s b
(f <$> p) xs = [(f y,ys)
|( y,ys) <- p xs
]
-- Applications of parser combinators
newdigit :: Parser Char Int
newdigit = f <$> digit
  where f c = ord c - ord '0'
Listing 3: ParserCombinators.hs
possible parsings when applying p1 or p2, we only need to concatenate these two
lists.
By combining parsers with parser combinators we can construct new parsers. The
most important parser combinators are <*> and <|>. The parser combinator <,> in
exercise 3.12 is just a variation of <*>.
Sometimes we are not quite satisfied with the result value of a parser. The parser
might work well in that it consumes symbols from the input adequately (leaving the
unused symbols as rest-string in the tuples in the list of successes), but the result
value might need some postprocessing. For example, a parser that recognises one
digit is defined using the function satisfy: digit = satisfy isDigit. In some
applications, we may need a parser that recognises one digit character, but returns
the result as an integer, instead of a character. In a case like this, we can use a new
parser combinator: <$>. It takes a function and a parser as argument; the result is a
parser that recognises the same string as the original parser, but postprocesses the
result using the function. We use the $ sign in the name of the combinator, because
the combinator resembles the operator that is used for normal function application
in Haskell: f $ x = f x. The definition of <$> is given in listing 3. It is an infix
operator:
infixl 7 <$>
Using this postprocessing parser combinator, we can modify the parser digit that
was defined above:
newdigit :: Parser Char Int
newdigit = f <$> digit
  where f c = ord c - ord '0'
The auxiliary function f determines the ordinal number of a digit character; using
the parser combinator <$> it is applied to the result part of the digit parser.
In practice, the <$> operator is used to build a certain value during parsing (in the
case of parsing a computer program this value may be the generated code, or a list
of all variables with their types, etc.). Put more generally: using <$> we can add
semantic functions to parsers.
A parser for the SequenceOfS grammar that returns the abstract syntax tree of the
input, i.e., a value of type SA, see Section 2.6, is dened as follows:
sequenceOfS :: Parser Char SA
sequenceOfS = BesideA <$> sequenceOfS <*> sequenceOfS
          <|> const SingleA <$> symbol 's'
But if you try to run this function, you will get a stack overflow! If you apply
sequenceOfS to a string, the first thing it does is to apply itself to the same string,
which ... The problem is that the underlying grammar is left recursive, and you
cannot use parser combinators to parse sentences of left recursive grammars. In
section 2.5 we have shown how to remove the left recursion in the SequenceOfS
grammar. The resulting grammar is used to obtain the following parser:
sequenceOfS :: Parser Char SA2
sequenceOfS =
      const ConsS <$> symbol 's' <*> parseZ
  <|> const SingleS <$> symbol 's'
  where parseZ =     ConsSA2 <$> sequenceOfS <*> parseZ
                 <|> SingleSA2 <$> sequenceOfS
This example is a direct translation of the grammar obtained by using the removing
left recursion grammar transformation. There exists a much simpler parser for
parsing sequences of s's.
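For example, if we are only interested in the number of s's (the abstract syntax SA3
of Section 2.6), a sketch using the many1 combinator that will be defined in Section
3.4 suffices (the name sequenceOfS3 is ours):
-- Sketch: count the s's instead of building a tree.
sequenceOfS3 :: Parser Char SA3
sequenceOfS3 = Size . length <$> many1 (symbol 's')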
Exercise 3.4 Prove that for all f :: a -> b
f <$> succeed a = succeed (f a)
In the sequel we will often use this rule for constant functions f, i.e. f = \a -> c,
where c is a value that does not contain a.
Exercise 3.5 Consider the parser list <$> symbol 'a', where list a as = a:as. Give its
type and show its results on inputs [] and x:xs.
Exercise 3.6 Consider the parser list <$> symbol 'a' <*> p. Give its type and show its
results on inputs [] and x:xs.
Exercise 3.7 Define a parser for Booleans.
Exercise 3.8 Define parsers for each of the basic languages defined in section 2.3.1.
Exercise 3.9 Consider the grammar for palindromes that you have constructed in exercise
2.7.
1. Give the datatype Pal2 that corresponds to this grammar.
2. Define a parser palin2 that returns parse trees for palindromes. Test your
function with the palindromes cPal1 = "abaaba" and cPal2 = "baaab". Com-
pare the results with your answer to exercise 2.16.
3. Define a parser palina that counts the number of a's occurring in a palin-
drome.
Exercise 3.10 Consider the grammar for a part of the English language that is given in
exercise 2.46.
1. Give the datatype English that corresponds to this grammar.
2. Define a parser english that returns parse trees for the English language.
Test your function with the sentence they are flying planes. Compare
the result to your answer of exercise 2.46.
Exercise 3.11 When defining the priority of the <|> operator with the infixr keyword, we
also specied that the operator associates to the right. Why is this a better choice
than association to the left?
Exercise 3.12 Define a parser combinator <,> that combines two parsers. The value returned
by the combined parser is a tuple containing the results of the two component
parsers. What is the type of this parser combinator?
Exercise 3.13 The term parser combinator is in fact not an adequate description for <$>.
Can you think of a better word?
Exercise 3.14 Compare the type of <$> with the type of the standard function map. Can you
describe your observations in an easy-to-remember, catchy phrase?
Exercise 3.15 Define <*> in terms of <,> and <$>. Define <,> in terms of <*> and <$>.
Exercise 3.16 If you examine the definitions of <*> and <$> in listing 3, you can observe that
<$> is in a sense a special case of <*>. Can you define <$> in terms of <*>?
3.3.1 Matching parentheses: an example
Using parser combinators, it is often fairly straightforward to construct a parser for
a language for which you have a grammar. Consider, for example, the grammar that
you wrote in exercise 2.44:
S → ( S ) S | ε
This grammar can be directly translated to a parser, using the parser combinators
<*> and <|>. We use <*> when symbols are written next to each other, and <|>
when | appears in a production (or when there is more than one production for a
nonterminal).
parens :: Parser Char ???
parens = symbol '(' <*> parens <*> symbol ')' <*> parens
     <|> epsilon
However, this function is not correctly typed: the parsers in the first alternative
cannot be composed using <*>, as for example symbol '(' is not a parser returning
a function.
But we can postprocess the parser symbol '(' so that, instead of a character, this
parser does return a function. So, what function should we use? This depends
on the kind of value that we want as a result of the parser. A nice result would
be a tree-like description of the parentheses that are parsed. For this purpose we
introduce an abstract syntax, see Section 2.6, for the parentheses grammar. A first
abstract syntax is given by the datatype Parentheses1.
data Parentheses1 = Match1 Char Parentheses1 Char Parentheses1
| Empty1
For example, the sentence ()() is represented by
Match1 '(' Empty1 ')' (Match1 '(' Empty1 ')' Empty1)
Suppose we want to calculate the number of parentheses in a sentence. The number
of parentheses is calculated by the function nrofpars, which is defined by induction
on the datatype Parentheses1.
data Parentheses = Match Parentheses Parentheses
| Empty
deriving Show
open  = symbol '('
close = symbol ')'
parens :: Parser Char Parentheses
parens = f <$> open <*> parens <*> close <*> parens
<|> succeed Empty
where f a b c d = Match b d
nesting :: Parser Char Int
nesting = f <$> open <*> nesting <*> close <*> nesting
<|> succeed 0
where f a b c d = max (1+b) d
Listing 4: ParseParentheses.hs
nrofpars :: Parentheses1 -> Int
nrofpars (Match1 cl pl cr pr) = nrofpars pl + nrofpars pr + 2
nrofpars Empty1 = 0
Since the values cl and cr in the case for Match1 are not used (and should never
be used) by functions defined on Parentheses1, we use the following datatype for
the abstract syntax for the grammar of parentheses.
data Parentheses = Match Parentheses Parentheses
| Empty
Now we can add semantic functions to the parser. Thus, we get the definition of
parens in listing 4.
By varying the function used before <$> (the semantic function), we can return
other things than parse trees. As an example we construct a parser that calculates
the nesting depth of nested parentheses, see the function nesting defined in listing 4.
A session in which nesting is used may look like this:
? nesting "()(())()"
[(2,[]), (2,"()"), (1,"(())()"), (0,"()(())()")]
? nesting "())"
[(1,")"), (0,"())")]
As you can see, when there is a syntax error in the argument, there are no solutions
with empty rest string. It is fairly simple to test whether a given string belongs to
the language that is parsed by a given parser.
Exercise 3.17 What is the type of the function f which appears in function parens in listing 4?
What is the type of the parser open? Using the type of <$>, what is the type of
f <$> open? Can f <$> open be used as a left hand side of <*> parens? What
is the type of the result?
Exercise 3.18 What is a convenient way for <*> to associate? Does it?
Exercise 3.19 Write a function test that determines whether or not a given string belongs
to the language parsed by a given parser.
3.4 More parser combinators
In principle you can build parsers for any context-free language using the combina-
tors <*> and <|>, but in practice it is easier to have some more parser combinators
available. In traditional grammar formalisms, additional symbols are used to de-
scribe for example optional or repeated constructions. Consider for example the
BNF formalism, in which originally only sequential and alternative composition can
be used (denoted by juxtaposition and vertical bars, respectively), but which was
later extended to EBNF to also allow for repetition, denoted by a star. The goal of
this section is to show how the set of parser combinators can be extended.
3.4.1 Parser combinators for EBNF
It is very easy to make new parser combinators for EBNF. As a rst example we
consider repetition. Given a parser for a construction, many constructs a parser for
zero or more occurrences of that construction:
many :: Parser s a -> Parser s [a]
many p = list <$> p <*> many p
<|> succeed []
So the EBNF expression P∗ is implemented by many P. The auxiliary function list
takes an element and a list, and combines them in a simple list:
list x xs = x:xs
So list is just (:), and we might have written that in the definition of many, but
then the definition might have looked too cryptic at first sight.
The order in which the alternatives are given only influences the order in which
solutions are placed in the list of successes.
For example, the many combinator can be used in parsing a natural number:
natural :: Parser Char Int
natural = foldl f 0 <$> many newdigit
  where f a b = a*10 + b
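A session shows the resulting list of successes, with the greedy parse first and the
empty parse last:
? natural "123abc"
[(123,"abc"), (12,"3abc"), (1,"23abc"), (0,"123abc")]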
Defined in this way, the natural parser also accepts empty input as a number. If
this is not desired, we had better use the many1 parser combinator, which accepts
one or more occurrences of a construction, and corresponds to the EBNF expression
P+, see Section 2.7. It is defined in listing 5.
Another combinator from EBNF is the option combinator P?. It takes a parser as
argument, and returns a parser that recognises the same construct, but which also
succeeds if that construct is not present in the input string. The definition is given
-- EBNF parser combinators
option :: Parser s a -> a -> Parser s a
option p d = p <|> succeed d
many :: Parser s a -> Parser s [a]
many p = list <$> p <*> many p <|> succeed []
many1 :: Parser s a -> Parser s [a]
many1 p = list <$> p <*> many p
pack :: Parser s a -> Parser s b -> Parser s c -> Parser s b
pack p r q = (\x y z -> y) <$> p <*> r <*> q
listOf :: Parser s a -> Parser s b -> Parser s [a]
listOf p s = list <$> p <*> many ((\x y -> y) <$> s <*> p)
-- Auxiliary functions
first :: Parser s b -> Parser s b
first p xs | null r = []
| otherwise = [head r]
where r = p xs
greedy, greedy1 :: Parser s b -> Parser s [b]
greedy = first . many
greedy1 = first . many1
list x xs = x:xs
Listing 5: EBNF.hs
in listing 5. It has an additional argument: the value that should be used as result
in case the construct is not present. It is a kind of default value.
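A typical use is an optional minus sign in front of a natural number. A sketch of an
integer parser (the name integer is ours here), where the default is the identity
function:
-- Sketch: an optional '-' turns into negate; otherwise the number is unchanged.
integer :: Parser Char Int
integer = option (const negate <$> symbol '-') id <*> natural
On the input "-12", the first success of integer is (-12,"").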
By the use of the option and many functions, a large number of backtracking possi-
bilities is introduced. This is not always advantageous. For example, if we define
a parser for identifiers by
identifier = many1 (satisfy isAlpha)
a single word may also be parsed as two identifiers. Caused by the order of the
alternatives in the definition of many (succeed [] appears as the second alternative),
the greedy parsing, which accumulates as many letters as possible in the identifier,
is tried first; but if parsing fails elsewhere in the sentence, the less greedy parsings
of the identifier are also tried, in vain. You will give a better definition of identifier
in Exercise 3.26.
In situations where from the way the grammar is built we can predict that it is
hopeless to try non-greedy results of many, we can define a parser transformer first,
that transforms a parser into a parser that only returns the rst possible parsing.
It does so by taking the rst element of the list of successes.
first :: Parser a b -> Parser a b
first p xs | null r = []
| otherwise = [head r]
where r = p xs
Using this function, we can create a special 'take all or nothing' version of many:
greedy = first . many
greedy1 = first . many1
If we compose the first function with the option parser combinator:
obligatory p d = first (option p d)
we get a parser which must accept a construction if it is present, but which does not
fail if it is not present.
3.4.2 Separators
The combinators many, many1 and option are classical in compiler construction:
there are notations for them in EBNF (*, + and ?, respectively), but there is no need
to leave it at that. For example, in many languages constructions are frequently
enclosed between two meaningless symbols, most often some sort of parentheses.
For this case we design a parser combinator pack. Given a parser for an opening
token, a body, and a closing token, it constructs a parser for the enclosed body, as
defined in listing 5. Special cases of this combinator are:

parenthesised p = pack (symbol '(') p (symbol ')')
bracketed p     = pack (symbol '[') p (symbol ']')
compound p      = pack (token "begin") p (token "end")
-- Chain expression combinators
chainr :: Parser s a -> Parser s (a -> a -> a) -> Parser s a
chainr pe po = h <$> many (j <$> pe <*> po) <*> pe
where j x op = (x `op`)
h fs x = foldr ($) x fs
chainl :: Parser s a -> Parser s (a -> a -> a) -> Parser s a
chainl pe po = h <$> pe <*> many (j <$> po <*> pe)
where j op x = (`op` x)
h x fs = foldl (flip ($)) x fs
Listing 6: Chains.hs
Another frequently occurring construction is repetition of a certain construction,
where the elements are separated by some symbol. You may think of lists of
arguments (expressions separated by commas), or compound statements (statements
separated by semicolons). For the parse trees, the separators are of no importance.
The function listOf below generates a parser for a (possibly empty) list, given a
parser for the items and a parser for the separators:

listOf :: Parser s a -> Parser s b -> Parser s [a]
listOf p s = list <$> p <*> many ((\x y -> y) <$> s <*> p)
Useful instantiations are:
commaList, semicList :: Parser Char a -> Parser Char [a]
commaList p = listOf p (symbol ',')
semicList p = listOf p (symbol ';')
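For example (a small check, again in the list-of-successes representation):

commaList (symbol 'a') "a,a"   -- [("aa",""),("a",",a")]

Note that the separators indeed leave no trace in the result.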
A somewhat more complicated variant of the function listOf is the case where the
separators carry a meaning themselves. For example, in arithmetical expressions
the operators that separate the subexpressions have to be part of the parse
tree. For this case we will develop the functions chainr and chainl. These functions
expect that the parser for the separators returns a function (!); that function is used
by the chain combinators to combine parse trees for the items. In the case of chainr
the operator is applied right-to-left, in the case of chainl it is applied left-to-right.
The functions chainr and chainl are defined in listing 6 (remember that $ is
function application: f $ x = f x).
The definitions look quite complicated, but when you look at the underlying
grammar they are quite straightforward. Suppose we apply operator ⊕ (⊕ is an
operator variable; it denotes an arbitrary right-associative operator) from right to
left, so

    e1 ⊕ e2 ⊕ e3 ⊕ e4
  =   e1 ⊕ (e2 ⊕ (e3 ⊕ e4))
  =   ((e1 ⊕) . (e2 ⊕) . (e3 ⊕)) e4

It follows that we can parse such expressions by parsing many pairs of expressions
and operators, turning them into functions, and applying all those functions to the
last expression. This is done by function chainr, see listing 6.
If operator ⊕ is applied from left to right, then

    e1 ⊕ e2 ⊕ e3 ⊕ e4
  =   ((e1 ⊕ e2) ⊕ e3) ⊕ e4
  =   ((⊕ e4) . (⊕ e3) . (⊕ e2)) e1

So such an expression can be parsed by first parsing a single expression (e1), and
then parsing many pairs of operators and expressions, turning them into functions,
and applying all those functions to the first expression. This is done by function
chainl, see listing 6.
Functions chainl and chainr can be made more efficient by avoiding the
construction of the intermediate list of functions. The resulting definitions can be
found in [5].
Note that functions chainl and chainr are very similar; the only difference is that
everything is turned around: function j of chainr takes a value and an operator,
and returns the function obtained by applying the operator to that value on the
left; function j of chainl takes an operator and a value, and returns the function
obtained by applying the operator to that value on the right. Such functions are
sometimes called dual.
Exercise 3.20 What is the value of

many (symbol 'a') xs

for xs ∈ { [], ['a'], ['b'], ['a','b'], ['a','a','b'] }?
Exercise 3.21 Consider the application of the parser many (symbol 'a') to the string "aaa".
In what order do the four possible parsings appear in the list of successes?
Exercise 3.22 Using the parser combinators option, many and many1, define parsers for each
of the basic languages defined in Section 2.3.1.
Exercise 3.23 As another variation on the theme 'repetition', define a parser combinator
psequence that transforms a list of parsers for some type into a parser returning
a list of elements of that type. What is the type of psequence? Also define a
combinator choice that iterates the operator <|>.
Exercise 3.24 As an application of psequence, define the function token that was discussed
in Section 3.2.
Exercise 3.25 Carefully analyse the semantic functions in the definition of chainl in listing 6.
Exercise 3.26 In real programming languages, identifiers follow more flexible rules: the first
symbol must be a letter, but the symbols that follow (if any) may be a letter, digit,
or underscore symbol. Define a more realistic parser identifier.
3.5 Arithmetical expressions
The goal of this section is to use parser combinators in a concrete application. We
will develop a parser for arithmetical expressions, which have the following concrete
syntax:
E → E + E | E - E | E * E | E / E | (E) | Digs

Besides these productions, we also have productions for identifiers and applications
of functions:

E → Identifier | Identifier (LoA)
LoA → ε | E (,E)*
The parse trees for this grammar are of type Expr:
data Expr = Con Int
| Var String
| Fun String [Expr]
| Expr :+: Expr
| Expr :-: Expr
| Expr :*: Expr
| Expr :/: Expr
You can almost recognise the structure of the parser in this type definition. But
in order to account for the priorities of the operators, we will use a grammar with
three nonterminals expression, term and factor: an expression is composed of
terms separated by + or -; a term is composed of factors separated by * or /; and
a factor is a constant, variable, function call, or expression between parentheses.
This grammar appears as a parser in the functions in listing 7. The first parser,
fact, parses factors.
fact :: Parser Char Expr
fact = Con <$> integer
<|> Var <$> identifier
<|> Fun <$> identifier <*> parenthesised (commaList expr)
<|> parenthesised expr
The first alternative is an integer parser which is postprocessed by the semantic
function Con. The second and third alternatives are a variable or function call,
depending on the presence of an argument list. In the absence of the latter, the
function Var is applied; in its presence, the function Fun. For the fourth alternative
there is no semantic function, because the meaning of an expression between
parentheses is the meaning of the expression.
For the definition of a term as a list of factors separated by multiplicative operators
we use the function chainr. Recall that chainr repeatedly recognises its first
argument (fact), separated by its second argument (a * or a /). The parse trees for
the individual factors are joined by the constructor functions that appear before <$>.
The function expr is analogous to term, only with additive operators instead of
multiplicative operators, and with terms instead of factors.
-- Type definition for parse tree
data Expr = Con Int
| Var String
| Fun String [Expr]
| Expr :+: Expr
| Expr :-: Expr
| Expr :*: Expr
| Expr :/: Expr
-------------------------------------------------------------
-- Parser for expressions with two priorities
fact :: Parser Char Expr
fact = Con <$> integer
<|> Var <$> identifier
<|> Fun <$> identifier <*> parenthesised (commaList expr)
<|> parenthesised expr
integer :: Parser Char Int
integer = (const negate <$> symbol '-') `option` id <*> natural
term :: Parser Char Expr
term = chainr fact
( const (:*:) <$> symbol '*'
<|> const (:/:) <$> symbol '/'
)
expr :: Parser Char Expr
expr = chainr term
( const (:+:) <$> symbol '+'
<|> const (:-:) <$> symbol '-'
)
Listing 7: ExpressionParser.hs
This example clearly shows the strength of parsing with parser combinators. There is
no need for a separate formalism for grammars: the production rules of the grammar
are combined with higher-order functions. Also, there is no need for a separate
parser generator (like yacc): the functions can be viewed both as a description of
the grammar and as an executable parser.
Exercise 3.27 Modify the functions in listing 7 in such a way that + is parsed as a right-associative
operator, and - is parsed as a left-associative operator.
3.6 Generalised expressions
This section generalises the parser in the previous section with respect to priorities.
Arithmetical expressions in which operators have more than two levels of priority
can be parsed by writing more auxiliary functions between term and expr. The
function chainr is used in each definition, with as its first argument the function of
one priority level lower.
If there are nine levels of priority, we obtain nine copies of almost the same text.
This is not as it should be. Functions that resemble each other are an indication
that we should write a generalised function, where the differences are described
using extra arguments. Therefore, let us inspect the differences in the definitions of
term and expr again. These are:
- the operators and associated tree constructors that are used in the second argument of chainr;
- the parser that is used as first argument of chainr.
The generalised function will take these two differences as extra arguments: the first
in the form of a list of pairs, the second in the form of a parse function:
type Op a = (Char,a -> a -> a)
gen :: [Op a] -> Parser Char a -> Parser Char a
gen ops p = chainr p (choice (map f ops))
where f (s,c) = const c <$> symbol s
If furthermore we define as shorthand:

multis = [ ('*',(:*:)), ('/',(:/:)) ]
addis = [ ('+',(:+:)), ('-',(:-:)) ]
then expr and term can be defined as partial parametrisations of gen:

expr = gen addis term
term = gen multis fact

By expanding the definition of term in that of expr we obtain:

expr = addis `gen` (multis `gen` fact)

which an experienced functional programmer immediately recognises as an application
of foldr:
-- Parser for expressions with arbitrarily many priorities
type Op a = (Char,a -> a -> a)
fact :: Parser Char Expr
fact = Con <$> integer
<|> Var <$> identifier
<|> Fun <$> identifier <*> parenthesised (commaList expr)
<|> parenthesised expr
gen :: [Op a] -> Parser Char a -> Parser Char a
gen ops p = chainr p (choice (map f ops))
where f (s,c) = const c <$> symbol s
expr :: Parser Char Expr
expr = foldr gen fact [addis, multis]
multis = [ ('*',(:*:)), ('/',(:/:)) ]
addis = [ ('+',(:+:)), ('-',(:-:)) ]
Listing 8: GExpressionParser.hs
expr = foldr gen fact [addis, multis]
From this definition, a generalisation to more levels of priority is simply a matter of
extending the list of operator lists.
The very compact formulation of the parser for expressions with an arbitrary number
of priority levels is possible because the parser combinators can be used together with
the existing mechanisms for generalisation and partial parametrisation in Haskell.
Contrary to conventional approaches, the levels of priority need not be coded
explicitly with integers. The only thing that matters is the relative position of an
operator in the list of lists of operators of the same priority. Also, the insertion of
new priority levels is very easy. The definitions are summarised in listing 8.
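For example, if the datatype Expr were extended with a constructor Expr :^: Expr
for exponentiation (an assumption; listing 8 does not contain it), a strongest-binding,
right-associative exponentiation level would be one extra element in the list:

expos = [ ('^',(:^:)) ]
expr  = foldr gen fact [addis, multis, expos]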
summary
This chapter shows how to construct parsers from simple combinators. It shows
how a small parser combinator library can be a powerful tool in the construction of
parsers. Furthermore, this chapter gives a rather basic implementation of the parser
combinator library. More advanced implementations are discussed elsewhere.
3.7 Exercises
Exercise 3.28 Prove the following laws
1. h <$> (f <$> p) = (h.f) <$> p
2. h <$> (p <|> q) = (h <$> p) <|> (h <$> q)
3. h <$> (p <*> q) = ((h.) <$> p) <*> q
Exercise 3.29 Consider your answer to Exercise 2.22. Define a combinator parser pMir that
transforms a concrete representation of a mirror-palindrome into an abstract one.
Test your function with the concrete mirror-palindromes cMir1 and cMir2.
Exercise 3.30 Consider your answer to Exercise 2.24. Assuming the comma is an associative
operator, we can give the following abstract syntax for bit-lists:
data BitList = SingleB Bit | ConsB Bit BitList
Define a combinator parser pBitList that transforms a concrete representation of a
bit-list into an abstract one. Test your function with the concrete bit-lists cBitList1
and cBitList2.
Exercise 3.31 Define a parser for fixed-point numbers, that is, numbers like 12.34 and 123.456.
Integers are acceptable as well. Notice that the part following the decimal point looks
like an integer, but has a different semantics!
Exercise 3.32 Define a parser for floating-point numbers, which are fixed-point numbers
followed by an optional E and a (positive or negative, integer) exponent.
Exercise 3.33 Define a parser for Java assignments that consist of a variable, an = sign, an
expression and a semicolon.
Exercise 3.34 Define a parser for (simplified) Java statements.
Exercise 3.35 Outline the construction of a parser for Java programs.
Chapter 4
Grammar and Parser design
The previous chapters have introduced many concepts related to grammars and
parsers. The goal of this chapter is to review these concepts, and to show how they
are used in the design of grammars and parsers.
The design of a grammar and parser for a language consists of several steps: you
have to
1. give a grammar for the language for which you want to have a parser;
2. analyse this grammar to find out whether or not it has some desirable properties;
3. possibly transform the grammar to obtain some of these desirable properties;
4. decide on the type of the parser: Parser a b, that is, decide on both the
input type a of the parser (which may be the result type of a scanner), and
the result type b of the parser.
5. construct a basic parser;
6. add semantic functions;
7. check whether or not you have obtained what you expected.
We will describe and exemplify each of these steps in detail in the rest of this chapter.
As a running example we will construct a grammar and parser for travelling schemes
for day trips, of the following form:
Groningen 8:37 9:44 Zwolle 9:49 10:15 Utrecht 10:21 11:05 Den Haag
We might want to do several things with such a schema, for example:
1. compute the net travel time, i.e. the travel time minus the waiting time (2 hours
and 17 minutes in the above example);
2. compute the total time one has to wait on the intermediate stations (11 minutes).
This chapter defines functions to perform these computations.
4.1 Step 1: A grammar for the language
The starting point for designing a parser for your language is to define a grammar
that describes the language as precisely as possible. It is important to convince
yourself of the fact that the grammar you give really generates the desired language,
since the grammar will be the basis for grammar transformations, which might turn
the grammar into a set of incomprehensible productions.
For the language of travelling schemes, we can give several grammars. The following
grammar focuses on the fact that a trip consists of zero or more departures and
arrivals.
TS → TS Departure Arrival TS | Station
Station → Identifier
Departure → Time
Arrival → Time
Time → Nat : Nat
where Identifier and Nat have been defined in Section 2.3.1. So a travelling scheme
is a sequence of departure and arrival times, separated by stations. Note that a
single station is also a travelling scheme with this grammar.
Another grammar focuses on changing at a station:

TS → Station Departure (Arrival Station Departure)* Arrival Station
   | Station

So each travelling scheme starts and ends at a station, and in between there is a list
of intermediate stations.
4.2 Step 2: Analysing the grammar
To parse sentences of a language efficiently, we want to have an unambiguous
grammar that is left-factored and not left recursive. Depending on the parser we
want to obtain, we might desire other properties of our grammar. So a first step in
designing a parser is analysing the grammar, and determining which properties are
(not) satisfied. We have not yet developed tools for grammar analysis (we will do so
in the chapter on LL(1) parsing), but for some grammars it is easy to detect some
properties.
The first example grammar is left and right recursive: the first production for TS
starts and ends with TS. Furthermore, the sequence Departure Arrival is an
associative separator in the generated language.
These properties may be used for transforming the grammar. Since we don't mind
about right recursion, we will not make use of the fact that the grammar is right
recursive. The other properties will be used in grammar transformations in the
following subsection.
4.3 Step 3: Transforming the grammar
Since the sequence Departure Arrival is an associative separator in the generated
language, the productions for TS may be transformed into:

TS → Station | Station Departure Arrival TS    (4.1)

Thus we have removed the left recursion in the grammar. Both productions for
TS start with the nonterminal Station, so TS can be left-factored. The resulting
productions are:

TS → Station Z
Z → ε | Departure Arrival TS

We can also apply equivalence (2.1) to the two productions for TS from (4.1), and
obtain the following single production:

TS → (Station Departure Arrival)* Station    (4.2)
So which productions do we take for TS? This depends on what we want to do with
the parsed sentences. We will show several choices in the next section.
4.4 Step 4: Deciding on the types
We want to write a parser for travelling schemes, that is, we want to write a function
ts of type

ts :: Parser ? ?

The question marks should be replaced by the input type and the result type,
respectively. For the input type we can choose between at least two possibilities:
characters (Char) or tokens (Token). The type of tokens can be chosen as follows:
data Token = Station_Token Station | Time_Token Time
type Station = String
type Time = (Int,Int)
We will construct a parser for both input types in the next subsection. So ts has
one of the following two types.
ts :: Parser Char ?
ts :: Parser Token ?
For the result type we have many choices. If we just want to compute the total
travelling time, Int suffices for the result type. If we want to compute the total
travelling time, the total waiting time, and a nicely printed version of the travelling
scheme, we may do several things:
- define three parsers, with Int (total travelling time), Int (total waiting time), and String (nicely printed version) as result type, respectively;
- define a single parser with the triple (Int,Int,String) as result type;
- define an abstract syntax for travelling schemes, say a datatype TS, and define three functions on TS that compute the desired results.
The first alternative parses the input three times, and is rather inefficient compared
with the other alternatives. The second alternative is hard to extend if we want
to compute something extra, but in some cases it might be more efficient than the
third alternative. The third alternative needs an abstract syntax. There are several
ways to define an abstract syntax for travelling schemes. The first abstract syntax
corresponds to definition (4.1) of grammar TS.
data TS1 = Single1 Station
| Cons1 Station Time Time TS1
where Station and Time are defined above. A second abstract syntax corresponds
to the grammar for travelling schemes defined in (4.2):
type TS2 = ([(Station,Time,Time)],Station)
So a travelling scheme is a tuple, the first component of which is a list of triples
consisting of a departure station, a departure time, and an arrival time, and the
second component of which is the final arrival station. A third abstract syntax
corresponds to the second grammar defined in Section 4.1:
data TS3 = Single3 Station
| Cons3 (Station,Time,[(Time,Station,Time)],Time,Station)
Which abstract syntax should we take? Again, this depends on what we want to do
with the abstract syntax. Since TS1 and TS2 combine departure and arrival times in
a tuple, they are convenient to use when computing travelling times. TS3 is useful
when we want to compute waiting times, since it combines arrival and departure
times in one constructor. Often we want to mimic the productions of the grammar
exactly in the abstract syntax, so if we use (4.1) for the grammar for travelling
schemes, we use TS1 for the abstract syntax. Note that TS1 is a datatype, whereas
TS2 is a type. TS1 cannot be defined as a type because of the two alternative
productions for TS. TS2 can be defined as a datatype by adding a constructor.
Types and datatypes each have their advantages and disadvantages; the application
determines which to use. The result type of the parsing function ts may be one of
the types mentioned earlier (Int, etc.), or one of TS1, TS2, TS3.
4.5 Step 5: Constructing the basic parser
Converting a grammar to a parser is a mechanical process that consists of a set
of simple replacement rules. Functional programming languages offer some extra
flexibility that we sometimes use, but usually writing a parser is a simple translation.
We use the following replacement rules.
→                                 =
|                                 <|>
(space)                           <*>
+                                 many1
*                                 many
?                                 option
terminal x                        symbol x
begin of sequence of symbols      undefined <$>
Note that we start each sequence of symbols with undefined <$>. The undefined has
to be replaced by an appropriate semantic function in Step 6, but putting undefined
here ensures type correctness of the parser. Of course, running the parser as it is
will result in an error.
We construct a basic parser for each of the input types Char and Token.
4.5.1 Basic parsers from strings
Applying these rules to the grammar (4.2) for travelling schemes, we obtain the
following basic parser.
station :: Parser Char Station
station = undefined <$> identifier
time :: Parser Char Time
time = undefined <$> natural <*> symbol ':' <*> natural
departure, arrival :: Parser Char Time
departure = undefined <$> time
arrival = undefined <$> time
tsstring :: Parser Char ?
tsstring = undefined <$>
many (undefined <$>
spaces
<*> station
<*> spaces
<*> departure
<*> spaces
<*> arrival
)
<*> spaces
<*> station
spaces :: Parser Char String
spaces = undefined <$> many (symbol ' ')
The only thing left to do is to add the semantic glue to the functions. The semantic
glue also determines the type of the function tsstring, which is denoted by ? for
the moment. For the other basic parsers we have chosen some reasonable return
types. The semantic functions are defined in the next and final step.
4.5.2 A basic parser from tokens
To obtain a basic parser from tokens, we first write a scanner that produces a list
of tokens from a string.
scanner :: String -> [Token]
scanner = map mkToken . words
mkToken :: String -> Token
mkToken xs = if isDigit (head xs)
then Time_Token (mkTime xs)
else Station_Token (mkStation xs)
parse_result :: [(a,b)] -> a
parse_result xs
| null xs = error "parse_result: could not parse the input"
| otherwise = fst (head xs)
mkTime :: String -> Time
mkTime = parse_result . time
mkStation :: String -> Station
mkStation = parse_result . station
This is a basic scanner with very basic error messages, but it suffices for now. The
composition of the scanner with the function tstoken1 defined below gives the final
parser.
tstoken1 :: Parser Token ?
tstoken1 = undefined <$>
many (undefined <$>
tstation
<*> tdeparture
<*> tarrival
)
<*> tstation
tstation :: Parser Token Station
tstation (Station_Token s:xs) = [(s,xs)]
tstation _ = []
tdeparture, tarrival :: Parser Token Time
tdeparture (Time_Token (h,m):xs) = [((h,m),xs)]
tdeparture _ = []
tarrival (Time_Token (h,m):xs) = [((h,m),xs)]
tarrival _ = []
where again the semantic functions remain to be defined. Note that the functions
tdeparture and tarrival are identical; their separate presence reflects their
presence in the grammar.
Another basic parser from tokens is based on the second grammar of Section 4.1.
tstoken2 :: Parser Token Int
tstoken2 = undefined <$>
tstation
<*> tdeparture
<*> many (undefined <$>
tarrival
<*> tstation
<*> tdeparture
)
<*> tarrival
<*> tstation
<|> undefined <$> tstation
4.6 Step 6: Adding semantic functions
Once we have the basic parsing functions, we need to add the semantic glue: the
functions that take the results of the elements in the right-hand side of a production,
and convert them into the result of the left-hand side. The basic rule is: let the
types do the work!
First we add semantic functions to the basic parsing functions station, time,
departure, arrival, and spaces. Since function identifier already returns a
string, we can take the identity function id for undefined in function station.
Since id <$> is the identity function, it can be omitted. To obtain a value of type
Time from an integer, a character, and an integer, we have to combine the two
integers in a tuple. So we take the function

\x y z -> (x,z)

for undefined in time. Now, since function time returns a value of type Time, we
can take the identity function for undefined in departure and arrival, and then
we replace id <$> time by just time. Finally, the result of many is a string, so for
undefined in spaces we can take the identity function too.
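Filled in as just described, the first basic parsers read:

station :: Parser Char Station
station = identifier

time :: Parser Char Time
time = (\x y z -> (x,z)) <$> natural <*> symbol ':' <*> natural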
The first semantic function for the basic parser tsstring defined in Section 4.5.1
returns an abstract syntax tree of type TS2. So the first undefined in tsstring
should return a tuple of a list of things of the correct type (the first component of
the type TS2) and a Station. Since many returns a list of things, we can construct
such a tuple by means of the function

\x y z -> (x,z)

provided many returns a value of the desired type: [(Station,Time,Time)]. Note
that this semantic function basically only throws away the value returned by the
spaces parser: we are not interested in the spaces between the components of our
travelling scheme. The many parser returns a value of the correct type if we replace
the second occurrence of undefined in tsstring by the function

\u v w x y z -> (v,x,z)
Again, the results of spaces are thrown away. This completes a parser for travelling
schemes.
The next semantic functions we define compute the net travel time. To compute the
net travel time, we have to compute the travel time of each trip from one station to
the next, and add the travel times of all of these trips. We obtain the travel time
of a single trip if we replace the second occurrence of undefined by:

\u v w (xh,xm) y (zh,zm) -> (zh-xh)*60 + zm-xm

and Haskell's prelude function sum sums these times, so for the first occurrence of
undefined we take:

\x y z -> sum x
The final set of semantic functions we define is used for computing the total waiting
time. Since the second grammar of Section 4.1 combines arrival times and departure
times, we use a parser based on this grammar: the basic parser tstoken2. We have
to give definitions of the three undefined semantic functions. If a trip consists of
a single station, there is no waiting time, so the last occurrence of undefined is
the function const 0. The second occurrence of undefined computes the waiting
time at one intermediate station:

\(uh,um) v (wh,wm) -> (wh-uh)*60 + wm-um

Finally, the first occurrence of undefined sums the list of waiting times obtained by
means of the function that replaces the second occurrence of undefined:

\s t x y z -> sum x
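With these three functions in place, tstoken2 is complete, and composing it with
the scanner gives a function computing the total waiting time (a sketch; the name
waitingTime is introduced here for illustration, and parse_result was defined in
Section 4.5.2):

tstoken2 :: Parser Token Int
tstoken2 = (\s t x y z -> sum x) <$>
           tstation
           <*> tdeparture
           <*> many ((\(uh,um) v (wh,wm) -> (wh-uh)*60 + wm-um) <$>
                     tarrival
                     <*> tstation
                     <*> tdeparture
                    )
           <*> tarrival
           <*> tstation
           <|> const 0 <$> tstation

waitingTime :: String -> Int
waitingTime = parse_result . tstoken2 . scanner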
4.7 Step 7: Did you get what you expected?
In the last step you test your parser(s) to see whether or not you have obtained
what you expected, and whether or not you have made errors in the above process.
summary
This chapter describes the different steps that have to be considered in the design
of a grammar and a parser for a language.
4.8 Exercises
Exercise 4.1 Write a parser for Java float-literals. The EBNF grammar for float-literals is
given by:

Digits → Digits Digit | Digit
FloatLiteral → IntPart . FractPart? ExponentPart? FloatSuffix?
             | . FractPart ExponentPart? FloatSuffix?
             | IntPart ExponentPart FloatSuffix?
             | IntPart ExponentPart? FloatSuffix
IntPart → SignedInteger
FractPart → Digits
ExponentPart → ExponentIndicator SignedInteger
SignedInteger → Sign? Digits
ExponentIndicator → e | E
Sign → + | -
FloatSuffix → f | F | d | D
To keep your parser simple, assume that all nonterminals, except for the nonterminal
FloatLiteral, are represented by a String in the abstract syntax.
Exercise 4.2 Write an evaluator for Java float-literals (the float-suffix may be ignored).
Exercise 4.3 Up to the definition of the semantic functions, parsers constructed on a (fixed)
abstract syntax have the same shape. Give this parsing scheme for Java float-literals.
Chapter 5
Regular Languages
introduction
The first phase of a compiler takes an input program, and splits the input into a
list of terminal symbols: keywords, identifiers, numbers, punctuation, etc. Regular
expressions are used for the description of the terminal symbols. A regular
grammar is a particular kind of context-free grammar that can be used to describe
regular expressions. Finite-state automata can be used to recognise sentences of
regular grammars. This chapter discusses all of these concepts, and is organised
as follows. Section 5.1 introduces finite-state automata. Finite-state automata
appear in two versions, nondeterministic and deterministic ones. Section 5.1.4 shows
that a nondeterministic finite-state automaton can be transformed into a
deterministic finite-state automaton, so you don't have to worry about whether or not your
automaton is deterministic. Section 5.2 introduces regular grammars (context-free
grammars of a particular form), and regular languages. Furthermore, it shows their
equivalence with finite-state automata. Section 5.3 introduces regular expressions as
finite descriptions of regular languages and shows that such expressions are another,
equivalent, tool for regular languages. Finally, Section 5.4 gives some of the proofs
of the results of the previous sections.
goals
After you have studied this chapter you will know that
- regular languages are a subset of context-free languages;
- it is not always possible to give a regular grammar for a context-free grammar;
- regular grammars, finite-state automata and regular expressions are three different tools for regular languages;
- regular grammars, finite-state automata and regular expressions have the same expressive power;
- finite-state automata appear in two, equally expressive, versions: deterministic and nondeterministic.
5.1 Finite-state automata
The classical approach to recognising sentences from a regular language uses
finite-state automata. A finite-state automaton can be viewed as a simple form of digital
computer with only a finite number of states, no temporary storage, an input file
that can only be read, and a control unit which records the state transitions. A
rather limited medium, but a useful tool in many practical subproblems. A
finite-state automaton can easily be implemented by a function that takes time linear in
the length of its input, and constant space. This implies that problems that can be
solved by means of a finite-state automaton can be implemented by means of very
efficient programs.
5.1.1 Deterministic finite-state automata
Finite-state automata come in two flavours: deterministic and nondeterministic. We
start with a description of deterministic finite-state automata, the simplest form of
automata.

Definition 1: Deterministic finite-state automaton, DFA
A deterministic finite-state automaton (DFA) is a 5-tuple (X, Q, d, S, F) where
- X is the input alphabet,
- Q is a finite set of states,
- d :: Q → X → Q is the state transition function,
- S ∈ Q is the start state,
- F ⊆ Q is the set of accepting states.
□
As an example, consider the DFA M0 = (X, Q, d, S, F) with

X = {a, b, c}
Q = {S, A, B, C}
F = {C}

where state transition function d is defined by

d S a = C
d S b = A
d S c = S
d A a = B
d B c = C
For human beings, a finite-state automaton is more comprehensible in a graphical
representation. The following representation is customary: states are depicted as
the nodes in a graph; accepting states get a double circle; start states are explicitly
mentioned or indicated otherwise. The transition function is represented by the
edges: whenever d Qi x is a state Qj, then there is an arrow labelled x from Qi
to Qj. The input alphabet is implicit in the labels. For automaton M0 above, the
pictorial representation is:
[Diagram: states S, A, B and C, with C doubly circled (accepting); arrows S -a-> C, S -b-> A, A -a-> B, B -c-> C, and a c-labelled loop on S]
Note that d is a partial function: for example, d B a is not defined. We can make d
into a total function by introducing a new sink state, the result state of all undefined
transitions. For example, in the above automaton we can introduce a sink state D
with d D x = D for all terminals x, and d E x = D for all states E and terminals
x for which d E x is undefined. The sink state and the transitions from/to it are
almost always omitted.
The action of a DFA on an input string is described as follows: given a sequence
w of input symbols, w can be processed symbol by symbol (from left to right),
and depending on the specific input symbol the DFA (initially in the start
state) moves to the state as determined by its state transition function. If no move
is possible, the automaton blocks. When the complete input has been processed
and the DFA is in one of its accepting states, then we say that w is accepted by the
automaton.
To illustrate the action of a DFA, we will show that the sentence bac is accepted
by M0. We do so by recording the successive configurations, i.e. the pairs of current
state and remaining input values:

(S, bac) ⇒ (A, ac) ⇒ (B, c) ⇒ (C, ε)
Because of the deterministic behaviour of a DFA, the definition of acceptance by
a DFA is relatively easy. Informally, a sequence w ∈ X* is accepted by a DFA
(X, Q, d, S, F) if it is possible, when starting the DFA in S, to end in an accepting
state after processing w. This operational description of acceptance is formalised in
the predicate dfa accept. The predicate will be derived in a top-down fashion, i.e.
we formulate the predicate in terms of (smaller) subcomponents and afterwards
we give solutions to the subcomponents.
Suppose dfa is a function that reflects the behaviour of the DFA, i.e. a function
which, given a transition function, a start state and a string, returns the unique
state that is reached after processing the string starting from the start state. Then
the predicate dfa accept is defined by:

dfa accept :: X* → (Q → X → Q, Q, {Q}) → Bool
dfa accept w (d, S, F) = (dfa d S w) ∈ F
It remains to construct a definition of function dfa that takes a transition function,
a start state, and a list of input symbols, and reflects the behaviour of a DFA. The
definition of dfa is straightforward:

dfa :: (Q → X → Q) → Q → X* → Q
dfa d q ε = q
dfa d q (ax) = dfa d (d q a) x

Note that both the type and the definition of dfa match the pattern of the function
foldl, and it follows that we can write function dfa as a foldl:

dfa d q = foldl d q
Definition 2: Acceptance by a DFA
The sequence w ∈ X* is accepted by DFA (X, Q, d, S, F) if

dfa accept w (d, S, F)

where

dfa accept w (d, qs, fs) = dfa d qs w ∈ fs
dfa d qs = foldl d qs
□
Using the predicate dfa accept, the language of a DFA is defined as follows.

Definition 3: Language of a DFA
For DFA M = (X, Q, d, S, F), the language of M, Ldfa(M), is defined by

Ldfa(M) = {w ∈ X* | dfa accept w (d, S, F)}
□
5.1.2 Nondeterministic finite-state automata
This subsection introduces nondeterministic finite-state automata and defines their
semantics, i.e. the language of a nondeterministic finite-state automaton.
The transition function of a DFA returns a state, which implies that for all terminal
symbols x and for all states t there can only be one edge starting in t labelled with
x. Sometimes it is convenient to have two or more edges labelled with the same
terminal symbol from a state. In these cases one can use a nondeterministic
finite-state automaton. Nondeterministic finite-state automata are defined as follows.
Definition 4: Nondeterministic finite-state automaton, NFA
A nondeterministic finite-state automaton (NFA) is a 5-tuple (X, Q, d, Q0, F), where
- X is the input alphabet,
- Q is a finite set of states,
- d :: Q → X → {Q} is the state transition function,
- Q0 ⊆ Q is the set of start states,
- F ⊆ Q is the set of accepting states.
□
An NFA differs from a DFA in that there may be more than one start state and that
there may be more than one possible move for each state and input symbol. Here
is an example of an NFA:

[Diagram: the same automaton as before, but now with two a-labelled arrows leaving A: one to B and one back to S]

Note that this NFA is very similar to the DFA in the previous section: the only
difference is that there are two outgoing arrows labelled with a from state A. Thus
the DFA becomes an NFA.
Formally, this NFA is defined as M1 = (X, Q, d, Q0, F) with

X = {a, b, c}
Q = {S, A, B, C}
Q0 = {S}
F = {C}

where state transition function d is defined by

d S a = {C}
d S b = {A}
d S c = {S}
d A a = {S, B}
d B c = {C}

Again d is a partial function, which can be made total by adding d D x = ∅ for
all states D and all terminal symbols x for which d D x is undefined.
Since an NFA can make an arbitrary (nondeterministic) choice for one of its possible
moves, we have to be careful in defining what it means that a sequence is accepted
by an NFA. Informally, a sequence w ∈ X* is accepted by NFA (X, Q, d, Q0, F) if it
is possible, when starting the NFA in a state from Q0, to end in an accepting state
after processing w. This operational description of acceptance is formalised in the
predicate nfa accept.
Assume that we have a function, say nfa, which reflects the behaviour of the NFA,
that is, a function which, given a transition function, a set of start states and a string,
returns all possible states that can be reached after processing the string starting in
some start state. Then the predicate nfa accept can be expressed as

nfa accept :: X* → (Q → X → {Q}, {Q}, {Q}) → Bool
nfa accept w (d, Q0, F) = nfa d Q0 w ∩ F ≠ ∅
Now it remains to find a function nfa d qs of type X* → {Q} that reflects the
behaviour of the NFA. For lists of length 1 such a function, called deltas, is defined
by

deltas :: (Q → X → {Q}) → {Q} → X → {Q}
deltas d qs a = {r | q ∈ qs, r ∈ d q a}

The behaviour of the NFA on X-sequences of arbitrary length follows from this
one-step behaviour:

nfa :: (Q → X → {Q}) → {Q} → X* → {Q}
nfa d qs ε = qs
nfa d qs (ax) = nfa d (deltas d qs a) x

Again, it follows that nfa can be written as a foldl:

nfa d qs = foldl (deltas d) qs
This concludes the definition of predicate nfa accept. In summary, we have derived

Definition 5: Acceptance by an NFA
The sequence w ∈ X* is accepted by NFA (X, Q, d, Q0, F) if

nfa accept w (d, Q0, F)

where

nfa accept w (d, qs, fs) = nfa d qs w ∩ fs ≠ ∅
nfa d qs = foldl (deltas d) qs
deltas d qs a = {r | q ∈ qs, r ∈ d q a}
□
Using the nfa accept predicate, the language of an NFA is defined by

Definition 6: Language of an NFA
For NFA M = (X, Q, d, Q0, F), the language of M, Lnfa(M), is defined by

Lnfa(M) = {w ∈ X* | nfa accept w (d, Q0, F)}
□
Note that it is computationally expensive to determine whether or not a list is an
element of the language of a nondeterministic finite-state automaton. This is due to
the fact that all possible transitions have to be tried in order to determine whether or
not the automaton can end in an accepting state after reading the input. Determining
whether or not a list is an element of the language of a deterministic finite-state
automaton can be done in time linear in the length of the input list, so from a
computational view, deterministic finite-state automata are preferable. Fortunately, for
each nondeterministic finite-state automaton there exists a deterministic finite-state
automaton that accepts the same language. We will show how to construct a DFA
from an NFA in subsection 5.1.4.
5.1.3 Implementation
This section describes how to implement finite-state machines. We start with
implementing DFAs. Given a DFA M = (X, Q, d, S, F), we define two datatypes:
data StateM = ... deriving Eq
data SymbolM = ...
where the states of M (the elements of the set Q) are listed as constructors of
StateM, and the symbols of M (the elements of the set X) are listed as constructors
of SymbolM. Furthermore, we define three values (one of which is a function):
start :: StateM
delta :: SymbolM -> StateM -> StateM
finals :: [StateM]
Note that the first two arguments of delta have changed places: this has been
done in order to be able to apply partial evaluation later. The extended transition
function dfa and the accept function dfaAccept are now defined by:
dfa :: [SymbolM] -> StateM
dfa = foldl (flip delta) start
dfaAccept :: [SymbolM] -> Bool
dfaAccept xs = elem (dfa xs) finals
Given a list of symbols [x1,x2,...,xn], the computation of dfa [x1,x2,...,xn]
uses the following intermediate states:
start, delta x1 start, delta x2 (delta x1 start),...
This list of states is determined uniquely by the input [x1,x2,...,xn] and the
start state.
Since we want to use the same function names for different automata, we introduce
the following class:
class Eq a => DFA a b where
start :: a
delta :: b -> a -> a
finals :: [a]
dfa :: [b] -> a
dfa = foldl (flip delta) start
dfaAccept :: [b] -> Bool
dfaAccept xs = elem (dfa xs) finals
Note that the functions dfa and dfaAccept are defined once and for all for all
instances of the class DFA.
As an example, we give the implementation of the example DFA (called MEX here)
given in the previous subsection.
data StateMEX = A | B | C | S deriving Eq
data SymbolMEX = SA | SB | SC
So the state A is represented by A, and the symbol a is represented by SA, and
similarly for the other states and symbols. The automaton is made an instance of
class DFA as follows:
instance DFA StateMEX SymbolMEX where
start = S
delta x S = case x of SA -> C
SB -> A
SC -> S
delta SA A = B
delta SC B = C
finals = [C]
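As a check, we can trace the automaton on the sentence bac from Section 5.1.1 (a
sketch; with the multi-parameter class above, an explicit result type such as
dfa [SB,SA,SC] :: StateMEX may be needed to resolve the instance):

  dfa [SB,SA,SC]
= foldl (flip delta) S [SB,SA,SC]
= delta SC (delta SA (delta SB S))
= delta SC (delta SA A)
= delta SC B
= C

and C is an element of finals, so bac is accepted.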
We can improve the performance of the automaton (function) dfa by means of
partial evaluation. The main idea of partial evaluation is to replace computations
that are performed often at run-time by a single computation that is performed only
once at compile-time. A very simple example is the replacement of the expression
if True then f1 else f2 by the expression f1. Partial evaluation applies to finite
automata in the following way.
dfa [x1,x2,...,xn]
=
foldl (flip delta) start [x1,x2,...,xn]
=
foldl (flip delta) (delta x1 start) [x2,...,xn]
=
case x1 of
SA -> foldl (flip delta) (delta SA start) [x2,...,xn]
SB -> foldl (flip delta) (delta SB start) [x2,...,xn]
SC -> foldl (flip delta) (delta SC start) [x2,...,xn]
All these equalities are simple transformation steps for functional programs. Note
that the first argument of foldl is always flip delta, and the second argument is
one of the four states S, A, B, or C (the result of delta). Since there are only a finite
number of states (four, to be precise), we can define a transition function for each
state:

dfaS, dfaA, dfaB, dfaC :: [Symbol] -> State

Each of these functions is a case expression over the possible input symbols.
dfaS [] = S
dfaS (x:xs) = case x of SA -> dfaC xs
SB -> dfaA xs
SC -> dfaS xs
dfaA [] = A
dfaA (x:xs) = case x of SA -> dfaB xs
dfaB [] = B
dfaB (x:xs) = case x of SC -> dfaC xs
dfaC [] = C
With this definition of the finite automaton, the number of steps required for
computing the value of dfaS xs for some list of symbols xs is reduced considerably.
The implementation of NFAs is similar to the implementation of DFAs. The only
difference is that the transition and accept functions have to take care of sets (lists)
of states now. We will use the following class, in which we use some names that
also appear in the class DFA. This is a problem if the two classes appear in the same
module.
class Eq a => NFA a b where
start :: [a]
delta :: b -> a -> [a]
finals :: [a]
nfa :: [b] -> [a]
nfa = foldl (flip deltas) start
deltas :: b -> [a] -> [a]
deltas a = union . map (delta a)
nfaAccept :: [b] -> Bool
nfaAccept xs = intersect (nfa xs) finals /= []
Here, functions union and intersect are implemented as follows:
union :: Eq a => [[a]] -> [a]
union = nub . concat
nub :: Eq a => [a] -> [a]
nub = foldr (\x xs -> x:filter (/=x) xs) []
intersect :: Eq a => [a] -> [a] -> [a]
intersect xs ys = intersect' (nub xs)
  where intersect' =
          foldr (\x xs -> if x `elem` ys then x:xs else xs) []
5.1.4 Constructing a DFA from an NFA
Is it possible to express more languages by means of nondeterministic finite-state
automata than by deterministic finite-state automata? For each nondeterministic
automaton it is possible to give a deterministic finite-state automaton such that
both automata accept the same language, so the answer to the above question is
no. Before we give the formal proof of this claim, we illustrate the construction of a
DFA for an NFA in an example.
Consider the nondeterministic finite-state automaton corresponding with the
example grammar of the previous subsection.
[Diagram: the example NFA again, with transitions S -a-> C, S -b-> A, S -c-> S, A -a-> B, A -a-> S, B -c-> C; C accepting]
The nondeterminism of this automaton appears in state A: two outgoing arcs of A
are labelled with an a. Suppose we add a new state D, with an arc from A to D
labelled a, and we remove the arcs labelled a from A to S and from A to B. Since
D is a merge of the states S and B, we have to merge the outgoing arcs from S and
B into outgoing arcs of D. We obtain the following automaton.
[Diagram: states S, A, B, C, D; transitions S -a-> C, S -b-> A, S -c-> S, A -a-> D, B -c-> C, D -b-> A, D -a-> C, and two c-labelled arcs from D, one to C and one to S; C accepting]
We omit the proof that the language of the latter automaton is equal to the language
of the former one. Although there is just one outgoing arc from A labelled with a,
this automaton is still nondeterministic: there are two outgoing arcs labelled with
c from D. We apply the same procedure as above: add a new state E and an arc
labelled c from D to E, and remove the two outgoing arcs labelled c from D. Since
E is a merge of the states C and S, we have to merge the outgoing arcs from C and
S into outgoing arcs of E. We obtain the following automaton.
[Diagram: states S, A, B, C, D, E; transitions S -a-> C, S -b-> A, S -c-> S, A -a-> D, B -c-> C, D -a-> C, D -b-> A, D -c-> E, E -a-> C, E -b-> A, E -c-> S; C and E accepting]
Again, we do not prove that the language of this automaton is equal to the language
of the previous automaton, provided we add the state E to the set of accepting
states, which until now consisted just of state C. State E is added to the set of
accepting states because it is the merge of a set of states among which at least one
belongs to the set of accepting states. Note that in this automaton, for each state
all outgoing arcs are labelled differently, i.e. this automaton is deterministic. The
DFA constructed from the NFA above is the 5-tuple (X, Q, d, S, F) with

X = {a, b, c}
Q = {S, A, B, C, D, E}
F = {C, E}

where transition function d is defined by

d S a = C
d S b = A
d S c = S
d A a = D
d B c = C
d D a = C
d D b = A
d D c = E
d E a = C
d E b = A
d E c = S
This construction is called the subset construction. In general, the construction
works as follows. Suppose M = (X, Q, d, Q0, F) is a nondeterministic finite-state
automaton. Then the finite-state automaton M' = (X', Q', d', Q0', F'), the
components of which are defined below, accepts the same language.

X' = X
Q' = subs Q

where subs returns all subsets of a set. subs Q is also called the powerset of Q. For
example,

subs :: {X} → {{X}}
subs {A, B} = {∅, {A}, {A, B}, {B}}

For the other components of M' we define

d' q a = {t | t ∈ d r a, r ∈ q}
Q0' = {Q0}
F' = {p | p ∩ F ≠ ∅, p ∈ Q'}
The proof of the following theorem is given in Section 5.4.
Theorem 7: DFA for NFA
For every nondeterministic finite-state automaton M there exists a finite-state
automaton M' such that

Lnfa(M) = Ldfa(M')
□
Theorem 7 enables us to freely switch between NFAs and DFAs. Equipped with
this knowledge, we continue the exploration of regular languages in the following
section. But first we show that the transformation from an NFA to a DFA is an
instance of partial evaluation.
5.1.5 Partial evaluation of NFAs
Given a nondeterministic finite-state automaton we can obtain a deterministic finite-state
automaton not just by means of the above construction, but also by means of
partial evaluation.
Just as for function dfa, we can calculate as follows with function nfa.
nfa [x1,x2,...,xn]
=
foldl (flip deltas) start [x1,x2,...,xn]
=
foldl (flip deltas) (deltas x1 start) [x2,...,xn]
=
case x1 of
SA -> foldl (flip deltas) (deltas SA start) [x2,...,xn]
SB -> foldl (flip deltas) (deltas SB start) [x2,...,xn]
SC -> foldl (flip deltas) (deltas SC start) [x2,...,xn]
Note that the first argument of foldl is always flip deltas, and the second
argument is one of the six sets of states [S], [A], [B], [C], [B,S], [C,S] (the possible
results of deltas). Since there are only a finite number of results of deltas (six, to
be precise), we can define a transition function for each set of states:

nfaS, nfaA, nfaB, nfaC, nfaBS, nfaCS :: [Symbol] -> [State]

For example,
nfaA [] = [A]
nfaA (x:xs) = case x of
SA -> nfaBS xs
_ -> error "no transition possible"
Each of these functions is a case expression over the possible input symbols. By
partially evaluating the function nfa we have obtained a function that is the
implementation of the deterministic finite-state automaton corresponding to the
nondeterministic finite-state automaton.
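For example, the function for the state set [B,S] reads (a sketch, derived from the
transition function of the example NFA):

nfaBS :: [Symbol] -> [State]
nfaBS [] = [B,S]
nfaBS (x:xs) = case x of
  SA -> nfaC xs    -- deltas SA [B,S] = [C]
  SB -> nfaA xs    -- deltas SB [B,S] = [A]
  SC -> nfaCS xs   -- deltas SC [B,S] = [C,S]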
5.2 Regular grammars
This section defines regular grammars, a special kind of context-free grammars.
Subsection 5.2.1 gives the correspondence between nondeterministic finite-state
automata and regular grammars.

Definition 8: Regular Grammar
A regular grammar G is a context-free grammar (T, N, R, S) in which all production
rules in R are of one of the following two forms:

A → xB
A → x

with x ∈ T* and A, B ∈ N. So in every rule there is at most one nonterminal, and
if there is a nonterminal present, it occurs at the end. □
The regular grammars as defined here are sometimes called right-regular grammars.
There is a symmetric definition for left-regular grammars.

Definition 9: Regular Language
A regular language is a language that is generated by a regular grammar. □
From the definition it is clear that each regular language is context-free. The question
is now: is each context-free language regular? The answer is: no. There are
context-free languages that are not regular; an example of such a language is
{aⁿbⁿ | n ∈ ℕ}. To understand this, you have to know how to prove that a language
is not regular. Because of its subtlety, we postpone this kind of proof until Chapter 9.
Here it suffices to know that regular languages form a proper subset of context-free
languages and that we will profit from their speciality in the recognition process.
A first similarity between regular languages and context-free languages is that both
are closed under union, concatenation and Kleene star.

Theorem 10:
Let L and M be regular languages, then
- L ∪ M is regular,
- LM is regular,
- L* is regular.
□
Proof: Let GL = (T, NL, RL, SL) and GM = (T, NM, RM, SM) be regular grammars
for L and M respectively, then
- for regular grammars, the well-known union construction for context-free grammars is a regular grammar again;
- we obtain a regular grammar for LM if we replace, in GL, each production of the form T → x and T → ε by T → xSM and T → SM, respectively;
- since L* = {ε} ∪ LL*, it follows from the above that there exists a regular grammar for L*.
□
In addition to these closure properties, regular languages are closed under
intersection and complement too; see the exercises. This is remarkable, because
context-free languages are not closed under these operations. Recall the language
L = L1 ∩ L2 where L1 = {aⁿbⁿcᵐ | n, m ∈ ℕ} and L2 = {aⁿbᵐcᵐ | n, m ∈ ℕ}.
As for context-free languages, there may exist more than one regular grammar for
a given regular language, and these regular grammars may be transformed into each
other. We conclude this section with a grammar transformation:

Theorem 11:
For each regular grammar G there exists a regular grammar G' with start symbol
S' such that

L(G) = L(G')

and such that G' has no productions of the form U → V and W → ε, with W ≠ S'.
In other words: every regular grammar can be transformed to a form where every
production has a nonempty terminal string in its right-hand side (with a possible
exception for S' → ε). □
The proof of this transformation is omitted; we only briefly describe the construction
of such a regular grammar, and illustrate the construction with an example.
Given a regular grammar G, a regular grammar with the same language but without
productions of the form U → V and W → ε for all U, V, and all W ≠ S is obtained
as follows. First, consider all pairs Y, Z of nonterminals of G such that Y ⇒* Z.
Add productions Y → z to the grammar, with Z → z a production of the original
grammar, and z not a single nonterminal. Remove all productions U → V from
G. Finally, remove all productions of the form W → ε for W ≠ S, and for each
production U → xW add the production U → x. The following example illustrates
this construction.
Consider the following regular grammar G.

S → aA
S → bB
S → A
S → C
A → bB
A → S
A → ε
B → bB
B → ε
C → c
The grammar G' of the desired form is constructed in 3 steps.

Step 1
Let G' equal G.

Step 2
Consider all pairs of nonterminals Y and Z. If Y ⇒* Z, add the productions Y → z to
G', with Z → z a production of the original grammar, and z not a single nonterminal.
Furthermore, remove all productions of the form U → V from G'. In the example
we remove the productions S → A, S → C, A → S, and we add the productions
S → bB and S → ε since S ⇒* A, and the production S → c since S ⇒* C, and the
productions A → aA and A → bB since A ⇒* S, and the production A → c since
A ⇒* C. We obtain the grammar with the following productions.
S → aA
S → bB
S → c
S → ε
A → bB
A → aA
A → c
A → ε
B → bB
B → ε
C → c
This grammar generates the same language as G, and has no productions of the
form U → V. It remains to remove the productions of the form W → ε for W ≠ S.

Step 3
Remove all productions of the form W → ε for W ≠ S, and for each production
U → xW add the production U → x. Applying this transformation to the above
grammar gives the following grammar.
S → aA
S → bB
S → a
S → b
S → c
S → ε
A → bB
A → aA
A → a
A → b
A → c
B → bB
B → b
C → c
Each production in this grammar is of one of the desired forms, U → x or U → xV,
and the language of the grammar G' we thus obtain is equal to the language of
grammar G.
5.2.1 Equivalence of Regular grammars and Finite automata
In the previous section we introduced finite-state automata. Here we show that
regular grammars and nondeterministic finite-state automata are two sides of one
coin.
We will prove the equivalence using Theorem 7. The equivalence consists of two
parts, formulated in Theorems 12 and 13 below. The basis for both theorems is
the direct correspondence between a production A → xB and a transition A --x--> B.
Theorem 12: Regular grammar for NFA
For each NFA M there exists a regular grammar G such that

Lnfa(M) = L(G)
□

Proof: We will just sketch the construction; the formal proof can be found in
the literature. Let (X, Q, d, S, F) be a DFA for NFA M. Construct the grammar
G = (X, Q, R, S) where
- the terminals of the grammar are the input alphabet of the automaton;
- the nonterminals of the grammar are the states of the automaton;
- the start symbol of the grammar is the start state of the automaton;
- the productions of the grammar correspond to the automaton transitions: a rule A → xB for each transition A --x--> B, and a rule A → ε for each accepting state A.
In formulae:

R = {A → xB | A, B ∈ Q, x ∈ X, d A x = B} ∪ {A → ε | A ∈ F}
□
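For instance, applied to the example DFA M0 of Section 5.1.1, this construction
yields the regular grammar with productions

S → aC    S → bA    S → cS    A → aB    B → cC    C → ε

which indeed generates exactly the sentences accepted by M0, such as bac.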
Theorem 13: NFA for regular grammar
For each regular grammar G there exists a nondeterministic finite-state automaton
M such that

L(G) = Lnfa(M)
□

Proof: Again, we will just sketch the construction; the formal proof can be found
in the literature. The construction consists of two steps: first we give a direct
translation of a regular grammar to an automaton, and then we transform the
automaton into a suitable shape.
From a grammar to an automaton. Let G = (T, N, R, S) be a regular grammar
without productions of the form U → V and W → ε for W ≠ S.
Construct NFA M = (X, Q, d, {S}, F) where
- The input alphabet of the automaton consists of the nonempty terminal strings (!) that occur in the rules of the grammar:

  X = {x ∈ T⁺ | A, B ∈ N, A → xB ∈ R} ∪ {x ∈ T⁺ | A ∈ N, A → x ∈ R}

- The states of the automaton are the nonterminals of the grammar, extended with a new state Nf:

  Q = N ∪ {Nf}

- The transitions of the automaton correspond to the grammar productions: for each rule A → xB we get a transition A --x--> B, and for each rule A → x with nonempty x, we get a transition A --x--> Nf. In formulae, for all A ∈ N and x ∈ X:

  d A x = {B | B ∈ N, A → xB ∈ R} ∪ {Nf | A → x ∈ R}

- The final states of the automaton are Nf and possibly S, if S → ε is a grammar production:

  F = {Nf} ∪ {S | S → ε ∈ R}

Lnfa(M) = L(G), because of the direct correspondence between derivation steps in
G and transitions in M.
Transforming the automaton to a suitable shape. There is a minor flaw in the automaton given above: the grammar and the automaton have different alphabets. This shortcoming will be remedied by an automaton transformation which yields an equivalent automaton with transitions labelled by elements of T (instead of T+). The transformation is relatively easy and is depicted in the diagram below. In order to eliminate a transition d q x = q′, where x = x1x2...xk+1 with k > 0 and xi ∈ T for all i, add new (nonfinal) states p1, ..., pk to the existing ones and new transitions d q x1 = p1, d p1 x2 = p2, ..., d pk xk+1 = q′.

  q --x1x2...xk+1--> q′

is replaced by

  q --x1--> p1 --x2--> p2 --> ... --> pk --xk+1--> q′

It is intuitively clear that the resulting automaton is equivalent to the original one. Carry out this transformation for each M-transition d q x = q′ with |x| > 1 in order to get an automaton for G with the same input alphabet T.  □
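This last transformation can be sketched in Haskell as follows; the representation (states as Ints, a transition as a triple of source, label, and target) and all names are hypothetical, not taken from this book's code.

-- Split one transition labelled by a string into a chain of
-- single-symbol transitions; fresh is the first unused state number.
splitTransition :: Int -> (Int,String,Int) -> [(Int,String,Int)]
splitTransition fresh (q,x,q')
  | length x <= 1 = [(q,x,q')]
  | otherwise     = zip3 sources labels targets
  where
    ps      = [fresh .. fresh + length x - 2]  -- the new nonfinal states
    sources = q : ps
    labels  = map (:[]) x                      -- one symbol per transition
    targets = ps ++ [q']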
5.3 Regular expressions
Regular expressions are a classical and convenient way to describe, for example, the structure of terminal words. This section defines regular expressions, defines the language of a regular expression, and shows that regular expressions and regular grammars are equally expressive formalisms. We do not discuss implementations of (datatypes and functions for matching) regular expressions; implementations can be found in the literature, see [9, 6].
Definition 14: RE_T, regular expressions over alphabet T
The set RE_T of regular expressions over alphabet T is inductively defined as follows: for regular expressions R, S

∅ ∈ RE_T
ε ∈ RE_T
a ∈ RE_T
R + S ∈ RE_T
R S ∈ RE_T
R* ∈ RE_T

where a ∈ T. The operator + is associative, commutative, and idempotent; the concatenation operator, written as juxtaposition (so x concatenated with y is denoted by xy), is associative, and ε is the unit of it. In formulae this reads, for all regular expressions R, S, and U,

R + (S + U) = (R + S) + U
R + S       = S + R
R + R       = R
R (S U)     = (R S) U
ε R         = R (= R ε)

□
Furthermore, the star operator, *, binds stronger than concatenation, and concatenation binds stronger than +. Examples of regular expressions are:

(bc)* + ∅
ε + b(ε*)
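Implementations are left to the literature, but as a sketch the definition above translates directly into a Haskell datatype (the constructor names below are ours):

data RegExp = EmptySet            -- the empty language, written ∅ above
            | Epsilon             -- the empty word ε
            | Sym Char            -- a single symbol a from T
            | RegExp :+: RegExp   -- choice, R + S
            | RegExp :.: RegExp   -- concatenation, R S
            | Star RegExp         -- repetition, R*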
The language (i.e. the semantics) of a regular expression over T is a set of T-sequences, compositionally defined on the structure of regular expressions, as follows.
Definition 15: Language of a regular expression
Function Lre :: RE_T → ℙ(T*) returns the language of a regular expression. It is defined inductively by:

Lre(∅)     = ∅
Lre(ε)     = {ε}
Lre(b)     = {b}
Lre(x + y) = Lre(x) ∪ Lre(y)
Lre(x y)   = Lre(x) Lre(y)
Lre(x*)    = (Lre(x))*

□

Since ∪ is associative, commutative, and idempotent, set concatenation is associative with {ε} as its unit, and function Lre is well defined. Note that the language Lre(b*) is the set consisting of zero, one or more concatenations of b, i.e., Lre(b*) = ({b})*.
As an example of a language of a regular expression, we compute the language of the regular expression (ε + bc)d.

  Lre((ε + bc)d)
=
  (Lre(ε + bc)) (Lre(d))
=
  (Lre(ε) ∪ Lre(bc)) {d}
=
  ({ε} ∪ (Lre(b))(Lre(c))) {d}
=
  ({ε} ∪ {bc}) {d}
=
  {d, bcd}
Regular expressions are used to describe the tokens of a language. For example, the list

if p then e1 else e2

contains six tokens, three of which are identifiers. An identifier is an element in the language of the regular expression

letter(letter + digit)*

where

letter = a + b + ... + z + A + B + ... + Z
digit  = 0 + 1 + ... + 9

see subsection 2.3.1.
In the beginning of this section we claimed that regular expressions and regular grammars are equivalent formalisms. We will prove this claim later, but first we illustrate the construction of a regular grammar out of a regular expression in an example. Consider the following regular expression.

R = a* + ε + (a + b)*

We aim at a regular grammar G such that Lre(R) = L(G) and again we take a top-down approach.
Suppose that nonterminal A generates the language Lre(a*), nonterminal B generates the language Lre(ε), and nonterminal C generates the language Lre((a + b)*). Suppose furthermore that the productions for A, B, and C satisfy the conditions imposed upon regular grammars. Then we obtain a regular grammar G with L(G) = Lre(R) by defining

S → A
S → B
S → C

where S is the start-symbol of G. It remains to construct productions for nonterminals A, B, and C.
The nonterminal A with productions

A → aA
A → ε

generates the language Lre(a*).
Since Lre(ε) = {ε}, the nonterminal B with production

B → ε

generates the language {ε}.
Nonterminal C with productions

C → aC
C → bC
C → ε

generates the language Lre((a + b)*).
For a specific example it is not difficult to construct a regular grammar for a regular expression. We now give the general result.

Theorem 16: Regular Grammar for Regular Expression
For each regular expression R there exists a regular grammar G such that

Lre(R) = L(G)

□

The proof of this theorem is given in Section 5.4.
To obtain a regular expression that generates the same language as a given regular grammar we go via an automaton. Given a regular grammar G, we can use the theorems from the previous sections to obtain a DFA D such that

L(G) = Ldfa(D)

So if we can obtain a regular expression for a DFA D, we have found a regular expression for a regular grammar. To obtain a regular expression for a DFA D, we interpret each state of D as a regular expression defined as the sum of the concatenation of outgoing terminal symbols with the resulting state. For our example DFA we obtain:

S = aC + bA + cS
A = aB
B = cC
C = ε
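As a sketch, these equations can be merged by back-substitution, together with the recursion-removal rule A = xA + z ≡ A = x*z used in the proof of theorem 17 below:

C = ε
B = cC = c
A = aB = ac
S = aC + bA + cS = a + bac + cS, and hence S = c*(a + bac)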
It is easy to merge these four regular expressions into a single regular expression, partially because this is a simple example. Merging the regular expressions obtained from a DFA that may loop is more complicated, as we will briefly explain in the proof of the following theorem. In general, we have:

Theorem 17: Regular Expression for Regular Grammar
For each regular grammar G there exists a regular expression R such that

L(G) = Lre(R)

□

The proof of this theorem is given in Section 5.4.
5.4 Proofs
This section contains the proofs of some of the theorems given in this chapter.
Proof: of Theorem 7.
Suppose M = (X, Q, d, Q0, F) is a nondeterministic finite-state automaton. Define the finite-state automaton M′ = (X′, Q′, d′, Q′0, F′) as follows.

X′ = X
Q′ = subs Q

where subs returns the powerset of a set. For example,

subs {A, B} = {{}, {A}, {A, B}, {B}}

For the other components of M′ we define

d′ q a = {t | t ∈ d r a, r ∈ q}
Q′0 = Q0
F′ = {p | p ∩ F ≠ ∅, p ∈ Q′}

We have
  Lnfa(M)
=  { definition of Lnfa }
   {w | w ∈ X*, nfa accept w (d, Q0, F)}
=  { definition of nfa accept }
   {w | w ∈ X*, (nfa d Q0 w ∩ F) ≠ ∅}
=  { definition of F′ }
   {w | w ∈ X*, nfa d Q0 w ∈ F′}
=  { assume nfa d Q0 w = dfa d′ Q′0 w }
   {w | w ∈ X*, dfa d′ Q′0 w ∈ F′}
=  { definition of dfa accept }
   {w | w ∈ X*, dfa accept w (d′, Q′0, F′)}
=  { definition of Ldfa }
   Ldfa(M′)
It follows that Lnfa(M) = Ldfa(M′) provided

nfa d Q0 w = dfa d′ Q′0 w    (5.1)

We prove this equation as follows.

  nfa d Q0 w
=  { definition of nfa }
   foldl (deltas d) Q0 w
=  { Q0 = Q′0; assume deltas d = d′ }
   foldl d′ Q′0 w
=  { definition of dfa }
   dfa d′ Q′0 w

So if

deltas d = d′

equation (5.1) holds. This equality follows from the following calculation.

  d′ q a
=  { definition of d′ }
   {t | r ∈ q, t ∈ d r a}
=  { definition of deltas }
   deltas d q a

□
Proof: of Theorem 16.
The proof of this theorem is by induction on the structure of regular expressions.
For the three base cases we argue as follows. The regular grammar without productions generates the language Lre(∅). The regular grammar with the production S → ε generates the language Lre(ε). The regular grammar with production S → b generates the language Lre(b).
For the other three cases the induction hypothesis is that there exist a regular grammar with start-symbol S1 that generates the language Lre(x), and a regular grammar with start-symbol S2 that generates the language Lre(y).
We obtain a regular grammar with start-symbol S that generates the language Lre(x + y) by defining

S → S1
S → S2

We obtain a regular grammar with start-symbol S that generates the language Lre(xy) by replacing, in the regular grammar that generates the language Lre(x), each production of the form T → a and T → ε by T → aS2 and T → S2, respectively.
We obtain a regular grammar with start-symbol S that generates the language Lre(x*) by replacing, in the regular grammar that generates the language Lre(x), each production of the form T → a and T → ε by T → aS and T → S, and by adding the productions S → S1 and S → ε, where S1 is the start-symbol of the regular grammar that generates the language Lre(x).  □
Proof: of Theorem 17.
In sections 5.1.4 and 5.2.1 we have shown that there exists a DFA D = (X, Q, d, S, F) such that

L(G) = Ldfa(D)

So, if we can show that there exists a regular expression R such that

Ldfa(D) = Lre(R)

then the theorem follows.
Let D = (X, Q, d, S, F) be a DFA such that L(G) = Ldfa(D). We define a regular expression R such that

Lre(R) = Ldfa(D)

For each state q ∈ Q we define a regular expression q̂, and we let R be Ŝ. We obtain the definition of q̂ by combining all pairs c and C such that d q c = C.

q̂ = if q ∉ F
    then foldl (+) ∅ [c Ĉ | d q c = C]
    else ε + foldl (+) ∅ [c Ĉ | d q c = C]

This gives a set of possibly mutually recursive equations, which we have to solve. In solving these equations we use the fact that concatenation distributes over the sum operator:

z(x + y) = zx + zy

and that recursion can be removed by means of the star operator *:

A = xA + z  (where A does not occur in z)  ≡  A = x*z

The algorithm for solving such a set of equations is omitted.
We prove Ldfa(D) = Lre(Ŝ).

  Ldfa(D)
=  { definition of Ldfa }
   {w | w ∈ X*, dfa accept w (d, S, F)}
=  { definition of dfa accept }
   {w | w ∈ X*, (dfa d S w) ∈ F}
=  { definition of dfa }
   {w | w ∈ X*, (foldl d S w) ∈ F}
=  { assumption }
   {w | w ∈ Lre(Ŝ)}
=  { equality for set-comprehensions }
   Lre(Ŝ)

It remains to prove the assumption in the above calculation: for w ∈ X*,

(foldl d S w) ∈ F  ≡  w ∈ Lre(Ŝ)

We prove a generalisation of this equation, namely, for arbitrary q,

(foldl d q w) ∈ F  ≡  w ∈ Lre(q̂)

This equation is proved by induction on the length of w. For the base case w = ε we calculate as follows.

  (foldl d q ε) ∈ F
≡  { definition of foldl }
   q ∈ F
≡  { definition of q̂; E abbreviates the fold expression }
   q̂ = ε + E
≡  { definition of Lre, definition of q̂ }
   ε ∈ Lre(q̂)

The induction hypothesis is that for all lists w with |w| ≤ n we have (foldl d q w) ∈ F ≡ w ∈ Lre(q̂). Suppose ax is a list of length n+1.

  (foldl d q (ax)) ∈ F
≡  { definition of foldl }
   (foldl d (d q a) x) ∈ F
≡  { induction hypothesis }
   x ∈ Lre(r̂), where r = d q a
≡  { definition of q̂; D is deterministic }
   ax ∈ Lre(q̂)

□
summary
This chapter discusses methods for recognising sentences from regular languages, and introduces several concepts related to describing and recognising regular languages. Regular languages are used for describing simple languages like the language of identifiers and the language of keywords, and regular expressions are convenient for the description of regular languages. The straightforward translation of a regular expression into a recogniser for the language of that regular expression results in a recogniser that is often very inefficient. By means of (non)deterministic finite-state automata we construct a recogniser that requires time linear in the length of the input list for recognising an input list.
5.5 Exercises
Exercise 5.1  Given a regular grammar G for language L, construct a regular grammar for L*.

Exercise 5.2  Transform the grammar with the following productions to a grammar without productions of the form U → V and W → ε with W ≠ S.

S → aA
S → A
A → aS
A → B
B → C
B → ε
C → cC
C → a
Exercise 5.3  Suppose that the state transition function d in the definition of a nondeterministic finite-state automaton has the following type

d :: {Q} → X → {Q}

Function d takes a set of states V and an element a, and returns the set of states that are reachable from V with an arc labelled a. Define a function ndfsa of type

({Q} → X → {Q}) → {Q} → X* → {Q}

which given a function d, a set of start states, and an input list, returns the set of states in which the nondeterministic finite-state automaton can end after reading the input list.
Exercise 5.4  Prove the converse of Theorem 7: show that for every deterministic finite-state automaton M there exists a nondeterministic finite-state automaton M′ such that

Ldfa(M) = Lnfa(M′)
Exercise 5.5  Regular languages are closed under complementation. Prove this claim. Hint: construct a finite automaton for the complement of L out of an automaton for regular language L.
Exercise 5.6  Regular languages are closed under intersection.
1. Prove this claim using the result from the previous exercise.
2. A direct proof of this claim is the following:
Let M1 = (X, Q1, d1, S1, F1) and M2 = (X, Q2, d2, S2, F2) be DFAs for the regular languages L1 and L2 respectively. Define the (product) automaton M = (X, Q1 × Q2, d, (S1, S2), F1 × F2) by

d (q1, q2) x = (d1 q1 x, d2 q2 x)

Now prove that

Ldfa(M) = Ldfa(M1) ∩ Ldfa(M2)
Exercise 5.7  Define nondeterministic finite-state automata that accept languages equal to the languages of the following regular grammars.

1.
S → (A
S → ε
S → )A
A → )
A → (

2.
S → 0A
S → 0B
S → 1A
A → 1
A → 0
B → 0
B → ε
Exercise 5.8  Describe the language of the following regular expressions.

1. ε + b(ε*)
2. (bc)* + ∅
3. a(b*) + c
Exercise 5.9  Prove that for arbitrary regular expressions R, S, and T the following equivalences hold.

Lre(R(S + T)) = Lre(RS + RT)
Lre((R + S)T) = Lre(RT + ST)
Exercise 5.10  Give regular expressions S and R such that

Lre(RS) = Lre(SR)

and regular expressions S and R such that

Lre(RS) ≠ Lre(SR)
Exercise 5.11  Give regular expressions V and W, with Lre(V) ≠ Lre(W), such that for all regular expressions R and S with S ≠ ∅

Lre(R(S + V)) = Lre(R(S + W))

V and W may be expressed in terms of R and S.
Exercise 5.12 Give a regular expression for the language that consists of all lists of zeros
and ones such that the segment 01 occurs nowhere in a list. Examples of sentences
of this language are 1110, and 000.
Exercise 5.13  Give regular grammars that generate the language of the following regular expressions.

1. ((a + bb)* + c)*
2. a + b + ab
Exercise 5.14  Give regular expressions of which the language equals the language of the following regular grammars.

1.
S → bA
S → aC
S → ε
A → bA
A → ε
B → aC
B → bB
B → ε
C → bB
C → b

2.
S → 0S
S → 1T
S → ε
T → 0T
T → 1S
Exercise 5.15  Construct for each of the following regular expressions a nondeterministic finite-state automaton that accepts the sentences of the language. Transform the nondeterministic finite-state automata into deterministic finite-state automata.

1. a + b + (ab)*
2. (1 + (12)* + 0)(30)*
Exercise 5.16  Define regular expressions for the languages of the following deterministic finite-state automata.

1. Start state is S.

[transition diagram not reproduced]

2. Start state is S.

[transition diagram not reproduced]
Chapter 6
Compositionality
introduction
Many recursive functions follow a common pattern of recursion. These common patterns of recursion can conveniently be captured by higher order functions. For example: many recursive functions defined on lists are instances of the higher order function foldr. It is possible to define a function such as foldr for a whole range of datatypes other than lists. Such functions are called compositional. Compositional functions on datatypes are defined in terms of algebras of semantic actions that correspond to the constructors of the datatype. Compositional functions can typically be used to define the semantics of programming language constructs. Such semantics is referred to as algebraic semantics. Algebraic semantics often uses algebras that are functions from tuples to tuples. Such functions can be seen as computations that read values from a component of the domain and write values to a component of the codomain. The former values are called inherited attributes and the latter values are called synthesised attributes. Attributes can be both inherited and synthesised. As explained in Section 2.4.1, there is an important relationship between grammars and compositionality: with every grammar, which describes the concrete syntax of a language, one can associate a (possibly mutually recursive) datatype, which describes the abstract syntax of the language. Compositional functions on these datatypes are called syntax driven.

goals
After studying this chapter and making the exercises you will

• know how to generalise constructors of a datatype to an algebra;
• know how to write compositional functions, also known as folds, on (possibly mutually recursive) datatypes;
• understand the advantages of using folds, and have seen that many problems can be solved with a fold;
• know that a fold applied to the constructors algebra is the identity function;
• have seen the notions of fusion and deforestation;
• know how to write syntax driven code;
• understand the connection between datatypes, abstract syntax and concrete syntax;
• understand the notions of synthesised and inherited attributes;
• can associate inherited and synthesised attributes with the different alternatives of (possibly mutually recursive) datatypes (or the different nonterminals of grammars);
• can define algebraic semantics in terms of compositional (or syntax driven) code that is defined using algebras of computations which make use of inherited and synthesised attributes.
organisation
The chapter is organised as follows. Section 6.1 shows how to dene compositional
recursive functions on built-in lists using a function which is similar to foldr and
shows how to do the same thing with user-dened lists and streams. Section 6.2
shows how to dene compositional recursive functions on several kinds of trees.
Section 6.3 denes algebraic semantics. Section 6.4 shows the usefulness of algebraic
semantics by presenting an expression evaluator, an expression interpreter which
makes use of a stack and expression compiler to a stack machine. They only dier
in the way they handle basic expressions (variables and local denitions are handled
in the same way). All three examples use an algebra of computations which can
read values from and write values to an environment which binds names to values.
In a second version of the expression evaluator the use of inherited and synthesised
attributes is made more explicit by using tuples. Section 6.5 presents a relatively
complex example of the use of tuples in combination with compositionality. It deals
with the problem of variable scope when compiling block structured languages.
6.1 Lists
This section introduces compositional functions on the well known datatype of lists. Compositional functions are defined on the built-in datatype [a] for lists (Section 6.1.1), on a user-defined datatype List a for lists (Section 6.1.2), and on streams or infinite lists (Section 6.1.3). We also show how to construct an algebra that directly corresponds to a datatype.
6.1.1 Built-in lists
The datatype of lists is perhaps the most important example of a datatype. A list is either the empty list [] or a nonempty list (x:xs) consisting of an element x at the head of the list, followed by a tail xs which itself is again a list. Thus, the type [x] is recursive. In Hugs the type [x] for lists is built-in. The informal definition above corresponds to the following (pseudo) datatype.

data [x] = x : [x] | []

Many recursive functions on lists look very similar. Computing the sum of the elements of a list of integers (sumL, where the L denotes that it is a sum function defined on lists) is very similar to computing the product of the elements of a list of integers (prodL).
sumL, prodL :: [Int] -> Int
sumL (x:xs) = x + sumL xs
sumL [] = 0
prodL (x:xs) = x * prodL xs
prodL [] = 1
The function sumL replaces the list constructor (:) by (+) and the list constructor [] by 0, and the function prodL replaces the list constructor (:) by (*) and the list constructor [] by 1. Note that we have replaced the constructor (:) (a constructor with two arguments) by binary operators (+) and (*) (i.e. functions with two arguments) and the constructor [] (a constructor with zero arguments) by constants 0 and 1 (i.e. functions with zero arguments). The similarity between the definitions of the functions sumL and prodL can be captured by the following higher order recursive function foldL, which is nothing else but an uncurried version of the well known function foldr. Don't confuse foldL with Haskell's prelude function foldl, which works the other way around.
foldL :: (x -> l -> l,l) -> [x] -> l
foldL (op,c) = fold where
fold (x:xs) = op x (fold xs)
fold [] = c
The function foldL recursively replaces the constructor (:) by an operator op and
the constructor [] by a constant c. We can now use foldL to compute the sum and
product of the elements of a list of integers as follows.
? foldL ((+),0) [1,2,3,4]
10
? foldL ((*),1) [1,2,3,4]
24
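The remark that foldL is an uncurried foldr can be checked in one line (a sketch; the name foldL' is ours, chosen to avoid a clash with the definition above):

foldL' :: (x -> l -> l,l) -> [x] -> l
foldL' (op,c) = foldr op c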
The pair (op,c) is often referred to as a list-algebra. More precisely, a list-algebra consists of a type l (the carrier of the algebra), a binary operator op of type x->l->l and a constant c of type l. Note that a type (like Int) can be the carrier of a list-algebra in more than one way (for example using ((+),0) and ((*),1)). Here is another example of how to turn Int into a list-algebra.
? foldL (\_ n -> n+1,0) [1,2,3,4]
4
This list-algebra ignores the value at the head of a list, and increments the result obtained thus far by one. It corresponds to the function sizeL defined by:
sizeL :: [x] -> Int
sizeL (_:xs) = 1 + sizeL xs
sizeL [] = 0
Note that the type of sizeL is more general than the types of sumL and prodL. The
type of the elements of the list does not play a role.
6.1.2 User-defined lists
In this subsection we present an example of a fold function defined on another datatype than built-in lists. To keep things simple we redo the list example for user-defined lists.
data List x = Cons x (List x) | Nil
User-defined lists are defined in the same way as built-in ones. The constructors (:) and [] are replaced by constructors Cons and Nil. Here are the types of the constructors Cons and Nil.
? :t Cons
Cons :: a -> List a -> List a
? :t Nil
Nil :: List a
An algebra type ListAlgebra corresponding to the datatype List directly follows the structure of that datatype.
type ListAlgebra x l = (x -> l -> l,l)
The left hand side of the type definition is obtained from the left hand side of the datatype as follows: a postfix Algebra is added at the end of the name List and a type variable l is added at the end of the whole left hand side of the type definition. The right hand side of the type definition is obtained from the right hand side of the data definition as follows: all List x valued constructors are replaced by l valued functions which have the same number of arguments (if any) as the corresponding constructors. The types of recursive constructor arguments (i.e. arguments of type List x) are replaced by recursive function arguments (i.e. arguments of type l). The types of the other arguments are simply left unchanged. In a similar way, the definition of a fold function can be generated automatically from the data definition.
foldList :: ListAlgebra x l -> List x -> l
foldList (cons,nil) = fold where
fold (Cons x xs) = cons x (fold xs)
fold Nil = nil
The constructors Cons and Nil in the left hand sides of the definition of the local function fold are replaced by functions cons and nil in the right hand sides. The function fold is applied recursively to all recursive constructor arguments of type List x to return a value of type l as required by the functions of the algebra (in this case Cons and cons have one such recursive argument). The other arguments are left unchanged. Recursive functions on user-defined lists which are defined by means of foldList are called compositional. Every algebra defines a unique compositional function. Here are three examples of compositional functions. They correspond to the examples of section 6.1.1.
sumList, prodList :: List Int -> Int
sumList = foldList ((+),0)
prodList = foldList ((*),1)
sizeList :: List x -> Int
sizeList = foldList (const (1+),0)
It is worth mentioning one particular ListAlgebra: the trivial ListAlgebra that replaces Cons by Cons and Nil by Nil. This algebra defines the identity function on user-defined lists.
idListAlgebra :: ListAlgebra x (List x)
idListAlgebra = (Cons,Nil)
idList :: List x -> List x
idList = foldList idListAlgebra
? idList (Cons 1 (Cons 2 Nil))
Cons 1 (Cons 2 Nil)
6.1.3 Streams
In this section we consider streams (or infinite lists).

data Stream x = And x (Stream x)

Here is a standard example of a stream: the infinite list of Fibonacci numbers.
fibStream :: Stream Int
fibStream = And 0 (And 1 (restOf fibStream)) where
restOf (And x stream@(And y _)) = And (x+y) (restOf stream)
The algebra type StreamAlgebra, and the fold function foldStream, can be generated automatically from the datatype Stream.
type StreamAlgebra x s = x -> s -> s
foldStream :: StreamAlgebra x s -> Stream x -> s
foldStream and = fold where
fold (And x xs) = and x (fold xs)
Note that the algebra has only one component because Stream has only one constructor. For the same reason the fold function is defined using only one equation. Here is an example of using a compositional function on user-defined streams. It computes the first element of a monotone stream that is greater than or equal to a given value.
firstGreaterThan :: Ord x => x -> Stream x -> x
firstGreaterThan n = foldStream (\x y -> if x>=n then x else y)
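For example (a session sketch; 144 is the first Fibonacci number that is at least 100):

? firstGreaterThan 100 fibStream
144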
6.2 Trees
Now that we have seen how to generalise the foldr function on built-in lists to compositional functions on user-defined lists and streams, we proceed by explaining another common class of datatypes: trees. We will treat four different kinds of trees in the subsections below:

• binary trees;
• trees for matching parentheses;
• expression trees;
• and general trees.

Furthermore, we will briefly mention the concepts of fusion and deforestation.
6.2.1 Binary trees
A binary tree is either a node where the tree splits into two subtrees or a leaf which
holds a value.
data BinTree x = Bin (BinTree x) (BinTree x) | Leaf x
One can generate the corresponding algebra type BinTreeAlgebra and fold function
foldBinTree from the datatype automatically. Note that Bin has two recursive
arguments and that Leaf has one non-recursive argument.
type BinTreeAlgebra x t = (t -> t -> t,x -> t)
foldBinTree :: BinTreeAlgebra x t -> BinTree x -> t
foldBinTree (bin,leaf) = fold where
fold (Bin l r) = bin (fold l) (fold r)
fold (Leaf x) = leaf x
In the BinTreeAlgebra type, the bin part of the algebra has two arguments of type t and the leaf part of the algebra has one argument of type x. Similarly, in the foldBinTree function, the local fold function is applied recursively to both arguments of bin and is not called on the argument of leaf. We can now define compositional functions on binary trees much in the same way as we defined them on lists. Here is an example: the function sizeBinTree computes the size of a binary tree.
sizeBinTree :: BinTree x -> Int
sizeBinTree = foldBinTree ((+),const 1)
?sizeBinTree (Bin (Bin (Leaf 3) (Leaf 7)) (Leaf 11))
3
If a tree consists of a leaf, then sizeBinTree ignores the value at the leaf and returns 1 as the size of the tree. If a tree consists of two subtrees, then sizeBinTree returns the sum of the sizes of those subtrees as the size of the tree. Functions for computing the sum and the product of the integers at the leaves of a binary tree can be defined in a similar way. It suffices to define appropriate semantic actions bin and leaf on a type t (in this case Int) that correspond to the syntactic constructs Bin and Leaf of the datatype BinTree.
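For instance, a sum function in the same style would look as follows (a sketch; the name sumBinTree is ours):

sumBinTree :: BinTree Int -> Int
sumBinTree = foldBinTree ((+),id)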
6.2.2 Trees for matching parentheses
Section 3.3.1 defines the datatype Parentheses for matching parentheses.
data Parentheses = Match Parentheses Parentheses
| Empty
For example, the sentence ()() of the concrete syntax for matching parentheses is represented by the value Match Empty (Match Empty Empty) in the abstract syntax Parentheses. Remember that the abstract syntax ignores the terminal bracket symbols of the concrete syntax.
We can now define, in the same way as we did for lists and binary trees, an algebra type ParenthesesAlgebra and a fold function foldParentheses, which can be used to compute the depth (depthParentheses) and the width (widthParentheses) of matching parentheses in a compositional way. The depth of a string of matching parentheses s is the largest number of unmatched parentheses that occurs in a substring of s. For example, the depth of the string ((()))() is 3. The width of a string of matching parentheses s is the number of substrings that are matching parentheses themselves, which are not a substring of a surrounding string of matching parentheses. For example, the width of the string ((()))() is 2. Compositional functions on datatypes that describe the abstract syntax of a language are called syntax driven.
type ParenthesesAlgebra m = (m -> m -> m,m)
foldParentheses :: ParenthesesAlgebra m -> Parentheses -> m
foldParentheses (match,empty) = fold where
fold (Match l r) = match (fold l) (fold r)
fold Empty = empty
depthParenthesesAlgebra :: ParenthesesAlgebra Int
depthParenthesesAlgebra = (\x y -> max (1+x) y,0)
widthParenthesesAlgebra :: ParenthesesAlgebra Int
widthParenthesesAlgebra = (\_ y -> 1+y,0)
depthParentheses, widthParentheses :: Parentheses -> Int
depthParentheses = foldParentheses depthParenthesesAlgebra
widthParentheses = foldParentheses widthParenthesesAlgebra
parenthesesExample = Match (Match (Match Empty Empty) Empty)
(Match Empty
(Match (Match Empty Empty)
Empty
) )
? depthParentheses parenthesesExample
3
? widthParentheses parenthesesExample
3
Our example reveals that abstract syntax is not very well suited for interpretation by human beings. What is the concrete representation of the matching parentheses example represented by parenthesesExample? It happens to be ((()))()(()). Fortunately, we can easily write a program that computes the concrete representation from the abstract one. We know exactly which terminals we have deleted when going from the concrete syntax to the abstract one. The algebra used by the function a2cParentheses simply reinserts those terminals that we have deleted. Note that a2cParentheses does not deal with layout such as blanks, indentation and newlines. For a simple example layout does not really matter. For large examples layout is very important: it can be used to let concrete representations look pretty.
a2cParenthesesAlgebra :: ParenthesesAlgebra String
a2cParenthesesAlgebra = (\xs ys -> "("++xs++")"++ys,"")
a2cParentheses :: Parentheses -> String
a2cParentheses = foldParentheses a2cParenthesesAlgebra
? a2cParentheses parenthesesExample
((()))()(())
This example illustrates that a computer can easily interpret abstract syntax (something human beings have difficulties with). Strangely enough, human beings can easily interpret concrete syntax (something computers have difficulties with). What we would really like is that computers can interpret concrete syntax as well. This is the place where parsing enters the picture: computing an abstract representation from a concrete one is precisely what parsers are used for.
Consider the functions parens and nesting of Section 3.3.1 again.

open  = symbol '('
close = symbol ')'
parens :: Parser Char Parentheses
parens = f <$> open <*> parens <*> close <*> parens
<|> succeed Empty
where f a b c d = Match b d
nesting :: Parser Char Int
nesting = f <$> open <*> nesting <*> close <*> nesting
<|> succeed 0
where f a b c d = max (1+b) d
Function nesting' could have been defined by means of function parens and a fold:

nesting' :: Parser Char Int
nesting' = depthParentheses <$> parens
(Remember that depthParentheses has been defined as a fold.) Functions nesting and nesting' compute exactly the same result. The function nesting is the fusion of the fold function with the parser parens from the function nesting'. Using laws for parsers and folds (which we have not and will not give) we can prove that the two functions are equal.
Note that function nesting' first builds a tree by means of function parens, and then flattens it by means of the fold. Function nesting never builds a tree, and is thus preferable for reasons of efficiency. On the other hand: in function nesting' we reuse the parser parens and the function depthParentheses; in function nesting we have to write our own parser, and convince ourselves that it is correct. So for reasons of programming efficiency function nesting' is preferable. To obtain the best of both worlds, we would like to write function nesting' and have our compiler figure out that it is better to use function nesting in computations. The automatic transformation of function nesting' into function nesting is called deforestation (trees are removed). Some (very few) compilers are clever enough to perform this transformation automatically.
6.2.3 Expression trees
The matching parentheses grammar has only one nonterminal. Therefore its ab-
stract syntax is described by a single datatype. In this section we look again at the
expression grammar of section 3.5.
E T
E E + T
T F
T T * F
F (E)
F Digs
This grammar has three nonterminals, E, T and F. Using the approach from Section
2.6 we transform the nonterminals to datatypes:
data E = E1 T | E2 E T
data T = T1 F | T2 T F
data F = F1 E | F2 Int
where we have translated Digs by the type Int. Note that this is a rather inconvenient and clumsy abstract syntax for expressions; the following abstract syntax is more convenient.
data Expr = Con Int | Add Expr Expr | Mul Expr Expr
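Following the recipe used for lists and binary trees above, the algebra type and fold function for this convenient datatype would be as follows (a sketch; note that Section 6.4 reuses the names ExprAlgebra and foldExpr for a different datatype):

type ExprAlgebra a = (Int -> a      -- con
                     ,a -> a -> a   -- add
                     ,a -> a -> a)  -- mul

foldExpr :: ExprAlgebra a -> Expr -> a
foldExpr (con,add,mul) = fold where
  fold (Con n)   = con n
  fold (Add l r) = add (fold l) (fold r)
  fold (Mul l r) = mul (fold l) (fold r)

For example, foldExpr (id,(+),(*)) is then an evaluator of type Expr -> Int.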
However, to illustrate the concept of mutually recursive datatypes, we will study the datatypes E, T, and F defined above. Since E uses T, T uses F, and F uses E, these three types are mutually recursive. The main datatype of the three datatypes is the one corresponding to the start-symbol E. Since the datatypes are mutually recursive, the algebra type EAlgebra consists of three tuples of functions and three carriers (the main carrier is, as always, the one corresponding to the main datatype and is therefore the one corresponding to the start-symbol).
type EAlgebra e t f = ((t -> e,e -> t -> e)
,(f -> t,t -> f -> t)
,(e -> f,Int -> f)
)
The fold function foldE for E also folds over T and F, so it uses three mutually
recursive local functions.
foldE :: EAlgebra e t f -> E -> e
foldE ((e1,e2),(t1,t2),(f1,f2)) = fold where
fold (E1 t) = e1 (foldT t)
fold (E2 e t) = e2 (fold e) (foldT t)
foldT (T1 f) = t1 (foldF f)
foldT (T2 t f) = t2 (foldT t) (foldF f)
foldF (F1 e) = f1 (fold e)
foldF (F2 n) = f2 n
We can now use foldE to write a syntax driven expression evaluator evalE. In the
algebra that is used in the foldE, all type variables e, f, and t are instantiated with
Int.
evalE :: E -> Int
evalE = foldE ((id,(+)),(id,(*)),(id,id))
exE = E2 (E1 (T2 (T1 (F2 2)) (F2 3))) (T1 (F2 1))
? evalE exE
7
Once again our example shows that abstract syntax cannot easily be interpreted by
human beings. Here is a function a2cE which does this job for us.
a2cE :: E -> String
a2cE = foldE ((e1,e2),(t1,t2),(f1,f2))
where e1 = \t -> t
e2 = \e t -> e++"+"++t
t1 = \f -> f
t2 = \t f -> t++"*"++f
f1 = \e -> "("++e++")"
f2 = \n -> show n
? a2cE exE
"2*3+1"
6.2.4 General trees
A general tree consists of a node, holding a value, where the tree splits into a list of subtrees. Notice that this list may be empty (in which case, of course, only the value at the node is of interest). As usual, the type TreeAlgebra and the function foldTree can be generated automatically from the data definition.
data Tree x = Node x [Tree x]
type TreeAlgebra x a = x -> [a] -> a
foldTree :: TreeAlgebra x a -> Tree x -> a
foldTree node = fold where
fold (Node x gts) = node x (map fold gts)
Notice that the constructor Node has a list of recursive arguments. Therefore the node function of the algebra has a corresponding list of recursive arguments. The local fold function is recursively called on all elements of a list using the map function.
One can compute the sum of the values at the nodes of a general tree as follows:
sumTree :: Tree Int -> Int
sumTree = foldTree (\x xs -> x + sum xs)
Computing the product of the values at the nodes of a general tree and computing
the size of a general tree can be done in a similar way.
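For instance (sketches in the same style; the names are ours):

prodTree :: Tree Int -> Int
prodTree = foldTree (\x xs -> x * product xs)

sizeTree :: Tree x -> Int
sizeTree = foldTree (\_ xs -> 1 + sum xs)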
6.2.5 Efficiency
A fold takes a value of a datatype, and replaces its constructors by functions. If the evaluation of each of these functions on their arguments takes constant time, evaluation of the fold takes time linear in the number of constructors in its argument. However, some functions require more than constant evaluation time. For example, list concatenation is linear in its left argument, and it follows that if we define the function reverse by

reverse :: [a] -> [a]
reverse = foldL (\x xs -> xs ++ [x],[])

then function reverse takes time quadratic in the length of its argument list. So, folds are often efficient functions, but if the functions in the algebra are not constant, the fold is usually not linear. Often such a nonlinear fold can be transformed into a more efficient function. A technique that often can be used in such a case is the accumulating parameter technique. For example, for the reverse function we have

reverse x = reverse' x []

reverse' :: [a] -> [a] -> [a]
reverse' [] ys     = ys
reverse' (x:xs) ys = reverse' xs (x:ys)

The evaluation of reverse' xs ys takes time linear in the length of xs.
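Incidentally, the accumulating-parameter version is itself an instance of the standard function foldl (a one-line sketch, with a fresh name):

reverseL :: [a] -> [a]
reverseL = foldl (flip (:)) []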
Exercise 6.1  Define an algebra type and a fold function for the following datatype.

data LNTree a b = Leaf a
                | Node (LNTree a b) b (LNTree a b)
Exercise 6.2  Define the following functions as folds on the datatype BinTree, see Section 6.2.1.
1. height, which returns the height of a tree.
2. flatten, which returns the list of leaf values in left-to-right order.
3. maxBinTree, which returns the maximal value at the leaves.
4. sp, which returns the length of a shortest path.
5. mapBinTree, which maps a function over the elements at the leaves.

Exercise 6.3  A path through a binary tree describes the route from the root of the tree to some leaf. We choose to represent paths by sequences of Directions:

data Direction = Left | Right

in such a way that taking the left subtree in an internal node will be encoded by Left and taking the right subtree will be encoded by Right. Define a compositional function allPaths which produces all paths of a given tree. Define this function first using explicit recursion, and then using a fold.
Exercise 6.4  This exercise deals with resistances. There are some basic resistances with a fixed (floating point) capacity and, given two resistances, they can be put in parallel (:|:) or in sequence (:*:).
1. Define the datatype Resist to represent resistances. Also, define the datatype ResistAlgebra and the corresponding function foldResist.
2. Define a compositional function result which determines the capacity of a resistance. (Recall the rules 1/r = 1/r1 + 1/r2 and r = r1 + r2.)
6.3 Algebraic semantics
With every (possibly mutually recursive) datatype one can associate an algebra type and a fold function. The algebra is a tuple (one component for each datatype) of tuples (one component for each constructor of the datatype) of semantic actions. The algebra uses a set of auxiliary carriers (one for each datatype). One of them (the one corresponding to the main datatype) is the main carrier of the algebra. The fold function recursively replaces syntactic constructors of the datatypes by corresponding semantic actions of the algebra. Functions which are defined in terms of a fold function and an algebra are called compositional functions. There is one special algebra: the one whose components are the constructor functions of the mutually recursive datatypes. This algebra defines the identity function on the datatype. The compositional function that corresponds to an algebra is the unique, so-called, algebra homomorphism from the datatype to the given algebra. Therefore the datatype is often called the initial algebra and compositional functions are said to define algebraic semantics. We summarise this important statement as follows.
algebraicSemantics :: InitialAlgebra -> Algebra
6.4 Expressions
The first part of this section presents a basic expression evaluator. The evaluator is extended with variables in the second part and with local definitions in the third part.

6.4.1 Evaluating expressions
In this subsection we start with a more involved example: an expression evaluator. We will use another datatype for expressions than the one introduced in Section 6.2.3: here we will use a single, and hence non mutually recursive, datatype for expressions. We restrict ourselves to float valued expressions on which addition, subtraction, multiplication and division are defined. The datatype and the corresponding algebra type and fold function are as follows:
infixl 7 `Mul`
infix  7 `Dvd`
infixl 6 `Add`, `Min`

data Expr = Expr `Add` Expr
          | Expr `Min` Expr
          | Expr `Mul` Expr
          | Expr `Dvd` Expr
          | Num Float

type ExprAlgebra a = (a->a->a   -- add
                     ,a->a->a   -- min
                     ,a->a->a   -- mul
                     ,a->a->a   -- dvd
                     ,Float->a) -- num

foldExpr :: ExprAlgebra a -> Expr -> a
foldExpr (add,min,mul,dvd,num) = fold where
  fold (expr1 `Add` expr2) = fold expr1 `add` fold expr2
  fold (expr1 `Min` expr2) = fold expr1 `min` fold expr2
  fold (expr1 `Mul` expr2) = fold expr1 `mul` fold expr2
  fold (expr1 `Dvd` expr2) = fold expr1 `dvd` fold expr2
  fold (Num n)             = num n
There is nothing special to notice about these definitions except, perhaps, the fact that Expr does not have an extra parameter x like the list and tree examples. Computing the result of an expression now simply consists of replacing the constructors by appropriate functions.
resultExpr :: Expr -> Float
resultExpr = foldExpr ((+),(-),(*),(/),id)
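For example (a session sketch; by the fixity declarations above, `Mul` binds stronger than `Add`):

? resultExpr (Num 2 `Mul` Num 3 `Add` Num 1)
7.0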
6.4.2 Adding variables
Our next goal is to extend the evaluator of the previous subsection such that it can handle variables as well. The values of variables are typically looked up in an environment which binds the names of the variables to values. We implement an environment as a list of name-value pairs. For our purposes names are strings and values are floats. In the following programs we will use the following functions and types:
type Env name value = [(name,value)]
(?) :: Eq name => Env name value -> name -> value
env ? x = head [ v | (y,v) <- env, x == y]
type Name = String
type Value = Float
The datatype and the corresponding algebra type and eval function are now as follows. Note that we use the same name (Expr) for the datatype, although it differs from the previous Expr datatype.
data Expr = Expr `Add` Expr
          | Expr `Min` Expr
          | Expr `Mul` Expr
          | Expr `Dvd` Expr
          | Num Value
          | Var Name

type ExprAlgebra a = (a->a->a   -- add
                     ,a->a->a   -- min
                     ,a->a->a   -- mul
                     ,a->a->a   -- dvd
                     ,Value->a  -- num
                     ,Name->a)  -- var

foldExpr :: ExprAlgebra a -> Expr -> a
foldExpr (add,min,mul,dvd,num,var) = fold where
  fold (expr1 `Add` expr2) = fold expr1 `add` fold expr2
  fold (expr1 `Min` expr2) = fold expr1 `min` fold expr2
  fold (expr1 `Mul` expr2) = fold expr1 `mul` fold expr2
  fold (expr1 `Dvd` expr2) = fold expr1 `dvd` fold expr2
  fold (Num n)             = num n
  fold (Var x)             = var x
Expr now has an extra constructor: the unary constructor Var. Similarly, the argument of foldExpr now has an extra component: the unary function var which corresponds to the unary constructor Var. Computing the result of an expression somehow needs to use an environment. Here is a first, bad way of doing this: one can use it as an argument of a function that computes an algebra (we will explain why this is a bad choice in the next subsection; the basic idea is that we use the environment as a global variable here).

resultExprBad :: Env Name Value -> Expr -> Value
resultExprBad env = foldExpr ((+),(-),(*),(/),id,(env ?))

?resultExprBad [("x",3)] (Var "x" `Mul` Num 2)
6
The good way of using an environment is the following: instead of working with a computation which, given an environment, yields an algebra of values, it is better to turn the computation itself into an algebra. Thus we turn the environment into a local variable.
(<+>),(<->),(<*>),(</>) :: (Env Name Value -> Value) ->
(Env Name Value -> Value) ->
(Env Name Value -> Value)
f <+> g = \env -> f env + g env
f <-> g = \env -> f env - g env
f <*> g = \env -> f env * g env
f </> g = \env -> f env / g env
resultExprGood :: Expr -> (Env Name Value -> Value)
resultExprGood =
foldExpr ((<+>),(<->),(<*>),(</>),const,flip (?))
?resultExprGood (Var "x" `Mul` Num 2) [("x",3)]
6
The actions ((+), (-), ...) on values are now replaced by corresponding actions ((<+>), (<->), ...) on computations. Computing the result of the sum of two subexpressions within a given environment consists of computing the result of the subexpressions within this environment and adding both results to yield a final result. Computing the result of a constant does not need the environment at all. Computing the result of a variable consists of looking it up in the environment. Thus, the algebraic semantics of an expression is a computation which yields a value. This important statement can be summarised as follows.

algebraicSemantics :: InitialAlgebra -> Compute Value

In this case the computation is of the form env -> val. The value type is an example of a synthesised attribute. The value of an expression is synthesised from values of its subexpressions. The environment type is an example of an inherited attribute. The environment which is used by the computation of a subexpression of an expression is inherited from the computation of the expression. Since we are working with abstract syntax we say that the synthesised and inherited attributes are attributes of the datatype Expr. If Expr is one of the mutually recursive datatypes which are generated from the nonterminals of a grammar, then we say that the synthesised and inherited attributes are attributes of the nonterminal.
6.4.3 Adding definitions
Our next goal is to extend the evaluator of the previous subsection such that it can handle definitions as well. A definition is an expression of the form Def name expr1 expr2, which should be interpreted as: let the value of name be equal to expr1 in expression expr2. Variables are typically defined by updating the environment with an appropriate name-value pair.
The datatype (called Expr again) and the corresponding algebra type and eval function are now as follows:
data Expr = Expr `Add` Expr
          | Expr `Min` Expr
          | Expr `Mul` Expr
          | Expr `Dvd` Expr
          | Num Value
          | Var Name
          | Def Name Expr Expr

type ExprAlgebra a = (a->a->a         -- add
                     ,a->a->a         -- min
                     ,a->a->a         -- mul
                     ,a->a->a         -- dvd
                     ,Value->a        -- num
                     ,Name->a         -- var
                     ,Name->a->a->a)  -- def

foldExpr :: ExprAlgebra a -> Expr -> a
foldExpr (add,min,mul,dvd,num,var,def) = fold where
  fold (expr1 `Add` expr2) = fold expr1 `add` fold expr2
  fold (expr1 `Min` expr2) = fold expr1 `min` fold expr2
  fold (expr1 `Mul` expr2) = fold expr1 `mul` fold expr2
  fold (expr1 `Dvd` expr2) = fold expr1 `dvd` fold expr2
  fold (Num n)             = num n
  fold (Var x)             = var x
  fold (Def x value body)  = def x (fold value) (fold body)
Expr now has an extra constructor: the ternary constructor Def, which can be used
to introduce a local variable. For example, the following expression can be used to
compute the number of seconds per year.
seconds = Def "days_per_year"      (Num 365) (
          Def "hours_per_day"      (Num 24) (
          Def "minutes_per_hour"   (Num 60) (
          Def "seconds_per_minute" (Num 60) (
          Var "days_per_year"      `Mul`
          Var "hours_per_day"      `Mul`
          Var "minutes_per_hour"   `Mul`
          Var "seconds_per_minute" ))))
Similarly, the parameter of foldExpr now has an extra component: the ternary function def which corresponds to the ternary constructor Def. Notice that the last two arguments are recursive ones. We can now explain why the first use of environments is inferior to the second one. Trying to extend the first definition gives something like:
resultExprBad :: Env Name Value -> Expr -> Value
resultExprBad env =
foldExpr ((+),(-),(*),(/),id,(env ?),error "def")
The last component causes a problem: a body that contains a local definition has to be evaluated in an updated environment. We cannot update the environment in this setting: we can read the environment but afterwards it is not accessible any more in the algebra (which consists of values). Extending the second definition causes no problems: the environment is now accessible in the algebra (which consists of computations). We can easily add a new action which updates the environment. The computation corresponding to the body of an expression with a local definition can now be evaluated in the updated environment.
f <+> g = \env -> f env + g env
f <-> g = \env -> f env - g env
f <*> g = \env -> f env * g env
f </> g = \env -> f env / g env
x <:=> f = \g env -> g ((x,f env):env)
resultExprGood :: Expr -> (Env Name Value -> Value)
resultExprGood =
foldExpr ((<+>),(<->),(<*>),(</>),const,flip (?),(<:=>))
?resultExprGood seconds []
31536000
Note that by consing a pair (x,y) onto an environment (in the definition of the operator <:=>), we add the pair to the environment. By definition of (?), the binding for x hides possible other bindings for x.
6.4.4 Compiling to a stack machine
In this section we compile expressions to instructions on a stack machine. We can
then use this stack machine to evaluate compiled expressions. This section is inspired
by an example in [3].
Imagine a simple computer for evaluating arithmetic expressions. This computer
has a stack and can execute instructions which change the value of the stack. The
class of possible instructions is defined by the following datatype.
data MachInstr v = Push v | Apply (v -> v -> v)
type StackMachine v = [MachInstr v]
An instruction either pushes a value of type v on the stack, or it executes an operator
that takes the two top values of the stack, applies the operator, and pushes the result
back on the stack. A stack (a value of type Stack v for some value type v) is a
list of values, from which you can pop values, on which you can push values, and
from which you can take the top value. A module Stack for stacks can be found in
appendix A. The effect of executing an instruction of type MachInstr is defined by
execute :: MachInstr v -> Stack v -> Stack v
execute (Push x) s = push x s
execute (Apply op) s = let a = top s
t = pop s
b = top t
u = pop t
in push (op a b) u
A sequence of instructions is executed by the function run defined by
run :: StackMachine v -> Stack v -> Stack v
run [] s = s
run (x:xs) s = run xs (execute x s)
It follows that run can be defined as a foldl.
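Such a definition could look as follows (a sketch; the name runFoldl is ours, and the commented-out Stack operations only record the list-based behaviour assumed here, which may differ from the actual module in appendix A):

-- run as a foldl: flip execute threads the stack through the
-- instruction list from left to right.
runFoldl :: StackMachine v -> Stack v -> Stack v
runFoldl instrs s = foldl (flip execute) s instrs

-- Assumed behaviour of the Stack operations used by execute:
--   type Stack v = [v]
--   push x s = x : s
--   top s    = head s
--   pop s    = tail s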
An expression can be translated (or compiled) into a list of instructions by the function compileExpr, defined by:
compileExpr :: Expr ->
               Env Name (StackMachine Value) ->
               StackMachine Value
compileExpr = foldExpr (add,min,mul,dvd,num,var,def) where
  f `add` g   = \env -> f env ++ g env ++ [Apply (+)]
  f `min` g   = \env -> f env ++ g env ++ [Apply (-)]
  f `mul` g   = \env -> f env ++ g env ++ [Apply (*)]
  f `dvd` g   = \env -> f env ++ g env ++ [Apply (/)]
  num v       = \env -> [Push v]
  var x       = \env -> env ? x
  def x fd fb = \env -> fb ((x,fd env):env)
We now have for all expressions e, environments env, and stacks s:
top (run (compileExpr e env) s) = resultExprGood e env
The proof of this equality, which is by induction on expressions, is omitted.
Exercise 6.5  Define the following functions as folds on the datatype Expr that contains definitions.
1. isSum, which determines whether or not an expression is a sum.
2. vars, which returns the list of variables that occur in the expression.

Exercise 6.6  This exercise deals with expressions without definitions. The function der is defined by

der :: Expr -> String -> Expr
der (e1 `Add` e2) dx = der e1 dx `Add` der e2 dx
der (e1 `Min` e2) dx = der e1 dx `Min` der e2 dx
der (e1 `Mul` e2) dx = e1 `Mul` (der e2 dx) `Add` (der e1 dx) `Mul` e2
der (e1 `Dvd` e2) dx = (e2 `Mul` (der e1 dx)
                        `Min` e1 `Mul` (der e2 dx))
                       `Dvd` (e2 `Mul` e2)
der (Num f)       dx = Num 0.0
der (Var s)       dx = if s == dx then Num 1.0 else Num 0.0

1. Give an informal description of the function der.
2. Why is the function der not compositional?
3. Define a datatype Exp to represent expressions consisting of (floating point) constants, variables, addition and subtraction. Also, define the type ExpAlgebra and the corresponding foldExp.
4. Define the function der on Exp and show that this function is compositional.
Exercise 6.7  Define the function replace, which given a binary tree and an element m replaces the elements at the leaves by m, as a fold on the datatype BinTree, see Section 6.2.1. It is easy to write a function with the required functionality if you swap the arguments, but then it is impossible to write replace as a fold. Note that the fold returns a function, which when given m replaces all the leaves by m.

Exercise 6.8  Consider the datatype of paths introduced in Exercise 6.3. A path in a tree leads to a unique leaf. Define a compositional function path2Value which, given a tree and a path in the tree, yields the element at the unique leaf.
6.5 Block structured languages
This section presents a more complex example of the use of tuples in combination
with compositionality. The example deals with the scope of variables in a block
structured language. A variable from a global scope is visible in a local scope only
if it is not hidden by a variable with the same name in the local scope.
6.5.1 Blocks
A block is a list of statements. A statement is a variable declaration, a variable
usage or a nested block. The concrete representation of an example block of our
block structured language looks as follows (dcl stands for declaration, use stands
for usage and x, y and z are variables).
use x ; dcl x ;
(use z ; use y ; dcl x ; dcl z ; use x) ;
dcl y ; use y
Statements are separated by semicolons. Nested blocks are surrounded by parentheses. The usage of z refers to the local declaration (the only declaration of z). The usage of y refers to the global declaration (the only declaration of y). The local usage of x refers to the local declaration and the global usage of x refers to the global declaration. Note that it is allowed to use variables before they are declared.
Here are some mutually recursive (data)types, which describe the abstract syntax of blocks, corresponding to the grammar that describes the concrete syntax of blocks which is used above. We use meaningful names for data constructors and we use built-in lists instead of user-defined lists for the block algebra. As usual, the algebra type BlockAlgebra, which consists of two tuples of functions, and the fold function foldBlock, which uses two mutually recursive local functions, can be generated from the two mutually recursive (data)types.
type Block = [Statement]
data Statement = Dcl Idf | Use Idf | Blk Block
type Idf = String
type BlockAlgebra b s = ((s -> b -> b,b)
,(Idf -> s,Idf -> s,b -> s)
)
foldBlock :: BlockAlgebra b s -> Block -> b
foldBlock ((cons,empty),(dcl,use,blk)) = fold where
fold (s:b) = cons (foldS s) (fold b)
fold [] = empty
foldS (Dcl x) = dcl x
foldS (Use x) = use x
foldS (Blk b) = blk (fold b)
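For example, the example block at the beginning of this section is represented by the following value of type Block (a sketch; the name exampleBlock is ours):

exampleBlock :: Block
exampleBlock =
  [Use "x", Dcl "x"
  ,Blk [Use "z", Use "y", Dcl "x", Dcl "z", Use "x"]
  ,Dcl "y", Use "y"
  ]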
6.5.2 Generating code
The goal of this section is to generate code from a block. The code consists of a sequence of instructions. There are three types of instructions.

• Enter (l,c): enters the l-th nested block, in which c local variables are declared.
• Leave (l,c): leaves the l-th nested block, in which c local variables were declared.
• Access (l,c): accesses the c-th variable of the l-th nested block.

The code generated for the above example looks as follows.
[Enter (0,2),Access (0,0)
,Enter (1,2),Access (1,1),Access (0,1),Access (1,0),Leave (1,2)
,Access (0,1),Leave (0,2)
]
Note that we start numbering levels (l) and counts (c) (which are sometimes called
displacements) from 0. The abstract syntax of the code to be generated is described
by the following datatype.
type Count = Int
type Level = Int
type Variable = (Level,Count)
type BlockInfo = (Level,Count)
data Instruction = Enter BlockInfo
| Leave BlockInfo
| Access Variable
type Code = [Instruction]
The function ab2ac, which generates abstract code (a value of type Code) from an abstract block (a value of type Block), uses a compositional function block2Code. For all syntactic constructs of Statement and Block we define appropriate semantic actions on an algebra of computations. Here is a, somewhat simplified, description of these semantic actions.
Dcl: Every time we declare a local variable x we have to update the local
environment le of the block we are in by associating with x the current
level-local-count pair (l,lc). Moreover, we have to increment the local variable
count lc to lc+1. Note that we do not generate any code for a declaration
statement. Instead we perform some computations which make it possible to
generate appropriate code for other statements.

dcl x (le,l,lc) = (le',lc') where
  le' = le `update` (x,(l,lc))
  lc' = lc+1

where function update is defined in the AssociationList module.
Use: Every time we use a local variable x we have to generate code cd for it.
This code is of the form [Access (l,c)]. The level-local-count pair (l,c)
of the variable is looked up in the global environment e.

use x e = cd where
  cd    = [Access (l,c)]
  (l,c) = e ? x
Blk: Every time we enter a nested block we increment the global level l to l' = l+1,
start with a fresh local variable count 0 and set the local environment of the
nested block we enter to the current global environment e. The computation
for the nested block results in a local variable count lcB and a local
environment leB. Furthermore we need to make sure that the global environment
(the one in which we look up variables) which is used by the computation for
the nested block is equal to leB. The code which is generated for the block is
surrounded by an appropriate [Enter (l',lcB)]-[Leave (l',lcB)] pair.

blk fB (e,l) = cd where
  l'            = l+1
  (leB,lcB,cdB) = fB (leB,l',e,0)
  cd            = [Enter (l',lcB)] ++ cdB ++ [Leave (l',lcB)]
[]: No action needs to be performed for an empty block.
(:): For every nonempty block we perform the computation of the first statement
of the block which, given a local environment le and local variable count
lc, results in a local environment leS and local variable count lcS. This
environment-count pair is then used by the computation of the rest of the
block, resulting in a local environment le' and local variable count lc'. The
code cd which is generated is the concatenation cdS++cdB of the code cdS
which is generated for the first statement and the code cdB which is generated
for the rest of the block.

cons fS fB (le,lc) = (le',lc',cd) where
  (leS,lcS,cdS) = fS (le,lc)
  (le',lc',cdB) = fB (leS,lcS)
  cd            = cdS ++ cdB
What does our actual computation type look like? For dcl we need three inherited
attributes: a global level, a local block environment and a local variable count.
Two of them, the local block environment and the local variable count, are also
synthesised attributes. For use we need one inherited attribute: a global block
environment, and we compute one synthesised attribute: the generated code. For
blk we need two inherited attributes: a global block environment and a global
level, and we compute two synthesised attributes: the local variable count and the
generated code. Moreover there is one extra attribute: a local block environment
which is both inherited and synthesised. When processing the statements of a nested
block we already make use of the global block environment which we are synthesising
(when looking up variables). For cons we compute three synthesised attributes: the
local block environment, the local variable count and the generated code. Two of
them, the local block environment and the local variable count, are also needed as
inherited attributes. It is clear from the considerations above that the following
types fulfil our needs.
type BlockEnv = [(Idf,Variable)]
type GlobalEnv = (BlockEnv,Level)
type LocalEnv = (BlockEnv,Count)
The implementation of block2Code is now a straightforward translation of the ac-
tions described above. Attributes which are not mentioned in those actions are
added as extra components which do not contribute to the functionality.
block2Code :: Block -> GlobalEnv -> LocalEnv -> (LocalEnv,Code)
block2Code = foldBlock ((cons,empty),(dcl,use,blk)) where
  cons fS fB (e,l) (le,lc) = ((le',lc'),cd) where
    ((leS,lcS),cdS) = fS (e,l) (le,lc)
    ((le',lc'),cdB) = fB (e,l) (leS,lcS)
    cd              = cdS ++ cdB
  empty (e,l) (le,lc) = ((le,lc),[])
  dcl x (e,l) (le,lc) = ((le',lc'),[]) where
    le' = (x,(l,lc)):le
    lc' = lc+1
  use x (e,l) (le,lc) = ((le,lc),cd) where
    cd    = [Access (l,c)]
    (l,c) = e ? x
  blk fB (e,l) (le,lc) = ((le,lc),cd) where
    ((leB,lcB),cdB) = fB (leB,l') (e,0)
    l'              = l+1
    cd              = [Enter (l',lcB)] ++ cdB ++ [Leave (l',lcB)]
The code generator starts with an empty local environment, a fresh level and a fresh
local variable count. The code is a synthesised attribute. The global environment
is an attribute which is both inherited and synthesised. When processing a block
we already use the global environment which we are synthesising (when looking up
variables).
ab2ac :: Block -> Code
ab2ac b = [Enter (0,c)] ++ cd ++ [Leave (0,c)] where
  ((e,c),cd) = block2Code b (e,0) ([],0)
aBlock = [Use "x",Dcl "x"
         ,Blk [Use "z",Use "y",Dcl "x",Dcl "z",Use "x"]
         ,Dcl "y",Use "y"]
? ab2ac aBlock
[Enter (0,2),Access (0,0)
,Enter (1,2),Access (1,1),Access (0,1),Access (1,0),Leave (1,2)
,Access (0,1),Leave (0,2)]
6.6 Exercises
Exercise 6.9 Consider your answer to exercise 2.21, which gives an abstract syntax for palindromes.
1. Define a type PalAlgebra that describes the type of the semantic actions that correspond to the syntactic constructs of Pal.
2. Define the function foldPal, which describes how the semantic actions that correspond to the syntactic constructs of Pal should be applied.
3. Define the functions a2cPal and aCountPal as foldPals.
4. Define the parser pfoldPal, which interprets its input in an arbitrary semantic PalAlgebra without building the intermediate abstract syntax tree.
5. Describe the parsers pfoldPal m1 and pfoldPal m2 where m1 and m2 correspond to the algebras of a2cPal and aCountPal respectively.
□
Exercise 6.10 Consider your answer to exercise 2.22, which gives an abstract syntax for mirror-palindromes.
1. Define the type MirAlgebra that describes the semantic actions that correspond to the syntactic constructs of Mir.
2. Define the function foldMir, which describes how semantic actions that correspond to the syntactic constructs of Mir should be applied.
3. Define the functions a2cMir and m2pMir as foldMirs.
4. Define the parser pfoldMir, which interprets its input in an arbitrary semantic MirAlgebra without building the intermediate abstract syntax tree.
5. Describe the parsers pfoldMir m1 and pfoldMir m2 where m1 and m2 correspond to the algebras of a2cMir and m2pMir, respectively.
□
Exercise 6.11 Consider your answer to exercise 2.23, which gives an abstract syntax for parity-sequences.
1. Define the type ParityAlgebra that describes the semantic actions that correspond to the syntactic constructs of Parity.
2. Define the function foldParity, which describes how the semantic actions that correspond to the syntactic constructs of Parity should be applied.
3. Define the function a2cParity as foldParity.
□
Exercise 6.12 Consider your answer to exercise 2.24, which gives an abstract syntax for bit-lists.
1. Define the type BitListAlgebra that describes the semantic actions that correspond to the syntactic constructs of BitList.
2. Define the function foldBitList, which describes how the semantic actions that correspond to the syntactic constructs of BitList should be applied.
3. Define the function a2cBitList as a foldBitList.
4. Define the parser pfoldBitList, which interprets its input in an arbitrary semantic BitListAlgebra without building the intermediate abstract syntax tree.
□
Exercise 6.13 The following grammar describes the concrete syntax of a simple block-structured programming language.
B → SR          Block
R → ;SR | ε     Rest
S → D | U | N   Statement
D → x | y       Declaration
U → X | Y       Usage
N → (B)         Nested Block
1. Define a datatype Block that describes the abstract syntax that corresponds to the grammar. What is the abstract representation of x;(y;Y);X?
2. Define the type BlockAlgebra that describes the semantic actions that correspond to the syntactic constructs of Block.
3. Define the function foldBlock, which describes how the semantic actions corresponding to the syntactic constructs of Block should be applied.
4. Define the function a2cBlock, which converts an abstract block into a concrete one. Write a2cBlock as a foldBlock.
5. Define the function checkBlock, which tests whether or not each variable of a given abstract block is declared before use (declared in the same or in a surrounding block).
□
Chapter 7
Computing with parsers
Parsers produce results. For example, the parsers for travelling schemes given in
Chapter 4 return an abstract syntax, or an integer that represents the net travelling
time in minutes. The net travelling time is computed directly by inserting the
correct semantic functions. Another way to compute the net travelling time is by
first computing the abstract syntax, and then applying a function to the abstract
syntax that computes the net travelling time. This section shows several ways to
compute results using parsers:
- insert a semantic function in the parser;
- apply a fold to the abstract syntax;
- use a class instead of abstract syntax;
- pass an algebra to the parser.
7.1 Insert a semantic function in the parser
In Chapter 4 we have defined two parsers: a parser that computes the abstract
syntax for a travelling schema, and a parser that computes the net travelling time.
These functions are obtained by inserting different functions in the basic parser. If
we want to compute the total travelling time, we have to insert different functions
in the basic parser. This approach works fine for a small parser, but it has some
disadvantages when building a larger parser:
- semantics is intertwined with the parsing process;
- it is difficult to locate all positions where semantic functions have to be inserted in the parser.
7.2 Apply a fold to the abstract syntax
Instead of inserting operations in a basic parser, we can write a parser that parses
the input to an abstract syntax, and computes the desired result by applying a fold
to the abstract syntax.
An example of such an approach has been given in Section 6.2.2, where we defined
two functions with the same functionality: nesting and nesting'; both compute
the maximum nesting depth in a string of parentheses. Function nesting is defined
by inserting functions in the basic parser. Function nesting' is defined by applying
a fold to the abstract syntax. Each of these definitions has its own merits; we repeat
the main arguments below.
parens :: Parser Char Parentheses
parens = (\_ b _ d -> Match b d) <$>
         open <*> parens <*> close <*> parens
         <|> succeed Empty

nesting :: Parser Char Int
nesting = (\_ b _ d -> max (1+b) d) <$>
          open <*> nesting <*> close <*> nesting
          <|> succeed 0

nesting' :: Parser Char Int
nesting' = depthParentheses <$> parens
The first definition (nesting) is more efficient, because it does not build an intermediate
abstract syntax tree. On the other hand, it might be more difficult to write
because we have to insert functions in the correct places in the basic parser. The
advantage of the second definition (nesting') is that we reuse both the parser parens,
which returns an abstract syntax tree, and the function depthParentheses (or the
function foldParentheses, which is used in the definition of depthParentheses),
which does recursion over an abstract syntax tree. The only thing we have to write
ourselves in the second definition is the depthParenthesesAlgebra. The disadvantage
of the second definition is that it builds an intermediate abstract syntax tree,
which is flattened by the fold. We want to avoid building the abstract syntax
tree altogether. To obtain the best of both worlds, we would like to write function
nesting' and have our compiler figure out that it is better to use function nesting
in computations. The automatic transformation of function nesting' into function
nesting is called deforestation (trees are removed). Some (very few) compilers are
clever enough to perform this transformation automatically.
7.3 Deforestation
Deforestation removes intermediate trees in computations. The previous section
gives an example of deforestation on the datatype Parentheses. This section
sketches the general idea.
Suppose we have a datatype AbstractTree
data AbstractTree = ...
From this datatype we construct an algebra and a fold, see Chapter 6.
type AbstractTreeAlgebra a = ...
foldAbstractTree :: AbstractTreeAlgebra a -> AbstractTree -> a
A parser for the datatype AbstractTree (which returns a value of AbstractTree)
has the following type:
parseAbstractTree :: Parser Symbol AbstractTree
where Symbol is some type of input symbols (for example Char). Suppose now that
we define a function p that parses an AbstractTree, and then computes some value
by folding with an algebra f over this tree:
p = foldAbstractTree f . parseAbstractTree
Then deforestation says that p is equal to the function parseAbstractTree in which
occurrences of the constructors of the datatype AbstractTree have been replaced
by the corresponding components of the algebra f. The following two sections each
describe a way to implement such a deforestated function.
7.4 Using a class instead of abstract syntax
Classes can be used to implement the deforestated or fused computation of a fold
with a parser. This gives a solution of the desired efficiency.
For example, for the language of parentheses, we define the following class:

class Parens a where
  match :: a -> a -> a
  empty :: a
Note that the types of the functions in the class Parens correspond exactly to the two
types that occur in the type ParenthesesAlgebra. This class is used in a parser for
parentheses:
parens :: Parens a => Parser Char a
parens = (\_ b _ d -> match b d) <$>
         open <*> parens <*> close <*> parens
         <|> succeed empty
The algebra is implicit in this function: the only thing we know is that there exist
functions empty and match of the correct type; we know nothing about their imple-
mentation. To obtain a function parens that returns a value of type Parentheses
we create the following instance of the class Parens.
instance Parens Parentheses where
  match = Match
  empty = Empty
Now we can write:
?(parens :: Parser Char Parentheses) "()()"
[(Match Empty (Match Empty Empty), "")
,(Match Empty Empty, "()")
,(Empty, "()()")
]
Note that we have to supply the type of parens in this expression, otherwise Hugs
doesn't know which instance of Parens to use. This is how we turn the implicit
'class' algebra into an explicit 'instance' algebra. Another instance of Parens can
be used to compute the nesting depth of parentheses:
instance Parens Int where
  match b d = max (1+b) d
  empty     = 0
And now we can write:
?(parens :: Parser Char Int) "()()"
[(1, ""), (1, "()"), (0, "()()")]
So the answer depends on the type we want our function parens to have. This
also immediately shows a problem of this, otherwise elegant, approach: it does not
work if we want to compute two different results of the same type, because Haskell
doesn't allow you to define two (or more) instances with the same type. So once we
have defined the instance Parens Int as above, we cannot use function parens to
compute, for example, the width (also an Int) of a string of parentheses.
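A standard Haskell workaround, not pursued in the text, is to wrap one of the result types in a newtype, so that the two computations do get different types. A minimal sketch (the name Width is ours):

-- Width counts the number of top-level parenthesis pairs.
newtype Width = Width Int deriving Show

instance Parens Width where
  match (Width b) (Width d) = Width (d+1)  -- one more pair at this level
  empty                     = Width 0

Now (parens :: Parser Char Width) computes the width, while the instance Parens Int above still computes the nesting depth.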
7.5 Passing an algebra to the parser
The previous section shows how to implement a parser with an implicit algebra.
Since this approach fails when we want to define different parsers with the same
result type, we make the algebras explicit. Thus we obtain the following definition
of parens:

parens :: ParenthesesAlgebra a -> Parser Char a
parens (match,empty) = par where
  par = (\_ b _ d -> match b d) <$>
        open <*> par <*> close <*> par
        <|> succeed empty
Note that it is now easy to define different parsers with the same result type:

nesting, breadth :: Parser Char Int
nesting = parens (\b d -> max (1+b) d, 0)
breadth = parens (\b d -> d+1, 0)
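For example, in a hypothetical session (ours, not from the original text; the list of partial parses follows the same convention as the session shown in Section 7.4):

? nesting "(())"
[(2,""), (0,"(())")]
? breadth "(())"
[(1,""), (0,"(())")]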
Chapter 8
Programming with higher-order folds
introduction
In the previous chapters we have seen that algebras play an important role when
describing the meaning of a recognised structure (a parse tree). For each recursive
datatype T we have a function foldT, and for each constructor of the datatype we
have a corresponding function as a component in the algebra. Chapter 6 introduces
a language in which local declarations are permitted. Evaluating expressions in
this language can be done by choosing an appropriate algebra. The domain of
that algebra is a higher order (data)type (a (data)type that contains functions).
Unfortunately, the resulting code comes as a surprise to many. In this chapter
we will illustrate a related formalism, which will make it easier to construct such
involved algebras. This related formalism is the attribute grammar formalism. We
will not formally define attribute grammars, but instead illustrate the formalism
with some examples, and give an informal definition.
We start with developing a somewhat unconventional way of looking at functional
programs, and especially those programs that use functions that recursively descend
over datatypes a lot. In our case one may think about these datatypes as abstract
syntax trees. When computing a property of such a recursive object (for example, a
program) we define two sets of functions: one set that describes how to recursively
visit the nodes of the tree, and one set of functions (an algebra) that describes what
to compute at each node when visited.
One of the most important steps in this process is deciding what the carrier type
of the algebras is going to be. Once this step has been taken, these types are
a guideline for further design steps. We will see that such carrier types may be
functions themselves, and that deciding on the type of such functions may not always
be simple. In this chapter we will present a view on recursive computations that
will enable us to design the carrier type in an incremental way. We will do so by
constructing algebras out of other algebras. In this way we define the meaning of a
language in a semantically compositional way.
We will start with the rep min example, which looks a bit artificial, and deals with
a non-interesting, highly specific problem. However, it has been chosen for its
simplicity, and to not distract our attention to specific, programming language related,
semantic issues. The second example of this chapter demonstrates the techniques
on a larger example: a small compiler for part of a programming language.
data Tree = Leaf Int
          | Bin Tree Tree deriving Show

type TreeAlgebra a = (Int -> a, a -> a -> a)

foldTree :: TreeAlgebra a -> Tree -> a
foldTree alg@(leaf, _  ) (Leaf i)  = leaf i
foldTree alg@(_   , bin) (Bin l r) = bin (foldTree alg l) (foldTree alg r)
Listing 9: rm.start.hs
goals
In this chapter you will learn:
- how to write circular functional programs, or higher-order folds;
- how to combine algebras;
- (informally) the concept of an attribute grammar.
8.1 The rep min problem
One of the famous examples in which the power of lazy evaluation is demonstrated is
the so-called rep min problem [2]. Many have wondered how this program achieves
its goal, since at rst sight it seems that it is impossible to compute anything with
this program. We will use this problem, and a sequence of dierent solutions, to
build up an understanding of a whole class of such programs.
In listing 9 we present the datatype Tree, together with its associated algebra. The
carrier type of an algebra is the type that describes the objects of the algebra. We
represent it by a type parameter of the algebra type:
type TreeAlgebra a = (Int -> a, a -> a -> a)
The associated evaluation function foldTree systematically replaces the construc-
tors Leaf and Bin by corresponding operations from the algebra alg that is passed
as an argument.
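For instance (a small example of ours, not in the original text), the algebra (id, (+)) has carrier type Int and makes foldTree compute the sum of the leaf values:

-- Leaf values are kept as they are; Bin nodes add the sums of their subtrees.
sumTree :: Tree -> Int
sumTree = foldTree (id, (+))

so that sumTree (Bin (Leaf 1) (Leaf 7)) yields 8.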
We now want to construct a function rep_min :: Tree -> Tree that returns a
Tree with the same shape as its argument Tree, but with the values in its leaves
replaced by the minimal value occurring in the original tree. For example,
?rep_min (Bin (Bin (Leaf 1) (Leaf 7)) (Leaf 11))
Bin (Bin (Leaf 1) (Leaf 1)) (Leaf 1)
8.1.1 A straightforward solution
A straightforward solution to the rep min problem consists of a function in which
foldTree is used twice: once for computing the minimal value of the leaf values,
and once for constructing the resulting Tree. The function rep_min that solves
minAlg :: TreeAlgebra Int
minAlg = (id, min :: Int -> Int -> Int)

rep_min :: Tree -> Tree
rep_min t = foldTree repAlg t
  where m      = foldTree minAlg t
        repAlg = (const (Leaf m), Bin)
Listing 10: rm.sol1.hs
repAlg = ( \_ -> \m -> Leaf m
         , \lfun rfun -> \m -> let lt = lfun m
                                   rt = rfun m
                               in  Bin lt rt
         )

rep_min t = (foldTree repAlg t) (foldTree minAlg t)
Listing 11: rm.sol2.hs
the problem in this way is given in listing 10. Notice that the variable m is a global
variable of the repAlg-algebra, that is used in the tree constructing call of foldTree.
One of the disadvantages of this solution is that in the course of the computation
the pattern matching associated with the inspection of the tree nodes is performed
twice for each node in the tree.
Although this solution as such is no problem, we will try to construct a solution
that calls foldTree only once.
8.1.2 Lambda lifting
We want to obtain a program for the rep min problem in which pattern matching is
used only once. Program listing 11 is an intermediate step towards this goal. In this
program the global variable m has been removed and the second call of foldTree
does not construct a Tree anymore, but instead a function constructing a tree of type
Int -> Tree, which takes the computed minimal value as an argument. Notice how
we have emphasized the fact that a function is returned through some superfluous
notation: the first lambda in the function definitions constituting the algebra repAlg
is required by the signature of the algebra; the second lambda, which could have
been omitted, is there because the carrier set of the algebra contains functions of
type Int -> Tree. This process is done routinely by functional compilers and is
known as lambda-lifting.
8.1.3 Tupling computations
We are now ready to formulate a solution in which foldTree is called only once. Note
that in the last solution the two calls of foldTree don't interfere with each other.
infix 9 `tuple`

tuple :: TreeAlgebra a -> TreeAlgebra b -> TreeAlgebra (a,b)
(leaf1, bin1) `tuple` (leaf2, bin2) = (\i -> (leaf1 i, leaf2 i)
                                      ,\l r -> (bin1 (fst l) (fst r)
                                               ,bin2 (snd l) (snd r)
                                               )
                                      )

min_repAlg :: TreeAlgebra (Int, Int -> Tree)
min_repAlg = (minAlg `tuple` repAlg)

rep_min t = r m
  where (m, r) = foldTree min_repAlg t
Listing 12: rm.sol3.hs
As a consequence we may perform both the computation of the tree constructing
function and the minimal value in one go, by tupling the results of the computations.
The solution is given in listing 12. First a function tuple is defined. This function
takes two TreeAlgebras as arguments and constructs a third one, which has as its
carrier tuples of the carriers of the original algebras.
8.1.4 Merging tupled functions
In the next step we transform the type of the carrier set in the previous example,
(Int, Int->Tree), into an equivalent type Int -> (Int, Tree). This transfor-
mation is not essential here, but we use it to demonstrate that if we compute a
cartesian product of functions, we may transform that type into a new type in
which we compute only one function, which takes as its arguments the cartesian
product of all the arguments of the functions in the tuple, and returns as its result
the cartesian product of the result types. In our example the computation of the
minimal value may be seen as a function of type ()->Int. As a consequence the
argument of the new type is ((), Int), which is isomorphic to just Int, and the
result type becomes (Int, Tree).
We want to mention here too that the reverse is in general not true; given a function
of type (a, b) -> (c, d), it is in general not possible to split this function into
two functions of type a -> c and b -> d, which together achieve the same effect.
The new version is given in listing 13.
Notice how we have, in an attempt to make the different roles of the parameters
explicit, again introduced extra lambdas in the definition of the functions of the
algebra. The parameters after the second lambda are there because we construct
values in a higher order carrier set. The parameters after the first lambda are there
because we deal with a TreeAlgebra. A curious step taken here is that part of
the result, in our case the value m, is passed back as an argument to the result of
(foldTree mergedAlg t). Lazy evaluation makes this work.
That such programs were possible came originally as a great surprise to many func-
tional programmers, especially to those who used to program in LISP or ML, lan-
mergedAlg :: TreeAlgebra (Int -> (Int,Tree))
mergedAlg = (\i -> \m -> (i, Leaf m)
            ,\lfun rfun -> \m -> let (lm,lt) = lfun m
                                     (rm,rt) = rfun m
                                 in  (lm `min` rm
                                     ,Bin lt rt
                                     )
            )

rep_min t = r
  where (m, r) = (foldTree mergedAlg t) m
Listing 13: rm.sol4.hs
rep_min t = r
  where (m, r) = tree t m
        tree (Leaf i)  = \m -> (i, Leaf m)
        tree (Bin l r) = \m -> let (lm, lt) = tree l m
                                   (rm, rt) = tree r m
                               in  (lm `min` rm, Bin lt rt)
Listing 14: rm.sol5.hs
guages that require arguments to be evaluated completely before a call is evaluated
(so-called strict evaluation in contrast to lazy evaluation). Because of this surprising
behaviour this class of programs became known as circular programs. Notice how-
ever that there is nothing circular in this program. Each value is dened in terms
of other values, and no value is dened in terms of itself (as in ones=1:ones).
Finally, listing 14 shows the version of this program in which the function foldTree
has been unfolded. Thus we obtain the original solution as given in Bird [2].
Concluding, we have systematically transformed a program that inspects each node
twice into an equivalent program that inspects each node only once. The resulting
solution passes back part of the result of a call as an argument to that same call.
Lazy evaluation makes this possible.
Exercise 8.1 The deepest front problem is the problem of finding the so-called front of a tree.
The front of a tree is the list of all nodes that are at the deepest level. As in the
rep min problem, the trees involved are elements of the datatype Tree, see listing 9.
A straightforward solution is to compute the height of the tree and to pass the result
of this function to a function frontAtLevel :: Tree -> Int -> [Int].
1. Define the functions height and frontAtLevel.
2. Give the four different solutions as defined in the rep min problem.
□
data ExprAS = If ExprAS ExprAS ExprAS
            | Apply ExprAS ExprAS
            | ConInt Int
            | ConBool Bool deriving Show

type ExprASAlgebra a = (a -> a -> a -> a
                       ,a -> a -> a
                       ,Int -> a
                       ,Bool -> a
                       )

foldExprAS :: ExprASAlgebra a -> ExprAS -> a
foldExprAS (iff,apply,conint,conbool) = fold
  where fold (If ce te ee) = iff (fold ce) (fold te) (fold ee)
        fold (Apply fe ae) = apply (fold fe) (fold ae)
        fold (ConInt i)    = conint i
        fold (ConBool b)   = conbool b
Listing 15: ExprAbstractSyntax.hs
Exercise 8.2 Redo the previous exercise for the highest front problem. □
8.2 A small compiler
This section constructs a small compiler for (a part of) a small language. The
compiler compiles this code into code for a hypothetical stack machine.
8.2.1 The language
The language we consider in this section has integers, booleans, function application,
and an if-then-else expression. A language with just these constructs is useless, and
you will extend the language in the exercises with some other constructs, which make
the language a bit more interesting. We take the following context-free grammar for
the concrete syntax of the language.
Expr0 → if Expr1 then Expr1 else Expr1 | Expr1
Expr1 → Expr2 Expr2*
Expr2 → Int | Bool
where Int generates integers, and Bool booleans. An abstract syntax for our
language is given in listing 15. Note that we use a single datatype for the abstract
syntax instead of three datatypes (one for each nonterminal); this simplifies the
code a bit. Listing 15 also contains a definition of a fold and an algebra type for
the abstract syntax.
A parser for expressions is given in listing 16.
sptoken :: String -> Parser Char String
sptoken s = (\_ b _ -> b) <$>
            many (symbol ' ') <*> token s <*> many1 (symbol ' ')

boolean = const True <$> token "True" <|> const False <$> token "False"

parseExpr :: Parser Char ExprAS
parseExpr = expr0
  where expr0 = (\a b c d e f -> If b d f) <$>
                sptoken "if"
                <*> parseExpr
                <*> sptoken "then"
                <*> parseExpr
                <*> sptoken "else"
                <*> parseExpr
                <|> expr1
        expr1 = chainl expr2 (const Apply <$> many1 (symbol ' '))
                <|> expr2
        expr2 = ConBool <$> boolean
                <|> ConInt <$> natural
Listing 16: ExprParser.hs
8.2.2 A stack machine
In section 6.4.4 we have defined a stack machine with which simple arithmetic
expressions can be evaluated. Here we define a stack machine that has some more
instructions. The language of the previous section will be compiled into code for this
stack machine in the following section.
The stack machine we will use has the following instructions:
- it can load an integer;
- it can load a boolean;
- given an argument and a function on the stack, it can call the function on the argument;
- it can set a label in the code;
- given a boolean on the stack, it can jump to a label provided the boolean is false;
- it can jump to a label (unconditionally).
The datatype for instructions is given in listing 17.
8.2.3 Compiling to the stack machine
How do we compile the different expressions to stack machine code? We want to
define a function compile of type
compile :: ExprAS -> [InstructionSM]
A ConInt i is compiled to a LoadInt i.
data InstructionSM = LoadInt Int
                   | LoadBool Bool
                   | Call
                   | SetLabel Label
                   | BrFalse Label
                   | BrAlways Label

type Label = Int
Listing 17: InstructionSM.hs
compile (ConInt i) = [LoadInt i]
A ConBool b is compiled to a LoadBool b.
compile (ConBool b) = [LoadBool b]
An application Apply f x is compiled by first compiling the argument x, then
the 'function' f (at the moment it is impossible to define functions in our
language, hence the quotes around 'function'), and finally putting a Call on
top of the stack.
compile (Apply f x) = compile x ++ compile f ++ [Call]
An if-then-else expression If ce te ee is compiled by first compiling the
conditional expression ce. Then we jump to a label (which will be set before the
code of the else expression ee later) if the resulting boolean is false. Then
we compile the then expression te. After the then expression we always jump
to the end of the code of the if-then-else expression, for which we need
another label. Then we set the label for the else expression, we compile the
else expression ee, and, finally, we set the label for the end of the if-then-else
expression.
compile (If ce te ee) = compile ce
                        ++ [BrFalse ?lab1]
                        ++ compile te
                        ++ [BrAlways ?lab2]
                        ++ [SetLabel ?lab1]
                        ++ compile ee
                        ++ [SetLabel ?lab2]
Note that we use labels here, but where do these labels come from?
From the above description we see that we also need labels when compiling an
expression. We add a label argument (an integer, used for the first label in the
compiled code) to function compile, and we want function compile to return the
first unused label. We change the type of function compile as follows:
compile :: ExprAS -> Label -> ([InstructionSM],Label)
type Label = Int
The four cases in the definition of compile have to take care of the labels. We obtain
the following definition of compile:
compile (ConInt i)    = \l -> ([LoadInt i],l)
compile (ConBool b)   = \l -> ([LoadBool b],l)
compile (Apply f x)   = \l -> let (xc,l')  = compile x l
                                  (fc,l'') = compile f l'
                              in  (xc ++ fc ++ [Call],l'')
compile (If ce te ee) = \l -> let (cc,l')   = compile ce (l+2)
                                  (tc,l'')  = compile te l'
                                  (ec,l''') = compile ee l''
                              in  ( cc
                                    ++ [BrFalse l]
                                    ++ tc
                                    ++ [BrAlways (l+1)]
                                    ++ [SetLabel l]
                                    ++ ec
                                    ++ [SetLabel (l+1)]
                                  ,l'''
                                  )
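For example (our own worked instance of the definition above), compiling a small conditional expression starting with label 0 gives

compile (If (ConBool True) (ConInt 1) (ConInt 2)) 0
  = ([LoadBool True, BrFalse 0, LoadInt 1, BrAlways 1, SetLabel 0, LoadInt 2, SetLabel 1], 2)

labels 0 and 1 are used for this if-then-else, and 2 is returned as the first unused label.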
Function compile is a fold; the carrier type of its algebra is a function of type
Label -> ([InstructionSM],Label). The definition of function compile as a fold
is given in listing 18.
Exercise 8.3 Extend the code generation example by adding variables to the datatype Expr.
□
Exercise 8.4 Extend the code generation example by adding definitions to the datatype Expr
too.
8.3 Attribute grammars
In Section 8.1 we have written a program that solves the rep min problem. This
program computes the minimum of a tree, and it computes the tree in which all the
leaf values are replaced by the minimum value. The minimum is computed bottom-
up: it is synthesized from its children. The minimum value is then passed on to the
functions that build the tree with the minimum value in its leaves. These functions
receive the minimum value from their parent tree node: they inherit the minimum
value from their parent.
We can see the rep min computation as a computation on a value of type Tree,
on which two attributes are defined: the minimum and result tree attributes. The
minimum is computed bottom-up, and is then passed down to the result tree, and is
therefore a synthesized and inherited attribute. The result tree is computed bottom-up,
and is hence a synthesized attribute.
The formalism in which it is possible to specify such attributes and computations
on datatypes or grammars is called attribute grammars, and was originally proposed
compile = foldExprAS compileAlgebra

compileAlgebra :: ExprASAlgebra (Label -> ([InstructionSM],Label))
compileAlgebra = (\cce cte cee -> \l ->
                    let (cc,l')   = cce (l+2)
                        (tc,l'')  = cte l'
                        (ec,l''') = cee l''
                    in  ( cc
                          ++ [BrFalse l]
                          ++ tc
                          ++ [BrAlways (l+1)]
                          ++ [SetLabel l]
                          ++ ec
                          ++ [SetLabel (l+1)]
                        ,l'''
                        )
                 ,\cf cx -> \l -> let (xc,l')  = cx l
                                      (fc,l'') = cf l'
                                  in  (xc ++ fc ++ [Call],l'')
                 ,\i -> \l -> ([LoadInt i],l)
                 ,\b -> \l -> ([LoadBool b],l)
                 )
Listing 18: CompileExpr.hs
by Donald Knuth in [10]. Attribute grammars provide a solution for the systematic
description of the phases of the compiler that come after scanning and parsing.
Although they look different from what we have encountered thus far and are probably
a little easier to write, they can straightforwardly be mapped onto a functional
program. The programs you have seen in this chapter could also have been obtained
by means of such a mapping from an attribute grammar specification. Traditionally
such attribute grammars are used as the input of a compiler generator. Just as we
have seen how by introducing a suitable set of parsing combinators one may avoid
the use of a special parser generator and even gain a lot of flexibility in extending
the grammatical formalism by introducing more complicated combinators, we
have shown how one can do without a special purpose attribute grammar processing
system. But, just as the concept of a context-free grammar was useful in understanding
the fundamentals of parser combinators, understanding attribute grammars will
help significantly in describing the semantic part of the recognition and compilation
process. This chapter does not further introduce attribute grammars, but they will
appear again in the course in implementing programming languages.
Chapter 9
Pumping Lemmas: the expressive
power of languages
introduction
In these lecture notes we have presented several ways to show that a language
is regular or context-free, but until now we did not give any means to show the
nonregularity or non-context-freeness of a language. In this chapter we fill this gap
by introducing the so-called Pumping Lemmas. For example, the pumping lemma
for regular languages says:

  IF language L is regular,
  THEN it has the following property P: each sufficiently long word w ∈ L has
  a substring that can be repeated any number of times, every time yielding
  another word of L.

In applications, pumping lemmas are used in the contrapositive way. In the regular
case this means that one may conclude that L is not regular, if P does not hold.
Although the ideas behind pumping lemmas are very simple, a precise formulation
is not. As a consequence, it takes some effort to get familiar with applying pumping
lemmas. Regular grammars and context-free grammars are part of the Chomsky
hierarchy, which consists of four different kinds of grammars and their corresponding
languages. Pumping lemmas are used to show that the expressive power of the
different elements of the Chomsky hierarchy is different.
goals
After you have studied this chapter you will be able to
- prove that a language is not regular;
- prove that a language is not context-free;
- identify languages and grammars as regular, context-free or none of these;
- give examples of languages that are not regular, and/or not context-free;
- explain the Chomsky hierarchy.
9.1 The Chomsky hierarchy
In the preceding chapters we have seen context-free grammars and regular grammars.
You may now wonder: is it possible to express any language with these grammars?
And: is it possible to obtain any context-free language from a regular grammar? The
answer to these questions is no. The Chomsky hierarchy explains why the answer
is no. The Chomsky hierarchy consists of four elements, each of which is explained
below.
9.1.1 Type-0 grammars
The most powerful grammars are the type-0 grammars, in which a production has
the form φ → ψ, where φ ∈ V+ and ψ ∈ V*, and where V is the set of symbols of the
grammar. So the left-hand side of a production may consist of a list of nonterminal
and terminal symbols, instead of a single nonterminal as in context-free grammars.
Type-0 grammars have the same expressive power as Turing machines, and the
languages described by these grammars are the recursively enumerable languages.
This expressive power comes at a cost though: it is very difficult to parse sentences
from type-0 grammars.
9.1.2 Type-1 grammars
We can slightly restrict the form of the productions to obtain type-1 grammars. In
a type-1, or context-sensitive grammar, each production has the form φAψ → φδψ,
where φ, ψ ∈ V* and δ ∈ V+. So a production describes how to rewrite a nonterminal
A, in the context of the lists of symbols φ and ψ. A language generated by a context-sensitive
grammar is called a context-sensitive language. Although context-sensitive
grammars are less expressive than type-0 grammars, parsing is still very difficult for
context-sensitive grammars.
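To give a feel for this kind of grammar, here is a standard textbook example, not taken from these notes: a noncontracting grammar (equivalent in expressive power to the context-sensitive format) for the language {a^n b^n c^n | n ≥ 1}, a language that is shown not to be context-free in Section 9.3.

S  → aSBC | aBC
CB → BC
aB → ab
bB → bb
bC → bc
cC → cc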
9.1.3 Type-2 grammars
The type-2 grammars are the context-free grammars, which you have seen a lot in the
preceding chapters. As the name says, in a context-free grammar you can rewrite
nonterminals without looking at the context in which they appear. Actually, it is
impossible to look at the context when rewriting symbols. Context-free grammars
are less expressive than context-sensitive grammars. This statement can be proved
using the pumping lemma for context-free languages.
However, it is much easier to parse sentences from context-free languages. In fact, a
sentence of length n can be parsed in time at most O(n^3) (or even a bit less than this)
for any sentence of a context-free language. And if we put some more restrictions
on context-free grammars (for example LL(1)), we obtain linear-time algorithms for
parsing sentences of such grammars.
9.1.4 Type-3 grammars
The type-3 grammars in the Chomsky hierarchy are the regular grammars. Any
sentence from a regular language can be processed by means of a finite-state
automaton, which takes linear time and constant space in the size of its input. The
set of regular languages is strictly smaller than the set of context-free languages, a
fact we will prove below by means of the pumping lemma for regular languages.
9.2 The pumping lemma for regular languages
In this section we give the pumping lemma for regular languages. The lemma gives
a property that is satisfied by all regular languages. The property is a statement
of the form: in sentences longer than a certain length a substring can be identified
that can be duplicated while retaining a sentence. The idea behind this property
is simple: regular languages are accepted by finite automata. Given a DFA for a
regular language, a sentence of the language describes a path from the start state to
some final state. When the length of such a sentence exceeds the number of states,
then at least one state is visited twice; consequently the path contains a cycle that
can be repeated as often as desired. The proof of the following lemma is given in
Section 9.4.
Theorem 1: Regular Pumping Lemma
Let L be a regular language. Then
  there exists n ∈ ℕ :
    for all x, y, z : xyz ∈ L and |y| ≥ n :
      there exist u, v, w : y = uvw and |v| > 0 :
        for all i ∈ ℕ : xuv^i wz ∈ L
□
Note that |y| denotes the length of the string y. Also remember that 'for all x ∈ X : P'
is true if X = ∅, and 'there exists x ∈ X : P' is false if X = ∅.
For example, consider the following automaton (redrawn here in plain text; states S,
A, B, C and D, with D the only accepting state):

  S --a--> A --b--> B --c--> C --d--> D
           ^                 |
           +--------a--------+

This automaton accepts: abcabcd, abcabcabcd, and, in general, a(bca)*bcd. The
statement of the pumping lemma amounts to the following. Take for n the number
of states in the automaton (5). Let x, y, z be such that xyz ∈ L, and |y| ≥ n.
Then we know that in order to accept y, the above automaton has to pass at least
twice through state A. The part that is accepted in between the two moments
the automaton passes through state A can be pumped up to create sentences that
contain an arbitrary number of copies of the string v = bca.
This pumping lemma is useful in showing that a language does not belong to the
family of regular languages. Its application is typical of pumping lemmas in general;
they are used negatively to show that a given language does not belong to some
family.
Theorem 1 enables us to prove that a language L is not regular by showing that
  for all n ∈ ℕ :
    there exist x, y, z : xyz ∈ L and |y| ≥ n :
      for all u, v, w : y = uvw and |v| > 0 :
        there exists i ∈ ℕ : xuv^i wz ∉ L
In all applications of the pumping lemma in this chapter, this is the formulation we
will use.
Note that if n = 0, we can choose y = ε, and since there is no v with |v| > 0 such
that y = uvw, the statement above holds for all such v (namely none!).
As an example, we will prove that the language L = {a^m b^m | m ≥ 0} is not regular.
Let n ∈ ℕ.
Take s = a^n b^n with x = ε, y = a^n, and z = b^n.
Let u, v, w be such that y = uvw with v ≠ ε, that is, u = a^p, v = a^q and w = a^r
with p+q+r = n and q > 0.
Take i = 2, then

      xuv^2 wz ∉ L
  ≡     { definition of x, u, v, w, z, calculus }
      a^(p+2q+r) b^n ∉ L
  ≡     { p+q+r = n }
      n+q ≠ n
  ≡     { arithmetic }
      q > 0
  ≡     { q > 0 }
      true
Note that the language L = {a^m b^m | m ≥ 0} is context-free, and together with the
fact that each regular grammar is also a context-free grammar it follows immediately
that the set of regular languages is strictly smaller than the set of context-free
languages.
Note that here we use the pumping lemma (and not the proof of the pumping lemma)
to prove that a language is not regular. This kind of proof can be viewed as a kind of
game: 'for all' is about an arbitrary element which can be chosen by the opponent;
'there exists' is about a particular element which you may choose. Choosing the right
elements helps you win the game, where winning means proving that a language is
not regular.
Exercise 9.1 Prove that the following language is not regular:
{a^(k^2) | k ≥ 0}
□
Exercise 9.2 Show that the following language is not regular:
{x | x ∈ {a, b}* ∧ nr a x < nr b x}
where nr a x is the number of occurrences of a in x.
Exercise 9.3 Prove that the following language is not regular:
{a^k b^m | k ≤ m ≤ 2k}
□
Exercise 9.4 Show that the following language is not regular:
{a^k b^l a^m | k > 5 ∧ l > 3 ∧ m ≤ l}
□
9.3 The pumping lemma for context-free languages
The Pumping Lemma for context-free languages gives a property that is satisfied by
all context-free languages. This property is a statement of the form: in sentences
exceeding a certain length, two sublists of bounded length can be identified that
can be duplicated while retaining a sentence. The idea behind this property is the
following. Context-free languages are described by context-free grammars. For each
sentence in the language there exists a derivation tree. When sentences have a
derivation tree that is higher than the number of nonterminals, then at least one
nonterminal will occur twice on a path; consequently a subtree can be inserted as
often as desired.
As an example of an application of the Pumping Lemma, consider the context-free
grammar with the following productions.
S → aAb
A → cBd
A → e
B → fAg
The following parse tree represents the derivation of the sentence acfegdb.
    S
  / | \
 a  A  b
  / | \
 c  B  d
  / | \
 f  A  g
    |
    e
If we replace the subtree rooted by the lower occurrence of nonterminal A by the
subtree rooted by the upper occurrence of A, we obtain the following parse tree.
    S
  / | \
 a  A  b
  / | \
 c  B  d
  / | \
 f  A  g
  / | \
 c  B  d
  / | \
 f  A  g
    |
    e
This parse tree represents the derivation of the sentence acfcfegdgdb. Thus we
pump the derivation of sentence acfegdb to the derivation of sentence acfcfegdgdb.
Repeating this step once more, we obtain a parse tree for the sentence
acfcfcfegdgdgdb.
We can repeatedly apply this process to obtain derivation trees for all sentences of
the form
a(cf)^i e(gd)^i b
for i ≥ 0. The case i = 0 is obtained if we replace in the parse tree for the sentence
acfegdb the subtree rooted by the upper occurrence of nonterminal A by the subtree
rooted by the lower occurrence of A:
    S
  / | \
 a  A  b
    |
    e
This is a derivation tree for the sentence aeb. This step can be viewed as a negative
pumping step.
The proof of the following lemma is given in Section 9.4.
Theorem 2: Context-free Pumping Lemma
Let L be a context-free language. Then
  there exist c, d ∈ ℕ :
    for all z : z ∈ L and |z| > c :
      there exist u, v, w, x, y : z = uvwxy and |vx| > 0 and |vwx| ≤ d :
        for all i ∈ ℕ : uv^i wx^i y ∈ L
□
The Pumping Lemma is a tool with which we prove that a given language is not
context-free. The proof obligation is to show that the property shared by all context-free
languages does not hold for the language under consideration.
Theorem 2 enables us to prove that a language L is not context-free by showing that
  for all c, d ∈ ℕ :
    there exists z : z ∈ L and |z| > c :
      for all u, v, w, x, y : z = uvwxy and |vx| > 0 and |vwx| ≤ d :
        there exists i ∈ ℕ : uv^i wx^i y ∉ L
As an example, we will prove that the language T defined by
T = {a^n b^n c^n | n > 0}
is not context-free.
Proof: Let c, d ∈ ℕ.
Take z = a^r b^r c^r with r = max(c, d).
Let u, v, w, x, y be such that z = uvwxy, |vx| > 0 and |vwx| ≤ d.
Note that our choice for r guarantees that substring vwx has one of the following
shapes:
- vwx consists of just a's, or just b's, or just c's.
- vwx contains both a's and b's, or both b's and c's.
So vwx does not contain a's, b's, and c's.
Take i = 0, then
- If vwx consists of just a's, or just b's, or just c's, then it is impossible to write
  the string uwy as a^s b^s c^s for some s, since only the number of terminals of one
  kind is decreased.
- If vwx contains both a's and b's, or both b's and c's, it lies somewhere on the
  border between a's and b's, or on the border between b's and c's. Then the
  string uwy can be written as
      uwy = a^s b^t c^r   or
      uwy = a^r b^p c^q
  for some s, t, p, q, respectively. At least one of s and t or of p and q is less
  than r. Again this list is not an element of T.
□
Exercise 9.5 Prove that the following language is not context-free:
{a^(k^2) | k ≥ 0}
□
Exercise 9.6 Prove that the following language is not context-free:
{a^i | i is a prime number}
□
Exercise 9.7 Prove that the following language is not context-free:
{ww | w ∈ {a, b}*}
□
9.4 Proofs of pumping lemmas
This section gives the proofs of the pumping lemmas.
Proof: of the Regular Pumping Lemma, Theorem 1.
Since L is a regular language, there exists a deterministic finite-state automaton D
such that L = Ldfa D.
Take for n the number of states of D.
Let s be an element of L with sublist y such that |y| ≥ n, say s = xyz.
Consider the sequence of states D passes through while processing y. Since |y| ≥ n,
this sequence has more than n entries, hence at least one state, say state A, occurs
twice.
Take u, v, w as follows:
- u is the initial part of y processed until the first occurrence of A,
- v is the (nonempty) part of y processed from the first to the second occurrence of A,
- w is the remaining part of y.
Note that D could have skipped processing v, and hence would have accepted xuwz.
Furthermore, D can repeat the processing in between the first occurrence of A and
the second occurrence of A as often as desired, and hence it accepts xuv^i wz for all
i ∈ ℕ. Formally, a simple proof by induction shows that (∀i : i ≥ 0 : xuv^i wz ∈ L).
□
Proof: of the Context-free Pumping Lemma, Theorem 2.
Let G = (T, N, R, S) be a context-free grammar such that L = L(G). Let m be the
length of the longest right-hand side of any production, and let k be the number of
nonterminals of G.
Take c = m^k. In Lemma 3 below we prove that if z is a list with |z| > c, then in all
derivation trees for z there exists a path of length at least k+1.
Let z ∈ L such that |z| > c. Since grammar G has k nonterminals, there is at least
one nonterminal that occurs more than once in a path of length k+1 (which contains
k+2 symbols, of which at most one is a terminal, and all others are nonterminals)
of a derivation tree for z. Consider the nonterminal A that satisfies the following
requirements.
- A occurs at least twice in the path of length k+1 of a derivation tree for z.
  Call the list corresponding to the derivation tree rooted at the lower A w, and call
  the list corresponding to the derivation tree rooted at the upper A (which contains
  the list w) vwx.
- A is chosen such that at most one of v and x equals ε.
- Finally, we suppose that below the upper occurrence of A no other nonterminal
  that satisfies the same requirements occurs, that is, the two A's are the lowest
  pair of nonterminals satisfying the above requirements.
First we show that a nonterminal A satisfying the above requirements exists. We
prove this statement by contradiction. Suppose that for all nonterminals A that
occur twice on a path of length at least k+1, both v and x, the border lists of the list
vwx corresponding to the tree rooted at the upper occurrence of A, are equal to
ε. Then we can replace the tree rooted at the upper A by the tree rooted at the lower
A without changing the list corresponding to the tree. Thus we can replace all paths
of length at least k+1 by a path of length at most k. But this contradicts Lemma 3
below. It follows that a nonterminal A satisfying the above requirements exists.
There exists a derivation tree for z in which the path from the upper A to the leaf
has length at most k+1, since either below A no nonterminal occurs twice, or there is
one or more nonterminal B that occurs twice, but the border lists v' and x' of the
list v'w'x' corresponding to the tree rooted at the upper occurrence of nonterminal
B are empty. Since the lists v' and x' are empty, we can replace the tree rooted at
the upper occurrence of B by the tree rooted at the lower occurrence of B without
changing the list corresponding to the derivation tree. Since we can do this for all
nonterminals B that occur twice below the upper occurrence of A, there exists a
derivation tree for z in which the path from the upper A to the leaf has length at
most k+1. It follows from Lemma 3 below that the length of vwx is at most m^(k+1),
so we define d = m^(k+1).
Suppose z = uvwxy, that is, the list corresponding to the subtree to the left (right)
of the upper occurrence of A is u (y). This situation is depicted as follows.
        S
      / | \
    u   A   y
      / | \
    v   A   x
        |
        w
We prove by induction that (∀i : i ≥ 0 : uv^i wx^i y ∈ L). In this proof we apply the
tree substitution process described in the example before the lemma. For i = 0 we
have to show that the list uwy is a sentence in L. The list uwy is obtained if the
tree rooted at the upper A is replaced by the tree rooted at the lower A. Suppose
that for all i ≤ n we have uv^i wx^i y ∈ L. The list uv^(i+1) wx^(i+1) y is obtained if the tree
rooted at the lower A in the derivation tree for uv^i wx^i y ∈ L is replaced by the tree
rooted at the upper A. This proves the induction step.
□
The proof of the Pumping Lemma above frequently refers to the following lemma.
Theorem 3:
Let G be a context-free grammar, and suppose that the longest right-hand side of
any production has length m. Let t be a derivation tree for a list z ∈ L(G). If
height t ≤ j, then |z| ≤ m^j. □
Proof: We prove a slightly stronger result: if t is a derivation tree for a list z, but
the root of t is not necessarily the start-symbol, and height t ≤ j, then |z| ≤ m^j.
We prove this statement by induction on j.
For the base case, suppose j = 1. Then tree t corresponds to a single production in
G, and since the longest right-hand side of any production has length m, we have
that |z| ≤ m = m^j.
For the induction step, assume that for all derivation trees t of height at most j
we have that |z| ≤ m^j, where z is the list corresponding to t. Suppose we have
a tree t of height j+1. Let A be the root of t, and suppose the top of the tree
corresponds to the production A → v in G. For all trees s rooted at the symbols of
v we have height s ≤ j, so the induction hypothesis applies to these trees, and the
lists corresponding to the trees rooted at the symbols of v all have length at most
m^j. Since A → v is a production of G, and since the longest right-hand side of any
production has length m, the list corresponding to the tree rooted at A has length
at most m·m^j = m^(j+1), which proves the induction step. □
summary
This section introduces pumping lemmas. Pumping lemmas are used to prove that
languages are not regular or not context-free.
9.5 exercises
Exercise 9.8 Show that the following language is not regular:
{x | x ∈ {0, 1}* ∧ nr 1 x = nr 0 x}
where function nr takes an element a and a list x, and returns the number of
occurrences of a in x.
Exercise 9.9 Consider the following language: {a^i b^j | 0 ≤ i ≤ j}
1. Is this language context-free? If it is, give a context-free grammar and prove
   that this grammar generates the language. If it is not, why not?
2. Is this language regular? If it is, give a regular grammar and prove that this
   grammar generates the language. If it is not, why not?
□
Exercise 9.10 Consider the following language: {wcw | w ∈ {a, b}*}
1. Is this language context-free? If it is, give a context-free grammar and prove
   that this grammar generates the language. If it is not, why not?
2. Is this language regular? If it is, give a regular grammar and prove that this
   grammar generates the language. If it is not, why not?
□
Exercise 9.11 Consider the grammar G with the following productions.
S → A
S → ε
A → S
A → AA
A → ε
1. Is this grammar
   - Context-free?
   - Regular?
   Why?
2. Give the language of G without referring to G itself. Prove that your description is correct.
3. Is the language of G
   - Context-free?
   - Regular?
   Why?
□
Exercise 9.12 Consider the grammar G with the following productions.
S → ε
S → 0
S → 1
S → S0
1. Is this grammar
   - Context-free?
   - Regular?
   Why?
2. Give the language of G without referring to G itself. Prove that your description is correct.
3. Is the language of G
   - Context-free?
   - Regular?
   Why?
□
Chapter 10
LL Parsing
introduction
This chapter introduces LL(1) parsing. LL(1) parsing is an efficient (linear in the
length of the input string) method for parsing that can be used for all LL(1) grammars.
A grammar is LL(1) if at each step in a derivation the next symbol in the
input uniquely determines the production that should be applied. In order to
determine whether or not a grammar is LL(1), we introduce several kinds of grammar
analyses, such as determining whether or not a nonterminal can derive the empty
string, and determining the set of symbols that can appear as the first symbol in a
derivation from a nonterminal.
goals
After studying this chapter you will
- know the definition of LL(1) grammars;
- know how to parse a sentence from an LL(1) grammar;
- be able to apply different kinds of grammar analyses in order to determine whether or not a grammar is LL(1).
This chapter is organised as follows. Section 10.1 describes the background of LL(1)
parsing, and Section 10.2 describes an implementation in Haskell of LL(1) parsing
and the different kinds of grammar analyses needed for checking whether or not a
grammar is LL(1).
10.1 LL Parsing: Background
In the previous chapters we have shown how to construct parsers for sentences of
context-free languages using combinator parsers. Since these parsers may backtrack,
the resulting parsers are sometimes a bit slow. There are several ways in which
we can put extra restrictions on context-free grammars such that we can parse
sentences of the corresponding languages efficiently. This chapter discusses one such
restriction: LL(1). Other restrictions, not discussed in these lecture notes, are LR(1),
LALR(1), SLR(1), etc.
10.1.1 A stack machine for parsing
This section presents a stack machine for parsing sentences of context-free grammars.
We will use this machine in the following subsections to illustrate why we need
grammar analysis.
The stack machine we use in this section differs from the stack machines introduced
in Sections 6.4.4 and 8.2. A stack machine for a grammar G has a stack and an
input, and performs one of the following two actions.
1. Expand: If the top stack symbol is a nonterminal, it is popped from the
stack and a right-hand side from a production of G for the nonterminal is
pushed onto the stack. The production is chosen nondeterministically.
2. Match: If the top stack symbol is a terminal, then it is popped from the stack
and compared with the next symbol of the input sequence. If they are equal,
then this terminal symbol is read. If the stack symbol and the next input
symbol do not match, the machine signals an error, and the input sentence
cannot be accepted.
These actions are performed until the stack is empty. A stack machine for G accepts
an input if it can terminate with an empty input when starting with the start-symbol
from G on the stack.
For example, let grammar G be the grammar with the productions:
S → aS | cS | b
The stack machine for G accepts the input string aab because it can perform the
following actions (the first component of the state (before the | in the picture below)
is the symbol stack and the second component of the state (after the |) is the
unmatched (remaining part of the) input string):
stack input
S aab
aS aab
S ab
aS ab
S b
b b
and end with an empty input. However, if the machine had chosen the production
S → cS in its first step, it would have been stuck. So not all possible sequences
of actions from the state (S, aab) lead to an empty input. If there is at least one
sequence of actions that ends with an empty input on an input string, the input
string is accepted. In this sense, the stack machine is similar to a nondeterministic
finite-state automaton.
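The behaviour of this nondeterministic machine is easily expressed in Haskell. The
following sketch is not the implementation developed in Section 10.2; it merely
assumes, for illustration, that a grammar is given as a list of productions of type
(Char, String) and that the capital letters are the nonterminals:

accepts :: [(Char, String)] -> String -> String -> Bool
-- Does some sequence of expand and match actions, starting with this
-- stack and input, end with an empty stack and an empty input?
accepts prods [] input = null input
accepts prods (s:ss) input
  | isNonterminal s =
      -- expand: nondeterministically try every production for s
      or [ accepts prods (rhs ++ ss) input | (nt, rhs) <- prods, nt == s ]
  | otherwise =
      -- match: the terminal on top of the stack must equal the next input symbol
      case input of
        (x:xs) | x == s -> accepts prods ss xs
        _               -> False
  where isNonterminal c = 'A' <= c && c <= 'Z'

For the example grammar, accepts [('S',"aS"),('S',"cS"),('S',"b")] "S" "aab"
evaluates to True. Note that this naive sketch may loop on left-recursive grammars.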
10.1.2 Some example derivations
This section gives three examples in which the stack machine for parsing is applied.
It turns out that, for all three examples, the nondeterministic stack machine can
act in a deterministic way by looking ahead one (or two) symbols of the sequence
of input symbols. Each of the examples exemplifies why different kinds of grammar
analyses are useful in parsing.
The first example
Our first example is gramm1. The set of terminal symbols of gramm1 is {a, b, c}, the
set of nonterminal symbols is {S, A, B, C}, the start symbol is S; and the productions
are
S → cA | b
A → cBC | bSA | a
B → cc | Cb
C → aS | ba
We want to know whether or not the string ccccba is a sentence of the language of
this grammar. The stack machine produces, amongst others, the following sequence,
corresponding with a leftmost derivation of ccccba.
stack input
S ccccba
cA ccccba
A cccba
cBC cccba
BC ccba
ccC ccba
cC cba
C ba
ba ba
a a
Starting with S the machine chooses between two productions:
S → cA | b
but, since the first symbol of the string ccccba to be recognised is c, the only
applicable production is the first one. After expanding S a match-action removes
the leading c from ccccba and cA. So now we have to derive the string cccba from
A. The machine chooses between three productions:
A → cBC | bSA | a
and again, since the next symbol of the remaining string cccba to be recognised is
c, the only applicable production is the first one. After expanding A a match-action
removes the leading c from cccba and cBC. So now we have to derive the string
ccba from BC. The top stack symbol of BC is B. The machine chooses between
two productions:
B → cc | Cb
The next symbol of the remaining string ccba to be recognised is, once again, c. The
first production is applicable, but the second production may be applicable as well.
To decide whether it also applies we have to determine the symbols that can appear
as the first element of a string derived from B starting with the second production.
The first symbol of the alternative Cb is the nonterminal C. From the productions
C → aS | ba
it is immediately clear that a string derived from C starts with either an a or a b.
The set {a, b} is called the first set of the nonterminal C. Since the next symbol
in the remaining string to be recognised is a c, the second production cannot be
applied. After expanding B and performing two match-actions it remains to derive
the string ba from C. The machine chooses between two productions C → aS and
C → ba. Clearly, only the second one applies, and, after two match-actions, leads
to success.
From the above derivation we conclude the following.
Deriving the sentence ccccba using gramm1 is a deterministic computation:
at each step of the derivation there is only one applicable alternative
for the nonterminal on top of the stack.
Determinism is obtained by looking at the first set of the nonterminals.
The second example
A second example is the grammar gramm2 whose productions are
S → abA | aa
A → bb | bS
Now we want to know whether or not the string abbb is a sentence of the language of
this grammar. The stack machine produces, amongst others, the following sequence
stack input
S abbb
abA abbb
bA bbb
A bb
bb bb
b b
Starting with S the machine chooses between two productions:
S → abA | aa
since both alternatives start with an a, it is not sufficient to look at the first symbol a
of the string to be recognised. The problem is that the lookahead sets (the lookahead
set of a production N → α is the set of terminal symbols that can appear as the first
symbol of a string that can be derived from N starting with the production N → α;
the definition is given in the following subsection) of the two productions for S both
contain a. However, if we look at the first two symbols ab, then we find that the
only applicable production is the first one. After expanding and matching it remains
to derive the string bb from A. Again, looking ahead one symbol in the input string
does not give sufficient information for choosing one of the two productions
A → bb | bS
for A. If we look at the first two symbols bb of the input string, then we find that
the first production applies (and, after matching, leads to success). Each string
derived from A starting with the second production starts with a b and, since it is
not possible to derive a string starting with another b from S, the second production
does not apply.
From the above derivation we conclude the following.
Deriving the string abbb using gramm2 is a deterministic computation:
at each step of the derivation there is only one applicable alternative for
the nonterminal on the top of the stack.
Again, determinism is obtained by analysing the first set (of strings of length 2) of
the nonterminals. Alternatively, we can left-factor the grammar to obtain a grammar
in which all productions for a nonterminal start with a different terminal symbol.
The third example
A third example is grammar gramm3 with the following productions:
S → AaS | B
A → cS | ε
B → b
Now we want to know whether or not the string acbab is an element of the language
of this grammar. The stack machine produces the following sequence
stack input
S acbab
AaS acbab
aS acbab
Starting with S the machine chooses between two productions:
S → AaS | B
since each nonempty string derived from A starts with a c, and each nonempty string
derived from B starts with a b, there does not seem to be a candidate production
to start a leftmost derivation of acbab with. However, since A can also derive the
empty string, we can apply the first production, and then apply the empty string
for A, producing aS which, as required, starts with an a. We do not explain the
rest of the leftmost derivation since it does not use any empty strings any more.
Nonterminal symbols that can derive the empty sequence will play a central role in
the grammar analysis problems which we will consider in Section 10.2.
From the above derivation we conclude the following.
Deriving the string acbab using gramm3 is a deterministic computation:
at each step of the derivation there is only one applicable alternative for
the nonterminal on the top of the stack.
Determinism is obtained by analysing whether or not nonterminals can derive the
empty string, and which terminal symbols can follow upon a nonterminal in a derivation.
10.1.3 LL(1) grammars
The examples in the previous subsection show that the derivations of the example
sentences are deterministic, provided we can look ahead one or two symbols in
the input. An obvious question now is: for which grammars are all derivations
deterministic? Of course, as the second example shows, the answer to this question
depends on the number of symbols we are allowed to look ahead. In the rest of
this chapter we assume that we may look 1 symbol ahead. A grammar for which
all derivations are deterministic with 1 symbol lookahead is called LL(1): Leftmost
with a Lookahead of 1. Since all derivations of sentences of LL(1) grammars are
deterministic, LL(1) is a desirable property of grammars.
To formalise this definition, we define lookAhead sets.
Definition 1: lookAhead set
The lookahead set of a production N → α is the set of terminal symbols that
can appear as the first symbol of a string that can be derived from N (where N
appears as a tail substring in a derivation from the start-symbol) starting with the
production N → α. So
lookAhead (N → α) = { x | S ⇒∗ γNδ ⇒ γαδ ⇒∗ γxβ }
□
For example, for the productions of gramm1 we have
lookAhead (S → cA) = {c}
lookAhead (S → b) = {b}
lookAhead (A → cBC) = {c}
lookAhead (A → bSA) = {b}
lookAhead (A → a) = {a}
lookAhead (B → cc) = {c}
lookAhead (B → Cb) = {a, b}
lookAhead (C → aS) = {a}
lookAhead (C → ba) = {b}
We use lookAhead sets in the definition of LL(1) grammar.
Definition 2: LL(1) grammar
A grammar G is LL(1) if all pairs of different productions of the same nonterminal
have disjoint lookahead sets, that is: for all productions N → α, N → β of G with
α ≠ β:
lookAhead (N → α) ∩ lookAhead (N → β) = ∅
□
Since all lookAhead sets for productions of the same nonterminal of gramm1 are
disjoint, gramm1 is an LL(1) grammar. For gramm2 we have:
lookAhead (S → abA) = {a}
lookAhead (S → aa) = {a}
lookAhead (A → bb) = {b}
lookAhead (A → bS) = {b}
Here, the lookAhead sets for both nonterminals S and A are not disjoint, and it
follows that gramm2 is not LL(1). gramm2 is an LL(2) grammar, where an LL(k)
grammar for k ≥ 2 is defined similarly to an LL(1) grammar: instead of one symbol
lookahead we have k symbols lookahead.
How do we determine whether or not a grammar is LL(1)? Clearly, to answer this
question we need to know the lookahead sets of the productions of the grammar.
The lookAhead set of a production N → α, where α starts with a terminal symbol x,
is simply {x}. But what if α starts with a nonterminal P, that is α = Pβ, for some β?
Then we have to determine the set of terminal symbols with which strings derived
from P can start. But if P can derive the empty string, we also have to determine
the set of terminal symbols with which a string derived from β can start. As you
see, in order to determine the lookAhead sets of productions, we are interested in
• whether or not a nonterminal can derive the empty string (empty);
• which terminal symbols can appear as the first symbol in a string derived from
a nonterminal (firsts);
• and which terminal symbols can follow upon a nonterminal in a derivation
(follow).
In each of the following definitions we assume that a grammar G is given.
Definition 3: Empty
Function empty takes a nonterminal N, and determines whether or not the empty
string can be derived from the nonterminal:
empty N = N ⇒∗ ε
□
For example, for gramm3 we have:
empty S = False
empty A = True
empty B = False
Definition 4: First
The first set of a nonterminal N is the set of terminal symbols that can appear as
the first symbol of a string that can be derived from N:
firsts N = { x | N ⇒∗ xβ }
□
For example, for gramm3 we have:
firsts S = {a, b, c}
firsts A = {c}
firsts B = {b}
We could have given more restricted definitions of empty and firsts, by only looking
at derivations from the start-symbol, for example,
empty N = S ⇒∗ αNβ ∧ N ⇒∗ ε
but the simpler definition above suffices for our purposes.
Definition 5: Follow
The follow set of a nonterminal N is the set of terminal symbols that can follow on
N in a derivation starting with the start-symbol S from the grammar G:
follow N = { x | S ⇒∗ αNxβ }
□
For example, for gramm3 we have:
follow S = {a}
follow A = {a}
follow B = {a}
In the following section we will give programs with which lookahead, empty, firsts,
and follow are computed.
Exercise 10.1 Give the results of the function empty for the grammars gramm1 and gramm2.
□
Exercise 10.2 Give the results of the function firsts for the grammars gramm1 and gramm2.
□
Exercise 10.3 Give the results of the function follow for the grammars gramm1 and gramm2.
□
Exercise 10.4 Give the results of the function lookahead for grammar gramm3.
Is gramm3 an LL(1) grammar?
Exercise 10.5 Grammar gramm2 is not LL(1), but it can be transformed into an LL(1) grammar
by left factoring. Give this equivalent grammar gramm2' and give the results of
the functions empty, firsts, follow and lookAhead on this grammar. Is gramm2'
an LL(1) grammar?
Exercise 10.6 A non-left-recursive grammar for Bit-Lists is given by the following grammar
(see your answer to exercise 2.18):
L → BR
R → ε | ,BR
B → 0 | 1
Give the results of functions empty, firsts, follow and lookAhead on this grammar. Is this grammar LL(1)?
10.2 LL Parsing: Implementation
Until now we have written parsers with parser combinators. Parser combinators use
backtracking, and this is sometimes a cause of inefficiency. If a grammar is LL(1)
we do not need backtracking anymore: parsing is deterministic. We can use this
fact by either adjusting the parser combinators so that they don't use backtracking
anymore, or by writing a special purpose LL(1) parsing program. We present the
latter in this section.
This section describes the implementation of a program that parses sentences of
LL(1) grammars. The program works for arbitrary context-free LL(1) grammars,
so we first describe how to represent context-free grammars in Haskell. Another
consequence of the fact that the program parses sentences of arbitrary context-free
LL(1) grammars is that we need a generic representation of parse trees in Haskell.
The second subsection defines a datatype for parse trees in Haskell. The third
subsection presents the program that parses sentences of LL(1) grammars. This
program assumes that the input grammar is LL(1), so in the fourth subsection we
give a function that determines whether or not a grammar is LL(1). Both this
and the LL(1) parsing function use a function that determines the lookahead of a
production. This function is presented in the fifth subsection. The last subsections
of this section define functions for determining the empty, first, and follow symbols
of a nonterminal.
10.2.1 Context-free grammars in Haskell
A context-free grammar may be represented by a pair: its start symbol, and its
productions. How do we represent terminal and nonterminal symbols? There are at
least two possibilities.
The rigorous approach uses a datatype Symbol:
data Symbol a b = N a | T b
The advantage of this approach is that nonterminals and terminals are strictly
separated, the disadvantage is the notational overhead of constructors that
has to be carried around. However, a rigorous implementation of context-
free grammars should keep terminals and nonterminals apart, so this is the
preferred implementation. But in this section we will use the following imple-
mentation:
class Eq s => Symbol s where
isT :: s -> Bool
isN :: s -> Bool
isT = not . isN
where isN and isT determine whether or not a symbol is a nonterminal or a
terminal, respectively. This notation is compact, but terminals and nonterminals are no longer strictly separated, and symbols that are used as nonterminals
cannot be used as terminals anymore. For example, the type of characters is
made an instance of this class by defining:
instance Symbol Char where
isN c = 'A' <= c && c <= 'Z'
that is, capitals are nonterminals, and, by definition, all other characters are
terminals.
A context-free grammar is a value of the type CFG:
type CFG s = (s,[(s,[s])])
where the list in the second component associates nonterminals to right-hand sides.
So an element of this list is a production. For example, the grammar with productions
S → AaS | B | CB
A → SC | ε
B → A | b
C → D
D → d
is represented as:
exGrammar :: CFG Char
exGrammar =
  ('S', [('S',"AaS"),('S',"B"),('S',"CB")
        ,('A',"SC"),('A',"")
        ,('B',"A"),('B',"b")
        ,('C',"D")
        ,('D',"d")
        ]
  )
On this type we define some functions for extracting the productions, nonterminals,
terminals, etc. from a grammar.
start :: CFG s -> s
start = fst
prods :: CFG s -> [(s,[s])]
prods = snd
terminals :: (Symbol s, Ord s) => CFG s -> [s]
terminals = unions . map (filter isT . snd) . snd
where unions :: Ord s => [[s]] -> [s] returns the union of the sets of sym-
bols in the lists.
nonterminals :: (Symbol s, Ord s) => CFG s -> [s]
nonterminals = nub . map fst . snd
Here, nub :: Ord s => [s] -> [s] removes duplicates from a list.
symbols :: (Symbol s, Ord s) => CFG s -> [s]
symbols grammar =
union (terminals grammar) (nonterminals grammar)
nt2prods :: Eq s => CFG s -> s -> [(s,[s])]
nt2prods grammar s =
filter (\(nt,rhs) -> nt==s) (prods grammar)
where function union returns the set union of two lists (removing duplicates). For
example, we have
? start exGrammar
'S'
? terminals exGrammar
"abd"
10.2.2 Parse trees in Haskell
A parse tree is a tree, in which each internal node is labelled with a nonterminal,
and has a list of children (corresponding with a right-hand side of a production of
the nonterminal). It follows that parse trees can be represented as rose trees with
symbols, where the datatype of rose trees is dened by:
data Rose a = Node a [Rose a] | Nil
The constructor Nil has been added to simplify error handling when parsing: when
a sentence cannot be parsed, the parse tree Nil is returned. Strictly speaking it
should not occur in the datatype of rose trees.
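For example, the parse tree for the sentence b of exGrammar, obtained using the
productions S → B and B → b, is represented by the value (a sketch, assuming the
datatype above):
Node 'S' [Node 'B' [Node 'b' []]]
Here the leaf Node 'b' [] represents the terminal symbol b.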
10.2.3 LL(1) parsing
This subsection defines a function ll1 that takes a grammar and a terminal string
as input, and returns one tuple: a parse tree, and the rest of the input string that
has not been parsed. So ll1 takes a grammar, and returns a parser with Rose s
as its result type. As mentioned in the beginning of this section, our parser doesn't
need backtracking anymore, since parsing with an LL(1) grammar is deterministic.
Therefore, the parser type is adjusted as follows:
type Parser b a = [b] -> (a,[b])
Using this parser type, the type of the function ll1 is:
ll1 :: (Symbol s, Ord s) => CFG s -> Parser s (Rose s)
Function ll1 is defined in terms of two functions. Function isll1 :: CFG s ->
Bool is a function that checks whether or not a grammar is LL(1). And function
gll1 (for generalised LL(1)) produces a list of rose trees for a list of symbols. ll1
is obtained from gll1 by giving gll1 the singleton list containing the start symbol
of the grammar as argument.
ll1 grammar input =
if isll1 grammar
then let ([rose], rest) = gll1 grammar [start grammar] input
in (rose, rest)
else error "ll1: grammar not LL(1)"
So now we have to implement functions isll1 and gll1. Function isll1 is imple-
mented in the following subsection. Function gll1 also uses two functions. Function
grammar2ll1table takes a grammar and returns the LL(1) table: the association
list that associates productions with their lookahead sets. And function choose
takes a terminal symbol, and chooses a production based on the LL(1) table.
gll1 :: (Symbol s, Ord s) => CFG s -> [s] -> Parser s [Rose s]
gll1 grammar =
let ll1table = grammar2ll1table grammar
-- The LL(1) table.
nt2prods nt = filter (\((n,l),r) -> n==nt) ll1table
-- nt2prods returns the productions for nonterminal
-- nt from the LL(1) table
selectprod nt t = choose t (nt2prods nt)
-- selectprod selects the production for nt from the
-- LL(1) table that should be taken when t is the next
-- symbol in the input.
in \stack input ->
case stack of
[] -> ([], input)
(s:ss) ->
if isT s
then -- match
let (rts,rest) = gll1 grammar ss (tail input)
in if s == head input
then (Node s []: rts, rest)
-- The parse tree is a leaf (a node with
-- no children).
else ([Nil], input)
-- The input cannot be parsed
else -- expand
let t = head input
(rts,zs) = gll1 grammar (selectprod s t) input
-- Try to parse according to the production
-- obtained from the LL(1) table from s.
(rrs,vs) = gll1 grammar ss zs
-- Parse the rest of the symbols on the
-- stack.
in ((Node s rts): rrs, vs)
Functions grammar2ll1table and choose, which are used in the above function
gll1, are defined as follows. These functions use function lookaheadp, which returns
the lookahead set of a production and is defined in one of the following subsections.
grammar2ll1table :: (Symbol s, Ord s) => CFG s -> [((s,[s]),[s])]
grammar2ll1table grammar =
map (\x -> (x,lookaheadp grammar x)) (prods grammar)
choose :: Eq a => a -> [((b,c),[a])] -> c
choose t l =
let [((s,rhs), ys)] = filter (\((x,p),q) -> t `elem` q) l
in rhs
Note that function choose assumes that there is exactly one element in the associ-
ation list in which the element t occurs.
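To see ll1 at work we need an LL(1) grammar; exGrammar, as we will verify below,
is not LL(1). Encoding gramm1 from Section 10.1.2 as a CFG Char value in the
obvious way (this encoding is ours, not given in the text):

gramm1 :: CFG Char
gramm1 = ('S', [('S',"cA"),('S',"b")
               ,('A',"cBC"),('A',"bSA"),('A',"a")
               ,('B',"cc"),('B',"Cb")
               ,('C',"aS"),('C',"ba")
               ])

we would expect, assuming a Show instance for Rose:
? ll1 gramm1 "b"
(Node 'S' [Node 'b' []], "")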
10.2.4 Implementation of isLL(1)
Function isll1 checks whether or not a context-free grammar is LL(1). It does this
by computing for each nonterminal of its argument grammar the set of lookahead
sets (one set for each production of the nonterminal), and checking that all of these
sets are disjoint. It is defined in terms of a function lookaheadn, which computes
the lookahead sets of a nonterminal, and a function disjoint, which determines
whether or not all sets in a list of sets are disjoint. All sets in a list of sets are
disjoint if the length of the concatenation of these sets equals the length of the
union of these sets.
isll1 :: (Symbol s, Ord s) => CFG s -> Bool
isll1 grammar =
and (map (disjoint . lookaheadn grammar) (nonterminals grammar))
disjoint :: Ord s => [[s]] -> Bool
disjoint xss = length (concat xss) == length (unions xss)
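For example, disjoint ["ab","cd"] is True (the concatenation and the union both
have length 4), whereas disjoint ["ab","bc"] is False: the concatenation has
length 4, but the union "abc" only has length 3.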
Function lookaheadn computes the lookahead sets of a nonterminal by computing
all productions of a nonterminal, and computing the lookahead set of each of these
productions by means of function lookaheadp.
lookaheadn :: (Symbol s, Ord s) => CFG s -> s -> [[s]]
lookaheadn grammar =
map (lookaheadp grammar) . nt2prods grammar
10.2.5 Implementation of lookahead
Function lookaheadp takes a grammar and a production, and returns the lookahead
set of the production. It is defined in terms of four functions. Each of the first three
functions will be defined in a separate subsection below; the fourth function is defined
in this subsection.
isEmpty :: (Ord s,Symbol s) => CFG s -> s -> Bool
Function isEmpty takes a grammar and a nonterminal and determines whether
or not the empty string can be derived from the nonterminal in the grammar.
(This function was called empty in Definition 3.)
firsts :: (Ord s, Symbol s) => CFG s -> [(s,[s])]
Function firsts takes a grammar and computes the first set of each symbol
(the first set of a terminal is the terminal itself).
follow :: (Ord s, Symbol s) => CFG s -> [(s,[s])]
Function follow takes a grammar and computes the follow set of each non-
terminal (so it associates a list of symbols with each nonterminal).
lookSet :: Ord s =>
(s -> Bool) -> -- isEmpty
(s -> [s]) -> -- firsts?
(s -> [s]) -> -- follow?
(s, [s]) -> -- production
[s] -- lookahead set
Note that we use the operator ?, see Section 6.4.2, on the firsts and follow
association lists. Function lookSet takes a predicate, two functions that given
a nonterminal return the first and follow set, respectively, and a production,
and returns the lookahead set of the production. Function lookSet is intro-
duced after the definition of function lookaheadp.
Now we define:
lookaheadp :: (Symbol s, Ord s) => CFG s -> (s,[s]) -> [s]
lookaheadp grammar =
lookSet (isEmpty grammar) ((firsts grammar)?) ((follow grammar)?)
We will exemplify the definition of function lookSet with the grammar exGrammar,
with the following productions:
S → AaS | B | CB
A → SC | ε
B → A | b
C → D
D → d
Consider the production S → AaS. The lookahead set of the production contains
the set of symbols which can appear as the first terminal symbol of a sequence of
symbols derived from A. But, since the nonterminal symbol A can derive the empty
string, the lookahead set also contains the symbol a.
Consider the production A → SC. The lookahead set of the production contains
the set of symbols which can appear as the first terminal symbol of a sequence of
symbols derived from S. But, since the nonterminal symbol S can derive the empty
string, the lookahead set also contains the set of symbols which can appear as the
first terminal symbol of a sequence of symbols derived from C.
Finally, consider the production B → A. The lookahead set of the production
contains the set of symbols which can appear as the first terminal symbol of a
sequence of symbols derived from A. But, since the nonterminal symbol A can
derive the empty string, the lookahead set also contains the set of terminal symbols
which can follow the nonterminal symbol B in some derivation.
The examples show that it is useful to have functions firsts and follow in which,
for every nonterminal symbol n, we can look up the terminal symbols which can
appear as the first terminal symbol of a sequence of symbols in some derivation
from n and the set of terminal symbols which can follow the nonterminal symbol n
in a sequence of symbols occurring in some derivation respectively. It turns out that
the definition of function follow also makes use of a function lasts which is similar
to the function firsts, but which deals with last nonterminal symbols rather than
first terminal ones.
The examples also illustrate a control structure which will be used very often in the
following algorithms: we will fold over right-hand sides. While doing so we compute
sets of symbols for all the symbols of the right-hand side which we encounter and
collect them into a final set of symbols. Whenever such a list for a symbol is
computed, there are always two possibilities:
• either we continue folding and return the result of taking the union of the set
obtained from the current element and the set obtained by recursively folding
over the rest of the right-hand side,
• or we stop folding and immediately return the set obtained from the current
element.
We continue if the current symbol is a nonterminal which can derive the empty
sequence and we stop if the current symbol is either a terminal symbol or a nonterminal symbol which cannot derive the empty sequence. The following function
makes this statement more precise.
foldrRhs :: Ord s =>
(s -> Bool) ->
(s -> [s]) ->
[s] ->
[s] ->
[s]
foldrRhs p f start = foldr op start
where op x xs = f x `union` (if p x then xs else [])
The function foldrRhs is, of course, most naturally defined in terms of the function
foldr. This function is somewhere in between a general purpose and an application
specific function (we could easily have made it more general though). In the exercises
we give an alternative characterisation of foldrRhs. We will also need a function
scanrRhs which is like foldrRhs but accumulates intermediate results in a list.
The function scanrRhs is most naturally defined in terms of the function scanr.
scanrRhs :: Ord s =>
(s -> Bool) ->
(s -> [s]) ->
[s] ->
[s] ->
[[s]]
scanrRhs p f start = scanr op start
where op x xs = f x `union` (if p x then xs else [])
Finally, we will also need a function scanlRhs which does the same job as scanrRhs
but in the opposite direction. The easiest way to define scanlRhs is in terms of
scanrRhs and reverse.
scanlRhs p f start = reverse . scanrRhs p f start . reverse
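To get a feeling for foldrRhs, here is a small instance on numbers rather than
grammar symbols (an illustration only, not part of the parser):

foldrRhs even (\x -> [x]) [] [2,4,5,7]
-- = [2] `union` ([4] `union` [5])
-- = [2,4,5]

Folding stops at 5, the first element that does not satisfy even; the start set would
only be reached if all elements satisfied the predicate.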
We now return to the function lookSet.
lookSet :: Ord s =>
(s -> Bool) ->
(s -> [s]) ->
(s -> [s]) ->
(s,[s]) ->
[s]
lookSet p f g (nt,rhs) = foldrRhs p f (g nt) rhs
The function lookSet makes use of foldrRhs to fold over a right-hand side. As
stated above, the function foldrRhs continues processing a right-hand side only
if it encounters a nonterminal symbol for which p (so isEmpty in the lookSet
instance lookaheadp) holds. Thus, the set g nt (follow ? nt in the lookSet instance
lookaheadp) is only important for those right-hand sides for nt that consist of
nonterminals that can all derive the empty sequence. We can now (assuming that the
definitions of the auxiliary functions are given) use the lookaheadp instance
of lookSet to compute the lookahead sets of all productions.
look nt rhs = lookaheadp exGrammar (nt,rhs)
? look 'S' "AaS"
"dba"
? look 'S' "B"
"dba"
? look 'S' "CB"
"d"
? look 'A' "SC"
"dba"
? look 'A' ""
"ad"
? look 'B' "A"
"dba"
? look 'B' "b"
"b"
? look 'C' "D"
"d"
? look 'D' "d"
"d"
It is clear from this result that exGrammar is not an LL(1) grammar. Let us have
a closer look at how these lookahead sets are obtained. We will have to use the
functions firsts and follow and the predicate isEmpty for computing intermediate
results. The corresponding subsections explain how to compute these intermediate
results.
For the lookahead set of the production S → AaS we fold over the right-hand side
AaS. Folding stops at a and we obtain
firsts? 'A' `union` firsts? 'a'
==
"dba" `union` "a"
==
"dba"
For the lookahead set of the production A → SC we fold over the right-hand side
SC. Folding stops at C since it cannot derive the empty sequence, and we obtain
firsts? 'S' `union` firsts? 'C'
==
"dba" `union` "d"
==
"dba"
Finally, for the lookahead set of the production B → A we fold over the right-hand
side A. In this case we fold over the complete (one element) list and we obtain
firsts? 'A' `union` follow? 'B'
==
"dba" `union` "d"
==
"dba"
The other lookahead sets are computed in a similar way.
10.2.6 Implementation of empty
Many functions defined in this chapter make use of a predicate isEmpty, which
tests whether or not the empty sequence can be derived from a nonterminal. This
subsection defines this function. Consider the grammar exGrammar. We are now only
interested in deriving sequences which contain only nonterminal symbols (since it is
impossible to derive the empty string if a terminal occurs). Therefore we only have
to consider the productions in which no terminal symbols appear in the right-hand
sides.
S → B | CB
A → SC | ε
B → A
C → D
One can immediately see from those productions that the nonterminal A derives the
empty string in one step. To know whether there are any nonterminals which derive
the empty string in more than one step, we eliminate the productions for A and we
eliminate all occurrences of A in the right-hand sides of the remaining productions:
S → B | CB
B → ε
C → D
One can now conclude that the nonterminal B derives the empty string in two steps.
Doing the same with B as we did with A gives us the following productions
S → ε | C
C → D
One can now conclude that the nonterminal S derives the empty string in three steps.
Doing the same with S as we did with A and B gives us the following productions
C → D
At this stage we can conclude that there are no more new nonterminals which derive
the empty string.
We now give the Haskell implementation of the algorithm described above. The
algorithm is iterative: it does the same steps over and over again until some desired
condition is met. For this purpose we use function fixedPoint, which takes a
function and a set, and repeatedly applies the function to the set, until the set does
not change anymore.
fixedPoint :: Ord a => ([a] -> [a]) -> [a] -> [a]
fixedPoint f xs | xs == nexts = xs
| otherwise = fixedPoint f nexts
where nexts = f xs
fixedPoint f is sometimes called the fixed-point of f. Function isEmpty determines
whether or not a nonterminal can derive the empty string. A nonterminal can derive
the empty string if it is a member of the emptySet of a grammar.
isEmpty :: (Symbol s, Ord s) => CFG s -> s -> Bool
isEmpty grammar = (`elem` emptySet grammar)
The emptySet of a grammar is obtained by the iterative process described in the
example above. We start with the empty set of nonterminals, and at each step n of
the computation of the emptySet as a fixedPoint, we add the nonterminals that
can derive the empty string in n steps. Function emptyStepf adds a nonterminal if
there exists a production for the nonterminal of which all elements can derive the
empty string.
emptySet :: (Symbol s, Ord s) => CFG s -> [s]
emptySet grammar = fixedPoint (emptyStepf grammar) []
emptyStepf :: (Symbol s, Ord s) => CFG s -> [s] -> [s]
emptyStepf grammar set =
nub (map fst (filter (\(nt,rhs) -> all (`elem` set) rhs)
(prods grammar)
) )
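Assuming the definitions above, the computation for exGrammar proceeds exactly
as in the informal derivation: A is added in the first step, B in the second, and S
in the third:
? emptySet exGrammar
"SAB"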
10.2.7 Implementation of first and last
Function firsts takes a grammar, and returns for each symbol of the grammar (so
also the terminal symbols) the set of terminal symbols with which a sentence derived
from that symbol can start. The first set of a terminal symbol is the terminal symbol
itself.
The first set of each symbol consists of that symbol itself, plus the (first) symbols
that can be derived from that symbol in one or more steps. So the first set can be
computed by an iterative process, just as the function isEmpty.
Consider the grammar exGrammar again. We start the iteration with
[('S',"S"),('A',"A"),('B',"B"),('C',"C"),('D',"D")
,('a',"a"),('b',"b"),('d',"d")
]
Using the productions of the grammar we can derive in one step the following lists
of first symbols.
[('S',"ABC"),('A',"S"),('B',"Ab"),('C',"D"),('D',"d")]
and the union of these two lists is
[('S',"SABC"),('A',"AS"),('B',"BAb"),('C',"CD"),('D',"Dd")
,('a',"a"),('b',"b"),('d',"d")]
In two steps we can derive
[('S',"SAbD"),('A',"ABC"),('B',"S"),('C',"d"),('D',"")]
and again we have to take the union of this list with the previous result. We repeat
this process until the list doesn't change anymore. For exGrammar this happens
when:
[('S',"SABCDabd")
,('A',"SABCDabd")
,('B',"SABCDabd")
,('C',"CDd")
,('D',"Dd")
,('a',"a")
,('b',"b")
,('d',"d")
]
Function firsts is defined as the fixedPoint of a step function that iterates the
first computation one more step. The fixedPoint starts with the list that contains
all symbols paired with themselves.
firsts :: (Symbol s, Ord s) => CFG s -> [(s,[s])]
firsts grammar =
fixedPoint (firstStepf grammar) (startSingle grammar)
startSingle :: (Ord s, Symbol s) => CFG s -> [(s,[s])]
startSingle grammar = map (\x -> (x,[x])) (symbols grammar)
The step function takes the old approximation and performs one more iteration step.
At each of these iteration steps we have to add the start list with which the iteration
started again.
firstStepf :: (Ord s, Symbol s) =>
CFG s -> [(s,[s])] -> [(s,[s])]
firstStepf grammar approx = (startSingle grammar)
`combine` (compose (first1 grammar) approx)
combine :: Ord s => [(s,[s])] -> [(s,[s])] -> [(s,[s])]
combine xs = foldr insert xs
where insert (a,bs) [] = [(a,bs)]
insert (a,bs) ((c,ds):rest)
| a == c = (a, union bs ds) : rest
| otherwise = (c,ds) : (insert (a,bs) rest)
compose :: Ord a => [(a,[a])] -> [(a,[a])] -> [(a,[a])]
compose r1 r2 = [(a, unions (map (r2?) bs)) | (a,bs) <- r1]
Finally, function first1 computes the direct first symbols of all productions, taking
into account that some nonterminals can derive the empty string, and combines the
results for the different nonterminals.
first1 :: (Symbol s, Ord s) => CFG s -> [(s,[s])]
first1 grammar =
map (\(nt,fs) -> (nt,unions fs))
(group (map (\(nt,rhs) -> (nt,foldrRhs (isEmpty grammar)
single
[]
rhs
) )
(prods grammar)
) )
where group groups elements with the same first element together
group :: Eq a => [(a,b)] -> [(a,[b])]
group = foldr insertPair []
insertPair :: Eq a => (a,b) -> [(a,[b])] -> [(a,[b])]
insertPair (a,b) [] = [(a,[b])]
insertPair (a,b) ((c,ds):rest) =
if a==c then (c,(b:ds)):rest else (c,ds):(insertPair (a,b) rest)
Function single takes an element and returns the set with that element, and unions
returns the union of a set of sets.
Function lasts is defined using function firsts. Suppose we reverse the right-hand
sides of all productions of a grammar. Then the first set of this reversed grammar
is the last set of the original grammar. This idea is implemented in the following
functions.
reverseGrammar :: Symbol s => CFG s -> CFG s
reverseGrammar =
\(s,al) -> (s,map (\(nt,rhs) -> (nt,reverse rhs)) al)
lasts :: (Symbol s, Ord s) => CFG s -> [(s,[s])]
lasts = firsts . reverseGrammar
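As a quick check of this definition: reversing exGrammar simply reverses every
right-hand side, so we would expect
? prods (reverseGrammar exGrammar)
[('S',"SaA"),('S',"B"),('S',"BC"),('A',"CS"),('A',""),('B',"A"),('B',"b"),('C',"D"),('D',"d")]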
10.2.8 Implementation of follow
Finally, the last function we have to implement is the function follow, which takes
a grammar, and returns an association list in which nonterminals are associated to
symbols that can follow upon the nonterminal in a derivation. A nonterminal n is
associated to a list containing terminal t in follow if n and t follow each other in
some sequence of symbols occurring in some leftmost derivation. We can compute
pairs of such adjacent symbols by splitting up the right-hand sides with length at
least 2 and, using lasts and firsts, compute the symbols which appear at the end
resp. at the beginning of strings which can be derived from the left resp. right part
of the split alternative. Our grammar exGrammar has three alternatives with length
at least 2: "AaS", "CB" and "SC". Function follow uses the functions firsts and
lasts and the predicate isEmpty for intermediate results. The previous subsections
explain how to compute these functions.
Let's see what happens with the alternative "AaS". The lists of all nonterminal
symbols that can appear at the end of sequences of symbols derived from "A" and
"Aa" are "ADC" and "" respectively. The lists of all terminal symbols which can
appear at the beginning of sequences of symbols derived from "aS" and "S" are "a"
and "dba" respectively. Zipping together those lists shows that an 'A', a 'D' and
a 'C' can be followed by an 'a'. Splitting the alternative "CB" in the middle
produces first and last sets "CD" and "dba". Splitting the alternative "SC" in the
middle produces first and last sets "SDCAB" and "d". From the first pair we can see
that a 'C' and a 'D' can be followed by a 'd', a 'b' and an 'a'. From the second
pair we see that an 'S', a 'D', a 'C', an 'A', and a 'B' can be followed by a 'd'.
Combining all these results gives:
[('S',"d"),('A',"ad"),('B',"d"),('C',"adb"),('D',"adb")]
The function follow uses the functions scanrRhs and scanlRhs. The lists produced
by these functions are exactly the ones we need: using the function zip from the
standard prelude we can combine the lists. For example: for the alternative "AaS"
the functions scanlRhs and scanrRhs produce the following lists:
[[], "ADC", [], "SDCAB"]
["dba", "a", "dba", []]
Only the two middle elements of both lists are important (they correspond to the
nontrivial splittings of "AaS"). Thus, we only have to consider alternatives of length
at least 2. We start the computation of follow with assigning the empty follow set
to each symbol:
to each symbol:
follow :: (Symbol s, Ord s) => CFG s -> [(s,[s])]
follow grammar = combine (followNE grammar) (startEmpty grammar)
startEmpty grammar = map (\x -> (x,[])) (symbols grammar)
The real work is done by functions followNE and function splitProds. Function followNE passes the right arguments on to function splitProds, and removes
all nonterminals from the first set, and all terminals from the last set. Function
splitProds splits the productions of length at least 2, and pairs the last nonterminals with the first terminals.
followNE :: (Symbol s, Ord s) => CFG s -> [(s,[s])]
followNE grammar = splitProds
(prods grammar)
(isEmpty grammar)
(isTfirsts grammar)
(isNlasts grammar)
where isTfirsts = map (\(x,xs) -> (x,filter isT xs)) . firsts
isNlasts = map (\(x,xs) -> (x,filter isN xs)) . lasts
splitProds :: (Symbol s, Ord s) =>
[(s,[s])] -> -- productions
(s -> Bool) -> -- isEmpty
[(s,[s])] -> -- terminal firsts
[(s,[s])] -> -- nonterminal lasts
[(s,[s])]
splitProds prods p fset lset =
map (\(nt,rhs) -> (nt,nub rhs)) (group pairs)
where pairs = [(l, f)
| rhs <- map snd prods
, length rhs >= 2
, (fs, ls) <- zip (rightscan rhs) (leftscan rhs)
, l <- ls
, f <- fs
]
leftscan = scanlRhs p (lset?) []
rightscan = scanrRhs p (fset?) []
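Assuming the definitions above, the follow sets of exGrammar come out as in the
informal computation earlier in this subsection (up to the order of the pairs and of
the symbols within each list):
? followNE exGrammar
[('S',"d"),('A',"ad"),('B',"d"),('C',"adb"),('D',"adb")]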
Exercise 10.7 Give the Rose tree representation of the parse tree corresponding to the derivation of the sentence ccccba using grammar gramm1.
Exercise 10.8 Give the Rose tree representation of the parse tree corresponding to the derivation of the sentence abbb using grammar gramm2 defined in the exercises of the
previous section.
Exercise 10.9 Give the Rose tree representation of the parse tree corresponding to the derivation of the sentence acbab using grammar gramm3.
Exercise 10.10 In this exercise we will take a closer look at the functions foldrRhs and
scanrRhs which are the essential ingredients of the implementation of the grammar
analysis algorithms. From the definitions it is clear that grammar analysis is easily
expressed via a calculus for (finite) sets. A calculus for finite sets is implicit in the
programs for LL(1) parsing. Since the code in this module is obscured by several
implementation details we will derive the functions foldrRhs and scanrRhs in a
stepwise fashion. In this derivation we will use the following:
A (finite) set is implemented by a list with no duplicates. In order to construct a
set, the following operations may be used:
[] :: [a]                  -- the empty set of a-elements
union :: [a] -> [a] -> [a] -- the union of two sets
unions :: [[a]] -> [a]     -- the generalised union
single :: a -> [a]         -- the singleton function
These operations satisfy the well-known laws for set operations.
1. Define a function list2Set :: [a] -> [a] which returns the set of elements
occurring in the argument.
2. Define list2Set as a foldr.
3. Define a function pref p :: [a] -> [a] which given a list xs returns the
set of elements corresponding to the longest prefix of xs all of whose elements
satisfy p.
4. Define a function prefplus p :: [a] -> [a] which given a list xs returns
the set of elements in the longest prefix of xs all of whose elements satisfy p
together with the first element of xs that does not satisfy p (if this element
exists at all).
5. Define prefplus p as a foldr.
6. Show that prefplus p = foldrRhs p single [].
7. It can be shown that
foldrRhs p f [] = unions . map f . prefplus p
for all set-valued functions f. Give an informal description of the functions
foldrRhs p f [] and foldrRhs p f start.
8. The standard function scanr is defined by
scanr f q0 = map (foldr f q0) . tails
where tails is a function which takes a list xs and returns a list with all
tail segments (postfixes) of xs in decreasing length. The function scanrRhs is
defined in a similar way:
scanrRhs p f start = map (foldrRhs p f start) . tails
Give an informal description of the function scanrRhs.
□
Exercise 10.11 The computation of the functions empty and firsts is not restricted to
nonterminals only. For terminal symbols s these functions are defined by
empty s = False
firsts s = {s}
Using the definitions in the previous exercise, compute the following.
1. For the example grammar gramm1 and two of its productions A → bSA and
B → Cb:
(a) foldrRhs empty firsts [] bSA
(b) foldrRhs empty firsts [] Cb
2. For the example grammar gramm3 and its production S → AaS:
(a) foldrRhs empty firsts [] AaS
(b) scanrRhs empty firsts [] AaS
□
Appendix A
The Stack module
module Stack
( Stack
, emptyStack
, isEmptyStack
, push
, pushList
, pop
, popList
, top
, split
, mystack
)
where
data Stack x = MkS [x] deriving (Show,Eq)
emptyStack :: Stack x
emptyStack = MkS []
isEmptyStack :: Stack x -> Bool
isEmptyStack (MkS xs) = null xs
push :: x -> Stack x -> Stack x
push x (MkS xs) = MkS (x:xs)
pushList :: [x] -> Stack x -> Stack x
pushList xs (MkS ys) = MkS (xs ++ ys)
pop :: Stack x -> Stack x
pop (MkS xs) = if isEmptyStack (MkS xs)
then error "pop on emptyStack"
else MkS (tail xs)
popIf :: Eq x => x -> Stack x -> Stack x
popIf x stack = if top stack == x
then pop stack
else error "argument and top of stack don't match"
popList :: Eq x => [x] -> Stack x -> Stack x
popList xs stack = foldr popIf stack (reverse xs)
top :: Stack x -> x
top (MkS xs) = if isEmptyStack (MkS xs)
then error "top on emptyStack"
else head xs
split :: Int -> Stack x -> ([x], Stack x)
split 0 stack = ([], stack)
split n (MkS []) = error "attempt to split the emptystack"
split n (MkS (x:xs)) = (x:ys, stack)
where
(ys, stack) = split (n-1) (MkS xs)
mystack = MkS [1,2,3,4,5,6,7]
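A small illustration of the operations in this module (with the definitions above):
? split 3 mystack
([1,2,3],MkS [4,5,6,7])
? top (push 0 mystack)
0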
Appendix B
Answers to exercises
2.1 Three of the four strings are elements of L∗: abaabaaabaa, aaaabaaaa, baaaaabaa.
2.2 .
2.3
{ε}L
= Definition of concatenation of languages
{st | s ∈ {ε}, t ∈ L}
=
{t | t ∈ L}
=
L
and the other equalities are proved in a similar fashion.
2.4 L^n should satisfy:
L^0 = ?
L^(n+1) = L^n L
It follows that we want L^0 to satisfy
L^0 L = L
If we choose L^0 = {ε}, we have
L^0 L
= Definition of concatenation of languages
{st | s ∈ {ε}, t ∈ L}
= Definition of string concatenation
{t | t ∈ L}
= Definition of set comprehension
L
as desired.
2.5 The star operator on sets injects the elements of a set in a list; the star operator
on languages concatenates the sentences of the language. The former star operator
preserves more structure.
2.6 Section 2.1 contains an inductive definition of the set of sequences over an
arbitrary set X. Syntactical definitions for such sets follow immediately from this.
1. A grammar for X = {a} is given by
S → ε
S → aS
2. A grammar for X = {a, b} is given by
S → ε
S → XS
X → a | b
2.7 Analogous to the construction of the grammar for PAL.
P → ε | a | b | aPa | bPb
2.8 Analogous to the construction of PAL.
M → ε | aMa | bMb
2.9 First establish an inductive definition for parity-sequences. An example of a
grammar that can be derived from the inductive definition is:
P → ε | 1P1 | 0P | P0
There are many other solutions.
2.10 Again, establish an inductive definition for L. An example of a grammar that
can be derived from the inductive definition is:
S → ε | aSb | bSa | SS
Again, there are many other solutions.
2.11 A sentence is a sentential form consisting only of terminals which can be
derived in zero or more derivation steps from the start-symbol (to be more precise:
the sentential form consisting only of the start-symbol). The start-symbol is a
nonterminal. The nonterminals of a grammar do not belong to the alphabet (the
set of terminals) of the language we describe using the grammar. Therefore the
start-symbol cannot be a sentence of the language. As a consequence we have to
perform at least one derivation step from the start-symbol before we end up with a
sentence of the language.
2.12 The language consisting of the empty string only, i.e. {ε}.
2.13 This grammar generates the empty language, i.e. ∅.
2.14 The sentences in this language consist of zero or more concatenations of ab.
2.15 Yes. Each finite language is context-free. A context-free grammar can be
obtained by taking one nonterminal and adding a production rule for each sentence
in the language. For the language in exercise 2.1 this yields
S → ab
S → aa
S → baa
2.16 Draw a tree yourself from the following derivations: P ⇒ aPa ⇒ abPba ⇒
abaPaba ⇒ abaaba and P ⇒ bPb ⇒ baPab ⇒ baaab. The datatype Palc can be
defined by:
data Palc = Empty | A Char | B Char | A2 Char Palc Char | B2 Char Palc Char
The two example palindromes are represented by the following values:
cPal1 = A2 'a' (B2 'b' (A2 'a' Empty 'a') 'b') 'a'
cPal2 = B2 'b' (A2 'a' (A 'a') 'a') 'b'
2.17
1. S ⇒ if b then S else S ⇒ if b then if b then S else S ⇒ if b then if b then a else S
⇒ if b then if b then a else a, and S ⇒ if b then S ⇒ if b then if b then S else S
⇒ if b then if b then a else S ⇒ if b then if b then a else a
2. The rule we apply is: match else with the closest previous unmatched then.
This disambiguating rule is incorporated directly into the grammar:
S → MatchedS | UnmatchedS
MatchedS → if b then MatchedS else MatchedS | a
UnmatchedS → if b then S | if b then MatchedS else UnmatchedS
3. An else clause is always matched with the closest previous unmatched if.
2.18 An equivalent grammar for Bit-Lists is
L → BZ | B
Z → ,LZ | ,L
B → 0 | 1
2.19
1. This grammar generates {a^(2n) b^m | n ≥ 0, m ≥ 0}
2. An equivalent non-left-recursive grammar is
S → AB
A → ε | aaA
B → ε | bB
2.20
Expr → Expr + Expr | Expr − Expr | Int
where Int is a nonterminal that produces integers.
2.21 Palindromes
1 data Pal = Pal1 | Pal2 | Pal3 | Pal4 Pal | Pal5 Pal
aPal1 = Pal4 (Pal5 (Pal4 Pal1))
aPal2 = Pal5 (Pal4 Pal2)
2 a2cPal Pal1 = ""
a2cPal Pal2 = "a"
a2cPal Pal3 = "b"
a2cPal (Pal4 p) = "a" ++ a2cPal p ++ "a"
a2cPal (Pal5 p) = "b" ++ a2cPal p ++ "b"
3 aCountPal Pal1 = 0
aCountPal Pal2 = 1
aCountPal Pal3 = 0
aCountPal (Pal4 p) = aCountPal p + 2
aCountPal (Pal5 p) = aCountPal p
2.22 Mirror-Palindromes
1 data Mir = Mir1 | Mir2 Mir | Mir3 Mir
aMir1 = Mir2 (Mir3 (Mir2 Mir1))
aMir2 = Mir2 (Mir3 (Mir3 Mir1))
2 a2cMir Mir1 = ""
a2cMir (Mir2 m) = "a" ++ a2cMir m ++ "a"
a2cMir (Mir3 m) = "b" ++ a2cMir m ++ "b"
3 m2pMir Mir1 = Pal1
m2pMir (Mir2 m) = Pal4 (m2pMir m)
m2pMir (Mir3 m) = Pal5 (m2pMir m)
2.23 Parity-Sequences
1 data Parity = Empty | Parity1 Parity | ParityL0 Parity | ParityR0 Parity
aEven1 = ParityL0 (ParityL0 (Parity1 (ParityL0 Empty)))
aEven2 = ParityL0 (ParityR0 (Parity1 (ParityL0 Empty)))
2 a2cParity Empty = ""
a2cParity (Parity1 p) = "1" ++ a2cParity p ++ "1"
a2cParity (ParityL0 p) = "0" ++ a2cParity p
a2cParity (ParityR0 p) = a2cParity p ++ "0"
2.24 Bit-Lists
1 data BitList = ConsB Bit Z | SingleB Bit
data Z = ConsBL BitList Z | SingleBL BitList
data Bit = Bit0 | Bit1
aBitList1 = ConsB Bit0 (ConsBL (SingleB Bit1) (SingleBL (SingleB Bit0)))
aBitList2 = ConsB Bit0 (ConsBL (SingleB Bit0) (SingleBL (SingleB Bit1)))
2 a2cBitList (ConsB b z) = bit2c b ++ a2cz z
a2cBitList (SingleB b) = bit2c b
a2cz (ConsBL bl z) = a2cBitList bl ++ a2cz z
a2cz (SingleBL bl) = a2cBitList bl
bit2c Bit0 = "0"
bit2c Bit1 = "1"
3 concatBL :: BitList -> BitList -> BitList
concatBL (SingleB b) bl = ConsB b (SingleBL bl)
concatBL (ConsB b z) bl = ConsB b (concatZBL z bl)
concatZBL (ConsBL bl z) bl' = ConsBL bl (concatZBL z bl')
concatZBL (SingleBL bl) bl' = ConsBL bl (SingleBL bl')
2.25 We only give the EBNF notation for the productions that change.
Digs → Dig∗
Z → Sign? Nat
SoS → LoD∗
LoD → Letter | Dig
2.26
L(G?) = L(G) ∪ {ε}
2.27 Let L(P) be the language defined by means of predicate P. Then we have
L(P1) ∪ L(P2) = L(λx . P1(x) ∨ P2(x)), etc.
2.28
L1 is generated by:
S → ZC
Z → aZb | ε
C → c∗
and L2 is generated by:
S → AZ
Z → bZc | ε
A → a∗
L1 ∩ L2 = {a^n b^n c^n | n ∈ ℕ}
and as we will see in Chapter 9, this language is not context-free.
2.32 No, the language L = {aab, baa} also satisfies L = L^R.
2.36 L = {a^n b^n | n ∈ ℕ}. This language is also generated by the grammar:
S → aSb | ε
2.39
S = A | ε
A = aAb | ab
2.40
1. {a^(2n+1) | n ≥ 0}
2. A = aaA | a
3. A = Aaa | a
2.41
1. {ab^n | n ≥ 0}
2. X = aY
Y = bY | ε
3. X = aY | a
Y = bY | b
2.42
1. S = T | US
T = Xa | Ua
X = aS
U = S | YT
Y = SU
The sentential forms aS and SU have been abstracted to nonterminals X and
Y.
2. S = aSa | Ua | US
U = S | SUaSa | SUUa
The nonterminal T has been substituted for its alternatives aSa and Ua.
2.43
S → 1O
O → 1O | 0N
N → 1
2.46
1. First leftmost derivation:
Sentence
⇒ Subject Predicate
⇒ they Predicate
⇒ they Verb NounPhrase
⇒ they are NounPhrase
⇒ they are Adjective Noun
⇒ they are flying Noun
⇒ they are flying planes
2. Second leftmost derivation:
Sentence
⇒ Subject Predicate
⇒ they Predicate
⇒ they AuxVerb Verb Noun
⇒ they are Verb Noun
⇒ they are flying Noun
⇒ they are flying planes
2.48 Here is an unambiguous grammar for the double parentheses language (although it may not be trivial to convince yourself that it actually is unambiguous).
S → () | S() | (S) | S(S) | [] | S[] | [S] | S[S]
2.49 Here is a leftmost derivation for 3 ∗ 3 + 3.
Notice that the grammar of this exercise is the same, up to renaming, as the grammar
E = E+T | T
T = T*F | F
F = 0 | 1
3.1
capital = satisfy (\s -> ('A' <= s) && (s <= 'Z'))
symbol a = satisfy (==a)
3.3 The function epsilon is a special case of succeed:
epsilon :: Parser s ()
epsilon = succeed ()
3.4 Let xs :: [s]. Then
(f <$> succeed a) xs
= definition of <$>
[ (f x, ys) | (x, ys) <- succeed a xs ]
= definition of succeed
[ (f a, xs) ]
= definition of succeed
succeed (f a) xs
3.5 The type and results of list <$> symbol 'a' are (note that you cannot write
this as a definition in Haskell):
(list <$> symbol 'a') :: Parser Char (String -> String)
(list <$> symbol 'a') [] = []
(list <$> symbol 'a') (x:xs) | x == 'a' = [(list x,xs)]
| otherwise = []
3.6 The type and results of list <$> symbol 'a' <*> p are:
(list <$> symbol 'a' <*> p) :: Parser Char String
(list <$> symbol 'a' <*> p) [] = []
(list <$> symbol 'a' <*> p) (x:xs)
| x == 'a' = [(list 'a' x,ys) | (x,ys) <- p xs]
| otherwise = []
3.7
pBool :: Parser Char Bool
pBool = const True <$> token "True"
<|> const False <$> token "False"
3.9
1 data Pal2 = Nil | Leafa | Leafb | Twoa Pal2 | Twob Pal2
2 palin2 :: Parser Char Pal2
palin2 = (\x y z -> Twoa y) <$>
symbol 'a' <*> palin2 <*> symbol 'a'
<|> (\x y z -> Twob y) <$>
symbol 'b' <*> palin2 <*> symbol 'b'
<|> const Leafa <$> symbol 'a'
<|> const Leafb <$> symbol 'b'
<|> succeed Nil
3 palina :: Parser Char Int
palina = (\x y z -> y+2) <$>
symbol 'a' <*> palina <*> symbol 'a'
<|> (\x y z -> y) <$>
symbol 'b' <*> palina <*> symbol 'b'
<|> const 1 <$> symbol 'a'
<|> const 0 <$> symbol 'b'
<|> succeed 0
3.10
1 data English = E1 Subject Pred
data Subject = E2 String
data Pred = E3 Verb NounP | E4 AuxV Verb Noun
data Verb = E5 String | E6 String
data AuxV = E7 String
data NounP = E8 Adj Noun
data Adj = E9 String
data Noun = E10 String
2 english :: Parser Char English
english = E1 <$> subject <*> pred
subject = E2 <$> token "they"
pred = E3 <$> verb <*> nounp
<|> E4 <$> auxv <*> verb <*> noun
verb = E5 <$> token "are"
<|> E6 <$> token "flying"
auxv = E7 <$> token "are"
nounp = E8 <$> adj <*> noun
adj = E9 <$> token "flying"
noun = E10 <$> token "planes"
3.11 As <|> uses ++, it is evaluated more efficiently when it associates to the right.
3.12 The function is the same as <*>, but instead of applying the result of the first parser to that of the second, it pairs them together:
(<,>) :: Parser s a -> Parser s b -> Parser s (a,b)
(p <,> q) xs = [((x,y),zs)
|(x,ys) <- p xs
,(y,zs) <- q ys
]
3.13 Parser transformator, or parser modifier, or parser postprocessor, etcetera.
3.14 The transformator <$> does to the result part of parsers what map does to the
elements of a list.
3.15 The parser combinators <*> and <,> can be defined in terms of each other:
p <*> q = h <$> (p <,> q)
where -- h :: (b -> a,b) -> a
h (f,y) = f y
p <,> q = (h <$> p) <*> q
where -- h :: a -> (b -> (a,b))
h x y = (x, y)
3.16 Yes. You can combine the parser parameter of <$> with a parser that consumes
no input and always yields the function parameter of <$>:
f <$> p = succeed f <*> p
3.17
f                        :: Char -> Parentheses -> Char -> Parentheses -> Parentheses
open                     :: Parser Char Char
f <$> open               :: Parser Char (Parentheses -> Char -> Parentheses -> Parentheses)
parens                   :: Parser Char Parentheses
(f <$> open) <*> parens  :: Parser Char (Char -> Parentheses -> Parentheses)
3.18 To the left. Yes.
3.19
test p s = not (null (filter (null . snd) (p s)))
test p = not . null . filter (null . snd) . p
3.20 Introduce the abbreviation
listOfa = list <$> symbol 'a'
and use the results of exercises 3.5 and 3.6.

xs = []:
  many (symbol 'a') []
=   { definition of many, definition of listOfa }
  (listOfa <*> many (symbol 'a') <|> succeed []) []
=   { definition of <|> }
  (listOfa <*> many (symbol 'a')) [] ++ succeed [] []
=   { exercise 3.6, definition of succeed }
  [] ++ [([],[])]
=
  [([],[])]

xs = ['a']:
  many (symbol 'a') ['a']
=   { definition of many, definition of listOfa }
  (listOfa <*> many (symbol 'a') <|> succeed []) ['a']
=   { definition of <|> }
  (listOfa <*> many (symbol 'a')) ['a'] ++ succeed [] ['a']
=   { exercise 3.6, previous calculation }
  [(['a'],[]),([],['a'])]

xs = ['b']:
  many (symbol 'a') ['b']
=   { definition of many, definition of listOfa }
  (listOfa <*> many (symbol 'a') <|> succeed []) ['b']
=   { definition of <|> }
  (listOfa <*> many (symbol 'a')) ['b'] ++ succeed [] ['b']
=   { exercise 3.6 }
  [([],['b'])]

xs = ['a','b']:
  many (symbol 'a') ['a','b']
=
  (listOfa <*> many (symbol 'a')) ['a','b'] ++ succeed [] ['a','b']
=   { exercise 3.6, previous calculation }
  [(['a'],['b']),([],['a','b'])]

xs = ['a','a','b']:
  many (symbol 'a') ['a','a','b']
=
  (listOfa <*> many (symbol 'a')) ['a','a','b'] ++ succeed [] ['a','a','b']
=   { exercise 3.6, previous calculation }
  [(['a','a'],['b']),(['a'],['a','b']),([],['a','a','b'])]
3.21 The empty alternative is presented last, because the <|> combinator uses list concatenation for concatenating lists of successes. This also holds for the recursive calls; thus the greedy parse of all three a's is presented first, then two a's with a singleton rest string, then one a, and finally the empty result with the original input as rest string.
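This order shows up directly in a small example (a sketch, assuming the combinators from the text):

-- many (symbol 'a') "aab"  yields  [("aa","b"), ("a","ab"), ("","aab")]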
3.23
-- Combinators for repetition
psequence :: [Parser s a] -> Parser s [a]
psequence []     = succeed []
psequence (p:ps) = list <$> p <*> psequence ps

or, as a fold:

psequence :: [Parser s a] -> Parser s [a]
psequence = foldr f (succeed [])
  where f p q = list <$> p <*> q

choice :: [Parser s a] -> Parser s a
choice = foldr (<|>) failp
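A small usage sketch of both combinators (assuming the elementary parsers from the text):

-- psequence [symbol 'a', symbol 'b'] "abc"  yields  [("ab","c")]
-- choice [symbol 'a', symbol 'b'] "bc"      yields  [('b',"c")]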
3.24
token :: Eq s => [s] -> Parser s [s]
token = psequence . map symbol
3.26
identifier :: Parser Char String
identifier = list <$> satisfy isAlpha <*> greedy (satisfy isAlphaNum)
3.27
expr = chainr (chainl term (const (:-:) <$> symbol '-'))
              (const (:+:) <$> symbol '+')
3.28 The proofs can be given by using laws for list comprehension, but here we prefer to exploit the following equation

(f <$> p) xs = map (cross (f,id)) (p xs)                        (B.1)

where cross is defined by:

cross :: (a -> c, b -> d) -> (a,b) -> (c,d)
cross (f,g) (a,b) = (f a, g b)

It has the following property:

cross (f,g) . cross (h,k) = cross (f.h, g.k)                    (B.2)

Furthermore, we will use the following laws about map in our proof: map distributes over composition, concatenation, and the function concat.

map f . map g  = map (f.g)                                      (B.3)
map f (x ++ y) = map f x ++ map f y                             (B.4)
map f . concat = concat . map (map f)                           (B.5)

1.   (h <$> (f <$> p)) xs
   =   { equation B.1 for <$> }
     map (cross (h,id)) ((f <$> p) xs)
   =   { equation B.1 for <$> }
     map (cross (h,id)) (map (cross (f,id)) (p xs))
   =   { map distributes over composition: B.3 }
     map (cross (h,id) . cross (f,id)) (p xs)
   =   { equation B.2 for cross }
     map (cross (h.f,id)) (p xs)
   =   { equation B.1 for <$> }
     ((h.f) <$> p) xs

2.   (h <$> (p <|> q)) xs
   =   { equation B.1 for <$> }
     map (cross (h,id)) ((p <|> q) xs)
   =   { definition of <|> }
     map (cross (h,id)) (p xs ++ q xs)
   =   { map distributes over concatenation: B.4 }
     map (cross (h,id)) (p xs) ++ map (cross (h,id)) (q xs)
   =   { equation B.1 for <$> }
     (h <$> p) xs ++ (h <$> q) xs
   =   { definition of <|> }
     ((h <$> p) <|> (h <$> q)) xs

3. First note that (p <*> q) xs can be written as

   (p <*> q) xs = concat (map (mc q) (p xs))                    (B.6)

   where

   mc q (f,ys) = map (cross (f,id)) (q ys)

   Now we calculate

     (((h.) <$> p) <*> q) xs
   =   { equation B.6 }
     concat (map (mc q) (((h.) <$> p) xs))
   =   { equation B.1 for <$> }
     concat (map (mc q) (map (cross ((h.),id)) (p xs)))
   =   { map distributes over composition: B.3 }
     concat (map (mc q . cross ((h.),id)) (p xs))
   =   { claim B.7 below }
     concat (map (map (cross (h,id)) . mc q) (p xs))
   =   { map distributes over composition: B.3 }
     concat (map (map (cross (h,id))) (map (mc q) (p xs)))
   =   { map distributes over concat: B.5 }
     map (cross (h,id)) (concat (map (mc q) (p xs)))
   =   { equation B.6 }
     map (cross (h,id)) ((p <*> q) xs)
   =   { equation B.1 for <$> }
     (h <$> (p <*> q)) xs

   It remains to prove the claim

   mc q . cross ((h.),id) = map (cross (h,id)) . mc q           (B.7)

   This claim is also proved by calculation:

     (map (cross (h,id)) . mc q) (f,ys)
   =
     map (cross (h,id)) (mc q (f,ys))
   =   { definition of mc }
     map (cross (h,id)) (map (cross (f,id)) (q ys))
   =   { map and cross distribute over composition: B.3, B.2 }
     map (cross (h.f,id)) (q ys)
   =   { definition of mc }
     mc q (h.f,ys)
   =   { definition of cross }
     (mc q . cross ((h.),id)) (f,ys)
3.29
pMir :: Parser Char Mir
pMir =  (\o p q -> Mir3 p) <$> symbol 'b' <*> pMir <*> symbol 'b'
    <|> (\o p q -> Mir2 p) <$> symbol 'a' <*> pMir <*> symbol 'a'
    <|> succeed Mir1
3.30
pBitList :: Parser Char BitList
pBitList =  SingleB <$> pBit
        <|> (\b c bl -> ConsB b bl) <$> pBit <*> symbol ',' <*> pBitList
pBit =  const Bit0 <$> symbol '0'
    <|> const Bit1 <$> symbol '1'
3.31
-- Parser for floating point numbers
fixed :: Parser Char Float
fixed = (+) <$> (fromInt <$> greedy integer)
            <*>
            (((\x y -> y) <$> symbol '.' <*> fractpart) `option` 0.0)

fractpart :: Parser Char Float
fractpart = foldr f 0.0 <$> greedy newdigit
  where f d n = (n + fromInt d)/10.0
3.32
float :: Parser Char Float
float = f <$> fixed
          <*>
          (((\x y -> y) <$> symbol 'E' <*> integer) `option` 0)
  where f m e = m * power e
        power e | e < 0     = 1.0 / power (-e)
                | otherwise = fromInteger (10^e)
4.1
data FloatLiteral = FL1 IntPart FractPart ExponentPart FloatSuffix
| FL2 FractPart ExponentPart FloatSuffix
| FL3 IntPart ExponentPart FloatSuffix
| FL4 IntPart ExponentPart FloatSuffix
deriving Show
type ExponentPart = String
type ExponentIndicator = String
type SignedInteger = String
type IntPart = String
type FractPart = String
type FloatSuffix = String
digit = satisfy isDigit
digits = many1 digit
floatLiteral = (\a b c d e -> FL1 a c d e) <$>
intPart <*> period <*> optfract <*> optexp <*> optfloat
<|> (\a b c d -> FL2 b c d) <$>
period <*> fractPart <*> optexp <*> optfloat
<|> (\a b c -> FL3 a b c) <$>
intPart <*> exponentPart <*> optfloat
<|> (\a b c -> FL4 a b c) <$>
intPart <*> optexp <*> floatSuffix
intPart = signedInteger
fractPart = digits
exponentPart = (++) <$> exponentIndicator <*> signedInteger
signedInteger = (++) <$> option sign "" <*> digits
exponentIndicator = token "e" <|> token "E"
sign = token "+" <|> token "-"
floatSuffix = token "f" <|> token "F"
<|> token "d" <|> token "D"
period = token "."
optexp = option exponentPart ""
optfract = option fractPart ""
optfloat = option floatSuffix ""
4.2 The data and type definitions are the same as before, only the parsers return another (semantic) result.
digit = f <$> satisfy isDigit
  where f c = ord c - ord '0'
digits = foldl f 0 <$> many1 digit
  where f a b = 10*a + b
floatLiteral =  (\a b c d e -> (fromInt a + c) * power d) <$>
                intPart <*> period <*> optfract <*> optexp <*> optfloat
            <|> (\a b c d -> b * power c) <$>
                period <*> fractPart <*> optexp <*> optfloat
            <|> (\a b c -> (fromInt a) * power b) <$>
                intPart <*> exponentPart <*> optfloat
            <|> (\a b c -> (fromInt a) * power b) <$>
                intPart <*> optexp <*> floatSuffix
intPart = signedInteger
fractPart = foldr f 0.0 <$> many1 digit
  where f a b = (fromInt a + b)/10
exponentPart = (\x y -> y) <$> exponentIndicator <*> signedInteger
signedInteger = (\x y -> x y) <$> option sign id <*> digits
exponentIndicator = symbol 'e' <|> symbol 'E'
sign =  const id <$> symbol '+'
    <|> const negate <$> symbol '-'
floatSuffix = symbol 'f' <|> symbol 'F' <|> symbol 'd' <|> symbol 'D'
period = symbol '.'
optexp = option exponentPart 0
optfract = option fractPart 0.0
optfloat = option floatSuffix ' '
power e | e < 0 = 1 / power (-e)
| otherwise = fromInt (10^e)
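As a quick check of the semantic functions (a sketch; among the list of successes, the greedy parse comes first):

-- floatLiteral "1.5e2"  contains the result  (150.0, "")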
4.3 The parsing scheme for Java floats is
digit = f <$> satisfy isDigit
where f c = .....
digits = f <$> many1 digit
where f ds = ..
floatLiteral = f1 <$>
intPart <*> period <*> optfract <*> optexp <*> optfloat
<|> f2 <$>
period <*> fractPart <*> optexp <*> optfloat
<|> f3 <$>
intPart <*> exponentPart <*> optfloat
<|> f4 <$>
intPart <*> optexp <*> floatSuffix
where
f1 a b c d e = .....
f2 a b c d = .....
f3 a b c = .....
f4 a b c = .....
intPart = signedInteger
fractPart = f <$> many1 digit
where f ds = ....
exponentPart = f <$> exponentIndicator <*> signedInteger
where f x y = .....
signedInteger = f <$> option sign ?? <*> digits
where f x y = .....
exponentIndicator = f1 <$> symbol 'e' <|> f2 <$> symbol 'E'
where
f1 c = ....
f2 c = ..
sign = f1 <$> symbol '+' <|> f2 <$> symbol '-'
where
f1 h = .....
f2 h = .....
floatSuffix =  f1 <$> symbol 'f'
           <|> f2 <$> symbol 'F'
           <|> f3 <$> symbol 'd'
           <|> f4 <$> symbol 'D'
where
f1 c = .....
f2 c = .....
f3 c = .....
f4 c = .....
period = symbol '.'
optexp = option exponentPart ??
optfract = option fractPart ??
optfloat = option floatSuffix ??
5.1 Let G = (T, N, R, S). Define the regular grammar G' = (T, N ∪ {S'}, R', S') as follows. S' is a new nonterminal symbol. For the productions R', divide the productions of R in two sets: the productions R1 of which the right-hand side consists just of terminal symbols, and the productions R2 of which the right-hand side ends with a nonterminal. Define a set R3 of productions by adding the nonterminal S' at the right end of each production in R1. Define R' = R ∪ R3 ∪ {S' → S | ε}. The grammar G' thus defined is regular, and generates the language L*.
5.2 In step 2, add the productions
S → aS
S → ε
S → cC
S → a
A → ε
A → cC
A → a
B → cC
B → a
and remove the productions
S → A
A → B
B → C
In step 3, add the production S → a, and remove the productions A → ε, B → ε.
5.3
ndfsa d qs []     = qs
ndfsa d qs (x:xs) = ndfsa d (d qs x) xs
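Note that this makes ndfsa just a left fold over the input. A minimal sketch (the StateSet synonym and the list representation of state sets are assumptions, not the text's definitions):

type StateSet q = [q]

ndfsa :: (StateSet q -> s -> StateSet q) -> StateSet q -> [s] -> StateSet q
ndfsa d = foldl d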
5.4 Let M = (X, Q, d, S, F) be a deterministic finite-state automaton. Then the nondeterministic finite-state automaton M' defined by M' = (X, Q, d', S, F), where d' q x = {d q x}, accepts the same language.
5.5 Let M = (X, Q, d, S, F) be a deterministic finite-state automaton that accepts language L. Then the deterministic finite-state automaton M' = (X, Q, d, S, Q − F), where Q − F is the set of states Q from which all states in F have been removed, accepts the complement of the language L. Here we assume that d q a is defined for all q and a.
5.6
1. L1 ∩ L2 = ¬(¬L1 ∪ ¬L2), so it follows that regular languages are closed under intersection if they are closed under complementation and union. Regular languages are closed under union, see Theorem 10, and they are closed under complementation, see exercise 5.5.
2.   Ldfa M
   =
     {w ∈ X* | dfa accept w (d, (S1, S2), F1 × F2)}
   =
     {w ∈ X* | dfa d (S1, S2) w ∈ F1 × F2}
   =   { this requires a proof by induction }
     {w ∈ X* | (dfa d1 S1 w, dfa d2 S2 w) ∈ F1 × F2}
   =
     {w ∈ X* | dfa d1 S1 w ∈ F1 ∧ dfa d2 S2 w ∈ F2}
   =
     {w ∈ X* | dfa d1 S1 w ∈ F1} ∩ {w ∈ X* | dfa d2 S2 w ∈ F2}
   =
     Ldfa M1 ∩ Ldfa M2
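The calculation in part 2 uses the product-automaton construction: the product DFA runs both automata simultaneously on the same input. A minimal Haskell sketch of its transition function (the name productTrans is an assumption, not the text's library):

-- pairs a transition function for M1 with one for M2
productTrans :: (q1 -> c -> q1) -> (q2 -> c -> q2) -> ((q1,q2) -> c -> (q1,q2))
productTrans d1 d2 (s1,s2) x = (d1 s1 x, d2 s2 x)

A pair (s1,s2) is accepting exactly when s1 ∈ F1 and s2 ∈ F2, which yields the intersection.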
5.7 The answers are transition diagrams, given as figures in the printed text:
1. an automaton with states S, A, and B (S and B accepting), with transitions labelled ( and );
2. an automaton with states S, A, B, C, and D (B, C, and D accepting), with transitions labelled 0 and 1.
5.8 Use the definition of Lre to calculate the languages:
1. {ε, b}
2. (bc*)
3. ab*
5.9
1.   Lre(R(S + T))
   =
     (Lre(R))(Lre(S + T))
   =
     (Lre(R))(Lre(S) ∪ Lre(T))
   =
     (Lre(R))(Lre(S)) ∪ (Lre(R))(Lre(T))
   =
     Lre(RS + RT)
2. Similar to the above calculation.
5.10 Take R = a, S = a, and R = a, S = b.
5.11 If both V and W are subsets of S, then Lre(R(S + V)) = Lre(R(S + W)). Since S ≠ ∅, V = S and W = ∅ satisfy the requirement. Another solution is
V = S ∩ R
W = S − R
Since S ≠ ∅, at least one of S ∩ R and S − R is not empty, and it follows that V ≠ W. There exist other solutions than these.
5.12 The string 01 may not occur in a string in the language of the regular expression. So when a 0 appears somewhere, only 0s can follow. Take 10.
5.13
1. If we can give a regular grammar for (a + bb)* + c, we can use the procedure constructed in exercise 5.1 to obtain a regular grammar for the complete regular expression. The following regular grammar generates (a + bb)* + c:
   S0 → S1 | c
   provided S1 generates (a + bb)*. Again, we use exercise 5.1 and a regular grammar for a + bb to obtain a regular grammar for (a + bb)*. The language of the regular expression a + bb is generated by the regular grammar
   S2 → a | bb
2. The language of the regular expression a* is generated by the regular grammar
   S0 → ε | aS0
   The language of the regular expression b* is generated by the regular grammar
   S1 → ε | bS1
   The language of the regular expression ab is generated by the regular grammar
   S2 → ab
   It follows that the language of the regular expression a* + b* + ab is generated by the regular grammar
   S3 → S0 | S1 | S2
5.14 First construct a nondeterministic finite-state automaton for these grammars, and use the automata to construct the following regular expressions.
1. a(b + b(b + ab)(ab +)) + bb +
2. (0 + 101)
5.16
1. b + b (aa b(a + b))
2. (b + a(bb) b)
6.1
type LNTreeAlgebra a b x = (a -> x,x -> b -> x -> x)
foldLNTree :: LNTreeAlgebra a b x -> LNTree a b -> x
foldLNTree (leaf,node) = fold where
fold (Leaf a) = leaf a
fold (Node l m r) = node (fold l) m (fold r)
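A small usage sketch of foldLNTree (assuming the LNTree datatype from the text, with constructors Leaf and Node):

-- count the leaves of an LNTree
countLeaves :: LNTree a b -> Int
countLeaves = foldLNTree (const 1, \l _ r -> l + r)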
6.2
1. -- height (Leaf x) = 0
   -- height (Bin lt rt) = 1 + (height lt) `max` (height rt)
   height :: BinTree x -> Int
   height = foldBinTree heightAlgebra
   heightAlgebra = (\u v -> 1 + u `max` v, const 0)
2. -- flatten (Leaf x) = [x]
   -- flatten (Bin lt rt) = (flatten lt) ++ (flatten rt)
   flatten :: BinTree x -> [x]
   flatten = foldBinTree ((++), \x -> [x])
3. -- maxBinTree (Leaf x) = x
   -- maxBinTree (Bin lt rt) = (maxBinTree lt) `max` (maxBinTree rt)
   maxBinTree :: Ord x => BinTree x -> x
   maxBinTree = foldBinTree (max, id)
4. -- sp (Leaf x) = 0
   -- sp (Bin lt rt) = 1 + (sp lt) `min` (sp rt)
   sp :: BinTree x -> Int
   sp = foldBinTree spAlgebra
   spAlgebra = (\u v -> 1 + u `min` v, const 0)
5. -- mapBinTree f (Leaf x) = Leaf (f x)
-- mapBinTree f (Bin lt rt) = Bin (mapBinTree f lt) (mapBinTree f rt)
mapBinTree :: (a -> b) -> BinTree a -> BinTree b
mapBinTree f = foldBinTree (Bin, Leaf . f)
6.3
--allPaths (Leaf x) = [[]]
--allPaths (Bin lt rt) = map (Left:) (allPaths lt)
-- ++ map (Right:) (allPaths rt)
allPaths :: BinTree a -> [[Direction]]
allPaths = foldBinTree psAlgebra
psAlgebra :: BinTreeAlgebra a [[Direction]]
psAlgebra = (\u v -> map (Left:) u ++ map (Right:) v, const [[]])
6.4
1. data Resist = Resist :|: Resist
| Resist :*: Resist
| BasicR Float
deriving Show
type ResistAlgebra a = (a -> a -> a, a -> a -> a, Float -> a)
foldResist :: ResistAlgebra a -> Resist -> a
foldResist (par, seq,basic) = fold where
fold (r1 :|: r2) = par (fold r1) (fold r2)
fold (r1 :*: r2) = seq (fold r1) (fold r2)
fold (BasicR f) = basic f
2. result :: Resist -> Float
result = foldResist resultAlgebra
resultAlgebra :: ResistAlgebra Float
resultAlgebra = (\u v -> (u * v) /(u + v), (+), id)
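A usage sketch (the example circuit and the name sample are hypothetical): two 100-ohm resistors in parallel give 50 ohm, and that in series with a 50-ohm resistor gives 100 ohm.

sample :: Resist
sample = (BasicR 100 :|: BasicR 100) :*: BasicR 50

-- result sample  yields  100.0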
6.5
1. isSum = foldExpr ((&&)
,\x y -> False
,\x y -> False
,\x y -> False
,const True
,const True
,\x -> (&&)
)
2. vars = foldExpr ((++)
,(++)
,(++)
,(++)
,const []
,\x -> [x]
,\x y z -> x : (y ++ z)
)
6.6
1. der computes a symbolic differentiation of an expression.
2. The function der is not a compositional function on Expr because the right-hand sides of the Mul and Dvd cases do not only use der e1 dx and der e2 dx, but also e1 and e2 themselves.
3. data Exp = Exp `Plus` Exp
            | Exp `Sub` Exp
            | Con Float
            | Idf String
            deriving Show
type ExpAlgebra a = (a -> a -> a
,a -> a -> a
,Float -> a
,String -> a
)
foldExp :: ExpAlgebra a -> Exp -> a
foldExp (plus, sub, con, idf) = fold where
  fold (e1 `Plus` e2) = plus (fold e1) (fold e2)
  fold (e1 `Sub` e2)  = sub (fold e1) (fold e2)
  fold (Con n)        = con n
  fold (Idf s)        = idf s
4. -- der :: Exp -> String -> Exp
   -- der (e1 `Plus` e2) dx = (der e1 dx) `Plus` (der e2 dx)
   -- der (e1 `Sub` e2) dx  = (der e1 dx) `Sub` (der e2 dx)
   -- der (Con f) dx = Con 0.0
   -- der (Idf s) dx = if s == dx then Con 1.0 else Con 0.0
   der = foldExp derAlgebra
   derAlgebra :: ExpAlgebra (String -> Exp)
   derAlgebra = (\f g -> \s -> f s `Plus` g s
                ,\f g -> \s -> f s `Sub` g s
                ,\n -> \s -> Con 0.0
                ,\s -> \t -> if s == t then Con 1.0 else Con 0.0
                )
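A usage sketch of the foldExp-based der (with the infix constructors written in backquotes, as above):

-- der (Idf "x" `Plus` Con 2.0) "x"  yields  Con 1.0 `Plus` Con 0.0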
6.7
-- replace (Leaf x) y = Leaf y
-- replace (Bin lt rt) y = Bin (replace lt y) (replace rt y)
replace :: BinTree a -> a -> BinTree a
replace = foldBinTree repAlgebra
repAlgebra = (\f g -> \y -> Bin (f y) (g y), \x y -> Leaf y)
6.8
-- path2Value (Leaf x) = \bs -> if null bs then x else error "no root path"
-- path2Value (Bin lt rt) =
--   \bs -> case bs of
--     []         -> error "no root path"
--     (Left:rs)  -> path2Value lt rs
--     (Right:rs) -> path2Value rt rs
path2Value :: BinTree a -> [Direction] -> a
path2Value = foldBinTree pvAlgebra
pvAlgebra :: BinTreeAlgebra a ([Direction] -> a)
pvAlgebra = (\fl fr -> \bs -> case bs of
                {[]         -> error "no root path"
                ;(Left:rs)  -> fl rs
                ;(Right:rs) -> fr rs
                }
            , \x -> \bs -> if null bs
                           then x
                           else error "no root path")
6.9
1. type PalAlgebra p = (p, p, p, p -> p, p -> p)
2. foldPal :: PalAlgebra p -> Pal -> p
   foldPal (pal1,pal2,pal3,pal4,pal5) = fPal where
     fPal Pal1 = pal1
     fPal Pal2 = pal2
     fPal Pal3 = pal3
     fPal (Pal4 p) = pal4 (fPal p)
     fPal (Pal5 p) = pal5 (fPal p)
3. a2cPal = foldPal (""
                    ,"a"
                    ,"b"
                    ,\p -> "a"++p++"a"
                    ,\p -> "b"++p++"b"
                    )
   aCountPal = foldPal (0,1,0,\p->p+2,\p->p)
4. pfoldPal :: PalAlgebra p -> Parser Char p
   pfoldPal (pal1, pal2, pal3, pal4, pal5) = pPal where
     pPal =  const pal1 <$> epsilon
         <|> const pal2 <$> syma
         <|> const pal3 <$> symb
         <|> (\_ p _ -> pal4 p) <$> syma <*> pPal <*> syma
         <|> (\_ p _ -> pal5 p) <$> symb <*> pPal <*> symb
     syma = symbol 'a'
     symb = symbol 'b'
5. The parser pfoldPal m1 returns the concrete representation of a palindrome. The parser pfoldPal m2 returns the number of a's occurring in a palindrome.
6.10
1. type MirAlgebra m = (m, m -> m, m -> m)
2. foldMir :: MirAlgebra m -> Mir -> m
   foldMir (mir1,mir2,mir3) = fMir where
     fMir Mir1 = mir1
     fMir (Mir2 m) = mir2 (fMir m)
     fMir (Mir3 m) = mir3 (fMir m)
3. a2cMir = foldMir ("",\m->"a"++m++"a",\m->"b"++m++"b")
   m2pMir = foldMir (Pal1,Pal4,Pal5)
4. pfoldMir :: MirAlgebra m -> Parser Char m
   pfoldMir (mir1, mir2, mir3) = pMir where
     pMir =  const mir1 <$> epsilon
         <|> (\_ m _ -> mir2 m) <$> syma <*> pMir <*> syma
         <|> (\_ m _ -> mir3 m) <$> symb <*> pMir <*> symb
     syma = symbol 'a'
     symb = symbol 'b'
5. The parser pfoldMir m1 returns the concrete representation of a palindrome. The parser pfoldMir m2 returns the abstract representation of a palindrome.
6.11
1 type ParityAlgebra p = (p,p->p,p->p,p->p)
2 foldParity :: ParityAlgebra p -> Parity -> p
foldParity (empty,parity1,parityL0,parityR0) = fParity where
fParity Empty = empty
fParity (Parity1 p) = parity1 (fParity p)
fParity (ParityL0 p) = parityL0 (fParity p)
fParity (ParityR0 p) = parityR0 (fParity p)
3 a2cParity = foldParity (""
,\x->"1"++x++"1"
,\x->"0"++x
,\x->x++"0"
)
6.12
1 type BitListAlgebra bl z b = ((b->z->bl,b->bl),(bl->z->z,bl->z),(b,b))
2 foldBitList :: BitListAlgebra bl z b -> BitList -> bl
foldBitList ((consb,singleb),(consbl,singlebl),(bit0,bit1)) = fBitList
where
fBitList (ConsB b z) = consb (fBit b) (fZ z)
fBitList (SingleB b) = singleb (fBit b)
fZ (ConsBL bl z) = consbl (fBitList bl) (fZ z)
fZ (SingleBL bl) = singlebl (fBitList bl)
fBit Bit0 = bit0
fBit Bit1 = bit1
3 a2cBitList = foldBitList (((++),id)
,((++),id)
,("0","1")
)
4 pfoldBitList :: BitListAlgebra bl z b -> Parser Char bl
pfoldBitList ((consb,singleb),(consbl,singlebl),(bit0,bit1)) = pBitList
where
pBitList = consb <$> pBit <*> pZ
<|> singleb <$> pBit
pZ = (\_ bl z -> consbl bl z) <$> symbol ','
<*> pBitList
<*> pZ
<|> singlebl <$> pBitList
pBit =  const bit0 <$> symbol '0'
    <|> const bit1 <$> symbol '1'
6.13
1. data Block = B1 Stat Rest
data Rest = R1 Stat Rest | Nix
data Stat = S1 Decl | S2 Use | S3 Nest
data Decl = Dx | Dy
data Use = UX | UY
data Nest = N1 Block
The abstract representation of x;(y;Y);X is
B1 (S1 Dx) (R1 stat2 rest2) where
stat2 = S3 (N1 block)
block = B1 (S1 Dy) (R1 (S2 UY) Nix)
rest2 = R1 (S2 UX) Nix
2. type BlockAlgebra b r s d u n =
(s -> r -> b
,(s -> r -> r, r)
,(d -> s, u -> s, n -> s)
,(d,d)
,(u,u)
,b -> n
)
3. foldBlock :: BlockAlgebra b r s d u n -> Block -> b
foldBlock (b1, (r1, nix), (s1, s2, s3), (dx, dy), (ux, uy), n1) =
foldB
where
foldB (B1 stat rest) = b1 (foldS stat) (foldR rest)
foldR (R1 stat rest) = r1 (foldS stat) (foldR rest)
foldR Nix = nix
foldS (S1 decl) = s1 (foldD decl)
foldS (S2 use) = s2 (foldU use)
foldS (S3 nest) = s3 (foldN nest)
foldD Dx = dx
foldD Dy = dy
foldU UX = ux
foldU UY = uy
foldN (N1 block) = n1 (foldB block)
4. a2cBlock :: Block -> String
a2cBlock = foldBlock a2cBlockAlgebra
a2cBlockAlgebra ::
BlockAlgebra String String String String String String
a2cBlockAlgebra =
(b1, (r1, nix), (s1, s2, s3), (dx, dy), (ux, uy), n1)
where b1 u v = u ++ v
r1 u v = ";" ++ u ++ v
nix = ""
s1 u = u
s2 u = u
s3 u = u
dx = "x"
dy = "y"
ux = "X"
uy = "Y"
n1 u = "(" ++ u ++ ")"
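As a check (a sketch, reusing the abstract representation from part 1), a2cBlock maps that value back to its concrete syntax:

-- a2cBlock (B1 (S1 Dx) (R1 stat2 rest2))  yields  "x;(y;Y);X"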
8.1 The type TreeAlgebra and the function foldTree are defined in the rep-min problem.
1. height = foldTree heightAlgebra
   heightAlgebra = (const 0, \l r -> (l `max` r) + 1)
--frontAtLevel :: Tree -> Int -> [Int]
--frontAtLevel (Leaf i) h = if h == 0 then [i] else []
--frontAtLevel (Bin l r) h =
-- if h > 0 then frontAtLevel l (h-1) ++ frontAtLevel r (h-1)
-- else []
frontAtLevel = foldTree frontAtLevelAlgebra
frontAtLevelAlgebra =
( \i h -> if h == 0 then [i] else []
, \f g h -> if h > 0 then f (h-1) ++ g (h-1) else []
)
2. (a) Straightforward Solution: impossible to give a solution like rm.sol1.
heightAlgebra = (const 0, \l r -> (l `max` r) + 1)
front t = foldTree frontAtLevelAlgebra t h
where h = foldTree heightAlgebra t
(b) Lambda Lifting
front t = foldTree
frontAtLevelAlgebra
t
(foldTree heightAlgebra t)
(c) Tupling Computations
htupfr :: TreeAlgebra (Int, Int -> [Int])
htupfr
-- = (heightAlgebra tuple frontAtLevelAlgebra)
= ( \i -> ( 0
, \h -> if h == 0 then [i] else []
)
, \(lh,f) (rh,g) -> ( (lh `max` rh) + 1
, \h -> if h > 0
then f (h-1) ++ g (h-1)
else []
)
)
front t = fr h
where (h, fr) = foldTree htupfr t
(d) Merging Tupled Functions
It is helpful to note that the merged algebra is constructed such that
foldTree mergedAlgebra t i = (height t, frontAtLevel t i)
Therefore, the definitions corresponding to the fourth solution are
mergedAlgebra :: TreeAlgebra (Int -> (Int, [Int]))
mergedAlgebra =
(\i -> \h -> ( 0
, if h == 0 then [i] else []
)
, \lfun rfun -> \h -> let (lh, xs) = lfun (h-1)
(rh, ys) = rfun (h-1)
in
( (lh `max` rh) + 1
, if h > 0 then xs ++ ys else []
)
)
front t = fr
where (h, fr) = foldTree mergedAlgebra t h
8.2 The highest front problem is the problem of finding the first non-empty list of nodes which are at the lowest level. The solutions are similar to those of the deepest front problem.
1. -- lowest :: Tree -> Int
   -- lowest (Leaf i) = 0
   -- lowest (Bin l r) = ((lowest l) `min` (lowest r)) + 1
   lowest = foldTree lowAlgebra
   lowAlgebra = (const 0, \l r -> (l `min` r) + 1)
2. (a) Straightforward Solution: impossible to give a solution like rm.sol1.
lfront t = foldTree frontAtLevelAlgebra t l
where l = foldTree lowAlgebra t
(b) Lambda Lifting
lfront t = foldTree
frontAtLevelAlgebra
t
(foldTree lowAlgebra t)
(c) Tupling Computations
ltupfr :: TreeAlgebra (Int, Int -> [Int])
ltupfr =
( \i -> ( 0
, (\l -> if l == 0 then [i] else [])
)
, \(lh,f) (rh,g) -> ( (lh `min` rh) + 1
, \l -> if l > 0
then f (l-1) ++ g (l-1)
else []
)
)
lfront t = fr l
where (l, fr) = foldTree ltupfr t
(d) Merging Tupled Functions
It is helpful to note that the merged algebra is constructed such that
foldTree mergedAlgebra t i = (lowest t, frontAtLevel t i)
Therefore, the definitions corresponding to the fourth solution are
lmergedAlgebra :: TreeAlgebra (Int -> (Int, [Int]))
lmergedAlgebra =
( \i -> \l -> ( 0
, if l == 0 then [i] else []
)
, \lfun rfun -> \l ->
let (ll, lres) = lfun (l-1)
(rl, rres) = rfun (l-1)
in
( (ll `min` rl) + 1
, if l > 0 then lres ++ rres else []
)
)
lfront t = fr
where (l, fr) = foldTree lmergedAlgebra t l
9.1 We follow the proof pattern for non-regular languages given in the text.
Let n ∈ ℕ.
Take s = a^(n^2) with x = ε, y = a^n, and z = a^(n^2 − n).
Let u, v, w be such that y = uvw with v ≠ ε, that is, u = a^p, v = a^q and w = a^r with p + q + r = n and q > 0.
Take i = 2, then
    xuv^2wz ∉ L
⇐     { defn. x, u, v, w, z, calculus }
    a^(p+2q+r) a^(n^2 − n) ∉ L
⇐     { p + q + r = n and q > 0 }
    n^2 + q is not a square
⇐
    n^2 < n^2 + q < (n + 1)^2
⇐
    q < 2n + 1
9.2 Using the proof pattern.
Let n ∈ ℕ.
Take s = a^n b^(n+1) with x = a^n, y = b^n, and z = b.
Let u, v, w be such that y = uvw with v ≠ ε, that is, u = b^p, v = b^q and w = b^r with p + q + r = n and q > 0.
Take i = 0, then
    xuwz ∉ L
⇐     { defn. x, u, v, w, z, calculus }
    a^n b^(p+r+1) ∉ L
⇐
    p + q + r ≥ p + r + 1
⇐
    q > 0
9.3 Using the proof pattern.
Let n ∈ ℕ.
Take s = a^n b^(2n) with x = a^n, y = b^n, and z = b^n.
Let u, v, w be such that y = uvw with v ≠ ε, that is, u = b^p, v = b^q and w = b^r with p + q + r = n and q > 0.
Take i = 2, then
    xuv^2wz ∉ L
⇐     { defn. x, u, v, w, z, calculus }
    a^n b^(2n+q) ∉ L
⇐
    2n + q > 2n
⇐
    q > 0
9.4 Using the proof pattern.
Let n ∈ ℕ.
Take s = a^6 b^(n+4) a^n with x = a^6 b^(n+4), y = a^n, and z = ε.
Let u, v, w be such that y = uvw with v ≠ ε, that is, u = a^p, v = a^q and w = a^r with p + q + r = n and q > 0.
Take i = 6, then
    xuv^6wz ∉ L
⇐     { defn. x, u, v, w, z, calculus }
    a^6 b^(n+4) a^(n+5q) ∉ L
⇐
    n + 5q > n + 4
⇐
    q > 0
9.5 We follow the proof pattern for languages that are not context-free given in the text.
Let c, d ∈ ℕ.
Take z = a^(k^2) with k = max(c, d).
Let u, v, w, x, y be such that z = uvwxy, |vx| > 0 and |vwx| ≤ d.
That is, u = a^p, v = a^q, w = a^r, x = a^s and y = a^t with p + q + r + s + t = k^2, q + r + s ≤ d and q + s > 0.
Take i = 2, then
    uv^2wx^2y ∉ L
⇐     { defn. u, v, w, x, y, calculus }
    a^(k^2 + q + s) ∉ L
⇐     { q + s > 0 }
    k^2 + q + s is not a square
⇐
    k^2 < k^2 + q + s < (k + 1)^2
⇐
    q + s < 2k + 1
⇐     { defn. k }
    q + s ≤ d
9.6 Using the proof pattern.
Let c, d ∈ ℕ.
Take z = a^k with k prime and k > max(c, d).
Let u, v, w, x, y be such that z = uvwxy, |vx| > 0 and |vwx| ≤ d.
That is, u = a^p, v = a^q, w = a^r, x = a^s and y = a^t with p + q + r + s + t = k, q + r + s ≤ d and q + s > 0.
Take i = k + 1, then
    uv^(k+1) wx^(k+1) y ∉ L
⇐     { defn. u, v, w, x, y, calculus }
    a^(k + kq + ks) ∉ L
⇐
    k(1 + q + s) is not a prime
⇐
    q + s > 0
9.7 Using the proof pattern.
Let c, d ∈ ℕ.
Take z = a^k b^k a^k b^k with k = max(c, d).
Let u, v, w, x, y be such that z = uvwxy, |vx| > 0 and |vwx| ≤ d.
Note that our choice for k guarantees that the substring vwx has one of the following shapes:
- vwx consists of just a's, or just b's;
- vwx contains both a's and b's.
Take i = 0, then
- If vwx consists of just a's, or just b's, then it is impossible to write the string uwy as ww for some string w, since only the number of terminals of one kind is decreased.
- If vwx contains both a's and b's, it lies somewhere on the border between a's and b's, or on the border between b's and a's. Then the string uwy can be written as
      uwy = a^s b^t a^p b^q
  for some s, t, p, q, respectively. At least one of s, t, p and q is less than k, while two of them are equal to k. Again this sentence is not an element of the language.
10.1 For both grammars we have:
empty = const False
10.2 For grammar1 we have:
firsts S = {b,c}
firsts A = {a,b,c}
firsts B = {a,b,c}
firsts C = {a,b}
For grammar2:
firsts S = {a}
firsts A = {b}
10.3 For gramm1 we have:
follow S = {a,b,c}
follow A = {a,b,c}
follow B = {a,b}
follow C = {a,b,c}
For gramm2:
follow = const {}
10.4 For the productions of gramm3 we have:
lookAhead (S → AaS) = {a,c}
lookAhead (S → B)   = {b}
lookAhead (A → cS)  = {c}
lookAhead (A → ε)   = {a}
lookAhead (B → b)   = {b}
Since all lookAhead sets for productions of the same nonterminal are disjoint, gramm3 is an LL(1) grammar.
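The disjointness check itself is easy to express; a minimal Haskell sketch (pairwiseDisjoint is an assumed helper, not part of the text's library):

import Data.List (intersect)

-- a grammar is LL(1) when, for each nonterminal, the lookAhead sets of its
-- productions are pairwise disjoint
pairwiseDisjoint :: Eq a => [[a]] -> Bool
pairwiseDisjoint []     = True
pairwiseDisjoint (s:ss) = all (null . intersect s) ss && pairwiseDisjoint ss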
10.5 After left factoring gramm2 we obtain gramm2' with productions
S → aC
C → bA | a
A → bD
D → b | S
For this transformed grammar gramm2' we have:
The empty function:
empty = const False
The firsts function:
firsts S = {a}
firsts C = {a,b}
firsts A = {b}
firsts D = {a,b}
The follow function:
follow = const {}
The lookAhead function:
lookAhead (S → aC) = {a}
lookAhead (C → bA) = {b}
lookAhead (C → a)  = {a}
lookAhead (A → bD) = {b}
lookAhead (D → b)  = {b}
lookAhead (D → S)  = {a}
Clearly, gramm2' is an LL(1) grammar.
10.6 For the empty function we have:
empty R = True
empty _ = False
For the firsts function we have:
firsts L = {0,1}
firsts R = {,}
firsts B = {0,1}
The follow function is:
follow B = {,}
follow _ = {}
The lookAhead function is:
lookAhead (L → B R)  = {0,1}
lookAhead (R → ε)    = {}
lookAhead (R → ,B R) = {,}
lookAhead (B → 0)    = {0}
lookAhead (B → 1)    = {1}
Since all lookAhead sets for productions of the same nonterminal are disjoint, the grammar is LL(1).
10.7
Node S [ Node c []
, Node A [ Node c []
, Node B [ Node c [], Node c [] ]
, Node C [ Node b [], Node a [] ]
]
]
10.8
Node S [ Node a []
, Node C [ Node b []
, Node A [ Node b [], Node D [ Node b [] ] ]
]
]
10.9
Node S [ Node A []
, Node a []
, Node S [ Node A [ Node c [], Node S [ Node B [ Node b []]]]
, Node a []
, Node S [ Node B [ Node b [] ]]
]
]
10.10
1. list2Set :: Ord s => [s] -> [s]
   list2Set = unions . map single
2. list2Set :: Ord s => [s] -> [s]
   list2Set = foldr op []
     where
       op x xs = single x `union` xs
3. pref :: Ord s => (s -> Bool) -> [s] -> [s]
   pref p = list2Set . takeWhile p
   or
   pref p [] = []
   pref p (x:xs) = if p x then single x `union` pref p xs else []
   or
   pref p = foldr op []
     where
       op x xs = if p x then single x `union` xs else []
4. prefplus p [] = []
   prefplus p (x:xs) = if p x then single x `union` prefplus p xs
                              else single x
5. prefplus p = foldr op []
     where
       op x us = if p x then single x `union` us
                        else single x
6. prefplus p
   =
   foldr op []
     where
       op x us = if p x then single x `union` us else single x
   =
   foldr op []
     where
       op x us = single x `union` rest
         where
           rest = if p x then us else []
   =
   foldrRhs p single []
7. The function foldrRhs p f [] takes a list xs and returns the set of all f-images of the elements of the prefix of xs all of whose elements satisfy p, together with the f-image of the first element of xs that does not satisfy p (if this element exists). The function foldrRhs p f start takes a list xs and returns the set start `union` foldrRhs p f [] xs. A sketch of a matching definition is given after this exercise.
8.
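A sketch of a definition of foldrRhs matching the description in part 7 (an assumption, since the official definition is given in the main text; single, union, and unions are the set operations used above):

foldrRhs :: Ord t => (s -> Bool) -> (s -> [t]) -> [t] -> [s] -> [t]
foldrRhs p f start xs = start `union` unions (map f (prefplus p xs))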
10.11
1. For gramm1
   (a)   foldrRhs empty first [] bSA
       =
         unions (map first (prefplus empty bSA))
       =   { empty b = False }
         unions (map first [b])
       =
         unions [{b}]
       =
         {b}
   (b)   foldrRhs empty first [] Cb
       =
         unions (map first (prefplus empty Cb))
       =   { empty C = False }
         unions (map first [C])
       =
         unions [{a,b}]
       =
         {a,b}
2. For gramm3
   (a)   foldrRhs empty first [] AaS
       =
         unions (map first (prefplus empty AaS))
       =   { empty A = True }
         unions (map first ([A] `union` prefplus empty aS))
       =   { empty a = False }
         unions (map first ([A] `union` [a]))
       =
         unions (map first [A, a])
       =
         unions [first A, first a]
       =
         unions [{c}, {a}]
       =
         {a,c}
   (b)   scanrRhs empty first [] AaS
       =
         map (foldrRhs empty first []) (tails AaS)
       =
         map (foldrRhs empty first []) [AaS, aS, S, []]
       =
         [ foldrRhs empty first [] AaS
         , foldrRhs empty first [] aS
         , foldrRhs empty first [] S
         , foldrRhs empty first [] []
         ]
       =   { calculus }
         [{a,c}, {a}, {a,b,c}, {}]