COMP11212 Fundamentals of Computation Part 1: Formal Languages
Organizational issues
When, what, where
This course will be taught as follows.
Lectures: Will take place Mondays at 9.00 and Thursdays at 10.00 in
Kilburn 1.1.
Examples classes: Will take place from Week 2 as follows.

Groups   Time             Location
Z        Tuesdays 10.00   LF15
B+X      Mondays 2.00     IT407
Y        Mondays 3.00     IT407
M+W      Mondays 1.00     IT407
Two parts
This course consists of two distinct parts, taught by two different members
of staff, Andrea Schalk and David Lester.
Lectures   From    To       Content    Taught by
1          28/01   28/01    Intro      Andrea & David
2-11       31/01   04/03    Part 1     Andrea
12-21      07/03   29/04    Part 2     David
22         To be announced  Revision   David & Andrea
Assessment
There is assessed coursework in the form of exercises you are expected to
prepare before each examples class, see below for more detail. The
coursework is worth 25% of the overall mark for the course unit.
The exam consists of four questions, with two questions from each part
of the course. Each question in the exam is worth twenty marks. Students
have to answer three out of the four questions. This means your exam mark
will be out of sixty. It will be worth 75% of your mark for this course unit.
Coursework
You are expected to prepare exercises before each examples class, see
the exercise sheets at the end of the notes. During the examples class these
will be marked, and you will also get help with the questions you could not
do, potentially allowing you to get some additional marks if you can catch
up before the end of the class.
Each exercise is marked out of two, where the marks mean the following.
0: You did not make a serious attempt to solve the question.
1: You made a serious attempt but did not get close to finishing the
question.
2: You did work your way through the exercise and made no serious
mistakes.
There are five marked exercises for each examples class in this part,
allowing a mark out of ten each week. Note that not all exercises are
equally difficult, or require an equal amount of work. If you are seriously
struggling with one exercise, move on to the next one (they don't usually
build on each other). If you are struggling for time, try to do as many as you
can, leaving out the more labour-intensive ones, but try to do something.
Be aware that if you exclusively intend to work during the examples classes
you'll likely get no more than four or five marks each week. Later you will
still have to learn the material for the exam.
Note that the marker may ask you how you solved the various exercises
before assigning you a mark for each of them.
If you cannot answer the question we assume you have plagiarized the
solution and you will get a mark of 0 for all your work that
week.
If this happens twice we will put a note in your file that you
have been caught plagiarizing work.
Weeks in which you have done this will not count towards the seven
examples class threshold (see below).
However, note that your marks here only count if you have something
recorded for at least seven out of the ten examples classes, or if there
are mitigating circumstances.
Changes
Note that the course unit was taught very differently until (and including)
2011/2012. Here are the most important changes.
There used to be three parts. The old second part (Part B in old exam
papers) is no longer taught. Part 1 has been very slightly, and Part 2
(the old Part 3) substantially, expanded since then.
In the old exams questions used to be worth ten marks each, with the
exam being marked out of fifty.
There used to be coursework on top of the exercises prepared for the
examples classes. This no longer exists, but now the examples class
prep is being marked.
concepts and operations we use. If you only understand the informal side of
the course you can pass the exam (and even get a very good mark) but to
master the material (and get an excellent result in the exam) you will have
to get to grips with the mathematical notation as well. A maximum of 10%
of the final mark depends on the material in the Appendix.
Examples classes. The examples classes give you the opportunity to
get help with any problems you have, either with solving exercises or with
understanding the notes. There are four examples classes associated with
this part of the course, in Weeks 2-5. For each of these you are expected
to prepare by solving the key exercises on the appropriate sheet. It also
suggests exercises to do if you are working through the Appendix, and there
are suggestions for additional exercises you might want to do if you find the
set work easy. The sheets can be found from page 83 of the notes. We will
check in each examples class whether you have done the preparation, and
the data will be entered into Arcade. Solutions for each exercise sheet will
be made available on the webpage for this part of the course after the last
associated examples class has taken place.
Marking criteria. The marking criteria for the assessed coursework are
stricter than those for the exam. In particular, I ask you to follow various
algorithms as described in the notes, whereas in the exam I'm happy if you
can solve the problem at hand, and I don't mind how exactly you do that.
In all cases it is important to show your work; if you only give an answer
you may lose a substantial number of marks.
Revision. For revision purposes I suggest going over all the exercises
again. Some of the exercises will probably be new to you since they are not
part of the set preparation work for the examples classes. Also, you can turn
all the NFAs you encounter along the way into DFAs (removing ε-transitions
as required). Lastly, there are exams from previous years to practice on. If
you can do the exercises you will do well in the exam. You will not be asked
to repeat definitions, but knowing about properties of regular and context-free
languages may be advantageous. You should also be aware of the few
theorems, although you will not be asked to recite them.
Webpage for the course. http://www.cs.manchester.ac.uk/ugt/2012/COMP11212/
Reading
For this part of the course these notes cover all the examinable material.
However, sometimes it can be useful to have an additional source to see the
same material introduced and explained in a slightly different way. Also,
there are new examples and exercises to be looked at. For this purpose you
may find one of the following useful.
M. Sipser. Introduction to the Theory of Computation. PWS Publishing Company, 1997. ISBN 0-534-94728-X.
This is a fairly mathematical book that nonetheless tries to keep mathematical
notation to a minimum. It contains many illustrated examples and
aims to explain ideas rather than going through proofs mechanically. A 2005
edition is also available. At around £50 this book is very well thought of, and
even used copies are quite expensive. Relevant to this course: Chapters 0
(what hasn't yet been covered by COMP10020), 1 and 2.1.
Get in touch
Feedback. I'm always keen to get feedback regarding my teaching. Although the course has been taught a few times the notes probably still
contain some errors. I'd like to hear about any errors that you may find
so that I can correct them. I will make those available at http://www.cs.
manchester.ac.uk/ugt/2010/COMP11212/ and fix them for future years.
You can talk to me after lectures, or send me email (use the address on the
title page).
Reward! If you can find a substantial mistake in the lecture notes (that
is, more than just a typo, or a minor language mistake) you get a chocolate
bar.
Wanted. I'm also on the lookout for good examples to use, either in the
notes or in the lectures. These should be interesting, touch important applications of the material, or be fun. For a really good, reasonably substantial
such example the reward is a chocolate bar.
Acknowledgements. I would like to thank Howard Barringer, Pete
Jinks, Djihed Afifi, Francisco Lobo, Andy Ellyard, Ian Worthington, Peter
Sutton, James Bedford, Matt Kelly, Cong Jiang, Mohammed Sabbar, Jonas
Lorenz, Tomas Markevicius and Joe Razavi for taking the time to help me
improve these notes.
Contents

Organization
Glossary            72
Exercise Sheet 1    83
Exercise Sheet 2    84
Exercise Sheet 3    85
Exercise Sheet 4    86
Exercise Sheet 5    87
Chapter 1
Chapter 2
Describing languages to a
computer
In order to solve the kinds of problems that are mentioned in Chapter 1 we
need to be able to describe to a computer what it is looking for.
Only in very simple cases will this consist of just one or two strings; in
general, we want the computer to look for a much larger collection of words.
This could be the set of all possible IP addresses within the University, or
it could be the collection of all strings of the form "Subject: ..." (to pick
out emails on a particular topic), or it could be the set of all strings of the
form "s s" (in the simplest case of finding all doubled words; this one won't
take care of doubled words spread over two lines, nor of the first word being
capitalized).
2.1
Terminology
can concatenate that with the word ba to obtain abba. If we concatenate 0 letters we get the word ε. When we concatenate any word with
the word ε we obtain the same word.
We use the notation of powers for concatenation as follows: If s is a
word then (s)^n is the word we get by concatenating n copies of s. For
example, (010)^3 = 010010010, 1^2 = 11, b^0 = ε and c^1 = c.
A language is a collection of words, which we think of as a set. Examples are {ε}, ∅, {ab, abc, aba} and {a^n | n ∈ N}. We use letters such
as L, L1 and L' to refer to an arbitrary language.
In these notes we are interested in how we can describe languages in
various ways.
If a language is finite then we can describe it quite easily: We just have
to list all its elements. However, this method fails when the language is
infinite, and it becomes impractical even when the language is merely very large. If we want to
communicate to a computer that it is to find all words of such a language
we have to find a concise description.
2.2
One way of describing languages is using set-theoretic notation (see Appendix A for more detail). In the main development here we try to avoid
being overly mathematical, but there are some operations on languages we
need to consider. There are seven of them:
Union. Since languages are just sets we can form their unions.

Intersection. Since languages are merely sets we can form their intersections.

Set difference. If L1 and L2 are languages we can form

L1 \ L2 = {s ∈ L1 | s ∉ L2}.

Complement. If L is the language of all words over some alphabet
Σ, and L' is a subset of L, then the complement of L' in L is L \ L',
the set of all words over Σ which are not contained in L'.

Concatenation. Because we can concatenate words we can use the
concatenation operation to define new languages from existing ones by
extending this operation to apply to languages as follows. Let L1 and
L2 be languages over some alphabet Σ. Then

L1 · L2 = {st | s ∈ L1 and t ∈ L2}.

n-ary Concatenation. If we apply concatenation to the same language by forming L · L there is no reason to stop after just one concatenation. For an arbitrary language L we define

L^n = {s1 s2 · · · sn | si ∈ L for all 1 ≤ i ≤ n}.

We look at the special case L^0 = {ε}, since ε is what we get when
concatenating 0 times.

Kleene star. For an arbitrary language L we define

L* = ⋃_{n∈N} L^n.

Note that ∅* = ⋃_{n∈N} ∅^n = {ε} = ∅^0.
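For finite languages these operations are easy to experiment with on a computer. Below is a minimal sketch in Python (my own illustration, not part of the original notes): a language is a set of strings, the empty string plays the role of ε, and the Kleene star is cut off at a maximum word length, since L* is infinite as soon as L contains a non-empty word.

```python
def concat(L1, L2):
    """The concatenation L1 . L2 of two finite languages."""
    return {s + t for s in L1 for t in L2}

def power(L, n):
    """L^n, i.e. L concatenated with itself n times; L^0 = {''}."""
    result = {""}                      # "" plays the role of ε
    for _ in range(n):
        result = concat(result, L)
    return result

def star_up_to(L, max_len):
    """All words of L* of length at most max_len."""
    result, layer = {""}, {""}
    while True:
        layer = {w for w in concat(layer, L) if len(w) <= max_len}
        if layer <= result:            # nothing new: we are done
            break
        result |= layer
    return result

print(power({"ab", "c"}, 2))           # {'abab', 'abc', 'cab', 'cc'}
print(star_up_to({"0"}, 3))            # {'', '0', '00', '000'}
print(star_up_to(set(), 3))            # {''}
```

The last line checks the observation above: starring the empty language gives {ε}.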
2.3
machine can cope with, although we would have to use a slightly different
alphabet, say {0, 1}, giving the pattern (01)*. All a computer now has to do is to compare: Does
the first character of my string equal 0? Does the next one equal 1? And
so on.
What we have done here is to create a pattern. It consists of various characters of the alphabet, concatenated. We are allowed to apply the Kleene
star to any part of the string so created, and we use brackets ( ) to indicate
which part should be affected by the star. A computer can then match this
pattern.
Are these all the patterns we need? Not quite. How, for example, would
we describe the language

{0^n | n ∈ N} ∪ {1^n | n ∈ N} = {x^n | x = 0 or x = 1}?

We can't use either of 0*1* or (arguably worse) (01)* because both of
these include words that contain 0 as well as 1, whereas any word in our
target language consists entirely of 0s or entirely of 1s. We need to have a
way of saying that either of two possibilities might hold. For this, we use
the symbol |. Then we can use 0*|1* to describe the language above.
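This distinction is easy to check with any regular expression library. A quick illustration using Python's re module, just re-running the examples from the paragraph above:

```python
import re

# 0*1* matches any word whose 0s all come before its 1s...
print(re.fullmatch(r"0*1*", "0011") is not None)   # True

# ...whereas 0*|1* only matches words built entirely from 0s or from 1s.
print(re.fullmatch(r"0*|1*", "0011") is not None)  # False: mixes 0s and 1s
print(re.fullmatch(r"0*|1*", "000") is not None)   # True
print(re.fullmatch(r"0*|1*", "11") is not None)    # True
```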
Exercise 3. Which of the following words match the given patterns? The
patterns are (ab)*, a*b*, (a|b), (a|b)* and ab|b|a; the words to check against
each of them are ab, aba, abab, aab, aabb and aa.
Exercise 4. Describe all the words matching the following patterns. For
finite languages just list the elements; for infinite ones, try to describe the
words in question using English. If you want to practise using set-theoretic
notation, add a description in that format.

(a) (0|1|2)*
(b) (0|1)(0|2)2
(c) (01|10)*
(d) 0*1
(e) (01)*0
(f) 0*1*
(g) (010)*
(h) (01)*0*
(i) (01)*(01)*
(j) (0|1)*
(k) 0*|1*|2
(l) 0|1*|2
2.4
Regular expressions
So far we have been using the idea of a pattern intuitively; we have not
said how exactly we can form patterns, nor have we properly defined when
a word matches a pattern. It is time to become rigorous about these issues.
For reasons of completeness we will need two patterns which seem a
bit weird, namely ε (the pattern which is matched precisely by the empty
word ε) and ∅ (the pattern which is matched by no word at all).
Definition 1. Let Σ be an alphabet. A pattern or regular expression
over Σ is any word over

Σpat = Σ ∪ {ε, ∅, |, *, (, )}

generated by the following inductive definition.

Empty pattern The character ∅ is a pattern;
Empty word the character ε is a pattern;
Letters every letter from Σ is a pattern;
Concatenation if p1 and p2 are patterns then so is (p1 p2);
Alternative if p1 and p2 are patterns then so is (p1|p2);
Kleene star if p is a pattern then so is (p*).
In other words we have defined a language⁴, namely the language of all
regular expressions, or patterns. Note that while we are interested in words
over the alphabet Σ we need additional symbols to create our patterns. That
is why we have to extend the alphabet to Σpat.
In practice we will often leave out some of the brackets that appear in
the formal definition, but only those brackets that can be uniquely reconstructed. Otherwise we would have to write ((0|1)*0) instead of the simpler
(0|1)*0. In order to be able to do that we have to define how to put the
brackets back into such an expression. We first put brackets around any occurrence of * with the sub-pattern immediately to its left, then around any
occurrence of concatenation, and lastly around the alternative operator |.
Note that every regular expression with all its brackets has precisely one
way of building it from the rules; we say that patterns are uniquely parsed.
This isn't quite true once we have removed the brackets: (0|(1|2)) turns into
0|1|2, as does ((0|1)|2). However, given the way we use regular expressions
this does not cause any problems.
Note that many computer languages that use regular expressions have
additional operators for these (see Exercise 7). However, these exist only
for the convenience of the programmer and don't actually make these regular expressions more powerful. We say that they have the same power of
expressivity. Whenever I ask you to create a regular expression or pattern
it is Definition 1 I expect you to follow.
⁴In Chapter 4 we look at how to describe a language like that; it cannot be done using
a pattern.
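Definition 1 translates quite directly into a data type with one constructor per clause. The sketch below is my own illustration in Python; the brute-force matcher at the end is not an algorithm from these notes (Chapter 3 develops automata for this job), but it shows that the inductive definition already determines what matching means.

```python
from dataclasses import dataclass

class Pattern:
    pass

@dataclass
class Empty(Pattern):    # the pattern ∅
    pass

@dataclass
class Eps(Pattern):      # the pattern ε
    pass

@dataclass
class Letter(Pattern):   # a single letter from Σ
    char: str

@dataclass
class Concat(Pattern):   # (p1 p2)
    left: Pattern
    right: Pattern

@dataclass
class Alt(Pattern):      # (p1|p2)
    left: Pattern
    right: Pattern

@dataclass
class Star(Pattern):     # (p*)
    body: Pattern

def matches(p, w):
    """Decide whether the word w matches the pattern p, by recursion on p."""
    if isinstance(p, Empty):
        return False
    if isinstance(p, Eps):
        return w == ""
    if isinstance(p, Letter):
        return w == p.char
    if isinstance(p, Alt):
        return matches(p.left, w) or matches(p.right, w)
    if isinstance(p, Concat):   # try every way of splitting w in two
        return any(matches(p.left, w[:i]) and matches(p.right, w[i:])
                   for i in range(len(w) + 1))
    if isinstance(p, Star):     # split off a non-empty prefix matching the body
        return w == "" or any(matches(p.body, w[:i]) and matches(p, w[i:])
                              for i in range(1, len(w) + 1))

# (0|1)*0, the words over {0, 1} ending in 0:
p = Concat(Star(Alt(Letter("0"), Letter("1"))), Letter("0"))
print(matches(p, "110"), matches(p, "11"))   # True False
```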
2.5
Students who are on the joint honours CS and Maths programme don't take
COMP10020, but they should be able to grasp these ideas without problems.
2.6
Theoreticians want their patterns with as few cases as possible so as to have fewer
cases for proofs by induction. Practitioners want lots of pre-defined shortcuts for ease of
use. This exercise shows that it doesn't really matter which version you use.
If you find the last few of these really hard then skip them for now. The tools of the
next chapter should help you finish them.
Exercise 10. Find a regular expression p over the alphabet {a, b, c} such
that the language defined by p is the one given.
(a) All the words that don't contain the letter c.
(b) All the words where every a is immediately followed by b.
(c) All the words that do not contain the string ab.
(d) All the words that do not contain the string aba.
2.7
Regular languages
2.8
Summary
Chapter 3
3.1
Using pictures
(Diagram: two states, Even and Odd, with transitions labelled 0 between them in both directions.)
Every time we see 0 we switch from the even state to the odd state
and vice versa.
If we wanted to give this as a description to somebody else then maybe
we should also say what we are doing when we see a letter other than 0,
namely stay in whatever state we're in. Let's assume we are talking about
words consisting of 0s and 1s. Also, we'd like to use circles for our states
because they look nicer, so we'll abbreviate their names.

(Diagram: states E and O, transitions labelled 0 between them and loops labelled 1 on each.)
So now somebody else using our picture would know what to do if the
next letter is 0, and what to do if it is 1. But how would somebody else
know where to begin?

(Diagram: the same automaton with a little arrow marking the start state.)
We give them a little arrow that points at the state one should start
in. However, they would still only know whether they finished in the state
called E or the one called O, which wouldn't tell them whether this was the
desired outcome or not. Hence we mark the state we want to be in when
the word comes to an end, and now we do have a complete description of our
task.

(Diagram: the finished automaton, with the desired final state marked as accepting.)
Now the third. This time something happens: If we see 0, we're still
okay, but if we see 1 then we need to reject the word.

(Diagram omitted.)

But what if the word is okay until state 3? Then we have to start all
over again, not caring about the next letter or the one after, but requiring
the third one to be 0. In other words, we're in the same position as at the
start, so the easiest thing to do is not to create state 3, but instead to have
that edge go back to state 0.

(Diagram: the same automaton with that edge going back to state 0.)
Exercise 11. For the languages described in parts (b) and (c) of Exercise 9,
draw a picture as in the examples just given.
3.2
Following a word
(Diagrams: a sequence of example automata over {a, b} through which to follow words; not reproduced here.)
3.3
Formally we may consider the set of all the states in the automaton,
say Q. One of these states is the one we want to start in, called the start
state or the initial state. Some of the states are the ones that tell us if we
end up there we have found the kind of word we were looking for. We call
these accepting states. They form a subset, say F , of Q.
The edges in the graph are a nice way of visualizing the transitions.
Formally what we need is a function that takes as its input

a state and a letter from Σ

and returns

a state.

We call this the transition function, δ. It takes as inputs a state and a letter,
so the input is a pair (q, x), where q ∈ Q and x ∈ Σ. That means that the
input comes from the set

Q × Σ = {(q, x) | q ∈ Q, x ∈ Σ}.

Its output is a state, that is an element of Q. So we have that

δ : Q × Σ → Q.

We sometimes put these four items together in a quadruple and speak of the
DFA (Q, q•, F, δ).
Sometimes people also refer to a finite state machine.¹
Note that for every particular word there is precisely one path through
the automaton: We start in the start state, and then read off the letters
one by one. The transition function makes sure that we will have precisely
one edge to follow for each letter. When we have followed the last letter
of the word we can read off whether we want to accept it (if we are in an
accepting state) or not (otherwise). That's why these automata are called
deterministic; we see non-deterministic automata below.
For every word x1 x2 · · · xn we have a uniquely determined sequence of
states q•, q1, . . . , qn such that

(q• = q0) -x1-> q1 -x2-> q2 -> · · · -xn-> qn.

We accept the word if and only if the last state reached, qn, is an accepting
state.
¹All our automata have a finite number of states and we often drop the word "finite"
when referring to them in these notes.
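The quadruple (Q, q•, F, δ) is also exactly what one would store in a program. Here is a small sketch in Python (the encoding is mine, and I have assumed that O, reached after an odd number of 0s, is the accepting state of the automaton from Section 3.1): following a word just means iterating the transition function.

```python
dfa = {
    "states": {"E", "O"},
    "start": "E",
    "accepting": {"O"},                # assumption: odd number of 0s accepted
    "delta": {("E", "0"): "O", ("E", "1"): "E",
              ("O", "0"): "E", ("O", "1"): "O"},
}

def accepts(dfa, word):
    """Follow the unique path for word; accept if it ends in an accepting state."""
    q = dfa["start"]
    for x in word:
        q = dfa["delta"][(q, x)]       # delta is total: exactly one edge per letter
    return q in dfa["accepting"]

print(accepts(dfa, "000"))             # True: three 0s
print(accepts(dfa, "0110"))            # False: two 0s
```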
3.4
Non-deterministic automata
Now we don't care what happens until we reach the last symbol, and
when that is 1 we want to accept the word. (If it wasn't the last letter then
we shouldn't accept the word.) The following would do that job:
(Diagram: an NFA whose start state has a loop labelled 0, 1 and an edge labelled 1 to an accepting state.)
But now when we are in state 1 and see 1 there are two edges we might
follow: The loop that leads again to state 1 or the edge that leads to state 3.
So now when we follow the word 011 through the automaton there are two
possible paths:
From state 0, read 0, go to state 1.
We say that the automaton accepts the word if there is at least one such
path that ends in an accepting state.
So how does the definition of a non-deterministic automaton differ from
that of a deterministic one? We still have a set of states Q, a particular
start state q• in Q, and a set of accepting states F ⊆ Q.
However, it no longer is the case that for every state and every letter
x from Σ there is precisely one edge labelled with x; there may be several. What
we no longer have is a transition function. Instead we have a transition
relation.² Given a state q, a letter x and another state q', the relation tells
us whether or not there is an edge labelled x from q to q'.
Exercise 15. Go back and check your solutions to Exercises 13 and 14.
Were they all deterministic as required? If not, redo them.
We can turn this idea into a formal definition.

Definition 8. A non-deterministic finite automaton, short NFA, is
given by

a finite non-empty set Q of states,
a start state q• in Q,
a subset F of Q of accepting states, as well as
a transition relation δ which relates a pair consisting of a state and a
letter to a state. We often write

q -x-> q'

if (q, x) is δ-related to q'.
We can now also say when an NFA accepts a word.
Definition 9. A word s = x1 · · · xn over Σ is accepted by the non-deterministic finite automaton (Q, q•, F, δ) if there are states

q0 = q•, q1, . . . , qn

such that for all 0 ≤ i < n, δ relates (qi, xi+1) to qi+1, and such that qn ∈ F,
that is, qn is an accepting state. The language recognized by an NFA
is the set of all words it accepts.
An NFA therefore accepts a word x1 x2 · · · xn if there are states

q• = q0, q1, . . . , qn

such that

(q• = q0) -x1-> q1 -x2-> q2 -> · · · -xn-> qn,

where qn is an accepting state.
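A program does not have to guess the right path: it can simply keep the set of all states reachable after each letter. A sketch under my own encoding (the example NFA is one for "the last letter is 1", with my own state names):

```python
def nfa_accepts(start, delta, accepting, word):
    """delta maps a (state, letter) pair to the set of possible successors."""
    current = {start}
    for x in word:
        current = {q2 for q in current for q2 in delta.get((q, x), set())}
    return any(q in accepting for q in current)

# Start state 0 with a loop on 0 and 1; reading 1 may also move to
# the accepting state 1.
delta = {(0, "0"): {0}, (0, "1"): {0, 1}}
print(nfa_accepts(0, delta, {1}, "011"))   # True: the last letter is 1
print(nfa_accepts(0, delta, {1}, "10"))    # False
```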
Note that unless we have to describe the automaton in another way, or otherwise have
reasons to be able to refer to a particular state, there is no reason for giving the states
names in the picture.
(Diagram: states 0, 1 and 2; the only transitions drawn are 0 -0-> 1, 1 -1-> 2, and a loop labelled 0, 1 on the accepting state 2.)
This is a perfectly good picture of a deterministic finite automaton. However, not all the states, and not all the transitions, are drawn for this automaton: Above we said that for every state, and every letter from the
alphabet, there must be a transition from that state labelled with that letter. Here, however, there is no transition labelled 1 from the state 0, and
no transition labelled 0 from the state 1.
What does the automaton do if it sees 1 in state 0, or 0 in state 1? Well,
it discards the word as non-acceptable, in a manner of speaking.
We can complete the above picture to show all required states by assuming there's a hidden state that we may think of as a dump. As soon as we
have determined that a particular word can't be accepted we send it off into
that dump state (which is certainly not an accepting state), and there's no
way out of that state. So all the transitions not shown in the picture above
go to that hidden state. With the hidden state drawn our automaton looks
like this:
(Diagram: the completed automaton with the dump state drawn; all previously missing transitions lead to it.)
This picture is quite a bit more complicated than the previous one, but
both describe the same DFA, and so contain precisely the same information.
I am perfectly happy for you to draw automata either way when it comes
to exam questions or assessed coursework.
Exercise 17. Consider the following DFA. Which of its states are dump
states, and which are unreachable? Draw the simplest automaton recognizing the same language.

(Diagram of the DFA omitted.)

Describe the language recognized by the automaton.
Exercise 18. Go through the automata you have drawn for Exercises 13
and 14. Identify any dump states in them.
3.5
So far we have found the following differences between deterministic and
non-deterministic automata: For the same problem it is usually easier to design a
non-deterministic automaton, and the resulting automata are often smaller.
Algorithm 1, example
Before looking at the general case we consider an example. Consider the
following NFA.
(Diagram: an NFA with states 0, 1 and 2, of which 2 is accepting; transitions labelled a from 0 to 1, 0 to 2, 2 to 0 and 2 to 2, and transitions labelled b from 0 to 1, 1 to 2 and 2 to 1.)
With a, we can go from state 0 to states 1 and 2, so we invent a new state
we call 12 (think of it as being a set containing both state 1 and state 2).
Because 2 is an accepting state we make 12 an accepting state too.
Now we have to consider the states we have just created. In the original
automaton, from state 1 we can't go anywhere with a, but with b we can go
to state 2, so we introduce an accepting state 2 (thought of as {2}) into our
new automaton.
In the original automaton with a we can go from state 2 to states 0 and
2, so we need a transition labelled a from state 2 to state 02 in our new
DFA.
With b from state 2 we can only go back to 1, so we add this transition
to the new automaton.
Now for the new state 02. From 0 we can go with a to states 1 and 2,
and from state 2 we can get to states 0 and 2, so taking it all together from
state 02 we can go to a new accepting state we call 012.
With b from state 0 we can go to state 1 in the old automaton, and from
state 2 we can also only go to state 1 with a b, so we need a transition from
the new state 02 to state 1 labelled b.
Following the same idea, from 012 with a we can go back to 012, and
with b we can go to 12.
(Diagram: the full DFA, including the unreachable state 01 and the dump state ∅.)
We already know that we may leave out dump states like ∅ when drawing
an automaton.
We have another extra state in this picture, namely {0, 1}. This is a
state we can never get to when starting at the start state {0}, so no word
will ever reach it either. It is therefore irrelevant when it comes to deciding
whether or not a word is accepted by this automaton. We call such states
unreachable and usually don't bother to draw them.
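The construction is entirely mechanical, which makes it easy to implement. Here is a sketch of Algorithm 1 in Python (the encoding and the example transitions are my reading of the diagrams above; states of the new DFA are frozensets of old states):

```python
def nfa_to_dfa(start, alphabet, delta, accepting):
    """Subset construction: delta maps (state, letter) to a set of states."""
    start_set = frozenset([start])
    states, todo = {start_set}, [start_set]
    dfa_delta = {}
    while todo:
        S = todo.pop()
        for x in alphabet:
            # every state reachable from some state of S by reading x
            T = frozenset(q2 for q in S for q2 in delta.get((q, x), ()))
            dfa_delta[(S, x)] = T
            if T not in states:
                states.add(T)
                todo.append(T)
    # a subset is accepting if it contains an accepting state of the NFA
    dfa_accepting = {S for S in states if S & set(accepting)}
    return states, start_set, dfa_accepting, dfa_delta

# The example NFA from above: accepting state 2.
delta = {(0, "a"): {1, 2}, (0, "b"): {1}, (1, "b"): {2},
         (2, "a"): {0, 2}, (2, "b"): {1}}
states, q0, F, d = nfa_to_dfa(0, "ab", delta, {2})
print(len(states))
```

Note that the empty set shows up as a state automatically; it is exactly the dump state ∅ mentioned above.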
Exercise 20. For the NFA from the previous exercise draw the picture of
the full automaton with all states (given by the set of all subsets of {0, 1, 2},
including the unreachable ones) and all transitions.
Exercise 21. For each of the following NFAs, give a DFA recognizing the
same language.
(Diagrams of the three NFAs, (a), (b) and (c), omitted.)
We have three different ways now in which we can describe the same
language.
Exercise 23. Consider the following language over the alphabet {a, b, c} which
consists of all words of odd length. Describe this language in the following
ways:
(a) Using a regular expression.
(b) Using a DFA.
(c) As a set. Unless you work through the Appendix you will probably find
this difficult.
3.6
(Diagram of the automaton omitted.)
Clearly any word that will be accepted has to start with a, can then
contain arbitrarily many further as, must have a b, and arbitrarily many further bs.
That gets the word into an accepting state. After that, the word may have
another a and then it all repeats.
A pattern describing the same language is

(aa*bb*)(aaa*bb*)*.
However, if the automaton is more complicated then reading off a pattern
can become very difficult, in particular if there are several accepting states.
If the automaton is moreover non-deterministic the complexity of the task
worsens further; usually it is therefore a good idea to first convert an NFA
to a DFA using Algorithm 1 from page 32.
In order to show that it is possible to construct a regular expression
defining the same language for every automaton we have to give an algorithm
that works for all automata. This algorithm may look overly complex at first
sight, but it really does work for every automaton. If you had to apply it a
lot you could do the following:

Define a data structure of finite automata in whatever language you're
using. In Java you would be creating a suitable class.
(Diagram: an automaton with states 0, 1 and 2, of which 0 and 2 are accepting; transitions 0 -a-> 1, 1 -b-> 2, 2 -a-> 1 and 2 -c-> 0.)
There are two accepting states. Hence a word that is accepted will either
start in state 0 and end in state 0 or start in state 0 and end in state 2.
That means the language L accepted by this automaton is the union of two
languages which we write as follows:
L = L00 ∪ L02
The indices tell us in which state we start and in which state we finish. This is already a useful observation since we can now concentrate on
calculating one language at a time.
To calculate L00 we note that we can move from state 0 to state 0
in a number of ways. It is so complicated because there are loops in the
automaton. What we do is to break up these ways by controlling the use of
the state with the highest number, state 2, as follows:
To get from state 0 to state 0 we can
either not use state 2 at all (that is, go only via states 0 and 1) or
go to state 2, return to it as many times as we like, and then go from
state 2 to state 0 at the end.
At first sight this does not seem to have simplified matters at all. But
we have now gained control over how we use the state 2 because we can use
this observation to obtain the following equality.
L00 = L^2_00 = L^1_00 ∪ L^1_02 (L^1_22)* L^1_20
L^1_02. To go from state 0 to state 2 without using state 2 in between
we must see a followed by b. Hence this language is equal to {ab}.

L^1_22. To go from state 2 to state 2 without using state 2 in between
there are two possibilities:

we can go from state 2 to state 1 and back, seeing ab along the
way, or
we can go from state 2 to state 0 to state 1 to state 2 and see cab
along the way. (It would be a good idea now to convince yourself
that there is no other possibility.)

Hence this language is equal to {ab} ∪ {cab} = {ab, cab}.

L^1_20. The only way of getting from state 2 to state 0 using only
states 0 and 1 in between is the direct route, which means we must
see c. Hence this language is equal to {c}.
Putting all these together we get the following.

L00 = L^2_00 = L^1_00 ∪ L^1_02 (L^1_22)* L^1_20

L02 = L^1_02 (L^1_22)*
(Diagram: an automaton over {a, b, c} with states 0, 1 and 2, of which 0 and 2 are accepting, whose transitions criss-cross it.)
What is confusing about this automaton is not its size in terms of the
number of states, but the way the transitions criss-cross it.
In order to find all the words accepted by the automaton we have to
identify all the words that

when starting in state 0 end up in state 0. We call the resulting
language L00. We also need all the words that
But finding all the different paths that lead from 0 to 0, or from 0 to 2,
is still pretty tough. The way we simplify that is by taking the state with
the highest index, namely 2, out of consideration as follows.
Every path from the state 0 to the state 0 can do one of the following:

It either doesn't use the state 2 at all, or
it goes from the state 0 to the state 2, then goes back to the state 2
as often as it likes, and ultimately goes to the state 0.

At first sight this doesn't look like a very useful observation. But what
we have done now is to break up any path that starts at the state 0 and
finishes at the state 0 into a succession of paths that only use the state 2 at
controlled points.
We use the same notation as before: All words that follow a path that
goes from state 0 to state 0 while only using states 0 and 1 (but not state 2)
in between make up the language L^1_00. This works similarly for other
start or end states. Reformulating our last observation means then that
every word that follows a path from state 0 to state 0 satisfies one of the
following:

It either is an element of L^1_00, or
it is an element of L^1_02 (L^1_22)* L^1_20.

While the equation may appear to make things more confusing at first
sight, we now have languages which we can more easily determine on the
right hand side of the equation.
We now have the choice between trying to determine the languages on
the right directly, or applying the same idea again.
L^1_00. How do we get from state 0 to state 0 only using states 0
and 1? The simple answer is that we can't move there, but we are
already there, so L^1_00 = {ε}.

L^1_02. Going from state 0 to state 2 using only states 0 and 1 can be
done in two ways, either directly using the letter b or via state 1 using
ab or cb. Hence L^1_02 = {b, ab, cb}.

L^1_22. This is more complicated. Instead of trying to work this out
directly we apply our rule again: When going from state 2 to state 2
using only states 0 and 1 we can either go directly from state 2 to
state 2 or we can go from state 2 to state 1, return to state 1 as often
as we like using only state 0, and then go from state 1 to state 2. In
other words we have

L^1_22 = L^0_22 ∪ L^0_21 (L^0_11)* L^0_12.

We now read off L^0_22 = {b, ab}, L^0_21 = {aa, ac, c}, L^0_11 = {ε} and
L^0_12 = {b}. That gives us

L^1_22 = {b, ab} ∪ {aa, ac, c}{ε}*{b} = {b, ab, aab, acb, cb}.

Now L^1_02 we already calculated above; it is equal to {b, ab, cb}. We also
know already that L^1_22 = {b, ab, aab, acb, cb}. Hence

L02 = {b, ab, cb}{b, ab, aab, acb, cb}* = L((b|ab|cb)(b|ab|aab|acb|cb)*).

Hence the language recognized by the automaton is

L00 ∪ L02
= L(ε|(b|ab|cb)(b|ab|aab|acb|cb)*a) ∪ L((b|ab|cb)(b|ab|aab|acb|cb)*)
= L(ε|((b|ab|cb)(b|ab|aab|acb|cb)*a)|((b|ab|cb)(b|ab|aab|acb|cb)*))
= L(ε|(b|ab|cb)(b|ab|aab|acb|cb)*(ε|a)),

and a regular expression giving the same language is

ε|(b|ab|cb)(b|ab|aab|acb|cb)*(ε|a).
L = ⋃_{i∈F} L0i,

where L0i is the language of all words that, when starting in state 0, end
in state i. Since a word is accepted if and only if it ends in an accepting
state the above equation is precisely what we need.
We can now think of L0i as equal to L^n_0i: the language of all
words that, when starting in state 0, end up in state i is clearly the language of
all words that do so when using any of the states in {0, 1, . . . , n}. In general,
L^k_ji is the language of all those words that, when starting in state j, end
in state i, and use only states with a number less than or equal to k in between.
It is the languages of the form L^k_ji for which we can find expressions that
reduce k by 1: Any path that goes from state j to state i using only states
with numbers at most k will

either go from state j to state i only using states with number at most
k-1 in between (that is, not use state k at all),
or go from state j to state k (using only states with number at most
k-1 in between), return to state k an arbitrary number of times,
and then go from state k to state i using only states with number at
most k-1 in between.

Hence we have

L^k_ji = L^(k-1)_ji ∪ L^(k-1)_jk (L^(k-1)_kk)* L^(k-1)_ki.
If j = k then

L^(k-1)_ji = L^(k-1)_ki   and   L^(k-1)_jk = L^(k-1)_kk,

and if i = k then

L^(k-1)_ji = L^(k-1)_jk   and   L^(k-1)_ki = L^(k-1)_kk.

Thus we get slightly simpler expressions (compare L02 in the above example):

L^j_ji = (L^(j-1)_jj)* L^(j-1)_ji   and   L^i_ji = L^(i-1)_ji (L^(i-1)_ii)*.
expression we have will become longer and longer, so it pays to read off
languages from the automaton as soon as that is possible.
Once we have reached paths which are not allowed to use any other
states we have the following. For j ≠ i, the language L^(-1)_ji consists of
all letters x for which there is a transition labelled x from j to i; for j = i it
additionally contains the empty word ε.
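Because the recursion for L^k_ji is completely uniform it is straightforward to turn into a program that outputs a regular expression. The sketch below is my own rendering of this idea in Python (all names are mine; None plays the role of ∅ and the string "ε" of the empty word, and the example transitions are the ones I read off the first worked example). The base case here always includes ε when j = i, which is harmless since, where that differs from the text, the language in question is starred anyway; the output is equivalent to, though not syntactically identical with, the expressions derived by hand above.

```python
EPS = "ε"   # the empty-word pattern; None stands for the pattern ∅

def alt(ps):
    ps = [p for p in ps if p is not None]
    return "|".join(dict.fromkeys(ps)) if ps else None

def seq(p, q):
    if p is None or q is None:
        return None
    if p == EPS:
        return q
    if q == EPS:
        return p
    wrap = lambda r: "(" + r + ")" if "|" in r else r
    return wrap(p) + wrap(q)

def star(p):
    return EPS if p in (None, EPS) else "(" + p + ")*"

def L(k, j, i, delta):
    """Regular expression for words leading from state j to state i
    using only intermediate states numbered at most k."""
    if k < 0:   # base case: direct transitions (and ε when j == i)
        letters = [x for (q, x, q2) in delta if q == j and q2 == i]
        return alt(letters + ([EPS] if j == i else []))
    return alt([L(k - 1, j, i, delta),
                seq(seq(L(k - 1, j, k, delta),
                        star(L(k - 1, k, k, delta))),
                    L(k - 1, k, i, delta))])

# Transitions of the first worked example:
# 0 -a-> 1, 1 -b-> 2, 2 -a-> 1, 2 -c-> 0.
delta = [(0, "a", 1), (1, "b", 2), (2, "a", 1), (2, "c", 0)]
print(L(2, 0, 2, delta))   # a pattern for L02, equivalent to ab(ab|cab)*
```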
Exercise 26. Give regular expressions defining the languages recognized by
the following automata using Algorithm 2. Hint: Recall that the way you
number the states has an impact on how many steps of the algorithm you
will have to apply!
(Diagrams of the three automata, (a), (b) and (c), omitted.)
3.7
We have a way of going from an automaton to a pattern that we can communicate to a computer, so a natural question is whether one can also go
in the opposite direction. This may sound like a theoretical concern at first
sight, but it is actually quite useful to be able to derive an automaton from
a pattern. That way, if one does come across a pattern that one doesn't
entirely understand one can turn it into an automaton. Also, changing existing patterns so that they apply to slightly different tasks can often be
done more easily by first translating them into an automaton.
For some patterns we can do this quite easily.
Exercise 27. Design DFAs over the alphabet {a, b, c} that recognize the
languages defined by the following patterns.
(a) (a|b)cc.
(b) cc(a|b).
(c) aa|bb|cc.
(d) c(a|b)*c.
Now assume that instead we want to recognize all words that contain a substring matching those patterns. How do you have to change your automata
to achieve that?
For slightly more complicated patterns we can still do this without too
many problems, provided we are allowed to use NFAs.
Exercise 28. Design NFAs over the alphabet {0, 1} that recognize the languages defined by the following patterns.

(a) (00)*|(01)*

(b) (010)*|0(11)*
Now turn your NFAs into DFAs.
However, in general it can be quite difficult to read off an automaton from
a pattern.⁶ We therefore introduce an algorithm that works for all regular
expressions. This algorithm is recursive, and it builds on Definition 1, making
use of the recursive structure of patterns. It is very easy to build automata
for the base cases. However, to build automata for the constructors of
patterns (alternative, concatenation, and star) we need to be a bit cleverer.
Algorithm 3, example
Assume that we want to build an automaton for the regular expression a*b*
based on already having automata for the patterns a* and b*.
The latter are quite easily constructed:
That then is precisely what we do: We generalize our notion of automaton to include transitions labelled not by a letter from our alphabet, but by
the empty word ε.
Definition 10. Let Σ be an alphabet not containing ε. An NFA with ε-transitions over Σ is an NFA over the alphabet Σ that may have transitions
labelled with ε. Hence the transition relation relates pairs of the form (q, x),
where q is a state and x is either an element of Σ or equal to ε, to states.
We now have to worry about what it means for an NFA with ε-transitions
to accept a word. Whenever there is a transition labelled with ε in the
automaton we are allowed to follow it without matching the next letter in
our word.
Definition 11. A word s = x1 · · · xn over Σ is accepted by the NFA
with ε-transitions (Q, q•, F, δ) over Σ if there are states

q0 = q•, q1, . . . , ql

and indices m0 = 0 ≤ m1 ≤ · · · ≤ mn ≤ l such that for each 1 ≤ i ≤ n
there are transitions

q_(m_(i-1)) -ε-> q_(m_(i-1)+1) -ε-> · · · -ε-> q_(m_i - 1) -x_i-> q_(m_i),

as well as transitions

q_(m_n) -ε-> q_(m_n + 1) -ε-> · · · -ε-> q_l,

such that q_l is an accepting state. Here in each case the number of
ε-transitions may be 0. The language recognized by an NFA with
ε-transitions is the set of all words it accepts.

⁶You can always write down some reasonably complex patterns and give it a go.
While we do need NFAs with ε-transitions along the way to constructing
an automaton from a pattern, we do not want to keep the ε-transitions
around, since they make the automata in question much more confusing. We
introduce an algorithm here that removes the ε-transitions from such an
automaton. For that let Σ be an arbitrary alphabet not containing ε.
Before we turn to describing the general case of Algorithm 3 we
investigate the algorithm that removes ε-transitions. Here we say that a
state q' can be reached from a state q using only ε-transitions if there is a
chain of transitions

q = q1 -ε-> q2 -ε-> · · · -ε-> q_(n-1) -ε-> q_n = q'.
Algorithm 4, example

Here is an example. Consider the following automaton with ε-transitions.
We copy the states as they are, and create some new accepting states as
follows:
Pick a non-accepting state. If from there we can reach an accepting state
(in the original automaton) using only ε-transitions, we make this state an
accepting state.
For the example, this means the initial state and the third state from
the left are now accepting states.
We then copy all the transitions labelled with a letter other than ε from
the original automaton.

(Diagram omitted.)
This gives us one new transition, but we already have a transition from
the initial state to the third state from the left, so we can just add the new
label b to that.7
⁷We could also make do with just one loop transition being drawn; I only left in two
to make the process clearer.
Now we may safely remove the unreachable states and any transitions
involving them.
We now have an automaton without ε-transitions that accepts precisely
the same words as the original.
Exercise 29. Turn the following automaton into one that does not have
ε-transitions.
The pattern ∅

An automaton recognizing the language defined by this pattern is given
below. (Diagram omitted.)

The pattern ε

An automaton recognizing the language defined by this pattern is given
below. (Diagram omitted.)
Concatenation

An automaton recognizing the language defined by the pattern p1 p2 is given
below. Here we turn every accepting state of A1 into a non-accepting state
and draw an ε-transition from it to the start state of A2. The only accepting
states in the new automaton are those from A2.

(Diagram: how A1 and A2 combine; omitted.)
Alternative

An automaton accepting the language defined by the pattern p1|p2 is given
below. We add a new start state and connect it with ε-transitions to the
start states of A1 and A2 (so these are no longer start states).

(Diagram: how A1 and A2 combine; omitted.)
Kleene Star

We assume that we have a pattern p and an automaton A that recognizes
the language defined by p.
An automaton accepting the language defined by the pattern p* is given
below. Given an automaton A we introduce a new start state.

(Diagrams for this construction, and for the example to which the Kleene star is then applied, are omitted.)

¹⁰Again we could leave out the transition and the right hand state, which is a dump
state.
3.8
In Section 2 there are examples of how we can build new regular languages
from existing ones. At first sight these may seem like theoretical results
without much practical use. However, what they allow us to do is to build
up quite complicated languages from simpler ones. This also means that we
can build the corresponding regular expressions or automata from simple
ones, following established algorithms. That makes finding suitable patterns
or automata less error-prone, which can be very important.
If our language is finite to start with then finding a pattern for it is very
easy.
Exercise 33. Show that every finite language is regular.
Assume we have two regular languages, L1 and L2 . We look at languages
we can build from these, and show that these are regular.
Concatenation
In Section 2 it is shown that the concatenation of two regular languages
is regular. If L1 and L2 are two regular languages, and p1 is a regular
expression defining the first, and p2 works in the same way for the second,
then
L1 · L2 = L(p1) · L(p2) = L(p1 p2),

so this is also a regular language. There is a description of how to construct
an automaton for L1 · L2 from those for L1 and L2 on page 46.
Kleene star
Again this is something we have already considered. If L is a regular language then so is L*. If p is a regular expression for L then p* is one for L*,
and again our algorithm for turning patterns into automata shows us how
to turn an automaton for L into one for L*.
Reversal
If s is a word over some alphabet Σ then we can construct another word over
the same alphabet by reading s backwards, or, in other words, reversing it.
For a language L over Σ we define

L^R = {x_n x_(n-1) · · · x_2 x_1 | x_1 x_2 · · · x_(n-1) x_n ∈ L}.
Exercise 34. Here are some exercises concerned with reversing strings
that offer a good opportunity for practising what you have learned so far.
Parts (a) and (e) require material that is discussed in Appendix A, so it only
makes sense to do these if you are working through this part of the notes.
(a) Define the reversal of a string as a recursive function.
(b) Look at an automaton for the language of all non-empty words over
the alphabet {a, b} which start with a. How can it be turned into one for
the language of all words which end with a? Hint: How could we have a
given word take the reverse of the path it would ordinarily take through
the automaton?
(c) Look at the language given in Exercise 9 (i). What do the words in its
reversal look like? Now look at an automaton for the given language and
turn it into one for the reversed language. Hint: See above.
(d) In general, describe informally how, given an automaton for a language
L, one can draw one for the language LR .
(e) Take the formal description of a DFA recognizing a language L as in Definition 5 and turn that into a formal definition for an NFA with ε-transitions
which recognizes the language L^R.
Unions
If we have regular expressions for L1 and L2, say p1 and p2 respectively, it is
easy to build a regular expression for L1 ∪ L2, namely p1|p2. But how do we
build an automaton for L1 ∪ L2 from those for L1 and L2? We have already
seen how to do that as well: form an NFA with a new start state which has
ε-transitions to the (old) start states of the two automata, as illustrated on
page 46. If we like we can then turn this NFA with ε-transitions into a DFA.
Intersections
This isn't so easy. It's not at all clear how one would go about it
using patterns, whereas with automata one can see how it might work. The
problem is that we can't say "first get through the automaton for L1, then
through that for L2": When we have followed the word through the first
automaton it has been consumed, because we forget about the letters once
we have followed a transition for them. So somehow we have to find a way
to let the word follow a path through both automata at the same time.
Let's try this with an example: Assume we want to describe the language
L of all words that have an even number of as and an odd number of bs.
Clearly

L = {s ∈ {a, b}* | s has an even number of as}
  ∩ {s ∈ {a, b}* | s has an odd number of bs}.
DFAs for those two languages are easy to construct.

(Diagram: the two DFAs, and the automaton formed from pairs of their states, with states 00, 01, 10 and 11.)
What have we done? We have formed pairs of states, a state from the
first automaton and a state from the second one. We have then added
transitions that literally do follow both automata at the same time.
In general, given DFAs (Q1, q1•, F1, δ1) and (Q2, q2•, F2, δ2) we can form
an automaton that recognizes the intersection of the languages recognized
by the two DFAs as follows.

States: Q1 × Q2.
Start state: (q1•, q2•).
Accepting states: F1 × F2.
Transition function: δ maps ((q1, q2), x) to (δ1(q1, x), δ2(q2, x)). In
other words, there is a transition

(q1, q2) -x-> (q1', q2')

if and only if there are transitions

q1 -x-> q1'   and   q2 -x-> q2'.
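The product construction is again mechanical. A sketch in Python, reusing the dictionary encoding of DFAs from the earlier sketches; the two automata below are my encoding of the even-number-of-as and odd-number-of-bs machines from the example.

```python
def product_dfa(dfa1, dfa2, alphabet):
    """DFA for the intersection of the two recognized languages."""
    states = {(q1, q2) for q1 in dfa1["states"] for q2 in dfa2["states"]}
    return {
        "states": states,
        "start": (dfa1["start"], dfa2["start"]),
        # F1 x F2: both components must be accepting
        "accepting": {(q1, q2) for q1 in dfa1["accepting"]
                      for q2 in dfa2["accepting"]},
        "delta": {((q1, q2), x): (dfa1["delta"][(q1, x)], dfa2["delta"][(q2, x)])
                  for (q1, q2) in states for x in alphabet},
    }

even_as = {"states": {0, 1}, "start": 0, "accepting": {0},
           "delta": {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}}
odd_bs = {"states": {0, 1}, "start": 0, "accepting": {1},
          "delta": {(0, "b"): 1, (1, "b"): 0, (0, "a"): 0, (1, "a"): 1}}
prod = product_dfa(even_as, odd_bs, "ab")
print(prod["accepting"])   # {(0, 1)}: even number of as, odd number of bs
```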
Complements
If L is a language over the alphabet Σ then we may want to consider its
complement, that is

Σ* \ L,

the set of all words over Σ that do not belong to L.
Exercise 36. This is a question concerned with the complement of a language.

(a) Consider the language discussed on page 21 of all words which have a
0 in every position that is a multiple of 3. Begin by defining the complement of this language in {0, 1}*, either using set-theoretic notation
or in English.
Now take the DFA for the original language given on page 22 and turn
it into one for the complement.
(b) Consider the following DFA that recognizes the language of all words
over {0, 1} of length at least 2 whose first letter is a 0 and whose second
letter is a 1.
(Diagram: states 0, 1 and 2; transitions 0 -0-> 1, 1 -1-> 2, and a loop labelled 0, 1 on the accepting state 2.)
3.9
Equivalence of Automata
Here is an issue we have not thought about so far. Is there a way of telling
whether two automata recognize the same language? You might particularly
care about this if you and a friend have created different automata for the
same problem and you're not sure whether you're both correct. We say that
two automata are equivalent if they recognize the same language. Note
that if we have an NFA, and we carry out Algorithm 1 for it, then the
resulting DFA is equivalent to the NFA we started with.
If you're lucky, this is easy to decide: If you have both drawn a picture of
the same automaton, but have merely put different names on your states,
then the automata are what we call isomorphic. It is easy to show that
isomorphic automata recognize the same language.
There are several ways of addressing the more general issue.
Via minimization
Given a DFA it is possible to minimize the number of states it has. There is
an algorithm which does this, and this calculation has very nice properties.
In particular two DFAs recognize the same language if and only if their
minimized automata are isomorphic.
However, the algorithm is non-trivial, and showing that it does what it
needs to do is fairly involved. For this reason we do not give any detail
here.¹¹ Note that if you have NFAs you first have to find equivalent DFAs.

¹¹Ask me for recommendations if you would like to read more about this! This algorithm
is taught in the third year course on compilers.
L = L' if and only if L ⊆ L' and L' ⊆ L.
Via bisimulation
Note that both the methods for deciding equivalence of automata we have
mentioned so far work for DFAs, but not for their non-deterministic relatives. Clearly the question of whether two NFAs recognize the same language
is even harder than that for two DFAs. Nonetheless there is an idea that
helps with this.
The idea is this: Assume we have two NFAs, say A = (Q, q•, F, δ) and
B = (P, p•, E, η). If we can show that for each state q of A there is an
"analogous" state p of B, and vice versa, then they should recognize the same
language, if only we can get our definition of "analogous" right.
Definition 12. We say that a relation ~ from Q to P is a bisimulation
between automata A and B if and only if

~ relates q• and p•;
if ~ relates q ∈ Q with p ∈ P then

if q -x-> q' then there exists p' ∈ P such that p -x-> p' and
~ relates q' and p';
if p -x-> p' then there exists q' ∈ Q such that q -x-> q' and
~ relates q' and p'.

(Diagram: two example automata; omitted.)
We claim that they define the same language, and we demonstrate this by
showing there is a bisimulation between them. We draw the two automata
above each other and show which states the bisimulation relates using grey
(instead of black) lines.
(Diagram: the two automata drawn one above the other, with grey lines indicating which states the bisimulation relates.)
3.10
So far we have assumed implicitly that all languages of interest are regular, that is, that they can be described using regular expressions or finite
automata. This is not really true, but the reason regular languages are so
important is that they have nice descriptions, and that they suffice for many
purposes.
Something a finite automaton cannot do is to count, or at least it cannot count beyond a bound defined a priori. Such an automaton has a finite
number of states, and its only memory is given by those states. Hence it
cannot remember information that cannot be encoded into that many states.
If we want an automaton that decides whether or not a word consists
of at least three letters then this automaton needs at least four states: One
to start in, and three to remember how many letters it has already seen.
Similarly, an automaton that is to decide whether a word contains at least
55 as must have at least 56 states.
However, if we try to construct an automaton that decides whether a
word contains precisely as many 0s as 1s, then we cannot do this: Clearly
the automaton must have a different state for every number of 0s it has
already seen, but that would require it to have infinitely many states, which
is not allowed.
Similarly, how would one construct a pattern for the language

L = {0^n 1^n | n ∈ N}?

We can certainly cope with the language {(01)^n | n ∈ N} by using the
pattern (01)*, but that is because once we have seen 0, and the subsequent 1,
we may forget about it again (or return to the start state of our automaton).
In order to describe L, on the other hand, we really have to remember how
many 0s there are before we find the first 1. But there could be any number
3.11
Summary
Chapter 4
4.1
Generating words
B → 0    B → 1    B → 2    B → 3    B → 4
B → 5    B → 6    B → 7    B → 8    B → 9
S → B    S → (S)    S → S + S    S → S - S    S → S * S    S → S/S

We have to know where to start, and we always use S as the only symbol
we may create from nothing. Once we have S, we can use any of the rules on
the right hand side to replace it by a more complicated string, for example

S ⇒ S * S ⇒ S * (S) ⇒ S * (S - S) ⇒ B * (S - S) ⇒ 5 * (S - S)
  ⇒ 5 * (B - S) ⇒ 5 * (B - B) ⇒ 5 * (3 - B) ⇒ 5 * (3 - 4).
B → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
S → B | (S) | S + S | S - S | S * S | S/S
What we have done is to group together all the ways in which we may
replace either of the two symbols B and S.
4.2
Context-free grammars
¹This assumes that we have a special symbol, namely →, which should not be contained
in either Σ or Ξ to avoid confusion.
²Note that such a string may not contain any non-terminal symbols.
³However, in a finite state machine we only get to scan the word from left to right;
when creating a derivation we may be working on several places at the same time.
4.3
For every transition q -x-> q' in the automaton (that is, for every pair
(q, x) ∈ Q × Σ and q' = δ(q, x)) we introduce a production rule

q → xq'.

For every accepting state q ∈ F we add a production rule q → ε.
We recall the following automaton from Section 3, where we have changed
the names of the states.

(Diagram: the automaton with states S and A, transitions labelled 0 between them in both directions, and loops labelled 1 on each state.)

We can now use non-terminal symbols S and A as well as terminal symbols 0 and 1 for a CFG with the following production rules:

S → ε | 1S | 0A
A → 1A | 0S
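This translation is a short loop over the transition table. A Python sketch (using the same dictionary encoding of DFAs as in the earlier sketches; the example is the S/A automaton just described):

```python
def dfa_to_right_linear(dfa):
    """One rule q -> x q' per transition, plus q -> ε per accepting state."""
    rules = []
    for (q, x), q2 in dfa["delta"].items():
        rules.append((q, x + q2))
    for q in dfa["accepting"]:
        rules.append((q, ""))          # the empty string stands for ε
    return rules

dfa = {"delta": {("S", "1"): "S", ("S", "0"): "A",
                 ("A", "1"): "A", ("A", "0"): "S"},
       "accepting": {"S"}}
for lhs, rhs in dfa_to_right_linear(dfa):
    print(lhs, "->", rhs or "ε")       # S -> 1S, S -> 0A, A -> 1A, A -> 0S, S -> ε
```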
(c) The words that do not contain the string 010. Hint: This language
appears in Exercises 10 and 14 so you already have a suitable DFA. Could
you have found such a grammar without using this method?
(d) By the pattern ab*c*|a*b*c. Could you have found such a grammar
without using this method?
We note that the production rules we have used here are very limited:
for an arbitrary non-terminal symbol R they are of the form R → xR',
where x ∈ Σ and R' ∈ Ξ, or R → x, where⁴ x ∈ Σ, or R → ε. We call
grammars where all production rules are of this shape right-linear. Such
grammars are particularly simple, and in fact every language generated by
a right-linear grammar is regular.
We can do even more for regular languages using CFGs. This is our first
truly applied example of a context-free grammar.
There is a context-free grammar for the language of regular expressions
over some alphabet Σ.
The underlying alphabet of terminal symbols⁵ we require here is

Σ ∪ {ε, ∅, |, *}.

The alphabet of non-terminal symbols can be {S}. We use the following
production rules:

S → ε.
S → ∅.
S → x, for all x ∈ Σ.
S → SS for concatenation.
S → S|S for alternative.
S → S* for the Kleene star.
4.4
When a compiler deals with a piece of code it has to parse it. In other
words, it has to break it down into its constituent parts in a way that makes
sense. Similarly, if you were to write a program that could take (possibly
quite long) arithmetic expressions as input and evaluate them, it would have to
break down each expression to decide in which order to carry out the various
operations.
When we give a derivation we automatically provide one way of breaking
down the given string. Parsing is an attempt to carry out the opposite
process, namely to take a string and find out how to assemble it from simpler
parts. Instead of finding a derivation it is often more meaningful to create a
parse tree, which gives a better overview of how the various bits fit together.
⁴We did not need that rule above, but we need it to be part of this definition.
⁵It would not be a good idea to call this alphabet Σ this time.
(Parse tree diagram omitted.)
We can read off the string generated from the leaves of the tree,⁶ here it
is 5 * (3 - 4), and we can also see which production rules to use to get this
result. If we compare this with a derivation like that given on page 58 we
can see that in general, there will be several derivations for every parse tree.
This is because the parse tree does not specify in which order the rewriting
of non-terminal symbols should be carried out.
If we have S * S, should we first replace the left S by B, or the right
one by (S)? The answer is that it doesn't really matter, since we can do
this in either order, and the strings we can get to from there are just the
same. What the parse tree does then is to remember the important parts
of a derivation, namely which rules to apply to which non-terminals, while
not caring about the order in which some of these rules are applied.
Any non-trivial word in a context-free language typically has many different derivations. However, ideally we would like every word to only have
one parse tree for reasons explained below.
Consider the string 5 * 3 - 4. This has the following two parse trees.

(Parse tree diagrams omitted.)
⁶Note that ε may appear as a leaf, in which case we ignore it when reading off the word
in question.
If the words generated are evaluated in some way⁷ then one should always aim to give an unambiguous grammar. However, it is not always possible to do this, and if that is the case then we call the language in question inherently ambiguous.
We can change the grammar given above in such a way that every word
it generates can be parsed unambiguously as follows.
S → B | (S + S) | (S − S) | (S ∗ S) | (S/S)
B → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Consider now a grammar with non-terminal symbols S and T and the following production rules:
S → aT | T a
T → aT | bT | ε
The word a has two parse trees.
[Two parse trees for the word a: one beginning with the rule S → aT, the other with S → T a, each completed by T → ε.]
Hence we know that this grammar is also ambiguous. How can we turn
it into an unambiguous one? First of all we have to find out which language
is described by the grammar. In this case this is not so difficult: It is the
set of all words over the alphabet {a, b} which start with a or end with a
(or both). The ambiguity arises from the first rule: The a that is being
⁷ The usual convention that tells us that multiplication should be carried out before addition is there so that we do not have to write so many brackets for an expression while it can still be evaluated in a way that leads to a unique result.
demanded may either be the first, or the last, letter of the word, and if the
word starts and ends with a then there will be two ways of generating it.
What we have to do to give an unambiguous grammar for the same language is to pick the first or the last symbol to have priority, the other one becoming relevant only if the chosen one has not been matched. A suitable
grammar is as follows:
S → aT | bU
T → aT | bT | ε
U → aU | bU | a
Now we generate any word from left to right, ensuring the grammar is
unambiguous. We also remember whether the first symbol generated is a,
in which case the remainder of the word can be formed without restrictions,
or b, in which case the last symbol has to be a. It turns out that we do not
need three non-terminal symbols to do this: Can you find a grammar that
uses just two and is still unambiguous?
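Ambiguity of a grammar can also be tested on small words by brute force. The sketch below is our own code, not part of the notes; the dictionary representation of grammars is an assumption. It counts the parse trees of a word by trying every way of splitting the word across the symbols of a right-hand side; any count above 1 witnesses ambiguity.

    # Non-terminal symbols are upper-case characters, "" stands for ε.
    GRAMMAR = {
        "S": ["aT", "Ta"],
        "T": ["aT", "bT", ""],
    }

    def count_trees(symbol, word):
        if symbol not in GRAMMAR:             # a terminal symbol
            return 1 if word == symbol else 0
        return sum(count_splits(rhs, word) for rhs in GRAMMAR[symbol])

    def count_splits(rhs, word):
        # in how many ways can word be split across the symbols of rhs?
        if rhs == "":
            return 1 if word == "" else 0
        return sum(count_trees(rhs[0], word[:i]) * count_splits(rhs[1:], word[i:])
                   for i in range(len(word) + 1))

    print(count_trees("S", "a"))      # 2: the word a has two parse trees
    print(count_trees("S", "aba"))    # 2: it starts and ends with an a

Beware that this naive counter loops forever on left-recursive rules such as S → Sa. On the grammar with three non-terminal symbols above it returns 1 for every word of the language, as it should for an unambiguous grammar.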
Exercise 47. For both grammars over the alphabet Σ = {a, b} given below do the following: Show that the grammar is ambiguous by finding a word that has two parse trees, and give both the parse trees. Now try to determine the language generated by the grammar. Finally give an unambiguous grammar for the same language.
(a) Let Ξ = {S, T } with production rules S → T aT and T → aT | bT | ε.
(b) The grammar with Ξ = {S} and S → aS | aSbS | ε.
4.5 A programming language
This section looks at a very simple programming language, known as the While language, whose programs operate on natural numbers from 0 up to some largest number which we call maxint. We also need variables which can take on the value of any of our natural numbers. Let's assume for the moment we only have three variables, x, y, and z. The language has three kinds of entities:
arithmetic expressions A
boolean expressions B and
statements S.
For each of these we have a production rule in our grammar.
A → 0 | 1 | · · · | maxint | x | y | z | A + A | A − A | A ∗ A
B → tt | ff | A = A | A ≤ A | B ∧ B | ¬B
S → x := A | y := A | z := A | skip | S; S | if B then S else S | while B do S
This may surprise you, but if we add more variables to our language⁸ then
it can calculate precisely what, say, Java can calculate for natural numbers
and booleans.
We can now look at programs in this language. Assume that the value held in the variable y is 10.
x := 1; while ¬(y = 0) do (x := 2 ∗ x; y := y − 1)
This program will calculate 2¹⁰ and store it in x. In fact, it will calculate 2ʸ (assuming y holds a natural number).
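A direct transliteration into Python (our own sketch, purely to check the claim) behaves the same way:

    y = 10              # the assumed initial value of y
    x = 1
    while not (y == 0):
        x = 2 * x
        y = y - 1
    print(x)            # 1024, that is, 2 to the power 10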
Exercise 48. We look at this example in more detail.
(a) Give a derivation for the program above.
(b) Is this grammar unambiguous? If you think it isn't, then give an example and show how this grammar might be made unambiguous; otherwise explain why you think it is.
(c) Give a parse tree for the above program. Can you see how that tells you
something about what computation is supposed to be carried out?
4.6 Backus-Naur form
⁸ When people define grammars for programming languages they typically use abbreviations that allow them to stipulate any variable without listing them all explicitly, but we don't want to introduce more notation at this stage.
We repeat the grammar for the While language from above to illustrate
what this looks like.
Here we assume that a ranges over arithmetic expressions, AExp, b
ranges over boolean expressions, BExp, and S ranges over statements, Stat.
a ::= 0 | 1 | · · · | maxint | x | y | z | a + a | a − a | a ∗ a
b ::= tt | ff | a = a | a ≤ a | b ∧ b | ¬b
S ::= x := a | y := a | z := a | skip | S; S | if b then S else S | while b do S
In the next example we assume that instead we are using ⟨aexp⟩ to range over AExp, ⟨bexp⟩ over BExp, and ⟨stat⟩ over Stat respectively.
⟨aexp⟩ ::= 0 | 1 | · · · | maxint | x | y | z |
    ⟨aexp⟩ + ⟨aexp⟩ | ⟨aexp⟩ − ⟨aexp⟩ | ⟨aexp⟩ ∗ ⟨aexp⟩
⟨bexp⟩ ::= tt | ff | ⟨aexp⟩ = ⟨aexp⟩ | ⟨aexp⟩ ≤ ⟨aexp⟩ |
    ⟨bexp⟩ ∧ ⟨bexp⟩ | ¬⟨bexp⟩
⟨stat⟩ ::= x := ⟨aexp⟩ | y := ⟨aexp⟩ | z := ⟨aexp⟩ | skip | ⟨stat⟩; ⟨stat⟩ |
    if ⟨bexp⟩ then ⟨stat⟩ else ⟨stat⟩ | while ⟨bexp⟩ do ⟨stat⟩
4.7 Operations on context-free languages
There is a description for building new regular languages from old ones in
Section 3. We can use almost any way we like to do so; set-theoretic ones
such as unions, intersections and complements, as well as ones using concatenation or the Kleene star; even reversing all the words works. Context-free
languages are rather more fiddly to deal with: Not all of these operations
work for them.
Concatenation. This does work. We do the following. Assume that we have two grammars with terminal symbols taken from the alphabet Σ, and non-terminal symbols Ξ₁ and Ξ₂ respectively. We now take every symbol in Ξ₁ and put a subscript 1 onto it, and similarly for every symbol in Ξ₂, with the subscript 2. We have now forced those two alphabets to be disjoint, so when we form their union the number of symbols in the union is the sum of the symbols in Ξ₁ and Ξ₂. We add a new start symbol S to the set of non-terminal symbols. We now take all the production rules from the first grammar, and put subscripts of 1 onto each non-terminal symbol occurring in them, and do the same for the second grammar with the subscript 2. We add one new production rule S → S₁S₂.
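As a sketch in code (our own Python; representing a grammar as a dictionary from each non-terminal to a list of right-hand sides, each a list of symbols, with start symbol S, is an assumption):

    def concatenate(g1, g2):
        # grammar for the concatenation of the languages of g1 and g2
        def tag(g, i):
            rename = {n: n + str(i) for n in g}   # subscript i on non-terminals
            return {rename[n]: [[rename.get(s, s) for s in rhs] for rhs in rules]
                    for n, rules in g.items()}
        combined = {**tag(g1, 1), **tag(g2, 2)}
        combined["S"] = [["S1", "S2"]]            # the one new rule S → S₁S₂
        return combined

    # a grammar for {a}∗ concatenated with one for {b}∗:
    g = concatenate({"S": [["a", "S"], []]}, {"S": [["b", "S"], []]})
    print(g["S"])                                 # [['S1', 'S2']]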
Exercise 49. Use your solution to Exercise 42 (c) to produce a grammar
for the language of all words whose length is at least 6, using this procedure.
Kleene star. This looks as if it should be more complicated than concatenation, but it is not. We merely add two production rules to the grammar (if they aren't already there) for our given language, namely S → SS and S → ε. If we then wish to generate a word that is the n-fold concatenation of words in our language, where n ∈ N⁺, we merely start by applying the rule S → SS (n − 1) times, giving us n copies of S. For each one of these we can then generate the required word. If we wish to generate the empty word we can do this by applying the second new rule, S → ε.
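In the same dictionary representation as the sketch above (and assuming the start symbol is called S), this construction is tiny:

    def star(g):
        # add the two rules S → SS and S → ε to a copy of g
        out = {n: [list(rhs) for rhs in rules] for n, rules in g.items()}
        out["S"] = out["S"] + [["S", "S"], []]
        return out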
Exercise 50. Use your solution to Exercise 42 (c) to produce a grammar for
the language of all words whose length is divisible by 3, using this procedure.
Reversal. This is quite easy. Leave the two alphabets of terminal and non-terminal symbols as they are. Now take each production rule, and replace the string on the right by its reverse. So if there is a production rule of the form R → 00R1, replace it by the rule R → 1R00.
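In the same representation once more, a sketch of the reversal procedure:

    def reverse_grammar(g):
        # reverse the right-hand side of every production rule
        return {n: [rhs[::-1] for rhs in rules] for n, rules in g.items()}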
Exercise 51. Use this procedure to turn your solution for Exercise 40 into
a grammar that generates all words over {0, 1} that start with 1 and end
with 0.
Unions. This does work.
Exercise 52. Show that the union of two context-free languages over some
alphabet is context-free.
Intersections. Somewhat surprisingly, this is not an operation that works. The intersection of a context-free language with a regular one is context-free, but examining this issue in more detail goes beyond the scope of this course.
Complements. This also does not work. Just because we have a way
of generating all the words in a particular set does not mean we can do the
same for all the words not belonging to this set.
4.8
4.9 Summary
Glossary
accept
A word can be accepted, or not accepted, by an automaton. 25, 27,
44
Algorithm 1
Turns an NFA into a DFA. 31
Algorithm 2
Turns an automaton (works for NFA and DFA, but more complicated
for NFA) into a pattern describing the language recognized by the
automaton. 39
Algorithm 3
Takes a pattern and turns it into an NFA with ε-transitions such that the language recognized by the automaton is that defined by the pattern. 46
Algorithm 4
Turns an NFA with ε-transitions into an NFA without these. 44
alphabet
A set of letters. 11
ambiguous
A word can be ambiguously generated if we have a grammar that has
two parse trees for it, and if such a word exists we call the grammar
ambiguous. A language is inherently ambiguous if all the grammars
generating it are ambiguous. 65
Backus-Naur form
A particular way of writing down a grammar popular within computer science, in particular for describing the syntax of programming
languages. 68
bisimulation
A relation between the states of two automata; its existence implies
that the two automata are equivalent. 54
CFG
Rules for generating words by using rewrite rules known as a grammar,
in our case a context-free one. 60
72
concatenation
An operation on words (or letters) that takes two words and returns
one word by attaching the second word to the end of the first word.
11
context-free
A language is context-free if there is a context-free grammar generating
it. (Note that grammars can also be context-free; we do not consider other grammars on this course.) 62
context-free grammar
Same as CFG. 60
derivation
A way of demonstrating that a particular word is generated by a grammar. 58
deterministic finite automaton
Same as DFA. 24
DFA
An alternative way of defining a language, more suitable for human
consumption than a regular expression. 24
NFA
An alternative way of defining a language, often the easiest for people
to devise. 26
NFA with -transitions
An NFA which also allows transitions labelled with ε which do not
have to be matched against a letter. 43
non-deterministic finite automaton
Same as NFA. 26
non-terminal symbol
An auxiliary symbol used to describe a grammar; we want to create
words that do not contain auxiliary symbols. 60
parse tree
A parse tree shows how a string can be generated from a given grammar in a more useful way than a derivation. 64
pattern
A particular kind of string used to define languages. 15
recognize
Automata recognize languages. 25, 27, 44
regular
A language is regular if it can be defined using a pattern. 19
regular expression
Same as pattern. 15
right-linear
A grammar is right-linear if its production rules are particularly simple. The definition is given on page 63.
string
Same as word. 11
string generated by grammar
A word consisting of terminal and non-terminal symbols that we can
build by using the rules of the grammar. 60
symbol
Same as letter. 11
terminal symbol
An element of the alphabet over which we are trying to create words
using a grammar. 60
unambiguous
A grammar is unambiguous if it isn't ambiguous. 65
word
Obtained from concatenating 0 or more letters. Same as string. 11
Appendix A
In the language of mathematics
The account given so far is as informal as possible. To be completely
rigorous the language of mathematics has to be used to define the various
notions we have used. For the interested reader we here give a glimpse of
how this is done. It is possible to pass this course, and pass it with a good
mark, without knowing the content of this chapter, but those who want to
get the whole picture, and get a really good mark, should work through this
part as well.
A.1 Words and languages
Now that we have the basic concept we can define operations for words,
such as the length of a word. Note how this definition follows the recursive
definition of a word.
Definition 18. The length |s| of a word s over some alphabet Σ is defined as follows:

    |s| = 0          if s = ε
    |s| = |s′| + 1   if s = s′x
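The recursion can be mirrored directly in code; here is a sketch of our own in Python, reading a word as a string with the empty string in the role of ε.

    def length(s):
        # the length of s, following the two cases of Definition 18
        if s == "":                  # s = ε
            return 0
        return length(s[:-1]) + 1    # s = s′x

    print(length("aba"))   # 3

Python's built-in len does the same job, of course; the point is that the recursion follows the definition case by case.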
Exercise A.2. Try to come up with a non-recursive definition of the length
function. Hint: Look at the original definition of word, or at the definition
of concatenation to get an idea. Use the recursive definition of a word to
argue that your definition agrees with the original.
We have a binary operation on words, namely concatenation.
Definition 19. Given an alphabet Σ, concatenation is an operation from pairs of words to words, all over Σ, which, for words s and t over Σ, we write as s · t. It is defined as follows:

    x₁ · · · xₘ · y₁ · · · yₙ = x₁ · · · xₘy₁ · · · yₙ.
Exercise A.3. Recall the definition of an associative or commutative operation from COMP11120.
(a) Argue that the concatenation operation is associative.
(b) Show that the concatenation operation is not commutative.
Exercise A.4. Use recursion to give an alternative definition of the concatenation operation. (You may find this difficult; try to give it a go anyway.)
Exercise A.5. Show that |s · t| = |s| + |t|. Hint: You may well find it easier to use the non-recursive definition for everything.
Exercise A.6. Practice your understanding of recursion by doing the following.
(a) Using the recursive definition of a word give a recursive definition of
the following operation. It takes a word, and returns the word where every
letter is repeated twice. So ab turns into aabb, and aba into aabbaa, and aa
to aaaa.
(b) Now do the same thing for the operation that takes a word and returns
the reverse of the word, so abc becomes cba.
In order to describe words of arbitrary length concisely we have adopted notation such as a³ for aaa. This works in precisely the same way as it does for powers of numbers: By a³ we mean the word that results from applying the concatenation operation to three copies of a to obtain aaa, just as in arithmetic we use 2³ to indicate that multiplication should be applied to three copies of 2 to obtain 2 · 2 · 2. So all we do by writing aⁿ is to find a shortcut that tells us how many copies of a we require (namely n many) without having to write them all out.
Because we know both these operations to be associative we do not have to use brackets here: (2 · 2) · 2 is the same as 2 · (2 · 2) and therefore the notation 2 · 2 · 2 is unambiguous, just as is the case for a · a · a.
What should a¹ mean? Well, this is simple, it is merely the word consisting of one copy of a, that is a. The question of what a⁰ might mean is somewhat trickier: What is a word consisting of 0 copies of the letter a? A useful convention in mathematics is to use this to mean the unit for the underlying operation. Hence in arithmetic 2⁰ = 1. The unit for concatenation is the empty word and so we think of a⁰ as a way of referring to the empty word ε.
This way we obtain useful rules such as aᵐ · aⁿ = aⁿ⁺ᵐ. Similarly we have (aᵐ)ⁿ = aⁿᵐ, just as we have for exponentiation in arithmetic. However, note that (ab)ⁿ consists of n copies of ab concatenated with each other, rather than aⁿbⁿ, as it would be in arithmetic.¹
Exercise A.7. Write out in full the words 0⁵, 0³1³, (010)², (01)³0, 1⁰.
Languages are merely collections of words.
Definition 20. Let Σ be an alphabet. A language over Σ is a set of words over Σ.
As mentioned in Chapter 2, using this definition we automatically obtain set-theoretic operations: We can form the unions, intersections, complements and differences of languages in precisely the same way as we do this for other sets. Hence expressions such as L₁ ∪ L₂, L₁ ∩ L₂ and L₁ \ L₂ are immediately meaningful.
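Written as sets in a program, languages inherit these operations directly. A small sketch of our own (we approximate the infinite set of all words by listing those up to length 3):

    from itertools import product

    SIGMA = "012"
    ALL = {"".join(p) for n in range(4) for p in product(SIGMA, repeat=n)}

    L1 = {w for w in ALL if "0" in w}           # words containing a 0
    L2 = {w for w in ALL if len(w) % 2 == 0}    # words of even length

    print(sorted(L1 & L2)[:6])    # part of the intersection
    print(sorted(ALL - L1)[:6])   # part of the complement of L1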
Exercise A.8. Let Σ be the alphabet {0, 1, 2} and let
L₁ = {s | s is a word consisting of 0s and 1s only},
L₂ = {s | s is a word beginning with 0 and ending with 2}.
Calculate the following languages: L₁ ∪ L₂, L₁ ∩ L₂ and the complement of L₁ in the language of all words over Σ.
Note that we have notation to find a more compact description of the
language
{ε, 1, 11, 111, 1111, . . .}
of all words over the alphabet {1} as
{1ⁿ | n ∈ N}.
Exercise A.9. Write down the following languages using set-theoretic notation:
(a) All the words consisting of the letters a and b which contain precisely
two as and three bs.
(b) All the words consisting of the letter 1 that have even length.
(c) All the words consisting of an arbitrary number of as followed by at
least one, but possibly more, bs.
(d) All the non-empty words consisting of a and b occurring alternatingly,
beginning with an a and ending with a b.
¹ The reason this rule doesn't hold is that the concatenation operation isn't commutative (see Exercise A.3), and so we can't swap over the as and bs to change the order in which they appear.
A.2 Regular expressions
A.3 Automata
Consider again the picture
[Diagram: two states E and O; an edge labelled 0 from E to O, an edge labelled 0 from O to E, and a loop labelled 1 on each state; E is the initial state and the only accepting state.]
of an automaton that accepts precisely those words over {0, 1} that contain an even number of 0s. We have two states, which in the picture are labelled E and O, so the set of all states is {E, O}. The initial state is E, and there is only one accepting state, which is also E.
The transition function for this example is given by the following table:
    input     output
    (E, 1)    E
    (E, 0)    O
    (O, 1)    O
    (O, 0)    E
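The table translates directly into a program. A sketch of our own in Python, where the dictionary is just the table above:

    DELTA = {("E", "1"): "E", ("E", "0"): "O",
             ("O", "1"): "O", ("O", "0"): "E"}

    def accepts(word, start="E", accepting=("E",)):
        state = start
        for letter in word:
            state = DELTA[(state, letter)]
        return state in accepting

    print(accepts("0110"))   # True: two 0s, an even number
    print(accepts("110"))    # False: just one 0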
[Diagram: a non-deterministic automaton over {0, 1} with states 0, 1, 2 and 3.]
It has states {0, 1, 2, 3}, start state 0 and accepting states {3}. The
transition relation is defined as follows.
[Tick table: one row for each pair of a state in {0, 1, 2, 3} and a letter in {0, 1}, and one column for each of the states 0 to 3; a tick marks each state that the pair is related to. The row for state 1 and letter 1 contains two ticks.]
Think of the ticks as confirming that from the given state in the left hand column there is an edge labelled with the letter in the second column to the state in the top row. We can see that here we have a relation rather than a function because for the input state 1, letter 1, we find two ticks in the corresponding row.
Exercise A.17. Draw a non-deterministic automaton for the language described in Exercise 9 (d). Then describe it in the same way as the above
example.
Exercise A.18. Draw the automata with states {0, 1, 2, 3}, start state 0,
accepting states {0, 2} and the following transitions.
(a) [Tick table: one row for each pair of a state in {0, 1, 2, 3} and a letter in {a, b}, one column for each state; ticks mark the states each pair is related to.]
(b) [A second tick table of the same form with a different arrangement of ticks.]
(a) Justify to yourself the expressions for Lʲⱼᵢ and Lⁱⱼᵢ from the general ones for Lᵏⱼᵢ.
(b) Justify the equalities given at the end of the description of the general
case.
Exercise A.23.* Explain why it is safe not to draw the dump states for the two automata when constructing an automaton for the intersection of the languages recognized by the automata, as described on page 52, by forming the product of the two automata.
Exercise A.24. Exercise 34 has two parts that are mathematical, namely (a) and (e). If you have delayed answering these do so now.
Let us now turn to the question of when two automata recognize the same language. We begin by giving a proper definition of what it means for two automata to be isomorphic.
Definition 21. An automaton (Q, q₀, F, δ) over the alphabet Σ is isomorphic to an automaton (P, p₀, E, γ) over the same alphabet if and only if there exists a function f : Q → P such that
f is bijective;
f(q₀) = p₀;
q is in F if and only if f(q) is in E, for all q ∈ Q;
in case the automata are
deterministic: f(δ(q, x)) = γ(f(q), x) for all q ∈ Q and all x ∈ Σ;
non-deterministic: there is a transition labelled x from q to q′ if and only if there is a transition labelled x from f(q) to f(q′), for all q, q′ ∈ Q and all x ∈ Σ.
Exercise A.25. Convince yourself that you understand this definition! Also
convince yourself that two automata are isomorphic precisely when they have
the same picture (if the states remain unlabelled).
On page 54 we had a calculation used to turn the problem of equivalence of two automata into one of deciding whether two (different) automata recognize any words at all. You should be able to prove this, using definitions from the set-theory part of COMP11120.
Exercise A.26. For subsets S and S′ of some set T show that S = S′ if and only if S ∩ (T \ S′) = ∅ and S′ ∩ (T \ S) = ∅.
The notion of bisimulation deserves additional study and, indeed, this is
a concept that is very important in concurrency. Here we restrict ourselves
to one more (if abstract) example.
Exercise A.27. Show that there is a bisimulation between an NFA and the
result of turning it into a DFA using Algorithm 1.
Exercise 38 is somewhat mathematical in the way it demands you to
think about languages. Try to do as many of its parts as you can.
A.4 Grammars
Exercises in this section that ask you to reason mathematically are Exercises 44 and 45; they ask you to demonstrate something.
In Section 4.3 there is a method that takes an automaton and turns it
into a context-free grammar. How can we see that the words generated by
this grammar are precisely those accepted by the automaton?
If we have a word x₁x₂ · · · xₙ accepted by the automaton then we get the following for the grammar. We have S = q₀, and we may rewrite this to x₁q₁ because in the automaton q₁ = δ(q₀, x₁). Once we have reached the word x₁x₂ · · · xᵢqᵢ, where qᵢ = δ(qᵢ₋₁, xᵢ), we go to x₁x₂ · · · xᵢxᵢ₊₁qᵢ₊₁, where qᵢ₊₁ = δ(qᵢ, xᵢ₊₁). At the end we have qₙ = δ(qₙ₋₁, xₙ), and since the word is accepted, qₙ is an accepting state; the rule qₙ → ε then lets us derive the word x₁x₂ · · · xₙ itself.
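The argument can be traced in code. The sketch below (our own Python) follows an accepting run of the automaton with states S and A from Section 4.3 and prints the corresponding derivation in the grammar.

    # the transition function of the automaton with states S and A
    DELTA = {("S", "1"): "S", ("S", "0"): "A",
             ("A", "1"): "A", ("A", "0"): "S"}

    def derivation(word, start="S", accepting=("S",)):
        state, prefix, steps = start, "", [start]
        for x in word:
            state = DELTA[(state, x)]
            prefix += x
            steps.append(prefix + state)   # the word x1 . . . xi qi
        assert state in accepting          # so the rule qn → ε applies
        steps.append(prefix)               # the generated word itself
        return steps

    print(" ⇒ ".join(derivation("0110")))
    # S ⇒ 0A ⇒ 01A ⇒ 011A ⇒ 0110S ⇒ 0110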
COMP11212 Fundamentals of Computation
Exercise Sheet 1
For examples classes in Week 2
Foundational exercises
The foundational exercises prepare important concepts appearing in Chapters 3 and 4.
Exercise 1
Exercise 2
Exercise 8 (a)–(c)
COMP11212 Fundamentals of Computation
Exercise Sheet 2
For examples classes in Week 3
COMP11212 Fundamentals of Computation
Exercise Sheet 3
For examples classes in Week 4
COMP11212 Fundamentals of Computation
Exercise Sheet 4
For examples classes in Week 5
Foundational exercises
Exercise 33
COMP11212 Fundamentals of Computation
Exercise Sheet 5
For examples classes in Week 6