Automata
Chapter 4
Discrete Mathematics II
(Materials in this chapter are drawn from:
- Peter Linz. An Introduction to Formal Languages and Automata (5th Ed.), Jones & Bartlett Learning, 2011.
- John E. Hopcroft, Rajeev Motwani and Jeffrey D. Ullman. Introduction to Automata Theory, Languages, and Computation (3rd Ed.), Prentice Hall, 2006.
- Antal Iványi. Algorithms of Informatics, Kempelen Farkas Hallgatói Információs Központ, 2011.)
Nguyen An Khuong, Huynh Tuong Nguyen, Bui Hoai Thang
Faculty of Computer Science and Engineering
University of Technology, VNU-HCM

Contents
1 Motivation
2 Alphabets, words and languages
3 Regular expressions or rational expressions
4 Non-deterministic finite automata
5 Deterministic finite automata
6 Recognized languages
7 Determinisation

Introduction
Standard states of a process in an operating system
[Figure: state diagram. Labelled circles are states (Waiting, Blocked, Running); arrows are transitions, labelled "Resource" or "CPU".]

Why study automata theory?
Automata are a useful model for many important kinds of software and hardware:
1 designing and checking the behaviour of digital circuits;
2 the lexical analyser of a typical compiler, i.e. the compiler component that breaks the input text into logical units;
3 scanning large bodies of text, such as collections of Web pages, to find occurrences of words, phrases or other patterns;
4 verifying practical systems of all types that have a finite number of distinct states, such as communication protocols or protocols for the secure exchange of information.

Alphabets, symbols
Definition
An alphabet $\Sigma$ (bảng chữ cái) is a finite, non-empty set of symbols (or characters).
For example:
$\Sigma = \{a, b\}$
The binary alphabet: $\Sigma = \{0, 1\}$
The set of all lower-case letters: $\Sigma = \{a, b, \dots, z\}$
The set of all ASCII characters.
Remark
In practice, $\Sigma$ is almost always the set of all available characters (lower-case letters, capital letters, digits, symbols and special characters such as space or newline).
But nothing prevents us from imagining other sets.

Strings (words)
Definition
A string (or word) $u$ (chuỗi/từ) over $\Sigma$ is a finite, possibly empty, sequence of symbols (or characters) of $\Sigma$.
The empty string is denoted by $\varepsilon$.
The length of a string $u$, denoted by $|u|$, is the number of characters it contains.
The set of all strings over $\Sigma$ is denoted by $\Sigma^*$.
A language $L$ over $\Sigma$ is a subset of $\Sigma^*$.
Remark
The aim is to analyse a string of $\Sigma^*$ in order to decide whether or not it belongs to $L$.

Example
Let $\Sigma = \{0, 1\}$.
$\varepsilon$ is the string of length 0.
0 and 1 are the strings of length 1.
00, 01, 10 and 11 are the strings of length 2.
$\emptyset$ is a language over $\Sigma$. It is called the empty language.
$\Sigma^*$ is a language over $\Sigma$. It is called the universal language.
$\{\varepsilon\}$ is a language over $\Sigma$.
$\{0, 00, 001\}$ is also a language over $\Sigma$.
The set of strings that contain an odd number of 0s is a language over $\Sigma$.
The set of strings that contain as many 1s as 0s is a language over $\Sigma$.
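
The example above can be tried out with a few lines of Python. This is only a sketch: the helper names (`strings_up_to`, `odd_zeros`, `balanced`) are invented for this illustration, and since $\Sigma^*$ is infinite only the strings up to a chosen length are enumerated. A language is represented here by its membership test.

```python
from itertools import product

SIGMA = ("0", "1")  # the alphabet of the example

def strings_up_to(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)  # n = 0 gives the empty string

def odd_zeros(u):
    """Membership test: u contains an odd number of 0s."""
    return u.count("0") % 2 == 1

def balanced(u):
    """Membership test: u contains as many 1s as 0s."""
    return u.count("0") == u.count("1")

words = list(strings_up_to(SIGMA, 2))
print(words)                               # ['', '0', '1', '00', '01', '10', '11']
print([u for u in words if odd_zeros(u)])  # ['0', '01', '10']
print([u for u in words if balanced(u)])   # ['', '01', '10']
```

The empty Python string `""` plays the role of $\varepsilon$, and `len(u)` is $|u|$.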

String concatenation
Intuitively, the concatenation of the two strings 01 and 10 is 0110.
Concatenating the empty string $\varepsilon$ and the string 110 gives the string 110.
Definition
String concatenation is a map (application) from $\Sigma^* \times \Sigma^*$ to $\Sigma^*$.
The concatenation of two strings $u$ and $v$ in $\Sigma^*$ is the string $u.v$.

Languages
Specifying languages
A language can be specified in several ways:
a) by enumeration of its words, for example:
$L_1 = \{\varepsilon, 0, 1\}$,
$L_2 = \{a, aa, aaa, ab, ba\}$,
$L_3 = \{\varepsilon, ab, aabb, aaabbb, aaaabbbb, \dots\}$,
b) by a property that all words of the language have and no other word has, for example:
$L_4 = \{a^n b^n \mid n = 0, 1, 2, \dots\}$,
$L_5 = \{u u^{-1} \mid u \in \Sigma^*\}$, where $u^{-1}$ denotes the reversal of $u$,
$L_6 = \{u \in \{a, b\}^* \mid n_a(u) = n_b(u)\}$, where $n_a(u)$ denotes the number of letters $a$ in the word $u$.
c) by its grammar, for example:
Let $G = (N, T, P, S)$ where
$N = \{S\}$, $T = \{a, b\}$, $P = \{S \to aSb,\ S \to ab\}$,
i.e. $L(G) = \{a^n b^n \mid n \geq 1\}$, since
$S \Rightarrow aSb \Rightarrow a^2Sb^2 \Rightarrow \dots \Rightarrow a^{n-1}Sb^{n-1} \Rightarrow a^n b^n$ (the last step applies $S \to ab$).
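
The derivation in item c) can be replayed mechanically. The sketch below is only an illustration for this particular grammar $G$ (the function name `derive` is made up here): it applies $S \to aSb$ exactly $n - 1$ times and then $S \to ab$ once.

```python
def derive(n):
    """Return the sentential forms of a derivation of a^n b^n (n >= 1) in G."""
    if n < 1:
        raise ValueError("G generates a^n b^n only for n >= 1")
    form = "S"
    steps = [form]
    for _ in range(n - 1):
        form = form.replace("S", "aSb")  # apply the rule S -> aSb
        steps.append(form)
    form = form.replace("S", "ab")       # finish with the rule S -> ab
    steps.append(form)
    return steps

print(" => ".join(derive(3)))  # S => aSb => aaSbb => aaabbb
```

Each sentential form contains a single $S$, so `str.replace` rewrites exactly one non-terminal per step.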

Operations on languages
$L$, $L_1$, $L_2$ are languages over $\Sigma$.
union
$L_1 \cup L_2 = \{u \mid u \in L_1 \text{ or } u \in L_2\}$,
intersection
$L_1 \cap L_2 = \{u \mid u \in L_1 \text{ and } u \in L_2\}$,
difference
$L_1 \setminus L_2 = \{u \mid u \in L_1 \text{ and } u \notin L_2\}$,
complement
$\overline{L} = \Sigma^* \setminus L$,
multiplication
$L_1 L_2 = \{uv \mid u \in L_1,\ v \in L_2\}$,
power
$L^0 = \{\varepsilon\}$,
$L^n = L^{n-1} L$, if $n \geq 1$,
iteration or star operation
$L^* = \bigcup_{i=0}^{\infty} L^i = L^0 \cup L \cup L^2 \cup \dots \cup L^i \cup \dots$
We will also use the notation $L^+$:
$L^+ = \bigcup_{i=1}^{\infty} L^i = L \cup L^2 \cup \dots \cup L^i \cup \dots$
The union, product and iteration are called regular operations.
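
For finite languages, these operations map directly onto Python sets of strings. The following is a minimal sketch, not a general implementation: `concat`, `power` and `star_up_to` are names chosen for this illustration, and because $L^*$ is usually infinite, the star is truncated to words of bounded length.

```python
def concat(L1, L2):
    """Product L1 L2 = {uv | u in L1, v in L2}."""
    return {u + v for u in L1 for v in L2}

def power(L, n):
    """L^n, starting from L^0 = {epsilon}."""
    result = {""}
    for _ in range(n):
        result = concat(result, L)
    return result

def star_up_to(L, max_len):
    """The words of L* whose length is at most max_len."""
    result = {""}
    frontier = {""}
    while frontier:
        # extend the newest words by one more factor from L, keep the short new ones
        frontier = {w for w in concat(frontier, L) if len(w) <= max_len} - result
        result |= frontier
    return result

L1 = {"", "0", "1"}
L2 = {"0", "00", "001"}
print(sorted(L1 | L2))               # union:        ['', '0', '00', '001', '1']
print(sorted(L1 & L2))               # intersection: ['0']
print(sorted(L1 - L2))               # difference:   ['', '1']
print(sorted(power({"ab", "b"}, 2))) # ['abab', 'abb', 'bab', 'bb']
print(sorted(star_up_to({"ab", "b"}, 3), key=lambda w: (len(w), w)))
# ['', 'b', 'ab', 'bb', 'abb', 'bab', 'bbb']
```

Union, intersection and difference are the built-in set operators `|`, `&` and `-`; only the product, power and star need helper functions.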

Example
Let $\Sigma = \{a, b, c\}$ and $L = \{ab, aa, b, ca, bac\}$.
$L^2 = \{u.v \mid u, v \in L\}$ consists of the following strings:
abab, abaa, abb, abca, abbac,
aaab, aaaa, aab, aaca, aabac,
bab, baa, bb, bca, bbac,
caab, caaa, cab, caca, cabac,
bacab, bacaa, bacb, bacca, bacbac.
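
The list above can be double-checked with a set comprehension. This short sketch simply recomputes $L^2$ for the language of the example; the variable names are arbitrary.

```python
L = {"ab", "aa", "b", "ca", "bac"}

# L^2 = {uv | u in L, v in L}
L_squared = {u + v for u in L for v in L}

print(len(L_squared))  # 25: the 5 x 5 products are pairwise distinct
print(sorted(L_squared, key=lambda w: (len(w), w)))
```

In general $|L^2| \leq |L|^2$, with equality only when no two products coincide, as is the case here.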

Exercise
Let $\Sigma = \{a, b, c\}$.
Give at least 3 strings for each of the following languages:
1) all strings with exactly one a.
2) all strings of even length.
3) all strings in which the number of occurrences of b is divisible by 3.
4) all strings ending with a.
5) all strings not ending with a.
6) all non-empty strings not ending with a.
7) all strings with at least one a.
8) all strings with at most one a.
9) all strings without any a.
10) all strings containing at least one a and in which the first occurrence of a is not followed by a c.

Exercise
Let $\Sigma = \{a, b, c\}$ and $L = \{ab, aa, b, ca, bac\}$.
Which of the following strings are in $L^*$?
1) $aaa = a^3$,
2) $abaabaaabaa = aba^2ba^3ba^2$,
3) $bbb$,
4) $aab$,
5) $cc$,
6) $aaaabaaaa = a^4ba^4$,
7) $cabbbbaaaaaaaaab = cab^4a^9b$,
8) $baaaaabaaaab = ba^5ba^4b$,
9) $baaaaabaac = ba^5ba^2c$,
10) $baca$.
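
Membership in $L^*$ for a finite $L$ can be decided by checking, prefix by prefix, whether the prefix splits into factors from $L$. The sketch below is one possible way to do this (a small dynamic programme; the name `in_star` is made up here), tried on strings other than those of the exercise.

```python
def in_star(w, L):
    """Decide whether w belongs to L* for a finite language L."""
    # reachable[i] is True when the prefix w[:i] is a product of words of L
    reachable = [True] + [False] * len(w)
    for i in range(1, len(w) + 1):
        reachable[i] = any(
            reachable[i - len(x)] and w[i - len(x):i] == x
            for x in L
            if 0 < len(x) <= i
        )
    return reachable[len(w)]

L = {"ab", "aa", "b", "ca", "bac"}
print(in_star("", L))     # True: the empty string is in L^0
print(in_star("cab", L))  # True: ca . b
print(in_star("ac", L))   # False: no word of L ends a factorisation of "ac"
```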

Regular expressions
Regular expressions (biểu thức chính quy)
They allow a language to be specified by a string consisting of letters and $\varepsilon$, parentheses ( ), and the operator symbols $+$, $.$ and $*$. This string can be empty, denoted $\emptyset$.
Regular operations on languages
union: $\cup$ or $+$
product or concatenation
transitive closure: $*$
Example
$(a + b)^*$ represents all the strings over the alphabet $\Sigma = \{a, b\}$.
$a^*(ba^*)^*$ represents the same language.
$(a + b)^*aab$ represents all strings ending with aab.
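
These examples can be tested with Python's `re` module, keeping in mind that its syntax differs from the notation of the slides: union is written `|` instead of `+`, and `re.fullmatch` is needed so that the whole string must match. The sketch below compares the first two expressions on all strings over {a, b} up to length 6, which supports (but of course does not prove) that they represent the same language.

```python
import re
from itertools import product

def matches(pattern, u):
    """True when the whole string u is described by the pattern."""
    return re.fullmatch(pattern, u) is not None

ALL_STRINGS = r"(a|b)*"     # (a + b)*
SAME_LANG   = r"a*(ba*)*"   # a*(ba*)*
ENDS_AAB    = r"(a|b)*aab"  # (a + b)*aab

words = ["".join(p) for n in range(7) for p in product("ab", repeat=n)]
print(all(matches(ALL_STRINGS, u) and matches(SAME_LANG, u) for u in words))  # True
print(matches(ENDS_AAB, "baab"))  # True
print(matches(ENDS_AAB, "aaba"))  # False
```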

Regular expressions
$\emptyset$ is a regular expression representing the empty language.
$\varepsilon$ is a regular expression representing the language $\{\varepsilon\}$.
If $a \in \Sigma$, then $a$ is a regular expression representing the language $\{a\}$.
If $x$, $y$ are regular expressions representing the languages $X$ and $Y$ respectively, then $(x + y)$, $(xy)$ and $x^*$ are regular expressions representing the languages $X \cup Y$, $XY$ and $X^*$ respectively.
The following equivalences hold:
$x + y \equiv y + x$
$(x + y) + z \equiv x + (y + z)$
$(xy)z \equiv x(yz)$
$(x + y)z \equiv xz + yz$
$x(y + z) \equiv xy + xz$
$(x + y)^* \equiv (x^* + y)^* \equiv (x + y^*)^* \equiv (x^* + y^*)^*$
$(x + y)^* \equiv (x^* y^*)^*$
$(x^*)^* \equiv x^*$
$x^* x \equiv x x^*$
$x x^* + \varepsilon \equiv x^*$

Regular expressions
Kleene's theorem
A language $L \subseteq \Sigma^*$ is regular if and only if there exists a regular expression over $\Sigma$ representing the language $L$.

Exercise
Let $\Sigma = \{a, b, c\}$.
Give at least 3 words for each language represented by the following regular expressions:
1) $a^* + b^*$,
2) $a^*b + b^*a$,
3) $b(ca + ac)(aa)^* + a^*(a + b)$,
4) $(a^*b + b^*a)^*$.
Example
$a^*b = \{b, ab, a^2b, a^3b, \dots, aaa \dots ab, \dots\}$

Exercise
Let $\Sigma = \{a, b, c\}$ and $L = \{ab, aa, b, ca, bac\}$.
Which of the languages represented by the following regular expressions are included in $L^*$?
1) $a + b$,
2) $b^*$,
3) $aab + cab^*ac$,
4) $b(ca + ac)(aa)^* + a^*(a + b)$,
5) $(aaaabaaa)^2c$,
6) $b^+ac^*$ (where $b^+ = bb^*$),
7) $(b + c)ab + (ba(c + ab^2 + a^3 + a^4 + b)^*)^*$.
Define a (simple) regular expression representing the language $L^*$.

Finite automata
Finite automata (ôtômat hữu hạn)
The aim is to represent a process or system.
A finite automaton consists of states (including one initial state and one or several final/accepting states) and transitions (events).
The number of states must be finite.
[Figure: automaton with states q0 and q1; the surviving transition label is "a, b".]
Regular expression
$b^*(a + b)$

Exercise
Let $\Sigma = \{a, b\}$.
Which of the strings
1) $a^3b$,
2) $aba^2b$,
3) $a^4b^2ab^3a$,
4) $a^4ba^4$,
5) $ab^4a^9b$,
6) $ba^5ba^4b$,
7) $ba^5b^2$,
8) bab a
are accepted by the following finite automaton?
[Figure: automaton with states q0, q1 and q2; the surviving transition labels are b, b and a.]

Exercise
Give a regular expression for the following finite automaton.
[Figure: automaton with states q0, q1 and q2; the surviving transition labels are b, b, a and b.]
And for this one.
[Figure: automaton with states q0 and q1; the surviving transition labels are b, "a, b" and a.]

Nondeterministic finite automata
Definition
A nondeterministic finite automaton (NFA, ôtômat hữu hạn phi đơn định) is mathematically represented by a 5-tuple $(Q, \Sigma, q_0, \delta, F)$ where
$Q$ is a finite set of states.
$\Sigma$ is the alphabet of the automaton.
$q_0 \in Q$ is the initial state.
$\delta : Q \times \Sigma \to 2^Q$ is the transition function; it maps a state and a symbol to the set of possible next states.
$F \subseteq Q$ is the set of final/accepting states.
Remark
On a given event, a state may go to one or more states.
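
A common way to run an NFA on a word is to track the set of all states it could be in after each symbol. The sketch below assumes a dictionary encoding of $\delta$ (missing entries mean "no transition") and uses a hypothetical three-state NFA, not taken from the slides, that accepts the strings over {a, b} ending with ab.

```python
def nfa_accepts(delta, start, finals, word):
    """Simulate an NFA by following every possible transition at once."""
    current = {start}
    for symbol in word:
        # union of delta(q, symbol) over all states q the NFA might be in
        current = {r for q in current for r in delta.get((q, symbol), set())}
    return bool(current & finals)

# Hypothetical example: from q0, reading a either stays in q0 or guesses
# that the final "ab" starts here and moves to q1.
DELTA = {
    ("q0", "a"): {"q0", "q1"},
    ("q0", "b"): {"q0"},
    ("q1", "b"): {"q2"},
}
FINALS = {"q2"}

print(nfa_accepts(DELTA, "q0", FINALS, "aab"))  # True
print(nfa_accepts(DELTA, "q0", FINALS, "aba"))  # False
```

The word is accepted when at least one of the reachable states is final, which matches the remark that one event may lead to several states.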

NFA with empty symbol
Other definition of NFA
A finite automaton whose transitions are labelled by a character $x$ (in $\Sigma$) or by the empty character $\varepsilon$.
[Figure: automaton with states q0, q1 and q2; the surviving transition labels are "a, b" and "a".]
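
With $\varepsilon$-transitions, the simulation has to add every state reachable without reading a symbol (the $\varepsilon$-closure). The following sketch assumes that $\varepsilon$-transitions are stored under the empty-string label, a convention chosen only for this illustration, and uses a hypothetical two-state automaton for $a^*b^*$ rather than the one in the figure.

```python
EPS = ""  # label used in this sketch for epsilon-transitions

def eps_closure(delta, states):
    """All states reachable from `states` using only epsilon-transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def accepts(delta, start, finals, word):
    current = eps_closure(delta, {start})
    for symbol in word:
        moved = {r for q in current for r in delta.get((q, symbol), set())}
        current = eps_closure(delta, moved)
    return bool(current & finals)

# Hypothetical automaton for a*b*: q0 loops on a, q1 loops on b,
# and an epsilon-transition leads from q0 to q1.
DELTA = {("q0", "a"): {"q0"}, ("q0", EPS): {"q1"}, ("q1", "b"): {"q1"}}

print(accepts(DELTA, "q0", {"q1"}, "aab"))  # True
print(accepts(DELTA, "q0", {"q1"}, "ba"))   # False
```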

Exercise
Consider the set of strings on {a, b} in which every aa is followed immediately by b.
For example aab, aaba, aabaabbaab are in the language, but aaab and aabaa are not.
Construct an accepting NFA.

Exercise
Let $\Sigma = \{a, b, c\}$.
Construct an accepting finite automaton for each language represented by the following regular expressions.
$E_1 = a + b$,
$E_2 = b^*$,
$E_3 = aab + cab^*ac$,
$E_4 = b(ca + ac)(aa)^* + a^*(a + b)$,
$E_5 = (aaaabaaa)^2c$,
$E_6 = b^+ac^*$ (where $b^+ = bb^*$),
$E_7 = (b + c)ab + (ba(c + ab^2 + a^3 + a^4 + b)^*)^*$,
$E_8 = [a(b + c)^*abc]^*$.

Deterministic finite automata
Definition
A deterministic finite automaton (DFA, ôtômat hữu hạn đơn định) is given by a 5-tuple $(Q, \Sigma, q_0, \delta, F)$ with
$Q$ a finite set of states.
$\Sigma$ is the input alphabet of the automaton.
$q_0 \in Q$ is the initial state.
$\delta : Q \times \Sigma \to Q$ is a transition function.
$F \subseteq Q$ is the set of final/accepting states.
Condition
The transition function $\delta$ is a map (application): each pair (state, symbol) has exactly one next state.

Example
Let $\Sigma = \{a, b\}$.
Below is a deterministic and complete automaton that recognizes the set of strings containing an odd number of a.
[Figure: state diagram with states q0 and q1; each state loops on b, and a moves from q0 to q1 and from q1 to q0.]
$Q = \{q_0, q_1\}$,
$\delta(q_0, a) = q_1$, $\delta(q_0, b) = q_0$, $\delta(q_1, a) = q_0$, $\delta(q_1, b) = q_1$,
$F = \{q_1\}$.
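
The automaton of this example translates almost literally into a lookup table. In the sketch below the dictionary `DELTA` copies the four values of $\delta$ given above; the function name `dfa_accepts` and the dictionary encoding are choices made for this illustration.

```python
# delta of the example, stored as (state, symbol) -> next state
DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q0",
    ("q1", "a"): "q0", ("q1", "b"): "q1",
}
START = "q0"
FINALS = {"q1"}

def dfa_accepts(word):
    """Read the word symbol by symbol; accept if the last state is final."""
    state = START
    for symbol in word:
        state = DELTA[(state, symbol)]  # exactly one next state: deterministic
    return state in FINALS

print(dfa_accepts("bab"))   # True: one a
print(dfa_accepts("abab"))  # False: two a
```

Because the automaton is complete, every lookup succeeds for words over {a, b}; a symbol outside the alphabet would raise a `KeyError` in this sketch.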

Configurations and executions
Let $A = (Q, \Sigma, q_0, \delta, F)$.
A configuration (cấu hình) of the automaton $A$ is a pair $(q, u)$ where $q \in Q$ and $u \in \Sigma^*$.
We define the derivation relation $\to$ between configurations:
$(q, a.u) \to (q', u)$ iff $\delta(q, a) = q'$
An execution (thực thi) of the automaton $A$ is a sequence of configurations
$(q_0, u_0) \dots (q_n, u_n)$ such that
$(q_i, u_i) \to (q_{i+1}, u_{i+1})$, for $i = 0, 1, \dots, n - 1$.
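
An execution can be produced by applying the derivation relation until the remaining input is empty. The sketch below does this for a DFA stored as a dictionary; as an assumed example it reuses the odd-number-of-a automaton ($\delta(q_0, a) = q_1$, $\delta(q_0, b) = q_0$, $\delta(q_1, a) = q_0$, $\delta(q_1, b) = q_1$), and the function name `execution` is made up here.

```python
def execution(delta, start, word):
    """Return the list of configurations (q, u) of the run on `word`."""
    q, u = start, word
    configs = [(q, u)]
    while u:
        # one derivation step: (q, a.u') -> (delta(q, a), u')
        q, u = delta[(q, u[0])], u[1:]
        configs.append((q, u))
    return configs

DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q0",
    ("q1", "a"): "q0", ("q1", "b"): "q1",
}

print(execution(DELTA, "q0", "aba"))
# [('q0', 'aba'), ('q1', 'ba'), ('q1', 'a'), ('q0', '')]
```

The word is accepted when the last configuration is $(q, \varepsilon)$ with $q \in F$; here the run ends in q0, which is not final, so aba is rejected.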

Exercise
Let $\Sigma = \{0, 1\}$.
Give an automaton that accepts all words in which the number of 0s is a multiple of 3.
Give an execution of this automaton on 1101010.
Let $\Sigma = \{a, b\}$.
Give an automaton that accepts all strings containing 2 characters a.
Give an execution of this automaton on aabb, ababb and bbaa.

Recognized languages
Definition
A language $L$ over an alphabet $\Sigma$, defined as a subset of $\Sigma^*$, is recognized if there exists a finite automaton accepting exactly the strings of $L$.
Proposition
If $L_1$ and $L_2$ are two recognized languages, then
$L_1 \cup L_2$ and $L_1 \cap L_2$ are also recognized;
$L_1.L_2$ and $L_1^*$ are also recognized.
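
For union and intersection, one standard argument is the product construction: run two complete DFAs in parallel on pairs of states. The sketch below illustrates it under assumptions of this example only: a DFA is encoded as a tuple (states, delta, start, finals), and the two sample automata are "odd number of a" and "contains the sub-string ab" over {a, b}. Concatenation and star need other constructions and are not covered here.

```python
from itertools import product

def product_dfa(A, B, alphabet, combine):
    """Run A and B in parallel; `combine` turns the two verdicts into one."""
    (QA, dA, sA, FA), (QB, dB, sB, FB) = A, B
    states = set(product(QA, QB))
    delta = {((p, q), x): (dA[(p, x)], dB[(q, x)])
             for (p, q) in states for x in alphabet}
    finals = {(p, q) for (p, q) in states if combine(p in FA, q in FB)}
    return states, delta, (sA, sB), finals

def accepts(A, word):
    _, delta, state, finals = A
    for x in word:
        state = delta[(state, x)]
    return state in finals

ODD_A = ({"q0", "q1"},
         {("q0", "a"): "q1", ("q0", "b"): "q0",
          ("q1", "a"): "q0", ("q1", "b"): "q1"},
         "q0", {"q1"})
HAS_AB = ({"p0", "p1", "p2"},
          {("p0", "a"): "p1", ("p0", "b"): "p0",
           ("p1", "a"): "p1", ("p1", "b"): "p2",
           ("p2", "a"): "p2", ("p2", "b"): "p2"},
          "p0", {"p2"})

both = product_dfa(ODD_A, HAS_AB, "ab", lambda x, y: x and y)   # intersection
either = product_dfa(ODD_A, HAS_AB, "ab", lambda x, y: x or y)  # union

print(accepts(both, "ab"), accepts(both, "aab"))    # True False
print(accepts(either, "aab"), accepts(either, "bb"))  # True False
```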

Example
Sub-string ab
Construct a DFA that recognizes the language of strings over the alphabet {a, b} containing the sub-string ab.
Regular expression
$(a + b)^* ab (a + b)^*$
Transition table
     | q0 | q1 | q2
  a  | q1 | q1 | q2
  b  | q0 | q2 | q2
Automaton
[Figure: state diagram with states q0, q1 and q2, following the transition table: q0 loops on b and moves to q1 on a; q1 loops on a and moves to q2 on b; q2 loops on a, b.]
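
One way to gain confidence in such a table is to compare it with the regular expression on all short strings. The sketch below does this with Python's `re` module (where union is written `|`). It assumes q0 is the start state and q2 the only accepting state, which is what the language requires but is not spelled out in the extracted table.

```python
import re
from itertools import product

# Transition table of the example: TABLE[state][symbol] -> next state
TABLE = {
    "q0": {"a": "q1", "b": "q0"},
    "q1": {"a": "q1", "b": "q2"},
    "q2": {"a": "q2", "b": "q2"},
}

def accepts(word, start="q0", finals=("q2",)):  # start/finals are assumptions
    state = start
    for x in word:
        state = TABLE[state][x]
    return state in finals

PATTERN = re.compile(r"(a|b)*ab(a|b)*")  # (a + b)* ab (a + b)*

words = ("".join(p) for n in range(9) for p in product("ab", repeat=n))
mismatches = [u for u in words if accepts(u) != bool(PATTERN.fullmatch(u))]
print(mismatches)  # []
```

An empty list means the DFA and the regular expression agree on every string of length at most 8; it is a test, not a proof of equivalence.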

Example
Build a DFA that recognizes the language of strings over the alphabet {a, b} with an even number of a and an even number of b.
Transition table
     | q0 | q1 | q2 | q3
  a  | q2 | q3 | q0 | q1
  b  | q1 | q0 | q3 | q2
Automaton
[Figure: state diagram of the four-state automaton described by the transition table, with markers for the start state and the final state(s).]
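
Reading the table, each state appears to record a pair of parities: with q0 as start state (an assumption, since the legend of the figure did not survive extraction), state $q_k$ with $k = 2 \cdot (n_a \bmod 2) + (n_b \bmod 2)$ is reached after reading $n_a$ letters a and $n_b$ letters b. The sketch below checks this reading of the table by brute force on all strings up to length 6.

```python
from itertools import product

# Rows of the transition table; position k in each list is the column of qk
TABLE = {
    "a": ["q2", "q3", "q0", "q1"],
    "b": ["q1", "q0", "q3", "q2"],
}

def run(word, start="q0"):  # the start state q0 is an assumption
    """Return the state reached after reading `word`."""
    state = start
    for x in word:
        state = TABLE[x][int(state[1])]
    return state

def expected(word):
    """State predicted by the parity interpretation."""
    return f"q{2 * (word.count('a') % 2) + word.count('b') % 2}"

words = ["".join(p) for n in range(7) for p in product("ab", repeat=n)]
print(all(run(u) == expected(u) for u in words))  # True
```

Under this reading, q0 is the state "even number of a, even number of b", so it is the natural accepting state for the language of the example.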

Equivalent automata
Are the following two DFAs equivalent?
[Figure: two automata, one with states q0, q1, q2 and one with states p0, p1, p2, p3; the surviving transition labels are a, b and a.]

Equivalent automata
Are the following two DFAs equivalent?
[Figure: two automata, one with states q0, q1, q2 and one with states p0, p1, p2, p3; the only surviving transition label is a.]

From NFA to DFA
Given an NFA
[Figure: NFA with states 0, 1 and 2 and transitions labelled a and b.]
Transition table
     | {0} | {1}    | {0, 2}
  a  | {1} | {0, 2} | {1}
  b  | {0} | {1}    | {0, 2}
Corresponding DFA
[Figure: DFA with states {0}, {1} and {0, 2}, following the transition table.]
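
The table above has the shape produced by the subset construction: each DFA state is a set of NFA states. The NFA figure itself did not survive extraction, so the sketch below uses a hypothetical NFA chosen only because it is consistent with the table (in particular, the transitions of state 2 are a guess). The function name `determinise` is also an assumption of this illustration.

```python
from collections import deque

def determinise(delta, start, alphabet):
    """Subset construction: build the DFA table reachable from {start}."""
    table = {}
    queue = deque([frozenset({start})])
    while queue:
        S = queue.popleft()
        if S in table:
            continue
        table[S] = {}
        for x in alphabet:
            # the DFA successor is the union of the NFA successors
            T = frozenset(r for q in S for r in delta.get((q, x), ()))
            table[S][x] = T
            if T not in table:
                queue.append(T)
    return table

# Hypothetical NFA, consistent with the transition table of the slide
NFA = {
    (0, "a"): {1}, (0, "b"): {0},
    (1, "a"): {0, 2}, (1, "b"): {1},
    (2, "b"): {2},
}

for S, row in determinise(NFA, 0, "ab").items():
    print(sorted(S), {x: sorted(T) for x, T in row.items()})
# [0] {'a': [1], 'b': [0]}
# [1] {'a': [0, 2], 'b': [1]}
# [0, 2] {'a': [1], 'b': [0, 2]}
```

A subset is accepting in the DFA when it contains at least one accepting state of the NFA; the sketch leaves this out because the accepting states of the original NFA are not recoverable from the slide.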

Exercise
Let $\Sigma = \{a, b, c\}$.
Determine the DFAs which correspond to the following NFAs:
[Figure: NFAs whose surviving transition labels are "b, c", "b, c", "a, $\varepsilon$", "c, $\varepsilon$", "b, c" and "b, $\varepsilon$".]

Exercise
Let $\Sigma = \{a, b, c\}$.
Determine finite automata, not necessarily deterministic, recognizing the following languages:
$L_1 = \{a, ab, ca, cab, acc\}$,
$L_2$ = {set of words with an even number of a},
$L_3$ = {set of words containing ab and ending with b}.
Then, determine the corresponding DFAs.

Exercise
Let $\Sigma = \{a, b, c\}$.
Construct accepting DFAs for the languages represented by the following regular expressions.
$E_1 = a + b$,
$E_2 = b^*$,
$E_3 = aab + cab^*ac$,
$E_4 = b(ca + ac)(aa)^* + a^*(a + b)$,
$E_5 = (aaaabaaa)^2c$,
$E_6 = b^+ac^*$ (where $b^+ = bb^*$),
$E_7 = (b + c)ab + (ba(c + ab^2 + a^3 + a^4 + b)^*)^*$,
$E_8 = [a(b + c)^*abc]^*$.