1 Automata
Matteo Pradella
Email: [email protected]
• Complexity theory:
how much does it cost to solve a problem through a
computer?
• Only the basics: further developments in follow-up
courses
Organization (1/3)
• Requirements :
– Basics of Computer science (CS 1)
– Elements of discrete mathematics (Algebra and Logic)
• Lesson and practice classes (all rather classical style…)
– Student-teacher interaction is quite appreciated:
• In classroom
• In my office (when possible)
• By Email (for administrative matters and to fix face-to-face meetings)
Organization (2/3)
• Operational models
(abstract machines, dynamic systems, …)
based on the concept of a state and of means (operations) to
represent its evolution (chronologically, i.e., w.r.t. “time”)
• Descriptive models
aimed at expressing properties (desired or feared) of the
modeled system, rather than its functioning as an evolution in
time through states
Examples
$ax^2 + by^2 = c$
Elements of a language
• Alphabet or vocabulary
(from a mathematical viewpoint these are synonyms):
Finite set of basic symbols
{a,b,c, …z}
{0,1}
{Do, Re, Mi, …}
ASCII char set
...
• Operations on strings:
concatenation (product): $x \cdot y$
• $A^*$ is called the
free monoid over $A$ built through “$\cdot$”
• $\varepsilon$ is the unit w.r.t. “$\cdot$”: for any $x$, $x \cdot \varepsilon = \varepsilon \cdot x = x$
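Language
Operations on languages
● $L^0 = \{\varepsilon\}$, $L^i = L^{i-1} \cdot L$
● $L^* = \bigcup_{n=0}^{\infty} L^n$
● NB: $\{\varepsilon\} \neq \emptyset$!
$\{\varepsilon\} \cdot L = L$; $\emptyset \cdot L = \emptyset$
● $^+$ = “$^*$ − $^0$”: $A^+$ = set of all non-empty strings on $A$.
$A = \{0,1\}$, $A^+ = \{0, 1, 00, 01, 10, \ldots\}$
● $L^+ = \bigcup_{n=1}^{\infty} L^n$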
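The operations on languages can be tried out on finite languages. The following is a minimal Python sketch, assuming that a language is represented as a set of strings and ε as the empty string; the names `concat`, `power` and `star_up_to` are illustrative, and `star_up_to` only approximates L* by truncating the infinite union at a chosen n.

```python
def concat(l1, l2):
    """Concatenation of two languages: every x in l1 followed by every y in l2."""
    return {x + y for x in l1 for y in l2}

def power(lang, n):
    """L^0 = {epsilon}, L^i = L^(i-1) . L"""
    result = {""}
    for _ in range(n):
        result = concat(result, lang)
    return result

def star_up_to(lang, max_n):
    """Finite approximation of L*: union of L^n for n = 0 .. max_n."""
    result = set()
    for n in range(max_n + 1):
        result |= power(lang, n)
    return result

L = {"0", "11"}
print(concat({""}, L) == L)        # {epsilon} . L = L
print(concat(set(), L) == set())   # empty language . L = empty language
print(sorted(power(L, 2)))
```

Note how the sketch keeps {ε} and the empty language apart: the first is the unit of concatenation, the second annihilates it.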
Translation
• $y = t(x)$
$t$: a translation is a function from $L_1$ to $L_2$
– $t_1$: double the “1” (1 → 11):
$t_1(0010110) = 0011011110$, …
– $t_2$: exchange a and b (a ↔ b):
$t_2(abbbaa) = baaabb$, …
– Other examples:
• File compression
• Self-correcting protocols
• Computer language compilation
• Translation Italian → English
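The two sample translations can be written directly as string functions. This is a small Python sketch; the function names `t1` and `t2` simply mirror the slide, and the implementation through `str.replace` and `str.translate` is one possible choice.

```python
def t1(x: str) -> str:
    """Double every '1' in x (1 -> 11)."""
    return x.replace("1", "11")

def t2(x: str) -> str:
    """Exchange 'a' and 'b' in x."""
    return x.translate(str.maketrans("ab", "ba"))

print(t1("0010110"))
print(t2("abbbaa"))
```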
Conclusion
• The notion of language and the basic associated operations
provide a very general and expressive means to describe any
type of systems, their properties and related problems:
– Compute the determinant of a matrix;
– Determine if a bridge will collapse under a certain load;
– ….
• Notice that, after all, in a computer any information is a string of
bits (hence a string of some language…)
Operational Models
(state machines, dynamical systems)
Graphic representation:
On Off
[Figure: state diagram with the two states On and Off and transitions labeled S and R]
Turning a light on and off, ...
A first formalization
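[Figure: finite automaton with states q0 and q1; each state has a self-loop labeled 0, and the two transitions between q0 and q1 are labeled 1]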
* a
means for all a ∈ alphabet I
Automata as language translators
y = t(x)
t: doubles the number of “1” and halves the “0” (the input must have an even number of “0”)
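[Figure: finite-state transducer with states q0 and q1; self-loops labeled 1/11 on both states, and two transitions between q0 and q1 labeled 0/ε and 0/0]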
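A transducer of this kind is easy to simulate. The sketch below is written in Python under two assumptions that the extracted figure does not settle: the move 0/ε goes from q0 to q1 and the move 0/0 goes back from q1 to q0, and q0 is the only final state (so the input must contain an even number of 0's). The function name `translate` is illustrative.

```python
# (state, input symbol) -> (next state, output string)
DELTA = {
    ("q0", "1"): ("q0", "11"),
    ("q1", "1"): ("q1", "11"),
    ("q0", "0"): ("q1", ""),    # first 0 of a pair: emit nothing (assumed direction)
    ("q1", "0"): ("q0", "0"),   # second 0 of a pair: emit a single 0
}
START, FINAL = "q0", {"q0"}     # assumption: q0 is initial and final

def translate(x):
    """Return t(x), or None if x is not accepted (t(x) undefined)."""
    state, out = START, []
    for symbol in x:
        if (state, symbol) not in DELTA:
            return None
        state, emitted = DELTA[(state, symbol)]
        out.append(emitted)
    return "".join(out) if state in FINAL else None

print(translate("1001"))
print(translate("0"))
```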
[Figure: automaton fragment with states q3, q4, q5, q6, q7, q8 and transitions labeled a and b, containing a cycle]
If, when reading a string, one goes through the cycle once, then
one can also go through it 2, 3, …, n, … times
Formally:
• If $x \in L$ and $|x| > |Q|$ then there exist a $q \in Q$ and a $w \in I^+$ such that:
$x = ywz$
$\delta^*(q, w) = q$
and then $y w^n z \in L$, $\forall n \geq 0$
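The statement can be checked on a concrete automaton. The following Python sketch assumes a two-state automaton over {0,1} in which 0 loops on each state and 1 switches between q0 and q1, with q0 taken as initial and final state (so it accepts strings with an even number of 1's); these choices, and the names `run`, `accepts` and `pump_split`, are illustrative. The code finds the first repeated state along the run and uses it to split x into y, w, z.

```python
DELTA = {("q0", "0"): "q0", ("q0", "1"): "q1",
         ("q1", "0"): "q1", ("q1", "1"): "q0"}
START, FINAL = "q0", {"q0"}   # assumption: even number of 1's

def run(x):
    """Sequence of states visited while reading x (length |x| + 1)."""
    states = [START]
    for symbol in x:
        states.append(DELTA[(states[-1], symbol)])
    return states

def accepts(x):
    return run(x)[-1] in FINAL

def pump_split(x):
    """Split x = y w z with w non-empty and delta*(q, w) = q, if a state repeats."""
    first_seen = {}
    for pos, q in enumerate(run(x)):
        if q in first_seen:
            i = first_seen[q]
            return x[:i], x[i:pos], x[pos:]
        first_seen[q] = pos
    return None

y, w, z = pump_split("0110")
print((y, w, z), [accepts(y + w * n + z) for n in range(5)])
```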
Closure properties of FA
Intersection
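[Figure: automaton A with a run q0, q1, q2, …, q9 on an input b a … a; automaton B with the parallel run p0, p1, p2, …, p9 on the same input; the product automaton <A,B> goes through the pairs <q0,p0>, <q1,p1>, <q2,p2>, …, <q9,p9>]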
Example
[Figure: two automata over {a, b, c}, one with states S0, S1 and one with states T0, T1, and their product automaton with states <S0,T0>, <S0,T1>, <S1,T0>, <S1,T1>]
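The product construction suggested by the figures can be sketched in a few lines of Python. The two automata used here are hypothetical (they are not the ones of the example): A accepts strings over {a,b} with an even number of a's, B accepts strings ending in b. An automaton is represented as a triple (transition dictionary, initial state, set of final states), which is an assumption of this sketch.

```python
def intersect(a, b):
    """Product automaton: runs A and B in parallel on the same input."""
    (da, sa, fa), (db, sb, fb) = a, b
    start = (sa, sb)
    symbols = {c for (_, c) in da} & {c for (_, c) in db}
    delta, todo, seen = {}, [start], {start}
    while todo:
        p, q = todo.pop()
        for c in symbols:
            if (p, c) in da and (q, c) in db:   # both must be able to move
                nxt = (da[(p, c)], db[(q, c)])
                delta[((p, q), c)] = nxt
                if nxt not in seen:
                    seen.add(nxt)
                    todo.append(nxt)
    finals = {(p, q) for (p, q) in seen if p in fa and q in fb}
    return delta, start, finals

def accepts(fa, x):
    delta, state, finals = fa
    for c in x:
        if (state, c) not in delta:
            return False
        state = delta[(state, c)]
    return state in finals

# Hypothetical automata: A = even number of a's, B = strings ending in b
A = ({("e", "a"): "o", ("o", "a"): "e", ("e", "b"): "e", ("o", "b"): "o"}, "e", {"e"})
B = ({("n", "a"): "n", ("n", "b"): "y", ("y", "a"): "n", ("y", "b"): "y"}, "n", {"y"})
AB = intersect(A, B)
print(accepts(AB, "aab"), accepts(AB, "ab"))
```

A pair of states is final only when both components are final, which is what makes the product accept exactly the intersection.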
Formally:
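[Figure: finite automaton with states q0 and q1; each state has a self-loop labeled 0, and the two transitions between q0 and q1 are labeled 1]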
An idea: F^ = Q - F:
Yes, it works for the automaton above, but ….
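[Figure: automaton with states q0 and q1, self-loops labeled 0 on both states, and a transition labeled 1 from q0 to q1. L: strings with exactly one ‘1’]
[Figure: the same automaton completed with an error state qE, reached from q1 on 1 and with self-loops labeled 0 and 1. L: strings with no ‘1’ or with more than one ‘1’]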
• If the analysis of the string terminates for all possible strings then it
suffices to “turn a yes into a no” (F into Q-F)
• If, for some string, the analysis of the string does not terminate (it
gets blocked or continues forever) then turning F into Q-F does not
work
• With FA the problem is easily solved …
• In general, if we are unable to provide a positive answer to a
problem (e.g., $x \in L$), this does not necessarily mean we can provide
a positive answer for the complement problem (e.g., $x \notin L$)
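For FA the fix is to complete the transition function with an error state before swapping final and non-final states. The Python sketch below uses the automaton for “exactly one ‘1’” described above, with q0 initial and q1 final; the helper names `complete` and `accepts` and the dictionary encoding are assumptions of this sketch.

```python
ALPHABET = "01"
# Partial automaton: no move from q1 on '1'
DELTA = {("q0", "0"): "q0", ("q0", "1"): "q1", ("q1", "0"): "q1"}
STATES, START, FINAL = {"q0", "q1"}, "q0", {"q1"}

def accepts(delta, finals, x):
    state = START
    for c in x:
        if (state, c) not in delta:
            return False            # the automaton gets blocked
        state = delta[(state, c)]
    return state in finals

def complete(delta, states, error="qE"):
    """Add an error state so that every (state, symbol) pair has a move."""
    total = dict(delta)
    for q in states | {error}:
        for c in ALPHABET:
            total.setdefault((q, c), error)
    return total, states | {error}

naive_finals = STATES - FINAL               # swap on the partial automaton
total, all_states = complete(DELTA, STATES)
comp_finals = all_states - FINAL            # swap on the completed automaton

print(accepts(DELTA, naive_finals, "11"))   # blocked: wrongly rejected
print(accepts(total, comp_finals, "11"))    # accepted via qE
```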
Input tape
Control device
(finite state)
Output tape
[Figure: pushdown automaton: input tape, finite-state control device (moving from state q to state p), “stack” memory holding A, B, …, Z0, and output tape]
Z0: stack bottom symbol
The move of the pushdown automaton (with a stack memory):
• Depending on :
– the symbol read on the input tape (but it could also ignore the input …)
– the symbol read on top of the stack
– the state of the control device:
• the pushdown automaton
– changes its state
– advances the scanning head (or it does not, if the input was ignored)
– replaces the symbol A read on top of the stack with a string α of
symbols (α may be empty: this amounts to a pop of A)
– (if translator) it writes a string (possibly empty) on the output tape
(advancing the writing head consequently)
• The input string x is recognized (accepted) if
– The automaton scans it completely (the scanning head reaches
the end of x), and
– Upon reaching the end of x it is in an acceptance state (just like
for the FA)
• If the automaton is also a translator, then
t(x) is the string on the output tape after x has been
accepted (if x is accepted, otherwise t(x) is undefined:
$t(x) = \bot$)
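[Figure: pushdown automaton with states q0, q1, q2, q3 and transition labels a,Z0/BZ0; a,B/AB; a,A/AA; b,A/ε; b,B/ε. Stack content after reading the a's, from the top: n − 1 A's, then B, then Z0]
The B in the stack marks the first symbol of x, to match the last symbol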
Another one:
[Figure: pushdown automaton with states q0, q1, q2, q3 and transition labels a,Z0/AZ0; a,A/AA; b,A/ε; ε,Z0/Z0; and a transducer variant with labels a,Z0/AZ0,ε; a,A/AA,ε; b,A/BA,ε; ε,A/ε,a]
Now we formalize ...
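• Pushdown automaton [transducer]: $\langle Q, I, \Gamma, \delta, q_0, Z_0, F\ [, O, \eta] \rangle$
• $Q$, $I$, $q_0$, $F$ [$O$] just like FA [FST]
• $\Gamma$: stack alphabet (disjoint from the other ones for ease of definition)
• $Z_0$: initial stack symbol (not essential, useful to simplify definitions)
• $\delta: Q \times (I \cup \{\varepsilon\}) \times \Gamma \to Q \times \Gamma^*$ ($\delta$ is partial just as in FSA)
• $\eta: Q \times (I \cup \{\varepsilon\}) \times \Gamma \to O^*$ ($\eta$ defined where $\delta$ is)
Graphical notation: $\langle p, \alpha \rangle = \delta(q, i, A)$, $w = \eta(q, i, A)$ is drawn as an edge from q to p labeled i,A/α,w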
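The definition translates almost literally into a table-driven simulator. The Python sketch below encodes δ as a dictionary for a recognizer of a^n b^n (n ≥ 1) that uses the stack symbols A and B in the spirit of the earlier example; the exact transition table, the state names, and the use of the single character "Z" for Z0 are assumptions of this sketch, and ε-moves are left out because this particular automaton does not need them.

```python
Z0 = "Z"   # single-character stand-in for the initial stack symbol Z0

# (state, input symbol, stack top) -> (new state, string replacing the top)
# The leftmost character of the replacing string becomes the new top.
DELTA = {
    ("q0", "a", Z0):  ("q1", "B" + Z0),   # B marks the first a
    ("q1", "a", "B"): ("q1", "AB"),
    ("q1", "a", "A"): ("q1", "AA"),
    ("q1", "b", "A"): ("q2", ""),         # empty string = pop
    ("q1", "b", "B"): ("q3", ""),
    ("q2", "b", "A"): ("q2", ""),
    ("q2", "b", "B"): ("q3", ""),         # B matches the last b
}
FINAL = {"q3"}

def accepts(x):
    state, stack, pos = "q0", Z0, 0       # stack as a string, top at index 0
    while pos < len(x) and stack:
        move = DELTA.get((state, x[pos], stack[0]))
        if move is None:
            return False                  # blocked before the end of x
        state, replacement = move
        stack = replacement + stack[1:]
        pos += 1
    # accepted iff x was scanned completely and the last state is final
    return pos == len(x) and state in FINAL

print([w for w in ["ab", "aabb", "aab", "abb", "ba", ""] if accepts(w)])
```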
[Figure annotations: input string completely scanned; last state final; stack content immaterial]
Some consequences
• LP = class of languages accepted by (deterministic) pushdown automata
• LP is not closed under union nor intersection
• Why?
• Function δ must be made complete (as with FA) with an error state.
Pay attention to the nondeterminism caused by ε-moves!
• The ε-moves can cause cycles, so that the automaton never reaches the end
of the string: the string is not accepted, but it is not accepted by the
automaton with F^ = Q-F either.
• There exists however a construction that associates to every
automaton an equivalent loop-free automaton
• Not finished yet: what if there is a sequence of e-moves at the end of
the scanning with some states in F and other ones not in F?
For now we consider the “k-tape” version, slightly different from the
(even simpler) original model. This choice will be explained later.
k-tape TM
[Figure: k-tape TM: input tape, control device (moving from state q to state p), first memory tape, second memory tape, …, output tape]
• States and alphabets as with other automata (input, output, control device, memory
alphabet)
• For historical reasons and due to some “mathematical technicalities” the tapes are
represented as infinite cell sequences [0,1,2, …] rather than finite strings. There
exists however a special symbol “blank” ( “ ”, or “barred b” or “_” ) and it is
assumed that every tape contains only a finite number of non-blank cells.
– The equivalence of the two ways of representing the tape content is obvious.
• A special symbol Z0 in the first cell of all memory tapes (as in the PDA)
• Scanning (input) and output heads are also as in previous models
As a consequence:
$\langle \delta, [\eta] \rangle : Q \times I \times \Gamma^k \to Q \times \Gamma^k \times \{R, L, S\}^{k+1}\ [\times\, O \times \{R, S\}]$
(partial!)
Graphical notation:
Why do we not lose generality having O rather than O* for the output?
• Initial configuration:
• Z0 followed by all blanks in the memory tapes
• [output tape all blank]
• Heads in the 0-th position on every tape
• Initial state of the control device q0
• Input string x starting from the 0-th cell of the input tape,
followed by all blanks
• Final configurations:
– Accepting states $F \subseteq Q$
– For ease of notation, convention:
$\langle \delta, [\eta] \rangle (q, \ldots) = \bot$ $\forall q \in F$: no further action from a final state
– The TM stops when $\langle \delta, [\eta] \rangle (q, \ldots) = \bot$
– Input string x is accepted iff:
• After a finite number of moves the TM stops (hence it is in a
configuration where $\langle \delta, [\eta] \rangle (q, \ldots) = \bot$)
• When it stops, its state $q \in F$
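Effect of a move on a tape with content $\alpha A \beta$ (head on A, which is replaced by A'), depending on the head movement N:
• N = S: $\alpha' = \alpha$, $\beta' = A'\beta$
• N = R: $\alpha' = \alpha A'$, if $\beta = \varepsilon$ then $\beta'$ = _ else $\beta' = \beta$
• N = L: $\alpha = \alpha' B$, $\beta' = B A' \beta$.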
Language of a TM:
• Note: as a consequence
– x is not accepted if:
• The TM stops in a state $q \notin F$; or
• The TM never stops (NB: this case is very important)
– There is a similarity with the PDA (a non-loop-free PDA might also not accept
because of a “non stopping run”), however … does there exist a loop-free TM?
Some examples
[Figure: input tape containing a a a b b b c c c; memory tape containing Z0 M M M; control device (CD)]
[Figure: fragment of the TM state diagram, with states q3 and qF and transition labels a,_/M,<R,R>; b,M/M,<R,L>; c,M/M,<R,R>; _,_/_,<S,S>]
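A machine of this kind, with one memory tape, can be simulated in a few lines. The Python sketch below recognizes a^n b^n c^n (n ≥ 1) by writing one M per a and then sweeping over the M's once for the b's and once for the c's. The transition table is a reconstruction consistent with the labels in the figure, not necessarily the one on the slide; the state names q0, q1, q2 and the step bound are assumptions.

```python
from collections import defaultdict

BLANK, Z0 = "_", "Z0"
STEP = {"R": 1, "L": -1, "S": 0}

# (state, input symbol, memory symbol) ->
#     (new state, symbol written on memory, input head move, memory head move)
DELTA = {
    ("q0", "a", Z0):      ("q1", Z0,    "S", "R"),
    ("q1", "a", BLANK):   ("q1", "M",   "R", "R"),   # one M per a
    ("q1", "b", BLANK):   ("q2", BLANK, "S", "L"),
    ("q2", "b", "M"):     ("q2", "M",   "R", "L"),   # one M per b, going left
    ("q2", "c", Z0):      ("q3", Z0,    "S", "R"),
    ("q3", "c", "M"):     ("q3", "M",   "R", "R"),   # one M per c, going right
    ("q3", BLANK, BLANK): ("qF", BLANK, "S", "S"),
}
FINAL = {"qF"}

def accepts(x, max_steps=10_000):
    inp = defaultdict(lambda: BLANK, enumerate(x))   # tape = cells 0, 1, 2, ...
    mem = defaultdict(lambda: BLANK, {0: Z0})
    state, i, m = "q0", 0, 0
    for _ in range(max_steps):
        move = DELTA.get((state, inp[i], mem[m]))
        if move is None:                 # no move defined: the TM stops
            return state in FINAL
        state, written, move_in, move_mem = move
        mem[m] = written
        i += STEP[move_in]
        m += STEP[move_mem]
    raise RuntimeError("step bound exceeded")  # bound only protects the simulation

print(accepts("aaabbbccc"), accepts("aabbbcc"))
```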
Computing the successor of a number n coded with decimal digits
M uses two memory tapes, T1 and T2:
• M copies all digits of n on T1, to the right of Z0, while it moves the head of T2
by the same number of positions.
• M scans the digits of T1 from right to left. It writes on T2 from right to
left, changing the digits as needed (the rightmost 9’s become 0’s, the first digit ≠ 9
is increased by one, then all other ones to its left are unchanged, …)
• M copies T2 on the output tape.
Purpose of the example: to show that a TM can also compute functions, not only recognize languages
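The three phases can be mirrored step by step in ordinary code. This Python sketch keeps T1 and T2 as lists and follows the right-to-left scan described above; it is not a transition table for M. The handling of an input made only of 9's (a leading 1 is added) is an assumption, since the slide leaves that case to the “…”.

```python
def successor(digits: str) -> str:
    # Phase 1: copy the input on T1 (Z0 is left implicit); T2 gets the same length
    t1 = list(digits)
    t2 = ["_"] * len(t1)

    # Phase 2: scan T1 from right to left, writing T2 from right to left
    pos = len(t1) - 1
    while pos >= 0 and t1[pos] == "9":      # rightmost 9's become 0's
        t2[pos] = "0"
        pos -= 1
    if pos >= 0:
        t2[pos] = str(int(t1[pos]) + 1)     # first digit != 9 is increased by one
        pos -= 1
        while pos >= 0:                     # all digits to its left are unchanged
            t2[pos] = t1[pos]
            pos -= 1
    else:
        t2.insert(0, "1")                   # assumption: input was all 9's

    # Phase 3: copy T2 on the output tape
    return "".join(t2)

print(successor("134999"))
```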
[Figure: input tape containing 1 3 4 9 9 9; T1 containing Z0 1 3 4 9 9 9; T2 containing Z0; empty output tape; control device (CD)]
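[Figure: fragment of the TM state diagram with states q0, q1, q2 and the two phases “copy input on T1” and “change rightmost 9’s into 0’s”; transition labels (d is any decimal digit): d,Z0,Z0/_,<Z0,Z0>,<S,R,R,S>; d,_,_/_,<d,_>,<R,R,R,S>; _,_,_/_,<_,_>,<S,L,L,S>; _,9,_/_,<9,0>,<S,L,L,S>]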
• Bidimensional tape TM
The state encodes the
symbols in the k + 1
cells under the heads
Let us compare
with
case
– x=y then S1
– z > y +3 then S2
– …. then …
endcase
Another form of nondeterminism which is usually “hidden”:
blind search
?
[Figure: nondeterministic FA with states q1, q2, q3, q4, q5, q6 and transitions labeled a and b]
$\delta^*(q, \varepsilon) = \{q\}$
$x \in L \iff \delta^*(q_0, x) \cap F \neq \emptyset$
Among the various possible runs (with the same input) of the ND FA it
suffices that one of them (that is, there exists one that) succeeds and
accepts the input string ($\delta^*(q_0, x) \cap F \neq \emptyset$)
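The acceptance condition can be computed by tracking the set of all states reachable on the input read so far. The Python sketch below does so for a hypothetical NFA (not the one in the figure) that accepts the strings over {a,b} ending in “ab”; the inductive step of δ*, taking the union of δ over the current set, is the standard one and is spelled out here as an assumption, since only the base case is given above.

```python
# Hypothetical NFA: on 'a' in q0 it may stay in q0 or guess that "ab" ends the string
DELTA = {("q0", "a"): {"q0", "q1"},
         ("q0", "b"): {"q0"},
         ("q1", "b"): {"q2"}}
START, FINAL = "q0", {"q2"}

def delta_star(q, x):
    """Set of states reachable from q by reading x; delta_star(q, "") = {q}."""
    states = {q}
    for symbol in x:
        states = {p for s in states for p in DELTA.get((s, symbol), set())}
    return states

def accepts(x):
    # x is accepted iff at least one run ends in a final state
    return bool(delta_star(START, x) & FINAL)

print(sorted(delta_star("q0", "a")), accepts("aab"), accepts("aba"))
```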
nondeterministic PDA (NPDA)
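[Figure: state q1 with a move i,A/α to q2 and an ε-move ε,A/β to q3]
[Figure: state q1 with two moves on the same input and stack symbol, i,A/α to q2 and i,A/β to q3]
$\delta: Q \times (I \cup \{\varepsilon\}) \times \Gamma \to \wp_F(Q \times \Gamma^*)$
[Figure: nondeterministic PDA obtained from two PDAs (one with states q0', q1', q2', q3', the other with states q0'', q1, q2, q4) by adding a new initial state q0 with ε-moves ε,Z0/Z0 to each of them; other labels: a,Z0/Z0A; a,A/AA; b,A/ε; b,A/A; ε,Z0/ε]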
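[Figure: nondeterministic FA with states q1, q2, q3, q4, q5, q6 and transitions labeled a and b]
$\delta_D(q_D, i) = \bigcup_{q_N \in q_D} \delta_N(q_N, i)$
– $q_N \in q_D$ -- recall that $\delta_N(q_N, i)$ is a set
– $q_{0D} = \{q_{0N}\}$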
• It is true that for every NFA one can find (and build) an equivalent
deterministic one
• This does not mean that using NFA is useless:
– It can be easier to “design” a NFA and then obtain from it automatically an
equivalent deterministic one, just to skip the (painful) job of building the
deterministic one ourselves from the beginning (we will soon see an application of this idea)
– For instance, from a NFA with 5 states one can obtain, in the worst case, one with
$2^5$ states!
• Consider NFA and FA for languages $L_1 = (a,b)^* a (a,b)$ (i.e., strings over {a,b}
with ‘a’ as the symbol before the last one) and $L_2 = (a,b)^* a (a,b)^4$ (i.e., ‘a’ as the
fourth symbol before the last...)
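The construction and the growth in the number of states can both be observed on the family (a,b)*a(a,b)^k, of which L1 (k = 1) and L2 (k = 4) are instances. The Python sketch below builds an NFA with k + 2 states and determinizes it, generating only the reachable subsets; the names `nfa_for` and `determinize`, the integer state names, and the choice of final states for the deterministic automaton (subsets containing a final state of the NFA) are assumptions of this sketch.

```python
def nfa_for(k):
    """NFA for (a|b)* a (a|b)^k with states 0 .. k+1; state k+1 is final."""
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for s in range(1, k + 1):
        delta[(s, "a")] = {s + 1}
        delta[(s, "b")] = {s + 1}
    return delta, 0, {k + 1}

def determinize(delta, start, finals, alphabet="ab"):
    """Subset construction restricted to the reachable subsets."""
    start_d = frozenset({start})
    delta_d, todo, seen = {}, [start_d], {start_d}
    while todo:
        qd = todo.pop()
        for c in alphabet:
            # delta_D(q_D, i) = union of delta_N(q_N, i) for q_N in q_D
            nxt = frozenset(p for q in qd for p in delta.get((q, c), ()))
            delta_d[(qd, c)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    finals_d = {qd for qd in seen if qd & finals}
    return delta_d, start_d, finals_d, seen

for k in (1, 4):
    states = determinize(*nfa_for(k))[3]
    print(k, len(states))
```

For this family the deterministic automaton has to remember which of the last k + 1 symbols were a's, which is why the number of reachable subsets doubles with every increment of k.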
Nondeterministic TM
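[Figure: computation tree of a nondeterministic TM, with configurations c11, c12, c13, …, c31, c32, …, ckj, …, cim; some branches are unfinished (possibly nonterminating) computations]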
• x is accepted by a ND TM iff there exists a computation that terminates in an
accepting state
• Can a deterministic TM establish whether a “sister” ND TM accepts x, that is,
accept x if and only if the ND TM accepts?
• This amounts to building the computation tree of the NDTM and “visiting” it to
establish whether it contains a path that finishes in an accepting state
• This is an (almost) trivial, well-known problem of tree visit, for which there are
classical algorithms
• The problem is therefore reduced to implementing an algorithm for visiting
trees through TM’s: a boring, but certainly feasible exercise … but beware the
above “almost” ...
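One plausible reading of the “almost” is that the tree may contain infinite branches, so a depth-first visit could follow a nonterminating computation forever and miss an accepting one elsewhere. The Python sketch below visits the tree breadth-first instead. The toy machine, the configuration encoding as (state, counter) pairs, and the `max_configs` bound (which only keeps the demonstration finite) are all assumptions of this sketch.

```python
from collections import deque

def nd_accepts(initial, successors, is_accepting, max_configs=10_000):
    """Breadth-first visit of the computation tree.

    Returns True if an accepting configuration is found, False if the whole
    tree was visited without finding one, None if the bound was reached.
    """
    queue = deque([initial])
    visited = 0
    while queue and visited < max_configs:
        config = queue.popleft()
        visited += 1
        if is_accepting(config):
            return True
        queue.extend(successors(config))
    return None if queue else False

# Toy nondeterministic machine: one branch loops forever, the other accepts
def successors(config):
    state, k = config
    if state == "start":
        return [("loop", 0), ("work", 0)]    # nondeterministic choice
    if state == "loop":
        return [("loop", k + 1)]             # nonterminating computation
    if state == "work":
        return [("work", k + 1)] if k < 3 else [("accept", k)]
    return []

print(nd_accepts(("start", 0), successors, lambda c: c[0] == "accept"))
```

A depth-first visit that always explored the first choice would descend into the `loop` branch and never return, while the breadth-first visit reaches the accepting configuration after finitely many steps.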