Chapter 1 Intro To The Theory of Computation 2016
Strings and Languages
• Symbol: any thing like a, b, c, 0, 1, …
• Alphabet Σ: It is defined as a finite set of symbols.
• Example: Roman alphabet {A,..,Z, a, b, ...... z}.
• “Binary Alphabet” {0, 1} is pertinent to the theory of computation.
• String: A “string” over an alphabet is a finite sequence of symbols
from that alphabet, which is usually written next to one another and
not separated by commas.
• (i) If Σa = {0,1} then 001001 is a string over Σa.
• (ii) If Σb = {a, b, .., z} then axyrpqstcd is a string over Σb.
• Length of String: is its length as a sequence. It is the number
of symbols in the string.
• The length of a string w is written as |w|.
• Example: |10011| = 5
• Empty String: The string of zero length is called the “empty
string”.
• This is denoted by ε or λ.
• The empty string plays the role of 0 in a number system.
Strings
• If w is a string, then w^n stands for the string
obtained by repeating w n times.
• As a special case, we define w^0 = λ, for all w.
• If Σ is an alphabet, then we use Σ* to denote the
set of strings obtained by concatenating zero
or more symbols from Σ.
• The set Σ* always contains λ.
• To exclude the empty string, we define Σ+ = Σ* - {λ}.
• Σ^0 = {ε}, where ε is the empty string (common to all alphabets).
• While Σ is finite by assumption, Σ* and Σ+ are always
infinite since there is no limit on the length of the
strings in these sets.
Reverse: the reverse of a string w = w1w2..wn is the string written in the opposite order, wnwn-1..w1.
Substring: z is a substring of w if z appears consecutively within w.
As an example, ‘deck’ is a substring of ‘abcdeckabcjkl’.
Concatenation: Assume a string x of length m and string y of length n,
the concatenation of x and y is written xy, which is the string obtained by
appending y to the end of x, as in x1x2..xm y1 y2.. yn .
To concatenate a string with itself many times we use the “superscript”
notation: x^k is the string x repeated k times.
Suffix: If w = xv for some x, then v is a suffix of w.
Example: Let us take a string w = 0110. For the particular string, λ, 0, 10,
110, and 0110 are suffixes of the string 0110. For a string of length n, there
are n + 1 number of suffixes.
Proper suffix: For a string, any suffix of the string other than the string
itself is called as the proper suffix of the string. Example: For the string w
= 0110, the proper suffixes are λ, 0, 10, and 110.
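The suffix definitions above can be checked with a short sketch. The function names `suffixes` and `proper_suffixes` are illustrative only, and the empty Python string stands in for λ.

```python
def suffixes(w):
    """All suffixes of w, from w itself down to the empty string."""
    return [w[i:] for i in range(len(w) + 1)]

def proper_suffixes(w):
    """All suffixes of w except w itself."""
    return [s for s in suffixes(w) if s != w]

w = "0110"
print(suffixes(w))         # one suffix per starting position, so len(w) + 1 of them
print(proper_suffixes(w))
```

Each starting position 0..n yields one suffix of a different length, which is why a string of length n has n + 1 suffixes.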
e.g. Σ = {0,1}
L1 = set of all strings of length 2 = {00, 01, 10, 11}
L2 = set of all strings of length 3 = {000, 001, 010, 011, 100, 101, 110, 111}
L3 = set of all strings that begin with 0 = {0, 00, 01, 011, 000, 0101, …}
L1 and L2 are finite and L3 is infinite.
Power of Σ: Σ = {0,1}
Σ^0 = set of all strings of length 0: Σ^0 = {ε}
Σ^1 = set of all strings of length 1: Σ^1 = {0, 1}
Σ^2 = set of all strings of length 2: Σ^2 = {00, 01, 10, 11}
Σ^3 = set of all strings of length 3: Σ^3 = {000, 001, 010, 011, 100, 101, 110, 111}
….
Σ^n = set of all strings of length n
Cardinality: number of elements in a set. |Σ^n| = 2^n for Σ = {0,1}; in particular |Σ^0| = 1.
Σ* = {ε} ∪ {0, 1} ∪ {00, 01, 10, 11} ∪ ...
= set of all possible strings of all lengths over {0,1}
It is infinite.
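As a quick check of the powers of an alphabet, the following sketch enumerates the set of strings of length n over {0,1} and confirms the 2^n count. The helper name `power` is an assumption made for this illustration.

```python
from itertools import product

def power(sigma, n):
    """Return the set of all strings of length n over the alphabet sigma."""
    return {"".join(p) for p in product(sorted(sigma), repeat=n)}

sigma = {"0", "1"}
for n in range(4):
    strings = sorted(power(sigma, n))
    print(n, len(strings), strings)   # the count equals len(sigma) ** n
```

The set of all strings over the alphabet cannot be listed this way because it is the union over every n, but any finite prefix of that union can be.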
Formal Language
• A formal language is a set of words, i.e. finite strings of letters, or
symbols.
• The inventory/list from which these letters are taken is called the
alphabet over which the language is defined.
• A formal language is often defined by means of a formal grammar.
Formal languages are a purely syntactical notion, so there is not
necessarily any meaning associated with them.
Formal Definition
A formal language L over an alphabet Σ is just a subset of Σ*, that
is, a set of words over that alphabet. For example, three sample
languages over the same alphabet Σ = { a, b }:
L1 = {aa, aaa}
L2 = {aba, aab}
L3 = {ab, ba, aabb, abab, . . . , aaabbb, . . . }
In computer science and mathematics, which do not deal with
natural languages, the adjective "formal" is usually omitted as
redundant.
Example1
The following rules define a formal language L over the alphabet Σ=
{0,1,2,3,4,5,6,7,8,9,+,=}:
• Every non empty string that does not contain + or = and does not
start with 0 is in L.
• The string 0 is in L.
• A string containing = is in L if and only if there is exactly one =, and
it separates two strings in L.
• A string containing + is in L if and only if every + in the string
separates two valid strings in L.
• No string is in L other than those implied by the previous rules.
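The rules of Example1 can be turned into a membership test. This is a sketch of one reading of the rules, not a definitive implementation: the function name `in_L` is made up here, and the = rule is applied before the + rule.

```python
ALPHABET = set("0123456789+=")

def in_L(s):
    """Decide membership in the language L of Example1 (one reading of the rules)."""
    if s == "" or not set(s) <= ALPHABET:
        return False
    if "=" in s:
        # Exactly one '=', separating two strings that are themselves in L.
        if s.count("=") != 1:
            return False
        left, right = s.split("=")
        return in_L(left) and in_L(right)
    if "+" in s:
        # Every '+' must separate valid strings, so every piece must be in L.
        return all(in_L(part) for part in s.split("+"))
    # Only digits remain: either the string 0, or no leading 0.
    return s == "0" or s[0] != "0"

for s in ["23+4=555", "0", "=234=+", "012", "1+=2"]:
    print(repr(s), in_L(s))
```

The test is purely syntactic: 23+4=555 is accepted although the equation is arithmetically false, which matches the remark that a formal language carries no meaning by itself.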
Automata
• An automaton is an abstract model of a
digital computer and the computational
problems that can be solved using these
machines.
• abstract 'mathematical' machines or systems
• It has a mechanism for reading input
• the input is a string over a given alphabet, written
on an input file, which the automaton can read but
not change
• The input file is divided into cells, each of which can
hold one symbol.
Figure 1.1 schematic representation of a general automaton.
• The input mechanism can read the input file from left
to right, one symbol at a time
• The automaton can produce output of some form.
• It may have a temporary storage device,
• consisting of an unlimited number of cells,
• each capable of holding a single symbol from an alphabet
The automaton can read and change the contents
of the storage cells.
• the automaton has a control unit, which can be in
any one of a finite number of internal states, and
which can change state in some defined manner.
Formal Definitions
Automaton
An automaton is represented formally by the 5-tuple ⟨Q, Σ, δ, q0, F⟩, where:
• Q is a finite set of states.
• Σ is a finite set of symbols, called the alphabet of the automaton.
• δ is the transition function, that is, δ: Q ×Σ→ Q.
• q0 is the start state, that is, the state which the automaton is in
when no input has been processed yet, where q0∈Q.
• F is a set of states of Q (i.e. F⊆Q) called accept states.
Automata: Deterministic finite automata (DFA)
Recognizable language: regular languages
States of the FA
An FA has the following states
• Initial state
• Final states
• Non-final states: all states except the final states
• Hang-states: states not included into Q,
and after reaching these states FA sits in
idle situation. These have no outgoing
edge. These states are generally denoted
by . For example consider a FA shown
above
Definition of a Finite Automaton
1. Finite set of states, typically Q.
2. Alphabet of input symbols, typically Σ.
3. One state is the start/initial state, typically q0. // q0 ∈ Q
4. Zero or more final/accepting states; the set is typically F. // F ⊆ Q
5. A transition function, typically δ. This function
• Takes a state and input symbol as arguments.
• Returns a state.
• One “rule” would be written δ(q, a) = p, where q and p are states, and a is an input symbol.
• Intuitively: if the FA is in state q, and input a is received, then the FA goes to state p (note: q = p is OK).
6. A FA is represented as the five-tuple: A = (Q, Σ, δ, q0, F).
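The five components can be written down directly as a data structure. The sketch below is one possible encoding, with assumptions called out: the class name `FiniteAutomaton`, the dictionary encoding of δ, and the two-state example automaton are all hypothetical and not taken from the slides.

```python
from dataclasses import dataclass

@dataclass
class FiniteAutomaton:
    states: set      # Q
    alphabet: set    # Σ
    delta: dict      # δ, stored as {(state, symbol): state}
    start: str       # q0
    accepting: set   # F

    def is_well_formed(self):
        """Check q0 ∈ Q, F ⊆ Q, and that δ is a total function Q × Σ → Q."""
        total = all((q, a) in self.delta
                    for q in self.states for a in self.alphabet)
        closed = all(p in self.states for p in self.delta.values())
        return (self.start in self.states
                and self.accepting <= self.states
                and total and closed)

# Hypothetical example: q1 means "the last symbol read was a 1".
A = FiniteAutomaton(
    states={"q0", "q1"},
    alphabet={"0", "1"},
    delta={("q0", "0"): "q0", ("q0", "1"): "q1",
           ("q1", "0"): "q0", ("q1", "1"): "q1"},
    start="q0",
    accepting={"q1"},
)
print(A.is_well_formed())
```

Storing δ as a dictionary keyed by (state, symbol) pairs mirrors the rule notation δ(q, a) = p, one entry per rule.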
Definition of Computation
• Let M = (Q, Σ, δ, q0, F) be a finite automaton and let w = w1w2…wn be a string where each wi is a member of alphabet Σ.
• M accepts w if a sequence of states r0r1…rn in Q exists with three conditions:
1. r0 = q0
2. δ(ri, wi+1) = ri+1 for i = 0, …, n-1
3. rn ∈ F
We say that M recognizes language A if A = {w | M accepts w}.
In other words, the language is all of those strings that are accepted
by the finite automaton.
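The three acceptance conditions translate almost line by line into code. In this sketch the automaton (it tracks whether the number of a's read so far is even or odd) and the names `state_sequence` and `accepts` are assumptions chosen for illustration.

```python
# Hypothetical DFA over {a, b}: the state records the parity of the a's read so far.
DELTA = {
    ("even", "a"): "odd",  ("even", "b"): "even",
    ("odd", "a"): "even",  ("odd", "b"): "odd",
}
START = "even"
ACCEPTING = {"even"}

def state_sequence(w):
    """Build r0, r1, ..., rn with r0 = q0 and r(i+1) = δ(ri, w(i+1))."""
    r = [START]                      # condition 1: r0 = q0
    for symbol in w:
        r.append(DELTA[(r[-1], symbol)])   # condition 2
    return r

def accepts(w):
    """Condition 3: the last state rn must be in F."""
    return state_sequence(w)[-1] in ACCEPTING

print(state_sequence("abab"), accepts("abab"))
print(state_sequence("ab"), accepts("ab"))
```

Because δ is a function, the sequence r0…rn is unique for a DFA, so "a sequence exists" reduces to computing that one sequence.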
Some Applications of Finite Automata
– Software for designing and checking the behavior of digital circuits
– Lexical analyzer of a typical compiler
– Software for scanning large bodies of text (e.g., web pages) for pattern finding
– Software for verifying systems of all types that have a finite number of states (e.g., stock market transaction, communication/network protocol)
Finite Automata
• FA with output: Moore Machine, Mealy Machine
• FA without output: DFA, NFA, ε-NFA
• FA without outputs
• DFAs and NFAs both describe regular languages
– Deterministic (DFA) – There is a fixed number of states and we can only be in one state at a time. Each move (transition from one state to another) is uniquely determined by the current configuration.
– Nondeterministic (NFA) – There is a fixed number of states but we can be in multiple states at one time.
• While NFA’s are more expressive than DFA’s, we will see that adding nondeterminism does not let us define any language that cannot be defined by a DFA.
• One way to think of this is we might write a program using an NFA, but then when it is “compiled” we turn the NFA into an equivalent DFA.
Deterministic Finite Automaton (DFA)
A DFA is represented by a quintuple (5-tuple) M = (Q, Σ, δ, q0, F):
Q is the set of states (finite)
Σ is the alphabet (finite), λ ∉ Σ
δ : Q × Σ → Q is the transition function
q0 ∈ Q is the start state
F ⊆ Q is the set of accept states
Let w1, ... , wn ∈ Σ and w = w1... wn ∈ Σ*.
Then M accepts w if there are r0, r1, ..., rn ∈ Q such that r0 = q0,
δ(ri, wi+1) = ri+1 for i = 0, ..., n-1, and rn ∈ F.
• The input mechanism can move only from left to right and reads exactly one symbol on each step.
• The transitions from one internal state to another are governed by the transition function δ.
• If δ(q0, a) = q1, then if the DFA is in state q0 and the current input symbol is a, the DFA will go into state q1.
[Figure: example DFA state diagram with states q0, q1, q2, q3 over {0, 1}, labeling the states, the start state (q0), the accept states (F) and the transitions]
Alphabet Σ = {a, b}
[Figure: DFA with states q0, q1, q2, q3, q4 and q5; q5 loops on a, b]
Initial Configuration
[Figure: the head is at the start of the input tape, which holds the input string; the automaton is in the initial state q0]
Scanning the Input
[Figures: the automaton reads the input one symbol at a time, changing state at each step]
Input finished
[Figure: the automaton stops in q4: accept]
Last state determines the outcome
A Rejection Case
[Figures: another input string is scanned one symbol at a time on the same automaton]
Input finished
[Figure: the automaton stops in q5: reject]
Tape is empty (λ)
[Figure: the same automaton with an empty input tape]
This automaton accepts only one string.
Language Accepted: L = {abba}
[Figure: the DFA with states q0 to q5]
Another Example
L = {λ, ab, abba}
[Figure: DFA with states q0 to q5 in which three states are marked “Accept”]
Empty Tape (λ)
Input Finished
[Figure: the automaton is still in its initial state: accept]
[Figure: another DFA example with states q0, q1, q2 and transitions on a, b]
Another Example
Alphabet: Σ = {1}
[Figure: DFA with two states q0 and q1 joined by transitions labeled 1]
Language Accepted:
EVEN = {x : x ∈ Σ* and x is even}
Set of States Q
Example: Q = {q0, q1, q2, q3, q4, q5}
Input Alphabet Σ
Example: Σ = {a, b}
Initial State q0
Set of Accepting States F ⊆ Q
Example: F = {q4}
[Figure: the DFA with states q0 to q5, used in each of these examples]
Transition Function δ : Q × Σ → Q
δ(q, x) = q′
Describes the result of a transition from state q with symbol x
Example: δ(q0, a) = q1
δ(q0, b) = q5
δ(q2, b) = q3
[Figure: the DFA with states q0 to q5]
Transition Table for δ (rows: states, columns: symbols)
state   a    b
q0      q1   q5
q1      q5   q2
q2      q5   q3
q3      q4   q5
q4      q5   q5
q5      q5   q5
Extended Transition Function δ*
δ* : Q × Σ* → Q
δ*(q, w) = q′
Describes the resulting state after scanning string w from state q
Example: δ*(q0, ab) = q2
δ*(q0, abbbaa) = q5
δ*(q1, bba) = q4
[Figure: the DFA with states q0 to q5]
Special case: for any state q,
δ*(q, λ) = q
In general: δ*(q, w) = q′ implies that there is a walk of transitions from q to q′ labeled w = σ1σ2...σk; states may be repeated.
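The extended transition function can be computed recursively from the transition table of the abba automaton. The sketch below encodes that table; the names `ROWS`, `delta` and `delta_star` are illustrative, and the empty Python string plays the role of λ.

```python
# Transition table of the abba DFA: state -> (next state on a, next state on b)
ROWS = {
    "q0": ("q1", "q5"),
    "q1": ("q5", "q2"),
    "q2": ("q5", "q3"),
    "q3": ("q4", "q5"),
    "q4": ("q5", "q5"),
    "q5": ("q5", "q5"),
}
ACCEPTING = {"q4"}

def delta(q, x):
    """One-step transition function; x must be 'a' or 'b'."""
    return ROWS[q]["ab".index(x)]

def delta_star(q, w):
    """Extended transition function: the state reached from q after scanning w."""
    if w == "":
        return q                                  # special case: empty input
    return delta(delta_star(q, w[:-1]), w[-1])    # scan the prefix, then the last symbol

print(delta_star("q0", "ab"))
print(delta_star("q0", "abbbaa"))
print(delta_star("q1", "bba"))
print(delta_star("q0", "abba") in ACCEPTING)
```

With this function, a string w is accepted exactly when the state reached from q0 on w belongs to F.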
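More DFA Examples
Σ = {a, b}
[Figure: a DFA with the single state q0 looping on a, b] L(M) = { }, the empty language
[Figure: a DFA with the single state q0 looping on a, b] L(M) = Σ*, all strings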
Σ = {a, b}
[Figure: DFA with states q0 and q1; q1 loops on a, b]
L(M) = {λ}
Language of the empty string
Σ = {a, b}
L(M) = { all strings with prefix ab }
[Figure: DFA with states q0, q1, q2 (accept) and q3, with loops on a, b]
L(M) = { all binary strings containing substring 001 }
[Figure: DFA over {0, 1}]
L(M) = { all binary strings without substring 001 }
[Figure: DFA over {0, 1}]
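A DFA for the substring 001 can be sketched as follows. The state names and transitions are an assumption (a standard construction in which each state records how much of 001 has just been seen), not a transcription of the figures. The second language uses the same transitions with the accepting and non-accepting states swapped.

```python
# States record the longest suffix of the input that is a prefix of 001;
# "" means nothing useful has been seen, and "001" is a sink once the substring occurs.
DELTA = {
    ("", "0"): "0",       ("", "1"): "",
    ("0", "0"): "00",     ("0", "1"): "",
    ("00", "0"): "00",    ("00", "1"): "001",
    ("001", "0"): "001",  ("001", "1"): "001",
}

def final_state(w):
    """Run the DFA on the binary string w from the start state."""
    state = ""
    for bit in w:
        state = DELTA[(state, bit)]
    return state

def contains_001(w):
    return final_state(w) == "001"      # F = {"001"}

def avoids_001(w):
    return final_state(w) != "001"      # complement: F = {"", "0", "00"}

for w in ["", "0100", "110010", "0001"]:
    print(repr(w), contains_001(w), avoids_001(w))
```

Swapping accepting and non-accepting states complements the language of a DFA because every input reaches exactly one state.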
L(M) = { awa : w ∈ {a, b}* }
[Figure: DFA with states q0, q1, q2, q3]
2.3 Nondeterministic Finite Automata
• An NFA (nondeterministic finite automaton) is able to be in several states at once.
– In a DFA, we can only take a transition to a single deterministic state.
– In an NFA we can accept multiple destination states for the same input.
– You can think of this as the NFA “guesses” something about its input and will always follow the proper path if that can lead to an accepting state.
– Another way to think of the NFA is that it travels all possible paths, and so it remains in many states at once. As long as at least one of the paths results in an accepting state, the NFA accepts the input.
• NFA is a useful tool
– More expressive than a DFA.
– BUT we will see that it is not more powerful!
An NFA
• Similar to a DFA
1. Finite set of states, typically Q.
2. Alphabet of input symbols, typically Σ.
3. One state is the start/initial state, typically q0.
4. Zero or more final/accepting states; the set is typically F.
5. A transition function, typically δ. This function:
Takes a state and input symbol as arguments.
Returns a set of states instead of a single state, as a DFA does.
6. A FA is represented as the five-tuple: A = (Q, Σ, δ, q0, F). Here, F is a set of accepting states.
Differences between a DFA and an NFA:
1. DFA: for each symbol of the alphabet, there is exactly one transition out of each state. NFA: there is no need to specify how the NFA reacts to every symbol.
2. DFA: cannot use empty-string transitions. NFA: can use empty-string transitions.
3. DFA: the next possible state is distinctly set. NFA: each pair of state and input symbol can have many possible next states.
4. DFA: more difficult to construct. NFA: easier to construct.
5. DFA: rejects the string if it terminates in a state that is not an accepting state. NFA: rejects the string if all branches die or refuse the string.
6. DFA: less time is needed to execute an input string. NFA: more time is needed to execute an input string.
[Figure: example NFA with states q1, q2, q3 and transitions labeled a, b and ε]
• There are three major differences
– In the NFA:
• The range of δ is the power set 2^Q: its value is not a single element of Q, but a subset of it.
• λ is allowed as the second argument of δ: the NFA can make a transition without consuming an input symbol.
• The set δ(qi, a) may be empty: there is no transition defined.
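All three differences show up when an NFA is simulated by tracking the set of states it could currently be in. The sketch below is a minimal illustration under stated assumptions: the example NFA is hypothetical, λ is encoded as the empty string, and missing dictionary entries stand for the empty set.

```python
# Hypothetical NFA: δ maps (state, symbol) to a SET of states; "" encodes λ.
DELTA = {
    ("q0", "a"): {"q0", "q1"},   # two choices on the same input
    ("q1", ""): {"q2"},          # λ-move: no input symbol is consumed
    ("q2", "b"): {"q3"},
}                                # every missing entry means the empty set
START, ACCEPTING = "q0", {"q3"}

def closure(states):
    """All states reachable from `states` using only λ-moves."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for p in DELTA.get((q, ""), set()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def nfa_accepts(w):
    """Follow every possible path at once; accept if some path ends in F."""
    current = closure({START})
    for symbol in w:
        moved = set()
        for q in current:
            moved |= DELTA.get((q, symbol), set())
        current = closure(moved)
    return bool(current & ACCEPTING)

for w in ["ab", "aab", "b", "aba"]:
    print(w, nfa_accepts(w))
```

When the current set becomes empty, every computation has died, which corresponds to the automaton halting without consuming its input.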
Nondeterministic Finite Automaton (NFA)
Alphabet = {a}
[Figure: NFA with states q0, q1, q2, q3, annotated “Two choices” where one input leads to two states and “No transition” where no move is defined]
First Choice
[Figures: following the first choice, all input is consumed: “accept”]
Second Choice
[Figures: following the second choice, the automaton halts: “reject”]
aa is accepted by the NFA, because at least one of its computations ends in “accept”, even though another one ends in “reject”.
Rejection example
[Figures: for this input the first choice ends in “reject” and the second choice also ends in “reject”]
Another Rejection example
[Figures: with the first choice the input cannot be consumed, so the automaton halts: “reject”; with the second choice the input cannot be consumed either, so the automaton halts: “reject”]
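An NFA rejects a string if there is no computation of the NFA that accepts the string: every computation either consumes all the input and ends in a non-accepting state, OR cannot consume the input.
[Figures: two strings that are rejected by the NFA; every computation ends in “reject”]
Language accepted: L = {aa}
[Figure: the NFA with states q0, q1, q2, q3]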
Lambda Transitions
[Figures: NFA with states q0, q1, q2, q3 that contains a λ-transition; when it is taken, the input tape head does not move]
All input is consumed: “accept”
String aa is accepted
Rejection Example
[Figures: the same NFA on another string; when the λ-transition is taken, the read head doesn’t move]
Input cannot be consumed: the automaton halts, “reject”
String is rejected
Language accepted: L = {aa}
Another NFA Example
[Figure: NFA with states q0, q1, q2, q3 and transitions labeled a and b]
[Figures: the input a b is consumed: “accept”]
Another String
[Figures: the input a b a b is consumed: “accept”]
Language accepted
L = {ab, abab, ababab, ...}
Another NFA Example
[Figure: NFA with states q0, q1, q2 and transitions labeled 0 and 1]
Transition function: δ(q, x) = {q1, ..., qk} is the set of resulting states after following one transition with symbol x from state q
δ(q0, 1) = {q1}
δ(q1, 0) = {q0, q2}
δ(q0, λ) = {q2}
δ(q2, 1) = ∅
[Figure: the NFA with states q0, q1, q2]
Extended Transition Function δ*
δ*(q0, a) = {q1}
δ*(q0, aa) = {q4, q5}
δ*(q0, ab) = {q2, q3, q0}
[Figure: NFA with states q0, q1, q2, q3, q4, q5 and transitions labeled a and b]
Special case: for any state q, q ∈ δ*(q, λ)
In general: qj ∈ δ*(qi, w) means there is a walk from qi to qj with label w, where w = σ1σ2...σk
The Language of an NFA
The language accepted by NFA M is:
L(M) = {w1, w2, ..., wn, ...}
where δ*(q0, wm) = {qi, ..., qk, ..., qj} and at least one of these states is in F.
Example: F = {q0, q5}
δ*(q0, aa) = {q4, q5} and q5 ∈ F, so aa ∈ L(M)
[Figure: NFA with states q0 to q5]
NFA M1: L(M1) = {10}*
[Figure: NFA M1 with states q0 and q1]
DFA M2: L(M2) = {10}*
[Figure: DFA M2 with states q0, q1 and q2]
Theorem:
Languages accepted by NFAs = Regular Languages = Languages accepted by DFAs
NFAs and DFAs have the same computation power: they accept the same set of languages.
Equivalence of DFA’s and NFA’s
• For most languages, NFA’s are easier to construct than DFA’s
• But it turns out we can build a corresponding DFA for any NFA
– The downside is there may be up to 2^n states in turning an NFA with n states into a DFA. However, for most problems the number of states is approximately equivalent.
• Theorem: A language L is accepted by some DFA if and only if L is accepted by some NFA; i.e. L(DFA) = L(NFA) for an appropriately constructed DFA from an NFA.
Conversion NFA to DFA
[Figure: NFA M with states q0, q1, q2 and transitions labeled a and b; the DFA M′ is built step by step, starting from the state {q0}]
δ*(q0, a) = {q1, q2}: add the DFA transition from {q0} to {q1, q2} on a
δ*(q0, b) = ∅ (empty set): add the DFA transition from {q0} to a trap state on b
δ*(q1, a) = {q1, q2}; the union with δ*(q2, a) is {q1, q2}: add the DFA transition from {q1, q2} to {q1, q2} on a
δ*(q1, b) = {q0}; the union with δ*(q2, b) is {q0}: add the DFA transition from {q1, q2} to {q0} on b
The trap state loops on a, b
END OF CONSTRUCTION
q1 ∈ F in the NFA, so {q1, q2} ∈ F′ in the DFA
General Conversion Procedure
Input: an NFA M
Output: an equivalent DFA M′ with L(M) = L(M′)
Step 2: select only those states which are reachable from start state
The NFA has states q0, q1, q2, ...
Conversion Procedure Steps
Step 1. Start the DFA from the initial state of the NFA.
Example: the DFA M′ starts with the single state {q0}.
Step 2. For every DFA state {qi, qj, ..., qm}, compute in the NFA the union
δ*(qi, a) ∪ δ*(qj, a) ∪ ... ∪ δ*(qm, a) = {qk, ql, ..., qn}
and add the transition δ({qi, qj, ..., qm}, a) = {qk, ql, ..., qn} to the DFA.
Example: δ*(q0, a) = {q1, q2}, so δ({q0}, a) = {q1, q2}.
Step 3. Repeat Step 2 for every state in the DFA and every symbol in the alphabet until no more states can be added in the DFA.
Step 4. For any DFA state {qi, qj, ..., qm}: if some qj is an accepting state in the NFA, then {qi, qj, ..., qm} is an accepting state in the DFA.
Example: q1 ∈ F in the NFA, so {q1, q2} ∈ F′ in the DFA.
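The conversion steps can be sketched as a worklist algorithm (the subset construction). The NFA below is an assumption: the figures are not preserved, so its transitions (q0 to q1 on a, a loop on a at q1, a λ-move from q1 to q2, q2 to q0 on b, with q1 accepting) were chosen to agree with the δ* values quoted in the example. λ is encoded as the empty string and the function names are illustrative.

```python
# Assumed NFA consistent with the worked example; "" encodes λ.
NFA_DELTA = {
    ("q0", "a"): {"q1"},
    ("q1", "a"): {"q1"},
    ("q1", ""): {"q2"},
    ("q2", "b"): {"q0"},
}
NFA_START, NFA_ACCEPTING, ALPHABET = "q0", {"q1"}, ["a", "b"]

def closure(states):
    """States reachable from `states` by λ-moves only."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for p in NFA_DELTA.get((q, ""), set()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def nfa_to_dfa():
    start = frozenset(closure({NFA_START}))            # Step 1
    dfa_delta, todo, dfa_states = {}, [start], {start}
    while todo:                                        # Step 3: repeat until stable
        S = todo.pop()
        for a in ALPHABET:                             # Step 2: union over the members of S
            moved = set()
            for q in S:
                moved |= NFA_DELTA.get((q, a), set())
            T = frozenset(closure(moved))              # the empty set acts as the trap state
            dfa_delta[(S, a)] = T
            if T not in dfa_states:
                dfa_states.add(T)
                todo.append(T)
    accepting = {S for S in dfa_states if S & NFA_ACCEPTING}   # Step 4
    return dfa_states, dfa_delta, start, accepting

states, delta, start, accepting = nfa_to_dfa()
for (S, a), T in sorted(delta.items(), key=lambda kv: (sorted(kv[0][0]), kv[0][1])):
    print(sorted(S), a, "->", sorted(T))
```

Only subsets reachable from the start state are ever created, which is why the resulting DFA usually has far fewer than 2^n states.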
Lemma:
If we convert NFA M to DFA M′ then the two automata are equivalent: L(M) = L(M′)
Proof:
We only need to show: L(M) ⊆ L(M′) AND L(M) ⊇ L(M′)
Languages & Grammars
Phrase-Structure Grammars
Types of Phrase-Structure
Grammars
Derivation Trees
Backus-Naur Form
Intro to Languages
noun → boy
noun → dog
verb → runs
verb → sleeps
A derivation of “the boy sleeps”:
A derivation of “a dog runs”:
L = { “a boy runs”,
“a boy sleeps”,
“the boy runs”,
“the boy sleeps”,
“a dog runs”,
“a dog sleeps”,
“the dog runs”,
“the dog sleeps” }
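Notation
noun → boy
noun → dog
Each line is a production rule. The left-hand side (noun) is a variable or non-terminal, one of the symbols of the vocabulary. The right-hand side (boy, dog) is a terminal, also one of the symbols of the vocabulary.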
Basic Terminology
► A vocabulary/alphabet, V is a finite nonempty set
of elements called symbols.
Example: V = {a, b, c, A, B, C, S}
w1 =>* wn
The * indicates that an unspecified number of steps
(including zero) can be taken to derive wn from w1.
Definition: Let G = (V, T, S, P) be a grammar. Then the
set L(G) = {w ∈ T* : S =>* w} is the language generated by G.
If w ∈ L(G), then the sequence S => w1 => w2 => ... => wn => w
is a derivation of the sentence w. The strings S, w1,
w2, …, wn, which contain variables as well as terminals,
are called sentential forms of the derivation.
Let G = (V, T, S, P), where V = {a, b, A, B, S},
T = {a, b},
S is a start symbol,
P = {S → ABa, A → BB, B → ab, A → Bb}.
G is a Phrase-Structure Grammar.
L(G) = {w ∈ T* | S =>* w}
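For a small grammar like this one, L(G) can be enumerated by exploring derivations mechanically. The sketch below handles only productions with a single variable on the left-hand side and always rewrites the leftmost variable; the function name `language` and the `max_len` cut-off (which keeps the search finite for recursive grammars) are assumptions of this illustration.

```python
def language(productions, start, variables, max_len=20):
    """Collect terminal strings derivable from `start`, exploring sentential
    forms of length at most max_len and rewriting the leftmost variable."""
    words, todo, seen = set(), [start], {start}
    while todo:
        form = todo.pop()
        idx = next((i for i, c in enumerate(form) if c in variables), None)
        if idx is None:
            words.add(form)          # no variables left: this is a sentence of L(G)
            continue
        for head, body in productions:
            if head == form[idx]:
                new = form[:idx] + body + form[idx + 1:]
                if len(new) <= max_len and new not in seen:
                    seen.add(new)
                    todo.append(new)
    return words

P = [("S", "ABa"), ("A", "BB"), ("B", "ab"), ("A", "Bb")]
print(sorted(language(P, "S", {"S", "A", "B"})))
```

For this grammar the search terminates by itself because no production is recursive. For a recursive grammar such as S → aSb, S → λ, the cut-off limits the output to the shorter sentences, and sentences whose derivations pass through longer sentential forms are missed.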
Language L(G)
► EXAMPLE:
Grammar: G = (V, T, S, P), V = {a, b, S}, T = {a, b}, P = {S → aSb, S → λ}
Derivation of sentence ab:
S => aSb => ab
Derivation of sentence aabb:
S => aSb => aaSbb => aabb
Other derivations:
194
PSG Example – English
Fragment
V T
Grammar G (V , T , S , P )
a A B
B b a b B
c
c
Backus-Naur Form