Formal Languages, Automata and Computability

The document provides information about a course on formal languages, automata and computability. It includes: - Homework grades with a mean of 81.8 and median of 86. - A review for an upcoming midterm exam covering topics like DFAs, NFAs, pumping lemmas, and Turing machines. - An overview of the concepts that will be on the exam, including different machine models and their relationships. - Descriptions of theorems and constructions related to regular languages and their representations using regular expressions and finite automata.

15-453

FORMAL LANGUAGES,
AUTOMATA AND
COMPUTABILITY
• Homework 2 Grades

• 35 homeworks handed in.


• Histogram:

• 40 | **
• 50 | *
• 60 | *****
• 70 | ***
• 80 | **********
• 90 | *************
• 100 | *

• mean=81.8, median=86
REVIEW for MIDTERM 1

THURSDAY Feb 7
Midterm 1 will cover everything we
have seen so far

The PROBLEMS will be similar to those in Sipser
Chapters 1, 2, 3 and HWs 1, 2, 3.

It will be Closed-Book,
Closed-Everything
• 1. Deterministic Finite Automata and
Regular Languages
• 2. Non-Deterministic Finite Automata
• 3. Pumping Lemma for Regular
Languages; Regular Expressions
• 4. Minimizing DFAs
• 5. PDAs, CFGs;
Pumping Lemma for CFLs
• 6. Equivalence of PDAs and CFGs
• 7. Chomsky Normal Form
• 8. Turing Machines
Machines        Syntactic Rules

DFAs, NFAs      Regular Expressions

PDAs            Context-Free Grammars
THE REGULAR OPERATIONS

Union: A ∪ B = { w | w ∈ A or w ∈ B }

Intersection: A ∩ B = { w | w ∈ A and w ∈ B }

Negation: ¬A = { w ∈ Σ* | w ∉ A }

Reverse: A^R = { w1 …wk | wk …w1 ∈ A }

Concatenation: A ⋅ B = { vw | v ∈ A and w ∈ B }

Star: A* = { s1 … sk | k ≥ 0 and each si ∈ A }
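
For finite languages the operations above can be tried out directly. A minimal sketch in Python, assuming languages are represented as sets of strings; the helper names and the `max_len` cutoff for star (A* is usually infinite) are illustrative assumptions, not part of the slides:

```python
def concat(A, B):
    """A ⋅ B = { vw | v in A and w in B }"""
    return {v + w for v in A for w in B}

def reverse(A):
    """A^R: every string of A written backwards."""
    return {w[::-1] for w in A}

def star(A, max_len):
    """The strings of A* of length at most max_len."""
    result, frontier = {''}, {''}
    while frontier:
        # extend the newest strings by one more piece from A
        frontier = {s + a for s in frontier for a in A
                    if len(s + a) <= max_len} - result
        result |= frontier
    return result

A, B = {'a', 'ab'}, {'b'}
print(A | B)             # union
print(A & B)             # intersection
print(concat(A, B))      # concatenation
print(reverse(A))        # reverse
print(star({'ab'}, 4))   # star, truncated at length 4
```

Union and intersection are Python's built-in set operators; negation is omitted because it needs all of Σ* as the universe.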


REGULAR EXPRESSIONS
σ is a regexp representing {σ}
ε is a regexp representing {ε}
∅ is a regexp representing ∅

If R1 and R2 are regular expressions representing L1 and L2 then:
(R1R2) represents L1 ⋅ L2
(R1 ∪ R2) represents L1 ∪ L2
(R1)* represents L1*
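
Python's `re` module writes union as `|` and concatenation by juxtaposition, so a regexp built from these rules can be tested on sample strings. A small sketch for the regexp (1(0 ∪ 1))* over {0,1}; the function name `in_language` is hypothetical:

```python
import re

# (1(0 ∪ 1))* in Python syntax; `|` plays the role of ∪.
PATTERN = re.compile(r'(1(0|1))*')

def in_language(w):
    """True if the whole string w is matched by the regexp."""
    return PATTERN.fullmatch(w) is not None

for w in ['', '10', '1110', '0', '101']:
    print(repr(w), in_language(w))
```

`fullmatch` is used rather than `search` because membership in the language requires the entire string to match.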
How can we test if two regular
expressions are the same?

Length n:          R1        R2        (regexps)

O(n) states:       N1        N2        (NFAs)

O(2^n) states:     M1        M2        (DFAs)

M1 MIN ?= M2 MIN   (compare the minimized DFAs)
THEOREMS
and
CONSTRUCTIONS
THE PUMPING LEMMA
(for Regular Languages)

Let L be a regular language with |L| = ∞

Then there is an integer P such that
if w ∈ L and |w| ≥ P
then we can write w = xyz, where:

1. |y| > 0
2. |xy| ≤ P
3. xy^i z ∈ L for any i ≥ 0
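
The three conditions can be explored by brute force for one fixed string and one fixed P. The sketch below is a finite check and not a proof: it tries every split w = xyz with |y| > 0 and |xy| ≤ P and tests only i = 0..3. The languages { 0^n 1^n } and (01)* and all function names are assumptions chosen for illustration:

```python
def in_0n1n(s):
    """Membership in { 0^n 1^n | n ≥ 0 } (not regular)."""
    n = len(s) // 2
    return s == '0' * n + '1' * n

def in_01_star(s):
    """Membership in (01)* (regular)."""
    return s == '01' * (len(s) // 2)

def pumpable(w, p, in_lang, max_i=3):
    """True if some split w = xyz with |y| > 0 and |xy| ≤ p
    keeps x y^i z in the language for every i in 0..max_i."""
    for j in range(0, p + 1):            # j = |x|
        for k in range(j + 1, p + 1):    # k = |xy|
            x, y, z = w[:j], w[j:k], w[k:]
            if all(in_lang(x + y * i + z) for i in range(max_i + 1)):
                return True
    return False

print(pumpable('0' * 5 + '1' * 5, 5, in_0n1n))   # no split survives pumping
print(pumpable('010101', 2, in_01_star))         # x = ε, y = 01 works
```

For 0^5 1^5 every admissible y consists only of 0s, so pumping changes the number of 0s but not the number of 1s.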
THE PUMPING LEMMA
(for Context-Free Languages)
Let L be a context-free language with |L| = ∞
Then there is an integer P such that
if w ∈ L and |w| ≥ P
then we can write w = uvxyz, where:

1. |vy| > 0
2. |vxy| ≤ P
3. uv^i xy^i z ∈ L, for any i ≥ 0

[Figure: parse tree rooted at T in which a variable R repeats along a path; the repeated R splits the yield into u v x y z]
CONVERTING NFAs TO DFAs
Input: NFA N = (Q, Σ, δ, Q0, F)
Output: DFA M = (Q′, Σ, δ′, q0′, F′)

Q′ = 2^Q
δ′ : Q′ × Σ → Q′
δ′(R, σ) = ∪ { ε( δ(r, σ) ) | r ∈ R }
q0′ = ε(Q0)
F′ = { R ∈ Q′ | f ∈ R for some f ∈ F }

For R ⊆ Q, the ε-closure of R, ε(R) = {q that can be reached from some r ∈ R by traveling along zero or more ε arrows}
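Given: NFA N = ( {1,2,3}, {a,b}, δ, {1}, {1} )
Construct: Equivalent DFA M
M = (2^{1,2,3}, {a,b}, δ′, {1,3}, …)

ε({1}) = {1,3}

[Figure: state diagram of N (states 1, 2, 3, with arrows labeled a, b, a,b and ε) next to the DFA M on the subsets {1,3}, {3}, ∅, {2}, {2,3}, {1,2,3}; the subsets {1} and {1,2} are marked with a question mark]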
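
The subset construction can be sketched in Python as follows. Only subsets reachable from the start state are built. The transition table is an assumption: the slide's diagram did not survive extraction, so the arrows below were chosen to be consistent with ε({1}) = {1,3}; the empty string `''` stands for ε and all function names are hypothetical.

```python
def eps_closure(states, delta):
    """All states reachable from `states` along zero or more ε arrows."""
    closure, stack = set(states), list(states)
    while stack:
        r = stack.pop()
        for q in delta.get((r, ''), set()):
            if q not in closure:
                closure.add(q)
                stack.append(q)
    return frozenset(closure)

def nfa_to_dfa(alphabet, delta, starts, accepts):
    """Subset construction restricted to the reachable subsets."""
    q0 = eps_closure(starts, delta)
    dfa_delta, seen, todo = {}, {q0}, [q0]
    while todo:
        R = todo.pop()
        for a in alphabet:
            moved = set()
            for r in R:
                moved |= delta.get((r, a), set())
            S = eps_closure(moved, delta)     # δ′(R, a)
            dfa_delta[(R, a)] = S
            if S not in seen:
                seen.add(S)
                todo.append(S)
    dfa_accepts = {R for R in seen if R & set(accepts)}
    return seen, dfa_delta, q0, dfa_accepts

# Assumed transition table for the three-state NFA.
delta = {(1, 'b'): {2}, (1, ''): {3}, (2, 'a'): {2, 3},
         (2, 'b'): {3}, (3, 'a'): {1}}
states, dfa_delta, q0, dfa_accepts = nfa_to_dfa('ab', delta, {1}, {1})
print(sorted(sorted(R) for R in states))
```

Under this assumed table the subsets {1} and {1,2} are never reached from the start state {1,3}.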
EQUIVALENCE

L can be represented by a regexp
⇔
L is a regular language

L can be represented by a regexp
⇒
L is a regular language

Induction on the length of R:

Base Cases (R has length 1):

R = σ

R = ε

R = ∅

Inductive Step:
Assume R has length k > 1,
and that every regexp of length < k
represents a regular language

Three possibilities for what R can be:

R = R1 ∪ R2 (Closure under Union)
R = R1 R2 (Closure under Concat.)
R = (R1)* (Closure under Star)

Therefore: L can be represented by a regexp
⇒ L is regular

L is a regular language
⇒ L can be represented by a regexp

Proof idea: Transform an NFA for L into a regular expression by removing states and re-labeling the arrows with regular expressions

Add unique and distinct start and accept states

[Figure: the NFA with a new start state and a new accept state attached by ε arrows]
While machine has more than 2 states:
Pick an internal state, rip it out and
re-label the arrows with regexps,
to account for the missing state

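[Figure: a state with a self-loop labeled 1, entered by an arrow labeled 0 and left by an arrow labeled 0, is ripped out and replaced by a single arrow labeled 01*0]

[Figure: NFA q0 → q1 → q2 → q3 with arrows labeled ε, b, ε and self-loops labeled a and a,b; ripping out the internal states gives the label a*b and then (a*b)(a ∪ b)*]

R(q0,q3) = (a*b)(a ∪ b)*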
Transform (1(0 ∪ 1))* to an NFA

[Figure: the resulting NFA, with arrows labeled ε, 1 and 1,0 and an ε arrow for the star]
THEOREM
For every regular language L, there exists
a UNIQUE (up to re-labeling of the states)
minimal DFA M such that L = L(M)
EXTENDING δ
Given DFA M = (Q, Σ, δ, q0, F), extend δ
to δ̂ : Q × Σ* → Q as follows:
δ̂(q, ε) = q
δ̂(q, σ) = δ(q, σ)
δ̂(q, w1 …wk+1) = δ( δ̂(q, w1 …wk), wk+1 )

Note: δ̂(q0, w) ∈ F ⇔ M accepts w

String w ∈ Σ* distinguishes states q1 and q2 iff
exactly ONE of δ̂(q1, w), δ̂(q2, w) is a final state
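
The extended transition function amounts to feeding the string to δ one symbol at a time. A minimal sketch, assuming δ is stored as a Python dict keyed by (state, symbol); the two-state parity DFA and the function names are hypothetical examples:

```python
def delta_hat(delta, q, w):
    """Extended transition function: the state reached from q after reading w."""
    for symbol in w:
        q = delta[(q, symbol)]
    return q

def distinguishes(delta, F, w, q1, q2):
    """True iff exactly one of the two states reached on w is a final state."""
    return (delta_hat(delta, q1, w) in F) != (delta_hat(delta, q2, w) in F)

# Hypothetical DFA tracking the parity of the number of 1s read so far.
delta = {('even', '0'): 'even', ('even', '1'): 'odd',
         ('odd', '0'): 'odd',   ('odd', '1'): 'even'}
F = {'even'}

print(delta_hat(delta, 'even', '1101'))
print(distinguishes(delta, F, '', 'even', 'odd'))
```

Here the empty string already distinguishes the two states, because one is final and the other is not.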
Fix M = (Q, Σ, δ, q0, F) and let p, q, r ∈ Q
Definition:
p ~ q iff p is indistinguishable from q
p ≁ q iff p is distinguishable from q
Proposition: ~ is an equivalence relation
p ~ p (reflexive)
p ~ q ⇒ q ~ p (symmetric)
p ~ q and q ~ r ⇒ p ~ r (transitive)
Proposition: ~ is an equivalence relation
so ~ partitions the set of states of M into
disjoint equivalence classes

[q] = { p | p ~ q }

TABLE-FILLING ALGORITHM
Input: DFA M = (Q, Σ, δ, q0, F)
Output: (1) DM = { (p,q) | p,q ∈ Q and p ≁ q }
(2) EM = { [q] | q ∈ Q }

IDEA:
• We know how to find those pairs of
states that ε distinguishes…
• Use this and recursion to find those
pairs distinguishable with longer strings
• Pairs left over will be indistinguishable
TABLE-FILLING ALGORITHM

Base Case: p accepts and q rejects ⇒ p ≁ q

Recursion: if there is σ ∈ Σ and states p′, q′ satisfying
δ(p, σ) = p′ and δ(q, σ) = q′ with p′ ≁ q′, then p ≁ q

Repeat until no more new D’s

[Figure: triangular table with rows and columns q0, q1, …, qi, …, qn; each distinguishable pair is marked D]
Algorithm MINIMIZE
Input: DFA M
Output: DFA MMIN
(1) Remove all inaccessible states from M
(2) Apply Table-Filling algorithm to get
EM = { [q] | q is an accessible state of M }
MMIN = (QMIN, Σ, δMIN, q0 MIN, FMIN)

QMIN = EM, q0 MIN = [q0], FMIN = { [q] | q ∈ F }

δMIN( [q], σ ) = [ δ( q, σ ) ]

Claim: MMIN ≡ M
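
The two steps of MINIMIZE, including the table-filling pass, can be sketched as follows. This is one possible rendering, assuming δ is a total function stored as a dict keyed by (state, symbol); the four-state example DFA (strings ending in 1, with one redundant and one inaccessible state) and all names are hypothetical:

```python
from itertools import combinations

def minimize(alphabet, delta, q0, F):
    # (1) Remove inaccessible states: keep what is reachable from q0.
    reach, todo = {q0}, [q0]
    while todo:
        q = todo.pop()
        for a in alphabet:
            r = delta[(q, a)]
            if r not in reach:
                reach.add(r)
                todo.append(r)
    # (2) Table filling. Base case: one state accepts, the other rejects.
    D = {frozenset((p, q)) for p, q in combinations(reach, 2)
         if (p in F) != (q in F)}
    changed = True
    while changed:                      # repeat until no more new D's
        changed = False
        for p, q in combinations(reach, 2):
            if frozenset((p, q)) in D:
                continue
            if any(frozenset((delta[(p, a)], delta[(q, a)])) in D
                   for a in alphabet):
                D.add(frozenset((p, q)))
                changed = True

    def cls(q):
        """Equivalence class [q]: the states not distinguishable from q."""
        return frozenset(p for p in reach if frozenset((p, q)) not in D)

    Q_min = {cls(q) for q in reach}
    delta_min = {(cls(q), a): cls(delta[(q, a)])
                 for q in reach for a in alphabet}
    F_min = {cls(q) for q in reach if q in F}
    return Q_min, delta_min, cls(q0), F_min

# Hypothetical DFA: s2 behaves like s0, and s3 is inaccessible.
delta = {('s0', '0'): 's2', ('s0', '1'): 's1',
         ('s1', '0'): 's2', ('s1', '1'): 's1',
         ('s2', '0'): 's0', ('s2', '1'): 's1',
         ('s3', '0'): 's0', ('s3', '1'): 's3'}
Q_min, delta_min, q0_min, F_min = minimize('01', delta, 's0', {'s1', 's3'})
print(len(Q_min), sorted(q0_min))
```

Pairs are stored as frozensets so that (p,q) and (q,p) share one table entry; a pair of identical states collapses to a singleton and is never marked.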
A Language L is generated by a CFG
⇔
L is recognized by a PDA
Suppose L is generated by a CFG G = (V, Σ, R, S)
Construct P = (Q, Σ, Γ, δ, q, F) that recognizes L

[Figure: PDA whose transitions are labeled as follows]
ε,ε → $S
For each rule 'A → w' ∈ R: ε,A → w^R
For each terminal a ∈ Σ: a,a → ε
ε,$ → ε
Example:
S → aTb
T → Ta | ε

[Figure: PDA for this grammar, with transition labels ε,ε → $; ε,ε → S; ε,ε → T (twice); ε,ε → a; ε,T → ε; a,a → ε; b,b → ε; ε,$ → ε]
A Language L is generated by a CFG
⇔
L is recognized by a PDA
Given PDA P = (Q, Σ, Γ, δ, q, F)
Construct a CFG G = (V, Σ, R, S) such that
L(G) = L(P)
First, simplify P to have the following form:
(1) It has a single accept state, qaccept
(2) It empties the stack before accepting
(3) Each transition either pushes a symbol or
pops a symbol, but not both at the same time
SIMPLIFY
[Figure: PDA with states q0, q1, q2, q3 and transition labels ε,ε → $; 0,ε → 0; 1,0 → ε; ε,$ → ε; 1,0 → ε]

SIMPLIFY
[Figure: the same PDA after simplification, with additional states (q′0, q′3, q4, q5) and additional transition labels ε,ε → E; ε,ε → D; ε,D → ε; ε,E → ε; ε,σ → ε]
Idea For Our Grammar G:
For every pair of states p and q in PDA P,

G will have a variable Apq which generates all


strings x that can take:

P from p with an empty stack


to q with an empty stack

V = {Apq | p,q ∈ Q }

S = Aq0qacc
x = ayb takes p with empty stack to q with empty stack

1. The symbol t popped at the end is exactly


the one pushed at the beginning

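[Figure: stack height plotted against the input string x = a y b; the first move reads a and pushes t (taking p to r), the last move reads b and pops t (taking s to q)]

δ(p, a, ε) → (r, t)
δ(s, b, t) → (q, ε)
Apq → aArsb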
2. The symbol popped at the end is not
the one pushed at the beginning

[Figure: stack height plotted against the input string; the stack is empty again at an intermediate state r between p and q]

Apq → AprArq
Formally:
V = {Apq | p, q ∈ Q }
S = Aq0qacc

For every p, q, r, s ∈ Q, t ∈ Γ and a, b ∈ Σε
If (r, t) ∈ δ(p, a, ε) and (q, ε) ∈ δ(s, b, t)
Then add the rule Apq → aArsb

For every p, q, r ∈ Q,
add the rule Apq → AprArq

For every p ∈ Q,
add the rule App → ε
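
The three rule schemas translate almost line for line into code. A sketch under stated assumptions: the PDA is already in simplified form, δ is a dict mapping (state, input, pop) to a set of (state, push) pairs with `''` for ε, a variable Apq is the tuple (p, q), and a rule is a (head, body) pair. The transition table is the 0^n 1^n PDA with q3 taken as the single accept state, which is an assumption.

```python
from itertools import product

def pda_to_cfg(states, delta, q_start, q_accept):
    rules = set()
    # Push moves (p, a, r, t): (r, t) ∈ δ(p, a, ε)
    pushes = [(p, a, r, t) for (p, a, x), moves in delta.items() if x == ''
              for (r, t) in moves if t != '']
    # Pop moves (s, b, t, q): (q, ε) ∈ δ(s, b, t)
    pops = [(s, b, t, q) for (s, b, t), moves in delta.items() if t != ''
            for (q, y) in moves if y == '']
    for (p, a, r, t), (s, b, t2, q) in product(pushes, pops):
        if t == t2:
            rules.add(((p, q), (a, (r, s), b)))     # Apq → a Ars b
    for p, q, r in product(states, repeat=3):
        rules.add(((p, q), ((p, r), (r, q))))       # Apq → Apr Arq
    for p in states:
        rules.add(((p, p), ()))                     # App → ε
    return (q_start, q_accept), rules

delta = {('q0', '', ''):   {('q1', '$')},
         ('q1', '0', ''):  {('q1', '0')},
         ('q1', '1', '0'): {('q2', '')},
         ('q2', '1', '0'): {('q2', '')},
         ('q2', '', '$'):  {('q3', '')}}
start, rules = pda_to_cfg(['q0', 'q1', 'q2', 'q3'], delta, 'q0', 'q3')
print(start, len(rules))
```

Most of the generated rules are useless (their variables derive nothing); the construction makes no attempt to prune them.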
THE CHOMSKY NORMAL FORM
A context-free grammar is in Chomsky normal
form if every rule is of the form:
A → BC    B, C are variables (not the start var)
A → a     a is a terminal
S → ε     S is the start variable

Not in CNF:        In CNF:
S → 0S1            S0 → TU | ε
S → TT             T → 0
T → ε              U → SV | 1
                   S → TU
                   V → 1
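
The three permitted rule shapes can be checked mechanically. A small sketch, assuming a grammar is given as a list of (head, body) pairs where the body is a tuple of symbols and `()` stands for ε; the function name `is_cnf` is hypothetical:

```python
def is_cnf(rules, start, variables):
    """True iff every rule is A → BC, A → a, or start → ε."""
    for head, body in rules:
        if len(body) == 2 and all(s in variables and s != start for s in body):
            continue                  # A → BC, with B, C not the start variable
        if len(body) == 1 and body[0] not in variables:
            continue                  # A → a
        if body == () and head == start:
            continue                  # S → ε
        return False
    return True

left = [('S', ('0', 'S', '1')), ('S', ('T', 'T')), ('T', ())]
right = [('S0', ('T', 'U')), ('S0', ()), ('T', ('0',)),
         ('U', ('S', 'V')), ('U', ('1',)), ('S', ('T', 'U')), ('V', ('1',))]

print(is_cnf(left, 'S', {'S', 'T'}))
print(is_cnf(right, 'S0', {'S0', 'S', 'T', 'U', 'V'}))
```

The first grammar fails on S → 0S1 (three symbols on the right-hand side); the second satisfies all three shapes.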
Theorem: If G is in CNF, w ∈ L(G) and |w| > 0,
then any derivation of w in G has length 2|w| - 1

Theorem: Any context-free language


can be generated by a context-free
grammar in Chomsky normal form

“Can transform any CFG into


Chomsky normal form”
Theorem: Any CFL can be generated
by a CFG in Chomsky normal form
Algorithm:
1. Add a new start variable (S0 → S)
2. Eliminate all A → ε rules:
For each occurrence of A on the RHS of a rule,
add a new rule that removes that occurrence
(unless this new rule was previously removed)
3. Eliminate all A → B rules:
For each rule with B on LHS of a rule,
add a new rule that puts A on the LHS instead
(unless this new rule was previously removed)
4. Convert A → u1u2... uk to A → u1A1, A1 → u2A2, ...
If ui is a terminal, replace ui with Ui and add Ui → ui
Convert the following into Chomsky normal form:
A → BAB | B | ε
B → 00 | ε

Step 1 (add a new start variable):
S0 → A
A → BAB | B | ε
B → 00 | ε

Step 2 (eliminate ε rules):
S0 → A | ε
A → BAB | B | BB | AB | BA
B → 00

Step 3 (eliminate A → B rules):
S0 → BAB | 00 | BB | AB | BA | ε
A → BAB | 00 | BB | AB | BA
B → 00

Step 4 (convert the remaining rules):
S0 → BC | DD | BB | AB | BA | ε, C → AB,
A → BC | DD | BB | AB | BA, B → DD, D → 0
FORMAL DEFINITIONS
A deterministic finite automaton (DFA) is a 5-tuple M = (Q, Σ, δ, q0, F)
Q is the set of states (finite)
Σ is the alphabet (finite)
δ : Q × Σ → Q is the transition function
q0 ∈ Q is the start state
F ⊆ Q is the set of accept states
Let w1, ... , wn ∈ Σ and w = w1... wn ∈ Σ*
Then M accepts w if there are r0, r1, ..., rn ∈ Q, s.t.
1. r0=q0
2. δ(ri, wi+1 ) = ri+1, for i = 0, ..., n-1, and
3. rn ∈ F
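
The acceptance condition can be read as a recipe: build the sequence r0, r1, ..., rn and test the last state. A minimal sketch, assuming δ is a dict keyed by (state, symbol); the example DFA (strings over {0,1} that end in 1) and the function names are hypothetical:

```python
def dfa_run(delta, q0, w):
    """The state sequence r0, r1, ..., rn with r0 = q0 and δ(ri, wi+1) = ri+1."""
    run = [q0]
    for symbol in w:
        run.append(delta[(run[-1], symbol)])
    return run

def dfa_accepts(delta, q0, F, w):
    """M accepts w iff the last state of the run is in F."""
    return dfa_run(delta, q0, w)[-1] in F

# Hypothetical DFA: state 'b' means "the last symbol read was 1".
delta = {('a', '0'): 'a', ('a', '1'): 'b',
         ('b', '0'): 'a', ('b', '1'): 'b'}

print(dfa_run(delta, 'a', '011'))
print(dfa_accepts(delta, 'a', {'b'}, '011'))
```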
A non-deterministic finite automaton (NFA)
is a 5-tuple N = (Q, Σ, δ, Q0, F)

Q is the set of states
Σ is the alphabet
δ : Q × Σε → 2^Q is the transition function
Q0 ⊆ Q is the set of start states
F ⊆ Q is the set of accept states

2^Q is the set of all possible subsets of Q
Σε = Σ ∪ {ε}

Let w ∈ Σ* and suppose w can be written as
w1... wn where wi ∈ Σε (ε = empty string)
Then N accepts w if there are r0, r1, ..., rn ∈ Q
such that

1. r0 ∈ Q0
2. ri+1 ∈ δ(ri, wi+1 ) for i = 0, ..., n-1, and
3. rn ∈ F

L(N) = the language recognized by N


= set of all strings machine N accepts

A language L is recognized by an NFA N


if L = L(N).
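
Rather than guessing a run, a simulation can track the set of all states the NFA could be in. A sketch under stated assumptions: δ is a dict mapping (state, symbol) to a set of states, `''` stands for ε, and the example NFA for 0*1* is hypothetical:

```python
def eps_closure(delta, states):
    """States reachable from `states` along zero or more ε arrows."""
    closure, stack = set(states), list(states)
    while stack:
        r = stack.pop()
        for q in delta.get((r, ''), set()):
            if q not in closure:
                closure.add(q)
                stack.append(q)
    return closure

def nfa_accepts(delta, starts, F, w):
    """True iff some sequence of choices ends in an accept state."""
    current = eps_closure(delta, starts)
    for symbol in w:
        moved = set()
        for r in current:
            moved |= delta.get((r, symbol), set())
        current = eps_closure(delta, moved)
    return bool(current & set(F))

# Hypothetical NFA for 0*1*: loop on 0 in s, ε-move to t, loop on 1 in t.
delta = {('s', '0'): {'s'}, ('s', ''): {'t'}, ('t', '1'): {'t'}}

for w in ['', '0011', '1', '10']:
    print(repr(w), nfa_accepts(delta, {'s'}, {'t'}, w))
```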
PUSHDOWN AUTOMATA (PDA)

[Figure: a finite state control reading the INPUT, attached to a STACK (last in, first out)]

Transition labels read: string, pop → push

ε,ε → $
0,ε → 0
1,0 → ε
ε,$ → ε
1,0 → ε
Definition: A (non-deterministic) PDA is a tuple
P = (Q, Σ, Γ, δ, q0, F), where:

Q is a finite set of states
Σ is the input alphabet
Γ is the stack alphabet
δ : Q × Σε × Γε → 2^(Q × Γε)
q0 ∈ Q is the start state
F ⊆ Q is the set of accept states

2^Q is the set of subsets of Q and Σε = Σ ∪ {ε}

Let w ∈ Σ* and suppose w can be written as
w1... wn where wi ∈ Σε (recall Σε = Σ ∪ {ε})
Then P accepts w if there are
r0, r1, ..., rn ∈ Q and
s0, s1, ..., sn ∈ Γ* (sequence of stacks) such that
1. r0 = q0 and s0 = ε (P starts in q0 with empty stack)

2. For i = 0, ..., n-1:
(ri+1 , b) ∈ δ(ri, wi+1, a), where si = at and si+1 = bt for
some a, b ∈ Γε and t ∈ Γ*
(P moves correctly according to state, stack and symbol read)

3. rn ∈ F (P is in an accept state at the end of its input)
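
Acceptance by a non-deterministic PDA can be simulated by searching over configurations (state, input position, stack). The sketch below is one way to do it, not the only one. Assumptions: δ is a dict mapping (state, input, pop) to a set of (state, push) pairs, `''` stands for ε, the stack is a string with its top at index 0, and the search gives up after `max_steps` configurations because ε-moves that keep pushing could otherwise run forever. The transition table follows the labels of the 0^n 1^n example PDA; taking q0 and q3 as the accept states is an assumption.

```python
from collections import deque

def pda_accepts(delta, start, accepts, w, max_steps=10_000):
    """Breadth-first search over configurations; False may mean 'gave up'."""
    first = (start, 0, '')
    seen, queue = {first}, deque([first])
    steps = 0
    while queue and steps < max_steps:
        steps += 1
        state, i, stack = queue.popleft()
        if i == len(w) and state in accepts:
            return True
        # A move reads w[i] or ε, and pops the top stack symbol or ε.
        reads = [''] + ([w[i]] if i < len(w) else [])
        pops = [''] + ([stack[0]] if stack else [])
        for a in reads:
            for x in pops:
                for (r, push) in delta.get((state, a, x), ()):
                    cfg = (r, i + len(a), push + stack[len(x):])
                    if cfg not in seen:
                        seen.add(cfg)
                        queue.append(cfg)
    return False

delta = {('q0', '', ''):   {('q1', '$')},
         ('q1', '0', ''):  {('q1', '0')},
         ('q1', '1', '0'): {('q2', '')},
         ('q2', '1', '0'): {('q2', '')},
         ('q2', '', '$'):  {('q3', '')}}

for w in ['', '01', '0011', '001', '10']:
    print(repr(w), pda_accepts(delta, 'q0', {'q0', 'q3'}, w))
```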


[Figure: PDA with states q0, q1, q2, q3 and transition labels ε,ε → $; 0,ε → 0; 1,0 → ε; ε,$ → ε; 1,0 → ε]

Q = {q0, q1, q2, q3}   Σ = {0,1}   Γ = {$,0,1}

δ : Q × Σε × Γε → 2^(Q × Γε)
δ(q1,1,0) = { (q2,ε) }   δ(q2,1,1) = ∅
δ(q2,ε,$) = { (q3,ε) }
CONTEXT-FREE GRAMMARS
A context-free grammar (CFG) is a tuple
G = (V, Σ, R, S), where:

V is a finite set of variables
Σ is a finite set of terminals (disjoint from V)
R is a set of production rules of the form A → W,
where A ∈ V and W ∈ (V ∪ Σ)*
S ∈ V is the start variable
L(G) = {w ∈ Σ* | S ⇒* w} Strings Generated by G
G = ( {S}, {0,1}, R, S ) R = { S → 0S1, S → ε }
L(G) = { 0^n 1^n | n ≥ 0 } Strings Generated by G
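
The strings generated by a small grammar can be enumerated by expanding the leftmost variable breadth-first. A sketch under stated assumptions: every symbol is a single character, rules are a dict from a variable to its right-hand sides, and sentential forms with more than `max_len` terminals are pruned. It may not terminate on grammars whose variables can multiply without producing terminals; the function name is hypothetical.

```python
from collections import deque

def generate(rules, start, max_len):
    """Strings of length ≤ max_len derivable from `start`."""
    variables = set(rules)
    out, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        # position of the leftmost variable, if any
        idx = next((i for i, s in enumerate(form) if s in variables), None)
        if idx is None:
            out.add(form)             # no variables left: a generated string
            continue
        for rhs in rules[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(s not in variables for s in new)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return out

# R = { S → 0S1, S → ε }
print(sorted(generate({'S': ['0S1', '']}, 'S', 4), key=len))
```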
TURING MACHINE

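[Figure: a finite state control (in state q10) with a head on an INFINITE TAPE that holds the input]

Transition labels read: read → write, move

[Figure: state diagram with transitions labeled 0 → 0, R and □ → □, R, and the states qaccept and qreject]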
Definition: A Turing Machine is a 7-tuple
T = (Q, Σ, Γ, δ, q0, qaccept, qreject), where:

Q is a finite set of states
Σ is the input alphabet, where □ ∉ Σ (□ is the blank symbol)
Γ is the tape alphabet, where □ ∈ Γ and Σ ⊆ Γ
δ : Q × Γ → Q × Γ × {L,R}
q0 ∈ Q is the start state
qaccept ∈ Q is the accept state
qreject ∈ Q is the reject state, and qreject ≠ qaccept
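
A machine given by this 7-tuple can be simulated directly. A minimal sketch with several assumptions: the tape is a dict from cell index to symbol, `'_'` stands for the blank, a left move at the leftmost cell stays in place, and the run is cut off after `max_steps` because a TM need not halt. The example machine (accept strings of 0s of even length) and all names are hypothetical.

```python
def run_tm(delta, q0, q_accept, q_reject, w, blank='_', max_steps=10_000):
    """True if the TM accepts, False if it rejects, None if the step budget runs out."""
    tape = dict(enumerate(w))
    state, head = q0, 0
    for _ in range(max_steps):
        if state == q_accept:
            return True
        if state == q_reject:
            return False
        symbol = tape.get(head, blank)
        state, write, move = delta[(state, symbol)]   # δ(q, read) = (q', write, move)
        tape[head] = write
        head = max(head + (1 if move == 'R' else -1), 0)
    return None

# Hypothetical TM: 'e' = even number of 0s read so far, 'o' = odd.
delta = {('e', '0'): ('o', '0', 'R'), ('o', '0'): ('e', '0', 'R'),
         ('e', '_'): ('acc', '_', 'R'), ('o', '_'): ('rej', '_', 'R')}

for w in ['', '0', '00', '000']:
    print(repr(w), run_tm(delta, 'e', 'acc', 'rej', w))
```

The `None` result reflects the difference between recognizing and deciding: a simulator can only report acceptance or rejection once the machine halts.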
A TM recognizes a language iff it accepts all
and only those strings in the language

A language L is called Turing-recognizable


or recursively enumerable
iff some TM recognizes L

A TM decides a language L iff it accepts all


strings in L and rejects all strings not in L

A language L is called decidable or recursive


iff some TM decides L
A language is called Turing-recognizable or
recursively enumerable (r.e.) if some TM
recognizes it
A language is called decidable or recursive
if some TM decides it

[Figure: the recursive languages drawn inside the r.e. languages]
WWW.FLAC.WS
Happy studying!
