Pda Annotated 10 12 2021
Introduction
Pushdown automata and context-free languages are foundational for defining several types of languages
that occur in practice, including programming languages, markup languages, and languages
for communication protocols. Their importance stems from the fact that, on one hand, CFL's
are much more expressive than regular languages, yet every CFL can be accepted by a class
of relatively simple machines, known as pushdown automata (PDA). Unlike a DFA, a PDA has unlimited
memory, albeit in the form of a stack. CFL's were first studied in relation to natural-language
acquisition back in the 1950's.
Context-Free Grammars
Each rule of a CFG G = (V, Σ, R, S) has the form
A → s,
where A ∈ V and s ∈ (V ∪ Σ)∗ . Variable A is referred to as the head of the rule, while s is
referred to as its body.
Example 1. Consider the set of rules
S → SS, S → aSb, S → ε.
Then we may use this set of rules to define a CFG G = (V, Σ, R, S), where
V = {S},
Σ = {a, b},
and variable S is the start variable.
For brevity we may list together rules having the same head as follows.
S → SS | aSb | ε.
Example 2. One common use of CFG's is to provide a grammatical formalism for natural languages.
For example, consider the set of rules
⟨SENTENCE⟩ → ⟨NOUN-PHRASE⟩⟨VERB-PHRASE⟩
Here, the variables are the ten parts of speech delimited by ⟨ ⟩, Σ is the lowercase English alphabet,
including the space character, and ⟨SENTENCE⟩ is the start variable.
Example 3. A CFG may also be used to define the syntax of a programming language. One fundamental
component of any programming language is the expression. The following
rules imply a CFG for defining expressions formed by a single terminal a, parentheses, and the two
arithmetic operations + and ×. Here E stands for expression, T for term, and F for factor.
E →E+T | T
T →T ×F | F
F → (E) | a
We have V = {E, T, F }, Σ = {+, ×, a, (, )}, and E is the start variable.
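As an aside not found in the original notes, this grammar can be turned into a small recognizer. Since E → E + T and T → T × F are left-recursive, the sketch below uses the equivalent iterative forms E → T (+T)∗ and T → F (×F)∗, which generate the same language.

```python
def accepts_expression(w: str) -> bool:
    """Recognizer for E -> E+T | T, T -> T×F | F, F -> (E) | a,
    rewritten without left recursion: E = T ('+' T)*, T = F ('×' F)*."""
    pos = 0

    def peek():
        return w[pos] if pos < len(w) else None

    def eat(c):
        nonlocal pos
        if peek() == c:
            pos += 1
            return True
        return False

    def parse_E():
        if not parse_T():
            return False
        while eat('+'):          # E -> E + T, unrolled
            if not parse_T():
                return False
        return True

    def parse_T():
        if not parse_F():
            return False
        while eat('×'):          # T -> T × F, unrolled
            if not parse_F():
                return False
        return True

    def parse_F():
        if eat('a'):             # F -> a
            return True
        if eat('('):             # F -> (E)
            return parse_E() and eat(')')
        return False

    return parse_E() and pos == len(w)
```

For instance, `accepts_expression("a×(a+a)")` succeeds, while `accepts_expression("a+")` fails.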
Grammar derivations
Let G = (V, Σ, R, S) be a CFG. Then the language D(G) ⊆ (V ∪ Σ)∗ of derived words is structurally
defined as follows.
Atom S ∈ D(G).
Compound Rule Suppose s ∈ D(G), s is of the form uAv for some u, v ∈ (V ∪ Σ)∗ , A ∈ V , and
A → γ is a rule of G, then
uγv ∈ D(G).
In this case we write s ⇒ uγv, and say that s yields uγv. In words, to get a new derived word,
take an existing derived word and replace one of its variables A with the body of a rule whose
head is A.
The subset L(G) of derived words w ∈ D(G) for which w ∈ Σ∗ is called the context-free language
(CFL) associated with G. Thus, the words of L(G) consist only of terminal symbols.
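The definitions of D(G) and L(G) can be animated with a short breadth-first search over derived words. The sketch below is an illustration of mine, not part of the notes; it uses the grammar of Example 1 (variables uppercase, terminals lowercase) and collects the terminal-only derived words up to a length bound.

```python
from collections import deque

# Grammar of Example 1: S -> SS | aSb | ε
RULES = {'S': ['SS', 'aSb', '']}

def language_words(rules, start, max_len):
    """Collect words of L(G) (terminal-only derived words) of length <= max_len,
    exploring leftmost derivations through sentential forms of bounded length."""
    words, seen = set(), {start}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        if all(c.islower() for c in s):        # terminal-only: member of L(G)
            words.add(s)
            continue
        i = next(j for j, c in enumerate(s) if c.isupper())
        # apply every rule whose head is the leftmost variable
        for body in rules[s[i]]:
            t = s[:i] + body + s[i+1:]
            if len(t) <= max_len + 2 and t not in seen:
                seen.add(t)
                queue.append(t)
    return {w for w in words if len(w) <= max_len}
```

Running `language_words(RULES, 'S', 4)` yields the balanced words ε, ab, aabb, abab, and so on, and never an unbalanced word such as ba.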
The Derivation relation
Let u and v be words in (V ∪ Σ)∗ . We say that u derives v, written u ⇒∗ v, if and only if either u = v
or there is a sequence of words w1 , w2 , . . . , wn such that
u = w1 ⇒ w2 ⇒ w3 ⇒ · · · ⇒ wn = v.
Example 4. Use the CFG from Example 1 to derive the word aabbaababb.
S → SS | aSb | ε.
Solution. S ⇒ SS ⇒ aSbS ⇒ aaSbbS ⇒ aabbS ⇒ aabbaSb ⇒ aabbaSSb ⇒ aabbaaSbSb ⇒ aabbaabSb ⇒ aabbaabaSbb ⇒ aabbaababb.
Derivation parse trees
Suppose word w ∈ L(G) has a derivation
S = w1 ⇒ w2 ⇒ · · · ⇒ wn = w.
Then the parse tree for w can be defined in a step-by-step manner. To begin, the parse tree T1 for
S = w1 consists of a single node labeled with S.
Now suppose a parse tree Tk has been associated with wk , the kth word of the derivation. Assume
that, from left to right, the leaves of Tk are labeled in one-to-one correspondence with the
symbols of wk . Moreover, assume that wk has the form wk = uAv, where A is replaced by a word
γ, so that wk+1 = uγv. Then Tk+1 is obtained from Tk by assigning the leaf node labeled with A a
number of children equal to the length of γ and for which, from left to right, the ith child is labeled
with the ith symbol of γ.
Example 5. Use the CFG from Example 3 to derive the expression a × (a + a), and provide the
parse tree associated with the derivation.
E →E+T | T
T →T ×F | F
F → (E) | a
Solution. E ⇒ T ⇒ T × F ⇒ F × F ⇒ a × F ⇒ a × (E) ⇒ a × (E + T ) ⇒ a × (T + T ) ⇒ a × (F + T ) ⇒ a × (a + T ) ⇒ a × (a + F ) ⇒ a × (a + a).
Ambiguity
Given a CFG G, and a word w ∈ L(G), there may be several different derivations of w from start
symbol S. Many of these derivations, however, will yield identical parse trees. But in the event
that two different derivation sequences of w from S yield two different parse trees, we call G
ambiguous. Equivalently, G is ambiguous if and only if some word w ∈ L(G) has more than one
leftmost derivation.
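The leftmost-derivation criterion can be checked by brute force for small grammars. The following sketch is my own illustration; the grammar S → S+S | a is a standard textbook example of ambiguity, not one from these notes. A count above 1 witnesses ambiguity.

```python
def count_leftmost_derivations(rules, start, target):
    """Count distinct leftmost derivations of `target`.
    rules maps a variable (uppercase) to a list of bodies (strings).
    Assumes no ε-rules and no unit-rule cycles, so the search terminates."""
    count = 0

    def expand(s):
        nonlocal count
        i = next((j for j, c in enumerate(s) if c.isupper()), None)
        if i is None:                      # no variables left: a terminal word
            if s == target:
                count += 1
            return
        # prune: terminal prefix must match target, and forms cannot shrink
        if s[:i] != target[:i] or len(s) > len(target):
            return
        for body in rules[s[i]]:           # expand the leftmost variable
            expand(s[:i] + body + s[i+1:])

    expand(start)
    return count
```

The word a+a+a has exactly two leftmost derivations under S → S+S | a, one per way of parenthesizing, so the grammar is ambiguous.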
Example 6. Show that the grammar defined by the following rules is ambiguous.
⟨SENTENCE⟩ → ⟨NOUN-PHRASE⟩⟨VERB-PHRASE⟩
⟨NOUN-PHRASE⟩ → ⟨COMPLEX-NOUN⟩ | ⟨COMPLEX-NOUN⟩⟨PREP-PHRASE⟩
⟨VERB-PHRASE⟩ → ⟨COMPLEX-VERB⟩ | ⟨COMPLEX-VERB⟩⟨PREP-PHRASE⟩
⟨PREP-PHRASE⟩ → ⟨PREP⟩⟨COMPLEX-NOUN⟩
⟨COMPLEX-NOUN⟩ → ⟨ARTICLE⟩⟨NOUN⟩
⟨COMPLEX-VERB⟩ → ⟨VERB⟩ | ⟨VERB⟩⟨NOUN-PHRASE⟩
⟨ARTICLE⟩ → a | the
⟨NOUN⟩ → trainer | dog | whistle
⟨VERB⟩ → calls | pets | sees
⟨PREP⟩ → with | in
Solution.
Chomsky Normal Form
Sometimes when working in the abstract with CFG’s, it is helpful to assume that derivations with
the CFG yield binary parse trees. It turns out that any CFG can be converted to one that has this
property, yet produces the same language. This can be done by converting the CFG to Chomsky
Normal Form (CNF).
A CFG G = (V, Σ, R, S) is in Chomsky Normal Form if and only if the following conditions hold.
1. The start variable S may not appear on the right-hand side of any rule in R.
2. The only ε-rule permitted is S → ε.
3. All other rules must be of the form A → BC, where B, C ∈ V − {S}, or A → a, where a ∈ Σ.
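These conditions are mechanical to check. Below is a small checker of my own (not from the notes), applied to the CNF grammar produced at the end of Example 7; rules are given as (head, body) pairs with bodies as tuples of symbols.

```python
def is_cnf(rules, start, variables):
    """Check the CNF conditions: start never on a right-hand side; the only
    ε-rule (empty body) is start -> ε; every other rule is A -> BC with
    B, C non-start variables, or A -> a with a a terminal."""
    for head, body in rules:
        if start in body:
            return False                      # condition 1
        if body == ():
            if head != start:
                return False                  # condition 2
        elif len(body) == 2:
            if not all(s in variables for s in body):
                return False                  # condition 3: A -> BC
        elif len(body) == 1:
            if body[0] in variables:
                return False                  # condition 3: A -> a, a terminal
        else:
            return False
    return True

# The CNF grammar from the end of Example 7
BODIES = [('A1', 'C'), ('S', 'C'), ('A', 'S'), ('B', 'B1'), ('b',)]
RULES = ([(h, b) for h in ('S0', 'S', 'A') for b in BODIES]
         + [('B', ('a',)), ('C', ('b',)), ('A1', ('A', 'S')), ('B1', ('b',))])
VARS = {'S0', 'S', 'A', 'B', 'C', 'A1', 'B1'}
```

Here `is_cnf(RULES, 'S0', VARS)` holds, while the original rule S → ASC fails the check.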
Algorithm for converting a CFG to one in CNF.
1. Add a new start variable S0 , along with the rule S0 → S. Call the resulting grammar Ĝ.
2. Remove ε-rules. While Ĝ has a rule of the form A → ε, where A is not the start variable:
– For each rule of the form B → u1 Au2 A · · · un Aun+1 (where A is not a symbol of any ui
word)
∗ Create 2^n new rules which represent the different possible ways to make a new rule
from B → u1 Au2 A · · · un Aun+1 by either keeping or removing each of the A variables.
∗ Remove any of these new 2^n rules that have the form D → ε and have already been
processed in the outer while-loop (otherwise the algorithm will loop forever)
– Add all the newly created rules from the previous loop to Ĝ
– Remove A → ε from Ĝ
3. Remove unit rules. While Ĝ has a rule of the form A → B, where B ∈ V : remove it, and for
each rule B → u, add the rule A → u, unless A → u is a unit rule that was already removed.
4. Shorten long rules. For each rule of the form A → u1 u2 · · · uk , where k ≥ 3:
– Replace this rule with the set of rules A → u1 A1 , A1 → u2 A2 , . . ., Ak−2 → uk−1 uk , where
the Ai variables are all new
5. For each remaining body of length two and each i, if ui is a terminal symbol, then replace it
with an unused variable Ui , and add the rule Ui → ui
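The 2^n keep-or-remove step can be written out directly. The sketch below is mine, not from the notes; it removes a single ε-rule, leaving the outer while-loop and the bookkeeping of already-processed ε-rules to the caller. Applying it to the grammar of Example 7 with the rule B → ε reproduces step 2 of that solution.

```python
from itertools import product

def remove_epsilon_rule(rules, A):
    """Remove the rule A -> ε. rules maps each head to a set of bodies,
    each body a tuple of symbols; ε is the empty tuple ()."""
    new = {}
    for head, bodies in rules.items():
        out = set()
        for body in bodies:
            idxs = [i for i, sym in enumerate(body) if sym == A]
            # 2^n variants: keep or drop each occurrence of A
            for keep in product((True, False), repeat=len(idxs)):
                kept = {i for i, k in zip(idxs, keep) if k}
                out.add(tuple(sym for i, sym in enumerate(body)
                              if i not in idxs or i in kept))
        new[head] = out
    new[A].discard(())          # finally, remove A -> ε itself
    return new

# Grammar of Example 7 after step 1
G = {'S': {('A', 'S', 'C'), ('B', 'b')},
     'A': {('C',), ('S',)},
     'B': {('a',), ()},
     'C': {('b',), ('B',)}}
```

Removing B → ε yields S → ASC | Bb | b, B → a, and C → b | B | ε, matching step 2 of the worked solution.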
Example 7. Apply the CNF conversion algorithm to the following CFG.
S → ASC|Bb
A → C|S
B → a|ε
C → b|B
Solution.
1. Add rule S0 → S
S0 → S
S → ASC|Bb
A → C|S
B → a|ε
C → b|B
2. Remove ε-rule B → ε
S0 → S
S → ASC|Bb|b
A → C|S
B→a
C → b|ε
3. Remove ε-rule C → ε
S0 → S
S → ASC|AS|Bb|b
A → ε|S
B→a
C→b
4. Remove ε-rule A → ε
S0 → S
S → ASC|SC|AS|S|Bb|b
A→S
B→a
C→b
5. Remove unit rule S0 → S
S0 → ASC|SC|AS|Bb|b
S → ASC|SC|AS|S|Bb|b
A→S
B→a
C→b
6. Remove unit rule S → S
S0 → ASC|SC|AS|Bb|b
S → ASC|SC|AS|Bb|b
A→S
B→a
C→b
7. Remove unit rule A → S
S0 → ASC|SC|AS|Bb|b
S → ASC|SC|AS|Bb|b
A → ASC|SC|AS|Bb|b
B→a
C→b
8. Add rule A1 → AS
S0 → A1 C|SC|AS|Bb|b
S → A1 C|SC|AS|Bb|b
A → A1 C|SC|AS|Bb|b
B→a
C→b
A1 → AS
9. Add rule B1 → b
S0 → A1 C|SC|AS|BB1 |b
S → A1 C|SC|AS|BB1 |b
A → A1 C|SC|AS|BB1 |b
B→a
C→b
A1 → AS
B1 → b
Pushdown Automata
After the DFA/NFA, the pushdown automaton (PDA) represents the next level of computational
power for state machines. PDA's are more powerful than DFA's because i) every regular language
can be accepted by a PDA, and ii) there are nonregular languages that can also be accepted by
PDA's. In fact, the definition of a PDA is quite similar to that of an NFA. The only difference is
that a PDA has access to an unbounded stack memory, onto which it can push symbols and from
which it can pop them.
Formally, a PDA is a 6-tuple M = (Q, Σ, Γ, δ, q0 , F ), where Q, Σ, q0 , and F are defined as for an
NFA, and
Γ is a finite stack alphabet that represents the symbols that may be pushed on to the stack
δ is a transition function
δ : Q × Σε × Γε → P(Q × Γε )
that takes as input a state-input-stack triple and maps it to a subset of state-stack pairs, where an
input triple represents the i) current state, ii) input symbol being read, and iii) stack symbol
being read/popped, and an output pair represents i) the next state and ii) the stack symbol
being pushed on to the stack.
Notice that the above definition incorporates nondeterminism, since there are possibly multiple next-
state/stack-symbol pairs. Technically, this definition is for an NPDA. DPDA's may also
be defined and are of great practical importance. However, their study is considered slightly more
advanced and is left as a topic for a more advanced ToC course. In this lecture, PDA is short for
NPDA.
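Concretely, δ can be represented as a finite table sending (state, input, stack) triples to sets of (state, push) pairs, with any unlisted triple mapping to the empty set. The sketch below is my own notation, not from the notes, and encodes the transitions of a PDA for {0^n 1^n | n ≥ 0} under assumed state names q1 through q4.

```python
EPS = ''   # ε, written as the empty string

# δ : Q × Σε × Γε → P(Q × Γε), as a dictionary of the non-empty entries.
delta = {
    ('q1', EPS, EPS): {('q2', '$')},   # push bottom-of-stack delimiter
    ('q2', '0', EPS): {('q2', '0')},   # push each 0 read
    ('q2', '1', '0'): {('q3', EPS)},   # first 1: pop a 0
    ('q3', '1', '0'): {('q3', EPS)},   # each further 1: pop a 0
    ('q3', EPS, '$'): {('q4', EPS)},   # delimiter on top: all 0s matched
}

def moves(q, a, s):
    """Return δ(q, a, s); unlisted triples map to the empty set."""
    return delta.get((q, a, s), set())
```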
Like NFA’s, a PDA may be represented with a state diagram using an almost identical notation.
However, in addition to labeling an edge with an input symbol (or ε), we must also include
s1 → s2 ,
where s1 , s2 ∈ Γε are such that s1 represents the current symbol being read/popped from the stack,
and s2 is the symbol to be pushed on to the stack.
Example 8. Draw the state diagram and provide the 6-tuple for a PDA that accepts the binary
language {0n 1n |n ≥ 0}. Recall that this language was proved to be nonregular in the Finite Automata
lecture.
[State diagram: states b, c, d, e; transition labels include ε, ε → $; 0, ε → 0; 1, 0 → ε (twice); ε, $ → ε; and ε, ε → ε.]
PDA computation. Let M = (Q, Σ, Γ, δ, q0 , F ) be a PDA, and w = w1 w2 · · · wm be a word in Σ∗ .
Then the computation of M on input w is a tree T (M, w) for which each node is labeled with a
configuration, i.e. a triple (q, r, st), where q ∈ Q is the current state, r ∈ Σε is the next symbol
of w to be read, and st ∈ Γ∗ is the current stack configuration, s being the top stack symbol, and t
representing the rest of the symbols on the stack. The tree is defined recursively as follows.
Base case The root of T (M, w) is labeled with (q0 , w1 , ε), where q0 is the initial state of M . Again,
the computation starts in the initial state with an empty stack, and w1 is the next input symbol
to be read.
Recursive case Let n be a node of T (M, w) that is labeled with (q, r, st), where q ∈ Q, r = wi for
some 1 ≤ i ≤ m, s ∈ Γε , and t ∈ Γ∗ . We now describe the kinds of children that n can have. In
what follows we assume q′ ∈ Q, r′ = wi+1 if i < m, and r′ = ε otherwise. Also, assume u ∈ Γ.
Case 1: Read only. For each pair (q′, ε) ∈ δ(q, r, ε), there is a child node labeled with (q′, r′, st).
Case 2: Read and pop. For each pair (q′, ε) ∈ δ(q, r, s), there is a child node labeled with (q′, r′, t).
Case 3: Read and push. For each pair (q′, u) ∈ δ(q, r, ε), there is a child node labeled with
(q′, r′, ust).
Case 4: Read, pop, and push. For each pair (q′, u) ∈ δ(q, r, s), there is a child node labeled with
(q′, r′, ut).
Case 5: No read and pop. For each pair (q′, ε) ∈ δ(q, ε, s), there is a child node labeled with
(q′, r, t).
Case 6: No read and push. For each pair (q′, u) ∈ δ(q, ε, ε), there is a child node labeled with
(q′, r, ust).
Case 7: No read, pop, and push. For each pair (q′, u) ∈ δ(q, ε, s), there is a child node labeled with
(q′, r, ut).
Case 8: No read, no pop, no push. For each pair (q′, ε) ∈ δ(q, ε, ε), there is a child node labeled
with (q′, r, st).
Finally, M accepts input w iff there is a branch of T (M, w) whose leaf node is labeled with a
configuration (q, r, st), where q ∈ F and r = ε. In other words, there is some branch that terminates
with i) an accepting state, ii) having read all the input, and iii) no further steps (e.g.
further pushes and pops) that can be taken in the computation. Such a branch is called an accepting
branch. If T (M, w) has no accepting branch, then M rejects w. Finally, L(M ) denotes the set of
input words accepted by M .
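The computation-tree definition translates directly into a breadth-first search over configurations (state, input position, stack). The sketch below is my own; it accepts as soon as some reachable configuration pairs an accept state with fully-read input, a slight simplification of the leaf condition above, and its configuration space is finite here because this PDA's ε-moves cannot grow the stack unboundedly. It is exercised on the {0^n 1^n} PDA under assumed state names q1 through q4.

```python
from collections import deque

EPS = ''

def pda_accepts(delta, start, finals, w):
    """BFS over configurations (state, #chars read, stack-as-tuple).
    delta maps (state, input symbol or ε, popped symbol or ε) to a set of
    (next state, pushed symbol or ε) pairs."""
    seen = {(start, 0, ())}
    queue = deque(seen)
    while queue:
        q, i, stack = queue.popleft()
        if q in finals and i == len(w):
            return True
        reads = [(EPS, i)] + ([(w[i], i + 1)] if i < len(w) else [])
        pops = [(EPS, stack)] + ([(stack[0], stack[1:])] if stack else [])
        for a, j in reads:                     # read a symbol, or read nothing
            for s, rest in pops:               # pop the top, or pop nothing
                for q2, push in delta.get((q, a, s), ()):
                    cfg = (q2, j, ((push,) if push else ()) + rest)
                    if cfg not in seen:
                        seen.add(cfg)
                        queue.append(cfg)
    return False

# PDA for {0^n 1^n | n >= 0} (assumed state names q1..q4)
delta = {
    ('q1', EPS, EPS): {('q2', '$')},
    ('q2', '0', EPS): {('q2', '0')},
    ('q2', '1', '0'): {('q3', EPS)},
    ('q3', '1', '0'): {('q3', EPS)},
    ('q3', EPS, '$'): {('q4', EPS)},
}
START, FINALS = 'q1', {'q1', 'q4'}
```

With these transitions the simulator accepts ε, 01, and 0011, and rejects 011, 10, and 0.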
Example 9. For the PDA M with state diagram shown below, the computation tree T (M, 0110) is
shown on the next page.
[State diagram: states a, b, c, d, e, f; transition labels include 0, ε → a; 1, a → ε; 1, ε → ε; ε, a → ε; 1, ε → a; 0, a → ε.]
[Computation tree T (M, 0110): the root is labeled (a, 0, ε); among the nodes on its branches are (f, 1, a) and (e, 1, aa).]
Example 10. Draw the state diagram and provide the 6-tuple for a PDA that accepts the binary
language that has an equal number of 1’s and 0’s.
Example 11. Let L = {xi y j z k |i, j, k ≥ 0 and j = 2i or k = 2j}. Provide the state diagram of a
PDA that accepts L.
On the Equivalence of PDA’s and CFG’s
In this section we show that i) every CFL can be accepted by some PDA and ii) for every language
L accepted by a PDA, there is a grammar G for which L = L(G).
Theorem 1. If L is a CFL, then there is a PDA M for which L = L(M ).
Proof of Theorem 1. Let G = (V, Σ, R, S) be a CFG and let L = L(G) be its associated CFL. We
design a PDA M that accepts L as follows. To begin, the delimiter stack symbol $ is first pushed
on to the stack, followed by variable S being pushed, followed by entering a self-looping main state
qloop that allows for the following kinds of transitions.
Terminal Matching If s ∈ Σ is the input symbol being read and is also on top of the stack, then
s is both read and popped from the stack.
Variable Substitution If variable A is on top of the stack and A → u is a rule, then A is popped
from the stack and the symbols of u are pushed on to the stack in reverse order (note: this
may require several push transitions before returning to qloop ). Note: this is where nondeterminism
plays an important role, since a single variable A can be the head of more than one
rule, and thus the computation will branch accordingly.
Read Delimiter If $ is on top of the stack, then transition to an accept state from which no further
input may be read.
Based on the above transitions, some thought and reflection should convince the reader that M
accepts exactly those terminal words that are derived by G. This is because, in order to be accepted,
the input symbols must match symbol-for-symbol some terminal word derived by the grammar.
The PDA defined above will be referred to as the natural PDA associated with CFG G.
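The two transition types of the natural PDA can be simulated directly: between the start and the delimiter, the stack always holds the unmatched suffix of some sentential form. The following sketch is mine, not from the notes; it assumes an ε-free grammar so that the stack length can be safely bounded by the remaining input, and it is exercised on the grammar of Example 3.

```python
def natural_pda_accepts(rules, start, w):
    """Simulate the main loop state of the natural PDA. A variable on top is
    replaced by a rule body (Variable Substitution); a terminal on top must
    match the next input symbol (Terminal Matching). Assumes an ε-free
    grammar, so each stack symbol must consume at least one input symbol."""
    seen = set()

    def run(pos, stack):
        if (pos, stack) in seen:
            return False
        seen.add((pos, stack))
        if not stack:
            return pos == len(w)          # empty stack: accept iff input read
        if len(stack) > len(w) - pos:     # pruning, valid for ε-free grammars
            return False
        top, rest = stack[0], stack[1:]
        if top in rules:                  # nondeterministic rule choice
            return any(run(pos, tuple(body) + rest) for body in rules[top])
        return pos < len(w) and w[pos] == top and run(pos + 1, rest)

    return run(0, (start,))

# Grammar of Example 3 (ε-free): E -> E+T | T, T -> T×F | F, F -> (E) | a
RULES = {'E': [('E', '+', 'T'), ('T',)],
         'T': [('T', '×', 'F'), ('F',)],
         'F': [('(', 'E', ')'), ('a',)]}
```

For instance, `natural_pda_accepts(RULES, 'E', 'a×(a+a)')` succeeds, mirroring the derivation of Example 5.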
Example 12. Draw a state diagram for the natural PDA associated with the grammar from Example 1,
having rules S → SS | aSb | ε.
Show an accepting branch of the computation tree for the word abab.
Defining a CFG for a PDA
In this section we make the following assumptions about an arbitrary PDA M that we consider in
the next theorem.
1. M has a single accept state qf ,
2. M either pushes or pops the stack on each transition, but not both, and
3. M empties its stack before accepting.
Note that for any PDA M , there is always a PDA N that meets the above criteria and for which
L(M ) = L(N ). Thus, no generality is lost by making these assumptions.
Theorem 2. Let PDA M = (Q, Σ, Γ, δ, q0 , {qf }) satisfy the above simplifying assumptions. Then
L(M ) = L(G) for some grammar G = (V, Σ, R, S).
We now define a CFG G = (V, Σ, R, S) for which L(G) = L(M ). The idea is to define, for every
pair of states p, q ∈ Q, a variable Apq so that Apq has the ability to derive w if and only if starting
in state p with the stack empty, a computation branch of M on input w transitions to state q, and
upon reaching q has an empty stack. There are two cases to consider.
Case 1: During the computation the stack remains nonempty (until the final step). In this case we add
the rule Apq → w1 Ars wn , where we have (r, t) ∈ δ(p, w1 , ε) and (q, ε) ∈ δ(s, wn , t). In words, we
know that r is a candidate for the 2nd state of the computation (assuming t ∈ Γ was pushed
first), while s is a candidate for the next-to-last state of the computation, since it is a state for
which wn can be read, and for which t can be popped. Notice how t must be popped at the
end, since it was the first symbol pushed, and the stack did not empty until the very end.
Case 2: During the computation the stack empties (before reaching the final state). Let r be the first
state in which the stack is empty during the computation. Then there is no guarantee that the
first symbol pushed is the last symbol popped. So we need a more conservative rule of the form
Apq → Apr Arq . Thus, Apq is allowing for Apr to derive the first part of w, while Arq will derive
the second part of w. In other words, the computation is cut up into pieces, where each piece
represents a computation that starts with an empty stack, and ends with an empty stack, with
the stack remaining nonempty until the end.
The rules for Case 1 will be called type-1, while the rules for Case 2 are called type-2. Putting it all
together, a CFG G for which L(G) = L(M ) will have start variable Aq0 ,qf , and the following rules:
1. For every pair of states p, q ∈ Q, the type-1 rules Apq → aArs b, for every a, b ∈ Σ and r, s ∈ Q
for which (r, t) ∈ δ(p, a, ε) and (q, ε) ∈ δ(s, b, t).
2. For every triple of states p, q, r ∈ Q, the type-2 rules Apq → Apr Arq .
The grammar defined above will be referred to as the natural grammar for PDA M .
Example 13. Recall the PDA from Example 8. Modify it so that it only accepts words of the form
0n 1n , for all n ≥ 1. Provide a natural grammar derivation of the word 000111.
0, ε → 0
ε, ε → $
a b
1, 0 → ε
ε, $ → ε
d c
1, 0 → ε
Type 1 rules Aad → Abc (push $ while moving from a to b; pop $ while moving from c to d)
Abc → 0Abc 1 (push 0 while moving from b to b; pop 0 while moving from c to c)
Abc → 0Abb 1 (push the last 0 while moving from b to b; pop the last 0 while moving from b to c)
ε rules Abb → ε
The requested derivation is then Aad ⇒ Abc ⇒ 0Abc 1 ⇒ 00Abc 11 ⇒ 000Abb 111 ⇒ 000111.
Example 14. Suppose a PDA has the following configuration sequence:
(a, 1, ε), (b, 0, x), (c, ε, yx), (b, ε, zyx), (a, 0, zzyx), (c, 1, zyx), (d, 1, yx), (a, 0, zyx), (e, ε, yx),
Pumping Lemma for CFL’s
Context-free languages have a pumping lemma similar to that for regular languages. Again, the
importance of the lemma is to help establish that a language is not context-free.
Pumping Lemma. If L is a CFL, then there is a positive integer p, called the pumping length,
such that, if s ∈ L has length at least p, then s may be divided into five pieces s = uvxyz and the
following three properties are satisfied.
1. uv^i xy^i z ∈ L for every i ≥ 0
2. |vy| > 0
3. |vxy| ≤ p
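To see the lemma in action on a language that is context-free, take L = {a^n b^n} with s = a^p b^p, and choose u = a^(p−1), v = a, x = ε, y = b, z = b^(p−1). The check below is my own sketch (the value of p is an arbitrary choice for illustration) and verifies all three properties.

```python
def in_L(w):
    """Membership in L = { a^n b^n | n >= 0 }."""
    n = len(w) // 2
    return w == 'a' * n + 'b' * n

p = 5                                             # a hypothetical pumping length
u, v, x, y, z = 'a' * (p - 1), 'a', '', 'b', 'b' * (p - 1)

assert len(v + y) > 0 and len(v + x + y) <= p     # properties 2 and 3
for i in range(4):
    assert in_L(u + v * i + x + y * i + z)        # property 1: pump both v and y
```

Pumping v and y together keeps the counts of a's and b's equal, which is why this choice of split never leaves L.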
Example 15. Prove that {an bn cn |n ≥ 0} is not context-free.
Exercises
1. Exercise 2.1 from ITC
2. Prove that the union of two CFL’s is also a CFL. Hint: how to take the “union” of two CFG’s?
9. For the CFG in Example 6, provide a parse tree for the word “the trainer calls the dog with
the whistle”, where we assume that it is the dog that possesses the whistle.
13. Explain why every regular language can be accepted by some PDA.
14. Draw the computation tree T (M, w), where M is the PDA from Example 8, and w is the word
i) ε, ii) 010, iii) 001, and iv) 011.
15. For the PDA M with state diagram shown below, draw the computation tree T (M, 001).
[State diagram: states a, b, c, d, e, f; transition labels include 0, ε → x; ε, x → ε; ε, ε → x; 1, ε → ε; 1, x → ε; 1, ε → x; 0, x → ε.]
16. Repeat the previous exercise, but now using input word 011.
17. Provide a PDA that accepts the language of all binary palindromes.
26. Provide the natural CFG for the PDA provided in Example 10, and use it to derive the word
011010.
Exercise Solutions
1. Exercise 2.1 from ITC
(a)
(b)
(c) We have
E ⇒E+T ⇒E+T +T ⇒T +T +T ⇒F +T +T ⇒
a + T + T ⇒ a + F + T ⇒ a + a + T ⇒ a + a + F ⇒ a + a + a.
[Parse tree for a + a + a: root E with children E, +, T ; the left E has children E, +, T ; each remaining E derives a via E ⇒ T ⇒ F ⇒ a, and each T via T ⇒ F ⇒ a.]
(d)
2. Suppose L1 and L2 are CFL’s, where L1 = L(G1 ) and L2 = L(G2 ), with G1 = (V1 , Σ1 , R1 , S1 )
and G2 = (V2 , Σ2 , R2 , S2 ). Without loss of generality, we may assume that V1 ∩ V2 = ∅, which
in turn implies that R1 ∩ R2 = ∅. Then define the grammar
G = (V1 ∪ V2 ∪ {S}, Σ1 ∪ Σ2 , R1 ∪ R2 ∪ {S → S1 | S2 }, S),
where S ∉ V1 ∪ V2 . The idea is that the rule S → S1 will allow for all words in L1 to be
derived, and the same is true for rule S → S2 and L2 . Moreover, since V1 and V2 are disjoint,
it is impossible to derive any word that does not belong to either L1 or L2 . Therefore,
L(G) = L1 ∪ L2 is a CFL.
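This construction can be carried out programmatically: combine the rule sets, add a fresh start variable with rules to each original start variable, and check a few memberships by bounded enumeration. Everything below is my own sketch; the two toy grammars (for a∗ and for {b}) are assumed examples with disjoint variable sets.

```python
from collections import deque

def union_grammar(g1, s1, g2, s2):
    """g1, g2: dicts mapping single-uppercase-letter variables to lists of
    bodies (strings); variable sets assumed disjoint (rename first if not)."""
    g = {'U': [s1, s2]}          # fresh start variable U with U -> S1 | S2
    g.update(g1)
    g.update(g2)
    return g, 'U'

def words_up_to(rules, start, max_len):
    """Terminal-only derived words of length <= max_len (bounded BFS)."""
    seen, out = {start}, set()
    queue = deque([start])
    while queue:
        s = queue.popleft()
        if all(c.islower() for c in s):
            out.add(s)
            continue
        i = next(j for j, c in enumerate(s) if c.isupper())
        for body in rules[s[i]]:
            t = s[:i] + body + s[i+1:]
            if len(t) <= max_len + 2 and t not in seen:
                seen.add(t)
                queue.append(t)
    return {w for w in out if len(w) <= max_len}

G1 = {'A': ['aA', '']}   # L1 = a*
G2 = {'B': ['b']}        # L2 = {b}
G, START = union_grammar(G1, 'A', G2, 'B')
```

The combined grammar derives every short word of a∗ and the word b, and nothing outside the union, in line with the disjointness argument above.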
3. (a) The intersection of languages A and B yields the set of words {an bn cn |n ≥ 0}. Example
2.36 proves that this language is not a CFL. Moreover, A and B are CFL’s (show this!).
Therefore, CFL’s are not closed under intersection.
(b) Suppose the class of CFL’s was closed under complement. Then, since CFL’s are closed
under union by the previous exercise, it then follows that
L1 ∪ L2 = L1 ∩ L2
must also be a CFL. But this is not always true by part (a). Therefore, the complement
of a CFL is not always itself a CFL.
4. Exercise 2.3 from ITC
(a)
(b)
(c) We have the following rules.
S → Z1DZ
Z → Z0 | ε
D → DZ1Z1Z | ε
The start rule S guarantees that at least one 1 is derived, while each application of the D
rule adds an even number (either 0 or 2) of additional 1’s. This gives a total
number of 1’s that is odd.
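This claim can be sanity-checked by bounded enumeration of derived words. The sketch is mine, with arbitrary length caps; every terminal word found should contain an odd number of 1's.

```python
from collections import deque

# The grammar from part (c): S -> Z1DZ, Z -> Z0 | ε, D -> DZ1Z1Z | ε
RULES = {'S': ['Z1DZ'], 'Z': ['Z0', ''], 'D': ['DZ1Z1Z', '']}

def sample_words(rules, start, form_cap, word_cap):
    """Terminal words of length <= word_cap found among sentential forms of
    length <= form_cap (a bounded, not exhaustive, exploration)."""
    seen, out = {start}, set()
    queue = deque([start])
    while queue:
        s = queue.popleft()
        if all(c in '01' for c in s):          # terminal-only word
            if len(s) <= word_cap:
                out.add(s)
            continue
        i = next(j for j, c in enumerate(s) if c.isupper())
        for body in rules[s[i]]:
            t = s[:i] + body + s[i+1:]
            if len(t) <= form_cap and t not in seen:
                seen.add(t)
                queue.append(t)
    return out
```

Sampling with `sample_words(RULES, 'S', 12, 6)` finds words such as 1 and 010, and every sampled word has an odd count of 1's.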
(d)
(e)
(f)
9. For the CFG in Example 6, provide a parse tree for the word “the trainer calls the dog with
the whistle”, where we assume that it is the dog that possesses the whistle.
13. Explain why every regular language can be accepted by some PDA.
14. Draw the computation tree T (M, w), where M is the PDA from Example 8, and w is the word
i) ε, ii) 010, iii) 001, and iv) 011.
15. For the PDA M with state diagram shown below, draw the computation tree T (M, 001).
[State diagram: states a, b, c, d, e, f; transition labels include 0, ε → x; ε, x → ε; ε, ε → x; 1, ε → ε; 1, x → ε; 1, ε → x; 0, x → ε.]
16. Repeat the previous exercise, but now using input word 011.
17. Provide a PDA that accepts the language of all binary palindromes.
26. Provide the natural CFG for the PDA provided in Example 10, and use it to derive the word
011010.