cse211slides03contextfreelanguages

The document discusses context-free grammars (CFGs) in the context of computation theory, providing definitions, examples, and methods for designing CFGs. It includes various grammar rules and derivations, illustrating how CFGs can generate specific languages, such as binary palindromes and nested parentheses. The content is intended for a course on the Theory of Computation, taught by Dr. Muhammad Masroor Ali at Bangladesh University of Engineering and Technology.


CSE 211 (Theory of Computation)

Context Free Languages

Dr. Muhammad Masroor Ali

Professor
Department of Computer Science and Engineering
Bangladesh University of Engineering and Technology

January 2023

Version: 3.2, Last modified: July 19, 2023


Context-Free Grammars
Sipser, 2.1, p-102

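Grammar G1:

A → 0A1
A → B
B → #

Substitution rules, also called productions or production rules: the three lines above.
Variables or non-terminal symbols: A and B.
Terminals or terminal symbols: 0, 1, and #.
Start variable or start symbol: A, the variable on the left-hand side of the first rule.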

Dr. Muhammad Masroor Ali CSE 211 (Theory of Computation) 2 / 95


Context-Free Grammars
Sipser, 2.1, p-102

Grammar G1 generates the string 000#111.
The sequence of substitutions to obtain a string is called a derivation.
A derivation of string 000#111 in grammar G1 is

A ⇒ 0A1        (using A → 0A1)
  ⇒ 00A11      (using A → 0A1)
  ⇒ 000A111    (using A → 0A1)
  ⇒ 000B111    (using A → B)
  ⇒ 000#111    (using B → #)

You may also represent the same information pictorially with a parse tree.

[Figure: parse tree for 000#111 in grammar G1 (Sipser, Figure 2.1, p-103)]
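The derivation of 000#111 can be replayed mechanically. The following is a minimal Python sketch, not part of the original slides: the names `RULES`, `apply_rule`, `derive`, and `STEPS` are illustrative, and each symbol is assumed to be a single character.

```python
# Rules of G1: each variable maps to its possible right-hand sides.
RULES = {"A": ["0A1", "B"], "B": ["#"]}


def apply_rule(form, variable, body):
    """Replace the leftmost occurrence of `variable` in `form` by `body`."""
    if body not in RULES[variable]:
        raise ValueError(f"{variable} -> {body} is not a rule of G1")
    index = form.find(variable)
    if index < 0:
        raise ValueError(f"{variable} does not occur in {form}")
    return form[:index] + body + form[index + 1:]


def derive(start, steps):
    """Return every intermediate string of the derivation, start included."""
    forms = [start]
    for variable, body in steps:
        forms.append(apply_rule(forms[-1], variable, body))
    return forms


# The five substitutions used in the derivation of 000#111.
STEPS = [("A", "0A1"), ("A", "0A1"), ("A", "0A1"), ("A", "B"), ("B", "#")]
DERIVATION = derive("A", STEPS)
print(" => ".join(DERIVATION))
```

Each entry of `DERIVATION` corresponds to one line of the derivation, so the list doubles as a check that every step uses a rule of G1.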


Context-Free Grammars
Hopcroft, Motwani, and Ullman, 5.1.1, p-170

Let us consider the language of (binary) palindromes.

Base cases:

P → ϵ
P → 0
P → 1

Recursive cases:

P → 0P0
P → 1P1

This can be written succinctly as

P → ϵ | 0 | 1 | 0P0 | 1P1
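The base cases and recursive cases of the palindrome grammar translate almost line by line into a recursive membership test. This is only an illustrative Python sketch; the function name `generated_by_p` is an assumption and does not come from the slides.

```python
def generated_by_p(w):
    """Return True if w can be derived from P -> ϵ | 0 | 1 | 0P0 | 1P1."""
    if any(ch not in "01" for ch in w):
        return False                      # only binary strings are considered
    if len(w) <= 1:
        return True                       # base cases: P -> ϵ, P -> 0, P -> 1
    if w[0] == w[-1]:
        return generated_by_p(w[1:-1])    # recursive cases: P -> 0P0, P -> 1P1
    return False


print(generated_by_p("0110"), generated_by_p("0111"))
```

The recursion peels one matching symbol off each end, which is the derivation read from the outside in.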




Formal Definition of a Context-Free Grammar
Sipser, Definition 2.2, p-104

Definition 2.2
A context-free grammar is a 4-tuple (V, Σ, R, S), where
1. V is a finite set called the variables,
2. Σ is a finite set, disjoint from V, called the terminals,
3. R is a finite set of rules, with each rule being a variable
and a string of variables and terminals, and
4. S ∈ V is the start variable.

If u, v, and w are strings of variables and terminals, and A → w is a rule of the grammar, we say that uAv yields uwv, written uAv ⇒ uwv.
Say that u derives v, written u ⇒* v, if u = v or if a sequence u1, u2, ..., uk exists for k ≥ 0 and
u ⇒ u1 ⇒ u2 ⇒ ··· ⇒ uk ⇒ v.
The language of the grammar is {w ∈ Σ* | S ⇒* w}.
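The 4-tuple and the "yields" relation can be written down directly. The sketch below is an illustration in Python under stated assumptions: symbols are single characters, the names `CFG`, `yields`, and `language_up_to` are made up for this example, and the grammar used is G1 from the first slides. The pruning rule in `language_up_to` is enough for G1 but would not guarantee termination for every grammar (for example one with a rule such as S → SS together with S → ϵ).

```python
from collections import deque, namedtuple

# A grammar as the 4-tuple (V, Σ, R, S).
CFG = namedtuple("CFG", ["variables", "terminals", "rules", "start"])

# G1; every rule is a (variable, right-hand side) pair.
G1 = CFG(
    variables={"A", "B"},
    terminals={"0", "1", "#"},
    rules=[("A", "0A1"), ("A", "B"), ("B", "#")],
    start="A",
)


def yields(grammar, form):
    """All strings uwv such that form = uAv and A -> w is a rule."""
    results = set()
    for i, symbol in enumerate(form):
        for head, body in grammar.rules:
            if symbol == head:
                results.add(form[:i] + body + form[i + 1:])
    return results


def language_up_to(grammar, max_len):
    """Terminal strings w with S =>* w and len(w) <= max_len (breadth-first)."""
    seen = {grammar.start}
    queue = deque([grammar.start])
    found = set()
    while queue:
        form = queue.popleft()
        if all(s in grammar.terminals for s in form):
            found.add(form)
            continue
        for nxt in yields(grammar, form):
            # Terminals never disappear, so forms with too many are dropped.
            count = sum(s in grammar.terminals for s in nxt)
            if count <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return found


print(sorted(language_up_to(G1, 7), key=len))
```

Repeated application of `yields` is exactly the relation ⇒*, so `language_up_to` lists a finite slice of the language of the grammar.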


Example
Sipser, Example 2.3, p-105

G3 = ({S}, {a, b}, R, S).


The set of rules, R, is S → aSb | SS | ϵ.
This grammar generates strings such as abab, aaabbb, and
aababb.
You can see more easily what this language is if you think
of a as a left parenthesis “(” and b as a right parenthesis “)”.
Viewed in this way, L(G3 ) is the language of all strings of
properly nested parentheses.
Observe that the right-hand side of a rule may be the
empty string ϵ.



Example
Sipser, Example 2.4, p-105

G4 = (V, Σ, R, <EXPR>).
V is {<EXPR>, <TERM>, <FACTOR>},
and Σ is {a, +, ×, (, )}.
The rules are,

<EXPR> → <EXPR> + <TERM> | <TERM>


<TERM> → <TERM> × <FACTOR> | <FACTOR>
<FACTOR> → (<EXPR>) | a

[Figure: parse trees in grammar G4 (Sipser, Figure 2.5, p-105)]


Designing CFGs
Adapted from https://web.stanford.edu/class/archive/cs/cs103/cs103.1208/lectures/16-CFGs/CFGs.pdf

Like designing DFAs, NFAs, and regular expressions, designing CFGs is a craft.

When thinking about CFGs:
Think recursively: build up bigger structures from smaller ones.
Have a construction plan: know in what order you will build up the string.
Store information in nonterminals: have each nonterminal correspond to some useful piece of information.


Designing Context-Free Grammars
Sipser, 2.1, p-106

First, many CFLs are the union of simpler CFLs; build a grammar for each piece and then combine them.
To get a grammar for the language
L = {0^n 1^n | n ≥ 0} ∪ {1^n 0^n | n ≥ 0},
first construct the grammar S1 → 0 S1 1 | ϵ for the language {0^n 1^n | n ≥ 0}.
Then construct the grammar S2 → 1 S2 0 | ϵ for the language {1^n 0^n | n ≥ 0}.
Then add the rule S → S1 | S2 to give the grammar

S → S1 | S2
S1 → 0 S1 1 | ϵ
S2 → 1 S2 0 | ϵ
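The union construction has a direct counterpart in code: one function per variable, and the start rule S → S1 | S2 becomes a logical "or". A minimal Python sketch follows; the function names `from_s1`, `from_s2`, and `from_s` are illustrative assumptions.

```python
def from_s1(w):
    """S1 -> 0 S1 1 | ϵ, that is, strings of the form 0^n 1^n."""
    if w == "":
        return True
    return w[0] == "0" and w[-1] == "1" and from_s1(w[1:-1])


def from_s2(w):
    """S2 -> 1 S2 0 | ϵ, that is, strings of the form 1^n 0^n."""
    if w == "":
        return True
    return w[0] == "1" and w[-1] == "0" and from_s2(w[1:-1])


def from_s(w):
    """S -> S1 | S2: w belongs to the union of the two languages."""
    return from_s1(w) or from_s2(w)


print(from_s("000111"), from_s("1100"), from_s("0110"))
```

Keeping the two pieces in separate functions mirrors the reason for using separate variables S1 and S2: the pieces never interfere with each other.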


Designing Context-Free Grammars
http://www.eecs.yorku.ca/course_archive/2006-07/F/2001/handouts/lect11.pdf

L = {0^n 1^(2n) | n ≥ 0}

The grammar forces every 0 to be matched with 11.
The context-free grammar for L is

S → 0S11 | ϵ


Designing Context-Free Grammars
http://www.eecs.yorku.ca/course_archive/2006-07/F/2001/handouts/lect11.pdf

L = {0^n 1^m | m, n ≥ 0, 2n ≤ m ≤ 3n}

The grammar forces every 0 to be matched with 11 or 111.
The context-free grammar for L is

S → 0S11 | 0S111 | ϵ
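One way to gain confidence in the grammar S → 0S11 | 0S111 | ϵ is to generate everything it derives with a bounded number of recursive steps and compare the result with the set definition of L. The Python sketch below does this; `derive_all` is a hypothetical helper name, not something from the source.

```python
def derive_all(max_zeros):
    """Strings derivable from S -> 0S11 | 0S111 | ϵ with at most
    `max_zeros` uses of the two recursive rules."""
    strings = {""}                                    # S -> ϵ
    for _ in range(max_zeros):
        strings |= {"0" + s + tail                    # S -> 0S11 or S -> 0S111
                    for s in strings
                    for tail in ("11", "111")}
    return strings


# The set definition of L, restricted to n <= 4.
EXPECTED = {"0" * n + "1" * m
            for n in range(5)
            for m in range(2 * n, 3 * n + 1)}

print(derive_all(4) == EXPECTED)
```

Using the rule 0S111 exactly j times out of n recursive steps gives m = 2n + j, which is why every value between 2n and 3n is reached.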


Designing Context-Free Grammars
http://www.eecs.yorku.ca/course_archive/2006-07/F/2001/handouts/lect11.pdf

L = {0^n 1^m | m, n ≥ 0, n ≠ m}

Let L1 = {0^n 1^m | m, n ≥ 0, n > m}
Let L2 = {0^n 1^m | m, n ≥ 0, n < m}
Then, if S1 generates L1 and S2 generates L2, our grammar will be

S → S1 | S2

L1 is just the language of strings 0^n 1^n with one or more extra 0's in front.
So,

S1 → 0 S1 | 0E
E → 0E1 | ϵ

L2 is just the language of strings 0^n 1^n with one or more extra 1's at the end.
So,

S2 → S2 1 | E1
E → 0E1 | ϵ

Finally, our desired grammar is

S → S1 | S2
S1 → 0 S1 | 0E
S2 → S2 1 | E1
E → 0E1 | ϵ
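The grammar for n ≠ m can be checked in the same spirit, with one generator function per variable. This is a hedged Python sketch: the function names and the `depth` bound (the maximum number of recursive rule applications per variable) are assumptions introduced only for this illustration.

```python
def gen_e(depth):
    """E -> 0E1 | ϵ, using the recursive rule at most `depth` times."""
    if depth == 0:
        return {""}
    return {""} | {"0" + s + "1" for s in gen_e(depth - 1)}


def gen_s1(depth):
    """S1 -> 0 S1 | 0E: extra 0's in front of a string 0^k 1^k."""
    strings = {"0" + e for e in gen_e(depth)}               # S1 -> 0E
    if depth > 0:
        strings |= {"0" + s for s in gen_s1(depth - 1)}     # S1 -> 0 S1
    return strings


def gen_s2(depth):
    """S2 -> S2 1 | E1: extra 1's after a string 0^k 1^k."""
    strings = {e + "1" for e in gen_e(depth)}               # S2 -> E1
    if depth > 0:
        strings |= {s + "1" for s in gen_s2(depth - 1)}     # S2 -> S2 1
    return strings


def gen_s(depth):
    """S -> S1 | S2."""
    return gen_s1(depth) | gen_s2(depth)


# Every string 0^n 1^m with n != m and n, m <= 5.
EXPECTED = {"0" * n + "1" * m
            for n in range(6) for m in range(6) if n != m}
print(gen_s(4) == EXPECTED)
```

The variable E is shared by S1 and S2, just as in the final grammar, which is why `gen_e` is called from both generators.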


Designing Context-Free Grammars
http://www.andrew.cmu.edu/user/ko/pdfs/lecture-7.pdf

L = {w | w ∈ {a, b}*, n_a(w) = n_b(w)}, where n_a(w) and n_b(w) are the numbers of a's and b's in w.

The grammar generates the basis strings ϵ, ab, and ba.
If w is a string in the language, then awb also belongs to the language. It is generated by using the rule S → aSb.
If w is a string in the language, then bwa also belongs to the language. It is generated by using the rule S → bSa.
If u and w are strings in the language, then uw also belongs to the language. It is generated by using the rule S → SS.

S → aSb | bSa | SS | ϵ
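Because of the rule S → SS, a simple depth-bounded generator is awkward here. A length-bounded fixpoint works instead: start from ϵ and keep applying the rules until no new string of the allowed length appears. The Python sketch below is an illustration only; `bounded_language` is a hypothetical name. The length bound is safe because every sub-derivation produces a substring of the final string.

```python
def bounded_language(max_len):
    """Strings of length <= max_len derivable from S -> aSb | bSa | SS | ϵ."""
    lang = {""}                                          # S -> ϵ
    while True:
        new = set(lang)
        new |= {"a" + w + "b" for w in lang}             # S -> aSb
        new |= {"b" + w + "a" for w in lang}             # S -> bSa
        new |= {u + v for u in lang for v in lang}       # S -> SS
        new = {w for w in new if len(w) <= max_len}
        if new == lang:                                  # nothing new: done
            return lang
        lang = new


LANG = bounded_language(6)
print(sorted(w for w in LANG if len(w) == 2))
```

Comparing `LANG` with a direct count of a's and b's for every string up to the length bound is a quick sanity check of the grammar.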


Designing Context-Free Grammars
http://www.eecs.yorku.ca/course_archive/2006-07/F/2001/handouts/lect11.pdf

L = {w | w ∈ {0, 1}* and w is of even length}

The grammar generates the basis strings ϵ, 00, 01, 10, and 11.
If w is a string in the language, then 0w0, 0w1, 1w0, and 1w1 also belong to the language.
They are generated by using the rules S → 0S0, S → 0S1, S → 1S0, and S → 1S1, respectively.

S → ϵ | 0S0 | 0S1 | 1S0 | 1S1


Designing Context-Free Grammars
http://www.cs.toronto.edu/~azadeh/page11/page12/material/hw5-sol.pdf

L = {a^n b^m c^k | n, m, k ≥ 0 and n = m + k}

Every b should match an a.
Every c should match an a.
Thinking recursively, we will want to build the a^s ... c^s part first.
And then build the a^r b^r inside the previously built a^s ... c^s, like a^s a^r b^r c^s.
Can we go in the other direction, that is, build the a^r b^r first and then build the a^s ... c^s?

S → aSc | B
B → aBb | ϵ
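The build order (outer a^s ... c^s first, inner a^r b^r second) can be traced step by step. In this Python sketch the function `derive` and its parameters `s` and `r` are illustrative assumptions; it relies on the fact that each sentential form contains exactly one variable.

```python
def derive(s, r):
    """Derive a^(s+r) b^r c^s: apply S -> aSc s times, then S -> B,
    then B -> aBb r times, and finally B -> ϵ."""
    forms = ["S"]
    for _ in range(s):
        forms.append(forms[-1].replace("S", "aSc"))   # outer a ... c pairs
    forms.append(forms[-1].replace("S", "B"))         # switch to the inner part
    for _ in range(r):
        forms.append(forms[-1].replace("B", "aBb"))   # inner a ... b pairs
    forms.append(forms[-1].replace("B", ""))          # B -> ϵ
    return forms


print(" => ".join(derive(2, 1)))
```

With s = 2 and r = 1 the final string has n = 3, m = 1, and k = 2, so n = m + k as required.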


Example
Adapted from https://web.stanford.edu/class/archive/cs/cs103/cs103.1208/lectures/16-CFGs/CFGs.pdf

Let Σ = {{, }} and let L = {w ∈ Σ* | w is a string of balanced braces}.
Some sample strings in L, for example: ϵ, {}, {}{}, {{}}, {{}{}}.

Let's think about this recursively.
Base case: the empty string is a string of balanced braces.
Recursive step: Look at the closing brace that matches the first open brace.
The braces enclosed by that pair form a balanced string, and so does everything after the closing brace.

S → {S} S | ϵ
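The rule S → {S} S | ϵ can be read as a small recursive parser: on an open brace, parse the inner S, expect the matching close brace, then parse the trailing S. The following Python sketch shows this reading under the assumption that the input contains only brace characters; `parse_s` and `balanced` are hypothetical names.

```python
def parse_s(w, i=0):
    """Consume a prefix of w[i:] derivable from S -> {S}S | ϵ and
    return the index just after it."""
    if i < len(w) and w[i] == "{":            # try S -> {S}S
        j = parse_s(w, i + 1)                 # the S inside the braces
        if j < len(w) and w[j] == "}":        # the matching closing brace
            return parse_s(w, j + 1)          # the S after the closing brace
    return i                                  # S -> ϵ consumes nothing


def balanced(w):
    """w is in L exactly when S derives all of w."""
    return parse_s(w) == len(w)


print(balanced("{{}{}}"), balanced("{}}{"))
```

The two recursive calls correspond to the two occurrences of S on the right-hand side of the rule.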


Designing CFGs: Storing Information in Nonterminals
Adapted from https://web.stanford.edu/class/archive/cs/cs103/cs103.1208/lectures/16-CFGs/CFGs.pdf

Different non-terminals should represent different states or different types of strings.
For example, different phases of the build, or different possible structures for the string.
This is the same idea as in DFA/NFA design, where the states of your automaton represent pieces of information.


Example
Adapted from https://web.stanford.edu/class/archive/cs/cs103/cs103.1208/lectures/16-CFGs/CFGs.pdf

Let Σ = {a, b} and let

L = {w ∈ Σ* | the length of w is a multiple of 3 and all the characters in the first third of w are the same}.

Examples (where shown, the bar marks the end of the first third):

In L            Not in L
ϵ               a
a|bb            b
b|ab            ab|abab
aa|baba         aab|aaaaaa
bb|bbbb         bbbb


Some strings in L, grouped by whether the first third is a's (left) or b's (right):

aaa          bab
abb          bbb
aaabab       bbabbb
aababa       bbbaaaaaa
aaaaaaaaa    bbbbbabaa

Observation 1: Every string in this language has a first third made up entirely of a's or entirely of b's.
Observation 2: Amongst these strings, for every a we have in the first third, we need two other characters in the last two thirds.
This pattern of "for every x we see here, we need a y somewhere else in the string" is very common in CFGs!

A → aAXX | ϵ
X → a | b

Here the nonterminal A represents "a string where the first third is a's".
The nonterminal X represents "any character".

B → bBXX | ϵ
X → a | b

Here the nonterminal B represents "a string where the first third is b's".

Tying everything together:

S → A | B
A → aAXX | ϵ
B → bBXX | ϵ
X → a | b

Overall, strings in this language follow the pattern of either A or B.
The ϵ-rules of A and B must stay: without them, neither variable could ever derive a string of terminals.
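The "for every a in the first third, two more characters later" pattern can be checked by generating strings from A and B and comparing them with a direct test of the definition. This Python sketch is illustrative only: `gen` and `in_language` are hypothetical names, and `k` counts how often the recursive rule is used.

```python
from itertools import product


def gen(letter, k):
    """Strings from A -> aAXX | ϵ (letter 'a') or B -> bBXX | ϵ (letter 'b'),
    using the recursive rule exactly k times."""
    if k == 0:
        return {""}                                           # the ϵ-rule
    pairs = ["".join(p) for p in product("ab", repeat=2)]     # X X, X -> a | b
    return {letter + inner + tail
            for inner in gen(letter, k - 1)
            for tail in pairs}


def in_language(w):
    """Direct check: length is a multiple of 3 and the first third
    consists of a single repeated character."""
    third, remainder = divmod(len(w), 3)
    return remainder == 0 and len(set(w[:third])) <= 1


print(sorted(gen("a", 1)))
```

Each recursive step adds one character to the first third and two arbitrary characters to the rest, which keeps the 1 : 2 proportion intact.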


Summary of CFG Design Tips
Adapted from https://web.stanford.edu/class/archive/cs/cs103/cs103.1208/lectures/16-CFGs/CFGs.pdf

Look for recursive structures where they exist. They can help guide you toward a solution.
Keep the build order in mind: often, you'll build two totally different parts of the string concurrently.
Usually, those parts are built in opposite directions. One's built right-to-left, the other left-to-right.
Use different nonterminals to represent different structures.


Designing Context-Free Grammars
Sipser, 2.1, p-107

Second, constructing a CFG for a language that happens to be regular is easy if you can first construct a DFA for that language.
You can convert any DFA into an equivalent CFG as follows.
Make a variable Ri for each state qi of the DFA.
Add the rule Ri → aRj to the CFG if δ(qi, a) = qj is a transition in the DFA.
Add the rule Ri → ϵ if qi is an accept state of the DFA.
Make R0 the start variable of the grammar, where q0 is the start state of the machine.

[Figure: state diagram of a DFA (Sipser, Figure 1.22, p-44)]

So, the resulting grammar is

P → 0Q | 1P
Q → 0R | 1P
R → 0R | 1S
S → 0S | 1S | ϵ
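The four conversion steps can be sketched as a short function. This is an illustration in Python, not the slides' own code: `dfa_to_cfg` is a hypothetical name, and the DFA below (states P, Q, R, S with start state P and accept state S) is reconstructed from the rules of the resulting grammar, since the state diagram itself is not reproduced here.

```python
def dfa_to_cfg(states, delta, start, accepting):
    """Return (rules, start_variable). Each state doubles as its variable."""
    rules = []
    for state in states:
        for (source, symbol), target in sorted(delta.items()):
            if source == state:
                rules.append((state, symbol + target))   # Ri -> a Rj
        if state in accepting:
            rules.append((state, ""))                    # Ri -> ϵ
    return rules, start


# Assumed DFA, read off from the grammar above.
STATES = ["P", "Q", "R", "S"]
DELTA = {
    ("P", "0"): "Q", ("P", "1"): "P",
    ("Q", "0"): "R", ("Q", "1"): "P",
    ("R", "0"): "R", ("R", "1"): "S",
    ("S", "0"): "S", ("S", "1"): "S",
}
RULES, START = dfa_to_cfg(STATES, DELTA, "P", {"S"})

for head, body in RULES:
    print(head, "->", body or "ϵ")
```

Every transition contributes exactly one rule and every accept state one ϵ-rule, so the grammar has as many rules as the DFA has transitions plus accept states.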


Third, certain context-free languages contain strings with two substrings that are "linked" in the sense that a machine for such a language would need to remember an unbounded amount of information about one of the substrings to verify that it corresponds properly to the other substring.
This situation occurs in the language {0^n 1^n | n ≥ 0} because a machine would need to remember the number of 0s in order to verify that it equals the number of 1s.
You can construct a CFG to handle this situation by using a rule of the form R → uRv, which generates strings wherein the portion containing the u's corresponds to the portion containing the v's.

Finally, in more complex languages, the strings may contain certain structures that appear recursively as part of other (or the same) structures.
That situation occurs in the grammar that generates arithmetic expressions.

E → E + T | T
T → T × F | F
F → (E) | a

Any time the symbol a appears, an entire parenthesized expression might appear recursively instead.
To achieve this effect, place the variable symbol generating the structure in the location of the rules corresponding to where that structure may recursively appear.
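The recursive appearance of (E) inside F shows up clearly in a recursive-descent recognizer. The Python sketch below is an illustration with one deliberate deviation, stated openly: the left-recursive rules E → E + T and T → T × F are read as "T followed by any number of + T" and "F followed by any number of × F", which describes the same strings but avoids infinite recursion. The name `accepts` is hypothetical.

```python
def accepts(text):
    """Return True if `text` is generated by
    E -> E + T | T,  T -> T × F | F,  F -> (E) | a."""
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else None

    def expr():                       # E, read as T (+ T)*
        nonlocal pos
        if not term():
            return False
        while peek() == "+":
            pos += 1
            if not term():
                return False
        return True

    def term():                       # T, read as F (× F)*
        nonlocal pos
        if not factor():
            return False
        while peek() == "×":
            pos += 1
            if not factor():
                return False
        return True

    def factor():                     # F -> (E) | a
        nonlocal pos
        if peek() == "a":
            pos += 1
            return True
        if peek() == "(":
            pos += 1
            if not expr() or peek() != ")":
                return False
            pos += 1
            return True
        return False

    return expr() and pos == len(text)


print(accepts("a+a×a"), accepts("(a+a)×a"), accepts("a+"))
```

The call from `factor` back to `expr` is where a whole parenthesized expression takes the place of a single a.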


Leftmost and Rightmost Derivations
Hopcroft, Motwani, and Ullman, 5.1.4, p-175

We want to restrict the number of choices we have in deriving a string.
It is often useful to require that at each step we replace the leftmost variable by one of its production bodies.
Such a derivation is called a leftmost derivation.
We indicate that a derivation is leftmost by using the relations ⇒_lm and ⇒*_lm, for one or many steps, respectively.
If the grammar G that is being used is not obvious, we can place the name G below the arrow in either of these symbols.
Similarly, it is possible to require that at each step the rightmost variable is replaced by one of its bodies.
If so, we call the derivation rightmost and use the symbols ⇒_rm and ⇒*_rm to indicate one or many rightmost derivation steps, respectively.
Again, the name of the grammar may appear below these symbols if it is not clear which grammar is being used.


Example
Hopcroft, Motwani, and Ullman, Example 5.6, p-176

A leftmost derivation.
E ⇒ E∗E
⇒ I∗E
⇒ a∗E
⇒ a ∗ (E)
⇒ a ∗ (E + E)
⇒ a ∗ (I + E)
⇒ a ∗ (a + E)
⇒ a ∗ (a + I)
⇒ a ∗ (a + I0)
⇒ a ∗ (a + I00)
⇒ a ∗ (a + b00)

Thus, we can describe the same derivation by:

E ⇒_lm E ∗ E ⇒_lm I ∗ E ⇒_lm a ∗ E ⇒_lm
a ∗ (E) ⇒_lm a ∗ (E + E) ⇒_lm a ∗ (I + E) ⇒_lm
a ∗ (a + E) ⇒_lm a ∗ (a + I) ⇒_lm a ∗ (a + I0) ⇒_lm
a ∗ (a + I00) ⇒_lm a ∗ (a + b00)

We can also summarize the leftmost derivation by saying E ⇒*_lm a ∗ (a + b00),
or express several steps of the derivation by expressions such as E ∗ E ⇒*_lm a ∗ (E).


Example — continued
Hopcroft, Motwani, and Ullman, Example 5.6, p-176

There is a rightmost derivation that uses the same


replacements for each variable.
Although it makes the replacements in different order.

Dr. Muhammad Masroor Ali CSE 211 (Theory of Computation) 56 / 95


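This rightmost derivation is:

$$
\begin{aligned}
E &\Rightarrow_{rm} E * E \\
&\Rightarrow_{rm} E * (E) \\
&\Rightarrow_{rm} E * (E + E) \\
&\Rightarrow_{rm} E * (E + I) \\
&\Rightarrow_{rm} E * (E + I0) \\
&\Rightarrow_{rm} E * (E + I00) \\
&\Rightarrow_{rm} E * (E + b00) \\
&\Rightarrow_{rm} E * (I + b00) \\
&\Rightarrow_{rm} E * (a + b00) \\
&\Rightarrow_{rm} I * (a + b00) \\
&\Rightarrow_{rm} a * (a + b00)
\end{aligned}
$$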

This derivation allows us to conclude $E \Rightarrow^{*}_{rm} a * (a + b00)$.
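
As a minimal sketch, the leftmost and rightmost derivations of a ∗ (a + b00) can be replayed in Python. The helper `derive` and the lists `LEFT` and `RIGHT` are illustrative names, not part of the textbook, and the sketch assumes every grammar symbol is a single character. Each step names a production as a (head, body) pair and is applied to the leftmost or rightmost variable of the current string.

```python
VARIABLES = {"E", "I"}  # variables of the expression grammar


def derive(start, steps, leftmost=True):
    """Apply (head, body) productions to the leftmost or rightmost variable."""
    form = start
    forms = [form]
    for head, body in steps:
        positions = [i for i, s in enumerate(form) if s in VARIABLES]
        i = positions[0] if leftmost else positions[-1]
        if form[i] != head:
            raise ValueError(f"expected {head} at position {i} of {form!r}")
        form = form[:i] + body + form[i + 1:]
        forms.append(form)
    return forms


# Productions in the order used by the leftmost derivation.
LEFT = [("E", "E*E"), ("E", "I"), ("I", "a"), ("E", "(E)"), ("E", "E+E"),
        ("E", "I"), ("I", "a"), ("E", "I"), ("I", "I0"), ("I", "I0"), ("I", "b")]

# The same productions in the order used by the rightmost derivation.
RIGHT = [("E", "E*E"), ("E", "(E)"), ("E", "E+E"), ("E", "I"), ("I", "I0"),
         ("I", "I0"), ("I", "b"), ("E", "I"), ("I", "a"), ("E", "I"), ("I", "a")]

print(" => ".join(derive("E", LEFT)))
print(" => ".join(derive("E", RIGHT, leftmost=False)))
```

Both calls end in the same terminal string, and sorting the two lists shows that they contain the same productions in a different order.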
Leftmost and Rightmost Derivations — continued
Hopcroft, Motwani, and Ullman, 5.1.4, p-177

Any derivation has an equivalent leftmost and an equivalent rightmost derivation.
That is, if w is a terminal string, and A a variable, then $A \Rightarrow^{*} w$ if and only if $A \Rightarrow^{*}_{lm} w$, and $A \Rightarrow^{*} w$ if and only if $A \Rightarrow^{*}_{rm} w$.



The Language of a Grammar
Hopcroft, Motwani, and Ullman, 5.1.5, p-177

If G(V, T, P, S) is a CFG, the language of G, denoted L(G), is the set of terminal strings that have derivations from the start symbol.

That is,

$$L(G) = \{\, w \text{ in } T^{*} \mid S \Rightarrow^{*}_{G} w \,\}.$$
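
The definition can be made concrete by listing the short strings of a language. The sketch below is an illustration, not a general algorithm: the function name `language_up_to` is hypothetical, symbols are assumed to be single characters, and the length bound on sentential forms assumes at most one variable per form, which holds for the palindrome grammar P → ϵ | 0 | 1 | 0P0 | 1P1 used here. It explores derivations breadth-first and keeps the terminal strings it reaches.

```python
from collections import deque

# Palindrome grammar: P -> ε | 0 | 1 | 0P0 | 1P1 (ε is the empty string).
PRODUCTIONS = {"P": ["", "0", "1", "0P0", "1P1"]}


def language_up_to(productions, start, max_len):
    """Return the terminal strings of length <= max_len derivable from start."""
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        # Locate the leftmost variable; a form without variables is a word.
        pos = next((i for i, s in enumerate(form) if s in productions), None)
        if pos is None:
            words.add(form)
            continue
        for body in productions[form[pos]]:
            new = form[:pos] + body + form[pos + 1:]
            terminals = sum(1 for s in new if s not in productions)
            # Prune forms that can no longer lead to a short enough word.
            if terminals <= max_len and len(new) <= max_len + 1 and new not in seen:
                seen.add(new)
                queue.append(new)
    return words


print(sorted(language_up_to(PRODUCTIONS, "P", 3), key=lambda w: (len(w), w)))
```

Expanding only the leftmost variable loses nothing here, since every terminal string that has a derivation also has a leftmost one.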



The Language of a Grammar — continued
Hopcroft, Motwani, and Ullman, 5.1.5, p-177

If a language L is the language of some context-free grammar, then L is said to be a context-free language, or CFL.



Sentential Forms
Hopcroft, Motwani, and Ullman, 5.1.6, p-178

Derivations from the start symbol produce strings that have a special role.
We call these “sentential forms.”

That is, if G(V, T, P, S) is a CFG, then any string α in $(V \cup T)^{*}$ such that $S \Rightarrow^{*} \alpha$ is a sentential form.

Sentential Forms — continued
Hopcroft, Motwani, and Ullman, 5.1.6, p-178


If $S \Rightarrow^{*}_{lm} \alpha$, then α is a left-sentential form.
And if $S \Rightarrow^{*}_{rm} \alpha$, then α is a right-sentential form.
Note that the language L(G) is those sentential forms that are in $T^{*}$; i.e., they consist solely of terminals.



Example
Hopcroft, Motwani, and Ullman, 5.1.6, p-178

Consider the grammar for expressions,

1. E → I
2. E → E + E
3. E → E ∗ E
4. E → (E)
5. I → a
6. I → b
7. I → Ia
8. I → Ib
9. I → I0
10. I → I1

For example, E ∗ (I + E) is a sentential form, since there is a derivation,

E ⇒ E ∗ E ⇒ E ∗ (E) ⇒ E ∗ (E + E) ⇒ E ∗ (I + E).

However, this derivation is neither leftmost nor rightmost, since at the last step, the middle E is replaced.
As an example of a left-sentential form, consider a ∗ E, with the leftmost derivation,

$$E \Rightarrow_{lm} E * E \Rightarrow_{lm} I * E \Rightarrow_{lm} a * E.$$

Additionally, the derivation,

$$E \Rightarrow_{rm} E * E \Rightarrow_{rm} E * (E) \Rightarrow_{rm} E * (E + E)$$

shows that E ∗ (E + E) is a right-sentential form.
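
The three derivations in this example can be checked mechanically. The following is a sketch under stated assumptions: `replaced_positions` and `classify` are hypothetical helper names, and each grammar symbol is taken to be one character. For every step it finds which variable could have been replaced and records whether the leftmost or the rightmost variable is among the candidates.

```python
GRAMMAR = {
    "E": ["I", "E+E", "E*E", "(E)"],
    "I": ["a", "b", "Ia", "Ib", "I0", "I1"],
}


def replaced_positions(grammar, form, nxt):
    """Positions i where replacing the variable form[i] by one body gives nxt."""
    return [
        i
        for i, sym in enumerate(form)
        if sym in grammar
        and any(form[:i] + body + form[i + 1:] == nxt for body in grammar[sym])
    ]


def classify(grammar, derivation):
    """Return (is_leftmost, is_rightmost) for a list of sentential forms."""
    leftmost = rightmost = True
    for form, nxt in zip(derivation, derivation[1:]):
        hits = replaced_positions(grammar, form, nxt)
        if not hits:
            raise ValueError(f"{form!r} => {nxt!r} is not a derivation step")
        variables = [i for i, s in enumerate(form) if s in grammar]
        leftmost = leftmost and variables[0] in hits
        rightmost = rightmost and variables[-1] in hits
    return leftmost, rightmost


print(classify(GRAMMAR, ["E", "E*E", "E*(E)", "E*(E+E)", "E*(I+E)"]))
print(classify(GRAMMAR, ["E", "E*E", "I*E", "a*E"]))
print(classify(GRAMMAR, ["E", "E*E", "E*(E)", "E*(E+E)"]))
```

The first call reports neither property, the second reports a leftmost derivation, and the third a rightmost one, matching the discussion above.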



Parse Trees
Hopcroft, Motwani, and Ullman, 5.2, p-181

There is a tree representation for derivations that has proved extremely useful.
This tree shows us clearly how the symbols of a terminal string are grouped into substrings.
Each of these substrings belongs to the language of one of the variables of the grammar.



Parse Trees — continued
Hopcroft, Motwani, and Ullman, 5.2, p-181

The tree, known as a “parse tree” when used in a compiler, is the data structure of choice to represent the source program.
In a compiler, the tree structure of the source program facilitates the translation of the source program into executable code by allowing natural, recursive functions to perform this translation process.



Parse Trees — continued
Hopcroft, Motwani, and Ullman, 5.2, p-181

Certain grammars allow a terminal string to have more than one parse tree.
That situation makes the grammar unsuitable for a programming language.
The compiler could not tell the structure of certain source programs, and therefore could not with certainty deduce what the proper executable code for the program was.



Constructing Parse Trees
Hopcroft, Motwani, and Ullman, 5.2.1, p-181

Let us fix on a grammar G(V, T, P, S).
The parse trees for G are trees with the following conditions:

1. Each interior node is labeled by a variable in V.

2. Each leaf is labeled by either a variable, a terminal, or ϵ.
However, if the leaf is labeled ϵ, then it must be the only child of its parent.

3. If an interior node is labeled A, and its children are labeled $X_1, X_2, \ldots, X_k$ respectively, from the left, then $A \to X_1 X_2 \cdots X_k$ is a production in P.
Note that the only time one of the X’s can be ϵ is if that is the label of the only child, and A → ϵ is a production of G.
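
The three conditions translate directly into a recursive check. The sketch below is illustrative only: the `Node` class and `is_parse_tree` function are hypothetical names, symbols are assumed to be single characters, ϵ is written as the empty string, and leaf labels are not checked against V and T. It is shown on the palindrome grammar P → ϵ | 0 | 1 | 0P0 | 1P1.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                      # a variable, a terminal, or "" for ε
    children: list = field(default_factory=list)


PALINDROME = {"P": ["", "0", "1", "0P0", "1P1"]}


def is_parse_tree(node, productions):
    """Check conditions 1-3 for the subtree rooted at node."""
    if not node.children:                       # condition 2: any leaf label is allowed
        return True
    if node.label not in productions:           # condition 1: interior nodes are variables
        return False
    labels = [child.label for child in node.children]
    if "" in labels and len(labels) > 1:        # ε must be an only child
        return False
    if "".join(labels) not in productions[node.label]:   # condition 3
        return False
    return all(is_parse_tree(child, productions) for child in node.children)


# Tree for 0110: P -> 0P0, then P -> 1P1, then P -> ε.
good = Node("P", [Node("0"),
                  Node("P", [Node("1"), Node("P", [Node("")]), Node("1")]),
                  Node("0")])
# The children 0, P, 1 do not spell the body of any production.
bad = Node("P", [Node("0"), Node("P"), Node("1")])

print(is_parse_tree(good, PALINDROME), is_parse_tree(bad, PALINDROME))
```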



Example
Hopcroft, Motwani, and Ullman, Example 5.9, p-182

The figure shows a parse tree that uses the expression grammar.

The production used at the root is E → E + E.
At the leftmost child of the root, the production E → I is used.



Example
Hopcroft, Motwani, and Ullman, Example 5.10, p-182

Figure shows a parse tree for the palindromic grammar.



Example — continued
Hopcroft, Motwani, and Ullman, Example 5.10, p-182

P → ϵ
P → 0
P → 1

P → 0P0
P → 1P1

The production used at the root is P → 0P0.

At the middle child of the root it is P → 1P1.

Note that at the bottom is a use of the production P → ϵ.
That use, where the node labeled by the head has a single child labeled ϵ, is the only way a node labeled ϵ can appear in a parse tree.



The Yield of a Parse Tree
Hopcroft, Motwani, and Ullman, 5.2.2, p-183

If we look at the leaves of any parse tree and concatenate them from the left, we get a string.
This is called the yield of the tree.
This is always a string that is derived from the root variable.
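
Computing the yield is a left-to-right traversal of the leaves. In the sketch below, a tree is a (label, children) pair and the helpers `node` and `tree_yield` are illustrative names. The example tree is one parse tree of the expression grammar for a ∗ (a + b00), written out by hand.

```python
def node(label, *children):
    """Build a tree as a (label, children) pair; no children means a leaf."""
    return (label, list(children))


def tree_yield(tree):
    """Concatenate the leaf labels from left to right (ε is the empty string)."""
    label, children = tree
    if not children:
        return label
    return "".join(tree_yield(child) for child in children)


# Subtrees for the identifiers: E -> I -> a, and E -> I -> I0 -> I00 -> b00.
ident_a = node("E", node("I", node("a")))
ident_b00 = node("E", node("I", node("I", node("I", node("b")), node("0")), node("0")))

# Root uses E -> E * E; the right operand uses E -> (E) and then E -> E + E.
tree = node("E",
            ident_a,
            node("*"),
            node("E", node("("), node("E", ident_a, node("+"), ident_b00), node(")")))

print(tree_yield(tree))
```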



The Yield of a Parse Tree — continued
Hopcroft, Motwani, and Ullman, 5.2.2, p-183

Of special importance are those parse trees such that:


1. The yield is a terminal string.
That is, all leaves are labeled either with a
terminal or with ϵ.
2. The root is labeled by the start symbol.

These are the parse trees whose yields are strings in the
language of the underlying grammar.



Example
Hopcroft, Motwani, and Ullman, Example 5.11, p-183

The figure is an example of a tree with a terminal string as yield and the start symbol at the root.

It is based on the grammar for expressions.
This tree’s yield is the string a ∗ (a + b00).
This parse tree is a representation of a derivation of that string.
Ambiguous Grammars
Hopcroft, Motwani, and Ullman, 5.4.1, p-205

The expression grammar below lets us generate expressions with any sequence of ∗ and + operators.
The productions E → E + E | E ∗ E allow us to generate these expressions in any order we choose.

1. E → I
2. E → E + E
3. E → E ∗ E
4. E → (E)
5. I → a
6. I → b
7. I → Ia
8. I → Ib
9. I → I0
10. I → I1



Example
Hopcroft, Motwani, and Ullman, Example 5.25, p-206

Consider the sentential form E + E ∗ E.
It has two derivations from E:
1. E ⇒ E + E ⇒ E + E ∗ E
2. E ⇒ E ∗ E ⇒ E + E ∗ E
In derivation (1), the second E is replaced by E ∗ E, while in derivation (2), the first E is replaced by E + E.
Example — continued
Hopcroft, Motwani, and Ullman, Example 5.25, p-206

3 + 4 ∗ 5 = 23?
3 + 4 ∗ 5 = 35?



Example — continued
Hopcroft, Motwani, and Ullman, Example 5.25, p-206

1. E ⇒E+E ⇒E+E∗E
2. E ⇒E∗E ⇒E+E∗E

The figure shows the two parse trees, which we should note are distinct trees.
The difference between these two derivations is significant.
Derivation (1) says that the second and third expressions are multiplied, and the result is added to the first expression.
Derivation (2) adds the first two expressions and multiplies the result by the third.
In more concrete terms, the first derivation suggests that 1 + 2 ∗ 3 should be grouped 1 + (2 ∗ 3) = 7.
The second derivation suggests the same expression should be grouped (1 + 2) ∗ 3 = 9.
Obviously, the first of these, and not the second, matches our notion of correct grouping of arithmetic expressions.
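
The effect of the two groupings can be checked with a small evaluator. This is a sketch with assumptions: trees are simplified to (left, operator, right) triples, and the leaves are integers standing in for the values of identifiers, since the grammar itself generates identifiers rather than numbers. The names `evaluate`, `tree1` and `tree2` are illustrative.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}


def evaluate(tree):
    """Evaluate a tree that is either an int or a (left, op, right) triple."""
    if isinstance(tree, int):
        return tree
    left, op, right = tree
    return OPS[op](evaluate(left), evaluate(right))


tree1 = (1, "+", (2, "*", 3))   # derivation (1): the root uses E -> E + E
tree2 = ((1, "+", 2), "*", 3)   # derivation (2): the root uses E -> E * E

print(evaluate(tree1), evaluate(tree2))
```

The same string of symbols therefore gets two different values, depending only on which parse tree is chosen.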
Example — continued

The expression grammar gives two different structures to any string of terminals that is derived by replacing the three expressions in E + E ∗ E by identifiers.
We see that this grammar is not a good one for providing unique structure.

In particular, while it can give strings the correct grouping as arithmetic expressions, it also gives them incorrect groupings.
To use this expression grammar in a compiler, we would have to modify it to provide only the correct groupings.



Ambiguous Grammars — continued
Hopcroft, Motwani, and Ullman, 5.4.1, p-205

On the other hand, the mere existence of different derivations for a string (as opposed to different parse trees) does not imply a defect in the grammar.
The following is an example.



Example
Hopcroft, Motwani, and Ullman, Example 5.26, p-206

Using the same expression grammar, we find that the string a + b has many different derivations.

Two examples are:

1. E ⇒ E + E ⇒ I + E ⇒ a + E ⇒ a + I ⇒ a + b
2. E ⇒ E + E ⇒ E + I ⇒ I + I ⇒ I + b ⇒ a + b

However, there is no real difference between the structures provided by these derivations.

They each say that a and b are identifiers, and that their
values are to be added.

The two examples above suggest that it is not a multiplicity of derivations that causes ambiguity, but rather the existence of two or more parse trees.
Thus, we say a CFG G(V, T, P, S) is ambiguous if there is at least one string w in $T^{*}$ for which we can find two different parse trees, each with root labeled S and yield w.
If each string has at most one parse tree in the grammar, then the grammar is unambiguous.
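
For a particular string, the number of parse trees can be counted directly. The sketch below is one possible approach, not taken from the textbook: `count_parse_trees` is a hypothetical name, symbols are assumed to be single characters, and the recursion relies on the expression grammar having no ϵ-productions and no cycle of unit productions. It counts, for each variable and each substring, how many parse trees have that variable at the root and that substring as yield.

```python
from functools import lru_cache

GRAMMAR = {
    "E": ("I", "E+E", "E*E", "(E)"),
    "I": ("a", "b", "Ia", "Ib", "I0", "I1"),
}


def count_parse_trees(w, start="E"):
    """Count the parse trees with root `start` and yield `w`."""

    @lru_cache(maxsize=None)
    def trees(symbol, i, j):
        # Number of parse trees rooted at `symbol` whose yield is w[i:j].
        if symbol not in GRAMMAR:                 # a terminal must match exactly
            return 1 if w[i:j] == symbol else 0
        return sum(ways(body, i, j) for body in GRAMMAR[symbol])

    @lru_cache(maxsize=None)
    def ways(body, i, j):
        # Number of ways the symbols of `body` can jointly yield w[i:j].
        if len(body) == 1:
            return trees(body, i, j)
        # The first symbol yields w[i:k]; every later symbol needs one character.
        return sum(
            trees(body[0], i, k) * ways(body[1:], k, j)
            for k in range(i + 1, j - len(body) + 2)
        )

    return trees(start, 0, len(w))


for s in ["a+b", "a+a*a", "a+"]:
    print(s, count_parse_trees(s))
```

A count of two or more for any single string is enough to show that the grammar is ambiguous, while a count of one for a few strings shows nothing about the grammar as a whole.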



Ambiguity
Sipser, 2.1, p-107

Grammar G5.

<EXPR> → <EXPR> + <EXPR>
       | <EXPR> × <EXPR>
       | (<EXPR>) | a

This grammar doesn’t capture the usual precedence relations and so may group the + before the × or vice versa.



Ambiguity — continued
Sipser, Figure 2.6, p-108

<EXPR> → <EXPR> + <EXPR>
       | <EXPR> × <EXPR>
       | (<EXPR>) | a



Ambiguity — continued
Sipser, 2.1, p-107

In contrast, the following grammar generates exactly the same language, but every generated string has a unique parse tree.

<EXPR> → <EXPR> + <TERM> | <TERM>
<TERM> → <TERM> × <FACTOR> | <FACTOR>
<FACTOR> → (<EXPR>) | a
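
A small parser shows how the EXPR, TERM and FACTOR layers force a single grouping. This is a sketch rather than Sipser's construction: the function `parse` is a hypothetical name, the left-recursive rules are unrolled into loops (a common implementation choice), and the result is a simplified nested tuple that records the grouping, not the full parse tree.

```python
def parse(text):
    """Parse `text` with the EXPR/TERM/FACTOR grammar; return nested tuples."""
    tokens = list(text.replace(" ", ""))
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(expected):
        nonlocal pos
        if peek() != expected:
            raise SyntaxError(f"expected {expected!r} at position {pos}")
        pos += 1

    def expr():
        # EXPR -> EXPR + TERM | TERM, with the left recursion written as a loop.
        tree = term()
        while peek() == "+":
            eat("+")
            tree = (tree, "+", term())
        return tree

    def term():
        # TERM -> TERM × FACTOR | FACTOR, handled the same way.
        tree = factor()
        while peek() == "×":
            eat("×")
            tree = (tree, "×", factor())
        return tree

    def factor():
        # FACTOR -> (EXPR) | a
        if peek() == "(":
            eat("(")
            tree = expr()
            eat(")")
            return tree
        eat("a")
        return "a"

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected {peek()!r} at position {pos}")
    return tree


print(parse("a+a×a"))
print(parse("(a+a)×a"))
```

Because × is only introduced at the TERM level, it always binds more tightly than +, unless parentheses say otherwise.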


Ambiguity — continued
Sipser, Figure 2.5, p-105

<EXPR> → <EXPR> + <TERM> | <TERM>
<TERM> → <TERM> × <FACTOR> | <FACTOR>
<FACTOR> → (<EXPR>) | a



Ambiguity — continued
Sipser, Definition 2.7, p-108

Definition 2.7
A string w is derived ambiguously in context-free
grammar G if it has two or more different leftmost
derivations.
Grammar G is ambiguous if it generates some string
ambiguously.



Ambiguity — continued
Sipser, Example 2.1, p-108

Sometimes when we have an ambiguous grammar we can find an unambiguous grammar that generates the same language.
Some context-free languages, however, can be generated only by ambiguous grammars.
Such languages are called inherently ambiguous.
The language $\{ a^i b^j c^k \mid i = j \text{ or } j = k \}$ is inherently ambiguous.
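
Membership in this language is easy to test, even though no unambiguous grammar exists for it. The sketch below uses hypothetical helper names `conditions` and `in_language`. It also reports which of the two conditions a string satisfies; strings of the form a^n b^n c^n satisfy both, which is the usual intuition for where the ambiguity comes from.

```python
import re


def conditions(w):
    """Return (i == j, j == k) if w has the form a^i b^j c^k, else None."""
    match = re.fullmatch(r"(a*)(b*)(c*)", w)
    if match is None:
        return None
    i, j, k = (len(group) for group in match.groups())
    return (i == j, j == k)


def in_language(w):
    """True if w is a^i b^j c^k with i = j or j = k."""
    result = conditions(w)
    return result is not None and any(result)


for s in ["aabbc", "abbcc", "aabbcc", "aabccc", "ba"]:
    print(s, in_language(s), conditions(s))
```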



End of
Slides
