0% found this document useful (0 votes)

70 views11 pages

Parse Trees, Ambiguity, and Chomsky Normal Form

This document discusses parse trees, ambiguity, and Chomsky normal form in context-free grammars. It provides examples of left-most derivations and parse trees for a context-free grammar, and explains how a grammar is ambiguous if a string can have multiple parse trees. It then gives examples of unambiguous context-free grammars that generate the same languages as ambiguous grammars.

Uploaded by

Matea

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

70 views11 pages

Parse Trees, Ambiguity, and Chomsky Normal Form

Uploaded by

Matea

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 11

Lecture 8

Parse trees, ambiguity, and Chomsky

normal form

In this lecture we will discuss a few important notions connected with context-
free grammars, including parse trees, ambiguity, and a special form for context-free
grammars known as the Chomsky normal form.

8.1 Left-most derivations and parse trees

In the previous lecture we covered the definition of context-free grammars as well as
derivations of strings by context-free grammars. Let us consider one of the context-
free grammars from the previous lecture:

S → 0 S 1 S 1 S 0 S ε. (8.1)

Again we’ll call this CFG G, and as we proved last time we have

L( G ) = w ∈ Σ ∗ : | w |0 = | w |1 ,

(8.2)

where Σ = {0, 1} is the binary alphabet and |w|0 and |w|1 denote the number of
times the symbols 0 and 1 appear in w, respectively.

Left-most derivations
Here is an example of a derivation of the string 0101:

S ⇒ 0 S 1 S ⇒ 0 1 S 0 S 1 S ⇒ 0 1 0 S 1 S ⇒ 0 1 0 1 S ⇒ 0 1 0 1. (8.3)

This is an example of a left-most derivation, which means that it is always the left-
most variable that gets replaced at each step. For the first step there is only one

1
CS 360 Introduction to the Theory of Computing

variable that can possibly be replaced—this is true both in this example and in
general. For the second step, however, one could choose to replace either of the
occurrences of the variable S, and in the derivation above it is the left-most oc-
currence that gets replaced. That is, if we underline the variable that gets replaced
and the symbols and variables that replace it, we see that this step replaces the
left-most occurrence of the variable S:

0 S 1 S ⇒ 0 1 S 0 S 1 S. (8.4)

The same is true for every other step—always we choose the left-most variable
occurrence to replace, and that is why we call this a left-most derivation. The same
terminology is used in general, for any context-free grammar.
If you think about it for a moment, you will quickly realize that every string
that can be generated by a particular context-free grammar can also be generated
by that same grammar using a left-most derivation. This is because there is no
“interaction” among multiple variables and/or symbols in any context-free gram-
mar derivation; if we know which rule is used to substitute each variable, then it
doesn’t matter what order the variable occurrences are substituted, so you might
as well always take care of the left-most variable during each step.
We could also define the notion of a right-most derivation, in which the right-
most variable occurrence is always evaluated first—but there isn’t really anything
important about right-most derivations that isn’t already represented by the notion
of a left-most derivation, at least from the viewpoint of this course. For this reason,
we won’t have any reason to discuss right-most derivations further.

Parse trees
With any derivation of a string by a context-free grammar we may associate a tree,
called a parse tree, according to the following rules:

• We have one node of the tree for each new occurrence of either a variable, a
symbol, or an ε in the derivation, with the root node of the tree corresponding
to the start variable. (We only have nodes labelled ε when rules of the form
V → ε are applied.)
• Each node corresponding to a symbol or an ε is a leaf node (having no children),
while each node corresponding to a variable has one child for each symbol
or variable with which it is replaced. The children of each (variable) node are
ordered in the same way as the symbols and variables in the rule used to replace
that variable.

For example, the derivation (8.3) yields the parse tree illustrated in Figure 8.1.

2
Lecture 8

0 S 1 S

1 S 0 S ε

ε ε

Figure 8.1: The parse tree corresponding to the derivation (8.3) of the string 0101.

0 S 1 S

ε 0 S 1 S

ε ε

Figure 8.2: A parse tree corresponding to the derivation (8.5) of the string 0101.

There is a one-to-one and onto correspondence between parse trees and left-
most derivations, meaning that every parse tree uniquely determines a left-most
derivation and each left-most derivation uniquely determines a parse tree.

8.2 Ambiguity
Sometimes a context-free grammar will allow multiple parse trees (or, equivalently,
multiple left-most derivations) for some strings in the language that it generates.
For example, a different left-most derivation of the string 0101 by the CFG (8.1)
from the derivation (8.3) is given by

S ⇒ 0 S 1 S ⇒ 0 1 S ⇒ 0 1 0 S 1 S ⇒ 0 1 0 1 S ⇒ 0 1 0 1. (8.5)

The parse tree corresponding to this derivation is illustrated in Figure 8.2.

When it is the case, for a given context-free grammar G, that there exists at least
one string w ∈ L( G ) having at least two different parse trees, then the CFG G is

3
CS 360 Introduction to the Theory of Computing

said to be ambiguous. Note that this is so even if there is just a single string having
multiple parse trees—in order to be unambiguous, a CFG must have just a single,
unique parse tree for each string it generates.
Being unambiguous is generally considered to be a positive attribute of a CFG,
and indeed it is a requirement for some applications of context-free grammars.

Designing unambiguous CFGs

In some cases it is possible to come up with an unambiguous context-free gram-
mar that generates the same language as an ambiguous context-free grammar. For
example, we can come up with a different context-free grammar for the language

w ∈ {0, 1}∗ : |w|0 = |w|1

(8.6)

that, unlike the CFG (8.1), is unambiguous. Here is such a CFG:

S → 0X1S 1Y0S ε

X → 0X1Xε (8.7)

Y → 1Y0Y ε
We won’t take the time to go through a proof that this CFG is unambiguous—
but if you think about it for a few moments, it shouldn’t be too hard to convince
yourself that it is unambiguous. The variable X generates strings having the same
number of 0s and 1s, where the number of 1s never exceeds the number of 0s, and
the variable Y is similar except the role of the 0s and 1s is reversed. If you try to
generate a particular string, you’ll never have more than one option as to which
rule to apply, assuming you restrict your attention to left-most derivations.
Here is another example of how an ambiguous CFG can be modified to make it
unambiguous. Let us define an alphabet

Σ = a, b, +, ∗ , ( , ) }

(8.8)

along with a CFG

S → S+S S∗S (S) a b (8.9)
This grammar generates strings that look like arithmetic expressions in variables
a and b, where we allow the operations ∗ and +, along with parentheses. For in-
stance, the string
( a + b) ∗ a + b (8.10)
is such an expression, and we can generate it (for instance) as follows:

S ⇒ S ∗ S ⇒ (S) ∗ S ⇒ (S + S) ∗ S ⇒ ( a + S) ∗ S ⇒ ( a + b) ∗ S
(8.11)
⇒ ( a + b) ∗ S + S ⇒ ( a + b) ∗ a + S ⇒ ( a + b) ∗ a + b.

4
Lecture 8

S ∗ S

( S ) S + S

S + S a b

a b

Figure 8.3: Parse tree for ( a + b) ∗ a + b corresponding to the derivation (8.11).

This happens to be a left-most derivation, as it is always the left-most variable

that is substituted. The parse tree corresponding to this derivation is shown in
Figure 8.3.
You can imagine a more complex version of this grammar allowing for other
arithmetic operations, variables, and so on, but we will stick to the grammar in
(8.9) for the sake of simplicity.
Now, the CFG (8.9) is certainly ambiguous. For instance, a different (left-most)
derivation for the same string ( a + b) ∗ a + b as before is

S ⇒ S + S ⇒ S ∗ S + S ⇒ (S) ∗ S + S ⇒ (S + S) ∗ S + S
⇒ ( a + S) ∗ S + S ⇒ ( a + b) ∗ S + S ⇒ ( a + b) ∗ a + S (8.12)
⇒ ( a + b) ∗ a + b,

and the parse tree for this derivation is shown in Figure 8.4. Notice that there is
something appealing about the parse tree illustrated in Figure 8.4, which is that
it actually carries the meaning of the expression ( a + b) ∗ a + b, in the sense that
the tree structure properly captures the order in which the operations should be
applied. In contrast, the first parse tree seems to represent what the expression
( a + b) ∗ a + b would evaluate to if we lived in a society where addition was given
higher precedence than multiplication.
The ambiguity of the grammar (8.9), along with the fact that parse trees may
not represent the meaning of an arithmetic expression in the sense just described,
is a problem in some settings. For example, if we were designing a compiler and
wanted a part of it to represent arithmetic expressions (presumably allowing much
more complicated ones than our grammar from above allows), a CFG along the
lines of (8.9) would be completely inadequate.

5
CS 360 Introduction to the Theory of Computing

S + S

S ∗ S b

( S ) a

S + S

a b

Figure 8.4: Parse tree for ( a + b) ∗ a + b corresponding to the derivation (8.12).

We can, however, come up with a new CFG for the same language that is much
better—it is unambiguous and it properly captures the meaning of arithmetic ex-
pressions. Here it is:

S → TS+T

T → FT∗F
(8.13)
F → I (S)

I → ab

For example, the unique parse tree corresponding to the string ( a + b) ∗ a + b is as

shown in Figure 8.5.
In order to better understand the CFG (8.13), it may help to associate meanings
with the different variables. In this CFG, the variable T generates terms, the variable
F generates factors, and the variable I generates identifiers. An expression is either a
term or a sum of terms, a term is either a factor or a product of factors, and a factor
is either an identifier or an entire expression inside of parentheses.

Inherently ambiguous languages

While we have seen that it is sometime possible to come up with an unambiguous
CFG that generates the same language as an ambiguous CFG, it is not always pos-
sible. There are some context-free languages that can only be generated by ambigu-
ous CFGs. Such languages are called inherently ambiguous context-free languages.

6
Lecture 8

S + T

T F

T ∗ F I

F I b

( S ) a

S + T

T F

F I

I b

Figure 8.5: Unique parse tree for ( a + b) ∗ a + b for the CFG (8.13).

An example of an inherently ambiguous context-free language is this one:

n m k
0 1 2 : n = m or m = k . (8.14)

We will not discuss a proof that this language is inherently ambiguous, but the
intuition is that the string 0n 1n 2n will always have multiple parse trees for some
sufficiently large natural number n.

8.3 Chomsky normal form

Some context-free grammars are strange. For example, the CFG

S → SSSS ε (8.15)

7
CS 360 Introduction to the Theory of Computing

simply generates the language {ε}; but it is obviously ambiguous, and even worse
it has infinitely many parse trees (which of course can be arbitrarily large) for the
only string ε it generates. While we know we cannot always eliminate ambiguity
from CFGs—as some context-free languages are inherently ambiguous—we can
at least eliminate the possibility to have infinitely many parse trees for a given
string. Perhaps more importantly, for any given CFG G, we can always come up
with a new CFG H for which it holds that L( H ) = L( G ), and for which we are
guaranteed that every parse tree for a given string w ∈ L( H ) has the same size and
a very simple, binary-tree-like structure.
To be more precise about the specific sort of CFGs and parse trees we’re talking
about, it is appropriate at this point to define what is called the Chomsky normal
form for context-free grammars.

Definition 8.1. A context-free grammar G is in Chomsky normal form if every rule

of G has one of the following three forms:

1. X → YZ, for variables X, Y, and Z, and where neither Y nor Z is the start
variable,
2. X → σ, for a variable X and a symbol σ, or
3. S → ε, for S the start variable.

Now, the reason why a CFG in Chomsky normal form is nice is that every parse
tree for such a grammar has a simple form: the variable nodes form a binary tree,
and for each variable node that doesn’t have any variable node children, a single
symbol node hangs off. A hypothetical example meant to illustrate the structure
we are talking about is given in Figure 8.6. Notice that the start variable always
appears exactly once at the root of the tree because it is never allowed on the right-
hand side of any rule.
If the rule S → ε is present in a CFG in Chomsky normal form, then we have a
special case that doesn’t fit exactly into the structure described above. In this case
we can have the very simple parse tree shown in Figure 8.7 for ε, and this is the
only possible parse tree for this string.
Because of the very special form that a parse tree must take for a CFG G in
Chomsky normal form, we have that every parse tree for a given string w ∈ L( G )
must have exactly 2|w| − 1 variable nodes and |w| leaf nodes (except for the special
case w = ε, in which we have one variable node and 1 leaf node). An equivalent
statement is that every derivation of a (nonempty) string w by a CFG in Chomsky
normal form requires exactly 2|w| − 1 substitutions.
The following theorem establishes that every context-free language is gener-
ated by a CFG in Chomsky normal form.

8
Lecture 8

X Y

X Z 0

0 Y X

1 1

Figure 8.6: A hypothetical example of a parse tree for a CFG in Chomsky normal
form.

Figure 8.7: The unique parse tree for ε for a CFG in Chomsky normal form, assum-
ing it includes the rule S → ε.

Theorem 8.2. Let Σ be an alphabet and let A ⊆ Σ∗ be a context-free language. There

exists a CFG G in Chomsky normal form such that A = L( G ).

The usual way to prove this theorem is through a construction that converts an
arbitrary CFG G into a CFG H in Chomsky normal form for which it holds that
L( H ) = L( G ). The conversion is, in fact, quite straightforward—a summary of the
steps one may perform to do this conversion for an arbitrary CFG G = (V, Σ, R, S)
is as follows:

1. Add a new start variable S0 along with the rule S0 → S.

Doing this will ensure that the start variable S0 never appears on the right-hand
side of any rule.
2. Eliminate ε-rules of the form X → ε and “repair the damage.”
Aside from the special case S0 → ε, there is never any need for rules of the
form X → ε; you can get the same effect by simply duplicating rules in which
X appears on the right-hand side, and directly replacing or not replacing X

9
CS 360 Introduction to the Theory of Computing

with ε in all possible combinations. For example, if we have the CFG

S0 → S
(8.16)
S → (S)S ε
we can easily eliminate the ε-rule S → ε in this way:

S0 → S ε
(8.17)
S → (S)S ( )S (S) ( )

It can get messy for larger grammars, and when ε-rules for multiple variables
are involved one needs to take care in making the process terminate correctly,
but it is always possible to remove all ε-rules (aside from S0 → ε) in this way.
3. Eliminate unit rules, which are rules of the form X → Y.
Rules like this are never necessary, and they can be eliminated by simply “hard
coding” the substitution X → Y into every other (non-unit) rule where X ap-
pears on the right-hand side. For instance, carrying on the example from above,
we can eliminate the unit rule S0 → S like this:

S0 → ( S ) S ( ) S ( S ) ( ) ε
(8.18)
S → (S)S ( )S (S) ( )

4. Introduce a new variable Xσ for each symbol σ.

Include the rule Xσ → σ, and replace every instance of σ appearing on the
right-hand side of a rule by Xσ (except when σ appears all by itself on the right-
hand side of a rule, because you don’t want to introduce new unit rules). For the
example above, we may use L and R (for left-parenthesis and right-parenthesis)
to obtain
S0 → L S R S L R S L S R L R ε

S → LSRS LRS LSR LR
(8.19)
L→(
R→)
5. Finally, we can split up rules of the form X → Y1 · · · Ym using auxiliary vari-
ables in a straightforward way.
For instance, X → Y1 · · · Ym can be broken up as
X → Y1 X2
X2 → Y2 X3
.. (8.20)
.
Xm−1 → Ym−1 Ym .

10
Lecture 8

We need to be sure to use separate auxiliary variables for each rule so that there
is no “cross talk” between separate rules. Let’s not write this out explicitly for
our example because it will be lengthy and hopefully the idea is clear.

The description above is only meant to give you the basic idea of how the construc-
tion works and does not constitute a proof of Theorem 8.2. It is possible, however,
to be more formal and precise in describing this construction in order to obtain a
proper proof of Theorem 8.2.
As you may by now suspect, this conversion of a CFG to Chomsky normal form
often produces very large CFGs. The steps are routine but things get messy and it
is easy to make a mistake when doing it by hand. I will never ask you to perform
this conversion on an actual CFG, but we will make use of the theorem from time
to time—when we are proving things about context-free languages it is sometimes
extremely helpful to know that we can always assume that a given context-free
language is generated by a CFG in Chomsky normal form.
Finally, it must be stressed that the Chomsky normal form says nothing about
ambiguity in general—a CFG in Chomsky normal form may or may not be am-
biguous, just like we have for arbitrary CFGs. On the other hand, if you start with
an unambiguous CFG and perform the conversion described above, the resulting
CFG in Chomsky normal form will still be unambiguous.

Compiler Design Chapter-3
0% (1)
Compiler Design Chapter-3
177 pages
Unit Iv Context Free Languages
No ratings yet
Unit Iv Context Free Languages
74 pages
Unit Iii Context-Free Grammar and Languages: 3.1.1. Definition
No ratings yet
Unit Iii Context-Free Grammar and Languages: 3.1.1. Definition
29 pages
BV Raman 300 Important Yogas
67% (3)
BV Raman 300 Important Yogas
17 pages
5 Junior P.E and Arts
No ratings yet
5 Junior P.E and Arts
83 pages
Recruitment Selection Training
No ratings yet
Recruitment Selection Training
29 pages
Parse Trees, Ambiguity, and Chomsky Normal Form
No ratings yet
Parse Trees, Ambiguity, and Chomsky Normal Form
12 pages
08 CFG
No ratings yet
08 CFG
27 pages
SE Compiler Chapter 3-Parser
No ratings yet
SE Compiler Chapter 3-Parser
27 pages
Unit 3 Syntax - Analyzer
No ratings yet
Unit 3 Syntax - Analyzer
56 pages
Formal Languages and Automata Theory: CH 4: Context Free Languages
No ratings yet
Formal Languages and Automata Theory: CH 4: Context Free Languages
59 pages
CNP Bill
No ratings yet
CNP Bill
1 page
Syntax Analyzer
No ratings yet
Syntax Analyzer
38 pages
Automata Theory Lec-03
No ratings yet
Automata Theory Lec-03
58 pages
Parsing Bun
No ratings yet
Parsing Bun
48 pages
Context
No ratings yet
Context
57 pages
Session 2 Overview of Integrity
No ratings yet
Session 2 Overview of Integrity
19 pages
Theme
No ratings yet
Theme
11 pages
ContextFreeGrammars
No ratings yet
ContextFreeGrammars
28 pages
Samir CFG
No ratings yet
Samir CFG
105 pages
Chương 3. Phân Tích Cú Pháp
No ratings yet
Chương 3. Phân Tích Cú Pháp
103 pages
COSC3054 Lec 03 I Grammars
No ratings yet
COSC3054 Lec 03 I Grammars
96 pages
FAFL Final Lecture 24.1 CMH
No ratings yet
FAFL Final Lecture 24.1 CMH
21 pages
The 10 Hook Lead System
100% (1)
The 10 Hook Lead System
5 pages
CC-Lec 5 Week 5 Cfgs
No ratings yet
CC-Lec 5 Week 5 Cfgs
29 pages
Chapter 3
No ratings yet
Chapter 3
57 pages
Figure 1two Parse Trees For 9-5+2
No ratings yet
Figure 1two Parse Trees For 9-5+2
3 pages
SDC - Grammar - CFG
No ratings yet
SDC - Grammar - CFG
46 pages
Module 3 CFG - Final
No ratings yet
Module 3 CFG - Final
40 pages
Context Free Grammars
No ratings yet
Context Free Grammars
25 pages
Compiler Design Unit 2
No ratings yet
Compiler Design Unit 2
24 pages
Unit 4 ContextFreeLanguage
No ratings yet
Unit 4 ContextFreeLanguage
58 pages
Module 2
No ratings yet
Module 2
19 pages
TOC II Updated
No ratings yet
TOC II Updated
41 pages
Grid Audit Report Format
100% (1)
Grid Audit Report Format
7 pages
Chapter 3
No ratings yet
Chapter 3
77 pages
Module-3 Notes
No ratings yet
Module-3 Notes
28 pages
Unit 2
No ratings yet
Unit 2
10 pages
17 CFGremove Ambiguity Optional
No ratings yet
17 CFGremove Ambiguity Optional
30 pages
Lex
No ratings yet
Lex
13 pages
Module III
No ratings yet
Module III
18 pages
Chapter Four: Context Free Languages (CFG) : - Contents
No ratings yet
Chapter Four: Context Free Languages (CFG) : - Contents
36 pages
Lecture 03
No ratings yet
Lecture 03
36 pages
You Are Not Your Brain
0% (1)
You Are Not Your Brain
7 pages
7-Parsing and Ambiguity-16-09-2024
No ratings yet
7-Parsing and Ambiguity-16-09-2024
10 pages
Lecture 9
No ratings yet
Lecture 9
22 pages
1 Syntax Analyzer
No ratings yet
1 Syntax Analyzer
33 pages
Syntax Analysis: EECS 483 - Lecture 4 University of Michigan Monday, September 17, 2006
No ratings yet
Syntax Analysis: EECS 483 - Lecture 4 University of Michigan Monday, September 17, 2006
28 pages
Portable Radios: Operating Instructions
100% (1)
Portable Radios: Operating Instructions
47 pages
Castaneda Notes
No ratings yet
Castaneda Notes
10 pages
Automata Chapter 3
No ratings yet
Automata Chapter 3
14 pages
Chapter3 CFG
No ratings yet
Chapter3 CFG
67 pages
CS6109 Module 4
No ratings yet
CS6109 Module 4
36 pages
CFG & GNF
No ratings yet
CFG & GNF
21 pages
CST302 - Compiler - Design - Module 2
No ratings yet
CST302 - Compiler - Design - Module 2
19 pages
Compiler Design Lec-Three Syntax Analysis
No ratings yet
Compiler Design Lec-Three Syntax Analysis
60 pages
CGF and CFL
No ratings yet
CGF and CFL
45 pages
4 Parsing
No ratings yet
4 Parsing
32 pages
Chapter 4 Intro - To - Parsing
No ratings yet
Chapter 4 Intro - To - Parsing
53 pages
Compiler 3
No ratings yet
Compiler 3
11 pages
UNIT IV CONTEXT FREE GRAMMARS and LANGUAGES
No ratings yet
UNIT IV CONTEXT FREE GRAMMARS and LANGUAGES
69 pages
l5 CFG
No ratings yet
l5 CFG
21 pages
2014-CD Ch-03 SAn
No ratings yet
2014-CD Ch-03 SAn
21 pages
2018 Book CyberSecurityForCyberPhysicalS PDF
100% (1)
2018 Book CyberSecurityForCyberPhysicalS PDF
189 pages
Class 18 Context Free Grammar
No ratings yet
Class 18 Context Free Grammar
35 pages
(Ebook) The Transformers Legends by David Cian ISBN 9780743497916, 0743497910 Download
100% (2)
(Ebook) The Transformers Legends by David Cian ISBN 9780743497916, 0743497910 Download
67 pages
Passband Digital Transmission
No ratings yet
Passband Digital Transmission
99 pages
FLAT - Ch. 3 (Lecture Notes)
No ratings yet
FLAT - Ch. 3 (Lecture Notes)
23 pages
ContextFreeGrammars Myppt
No ratings yet
ContextFreeGrammars Myppt
41 pages
Unit-3 Context Free Grammar
No ratings yet
Unit-3 Context Free Grammar
57 pages
Working With Vectors: Magnitude and Direction
No ratings yet
Working With Vectors: Magnitude and Direction
16 pages
Compiler 8
No ratings yet
Compiler 8
28 pages
2GMD 21 TK
100% (1)
2GMD 21 TK
3 pages
Final Exam For Handle Mail
No ratings yet
Final Exam For Handle Mail
2 pages
Cusat Btech Ece S8 Syllabus
No ratings yet
Cusat Btech Ece S8 Syllabus
4 pages
Essay and Elocution Competition
No ratings yet
Essay and Elocution Competition
1 page
Section One1
No ratings yet
Section One1
85 pages
Reg 216 - B520
No ratings yet
Reg 216 - B520
24 pages
Job Opportunity Bootloader Specialist at Elektrobit Automotive GMBH Jobportal1
No ratings yet
Job Opportunity Bootloader Specialist at Elektrobit Automotive GMBH Jobportal1
3 pages
Data Sheet: SFH757 and SFH757V
No ratings yet
Data Sheet: SFH757 and SFH757V
4 pages
Hytera+VM780+4G+Body+Worn+Camera+User+Manual+ (HyTalk) +V1.0.00 Eng
No ratings yet
Hytera+VM780+4G+Body+Worn+Camera+User+Manual+ (HyTalk) +V1.0.00 Eng
50 pages
Brosur Grolen HP19R
No ratings yet
Brosur Grolen HP19R
2 pages
Satish
No ratings yet
Satish
5 pages
DTX Adjust Captured Images
No ratings yet
DTX Adjust Captured Images
3 pages
German Observers Guns - Harry Woodman
No ratings yet
German Observers Guns - Harry Woodman
16 pages
Unit 4 - Week 2: Introduction To Python: Assignment 2
No ratings yet
Unit 4 - Week 2: Introduction To Python: Assignment 2
4 pages
Astm A799a799m - 10
No ratings yet
Astm A799a799m - 10
4 pages
Footscan®v9 Software Packages
No ratings yet
Footscan®v9 Software Packages
1 page
Linear Algebra
From Everand
Linear Algebra
Georgi E. Shilov
2.5/5 (3)
Topology Essentials
From Everand
Topology Essentials
Emil G. Milewski
5/5 (1)

Parse Trees, Ambiguity, and Chomsky Normal Form

Uploaded by

Parse Trees, Ambiguity, and Chomsky Normal Form

Uploaded by

Lecture 8

Parse trees, ambiguity, and Chomsky

8.1 Left-most derivations and parse trees

The parse tree corresponding to this derivation is illustrated in Figure 8.2.

Designing unambiguous CFGs

w ∈ {0, 1}∗ : |w|0 = |w|1

that, unlike the CFG (8.1), is unambiguous. Here is such a CFG:

along with a CFG

Figure 8.3: Parse tree for ( a + b) ∗ a + b corresponding to the derivation (8.11).

This happens to be a left-most derivation, as it is always the left-most variable

Figure 8.4: Parse tree for ( a + b) ∗ a + b corresponding to the derivation (8.12).

For example, the unique parse tree corresponding to the string ( a + b) ∗ a + b is as

Inherently ambiguous languages

An example of an inherently ambiguous context-free language is this one:

8.3 Chomsky normal form

Definition 8.1. A context-free grammar G is in Chomsky normal form if every rule

Theorem 8.2. Let Σ be an alphabet and let A ⊆ Σ∗ be a context-free language. There

1. Add a new start variable S0 along with the rule S0 → S.

with ε in all possible combinations. For example, if we have the CFG

4. Introduce a new variable Xσ for each symbol σ.

You might also like