Parse Trees, Ambiguity, and Chomsky Normal Form
In this lecture we will discuss a few important notions connected with context-
free grammars, including parse trees, ambiguity, and a special form for context-free
grammars known as Chomsky normal form.
We will begin by recalling the following CFG from the previous lecture:

S → 0S1S | 1S0S | ε    (8.1)

Again we will call this CFG G, and as we proved last time we have
L(G) = { w ∈ Σ* : |w|0 = |w|1 },    (8.2)
where Σ = {0, 1} is the binary alphabet and |w|0 and |w|1 denote the number of
times the symbols 0 and 1 appear in w, respectively.
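For instance, one might check membership in L(G) with a small Python sketch along the following lines (the function name in_L is just for illustration):

    def in_L(w):
        # w is a string over the alphabet {0, 1}; |w|0 and |w|1 are simply
        # the counts of 0s and 1s appearing in w.
        return w.count("0") == w.count("1")

    print(in_L("0101"), in_L("0111"))   # True False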
Left-most derivations
Here is an example of a derivation of the string 0101 by this CFG:

S ⇒ 0S1S ⇒ 01S0S1S ⇒ 010S1S ⇒ 0101S ⇒ 0101    (8.3)
This is an example of a left-most derivation, which means that it is always the left-
most variable that gets replaced at each step. For the first step there is only one
variable that can possibly be replaced; this is true both in this example and in general. For the second step, however, one could choose to replace either of the occurrences of the variable S, and in the derivation above it is the left-most occurrence that gets replaced: applying the rule S → 1S0S to that occurrence yields 01S0S1S.
The same is true for every other step: always we choose the left-most variable
occurrence to replace, and that is why we call this a left-most derivation. The same
terminology is used in general, for any context-free grammar.
If you think about it for a moment, you will quickly realize that every string
that can be generated by a particular context-free grammar can also be generated
by that same grammar using a left-most derivation. This is because there is no “interaction” among multiple variables and/or symbols in any context-free grammar
derivation; if we know which rule is used to substitute each variable, then it does
not matter what order the variable occurrences are substituted, so you might as
well always take care of the left-most variable during each step.
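To make this concrete, here is a small Python sketch of carrying out the left-most derivation (8.3) mechanically. The encoding of the CFG (8.1) as a dictionary from variables to right-hand sides, with "" standing for ε, is just a convenient assumption for the sake of illustration.

    rules = {"S": ["0S1S", "1S0S", ""]}     # the CFG (8.1); "" stands for ε

    def leftmost_step(form, rhs):
        # Replace the left-most variable occurrence in `form` by `rhs`.
        i = next(i for i, c in enumerate(form) if c in rules)
        return form[:i] + rhs + form[i + 1:]

    form = "S"
    for rhs in ["0S1S", "1S0S", "", "", ""]:    # the rule choices made in (8.3)
        form = leftmost_step(form, rhs)
        print(form)
    # prints 0S1S, 01S0S1S, 010S1S, 0101S, and finally 0101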
We could also define the notion of a right-most derivation, in which the right-
most variable occurrence is always evaluated first, but there is not really anything
important about right-most derivations that is not already represented by the notion of a left-most derivation, at least from the viewpoint of this course. For this
reason, we will not have any reason to discuss right-most derivations further.
Parse trees
With any derivation of a string by a context-free grammar we may associate a tree,
called a parse tree, according to the following rules:
1. We have one node of the tree for each new occurrence of either a variable, a
symbol, or an ε in the derivation, with the root node of the tree corresponding
to the start variable. We only have nodes labeled ε when rules of the form
V → ε are applied.
2. Each node corresponding to a symbol or an ε is a leaf node (having no children), while each node corresponding to a variable has one child for each symbol, variable, or ε with which it is replaced. The children of each variable node
are ordered in the same way as the symbols and variables in the rule used to
replace that variable.
For example, the derivation (8.3) yields the parse tree illustrated in Figure 8.1.
Figure 8.1: The parse tree corresponding to the derivation (8.3) of the string 0101.
There is a one-to-one and onto correspondence between parse trees and left-
most derivations, meaning that every parse tree uniquely determines a left-most
derivation and each left-most derivation uniquely determines a parse tree.
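To make the correspondence concrete, here is a small Python sketch in which a parse tree is represented as a nested (label, children) pair; this particular representation is only an assumption made for illustration. The tree below is the one depicted in Figure 8.1, and reading its leaves from left to right recovers the string 0101.

    def leaf(label):
        return (label, ())

    # The parse tree of Figure 8.1 for the derivation (8.3) of the string 0101.
    tree = ("S", (leaf("0"),
                  ("S", (leaf("1"), ("S", (leaf("ε"),)), leaf("0"), ("S", (leaf("ε"),)))),
                  leaf("1"),
                  ("S", (leaf("ε"),))))

    def yield_of(tree):
        # The string the tree generates: its leaf labels read from left to
        # right, with ε-leaves contributing nothing.
        label, children = tree
        if not children:
            return "" if label == "ε" else label
        return "".join(yield_of(child) for child in children)

    print(yield_of(tree))   # 0101

Visiting the variable nodes of a parse tree in depth-first, left-to-right order gives exactly the order in which they are replaced in the corresponding left-most derivation.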
8.2 Ambiguity
Sometimes a context-free grammar will allow multiple parse trees (or, equivalently,
multiple left-most derivations) for some strings in the language that it generates.
For example, a left-most derivation of the string 0101 by the CFG (8.1) that is different from (8.3) is

S ⇒ 0S1S ⇒ 01S ⇒ 010S1S ⇒ 0101S ⇒ 0101,    (8.5)

and the parse tree corresponding to this derivation is shown in Figure 8.2.
Figure 8.2: The parse tree corresponding to the derivation (8.5) of the string 0101.
A CFG that allows two or more parse trees for the same string is said to be ambiguous, so the CFG (8.1) is ambiguous. It is possible, however, to come up with a different CFG for the language L(G) described in (8.2) that, unlike the CFG (8.1), is unambiguous. Here is such a CFG:
S → 0X1S | 1Y0S | ε
X → 0X1X | ε    (8.6)
Y → 1Y0Y | ε
We will not take the time to go through a proof that this CFG is unambiguous, but
if you think about it for a few moments you should be able to convince yourself
that it is unambiguous. The variable X generates strings having the same number
of 0s and 1s, where the number of 1s never exceeds the number of 0s when you
read from left to right, and the variable Y is similar except the role of the 0s and 1s
is reversed. If you try to generate a particular string by a left-most derivation with
this CFG, you will never have more than one option as to which rule to apply.
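One way to see the difference between the CFGs (8.1) and (8.6) concretely is to count left-most derivations by brute force. The following Python sketch does this for grammars whose variables and symbols are single characters, encoded as a dictionary from variables to lists of right-hand sides with "" standing for ε; this encoding, and the simple pruning used (which suffices for the grammars considered here, though not for arbitrary CFGs), are assumptions made for illustration.

    def count_leftmost_derivations(rules, start, target):
        # Count the distinct left-most derivations of `target` from `start`.
        count = 0
        stack = [start]
        while stack:
            form = stack.pop()
            i = next((j for j, c in enumerate(form) if c in rules), None)
            if i is None:                           # no variables remain
                count += form == target
                continue
            if not target.startswith(form[:i]):     # terminal prefix must match
                continue
            if sum(c not in rules for c in form) > len(target):
                continue                            # too many terminals already
            for rhs in rules[form[i]]:
                stack.append(form[:i] + rhs + form[i + 1:])
        return count

    g1 = {"S": ["0S1S", "1S0S", ""]}                    # the CFG (8.1)
    g2 = {"S": ["0X1S", "1Y0S", ""],                    # the CFG (8.6)
          "X": ["0X1X", ""],
          "Y": ["1Y0Y", ""]}
    print(count_leftmost_derivations(g1, "S", "0101"))  # 2
    print(count_leftmost_derivations(g2, "S", "0101"))  # 1

The two derivations counted for the CFG (8.1) are exactly (8.3) and (8.5).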
Here is another example of how an ambiguous CFG can be modified to make it
unambiguous. Let us define an alphabet
Σ = { a, b, +, ∗, (, ) }    (8.7)

and consider the following CFG for arithmetic expressions over this alphabet:

S → S + S | S ∗ S | (S) | a | b    (8.8)
For example, the string (a + b) ∗ a + b is generated by this CFG; one left-most derivation of it is

S ⇒ S ∗ S ⇒ (S) ∗ S ⇒ (S + S) ∗ S ⇒ (a + S) ∗ S ⇒ (a + b) ∗ S
  ⇒ (a + b) ∗ S + S ⇒ (a + b) ∗ a + S ⇒ (a + b) ∗ a + b,

and the parse tree corresponding to this derivation is shown in Figure 8.3.

Figure 8.3: A parse tree corresponding to the left-most derivation above of the string (a + b) ∗ a + b.

You can of course imagine a more complex version of this grammar
allowing for other arithmetic operations, variables, and so on, but we will stick to
the grammar in (8.8) for the sake of simplicity.
The CFG (8.8) is ambiguous. For instance, a different (left-most) derivation for
the same string (a + b) ∗ a + b as before is

S ⇒ S + S ⇒ S ∗ S + S ⇒ (S) ∗ S + S
  ⇒ (S + S) ∗ S + S ⇒ (a + S) ∗ S + S ⇒ (a + b) ∗ S + S    (8.10)
  ⇒ (a + b) ∗ a + S ⇒ (a + b) ∗ a + b,
and the parse tree for this derivation is shown in Figure 8.4.
Notice that the parse tree illustrated in Figure 8.4 is appealing because it actually carries the meaning of the expression (a + b) ∗ a + b, in the sense that the tree
structure properly captures the order in which the operations should be applied
according to the standard order of precedence for arithmetic operations. In contrast, the parse tree shown in Figure 8.3 seems to represent what the expression (a + b) ∗ a + b would evaluate to if we lived in a society where addition was given higher precedence than multiplication.
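Using the derivation-counting sketch from the previous section (with ∗ written as the ASCII character *), one can likewise check that this string has exactly two left-most derivations with respect to the CFG (8.8), corresponding to the two parse trees shown in Figures 8.3 and 8.4:

    g8 = {"S": ["S+S", "S*S", "(S)", "a", "b"]}              # the CFG (8.8)
    print(count_leftmost_derivations(g8, "S", "(a+b)*a+b"))  # 2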
The ambiguity of the grammar (8.8), along with the fact that parse trees may
not represent the meaning of an arithmetic expression in the sense just described,
is a problem in some settings. For example, if we were designing a compiler and
wanted a part of it to represent arithmetic expressions (presumably allowing much
more complicated ones than our grammar from above allows), a CFG along the
lines of (8.8) would be completely inadequate.
We can, however, come up with a new CFG for the same language that is much better, in the sense that it is unambiguous and properly captures the meaning of arithmetic expressions. Here is such a CFG:

S → S + T | T
T → T ∗ F | F
F → I | (S)    (8.11)
I → a | b
Figure 8.4: The parse tree corresponding to the derivation (8.10) of the string (a + b) ∗ a + b.
Figure 8.5: Unique parse tree for (a + b) ∗ a + b for the CFG (8.11).
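As an illustration of how the structure of the CFG (8.11) encodes the standard order of precedence, here is a small recursive-descent evaluator sketch in Python, with one parsing function per variable of the grammar. The left-recursive rules S → S + T and T → T ∗ F are handled with loops, ∗ is written as the ASCII character *, and the numeric values chosen for the symbols a and b are an arbitrary assumption, used only so that expressions evaluate to numbers.

    def evaluate(expr, values=None):
        values = values if values is not None else {"a": 2, "b": 3}
        tokens = list(expr.replace(" ", ""))
        pos = 0

        def peek():
            return tokens[pos] if pos < len(tokens) else None

        def eat(tok):
            nonlocal pos
            assert peek() == tok, f"expected {tok!r} at position {pos}"
            pos += 1

        def parse_S():                      # S -> S + T | T
            value = parse_T()
            while peek() == "+":
                eat("+")
                value += parse_T()
            return value

        def parse_T():                      # T -> T * F | F
            value = parse_F()
            while peek() == "*":
                eat("*")
                value *= parse_F()
            return value

        def parse_F():                      # F -> I | ( S )
            if peek() == "(":
                eat("(")
                value = parse_S()
                eat(")")
                return value
            return parse_I()

        def parse_I():                      # I -> a | b
            tok = peek()
            eat(tok)
            return values[tok]

        result = parse_S()
        assert pos == len(tokens), "unexpected trailing input"
        return result

    print(evaluate("(a+b)*a+b"))            # ((2 + 3) * 2) + 3 = 13

Evaluating (a + b) ∗ a + b this way groups the expression as ((a + b) ∗ a) + b, which is exactly the grouping exhibited by the parse tree in Figure 8.5.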
Some context-free languages are inherently ambiguous, meaning that every CFG generating them is ambiguous. A standard example is the language

{0ⁿ1ᵐ2ᵐ : n, m ∈ ℕ} ∪ {0ⁿ1ⁿ2ᵐ : n, m ∈ ℕ}.

We will not prove that this language is inherently ambiguous, but the intuition is that no matter what CFG you come up with for this language, the string 0ⁿ1ⁿ2ⁿ will always have multiple parse trees for some sufficiently large natural number n.
Ambiguity can also be far more dramatic than this. For example, a CFG having just the two rules S → SS and S → ε simply generates the language {ε}, but it is obviously ambiguous, and even worse it has infinitely many parse trees (which of course can be arbitrarily large) for the string ε.
8.3 Chomsky normal form

A CFG is said to be in Chomsky normal form if every one of its rules has one of the following three forms:

1. X → YZ, where X, Y, and Z are variables and neither Y nor Z is the start variable;
2. X → a, where X is a variable and a is a symbol;
3. S → ε, where S is the start variable.

In particular, the start variable never appears on the right-hand side of any rule.
Figure 8.6: A hypothetical example of a parse tree for a CFG in Chomsky normal
form.
Figure 8.7: The unique parse tree for ε for a CFG in Chomsky normal form, assuming it includes the rule S → ε.
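Whether a given CFG is in Chomsky normal form can be checked mechanically. In the following Python sketch, a grammar is encoded (as an assumption made for illustration) as a dictionary mapping each variable to a set of right-hand sides, where a right-hand side is a tuple of symbols and the empty tuple stands for ε.

    def is_chomsky_normal_form(rules, start):
        # A symbol counts as a variable exactly when it is a key of `rules`.
        for A, rhss in rules.items():
            for rhs in rhss:
                ok = ((len(rhs) == 2 and                              # X -> YZ, with Y
                       all(s in rules and s != start for s in rhs))   # and Z not the start
                      or (len(rhs) == 1 and rhs[0] not in rules)      # X -> a
                      or (rhs == () and A == start))                  # S -> ε (start only)
                if not ok:
                    return False
        return True

    g = {"S": {("A", "B"), ()}, "A": {("A", "B"), ("0",)}, "B": {("1",)}}
    print(is_chomsky_normal_form(g, "S"))                             # True
    print(is_chomsky_normal_form({"S": {("0", "S", "1"), ()}}, "S"))  # False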
Theorem 8.2 states that every context-free language is generated by some CFG in Chomsky normal form. The usual way to prove this theorem is through a construction that converts an arbitrary CFG G into a CFG H in Chomsky normal form for which L(H) = L(G). The conversion is, in fact, fairly straightforward; a summary of the steps one may perform to do this conversion for an arbitrary CFG G = (V, Σ, R, S) appears below, and a short code sketch of the same steps follows the example. To illustrate how these steps work, let us start with the following CFG, which generates the balanced parentheses language BAL from the previous lecture:
S → (S)S | ε    (8.14)
1. Introduce a new start variable S0, along with the rule S0 → S, so that the new start variable does not appear on the right-hand side of any rule. Applying this step to the CFG (8.14) yields the following CFG:

S0 → S
S → (S)S | ε    (8.15)
2. Replace every symbol appearing on the right-hand side of a rule whose right-hand side has length two or more with a new variable, adding a rule stating that this new variable generates that symbol. Transforming the CFG (8.15) in this way, using the new variables L and R for the symbols ( and ), results in the following CFG:

S0 → S
S → LSRS | ε
L → (    (8.16)
R → )
3. Break up every rule whose right-hand side has length three or more into a sequence of rules whose right-hand sides have length two. That is, a rule of the form X → Y1Y2 · · · Ym (for m ≥ 3) is replaced by the rules

X → Y1 Z2
Z2 → Y2 Z3
  ⋮    (8.17)
Zm−2 → Ym−2 Zm−1
Zm−1 → Ym−1 Ym

where Z2, . . . , Zm−1 are new auxiliary variables.
Note that we must use separate auxiliary variables for each rule so that there
is no “cross talk” between different rules—so do not reuse the same auxiliary
variables to break up multiple rules.
Transforming the CFG (8.16) in this way results in the following CFG:
S0 → S
S → LZ2 | ε
Z2 → SZ3
Z3 → RS    (8.18)
L → (
R → )
4. Eliminate ε-rules, which are rules of the form X → ε.
Aside from the special case S0 → ε, there is never any need for rules of the form X → ε; you can get the same effect by simply duplicating rules in which X appears on the right-hand side, and directly replacing or not replacing X with ε in each possible combination. You might introduce new ε-rules in this way, but they can be handled recursively; any time a new ε-rule is generated that was already eliminated, it is not added back in.
Transforming the CFG (8.18) in this way results in the following CFG:
S0 → S | ε
S → LZ2
Z2 → SZ3 | Z3
Z3 → RS | R    (8.19)
L → (
R → )
Note that we do end up with the ε-rule S0 → ε, but we do not eliminate this
one because S0 → ε is the special case that we allow as an ε-rule.
5. Eliminate unit rules, which are rules of the form X → Y.
Rules like this are never necessary, and they can be eliminated provided that
we also include the rule X → w in the CFG whenever Y → w appears as a rule.
If you obtain a new unit rule that was already eliminated (or is the unit rule
currently being eliminated), it is not added back in.
Transforming the CFG (8.19) in this way results in the following CFG:
S0 → LZ2 ε
S → LZ2
Z2 → SZ3 RS )
(8.20)
Z3 → RS )
L→(
R→ )
The description above is only meant to give you the basic idea of how the construction works and does not constitute a formal proof of Theorem 8.2. It is possible,
however, to be more formal and precise in describing this construction in order to
obtain a proper proof of Theorem 8.2.
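To tie the five steps together, here is a rough Python sketch of the whole conversion; it is only an illustration of the idea, not a formal proof. The encoding of grammars as dictionaries of tuples, the generated variable names (S0, V_c for a symbol c, and Z2, Z3, and so on), and the assumption that these names do not collide with existing variables are all simplifications. Applied to the CFG (8.14) for BAL, it produces the CFG (8.20), up to the naming of variables.

    from itertools import count

    def to_chomsky_normal_form(rules, start):
        # `rules` maps each variable to a set of right-hand sides; a right-hand
        # side is a tuple of symbols, with () standing for ε. A symbol counts
        # as a variable exactly when it is a key of `rules`.
        rules = {A: set(rhss) for A, rhss in rules.items()}
        fresh = (f"Z{k}" for k in count(2))      # auxiliary names Z2, Z3, ...

        # 1. Introduce a new start variable S0 with the single rule S0 -> start.
        start0 = "S0"
        rules[start0] = {(start,)}

        # 2. In right-hand sides of length >= 2, replace each symbol c by a new
        #    variable V_c and add the rule V_c -> c.
        terminal_var = {}
        for A, rhss in list(rules.items()):
            rules[A] = {tuple(c if c in rules else terminal_var.setdefault(c, f"V_{c}")
                              for c in rhs) if len(rhs) >= 2 else rhs
                        for rhs in rhss}
        for c, V in terminal_var.items():
            rules[V] = {(c,)}

        # 3. Break up right-hand sides of length >= 3 into chains of length-2
        #    rules, using separate auxiliary variables for each rule.
        for A, rhss in list(rules.items()):
            new_rhss = set()
            for rhs in rhss:
                where = new_rhss
                while len(rhs) > 2:
                    Z = next(fresh)
                    where.add((rhs[0], Z))
                    rules[Z] = where = set()
                    rhs = rhs[1:]
                where.add(rhs)
            rules[A] = new_rhss

        # 4. Eliminate ε-rules X -> ε for X other than S0, by duplicating every
        #    rule that mentions X with and without each occurrence of X.
        def without_some(rhs, X):
            # every tuple obtainable from rhs by deleting some occurrences of X
            if not rhs:
                return {()}
            rest = without_some(rhs[1:], X)
            kept = {(rhs[0],) + r for r in rest}
            return kept | rest if rhs[0] == X else kept

        eliminated = set()
        while True:
            X = next((A for A in rules if A != start0 and () in rules[A]), None)
            if X is None:
                break
            rules[X].discard(())
            eliminated.add(X)
            for A, rhss in list(rules.items()):
                rules[A] = {r for rhs in rhss for r in without_some(rhs, X)
                            if r != () or A == start0 or A not in eliminated}

        # 5. Eliminate unit rules X -> Y, adding X -> w whenever Y -> w is a
        #    rule, but never re-adding a unit rule that was already eliminated.
        eliminated_units = set()
        while True:
            unit = next(((A, rhs[0]) for A, rhss in rules.items() for rhs in rhss
                         if len(rhs) == 1 and rhs[0] in rules), None)
            if unit is None:
                break
            A, Y = unit
            rules[A].discard((Y,))
            eliminated_units.add((A, Y))
            for rhs in list(rules[Y]):
                if len(rhs) == 1 and rhs[0] in rules and (A, rhs[0]) in eliminated_units:
                    continue
                rules[A].add(rhs)

        return rules, start0

    # Applying the conversion to the CFG (8.14) for BAL; the result matches
    # (8.20), with V_( and V_) playing the roles of L and R.
    bal = {"S": {("(", "S", ")", "S"), ()}}
    cnf, new_start = to_chomsky_normal_form(bal, "S")
    for A in sorted(cnf):
        print(A, "->", " | ".join(" ".join(rhs) if rhs else "ε" for rhs in sorted(cnf[A])))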
We will make use of the theorem from time to time. In particular, when we are
proving things about context-free languages, it is sometimes extremely helpful to
know that we can always assume that a given context-free language is generated
by a CFG in Chomsky normal form.
Finally, it must be stressed that Chomsky normal form says nothing about ambiguity in general. A CFG in Chomsky normal form may or may not be ambiguous, just as is the case for arbitrary CFGs.