ToA - Lecture 17 18 - Ambiguity CNF

Uploaded by

wosqa nisar

Derivation Trees

43
S → AB A → aaA | λ B → Bb | λ

S ⇒ AB
S

A B

44
S → AB A → aaA | λ B → Bb | λ

S ⇒ AB ⇒ aaAB
S

A B

a a A

45
S → AB A → aaA | λ B → Bb | λ

S ⇒ AB ⇒ aaAB ⇒ aaABb
S

A B

a a A B b

46
S → AB A → aaA | λ B → Bb | λ

S ⇒ AB ⇒ aaAB ⇒ aaABb ⇒ aaBb


S

A B

a a A B b

λ
47
S → AB A → aaA | λ B → Bb | λ

S ⇒ AB ⇒ aaAB ⇒ aaABb ⇒ aaBb ⇒ aab


Derivation Tree S

A B

a a A B b

λ λ
48
S → AB     A → aaA | λ     B → Bb | λ

S ⇒ AB ⇒ aaAB ⇒ aaABb ⇒ aaBb ⇒ aab

Derivation tree:

          S
        /   \
       A     B
     / | \   | \
    a  a  A  B  b
          |  |
          λ  λ

Yield: aaλλb = aab
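The yield of a derivation tree is read off its leaves from left to right, with λ leaves contributing nothing. A minimal sketch of that idea, assuming a tree is stored as nested (label, children) pairs; the names `tree_yield` and `leaf` are illustrative only and do not come from the slides:

```python
LAMBDA = "λ"

def leaf(symbol):
    """A leaf node: a terminal or λ, with no children."""
    return (symbol, [])

def tree_yield(node):
    """Concatenate the leaves of the tree from left to right, skipping λ."""
    label, children = node
    if not children:
        return "" if label == LAMBDA else label
    return "".join(tree_yield(child) for child in children)

# Derivation tree for S ⇒ AB ⇒ aaAB ⇒ aaABb ⇒ aaBb ⇒ aab
tree = ("S", [
    ("A", [leaf("a"), leaf("a"), ("A", [leaf(LAMBDA)])]),
    ("B", [("B", [leaf(LAMBDA)]), leaf("b")]),
])

print(tree_yield(tree))  # aab
```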
49
Ambiguity

50
E → E + E | E ∗ E | (E) | a

String: a + a ∗ a

Leftmost derivation:
E ⇒ E + E ⇒ a + E ⇒ a + E ∗ E ⇒ a + a ∗ E ⇒ a + a ∗ a

Derivation tree:

      E
    / | \
   E  +  E
   |   / | \
   a  E  ∗  E
      |     |
      a     a
51
E → E + E | E ∗ E | (E) | a

String: a + a ∗ a

Leftmost derivation:
E ⇒ E ∗ E ⇒ E + E ∗ E ⇒ a + E ∗ E ⇒ a + a ∗ E ⇒ a + a ∗ a

Derivation tree:

        E
      / | \
     E  ∗  E
   / | \   |
  E  +  E  a
  |     |
  a     a
52
E → E + E | E ∗ E | (E) | a

String: a + a ∗ a

Two derivation trees:

      E                    E
    / | \                / | \
   E  +  E              E  ∗  E
   |   / | \          / | \   |
   a  E  ∗  E        E  +  E  a
      |     |        |     |
      a     a        a     a
53
The grammar E → E + E | E ∗ E | (E) | a
is ambiguous:

string a + a ∗ a has two derivation trees

      E                    E
    / | \                / | \
   E  +  E              E  ∗  E
   |   / | \          / | \   |
   a  E  ∗  E        E  +  E  a
      |     |        |     |
      a     a        a     a
54
The grammar E → E + E | E ∗ E | (E) | a
is ambiguous:

string a + a ∗ a has two leftmost derivations

E ⇒ E + E ⇒ a + E ⇒ a + E ∗ E ⇒ a + a ∗ E ⇒ a + a ∗ a

E ⇒ E ∗ E ⇒ E + E ∗ E ⇒ a + E ∗ E ⇒ a + a ∗ E ⇒ a + a ∗ a
Definition:
A context-free grammar G is ambiguous
if some string w ∈ L(G) has
two or more derivation trees.

56
In other words:

A context-free grammar G is ambiguous
if some string w ∈ L(G) has
two or more leftmost derivations
(or two or more rightmost derivations).
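For the expression grammar E → E + E | E ∗ E | (E) | a, the definition can be checked on small strings by counting derivation trees directly. The following is a rough sketch specific to this one grammar, not a general ambiguity test (ambiguity of context-free grammars is undecidable in general). The function name `count_trees` is illustrative, and ASCII `*` stands in for ∗:

```python
from functools import lru_cache

def count_trees(s):
    """Count derivation trees of s under E → E + E | E * E | (E) | a."""

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of derivation trees with root E and yield s[i:j].
        if j - i == 1:
            return 1 if s[i] == "a" else 0          # E → a
        total = 0
        if j - i >= 3 and s[i] == "(" and s[j - 1] == ")":
            total += count(i + 1, j - 1)            # E → (E)
        for k in range(i + 1, j - 1):
            if s[k] in "+*":                        # E → E + E or E → E * E
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(s)) if s else 0

print(count_trees("a+a*a"))    # 2, so the grammar is ambiguous
print(count_trees("(a+a)*a"))  # 1
```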

57
Why do we care about ambiguity?

a + a∗a
take a=2
E E

E + E E ∗ E

a E ∗ E E + E a

a a a a
58
2 + 2∗2

E E

E + E E ∗ E

2 E ∗ E E + E 2

2 2 2 2
59
The two derivation trees give different values:

Tree with + at the root:   2 + (2 ∗ 2) = 2 + 4 = 6
Tree with ∗ at the root:   (2 + 2) ∗ 2 = 4 ∗ 2 = 8
60
Correct result: 2 + 2 ∗ 2 = 6

This is the value of the tree with + at the root:

          E (6)
        / | \
  (2) E   +   E (4)
      |     / | \
      2    E  ∗  E
           |     |
           2     2
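The effect of the two trees on the computed value can be reproduced with a small evaluator. This is only a sketch: the trees are written by hand as nested tuples (left, operator, right), ASCII `*` stands in for ∗, and the names `evaluate`, `plus_at_root` and `times_at_root` are illustrative:

```python
def evaluate(tree, value_of_a):
    """Evaluate an expression tree bottom-up; the leaf "a" takes value_of_a."""
    if tree == "a":
        return value_of_a
    left, op, right = tree
    l = evaluate(left, value_of_a)
    r = evaluate(right, value_of_a)
    return l + r if op == "+" else l * r

# The two derivation trees of a + a * a
plus_at_root = ("a", "+", ("a", "*", "a"))    # a + (a * a)
times_at_root = (("a", "+", "a"), "*", "a")   # (a + a) * a

print(evaluate(plus_at_root, 2))   # 6
print(evaluate(times_at_root, 2))  # 8
```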
61
• Ambiguity is bad for programming languages

• We want to remove ambiguity

62
Another Ambiguous Grammar

IF_STMT → if EXPR then STMT


| if EXPR then STMT else STMT

63
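if expr1 then if expr2 then stmt1 else stmt2

has two derivation trees.

Tree 1 (else belongs to the inner if):

IF_STMT
  if expr1 then STMT
                  if expr2 then stmt1 else stmt2

Tree 2 (else belongs to the outer if):

IF_STMT
  if expr1 then STMT else stmt2
                  if expr2 then stmt1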


64
Inherent Ambiguity

Some context-free languages
have only ambiguous grammars

Example: L = {a^n b^n c^m} ∪ {a^n b^m c^m}

S → S1 | S2
S1 → S1c | A      S2 → aS2 | B
A → aAb | λ       B → bBc | λ
65
The string a^n b^n c^n
has two derivation trees

      S                S
      |                |
      S1               S2
     /  \             /  \
    S1   c           a    S2

66
Simplifications
of
Context-Free Grammars

67
A Substitution Rule

Original grammar:
S → aB
A → aaA
A → abBc
B → aA
B → b

Substitute B → b

Equivalent grammar:
S → aB | ab
A → aaA
A → abBc | abbc
B → aA
68
A Substitution Rule

Current grammar:
S → aB | ab
A → aaA
A → abBc | abbc
B → aA

Substitute B → aA

Equivalent grammar:
S → aB | ab | aaA
A → aaA
A → abBc | abbc | abaAc
69
In general:

A → xBz
B → y1

Substitute B → y1

Equivalent grammar:
A → xBz | xy1z
70
Nullable Variables

λ-production: A → λ

Nullable variable: A ⇒ … ⇒ λ

71
Removing Nullable Variables

Example Grammar:

S → aMb
M → aMb
M →λ

Nullable variable

72
Original grammar:
S → aMb
M → aMb
M → λ

Substitute M → λ

Final grammar:
S → aMb
S → ab
M → aMb
M → ab
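The removal of nullable variables can be sketched in code. This is an illustrative version under stated assumptions: a grammar is a list of (head, body) pairs, a body is a tuple of symbols, the empty tuple stands for λ, and the language is assumed not to contain λ (otherwise a production for the start variable would be lost). The function names are hypothetical.

```python
from itertools import product

def nullable_variables(rules):
    """Variables A with A ⇒ … ⇒ λ, found by iterating to a fixed point."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in nullable and all(s in nullable for s in body):
                nullable.add(head)
                changed = True
    return nullable

def remove_lambda_productions(rules):
    """For each production, add every variant with nullable symbols dropped."""
    nullable = nullable_variables(rules)
    result = []
    for head, body in rules:
        # A nullable symbol may be kept or dropped; other symbols are kept.
        options = [((s,), ()) if s in nullable else ((s,),) for s in body]
        for choice in product(*options):
            new_body = tuple(sym for part in choice for sym in part)
            if new_body and (head, new_body) not in result:
                result.append((head, new_body))
    return result

grammar = [("S", ("a", "M", "b")), ("M", ("a", "M", "b")), ("M", ())]
for head, body in remove_lambda_productions(grammar):
    print(head, "→", "".join(body))
```

On the example grammar this prints S → aMb, S → ab, M → aMb, M → ab, which agrees with the final grammar shown above.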

73
Unit-Productions

Unit production: A → B

(a single variable on both sides)

74
Removing Unit Productions

Observation:

A production of the form A → A can be removed immediately.

75
Example Grammar:

S → aA
A→a
A→ B
B→A
B → bb

76
Step 1: substitute A → B

S → aA | aB
A → a
B → A | B
B → bb

Step 2: remove B → B

S → aA | aB
A → a
B → A
B → bb

Step 3: substitute B → A

S → aA | aB | aA
A → a
B → bb

Step 4: remove repeated productions

Final grammar:
S → aA | aB
A → a
B → bb
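Unit productions can also be removed mechanically. The sketch below uses the common unit-closure formulation, not the step-by-step substitution from the slides, so its output is an equivalent grammar but not necessarily the same list of productions. The representation (list of (head, body) pairs plus a set of variables) and the function name are assumptions made for illustration.

```python
def remove_unit_productions(rules, variables):
    """Replace unit productions A → B by the non-unit productions of B."""
    def is_unit(body):
        return len(body) == 1 and body[0] in variables

    # unit[A] = variables reachable from A using unit productions only
    unit = {A: {A} for A in variables}
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if is_unit(body):
                for A in variables:
                    if head in unit[A] and body[0] not in unit[A]:
                        unit[A].add(body[0])
                        changed = True

    result = []
    for A in sorted(variables):
        for head, body in rules:
            if head in unit[A] and not is_unit(body) and (A, body) not in result:
                result.append((A, body))
    return result

grammar = [("S", ("a", "A")), ("A", ("a",)), ("A", ("B",)),
           ("B", ("A",)), ("B", ("b", "b"))]
for head, body in remove_unit_productions(grammar, {"S", "A", "B"}):
    print(head, "→", "".join(body))
```

For the example grammar this yields S → aA, A → a | bb, B → a | bb. Both this result and the final grammar above generate {aa, abb}; here B simply becomes unreachable instead of being referenced from S.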

80
Useless Productions

S → aSb
S → λ
S → A
A → aA     (useless production)

Some derivations never terminate:

S ⇒ A ⇒ aA ⇒ aaA ⇒ … ⇒ aa…aA ⇒ …
81
Another grammar:

S → A
A → aA
A → λ
B → bA     (useless production: B is not reachable from S)

82
In general:

if S ⇒ … ⇒ xAy ⇒ … ⇒ w, where w ∈ L(G) contains only terminals,

then variable A is useful;

otherwise, variable A is useless.

83
A production A → x is useless
if any of its variables is useless

Productions:
S → aSb
S → λ
S → A       useless
A → aA      useless
B → C       useless
C → D       useless

Useless variables: A, B, C
(D, which has no productions, is useless as well)


84
Removing Useless Productions

Example Grammar:

S → aS | A | C
A→a
B → aa
C → aCb

85
First: find all variables that can produce
strings with only terminals

S → aS | A | C
A → a
B → aa
C → aCb

Round 1: {A, B}        (A → a, B → aa)
Round 2: {A, B, S}     (S → A)

86
Keep only the variables
that produce terminal strings: {A, B, S}
(the remaining variables are useless)

Before:                  After removing useless productions:
S → aS | A | C           S → aS | A
A → a                    A → a
B → aa                   B → aa
C → aCb
87
Second: find all variables
reachable from S

Use a dependency graph

S → aS | A
A → a
B → aa

Dependency graph: S → A; B has no incoming edge, so B is not reachable

88
Keep only the variables
reachable from S
(the remaining variables are useless)

Before:              Final grammar, after removing useless productions:
S → aS | A           S → aS | A
A → a                A → a
B → aa
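The two passes (first the variables that produce terminal strings, then the variables reachable from S) can be sketched as follows. The representation and the name `remove_useless` are assumptions for illustration. Any symbol not in `variables` is treated as a terminal.

```python
def remove_useless(rules, variables, start):
    """Drop productions that mention non-generating or unreachable variables."""
    def all_generating(body, generating):
        return all(s in generating or s not in variables for s in body)

    # First: variables that can produce strings with only terminals
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in generating and all_generating(body, generating):
                generating.add(head)
                changed = True
    rules = [(h, b) for h, b in rules
             if h in generating and all_generating(b, generating)]

    # Second: variables reachable from the start variable
    reachable = {start}
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for head, body in rules:
            if head == current:
                for s in body:
                    if s in variables and s not in reachable:
                        reachable.add(s)
                        frontier.append(s)
    return [(h, b) for h, b in rules if h in reachable]

grammar = [("S", ("a", "S")), ("S", ("A",)), ("S", ("C",)),
           ("A", ("a",)), ("B", ("a", "a")), ("C", ("a", "C", "b"))]
for head, body in remove_useless(grammar, {"S", "A", "B", "C"}, "S"):
    print(head, "→", "".join(body))
```

On the example grammar the result is S → aS, S → A, A → a. The order of the two passes matters: running the reachability pass first can leave behind variables that become unreachable only after the non-generating ones are removed.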

89
Removing All

Step 1: Remove Nullable Variables

Step 2: Remove Unit-Productions

Step 3: Remove Useless Variables

90
Normal Forms
for
Context-free Grammars

91
Chomsky Normal Form

Each production has the form:

A → BC   or   A → a

where B and C are variables and a is a terminal.

92
Examples:

Chomsky Normal Form:        Not Chomsky Normal Form:
S → AS                      S → AS
S → a                       S → AAS
A → SA                      A → SA
A → b                       A → aa

93
Conversion to Chomsky Normal Form

Example (not in Chomsky Normal Form):

S → ABa
A → aab
B → Ac

94
Introduce variables for terminals: Ta, Tb, Tc

Before:          After:
S → ABa          S → ABTa
A → aab          A → TaTaTb
B → Ac           B → ATc
                 Ta → a
                 Tb → b
                 Tc → c
95
Introduce intermediate variable: V1

Before:            After:
S → ABTa           S → AV1
A → TaTaTb         V1 → BTa
B → ATc            A → TaTaTb
Ta → a             B → ATc
Tb → b             Ta → a
Tc → c             Tb → b
                   Tc → c
96
Introduce intermediate variable: V2

Before:            After:
S → AV1            S → AV1
V1 → BTa           V1 → BTa
A → TaTaTb         A → TaV2
B → ATc            V2 → TaTb
Ta → a             B → ATc
Tb → b             Ta → a
Tc → c             Tb → b
                   Tc → c
Final grammar in Chomsky Normal Form:

Initial grammar:     Final grammar:
S → ABa              S → AV1
A → aab              V1 → BTa
B → Ac               A → TaV2
                     V2 → TaTb
                     B → ATc
                     Ta → a
                     Tb → b
                     Tc → c
In general:

From any context-free grammar
(which doesn’t produce λ)
that is not in Chomsky Normal Form,

we can obtain
an equivalent grammar
in Chomsky Normal Form.

99
The Procedure

First remove:

Nullable variables

Unit productions

100
Then, for every terminal symbol a:

Introduce a new variable Ta

Add production Ta → a

In every production whose right side has two or more symbols: replace a with Ta
101
Replace any production A → C1C2…Cn (n > 2)

with A → C1V1
     V1 → C2V2
     …
     Vn−2 → Cn−1Cn

New intermediate variables: V1, V2, …, Vn−2
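The procedure can be sketched in code under a few assumptions: nullable variables and unit productions have already been removed, a grammar is a list of (head, body) pairs with bodies as tuples of symbols, and the generated names "T" + terminal and "V1", "V2", … do not clash with existing variables. The function name `to_cnf` is illustrative.

```python
def to_cnf(rules, variables):
    """Convert a grammar without λ- and unit productions to Chomsky Normal Form."""
    # Step 1: in bodies of length >= 2, replace each terminal a by a variable Ta
    terminal_var = {}
    step1 = []
    for head, body in rules:
        if len(body) >= 2:
            new_body = []
            for s in body:
                if s in variables:
                    new_body.append(s)
                else:
                    terminal_var.setdefault(s, "T" + s)
                    new_body.append(terminal_var[s])
            body = tuple(new_body)
        step1.append((head, body))

    # Step 2: break bodies longer than 2 using intermediate variables V1, V2, ...
    result = []
    counter = 0
    for head, body in step1:
        while len(body) > 2:
            counter += 1
            v = "V" + str(counter)
            result.append((head, (body[0], v)))
            head, body = v, body[1:]
        result.append((head, body))

    # Productions Ta → a for the new terminal variables
    for a, t in terminal_var.items():
        result.append((t, (a,)))
    return result

grammar = [("S", ("A", "B", "a")), ("A", ("a", "a", "b")), ("B", ("A", "c"))]
for head, body in to_cnf(grammar, {"S", "A", "B"}):
    print(head, "→", " ".join(body))
```

For the example S → ABa, A → aab, B → Ac this produces S → A V1, V1 → B Ta, A → Ta V2, V2 → Ta Tb, B → A Tc, Ta → a, Tb → b, Tc → c.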


102
Theorem: For any context-free grammar
(which doesn’t produce λ )
there is an equivalent grammar
in Chomsky Normal Form

103
Observations

• Chomsky normal forms are good
  for parsing and proving theorems

• It is very easy to find the Chomsky normal
  form for any context-free grammar

104
Greibach Normal Form

All productions have the form:

A → a V1V2…Vk     k ≥ 0

where a is a terminal symbol and V1, …, Vk are variables.

105
Observations

• Greibach normal forms are very good
  for parsing

• It is hard to find the Greibach normal
  form of any context-free grammar

106
Compilers

107
Program:

v = 5;
if (v>5)
  x = 12 + v;
while (x !=3) {
  x = x - 3;
  v = 10;
}
......

Compiler ↓

Machine Code:

Add v,v,0
cmp v,5
jmplt ELSE
THEN:
add x, 12,v
ELSE:
WHILE:
cmp x,3
...
108
Compiler

input program → [ lexical analyzer → parser ] → output machine code
109
A parser knows the grammar
of the programming language

110
Parser
PROGRAM → STMT_LIST
STMT_LIST → STMT; STMT_LIST | STMT;
STMT → EXPR | IF_STMT | WHILE_STMT
| { STMT_LIST }

EXPR → EXPR + EXPR | EXPR - EXPR | ID


IF_STMT → if (EXPR) then STMT
| if (EXPR) then STMT else STMT
WHILE_STMT→ while (EXPR) do STMT

111
The parser finds the derivation
of a particular input

Grammar known to the parser:
E -> E + E
   | E * E
   | INT

Input: 10 + 2 * 5

Derivation:
E => E + E
  => E + E * E
  => 10 + E * E
  => 10 + 2 * E
  => 10 + 2 * 5

112
Derivation:
E => E + E
  => E + E * E
  => 10 + E * E
  => 10 + 2 * E
  => 10 + 2 * 5

Derivation tree:

      E
    / | \
   E  +  E
   |   / | \
  10  E  *  E
      |     |
      2     5

113
Derivation tree:

      E
    / | \
   E  +  E
   |   / | \
  10  E  *  E
      |     |
      2     5

Machine code:

mult a, 2, 5
add b, 10, a

114
Parsing

115
(grammar, input string) → Parser → derivation

116
Example:

Grammar:
S → SS
S → aSb
S → bSa
S → λ

Input: aabb

Parser → derivation ?

117
Exhaustive Search

S → SS | aSb | bSa | λ

Find derivation of aabb

Phase 1: all possible derivations of length 1

S ⇒ SS
S ⇒ aSb
S ⇒ bSa
S ⇒ λ
118
Find derivation of aabb

S ⇒ SS
S ⇒ aSb
S ⇒ bSa     (discarded: cannot lead to aabb)
S ⇒ λ       (discarded: cannot lead to aabb)

119
S → SS | aSb | bSa | λ

Find derivation of aabb

Phase 1 (kept):
S ⇒ SS
S ⇒ aSb

Phase 2:
S ⇒ SS ⇒ SSS
S ⇒ SS ⇒ aSbS
S ⇒ SS ⇒ bSaS
S ⇒ SS ⇒ S
S ⇒ aSb ⇒ aSSb
S ⇒ aSb ⇒ aaSbb
S ⇒ aSb ⇒ abSab
S ⇒ aSb ⇒ ab
S → SS | aSb | bSa | λ

Find derivation of aabb

Phase 2 (kept):
S ⇒ SS ⇒ SSS
S ⇒ SS ⇒ aSbS
S ⇒ SS ⇒ S
S ⇒ aSb ⇒ aSSb
S ⇒ aSb ⇒ aaSbb

Phase 3:
S ⇒ aSb ⇒ aaSbb ⇒ aabb
121
Final result of exhaustive search
(top-down parsing)

Grammar:
S → SS
S → aSb
S → bSa
S → λ

Input: aabb

Derivation found by the parser:

S ⇒ aSb ⇒ aaSbb ⇒ aabb
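The phase-by-phase search can be sketched as a breadth-first search over leftmost derivations. Several details here are assumptions made for illustration and are not taken from the slides: variables are single uppercase letters, λ is the empty string, sentential forms already seen are skipped, and a form is discarded when its terminal prefix disagrees with the input or it contains more terminals than the input. The bound `max_phases` is needed because, with λ-productions present, the search need not stop on its own.

```python
def exhaustive_parse(rules, start, w, max_phases):
    """Return a leftmost derivation of w as a list of sentential forms, or None."""
    def first_variable(form):
        for k, ch in enumerate(form):
            if ch.isupper():
                return k
        return None

    frontier = [[start]]   # each entry is a derivation found so far
    seen = {start}
    for _ in range(max_phases):
        next_frontier = []
        for derivation in frontier:
            form = derivation[-1]
            i = first_variable(form)
            for rhs in rules[form[i]]:
                new = form[:i] + rhs + form[i + 1:]
                if new in seen:
                    continue
                seen.add(new)
                if new == w:
                    return derivation + [new]
                j = first_variable(new)
                if j is None:
                    continue  # a terminal string different from w
                terminals = sum(1 for ch in new if not ch.isupper())
                if terminals > len(w) or not w.startswith(new[:j]):
                    continue  # this form cannot lead to w
                next_frontier.append(derivation + [new])
        frontier = next_frontier
    return None

rules = {"S": ["SS", "aSb", "bSa", ""]}
print(exhaustive_parse(rules, "S", "aabb", max_phases=3))
# ['S', 'aSb', 'aaSbb', 'aabb']
```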


122
Time complexity of exhaustive search

Suppose there are no productions of the form

A → λ
A → B

Number of phases for string w: approx. |w|

123
For a grammar with k rules:

Time for phase 1: k
(k possible derivations)

Time for phase 2: k^2
(k^2 possible derivations)

Time for phase |w|: k^|w|
(a total of k^|w| possible derivations)

126
Total time needed for string w:

k + k^2 + … + k^|w|

(phase 1 + phase 2 + … + phase |w|)

Extremely bad!!!
127
For general context-free grammars:

There exists a parsing algorithm
that parses a string w
in time |w|^3

The CYK parser
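A minimal sketch of the CYK idea, assuming the grammar is already in Chomsky Normal Form and is given as two lists: (A, a) for productions A → a and (A, B, C) for productions A → BC. The function name and representation are illustrative. The example grammar is the Chomsky Normal Form grammar obtained earlier from S → ABa, A → aab, B → Ac, which derives only the string aabaabca.

```python
def cyk(terminal_rules, binary_rules, start, w):
    """Return True if the CNF grammar derives w (membership test only)."""
    n = len(w)
    if n == 0:
        return False  # a grammar in this form cannot produce λ
    # table[i][length] = variables that derive the substring w[i:i+length]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(w):
        table[i][1] = {A for A, a in terminal_rules if a == ch}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split]
                right = table[i + split][length - split]
                for A, B, C in binary_rules:
                    if B in left and C in right:
                        table[i][length].add(A)
    return start in table[0][n]

terminal_rules = [("Ta", "a"), ("Tb", "b"), ("Tc", "c")]
binary_rules = [("S", "A", "V1"), ("V1", "B", "Ta"), ("A", "Ta", "V2"),
                ("V2", "Ta", "Tb"), ("B", "A", "Tc")]

print(cyk(terminal_rules, binary_rules, "S", "aabaabca"))  # True
print(cyk(terminal_rules, binary_rules, "S", "aab"))       # False
```

The three nested loops over length, start position and split point give the |w|^3 factor; the innermost loop over productions adds a factor that depends only on the grammar.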

128
