Parsing: Fall 2005 Costas Buch - RPI 1
[Figure: input (program) -> lexical analyzer -> parser -> output (machine code)]
A parser knows the grammar
of the programming language
Parser

Grammar:  E → E + E | E * E | INT
Input:    10 + 2 * 5

Derivation:
E ⇒ E + E
  ⇒ E + E * E
  ⇒ 10 + E * E
  ⇒ 10 + 2 * E
  ⇒ 10 + 2 * 5

Derivation tree:

        E
      / | \
     E  +  E
     |   / | \
    10  E  *  E
        |     |
        2     5

Machine code:
mult a, 2, 5
add b, 10, a
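
The step from the input string to the machine code can be sketched in a few lines. This is only an illustration, not the parser from the slides: it assumes an unambiguous, stratified form of the grammar in which * binds tighter than +, and the names compile_expr and emit, as well as the temporaries a, b, ..., are hypothetical. Malformed input is not handled.

```python
import re
from string import ascii_lowercase


def compile_expr(text):
    """Translate an expression over INT, + and * into three-address code."""
    tokens = re.findall(r"\d+|[+*]", text)
    pos = 0
    code = []
    temps = iter(ascii_lowercase)  # hypothetical temporaries: a, b, c, ...

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def emit(op, left, right):
        target = next(temps)
        code.append(f"{op} {target}, {left}, {right}")
        return target

    def term():
        # term := INT ('*' INT)*
        nonlocal pos
        left = tokens[pos]
        pos += 1
        while peek() == "*":
            pos += 1
            right = tokens[pos]
            pos += 1
            left = emit("mult", left, right)
        return left

    def expr():
        # expr := term ('+' term)*
        nonlocal pos
        left = term()
        while peek() == "+":
            pos += 1
            right = term()
            left = emit("add", left, right)
        return left

    expr()
    return code


for line in compile_expr("10 + 2 * 5"):
    print(line)
```

For the input 10 + 2 * 5 the sketch emits the multiplication first and then the addition, matching the two instructions shown above.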
A simple parser
Exhaustive Parser
[Figure: the exhaustive parser takes a grammar and an input string and outputs a derivation]
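Exhaustive Parser

Grammar:     S → SS | aSb | bSa | λ     (λ denotes the empty string)
Input:       aabb
Derivation:  ?

The parser tries derivations of increasing length, one phase per derivation step.

Phase 2 (derivations of length 2), among them:
S ⇒ aSb ⇒ aSSb
S ⇒ aSb ⇒ aaSbb

Phase 3
S ⇒ aSb ⇒ aaSbb ⇒ aabb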
Final result of exhaustive search
Exhaustive Parser

Grammar:     S → SS | aSb | bSa | λ
Input:       aabb
Derivation:  S ⇒ aSb ⇒ aaSbb ⇒ aabb
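
The phase-by-phase search can be sketched as a breadth-first search over derivations. The sketch below makes several assumptions that are not in the slides: it expands only the leftmost variable, it discards sentential forms that can no longer produce the input (wrong terminal prefix or too many terminals), and it stops after a caller-supplied max_phases. The function name exhaustive_parse and the dictionary encoding of the grammar are hypothetical.

```python
def exhaustive_parse(productions, start, target, max_phases):
    """Return a derivation of target as a list of sentential forms, or None."""

    def leftmost_variable(form):
        for i, symbol in enumerate(form):
            if symbol in productions:
                return i
        return None

    def viable(form):
        # Keep a form only if it still contains a variable, does not
        # already hold too many terminals, and agrees with the target
        # on the terminals to the left of the first variable.
        i = leftmost_variable(form)
        if i is None:
            return False
        terminals = sum(1 for s in form if s not in productions)
        return terminals <= len(target) and target.startswith(form[:i])

    frontier = [[start]]  # all derivations kept from the previous phase
    for _ in range(max_phases):
        next_frontier = []
        for derivation in frontier:
            form = derivation[-1]
            i = leftmost_variable(form)
            for rhs in productions[form[i]]:
                new_form = form[:i] + rhs + form[i + 1:]
                if new_form == target:
                    return derivation + [new_form]
                if viable(new_form):
                    next_frontier.append(derivation + [new_form])
        frontier = next_frontier
    return None


# The grammar of the example; the empty string stands for the empty production.
grammar = {"S": ["SS", "aSb", "bSa", ""]}
print(" => ".join(exhaustive_parse(grammar, "S", "aabb", max_phases=6)))
```

The phase cap is needed because, with productions such as S → SS and an empty production, the search would otherwise never stop on strings that are not in the language.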
The time complexity of exhaustive search
Let k be the number of productions.

Steps for phase 2: at most k · k = k^2
    (derivations from phase 1) × (number of productions)

In general,
Steps for phase i: at most k^(i-1) · k = k^i
    (derivations from phase i-1) × (number of productions)
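
As a worked instance of this bound, take the example grammar with its four productions for S, so k = 4. The numbers below are upper bounds from the formula, not counts of the derivations actually explored.

```latex
\begin{align*}
\text{phase } 1 &: \quad k = 4 \\
\text{phase } 2 &: \quad k \cdot k = 4^{2} = 16 \\
\text{phase } 3 &: \quad k^{2} \cdot k = 4^{3} = 64 \\
\text{phase } i &: \quad k^{\,i-1} \cdot k = k^{\,i} = 4^{\,i}
\end{align*}
```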
Total steps needed for string w:
Extremely bad!!!
Faster Parsers
S-grammar: productions have the form A → av,
where a is a symbol and v is a string of variables

X ∈ F(w) if X ⇒* w
F (w) can be computed recursively:
Write w = uv, where u is a prefix and v is a suffix of w.

If X ∈ F(u) and Y ∈ F(v)        (X ⇒* u and Y ⇒* v)
and Z → XY is a production,
then Z ∈ F(w)                   (Z ⇒ XY ⇒* uY ⇒* uv = w)
Compute F(w) by taking the union over all possible decompositions of w:

Length of prefix   Decomposition                Set of variables that generate w
1                  w = u_1 v_1                  H_1
2                  w = u_2 v_2                  H_2
...                ...                          ...
|w|-1              w = u_(|w|-1) v_(|w|-1)      H_(|w|-1)

Result: F(w) = H_1 ∪ H_2 ∪ ... ∪ H_(|w|-1)
At the basis of the recursion
we have strings of length 1:
F(a) = { X : X → a is a production }, where a is a single symbol
Remark:
The whole algorithm can be implemented
with dynamic programming
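
A minimal sketch of the recursion with memoization, which is one way to get the dynamic-programming behavior mentioned in the remark. It assumes every production has the form Z → XY or X → a, and it uses the grammar S → AB, A → BB | a, B → AB | b from the worked example in these slides. The tables UNIT and PAIR are a hypothetical encoding of that grammar.

```python
from functools import lru_cache

# Hypothetical encoding of S -> AB, A -> BB | a, B -> AB | b
UNIT = {"a": {"A"}, "b": {"B"}}                      # productions X -> symbol
PAIR = {("A", "B"): {"S", "B"}, ("B", "B"): {"A"}}   # productions Z -> XY


@lru_cache(maxsize=None)
def F(w):
    """Set of variables that generate the non-empty string w."""
    if len(w) == 1:
        # Basis of the recursion: strings of length 1.
        return frozenset(UNIT.get(w, set()))
    result = set()
    # Union over all prefix-suffix decompositions w = uv.
    for i in range(1, len(w)):
        u, v = w[:i], w[i:]
        for X in F(u):
            for Y in F(v):
                result |= PAIR.get((X, Y), set())
    return frozenset(result)


print(sorted(F("ab")))
print("S" in F("aabbb"))
```

The cache ensures that F is computed once per distinct substring, so repeated decompositions do not repeat work.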
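Example: substrings of aabbb, by length

Length 1:  a      a      b      b      b
Length 2:  aa     ab     bb     bb
Length 3:  aab    abb    bbb
Length 4:  aabb   abbb
Length 5:  aabbb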
Grammar: S → AB,  A → BB | a,  B → AB | b

         a      a      b      b      b
F( ):    {A}    {A}    {B}    {B}    {B}

         aa     ab     bb     bb

         aab    abb    bbb

         aabb   abbb

         aabbb
Grammar: S → AB,  A → BB | a,  B → AB | b

         a      a       b      b      b
F( ):    {A}    {A}     {B}    {B}    {B}

         aa     ab      bb     bb
F( ):    {}     {S,B}   {A}    {A}

         aab    abb     bbb

         aabb   abbb

         aabbb
Grammar: S → AB,  A → BB | a,  B → AB | b

F(aa):  prefix a, suffix a
        F(a) = {A},  F(a) = {A}

There is no production of the form X → AA.
Thus, F(aa) = {}
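
The table for aabbb can also be filled row by row, shortest substrings first, which is the bottom-up form of the same computation (commonly known as the CYK algorithm). The sketch below is an illustration under the same assumption that all productions have the form Z → XY or X → a. The function name fill_table, the (start, length) keys, and the UNIT and PAIR tables are hypothetical.

```python
# Hypothetical encoding of S -> AB, A -> BB | a, B -> AB | b
UNIT = {"a": {"A"}, "b": {"B"}}
PAIR = {("A", "B"): {"S", "B"}, ("B", "B"): {"A"}}


def fill_table(w, unit, pair):
    """Map (start, length) to the set of variables generating w[start:start+length]."""
    n = len(w)
    table = {}
    # Row for substrings of length 1.
    for start in range(n):
        table[(start, 1)] = set(unit.get(w[start], set()))
    # Rows for lengths 2 .. n, each built from shorter substrings.
    for length in range(2, n + 1):
        for start in range(n - length + 1):
            cell = set()
            for split in range(1, length):  # prefix length
                for X in table[(start, split)]:
                    for Y in table[(start + split, length - split)]:
                        cell |= pair.get((X, Y), set())
            table[(start, length)] = cell
    return table


table = fill_table("aabbb", UNIT, PAIR)
for start in range(4):
    print("aabbb"[start:start + 2], sorted(table[(start, 2)]))
print("S generates aabbb:", "S" in table[(0, 5)])
```

The loop over start prints the length-2 row, which can be compared with the sets {} {S,B} {A} {A} in the table above.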
[Formula residue; surviving labels: number of strings; number of prefix-suffix decompositions for a string]