Cs1622 Parsing Part2 Bun
Derivations vs Parses
A grammar is used to derive strings or to construct a parser.
Rightmost derivation:
E ⇒ E + E ⇒ E + E * E ⇒ E + E * id ⇒ E + id * id ⇒ ... ⇒ id * id + id * id
The same parse tree results from both the rightmost and leftmost derivations in the previous example:
             E
         /   |   \
       E     +     E
     / | \       / | \
    E  *  E     E  *  E
    |     |     |     |
    id    id    id    id
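For comparison, here is a worked leftmost derivation of the same string. It assumes the grammar E → E + E | E * E | id, which is implied by the derivation steps but not stated outright. Each step expands the leftmost non-terminal, and the productions applied are the same ones used in the rightmost derivation, only in a different order, which is why the parse tree is the same.

```latex
\begin{align*}
E &\Rightarrow E + E \\
  &\Rightarrow E * E + E \\
  &\Rightarrow id * E + E \\
  &\Rightarrow id * id + E \\
  &\Rightarrow id * id + E * E \\
  &\Rightarrow id * id + id * E \\
  &\Rightarrow id * id + id * id
\end{align*}
```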
9/18/2012
Parsing
We will study two approaches:
Top-down
• Easier to understand and implement manually
Bottom-up
• More powerful, can be implemented automatically

LL(k): predictive parser for an LL(k) grammar
• Non-recursive, with only k symbols of lookahead
• Table-driven, and therefore efficient

Parsing fails if no production for the start symbol generates the entire input.
Terminals of the derivation are compared against the input:
• Match: advance the input and continue parsing
• Mismatch: backtrack, or fail
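The table-driven, non-recursive idea can be sketched in a few lines for k = 1. This is only an illustration under stated assumptions: the grammar E → int R, R → + int R | ε, the class name `TableParser`, and the string token names are all made up for this example and do not come from the course material. The loop shows both steps described above: a non-terminal on the stack is replaced using the table entry for the lookahead, and a terminal on the stack is compared against the input (match advances, mismatch fails).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical LL(1) sketch; grammar and names are assumptions for illustration.
// Grammar: E -> int R ; R -> + int R | epsilon. "$" marks end of input.
class TableParser {
    // TABLE.get(nonTerminal).get(lookahead) is the right-hand side to expand.
    static final Map<String, Map<String, List<String>>> TABLE = new HashMap<>();
    static {
        TABLE.put("E", Map.of("int", List.of("int", "R")));
        TABLE.put("R", Map.of("+", List.of("+", "int", "R"),
                              "$", List.<String>of())); // R -> epsilon
    }

    static boolean parse(List<String> input) {
        Deque<String> stack = new ArrayDeque<>();
        stack.push("$");
        stack.push("E"); // start symbol on top
        int pos = 0;
        while (!stack.isEmpty()) {
            String top = stack.pop();
            String look = pos < input.size() ? input.get(pos) : "$";
            if (TABLE.containsKey(top)) {
                // Non-terminal: pick the production using one symbol of lookahead.
                List<String> rhs = TABLE.get(top).get(look);
                if (rhs == null) return false; // no table entry: fail
                for (int i = rhs.size() - 1; i >= 0; i--) stack.push(rhs.get(i));
            } else if (top.equals(look)) {
                pos++; // terminal matches: advance input
            } else {
                return false; // terminal mismatch: fail
            }
        }
        // Success only if every token and the end marker were matched.
        return pos == input.size() + 1;
    }
}
```

Under these assumptions, `TableParser.parse(List.of("int", "+", "int"))` accepts, while `List.of("int", "+")` is rejected because the table has no way to match `int` against the end of input.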
Implementation
Create a procedure for each non-terminal:
1. Checks if the input symbol matches a terminal symbol in the grammar rule
2. Calls other procedures when non-terminals are part of the rule
3. If the end of the procedure is reached, success is reported to the caller

E → int | ( E ) | E + E

void E() {
    switch(lexer.yylex()) {
        case INT: eat(INT); break;
        case LPAREN: eat(LPAREN); E(); eat(RPAREN); break;
        case ???: E(); eat(PLUS); E(); break;  // no single token selects this alternative
    }
}

Problems
Unclear what to label the last case with.
What if we don’t label it at all and make it the default?
Consider parsing 5 + 5: we’d find INT and finish the parse with input still left to consume. We’d want to backtrack, but there’s no prior function call to return to.
What if we put the call to E() prior to the switch/case?
Then E() would always make a recursive call to E() with no end case for the recursion.
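The "make it the default" problem can be reproduced with a small runnable sketch. It deliberately copies the flawed design rather than fixing it. The class `NaiveParser`, the `Tok` enum, the token list standing in for a lexer, and the `peek()` helper (used instead of calling `lexer.yylex()` in the switch) are all assumptions for illustration.

```java
import java.util.List;

// Hypothetical sketch of the flawed parser for E -> int | ( E ) | E + E.
class NaiveParser {
    enum Tok { INT, LPAREN, RPAREN, PLUS, EOF }

    private final List<Tok> tokens;
    private int pos = 0;

    NaiveParser(List<Tok> tokens) { this.tokens = tokens; }

    // Look at the current token without consuming it.
    private Tok peek() { return pos < tokens.size() ? tokens.get(pos) : Tok.EOF; }

    // Consume the current token if it is the expected one.
    private void eat(Tok expected) {
        if (peek() != expected) {
            throw new IllegalStateException("expected " + expected + " but saw " + peek());
        }
        pos++;
    }

    void E() {
        switch (peek()) {
            case INT: eat(Tok.INT); break;
            case LPAREN: eat(Tok.LPAREN); E(); eat(Tok.RPAREN); break;
            // Flawed on purpose: never chosen when the input starts with INT,
            // and on any other token it recurses without consuming input.
            default: E(); eat(Tok.PLUS); E(); break;
        }
    }

    // Runs E() once and reports how many tokens were left unconsumed.
    static int leftoverAfterParse(List<Tok> tokens) {
        NaiveParser p = new NaiveParser(tokens);
        p.E();
        return tokens.size() - p.pos;
    }
}
```

For the token sequence INT PLUS INT (that is, 5 + 5), `leftoverAfterParse` reports two unconsumed tokens: E() returns after the first INT, and no caller is left to try the E + E alternative.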
Recursive descent parsers cannot deal with left recursion.
However, we can rewrite the grammar to represent the same language without the need for left recursion.
For example, the left-recursive grammar A → A x | y can be rewritten by changing the grammar to:
A → y A’
A’ → x A’ | ε
Not all left recursion is immediate; it may be hidden across multiple production rules:
A → BC | D
B → AE | F
There is a general approach for removing indirect left recursion, but we’ll not worry about it for this course.
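As a sketch of how the rewrite helps, the earlier expression grammar E → int | ( E ) | E + E can be restated without left recursion as E → T E’, E’ → + T E’ | ε, T → int | ( E ). This particular restatement, the class `ExprParser`, the `Tok` enum, and the token list standing in for a lexer are assumptions for illustration, not the course's reference solution. Each non-terminal becomes one procedure, and every procedure consumes a token before recursing.

```java
import java.util.List;

// Hypothetical recursive descent parser for the rewritten grammar:
//   E  -> T E'
//   E' -> + T E' | epsilon
//   T  -> int | ( E )
class ExprParser {
    enum Tok { INT, LPAREN, RPAREN, PLUS, EOF }

    private final List<Tok> tokens;
    private int pos = 0;

    ExprParser(List<Tok> tokens) { this.tokens = tokens; }

    // Look at the current token without consuming it.
    private Tok peek() { return pos < tokens.size() ? tokens.get(pos) : Tok.EOF; }

    // Consume the current token if it is the expected one.
    private void eat(Tok expected) {
        if (peek() != expected) {
            throw new IllegalStateException("expected " + expected + " but saw " + peek());
        }
        pos++;
    }

    private void E() { T(); EPrime(); }

    private void EPrime() {
        if (peek() == Tok.PLUS) { // E' -> + T E'
            eat(Tok.PLUS);
            T();
            EPrime();
        }
        // Otherwise E' -> epsilon: consume nothing.
    }

    private void T() {
        switch (peek()) {
            case INT: eat(Tok.INT); break;
            case LPAREN: eat(Tok.LPAREN); E(); eat(Tok.RPAREN); break;
            default: throw new IllegalStateException("unexpected " + peek());
        }
    }

    // True if the whole token list is derived from E.
    static boolean accepts(List<Tok> tokens) {
        ExprParser p = new ExprParser(tokens);
        try {
            p.E();
            return p.peek() == Tok.EOF; // fail if input remains
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

With this grammar, the sequence INT PLUS INT (5 + 5) is consumed in full: T() eats the first INT, and EPrime() sees PLUS and continues instead of returning early.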