Module 4 - Top Down Parsing
Compiler Design
MODULE – 2
Dr. WI. Sureshkumar
Associate Professor
School of Computer Science and Engineering (SCOPE)
VIT Vellore
[email protected]
SJT413A34
Top-Down Parsing
• The parse tree is created from the top (root) to the bottom (leaves).
• Top-down parsers:
  • Recursive-descent parsing
    • Backtracking may be needed: if a choice of production rule does not work, we backtrack and try the other alternatives.
    • It is a general parsing technique, but not widely used.
    • It is not efficient.
  • Predictive parsing
    • No backtracking.
    • Needs a special form of grammar (LL(1) grammars).
    • The non-recursive (table-driven) predictive parser is also known as the LL(1) parser.
Recursive-Descent Parsing (uses Backtracking)
• Backtracking is needed.
• It tries to find the left-most derivation.
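Grammar:
S → cAd
A → ab | a
Input: cad
First attempt, using A → ab (fails on b):
      S
    / | \
   c  A  d
     / \
    a   b
After backtracking, using A → a (succeeds):
      S
    / | \
   c  A  d
      |
      a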
Recursive Descent Parser
• Now, we have a match for the second input symbol “a”, so we
advance the input pointer to “d”, the third input symbol, and compare
d against the next leaf “b”.
• Backtracking
• Since “b” does not match “d”, we report failure and go back to A to see
whether there is another alternative for A that has not been tried - that might
produce a match
• In going back to A, we must reset the input pointer to “a”.
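The backtracking steps above can be sketched in a few lines of Python. This is only an illustrative sketch for the grammar S → cAd, A → ab | a; the names GRAMMAR, parse, parse_sequence and accepts are hypothetical and not part of the slides. Each call receives the input position as a value, so trying the next alternative automatically restarts from the saved position, which plays the role of resetting the input pointer.

```python
# Grammar from the slide: S -> c A d,  A -> a b | a
# (terminals are single characters; any symbol not in GRAMMAR is a terminal)
GRAMMAR = {
    "S": [["c", "A", "d"]],
    "A": [["a", "b"], ["a"]],
}


def parse(symbol, text, pos):
    """Yield every input position reachable after deriving `symbol` from text[pos:]."""
    if symbol not in GRAMMAR:
        # Terminal: it must match the input at the current position.
        if text.startswith(symbol, pos):
            yield pos + len(symbol)
        return
    # Non-terminal: try each alternative in order. When one alternative
    # fails later on, the generator resumes here with the same `pos`,
    # which is the backtracking step.
    for alternative in GRAMMAR[symbol]:
        yield from parse_sequence(alternative, text, pos)


def parse_sequence(symbols, text, pos):
    """Yield every position reachable after deriving all `symbols` in order."""
    if not symbols:
        yield pos
        return
    first, rest = symbols[0], symbols[1:]
    for next_pos in parse(first, text, pos):
        yield from parse_sequence(rest, text, next_pos)


def accepts(text):
    """True if the whole input can be derived from the start symbol S."""
    return any(end == len(text) for end in parse("S", text, 0))


print(accepts("cad"))   # A -> ab fails on "d", then A -> a is tried
print(accepts("cabd"))  # A -> ab succeeds directly
```

Note that a left-recursive grammar would make this sketch recurse forever, which is one reason left recursion is removed before top-down parsing.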
Creating a top-down parser
Top-down parsing can be viewed as the problem of
constructing a parse tree for the input string, starting from the
root and creating the nodes of the parse tree in preorder.
Example
• Given the grammar :
• E → TE’
• E’ → +TE’ | λ
• T → FT’
• T’ → *FT’ | λ
• F → (E) | id
• Here λ denotes the empty string.
• The input: id + id * id
Predictive parsers
• The class of grammars for which we can construct predictive parsers
looking k symbols ahead in the input is called the LL(k) class.
• The first “L” stands for scanning input from left to right.
• The second “L” for producing a leftmost derivation.
• The “1” for using one input symbol of look-ahead at each step to
make parsing decisions.
LL(1) Grammars
• A grammar whose parsing table has no multiply-defined entries is
said to be LL(1) grammar.
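For the expression grammar E → TE’, E’ → +TE’ | λ, T → FT’, T’ → *FT’ | λ, F → (E) | id:
FOLLOW(E) = { $, ) }, since E is the start symbol and F → (E) | id
FOLLOW(E’) = { $, ) }, since E → TE’ gives FOLLOW(E) ⊆ FOLLOW(E’)
FOLLOW(T) = { +, ), $ }, since FIRST(E’) = { +, λ } and FOLLOW(E) = { $, ) }
FOLLOW(T’) = { +, ), $ }, since T → FT’ gives FOLLOW(T) ⊆ FOLLOW(T’)
FOLLOW(F) = { *, +, ), $ }, since FIRST(T’) = { *, λ } and FOLLOW(T) ⊆ FOLLOW(F)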
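The FOLLOW sets listed above can be reproduced with a small fixed-point computation. The following is a sketch under stated assumptions: the grammar is encoded as a Python dictionary, λ is written as the string "λ", and the helper names (first_of_sequence, compute_first, compute_follow) are illustrative only.

```python
EPS = "λ"   # the empty string
END = "$"   # end-of-input marker

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}
START = "E"


def first_of_sequence(symbols, first):
    """FIRST of a string of grammar symbols, given FIRST of the non-terminals."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in GRAMMAR else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol can vanish (or the string is empty)
    return result


def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:  # repeat until no set grows any more
        changed = False
        for head, alternatives in GRAMMAR.items():
            for alt in alternatives:
                new = first_of_sequence(alt, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


def compute_follow(first):
    follow = {nt: set() for nt in GRAMMAR}
    follow[START].add(END)  # $ follows the start symbol
    changed = True
    while changed:
        changed = False
        for head, alternatives in GRAMMAR.items():
            for alt in alternatives:
                for i, sym in enumerate(alt):
                    if sym not in GRAMMAR:
                        continue  # FOLLOW is defined for non-terminals only
                    tail_first = first_of_sequence(alt[i + 1:], first)
                    new = tail_first - {EPS}
                    if EPS in tail_first:  # the rest can vanish: add FOLLOW(head)
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


FIRST = compute_first()
FOLLOW = compute_follow(FIRST)
for nt in GRAMMAR:
    print(nt, sorted(FIRST[nt]), sorted(FOLLOW[nt]))
```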
Construction of a Predictive Parsing Table
Input: Grammar G
Output: Parsing table M.
Method:
1. For each production A → α of the grammar, do steps 2 and 3.
2. For each terminal a in FIRST(α), add A → α to M[A, a].
3. If λ is in FIRST(α), add A → α to M[A, b] for each terminal b in
FOLLOW(A). If λ is in FIRST(α) and $ is in FOLLOW(A), add
A → α to M[A, $].
4. Mark each undefined entry of M as error.
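The four steps above can be sketched as follows for the expression grammar. This is a sketch, not a definitive implementation: the FIRST and FOLLOW sets are entered by hand (assumed to be already computed), the table is a dictionary keyed by (non-terminal, terminal), and a missing key stands for an error entry. The names PRODUCTIONS, first_of and build_table are hypothetical.

```python
EPS, END = "λ", "$"

PRODUCTIONS = [
    ("E",  ["T", "E'"]),
    ("E'", ["+", "T", "E'"]),
    ("E'", [EPS]),
    ("T",  ["F", "T'"]),
    ("T'", ["*", "F", "T'"]),
    ("T'", [EPS]),
    ("F",  ["(", "E", ")"]),
    ("F",  ["id"]),
]

# Assumed precomputed for this grammar.
FIRST = {
    "E": {"(", "id"}, "E'": {"+", EPS},
    "T": {"(", "id"}, "T'": {"*", EPS},
    "F": {"(", "id"},
}
FOLLOW = {
    "E": {END, ")"}, "E'": {END, ")"},
    "T": {"+", ")", END}, "T'": {"+", ")", END},
    "F": {"*", "+", ")", END},
}


def first_of(alpha):
    """FIRST of the right-hand side alpha (a list of grammar symbols)."""
    result = set()
    for sym in alpha:
        sym_first = FIRST.get(sym, {sym})  # a terminal's FIRST is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def build_table():
    table = {}
    for head, alpha in PRODUCTIONS:           # step 1
        first_alpha = first_of(alpha)
        targets = first_alpha - {EPS}         # step 2
        if EPS in first_alpha:                # step 3 (FOLLOW may contain $)
            targets |= FOLLOW[head]
        for terminal in targets:
            table.setdefault((head, terminal), []).append(alpha)
    return table                              # step 4: missing key = error


TABLE = build_table()
# The grammar is LL(1) exactly when no entry holds more than one production.
is_ll1 = all(len(entries) == 1 for entries in TABLE.values())
print(is_ll1)
```

Storing a list in each cell makes multiply-defined entries visible, which is how a grammar that is not LL(1) would show up.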
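LL(1) Parsing Table
     | id      | +         | *         | (       | )      | $
E    | E → TE’ |           |           | E → TE’ |        |
E’   |         | E’ → +TE’ |           |         | E’ → λ | E’ → λ
T    | T → FT’ |           |           | T → FT’ |        |
T’   |         | T’ → λ    | T’ → *FT’ |         | T’ → λ | T’ → λ
F    | F → id  |           |           | F → (E) |        |
Blank entries are error entries.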
Non recursive predictive parsing
The parser considers X, the symbol on top of the stack, and a, the
current input symbol. These two symbols determine the action of the
parser. There are three possibilities.
1. If X = a = $, the parser halts and announces successful completion
of parsing.
2. If X = a ≠ $, the parser pops X off the stack and advances the input
pointer to the next input symbol.
3. If X is a non-terminal, the parser consults entry M[X, a] of the
parsing table M. This entry will be either an X-production or an
error entry. If, for example, M[X, a] = {X → UVW}, the parser replaces
X on the top of the stack by WVU (with U on top). As output, we
shall assume that the parser just prints the production used.
If M[X, a] = error, the parser calls an error recovery routine.
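The three cases above can be sketched as a short stack-driven loop. This is a minimal sketch, assuming the parsing table of the expression grammar is entered by hand as a dictionary and the input is already split into tokens; the names TABLE, NONTERMINALS and ll1_parse are illustrative, and error recovery is reduced to raising SyntaxError.

```python
EPS, END = "λ", "$"

# Parsing table of the expression grammar, keyed by (non-terminal, terminal).
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [EPS], ("E'", END): [EPS],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [EPS], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [EPS], ("T'", END): [EPS],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def ll1_parse(tokens, start="E"):
    """Return the list of productions used; raise SyntaxError on an error entry."""
    tokens = list(tokens) + [END]
    stack = [END, start]          # top of the stack is the end of the list
    pos = 0
    output = []
    while True:
        top, current = stack[-1], tokens[pos]
        if top == END and current == END:        # case 1: X = a = $
            return output
        if top not in NONTERMINALS:              # X is a terminal (or $)
            if top != current:
                raise SyntaxError(f"expected {top!r}, found {current!r}")
            stack.pop()                          # case 2: X = a != $
            pos += 1
            continue
        body = TABLE.get((top, current))         # case 3: consult M[X, a]
        if body is None:
            raise SyntaxError(f"no entry for M[{top}, {current}]")
        stack.pop()
        if body != [EPS]:
            stack.extend(reversed(body))         # push WVU so that U is on top
        output.append(f"{top} → {' '.join(body)}")


for production in ll1_parse(["id", "+", "id", "*", "id"]):
    print(production)
```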
LL(1) Parser
Trace for the input id + id * id. The Output column shows the action that produced each configuration.
Stack     | Input           | Output
$E        | id + id * id $  |
$E’T      | id + id * id $  | E → TE’
$E’T’F    | id + id * id $  | T → FT’
$E’T’id   | id + id * id $  | F → id
$E’T’     | + id * id $     | match id
$E’       | + id * id $     | T’ → λ
$E’T+     | + id * id $     | E’ → +TE’
$E’T      | id * id $       | match +
$E’T’F    | id * id $       | T → FT’
$E’T’id   | id * id $       | F → id
$E’T’     | * id $          | match id
$E’T’F*   | * id $          | T’ → *FT’
$E’T’F    | id $            | match *
$E’T’id   | id $            | F → id
$E’T’     | $               | match id
$E’       | $               | T’ → λ
$         | $               | E’ → λ (accept)
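Parse tree for id + id * id:
E
├─ T
│  ├─ F
│  │  └─ id
│  └─ T’
│     └─ λ
└─ E’
   ├─ +
   ├─ T
   │  ├─ F
   │  │  └─ id
   │  └─ T’
   │     ├─ *
   │     ├─ F
   │     │  └─ id
   │     └─ T’
   │        └─ λ
   └─ E’
      └─ λ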
Example - 2
Consider the grammar
S → (L) / a
L → L,S / S
Construct a predictive parser for the above grammar. Also, find
the parse trees for the following words:
i) (a, a)
ii) (a, (a, a))
iii) (a, ((a, a), (a, a)))
Eliminating the immediate left recursion in L gives:
S → (L) / a
L → SL’
L’ → ,SL’ / λ
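The transformation used here (A → Aα | β becomes A → βA’, A’ → αA’ | λ) can be sketched as a small function. This is an illustrative sketch only: the function name is hypothetical, and it assumes that the new non-terminal name, formed by appending a prime to the original, is not already used in the grammar.

```python
EPS = "λ"  # the empty string


def eliminate_immediate_left_recursion(head, alternatives):
    """Rewrite A -> A a1 | ... | b1 | ...  as  A -> b A',  A' -> a A' | λ."""
    # The alpha parts: what follows the leading A in each left-recursive alternative.
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == head]
    # The beta parts: alternatives that do not start with A.
    others = [alt for alt in alternatives if not alt or alt[0] != head]
    if not recursive:
        return {head: alternatives}  # nothing to do
    new_head = head + "'"            # assumed to be an unused name
    return {
        # Drop an explicit λ from beta so that "λ A'" is written as just "A'".
        head: [[s for s in beta if s != EPS] + [new_head] for beta in others],
        new_head: [alpha + [new_head] for alpha in recursive] + [[EPS]],
    }


result = eliminate_immediate_left_recursion("L", [["L", ",", "S"], ["S"]])
for nt, alts in result.items():
    print(nt, "→", " / ".join(" ".join(alt) for alt in alts))
```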
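FIRST(S) = { (, a }
FIRST(L) = { (, a }
FIRST(L’) = { “,”, λ }
FOLLOW(S) = { $, “,”, ) }
FOLLOW(L) = { ) }
FOLLOW(L’) = { ) }, since L’ occurs only at the end of the L and L’ productions, so FOLLOW(L’) = FOLLOW(L)
LL(1) Parsing Table
     | (        | )      | a        | ,         | $
S    | S → (L)  |        | S → a    |           |
L    | L → SL’  |        | L → SL’  |           |
L’   |          | L’ → λ |          | L’ → ,SL’ |
Blank entries are error entries.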
Trace for the input (a, a). The Output column shows the action taken in each configuration.
Stack    | Input     | Output
$S       | (a, a)$   | S → (L)
$)L(     | (a, a)$   | match (
$)L      | a, a)$    | L → SL’
$)L’S    | a, a)$    | S → a
$)L’a    | a, a)$    | match a
$)L’     | , a)$     | L’ → ,SL’
$)L’S,   | , a)$     | match ,
$)L’S    | a)$       | S → a
$)L’a    | a)$       | match a
$)L’     | )$        | L’ → λ
$)       | )$        | match )
$        | $         | Halt
Parse tree for (a, a):
S
├─ (
├─ L
│  ├─ S
│  │  └─ a
│  └─ L’
│     ├─ ,
│     ├─ S
│     │  └─ a
│     └─ L’
│        └─ λ
└─ )
Example - 3
S → iEtS / iEtSeS / a
E → b
After left factoring, the productions become:
S → iEtSS1 / a
S1 → eS / λ
E → b
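One left-factoring step of this kind can be sketched as follows. This is a sketch under stated assumptions: it factors only the longest prefix shared by at least two alternatives of a single non-terminal, the caller supplies the name of the new non-terminal, and the function names are hypothetical. A grammar with several different shared prefixes would need the step to be repeated.

```python
EPS = "λ"  # the empty string


def common_prefix(a, b):
    """Longest common prefix of two symbol lists."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return prefix


def left_factor_once(head, alternatives, new_head):
    """Factor the longest prefix shared by at least two alternatives of `head`."""
    best = []
    for i, a in enumerate(alternatives):
        for b in alternatives[i + 1:]:
            prefix = common_prefix(a, b)
            if len(prefix) > len(best):
                best = prefix
    if not best:
        return {head: alternatives}  # no two alternatives share a prefix
    n = len(best)
    factored = [alt for alt in alternatives if alt[:n] == best]
    untouched = [alt for alt in alternatives if alt[:n] != best]
    # What remains after the prefix; an empty remainder becomes λ.
    suffixes = [alt[n:] or [EPS] for alt in factored]
    suffixes.sort(key=lambda s: s == [EPS])  # list λ last, as on the slide
    return {head: [best + [new_head]] + untouched, new_head: suffixes}


result = left_factor_once(
    "S",
    [["i", "E", "t", "S"], ["i", "E", "t", "S", "e", "S"], ["a"]],
    "S1",
)
for nt, alts in result.items():
    print(nt, "→", " / ".join(" ".join(alt) for alt in alts))
```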
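Parsing table:
     | i          | t | a     | e               | b     | $
S    | S → iEtSS1 |   | S → a |                 |       |
S1   |            |   |       | S1 → eS, S1 → λ |       | S1 → λ
E    |            |   |       |                 | E → b |
The entry M[S1, e] is multiply defined, so this grammar is not LL(1).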