Top-Down and Bottom-Up Parsing

Top-Down and Bottom-Up
Top Down Parsing
Bottom Up Parsing
Top Down Parsing
Things to know:
Top-down parsing constructs a parse tree for the input starting from the root and
creating the nodes of the parse tree in preorder (depth first).
A general form of top-down parsing is recursive-descent parsing.
Recursive-descent parsing is a top-down technique that executes a set of
recursive procedures to process the input. It may involve backtracking, that is,
scanning parts of the input repeatedly.
Backtracking is time consuming and therefore inefficient. That is why a special case
of top-down parsing, called predictive parsing, was developed, in which no
backtracking is required.
A dilemma occurs with a left-recursive grammar: even with backtracking, the
parser can go into an infinite loop.
There are two types of recursion, left and right. As the names suggest, a
left-recursive grammar builds trees that grow down to the left, while a
right-recursive grammar builds trees that grow down to the right.
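Top-down parse tree of grammar G (where input = id):
G:  E  -> T E'
    E' -> + T E' | ε
    T  -> F T'
    T' -> * F T' | ε
    F  -> ( E ) | id
[Figure: parse tree for the input id, grown from the root E in successive steps: E expands to T E', T to F T', and F to id.]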
An example of a simple left-recursive production
Consider the grammar: expr -> expr + term
This is an example of a left-recursive grammar.
Whenever the procedure for expr is called, it immediately calls itself again without consuming any input, so the parser loops forever.
By carefully rewriting a grammar, one can eliminate left recursion from it.
expr -> expr + term | term, can be written as
expr  -> term expr'
expr' -> + term expr' | ε
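The usual rewrite for immediate left recursion replaces A -> A α | β with A -> β A' and A' -> α A' | ε. The following is a minimal Python sketch of that rewrite, not a full algorithm: the function name eliminate_left_recursion and the list-of-symbols representation are assumptions made for illustration, and only immediate left recursion is handled.

```python
EPSILON = "ε"

def eliminate_left_recursion(nonterminal, alternatives):
    """Rewrite A -> A α | β into A -> β A' and A' -> α A' | ε.

    Each alternative is a list of grammar symbols. Only immediate
    left recursion (a body that starts with A itself) is handled.
    """
    # α parts: bodies that start with A, with the leading A removed
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # β parts: bodies that do not start with A
    others = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not recursive:
        return {nonterminal: alternatives}  # nothing to rewrite
    new_name = nonterminal + "'"            # hypothetical naming scheme
    return {
        nonterminal: [beta + [new_name] for beta in others],
        new_name: [alpha + [new_name] for alpha in recursive] + [[EPSILON]],
    }

rewritten = eliminate_left_recursion("expr", [["expr", "+", "term"], ["term"]])
for head, alts in rewritten.items():
    print(head, "->", " | ".join(" ".join(alt) for alt in alts))
```

The rewritten grammar is right recursive, so a procedure for expr consumes a term before it can recurse.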
After obtaining a grammar that needs no backtracking, we can use the predictive parser.
Top Down Parsing Techniques
Recursive-Descent Parsing
Predictive Parsing
Recursive-Descent Parsing
A recursive-descent parsing program consists of a set of procedures, one for each
nonterminal. Execution begins with the procedure for the start symbol, which halts
and announces success if its procedure body scans the entire input string.
General recursive-descent may require backtracking; that is, it may require repeated
scans over the input.
Consider the grammar with input string cad:
S -> c A d
A -> a b | a
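[Figure: three steps of the parse of cad. S is expanded to c A d; A is first expanded to a b, which fails to match d; the parser backtracks and expands A to a, which matches.]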
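The backtracking on input cad can be sketched in Python as below. This is only an illustration under assumptions: the function names parse_S and parse_A are hypothetical, and a generator is used so that the caller can ask A for its next alternative when the first one leads to a dead end.

```python
def parse_A(text, pos):
    """Yield every position at which A can end when it starts at pos.

    The alternatives are tried in grammar order: A -> a b, then A -> a.
    """
    if text.startswith("ab", pos):
        yield pos + 2
    if text.startswith("a", pos):
        yield pos + 1

def parse_S(text):
    """Return True if the whole text is derived from S -> c A d."""
    if not text.startswith("c"):
        return False
    for after_A in parse_A(text, 1):
        # If d does not follow, the loop moves on to the next
        # alternative of A: this is the backtracking step.
        if text.startswith("d", after_A) and after_A + 1 == len(text):
            return True
    return False

print(parse_S("cad"))   # True
print(parse_S("cabd"))  # True
print(parse_S("cab"))   # False
```

Each retry rescans input that was already read, which is the cost that predictive parsing avoids.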
Predictive Parsing: a parsing technique that uses the lookahead symbol (the next
input symbol) to determine which production to apply, so that no backtracking is needed.
First and Follow
Construction of Predictive Parsing Tables
LL(1) Grammars
Error Recovery
First and Follow
FIRST and FOLLOW aid the construction of a predictive parser.
They allow us to fill in the entries of a predictive parsing table.
If α is any string of grammar symbols, then FIRST(α) is the set of terminals
that begin the strings derived from α. If α can derive the empty string (ε),
then ε is also in FIRST(α).
FOLLOW(A), for a nonterminal A, is the set of terminals a that
can appear immediately to the right of A in a sentential form.
Rules for computing FIRST(X), where X can be a terminal, a nonterminal, or even ε (the empty
string):
1) If X is a terminal, then FIRST(X) = { X }.
2) If X is ε, then FIRST(X) = { ε }.
3) If X is a nonterminal whose production begins with another nonterminal, follow the
chain of leading nonterminals until you reach the first terminal. For example, with
X -> Y
Y -> Z a
Z -> b
FIRST(X) = FIRST(Y) = FIRST(Z) = { b }. If a leading nonterminal can derive ε, also add
the FIRST of the symbol that follows it.
4) If X is a nonterminal with several alternatives, FIRST(X) is the union over the alternatives. EX:
X -> a | b; then FIRST(X) = { a, b }
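The FIRST rules can be applied mechanically by repeating them until no set grows any further. The sketch below does this in Python for the expression grammar G with E' and T' written out; the dictionary representation and the name compute_first are assumptions for illustration only.

```python
EPSILON = "ε"

# Grammar G: each nonterminal maps to a list of bodies (lists of symbols).
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPSILON]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPSILON]],
    "F":  [["(", "E", ")"], ["id"]],
}

def compute_first(grammar):
    """Apply the FIRST rules repeatedly until no set changes."""
    first = {nonterminal: set() for nonterminal in grammar}
    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for body in alternatives:
                before = len(first[head])
                for symbol in body:
                    if symbol == EPSILON:          # rule 2
                        first[head].add(EPSILON)
                        break
                    if symbol not in grammar:      # rule 1: a terminal
                        first[head].add(symbol)
                        break
                    # rule 3: copy FIRST of the leading nonterminal
                    first[head] |= first[symbol] - {EPSILON}
                    if EPSILON not in first[symbol]:
                        break
                else:
                    # no break: every symbol of the body can derive ε
                    first[head].add(EPSILON)
                if len(first[head]) != before:
                    changed = True
    return first

for nonterminal, symbols in compute_first(GRAMMAR).items():
    print(f"FIRST({nonterminal}) = {sorted(symbols)}")
```

Rule 4 needs no separate code: every alternative of a nonterminal adds to the same set, which gives the union.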
Consider again grammar G (1), together with a second grammar (2):
1) E  -> T E'
   E' -> + T E' | ε
   T  -> F T'
   T' -> * F T' | ε
   F  -> ( E ) | id
2) S  -> i E t S S' | a
   S' -> e S | ε
   E  -> b
Rules for computing FOLLOW(X), where X is a nonterminal:
1) If X is followed by a terminal in a production, for example A -> X a, then
a is in FOLLOW(X).
2) If X is the start symbol of the grammar, for ex:
X -> A B
A -> a
B -> b; then add $ (the end-of-input marker) to FOLLOW(X); here FOLLOW(X) = { $ }
3) If X is followed by another nonterminal in a production, add the FIRST of that
succeeding nonterminal, except ε.
ex: A -> X D
D -> a B ; then FOLLOW(X) contains FIRST(D) = { a }; and if FIRST(D) contains ε
(ex: D -> a B | ε), then everything in FOLLOW(A) is also in FOLLOW(X).
4) If X is the last symbol of a production, ex: S -> a b X, then
everything in FOLLOW(S) is in FOLLOW(X).
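The FOLLOW rules can be applied in the same repeat-until-stable fashion. The Python sketch below takes the FIRST sets of grammar G as given rather than computing them; the names compute_follow and first_of_string and the dictionary layout are assumptions for illustration.

```python
EPSILON, END = "ε", "$"

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPSILON]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPSILON]],
    "F":  [["(", "E", ")"], ["id"]],
}
# FIRST sets of the nonterminals of G, taken as given.
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPSILON}, "T'": {"*", EPSILON},
}

def first_of_string(symbols):
    """FIRST of a sequence of grammar symbols."""
    result = set()
    for symbol in symbols:
        symbol_first = FIRST.get(symbol, {symbol})  # a terminal is its own FIRST
        result |= symbol_first - {EPSILON}
        if EPSILON not in symbol_first:
            return result
    result.add(EPSILON)  # the sequence is empty or can vanish entirely
    return result

def compute_follow(grammar, start):
    """Apply the FOLLOW rules repeatedly until no set changes."""
    follow = {nonterminal: set() for nonterminal in grammar}
    follow[start].add(END)                           # rule 2
    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for body in alternatives:
                for i, symbol in enumerate(body):
                    if symbol not in grammar:
                        continue                     # only nonterminals have FOLLOW
                    rest = first_of_string(body[i + 1:])
                    new = rest - {EPSILON}           # rules 1 and 3
                    if EPSILON in rest:              # rule 3 (ε case) and rule 4
                        new |= follow[head]
                    if not new <= follow[symbol]:
                        follow[symbol] |= new
                        changed = True
    return follow

for nonterminal, symbols in compute_follow(GRAMMAR, "E").items():
    print(f"FOLLOW({nonterminal}) = {sorted(symbols)}")
```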
ANSWERS (FIRST), for grammars 1) and 2) above:
1) FIRST(E) = FIRST(T) = FIRST(F) = { (, id }
   FIRST(E') = { +, ε }
   FIRST(T') = { *, ε }
2) FIRST(S) = { i, a }; FIRST(S') = { e, ε }; FIRST(E) = { b }

ANSWERS (FOLLOW):
1) FOLLOW(E) = FOLLOW(E') = { ), $ }
   FOLLOW(T) = FOLLOW(T') = { +, ), $ }
   FOLLOW(F) = { +, *, ), $ }
2) FOLLOW(S) = FOLLOW(S') = { e, $ }; FOLLOW(E) = { t }

Construction of Predictive Parsing Tables
The general idea is to use FIRST and FOLLOW to
construct the parsing table M.
Each production A -> α is entered in M[A, a]
for every terminal a in FIRST(α).
When FIRST(α) contains ε, the production is also entered in M[A, b]
for every b in FOLLOW(A), including $.
Grammar G and its FIRST and FOLLOW sets:
E  -> T E'
E' -> + T E' | ε
T  -> F T'
T' -> * F T' | ε
F  -> ( E ) | id
FIRST(E) = FIRST(T) = FIRST(F) = { (, id }    FOLLOW(E) = FOLLOW(E') = { ), $ }
FIRST(E') = { +, ε }                          FOLLOW(T) = FOLLOW(T') = { +, ), $ }
FIRST(T') = { *, ε }                          FOLLOW(F) = { +, *, ), $ }

Nonterminal   id         +            *            (          )          $
E             E->TE'                               E->TE'
E'                       E'->+TE'                             E'->ε      E'->ε
T             T->FT'                               T->FT'
T'                       T'->ε        T'->*FT'                T'->ε      T'->ε
F             F->id                                F->(E)
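Filling the table from FIRST and FOLLOW can be sketched as follows. The FIRST and FOLLOW sets are copied in as data, and the names build_table and first_of_body are assumptions for illustration. Each entry holds a list of bodies so that a multiply-defined entry would be visible as a list with more than one element.

```python
EPSILON = "ε"

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPSILON]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPSILON]],
    "F":  [["(", "E", ")"], ["id"]],
}
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPSILON}, "T'": {"*", EPSILON},
}
FOLLOW = {
    "E": {")", "$"}, "E'": {")", "$"},
    "T": {"+", ")", "$"}, "T'": {"+", ")", "$"},
    "F": {"+", "*", ")", "$"},
}

def first_of_body(body):
    """FIRST of the right side of a production."""
    result = set()
    for symbol in body:
        symbol_first = FIRST.get(symbol, {symbol})  # a terminal is its own FIRST
        result |= symbol_first - {EPSILON}
        if EPSILON not in symbol_first:
            return result
    result.add(EPSILON)
    return result

def build_table(grammar):
    """Map (nonterminal, terminal) to the list of bodies entered there."""
    table = {}
    for head, alternatives in grammar.items():
        for body in alternatives:
            lookaheads = first_of_body(body)
            if EPSILON in lookaheads:
                # the body can vanish, so use FOLLOW of the head instead of ε
                lookaheads = (lookaheads - {EPSILON}) | FOLLOW[head]
            for terminal in lookaheads:
                table.setdefault((head, terminal), []).append(body)
    return table

TABLE = build_table(GRAMMAR)
for (head, terminal), bodies in sorted(TABLE.items()):
    print(f"M[{head}, {terminal}] = {head} -> {' '.join(bodies[0])}")
```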
STACK        INPUT              ACTION
$E           id + id * id $
$E'T         id + id * id $     E -> T E'
$E'T'F       id + id * id $     T -> F T'
$E'T'id      id + id * id $     F -> id
$E'T'        + id * id $
$E'          + id * id $        T' -> ε
$E'T+        + id * id $        E' -> + T E'
$E'T         id * id $
$E'T'F       id * id $          T -> F T'
$E'T'id      id * id $          F -> id
$E'T'        * id $
$E'T'F*      * id $             T' -> * F T'
$E'T'F       id $
$E'T'id      id $               F -> id
$E'T'        $
$E'          $                  T' -> ε
$            $                  E' -> ε
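The moves in the trace can be reproduced by a short table-driven loop. The Python sketch below hard-codes the parsing table for grammar G; the function name predictive_parse and the choice to return the list of productions used are assumptions for illustration.

```python
EPSILON = "ε"

# Parsing table M for grammar G: (nonterminal, lookahead) -> body.
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"],
    ("E'", ")"): [EPSILON], ("E'", "$"): [EPSILON],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [EPSILON], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [EPSILON], ("T'", "$"): [EPSILON],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {head for head, _ in TABLE}

def predictive_parse(tokens, start="E"):
    """Return the productions applied, in order, or raise SyntaxError."""
    tokens = tokens + ["$"]
    stack = ["$", start]          # the top of the stack is the end of the list
    position, output = 0, []
    while stack:
        top, lookahead = stack.pop(), tokens[position]
        if top in NONTERMINALS:
            body = TABLE.get((top, lookahead))
            if body is None:
                raise SyntaxError(f"M[{top}, {lookahead}] is empty")
            output.append(f"{top} -> {' '.join(body)}")
            # push the body reversed so that its first symbol ends up on top
            stack.extend(s for s in reversed(body) if s != EPSILON)
        elif top == lookahead:
            position += 1         # match a terminal, or the final $
        else:
            raise SyntaxError(f"expected {top}, found {lookahead}")
    return output

for step in predictive_parse("id + id * id".split()):
    print(step)
```

The printed productions appear in the same order as the entries in the ACTION column.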
LL(1) Grammars
What does LL(1) mean?

The first L in LL(1) stands for scanning the input from left to right, the second L
is for producing a leftmost derivation, and the 1 for using one input symbol of
lookahead at each step to make parsing action decisions.
No ambiguous or left recursive grammar is LL(1).
There remains the question of what should be done when a parsing table has
multiply-defined entries.
One solution is to transform the grammar by eliminating all left recursion and then
left factoring where possible, but some grammars cannot be transformed into an LL(1)
grammar at all.
The main difficulty in using predictive parsing is in writing a grammar for the
source language such that a predictive parser can be constructed from the
grammar.
To alleviate some of the difficulty, one can use operator precedence parsing, or even
better an LR parser, which provides the benefits of both predictive parsing and
operator precedence automatically.
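Left factoring, mentioned above, replaces A -> α β1 | α β2 with A -> α A' and A' -> β1 | β2. The Python sketch below performs one round of it. The function names, the primed naming of new nonterminals, and the unfactored input S -> iEtS | iEtSeS | a are assumptions for illustration; that input was chosen because factoring it yields grammar 2 from the FIRST and FOLLOW exercise.

```python
EPSILON = "ε"

def common_prefix(bodies):
    """Longest run of symbols shared by the start of every body."""
    prefix = []
    for symbols in zip(*bodies):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix

def left_factor(head, alternatives):
    """One round of left factoring; bodies are non-empty lists of symbols."""
    groups = {}
    for body in alternatives:
        groups.setdefault(body[0], []).append(body)  # group by first symbol
    result = {head: []}
    new_name = head
    for bodies in groups.values():
        if len(bodies) == 1:
            result[head].append(bodies[0])           # nothing to factor
            continue
        prefix = common_prefix(bodies)
        new_name += "'"                              # hypothetical names: A', A'', ...
        result[head].append(prefix + [new_name])
        # what remains after the prefix; an empty remainder becomes ε
        result[new_name] = [body[len(prefix):] or [EPSILON] for body in bodies]
    return result

factored = left_factor("S", [list("iEtS"), list("iEtSeS"), ["a"]])
for head, alts in factored.items():
    print(head, "->", " | ".join(" ".join(alt) for alt in alts))
```

Even after factoring, this grammar is not LL(1): with e as the lookahead, both S' -> e S and S' -> ε apply, so M[S', e] is multiply defined.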
Error Recovery
When is an error detected?

An error is detected when the terminal on top of the stack
does not match the next input symbol, or when
nonterminal A is on top of the stack, a is the next input
symbol, and the parsing table entry M[A, a] is empty.
How can we deal with errors?
Panic-mode error recovery is based on the idea of skipping
symbols on the input until a token in a selected set of synchronizing (synch)
tokens appears.
How does it work?

Using FOLLOW and FIRST symbols as synchronizing tokens works
well. The parsing table is filled with synch entries
obtained from the FOLLOW set of each nonterminal.
When the parser looks up entry M[A, a] and finds it blank, the input symbol a
is skipped. If the entry is synch, the nonterminal on top of the stack is
popped in an attempt to resume parsing.
Nonterminal   id         +            *            (          )          $
E             E->TE'                               E->TE'     synch      synch
E'                       E'->+TE'                             E'->ε      E'->ε
T             T->FT'     synch                     T->FT'     synch      synch
T'                       T'->ε        T'->*FT'                T'->ε      T'->ε
F             F->id      synch        synch        F->(E)     synch      synch

STACK        INPUT              ACTION
$E           ) id * + id $      Error, skip )
$E           id * + id $        id is in FIRST(E)
$E'T         id * + id $
$E'T'F       id * + id $
$E'T'id      id * + id $
$E'T'        * + id $
$E'T'F*      * + id $
$E'T'F       + id $             Error, M[F, +] = synch
$E'T'        + id $             F has been popped
$E'          + id $
$E'T+        + id $
$E'T         id $
$E'T'F       id $
$E'T'id      id $
$E'T'        $
$E'          $
$            $
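Panic-mode recovery on the erroneous input ) id * + id can be sketched in Python as below. Two behaviours are assumptions that the text does not state: when a synch entry belongs to the only nonterminal left above $, the sketch skips the input symbol instead of popping (this is what the first row of the trace does), and a terminal on the stack that fails to match is simply popped. The name parse_with_recovery is hypothetical.

```python
EPSILON, SYNCH, END = "ε", "synch", "$"

TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"],
    ("E'", ")"): [EPSILON], ("E'", "$"): [EPSILON],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [EPSILON], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [EPSILON], ("T'", "$"): [EPSILON],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
FOLLOW = {
    "E": {")", "$"}, "E'": {")", "$"},
    "T": {"+", ")", "$"}, "T'": {"+", ")", "$"},
    "F": {"+", "*", ")", "$"},
}
# Put synch into every blank entry under a FOLLOW symbol of the nonterminal.
for head, follow in FOLLOW.items():
    for terminal in follow:
        TABLE.setdefault((head, terminal), SYNCH)

def parse_with_recovery(tokens, start="E"):
    """Parse to the end of the input and return the recovery actions taken."""
    tokens = tokens + [END]
    stack, position, errors = [END, start], 0, []
    while stack:
        top, lookahead = stack[-1], tokens[position]
        if top not in FOLLOW:                  # a terminal or the bottom marker $
            if top == lookahead:
                stack.pop()
                position += 1
            elif top == END:                   # leftover input after the parse
                errors.append(f"skip {lookahead}")
                position += 1
            else:                              # assumption: pop the unmatched terminal
                errors.append(f"pop {top}")
                stack.pop()
            continue
        entry = TABLE.get((top, lookahead))
        at_bottom = len(stack) == 2 and lookahead != END
        if entry is None or (entry == SYNCH and at_bottom):
            errors.append(f"skip {lookahead}")  # blank entry: skip the input symbol
            position += 1
        elif entry == SYNCH:
            errors.append(f"pop {top}")         # synch entry: pop the nonterminal
            stack.pop()
        else:
            stack.pop()
            stack.extend(s for s in reversed(entry) if s != EPSILON)
    return errors

print(parse_with_recovery(") id * + id".split()))
```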
Another error recovery procedure is phrase-level
recovery. It is implemented by filling in the blank entries
in the parsing table with pointers to error routines. These
routines can change, insert, or delete symbols on the input and issue
appropriate error messages. They can also pop symbols from the stack,
but the alteration of stack symbols is questionable and risky.
Bottom Up Parsing
A general style of bottom-up parsing is introduced here:
shift-reduce parsing.
Shift-reduce parsing works as its name suggests. Whenever the symbols
on top of the stack cannot be reduced, we shift the next input symbol onto the stack;
when the top of the stack matches the right side of a production, we reduce it
to the nonterminal on the left side.
     STACK              INPUT                ACTION
1)   $                  id1 + id2 * id3 $    Shift
2)   $ id1              + id2 * id3 $        Reduce by E -> id
3)   $ E                + id2 * id3 $        Shift
4)   $ E +              id2 * id3 $          Shift
5)   $ E + id2          * id3 $              Reduce by E -> id
6)   $ E + E            * id3 $              Shift
7)   $ E + E *          id3 $                Shift
8)   $ E + E * id3      $                    Reduce by E -> id
9)   $ E + E * E        $                    Reduce by E -> E * E
10)  $ E + E            $                    Reduce by E -> E + E
11)  $ E                $                    ACCEPT
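The shift and reduce moves of the trace can be sketched in Python as follows. The trace does not say how the parser chooses between shifting and reducing when E + E is on the stack and * is next; this sketch assumes a small operator precedence table for that decision. The names shift_reduce and PRECEDENCE are hypothetical, and id1, id2, id3 are all given as the token id.

```python
# Assumed precedences, used only to choose between shift and reduce.
PRECEDENCE = {"+": 1, "*": 2}

def shift_reduce(tokens):
    """Parse with E -> E + E | E * E | id and return the list of actions."""
    tokens = tokens + ["$"]
    stack, position, actions = ["$"], 0, []
    while True:
        lookahead = tokens[position]
        if stack[-1] == "id":
            stack[-1] = "E"
            actions.append("reduce by E -> id")
        elif (len(stack) >= 4 and stack[-1] == "E" and stack[-3] == "E"
              and stack[-2] in PRECEDENCE
              and PRECEDENCE[stack[-2]] >= PRECEDENCE.get(lookahead, 0)):
            # the operator on the stack binds at least as tightly as the next one
            operator = stack[-2]
            del stack[-3:]
            stack.append("E")
            actions.append(f"reduce by E -> E {operator} E")
        elif stack == ["$", "E"] and lookahead == "$":
            actions.append("accept")
            return actions
        elif lookahead == "$":
            raise SyntaxError(f"cannot reduce stack {stack}")
        else:
            stack.append(lookahead)
            position += 1
            actions.append("shift")

for number, action in enumerate(shift_reduce("id + id * id".split()), start=1):
    print(number, action)
```

With * ranked above +, the parser shifts at step 6 instead of reducing E + E, so the multiplication is reduced first.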
