4 Topdownparser-L6,7

The document discusses top-down parsers. It provides examples of how a top-down parser works by starting at the root of a parse tree and using production rules to try to match input strings. The parser uses backtracking if it makes an incorrect choice of production rule. It must find the appropriate rule to successfully derive the input string from the start symbol.

Uploaded by

PRANJAL SHARMA
BITS Pilani, Hyderabad Campus
Prof. Aruna Malapati
Department of CSIS

Top down parsers


Today’s Agenda

• Types of parsers

• Top down parser



Parsing technique

• Scan the input string left to right and identify whether the derivation is leftmost or rightmost.

• Make use of the productions to choose the appropriate derivation.


Types of parsers

Parsers
    Top Down
        Backtracking
        Predictive
            Recursive Descent
            LL(1)
    Bottom Up
        Operator Precedence
        LR
            LR(0)
            SLR(1)
            LALR(1)
            CLR(1)


Two Approaches

• Top-down parsers (LL(1), recursive descent)
    – Start at the root of the parse tree and grow toward the leaves
    – Pick a production and try to match the input
    – Bad “pick” → may need to backtrack
• Bottom-up parsers (LR(1), operator precedence)
    – Start at the leaves and grow toward the root
    – As input is consumed, encode possible parse trees in an internal state
    – Bottom-up parsers handle a large class of grammars
Difference between top-down and bottom-up parsers

Grammar
S -> aABe
A -> Abc | b
B -> d
Input string: abbcde

A top-down parser generates the tree from the start symbol (the root node) and keeps finding the right production to derive the string.
Its main task is to decide which production to use to derive the string.

A bottom-up parser generates the tree from the input by reducing it: it looks for substrings of the input that match the RHS of a production and replaces them with its LHS, and continues until the start symbol is reached.
Its main task is to decide whether to shift or reduce.


Grammars and Parsers

LL(1) parsers
– Left-to-right input
– Leftmost derivation
– 1 symbol of look-ahead
Grammars that this can handle are called LL(1) grammars.

LR(1) parsers
– Left-to-right input
– Rightmost derivation
– 1 symbol of look-ahead
Grammars that this can handle are called LR(1) grammars.


Top down parser

• Built from root to leaves.

• The derivation terminates when the required input string has been fully derived.

• A leftmost derivation matches this requirement.

• The main task is to find the appropriate production rule at each step in order to produce the correct input string.


Example - 1

Grammar
# Production rule
1 S -> x P z
2 P -> y w | y

Input string: x y z

Step 1. Expand S with rule 1; the sentential form is x P z. The first input symbol x matches the leftmost leaf, hence advance the input pointer.

Step 2. The next leaf P does not match the current input symbol y, and P is a nonterminal, hence expand it with its first alternative P -> y w.

Step 3. The leaf y matches the current input symbol, hence advance the input pointer.

Step 4. The leaf w does not match the current input symbol z. Mismatch, hence backtrack: move the input pointer back to y and use the other production of P, P -> y.

Step 5. The leaf y matches, hence advance the input pointer. The last leaf z matches the last input symbol.

Step 6. Matching is done for the entire string, so x y z is accepted.
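The backtracking in Example 1 can be sketched in C roughly as follows. This is only an illustrative sketch for the grammar S -> x P z, P -> y w | y; the function names `parse_S` and `parse_P` are assumptions made for this example, and each symbol is taken to be a single character.

```c
#include <assert.h>
#include <stddef.h>

/* Try one alternative of P starting at position pos.
 * alt 0 is P -> y w, alt 1 is P -> y.
 * Returns the position after P, or -1 if this alternative does not match. */
static int parse_P(const char *in, int pos, int alt) {
    if (alt == 0) {                                  /* P -> y w */
        if (in[pos] == 'y' && in[pos + 1] == 'w')
            return pos + 2;
        return -1;
    }
    if (in[pos] == 'y')                              /* P -> y */
        return pos + 1;
    return -1;
}

/* S -> x P z. Returns 1 if the whole string is derived from S, else 0. */
int parse_S(const char *in) {
    int alt;
    assert(in != NULL);
    if (in[0] != 'x')                                /* match x, then advance */
        return 0;
    for (alt = 0; alt < 2; alt++) {                  /* try P's productions in order */
        /* Every attempt restarts at position 1: this reset is the backtracking. */
        int pos = parse_P(in, 1, alt);
        if (pos < 0)
            continue;                                /* mismatch inside P */
        if (in[pos] == 'z' && in[pos + 1] == '\0')
            return 1;                                /* entire string matched */
        /* z did not match: fall through and try the other production */
    }
    return 0;
}
```

Calling `parse_S("xyz")` first tries P -> y w, fails on w, and then succeeds with P -> y, mirroring steps 2 to 6 above.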


Example - 2
Expression grammar (with precedence)

# Production rule
1 expr → expr + term
2 | expr - term
3 | term
4 term → term * factor
5 | term / factor
6 | factor
7 factor → number
8 | identifier

Input string: x - 2 * y


Example - 2

The marker ↑ shows the current position in the input stream.

Rule  Sentential form     Input string
-     expr                ↑ x - 2 * y
1     expr + term         ↑ x - 2 * y
3     term + term         ↑ x - 2 * y
6     factor + term       ↑ x - 2 * y
8     <id> + term         x ↑ - 2 * y
-     <id,x> + term       x ↑ - 2 * y

Problem:
– Can’t match the next terminal: the sentential form expects + but the input has -
– We guessed wrong at the first expansion (rule 1)


Backtracking

Rule  Sentential form     Input string
-     expr                ↑ x - 2 * y
1     expr + term         ↑ x - 2 * y
3     term + term         ↑ x - 2 * y
6     factor + term       ↑ x - 2 * y
8     <id> + term         x ↑ - 2 * y
?     <id,x> + term       x ↑ - 2 * y

Undo all these productions (rules 1, 3, 6 and 8).

• Roll back the productions
• Choose a different production for expr
• Continue


Retrying

Rule  Sentential form     Input string
-     expr                ↑ x - 2 * y
2     expr - term         ↑ x - 2 * y
3     term - term         ↑ x - 2 * y
6     factor - term       ↑ x - 2 * y
8     <id> - term         x ↑ - 2 * y
-     <id,x> - term       x - ↑ 2 * y
6     <id,x> - factor     x - ↑ 2 * y
7     <id,x> - <num>      x - 2 ↑ * y

Problem:
– The sentential form has only terminals left, but there is more input to read (* y)
– Another cause of backtracking


Successful Parse

Rule  Sentential form              Input string
-     expr                         ↑ x - 2 * y
2     expr - term                  ↑ x - 2 * y
3     term - term                  ↑ x - 2 * y
6     factor - term                ↑ x - 2 * y
8     <id> - term                  x ↑ - 2 * y
-     <id,x> - term                x - ↑ 2 * y
4     <id,x> - term * factor       x - ↑ 2 * y
6     <id,x> - factor * factor     x - ↑ 2 * y
7     <id,x> - <num> * factor      x - 2 ↑ * y
-     <id,x> - <num,2> * factor    x - 2 * ↑ y
8     <id,x> - <num,2> * <id>      x - 2 * y ↑

All terminals match – we’re finished


Other Possible Parses

Rule  Sentential form                        Input string
-     expr                                   ↑ x - 2 * y
1     expr + term                            ↑ x - 2 * y
1     expr + term + term                     ↑ x - 2 * y
1     expr + term + term + term              ↑ x - 2 * y
1     expr + term + term + term + term       ↑ x - 2 * y

A top-down parser cannot handle a left-recursive grammar.

Problem: termination
– A wrong choice leads to infinite expansion (more importantly: without consuming any input!)
– May not be as obvious as this
– Our grammar is left recursive
Left Recursion

• Bad news:
– Top-down parsers cannot handle left recursion

• Good news:
– We can systematically eliminate left recursion


Backtracking parser

• Tries different production rules to find a match for the input string, backtracking each time a choice fails.

• Slow: requires exponential time in general.

• Hence not preferred in practical compilers.


Predictive parsing

• The goal of predictive parsing is to construct a top-down parser that never backtracks.

• To do so, we must transform a grammar in two ways:
– eliminate left recursion, and
– perform left factoring.

Consider this grammar:

A ::= A a | b

This grammar recognizes ba*. Here is an alternative way to write it without left recursion:

A ::= b A'
A' ::= a A' | ε
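The rewrite above is an instance of the usual rule for immediate left recursion: A ::= A α | β becomes A ::= β A' and A' ::= α A' | ε. A minimal C sketch of a recognizer for the rewritten grammar is shown below; the function names `parse_A` and `parse_A_prime` are assumptions for this illustration. Because every recursive call happens only after one input symbol has been consumed, the recursion terminates, which the original left-recursive rule could not guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* A' ::= a A' | ε
 * Consumes a (possibly empty) run of 'a' and returns the new position. */
static size_t parse_A_prime(const char *in, size_t pos) {
    if (in[pos] == 'a')
        return parse_A_prime(in, pos + 1);   /* A' ::= a A' */
    return pos;                              /* A' ::= ε, consume nothing */
}

/* A ::= b A'
 * Returns 1 if the whole input is in ba*, else 0. */
int parse_A(const char *in) {
    size_t pos;
    assert(in != NULL);
    if (in[0] != 'b')
        return 0;
    pos = parse_A_prime(in, 1);
    return in[pos] == '\0';                  /* all input must be consumed */
}
```

For example, `parse_A("baaa")` accepts, while `parse_A("ab")` is rejected at the first symbol.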
Recursive Descent Parser
Basic idea
• Given A → α | β, the parser should be able to choose between α and β.

• The parser uses a collection of recursive procedures for parsing the given input string.

• The RHS of the production is directly converted to a program.

• For each nonterminal a separate procedure is written, and the body of the procedure follows the RHS of the productions of the corresponding nonterminal.
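Recursive Descent parser

• For every variable in the grammar we write one function.
• If a variable has many productions, the function has if-else cases or switch cases, depending on the number of productions.
• Why this name? The functions call one another recursively while descending from the start symbol.

Grammar
E -> id E’
E’ -> + id E’ | ε

In the code below the token id is written as the single character 'i' and E’ is named Eprime, so that the program is valid C.

    #include <stdio.h>

    int l;                          /* current lookahead symbol */

    void match(int t) {             /* consume t, or report an error */
        if (l == t)
            l = getchar();
        else
            printf("error\n");
    }

    void Eprime(void) {             /* E' -> + id E' | ε */
        if (l == '+') {
            match('+');
            match('i');
            Eprime();
        }
        else
            return;                 /* E' -> ε */
    }

    void E(void) {                  /* E -> id E' */
        if (l == 'i') {
            match('i');
            Eprime();
        }
        else
            printf("error\n");
    }

    int main(void) {
        l = getchar();
        E();
        if (l == '$')
            printf("parsing successful\n");
        return 0;
    }

According to the grammar, id + id $ can be generated (typed as i+i$ for this program).

Recursion stack while parsing id + id $ (top first): E’(), E’(), E(), main()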
Exercise

S -> A B d | a B c
A -> ε
B -> b | c
LL(1) Parser

[Figure: model of an LL(1) parser. An input buffer ending in $, a stack with $ at the bottom, the LL(1) parser driver, and the LL(1) parse table.]


First and Follow sets

Nonterminal   First set   Follow set
E             (, id       ), $
E'            +, ε        ), $
T             (, id       +, ), $
T'            *, ε        +, ), $
F             (, id       *, +, ), $

Constructing the parse table

Input: Grammar G
Output: Parse table M
Method:
1. For each production A → α of the grammar, perform steps 2 and 3.
2. For each terminal a in First(α), add A → α to M[A, a].
3. If ε is in First(α), add A → α to M[A, b] for each terminal b in Follow(A). If ε is in First(α) and $ is in Follow(A), add A → α to M[A, $].

Parse table

                              Input symbols
NT    id         +            *            (          )         $
E     E → TE'                              E → TE'
E'               E' → +TE'                            E' → ε    E' → ε
T     T → FT'                              T → FT'
T'               T' → ε       T' → *FT'               T' → ε    T' → ε
F     F → id                               F → (E)
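The table construction method can be sketched in C as follows. This is an illustrative sketch only: the First and Follow sets are hard-coded from the table above rather than computed, and the single-character encoding ('i' for id, 'e' for E', 't' for T', '#' for ε) and the names `build_table`, `first_of` and `M` are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed encoding: 'i' = id, 'e' = E', 't' = T', '#' = ε. */
static const char NTS[] = "EeTtF";
static const char *FIRST[]  = {"(i", "+#", "(i", "*#", "(i"};
static const char *FOLLOW[] = {")$", ")$", "+)$", "+)$", "*+)$"};
/* Productions written as "A=rhs"; an empty rhs stands for ε. */
static const char *PRODS[] = {"E=Te", "e=+Te", "e=", "T=Ft",
                              "t=*Ft", "t=", "F=(E)", "F=i"};
enum { NPRODS = 8 };

/* M[nonterminal index][terminal character] = production index, or -1. */
static int M[5][128];

/* Index of nonterminal c in NTS, or -1 if c is a terminal. */
static int nt_index(char c) {
    const char *p = (c != '\0') ? strchr(NTS, c) : NULL;
    return p ? (int)(p - NTS) : -1;
}

/* Writes First(rhs) without ε into out; returns 1 if rhs can derive ε. */
static int first_of(const char *rhs, char *out) {
    size_t n = 0;
    for (; *rhs != '\0'; rhs++) {
        int x = nt_index(*rhs);
        int has_eps = 0;
        const char *f;
        if (x < 0) {                       /* terminal: it is First(rhs) */
            out[n++] = *rhs;
            out[n] = '\0';
            return 0;
        }
        for (f = FIRST[x]; *f != '\0'; f++) {
            if (*f == '#') has_eps = 1;
            else out[n++] = *f;
        }
        if (!has_eps) {                    /* this symbol cannot vanish */
            out[n] = '\0';
            return 0;
        }
    }
    out[n] = '\0';
    return 1;                              /* every symbol can derive ε */
}

/* Fills M and returns the number of conflicting cells (0 means LL(1)). */
int build_table(void) {
    int p, conflicts = 0;
    memset(M, -1, sizeof M);               /* all-ones bytes give int -1 */
    for (p = 0; p < NPRODS; p++) {         /* step 1 */
        int A = nt_index(PRODS[p][0]);
        char set[64];
        const char *s;
        int eps = first_of(PRODS[p] + 2, set);
        assert(A >= 0);
        if (eps)                           /* step 3: add Follow(A), which */
            strcat(set, FOLLOW[A]);        /* already contains $ here      */
        for (s = set; *s != '\0'; s++) {   /* step 2 (and 3): fill cells   */
            int *cell = &M[A][(unsigned char)*s];
            if (*cell != -1 && *cell != p) conflicts++;
            *cell = p;
        }
    }
    return conflicts;
}
```

After `build_table()` returns 0, a lookup such as `M[1]['+']` gives the index of E' → +TE' in `PRODS`, and an entry of -1 corresponds to an empty (error) cell of the table.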


Parsing the input id+id*id$ using the parse table (top of stack on the left):

Input        Stack       Action
id+id*id$    E$          [E,id]
id+id*id$    TE'$        [T,id]
id+id*id$    FT'E'$      [F,id]
id+id*id$    idT'E'$     id = id
+id*id$      T'E'$       [T',+]
+id*id$      E'$         [E',+]
+id*id$      +TE'$       + = +
id*id$       TE'$        [T,id]
id*id$       FT'E'$      [F,id]
id*id$       idT'E'$     id = id
*id$         T'E'$       [T',*]
*id$         *FT'E'$     * = *
id$          FT'E'$      [F,id]
id$          idT'E'$     id = id
$            T'E'$       [T',$]
$            E'$         [E',$]
$            $           $ = $

Since there are no more symbols to match in either the input or the stack, the input string is accepted.
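The stack-driven trace can be reproduced with a small table-driven parser. The C sketch below hard-codes the parse table for this grammar in a `switch`; the single-character encoding ('i' for id, 'e' for E', 't' for T') and the names `ll1_parse` and `table` are assumptions made for this illustration, not part of the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed encoding: 'i' = id, 'e' = E', 't' = T'. */

/* M[X, a]: the RHS to push, "" for an ε-production, NULL for an error cell. */
static const char *table(char X, char a) {
    switch (X) {
    case 'E': return (a == 'i' || a == '(') ? "Te" : NULL;
    case 'e': if (a == '+') return "+Te";
              return (a == ')' || a == '$') ? "" : NULL;
    case 'T': return (a == 'i' || a == '(') ? "Ft" : NULL;
    case 't': if (a == '*') return "*Ft";
              return (a == '+' || a == ')' || a == '$') ? "" : NULL;
    case 'F': if (a == 'i') return "i";
              return (a == '(') ? "(E)" : NULL;
    default:  return NULL;
    }
}

static int is_nonterminal(char X) {
    return X != '\0' && strchr("EeTtF", X) != NULL;
}

/* Returns 1 if input (which must end in '$') is accepted, else 0. */
int ll1_parse(const char *input) {
    char stack[256];
    size_t top = 0, ip = 0;
    assert(input != NULL);
    stack[top++] = '$';                       /* bottom-of-stack marker */
    stack[top++] = 'E';                       /* start symbol */
    while (top > 0) {
        char X = stack[--top];                /* pop the top of the stack */
        char a = input[ip];                   /* current input symbol */
        if (is_nonterminal(X)) {
            const char *rhs = table(X, a);
            size_t n, k;
            if (rhs == NULL) return 0;        /* empty cell M[X, a] */
            n = strlen(rhs);
            if (top + n > sizeof stack) return 0;
            for (k = n; k > 0; k--)           /* push RHS right to left so */
                stack[top++] = rhs[k - 1];    /* its first symbol is on top */
        } else {
            if (X != a) return 0;             /* terminal mismatch */
            if (X == '$')                     /* $ = $: accept */
                return input[ip + 1] == '\0';
            ip++;                             /* matched: advance the input */
        }
    }
    return 0;
}
```

Calling `ll1_parse("i+i*i$")` goes through the same sequence of table lookups and matches as the trace above and accepts the string.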
Rules for LL(1) Grammars

• Grammar must not have left-recursion

• Grammar must be predictive / deterministic

• Grammar must not be ambiguous



Error Handling

• Errors
    – Terminal at top of stack ≠ terminal on input
    – Variable A is on top of the stack and M[A, a] has no production
• The parser has to “recover” or synchronize itself
    – If we just continue: many further senseless errors
• The generated error messages should, as far as possible, be
    – Exact in meaning
    – Exact in place (e.g. line, column)
• Find as many errors as possible in one run
• Avoid propagated errors
Take home message

• Top-down parsers use leftmost derivations.

• Backtracking parsers take exponential time in general.

• Backtracking parsers cannot handle left-recursive grammars.

• Backtracking is overcome by look-ahead in predictive recursive descent parsers.

• Recursive descent parsers are implemented by writing recursive procedures.


Take home message

• Top-down parsers derive the input from the start symbol using a leftmost derivation.

• Two classes of top-down parsers exist:
    – Backtracking: exponential time, since it tries all suitable productions
    – Predictive: predicts the production based on the parse table


Take home message

• Not every CFG is an LL(1) grammar.

• Before implementing the predictive parser, eliminate left recursion and perform left factoring to try to bring the CFG into LL(1) form.

• Compute the First and Follow sets.

• Make the entries in the LL(1) table. If every cell in the table has at most one entry, then the grammar is LL(1).
