L5 TopDownParsing


23CS2204

Compiler Design
Dr. Sadu Chiranjeevi
Assistant Professor
Department of Computer Science and Engineering

Top down Parsing
• The following grammar generates a subset of the types of Pascal

type → simple
     | ↑ id
     | array [ simple ] of type

simple → integer
       | char
       | num dotdot num

Example …
• Construction of a parse tree starts at the root, which is labeled with the start symbol

• Repeat the following two steps

– at a node labeled with non terminal A, select one of the productions of A and construct the children nodes (Which production?)
– find the next node at which a subtree is to be constructed (Which node?)

• Parse
array [ num dotdot num ] of integer

Start symbol: type
Suppose type is expanded using the rule type → simple, so that the root type has the single child simple.

• Cannot proceed, as non terminal “simple” never generates a string beginning with token “array”. Therefore, this choice requires back-tracking.

• Back-tracking is not desirable; therefore, take the help of a “look-ahead” token. The current token is treated as the look-ahead token. (This restricts the class of grammars.)

array [ num dotdot num ] of integer

1. Start symbol type, look-ahead array: expand using the rule type → array [ simple ] of type
2. Left most non terminal simple: expand using the rule simple → num dotdot num
3. Left most non terminal type: expand using the rule type → simple
4. Left most non terminal simple: expand using the rule simple → integer
5. All the tokens exhausted: parsing completed

Recursive descent parsing
First set:

Let there be a production

A → α

then First(α) is the set of tokens that appear as the first token in the strings generated from α

For example:
First(simple) = {integer, char, num}
First(num dotdot num) = {num}

Define a procedure for each non terminal
procedure type;
  if lookahead in {integer, char, num}
  then simple
  else if lookahead = ↑
  then begin match(↑);
             match(id)
       end
  else if lookahead = array
  then begin match(array);
             match([);
             simple;
             match(]);
             match(of);
             type
       end
  else error;

procedure simple;
  if lookahead = integer
  then match(integer)
  else if lookahead = char
  then match(char)
  else if lookahead = num
  then begin match(num);
             match(dotdot);
             match(num)
       end
  else error;

procedure match(t: token);
  if lookahead = t
  then lookahead := nexttoken
  else error;

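
The procedures above translate almost line for line into a runnable program. The following is a minimal Python sketch, not the course's reference implementation. The class name TypeParser, the helper accepts, the spelling "^" for the pointer token and the end marker "$" are all assumptions made for illustration.

```python
class ParseError(Exception):
    """Raised when the lookahead token fits no production."""


class TypeParser:
    def __init__(self, tokens):
        # "$" is an assumed end-of-input marker appended for convenience.
        self.tokens = list(tokens) + ["$"]
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, expected):
        # Consume the lookahead if it is the expected token, else fail.
        if self.lookahead != expected:
            raise ParseError(f"expected {expected!r}, found {self.lookahead!r}")
        self.pos += 1

    def parse_type(self):
        if self.lookahead in ("integer", "char", "num"):  # First(simple)
            self.parse_simple()
        elif self.lookahead == "^":  # assumed spelling of the pointer token
            self.match("^")
            self.match("id")
        elif self.lookahead == "array":
            self.match("array")
            self.match("[")
            self.parse_simple()
            self.match("]")
            self.match("of")
            self.parse_type()
        else:
            raise ParseError(f"unexpected token {self.lookahead!r}")

    def parse_simple(self):
        if self.lookahead == "integer":
            self.match("integer")
        elif self.lookahead == "char":
            self.match("char")
        elif self.lookahead == "num":
            self.match("num")
            self.match("dotdot")
            self.match("num")
        else:
            raise ParseError(f"unexpected token {self.lookahead!r}")


def accepts(tokens):
    """Return True if the whole token list is derivable from type."""
    parser = TypeParser(tokens)
    try:
        parser.parse_type()
        parser.match("$")  # all input must be consumed
    except ParseError:
        return False
    return True
```

Each method inspects only the current lookahead before choosing a production, so no back-tracking is needed for this grammar. For instance, accepts("array [ num dotdot num ] of integer".split()) follows the same expansions as the worked parse.
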
Left recursion
• A top down parser with production
A → A α may loop forever

• From the grammar A → A α | β
left recursion may be eliminated by transforming the grammar to

A → β R
R → α R | ε

[Figure: parse tree corresponding to the left recursive grammar (left) and parse tree corresponding to the modified grammar (right)]

Both the trees generate strings of the form βα*

Example
• Consider the grammar for arithmetic expressions

E → E + T | T
T → T * F | F
F → ( E ) | id

• After removal of left recursion the grammar becomes

E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id

Removal of left recursion
In general

A → Aα1 | Aα2 | … | Aαm | β1 | β2 | … | βn

transforms to

A → β1A' | β2A' | … | βnA'
A' → α1A' | α2A' | … | αmA' | ε

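
The general transformation can be sketched in a few lines of Python. This is an illustrative sketch only: the function name remove_immediate_left_recursion and the representation of a right-hand side as a list of symbols (with the empty list standing for ε) are assumptions, and the degenerate production A → A is not handled.

```python
def remove_immediate_left_recursion(nonterminal, productions):
    """Rewrite A -> A a1 | ... | A am | b1 | ... | bn without left recursion.

    productions is a list of right-hand sides; each is a list of symbols
    and the empty list stands for epsilon.
    """
    new_nt = nonterminal + "'"  # assumed naming scheme for the new symbol
    # The alpha parts: what follows the leading A in left recursive rules.
    alphas = [rhs[1:] for rhs in productions if rhs and rhs[0] == nonterminal]
    # The beta parts: alternatives that do not start with A.
    betas = [rhs for rhs in productions if not rhs or rhs[0] != nonterminal]
    if not alphas:
        return {nonterminal: productions}  # nothing to do
    return {
        nonterminal: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],
    }


# E -> E + T | T
print(remove_immediate_left_recursion("E", [["E", "+", "T"], ["T"]]))
```

Applied to E → E + T | T, the sketch returns the alternatives of E and E' from the arithmetic expression example.
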
Left recursion hidden due to many productions
• Left recursion may also be introduced by two or more grammar rules. For example:

S → Aa | b
A → Ac | Sd | ε

there is a left recursion because

S ⇒ Aa ⇒ Sda

• In such cases, left recursion is removed systematically

– Starting from the first rule, replace the occurrences of the first non terminal symbol at the start of the later rules by its right-hand sides
– Remove left recursion from the modified grammar

Removal of left recursion due to many productions …
• After the first step (substitute S by its rhs in the rules) the grammar becomes

S → Aa | b
A → Ac | Aad | bd | ε

• After the second step (removal of left recursion) the grammar becomes

S → Aa | b
A → bdA' | A'
A' → cA' | adA' | ε

Left factoring
• In top-down parsing, when it is not clear which production to choose for expansion of a symbol, defer the decision till we have seen enough input.

In general, if A → αβ1 | αβ2

defer the decision by expanding A to αA'

we can then expand A' to β1 or β2

• Therefore A → αβ1 | αβ2

transforms to

A → αA'
A' → β1 | β2

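
One step of left factoring can be sketched as follows. This is a hedged illustration, not a complete algorithm. The names common_prefix and left_factor are hypothetical, right-hand sides are assumed to be lists of symbols (the empty list is ε), and new non terminals are named by appending primes. Alternatives are grouped by their first symbol only, and the new non terminals are not factored again.

```python
def common_prefix(sequences):
    """Longest prefix shared by all the given symbol lists."""
    prefix = []
    for symbols in zip(*sequences):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix


def left_factor(nonterminal, productions):
    """Factor alternatives of one non terminal that start with the same symbol."""
    groups = {}
    for rhs in productions:
        key = rhs[0] if rhs else None  # None collects the epsilon alternative
        groups.setdefault(key, []).append(rhs)

    result = {nonterminal: []}
    fresh = nonterminal
    for key, group in groups.items():
        if key is None or len(group) == 1:
            result[nonterminal].extend(group)  # nothing to factor
            continue
        alpha = common_prefix(group)
        fresh += "'"  # assumed naming scheme: A', A'', ...
        result[nonterminal].append(alpha + [fresh])
        # Each remainder (a beta) becomes an alternative of the new symbol.
        result[fresh] = [rhs[len(alpha):] for rhs in group]
    return result


# A -> a b | a c  becomes  A -> a A',  A' -> b | c
print(left_factor("A", [["a", "b"], ["a", "c"]]))
```

When one alternative is a prefix of another, the remainder of the shorter one is empty, which shows up as an ε alternative of the new non terminal.
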
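Dangling else problem again
The common prefix in the dangling else grammar can be removed by left factoring

stmt → if expr then stmt else stmt
     | if expr then stmt

can be transformed to

stmt → if expr then stmt S'
S' → else stmt | ε

Left factoring alone does not remove the ambiguity: on look-ahead else both alternatives of S' apply. The parser resolves this by always choosing S' → else stmt, which attaches each else to the nearest unmatched then.
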
Predictive parsers
• A non recursive top down parsing method

• Parser “predicts” which production to use

• It removes backtracking by fixing one production for every non-terminal and input token(s)

• Predictive parsers accept LL(k) languages

– First L stands for left to right scan of input
– Second L stands for leftmost derivation
– k stands for the number of lookahead tokens

• In practice LL(1) is used

Predictive parsing
• Predictive parser can be implemented by maintaining an external stack

[Figure: the parser reads the input, maintains a stack, consults the parse table and produces the output]

Parse table is a two dimensional array M[X,a] where “X” is a non terminal and “a” is a terminal of the grammar

Example
• Consider the grammar

E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id

Parse table for the grammar

      id          +           *           (           )           $
E     E → TE'                             E → TE'
E'                E' → +TE'                           E' → ε      E' → ε
T     T → FT'                             T → FT'
T'                T' → ε      T' → *FT'               T' → ε      T' → ε
F     F → id                              F → (E)

Blank entries are error states. For example, E cannot derive a string starting with ‘+’

Parsing algorithm
• The parser considers 'X', the symbol on top of the stack, and 'a', the current input symbol

• These two symbols determine the action to be taken by the parser

• Assume that '$' is a special token that is at the bottom of the stack and terminates the input string

if X = a = $ then halt

if X = a ≠ $ then pop(X) and ip++

if X is a terminal and X ≠ a then error

if X is a non terminal
   then if M[X,a] = {X → UVW}
        then begin pop(X); push(W,V,U)
             end
        else error

The symbols are pushed in reverse order so that U ends up on top of the stack.

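
The loop above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the table is stored as a dictionary keyed by (non terminal, terminal), an empty list stands for an ε right-hand side, and the names TABLE, NONTERMINALS and ll1_parse are hypothetical. The entries are those of the parse table for the expression grammar.

```python
# Parse table M for the expression grammar; missing keys are error entries.
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {nt for nt, _ in TABLE}


def ll1_parse(tokens, table=TABLE, start="E"):
    """Return the productions used, in order, or raise SyntaxError."""
    stack = ["$", start]           # top of the stack is the end of the list
    tokens = list(tokens) + ["$"]  # "$" terminates the input
    ip = 0
    output = []
    while True:
        x, a = stack[-1], tokens[ip]
        if x == "$" and a == "$":
            return output          # halt: input accepted
        if x not in NONTERMINALS:
            if x != a:
                raise SyntaxError(f"expected {x!r}, found {a!r}")
            stack.pop()            # terminal matches: pop and advance
            ip += 1
        else:
            rhs = table.get((x, a))
            if rhs is None:
                raise SyntaxError(f"no entry M[{x},{a}]")
            stack.pop()
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
            output.append((x, rhs))


for lhs, rhs in ll1_parse("id + id * id".split()):
    print(lhs, "->", " ".join(rhs) or "ε")
```

The printed productions form the leftmost derivation of the input, in the order in which the parser expands them.
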
Example
Stack        input              action
$E           id + id * id $     expand by E → TE'
$E'T         id + id * id $     expand by T → FT'
$E'T'F       id + id * id $     expand by F → id
$E'T'id      id + id * id $     pop id and ip++
$E'T'        + id * id $        expand by T' → ε
$E'          + id * id $        expand by E' → +TE'
$E'T+        + id * id $        pop + and ip++
$E'T         id * id $          expand by T → FT'

Example …
Stack        input              action
$E'T'F       id * id $          expand by F → id
$E'T'id      id * id $          pop id and ip++
$E'T'        * id $             expand by T' → *FT'
$E'T'F*      * id $             pop * and ip++
$E'T'F       id $               expand by F → id
$E'T'id      id $               pop id and ip++
$E'T'        $                  expand by T' → ε
$E'          $                  expand by E' → ε
$            $                  halt

Constructing parse table
• Table can be constructed if, for every non terminal, every lookahead symbol can be handled by at most one production

• First(α) for a string of terminals and non terminals α is

– the set of symbols that might begin the fully expanded (made of only tokens) version of α

• Follow(X) for a non terminal X is

– the set of symbols that might follow the derivation of X in the input stream

Compute first sets
• If X is a terminal symbol then First(X) = {X}

• If X → ε is a production then ε is in First(X)

• If X is a non terminal
and X → Y1Y2 … Yk is a production
then
if for some i, a is in First(Yi)
and ε is in all of First(Yj) (such that j<i)
then a is in First(X)

• If ε is in First(Y1) … First(Yk) then ε is in First(X)

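
These rules can be applied repeatedly until no First set changes. The following Python sketch does that for the expression grammar. The names GRAMMAR, first_of_string and compute_first are assumptions for illustration, a right-hand side is a list of symbols, and the empty list stands for ε.

```python
EPS = "ε"

# Expression grammar; any symbol that is not a key is a terminal.
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}


def first_of_string(symbols, first):
    """First set of a sequence Y1 Y2 ... Yk of grammar symbols."""
    result = set()
    for sym in symbols:
        # For a terminal X, First(X) = {X}.
        sym_first = first[sym] if sym in first else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result  # this symbol cannot vanish, so stop here
    result.add(EPS)        # every symbol can derive ε (or the string is empty)
    return result


def compute_first(grammar):
    """Iterate the rules until the First sets stop growing."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of_string(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


for nt, symbols in compute_first(GRAMMAR).items():
    print(f"First({nt}) =", sorted(symbols))
```

The loop terminates because the sets only grow and the number of symbols is finite.
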
Example
• For the expression grammar
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id

First(E) = First(T) = First(F) = { (, id }
First(E') = { +, ε }
First(T') = { *, ε }

Compute follow sets
1. Place $ in follow(S), where S is the start symbol
2. If there is a production A → αBβ
then everything in first(β) (except ε) is in follow(B)

3. If there is a production A → αB
then everything in follow(A) is in follow(B)

4. If there is a production A → αBβ
and First(β) contains ε
then everything in follow(A) is in follow(B)

Since follow sets are defined in terms of follow sets, the last two steps have to be repeated until the follow sets converge

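
The four rules can be sketched as a fixed-point loop in Python. To keep the sketch short, the First sets of the expression grammar are entered as data rather than computed. The names GRAMMAR, FIRST, first_of_string and compute_follow are assumptions for illustration, and the empty list stands for an ε right-hand side.

```python
EPS = "ε"

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}

# First sets of the non terminals of the expression grammar, entered as data.
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}


def first_of_string(symbols, first):
    """First set of a sequence of symbols; terminals stand for themselves."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def compute_follow(grammar, first, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")                       # rule 1
    changed = True
    while changed:                               # repeat until convergence
        changed = False
        for a, alternatives in grammar.items():
            for rhs in alternatives:
                for i, b in enumerate(rhs):
                    if b not in grammar:
                        continue                 # only non terminals have follow sets
                    beta_first = first_of_string(rhs[i + 1:], first)
                    new = beta_first - {EPS}     # rule 2
                    if EPS in beta_first:        # rules 3 and 4
                        new |= follow[a]
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow


for nt, symbols in compute_follow(GRAMMAR, FIRST, "E").items():
    print(f"follow({nt}) =", sorted(symbols))
```

Rule 3 is covered by the same test as rule 4, because the First set of an empty β is {ε}.
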
Example
• For the expression grammar
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id

follow(E) = follow(E') = { $, ) }
follow(T) = follow(T') = { $, ), + }
follow(F) = { $, ), +, * }

Construction of parse table
• for each production A → α do
– for each terminal ‘a’ in first(α)
M[A,a] = A → α

– If ε is in First(α)
M[A,b] = A → α
for each terminal b in follow(A)

– If ε is in First(α) and $ is in follow(A)
M[A,$] = A → α

• A grammar whose parse table has no multiple entries is called LL(1)

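
The construction can be sketched in Python as below. The First and follow sets of the expression grammar are entered as data, and "$" is simply treated as a member of the follow sets, so the second and third rules collapse into one step. The names build_table and first_of_string are hypothetical, and reporting a conflict with ValueError is an assumption of this sketch.

```python
EPS = "ε"

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}
FOLLOW = {
    "E": {"$", ")"}, "E'": {"$", ")"},
    "T": {"$", ")", "+"}, "T'": {"$", ")", "+"},
    "F": {"$", ")", "+", "*"},
}


def first_of_string(symbols, first):
    """First set of a sequence of symbols; terminals stand for themselves."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def build_table(grammar, first, follow):
    """Return M as a dict keyed by (non terminal, terminal)."""
    table = {}
    for a, alternatives in grammar.items():
        for rhs in alternatives:
            rhs_first = first_of_string(rhs, first)
            targets = rhs_first - {EPS}
            if EPS in rhs_first:
                targets |= follow[a]   # covers "$" when it is in follow(A)
            for terminal in targets:
                if (a, terminal) in table:
                    # A second production for the same cell: not LL(1).
                    raise ValueError(f"conflict at M[{a},{terminal}]")
                table[(a, terminal)] = rhs
    return table


table = build_table(GRAMMAR, FIRST, FOLLOW)
print(len(table), "entries; M[T',*] =", table[("T'", "*")])
```

Cells that are absent from the dictionary correspond to the blank (error) entries of the table.
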
Practice Assignment
• Construct the LL(1) parse table for the expression grammar
bexpr → bexpr or bterm | bterm
bterm → bterm and bfactor | bfactor
bfactor → not bfactor | ( bexpr ) | true | false

• Steps to be followed
– Remove left recursion
– Compute first sets
– Compute follow sets
– Construct the parse table
