
MODULE 4

Chapter 1
Recursive Descent Parsing
• A parser that uses a collection of recursive procedures to parse the given input string is called a Recursive Descent Parser.

• The CFG is used to build the recursive routines.

• The RHS of each production rule is converted directly into program code.

• For each non-terminal, a separate procedure is written; the body of the procedure is derived from the RHS of the corresponding production.
Basic Steps for construction of RD Parser
• The RHS of the rule is converted into program code symbol by symbol:
1. If the symbol is a non-terminal, a call is made to the procedure corresponding to that non-terminal.
2. If the symbol is a terminal, it is matched against the lookahead from the input. The lookahead pointer is advanced when the symbol matches.
3. If the production rule has several alternatives, all the alternatives are combined into a single procedure body.
4. The parser is activated by a call to the procedure corresponding to the start symbol.
Example – RD Parser
1. Consider
• E → num T
• T → * num T | ε
Example – RD Parser
Solution: E → num T

E( )
{
    if lookahead == 'num'
    {
        match(num);
        T( );
    }
    else
        error;
}

Example – RD Parser
Solution: T → * num T | ε

T( )
{
    if lookahead == '*'
    {
        match(*);
        if lookahead == 'num'
        {
            match(num);
            T( );
        }
        else
            error;
    }
    else
        NULL;    /* ε-alternative: consume no input */
}

Example – RD Parser
Solution: helper procedures and driver

procedure match(token t)
{
    if lookahead == t
        lookahead = next_token;
    else
        error;
}

procedure error
{
    print("error");
}

main()
{
    E( );
    if lookahead == '$'
        print("parsing successful");
}
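The procedures above can be collected into a small runnable Python sketch (class and helper names are my own, not from the slides); it parses token lists for the grammar E → num T, T → * num T | ε:

```python
# Recursive-descent parser sketch for E -> num T, T -> * num T | epsilon.

class RDParser:
    def __init__(self, tokens):
        self.tokens = tokens + ["$"]   # end-marker, as in the slides
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, t):
        # match the terminal and advance the lookahead pointer
        if self.lookahead == t:
            self.pos += 1
        else:
            raise SyntaxError(f"expected {t}, got {self.lookahead}")

    def E(self):
        # E -> num T
        self.match("num")
        self.T()

    def T(self):
        # T -> * num T | epsilon
        if self.lookahead == "*":
            self.match("*")
            self.match("num")
            self.T()
        # else: epsilon-alternative -- return without consuming input

    def parse(self):
        self.E()
        if self.lookahead == "$":
            return True
        raise SyntaxError("extra input after expression")

print(RDParser(["num", "*", "num", "*", "num"]).parse())  # -> True
```

Each procedure mirrors one production: non-terminals become calls, terminals become match().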
Shift Reduce Parser

A shift-reduce parser uses two data structures:

1. An input buffer storing the input string.
2. A stack for storing and accessing the LHS and RHS of the rules.
Shift Reduce Parser
1. Consider the grammar
E → E + E
E → E * E
E → id

Perform shift-reduce parsing on the input string
"id1+id2*id3"
Shift Reduce Parsing

Stack        Input Buffer      Parsing Action

$            id1+id2*id3$      Shift
$id1         +id2*id3$         Reduce by E → id
$E           +id2*id3$         Shift +
$E+          id2*id3$          Shift id2
$E+id2       *id3$             Reduce by E → id
$E+E         *id3$             Shift *
$E+E*        id3$              Shift id3
$E+E*id3     $                 Reduce by E → id
$E+E*E       $                 Reduce by E → E * E
$E+E         $                 Reduce by E → E + E
$E           $                 Accept
2. Consider the grammar
S → T L ;
T → int | float
L → L , id | id
Perform shift-reduce parsing on the input string
"int id,id;"
Shift Reduce Parsing

Stack      Input Buffer    Parsing Action
$          int id,id;$     Shift
$int       id,id;$         Reduce by T → int
$T         id,id;$         Shift
$Tid       ,id;$           Reduce by L → id
$TL        ,id;$           Shift
$TL,       id;$            Shift
$TL,id     ;$              Reduce by L → L , id
$TL        ;$              Shift
$TL;       $               Reduce by S → T L ;
$S         $               Accept
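The trace above can be reproduced by a naive shift-reduce loop in Python (a sketch with assumed helper names; greedily reducing the longest handle on top of the stack happens to work for this grammar, whereas real parsers drive shifts and reduces from ACTION/GOTO tables):

```python
# Naive shift-reduce sketch for S -> T L ;, T -> int | float, L -> L , id | id.

RULES = [
    ("S", ["T", "L", ";"]),
    ("T", ["int"]),
    ("T", ["float"]),
    ("L", ["L", ",", "id"]),
    ("L", ["id"]),
]

def shift_reduce(tokens, start="S"):
    stack, buf, actions = [], list(tokens), []
    while True:
        # Try to reduce: the longest RHS matching the top of the stack wins.
        best = None
        for lhs, rhs in RULES:
            if stack[-len(rhs):] == rhs:
                if best is None or len(rhs) > len(best[1]):
                    best = (lhs, rhs)
        if best:
            lhs, rhs = best
            del stack[-len(rhs):]
            stack.append(lhs)
            actions.append(f"reduce {lhs} -> {' '.join(rhs)}")
        elif buf:
            stack.append(buf.pop(0))
            actions.append(f"shift {stack[-1]}")
        else:
            break
    return stack == [start], actions

ok, trace = shift_reduce(["int", "id", ",", "id", ";"])
print(ok)          # True: the string is accepted
for step in trace:
    print(step)
```

The printed actions match the Stack/Action columns of the table above: five shifts interleaved with the four reductions T → int, L → id, L → L , id, and S → T L ;.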
3. Consider the grammar
S → (L) | a
L → L , S | S
Perform shift-reduce parsing on the input string
"(a,(a,a))"
Shift Reduce Parsing

Stack       Input Buffer    Parsing Action
$           (a,(a,a))$      Shift
$(          a,(a,a))$       Shift
$(a         ,(a,a))$        Reduce by S → a
$(S         ,(a,a))$        Reduce by L → S
$(L         ,(a,a))$        Shift
$(L,        (a,a))$         Shift
$(L,(       a,a))$          Shift
$(L,(a      ,a))$           Reduce by S → a
$(L,(S      ,a))$           Reduce by L → S
$(L,(L      ,a))$           Shift
$(L,(L,     a))$            Shift
$(L,(L,a    ))$             Reduce by S → a
$(L,(L,S    ))$             Reduce by L → L , S
$(L,(L      ))$             Shift
$(L,(L)     )$              Reduce by S → (L)
$(L,S       )$              Reduce by L → L , S
$(L         )$              Shift
$(L)        $               Reduce by S → (L)
$S          $               Accept
Problem 3
• S → E
• E → E + E
• E → E * E
• E → num
• E → id

• Input string: id1+num*id2
CONFLICTS IN SHIFT REDUCE TECHNIQUES

There are two kinds of conflicts that can occur:
• A shift-reduce conflict occurs in a state that requests both a shift action and a reduce action.
• A reduce-reduce conflict occurs in a state that requests two or more different reduce actions.
First and Follow sets
• First and Follow sets are needed so that the parser can apply the correct production rule at the correct position.
Rules to calculate the First set
1. If X is a terminal, First(X) = {X}.
2. If X → ε is a production, add ε to First(X).
3. If X → Y1 Y2 … Yk, add First(Y1) − {ε} to First(X); if Y1 can derive ε, also add First(Y2) − {ε}, and so on. If all of Y1 … Yk can derive ε, add ε to First(X).
Rules to calculate the Follow set
1. Place $ in Follow(S), where S is the start symbol.
2. For a production A → αBβ, add First(β) − {ε} to Follow(B).
3. For a production A → αB, or A → αBβ where ε ∈ First(β), add Follow(A) to Follow(B).
1. Consider the following grammar
S → AaAb | BbBa
A → ε
B → ε
Compute the First and Follow sets
• Solution:
First(S) = {a, b}
First(A) = {ε}
First(B) = {ε}
Follow(S) = {$}
Follow(A) = {a, b}
Follow(B) = {a, b}
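These sets can also be computed mechanically by a fixed-point iteration. The following Python sketch (naming is my own; ε is encoded as the empty string "") reproduces the answer above:

```python
# FIRST/FOLLOW by fixed-point iteration for S -> AaAb | BbBa, A -> eps, B -> eps.

GRAMMAR = {
    "S": [["A", "a", "A", "b"], ["B", "b", "B", "a"]],
    "A": [[]],          # A -> epsilon
    "B": [[]],          # B -> epsilon
}
START = "S"
NTS = set(GRAMMAR)

def first_of(seq, FIRST):
    """FIRST of a symbol sequence, given the current FIRST sets."""
    out = set()
    for sym in seq:
        if sym not in NTS:            # terminal
            out.add(sym)
            return out
        out |= FIRST[sym] - {""}
        if "" not in FIRST[sym]:
            return out
    out.add("")                       # the whole sequence can vanish
    return out

def compute_first_follow():
    FIRST = {nt: set() for nt in NTS}
    FOLLOW = {nt: set() for nt in NTS}
    FOLLOW[START].add("$")
    changed = True
    while changed:
        changed = False
        for lhs, alts in GRAMMAR.items():
            for rhs in alts:
                f = first_of(rhs, FIRST)
                if not f <= FIRST[lhs]:
                    FIRST[lhs] |= f; changed = True
                for i, sym in enumerate(rhs):
                    if sym in NTS:
                        tail = first_of(rhs[i + 1:], FIRST)
                        add = (tail - {""}) | (FOLLOW[lhs] if "" in tail else set())
                        if not add <= FOLLOW[sym]:
                            FOLLOW[sym] |= add; changed = True
    return FIRST, FOLLOW

FIRST, FOLLOW = compute_first_follow()
print(FIRST["S"], FOLLOW["A"], FOLLOW["B"])
```

The iteration applies the First/Follow rules repeatedly until no set grows any further.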
2. Consider the following grammar
S → (L) | a
L → L , S | S
Compute the First and Follow sets
• Solution:
As the given grammar is left recursive (L → L , S | S), eliminate the left recursion.

So the grammar is
S → (L) | a
L → S L'
L' → , S L' | ε

First(S) = {(, a}
First(L) = {(, a}
First(L') = {',', ε}

Follow(S) = {$, ',', )}
Follow(L) = {)}
Follow(L') = {)}
3. Consider the following grammar
S → iEtS | iEtSeS | a
E → b
Compute the First and Follow sets
• Solution: The production
S → iEtS | iEtSeS | a
needs to be left factored. Rewrite the grammar as
• S → iEtSS' | a
• S' → eS | ε
• E → b

First(S) = {i, a}
First(S') = {e, ε}
First(E) = {b}

Follow(S) = {e, $}
Follow(S') = {e, $}
Follow(E) = {t}
Problem 4:
Calculate the First and Follow sets for the following grammar

E → E + T | T
T → T * F | F
F → (E) | id

Step 1: Remove Left Recursion
E → E + T | T is of the form A → Aα | β with

A = E,  α = +T,  β = T

giving
E → TE'
E' → +TE' | ε

T → T * F | F is of the form A → Aα | β with

A = T,  α = *F,  β = F

giving
T → FT'
T' → *FT' | ε
The right-recursive grammar is

E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
Step 2:

FIRST(E) = {(, id}
FIRST(E') = {+, ε}
FIRST(T) = {(, id}
FIRST(T') = {*, ε}
FIRST(F) = {(, id}
Step 3:

FOLLOW(E) = {$, )}
FOLLOW(E') = {$, )}
FOLLOW(T) = {+, $, )}
FOLLOW(T') = {+, $, )}
FOLLOW(F) = {*, +, $, )}
LL(1) Parsing
Here the first L indicates that the input is scanned from Left to Right, the second L indicates that the parser uses a Leftmost derivation, and the 1 is the number of lookahead symbols, i.e. how many input symbols are examined when making a parsing decision.

After computing the First and Follow sets for each non-terminal symbol, we construct the parsing table. The rows of the table are labelled with the non-terminals and the columns with the terminal symbols.

All ε-productions of the grammar go under the elements of the Follow set of their LHS; the remaining productions go under the elements of their First set.
LL(1) Parser
Steps to construct predictive LL(1) Parser table

1.Computation of First and Follow Sets.


2.Construct the predictive parsing table using
FIRST and FOLLOW functions.
1. Show that the following grammar is LL(1)
S → AaAb | BbBa
A → ε
B → ε
• Step 1: Calculate the First and Follow sets

First(S) = {a, b}
First(A) = {ε}
First(B) = {ε}

Follow(S) = {$}
Follow(A) = {a, b}
Follow(B) = {a, b}
• Step 2: The LL(1) parsing table is

        a              b              $

S       S → AaAb       S → BbBa

A       A → ε          A → ε

B       B → ε          B → ε

• As you can see, all the ε-productions are placed under the Follow set of their symbol, and all the remaining productions are placed under the First set of their symbol. Since there are no multiple entries, this grammar is LL(1).
2. Show that the following grammar is LL(1)
S → (L) | a
L → L , S | S

• Step 1: Calculate the First and Follow sets

As the given grammar is left recursive (L → L , S | S), eliminate the left recursion.

So the grammar is
S → (L) | a
L → S L'
L' → , S L' | ε

First(S) = {(, a}
First(L) = {(, a}
First(L') = {',', ε}

Follow(S) = {$, ',', )}
Follow(L) = {)}
Follow(L') = {)}
• Step 2: The LL(1) parsing table is

        (            )           a           ,             $

S       S → (L)                  S → a

L       L → SL'                  L → SL'

L'                   L' → ε                  L' → ,SL'

• As you can see, all the ε-productions are placed under the Follow set of their symbol, and all the remaining productions are placed under the First set of their symbol. Since there are no multiple entries, this grammar is LL(1).
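With the table above, a table-driven predictive parser is a short stack loop. This Python sketch (hardcoding the table; ε is the empty list) accepts strings of this grammar:

```python
# Predictive LL(1) parser sketch for S -> (L) | a, L -> S L', L' -> , S L' | eps.

TABLE = {
    ("S", "("): ["(", "L", ")"],
    ("S", "a"): ["a"],
    ("L", "("): ["S", "L'"],
    ("L", "a"): ["S", "L'"],
    ("L'", ")"): [],                 # L' -> epsilon, under Follow(L')
    ("L'", ","): [",", "S", "L'"],
}
NONTERMINALS = {"S", "L", "L'"}

def ll1_parse(tokens):
    stack = ["$", "S"]               # start symbol on top of the end-marker
    buf = list(tokens) + ["$"]
    while stack:
        top = stack.pop()
        if top in NONTERMINALS:
            rhs = TABLE.get((top, buf[0]))
            if rhs is None:          # empty table cell: syntax error
                return False
            stack.extend(reversed(rhs))   # push RHS, leftmost symbol on top
        elif top == buf[0]:
            buf.pop(0)               # match terminal (including $)
        else:
            return False
    return not buf

print(ll1_parse(list("(a,(a,a))")))  # True
print(ll1_parse(list("(a,)")))       # False
```

Each step either expands the non-terminal on top of the stack using the table cell selected by the lookahead, or matches a terminal against the input.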
3. Show whether the following grammar is LL(1) or not
S → iEtS | iEtSeS | a
E → b

• Step 1: Calculate the First and Follow sets

The production
S → iEtS | iEtSeS | a
needs to be left factored. Rewrite the grammar as
S → iEtSS' | a
S' → eS | ε
E → b

First(S) = {i, a}
First(S') = {e, ε}
First(E) = {b}

Follow(S) = {e, $}
Follow(S') = {e, $}
Follow(E) = {t}
• Step 2: The LL(1) parsing table is

        i               a        e                   b        $         t

S       S → iEtSS'      S → a

S'                               S' → eS, S' → ε              S' → ε

E                                                   E → b

• As you can see, all the ε-productions are placed under the Follow set of their symbol, and all the remaining productions are placed under the First set of their symbol. Since the cell [S', e] has multiple entries, this grammar is not LL(1).
4. Show whether the following grammar is LL(1) or not

E → E + T | T
T → T * F | F
F → (E) | id
Solution:

Step 1: Remove Left Recursion

E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
Step 2:

FIRST(E) = {(, id}
FIRST(E') = {+, ε}
FIRST(T) = {(, id}
FIRST(T') = {*, ε}
FIRST(F) = {(, id}
Step 3:

FOLLOW(E) = {$, )}
FOLLOW(E') = {$, )}
FOLLOW(T) = {+, $, )}
FOLLOW(T') = {+, $, )}
FOLLOW(F) = {*, +, $, )}
Step 4: LL(1) Table

        id          +              *              (           )          $

E       E → TE'                                   E → TE'

E'                  E' → +TE'                                 E' → ε     E' → ε

T       T → FT'                                   T → FT'

T'                  T' → ε         T' → *FT'                  T' → ε     T' → ε

F       F → id                                    F → (E)

The grammar is LL(1).
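Steps 2-4 can be checked programmatically. This Python sketch fills the table from the FIRST/FOLLOW sets above and reports any multiple entries (ε is encoded as the empty string, an ε-RHS as the empty list):

```python
# Build the LL(1) table for the right-recursive expression grammar and
# check it for conflicts. FIRST/FOLLOW are copied from Steps 2-3 above.

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}
FIRST = {"E": {"(", "id"}, "E'": {"+", ""}, "T": {"(", "id"},
         "T'": {"*", ""}, "F": {"(", "id"}}
FOLLOW = {"E": {")", "$"}, "E'": {")", "$"}, "T": {"+", ")", "$"},
          "T'": {"+", ")", "$"}, "F": {"*", "+", ")", "$"}}

def first_of(rhs):
    """FIRST of an RHS; a terminal's FIRST is the terminal itself."""
    out = set()
    for sym in rhs:
        f = FIRST.get(sym, {sym})
        out |= f - {""}
        if "" not in f:
            return out
    return out | {""}        # every symbol (or the empty RHS) can vanish

def build_table():
    table, conflicts = {}, []
    for lhs, alts in GRAMMAR.items():
        for rhs in alts:
            f = first_of(rhs)
            # epsilon-deriving alternatives also go under FOLLOW(lhs)
            terms = (f - {""}) | (FOLLOW[lhs] if "" in f else set())
            for t in terms:
                if (lhs, t) in table:
                    conflicts.append((lhs, t))
                table[(lhs, t)] = rhs
    return table, conflicts

table, conflicts = build_table()
print("LL(1)?", not conflicts)   # no multiple entries
```

Every filled cell agrees with the table above, and the conflict list comes back empty, confirming the grammar is LL(1).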
Problem 4:
Check whether the following grammar is LL(1) or not

S → iCtSE | a
E → eS | ε
C → b
Step 1: The grammar is free of left recursion and has no common prefixes.

Step 2:
FIRST(S) = {i, a}
FIRST(E) = {e, ε}
FIRST(C) = {b}
Step 3:
FOLLOW(S) = {$} ∪ (First(E) − {ε})
          = {$, e}

FOLLOW(E) = Follow(S)
          = {$, e}

FOLLOW(C) = {t}
        i              a        e              $         b         t

S       S → iCtSE      S → a

E                               E → eS, E → ε  E → ε

C                                                        C → b

Multiple entries in cell [E, e].
The grammar is not LL(1).
Problem 5:

S → (L) | a
L → S L'
L' → )SL' | ε

        (            )                     a         $

S       S → (L)                            S → a

L       L → SL'                            L → SL'

L'                   L' → )SL', L' → ε

Multiple entries in cell [L', )].
The grammar is not LL(1).
LR PARSING
• The most prevalent type of bottom-up parser today is based on a concept called LR(k) parsing; the "L" is for left-to-right scanning of the input, the "R" for constructing a rightmost derivation in reverse, and the k for the number of input symbols of lookahead that are used in making parsing decisions.
Why LR Parsers

• LR parsers can be constructed to recognize virtually all programming-language constructs for which context-free grammars can be written.
• Non-LR context-free grammars exist, but these can generally be avoided for typical programming-language constructs.
• The LR-parsing method is the most general non-backtracking shift-reduce parsing method known, yet it can be implemented as efficiently as other, more primitive shift-reduce methods.
• An LR parser can detect a syntactic error as soon as it is possible to do so on a left-to-right scan of the input.
• The class of grammars that can be parsed using LR methods is a proper superset of the class of grammars that can be parsed with predictive or LL methods; LR grammars can describe more languages than LL grammars.
Drawback of LR Parsers

• The principal drawback of the LR method is that it is too much work to construct an LR parser by hand for a typical programming-language grammar.
• A specialized tool, an LR parser generator, is needed.
LR parsing algorithm
• It consists of an input, an output, a stack, a driver
program, and a parsing table that has two parts
(ACTION and GOTO).
• The driver program is the same for all LR parsers;
only the parsing table changes from one parser to
another.
• The parsing program reads characters from an
input buffer one at a time.
• Where a shift-reduce parser would shift a symbol,
an LR parser shifts a state. Each state summarizes
the information contained in the stack below it.
LR parsing algorithm
• The stack holds a sequence of states s0 s1 … sm, where sm is on top.
• In the SLR method, the stack holds states from the LR(0) automaton.
• By construction, each state has a corresponding grammar symbol.
• The states correspond to sets of items; there is a transition from state i to state j if GOTO(Ii, X) = Ij.
• All transitions to state j must be for the same grammar symbol X.
• Thus, each state, except the start state 0, has a unique grammar symbol associated with it.
Model of LR Parser
The LR parser consists of
1) Input
2) Output
3) Stack
4) Driver Program
5) Parsing Table
❖ The Driver Program is the same for all LR parsers.
❖ Only the Parsing Table changes from one parser to the other.
❖ The LR driver program determines Sm, the state on top of the stack, and ai, the current input symbol. It then consults ACTION[Sm, ai], which can take one of four values:
✓ Shift
✓ Reduce
✓ Accept
✓ Error
GOTO Table
❖ The GOTO table specifies which state to put on top of the stack after a reduce.
✓ Rows are state names;
✓ Columns are non-terminals.
❖ The GOTO table is important for finding the next state after every reduction.
❖ The GOTO table is indexed by a state of the parser and a non-terminal (grammar symbol),
ex: GOTO[S, A]
❖ The GOTO table simply indicates the next state of the parser once it has recognized a certain non-terminal.
Augmented Grammar
❖ If G is a grammar with start symbol S, the augmented grammar G' is G with a new start symbol S' and a new production S' → S.
❖ The purpose of the augmented grammar is to indicate to the parser when it should stop parsing and announce acceptance of the input.
LR(0) Items
An LR(0) item of a grammar G is a production of G with a dot (•) at some position of the right side.

The production A → XYZ yields four items:

1. A → •XYZ — we hope to see a string derivable from XYZ next on the input.
2. A → X•YZ — we have just seen on the input a string derivable from X and hope next to see a string derivable from YZ.
3. A → XY•Z
4. A → XYZ•

❖ The production A → ε generates only one item, A → •.
❖ Each item tracks a viable prefix.
❖ Closure item: an item created by the closure operation on a state.
❖ Complete item: an item whose dot is at the end of the RHS.
The types of conflicts that can arise in the LR(0) technique are shift-reduce conflicts and reduce-reduce conflicts.

Shift-reduce conflict: caused when the grammar allows a production rule to be reduced in a state while, in the same state, another production rule can be shifted for the same token.

Reduce-reduce conflict: occurs when two or more productions apply to the same sequence of input. The grammar is then ambiguous, because the input can be interpreted in more than one way.
Problem 1:
Construct the LR(0) parsing table for the following grammar and verify whether it is LR(0) or not.

S → AB
A → a
B → b

Sol. Augmented Grammar: S' → S
S → AB
A → a
B → b
No conflicts. The grammar is LR(0).
Problem 2.
E → T + E
E → T
T → i
Sol. Augmented Grammar: E' → E
E → T + E
E → T
T → i
Shift-reduce conflict in state I2, so the grammar is not LR(0).
Problem 3
S → AA
A → aA | b

Solution:

Add Augment Production and insert '•' symbol at


the first position for every production in G

S` → •S
S → •AA
A → •aA
A → •b
I0 State:
Add Augment production to the I0 State and Compute the Closure

I0 = Closure (S` → •S)


Add all productions starting with S in to I0 State because "•" is followed by the non-terminal. So, the I0 State
becomes

I0 = S` → •S
S → •AA
Add all productions starting with "A" in modified I0 State because "•" is followed by the non-terminal. So, the
I0 State becomes.

I0= S` → •S
S → •AA
A → •aA
A → •b

I1= Go to (I0, S) = closure (S` → S•) = S` → S•


Here, the Production is reduced so close the State.

I1= S` → S•

I2= Go to (I0, A) = closure (S → A•A)


Add all productions starting with A in to I2 State because "•" is followed by the non-terminal. So, the I2 State
becomes
I2 =S→A•A
A → •aA
A → •b
Go to (I2, a) = Closure (A → a•A) = (same as I3)
Go to (I2, b) = Closure (A → b•) = (same as I4)
I3= Go to (I0,a) = Closure (A → a•A)

Add productions starting with A in I3.


A → a•A
A → •aA
A → •b

Go to (I3, a) = Closure (A → a•A) = (same as I3)


Go to (I3, b) = Closure (A → b•) = (same as I4)
I4= Go to (I0, b) = closure (A → b•) = A → b•
I5= Go to (I2, A) = Closure (S → AA•) = S → AA•
I6= Go to (I3, A) = Closure (A → aA•) = A → aA•
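The closure and goto computations above can be written directly in Python. A sketch (items encoded as (LHS, RHS, dot-position) triples, states as frozensets) for the same augmented grammar:

```python
# LR(0) closure/goto sketch for S' -> S, S -> AA, A -> aA | b.

GRAMMAR = {
    "S'": [("S",)],
    "S":  [("A", "A")],
    "A":  [("a", "A"), ("b",)],
}

def closure(items):
    """Repeatedly add A -> .gamma for every non-terminal right after a dot."""
    items = set(items)
    changed = True
    while changed:
        changed = False
        for lhs, rhs, dot in list(items):
            if dot < len(rhs) and rhs[dot] in GRAMMAR:
                for prod in GRAMMAR[rhs[dot]]:
                    item = (rhs[dot], prod, 0)
                    if item not in items:
                        items.add(item)
                        changed = True
    return frozenset(items)

def goto(state, sym):
    """Advance the dot over sym in every applicable item, then close."""
    moved = {(l, r, d + 1) for (l, r, d) in state if d < len(r) and r[d] == sym}
    return closure(moved)

I0 = closure({("S'", ("S",), 0)})
print(len(I0))               # 4 items, as in the I0 state above
I2 = goto(I0, "A")
I4 = goto(I0, "b")
print(goto(I2, "b") == I4)   # True: same state as I4, as noted above
```

Iterating goto over all grammar symbols from I0 until no new state appears yields the full canonical collection I0 … I6 traced in the slides.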
Module 4
Chapter 2
SYNTAX DIRECTED
TRANSLATION
Contents
• Syntax Directed Definition
• Syntax Directed Translation
• Attributed Grammar
• Parse Tree
• Syntax Tree
• Annotated Parse Tree
Phases of Compiler
Syntax Directed Definition vs Syntax Directed Translation
Syntax Directed Translation
• SDD(Rules) and SDT(Actions)
Attributed Grammar
Types of Attributes
• An attribute is said to be Synthesized
Attribute if its parse tree node value is
determined by the attribute value at child
nodes.

• An attribute is said to be Inherited


attribute if its parse tree node value is
determined by the attribute value at parent
and/or siblings node.
Synthesized Attributes
These attributes get values from the attribute values of their
child nodes.

To illustrate, assume the following production:


• S → ABC

• If S is taking values from its child nodes (A, B, C), then it


is said to be a synthesized attribute, as the values of ABC
are synthesized to S.
• Synthesized attributes never take values from their parent
nodes or any sibling nodes.
Inherited Attributes
• In contrast to synthesized attributes, inherited
attributes can take values from parent and/or
siblings. As in the following production,

• S → ABC
• A can get values from S, B and C. B can take
values from S, A, and C. Likewise, C can take
values from S, A, and B.
Example
• Synthesized Attributes:

• Inherited Attributes:
PARSE TREE
A parse tree is the graphical representation of a derivation. Its nodes are labelled with grammar symbols, which can be terminals or non-terminals.

• In parsing, the string is derived from the start symbol; the root of the parse tree is that start symbol.

The parse tree follows these points:

• All leaf nodes are terminals.
• All interior nodes are non-terminals.
• In-order traversal gives the original input string.
PARSE TREE

• E → E + T     Rule 1
• E → E - T     Rule 2
• E → T         Rule 3
• T → T * F     Rule 4
• T → F         Rule 5
• F → a         Rule 6
• F → (E)       Rule 7
• Non-terminals are syntactic variables that denote sets of strings. Symbols on the left-hand side of a production rule are non-terminal symbols; in the above example E, T, F are non-terminal symbols.
• Terminals are the basic symbols from which strings are formed.
• In the above example +, *, -, a, (, ) are terminal symbols.
Parse Tree
Example :Considering the following grammar-
E→E+T|T
T→TxF|F
F → ( E ) | id

Generate the following for the string id + id x id


• Parse tree
• Syntax tree
SYNTAX TREE
• Syntax trees are abstract or compact representations of parse trees.
• They are also called Abstract Syntax Trees.

Syntax trees are called Abstract Syntax Trees because:

• They are an abstract representation of the parse tree.
• They do not retain every detail of the concrete syntax.
• For example, there are no rule nodes and no parentheses.
Annotated Parse Tree

• The parse tree containing the values of


attributes at each node for given input string is
called annotated or decorated parse tree.

Features :
• High level specification
• Hides implementation details
• Explicit order of evaluation is not specified
Annotated Parse Tree
• Let us assume an input string 4 * 5 + 6 for computing synthesized
attributes. The annotated parse tree for the input string is :
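For 4 * 5 + 6 the synthesized val attributes evaluate to 26. A Python sketch that annotates a hand-built parse tree bottom-up (node and field names are my own; the tree follows the usual grammar E → E + T | T, T → T * F | F, F → num):

```python
# Bottom-up (synthesized) attribute evaluation for the input 4 * 5 + 6.

class Node:
    def __init__(self, label, children=(), val=None):
        self.label, self.children, self.val = label, list(children), val

def annotate(node):
    """Post-order walk: each node's val is computed from its children's val."""
    for c in node.children:
        annotate(c)
    kids = node.children
    if node.label == "E" and len(kids) == 3:      # E -> E + T
        node.val = kids[0].val + kids[2].val
    elif node.label == "T" and len(kids) == 3:    # T -> T * F
        node.val = kids[0].val * kids[2].val
    elif kids:                                    # unit productions E->T, T->F
        node.val = kids[0].val
    return node.val

# Hand-built parse tree for 4 * 5 + 6
tree = Node("E", [
    Node("E", [Node("T", [
        Node("T", [Node("F", [], 4)]),
        Node("*"),
        Node("F", [], 5)])]),
    Node("+"),
    Node("T", [Node("F", [], 6)]),
])
print(annotate(tree))   # 26
```

Because every attribute flows strictly from children to parent, a single post-order traversal suffices; this is exactly what the annotated parse tree pictures.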
Dependency graph
1. Design the dependency graph for the following:
E → E1 + E2
E → E1 * E2
2. Design the dependency graph for the following string: int a, b, c

Solution
Production Rule        Semantic Actions

S → T List             List.in := T.type

T → int                T.type := integer

T → float              T.type := float

T → char               T.type := char

T → double             T.type := double

List → List1 , id      List1.in := List.in
                       Enter_type(id.entry, List.in)

List → id              Enter_type(id.entry, List.in)
Dependency graph
Construction of syntax tree for expressions

• Constructing the syntax tree for an expression amounts to translating the expression into postfix form.
• A node is created for each operator and operand. Each node can be implemented as a record with multiple fields.
• The following functions are used to build syntax trees for expressions:
• mknode(op, left, right)
• mkleaf(id, entry)
• mkleaf(num, value)
Construction of syntax tree for expressions

• mknode(op, left, right):
Creates an operator node labelled op with two pointers, left and right.
• mkleaf(id, entry):
Creates an identifier node labelled id; entry is a pointer to the symbol table.
• mkleaf(num, value):
Creates a number node labelled num whose val field holds the value of that number.
Example 1: Construct the syntax tree for
the expression x*y-5+z
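Using the three functions, the tree for x*y-5+z, i.e. ((x*y) - 5) + z, is built bottom-up. A Python sketch (records encoded as dicts; the symbol-table "entry" arguments are stand-in strings, not real symbol-table pointers):

```python
# mknode/mkleaf sketch building the syntax tree for x*y - 5 + z.

def mknode(op, left, right):
    """Operator node with pointers to its left and right subtrees."""
    return {"op": op, "left": left, "right": right}

def mkleaf_id(entry):             # mkleaf(id, entry)
    return {"id": entry}

def mkleaf_num(value):            # mkleaf(num, value)
    return {"num": value}

# Built bottom-up, mirroring a left-to-right reduction of the expression:
p1 = mknode("*", mkleaf_id("x"), mkleaf_id("y"))
p2 = mknode("-", p1, mkleaf_num(5))
p3 = mknode("+", p2, mkleaf_id("z"))

def postfix(n):
    """Post-order listing -- operands before operators, as the slides note."""
    if "op" in n:
        return postfix(n["left"]) + postfix(n["right"]) + [n["op"]]
    return [n.get("id", n.get("num"))]

print(postfix(p3))   # ['x', 'y', '*', 5, '-', 'z', '+']
```

The post-order traversal of the finished tree yields the postfix form of the expression, which is why building the syntax tree and translating to postfix are two views of the same process.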