Compiler Design Unit 2
(SYNTAX ANALYZER)
In the compiler model, the parser obtains a string of
tokens from the lexical analyzer and verifies that the
string can be generated by the grammar for the
source language.
We expect the parser to report any syntax error in
an intelligible fashion. It should also recover from
commonly occurring errors so that it can
continue processing the remainder of its input.
TYPES OF PARSER
TOP DOWN PARSING
(Backtracking)
• Top-down parsers start from the root node (the start symbol) and
match the input string against the production rules, replacing a
non-terminal with the right-hand side of a production when it matches.
• To understand this, take the following example of CFG:
W=read
TOP DOWN PARSING
(Recursive Descent Parsing)
• Recursive descent is a simple parsing
algorithm that is very easy to implement.
• It is a top-down parsing algorithm because it
builds the parse tree from the top (the start
symbol) down.
Q: Write down the recursive procedures for top-down parsing
of the grammar:
S → cAdW
A → ab / a
W → cad
Q: Write down the recursive procedures for top-down parsing
of the grammar:
E → TA
A → +TA / ε
T → FB
B → *FB / ε
F → (E) / id
TOP DOWN PARSING
(Recursive Descent Parsing) cont…
Sol 1: Recursive descent procedures for the grammar
S → cAdW    A → ab / a    W → cad
(Match(t) is assumed to advance the input if the lookahead is t and to call Error() otherwise.)
Procedure S()
{
    Match('c');
    A();
    Match('d');
    W();
}
Procedure A()
{
    Match('a');                  // both alternatives of A begin with 'a'
    if (lookahead == 'b')        // A → ab; otherwise A → a
        Match('b');
}
Procedure W()
{
    if (lookahead == 'c')
    {
        Match('c');
        Match('a');
        Match('d');
    }
    else
        Error();
}
TOP DOWN PARSING
(Recursive Descent Parsing) cont…
Sol 2: Recursive descent procedures for the grammar
E → TA    A → +TA / ε    T → FB
Procedure E()
{
    T();
    A();
}
Procedure A()
{
    if (lookahead == '+')
    {
        Match('+');
        T();
        A();
    }
    else
        return;        // A → ε
}
Procedure T()
{
    F();
    B();
}
TOP DOWN PARSING
(Recursive Descent Parsing) cont…
Sol 2 (cont.): Recursive descent procedures for
B → *FB / ε    F → (E) / id
Procedure B()
{
    if (lookahead == '*')
    {
        Match('*');
        F();
        B();
    }
    else
        return;        // B → ε
}
Procedure F()
{
    if (lookahead == '(')
    {
        Match('(');
        E();
        Match(')');
    }
    else if (lookahead == 'id')
    {
        Match('id');
    }
    else
        Error();
}
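The procedures of Sol 2 translate almost line for line into a runnable program. The following is a minimal Python sketch, not a definitive implementation: the class name `Parser`, the helper `accepts`, and the convention that the input is a list of token strings ending in an implicit `$` are assumptions made for illustration.

```python
class ParseError(Exception):
    """Raised when the input does not match the grammar."""


class Parser:
    """Recursive descent parser for
    E -> T A,  A -> + T A | eps,  T -> F B,  B -> * F B | eps,  F -> ( E ) | id
    """

    def __init__(self, tokens):
        # '$' marks the end of the input (assumed convention).
        self.tokens = list(tokens) + ["$"]
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, expected):
        # Advance only if the lookahead is the expected terminal.
        if self.lookahead != expected:
            raise ParseError(f"expected {expected!r}, found {self.lookahead!r}")
        self.pos += 1

    def E(self):
        self.T()
        self.A()

    def A(self):
        if self.lookahead == "+":
            self.match("+")
            self.T()
            self.A()
        # otherwise A -> eps: consume nothing

    def T(self):
        self.F()
        self.B()

    def B(self):
        if self.lookahead == "*":
            self.match("*")
            self.F()
            self.B()
        # otherwise B -> eps: consume nothing

    def F(self):
        if self.lookahead == "(":
            self.match("(")
            self.E()
            self.match(")")
        elif self.lookahead == "id":
            self.match("id")
        else:
            raise ParseError(f"unexpected token {self.lookahead!r}")


def accepts(tokens):
    """Return True if the whole token list is derivable from E."""
    parser = Parser(tokens)
    try:
        parser.E()
        parser.match("$")
        return True
    except ParseError:
        return False
```

For example, `accepts(["id", "+", "id", "*", "id"])` succeeds, while `accepts(["id", "+"])` fails because F finds no operand after `+`.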
LEFT RECURSION
• A grammar is left-recursive if it has a non-terminal ‘A’ with a
derivation whose left-most symbol is ‘A’ itself.
• Left recursion is a problem for top-down parsers.
• Top-down parsers start parsing from the start symbol, which is itself a
non-terminal.
• When the parser expands a non-terminal and meets the same non-terminal again as
the left-most symbol without consuming any input, it cannot tell when to stop
expanding, and it goes into an infinite loop.
• Example:
(1) A => Aα | β
(2) S => Aα | β
A => Sd
(1) is an example of immediate left recursion, where A is any non-terminal symbol and
α represents a string of terminals and non-terminals.
(2) is an example of indirect left recursion.
A top-down parser will first expand A, which in turn yields a string
beginning with A itself, so the parser may loop forever.
LEFT RECURSION
Problem:1 Direct Recursion
A => A α | β
A => Aα α
A => Aα α α
A => Aα α α α
A => Aα α α α α
A => Aα α α α α α
LEFT RECURSION
Problem:2 Indirect Recursion
S => A α | β
A => S d
S => Aα
S => Sd α
S => Aα d α
S => Sd α d α
S => Aα d α d α
S => Sd α d α d α
REMOVAL OF LEFT RECURSION
One way to remove left recursion is to use the
following technique:
The production
A => Aα | β
is converted into following productions
A => βA'
A'=> αA' | ε
This does not impact the strings derived from the
grammar, but it removes immediate left
recursion.
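The transformation above can be sketched in Python as follows. This is an illustrative sketch only: the function name, the representation of each alternative as a list of symbols, and the use of an empty list for ε are assumptions, not part of the slides.

```python
def remove_immediate_left_recursion(nonterminal, alternatives):
    """Rewrite  A -> A a1 | ... | b1 | ...  as
         A  -> b1 A' | ...
         A' -> a1 A' | ... | eps
    Each alternative is a list of symbols; [] stands for eps.
    """
    # The alpha parts: what follows A in the left-recursive alternatives.
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # The beta parts: alternatives that do not start with A.
    betas = [alt for alt in alternatives if not alt or alt[0] != nonterminal]

    if not alphas:
        # Nothing to do: no immediate left recursion.
        return {nonterminal: alternatives}

    new_nt = nonterminal + "'"
    return {
        nonterminal: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],
    }


if __name__ == "__main__":
    # E -> E + T | T   becomes   E -> T E',  E' -> + T E' | eps
    print(remove_immediate_left_recursion("E", [["E", "+", "T"], ["T"]]))
```

The sketch handles only immediate left recursion; indirect left recursion first has to be turned into immediate form by substitution.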
LEFT RECURSION
Remove Left Recursion: Eg:
A'=> β | 𝜸 | …
S => iEtS Ꜫ / iEtS eS / a
So,
First(A) ={a, d, g}
FIRST AND FOLLOW
Example:
Calculate the First sets for the given grammar:
S → A
A → aC / Bd
B → b
C → g

Non-terminal    First
S               a, b
A               a, b
B               b
C               g

So,
First(S) = First(A) = First(aC) ∪ First(Bd) = {a} ∪ First(B) = {a} ∪ {b} = {a, b}
First(A) = First(aC) ∪ First(Bd) = {a} ∪ First(B) = {a} ∪ {b} = {a, b}
First(B) = {b}
First(C) = {g}
FIRST AND FOLLOW
Example:
Calculate the First sets for the given grammar:
S → aBDh
B → cC
C → bC / ε
D → EF
E → g / ε
F → f / ε

Non-terminal    First
S               a
B               c
C               b, ε
D               g, f, ε
E               g, ε
F               f, ε

So,
First(S) = First(aBDh) = {a}
First(B) = First(cC) = {c}
First(C) = First(bC) ∪ First(ε) = {b, ε}
First(D) = (First(E) – {ε}) ∪ First(F) = {g, f, ε}
First(E) = First(g) ∪ First(ε) = {g, ε}
First(F) = First(f) ∪ First(ε) = {f, ε}
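The First sets of this example can be checked mechanically with a fixed-point iteration. The sketch below is one possible way to do it, under assumed conventions: the grammar is a dictionary from non-terminal to a list of alternatives, each alternative is a list of symbols, `[]` stands for ε, and every symbol that is not a key of the dictionary is treated as a terminal.

```python
EPS = "ε"


def first_of_sequence(symbols, first, nonterminals):
    """First set of a string of grammar symbols, given the current sets."""
    result = set()
    for sym in symbols:
        if sym not in nonterminals:
            # A terminal starts the string: nothing after it matters.
            result.add(sym)
            return result
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return result
    # Every symbol can vanish (or the string is empty).
    result.add(EPS)
    return result


def compute_first(grammar):
    """Repeat the First rules until no set grows any more."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                new = first_of_sequence(alt, first, grammar)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


GRAMMAR = {
    "S": [["a", "B", "D", "h"]],
    "B": [["c", "C"]],
    "C": [["b", "C"], []],
    "D": [["E", "F"]],
    "E": [["g"], []],
    "F": [["f"], []],
}

if __name__ == "__main__":
    for nt, symbols in compute_first(GRAMMAR).items():
        print(nt, sorted(symbols))
```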
FIRST AND FOLLOW
Example:
Calculate the Follow sets for the given grammar:
S → A
A → aC / Bd
B → b
C → g

Non-terminal    First    Follow
S               a, b     $
A               a, b     $
B               b        d
C               g        $

So,
Follow(S) = { $ } : For the start symbol S, place $ in Follow(S).
Follow(A) = Follow(S) = { $ }
Follow(B) = First(d) = { d }
Follow(C) = Follow(A) = { $ }
FIRST AND FOLLOW
Example:
Calculate the Follow sets for the given grammar:
S → aBDh
B → cC
C → bC / ε
D → EF
E → g / ε
F → f / ε

Non-terminal    First       Follow
S               a           $
B               c           g, f, h
C               b, ε        g, f, h
D               g, f, ε     h
E               g, ε        f, h
F               f, ε        h

So,
Follow(S) = { $ }
Follow(B) = First(Dh) without ε. First(D) = { g, f, ε }; because D can derive ε, First(h) = { h } is also included.
So, Follow(B) = { g, f, h }
Follow(C) = Follow(B) ∪ Follow(C) = { g, f, h } (B → cC contributes Follow(B); C → bC adds nothing new)
Follow(D) = First(h) = { h }
Follow(E) = First(F) without ε = { f }; because F can derive ε, Follow(D) = { h } is also included.
So, Follow(E) = { f, h }
Follow(F) = Follow(D) = { h }
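The Follow rules used in this example can be sketched as a second fixed-point iteration on top of First. As before, this is a hedged illustration: the grammar encoding (dictionary of alternatives, `[]` for ε, non-keys are terminals) and all function names are assumptions.

```python
EPS, END = "ε", "$"


def first_of(symbols, first):
    """First of a symbol string; symbols missing from `first` are terminals."""
    out = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)
    return out


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alts in grammar.items():
            for alt in alts:
                new = first_of(alt, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(grammar, start):
    first = compute_first(grammar)
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)            # rule 1: $ follows the start symbol
    changed = True
    while changed:
        changed = False
        for head, alts in grammar.items():
            for alt in alts:
                for i, sym in enumerate(alt):
                    if sym not in grammar:
                        continue      # Follow is defined for non-terminals only
                    rest = first_of(alt[i + 1:], first)
                    new = rest - {EPS}            # rule 2: First of what follows
                    if EPS in rest:
                        new |= follow[head]       # rule 3: the rest can vanish
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


GRAMMAR = {
    "S": [["a", "B", "D", "h"]],
    "B": [["c", "C"]],
    "C": [["b", "C"], []],
    "D": [["E", "F"]],
    "E": [["g"], []],
    "F": [["f"], []],
}

if __name__ == "__main__":
    for nt, symbols in compute_follow(GRAMMAR, "S").items():
        print(nt, sorted(symbols))
```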
FIRST AND FOLLOW
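E → E+T / T
T → T*F / F
F → (E) / id
After removal of left recursion (LR) and left factoring (LF):
E → TE’
E’ → +TE’ / ε
T → FT’
T’ → *FT’ / ε
F → (E) / id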
FIRST AND FOLLOW
• Sol:
NON TERMINAL    FIRST       FOLLOW
E               {id, (}     {$, )}
E’              {+, ε}      {$, )}
T               {id, (}     {+, $, )}
T’              {*, ε}      {+, $, )}
F               {id, (}     {*, +, $, )}
FIRST AND FOLLOW
NON TERMINAL    FIRST        FOLLOW
A               {c, ε}       {a, b, e, d, $}
B               {a, d, e}    {a, b, e}
C               {e, ε}       {a, b, c, d, $}
D               {a, b}       {a, b, e, d, $}
TOP DOWN PARSING
(Non-Recursive Descent Parsing)
The parser is controlled by a program
that behaves as follows, where X is the symbol
on top of the stack and a is the current input symbol:
• If X = a = $, the parser halts and
announces successful completion of
parsing.
• If X = a ≠ $, the parser pops X off the stack
and advances the input pointer to the next
input symbol.
• If X is a non-terminal, the program
consults entry M[X,a] of the parsing
table M. This entry will be either an
X-production of the grammar or an error
entry.
TOP DOWN PARSING
(Non-Recursive Descent Parsing) Cont…
• Q1. Consider this grammar
E → TE’
E’ → +TE’ / ε
T → FT’
T’ → *FT’ / ε
F → (E) / id
TOP DOWN PARSING
(Non-Recursive Descent Parsing) Cont…
• Sol:
NON TERMINAL    FIRST       FOLLOW
E               {id, (}     {$, )}
E’              {+, ε}      {$, )}
T               {id, (}     {+, $, )}
T’              {*, ε}      {+, $, )}
F               {id, (}     {*, +, $, )}
TOP DOWN PARSING
(Non-Recursive Descent Parsing) Cont…
• Sol: Predictive Parsing Table
Non-terminal   +            *            (            )           id          $
E                                        E → TE’                  E → TE’
E’             E’ → +TE’                              E’ → ε                  E’ → ε
T                                        T → FT’                  T → FT’
T’             T’ → ε       T’ → *FT’                 T’ → ε                  T’ → ε
F                                        F → (E)                  F → id
TOP DOWN PARSING
(Non-Recursive Descent Parsing) Cont…
Finding the sequence of moves for Input : id + id * id
STACK        INPUT               ACTION
$E           id + id * id $
$E’T         id + id * id $      E → TE’
$E’T’F       id + id * id $      T → FT’
$E’T’id      id + id * id $      F → id
$E’T’        + id * id $         match id
$E’          + id * id $         T’ → ε
$E’T+        + id * id $         E’ → +TE’
$E’T         id * id $           match +
$E’T’F       id * id $           T → FT’
$E’T’id      id * id $           F → id
$E’T’        * id $              match id
$E’T’F*      * id $              T’ → *FT’
$E’T’F       id $                match *
$E’T’id      id $                F → id
$E’T’        $                   match id
$E’          $                   T’ → ε
$            $                   E’ → ε
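The stack-driven algorithm for this grammar can be sketched in a few lines of Python. The code below is an illustrative sketch under stated assumptions: the table is encoded as a dictionary keyed by (non-terminal, input symbol), the E’ and T’ entries are filled in from the FIRST and FOLLOW sets given earlier, `[]` stands for an ε-production, and the name `ll1_parse` is hypothetical.

```python
# Parsing table M[X, a] -> right-hand side of the production to apply.
TABLE = {
    ("E", "id"): ["T", "E'"],
    ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"],
    ("E'", ")"): [],
    ("E'", "$"): [],
    ("T", "id"): ["F", "T'"],
    ("T", "("): ["F", "T'"],
    ("T'", "+"): [],
    ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [],
    ("T'", "$"): [],
    ("F", "id"): ["id"],
    ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def ll1_parse(tokens):
    """Return the list of (head, body) productions applied, in order.

    Raises SyntaxError if the input is not in the language.
    """
    tokens = list(tokens) + ["$"]
    stack = ["$", "E"]          # top of the stack is the end of the list
    pos = 0
    applied = []
    while True:
        top, current = stack[-1], tokens[pos]
        if top == "$" and current == "$":
            return applied      # X = a = $: success
        if top in NONTERMINALS:
            body = TABLE.get((top, current))
            if body is None:
                raise SyntaxError(f"no entry M[{top}, {current}]")
            stack.pop()
            stack.extend(reversed(body))   # leftmost symbol ends up on top
            applied.append((top, body))
        elif top == current:
            stack.pop()         # X = a != $: match and advance
            pos += 1
        else:
            raise SyntaxError(f"expected {top!r}, found {current!r}")


if __name__ == "__main__":
    for head, body in ll1_parse(["id", "+", "id", "*", "id"]):
        print(head, "->", " ".join(body) or "ε")
```

Run on id + id * id, the printed productions appear in the same order as the ACTION column of the trace: a leftmost derivation.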
TOP DOWN PARSING
(Non-Recursive Descent Parsing) Cont…
• Q2. Consider this grammar
S → iEtSS’ / a
S’ → eS / ε
E → b
TOP DOWN PARSING
(Non-Recursive Descent Parsing) Cont…
• Sol:
NON TERMINAL    FIRST      FOLLOW
S               {i, a}     {$, e}
S’              {e, ε}     {$, e}
E               {b}        {t}
TOP DOWN PARSING
(Non-Recursive Descent Parsing) Cont…
Non-terminal   a         b         e                   i              t     $
S              S → a                                   S → iEtSS’
S’                                 S’ → eS, S’ → ε                          S’ → ε
E                        E → b
Because of the multiple entries in M[S’, e], the grammar is not LL(1).
BOTTOM UP PARSING
(SR parser)
Bottom-up parsing starts from the leaf nodes of a tree and works
upward until it reaches the root node. Here, we start from a
sentence and then apply production rules in reverse in
order to reach the start symbol.
Shift-Reduce Parsing
Shift-reduce parsing uses two steps for bottom-up parsing.
These steps are known as the shift step and the reduce step.
• Shift step: The shift step advances the input
pointer to the next input symbol, which is called the shifted symbol.
This symbol is pushed onto the stack. The shifted symbol is treated
as a single node of the parse tree.
• Reduce step: When the parser finds the complete right-hand side (RHS)
of a grammar rule on the stack and replaces it with the left-hand side (LHS),
it is known as the reduce step. This occurs when the top of the stack
contains a handle. To reduce, the handle is popped off the stack
and replaced with the LHS non-terminal symbol.
BOTTOM UP PARSING
(SR parser)
On the basis of these operations the string is either accepted or
rejected with an error.
• Accept: If only the start symbol is present on the stack and the
input buffer is empty, the parsing action is called accept.
An accept action means that parsing has completed
successfully.
• Error: This is the situation in which the parser can perform
neither a shift action nor a reduce action, nor an accept
action.
BOTTOM UP PARSING
SR PARSER
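Finding the sequence of moves for Input : “ ( a , ( a , a ) ) ”
Grammar:
S → ( L ) / a
L → L , S / S
STACK        INPUT          ACTION
$            (a,(a,a))$
$(           a,(a,a))$      Shift (
$(a          ,(a,a))$       Shift a
$(S          ,(a,a))$       Reduce by S → a
$(L          ,(a,a))$       Reduce by L → S
$(L,         (a,a))$        Shift ,
$(L,(        a,a))$         Shift (
$(L,(a       ,a))$          Shift a
$(L,(S       ,a))$          Reduce by S → a
$(L,(L       ,a))$          Reduce by L → S
$(L,(L,      a))$           Shift ,
$(L,(L,a     ))$            Shift a
$(L,(L,S     ))$            Reduce by S → a
$(L,(L       ))$            Reduce by L → L , S
$(L,(L)      )$             Shift )
$(L,S        )$             Reduce by S → ( L )
$(L          )$             Reduce by L → L , S
$(L)         $              Shift )
$S           $              Reduce by S → ( L )
$S           $              Accept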
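The shift, reduce, accept and error actions can be illustrated with a small Python sketch for the grammar S → ( L ) / a, L → L , S / S. It is deliberately naive and is an assumption-laden illustration, not how a real LR parser finds handles: it reduces whenever the top of the stack equals some right-hand side, trying the productions in the fixed order listed in `PRODUCTIONS` (chosen by hand so that L → L , S is preferred over L → S), and it shifts otherwise.

```python
# Productions in the order they are tried; the order is a hand-picked
# assumption that happens to work for this small grammar.
PRODUCTIONS = [
    ("L", ["L", ",", "S"]),
    ("S", ["(", "L", ")"]),
    ("S", ["a"]),
    ("L", ["S"]),
]


def shift_reduce(tokens):
    """Return the list of moves, ending in 'accept', or raise SyntaxError."""
    stack, remaining, moves = [], list(tokens), []
    while True:
        # Accept: only the start symbol on the stack and no input left.
        if stack == ["S"] and not remaining:
            moves.append("accept")
            return moves
        for head, body in PRODUCTIONS:
            if stack[-len(body):] == body:
                # Reduce: pop the handle and push the left-hand side.
                del stack[-len(body):]
                stack.append(head)
                moves.append(f"reduce {head} -> {' '.join(body)}")
                break
        else:
            if not remaining:
                # Error: neither shift, reduce nor accept is possible.
                raise SyntaxError("no shift or reduce possible")
            # Shift: push the next input symbol onto the stack.
            stack.append(remaining.pop(0))
            moves.append(f"shift {stack[-1]}")


if __name__ == "__main__":
    for move in shift_reduce(list("(a,(a,a))")):
        print(move)
```

Table-driven LR parsers replace the hand-picked ordering with states that decide exactly when a right-hand side on the stack is a handle.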
Operator Precedence Parsing
Operator precedence parsing is a kind of shift-reduce
parsing. It applies to a small class of
grammars, the operator precedence grammars (OPG).
A grammar is said to be an OPG if it has the following
properties:
• No production has ε on its right side.
• No production has two consecutive non-terminals
on its right side.
Operator Precedence Parsing
Eg.
E → E A E | id
A → + | *
Not an operator grammar (E A E contains consecutive non-terminals).
Note that the reduction is in the opposite direction from id1 + id2 * id3 back to E,
where the handle at every step is underlined.
Operator Precedence Parsing
Consider the grammar: E → E+E    E → E*E    E → id
        id    +     *     $
id            ∙>    ∙>    ∙>
[Precedence graph with nodes Fid, Gid, F+, G+, F*, G*, F$, G$]
From the graph we extract the precedence functions by finding the longest path that starts at each node, for example from Fid and Gid:
Fid → G* → F+ → G+ → F$
Gid → F* → G* → F+ → G+ → F$
Productions        LR(0) items with the dot at the left end
S’ → S             S’ → .S
S → AB             S → .AB
A → a              A → .a
B → b              B → .b
BOTTOM UP PARSER
LR(0)
Find whether the grammar is LR(0) or not.
E → E+T
E → T
T → T*F
T → F
F → id
BOTTOM UP PARSER
LR(0)
Sol:
STEP1: Augmented grammar
E’ → E
E → E+T
E → T
T → T*F
T → F
F → id
STEP2: Closure
I0:
E’ → .E
E → .E+T
E → .T
T → .T*F
T → .F
F → .id
BOTTOM UP PARSER
LR(0)
Sol:
GOTO(I0,E) = I1:
E’ → E.
E → E.+T
GOTO(I0,T) = I2:
E → T.
T → T.*F
GOTO(I0,F) = I3:
T → F.
GOTO(I0,id) = I4:
F → id.
BOTTOM UP PARSER
LR(0)
Sol:
GOTO(I1,+) = I5:
E → E+.T
T → .T*F
T → .F
F → .id
GOTO(I2,*) = I6:
T → T*.F
F → .id
GOTO(I5,T) = I7:
E → E+T.
T → T.*F
GOTO(I5,F): T → F.    (same as I3)
GOTO(I5,id): F → id.    (same as I4)
BOTTOM UP PARSER
LR(0)
Sol:
GOTO(I6,F) = I8:
T → T*F.
GOTO(I6,id): F → id.    (same as I4)
GOTO(I7,*): T → T*.F, F → .id    (same as I6)
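The closure and GOTO steps for this grammar can be reproduced with a short Python sketch. It is an illustration under assumed conventions: an item is a pair (production index, dot position), the function names are hypothetical, and states are numbered in order of discovery, so the numbers need not coincide with I0 to I8 as labelled in the solution.

```python
# Augmented grammar: (head, body) pairs; production 0 is E' -> E.
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}


def closure(items):
    """Add X -> .gamma for every non-terminal X that appears after a dot."""
    result = set(items)
    worklist = list(items)
    while worklist:
        prod, dot = worklist.pop()
        body = GRAMMAR[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            for index, (head, _) in enumerate(GRAMMAR):
                item = (index, 0)
                if head == body[dot] and item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)


def canonical_collection():
    """Return (states, transitions) of the LR(0) automaton."""
    states = [closure({(0, 0)})]
    transitions = {}
    for state in states:                 # the list grows while we iterate
        symbols = {GRAMMAR[p][1][d] for p, d in state if d < len(GRAMMAR[p][1])}
        for symbol in sorted(symbols):
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
            transitions[(states.index(state), symbol)] = states.index(target)
    return states, transitions


if __name__ == "__main__":
    states, transitions = canonical_collection()
    print(len(states), "states,", len(transitions), "transitions")
```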
BOTTOM UP PARSER
LR(0)
PARSING TABLE FOR LR(0)
Productions: 1: E → E+T   2: E → T   3: T → T*F   4: T → F   5: F → id
            ACTION                              GOTO
STATES      +       *         id      $         E    T    F
I0                            S4                1    2    3
I1          S5                        Accept
I2          R2      S6/R2     R2      R2
I3          R4      R4        R4      R4
I4          R5      R5        R5      R5
I5                            S4                     7    3
I6                            S4                          8
I7          R1      S6/R1     R1      R1
I8          R3      R3        R3      R3
Because of the SR conflicts (in I2 and I7), the grammar is not LR(0).
BOTTOM UP PARSER
SLR
• For SLR we follow the same process and additionally find
First and Follow for the non-terminals.
NON TERMINAL    FIRST     FOLLOW
E               {id}      {$, +}
T               {id}      {$, +, *}
F               {id}      {$, +, *}
BOTTOM UP PARSER
SLR
PARSING TABLE FOR SLR
            ACTION                          GOTO
STATES      +       *       id      $       E    T    F
I0                          S4              1    2    3
I1          S5                      Accept
I2          R2      S6              R2
I3          R4      R4              R4
I4          R5      R5              R5
I5                          S4                   7    3
I6                          S4                        8
I7          R1      S6              R1
I8          R3      R3              R3
There are no multiple entries in the table, so the grammar is SLR.
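An SLR table like this one is executed by a small driver loop that keeps a stack of states. The Python sketch below is illustrative only: the dictionary encoding of ACTION and GOTO, the production numbering (1: E → E+T, 2: E → T, 3: T → T*F, 4: T → F, 5: F → id) and the name `slr_parse` are assumptions made for the example.

```python
# Production number -> (head, length of the right-hand side).
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3), 4: ("T", 1), 5: ("F", 1)}

ACTION = {
    (0, "id"): "s4",
    (1, "+"): "s5", (1, "$"): "acc",
    (2, "+"): "r2", (2, "*"): "s6", (2, "$"): "r2",
    (3, "+"): "r4", (3, "*"): "r4", (3, "$"): "r4",
    (4, "+"): "r5", (4, "*"): "r5", (4, "$"): "r5",
    (5, "id"): "s4",
    (6, "id"): "s4",
    (7, "+"): "r1", (7, "*"): "s6", (7, "$"): "r1",
    (8, "+"): "r3", (8, "*"): "r3", (8, "$"): "r3",
}
GOTO = {(0, "E"): 1, (0, "T"): 2, (0, "F"): 3, (5, "T"): 7, (5, "F"): 3, (6, "F"): 8}


def slr_parse(tokens):
    """Return the production numbers used, in order of reduction.

    Raises SyntaxError on an empty (error) table entry.
    """
    tokens = list(tokens) + ["$"]
    states = [0]
    pos = 0
    reductions = []
    while True:
        action = ACTION.get((states[-1], tokens[pos]))
        if action is None:
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state {states[-1]}")
        if action == "acc":
            return reductions
        number = int(action[1:])
        if action[0] == "s":
            states.append(number)        # shift: push the new state
            pos += 1
        else:
            head, length = PRODUCTIONS[number]
            del states[-length:]         # reduce: pop one state per RHS symbol
            states.append(GOTO[(states[-1], head)])
            reductions.append(number)


if __name__ == "__main__":
    print(slr_parse(["id", "+", "id", "*", "id"]))
```

Read backwards, the returned sequence of reductions is a rightmost derivation of the input.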
BOTTOM UP PARSER
GOTO(I6,*):
L → *.R
R → .L
L → .*R
L → .id
Same as I4
GOTO(I6,id):
L → id.
Same as I5
BOTTOM UP PARSER
LR(0)
PARSING TABLE FOR LR(0)
            ACTION                            GOTO
STATES      =         *       id      $       S    L    R
I0                    S4      S5              1    2    3
I1                                    Accept
I2          S6/R5     R5      R5      R5
I3          R2        R2      R2      R2
I4                    S4      S5                   8    7
I5          R4        R4      R4      R4
I6                    S4      S5                   8    9
I7          R3        R3      R3      R3
I8          R5        R5      R5      R5
I9          R1        R1      R1      R1
Because of the SR conflict (in I2), the grammar is not LR(0).
BOTTOM UP PARSER
SLR
• For SLR we follow the same process and additionally find
First and Follow for the non-terminals.
NON TERMINAL    FIRST       FOLLOW
S               {*, id}     {$}
L               {*, id}     {=, $}
R               {*, id}     {=, $}
BOTTOM UP PARSER
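PARSING TABLE FOR SLR
            ACTION                            GOTO
STATES      =         *       id      $       S    L    R
I0                    S4      S5              1    2    3
I1                                    Accept
I2          S6/R5                     R5
I3                                    R2
I4                    S4      S5                   8    7
I5          R4                        R4
I6                    S4      S5                   8    9
I7          R3                        R3
I8          R5                        R5
I9                                    R1
The SR conflict in I2 on = remains, so the grammar is not SLR.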
BOTTOM UP PARSER
CLR
Find whether the grammar is CLR or not.
S → AA
A → aA
A → b
BOTTOM UP PARSER
CLR
Sol:
STEP1: Augmented grammar
S’ → S
S → AA
A → aA
A → b
STEP2: Closure
I0:
S’ → .S , $
S → .AA , $
A → .aA , a/b
A → .b , a/b
BOTTOM UP PARSER
CLR
Sol:
GOTO(I0,S) = I1:
S’ → S. , $
GOTO(I0,A) = I2:
S → A.A , $
A → .aA , $
A → .b , $
GOTO(I0,a) = I3:
A → a.A , a/b
A → .aA , a/b
A → .b , a/b
GOTO(I0,b) = I4:
A → b. , a/b
BOTTOM UP PARSER
CLR
Sol:
GOTO(I2,A) = I5:
S → AA. , $
GOTO(I2,a) = I6:
A → a.A , $
A → .aA , $
A → .b , $
GOTO(I2,b) = I7:
A → b. , $
GOTO(I3,A) = I8:
A → aA. , a/b
GOTO(I3,a): same as I3
A → a.A , a/b
A → .aA , a/b
A → .b , a/b
BOTTOM UP PARSER
CLR
Sol:
GOTO(I3,b): same as I4
A → b. , a/b
GOTO(I6,A) = I9:
A → aA. , $
GOTO(I6,a): same as I6
A → a.A , $
A → .aA , $
A → .b , $
GOTO(I6,b): same as I7
A → b. , $
BOTTOM UP PARSER
CLR
PARSING TABLE FOR CLR
Productions: 1: S → AA   2: A → aA   3: A → b
            ACTION                    GOTO
STATES      a       b       $         S    A
I0          S3      S4                1    2
I1                          ACCEPT
I2          S6      S7                     5
I3          S3      S4                     8
I4          R3      R3
I5                          R1
I6          S6      S7                     9
I7                          R3
I8          R2      R2
I9                          R2
There are no multiple entries in the table, so the grammar is CLR.
BOTTOM UP PARSER
LALR
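PARSING TABLE FOR LALR
(States with the same items but different lookaheads are merged: I3 and I6 into I36, I4 and I7 into I47, I8 and I9 into I89.)
            ACTION                    GOTO
STATES      a       b       $         S    A
I0          S36     S47               1    2
I1                          ACCEPT
I2          S36     S47                    5
I36         S36     S47                    89
I47         R3      R3      R3
I5                          R1
I89         R2      R2      R2
There are no multiple entries in the table, so the grammar is LALR.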
BOTTOM UP PARSER
CLR/LALR
Find whether the grammar is CLR or not.
S → AaAb
S → BbBa
A → ε
B → ε
BOTTOM UP PARSER
CLR
Sol:
STEP1: Augmented grammar
S’ → S
S → AaAb
S → BbBa
A → ε
B → ε
STEP2: Closure
I0:
S’ → .S , $
S → .AaAb , $
S → .BbBa , $
A → . , a
B → . , b
BOTTOM UP PARSER
CLR
Sol:
GOTO(I0,S) = I1:
S’ → S. , $
GOTO(I0,A) = I2:
S → A.aAb , $
GOTO(I0,B) = I3:
S → B.bBa , $
GOTO(I2,a) = I4:
S → Aa.Ab , $
A → . , b
GOTO(I3,b) = I5:
S → Bb.Ba , $
B → . , a
BOTTOM UP PARSER
CLR
Sol:
GOTO(I4,A) = I6:
S → AaA.b , $
GOTO(I5,B) = I7:
S → BbB.a , $
GOTO(I6,b) = I8:
S → AaAb. , $
GOTO(I7,a) = I9:
S → BbBa. , $
BOTTOM UP PARSER
CLR
PARSING TABLE FOR CLR
Productions: 1: S → AaAb   2: S → BbBa   3: A → ε   4: B → ε
            ACTION                    GOTO
STATES      a       b       $         S    A    B
I0          R3      R4                1    2    3
I1                          ACCEPT
I2          S4
I3                  S5
I4                  R3                     6
I5          R4                                  7
I6                  S8
I7          S9
I8                          R1
I9                          R2
There are no multiple entries in the table, so the grammar is CLR.
LL vs LR
LL: Does a leftmost derivation.
LR: Does a rightmost derivation in reverse.

LL: Starts with the root non-terminal on the stack.
LR: Ends with the root non-terminal on the stack.

LL: Ends when the stack is empty.
LR: Starts with an empty stack.

LL: Uses the stack for designating what is still to be expected.
LR: Uses the stack for designating what is already seen.

LL: Builds the parse tree top-down.
LR: Builds the parse tree bottom-up.

LL: Continuously pops a non-terminal off the stack, and pushes the corresponding right-hand side.
LR: Tries to recognize a right-hand side on the stack, pops it, and pushes the corresponding non-terminal.

LL: Expands the non-terminals.
LR: Reduces the non-terminals.

LL: Reads the terminals when it pops one off the stack.
LR: Reads the terminals while it pushes them on the stack.

LL: Pre-order traversal of the parse tree.
LR: Post-order traversal of the parse tree.
Parser Conflicts
How to handle an ambiguous grammar in an LR parser:
An LR parser may encounter two types of conflicts:
• Shift-Reduce (SR) conflicts
• Reduce-Reduce (RR) conflicts
They can be resolved using the following methods:
• An SR conflict in the parsing table is resolved by favouring the shift
action over the reduce action.
• An RR conflict in the parsing table is resolved by favouring the
reduction by the production that is listed first in the grammar.
NOTE:
• The parser generator tool YACC also resolves conflicts in the
above manner by default.
• If the grammar is an expression grammar, SR and RR conflicts
are resolved based on the precedence and associativity of the operators.