Compiler Design Unit 2 AKTU As Per 2023-24 Syllabus
The parser is the phase of the compiler that takes a token stream as input and, with the help of the given grammar, converts it into the corresponding parse tree. The parser is also known as the Syntax Analyzer.
Types of Parser
Top-down Parser is the parser which generates the parse tree for the given input string with the help of grammar productions by expanding the non-terminals.
It starts from the start symbol and ends on the terminals. It uses the leftmost derivation.
Top-down parsers are of 2 types
Recursive descent parser
It is also known as the brute-force parser or the backtracking parser. It generates the parse tree by using brute force and backtracking.
Bottom-up Parser is the parser which generates the parse tree for the given input string with the help of grammar productions by reducing substrings of the input to non-terminals.
It starts from the terminals (the input string) and ends at the start symbol. It uses the reverse of the rightmost derivation.
Bottom-up parsers are of 2 types
LR parser
The LR parser is a bottom-up parser which generates the parse tree for the given string by using an unambiguous grammar. It follows the reverse of the rightmost derivation.
LR parsers are further of 4 types
1. LR( 0 )
2. SLR( 1 )
3. LALR( 1 )
4. CLR( 1 )
Operator precedence parser
It generates the parse tree from the given grammar and string, with the only condition that two consecutive non-terminals and epsilon never appear on the right-hand side of any production.
Bottom Up Parsing
Introduction - All the Bottom Up Parsers will work based on following 4 actions-
1. Shift Action - This involves moving of symbols from input buffer onto the stack.
2. Reduce Action - If the handle appears on top of the stack then, its reduction by using
appropriate production rule is done i.e. RHS of production rule is popped out of stack and
LHS of production rule is pushed onto the stack.
3. Accept Action - If only the start symbol is present in the stack and the input buffer is empty, then the parsing action is called accept. When the accept action is obtained, it means successful parsing is done.
4. Error Action - This is the situation in which the parser can neither perform shift action nor
reduce action and not even accept action.
LR Parsing Algorithm
1. An input - An input string w and an LR parsing table with functions action and goto for
grammar G.
2. An output - If w is in L(G), a bottom-up parse for w; otherwise, an error indication.
3. A stack
4. A driver program
5. A parsing table that has two parts: action and goto
The driver program is the same for all LR parsers ; only the parsing table changes from one parser
to another.
The program uses a stack to store a string of the form
s0 X1 s1 X2 s2...Xm sm
where sm is on the top
1. Each Xi is a grammar symbol
2. Each si is a state
Each state summarises the information contained in the stack below it. The combination of the state on the top of the stack and the current input symbol is used to index the parsing table and determine the shift-reduce parsing decision.
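LR Parsing Table (layout): the rows are the states I0, I1, I2, ..., In. The Action part has one column per terminal and holds the shift / reduce actions; the GoTo part has one column per non-terminal and holds the next states.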
Let S be the state on the top of the stack (TOS) and ‘a’ be the lookahead symbol; the parser then decides on a shift / reduce action from the parsing table.
The following are the possible actions in the parsing table:
If action [ S , a ] = Shift j in the parsing table, then push ‘a’ onto the stack, push state j on top of it, and advance the lookahead to the next input symbol.
If action [ S , a ] = Reduce in the parsing table and A→𝛃 is the production used for the reduction, then pop 2*|𝛃| ( 2 times the length of 𝛃 ) symbols from the stack, push ‘A’ onto the stack, and then push goto( i , A ) onto the TOS, where i is the state exposed on the stack after popping.
If action [ S, a ] = Accept, then the parser halts and announces success.
If action [ S, a ] = error ( blank table entry ), then the parser halts and returns a syntax error to the error handler.
The time complexity of this algorithm is O( n ), where n is the length of the input string.
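The driver loop described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the ACTION and GOTO dictionaries are an assumed SLR(1) table for the small grammar S → AB, A → a, B → b (productions numbered 1 to 3), and the names lr_parse, ACTION, GOTO and PRODUCTIONS are hypothetical.

```python
# Assumed grammar: 1: S -> AB, 2: A -> a, 3: B -> b
PRODUCTIONS = {1: ("S", 2), 2: ("A", 1), 3: ("B", 1)}  # number -> (LHS, |RHS|)

# Assumed SLR(1) table for the grammar above (blank entries mean error).
ACTION = {
    (0, "a"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "b"): ("shift", 5),
    (3, "b"): ("reduce", 2),
    (4, "$"): ("reduce", 1),
    (5, "$"): ("reduce", 3),
}
GOTO = {(0, "S"): 1, (0, "A"): 2, (2, "B"): 4}


def lr_parse(tokens):
    """Return (accepted, list of actions taken) for a sequence of terminals."""
    stack = [0]                      # s0 X1 s1 ... Xm sm, top at the end
    tokens = list(tokens) + ["$"]
    pos = 0
    trace = []
    while True:
        state, a = stack[-1], tokens[pos]
        kind, arg = ACTION.get((state, a), ("error", None))
        trace.append(kind)
        if kind == "shift":
            stack.append(a)          # push the symbol
            stack.append(arg)        # push state j
            pos += 1                 # advance the lookahead
        elif kind == "reduce":
            lhs, rhs_len = PRODUCTIONS[arg]
            del stack[len(stack) - 2 * rhs_len:]   # pop 2*|beta| entries
            exposed = stack[-1]                    # state i after popping
            stack.append(lhs)
            stack.append(GOTO[(exposed, lhs)])
        elif kind == "accept":
            return True, trace
        else:
            return False, trace


accepted, actions = lr_parse("ab")
```

For the string ab the trace contains 2 shifts, 3 reduces and 1 accept. Only the two dictionaries change from one LR parser to another; the loop itself stays the same.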
S → bA | aB
A → b | c
B → ccd | ddc
String addc
Top-down parsers without backtracking are also called predictive parsers, because if a non-terminal has multiple productions, the correct production is selected by using the lookahead symbol.
LL( 1 ) Parsing or Non-Recursive Descent Predictive Parser
Let X be the symbol on the top of the stack and ‘a’ be the lookahead symbol; the parser then takes decisions as follows:
1. If M [X,a] = X → uvw in the table, then replace X by uvw such that u appears on the top of the stack.
2. If M [X,a] = blank entry in the table, then there is a syntax error; return it to the error handler.
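The decisions above can be sketched as a small table-driven loop. The sketch below is an assumption-laden illustration: TABLE is a hand-written M-table for the tiny grammar S → AB, A → a, B → b, and ll1_parse, TABLE and NONTERMINALS are hypothetical names. It also includes the usual step of matching a terminal on the stack against the lookahead.

```python
# Assumed M-table for the grammar S -> AB, A -> a, B -> b.
# Key: (non-terminal on top of stack, lookahead); value: RHS to push.
TABLE = {
    ("S", "a"): ["A", "B"],
    ("A", "a"): ["a"],
    ("B", "b"): ["b"],
}
NONTERMINALS = {"S", "A", "B"}


def ll1_parse(tokens, start="S"):
    """Return True if the token sequence is accepted, False on a syntax error."""
    stack = ["$", start]             # top of the stack is the end of the list
    tokens = list(tokens) + ["$"]
    pos = 0
    while stack:
        top = stack.pop()
        lookahead = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, lookahead))
            if rhs is None:          # blank entry: syntax error
                return False
            # Push the RHS reversed so its first symbol ends up on top.
            stack.extend(reversed(rhs))
        elif top == lookahead:       # terminal (or $) matches the input
            pos += 1
        else:
            return False
    return pos == len(tokens)


result = ll1_parse("ab")
```

With this table, ll1_parse("ab") succeeds, while ll1_parse("aa") fails when B is on top of the stack and the lookahead is a, because M[B, a] is blank.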
LL(1) Parsing:
Here the first L indicates that the input is scanned from left to right, the second L indicates that this parsing technique uses the leftmost derivation, and the 1 is the number of lookahead symbols, i.e. how many input symbols the parser examines when it makes a decision.
1: First(): If we derive all possible strings from a variable, the terminal symbols that can appear at the beginning of those strings form its First set.
2: Follow(): The terminal symbols that can appear immediately after a variable in the process of derivation form its Follow set.
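As an illustration of these two definitions, the following sketch computes First and Follow by repeating the standard rules until nothing changes. It assumes the usual expression grammar E → TE', E' → +TE' | ϵ, T → FT', T' → *FT' | ϵ, F → id | (E); the function names and the dictionary layout are hypothetical choices for this example.

```python
EPS = "ϵ"
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F": [["id"], ["(", "E", ")"]],
}


def first_of(symbols, first):
    """First set of a sequence of grammar symbols."""
    out = set()
    for sym in symbols:
        f = first[sym] if sym in GRAMMAR else {sym}
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)                     # every symbol could derive epsilon
    return out


def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:                   # iterate until a fixed point is reached
        changed = False
        for nt, prods in GRAMMAR.items():
            for prod in prods:
                new = first_of(prod, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(first, start="E"):
    follow = {nt: set() for nt in GRAMMAR}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for lhs, prods in GRAMMAR.items():
            for prod in prods:
                for i, sym in enumerate(prod):
                    if sym not in GRAMMAR:
                        continue
                    rest = first_of(prod[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:  # nothing (or only nullable symbols) follows
                        new |= follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


FIRST = compute_first()
FOLLOW = compute_follow(FIRST)
```

For this grammar the sketch gives First(E') = { +, ϵ } and Follow(F) = { *, +, $, ) }.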
Now, after computing the First and Follow set for each Non-Terminal symbol we have to construct
the Parsing table. In the table Rows will contain the Non-Terminals and the column will contain
the Terminal Symbols.
All the Null Productions of the Grammars will go under the Follow elements and the remaining
productions will lie under the elements of First set.
Now, let’s understand with an example.
Example-1:
Consider the Grammar:
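E → TE’
E’ → +TE’ | ϵ
T → FT’
T’ → *FT’ | ϵ
F → id | (E)
Productions          First         Follow
E → TE’              { id, ( }     { $, ) }
E’ → +TE’ | ϵ        { +, ϵ }      { $, ) }
T → FT’              { id, ( }     { +, $, ) }
T’ → *FT’ | ϵ        { *, ϵ }      { +, $, ) }
F → id | (E)         { id, ( }     { *, +, $, ) }
Parsing Table:
       id          +             *             (           )          $
E      E → TE’                                 E → TE’
E’                 E’ → +TE’                               E’ → ϵ     E’ → ϵ
T      T → FT’                                 T → FT’
T’                 T’ → ϵ        T’ → *FT’                 T’ → ϵ     T’ → ϵ
F      F → id                                  F → (E)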
As you can see, all the null productions are placed under the Follow set of their symbol, and all the remaining productions are placed under the First set of their symbol.
Note: Not every grammar is suitable for an LL(1) parsing table. It is possible for one cell to contain more than one production.
Let’s see with an example.
Example-2:
Consider the Grammar
S → A | a
A → a
Productions      First      Follow
S → A | a        { a }      { $ }
A → a            { a }      { $ }
Parsing Table:
       a                   $
S      S → A, S → a
A      A → a
Here, we can see that there are two productions into the same cell. Hence, this grammar is not
feasible for LL(1) Parser.
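The table-filling rule used in Example-1 and Example-2 can be sketched as below. This is a minimal illustration, assuming the First set of each right-hand side and the Follow sets have already been worked out by hand; build_ll1_table and conflicts are hypothetical helper names.

```python
EPS = "ϵ"


def build_ll1_table(productions, follow):
    """productions: list of (LHS, RHS text, First(RHS)); returns cell -> list of rules."""
    table = {}
    for lhs, rhs, first in productions:
        columns = first - {EPS}          # non-null part goes under First
        if EPS in first:
            columns |= follow[lhs]       # null production goes under Follow
        for terminal in columns:
            table.setdefault((lhs, terminal), []).append(f"{lhs} → {rhs}")
    return table


def conflicts(table):
    """Cells that hold more than one production (grammar is then not LL(1))."""
    return {cell: rules for cell, rules in table.items() if len(rules) > 1}


# Example-1: expression grammar (First/Follow values taken as given).
EX1 = [
    ("E", "TE'", {"id", "("}),
    ("E'", "+TE'", {"+"}),
    ("E'", EPS, {EPS}),
    ("T", "FT'", {"id", "("}),
    ("T'", "*FT'", {"*"}),
    ("T'", EPS, {EPS}),
    ("F", "id", {"id"}),
    ("F", "(E)", {"("}),
]
EX1_FOLLOW = {
    "E": {"$", ")"}, "E'": {"$", ")"},
    "T": {"+", "$", ")"}, "T'": {"+", "$", ")"},
    "F": {"*", "+", "$", ")"},
}

# Example-2: S -> A | a, A -> a.
EX2 = [("S", "A", {"a"}), ("S", "a", {"a"}), ("A", "a", {"a"})]
EX2_FOLLOW = {"S": {"$"}, "A": {"$"}}

table1 = build_ll1_table(EX1, EX1_FOLLOW)
table2 = build_ll1_table(EX2, EX2_FOLLOW)
```

Here conflicts(table1) is empty, while conflicts(table2) reports the cell (S, a) holding both S → A and S → a.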
S→ AB
A→ a
B→ b
States        Action               GoTo
          a      b      $        S     A     B
I0        S3                     1     2
I1                      Accept
I2               S5                          4
I3               r2
I4                      r1
I5                      r3
2 Shift, 3 Reduce, 1 Accept
= 6 Total Actions
Any parsing algorithm that is driven by a table is known as table-driven parsing.
→ Hence LL( 1 ) parsing and LR( 1 ) parsing are table-driven parsing.
→ A recursive descent parser is not a table-driven parser.
Question 2
E→ T + E
E→ T
T→ i
String i+i
States        Action               GoTo
          i      +      $        E     T
I0        S3                     1     2
I1                      Accept
I2               S4     r2
I3               r3     r3
I4        S3                     5     2
I5                      r1
3 Shift, 4 Reduce, 1 Accept
= 8 Total Actions
Note-
Size of LR Parser table = States * [(T+1) + NT]
T → # Terminals
NT → # Non - Terminals
Remove Left Recursion
S→ Sa | b
⇓
S→ bS’
S’ → aS’ | ϵ
Practice Question on removing Left Recursion
Question 1
E → E + T | T
T → T * F | F
F → id
Sol. E → TE’
E’ → +TE’ | ϵ
T → FT’
T’ → *FT’ | ϵ
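The rewriting pattern A → Aα | β ⇒ A → βA', A' → αA' | ϵ used in these examples can be sketched as a small function. It only handles immediate left recursion on a single non-terminal, and the function name and the list-of-symbols representation are assumptions made for this illustration.

```python
EPS = "ϵ"


def remove_immediate_left_recursion(nt, alternatives):
    """alternatives: list of RHS symbol lists for the non-terminal nt."""
    # A -> A alpha : keep alpha (the part after the recursive A)
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nt]
    # A -> beta : alternatives that do not start with A (drop explicit epsilon)
    others = [[s for s in alt if s != EPS]
              for alt in alternatives if not alt or alt[0] != nt]
    if not recursive:
        return {nt: alternatives}    # nothing to do
    new_nt = nt + "'"
    return {
        nt: [beta + [new_nt] for beta in others],
        new_nt: [alpha + [new_nt] for alpha in recursive] + [[EPS]],
    }


s_rules = remove_immediate_left_recursion("S", [["S", "a"], ["b"]])
e_rules = remove_immediate_left_recursion("E", [["E", "+", "T"], ["T"]])
```

Applied to S → Sa | b this yields S → bS' and S' → aS' | ϵ, and applied to E → E + T | T it yields E → TE' and E' → +TE' | ϵ.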
Note -
If multiple entries are there, then the grammar is not suitable for Top - Down Parsing.
Power of a parser: The number of grammars that a particular parser can handle is known as the power of that parser.
If a grammar contains left recursion, then its predictive parsing table contains multiple entries. Hence, left-recursive grammars are not LL(1).
We can often make such a grammar LL(1) by removing the left recursion from it.
Size of LL parsing table = NT * (T+1)
NT => # Non-Terminals
T => # Terminals
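The two table-size formulas in these notes, States * [(T+1) + NT] for an LR table and NT * (T+1) for an LL table, can be checked with a couple of lines of arithmetic. The function names below are hypothetical, and the sample counts are assumptions taken from small grammars such as E → T + E | T, T → i (6 states, 2 terminals, 2 non-terminals) and S → AB, A → a, B → b (3 non-terminals, 2 terminals).

```python
def lr_table_size(states, terminals, nonterminals):
    """Action part has T+1 columns (terminals plus $); GoTo part has NT columns."""
    return states * ((terminals + 1) + nonterminals)


def ll_table_size(nonterminals, terminals):
    """One row per non-terminal, one column per terminal plus $."""
    return nonterminals * (terminals + 1)


lr_cells = lr_table_size(states=6, terminals=2, nonterminals=2)   # 6 * (3 + 2)
ll_cells = ll_table_size(nonterminals=3, terminals=2)             # 3 * 3
```

So the LR table in the first case has 30 cells and the LL table in the second case has 9 cells.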
1. S → Sa / Sb / c / d
Check whether the given grammar contains left recursion; if it does, separate the left-recursive productions and work on them.
Introduce a new non-terminal S’ and write it at the end of every alternative that is not left recursive. The new productions are:
S → cS’ / dS’
S’ → aS’ / bS’ / ϵ
A parser that reads and understands an operator precedence grammar is called an Operator Precedence Parser. It is applicable to both ambiguous and unambiguous grammars and is also simple to design.
Operator Precedence Grammar
An operator grammar is a context-free grammar in which no ϵ rule is defined and no two variables are adjacent on the RHS part of any production.
Example
Grammar 1: S→ AB, A→ a, B→ b
Grammar 2: S→ A, A→ ϵ
Both are not operator grammars
S→AB ⇒ Adjacent variables
A→ ϵ ⇒ ϵ rule
P→ QaR
P→ ϵ
P→ bQaR
P→ QR
Answer
Note: Every operator precedence grammar is operator grammar but operator grammars need not
be operator precedence grammars.
Question: How many shift and reduce actions in total are needed to parse the input string
id + id * id
E→ E + E
E→ E * E
E→ id
Answer
X↓ \ Y→    id     +      *      $
id                ⋗      ⋗      ⋗
+          ⋖      ⋗      ⋖      ⋗
*          ⋖      ⋗      ⋗      ⋗
$          ⋖      ⋖      ⋖      Accept
String Parsing :
Question - E→ E+E / E*E / id
String - id + id * id
Stack        I/P                Relation     Action
$            id + id * id $     $ ⋖ id       Shift
$ id         + id * id $        id ⋗ +       Reduce 1
$            + id * id $        $ ⋖ +        Shift
$ +          id * id $          + ⋖ id       Shift
$ + id       * id $             id ⋗ *       Reduce 2
$ +          * id $             + ⋖ *        Shift
$ + *        id $               * ⋖ id       Shift
$ + * id     $                  id ⋗ $       Reduce 3
$ + *        $                  * ⋗ $        Reduce 4
$ +          $                  + ⋗ $        Reduce 5
$            $                               Accept
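The trace above can be reproduced with a short loop over the precedence relations. The sketch below is a simplification made for illustration: like the trace, it keeps only terminals on the stack and treats each reduce as popping the single top terminal, which is enough for this table because it has no equal-precedence entries. The names PREC and op_precedence_parse are hypothetical.

```python
# Precedence relations between the terminal on top of the stack (row)
# and the lookahead (column); "<" stands for ⋖ and ">" stands for ⋗.
PREC = {
    "id": {"+": ">", "*": ">", "$": ">"},
    "+": {"id": "<", "+": ">", "*": "<", "$": ">"},
    "*": {"id": "<", "+": ">", "*": ">", "$": ">"},
    "$": {"id": "<", "+": "<", "*": "<"},
}


def op_precedence_parse(tokens):
    """Return the list of actions taken while parsing a list of terminals."""
    stack = ["$"]
    tokens = list(tokens) + ["$"]
    pos = 0
    actions = []
    while True:
        top, lookahead = stack[-1], tokens[pos]
        if top == "$" and lookahead == "$":
            actions.append("accept")
            return actions
        relation = PREC[top].get(lookahead)
        if relation == "<":              # top ⋖ lookahead: shift
            stack.append(lookahead)
            pos += 1
            actions.append("shift")
        elif relation == ">":            # top ⋗ lookahead: reduce
            stack.pop()
            actions.append("reduce")
        else:                            # blank entry: error
            actions.append("error")
            return actions


actions = op_precedence_parse(["id", "+", "id", "*", "id"])
shift_count = actions.count("shift")
reduce_count = actions.count("reduce")
```

Counting the entries of actions gives the number of shift and reduce steps for a given input under this simplified model.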
LALR ( 1 )
LALR refers to lookahead LR. To construct the LALR (1) parsing table, we use the canonical collection of LR (1) items. In LALR (1) parsing, the LR (1) item sets which have the same productions (the same core) but different lookaheads are combined to form a single set of items.
LALR (1) parsing is the same as CLR (1) parsing; the only difference is in the parsing table.
If the core part is the same and the lookahead part is different in 2 or more states, then merge these states into one state with the common core part and take all the lookaheads in that state.
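The merging step can be sketched as grouping LR(1) item sets by their core. In this illustration an item is a tuple (LHS, RHS, dot position, lookahead), states are named I4, I7 and so on, and the merged state name is formed by joining the numbers; merge_by_core and the sample states are assumptions made for the example, based on the grammar S → AA, A → aA | b.

```python
from collections import defaultdict


def merge_by_core(states):
    """states: dict name -> set of (lhs, rhs, dot, lookahead) LR(1) items."""
    groups = defaultdict(list)
    for name, items in states.items():
        # The core ignores the lookahead component of every item.
        core = frozenset((lhs, rhs, dot) for lhs, rhs, dot, _ in items)
        groups[core].append(name)
    merged = {}
    for names in groups.values():
        items = set()
        for name in names:
            items |= states[name]        # union of all lookaheads
        merged["I" + "".join(n[1:] for n in names)] = items
    return merged


# Assumed sample: I4 and I7 share the core A -> b. with different lookaheads.
CLR_STATES = {
    "I4": {("A", ("b",), 1, "a"), ("A", ("b",), 1, "b")},
    "I5": {("S", ("A", "A"), 2, "$")},
    "I7": {("A", ("b",), 1, "$")},
}
LALR_STATES = merge_by_core(CLR_STATES)
```

In this sample I4 and I7 collapse into a single state I47 whose item A → b. carries the lookaheads a, b and $, while I5 is left unchanged.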
For example
Question
S→ AA
A→ aA | b
Answer
States        Action                 GoTo
          a       b       $        S     A
I0        S36     S47              1     2
I1                        Accept
I2        S36     S47                    5
I36       S36     S47                    89
I47       r3      r3      r3
I5                        r1
I89       r2      r2      r2
A→(A)
A→a
Answer
A’ → .A
A → .( A )
A → .a
Calculating lookaheads
State I0
First production A’ → .A , $
A → .( A ) , ? (lookahead to be found)
Check the previous production A’ → .A , $: nothing follows A, so the lookahead is First ( $ ) = $, therefore A → .( A ) , $
A → .a , ?
Again check the previous production: First ( $ ) = $, therefore A → .a , $
State I2
A → ( .A ) , $ (lookahead carried forward)
A → .( A ) , ?
Check the previous production A → ( .A ) , $: A is followed by ), so First ( ) $ ) = ), therefore A → .( A ) , )
The same procedure is applied to all the remaining states.
States          Action                   GoTo
          a      (      )      $          A
I0        S3     S2                       1
I1                             Accept
I2        S6     S5                       4
I3                             r2
I4                      S7
I5        S6     S5                       8
I6                      r2
I7                             r1
I8                      S9
I9                      r1
I2 , I5 → I25 LA → { $ , ) }
I3 , I6 → I36 LA → { $ , ) }
I4 , I8 → I48 LA → { $ , ) }
I7 , I9 → I79 LA → { $ , ) }
LALR ( 1 ) table
States          Action                     GoTo
          a       (       )       $         A
I0        S36     S25                       1
I1                                Accept
I25       S36     S25                       48
I36                       r2      r2
I48                       S79
I79                       r1      r1
Table Size
LALR(1) ≤ CLR(1)
SLR(1) ≤ CLR(1)
SLR(1) = LALR(1)
Note : If the table size of SLR ( 1 ) = CLR ( 1 ), then there is no need to construct LALR ( 1 )
Conflicts
Shift | Reduce
If there is no S | R conflict in CLR ( 1 ), then there is no S | R conflict in LALR ( 1 )
Reduce | Reduce
Even if there is no R | R conflict in CLR ( 1 ), there may still be an R | R conflict in LALR ( 1 )
LR(0) ⊂ SLR(1) ⊂ LALR(1) ⊂ CLR(1)
LL(1) ⊂ CLR(1)
CLR ( 1 ) | Canonical LR( 1 )
CLR refers to canonical LR with lookahead. CLR parsing uses the canonical collection of LR (1) items to build the CLR (1) parsing table. The CLR (1) parsing table generally has more states than the SLR (1) parsing table.
In CLR (1), we place the reduce entries only under the lookahead symbols.
For all the above parsers the parsing procedure is the same; only the construction of the parsing table differs.
For the construction of the CLR and LALR parsing tables we use LR( 1 ) items
LR( 0 ) item → Any production with a dot on the right-hand side
LR( 1 ) item → LR( 0 ) item + lookahead
For example
Question
S→ AA
A→ aA | b
Answer
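Therefore S→ .AA , $
So the rule is:
To calculate the lookahead of a production of a non-terminal ( NT ), find the item in which the dot is immediately before that NT ( .NT ) and compute First of everything that follows the NT in that item, followed by that item's own lookahead. That set is the lookahead.
Now, for the lookahead of the next production A → .aA, check the previous item S→ .AA , $
The dot is followed by A ( .A ) in the previous item, and that A is followed by A$, so the lookahead is First ( A$ ) = { a , b }, giving A → .aA , a/b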
States        Action               GoTo
          a      b      $        S     A
I0        S3     S4              1     2
I1                      Accept
I2        S6     S7                    5
I3        S3     S4                    8
I4        r3     r3
I5                      r1
I6        S6     S7                    9
I7                      r3
I8        r2     r2
I9                      r2
Practice Question
Conflicts in SLR
Shift / Reduce Conflict -
A → α.aβ ( a is a terminal )
B → r. and a ∈ Follow (B)
=> then it is an S/R conflict.
Reduce / Reduce Conflict -
A → r.
B → r. and Follow (A) ∩ Follow (B) ≠ ϕ
=> then it is an R/R conflict.
It is unambiguous.
Question 1:
E→ T + E
E→ T
T→ i
Answer
Augmented Grammar : E’→E
E→ T + E
E→ T
T→ i
States        Action               GoTo
          i      +      $        E     T
I0        S3                     1     2
I1                      Accept
I2               S4     r2
I3               r3     r3
I4        S3                     5     2
I5                      r1
This Grammar is SLR (1)
Question 2:
S→ AB
A→ a
B→ b
Answer
Augmented Grammar : S’→S
S→AB
A→a
B→b
States        Action               GoTo
          a      b      $        S     A     B
I0        S3                     1     2
I1                      Accept
I2               S5                          4
I3               r2
I4                      r1
I5                      r3
This Grammar is SLR( 1 )
Question 3
Check whether the following grammar is SLR(1 ) or not
S→ A
A→ AB | ϵ
B→ aB | b
Answer
Augmented Grammar : S’→S
S→A
A→AB
A→ϵ
B→aB
B→b
Follow ( s ) ={ $ }
Question 4
E→ E + T
E→ T
T→ T * F
T→ F
F→ ( E )
F→ id
Answer
Augmented Grammar : E’→ E#
E→ E + T
E→ T
T→ T * F
T→ F
F→ ( E )
F→ id
E’→ .E#
E→ .E + T (r1)
E→ .T (r2)
T→ .T * F (r3)
T→ .F (r4)
F→ .( E ) (r5)
F→ .id (r6)
Follow (E) = { # , + , ) }
Follow (T) = { * , # , + , ) }
Follow (F) = { * , # , + , ) }
No conflicts so this grammar is SLR( 1 )
In SLR( 1 ), reduce entries are placed in the parsing table based on the Follow set. However, the Follow set may contain more symbols than the actual lookaheads of a state, so there may be unnecessary reduce actions in the parsing table.
To avoid this drawback, we can construct the CLR and LALR parsing tables.
Question 7
S → iEtSS’ | a
S’ → eS | ϵ
E → b
Rules               First        Follow
S → iEtSS’ | a      { i, a }     { $, e }
S’ → eS | ϵ         { e, ϵ }     { $, e }
E → b               { b }        { t }
Question 8
S → aAbB | bBaA | ϵ
A→S
B→S
Rules                    First           Follow
S → aAbB | bBaA | ϵ      { a, b, ϵ }     { $, a, b }
A → S                    { a, b, ϵ }     { $, a, b }
B → S                    { a, b, ϵ }     { $, a, b }
Question 9
S → ABCDE
A→a|ϵ
B→b|ϵ
C→c
D→d|ϵ
E→e|ϵ
Rules          First           Follow
S → ABCDE      { a, b, c }     { $ }
A → a | ϵ      { a, ϵ }        { b, c }
B → b | ϵ      { b, ϵ }        { c }
C → c          { c }           { d, e, $ }
D → d | ϵ      { d, ϵ }        { e, $ }
E → e | ϵ      { e, ϵ }        { $ }
Question 10
S → Bb | Cd
B → aB | ϵ
C → cC | ϵ
Rules            First              Follow
S → Bb | Cd      { a, b, c, d }     { $ }
B → aB | ϵ       { a, ϵ }           { b }
C → cC | ϵ       { c, ϵ }           { d }
Question 11
S → ACB | CbB | Ba
A → da | BC
B→g|ϵ
C→h|ϵ
Rules            First              Follow
A → da | BC      { d, g, h, ϵ }     { h, g, $ }
B → g | ϵ        { g, ϵ }           { $, a, h, g }
C → h | ϵ        { h, ϵ }           { g, $, b, h }
Question 12
S → aABb
A→c|ϵ
B→d|ϵ
A→c|ϵ { c, ϵ } { d, b}
B→d|ϵ { d, ϵ } {b}
Question 13
S → aBDh
B → cC
C → bC | ϵ
D → EF
E→g|ϵ
F→f|ϵ
B → cC {c} { g,f,h }
C → bC | ϵ { b, ϵ } { g, f,h }
D → EF { g, f, ϵ } {h}
E→g|ϵ {g,ϵ} { f, h }
F→f|ϵ { f, ϵ } {h}
Practice Questions
Question 1
S → aAB
A→b
B→c
Question 2
E → TE’
E’ → +TE’ / ϵ
T → FT’
T’ → *FT’ / ϵ
F → (E) / id
Rules First Follow
E → TE’ { ( , id } { $, ) }
E’ → +TE’ / ϵ { +, ϵ } { $, ) }
T → FT’ { ( , id } { +, $, ) }
T’ → *FT’ / ϵ {*, ϵ} { +, $, ) }
F → (E) / id { ( , id } { *, +, $, ) }
Question 3
S → AaAb / BbBa
A→ϵ
B→ϵ
A→ϵ {ϵ} { a, b }
B→ϵ {ϵ} { a, b }
Question 4
S → ACB | CbB | Ba
A → da | BC
B→g|ϵ
C→h|ϵ
A → da | BC { d, g, h, ϵ } { h, g, $ }
B→g|ϵ { g, ϵ } { $, a, h, g }
C→h|ϵ { h, ϵ } { g, $, b, h }
Question 5
A → A1 A2 A3
F( A1 ) = 5 elements
F( A2 ) = 4 elements
F( A3 ) = 3 elements
All first sets Contain ϵ and all elements are different then first ( A ) =?
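Sol. Since every First set contains ϵ, each of A1, A2 and A3 contributes its non-ϵ elements, and ϵ itself is also in First ( A ):
First ( A ) = ( F( A1 ) − ϵ ) ∪ ( F( A2 ) − ϵ ) ∪ ( F( A3 ) − ϵ ) ∪ { ϵ }
= 4 + 3 + 2
= 9 + ϵ = 10
F(A) = 10 elements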
Question 6
S → AB
A→a
B→b
S → AB { a, b } {$}
Question 1
S → AB
A→a
B→b
Sol.
Step 1:
Rules        First
S → AB       { a }
A → a        { a }
B → b        { b }
Step 2: M-Table
       a           b          $
S      S → AB
A      A → a
B                  B → b
Step 3:
for string: ab
Step 4: Parse Tree
Now, String: aa
Question
S→ A
A→ AB | ϵ
B→ aB | b
Answer
→ Left Recursion
∴ Not LL(1)
LALR(1) ✓
CLR(1) ✓ LL(1) ⨉
Question
S→ aAb
A→ Aa | ϵ
Answer
A→ Aa | ϵ → Left Recursion
∴ Not LL(1)
∴ CLR(1) ✓ LALR(1) ✓
∴ It is unambiguous.
Question
S → Aa | bAc | Bc | bBa
A→d
B→d
Answer
Not LL ( 1 )
Not LR ( 0 )
Not SLR ( 1 )
CLR ( 1 )
Not LALR ( 1 )
Simple LR ( 1 ) | SLR ( 1 )
SLR Parser
The SLR(1) parser is used for accepting certain grammars which are not accepted by the LR(0) parser.
The letters “SLR” stand for “Simple”, “Left” and “Right”.
“Left” indicates that the input is read from left to right, and “Right” indicates that a rightmost derivation is built (in reverse).
SLR(1) parsers use the same LR(0) configuring sets and have the same table structure and parser operation, so everything you've already learned about LR(0) applies here.
The difference between the SLR(1) parser and the LR(0) parser comes in assigning table actions.
The SLR parser is similar to the LR(0) parser except for the reduce entries: the reduce entries are written only under the terminals in the FOLLOW of the variable whose production is being reduced.
Or
The simple improvement that SLR(1) makes on the basic LR(0) parser is to reduce only if the next input token is a member of the Follow set of the non-terminal being reduced.
An SLR(1) parser will perform a reduce action for configuration B→a• if the lookahead symbol is in the set Follow(B).
A grammar is an SLR(1) grammar if there is no conflict in its SLR(1) parsing table.
Clearly SLR(1) is a proper superset of LR(0).
Steps to construct the parsing table:
1. Write the augmented grammar.
2. Compute the LR( 0 ) items using Closure() and GoTo(): I = { I0, I1, I2, ..., In }
3. Construct the LR( 0 ) parsing table.
Augmented grammar means adding a new production S’ → S to the original grammar. This augmented production helps the parser to show the accept action, i.e. whenever the parser tries to reduce S to S’, it leads to the accept action.
An LR(0) item is a context-free grammar production having a dot on the RHS part.
LR(0) items
A → .XYZ
A → X.YZ
A → XY.Z
A → XYZ.
Closure() adds the productions of the non-terminal that follows the dot, and puts a ‘ . ’ at the start of the RHS part of each added production.
Goto() moves the ‘ . ’ one position ahead.
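The two operations can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the grammar is the augmented version of S → AB, A → a, B → b, an item is a tuple (LHS, RHS, dot position), and the names closure, goto and GRAMMAR are hypothetical.

```python
# Assumed augmented grammar: S' -> S, S -> AB, A -> a, B -> b
GRAMMAR = {
    "S'": [("S",)],
    "S": [("A", "B")],
    "A": [("a",)],
    "B": [("b",)],
}


def closure(items):
    """Add X -> .gamma for every non-terminal X that appears right after a dot."""
    items = set(items)
    changed = True
    while changed:
        changed = False
        for lhs, rhs, dot in list(items):
            if dot < len(rhs) and rhs[dot] in GRAMMAR:
                for production in GRAMMAR[rhs[dot]]:
                    item = (rhs[dot], production, 0)
                    if item not in items:
                        items.add(item)
                        changed = True
    return frozenset(items)


def goto(items, symbol):
    """Move the dot over `symbol` in every item where that is possible."""
    moved = {(lhs, rhs, dot + 1)
             for lhs, rhs, dot in items
             if dot < len(rhs) and rhs[dot] == symbol}
    return closure(moved)


I0 = closure({("S'", ("S",), 0)})
I_after_A = goto(I0, "A")    # contains S -> A.B and B -> .b
I_after_a = goto(I0, "a")    # contains only A -> a.
```

Starting from I0 and applying goto for every grammar symbol until no new sets appear gives the collection I = { I0, I1, ..., In } from which the parsing table is built.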
Construct the LR(0) parsing table for the following grammar, verify whether it is LR(0) or not, and check whether it is ambiguous or not.
S → AB
A→a
B→b
Sol. Augmented Grammar- S’ → S
S → AB
A→a
B→b
Note- All LR(0) grammars are unambiguous grammars (U.G.), but all unambiguous grammars need not be LR(0).
Inadequate states
Shift / Reduce Conflict :-
A → α.aβ ( a is a terminal )
B → r.
Reduce / Reduce Conflict :-
A → α.
B → r.
Question 2.
E→ T+E
E→ T
T→ i
Sol. Augmented Grammar- E’ → E
E → T+E
E→T
T→i
Shift | reduce conflict in State I2 so not LR( 0 ) , may or may not be ambiguous
Imp :
Note- If the LR(0) items automata contains at least 1 inadequate state, then the Grammar is not
LR(0).
First and Follow sets are needed so that the parser can properly apply the needed production rule
at the correct position.
Important Notes-
Rule-01:
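For a production rule X → ϵ,
First(X) = { ϵ }
Rule-02:
For any terminal symbol ‘a’,
First(a) = { a }
Rule-03:
For a production rule X → Y1Y2Y3,
Calculating First(X)
If ϵ ∉ First(Y1), then First(X) = First(Y1)
If ϵ ∈ First(Y1), then First(X) = { First(Y1) – ϵ } ∪ First(Y2Y3)
Calculating First(Y2Y3)
If ϵ ∉ First(Y2), then First(Y2Y3) = First(Y2)
If ϵ ∈ First(Y2), then First(Y2Y3) = { First(Y2) – ϵ } ∪ First(Y3)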
Rule-01:
For the start symbol S, place $ in Follow(S).
Rule-02:
For any production rule A → αB,
Follow(B) = Follow(A)
Rule-03:
For any production rule A → αBβ,