Bottom Up Parser
• A bottom-up parser creates the parse tree of the given input starting
from leaves towards the root.
• A bottom-up parser tries to find the right-most derivation of the given
input in the reverse order.
S ⇒ ... ⇒ ω (the right-most derivation of ω)
← (the bottom-up parser finds the right-most derivation in the reverse order)
• Bottom-up parsing is also known as shift-reduce parsing because its
two main actions are shift and reduce.
– At each shift action, the current symbol in the input string is pushed onto a stack.
– At each reduction step, the symbols at the top of the stack (this symbol sequence is the right
side of a production) are replaced by the non-terminal at the left side of that production.
– There are also two more actions: accept and error.
• At each reduction step, a substring of the input matching the right side of a
production rule is replaced by the non-terminal at the left side of that production rule.
• If the substring is chosen correctly, the right-most derivation of that string is created in
the reverse order.
Rightmost Derivation: S ⇒*rm ω
S ⇒rm ... ⇒rm aAbb ⇒rm aaAbb ⇒rm aaabb
S = γ0 ⇒rm γ1 ⇒rm γ2 ⇒rm ... ⇒rm γn-1 ⇒rm γn = ω (the input string)
1. Shift: The next input symbol is shifted onto the top of the stack.
2. Reduce: Replace the handle on the top of the stack by the non-terminal.
3. Accept: Successful completion of parsing.
4. Error: Parser discovers a syntax error, and calls an error recovery routine.
1. Operator-Precedence Parser
– simple, but handles only a small class of grammars.
2. LR-Parsers
– cover a wide range of grammars.
• SLR – simple LR parser
• LR – most general LR parser
• LALR – intermediate LR parser (lookahead LR parser)
– SLR, LR and LALR work the same way; only their parsing tables are different.
LR(k) parsing.
4. Error -- Parser detected an error (an empty entry in the action table)
CS416 Compiler Design 15
Reduce Action
• to reduce by A → β, pop 2|β| items from the stack (r = |β| symbols and r states); let us assume that β = Y1Y2...Yr
• then push A and s, where s = goto[s_{m-r}, A]
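The reduce action can be sketched in a few lines of Python. This is only an illustration: the list-based stack layout [s0, X1, s1, ..., Xm, sm], the function name reduce_action, and the single goto entry used in the example are assumptions, not part of the slides.

```python
def reduce_action(stack, lhs, rhs, goto_table):
    """Apply 'reduce lhs -> rhs' to an LR stack [s0, X1, s1, ..., Xm, sm]."""
    r = len(rhs)
    if r:                                # guard: del stack[-0:] would empty the stack
        del stack[-2 * r:]               # pop 2*|rhs| items: r symbols and r states
    exposed = stack[-1]                  # state s_{m-r} is now on top
    stack.append(lhs)                    # push A
    stack.append(goto_table[(exposed, lhs)])   # push s = goto[s_{m-r}, A]
    return stack

# Hypothetical goto entry: from state 0, non-terminal F leads to state 3.
example_goto = {(0, "F"): 3}
print(reduce_action([0, "id", 5], "F", ("id",), example_goto))
```

The guard on r matters for productions with an empty right side, where nothing is popped and only A and the goto state are pushed.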
(four different possibilities for the production A → aBb)
A → .aBb
A → a.Bb
A → aB.b
A → aBb.
• Sets of LR(0) items will be the states of the action and goto tables of the SLR parser.
• A collection of sets of LR(0) items (the canonical LR(0) collection) is
the basis for constructing SLR parsers.
• Augmented Grammar:
G’ is G with a new production rule S’ → S, where S’ is the new starting symbol.
The Closure Operation
• If I is a set of LR(0) items for a grammar G, then closure(I) is the
set of LR(0) items constructed from I by the two rules:
1. Initially, every LR(0) item in I is added to closure(I).
2. If A → α.Bβ is in closure(I) and B → γ is a production rule of G,
then B → .γ will be in closure(I).
We will apply this rule until no more new LR(0) items can be
added to closure(I).
I = { E’ → .E, E → .E+T, E → .T,
T → .T*F, T → .F,
F → .(E), F → .id }
goto(I,E) = { E’ → E., E → E.+T }
goto(I,T) = { E → T., T → T.*F }
goto(I,F) = { T → F. }
goto(I,() = { F → (.E), E → .E+T, E → .T, T → .T*F, T → .F,
F → .(E), F → .id }
goto(I,id) = { F → id. }
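The goto sets above can be reproduced with a short sketch. It assumes the usual definition (move the dot over X in every item that has X right after the dot, then take the closure of the result) and the same illustrative encoding of grammars and items as dictionaries and triples; the function names are not from the slides.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}

def closure(items, grammar):
    """closure(I) for LR(0) items encoded as (lhs, rhs, dot)."""
    result, worklist = set(items), list(items)
    while worklist:
        lhs, rhs, dot = worklist.pop()
        if dot < len(rhs) and rhs[dot] in grammar:
            for production in grammar[rhs[dot]]:
                item = (rhs[dot], production, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)

def goto(items, symbol, grammar):
    """Move the dot over `symbol` wherever possible, then close the result."""
    moved = {(lhs, rhs, dot + 1)
             for lhs, rhs, dot in items
             if dot < len(rhs) and rhs[dot] == symbol}
    return closure(moved, grammar)

I = closure({("E'", ("E",), 0)}, GRAMMAR)
for symbol in ["E", "T", "F", "(", "id"]:
    print(symbol, sorted(goto(I, symbol, GRAMMAR)))
```

For a symbol that never follows a dot in I, the result is the empty set, which corresponds to an empty entry in the table.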
• Algorithm:
C is { closure({S’ → .S}) }
repeat the following until no more sets of LR(0) items can be added to C:
  for each I in C and each grammar symbol X
    if goto(I,X) is not empty and not in C
      add goto(I,X) to C
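The algorithm can be sketched as a loop over a growing list of item sets. As before, the encoding (grammar as a dictionary, items as triples) and all function names are assumptions made for this illustration, and the grammar is the augmented expression grammar from the earlier example.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}

def closure(items, grammar):
    result, worklist = set(items), list(items)
    while worklist:
        lhs, rhs, dot = worklist.pop()
        if dot < len(rhs) and rhs[dot] in grammar:
            for production in grammar[rhs[dot]]:
                item = (rhs[dot], production, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)

def goto(items, symbol, grammar):
    moved = {(lhs, rhs, dot + 1) for lhs, rhs, dot in items
             if dot < len(rhs) and rhs[dot] == symbol}
    return closure(moved, grammar)

def canonical_collection(grammar, start):
    """Build the canonical LR(0) collection C, starting from closure({S' -> .S})."""
    symbols = sorted(set(grammar) | {s for rhss in grammar.values()
                                     for rhs in rhss for s in rhs})
    collection = [closure({(start, grammar[start][0], 0)}, grammar)]
    index = 0
    while index < len(collection):       # stops when no new set was added
        for symbol in symbols:
            target = goto(collection[index], symbol, grammar)
            if target and target not in collection:
                collection.append(target)
        index += 1
    return collection

C = canonical_collection(GRAMMAR, "E'")
print(len(C), "sets of LR(0) items")
```

The order in which sets are discovered here depends on the sorted symbol order, so the indices of the sets need not match the numbering I0, I1, ... used on the slides.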
I5: F → id.
[Figure: transition diagram (DFA) of the goto function over the sets of LR(0) items, with edges labeled by the grammar symbols E, T, F, +, *, (, ) and id]
Constructing SLR Parsing Table
(of an augmented grammar G’)
• If the SLR parsing table of a grammar G has a conflict, we say that the
grammar is not an SLR grammar.
a: reduce by A → ε or reduce by B → ε (reduce/reduce conflict)
b: reduce by A → ε or reduce by B → ε (reduce/reduce conflict)
A → α.β, a1
...
A → α.β, an
can be written as
A → α.β, a1/a2/.../an
I4: S → Aa.Ab, $    I6: S → AaA.b, $    I8: S → AaAb., $
    A → ., b
(I4 goes to I6 on A; I6 goes to I8 on b)
I5: S → Bb.Ba, $    I7: S → BbB.a, $    I9: S → BbBa., $
    B → ., a
(I5 goes to I7 on B; I7 goes to I9 on a)
I6: S → L=.R, $    (R to I9, L to I10, * to I11, id to I12)
    R → .L, $
    L → .*R, $
    L → .id, $
I7: L → *R., $/=
I8: R → L., $/=
I9: S → L=R., $
I10: R → L., $
I11: L → *.R, $    (R to I13, L to I10, * to I11, id to I12)
    R → .L, $
    L → .*R, $
    L → .id, $
I12: L → id., $
I13: L → *R., $
Sets with the same cores: I4 and I11, I5 and I12, I7 and I13, I8 and I10
Construction of LR(1) Parsing Tables
1. Construct the canonical collection of sets of LR(1) items for G’.
2. Create the parsing action table as follows:
If a is a terminal, A → α.aβ, b is in Ii and goto(Ii,a) = Ij, then action[i,a] is shift j.
If A → α., a is in Ii, then action[i,a] is reduce A → α, where A ≠ S’.
If S’ → S., $ is in Ii, then action[i,$] is accept.
• If any conflicting actions are generated by these rules, the grammar is not LR(1).
• LALR parsers are often used in practice because LALR parsing tables
are smaller than LR(1) parsing tables.
• The number of states in the SLR and LALR parsing tables for a grammar G
is equal.
• But LALR parsers recognize more grammars than SLR parsers.
• yacc creates an LALR parser for the given grammar.
• A state of an LALR parser is again a set of LR(1) items.
• The core of a set of LR(1) items is the set of its first components (the items without their lookaheads).
Ex: S → L.=R, $ has the core S → L.=R
R → L., $ has the core R → L.
• We will find the states (sets of LR(1) items) in a canonical LR(1) parser with the same
cores. Then we will merge them into a single state.
I1: L → id., =
I2: L → id., $
These have the same core, so we merge them into a new state:
I12: L → id., =
L → id., $
• We will do this for all states of a canonical LR(1) parser to get the states of the LALR parser.
• In fact, the number of the states of the LALR parser for a grammar will be equal to the
number of states of the SLR parser for that grammar.
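Merging by core can be sketched as grouping states under their cores and taking the union of the items in each group. This is only an illustration: LR(1) items are assumed to be encoded as tuples (left side, right side, dot position, lookahead), and the function names are hypothetical. A full LALR construction would also redirect the goto transitions to the merged states, which is left out here.

```python
from collections import defaultdict

def core(state):
    """Drop the lookaheads: the core is the set of first components."""
    return frozenset((lhs, rhs, dot) for lhs, rhs, dot, _ in state)

def merge_by_core(states):
    """Union all LR(1) states that share the same core."""
    groups = defaultdict(set)
    for state in states:
        groups[core(state)] |= set(state)
    return [frozenset(items) for items in groups.values()]

# The two states from the example: L -> id., =  and  L -> id., $
I1 = frozenset({("L", ("id",), 1, "=")})
I2 = frozenset({("L", ("id",), 1, "$")})
# A state with a different core stays separate.
I3 = frozenset({("S", ("R",), 1, "$")})

merged = merge_by_core([I1, I2, I3])
for state in merged:
    print(sorted(state))
```

Because the grouping key ignores lookaheads, the number of groups equals the number of distinct cores.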
• For merging to introduce a shift/reduce conflict, a state of the LALR parser must have:
A → α., a and B → β.aγ, b
• This means that a state of the canonical LR(1) parser must have:
A → α., a and B → β.aγ, c
But this state also has a shift/reduce conflict, i.e. the original canonical
LR(1) parser has a conflict.
(The reason is that the shift action does not depend on lookaheads.)
I1: A → α., a      I2: A → α., b
    B → β., b          B → β., c
I12: A → α., a/b    reduce/reduce conflict
     B → β., b/c
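A reduce/reduce conflict of this kind can be detected mechanically: two different completed items in one state share a lookahead. The sketch below assumes LR(1) items encoded as (left side, right side, dot position, lookahead), with the placeholder strings "alpha" and "beta" standing in for α and β; the function name is hypothetical.

```python
def reduce_reduce_conflicts(state):
    """Return the lookaheads on which two different productions would reduce."""
    chosen = {}                          # lookahead -> first completed production seen
    conflicts = set()
    for lhs, rhs, dot, lookahead in state:
        if dot == len(rhs):              # completed item: the dot is at the end
            production = (lhs, rhs)
            if chosen.setdefault(lookahead, production) != production:
                conflicts.add(lookahead)
    return conflicts

I1 = {("A", ("alpha",), 1, "a"), ("B", ("beta",), 1, "b")}
I2 = {("A", ("alpha",), 1, "b"), ("B", ("beta",), 1, "c")}

print(reduce_reduce_conflicts(I1))        # no conflict before merging
print(reduce_reduce_conflicts(I2))        # no conflict before merging
print(reduce_reduce_conflicts(I1 | I2))   # conflict on lookahead b after merging
```

Neither state has a conflict on its own; only the merged state does, because A → α. and B → β. both end up with the lookahead b.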
4) L → id
5) R → L
I3: S → R., $
I512: L → id., $/=
I6: S → L=.R, $    (R to I9, L to I810, * to I411, id to I512)
    R → .L, $
    L → .*R, $
    L → .id, $
I713: L → *R., $/=
I810: R → L., $/=
I9: S → L=R., $
Same cores: I4 and I11, I5 and I12, I7 and I13, I8 and I10
LALR(1) Parsing Tables – (for Example2)
state   id    *     =     $      S   L   R
0       s5    s4                 1   2   3
1                         acc
2                   s6    r5
3                         r2
4       s5    s4                     8   7
5                   r4    r4
6       s5    s4                     8   9
7                   r3    r3
8                   r5    r5
9                         r1
There is no shift/reduce or reduce/reduce conflict, so it is an LALR(1) grammar.
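A table of this shape can be run by a small table-driven loop. The sketch below is an illustration under several assumptions: the productions are numbered 1) S → L=R, 2) S → R, 3) L → *R, 4) L → id, 5) R → L (only 4 and 5 are stated explicitly); the shift and goto targets in state 6 use the merged state numbers 5, 4 and 8; and rows 7 and 8 are filled in from the merged item sets L → *R., $/= and R → L., $/=. The stack holds states only, a common simplification of the alternating symbol/state stack.

```python
# production number -> (left side, length of right side); numbers 1-3 are assumed
PRODUCTIONS = {1: ("S", 3), 2: ("S", 1), 3: ("L", 2), 4: ("L", 1), 5: ("R", 1)}

ACTION = {
    0: {"id": "s5", "*": "s4"},
    1: {"$": "acc"},
    2: {"=": "s6", "$": "r5"},
    3: {"$": "r2"},
    4: {"id": "s5", "*": "s4"},
    5: {"=": "r4", "$": "r4"},
    6: {"id": "s5", "*": "s4"},
    7: {"=": "r3", "$": "r3"},
    8: {"=": "r5", "$": "r5"},
    9: {"$": "r1"},
}
GOTO = {(0, "S"): 1, (0, "L"): 2, (0, "R"): 3,
        (4, "L"): 8, (4, "R"): 7,
        (6, "L"): 8, (6, "R"): 9}

def parse(tokens):
    """Return the production numbers used, in order, or None on a syntax error."""
    stack = [0]
    tokens = list(tokens) + ["$"]
    position = 0
    reductions = []
    while True:
        entry = ACTION[stack[-1]].get(tokens[position])
        if entry is None:                # empty table entry: error
            return None
        if entry == "acc":               # accept
            return reductions
        number = int(entry[1:])
        if entry[0] == "s":              # shift: push the state, advance the input
            stack.append(number)
            position += 1
        else:                            # reduce: pop |right side| states, then goto
            lhs, length = PRODUCTIONS[number]
            del stack[len(stack) - length:]
            stack.append(GOTO[(stack[-1], lhs)])
            reductions.append(number)

print(parse(["id", "=", "*", "id"]))
print(parse(["=", "id"]))
```

The list returned for a valid input is the right-most derivation in reverse order, which is what a bottom-up parser is expected to find.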
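Sets of LR(0) items for the ambiguous grammar E → E+E | E*E | (E) | id:
I0: E’ → .E, E → .E+E, E → .E*E, E → .(E), E → .id
I1: E’ → E., E → E.+E, E → E.*E
I2: E → (.E), E → .E+E, E → .E*E, E → .(E), E → .id
I3: E → id.
I4: E → E+.E, E → .E+E, E → .E*E, E → .(E), E → .id
I5: E → E*.E, E → .E+E, E → .E*E, E → .(E), E → .id
I6: E → (E.), E → E.+E, E → E.*E
I7: E → E+E., E → E.+E, E → E.*E
I8: E → E*E., E → E.+E, E → E.*E
I9: E → (E).
Transitions: I0, I2, I4 and I5 go to I2 on ( and to I3 on id; I0 goes to I1 on E; I2 goes to I6 on E; I4 goes to I7 on E; I5 goes to I8 on E; I1, I6, I7 and I8 go to I4 on + and to I5 on *; I6 goes to I9 on ).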
I0 --E--> I1 --+--> I4 --E--> I7
I0 --E--> I1 --*--> I5 --E--> I8