CS346 Bottom Up Parser

The document discusses bottom-up parsing and shift-reduce parsing. Bottom-up parsing builds the parse tree from the leaves to the root by shifting and reducing based on production rules. Shift-reduce parsing uses a stack and parsing tables to shift input symbols and reduce substrings matching production rules until reaching the start symbol.

Uploaded by

Abhijit Karan
CS 346: Bottom Up Parser

Resource: Textbook
Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman,
“Compilers: Principles, Techniques, and Tools”,
Addison-Wesley, 1986.
Bottom-Up Parsing
• Bottom-up parser:
  - builds the parse tree for the given input starting from the leaves and working towards the root
  - finds the right-most derivation of the given input in reverse order
        S ⇒ ... ⇒ ω    (the right-most derivation of ω)
          ←            (the bottom-up parser discovers these steps in reverse order)

• Bottom-up parsing is also known as shift-reduce parsing because its two main actions are shift and reduce
  - At each shift action, the current symbol in the input string is pushed onto a stack
  - At each reduction step, the symbols at the top of the stack (a sequence that forms the right side of a production) are replaced by the non-terminal on the left side of that production
  - Two more actions: accept and error
Shift-Reduce Parsing
• A shift-reduce parser tries to reduce the given input string to the start symbol:

        a string  --(reduced to)-->  the start symbol

• At each reduction step, a substring that matches the right side of a production rule is replaced by the non-terminal on the left side of that production rule

• If the substring is chosen correctly at every step, the right-most derivation of the string is created in reverse order

        Rightmost derivation:         S ⇒*rm ω
        Shift-reduce parser finds:    ω ⇐rm ... ⇐rm S
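Shift-Reduce Parsing -- Example
Grammar:
    S → aABb
    A → aA | a
    B → bB | b

Input string: aaabb

Reductions (read top to bottom):
    aaabb
    aaAbb     (reduce by A → a)
    aAbb      (reduce by A → aA)
    aABb      (reduce by B → b)
    S         (reduce by S → aABb)

The same steps read in reverse give the right-most derivation:
    S ⇒rm aABb ⇒rm aAbb ⇒rm aaAbb ⇒rm aaabb

Each string in this derivation is a right-sentential form.

• How do we know which substring is to be replaced at each reduction step?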
Handle
• Informally, a handle of a string is a substring that matches the right side of a production rule
  - but not every substring that matches the right side of a production rule is a handle
• A handle of a right-sentential form γ (≡ αβω) is a production rule A → β together with a position in γ where the string β may be found and replaced by A to produce the previous right-sentential form in a rightmost derivation of γ:

        S ⇒*rm αAω ⇒rm αβω

  where ω is a string of terminals
• If the grammar is unambiguous, then every right-sentential form of the grammar has exactly one handle
Handle Pruning
• A right-most derivation in reverse can be obtained by handle pruning:

        S = γ0 ⇒rm γ1 ⇒rm γ2 ⇒rm ... ⇒rm γn-1 ⇒rm γn = ω    (ω is the input string)

• Start from γn, find a handle An → βn in γn, and replace βn by An to get γn-1
• Then find a handle An-1 → βn-1 in γn-1, and replace βn-1 by An-1 to get γn-2
• Repeat this until we reach S
A Shift-Reduce Parser
Grammar:
    E → E+T | T
    T → T*F | F
    F → (E) | id

Right-most derivation of id+id*id:
    E ⇒ E+T ⇒ E+T*F ⇒ E+T*id ⇒ E+F*id ⇒ E+id*id ⇒ T+id*id ⇒ F+id*id ⇒ id+id*id

Right-Most Sentential Form    Handle         Reducing Production
id+id*id                      leftmost id    F → id
F+id*id                       F              T → F
T+id*id                       T              E → T
E+id*id                       leftmost id    F → id
E+F*id                        F              T → F
E+T*id                        id             F → id
E+T*F                         T*F            T → T*F
E+T                           E+T            E → E+T
E

The Handle column names the handle of each right-sentential form (highlighted in the original slides).
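The reduction sequence in the table above can be replayed mechanically. The sketch below is only an illustration: the helper name `prune_handles` and the `(lhs, rhs, position)` encoding of a handle are assumptions made here, and the handle positions are supplied by hand rather than discovered by a parser.

```python
def prune_handles(tokens, steps):
    """Replay a list of reductions on a token list.

    Each step is (lhs, rhs, pos): replace the handle rhs found at index pos
    by the non-terminal lhs. Returns every sentential form that is visited.
    """
    form = list(tokens)
    forms = [" ".join(form)]
    for lhs, rhs, pos in steps:
        if form[pos:pos + len(rhs)] != rhs:
            raise ValueError(f"{rhs} is not at position {pos} in {form}")
        form[pos:pos + len(rhs)] = [lhs]  # prune the handle
        forms.append(" ".join(form))
    return forms


# Handles for id+id*id, in the order the table lists them (positions chosen by hand).
STEPS = [
    ("F", ["id"], 0), ("T", ["F"], 0), ("E", ["T"], 0),
    ("F", ["id"], 2), ("T", ["F"], 2), ("F", ["id"], 4),
    ("T", ["T", "*", "F"], 2), ("E", ["E", "+", "T"], 0),
]

forms = prune_handles(["id", "+", "id", "*", "id"], STEPS)
for f in forms:
    print(f)
```

Reading the printed forms from bottom to top gives the right-most derivation of id+id*id.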
A Stack Implementation of A Shift-Reduce Parser
A shift-reduce parser has four possible actions:

1. Shift: the next input symbol is shifted onto the top of the stack
2. Reduce: the handle on the top of the stack is replaced by the non-terminal on the left side of its production
3. Accept: parsing completes successfully
4. Error: the parser discovers a syntax error and calls an error recovery routine

• Initial stack: contains only the end-marker $
• End of the input string: marked by the end-marker $
A Stack Implementation of A Shift-Reduce Parser -- Example
Stack       Input         Action
$           id+id*id$     shift
$id         +id*id$       reduce by F → id
$F          +id*id$       reduce by T → F
$T          +id*id$       reduce by E → T
$E          +id*id$       shift
$E+         id*id$        shift
$E+id       *id$          reduce by F → id
$E+F        *id$          reduce by T → F
$E+T        *id$          shift
$E+T*       id$           shift
$E+T*id     $             reduce by F → id
$E+T*F      $             reduce by T → T*F
$E+T        $             reduce by E → E+T
$E          $             accept

[Figure: parse tree, with nodes numbered in the order the reductions create them. E8 has children E3, + and T7; E3 → T2 → F1 → id; T7 has children T5, * and F6; T5 → F4 → id; F6 → id.]
Conflicts During Shift-Reduce Parsing
• For certain classes of CFGs, shift-reduce parsers cannot be used
• The stack contents and the next input symbol may not be enough to decide the action:
  - shift/reduce conflict: the parser cannot decide whether to shift or to reduce
  - reduce/reduce conflict: the parser cannot decide which of several reductions to make
• If a shift-reduce parser cannot be used for a grammar, that grammar is called a non-LR(k) grammar
        L: left-to-right scanning
        R: right-most derivation
        k: number of lookahead symbols
• An ambiguous grammar can never be an LR grammar
Shift-Reduce Parsers
Two main categories of shift-reduce parsers:

1. Operator-Precedence Parser
   - simple, but handles only a small class of grammars

2. LR-Parsers
   - cover a wide range of grammars
   - SLR: simple LR parser
   - LR: most general LR parser
   - LALR: intermediate LR parser (lookahead LR parser)
   - SLR, LR and LALR parsers work the same way; only their parsing tables are different

[Figure: nested grammar classes, SLR ⊂ LALR ⊂ LR ⊂ CFG]
LR Parsers
• The most powerful (yet efficient) shift-reduce parsing method is LR(k) parsing:
        L: left-to-right scanning
        R: right-most derivation
        k: number of lookahead symbols
• LR parsing is attractive because:
  - LR parsers can be constructed to recognize virtually all programming language constructs for which CFGs can be written
  - LR parsing is the most general non-backtracking shift-reduce parsing method, yet it is still efficient
  - The class of grammars that can be parsed using LR methods is a proper superset of the class of grammars that can be parsed with predictive parsers or LL methods:
        LL(1)-Grammars ⊂ LR(1)-Grammars
  - An LR parser can detect a syntactic error as soon as it is possible to do so on a left-to-right scan of the input
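LR Parsing Algorithm
[Figure: model of an LR parser. The input buffer holds a1 ... ai ... an $. The stack holds S0 X1 S1 ... Xm-1 Sm-1 Xm Sm, with Sm on top. The LR parsing algorithm reads both, produces the output, and consults two tables. Action table: rows are states, columns are terminals and $, each entry is one of four different actions. Goto table: rows are states, columns are non-terminals, each entry is a state number.]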
A Configuration of LR Parsing Algorithm
• A configuration of an LR parser is:

        ( S0 X1 S1 ... Xm Sm,  ai ai+1 ... an $ )
          stack                rest of input

• Sm and ai determine the parser action by consulting the parsing action table (the initial stack contains just S0)
• S0 does not represent any grammar symbol
• A configuration of an LR parser represents the right-sentential form:

        X1 ... Xm ai ai+1 ... an $
Actions of A LR-Parser
1. shift s: shifts the next input symbol and the state s onto the stack
        ( S0 X1 S1 ... Xm Sm, ai ai+1 ... an $ )  ⊢  ( S0 X1 S1 ... Xm Sm ai s, ai+1 ... an $ )

2. reduce A → β (or rn, where n is a production number)
   • pop 2|β| items from the stack (let r = |β|)
   • then push A and s, where s = goto[Sm-r, A]
        ( S0 X1 S1 ... Xm Sm, ai ai+1 ... an $ )  ⊢  ( S0 X1 S1 ... Xm-r Sm-r A s, ai ... an $ )
   • the output is the reducing production A → β

3. accept: parsing completed successfully

4. error: the parser detected an error (an empty entry in the action table)
Reduce Action
• pop 2|β| items from the stack; assume that β = Y1Y2...Yr (so r = |β|)
• then push A and s, where s = goto[Sm-r, A]

        ( S0 X1 S1 ... Xm-r Sm-r Y1 Sm-r+1 ... Yr Sm, ai ai+1 ... an $ )
     ⊢  ( S0 X1 S1 ... Xm-r Sm-r A s, ai ... an $ )

• In fact, Y1Y2...Yr is a handle:

        X1 ... Xm-r A ai ... an $  ⇒  X1 ... Xm-r Y1...Yr ai ai+1 ... an $
(SLR) Parsing Tables for Expression Grammar
Productions:
    1) E → E+T
    2) E → T
    3) T → T*F
    4) T → F
    5) F → (E)
    6) F → id

        Action Table                          |  Goto Table
state   id    +     *     (     )     $      |  E    T    F
0       s5                s4                 |  1    2    3
1             s6                      acc    |
2             r2    s7          r2    r2     |
3             r4    r4          r4    r4     |
4       s5                s4                 |  8    2    3
5             r6    r6          r6    r6     |
6       s5                s4                 |       9    3
7       s5                s4                 |            10
8             s6                s11          |
9             r1    s7          r1    r1     |
10            r3    r3          r3    r3     |
11            r5    r5          r5    r5     |
Actions of A (S)LR-Parser -- Example
stack input action output
0 id*id+id$ shift 5
0id5 *id+id$ reduce by F→id F→id
0F3 *id+id$ reduce by T→F T→F
0T2 *id+id$ shift 7
0T2*7 id+id$ shift 5
0T2*7id5 +id$ reduce by F→id F→id
0T2*7F10 +id$ reduce by T→T*F T→T*F
0T2 +id$ reduce by E→T E→T
0E1 +id$ shift 6
0E1+6 id$ shift 5
0E1+6id5 $ reduce by F→id F→id
0E1+6F3 $ reduce by T→F T→F
0E1+6T9 $ reduce by E→E+T E→E+T
0E1 $ accept
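The trace above can be reproduced with a small table-driven driver. This is a minimal sketch, not the slides' own code: the dictionary encoding of the tables and the function name `parse` are assumptions made here. One simplification is labeled in the comments: the stack holds states only (each state already determines its grammar symbol), so a reduce pops |β| entries instead of 2|β|.

```python
# SLR tables for the expression grammar (productions numbered 1..6 as in the slides).
PRODS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3), 4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}

GOTO = {0: {"E": 1, "T": 2, "F": 3}, 4: {"E": 8, "T": 2, "F": 3},
        6: {"T": 9, "F": 3}, 7: {"F": 10}}


def parse(tokens):
    """Return the production numbers used, in the order the reductions happen."""
    stack = [0]                      # assumption: states only, no grammar symbols
    tokens = list(tokens) + ["$"]
    pos = 0
    output = []
    while True:
        act = ACTION[stack[-1]].get(tokens[pos])
        if act is None:              # empty table entry = error
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        if act == "acc":
            return output
        if act[0] == "s":            # shift: push the new state, advance the input
            stack.append(int(act[1:]))
            pos += 1
        else:                        # reduce by production n
            n = int(act[1:])
            lhs, length = PRODS[n]
            del stack[len(stack) - length:]
            stack.append(GOTO[stack[-1]][lhs])
            output.append(n)


print(parse(["id", "*", "id", "+", "id"]))
```

For id*id+id the driver emits the productions 6, 4, 6, 3, 2, 6, 4, 1, which is the output column of the trace.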
Constructing SLR Parsing Tables – LR(0) Item
• LR(0) item: a production of G with a dot at some position of the right side
        Ex: A → aBb has four possible LR(0) items:
            A → .aBb
            A → a.Bb
            A → aB.b
            A → aBb.
• Sets of LR(0) items: the states of the action and goto tables of the SLR parser
• The collection of sets of LR(0) items (the canonical LR(0) collection) is the basis for constructing SLR parsers
• Augmented grammar: G’ is G with a new production rule S’ → S, where S’ is the new start symbol
The Closure Operation
I: a set of LR(0) items for a grammar G
closure(I): the set of LR(0) items constructed from I by the two rules:
1. Initially, every LR(0) item in I is added to closure(I)
2. If A → α.Bβ is in closure(I) and B → γ is a production rule of G, then B → .γ will be in closure(I)

We apply rule 2 until no more new LR(0) items can be added to closure(I)
The Closure Operation -- Example
Grammar:
    E’ → E
    E → E+T
    E → T
    T → T*F
    T → F
    F → (E)
    F → id

closure({E’ → .E}) =
    { E’ → .E        (kernel item)
      E → .E+T
      E → .T
      T → .T*F
      T → .F
      F → .(E)
      F → .id }
Kernel and Non-kernel Items
• Kernel items: the initial item S’ → .S and all items whose dot is not at the left end
• Each set of items is formed by taking the closure of a set of kernel items
• We are really interested in the kernel items
• Non-kernel items can be removed to save storage
• Non-kernel items can be regenerated by the closure process
Goto Operation
• If I is a set of LR(0) items and X is a grammar symbol (terminal or non-terminal), then goto(I,X) is defined as follows:
  - If A → α.Xβ is in I, then every item in closure({A → αX.β}) will be in goto(I,X)

Example:
    I = { E’ → .E, E → .E+T, E → .T,
          T → .T*F, T → .F,
          F → .(E), F → .id }

    goto(I,E)  = { E’ → E., E → E.+T }
    goto(I,T)  = { E → T., T → T.*F }
    goto(I,F)  = { T → F. }
    goto(I,()  = { F → (.E), E → .E+T, E → .T, T → .T*F, T → .F, F → .(E), F → .id }
    goto(I,id) = { F → id. }
Construction of The Canonical LR(0) Collection
• To create the SLR parsing tables for a grammar G, we first create the canonical LR(0) collection of the augmented grammar G’
• Algorithm:
        C is { closure({S’ → .S}) }
        repeat the following until no more sets of LR(0) items can be added to C:
            for each I in C and each grammar symbol X
                if goto(I,X) is not empty and not in C
                    add goto(I,X) to C
• The goto function defines a DFA on the sets in C
The Canonical LR(0) Collection -- Example
I0:  E’ → .E,  E → .E+T,  E → .T,  T → .T*F,  T → .F,  F → .(E),  F → .id
I1:  E’ → E.,  E → E.+T
I2:  E → T.,  T → T.*F
I3:  T → F.
I4:  F → (.E),  E → .E+T,  E → .T,  T → .T*F,  T → .F,  F → .(E),  F → .id
I5:  F → id.
I6:  E → E+.T,  T → .T*F,  T → .F,  F → .(E),  F → .id
I7:  T → T*.F,  F → .(E),  F → .id
I8:  F → (E.),  E → E.+T
I9:  E → E+T.,  T → T.*F
I10: T → T*F.
I11: F → (E).
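The closure, goto and collection steps for LR(0) items can be sketched in a few lines. This is an illustrative sketch under assumptions made here: an item is encoded as a pair `(production index, dot position)`, and the names `closure`, `goto` and `canonical_collection` are chosen for this example. The states come out in a different order than I0..I11, because the order in which grammar symbols are tried is arbitrary.

```python
# Augmented expression grammar; production 0 is E' -> E.
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("id",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Add B -> .gamma for every item with the dot in front of non-terminal B."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for i, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    work.append((i, 0))
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved) if moved else frozenset()


def canonical_collection():
    symbols = sorted({s for _, rhs in GRAMMAR for s in rhs})
    collection = [closure({(0, 0)})]
    for state in collection:          # the list grows while we iterate over it
        for x in symbols:
            target = goto(state, x)
            if target and target not in collection:
                collection.append(target)
    return collection


print(len(canonical_collection()))    # number of LR(0) states
```

The loop terminates because only finitely many distinct item sets exist, and each one is added at most once.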
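Transition Diagram (DFA)
[Figure: DFA of the goto function over the states I0 to I11. Transitions:]
    from I0:  E to I1,  T to I2,  F to I3,  ( to I4,  id to I5
    from I1:  + to I6
    from I2:  * to I7
    from I4:  E to I8,  T to I2,  F to I3,  ( to I4,  id to I5
    from I6:  T to I9,  F to I3,  ( to I4,  id to I5
    from I7:  F to I10,  ( to I4,  id to I5
    from I8:  ) to I11,  + to I6
    from I9:  * to I7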
Constructing SLR Parsing Table
(of an augmented grammar G’)

1. Construct the canonical collection of sets of LR(0) items for G’:
        C ← {I0,...,In}
2. Create the parsing action table as follows:
   • If a is a terminal, A → α.aβ is in Ii and goto(Ii,a) = Ij, then action[i,a] is shift j
   • If A → α. is in Ii, then action[i,a] is reduce A → α for all a in FOLLOW(A), where A ≠ S’
   • If S’ → S. is in Ii, then action[i,$] is accept
   • If any conflicting actions are generated by these rules, the grammar is not SLR(1)
3. Create the parsing goto table:
   • for all non-terminals A, if goto(Ii,A) = Ij then goto[i,A] = j
4. All entries not defined by (2) and (3) are errors
5. The initial state of the parser is the one containing S’ → .S
SLR(1) Grammar
• An LR parser using SLR(1) parsing tables for a grammar G is called the SLR(1) parser for G
• If a grammar G has an SLR(1) parsing table without conflicts, it is called an SLR(1) grammar (or SLR grammar for short)
• Every SLR grammar is unambiguous, but not every unambiguous grammar is an SLR grammar
shift/reduce and reduce/reduce conflicts
• shift/reduce conflict: a state cannot decide whether to shift or to reduce for a terminal
• reduce/reduce conflict: a state cannot decide whether to reduce by production rule i or by production rule j for a terminal
• If the SLR parsing table of a grammar G has a conflict, we say that the grammar is not an SLR grammar
Conflict Example
Grammar:
    S → L=R
    S → R
    L → *R
    L → id
    R → L

I0: S’ → .S,  S → .L=R,  S → .R,  L → .*R,  L → .id,  R → .L
I1: S’ → S.
I2: S → L.=R,  R → L.
I3: S → R.
I4: L → *.R,  R → .L,  L → .*R,  L → .id
I5: L → id.
I6: S → L=.R,  R → .L,  L → .*R,  L → .id
I7: L → *R.
I8: R → L.
I9: S → L=R.

Problem (state I2, input symbol =):
    FOLLOW(R) = { =, $ }
    S → L.=R  gives  shift 6
    R → L.    gives  reduce by R → L
    ⇒ shift/reduce conflict
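The claim FOLLOW(R) = { =, $ } can be checked with a short fixed-point computation. The sketch below is simplified on purpose and that simplification is an assumption of this example: it only handles grammars without ε-productions (true for this grammar), so FIRST of a symbol string is just FIRST of its first symbol. The function names `first_sets` and `follow_sets` are chosen here.

```python
# Augmented grammar of the conflict example (no epsilon productions).
GRAMMAR = [
    ("S'", ["S"]),
    ("S", ["L", "=", "R"]), ("S", ["R"]),
    ("L", ["*", "R"]), ("L", ["id"]),
    ("R", ["L"]),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_sets():
    """FIRST for every non-terminal (epsilon-free grammars only)."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            head = rhs[0]
            new = first[head] if head in NONTERMINALS else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def follow_sets():
    """FOLLOW for every non-terminal (epsilon-free grammars only)."""
    first = first_sets()
    follow = {n: set() for n in NONTERMINALS}
    follow["S'"].add("$")
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, sym in enumerate(rhs):
                if sym not in NONTERMINALS:
                    continue
                if i + 1 < len(rhs):            # something follows sym in this rule
                    nxt = rhs[i + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:                           # sym ends the rule: inherit FOLLOW(lhs)
                    new = follow[lhs]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


follow = follow_sets()
print(sorted(follow["R"]))
```

Because = ends up in FOLLOW(R), the SLR rule places reduce by R → L in the same table entry as shift 6.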
Conflict Example2
Grammar:
    S → AaAb
    S → BbBa
    A → ε
    B → ε

I0: S’ → .S
    S → .AaAb
    S → .BbBa
    A → .
    B → .

Problem (state I0):
    FOLLOW(A) = {a,b}
    FOLLOW(B) = {a,b}
    on a:  reduce by A → ε  or  reduce by B → ε   ⇒ reduce/reduce conflict
    on b:  reduce by A → ε  or  reduce by B → ε   ⇒ reduce/reduce conflict
Constructing Canonical LR(1) Parsing Tables
• In the SLR method, state i makes a reduction by A → α when the current token is a:
  - if A → α. is in Ii and a is in FOLLOW(A)
• In some situations, βA cannot be followed by the terminal a in a right-sentential form when βα and the state i are on the top of the stack. Making the reduction in that case is not correct.

Example:
    S → AaAb
    S → BbBa
    A → ε
    B → ε

    S ⇒ AaAb ⇒ Aab ⇒ ab            S ⇒ BbBa ⇒ Bba ⇒ ba

    Aab ⇒ ε ab                      Bba ⇒ ε ba        (the first A is followed only by a, the first B only by b)
    AaAb ⇒ Aa ε b                   BbBa ⇒ Bb ε a     (the second A is followed only by b, the second B only by a)
An Example
(Same grammar and LR(0) item sets as in the Conflict Example: S → L=R | R,  L → *R | id,  R → L)

Problem in state I2 = { S → L.=R, R → L. } on the input symbol =:
    FOLLOW(R) = { =, $ }, so the SLR table contains both shift 6 and reduce by R → L

- If L is reduced to R here, the stack holds R and the next input symbol is =, but no right-sentential form begins with R=
- So the reduction R → L is INVALID for the input symbol “=”
LR(1) Item
• To avoid some invalid reductions, the states need to carry more information
• Extra information is put into a state by including a terminal symbol as a second component of an item
• An LR(1) item is:
        A → α.β, a      where a is the look-ahead of the LR(1) item (a is a terminal or the end-marker)
LR(1) Item (cont.)
• When β (in the LR(1) item A → α.β, a) is not empty, the look-ahead does not have any effect
• When β is empty (A → α., a), reduce by A → α only if the next input symbol is a (not for every terminal in FOLLOW(A))
• A state will contain
        A → α., a1
        ...
        A → α., an
  where {a1,...,an} ⊆ FOLLOW(A)
Canonical Collection of Sets of LR(1) Items
• The process is similar to the construction of the canonical collection of sets of LR(0) items, but the closure and goto operations work a little differently

closure(I) (where I is a set of LR(1) items):
• every LR(1) item in I is in closure(I)
• if A → α.Bβ, a is in closure(I) and B → γ is a production rule of G, then B → .γ, b belongs to closure(I) for each terminal b in FIRST(βa)

goto operation
• If I is a set of LR(1) items and X is a grammar symbol (terminal or non-terminal), then goto(I,X) is defined as follows:
  - If A → α.Xβ, a is in I, then every item in closure({A → αX.β, a}) belongs to goto(I,X)
Construction of The Canonical LR(1) Collection
• Algorithm:
        C is { closure({S’ → .S, $}) }
        repeat the following until no more sets of LR(1) items can be added to C:
            for each I in C and each grammar symbol X
                if goto(I,X) is not empty and not in C
                    add goto(I,X) to C
• The goto function defines a DFA on the sets in C
A Short Notation for The Sets of LR(1) Items
• A set of LR(1) items containing the items
        A → α.β, a1
        ...
        A → α.β, an
  can be written as
        A → α.β, a1/a2/.../an
Canonical LR(1) Collection -- Example
Grammar:
    S → AaAb
    S → BbBa
    A → ε
    B → ε

I0: S’ → .S, $
    S → .AaAb, $
    S → .BbBa, $
    A → ., a
    B → ., b
I1: S’ → S., $
I2: S → A.aAb, $
I3: S → B.bBa, $
I4: S → Aa.Ab, $
    A → ., b
I5: S → Bb.Ba, $
    B → ., a
I6: S → AaA.b, $
I7: S → BbB.a, $
I8: S → AaAb., $
I9: S → BbBa., $

Transitions: from I0: S to I1, A to I2, B to I3; from I2: a to I4; from I3: b to I5; from I4: A to I6; from I5: B to I7; from I6: b to I8; from I7: a to I9
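The LR(1) closure rule, with its FIRST(βa) lookaheads, can be sketched as follows for this grammar. The encoding is an assumption of this example: an item is a triple `(production index, dot position, lookahead)`, the empty string `EPS` marks "can derive ε" inside FIRST sets, and the names `first_of`, `compute_first` and `closure` are chosen here.

```python
# Augmented grammar S' -> S, S -> AaAb | BbBa, A -> eps, B -> eps.
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("A", "a", "A", "b")), ("S", ("B", "b", "B", "a")),
    ("A", ()), ("B", ()),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}
EPS = ""   # marker meaning "derives the empty string"


def first_of(symbols, first):
    """FIRST of a string of grammar symbols, given FIRST of each non-terminal."""
    out = set()
    for sym in symbols:
        if sym not in NONTERMINALS:        # a terminal (or $) ends the scan
            out.add(sym)
            return out
        out |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return out
    out.add(EPS)                           # every symbol was nullable
    return out


def compute_first():
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            new = first_of(rhs, first)
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def closure(items, first):
    """LR(1) closure: for [A -> alpha.B beta, a] add [B -> .gamma, b], b in FIRST(beta a)."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot, look = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            lookaheads = first_of(rhs[dot + 1:] + (look,), first)
            for i, (lhs, _) in enumerate(GRAMMAR):
                if lhs != rhs[dot]:
                    continue
                for b in lookaheads:
                    if (i, 0, b) not in result:
                        result.add((i, 0, b))
                        work.append((i, 0, b))
    return result


FIRST = compute_first()
I0 = closure({(0, 0, "$")}, FIRST)
I4 = closure({(1, 2, "$")}, FIRST)   # kernel item S -> Aa.Ab, $
print(sorted(I0))
```

In I0 the item A → . carries only the lookahead a and B → . carries only b, so the reduce/reduce conflict of the SLR table disappears.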
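Canonical LR(1) Collection – Example 2
Grammar:
    S’ → S
    1) S → L=R
    2) S → R
    3) L → *R
    4) L → id
    5) R → L

I0:  [S’ → .S, $]  [S → .L=R, $]  [S → .R, $]  [L → .*R, =/$]  [L → .id, =/$]  [R → .L, $]
I1:  [S’ → S., $]
I2:  [S → L.=R, $]  [R → L., $]
I3:  [S → R., $]
I4:  [L → *.R, =/$]  [R → .L, =/$]  [L → .*R, =/$]  [L → .id, =/$]
I5:  [L → id., =/$]
I6:  [S → L=.R, $]  [R → .L, $]  [L → .*R, $]  [L → .id, $]
I7:  [L → *R., =/$]
I8:  [R → L., =/$]
I9:  [S → L=R., $]
I10: [R → L., $]
I11: [L → *.R, $]  [R → .L, $]  [L → .*R, $]  [L → .id, $]
I12: [L → id., $]
I13: [L → *R., $]

(The L-items of I0 carry the lookaheads =/$: the lookahead = comes from S → .L=R, $ and the lookahead $ comes from R → .L, $. The same lookaheads propagate to I4, I5, I7 and I8.)

Transitions:
    from I0:   S to I1,  L to I2,  R to I3,  * to I4,  id to I5
    from I2:   = to I6
    from I4:   R to I7,  L to I8,  * to I4,  id to I5
    from I6:   R to I9,  L to I10,  * to I11,  id to I12
    from I11:  R to I13,  L to I10,  * to I11,  id to I12

Sets with the same core: I4 and I11, I5 and I12, I7 and I13, I8 and I10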
Construction of LR(1) Parsing Tables
1. Construct the canonical collection of sets of LR(1) items for G’:
        C ← {I0,...,In}
2. Create the parsing action table as follows:
   • If a is a terminal, A → α.aβ, b is in Ii and goto(Ii,a) = Ij, then action[i,a] is shift j
   • If A → α., a is in Ii, then action[i,a] is reduce A → α, where A ≠ S’
   • If S’ → S., $ is in Ii, then action[i,$] is accept
   • If any conflicting actions are generated by these rules, the grammar is not LR(1)
3. Create the parsing goto table:
   • for all non-terminals A, if goto(Ii,A) = Ij then goto[i,A] = j
4. All entries not defined by (2) and (3) are errors
5. The initial state of the parser is the one containing S’ → .S, $
LR(1) Parsing Tables – (for Example2)
state   id     *      =      $      |  S    L    R
0       s5     s4                   |  1    2    3
1                            acc    |
2                     s6     r5     |
3                            r2     |
4       s5     s4                   |       8    7
5                     r4     r4     |
6       s12    s11                  |       10   9
7                     r3     r3     |
8                     r5     r5     |
9                            r1     |
10                           r5     |
11      s12    s11                  |       10   13
12                           r4     |
13                           r3     |

No shift/reduce or reduce/reduce conflict ⇒ so, it is an LR(1) grammar
LALR Parsing Tables
• LALR stands for LookAhead LR
• LALR parsers are often used in practice because LALR parsing tables are smaller than LR(1) parsing tables
• The numbers of states in the SLR and LALR parsing tables for a grammar G are equal
• But LALR parsers recognize more grammars than SLR parsers
• yacc creates an LALR parser for the given grammar
• A state of an LALR parser is again a set of LR(1) items
Creating LALR Parsing Tables

        Canonical LR(1) Parser  --(shrink the number of states)-->  LALR Parser

• This shrink process may introduce a reduce/reduce conflict in the resulting LALR parser (in that case the grammar is NOT LALR)
• But this shrink process cannot produce a shift/reduce conflict
The Core of A Set of LR(1) Items
• The core of a set of LR(1) items is the set of their first components (the items without their lookaheads)

        Ex:   set of LR(1) items        core
              S → L.=R, $               S → L.=R
              R → L., $                 R → L.

• Find the states (sets of LR(1) items) in a canonical LR(1) parser that have the same core, and merge them into a single state

        I1: L → id., =
        I2: L → id., $       same core, so merge them into a new state   I12: L → id., =/$

• Do this for all states of the canonical LR(1) parser to get the states of the LALR parser
• For any grammar, the number of states of the LALR parser = the number of states of the SLR parser
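Merging states by core is a simple grouping step. In this sketch a state is modeled as a set of `(core item, lookahead)` pairs with the core item written as a plain string; that encoding and the names `core` and `merge_by_core` are assumptions of this example, and the goto table is not rebuilt here.

```python
def core(state):
    """The core of a state: its items with the lookaheads stripped."""
    return frozenset(item for item, _ in state)


def merge_by_core(states):
    """Union all states that share the same core (canonical LR(1) -> LALR states)."""
    merged = {}
    for state in states:
        merged.setdefault(core(state), set()).update(state)
    return [frozenset(s) for s in merged.values()]


# Three states: the first two differ only in their lookaheads.
state_a = {("L -> id.", "=")}
state_b = {("L -> id.", "$")}
state_c = {("S -> L.=R", "$"), ("R -> L.", "$")}

lalr_states = merge_by_core([state_a, state_b, state_c])
print(len(lalr_states))
```

The two L → id. states collapse into one state with lookaheads =/$, while the third state is left alone because no other state shares its core.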
Creation of LALR Parsing Tables
• Create the canonical LR(1) collection of the sets of LR(1) items for the given grammar
• Find each core; find all sets having that same core; replace those sets with a single set which is their union:
        C = {I0,...,In}  →  C’ = {J1,...,Jm}   where m ≤ n
• Create the parsing tables (action and goto tables) in the same way as for the LR(1) parser
  - Note that if J = I1 ∪ ... ∪ Ik, then, since I1,...,Ik have the same core, the cores of goto(I1,X),...,goto(Ik,X) must also be the same
  - So goto(J,X) = K, where K is the union of all sets of items having the same core as goto(I1,X)
• The grammar is LALR(1) if no conflict is introduced
  - it is possible to introduce reduce/reduce conflicts
  - it is not possible to introduce a shift/reduce conflict
Shift/Reduce Conflict
• Assume that merging could introduce a shift/reduce conflict. Then a state of the LALR parser must contain:
        A → α., a     and     B → β.aγ, b
• Because merged states have the same core, some state of the canonical LR(1) parser must contain:
        A → α., a     and     B → β.aγ, c
  But this state also has a shift/reduce conflict, i.e. the original canonical LR(1) parser already has the conflict
  (The reason: the shift operation does not depend on lookaheads)
Reduce/Reduce Conflict
• Merging can, however, introduce a reduce/reduce conflict:

        I1: A → α., a          I2: A → α., b
            B → β., b              B → β., c

        I12: A → α., a/b
             B → β., b/c       ⇒ reduce/reduce conflict (on b)
Canonical LALR(1) Collection – Example2
Grammar:
    S’ → S
    1) S → L=R
    2) S → R
    3) L → *R
    4) L → id
    5) R → L

I0:    [S’ → .S, $]  [S → .L=R, $]  [S → .R, $]  [L → .*R, =/$]  [L → .id, =/$]  [R → .L, $]
I1:    [S’ → S., $]
I2:    [S → L.=R, $]  [R → L., $]
I3:    [S → R., $]
I411:  [L → *.R, =/$]  [R → .L, =/$]  [L → .*R, =/$]  [L → .id, =/$]
I512:  [L → id., =/$]
I6:    [S → L=.R, $]  [R → .L, $]  [L → .*R, $]  [L → .id, $]
I713:  [L → *R., =/$]
I810:  [R → L., =/$]
I9:    [S → L=R., $]

Transitions:
    from I0:    S to I1,  L to I2,  R to I3,  * to I411,  id to I512
    from I2:    = to I6
    from I411:  R to I713,  L to I810,  * to I411,  id to I512
    from I6:    R to I9,  L to I810,  * to I411,  id to I512

Same cores (merged): I4 and I11 → I411,  I5 and I12 → I512,  I7 and I13 → I713,  I8 and I10 → I810
LALR(1) Parsing Tables – (for Example2)
(states 4, 5, 7 and 8 are the merged states I411, I512, I713 and I810)

state   id     *      =      $      |  S    L    R
0       s5     s4                   |  1    2    3
1                            acc    |
2                     s6     r5     |
3                            r2     |
4       s5     s4                   |       8    7
5                     r4     r4     |
6       s5     s4                   |       8    9
7                     r3     r3     |
8                     r5     r5     |
9                            r1     |

No shift/reduce or reduce/reduce conflict ⇒ so, it is an LALR(1) grammar
Using Ambiguous Grammars
• All grammars used so far in the construction of LR-parsing tables must be unambiguous
• Can we create LR-parsing tables for ambiguous grammars?
  - Yes, but they will have conflicts
  - We can resolve each conflict in favor of one of the actions to disambiguate the grammar
  - In the end, we again have an unambiguous grammar
• Why would we want to use an ambiguous grammar?
  - Some ambiguous grammars are much more natural, and a corresponding unambiguous grammar can be very complex
  - Using an ambiguous grammar may eliminate unnecessary reductions
• Ex.
        ambiguous:      E → E+E | E*E | (E) | id
        unambiguous:    E → E+T | T
                        T → T*F | F
                        F → (E) | id
Sets of LR(0) Items for Ambiguous Grammar
I0: E’ → .E,  E → .E+E,  E → .E*E,  E → .(E),  E → .id
I1: E’ → E.,  E → E.+E,  E → E.*E
I2: E → (.E),  E → .E+E,  E → .E*E,  E → .(E),  E → .id
I3: E → id.
I4: E → E+.E,  E → .E+E,  E → .E*E,  E → .(E),  E → .id
I5: E → E*.E,  E → .E+E,  E → .E*E,  E → .(E),  E → .id
I6: E → (E.),  E → E.+E,  E → E.*E
I7: E → E+E.,  E → E.+E,  E → E.*E
I8: E → E*E.,  E → E.+E,  E → E.*E
I9: E → (E).

Transitions:
    from I0:  E to I1,  ( to I2,  id to I3
    from I1:  + to I4,  * to I5
    from I2:  E to I6,  ( to I2,  id to I3
    from I4:  E to I7,  ( to I2,  id to I3
    from I5:  E to I8,  ( to I2,  id to I3
    from I6:  ) to I9,  + to I4,  * to I5
    from I7:  + to I4,  * to I5
    from I8:  + to I4,  * to I5
SLR-Parsing Tables for Ambiguous Grammar
FOLLOW(E) = { $, +, *, ) }

State I7 has shift/reduce conflicts for symbols + and *.
It is reached with E + E on the stack:   I0 --E--> I1 --+--> I4 --E--> I7

when current token is +
    shift  ⇒ + is right-associative
    reduce ⇒ + is left-associative

when current token is *
    shift  ⇒ * has higher precedence than +
    reduce ⇒ + has higher precedence than *
SLR-Parsing Tables for Ambiguous Grammar
FOLLOW(E) = { $, +, *, ) }

State I8 has shift/reduce conflicts for symbols + and *.
It is reached with E * E on the stack:   I0 --E--> I1 --*--> I5 --E--> I8

when current token is *
    shift  ⇒ * is right-associative
    reduce ⇒ * is left-associative

when current token is +
    shift  ⇒ + has higher precedence than *
    reduce ⇒ * has higher precedence than +
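The choices listed for states I7 and I8 amount to a small decision rule. The sketch below assumes the usual conventions, which the slides present only as options: * binds tighter than +, and both operators are left-associative. The tables `PREC` and `ASSOC` and the function `resolve` are names chosen for this illustration.

```python
# Assumed conventions: higher number = higher precedence; both operators left-associative.
PREC = {"+": 1, "*": 2}
ASSOC = {"+": "left", "*": "left"}


def resolve(stack_op, lookahead):
    """Pick shift or reduce when E stack_op E is on the stack and lookahead is next."""
    if PREC[lookahead] > PREC[stack_op]:
        return "shift"       # the incoming operator binds tighter
    if PREC[lookahead] < PREC[stack_op]:
        return "reduce"      # the operator on the stack binds tighter
    # same precedence: associativity decides
    return "reduce" if ASSOC[stack_op] == "left" else "shift"


for stack_op in "+*":
    for lookahead in "+*":
        print(f"E {stack_op} E on stack, next {lookahead}: {resolve(stack_op, lookahead)}")
```

Under these assumptions state I7 reduces on + and shifts on *, while state I8 reduces on both + and *.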
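SLR-Parsing Tables for Ambiguous Grammar
Productions (numbered as the reduce entries use them):
    1) E → E+E
    2) E → E*E
    3) E → (E)
    4) E → id

        Action                                |  Goto
state   id    +     *     (     )     $      |  E
0       s3                s2                 |  1
1             s4    s5                acc    |
2       s3                s2                 |  6
3             r4    r4          r4    r4     |
4       s3                s2                 |  7
5       s3                s2                 |  8
6             s4    s5          s9           |
7             r1    s5          r1    r1     |
8             r2    r2          r2    r2     |
9             r3    r3          r3    r3     |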
Error Recovery in LR Parsing
• An LR parser detects an error when it consults the parsing action table and finds an error entry
  - Empty entries in the action table are error entries
  - Errors are never detected by consulting the goto table
• Some observations:
  - An LR parser will announce an error as soon as there is no valid continuation for the scanned portion of the input
  - A canonical LR parser (LR(1) parser) will never make even a single reduction before announcing an error
  - SLR and LALR parsers may make several reductions before announcing an error
  - LR parsers (LR(1), LALR and SLR parsers) will never shift an erroneous input symbol onto the stack
Panic Mode Error Recovery in LR Parsing
• Scan down the stack until a state s with a goto on a particular non-terminal A is found (discard everything on the stack above this state s)
• Discard zero or more input symbols until a symbol a is found that can legitimately follow A
  - The simplest choice is any symbol a in FOLLOW(A), but this may not work in all situations
• The parser pushes the non-terminal A and the state goto[s,A], and it resumes normal parsing
• The non-terminal A normally represents a basic programming block (there can be more than one choice for A)
  - stmt, expr, block, ...
Phrase-Level Error Recovery in LR Parsing
• Each empty entry in the action table is marked with a specific error routine
• An error routine reflects the error that the user is most likely to have made in that case
• An error routine
  - inserts symbols into the stack or the input
  - or deletes symbols from the stack and the input
  - or does both insertion and deletion
• An error routine may report, for example:
  - missing operand
  - unbalanced right parenthesis
