Unit 3 CD
• Constructs parse tree for an input string beginning at the leaves (the
bottom) and working towards the root (the top)
• Example: id*id
[Figure: parse-tree snapshots for id*id after each reduction: id*id, F*id, T*id, T*F, T, E]
Shift-reduce parser
• The general idea is to shift some symbols of input to the stack until a
reduction can be applied
• At each reduction step, a specific substring matching the body of a
production is replaced by the nonterminal at the head of the
production
• The key decisions during bottom-up parsing are about when to
reduce and about what production to apply
• A reduction is the reverse of a step in a derivation
• The goal of a bottom-up parser is to construct a rightmost derivation in reverse:
• E=>T=>T*F=>T*id=>F*id=>id*id
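The reverse derivation above can be replayed mechanically. The sketch below is illustrative only: the names PRODUCTIONS, reduce_at, replay and STEPS are made up for this example, and the list of reductions is written out by hand instead of being chosen by a parser.

```python
# Productions of the expression grammar used in the example: name -> (head, body).
PRODUCTIONS = {
    "E->T": ("E", ["T"]),
    "T->T*F": ("T", ["T", "*", "F"]),
    "T->F": ("T", ["F"]),
    "F->id": ("F", ["id"]),
}

def reduce_at(form, pos, name):
    """Replace the body of production `name` found at index `pos` by its head."""
    head, body = PRODUCTIONS[name]
    if form[pos:pos + len(body)] != body:
        raise ValueError(f"{name} does not match at position {pos}")
    return form[:pos] + [head] + form[pos + len(body):]

def replay(tokens, steps):
    """Apply the reductions in order and collect every sentential form."""
    forms = [list(tokens)]
    for pos, name in steps:
        forms.append(reduce_at(forms[-1], pos, name))
    return ["".join(form) for form in forms]

# Reductions performed on id*id, as (position, production), chosen by hand.
STEPS = [(0, "F->id"), (0, "T->F"), (2, "F->id"), (0, "T->T*F"), (0, "E->T")]
forms = replay(["id", "*", "id"], STEPS)
print(" <= ".join(forms))
```

Reading the collected forms from last to first gives the derivation E=>T=>T*F=>T*id=>F*id=>id*id.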
Handle pruning
• Basic operations:
• Shift
• Reduce
• Accept
• Error
• Example: id*id

Stack    Input    Action
$        id*id$   shift
$id      *id$     reduce by F->id
$F       *id$     reduce by T->F
$T       *id$     shift
$T*      id$      shift
$T*id    $        reduce by F->id
$T*F     $        reduce by T->T*F
$T       $        reduce by E->T
$E       $        accept
Handle will appear on top of the stack
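[Figure: parse trees for the two possible cases of two successive steps of a rightmost derivation]
Case 1: S => αAz => αβByz => αβγyz
Case 2: S => αBxAz => αBxyz => αγxyz

Case 1:
Stack    Input
$αβγ     yz$
$αβB     yz$
$αβBy    z$

Case 2:
Stack    Input
$αγ      xyz$
$αBxy    z$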
Conflicts during shift-reduce parsing
Stack                     Input
… if expr then stmt       else …$
Reduce/reduce conflict
• Augmented grammar:
• G with the addition of a production S’->S, where S’ is a new start symbol
• Closure of item sets:
• If I is a set of items, closure(I) is a set of items constructed from I by the
following rules:
• Add every item in I to closure(I)
• If A->α.Bβ is in closure(I) and B->γ is a production, then add the item
B->.γ to closure(I), if it is not already there.
• Example:
Grammar:
E’->E
E -> E + T | T
T -> T * F | F
F -> (E) | id
I0=closure({[E’->.E]}):
E’->.E
E->.E+T
E->.T
T->.T*F
T->.F
F->.(E)
F->.id
Constructing canonical LR(0) item sets (cont.)
SetOfItems CLOSURE(I) {
    J=I;
    repeat
        for (each item A->α.Bβ in J)
            for (each production B->γ of G)
                if (B->.γ is not in J)
                    add B->.γ to J;
    until no more items are added to J on one round;
    return J;
}
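A minimal Python sketch of the CLOSURE computation, assuming an item is represented as a tuple (head, body, dot). That representation, the GRAMMAR dictionary and the helper show are choices made for this illustration. A worklist replaces the repeat/until loop and should compute the same set.

```python
# Augmented expression grammar: head -> list of bodies (tuples of symbols).
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}

def closure(items):
    """LR(0) closure; an item is (head, body, dot), dot being an index into body."""
    result = set(items)
    worklist = list(items)
    while worklist:
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:   # dot is before a nonterminal B
            for gamma in GRAMMAR[body[dot]]:
                new_item = (body[dot], gamma, 0)        # the item B -> .gamma
                if new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)

def show(item):
    """Render an item in the slide notation, e.g. T->.T*F."""
    head, body, dot = item
    return f"{head}->{''.join(body[:dot])}.{''.join(body[dot:])}"

I0 = closure({("E'", ("E",), 0)})
for text in sorted(show(item) for item in I0):
    print(text)
```

For the kernel item E'->.E this should print the seven items of I0 listed in the example.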
GOTO algorithm
SetOfItems GOTO(I,X) {
    J=empty;
    for (each item A->α.Xβ in I)
        add A->αX.β to J;
    return CLOSURE(J);
}
Canonical LR(0) items
void items(G’) {
    C = { CLOSURE({[S’->.S]}) };
    repeat
        for (each set of items I in C)
            for (each grammar symbol X)
                if (GOTO(I,X) is not empty and not in C)
                    add GOTO(I,X) to C;
    until no new sets of items are added to C on a round;
}
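The GOTO and items procedures can be sketched together in Python. This is an illustration under assumptions: items are tuples (head, body, dot), GRAMMAR and SYMBOLS are spelled out by hand for the expression grammar, and the sets are discovered in an order that need not match the numbering I0, I1, ... used in the slides.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}
SYMBOLS = ["E", "T", "F", "+", "*", "(", ")", "id"]

def closure(items):
    """LR(0) closure of a set of items (head, body, dot)."""
    result = set(items)
    worklist = list(items)
    while worklist:
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for gamma in GRAMMAR[body[dot]]:
                new_item = (body[dot], gamma, 0)
                if new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)

def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(h, b, d + 1) for (h, b, d) in items if d < len(b) and b[d] == symbol}
    return closure(moved)          # empty frozenset when no item has the dot before symbol

def canonical_collection():
    """All sets of LR(0) items reachable from the closure of E'->.E."""
    start = closure({("E'", ("E",), 0)})
    collection = [start]           # a list, so the discovery order is kept
    seen = {start}
    for state in collection:       # the list grows while it is being traversed
        for symbol in SYMBOLS:
            target = goto(state, symbol)
            if target and target not in seen:
                seen.add(target)
                collection.append(target)
    return collection

C = canonical_collection()
print(len(C))
```

For the expression grammar this yields twelve sets, the same count as I0 to I11.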
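Example: LR(0) automaton for the grammar
E’->E
E -> E + T | T
T -> T * F | F
F -> (E) | id
Item sets (states) and transitions:
I0=closure({[E’->.E]}): E’->.E, E->.E+T, E->.T, T->.T*F, T->.F, F->.(E), F->.id
    on E: I1, on T: I2, on F: I3, on (: I4, on id: I5
I1: E’->E., E->E.+T
    on +: I6, on $: acc
I2: E->T., T->T.*F
    on *: I7
I3: T->F.
I4: F->(.E), E->.E+T, E->.T, T->.T*F, T->.F, F->.(E), F->.id
    on E: I8, on T: I2, on F: I3, on (: I4, on id: I5
I5: F->id.
I6: E->E+.T, T->.T*F, T->.F, F->.(E), F->.id
    on T: I9, on F: I3, on (: I4, on id: I5
I7: T->T*.F, F->.(E), F->.id
    on F: I10, on (: I4, on id: I5
I8: E->E.+T, F->(E.)
    on +: I6, on ): I11
I9: E->E+T., T->T.*F
    on *: I7
I10: T->T*F.
I11: F->(E).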
Use of LR(0) automaton
• Example: id*id
[Figure: model of an LR parser; input buffer a1 … ai … an $]
• Method
• Construct C={I0,I1, … , In}, the collection of sets of LR(0) items for G’
• State i is constructed from Ii:
• If [A->α.aβ] is in Ii, a is a terminal, and Goto(Ii,a)=Ij, then set ACTION[i,a] to “shift j”
• If [A->α.] is in Ii and A is not S’, then set ACTION[i,a] to “reduce A->α” for all a in follow(A)
• If [S’->S.] is in Ii, then set ACTION[i,$] to “Accept”
• If these rules produce conflicting actions, the grammar is not SLR(1).
• If GOTO(Ii,A) = Ij for a nonterminal A, then GOTO[i,A]=j
• All entries not defined by the above rules are made “error”
• The initial state of the parser is the one constructed from the set of items
containing [S’->.S]
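To see how a table built this way is used, here is a sketch of the table-driven parsing loop for the expression grammar. The ACTION and GOTO entries were worked out by hand from the item sets I0 to I11 and FOLLOW(E)={+,),$}, FOLLOW(T)=FOLLOW(F)={+,*,),$}, so treat them as an assumption to check. The production numbering (1: E->E+T, 2: E->T, 3: T->T*F, 4: T->F, 5: F->(E), 6: F->id) and the helper names are also choices made for this example.

```python
# Production number -> (head, length of body).
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3), 4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

ACTION = {}

def fill(state, symbols, entry):
    """Set ACTION[state, a] = entry for every terminal a in symbols."""
    for symbol in symbols:
        ACTION[(state, symbol)] = entry

for s in (0, 4, 6, 7):                       # states whose items have the dot before id or (
    fill(s, ["id"], ("shift", 5))
    fill(s, ["("], ("shift", 4))
fill(1, ["+"], ("shift", 6))
fill(1, ["$"], ("accept", None))
fill(2, ["+", ")", "$"], ("reduce", 2))
fill(2, ["*"], ("shift", 7))
fill(3, ["+", "*", ")", "$"], ("reduce", 4))
fill(5, ["+", "*", ")", "$"], ("reduce", 6))
fill(8, ["+"], ("shift", 6))
fill(8, [")"], ("shift", 11))
fill(9, ["+", ")", "$"], ("reduce", 1))
fill(9, ["*"], ("shift", 7))
fill(10, ["+", "*", ")", "$"], ("reduce", 3))
fill(11, ["+", "*", ")", "$"], ("reduce", 5))

GOTO = {(0, "E"): 1, (0, "T"): 2, (0, "F"): 3, (4, "E"): 8, (4, "T"): 2,
        (4, "F"): 3, (6, "T"): 9, (6, "F"): 3, (7, "F"): 10}

def parse(tokens):
    """Return the production numbers used, in order; raise SyntaxError on error."""
    stack = [0]                              # stack of states, initial state 0
    tokens = list(tokens) + ["$"]
    pos = 0
    output = []
    while True:
        kind, arg = ACTION.get((stack[-1], tokens[pos]), ("error", None))
        if kind == "shift":
            stack.append(arg)
            pos += 1
        elif kind == "reduce":
            head, length = PRODUCTIONS[arg]
            del stack[-length:]              # pop one state per body symbol
            stack.append(GOTO[(stack[-1], head)])
            output.append(arg)
        elif kind == "accept":
            return output
        else:
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state {stack[-1]}")

print(parse(["id", "*", "id"]))
```

For id*id the reductions come out as F->id, T->F, F->id, T->T*F, E->T, the same sequence as in the shift-reduce example earlier.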
Example grammar which is not SLR(1)
S -> L=R | R
L -> *R | id
R -> L
I0: S’->.S, S->.L=R, S->.R, L->.*R, L->.id, R->.L
I1: S’->S.
I2: S->L.=R, R->L.
I3: S->R.
I4: L->*.R, R->.L, L->.*R, L->.id
I5: L->id.
I6: S->L=.R, R->.L, L->.*R, L->.id
I7: L->*R.
I8: R->L.
I9: S->L=R.

Conflict in state 2 on input =:
State    Action on =
2        Shift 6
2        Reduce R->L
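The conflict arises because = is in FOLLOW(R). A small sketch that computes the FOLLOW sets for this grammar makes that visible. It assumes a grammar with no empty productions (true here), and the function names first_sets and follow_sets are invented for this example.

```python
GRAMMAR = {
    "S'": [["S"]],
    "S": [["L", "=", "R"], ["R"]],
    "L": [["*", "R"], ["id"]],
    "R": [["L"]],
}

def first_sets(grammar):
    """FIRST sets, valid only when no production body is empty."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                x = body[0]
                new = first[x] if x in grammar else {x}
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first

def follow_sets(grammar, start):
    """FOLLOW sets by fixed-point iteration, again assuming no empty bodies."""
    first = first_sets(grammar)
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue                      # terminals have no FOLLOW set
                    if i + 1 < len(body):
                        nxt = body[i + 1]
                        new = first[nxt] if nxt in grammar else {nxt}
                    else:
                        new = follow[head]            # sym ends the body
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

follow = follow_sets(GRAMMAR, "S'")
# State 2 holds S->L.=R (shift on =) and R->L. (reduce on every a in FOLLOW(R)).
conflict = "=" in follow["R"]
print(sorted(follow["R"]), conflict)
```

Since FOLLOW(R) contains =, the SLR rules place both "shift 6" and "reduce R->L" in ACTION[2,=].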
More powerful LR parsers
S =>* aaBab => aaaBab (rightmost derivation steps)
Item [B->a.B,a] is valid for the viable prefix γ=aaa and w=ab
Constructing LR(1) sets of items
SetOfItems Closure(I) {
    repeat
        for (each item [A->α.Bβ,a] in I)
            for (each production B->γ in G’)
                for (each terminal b in First(βa))
                    add [B->.γ, b] to set I;
    until no more items are added to I;
    return I;
}
SetOfItems Goto(I,X) {
    initialize J to be the empty set;
    for (each item [A->α.Xβ,a] in I)
        add item [A->αX.β,a] to set J;
    return closure(J);
}
void items(G’){
    initialize C to { Closure({[S’->.S,$]}) };
    repeat
        for (each set of items I in C)
            for (each grammar symbol X)
                if (Goto(I,X) is not empty and not in C)
                    add Goto(I,X) to C;
    until no new sets of items are added to C;
}
Example
S’->S
S->CC
C->cC
C->d
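The LR(1) construction can be tried on this grammar with a short Python sketch. Assumptions: an item is a tuple (head, body, dot, lookahead), the FIRST sets are written out by hand, and because no symbol of this grammar derives the empty string, First(βa) is simply FIRST of the first symbol of βa. The names GRAMMAR, SYMBOLS, FIRST and items are choices made for this example.

```python
GRAMMAR = {"S'": [("S",)], "S": [("C", "C")], "C": [("c", "C"), ("d",)]}
SYMBOLS = ["S", "C", "c", "d"]
# FIRST sets written out by hand; the grammar has no empty productions.
FIRST = {"S": {"c", "d"}, "C": {"c", "d"}, "c": {"c"}, "d": {"d"}, "$": {"$"}}

def closure(items):
    """LR(1) closure of a set of items (head, body, dot, lookahead)."""
    result = set(items)
    worklist = list(items)
    while worklist:
        head, body, dot, la = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            rest = body[dot + 1:] + (la,)        # the string beta followed by a
            for b in FIRST[rest[0]]:             # First(beta a), no empty strings here
                for gamma in GRAMMAR[body[dot]]:
                    item = (body[dot], gamma, 0, b)
                    if item not in result:
                        result.add(item)
                        worklist.append(item)
    return frozenset(result)

def goto(items, symbol):
    """Advance the dot over `symbol`, keep each lookahead, then close."""
    moved = {(h, b, d + 1, a) for (h, b, d, a) in items
             if d < len(b) and b[d] == symbol}
    return closure(moved)

def items():
    """Collection of sets of LR(1) items, in discovery order."""
    start = closure({("S'", ("S",), 0, "$")})
    collection, seen = [start], {start}
    for state in collection:                     # the list grows during traversal
        for symbol in SYMBOLS:
            target = goto(state, symbol)
            if target and target not in seen:
                seen.add(target)
                collection.append(target)
    return collection

C = items()
print(len(C))
```

With one tuple per lookahead, the initial set holds six tuples ([C->.cC, c/d] counts as two), and the collection should contain ten sets.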
Canonical LR(1) parsing table
• Method
• Construct C={I0,I1, … , In}, the collection of sets of LR(1) items for G’
• State i is constructed from Ii:
• If [A->α.aβ, b] is in Ii, a is a terminal, and Goto(Ii,a)=Ij, then set ACTION[i,a] to “shift j”
• If [A->α., a] is in Ii and A is not S’, then set ACTION[i,a] to “reduce A->α”
• If [S’->S.,$] is in Ii, then set ACTION[i,$] to “Accept”
• If these rules produce conflicting actions, the grammar is not LR(1).
• If GOTO(Ii,A) = Ij for a nonterminal A, then GOTO[i,A]=j
• All entries not defined by the above rules are made “error”
• The initial state of the parser is the one constructed from the set of items
containing [S’->.S,$]
LALR Parsing Table
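• Merging LR(1) states that have the same core (grammar S’->S, S->CC, C->cC, C->d):
I4: C->d. , c/d
I7: C->d. , $
I47 (union of I4 and I7): C->d. , c/d/$
• Example of a grammar that is LR(1) but not LALR(1); merging states with the same core introduces a reduce/reduce conflict:
S’->S
S -> aAd | bBd | aBe | bAe
A -> c
B -> c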
An easy but space-consuming LALR table construction
• Method:
1. Construct C={I0,I1,…,In}, the collection of sets of LR(1) items.
2. For each core present among the sets of LR(1) items, find all sets having that core,
and replace these sets by their union.
3. Let C’={J0,J1,…,Jm} be the resulting sets. The parsing actions for state i are
constructed from Ji as before. If there is a conflict, the grammar is not
LALR(1).
4. If J is the union of one or more sets of LR(1) items, that is, J = I1 ∪ I2 ∪ … ∪ Ik,
then the cores of Goto(I1,X), …, Goto(Ik,X) are the same. Let K be the union of all
sets of items having that core; then set Goto(J,X) = K.
• This method is not efficient; a more efficient one is discussed in the
book
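Step 2 of the method (merging sets with a common core) can be sketched as follows. Items are assumed to be tuples (head, body, dot, lookahead), and the function names are invented for this example. The item sets are typed in by hand: I4 and I7 come from the S->CC example, while the two states of the grammar S -> aAd | bBd | aBe | bAe (reached after reading ac and bc) are an assumption worth checking against the full LR(1) construction.

```python
from collections import defaultdict

def core(item_set):
    """The LR(0) part of a set of LR(1) items: every lookahead is dropped."""
    return frozenset((head, body, dot) for head, body, dot, _ in item_set)

def merge_by_core(collection):
    """Replace all sets sharing a core by their union."""
    groups = defaultdict(set)
    for item_set in collection:
        groups[core(item_set)] |= set(item_set)
    return [frozenset(group) for group in groups.values()]

def reduce_reduce_conflicts(item_set):
    """Lookaheads on which two different complete items both call for a reduction."""
    by_lookahead = defaultdict(set)
    for head, body, dot, la in item_set:
        if dot == len(body):                      # complete item A -> alpha.
            by_lookahead[la].add((head, body))
    return {la for la, prods in by_lookahead.items() if len(prods) > 1}

# I4 and I7: same core C->d., different lookaheads.
I4 = {("C", ("d",), 1, "c"), ("C", ("d",), 1, "d")}
I7 = {("C", ("d",), 1, "$")}
merged = merge_by_core([I4, I7])

# States after "ac" and after "bc" in the grammar S -> aAd | bBd | aBe | bAe.
after_ac = {("A", ("c",), 1, "d"), ("B", ("c",), 1, "e")}
after_bc = {("A", ("c",), 1, "e"), ("B", ("c",), 1, "d")}
bad = merge_by_core([after_ac, after_bc])[0]

print(len(merged), sorted(reduce_reduce_conflicts(bad)))
```

Merging I4 and I7 gives the single set I47 and no conflict. Merging the other two states leaves A->c. and B->c. competing on both d and e, which is why that grammar is not LALR(1).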
Compaction of LR parsing table
E->id
State   id    +     *     (     )     $     E
2       S3                S2                6
3             R4    R4          R4    R4
4       S3                S2                7
5       S3                S2                8
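One common compaction idea is to store identical action rows only once and to replace a row whose entries are all the same reduction by a default. The sketch below applies both to the rows for states 2 to 5 above. The column names attached to the entries are an assumption (the fragment does not show its header), and compact and default_reduction are names invented for this example.

```python
# ACTION parts of rows 2 to 5; the terminal for each entry is assumed.
ACTION_ROWS = {
    2: {"id": "S3", "(": "S2"},
    3: {"+": "R4", "*": "R4", ")": "R4", "$": "R4"},
    4: {"id": "S3", "(": "S2"},
    5: {"id": "S3", "(": "S2"},
}

def compact(action_rows):
    """Store each distinct row once; return (state -> row index, list of rows)."""
    unique_rows = []
    index_of = {}
    pointer = {}
    for state in sorted(action_rows):
        key = tuple(sorted(action_rows[state].items()))
        if key not in index_of:
            index_of[key] = len(unique_rows)
            unique_rows.append(dict(key))
        pointer[state] = index_of[key]
    return pointer, unique_rows

def default_reduction(row):
    """Return the reduce entry if it is the only non-error entry of the row."""
    entries = set(row.values())
    if len(entries) == 1:
        entry = next(iter(entries))
        if entry.startswith("R"):
            return entry
    return None

pointer, rows = compact(ACTION_ROWS)
defaults = {state: default_reduction(ACTION_ROWS[state]) for state in ACTION_ROWS}
print(pointer, len(rows), defaults[3])
```

States 2, 4 and 5 end up pointing at one shared row, so four rows shrink to two, and state 3 needs only the default "R4". The GOTO entries (6, 7, 8) differ per state and would be stored separately.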