CD - R16 - UNIT III - Notes


UNIT-III

BOTTOM-UP PARSING

BASIC TERMS AND DEFINITIONS

HANDLE A handle of a string is a substring that matches the right side of a production,
and whose reduction to the non-terminal on the left side of the production
represents one step along the reverse of a rightmost derivation.
SHIFT The next input symbol is shifted onto the top of the stack.

REDUCE The parser replaces the handle within a stack with a non-terminal.

ACCEPT The parser announces successful completion of parsing.

ERROR The parser discovers that a syntax error has occurred and calls an error
recovery routine.

SHIFT-REDUCE CONFLICT The parser cannot decide whether to shift or to reduce.

REDUCE-REDUCE CONFLICT The parser cannot decide which of several reductions to make.

VIABLE PREFIX α is a viable prefix of the grammar if there is a w such that αw is a
right-sentential form.

LR PARSERS An efficient bottom-up syntax analysis technique that can be used to parse a
large class of CFGs is called LR(k) parsing. The ‘L’ is for left-to-right scanning
of the input, the ‘R’ for constructing a rightmost derivation in reverse, and the
‘k’ for the number of input symbols of lookahead.

TYPES OF LR PARSING METHODS
1. SLR - Simple LR
▪ Easiest to implement, least powerful.
2. CLR - Canonical LR
▪ Most powerful, most expensive.
3. LALR - Look-Ahead LR
▪ Intermediate in size and cost between the other two methods.

LR(0) ITEMS An LR(0) item of a grammar G is a production of G with a dot at
some position of the right side.

ABSTRACT SYNTAX TREE An AST is an important data structure in a compiler that carries
the least unnecessary information. ASTs are more compact than parse trees and can be
used easily by a compiler.

CLOSURE OPERATION If I is a set of items for a grammar G, then closure(I) is the set of
items constructed from I by the two rules:
1. Initially, every item in I is added to closure(I).
2. If A → α . Bβ is in closure(I) and B → γ is a production, then add the item
B → . γ to closure(I), if it is not already there. We apply this rule until no more
new items can be added to closure(I).

GOTO OPERATION Goto(I, X) is defined to be the closure of the set of all items
[A→ αX . β] such that [A→ α . Xβ] is in I.

LR (1) ITEM An LR (1) item is defined by a production, the position of the dot, and a
terminal symbol. The terminal is called the look-ahead symbol.

CONCEPTS

Constructing a parse tree for an input string beginning at the leaves and going towards
the root is called bottom-up parsing.
A general type of bottom-up parser is a shift-reduce parser.

Shift- Reduce Parsing


Shift-reduce parsing is a type of bottom-up parsing that attempts to construct a parse tree
for an input string beginning at the leaves (the bottom) and working up towards the root (the
top).
Example:
Consider the grammar:
S → aABe
A → Abc | b
B→d
The sentence to be recognized is abbcde.
REDUCTION (LEFTMOST)        RIGHTMOST DERIVATION

abbcde   (A → b)            S ⇒ aABe
aAbcde   (A → Abc)            ⇒ aAde
aAde     (B → d)              ⇒ aAbcde
aABe     (S → aABe)           ⇒ abbcde
S
The reductions trace out the right-most derivation in reverse.
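
The reduction sequence above can be replayed mechanically. The following is a minimal sketch, not part of the original notes: the helper names `reduce_at` and `prune` are made up for illustration, and the handle positions are written by hand rather than discovered by a parser.

```python
# Grammar from the example: S -> aABe, A -> Abc | b, B -> d
# Each entry: (left side, handle, index of the handle in the current form).
REDUCTIONS = [
    ("A", "b", 1),     # abbcde -> aAbcde
    ("A", "Abc", 1),   # aAbcde -> aAde
    ("B", "d", 2),     # aAde   -> aABe
    ("S", "aABe", 0),  # aABe   -> S
]

def reduce_at(form, lhs, rhs, pos):
    """Replace the handle rhs found at index pos by the non-terminal lhs."""
    assert form[pos:pos + len(rhs)] == rhs, "handle not found at this position"
    return form[:pos] + lhs + form[pos + len(rhs):]

def prune(sentence, reductions):
    """Apply the reductions in order and return every sentential form seen."""
    forms = [sentence]
    for lhs, rhs, pos in reductions:
        forms.append(reduce_at(forms[-1], lhs, rhs, pos))
    return forms

forms = prune("abbcde", REDUCTIONS)
print(" => ".join(reversed(forms)))  # read backwards: the rightmost derivation
```

Reading the list of forms from last to first gives the rightmost derivation, which is the sense in which the reductions trace it out in reverse.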

Handles:

A handle of a string is a substring that matches the right side of a production, and whose
reduction to the non-terminal on the left side of the production represents one step along the reverse
of a rightmost derivation.
Example:
Consider the grammar:

E → E+E
E → E*E
E → (E)
E → id
And the input string id1+id2*id3
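The rightmost derivation is :
E ⇒ E+E
⇒ E+E*E
⇒ E+E*id3
⇒ E+id2*id3
⇒ id1+id2*id3
In the above derivation the handle of each right-sentential form is the substring
introduced by the derivation step that produced it: E+E in E+E, E*E in E+E*E, id3 in
E+E*id3, id2 in E+id2*id3 and id1 in id1+id2*id3.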
Handle pruning:

A rightmost derivation in reverse can be obtained by “handle pruning”,
(i.e.) if w is a sentence or string of the grammar at hand, then w = γn, where γn is the nth
right-sentential form of some rightmost derivation.
Stack implementation of shift-reduce parsing :

Stack        Input            Action

$            id1+id2*id3 $    shift
$ id1        +id2*id3 $       reduce by E→id
$ E          +id2*id3 $       shift
$ E+         id2*id3 $        shift
$ E+id2      *id3 $           reduce by E→id
$ E+E        *id3 $           shift
$ E+E*       id3 $            shift
$ E+E*id3    $                reduce by E→id
$ E+E*E      $                reduce by E→E*E
$ E+E        $                reduce by E→E+E
$ E          $                accept

Actions in shift-reduce parser:


shift – The next input symbol is shifted onto the top of the stack.
reduce – The parser replaces the handle within a stack with a non-terminal.
accept – The parser announces successful completion of parsing.
error – The parser discovers that a syntax error has occurred and calls an error recovery
routine.

Conflicts in shift-reduce parsing:

There are two conflicts that occur in shift-reduce parsing:

1. Shift-reduce conflict: The parser cannot decide whether to shift or to reduce.

2. Reduce-reduce conflict: The parser cannot decide which of several reductions to make.
1. Shift-reduce conflict:
Example:
Consider the grammar:

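E → E+E | E*E | id and input id+id*id

With E+E on the stack and *id remaining in the input, the parser can either reduce or shift.

Choice 1: reduce first

Stack      Input    Action
$ E+E      *id $    Reduce by E→E+E
$ E        *id $    Shift
$ E*       id $     Shift
$ E*id     $        Reduce by E→id
$ E*E      $        Reduce by E→E*E
$ E        $

Choice 2: shift first

Stack      Input    Action
$ E+E      *id $    Shift
$ E+E*     id $     Shift
$ E+E*id   $        Reduce by E→id
$ E+E*E    $        Reduce by E→E*E
$ E+E      $        Reduce by E→E+E
$ E        $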

2. Reduce-reduce conflict:
Consider the grammar:
M → R+R | R+c | R
R → c
and input c+c

With R+c on the stack, the parser can reduce c by R→c or reduce R+c by M→R+c.

Choice 1: reduce by R→c

Stack      Input    Action
$          c+c $    Shift
$ c        +c $     Reduce by R→c
$ R        +c $     Shift
$ R+       c $      Shift
$ R+c      $        Reduce by R→c
$ R+R      $        Reduce by M→R+R
$ M        $

Choice 2: reduce by M→R+c

Stack      Input    Action
$          c+c $    Shift
$ c        +c $     Reduce by R→c
$ R        +c $     Shift
$ R+       c $      Shift
$ R+c      $        Reduce by M→R+c
$ M        $
Viable prefixes:
• α is a viable prefix of the grammar if there is a w such that αw is a right-sentential form.
• The set of prefixes of right-sentential forms that can appear on the stack of a shift-reduce parser
are called viable prefixes.
• The set of viable prefixes is a regular language.

OPERATOR-PRECEDENCE PARSING

Operator-precedence parsing is an efficient way of constructing a shift-reduce parser for a small
class of grammars.

An operator-precedence parser can be constructed from a grammar called an operator grammar. These
grammars have the property that no production right side is ɛ or has two adjacent non-terminals.
Example:

Consider the grammar:
E → EAE | (E) | -E | id
A → + | - | * | / | ↑

Since the right side EAE has three consecutive non-terminals, this is not an operator grammar.
Substituting for A, the grammar can be written as follows:
E → E+E | E-E | E*E | E/E | E↑E | (E) | -E | id

Operator precedence relations:


There are three disjoint precedence relations, namely
<∙ - less than
≐ - equal to
∙> - greater than
The relations give the following meaning:
a <∙ b – a yields precedence to b
a ≐ b – a has the same precedence as b
a ∙> b – a takes precedence over b

Rules for binary operations:


1. If operator θ1 has higher precedence than operator θ2, then make
θ1 ∙> θ2 and θ2 <∙ θ1
2. If operators θ1 and θ2 are of equal precedence, then make
θ1 ∙> θ2 and θ2 ∙> θ1 if the operators are left associative,
θ1 <∙ θ2 and θ2 <∙ θ1 if they are right associative.
3. Make the following for all operators θ:
θ <∙ id , id ∙> θ
θ <∙ ( , ( <∙ θ
) ∙> θ , θ ∙> )
θ ∙> $ , $ <∙ θ
Also make
( ≐ ) , ( <∙ ( , ) ∙> ) , ( <∙ id , id ∙> ) , $ <∙ id , id ∙> $ , $ <∙ ( , ) ∙> $
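
The three rules can be turned into a small table builder. The sketch below is only an illustration: the function name `build_relations` and the way precedence levels are passed in are assumptions, relations are encoded as the strings "<", "=" and ">", and "^" stands in for ↑.

```python
def build_relations(levels):
    """levels: list of (operators, associativity), lowest precedence first.
    Returns a dict mapping (a, b) to '<', '=' or '>'; missing pairs are errors."""
    prec, assoc = {}, {}
    for rank, (ops, side) in enumerate(levels):
        for op in ops:
            prec[op], assoc[op] = rank, side
    rel = {}
    for t1 in prec:
        for t2 in prec:
            if prec[t1] > prec[t2]:        # rule 1: higher precedence wins
                rel[t1, t2] = ">"
            elif prec[t1] < prec[t2]:
                rel[t1, t2] = "<"
            else:                          # rule 2: associativity decides
                rel[t1, t2] = ">" if assoc[t1] == "left" else "<"
    for t in prec:                         # rule 3: operators against id ( ) $
        rel[t, "id"] = "<"; rel["id", t] = ">"
        rel[t, "("] = "<";  rel["(", t] = "<"
        rel[")", t] = ">";  rel[t, ")"] = ">"
        rel[t, "$"] = ">";  rel["$", t] = "<"
    rel.update({("(", ")"): "=", ("(", "("): "<", (")", ")"): ">",
                ("(", "id"): "<", ("id", ")"): ">", ("$", "id"): "<",
                ("id", "$"): ">", ("$", "("): "<", (")", "$"): ">"})
    return rel

rel = build_relations([(["+", "-"], "left"), (["*", "/"], "left"), (["^"], "right")])
print(rel["+", "*"], rel["*", "+"], rel["^", "^"])
```

Pairs that never receive a relation, such as (id, id), are simply absent from the dictionary and play the role of error entries.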
Example:
Operator-precedence relations for the grammar

E → E+E | E-E | E*E | E/E | E↑E | (E) | -E | id is given in the following table assuming

1. ↑ is of highest precedence and right-associative,
2. * and / are of next highest precedence and left-associative, and
3. + and - are of lowest precedence and left-associative.
Note that the blanks in the table denote error entries.
TABLE : Operator-precedence relations
      +    -    *    /    ↑    id   (    )    $
+     ∙>   ∙>   <∙   <∙   <∙   <∙   <∙   ∙>   ∙>
-     ∙>   ∙>   <∙   <∙   <∙   <∙   <∙   ∙>   ∙>
*     ∙>   ∙>   ∙>   ∙>   <∙   <∙   <∙   ∙>   ∙>
/     ∙>   ∙>   ∙>   ∙>   <∙   <∙   <∙   ∙>   ∙>
↑     ∙>   ∙>   ∙>   ∙>   <∙   <∙   <∙   ∙>   ∙>
id    ∙>   ∙>   ∙>   ∙>   ∙>             ∙>   ∙>
(     <∙   <∙   <∙   <∙   <∙   <∙   <∙   ≐
)     ∙>   ∙>   ∙>   ∙>   ∙>             ∙>   ∙>
$     <∙   <∙   <∙   <∙   <∙   <∙   <∙

Operator precedence parsing algorithm:

Input : An input string w and a table of precedence relations.


Output : If w is well formed, a skeletal parse tree ,with a placeholder non-terminal E labeling all
interior nodes; otherwise, an error indication.
Method : Initially the stack contains $ and the input buffer the string w $. To parse, we execute
the following program :
(1) set ip to point to the first symbol of w$;
(2) repeat forever
(3)     if $ is on top of the stack and ip points to $ then
(4)         return
        else begin
(5)         let a be the topmost terminal symbol on the stack
            and let b be the symbol pointed to by ip;
(6)         if a <∙ b or a ≐ b then begin
(7)             push b onto the stack;
(8)             advance ip to the next input symbol;
            end;
(9)         else if a ∙> b then /* reduce */
(10)            repeat
(11)                pop the stack
(12)            until the top stack terminal is related by <∙
                      to the terminal most recently popped
(13)        else error( )
        end
Stack implementation of operator precedence parsing:
Operator precedence parsing uses a stack and a precedence relation table to implement the
above algorithm. It is a shift-reduce parser with all four actions: shift, reduce, accept
and error.
The initial configuration of an operator precedence parser is
STACK    INPUT
$        w $
where w is the input string to be parsed.

Example:

Consider the grammar E → E+E | E-E | E*E | E/E | E↑E | (E) | id. The input string is id+id*id. The
implementation is as follows:

STACK INPUT COMMENT


$ <∙ id+id*id $ shift id
$ id ∙> +id*id $ pop the top of the stack id
$ <∙ +id*id $ shift +
$+ <∙ id*id $ shift id
$ +id ∙> *id $ pop id
$+ <∙ *id $ shift *
$+* <∙ id $ shift id
$ + * id ∙> $ pop id
$+* ∙> $ pop *
$+ ∙> $ pop +
$ $ accept
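
The trace above can be reproduced with a short program. This is a sketch under stated assumptions rather than a full implementation: the table `REL` covers only the terminals +, *, id and $, the function name `op_precedence_parse` is made up, and, as in the trace, the stack holds terminals only (the placeholder non-terminal E is not stored).

```python
# Precedence relations restricted to the terminals used in the example.
REL = {
    ("+", "+"): ">", ("+", "*"): "<", ("+", "id"): "<", ("+", "$"): ">",
    ("*", "+"): ">", ("*", "*"): ">", ("*", "id"): "<", ("*", "$"): ">",
    ("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
    ("$", "+"): "<", ("$", "*"): "<", ("$", "id"): "<",
}

def op_precedence_parse(tokens):
    """Return the list of actions taken; raise SyntaxError on an error entry."""
    stack = ["$"]
    tokens = tokens + ["$"]
    ip = 0
    actions = []
    while True:
        a, b = stack[-1], tokens[ip]
        if a == "$" and b == "$":
            actions.append("accept")
            return actions
        relation = REL.get((a, b))
        if relation in ("<", "="):          # shift
            stack.append(b)
            ip += 1
            actions.append(f"shift {b}")
        elif relation == ">":               # reduce: pop down to the handle start
            while True:
                popped = stack.pop()
                actions.append(f"pop {popped}")
                if REL.get((stack[-1], popped)) == "<":
                    break
        else:
            raise SyntaxError(f"no relation between {a!r} and {b!r}")

print(op_precedence_parse(["id", "+", "id", "*", "id"]))
```

For id+id*id the printed actions follow the COMMENT column of the table line by line. An input such as id id stops with an error because the pair (id, id) has no entry.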

Advantages of operator precedence parsing:


It is easy to implement.
Once an operator precedence relation is made between all pairs of terminals of a grammar ,
the grammar can be ignored. The grammar is not referred anymore during implementation.

Disadvantages of operator precedence parsing:


1. It is hard to handle tokens like the minus sign (-), which has two different precedences
(unary and binary).
2. Only a small class of grammars can be parsed using an operator-precedence parser.
LR PARSERS
An efficient bottom-up syntax analysis technique that can be used to parse a large class of
CFGs is called LR(k) parsing. The ‘L’ is for left-to-right scanning of the input, the ‘R’ for constructing
a rightmost derivation in reverse, and the ‘k’ for the number of input symbols of lookahead. When ‘k’
is omitted, it is assumed to be 1.

Advantages of LR parsing:
• It recognizes virtually all programming language constructs for which CFG can be
written.
• It is an efficient non-backtracking shift-reduce parsing method.
• The class of grammars that can be parsed using LR methods is a proper superset of the class
of grammars that can be parsed with predictive parsers.
• It detects a syntactic error as soon as it is possible to do so on a left-to-right scan of the input.

Drawbacks of LR method:
It is too much work to construct an LR parser by hand for a programming language
grammar. A specialized tool, called an LR parser generator, is needed. Example: YACC.

Types of LR parsing method:


1. SLR - Simple LR
▪ Easiest to implement, least powerful.
2. CLR - Canonical LR
▪ Most powerful, most expensive.
3. LALR - Look-Ahead LR
▪ Intermediate in size and cost between the other two methods.

The LR parsing algorithm:

The schematic form of an LR parser is as follows:


It consists of : an input, an output, a stack, a driver program, and a parsing table that has two
parts (action and goto).
• The driver program is the same for all LR parsers; only the parsing table changes.

• The parsing program reads characters from an input buffer one at a time.

• The program uses a stack to store a string of the form s0X1s1X2s2…Xmsm, where sm is on
top. Each Xi is a grammar symbol and each si is a state.
• The parsing table consists of two parts : action and goto functions.

Action : The parsing program determines sm, the state currently on top of stack, and ai, the current
input symbol. It then consults action[sm,ai] in the action table which can have one of four values :
1. shift s, where s is a state,
2. reduce by a grammar production A → β,
3. accept, and
4. error.

Goto : The function goto takes a state and grammar symbol as arguments and produces a state.

LR Parsing algorithm:

Input: An input string w and an LR parsing table with functions action and goto for grammar G.

Output: If w is in L(G), a bottom-up-parse for w; otherwise, an error indication.

Method: Initially, the parser has s0 on its stack, where s0 is the initial state, and w$ in the input
buffer. The parser then executes the following program :
set ip to point to the first input symbol of w$;
repeat forever begin
    let s be the state on top of the stack
    and a the symbol pointed to by ip;
    if action[s, a] = shift s’ then begin
        push a then s’ on top of the stack;
        advance ip to the next input symbol
    end
    else if action[s, a] = reduce A→β then begin
        pop 2*|β| symbols off the stack;
        let s’ be the state now on top of the stack;
        push A then goto[s’, A] on top of the stack;
        output the production A→β
    end
    else if action[s, a] = accept then
        return
    else error( )
end
CONSTRUCTING SLR(1) PARSING TABLE:

To perform SLR parsing, take the grammar as input and do the following:

1. Find the LR(0) items.
2. Complete the closure.
3. Compute goto(I,X), where I is a set of items and X is a grammar symbol.

LR(0) items:
An LR(0) item of a grammar G is a production of G with a dot at some position of the
right side. For example, production A → XYZ yields the four items :
A → . XYZ
A → X . YZ
A → XY . Z
A → XYZ .

Closure operation:
If I is a set of items for a grammar G, then closure(I) is the set of items constructed from
I by the two rules:
1. Initially, every item in I is added to closure(I).
2. If A → α . Bβ is in closure(I) and B → γ is a production, then add the item B → . γ
to closure(I), if it is not already there. We apply this rule until no more new items can
be added to closure(I).
Goto operation:
Goto(I, X) is defined to be the closure of the set of all items [A→ αX . β] such
that [A→ α . Xβ] is in I.
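
The closure and goto operations can be sketched directly from these definitions. In the code below, which is only an illustration, an item is a pair (production index, dot position), the grammar is the augmented expression grammar used in the SLR example that follows, and the names `closure`, `goto` and `canonical_collection` are chosen for this sketch.

```python
GRAMMAR = [
    ("E'", ("E",)),            # 0: augmenting production
    ("E", ("E", "+", "T")),    # 1
    ("E", ("T",)),             # 2
    ("T", ("T", "*", "F")),    # 3
    ("T", ("F",)),             # 4
    ("F", ("(", "E", ")")),    # 5
    ("F", ("id",)),            # 6
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}
SYMBOLS = ["E", "T", "F", "(", "id", "+", "*", ")"]

def closure(items):
    """Add B -> . gamma for every item with the dot in front of B."""
    result, work = set(items), list(items)
    while work:
        prod, dot = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for q, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (q, 0) not in result:
                    result.add((q, 0))
                    work.append((q, 0))
    return frozenset(result)

def goto(items, symbol):
    """Move the dot over symbol in every item that allows it, then close."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)

def canonical_collection():
    states = [closure({(0, 0)})]
    for state in states:               # the list grows while it is traversed
        for x in SYMBOLS:
            nxt = goto(state, x)
            if nxt and nxt not in states:
                states.append(nxt)
    return states

states = canonical_collection()
print(len(states), "sets of LR(0) items")
```

With the symbol order chosen here the sets come out numbered I0 to I11 in the same order as in the worked example; a different traversal order would give the same sets under different numbers.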
Steps to construct SLR parsing table for grammar G are:

1. Augment G and produce G’


2. Construct the canonical collection of set of items C for G’
3. Construct the parsing action function action and goto using the following
algorithm that requires FOLLOW(A) for each non-terminal of grammar.

Algorithm for construction of SLR parsing table:


Input : An augmented grammar G’
Output : The SLR parsing table functions action and goto for G’
Method :
1. Construct C = {I0, I1, …. In}, the collection of sets of LR(0) items for G’.
2. State i is constructed from Ii. The parsing functions for state i are determined as
follows:
(a) If [A→α∙aβ] is in Ii and goto(Ii,a) = Ij, then set action[i,a] to “shift j”.
Here a must be a terminal.
(b) If [A→α∙] is in Ii, then set action[i,a] to “reduce A→α” for all a in
FOLLOW(A). Here A may not be S’.
(c) If [S’→S∙] is in Ii, then set action[i,$] to “accept”.

If any conflicting actions are generated by the above rules, we say grammar is not SLR(1).
3. The goto transitions for state i are constructed for all non-terminals A using the rule: If
goto(Ii,A) = Ij, then goto[i,A] = j.
4. All entries not defined by rules (2) and (3) are made “error”
5. The initial state of the parser is the one constructed from the set of items containing
[S’→.S].

Example for SLR parsing:

Construct SLR parsing for the following grammar :


G : E → E + T | T
    T → T * F | F
    F → (E) | id
The given grammar is :
G : E → E + T ------ (1)
E →T------ (2)
T → T * F ------ (3)
T → F------ (4)
F → (E)------ (5)
F → id------ (6)

Step 1 : Convert given grammar into augmented grammar.


Augmented grammar:
E’ → E
E →E+T
E →T
T →T*F
T →F
F → (E)
F → id
Step 2 : Find LR (0) items.

I0 : E’ → . E
     E → . E + T
     E → . T
     T → . T * F
     T → . F
     F → . (E)
     F → . id

GOTO ( I0 , E )
I1 : E’ → E .
     E → E . + T

GOTO ( I0 , T )
I2 : E → T .
     T → T . * F

GOTO ( I0 , F )
I3 : T → F .

GOTO ( I0 , ( )
I4 : F → ( . E )
     E → . E + T
     E → . T
     T → . T * F
     T → . F
     F → . (E)
     F → . id

GOTO ( I0 , id )
I5 : F → id .

GOTO ( I1 , + )
I6 : E → E + . T
     T → . T * F
     T → . F
     F → . (E)
     F → . id

GOTO ( I2 , * )
I7 : T → T * . F
     F → . (E)
     F → . id

GOTO ( I4 , E )
I8 : F → ( E . )
     E → E . + T

GOTO ( I4 , T ) = I2
GOTO ( I4 , F ) = I3
GOTO ( I4 , ( ) = I4
GOTO ( I4 , id ) = I5

GOTO ( I6 , T )
I9 : E → E + T .
     T → T . * F

GOTO ( I6 , F ) = I3
GOTO ( I6 , ( ) = I4
GOTO ( I6 , id ) = I5

GOTO ( I7 , F )
I10 : T → T * F .

GOTO ( I7 , ( ) = I4
GOTO ( I7 , id ) = I5

GOTO ( I8 , ) )
I11 : F → ( E ) .

GOTO ( I8 , + ) = I6
GOTO ( I9 , * ) = I7

FOLLOW (E) = { $ , ) , + }
FOLLOW (T) = { $ , + , ) , * }
FOLLOW (F) = { * , + , ) , $ }
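
The FOLLOW sets can be checked with a small fixed-point computation. The sketch below is simplified on purpose: it assumes a grammar without ɛ-productions (true for this example), and the function names `first_sets` and `follow_sets` are illustrative.

```python
GRAMMAR = [
    ("E", ["E", "+", "T"]), ("E", ["T"]),
    ("T", ["T", "*", "F"]), ("T", ["F"]),
    ("F", ["(", "E", ")"]), ("F", ["id"]),
]
NONTERMINALS = {"E", "T", "F"}

def first_sets():
    """FIRST of each non-terminal; valid only when no production derives epsilon."""
    first = {a: set() for a in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            x = rhs[0]
            new = first[x] if x in NONTERMINALS else {x}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first

def follow_sets(start="E"):
    first = first_sets()
    follow = {a: set() for a in NONTERMINALS}
    follow[start].add("$")                       # end marker follows the start symbol
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, x in enumerate(rhs):
                if x not in NONTERMINALS:
                    continue
                if i + 1 < len(rhs):             # x is followed by a symbol y
                    y = rhs[i + 1]
                    new = first[y] if y in NONTERMINALS else {y}
                else:                            # x ends the right side
                    new = follow[lhs]
                if not new <= follow[x]:
                    follow[x] |= new
                    changed = True
    return follow

follow = follow_sets()
for name in ("E", "T", "F"):
    print(name, sorted(follow[name]))
```

These are the sets that rule (b) of the table construction uses to place the reduce entries.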

SLR parsing table:

              ACTION                        GOTO

State   id    +     *     (     )     $     E    T    F

I0      s5                s4                1    2    3
I1            s6                      ACC
I2            r2    s7          r2    r2
I3            r4    r4          r4    r4
I4      s5                s4                8    2    3
I5            r6    r6          r6    r6
I6      s5                s4                     9    3
I7      s5                s4                          10
I8            s6                s11
I9            r1    s7          r1    r1
I10           r3    r3          r3    r3
I11           r5    r5          r5    r5
Blank entries are error entries.
Stack implementation:

Check whether the input (id + id) * id is valid or not.
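
One way to carry out this check is to run the LR driver over the SLR table. The following is a sketch, not the notes' own implementation: the dictionaries `ACTION`, `GOTO` and `PRODUCTIONS` are a hand transcription of the table and the numbered productions (1) to (6), the function name `lr_parse` is made up, and the stack keeps states only, so a reduction pops |β| entries instead of 2*|β|.

```python
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3),   # number: (left side, |right side|)
               4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

def row(**entries):
    """Build one ACTION row; lp, rp and end stand for '(', ')' and '$'."""
    names = {"plus": "+", "star": "*", "lp": "(", "rp": ")", "end": "$"}
    return {names.get(k, k): v for k, v in entries.items()}

def reduce_row(r, star=None):
    return row(plus=r, star=star or r, rp=r, end=r)

ACTION = {
    0: row(id="s5", lp="s4"), 1: row(plus="s6", end="acc"),
    2: reduce_row("r2", star="s7"), 3: reduce_row("r4"),
    4: row(id="s5", lp="s4"), 5: reduce_row("r6"),
    6: row(id="s5", lp="s4"), 7: row(id="s5", lp="s4"),
    8: row(plus="s6", rp="s11"), 9: reduce_row("r1", star="s7"),
    10: reduce_row("r3"), 11: reduce_row("r5"),
}
GOTO = {0: {"E": 1, "T": 2, "F": 3}, 4: {"E": 8, "T": 2, "F": 3},
        6: {"T": 9, "F": 3}, 7: {"F": 10}}

def lr_parse(tokens):
    """Return the production numbers used, in order, or None on a syntax error."""
    stack = [0]                       # states only; symbols are implied by states
    tokens = tokens + ["$"]
    ip, output = 0, []
    while True:
        act = ACTION[stack[-1]].get(tokens[ip])
        if act is None:               # blank entry: error
            return None
        if act == "acc":
            return output
        if act[0] == "s":             # shift: push the new state, advance input
            stack.append(int(act[1:]))
            ip += 1
        else:                         # reduce by production n
            n = int(act[1:])
            lhs, length = PRODUCTIONS[n]
            del stack[-length:]
            stack.append(GOTO[stack[-1]][lhs])
            output.append(n)

print(lr_parse(["(", "id", "+", "id", ")", "*", "id"]))
```

The input (id + id) * id is accepted, and the returned production numbers, read in reverse, give a rightmost derivation. A string such as id + * id returns None because state I6 has no entry for *.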

Canonical LR (1) Parsing

Various steps involved in the CLR (1) Parsing:

1. Write the context-free grammar for the given input string.

2. Check for ambiguity.

3. Add the augment production.

4. Create the canonical collection of LR (1) items.

5. Draw the DFA.

6. Construct the CLR (1) parsing table.

7. Based on the information from the table, generate the output with the help of the stack and
the parsing algorithm.
LR (1) item
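An LR (1) item is defined by a production, the position of the dot and a terminal symbol.
The terminal is called the look-ahead symbol.
The general form of an LR (1) item is
A->α•β , a
where a is a terminal or $. For example, S->α•Aβ , $ is an LR (1) item with look-ahead $.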

Rules to create canonical collection:


1. Every element of I is added to the closure of I.
2. If an LR (1) item [X->A•BC, a] exists in I, and there exists a production
B->b1b2….., then add the item [B->•b1b2….., z], where z is a terminal in FIRST(Ca), if it is
not already in Closure(I). Keep applying this rule until no more elements are
added.

Ex: if the Grammar is

S->CC
C->cC
C->d

the canonical collection of LR (1) items can be created as follows. The productions are numbered:

0 S`->•S (Augment Production)
1 S->•CC
2 C->•cC
3 C->•d

I0 State:

Add the augment production and compute the closure. The look-ahead symbol for the augment
production is $.

I0 = Closure(S`->•S, $)

The dot is followed by the non-terminal S. So, add the productions starting with S to the
I0 State. Using the 2nd rule, the look-ahead is FIRST($) = {$}:
S->•CC, $
The dot is followed by the non-terminal C. So, add the productions starting with C to the
I0 State, with look-ahead FIRST(C$):
C->•cC, FIRST(C$)
C->•d, FIRST(C$)
FIRST(C$) = FIRST(C) = {c, d} so, the items are

C->•cC, c/d
C->•d, c/d
In the new items the dot is followed by a terminal, so nothing more can be added and the I0
State is closed. The items in I0 are
S`->•S , $
S->•CC , $
C->•cC, c/d
C->•d , c/d
I1 = Goto ( I0, S) = Closure(S`->S•, $) = S`->S•, $

I2 = Goto ( I0, C) = Closure(S->C•C, $)
The dot is followed by the non-terminal C with look-ahead FIRST($) = {$}, so add
C->•cC, $
C->•d, $
So, the I2 State is

S->C•C, $
C->•cC, $
C->•d, $

I3 = Goto ( I0, c) = Closure(C->c•C, c/d)
Add
C->•cC, c/d
C->•d, c/d
So, the I3 State is

C->c•C, c/d
C->•cC, c/d
C->•d, c/d

I4 = Goto ( I0, d) = Closure(C->d•, c/d) = C->d•, c/d

I5 = Goto ( I2, C) = Closure(S->CC•, $) = S->CC•, $

I6 = Goto ( I2, c) = Closure(C->c•C, $)
Add
C->•cC, $
C->•d, $
So, the I6 State is

C->c•C, $
C->•cC, $
C->•d, $

I7 = Goto ( I2, d) = Closure(C->d•, $) = C->d•, $

I8 = Goto ( I3, C) = Closure(C->cC•, c/d) = C->cC•, c/d
Goto ( I3, c) = Closure(C->c•C, c/d) = I3
Goto ( I3, d) = Closure(C->d•, c/d) = I4

I9 = Goto ( I6, C) = Closure(C->cC•, $) = C->cC•, $
Goto ( I6, c) = Closure(C->c•C, $) = I6
Goto ( I6, d) = Closure(C->d•, $) = I7
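
The same collection can be generated by a short program. This sketch makes some assumptions: an LR (1) item is a triple (production index, dot position, look-ahead), the FIRST sets are written by hand, FIRST(βa) is taken from the first symbol of βa only (valid here because the grammar has no ɛ-productions), and the function names are illustrative.

```python
GRAMMAR = [
    ("S'", ("S",)),       # 0: augment production
    ("S", ("C", "C")),    # 1
    ("C", ("c", "C")),    # 2
    ("C", ("d",)),        # 3
]
NONTERMINALS = {"S'", "S", "C"}
FIRST = {"S": {"c", "d"}, "C": {"c", "d"}, "c": {"c"}, "d": {"d"}, "$": {"$"}}

def closure(items):
    """Rule 2: for [X -> A . B C, a] add [B -> . gamma, z] for z in FIRST(Ca)."""
    result, work = set(items), list(items)
    while work:
        prod, dot, la = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            beta_a = rhs[dot + 1:] + (la,)      # what follows B, then the look-ahead
            for q, (lhs, _) in enumerate(GRAMMAR):
                if lhs != rhs[dot]:
                    continue
                for z in FIRST[beta_a[0]]:
                    if (q, 0, z) not in result:
                        result.add((q, 0, z))
                        work.append((q, 0, z))
    return frozenset(result)

def goto(items, symbol):
    moved = {(p, d + 1, la) for p, d, la in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)

def canonical_collection():
    states = [closure({(0, 0, "$")})]
    for state in states:                 # the list grows while it is traversed
        for x in ("S", "C", "c", "d"):
            nxt = goto(state, x)
            if nxt and nxt not in states:
                states.append(nxt)
    return states

states = canonical_collection()
print(len(states), "sets of LR(1) items")
```

With the symbol order S, C, c, d the sets are produced in the order I0 to I9 used above, so `states[4]` holds C->d• with look-aheads c/d and `states[7]` holds C->d• with look-ahead $.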
Drawing DFA for the above LR (1) items

[Figure: DFA over the states I0 to I9. Each Goto computed above is one edge: I0 goes to I1 on S,
I2 on C, I3 on c and I4 on d; I2 goes to I5 on C, I6 on c and I7 on d; I3 goes to I8 on C, I3 on c
and I4 on d; I6 goes to I9 on C, I6 on c and I7 on d.]
Construction of CLR (1) Table

Rule1: if there is an item [A->α•Xβ,b] in Ii and goto(Ii,X) = Ij, then set action [Ii][X]= Shift
j, where X is a terminal.
Rule2: if there is an item [A->α•, b] in Ii and A≠S`, then set action [Ii][b]= reduce by A->α,
written as R followed by the production number.
Rule3: if there is an item [S`->S•, $] in Ii, then set action [Ii][$]= Accept.
Rule4: if there is an item [A->α•Xβ,b] in Ii and goto(Ii,X) = Ij, then set goto [Ii][X]= j,
where X is a non-terminal.

States        ACTION                  GOTO

          c       d       $          S     C
I0        S3      S4                 1     2
I1                        ACCEPT
I2        S6      S7                       5
I3        S3      S4                       8
I4        R3      R3
I5                        R1
I6        S6      S7                       9
I7                        R3
I8        R2      R2
I9                        R2

LR (1) Table

LALR (1) Parsing

The CLR parser avoids many of the conflicts that arise in an SLR parse table, but it produces more
states than an SLR parser, so its table occupies more space in memory. LALR parsing can be
used instead: the tables obtained are smaller than the CLR parse table (they have as many states as
the SLR table), and they are adequate for most programming-language grammars, although merging
states can occasionally introduce a reduce-reduce conflict that the CLR table does not have. Here, sets
of LR (1) items that have the same productions and dot positions but different look-aheads are
combined to form a single set of items.
For example, consider the grammar in the previous example. Consider the states I4 and I7
as given below:
I4 = Goto ( I0, d) = Closure(C->d•, c/d) = C->d•, c/d
I7 = Goto ( I2, d) = Closure(C->d•, $) = C->d•, $

These states differ only in their look-aheads. They have the same productions.
Hence these states are combined to form a single state called I47.

Similarly, the states I3 and I6 differ only in their look-aheads, as given below:
I3 = Goto ( I0, c) =
C->c•C, c/d
C->•cC, c/d
C->•d, c/d
I6 = Goto ( I2, c) =
C->c•C, $
C->•cC, $
C->•d, $
These states differ only in their look-aheads. They have the same productions.
Hence these states are combined to form a single state called I36.
Similarly, the states I8 and I9 differ only in their look-aheads. Hence they are combined to form
the state I89.

States        ACTION                  GOTO

          c       d       $          S     C
I0        S36     S47                1     2
I1                        ACCEPT
I2        S36     S47                      5
I36       S36     S47                      89
I47       R3      R3      R3
I5                        R1
I89       R2      R2      R2

LALR Table
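
The merging step can be sketched as grouping states by their items with the look-aheads dropped. In the code below, which is an illustration only, the ten CLR (1) states are transcribed by hand as sets of (item, look-ahead) pairs with '.' marking the dot, and the function name `merge_by_core` is made up.

```python
def items(core, lookaheads):
    """Pair every dotted production in core with every look-ahead."""
    return {(item, la) for item in core for la in lookaheads}

LR1_STATES = {
    0: items(["S'->.S", "S->.CC"], "$") | items(["C->.cC", "C->.d"], "cd"),
    1: items(["S'->S."], "$"),
    2: items(["S->C.C", "C->.cC", "C->.d"], "$"),
    3: items(["C->c.C", "C->.cC", "C->.d"], "cd"),
    4: items(["C->d."], "cd"),
    5: items(["S->CC."], "$"),
    6: items(["C->c.C", "C->.cC", "C->.d"], "$"),
    7: items(["C->d."], "$"),
    8: items(["C->cC."], "cd"),
    9: items(["C->cC."], "$"),
}

def merge_by_core(states):
    """Union the states whose items agree once look-aheads are ignored."""
    groups = {}
    for number in sorted(states):
        core = frozenset(item for item, _ in states[number])
        groups.setdefault(core, []).append(number)
    merged = {}
    for numbers in groups.values():
        name = "".join(str(n) for n in numbers)    # e.g. states 3 and 6 become "36"
        merged[name] = set().union(*(states[n] for n in numbers))
    return merged

lalr = merge_by_core(LR1_STATES)
print(sorted(lalr))
```

The ten CLR (1) states collapse to the seven LALR states 0, 1, 2, 36, 47, 5 and 89, and the merged state 47 carries the look-aheads c, d and $, which is why its row reduces by R3 in all three ACTION columns.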
Conflicts in the CLR (1) Parsing

When multiple entries occur in the same cell of the table, the situation is said to be a conflict.

Shift-Reduce Conflict in CLR (1) Parsing

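A shift-reduce conflict in CLR (1) parsing occurs when a state has
1. a reduced item of the form A->α•, a and
2. an incomplete item of the form B->β•aγ, b
that is, when the terminal after the dot in the incomplete item is also the look-ahead of the
reduced item. For example, if the state Ii contains the items

1 A->β•aα , $
2 B->b• , a

and Goto(Ii, a) = Ij, then the entry for Ii on input a holds both a shift and a reduce:

States        ACTION          GOTO
          a         $         A     B
Ii        Sj/r2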

Reduce - Reduce Conflict in CLR (1) Parsing

A reduce-reduce conflict in CLR (1) parsing occurs when a state has two or more
reduced items of the form
1. A->α•, a
2. B->β•, a
that is, when two productions in a state Ii reduce on the same look-ahead symbol,
as shown below:

1 A->α• , a
2 B->β• , a

States        ACTION          GOTO
          a         $         A     B
Ii        r1/r2
String Acceptance using LR Parsing:
Consider the above example with the input string cdd. The parse uses the CLR (1) table
constructed above and the numbered productions

1 S->CC
2 C->cC
3 C->d

STACK        INPUT    ACTION

$0           cdd$     Shift S3
$0c3         dd$      Shift S4
$0c3d4       d$       Reduce with R3, C->d; pop 2*|β| = 2 symbols from the stack
$0c3C        d$       Goto ( I3, C) = 8
$0c3C8       d$       Reduce with R2, C->cC; pop 2*|β| = 4 symbols from the stack
$0C          d$       Goto ( I0, C) = 2
$0C2         d$       Shift S7
$0C2d7       $        Reduce with R3, C->d; pop 2*|β| = 2 symbols from the stack
$0C2C        $        Goto ( I2, C) = 5
$0C2C5       $        Reduce with R1, S->CC; pop 2*|β| = 4 symbols from the stack
$0S          $        Goto ( I0, S) = 1
$0S1         $        Accept
LL Parsers vs LR Parsers:

• LL does a leftmost derivation, LR does a rightmost derivation in reverse.

• LL starts with only the root nonterminal on the stack, LR ends with only the root nonterminal
on the stack.

• LL ends when the stack is empty. But, LR starts with an empty stack.

• LL uses the stack for designating what is still to be expected, LR uses the stack for
designating what is already seen.

• LL builds the parse tree top down. But, LR builds the parse tree bottom up.

• LL continuously pops a nonterminal off the stack, and pushes a corresponding right hand side.
But, LR tries to recognize a right hand side on the stack, pops it, and pushes the corresponding
nonterminal.

• LL thus expands nonterminals, while LR reduces to them.

• LL reads a terminal when it pops one off the stack, LR reads terminals while it pushes them on
the stack.

• LL uses grammar rules in an order which corresponds to pre-order traversal of the parse tree,
LR does a post-order traversal.
IMPORTANT QUESTIONS

1. Write the steps for the efficient construction of LALR parsing table. Explain with an example.
Answer: see the section “LALR (1) Parsing” above, which gives the merging steps and the worked example.

2. Write about SR conflicts and RR conflicts of shift reduce parsers.
Answer: see the section “Conflicts in shift-reduce parsing” above.

3. Explain the way to implement a shift-reduce parser using a stack by taking an input string for a
grammar
Answer: see the sections “Shift- Reduce Parsing”, “Handles” and “Stack implementation of shift-reduce parsing” above.

4. Consider the following grammar and construct the CLR parsing table.
S→CC
C→cC
C→d

Answer: see the section “Canonical LR (1) Parsing” above, which builds the LR (1) items and the CLR (1) table for this grammar.

5. Write down the advantages and disadvantages of LR parsers.


LR parsers are used to parse a large class of context-free grammars. The technique is called LR(k)
parsing.
• L is for left-to-right scanning of the input.
• R is for constructing a rightmost derivation in reverse.
• k is the number of input symbols of lookahead that are used in making parsing decisions.
There are three widely used algorithms available for constructing an LR parser:
• SLR(1) - Simple LR
o Works on the smallest class of grammars.
o Few states, hence a very small table.
o Simple and fast construction.
• LR(1) - LR parser
o Also called the Canonical LR parser.
o Works on the complete set of LR(1) grammars.
o Generates a large table and a large number of states.
o Slow construction.
• LALR(1) - Look-ahead LR parser
o Works on a class of grammars intermediate in size between SLR(1) and LR(1).
o The number of states is the same as in SLR(1).
Advantages of LR parser
• LR parsers can handle a large class of context-free grammars.
• The LR parsing method is the most general non-backtracking shift-reduce parsing method.
• An LR parser can detect syntax errors as soon as they occur in a left-to-right scan of the input.
• LR grammars can describe more languages than LL grammars.
Drawbacks of LR parsers
• It is too much work to construct an LR parser by hand. It needs an automated parser generator.
• If the grammar contains ambiguities or other constructs that are difficult to parse in a left-to-right
scan of the input, the generated table contains conflicts.

6. What is an augmented grammar? Describe with an example.


Before the transitions between the different states are determined, the grammar is augmented
with an extra rule
S’ → S
where S’ is a new start symbol and S is the old start symbol. The parser uses this rule for a reduction
exactly when it has accepted the whole input string.

Consider the grammar
E -> T+E | T
T -> id

Augmented grammar:
E’ -> E
E -> T+E | T
T -> id
