Unit-II (Part-B) Bottom Up Parsing Techniques

The document discusses bottom-up parsing and shift-reduce parsing. Bottom-up parsing involves reducing the input string to the starting symbol by replacing substrings matching grammar rules. Shift-reduce parsing implements bottom-up parsing using a stack, shifting input symbols onto the stack until a reduction is possible.


Compiler Design (KCS-502)

3rd year (Semester – V)


Session – 2023 - 24
Unit – II
Part – B
Ratish Srivastava
Asst. Prof.
CSE Dept.
UCER, Prayagraj
Bottom-Up Parsing
• In bottom-up parsing, we start with the input string and try
to obtain the start symbol of the grammar through successive
reductions.
• If the input string can be reduced to the start symbol, the
parse is successful; otherwise it is unsuccessful.
• The parse tree is constructed from the bottom upwards, that
is, from the leaves to the root.
• In this process, the input symbols are placed at the leaf
nodes.
• The leaf nodes are reduced to internal nodes, these internal
nodes are reduced further, and eventually a root node is
obtained.
Bottom-Up Parsing
• So, in this technique, we reduce the input string to the
start symbol of the grammar.
• At each reduction step, a particular substring matching the
right side of a production is replaced by the symbol on
the left of that production.
• Bottom-up parsing involves selecting a substring that
matches the right side of a production; its reduction
to the non-terminal on the left side of the production
represents one step along the reverse of a rightmost
derivation.
• That is, each reduction recovers the previous step of the
rightmost derivation.
Operator Precedence Grammar
• No Ɛ-productions.
• No two adjacent non-terminals on the right side of any production.
Eg.
E → E op E | id
op → + | *
The above grammar is not an operator grammar (op is a
non-terminal adjacent to E), but the following is:
E → E + E | E * E | id
Operator Precedence Grammar
• If a has higher precedence than b: a .> b
• If a has lower precedence than b: a <. b
• If a and b have equal precedence: a =. b
Note:
• id has higher precedence than any other symbol.
• $ has the lowest precedence.
• If two operators have equal precedence, then we check the associativity of that
particular operator.
Operator Precedence Table

        id    +     *     $
  id          .>    .>    .>
  +     <.    .>    <.    .>
  *     <.    .>    .>    .>
  $     <.    <.    <.    .>

Example: w = $id + id * id$

$ <. id .> + <. id .> * <. id .> $
Basic Principle
• Scan the input string left to right and try to
detect .>; put a pointer at its location.
• Now scan backwards till reaching <. .
• The string between <. and .> is our handle.
• Replace the handle by the head of the
respective production.
• REPEAT until the start symbol is reached.
Algorithm
w ← input string followed by $
a ← first input symbol
repeat forever
{
    b ← topmost terminal on stack
    if (a is $ and b is $)
        return            // accept
    else if (b <. a or b =. a)        // shift
    {
        push a onto the stack
        move the input pointer; a ← next input symbol
    }
    else if (b .> a)                  // reduce
    {
        repeat
            c ← pop stack
        until (topmost terminal on stack <. c)
    }
    else
        error()
}
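The loop above can be sketched directly in Python; a minimal sketch for the precedence table just given, where `parse` and the token-list format are illustrative names (not part of the slides):

```python
# A minimal operator-precedence parsing loop for the table above.
PREC = {  # PREC[stack_top][lookahead] -> '<', '=', '>' or None (error)
    'id': {'+': '>', '*': '>', '$': '>'},
    '+':  {'id': '<', '+': '>', '*': '<', '$': '>'},
    '*':  {'id': '<', '+': '>', '*': '>', '$': '>'},
    '$':  {'id': '<', '+': '<', '*': '<'},
}

def parse(tokens):
    """Return True if tokens (without $) are accepted under the relations."""
    tokens = list(tokens) + ['$']
    stack = ['$']                      # only terminals are kept on the stack
    i = 0
    while True:
        a, b = tokens[i], stack[-1]
        if a == '$' and b == '$':
            return True                # accept
        rel = PREC[b].get(a)
        if rel in ('<', '='):          # shift
            stack.append(a)
            i += 1
        elif rel == '>':               # reduce: pop back to the <. mark
            popped = stack.pop()
            while PREC[stack[-1]].get(popped) != '<':
                popped = stack.pop()
        else:
            return False               # no relation holds: error

print(parse(['id', '+', 'id', '*', 'id']))  # True
```

Running it on $id + id * id$ performs exactly the shifts and reductions traced in the example that follows.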
Example

STACK     INPUT             ACTION/REMARK
$         id + id * id$     $ <. id, shift
$id       + id * id$        id .> +, reduce
$         + id * id$        $ <. +, shift
$+        id * id$          + <. id, shift
$+id      * id$             id .> *, reduce
$+        * id$             + <. *, shift
$+*       id$               * <. id, shift
$+*id     $                 id .> $, reduce
$+*       $                 * .> $, reduce
$+        $                 + .> $, reduce
$         $                 accept
Precedence Functions
Operator precedence parsers use precedence functions that map
terminal symbols to integers.
Algorithm for Constructing Precedence Functions
1. Create symbols fa and ga for each grammar terminal a and for the
end-of-string symbol $.
2. Partition the symbols into groups so that fa and gb are in the
same group if a =· b (there can be symbols in the same group
even if they are not connected by this relation).
3. Create a directed graph whose nodes are the groups; then, for
each pair of symbols a and b, place an edge from the group of
gb to the group of fa if a <· b; otherwise, if a ·> b, place an edge
from the group of fa to that of gb.
4. If the constructed graph has a cycle, then no precedence
functions exist. When there are no cycles, let f(a) and g(a) be the
lengths of the longest paths beginning at the groups of fa and ga
respectively.
Consider the precedence relation table given earlier.

Resulting graph (nodes: gid, fid, f*, g*, g+, f+, f$, g$; diagram omitted).
From the previous graph we extract the
following precedence functions:

       id    +    *    $
  f    4     2    4    0
  g    5     1    3    0
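The four algorithm steps can be sketched as follows. This is a sketch for this particular table, under the assumption that it contains no =· relations (so every fa / ga is its own group, and step 2 merges nothing); cycle detection is folded into the longest-path search:

```python
# Precedence functions from the relation table, via longest paths in a graph.
RELS = {  # (a, b) -> relation of a to b
    ('id', '+'): '>', ('id', '*'): '>', ('id', '$'): '>',
    ('+', 'id'): '<', ('+', '+'): '>', ('+', '*'): '<', ('+', '$'): '>',
    ('*', 'id'): '<', ('*', '+'): '>', ('*', '*'): '>', ('*', '$'): '>',
    ('$', 'id'): '<', ('$', '+'): '<', ('$', '*'): '<',
}
TERMS = ['id', '+', '*', '$']

# Step 3: a <. b gives an edge gb -> fa; a .> b gives an edge fa -> gb.
edges = {(s, t): set() for s in 'fg' for t in TERMS}
for (a, b), rel in RELS.items():
    if rel == '<':
        edges[('g', b)].add(('f', a))
    else:
        edges[('f', a)].add(('g', b))

# Step 4: f(a), g(a) = longest path lengths; a cycle means no functions exist.
def longest(n, seen=()):
    if n in seen:
        raise ValueError('cycle: no precedence functions exist')
    return max((1 + longest(m, seen + (n,)) for m in edges[n]), default=0)

f = {t: longest(('f', t)) for t in TERMS}
g = {t: longest(('g', t)) for t in TERMS}
print(f)  # {'id': 4, '+': 2, '*': 4, '$': 0}
print(g)  # {'id': 5, '+': 1, '*': 3, '$': 0}
```

The computed values reproduce the f and g rows of the table above.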
Handles
• A handle of a string is a substring that matches
the right side of a production and whose
reduction to the non-terminal on the left side of
the production produces the previous right
sentential form.
• The string appearing to the right of a handle
contains only terminal symbols.
• If a grammar is unambiguous, then every right
sentential form of the grammar has exactly one
handle.
Handles
• Formally, we define a handle of a right
sentential form γ as a production A → β together with a
position of γ at which the string β may be found and
replaced by A to produce the previous right sentential form
in a rightmost derivation of γ.
• This means that if S →* αAω → αβω are steps in a
rightmost derivation, then A → β in the
position following α is a handle of γ = αβω.
• All the symbols to the right of the handle are
terminal symbols.
Handles
Consider an example:
E → E + E | E * E | id
A rightmost derivation of id1 + id2 * id3 is
E → E + E
  → E + E * E
  → E + E * id3
  → E + id2 * id3
  → id1 + id2 * id3
id1 is a handle of the right sentential form id1 + id2 * id3
because id is the right side of the production E → id and
replacing id1 by E produces the previous right sentential
form E + id2 * id3.
Handles
• The most important aspect of bottom-up
parsing is the process of detecting handles
and using them in reductions.
• This process is called handle pruning.
• So, a rightmost derivation in reverse can be
obtained by handle pruning.
Handle Pruning
• Handle pruning obtains a rightmost derivation
in reverse: start with a string of terminals ω; if ω
is a sentence of the grammar, then ω = γn, where γn
is the nth right sentential form of some rightmost derivation
S = γ0 → γ1 → γ2 → … → γn = ω
• To reconstruct this derivation in reverse order, we
locate the handle βn in γn and replace βn by
the left side of the production An → βn, and we
repeat this process until we reach the start symbol.
Handle Pruning
Example:
Consider the following grammar and show the
handle of each right sentential form for the
string (a, (a, a)).
S → (L) | a
L → L, S | S
Handle Pruning
Solution:
The right most derivation of the string (a, (a, a)) is
S → (L)
→ (L, S)
→ (L, (L))
→ (L, (L, S))
→ (L, (L, a))
→ (L, (S, a))
→ (L, (a, a))
→ (S, (a, a))
→ (a, (a, a))
Following table presents the handles of the sentential forms
occurring in the above derivation.
Handle Pruning

Sentential form Handle


(a, (a, a)) S → a at the position preceding the first comma
(S, (a, a)) L → S at the position preceding the first comma
(L, (a, a)) S → a at the position preceding the second comma
(L, (S, a)) L → S at the position preceding the second comma
(L, (L, a)) S → a at the position following the second comma
(L, (L, S)) L → L, S at the position following the second left bracket
(L, (L)) S → (L) at the position following the first comma
(L, S) L → L, S at the position following the first left bracket
(L) S → (L) at the position before the end marker
Shift-Reduce Parsing
• A convenient way to implement a bottom-up parser is to
use a shift-reduce technique.
• A stack is used for the implementation of the shift-reduce
parser.
• The parser goes on shifting input symbols onto the stack
until a handle appears on top of the stack.
• When a handle appears on the top of the stack, the parser
performs a reduction.
• The stack holds grammar symbols, and an input buffer
holds the string ‘w’ to be parsed.
• $ is used to mark the bottom of the stack and the right end
of the input.
Shift-Reduce Parsing
• The initial configuration of the stack and the
input buffer is as follows:

$ w $

Stack Input buffer


Shift-Reduce Parsing
• The shift-reduce parser performs the shift and reduce actions
repeatedly until an accept or an error condition is found.
– It keeps moving input symbols from the buffer on to the stack, one
symbol at a time. Moving the next input symbol on the top of the
stack is known as shift action.
– If the handle appears on the top of the stack then reduction of it by
appropriate rule is done. The reduction is performed by popping off
the right hand side of a rule from the stack and pushing its left hand
side. This is called reduce action.
– If the stack contains start symbol only and input buffer is empty at the
same time then that action is called accept and the parser announces
a successful parse.
– In case parser is not able to either shift or reduce or accept, it
announces that a syntax error has occurred and calls an error recovery
routine. This is known as an error action.
Shift-Reduce Parsing
• Thus, a shift-reduce parser scans the input and, at
each step, considers whether to:
– Shift the next token onto the top of the parse stack
(along with some state information).
– Reduce the stack by popping several symbols off
the stack (and their state information) and pushing
the corresponding non-terminal (and state
information).
Shift-Reduce Parsing
Example 1:
Consider the grammar as
S→E
E→E+E|E*E
E → num
E → id
Input to parse is id1 + num * id2. Perform shift-
reduce parsing.
Shift-Reduce Parsing
Solution:
Stack        Input              Action
$            id1 + num * id2$   Shift
$id1         + num * id2$       Reduce by E → id
$E           + num * id2$       Shift
$E+          num * id2$         Shift
$E+num       * id2$             Reduce by E → num
$E+E         * id2$             Shift
$E+E*        id2$               Shift
$E+E*id2     $                  Reduce by E → id
$E+E*E       $                  Reduce by E → E * E
$E+E         $                  Reduce by E → E + E
$E           $                  Reduce by S → E
$S           $                  Accept
Shift-Reduce Parsing
Example 2:
Consider the grammar as
E→E+E
E→E*E
E → (E)
E → id
and the input string is id1 + id2 * id3. Use the shift-
reduce parser to check whether the input string is
accepted by the grammar.
Shift-Reduce Parsing
Solution:
Stack        Input              Action
$            id1 + id2 * id3$   Shift
$id1         + id2 * id3$       Reduce by E → id
$E           + id2 * id3$       Shift
$E+          id2 * id3$         Shift
$E+id2       * id3$             Reduce by E → id
$E+E         * id3$             Shift
$E+E*        id3$               Shift
$E+E*id3     $                  Reduce by E → id
$E+E*E       $                  Reduce by E → E * E
$E+E         $                  Reduce by E → E + E
$E           $                  Accept
Shift-Reduce Parsing
Example 3:
Consider the grammar as
E→E+T|T
T→T*F|F
F → (E) | id
Consider the parsing of the string id + id * id.
Shift-Reduce Parsing
Solution:
Stack        Input            Action
$            id + id * id$    Shift
$id          + id * id$       Reduce by F → id
$F           + id * id$       Reduce by T → F
$T           + id * id$       Reduce by E → T
$E           + id * id$       Shift
$E+          id * id$         Shift
$E+id        * id$            Reduce by F → id
$E+F         * id$            Reduce by T → F
$E+T         * id$            Shift
$E+T*        id$              Shift
$E+T*id      $                Reduce by F → id
$E+T*F       $                Reduce by T → T * F
$E+T         $                Reduce by E → E + T
$E           $                Accept
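The trace above can be reproduced by a small shift-reduce loop. This is only a sketch: the shift/reduce decisions are hard-coded from the grammar's precedences (for example, E → E+T is not reduced while the lookahead is *), whereas a real parser would consult an LR table, as described in the sections that follow:

```python
# Shift-reduce loop for E -> E+T | T, T -> T*F | F, F -> (E) | id.
def parse(tokens):
    stack, buf = ['$'], list(tokens) + ['$']
    trace = []
    while True:
        la = buf[0]                                  # lookahead symbol
        if stack == ['$', 'E'] and la == '$':
            trace.append('Accept')
            return trace
        if stack[-1] == 'id':                        # handle on top: reduce
            stack[-1:] = ['F']; trace.append('Reduce by F -> id')
        elif stack[-3:] == ['(', 'E', ')']:
            stack[-3:] = ['F']; trace.append('Reduce by F -> (E)')
        elif stack[-3:] == ['T', '*', 'F']:
            stack[-3:] = ['T']; trace.append('Reduce by T -> T*F')
        elif stack[-1] == 'F':
            stack[-1:] = ['T']; trace.append('Reduce by T -> F')
        elif stack[-3:] == ['E', '+', 'T'] and la != '*':
            stack[-3:] = ['E']; trace.append('Reduce by E -> E+T')
        elif stack[-1] == 'T' and stack[-2] != '+' and la != '*':
            stack[-1:] = ['E']; trace.append('Reduce by E -> T')
        elif la != '$':                              # no handle: shift
            stack.append(buf.pop(0)); trace.append('Shift')
        else:
            trace.append('Error')
            return trace
```

Calling `parse(['id', '+', 'id', '*', 'id'])` yields exactly the fourteen actions listed in the solution table above.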
Shift-Reduce Parsing
– Note:
• The handle is always on top of the stack. The parser
never had to go into the stack to find the handle. It is
the aspect of handle pruning that makes a stack a
particularly convenient data structure for implementing
a shift-reduce parser.
• Viable Prefix:
– The set of prefixes of the right sentential forms
that can appear on the stack of a shift reduce
parser are called viable prefixes.
Shift-Reduce Parsing
• Conflicts during shift-reduce parsing:
– There are some context-free grammars for which
shift-reduce parsing cannot be used.
– For such a grammar, every shift-reduce parser reaches a
configuration in which it knows the stack contents and
the next input symbol but cannot decide whether to
shift or to reduce (a shift/reduce conflict), or cannot
decide which of several reductions to make (a
reduce/reduce conflict).
– These are the conflicts during shift-reduce parsing.
Shift-Reduce Parsing
For example,
S → a + b | a + b * c
If we have a shift-reduce parser in configuration
Stack          Input
:              :
a + b          * c …… $
we cannot tell whether a + b is the handle. Depending on
what follows on the input, it might be correct to reduce a + b
to S, or it might be correct to shift * and then look for c
to complete the alternative a + b * c. Thus, we cannot tell
whether to shift or reduce in this case.
LR Parsers
• LR parsing is a bottom up parsing technique which can
be used to parse a large class of context free
grammars.
• This technique is called LR(k) parsing where L stands
for left to right scanning of the input, R stands for
constructing a rightmost derivation in reverse and k for
the number of input symbols of lookahead which are
used for making parsing decisions.
• The three representatives of this family are SLR (Simple
LR), LALR (Look Ahead LR) and CLR (Canonical LR).
LR Parsers
• The relative power and ease of implementation
for these representatives is different and can be
expressed as
SLR(1) ≤ LALR(1) ≤ CLR(1)
where a ≤ b means that ‘b’ parses a larger class of
grammars as compared to that parsed by ‘a’.
• All these parsers are table driven and their overall
structure is the same, it is only the parsing table
which changes from one parser to another.
• When (k) is omitted, k is assumed to be 1.
LR Parsers
• Advantages of LR Parser:
– LR parsers can be constructed to recognize virtually all
programming language constructs for which a CFG can be
written.
– The LR parsing method is the most general non-backtracking
shift-reduce parsing method known, yet it can be
implemented as efficiently as other shift-reduce methods.
– The class of grammars that can be parsed using LR
methods is a proper superset of the class of grammars that
can be parsed with predictive parsers.
– An LR parser can detect a syntactic error as soon as it is
possible to do so on a left-to-right scan of the input.
Structure of a LR Parser
(figures omitted)
LR Parsing Algorithm
(algorithm slides omitted)
Example:
A table shows the parsing action and goto
functions of an LR parser for the expression grammar.
Solution: Parsing table for the expression grammar (table omitted)
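The LR driver loop follows the standard table-driven scheme. The sketch below hard-codes the SLR(1) ACTION/GOTO table constructed later (Example 1 of the SLR section) for E → E+T | T, T → T*F | F, F → id; the function and variable names are illustrative:

```python
# Table-driven LR parsing loop.
PRODS = [('S', ['E']), ('E', ['E', '+', 'T']), ('E', ['T']),
         ('T', ['T', '*', 'F']), ('T', ['F']), ('F', ['id'])]

ACTION = {  # (state, terminal) -> ('s', state) | ('r', production) | ('acc',)
    (0, 'id'): ('s', 4),
    (1, '+'): ('s', 5), (1, '$'): ('acc',),
    (2, '+'): ('r', 2), (2, '*'): ('s', 6), (2, '$'): ('r', 2),
    (3, '+'): ('r', 4), (3, '*'): ('r', 4), (3, '$'): ('r', 4),
    (4, '+'): ('r', 5), (4, '*'): ('r', 5), (4, '$'): ('r', 5),
    (5, 'id'): ('s', 4), (6, 'id'): ('s', 4),
    (7, '+'): ('r', 1), (7, '*'): ('s', 6), (7, '$'): ('r', 1),
    (8, '+'): ('r', 3), (8, '*'): ('r', 3), (8, '$'): ('r', 3),
}
GOTO = {(0, 'E'): 1, (0, 'T'): 2, (0, 'F'): 3,
        (5, 'T'): 7, (5, 'F'): 3, (6, 'F'): 8}

def lr_parse(tokens):
    """Return the productions used (a rightmost derivation in reverse),
    or None on a syntax error."""
    stack = [0]                      # stack of states
    buf = list(tokens) + ['$']
    output = []
    while True:
        act = ACTION.get((stack[-1], buf[0]))
        if act is None:
            return None              # blank entry: error
        if act[0] == 's':            # shift: push state, consume one token
            stack.append(act[1])
            buf.pop(0)
        elif act[0] == 'r':          # reduce: pop |body| states, push GOTO
            head, body = PRODS[act[1]]
            del stack[len(stack) - len(body):]
            stack.append(GOTO[(stack[-1], head)])
            output.append('%s -> %s' % (head, ' '.join(body)))
        else:                        # accept
            return output

print(lr_parse(['id', '+', 'id', '*', 'id']))
```

The returned list is the reverse rightmost derivation, reduction by reduction.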
LR Grammars
• A grammar for which we can construct parsing table in
which every entry is uniquely defined is said to be an
LR grammar.
• In order for a grammar to be LR, it is sufficient that a
left-to-right parser be able to recognize handles when
they appear on top of the stack.
• We may note that LR parsers do not have to scan the
entire stack to know when the handle appears on top.
• The state symbol on top of the stack contains all the
information it needs.
LR Grammars
• Another source of information that an LR
parser can use to help make its shift-reduce
decisions is the next ‘k’ input symbols.
• The cases k=0 or k=1 are of practical interest.
• A grammar that can be parsed by an LR parser
examining upto k input symbols on each move
is called an LR(k) grammar.
LR Grammars
• There is a significant difference between LL and LR
grammar.
– For a grammar to be LR(k), we must be able to recognize
the occurrence of the right side of a production, having
seen all of what is derived from that right hand side with k
input symbols of look ahead.
– This requirement is far less stringent than that of LL(k)
grammars where we must be able to recognize the use of a
production seeing only the first k symbols of what its right
side derives.
– Thus LR grammars can describe more languages than LL
grammars.
LR Grammars
• Not every context-free grammar is LR:
– A grammar is non-LR(1) if every shift-reduce parser for
that grammar can reach a configuration in which the
parser cannot decide whether to shift or to reduce, or
cannot decide which of several reductions to make.
– An ambiguous grammar can never be LR.
– Another common cause of non-LR-ness occurs when
we know we have a handle, but what we see on the
stack is not sufficient to tell what production should
be used in a reduction.
The Canonical Collection of LR(0)
Items
• LR(0) item:
– An LR(0) item for a grammar G is a rule of G with a dot
inserted at some position in the right hand side of the
rule.
– Thus, production A → XYZ generates the four items
A → .XYZ A → X.YZ
A → XY.Z A → XYZ.

– Note: If the length of the right side of a production
is n, then there are (n + 1) different positions on the
right side of that production where a dot can be placed.
The Canonical Collection of LR(0)
Items
– Similarly, rule A → Є has only one LR(0) item A → .
– Inside the computer, items are easily represented by pair
of integers, the first giving the number of the production
and the second the position of the dot.
– An item indicates how much of a production we have seen
at a given point in the parsing process.
– For example, the first item A → .XYZ would indicate that
we are expecting to see a string derivable from XYZ next on
the input.
– The second item A → X.YZ would indicate that we have just
seen on the input a string derivable from X and that we
next expect to see a string derivable from YZ.
The Canonical Collection of LR(0)
Items
– An LR(0) item is complete if we have seen the
complete right hand side of the rule i.e., if the dot is
the last symbol in the right hand side. Otherwise it is
an incomplete item.
– We may note that for every rule A → α, α ǂ Є, there is
only one complete LR(0) item, viz. A → α., but as many
incomplete items as there are grammar symbols in the
right hand side.
– To find out the states of the SLR parser, we group
items into sets called Canonical LR(0) collection.
• To do that, we need to define augmented grammar and two
functions closure and goto.
The Canonical Collection of LR(0)
Items
• Augmented Grammar:
– If G is a grammar with start symbol S, then G’, the
augmented grammar for G, is G with a new start
symbol S’ and production S’ → S.
– The purpose of this new starting production is to
indicate to the parser when it should stop parsing
and announce acceptance of the input.
– This would occur when the parser was about to
reduce by S’ → S.
The Canonical Collection of LR(0)
Items
• Closure Operation:
– If I is a set of items for a grammar G, then the set of
items CLOSURE(I) is constructed from I by the rules:
(i) Every item in I is added to CLOSURE(I).
(ii) If A → α.Bβ is in CLOSURE(I) and B → γ is a production,
then add
the item B → .γ to CLOSURE(I), if it is not already there.
(iii) Repeat step (ii) until no more new items can be added to
CLOSURE(I).
The Canonical Collection of LR(0)
Items
– Intuitively, A → α.Bβ in CLOSURE(I) indicates that,
at some point in the parsing process, we next
expect to see a string derivable from Bβ as input.
– If B → γ is a production, we would also expect to
see a string derivable from γ at this point.
– It is for this reason that we also include B → .γ in
CLOSURE(I).
The Canonical Collection of LR(0)
Items
Example:
Consider the augmented grammar
E’ → E
E → E+T | T
T → T*F | F
F → (E) | id
Find CLOSURE(I).
The Canonical Collection of LR(0)
Items
Solution:
If I is the set of one item {[E’ → .E]}, then the
CLOSURE(I) contains the items
E’ → .E
E → .E+T
E → .T
T → .T*F
T → .F
F → .(E)
F → .id
The Canonical Collection of LR(0)
Items
• GOTO Operation:
– The second useful function is GOTO(I, X), where I is
a set of items and X is a grammar symbol.
– GOTO(I, X) is defined to be the closure of the set
of all items [A → αX.β] such that [A → α.Xβ] is in I.
– That is, the GOTO function is defined using CLOSURE:
GOTO(I, X) = CLOSURE({[A → αX.β] : [A → α.Xβ] is in I}).
The Canonical Collection of LR(0)
Items
Example:
If I is the set of one item {[E’ → E.], [E → E.+T]},
then GOTO(I, +) consists of
E → E+.T
T → .T*F
T → .F
F → .(E)
F → .id
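Both operations are easy to sketch when items are represented as (production number, dot position) pairs, as suggested earlier for the machine representation. The sketch below uses the augmented expression grammar of the CLOSURE example:

```python
# CLOSURE and GOTO over LR(0) items, represented as (production, dot) pairs.
GRAMMAR = [("E'", ['E']), ('E', ['E', '+', 'T']), ('E', ['T']),
           ('T', ['T', '*', 'F']), ('T', ['F']),
           ('F', ['(', 'E', ')']), ('F', ['id'])]
NONTERMS = {"E'", 'E', 'T', 'F'}

def closure(items):
    items = set(items)
    while True:
        new = set()
        for (p, d) in items:
            body = GRAMMAR[p][1]
            if d < len(body) and body[d] in NONTERMS:   # dot before nonterminal B
                for q, (head, _) in enumerate(GRAMMAR):
                    if head == body[d]:
                        new.add((q, 0))                 # add item B -> .gamma
        if new <= items:
            return items
        items |= new

def goto(items, X):
    # Move the dot over X in every item that allows it, then take the closure.
    return closure({(p, d + 1) for (p, d) in items
                    if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == X})

I0 = closure({(0, 0)})                 # CLOSURE({E' -> .E}): all seven items
I_plus = goto({(0, 1), (1, 1)}, '+')   # GOTO on '+' from {E' -> E., E -> E.+T}
print(sorted(I0))
print(sorted(I_plus))
```

`I0` contains the seven items of the CLOSURE example, and `I_plus` contains the five items of the GOTO example above.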
The Canonical Collection of LR(0)
Items
• Kernel and Non-Kernel Items:
– We can divide LR(0) items into kernel and non-kernel
items.
– Kernel items are those items whose dot is not at the left end.
• The initial item S’ → .S is an exception and is included in the kernel items.
• So the kernel items consist of the initial item S’ → .S and all items
whose dots are not at the left end.
– Non-kernel items are those items (other than S’ → .S)
whose dot is at the left end.
The Canonical Collection of LR(0)
Items
– We may note that items added in the CLOSURE
can never be kernel items.

– Note:
• GOTO identifies groups of ‘Kernel items’ i.e., those
whose dots are not at extreme left.
• CLOSURE adds ‘Non-kernel items’ i.e., those whose
dots are at extreme left.
Construction of Canonical Collection
of Set of Items
• A canonical collection of LR(0) items for an
augmented grammar can be constructed using
CLOSURE and GOTO functions.
• A collection of set of items which corresponds
to the states of a DFA recognizing viable
prefixes is called as canonical collection.
• Therefore construction of a DFA involves
finding canonical collection of set of items.
Construction of Canonical Collection
of LR(0) items
• An LR(0) item is a production of G with a dot at
some position on the right side of the
production.
• LR(0) items are useful to indicate how
much of the input has been scanned up to a
given point in the process of parsing.
• In the LR(0) table, a reduce action is placed
across the entire row of the corresponding state.
Construction of Canonical Collection
of LR(0) items
 Given grammar:
S → AA
A → aA | b
 Add Augment Production and insert '•'
symbol at the first position for every
production in G
S` →•S
S → •AA
A → •aA
A → •b
Construction of Canonical Collection
of LR(0) items
I0 State:
Add Augment production to the I0 State and
Compute the Closure
I0 = Closure (S` → •S)
 Add all productions starting with S in to I0
State because "•" is followed by the non-
terminal. So, the I0 State becomes
I0 = S` → •S
S → •AA
Construction of Canonical Collection
of LR(0) items
 Add all productions starting with "A" in
modified I0 State because "•" is followed by
the non-terminal. So, the I0 State becomes.
I0= S` → •S
S → •AA
A → •aA
A → •b
Construction of Canonical Collection
of LR(0) items
I1 = Goto (I0, S) = Closure (S` → S•) = S` → S•
Here the production is completely parsed (the item
is complete), so the state is closed.
I1 = S` → S•
I2= Go to (I0, A) = closure (S → A•A)
 Add all productions starting with A in to I2
State because "•" is followed by the non-
terminal. So, the I2 State becomes
Construction of Canonical Collection
of LR(0) items
I2 =S→A•A
A → •aA
A → •b
Go to (I2,a) = Closure (A → a•A) = (same as I3)
Go to (I2, b) = Closure (A → b•) = (same as I4)
I3= Go to (I0,a) = Closure (A → a•A)
Add productions starting with A in I3.
A → a•A
A → •aA
A → •b
Construction of Canonical Collection
of LR(0) items
Go to (I3, a) = Closure (A → a•A) = (same as I3)
Go to (I3, b) = Closure (A → b•) = (same as I4)
I4= Go to (I0, b) = closure (A → b•) = A → b•
I5 = Goto (I2, A) = Closure (S → AA•) = S → AA•
I6= Go to (I3, A) = Closure (A → aA•) = A → aA•
Construction of Canonical Collection
of LR(0) items
Drawing DFA:
The DFA contains the 7 states I0 to I6.
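The construction traced above can be sketched as follows for S → AA, A → aA | b; `closure`/`goto` are as in the earlier LR(0) sketch, specialized to this grammar, and states are discovered by repeatedly applying GOTO until no new item sets appear:

```python
# Canonical collection of LR(0) item sets for S -> AA, A -> aA | b.
GRAMMAR = [("S'", ['S']), ('S', ['A', 'A']), ('A', ['a', 'A']), ('A', ['b'])]
NONTERMS = {"S'", 'S', 'A'}
SYMBOLS = ['S', 'A', 'a', 'b']

def closure(items):
    items = set(items)
    while True:
        new = {(q, 0)
               for (p, d) in items
               if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] in NONTERMS
               for q, (head, _) in enumerate(GRAMMAR)
               if head == GRAMMAR[p][1][d]}
        if new <= items:
            return frozenset(items)
        items |= new

def goto(I, X):
    return closure({(p, d + 1) for (p, d) in I
                    if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == X})

# Start from I0 = CLOSURE({S' -> .S}); keep applying GOTO until no new sets.
states = [closure({(0, 0)})]
for I in states:                 # list iteration also visits appended items
    for X in SYMBOLS:
        J = goto(I, X)
        if J and J not in states:
            states.append(J)

print(len(states))  # 7 states: I0 .. I6, matching the DFA above
```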
Construction of LR(0) Table

LR(0) Table
• If a state goes to some other state on a
terminal, then the entry corresponds to a shift move.
• If a state goes to some other state on a
variable (non-terminal), then the entry corresponds to a GOTO move.
• If a state contains a complete (final) item, then
the reduce action is written across that state's entire row.
Construction of LR(0) Table

Productions are numbered as follows:


S → AA ... (1)
A → aA ... (2)
A → b ... (3)
Constructing SLR Parsing tables
• We construct the SLR parsing ACTION and
GOTO functions from the DFA that recognizes
viable prefixes.
• This will not produce uniquely-defined parsing
action tables for all grammars, but it does
succeed on many grammars for programming
languages.
Constructing SLR Parsing tables
• Algorithm:
1) Construct an augmented grammar G’ for the
input grammar G.
2) Construct FOLLOW(A), for all A Є N.
3) Construct C = {I0, I1, …… , In}, the collection of
sets of LR(0) items for G’.
4) Construct state I from Ii.
5) In order to define the entries for state I of the
ACTION table and GOTO table, do steps (i) to (v)
for each Ii.
Constructing SLR Parsing tables
i. If [A → α.aβ] Є Ii and GOTO(Ii, a) = Ij, then ACTION(i, a) = Sj, which
means shift to state j. Here ‘a’ is a terminal symbol.
ii. If [A → α.] Є Ii (A ǂ S’), then ACTION(i, a) = rj for all a Є FOLLOW(A),
where j is the number of the production A → α.
iii. If [S’ → S.] Є Ii, then ACTION(i, $) = Accept. If any conflicting actions
are generated by the above rules, we say the grammar is not
SLR(1); the algorithm fails to produce a parser in this case.
iv. For all non-terminals A, if GOTO(Ii, A) = Ij, then GOTO(i, A) = j.
v. All the remaining entries in the ACTION table and the GOTO table
are marked as error.

6) The initial state of the parser is the one constructed
from the set of items containing [S’ → .S].
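Steps (i)–(v) can be sketched as follows for the grammar of Example 1 below (E → E+T | T, T → T*F | F, F → id). The FOLLOW sets are taken as given from the slides; with the symbol ordering chosen here, the discovery order of the item sets happens to reproduce the numbering I0 … I8 used in that example:

```python
# SLR(1) ACTION/GOTO table construction from the canonical LR(0) collection.
GRAMMAR = [('S', ['E']), ('E', ['E', '+', 'T']), ('E', ['T']),
           ('T', ['T', '*', 'F']), ('T', ['F']), ('F', ['id'])]
NONTERMS = {'S', 'E', 'T', 'F'}
SYMBOLS = ['E', 'T', 'F', 'id', '+', '*']
FOLLOW = {'E': {'$', '+'}, 'T': {'$', '*', '+'}, 'F': {'$', '*', '+'}}

def closure(items):
    items = set(items)
    while True:
        new = {(q, 0)
               for (p, d) in items
               if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] in NONTERMS
               for q, (head, _) in enumerate(GRAMMAR)
               if head == GRAMMAR[p][1][d]}
        if new <= items:
            return frozenset(items)
        items |= new

def goto(I, X):
    return closure({(p, d + 1) for (p, d) in I
                    if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == X})

states = [closure({(0, 0)})]
for I in states:                     # discover the canonical collection
    for X in SYMBOLS:
        J = goto(I, X)
        if J and J not in states:
            states.append(J)

ACTION, GOTO_TABLE = {}, {}
for i, I in enumerate(states):
    for (p, d) in I:
        head, body = GRAMMAR[p]
        if d < len(body) and body[d] not in NONTERMS:   # rule (i): shift
            ACTION[(i, body[d])] = ('s', states.index(goto(I, body[d])))
        elif d == len(body) and head == 'S':            # rule (iii): accept
            ACTION[(i, '$')] = ('acc',)
        elif d == len(body):                            # rule (ii): reduce
            for a in FOLLOW[head]:
                ACTION[(i, a)] = ('r', p)
    for A in NONTERMS - {'S'}:                          # rule (iv): goto
        J = goto(I, A)
        if J in states:
            GOTO_TABLE[(i, A)] = states.index(J)

print(len(states), ACTION[(0, 'id')], GOTO_TABLE[(0, 'E')])
```

The resulting entries match the final SLR table shown in Example 1 (for instance, ACTION(2, *) = S6 and GOTO(5, T) = 7).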
Constructing SLR Parsing tables
• The parsing table consisting of the parsing
ACTION and GOTO functions determined by
this algorithm is called the SLR table for G.
• An LR parser using the SLR table for G is called
the SLR parser for G, and a grammar having an
SLR parsing table is said to be SLR(1).
Constructing SLR Parsing tables
Example 1:
Find the SLR or LR(0) parsing table for the
grammar
E → E+T | T
T → T*F | F
F → id
Constructing SLR Parsing tables
Solution:
The augmented grammar G’ is
S→E
E → E+T | T
T → T*F | F
F → id
The transition diagram for the given grammar is
as follows:
Constructing SLR Parsing tables
(transition diagram omitted)
Constructing SLR Parsing tables
Now we prepare a FOLLOW set for all non-terminals.
FOLLOW(E) = {$, +}
FOLLOW(T) = {$, *, +}
FOLLOW(F) = {$, *, +}
The productions are numbered as
(1) E → E+T
(2) E→T
(3) T → T*F
(4) T→F
(5) F → id
Constructing SLR Parsing tables
Now build SLR parsing table as follows:
Entry in the action table ACTION(I0, id) = S4 because
GOTO(I0, id) = I4
Similarly, ACTION(I1, +) = S5 since GOTO(I1, +) = I5
ACTION(I2, *) = S6
ACTION(I5, id) = S4
ACTION(I6, id) = S4
ACTION(I7, *) = S6

Now we will fill it up with reduce and accept actions.
We will also fill up the GOTO table for all the non-terminals.
Constructing SLR Parsing tables
So the final SLR table is as follows:

         ACTION table                GOTO table
State    id    +     *     $         E    T    F
I0       S4                          1    2    3
I1             S5          Accept
I2             r2    S6    r2
I3             r4    r4    r4
I4             r5    r5    r5
I5       S4                               7    3
I6       S4                                    8
I7             r1    S6    r1
I8             r3    r3    r3
Constructing SLR Parsing tables
Example 2:
Construct a SLR or LR(0) parsing table for the
following grammar:
E → E+T | T
T → TF | F
F → F* | a | b
Constructing SLR Parsing tables
Solution:
The augmented grammar G’ is
S→E
E → E+T | T
T → TF | F
F → F* | a | b
The transition diagram for the given grammar is
as follows:
Constructing SLR Parsing tables
(transition diagram omitted)
Constructing SLR Parsing tables
Now we prepare a FOLLOW set for all non-terminals.
FOLLOW(E) = {$, +}
FOLLOW(T) = {$, +, a, b}
FOLLOW(F) = {$, *, +, a, b}
The productions are numbered as
(1) E → E+T
(2) E→T
(3) T → TF
(4) T→F
(5) F → F*
(6) F→a
(7) F→b
Constructing SLR Parsing tables
So the final SLR table is as follows:

         ACTION table                      GOTO table
State    a     b     +     *     $         E    T    F
I0       S4    S5                          1    2    3
I1                   S6          Accept
I2       S4    S5    r2          r2                  7
I3       r4    r4    r4    S9    r4
I4       r6    r6    r6    r6    r6
I5       r7    r7    r7    r7    r7
I6       S4    S5                               8    3
I7       r3    r3    r3    S9    r3
I8       S4    S5    r1          r1                  7
I9       r5    r5    r5    r5    r5
Constructing SLR Parsing tables
Example 3:
Construct a SLR or LR(0) parsing table for the
following grammar:
S → cA | ccB
A → cA | a
B → ccB | b
Constructing SLR Parsing tables
Solution:
The augmented grammar G’ is
S’ → S
S → cA | ccB
A → cA | a
B → ccB | b
The transition diagram for the given grammar is
as follows:
Constructing SLR Parsing tables
(transition diagram omitted)
Constructing SLR Parsing tables
Now we prepare a FOLLOW set for all non-terminals.
FOLLOW(S) = {$}
FOLLOW(A) = {$}
FOLLOW(B) = {$}
The productions are numbered as
(1) S → cA
(2) S → ccB
(3) A → cA
(4) A→a
(5) B → ccB
(6) B→b
Constructing SLR Parsing tables
So the final SLR table is as follows:

         ACTION table                GOTO table
State    a     b     c     $         S    A    B
I0       S3    S4    S2              1
I1                         Accept
I2       S3          S6                   5
I3                         r4
I4                         r6
I5                         r1/r3
I6       S3    S4    S9                   8    7
I7                         r2/r5
I8                         r3
I9       S3          S10                  8
I10      S3    S4    S9                   8    11
I11                        r5

From the table, reduce-reduce conflicts occur (states I5
and I7). So the given grammar is not LR(0).
Canonical LR Parser (CLR)
• Every SLR grammar is unambiguous, but there are many
unambiguous grammars that are not SLR.
• We define the CLR (Canonical LR) parser to accommodate
extra information, in the form of a terminal symbol, as a
second component in each item of a state.
• The general form of an item becomes [A → α.β, a], where
A → α.β is a production and ‘a’ is a terminal symbol or the
right end marker $. We call such an item an LR(1) item.
• The ‘1’ refers to the length of the second component,
called the lookahead of the item.
Canonical LR Parser (CLR)
• An LR(1) item is of the form [A → α.β, a], where
A → α.β is an LR(0) item and the second component is a
set of terminal symbols (the lookahead symbols).
• Let us consider an item of the form [A → α.β, a]:
i. If β = Є, i.e. the item is [A → α. , a], the reduction is
applied only if the next input symbol is ‘a’.
ii. If β ǂ Є, the lookahead has no effect.
• We may note that the lookahead symbol is used in a
reduce operation only.
Canonical LR Parser (CLR)
• Algorithm: Construction of the sets of LR(1) items
1) Augment the grammar G to obtain G’.
2) C = { CLOSURE({[S’ → .S, $]}) }
   repeat
       for each set of items I in C and each grammar
       symbol X
           if GOTO(I, X) ǂ φ and GOTO(I, X) is not in C
               C = C U { GOTO(I, X) }
   until no more sets of items can be added to C
Canonical LR Parser (CLR)
• CLOSURE function:
– We can define CLOSURE(I) as the set of items
constructed from I as follows:
i. For each item [A → α.Bβ, a] in I, each rule
B → γ in G’, and each terminal b in FIRST(βa), if
[B → .γ, b] is not in I then I = I U {[B → .γ, b]}.
ii. Repeat step (i) until no more new items can
be added to I.
Canonical LR Parser (CLR)
• GOTO function:
– GOTO function is defined as follows:
function GOTO(I, X)
begin
Let J be the set of items [A → αX.β, a] such
that [A → α.Xβ, a] is in I
return CLOSURE(J)
end
Canonical LR Parser (CLR)
Example 1:
Consider the following grammar
S → CC
C → cC | d
Construct LR(1) set of items for this grammar.
Canonical LR Parser (CLR)
Solution:
The augmented grammar G’ is
S’ → S
S → CC
C → cC | d
We begin by computing the CLOSURE(S’ → .S, $)
We match this with the item [A → α.Bβ, a] in the
CLOSURE.
Canonical LR Parser (CLR)
CLOSURE tells to add [B → .γ, b] for each
production B → γ and terminal b in FIRST(βa).
Here B → γ is S → CC and since β is Є and a is $,
hence FIRST(Є$) = $
Thus we add [S → .CC, $]
Now we have B = C, β = C, a = $
C → .cC FIRST(C$) = FIRST(C) = {c, d}
C → .d
Therefore, add [C → .cC, c|d]
[C → .d, c|d]
Canonical LR Parser (CLR)
Now, none of the new items has a non-
terminal immediately to the right of the dot so
we have completed our first set of LR(1) items.
The initial set of items as
I0 : S’ → .S, $
S → .CC, $
C → .cC, c|d
C → .d, c|d
Canonical LR Parser (CLR)
Now we compute GOTO(I0, X) for the various values of X.
I1 : S’ → S., $
I2 : S → C.C, $
C → .cC, $
C → .d, $
I3 : C → c.C, c|d
     C → .cC, c|d
     C → .d, c|d
I4 : C → d., c|d

Similarly other item sets can be generated.
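The LR(1) CLOSURE computation above can be sketched with items as (production, dot, lookahead) triples; FIRST is hard-coded for this small grammar, where no symbol derives Є:

```python
# LR(1) CLOSURE for S -> CC, C -> cC | d, items as (production, dot, lookahead).
GRAMMAR = [("S'", ['S']), ('S', ['C', 'C']), ('C', ['c', 'C']), ('C', ['d'])]
NONTERMS = {"S'", 'S', 'C'}
FIRST = {'S': {'c', 'd'}, 'C': {'c', 'd'}, 'c': {'c'}, 'd': {'d'}, '$': {'$'}}

def first_of(symbols):
    # FIRST of a string; nothing here derives epsilon, so only the head matters.
    return FIRST[symbols[0]]

def closure(items):
    items = set(items)
    changed = True
    while changed:
        changed = False
        for (p, d, a) in list(items):
            body = GRAMMAR[p][1]
            if d < len(body) and body[d] in NONTERMS:
                beta = body[d + 1:] + [a]        # lookaheads come from FIRST(βa)
                for q, (head, _) in enumerate(GRAMMAR):
                    if head == body[d]:
                        for b in first_of(beta):
                            if (q, 0, b) not in items:
                                items.add((q, 0, b))
                                changed = True
    return items

I0 = closure({(0, 0, '$')})
print(sorted(I0))   # the six items of I0 listed above
```

The result is exactly I0 as listed above: [S’ → .S, $], [S → .CC, $], [C → .cC, c|d], [C → .d, c|d].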


Canonical LR Parser (CLR)
The transition diagram is as follows: (diagram omitted)
Constructing CLR Parsing Table
• Algorithm:
1) Construct C = {I0, I1, ….., In}, the collection of sets of LR(1)
items for G.
2) State i of the parser is constructed from Ii. The parsing
actions for state i are determined as follows:
a) If [A → α.aβ, b] is in Ii and GOTO[Ii, a] = Ij then set
ACTION[i, a] = Sj.
b) If [A → α., a] is in Ii, AǂS’, then set ACTION[i, a] = rj, where j is the number of the production A → α.
c) If [S’ → S., $] is in Ii, then set ACTION[i, $] = Accept.
If a conflict results from the above rules, the grammar is
said not to be LR(1) and the algorithm is said to fail.
Constructing CLR Parsing Table
3) The GOTO transitions for state i are determined as
follows:
If GOTO(Ii, A) = Ij then GOTO(i, A) = j.
4) All entries not defined by rules (2) and (3) are made
“error”.
5) The initial state of the parser is the one constructed
from the set containing item [S’ → .S, $].
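As an illustrative sketch of step (2), the three action rules can be applied item by item to a state. The helper name and encodings below are made up; state I2 of the running S → CC, C → cC | d example and its GOTO transitions are hard-coded:

```python
# Apply action rules 2(a)-2(c) to one hard-coded LR(1) state.
# Items are (head, body, dot, lookahead) tuples; this is state I2
# of the S -> CC, C -> cC | d example, i.e. after seeing one C.
I2 = [
    ("S", ("C", "C"), 1, "$"),   # S -> C.C, $
    ("C", ("c", "C"), 0, "$"),   # C -> .cC, $
    ("C", ("d",), 0, "$"),       # C -> .d, $
]
GOTO_I2 = {"c": 6, "d": 7, "C": 5}   # transitions out of I2
PROD_NUM = {("S", ("C", "C")): 1, ("C", ("c", "C")): 2, ("C", ("d",)): 3}

def actions_for_state(items, goto_map):
    action = {}
    for (head, body, dot, la) in items:
        if dot < len(body):                      # rule 2(a): shift
            a = body[dot]
            # terminals are lowercase or $ in this example; the
            # non-terminal transition on C goes into the GOTO table
            if a in goto_map and not a.isupper():
                action[a] = "S" + str(goto_map[a])
        elif head == "S'":                       # rule 2(c): accept
            action["$"] = "acc"
        else:                                    # rule 2(b): reduce
            action[la] = "r" + str(PROD_NUM[(head, body)])
    return action

print(actions_for_state(I2, GOTO_I2))   # {'c': 'S6', 'd': 'S7'}
```

These are exactly the S6 and S7 entries in row 2 of the table below; the GOTO entry for C (state 5) comes from step (3).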
Constructing CLR Parsing Table
• The table formed from the parsing action and goto
function produced by this algorithm is called the
canonical LR(1) parsing table. An LR parser using this
table is called a canonical LR(1) parser.
• If the parsing action function has no multiply defined
entries, then the given grammar is called an LR(1)
grammar.
Constructing CLR Parsing Table
Example:
The canonical parsing table, or LR(1) parsing table,
for the above grammar is given below.
The productions are numbered as
(1) S → CC
(2) C → cC
(3) C → d
Constructing CLR Parsing Table
So final CLR table is as follows:
              ACTION table             GOTO table
State       c       d       $          S      C
  0         S3      S4                 1      2
  1                         Accept
  2         S6      S7                        5
  3         S3      S4                        8
  4         r3      r3
  5                         r1
  6         S6      S7                        9
  7                         r3
  8         r2      r2
  9                         r2
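The table above drives the standard LR parsing loop. Below is a minimal Python sketch (function and variable names are illustrative) with the ACTION and GOTO entries transcribed from the table; productions are numbered (1) S → CC, (2) C → cC, (3) C → d as in the text:

```python
# LR parsing driver running the CLR table for S -> CC, C -> cC | d.
ACTION = {
    (0, "c"): ("shift", 3), (0, "d"): ("shift", 4),
    (1, "$"): ("accept",),
    (2, "c"): ("shift", 6), (2, "d"): ("shift", 7),
    (3, "c"): ("shift", 3), (3, "d"): ("shift", 4),
    (4, "c"): ("reduce", 3), (4, "d"): ("reduce", 3),
    (5, "$"): ("reduce", 1),
    (6, "c"): ("shift", 6), (6, "d"): ("shift", 7),
    (7, "$"): ("reduce", 3),
    (8, "c"): ("reduce", 2), (8, "d"): ("reduce", 2),
    (9, "$"): ("reduce", 2),
}
GOTO = {(0, "S"): 1, (0, "C"): 2, (2, "C"): 5, (3, "C"): 8, (6, "C"): 9}
PRODS = {1: ("S", 2), 2: ("C", 2), 3: ("C", 1)}   # head, body length

def parse(tokens):
    stack = [0]                       # state stack
    tokens = list(tokens) + ["$"]
    pos = 0
    while True:
        act = ACTION.get((stack[-1], tokens[pos]))
        if act is None:
            return False              # blank entry: error
        if act[0] == "accept":
            return True
        if act[0] == "shift":
            stack.append(act[1])
            pos += 1
        else:                         # reduce: pop |body| states, push GOTO
            head, length = PRODS[act[1]]
            del stack[-length:]
            stack.append(GOTO[(stack[-1], head)])

print(parse("cdd"), parse("ccdd"), parse("cd"))   # True True False
```

Strings of the form c…dc…d (two C's) are accepted; anything else hits a blank ACTION entry and is rejected.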
Constructing CLR Parsing Table
• Note:
– Every SLR(1) grammar is an LR(1) grammar, but for
an SLR(1) grammar the canonical LR parser may
have more states than the SLR parser for the same
grammar.
Constructing CLR Parsing Table
Example 2:
Consider the following grammar:
S → AaAb
S → BbBa
A→ϵ
B→ϵ
Construct CLR Parsing table.
Constructing CLR Parsing Table
Solution:
The augmented grammar adds S’ → S; the initial
set of LR(1) items is then
I0 : S’ → .S, $
     S → .AaAb, $
     S → .BbBa, $
     A → ., a
     B → ., b
Constructing CLR Parsing Table
Goto {I0,S} = {S’ → S.,$} = I1
Goto {I0,A} = {S → A.aAb,$} = I2
Goto {I0,B} = {S → B.bBa,$} = I3
Goto {I2,a} = {S → Aa.Ab,$
A → .,b} = I4
Goto {I3,b} = {S → Bb.Ba,$
B → .,a } = I5
Constructing CLR Parsing Table
Goto {I4,A} = {S → AaA.b,$} = I6
Goto {I5,B} = {S → BbB.a,$} = I7
Goto {I6,b} = {S → AaAb.,$} = I8
Goto {I7,a} = {S → BbBa.,$} = I9
Constructing CLR Parsing Table
So final CLR table is as follows:
              ACTION Table             GOTO Table
State       a       b       $          S      A      B
  I0        r3      r4                 1      2      3
  I1                        accept
  I2        S4
  I3                S5
  I4                r3                        6
  I5        r4                                       7
  I6                S8
  I7        S9
  I8                        r1
  I9                        r2
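The same driver loop can run this table; a sketch with the entries transcribed from above is given below (names are illustrative). Note how the ε-reductions r3 (A → ϵ) and r4 (B → ϵ) pop zero states, and how the one-symbol lookahead (a versus b) selects between them; this is what makes the grammar LR(1) but not SLR(1), since FOLLOW(A) = FOLLOW(B) = {a, b} would give an SLR parser a reduce/reduce conflict in state 0.

```python
# LR driver for the example-2 CLR table: S -> AaAb | BbBa, A,B -> epsilon.
ACTION = {
    (0, "a"): ("reduce", 3), (0, "b"): ("reduce", 4),
    (1, "$"): ("accept",),
    (2, "a"): ("shift", 4),
    (3, "b"): ("shift", 5),
    (4, "b"): ("reduce", 3),
    (5, "a"): ("reduce", 4),
    (6, "b"): ("shift", 8),
    (7, "a"): ("shift", 9),
    (8, "$"): ("reduce", 1),
    (9, "$"): ("reduce", 2),
}
GOTO = {(0, "S"): 1, (0, "A"): 2, (0, "B"): 3, (4, "A"): 6, (5, "B"): 7}
# production number -> (head, body length); 3 and 4 are the epsilon rules
PRODS = {1: ("S", 4), 2: ("S", 4), 3: ("A", 0), 4: ("B", 0)}

def parse(tokens):
    stack = [0]
    tokens = list(tokens) + ["$"]
    pos = 0
    while True:
        act = ACTION.get((stack[-1], tokens[pos]))
        if act is None:
            return False
        if act[0] == "accept":
            return True
        if act[0] == "shift":
            stack.append(act[1])
            pos += 1
        else:
            head, length = PRODS[act[1]]
            if length:                # epsilon reductions pop nothing
                del stack[-length:]
            stack.append(GOTO[(stack[-1], head)])

print(parse("ab"), parse("ba"), parse("aa"))   # True True False
```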
LALR Parsing
• LALR stands for lookahead LR.
• This parser has the size advantage of an SLR parser
as the tables obtained by it are considerably
smaller than the canonical LR tables.
• In addition, it has the lookahead capability like a
CLR parser.
• The numbers of states in the SLR and LALR parsing
tables are always the same.
Constructing LALR Parsing Table
• We can find the LALR states by merging CLR states that
have the same set of first components.
• More generally, we can look for sets of LR(1) items
having the same core, that is, set of first components,
and we may merge these sets with common cores into
one set of items.
• Note:
– The merging of states with common cores can never
produce a shift/reduce conflict that was not present in one
of the original states because shift actions depend only on
the core, not the lookahead.
Constructing LALR Parsing Table
• Algorithm:
1) Construct C = {I0, I1, ….., In}, the collection of sets of
LR(1) items.
2) For each core present among the sets of LR(1)
items, find all sets having that core, and replace
these sets by their union.
3) Let C’ = {J0, J1, ….., Jm} be the resulting sets of LR(1)
items. The parsing actions for state i are constructed
from Ji in the same manner as LR parsing algorithm.
If there is a parsing-action conflict, the algorithm
fails to produce a parser and the grammar is said
not to be LALR(1).
Constructing LALR Parsing Table
4) The GOTO table is constructed as follows:
– If J is the union of one or more sets of LR(1) items,
i.e., J = I1 U I2 U ….. U Ik, then the cores of
GOTO(I1, X), GOTO(I2, X), ….., GOTO(Ik, X) are the
same, since I1, I2, …., Ik all have the same core.
– Let K be the union of all sets of items having the
same core as GOTO(I1, X).
– Then GOTO(J, X) = K.
Constructing LALR Parsing Table
• The table produced by this algorithm is called
the LALR parsing table for G.
• If there are no parsing conflicts, then the given
grammar is said to be an LALR(1) grammar.
• The collection of sets of items constructed in
step(3) is called the LALR(1) collection.
Constructing LALR Parsing Table
Example:
Consider the grammar
S’ → S
S → CC
C → cC | d
Consider the sets of LR(1) items of this grammar.
We can find the LALR states by merging CLR states that
have the same set of first components.
There are three pairs of sets of items that can be
merged.
Constructing LALR Parsing Table
I3 and I6 are replaced by their union.
I36 : C → c.C, c|d|$
C → .cC, c|d|$
C → .d, c|d|$
I4 and I7 are replaced by their union.
I47 : C → d., c|d|$
and I8 and I9 are replaced by their union.
I89 : C → cC., c|d|$
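The merging step above can be sketched directly: model each LR(1) state as a map from item cores (production plus dot position, written as plain strings here, an illustrative encoding) to lookahead sets; two states share a core when their key sets are equal, and merging unions the lookaheads. I3 and I6 are hard-coded from the text:

```python
# Merging two LR(1) states with a common core into one LALR state.
I3 = {"C -> c.C": {"c", "d"}, "C -> .cC": {"c", "d"}, "C -> .d": {"c", "d"}}
I6 = {"C -> c.C": {"$"},      "C -> .cC": {"$"},      "C -> .d": {"$"}}

def same_core(s, t):
    # the core is the set of (production, dot) pairs, i.e. the keys
    return set(s) == set(t)

def merge(*states):
    merged = {}
    for state in states:
        for core, lookaheads in state.items():
            merged.setdefault(core, set()).update(lookaheads)
    return merged

assert same_core(I3, I6)
I36 = merge(I3, I6)
print(sorted(I36["C -> .d"]))   # ['$', 'c', 'd']
```

This reproduces I36 above: the cores are unchanged and only the lookahead sets have grown to c|d|$.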
Constructing LALR Parsing Table
In the original set of LR(1) items, GOTO(I3, C) =
I8 and I8 is now part of I89, so we make
GOTO(I36, C) be I89.
Similarly GOTO(I6, C) = I9 and I9 is now part of
I89, so we make GOTO(I36, C) = I89.
Thus the LALR ACTION and GOTO functions for
the considered sets of items are shown in the
table.
Constructing LALR Parsing Table
So final LALR table is as follows:
              ACTION table             GOTO table
State       c       d       $          S      C
  0         S36     S47                1      2
  1                         Accept
  2         S36     S47                       5
 36         S36     S47                       89
 47         r3      r3      r3
  5                         r1
 89         r2      r2      r2
Constructing LALR Parsing Table
• We may note the following facts about the behaviour
of LALR(1) parsing table:
– An LALR(1) parsing table is always free of shift/reduce
conflicts, provided the corresponding LR(1) table is free
of shift/reduce conflicts.
– An LALR(1) parsing table may have a reduce/reduce conflict
even if the corresponding LR(1) table is conflict free.
– For a correct input string of the language, both the LALR
and SLR parsers produce the same sequence of shifts and
reduces.
– For erroneous input, an LALR parser may continue to
perform reductions even after the CLR parser has detected
the error, but it never shifts any symbol beyond the point
at which the CLR parser declared the error.
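The second fact can be seen in miniature with the classic textbook grammar S → aAd | bBd | aBe | bAe, A → c, B → c: the two LR(1) states reached after "ac" and after "bc" are individually conflict free and share a core, but their union reduces both A → c and B → c on the same lookaheads. A small sketch (illustrative state encoding):

```python
# Two conflict-free LR(1) states whose LALR merger has a
# reduce/reduce conflict; cores map to lookahead sets.
state_after_ac = {"A -> c.": {"d"}, "B -> c.": {"e"}}
state_after_bc = {"A -> c.": {"e"}, "B -> c.": {"d"}}

def merge(s, t):
    # states share a core, so union lookaheads key by key
    return {core: s[core] | t[core] for core in s}

def reduce_reduce_conflicts(state):
    """Lookaheads on which two distinct completed items both reduce."""
    seen, conflicts = {}, set()
    for core, lookaheads in state.items():
        for la in lookaheads:
            if la in seen and seen[la] != core:
                conflicts.add(la)
            seen[la] = core
    return conflicts

assert not reduce_reduce_conflicts(state_after_ac)
merged = merge(state_after_ac, state_after_bc)
print(sorted(reduce_reduce_conflicts(merged)))   # ['d', 'e']
```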