Syntax Analysis - Bottom Up Parsers - LR - Parsers - KR - Notes

The document outlines the architecture and functioning of LR parsers, including the driver program, ACTION and GOTO tables, and the parsing process. It details the construction of LR(0) and SLR parsing tables, emphasizing the steps involved in creating the canonical collection of LR(0) items and the parsing actions based on grammar symbols. Additionally, it introduces CLR parsing and the construction of LR(1) items, highlighting the use of lookahead symbols for more precise parsing.

Uploaded by

aditi21paul
Copyright
© © All Rights Reserved

LR Parser architecture

• Every LR parser has a driver program. It reads the input from left to right and uses a stack to
keep track of the states of the LR automaton, together with an ACTION table and a GOTO table. All LR
parsers have the same architecture and use the same driver program. They differ only
in how their tables are constructed.
• The states of the LR automaton, which describe the stack content, are numbered by small
integers. So are the productions.
• The value of ACTION[i, a], where i is a state and a is a terminal (or the end marker $), can have one of
four forms:
o Shift j, written sj, tells the parser to shift the input symbol and push state j onto the stack.
o Reduce j, written rj, tells the parser to reduce by production j, i.e., to pop off
as many states from the stack as there are symbols in the body of production j,
exposing some state t on top of the stack. The parser then pushes onto the stack the state
determined by GOTO[t, A], where A is the head of production j.
o Accept, written acc, tells the parser to accept the input and finish parsing.
o Error, shown as a blank entry, tells the parser that there is an error in the input and
that it should take corrective action.
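
The driver loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the names lr_parse, ACTION, GOTO and PRODUCTIONS are assumptions made for this sketch, and the table entries are the LR(0) entries for the grammar S → AA, A → aA | b that these notes work out in the LR(0) table example.

```python
# Productions: (1) S -> AA, (2) A -> aA, (3) A -> b.
# Each entry maps a production number to (head, length of body).
PRODUCTIONS = {1: ("S", 2), 2: ("A", 2), 3: ("A", 1)}

# ACTION[(state, terminal)] -> ("s", j) | ("r", j) | ("acc", None); missing = error.
ACTION = {
    (0, "a"): ("s", 3), (0, "b"): ("s", 4),
    (1, "$"): ("acc", None),
    (2, "a"): ("s", 3), (2, "b"): ("s", 4),
    (3, "a"): ("s", 3), (3, "b"): ("s", 4),
}
# States 4, 5 and 6 hold final items, so an LR(0) table reduces in the whole row.
for state, production in ((4, 3), (5, 1), (6, 2)):
    for terminal in "ab$":
        ACTION[(state, terminal)] = ("r", production)

GOTO = {(0, "S"): 1, (0, "A"): 2, (2, "A"): 5, (3, "A"): 6}


def lr_parse(tokens):
    """Return the production numbers used (in order), or None on a syntax error."""
    stack = [0]                      # stack of automaton states
    buffer = list(tokens) + ["$"]    # input followed by the end marker
    pos = 0
    reductions = []
    while True:
        entry = ACTION.get((stack[-1], buffer[pos]))
        if entry is None:            # blank entry: error
            return None
        kind, arg = entry
        if kind == "s":              # shift: push state j, advance the input
            stack.append(arg)
            pos += 1
        elif kind == "r":            # reduce: pop |body| states, then use GOTO
            head, body_len = PRODUCTIONS[arg]
            del stack[len(stack) - body_len:]
            stack.append(GOTO[(stack[-1], head)])
            reductions.append(arg)
        else:                        # accept
            return reductions


print(lr_parse("abb"))   # production numbers in reduction order
print(lr_parse("ab"))    # None: the input is not in the language
```

The same loop works unchanged for SLR, CLR and LALR tables; only the contents of ACTION and GOTO differ.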

Sample parsing execution

• The trace (figure not reproduced here) shows the execution of the SLR parser on the input id * id + id.


An LR(0) parser is a shift-reduce parser that works bottom-up, driven by a parsing table constructed from the
given grammar. It is a simpler variant of the LR(1) parser: it does not take lookahead
symbols into account when constructing the parsing table. This makes its tables smaller and easier to build than
canonical LR(1) tables, but it can handle fewer grammars.
To construct the parsing table for an LR(0) parser, we follow these steps:

1. Augment the grammar:
Add a new start symbol S' and a new production S' -> S, where S is the original start symbol.
2. Compute the closure of the initial item:
The initial state of the LR(0) parser is the closure of the item [S' -> .S]. The closure adds, for every item with
the dot immediately before a nonterminal B, the items [B -> .γ] for all productions B -> γ, until nothing new can be added.
3. Compute the transition function: For each state and each terminal and non-terminal symbol X, move the dot
over X in every item that has X right after the dot, and take the closure of the result. The transition function
tells us which state to go to when we see a particular symbol.
4. Generate the parsing table: The parsing table is a matrix that contains the action to be taken for each state
and input symbol. The actions are shift (push the next state), reduce (apply a production), accept, or error.
If a table entry would receive more than one action, the grammar is not LR(0); practical parser generators may
resolve such conflicts with precedence rules.
5. Repeat steps 2-4 for every newly discovered state until no new states appear and the parsing table is complete.

In summary, an LR(0) parser is a simpler variant of the LR(1) parser whose parsing table is constructed without
considering lookahead symbols. This makes the table cheaper to build but the parser less powerful. By following the
steps outlined above, we can construct an LR(0) parsing table for a given grammar.
Canonical Collection of LR(0) items
If a state (Ii) goes to some other state (Ij) on a terminal, this corresponds to a shift move in the
Action part.

If a state (Ii) goes to some other state (Ij) on a variable (nonterminal), this corresponds to a goto move in the
Go to part.

If a state (Ii) contains a final item such as A → αβ•, which has no transition to a next state, then the
production is known as a reduce production. In an SLR(1) table, write the reduce entry with its production
number for all terminals X in FOLLOW(A).
Example
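Given the items
1. S -> •Aa
2. A -> αβ•
and the FOLLOW sets
1. FOLLOW(S) = {$}
2. FOLLOW(A) = {a}
the reduce entry for A -> αβ is written only under the terminal a.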
Example
Given grammar:

1. S → AA
2. A → aA | b

LR(0) Table
o If a state goes to some other state on a terminal, this corresponds to a shift move.
o If a state goes to some other state on a variable, this corresponds to a goto move.
o If a state contains a final item, write the reduce entry in every Action column of that row.

Explanation:
o I0 on S goes to I1, so write 1.
o I0 on A goes to I2, so write 2.
o I2 on A goes to I5, so write 5.
o I3 on A goes to I6, so write 6.
o I0, I2 and I3 on a go to I3, so write S3, which means shift 3.
o I0, I2 and I3 on b go to I4, so write S4, which means shift 4.
o I4, I5 and I6 all contain a final item, because the • is at the rightmost end. So
write the reduce entry using the production number.
Productions are numbered as follows:
1. S → AA ... (1)
2. A → aA ... (2)
3. A → b ... (3)
o I1 contains the final item S' → S•, so action {I1, $} = Accept.
o I4 contains the final item A → b•, which corresponds to
production number 3, so write r3 in the entire row.
o I5 contains the final item S → AA•, which corresponds to
production number 1, so write r1 in the entire row.
o I6 contains the final item A → aA•, which corresponds to
production number 2, so write r2 in the entire row.
******* More Notes **********
SLR Parser
An LR(0) item (or just item) of a grammar G is a production of G with a dot at some position of the right
side indicating how much of a production we have seen up to a given point.
For example, for the production E -> E + T we would have the following items:
[E -> .E + T]
[E -> E. + T]
[E -> E +. T]
[E -> E + T.]
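
As a small illustration of the definition above, an item can be represented as a production plus a dot position. The following is a sketch only; the tuple layout and the names lr0_items and show are assumptions made for this example.

```python
def lr0_items(head, body):
    """All LR(0) items of one production: one item per possible dot position."""
    return [(head, tuple(body), dot) for dot in range(len(body) + 1)]


def show(item):
    """Render an item as text, with '.' marking the dot position."""
    head, body, dot = item
    symbols = list(body)
    symbols.insert(dot, ".")
    return f"[{head} -> {' '.join(symbols)}]"


items = lr0_items("E", ["E", "+", "T"])
for item in items:
    print(show(item))
```

A production with n symbols in its body always has n + 1 items, so E -> E + T yields the four items listed above.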

Constructing the SLR Parsing Table


The states in the LR table will be the ε-closures of the states corresponding to the items:
the process of creating the LR state table parallels the process of constructing an equivalent DFA from a
machine with ε-transitions.
You need two operations: closure() and goto().
closure()
If I is a set of items for a grammar G, then closure(I) is the set of items constructed from I by the two rules:
1. Initially, every item in I is added to closure(I).
2. If A -> α.Bβ is in closure(I), and B -> γ is a production, then add the initial item [B -> .γ] to closure(I), if it is not
already there. Apply this rule until no more new items can be added to closure(I).
For the augmented expression grammar E' -> E, E -> E + T | T, T -> T * F | F, F -> (E) | id, if I is the set of one
item {[E' -> .E]}, then closure(I) contains:
I0: E' -> .E
    E -> .E + T
    E -> .T
    T -> .T * F
    T -> .F
    F -> .(E)
    F -> .id
goto()
goto(I, X), where I is a set of items and X is a grammar symbol, is defined to be the closure of the set of all items
[A -> αX.β] such that [A -> α.Xβ] is in I.
The idea here is fairly intuitive: if I is the set of items that are valid for some viable prefix γ, then goto(I, X) is the
set of items that are valid for the viable prefix γX.
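
The two operations can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the grammar is the augmented expression grammar whose closure is listed above, items are (head, body, dot) tuples, and the names GRAMMAR, closure and goto are chosen for this example.

```python
# Augmented expression grammar: nonterminal -> list of production bodies.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items):
    """Add [B -> .gamma] for every nonterminal B right after a dot, until stable."""
    result = set(items)
    worklist = list(items)
    while worklist:
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:   # dot is before a nonterminal
            for production in GRAMMAR[body[dot]]:
                new_item = (body[dot], production, 0)
                if new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` wherever possible, then take the closure."""
    moved = {(h, b, d + 1) for (h, b, d) in items if d < len(b) and b[d] == symbol}
    return closure(moved)


I0 = closure({("E'", ("E",), 0)})
print(len(I0))                 # number of items in I0
print(sorted(goto(I0, "E")))   # items valid after seeing E
```

Returning a frozenset makes each item set hashable and comparable, which is convenient when item sets later become the states of the automaton.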

Algorithm for Sets-of-Items-Construction


To construct the canonical collection of sets of LR(0) items for
augmented grammar G'.
procedure items(G')
begin
    C := {closure({[S' -> .S]})};
    repeat
        for each set of items I in C and each grammar symbol X
            such that goto(I, X) is not empty and not in C do
                add goto(I, X) to C;
    until no more sets of items can be added to C
end;
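
A runnable sketch of the procedure above, applied to the grammar S → AA, A → aA | b from the LR(0) example in these notes. The names GRAMMAR, SYMBOLS and items are assumptions made for this sketch; the symbol order S, A, a, b is chosen so that the states come out numbered I0 to I6 as in that example.

```python
GRAMMAR = {"S'": [("S",)], "S": [("A", "A")], "A": [("a", "A"), ("b",)]}
SYMBOLS = ["S", "A", "a", "b"]   # every grammar symbol X tried by goto(I, X)


def closure(item_set):
    result = set(item_set)
    worklist = list(item_set)
    while worklist:
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for production in GRAMMAR[body[dot]]:
                new_item = (body[dot], production, 0)
                if new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


def goto(item_set, symbol):
    moved = {(h, b, d + 1) for (h, b, d) in item_set if d < len(b) and b[d] == symbol}
    return closure(moved)


def items():
    """Canonical collection of sets of LR(0) items, in order of discovery."""
    collection = [closure({("S'", ("S",), 0)})]
    for item_set in collection:          # the list grows while we iterate over it
        for symbol in SYMBOLS:
            target = goto(item_set, symbol)
            if target and target not in collection:
                collection.append(target)
    return collection


C = items()
for number, item_set in enumerate(C):
    print(f"I{number}:", sorted(item_set))
```

Iterating over the list while appending to it plays the role of the repeat ... until loop: every newly added set is eventually visited, and the loop ends when no goto produces a new set.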
Algorithm for Constructing an SLR Parsing Table
Input: augmented grammar G'
Output: SLR parsing table functions action and goto for G'
Method:
1. Construct C = {I0, I1, ..., In}, the collection of sets of LR(0) items for G'.
2. State i is constructed from Ii. The parsing actions for state i are determined as follows:
   a. If [A -> α.aβ] is in Ii and goto(Ii, a) = Ij, then set action[i, a] to "shift j". Here a must be a terminal.
   b. If [A -> α.] is in Ii, then set action[i, a] to "reduce A -> α" for all a in FOLLOW(A). Here A may
      not be S'.
   c. If [S' -> S.] is in Ii, then set action[i, $] to "accept".
   If any conflicting actions are generated by these rules, the grammar is not SLR(1).
3. The goto transitions for state i are constructed for all nonterminals A using the rule:
   If goto(Ii, A) = Ij, then goto[i, A] = j.
4. All entries not defined by rules 2 and 3 are made "error".
5. The initial state of the parser is the one constructed from the set of items containing [S' -> .S].
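
The method above can be sketched in Python for the grammar S → AA, A → aA | b. This is a sketch under assumptions: items are (production index, dot) pairs, the FOLLOW sets are worked out by hand rather than computed, and all function and variable names are illustrative.

```python
# Production 0 is the augmented production S' -> S.
PRODUCTIONS = [("S'", ("S",)), ("S", ("A", "A")), ("A", ("a", "A")), ("A", ("b",))]
NONTERMINALS = ["S", "A"]
TERMINALS = ["a", "b"]
# Assumption: FOLLOW sets computed by hand for this grammar.
FOLLOW = {"S": {"$"}, "A": {"a", "b", "$"}}


def closure(items):
    result = set(items)
    changed = True
    while changed:
        changed = False
        for prod, dot in list(result):
            body = PRODUCTIONS[prod][1]
            if dot < len(body):
                for q, (head, _) in enumerate(PRODUCTIONS):
                    if head == body[dot] and (q, 0) not in result:
                        result.add((q, 0))
                        changed = True
    return frozenset(result)


def goto(items, symbol):
    return closure({(p, d + 1) for p, d in items
                    if d < len(PRODUCTIONS[p][1]) and PRODUCTIONS[p][1][d] == symbol})


def canonical_collection():
    states = [closure({(0, 0)})]
    for state in states:                       # list grows during iteration
        for symbol in NONTERMINALS + TERMINALS:
            target = goto(state, symbol)
            if target and target not in states:
                states.append(target)
    return states


def set_action(action, key, value):
    """Fill one ACTION cell; two different values mean the grammar is not SLR(1)."""
    if action.get(key, value) != value:
        raise ValueError(f"conflict at {key}: {action[key]} vs {value}")
    action[key] = value


def build_slr_table():
    states = canonical_collection()
    action, goto_table = {}, {}
    for i, state in enumerate(states):
        for symbol in NONTERMINALS + TERMINALS:
            target = goto(state, symbol)
            if not target:
                continue
            j = states.index(target)
            if symbol in TERMINALS:
                set_action(action, (i, symbol), ("shift", j))      # rule 2a
            else:
                goto_table[(i, symbol)] = j                        # rule 3
        for prod, dot in state:
            head, body = PRODUCTIONS[prod]
            if dot == len(body):
                if prod == 0:
                    set_action(action, (i, "$"), ("accept", None)) # rule 2c
                else:
                    for a in FOLLOW[head]:                         # rule 2b
                        set_action(action, (i, a), ("reduce", prod))
    return action, goto_table


action, goto_table = build_slr_table()
print(action[(5, "$")], (5, "a") in action)
```

Compared with the LR(0) table for the same grammar, state 5 (final item S -> AA.) now reduces only on $, because FOLLOW(S) = {$}; entries missing from the dictionaries are the "error" entries of rule 4.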
An Example

Grammar:
(1) E -> E * B
(2) E -> E + B
(3) E -> B
(4) B -> 0
(5) B -> 1

The Action and Goto Table
[Tables not reproduced: the two LR(0) parsing tables (action and goto) for this grammar.]

Example LR(0) automaton
[Figure not reproduced.]

SLR parsing table


Algorithm for constructing an SLR table:

1. Construct the collection of sets of LR(0) items for G'.
2. Determine the parsing actions for each state Ii as follows:
   (a) If a is a terminal and [A → α●aβ] is in Ii,
       and GOTO(Ii,a) = Ij, then set ACTION[i,a] to "shift j".
   (b) If A is a nonterminal different from S' and [A → α●]
       is in Ii, then set ACTION[i,a] to "reduce A → α" for all
       a ∈ FOLLOW(A).
   (c) If [S' → S●] is in Ii, then set ACTION[i,$] to "accept".
3. For every nonterminal A, if GOTO(Ii,A) = Ij, set GOTO[i,A] to j.
4. All entries not defined by rules (2) and (3) above are "error".
5. The initial state of the parser is the one constructed from
   the set of items containing [S' → ●S].
If any conflicting actions result from rule 2, the grammar is not SLR(1).
CLR Parsing
CLR stands for canonical LR. CLR parsing uses the canonical collection of LR(1) items to build
the CLR(1) parsing table. The CLR(1) parsing table generally has more states than
the SLR(1) parsing table.
In CLR(1), we place a reduce entry only under the lookahead symbols of the final item.
Various steps involved in the CLR(1) Parsing:

o For the given input string, write a context-free grammar
o Check the ambiguity of the grammar
o Add the augment production to the given grammar
o Create the canonical collection of LR(1) items
o Draw the deterministic finite automaton (DFA)
o Construct the CLR(1) parsing table
LR (1) item
An LR(1) item is an LR(0) item together with a lookahead symbol.
LR (1) item = LR (0) item + look ahead
The lookahead determines where the reduce entry for a final item is placed.
The augment production always has $ as its lookahead.
Let us take an example and understand CLR Parsing.
Consider the grammar
S → AA
A → aA/b
and construct the CLR(1) parser for the given grammar.
Solution:
The augmented grammar is:
S' → S
S → AA
A → aA/b

I0 = Closure (S' → .S, $):
    S' → .S, $
    S → .AA, $
    A → .aA, a/b
    A → .b, a/b

Goto (I0, S) = I1:
    S' → S., $
Goto (I0, A) = I2:
    S → A.A, $
    A → .aA, $
    A → .b, $
Goto (I0, a) = I3:
    A → a.A, a/b
    A → .aA, a/b
    A → .b, a/b
Goto (I0, b) = I4:
    A → b., a/b
Goto (I2, A) = I5:
    S → AA., $
Goto (I2, a) = I6:
    A → a.A, $
    A → .aA, $
    A → .b, $
Goto (I2, b) = I7:
    A → b., $
Goto (I3, A) = I8:
    A → aA., a/b
Goto (I3, a) = I3
Goto (I3, b) = I4
Goto (I6, A) = I9:
    A → aA., $
Goto (I6, a) = I6
Goto (I6, b) = I7
SLR parsing places reduce entries using FOLLOW sets, but FOLLOW sets are sometimes too coarse: within a particular
state they cannot discriminate between different reductions, or between a shift and a reduction, which leads to
shift/reduce and reduce/reduce conflicts.
The general form of an LR(k) item becomes [A -> α.β, s] where A -> αβ is a production and s is a string of
terminals. The first part (A -> α.β) is called the core and the second part is the lookahead. In LR(1), |s| is 1, so s is
a single terminal.

ALGORITHM FOR CONSTRUCTION OF THE SETS OF LR(1) ITEMS


Input: grammar G'
Output: sets of LR(1) items that are the set of items valid for one or more viable prefixes of G'
Method:
closure(I)
begin
    repeat
        for each item [A -> α.Bβ, a] in I,
            each production B -> γ in G',
            and each terminal b in FIRST(βa)
            such that [B -> .γ, b] is not in I do
                add [B -> .γ, b] to I;
    until no more items can be added to I;
    return I;
end;

goto(I, X)
begin
    let J be the set of items [A -> αX.β, a] such that
        [A -> α.Xβ, a] is in I;
    return closure(J);
end;

procedure items(G')
begin
    C := {closure({[S' -> .S, $]})};
    repeat
        for each set of items I in C and each grammar symbol X such
            that goto(I, X) is not empty and not in C do
                add goto(I, X) to C
    until no more sets of items can be added to C;
end;
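
The three procedures above can be sketched in Python for the grammar S → AA, A → aA | b. This is a sketch under assumptions: the grammar has no ε-productions, so FIRST(βa) depends only on the first symbol of βa; items are (production index, dot, lookahead) triples; all names are illustrative.

```python
PRODUCTIONS = [("S'", ("S",)), ("S", ("A", "A")), ("A", ("a", "A")), ("A", ("b",))]
NONTERMINALS = {"S'", "S", "A"}
SYMBOLS = ["S", "A", "a", "b"]


def first_sets():
    """FIRST of every nonterminal. Assumes no epsilon-productions."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for head, body in PRODUCTIONS:
            x = body[0]
            new = first[x] if x in NONTERMINALS else {x}
            if not new <= first[head]:
                first[head] |= new
                changed = True
    return first


FIRST = first_sets()


def first_of(symbols):
    """FIRST of the non-empty string beta+a; without epsilon only symbols[0] matters."""
    x = symbols[0]
    return FIRST[x] if x in NONTERMINALS else {x}


def closure(items):
    result = set(items)
    worklist = list(items)
    while worklist:
        prod, dot, look = worklist.pop()
        body = PRODUCTIONS[prod][1]
        if dot == len(body) or body[dot] not in NONTERMINALS:
            continue
        # For [A -> alpha . B beta, a]: add [B -> . gamma, b] for b in FIRST(beta a).
        for b in first_of(body[dot + 1:] + (look,)):
            for q, (head, _) in enumerate(PRODUCTIONS):
                item = (q, 0, b)
                if head == body[dot] and item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def goto(items, symbol):
    moved = {(p, d + 1, a) for p, d, a in items
             if d < len(PRODUCTIONS[p][1]) and PRODUCTIONS[p][1][d] == symbol}
    return closure(moved)


def items():
    collection = [closure({(0, 0, "$")})]
    for item_set in collection:                # list grows during iteration
        for symbol in SYMBOLS:
            target = goto(item_set, symbol)
            if target and target not in collection:
                collection.append(target)
    return collection


C = items()
print(len(C))         # number of LR(1) states
print(sorted(C[0]))   # I0, with one triple per (core, lookahead) pair
```

Each lookahead is stored as a separate triple, so an entry written as A → .b, a/b in these notes corresponds to two triples with the same core.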

ALGORITHM FOR CONSTRUCTION OF THE CANONICAL LR PARSING TABLE


Input: grammar G'
Output: canonical LR parsing table functions action and goto
1. Construct C = {I0, I1, ..., In}, the collection of sets of LR(1) items for G'. State i is constructed from Ii.
2. If [A -> α.aβ, b] is in Ii and goto(Ii, a) = Ij, then set action[i, a] to "shift j". Here a must be a terminal.
3. If [A -> α., a] is in Ii, then set action[i, a] to "reduce A -> α". The reduce entry is placed only under the lookahead a, not under all of FOLLOW(A). Here A may not be S'.
4. If [S' -> S., $] is in Ii, then set action[i, $] to "accept".
5. If any conflicting actions are generated by these rules, the grammar is not LR(1) and the algorithm fails to
produce a parser.
6. The goto transitions for state i are constructed for all nonterminals A using the rule: If goto(Ii, A) = Ij, then
goto[i, A] = j.
7. All entries not defined by rules 2 to 4 and 6 are made "error".
8. The initial state of the parser is the one constructed from the set of items containing [S' -> .S, $].
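
To make rules 2 to 6 concrete, the sketch below fills the tables from an already computed LR(1) collection instead of building the collection itself. The transitions and final items are hand-copied from the worked example for S → AA, A → aA | b in these notes (states I0 to I9); the data layout and the names TRANSITIONS, COMPLETE_ITEMS and build_clr_table are assumptions made for this illustration.

```python
TERMINALS = {"a", "b", "$"}

# goto(Ii, X) = Ij for the ten LR(1) states I0..I9 of the worked example.
TRANSITIONS = {
    (0, "S"): 1, (0, "A"): 2, (0, "a"): 3, (0, "b"): 4,
    (2, "A"): 5, (2, "a"): 6, (2, "b"): 7,
    (3, "A"): 8, (3, "a"): 3, (3, "b"): 4,
    (6, "A"): 9, (6, "a"): 6, (6, "b"): 7,
}

# Final items per state as (production number, lookahead).
# Production 0 is S' -> S, 1 is S -> AA, 2 is A -> aA, 3 is A -> b.
COMPLETE_ITEMS = {
    1: [(0, "$")],
    4: [(3, "a"), (3, "b")],
    5: [(1, "$")],
    7: [(3, "$")],
    8: [(2, "a"), (2, "b")],
    9: [(2, "$")],
}


def build_clr_table():
    action, goto = {}, {}

    def put(key, value):
        # Rule 5: two different actions in one cell mean the grammar is not LR(1).
        if action.setdefault(key, value) != value:
            raise ValueError(f"conflict at {key}: {action[key]} vs {value}")

    for (state, symbol), target in TRANSITIONS.items():
        if symbol in TERMINALS:
            put((state, symbol), ("shift", target))       # rule 2
        else:
            goto[(state, symbol)] = target                # rule 6
    for state, final_items in COMPLETE_ITEMS.items():
        for production, lookahead in final_items:
            if production == 0:
                put((state, "$"), ("accept", None))       # rule 4
            else:
                put((state, lookahead), ("reduce", production))  # rule 3
    return action, goto


action, goto = build_clr_table()
print(action[(4, "a")], (4, "$") in action)
print(action[(7, "$")], (7, "a") in action)
```

States 4 and 7 share the core A → b• but differ in lookahead, so they reduce under different terminals; this is the distinction that a FOLLOW-based SLR table cannot express.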
Another Example
CLR ( 1 ) Grammar
1. S → AA
2. A → aA
3. A → b
Add the augment production, insert the '•' symbol at the first position of every production in G, and add the
lookahead.

1. S' → •S, $
2. S → •AA, $
3. A → •aA, a/b
4. A → •b, a/b
I0 State:
Add the augment production to the I0 state and compute the closure:
I0 = Closure (S' → •S, $)
Add all productions starting with S to the I0 state, because "•" is followed by the non-terminal S. Nothing follows S
in S' → •S, so the lookahead is $. So, the I0 state
becomes
I0 = S' → •S, $
     S → •AA, $
Add all productions starting with A to the modified I0 state, because "•" is followed by the non-terminal A. The
lookahead is FIRST(A$) = {a, b}. So, the
I0 state becomes
I0 = S' → •S, $
     S → •AA, $
     A → •aA, a/b
     A → •b, a/b
I1 = Go to (I0, S) = closure (S' → S•, $) = S' → S•, $
I2 = Go to (I0, A) = closure (S → A•A, $)
Add all productions starting with A to the I2 state, because "•" is followed by the non-terminal. So, the I2 state
becomes
I2 = S → A•A, $
     A → •aA, $
     A → •b, $
I3 = Go to (I0, a) = Closure (A → a•A, a/b)
Add all productions starting with A to the I3 state, because "•" is followed by the non-terminal. So, the I3 state
becomes
I3 = A → a•A, a/b
     A → •aA, a/b
     A → •b, a/b
Go to (I3, a) = Closure (A → a•A, a/b) = (same as I3)
Go to (I3, b) = Closure (A → b•, a/b) = (same as I4)
I4= Go to (I0, b) = closure ( A → b•, a/b) = A → b•, a/b
I5= Go to (I2, A) = Closure (S → AA•, $) =S → AA•, $
I6= Go to (I2, a) = Closure (A → a•A, $)
Add all productions starting with A in I6 State because "." is followed by the non-terminal. So, the I6 State
becomes
I6 = A → a•A, $
A → •aA, $
A → •b, $
Go to (I6, a) = Closure (A → a•A, $) = (same as I6)
Go to (I6, b) = Closure (A → b•, $) = (same as I7)
I7= Go to (I2, b) = Closure (A → b•, $) = A → b•, $
I8= Go to (I3, A) = Closure (A → aA•, a/b) = A → aA•, a/b
I9= Go to (I6, A) = Closure (A → aA•, $) = A → aA•, $
Drawing DFA:

CLR (1) Parsing table:

Productions are numbered as follows:

1. S → AA ... (1)
2. A → aA ... (2)
3. A → b ... (3)

The placement of shift entries in the CLR(1) parsing table is the same as in the SLR(1) parsing table. The only
difference is in the placement of reduce entries.
I4 contains the final item (A → b•, a/b), so action {I4, a} = R3, action {I4, b} = R3.
I5 contains the final item (S → AA•, $), so action {I5, $} = R1.
I7 contains the final item (A → b•, $), so action {I7, $} = R3.
I8 contains the final item (A → aA•, a/b), so action {I8, a} = R2, action {I8, b} = R2.
I9 contains the final item (A → aA•, $), so action {I9, $} = R2.
