0% found this document useful (0 votes)

28 views10 pages

Automata Compiler Design U5

The document discusses canonical LR parsing and the construction of LR parsing tables. It describes how canonical LR parsing splits states to have each state indicate the exact input symbols that can follow a handle. It then provides the general form of an LR(k) item and describes how LR(1) items have a lookahead of length 1. It also gives the algorithm for constructing the sets of LR(1) items and describes how this is used to construct the canonical LR parsing table. Finally, it briefly discusses LALR parsing and how this can merge states to reduce the number of states compared to LR(1) parsing.

Uploaded by

chintu

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

28 views10 pages

Automata Compiler Design U5

Uploaded by

chintu

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 10

Smartworld.asia Specworld.

UNIT -5

5.1 CANONICAL LR PARSING

By splitting states when necessary, we can arrange to have each state of an LR parser
indicate exactly which input symbols can follow a handle a for which there is a possible
reduction to A. As the text points out, sometimes the FOLLOW sets give too much
informationand doesn't (can't) discriminate between different reductions.
The general form of an LR(k) item becomes [A -> a.b, s] where A -> ab is a production and s
is a string of terminals. The first part (A -> a.b) is called the core and the second part is the
lookahead. In LR(1) |s| is 1, so s is a single terminal.
A -> ab is the usual righthand side with a marker; any a in s is an incoming token in which
we are interested. Completed items used to be reduced for every incoming token in
FOLLOW(A), but now we will reduce only if the next input token is in the lookahead set s..if
we get two productions A -> a and B -> a, we can tell them apart when a is a handle on the
stack if the corresponding completed items have different lookahead parts. Furthermore, note
that the lookahead has no effect for an item of the form [A -> a.b, a] if b is not e. Recall that
our problem occurs for completed items, so what we have done now is to say that an item of
the form [A -> a., a] calls for a reduction by A -> a only if the next input symbol is a. More
formally, an LR(1) item [A -> a.b, a] is valid for a viable prefix g if there is a derivation
S =>* s abw, where g = sa, and either a is the first symbol of w, or w is e and a is $.
5.2 ALGORITHM FOR CONSTRUCTION OF THE SETS OF LR(1) ITEMS
Input: grammar G'
Output: sets of LR(1) items that are the set of items valid for one or more viable prefixes of
G'
Method:
closure(I)
begin
repeat
for each item [A -> a.Bb, a] in I,
each production B -> g in G',
and each terminal b in FIRST(ba)

Department of CSE
- 40 -

Smartzworld.com 1 jntuworldupdates.org
Smartworld.asia Specworld.in

such that [B -> .g, b] is not in I do

add [B -> .g, b] to I;
until no more items can be added to I;
end;
5.3 goto(I, X)
begin
let J be the set of items [A -> aX.b, a] such that
[A -> a.Xb, a] is in I
return closure(J);
end;
procedure items(G')
begin
C := {closure({S' -> .S, $})};
repeat
for each set of items I in C and each grammar symbol X such
that goto(I, X) is not empty and not in C do
add goto(I, X) to C
until no more sets of items can be added to C;
end;

An example,
Consider the following grammer,
S’->S
S->CC
C->cC
C->d
Sets of LR(1) items
I0: S’->.S,$
S->.CC,$
C->.Cc,c/d
C->.d,c/d

I1:S’->S.,$
I2:S->C.C,$
C->.Cc,$
C->.d,$

Department of CSE
- 41 -

Smartzworld.com 2 jntuworldupdates.org
Smartworld.asia Specworld.in

I3:C->c.C,c/d
C->.Cc,c/d
C->.d,c/d

I4: C->d.,c/d

I5: S->CC.,$

I6: C->c.C,$
C->.cC,$
C->.d,$

I7:C->d.,$

I8:C->cC.,c/d

I9:C->cC.,$

Here is what the corresponding DFA looks like

Department of CSE
- 42 -

Smartzworld.com 3 jntuworldupdates.org
Smartworld.asia Specworld.in

5.4 ALGORITHM FOR CONSTRUCTION OF THE CANONICAL LR PARSING

TABLE

Input: grammar G'

Output: canonical LR parsing table functions action and goto
1. Construct C = {I0, I1 , ..., In} the collection of sets of LR(1) items for G'.State i is
constructed from Ii.
2. if [A -> a.ab, b>] is in Ii and goto(Ii, a) = Ij, then set action[i, a] to "shift j". Here a
must be a terminal.
3. if [A -> a., a] is in Ii, then set action[i, a] to "reduce A -> a" for all a in
FOLLOW(A). Here A may not be S'.
4. if [S' -> S.] is in Ii, then set action[i, $] to "accept"
5. If any conflicting actions are generated by these rules, the grammar is not LR(1)
and the algorithm fails to produce a parser.
6. The goto transitions for state i are constructed for all nonterminals A using the
rule: If goto(Ii, A)= Ij, then goto[i, A] = j.
7. All entries not defined by rules 2 and 3 are made "error".
8. The inital state of the parser is the one constructed from the set of items
containing [S' -> .S, $].

Department of CSE
- 43 -

Smartzworld.com 4 jntuworldupdates.org
Smartworld.asia Specworld.in

5.5.LALR PARSER:
We begin with two observations. First, some of the states generated for LR(1) parsing have
the same set of core (or first) components and differ only in their second component, the
lookahead symbol. Our intuition is that we should be able to merge these states and reduce
the number of states we have, getting close to the number of states that would be generated
for LR(0) parsing. This observation suggests a hybrid approach: We can construct the
canonical LR(1) sets of items and then look for sets of items having the same core. We merge
these sets with common cores into one set of items. The merging of states with common
cores can never produce a shift/reduce conflict that was not present in one of the original
states because shift actions depend only on the core, not the lookahead. But it is possible for
the merger to produce a reduce/reduce conflict.
Our second observation is that we are really only interested in the lookahead symbol in
places where there is a problem. So our next thought is to take the LR(0) set of items and add
lookaheads only where they are needed. This leads to a more efficient, but much more
complicated method.
5.6 ALGORITHM FOR EASY CONSTRUCTION OF AN LALR TABLE
Input: G'
Output: LALR parsing table functions with action and goto for G'.
Method:
1. Construct C = {I0, I1 , ..., In} the collection of sets of LR(1) items for G'.
2. For each core present among the set of LR(1) items, find all sets having that core
and replace these sets by the union.
3. Let C' = {J0, J1 , ..., Jm} be the resulting sets of LR(1) items. The parsing actions
for state i are constructed from Ji in the same manner as in the construction of the
canonical LR parsing table.
4. If there is a conflict, the grammar is not LALR(1) and the algorithm fails.
5. The goto table is constructed as follows: If J is the union of one or more sets of
LR(1) items, that is, J = I0U I1 U ... U Ik, then the cores of goto(I0, X), goto(I1,
X), ..., goto(Ik, X) are the same, since I0, I1 , ..., Ik all have the same core. Let K
be the union of all sets of items having the same core asgoto(I1, X).

Department of CSE
- 44 -

Smartzworld.com 5 jntuworldupdates.org
Smartworld.asia Specworld.in

6. Then goto(J, X) = K.
Consider the above example,
I3 & I6 can be replaced by their union
I36:C->c.C,c/d/$
C->.Cc,C/D/$
C->.d,c/d/$
I47:C->d.,c/d/$
I89:C->Cc.,c/d/$
Parsing Table

state c d $ S C

0 S36 S47 1 2

1 Accept

2 S36 S47 5

36 S36 S47 89

47 R3 R3

5 R1

89 R2 R2 R2

5.7HANDLING ERRORS
The LALR parser may continue to do reductions after the LR parser would have spotted an
error, but the LALR parser will never do a shift after the point the LR parser would have
discovered the error and will eventually find the error.

5.8 DANGLING ELSE

The dangling else is a problem in computer programming in which an optional else clause in
an If–then(–else) statement results in nested conditionals being ambiguous. Formally,
the context-free grammar of the language is ambiguous, meaning there is more than one
correct parse tree.

Department of CSE
- 45 -

Smartzworld.com 6 jntuworldupdates.org
Smartworld.asia Specworld.in

In many programming languages one may write conditionally executed code in two forms:
the if-then form, and the if-then-else form – the else clause is optional:

Consider the grammar:

S ::= E $
E ::= E + E
|E*E
|(E)
| id
| num
and four of its LALR(1) states:
I0: S ::= . E $ ?
E ::= . E + E +*$ I1: S ::= E . $ ? I2: E ::= E * . E +*$
E ::= . E * E +*$ E ::= E . + E +*$ E ::= . E + E +*$
E ::= . ( E ) +*$ E ::= E . * E +*$ E ::= . E * E +*$
E ::= . id +*$ E ::= . ( E ) +*$
E ::= . num +*$ I3: E ::= E * E . +*$ E ::= . id +*$
E ::= E . + E +*$ E ::= . num +*$

Department of CSE
- 46 -

Smartzworld.com 7 jntuworldupdates.org
Smartworld.asia Specworld.in

E ::= E . * E +*$
Here we have a shift-reduce error. Consider the first two items in I3. If we have a*b+c and
we parsed a*b, do we reduce using E ::= E * E or do we shift more symbols? In the former
case we get a parse tree (a*b)+c; in the latter case we get a*(b+c). To resolve this conflict, we
can specify that * has higher precedence than +. The precedence of a grammar production is
equal to the precedence of the rightmost token at the rhs of the production. For example, the
precedence of the production E ::= E * E is equal to the precedence of the operator *, the
precedence of the production E ::= ( E ) is equal to the precedence of the token ), and the
precedence of the production E ::= if E then E else E is equal to the precedence of the token
else. The idea is that if the look ahead has higher precedence than the production currently
used, we shift. For example, if we are parsing E + E using the production rule E ::= E + E
and the look ahead is *, we shift *. If the look ahead has the same precedence as that of the
current production and is left associative, we reduce, otherwise we shift. The above grammar
is valid if we define the precedence and associativity of all the operators. Thus, it is very
important when you write a parser using CUP or any other LALR(1) parser generator to
specify associativities and precedence’s for most tokens (especially for those used as
operators). Note: you can explicitly define the precedence of a rule in CUP using the %prec
directive:
E ::= MINUS E %prec UMINUS
where UMINUS is a pseudo-token that has higher precedence than TIMES, MINUS etc, so
that -1*2 is equal to (-1)*2, not to -(1*2).
Another thing we can do when specifying an LALR(1) grammar for a parser generator is
error recovery. All the entries in the ACTION and GOTO tables that have no content
correspond to syntax errors. The simplest thing to do in case of error is to report it and stop
the parsing. But we would like to continue parsing finding more errors. This is called error
recovery. Consider the grammar:
S ::= L = E ;
| { SL } ;
| error ;
SL ::= S ;
| SL S ;

Department of CSE
- 47 -

Smartzworld.com 8 jntuworldupdates.org
Smartworld.asia Specworld.in

The special token error indicates to the parser what to do in case of invalid syntax for S (an
invalid statement). In this case, it reads all the tokens from the input stream until it finds the
first semicolon. The way the parser handles this is to first push an error state in the stack. In
case of an error, the parser pops out elements from the stack until it finds an error state where
it can proceed. Then it discards tokens from the input until a restart is possible. Inserting
error handling productions in the proper places in a grammar to do good error recovery is
considered very hard.
5.9LR ERROR RECOVERY
An LR parser will detect an error when it consults the parsing action table and find a blank or
error entry. Errors are never detected by consulting the goto table. An LR parser will detect
an error as soon as there is no valid continuation for the portion of the input thus far scanned.
A canonical LR parser will not make even a single reduction before announcing the error.
SLR and LALR parsers may make several reductions before detecting an error, but they will
never shift an erroneous input symbol onto the stack.
5.10 PANIC-MODE ERROR RECOVERY
We can implement panic-mode error recovery by scanning down the stack until a state s with
a goto on a particular nonterminal A is found. Zero or more input symbols are then discarded
until a symbol a is found that can legitimately follow A. The parser then stacks the state
GOTO(s, A) and resumes normal parsing. The situation might exist where there is more than
one choice for the nonterminal A. Normally these would be nonterminals representing major
program pieces, e.g. an expression, a statement, or a block. For example, if A is the
nonterminal stmt, a might be semicolon or }, which marks the end of a statement sequence.
This method of error recovery attempts to eliminate the phrase containing the syntactic error.
The parser determines that a string derivable from A contains an error. Part of that string has
already been processed, and the result of this processing is a sequence of states on top of the
stack. The remainder of the string is still in the input, and the parser attempts to skip over the
remainder of this string by looking for a symbol on the input that can legitimately follow A.
By removing states from the stack, skipping over the input, and pushing GOTO(s, A) on the
stack, the parser pretends that if has found an instance of A and resumes normal parsing.

Department of CSE
- 48 -

Smartzworld.com 9 jntuworldupdates.org
Smartworld.asia Specworld.in

5.11 PHRASE-LEVEL RECOVERY

Phrase-level recovery is implemented by examining each error entry in the LR action table
and deciding on the basis of language usage the most likely programmer error that would
give rise to that error. An appropriate recovery procedure can then be constructed;
presumably the top of the stack and/or first input symbol would be modified in a way deemed
appropriate for each error entry. In designing specific error-handling routines for an LR
parser, we can fill in each blank entry in the action field with a pointer to an error routine that
will take the appropriate action selected by the compiler designer.

The actions may include insertion or deletion of symbols from the stack or the input or both,
or alteration and transposition of input symbols. We must make our choices so that the LR
parser will not get into an infinite loop. A safe strategy will assure that at least one input
symbol will be removed or shifted eventually, or that the stack will eventually shrink if the
end of the input has been reached. Popping a stack state that covers a non terminal should be
avoided, because this modification eliminates from the stack a construct that has already been
successfully parsed.

Department of CSE
- 49 -

Smartzworld.com 10 jntuworldupdates.org

GATE Compiler Design 93-2009
67% (3)
GATE Compiler Design 93-2009
12 pages
CLR (1) LALR (1) Parsing
No ratings yet
CLR (1) LALR (1) Parsing
17 pages
Compiler Design - 2 Marks Question Set With Answers - TUTOR4CS PDF
89% (9)
Compiler Design - 2 Marks Question Set With Answers - TUTOR4CS PDF
16 pages
Bottom Up Parser
100% (1)
Bottom Up Parser
61 pages
Bottom Up Parsing - LR Parsers (LR (0), SLR, CLR and LALR Parsers)
0% (1)
Bottom Up Parsing - LR Parsers (LR (0), SLR, CLR and LALR Parsers)
7 pages
18 Miscellaneous Parsing
No ratings yet
18 Miscellaneous Parsing
8 pages
Bottom-Up Parsing Including LR (0), SLR
No ratings yet
Bottom-Up Parsing Including LR (0), SLR
55 pages
SLR Parser
No ratings yet
SLR Parser
18 pages
LR Parsing Table Costruction
100% (1)
LR Parsing Table Costruction
47 pages
LR Parsing: Dewan Tanvir Ahmed Assistant Professor, CSE Buet
No ratings yet
LR Parsing: Dewan Tanvir Ahmed Assistant Professor, CSE Buet
60 pages
Unit 3 - Sessions 6 - 7 (A)
No ratings yet
Unit 3 - Sessions 6 - 7 (A)
25 pages
Unit 3 CD
No ratings yet
Unit 3 CD
34 pages
LALR Parsers
No ratings yet
LALR Parsers
5 pages
LR (K) Parsing: CPSC 388 Ellen Walker Hiram College
No ratings yet
LR (K) Parsing: CPSC 388 Ellen Walker Hiram College
30 pages
N244 4 LR More
No ratings yet
N244 4 LR More
25 pages
Bottom Up Parsing1
No ratings yet
Bottom Up Parsing1
69 pages
SLR CLR Lalr
No ratings yet
SLR CLR Lalr
20 pages
Module-2-Second Half
No ratings yet
Module-2-Second Half
16 pages
LP Week6 LRparsingonly
No ratings yet
LP Week6 LRparsingonly
15 pages
LR (1) & Lalr Parser: Prepared By: Nitin Goyal - 04Cs1026 Jitendra Bindal - 04Cs1027
No ratings yet
LR (1) & Lalr Parser: Prepared By: Nitin Goyal - 04Cs1026 Jitendra Bindal - 04Cs1027
24 pages
LR (0) Parser
No ratings yet
LR (0) Parser
8 pages
SLR Parsing
No ratings yet
SLR Parsing
22 pages
Compiler Design Unit 3
No ratings yet
Compiler Design Unit 3
20 pages
CC 5
No ratings yet
CC 5
77 pages
CC LR Parser
No ratings yet
CC LR Parser
37 pages
Syntax Analysis - LR (1) and LALR (1) Parsing
No ratings yet
Syntax Analysis - LR (1) and LALR (1) Parsing
21 pages
SQL Assignment by U.diwakar 3
No ratings yet
SQL Assignment by U.diwakar 3
9 pages
LR 1 Notes
No ratings yet
LR 1 Notes
13 pages
Lecture 7 Introduction To LR Parsing Simple LR Part 1
No ratings yet
Lecture 7 Introduction To LR Parsing Simple LR Part 1
13 pages
Unit III CLR & LALR Additional Problems
No ratings yet
Unit III CLR & LALR Additional Problems
12 pages
Compiler Design Unit 4: LR (0) Example (Belongs To Unit 3)
No ratings yet
Compiler Design Unit 4: LR (0) Example (Belongs To Unit 3)
7 pages
SLR Parsing
No ratings yet
SLR Parsing
66 pages
S2 BottomUpParsing
No ratings yet
S2 BottomUpParsing
59 pages
6LR1 Lalr
No ratings yet
6LR1 Lalr
21 pages
Lrparser HaLrparser Handout Ndout
No ratings yet
Lrparser HaLrparser Handout Ndout
16 pages
Compiler Design Unit-2
No ratings yet
Compiler Design Unit-2
29 pages
Unit - 3 Syntax Analysis: 3.1 Role of The Parser
No ratings yet
Unit - 3 Syntax Analysis: 3.1 Role of The Parser
6 pages
Syntax Analysis - Bottom Up Parsers - LR - Parsers - KR - Notes
No ratings yet
Syntax Analysis - Bottom Up Parsers - LR - Parsers - KR - Notes
13 pages
Linux
No ratings yet
Linux
12 pages
Lalr Parser
No ratings yet
Lalr Parser
9 pages
Gr.2 Miniproject Cse4th CompilerD
No ratings yet
Gr.2 Miniproject Cse4th CompilerD
28 pages
CS346 Bottom Up Parser
No ratings yet
CS346 Bottom Up Parser
64 pages
Lec 9
No ratings yet
Lec 9
28 pages
Unit 02 - Part 03
No ratings yet
Unit 02 - Part 03
50 pages
07 - SLR Parsers - LR
No ratings yet
07 - SLR Parsers - LR
49 pages
LR Parsing Methods
No ratings yet
LR Parsing Methods
50 pages
Lecture3 Parser Full
No ratings yet
Lecture3 Parser Full
30 pages
04 - CALR Parsing
No ratings yet
04 - CALR Parsing
28 pages
LR (1) Parser
No ratings yet
LR (1) Parser
37 pages
LR Parser
No ratings yet
LR Parser
8 pages
CD Chapter-3
No ratings yet
CD Chapter-3
53 pages
Table-Driven Parsing: Tables
No ratings yet
Table-Driven Parsing: Tables
22 pages
CD Unit II Part II LR Parsers
No ratings yet
CD Unit II Part II LR Parsers
16 pages
Basic Parsing Technique
No ratings yet
Basic Parsing Technique
30 pages
M2 - P4 LR Parser
No ratings yet
M2 - P4 LR Parser
38 pages
Chapter 6 - Compiler Construction
No ratings yet
Chapter 6 - Compiler Construction
13 pages
Unit 3 LR (0) Parser
No ratings yet
Unit 3 LR (0) Parser
20 pages
Unit III
No ratings yet
Unit III
211 pages
MR LLK and LRK Parsers
No ratings yet
MR LLK and LRK Parsers
18 pages
Lec3 SyntaxAnalysis Part4
No ratings yet
Lec3 SyntaxAnalysis Part4
23 pages
LRParsing LRParserGenerator
No ratings yet
LRParsing LRParserGenerator
32 pages
General Framework: X X X X: LR Parser
No ratings yet
General Framework: X X X X: LR Parser
6 pages
Rajalakshmi Institute of Technology Chennai: Department of Computer Science and Engineering
No ratings yet
Rajalakshmi Institute of Technology Chennai: Department of Computer Science and Engineering
20 pages
Theory of Automata and Formal Languages
No ratings yet
Theory of Automata and Formal Languages
24 pages
Turing Machines
No ratings yet
Turing Machines
16 pages
Unit 3-FLAT
No ratings yet
Unit 3-FLAT
80 pages
Chomsky Hierarchy
No ratings yet
Chomsky Hierarchy
4 pages
Compiler Design
No ratings yet
Compiler Design
59 pages
Atcd U 4
No ratings yet
Atcd U 4
25 pages
Bottom Up Evalution of Inherited Attributes - Group 7
No ratings yet
Bottom Up Evalution of Inherited Attributes - Group 7
11 pages
FLA Syllabus
No ratings yet
FLA Syllabus
2 pages
Chapter-1 FLAT
No ratings yet
Chapter-1 FLAT
5 pages
CSE273 02 Mid Spring 2022 PDF
No ratings yet
CSE273 02 Mid Spring 2022 PDF
2 pages
KMP Skip Search Algorithm: Advisor: Prof. R. C. T. Lee Speaker: Z. H. Pan
No ratings yet
KMP Skip Search Algorithm: Advisor: Prof. R. C. T. Lee Speaker: Z. H. Pan
18 pages
Compiler Design
No ratings yet
Compiler Design
121 pages
TCS Lect 20 Pumping Lemma
No ratings yet
TCS Lect 20 Pumping Lemma
28 pages
Decision Properties: Membership
No ratings yet
Decision Properties: Membership
8 pages
Java String Reference
No ratings yet
Java String Reference
3 pages
ToC Key Jan2021
No ratings yet
ToC Key Jan2021
17 pages
Module-4 Lex and Yacc
No ratings yet
Module-4 Lex and Yacc
67 pages
2022 Afl End Regular Question
No ratings yet
2022 Afl End Regular Question
10 pages
CD File Kunal 4 To 9
No ratings yet
CD File Kunal 4 To 9
10 pages
Unit-2 25 09 2024
No ratings yet
Unit-2 25 09 2024
132 pages
Lec 09-b Context Free Grammar
No ratings yet
Lec 09-b Context Free Grammar
39 pages
Pumping Lemma
No ratings yet
Pumping Lemma
1 page
Lex Yacc Tutorial
No ratings yet
Lex Yacc Tutorial
38 pages
Learn Java - String Methods Cheatsheet - Codecademy
No ratings yet
Learn Java - String Methods Cheatsheet - Codecademy
2 pages
Cosc 327
No ratings yet
Cosc 327
3 pages

Automata Compiler Design U5

Uploaded by

Automata Compiler Design U5

Uploaded by

Smartworld.asia Specworld.

5.1 CANONICAL LR PARSING

such that [B -> .g, b] is not in I do

Here is what the corresponding DFA looks like

5.4 ALGORITHM FOR CONSTRUCTION OF THE CANONICAL LR PARSING

Input: grammar G'

5.8 DANGLING ELSE

Consider the grammar:

5.11 PHRASE-LEVEL RECOVERY

You might also like