Unit 3
Chapter 4
Syntax Analysis
Bottom – Up Parsing :
Shift Reduce Parsing and
Operator Precedence Parsing
Outline
• Bottom-Up Parsing
– Shift Reduce Parsing
– Operator Precedence Parsing.
– LR parsers: (next Presentation)
(C) 2014, Prepared by Partha Sarathi Chakraborty
• Simple LR (SLR)
• Canonical LR
• Lookahead LR (LALR)
Bottom-Up Parsing
• Start at the leaves and grow toward root.
• We can think of the process as reducing the input
string to the start symbol.
• At each reduction step a particular substring matching
the right side of a production is replaced by the symbol
on the left of that production.
Bottom-Up Parsing
• A general style of bottom-up syntax analysis, known as
shift-reduce parsing.
• Main actions are shift and reduce.
• At each shift action, the current symbol in the input
string is pushed to a stack.
Shift-Reduce Parsing
Handles
A handle is a substring of grammar symbols in a right-
sentential form that matches a right-hand side
of a production
Grammar: S → a A B e
Right-sentential forms: abbcde, aAbcde
Handles
• A handle of a right-sentential form γ (≡ αβw) is a
production rule A → β and a position of γ where the string β
may be found and replaced by A to produce the previous right-
sentential form in a rightmost derivation of γ.
S ⇒*rm αAw ⇒rm αβw
i.e. A → β is a handle of αβw at the location immediately after
the end of α.
• If the grammar is unambiguous, then every right-sentential
form of the grammar has exactly one handle.
• w is a string of terminals.
Handle Pruning
• The process of discovering a handle & reducing it to
the appropriate left-hand side is called handle
pruning. Handle pruning forms the basis for a
bottom-up parsing method.
non-terminal.
– Accept: Successful completion of parsing.
– Error: Parser discovers a syntax error, and calls an error
recovery routine.
• Initial State
STACK    INPUT
$        w$
• Final State
STACK    INPUT
$S       $
Stack Implementation of
Shift-Reduce Parsing
$BxA z$ reduce
Note: The parser never has to go into the stack to find the handle; the
handle always appears on top of the stack. It is this aspect of handle
pruning that makes a stack a particularly convenient data structure for
implementing a shift-reduce parser.
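The stack discipline described in the note can be sketched in a few lines. This is only a toy: the grammar E → E + T | T, T → id and the function name shift_reduce are assumptions chosen for illustration, and the handle is found by trying the longest right side first, which happens to work for this grammar but is not how a table-driven parser decides.

```python
# Toy grammar (assumed for illustration): E -> E + T | T ;  T -> id
# Right sides are tried in this order, longest first. That picks the correct
# handle for this grammar only; real parsers consult a table instead.
PRODUCTIONS = [
    ("E", ["E", "+", "T"]),
    ("T", ["id"]),
    ("E", ["T"]),
]

def shift_reduce(tokens):
    """Return a list of (stack, input, action) steps; raise SyntaxError on bad input."""
    stack, rest, trace = ["$"], list(tokens) + ["$"], []
    while True:
        for lhs, rhs in PRODUCTIONS:
            if stack[-len(rhs):] == rhs:        # the handle sits on top of the stack
                action = f"reduce {lhs} -> {' '.join(rhs)}"
                trace.append((" ".join(stack), " ".join(rest), action))
                del stack[-len(rhs):]           # pop the handle ...
                stack.append(lhs)               # ... and push the left-hand side
                break
        else:                                   # no handle on top of the stack
            if stack == ["$", "E"] and rest == ["$"]:
                trace.append(("$ E", "$", "accept"))
                return trace
            if rest == ["$"]:
                raise SyntaxError("no handle on the stack and no input left")
            trace.append((" ".join(stack), " ".join(rest), "shift"))
            stack.append(rest.pop(0))           # shift the next input symbol

for step in shift_reduce(["id", "+", "id"]):
    print("%-12s %-12s %s" % step)
```

Every comparison and every pop touches only the top of the stack, which is the point of the note.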
Conflicts
• Shift-reduce and reduce-reduce conflicts are
caused by
– The limitations of the LR parsing method (even
when the grammar is unambiguous)
– Ambiguity of the grammar
Shift-Reduce Parsing:
Shift-Reduce Conflicts
S → if E then S
  | if E then S else S
  | other
Resolve in favor of shift, so that each else
matches the closest if.
Shift-Reduce Parsing:
Reduce-Reduce Conflicts
C → A B
A → a
B → a
Resolve in favor
of reduce A → a,
otherwise we’re stuck!
• In an operator grammar, no production rule can have:
– ε at the right side
– two adjacent non-terminals at the right side.
• Example
E → E + E | E * E | ( E ) | −E | id
$ ⋖ ⋖ ⋖
Precedence relations             Reduction     Right-sentential form
$ ⋖ id ⋗ + ⋖ id ⋗ * ⋖ id ⋗ $     E → id        $ id + id * id $
$ ⋖ + ⋖ id ⋗ * ⋖ id ⋗ $          E → id        $ E + id * id $
$ ⋖ + ⋖ * ⋖ id ⋗ $               E → id        $ E + E * id $
$ ⋖ + ⋖ * ⋗ $                    E → E * E     $ E + E * E $
$ ⋖ + ⋗ $                        E → E + E     $ E + E $
$ $                                            $ E $
• TRAILING(A) Algorithm
– a is in TRAILING(A) if there is a production of the form A
→ γaδ, where δ is ε or a single nonterminal.
– If a is in TRAILING(B), and there is a production of the
form A → αB, then a is in TRAILING(A).
Nonterminal   LEADING       TRAILING
T             * , ( , id    * , ) , id
F             ( , id        ) , id
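The two TRAILING rules, and their mirror image for LEADING, can be run as a fixed-point iteration. The sketch below assumes the usual expression grammar E → E + T | T, T → T * F | F, F → ( E ) | id; the function name edge_terminals and the dictionary layout are illustrative choices, not part of the slides.

```python
# Assumed grammar: E -> E + T | T ;  T -> T * F | F ;  F -> ( E ) | id
GRAMMAR = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}

def edge_terminals(grammar, trailing):
    """LEADING sets (trailing=False) or TRAILING sets (trailing=True)."""
    sets = {a: set() for a in grammar}
    changed = True
    while changed:                                   # repeat until nothing new is added
        changed = False
        for a, bodies in grammar.items():
            for body in bodies:
                rhs = body[::-1] if trailing else body   # TRAILING scans from the right end
                new = set()
                if rhs[0] in grammar:                # outermost symbol is a nonterminal B
                    new |= sets[rhs[0]]              # rule 2: inherit B's set
                    if len(rhs) > 1 and rhs[1] not in grammar:
                        new.add(rhs[1])              # rule 1, one nonterminal outside a
                else:
                    new.add(rhs[0])                  # rule 1, nothing outside a
                if not new <= sets[a]:
                    sets[a] |= new
                    changed = True
    return sets

LEADING = edge_terminals(GRAMMAR, trailing=False)
TRAILING = edge_terminals(GRAMMAR, trailing=True)
for nt in GRAMMAR:
    print(nt, sorted(LEADING[nt]), sorted(TRAILING[nt]))
```

For T and F the computed sets agree with the rows of the table above.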
set Xi ≐ Xi+2 ;
if Xi is a terminal and Xi+1 is a nonterminal then
for all a in LEADING(Xi+1) do set Xi ⋖ a ;
if Xi is a nonterminal and Xi+1 is a terminal then
for all a in TRAILING(Xi) do set a ⋗ Xi+1 ;
end
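A runnable version of this loop might look as follows. The grammar, the hard-coded LEADING and TRAILING sets, and the extra rules for the end marker $ ($ ⋖ LEADING(start), TRAILING(start) ⋗ $) are assumptions taken from the standard expression grammar; the relations ⋖, ≐, ⋗ are written as "<", "=", ">".

```python
# Assumed grammar: E -> E + T | T ;  T -> T * F | F ;  F -> ( E ) | id
PRODUCTIONS = [["E", "+", "T"], ["T"], ["T", "*", "F"], ["F"], ["(", "E", ")"], ["id"]]
NONTERMINALS = {"E", "T", "F"}
START = "E"
LEADING = {"E": {"+", "*", "(", "id"}, "T": {"*", "(", "id"}, "F": {"(", "id"}}
TRAILING = {"E": {"+", "*", ")", "id"}, "T": {"*", ")", "id"}, "F": {")", "id"}}

def precedence_relations():
    rel = {}

    def put(a, r, b):
        # Two different relations for one pair means the grammar is not
        # an operator-precedence grammar.
        if rel.setdefault((a, b), r) != r:
            raise ValueError(f"conflicting relations for {a!r}, {b!r}")

    for x in PRODUCTIONS:
        for i in range(len(x) - 1):
            a, b = x[i], x[i + 1]
            a_term, b_term = a not in NONTERMINALS, b not in NONTERMINALS
            if a_term and b_term:
                put(a, "=", b)                       # adjacent terminals
            if a_term and not b_term:
                if i + 2 < len(x) and x[i + 2] not in NONTERMINALS:
                    put(a, "=", x[i + 2])            # Xi = Xi+2 around a nonterminal
                for t in LEADING[b]:
                    put(a, "<", t)                   # Xi < LEADING(Xi+1)
            if not a_term and b_term:
                for t in TRAILING[a]:
                    put(t, ">", b)                   # TRAILING(Xi) > Xi+1
    for t in LEADING[START]:
        put("$", "<", t)
    for t in TRAILING[START]:
        put(t, ">", "$")
    return rel

REL = precedence_relations()
print(REL[("+", "*")], REL[("*", "+")], REL[("(", ")")])
```

Pairs that never receive a relation, such as (id, id), are simply absent from the dictionary and correspond to blank (error) entries in the table.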
E → E + T
      +   *   (   )   id  $
+     ⋗   ⋖   ⋖   ⋗   ⋖   ⋗
*     ⋗   ⋗   ⋖   ⋗   ⋖   ⋗
(     ⋖   ⋖   ⋖   ≐   ⋖
)     ⋗   ⋗       ⋗       ⋗
id    ⋗   ⋗       ⋗       ⋗
$     ⋖   ⋖   ⋖       ⋖
Method: Initially, the stack contains $ and the input buffer the
string w$.
STACK    RELATION   INPUT       ACTION
$+       ⋖          id * id$    shift
$+id     ⋗          * id$       reduce E → id
$+       ⋖          * id$       shift
$+*      ⋖          id$         shift
$+*id    ⋗          $           reduce E → id
$+*      ⋗          $           reduce E → E * E
$+       ⋗          $           reduce E → E + E
$                   $           accept

Precedence relations used (columns: id + * $):
+     ⋖   ⋗   ⋖   ⋗
*     ⋖   ⋗   ⋗   ⋗
$     ⋖   ⋖   ⋖
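The moves in this trace can be reproduced with a small driver. As in the trace, the stack holds terminals only. The id row of the table and the function name parse are assumptions, and the sketch does not check that a popped handle matches a production, so it is a recognizer for the relation pattern rather than a full parser.

```python
# Row = terminal on top of the stack, column = next input symbol.
# "<" stands for the yields-precedence relation, ">" for takes-precedence.
REL = {
    "id": {"+": ">", "*": ">", "$": ">"},        # assumed row, not recovered from the slide
    "+":  {"id": "<", "+": ">", "*": "<", "$": ">"},
    "*":  {"id": "<", "+": ">", "*": ">", "$": ">"},
    "$":  {"id": "<", "+": "<", "*": "<"},
}

def parse(tokens):
    """Return the terminals of each handle, in the order the handles are reduced."""
    stack = ["$"]
    rest = list(tokens) + ["$"]
    handles = []
    while True:
        a, b = stack[-1], rest[0]
        if a == "$" and b == "$":
            return handles                        # accept
        relation = REL[a].get(b)
        if relation is None:
            raise SyntaxError(f"no precedence relation between {a!r} and {b!r}")
        if relation in ("<", "="):
            stack.append(rest.pop(0))             # shift
        else:                                     # ">": the handle ends at the stack top
            handle = []
            while True:
                popped = stack.pop()
                handle.insert(0, popped)
                if REL[stack[-1]].get(popped) == "<":
                    break                         # found the start of the handle
            handles.append(" ".join(handle))      # reduce

print(parse(["id", "+", "id", "*", "id"]))
```

On id + id * id the handles come out as id, id, id, *, +, which corresponds to three reductions by E → id followed by E → E * E and E → E + E.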
[Figure: Actions of Operator-Precedence Parsing for the input string id + id, panels (a) to (g), showing the stack, the remaining input and the partial parse tree]
↑ ⋗ ⋗ ⋗ ⋗ ⋖ ⋖ ⋗⋖ ⋗
id ⋗ ⋗ ⋗ ⋗ ⋗ ⋗ ⋗
( ⋖ ⋖ ⋖ ⋖ ⋖ ⋖ ⋖ ≐ error
) ⋗ ⋗ ⋗ ⋗ ⋗ ⋗ ⋗
$ ⋖ ⋖ ⋖ ⋖ ⋖ ⋖ ⋖
Precedence Functions
• Compilers using operator precedence parsers do not need to
store the table of precedence relations.
• The table can be encoded by two precedence functions f and g
that map terminal symbols to integers.
• For symbols a and b:
f(a) < g(b) whenever a ⋖ b
f(a) = g(b) whenever a ≐ b
f(a) > g(b) whenever a ⋗ b
Precedence Functions
Consider the grammar
E → E + E | E − E | E * E | E / E | E ↑ E | ( E ) | −E | id
     +   −   *   /   ↑   (   )   id   $
f    2   2   4   4   4   0   6   6    0
g    1   1   3   3   5   5   0   5    0
Precedence Functions
For example:
* ⋖ id, and f(*) < g(id)
Note: f(id) > g(id) suggests that id ⋗ id;
In fact no precedence relation holds between id and id.
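A quick way to see both points is to look the relation up through f and g. The sketch below copies the two rows of the table; writing ↑ as "^" and − as "-", and the function name relation, are choices made here for convenience.

```python
SYMBOLS = ["+", "-", "*", "/", "^", "(", ")", "id", "$"]   # "^" stands for the ↑ operator
F = dict(zip(SYMBOLS, [2, 2, 4, 4, 4, 0, 6, 6, 0]))
G = dict(zip(SYMBOLS, [1, 1, 3, 3, 5, 5, 0, 5, 0]))

def relation(a, b):
    """Relation between stack terminal a and input terminal b as encoded by f and g."""
    if F[a] < G[b]:
        return "<"
    if F[a] > G[b]:
        return ">"
    return "="

print(relation("*", "id"))    # f(*) = 4 < g(id) = 5
print(relation("id", "id"))   # f(id) = 6 > g(id) = 5, although the table has no entry here
```

Because every pair of integers is comparable, the functions always return some relation. Blank (error) entries of the table are lost in this encoding, so errors are detected later than with the full table.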
Method
+ * id $
f 2 4 4 0
g 1 3 5 0
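One common way to obtain such f and g values is the graph method: create nodes f_a and g_a for every terminal, add an edge g_b → f_a when a ⋖ b and an edge f_a → g_b when a ⋗ b, and take the length of the longest path leaving each node. The sketch below assumes that this is the method intended here, assumes the id row of the relation table, and leaves out the merging of nodes for ≐ and the cycle check.

```python
from functools import lru_cache

# Relations for the terminals id, +, *, $ ("<" yields precedence, ">" takes precedence).
ROWS = {
    "id": {"+": ">", "*": ">", "$": ">"},        # assumed row
    "+":  {"id": "<", "+": ">", "*": "<", "$": ">"},
    "*":  {"id": "<", "+": ">", "*": ">", "$": ">"},
    "$":  {"id": "<", "+": "<", "*": "<"},
}

def precedence_functions(rows):
    edges = {}
    for a, row in rows.items():
        for b, r in row.items():
            if r == "<":
                edges.setdefault(("g", b), []).append(("f", a))
            elif r == ">":
                edges.setdefault(("f", a), []).append(("g", b))
            else:
                raise ValueError("'=' needs node merging, which this sketch omits")

    @lru_cache(maxsize=None)
    def longest(node):
        # Longest path leaving the node; a cycle would recurse forever,
        # which corresponds to the case where no precedence functions exist.
        return max((1 + longest(n) for n in edges.get(node, [])), default=0)

    f = {t: longest(("f", t)) for t in rows}
    g = {t: longest(("g", t)) for t in rows}
    return f, g

f, g = precedence_functions(ROWS)
print("f", f)
print("g", g)
```

For these four terminals the longest-path lengths reproduce the f and g rows shown above.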
input.
– If a handle has been found, but there is no
production with this handle as a right side.
Handling Shift/Reduce Errors:
      id   (    )    $
id    e3   e3   ⋗    ⋗
(     ⋖    ⋖    ≐    e4
)     e3   e3   ⋗    ⋗
$     ⋖    ⋖    e2   e1

Error handling routines
STACK    RELATION   INPUT   ACTION
$+id     ⋗          )($     reduce
$+       ⋗          )($     reduce
$        blank      )($     error, e2: unbalanced right parenthesis
$                   ($      delete ‘)’ from the INPUT
$        ⋖          ($      shift
$(       blank      $       error, e4: missing right parenthesis
$                   $       pop ‘(’ from STACK
$                   $       accept
Unit - II
Chapter 4
Syntax Analysis
Bottom – Up Parsing : LR parsers
Outline
• Bottom-Up Parsing
– LR parsers:
• Simple LR (SLR)
• Canonical LR
• Lookahead LR (LALR)
LR Parsers
• Efficient bottom-up syntax analysis technique
that can be used to parse a large class of CFGs.
• The technique is called LR(k) parsing; ‘L’ is
for Left-to-Right scanning of the input, ‘R’ for
constructing a Rightmost derivation in reverse, and
k for the number of input symbols of lookahead.
LR Parsers: Attractive
• LR parsing is attractive for a variety of reasons.
– LR parsers can be constructed to recognize virtually all
programming language constructs for which context-free
grammars can be written.
– The LR-parsing method is the most general nonbacktracking
shift-reduce parsing method known, yet it can be
implemented as efficiently as other shift-reduce methods.
LR Parsers: Drawback
• The principal drawback of the LR method is that it is
too much work to construct an LR parser by hand for a
typical programming-language grammar.
• A specialized tool, an LR parser generator, is needed.
• YACC: Such a generator takes a context-free grammar
and automatically produces a parser for that grammar.
expensive.
• Lookahead LR (in short LALR) – It is intermediate
in power and cost between the other two. It will work on
most programming-language grammars and, with
some effort, can be implemented efficiently.
• Power: Canonical LR > LALR > SLR
[A → X • Y Z]
[A → X Y • Z]
[A → X Y Z •]
• Note that production A → ε has one item [A → •]
F
FIRST(E) = FIRST(T) = FIRST(F) = { ( , id}
FOLLOW(E) = { $ , ) , + }
FOLLOW(T) = { * , $ , ) , + }
FOLLOW(F) = { * , $ , ) , + }
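These sets can be checked mechanically with the usual fixed-point computation. The sketch below assumes the expression grammar E → E + T | T, T → T * F | F, F → ( E ) | id; the names first_of and first_follow, and the use of an empty list for an ε-production, are illustrative choices.

```python
EPS = "ε"

def first_of(seq, first):
    """FIRST of a string of grammar symbols, given the FIRST sets computed so far."""
    out = set()
    for x in seq:
        fx = first[x] if x in first else {x}      # a terminal is its own FIRST set
        out |= fx - {EPS}
        if EPS not in fx:
            return out
    out.add(EPS)                                  # every symbol can vanish (or seq is empty)
    return out

def first_follow(grammar, start):
    first = {a: set() for a in grammar}
    follow = {a: set() for a in grammar}
    follow[start].add("$")
    changed = True
    while changed:                                # iterate until no set grows
        changed = False
        for a, bodies in grammar.items():
            for body in bodies:
                updates = [(first[a], first_of(body, first))]
                for i, x in enumerate(body):
                    if x in grammar:              # nonterminal: look at what follows it
                        tail = first_of(body[i + 1:], first)
                        new = tail - {EPS}
                        if EPS in tail:
                            new |= follow[a]
                        updates.append((follow[x], new))
                for target, new in updates:
                    if not new <= target:
                        target |= new
                        changed = True
    return first, follow

EXPR = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}
FIRST, FOLLOW = first_follow(EXPR, "E")
print(FIRST)
print(FOLLOW)
```

The FOLLOW sets are what the SLR(1) construction uses to decide on which input symbols a completed item may be reduced.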
F
• If I is the set of one item {[E’ → • E]}, then closure(I) contains
the items.
Final Closure I0
Items
The Goto Operation for LR(0)
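goto(I0, E) = I1
goto(I0, T) = I2
goto(I0, F) = I3
goto(I0, ( ) = I4
goto(I1, +) = I6
goto(I2, *) = I7
goto(I4, E) = I8
goto(I4, F) = I3
goto(I4, ( ) = I4
goto(I4, id) = I5
goto(I6, T) = I9
goto(I6, F) = I3
goto(I6, ( ) = I4
goto(I6, id) = I5
goto(I7, F) = I10
goto(I7, ( ) = I4
goto(I7, id) = I5
goto(I8, ) ) = I11
goto(I8, +) = I6
goto(I9, *) = I7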
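The closure and goto operations behind these transitions can be sketched as below. The augmented grammar E’ → E, E → E + T | T, T → T * F | F, F → ( E ) | id is assumed, items are encoded as (production index, dot position), and the states are numbered in discovery order, so the numbers need not match I0 to I11 on the slides.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("id",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    """Add [B -> . gamma] for every nonterminal B that appears right after a dot."""
    result, work = set(items), list(items)
    while work:
        p, dot = work.pop()
        rhs = GRAMMAR[p][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for q, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (q, 0) not in result:
                    result.add((q, 0))
                    work.append((q, 0))
    return frozenset(result)

def goto(items, symbol):
    """Move the dot over symbol in every item where that is possible, then close."""
    moved = {(p, dot + 1) for p, dot in items
             if dot < len(GRAMMAR[p][1]) and GRAMMAR[p][1][dot] == symbol}
    return closure(moved)

def canonical_collection():
    states, transitions = [closure({(0, 0)})], {}
    for state in states:                          # the list grows while we iterate
        symbols = {GRAMMAR[p][1][dot] for p, dot in state if dot < len(GRAMMAR[p][1])}
        for x in sorted(symbols):
            target = goto(state, x)
            if target not in states:
                states.append(target)
            transitions[(states.index(state), x)] = states.index(target)
    return states, transitions

STATES, TRANSITIONS = canonical_collection()
print(len(STATES), "states,", len(TRANSITIONS), "transitions")
```

The initial state is the closure of {[E’ → • E]}; every other state is reached by applying goto to a state already in the collection.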
Transition diagram for the grammar G, representing the Goto operation
Parsing table SLR(1) for expression grammar G
Model of an LR Parser
LR Parsing Algorithm
Example of LR parsing
Moves of LR parser on id * id + id
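The driver loop of an LR parser is short once the table is given. The ACTION and GOTO entries below are an assumption: they are written out from the goto transitions and FOLLOW sets of the expression grammar, with productions numbered (1) E → E + T, (2) E → T, (3) T → T * F, (4) T → F, (5) F → ( E ), (6) F → id. The function name lr_parse is illustrative.

```python
# Production number -> (left-hand side, length of right-hand side)
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3), 4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

def reduce_on(n, symbols):
    return {s: f"r{n}" for s in symbols}

ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {**reduce_on(2, "+)$"), "*": "s7"},
    3: reduce_on(4, "+*)$"),
    4: {"id": "s5", "(": "s4"},
    5: reduce_on(6, "+*)$"),
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {**reduce_on(1, "+)$"), "*": "s7"},
    10: reduce_on(3, "+*)$"),
    11: reduce_on(5, "+*)$"),
}
GOTO = {0: {"E": 1, "T": 2, "F": 3}, 4: {"E": 8, "T": 2, "F": 3},
        6: {"T": 9, "F": 3}, 7: {"F": 10}}

def lr_parse(tokens):
    """Return the production numbers in the order they are used for reductions."""
    stack, rest, reductions = [0], list(tokens) + ["$"], []
    while True:
        act = ACTION[stack[-1]].get(rest[0])
        if act is None:
            raise SyntaxError(f"unexpected {rest[0]!r} in state {stack[-1]}")
        if act == "acc":
            return reductions
        n = int(act[1:])
        if act[0] == "s":                         # shift: push state n, consume input
            stack.append(n)
            rest.pop(0)
        else:                                     # reduce by production n
            lhs, length = PRODUCTIONS[n]
            del stack[-length:]                   # pop one state per right-side symbol
            stack.append(GOTO[stack[-1]][lhs])    # then take the goto on the left side
            reductions.append(n)

print(lr_parse(["id", "*", "id", "+", "id"]))
```

Read backwards, the list of reductions is a rightmost derivation of the input.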
S → S ( S )
S → ε
• FIRST(S) = { ( , ε }
• FOLLOW(S) = { $ , ( , ) }
Closure Set I
Construction of LR(0) items:
LL vs. LR Grammars
• For a grammar to be LR(k), we must be able to
recognize the occurrence of the right side of a
production in a right-sentential form, with k input
symbols of lookahead.
• This requirement is far less stringent than that for
LL(k) grammars, where we must be able to recognize the
use of a production seeing only the first k symbols of
what its right side derives.
goto(I0, S) = I1
goto(I0, L) = I2
goto(I0, R) = I3
goto(I0, *) = I4
goto(I2, =) = I6
goto(I4, R) = I7
goto(I4, L) = I8
goto(I4, *) = I4
goto(I4, id) = I5
goto(I6, R) = I9
goto(I6, L) = I8
goto(I6, *) = I4
goto(I6, id) = I5
Parsing table SLR(1) for expression
grammar X shows S-R conflict
Conflict
Grammar X:
(1) S → L = R
(2) S → R
(3) L → * R
(4) L → id
(5) R → L
Grammar X is not ambiguous. This shift/reduce conflict arises from the fact that the
SLR parser construction method is not powerful enough to remember enough left
context to decide what action the parser should take on input =, having seen a string
reducible to L.
LR(1) Grammars
R → L
LR(1) Items
• An LR(1) item
[A → α•β, a]
contains a lookahead terminal a, meaning α is already
on top of the stack and we expect to see βa
• For items of the form
[A → α•, a]
the lookahead a is used to reduce A → α only if the
next input symbol is a
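• Augment with S’ → S
• In an item [A → α•Bβ, a], the dotted production A → α•Bβ is the
core (first component) and a is the lookahead
• Closure adds [B → •γ, b] for every production B → γ and every
b in FIRST(βa)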
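The LR(1) closure differs from the LR(0) one only in how lookaheads are propagated through FIRST(βa). The sketch below applies it to grammar X (S → L = R | R, L → * R | id, R → L), augmented with S’ → S. Items are encoded as (production index, dot position, lookahead), and because grammar X has no ε-productions, FIRST of a string is taken from its first symbol alone. Both are simplifications made here.

```python
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("L", "=", "R")), ("S", ("R",)),
    ("L", ("*", "R")), ("L", ("id",)),
    ("R", ("L",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def first(symbol, seen=frozenset()):
    """FIRST of one symbol; valid only because the grammar has no ε-productions."""
    if symbol not in NONTERMINALS:
        return {symbol}                           # a terminal (or $) is its own FIRST
    out = set()
    for lhs, rhs in GRAMMAR:
        if lhs == symbol and rhs[0] not in seen | {symbol}:
            out |= first(rhs[0], seen | {symbol})
    return out

def closure1(items):
    result, work = set(items), list(items)
    while work:
        p, dot, a = work.pop()
        rhs = GRAMMAR[p][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            beta_a = rhs[dot + 1:] + (a,)         # the string β followed by the lookahead a
            for b in first(beta_a[0]):
                for q, (lhs, _) in enumerate(GRAMMAR):
                    item = (q, 0, b)
                    if lhs == rhs[dot] and item not in result:
                        result.add(item)
                        work.append(item)
    return result

I0 = closure1({(0, 0, "$")})
for p, dot, a in sorted(I0):
    lhs, rhs = GRAMMAR[p]
    print(f"[{lhs} -> {' '.join(rhs[:dot])} . {' '.join(rhs[dot:])}, {a}]")
```

In I0 the items for L carry both = and $ as lookaheads, while R → • L carries only $. This extra left context is what the SLR construction lacks for grammar X.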
Construct Closure I0
Core lookahead
FIRST(a)
I0 rewrite as:
Canonical LR(1) parsing table for grammar X
LALR(1) Grammars
• LR(1) parsing tables have many states
• LALR(1) parsing (Look-Ahead LR) combines LR(1)
states to reduce table size
• Less powerful than LR(1)
– Will not introduce shift-reduce conflicts, because shifts do
not use lookaheads
LALR(1) parsing table for grammar X
Analysis
• The LR and LALR parsers will mimic one
another on correct inputs.
• When presented with erroneous input, the
LALR parser may proceed to do some
reductions after the LR parser has declared an
error, but it will never shift another symbol after
the LR parser declares an error.
S’ → S
S → S ( S )
S → ε
• LR(1) items (next slide)
Parsing canonical LR
LALR parsing
FOLLOW(C) = { $ , c , d }
S’ → S
S → C C
C → c C
C → d
• LR(1) items (next slide)
I36: C → c•C , c | d | $
     C → •cC , c | d | $
     C → •d , c | d | $
I47: C → d• , c | d | $
I89: C → cC• , c | d | $
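The merging step that produces states such as I36, I47 and I89 can be sketched as grouping LR(1) states by their core. The six input states below are an assumption: they are the usual LR(1) states I3, I4, I6, I7, I8, I9 for S → C C, C → c C | d, written as (production, dot position, lookahead), and the function name merge_by_core is illustrative.

```python
from collections import defaultdict

def merge_by_core(states):
    """Union all LR(1) states that share a core (their items without lookaheads)."""
    groups = defaultdict(list)
    for name, items in states.items():
        core = frozenset((prod, dot) for prod, dot, _ in items)
        groups[core].append(name)
    merged = {}
    for names in groups.values():
        union = set().union(*(states[n] for n in names))
        merged["I" + "".join(n[1:] for n in names)] = union   # I3 + I6 -> I36
    return merged

def state(items, lookaheads):
    return {(prod, dot, a) for prod, dot in items for a in lookaheads}

AFTER_C = [("C -> c C", 1), ("C -> c C", 0), ("C -> d", 0)]    # dot after c, plus closure
LR1_STATES = {
    "I3": state(AFTER_C, "cd"),
    "I4": state([("C -> d", 1)], "cd"),
    "I6": state(AFTER_C, "$"),
    "I7": state([("C -> d", 1)], "$"),
    "I8": state([("C -> c C", 2)], "cd"),
    "I9": state([("C -> c C", 2)], "$"),
}

LALR_STATES = merge_by_core(LR1_STATES)
for name, items in LALR_STATES.items():
    print(name, sorted(items))
```

Each merged state keeps its core and takes the union of the lookaheads, which is why every item in I36, I47 and I89 carries c | d | $.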
Assignment
Consider the following grammar G,
E → E + T | T
T → T F | F
F → F * | a | b