
03 Syntaxanalysis 2 2012 2013

Uploaded by

Tashif Manna
Copyright
© © All Rights Reserved

Transforming a grammar for LL(1) parsing

Ambiguous grammars are not LL(1), but unambiguous grammars are not
necessarily LL(1) either.
Having a non-LL(1) unambiguous grammar for a language does not
mean that this language is not LL(1).
But there are languages for which there exist unambiguous
context-free grammars but no LL(1) grammar.

We will see two grammar transformations that improve the chance
of getting an LL(1) grammar:
- Elimination of left recursion
- Left factorization

Syntax analysis 145


Left-recursion
The following expression grammar is unambiguous but it is not
LL(1):

Exp → Exp + Exp2
Exp → Exp − Exp2
Exp → Exp2
Exp2 → Exp2 * Exp3
Exp2 → Exp2 / Exp3
Exp2 → Exp3
Exp3 → num
Exp3 → (Exp)

Indeed, First(α) is the same for all RHS α of the productions for
Exp, and likewise for Exp2.
This is a consequence of left recursion.
Left-recursion
Recursive productions are productions defined in terms of
themselves. Examples: A → Ab or A → bA.
When the recursive nonterminal is at the left (resp. right), the
production is said to be left-recursive (resp. right-recursive).
Left-recursive productions can be rewritten with right-recursive
productions.
Example:

N → N α1                 N  → β1 N'
   ...                      ...
N → N αm                 N  → βn N'
N → β1           ⇔       N' → α1 N'
   ...                      ...
N → βn                   N' → αm N'
                         N' → ε
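The rewriting scheme above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the slides: productions are lists of symbols, the empty list stands for ε, the fresh nonterminal is named by appending a quote, and every αi is non-empty. The function name is hypothetical.

```python
def eliminate_immediate_left_recursion(nonterminal, productions):
    """Rewrite N -> N a1 | ... | N am | b1 | ... | bn without left recursion.

    Each production is a list of symbols; [] stands for epsilon.
    Returns a dict mapping N (and the fresh N') to their new productions.
    """
    # alphas: what follows the leading N in the left-recursive productions
    alphas = [rhs[1:] for rhs in productions if rhs[:1] == [nonterminal]]
    # betas: the productions that do not start with N
    betas = [rhs for rhs in productions if rhs[:1] != [nonterminal]]
    if not alphas:
        return {nonterminal: productions}  # nothing to do
    fresh = nonterminal + "'"
    return {
        nonterminal: [beta + [fresh] for beta in betas],        # N  -> bi N'
        fresh: [alpha + [fresh] for alpha in alphas] + [[]],    # N' -> ai N' | eps
    }


# The Exp level of the expression grammar (ASCII "-" used for minus).
exp = eliminate_immediate_left_recursion(
    "Exp", [["Exp", "+", "Exp2"], ["Exp", "-", "Exp2"], ["Exp2"]]
)
for head, bodies in exp.items():
    for body in bodies:
        print(head, "->", " ".join(body) or "eps")
```

The sketch handles immediate left recursion only; hidden (indirect) left recursion first has to be exposed by substitution.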


Right-recursive expression grammar

Exp  → Exp + Exp2             Exp   → Exp2 Exp'
Exp  → Exp − Exp2             Exp'  → + Exp2 Exp'
Exp  → Exp2                   Exp'  → − Exp2 Exp'
Exp2 → Exp2 * Exp3            Exp'  → ε
Exp2 → Exp2 / Exp3     ⇔      Exp2  → Exp3 Exp2'
Exp2 → Exp3                   Exp2' → * Exp3 Exp2'
Exp3 → num                    Exp2' → / Exp3 Exp2'
Exp3 → (Exp)                  Exp2' → ε
                              Exp3  → num
                              Exp3  → (Exp)


Left-factorisation
The RHS of these two productions have the same First set:

Stat → if Exp then Stat else Stat
Stat → if Exp then Stat

The problem can be solved by left-factorising the grammar:

Stat → if Exp then Stat ElseStat
ElseStat → else Stat
ElseStat → ε

Note
- The resulting grammar is ambiguous and the parsing table will
  contain two rules for M[ElseStat, else]
  (because else ∈ Follow(ElseStat) and else ∈ First(else Stat))
- Ambiguity can be solved in this case by letting
  M[ElseStat, else] = {ElseStat → else Stat}.
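As an illustration of left factoring, here is a small Python sketch. It assumes (these choices are not from the slides) that productions are lists of symbols, that all productions passed in share a common prefix, and that the caller supplies the name of the fresh nonterminal. The function name is hypothetical.

```python
def left_factor(nonterminal, productions, fresh):
    """Factor out the longest prefix common to all given productions.

    Productions are lists of symbols; [] stands for epsilon.
    """
    prefix = []
    for symbols in zip(*productions):      # walk the productions in lockstep
        if len(set(symbols)) != 1:         # first position where they differ
            break
        prefix.append(symbols[0])
    if not prefix:
        return {nonterminal: productions}  # no common prefix: leave unchanged
    k = len(prefix)
    return {
        nonterminal: [prefix + [fresh]],
        fresh: [rhs[k:] for rhs in productions],  # the differing tails
    }


stat = left_factor(
    "Stat",
    [["if", "Exp", "then", "Stat", "else", "Stat"],
     ["if", "Exp", "then", "Stat"]],
    "ElseStat",
)
for head, bodies in stat.items():
    for body in bodies:
        print(head, "->", " ".join(body) or "eps")
```

Factoring removes the common First set of the two Stat productions, but it does not remove the ambiguity noted above.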
Hidden left-factors and hidden left recursion
Sometimes, left factors or left recursion are hidden.
Examples:
- The following grammar:

  A → da | acB
  B → abB | daA | Af

  has two overlapping productions: B → daA and B ⇒* daf (through B → Af).
- The following grammar:

  S → Tu | wx
  T → Sq | vvS

  has left recursion on T (T ⇒* Tuq).
Solution: expand the production rules by substitution to make
left recursion or left factors visible, and then eliminate them.


Summary

Construction of an LL(1) parser from a context-free grammar:

1. Eliminate ambiguity
2. Eliminate left recursion
3. Left-factorize the grammar
4. Add an extra start production S' → S$ to the grammar
5. Calculate First for every production and Follow for every
   nonterminal
6. Calculate the parsing table
7. Check that the grammar is LL(1)


Recursive implementation
From the parsing table, it is easy to implement a predictive parser
recursively (with one function per nonterminal).

Grammar:

T' → T$
T → R
T → aTc
R → ε
R → bR

Parsing table:

        a           b           c           $
T'      T' → T$     T' → T$                 T' → T$
T       T → aTc     T → R       T → R       T → R
R                   R → bR      R → ε       R → ε

    function parseT'() =
      if next = 'a' or next = 'b' or next = '$' then
        parseT() ; match('$')
      else reportError()

    function parseT() =
      if next = 'b' or next = 'c' or next = '$' then
        parseR()
      else if next = 'a' then
        match('a') ; parseT() ; match('c')
      else reportError()

    function parseR() =
      if next = 'c' or next = '$' then
        (* do nothing *)
      else if next = 'b' then
        match('b') ; parseR()
      else reportError()

Figure 3.16: Recursive descent parser for grammar 3.9 (Mogensen)

For parseR, we must choose the empty production on symbols in FOLLOW(R)
(c or $). The production R → bR is chosen on input b. Again, all other symbols
produce an error.
The function match takes as argument a symbol, which it tests for equality
with the next input symbol. If they are equal, the following symbol is read into
the variable next. We assume next is initialised to the first input symbol before
parseT' is called.
The program in figure 3.16 only checks if the input is valid. It can easily be
extended to construct a syntax tree by letting the parse functions return the
sub-trees for the parts of input that they parse.

3.12.2 Table-driven LL(1) parsing
In table-driven LL(1) parsing, we encode the selection of productions into a table
instead of in the program text. A simple non-recursive program uses this table and
a stack to perform the parsing.
The table is cross-indexed by nonterminal and terminal and contains for each
such pair the production (if any) that is chosen for that nonterminal when that
terminal is the next input symbol. (Mogensen)
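The recursive-descent parser of figure 3.16 translates almost line by line into Python. The following is a sketch, not the author's code: the class and helper names (Parser, ParseError, accepts) are made up for the illustration, each input character is taken as one token, and the end marker $ is appended automatically.

```python
class ParseError(Exception):
    """Raised where the pseudocode calls reportError()."""


class Parser:
    def __init__(self, text):
        self.tokens = list(text) + ["$"]  # one character per token, then $
        self.pos = 0

    @property
    def next(self):
        return self.tokens[self.pos]

    def match(self, symbol):
        # Compare with the next input symbol, then advance.
        if self.next != symbol:
            raise ParseError(f"expected {symbol!r}, found {self.next!r}")
        self.pos += 1

    def parse_t_prime(self):                 # T' -> T $
        if self.next in ("a", "b", "$"):
            self.parse_t()
            self.match("$")
        else:
            raise ParseError(f"unexpected {self.next!r}")

    def parse_t(self):
        if self.next in ("b", "c", "$"):     # T -> R
            self.parse_r()
        elif self.next == "a":               # T -> a T c
            self.match("a")
            self.parse_t()
            self.match("c")
        else:
            raise ParseError(f"unexpected {self.next!r}")

    def parse_r(self):
        if self.next in ("c", "$"):          # R -> epsilon, on FOLLOW(R)
            pass
        elif self.next == "b":               # R -> b R
            self.match("b")
            self.parse_r()
        else:
            raise ParseError(f"unexpected {self.next!r}")


def accepts(text):
    """Return True if text is derivable from T, False on a syntax error."""
    try:
        Parser(text).parse_t_prime()
        return True
    except ParseError:
        return False


print(accepts("aabbbcc"), accepts("aabc"))
```

Like the figure, the sketch only checks validity; returning sub-trees from the three parse methods would turn it into a tree builder.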
Outline

1. Introduction

2. Context-free grammar

3. Top-down parsing

4. Bottom-up parsing
Shift/reduce parsing
LR parsers
Operator precedence parsing
Using ambiguous grammars

5. Conclusion and some practical considerations



Bottom-up parsing

A bottom-up parser creates the parse tree starting from the leaves
towards the root
It tries to convert the program into the start symbol
Most common form of bottom-up parsing: shift-reduce parsing



Bottom-up parsing: example

Bottom-up parsing of
int + (int + int + int)
Grammar:

S → E$
E → T
E → E + T
T → int
T → (E)

[Figure: parse tree of int + ( int + int + int ) $, built from the leaves up to the root S]
(Keith Schwarz)


Bottom-up parsing: example
Bottom-up parsing of
int + (int + int + int):

Grammar:

S → E$
E → T
E → E + T
T → int
T → (E)

Successive forms:

int + (int + int + int)$
T + (int + int + int)$
E + (int + int + int)$
E + (T + int + int)$
E + (E + int + int)$
E + (E + T + int)$
E + (E + int)$
E + (E + T)$
E + (E)$
E + T$
E$
S

Bottom-up parsing is often done as a rightmost derivation in reverse
(there is only one if the grammar is unambiguous).
Terminology
A rightmost (canonical) derivation is a derivation where the
rightmost nonterminal is replaced at each step. A rightmost
derivation from α to β is noted α ⇒*rm β.
A reduction transforms uwv to uAv if A → w is a production.
α is a right sentential form if S ⇒*rm α with α = βx where x is a
string of terminals.
A handle of a right sentential form γ (= αβw) is a production
A → β and a position in γ where β may be found and replaced by A
to produce the previous right sentential form in a rightmost
derivation of γ:

S ⇒*rm αAw ⇒rm αβw

- Informally, a handle is a production we can reverse without getting
  stuck.
- If the handle is A → β, we will also call β the handle.


Handle: example
Bottom-up parsing of
int + (int + int + int)

Grammar:

S → E$
E → T
E → E + T
T → int
T → (E)

Successive forms:

[int] + (int + int + int)$
[T] + (int + int + int)$
E + ([int] + int + int)$
E + ([T] + int + int)$
E + (E + [int] + int)$
E + ([E + T] + int)$
E + (E + [int])$
E + ([E + T])$
E + [(E)]$
[E + T]$
[E$]
S

The handle is shown in square brackets in each right sentential form.
Finding the handles

Bottom-up parsing = finding the handle in the right sentential form
obtained at each step.
This handle is unique as soon as the grammar is unambiguous
(because in this case, the rightmost derivation is unique).
Suppose that our current form is uvw and the handle is A → v
(getting uAw after reduction). Then w cannot contain any nonterminals
(otherwise we would have reduced a handle somewhere in w).


Shift/reduce parsing

Proposed model for a bottom-up parser:

Split the input into two parts:
- Left substring is our work area
- Right substring is the input we have not yet processed
All handles are reduced in the left substring
Right substring consists only of terminals
At each point, decide whether to:
- Move a terminal across the split (shift)
- Reduce a handle (reduce)


Shift/reduce parsing: example

Grammar:

E → E + T | T
T → T * F | F
F → ( E ) | id

Bottom-up parsing of id + id * id:

Left substring    Right substring    Action
$                 id + id * id$      Shift
$id               + id * id$         Reduce by F → id
$F                + id * id$         Reduce by T → F
$T                + id * id$         Reduce by E → T
$E                + id * id$         Shift
$E +              id * id$           Shift
$E + id           * id$              Reduce by F → id
$E + F            * id$              Reduce by T → F
$E + T            * id$              Shift
$E + T *          id$                Shift
$E + T * id       $                  Reduce by F → id
$E + T * F        $                  Reduce by T → T * F
$E + T            $                  Reduce by E → E + T
$E                $                  Accept


Shift/reduce parsing

In the previous example, all the handles were at the far right end of
the left area (not inside).
This is convenient because we then never need to move symbols back from
the left area to the right area, and thus can process the input from
left to right in one pass.

Is it the case for all grammars? Yes!

Sketch of proof: by induction on the number of reduces
- After no reduce, the first reduction can be done at the right end of
  the left area
- After at least one reduce, the very right of the left area is a
  nonterminal (by induction hypothesis). This nonterminal must be
  part of the next reduction, since we are tracing a rightmost derivation
  backwards.


Shift/reduce parsing

Consequence: the left area can be represented by a stack (as all
activities happen at its far right)

Four possible actions of a shift-reduce parser:

1. Shift: push the next terminal onto the stack
2. Reduce: replace the handle on the stack by the nonterminal
3. Accept: parsing is successfully completed
4. Error: discover a syntax error and call an error recovery routine


Shift/reduce parsing

Two open questions remain. At each step:
- How to choose between shift and reduce?
- If the decision is to reduce, which rule to choose (i.e., what is the
  handle)?
Ideally, we would like this choice to be deterministic given the stack
and the next k input symbols (to avoid backtracking), with k
typically small (to make parsing efficient).
Like for top-down parsing, this is not possible for all grammars.
Possible conflicts:
- shift/reduce conflict: it is not possible to decide between shifting
  and reducing
- reduce/reduce conflict: the parser cannot decide which of several
  reductions to make


Shift/reduce parsing

We will see two main categories of shift-reduce parsers:

LR-parsers
- They cover a wide range of grammars
- Different variants from the most specific to the most general: SLR,
  LALR, LR
Weak precedence parsers
- They work only for a small class of grammars
- They are less efficient than LR-parsers
- They are simpler to implement




LR-parsers
LR(k) parsing: Left-to-right scan, Rightmost derivation (in reverse),
k symbols of lookahead.
Advantages:
- The most general non-backtracking shift-reduce parsing method, yet as
  efficient as other less general techniques
- Can detect a syntax error as soon as possible (on a left-to-right scan
  of the input)
- Can recognize virtually all programming language constructs (that
  can be represented by context-free grammars)
- The class of grammars that LR parsers can handle is a proper superset
  of the class that predictive parsers can handle (LL(k) ⊂ LR(k))
Drawbacks:
- More complex to implement than predictive (or operator precedence)
  parsers
Like table-driven predictive parsing, LR parsing is based on a parsing
table.


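Structure of an LR parser
[Figure: structure of an LR parser. The input buffer holds a1 ... ai ... an $.
The stack holds S0, X1 S1, ..., Xm-1 Sm-1, Xm Sm, with Sm on top. The LR
parsing algorithm reads the stack and the input, consults the two tables,
and produces the output. Action table: rows are states, columns are
terminals and $, entries are one of four different actions. Goto table:
rows are states, columns are nonterminals, each entry is a state number.]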


Structure of an LR parser

A configuration of an LR parser is described by the status of its stack
and the part of the input not analysed (shifted) yet:

(s0 X1 s1 ... Xm sm, ai ai+1 ... an $)

where the Xi are (terminal or nonterminal) symbols, the ai are terminal
symbols, and the si are state numbers (of a DFA).
A configuration corresponds to the right sentential form

X1 ... Xm ai ... an

Analysis is based on two tables:
- an action table that associates an action ACTION[s, a] to each state
  s and terminal a (including $).
- a goto table that gives the next state GOTO[s, A] from state s after
  a reduction to a nonterminal A


Actions of an LR parser
Let us assume the parser is in configuration

(s0 X1 s1 ... Xm sm, ai ai+1 ... an $)

(initially, the configuration is (s0, a1 a2 ... an $), where a1 ... an
is the input word)
ACTION[sm, ai] can take four values:
1. Shift s: shifts the next input symbol and then the state s on the
   stack
   (s0 X1 s1 ... Xm sm, ai ai+1 ... an $) → (s0 X1 s1 ... Xm sm ai s, ai+1 ... an $)
2. Reduce A → β (denoted by rn where n is a production number)
   - Pop 2r items from the stack, where r = |β|
   - Push A and s where s = GOTO[sm−r, A]
     (s0 X1 s1 ... Xm sm, ai ai+1 ... an $) →
     (s0 X1 s1 ... Xm−r sm−r A s, ai ai+1 ... an $)
   - Output the production A → β
3. Accept: parsing is successfully completed
4. Error: parser detected an error (typically an empty entry in the action
   table).


LR-parsing algorithm

Create a stack with the start state s0
a = getnexttoken()
while (True)
    s = top()   // state on top of the stack (it stays on the stack)
    if (ACTION[s, a] = shift t)
        Push a and t onto the stack
        a = getnexttoken()
    elseif (ACTION[s, a] = reduce A → β)
        Pop 2|β| elements off the stack
        Let state t now be the state on the top of the stack
        Push A onto the stack
        Push GOTO[t, A] onto the stack
        Output A → β
    elseif (ACTION[s, a] = accept)
        break // Parsing is over
    else call error-recovery routine


Example: parsing table for the expression grammar

(SLR) parsing tables for the expression grammar

1. E → E + T
2. E → T
3. T → T * F
4. T → F
5. F → (E)
6. F → id

       Action table                   |  Goto table
state  id   +    *    (    )    $     |  E    T    F
    0  s5             s4              |  1    2    3
    1       s6                  acc   |
    2       r2   s7        r2   r2    |
    3       r4   r4        r4   r4    |
    4  s5             s4              |  8    2    3
    5       r6   r6        r6   r6    |
    6  s5             s4              |       9    3
    7  s5             s4              |            10
    8       s6             s11        |
    9       r1   s7        r1   r1    |
   10       r3   r3        r3   r3    |
   11       r5   r5        r5   r5    |


Example: LR parsing with the expression grammar

stack          input        action               output
0              id*id+id$    shift 5
0id5           *id+id$      reduce by F→id       F→id
0F3            *id+id$      reduce by T→F        T→F
0T2            *id+id$      shift 7
0T2*7          id+id$       shift 5
0T2*7id5       +id$         reduce by F→id       F→id
0T2*7F10       +id$         reduce by T→T*F      T→T*F
0T2            +id$         reduce by E→T        E→T
0E1            +id$         shift 6
0E1+6          id$          shift 5
0E1+6id5       $            reduce by F→id       F→id
0E1+6F3        $            reduce by T→F        T→F
0E1+6T9        $            reduce by E→E+T      E→E+T
0E1            $            accept
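The trace above can be reproduced with a small table-driven driver. The sketch below hard-codes the SLR tables of the expression grammar and follows the LR parsing algorithm, keeping both symbols and states on the stack. The names (PRODUCTIONS, ACTION, GOTO, lr_parse, the `_row` helper) and the string encoding of table entries are assumptions made for this illustration only.

```python
PRODUCTIONS = {
    1: ("E", ["E", "+", "T"]),
    2: ("E", ["T"]),
    3: ("T", ["T", "*", "F"]),
    4: ("T", ["F"]),
    5: ("F", ["(", "E", ")"]),
    6: ("F", ["id"]),
}


def _row(spec):
    """Turn 'id:s5 (:s4' into {'id': 's5', '(': 's4'}."""
    return dict(entry.split(":") for entry in spec.split())


ACTION = {
    0: _row("id:s5 (:s4"),
    1: _row("+:s6 $:acc"),
    2: _row("+:r2 *:s7 ):r2 $:r2"),
    3: _row("+:r4 *:r4 ):r4 $:r4"),
    4: _row("id:s5 (:s4"),
    5: _row("+:r6 *:r6 ):r6 $:r6"),
    6: _row("id:s5 (:s4"),
    7: _row("id:s5 (:s4"),
    8: _row("+:s6 ):s11"),
    9: _row("+:r1 *:s7 ):r1 $:r1"),
    10: _row("+:r3 *:r3 ):r3 $:r3"),
    11: _row("+:r5 *:r5 ):r5 $:r5"),
}
GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}


def lr_parse(tokens):
    """Return the list of productions used (a rightmost derivation in reverse)."""
    tokens = list(tokens) + ["$"]
    stack = [0]                     # s0 X1 s1 ... Xm sm
    pos = 0
    output = []
    while True:
        state = stack[-1]           # look at the top state without popping it
        action = ACTION[state].get(tokens[pos])
        if action is None:          # empty table entry: syntax error
            raise ValueError(f"unexpected {tokens[pos]!r} in state {state}")
        if action == "acc":
            return output
        if action[0] == "s":        # shift: push the token, then the new state
            stack += [tokens[pos], int(action[1:])]
            pos += 1
        else:                       # reduce by production number n
            head, body = PRODUCTIONS[int(action[1:])]
            del stack[len(stack) - 2 * len(body):]   # pop 2|body| elements
            exposed = stack[-1]
            stack += [head, GOTO[exposed][head]]
            output.append(f"{head} -> {' '.join(body)}")


for line in lr_parse(["id", "*", "id", "+", "id"]):
    print(line)
```

The printed productions match the output column of the trace. Error handling is reduced to raising an exception; a real parser would call an error-recovery routine instead.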


Constructing the parsing tables

There are several ways of building the parsing tables, among which:
- LR(0): no lookahead, works for only very few grammars
- SLR: the simplest one with one symbol of lookahead. Works with fewer
  grammars than the next ones
- LR(1): very powerful but generates potentially very large tables
- LALR(1): tradeoff between the other approaches in terms of power
  and simplicity
- LR(k), k > 1: exploits more lookahead symbols

Main idea of all methods: build a DFA whose states keep track of
where we are in the parsing


Parser generators

LALR(1) is used in most parser generators like Yacc/Bison

We will nevertheless only see SLR in detail:
- It's simpler.
- LALR(1) is only slightly more expressive.
- When a grammar is SLR, then the tables produced by SLR are
  identical to the ones produced by LALR(1).
- Understanding of SLR principles is sufficient to understand how to
  handle a grammar rejected by LALR(1) parser generators (see later).


LR(0) item
An LR(0) item (or item for short) of a grammar G is a production of
G with a dot at some position of the body.
Example: A → XYZ yields four items:
A → .XYZ
A → X.YZ
A → XY.Z
A → XYZ.
(A → ε generates one item A → .)
An item indicates how much of a production we have seen at a
given point in the parsing process.
- A → X.YZ means we have just seen on the input a string derivable
  from X (and we hope to get next YZ).
Each state of the SLR parser will correspond to a set of LR(0) items.
A particular collection of sets of LR(0) items (the canonical LR(0)
collection) is the basis for constructing SLR parsers.


Construction of the canonical LR(0) collection

The grammar G is first augmented into a grammar G' with a new
start symbol S' and a production S' → S where S is the start
symbol of G
We need to define two functions:
- Closure(I): extends the set of items I when some of them have a
  dot to the left of a nonterminal
- Goto(I, X): moves the dot past the symbol X in all items in I
These two functions will help define a DFA:
- whose states are (closed) sets of items
- whose transitions (on terminal and nonterminal symbols) are defined
  by the Goto function


Closure
Closure(I)
  repeat
    for any item A → α.Xβ in I
      for any production X → γ
        I = I ∪ {X → .γ}
  until I does not change
  return I

Example:

E' → E           Closure({E' → .E}) = {E' → .E,
E → E + T                               E → .E + T,
E → T                                   E → .T,
T → T * F                               T → .T * F,
T → F                                   T → .F,
F → (E)                                 F → .(E),
F → id                                  F → .id}
Goto

Goto(I, X)
  Set J to the empty set
  for any item A → α.Xβ in I
    J = J ∪ {A → αX.β}
  return closure(J)

Example:

E' → E           I0 = {E' → .E,
E → E + T              E → .E + T,
E → T                  E → .T,
T → T * F              T → .T * F,
T → F                  T → .F,
F → (E)                F → .(E),
F → id                 F → .id}

goto(I0, E) = {E' → E., E → E. + T}
goto(I0, T) = {E → T., T → T. * F}
goto(I0, F) = {T → F.}
goto(I0, '(') = Closure({F → (.E)})
              = {F → (.E)} ∪ (I0 \ {E' → .E})
goto(I0, id) = {F → id.}


Construction of the canonical collection

C = {closure({S' → .S})}
repeat
  for each item set I in C
    for each item A → α.Xβ in I
      C = C ∪ {Goto(I, X)}
until C did not change in this iteration
return C

Collect all sets of items reachable from the initial state by one or
several applications of goto.
Item sets in C are the states of a DFA, and goto is its transition
function.
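Closure, Goto and the construction of the canonical collection fit together as in the following Python sketch for the augmented expression grammar. The representation is an assumption made for the illustration: an item is a tuple (head, body, dot position), a state is a frozenset of items, and a worklist replaces the "repeat until no change" loops. State numbers depend on the order of discovery, so they need not match the I0 to I11 numbering of the slides.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items):
    """Add X -> .gamma for every item with the dot in front of nonterminal X."""
    items = set(items)
    work = list(items)
    while work:
        _, body, dot = work.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for production in GRAMMAR[body[dot]]:
                item = (body[dot], production, 0)
                if item not in items:
                    items.add(item)
                    work.append(item)
    return frozenset(items)


def goto(items, symbol):
    """Move the dot past `symbol` wherever possible, then close the result."""
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved)


def canonical_collection(start_item):
    """Return the list of states and the transition map (state, symbol) -> state."""
    states = [closure({start_item})]
    transitions = {}
    i = 0
    while i < len(states):                      # states may grow as we go
        symbols = {b[d] for _, b, d in states[i] if d < len(b)}
        for x in sorted(symbols):
            target = goto(states[i], x)
            if target not in states:
                states.append(target)
            transitions[i, x] = states.index(target)
        i += 1
    return states, transitions


states, transitions = canonical_collection(("E'", ("E",), 0))
print(len(states), "states,", len(transitions), "transitions")
```

Using frozensets makes two item sets compare equal regardless of the order in which their items were added, which is what the membership test `target not in states` relies on.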


Example
Canonical LR(0) collection for the augmented expression grammar:

I0:  E' → .E,  E → .E + T,  E → .T,  T → .T * F,  T → .F,  F → .(E),  F → .id
I1:  E' → E.,  E → E. + T
I2:  E → T.,  T → T. * F
I3:  T → F.
I4:  F → (.E),  E → .E + T,  E → .T,  T → .T * F,  T → .F,  F → .(E),  F → .id
I5:  F → id.
I6:  E → E + .T,  T → .T * F,  T → .F,  F → .(E),  F → .id
I7:  T → T * .F,  F → .(E),  F → .id
I8:  F → (E.),  E → E. + T
I9:  E → E + T.,  T → T. * F
I10: T → T * F.
I11: F → (E).
Constructing the LR(0) parsing table

1. Construct C = {I0, I1, ..., In}, the collection of sets of LR(0) items
   for G' (the augmented grammar)
2. State i of the parser is derived from Ii. Actions for state i are as
   follows:
   2.1 If A → α.aβ is in Ii and goto(Ii, a) = Ij, then ACTION[i, a] = Shift j
   2.2 If A → α. is in Ii (with A ≠ S'), then set ACTION[i, a] = Reduce A → α
       for all terminals a.
   2.3 If S' → S. is in Ii, then set ACTION[i, $] = Accept
3. If goto(Ii, X) = Ij, then GOTO[i, X] = j.
4. All entries not defined by rules (2) and (3) are made “error”
5. The initial state s0 is the set of items containing S' → .S

⇒ LR(0) because the chosen action (shift or reduce) only depends on the
current state (but the choice of the next state still depends on the token)
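The construction rules above can be turned into a short conflict checker. The sketch below is an illustration under assumptions of its own: items are tuples (head, body, dot position), the grammar is already augmented with a single start production, $ is passed in as one of the terminals, and a conflict is recorded whenever two different actions land in the same ACTION entry. The function names and the two test grammars (a list grammar S → (L) | x, L → S | L , S, and the expression grammar) are chosen for the example.

```python
def lr0_states(grammar, start):
    """Canonical LR(0) collection: list of states and (state, symbol) -> state."""
    def closure(items):
        items, work = set(items), list(items)
        while work:
            _, body, dot = work.pop()
            if dot < len(body) and body[dot] in grammar:
                for production in grammar[body[dot]]:
                    item = (body[dot], production, 0)
                    if item not in items:
                        items.add(item)
                        work.append(item)
        return frozenset(items)

    states = [closure({(start, grammar[start][0], 0)})]
    edges = {}
    i = 0
    while i < len(states):
        for x in sorted({b[d] for _, b, d in states[i] if d < len(b)}):
            target = closure({(h, b, d + 1) for h, b, d in states[i]
                              if d < len(b) and b[d] == x})
            if target not in states:
                states.append(target)
            edges[i, x] = states.index(target)
        i += 1
    return states, edges


def lr0_table(grammar, start, terminals):
    """Build ACTION and GOTO; also return the list of conflicting entries."""
    states, edges = lr0_states(grammar, start)
    action, goto_table, conflicts = {}, {}, []

    def set_action(i, a, value):
        # Keep the first action; remember any different second one as a conflict.
        if action.setdefault((i, a), value) != value:
            conflicts.append((i, a, action[i, a], value))

    for (i, x), j in edges.items():
        if x in grammar:
            goto_table[i, x] = j                     # rule 3
        else:
            set_action(i, x, ("shift", j))           # rule 2.1
    for i, state in enumerate(states):
        for head, body, dot in state:
            if dot == len(body):
                if head == start:
                    set_action(i, "$", ("accept",))  # rule 2.3
                else:
                    for a in terminals:              # rule 2.2
                        set_action(i, a, ("reduce", head, body))
    return action, goto_table, conflicts


LIST_GRAMMAR = {
    "S'": [("S",)],
    "S": [("(", "L", ")"), ("x",)],
    "L": [("S",), ("L", ",", "S")],
}
EXPR_GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}

_, _, list_conflicts = lr0_table(LIST_GRAMMAR, "S'", ["(", ")", "x", ",", "$"])
_, _, expr_conflicts = lr0_table(EXPR_GRAMMAR, "E'", ["id", "+", "*", "(", ")", "$"])
print(len(list_conflicts), "conflicts for the list grammar")
print(len(expr_conflicts), "conflicts for the expression grammar")
```

A grammar is LR(0) exactly when this table has no conflicting entry. Under these rules the list grammar yields none, while the expression grammar yields shift/reduce conflicts on `*` in the states containing E → T. and E → E + T. next to an item with the dot before `*`.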


Example of an LR(0) grammar

GRAMMAR 3.20.
0  S' → S$
1  S → ( L )
2  S → x
3  L → S
4  L → L , S

FIGURE 3.21. LR(0) states for Grammar 3.20.
1: S' → .S$,  S → .(L),  S → .x
2: S → x.
3: S → (.L),  L → .S,  L → .L , S,  S → .(L),  S → .x
4: S' → S.$
5: S → (L.),  L → L. , S
6: S → (L).
7: L → S.
8: L → L , .S,  S → .(L),  S → .x
9: L → L , S.

Rather than rescan the stack for each token, the parser can remember
instead the state reached for each stack element. Then the parsing
algorithm is:

Look up top stack state, and input symbol, to get action;
If action is
  Shift(n):   Advance input one token; push n on stack.
  Reduce(k):  Pop stack as many times as the number of
              symbols on the right-hand side of rule k;
              Let X be the left-hand-side symbol of rule k;
              In the state now on top of stack, look up X to get “goto n”;
              Push n on top of stack.
  Accept:     Stop parsing, report success.
  Error:      Stop parsing, report failure.

TABLE 3.22. LR(0) parsing table for Grammar 3.20.

state  (    )    x    ,    $     |  S    L
    1  s3        s2              |  g4
    2  r2   r2   r2   r2   r2    |
    3  s3        s2              |  g7   g5
    4                      a     |
    5       s6        s8         |
    6  r1   r1   r1   r1   r1    |
    7  r3   r3   r3   r3   r3    |
    8  s3        s2              |  g9
    9  r4   r4   r4   r4   r4    |

(Appel)
Example of a non LR(0) grammar
[Figure: the canonical LR(0) item sets I0 to I11 of the expression grammar, shown together with its (SLR) parsing table]
F) !6).(EF )→ id 11 FT!!
.Tid.+
!!E acc.T
FI6!: 1.. E idET ! E.T+ +⇤3)T F
.T
TT! → T*F
F ! .(E
T .I2⇤ :F EE E!
T I!
! ! 0 T .E
.T
.T
TE
accept
!
. 0Istate+ ⇤!! T
F
.F .E !F ! ! id .T I⇤10F: T !
Goto FTable.(E
!T
T ⇤. !Fid
F .⇤! T FT. ⇤ idFT . ⇤I11F : F ! 89(E 5. .T
).⇤ Fs6r6 r6 Action
id s11
r6
ACTION[s m , 0ai ] can
I1 :(initially,
E !2.the E
T E. !
take
T state
! Example
! four
T
.T
.F
is (s:4)
I ⇤
values:
0, a
F TT1 a→ 2 .F
! .1)
T . Fan⇤ E!$),→I .E+T
F T
10
. EidI!
where
T
II!213 ::: E
:I4 :TI.T
! Ta9F
.F
.T
T
E1. .:⇤
10
!
!!
.Ir2F
!
Example
Example
⇤ .6FEI
FidT
:a(.E
En:T
.E T
:
..!
s7
is
I )E
+ !
E + T* ⇤

the
: !
TE !E!
F +
input
! E TF
r2
.T
. EE
(!
T
+
.r2

+ .+
T
) I $: E
.F F
.
T I.T
. 6 : IF
E !T
9F:! ! ET.E
T 6.
!
(E
!
!
id+F .T
).EF 4.
.F
.F
+! T .!
T !
idFF ! . id1) EI(SLR) 9→ : E+T ET! Parsing
! E + Tr1. Tables
6 .F s5state ids7 +s4 for *r1 E
1. Shift s:word)
E ! E . +F T!1..IE
F
shifts the next input symbol ! .(E 10 ) I9 and : 2)E Ethen ! →E T
T the+! 3 4T state .EE
.T
FI! Example
11!
0
!
⇤ s :on
r4
1 T(E F
9
(.E
E.E.Ts5.F+
T
r4
! +
9
the)!
T T T
! T
(E ).
! ..F
r4
FT
⇤!
s4 r4
!. ⇤.(E
..T
F⇤ E F⇤+
11
I)7 : T
F
I 101
! TI:9.T
2
T! :T⇤⇤TE
3
I.F!
F
.
10 ⇤! I
5. F9 ! (E )F: T : E
T ⇤
E +
! F
! T.
T E . ⇤ + F T.
2) E6 → T. F!!107 .(Es5) 0 r3 s5r3 s4
! ⇤ r3
stack Example:
(s0 X1 s1 . .3.
I2 :ACTION[s
AE!! IT
.XT mExample
Tm!
s !I
. mE, 0ai!
, parsing
6 i] T
6 a : a .F
T
Eid
11
i+1 ⇤ F n
E
. .!:
5)
!
.
can take four 3)
!
6)
!
a FE
)F
F
.T
Etable
E ! !
→+ +
+

(E)
(s
id
(E
.T
F
T.T
XTI 3).
0 1T 1→ T*F
values:F
I
sfor
!:
:
I
. T
11
.
F
F
T
T.
T!
X .the
! !
Im2!
!
4:⇤ a
5 (.E
F F
s,
:i .F
I
F
.(E
.F
.T
.(E
s5
EE.
E
a
Ti+1
id)!
!!
expression
)
)
!
!!T .
2 r6I r6
!:
. .E
T
.
.T
.T a I 6+
. In ⇤r2F
F :⇤10
)
!
T
T
I
).
s6:Ts4
T
!
T:I :T
!
Example
T
!
!
E
Ts7 r6
! T
.(E
F
!
E⇤T!
grammar
!
.FT
F⇤)
! r6..T
.
F
id
⇤r2 .F
F E
acc
.r2

8 .T 2
+ F
ITF1011!
II.T
6 :: E
!
: F
F3 ! .(E )
:T.(E.F
!
!
!F).ET IT!
6.
11
id1.
+
⇤!
F
F(E
:.F E. FT!
!
T!
).. ⇤!
id
E(EF+ T)..3)T
I :
⇤ FTI10→: T*F
1) E
ET
TT!F
E+T
!
11
!9.T
ET
8 . id
+.
T ⇤⇤state
F
.T
1
F
s6
r5 r5
Fr1. ids7 s6+
Action
s11
*r1
r5 T

I:11sr4FExample
2. Reduce F :
1(denoted ! 2.
.(E E
by T .
E rn ! !
) inputwhere T
.T →
I ⇤n:4) is
F T
4 T a ! production
F T ⇤ F 10
.TE number)
! T .T
10
.F
.T . ⇤ F6
F : E ! E + T . I F ! F .
.(E !
id ).T(E ). I → : E ! E + T
2 . r2 s7
1.
T Shift
I Pop 2| | (= r ) E! 4.T s:.T⇤ shifts !
F !E F
the
T.mT+
next
T :.Fstack
10 symbol and then the →
I1Ftable:+E(E) F
F
E 0
I!
! ! 6 ! .(E
. 11
:1 .E
s5
E id+
T .!
3 state
)! .(E
4: TF
I Example
.T )⇤F!
:on
s4
9
F (E !Fr4the
TT
!
T(E
:!
(E
).
!.F
).
r4
..T
r4 I 11
9
7 :
F: ⇤FF8 ! 2.(Eid3.)
I
T3 10 ! : T T ⇤
2. ⇤ I .F !
F
10E : T
! T ⇤
T! F .
T ⇤ F I
4) 2)T E .
11
9
: TF !
:F TET!
! 10.F(E
! ET +
).
. ⇤.T0 r3 s5r3 r3

II3Push
: 2.TAReduce
stack F
!and IF
Example:
(s0 X!
2 s:.Fwhere
AE!
items
1 s1 . .
!!(E
.3.
sT
T
. id
X s!
=.)GOTO[s
!
m ,6parsing
Example Example
from Ithe
! ai a.F
(SLR) T IE11. .!
i+1 ⇤
:5)
.FanFE
Parsing
,IA]
)! !

F :→⇤E
(sF
E
(E
.T0 X!
nF! Iis!
3).sfor
71Tables
: .T
.I..T
EF
T
s5
F
.id
T .!
Xthe
+
!
5!T
11
for
F
.Fs5.F!
m ai s, aexpression
..Fid))Expression
i+1 . s4 . .! aIn6.) ). Example
.(E
id
s4! E
T)grammar
! I⇤7 E
Grammar
⇤ F . E !I11
+ .T10 F !
T !: .(E .FF)I! 111. : (E
F3.) T ! T ⇤ 5)
EF! ).!E(E+).3)TF T→
I6 →

I →(E) : TT! !.T
F
F !11.(E ) 31 r5 r5 s6r4 r4
T ⇤⇤ FF .
r5
5. F
(denoted ! .(E byT)mrn )! 6)r where 5.T F
id id. F
84 a production
! .(E
(.E
s6: Enumber) Fr6I10F
! r6: T T !
!
s11.+
!id.(E r6T.F
I.7)r6.T : T⇤ FF E T! . ⇤+ F I10 FT*F! . .F id 4 s5 r2 s7
. ⇤.!
F.+ .(E I. 90⇤! ! E T T ! . .Fid
I4(s:I06XF1:Is91!. :.I.!
E XE
(.Em s)
EmT! ,+ a!i aE
.T4.
i+1 T T
. aT!
F n) . ! F I
I 5 :
: T F
E ! !
I ! : .Tid.
T EFF
E !
6!
! F
Action .
.E.(Es5
E id +. )Table
T I I : :F s4 ! E (E
! Goto ). E +
Table .T 9 3
.(E
2. E ! T 4) T 11 : TF !
F
! (E ).52 r6 r6
Pop 2| | (=Fr )! items . id T !
from the2.Fstack 91 11
F!
F6Expression ! . :! .(E
id ) IF7 E:!+ 6)Syntax
FI9→ → id
(s0 XE 1 s1! . .I6.
T !
X
.I.E :F+.TrT
!s!T! ⇤andr As,Fid
FF s..F ai+1 ..s.id
i!
awhere . anGOTO[s)(SLR)TParsing
Tr0,! Example .F F.! r1
7Tables
⇤I9F.T
! .E*s5:ids7
.r3+E
T (! for ! ) TE
r1
s4. T Ir1
⇤6+ F T E.F .! TGrammar
F .(E .T
F E ! E .4.
).(Eid.) analysis
!10Syntax
TT !! F T ⇤ 5)F F →T(E)
EF!
: analysis !E.(E +)T3 . r4 r4

Em⇤
Tm m
T F I! T EFE T r3s11!
$T
IF.7⇤.T F
3Push A F=E+T A] E .T 64 s5
I Output the prediction I6.F I E 5.
. ::. .! XE
1) ! ! E→
ai aEE
(E+ )!
. .T .I.(E
0!: E 10
)mstate 5 :id
II25!::s5 .E ,+!
8 r3
id.
I.69⇤:T s6: E r3 F! ! E! .+ Tid T! :! T⇤. Fidanalysis T ⇤+.F F3. F! !T. .id⇤ F s5
F1:Is991! ,+ .T
A ! F Syntax
E ! T I.T !
4(s:0 X (.E Em s) ! i+1 .+ aTn) .
I : FT0! ! .(EF TFEI. )10 ! !
! r5 .T :id.
TIr5 T FE
s4
Action ! ! ! r5T
Table E T Ir5+
⇤ . :⇤.T
F 1EF. .(E ! 2 )E3Table
Goto + .T 75 . s5 r6 r6
I! ! 2) ⇤ E F T ! . id 11 T 6 ! .F F ! .(E ) F! 6) I FI →:: id
TE ! T
E + ⇤ F
T
F1.I:! isT T F⇤T .Fid . T ⇤(EFF .) 5. 4. (EF)
→ 3 Syntax analysis Syntax analysis
3. Accept: parsing (s 0X
)T
successfully
E1 sF .E . 6. !
X
. .ET mF r!
.T s! completed
r As, . .⇤a⇤i aFi+1 . . . aE ) ! .E + 9 T r1
:. +⇤Ir3E9FT*:! Ts7
EE! !
+T
r1
EF.I8T
r1
⇤.+ :FidF
! Syntax
T.F ! analysis 10 9
T10 .TE ⇤!
.(E 1!
+ T T
+ ! T
m
: TE 1)! F
EIE E+TFnFT1! .(.E T T0s6
id
state !
)!Iin
I.F
:6T
id
:F TT!
!
r3 ( F) r3!
!
.T
accF
T).
⇤.T!
⇤ET$T !..TEE1F .T T! I11 : FT ! ! (E T .). ⇤ 8F
6 s5 s6
4→:+ !
TT:! FF ).(E ! . id
r3.(E
FTI011 (E
3).F I9T(typically E .I! 0 :entry E 10 ! .E , action F ESyntax
TAT*F
Syntax analysis ! .+ Fanalysis
4. Error: parser
TI11 ::.FOutput
detectedFEIid !
I! !Tthe!
an :E(E
.T
prediction
error
T !→
)..T ⇤2)!Syntax
FIE5.an empty
FEF2!
.T !r2 10 .(E the
s7 I. )10 IIr56r2! :T: E.F
s4.T r2!
! ! ⇤r5I!6F :T
+ .!
⇤.E ⇤ 34
2) 3
F 6! EE.! 9T T F+I8.! id(E ) I10 : T ! T ⇤978F . s5 s6r1 s7
s5
!
I2. 10
T : +T
4) !T TT F ⇤! F T→.: . T⇤analysis I! 3 :id. .E E1 r4 +11
!! T F
.E + r5 E
F !
r5 E
!
id :TidF
.F .E !
Syntax +analysis
(E.T6. .) 5. F! F !
table). 3. Accept:
!
parsing
)TII!
10 F1.
!: .TF⇤! ! is
E successfully
!
.(E E
) →
F (E ). + T completed T 3 !
Ian
.TFF !
: empty
⇤ I
!11
F .(.E id )ITF!
r4 : s6 FI 6 :
:!
r4! E.F
T
T
.(E
! (E
r4! !
!
E).
)Syntax .T
acc
T
F
+
F).
⇤ !
.T
!
F .
⇤analysis
. ⇤F! .(E . E).T!⇤EF. + F Syntax analysis I11 : F ! (E ).10 r3 r3
I94.F: Error:
E ! .(Eparser
TE +
11 T :.T . 5).⇤id F
! T3))..T T → T*F ESyntax
4 ! .4 : s5.T
analysis
E2 ! .TI11 10 :s7the Faction Tr2! (E
! ⇤T! T F
⇤EF :(E
detected I! Fan error
(E) ! (typically
T ⇤T F entry in
9 T .T! .F 6F2+
34

! r2 s4
F I! :.(E EI9) :! .T r28 E
: .ET
F 6! .F .! T .E + .T6. F ! id
T11 → Syntax analysis E 3+ 119 r1
r5 s7r5
! :.F F )IT E
Syntax analysis
! 3. .!
F table). TidTT ! ⇤T
I2. EF! 10
TE + 4) T → F T I5! F E3! !!F.E
id. +⇤IF FTr4!
id
! ! id
tax analysis T I9 F: !
.! ESyntax
F
.F
.(ET
! 6)E) I! +F→
11 T:.T.id F⇤! F (E ).F5 ! .(E .T T r6
)⇤ .T
r6 r4
I :
11
E
I:r610
!
F:Fr6!
T78E T !
r4
+
T! .F
T
! (E
r4
.(E T
. T !
). ⇤!
.
Syntax ⇤ T FT . ..F
Fanalysis
⇤ ! F .T ⇤ F Syntax analysis 10 r3 r3
E ! .T F ! . id
I5 :I10F: ! 4.Tid. !F!
T FT ! ! F.(E
3.
⇤ F.! T.T)T!!
id
analysis
. ⇤TF ⇤ 5) F F → (E)F T6 ! !s5..F T4 !s5.F
id5 !
9
Syntax analysis
s4 I11F:F!
s4 I! 9F :.(E . id
! E) (E !
F98). !E32+.(E T3 .) 11 88 r5 r5
Syntax analysis T .F 6) F → id
SyntaxSyntax analysisanalysis F7 !s5.(E FT !
)!Syntax
.T r6⇤EFT
I.(E
9 : )analysis s4 !
r6I10
I9 : F
!E:T+
E! ! T⇤T
Ir6.10 :F
. E!
78 r6. TT!
idT+ F ! T .⇤ T10TF. .⇤⇤! ..F analysis
FFSyntax
Syntax analysis
I11Syntax: 5. FI5! F:IF10!
analysis (E
F !
:! ).
4.
(E T.id. )id
!F!
T T!⇤F.(E F . ) I1 : E80 ! EF T6. ! s5.F I10 T : T !s4 T I11 ⇤F:! F ..F⇤(E !). ! (EF9 .). ! id3 .(E )
80 88
F ! . Fid s6 . id !I
s11 11 T
E !: T .I F
!⇤ : T T F! T ⇤ F . Syntax analysis
: I9FI:10 ).E :+Syntax T . !F E!
10ISyntax Eanalysis +. T .
I9 : IE !analysis
FE! +(E T!). . . )id Syntax Syntax analysis
analysis 7 !s5.(E )Syntax analysis
! s4 10 idSyntax analysis
6. F 11!: 5. id FF! E 9 I! : EE.80+ T I11 (E 9T analysis
Syntax 80
(E 1 r1! I E
s7 . : T r1! T: T T
r1
!I ⇤! T :
F .. F ⇤ ⇤ F!F . (E ).
F E + TII.25 :: E F10 ! id. F ! .10 id s6 s11
11
T !I9 T : .E⇤ ! ! SyntaxTE.9 r3!analysis E r3. + Ts7I11 r3 : r3 Fr1). ! r1(E ). 9 :!
ISyntax
T Eanalysis
T!. E ⇤ F+ T .
Syntax analysis 6. F ! id I5 : F10r5!Iid. 11 r1 : F I10! : T (E !T ⇤ F . Syntax analysis 88
ISyntax
10 : analysisT ! TT⇤ ! F .T. ⇤ F T11 I! 2 : TE. ! ⇤ FTr5 . r3 I11 r3 r5 r5 r3 I r3 : T !
10 T ! TT ⇤F . ⇤. F
: F ! (E ). Syntax analysis 88
I11 : F ! I10 (E : ). T ! T ⇤ FI3. : T ! FT.11 ! T . r5⇤ F r5 r5
I11r5 : I10F : !T (E 34
!). T ⇤
Syntax analysis
F . Syntax analysis
SyntaxI analysis
I11 : F ! (E ). I : F ! I3 : (.E T )! F . 11 : F ! (E ).34
Syntax analysis
Syntax 4 analysis Syntax analysis
ESyntax I! 4 : .E F+
analysis !T(.E )

Conflict: in state 2, we don’t know whether to shift or reduce.


Syntax analysis E ! Syntax .TE ! .E analysis
+T
88
Syntax analysis E ! .T 88
Syntax analysis T ! .T ⇤ FSyntax analysis 88
Syntax analysis Syntax analysis TSyntax ! .T analysis ⇤F 80 88
Syntax analysis T ! .FT ! .FSyntax analysis 80

F ! .(E Syntax analysis


Syntax analysis Syntax analysis F )! .(E ) Syntax analysis
91
184
Syntax analysis F ! . id Syntax analysis Syntax analysis 88
Constructing the SLR parsing tables

1. Construct c = {I0, I1, ..., In}, the collection of sets of LR(0) items
   for G′ (the augmented grammar)
2. State i of the parser is derived from Ii. Actions for state i are as
   follows:
   2.1 If A → α.aβ is in Ii and goto(Ii, a) = Ij, then ACTION[i, a] = Shift j
   2.2 If A → α. is in Ii, then ACTION[i, a] = Reduce A → α for all
       terminals a in Follow(A), where A ≠ S′
   2.3 If S′ → S. is in Ii, then set ACTION[i, $] = Accept
3. If goto(Ii, A) = Ij for a nonterminal A, then GOTO[i, A] = j
4. All entries not defined by rules (2) and (3) are made "error"
5. The initial state s0 is the set of items containing S′ → .S

⇒ the simplest form of one-symbol lookahead, SLR (Simple LR)
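The construction rules can be tried out on the expression grammar with a short Python sketch. Everything below is illustrative: the item encoding (production index, dot position), the function names and the hard-coded Follow sets (copied from the expression-grammar example rather than computed) are assumptions, and the state numbers it produces need not match those of the slides.

```python
# Sketch: canonical LR(0) collection and SLR ACTION/GOTO tables.
# Grammar encoding, FOLLOW sets and all names are illustrative assumptions.

GRAMMAR = [                     # production 0 is the augmented rule S' -> E
    ("S'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}
# Follow sets taken from the example, not computed here.
FOLLOW = {"E": {"$", "+", ")"}, "T": {"$", "+", "*", ")"}, "F": {"$", "+", "*", ")"}}


def closure(items):
    """Add (p, 0) for every production p of a nonterminal found right after a dot."""
    result, todo = set(items), list(items)
    while todo:
        prod, dot = todo.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for p, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (p, 0) not in result:
                    result.add((p, 0))
                    todo.append((p, 0))
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` in every item where that is possible."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)


def build_slr():
    states = [closure({(0, 0)})]            # rule 5: state 0 contains S' -> .E
    action, goto_table, conflicts = {}, {}, []

    def set_action(i, a, value):
        if action.get((i, a), value) != value:
            conflicts.append((i, a, action[(i, a)], value))
        action[(i, a)] = value

    i = 0
    while i < len(states):
        for p, d in states[i]:
            lhs, rhs = GRAMMAR[p]
            if d < len(rhs):                          # rules 2.1 and 3
                target = goto(states[i], rhs[d])
                if target not in states:
                    states.append(target)
                j = states.index(target)
                if rhs[d] in NONTERMINALS:
                    goto_table[(i, rhs[d])] = j
                else:
                    set_action(i, rhs[d], ("shift", j))
            elif p == 0:                              # rule 2.3
                set_action(i, "$", ("accept",))
            else:                                     # rule 2.2
                for a in FOLLOW[lhs]:
                    set_action(i, a, ("reduce", p))
        i += 1
    return states, action, goto_table, conflicts


states, action, goto_table, conflicts = build_slr()
print(len(states), "states,", len(conflicts), "conflicts")
```

Entries missing from `action` play the role of "error" (rule 4). A non-empty `conflicts` list would mean the grammar is not SLR.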


Example
(SLR) Parsing Tables for Expression Grammar

1) E → E + T
2) E → T
3) T → T * F
4) T → F
5) F → (E)
6) F → id

             Action Table                 |   Goto Table
state  id    +     *     (     )     $     |  E    T    F
  0    s5                s4                |  1    2    3
  1          s6                      acc   |
  2          r2    s7          r2    r2    |
  3          r4    r4          r4    r4    |
  4    s5                s4                |  8    2    3
  5          r6    r6          r6    r6    |
  6    s5                s4                |       9    3
  7    s5                s4                |            10
  8          s6                s11         |
  9          r1    s7          r1    r1    |
 10          r3    r3          r3    r3    |
 11          r5    r5          r5    r5    |

       First      Follow
E      id (       $ + )
T      id (       $ + * )
F      id (       $ + * )
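To see how a parser uses these tables, here is a minimal Python sketch of the table-driven LR loop on the expression grammar. The table encoding, the `row` helper and the function name `lr_parse` are assumptions made for the illustration. As a simplification, only states are kept on the stack (the grammar symbols are implicit), so a reduction pops |β| entries instead of 2|β|.

```python
# Sketch of an LR driver using the SLR tables of the expression grammar.
# Data layout and names are illustrative assumptions.

# production number -> (left-hand side, length of right-hand side)
RULES = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3), 4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}


def row(spec):
    """Turn 'id:s5 (:s4' into {'id': 's5', '(': 's4'}."""
    return dict(item.split(":") for item in spec.split())


ACTION = {
    0: row("id:s5 (:s4"),
    1: row("+:s6 $:acc"),
    2: row("+:r2 *:s7 ):r2 $:r2"),
    3: row("+:r4 *:r4 ):r4 $:r4"),
    4: row("id:s5 (:s4"),
    5: row("+:r6 *:r6 ):r6 $:r6"),
    6: row("id:s5 (:s4"),
    7: row("id:s5 (:s4"),
    8: row("+:s6 ):s11"),
    9: row("+:r1 *:s7 ):r1 $:r1"),
    10: row("+:r3 *:r3 ):r3 $:r3"),
    11: row("+:r5 *:r5 ):r5 $:r5"),
}
GOTO = {0: {"E": 1, "T": 2, "F": 3}, 4: {"E": 8, "T": 2, "F": 3},
        6: {"T": 9, "F": 3}, 7: {"F": 10}}


def lr_parse(tokens):
    """Return the production numbers used, in the order of the reductions."""
    stack = [0]                          # states only; symbols are implicit
    tokens = list(tokens) + ["$"]
    pos, output = 0, []
    while True:
        act = ACTION[stack[-1]].get(tokens[pos])
        if act is None:                  # empty table entry: error
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        if act == "acc":
            return output
        if act[0] == "s":                # shift: push the new state
            stack.append(int(act[1:]))
            pos += 1
        else:                            # reduce by production n
            n = int(act[1:])
            lhs, length = RULES[n]
            del stack[-length:]
            stack.append(GOTO[stack[-1]][lhs])
            output.append(n)


print(lr_parse("id + id * id".split()))
```

The list returned for id + id * id is the rightmost derivation in reverse, given as production numbers.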


SLR(1) grammars

A grammar for which there is no (shift/reduce or reduce/reduce)
conflict during the construction of the SLR table is called SLR(1)
(or SLR for short).
All SLR grammars are unambiguous, but many unambiguous
grammars are not SLR.
There are more SLR grammars than LL(1) grammars, but there are
LL(1) grammars that are not SLR.


Conflict example for SLR parsing

[Figure (Dragonbook): canonical LR(0) collection for the grammar
S → L = R | R,  L → * R | id,  R → L, where I2 = {S → L. = R, R → L.}]
Follow(R) contains '='. In I2, when seeing '=' on the input, we don't
know whether to shift or to reduce with R → L.


Summary of SLR parsing
Construction of an SLR parser from a context-free grammar
Eliminate ambiguity (or not, see later)
Add the production S′ → S, where S is the start symbol of the
grammar
Compute the canonical collection of LR(0) item sets and the
Goto function (transition function)
Add a shift action in the action table for transitions on terminals
and goto actions in the goto table for transitions on nonterminals
Compute Follow for each nonterminal (which implies first adding
S′′ → S′ $ to the grammar and computing First and Nullable)
Add the reduce actions in the action table according to Follow
Check that the grammar is SLR (and if not, try to resolve conflicts,
see later)


Outline

1. Introduction

2. Context-free grammar

3. Top-down parsing

4. Bottom-up parsing
Shift/reduce parsing
LR parsers
Operator precedence parsing
Using ambiguous grammars

5. Conclusion and some practical considerations



Operator precedence parsing

Bottom-up parsing methods that follow the idea of shift-reduce
parsers
Several flavors: operator, simple, and weak precedence.
In this course, only weak precedence is covered.

Main differences compared to LR parsers:
- There is no explicit state associated with the parser (and thus no state
  pushed on the stack)
- The decision of whether to shift or reduce is based solely on the
  symbol on the top of the stack and the next input symbol (and is stored
  in a shift-reduce table)
- In case of reduction, the handle is the longest sequence at the top of
  the stack matching the RHS of a rule


Structure of the weak precedence parser
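[Figure: the parser reads the input a1 ... ai ... an $, keeps a stack
X1 X2 ... Xm−1 Xm of grammar symbols (Xm on top), consults the shift-reduce
table and produces the output. The table has one row per terminal,
nonterminal and $, one column per terminal and $, and each entry is
Shift, Reduce or Error.]
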
Weak precedence parsing algorithm
Create a stack with the special symbol $
a = getnexttoken()
while (True)
    if (Stack == $S and a == $)
        break // Parsing is over
    Xm = top(Stack)
    if (SRT[Xm, a] == shift)
        Push a onto the stack
        a = getnexttoken()
    elseif (SRT[Xm, a] == reduce)
        Search for the longest RHS that matches the top of the stack
        if no match found
            call error-recovery routine
        Let us denote this rule by Y → Xm−r+1 ... Xm
        Pop r elements off the stack
        Push Y onto the stack
        Output Y → Xm−r+1 ... Xm
    else call error-recovery routine


Example for the expression grammar

Example:

E → E + T
E → T
T → T * F
T → F
F → (E)
F → id

Shift/reduce table (rows: symbol on top of the stack; columns: next input symbol)

        *    +    (    )    id   $
E            S         S         R
T       S    R         R         R
F       R    R         R         R
*                 S         S
+                 S         S
(                 S         S
)       R    R         R         R
id      R    R         R         R
$                 S         S


Example of parsing

Stack            Input             Action

$                id + id * id $    Shift
$ id             + id * id $       Reduce by F → id
$ F              + id * id $       Reduce by T → F
$ T              + id * id $       Reduce by E → T
$ E              + id * id $       Shift
$ E +            id * id $         Shift
$ E + id         * id $            Reduce by F → id
$ E + F          * id $            Reduce by T → F
$ E + T          * id $            Shift
$ E + T *        id $              Shift
$ E + T * id     $                 Reduce by F → id
$ E + T * F      $                 Reduce by T → T * F
$ E + T          $                 Reduce by E → E + T
$ E              $                 Accept
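The weak precedence algorithm and its shift/reduce table for the expression grammar can be replayed with the following Python sketch. The table encoding, the `entries` helper and the function name are assumptions for the illustration; error recovery is replaced by raising an exception.

```python
# Sketch of the weak precedence parsing loop on the expression grammar.
# Data layout and names are illustrative assumptions.

RULES = [("E", ["E", "+", "T"]), ("E", ["T"]), ("T", ["T", "*", "F"]),
         ("T", ["F"]), ("F", ["(", "E", ")"]), ("F", ["id"])]


def entries(symbols, value):
    return {a: value for a in symbols.split()}


# SRT[top of stack][next token]; a missing entry means "error".
SRT = {
    "E": {**entries("+ )", "S"), "$": "R"},
    "T": {"*": "S", **entries("+ ) $", "R")},
    "F": entries("* + ) $", "R"),
    "*": entries("( id", "S"),
    "+": entries("( id", "S"),
    "(": entries("( id", "S"),
    ")": entries("* + ) $", "R"),
    "id": entries("* + ) $", "R"),
    "$": entries("( id", "S"),
}


def weak_precedence_parse(tokens, start="E"):
    """Return the list of rules used, in the order of the reductions."""
    stack, tokens = ["$"], list(tokens) + ["$"]
    pos, output = 0, []
    while not (stack == ["$", start] and tokens[pos] == "$"):
        decision = SRT[stack[-1]].get(tokens[pos])
        if decision == "S":
            stack.append(tokens[pos])
            pos += 1
        elif decision == "R":
            # keep the longest right-hand side that is a suffix of the stack
            matches = [(lhs, rhs) for lhs, rhs in RULES if stack[-len(rhs):] == rhs]
            if not matches:
                raise SyntaxError(f"no handle on top of stack {stack}")
            lhs, rhs = max(matches, key=lambda rule: len(rule[1]))
            del stack[-len(rhs):]
            stack.append(lhs)
            output.append(f"{lhs} -> {' '.join(rhs)}")
        else:
            raise SyntaxError(f"unexpected {tokens[pos]!r} after {stack[-1]!r}")
    return output


for rule in weak_precedence_parse("id + id * id".split()):
    print(rule)
```

On id + id * id the printed rules follow the same order as the Action column of the trace.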


Precedence relation: principle

We define the (weak precedence) relations ⋖ and ⋗ between
symbols of the grammar (terminals or nonterminals)
- X ⋖ Y if XY appears in the RHS of a rule or if X precedes a
  reducible word whose leftmost symbol is Y
- X ⋗ Y if X is the rightmost symbol of a reducible word and Y the
  symbol immediately following that word
Shift when Xm ⋖ a, reduce when Xm ⋗ a
Reducing changes the precedence relation only at the top of the
stack (there is thus no need to shift backward)


Precedence relation: formal definition

Let G = (V, Σ, R, S) be a context-free grammar and $ a new
symbol acting as left and right end-marker for the input word.
Define V′ = V ∪ {$}
The weak precedence relations ⋖ and ⋗ are defined respectively on
V′ × V and V × V′ as follows:
1. X ⋖ Y if A → αXBβ is in R, and B ⇒⁺ Yγ
2. X ⋖ Y if A → αXYβ is in R
3. $ ⋖ X if S ⇒⁺ Xα
4. X ⋗ a if A → αBβ is in R, and B ⇒⁺ γX and β ⇒* aγ
5. X ⋗ $ if S ⇒⁺ αX
for some α, β, γ, and B (not necessarily the same in each rule or derivation)


Construction of the SR table: shift
Shift relation, ⋖:

Initialize S to the empty set.
1 add $ ⋖ S to S (where the first S is the start symbol)
2 for each production X → L1 L2 ... Lk
      for i = 1 to k − 1
          add Li ⋖ Li+1 to S
3 repeat
      for each(*) pair X ⋖ Y in S
          for each production Y → L1 L2 ... Lk
              add X ⋖ L1 to S
  until S did not change in this iteration.

(*) We only need to consider the pairs X ⋖ Y with Y a nonterminal that were added to
S at the previous iteration


Example of the expression grammar: shift

E → E + T
E → T
T → T * F
T → F
F → (E)
F → id

Step 1     $ ⋖ E
Step 2     E ⋖ +
           + ⋖ T
           T ⋖ *
           * ⋖ F
           ( ⋖ E
           E ⋖ )
Step 3.1   + ⋖ F
           * ⋖ id
           * ⋖ (
           ( ⋖ T
           $ ⋖ T
Step 3.2   + ⋖ id
           + ⋖ (
           ( ⋖ F
           $ ⋖ F
Step 3.3   ( ⋖ (
           ( ⋖ id
           $ ⋖ (
           $ ⋖ id


Construction of the SR table: reduce
Reduce relation, ⋗:

Initialize R to the empty set.
1 add S ⋗ $ to R (S is the start symbol)
2 for each production X → L1 L2 ... Lk
      for each pair X ⋖ Y in the shift relation
          add Lk ⋗ Y to R
3 repeat
      for each(*) pair X ⋗ Y in R
          for each production X → L1 L2 ... Lk
              add Lk ⋗ Y to R
  until R did not change in this iteration.

(*) We only need to consider the pairs X ⋗ Y with X a nonterminal that were added to
R at the previous iteration.


Example of the expression grammar: reduce

E → E + T
E → T
T → T * F
T → F
F → (E)
F → id

Step 1     E ⋗ $
Step 2     T ⋗ +
           F ⋗ *
           T ⋗ )
Step 3.1   T ⋗ $
           F ⋗ +
           ) ⋗ *
           id ⋗ *
           F ⋗ )
Step 3.2   F ⋗ $
           ) ⋗ +
           id ⋗ +
           ) ⋗ )
           id ⋗ )
Step 3.3   id ⋗ $
           ) ⋗ $
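Both fixed-point constructions are easy to check mechanically. The Python sketch below computes the shift and reduce relations for the expression grammar and merges them into a shift/reduce table restricted to terminal and $ columns. The representation of a relation as a set of pairs and all function names are assumptions for the illustration.

```python
# Sketch: computing the shift and reduce relations and the SR table.
# Representation (sets of pairs) and names are illustrative assumptions.

RULES = [("E", ["E", "+", "T"]), ("E", ["T"]), ("T", ["T", "*", "F"]),
         ("T", ["F"]), ("F", ["(", "E", ")"]), ("F", ["id"])]
NONTERMINALS = {"E", "T", "F"}
START = "E"


def shift_relation():
    rel = {("$", START)}                                   # step 1
    for _, rhs in RULES:                                   # step 2
        rel |= set(zip(rhs, rhs[1:]))                      # adjacent symbols
    changed = True
    while changed:                                         # step 3
        new = {(x, rhs[0]) for x, y in rel for lhs, rhs in RULES if lhs == y}
        changed = not new <= rel
        rel |= new
    return rel


def reduce_relation(shift):
    rel = {(START, "$")}                                   # step 1
    rel |= {(rhs[-1], y) for lhs, rhs in RULES             # step 2
            for x, y in shift if x == lhs}
    changed = True
    while changed:                                         # step 3
        new = {(rhs[-1], y) for x, y in rel for lhs, rhs in RULES if lhs == x}
        changed = not new <= rel
        rel |= new
    return rel


def sr_table():
    """Map (stack symbol, lookahead) to the set of possible decisions."""
    shift = shift_relation()
    table = {}
    for rel, mark in ((shift, "S"), (reduce_relation(shift), "R")):
        for x, y in rel:
            if y not in NONTERMINALS:       # the lookahead is a terminal or $
                table.setdefault((x, y), set()).add(mark)
    return table


table = sr_table()
conflicts = {key for key, marks in table.items() if len(marks) > 1}
print(len(table), "entries,", len(conflicts), "conflicts")
```

An entry holding both "S" and "R" would be a conflict, that is, two precedence relations between the same pair of symbols.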


Weak precedence grammars

Weak precedence grammars are those that can be analysed by a
weak precedence parser.
A grammar G = (V, Σ, R, S) is called a weak precedence grammar
if it satisfies the following conditions:
1. There exists no pair of productions with the same right-hand side
2. There are no empty right-hand sides (A → ε)
3. There is at most one weak precedence relation between any two
   symbols
4. Whenever there are two syntactic rules of the form A → αXβ and
   B → β, we don't have X ⋖ B
Conditions 1 and 2 are easy to check
Conditions 3 and 4 can be checked by constructing the SR table.


Example of the expression grammar

E → E + T
E → T
T → T * F
T → F
F → (E)
F → id

Shift/reduce table

        *    +    (    )    id   $
E            S         S         R
T       S    R         R         R
F       R    R         R         R
*                 S         S
+                 S         S
(                 S         S
)       R    R         R         R
id      R    R         R         R
$                 S         S

Conditions 1-3 are satisfied (there is no conflict in the SR table)
Condition 4:
- E → E + T and E → T but we don't have + ⋖ E (see slide 250)
- T → T * F and T → F but we don't have * ⋖ T (see slide 250)


Removing ✏ rules
Removing rules of the form A → ε is not difficult
For each rule with A in the RHS, add a set of new rules consisting
of the different combinations of A replaced or not with ε.
Example:

S → AbA | B
B → b | c
A → ε

is transformed into

S → AbA | Ab | bA | b | B
B → b | c
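The "replaced or not" combinations can be enumerated mechanically. The Python sketch below does so for a single nullable nonterminal; the grammar encoding (a dict from nonterminal to a list of right-hand-side tuples) and the function name are assumptions. It performs one pass only: an alternative that becomes empty is simply dropped, so cascading nullability (a nonterminal that becomes nullable as a result) is not handled.

```python
# Sketch: one pass of epsilon-rule removal for a single nullable nonterminal.
# Grammar encoding and names are illustrative assumptions.
from itertools import product


def remove_epsilon(rules, nullable):
    """Expand every occurrence of `nullable` as kept or erased; drop empty RHS."""
    result = {}
    for lhs, alternatives in rules.items():
        new_alternatives = []
        for rhs in alternatives:
            # each occurrence of `nullable` offers two choices, other symbols one
            choices = [((sym,), ()) if sym == nullable else ((sym,),) for sym in rhs]
            for combination in product(*choices):
                alternative = tuple(s for part in combination for s in part)
                if alternative and alternative not in new_alternatives:
                    new_alternatives.append(alternative)
        if new_alternatives:
            result[lhs] = new_alternatives
    return result


grammar = {
    "S": [("A", "b", "A"), ("B",)],
    "B": [("b",), ("c",)],
    "A": [()],                      # A -> epsilon
}
for lhs, alternatives in remove_epsilon(grammar, "A").items():
    print(lhs, "->", " | ".join("".join(alt) for alt in alternatives))
```

Since A has no production other than A → ε in this example, it disappears from the resulting dictionary.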


Summary of weak precedence parsing

Construction of a weak precedence parser
Eliminate ambiguity (or not, see later)
Eliminate productions with ε and ensure that there are no two
productions with identical RHS
Construct the shift/reduce table
Check that there is no conflict during the construction
Check condition 4 of slide 254




Using ambiguous grammars with bottom-up parsers

All grammars used in the construction of Shift/Reduce parsing
tables must be unambiguous
We can still create a parsing table for an ambiguous grammar but
there will be conflicts
We can often resolve these conflicts in favor of one of the choices to
disambiguate the grammar
Why use an ambiguous grammar?
- Because the ambiguous grammar is much more natural and the
  corresponding unambiguous one can be very complex
- Using an ambiguous grammar may eliminate unnecessary reductions
Example:
                                         E → E + T | T
E → E + E | E * E | (E) | id      ⇒      T → T * F | F
                                         F → (E) | id


Set of LR(0) items of the ambiguous expression grammar

E → E + E | E * E | (E) | id

[Figure (Dragonbook): sets of LR(0) items for this grammar.]

Follow(E) = {$, +, *, )}
⇒ states 7 and 8 have shift/reduce conflicts for + and *.

Disambiguation
Example:
Parsing of id + id * id will give the configuration

(0 E 1 + 4 E 7, * id $)

We can choose:
- ACTION[7, *] = shift 5 ⇒ precedence to *
- ACTION[7, *] = reduce E → E + E ⇒ precedence to +

Parsing of id + id + id will give the configuration

(0 E 1 + 4 E 7, + id $)

We can choose:
- ACTION[7, +] = shift 4 ⇒ + is right-associative
- ACTION[7, +] = reduce E → E + E ⇒ + is left-associative
(same analysis for I8)
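These choices can be derived systematically from a precedence level and an associativity per operator, which is the idea behind the %left and %right declarations of Yacc-like generators. The Python sketch below is only an illustration of that rule: the `PRECEDENCE` table, its numeric levels and the function name `resolve` are assumptions.

```python
# Sketch: resolving a shift/reduce conflict from precedence and associativity.
# The table contents and names are illustrative assumptions.

# operator -> (precedence level, associativity); a higher level binds tighter
PRECEDENCE = {"+": (1, "left"), "*": (2, "left")}


def resolve(rule_operator, lookahead):
    """Decide between reducing E -> E rule_operator E and shifting lookahead."""
    rule_level, _ = PRECEDENCE[rule_operator]
    lookahead_level, associativity = PRECEDENCE[lookahead]
    if rule_level > lookahead_level:
        return "reduce"
    if rule_level < lookahead_level:
        return "shift"
    # same level: associativity decides
    return "reduce" if associativity == "left" else "shift"


# State 7 holds E -> E + E. and state 8 holds E -> E * E.
for state, operator in ((7, "+"), (8, "*")):
    for lookahead in ("+", "*"):
        print(f"ACTION[{state}, {lookahead}] = {resolve(operator, lookahead)}")
```

With these settings the only shift is ACTION[7, *], which gives * precedence over + and makes both operators left-associative.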


Top-down versus bottom-up parsing

Top-down:
- Easier to implement (recursively), enough for most standard
  programming languages
- Need to modify the grammar, sometimes strongly; less general than
  bottom-up parsers
- Used in most hand-written compilers and some parser generators
  (JavaCC, ANTLR)
Bottom-up:
- More general, less strict rules on the grammar, SLR(1) powerful
  enough for most standard programming languages
- More difficult to implement, less easy to maintain (add new rules,
  etc.)
- Used in most parser generators (Yacc, Bison)


Hierarchy of grammar classes

[Figure: nested grammar classes. Within the unambiguous grammars,
LR(0) ⊂ SLR ⊂ LALR(1) ⊂ LR(1) ⊂ LR(k) and LL(0) ⊂ LL(1) ⊂ LL(k), with
LL(1) ⊂ LR(1) and LL(k) ⊂ LR(k); ambiguous grammars lie outside all of
these classes.]
FIGURE 3.29. A hierarchy of grammar classes. (Appel)

Error detection and recovery

In table-driven parsers, there is an error as soon as the table
contains no entry (or an error entry) for the current stack (state)
and input symbols
The least one can do: report a syntax error and give information
about the position in the input file and the tokens that were
expected at that position
In practice, however, it is desirable to continue parsing to report
more errors
There are several ways to recover from an error:
- Panic mode
- Phrase-level recovery
- Introduce specific productions for errors
- Global error repair


Panic-mode recovery

In case of syntax error within a “phrase”, skip until the next
synchronizing token is found (e.g., semicolon, right parenthesis) and
then resume parsing
In LR parsing:
- Scan down the stack until a state s with a goto on a particular
  nonterminal A is found
- Discard zero or more input symbols until a symbol a is found that can
  follow A
- Stack the state GOTO(s, A) and resume normal parsing


Phrase-level recovery

Examine each error entry in the parsing table and decide on an
appropriate recovery procedure based on the most likely programmer
error.
Examples in LR parsing: E → E + E | E * E | (E) | id
- id + * id:
  * is unexpected after a +: report a “missing operand” error, push an
  arbitrary number on the stack and go to the appropriate next state
- id + id) + id:
  Report an “unbalanced right parenthesis” error and remove the right
  parenthesis from the input


Other error recovery approaches

Introduce specific productions for detecting errors:
Add rules in the grammar to detect common errors
Examples for a C compiler:
- I → if E I (parentheses are missing around the expression)
- I → if (E) then I (then is not needed in C)

Global error repair:
Try to find globally the smallest set of insertions and deletions that
would turn the program into a syntactically correct string
Very costly and not always effective


Building the syntax tree

Parsing algorithms presented so far only check that the program is
syntactically correct
In practice, the parser also needs to build the parse tree (also called
concrete syntax tree)
Its construction is easily embedded into the parsing algorithm

Top-down parsing:
- Recursive descent: let each parsing function return the sub-trees for
  the parts of the input they parse
- Table-driven: each nonterminal on the stack points to its node in the
  partially built syntax tree. When the nonterminal is replaced by one
  of its RHS, nodes for the symbols on the RHS are added as children
  to the nonterminal node
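As an illustration of the recursive-descent case, the Python sketch below lets each parsing function return the sub-tree for the input it has consumed. It is a sketch under assumptions: the grammar is written with iteration (E → T { + T }, T → F { * F }, F → ( E ) | id) instead of explicit right recursion, trees are nested tuples, and it directly builds a simplified tree with operators as interior nodes rather than the full concrete parse tree.

```python
# Sketch: recursive descent where each parsing function returns a sub-tree.
# Grammar form, tree encoding and names are illustrative assumptions.

def parse_expression(tokens):
    """Parse E -> T { + T },  T -> F { * F },  F -> ( E ) | id."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else "$"

    def expect(token):
        nonlocal pos
        if peek() != token:
            raise SyntaxError(f"expected {token!r}, found {peek()!r}")
        pos += 1

    def expr():
        node = term()
        while peek() == "+":
            expect("+")
            node = ("+", node, term())      # left-associative sub-tree
        return node

    def term():
        node = factor()
        while peek() == "*":
            expect("*")
            node = ("*", node, factor())
        return node

    def factor():
        if peek() == "(":
            expect("(")
            node = expr()
            expect(")")
            return node
        expect("id")
        return "id"

    tree = expr()
    expect("$")                             # the whole input must be consumed
    return tree


print(parse_expression("id + id * id".split()))
```

Keeping one node per grammar symbol instead (for example a node for every T and F) would give the concrete parse tree.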


Building the syntax tree

Bottom-up parsing:
- Each stack element points to a subtree of the syntax tree
- When performing a reduce, a new syntax tree is built with the
  nonterminal at the root and the popped-off stack elements as children

Note:
- In practice, the concrete syntax tree is not built but rather a
  simplified (abstract) syntax tree
- Depending on the complexity of the compiler, the syntax tree might
  even not be constructed

For your project

The choice of a parsing technique is left open for the project
You can either use a parser generator or implement the parser by
yourself
Motivate your choice in your report and explain any transformation
you had to apply to your grammar to make it fit the constraints of
the parser

Parser generators:
- Yacc: Unix parser generator, LALR(1) (companion of Lex)
- Bison: free implementation of Yacc, LALR(1) (companion of Flex)
- ANTLR: LL(*), implemented in Java but outputs code in several
  languages
- ...
http://en.wikipedia.org/wiki/Comparison_of_parser_generators


An example with Flex/Bison
Example: Parsing of the following expression grammar:

Input → Input Line
Input → ε
Line → Exp EOL
Line → EOL
Exp → num
Exp → Exp + Exp
Exp → Exp − Exp
Exp → Exp * Exp
Exp → Exp / Exp
Exp → (Exp)

https://github.com/prashants/calc

Flex file: calc.lex
%{
#define YYSTYPE double /* Define the main semantic type */
#include "calc.tab.h" /* Define the token constants */
#include <stdlib.h>
int yyerror(char *s); /* Defined in calc.y */
%}
%option yylineno /* Ask flex to put line number in yylineno */
white [ \t]+
digit [0-9]
integer {digit}+
exponent [eE][+-]?{integer}
real {integer}("."{integer})?{exponent}?
%%
{white} {}
{real} { yylval=atof(yytext); return NUMBER; }
"+" { return PLUS; }
"-" { return MINUS; }
"*" { return TIMES; }
"/" { return DIVIDE; }
"(" { return LEFT; }
")" { return RIGHT; }
"\n" { return END; }
. { yyerror("Invalid token"); }


Bison file: calc.y
Declaration:
%{
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#define YYSTYPE double /* Define the main semantic type */
extern char *yytext; /* Global variables of Flex */
extern int yylineno;
extern FILE *yyin;
int yylex(void); /* Scanner generated by Flex */
int yyerror(char *s); /* Defined at the end of this file */
%}

Definition of the tokens and start symbol:
%token NUMBER
%token PLUS MINUS TIMES DIVIDE
%token LEFT RIGHT
%token END

%start Input


Bison file: calc.y

Operator associativity and precedence:


%left PLUS MINUS
%left TIMES DIVIDE
%left NEG

Production rules and associated actions:


%%

Input: /* epsilon */
| Input Line
;

Line:
END
| Expression END { printf("Result: %f\n", $1); }
;



Bison file: calc.y
Production rules and actions (continued):
Expression:
NUMBER { $$ = $1; }
| Expression PLUS Expression { $$ = $1 + $3; }
| Expression MINUS Expression { $$ = $1 - $3; }
| Expression TIMES Expression { $$ = $1 * $3; }
| Expression DIVIDE Expression { $$ = $1 / $3; }
| MINUS Expression %prec NEG { $$ = -$2; }
| LEFT Expression RIGHT { $$ = $2; }
;

Error handling:
%%

int yyerror(char *s)
{
    printf("%s on line %d - %s\n", s, yylineno, yytext);
    return 0;
}


Bison file: calc.y
Main functions:
int main(int argc, char **argv)
{
    /* if any input file has been specified read from that */
    if (argc >= 2) {
        yyin = fopen(argv[1], "r");
        if (!yyin) {
            fprintf(stderr, "Failed to open input file\n");
            return EXIT_FAILURE;
        }
    }

    /* yyparse() returns 0 when parsing succeeds */
    if (yyparse() == 0) {
        fprintf(stdout, "Successful parsing\n");
    }

    fclose(yyin);
    fprintf(stdout, "End of processing\n");
    return EXIT_SUCCESS;
}


Bison file: makefile

How to compile:
bison -v -d calc.y
flex -o calc.lex.c calc.lex
gcc -o calc calc.lex.c calc.tab.c -lfl -lm

Example:
>./calc
1+2*3-4
Result: 3.000000
1+3*-4
Result: -11.000000
*2
syntax error on line 3 - *
End of processing


The state machine
Excerpt of calc.output (with Expression abbreviated to Exp):

state 9
    6 Exp: Exp . PLUS Exp
    7    | Exp . MINUS Exp
    8    | Exp . TIMES Exp
    9    | Exp . DIVIDE Exp
   10    | MINUS Exp .

    $default  reduce using rule 10 (Exp)

state 10
    6 Exp: Exp . PLUS Exp
    7    | Exp . MINUS Exp
    8    | Exp . TIMES Exp
    9    | Exp . DIVIDE Exp
   11    | LEFT Exp . RIGHT

    PLUS    shift, and go to state 11
    MINUS   shift, and go to state 12
    TIMES   shift, and go to state 13
    DIVIDE  shift, and go to state 14
    RIGHT   shift, and go to state 16

state 11
    6 Exp: Exp PLUS . Exp

    NUMBER  shift, and go to state 3
    MINUS   shift, and go to state 4
    LEFT    shift, and go to state 5

    Exp  go to state 17