05 Parsing: bottom-up
Parsing
Spring 2016
1 / 131
Outline
1. Parsing
Bottom-up parsing
Bibs
2 / 131
Bottom-up parsing: intro
[Figure: hierarchy of grammar classes. Within the unambiguous grammars: the LL(k) and LR(k) families, with LL(1) ⊆ LR(1) and LR(1) ⊇ LALR(1) ⊇ SLR ⊇ LR(0), and LL(0) at the bottom; ambiguous grammars lie outside.]
5 / 131
LR-parsing and its subclasses
• right-most derivation (but left-to-right parsing)
• in general: bottom-up parsing more powerful than top-down
• typically: tool-supported (unlike recursive descent, which may well be hand-coded)
• based on parsing tables + an explicit stack (of tokens + non-terminals)
• thankfully: left-recursion no longer problematic
• typical tools: yacc and its descendants (like bison, CUP, etc.)
• another name: shift-reduce parser
6 / 131
Example grammar
S′ → S
S → ABt7 ∣ . . .
A → t4 t5 ∣ t1 B ∣ . . .
B → t2 t3 ∣ At6 ∣ . . .
1 That will later be used when constructing a DFA for “scanning” the stack, to control the reactions of the stack machine. This restriction leads to a unique, well-defined initial state.
7 / 131
Parse tree for t1 . . . t7
[Figure: parse tree for t1 . . . t7 with root S′ over S, where S → A B t7, with A → t1 B, B → t2 t3 on the left and B → A t6, A → t4 t5 on the right.]
10 / 131
Bottom-up parse: Growing the parse tree
[Figure, shown stepwise over slides 11–16: the parse tree for number ∗ number is grown bottom-up, from the leaves number and number up through factor and term to the root exp.]
16 / 131
Reduction in reverse = right derivation
• underlined part:
• different in reduction vs. derivation
• represents the “part being replaced”
• for derivation: right-most non-terminal
• for reduction: indicates the so-called handle
• note: all intermediate words are right-sentential forms
17 / 131
Handle
Definition (Handle)
Assume S ⇒∗r αAw ⇒r αβw. A production A → β at position k following α is a handle of αβw. We write ⟨A → β, k⟩ for such a handle.
Note:
• w (right of the handle) contains only terminals
• w: corresponds to the future input still to be parsed!
• αβ will correspond to the stack content.
• the ⇒r -derivation-step in reverse:
• one reduce-step in the LR-parser-machine
• adding (implicitly in the LR-machine) a new parent to the children β (= bottom-up!)
• the “handle” β can be empty (= ε)
18 / 131
Schematic picture of parser machine (again)
[Figure: finite control with states q0 . . . qn, a reading “head” (moving left-to-right) over the input . . . if 1 + 2 ∗ ( 3 + 4 ) . . . , and unbounded extra memory (the stack).]
19 / 131
General LR “parser machine” configuration
• Stack:
• contains: terminals + non-terminals (+ $)
• containing: what has been read already but not yet “processed”
• position on the “tape” (= token stream)
• represented here as word of terminals not yet read
• end of “rest of token stream”: $, as usual
• state of the machine
• in the following schematic illustrations: not yet part of the
discussion
• later: part of the parser table, currently we explain without
referring to the state of the parser-engine
• currently we assume: tree and rest of the input given
• the trick will be: how to achieve the same without the tree already given (just parsing left-to-right)
20 / 131
Schematic run (reduction: from top to bottom)
$              | t1 t2 t3 t4 t5 t6 t7 $
$ t1           | t2 t3 t4 t5 t6 t7 $
$ t1 t2        | t3 t4 t5 t6 t7 $
$ t1 t2 t3     | t4 t5 t6 t7 $
$ t1 B         | t4 t5 t6 t7 $
$ A            | t4 t5 t6 t7 $
$ A t4         | t5 t6 t7 $
$ A t4 t5      | t6 t7 $
$ A A          | t6 t7 $
$ A A t6       | t7 $
$ A B          | t7 $
$ A B t7       | $
$ S            | $
$ S′           | $
21 / 131
2 basic steps: shift and reduce
Shift: move the next input symbol (terminal) over to the top of the stack (“push”).
Reduce: remove the symbols of the right-most subtree from the stack and replace it by the non-terminal at the root of the subtree (replace = “pop + push”).
• easy to do if one has the parse tree already!
22 / 131
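The two moves above can be sketched directly as stack operations. A minimal sketch, assuming the toy grammar E′ → E, E → E + n ∣ n and a reduction sequence that is already known (as when the parse tree is given):

```python
def shift(stack, tokens):
    """Push the next input terminal onto the stack."""
    stack.append(tokens.pop(0))

def reduce(stack, lhs, rhs):
    """Pop the handle (= right-hand side) and push the non-terminal."""
    assert stack[-len(rhs):] == rhs, "handle not on top of stack"
    del stack[-len(rhs):]
    stack.append(lhs)

# replay the known reduction for n + n
stack, tokens = [], ["n", "+", "n", "$"]
shift(stack, tokens)                  # stack: n
reduce(stack, "E", ["n"])             # stack: E
shift(stack, tokens)                  # stack: E +
shift(stack, tokens)                  # stack: E + n
reduce(stack, "E", ["E", "+", "n"])   # stack: E
```

Without the tree, the open problem is deciding when to shift and when to reduce; that is what the DFA over the stack will later provide.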
Example: LR parsing for addition (given the tree)
E′ → E
E → E +n ∣ n
parse stack | input   | action
1  $        | n + n $ | shift
2  $ n      | + n $   | red.: E → n
3  $ E      | + n $   | shift
4  $ E +    | n $     | shift
5  $ E + n  | $       | reduce E → E + n
6  $ E      | $       | red.: E′ → E
7  $ E′     | $       | accept
S′ → S
S → ( S ) S ∣ ε
side remark: unlike the previous grammar, here:
• a production with two non-terminals in the right-hand side
⇒ difference between left-most and right-most derivations (and mixed ones)
24 / 131
Parentheses: tree, run, and right-most derivation
[Figure: parse tree for ( ) with root S′.]
parse stack  | input | action
1  $         | ( ) $ | shift
2  $ (       | ) $   | reduce S → ε
3  $ ( S     | ) $   | shift
4  $ ( S )   | $     | reduce S → ε
5  $ ( S ) S | $     | reduce S → ( S ) S
6  $ S       | $     | reduce S′ → S
7  $ S′      | $     | accept
Note: the 2 reduction steps for the ε-productions
Right-most derivation and right-sentential forms
S′ ⇒r S ⇒r ( S ) S ⇒r ( S ) ⇒r ( )
25 / 131
Right-sentential forms & the stack
• sentential form: word from Σ∗ derivable from the start-symbol
• right-sentential forms:
• part of the “run”
• but: split between stack and input
parse stack | input   | action
1  $        | n + n $ | shift
2  $ n      | + n $   | red.: E → n
3  $ E      | + n $   | shift
4  $ E +    | n $     | shift
5  $ E + n  | $       | reduce E → E + n
6  $ E      | $       | red.: E′ → E
7  $ E′     | $       | accept
right-most derivation: E′ ⇒r E ⇒r E + n ⇒r n + n
reduction: n + n ↪ E + n ↪ E ↪ E′
with the stack ∥ input split: E′ ⇒r E ⇒r E + n ∥ ∼ E + ∥ n ∼ E ∥ + n ⇒r n ∥ + n ∼ ∥ n + n
26 / 131
Viable prefixes of right-sentential forms and handles
• right-sentential form: E + n
• viable prefixes of RSF
• prefixes of that RSF on the stack
• here: 3 viable prefixes of that RSF: E , E +, E + n
• handle: remember the definition earlier
• here: for instance in the sentential form n + n
• the handle is the production E → n at the left occurrence of n in n + n (let’s write n1 + n2 for now)
• note: in the stack machine:
• the left n1 is on the stack
• the rest + n2 is on the input (unread, because of LR(0))
• if the parser engine detects the handle n1 on the stack, it does a reduce-step
• however (later): this depends on the current “state” of the parser engine
27 / 131
A typical situation during LR-parsing
28 / 131
General design for an LR-engine
• some ingredients clarified up-to now:
• bottom-up tree building as reverse right-most derivation,
• stack vs. input,
• shift and reduce steps
• however, one ingredient missing: next step of the engine may
depend on
• top of the stack (“handle”)
• look-ahead on the input (but not for LR(0))
• and: current state of the machine (same stack-content, but
different reactions at different stages of the parse)
General idea:
Construct an NFA (and ultimately a DFA) which works on the stack (not the input). The alphabet consists of terminals and non-terminals ΣT ∪ ΣN . The language of viable prefixes (possible stack contents) is regular!
29 / 131
LR(0) parsing as easy pre-stage
• LR(0): in practice too simple, but easy conceptual step
towards LR(1), SLR(1) etc.
• LR(1): in practice good enough, LR(k) not used for k > 1
LR(0) item
production with specific “parser position” . in its right-hand side
LR(0) item
A → β.γ
8 items
S′ → .S
S′ → S.
S → .(S )S
S → ( .S ) S
S → ( S. ) S
S → ( S ) .S
S → ( S ) S.
S → .
A → β.γ
• β on the stack
• γ: to be treated next (terminals on the input, but can contain
also non-terminals)
33 / 131
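The item count on this slide can be checked mechanically: an LR(0) item is just a production with a dot position. A small sketch (the tuple encoding of productions is an assumption of this sketch):

```python
# Productions of the assumed grammar S' -> S, S -> (S)S | eps,
# encoded as (lhs, rhs) pairs with rhs a tuple of symbols.
GRAMMAR = [("S'", ("S",)), ("S", ("(", "S", ")", "S")), ("S", ())]

def items(grammar):
    """All LR(0) items: every production with every dot position 0..len(rhs)."""
    return [(lhs, rhs, dot)
            for lhs, rhs in grammar
            for dot in range(len(rhs) + 1)]

all_items = items(GRAMMAR)
# 2 items for S' -> S, 5 for S -> (S)S, 1 for S -> eps: 8 in total
```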
State transitions of the NFA
• X ∈ Σ
• two kinds of transitions
2 We have explained shift steps so far as: the parser eats one terminal (= input token) and pushes it onto the stack.
34 / 131
Transitions for non-terminals and ε
• so far: we never pushed a non-terminal from the input to the stack; we replace, in a reduce-step, the right-hand side by the left-hand side
• however: the replacement in a reduce-step can be seen as
1. pop the right-hand side off the stack,
2. instead, “assume” the corresponding non-terminal on the input, &
3. eat the non-terminal and push it on the stack.
• two kinds of transitions
1. the ε-transition corresponds to the “pop” half
2. the X-transition (for non-terminals) corresponds to the “eat-and-push” part
• assume production X → β and initial item X → .β
final states:
• the NFA has a specific task, scanning the stack, not scanning the input
• the acceptance condition of the overall machine is a bit more complex
• the input must be empty
• the stack must be empty except for the (new) start symbol
• the NFA has a word to say about acceptance
• but not in the form of being in an accepting state
• so: no accepting states
• but: an accepting action (see later)
36 / 131
NFA: parentheses
[Figure: NFA over the LR(0) items of S′ → S, S → ( S ) S ∣ ε, with ε-transitions from S′ → .S to S → .( S ) S and S → ., and dot-advancing transitions on (, S, and ).]
37 / 131
Remarks on the NFA
38 / 131
NFA: addition
[Figure: NFA over the LR(0) items of E′ → E, E → E + n ∣ n, with ε-transitions from E′ → .E to E → .E + n and E → .n, and dot-advancing transitions on E, +, and n.]
39 / 131
Determinizing: from NFA to DFA
• standard subset-construction3
• states then contain sets of items
• especially important: ε-closure
• also: direct construction of the DFA is possible
3 Technically, we don’t require a total transition function here; we leave out any error state.
40 / 131
DFA: parentheses
state 0 (start): S′ → .S, S → .( S ) S, S → . ; on S to state 1, on ( to state 2
state 1: S′ → S.
state 2: S → ( .S ) S, S → .( S ) S, S → . ; on ( to state 2, on S to state 3
state 3: S → ( S. ) S ; on ) to state 4
state 4: S → ( S ) .S, S → .( S ) S, S → . ; on ( to state 2, on S to state 5
state 5: S → ( S ) S.
41 / 131
DFA: addition
state 0 (start): E′ → .E, E → .E + n, E → .n ; on E to state 1, on n to state 2
state 1: E′ → E., E → E. + n ; on + to state 3
state 2: E → n.
state 3: E → E + .n ; on n to state 4
state 4: E → E + n.
42 / 131
Direct construction of an LR(0)-DFA
ε-closure
• if A → α.Bγ is an item in a state
• and there are productions B → β1 ∣ β2 . . .
• add the items B → .β1 , B → .β2 . . . to the state
• continue that process until saturation
initial state (start):
S′ → .S, plus closure
43 / 131
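The saturation described above can be sketched as a fixed-point loop. A minimal sketch for the parentheses grammar; representing an item as a (production, dot) pair is an assumption of this sketch:

```python
# Grammar S' -> S, S -> (S)S | eps; rhs alternatives as tuples.
GRAMMAR = {
    "S'": [("S",)],
    "S":  [("(", "S", ")", "S"), ()],   # () encodes the empty production
}

def closure(items):
    """epsilon-closure: if A -> alpha . B gamma is present, add B -> . beta
    for every production B -> beta, until saturation."""
    items = set(items)
    changed = True
    while changed:
        changed = False
        for (lhs, rhs), dot in list(items):
            if dot < len(rhs) and rhs[dot] in GRAMMAR:  # dot before a non-terminal
                for beta in GRAMMAR[rhs[dot]]:
                    new = ((rhs[dot], beta), 0)
                    if new not in items:
                        items.add(new)
                        changed = True
    return items

# initial state: closure of { S' -> .S }
start = closure({(("S'", ("S",)), 0)})
# contains S' -> .S, S -> .(S)S and S -> .
```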
Direct DFA construction: transitions
[Figure: a state with items A1 → α1 .Xβ1 , A2 → α2 .Xβ2 , . . . has an X-transition to the state with items A1 → α1 X.β1 , A2 → α2 X.β2 , . . . , plus closure.]
44 / 131
How does the DFA do the shift/reduce and the rest?
45 / 131
Stack contents and state of the automaton
46 / 131
State transition allowing a shift
X → α.aβ
• the construction thus has a transition as follows:
[Figure: state s with item X → α.aβ, an a-transition to state t with item X → αa.β]
• a shift is possible
• if shift is the correct operation and a is the terminal symbol corresponding to the current token: the state afterwards = t
47 / 131
State transition: analogous for non-terminals
[Figure: state s with item X → α.Bβ, a B-transition to state t with item X → αB.β]
E′ → E
E → E + n ∣ n
parse stack | input   | action
1  $        | n + n $ | shift
2  $ n      | + n $   | red.: E → n
3  $ E      | + n $   | shift
4  $ E +    | n $     | shift
5  $ E + n  | $       | reduce E → E + n
6  $ E      | $       | red.: E′ → E
7  $ E′     | $       | accept
51 / 131
DFA of addition example
[Figure: the LR(0)-DFA of the addition grammar as before, states 0–4.]
52 / 131
LR(0) grammars
LR(0) grammar
The top-state alone determines the next step.
53 / 131
Simple parentheses
A → ( A ) ∣ a
[Figure: LR(0)-DFA. State 0 (start): A′ → .A, A → .( A ), A → .a; state 1: A′ → A.; state 2: A → a.; state 3: A → ( .A ), A → .( A ), A → .a; state 4: A → ( A. ); state 5: A → ( A ).]
• for shift:
• many shift transitions in one state are allowed
• shift counts as one action (including “shifts” on non-terminals)
• but for reduction: also the production must be clear
54 / 131
Simple parentheses is LR(0)
[Figure: the same DFA as on the previous slide.]
state | possible action
0 | only shift
1 | only reduce (with A′ → A)
2 | only reduce (with A → a)
3 | only shift
4 | only shift
5 | only reduce (with A → ( A ))
55 / 131
NFA for simple parentheses (bonus slide)
[Figure: NFA of LR(0) items for A′ → A, A → ( A ) ∣ a, with ε-transitions from A′ → .A to A → .( A ) and A → .a, and dot-advancing transitions on A, (, ), and a.]
56 / 131
Parsing table for an LR(0) grammar
• table structure slightly different for SLR(1), LALR(1), and
LR(1) (see later)
• note: the “goto” part: “shift” on non-terminals (only one
non-terminal here)
• corresponding to the A-labelled transitions
• see the parser run on the next slide
parse stack        | input       | action
1  $0              | ( ( a ) ) $ | shift
2  $0 (3           | ( a ) ) $   | shift
3  $0 (3 (3        | a ) ) $     | shift
4  $0 (3 (3 a2     | ) ) $       | reduce A → a
5  $0 (3 (3 A4     | ) ) $       | shift
6  $0 (3 (3 A4 )5  | ) $         | reduce A → ( A )
7  $0 (3 A4        | ) $         | shift
8  $0 (3 A4 )5     | $           | reduce A → ( A )
9  $0 A1           | $           | accept
[Figure: the corresponding parse tree, root A′ over A, leaves ( ( a ) ).]
• As said:
• the reduction “contains” the parse-tree
• reduction: builds it bottom up
• reduction in reverse: contains a right-most derivation (which is
“top-down”)
• accept action: corresponds to the parent-child edge A′ → A of
the tree
59 / 131
Parsing of erroneous input
• empty slots in the table: “errors”
Invariant
important general invariant for LR-parsing: never shift something
“illegal” onto the stack
60 / 131
LR(0) parsing algo, given DFA
61 / 131
LR(0) parsing algo remarks
62 / 131
DFA parentheses again: LR(0)?
S′ → S
S → (S )S ∣
[Figure: the LR(0)-DFA of the parentheses grammar as before, states 0–5.]
63 / 131
DFA addition again: LR(0)?
E′ → E
E → E + number ∣ number
[Figure: the LR(0)-DFA of the addition grammar as before, states 0–4.]
66 / 131
Decision? If only we knew the ultimate tree already . . .
. . . especially the parts still to come
[Figure: the assumed parse tree (root E′) and the LR(0)-DFA of the addition grammar as before.]
68 / 131
One look-ahead
69 / 131
Resolving LR(0) reduce/reduce conflicts
70 / 131
Resolving LR(0) reduce/reduce conflicts
71 / 131
Resolving LR(0) shift/reduce conflicts
[Figure: a state with items B1 → β1 .b1 γ1 and B2 → β2 .b2 γ2 and outgoing transitions on b1 and b2 .]
72 / 131
Resolving LR(0) shift/reduce conflicts
73 / 131
SLR(1) requirement on states (as in the book)
74 / 131
Revisit addition one more time
[Figure: the LR(0)-DFA of the addition grammar as before; the critical state 1 contains E′ → E. and E → E. + n.]
• Follow (E ′ ) = {$}
⇒ • shift for +
• reduce with E ′ → E for $ (which corresponds to accept, in case
the input is empty)
75 / 131
SLR(1) algo
let s be the current state, on top of the parse stack
1. if s contains A → α.Xβ, where X is a terminal and X is the next token on the input, then
• shift X from the input to the top of the stack. The new state pushed on the stack: the state t where s →X t (5)
2. if s contains a complete item (say A → γ.) and the next token in the input is in Follow(A): reduce by rule A → γ:
• a reduction by S′ → S: accept, if the input is empty (6)
• else:
pop: remove γ (including “its” states) from the stack
back up: assume to be in state u, which is now the head state
push: push A to the stack; the new head state is t, where u →A t
3. if the next token is such that neither 1. nor 2. applies: error
5 Cf. the LR(0) algo: since we checked the existence of the transition before, the else-part is missing now.
6 Cf. the LR(0) algo: this happens now only if the next token is $. Note that the follow-set of S′ in the augmented grammar is always only {$}.
76 / 131
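The algorithm above can be sketched as a table-driven loop. The ACTION/GOTO entries below are transcribed by hand from the addition DFA (states 0–4) and its follow-sets, so they are an assumption of this sketch rather than generator output:

```python
# SLR(1) tables for E' -> E, E -> E + n | n (hand-transcribed).
ACTION = {
    (0, "n"): ("shift", 2),
    (1, "+"): ("shift", 3),
    (1, "$"): ("accept", None),          # reduce E' -> E on $ = accept
    (2, "+"): ("reduce", ("E", 1)),      # E -> n
    (2, "$"): ("reduce", ("E", 1)),
    (3, "n"): ("shift", 4),
    (4, "+"): ("reduce", ("E", 3)),      # E -> E + n
    (4, "$"): ("reduce", ("E", 3)),
}
GOTO = {(0, "E"): 1}

def parse(tokens):
    stack = [0]                          # stack of states only
    tokens = list(tokens) + ["$"]
    while True:
        act = ACTION.get((stack[-1], tokens[0]))
        if act is None:
            return False                 # empty slot in the table: error
        kind, arg = act
        if kind == "shift":
            tokens.pop(0)
            stack.append(arg)
        elif kind == "reduce":
            lhs, rhs_len = arg
            del stack[-rhs_len:]         # pop: remove gamma and "its" states
            stack.append(GOTO[(stack[-1], lhs)])   # push: new head state
        else:
            return True                  # accept
```

For example, parse(["n", "+", "n"]) follows the run on the next slides and accepts, while parse(["+", "n"]) hits an empty slot immediately and reports an error.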
Parsing table for SLR(1)
[Figure: the LR(0)-DFA of the addition grammar as before, states 0–4.]
7 by which it, strictly speaking, would no longer be an SLR(1)-table :-)
78 / 131
SLR(1) parser run (= “reduction”)
1  $0          | n + n + n $ | shift: 2
2  $0 n2       | + n + n $   | reduce: E → n
3  $0 E1       | + n + n $   | shift: 3
4  $0 E1 +3    | n + n $     | shift: 4
5  $0 E1 +3 n4 | + n $       | reduce: E → E + n
6  $0 E1       | + n $       | shift: 3
7  $0 E1 +3    | n $         | shift: 4
8  $0 E1 +3 n4 | $           | reduce: E → E + n
9  $0 E1       | $           | accept
79 / 131
Corresponding parse tree
[Figure: parse tree for n + n + n with root E′.]
80 / 131
Revisit the parentheses again: SLR(1)?
[Figure: the LR(0)-DFA of the parentheses grammar as before, states 0–5.]
81 / 131
SLR(1) parse table
82 / 131
Parentheses: SLR(1) parser run (= “reduction”)
84 / 131
Ambiguity & LR-parsing
• in principle: LR(k) (and LL(k)) grammars are unambiguous
• by definition/construction: free of shift/reduce and reduce/reduce conflicts (given the chosen level of look-ahead)
• however: an ambiguous grammar is tolerable, if the (remaining) conflicts can be resolved meaningfully otherwise
86 / 131
Simplified conditionals
Follow-sets
Follow(S′) = {$}
Follow(S) = {$, else}
Follow(I) = {$, else}
87 / 131
DFA of LR(0) items
[Figure: LR(0)-DFA for the simplified conditionals S → I ∣ other, I → if S ∣ if S else S (augmented with S′ → S); the critical state contains both I → if S. and I → if S. else S, i.e. a potential shift/reduce conflict on else.]
88 / 131
Simple conditionals: parse table
89 / 131
Parser run (= reduction)
[Figure: parser run (= reduction) and resulting parse tree for nested conditionals.]
“dangling else”
“an else belongs to the last previous, still open (= dangling)
if-clause”
92 / 131
Use of ambiguous grammars
E′ → E
E → E + E ∣ E ∗ E ∣ number
93 / 131
DFA for + and ×
[Figure: LR(0)-DFA for E′ → E, E → E + E ∣ E ∗ E ∣ n. State 5 contains E → E + E. together with E → E. + E and E → E. ∗ E; state 6 contains E → E ∗ E. together with E → E. + E and E → E. ∗ E: shift/reduce conflicts.]
94 / 131
States with conflicts
• state 5:
• stack contains . . . E + E
• for input $: reduce, since shift is not allowed from $
• for input +: reduce, as + is left-associative
• for input ∗: shift, as ∗ has precedence over +
• state 6:
• stack contains . . . E ∗ E
• for input $: reduce, since shift is not allowed from $
• for input +: reduce, as ∗ has precedence over +
• for input ∗: reduce, as ∗ is left-associative
• see also the table on the next slide
95 / 131
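The case analysis above can be sketched as a small decision function over declared precedence and associativity. The encoding below is an assumption for illustration (note that left-associativity of an operator yields reduce at equal precedence):

```python
PREC = {"+": 1, "*": 2}       # * binds tighter than +
LEFT = {"+", "*"}             # both operators are left-associative

def resolve(stack_op, lookahead):
    """Decide shift vs. reduce when the stack ends in E stack_op E
    and the next input token is `lookahead`."""
    if lookahead == "$":
        return "reduce"                   # shift is not allowed from $
    if PREC[lookahead] > PREC[stack_op]:
        return "shift"                    # lookahead binds tighter
    if PREC[lookahead] < PREC[stack_op]:
        return "reduce"                   # operator on the stack binds tighter
    # equal precedence: associativity decides
    return "reduce" if stack_op in LEFT else "shift"
```

This reproduces the table entries for states 5 and 6, e.g. resolve("+", "*") is "shift" and resolve("*", "+") is "reduce".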
Parse table + and ×
96 / 131
For comparison: unambiguous grammar for + and ∗
E′ → E
E → E + T ∣ T
T → T ∗ n ∣ n
Follow(E′) = {$} (as always for the start symbol)
Follow(E) = {$, +}
Follow(T) = {$, +, ∗}
97 / 131
DFA for unambiguous + and ×
[Figure: LR(0)-DFA for the unambiguous grammar, states 0–7, with items such as E → .E + T, E → .T, T → .T ∗ n, T → .n and transitions on E, T, n, +, and ∗.]
98 / 131
DFA remarks
99 / 131
LR(1) parsing
100 / 131
Limits of SLR(1) grammars
101 / 131
non-SLR(1): Reduce/reduce conflict
102 / 131
Situation can be saved
103 / 131
LALR(1) (and LR(1)): Being more precise with the
follow-sets
LR(1) items
[A → α.β, a] (2)
• a: terminal/token, including $
9
Not to mention if we wanted look-ahead of k > 1, which in practice is not
done, though
105 / 131
LALR(1)-DFA (or LR(1)-DFA)
106 / 131
Remarks on the DFA
107 / 131
Full LR(1) parsing
SLR(1): LR(0)-item-based parsing, with afterwards adding some extra “pre-compiled” info (about follow-sets) to increase expressivity.
LALR(1): LR(1)-item-based parsing, but afterwards throwing away precision by collapsing states, to save space.
108 / 131
LR(1) transitions: arbitrary symbol
X-transition
[A → α.Xβ, a] →X [A → αX.β, a]
109 / 131
LR(1) transitions: ε
ε-transition
for all B → β1 ∣ β2 . . . and all b ∈ First(γa):
[A → α.Bγ, a] →ε [B → .β, b]
Special case (γ = ε)
for all B → β1 ∣ β2 . . . :
[A → α.B, a] →ε [B → .β, a]
110 / 131
LALR(1) vs LR(1)
LR(1)
LALR(1)
111 / 131
Core of LR(1)-states
112 / 131
LALR(1)-DFA by as collapse
113 / 131
Concluding remarks of LR / bottom up parsing
10
If designing a new language, there’s also the option to massage the
language itself. Note also: there are inherently ambiguous languages for which
there is no unambiguous grammar.
114 / 131
LR/bottom-up parsing overview
LR(0)
advantages: defines the states also used by SLR and LALR
remarks: not really used; many conflicts, very weak
SLR(1)
advantages: clear improvement over LR(0) in expressiveness, even if using the same number of states; table typically with 50K entries
remarks: weaker than LALR(1), but often good enough; OK for hand-made parsers for small grammars
LALR(1)
advantages: almost as expressive as LR(1), but with the number of states of LR(0)!
remarks: method of choice for most generated LR-parsers
LR(1)
advantages: the method covering all bottom-up, one-look-ahead parseable grammars
remarks: large number of states (typically 11M entries); mostly LALR(1) is preferred
Remember: once the table (specific for LR(0), . . . ) is set up, the parsing algorithms all work the same
115 / 131
Error handling
Minimal requirement
Upon “stumbling over” an error (= deviation from the grammar): give a reasonable & understandable error message, indicating also the error location. Potentially stop parsing.
116 / 131
Error messages
• important:
• try to avoid error messages that only occur because of an already reported error!
• report the error as early as possible, if possible at the first point where the program cannot be extended to a correct program.
• make sure that, after an error, one doesn’t end up in an infinite loop without reading any input symbols.
• What’s a good error message?
• assume: that the method factor() chooses the alternative ( exp ) but that it, when control returns from method exp(), does not find a )
• one could report: left parenthesis missing
• but this may often be confusing, e.g. if the program text is: ( a + b c )
• here the exp() method will terminate after ( a + b, as c cannot extend the expression. You should therefore rather give the message error in expression or left parenthesis missing.
117 / 131
Error recovery in bottom-up parsing
• panic recovery in LR-parsing
• simple form
• the only one we shortly look at
• upon error, the recovery ⇒
• pops parts of the stack
• ignores parts of the input
• until “on track again”
• but: how to do that?
• additional problem: non-determinism
• table: constructed conflict-free under normal operation
• upon error (and clearing parts of the stack + input): no guarantee it’s clear how to continue
⇒ heuristic needed (like panic mode recovery)
119 / 131
Possible error situation
120 / 131
Panic mode recovery
Algo
1. Pop states from the stack until a state is found with non-empty goto entries
2. • If there’s a legal action on the current input token from one of the goto-states, push the token on the stack and restart the parse.
• If there are several such states: prefer a shift to a reduce
• Among possible reduce actions: prefer the one whose associated non-terminal is least general
3. If there is no legal action on the current input token from one of the goto-states: advance the input until there is a legal action (or until the end of the input is reached)
121 / 131
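The three steps above can be sketched over a tiny table fragment. The GOTO/ACTION entries and state numbers below are made up for illustration, not generated from any real grammar:

```python
GOTO = {(0, "exp"): 1}                       # only state 0 has goto entries
ACTION = {(1, "+"): "shift", (1, "$"): "accept"}

def panic_recover(state_stack, tokens):
    # step 1: pop states until one with non-empty goto entries is found
    while state_stack and all(s != state_stack[-1] for (s, _) in GOTO):
        state_stack.pop()
    if not state_stack:
        return False
    # steps 2/3: advance the input until a goto-state has a legal action
    while tokens:
        for (s, _nt), target in GOTO.items():
            if s == state_stack[-1] and (target, tokens[0]) in ACTION:
                state_stack.append(target)   # resume in that goto-state
                return True                  # ... and restart the parse
        tokens.pop(0)                        # ignore this input token
    return False

stack = [0, 3, 5]                            # states 3, 5: no goto entries
ok = panic_recover(stack, ["*", "$"])        # "*" is skipped, "$" is legal
```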
Example again
122 / 131
Example again
123 / 131
Panic mode may loop forever
124 / 131
Typical yacc parser table
some variant of the expression grammar again
command → exp
exp → exp + term ∣ term
term → term ∗ factor ∣ factor
factor → number ∣ ( exp )
125 / 131
Panicking and looping
127 / 131
Outline
1. Parsing
Bottom-up parsing
Bibs
128 / 131
References I
129 / 131