Bottom-Up Parsing: Goal of Parser: Build A Derivation

The document discusses bottom-up parsing and shift-reduce parsing. It can be summarized as: (1) Bottom-up parsing builds the parse tree from leaves to root by working from the input back toward the start symbol, unlike top-down parsing which proceeds from the start symbol to the input. (2) Shift-reduce parsing is a bottom-up parsing technique where the parser shifts input symbols onto a stack until a "handle" matching a production is found, then reduces it by popping the handle and pushing the production's left-hand side symbol. (3) The parser determines whether to shift or reduce by using a parsing table constructed from the grammar's LR(1) items, which include lookahead information

Uploaded by

Bacha Hunde

Bottom-up parsing

Goal of parser: build a derivation.

Top-down parser: builds a derivation by working from the start symbol towards the input.
  Builds the parse tree from root to leaves.
  Builds a leftmost derivation.

Bottom-up parser: builds a derivation by working from the input back toward the start symbol.
  Builds the parse tree from leaves to root.
  Builds a rightmost derivation in reverse.

Bottom-up parsing
The parser looks for a substring of the parse tree's frontier...
  ...that matches the rhs of a production, and
  ...whose reduction to the non-terminal on the lhs represents one step along the reverse of a rightmost derivation.

Such a substring is called a handle.

Important: Not all substrings that match a rhs are handles.

Bottom-up parsing techniques

Shift-reduce parsing
  Shift input symbols until a handle is found. Then, reduce the substring to the non-terminal on the lhs of the corresponding production.

Operator-precedence parsing
  Based on shift-reduce parsing.
  Identifies handles based on precedence rules.

Example: Shift-reduce parsing

Grammar:
  1. S → E
  2. E → E + E
  3. E → E * E
  4. E → num
  5. E → id

Input to parse: id1 + num * id2

STACK              ACTION
$                  Shift
$ id1              Reduce (rule 5)
$ E                Shift
$ E +              Shift
$ E + num          Reduce (rule 4)
$ E + E            Shift
$ E + E *          Shift
$ E + E * id2      Reduce (rule 5)
$ E + E * E        Reduce (rule 3)
$ E + E            Reduce (rule 2)
$ E                Reduce (rule 1)
$ S                Accept

Handles: in each Reduce row, the handle is the rhs of the named rule at the top of the stack (underlined on the original slide).

Shift-Reduce parsing
A shift-reduce parser has 4 actions:
  Shift: the next input symbol is shifted onto the stack.
  Reduce: the handle is at the top of the stack; pop the handle and push the appropriate lhs.
  Accept: stop parsing and report success.
  Error: call an error reporting/recovery routine.
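
The four actions can be sketched for the small expression grammar (1. S → E, 2. E → E + E, 3. E → E * E, 4. E → num, 5. E → id). This is only an illustration: the grammar is ambiguous, so the sketch assumes that `*` binds tighter than `+` and uses that precedence to decide when the top of the stack is a handle. The names `RULES`, `PREC` and `shift_reduce` are hypothetical.

```python
# Rule number -> (lhs, rhs) for the example grammar.
RULES = {
    1: ("S", ["E"]),
    2: ("E", ["E", "+", "E"]),
    3: ("E", ["E", "*", "E"]),
    4: ("E", ["num"]),
    5: ("E", ["id"]),
}
# Assumption: '*' binds tighter than '+'; this decides shift vs. reduce.
PREC = {"+": 1, "*": 2}


def shift_reduce(tokens):
    """Return the list of actions taken while parsing `tokens`."""
    stack, rest, trace = [], list(tokens) + ["$"], []
    while True:
        look = rest[0]
        top3 = stack[-3:]
        if stack == ["S"] and look == "$":
            trace.append("accept")                      # Accept
            return trace
        if stack and stack[-1] in ("id", "num"):
            rule = 5 if stack[-1] == "id" else 4
        elif (len(top3) == 3 and top3[0] == "E" and top3[2] == "E"
              and top3[1] in PREC
              and PREC.get(look, 0) <= PREC[top3[1]]):
            rule = 2 if top3[1] == "+" else 3
        elif stack == ["E"] and look == "$":
            rule = 1
        elif look != "$":
            stack.append(rest.pop(0))                   # Shift
            trace.append("shift")
            continue
        else:
            raise SyntaxError(f"cannot continue, stack={stack}")  # Error
        lhs, rhs = RULES[rule]                          # Reduce:
        del stack[-len(rhs):]                           #   pop the handle
        stack.append(lhs)                               #   push the lhs
        trace.append(f"reduce {rule}")


for action in shift_reduce(["id", "+", "num", "*", "id"]):
    print(action)
```

A real LR parser replaces the hand-written precedence test with a parsing table built from the grammar.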

Shift-Reduce parsing
How can we know when we have found a handle?
  Analyze the grammar beforehand.
  Build tables.
  Look ahead in the input.

LR(1) parsers recognize precisely those languages in which one symbol of look-ahead is enough to determine whether to reduce or shift.
  L: for left-to-right parse of the input
  R: for reverse rightmost derivation
  1: for one symbol of lookahead

How does it work?

Read input, one token at a time.

Use a stack to keep track of the current state.
  The state at the top of the stack summarizes the information below it.
  The stack contains information about what has been parsed so far.

Use a parsing table to determine the action based on the current state and the look-ahead symbol.

How do we build a parsing table?

LR parsing techniques
SLR (not in the book)
  Simple LR parsing
  Easy to implement, not strong enough
  Uses LR(0) items

Canonical LR
  Larger parser but powerful
  Uses LR(1) items

LALR (not in the book)
  Condensed version of canonical LR
  May introduce conflicts
  Uses LR(1) items

Class examples
Grammar 1:
  E' → E
  E → E+T
  E → T
  T → T*F
  T → F

Grammar 2:
  S' → S
  S → L=R
  S → R
  L → *R
  L → id
  R → L

Finding handles
As a shift/reduce parser processes the input, it must keep track of all potential handles.

For example, consider the usual expression grammar and the input string x+y.
  Suppose the parser has processed x and reduced it to E. Then, the current state can be represented by E → E • +E, where • means that an E has already been parsed and that +E is a potential suffix which, if found, will result in a successful parse.
  Our goal is to eventually reach state E → E+E •, which represents an actual handle and should result in the reduction E → E+E.

LR parsing
Typically, LR parsing works by building an automaton where each state represents what has been parsed so far and what we hope to parse in the future.

In other words, states contain productions with dots, as described earlier. Such productions are called items.

States containing handles (meaning the dot is all the way to the right end of the production) lead to actual reductions depending on the lookahead.

SLR parsing
SLR parsers build automata where states contain items (a.k.a. LR(0) items) and reductions are decided based on FOLLOW set information.

We will build an SLR table for the augmented grammar:
  S' → S
  S → L=R
  S → R
  L → *R
  L → id
  R → L

SLR parsing
When parsing begins, we have not parsed any input at all and we hope to parse an S. This is represented by S' → •S.

Note that in order to parse that S, we must either parse an L=R or an R. This is represented by S → •L=R and S → •R.

Closure of a state: if A → α•Bβ represents the current state and B → γ is a production, then add B → •γ to the state.
  Justification: α•Bβ means that we hope to see a B next. But parsing a B is equivalent to parsing a γ, so we can say that we hope to see a γ next.

SLR parsing
Use the closure operation to define states containing LR(0) items. The first state will be:

  S' → •S
  S → •L=R
  S → •R
  L → •*R
  L → •id
  R → •L

From this state, if we parse, say, an id, then we go to the state containing L → id•.
If, after some steps, we parse input that reduces to an L, then we go to the state containing S → L•=R and R → L•.
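
The closure operation can be sketched as a small worklist loop. This is a minimal illustration for the augmented grammar S' → S, S → L=R | R, L → *R | id, R → L; the tuple encoding of items `(lhs, rhs, dot)` and the helper names `closure` and `show` are assumptions made for this example.

```python
# Augmented grammar from the slides, as (lhs, rhs) pairs.
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("L", "=", "R")),
    ("S", ("R",)),
    ("L", ("*", "R")),
    ("L", ("id",)),
    ("R", ("L",)),
]


def closure(items):
    """items: set of LR(0) items (lhs, rhs, dot). Returns the closed set."""
    result = set(items)
    work = list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs):
            nxt = rhs[dot]                      # symbol right after the dot
            for lhs, prod in GRAMMAR:
                item = (lhs, prod, 0)
                if lhs == nxt and item not in result:
                    result.add(item)            # we now also hope to see prod
                    work.append(item)
    return result


def show(item):
    """Render an item with '.' standing for the dot."""
    lhs, rhs, dot = item
    return f"{lhs} -> {' '.join(rhs[:dot] + ('.',) + rhs[dot:])}"


I0 = closure({("S'", ("S",), 0)})
for line in sorted(show(i) for i in I0):
    print(line)
```

Starting from the single item S' → •S, the loop adds the other five items of the first state.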

SLR parsing
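Continuing the same way, we define all LR(0) item states:

  I0: S' → •S,  S → •L=R,  S → •R,  L → •*R,  L → •id,  R → •L
  I1: S' → S•
  I2: S → L•=R,  R → L•
  I3: L → id•
  I4: S → R•
  I5: L → *•R,  R → •L,  L → •*R,  L → •id
  I6: S → L=•R,  R → •L,  L → •*R,  L → •id
  I7: R → L•
  I8: L → *R•
  I9: S → L=R•

Transitions:
  I0: on S go to I1, on L go to I2, on R go to I4, on id go to I3, on * go to I5
  I2: on = go to I6
  I5: on * go to I5, on id go to I3, on L go to I7, on R go to I8
  I6: on * go to I5, on id go to I3, on L go to I7, on R go to I9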
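
The construction of all LR(0) states can be sketched by repeatedly applying a goto step (move the dot over one symbol, then take the closure) until no new state appears. This is a hedged illustration for the grammar S' → S, S → L=R | R, L → *R | id, R → L; the function names are hypothetical, and the states come out numbered in discovery order, which need not match the I0..I9 labels used on the slides.

```python
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("L", "=", "R")),
    ("S", ("R",)),
    ("L", ("*", "R")),
    ("L", ("id",)),
    ("R", ("L",)),
]


def closure(items):
    """Close a set of LR(0) items (lhs, rhs, dot)."""
    result, work = set(items), list(items)
    while work:
        _, rhs, dot = work.pop()
        for lhs, prod in GRAMMAR:
            item = (lhs, prod, 0)
            if dot < len(rhs) and lhs == rhs[dot] and item not in result:
                result.add(item)
                work.append(item)
    return frozenset(result)


def goto(state, symbol):
    """Move the dot over `symbol` wherever possible, then close."""
    return closure({(l, r, d + 1) for (l, r, d) in state
                    if d < len(r) and r[d] == symbol})


def canonical_collection():
    states = [closure({("S'", ("S",), 0)})]
    edges = {}
    for state in states:                 # `states` grows while we iterate
        for sym in sorted({r[d] for (_, r, d) in state if d < len(r)}):
            target = goto(state, sym)
            if target not in states:
                states.append(target)
            edges[(states.index(state), sym)] = states.index(target)
    return states, edges


states, edges = canonical_collection()
print(len(states), "states,", len(edges), "transitions")
```

For this grammar the sketch should find the same ten item sets as the automaton above.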

SLR parsing
The automaton and the FOLLOW sets tell us how to build the parsing table:

Shift actions
  If from state i you can go to state j when parsing a token t, then slot [i, t] of the table should contain the action "shift and go to state j", written sj.

Reduce actions
  If a state i contains a handle A → α•, then slot [i, t] of the table should contain the action "reduce using A → α", for all tokens t that are in FOLLOW(A). This is written r(A → α).
  The reasoning is that if the lookahead is a symbol that may follow A, then a reduction A → α should lead closer to a successful parse.

SLR parsing
The automaton and the FOLLOW sets tell us how to build the parsing table:

Reduce actions, continued
  Transitions on non-terminals represent several steps together that have resulted in a reduction.
  For example, if we are in state 0 and parse a bit of input that ends up being reduced to an L, then we should go to state 2.
  Such actions are recorded in a separate part of the parsing table, called the GOTO part.

SLR parsing
Before we can build the parsing table, we need to compute the FOLLOW sets:

  S' → S
  S → L=R
  S → R
  L → *R
  L → id
  R → L

  FOLLOW(S') = {$}
  FOLLOW(S) = {$}
  FOLLOW(L) = {$, =}
  FOLLOW(R) = {$, =}
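
The FOLLOW sets can be checked with a short fixed-point computation. The sketch below relies on a simplifying assumption that holds for this grammar (S' → S, S → L=R | R, L → *R | id, R → L) but not in general: there are no ε-productions, so FIRST of a symbol string is just FIRST of its first symbol. The function names are hypothetical.

```python
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("L", "=", "R")),
    ("S", ("R",)),
    ("L", ("*", "R")),
    ("L", ("id",)),
    ("R", ("L",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_sets():
    """FIRST for each non-terminal (assumes no ε-productions)."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            head = rhs[0]
            new = first[head] if head in NONTERMINALS else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def follow_sets():
    first = first_sets()
    follow = {n: set() for n in NONTERMINALS}
    follow["S'"].add("$")                       # end marker follows the start symbol
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, sym in enumerate(rhs):
                if sym not in NONTERMINALS:
                    continue
                if i + 1 < len(rhs):            # something follows sym in this rhs
                    nxt = rhs[i + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:                           # sym ends the rhs: inherit FOLLOW(lhs)
                    new = follow[lhs]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


FOLLOW = follow_sets()
for name in ["S'", "S", "L", "R"]:
    print(name, sorted(FOLLOW[name]))
```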

SLR parsing
       action                           goto
state  id   =           *    $          S   L   R
0      s3               s5              1   2   4
1                            accept
2           s6/r(R→L)        r(R→L)
3           r(L→id)          r(L→id)
4                            r(S→R)
5      s3               s5                  7   8
6      s3               s5                  7   9
7           r(R→L)           r(R→L)
8           r(L→*R)          r(L→*R)
9                            r(S→L=R)

Note the shift/reduce conflict in state 2 when the lookahead is an =.
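
The two table-filling rules can be sketched directly. The data below (shift transitions, completed items per state, FOLLOW sets) is transcribed from the automaton and FOLLOW sets for the grammar S' → S, S → L=R | R, L → *R | id, R → L; the dictionary layout and the name `build_action_table` are assumptions made for this illustration.

```python
# Transitions on tokens: (state, token) -> target state.
SHIFTS = {
    (0, "id"): 3, (0, "*"): 5,
    (2, "="): 6,
    (5, "id"): 3, (5, "*"): 5,
    (6, "id"): 3, (6, "*"): 5,
}
# Completed items (handles) in each state, as (lhs, rhs) pairs.
HANDLES = {
    2: [("R", "L")],
    3: [("L", "id")],
    4: [("S", "R")],
    7: [("R", "L")],
    8: [("L", "*R")],
    9: [("S", "L=R")],
}
FOLLOW = {"S": {"$"}, "L": {"$", "="}, "R": {"$", "="}}


def build_action_table():
    """Return {(state, token): set of actions}; a set with 2+ entries is a conflict."""
    action = {}
    for (state, tok), target in SHIFTS.items():
        action.setdefault((state, tok), set()).add(f"s{target}")
    for state, handles in HANDLES.items():
        for lhs, rhs in handles:
            for tok in FOLLOW[lhs]:             # reduce on every token in FOLLOW(lhs)
                action.setdefault((state, tok), set()).add(f"r({lhs}->{rhs})")
    action[(1, "$")] = {"accept"}               # state 1 holds S' -> S .
    return action


ACTION = build_action_table()
CONFLICTS = {slot: acts for slot, acts in ACTION.items() if len(acts) > 1}
for (state, tok), acts in CONFLICTS.items():
    print(f"conflict in state {state} on {tok!r}: {sorted(acts)}")
```

Keeping a set per slot, rather than a single action, is what lets the sketch report conflicts instead of silently overwriting an entry.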

Conflicts in LR parsing
There are two types of conflicts in LR parsing:

shift/reduce
  On some particular lookahead it is possible to shift or reduce.
  The if/else ambiguity would give rise to a shift/reduce conflict.

reduce/reduce
  This occurs when a state contains more than one handle that may be reduced on the same lookahead.

Conflicts in SLR parsing

The parser we built has a shift/reduce conflict.

Does that mean that the original grammar was ambiguous? Not necessarily. Let's examine the conflict:
  It occurs when we have parsed an L and are seeing an =. A reduce at that point would turn the L into an R. However, a reduction at that point would never actually lead to a successful parse.
  In this state, L should only be reduced to an R when the lookahead is EOF ($).
  An easy way to understand this is by considering that L represents l-values while R represents r-values.
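
Since the reduction on = in state 2 can never succeed, one way to see the table at work is to resolve that slot by hand in favour of the shift and run the usual table-driven loop. The sketch below does that for the SLR table of the grammar S' → S, S → L=R | R, L → *R | id, R → L. The manual conflict resolution, the tuple encoding of actions and the name `parse` are assumptions for illustration, not something the slides prescribe.

```python
# Rule name -> (lhs, number of symbols on the rhs).
RULES = {"S->L=R": ("S", 3), "S->R": ("S", 1), "L->*R": ("L", 2),
         "L->id": ("L", 1), "R->L": ("R", 1)}


def s(n):
    return ("shift", n)


def r(rule):
    return ("reduce", rule)


ACTION = {
    (0, "id"): s(3), (0, "*"): s(5),
    (1, "$"): ("accept",),
    (2, "="): s(6), (2, "$"): r("R->L"),        # conflict on '=' resolved as shift
    (3, "="): r("L->id"), (3, "$"): r("L->id"),
    (4, "$"): r("S->R"),
    (5, "id"): s(3), (5, "*"): s(5),
    (6, "id"): s(3), (6, "*"): s(5),
    (7, "="): r("R->L"), (7, "$"): r("R->L"),
    (8, "="): r("L->*R"), (8, "$"): r("L->*R"),
    (9, "$"): r("S->L=R"),
}
GOTO = {(0, "S"): 1, (0, "L"): 2, (0, "R"): 4,
        (5, "L"): 7, (5, "R"): 8, (6, "L"): 7, (6, "R"): 9}


def parse(tokens):
    """Return the reductions performed, in order (a reverse rightmost derivation)."""
    stack = [0]                                 # stack of states
    rest = list(tokens) + ["$"]
    reductions = []
    while True:
        act = ACTION.get((stack[-1], rest[0]))
        if act is None:
            raise SyntaxError(f"unexpected {rest[0]!r} in state {stack[-1]}")
        if act[0] == "accept":
            return reductions
        if act[0] == "shift":
            stack.append(act[1])
            rest.pop(0)
        else:
            lhs, length = RULES[act[1]]
            del stack[-length:]                 # pop one state per rhs symbol
            stack.append(GOTO[(stack[-1], lhs)])
            reductions.append(act[1])


print(parse(["id", "=", "*", "id"]))
```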

Conflicts in SLR parsing

The conflict occurred because we made a decision about when to reduce based on what token may follow a non-terminal at any time.

However, the fact that a token t may follow a non-terminal N in some derivation does not necessarily imply that t will follow N in some other derivation.

SLR parsing does not make a distinction.

Conflicts in SLR parsing

SLR parsing is weak.

Solution: instead of using general FOLLOW information, try to keep track of exactly what tokens may follow a non-terminal in each possible derivation and perform reductions based on that knowledge.

Save this information in the states. This gives rise to LR(1) items: items where we also save the possible lookaheads.

Canonical LR(1) parsing

In the beginning, all we know is that we have not read any input (S' → •S), we hope to parse an S, and after that we should expect to see a $ as lookahead. We write this as: [S' → •S, $]

Now, consider a general item [A → α•Bβ, x]. It means that we have parsed an α, we hope to parse Bβ, and after those we should expect an x. Recall that if there is a production B → γ, we should add B → •γ to the state.

What kind of lookahead should we expect to see after we have parsed γ?
  We should expect to see whatever starts a β. If β is empty or can vanish, then we should expect to see an x after we have parsed γ (and reduced it to B).

Canonical LR(1) parsing

The closure function for LR(1) items is then defined as follows:

  For each item [A → α•Bβ, x] in state I, each production B → γ in the grammar, and each terminal b in FIRST(βx), add [B → •γ, b] to I.

If a state contains a core item B → •γ with multiple possible lookaheads b1, b2, ..., we write [B → •γ, b1/b2] as shorthand for [B → •γ, b1] and [B → •γ, b2].
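
The LR(1) closure rule can be sketched for the grammar S' → S, S → L=R | R, L → *R | id, R → L. One assumption keeps the code short: the grammar has no ε-productions, so FIRST(βx) is FIRST of the first symbol of β, or {x} when β is empty. Items are encoded as `(lhs, rhs, dot, lookahead)`; all names are hypothetical.

```python
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("L", "=", "R")),
    ("S", ("R",)),
    ("L", ("*", "R")),
    ("L", ("id",)),
    ("R", ("L",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_of_symbol(sym, seen=frozenset()):
    """FIRST of a single symbol (assumes no ε-productions)."""
    if sym not in NONTERMINALS:
        return {sym}
    out = set()
    for lhs, rhs in GRAMMAR:
        if lhs == sym and rhs[0] not in seen | {sym}:   # skip symbols already on the path
            out |= first_of_symbol(rhs[0], seen | {sym})
    return out


def first_of(beta, x):
    """FIRST(beta x) under the no-ε assumption."""
    return first_of_symbol(beta[0]) if beta else {x}


def closure(items):
    """Close a set of LR(1) items (lhs, rhs, dot, lookahead)."""
    result, work = set(items), list(items)
    while work:
        _, rhs, dot, x = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for b in first_of(rhs[dot + 1:], x):
                for lhs, prod in GRAMMAR:
                    item = (lhs, prod, 0, b)
                    if lhs == rhs[dot] and item not in result:
                        result.add(item)
                        work.append(item)
    return result


I0 = closure({("S'", ("S",), 0, "$")})


def lookaheads(state, lhs, rhs):
    """Collect the lookaheads saved for the initial item lhs -> . rhs."""
    return {x for (l, r, d, x) in state if (l, r, d) == (lhs, rhs, 0)}


print(sorted(lookaheads(I0, "L", ("id",))), sorted(lookaheads(I0, "R", ("L",))))
```

In the first state, the L items pick up both = and $ as lookaheads, while R → •L only gets $.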

Canonical LR(1) parsing

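  I0:  [S' → •S, $]  [S → •L=R, $]  [S → •R, $]  [L → •*R, =/$]  [L → •id, =/$]  [R → •L, $]
  I1:  [S' → S•, $]
  I2:  [S → L•=R, $]  [R → L•, $]
  I3:  [L → id•, =/$]
  I4:  [S → R•, $]
  I5:  [L → *•R, =/$]  [R → •L, =/$]  [L → •*R, =/$]  [L → •id, =/$]
  I6:  [S → L=•R, $]  [R → •L, $]  [L → •*R, $]  [L → •id, $]
  I7:  [R → L•, =/$]
  I8:  [L → *R•, =/$]
  I9:  [S → L=R•, $]
  I3': [L → id•, $]
  I5': [L → *•R, $]  [R → •L, $]  [L → •*R, $]  [L → •id, $]
  I7': [R → L•, $]
  I8': [L → *R•, $]

Transitions:
  I0:  on S go to I1, on L go to I2, on R go to I4, on id go to I3, on * go to I5
  I2:  on = go to I6
  I5:  on * go to I5, on id go to I3, on L go to I7, on R go to I8
  I6:  on * go to I5', on id go to I3', on L go to I7', on R go to I9
  I5': on * go to I5', on id go to I3', on L go to I7', on R go to I8'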

Canonical LR(1) parsing

The table is created in the same way as for SLR, except that we now use the possible lookahead tokens saved in each state instead of the FOLLOW sets.

Note that the conflict that had appeared in the SLR parser is now gone.

However, the LR(1) parser has many more states. This is not very practical.

It may be possible to merge states!

LALR(1) parsing
This is the result of an effort to reduce the number of states in an LR(1) parser.

We notice that some states in our LR(1) automaton have the same core items and differ only in the possible lookahead information. Furthermore, their transitions are similar.
  States I3 and I3', I5 and I5', I7 and I7', I8 and I8'

We shrink our parser by merging such states.

SLR: 10 states, LR(1): 14 states, LALR(1): 10 states
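
The state counts can be reproduced with a short sketch that builds the canonical LR(1) collection for S' → S, S → L=R | R, L → *R | id, R → L and then groups states by core. It assumes a grammar without ε-productions (so FIRST(βx) only needs the first symbol of β), and it only counts merged states rather than building the full LALR(1) table. All names are hypothetical.

```python
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("L", "=", "R")),
    ("S", ("R",)),
    ("L", ("*", "R")),
    ("L", ("id",)),
    ("R", ("L",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

# FIRST sets by fixed-point iteration (assumes no ε-productions).
FIRST = {n: set() for n in NONTERMINALS}
changed = True
while changed:
    changed = False
    for lhs, rhs in GRAMMAR:
        new = FIRST.get(rhs[0], {rhs[0]})       # a terminal is its own FIRST
        if not new <= FIRST[lhs]:
            FIRST[lhs] |= new
            changed = True


def closure(items):
    """Close a set of LR(1) items (lhs, rhs, dot, lookahead)."""
    result, work = set(items), list(items)
    while work:
        _, rhs, dot, x = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            beta = rhs[dot + 1:]
            las = FIRST.get(beta[0], {beta[0]}) if beta else {x}
            for lhs, prod in GRAMMAR:
                for b in las:
                    item = (lhs, prod, 0, b)
                    if lhs == rhs[dot] and item not in result:
                        result.add(item)
                        work.append(item)
    return frozenset(result)


def goto(state, sym):
    return closure({(l, r, d + 1, x) for (l, r, d, x) in state
                    if d < len(r) and r[d] == sym})


def lr1_states():
    states = [closure({("S'", ("S",), 0, "$")})]
    for state in states:                        # list grows while we iterate
        for sym in sorted({r[d] for (_, r, d, _) in state if d < len(r)}):
            target = goto(state, sym)
            if target not in states:
                states.append(target)
    return states


def core(state):
    """Drop the lookaheads, keeping only the LR(0) items."""
    return frozenset((l, r, d) for (l, r, d, _) in state)


LR1 = lr1_states()
LALR_CORES = {core(s) for s in LR1}
print("LR(1):", len(LR1), "states; LALR(1):", len(LALR_CORES), "states")
```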

LALR(1) parsing
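  I0: [S' → •S, $]  [S → •L=R, $]  [S → •R, $]  [L → •*R, =/$]  [L → •id, =/$]  [R → •L, $]
  I1: [S' → S•, $]
  I2: [S → L•=R, $]  [R → L•, $]
  I3: [L → id•, =/$]
  I4: [S → R•, $]
  I5: [L → *•R, =/$]  [R → •L, =/$]  [L → •*R, =/$]  [L → •id, =/$]
  I6: [S → L=•R, $]  [R → •L, $]  [L → •*R, $]  [L → •id, $]
  I7: [R → L•, =/$]
  I8: [L → *R•, =/$]
  I9: [S → L=R•, $]

Transitions:
  I0: on S go to I1, on L go to I2, on R go to I4, on id go to I3, on * go to I5
  I2: on = go to I6
  I5: on * go to I5, on id go to I3, on L go to I7, on R go to I8
  I6: on * go to I5, on id go to I3, on L go to I7, on R go to I9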

Conflicts in LALR(1) parsing

Note that the conflict that had vanished when we created the LR(1) parser has not reappeared.

Can LALR(1) parsers introduce conflicts that did not exist in the LR(1) parser?

Unfortunately YES. BUT, only reduce/reduce conflicts.

Conflicts in LALR(1) parsing

LALR(1) parsers cannot introduce shift/reduce conflicts.
  Such conflicts are caused when a lookahead is the same as a token on which we can shift. They depend on the core of the item. But we only merge states that had the same core to begin with. The only way for an LALR(1) parser to have a shift/reduce conflict is if one existed already in the LR(1) parser.

LALR(1) parsers can introduce reduce/reduce conflicts. Here's a situation when this might happen:

  [A → B•, x]
  [A → C•, y]

merges with

  [A → B•, y]
  [A → C•, x]

to give:

  [A → B•, x/y]
  [A → C•, x/y]
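
The merge can be checked mechanically. In this small sketch each state is a set of (completed item, lookahead) pairs: one state holds A → B• on x and A → C• on y, the other holds A → B• on y and A → C• on x. The string encoding of items and the helper names `merge` and `reduce_reduce_conflicts` are assumptions made for illustration.

```python
def merge(state_a, state_b):
    """Merge two LR(1) states that share the same core (LALR-style)."""
    assert {c for c, _ in state_a} == {c for c, _ in state_b}, "cores differ"
    return state_a | state_b


def reduce_reduce_conflicts(state):
    """Map each lookahead to the completed items competing for it, if more than one."""
    by_lookahead = {}
    for item, lookahead in state:
        by_lookahead.setdefault(lookahead, set()).add(item)
    return {la: items for la, items in by_lookahead.items() if len(items) > 1}


state1 = {("A -> B .", "x"), ("A -> C .", "y")}
state2 = {("A -> B .", "y"), ("A -> C .", "x")}
merged = merge(state1, state2)

print(reduce_reduce_conflicts(state1))   # no conflict before merging
print(reduce_reduce_conflicts(state2))   # no conflict before merging
print(sorted(reduce_reduce_conflicts(merged)))
```

Neither original state has two reductions on the same lookahead, but after merging both x and y select two different reductions.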
