Bottom Up Parsing

Bottom-up parsing builds parse trees from the leaves up. It uses shift-reduce parsing which attempts to construct a parse tree beginning at the leaves and working up towards the root. Shift-reduce parsing uses a stack and input buffer, shifting input symbols onto the stack until a substring matches a production which can then be reduced. LR parsing is a general shift-reduce parsing technique that uses parsing tables to determine the shift and reduce actions.

Bottom Up Parsers
• Bottom-up parsers build parse trees from the leaves and work up to
the root.
• Bottom-up syntax analysis is also known as shift-reduce parsing.
• An easy-to-implement form of shift-reduce parsing is operator-precedence
parsing.
• A general method of shift-reduce parsing is LR parsing.
• Shift-reduce parsing attempts to construct a parse tree for an input string
beginning at the leaves (the bottom) and working up towards the root
(the top).
• At each reduction step a particular substring matching the right side of a
production is replaced by the symbol on the left of that production, and
if the substring is chosen correctly at each step, a rightmost derivation is
traced out in reverse.
Example
•   Consider the grammar
S → aABe
A → Abc | b
B → d  
• The sentence abbcde can be reduced to S by the following steps.
abbcde
aAbcde
aAde
aABe
S
• These reductions trace out the following rightmost derivation in
reverse:
S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde
Handles
• A handle of a string is a substring that matches the
right side of a production, and whose reduction to
the non-terminal on the left side of the production
represents one step along the reverse of a
rightmost derivation.
• A handle of a right-sentential form γ is a production
A → β and a position of γ where the string β may be
found and replaced by A to produce the previous
right-sentential form in a rightmost derivation of γ.
Example
Right Sentential Form    Handle    Reducing Production
abbcde                   b         A->b
aAbcde                   Abc       A->Abc
aAde                     d         B->d
aABe                     aABe      S->aABe
S

S → aABe
A → Abc | b
B → d  
(Grammar)
Handle Pruning
•  A rightmost derivation in reverse can be
obtained by handle pruning.
– Start with a string of terminals w that is to be parsed.
– If w is a sentence of the grammar at hand, then
w = γn, where γn is the nth right sentential form of
some as yet unknown rightmost derivation.
S = γ0 ⇒rm γ1 ⇒rm γ2 ⇒rm … ⇒rm γn-1 ⇒rm γn = w
Example for right sentential form and handle for grammar
E→E+E
E→E*E
E→(E)
E → id
Shift Reduce Parsing
• Two problems must be solved to parse by
handle pruning:
– To locate the substring to be reduced in a right
sentential form.
– To determine which production to choose in case
there is more than one production with that
substring on the right side.
Implementation Of Shift-Reduce Parser
• Use a stack to hold grammar symbols and use $ to mark the bottom of the stack
• Use an input buffer to hold the string w to be parsed and $ for marking the right end of
the input
• Initially the stack is empty, and the string w is on the input, as follows:
Stack                                                   Input
$                                                             w $
• The parser operates by shifting zero or more input symbols onto the stack until a handle
β is on top of the stack.
• The parser then reduces β to the left side of the appropriate production.
• The parser repeats this cycle until it has detected an error or until the stack contains the
start symbol and the input is empty:
Stack                                                   Input
$S                                                               $
• After entering this configuration, the parser halts and announces successful completion
of parsing.
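The shift/reduce cycle above can be sketched in a few lines of Python. This is a minimal sketch, not how a production parser works: it assumes single-character grammar symbols and no empty productions, and it finds handles by backtracking search (try every reduction whose right side is on top of the stack, otherwise shift) instead of by a parsing table. The names GRAMMAR and shift_reduce are illustrative only.

```python
# Grammar from the earlier example: S -> aABe, A -> Abc | b, B -> d
GRAMMAR = [("S", "aABe"), ("A", "Abc"), ("A", "b"), ("B", "d")]


def shift_reduce(tokens, grammar=GRAMMAR, start="S"):
    """Return the reductions that parse `tokens`, or None if there are none.

    The stack is a string whose last character is the top of the stack.
    Assumes single-character symbols and no empty right sides.
    """
    def search(stack, rest, reductions):
        # Accepting configuration: stack holds only the start symbol
        # and the input is empty.
        if stack == start and not rest:
            return reductions
        # Reduce: try every production whose right side is on top of the stack.
        for lhs, rhs in grammar:
            if stack.endswith(rhs):
                found = search(stack[:-len(rhs)] + lhs, rest,
                               reductions + [(lhs, rhs)])
                if found is not None:
                    return found
        # Shift: move the next input symbol onto the stack.
        if rest:
            return search(stack + rest[0], rest[1:], reductions)
        return None  # error: nothing left to try

    return search("", tokens, [])


print(shift_reduce("abbcde"))
```

For abbcde the search settles on the reductions A->b, A->Abc, B->d, S->aABe, the same sequence as the handle table. The backtracking is what a real shift-reduce parser avoids: an LR parser consults its table to know when the top of the stack is a handle.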
Possible Actions
• In a shift action, the next symbol is shifted onto the top
of the stack.
• In a reduce action, the parser knows the right end of
the handle is at the top of the stack. It must then
locate the left end of the handle within the stack and
decide with what non-terminal to replace the handle.
• In an accept action, the parser announces successful
completion of parsing.
• In an error action, the parser discovers that a syntax
error has occurred and calls an error recovery routine.
Example

E→E+E
E→E*E
E→(E)
E → id
Exercise
• Indicate the handle in each of the following
right sentential forms for the grammars
– 000111 for the grammar S -> 0S1 | 01
– aaa*a++ for the grammar S -> SS+ | SS* | a
Conflicts
• There are CFGs for which shift-reduce parsing
cannot be used.
• Every shift-reduce parser for such a grammar
can reach a configuration in which, even knowing the
entire stack contents and the next input symbol,
– the parser cannot decide whether to shift or to
reduce (a shift/reduce conflict), or
– the parser cannot decide which of several reductions
to make (a reduce/reduce conflict).
Example of such grammars:
• An ambiguous grammar can never be LR; a shift-reduce
parser built directly from it always has a conflict.
stmt → if expr then stmt
           | if expr then stmt else stmt
           | other                                      (Grammar)
• If we have a configuration
Stack                              Input
… if expr then stmt                else … $
– there is a shift/reduce conflict: the parser cannot tell whether to
shift the next input symbol (else) or to take if expr then stmt as the
handle and reduce.
LR Parsing
• The LR parsing method is the most general non-backtracking
shift-reduce parsing method.
• LR parsers can parse a large class of
context-free grammars. The technique is called
LR(k) parsing.
– L is for left-to-right scanning of the input.
– R is for constructing a rightmost derivation in reverse.
– k is the number of input symbols of lookahead that are
used in making parsing decisions.
Model of LR Parser

• An LR parser consists of an input, an output, a stack, a driver
program and a parsing table that has two functions:
1. Action
2. Goto
• The driver program is the same for all LR parsers. Only the
parsing table changes from one parser to another.
• The parsing program reads symbols from the input
buffer one at a time. Where a shift-reduce parser would
shift a symbol, an LR parser shifts a state. Each state
summarizes the information contained in the stack below it.
• The stack holds a sequence of states s0, s1, …, sm,
where sm is on the top.
Action and Goto
• Action This function takes as arguments a state i and a
terminal a (or $, the input end marker). The value of
ACTION [i, a] can have one of the four forms:
i) Shift j, where j is a state.
ii) Reduce by a grammar production A---> β.
iii) Accept.
iv) Error.
• Goto This function takes a state and a grammar symbol as
arguments and produces a state. If GOTO[Ii, A] = Ij for the item
sets Ii and Ij, then GOTO also maps state i and non-terminal A to state j.
Types of LR parsers
• In LR parsing, the following widely used
algorithms are available for constructing the table
for an LR parser:
– LR(0) (requires LR(0) items)
– SLR(1) - Simple LR (requires LR(0) items)
– LR(1) - also called the canonical LR
parser (requires LR(1) items)
– LALR(1) - Lookahead LR parser (requires LR(1)
items)
LR(0) Parser
• An LR(0) parser is a shift-reduce parser that uses
zero tokens of lookahead to determine what
action to take (hence the 0).
• This means that in any configuration of the
parser, the parser must have an unambiguous
action to choose: either it shifts a specific
symbol or it applies a specific reduction.
• If there are ever two or more choices to make,
the parser fails and the grammar is not LR(0).
LR(0) Item
• An LR parser makes shift-reduce decisions by maintaining
states to keep track of where we are in a parse.
– Each state represents a set of “items”.
• An LR(0) item of a grammar G is a production of G with a dot
at some position of the body. Thus the production A->XYZ yields
four items:
          A ---> •XYZ
          A ---> X • YZ
          A ---> XY • Z
          A ---> XYZ •
• The production A->ε generates only one item, A-> •.
LR(0) item
• An item indicates how much of a production we
have seen at a given point in the parsing process.
– A ---> •XYZ indicates that we hope to see a string
derivable from XYZ next on the input.
– A ---> X • YZ indicates that we have just seen on
input a string derivable from X and hope next to see a
string derivable from YZ.
– A ---> XYZ • indicates that we have seen the body
XYZ and it may be time to reduce XYZ to A.
Canonical collection of LR(0) items

Grammar:
S -> AA
A -> aA | b

Augmented Grammar:
S’ -> S
S -> AA
A -> aA | b


Canonical collection of LR(0) items
[Figure: transition diagram of the canonical LR(0) item sets for the augmented grammar]
Start with the augmented production in the first state, with the dot at the beginning of the RHS, i.e. S’->.S.
For the closure, add the productions of the non-terminal that immediately follows the dot. Thus S->.AA is added,
which in turn adds the productions of A, i.e. A->.aA|.b. In these items the dot is followed by the
terminals a and b, not by a non-terminal, so the closure of the state is complete.
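The closure step described above, together with the transition (goto) step, can be sketched in Python as follows. This is a minimal sketch under assumptions: the dictionary encoding of the grammar and the function names closure, goto and canonical_collection are illustrative, and the order in which states are discovered need not match the state numbers used on the slides.

```python
# Augmented grammar: S' -> S, S -> AA, A -> aA | b
GRAMMAR = {
    "S'": [("S",)],
    "S": [("A", "A")],
    "A": [("a", "A"), ("b",)],
}


def closure(items, grammar=GRAMMAR):
    """Add B -> .gamma for every non-terminal B that follows a dot."""
    result = set(items)
    work = list(items)
    while work:
        lhs, body, dot = work.pop()
        if dot < len(body) and body[dot] in grammar:
            for production in grammar[body[dot]]:
                item = (body[dot], production, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)


def goto(state, symbol, grammar=GRAMMAR):
    """Move the dot over `symbol` wherever possible, then take the closure."""
    moved = {(l, b, d + 1) for (l, b, d) in state if d < len(b) and b[d] == symbol}
    return closure(moved, grammar) if moved else None


def canonical_collection(grammar=GRAMMAR, start="S'"):
    """All item sets reachable from the closure of start -> .body."""
    first = closure({(start, grammar[start][0], 0)}, grammar)
    states, work = [first], [first]
    while work:
        state = work.pop()
        for symbol in sorted({b[d] for (_, b, d) in state if d < len(b)}):
            target = goto(state, symbol, grammar)
            if target not in states:
                states.append(target)
                work.append(target)
    return states


print(len(canonical_collection()))
```

For this grammar the sketch finds seven item sets, which agrees with the seven states S0 to S6 of the collection.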
Canonical collection of LR(0) items
State   Items                         Transitions
S0      S’->.S, S->.AA, A->.aA|.b     S: S1, A: S2, a: S3, b: S4
S1      S’->S.
S2      S->A.A, A->.aA|.b             A: S5, a: S3, b: S4
S3      A->a.A, A->.aA|.b             A: S6, a: S3, b: S4
S4      A->b.
S5      S->AA.
S6      A->aA.
Parsing Table for LR(0) Parser
Productions:  1. S->AA   2. A->aA   3. A->b

State   ACTION              GOTO
        a     b     $       A     S
S0      s3    s4            2     1
S1                  acc
S2      s3    s4            5
S3      s3    s4            6
S4      r3    r3    r3
S5      r1    r1    r1
S6      r2    r2    r2

Parsing the string aabb:

Stack   Symbols   Input    Action
0                 aabb$    Shift 3
03      a         abb$     Shift 3
033     aa        bb$      Shift 4
0334    aab       b$       Reduce A->b
0336    aaA       b$       Reduce A->aA
036     aA        b$       Reduce A->aA
02      A         b$       Shift 4
024     Ab        $        Reduce A->b
025     AA        $        Reduce S->AA
01      S         $        accept
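The table and the trace for aabb can be replayed with a small table-driven driver. The following Python sketch hard-codes the ACTION and GOTO entries of the LR(0) table for this grammar; the dictionary layout, the string encoding of actions ("s3", "r2", "acc") and the name lr_parse are assumptions made for illustration.

```python
# Production number -> (left side, length of the body)
PRODUCTIONS = {1: ("S", 2), 2: ("A", 2), 3: ("A", 1)}

# ACTION[state][terminal]: "sN" = shift and go to state N,
# "rN" = reduce by production N, "acc" = accept; a missing entry is an error.
ACTION = {
    0: {"a": "s3", "b": "s4"},
    1: {"$": "acc"},
    2: {"a": "s3", "b": "s4"},
    3: {"a": "s3", "b": "s4"},
    4: {"a": "r3", "b": "r3", "$": "r3"},
    5: {"a": "r1", "b": "r1", "$": "r1"},
    6: {"a": "r2", "b": "r2", "$": "r2"},
}
GOTO = {0: {"A": 2, "S": 1}, 2: {"A": 5}, 3: {"A": 6}}


def lr_parse(text):
    """Return the production numbers used, in order, or None on a syntax error."""
    tokens = list(text) + ["$"]
    stack = [0]          # stack of states; the top is stack[-1]
    position = 0
    output = []
    while True:
        action = ACTION[stack[-1]].get(tokens[position])
        if action is None:
            return None                      # error entry
        if action == "acc":
            return output
        number = int(action[1:])
        if action[0] == "s":                 # shift: push the new state
            stack.append(number)
            position += 1
        else:                                # reduce: pop |body| states, then GOTO
            lhs, length = PRODUCTIONS[number]
            del stack[-length:]
            stack.append(GOTO[stack[-1]][lhs])
            output.append(number)


print(lr_parse("aabb"))
```

For aabb the driver reports the reductions 3, 2, 2, 3, 1, i.e. A->b, A->aA, A->aA, A->b, S->AA, matching the Action column of the trace. Only the tables are grammar-specific; the loop itself would stay the same for any LR table in this encoding.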
Difference between LR(0) and SLR(1)
parser
• While constructing the table, the difference lies in the
placement of reduce operations.
• In LR(0), the reduce operation of a final item is placed
under every terminal column of its state.
• In SLR(1), for a reduce operation, check the LHS of the
production of the final item in the canonical collection
of LR(0) items.
• Place the reduce operation only under the
symbols which are in FOLLOW of the symbol on the LHS.
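Placing reduce operations for SLR(1) requires the FOLLOW sets. The Python sketch below computes FIRST and FOLLOW by fixed-point iteration for the grammar S->AA, A->aA | b. It is a simplified sketch that assumes the grammar has no empty productions (true here); the function names first_sets and follow_sets and the list encoding of the grammar are illustrative.

```python
GRAMMAR = [("S", ["A", "A"]), ("A", ["a", "A"]), ("A", ["b"])]


def first_sets(grammar):
    """FIRST of every non-terminal; assumes no production has an empty body."""
    nonterminals = {lhs for lhs, _ in grammar}
    first = {n: set() for n in nonterminals}
    changed = True
    while changed:
        changed = False
        for lhs, body in grammar:
            head = body[0]
            new = first[head] if head in nonterminals else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def follow_sets(grammar, start):
    """FOLLOW of every non-terminal, with $ as the end marker."""
    nonterminals = {lhs for lhs, _ in grammar}
    first = first_sets(grammar)
    follow = {n: set() for n in nonterminals}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for lhs, body in grammar:
            for i, symbol in enumerate(body):
                if symbol not in nonterminals:
                    continue
                if i + 1 < len(body):        # something follows the symbol
                    nxt = body[i + 1]
                    new = first[nxt] if nxt in nonterminals else {nxt}
                else:                        # symbol ends the body
                    new = follow[lhs]
                if not new <= follow[symbol]:
                    follow[symbol] |= new
                    changed = True
    return follow


print(follow_sets(GRAMMAR, "S"))
```

The result is FOLLOW(S) = {$} and FOLLOW(A) = {a, b, $}. So the reduction by S->AA belongs only in the $ column, while the reductions by A->aA and A->b belong in all three terminal columns.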
Similarity between LR(0) and SLR(1)

• Any grammar which is LR(0) will be SLR(1).


• If there is a shift/reduce conflict or a reduce/reduce
conflict in the LR(0) table, there is a
chance that the conflict can be removed in
SLR(1).
• The number of states is the same for LR(0) and SLR(1).
Parsing Table for SLR(1)
State   ACTION              GOTO
        a     b     $       A     S
S0      s3    s4            2     1
S1                  acc
S2      s3    s4            5
S3      s3    s4            6
S4      r3    r3    r3
S5                  r1
S6      r2    r2    r2
Exercise
• Construct the LR(0) and SLR(1) parsing tables
– for the grammar below, and parse the string id*id+id.
1. E->E+T
2. E->T
3. T->T*F
4. T->F
5. F->(E)
6. F->id

– For grammar below and parse the string aaac.


1. S->dA
2. S->aB
3. A->bA
4. A->c
5. B->aB
6. B->c
Unambiguous Grammar but not SLR(1)
Consider the grammar
S -> L=R | R
L -> *R | id
R -> L

The canonical collection of sets of LR(0) items is
I0 : S’->.S
     S->.L=R
     S->.R
     L->.*R
     L->.id
     R->.L
I1 : S’->S.
I2 : S->L.=R
     R->L.
I3 : S->R.
I4 : L->*.R
     R->.L
     L->.*R
     L->.id
I5 : L->id.
I6 : S->L=.R
     R->.L
     L->.*R
     L->.id
I7 : L->*R.
I8 : R->L.
I9 : S->L=R.

• In state I2, it is not clear whether to shift or reduce.
Thus it is not an LR(0) grammar.
• It is not an SLR(1) grammar either, because FOLLOW(R) = {$, =}, so
there will be a conflict in state I2 if the next symbol is =.
LR(1) item
• An LR(1) item is an LR(0) item with a lookahead
symbol.
– A ---> •XYZ , a/b is an LR(1) item where a and b are
lookahead symbols.
– Lookahead symbols help in placing the reduce
moves in the action part of the parsing table.
– For a final item, the reduce operation is
placed only in the columns corresponding to its
lookahead symbols.
Canonical Collection of LR(1) items
• Start with augmented grammar as in case of
LR(0) items.

Grammar:
S -> AA
A -> aA | b

Augmented Grammar:
S’ -> S
S -> AA
A -> aA | b


• For LR(1) item, [A->.B , a]. While finding
closure of such an item, all the productions
with B on left side, i.e B->,
– LR(0) part will be added in the same manner as in
Canonical set of LR(0) items
– Lookahead will be every terminal which is in first
(a).
Construction of LR(1) items
• The procedure is the same as that of LR(0) set
construction.
– Start with the augmented production S’->S with the initial
lookahead $.
– On transition from one state to another on any
symbol, lookahead does not change.
– Lookahead changes only while finding closure.
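The way lookaheads change during closure can be sketched in Python for the augmented grammar S’->S, S->AA, A->aA | b. This is a minimal sketch under assumptions: items are encoded as (left side, body, dot, lookahead), the FIRST sets are hard-coded because the grammar has no empty productions, and the names first_of and closure_lr1 are illustrative.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("A", "A")],
    "A": [("a", "A"), ("b",)],
}
# FIRST sets of the non-terminals, written out by hand for this grammar.
FIRST = {"S": {"a", "b"}, "A": {"a", "b"}}


def first_of(symbols):
    """FIRST of a non-empty symbol string; valid because nothing derives empty."""
    head = symbols[0]
    return FIRST.get(head, {head})   # a terminal (or $) is its own FIRST


def closure_lr1(items):
    """Closure of a set of LR(1) items (lhs, body, dot, lookahead)."""
    result = set(items)
    work = list(items)
    while work:
        lhs, body, dot, lookahead = work.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            # For [A -> alpha . B beta, a], new lookaheads are FIRST(beta a).
            beta_a = body[dot + 1:] + (lookahead,)
            for b in first_of(beta_a):
                for production in GRAMMAR[body[dot]]:
                    item = (body[dot], production, 0, b)
                    if item not in result:
                        result.add(item)
                        work.append(item)
    return result


for item in sorted(closure_lr1({("S'", ("S",), 0, "$")})):
    print(item)
```

Starting from [S’->.S, $], the closure adds [S->.AA, $] and then the A-items with lookaheads a and b, because FIRST(A$) = {a, b}. Moving the dot in a transition would copy the lookahead unchanged; only the closure computes new ones.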
Example
CLR Parsing Table
LALR Parsing Table
• The canonical LR parsing table has a large number of
states.
• An LR parser will not make a wrong
shift or reduce move unless the input contains an error, but the
number of states in the canonical LR parsing table is very large.
• To reduce the number of states, combine all
states that have the same core (the same LR(0) items) and differ only
in their lookahead symbols. The result is the LALR parsing table.
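Merging states by core can be sketched in Python as follows. The sketch assumes LR(1) items encoded as (left side, body, dot, lookahead); the three example states are written out by hand for the grammar S->AA, A->aA | b, and their variable names are illustrative, not state numbers from the slides. A full LALR construction would also redirect the transitions of the merged states, which this sketch leaves out.

```python
from collections import defaultdict


def core(state):
    """The LR(0) part of a state: its items with the lookaheads dropped."""
    return frozenset((lhs, body, dot) for lhs, body, dot, _ in state)


def merge_by_core(states):
    """Union all LR(1) states that share the same core."""
    groups = defaultdict(set)
    for state in states:
        groups[core(state)] |= set(state)
    return [frozenset(items) for items in groups.values()]


# Two states with the core {A -> b.} but different lookaheads,
# and one unrelated state.
state_b_inner = {("A", ("b",), 1, "a"), ("A", ("b",), 1, "b")}
state_b_last = {("A", ("b",), 1, "$")}
state_s_done = {("S", ("A", "A"), 2, "$")}

merged = merge_by_core([state_b_inner, state_s_done, state_b_last])
print(len(merged))
```

The two states for A->b. collapse into one state with lookaheads a, b and $, so three states become two. Merging never creates a shift/reduce conflict that was not already present, but it can create reduce/reduce conflicts, which is why some LR(1) grammars are not LALR(1).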
