Lecture 9

The document discusses bottom-up parsing and LR parsing. Bottom-up parsing reduces a string to the start symbol by inverting productions and generating a rightmost derivation in reverse. LR parsing uses states containing items to make shift-reduce decisions and avoid conflicts by tracking the parsing position.

Uploaded by

Vedang Chavan

CS327 - Compilers

Bottom-up Parsing

Abhishek Bichhawat 14/02/2024


Bottom-up Parsing
● More general than top-down parsing
○ Just as efficient as top-down parsing
○ Do not need special grammars
○ Can work with this grammar:
E → T + E | T
T → int * T | int | (E)
● Reduces a string to the start symbol by inverting productions
● Generates a rightmost derivation in reverse
Bottom-up Parsing
E → T + E | T
T → int * T | int | (E)

Str: int * int + int

Sentential form        Production used for the next reduction
int * int + int        T → int
int * T + int          T → int * T
T + int                T → int
T + T                  E → T
T + E                  E → T + E
E
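Bottom-up Parsing
E → T + E | T
T → int * T | int | (E)

Str: int * int + int

[Figure: the parse tree built bottom-up by the reductions int * int + int, int * T + int, T + int, T + T, T + E, E. The root E has children T + E; the left T has children int * T, and that inner T derives int; the right E derives T, which derives int.]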
How do reductions happen?
● Split the string into left and right substrings
○ Right substring contains only terminals (unexamined input)
○ Left substring contains a sequence of non-terminals and terminals
○ Initially, the string is all terminals; replace some substring with a non-terminal and then proceed
● We will use a marker (|) to mark the split
● Initially, we have |a1a2..an
Shift-reduce Parsing
● Shift moves the marker to the right
○ |a1a2..an → a1|a2..an
○ Shifts a terminal into the left substring
● Reduce applies a production in reverse at the right end of the left substring
○ Suppose A → a1a2, then a1a2|a3..an → A|a3..an
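The two moves can be sketched in a few lines of Python. This is only an illustration under assumptions: the names Config, shift and reduce_by are hypothetical (not from the lecture), and a configuration is modelled as a pair (left substring, unexamined input).

```python
from typing import List, Tuple

# A configuration is (left substring, unexamined input); the marker | sits between them.
Config = Tuple[List[str], List[str]]


def shift(config: Config) -> Config:
    """Move the marker one terminal to the right."""
    left, right = config
    if not right:
        raise ValueError("nothing left to shift")
    return left + [right[0]], right[1:]


def reduce_by(config: Config, lhs: str, rhs: List[str]) -> Config:
    """Replace rhs at the right end of the left substring with lhs."""
    left, right = config
    cut = len(left) - len(rhs)
    if cut < 0 or left[cut:] != rhs:
        raise ValueError("right-hand side is not at the end of the left substring")
    return left[:cut] + [lhs], right


# First steps on int * int + int with the lecture's grammar.
config: Config = ([], ["int", "*", "int", "+", "int"])
for _ in range(3):
    config = shift(config)
config = reduce_by(config, "T", ["int"])
print(config)
```

The sketch does not decide when to shift or reduce; it only applies a move that the caller has already chosen.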
Shift-reduce Parsing
E → T + E | T
T → int * T | int | (E)

Str: int * int + int

Configuration          Stack           Next action
| int * int + int      $               shift
int | * int + int      $ int           shift
int * | int + int      $ int *         shift
int * int | + int      $ int * int     reduce T → int
int * T | + int        $ int * T       reduce T → int * T
T | + int              $ T             shift
T + | int              $ T +           shift
T + int |              $ T + int       reduce T → int
T + T |                $ T + T         reduce E → T
T + E |                $ T + E         reduce E → T + E
E |                    $ E             (input reduced to the start symbol)
Handles
● A handle is a substring that matches the right-hand side of a production and whose reduction is one step of the reverse rightmost derivation
○ Can be reduced to the LHS non-terminal
○ If int * int | + int reduces to int * T | + int, then the second int is said to be the handle
● If a grammar is unambiguous, then every right-sentential form has exactly one handle
● Use a stack to implement shift-reduce parsing, with the handle on the top of the stack
Conflicts
● At some steps, both a shift and a reduce may be possible
○ Known as a shift-reduce conflict
○ Expected and can be removed
● At some steps, a reduce by two different productions may be possible
○ Known as a reduce-reduce conflict
○ Indicates problems with the grammar that need to be resolved
Conflicts
E→T+E|T
T → int * T | int | (E)

Str: int | * int + int

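● At this point, how do we know whether to shift or reduce?
● The reduction T → int can be performed, but we should not perform it: the resulting T | * int + int can no longer be reduced to the start symbol E, so we shift instead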
LR Parsing
● LR(k) Parsers are bottom-up parsers
○ L is for scanning the input left to right
○ R is for constructing a rightmost derivation in reverse
○ k is the number of lookahead symbols
● LR parsers make shift-reduce decisions by maintaining states to keep track of where we are
○ These states are sets of “items”
○ An item of grammar G is a production of G with a . (dot) somewhere in the RHS
For A → BC: A → .BC, A → B.C, A → BC. are all the items of this production
○ An item indicates how much of a production we have seen at a given point in parsing
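As a small illustration, the items of a production can be enumerated by sliding the dot across its right-hand side. The representation below (a tuple of LHS, RHS and dot position) and the helper names items_of and show are assumptions made for this sketch, not notation from the lecture.

```python
from typing import List, Sequence, Tuple

# An item is (lhs, rhs, dot): the dot sits before rhs[dot], or at the end if dot == len(rhs).
Item = Tuple[str, Tuple[str, ...], int]


def items_of(lhs: str, rhs: Sequence[str]) -> List[Item]:
    """All items of one production: one per possible dot position."""
    return [(lhs, tuple(rhs), dot) for dot in range(len(rhs) + 1)]


def show(item: Item) -> str:
    """Render an item in the A → α.β notation used on the slides."""
    lhs, rhs, dot = item
    return f"{lhs} → {' '.join(rhs[:dot])}.{' '.join(rhs[dot:])}"


for item in items_of("A", ["B", "C"]):
    print(show(item))
```

A production with n symbols on its right-hand side therefore has n + 1 items.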
Recognizing Handles - Viable Prefixes
● Handles are the substrings that we want to reduce
○ Handles always appear on the top of the stack; no backtracking
○ Need to recognize handles to correctly shift and reduce
○ No known efficient algorithm recognizes handles in general
● Use heuristics to guess that a substring on the stack is a handle
● Stack contents are always a prefix of a right-sentential form (a viable prefix), i.e., if α is the stack content and β is the remaining input, then we should be able to reduce αβ to the start symbol
● LR parsing is based on the fact that the items can be used to construct an FA, known as the LR(0) automaton, which accepts viable prefixes
Viable Prefixes - Example
E→T+E|T
T → int * T | int | (E)

Given the input (int)

● (E|) is a state of the shift-reduce parse
● (E is a viable prefix, as it is a prefix of the RHS of T → (E)
● The item T → (E.) says that so far we have matched (E of the input and we hope to see ) next
Viable Prefixes - Example
Given the input string (int * int):
(int * | int) is a state in the parsing where

Prefix on the stack    Production it is a prefix of    Item
(                      T → (E)                         T → (.E)
ε                      E → T                           E → .T
int *                  T → int * T                     T → int * . T
LR(0) Automaton
● LR(0) automaton takes as input the stack and returns whether or not the symbols on the stack form a viable prefix
● Each state of the automaton contains a set of items
● Start by adding a dummy production S’ → S to the grammar
○ Tells the parser to accept the input when it reduces using S’ → S
● If I (a state in the FA) is a set of items, then compute CLOSURE(I):
○ Every item in I is in CLOSURE(I)
○ If A → α.Bβ is in CLOSURE(I) and B → γ is a production, then add B → .γ to CLOSURE(I); repeat until no new items can be added
● Compute GOTO(I, X) for a grammar symbol X:
○ Closure of the set of all items [A → αX.β] such that [A → α.Xβ] is in I
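The CLOSURE and GOTO definitions above can be sketched in Python for the lecture's grammar. This is a minimal sketch under assumptions: items are modelled as (lhs, rhs, dot) tuples, and the names GRAMMAR, closure, goto and build_states are hypothetical helpers chosen for this example.

```python
# Augmented grammar from the lecture: E' → E, E → T + E | T, T → int * T | int | (E)
GRAMMAR = {
    "E'": [("E",)],
    "E": [("T", "+", "E"), ("T",)],
    "T": [("int", "*", "T"), ("int",), ("(", "E", ")")],
}


def closure(items):
    """Add B → .γ for every item A → α.Bβ until nothing new appears."""
    result = set(items)
    work = list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:  # dot is before a non-terminal
            for prod in GRAMMAR[rhs[dot]]:
                new = (rhs[dot], prod, 0)
                if new not in result:
                    result.add(new)
                    work.append(new)
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {
        (lhs, rhs, dot + 1)
        for lhs, rhs, dot in items
        if dot < len(rhs) and rhs[dot] == symbol
    }
    return closure(moved)


def build_states():
    """Collect all item sets reachable from the start state."""
    states = [closure({("E'", ("E",), 0)})]
    for state in states:  # the list grows while we iterate over it
        symbols = {rhs[dot] for _, rhs, dot in state if dot < len(rhs)}
        for symbol in sorted(symbols):
            nxt = goto(state, symbol)
            if nxt not in states:
                states.append(nxt)
    return states


print(len(build_states()))
```

For this grammar the sketch finds 11 item sets, which agrees with the 11 numbered states in the lecture's parse tables, although the order in which it discovers them need not match that numbering.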
LR(0) Automaton
E → T + E | T
T → int * T | int | (E)

States (sets of items) and transitions of the LR(0) automaton, numbered as in the parse tables used later in the lecture:

State 1 (start): E’ → .E, E → .T + E, E → .T, T → .int * T, T → .int, T → .(E)
    on E to 2; on T to 3; on int to 4; on ( to 11
State 2: E’ → E.
State 3: E → T. + E, E → T.
    on + to 5
State 4: T → int. * T, T → int.
    on * to 6
State 5: E → T +. E, E → .T + E, E → .T, T → .int * T, T → .int, T → .(E)
    on E to 7; on T to 3; on int to 4; on ( to 11
State 6: T → int *. T, T → .int * T, T → .int, T → .(E)
    on T to 8; on int to 4; on ( to 11
State 7: E → T + E.
State 8: T → int * T.
State 9: T → (E).
State 10: T → (E.)
    on ) to 9
State 11: T → (.E), E → .T + E, E → .T, T → .int * T, T → .int, T → .(E)
    on E to 10; on T to 3; on int to 4; on ( to 11
LR Parsing Algorithm
● Use a stack of states and symbols
○ All states in the FA are numbered
● Two parse tables
○ Goto table (state x symbol → state)
■ GOTO[i, A] = j, if the FA has a transition from state i to state j on symbol A
○ Action table (state x terminal → {shift, reduce, accept, reject})
■ If state i has S’ → S., then ACTION[i, $] = accept
■ If state i has A → α.tβ and GOTO[i, t] = j, then ACTION[i, t] = shift j
■ If state i has A → α. (with A ≠ S’) and t ∈ FOLLOW(A), then ACTION[i, t] = reduce A → α
■ None of the above ⇒ ACTION[i, t] = reject
LR Parsing Algorithm
INPUT: An input string w and an LR parsing table
OUTPUT: If w is accepted, the reduction steps; else error

a ← first symbol of w
while (true) {
    s ← state on the top of the stack   /* initially it is s0 or start state */
    if (ACTION[s, a] = shift t) {
        push t onto the stack
        a ← next symbol of w
    } else if (ACTION[s, a] = reduce A → β) {
        pop |β| elements off the stack
        t ← state on top of the stack
        push GOTO[t, A] onto the stack
        output production A → β
    } else if (ACTION[s, a] = accept) { break }   /* parsing is done */
    else { reject with error }
}
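The driver loop above can be sketched as runnable Python. The sketch is an illustration under assumptions: the ACTION and GOTO entries are transcribed from the SLR parse table given later in this lecture (states 1 to 11, productions 1 to 6), the stack holds states only, and the names PRODUCTIONS, ACTION, GOTO and parse are chosen for this example.

```python
# Productions numbered as in the lecture: (LHS, RHS)
PRODUCTIONS = {
    1: ("E'", ["E"]),
    2: ("E", ["T", "+", "E"]),
    3: ("E", ["T"]),
    4: ("T", ["int", "*", "T"]),
    5: ("T", ["int"]),
    6: ("T", ["(", "E", ")"]),
}

# ACTION[(state, terminal)]; missing entries mean reject.
ACTION = {(2, "$"): ("accept", None), (3, "+"): ("shift", 5),
          (4, "*"): ("shift", 6), (10, ")"): ("shift", 9)}
for state in (1, 5, 6, 11):            # states that can start a new T
    ACTION[(state, "int")] = ("shift", 4)
    ACTION[(state, "(")] = ("shift", 11)
for t in (")", "$"):                   # FOLLOW(E)
    ACTION[(3, t)] = ("reduce", 3)
    ACTION[(7, t)] = ("reduce", 2)
for t in ("+", ")", "$"):              # FOLLOW(T)
    ACTION[(4, t)] = ("reduce", 5)
    ACTION[(8, t)] = ("reduce", 4)
    ACTION[(9, t)] = ("reduce", 6)

# GOTO[(state, non-terminal)]
GOTO = {(1, "E"): 2, (1, "T"): 3, (5, "E"): 7, (5, "T"): 3,
        (6, "T"): 8, (11, "E"): 10, (11, "T"): 3}


def parse(tokens):
    """Return the production numbers used, in reduction order; raise on error."""
    tokens = list(tokens) + ["$"]
    stack = [1]                        # start state
    pos = 0
    output = []
    while True:
        kind, arg = ACTION.get((stack[-1], tokens[pos]), ("reject", None))
        if kind == "shift":
            stack.append(arg)
            pos += 1
        elif kind == "reduce":
            lhs, rhs = PRODUCTIONS[arg]
            del stack[len(stack) - len(rhs):]      # pop |β| states
            stack.append(GOTO[(stack[-1], lhs)])
            output.append(arg)
        elif kind == "accept":
            return output
        else:
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state {stack[-1]}")


print(parse("int * int + int".split()))
```

On int * int + int the sketch reports the reductions 5, 4, 5, 3, 2, the same sequence as the worked SLR trace in this lecture.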
LR(0) Parsing
● Shift if the item A → α.Bβ can transition on B, i.e., there is a
transition from A → α.Bβ to A → αB.β
● Reduce by A → α, if the item-set contains the item A → α.
● Example
String : (id)
Grammar : S → E | (E)
E → id
LR(0) Parsing
● Reduce by A → α, if the item-set contains the item A → α.
● Shift if the item A → α.Bβ can transition on B, i.e., there is a
transition from A → α.Bβ to A → αB.β
● Conflicts
○ Shift-reduce
■ If any state in the DFA has a shift and a reduce item,
e.g., A → α. and A’ → α’.B
○ Reduce-reduce
■ If any state in the DFA has two reduce items, e.g., A → α. and A’ → α’.
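The two conflict conditions can be checked mechanically for a single item set. The sketch below is an illustration only: items are assumed to be (lhs, rhs, dot) tuples, a shift item is taken to be one whose dot is before a terminal, and the names NONTERMINALS and lr0_conflicts are hypothetical.

```python
NONTERMINALS = {"E'", "E", "T"}


def lr0_conflicts(state, nonterminals=NONTERMINALS):
    """List the kinds of LR(0) conflict present in one item set."""
    # Reduce items have the dot at the very end: A → α.
    reduce_items = [item for item in state if item[2] == len(item[1])]
    # Shift items have the dot before a terminal.
    shift_items = [
        item for item in state
        if item[2] < len(item[1]) and item[1][item[2]] not in nonterminals
    ]
    found = []
    if reduce_items and shift_items:
        found.append("shift-reduce")
    if len(reduce_items) > 1:
        found.append("reduce-reduce")
    return found


# The item set { E → T. + E, E → T. } from the lecture's grammar.
state = {("E", ("T", "+", "E"), 1), ("E", ("T",), 1)}
print(lr0_conflicts(state))
```

A grammar is LR(0) only if this check comes back empty for every state of its automaton.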
Shift-reduce Conflict
E → T + E | T
T → int * T | int | (E)

[Figure: the LR(0) automaton for this grammar.] Two of its states contain both a shift item and a reduce item, so LR(0) parsing has a shift-reduce conflict in each of them:
● { E → T. + E, E → T. }: shift + or reduce by E → T
● { T → int. * T, T → int. }: shift * or reduce by T → int
Simple LR Parsing
● Reduce by A → α, if the item-set contains the item A → α. and
the next input symbol is in FOLLOW(A)
● Shift if the item A → α.Bβ can transition on B, i.e., there is a transition from A → α.Bβ to A → αB.β
● There may still be conflicts because not all grammars are SLR
● Example: int * int + int
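The FOLLOW sets that SLR relies on can be computed with a small fixed-point iteration. The sketch below assumes the lecture's augmented grammar and relies on the fact that it has no ε-productions, so FIRST of a right-hand side is just FIRST of its first symbol; the names GRAMMAR, first_sets and follow_sets are hypothetical.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("T", "+", "E"), ("T",)],
    "T": [("int", "*", "T"), ("int",), ("(", "E", ")")],
}


def first_sets():
    """FIRST for each non-terminal (valid because no production is empty)."""
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for nt, prods in GRAMMAR.items():
            for rhs in prods:
                head = rhs[0]
                add = first[head] if head in GRAMMAR else {head}
                if not add <= first[nt]:
                    first[nt] |= add
                    changed = True
    return first


def follow_sets(start="E'"):
    """FOLLOW for each non-terminal, with $ following the start symbol."""
    first = first_sets()
    follow = {nt: set() for nt in GRAMMAR}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for nt, prods in GRAMMAR.items():
            for rhs in prods:
                for i, sym in enumerate(rhs):
                    if sym not in GRAMMAR:
                        continue
                    if i + 1 < len(rhs):       # something follows sym in this RHS
                        nxt = rhs[i + 1]
                        add = first[nxt] if nxt in GRAMMAR else {nxt}
                    else:                      # sym ends the RHS: inherit FOLLOW(nt)
                        add = follow[nt]
                    if not add <= follow[sym]:
                        follow[sym] |= add
                        changed = True
    return follow


follow = follow_sets()
print(sorted(follow["E"]), sorted(follow["T"]))
```

Since * is not in FOLLOW(T), a state holding both T → int. * T and T → int. shifts on * and reduces only on +, ) or $, which is how SLR removes the LR(0) conflict for this grammar.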
SLR Parsing
E → T + E | T
T → int * T | int | (E)

[Figure: the LR(0) automaton for this grammar; SLR parsing uses the same states.]
GOTO Table

State  int  +    *    (    )    $    E    T
1      4              11             2    3
3           5
4                6
5      4              11             7    3
6      4              11                  8
10                         9
11     4              11             10   3
ACTION Table

Productions:
1. E’ → E
2. E → T + E
3. E → T
4. T → int * T
5. T → int
6. T → (E)

State  int  +    *    (    )    $
1      s4             s11
2                               A
3           s5             r3   r3
4           r5   s6        r5   r5
5      s4             s11
6      s4             s11
7                          r2   r2
8           r4             r4   r4
9           r6             r6   r6
10                         s9
11     s4             s11
Parse Table

Productions:
1. E’ → E
2. E → T + E
3. E → T
4. T → int * T
5. T → int
6. T → (E)

State  int  +    *    (    )    $    E    T
1      s4             s11            2    3
2                               A
3           s5             r3   r3
4           r5   s6        r5   r5
5      s4             s11            7    3
6      s4             s11                 8
7                          r2   r2
8           r4             r4   r4
9           r6             r6   r6
10                         s9
11     s4             s11            10   3
SLR Parsing with Parse Tables
The Stack column lists the states on the stack from bottom to top.

Input                 State  Stack     Action
| int * int + int $   1      1         Shift 4
int | * int + int $   4      1 4       Shift 6
int * | int + int $   6      1 4 6     Shift 4
int * int | + int $   4      1 4 6 4   Reduce5 (T → int), Goto[6,T] = 8
int * T | + int $     8      1 4 6 8   Reduce4 (T → int * T), Goto[1,T] = 3
T | + int $           3      1 3       Shift 5
T + | int $           5      1 3 5     Shift 4
T + int | $           4      1 3 5 4   Reduce5 (T → int), Goto[5,T] = 3
T + T | $             3      1 3 5 3   Reduce3 (E → T), Goto[5,E] = 7
T + E | $             7      1 3 5 7   Reduce2 (E → T + E), Goto[1,E] = 2
E | $                 2      1 2       Accept
