Lecture 8


CS327 - Compilers

LL(1) Parsing

Abhishek Bichhawat 09/02/2024


LL(1) Parser
● Looks ahead 1 token
● Only one choice of production at every step
○ Unique production or no production given the next token
● No backtracking
● Left-factoring for the grammar:
E → T + E | T
T → int | int * T | (E)
gives:
E → T A          T → int B | (E)
A → + E | ε      B → * T | ε
LL(1) Parser - First sets
● If α is any string of terminals and nonterminals (like the right
side of a production) then FIRST(α) is the set of terminal
symbols that start some string that α produces, plus ε if α can
produce the empty string.
● If α →* tβ, then t ∈ FIRST(α)
● FIRST(α) = {t | α →* tβ} ⋃ {ε | α →* ε}
LL(1) Parser - Follow sets
● If A is a non-terminal symbol, then FOLLOW(A) is the set of
terminal symbols that can come immediately to the right of A in
some sentential form, i.e., the set of terminals such that there
exists a derivation of the form S →* αAtβ
● If A → γ with γ →* ε, and S →* αAtβ, then t can be consumed as the
next input when we are at A (by choosing A → γ)
● FOLLOW(A)={t | S →* αAtβ}
Computing First Sets
1. If X is a terminal, then FIRST(X) = {X}
2. If X is a non-terminal, then
ε ∈ FIRST(X)
if X → A1 .. An and ε ∈ FIRST(Ai) for all 1 ≤ i ≤ n
FIRST(α) ⊆ FIRST(X)
if X → A1 .. Anα and ε ∈ FIRST(Ai) for all 1 ≤ i ≤ n
3. If X → ε then ε ∈ FIRST(X)
Computing First Sets - Example
1. If X is a terminal, then FIRST(X) = {X}
2. If X is a non-terminal, then
ε ∈ FIRST(X) if X → A1 .. An and ε ∈ FIRST(Ai) for all 1 ≤ i ≤ n
FIRST(α) ⊆ FIRST(X) if X → A1 .. Anα and ε ∈ FIRST(Ai) for all 1 ≤ i ≤ n
3. If X → ε then ε ∈ FIRST(X)

E → T A     A → + E | ε     T → int B | (E)     B → * T | ε

FIRST(E) = {int, (}     FIRST(T) = {int, (}
FIRST(A) = {+, ε}       FIRST(B) = {*, ε}
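
The FIRST rules can be applied mechanically by repeating them until no set changes. The following is a minimal sketch of that fixed-point iteration, not code from the lecture; the dictionary encoding of the grammar (an empty list for an ε-production) and the names `GRAMMAR`, `EPS` and `compute_first` are assumptions made for illustration.

```python
EPS = "ε"

# Grammar from the slides: each non-terminal maps to a list of right-hand sides.
# An empty list [] stands for the ε-production (an encoding chosen for this sketch).
GRAMMAR = {
    "E": [["T", "A"]],
    "A": [["+", "E"], []],
    "T": [["int", "B"], ["(", "E", ")"]],
    "B": [["*", "T"], []],
}


def compute_first(grammar):
    """Apply the FIRST rules repeatedly until no set changes (a fixed point)."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            for rhs in productions:
                before = len(first[nt])
                all_nullable = True
                for sym in rhs:
                    if sym not in grammar:      # terminal: FIRST(sym) = {sym}
                        first[nt].add(sym)
                        all_nullable = False
                        break
                    first[nt] |= first[sym] - {EPS}
                    if EPS not in first[sym]:   # sym cannot vanish, stop here
                        all_nullable = False
                        break
                if all_nullable:                # X → ε, or every Ai can derive ε
                    first[nt].add(EPS)
                if len(first[nt]) != before:
                    changed = True
    return first


FIRST = compute_first(GRAMMAR)
for nt in GRAMMAR:
    print(f"FIRST({nt}) =", sorted(FIRST[nt]))
```

For the example grammar this should reproduce the four FIRST sets listed above.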


Computing Follow Sets
1. Place $ in FOLLOW(S), where S is the start symbol,
and $ is the input right endmarker
2. If there is a production A → αBβ, then everything in FIRST(β)
except ε is in FOLLOW(B)
3. If there is a production A → αBβ where ε ∈ FIRST(β),
or A → αB, then everything in FOLLOW(A) is in FOLLOW(B)
Computing Follow Sets
1. $ ∈ FOLLOW(S), where S is the start symbol
2. For each A → αBβ, FIRST(β) - {ε} ⊆ FOLLOW(B)
3. For each A → αBβ where ε ∈ FIRST(β), or A → αB, FOLLOW(A) ⊆ FOLLOW(B)

E → T A     A → + E | ε     T → int B | (E)     B → * T | ε

FOLLOW(+) = {int, (}        FOLLOW(*) = {int, (}
FOLLOW( ( ) = {int, (}
FOLLOW(E) = {), $}          FOLLOW(T) = {+, ), $}
FOLLOW(A) = {), $}          FOLLOW(B) = {+, ), $}
FOLLOW( ) ) = {+, ), $}     FOLLOW(int) = {*, +, ), $}
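
The three FOLLOW rules can be iterated to a fixed point in the same way. The sketch below is illustrative only: it hard-codes the FIRST sets of the non-terminals from the earlier slide, and the names `first_of` and `compute_follow` are hypothetical. Like the slides, it tracks FOLLOW for terminals as well as non-terminals.

```python
from collections import defaultdict

EPS, END = "ε", "$"

# Grammar from the slides; [] encodes an ε-production (an assumption of this sketch).
GRAMMAR = {
    "E": [["T", "A"]],
    "A": [["+", "E"], []],
    "T": [["int", "B"], ["(", "E", ")"]],
    "B": [["*", "T"], []],
}
# FIRST sets of the non-terminals, copied from the slides.
FIRST = {
    "E": {"int", "("}, "T": {"int", "("},
    "A": {"+", EPS}, "B": {"*", EPS},
}


def first_of(symbols):
    """FIRST of a string of grammar symbols, using the FIRST table above."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})   # a terminal's FIRST set is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)                          # every symbol (or none) can vanish
    return result


def compute_follow(grammar, start):
    follow = defaultdict(set)
    follow[start].add(END)                           # rule 1
    changed = True
    while changed:
        changed = False
        for lhs, productions in grammar.items():
            for rhs in productions:
                for i, sym in enumerate(rhs):
                    beta_first = first_of(rhs[i + 1:])
                    new = beta_first - {EPS}         # rule 2
                    if EPS in beta_first:            # rule 3: β is empty or nullable
                        new |= follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return dict(follow)


FOLLOW = compute_follow(GRAMMAR, "E")
for sym in ["E", "A", "T", "B", "int", "+", "*", "(", ")"]:
    print(f"FOLLOW({sym}) =", sorted(FOLLOW[sym]))
```

Note that `int` is always followed by B, and * ∈ FIRST(B), so the computed FOLLOW(int) contains * in addition to everything in FOLLOW(T).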
First Sets and Follow Sets
● FIRST Sets
○ If X is a terminal, then FIRST(X) = {X}
○ If X is a non-terminal, then
■ if X → A1..An and ε ∈ FIRST(Ai) for all 1 ≤ i ≤ n then ε ∈ FIRST(X)
■ if X → A1..Anα and ε ∈ FIRST(Ai) for all 1 ≤ i ≤ n then FIRST(α) ⊆
FIRST(X)
■ If X → ε then ε ∈ FIRST(X)
○ To compute FIRST(A1A2 ... An)
■ Add the non-ε symbols of FIRST(A1).
■ Add the non-ε symbols of FIRST(A2) if ε is in FIRST(A1), the non-ε symbols of
FIRST(A3) if ε is in both FIRST(A1) and FIRST(A2), and so on. Finally, add ε to
FIRST(A1A2 ... An) if, for all i, FIRST(Ai) contains ε.
First Sets and Follow Sets
● FOLLOW Sets
○ If S is the start symbol then $ ∈ FOLLOW(S)
○ For each A → αBβ, FIRST(β) - {ε} ⊆ FOLLOW(B)
○ For each A → αBβ where ε ∈ FIRST(β), or A → αB, FOLLOW(A) ⊆
FOLLOW(B)
Compute First and Follow Sets
S → iEtSS’ | a
S’ → eS | ε
E→b
LL(1) Parser - Parsing Table
● Construct a table that provides us the correct production based
on the next token in the string
● Given a non-terminal A and token (next input symbol) t what
production (A → α) should the parsing table return, i.e.,
○ For each terminal t ∈ FIRST(α), T[A, t] = α
○ If ε ∈ FIRST(α), then for each t ∈ FOLLOW(A), T[A, t] = α
○ If ε ∈ FIRST(α) and $ ∈ FOLLOW(A), T[A, $] = α
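
The table-filling rules above can be sketched as a short routine. This is an illustration under stated assumptions rather than the lecture's own code: the FIRST and FOLLOW sets are hard-coded from the slides, FOLLOW(A) is taken to already contain $ where applicable (so the third rule needs no separate case), and `build_table` is a hypothetical name. A cell that would receive two productions is reported as a conflict.

```python
EPS, END = "ε", "$"

# Grammar from the slides; [] encodes an ε-production (an assumption of this sketch).
GRAMMAR = {
    "E": [["T", "A"]],
    "A": [["+", "E"], []],
    "T": [["int", "B"], ["(", "E", ")"]],
    "B": [["*", "T"], []],
}
# FIRST and FOLLOW sets of the non-terminals, copied from the slides.
FIRST = {
    "E": {"int", "("}, "T": {"int", "("},
    "A": {"+", EPS}, "B": {"*", EPS},
}
FOLLOW = {
    "E": {")", END}, "A": {")", END},
    "T": {"+", ")", END}, "B": {"+", ")", END},
}


def first_of(symbols):
    """FIRST of a right-hand side, using the FIRST table above."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})   # a terminal's FIRST set is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def build_table(grammar):
    """Map (non-terminal, lookahead token) to the production to apply."""
    table = {}
    for lhs, productions in grammar.items():
        for rhs in productions:
            rhs_first = first_of(rhs)
            lookahead = rhs_first - {EPS}       # t ∈ FIRST(α)
            if EPS in rhs_first:
                lookahead |= FOLLOW[lhs]        # t ∈ FOLLOW(A), including $
            for t in lookahead:
                if (lhs, t) in table:
                    raise ValueError(f"conflict at T[{lhs}, {t}]: not LL(1)")
                table[lhs, t] = rhs
    return table


TABLE = build_table(GRAMMAR)
for (nt, tok), rhs in sorted(TABLE.items()):
    print(f"T[{nt}, {tok}] = {' '.join(rhs) or EPS}")
```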
LL(1) Parsing Table
E → T A     A → + E | ε     T → int B | (E)     B → * T | ε

FIRST(E) = {int, (}     FOLLOW(E) = {), $}        FOLLOW(+) = {int, (}
FIRST(T) = {int, (}     FOLLOW(T) = {+, ), $}     FOLLOW(*) = {int, (}
FIRST(A) = {+, ε}       FOLLOW(A) = {), $}        FOLLOW( ( ) = {int, (}
FIRST(B) = {*, ε}       FOLLOW(B) = {+, ), $}     FOLLOW( ) ) = {+, ), $}
                                                  FOLLOW(int) = {*, +, ), $}

Non-terminal |                  Input Symbol
             | int    | *     | +     | (     | )     | $
E            | T A    |       |       | T A   |       |
T            | int B  |       |       | (E)   |       |
A            |        |       | + E   |       | ε     | ε
B            |        | * T   | ε     |       | ε     | ε
LL(1) Parsing Example
E → T A     A → + E | ε     T → int B | (E)     B → * T | ε
Input: int * int $

Stack        | Input         | Action
E $          | int * int $   | T A
T A $        | int * int $   | int B
int B A $    | int * int $   | match int
B A $        | * int $       | * T
* T A $      | * int $       | match *
T A $        | int $         | int B
int B A $    | int $         | match int
B A $        | $             | ε
A $          | $             | ε
$            | $             | ACCEPT
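
The trace above follows a simple loop: pop the top of the stack, expand it with the table entry if it is a non-terminal, or match it against the next token if it is a terminal. A minimal sketch of that loop, assuming the parsing table for the example grammar is given as a Python dictionary (the names `TABLE`, `NONTERMINALS` and `parse` are illustrative, and the input is assumed to be already tokenized):

```python
END = "$"

# Parsing table for the example grammar: (non-terminal, lookahead) -> right-hand side.
# An empty list stands for an ε entry.
TABLE = {
    ("E", "int"): ["T", "A"], ("E", "("): ["T", "A"],
    ("T", "int"): ["int", "B"], ("T", "("): ["(", "E", ")"],
    ("A", "+"): ["+", "E"], ("A", ")"): [], ("A", END): [],
    ("B", "*"): ["*", "T"], ("B", "+"): [], ("B", ")"): [], ("B", END): [],
}
NONTERMINALS = {"E", "T", "A", "B"}


def parse(tokens, start="E"):
    """Return True if the token list is accepted, False on a syntax error."""
    tokens = list(tokens) + [END]
    stack = [END, start]                  # top of the stack is the end of the list
    pos = 0
    while stack:
        top = stack.pop()
        tok = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, tok))
            if rhs is None:
                return False              # empty table cell: no production applies
            stack.extend(reversed(rhs))   # push RHS so its first symbol ends up on top
        elif top == tok:
            pos += 1                      # match a terminal (or the final $)
        else:
            return False                  # terminal on the stack differs from the input
    return pos == len(tokens)


print(parse(["int", "*", "int"]))
print(parse(["int", "+"]))
```

Because every table cell holds at most one production, the loop never has to backtrack: each step is determined by the stack top and one token of lookahead.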
LL(1) Parsing Table
S → iEtSS’ | a
S’ → eS | ε
E → b
Input: i b t i b t a e a

Non-terminal |                  Input Symbol
             | a    | b    | e    | i        | t    | $
S            | a    |      |      | iEtSS’   |      |
S’           |      |      |      |          |      |
E            |      |      |      |          |      |
LL(1) Parser
● A CFG is not LL(1) if
○ it is ambiguous
○ it is left-recursive
○ it is not left-factored
● Most programming languages are not LL(1)!
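
To see why a grammar that is not left-factored fails, consider the expression grammar before left-factoring (E → T + E | T, T → int | int * T | (E)). The sketch below checks whether two productions of the same non-terminal can start with the same token; any overlap means one token of lookahead cannot pick a unique production. It assumes the grammar has no ε-productions, so FIRST of a right-hand side is just FIRST of its leading symbol, and the names `rhs_first` and `find_conflicts` are hypothetical.

```python
from itertools import combinations

# The expression grammar before left-factoring.
GRAMMAR = {
    "E": [["T", "+", "E"], ["T"]],
    "T": [["int"], ["int", "*", "T"], ["(", "E", ")"]],
}
# FIRST sets of the non-terminals. This grammar has no ε-productions, so
# FIRST of a right-hand side is FIRST of its first symbol.
FIRST = {"E": {"int", "("}, "T": {"int", "("}}


def rhs_first(rhs):
    head = rhs[0]
    return FIRST.get(head, {head})        # a terminal's FIRST set is itself


def find_conflicts(grammar):
    """List pairs of productions whose FIRST sets overlap."""
    conflicts = []
    for lhs, productions in grammar.items():
        for p, q in combinations(productions, 2):
            overlap = rhs_first(p) & rhs_first(q)
            if overlap:
                conflicts.append((lhs, sorted(overlap), p, q))
    return conflicts


for lhs, tokens, p, q in find_conflicts(GRAMMAR):
    print(f"{lhs}: '{' '.join(p)}' and '{' '.join(q)}' both start with {tokens}")
```

Both alternatives of E begin with T, and two alternatives of T begin with int, which is exactly what left-factoring into A and B removes.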
