Lecture 07

The document discusses predictive top-down parsing, specifically LL(1) parsing, which allows for parsing without backtracking by using a single lookahead token. It explains the concepts of left factoring, constructing parsing tables, and computing first and follow sets to build an LL(1) parsing table. The document emphasizes that if any entry in the parsing table is multiply defined, the grammar is not LL(1), which is common in many programming languages.


Top-Down Parsing

CS143
Lecture 7

Instructor: Fredrik Kjolstad


Slide design by Prof. Alex Aiken, with modifications
1
Predictive Top-Down Parsers

• Like recursive-descent but parser can “predict” which production to use
– By looking at the next few tokens
– No backtracking

• Predictive parsers accept LL(k) grammars


– The first L means “left-to-right” scan of input
– The second L means “leftmost derivation”
– k means “predict based on k tokens of lookahead”
– In practice, LL(1) is used

2
Recursive Descent vs. LL(1)

• In recursive-descent,
– At each step, many choices of production to use
– Backtracking used to undo bad choices

• In LL(1),
– At each step, only one choice of production
– That is
• When a non-terminal A is leftmost in a derivation
• And the next input symbol is t
• There is a unique production A → α to use
– Or no production to use (an error state)

• LL(1) is a recursive descent variant without backtracking

3
Predictive Parsing and Left Factoring

• Recall the grammar


E→T+E|T
T → int | int * T | ( E )

• Hard to predict because


– For T two productions start with int
– For E it is not clear how to predict

• We need to left-factor the grammar

4
Left-Factoring Example

• Recall the grammar


E→T+E|T
T → int * T | int | ( E )

• Factor out common prefixes of productions


E → T X              X → + E | ε
T → int Y | ( E )    Y → * T | ε
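
To see why left-factoring helps, here is a minimal sketch of a predictive recognizer for the left-factored grammar, written as recursive descent with one token of lookahead and no backtracking. The function name `recognize`, the `ParseError` class, and the convention that tokens are plain strings with "$" appended as an end marker are assumptions of this sketch, not part of the lecture.

```python
class ParseError(Exception):
    """Raised when the lookahead token admits no production (hypothetical helper)."""


def recognize(tokens):
    """Return True if tokens (without a trailing "$") derive from E."""
    toks = list(tokens) + ["$"]   # "$" marks end of input (assumption of this sketch)
    pos = 0

    def peek():
        return toks[pos]

    def eat(expected):
        nonlocal pos
        if toks[pos] != expected:
            raise ParseError(f"expected {expected!r}, got {toks[pos]!r}")
        pos += 1

    def parse_E():                # E -> T X
        parse_T()
        parse_X()

    def parse_X():                # X -> + E | ε
        if peek() == "+":
            eat("+")
            parse_E()
        # any other lookahead: choose X -> ε and consume nothing

    def parse_T():                # T -> int Y | ( E )
        if peek() == "int":
            eat("int")
            parse_Y()
        elif peek() == "(":
            eat("(")
            parse_E()
            eat(")")
        else:
            raise ParseError(f"unexpected token {peek()!r}")

    def parse_Y():                # Y -> * T | ε
        if peek() == "*":
            eat("*")
            parse_T()

    try:
        parse_E()
        eat("$")                  # the whole input must be consumed
        return True
    except ParseError:
        return False
```

Each function inspects only the next token before committing to a production, which is possible here because no two alternatives of the same non-terminal start with the same token.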

5
LL(1) Parsing Table Example

• Left-factored grammar
E → T X              X → + E | ε
T → int Y | ( E )    Y → * T | ε

• The LL(1) parsing table
  (rows: leftmost non-terminal; columns: next input token;
   entries: rhs of production to use)

     int     *       +       (       )       $
E    T X                     T X
X                    + E             ε       ε
T    int Y                   ( E )
Y            * T     ε               ε       ε

6
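
One possible encoding of the table above is a dictionary keyed by (non-terminal, next token). This is a sketch only: the names `TABLE` and `lookup` are hypothetical, the empty list stands for ε, and a missing key plays the role of a blank table entry.

```python
# Each value is the right-hand side to use; [] stands for ε.
TABLE = {
    ("E", "int"): ["T", "X"],
    ("E", "("):   ["T", "X"],
    ("X", "+"):   ["+", "E"],
    ("X", ")"):   [],
    ("X", "$"):   [],
    ("T", "int"): ["int", "Y"],
    ("T", "("):   ["(", "E", ")"],
    ("Y", "*"):   ["*", "T"],
    ("Y", "+"):   [],
    ("Y", ")"):   [],
    ("Y", "$"):   [],
}


def lookup(nonterminal, token):
    """Return the rhs stored at [nonterminal, token], or None for a blank entry."""
    return TABLE.get((nonterminal, token))
```

With this encoding, `lookup("E", "int")` yields the right-hand side T X, while a blank entry such as [Y, (] comes back as `None`.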
E→TX X→+E|ε
T → int Y | ( E ) Y→*T|ε
LL(1) Parsing Table Example

• Consider the [E, int] entry


– “When current non-terminal is E and next input is int,
use production E → T X”
– This can generate an int in the first position

     int     *       +       (       )       $
E    T X                     T X
X                    + E             ε       ε
T    int Y                   ( E )
Y            * T     ε               ε       ε

7
E→TX X→+E|ε
T → int Y | ( E ) Y→*T|ε
LL(1) Parsing Tables. Errors

• Consider the [Y,+] entry


– “When current non-terminal is Y and current token is +,
get rid of Y”
– Y can be followed by + only in a derivation in which Y → ε

     int     *       +       (       )       $
E    T X                     T X
X                    + E             ε       ε
T    int Y                   ( E )
Y            * T     ε               ε       ε

8
E→TX X→+E|ε
T → int Y | ( E ) Y→*T|ε
LL(1) Parsing Tables. Errors

• Consider the [Y,(] entry


– “There is no way to derive a string starting with ( from
non-terminal Y”
– Blank entries indicate error situations

     int     *       +       (       )       $
E    T X                     T X
X                    + E             ε       ε
T    int Y                   ( E )
Y            * T     ε               ε       ε

9
Using Parsing Tables

• Method similar to recursive descent, except


– For the leftmost non-terminal S
– We look at the next input token a
– And choose the production shown at [S,a]

• A stack records frontier of parse tree


– Non-terminals that have yet to be expanded
– Terminals that have yet to be matched against the input
– Top of stack = leftmost pending terminal or non-terminal

• Reject on reaching error state


• Accept on end of input & empty stack

10
LL(1) Parsing Algorithm (using the table)

initialize stack = <S $> and next

repeat
    case stack of
        <X, rest> : if T[X, *next] = Y1…Yn
                        then stack ← <Y1…Yn, rest>;
                        else error ();
        <t, rest> : if t == *next++
                        then stack ← <rest>;
                        else error ();
until stack == < >
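
The loop above can be sketched in Python roughly as follows. The names `ll1_parse`, `TABLE`, and `NONTERMINALS` are assumptions of this sketch; the table is the one for the left-factored grammar, with [] standing for ε and a missing key standing for a blank (error) entry.

```python
NONTERMINALS = {"E", "X", "T", "Y"}

# LL(1) table for the left-factored grammar; [] stands for ε.
TABLE = {
    ("E", "int"): ["T", "X"],     ("E", "("): ["T", "X"],
    ("X", "+"):   ["+", "E"],     ("X", ")"): [],  ("X", "$"): [],
    ("T", "int"): ["int", "Y"],   ("T", "("): ["(", "E", ")"],
    ("Y", "*"):   ["*", "T"],     ("Y", "+"): [],  ("Y", ")"): [],  ("Y", "$"): [],
}


def ll1_parse(tokens, start="E"):
    """Return True if tokens (without a trailing "$") are accepted."""
    inp = list(tokens) + ["$"]        # "$" marks end of input
    nxt = 0                           # index of the next input token
    stack = [start, "$"]              # stack[0] is the top; "$" marks the bottom
    while stack:
        top, rest = stack[0], stack[1:]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, inp[nxt]))
            if rhs is None:           # blank entry: error state
                return False
            stack = rhs + rest        # pop the non-terminal, push the rhs
        else:
            if top != inp[nxt]:       # terminal must match the next token
                return False
            nxt += 1
            stack = rest
    return nxt == len(inp)            # empty stack and all input consumed
```

Representing the stack as a list whose first element is the top keeps the code close to the `<X, rest>` notation of the pseudocode; a production implementation would more likely push and pop at the end of the list.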

11
LL(1) Parsing Algorithm

initialize stack = <S $> and next

repeat
    case stack of
        <X, rest> : if T[X, *next] = Y1…Yn
                        then stack ← <Y1…Yn, rest>;
                        else error ();
        <t, rest> : if t == *next++
                        then stack ← <rest>;
                        else error ();
until stack == < >

• $ marks bottom of stack
• For non-terminal X on top of stack, lookup production
• Pop X, push production rhs on stack. Note leftmost symbol of rhs is on top of the stack.
• For terminal t on top of stack, check t matches next input token.

12
E→TX X→+E|ε
T → int Y | ( E ) Y→*T|ε
LL(1) Parsing Example

Stack          Input          Action
E $            int * int $    T X
T X $          int * int $    int Y
int Y X $      int * int $    terminal
Y X $          * int $        * T
* T X $        * int $        terminal
T X $          int $          int Y
int Y X $      int $          terminal
Y X $          $              ε
X $            $              ε
$              $              ACCEPT
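
The trace above can be reproduced mechanically. The following sketch records one (stack, input, action) row per step; the function name `trace` and the "ERROR" action label are assumptions of this sketch, and the table is the one for the left-factored grammar with [] standing for ε.

```python
NONTERMINALS = {"E", "X", "T", "Y"}

TABLE = {
    ("E", "int"): ["T", "X"],     ("E", "("): ["T", "X"],
    ("X", "+"):   ["+", "E"],     ("X", ")"): [],  ("X", "$"): [],
    ("T", "int"): ["int", "Y"],   ("T", "("): ["(", "E", ")"],
    ("Y", "*"):   ["*", "T"],     ("Y", "+"): [],  ("Y", ")"): [],  ("Y", "$"): [],
}


def trace(tokens, start="E"):
    """Return a list of (stack, input, action) rows for an LL(1) parse."""
    inp = list(tokens) + ["$"]
    stack = [start, "$"]                      # stack[0] is the top
    rows = []
    while True:
        stack_s, input_s = " ".join(stack), " ".join(inp)
        if stack == ["$"] and inp == ["$"]:
            rows.append((stack_s, input_s, "ACCEPT"))
            return rows
        if not stack:                         # guard against stray "$" in the input
            rows.append((stack_s, input_s, "ERROR"))
            return rows
        top = stack.pop(0)
        if top in NONTERMINALS:
            rhs = TABLE.get((top, inp[0]))
            if rhs is None:                   # blank table entry
                rows.append((stack_s, input_s, "ERROR"))
                return rows
            rows.append((stack_s, input_s, " ".join(rhs) or "ε"))
            stack = rhs + stack               # push rhs, leftmost symbol on top
        elif top == inp[0]:
            rows.append((stack_s, input_s, "terminal"))
            inp.pop(0)                        # consume the matched token
        else:
            rows.append((stack_s, input_s, "ERROR"))
            return rows
```

Calling `trace(["int", "*", "int"])` gives ten rows whose actions follow the same sequence as the table on this slide.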

13
Constructing Parsing Tables: The Intuition

• Consider non-terminal A, production A → α, and token t
  (Greek letters denote strings of non-terminals and terminals)

1. Add T[A,t] = α
if A → α →* t β
– α can derive a t in the first position
– We say that t ∈ First(α)

2. Add T[A,t] = α
if A → α →* ε and S →* γ A t δ
– Useful if stack has A, input is t, and A cannot derive t
– In this case only option is to get rid of A (by deriving ε)
• Can work only if t can follow A in at least one derivation
– We say t ∈ Follow(A)
14
Computing First Sets

Definition
First(X) = { t | X →* tα} ∪ {ε | X →* ε}

Algorithm sketch:
1. First(t) = { t }
2. ε ∈ First(X)
• if X → ε or
• if X → A1 … An and ε ∈ First(Ai) for all 1 ≤ i ≤ n
3. First(α) ⊆ First(X)
• if X → α or
• if X → A1 … An α and ε ∈ First(Ai) for all 1 ≤ i ≤ n
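
The algorithm sketch can be run as a fixed-point iteration: keep applying rules 2 and 3 until no First set grows. The following is a minimal sketch under some assumptions: the grammar is a dictionary from non-terminal to a list of right-hand sides, [] encodes an ε-production, and the names `GRAMMAR`, `EPS`, and `first_sets` are hypothetical.

```python
EPS = "ε"

# Left-factored grammar from the lecture; [] encodes an ε-production.
GRAMMAR = {
    "E": [["T", "X"]],
    "X": [["+", "E"], []],
    "T": [["int", "Y"], ["(", "E", ")"]],
    "Y": [["*", "T"], []],
}


def first_sets(grammar):
    """Compute First for every terminal and non-terminal of the grammar."""
    symbols = {s for rhss in grammar.values() for rhs in rhss for s in rhs}
    terminals = symbols - set(grammar)
    first = {t: {t} for t in terminals}           # rule 1: First(t) = { t }
    first.update({a: set() for a in grammar})
    changed = True
    while changed:                                # iterate to a fixed point
        changed = False
        for a, rhss in grammar.items():
            for rhs in rhss:
                before = len(first[a])
                for sym in rhs:
                    # rule 3: add First(sym) while all earlier symbols are nullable
                    first[a] |= first[sym] - {EPS}
                    if EPS not in first[sym]:
                        break
                else:
                    # rule 2: every rhs symbol is nullable (or the rhs is empty)
                    first[a].add(EPS)
                if len(first[a]) != before:
                    changed = True
    return first
```

The loop terminates because sets only grow and there are finitely many symbols that can be added.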

15
First Sets: Example

1. First(t) = { t }
2. ε ∈ First(X)
– if X → ε or
– if X → A1…An and ε ∈ First(Ai) for all 1 ≤ i ≤ n
3. First(α) ⊆ First(X)
– if X → α or
– if X → A1…An α and ε ∈ First(Ai) for all 1 ≤ i ≤ n

E→TX X→+E|ε
T → int Y | ( E ) Y→ *T|ε

First( E ) =            First( X ) =
First( T ) =            First( Y ) =

16
First Sets: Example

• Recall the grammar


E→TX X→+E|ε
T → int Y | ( E ) Y→*T|ε

• First sets
First( ( )   = { ( }      First( T ) = { int, ( }
First( ) )   = { ) }      First( E ) = { int, ( }
First( int ) = { int }    First( X ) = { +, ε }
First( + )   = { + }      First( Y ) = { *, ε }
First( * )   = { * }

17
Computing Follow Sets

• Definition:
Follow(X) = { t | S →* β X t δ }

• Intuition
– If X → A B then First(B) ⊆ Follow(A) and
Follow(X) ⊆ Follow(B)
• if B →* ε then Follow(X) ⊆ Follow(A)

– If S is the start symbol then $ ∈ Follow(S)

18
Computing Follow Sets (Cont.)

Algorithm sketch:
1. $ ∈ Follow(S)
2. For each production A → αXβ
– First(β) - {ε} ⊆ Follow(X)
3. For each production A → αXβ where ε ∈ First(β)
– Follow(A) ⊆ Follow(X)
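
The three rules can also be iterated to a fixed point. In this sketch the First sets of the left-factored grammar are written in by hand rather than computed, and Follow is computed for every symbol, terminals included. The names `GRAMMAR`, `FIRST`, `first_of_string`, and `follow_sets` are assumptions of the sketch.

```python
EPS, END = "ε", "$"

GRAMMAR = {
    "E": [["T", "X"]],
    "X": [["+", "E"], []],
    "T": [["(", "E", ")"], ["int", "Y"]],
    "Y": [["*", "T"], []],
}

# First sets for the left-factored grammar, entered by hand.
FIRST = {
    "E": {"int", "("}, "T": {"int", "("}, "X": {"+", EPS}, "Y": {"*", EPS},
    "int": {"int"}, "+": {"+"}, "*": {"*"}, "(": {"("}, ")": {")"},
}


def first_of_string(symbols):
    """First of a string of symbols; contains ε if the whole string is nullable."""
    out = set()
    for s in symbols:
        out |= FIRST[s] - {EPS}
        if EPS not in FIRST[s]:
            return out
    out.add(EPS)
    return out


def follow_sets(grammar, start):
    follow = {s: set() for s in FIRST}
    follow[start].add(END)                        # rule 1: $ ∈ Follow(S)
    changed = True
    while changed:
        changed = False
        for a, rhss in grammar.items():
            for rhs in rhss:
                for i, x in enumerate(rhs):       # production A → α X β
                    fb = first_of_string(rhs[i + 1:])
                    new = fb - {EPS}              # rule 2: First(β) - {ε}
                    if EPS in fb:
                        new |= follow[a]          # rule 3: Follow(A) ⊆ Follow(X)
                    if not new <= follow[x]:
                        follow[x] |= new
                        changed = True
    return follow
```

Note that an empty β is handled by the same code path: `first_of_string([])` returns {ε}, so rule 3 applies to the last symbol of every production.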

19
Computing the Follow Sets (for the Non-Terminals)

• Recall the grammar


E→TX X→+E|ε
T → ( E ) | int Y Y→*T|ε

• $ ∈ Follow(E)
• First(X) ⊆ Follow(T)
• Follow(E) ⊆ Follow(X)
• Follow(E) ⊆ Follow(T) because ε ∈ First(X)
• ) ∈ Follow(E)
• Follow(T) ⊆ Follow(Y)
• Follow(X) ⊆ Follow(E)
• Follow(Y) ⊆ Follow(T)

25
Computing the Follow Sets (for the Non-Terminals)

• Recall the grammar


E→TX X→+E|ε
T → ( E ) | int Y Y→*T|ε

• $ ∈ Follow(E)
• First(X) ⊆ Follow(T)
• Follow(E) ⊆ Follow(X)
• Follow(E) ⊆ Follow(T)
• ) ∈ Follow(E)
• Follow(T) ⊆ Follow(Y)
• Follow(X) ⊆ Follow(E)
• Follow(Y) ⊆ Follow(T)

[Figure: the constraints drawn as a graph with nodes $, ), First(X), Follow(E), Follow(X), Follow(T), Follow(Y)]

26
Computing the Follow Sets (for the Non-Terminals)

[Figure: the same graph with nodes $, ), First(X), Follow(E), Follow(X), Follow(T), Follow(Y); First(X) is labeled with +]

27
Computing the Follow Sets (for the Non-Terminals)

[Figure: the same graph with each node labeled by its resulting set: Follow(E) = {$, )}, Follow(X) = {$, )}, Follow(T) = {$, ), +}, Follow(Y) = {$, ), +}]

28
Computing the Follow Sets (for all symbols)

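[Figure: the graph extended to all symbols, with additional nodes First(E), First(T), First(Y), Follow(+), Follow(*), Follow((), Follow()), Follow(int). Node labels: Follow(E) = {$, )}, Follow(X) = {$, )}, Follow(T) = {$, ), +}, Follow(Y) = {$, ), +}, Follow(+) = {(, int}, Follow(*) = {(, int}, Follow(() = {(, int}, Follow()) = {$, ), +}, Follow(int) = {$, ), +, *}]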

29
Follow Sets: Example

• Recall the grammar


E→TX X→+E|ε
T → ( E ) | int Y Y→ *T|ε

• Follow sets
Follow( + )   = { int, ( }        Follow( * ) = { int, ( }
Follow( ( )   = { int, ( }        Follow( E ) = { $, ) }
Follow( X )   = { $, ) }          Follow( T ) = { $, +, ) }
Follow( ) )   = { +, ), $ }       Follow( Y ) = { $, +, ) }
Follow( int ) = { *, +, ), $ }

30
Constructing LL(1) Parsing Tables

• Construct a parsing table T for CFG G

• For each production A → α in G do:


– For each terminal t ∈ First(α) do
• T[A, t] = α
– If ε ∈ First(α), then for each t ∈ Follow(A) do
• T[A, t] = α
– If ε ∈ First(α) and $ ∈ Follow(A) do
• T[A, $] = α
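
As a sketch, the construction can be applied to the left-factored grammar using the First and Follow sets from the earlier slides, entered by hand. The names `build_table`, `FIRST`, `FOLLOW`, and `first_of_string` are assumptions of this sketch. Each entry stores the right-hand side of the chosen production; in this grammar the only right-hand sides that derive ε are the ε-productions themselves, stored as [].

```python
EPS = "ε"

GRAMMAR = {
    "E": [["T", "X"]],
    "X": [["+", "E"], []],
    "T": [["int", "Y"], ["(", "E", ")"]],
    "Y": [["*", "T"], []],
}

# First and Follow sets from the lecture's example, entered by hand.
FIRST = {
    "E": {"int", "("}, "T": {"int", "("}, "X": {"+", EPS}, "Y": {"*", EPS},
    "int": {"int"}, "+": {"+"}, "*": {"*"}, "(": {"("}, ")": {")"},
}
FOLLOW = {
    "E": {"$", ")"}, "X": {"$", ")"},
    "T": {"$", "+", ")"}, "Y": {"$", "+", ")"},
}


def first_of_string(symbols):
    """First of a string of symbols; contains ε if the whole string is nullable."""
    out = set()
    for s in symbols:
        out |= FIRST[s] - {EPS}
        if EPS not in FIRST[s]:
            return out
    out.add(EPS)
    return out


def build_table(grammar):
    """Return (table, conflicts); conflicts lists multiply defined cells."""
    table, conflicts = {}, []
    for a, rhss in grammar.items():
        for rhs in rhss:
            fa = first_of_string(rhs)
            lookaheads = fa - {EPS}               # t ∈ First(α)
            if EPS in fa:
                lookaheads |= FOLLOW[a]           # t ∈ Follow(A), including $
            for t in lookaheads:
                if table.setdefault((a, t), rhs) != rhs:
                    conflicts.append((a, t))      # a second production wants this cell
    return table, conflicts
```

Because $ is stored in the Follow sets like any other token, the separate $ case of the algorithm is covered by the same loop. An empty `conflicts` list corresponds to no entry being multiply defined.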

31
Notes on LL(1) Parsing Tables

• If any entry is multiply defined then G is not LL(1). This happens

– If G is ambiguous
– If G is left recursive
– If G is not left-factored
– And in other cases as well

• Most programming language CFGs are not LL(1)
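
To illustrate the "not left-factored" case, the sketch below fills table cells for the original grammar E → T + E | T, T → int | int * T | ( E ) and reports the cells claimed by more than one production. It relies on the grammar having no ε-productions, so First of a right-hand side is just First of its first symbol and Follow sets are not needed. The names `GRAMMAR`, `first_sets`, and `multiply_defined` are assumptions of this sketch.

```python
# Grammar before left-factoring (no ε-productions).
GRAMMAR = {
    "E": [["T", "+", "E"], ["T"]],
    "T": [["int"], ["int", "*", "T"], ["(", "E", ")"]],
}


def first_sets(grammar):
    """First sets for an ε-free grammar: only the first rhs symbol matters."""
    first = {a: set() for a in grammar}
    changed = True
    while changed:
        changed = False
        for a, rhss in grammar.items():
            for rhs in rhss:
                head = rhs[0]
                new = first[head] if head in grammar else {head}
                if not new <= first[a]:
                    first[a] |= new
                    changed = True
    return first


def multiply_defined(grammar):
    """Map each conflicting cell (A, t) to the right-hand sides competing for it."""
    first = first_sets(grammar)
    cells = {}
    for a, rhss in grammar.items():
        for rhs in rhss:
            head = rhs[0]
            lookaheads = first[head] if head in grammar else {head}
            for t in lookaheads:
                cells.setdefault((a, t), []).append(rhs)
    return {cell: rhss for cell, rhss in cells.items() if len(rhss) > 1}
```

For this grammar the conflicting cells are [E, int], [E, (] and [T, int], which matches the earlier observation that two productions for T start with int and that the choice for E is not clear from one token.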

32
