Compiler Principle and Technology: Mr. Aruna Malik BIT (Mesra) Ranchi, Off Campus NOIDA

The document discusses top-down parsing techniques, including recursive descent parsing and LL(1) parsing. It covers converting context-free grammars to EBNF form and using that to generate recursive descent parser code. It also discusses issues that can arise with recursive descent parsing like ambiguity and the need to calculate first and follow sets. Finally, it provides an overview of LL(1) parsing and how a parsing table is used to determine actions.

Uploaded by

Srijan Apurva

Compiler

Principle and
Technology
Mr. Aruna Malik
BIT (Mesra) Ranchi,
Off Campus NOIDA
4. Top-Down
Parsing
PART ONE
Contents
PART ONE
4.1 Top-Down Parsing by Recursive Descent
4.2 LL(1) Parsing

PART TWO
4.3 First and Follow Sets
4.5 Error Recovery in Top-Down Parsers
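Basic Concepts
[Figure: classification of parsing methods for context-free grammars]
• Parsing a context-free grammar: top-down parsing and bottom-up parsing
• Top-down parsing: backtracking parsers and predictive parsers (no backtracking; each predict step is decided)
• Predictive parsers: recursive-descent parsing and LL(1) parsing (non-recursive)
• Supporting topics: First set and Follow set; error recovery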
4.1 Top-Down
Parsing by
Recursive-
Descent
4.1.1 The Basic
Method of
Recursive-
Descent
The idea of Recursive-Descent Parsing
• The grammar rule for a non-terminal A is viewed as a definition for a procedure that recognizes an A.
• The right-hand side of the grammar rule for A specifies the structure of the code for this procedure.
• The expression grammar:
  exp → exp addop term | term
  addop → + | -
  term → term mulop factor | factor
  mulop → *
  factor → ( exp ) | number
A recursive-descent procedure that
recognizes a factor

procedure factor
begin
  case token of
    ( : match( ( );
        exp;
        match( ) );
    number :
        match(number);
    else error;
  end case;
end factor

• The variable token keeps the current next token in the input (one symbol of look-ahead).
• The match procedure matches the current next token with its parameter, advances the input if it succeeds, and declares an error if it does not.
Match Procedure
• Matches the current next token with its parameter
• Advances the input if it succeeds, and declares an error if it does not

procedure match(expectedToken);
begin
  if token = expectedToken then
    getToken;
  else
    error;
  end if;
end match
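Requiring the Use of EBNF

• Writing recursive-descent procedures for the remaining rules in the expression grammar is not as easy as for factor: the rules for exp and term are left recursive, so a procedure that followed them literally would call itself before consuming any input.

• The corresponding EBNF is
  exp → term { addop term }
  addop → + | -
  term → factor { mulop factor }
  mulop → *
  factor → ( exp ) | number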
The corresponding syntax
diagrams
[Figure: syntax diagrams for exp, addop, term, mulop and factor, corresponding to the EBNF rules]
4.1.2 Repetition
and Choice:
Using EBNF
An Example
• The grammar rule for an if-statement:
  if-stmt → if ( exp ) statement
          | if ( exp ) statement else statement

procedure ifstmt;
begin
  match( if );
  match( ( );
  exp;
  match( ) );
  statement;
  if token = else then
    match(else);
    statement;
  end if;
end ifstmt;

Issues
• The two choices cannot be distinguished immediately, because both start with the token if.
• The decision is put off until the token else is seen in the input.
The EBNF of the if-statement
• if-stmt → if ( exp ) statement [ else statement ]
• The square brackets of the EBNF are translated into a test in the code for if-stmt:
    if token = else then
      match(else);
      statement;
    end if;

• Notes
  • EBNF notation is designed to mirror closely the actual code of a recursive-descent parser, so a grammar should always be translated into EBNF if recursive descent is to be used.
  • It is natural to write a parser that matches each else token as soon as it is encountered in the input.
EBNF for Simple Arithmetic Grammar (1)
• The BNF rule for exp,
    exp → exp addop term | term
  becomes the EBNF rule
    exp → term { addop term }

• The curly brackets expressing repetition can be translated into the code for a loop:

procedure exp;
begin
  term;
  while token = + or token = - do
    match(token);
    term;
  end while;
end exp;
EBNF for Simple Arithmetic Grammar (2)
• The EBNF rule for term:
    term → factor { mulop factor }

• becomes the code

procedure term;
begin
  factor;
  while token = * do
    match(token);
    factor;
  end while;
end term;
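The match, exp, term and factor procedures can be combined into one runnable recognizer. The following Python is only a sketch under stated assumptions: the `tokenize` helper, the `Recognizer` class, the `recognizes` wrapper and the use of `SyntaxError` in place of the error procedure are illustrative names that do not appear in the slides.

```python
def tokenize(text):
    """Hypothetical scanner: integers become 'number' tokens, + - * ( ) stand for themselves."""
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch.isdigit():
            j = i
            while j < len(text) and text[j].isdigit():
                j += 1
            tokens.append("number")
            i = j
        elif ch in "+-*()":
            tokens.append(ch)
            i += 1
        else:
            raise SyntaxError(f"unexpected character {ch!r}")
    tokens.append("$")  # end-of-input marker
    return tokens


class Recognizer:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    @property
    def token(self):
        # The current next token: one symbol of look-ahead.
        return self.tokens[self.pos]

    def match(self, expected):
        # Advance the input on success, declare an error otherwise.
        if self.token != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.token!r}")
        self.pos += 1

    def exp(self):
        # exp -> term { addop term }
        self.term()
        while self.token in ("+", "-"):
            self.match(self.token)
            self.term()

    def term(self):
        # term -> factor { mulop factor }
        self.factor()
        while self.token == "*":
            self.match(self.token)
            self.factor()

    def factor(self):
        # factor -> ( exp ) | number
        if self.token == "(":
            self.match("(")
            self.exp()
            self.match(")")
        elif self.token == "number":
            self.match("number")
        else:
            raise SyntaxError(f"unexpected token {self.token!r}")


def recognizes(text):
    """Return True if the whole input is an exp, False on any syntax error."""
    try:
        parser = Recognizer(text)
        parser.exp()
        parser.match("$")
        return True
    except SyntaxError:
        return False
```

Each EBNF repetition `{ ... }` shows up as a `while` loop and each choice as an `if`/`elif` on the look-ahead token, which is the correspondence the slides describe.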
Left associativity
implied by the curly bracket
• The left associativity implied by the curly brackets (and explicit in the original BNF) can still be maintained within this code:
function exp: integer;
var temp: integer;
begin
temp:=term;
while token=+ or token = -
do
case token of
+ : match(+);
temp:=temp+term;
-: match(-);
temp:=temp-term;
end case;
end while;
return temp;
end exp;
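A small evaluator makes the left associativity visible: 3-4-5 must give (3-4)-5 = -6, not 3-(4-5) = 4. The Python below is a sketch, assuming integer operands and a regular-expression scanner that silently skips characters it does not know; the function name `evaluate` and the nested helper names are illustrative.

```python
import re


def evaluate(text):
    """Evaluate an expression over integers with + - * and parentheses."""
    # Simplified scanner (an assumption): unknown characters are ignored.
    tokens = re.findall(r"\d+|[-+*()]", text) + ["$"]
    pos = 0

    def token():
        return tokens[pos]

    def match(expected):
        nonlocal pos
        if token() != expected:
            raise SyntaxError(f"expected {expected!r}, found {token()!r}")
        pos += 1

    def exp():
        # exp -> term { addop term }; temp accumulates from the left.
        temp = term()
        while token() in ("+", "-"):
            op = token()
            match(op)
            if op == "+":
                temp = temp + term()
            else:
                temp = temp - term()
        return temp

    def term():
        # term -> factor { mulop factor }
        temp = factor()
        while token() == "*":
            match("*")
            temp = temp * factor()
        return temp

    def factor():
        # factor -> ( exp ) | number
        if token() == "(":
            match("(")
            value = exp()
            match(")")
            return value
        if token().isdigit():
            value = int(token())
            match(token())
            return value
        raise SyntaxError(f"unexpected token {token()!r}")

    result = exp()
    match("$")
    return result
```

Because `temp` is updated inside the loop before the next operand is read, each operator is applied to everything parsed so far, which is what makes the operation left associative.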
Some Notes
• The method of turning grammar rules in EBNF into code is quite powerful.
• There are a few pitfalls, and care must be taken in scheduling the actions within the code.
• In the previous pseudo-code for exp:
  (1) the operator token must be matched before the repeated call to term;
  (2) the global token variable must be set before the parse begins;
  (3) getToken must be called just after a successful test of a token.
Construction of the syntax tree
The expression: 3+4+5

        +
       / \
      +   5
     / \
    3   4
The pseudo-code for constructing the
syntax tree
function exp : syntaxTree;
var temp, newtemp: syntaxTree;
begin
  temp := term;
  while token = + or token = - do
    case token of
      + : match(+);
          newtemp := makeOpNode(+);
          leftChild(newtemp) := temp;
          rightChild(newtemp) := term;
          temp := newtemp;
      - : match(-);
          newtemp := makeOpNode(-);
          leftChild(newtemp) := temp;
          rightChild(newtemp) := term;
          temp := newtemp;
    end case;
  end while;
  return temp;
end exp;
A simpler one
function exp : syntaxTree;
var temp, newtemp: syntaxTree;
begin
  temp := term;
  while token = + or token = - do
    newtemp := makeOpNode(token);
    match(token);
    leftChild(newtemp) := temp;
    rightChild(newtemp) := term;
    temp := newtemp;
  end while;
  return temp;
end exp;
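The tree-building loop can be tried out with a few lines of Python. This is a sketch under simplifying assumptions: a term is taken to be a single integer, the scanner ignores unknown characters, and `OpNode` and `parse_exp` are illustrative names standing in for makeOpNode and the exp function.

```python
import re
from dataclasses import dataclass
from typing import Union


@dataclass
class OpNode:
    op: str          # "+" or "-"
    left: "Tree"     # subtree built so far
    right: "Tree"    # the term just parsed


Tree = Union[int, OpNode]


def parse_exp(text):
    """Build a left-associative syntax tree for exp -> term { addop term }."""
    tokens = re.findall(r"\d+|[-+]", text) + ["$"]
    pos = 0

    def term():
        # Assumption: a term is just an integer literal.
        nonlocal pos
        tok = tokens[pos]
        if not tok.isdigit():
            raise SyntaxError(f"number expected, found {tok!r}")
        pos += 1
        return int(tok)

    temp = term()
    while tokens[pos] in ("+", "-"):
        op = tokens[pos]
        pos += 1                           # match(token)
        temp = OpNode(op, temp, term())    # old tree becomes the left child
    if tokens[pos] != "$":
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return temp
```

For the input 3+4+5 the result is a + node whose left child is the + node for 3 and 4 and whose right child is 5, matching the tree drawn for this expression.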
The pseudo-code for the
if-statement procedure
function ifstatement: syntaxTree;
var temp: syntaxTree;
begin
  match(if);
  match(();
  temp := makeStmtNode(if);
  testChild(temp) := exp;
  match());
  thenChild(temp) := statement;
  if token = else then
    match(else);
    elseChild(temp) := statement;
  else
    elseChild(temp) := nil;
  end if;
  return temp;
end ifstatement
4.1.3 Further
Decision
Problems
Characteristics
of recursive-descent

The recursive-descent method simply translates the grammar rules into procedures. It is therefore very easy to write and understand, but it is ad hoc and has the following drawbacks:

(1) It may be difficult to convert a grammar written in BNF into EBNF form.
(2) It may be difficult to decide when to use the choice A → α and when to use the choice A → β if both α and β begin with non-terminals. This decision requires the computation of the First sets of α and β.
Characteristics
of recursive-descent
(3) In writing the code for an ε-production A → ε, it may be necessary to know which tokens can legally come after the non-terminal A, since such tokens indicate that A may disappear at this point in the parse. This set is called the Follow set of A.
(4) Computing the First and Follow sets may also be needed to detect errors as early as possible. For example, for the input “)3-2)” the parser descends from exp to term to factor before an error is reported, although a right parenthesis can never begin an expression.
We need a more general and formal method!
4.2 LL(1)
PARSING
4.2.1 The Basic
Method of LL(1)
Parsing
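Main idea
The LL(1) method uses a stack instead of recursive calls.
[Figure: a predictive parsing program reads the input (for example a + b $), works on a stack (for example X Y Z $), consults the parsing table M, and produces the output]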
Main idea
LL(1) parsing uses an explicit stack rather than recursive calls to perform a parse, so the state of the parser can be visualized quickly and easily.
For example, take a simple grammar for the strings of balanced parentheses:
  S → ( S ) S | ε
The following table shows the actions of a top-down parser given this grammar and the string ( ).
Table of Actions
Steps | Parsing Stack | Input | Action
1     | $S            | ()$   | S → ( S ) S
2     | $S)S(         | ()$   | match
3     | $S)S          | )$    | S → ε
4     | $S)           | )$    | match
5     | $S            | $     | S → ε
6     | $             | $     | accept

The actions are decided by a parsing table, which is introduced later.
General Schematic
• A top-down parser begins by pushing the start symbol onto the stack.
• It accepts an input string if, after a series of actions, the stack and the input become empty.
• A general schematic for a successful top-down parse:
  $ StartSymbol    InputString $
  …                …                //one of the two actions
  …                …                //one of the two actions
  $                $                accept
Two Actions
• The two actions
  • Generate: replace a non-terminal A at the top of the stack by a string α (in reverse) using a grammar rule A → α.
  • Match: match a token on top of the stack with the next input token.

• The list of generating actions in the above table:
  S => ( S ) S    [S → ( S ) S]
    => ( ) S      [S → ε]
    => ( )        [S → ε]
• This corresponds precisely to the steps in a leftmost derivation of the string ( ). This is characteristic of top-down parsing.
4.2.2 The LL(1)
Parsing Table
and Algorithm
Purpose and Example of LL(1) Table
• Purpose of the LL(1) parsing table:
  to express the possible rule choices for a non-terminal A when A is at the top of the parsing stack, based on the current input token (the look-ahead).

• The LL(1) parsing table for the following simple grammar:
  S → ( S ) S | ε

  M[N,T] | (           | )     | $
  S      | S → ( S ) S | S → ε | S → ε

The General Definition of Table

• The table, called M[N,T], is a two-dimensional array indexed by non-terminals and terminals.
• It contains the production choices to use at the appropriate parsing step.
• N is the set of non-terminals of the grammar.
• T is the set of terminals or tokens (including $).
• Any entries remaining empty represent potential errors.
Table-Constructing Rule
• The table-constructing rules
  1. If A → α is a production choice, and there is a derivation α =>* a β, where a is a token, then add A → α to the table entry M[A,a].
  2. If A → α is a production choice, and there are derivations α =>* ε and S$ =>* β A a γ, where S is the start symbol and a is a token (or $), then add A → α to the table entry M[A,a].
A Table-Constructing Case
• The construction of the table for S → ( S ) S | ε:
  • For the production S → ( S ) S we have α = ( S ) S, which begins with the token a = (, so this choice is added to the entry M[S, (].
  • Since S$ => ( S ) S$, rule 2 applies with α = ε, β = (, A = S, a = ), and γ = S$, so the choice S → ε is added to M[S, )].
  • Since S$ =>* S$ (the empty derivation), S → ε is also added to M[S, $].

  M[N,T] | (           | )     | $
  S      | S → ( S ) S | S → ε | S → ε

Properties of LL(1) Grammar

• Definition of LL(1) grammar:
  A grammar is an LL(1) grammar if the associated LL(1) parsing table has at most one production in each table entry.

• An LL(1) grammar cannot be ambiguous.

A Parsing Algorithm
Using the LL(1) Parsing Table
(* assumes $ marks the bottom of the stack and the end of the input *)

push the start symbol onto the top of the parsing stack;
while the top of the parsing stack ≠ $ do
  if the top of the parsing stack is terminal a
     and the next input token = a
  then (* match *)
    pop the parsing stack;
    advance the input;
  else if the top of the parsing stack is non-terminal A
       and the next input token is terminal a (or $)
       and parsing table entry M[A,a] contains production A → X1 X2 … Xn
  then (* generate *)
    pop the parsing stack;
    for i := n downto 1 do
      push Xi onto the parsing stack;
  else error;

if the top of the parsing stack = $
   and the next input token = $
then accept
else error.
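The table-driven algorithm is short enough to run directly. The Python below is a sketch, not a reference implementation: the function name `ll1_parse`, the dictionary encoding of M[N,T] (keys are (non-terminal, token) pairs, values are right-hand sides with the empty list standing for ε) and the use of `SyntaxError` for the error case are all assumptions. The loop runs while the top of the stack is not $, so that ε-productions can still be applied once the input is exhausted.

```python
def ll1_parse(table, start, tokens, nonterminals):
    """Run a table-driven LL(1) parse and return the list of actions taken."""
    stack = ["$", start]              # the top of the stack is the end of the list
    tokens = list(tokens) + ["$"]
    pos = 0
    actions = []
    while stack[-1] != "$":
        top, look = stack[-1], tokens[pos]
        if top not in nonterminals:
            # Match: the terminal on the stack must equal the look-ahead.
            if top != look:
                raise SyntaxError(f"expected {top!r}, found {look!r}")
            stack.pop()
            pos += 1
            actions.append("match")
        else:
            # Generate: replace the non-terminal by the right-hand side, reversed.
            rhs = table.get((top, look))
            if rhs is None:
                raise SyntaxError(f"no entry M[{top}, {look}]")
            stack.pop()
            stack.extend(reversed(rhs))
            actions.append(f"{top} -> {' '.join(rhs) or 'ε'}")
    if tokens[pos] != "$":
        raise SyntaxError(f"unexpected input {tokens[pos]!r}")
    actions.append("accept")
    return actions


# Parsing table for S -> ( S ) S | ε, as given in the slides.
PAREN_TABLE = {
    ("S", "("): ["(", "S", ")", "S"],
    ("S", ")"): [],
    ("S", "$"): [],
}
```

Calling `ll1_parse(PAREN_TABLE, "S", "()", {"S"})` reproduces the six actions of the table of actions for the string ( ).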
Example: If-Statements

• The LL(1) parsing table for a simplified grammar of if-statements:

  statement → if-stmt | other
  if-stmt → if ( exp ) statement else-part
  else-part → else statement | ε
  exp → 0 | 1

  M[N,T]    | if                                       | other             | else                       | 0       | 1       | $
  statement | statement → if-stmt                      | statement → other |                            |         |         |
  if-stmt   | if-stmt → if ( exp ) statement else-part |                   |                            |         |         |
  else-part |                                          |                   | else-part → else statement |         |         | else-part → ε
            |                                          |                   | else-part → ε              |         |         |
  exp       |                                          |                   |                            | exp → 0 | exp → 1 |

Notice for Example: If-Statement
• The entry M[else-part, else] contains two productions, which reflects the dangling-else ambiguity.
• Disambiguating rule: always prefer the rule that generates the current look-ahead token over any other, and thus prefer the production
    else-part → else statement
  over
    else-part → ε
• With this modification, the above table becomes unambiguous, and the grammar can be parsed as if it were an LL(1) grammar.
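The LL(1) condition and the disambiguating rule can both be checked mechanically. In the Python sketch below, the table encoding (each entry holds a list of right-hand sides, `()` standing for ε) and the function names `is_ll1` and `prefer_lookahead` are assumptions. The rule "prefer the production that generates the look-ahead token" is approximated by keeping the choice whose right-hand side literally starts with that token, which is enough for this grammar but not a general test.

```python
def is_ll1(table):
    """A table is LL(1) if every entry holds at most one production."""
    return all(len(choices) <= 1 for choices in table.values())


def prefer_lookahead(table):
    """Resolve multiply-defined entries in favour of the choice starting with the token."""
    resolved = {}
    for (nonterminal, token), choices in table.items():
        if len(choices) > 1:
            preferred = [rhs for rhs in choices if rhs and rhs[0] == token]
            # Keep the original entry if the simple test does not single out a choice.
            choices = preferred or choices
        resolved[(nonterminal, token)] = list(choices)
    return resolved


# The if-statement table from the slides; M[else-part, else] has two entries.
IF_TABLE = {
    ("statement", "if"): [("if-stmt",)],
    ("statement", "other"): [("other",)],
    ("if-stmt", "if"): [("if", "(", "exp", ")", "statement", "else-part")],
    ("else-part", "else"): [("else", "statement"), ()],
    ("else-part", "$"): [()],
    ("exp", "0"): [("0",)],
    ("exp", "1"): [("1",)],
}
```

After `prefer_lookahead(IF_TABLE)`, the entry for (else-part, else) contains only else-part → else statement, so every entry holds a single production.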
Parsing Based on the LL(1) Table

• The parsing actions for the string:
  if (0) if (1) other else other

• For conciseness: statement = S, if-stmt = I, else-part = L, exp = E, if = i, else = e, other = o. The grammar then reads:
  S → I | o
  I → i ( E ) S L
  L → e S | ε
  E → 0 | 1

Steps | Parsing Stack | Input        | Action
1     | $S            | i(0)i(1)oeo$ | S → I
2     | $I            | i(0)i(1)oeo$ | I → i(E)SL
3     | $LS)E(i       | i(0)i(1)oeo$ | match
4     | $LS)E(        | (0)i(1)oeo$  | match
5     | $LS)E         | 0)i(1)oeo$   | E → 0
6     | $LS)0         | 0)i(1)oeo$   | match
7     | $LS)          | )i(1)oeo$    | match
8     | $LS           | i(1)oeo$     | S → I
9     | $LI           | i(1)oeo$     | I → i(E)SL
10    | $LLS)E(i      | i(1)oeo$     | match
11    | $LLS)E(       | (1)oeo$      | match
12    | …             | …            | E → 1
13    | …             | …            | match
14    | …             | …            | match
15    | …             | …            | S → o
16    | …             | …            | match
17    | …             | …            | L → eS
18    | …             | …            | match
19    | …             | …            | S → o
20    | …             | …            | match
21    | …             | …            | L → ε
22    | $             | $            | accept
The last step:
We omit the intermediate steps. After step 22 (parsing stack $, input $, action accept), the parse tree for if (0) if (1) other else other is as follows, with each child indented under its parent:

S
  I
    i
    (
    E
      0
    )
    S
      I
        i
        (
        E
          1
        )
        S
          o
        L
          e
          S
            o
    L
      ε
4.2.3 Left
Recursion
Removal and
Left Factoring
Repetition and Choice Problem

• Repetition and choice in LL(1) parsing suffer from problems similar to those that occur in recursive-descent parsing: the grammar as written may be ambiguous, or may not allow a deterministic choice between rules.

• Solutions:
  1. Apply the same ideas of using EBNF (as in recursive-descent parsing) to LL(1) parsing;
  2. Rewrite the grammar within BNF notation into a form that the LL(1) parsing algorithm can accept.
Two standard techniques for Repetition
and Choice

• Left recursion removal
    exp → exp addop term | term
  (in recursive-descent parsing, EBNF: exp → term { addop term })
• Left factoring
    if-stmt → if ( exp ) statement
            | if ( exp ) statement else statement
  (in recursive-descent parsing, EBNF: if-stmt → if ( exp ) statement [ else statement ])
Left Recursion
Removal
• Left recursion is commonly used to make operations left associative, as in the simple expression grammar, where
    exp → exp addop term | term
• Immediate left recursion:
  the left recursion occurs only within the production of a single non-terminal.
    exp → exp + term | exp - term | term

• Indirect left recursion:
  almost never occurs in actual programming language grammars, but is included for completeness.
    A → B b | …
    B → A a | …
CASE 1: Simple Immediate Left Recursion
• A → A α | β
  where α and β are strings of terminals and non-terminals, and β does not begin with A.

• The grammar generates all strings of the form β α^n (n ≥ 0), that is, β followed by zero or more repetitions of α.
• We rewrite this grammar rule into two rules:
    A → β A’          (generates β first)
    A’ → α A’ | ε     (generates the repetitions of α, using right recursion)
Example

• exp → exp addop term | term

• Rewriting this grammar to remove the left recursion, we obtain
    exp → term exp’
    exp’ → addop term exp’ | ε
CASE 2: General Immediate Left
Recursion

  A → A α1 | A α2 | … | A αn | β1 | β2 | … | βm

where none of β1, …, βm begins with A.

The solution is similar to the simple case:

  A → β1 A’ | β2 A’ | … | βm A’
  A’ → α1 A’ | α2 A’ | … | αn A’ | ε
Example

• exp → exp + term | exp - term | term

• Remove the left recursion as follows:
    exp → term exp’
    exp’ → + term exp’ | - term exp’ | ε
CASE 3: General Left Recursion

• The algorithm is guaranteed to work only for grammars with no ε-productions and no cycles.

(1) A cycle is a derivation of at least one step that begins and ends with the same non-terminal:
    A => α =>* A
(2) Programming language grammars do have ε-productions, but usually in very restricted forms.
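Algorithm for General Left Recursion
Removal
(* the non-terminals are assumed to be listed in some fixed order A1, …, Am *)
for i := 1 to m do
  for j := 1 to i-1 do
    replace each grammar rule choice of the form Ai → Aj β by the rule
      Ai → α1 β | α2 β | … | αk β,
    where Aj → α1 | α2 | … | αk is the current rule for Aj;
  remove, if necessary, the immediate left recursion involving Ai;

• Left recursion removal does not change the language being recognized, but it does change the grammar and therefore the parse trees. This change complicates the parser.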
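The general algorithm, including the immediate case it relies on, can be sketched in Python. The grammar encoding is an assumption: a dictionary from non-terminal to a list of right-hand sides, each a list of symbols, with `[]` for ε and the dictionary order giving A1, …, Am. New non-terminals are named by appending a prime, assuming no such name is already in use. The second test grammar, A → B b | c and B → A a | d, is a hypothetical completion of the indirect example, chosen only for illustration.

```python
def remove_immediate(grammar, a):
    """Rewrite A -> A α | β as A -> β A', A' -> α A' | ε (modifies grammar in place)."""
    recursive = [rhs[1:] for rhs in grammar[a] if rhs[:1] == [a]]
    others = [rhs for rhs in grammar[a] if rhs[:1] != [a]]
    if not recursive:
        return
    new = a + "'"
    grammar[a] = [beta + [new] for beta in others]
    grammar[new] = [alpha + [new] for alpha in recursive] + [[]]


def remove_left_recursion(grammar):
    """Return a copy of the grammar with (direct and indirect) left recursion removed."""
    result = {a: [list(rhs) for rhs in rules] for a, rules in grammar.items()}
    order = list(grammar)                 # A1, ..., Am
    for i, ai in enumerate(order):
        for aj in order[:i]:
            expanded = []
            for rhs in result[ai]:
                if rhs[:1] == [aj]:
                    # Ai -> Aj β becomes Ai -> α1 β | ... | αk β
                    expanded += [alpha + rhs[1:] for alpha in result[aj]]
                else:
                    expanded.append(rhs)
            result[ai] = expanded
        remove_immediate(result, ai)
    return result
```

For the rule exp → exp + term | exp - term | term this yields exp → term exp’ and exp’ → + term exp’ | - term exp’ | ε, the same result as in the example for immediate left recursion.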
Example
Simple arithmetic expression grammar:
  exp → exp addop term | term
  addop → + | -
  term → term mulop factor | factor
  mulop → *
  factor → ( exp ) | number

After removal of the left recursion:
  exp → term exp’
  exp’ → addop term exp’ | ε
  addop → + | -
  term → factor term’
  term’ → mulop factor term’ | ε
  mulop → *
  factor → ( exp ) | number
Parse Tree

• The parse tree for the expression 3-4-5, with each child indented under its parent. It does not express the left associativity of subtraction.

exp
  term
    factor
      number (3)
  exp’
    addop
      -
    term
      factor
        number (4)
    exp’
      addop
        -
      term
        factor
          number (5)
      exp’
        ε
Syntax Tree
• Nevertheless, a parser should still construct the appropriate left-associative syntax tree:

        -
       / \
      -   5
     / \
    3   4

• From the parse tree given above, we can see how the value of 3-4-5 is computed.
Left-Recursion Removed Grammar and
its Procedures
• The rules for exp and exp’ in the grammar with its left recursion removed are as follows:
    exp → term exp’
    exp’ → addop term exp’ | ε

procedure exp;
begin
  term;
  exp’;
end exp;

procedure exp’;
begin
  case token of
    + : match(+);
        term;
        exp’;
    - : match(-);
        term;
        exp’;
  end case;
end exp’;
Left-Recursion Removed Grammar and
its Procedures
• To compute the value of the expression, exp’ needs a parameter from the exp procedure:
    exp → term exp’
    exp’ → addop term exp’ | ε

function exp: integer;
var temp: integer;
begin
  temp := term;
  return exp’(temp);
end exp;

function exp’(valsofar: integer): integer;
begin
  if token = + or token = - then
    case token of
      + : match(+);
          valsofar := valsofar + term;
      - : match(-);
          valsofar := valsofar - term;
    end case;
    return exp’(valsofar);
  else
    return valsofar;
  end if;
end exp’;
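The role of the valsofar parameter is easiest to see by running it. The Python below is a sketch under simplifying assumptions: a term is a single integer, the scanner ignores unknown characters, and the names `evaluate` and `exp_prime` (for exp’) are illustrative.

```python
import re


def evaluate(text):
    """Evaluate sums and differences using exp -> term exp', exp' -> addop term exp' | ε."""
    tokens = re.findall(r"\d+|[-+]", text) + ["$"]
    pos = 0

    def term():
        # Assumption: a term is just an integer literal.
        nonlocal pos
        tok = tokens[pos]
        if not tok.isdigit():
            raise SyntaxError(f"number expected, found {tok!r}")
        pos += 1
        return int(tok)

    def exp():
        temp = term()
        return exp_prime(temp)

    def exp_prime(valsofar):
        # valsofar carries the value of everything to the left of the current operator.
        nonlocal pos
        if tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1                      # match the operator
            if op == "+":
                valsofar = valsofar + term()
            else:
                valsofar = valsofar - term()
            return exp_prime(valsofar)
        return valsofar                   # exp' -> ε

    result = exp()
    if tokens[pos] != "$":
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result
```

Although the grammar is now right recursive, passing the accumulated value down into the recursive call keeps subtraction left associative, so 3-4-5 still evaluates to -6.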
The LL(1) parsing table for the new expression grammar
M[N,T] | (                    | number               | )         | +                      | -                      | *                          | $
exp    | exp → term exp’      | exp → term exp’      |           |                        |                        |                            |
exp’   |                      |                      | exp’ → ε  | exp’ → addop term exp’ | exp’ → addop term exp’ |                            | exp’ → ε
addop  |                      |                      |           | addop → +              | addop → -              |                            |
term   | term → factor term’  | term → factor term’  |           |                        |                        |                            |
term’  |                      |                      | term’ → ε | term’ → ε              | term’ → ε              | term’ → mulop factor term’ | term’ → ε
mulop  |                      |                      |           |                        |                        | mulop → *                  |
factor | factor → ( exp )     | factor → number      |           |                        |                        |                            |
Left Factoring
• Left factoring is required when two or more grammar rule choices share a common prefix string, as in the rule
    A → α β | α γ
  Example:
    stmt-sequence → stmt ; stmt-sequence | stmt
    stmt → s
• An LL(1) parser cannot distinguish between the production choices in such a situation.
• The solution in this simple case is to “factor” the α out on the left and rewrite the rule as two rules:
    A → α A’
    A’ → β | γ
Algorithm for Left Factoring a Grammar

while there are changes to the grammar do
  for each non-terminal A do
    let α be a prefix of maximal length that is shared by two or more production choices for A;
    if α ≠ ε then
      let A → α1 | α2 | … | αn be all the production choices for A,
      and suppose that α1, α2, …, αk share α, so that
        A → α β1 | α β2 | … | α βk | αk+1 | … | αn,
      the βj’s share no common prefix, and αk+1, …, αn do not share α;
      replace the rule A → α1 | α2 | … | αn by the rules
        A → α A’ | αk+1 | … | αn
        A’ → β1 | β2 | … | βk
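The left-factoring loop can be sketched in Python as follows. The grammar encoding (a dictionary from non-terminal to a list of right-hand sides, each a list of symbols, `[]` for ε) and the function names are assumptions, and the new non-terminal is named by appending primes until the name is unused.

```python
def common_prefix(x, y):
    """Longest common prefix of two symbol lists."""
    n = 0
    while n < len(x) and n < len(y) and x[n] == y[n]:
        n += 1
    return x[:n]


def left_factor(grammar):
    """Return a left-factored copy of the grammar."""
    result = {a: [list(rhs) for rhs in rules] for a, rules in grammar.items()}
    changed = True
    while changed:                        # while there are changes to the grammar
        changed = False
        for a in list(result):            # for each non-terminal A
            rules = result[a]
            # Find a prefix of maximal length shared by two or more choices.
            alpha = []
            for i in range(len(rules)):
                for j in range(i + 1, len(rules)):
                    prefix = common_prefix(rules[i], rules[j])
                    if len(prefix) > len(alpha):
                        alpha = prefix
            if not alpha:
                continue
            new = a + "'"
            while new in result:
                new += "'"
            k = len(alpha)
            betas = [rhs[k:] for rhs in rules if rhs[:k] == alpha]
            rest = [rhs for rhs in rules if rhs[:k] != alpha]
            result[a] = [alpha + [new]] + rest      # A -> α A' | other choices
            result[new] = betas                     # A' -> β1 | ... | βk
            changed = True
    return result
```

Applied to stmt-sequence → stmt ; stmt-sequence | stmt, the sketch produces stmt-sequence → stmt stmt-sequence’ and stmt-sequence’ → ; stmt-sequence | ε.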
Example 4.4
• Consider the grammar for statement sequences, written in right recursive form:
    stmt-sequence → stmt ; stmt-sequence | stmt
    stmt → s

• Left factored as follows:
    stmt-sequence → stmt stmt-seq’
    stmt-seq’ → ; stmt-sequence | ε
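Example 4.4
• Notice:
  If we had written the stmt-sequence rule left recursively,
    stmt-sequence → stmt-sequence ; stmt | stmt
  then removing the immediate left recursion would result in the rules
    stmt-sequence → stmt stmt-seq’
    stmt-seq’ → ; stmt stmt-seq’ | ε
  Substituting stmt-sequence for stmt stmt-seq’ in the last rule gives the same rules as left factoring:
    stmt-seq’ → ; stmt-sequence | ε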
Example 4.5
• Consider the following grammar for if-statements:
    if-stmt → if ( exp ) statement
            | if ( exp ) statement else statement
• The left factored form of this grammar is:
    if-stmt → if ( exp ) statement else-part
    else-part → else statement | ε
Example 4.6
• An arithmetic expression grammar with a right-associative operation:
    exp → term + exp | term
• This grammar needs to be left factored, and we obtain the rules
    exp → term exp’
    exp’ → + exp | ε
• If we now substitute term exp’ for exp in the second rule, we obtain:
    exp → term exp’
    exp’ → + term exp’ | ε
Example 4.7
• A typical case where a grammar fails to be LL(1):

    statement → assign-stmt | call-stmt | other
    assign-stmt → identifier := exp
    call-stmt → identifier ( exp-list )

  Here identifier is shared as the first token of both assign-stmt and call-stmt and thus could be the look-ahead token for either. However, the grammar is not in a form that can be left factored directly.
Example 4.7

• First replace assign-stmt and call-stmt by the right-hand sides of their defining productions:
    statement → identifier := exp | identifier ( exp-list ) | other
• Then, we left factor to obtain
    statement → identifier statement’ | other
    statement’ → := exp | ( exp-list )
• Note:
  This obscures the semantics of call and assignment by separating the identifier from the actual call or assign action.
4.2.4 Syntax
Tree
Construction in
LL(1) Parsing
Difficulty in Construction

• It is more difficult to adapt LL(1) parsing to syntax tree construction than recursive-descent parsing.

• The structure of the syntax tree can be obscured by left factoring and left recursion removal.

• The parsing stack represents only predicted structures, not structures that have actually been seen.
Solution
• The solution
  • Delay the construction of syntax tree nodes to the point when structures are removed from the parsing stack.
  • An extra stack is used to keep track of syntax tree nodes, and “action” markers are placed in the parsing stack to indicate when and what actions on the tree stack should occur.
Example
• A barebones expression grammar with only an addition operation (to be applied left-associatively):
    E → E + n | n

• The corresponding LL(1) grammar with the left recursion removed is:
    E → n E’
    E’ → + n E’ | ε
To compute the arithmetic value of the
expression
• Use a separate stack, called the value stack, to store the intermediate values of the computation. Two operations are scheduled on that stack:
  • a push of a number;
  • the addition of two numbers.
• The push can be performed by the match procedure. The addition must be scheduled on the parsing stack, by pushing a special symbol (such as #) onto it.
• This symbol must also be added to the grammar rule that matches a +, namely the rule for E’:
    E’ → + n # E’ | ε

• Note: the addition is scheduled just after the next number, but before any more E’ non-terminals are processed. This guarantees left associativity.
The actions of the parser to compute
the value of the expression 3+4+5

Parsing Stack | Input  | Action         | Value Stack
$E            | 3+4+5$ | E → n E’       | $
$E’n          | 3+4+5$ | match/push     | $
$E’           | +4+5$  | E’ → + n # E’  | 3 $
$E’#n+        | +4+5$  | match          | 3 $
$E’#n         | 4+5$   | match/push     | 3 $
$E’#          | +5$    | addstack       | 4 3 $
$E’           | +5$    | E’ → + n # E’  | 7 $
$E’#n+        | +5$    | match          | 7 $
$E’#n         | 5$     | match/push     | 7 $
$E’#          | $      | addstack       | 5 7 $
$E’           | $      | E’ → ε         | 12 $
$             | $      | accept         | 12 $
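The value-stack scheme can be replayed with a short program. The Python below is a sketch for the grammar E → n E’, E’ → + n # E’ | ε only. The function name `evaluate_sum` and the hard-coded table choices inside the loop are assumptions made to keep the example small, and the scanner ignores characters other than digits and +.

```python
import re


def evaluate_sum(text):
    """LL(1) parse of n + n + ... with an action marker '#' and a value stack."""
    tokens = re.findall(r"\d+|\+", text) + ["$"]
    pos = 0
    parsing_stack = ["$", "E"]            # the top of the stack is the end of the list
    value_stack = []
    while parsing_stack[-1] != "$":
        top, look = parsing_stack.pop(), tokens[pos]
        if top == "E" and look.isdigit():
            parsing_stack += ["E'", "n"]               # E -> n E', pushed in reverse
        elif top == "E'" and look == "+":
            parsing_stack += ["E'", "#", "n", "+"]     # E' -> + n # E'
        elif top == "E'" and look == "$":
            pass                                       # E' -> ε
        elif top == "n" and look.isdigit():
            value_stack.append(int(look))              # match/push
            pos += 1
        elif top == "+" and look == "+":
            pos += 1                                   # match
        elif top == "#":
            right = value_stack.pop()                  # addstack: pop two values,
            left = value_stack.pop()                   # push their sum
            value_stack.append(left + right)
        else:
            raise SyntaxError(f"unexpected {look!r} with {top!r} on the stack")
    if tokens[pos] != "$":
        raise SyntaxError(f"unexpected input {tokens[pos]!r}")
    return value_stack.pop()
```

For 3+4+5 the value stack goes through 3, then 4 on top of 3, then 7, then 5 on top of 7, and finally 12, as in the table of actions.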
End of Part
One
THANKS
