
Top Down Parsing

Top-down parsing constructs a parse tree starting from the root node, which is labeled with the start symbol. It repeatedly expands non-terminal nodes using grammar productions until all tokens are consumed. Lookahead tokens are used to avoid backtracking, and left recursion must be removed before a grammar can be parsed top-down. Predictive parsing selects productions from a parse table, which removes backtracking and allows parsing of LL(k) grammars.

Uploaded by

Shukla Shravan

Top down Parsing

• The following grammar generates types of Pascal

type → simple
     | ↑ id
     | array [ simple ] of type

simple → integer
       | char
       | num dotdot num
Example …
• Construction of a parse tree starts at the root, which is labeled with the start symbol

• Repeat the following two steps

– at a node labeled with non terminal A, select one of the productions of A and construct children nodes (Which production?)
– find the next node at which a subtree is to be constructed (Which node?)
• Parse
array [ num dotdot num ] of integer

• The start symbol is type. Suppose it is expanded using the rule type → simple, giving the tree

type
  simple

• Cannot proceed, as non terminal “simple” never generates a string beginning with token “array”. Therefore, this choice requires back-tracking.

• Back-tracking is not desirable; therefore, take the help of a “look-ahead” token. The current token is treated as the look-ahead token. (This restricts the class of grammars.)
Input: array [ num dotdot num ] of integer

1. Start symbol type, look-ahead array. Expand using the rule
   type → array [ simple ] of type
   The children of the root are: array [ simple ] of type
2. Left most non terminal is simple. Expand using the rule
   simple → num dotdot num
3. Left most non terminal is type. Expand using the rule
   type → simple
4. Left most non terminal is simple. Expand using the rule
   simple → integer
5. All the tokens are exhausted. Parsing completed.
Recursive descent parsing
First set:

Let there be a production

A → α

then First(α) is the set of tokens that appear as the first token in the strings generated from α

For example:
First(simple) = {integer, char, num}
First(num dotdot num) = {num}
Define a procedure for each non terminal

{ type → simple | ↑ id | array [ simple ] of type }
procedure type;
  if lookahead in {integer, char, num}      { First(simple) }
    then simple
  else if lookahead = ↑
    then begin match(↑);
               match(id)
         end
  else if lookahead = array
    then begin match(array);
               match([);
               simple;
               match(]);
               match(of);
               type
         end
  else error;

{ simple → integer | char | num dotdot num }
procedure simple;
  if lookahead = integer
    then match(integer)
  else if lookahead = char
    then match(char)
  else if lookahead = num
    then begin match(num);
               match(dotdot);
               match(num)
         end
  else error;

{ consume the current token if it is the expected one }
procedure match(t: token);
  if lookahead = t
    then lookahead := nexttoken
  else error;
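The procedures above can be sketched in Python as follows. This is only an illustration under some assumptions: the class name `TypeParser`, the helper `parse_type`, the end marker `"$"`, and the spelling `^` for the pointer token in front of `id` are all choices made for this sketch and do not come from the slides.

```python
class ParseError(Exception):
    """Raised when the input does not match the grammar."""


class TypeParser:
    def __init__(self, tokens):
        # "$" is an assumed end-of-input marker, not part of the grammar.
        self.tokens = list(tokens) + ["$"]
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, t):
        # Consume the current token if it is the expected one.
        if self.lookahead != t:
            raise ParseError(f"expected {t!r}, found {self.lookahead!r}")
        self.pos += 1

    def type_(self):
        if self.lookahead in ("integer", "char", "num"):  # First(simple)
            self.simple()
        elif self.lookahead == "^":  # assumed spelling of the pointer token
            self.match("^")
            self.match("id")
        elif self.lookahead == "array":
            self.match("array")
            self.match("[")
            self.simple()
            self.match("]")
            self.match("of")
            self.type_()
        else:
            raise ParseError(f"unexpected token {self.lookahead!r}")

    def simple(self):
        if self.lookahead == "integer":
            self.match("integer")
        elif self.lookahead == "char":
            self.match("char")
        elif self.lookahead == "num":
            self.match("num")
            self.match("dotdot")
            self.match("num")
        else:
            raise ParseError(f"unexpected token {self.lookahead!r}")


def parse_type(tokens):
    """Return True if tokens form a type; raise ParseError otherwise."""
    parser = TypeParser(tokens)
    parser.type_()
    parser.match("$")  # the whole input must be consumed
    return True
```

For example, `parse_type("array [ num dotdot num ] of integer".split())` accepts the string parsed on the earlier slides, and each method picks its production from the look-ahead token alone, so no back-tracking is needed.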
Left recursion
• A top down parser with production
A → A α may loop forever

• From the grammar A → A α | β

left recursion may be eliminated by transforming the grammar to

A → β R
R → α R | ε
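[Figure: two parse trees. Left: parse tree corresponding to a left recursive grammar, with root A, a chain of A nodes down the left side, β at the bottom and the α's to the right. Right: parse tree corresponding to the modified grammar, with root A, child β, and a chain of R nodes down the right side producing the α's and ending in ε.]

Both the trees generate the string βα*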
Example
• Consider the grammar for arithmetic expressions

E → E + T | T
T → T * F | F
F → ( E ) | id

• After removal of left recursion the grammar becomes

E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id
Removal of left recursion
In general

A → A α1 | A α2 | … | A αm
  | β1 | β2 | … | βn

transforms to

A → β1 A' | β2 A' | … | βn A'

A' → α1 A' | α2 A' | … | αm A' | ε
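The general transformation can be sketched as a small Python function. The representation is an assumption made for this sketch: each alternative is a list of symbols, the empty list stands for ε, and the new non terminal is named by appending `'`. The function name is hypothetical, and the sketch assumes no alternative is the single symbol A itself.

```python
def eliminate_immediate_left_recursion(nonterminal, alternatives):
    """Rewrite A -> A a1 | ... | A am | b1 | ... | bn without left recursion.

    alternatives is a list of right-hand sides, each a list of symbols;
    the empty list stands for epsilon (an assumed encoding).
    """
    new_nt = nonterminal + "'"
    # The alpha_i: what follows A in the left recursive alternatives.
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # The beta_j: alternatives that do not start with A.
    betas = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not alphas:
        return {nonterminal: alternatives}  # nothing to remove
    return {
        nonterminal: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],
    }
```

Applied to `E → E + T | T`, that is `eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]])`, this yields `E → T E'` and `E' → + T E' | ε`, matching the example on the previous slide.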
Left recursion hidden due to many productions
• Left recursion may also be introduced by two or more grammar rules. For example:

S → Aa | b
A → Ac | Sd | ε

there is a left recursion because

S ⇒ Aa ⇒ Sda

• In such cases, left recursion is removed systematically

– Starting from the first rule and replacing all the occurrences of the first non terminal symbol
– Removing left recursion from the modified grammar
Removal of left recursion due to many productions …
• After the first step (substitute S by its rhs in the rules) the grammar becomes

S → Aa | b
A → Ac | Aad | bd | ε

• After the second step (removal of left recursion) the grammar becomes

S → Aa | b
A → bdA' | A'
A' → cA' | adA' | ε
Left factoring
• In top-down parsing, when it is not clear which production to choose for expansion of a symbol, defer the decision till we have seen enough input.

In general if A → αβ1 | αβ2

defer decision by expanding A to αA'

we can then expand A' to β1 or β2

• Therefore A → αβ1 | αβ2

transforms to

A → αA'
A' → β1 | β2
Dangling else problem again
Dangling else problem can be handled by left factoring

stmt → if expr then stmt else stmt
     | if expr then stmt

can be transformed to

stmt → if expr then stmt S'
S' → else stmt | ε
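Left factoring as described on the last two slides can be sketched in Python. The encoding is an assumption of this sketch: alternatives are lists of symbols, the empty list stands for ε, and new non terminals are named by appending one or more `'`. The function `left_factor` is hypothetical and performs a single pass, so alternatives that still share a prefix afterwards would need another call.

```python
def left_factor(nonterminal, alternatives):
    """One pass of left factoring over the alternatives of one non terminal."""
    # Group the alternatives by their first symbol (None for epsilon).
    groups = {}
    for alt in alternatives:
        key = alt[0] if alt else None
        groups.setdefault(key, []).append(alt)

    result = {nonterminal: []}
    count = 0
    for key, group in groups.items():
        if key is None or len(group) == 1:
            result[nonterminal].extend(group)  # nothing to factor
            continue
        # Longest prefix (the alpha) shared by every alternative in the group.
        prefix = group[0]
        for alt in group[1:]:
            n = 0
            while n < len(prefix) and n < len(alt) and prefix[n] == alt[n]:
                n += 1
            prefix = prefix[:n]
        count += 1
        new_nt = nonterminal + "'" * count
        result[nonterminal].append(prefix + [new_nt])
        # The remainders (the beta_i) become the alternatives of the new symbol.
        result[new_nt] = [alt[len(prefix):] for alt in group]
    return result
```

For the two `stmt` alternatives of the dangling else grammar this produces `stmt → if expr then stmt stmt'` and `stmt' → else stmt | ε`, which is the transformed grammar above with `stmt'` in place of `S'`.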
Predictive parsers
• A non recursive top down parsing method

• Parser “predicts” which production to use

• It removes backtracking by fixing one production for every non-terminal and input token(s)

• Predictive parsers accept LL(k) languages
– First L stands for left to right scan of input
– Second L stands for leftmost derivation
– k stands for the number of lookahead tokens

• In practice LL(1) is used
Predictive parsing
• Predictive parser can be implemented by maintaining an external stack

[Figure: block diagram of the parser, connected to the input, the stack, the parse table and the output]

• Parse table is a two dimensional array M[X,a] where “X” is a non terminal and “a” is a terminal of the grammar
Example
• Consider the grammar

E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id
Parse table for the grammar

      id          +             *             (           )          $
E     E → T E'                                E → T E'
E'                E' → + T E'                             E' → ε     E' → ε
T     T → F T'                                T → F T'
T'                T' → ε        T' → * F T'               T' → ε     T' → ε
F     F → id                                  F → ( E )

Blank entries are error states. For example, E cannot derive a string starting with '+'
Parsing algorithm
• The parser considers 'X', the symbol on top of the stack, and 'a', the current input symbol

• These two symbols determine the action to be taken by the parser

• Assume that '$' is a special token that is at the bottom of the stack and terminates the input string

if X = a = $ then halt

if X = a ≠ $ then pop(X) and ip++

if X is a terminal and X ≠ a then error

if X is a non terminal
   then if M[X,a] = {X → UVW}
           then begin pop(X); push(W,V,U)    (U ends up on top of the stack)
                end
        else error
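The algorithm can be sketched in Python for the expression grammar, with the parse table written out by hand as a dictionary. The names `TABLE`, `NONTERMINALS`, `ParseError` and `ll1_parse` are assumptions of this sketch; an empty tuple stands for an ε right-hand side, and a missing key plays the role of a blank (error) entry.

```python
class ParseError(Exception):
    """Raised when the parser reaches an error entry or a token mismatch."""


# M[X, a] for the expression grammar; () encodes an epsilon production.
TABLE = {
    ("E", "id"): ("T", "E'"), ("E", "("): ("T", "E'"),
    ("E'", "+"): ("+", "T", "E'"), ("E'", ")"): (), ("E'", "$"): (),
    ("T", "id"): ("F", "T'"), ("T", "("): ("F", "T'"),
    ("T'", "+"): (), ("T'", "*"): ("*", "F", "T'"),
    ("T'", ")"): (), ("T'", "$"): (),
    ("F", "id"): ("id",), ("F", "("): ("(", "E", ")"),
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def ll1_parse(tokens, start="E"):
    """Return the productions applied, in order; raise ParseError on failure."""
    tokens = list(tokens) + ["$"]   # '$' terminates the input
    stack = ["$", start]            # '$' sits at the bottom of the stack
    ip = 0
    output = []
    while True:
        X, a = stack[-1], tokens[ip]
        if X == a == "$":
            return output           # halt
        if X not in NONTERMINALS:
            if X != a:
                raise ParseError(f"expected {X!r}, found {a!r}")
            stack.pop()             # X = a: pop and advance the input
            ip += 1
        else:
            rhs = TABLE.get((X, a))
            if rhs is None:
                raise ParseError(f"no entry for M[{X}, {a}]")
            stack.pop()
            stack.extend(reversed(rhs))  # push W, V, U so that U is on top
            output.append((X, rhs))
```

Calling `ll1_parse("id + id * id".split())` returns the sequence of expansions in the order the parser applies them, starting with `("E", ("T", "E'"))`.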
Example
Stack        Input               Action
$E           id + id * id $      expand by E → T E'
$E'T         id + id * id $      expand by T → F T'
$E'T'F       id + id * id $      expand by F → id
$E'T'id      id + id * id $      pop id and ip++
$E'T'        + id * id $         expand by T' → ε
$E'          + id * id $         expand by E' → + T E'
$E'T+        + id * id $         pop + and ip++
$E'T         id * id $           expand by T → F T'
$E'T'F       id * id $           expand by F → id
$E'T'id      id * id $           pop id and ip++
$E'T'        * id $              expand by T' → * F T'
$E'T'F*      * id $              pop * and ip++
$E'T'F       id $                expand by F → id
$E'T'id      id $                pop id and ip++
$E'T'        $                   expand by T' → ε
$E'          $                   expand by E' → ε
$            $                   halt
Constructing parse table
• Table can be constructed if, for every non terminal, every lookahead symbol can be handled by at most one production

• First(α) for a string of terminals and non terminals α is
– the set of symbols that might begin the fully expanded (made of only tokens) version of α

• Follow(X) for a non terminal X is
– the set of symbols that might follow the derivation of X in the input stream

[Figure: positions of the first and follow symbols relative to X]
Compute first sets
• If X is a terminal symbol then First(X) = {X}

• If X → ε is a production then ε is in First(X)

• If X is a non terminal
and X → Y1 Y2 … Yk is a production
then
if for some i, a is in First(Yi)
and ε is in all of First(Yj) (such that j < i)
then a is in First(X)

• If ε is in First(Y1) … First(Yk) then ε is in First(X)
Example
• For the expression grammar
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id

First(E) = First(T) = First(F) = { (, id }

First(E') = { +, ε }
First(T') = { *, ε }
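The rules for First can be sketched as a fixed-point computation in Python. The grammar encoding is an assumption of this sketch: a dictionary from each non terminal to its alternatives, each a list of symbols, with the empty list for an ε production and the string `"ε"` as the marker inside the sets. The function names are hypothetical.

```python
EPS = "ε"

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}


def first_of_string(symbols, first):
    """First set of a string of symbols, given the First sets known so far."""
    result = set()
    for sym in symbols:
        # A terminal X has First(X) = {X}.
        sym_first = first[sym] if sym in first else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result      # this symbol cannot vanish, so stop here
    result.add(EPS)            # every symbol can derive epsilon
    return result


def compute_first(grammar):
    """Apply the rules repeatedly until no First set grows any further."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                new = first_of_string(alt, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first
```

Running `compute_first(GRAMMAR)` reproduces the sets listed above, for instance `{"(", "id"}` for E and `{"+", "ε"}` for E'.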
Compute follow sets
1. Place $ in follow(S), where S is the start symbol
2. If there is a production A → αBβ
then everything in first(β) (except ε) is in follow(B)

3. If there is a production A → αB
then everything in follow(A) is in follow(B)

4. If there is a production A → αBβ
and First(β) contains ε
then everything in follow(A) is in follow(B)

Since follow sets are defined in terms of follow sets, the last two steps have to be repeated until the follow sets converge
Example
• For the expression grammar
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id

follow(E) = follow(E') = { $, ) }
follow(T) = follow(T') = { $, ), + }
follow(F) = { $, ), +, * }
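The four rules can be sketched in Python as an iteration that stops when nothing changes. To keep the sketch self-contained, the First sets are copied in by hand from the earlier example rather than computed; the grammar encoding (lists of symbols, empty list for ε) and all names are assumptions of this sketch.

```python
EPS = "ε"

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}

# First sets of the non terminals, taken from the example slide.
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}


def first_of_string(symbols):
    """First set of a string; the empty string yields {epsilon}."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})   # terminal: First(a) = {a}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")                      # rule 1
    changed = True
    while changed:                              # repeat until convergence
        changed = False
        for A, alternatives in grammar.items():
            for alt in alternatives:
                for i, B in enumerate(alt):
                    if B not in grammar:
                        continue                # only non terminals have Follow
                    beta_first = first_of_string(alt[i + 1:])
                    new = beta_first - {EPS}    # rule 2
                    if EPS in beta_first:       # rules 3 and 4
                        new |= follow[A]
                    if not new <= follow[B]:
                        follow[B] |= new
                        changed = True
    return follow
```

Rule 3 needs no separate case here because the First set of an empty β contains ε, so it is handled together with rule 4. `compute_follow(GRAMMAR, "E")` gives the sets listed above.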
Construction of parse table
• for each production A → α do
– for each terminal 'a' in first(α)
M[A,a] = A → α

– If ε is in First(α)
M[A,b] = A → α
for each terminal b in follow(A)

– If ε is in First(α) and $ is in follow(A)
M[A,$] = A → α

• A grammar whose parse table has no multiple entries is called LL(1)

• Steps to be followed
– Remove left recursion
– Compute first sets
– Compute follow sets
– Construct the parse table
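The construction can be sketched in Python for the expression grammar. The First and Follow sets are copied in by hand from the earlier examples so that the sketch is self-contained; the dictionary encoding of the table, the use of `ValueError` to signal a multiple entry, and all names are assumptions of this sketch.

```python
EPS = "ε"

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}
FOLLOW = {
    "E": {"$", ")"}, "E'": {"$", ")"},
    "T": {"$", ")", "+"}, "T'": {"$", ")", "+"},
    "F": {"$", ")", "+", "*"},
}


def first_of_string(symbols):
    """First set of a string; the empty string yields {epsilon}."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})   # terminal: First(a) = {a}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def build_table(grammar):
    """Return M as a dict keyed by (non terminal, terminal)."""
    table = {}
    for A, alternatives in grammar.items():
        for alt in alternatives:
            alpha_first = first_of_string(alt)
            targets = alpha_first - {EPS}
            if EPS in alpha_first:
                # Follow(A) already contains "$" when it applies,
                # which covers the third rule as well.
                targets |= FOLLOW[A]
            for a in targets:
                if (A, a) in table:
                    raise ValueError(f"not LL(1): two entries for M[{A}, {a}]")
                table[(A, a)] = tuple(alt)
    return table
```

For this grammar `build_table(GRAMMAR)` fills 13 entries and raises no conflict, so the grammar is LL(1); keys that are absent correspond to the blank error entries.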
Error handling
• Stop at the first error and print a message
– Compiler writer friendly
– But not user friendly

• Every reasonable compiler must recover from errors and identify as many errors as possible

• However, multiple error messages due to a single fault must be avoided

• Error recovery methods
– Panic mode
– Phrase level recovery
– Error productions
– Global correction
Panic mode
• Simplest and the most popular method

• Most tools provide for specifying panic mode recovery in the grammar

• When an error is detected
– Discard tokens one at a time until a set of tokens is found whose role is clear
– Skip to the next token that can be placed reliably in the parse tree
Panic mode …
• Consider the following code
begin
a = b + c;
x = p r;
h = x < 0;
end;

• The second expression has a syntax error

• Panic mode recovery for begin-end block: skip ahead to the next ';' and try to parse the next expression

• It discards one expression and tries to continue parsing

• May fail if no further ';' is found
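Panic mode for a begin-end block can be sketched in Python. Everything specific here is an assumption of the sketch: the toy statement form `id = operand (op operand)* ;`, the operator set, the function names, and the reading of the faulty statement as the tokens `x = p r ;`. The sketch also stops skipping at `end`, which goes slightly beyond skipping to the next ';'.

```python
class ParseError(Exception):
    pass


OPERATORS = {"+", "-", "*", "/", "<", ">"}
KEYWORDS = {"begin", "end"}


def is_identifier(tok):
    return tok.isidentifier() and tok not in KEYWORDS


def is_operand(tok):
    return is_identifier(tok) or tok.isdigit()


def parse_statement(tokens, pos):
    """Parse `id = operand (op operand)* ;` and return the position after ';'."""
    def expect(ok, what):
        nonlocal pos
        found = tokens[pos] if pos < len(tokens) else "end of input"
        if pos >= len(tokens) or not ok(found):
            raise ParseError(f"expected {what}, found {found!r}")
        pos += 1

    expect(is_identifier, "identifier")
    expect(lambda t: t == "=", "'='")
    expect(is_operand, "operand")
    while pos < len(tokens) and tokens[pos] in OPERATORS:
        pos += 1
        expect(is_operand, "operand")
    expect(lambda t: t == ";", "';'")
    return pos


def parse_block(tokens):
    """Parse `begin statement* end` and return the list of error messages."""
    if not tokens or tokens[0] != "begin":
        return ["expected 'begin'"]
    errors = []
    pos = 1
    while pos < len(tokens) and tokens[pos] != "end":
        try:
            pos = parse_statement(tokens, pos)
        except ParseError as exc:
            errors.append(str(exc))     # one message per faulty statement
            # Panic mode: discard tokens up to and including the next ';'.
            while pos < len(tokens) and tokens[pos] not in (";", "end"):
                pos += 1
            if pos < len(tokens) and tokens[pos] == ";":
                pos += 1
    if pos >= len(tokens):
        errors.append("expected 'end'")
    return errors
```

With the tokens `begin a = b + c ; x = p r ; h = x < 0 ; end`, the parser reports a single error for the second statement, discards it, and still checks the third statement.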
Phrase level recovery
• Make local correction to the input

• Works only in limited situations
– A common programming error which is easily detected
– For example, insert a “;” after the closing “}” of a class definition

• Does not work very well!
Error productions
• Add erroneous constructs as productions in the grammar

• Works only for the most common mistakes, which can be easily identified

• Essentially makes common errors a part of the grammar

• Complicates the grammar and does not work very well
Global corrections
• Considering the program as a whole, find a correct “nearby” program

• Nearness may be measured using a certain metric

• PL/C compiler implemented this scheme: anything could be compiled!

• It is complicated and not a very good idea!
Error Recovery in LL(1) parser
• Error occurs when a parse table entry M[A,a] is empty

• Skip symbols in the input until a token in a selected set (synch) appears

• Place symbols in follow(A) in the synch set. Skip tokens until an element in follow(A) is seen. Pop(A) and continue parsing

• Add symbols in first(A) to the synch set. Then it may be possible to resume parsing according to A if a symbol in first(A) appears in the input.
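The recovery scheme can be sketched on top of a table-driven parser for the expression grammar. The table and Follow sets are copied in by hand; the function name, the message texts, and the handling of a mismatched terminal on top of the stack (pop it and report it as missing) are assumptions of this sketch and are not taken from the slides.

```python
# M[X, a] for the expression grammar; () encodes an epsilon production.
TABLE = {
    ("E", "id"): ("T", "E'"), ("E", "("): ("T", "E'"),
    ("E'", "+"): ("+", "T", "E'"), ("E'", ")"): (), ("E'", "$"): (),
    ("T", "id"): ("F", "T'"), ("T", "("): ("F", "T'"),
    ("T'", "+"): (), ("T'", "*"): ("*", "F", "T'"),
    ("T'", ")"): (), ("T'", "$"): (),
    ("F", "id"): ("id",), ("F", "("): ("(", "E", ")"),
}
FOLLOW = {
    "E": {"$", ")"}, "E'": {"$", ")"},
    "T": {"$", ")", "+"}, "T'": {"$", ")", "+"},
    "F": {"$", ")", "+", "*"},
}


def parse_with_recovery(tokens, start="E"):
    """Parse the tokens and return the list of error messages (empty if none)."""
    tokens = list(tokens) + ["$"]
    stack, ip, errors = ["$", start], 0, []
    while True:
        X, a = stack[-1], tokens[ip]
        if X == a == "$":
            return errors
        if X in FOLLOW:                          # X is a non terminal
            if (X, a) not in TABLE:              # empty entry: error
                errors.append(f"unexpected {a!r} while expanding {X}")
                # Skip input until a token in the synch set appears:
                # Follow(X), or a token for which M[X, a] is filled.
                while a != "$" and a not in FOLLOW[X] and (X, a) not in TABLE:
                    ip += 1
                    a = tokens[ip]
            stack.pop()
            if (X, a) in TABLE:
                stack.extend(reversed(TABLE[(X, a)]))   # resume parsing with X
            # otherwise a is in Follow(X): X stays popped and is abandoned
        elif X == a:
            stack.pop()
            ip += 1
        elif X == "$":
            errors.append(f"unexpected {a!r} after the expression")
            ip += 1
        else:
            errors.append(f"missing {X!r} before {a!r}")
            stack.pop()
```

On the input `id + * id` the parser finds M[T, *] empty, reports one error, skips `*`, and resumes with T at the following `id`, so the rest of the expression is still checked.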
Practice Assignment
• Reading assignment: Read about error recovery in LL(1) parsers
• Assignment to be submitted:
– introduce synch symbols (using both follow and first sets) in the parse table created for the boolean expression grammar in the previous assignment
– Parse “not (true and or false)” and show how error recovery works
