
Types of Parsing Example Grammar: CMSC 430 CMSC 430

The document discusses top-down and bottom-up parsing. Top-down parsers start at the root of the parse tree and recursively expand non-terminals. They may require backtracking if the chosen production does not match the input. Bottom-up parsers start at the leaves and group symbols using a stack. Left recursion can cause issues for top-down parsers. The document also discusses LL(1) and LR(1) parsing which use limited lookahead to parse context-free grammars.

Uploaded by

Khalifa Alkhater


Types of parsing

Top-down parsers
  start at the root of the derivation tree and fill in
  pick a production and try to match the input
  may require backtracking
  some grammars are backtrack-free (predictive)

Bottom-up parsers
  start at the leaves and fill in
  start in a state valid for legal first tokens
  as input is consumed, change state to encode possibilities (recognize valid prefixes)
  use a stack to store both state and sentential forms

Consider parsing the input string x - 2 * y

CMSC 430

Lecture 4, Page 1


Top-down parsing

A top-down parser starts with the root of the parse tree. It is labeled with the start symbol or goal symbol of the grammar.

To build a parse, it repeats the following steps until the fringe of the parse tree matches the input string.
1. At a node labeled A, select a production with A on its lhs and, for each symbol on its rhs, construct the appropriate child.
2. When a terminal is added to the fringe that doesn't match the input string, backtrack.
3. Find the next node to be expanded. (It must have a label in NT, the set of non-terminals.)

The key is selecting the right production in step 1. The choice should be guided by the input string.

Backtracking parse example

One possible parse for x - 2 * y. A "-" in the Prodn column marks a step that expands no production (the parser advances over a matched terminal or backtracks).

Prodn  Sentential form
-      <goal>
1      <expr>
3      <expr> - <term>
4      <term> - <term>
7      <factor> - <term>
9      <id> - <term>
-      <id> - <term>
-      <id> - <term>
7      <id> - <factor>
9      <id> - <num>
-      <id> - <num>
-      <id> - <term>
5      <id> - <term> * <factor>
7      <id> - <factor> * <factor>
9      <id> - <num> * <factor>
-      <id> - <num> * <factor>
-      <id> - <num> * <factor>
9      <id> - <num> * <id>
-      <id> - <num> * <id>


Example

Another possible parse for x - 2 * y:

Prodn  Sentential form
-      <goal>
1      <expr>
2      <expr> + <term>
2      <expr> + <term> + <term>
2      <expr> + <term> + <term> + <term>
2      <expr> + <term> + <term> + <term> + ...
2      ...

If the parser makes the wrong choices, the expansion doesn't terminate. This isn't a good property for a parser to have.

Eliminating left recursion

To remove left recursion, we can transform the grammar. Consider the grammar fragment:

  <foo> ::= <foo> α
          | β

where α and β do not start with <foo>. We can rewrite this as:

  <foo> ::= β <bar>
  <bar> ::= α <bar>
          | ε

where <bar> is a new non-terminal.

This fragment contains no left recursion.
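The direct transformation can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the function name, the list-of-symbols representation of a right-hand side, and the convention of naming the new non-terminal by appending a prime are all assumptions made here.

```python
EPS = "ε"  # marker for the empty string (representation chosen for this sketch)

def eliminate_immediate_left_recursion(nt, alternatives):
    """Rewrite A ::= A α | β as A ::= β A' and A' ::= α A' | ε.

    `alternatives` is a list of right-hand sides, each a list of symbols.
    Returns a dict from non-terminal name to its new list of alternatives.
    """
    # The α parts: alternatives that start with A, with the leading A dropped.
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nt]
    # The β parts: alternatives that do not start with A.
    others = [alt for alt in alternatives if not alt or alt[0] != nt]
    if not recursive:
        return {nt: alternatives}  # nothing to do
    new_nt = nt + "'"  # hypothetical naming scheme for the new non-terminal
    return {
        nt: [[s for s in beta if s != EPS] + [new_nt] for beta in others],
        new_nt: [alpha + [new_nt] for alpha in recursive] + [[EPS]],
    }

expr = [["expr", "+", "term"], ["expr", "-", "term"], ["term"]]
result = eliminate_immediate_left_recursion("expr", expr)
for lhs, alts in result.items():
    print(lhs, "::=", " | ".join(" ".join(a) for a in alts))
```

Applied to the left-recursive `expr` productions of the expression grammar, the sketch produces a right-recursive pair of non-terminals `expr` and `expr'`.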


Left recursion

Top-down parsers cannot handle left recursion in a grammar.

Formally, a grammar is left recursive if ∃ A ∈ NT such that ∃ a derivation A ⇒+ Aα for some string α.

Our simple expression grammar is left recursive.

Example

Our expression grammar contains two cases of left recursion:

  <expr> ::= <expr> + <term>
           | <expr> - <term>
           | <term>
  <term> ::= <term> * <factor>
           | <term> / <factor>
           | <factor>

Applying the transformation gives:

  <expr>  ::= <term> <expr'>
  <expr'> ::= + <term> <expr'>
            | - <term> <expr'>
            | ε
  <term>  ::= <factor> <term'>
  <term'> ::= * <factor> <term'>
            | / <factor> <term'>
            | ε

Eliminating left recursion

A general technique for removing left recursion:

  arrange the non-terminals in some order A1, A2, ..., An
  for i ← 1 to n
    for j ← 1 to i-1
      replace each production of the form Ai ::= Aj γ
        with the productions Ai ::= δ1 γ | δ2 γ | ... | δk γ,
        where Aj ::= δ1 | δ2 | ... | δk are all the current Aj productions
    eliminate any immediate left recursion on Ai using the direct transformation

This assumes that the grammar has no cycles (A ⇒+ A) or ε-productions (A ::= ε).
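The general technique can be tried out on a small grammar with indirect left recursion. The following Python sketch is an illustration under stated assumptions: the grammar `A ::= B a | c`, `B ::= A b | d` is a made-up example, grammars are dicts from non-terminal to lists of symbol lists, and new non-terminals are named by appending a prime. Like the technique itself, it assumes no cycles and no ε-productions in the input grammar.

```python
EPS = "ε"  # marker for the empty string (representation chosen for this sketch)

def remove_direct(nt, alts):
    """Direct transformation: A ::= A α | β  becomes  A ::= β A', A' ::= α A' | ε."""
    rec = [a[1:] for a in alts if a[0] == nt]
    rest = [a for a in alts if a[0] != nt]
    if not rec:
        return {nt: alts}
    new = nt + "'"  # hypothetical name for the new non-terminal
    return {nt: [b + [new] for b in rest],
            new: [a + [new] for a in rec] + [[EPS]]}

def remove_left_recursion(grammar, order):
    """Apply the general technique; `order` lists the non-terminals A1..An."""
    g = {nt: [list(a) for a in alts] for nt, alts in grammar.items()}
    for i, ai in enumerate(order):
        for aj in order[:i]:
            expanded = []
            for alt in g[ai]:
                if alt[0] == aj:
                    # Ai ::= Aj γ  becomes  Ai ::= δ1 γ | ... | δk γ
                    expanded += [delta + alt[1:] for delta in g[aj]]
                else:
                    expanded.append(alt)
            g[ai] = expanded
        # Last step of the outer loop: remove immediate left recursion on Ai.
        g.update(remove_direct(ai, g[ai]))
    return g

# Made-up grammar with indirect left recursion: B => A b => B a b
grammar = {"A": [["B", "a"], ["c"]],
           "B": [["A", "b"], ["d"]]}
result = remove_left_recursion(grammar, ["A", "B"])
for lhs, alts in result.items():
    print(lhs, "::=", " | ".join(" ".join(a) for a in alts))
```

Forward substitution turns `B ::= A b` into `B ::= B a b | c b`, which exposes the recursion as immediate left recursion on `B`; the direct transformation then removes it.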

How much lookahead is needed?

We saw that top-down parsers may need to backtrack when they select the wrong production.

Do we need arbitrary lookahead to parse CFGs?
  in general, yes

Fortunately
  large subclasses of CFGs can be parsed with limited lookahead
  most programming language constructs can be expressed in a grammar that falls in these subclasses

Among the interesting subclasses are LL(1) and LR(1).


How does this algorithm work?
1. impose an arbitrary order on the non-terminals
2. outer loop cycles through NT in order
3. inner loop ensures that a production expanding Ai has no non-terminal Aj, with j < i, as the first symbol of its rhs
4. it forward substitutes those away
5. last step in the outer loop converts any direct recursion on Ai to right recursion using the simple transformation shown earlier
6. new non-terminals are added at the end of the order and only involve right recursion

At the start of the ith outer loop iteration:
  for all k < i, there is no production expanding Ak that has Al as the first symbol of its rhs, for l < k.

At the end of the process (n < i), the grammar has no remaining left recursion.

Recursive Descent Parsing

Properties
  top-down parsing algorithm
  parser built on procedure calls
  procedures may be (mutually) recursive

Algorithm
  write a procedure for each non-terminal
  turn each production into a clause
  insert a call to procedure A() for non-terminal A and to match(x) for terminal x
  start by invoking the procedure for the start symbol S

Example
  A ::= a B c

  A() {
    match(a); B(); match(c);
  }

Recursive Descent Parsing

Example grammar
  1  S ::= a A
  2      | b
  3  A ::= S c

Helpers
  tok;   // current token
  match(x) {
    if (tok != x) error();
    tok = getToken();
  }

Parser
  S() {
    if (tok == a) { match(a); A(); }
    else if (tok == b) match(b);
    else error();
  }
  A() {
    S(); match(c);
  }
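For experimentation, the same parser can be written as a small runnable Python sketch. The class layout, the `ParseError` exception, the one-character tokens, and the `accepts` helper are assumptions made for this illustration; only the procedures `S`, `A`, and `match` correspond to the slide.

```python
class ParseError(Exception):
    """Raised where the slide's pseudocode calls error()."""

class Parser:
    def __init__(self, text):
        # Each character is one token; "eof" marks the end of the input.
        self.tokens = list(text) + ["eof"]
        self.pos = 0
        self.tok = self.tokens[0]  # current token

    def match(self, x):
        if self.tok != x:
            raise ParseError(f"expected {x!r}, found {self.tok!r}")
        self.pos += 1
        self.tok = self.tokens[self.pos]

    def S(self):
        # S ::= a A | b
        if self.tok == "a":
            self.match("a")
            self.A()
        elif self.tok == "b":
            self.match("b")
        else:
            raise ParseError(f"unexpected token {self.tok!r}")

    def A(self):
        # A ::= S c
        self.S()
        self.match("c")

def accepts(text):
    """Return True if the whole input is derived from S."""
    parser = Parser(text)
    try:
        parser.S()
    except ParseError:
        return False
    return parser.tok == "eof"

print(accepts("aabcc"), accepts("abcc"))
```

The grammar generates strings of the form a...a b c...c with equally many a's and c's, so the first call succeeds and the second fails on the extra `c`.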

Left Factoring

What if a grammar does not have the LL(1) property?

Sometimes, we can transform a grammar to have this property.

For each non-terminal A, find the longest prefix α common to two or more of its alternatives.

If α ≠ ε, then replace all of the A productions

  A ::= α β1 | α β2 | ... | α βn | γ

with

  A ::= α L | γ
  L ::= β1 | β2 | ... | βn

where L is a new non-terminal.

Repeat until no two alternatives for a single non-terminal have a common prefix.
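The left-factoring loop for a single non-terminal can be sketched as follows. This is an illustration only: the function names, the symbol-list representation, and the naming of the new non-terminal L as `expr_1`, `expr_2`, and so on are assumptions of this sketch, and it expects the alternatives to be distinct.

```python
from itertools import combinations

EPS = "ε"  # marker for the empty string (representation chosen for this sketch)

def common_prefix(a, b):
    """Longest common prefix of two symbol lists."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]

def left_factor(nt, alts):
    """Left-factor the alternatives of one non-terminal (assumed distinct)."""
    grammar = {nt: [list(a) for a in alts]}
    count = 0
    while True:
        # Longest prefix α shared by two or more alternatives.
        prefixes = [common_prefix(x, y) for x, y in combinations(grammar[nt], 2)]
        alpha = max(prefixes, key=len, default=[])
        if not alpha:
            return grammar  # no two alternatives share a prefix
        count += 1
        new = f"{nt}_{count}"  # hypothetical name for the new non-terminal L
        with_prefix = [x for x in grammar[nt] if x[:len(alpha)] == alpha]
        without = [x for x in grammar[nt] if x[:len(alpha)] != alpha]
        grammar[nt] = [alpha + [new]] + without                          # A ::= α L | γ
        grammar[new] = [x[len(alpha):] or [EPS] for x in with_prefix]    # L ::= β1 | ... | βn

expr = [["term", "+", "expr"], ["term", "-", "expr"], ["term"]]
result = left_factor("expr", expr)
for lhs, alts in result.items():
    print(lhs, "::=", " | ".join(" ".join(a) for a in alts))
```

Because the longest shared prefix is factored first, the alternatives of each new non-terminal cannot share a first symbol, so the sketch only needs to revisit the original non-terminal.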


Predictive Parsing

Basic idea
  For any two productions A ::= α | β, we would like a distinct way of choosing the correct production to expand.

FIRST sets
  For some rhs α ∈ G, define FIRST(α) as the set of tokens that appear as the first symbol in some string derived from α.
  That is, x ∈ FIRST(α) iff α ⇒* x γ for some γ.

LL(1) property
  Whenever two productions A ::= α and A ::= β both appear in the grammar, we would like

    FIRST(α) ∩ FIRST(β) = ∅

  This would allow the parser to make a correct choice with a lookahead of only one symbol!

Pursuing this idea leads to predictive LL(1) parsers.

Example

Consider a right-recursive version of the expression grammar:

  1  <goal>   ::= <expr>
  2  <expr>   ::= <term> + <expr>
  3             | <term> - <expr>
  4             | <term>
  5  <term>   ::= <factor> * <term>
  6             | <factor> / <term>
  7             | <factor>
  8  <factor> ::= number
  9             | id

To choose between productions 2, 3, and 4, the parser must see past the number or id and look at the +, -, *, or /.

  FIRST(2) ∩ FIRST(3) ∩ FIRST(4) ≠ ∅

This grammar fails the test.

Note: This grammar is right-associative.


Example

Input: x - 2 * y. A "-" in the Prodn column marks a step that matches a terminal instead of expanding a production.

Prodn  Sentential form
-      <goal>
1      <expr>
2      <term> <expr'>
6      <factor> <term'> <expr'>
11     <id> <term'> <expr'>
-      <id> <term'> <expr'>
9      <id> <expr'>
4      <id> - <expr>
-      <id> - <expr>
2      <id> - <term> <expr'>
6      <id> - <factor> <term'> <expr'>
10     <id> - <num> <term'> <expr'>
-      <id> - <num> <term'> <expr'>
7      <id> - <num> * <term> <expr'>
-      <id> - <num> * <term> <expr'>
6      <id> - <num> * <factor> <term'> <expr'>
11     <id> - <num> * <id> <term'> <expr'>
-      <id> - <num> * <id> <term'> <expr'>
9      <id> - <num> * <id> <expr'>
5      <id> - <num> * <id>

The next symbol determined each choice correctly.


Now, selection requires only a single token lookahead.

Note: This grammar is still right-associative.

Generality

Question: By eliminating left recursion and left factoring, can we transform an arbitrary context-free grammar to a form where it can be predictively parsed with a single token lookahead?

Answer: Given a context-free grammar that doesn't meet our conditions, it is undecidable whether an equivalent grammar exists that does meet our conditions.

Many context-free languages do not have such a grammar:

  {a^n 0 b^n | n ≥ 1} ∪ {a^n 1 b^(2n) | n ≥ 1}


The FIRST set

For a string of grammar symbols α, define FIRST(α) as
  the set of terminal symbols that begin strings derived from α
  if α ⇒* ε, then ε ∈ FIRST(α)

FIRST(α) contains the set of tokens valid in the first position of α.

To build FIRST(X):
1. if X is a terminal, FIRST(X) is {X}
2. if X ::= ε, then ε ∈ FIRST(X)
3. if X ::= Y1 Y2 ... Yk, then put FIRST(Y1) in FIRST(X)
4. if X is a non-terminal and X ::= Y1 Y2 ... Yk, then a ∈ FIRST(X) if a ∈ FIRST(Yi) and ε ∈ FIRST(Yj) for all 1 ≤ j < i
   (If ε ∉ FIRST(Y1), then FIRST(Yi) is irrelevant, for 1 < i)

The FIRST construction

symbol   rule 1   rule 2   rule 3   FIRST
goal     -        -        num,id   {num,id}
expr     -        -        num,id   {num,id}
expr'    -        ε        +,-      {ε,+,-}
term     -        -        num,id   {num,id}
term'    -        ε        *,/      {ε,*,/}
factor   -        -        num,id   {num,id}
num      num      -        -        {num}
id       id       -        -        {id}
+        +        -        -        {+}
-        -        -        -        {-}
*        *        -        -        {*}
/        /        -        -        {/}
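The FIRST rules can be run as a fixed-point computation. The sketch below is one possible rendering, not the lecture's code: the dict-based grammar encoding, the function names, and the iterate-until-nothing-changes loop are assumptions of this illustration. The grammar is the lecture's example grammar with non-terminals goal, expr, expr', term, term', and factor.

```python
EPS = "ε"  # marker for the empty string (representation chosen for this sketch)

GRAMMAR = {
    "goal":   [["expr"]],
    "expr":   [["term", "expr'"]],
    "expr'":  [["+", "expr"], ["-", "expr"], [EPS]],
    "term":   [["factor", "term'"]],
    "term'":  [["*", "term"], ["/", "term"], [EPS]],
    "factor": [["num"], ["id"]],
}

def first_of_string(symbols, first):
    """FIRST of a symbol sequence Y1 Y2 ... Yk (rules 3 and 4)."""
    result = set()
    for sym in symbols:
        if sym == EPS:
            continue
        f = first.get(sym, {sym})   # rule 1: FIRST of a terminal x is {x}
        result |= f - {EPS}
        if EPS not in f:            # Yi cannot vanish, later symbols are irrelevant
            return result
    result.add(EPS)                 # every symbol can derive ε (covers rule 2)
    return result

def compute_first(grammar):
    """Repeat the rules until no FIRST set grows any more."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alts in grammar.items():
            for alt in alts:
                new = first_of_string(alt, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

FIRST = compute_first(GRAMMAR)
for nt, tokens in FIRST.items():
    print(nt, sorted(tokens))
```

The computed sets for the six non-terminals can be compared against the FIRST column of the construction table.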


Our example grammar

   1  goal   ::= expr
   2  expr   ::= term expr'
   3  expr'  ::= + expr
   4           | - expr
   5           | ε
   6  term   ::= factor term'
   7  term'  ::= * term
   8           | / term
   9           | ε
  10  factor ::= num
  11           | id

The FOLLOW set

For a non-terminal A, define FOLLOW(A) as
  the set of terminals that can appear immediately to the right of A in some sentential form

Thus, a non-terminal's FOLLOW set specifies the tokens that can legally appear after it.

A terminal symbol has no FOLLOW set.

To build FOLLOW(X):
1. place eof in FOLLOW(goal)
2. if A ::= α B β, then put {FIRST(β) − ε} in FOLLOW(B)
3. if A ::= α B, then put FOLLOW(A) in FOLLOW(B)
4. if A ::= α B β and ε ∈ FIRST(β), then put FOLLOW(A) in FOLLOW(B)
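The four FOLLOW rules can likewise be iterated to a fixed point. In this sketch the FIRST sets of the non-terminals are entered by hand from the lecture's FIRST table rather than recomputed; the grammar encoding, the function names, and the iteration strategy are assumptions of this illustration.

```python
EPS, EOF = "ε", "eof"

GRAMMAR = {
    "goal":   [["expr"]],
    "expr":   [["term", "expr'"]],
    "expr'":  [["+", "expr"], ["-", "expr"], [EPS]],
    "term":   [["factor", "term'"]],
    "term'":  [["*", "term"], ["/", "term"], [EPS]],
    "factor": [["num"], ["id"]],
}

# FIRST sets of the non-terminals, copied from the FIRST construction table.
FIRST = {
    "goal": {"num", "id"}, "expr": {"num", "id"}, "expr'": {EPS, "+", "-"},
    "term": {"num", "id"}, "term'": {EPS, "*", "/"}, "factor": {"num", "id"},
}

def first_of_string(symbols):
    """FIRST of a symbol sequence; the empty sequence yields {ε}."""
    result = set()
    for sym in symbols:
        if sym == EPS:
            continue
        f = FIRST.get(sym, {sym})   # a terminal x has FIRST(x) = {x}
        result |= f - {EPS}
        if EPS not in f:
            return result
    return result | {EPS}

def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(EOF)                        # rule 1
    changed = True
    while changed:
        changed = False
        for a, alts in grammar.items():
            for alt in alts:
                for i, b in enumerate(alt):
                    if b not in grammar:          # only non-terminals have FOLLOW sets
                        continue
                    beta = first_of_string(alt[i + 1:])
                    new = beta - {EPS}            # rule 2
                    if EPS in beta:               # rules 3 and 4: β is empty or can vanish
                        new |= follow[a]
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow

FOLLOW = compute_follow(GRAMMAR, "goal")
for nt, tokens in FOLLOW.items():
    print(nt, sorted(tokens))
```

Rule 3 falls out of the same code path as rule 4, because the FIRST set of an empty β is treated as {ε}.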


The FOLLOW construction

symbol   rule 1   rule 2   rule 3    rule 4    FOLLOW
goal     eof      -        -         -         {eof}
expr     -        -        eof       -         {eof}
expr'    -        -        eof       -         {eof}
term     -        +,-      -         eof       {eof,+,-}
term'    -        -        eof,+,-   -         {eof,+,-}
factor   -        *,/      -         eof,+,-   {eof,+,-,*,/}

LL(1) grammars

Features
  input parsed from left to right
  leftmost derivation
  one token lookahead

Definition
  A grammar G is LL(1) if and only if, for all non-terminals A, each distinct pair of productions A ::= β and A ::= γ satisfies the condition FIRST(β) ∩ FIRST(γ) = ∅.

Taking ε-productions into account, a grammar G is LL(1) if and only if, for each set of productions A ::= α1 | α2 | ... | αn:
1. FIRST(α1), FIRST(α2), ..., FIRST(αn) are all pairwise disjoint
2. if αi ⇒* ε, then FIRST(αj) ∩ FOLLOW(A) = ∅, for all 1 ≤ j ≤ n, i ≠ j.

If G is ε-free, condition 1 is sufficient.

LL(1) grammars

Provable facts about LL(1) grammars:
  no left-recursive grammar is LL(1)
  no ambiguous grammar is LL(1)
  LL(1) parsers operate in linear time
  an ε-free grammar where each alternative expansion for A begins with a distinct terminal is a simple LL(1) grammar

Not all grammars are LL(1)
  S ::= aS | a is not LL(1): FIRST(aS) = FIRST(a) = {a}
  S ::= aS', S' ::= aS' | ε accepts the same language and is LL(1)

Using FIRST and FOLLOW

To build a predictive recursive-descent parser:

For each production A ::= α and lookahead token
  expand A using the production if token ∈ FIRST(α)
  if ε ∈ FIRST(α), expand A using the production if token ∈ FOLLOW(A)
  all other tokens return error

If multiple choices, the grammar is not LL(1) (predictive).

In the table below, g, e, e', t, t', and f abbreviate goal, expr, expr', term, term', and factor; "-" marks an error entry.

         id      num     +       -       *       /       eof
goal     g→e     g→e     -       -       -       -       -
expr     e→te'   e→te'   -       -       -       -       -
expr'    -       -       e'→+e   e'→-e   -       -       e'→ε
term     t→ft'   t→ft'   -       -       -       -       -
term'    -       -       t'→ε    t'→ε    t'→*t   t'→/t   t'→ε
factor   f→id    f→num   -       -       -       -       -
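The rule stated under "Using FIRST and FOLLOW" can be turned into a table builder plus a stack-driven parser. The sketch below is an illustration under stated assumptions: the FIRST and FOLLOW sets are entered by hand from the lecture's tables, the grammar is the lecture's example grammar, and the function names, the dict-based table, and the explicit stack (used here in place of recursive procedures) are choices made for this example.

```python
EPS, EOF = "ε", "eof"

GRAMMAR = {
    "goal":   [["expr"]],
    "expr":   [["term", "expr'"]],
    "expr'":  [["+", "expr"], ["-", "expr"], [EPS]],
    "term":   [["factor", "term'"]],
    "term'":  [["*", "term"], ["/", "term"], [EPS]],
    "factor": [["num"], ["id"]],
}
# Sets copied from the FIRST and FOLLOW construction tables.
FIRST = {
    "goal": {"num", "id"}, "expr": {"num", "id"}, "expr'": {EPS, "+", "-"},
    "term": {"num", "id"}, "term'": {EPS, "*", "/"}, "factor": {"num", "id"},
}
FOLLOW = {
    "goal": {EOF}, "expr": {EOF}, "expr'": {EOF},
    "term": {EOF, "+", "-"}, "term'": {EOF, "+", "-"},
    "factor": {EOF, "+", "-", "*", "/"},
}

def first_of_string(symbols):
    result = set()
    for sym in symbols:
        if sym == EPS:
            continue
        f = FIRST.get(sym, {sym})   # a terminal x has FIRST(x) = {x}
        result |= f - {EPS}
        if EPS not in f:
            return result
    return result | {EPS}

def build_table(grammar):
    """Map (non-terminal, lookahead token) to the production to expand."""
    table = {}
    for a, alts in grammar.items():
        for alt in alts:
            f = first_of_string(alt)
            lookaheads = f - {EPS}
            if EPS in f:
                lookaheads |= FOLLOW[a]
            for tok in lookaheads:
                if (a, tok) in table:   # multiple choices: not LL(1)
                    raise ValueError(f"not LL(1): conflict at ({a}, {tok})")
                table[a, tok] = alt
    return table

def parse(tokens, table, start="goal"):
    """Table-driven predictive parse; returns True if the input is accepted."""
    tokens = list(tokens) + [EOF]
    stack, pos = [EOF, start], 0
    while stack:
        top, tok = stack.pop(), tokens[pos]
        if top in GRAMMAR:
            alt = table.get((top, tok))
            if alt is None:
                return False            # error entry
            stack.extend(s for s in reversed(alt) if s != EPS)
        elif top == tok:
            pos += 1                    # matched a terminal
        else:
            return False
    return pos == len(tokens)

TABLE = build_table(GRAMMAR)
print(parse(["id", "-", "num", "*", "id"], TABLE))   # token stream for x - 2 * y
```

Raising on a duplicate table entry mirrors the remark that multiple choices mean the grammar is not LL(1).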


LL grammars

LL(1) grammars
  may need to rewrite grammar (left recursion, left factoring)
  resulting grammar larger, less maintainable

LL(k) grammars
  k-token lookahead, more powerful than LL(1) grammars
  example: S ::= ac | abc is LL(2)

Not all grammars are LL(k)
  example: S ::= a^i b^j where i ≥ j
  equivalent to dangling else problem
  problem: must choose production after k tokens of lookahead

Bottom-up parsers avoid this problem.
