Parsing

Recursive-descent parsing is a top-down parsing technique that uses procedures associated with each non-terminal in a context-free grammar. Each procedure attempts to match the right-hand side of productions for its non-terminal by calling other procedures or comparing terminals to input. Lookahead is used to determine which production to choose when alternatives exist. First and follow sets are computed to determine the lookahead symbols for each non-terminal. A grammar is LL(1) if lookahead of one symbol is sufficient to determine the production to select.

Uploaded by

Washington Brown

Parsing

A parser is an algorithm that determines whether a given input string is in a language and, as a side-effect,
usually produces a parse tree for the input. There is a procedure for generating a parser from a given context-free
grammar.

Recursive-Descent Parsing
Recursive-descent parsing is one of the simplest parsing techniques that is used in practice. Recursive-descent
parsers are top-down parsers, since they construct the parse tree from the root down (rather than bottom up).

The basic idea of recursive-descent parsing is to associate each non-terminal with a procedure. The goal of each
such procedure is to read a sequence of input characters that can be generated by the corresponding non-
terminal, and return a pointer to the root of the parse tree for the non-terminal. The structure of the procedure is
dictated by the productions for the corresponding non-terminal.

The procedure attempts to "match" the right hand side of some production for a non-terminal.

To match a terminal symbol, the procedure compares the terminal symbol to the input; if they agree, then
the procedure is successful, and it consumes the terminal symbol in the input (that is, moves the input
cursor over one symbol).

To match a non-terminal symbol, the procedure simply calls the corresponding procedure for that non-
terminal symbol (which may be a recursive call, hence the name of the technique).

Recursive-Descent Parser for Expressions

Consider the following grammar for expressions (we'll look at the reasons for the peculiar structure of this
grammar later):

1. <E> --> <T> <E*>
2. <E*> --> + <T> <E*> | - <T> <E*> | epsilon
3. <T> --> <F> <T*>
4. <T*> --> * <F> <T*> | / <F> <T*> | epsilon
5. <F> --> ( <E> ) | number

We create procedures for each of the non-terminals. According to production 1, the procedure to match
expressions (<E>) must match a term (by calling the procedure for <T>), and then the rest of the expression (by
calling the procedure for <E*>).

procedure E;
    T; Estar;

Some procedures, such as the one for <E*>, must examine the input to determine which production to choose.

procedure Estar;
    if NextInputChar = "+" or NextInputChar = "-" then
        read(NextInputChar);
        T; Estar;

We will append a special marker symbol (ENDM) to the input string; this marker symbol notifies the parser that
the entire input has been seen. The end marker must be recognized after a complete expression has been seen.
That check cannot live in the procedure for <E> itself, because the procedure for <F> calls E recursively for a
parenthesized subexpression, which ends at ")" rather than at ENDM. We therefore add a procedure S for a new
start symbol; it calls E and then checks for the end marker.

Top-Down Parser for Expressions

procedure S;
    E;
    if NextInputChar = ENDM then /* done */
    else print("syntax error");

procedure E;
    T; Estar;

procedure Estar;
    if NextInputChar = "+" or NextInputChar = "-" then
        read(NextInputChar);
        T; Estar;

procedure T;
    F; Tstar;

procedure Tstar;
    if NextInputChar = "*" or NextInputChar = "/" then
        read(NextInputChar);
        F; Tstar;

procedure F;
    if NextInputChar = "(" then
        read(NextInputChar);
        E;
        if NextInputChar = ")" then
            read(NextInputChar)
        else print("syntax error")
    else if NextInputChar = number then
        read(NextInputChar)
    else print("syntax error");
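
As a rough illustration, the procedures above might be transcribed into Python as follows. This is only a sketch under stated assumptions: the names Recognizer, ParseError, accepts, peek and advance are hypothetical, the tokenizer is a simple regular expression, and the end-marker check sits in a separate method S so that F can call E for parenthesized subexpressions.

```python
import re

ENDM = "ENDM"  # assumed end marker; the tokenizer below can never produce it from input


class ParseError(Exception):
    """Raised where the pseudocode prints "syntax error"."""


class Recognizer:
    def __init__(self, text):
        # Digit runs and single non-blank characters become tokens; ENDM is appended.
        self.tokens = re.findall(r"\d+|\S", text) + [ENDM]
        self.pos = 0

    def peek(self):          # plays the role of NextInputChar
        return self.tokens[self.pos]

    def advance(self):       # plays the role of read(NextInputChar)
        self.pos += 1

    def S(self):             # <S> --> <E> ENDM
        self.E()
        if self.peek() != ENDM:
            raise ParseError("unexpected " + self.peek())

    def E(self):             # <E> --> <T> <E*>
        self.T()
        self.Estar()

    def Estar(self):         # <E*> --> + <T> <E*> | - <T> <E*> | epsilon
        if self.peek() in ("+", "-"):
            self.advance()
            self.T()
            self.Estar()
        # otherwise return without consuming anything: the epsilon production

    def T(self):             # <T> --> <F> <T*>
        self.F()
        self.Tstar()

    def Tstar(self):         # <T*> --> * <F> <T*> | / <F> <T*> | epsilon
        if self.peek() in ("*", "/"):
            self.advance()
            self.F()
            self.Tstar()

    def F(self):             # <F> --> ( <E> ) | number
        if self.peek() == "(":
            self.advance()
            self.E()
            if self.peek() != ")":
                raise ParseError("expected )")
            self.advance()
        elif self.peek().isdigit():
            self.advance()
        else:
            raise ParseError("unexpected " + self.peek())


def accepts(text):
    """Return True if text is in the language of the expression grammar."""
    try:
        Recognizer(text).S()
        return True
    except ParseError:
        return False
```

Raising an exception instead of printing lets the parse stop at the first error, which the printing pseudocode does not do on its own.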

Tracing the Parser

As an example, consider the following input: 1 + (2 * 3) / 4. Parsing starts by calling E; once the outermost
expression has been parsed, the end marker is matched. Indentation shows the depth of the procedure calls.

NextInputChar = "1"
Call E
    Call T
        Call F /* Match 1 with F */
            NextInputChar = "+"
        Call Tstar /* Match epsilon */
    Call Estar /* Match + */
        NextInputChar = "("
        Call T
            Call F /* Match (, looking for E ) */
                NextInputChar = "2"
                Call E
                    Call T
                        Call F /* Match 2 with F */
                            NextInputChar = "*"
                        Call Tstar /* Match * */
                            NextInputChar = "3"
                            Call F /* Match 3 with F */
                                NextInputChar = ")"
                            Call Tstar /* Match epsilon */
                    Call Estar /* Match epsilon */
                NextInputChar = "/" /* Match ")" */
            Call Tstar /* Match "/" */
                NextInputChar = "4"
                Call F /* Match 4 with F */
                    NextInputChar = ENDM
                Call Tstar /* Match epsilon */
        Call Estar /* Match epsilon */
/* Match ENDM */

Observations about Recursive-Descent Parser

In procedures Estar and Tstar, we match one of the productions with an arithmetic operator if we see such
an operator in the input; otherwise we simply return. A procedure that returns without matching any
symbols is, in effect, choosing the epsilon production.

In our expression parser, we only choose the epsilon production if the NextInputChar doesn't match the
first terminal on the right hand side of any of the other productions for that non-terminal.

We never attempt to read beyond the end marker (ENDM), which is matched only at the end of an
expression. In all other circumstances, the presence of the end marker signals a syntax error.

As written, our recursive-descent parser only determines whether or not the input string is in the language
of the grammar; it does not give the structure of the string according to the grammar. We could easily
build a parse tree incrementally during parsing.
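
One way to build the tree incrementally, sketched here in Python, is to have each procedure return the subtree it matched. The nested-tuple representation, the function name parse, and the helpers peek and take are assumptions made for illustration; they are not part of the original notes.

```python
import re

ENDM = "ENDM"  # assumed end marker


def parse(text):
    """Return a parse tree for text as nested tuples, or raise ValueError."""
    tokens = re.findall(r"\d+|\S", text) + [ENDM]
    pos = 0

    def peek():
        return tokens[pos]

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def E():
        return ("E", T(), Estar())

    def Estar():
        if peek() in ("+", "-"):
            op = take()
            return ("E*", op, T(), Estar())
        return ("E*", "epsilon")

    def T():
        return ("T", F(), Tstar())

    def Tstar():
        if peek() in ("*", "/"):
            op = take()
            return ("T*", op, F(), Tstar())
        return ("T*", "epsilon")

    def F():
        if peek() == "(":
            take()
            inner = E()
            if peek() != ")":
                raise ValueError("expected )")
            take()
            return ("F", "(", inner, ")")
        if peek().isdigit():
            return ("F", take())
        raise ValueError("unexpected " + peek())

    tree = E()
    if peek() != ENDM:
        raise ValueError("unexpected " + peek())
    return tree
```

Each tuple starts with the non-terminal's name, followed by the subtrees or terminals matched for the chosen right hand side, so the tree mirrors the sequence of procedure calls.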

Lookahead in Recursive-Descent Parsing

In order to implement a recursive-descent parser for a grammar, for each nonterminal in the grammar, it must be
possible to determine which production to apply for that non-terminal by looking only at the current input
symbol. (We want to avoid having the compiler or other text processing program scan ahead in the input to
determine what action to take next.)

The lookahead symbol is simply the next terminal that we will try to match in the input. We use a single
lookahead symbol to decide what production to match.

Consider a production: A --> X1...Xm. We need to know the set of possible lookahead symbols that indicate this
production is to be chosen.

This set is clearly those terminal symbols that can be produced by the symbols X1...Xm (which may be
either terminals or non-terminals).

Since a lookahead is only a single terminal symbol, we want the first (i.e., leftmost) symbol that could be
produced by X1...Xm.

We denote the set of symbols that could be produced first by X1...Xm as First(X1...Xm).

First Sets
To distinguish two productions with the same non-terminal on the left hand side, we examine the First sets for
their corresponding right hand sides. Given the production A --> X1...Xm we must determine First(X1...Xm).

We first consider the leftmost symbol, X1.

If this is a terminal symbol, then First(X1...Xm) = {X1}.

If X1 is a non-terminal, then we compute the First sets for each right hand side corresponding to X1.

In our expression grammar above:

First(<E>) = First(<T> <E*>)
First(<T> <E*>) = First(<T>)
First(<T>) = First(<F> <T*>)
First(<F> <T*>) = First(<F>) = {(,number}

If X1 can generate epsilon, then X1 can (in effect) be erased, and First(X1...Xm) depends on X2.

If X2 is a terminal, it is included in First(X1...Xm).

If X2 is a non-terminal, we compute the First sets for each of its corresponding right hand sides.

Similarly, if both X1 and X2 can produce epsilon, we consider X3, then X4, etc.
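
The rules above can be sketched as a fixed-point computation in Python. Two assumptions to note: the dictionary encoding of the grammar is hypothetical, and the sketch follows the common textbook convention of recording "epsilon" as a member of First when every symbol can be erased, rather than substituting the Follow set at that point.

```python
EPS = "epsilon"

# Assumed encoding: non-terminal -> list of right hand sides; [] is the epsilon production.
GRAMMAR = {
    "E":  [["T", "E*"]],
    "E*": [["+", "T", "E*"], ["-", "T", "E*"], []],
    "T":  [["F", "T*"]],
    "T*": [["*", "F", "T*"], ["/", "F", "T*"], []],
    "F":  [["(", "E", ")"], ["number"]],
}


def first_of(symbols, first):
    """First set of the sequence X1...Xm, using the First sets known so far."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})   # a terminal's First set is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result                   # this symbol cannot be erased, so stop here
    result.add(EPS)                         # every symbol can be erased
    return result


def compute_first(grammar):
    """Repeat the rules until no First set grows any further."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            for rhs in productions:
                new = first_of(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first
```

Iterating to a fixed point avoids having to order the non-terminals by dependency, at the cost of a few extra passes over the grammar.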

Follow Sets

Suppose we are attempting to compute the lookahead symbols that suggest the production A --> X1...Xm. What
if each of the Xi can produce epsilon?

If the entire right hand side of a production can produce epsilon, then the lookahead for A is determined by those
terminal symbols that can follow A in a parse. We denote the set of terminal symbols that can follow a non-
terminal A in a parse as Follow(A).

We inspect the grammar for all occurrences of the non-terminal A. In each production, A is either:

followed by a terminal symbol x, so x is in Follow(A).

followed by a non-terminal symbol B, so Follow(A) includes First(B). If B can produce epsilon, Follow(A) also
includes whatever can follow B in that production.

at the end of a production for some non-terminal S (as in S --> Y1...YmA), in which case Follow(A)
includes Follow(S).
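
The three cases can be sketched in Python as another fixed-point loop. The grammar encoding and function names are assumptions for illustration, and the sketch uses the textbook convention in which "epsilon" in a First set marks a sequence that can be erased; the grammar is the expression grammar with an added start production <S> --> <E> ENDM.

```python
EPS, ENDM = "epsilon", "ENDM"

GRAMMAR = {
    "S":  [["E", ENDM]],
    "E":  [["T", "E*"]],
    "E*": [["+", "T", "E*"], ["-", "T", "E*"], []],
    "T":  [["F", "T*"]],
    "T*": [["*", "F", "T*"], ["/", "F", "T*"], []],
    "F":  [["(", "E", ")"], ["number"]],
}


def first_of(symbols, first):
    """First set of a sequence; contains EPS if the whole sequence can be erased."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})   # a terminal's First set is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            for rhs in productions:
                new = first_of(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(grammar, first):
    follow = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for lhs, productions in grammar.items():
            for i_rhs in productions:
                for i, sym in enumerate(i_rhs):
                    if sym not in grammar:
                        continue                 # Follow is defined for non-terminals only
                    rest = first_of(i_rhs[i + 1:], first)
                    new = rest - {EPS}           # what can start the symbols after sym
                    if EPS in rest:              # sym can end the production:
                        new |= follow[lhs]       # inherit Follow of the left hand side
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow
```

Because First of the remaining symbols is computed for the whole tail of the production, the case where the next non-terminal can itself produce epsilon is handled without a separate rule.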

First and Follow Sets for Expression Grammar

Computing the First and Follow sets for our expression grammar (as augmented with a new start symbol that
includes the ENDM in the production):

1. <S> --> <E> ENDM

Here the First set of a non-terminal that can produce epsilon is extended with its Follow set, so that it
lists every lookahead symbol for which the non-terminal may be expanded.

First(<E>) = First(<T> <E*>) = First(<T>)

First(<E*>) = {+} U {-} U Follow(<E*>)
Follow(<E*>) = Follow(<E>) = {),ENDM}
First(<E*>) = {+,-,),ENDM}

First(<T>) = First(<F> <T*>) = First(<F>)

First(<T*>) = {*} U {/} U Follow(<T*>)
Follow(<T*>) = Follow(<T>) = First(<E*>)
First(<T*>) = {*,/,+,-,),ENDM}

First(<F>) = {(,number}

LL(1) Grammars for Recursive-Descent Parsing

The set of lookahead symbols that will cause the selection (i.e., prediction) of the production A --> X1...Xm is

Predict(A --> X1...Xm) = First(X1...Xm) U (if X1...Xm can produce epsilon then Follow(A) else the empty set)

That is, any symbol that can be the first symbol produced by the right hand side of a production will predict that
production. Further, if the entire right hand side can produce epsilon, then symbols that can immediately follow
the left hand side of a production will also predict that production.

If, for two productions

1. A --> X1...Xm
2. A --> Y1...Yn

we have some symbol s for which

1. s is in Predict(A --> X1...Xm)
2. s is in Predict(A --> Y1...Yn)

then we cannot in general know which production to select by looking at a single input symbol.

Recursive-descent parsing can only parse those CFG's that have disjoint predict sets for productions that share a
common left hand side. CFG's that obey this restriction are called LL(1).
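
A possible way to check this restriction mechanically is sketched below in Python. The First and Follow sets are entered by hand from the expression grammar (with "epsilon" marking a right hand side that can be erased), and the names predict and ll1_conflicts, as well as the dictionary encoding, are assumptions for illustration.

```python
from itertools import combinations

EPS = "epsilon"

LL1_GRAMMAR = {
    "S":  [["E", "ENDM"]],
    "E":  [["T", "E*"]],
    "E*": [["+", "T", "E*"], ["-", "T", "E*"], []],
    "T":  [["F", "T*"]],
    "T*": [["*", "F", "T*"], ["/", "F", "T*"], []],
    "F":  [["(", "E", ")"], ["number"]],
}
START = {"(", "number"}
FIRST = {"S": START, "E": START, "T": START, "F": START,
         "E*": {"+", "-", EPS}, "T*": {"*", "/", EPS}}
FOLLOW = {"S": set(), "E": {")", "ENDM"}, "E*": {")", "ENDM"},
          "T": {"+", "-", ")", "ENDM"}, "T*": {"+", "-", ")", "ENDM"},
          "F": {"*", "/", "+", "-", ")", "ENDM"}}

# The right-recursive grammar, whose productions share common prefixes.
RIGHT_RECURSIVE = {
    "E": [["T", "+", "E"], ["T", "-", "E"], ["T"]],
    "T": [["F", "*", "T"], ["F", "/", "T"], ["F"]],
    "F": [["(", "E", ")"], ["number"]],
}
RR_FIRST = {"E": START, "T": START, "F": START}


def first_of(symbols, first):
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})   # a terminal's First set is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def predict(lhs, rhs, first, follow):
    """Predict(lhs --> rhs): First(rhs), plus Follow(lhs) if rhs can produce epsilon."""
    f = first_of(rhs, first)
    result = f - {EPS}
    if EPS in f:
        result |= follow[lhs]
    return result


def ll1_conflicts(grammar, first, follow):
    """List (lhs, rhs1, rhs2, shared symbols) for every overlapping pair of productions."""
    conflicts = []
    for lhs, productions in grammar.items():
        for p, q in combinations(productions, 2):
            overlap = predict(lhs, p, first, follow) & predict(lhs, q, first, follow)
            if overlap:
                conflicts.append((lhs, p, q, overlap))
    return conflicts
```

An empty result from ll1_conflicts means the predict sets are pairwise disjoint, which is the LL(1) condition stated above; for the right-recursive grammar every pair of <E> productions and every pair of <T> productions is reported.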

From experience we know that it is usually possible to create an LL(1) CFG for a programming language.
However, not all CFG's are LL(1) and a CFG that is not LL(1) may be parsable using some other (usually more
complex) parsing technique.

Creating LL(1) Grammars

Recursive-descent parsing can only parse grammars that have disjoint predict sets for productions that share a
common left hand side.

Two common properties of grammars that violate this condition are:

Left recursion: any grammar containing productions with left recursion, that is, productions of the form A
--> A X1...Xm, cannot be LL(1). The problem is that any symbol that predicts this production the first
time will, of necessity, continue to predict this production forever (and never be matched).

Common prefix: any grammar containing two productions for the same non-terminal that share a common
prefix on the right hand side cannot be LL(1). The problem is that any symbol that predicts the first
production must also predict the second; since the predict sets for the two productions are not disjoint, the
grammar is not LL(1).

Creating an LL(1) Grammar

Consider the following grammar for expressions:

1. <E> --> <E> + <T>
2. <E> --> <E> - <T>
3. <E> --> <T>
4. <T> --> <T> * <F>
5. <T> --> <T> / <F>
6. <T> --> <F>
7. <F> --> ( <E> )
8. <F> --> number

This grammar has left recursion, and therefore cannot be LL(1). We can replace the use of left recursion with
right recursion as follows:

1. <E> --> <T> + <E>
2. <E> --> <T> - <E>
3. <E> --> <T>
4. <T> --> <F> * <T>
5. <T> --> <F> / <T>
6. <T> --> <F>
7. <F> --> ( <E> )
8. <F> --> number

The resulting grammar is still not LL(1); productions 1-3 share a common prefix, as do productions 4-6. We can
eliminate the common prefix by deferring the decision as to which production to pick until after seeing the
common prefix. This technique is called factoring the common prefix.

1. <E> --> <T> <E*>
2. <E*> --> + <T> <E*> | - <T> <E*> | epsilon
3. <T> --> <F> <T*>
4. <T*> --> * <F> <T*> | / <F> <T*> | epsilon
5. <F> --> ( <E> ) | number

Table-Driven Parsing

In recursive-descent parsing, the decision as to which production to choose for a particular non-terminal is hard-
coded into the procedure for the non-terminal. The procedure uses the Predict sets (computed from the First and
Follow sets) for the grammar to decide which production to choose based on the lookahead symbol.

The problem with recursive-descent parsing is that it is inflexible; changes in the grammar can cause significant
(and in some cases non-obvious) changes to the parser.

Since recursive-descent parsing uses an implicit stack of procedure calls, it is possible to replace the parsing
procedures and implicit stack with an explicit stack and a single parsing procedure that manipulates the stack.

In this scheme, we encode the actions the parsing procedure should take in a table. This table can be generated
automatically (with the grammar as input), which is why this approach adapts more easily to changes in the
grammar.

A Table-Driven Parser
The parse table encodes the choice of production as a function of the current non-terminal of interest and the
lookahead symbol.

T: Non-terminals x Terminals -> Productions U {Error}

The entry T[A,x] gives the production number to choose when A is the non-terminal of interest and x is the
current input symbol:

T[A,x] = A --> X1...Xm   if x is in Predict(A --> X1...Xm)
T[A,x] = Error           otherwise

The driver procedure is very simple. It stacks symbols that are to be matched or expanded. Terminal symbols on
the stack must match an input symbol; non-terminal symbols are expanded via the Predict function (which is
encoded in the parse table).

Parse Table for Expressions

Here is an LL(1) expression grammar, augmented to include the end marker:

1. <S> --> <E> ENDM
2. <E> --> <T> <E*>
3. <E*> --> + <T> <E*>
4. <E*> --> - <T> <E*>
5. <E*> --> epsilon
6. <T> --> <F> <T*>
7. <T*> --> * <F> <T*>
8. <T*> --> / <F> <T*>
9. <T*> --> epsilon
10. <F> --> ( <E> )
11. <F> --> number

The table for this expression grammar is (where a blank entry corresponds to an error):

     |  (  |  )  |  +  |  -  |  *  |  /  | number | ENDM
-----+-----+-----+-----+-----+-----+-----+--------+------
S    |  1  |     |     |     |     |     |   1    |
E    |  2  |     |     |     |     |     |   2    |
E*   |     |  5  |  3  |  4  |     |     |        |  5
T    |  6  |     |     |     |     |     |   6    |
T*   |     |  9  |  9  |  9  |  7  |  8  |        |  9
F    | 10  |     |     |     |     |     |   11   |

This table is constructed from the Predict sets described earlier.
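
As a sketch of that construction in Python: each production's number is entered into the row of its left hand side, under every terminal in its Predict set. The Predict sets below are written out by hand from the First and Follow sets of the expression grammar; the dictionary layout and the name build_table are assumptions for illustration.

```python
# Production number -> (left hand side, right hand side); [] is the epsilon production.
PRODUCTIONS = {
    1: ("S", ["E", "ENDM"]),
    2: ("E", ["T", "E*"]),
    3: ("E*", ["+", "T", "E*"]),
    4: ("E*", ["-", "T", "E*"]),
    5: ("E*", []),
    6: ("T", ["F", "T*"]),
    7: ("T*", ["*", "F", "T*"]),
    8: ("T*", ["/", "F", "T*"]),
    9: ("T*", []),
    10: ("F", ["(", "E", ")"]),
    11: ("F", ["number"]),
}

# Predict set of each production, entered by hand for this grammar.
PREDICT = {
    1: {"(", "number"},
    2: {"(", "number"},
    3: {"+"},
    4: {"-"},
    5: {")", "ENDM"},
    6: {"(", "number"},
    7: {"*"},
    8: {"/"},
    9: {"+", "-", ")", "ENDM"},
    10: {"("},
    11: {"number"},
}


def build_table(productions, predict):
    """Map (non-terminal, terminal) to a production number; missing keys mean Error."""
    table = {}
    for number, (lhs, _rhs) in productions.items():
        for terminal in predict[number]:
            key = (lhs, terminal)
            if key in table:
                # Two productions predicted by the same symbol: the grammar is not LL(1).
                raise ValueError(f"conflict at {key}: productions {table[key]} and {number}")
            table[key] = number
    return table
```

Leaving error entries out of the dictionary mirrors the blank cells in the table; a lookup that finds no key corresponds to a syntax error.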

Driver Procedure
Under table-driven parsing, there is a single procedure that "interprets" the parse table. This "driver" procedure
takes the following form:
procedure Parser;
    /* Push the start symbol S onto the stack */
    Push(S, stack)
    /* Initialize lookahead symbol */
    scanner(NextInputSymbol)
    while not Empty(stack) do
        top = Top(stack)
        if top is a nonterminal then
            action = ParseTable[top, NextInputSymbol]
            if action > 0 then
                /* Pop top symbol */
                Pop(stack)
                /* Push RHS of production in reverse order, so that its leftmost symbol ends up on top */
                for each symbol on RHS #action, from right to left, do
                    Push(symbol, stack)
            else
                print("syntax error"); return
        else if NextInputSymbol == top then
            /* Match terminal symbol in input */
            Pop(stack)
            /* Get next terminal symbol in input */
            scanner(NextInputSymbol)
        else
            print("syntax error"); return
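
A minimal Python sketch of such a driver follows, using the numbered LL(1) expression grammar (productions 1 to 11, starting with <S> --> <E> ENDM) and its parse table. The data layout, the tokenizer, and the choice to return the list of applied production numbers are assumptions made for illustration.

```python
import re

# Production number -> right hand side; [] is the epsilon production.
RHS = {
    1: ["E", "ENDM"], 2: ["T", "E*"],
    3: ["+", "T", "E*"], 4: ["-", "T", "E*"], 5: [],
    6: ["F", "T*"],
    7: ["*", "F", "T*"], 8: ["/", "F", "T*"], 9: [],
    10: ["(", "E", ")"], 11: ["number"],
}

# Parse table: one row per non-terminal; a missing column is an error entry.
TABLE = {
    "S":  {"(": 1, "number": 1},
    "E":  {"(": 2, "number": 2},
    "E*": {")": 5, "+": 3, "-": 4, "ENDM": 5},
    "T":  {"(": 6, "number": 6},
    "T*": {")": 9, "+": 9, "-": 9, "*": 7, "/": 8, "ENDM": 9},
    "F":  {"(": 10, "number": 11},
}


def tokenize(text):
    """Digit runs become the terminal 'number'; other characters stand for themselves."""
    raw = re.findall(r"\d+|\S", text)
    return ["number" if tok.isdigit() else tok for tok in raw] + ["ENDM"]


def parse(text):
    """Return the production numbers applied, in order; raise ValueError on a syntax error."""
    tokens = tokenize(text)
    pos = 0
    stack = ["S"]                 # the top of the stack is the end of the list
    applied = []
    while stack:
        top = stack[-1]
        lookahead = tokens[pos]
        if top in TABLE:          # non-terminal: expand it using the table
            number = TABLE[top].get(lookahead)
            if number is None:
                raise ValueError(f"syntax error: {lookahead} cannot start {top}")
            stack.pop()
            # Push the right hand side reversed so its leftmost symbol is on top.
            stack.extend(reversed(RHS[number]))
            applied.append(number)
        elif top == lookahead:    # terminal: match it and advance the input
            stack.pop()
            pos += 1
        else:
            raise ValueError(f"syntax error: expected {top}, found {lookahead}")
    return applied
```

For example, parse("1 + (2 * 3) / 4") applies productions 1, 2, 6, 11, 9, 3, 6, 10, 2, 6, 11, 7, 11, 9, 5, 8, 11, 9, 5 in that order.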

Example Parse

Let's trace the parse for the input 1 + (2 * 3) / 4 ENDM:

Step  Stack contents            Current input           Action
 1:   S                         1 + (2 * 3) / 4 ENDM    1
 2:   E ENDM                    1 + (2 * 3) / 4 ENDM    2
 3:   T E* ENDM                 1 + (2 * 3) / 4 ENDM    6
 4:   F T* E* ENDM              1 + (2 * 3) / 4 ENDM    11
 5:   N T* E* ENDM              1 + (2 * 3) / 4 ENDM    Pop
 6:   T* E* ENDM                + (2 * 3) / 4 ENDM      9
 7:   E* ENDM                   + (2 * 3) / 4 ENDM      3
 8:   + T E* ENDM               + (2 * 3) / 4 ENDM      Pop
 9:   T E* ENDM                 (2 * 3) / 4 ENDM        6
10:   F T* E* ENDM              (2 * 3) / 4 ENDM        10
11:   ( E ) T* E* ENDM          (2 * 3) / 4 ENDM        Pop
12:   E ) T* E* ENDM            2 * 3) / 4 ENDM         2
13:   T E* ) T* E* ENDM         2 * 3) / 4 ENDM         6
14:   F T* E* ) T* E* ENDM      2 * 3) / 4 ENDM         11
15:   N T* E* ) T* E* ENDM      2 * 3) / 4 ENDM         Pop
16:   T* E* ) T* E* ENDM        * 3) / 4 ENDM           7
17:   * F T* E* ) T* E* ENDM    * 3) / 4 ENDM           Pop
18:   F T* E* ) T* E* ENDM      3) / 4 ENDM             11
19:   N T* E* ) T* E* ENDM      3) / 4 ENDM             Pop
20:   T* E* ) T* E* ENDM        ) / 4 ENDM              9
21:   E* ) T* E* ENDM           ) / 4 ENDM              5
22:   ) T* E* ENDM              ) / 4 ENDM              Pop
23:   T* E* ENDM                / 4 ENDM                8
24:   / F T* E* ENDM            / 4 ENDM                Pop
25:   F T* E* ENDM              4 ENDM                  11
26:   N T* E* ENDM              4 ENDM                  Pop
27:   T* E* ENDM                ENDM                    9
28:   E* ENDM                   ENDM                    5
29:   ENDM                      ENDM                    Pop
30:   Done!

In the stack column the top of the stack is on the left, and N abbreviates the terminal number. A numeric
action is the production used to expand the non-terminal on top of the stack; Pop means the terminal on top of
the stack matched the current input symbol.
