MODULE 2
Prepared by EBIN P.M (AP, CSE)
IES College of Engineering
A lexical analyzer can identify tokens with the help of regular expressions and pattern rules.
But a lexical analyzer cannot check the syntax of a given sentence due to the limitations of the
regular expressions. Regular expressions cannot check balancing tokens, such as parenthesis.
Therefore, this phase uses context-free grammar (CFG), which is recognized by push-down
automata.
The output of a syntax analyzer is a parse tree. For performing the syntax analysis, the
grammar of the language has to be specified. CFG is used to define the grammar of the
language. This process of verifying whether an input string matches the grammar of the
language is called parsing.
A grammar G = (V, T, P, S) is said to be context free if every production in P has the form
A → β, where A is a single nonterminal (A ∈ V) and β is a string of terminals and/or
nonterminals. That is, the left-hand side contains exactly one nonterminal.
1. Terminals are the basic symbols from which strings are formed. The term "token
name" is a synonym for "terminal" and frequently we will use the word "token" for
terminal when it is clear that we are talking about just the token name.
2. Nonterminals are syntactic variables that denote sets of strings. The nonterminals
define sets of strings that help define the language generated by the grammar. They
also impose a hierarchical structure on the language that is useful for both syntax
analysis and translation.
3. In a grammar, one nonterminal is distinguished as the start symbol, and the set of
strings it denotes is the language generated by the grammar. Conventionally, the
productions for the start symbol are listed first.
4. The productions of a grammar specify the manner in which the terminals and
nonterminals can be combined to form strings. Each production consists of:
a. A nonterminal called the head or left side of the production; this production
defines some of the strings denoted by the head.
b. The symbol →.
c. A body or right side consisting of zero or more terminals and nonterminals.
All the production rules are of the form X→Y. Production rules are the heart of the grammar.
Consider the production rules
S → aSB
S → aB
B→b
Here, V= {S, B} , T={a, b} and Starting symbol is S. Using this production rule , we can derive the
string aabb by
S → aSB
→ aaBB
→ aabB
→aabb
Here each individual step yields a sentential form. The entire sequence of steps is
called a derivation.
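The derivation above can be sketched as a sequence of leftmost rewrites on a plain string; the encoding of productions as (nonterminal, replacement) pairs is an illustrative assumption:

```python
def derive(steps):
    """Apply a fixed sequence of (nonterminal, replacement) rewrites,
    always rewriting the leftmost occurrence of the nonterminal,
    and collect every sentential form along the way."""
    sentential = "S"
    forms = [sentential]
    for nonterminal, body in steps:
        i = sentential.index(nonterminal)          # leftmost occurrence
        sentential = sentential[:i] + body + sentential[i + 1:]
        forms.append(sentential)
    return forms

# S -> aSB, then S -> aB, then B -> b twice, deriving aabb
print(derive([("S", "aSB"), ("S", "aB"), ("B", "b"), ("B", "b")]))
```

Each element of the returned list is one sentential form of the derivation S → aSB → aaBB → aabB → aabb.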
Eg: Let V = {S, C}, T = {a, b}, P = {S→aCa, C→aCa, C→b}. Generate the string a²ba² from the
grammar given above.
S → aCa
→ aaCaa
→ aabaa = a²ba²
EXAMPLE:
The grammar with the following productions defines simple arithmetic expression:
Notational Conventions
To avoid always having to state that "these are the terminals," "these are the nonterminals,"
and so on, the following notational conventions for grammars will be used.
Lowercase letters late in the alphabet , chiefly u, v, ... ,z, represent (possibly empty)
strings of terminals.
Unless stated otherwise, the head of the first production is the start symbol.
Using these conventions, the grammar for arithmetic expressions can be rewritten as:
E → E + T | E - T | T
T → T * F | T / F | F
F → ( E ) | id
Beginning with the start symbol, each rewriting step replaces a nonterminal by the body of
one of its productions.
E → E + E | E * E | - E | ( E ) | id
The production E → - E signifies that if E denotes an expression, then – E must also denote an
expression. The replacement of a single E by - E will be described by writing E => -E which
is read, "E derives - E."
The production E → ( E ) can be applied to replace any instance of E in any string of grammar
symbols by (E), e.g., E * E => (E) * E or E * E => E * (E)
We can take a single E and repeatedly apply productions in any order to get a sequence of
replacements. For example, E => - E => - (E) => - (id)
We call such a sequence of replacements a derivation of - (id) from E. This derivation provides
a proof that the string - (id) is one particular instance of an expression.
Example
Let any set of production rules in a CFG be
X → X+X | X*X |X| a
over an alphabet {a}.
The leftmost derivation for the string "a+a*a" may be –
X → X+X → a+X → a + X*X → a+a*X → a+a*a
Parse Tree
Parse tree is a hierarchical structure which represents the derivation of the grammar
to yield input strings.
Root node of parse tree has the start symbol of the given grammar from where the
derivation proceeds.
If A → xyz is a production, then the parse tree will have A as an interior node whose
children are x, y and z, from left to right.
Figure above represents the parse tree for the string id+ id*id. The string id + id * id,
is the yield of parse tree depicted in Figure.
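As a sketch, a parse tree can be represented with nested tuples, and its yield read off by concatenating the leaves from left to right; the tuple encoding is an assumption of this sketch:

```python
def yield_of(node):
    """Return the leaves of a parse tree from left to right.
    Interior nodes are (label, children); leaves are plain strings."""
    if isinstance(node, str):
        return [node]
    label, children = node
    leaves = []
    for child in children:
        leaves.extend(yield_of(child))
    return leaves

# Parse tree for id + id * id under E -> E + E | E * E | id,
# with the * subtree grouped below the +.
tree = ("E", [
    ("E", ["id"]),
    "+",
    ("E", [("E", ["id"]), "*", ("E", ["id"])]),
])
print(" ".join(yield_of(tree)))  # id + id * id
```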
2.1.1.2 AMBIGUITY
An ambiguous grammar is one that produces more than one leftmost or more than
one rightmost derivation for the same sentence.
For most parsers, it is desirable that the grammar be made unambiguous, for if it is
not, we cannot uniquely determine which parse tree to select for a sentence.
EXAMPLE
E → E + E | E * E | - E | ( E ) | id
Leftmost derivation 1:          Leftmost derivation 2:
E ===> E + E                    E ===> E * E
  ===> id + E                     ===> E + E * E
  ===> id + E * E                 ===> id + E * E
  ===> id + id * E                ===> id + id * E
  ===> id + id * id               ===> id + id * id

        E                            E
      / | \                        / | \
     E  +  E                      E  *  E
     |    /|\                    /|\    |
    id   E * E                  E + E   id
         |   |                  |   |
        id  id                 id  id
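To see why the two parse trees matter, here is a small sketch that evaluates each grouping with concrete numbers standing in for the three id tokens (the numbers and tuple encoding are assumptions for illustration):

```python
def evaluate(node):
    """Evaluate an expression tree: a leaf is an int,
    an interior node is (operator, left, right)."""
    if isinstance(node, int):
        return node
    op, left, right = node
    lv, rv = evaluate(left), evaluate(right)
    return lv + rv if op == "+" else lv * rv

# The two groupings of id + id * id, with id standing for 2, 3, 4:
tree1 = ("+", 2, ("*", 3, 4))    # id + (id * id), from derivation 1
tree2 = ("*", ("+", 2, 3), 4)    # (id + id) * id, from derivation 2

print(evaluate(tree1), evaluate(tree2))  # 14 20
```

The same sentence yields two different values, which is exactly why an ambiguous grammar is unsuitable for defining the semantics of expressions.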
Bottom Up Parsing
In top down parsing, the parse tree is constructed from the top (root) to the bottom (leaves).
In bottom up parsing, the parse tree is constructed from the bottom (leaves) to the top (root);
it can be viewed as an attempt to construct a parse tree for the input starting from the
leaves and working up towards the root.
Top down parsing, by contrast, creates the nodes of the parse tree in pre-order. Pre-order
traversal means: 1. Visit the root 2. Traverse left subtree 3. Traverse right subtree.
Top down parsing can be viewed as an attempt to find a leftmost derivation for an
input string (that is, expanding the leftmost non-terminal at every step).
It may involve backtracking, that is, making repeated scans of the input, to obtain the correct
expansion of the leftmost non-terminal. Unless the grammar is ambiguous or left-recursive,
it finds a suitable parse tree.
EXAMPLE
S → cAd
A → ab | a
Consider the input string w = cad. To construct a parse tree for this string top down, we
initially create a tree consisting of a single node labelled S.
An input pointer points to c, the first symbol of w. S has only one production, so we
use it to expand S and obtain the tree as:
The leftmost leaf, labeled c, matches the first symbol of input w, so we advance the
input pointer to a, the second symbol of w, and consider the next leaf, labeled A.
Now, we expand A using the first alternative A → ab to obtain the tree as:
We have a match for the second input symbol, a, so we advance the input pointer to
d, the third input symbol, and compare d against the next leaf, labeled b.
Since b does not match d, we report failure and go back to A to see whether there is
another alternative for A that has not been tried, but that might produce a match.
In going back to A, we must reset the input pointer to position 2 , the position it had
when we first came to A, which means that the procedure for A must store the input
pointer in a local variable.
The second alternative for A produces the tree as:
The leaf a matches the second symbol of w and the leaf d matches the third symbol.
Since we have produced a parse tree for w, we halt and announce successful
completion of parsing. (that is the string parsed completely and the parser stops).
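The walkthrough above can be sketched as a tiny recognizer; the function names and the string-based scanning are illustrative assumptions. The alternatives for A are tried in order, and falling from A → ab back to A → a is the backtracking step for this grammar:

```python
def parse_A(s, pos):
    """A -> ab | a: try each alternative in order; if one fails to
    match at pos, fall back to the next (backtracking)."""
    for alt in ("ab", "a"):
        if s.startswith(alt, pos):
            return pos + len(alt)          # position after the match
    return None

def parse_S(s, pos):
    """S -> cAd: match c, then A, then d."""
    if pos < len(s) and s[pos] == "c":
        p = parse_A(s, pos + 1)
        if p is not None and p < len(s) and s[p] == "d":
            return p + 1
    return None

def matches(s):
    """True iff the whole input is derived from S."""
    return parse_S(s, 0) == len(s)

print(matches("cad"), matches("cabd"), matches("cbd"))  # True True False
```

For the input cad, A first tries ab (fails on d), then retries with a, exactly as in the trace above.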
The goal of predictive parsing is to construct a top-down parser that never
backtracks. To do so, we must transform the grammar in two ways: eliminate left
recursion, and perform left factoring.
These rules eliminate most common causes for backtracking, although they do not
guarantee completely backtrack-free parsing (called LL(1), as we will see later).
Left Recursion
A grammar is said to be left-recursive if it has a non-terminal A such that there is a
derivation A ⇒+ Aα, for some string α.
EXAMPLE
A → Aα
A → β
This grammar generates the language described by the regular expression βα*. The problem
is that if we use the first production for a top-down derivation, we will fall into an infinite
derivation chain. This is called left recursion.
Top-down parsing methods cannot handle left-recursive grammars, so a
transformation that eliminates left recursion is needed. The left-recursive
pair of productions A → Aα | β can be replaced by two non-recursive
productions:
A → βA'
A' → αA' | ε
E → E + T | T
T → T * F | F
F → ( E ) | id
Eliminating the immediate left recursion from the productions for E and then for T, we
obtain
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id
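The transformation can be sketched programmatically. The dict-of-tuples grammar encoding and the A + "'" naming convention are assumptions of this sketch, and () stands for ε:

```python
def eliminate_immediate_left_recursion(grammar, A):
    """Split A's alternatives into left-recursive ones (A alpha) and the
    rest (beta), then emit A -> beta A' and A' -> alpha A' | epsilon."""
    recursive, rest = [], []
    for alt in grammar[A]:
        (recursive if alt and alt[0] == A else rest).append(alt)
    if not recursive:
        return grammar                       # nothing to eliminate
    A1 = A + "'"
    new = dict(grammar)
    new[A] = [beta + (A1,) for beta in rest]
    new[A1] = [alpha[1:] + (A1,) for alpha in recursive] + [()]  # () is epsilon
    return new

g = {"E": [("E", "+", "T"), ("T",)],
     "T": [("T", "*", "F"), ("F",)],
     "F": [("(", "E", ")"), ("id",)]}
g = eliminate_immediate_left_recursion(g, "E")
g = eliminate_immediate_left_recursion(g, "T")
print(g["E"], g["E'"])   # the transformed E-productions from above
```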
No matter how many A-productions there are, we can eliminate immediate left
recursion from them by the following technique. First, group the A-productions as
A → Aα1 | Aα2 | ... | Aαm | β1 | β2 | ... | βn
where no βi begins with an A. Then replace the A-productions by
A → β1A' | β2A' | ... | βnA'
A' → α1A' | α2A' | ... | αmA' | ε
Left Factoring
Left factoring is a grammar transformation that is useful for producing a grammar
suitable for predictive parsing.
The basic idea is that when it is not clear which of two alternative productions to use
to expand a non-terminal A, we may be able to rewrite the A-productions to defer the
decision until we have seen enough of the input to make the right choice
A → αβ1 | αβ2
are two A-productions, and the input begins with a non-empty string derived from α,
we do not know whether to expand A to αβ1 or to αβ2.
However, we may defer the decision by expanding A to αB. Then, after seeing the
input derived from α, we may expand B to β1 or β2:
A → αB
B → β1 | β2
stmt → if cond then stmt else stmt | if cond then stmt
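The transformation can be sketched as code that factors out the longest common prefix of a nonterminal's alternatives; the tuple encoding and the A + "'" naming are illustrative assumptions, and () stands for ε:

```python
def common_prefix(a, b):
    """Longest common prefix of two symbol tuples."""
    i = 0
    while i < min(len(a), len(b)) and a[i] == b[i]:
        i += 1
    return a[:i]

def left_factor(grammar, A):
    """Factor the longest common prefix out of A's alternatives,
    deferring the choice to a fresh nonterminal A'."""
    alts = grammar[A]
    prefix = alts[0]
    for alt in alts[1:]:
        prefix = common_prefix(prefix, alt)
    if not prefix:
        return grammar                        # nothing in common
    A1 = A + "'"
    new = dict(grammar)
    new[A] = [prefix + (A1,)]
    new[A1] = [alt[len(prefix):] for alt in alts]   # () appears for epsilon
    return new

g = {"stmt": [("if", "cond", "then", "stmt", "else", "stmt"),
              ("if", "cond", "then", "stmt")]}
g = left_factor(g, "stmt")
print(g["stmt"])    # [('if', 'cond', 'then', 'stmt', "stmt'")]
print(g["stmt'"])   # [('else', 'stmt'), ()]
```

Applied to the dangling-else productions above, the choice between the two alternatives is deferred until after "if cond then stmt" has been seen.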
Non Recursive Predictive parser
It is possible to build a nonrecursive predictive parser by maintaining a stack
explicitly, rather than implicitly via recursive calls.
The key problem during predictive parsing is that of determining the production to
be applied for a nonterminal.
Requirements
1. Stack
2. Parsing Table
3. Input Buffer
4. Parsing program (driver)
Input buffer - contains the string to be parsed, followed by $ (used to indicate the end of
the input string).
The parser is controlled by a program that behaves as follows. The program considers
X, the symbol on top of the stack, and a, the current input symbol. These two symbols
determine the action of the parser.
1. If X = a = $, the parser halts and announces successful completion of parsing.
2. If X = a ≠ $, the parser pops X off the stack and advances the input pointer to
the next input symbol.
3. If X is a nonterminal, the program consults entry M[X, a] of the parsing table M;
that entry is either an X-production, whose body replaces X on the stack, or an
error entry.
Uses 2 functions:
FIRST()
FOLLOW()
FIRST
If α is any string of grammar symbols, then FIRST(α) is the set of terminals that begin
the strings derived from α. If α ⇒* ε, then ε is also in FIRST(α). FIRST is defined for
both terminals and non-terminals.
EXAMPLE
Consider Grammar:
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id
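For the grammar above, FIRST sets can be computed by iterating to a fixed point, as in this sketch; the dict encoding and the "eps" marker for ε are assumptions:

```python
GRAMMAR = {
    "E":  [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T":  [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F":  [("(", "E", ")"), ("id",)],
}
NONTERMINALS = set(GRAMMAR)

def first_sets(grammar):
    """Iterate until no FIRST set changes (a fixed point)."""
    first = {X: set() for X in grammar}
    changed = True
    while changed:
        changed = False
        for X, alts in grammar.items():
            for alt in alts:
                before = len(first[X])
                nullable_prefix = True
                for sym in alt:
                    if sym not in NONTERMINALS:   # a terminal begins the string
                        first[X].add(sym)
                        nullable_prefix = False
                        break
                    first[X] |= first[sym] - {"eps"}
                    if "eps" not in first[sym]:   # sym cannot vanish: stop here
                        nullable_prefix = False
                        break
                if nullable_prefix:               # whole body can derive epsilon
                    first[X].add("eps")
                changed |= len(first[X]) != before
    return first

F = first_sets(GRAMMAR)
print(F["E"], F["E'"], F["T'"])
```

The result matches the standard sets for this grammar, e.g. FIRST(E) = { (, id } and FIRST(E') = { +, ε }.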
FOLLOW
FOLLOW is defined only for non-terminals of the grammar G.
It is defined as the set of terminals of grammar G that can immediately follow
the non-terminal in some sentential form derived from the start symbol.
In other words, if A is a nonterminal, then FOLLOW(A) is the set of terminals a that
can appear immediately to the right of A in some sentential form.
EXAMPLE
Consider Grammar:
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id
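FOLLOW sets for the grammar above can also be computed to a fixed point. In this sketch the FIRST sets are taken as given (hardcoded from a hand computation, an assumption of the sketch), "eps" marks ε, and "$" is the end marker:

```python
GRAMMAR = {
    "E":  [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T":  [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F":  [("(", "E", ")"), ("id",)],
}
FIRST = {
    "E": {"(", "id"}, "E'": {"+", "eps"}, "T": {"(", "id"},
    "T'": {"*", "eps"}, "F": {"(", "id"},
    "+": {"+"}, "*": {"*"}, "(": {"("}, ")": {")"}, "id": {"id"},
}

def follow_sets(grammar, start):
    follow = {X: set() for X in grammar}
    follow[start].add("$")                     # $ follows the start symbol
    changed = True
    while changed:
        changed = False
        for X, alts in grammar.items():
            for alt in alts:
                for i, sym in enumerate(alt):
                    if sym not in grammar:
                        continue               # FOLLOW is defined only for nonterminals
                    before = len(follow[sym])
                    nullable_rest = True
                    for nxt in alt[i + 1:]:    # what can come right after sym
                        follow[sym] |= FIRST[nxt] - {"eps"}
                        if "eps" not in FIRST[nxt]:
                            nullable_rest = False
                            break
                    if nullable_rest:          # everything after sym can vanish
                        follow[sym] |= follow[X]
                    changed |= len(follow[sym]) != before
    return follow

FOLLOW = follow_sets(GRAMMAR, "E")
print(FOLLOW["E"], FOLLOW["T"], FOLLOW["F"])
```

The output matches the standard results, e.g. FOLLOW(E) = { ), $ } and FOLLOW(F) = { +, *, ), $ }.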
EXAMPLE
METHOD
Parsing Table
Blank entries are error states. For example, E cannot derive a string starting with ‘+’
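The table-driven driver described earlier can be sketched as a short loop over an explicit stack. The table below encodes the standard LL(1) entries for this expression grammar; its exact layout as a dict keyed by (nonterminal, terminal) is an assumption of this sketch, and missing keys play the role of the blank error entries:

```python
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

def parse(tokens):
    """Nonrecursive predictive parse: stack + table + input buffer."""
    tokens = tokens + ["$"]                   # end-of-input marker
    stack = ["$", "E"]                        # start symbol on top
    i = 0
    while stack:
        X = stack.pop()
        a = tokens[i]
        if X == a:                            # top of stack matches input
            i += 1
        elif X in NONTERMINALS:
            if (X, a) not in TABLE:
                return False                  # blank entry: syntax error
            stack.extend(reversed(TABLE[(X, a)]))  # push body, leftmost on top
        else:
            return False                      # terminal mismatch
    return i == len(tokens)

print(parse(["id", "+", "id", "*", "id"]))  # True
print(parse(["id", "+", "*"]))              # False
```

On id + * the driver hits the blank entry M[T, *] and reports an error, just as the note above describes.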
2.2.3 LL(1)GRAMMARS
LL(1) grammars are the class of grammars for which predictive parsers can be
constructed automatically.
A context-free grammar G = (VT, VN, P, S) whose parsing table has no multiply-defined
entries is said to be LL(1).
The first L stands for scanning the input from left to right, the second L for producing
a leftmost derivation, and the 1 stands for using one input symbol of lookahead at each
step to make parsing action decisions.
EXAMPLE
S → i E t S S' | a
S' → eS | ϵ
E→b
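This is the classic dangling-else grammar, and it is not LL(1): filling the parsing table places two productions in the same cell. As a sketch (with FOLLOW(S') hardcoded from a hand computation, an assumption here), the conflict at M[S', e] shows up directly:

```python
FOLLOW_S_PRIME = {"e", "$"}       # FOLLOW(S') for this grammar, by hand

entries = {}

def add_entry(nonterminal, terminal, production):
    """Record a production in table cell M[nonterminal, terminal]."""
    entries.setdefault((nonterminal, terminal), []).append(production)

# S' -> eS is placed under FIRST(eS) = {e}
add_entry("S'", "e", "S' -> eS")
# S' -> epsilon is placed under every terminal in FOLLOW(S')
for t in sorted(FOLLOW_S_PRIME):
    add_entry("S'", t, "S' -> eps")

print(entries[("S'", "e")])       # the cell holds two productions: not LL(1)
```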
**********