The document summarizes syntactic analysis in compiler design. It discusses how a parser analyzes the syntactic structure of a program by using a context-free grammar (CFG) to check for errors. A CFG consists of terminal symbols, non-terminal symbols, a start symbol, and productions. The parser uses the CFG to derive strings and construct a parse tree. Context-free grammars are equivalent to pushdown automata and are useful for describing programming language syntax recursively through productions.

Uploaded by

Aashish Singh

Syntactic Analysis

Introduction
I Second phase of the compiler.
I Main task:
I Analyze syntactic structure of program and its components
I to check these for errors.
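I Role of parser:
[Figure: source → Lexical Analyzer → (token) → Parser → (parse tree) → Rest of Front End → IR. The parser sends a request (req) to the lexical analyzer for each token. A Symbol Table is attached to the front-end phases.]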

I Approach to constructing parser: similar to lexical analyzer

I Represent source language by a meta-language, Context Free Grammar
I Use algorithms to construct a recognizer that recognizes strings generated by the
grammar.
This step can be automated for certain classes of grammars. One such tool:
YACC.
I Parse strings of language using the recognizer.

Context Free Grammar (CFG)
I Syntax analysis based on theory of automata and formal languages, specifically
the equivalence of two mechanisms of context free grammars and pushdown
automata.
I Context free grammars used to describe the syntactic structures of programs of
a programming language. Describe what elementary constructs there are and
how composite constructs can be built from other constructs.
Stmt → if (Expr) Stmt else Stmt
Note recursive nature of definition.
I Formally, a CFG has four components:
a) a set of tokens Vt , called terminal symbols, (token set produced by the scanner)
examples: if, then, identifier, etc.
b) a set of different intermediate symbols, called non-terminals, syntactic categories,
syntactic variables, Vn
c) a start symbol, S ∈ Vn , and
d) a set of productions P of the form
A → X1 · · · Xm
where A ∈ Vn , Xi ∈ (Vn ∪ Vt ), 1 ≤ i ≤ m, m ≥ 0 (m = 0 gives the empty right-hand side, A → ε).
I Sentences generated by starting with S and applying productions until left with
nothing but terminals.
I Set of strings derivable from a CFG G comprises the context free language,
denoted L(G ).
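
The four components and the notion of "sentences generated from S" can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the grammar S → a S b | ε and the names `is_well_formed` and `sentences` are assumptions chosen for the example. The search terminates here because every recursive production adds terminals; a grammar such as S → S S would need a different bound.

```python
from collections import deque

# Hypothetical example grammar (not from the slides): S -> a S b | ε
TERMINALS = {"a", "b"}                      # Vt
NONTERMINALS = {"S"}                        # Vn
START = "S"                                 # start symbol, must be in Vn
PRODUCTIONS = {"S": [["a", "S", "b"], []]}  # P; [] is the empty right-hand side (m = 0)

def is_well_formed():
    """Check the four components against the formal definition."""
    if START not in NONTERMINALS or TERMINALS & NONTERMINALS:
        return False
    symbols = TERMINALS | NONTERMINALS
    return all(lhs in NONTERMINALS and all(x in symbols for x in rhs)
               for lhs, alts in PRODUCTIONS.items() for rhs in alts)

def sentences(max_len):
    """Breadth-first search over sentential forms; collect sentences
    containing at most max_len terminals."""
    found, seen = set(), {(START,)}
    queue = deque([(START,)])
    while queue:
        form = queue.popleft()
        idx = next((i for i, x in enumerate(form) if x in NONTERMINALS), None)
        if idx is None:                     # only terminals left: a sentence
            found.add("".join(form))
            continue
        for rhs in PRODUCTIONS[form[idx]]:
            new = form[:idx] + tuple(rhs) + form[idx + 1:]
            # prune forms that already carry too many terminals
            if sum(x in TERMINALS for x in new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(found, key=lambda s: (len(s), s))

print(sentences(4))
```

The empty string in the output corresponds to the production with m = 0.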

CFG - example.
I Nonterminals start with uppercase letters; the rest are terminals.
I If-then-else:
Stmt → IfStmt | other
IfStmt → if ( Exp ) Stmt ElseStmt
ElseStmt → else Stmt | ε
Exp → 0 | 1
Example strings:
other
if (0) other
if (1) other else if (0) other else other
Derivation of if (1) other else if (0) other else other:
Stmt ⇒ IfStmt ⇒ if (Exp) Stmt ElseStmt
⇒ if (1) Stmt ElseStmt
...
I Grammar for sequence of statements:
StmtSeq → Stmt; StmtSeq | Stmt
Stmt → s
L(G) = { s, s;s, s;s;s, ... }
I What if the statement sequence is empty?
StmtSeq → Stmt; StmtSeq | ε
Stmt → s
L(G) = { ε, s;, s;s;, s;s;s;, ... }
Note: Here ’;’ is not a statement separator, but a terminator.
What if we want a statement separator?
StmtSeq → NonEmpStmtSeq | ε
NonEmpStmtSeq → Stmt; NonEmpStmtSeq | Stmt
Stmt → s
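
The difference between ';' as a terminator and as a separator is easy to see by enumerating short sentences of each grammar. The sketch below is an illustration only: the helper `language` and the dictionary encoding of the grammars are assumptions, and an empty list stands for ε.

```python
def language(productions, start, max_len):
    """Return all sentences with at most max_len terminals, found by
    always expanding the leftmost nonterminal."""
    results = set()

    def expand(form):
        terminals = [x for x in form if x not in productions]
        if len(terminals) > max_len:        # too long already: prune
            return
        for i, sym in enumerate(form):
            if sym in productions:          # leftmost nonterminal
                for rhs in productions[sym]:
                    expand(form[:i] + rhs + form[i + 1:])
                return
        results.add("".join(form))          # no nonterminal left: a sentence

    expand([start])
    return sorted(results, key=lambda s: (len(s), s))

# Non-empty sequence, ';' separates statements.
separated = {"StmtSeq": [["Stmt", ";", "StmtSeq"], ["Stmt"]], "Stmt": [["s"]]}
# Possibly empty sequence, ';' terminates each statement.
terminated = {"StmtSeq": [["Stmt", ";", "StmtSeq"], []], "Stmt": [["s"]]}
# Possibly empty sequence, ';' separates statements.
separated_or_empty = {
    "StmtSeq": [["NonEmpStmtSeq"], []],
    "NonEmpStmtSeq": [["Stmt", ";", "NonEmpStmtSeq"], ["Stmt"]],
    "Stmt": [["s"]],
}

print(language(separated, "StmtSeq", 5))
print(language(terminated, "StmtSeq", 4))
print(language(separated_or_empty, "StmtSeq", 3))
```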

Context Free Grammar (CFG) - cont’d.
I Notations:
1. Nonterminals: Uppercase letters such as A, B, C
2. Terminals: lower case letters such as a,b, c, operators +,−, etc,
punctuation, digits, and boldface strings such as id.
3. Nonterminals or terminals: Upper-case letters late in alphabet, such
as X , Y , Z .
4. Strings of terminals: lower-case letters late in alphabet, such as x, y ,
z.
5. Strings of grammar symbols: lower-case greek letters α, β, etc.
6. Write A → α1 , A → α2 , etc as
A → α1 |α2 | · · ·
I Example:
E → E A E | ( E ) | − E | id
A → +| − | ∗ |/| ↑
I Derivation of strings: a production can be thought of as a rewrite
rule in which nonterminal on left is replaced by string on right side.
Notation: Write such a replacement as E ⇒ (E).
Example:
E ⇒ −E ⇒ −(E ) ⇒ −(id)

CFG - cont’d.
I Notation: Write αAβ ⇒ αγβ if A → γ.
I Notation: Write α ⇒* β to denote that β can be derived from α in zero or
more steps.
L(G) = {w | S ⇒* w}, where w is a string of terminals.
I Sentential form: α is a sentential form if S ⇒* α; α may contain
non-terminals. A sentence is a sentential form with no non-terminals.
Example: E + E
I Leftmost derivation: Derivation α ⇒ β is leftmost if the leftmost nonterminal in
α is replaced.
Example:
E ⇒* EAE ⇒* idAE ⇒* id + E ⇒* id + id
Production sequence discovered by a large class of parsers (the top-down
parsers) is a leftmost derivation; hence, these parsers are said to produce
leftmost parse.
I Rightmost derivation: Derivation α ⇒ β is rightmost if the rightmost nonterminal
in α is replaced.
Example:
E ⇒* EAE ⇒* EAid ⇒* E + id ⇒* id + id
Also called canonical derivation. Corresponds well to an important class of
parsers (the bottom-up parsers). In particular, as a bottom up parser discovers
the productions used to derive a token sequence, it discovers a rightmost
derivation, but in reverse order : last production applied is discovered first,
while the first production is the last to be discovered.
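
Both derivation orders can be read off a single parse tree, which the following sketch illustrates for id + id under E → E A E | id, A → +. The tree encoding (a nonterminal is a `(label, children)` tuple, a terminal is a string) and the function names are assumptions made for this example.

```python
def render(form):
    """Print a sentential form: nonterminal nodes by label, terminals as-is."""
    return " ".join(n[0] if isinstance(n, tuple) else n for n in form)

def derivation(tree, leftmost=True):
    """Return the sentential forms obtained by repeatedly replacing the
    leftmost (or rightmost) nonterminal by its children in the tree."""
    form = [tree]
    steps = [render(form)]
    while True:
        idxs = [i for i, node in enumerate(form) if isinstance(node, tuple)]
        if not idxs:                        # only terminals left
            return steps
        i = idxs[0] if leftmost else idxs[-1]
        form[i:i + 1] = form[i][1]          # apply the production at node i
        steps.append(render(form))

# Parse tree of "id + id"
tree = ("E", [("E", ["id"]), ("A", ["+"]), ("E", ["id"])])

print(" => ".join(derivation(tree, leftmost=True)))
print(" => ".join(derivation(tree, leftmost=False)))
# A bottom-up parser discovers the rightmost derivation in reverse order:
print(" <= ".join(reversed(derivation(tree, leftmost=False))))
```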
Representations of derivations
I Derivations represented graphically by a derivation tree or parse tree:
I Root: start symbol; leaves: grammar symbols or ε
I Interior nodes: nonterminals; the children of a nonterminal represent application of
a rule.
I Example: parse trees for two different derivations of the string id + id ∗ id:
        E                    E
      / | \                / | \
     E  +  E              E  ∗  E
     |   / | \          / | \   |
    id  E  ∗  E        E  +  E  id
        |     |        |     |
        id    id       id    id
I Abstract syntax tree: A more abstract representation of the input string.
Example for if (0) other else other (children indented under their parent):
Parse tree:                  AST:
Stmt                         if
  if                           0
  (                            other
  exp                          other
    0
  )
  Stmt
    other
  else
  Stmt
    other
I Parse tree may contain information that may not be needed in later phases of
compiler. AST does not include intermediate nodes primarily used for derivation
purposes.
I In general, during the semantic analysis phase, the parse tree of a string may
be converted into an abstract syntax tree.
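
A minimal sketch of such a conversion, assuming the If-then-else grammar from the earlier CFG example (Stmt, IfStmt, ElseStmt, Exp). The tuple encoding of trees and the function `to_ast` are illustrative assumptions, not a prescribed design.

```python
def to_ast(node):
    """Collapse a parse tree (label, children) into an AST; leaves are strings."""
    if isinstance(node, str):
        return node
    label, kids = node
    if label == "IfStmt":
        # children: if ( Exp ) Stmt ElseStmt  -> drop keywords and parentheses
        cond, then, els = to_ast(kids[2]), to_ast(kids[4]), to_ast(kids[5])
        return ("if", cond, then) if els is None else ("if", cond, then, els)
    if label == "ElseStmt":
        # children: else Stmt, or nothing for the empty alternative
        return to_ast(kids[1]) if kids else None
    # Stmt and Exp nodes have a single child: skip the intermediate node
    return to_ast(kids[0])

# Parse tree of: if (0) other else other
parse_tree = ("Stmt", [("IfStmt", [
    "if", "(", ("Exp", ["0"]), ")",
    ("Stmt", ["other"]),
    ("ElseStmt", ["else", ("Stmt", ["other"])]),
])])

print(to_ast(parse_tree))
```

The result keeps only the condition and the two branches, which is all later phases need.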

Parse Tree - Examples
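I Parse tree for string: if (0) other else other, with its AST (children indented under their parent):
Parse tree:                  AST:
Stmt                         if
  IfStmt                       0
    if                         other
    (                          other
    Exp
      0
    )
    Stmt
      other
    ElseStmt
      else
      Stmt
        other

I Parse tree for string: s;s;s, with its AST:
Parse tree:                  AST:
StmtSeq                      seq
  Stmt                         s
    s                          s
  ;                            s
  StmtSeq
    Stmt
      s
    ;
    StmtSeq
      Stmt
        s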

Properties of Context Free Grammars
I Context free grammars that are limited to productions of the form A → aB
and C → ε form the class of regular grammars. Languages defined by
regular grammars are a proper subset of the context-free languages.
I Why not fold lexical analysis into parsing?
I Lexical rules are in general simple.
I REs are more concise and easier to understand.
I REs are a domain-specific notation, so an efficient lexical analyzer can be
constructed from them.
I Separating the two gives manageable parts. Useful for multi-lingual programming.
I Non-reduced CFGs: A CFG containing nonterminals that are unreachable
or derive no terminal string.
Example:
S → A|B
A → a
B → B b
C → c
Nonterminal C cannot be reached from S. B does not derive any terminal string.
Useless nonterminals can be safely removed from a CFG without affecting the
language. Reduced grammar:
S → A
A → a
Algorithms exist that check for useless nonterminals.
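
One common way to do this is a two-pass fixed-point computation: first keep the nonterminals that derive some terminal string, then keep those still reachable from the start symbol. The sketch below applies it to the example above; the function name `reduce_grammar` and the dictionary encoding are assumptions for illustration.

```python
def reduce_grammar(productions, start):
    """productions: dict nonterminal -> list of right-hand sides (lists of
    symbols). Any symbol that is not a key of the dict is a terminal."""
    def usable(rhs, productive):
        return all(x not in productions or x in productive for x in rhs)

    # Pass 1: nonterminals that derive some terminal string (fixed point).
    productive = set()
    changed = True
    while changed:
        changed = False
        for lhs, alts in productions.items():
            if lhs not in productive and any(usable(rhs, productive) for rhs in alts):
                productive.add(lhs)
                changed = True
    kept = {lhs: [rhs for rhs in alts if usable(rhs, productive)]
            for lhs, alts in productions.items() if lhs in productive}

    # Pass 2: nonterminals reachable from the start symbol.
    reachable, stack = set(), [start]
    while stack:
        nt = stack.pop()
        if nt in reachable or nt not in kept:
            continue
        reachable.add(nt)
        stack.extend(x for rhs in kept[nt] for x in rhs if x in kept)
    return {lhs: alts for lhs, alts in kept.items() if lhs in reachable}

grammar = {
    "S": [["A"], ["B"]],
    "A": [["a"]],
    "B": [["B", "b"]],   # never terminates: B derives no terminal string
    "C": [["c"]],        # unreachable from S
}
print(reduce_grammar(grammar, "S"))
```

The order of the passes matters: removing unproductive symbols first can make further symbols unreachable, but not the other way round.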

Properties of Context Free Grammars - Ambiguity
I Ambiguity: A context free grammar is ambiguous if it allows different
derivation trees for a single string.
Example: two parse trees for id − id − id:
        E                    E
      / | \                / | \
     E  −  E              E  −  E
     |   / | \          / | \   |
    id  E  −  E        E  −  E  id
        |     |        |     |
        id    id       id    id

Each tree defines a different semantics for −
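
To make the difference concrete, the following sketch enumerates every parse tree of a token sequence under E → E − E | id and evaluates each one. Numbers stand in for id, the input is assumed to be well formed (operands and '-' alternating), and the function names are illustrative assumptions.

```python
def parse_trees(tokens):
    """All parse trees for tokens under E -> E - E | id."""
    if len(tokens) == 1:
        return [tokens[0]]                  # E -> id
    trees = []
    for i, tok in enumerate(tokens):
        if tok == "-":                      # try every '-' as the root operator
            for left in parse_trees(tokens[:i]):
                for right in parse_trees(tokens[i + 1:]):
                    trees.append((left, "-", right))
    return trees

def evaluate(tree):
    """Evaluate a tree bottom-up; leaves are numbers."""
    if isinstance(tree, tuple):
        left, _, right = tree
        return evaluate(left) - evaluate(right)
    return tree

tokens = [10, "-", 4, "-", 3]
for tree in parse_trees(tokens):
    print(tree, "=", evaluate(tree))
```

The two trees give 10 − (4 − 3) and (10 − 4) − 3, so the same string has two different values.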

I No algorithm exists for automatically checking if a grammar is ambiguous
(the problem is undecidable). However, for certain grammar classes (including
those from which parsers are generated automatically), one can prove that
grammars are unambiguous.
I How to eliminate ambiguity: one way is to rewrite the grammar. Example:
S → if E then S | if E then S else S
is rewritten as
S → M | U
M → if E then M else M
U → if E then S | if E then M else U
Represents semantics: match each else with the closest previous unmatched
then. The above transformation makes the grammar unnecessarily
complex.
I Another approach: Disambiguate by defining an additional token end.
S → if E then S end | if E then S else S end
I Provide information to the parser so that it can handle it in a certain way.
Properties of Context Free Grammars - cont’d.
I Left recursion: G is left recursive if for a nonterminal A, there is a
derivation A ⇒+ Aα
Top-down parsing methods cannot handle left-recursive grammars. So
eliminate left recursion.
I Left factoring: Factor out the common left prefixes of productions: Replace
productions A → αβ1 | αβ2 by the rules:
A → αA′
A′ → β1 | β2
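
Both transformations are mechanical. The sketch below shows left factoring as stated above, plus the standard textbook rewrite for immediate left recursion (A → Aα | β becomes A → βA′, A′ → αA′ | ε), which the slide does not spell out. Function names and the list encoding (an empty list for ε, a trailing apostrophe for the new nonterminal) are assumptions for the example; indirect left recursion is not handled.

```python
def eliminate_immediate_left_recursion(nt, alts):
    """A -> A a1 | ... | b1 | ...  becomes  A -> b1 A' | ...,  A' -> a1 A' | ... | ε."""
    recursive = [rhs[1:] for rhs in alts if rhs and rhs[0] == nt]
    others = [rhs for rhs in alts if not rhs or rhs[0] != nt]
    if not recursive:
        return {nt: alts}                   # nothing to do
    new = nt + "'"
    return {
        nt: [rhs + [new] for rhs in others],
        new: [rhs + [new] for rhs in recursive] + [[]],   # [] stands for ε
    }

def left_factor(nt, alts):
    """Factor the longest prefix common to ALL alternatives of nt."""
    prefix = []
    for column in zip(*alts):               # compare alternatives symbol by symbol
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    if not prefix or len(alts) < 2:
        return {nt: alts}
    new = nt + "'"
    return {nt: [prefix + [new]], new: [rhs[len(prefix):] for rhs in alts]}

print(eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]]))
print(left_factor("S", [["if", "E", "then", "S"],
                        ["if", "E", "then", "S", "else", "S"]]))
```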
I Context free grammars are not powerful enough to represent all constructs
of programming languages.
Cannot describe the following languages:
I L1 = {wcw | w ∈ (a|b)∗}: Conceptually represents problem of verifying that
an identifier is declared before it is used. Such checks are done during the
semantic analysis phase.
I L2 = {a^n b^m c^n d^m | n ≥ 1 ∧ m ≥ 1}. Abstracts the problem of checking that
number of formal parameters agrees with the number of actual parameters.
I L3 = {a^n b^n c^n | n ≥ 0}.
CFG’s can keep count of two items but not three.

Properties of Context Free Grammars - cont’d.
I Context free grammar can capture some of language semantics as
well.
I Example grammar:
<exp> ::=<exp> + <term> | <term>
<term> ::=<term> * <term>
| ‘(’<exp>‘)’
| <number>
<number> ::= 0 | 1 | · · · | 9
I Precedence of * over +: by deriving * lower in the parse tree.
I Left recursion
<exp> ::= <exp> + <term>
left associativity of +
I Right recursion:
<exp> ::= <term> + <exp>
right associativity of +
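
A small hand-written parser shows how the shape of the grammar fixes the shape of the tree. This is a sketch under assumptions: the left-recursive rules are realised with loops (a top-down parser cannot use them directly), `*` is treated as left-associative through a hypothetical helper level `parse_factor` that the slide's grammar does not name, and numbers are single digits as in `<number>`.

```python
import re

def tokenize(text):
    return re.findall(r"\d|[+*()]", text)

def parse_exp(tokens, right_assoc=False):
    """<exp> ::= <exp> + <term> | <term>   (or <term> + <exp> if right_assoc)."""
    left = parse_term(tokens, right_assoc)
    if right_assoc:
        if tokens and tokens[0] == "+":
            tokens.pop(0)
            return ("+", left, parse_exp(tokens, right_assoc))
        return left
    while tokens and tokens[0] == "+":      # loop realises the left-recursive rule
        tokens.pop(0)
        left = ("+", left, parse_term(tokens, right_assoc))
    return left

def parse_term(tokens, right_assoc=False):
    """Terms bind tighter than +, so * ends up lower in the tree."""
    left = parse_factor(tokens, right_assoc)
    while tokens and tokens[0] == "*":
        tokens.pop(0)
        left = ("*", left, parse_factor(tokens, right_assoc))
    return left

def parse_factor(tokens, right_assoc=False):
    tok = tokens.pop(0)
    if tok == "(":
        node = parse_exp(tokens, right_assoc)
        tokens.pop(0)                       # consume ')'
        return node
    return int(tok)

print(parse_exp(tokenize("1+2*3")))
print(parse_exp(tokenize("1+2+3")))
print(parse_exp(tokenize("1+2+3"), right_assoc=True))
```

In the first tree `*` sits below `+`; the second and third trees differ only in which way the `+` nodes lean.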

Backus-Naur Form(BNF)
I BNF: a kind of CFG.
I First used in Algol60 report. Many extensions since, but all similar and most
have the power of context-free grammars.
I Has four parts: (i) terminals (atomic symbols), (ii) non-terminals (representing
constructs), called syntactic categories, (iii) productions and (iv) a starting
nonterminal.
I Each nonterminal denotes a set of strings. Set of strings associated with
starting nonterminal represents language.
I BNF uses following notations:
(i) Non-terminals enclosed in < and >.
(ii) Rules written as
X ::= Y
(a) X is LHS of rule and can only be a NT.
(b) Y can be a string, which is a terminal, nonterminal, or concatenation of terminal
and nonterminals, or a set of strings separated by alternation symbol |.
I Example: Terminals: A, B, · · · Z; 0, 1, · · · 9
Nonterminals: <id>, <rest>, <alpha>, <alphanum>, <digit>
Starting NT: <id>
Productions/rules:
<id> ::= <alpha> | <alpha><rest>
<rest> ::= <rest><alphanum> | <alphanum>
<alphanum> ::= <alpha> | <digit>
<alpha> ::= A | B | ··· | Z
<digit> ::= 0 | 1 | ··· | 9
Extended BNF (EBNF)
I Extend BNF by adding more meta-notation =⇒ shorter productions
I Nonterminals begin with uppercase letters (discard <>)
I Terminals that are also meta-symbols (’[’ for instance) are enclosed in quotes ‘ ’.
I Repetitions (zero or more) are enclosed in {}
I Options are enclosed in []
I Use () to group items together:
Exp ::= Item {+ Item} | Item {- Item}
=⇒
Exp ::= Item {(+|-) Item}
Conversion from EBNF to BNF and Vice Versa
I BNF to EBNF:
i) Look for recursion in grammar:
A ::= a A | B =⇒ A ::= { a } B
ii) Look for common string that can be factored out with grouping and options.
A ::= a B | a =⇒ A ::= a [B]
I EBNF to BNF:
i) Options []:
A ::= a [B] C =⇒
A ::= a N C
N ::= B | ε
ii) Repetition {}:
A ::= a { B1 B2 ... Bn } C =⇒
A ::= a N C
N ::= B1 B2 ... Bn N | ε
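
The two EBNF-to-BNF rules above can be applied mechanically. In the sketch below an EBNF right-hand side is a list of items, where an item is a symbol, `("opt", items)` for [ ... ], or `("rep", items)` for { ... }. This encoding, the function `ebnf_to_bnf`, and the generated names N1, N2, ... are assumptions for illustration; the names are assumed not to clash with existing nonterminals.

```python
def ebnf_to_bnf(rules):
    """rules: dict name -> list of alternatives (lists of items).
    Returns plain BNF: dict name -> list of symbol lists, [] meaning ε."""
    bnf, counter = {}, [0]

    def fresh():
        counter[0] += 1
        return f"N{counter[0]}"

    def convert(items):
        out = []
        for item in items:
            if isinstance(item, str):       # ordinary symbol: copy through
                out.append(item)
                continue
            kind, body = item
            name = fresh()
            inner = convert(body)           # nested options/repetitions
            if kind == "opt":               # N ::= body | ε
                bnf[name] = [inner, []]
            else:                           # N ::= body N | ε
                bnf[name] = [inner + [name], []]
            out.append(name)
        return out

    for lhs, alts in rules.items():
        bnf[lhs] = [convert(alt) for alt in alts]
    return bnf

print(ebnf_to_bnf({"A": [["a", ("opt", ["B"]), "C"]]}))
print(ebnf_to_bnf({"A": [["a", ("rep", ["B1", "B2"]), "C"]]}))
```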
