Chapter 4 - Syntax Analysis Part 1

Uploaded by

Abdullah


Syntax Analysis
Part I
Chapter 4

COP5621 Compiler Construction

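Position of a Parser in the Compiler Model
[Figure: Source program → Lexical analyzer → (token, tokenval) → Parser and rest of front-end → Intermediate representation. The parser requests each token from the lexical analyzer ("get next token"). The lexical analyzer reports lexical errors; the parser and the rest of the front-end report syntax errors and semantic errors. All components access the symbol table.]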

The Parser
• A parser implements a context-free (C-F) grammar as a
recognizer of strings
• The role of the parser in a compiler is twofold:
1. To check syntax (= string recognizer)
    • And to report syntax errors accurately
2. To invoke semantic actions
    • For static semantics checking, e.g. type checking of
    expressions, functions, etc.
    • For syntax-directed translation of the source code to an
    intermediate representation

Syntax-Directed Translation
• One of the major roles of the parser is to produce
an intermediate representation (IR) of the source
program using syntax-directed translation
methods
• Possible IR output:
– Abstract syntax trees (ASTs)
– Control-flow graphs (CFGs) with triples, three-address
code, or register transfer list notation
– WHIRL (SGI Pro64 compiler) has 5 IR levels!

Error Handling
• A good compiler should assist in identifying and
locating errors
– Lexical errors: important, compiler can easily recover
and continue
– Syntax errors: most important for compiler, can almost
always recover
– Static semantic errors: important, can sometimes
recover
– Dynamic semantic errors: hard or impossible to detect
at compile time, runtime checks are required
– Logical errors: hard or impossible to detect

Viable-Prefix Property
• The viable-prefix property of parsers allows
early detection of syntax errors
– Goal: detection of an error as soon as possible
without further consuming unnecessary input
– How: detect an error as soon as the prefix of the
input does not match a prefix of any string in
the language
[Figure: two erroneous inputs, for (;) and DO 10 I = 1;0, each shown as a valid prefix followed by the point at which the error is detected.]

Error Recovery Strategies

• Panic mode
– Discard input until a token in a set of designated
synchronizing tokens is found
• Phrase-level recovery
– Perform local correction on the input to repair the error
• Error productions
– Augment grammar with productions for erroneous
constructs
• Global correction
– Choose a minimal sequence of changes to obtain a
global least-cost correction

Grammars (Recap)
• Context-free grammar is a 4-tuple
G = (N, T, P, S) where
– T is a finite set of tokens (terminal symbols)
– N is a finite set of nonterminals
– P is a finite set of productions of the form
α → β
where α ∈ (N∪T)* N (N∪T)* and β ∈ (N∪T)*
– S ∈ N is a designated start symbol

Notational Conventions Used

• Terminals
a,b,c,… ∈ T
specific terminals: 0, 1, id, +
• Nonterminals
A,B,C,… ∈ N
specific nonterminals: expr, term, stmt
• Grammar symbols
X,Y,Z ∈ (N∪T)
• Strings of terminals
u,v,w,x,y,z ∈ T*
• Strings of grammar symbols
α,β,γ ∈ (N∪T)*

Derivations (Recap)
• The one-step derivation is defined by
α A β ⇒ α γ β
where A → γ is a production in the grammar
• In addition, we define
– ⇒ is leftmost (⇒lm) if α does not contain a nonterminal
– ⇒ is rightmost (⇒rm) if β does not contain a nonterminal
– Reflexive-transitive closure ⇒* (zero or more steps)
– Positive closure ⇒+ (one or more steps)
• The language generated by G is defined by
L(G) = {w ∈ T* | S ⇒+ w}

Derivation (Example)
Grammar G = ({E}, {+,*,(,),-,id}, P, E) with
productions P =
E → E + E
E → E * E
E → ( E )
E → - E
E → id
Example derivations:
E ⇒ - E ⇒ - id
E ⇒rm E + E ⇒rm E + id ⇒rm id + id
E ⇒* E
E ⇒* id + id
E ⇒+ id * id + id

Chomsky Hierarchy: Language Classification
• A grammar G is said to be
– Regular if it is right linear where each production is of
the form
A → w B or A → w
or left linear where each production is of the form
A → B w or A → w
– Context free if each production is of the form
A → α
where A ∈ N and α ∈ (N∪T)*
– Context sensitive if each production is of the form
α A β → α γ β
where A ∈ N, α,γ,β ∈ (N∪T)*, |γ| > 0
– Unrestricted

Chomsky Hierarchy

L(regular) ⊂ L(context free) ⊂ L(context sensitive) ⊂ L(unrestricted)

Where L(T) = { L(G) | G is of type T }

That is: the set of all languages
generated by grammars G of type T

Examples:
Every finite language is regular! (construct a FSA for strings in L(G))
L1 = { a^n b^n | n ≥ 1 } is context free
L2 = { a^n b^n c^n | n ≥ 1 } is context sensitive

Parsing
• Universal (any context-free grammar)
– Cocke-Younger-Kasami
– Earley
• Top-down (context-free grammar with restrictions)
– Recursive descent (predictive parsing)
– LL (Left-to-right, Leftmost derivation) methods
• Bottom-up (context-free grammar with restrictions)
– Operator precedence parsing
– LR (Left-to-right, Rightmost derivation) methods
    • SLR, canonical LR, LALR

Top-Down Parsing
• LL methods (Left-to-right, Leftmost
derivation) and recursive-descent parsing
Grammar:
E → T + T
T → ( E )
T → - E
T → id
Leftmost derivation:
E ⇒lm T + T
  ⇒lm id + T
  ⇒lm id + id
[Figure: four parse-tree snapshots, one per derivation step, growing from the root E to the complete tree for id + id.]
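
The leftmost derivation above can be replayed mechanically. The following is a minimal sketch, assuming sentential forms are Python lists of symbols; the helper names leftmost_step and derive are hypothetical and not part of the slides.

```python
# Nonterminals of the slide grammar E -> T + T, T -> ( E ) | - E | id.
NONTERMINALS = {"E", "T"}

def leftmost_step(form, lhs, rhs):
    """Rewrite the leftmost nonterminal of a sentential form using lhs -> rhs."""
    for i, symbol in enumerate(form):
        if symbol in NONTERMINALS:
            if symbol != lhs:
                raise ValueError(f"leftmost nonterminal is {symbol}, not {lhs}")
            return form[:i] + rhs + form[i + 1:]
    raise ValueError("sentential form contains no nonterminal")

def derive(start, steps):
    """Apply productions leftmost-first and record every sentential form."""
    forms = [[start]]
    for lhs, rhs in steps:
        forms.append(leftmost_step(forms[-1], lhs, rhs))
    return forms

forms = derive("E", [("E", ["T", "+", "T"]), ("T", ["id"]), ("T", ["id"])])
print(" =>lm ".join(" ".join(form) for form in forms))
```

Each call rewrites only the first nonterminal it finds, which is exactly the restriction that distinguishes ⇒lm from an arbitrary derivation step.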

Left Recursion (Recap)

• Productions of the form
A → A α
| β
| γ
are left recursive
• When one of the productions in a grammar
is left recursive then a predictive parser
loops forever on certain inputs

A General Systematic Left Recursion Elimination Method
Input: Grammar G with no cycles or ε-productions
Arrange the nonterminals in some order A1, A2, …, An
for i = 1, …, n do
    for j = 1, …, i-1 do
        replace each
            Ai → Aj γ
        with
            Ai → δ1 γ | δ2 γ | … | δk γ
        where
            Aj → δ1 | δ2 | … | δk
    enddo
    eliminate the immediate left recursion in Ai
enddo

Immediate Left-Recursion Elimination
Rewrite every left-recursive production
A → A α
| β
| γ
| A δ
into a right-recursive production:
A → β AR
| γ AR
AR → α AR
| δ AR
| ε
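
The rewrite above can be sketched in a few lines. This is an illustrative sketch, assuming each right-hand side is a tuple of symbols, the empty tuple stands for ε, and the new nonterminal is named by appending "R" (a hypothetical convention that must not clash with an existing name).

```python
def eliminate_immediate(nt, productions):
    """Turn A -> A alpha | beta into A -> beta AR, AR -> alpha AR | epsilon."""
    recursive = [p[1:] for p in productions if p and p[0] == nt]   # the alpha parts
    others = [p for p in productions if not p or p[0] != nt]      # the beta parts
    if not recursive:
        return {nt: list(productions)}       # nothing to rewrite
    helper = nt + "R"                        # assumed to be an unused name
    return {
        nt: [beta + (helper,) for beta in others],
        helper: [alpha + (helper,) for alpha in recursive] + [()],
    }

result = eliminate_immediate(
    "A", [("A", "alpha"), ("beta",), ("gamma",), ("A", "delta")]
)
for lhs, alternatives in result.items():
    print(lhs, "->", " | ".join(" ".join(p) or "epsilon" for p in alternatives))
```

The symbols alpha, beta, gamma and delta are treated here as single placeholder symbols; in a real grammar each would be a sequence of grammar symbols.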

Example Left Recursion Elim.

A → B C | a
B → C A | A b
C → A B | C C | a
Choose arrangement: A, B, C

i = 1: nothing to do
i = 2, j = 1: B → C A | A b
    ⇒ B → C A | B C b | a b
    ⇒(imm) B → C A BR | a b BR
           BR → C b BR | ε
i = 3, j = 1: C → A B | C C | a
    ⇒ C → B C B | a B | C C | a
i = 3, j = 2: C → B C B | a B | C C | a
    ⇒ C → C A BR C B | a b BR C B | a B | C C | a
    ⇒(imm) C → a b BR C B CR | a B CR | a CR
           CR → A BR C B CR | C CR | ε
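
The worked example can be checked with a short program. This is a sketch under the same assumptions as the algorithm (no cycles, no ε-productions in the input); the grammar is a dict from nonterminal to a list of symbol tuples, the empty tuple stands for ε, and the function names are hypothetical.

```python
def eliminate_immediate(nt, productions):
    """Turn A -> A alpha | beta into A -> beta AR, AR -> alpha AR | epsilon."""
    recursive = [p[1:] for p in productions if p and p[0] == nt]
    others = [p for p in productions if not p or p[0] != nt]
    if not recursive:
        return {nt: list(productions)}
    helper = nt + "R"                        # assumed to be an unused name
    return {
        nt: [beta + (helper,) for beta in others],
        helper: [alpha + (helper,) for alpha in recursive] + [()],
    }

def eliminate_left_recursion(grammar, order):
    """General method: substitute earlier nonterminals, then fix Ai itself."""
    g = {a: list(ps) for a, ps in grammar.items()}
    for i, ai in enumerate(order):
        for aj in order[:i]:
            expanded = []
            for p in g[ai]:
                if p and p[0] == aj:         # Ai -> Aj gamma
                    expanded.extend(delta + p[1:] for delta in g[aj])
                else:
                    expanded.append(p)
            g[ai] = expanded
        g.update(eliminate_immediate(ai, g[ai]))
    return g

GRAMMAR = {
    "A": [("B", "C"), ("a",)],
    "B": [("C", "A"), ("A", "b")],
    "C": [("A", "B"), ("C", "C"), ("a",)],
}
result = eliminate_left_recursion(GRAMMAR, ["A", "B", "C"])
for lhs, alternatives in result.items():
    print(lhs, "->", " | ".join(" ".join(p) or "epsilon" for p in alternatives))
```

With the arrangement A, B, C the program produces the same productions for B, BR, C and CR as the hand derivation; a different arrangement generally yields a different but equivalent grammar.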

Left Factoring
• When a nonterminal has two or more productions
whose right-hand sides start with the same
grammar symbols, the grammar is not LL(1) and
cannot be used for predictive parsing
• Replace productions
A → α β1 | α β2 | … | α βn | γ
with
A → α AR | γ
AR → β1 | β2 | … | βn
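
A single left-factoring step can be sketched as follows. The sketch assumes right-hand sides are tuples of symbols with the empty tuple standing for ε, groups alternatives by their first symbol, and names the new nonterminal by appending "R" (hypothetical naming). It performs one step only; the new nonterminal may itself need factoring.

```python
def common_prefix(sequences):
    """Longest prefix shared by all the given symbol tuples."""
    prefix = []
    for symbols in zip(*sequences):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return tuple(prefix)

def left_factor(nt, productions):
    """Merge alternatives of `nt` that start with the same symbol."""
    groups = {}
    for p in productions:
        groups.setdefault(p[:1], []).append(p)   # key is () for epsilon
    result = {nt: []}
    count = 0
    for first, group in groups.items():
        if first == () or len(group) < 2:
            result[nt].extend(group)             # nothing to factor
            continue
        count += 1
        helper = nt + "R" * count                # assumed to be an unused name
        alpha = common_prefix(group)
        result[nt].append(alpha + (helper,))
        result[helper] = [p[len(alpha):] for p in group]
    return result

# Conditional statement: S -> i E t S | i E t S e S | a
factored = left_factor(
    "S", [("i", "E", "t", "S"), ("i", "E", "t", "S", "e", "S"), ("a",)]
)
for lhs, alternatives in factored.items():
    print(lhs, "->", " | ".join(" ".join(p) or "epsilon" for p in alternatives))
```

Here α is i E t S, β1 is ε, β2 is e S, and γ is a.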

Predictive Parsing
• Eliminate left recursion from grammar
• Left factor the grammar
• Compute FIRST and FOLLOW
• Two variants:
– Recursive (recursive-descent parsing)
– Non-recursive (table-driven parsing)

FIRST (Revisited)
• FIRST(α) = { the set of terminals that begin all
strings derived from α }

FIRST(a) = {a} if a ∈ T
FIRST(ε) = {ε}
FIRST(A) = ∪ FIRST(α) over all productions A → α ∈ P
FIRST(X1X2…Xk) =
    if for all j = 1, …, i-1 : ε ∈ FIRST(Xj) then
        add non-ε in FIRST(Xi) to FIRST(X1X2…Xk)
    if for all j = 1, …, k : ε ∈ FIRST(Xj) then
        add ε to FIRST(X1X2…Xk)
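
These rules are usually evaluated by iterating until no set changes. Below is a minimal sketch, assuming a grammar given as a dict of symbol tuples (empty tuple for ε) and the marker string "eps" for ε; the expression grammar with ER and TR is the one used in the later table example, and the function names are hypothetical.

```python
EPS = "eps"   # marker for epsilon; assumed not to be a grammar symbol

GRAMMAR = {
    "E": [("T", "ER")],
    "ER": [("+", "T", "ER"), ()],
    "T": [("F", "TR")],
    "TR": [("*", "F", "TR"), ()],
    "F": [("(", "E", ")"), ("id",)],
}
TERMINALS = {"+", "*", "(", ")", "id"}

def first_of_string(symbols, first):
    """FIRST(X1 X2 ... Xk) from the FIRST sets of the individual symbols."""
    result = set()
    for x in symbols:
        result |= first[x] - {EPS}
        if EPS not in first[x]:
            return result            # Xi cannot vanish, so stop here
    result.add(EPS)                  # every Xi is nullable (or the string is empty)
    return result

def compute_first(grammar, terminals):
    first = {a: {a} for a in terminals}
    first.update({nt: set() for nt in grammar})
    changed = True
    while changed:                   # repeat until a fixed point is reached
        changed = False
        for nt, alternatives in grammar.items():
            for alpha in alternatives:
                new = first_of_string(alpha, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

FIRST = compute_first(GRAMMAR, TERMINALS)
for nt in GRAMMAR:
    print(f"FIRST({nt}) =", sorted(FIRST[nt]))
```

The fixed-point loop is one common way to handle the mutual dependencies between nonterminals; a recursive formulation with memoization would also work for grammars without left recursion.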

FOLLOW
• FOLLOW(A) = { the set of terminals that can
immediately follow nonterminal A }

FOLLOW(A) =
    for all (B → α A β) ∈ P do
        add FIRST(β)\{ε} to FOLLOW(A)
    for all (B → α A β) ∈ P and ε ∈ FIRST(β) do
        add FOLLOW(B) to FOLLOW(A)
    for all (B → α A) ∈ P do
        add FOLLOW(B) to FOLLOW(A)
    if A is the start symbol S then
        add $ to FOLLOW(A)
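
The FOLLOW rules can be applied in the same fixed-point style. This sketch assumes the FIRST sets of the individual symbols are already known (they are written out by hand below for the expression grammar of the later table example), uses "eps" as the ε marker, and treats β = ε the same as a nullable β; all names are hypothetical.

```python
EPS, END = "eps", "$"

GRAMMAR = {
    "E": [("T", "ER")],
    "ER": [("+", "T", "ER"), ()],
    "T": [("F", "TR")],
    "TR": [("*", "F", "TR"), ()],
    "F": [("(", "E", ")"), ("id",)],
}
# FIRST sets of the single symbols, taken as given for this sketch.
FIRST = {
    "E": {"(", "id"}, "ER": {"+", EPS}, "T": {"(", "id"},
    "TR": {"*", EPS}, "F": {"(", "id"},
    "+": {"+"}, "*": {"*"}, "(": {"("}, ")": {")"}, "id": {"id"},
}

def first_of_string(symbols):
    result = set()
    for x in symbols:
        result |= FIRST[x] - {EPS}
        if EPS not in FIRST[x]:
            return result
    return result | {EPS}

def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                       # $ follows the start symbol
    changed = True
    while changed:
        changed = False
        for b, alternatives in grammar.items():
            for rhs in alternatives:
                for i, a in enumerate(rhs):
                    if a not in grammar:         # only nonterminals get FOLLOW sets
                        continue
                    beta = first_of_string(rhs[i + 1:])
                    new = beta - {EPS}
                    if EPS in beta:              # beta is empty or nullable
                        new |= follow[b]
                    if not new <= follow[a]:
                        follow[a] |= new
                        changed = True
    return follow

FOLLOW = compute_follow(GRAMMAR, "E")
for nt in GRAMMAR:
    print(f"FOLLOW({nt}) =", sorted(FOLLOW[nt]))
```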

LL(1) Grammar
• A grammar G is LL(1) if it is not left recursive
and for each collection of productions
A → α1 | α2 | … | αn
for nonterminal A the following holds:

1. FIRST(αi) ∩ FIRST(αj) = ∅ for all i ≠ j

2. if αi ⇒* ε then
    2.a. αj ⇏* ε for all i ≠ j
    2.b. FIRST(αj) ∩ FOLLOW(A) = ∅
         for all i ≠ j

Non-LL(1) Examples
[Table with columns "Grammar" and "Not LL(1) because:"; the rows are missing from the source.]

Recursive-Descent Parsing (Recap)
• Grammar must be LL(1)
• Every nonterminal has one (recursive) procedure
responsible for parsing the nonterminal’s
syntactic category of input tokens
• When a nonterminal has multiple productions,
each production is implemented in a branch of a
selection statement based on input look-ahead
information

Using FIRST and FOLLOW in a Recursive-Descent Parser
Grammar:
expr → term rest
rest → + term rest
     | - term rest
     | ε
term → id

procedure rest();
begin
    if lookahead in FIRST(+ term rest) then
        match('+'); term(); rest()
    else if lookahead in FIRST(- term rest) then
        match('-'); term(); rest()
    else if lookahead in FOLLOW(rest) then
        return
    else error()
end;

where FIRST(+ term rest) = { + }
      FIRST(- term rest) = { - }
      FOLLOW(rest) = { $ }
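
The procedure above translates almost line for line into a runnable parser. This is a minimal sketch, assuming the input is already a list of token names and that "$" is appended as the end marker; the class and function names are hypothetical.

```python
class ParseError(Exception):
    pass

class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]    # "$" marks the end of the input
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, expected):
        if self.lookahead != expected:
            raise ParseError(f"expected {expected!r}, found {self.lookahead!r}")
        self.pos += 1

    def expr(self):                           # expr -> term rest
        self.term()
        self.rest()

    def rest(self):
        if self.lookahead == "+":             # FIRST(+ term rest) = { + }
            self.match("+")
            self.term()
            self.rest()
        elif self.lookahead == "-":           # FIRST(- term rest) = { - }
            self.match("-")
            self.term()
            self.rest()
        elif self.lookahead == "$":           # FOLLOW(rest) = { $ }: rest -> epsilon
            return
        else:
            raise ParseError(f"unexpected {self.lookahead!r}")

    def term(self):                           # term -> id
        self.match("id")

def parse(tokens):
    """Accept the token list or raise ParseError."""
    parser = Parser(tokens)
    parser.expr()
    parser.match("$")

parse(["id", "+", "id", "-", "id"])
print("accepted")
```

Each nonterminal becomes one method, and the branch taken in rest() is selected by the single lookahead token, as the LL(1) condition requires.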

Non-Recursive Predictive Parsing: Table-Driven Parsing
• Given an LL(1) grammar G = (N, T, P, S)
construct a table M[A,a] for A ∈ N, a ∈ T
and use a driver program with a stack
[Figure: the predictive parsing program (driver) reads the input buffer (a + b $), manipulates a stack (X Y Z $, with X on top), consults the parsing table M, and produces the output.]

Constructing an LL(1) Predictive Parsing Table
for each production A → α do
    for each a ∈ FIRST(α) do
        add A → α to M[A,a]
    enddo
    if ε ∈ FIRST(α) then
        for each b ∈ FOLLOW(A) do
            add A → α to M[A,b]
        enddo
    endif
enddo
Mark each undefined entry in M error

Example Table
Grammar:
E → T ER
ER → + T ER | ε
T → F TR
TR → * F TR | ε
F → ( E ) | id

A → α          FIRST(α)    FOLLOW(A)
E → T ER       ( id        $ )
ER → + T ER    +           $ )
ER → ε         ε           $ )
T → F TR       ( id        + $ )
TR → * F TR    *           + $ )
TR → ε         ε           + $ )
F → ( E )      (           * + $ )
F → id         id          * + $ )

Parsing table M:
     id          +              *              (            )         $
E    E → T ER                                  E → T ER
ER               ER → + T ER                                ER → ε    ER → ε
T    T → F TR                                  T → F TR
TR               TR → ε         TR → * F TR                 TR → ε    TR → ε
F    F → id                                    F → ( E )
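
The table can be rebuilt from the FIRST and FOLLOW columns with a direct transcription of the construction algorithm. In this sketch the FIRST(α) and FOLLOW(A) values are copied from the example rather than computed, "eps" marks ε, the empty tuple is an ε right-hand side, and a missing dictionary key plays the role of an error entry; all names are hypothetical.

```python
EPS = "eps"

# (A, alpha, FIRST(alpha)) for each production of the example grammar.
PRODUCTIONS = [
    ("E",  ("T", "ER"),      {"(", "id"}),
    ("ER", ("+", "T", "ER"), {"+"}),
    ("ER", (),               {EPS}),
    ("T",  ("F", "TR"),      {"(", "id"}),
    ("TR", ("*", "F", "TR"), {"*"}),
    ("TR", (),               {EPS}),
    ("F",  ("(", "E", ")"),  {"("}),
    ("F",  ("id",),          {"id"}),
]
FOLLOW = {
    "E": {"$", ")"}, "ER": {"$", ")"},
    "T": {"+", "$", ")"}, "TR": {"+", "$", ")"},
    "F": {"*", "+", "$", ")"},
}

def build_table(productions, follow):
    """Map (A, a) to the list of right-hand sides entered in M[A, a]."""
    table = {}
    for lhs, rhs, first_alpha in productions:
        targets = first_alpha - {EPS}
        if EPS in first_alpha:               # alpha can vanish: use FOLLOW(A)
            targets |= follow[lhs]
        for a in targets:
            table.setdefault((lhs, a), []).append(rhs)
    return table

M = build_table(PRODUCTIONS, FOLLOW)
for (nonterminal, terminal), entries in sorted(M.items()):
    for rhs in entries:
        print(f"M[{nonterminal}, {terminal}] = {nonterminal} -> {' '.join(rhs) or 'epsilon'}")
```

Keeping a list per cell, rather than a single production, makes it easy to notice afterwards whether any cell received more than one entry.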

LL(1) Grammars are Unambiguous
Ambiguous grammar:
S → i E t S SR | a
SR → e S | ε
E → b

A → α             FIRST(α)    FOLLOW(A)
S → i E t S SR    i           e $
S → a             a           e $
SR → e S          e           e $
SR → ε            ε           e $
E → b             b           t

Parsing table M:
     a       b       e                   i                 t   $
S    S → a                               S → i E t S SR
SR                   SR → ε, SR → e S                          SR → ε
E            E → b

Error: duplicate table entry in M[SR, e]
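
The duplicate entry can be found automatically by building the table and reporting every cell that holds more than one production. As a sketch, the FIRST(α) and FOLLOW(A) values below are copied from the table above, "eps" marks ε, and the function name ll1_conflicts is hypothetical.

```python
EPS = "eps"

# (A, alpha, FIRST(alpha)) for the ambiguous conditional-statement grammar.
PRODUCTIONS = [
    ("S",  ("i", "E", "t", "S", "SR"), {"i"}),
    ("S",  ("a",),                     {"a"}),
    ("SR", ("e", "S"),                 {"e"}),
    ("SR", (),                         {EPS}),
    ("E",  ("b",),                     {"b"}),
]
FOLLOW = {"S": {"e", "$"}, "SR": {"e", "$"}, "E": {"t"}}

def ll1_conflicts(productions, follow):
    """Return the cells of M that would receive more than one production."""
    table = {}
    for lhs, rhs, first_alpha in productions:
        targets = first_alpha - {EPS}
        if EPS in first_alpha:
            targets |= follow[lhs]
        for a in targets:
            table.setdefault((lhs, a), []).append(rhs)
    return {cell: entries for cell, entries in table.items() if len(entries) > 1}

conflicts = ll1_conflicts(PRODUCTIONS, FOLLOW)
for (nonterminal, terminal), entries in conflicts.items():
    shown = ", ".join(f"{nonterminal} -> {' '.join(rhs) or 'epsilon'}" for rhs in entries)
    print(f"M[{nonterminal}, {terminal}] has duplicate entries: {shown}")
```

An empty result means the table has at most one production per cell, which is the practical test for the LL(1) property once FIRST and FOLLOW are known.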

Predictive Parsing Program (Driver)

push($)
push(S)
a := lookahead
repeat
    X := pop()
    if X is a terminal or X = $ then
        match(X) // moves to next token and a := lookahead
    else if M[X,a] = X → Y1Y2…Yk then
        push(Yk, Yk-1, …, Y2, Y1) // such that Y1 is on top
        … invoke actions and/or produce IR output …
    else error()
    endif
until X = $

Example Table-Driven Parsing

Stack            Input        Production applied
$ E              id+id*id$    E → T ER
$ ER T           id+id*id$    T → F TR
$ ER TR F        id+id*id$    F → id
$ ER TR id       id+id*id$
$ ER TR          +id*id$      TR → ε
$ ER             +id*id$      ER → + T ER
$ ER T +         +id*id$
$ ER T           id*id$       T → F TR
$ ER TR F        id*id$       F → id
$ ER TR id       id*id$
$ ER TR          *id$         TR → * F TR
$ ER TR F *      *id$
$ ER TR F        id$          F → id
$ ER TR id       id$
$ ER TR          $            TR → ε
$ ER             $            ER → ε
$                $
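
The trace above can be reproduced with a small driver. This is a sketch, assuming the parsing table of the example grammar is written out by hand as a dictionary (the empty tuple is an ε right-hand side, a missing key is an error entry) and that tokens arrive as a list of names; the function name parse is hypothetical.

```python
TABLE = {
    ("E", "id"): ("T", "ER"), ("E", "("): ("T", "ER"),
    ("ER", "+"): ("+", "T", "ER"), ("ER", ")"): (), ("ER", "$"): (),
    ("T", "id"): ("F", "TR"), ("T", "("): ("F", "TR"),
    ("TR", "+"): (), ("TR", "*"): ("*", "F", "TR"),
    ("TR", ")"): (), ("TR", "$"): (),
    ("F", "id"): ("id",), ("F", "("): ("(", "E", ")"),
}
NONTERMINALS = {"E", "ER", "T", "TR", "F"}

def parse(tokens, start="E"):
    """Return the productions applied, in order, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]
    pos = 0
    stack = ["$", start]
    applied = []
    while stack:
        x = stack.pop()
        a = tokens[pos]                       # current lookahead
        if x not in NONTERMINALS:             # terminal or $
            if x != a:
                raise SyntaxError(f"expected {x!r}, found {a!r}")
            pos += 1
        elif (x, a) in TABLE:
            rhs = TABLE[(x, a)]
            stack.extend(reversed(rhs))       # push Yk ... Y1 so Y1 is on top
            applied.append((x, rhs))
        else:
            raise SyntaxError(f"no table entry for M[{x}, {a}]")
    return applied

for lhs, rhs in parse(["id", "+", "id", "*", "id"]):
    print(lhs, "->", " ".join(rhs) or "epsilon")
```

The list of applied productions is the leftmost derivation of the input; a real front-end would invoke semantic actions at the point where the sketch appends to the list.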

Panic Mode Recovery

Add synchronizing actions to undefined entries based on FOLLOW
Pro: Can be automated
Cons: Error messages are needed

FOLLOW(E) = { ) $ }
FOLLOW(ER) = { ) $ }
FOLLOW(T) = { + ) $ }
FOLLOW(TR) = { + ) $ }
FOLLOW(F) = { + * ) $ }

     id          +              *              (            )         $
E    E → T ER                                  E → T ER     synch     synch
ER               ER → + T ER                                ER → ε    ER → ε
T    T → F TR    synch                         T → F TR     synch     synch
TR               TR → ε         TR → * F TR                 TR → ε    TR → ε
F    F → id      synch          synch          F → ( E )    synch     synch

synch: the driver pops current nonterminal A and skips input till
synch token or skips input until one of FIRST(A) is found
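
One way to realize this in the driver is sketched below. The policy is an assumption, not taken from the slides: on a synch entry the nonterminal is popped, on a blank entry the input token is skipped, and an unmatched terminal on the stack is popped. The table and FOLLOW sets are written out by hand and the function name is hypothetical.

```python
TABLE = {
    ("E", "id"): ("T", "ER"), ("E", "("): ("T", "ER"),
    ("ER", "+"): ("+", "T", "ER"), ("ER", ")"): (), ("ER", "$"): (),
    ("T", "id"): ("F", "TR"), ("T", "("): ("F", "TR"),
    ("TR", "+"): (), ("TR", "*"): ("*", "F", "TR"),
    ("TR", ")"): (), ("TR", "$"): (),
    ("F", "id"): ("id",), ("F", "("): ("(", "E", ")"),
}
FOLLOW = {
    "E": {")", "$"}, "ER": {")", "$"},
    "T": {"+", ")", "$"}, "TR": {"+", ")", "$"},
    "F": {"+", "*", ")", "$"},
}
NONTERMINALS = set(FOLLOW)

def parse_with_recovery(tokens, start="E"):
    """Parse to the end of the input and return the list of error messages."""
    tokens = list(tokens) + ["$"]
    pos, stack, errors = 0, ["$", start], []
    while stack:
        x, a = stack[-1], tokens[pos]
        if x not in NONTERMINALS:                 # terminal or $ on top
            stack.pop()
            if x == a:
                pos += 1
            elif x == "$":
                errors.append(f"unexpected input starting at {a!r}")
            else:
                errors.append(f"missing {x!r} before {a!r}")
        elif (x, a) in TABLE:
            stack.pop()
            stack.extend(reversed(TABLE[(x, a)]))
        elif a in FOLLOW[x] or a == "$":          # synch entry: give up on x
            stack.pop()
            errors.append(f"unexpected {a!r}, abandoned {x}")
        else:                                     # blank entry: skip the token
            errors.append(f"skipped {a!r}")
            pos += 1
    return errors

print(parse_with_recovery(["id", "+", "*", "id"]))
print(parse_with_recovery(["id", "+", "+", "id"]))
```

Because the parser keeps going after each error, a single run can report several independent syntax errors; the messages here are placeholders for the more informative diagnostics a real compiler would need.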

Phrase-Level Recovery
Change input stream by inserting missing tokens
For example: id id is changed into id * id
Pro: Can be fully automated
Cons: Recovery not always intuitive

     id          +              *              (            )         $
E    E → T ER                                  E → T ER     synch     synch
ER               ER → + T ER                                ER → ε    ER → ε
T    T → F TR    synch                         T → F TR     synch     synch
TR   insert *    TR → ε         TR → * F TR                 TR → ε    TR → ε
F    F → id      synch          synch          F → ( E )    synch     synch

insert *: driver inserts missing * and retries the production
(the parser can then continue with TR → * F TR)

Error Productions
Grammar:
E → T ER
ER → + T ER | ε
T → F TR
TR → * F TR | ε
F → ( E ) | id

Add "error production":
TR → F TR
to ignore missing *, e.g.: id id
Pro: Powerful recovery method
Cons: Manual addition of productions

     id          +              *              (            )         $
E    E → T ER                                  E → T ER     synch     synch
ER               ER → + T ER                                ER → ε    ER → ε
T    T → F TR    synch                         T → F TR     synch     synch
TR   TR → F TR   TR → ε         TR → * F TR                 TR → ε    TR → ε
F    F → id      synch          synch          F → ( E )    synch     synch
