Lec 4: Syntax Analysis


Introduction to Parsing

Ambiguity and Syntax Errors


Outline

• Parser overview

• Context-free grammars (CFG’s)

• Derivations

• Ambiguity

• Syntax errors

What is the job of Syntax Analysis?
• Syntax Analysis is also called Parsing or Hierarchical Analysis.
• A parser implements the grammar of a language, be it Java, C, C++, etc.
• The parser obtains a string of tokens from the lexical analyzer and:
  • verifies that the string can be generated by the grammar for the source language
  • reports any syntax errors in the program
  • constructs a parse tree representation of the program
  • usually calls the lexical analyzer to supply a token when necessary
• The grammar that a parser implements is called a Context-Free Grammar (CFG).
The Functionality of the Parser

• Input: sequence of tokens from lexer

• Output: parse tree of the program


Comparison with Lexical Analysis:

  Phase    Input                    Output
  Lexer    Sequence of characters   Sequence of tokens
  Parser   Sequence of tokens       Parse tree
Example

• If-then-else statement
  if (x == y) then z = 1; else z = 2;
• Parser input
  IF (ID == ID) THEN ID = INT; ELSE ID = INT;
• Possible parser output: a parse tree with IF-THEN-ELSE at the root, the == comparison of the two IDs as the condition, and the two assignments (ID = INT) as the then and else branches.
The Role of the Parser
• Not all sequences of tokens are programs ...
• The parser must distinguish between valid and invalid sequences of tokens
• The role is:
  1. To check syntax (= string recognizer)
     and to report syntax errors accurately
  2. To invoke semantic actions
     for static semantics checking, e.g. type checking of expressions, functions, etc.
• We need
  – a language for describing valid sequences of tokens
  – a method for distinguishing valid from invalid sequences
What is the difference between Syntax and Semantics?

• Syntax is the way in which we construct sentences by following principles and rules.

• Semantics is the interpretation of, and meaning derived from, a sentence; in other words, whether the sentence makes logical sense or not.
Context-Free Grammars
• Many programming language constructs have a
recursive structure

• A STMT is of the form
  if COND then STMT else STMT , or
  while COND do STMT , or ...

• Context-free grammars are a natural notation for this recursive structure
CFGs (Cont.)
• A CFG consists of
  – A set of terminals T
  – A set of non-terminals N
  – A start symbol S (a non-terminal)
  – A set of productions

  Assuming X ∈ N, the productions are of the form
    X → ε , or
    X → Y1 Y2 ... Yn   where each Yi ∈ N ∪ T
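As a concrete illustration (not part of the original slides), a CFG can be held in a small data structure. The Python sketch below stores the arithmetic expression grammar used in the example slides; the variable name and dictionary layout are illustrative assumptions, not a prescribed representation.

    # Illustrative sketch: a CFG as plain Python data.
    # Terminals and non-terminals are strings; each production is a list of symbols.
    grammar = {
        "terminals": {"+", "*", "(", ")", "id"},
        "nonterminals": {"E"},
        "start": "E",
        "productions": {
            "E": [["E", "+", "E"], ["E", "*", "E"], ["(", "E", ")"], ["id"]],
        },
    }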
Notational Conventions
• Terminals: lower-case letters, operator symbols, punctuation symbols, digits, and boldface strings are all terminals
• Non-terminals: upper-case letters and lower-case italic names are usually non-terminals
• Greek letters such as α, β, γ represent strings of grammar symbols. Thus a generic production can be written as A → α
• The start symbol is the left-hand side of the first production
Examples of CFGs

A fragment of our example language (simplified):

  STMT → if COND then STMT else STMT
       | while COND do STMT
       | id = int
Examples of CFGs (cont.)
Grammar for simple arithmetic expressions:

  E → E * E
    | E + E
    | ( E )
    | id
The Language of a CFG
Read productions as replacement rules:

  X → Y1 ... Yn
    means X can be replaced by Y1 ... Yn
  X → ε
    means X can be erased (replaced with the empty string)
Key Idea

(1) Begin with a string consisting of the start symbol “S”
(2) Replace any non-terminal X in the string by a right-hand side of some production X → Y1 ... Yn
(3) Repeat (2) until there are no non-terminals in the string
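This procedure can be sketched directly in code. The Python sketch below is an illustration (not from the slides) and assumes the dictionary layout shown after the CFG definition; it always expands the left-most non-terminal, which is the left-most derivation discussed later.

    import random

    def derive(grammar, max_steps=50):
        """Sketch: repeatedly replace the left-most non-terminal until none remain."""
        string = [grammar["start"]]                      # (1) start with the start symbol
        for _ in range(max_steps):
            idx = next((i for i, s in enumerate(string)
                        if s in grammar["nonterminals"]), None)
            if idx is None:                              # (3) no non-terminals left: done
                return string
            rhs = random.choice(grammar["productions"][string[idx]])
            string = string[:idx] + rhs + string[idx + 1:]   # (2) replace X by a right-hand side
        return string                                    # may stop early on a long derivation

    # Example use with a tiny grammar for S -> ( S ) | epsilon:
    parens = {"start": "S", "nonterminals": {"S"},
              "productions": {"S": [["(", "S", ")"], []]}}
    print("".join(derive(parens)))                       # e.g. "(())"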
The Language of a CFG (Cont.)
More formally, we write

  X1 ... Xi ... Xn → X1 ... Xi-1 Y1 ... Ym Xi+1 ... Xn

if there is a production

  Xi → Y1 ... Ym

Write

  X1 ... Xn →* Y1 ... Ym

if X1 ... Xn → ... → Y1 ... Ym in zero or more steps.
The Language of a CFG
Let G be a context-free grammar with start symbol S. Then the language of G is:

  L(G) = { a1 ... an | S →* a1 ... an and every ai is a terminal }

Terminals

• Terminals are called so because there are no rules for replacing them
• Once generated, terminals are permanent
• Terminals ought to be tokens of the language
Examples

L(G) is the language of the CFG G

Strings of balanced parentheses: { (^i )^i | i ≥ 0 }

Two grammars:

  S → ( S )
    | ε
            or
  S → ( S )
  S → ε
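As a hedged illustration (the function name is mine, not the slides'), a recognizer for this language follows the grammar S → ( S ) | ε directly: a string matches if it is empty or is a matching pair of parentheses around a smaller match.

    def matches(s: str) -> bool:
        """Sketch: recognize { '('*i + ')'*i | i >= 0 } via S -> ( S ) | epsilon."""
        if s == "":                                   # S -> epsilon
            return True
        # S -> ( S ): strip one outer matching pair and recurse
        return s.startswith("(") and s.endswith(")") and matches(s[1:-1])

    assert matches("((()))") and not matches("(()")   # quick sanity check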
Example

A fragment of our example language (simplified):

  STMT → if COND then STMT
       | if COND then STMT else STMT
       | while COND do STMT
       | id = int

  COND → ( id == id )
       | ( id != id )
Arithmetic Example

Simple arithmetic expressions:

  E → E + E | E * E | ( E ) | id

Some elements of the language:

  id          id + id        ( id )
  id * id     ( id ) * id    id * ( id )
Derivations and Parse Trees

A derivation is a sequence of productions

  S → ... → ... → ...

A derivation can be drawn as a tree
  – The start symbol is the tree’s root
  – For a production X → Y1 ... Yn, add children Y1 ... Yn to node X
Derivation Example
• Grammar
  E → E + E | E * E | ( E ) | id
• String
  id * id + id
• Derivation:
  E → E + E
    → E * E + E
    → id * E + E
    → id * id + E
    → id * id + id
  (The corresponding parse tree has + at the root, the subtree for id * id as its left child, and id as its right child.)
Notes on Derivations
• A parse tree has
  – Terminals at the leaves
  – Non-terminals at the interior nodes

• An in-order traversal of the leaves is the original input

• The parse tree shows the association of operations; the input string does not
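To make the point about leaves concrete, here is a hedged Python sketch (class and function names are mine) of a parse-tree node and a left-to-right leaf traversal; the tree built is the one for id * id + id from the derivation example.

    class Node:
        """Sketch of a parse-tree node: a label plus an ordered list of children."""
        def __init__(self, label, children=None):
            self.label = label
            self.children = children or []        # leaves have no children

    def leaves(node):
        """Collect leaf labels left to right (the frontier of the tree)."""
        if not node.children:
            return [node.label]
        out = []
        for child in node.children:
            out += leaves(child)
        return out

    # Parse tree for id * id + id, with + at the root:
    tree = Node("E", [
        Node("E", [Node("E", [Node("id")]), Node("*"), Node("E", [Node("id")])]),
        Node("+"),
        Node("E", [Node("id")]),
    ])
    assert leaves(tree) == ["id", "*", "id", "+", "id"]   # the leaves reproduce the input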
Left-most and Right-most Derivations

• What was shown before was a left-most derivation
  – At each step, replace the left-most non-terminal
• There is an equivalent notion of a right-most derivation
  – At each step, replace the right-most non-terminal
• The right-most derivation of id * id + id:
  E → E + E
    → E + id
    → E * E + id
    → E * id + id
    → id * id + id
Right-most Derivation in Detail

The derivation is built one step at a time, and each step also grows the partial parse tree (the production used adds children to the expanded non-terminal):

  E → E + E
    → E + id
Right-most Derivation in Detail (2)

  E → E + E
    → E + id
    → E * E + id
    → E * id + id
    → id * id + id

• Note that right-most and left-most derivations have the same parse tree
• The difference is just in the order in which branches are added
Ambiguity
• Grammar:
  E → E + E | E * E | ( E ) | int

• The string int * int + int has two parse trees: one with + at the root (int * int grouped as its left operand) and one with * at the root (int + int grouped as its right operand).
Ambiguity (Cont.)
• A grammar is ambiguous if it has more than one parse tree for some string
  – Equivalently, there is more than one right-most or left-most derivation for some string
• Ambiguity is bad
  – It leaves the meaning of some programs ill-defined
• Ambiguity is common in programming languages
  – Arithmetic expressions
  – IF-THEN-ELSE
Dealing with Ambiguity

• There are several ways to handle ambiguity

• The most direct method is to rewrite the grammar unambiguously:

  E → T + E | T
  T → int * T | int | ( E )

• This grammar enforces precedence of * over +
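As an illustration of how such an unambiguous grammar can be parsed, here is a hedged recursive-descent sketch in Python for E → T + E | T and T → int * T | int | ( E ). The function names and the token-list interface are assumptions made for this sketch, not something the slides prescribe.

    def parse_E(tokens, i=0):
        """E -> T + E | T. Returns the index just past the parsed expression."""
        i = parse_T(tokens, i)
        if i < len(tokens) and tokens[i] == "+":
            return parse_E(tokens, i + 1)
        return i

    def parse_T(tokens, i):
        """T -> int * T | int | ( E )."""
        if i < len(tokens) and tokens[i] == "int":
            if i + 1 < len(tokens) and tokens[i + 1] == "*":
                return parse_T(tokens, i + 2)
            return i + 1
        if i < len(tokens) and tokens[i] == "(":
            i = parse_E(tokens, i + 1)
            if i < len(tokens) and tokens[i] == ")":
                return i + 1
        raise SyntaxError(f"unexpected token at position {i}")

    # A token string is in the language if parsing consumes every token:
    tokens = ["int", "*", "int", "+", "int"]
    assert parse_E(tokens) == len(tokens)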
The Dangling Else: Example
Consider the following grammar:

  S → if C then S
    | if C then S else S
    | OTHER

• The expression
  if C1 then if C2 then S3 else S4
  has two parse trees: in one, else S4 attaches to the outer if (whose then-branch is if C2 then S3); in the other, else S4 attaches to the inner if (so the outer then-branch is the whole if C2 then S3 else S4).

• Typically we want the second form
The Dangling Else: A Fix
• else should match the closest unmatched then
• We can describe this in the grammar

  S → MIF      /* all then are matched */
    | UIF      /* some then are unmatched */

  MIF → if C then MIF else MIF
      | OTHER

  UIF → if C then S
      | if C then MIF else UIF

• Describes the same set of strings

The Dangling Else: Example Revisited
• The expression if C1 then if C2 then S3 else S4 now has exactly one parse tree: else S4 attaches to the inner if, giving if C2 then S3 else S4 as the then-branch of the outer if. That is the valid parse tree.
• The other tree, with else S4 attached to the outer if, is not valid, because its then expression (if C2 then S3) is not a MIF.
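As a worked check (reconstructed from the MIF/UIF rules above; the slide itself shows only the trees), the expression has a single derivation:

  S → UIF
    → if C1 then S
    → if C1 then MIF
    → if C1 then if C2 then MIF else MIF
    →* if C1 then if C2 then S3 else S4        (S3, S4 derived from OTHER)

The rejected tree would need the outer if to use an else production, whose then-part must be a MIF; but if C2 then S3 has no else and so can only be a UIF, which rules that tree out.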
Ambiguity
• There are no general techniques for handling ambiguity

• It is impossible to automatically convert an ambiguous grammar into an unambiguous one

• Used with care, ambiguity can simplify the grammar
  – It sometimes allows more natural definitions
  – But then we need disambiguation mechanisms
Precedence and Associativity Declarations
• Instead of rewriting the grammar
  – Use the more natural (ambiguous) grammar
  – Along with disambiguating declarations

• Most tools allow precedence and associativity declarations to disambiguate grammars

• Examples follow
Associativity Declarations
• Consider the grammar E → E + E | int
• Ambiguous: there are two parse trees of int + int + int, one grouping the left two ints first (left-associative) and one grouping the right two ints first (right-associative)
• Left associativity declaration: %left +
Precedence Declarations
• Consider the grammar E → E + E | E * E | int
  and the string int + int * int
• Two parse trees: one where int * int is grouped first (as the right operand of +), and one where int + int is grouped first (as the left operand of *)
• Precedence declarations: %left +
                           %left *
  (an operator declared later has higher precedence, so * binds tighter than +)
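The %left and precedence declarations are directives for a parser-generator tool; as a hedged sketch of the effect they produce, the Python precedence-climbing parser below builds left-associative trees and gives * a tighter binding than +. The precedence table and names are illustrative assumptions, not the syntax of any particular tool.

    # Illustrative precedence table: higher number binds tighter; both left-associative.
    PREC = {"+": 1, "*": 2}

    def parse_expr(tokens, i=0, min_prec=1):
        """Precedence climbing: returns (tree, next index)."""
        tree, i = tokens[i], i + 1                    # an operand token such as 'int'
        while i < len(tokens) and tokens[i] in PREC and PREC[tokens[i]] >= min_prec:
            op = tokens[i]
            # for a left-associative operator, the right operand may only use tighter operators
            rhs, i = parse_expr(tokens, i + 1, PREC[op] + 1)
            tree = (op, tree, rhs)
        return tree, i

    tree, _ = parse_expr(["int", "+", "int", "*", "int"])
    assert tree == ("+", "int", ("*", "int", "int"))       # * groups tighter than +

    tree, _ = parse_expr(["int", "+", "int", "+", "int"])
    assert tree == ("+", ("+", "int", "int"), "int")       # + associates to the left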
Error Handling
• The purpose of the compiler is
  – To detect invalid programs
  – To translate the valid ones
• Many kinds of errors are possible (e.g. in C):

  Error kind    Example                     Detected by ...
  Lexical       ... $ ...                   Lexer
  Syntax        ... x *% ...                Parser
  Semantic      ... int x; y = x(3); ...    Type checker
  Correctness   your favorite program       Tester / User
Error Handling
• A good compiler should assist in identifying and locating errors
  ◦ Lexical errors: important; the compiler can easily recover and continue (e.g. a misspelled identifier or keyword)
  ◦ Syntax errors: most important for the compiler; it can almost always recover (e.g. an arithmetic expression with unbalanced parentheses)
  ◦ Static semantic errors: important; the compiler can sometimes recover (e.g. an operator applied to incompatible operands)
  ◦ Dynamic semantic errors: hard or impossible to detect at compile time; runtime checks are required
  ◦ Logical errors: hard or impossible to detect (e.g. infinite recursive calls)
Syntax Error Handling
• The error handler should
  – Report errors accurately and clearly
  – Recover from an error quickly
  – Not slow down compilation of valid code

• Good error handling is not easy to achieve
Approaches to Syntax Error Recovery
• Approaches, from simple to complex:
  – Panic mode
  – Error productions
  – Automatic local or global correction
• Panic mode is the simplest and most popular method
• When an error is detected:
  – Discard tokens until one with a clear role is found
  – Continue from there
• Such tokens are called synchronizing tokens
  – Typically the statement or expression terminators
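A hedged Python sketch of panic-mode recovery follows. The toy statement parser, the token spellings, and the choice of ';' as the synchronizing token are illustrative assumptions; the point is only that after an error the parser discards tokens up to a synchronizing token and then continues.

    SYNC = {";"}                                    # synchronizing tokens: statement terminators

    def parse_statement(tokens, i):
        """Toy parser for one 'id = int ;' statement (the STMT -> id = int fragment)."""
        for want in ["id", "=", "int", ";"]:
            if i >= len(tokens) or tokens[i] != want:
                raise SyntaxError(f"expected {want!r} at position {i}")
            i += 1
        return i

    def parse_program(tokens):
        """Parse statements one after another, using panic-mode recovery on errors."""
        errors, i = [], 0
        while i < len(tokens):
            try:
                i = parse_statement(tokens, i)
            except SyntaxError as err:
                errors.append(str(err))
                while i < len(tokens) and tokens[i] not in SYNC:
                    i += 1                          # panic mode: discard tokens ...
                i += 1                              # ... up to and including the sync token
        return errors

    # One bad statement in the middle; parsing still reaches the third statement.
    toks = ["id", "=", "int", ";", "id", "*", "int", ";", "id", "=", "int", ";"]
    print(parse_program(toks))                      # reports one error and keeps going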
Questions???
