CCWeek 05lecture09

Uploaded by

usairashahbaz152
Compiler Construction

(CS-342)
Lecture # 09

Course Instructor: M. Ramzan Shahid Khan

Department of Computer Science,


Namal University Mianwali
Fall Semester, 2024
Topics
• Syntax Analysis Phase – also known as parsing (performed by the parser)
• Backus–Naur form (BNF)
• Context Free Grammars
• Parse Tree
• Parsing

2
Backus–Naur form (BNF)
• BNF stands for Backus-Naur Form.

• It is used to write a formal representation of a context-free grammar.

• It is also used to describe the syntax of a programming language.

• BNF is basically a notation for writing context-free grammars.

3
Backus–Naur form (BNF)
• In BNF, productions have the form:

• Left side → definition

• In a general grammar, left side ∈ (Vn ∪ Vt)+ and definition ∈ (Vn ∪ Vt)*, where Vn is the set of non-terminals and Vt is the set of terminals.

• In BNF, the left side is restricted to exactly one non-terminal.

• Several productions can be defined with the same left side.

• Their alternative definitions are separated by a vertical bar symbol "|".

4
Backus–Naur form (BNF)
• Consider a grammar with the following productions:
S → aSa
S → bSb
S → c
• In BNF, we can represent the above grammar as follows:
S → aSa | bSb | c
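
The grammar S → aSa | bSb | c can be read almost directly as a recursive membership test, with one branch per alternative. The following is a minimal sketch; the function name `derives` is hypothetical and not part of the lecture.

```python
def derives(s: str) -> bool:
    """Return True if s can be derived from S -> aSa | bSb | c."""
    if s == "c":                      # S -> c
        return True
    if len(s) >= 3 and s[0] == s[-1] and s[0] in "ab":
        return derives(s[1:-1])       # S -> aSa  or  S -> bSb
    return False


print(derives("abcba"))   # True
print(derives("abcab"))   # False
```

Each recursive call strips one matching pair of outer symbols, mirroring one application of S → aSa or S → bSb.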

5
Metalanguages
• A metalanguage is a language used to talk about a language (usually a different one)

• We can use English as its own metalanguage
• E.g., describing English grammar in English

• It is essential to distinguish between the metalanguage terms and the object language terms

6
BNF
• BNF stands for either Backus-Naur Form or Backus Normal Form
• BNF is a metalanguage used to describe the grammar of a
programming language

• BNF is formal and precise


• BNF is a notation for context-free grammars
• BNF is essential in compiler construction
• There are many dialects of BNF in use, but…
• … the differences are almost always minor
7
BNF
• < > indicate a nonterminal that needs to be further expanded, e.g.
<variable>
• Symbols not enclosed in < > are terminals; they represent
themselves, e.g. if, while, (
• The symbol ::= means is defined as
• The symbol | means or; it separates alternatives, e.g.
<addop> ::= + | -
• This is all there is to “plain” BNF; extended BNF (EBNF) is discussed later in this lecture

8
BNF uses recursion
• <integer> ::= <digit> | <integer> <digit>
Or
<integer> ::= <digit> | <digit> <integer>

• Recursion is all that is needed (at least, in a formal sense)


• “Extended BNF” allows repetition as well as recursion
• Repetition is usually better when using BNF to construct a compiler
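
The difference between recursion and repetition can be sketched with two recognizers for <integer>. This is an illustrative sketch only: the function names are hypothetical, and the second function assumes the EBNF-style rule <integer> ::= <digit> { <digit> }, which is one possible way to rewrite the recursive rule with repetition.

```python
DIGITS = "0123456789"


def is_integer_recursive(s: str) -> bool:
    """Plain BNF: <integer> ::= <digit> | <digit> <integer>"""
    if len(s) == 1:
        return s in DIGITS
    return len(s) > 1 and s[0] in DIGITS and is_integer_recursive(s[1:])


def is_integer_repetition(s: str) -> bool:
    """EBNF style (assumed rewrite): <integer> ::= <digit> { <digit> }"""
    if not s or s[0] not in DIGITS:
        return False
    for ch in s[1:]:                  # the { <digit> } part becomes a loop
        if ch not in DIGITS:
            return False
    return True


print(is_integer_recursive("2024"), is_integer_repetition("2024"))   # True True
print(is_integer_recursive("12a"), is_integer_repetition("12a"))     # False False
```

Both functions accept the same strings; the loop version avoids one function call per digit, which hints at why repetition tends to be more convenient when writing a compiler by hand.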

9
BNF – Examples I
• <digit> ::= 0|1|2|3|4|5|6|7|8|9

• <if statement> ::=
if ( <condition> ) <statement>
| if ( <condition> ) <statement> else <statement>

10
BNF – Examples II
• <unsigned integer> ::= <digit> | <unsigned integer> <digit>

• <integer> ::=
<unsigned integer>
| + <unsigned integer>
| - <unsigned integer>

11
BNF – Examples III
• <identifier> ::=
<letter>
| <identifier> <letter>
| <identifier> <digit>

• <block> ::= { <statement list> }

• <statement list> ::=
<statement>
| <statement list> <statement>

12
BNF – Examples IV
• <statement> ::=
<block>
| <assignment statement>
| <break statement>
| <continue statement>
| <do statement>
| <for loop>
| <goto statement>
| <if statement>
| …

13
Extended BNF
• The following are pretty standard:
• [ ] enclose an optional part of the rule
• Example:
<if statement> ::=
if ( <condition> ) <statement> [ else <statement> ]
• { } means the enclosed part can be repeated any number of times (including zero)
• Example:
<parameter list> ::= ( )
| ( { <parameter>, } <parameter> )
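
The { } repetition in the <parameter list> rule maps naturally onto a loop. The sketch below works on a list of tokens and assumes, purely for illustration, that a <parameter> is any token that is a valid identifier; the lecture does not define <parameter>, and the function name is hypothetical.

```python
def is_parameter_list(tokens: list[str]) -> bool:
    """<parameter list> ::= ( ) | ( { <parameter>, } <parameter> )

    Assumption for this sketch: a <parameter> is any token that is a
    valid Python identifier.
    """
    if len(tokens) < 2 or tokens[0] != "(" or tokens[-1] != ")":
        return False
    inner = tokens[1:-1]
    if not inner:                       # first alternative: ( )
        return True
    i = 0
    # { <parameter>, } : zero or more "parameter ," pairs
    while i + 1 < len(inner) and inner[i].isidentifier() and inner[i + 1] == ",":
        i += 2
    # exactly one final <parameter> must remain
    return i == len(inner) - 1 and inner[i].isidentifier()


print(is_parameter_list(["(", ")"]))                  # True
print(is_parameter_list(["(", "x", ",", "y", ")"]))   # True
print(is_parameter_list(["(", "x", ",", ")"]))        # False
```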

14
BNF - Variations
• The preceding notation is the original and most common notation
• BNF was designed before we had
• boldface,
• colour,
• more than one font, etc.
• A typical modern variation might:
• Use boldface to indicate multi-character terminals
• Quote single-character terminals (because boldface isn’t so obvious in this case)
• Example:
• if_statement ::=
if "(" condition ")" statement [ else statement ]

15
Limitations of BNF
• No easy way to impose length limitations, such as:
• Maximum length of variable names
• No easy way to describe ranges, such as 1 to 31
• No way at all to impose distributed requirements, such as:
• A variable must be declared before it is used
• BNF describes only syntax, not semantics
• Nothing clearly better has been devised

16
BNF Or CFG

• The syntax of programming language constructs can be specified by
• context-free grammars OR
• BNF (Backus-Naur Form) notation

17
Context Free Grammar (CFG)
• A CFG is a 4-tuple consisting of:
1. Start Symbol (S)
2. Set of non-terminals (V)
3. Set of terminals (T)
4. Production Rules (P)
• Mathematically 𝐺 = (𝑉, 𝑇, 𝑃, 𝑆)
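
One possible way to hold the 4-tuple G = (V, T, P, S) in a program is sketched below. The class name `Grammar`, its field names, and the `is_context_free` check are assumptions made for illustration; the lecture only gives the mathematical definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset   # V
    terminals: frozenset      # T
    productions: tuple        # P, as (left side, right side) pairs
    start: str                # S

    def is_context_free(self) -> bool:
        """Every left side must be a single non-terminal, and every
        right-side symbol must belong to V or T."""
        symbols = self.nonterminals | self.terminals
        return (self.start in self.nonterminals
                and all(lhs in self.nonterminals
                        and all(sym in symbols for sym in rhs)
                        for lhs, rhs in self.productions))


# The grammar S -> aSa | bSb | c used in the BNF slides.
G = Grammar(
    nonterminals=frozenset({"S"}),
    terminals=frozenset({"a", "b", "c"}),
    productions=(("S", ("a", "S", "a")), ("S", ("b", "S", "b")), ("S", ("c",))),
    start="S",
)
print(G.is_context_free())   # True
```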
18
Context Free Grammar (CFG)
• Terminals: (represented by lowercase letters)
• A terminal is a symbol that does not appear on the left side of any production
• All tokens, such as +, *, -, the lowercase letters a-z, identifiers, etc., are terminals; they cannot be replaced
• Non-Terminals: (represented by capital letters)
• Non-terminal symbols are those symbols that can be replaced
• They appear on the left side of production rules
• The right side of a production can contain any combination of terminals and non-terminals.
19
CFG - Examples
1. Example# 1
𝑆 → 0𝑆1 | 𝜀
This is shorthand for the two productions:
𝑆 → 0𝑆1
𝑆 → 𝜀
The empty string is also written 𝜆:
𝑆 → 0𝑆1
𝑆 → 𝜆
20
CFG - Examples
2. Example# 2
𝑇 →𝑇∗𝐹
𝑇→𝐹
𝐹 → 𝑖𝑑

3. Example# 3
𝑆 → 𝑎𝑆𝑏
𝑆→𝜆
21
Context Free Grammar (CFG) - Examples
1. Example# 1
Grammar:
𝑆 → 𝑎𝐵𝑐
𝐵 → 𝑏
Start symbol: S
V = {S, B}
T = {a, b, c}
P: 2 productions
Input String: abbc
Derivation:
𝑆 ⟹ 𝑎𝐵𝑐
⟹ 𝑎𝑏𝑐
Derivation of the input string is not possible with the given grammar (the only string it derives is abc).
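
Whether a string is derivable from the grammar S → aBc, B → b can also be checked mechanically by trying every production on every sentential form. The breadth-first search below is a minimal sketch of that idea, not an algorithm from the lecture; the names `PRODUCTIONS` and `derivable` are hypothetical.

```python
from collections import deque

# Productions of Example 1; uppercase letters are non-terminals.
PRODUCTIONS = {"S": ["aBc"], "B": ["b"]}


def derivable(target: str, start: str = "S") -> bool:
    """Breadth-first search over sentential forms.

    Forms longer than the target are pruned, which is safe here because
    no production of this grammar shrinks a sentential form.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if form == target:
            return True
        for i, sym in enumerate(form):
            for rhs in PRODUCTIONS.get(sym, []):
                new_form = form[:i] + rhs + form[i + 1:]
                if len(new_form) <= len(target) and new_form not in seen:
                    seen.add(new_form)
                    queue.append(new_form)
    return False


print(derivable("abc"))    # True
print(derivable("abbc"))   # False
```

The search confirms that abc is derivable while abbc is not, since the grammar offers no way to produce a second b.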
22
Context Free Grammar (CFG) - Examples
2. Example# 2
Grammar:
𝑆 → 𝑎𝑆𝑏
𝑆 → 𝜆
Input String: aabb
Derivation:
𝑆 ⟹ 𝑎𝑆𝑏
⟹ 𝑎𝑎𝑆𝑏𝑏
⟹ 𝑎𝑎𝜆𝑏𝑏
⟹ 𝑎𝑎𝑏𝑏
Derivation of the input string is possible with the given grammar.
23
Context Free Grammar (CFG) – Parse Tree
How the Syntax Analysis Phase Works:
Derivation:
𝑆 ⟹ 𝑎𝑆𝑏
⟹ 𝑎𝑎𝑆𝑏𝑏
⟹ 𝑎𝑎𝜆𝑏𝑏
⟹ 𝑎𝑎𝑏𝑏
Parse tree:
      S
    / | \
   a  S  b
    / | \
   a  S  b
      |
      λ
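
A parse tree for the grammar S → aSb | λ can be represented in code as a nested structure. The sketch below is one possible encoding, with trees stored as nested tuples; the function name `parse` and the tuple layout are assumptions for illustration.

```python
def parse(s: str):
    """Return a parse tree for s under S -> aSb | λ, or None.

    A tree is a nested tuple: ("S", "a", subtree, "b") or ("S", "λ").
    """
    if s == "":
        return ("S", "λ")                    # S -> λ
    if len(s) >= 2 and s[0] == "a" and s[-1] == "b":
        inner = parse(s[1:-1])               # S -> aSb
        if inner is not None:
            return ("S", "a", inner, "b")
    return None


print(parse("aabb"))
# ('S', 'a', ('S', 'a', ('S', 'λ'), 'b'), 'b')
print(parse("abb"))    # None
```

The nesting of the tuples follows the tree for aabb: each level corresponds to one application of S → aSb, and the innermost node is S → λ.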
24
Context Free Grammar (CFG) - Examples
3. Construct CFG for the Language having any number of a’s.
𝐿 = {𝜆, 𝑎, 𝑎𝑎, 𝑎𝑎𝑎, … }
Grammar:
𝑆 → 𝑎𝑆
𝑆 → 𝜆
Input String: λ
Derivation:
𝑆 ⟹ 𝜆
Derivation of the input string is possible with the given grammar.
25
Context Free Grammar (CFG) - Examples
3. Construct CFG for the Language having any number of a’s.
𝐿 = {𝜆, 𝑎, 𝑎𝑎, 𝑎𝑎𝑎, … }
Grammar:
𝑆 → 𝑎𝑆
𝑆 → 𝜆
Input String: a
Derivation:
𝑆 ⟹ 𝑎𝑆
⟹ 𝑎𝜆
⟹ 𝑎
Derivation of the input string is possible with the given grammar.
26
Context Free Grammar (CFG) - Examples
3. Construct CFG for the Language having any number of a’s.
𝐿 = {𝜆, 𝑎, 𝑎𝑎, 𝑎𝑎𝑎, … }
Grammar:
𝑆 → 𝑎𝑆
𝑆 → 𝜆
Input String: aa
Derivation:
𝑆 ⟹ 𝑎𝑆
⟹ 𝑎𝑎𝑆
⟹ 𝑎𝑎𝜆
⟹ 𝑎𝑎
Derivation of the input string is possible with the given grammar.
27
Context Free Grammar (CFG) - Examples
4. Construct CFG for the Language L.
𝐿 = { 𝑎^𝑛 𝑏^(2𝑛) }, where 𝑛 ≥ 1
𝐿 = {𝑎𝑏𝑏, 𝑎𝑎𝑏𝑏𝑏𝑏, … }
Grammar:
𝑆 → 𝑎𝑆𝑏𝑏 | 𝑎𝑏𝑏
Input String: aabbbb
Derivation:
𝑆 ⟹ 𝑎𝑆𝑏𝑏
⟹ 𝑎𝑎𝑏𝑏𝑏𝑏
Derivation of the input string is possible with the given grammar.
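
For the grammar S → aSbb | abb, the derivation of any string a^n b^(2n) follows a fixed pattern: apply S → aSbb (n − 1) times, then S → abb once. The sketch below generates those sentential forms; the function name `derivation` is hypothetical.

```python
def derivation(n: int) -> list[str]:
    """Sentential forms deriving a^n b^(2n), n >= 1, using
    S -> aSbb (n - 1 times) followed by S -> abb (once)."""
    if n < 1:
        raise ValueError("the language requires n >= 1")
    forms = ["S"]
    for _ in range(n - 1):
        forms.append(forms[-1].replace("S", "aSbb"))
    forms.append(forms[-1].replace("S", "abb"))
    return forms


print(" => ".join(derivation(2)))   # S => aSbb => aabbbb
```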
28
Context Free Grammar (CFG) – Parse Tree
How the Syntax Analysis Phase Works:
Derivation:
𝑆 ⟹ 𝑎𝑆𝑏𝑏
⟹ 𝑎𝑎𝑏𝑏𝑏𝑏
Parse tree:
       S
    / | | \
   a  S  b  b
     /|\
    a b b

29
Syntax Analysis – Example
How Parse Tree is Generated?

sum = new_value + old_value * x;

30
Syntax Analysis – Example
sum = new_value + old_value * x;   (input to the Scanner)
Scanner
Stream of Tokens: id = id + id * id ;   (input to the Parser)
Grammar (also input to the Parser):
𝑆 → 𝑖𝑑 = 𝐸;
𝐸 → 𝐸 + 𝑇 | 𝑇
𝑇 → 𝑇 ∗ 𝐹 | 𝐹
𝐹 → 𝑖𝑑
Parser
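
The scanner step can be sketched with a small regular-expression tokenizer that turns the source statement into the token stream consumed by the parser. This is an illustration under stated assumptions: identifiers are mapped to the token `id`, the symbols `=`, `+`, `*` and `;` stand for themselves, and the names `TOKEN_RE` and `scan` are hypothetical.

```python
import re

# Assumed token classes for this sketch: identifiers become "id";
# the operators and ';' stand for themselves.
TOKEN_RE = re.compile(r"\s*(?:([A-Za-z_]\w*)|([=+*;]))")


def scan(source: str) -> list[str]:
    tokens = []
    pos = 0
    source = source.rstrip()
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"cannot scan input at position {pos}")
        tokens.append("id" if m.group(1) else m.group(2))
        pos = m.end()
    return tokens


print(" ".join(scan("sum = new_value + old_value * x;")))
# id = id + id * id ;
```

The trailing `;` is emitted as its own token because the production 𝑆 → 𝑖𝑑 = 𝐸; requires it.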

31
sum = new_value + old_value * x;   (input to the Scanner)
Scanner
Stream of Tokens: id = id + id * id ;   (input to the Parser)
Grammar (also input to the Parser):
𝑆 → 𝑖𝑑 = 𝐸;
𝐸 → 𝐸 + 𝑇 | 𝑇
𝑇 → 𝑇 ∗ 𝐹 | 𝐹
𝐹 → 𝑖𝑑
Parser
Derivation found by the Parser:
𝑆 ⟹ 𝑖𝑑 = 𝐸;
⟹ 𝑖𝑑 = 𝐸 + 𝑇;
⟹ 𝑖𝑑 = 𝑇 + 𝑇;
⟹ 𝑖𝑑 = 𝐹 + 𝑇;
⟹ 𝑖𝑑 = 𝐹 + 𝑇 ∗ 𝐹;
⟹ 𝑖𝑑 = 𝑖𝑑 + 𝑇 ∗ 𝐹;
⟹ 𝑖𝑑 = 𝑖𝑑 + 𝐹 ∗ 𝐹;
⟹ 𝑖𝑑 = 𝑖𝑑 + 𝑖𝑑 ∗ 𝐹;
⟹ 𝑖𝑑 = 𝑖𝑑 + 𝑖𝑑 ∗ 𝑖𝑑;
32
Parse Tree
Derivation:
𝑆 ⟹ 𝑖𝑑 = 𝐸;
⟹ 𝑖𝑑 = 𝐸 + 𝑇;
⟹ 𝑖𝑑 = 𝑇 + 𝑇;
⟹ 𝑖𝑑 = 𝐹 + 𝑇;
⟹ 𝑖𝑑 = 𝐹 + 𝑇 ∗ 𝐹;
⟹ 𝑖𝑑 = 𝑖𝑑 + 𝑇 ∗ 𝐹;
⟹ 𝑖𝑑 = 𝑖𝑑 + 𝐹 ∗ 𝐹;
⟹ 𝑖𝑑 = 𝑖𝑑 + 𝑖𝑑 ∗ 𝐹;
⟹ 𝑖𝑑 = 𝑖𝑑 + 𝑖𝑑 ∗ 𝑖𝑑;
Parse tree:
           S
    /   /  |   \
  id   =   E     ;
         / | \
        E  +  T
        |   / | \
        T  T  *  F
        |  |     |
        F  F     id
        |  |
        id id
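
The parse tree for id = id + id * id ; can also be built by a small hand-written parser. The sketch below is one possible approach, not the method prescribed by the lecture: the left-recursive rules 𝐸 → 𝐸 + 𝑇 and 𝑇 → 𝑇 ∗ 𝐹 are handled with loops that build left-nested subtrees, trees are stored as nested tuples, and all function names are hypothetical.

```python
def parse_statement(tokens: list[str]):
    """Parse tokens with S -> id = E ;  E -> E + T | T
    T -> T * F | F   F -> id, returning a nested-tuple parse tree."""
    pos = 0

    def expect(tok: str) -> str:
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != tok:
            raise SyntaxError(f"expected {tok!r} at token {pos}")
        pos += 1
        return tok

    def factor():                       # F -> id
        return ("F", expect("id"))

    def term():                         # T -> T * F | F
        node = ("T", factor())
        while pos < len(tokens) and tokens[pos] == "*":
            node = ("T", node, expect("*"), factor())
        return node

    def expr():                         # E -> E + T | T
        node = ("E", term())
        while pos < len(tokens) and tokens[pos] == "+":
            node = ("E", node, expect("+"), term())
        return node

    tree = ("S", expect("id"), expect("="), expr(), expect(";"))
    if pos != len(tokens):
        raise SyntaxError("unexpected tokens after ';'")
    return tree


tree = parse_statement("id = id + id * id ;".split())
print(tree)
```

In the resulting tree the multiplication sits below the addition, so old_value * x is grouped before it is added to new_value; this is the precedence that the layering of 𝐸, 𝑇 and 𝐹 in the grammar encodes.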
33
Parsing
• Parsing is the discovery of a derivation for a given string from the grammar.

34
