Compiler Construction Week 04 (Syntax Analysis I)

Uploaded by

malinazaket
Copyright
© All Rights Reserved

Compiler Construction

Week 04
Syntax Directed Translator

MUHAMMAD SHAHID KHAN
LECTURER
DEPARTMENT OF COMPUTING & TECHNOLOGY
IQRA UNIVERSITY ISLAMABAD
Layout
01 Syntax Directed Translator
02 What is Context Free Grammar?
03 Basic Terminologies of CFG
04 Derivation & Parse Tree
05 Ambiguity

Syntax Directed Translator
• This section illustrates compiling techniques by developing a program that translates representative programming language statements into three-address code, an intermediate representation.
• We will focus on the front end of a compiler:
  • Lexical analysis
  • Parsing
  • Intermediate code generation

Three address code
• Three-address code has the general form x = y op z
• Example: X = (a*b) + (c*d)
• Converted to three-address code, this becomes
  t1 = a*b
  t2 = c*d
  X = t1 + t2
• Various statements in three-address code
  • Assignment
    x = y op z    e.g. x = y + z
    x = op y      e.g. x = +y
    x = y

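The conversion above can be sketched in Python. This is only an illustration under stated assumptions: expressions are given as nested tuples (op, left, right), and the names to_three_address and translate_assignment are hypothetical, not part of the lecture.

```python
import itertools

def to_three_address(expr, code, temps):
    """Emit statements for a nested tuple (op, left, right).

    Leaves are plain strings (variable names).
    Returns the name that holds the value of expr.
    """
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    l = to_three_address(left, code, temps)
    r = to_three_address(right, code, temps)
    t = f"t{next(temps)}"          # fresh temporary: t1, t2, ...
    code.append(f"{t} = {l} {op} {r}")
    return t

def translate_assignment(target, expr):
    """Translate target = expr, where expr is a tuple (op, left, right)."""
    code = []
    temps = itertools.count(1)
    op, left, right = expr
    l = to_three_address(left, code, temps)
    r = to_three_address(right, code, temps)
    # The outermost operation writes directly into the target.
    code.append(f"{target} = {l} {op} {r}")
    return code

# X = (a*b) + (c*d)
for line in translate_assignment("X", ("+", ("*", "a", "b"), ("*", "c", "d"))):
    print(line)
```

Each statement has at most one operator on the right side, so nested subexpressions are first stored in temporaries.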
Introduction
• Analysis is organized around the "syntax" of the language to be compiled.
• The syntax of a programming language describes the proper form of its programs.
• The semantics of the language defines what its programs mean.
• Context-free grammars are used to specify syntax.
• This notation is also known as BNF (Backus-Naur Form).

Introduction
• We start with a syntax-directed translation of an infix expression to postfix form.
• Infix form: 9-5+2
• Postfix form: 95-2+

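A minimal Python sketch of this translation, assuming the input contains only single digits separated by + or -. The function name infix_to_postfix is illustrative and not from the lecture.

```python
def infix_to_postfix(text):
    """Translate digits separated by + or - into postfix, left to right."""
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0

    def digit():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError(f"digit expected at position {pos}")
        pos += 1
        return tokens[pos - 1]

    out = [digit()]
    while pos < len(tokens):
        op = tokens[pos]
        if op not in "+-":
            raise SyntaxError(f"operator expected at position {pos}")
        pos += 1
        out.append(digit())
        out.append(op)  # the operator is emitted after its right operand
    return "".join(out)

print(infix_to_postfix("9-5+2"))  # 95-2+
```

The operator is written out only after both of its operands, which is exactly what postfix form requires.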
Syntax Definition
• A context-free grammar is used to specify the syntax of a language.
• For short, we simply call it a "grammar".
• A grammar describes the hierarchical structure of most programming language constructs.
• Example: an if-else statement has the form
  if (expression) statement else statement

Syntax Definition
• This rule can be expressed as a production, using the variable expr to denote an expression and the variable stmt to denote a statement:
  stmt → if (expr) stmt else stmt
• In a production, lexical elements such as the keywords if and else and the parentheses are called terminals.
• Variables such as expr and stmt represent sequences of terminals and are called nonterminals.

What is Context Free Grammar?
Grammar
• A context-free grammar has four components:
  • A set of tokens (terminal symbols)
  • A set of non-terminals
  • A set of productions
  • A designated start symbol
• The following example illustrates these components.

Grammar
• Expressions...
• 9-5+2, 5-4, 8...
• Since a plus or minus sign must appear between two digits, we refer to such expressions as lists of digits separated by plus or minus signs.
• The productions are
  list → list + digit    (P-1)
  list → list - digit    (P-2)
  list → digit    (P-3)
  digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9    (P-4)

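The four components of this grammar can be written down as plain data. The sketch below is one possible Python encoding; the dictionary layout and the helper check_grammar are assumptions made for illustration, not a standard representation.

```python
GRAMMAR = {
    "terminals": set("+-0123456789"),
    "nonterminals": {"list", "digit"},
    "start": "list",
    "productions": {
        # head -> list of alternative bodies
        "list": [["list", "+", "digit"], ["list", "-", "digit"], ["digit"]],
        "digit": [[d] for d in "0123456789"],
    },
}

def check_grammar(g):
    """Return True if the start symbol and every production use only declared symbols."""
    symbols = g["terminals"] | g["nonterminals"]
    if g["start"] not in g["nonterminals"]:
        return False
    for head, bodies in g["productions"].items():
        if head not in g["nonterminals"]:
            return False
        for body in bodies:
            if any(sym not in symbols for sym in body):
                return False
    return True

print(check_grammar(GRAMMAR))  # True
```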
Context Free Grammar
• Why do we need CFGs?
• Many languages are not regular, for example strings of balanced parentheses such as ((((…))))
• There is no regular expression for this language.
• A finite automaton may repeat states; however, it cannot remember the number of times it has been in a particular state.

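To see why counting matters, the following Python sketch checks balanced parentheses with an integer counter. The counter can grow without bound, which is the kind of memory a finite automaton does not have. The function name is_balanced is illustrative.

```python
def is_balanced(text):
    """Check a string of parentheses using an unbounded depth counter."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:      # a ')' appeared before its matching '('
                return False
        else:
            return False       # only parentheses are allowed in this sketch
    return depth == 0          # every '(' must have been closed

print(is_balanced("(((())))"))  # True
print(is_balanced("((()"))      # False
```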
Context Free Grammar
• A context-free grammar is used to specify the syntax of a language.
• A grammar naturally describes hierarchical structure, e.g.
  stmt → if (expr) stmt else stmt
• The arrow may be read as "can have the form." Such a rule is called a production.
• The keyword if and the parentheses are called terminals.
• Variables like expr and stmt represent sequences of terminals and are called non-terminals.

Basic Terminologies of CFG
A context-free grammar has four components:
• Terminals: Sometimes referred to as "tokens." The terminals are the elementary symbols of the language defined by the grammar.
• Non-terminals: Sometimes called "syntactic variables." Each non-terminal represents a set of strings of terminals, in a manner we shall describe.
• Productions: Each production consists of a non-terminal, called the head or left side of the production, an arrow, and a sequence of terminals and/or non-terminals, called the body or right side of the production.
• Start symbol: A designation of one of the non-terminals as the start symbol.

Basic Terminologies of CFG
A context-free grammar has four components:
CFG (V, T, P, S)
• Terminals (T): finite set of end-node values
  0, 1
• Non-terminals (V): finite set of variables
  S, P
• Productions (P): substitution rules
  P → 0P0
  P → 1P1
• Start symbol (S): initial point
  S

Basic Terminologies of Lexical Analysis
Example (9-5+2, 3-1, or 7)
• Terminals: + - 0 1 2 3 4 5 6 7 8 9
• Non-terminals: list and digit
• Empty string: the string of zero terminals, written as ε, is called the empty string.

Derivation & Parse Tree
Derivation:
• A grammar derives strings by beginning with the start symbol and repeatedly replacing a non-terminal by the body of a production for that non-terminal.

Derivation & Parse Tree
Derivation:
list ⇒ list + digit
     ⇒ list - digit + digit
     ⇒ digit - digit + digit
     ⇒ 9 - digit + digit
     ⇒ 9 - 5 + digit
     ⇒ 9 - 5 + 2
Therefore, the string 9-5+2 belongs to the language specified by the grammar.

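The derivation steps can be replayed mechanically. The Python sketch below rewrites a sentential form one production at a time; the labels P1 to P3 follow the list grammar given earlier, and the function derive and its step encoding are assumptions made for illustration.

```python
PRODUCTIONS = {
    "P1": ("list", ["list", "+", "digit"]),
    "P2": ("list", ["list", "-", "digit"]),
    "P3": ("list", ["digit"]),
}

def derive(start, steps):
    """Apply each step to the leftmost occurrence of the production's head.

    A step is a label from PRODUCTIONS, or a digit character standing
    for the production digit -> that digit.
    """
    form = [start]
    history = [" ".join(form)]
    for step in steps:
        if step in PRODUCTIONS:
            head, body = PRODUCTIONS[step]
        else:
            head, body = "digit", [step]
        i = form.index(head)        # raises ValueError if head is absent
        form[i:i + 1] = body        # replace the non-terminal by the body
        history.append(" ".join(form))
    return history

for line in derive("list", ["P1", "P2", "P3", "9", "5", "2"]):
    print(line)
```

The last line printed is 9 - 5 + 2, which consists only of terminals, so the derivation is complete.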
Derivation & Parse Tree
Parse Tree:
• Parsing is the problem of taking a string of terminals and figuring out how to derive it from the start symbol of the grammar.
• If the string cannot be derived from the start symbol of the grammar, the parser reports syntax errors within the string.
• Given a context-free grammar, a parse tree according to the grammar is a tree with the following properties:
  • The root is labeled by the start symbol.
  • Each leaf is labeled by a terminal or by ε.
  • Each interior node is labeled by a nonterminal.

Derivation & Parse Tree
Parse Tree:
If A → X1 X2 ... Xn is a production, then node A has immediate children X1, X2, ..., Xn, where each Xi is a terminal, a nonterminal, or ε.

Derivation & Parse Tree
Parse Tree:
• A parse tree shows how the start symbol of a grammar derives a string.
• If non-terminal A has a production A → XYZ, then a parse tree may have an interior node labeled A with three children labeled X, Y, and Z, from left to right.

Derivation & Parse Tree
Parse Tree:
• Each node in the tree is labeled by a grammar symbol.
• An interior node and its children correspond to a production.
• The children of the root are labeled, from left to right, by the symbols in the body of the production used at the root.

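A parse tree for 9-5+2 under the list grammar can be built by hand as a small data structure. This Python sketch assumes a hypothetical Node class; reading the leaves from left to right gives back the original string.

```python
class Node:
    def __init__(self, label, children=()):
        self.label = label              # a grammar symbol
        self.children = list(children)  # ordered from left to right

    def leaves(self):
        """Return the leaf labels from left to right (the yield of the tree)."""
        if not self.children:
            return [self.label]
        result = []
        for child in self.children:
            result.extend(child.leaves())
        return result

def digit(d):
    # Interior node for the production digit -> d
    return Node("digit", [Node(d)])

# list -> list + digit, whose left child uses list -> list - digit
tree = Node("list", [
    Node("list", [
        Node("list", [digit("9")]),
        Node("-"),
        digit("5"),
    ]),
    Node("+"),
    digit("2"),
])

print("".join(tree.leaves()))  # 9-5+2
```

Each interior node together with its children corresponds to one production of the grammar, and every leaf is a terminal.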
Ambiguity
• Ambiguity is problematic because the meaning of a program can be interpreted incorrectly.
• Ambiguity can be handled in several ways:
  • Enforce associativity and precedence
  • Rewrite the grammar (cleanest way)
• There are no general techniques for handling ambiguity.
• It is impossible to automatically convert an ambiguous grammar to an unambiguous one.

Ambiguity
Consider the grammar
string → string + string
       | string - string
       | 0 | 1 | … | 9

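The ambiguity of this grammar can be made visible by enumerating every way it can parse a string. The Python sketch below is a brute-force illustration with a hypothetical function parses; it returns each reading as a parenthesized string together with its value.

```python
def parses(tokens):
    """Return all (reading, value) pairs for tokens under
    string -> string + string | string - string | digit."""
    if len(tokens) == 1:
        t = tokens[0]
        return [(t, int(t))] if t.isdigit() else []
    results = []
    # Try every operator position as the top-level operator.
    for i in range(1, len(tokens) - 1, 2):
        op = tokens[i]
        if op not in "+-":
            continue
        for lt, lv in parses(tokens[:i]):
            for rt, rv in parses(tokens[i + 1:]):
                value = lv + rv if op == "+" else lv - rv
                results.append((f"({lt}{op}{rt})", value))
    return results

for reading, value in parses(list("9-5+2")):
    print(reading, "=", value)
```

For 9-5+2 the sketch finds two parse trees, (9-(5+2)) with value 2 and ((9-5)+2) with value 6, so the same string has two different meanings.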
Tree Terminology
• A tree consists of one or more nodes.
• Exactly one node is the root.
• If node N is the parent of node M, then M is a child of N.
• The children of one node are called siblings. They have an order, from the left.
• A node with no children is called a leaf.
• A descendant of a node N is either N itself, a child of N, a child of a child of N, and so on.

Associativity of Operators
• Left-associative operators have left-recursive productions.
• For instance,
  list → list - digit | digit
• The string 9-5-2 has the same meaning as (9-5)-2.
• Right-associative operators have right-recursive productions.
• For instance,
  right → letter = right | letter
• The string a=b=c has the same meaning as a=(b=c).

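The two recursion patterns can be mirrored directly in code. In this Python sketch, group_left and group_right are hypothetical helpers that add parentheses the way the left-recursive and right-recursive productions group their operands; tokens are assumed to be single characters.

```python
def group_left(tokens):
    """list -> list - digit | digit : everything left of the last '-' groups first."""
    if len(tokens) == 1:
        return tokens[0]
    rest = group_left(tokens[:-2])   # the nested 'list' on the left
    if len(tokens) > 3:
        rest = f"({rest})"
    return f"{rest}-{tokens[-1]}"

def group_right(tokens):
    """right -> letter = right | letter : everything right of the first '=' groups first."""
    if len(tokens) == 1:
        return tokens[0]
    rest = group_right(tokens[2:])   # the nested 'right' on the right
    if len(tokens) > 3:
        rest = f"({rest})"
    return f"{tokens[0]}={rest}"

print(group_left(list("9-5-2")))    # (9-5)-2
print(group_right(list("a=b=c")))   # a=(b=c)
```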
Operator Precedence
Associativity of Operators
• Consider the expression 9+5*2.
• There are two possible interpretations of this expression: (9+5)*2 or 9+(5*2).
• The associativity rules for + and * apply to occurrences of the same operator, so they do not resolve this ambiguity.
• A grammar for arithmetic expressions can be constructed from a table showing the associativity and precedence of operators.

Associativity and Precedence Table

Operator Precedence
• Consider the four common arithmetic operators and a precedence table showing the operators in order of increasing precedence:
  left-associative: + -
  left-associative: * /
• We create two nonterminals, expr and term, for the two levels of precedence, and an extra nonterminal, factor, for generating basic units in expressions.
• The basic units in expressions are presently digits and parenthesized expressions:
  factor → digit | (expr)

Operator Precedence
• Now consider the binary operators * and /, which have the highest precedence and are left-associative:
  term → term * factor | term / factor | factor
• Similarly, expr generates lists of terms separated by the additive operators:
  expr → expr + term | expr - term | term
• The final grammar is
  expr → expr + term | expr - term | term
  term → term * factor | term / factor | factor
  factor → digit | (expr)

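One way to check that this grammar enforces precedence is to write a small evaluator that follows it. The Python sketch below is an assumption-laden illustration: it handles single-digit operands only, and it replaces the left-recursive productions with loops, a common technique for recursive-descent parsers that the slides do not cover. The function name evaluate is hypothetical.

```python
def evaluate(text):
    """Evaluate an expression over single digits, + - * / and parentheses."""
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def factor():                      # factor -> digit | (expr)
        tok = peek()
        if tok is not None and tok.isdigit():
            return int(advance())
        if tok == "(":
            advance()
            value = expr()
            if peek() != ")":
                raise SyntaxError("')' expected")
            advance()
            return value
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():                        # term -> term * factor | term / factor | factor
        value = factor()
        while peek() in ("*", "/"):
            if advance() == "*":
                value *= factor()
            else:
                value /= factor()
        return value

    def expr():                        # expr -> expr + term | expr - term | term
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += term()
            else:
                value -= term()
        return value

    result = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result

print(evaluate("9+5*2"))    # 19
print(evaluate("(9+5)*2"))  # 28
```

Because term is called from inside expr, every multiplication or division is completed before the surrounding addition or subtraction, so 9+5*2 is read as 9+(5*2).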
Solve It
“A tree of branches, you must construct,
A puzzle to solve, it's not abrupt.
Paths may fork, with no clear direction,
Follow them wrong, you'll face rejection.
What am I?”
