Compiler Construction
Week 04
Syntax Directed Translator
MUHAMMAD SHAHID KHAN
LECTURER
DEPARTMENT OF COMPUTING &
TECHNOLOGY
IQRA UNIVERSITY ISLAMABAD
Layout
01 Syntax Directed Translator
02 What is Context Free Grammar?
03 Basic Terminologies of CFG
04 Derivation & Parse Tree
05 Ambiguity
Syntax Directed Translator
This section illustrates the compiling techniques
by developing a program that translates
representative programming language
statements into three-address code, an
intermediate representation.
We will focus on
Front end of a compiler
Lexical analysis
Parsing
Intermediate code generation.
Three address code
General form: x = y op z
Example: X = (a*b) + (c*d)
Converted to three-address code, this becomes
t1 = a*b
t2 = c*d
X = t1 + t2
Various assignment statements in three-address code
Form          Example
x = y op z    x = y + z
x = op y      x = +y
x = y         x = y
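The conversion above can be sketched in a few lines of Python. This is only an illustrative sketch: the nested-tuple tree format, the function name to_three_address, and the t1, t2, ... naming scheme are assumptions made for this example, not part of the lecture.

```python
import itertools


def to_three_address(target, tree):
    """Emit three-address statements for a nested-tuple expression tree.

    A tree is either a name (str) or a tuple (op, left, right).
    """
    code = []
    temps = itertools.count(1)  # numbers for temporaries t1, t2, ...

    def gen(node, dest=None):
        if isinstance(node, str):       # a plain name needs no code
            return node
        op, left, right = node
        l = gen(left)                   # code for the left operand first
        r = gen(right)                  # then the right operand
        name = dest if dest is not None else f"t{next(temps)}"
        code.append(f"{name} = {l} {op} {r}")
        return name

    result = gen(tree, dest=target)
    if result != target:                # tree was a single name: copy statement
        code.append(f"{target} = {result}")
    return code


# X = (a*b) + (c*d)
for line in to_three_address("X", ("+", ("*", "a", "b"), ("*", "c", "d"))):
    print(line)
```

Each statement has at most one operator on the right side, which is the defining property of three-address code.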
Introduction
Analysis is organized around the "syntax" of the
language to be compiled.
The syntax of a programming language
describes the proper form of its programs.
The semantics of the language defines what its
programs mean.
To specify syntax, context-free grammars are
used.
The notation is also known as BNF (Backus-Naur Form)
We start with a syntax-directed translation of an
infix expression to postfix form.
Infix form: 9-5+2
to
Postfix form: 9 5 - 2 +
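A minimal sketch of this translation, assuming the input contains only single digits separated by + or - (the function name infix_to_postfix is hypothetical). Each operand is emitted as soon as it is read, and each operator is emitted after its right operand.

```python
def infix_to_postfix(text):
    """Translate single digits separated by + or - into postfix form."""
    tokens = [ch for ch in text if not ch.isspace()]
    if not tokens or not tokens[0].isdigit():
        raise SyntaxError("expected a digit")
    out = [tokens[0]]
    i = 1
    while i < len(tokens):
        op = tokens[i]
        if op not in "+-":
            raise SyntaxError(f"expected + or -, got {op!r}")
        if i + 1 >= len(tokens) or not tokens[i + 1].isdigit():
            raise SyntaxError("expected a digit after the operator")
        out.append(tokens[i + 1])   # right operand first
        out.append(op)              # then the operator
        i += 2
    return " ".join(out)


print(infix_to_postfix("9-5+2"))  # 9 5 - 2 +
```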
Syntax Definition
Context Free Grammar is used to specify the
syntax of the language.
For short, we simply call it a "grammar".
A grammar describes the hierarchical structure of
most programming language constructs.
Ex.
if (expression) statement else statement
This rule can be expressed as a production by
using the variable expr to denote an expression
and the variable stmt to denote a statement.
stmt -> if (expr) stmt else stmt
In a production,
lexical elements like the keywords if and else and the
parentheses are called terminals.
Variables like expr and stmt represent sequences
of terminals and are called nonterminals.
What is Context Free Grammar?
Grammar
A context-free grammar has four components
A set of tokens (terminal symbols)
A set of non-terminals
A set of productions
A designated start symbol
Let's check an example that elaborates these
components.
Expressions... Grammar
9-5+2, 5-4, 8...
Since a plus or minus sign must appear between two
digits, we refer to such expressions as lists of digits
separated by plus or minus signs.
The productions are
list -> list + digit                              (P-1)
list -> list - digit                              (P-2)
list -> digit                                     (P-3)
digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9    (P-4)
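The four components of this grammar can be written down as plain data. The sketch below is one possible encoding, not a standard API: the Grammar class, its field names, and the check method are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    terminals: frozenset
    nonterminals: frozenset
    productions: tuple   # pairs (head, body); body is a tuple of symbols
    start: str

    def check(self):
        """Return True if the start symbol and all productions use declared symbols."""
        symbols = self.terminals | self.nonterminals
        return (self.start in self.nonterminals
                and all(head in self.nonterminals and set(body) <= symbols
                        for head, body in self.productions))


digits = "0123456789"
LIST_GRAMMAR = Grammar(
    terminals=frozenset("+-" + digits),
    nonterminals=frozenset({"list", "digit"}),
    productions=(
        ("list", ("list", "+", "digit")),          # P-1
        ("list", ("list", "-", "digit")),          # P-2
        ("list", ("digit",)),                      # P-3
    ) + tuple(("digit", (d,)) for d in digits),    # P-4, one production per digit
    start="list",
)

print(LIST_GRAMMAR.check(), len(LIST_GRAMMAR.productions))
```

Note that P-4 is shorthand for ten separate productions, one for each digit.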
Context Free Grammar
• Why do we need CFGs?
• Many languages are not regular, for example strings of
balanced parentheses ((((…))))
• There is no regular expression for this language
• A finite automaton may repeat states; however, it
cannot remember the number of times it has been to a
particular state
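By contrast, a recursive recognizer can match nested parentheses because the depth of recursion keeps track of how many opening parentheses are still unmatched. The sketch below assumes the grammar S -> ( S ) | ε for strings such as ((())); both the grammar and the function name nested are chosen here for illustration.

```python
def nested(s):
    """Recognize the language of S -> ( S ) | empty, e.g. "((()))"."""

    def parse_s(i):
        # Return the index just past the text matched by S starting at i.
        if i < len(s) and s[i] == "(":
            j = parse_s(i + 1)              # the nested S
            if j < len(s) and s[j] == ")":
                return j + 1
            raise SyntaxError("missing )")
        return i                            # S -> empty

    try:
        return parse_s(0) == len(s)         # the whole input must be consumed
    except SyntaxError:
        return False


print(nested("((()))"), nested("(()"), nested("())"))
```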
• Context-free grammar is used to specify the syntax of a
language
• A grammar naturally describes the hierarchical structure, e.g.
stmt -> if (expr) stmt else stmt
• The arrow may be read as "can have the form." Such a rule is
called a production.
• The keywords if and else and the parentheses are called terminals.
• Variables like expr and stmt represent sequences of
terminals and are called non-terminals.
Basic Terminologies of CFG
A context-free grammar has four components:
• Terminals: Sometimes referred to as "tokens." The
terminals are the elementary symbols of the language
defined by the grammar.
• Non-terminals: Also called "syntactic variables." Each
non-terminal represents a set of strings of terminals, in a
manner we shall describe.
• Production: Each production consists of a non-terminal
called the head or left side of the production, an arrow,
and a sequence of terminals and/or non-terminals, called
the body or right side of the production.
• A designation of one of the non-terminals as the start
symbol.
A context-free grammar has four components, written CFG (V, T, P, S):
• Terminals (T): finite set of end-node values
0, 1
• Non-terminals (V): finite set of variables
S, P
• Productions (P): substitution rules
S -> P
P -> 0P0
P -> 1P1
• Start symbol (S): starting point
S
Basic Terminologies of Lexical Analysis
Example (9-5+2, 3-1, or 7)
• Terminals: + - 0 1 2 3 4 5 6 7 8 9
• Non-terminals: list and digit
• Empty string: The string of zero terminals, written as
ε, is called the empty string
Derivation & Parse Tree
Derivation:
• A grammar derives strings by beginning with the start
symbol and repeatedly replacing a non-terminal by the
body of a production for that non-terminal
list => list + digit
=> list - digit + digit
=> digit - digit + digit
=> 9 - digit + digit
=> 9 - 5 + digit
=> 9 - 5 + 2
Therefore, the string 9-5+2 belongs to the language
specified by the grammar
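The derivation of 9-5+2 can be replayed mechanically. In this sketch each step replaces the leftmost non-terminal with a chosen production body. The function name derive is hypothetical, and the code does not check that each body really belongs to a production of the grammar; it only performs the substitutions.

```python
NONTERMINALS = {"list", "digit"}


def derive(start, bodies):
    """Replace the leftmost non-terminal with each body in turn."""
    form = [start]
    steps = [" ".join(form)]
    for body in bodies:
        # index of the leftmost non-terminal in the current sentential form
        i = next(k for k, sym in enumerate(form) if sym in NONTERMINALS)
        form[i:i + 1] = body
        steps.append(" ".join(form))
    return steps


steps = derive("list", [
    ["list", "+", "digit"],   # list  -> list + digit
    ["list", "-", "digit"],   # list  -> list - digit
    ["digit"],                # list  -> digit
    ["9"], ["5"], ["2"],      # digit -> 9, digit -> 5, digit -> 2
])
print(" => ".join(steps))
```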
Parse Tree:
• Parsing is the problem of taking a string of terminals
and figuring out how to derive it from the start
symbol of the grammar.
• If the string cannot be derived from the start symbol of the
grammar, the parser reports syntax errors within the
string.
• Given a context-free grammar, a parse tree according
to the grammar is a tree with the following
properties:
• The root is labeled by the start symbol.
• Each leaf is labeled by a terminal or by ε.
• Each interior node is labeled by a nonterminal
• If A → X1 X2 ... Xn is a production, then node A has immediate children X1, X2, ..., Xn,
where each Xi is a terminal, a nonterminal, or ε.
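As an illustration of these properties, the parse tree for 9-5+2 under the grammar list -> list + digit | list - digit | digit can be built by hand. The Node class and the digit helper below are assumptions made for this sketch.

```python
class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

    def leaves(self):
        """Yield leaf labels from left to right (the yield of the tree)."""
        if not self.children:
            yield self.label
        for child in self.children:
            yield from child.leaves()


def digit(d):
    # Interior node for the production digit -> d
    return Node("digit", [Node(d)])


# Root is the start symbol; interior nodes are non-terminals; leaves are terminals.
tree = Node("list", [
    Node("list", [
        Node("list", [digit("9")]),
        Node("-"),
        digit("5"),
    ]),
    Node("+"),
    digit("2"),
])

print("".join(tree.leaves()))  # 9-5+2
```

Reading the leaves from left to right gives back the string that the tree derives.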
• A parse tree defines how the start symbol of a
grammar derives a string
• If non-terminal A has a production A → XYZ, then a
parse tree may have an interior node labeled A with three
children labeled X, Y, and Z, from left to right
• Each node in the tree is labeled by a grammar
symbol
• An interior node and its children correspond to a
production
• In the parse tree for 9-5+2, the children of the root are labeled,
from left to right, list, +, and digit
Ambiguity
• A grammar is ambiguous if some string has more than one parse tree
• Ambiguity is problematic because a string with more than one
parse tree can have more than one meaning
• Ambiguity can be handled in several ways
  • Enforce associativity and precedence
  • Rewrite the grammar (cleanest way)
• There is no general technique for handling ambiguity
• In general, it is impossible to automatically convert an ambiguous
grammar to an unambiguous one
Consider the grammar
string -> string + string
| string - string
| 0 | 1 | … | 9
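To see why this grammar is ambiguous, the sketch below enumerates every way the string 9-5+2 can be parsed with the productions string -> string + string | string - string | digit, and evaluates each parse. The function name groupings is hypothetical, and single-digit operands are assumed.

```python
def groupings(tokens):
    """Yield (text, value) for every parse of tokens under the ambiguous grammar."""
    if len(tokens) == 1:
        yield tokens[0], int(tokens[0])
        return
    for i in range(1, len(tokens), 2):          # any operator may be the root
        op = tokens[i]
        for ltext, lval in groupings(tokens[:i]):
            for rtext, rval in groupings(tokens[i + 1:]):
                value = lval + rval if op == "+" else lval - rval
                yield f"({ltext}{op}{rtext})", value


for text, value in groupings(list("9-5+2")):
    print(text, "=", value)
```

The two parses, 9-(5+2) and (9-5)+2, give different values (2 and 6), which is why the ambiguity matters.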
Tree Terminology
• A tree consists of one or more nodes.
• Exactly one is the root.
• If node N is the parent of node M, then M is a child of N.
• The children of one node are called siblings. They have
an order, from the left.
• A node with no children is called a leaf.
• A descendant of a node N is either N itself, a child of N, a
child of a child of N, and so on.
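These terms map directly onto a small tree class. The TreeNode class and its method names below are assumptions made for illustration only.

```python
class TreeNode:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)   # ordered from the left

    def is_leaf(self):
        return not self.children

    def descendants(self):
        """Yield this node and everything below it (a node is its own descendant)."""
        yield self
        for child in self.children:
            yield from child.descendants()


# N is the root; A and B are siblings; C and B are leaves.
root = TreeNode("N", [TreeNode("A", [TreeNode("C")]), TreeNode("B")])
print([n.label for n in root.descendants()])
print([n.label for n in root.descendants() if n.is_leaf()])
```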
Associativity of Operators
• Left-associative operators have left-recursive
productions.
• For instance,
• list -> list - digit | digit
• String 9-5-2 has the same meaning as (9-5)-2.
• Right-associative operators have right-recursive
productions.
• For instance, see the grammar below
• right -> letter = right | letter
• String a=b=c has the same meaning as a=(b=c)
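The grouping that each kind of recursion produces can be shown with two small helper functions. Both names, group_left and group_right, are hypothetical; each one simply adds parentheses in the way its production nests.

```python
def group_left(digits):
    """Grouping implied by the left-recursive list -> list - digit | digit."""
    text = digits[0]
    for d in digits[1:]:
        text = f"({text}-{d})"     # everything so far becomes the left operand
    return text


def group_right(letters):
    """Grouping implied by the right-recursive right -> letter = right | letter."""
    if len(letters) == 1:
        return letters[0]
    return f"({letters[0]}={group_right(letters[1:])})"


print(group_left(["9", "5", "2"]))    # ((9-5)-2)
print(group_right(["a", "b", "c"]))   # (a=(b=c))
```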
Operator Precedence
• Consider the expression 9+5*2.
• There are two possible interpretations of this expression:
• (9+5)*2 or 9+(5*2)
• The associativity rules for + and * apply to occurrences of
the same operator, so they do not resolve this ambiguity.
• A grammar for arithmetic expressions can be constructed
from a table showing the associativity and precedence of
operators.
Operator Precedence
• Let's see an example of four common arithmetic
operators and a precedence table, showing the operators
in order of increasing precedence.
• left-associative: + -
• left-associative: * /
• Now we create two nonterminals, expr and term, for the
two levels of precedence, and an extra nonterminal factor
for generating basic units in expressions.
• The basic units in expressions are presently digits and
parenthesized expressions.
• factor -> digit | (expr)
• Now consider the binary operators, * and /, that have the
highest precedence and left associativity.
• term -> term * factor | term / factor | factor
• Similarly, expr generates lists of terms separated by the
additive operators.
• expr -> expr + term | expr - term | term
• Final grammar is
• expr -> expr + term | expr - term | term
• term -> term * factor | term / factor | factor
• factor -> digit | (expr)
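A small evaluator shows how the expr, term, and factor levels enforce precedence. This is a sketch, not the lecture's own implementation: it assumes single-digit operands, and the left-recursive productions are implemented as loops, a common technique when writing a recursive-descent parser by hand. The function name evaluate is hypothetical.

```python
def evaluate(text):
    """Evaluate an expression using the expr / term / factor grammar."""
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def factor():                 # factor -> digit | ( expr )
        tok = peek()
        if tok is not None and tok.isdigit():
            advance()
            return int(tok)
        if tok == "(":
            advance()
            value = expr()
            if peek() != ")":
                raise SyntaxError("expected )")
            advance()
            return value
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():                   # term -> term * factor | term / factor | factor
        value = factor()
        while peek() in ("*", "/"):
            op = peek()
            advance()
            rhs = factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def expr():                   # expr -> expr + term | expr - term | term
        value = term()
        while peek() in ("+", "-"):
            op = peek()
            advance()
            rhs = term()
            value = value + rhs if op == "+" else value - rhs
        return value

    result = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result


print(evaluate("9+5*2"), evaluate("(9+5)*2"), evaluate("9-5-2"))
```

Because term is called from inside expr, every * and / is applied before the surrounding + or -, so 9+5*2 is read as 9+(5*2).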
Solve It
“A tree of branches, you must construct,
A puzzle to solve, it's not abrupt.
Paths may fork, with no clear direction,
Follow them wrong, you'll face rejection.
What am I?”