
Lecture 04

Uploaded by

Hamayon Wazir

Dr. Atif Ishaq Khan – Assistant Professor GC University, Lahore

Compiler Construction
CS-4207
Lecture – 04

Disclaimer: The Contents of this reader are borrowed from the book(s) mentioned
in the reference section.

Introduction
The parser validates the input string against a context-free grammar (CFG) and then
produces output for the compiler's next phase. That output may be a parse tree or an
abstract syntax tree. Syntax Directed Translation is used to interleave the syntax
analysis and semantic analysis phases of the compiler.
With both syntax-directed definitions and translation schemes, we conceptually parse the
input token stream, build the parse tree, and then traverse the tree as needed to evaluate
the semantic rules at the parse tree nodes. Evaluating the semantic rules may generate
code, save information in a symbol table, issue error messages, or perform any other
action. The translation of the token stream is the result obtained by evaluating the
semantic rules.

What is Syntax Directed Translation (SDT)?


Syntax Directed Translation augments the grammar with rules that facilitate semantic
analysis. SDT passes information bottom-up and/or top-down through the parse tree
in the form of attributes attached to the nodes. Syntax-directed translation rules use 1)
lexical values of nodes, 2) constants, and 3) attributes associated with the non-terminals
in their definitions.

Grammar + Semantic rules = Syntax Directed translation

Another way to define SDT is “a systematic process of assigning meaning to program
constructs, which can be viewed as computation of attributes associated with non-terminals”.

• In syntax directed translation, every non-terminal can have zero, one, or more
attributes, depending on the type of the attribute. The values of these attributes are
computed by the semantic rules associated with the production rule.
• In the semantic rules, an attribute such as val may hold anything: a string, a number,
a memory location, or a complex record.
• In syntax directed translation, whenever a construct of the programming language is
encountered, it is translated according to the semantic rules defined for that
programming language.


Computation of Semantics [5]


• Ensure that context-specific language rules, which are not explicitly defined in
the grammar, are adhered to. For instance, a requirement that functions must
include a return statement.
• Enhance the Abstract Syntax Tree (AST) by embedding semantic details that will
be used later during the code generation phase. This includes tasks such as
deducing the data type of an expression.
• Break down intricate program structures in anticipation of code generation. This
involves recognizing the termination points of loops and procedures, among other
complex constructs.
We will remain focused on the syntax-directed approach to computing the attributes.
Attributes and Attribute Grammars
Syntax Directed Framework [5]
• Every symbol within the grammar is associated with definable properties, such as
the value of a constant expression.
• For each production in the grammar, we provide rules for computing the
properties of all symbols appearing on both sides of the production. For instance,
we determine that the value of a sum should be the sum of the values of its
operands.
• These rules are confined to the local scope and only reference other symbols
within the same production.
• The evaluation of attributes may necessitate multiple traversals of the Abstract
Syntax Tree (AST) and exhibit arbitrary context dependence. For example,
determining the value of a declared constant requires referencing the constant
declaration, which may be located anywhere in the code.

The general approach to Syntax-Directed Translation is to construct a parse tree or
syntax tree and compute the values of attributes at the nodes of the tree by visiting them
in some order. In many cases, translation can be done during parsing without building
an explicit tree.
We consider two general approaches to the specification of syntax-directed translation:
syntax-directed definitions (grammar productions with semantic rules) and translation schemes.


Example
Productions        Semantic Rules
E → E1 + T         E.val := E1.val + T.val
E → T              E.val := T.val
T → T1 * F         T.val := T1.val * F.val
T → F              T.val := F.val
F → (E)            F.val := E.val
F → num            F.val := num.lexval

Where
E.val is one of the attributes of E
num.lexval is the attribute returned by the lexical analyzer
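
As an illustration, the semantic rules in the table can be evaluated over a hand-built parse tree. The sketch below is only one possible encoding: the `Node` class, the `val` function, and the helper `f_num` are hypothetical names chosen for this example, and the tree for 3 * 5 + 4 is constructed by hand rather than by a parser.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A parse-tree node (hypothetical representation for this sketch)."""
    symbol: str                                   # "E", "T", "F", "num", "+", "*", "(", ")"
    children: list = field(default_factory=list)
    lexval: int = 0                               # meaningful only for num leaves


def val(n):
    """Compute the synthesized attribute val by applying the rule of the production at n."""
    kids = [c.symbol for c in n.children]
    if n.symbol == "F" and kids == ["num"]:       # F -> num      F.val := num.lexval
        return n.children[0].lexval
    if n.symbol == "F" and kids == ["(", "E", ")"]:   # F -> (E)  F.val := E.val
        return val(n.children[1])
    if kids == ["E", "+", "T"]:                   # E -> E1 + T   E.val := E1.val + T.val
        return val(n.children[0]) + val(n.children[2])
    if kids == ["T", "*", "F"]:                   # T -> T1 * F   T.val := T1.val * F.val
        return val(n.children[0]) * val(n.children[2])
    if len(kids) == 1:                            # E -> T and T -> F copy the value upward
        return val(n.children[0])
    raise ValueError(f"no production matches {n.symbol} -> {kids}")


def f_num(v):
    """Build the subtree F -> num with num.lexval = v."""
    return Node("F", [Node("num", lexval=v)])


# Parse tree for 3 * 5 + 4
t_3_times_5 = Node("T", [Node("T", [f_num(3)]), Node("*"), f_num(5)])
tree = Node("E", [Node("E", [t_3_times_5]), Node("+"), Node("T", [f_num(4)])])
print(val(tree))  # 19
```

Each branch of `val` reads only attributes of the children of the node, which is what makes val a synthesized attribute.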

Example
Syntax-directed translation is done by attaching rules or program fragments to
productions in a grammar. For example, consider an expression expr generated by the
production
    expr → expr1 + term
Here expr is the sum of the two subexpressions expr1 and term (the subscript in expr1 is
used only to distinguish the two instances of expr in the production).
Pseudocode
    translate expr1;
    translate term;
    handle +;
The running example is the translation of an infix expression into a postfix expression.
Before exploring this example, let us understand two closely related concepts.
Attributes: An attribute is any quantity associated with a programming construct (a symbol
in the grammar). Examples include the data type of an expression, the number of
instructions in the generated code, and the location of the first instruction in the code
generated for a construct, among many other possibilities.
(Syntax Directed) translation Scheme: A translation scheme is a notation for attaching
program fragments to the productions of a grammar. The program fragments are executed
when the production is used during syntax analysis. The combined result of all these
fragment executions, in the order induced by the syntax analysis, produces the translation
of the program to which this analysis/synthesis process is applied.


Postfix Notation
The Postfix notation for an expression E is inductively defined as follows
1. If E is a variable or constant then postfix notation of E is E itself.
2. If E is an expression of the form E1 op E2, where op is any binary operator, then the
postfix notation for E is E1' E2' op, where E1' and E2' are the postfix notations for E1
and E2 respectively.
3. If E is a parenthesized expression of the form (E1), then the post fix notation of E
is same as the postfix notation of E1.
Example of Postfix notation
(9 – 5) + 2 has the postfix form 95-2+
• Here 9, 5, and 2 are constants, so rule 1 applies to them.
• The translation of 9 – 5 to 95- is through rule 2.
• The translation of (9 – 5) is through rule 3.
• For the entire expression, (9 – 5) is treated as E1 and 2 as E2, and applying rule 2
gives 95-2+.

9 – (5 + 2) has the postfix form 952+-
• Here 9, 5, and 2 are constants, so rule 1 applies to them.
• Because of the parentheses, (5 + 2) is translated first, giving 52+.
• That expression then becomes the second operand of the minus operation.
• For the entire expression, 9 is treated as E1 and 52+ as the translation of E2, and
applying rule 2 gives 952+-.
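
The three rules translate directly into a recursive function. This is a minimal sketch under an assumed encoding that is not part of the lecture: a constant is a string, a parenthesized expression is a 1-tuple, and a binary expression is a 3-tuple `(E1, op, E2)`. The name `postfix` is likewise hypothetical.

```python
def postfix(e):
    """Return the postfix notation of expression e using the three inductive rules."""
    if isinstance(e, str):        # rule 1: a variable or constant is its own postfix form
        return e
    if len(e) == 1:               # rule 3: (E1) has the same postfix form as E1
        return postfix(e[0])
    left, op, right = e           # rule 2: E1 op E2  ->  E1' E2' op
    return postfix(left) + postfix(right) + op


# (9 - 5) + 2
print(postfix(((("9", "-", "5"),), "+", "2")))   # 95-2+
# 9 - (5 + 2)
print(postfix(("9", "-", (("5", "+", "2"),))))   # 952+-
```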
How to deal with post fix notations
• No parentheses are required in a postfix expression, because the position and arity
(number of operands) of the operators permit only one way of decoding it.
• The trick is to scan the postfix expression repeatedly from the left until an operator
is found.
• Then look to the left of the operator for the proper number of operands.
• Evaluate the operator on those operands and replace the operator and its operands
with the result.
• Then repeat this process to find the next operator and its corresponding operands.

Exercise
Consider the following postfix expression 952+-3*. Evaluate it following the above
mentioned evaluation steps.
952+-3* → 97-3* → 23* → 6
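
The scan-and-replace procedure can be sketched as follows. The function name `evaluate_postfix` is hypothetical, and the sketch assumes single-digit operands and only the binary operators +, - and *. It returns the final value together with the intermediate strings, so the trace can be compared with the steps of the exercise.

```python
def evaluate_postfix(expr):
    """Evaluate a postfix string by repeatedly reducing the leftmost operator."""
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
    }
    items = [int(ch) if ch.isdigit() else ch for ch in expr]
    trace = ["".join(str(x) for x in items)]
    while len(items) > 1:
        # Scan from the left until an operator is found.
        i = next((k for k, x in enumerate(items) if x in ops), None)
        if i is None or i < 2:
            raise ValueError("malformed postfix expression")
        # The two operands sit immediately to the left of the operator.
        a, b = items[i - 2], items[i - 1]
        # Replace operand, operand, operator with the result.
        items[i - 2:i + 1] = [ops[items[i]](a, b)]
        trace.append("".join(str(x) for x in items))
    return items[0], trace


value, steps = evaluate_postfix("952+-3*")
print(" -> ".join(steps))  # 952+-3* -> 97-3* -> 23* -> 6
print(value)               # 6
```

A stack-based evaluator is the more common implementation, but the version above follows the steps listed in this section more literally.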


Syntax Directed Definition


With each production A → α we associate a set of semantic rules of the form
b := f(c1, …, ck), where f is a function and either
• b is a synthesized attribute of A and c1, …, ck are attributes of the symbols in α,
that is, A.b = f(X1.c1, …, Xk.ck) where each Xi is a symbol of α,
or
• b is an inherited attribute of one of the symbols in α and c1, …, ck are attributes of
A or of the symbols in α.
Types of Attributes
From the above SDD we define that attributes are of two types
1. Synthesized Attributes
A Synthesized attribute is an attribute of the non-terminal on the left-hand side of a
production. Synthesized attributes represent information that is being passed up the
parse tree. The attribute can take value only from its children (Variables in the RHS of
the production).
For example, let’s say A → BC is a production of a grammar and A’s attribute is
dependent on B’s attributes or C’s attributes; then it is a synthesized attribute.
E → E1 + T { E.val = E1.val + T.val }
In this case E.val derives its value from E1.val and T.val.
2. Inherited Attributes
An attribute of a nonterminal on the right-hand side of a production is called an inherited
attribute. The attribute can take value either from its parent or from its siblings (variables
in the LHS or RHS of the production).
For example, let’s say A → BC is a production of a grammar and B’s attribute is
dependent on A’s attributes or C’s attributes then it will be inherited attribute.
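
To make the downward flow of an inherited attribute concrete, here is a small sketch based on a declaration such as `int a, b, c`. This example is an assumption added for illustration and does not appear in the lecture: the grammar D → T L, L → L1 , id | id, the attribute name `inh`, and the function `declare` are all hypothetical.

```python
def declare(type_name, names):
    """Simulate D -> T L with the rule L.inh = T.type, filling a symbol table."""
    symbol_table = {}

    def visit_L(ids, inh):
        # L -> L1 , id : L1.inh = L.inh, and id is entered with type L.inh
        # L -> id      : id is entered with type L.inh
        if len(ids) > 1:
            visit_L(ids[:-1], inh)       # the parent passes its inherited value down to L1
        symbol_table[ids[-1]] = inh

    visit_L(names, inh=type_name)        # L receives T.type from its sibling T
    return symbol_table


print(declare("int", ["a", "b", "c"]))   # {'a': 'int', 'b': 'int', 'c': 'int'}
```

The value of `inh` never comes from the children of an L node. It comes from the sibling T at the top and from the parent L further down, which is the defining property of an inherited attribute.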
First we discuss synthesized attributes in detail. The grammar allows us to express the
idea of associating quantities with programming constructs, such as values and types
with expressions. Attributes are associated with non-terminals and terminals. We then
associate rules with the grammar productions; these rules describe how the attributes are
computed at the nodes where the production under consideration is used to relate a node
to its children.
A Syntax-directed definition associates
1. With each grammar symbol, a set of attributes, and
2. With each production, a set of semantic rules for computing the values of the
attributes associated with the symbols appearing in the production.
Syntax Directed Translation Scheme
• A syntax directed translation scheme is a context-free grammar with semantic actions
attached to its productions.
• A syntax directed translation scheme is used to specify the order in which the
semantic rules are evaluated.
• In a translation scheme, the semantic rules are embedded within the right side of the
productions.


• The position at which an action is to be executed is shown by enclosing the action in
braces and writing it within the right side of the production.

Suppose a node N in a parse tree is labeled by the grammar symbol X. We write X.a to
denote the value of attribute a of X at that node. A parse tree showing the attribute
values at each node is called an annotated parse tree. Below is an annotated parse tree
for the expression 9 – 5 + 2 with an attribute value a associated with nonterminals E
and T. The value 95-2+ is postfix notation for the given expression.

We have already discussed that an attribute is said to be synthesized if its value at a
parse-tree node N is determined from attribute values at the children of N and at N
itself. Synthesized attributes have the desirable property that they can be evaluated
during a single bottom-up traversal of a parse tree. The table below gives the syntax
directed definition for the infix to postfix translation shown in the annotated parse tree
above. Each nonterminal has a string-valued attribute a that represents the postfix
notation for the expression generated by that nonterminal in a parse tree. The symbol ||
in the semantic rules is the operator for string concatenation.
Productions        Semantic Rules
E → E1 + T         E.a = E1.a || T.a || ‘+’
E → E1 - T         E.a = E1.a || T.a || ‘-’
E → T              E.a = T.a
T → 0              T.a = ‘0’
T → 1              T.a = ‘1’
…..                …..
T → 9              T.a = ‘9’

As we have already discussed, the postfix form of a digit is the digit itself; e.g., the
semantic rule associated with the production T → 9 defines T.a to be 9 itself whenever
this production is used at a node in a parse tree. The other digits are translated similarly.
As another example, when the production E → T is applied, the value of T.a becomes
the value of E.a.
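
The definition in the table can be exercised on the parse tree for 9 - 5 + 2. The sketch below is a hand-rolled illustration: the `Node` class and the `annotate` function are hypothetical names, and the parse tree is written out by hand instead of being produced by a parser.

```python
class Node:
    """Parse-tree node carrying the string-valued attribute a (hypothetical representation)."""

    def __init__(self, symbol, *children):
        self.symbol = symbol
        self.children = list(children)
        self.a = None


def annotate(n):
    """Bottom-up pass: annotate the children first, then apply the rule at n."""
    for child in n.children:
        annotate(child)
    kids = n.children
    if n.symbol == "T":                          # T -> digit       T.a = 'digit'
        n.a = kids[0].symbol
    elif n.symbol == "E" and len(kids) == 3:     # E -> E1 op T     E.a = E1.a || T.a || op
        n.a = kids[0].a + kids[2].a + kids[1].symbol
    elif n.symbol == "E":                        # E -> T           E.a = T.a
        n.a = kids[0].a
    return n.a


def T(digit):
    return Node("T", Node(digit))


# Parse tree for 9 - 5 + 2; subtraction groups to the left
inner = Node("E", Node("E", T("9")), Node("-"), T("5"))
root = Node("E", inner, Node("+"), T("2"))
print(annotate(root))   # 95-2+
print(inner.a)          # 95-
```

After `annotate(root)` every E and T node holds its own a value, which is what the annotated parse tree records.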
Important Consideration
The string representing the translation of the nonterminal at the head of each production
is the concatenation of the translations of the nonterminals in the production body, in
the same order as in the production, with some optional additional strings interleaved.
A syntax-directed definition with this property is termed simple. We can see in the
above table that while defining the semantic rules the operands appear in the same order
as in the production body.
Tree Traversal
Tree traversals will be used to describe attribute evaluation and to specify how code
pieces in a translation scheme should be executed. A traversal of a tree begins at the
root and travels through each node in some sequence.
Depth-First Search
DFS (depth-first search) is a technique for traversing a tree or graph that relies on
backtracking. The traversal goes as deep as possible along each branch first and
backtracks to the parent node once a node has no unvisited children. A depth-first
traversal visit(N) that visits the children of a node in left-to-right order, and then
evaluates the semantic rules at N itself, is the form used for attribute evaluation.
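
A minimal sketch of such a procedure is shown here. The dictionary-based node layout and the helper `make` are assumptions of this example. The essential part is the shape of `visit`: recurse into the children from left to right, then process the node itself.

```python
def visit(node, evaluate):
    """Depth-first traversal: visit children left to right, then evaluate at the node."""
    for child in node["children"]:
        visit(child, evaluate)
    evaluate(node)            # stands in for "evaluate semantic rules at node N"


def make(label, *children):
    return {"label": label, "children": list(children)}


# A has children B and C; B has children D and E
tree = make("A", make("B", make("D"), make("E")), make("C"))
order = []
visit(tree, lambda n: order.append(n["label"]))
print(order)   # ['D', 'E', 'B', 'C', 'A']
```

Because a node is processed only after all of its children, this order is a postorder traversal, which is a valid evaluation order for synthesized attributes.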

A syntax-directed definition does not impose any specific order for the evaluation of
attributes on a parse tree; any evaluation order that computes an attribute a after all the
other attributes that a depends on is acceptable. Synthesized attributes can be evaluated
during any bottom-up traversal, that is, a traversal that evaluates attributes at a node
after having evaluated attributes at its children.

Translation Scheme
We now look at a different strategy that does not require the manipulation of strings
and instead produces the same translation incrementally by running program fragments.
Semantic actions are program fragments embedded within production bodies. The
position at which an action is to be executed is indicated by enclosing it within curly
braces and writing it within the production body. When drawing a parse tree for a
translation scheme, we indicate an action by constructing an extra child for it, connected
by a dashed line to the node that corresponds to the head of the production. The parse
tree below carries the actions that translate the expression 9 – 5 + 2 into the postfix
notation 95-2+.


Actions for translating into Postfix notation

The root of the parse tree represents the first production of the translation scheme for
postfix notation. In a postorder traversal, we first perform all the actions in the leftmost
subtree of the root, for the left operand, also labeled expr like the root. We then visit the
leaf + at which there is no action. We next perform the actions in the subtree for the right
operand term and, finally, the semantic action { print(‘+’) } at the extra node.
Since the productions for term have only a digit on the right side, that digit is printed by
the actions for the productions. No output is necessary for the production expr → term,
and only the operator needs to be printed in the action for each of the first two productions.
The semantic actions in the parse tree translate the infix expression 9 − 5 + 2 into
95−2+ by printing each character in 9 − 5 + 2 exactly once, without using any storage for
the translation of subexpressions. Note that the implementation of a translation scheme
must ensure that semantic actions are performed in the order they would appear during a
postorder traversal of a parse tree.
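
The following sketch mimics this traversal. It assumes the translation scheme of reference 1, namely expr → expr1 + term { print(‘+’) }, expr → expr1 - term { print(‘-’) }, expr → term, and term → d { print(‘d’) } for each digit d. The tuple encoding and the helper names (`action`, `leaf`, `node`, `run`) are hypothetical, and output is collected in a list rather than printed so that it can be inspected.

```python
def action(text):
    return ("action", text)       # the extra child drawn with a dashed line


def leaf(token):
    return ("leaf", token)        # a terminal; no action is attached to it


def node(label, *kids):
    return ("node", label, kids)


def run(n, out):
    """Depth-first, left-to-right walk that performs each action when it is reached."""
    if n[0] == "action":
        out.append(n[1])          # plays the role of print(text)
    elif n[0] == "node":
        for kid in n[2]:
            run(kid, out)


def term(d):
    return node("term", leaf(d), action(d))      # term -> d { print(d) }


# Parse tree with action nodes for 9 - 5 + 2
left = node("expr", node("expr", term("9")), leaf("-"), term("5"), action("-"))
tree = node("expr", left, leaf("+"), term("2"), action("+"))

out = []
run(tree, out)
print("".join(out))   # 95-2+
```

No string is stored for a subexpression. Each character is emitted once, at the moment its action node is reached.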

Exercise-1
Construct a syntax-directed translation scheme that translates arithmetic expressions from
infix notation into prefix notation in which an operator appears before its operands; e.g.,
−xy is the prefix notation for x − y. Give annotated parse trees for the inputs 9 − 5 + 2
and 9 − 5 ∗ 2.


Solution
Productions
expr -> expr + term
      | expr - term
      | term
term -> term * factor
      | term / factor
      | factor
factor -> digit
        | (expr)

Translation Scheme
expr -> {print("+")} expr + term
      | {print("-")} expr - term
      | term
term -> {print("*")} term * factor
      | {print("/")} term / factor
      | factor
factor -> digit {print(digit)}
        | (expr)
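
As a quick check of this scheme, the sketch below emits the operator before translating either operand, mirroring the position of the print actions at the start of each production body. The nested-tuple encoding `(left, op, right)` and the function name `prefix` are assumptions of this example. Parentheses and precedence are taken to be resolved already in the shape of the tuples.

```python
def prefix(e):
    """Return the prefix notation of e: the operator first, then both operands."""
    if isinstance(e, str):             # factor -> digit {print(digit)}
        return e
    left, op, right = e                # action {print(op)} runs before both operands
    return op + prefix(left) + prefix(right)


# 9 - 5 + 2 groups as (9 - 5) + 2
print(prefix((("9", "-", "5"), "+", "2")))   # +-952
# 9 - 5 * 2 groups as 9 - (5 * 2)
print(prefix(("9", "-", ("5", "*", "2"))))   # -9*52
```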

Exercise-2
Construct a syntax-directed translation scheme that translates arithmetic expressions from
postfix notation into infix notation. Give annotated parse trees for the inputs
95 − 2 ∗ and 952 ∗ −.
Solution
Productions
expr -> expr expr +
      | expr expr -
      | expr expr *
      | expr expr /
      | digit

Translation Scheme
expr -> expr {print("+")} expr +
      | expr {print("-")} expr -
      | {print("(")} expr {print(")*(")} expr {print(")")} *
      | {print("(")} expr {print(")/(")} expr {print(")")} /
      | digit {print(digit)}

Build the annotated tree by yourself as practice.
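
The output that this scheme produces can be previewed with a small stack-based sketch. This is not a parser for the grammar above; it is an assumed shortcut that combines operands in the same way the actions would print them: no parentheses around the operands of + and -, and parentheses around both operands of * and /. The function name `postfix_to_infix` is hypothetical, and operands are assumed to be single digits.

```python
def postfix_to_infix(s):
    """Rebuild an infix string from postfix input, parenthesizing as the scheme does."""
    stack = []
    for ch in s:
        if ch.isdigit():                       # expr -> digit {print(digit)}
            stack.append(ch)
        elif ch in ("+", "-", "*", "/"):
            if len(stack) < 2:
                raise ValueError("operator without two operands")
            right = stack.pop()
            left = stack.pop()
            if ch in ("+", "-"):               # operands printed without parentheses
                stack.append(left + ch + right)
            else:                              # ( expr )*( expr ) or ( expr )/( expr )
                stack.append("(" + left + ")" + ch + "(" + right + ")")
        else:
            raise ValueError(f"unexpected character {ch!r}")
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]


print(postfix_to_infix("95-2*"))   # (9-5)*(2)
print(postfix_to_infix("952*-"))   # 9-(5)*(2)
```

Note that the scheme as given never parenthesizes the operands of + and -, so an input such as 952-- comes out as 9-5-2, which no longer groups the way the postfix input does.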

Parser
• Recursive descent parsing can be used both to parse and to implement syntax
directed translation.
• Parsing is the process of determining if a string of tokens can be generated by a
grammar.
• For any Context Free Grammar there is a parser that takes at most O(n^3) time to parse
a string of n tokens.
• Linear algorithms suffice for parsing programming language source code.
• Top-down parsing “constructs” a parse tree from root to leaves.
• Bottom-up parsing “constructs” a parse tree from leaves to root.
Top down Parser
The top-down construction of a parse tree is done by starting with the root, labeled with
the starting nonterminal, and repeatedly performing the following two steps.
1. At node N, labeled with nonterminal A, select one of the productions for A and
construct children at N for the symbols in the production body.
2. Find the next node at which a subtree is to be constructed, typically the leftmost
unexpanded nonterminal of the tree.


Predictive Parsing
Recursive descent parsing is a top-down method of syntax analysis in which a set of
recursive procedures is used to process the input.
• Each nonterminal has one (recursive) procedure that is responsible for parsing the
nonterminal’s syntactic category of input tokens.
• When a nonterminal has multiple productions, each production is implemented in
a branch of a selection statement based on input look-ahead information.
Predictive parsing is a special form of recursive descent parsing in which one
lookahead token is used to determine the parse operations unambiguously.
[Figures not reproduced: an example statement, the sequence of procedure calls made while
parsing it, and the relevant parser code.]
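
What follows is a minimal Python sketch of a predictive parser, not the lecture's own listing. It parses the digit expressions used throughout these notes and emits postfix as it goes. Two assumptions are worth stating: the left-recursive productions expr → expr + term and expr → expr - term cannot be handled directly by recursive descent, so they are rewritten here as "a term followed by zero or more (op term) pairs", and the names `Parser`, `match`, `lookahead` and `translate` are hypothetical.

```python
class Parser:
    """Predictive recursive-descent translator from infix digit expressions to postfix."""

    def __init__(self, text):
        self.tokens = [c for c in text if not c.isspace()]
        self.pos = 0
        self.out = []

    @property
    def lookahead(self):
        # The single lookahead token; None marks the end of the input.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def match(self, token):
        if self.lookahead != token:
            raise SyntaxError(f"expected {token!r}, found {self.lookahead!r}")
        self.pos += 1

    def expr(self):
        # expr -> term { (+|-) term {print(op)} }
        self.term()
        while self.lookahead in ("+", "-"):    # the lookahead alone selects the branch
            op = self.lookahead
            self.match(op)
            self.term()
            self.out.append(op)                # semantic action: print(op)

    def term(self):
        # term -> digit {print(digit)}
        token = self.lookahead
        if token is None or not token.isdigit():
            raise SyntaxError(f"expected a digit, found {token!r}")
        self.match(token)
        self.out.append(token)                 # semantic action: print(digit)


def translate(text):
    parser = Parser(text)
    parser.expr()
    if parser.lookahead is not None:
        raise SyntaxError(f"unexpected token {parser.lookahead!r}")
    return "".join(parser.out)


print(translate("9 - 5 + 2"))   # 95-2+
```

Each nonterminal has its own procedure (`expr`, `term`), and every decision is made by inspecting `lookahead` only, which is the defining feature of predictive parsing.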

References
1. Alfred V. Aho, Monica S. Lam, Ravi Sethi, and Jeffrey D. Ullman, “Compilers:
Principles, Techniques, and Tools”, Chapter 2, Second Edition, Pearson.
2. https://www.geeksforgeeks.org/syntax-directed-translation-in-compiler-design/
3. https://www.javatpoint.com/syntax-directed-translation
4. https://www.cs.csustan.edu/~xliang/Courses/CS4300-19F/Notes/Ch2.pdf
5. Honor Compilers, NYU, Lecture 5, Fall 2009, A. Pnueli
