Lecture 04
Compiler Construction
CS-4207
Disclaimer: The Contents of this reader are borrowed from the book(s) mentioned
in the reference section.
Introduction
The parser validates the input string against a context-free grammar (CFG) and
produces output for the compiler's next phase. That output may be a parse tree or an
abstract syntax tree. Syntax-directed translation interleaves the syntax analysis and
semantic analysis phases of the compiler.
Conceptually, with both syntax-directed definitions and translation schemes, we parse
the input token stream, build the parse tree, and then traverse the tree as needed to
evaluate the semantic rules at the parse tree nodes. Evaluating the semantic rules may
generate code, save information in a symbol table, issue error messages, or perform any
other activity. The translation of the token stream is the result obtained by evaluating
the semantic rules.
In syntax-directed translation, every nonterminal can have zero, one, or more
attributes, depending on the information it needs to carry. The values of these
attributes are computed by the semantic rules associated with the production rules.
In the semantic rules of the example below, the attribute is val; in general an attribute
may hold anything, such as a string, a number, a memory location, or a complex record.
In syntax-directed translation, whenever a construct of the programming language is
encountered, it is translated according to the semantic rules defined for that
construct.
Dr. Atif Ishaq Khan – Assistant Professor GC University, Lahore
Example
Productions      Semantic Rules
E → E1 + T       E.val := E1.val + T.val
E → T            E.val := T.val
T → T1 * F       T.val := T1.val * F.val
T → F            T.val := F.val
F → (E)          F.val := E.val
F → num          F.val := num.lexval
Where
E.val is one of the attributes of E
E1 and T1 denote the occurrences of E and T in the production body
num.lexval is the attribute returned by the lexical analyzer
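The semantic rules in this table can be tried out on a small parse tree. The sketch below is only an illustration: the Node class and its fields prod, kids, and lexval are hypothetical names chosen here, and the tree for 2 + 3 * 4 is built by hand rather than by a parser.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    prod: str                                   # production applied at this node
    kids: list = field(default_factory=list)    # child nonterminals, left to right
    lexval: int = 0                             # only used by "F -> num"

def val(n: Node) -> int:
    """Compute the synthesized attribute val of n from its children."""
    if n.prod == "E -> E + T":                  # E.val := E1.val + T.val
        return val(n.kids[0]) + val(n.kids[1])
    if n.prod == "T -> T * F":                  # T.val := T1.val * F.val
        return val(n.kids[0]) * val(n.kids[1])
    if n.prod in ("E -> T", "T -> F", "F -> (E)"):
        return val(n.kids[0])                   # copy rules
    if n.prod == "F -> num":                    # F.val := num.lexval
        return n.lexval
    raise ValueError(f"unknown production: {n.prod}")

def f_num(k: int) -> Node:
    """Leaf for F -> num, where k plays the role of num.lexval."""
    return Node("F -> num", lexval=k)

# Hand-built parse tree for 2 + 3 * 4
tree = Node("E -> E + T", [
    Node("E -> T", [Node("T -> F", [f_num(2)])]),
    Node("T -> T * F", [Node("T -> F", [f_num(3)]), f_num(4)]),
])

print(val(tree))  # 14
```

Each branch of val mirrors one row of the table, so the value at the root is obtained only after the values of all children are known.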
Example
Syntax-directed translation is done by attaching rules or program fragments to the
productions of a grammar. For example, consider an expression E generated by the
production
    E → E1 + T
Here E is the sum of the two subexpressions E1 and T (the subscript is used only to
distinguish similar instances of expressions).
Pseudocode
    translate E1;
    translate T;
    handle +;
This example translates an infix expression into a postfix expression. Before exploring
it, let us understand two closely related concepts.
Attributes: An attribute is any quantity associated with a programming construct (a
symbol in the grammar). Examples include the data type of an expression, the number of
instructions in the generated code, and the location of the first instruction in the
generated code for a construct, among many other possibilities.
(Syntax-Directed) Translation Scheme: A translation scheme is a notation for attaching
program fragments to the productions of a grammar. The program fragments are executed
when the production is used during syntax analysis. The combined result of all these
fragment executions, in the order induced by the syntax analysis, produces the translation
of the program to which this analysis/synthesis process is applied.
Postfix Notation
The postfix notation for an expression E is defined inductively as follows:
1. If E is a variable or constant, then the postfix notation for E is E itself.
2. If E is an expression of the form E1 op E2, where op is any binary operator, then the
postfix notation for E is E1' E2' op, where E1' and E2' are the postfix notations for E1
and E2 respectively.
3. If E is a parenthesized expression of the form (E1), then the postfix notation for E
is the same as the postfix notation for E1.
Example of Postfix notation
(9 – 5) + 2 has the postfix form 95-2+
The constants 9, 5, and 2 are their own postfix forms, by rule 1.
The translation of 9 – 5 to 95- is by rule 2.
The translation of (9 – 5) to 95- is by rule 3.
For the entire expression, (9 – 5) is treated as E1 and 2 as E2, and applying rule 2
gives 95-2+.
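The three rules translate directly into a recursive function. The following is a minimal sketch, assuming a hypothetical encoding in which a constant is a string, a binary expression is a tuple (op, E1, E2), and a parenthesized expression is a tuple ("paren", E1).

```python
def postfix(e):
    """Return the postfix notation of expression e using rules 1 to 3."""
    if isinstance(e, str):        # rule 1: a variable or constant is its own postfix form
        return e
    if e[0] == "paren":           # rule 3: parentheses disappear
        return postfix(e[1])
    op, e1, e2 = e                # rule 2: E1' E2' op
    return postfix(e1) + postfix(e2) + op

# (9 - 5) + 2
expr = ("+", ("paren", ("-", "9", "5")), "2")
print(postfix(expr))  # 95-2+
```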
Exercise
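Consider the postfix expression 952+-3*. Evaluate it by scanning from the left until
an operator is found, applying that operator to the two operands immediately to its
left, replacing the operator and its operands with the result, and repeating.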
952+-3* → 97-3* → 23* → 6
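The same evaluation is commonly carried out with a stack instead of rewriting the string. The sketch below assumes single-digit operands and the four operators +, -, *, and /; the function name eval_postfix is chosen here for illustration.

```python
def eval_postfix(s):
    """Evaluate a postfix string of single-digit operands with a stack."""
    stack = []
    for ch in s:
        if ch.isdigit():
            stack.append(int(ch))          # operands wait on the stack
            continue
        if len(stack) < 2:
            raise ValueError(f"operator {ch!r} is missing an operand")
        right = stack.pop()                # the most recent operand is the right one
        left = stack.pop()
        if ch == "+":
            stack.append(left + right)
        elif ch == "-":
            stack.append(left - right)
        elif ch == "*":
            stack.append(left * right)
        elif ch == "/":
            stack.append(left / right)
        else:
            raise ValueError(f"unexpected symbol {ch!r}")
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]

print(eval_postfix("952+-3*"))  # 6
```

The stack contents after each operator (9 7, then 2, then 6) correspond to the intermediate strings 97-3*, 23*, and 6 in the exercise.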
Suppose a node N in a parse tree is labeled by the grammar symbol X. We write X.a to
denote the value of attribute a of X at that node. A parse tree showing the attribute
values at each node is called an annotated parse tree. Below is an annotated parse tree
for the expression 9 – 5 + 2 with an attribute value a associated with nonterminals E
and T. The value 95-2+ is postfix notation for the given expression.
As already discussed, the postfix form of a digit is the digit itself; e.g., the
semantic rule associated with the production T → 9 defines T.a to be 9 itself whenever
this production is used at a node in a parse tree. The other digits are translated similarly.
As another example, when the production E → T is applied, the value of T.a becomes
the value of E.a.
Important Consideration
The string representing the translation of the nonterminal at the head of each production
is the concatenation of the translations of the nonterminals in the production body, in
the same order as in the production, with some optional additional strings interleaved.
A syntax-directed definition with this property is termed simple. We can see in the
above table that while defining the semantic rules the operands appear in the same order
as in the production body.
Tree Traversal
Tree traversals will be used to describe attribute evaluation and to specify how code
fragments in a translation scheme should be executed. A traversal of a tree begins at
the root and visits each node of the tree in some order.
Depth-First Search
Depth-first search (DFS) is a technique for traversing a tree or graph that relies on
backtracking. The traversal goes as deep as possible along each branch first and
backtracks to the parent node once a node has no unvisited children. The procedure
visit(N) is a depth-first traversal that visits the children of a node in left-to-right
order.
A syntax-directed definition does not impose any specific order for the evaluation of
attributes on a parse tree; any evaluation order that computes an attribute a after all the
other attributes that a depends on is acceptable. Synthesized attributes can be evaluated
during any bottom-up traversal, that is, a traversal that evaluates attributes at a node
after having evaluated attributes at its children.
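A depth-first procedure visit(N) that evaluates synthesized attributes after visiting the children can be sketched as follows. The Node class and the helpers digit, copy_up, and binop are hypothetical names introduced for this illustration, terminal leaves are left out of the tree for brevity, and the attribute a holds the postfix string as in the annotated parse tree for 9 – 5 + 2.

```python
class Node:
    def __init__(self, label, children=(), rule=None):
        self.label = label
        self.children = list(children)
        self.rule = rule          # semantic rule: a function that sets node.a
        self.a = None             # synthesized attribute

def visit(n):
    """Depth-first traversal: children left to right, then the rules at n."""
    for c in n.children:
        visit(c)
    if n.rule is not None:
        n.rule(n)

def digit(d):                     # T -> d        T.a := d
    return Node("T", rule=lambda n: setattr(n, "a", d))

def copy_up(child):               # E -> T        E.a := T.a
    return Node("E", [child], rule=lambda n: setattr(n, "a", n.children[0].a))

def binop(op, left, right):       # E -> E1 op T  E.a := E1.a || T.a || op
    return Node("E", [left, right],
                rule=lambda n: setattr(n, "a", n.children[0].a + n.children[1].a + op))

# Parse tree for 9 - 5 + 2
tree = binop("+", binop("-", copy_up(digit("9")), digit("5")), digit("2"))
visit(tree)
print(tree.a)  # 95-2+
```

Because the rule at a node runs only after all its children have been visited, every attribute it reads has already been computed.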
Translation Scheme
We now look at a different strategy that does not require manipulating strings and
instead builds the same translation incrementally by running program fragments.
Semantic actions are program fragments embedded within production bodies. The
position at which an action is to be executed is shown by enclosing it in curly
braces and writing it within the production body. When drawing a parse tree for a
translation scheme, we indicate an action by constructing an extra child for it, connected
by a dashed line to the node that corresponds to the head of the production. The parse
tree below includes the actions that translate the expression 9 – 5 + 2 into the postfix
notation 95-2+.
The root of the parse tree corresponds to the first production of the translation scheme.
In a postorder traversal, we first perform all the actions in the leftmost subtree of the
root, for the left operand, which is also labeled expr like the root. We then visit the
leaf +, at which there is no action. We next perform the actions in the subtree for the
right operand term and, finally, the semantic action {print('+')} at the extra node.
Since the productions for term have only a digit on the right side, that digit is printed by
the actions for those productions. No output is necessary for the production expr → term,
and only the operator needs to be printed in the action for each of the first two productions.
The semantic actions in the parse tree translate the infix expression 9 − 5 + 2 into
95-2+ by printing each character in 9 − 5 + 2 exactly once, without using any storage for
the translation of subexpressions. The implementation of a translation scheme must
ensure that semantic actions are performed in the order they would appear during a
postorder traversal of a parse tree.
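The effect of the extra action leaves can be imitated with a small sketch. The encoding is an assumption made for illustration: an interior node is a tuple (label, child, ...), a terminal leaf is a string, and an action leaf is a function that writes to an output buffer. The names run_actions, emit, and term are hypothetical.

```python
import io

def run_actions(node, out):
    """Walk the tree depth-first, left to right, running each action leaf when reached."""
    if callable(node):            # semantic action leaf, e.g. {print('+')}
        node(out)
        return
    if isinstance(node, str):     # terminal leaf: no action attached
        return
    _label, *children = node
    for child in children:
        run_actions(child, out)

def emit(text):
    """Build an action that prints text to the output."""
    return lambda out: out.write(text)

def term(d):                      # term -> d {print(d)}
    return ("term", d, emit(d))

# Parse tree for 9 - 5 + 2 with actions as rightmost children
tree = ("expr",
        ("expr", ("expr", term("9")), "-", term("5"), emit("-")),
        "+", term("2"), emit("+"))

out = io.StringIO()
run_actions(tree, out)
print(out.getvalue())  # 95-2+
```

No subexpression translation is stored anywhere; each character is written once, at the moment its action leaf is reached.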
Exercise-1
Construct a syntax-directed translation scheme that translates arithmetic expressions from
infix notation into prefix notation, in which an operator appears before its operands; e.g.,
−xy is the prefix notation for x − y. Give annotated parse trees for the inputs 9 − 5 + 2
and 9 − 5 ∗ 2.
Solution
Productions
    expr   -> expr + term
            | expr - term
            | term
    term   -> term * factor
            | term / factor
            | factor
    factor -> digit
            | (expr)
Translation Scheme
    expr   -> {print("+")} expr + term
            | {print("-")} expr - term
            | term
    term   -> {print("*")} term * factor
            | {print("/")} term / factor
            | factor
    factor -> digit {print(digit)}
            | (expr)
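The expected output of this scheme for the two inputs can be checked with a short sketch. It works on a hand-built expression tree rather than on the grammar itself, using an assumed encoding in which a digit is a string and a binary expression is a tuple (op, left, right); the operator is emitted before its operands, just as the print actions sit at the start of each production body.

```python
def prefix(e):
    """Return the prefix notation of expression e."""
    if isinstance(e, str):        # factor -> digit {print(digit)}
        return e
    op, left, right = e           # the action for op comes first in the body
    return op + prefix(left) + prefix(right)

# 9 - 5 + 2 groups as (9 - 5) + 2
print(prefix(("+", ("-", "9", "5"), "2")))   # +-952
# 9 - 5 * 2 groups as 9 - (5 * 2)
print(prefix(("-", "9", ("*", "5", "2"))))   # -9*52
```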
Exercise-2
Construct a syntax-directed translation scheme that translates arithmetic expressions from
postfix notation into infix notation. Give annotated parse trees for the inputs
95-2* and 952*-.
Solution
Productions
    expr -> expr expr +
          | expr expr -
          | expr expr *
          | expr expr /
          | digit
Translation Scheme
    expr -> expr {print("+")} expr +
          | expr {print("-")} expr -
          | {print("(")} expr {print(")*(")} expr {print(")")} *
          | {print("(")} expr {print(")/(")} expr {print(")")} /
          | digit {print(digit)}
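The strings this scheme should print can be previewed with a stack-based sketch. Note that this is a different technique from the translation scheme itself (it stores the translations of subexpressions), used here only to check the expected output; the function name to_infix and the single-digit operand assumption are choices made for the illustration.

```python
def to_infix(s):
    """Convert a postfix string of single-digit operands to infix."""
    stack = []
    for ch in s:
        if ch.isdigit():
            stack.append(ch)                      # digit {print(digit)}
            continue
        if ch not in "+-*/" or len(stack) < 2:
            raise ValueError(f"malformed postfix input at {ch!r}")
        right = stack.pop()
        left = stack.pop()
        if ch in "+-":
            stack.append(left + ch + right)       # expr op expr, no parentheses
        else:
            stack.append("(" + left + ")" + ch + "(" + right + ")")
    if len(stack) != 1:
        raise ValueError("malformed postfix input")
    return stack[0]

print(to_infix("95-2*"))  # (9-5)*(2)
print(to_infix("952*-"))  # 9-(5)*(2)
```

Parenthesizing both operands of * and / keeps the lower-precedence + and - subexpressions grouped correctly, at the cost of some redundant parentheses.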
Parser
Recursive-descent parsing can be used both to parse and to implement
syntax-directed translation.
Parsing is the process of determining whether a string of tokens can be generated by a
grammar.
For any context-free grammar there is a parser that takes at most O(n^3) time to parse
a string of n tokens.
Linear-time algorithms suffice for parsing essentially all programming languages that
arise in practice.
Top-down parsing constructs a parse tree from the root to the leaves.
Bottom-up parsing constructs a parse tree from the leaves to the root.
Top-Down Parser
The top-down construction of a parse tree is done by starting with the root, labeled with
the starting nonterminal, and repeatedly performing the following two steps.
1. At node N, labeled with nonterminal A, select one of the productions for A and
construct children at N for the symbols in the production body.
2. Find the next node at which a subtree is to be constructed, typically the leftmost
unexpanded nonterminal of the tree.
Predictive Parsing
Recursive descent parsing is a top-down method of syntax analysis in which a set of
recursive procedures is used to process the input.
Each nonterminal has one (recursive) procedure that is responsible for parsing the
nonterminal’s syntactic category of input tokens
When a nonterminal has multiple productions, each production is implemented in
a branch of a selection statement based on input look-ahead information
Predictive parsing is a special form of recursive descent parsing where we use one
lookahead token to unambiguously determine the parse operations
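A predictive parser that also performs translation can be sketched as below. The grammar is an assumption made for this sketch: a right-recursive variant (expr → term rest; rest → + term {print('+')} rest | - term {print('-')} rest | ε; term → digit {print(digit)}) is used because left-recursive productions cannot be handled directly by recursive descent. The class and method names are hypothetical.

```python
class Parser:
    """One procedure per nonterminal; one lookahead character selects the production."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.out = []                     # output of the print actions

    @property
    def lookahead(self):
        return self.text[self.pos] if self.pos < len(self.text) else None

    def match(self, expected):
        if self.lookahead != expected:
            raise SyntaxError(f"expected {expected!r} at position {self.pos}")
        self.pos += 1

    def expr(self):                       # expr -> term rest
        self.term()
        self.rest()

    def rest(self):
        if self.lookahead in ("+", "-"):  # rest -> op term {print(op)} rest
            op = self.lookahead
            self.match(op)
            self.term()
            self.out.append(op)
            self.rest()
        # otherwise rest -> epsilon: consume nothing

    def term(self):                       # term -> digit {print(digit)}
        d = self.lookahead
        if d is None or not d.isdigit():
            raise SyntaxError(f"expected a digit at position {self.pos}")
        self.match(d)
        self.out.append(d)

def translate(text):
    """Parse text as an expr and return its postfix translation."""
    p = Parser(text)
    p.expr()
    if p.lookahead is not None:
        raise SyntaxError(f"unexpected {p.lookahead!r} at position {p.pos}")
    return "".join(p.out)

print(translate("9-5+2"))  # 95-2+
```

In rest, the single lookahead character decides between the operator productions and the empty production, which is what makes the parser predictive.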
For the following statement
References
1. Alfred V. Aho, Monica S. Lam, Ravi Sethi, and Jeffrey D. Ullman, “Compilers:
Principles, Techniques, and Tools”, Chapter 2, Second Edition, Pearson.
2. https://www.geeksforgeeks.org/syntax-directed-translation-in-compiler-design/
3. https://www.javatpoint.com/syntax-directed-translation
4. https://www.cs.csustan.edu/~xliang/Courses/CS4300-19F/Notes/Ch2.pdf
5. Honor Compilers, NYU, Lecture 5, Fall 2009, A. Pnueli