
Unit III Notes

The document discusses syntax-directed translation, which involves attaching rules to grammar productions for translating expressions, particularly into postfix notation. It explains attributes, including synthesized and inherited attributes, and their evaluation through parse trees and dependency graphs. Additionally, it covers the construction of syntax trees, annotated parse trees, and the application of syntax-directed translation schemes in compiler design.

Uploaded by

Vijay Dhanush

Compiler Design

UNIT-III-Basics
Syntax-Directed Translation

Syntax-directed translation is done by attaching rules or program fragments to productions in a
grammar. For example, consider an expression expr generated by the production
expr -> expr1 + term
Here, expr is the sum of the two subexpressions expr1 and term. (The subscript in expr1 is used
only to distinguish the instance of expr in the production body from the head of the production.)
We can translate expr by exploiting its structure, as in the following pseudo-code:

translate expr1;
translate term;
handle +;

This section introduces two concepts related to syntax-directed translation:

• Attributes. An attribute is any quantity associated with a programming construct. Examples of
attributes are data types of expressions, the number of instructions in the generated code, or the
location of the first instruction in the generated code for a construct, among many other
possibilities. Since we use grammar symbols (nonterminals and terminals) to represent
programming constructs, we extend the notion of attributes from constructs to the symbols that
represent them.

• (Syntax-directed) translation schemes. A translation scheme is a notation for attaching
program fragments to the productions of a grammar. The program fragments are executed when
the production is used during syntax analysis.

Postfix Notation

This section deals with translation into postfix notation. The postfix notation for an
expression E can be defined inductively as follows:

1. If E is a variable or constant, then the postfix notation for E is E itself.
2. If E is an expression of the form E1 op E2, where op is any binary operator, then the postfix
notation for E is E1' E2' op, where E1' and E2' are the postfix notations for E1 and E2,
respectively.
3. If E is a parenthesized expression of the form (E1), then the postfix notation for E is the same
as the postfix notation for E1.
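
The three rules above can be sketched directly as a recursive function. This is only an illustration: the tuple encoding of expressions and the name postfix are assumptions made for the sketch, not part of the notes.

```python
def postfix(e):
    """Return the postfix notation of expression e.

    Assumed encoding (hypothetical, for illustration only):
      - a str is a variable or constant,
      - ("paren", e1) is a parenthesized expression,
      - (op, e1, e2) is a binary operation.
    """
    if isinstance(e, str):          # rule 1: variable or constant
        return e
    if e[0] == "paren":             # rule 3: parentheses disappear
        return postfix(e[1])
    op, e1, e2 = e                  # rule 2: operands first, then operator
    return postfix(e1) + postfix(e2) + op


# (9-5)+2 and 9-(5+2)
print(postfix(("+", ("paren", ("-", "9", "5")), "2")))
print(postfix(("-", "9", ("paren", ("+", "5", "2")))))
```

The two calls correspond to the expressions (9-5)+2 and 9-(5+2).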

SREC, Nandyal Page 1



Example 2.8: The postfix notation for (9-5)+2 is 95-2+. That is, the translations of 9, 5, and 2
are the constants themselves, by rule (1). Then, the translation of 9-5 is 95- by rule (2). The
translation of (9-5) is the same by rule (3). Having translated the parenthesized subexpression,
we may apply rule (2) to the entire expression, with (9-5) in the role of E1 and 2 in the role of E2,
to get the result 95-2+.
As another example, the postfix notation for 9-(5+2) is 952+-. That is, 5+2 is first
translated into 52+, and this expression becomes the second argument of the minus sign.

Synthesized Attributes
A syntax-directed definition associates
1. With each grammar symbol, a set of attributes, and
2. With each production, a set of semantic rules for computing the values of the attributes
associated with the symbols appearing in the production.

Attributes can be evaluated as follows. For a given input string x, construct a parse tree for x.
Then, apply the semantic rules to evaluate attributes at each node in the parse tree.

A parse tree showing the attribute values at each node is called an annotated parse tree.

For example, Fig. 2.9 shows an annotated parse tree for 9-5+2 with an attribute t associated with
the nonterminals expr and term. The value 95-2+ of the attribute at the root is the postfix
notation for 9-5+2.

An attribute is said to be synthesized if its value at a parse-tree node N is determined from
attribute values at the children of N and at N itself. Synthesized attributes have the desirable
property that they can be evaluated during a single bottom-up traversal of a parse tree.

Informally, inherited attributes have their value at a parse-tree node determined from attribute
values at the node itself, its parent, and its siblings in the parse tree.


The postfix form of a digit is the digit itself; e.g., the semantic rule associated with the
production term -> 9 defines term.t to be 9 itself whenever this production is used at a node in a
parse tree. The other digits are translated similarly. As another example, when the production
expr -> term is applied, the value of term.t becomes the value of expr.t.


Syntax Directed Translation:

A syntax-directed definition specifies the values of attributes by associating semantic rules with
the grammar productions. For example, an infix-to-postfix translator might have a production
and rule

PRODUCTION          SEMANTIC RULE
E -> E1 + T         E.code = E1.code || T.code || '+'

Both E and T have a string-valued attribute code. The semantic rule specifies that the string
E.code is formed by concatenating E1.code, T.code, and the character '+'.

A syntax-directed translation scheme embeds program fragments called semantic actions
within production bodies, as in

E -> E1 + T { print '+' }

By convention, semantic actions are enclosed within curly braces. (If curly braces occur as
grammar symbols, we enclose them within single quotes, as in '{' and '}'.) The position of a
semantic action in a production body determines the order in which the action is executed.
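
As a rough illustration of actions executed during parsing, the sketch below translates single-digit expressions with + and - into postfix. It assumes a small grammar (expr -> expr + term | expr - term | term, term -> digit), replaces the left recursion by a loop, and collects the output in a list instead of printing; the function name infix_to_postfix is hypothetical.

```python
def infix_to_postfix(src):
    """Translate e.g. "9-5+2" to postfix by running actions while parsing."""
    out = []   # stands in for print: each action appends one symbol
    pos = 0

    def term():
        nonlocal pos
        if pos < len(src) and src[pos].isdigit():
            out.append(src[pos])          # term -> digit { print(digit) }
            pos += 1
        else:
            raise SyntaxError(f"digit expected at position {pos}")

    term()
    # Left recursion expr -> expr op term is handled by iteration (an assumption
    # of this sketch; the action still runs after both operands are processed).
    while pos < len(src) and src[pos] in "+-":
        op = src[pos]
        pos += 1
        term()
        out.append(op)                    # { print(op) } at the end of the body
    if pos != len(src):
        raise SyntaxError(f"unexpected input at position {pos}")
    return "".join(out)


print(infix_to_postfix("9-5+2"))
```

Because the operator's action sits at the end of the production body, it runs only after both operands have been emitted, which is what produces postfix order.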

Syntax Directed Definition:

A syntax-directed definition (SDD) is a context-free grammar together with attributes and
rules. Attributes are associated with grammar symbols and rules are associated with productions.
If X is a symbol and a is one of its attributes, then we write X.a to denote the value of a at a
particular parse-tree node labeled X.

Inherited and Synthesized Attributes:


We shall deal with two kinds of attributes for nonterminals:
1. A synthesized attribute for a nonterminal A:
A synthesized attribute at node N is defined only in terms of attribute values at the
children of N and at N itself.
2. An inherited attribute for a nonterminal B:
An inherited attribute at node N is defined only in terms of attribute values at N's
parent, N itself, and N's siblings.


Terminals can have synthesized attributes, but not inherited attributes. Attributes for terminals
have lexical values that are supplied by the lexical analyzer; there are no semantic rules in the
SDD itself for computing the value of an attribute for a terminal.
Example: The SDD below is based on our familiar grammar for arithmetic expressions with operators
+ and *. It evaluates expressions terminated by an end marker n. In the SDD, each of the
nonterminals has a single synthesized attribute, called val. We also suppose that the terminal digit
has a synthesized attribute lexval, which is an integer value returned by the lexical analyzer.

PRODUCTION          SEMANTIC RULE
1) L -> E n         L.val = E.val
2) E -> E1 + T      E.val = E1.val + T.val
3) E -> T           E.val = T.val
4) T -> T1 * F      T.val = T1.val * F.val
5) T -> F           T.val = F.val
6) F -> ( E )       F.val = E.val
7) F -> digit       F.val = digit.lexval

The rule for production 1, L -> E n, sets L.val to E.val, which we shall see is the numerical
value of the entire expression.
Production 2, E -> E1 + T, also has one rule, which computes the val attribute for the head E as
the sum of the values at E1 and T. At any parse-tree node N labeled E, the value of val for E is the
sum of the values of val at the children of node N labeled E and T.
Production 3, E -> T, has a single rule that defines the value of val for E to be the same as the
value of val at the child for T.
Production 4 is similar to the second production; its rule multiplies the values at the children
instead of adding them.
The rules for productions 5 and 6 copy values at a child, like that for the third production.
Production 7 gives F.val the value of a digit, that is, the numerical value of the token digit that
the lexical analyzer returned.

An SDD that involves only synthesized attributes is called S-attributed. In an S-attributed SDD,
each rule computes an attribute for the nonterminal at the head of a production from attributes
taken from the body of the production.
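
A minimal sketch of how the val attribute of this S-attributed SDD could be computed by a postorder walk is given below. The nested-tuple encoding of parse trees and the helper names val and F are assumptions made for the illustration.

```python
def val(node):
    """Compute the synthesized attribute val at a parse-tree node.

    Assumed encoding: a node is (symbol, child, ...); terminals other than
    digit are plain strings; a digit leaf is ("digit", lexval).
    Children are evaluated before their parent, i.e. in postorder.
    """
    head, *kids = node
    if head == "digit":                       # lexval comes from the lexer
        return kids[0]
    shape = [k if isinstance(k, str) else k[0] for k in kids]
    if head == "L":                           # L -> E n
        return val(kids[0])
    if shape == ["E", "+", "T"]:              # E -> E1 + T
        return val(kids[0]) + val(kids[2])
    if shape == ["T", "*", "F"]:              # T -> T1 * F
        return val(kids[0]) * val(kids[2])
    if shape == ["(", "E", ")"]:              # F -> ( E )
        return val(kids[1])
    return val(kids[0])                       # copy rules: E -> T, T -> F, F -> digit


def F(n):
    """Helper building the subtree F -> digit for lexical value n."""
    return ("F", ("digit", n))


# Parse tree for the input 3 * 5 + 4 n
tree = ("L",
        ("E",
         ("E", ("T", ("T", F(3)), "*", F(5))),
         "+",
         ("T", F(4))),
        "n")
print(val(tree))
```

Each branch mirrors one semantic rule, so the value computed at the root is the value of the whole expression.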


Evaluating an SDD at the Nodes of a Parse Tree:


A parse tree showing the value(s) of its attribute(s) is called an annotated parse tree.
If all attributes are synthesized, as in Example 5.1, then we must evaluate the val attributes at all of
the children of a node before we can evaluate the val attribute at the node itself.
With synthesized attributes, we can evaluate attributes in any bottom-up order, such as that of a
postorder traversal of the parse tree.

For SDD's with both inherited and synthesized attributes, there is no guarantee that there is even
one order in which to evaluate attributes at nodes.
For instance, consider nonterminals A and B, with synthesized and inherited attributes A.s and
B.i, respectively, along with the production and rules

PRODUCTION          SEMANTIC RULES
A -> B              A.s = B.i;
                    B.i = A.s + 1

These rules are circular: A.s at a node N cannot be evaluated without first evaluating B.i at the
child of N, and vice versa.

Inherited attributes are useful when the structure of a parse tree does not "match" the
abstract syntax of the source code.


Example: The SDD in Fig. 5.4 computes terms like 3 * 5 and 3 * 5 * 7. The top-down parse
of input 3*5 begins with the production T -> F T'. Here, F generates the digit 3, but the operator *
is generated by T'. Thus, the left operand 3 appears in a different subtree of the parse tree from
*. An inherited attribute will therefore be used to pass the operand to the operator.

A grammar as a running example to illustrate top-down parsing

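Fig. 5.4: An SDD based on a grammar suitable for top-down parsing

PRODUCTION          SEMANTIC RULES
1) T -> F T'        T'.inh = F.val
                    T.val = T'.syn
2) T' -> * F T1'    T1'.inh = T'.inh * F.val
                    T'.syn = T1'.syn
3) T' -> ε          T'.syn = T'.inh
4) F -> digit       F.val = digit.lexval

Each of the nonterminals T and F has a synthesized attribute val; the terminal digit has a
synthesized attribute lexval. The nonterminal T' has two attributes: an inherited attribute inh and
a synthesized attribute syn.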
To see how the semantic rules are used, consider the annotated parse tree for 3 * 5 in Fig.
5.5. The leftmost leaf in the parse tree, labeled digit, has attribute value lexval = 3, where the 3 is
supplied by the lexical analyzer. Its parent is for production 4, F -> digit. The only semantic rule
associated with this production defines F.val = digit.lexval, which equals 3.


At the second child of the root, the inherited attribute T'.inh is defined by the semantic
rule T'.inh = F.val associated with production 1. Thus, the left operand, 3, for the * operator is
passed from left to right across the children of the root.

Evaluation Orders for SDD's:


"Dependency graphs" are a useful tool for determining an evaluation order for the attribute
instances in a given parse tree. While an annotated parse tree shows the values of attributes, a
dependency graph helps us determine how those values can be computed.

Dependency Graphs:
A dependency graph depicts the flow of information among the attribute instances in a particular
parse tree; an edge from one attribute instance to another means that the value of the first is
needed to compute the second. Edges express constraints implied by the semantic rules.

In more detail:

• For each parse-tree node, say a node labeled by grammar symbol X, the dependency graph has
a node for each attribute associated with X.

• Suppose that a semantic rule associated with a production p defines the value of synthesized
attribute A.b in terms of the value of X.c (the rule may define A.b in terms of other attributes in
addition to X.c). Then, the dependency graph has an edge from X.c to A.b.

• Suppose that a semantic rule associated with a production p defines the value of inherited
attribute B.c in terms of the value of X.a. Then, the dependency graph has an edge from X.a to
B.c.
Example: Consider the following production and rule:

PRODUCTION          SEMANTIC RULE
E -> E1 + T         E.val = E1.val + T.val

At every node N labeled E whose children correspond to the body of this production, the
synthesized attribute val at N is computed using the values of val at the two children, labeled
E and T. Thus, a portion of the dependency graph for every parse tree in which this production is
used looks like Fig. 5.6. As a convention, we shall show the parse tree edges as dotted lines,
while the edges of the dependency graph are solid.


Example 2:

An example of a complete dependency graph appears in Fig. 5.7. The nodes of the dependency
graph, represented by the numbers 1 through 9, correspond to the attributes in the annotated
parse tree in Fig. 5.5.

Figure 5.7: Dependency graph for the annotated parse tree of Fig. 5.5

Nodes 1 and 2 represent the attribute lexval associated with the two leaves labeled digit.
Nodes 3 and 4 represent the attribute val associated with the two nodes labeled F. The edges to
node 3 from 1 and to node 4 from 2 result from the semantic rule that defines F.val in terms of
digit.lexval.


Nodes 5 and 6 represent the inherited attribute T'.inh associated with each of the occurrences of
nonterminal T'. The edge to 5 from 3 is due to the rule T'.inh = F.val, which defines T'.inh at the
right child of the root from F.val at the left child. We see edges to 6 from node 5 for T'.inh and
from node 4 for F.val, because these values are multiplied to evaluate the attribute inh at node 6.
Nodes 7 and 8 represent the synthesized attribute syn associated with the occurrences of T'. The
edge to node 7 from 6 is due to the semantic rule T'.syn = T'.inh associated with production 3 in
Fig. 5.4. The edge to node 8 from 7 is due to a semantic rule associated with production 2.
Finally, node 9 represents the attribute T.val. The edge to 9 from 8 is due to the semantic rule,
T.val = T'.syn, associated with production 1.
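
Any topological order of the dependency graph is a valid evaluation order. The sketch below encodes the edges described above for nodes 1 through 9 and evaluates the attributes for the input 3 * 5. It uses graphlib.TopologicalSorter from the Python standard library (Python 3.9 or later); the dictionaries deps and rules are an encoding assumed for this illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# deps[n] = set of nodes whose values are needed to compute node n
deps = {3: {1}, 4: {2}, 5: {3}, 6: {4, 5}, 7: {6}, 8: {7}, 9: {8}}

# One evaluation rule per non-leaf node, mirroring the semantic rules.
rules = {
    3: lambda v: v[1],           # F.val = digit.lexval
    4: lambda v: v[2],           # F.val = digit.lexval
    5: lambda v: v[3],           # T'.inh = F.val
    6: lambda v: v[5] * v[4],    # T1'.inh = T'.inh * F.val
    7: lambda v: v[6],           # T'.syn = T'.inh
    8: lambda v: v[7],           # T'.syn = T1'.syn
    9: lambda v: v[8],           # T.val = T'.syn
}

values = {1: 3, 2: 5}            # lexval of the two digit leaves
order = list(TopologicalSorter(deps).static_order())
for n in order:
    if n in rules:               # leaves 1 and 2 already have values
        values[n] = rules[n](values)

print(order)
print(values[9])
```

If the graph contained a cycle, static_order would raise graphlib.CycleError, which corresponds to an SDD with no valid evaluation order for that parse tree.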
Example 3:

Fig: Dependency graph for a declaration float id1, id2, id3.


Applications of Syntax-Directed Translation:


1. Construction of syntax trees
2. Construction of annotated parse trees
3. Construction of dependency graphs

Syntax-Directed Translation Schemes:
Syntax-directed translation schemes are a complementary notation to syntax-directed definitions.
A syntax-directed translation scheme (SDT) is a context-free grammar with program fragments
embedded within production bodies. The program fragments are called semantic actions and can
appear at any position within a production body.
We focus on the use of SDT's to implement two important classes of SDD's:
1. The underlying grammar is LR-parsable, and the SDD is S-attributed.
2. The underlying grammar is LL-parsable, and the SDD is L-attributed.

1. Postfix Translation Schemes:


In this case, we can construct an SDT in which each action is placed at the end of the
production and is executed along with the reduction of the body to the head of that production.
SDT's with all actions at the right ends of the production bodies are called postfix SDT's.
Ex:


SDT's for L-Attributed Definitions:


We consider the more general case of an L-attributed SDD. With any grammar, the
technique below can be implemented by attaching actions to a parse tree and executing them
during preorder traversal of the tree.
The rules for turning an L-attributed SDD into an SDT are as follows:

1. Embed the action that computes the inherited attributes for a nonterminal A immediately
before that occurrence of A in the body of the production. If several inherited attributes for A
depend on one another in an acyclic fashion, order the evaluation of attributes so that those
needed first are computed first.

2. Place the actions that compute a synthesized attribute for the head of a production at the end of
the body of that production.

We use the following attributes to generate the proper intermediate code:

1. The inherited attribute S.next labels the beginning of the code that must be executed after S is
finished.
2. The synthesized attribute S.code is the sequence of intermediate-code steps that implements a
statement S and ends with a jump to S.next.
3. The inherited attribute C.true labels the beginning of the code that must be executed if C is
true.
4. The inherited attribute C.false labels the beginning of the code that must be executed if C is
false.
5. The synthesized attribute C.code is the sequence of intermediate-code steps that implements
the condition C and jumps either to C.true or to C.false, depending on whether C is true or false.

Implementing L-Attributed SDD's:


Many translation applications can be addressed using L-attributed definitions. The
following methods do translation by traversing a parse tree:
1. Build the parse tree and annotate. This method works for any noncircular SDD whatsoever.
Annotated parse trees were introduced earlier.

2. Build the parse tree, add actions, and execute the actions in preorder. This approach works
for any L-attributed definition.

The following methods do translation during parsing:

3. Use a recursive-descent parser with one function for each nonterminal. The function for
nonterminal A receives the inherited attributes of A as arguments and returns the synthesized
attributes of A.


4. Generate code on the fly, using a recursive-descent parser.

5. Implement an SDT in conjunction with an LL-parser. The attributes are kept on the parsing
stack, and the rules fetch the needed attributes from known locations on the stack.

6. Implement an SDT in conjunction with an LR-parser. This method may be surprising, since the
SDT for an L-attributed SDD typically has actions in the middle of productions, and we cannot be
sure during an LR parse that we are even in that production until its entire body has been constructed.

1. Translation During Recursive-Descent Parsing:


A recursive-descent parser has a function A for each nonterminal A. We can extend the parser
into a translator as follows:
a) The arguments of function A are the inherited attributes of nonterminal A.
b) The return-value of function A is the collection of synthesized attributes of nonterminal A.

In the body of function A, we need to both parse and handle attributes:

1. Decide upon the production used to expand A.


2. Check that each terminal appears on the input when it is required.
3. Preserve, in local variables, the values of all attributes needed to compute inherited attributes
for nonterminals in the body or synthesized attributes for the head nonterminal.
4. Call functions corresponding to nonterminals in the body of the selected production, providing
them with the proper arguments.
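
The scheme above can be sketched for the grammar T -> F T', T' -> * F T1' | ε, F -> digit used earlier. In this illustration the inherited attribute T'.inh is a function argument and the synthesized attributes are return values. The token list format and the function names are assumptions of the sketch.

```python
def parse_T(tokens):
    """Parse and evaluate T -> F T'. tokens is a list such as [3, "*", 5]."""
    pos = 0

    def F():
        # F -> digit ; returns F.val = digit.lexval
        nonlocal pos
        if pos >= len(tokens) or not isinstance(tokens[pos], int):
            raise SyntaxError(f"digit expected at token {pos}")
        lexval = tokens[pos]
        pos += 1
        return lexval

    def T_prime(inh):
        # inh is the inherited attribute T'.inh; the return value is T'.syn
        nonlocal pos
        if pos < len(tokens) and tokens[pos] == "*":   # T' -> * F T1'
            pos += 1
            f_val = F()                 # kept in a local variable
            return T_prime(inh * f_val) # T1'.inh = T'.inh * F.val ; T'.syn = T1'.syn
        return inh                      # T' -> epsilon ; T'.syn = T'.inh

    f_val = F()
    t_val = T_prime(f_val)              # T'.inh = F.val ; T.val = T'.syn
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return t_val


print(parse_T([3, "*", 5]))
print(parse_T([3, "*", 5, "*", 7]))
```

The left operand travels down the tree as an argument, which is exactly the role the inherited attribute plays in the SDD.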

2. On-The-Fly Code Generation:


The construction of long strings of code as attribute values is undesirable for
several reasons, including the time it could take to copy or move long strings. In common cases
such as our running code generation example, we can instead incrementally generate pieces of
the code into an array or output file by executing actions in an SDT. The elements we need to
make this technique work are:

1. There is, for one or more nonterminals, a main attribute. For convenience, we shall assume
that the main attributes are all string valued. In Example 5.20, the attributes S.code and C.code
are main attributes; the other attributes are not.
2. The main attributes are synthesized.
3. The rules that evaluate the main attribute(s) ensure that
(a) The main attribute is the concatenation of main attributes of nonterminals appearing
in the body of the production involved, perhaps with other elements that are not main
attributes, such as the string label or the values of labels L1 and L2.


(b) The main attributes of nonterminals appear in the rule in the same order as the
nonterminals themselves appear in the production body.

3. Implementing L-Attributed SDD's on an LL Grammar:


Every L-attributed definition with an underlying LL grammar can be implemented along
with the parse. Records to hold the synthesized attributes for a nonterminal are placed below that
nonterminal on the stack, while inherited attributes for a nonterminal are stored with that
nonterminal on the stack. Action records are also placed on the stack to compute attributes at the
appropriate time.

4. Implementing L-Attributed SDD's on an LL Grammar, Bottom-Up:
An L-attributed definition with an underlying LL grammar can be converted to a
translation on an LR grammar and the translation performed in connection with a bottom-up
parse. The grammar transformation introduces "marker" nonterminals that appear on the
bottom-up parser's stack and hold inherited attributes of the nonterminal above it on the stack.
Synthesized attributes are kept with their nonterminal on the stack.


Intermediate-Code Generation

Intermediate code is the interface between the front end and the back end of a compiler. Ideally,
the details of the source language are confined to the front end and the details of target machines to
the back end, so that compilers for m source languages and n target machines can be built from
m front ends and n back ends (the m*n model).
In this chapter we study intermediate representations, static type checking, and
intermediate code generation.

High-level representations are close to the source language and low-level representations are
close to the target machine. Syntax trees are high level; they depict the natural hierarchical
structure of the source program and are well suited to tasks like static type checking.

A low-level representation is suitable for machine-dependent tasks like register allocation
and instruction selection. Three-address code can range from high- to low-level, depending on the
choice of operators.
Various Forms of Intermediate Code:
1. Abstract Syntax Tree
2. Polish Notation
3. Three Address Code

1. Variants of Syntax Trees


Nodes in a syntax tree represent constructs in the source program; the children of a node
represent the meaningful components of a construct. A directed acyclic graph (hereafter called a
DAG) for an expression identifies the common subexpressions (subexpressions that occur more
than once) of the expression.

Directed Acyclic Graphs for Expressions
A DAG has leaves corresponding to atomic operands and interior nodes corresponding to
operators. We can easily identify the common subexpressions and then use that knowledge
during code generation.


Example: a+a*(b-c)+(b-c)*d

SDD for creating DAG’s

Value-number method for constructing DAG’s


The nodes of a syntax tree or DAG are stored in an array of records, as shown in the figure below.
Each row of the array represents one record, and therefore one node. In each record, the first
field is an operation code, indicating the label of the node. Leaves have one additional field,
which holds the lexical value; interior nodes have two additional fields indicating the left and
right children.


Fig: (a) DAG Fig:(b) Array.

Fig: Nodes of a DAG for i = i + 10 allocated in an array


• Algorithm
➢ Search the array for a node M with label op, left child l, and right child r.
➢ If there is such a node, return the value number of M.
➢ If not, create in the array a new node N with label op, left child l, and right
child r, and return its value number.
• A hash table can be used to avoid searching the entire array on every lookup.
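
A minimal sketch of the value-number method with a hash table follows. The class name DagBuilder and its methods are hypothetical, and value numbers here are 0-based array indices, which is a choice made for the sketch.

```python
class DagBuilder:
    """Stores DAG nodes in an array; a dict plays the role of the hash table."""

    def __init__(self):
        self.nodes = []    # array of records; the index is the value number
        self.index = {}    # record -> value number

    def _node(self, record):
        if record in self.index:          # node already exists: reuse it
            return self.index[record]
        self.nodes.append(record)         # otherwise create a new record
        self.index[record] = len(self.nodes) - 1
        return self.index[record]

    def leaf(self, label, lexval):
        return self._node((label, lexval))

    def op(self, label, left, right):
        return self._node((label, left, right))


# Nodes for i = i + 10
b = DagBuilder()
i1 = b.leaf("id", "i")
ten = b.leaf("num", 10)
plus = b.op("+", i1, ten)
i2 = b.leaf("id", "i")        # second use of i maps to the same node
assign = b.op("=", i2, plus)

for number, record in enumerate(b.nodes):
    print(number, record)
```

Because children are referred to by value number, two interior nodes with the same operator and the same children are recognized as the same node by a single dictionary lookup.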

2. Polish Notation:
• The linearization of a syntax tree is called Polish notation.
• It is also called prefix notation: the operator is written first, followed by its operands.
Ex: (a+b)*(c-d) ------> *+ab-cd.

• Reverse Polish notation is also called postfix notation.

3. Three address code


In three-address code, there is at most one operator on the right side of an instruction. Thus a
source-language expression like x+y*z might be translated into the sequence of three-address
instructions
t1 = y * z
t2 = x + t1
where t1 and t2 are compiler-generated temporary names.
Three-address code is a linearized representation of a syntax tree or a DAG in which explicit
names correspond to the interior nodes of the graph.


Various Forms of three address instructions:


• Assignment instructions of the form --------> Ex: x = y op z
• Assignments of the form ------> Ex: x = op y
• Copy instructions of the form ------> Ex: x = y
• Unconditional jumps ----------> Ex: goto L
• Conditional jumps of the form ---> Ex: if x goto L and ifFalse x goto L
• Conditional jumps such as --------> Ex: if x relop y goto L
• Procedure calls using:
– param x
– call p, n
– y = call p, n
• Indexed copy instructions of the form -------> Ex: x = y[i] and x[i] = y
• Address and pointer assignments of the form ----->
Ex: x = &y, x = *y, and *x = y.

• Example: do i = i+1; while (a[i] < v);

(a) Symbolic labels            (b) Position numbers

L:  t1 = i + 1                 100: t1 = i + 1
    i = t1                     101: i = t1
    t2 = i * 8                 102: t2 = i * 8
    t3 = a[t2]                 103: t3 = a[t2]
    if t3 < v goto L           104: if t3 < v goto 100

Note: The multiplication i * 8 is appropriate for an array of elements that each take 8 units of
space.


Representation of three address codes:


• Quadruples
• Triples
• Indirect Triples
Quadruples:
A quadruple (or just "quad") has four fields, which we call op, arg1, arg2, and result. The op
field contains an internal code for the operator. For instance, the three-address instruction
x = y + z is represented by placing + in op, y in arg1, z in arg2, and x in result.
The following are some exceptions to this rule:
1. Instructions with unary operators like x = minus y or x = y do not use arg2. Note that for a
copy statement like x = y, op is =, while for most other operations, the assignment operator is
implied.
2. Operators like param use neither arg2 nor result.
3. Conditional and unconditional jumps put the target label in result.

Example: a = b * minus c + b * minus c

Three-address code:

t1 = minus c
t2 = b * t1
t3 = minus c
t4 = b * t3
t5 = t2 + t4
a = t5

Quadruples:

     op      arg1    arg2    result
0    minus   c       -       t1
1    *       b       t1      t2
2    minus   c       -       t3
3    *       b       t3      t4
4    +       t2      t4      t5
5    =       t5      -       a
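
One possible in-memory form of the quadruples for this example is sketched below. The Quad record and the show helper are hypothetical names introduced for the illustration; None stands for an unused arg2 field.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    op: str
    arg1: str
    arg2: Optional[str]   # None when the operator does not use arg2
    result: str


quads = [
    Quad("minus", "c", None, "t1"),
    Quad("*", "b", "t1", "t2"),
    Quad("minus", "c", None, "t3"),
    Quad("*", "b", "t3", "t4"),
    Quad("+", "t2", "t4", "t5"),
    Quad("=", "t5", None, "a"),
]


def show(q):
    """Render a quadruple back as a three-address instruction."""
    if q.op == "=":                       # copy: the operator is = itself
        return f"{q.result} = {q.arg1}"
    if q.arg2 is None:                    # unary operator
        return f"{q.result} = {q.op} {q.arg1}"
    return f"{q.result} = {q.arg1} {q.op} {q.arg2}"


for q in quads:
    print(show(q))
```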


Triples:
A triple has only three fields, which we call op, arg1, and arg2. Note that the result field of a
quadruple is used primarily for temporary names. With triples, we refer to the result of an
operation x op y by its position, rather than by an explicit temporary name.

Indirect triples
Indirect triples consist of a listing of pointers to triples, rather than a listing of triples themselves.
For example, we can use an array, instruction, to list pointers to triples in the desired order.
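
To make the difference concrete, the sketch below encodes a = b * minus c + b * minus c as triples, where an integer argument refers to the position of an earlier triple, and adds an instruction list for the indirect form. The encoding and the helper in_order are assumptions of this illustration.

```python
# Triples: (op, arg1, arg2); an int argument is the position of another triple.
triples = [
    ("minus", "c", None),   # (0)
    ("*", "b", 0),          # (1)
    ("minus", "c", None),   # (2)
    ("*", "b", 2),          # (3)
    ("+", 1, 3),            # (4)
    ("=", "a", 4),          # (5)
]

# Indirect triples: a separate list of pointers gives the execution order.
instruction = [0, 1, 2, 3, 4, 5]


def in_order(instruction, triples):
    """Return the triples in the order given by the instruction list."""
    return [triples[p] for p in instruction]


# Reordering touches only the pointer list; the triples, and the positions
# they use to refer to each other, stay unchanged.
instruction[1], instruction[2] = instruction[2], instruction[1]
print(instruction)
print(in_order(instruction, triples))
```

With plain triples, moving an instruction would change its position and force every reference to it to be updated; the pointer list avoids that.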

Static Single-Assignment Form:


Static single-assignment form (SSA) is an intermediate representation that facilitates certain code
optimizations. Two distinctive aspects distinguish SSA from three-address code. The first is that
all assignments in SSA are to variables with distinct names; hence the term static single-assignment.
The second is that SSA uses a notational convention called the φ-function to combine
definitions of the same variable that reach a point along different control-flow paths.
Figure 6.13 shows the same intermediate program in three-address code and in static
single-assignment form.


Types and Declarations:


The applications of types can be grouped under checking and translation:
• Type checking uses logical rules to reason about the behavior of a program at run time.
Specifically, it ensures that the types of the operands match the type expected by an operator. For
example, the && operator in Java expects its two operands to be booleans; the result is also of
type boolean.
• Translation Applications. From the type of a name, a compiler can determine the storage that
will be needed for that name at run time. Type information is also needed to calculate the address
denoted by an array reference, to insert explicit type conversions, and to choose the right version
of an arithmetic operator, among other things.

Type Expressions:
A type expression is either a basic type or is formed by applying an operator called a type
constructor to a type expression.

Example: int[2][3]
array(2,array(3,integer))

The array type int [2] [3] can be read as "array of 2 arrays of 3 integers each" and written
as a type expression array(2, array(3, integer)).

We shall use the following definition of type expressions:

• A basic type is a type expression. Typical basic types for a language include boolean, char,
integer, float, and void; the latter denotes "the absence of a value."
• A type name is a type expression.
• A type expression can be formed by applying the array type constructor to a number and a
type expression.
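• A record is a data structure with named fields. A type expression can be formed by applying
the record type constructor to the field names and their types.
• A type expression can be formed by using the type constructor → for function types; we
write s → t for "function from type s to type t."
• If s and t are type expressions, then their Cartesian product s × t is a type expression.
• Type expressions may contain variables whose values are type expressions.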


Type Equivalence
Many type-checking rules have the form "if two type expressions are equal then return a certain
type else error."
• Two type expressions are structurally equivalent if either of the following conditions holds:
– They are the same basic type.
– They are formed by applying the same constructor to structurally equivalent
types.
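
The two conditions translate into a short recursive check. In this sketch a basic type is a string and a constructed type is a tuple (constructor, argument, ...), so int[2][3] becomes ("array", 2, ("array", 3, "integer")). This encoding and the name equivalent are assumptions of the illustration.

```python
def equivalent(s, t):
    """Structural equivalence of two type expressions."""
    if not isinstance(s, tuple) or not isinstance(t, tuple):
        return s == t                     # basic types (or array sizes) must match
    # Same constructor applied to structurally equivalent arguments.
    return (s[0] == t[0]
            and len(s) == len(t)
            and all(equivalent(a, b) for a, b in zip(s[1:], t[1:])))


int_2_3 = ("array", 2, ("array", 3, "integer"))
print(equivalent(int_2_3, ("array", 2, ("array", 3, "integer"))))
print(equivalent(int_2_3, ("array", 2, ("array", 3, "float"))))
```

Type names and recursive types are not handled here; they would need extra cases.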

Declarations
We study types and declarations using a simplified grammar that declares just one name at a time:

D -> T id ; D | ε
T -> B C | record '{' D '}'
B -> int | float
C -> ε | [ num ] C

• Nonterminal D generates a sequence of declarations.
• Nonterminal T generates basic, array, or record types.
• Nonterminal B generates one of the basic types int and float.
• Nonterminal C, for "component," generates strings of zero or more integers, each integer
surrounded by brackets.
• A record type (the second production for T) is a sequence of declarations for the fields of
the record, all surrounded by curly braces.

Storage Layout for Local Names:
From the type of a name, we can determine the amount of storage that will be needed for
the name at run time. At compile time, we can use these amounts to assign each name a relative
address. The type and relative address are saved in the symbol-table entry for the name.
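
As a small illustration, the sketch below computes widths and relative addresses for a sequence of declarations. The widths (4 for int, 8 for float), the tuple encoding of array types, and the function names are assumptions made for the example; record types and alignment are ignored.

```python
WIDTH = {"int": 4, "float": 8}   # assumed widths of the basic types


def width(t):
    """Storage needed for type t: a basic type name or ("array", n, elem)."""
    if isinstance(t, str):
        return WIDTH[t]
    _constructor, n, elem = t
    return n * width(elem)


def layout(decls):
    """Assign each declared name a relative address, in declaration order."""
    table, offset = {}, 0
    for name, t in decls:
        table[name] = (t, offset)     # symbol-table entry: type and address
        offset += width(t)
    return table, offset


table, total = layout([
    ("x", "float"),
    ("a", ("array", 2, ("array", 3, "int"))),   # int[2][3]
    ("i", "int"),
])
for name, (t, addr) in table.items():
    print(name, addr)
print(total)
```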


Translation of Expressions
• An expression with more than one operator, like a + b* c, will translate into instructions
with at most one operator per instruction.
• An array reference A[i][j] will expand into a sequence of three-address instructions that
calculate an address for the reference.
• The syntax-directed definition builds up the three-address code for an assignment
statement S using attribute code for S and attributes addr and code for an expression E.
• Attributes S.code and E.code denote the three-address code for S and E, respectively.
• Attribute E.addr denotes the address that will hold the value of E.
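
A rough sketch of such a translation is shown below. It returns E.addr from a recursive function and appends each new instruction to a list rather than concatenating code strings. The tuple encoding of expressions and the names translate_assign, gen, and new_temp are hypothetical.

```python
def translate_assign(target, expr):
    """Return three-address code for the assignment: target = expr.

    Assumed encoding: a str is a name, (op, e) is a unary operation,
    and (op, e1, e2) is a binary operation.
    """
    code = []
    count = 0

    def new_temp():
        nonlocal count
        count += 1
        return f"t{count}"

    def gen(e):
        """Emit code for e and return E.addr, the name holding its value."""
        if isinstance(e, str):
            return e                          # a name is its own address
        if len(e) == 2:                       # unary operator
            op, a = e
            addr = gen(a)
            t = new_temp()
            code.append(f"{t} = {op} {addr}")
            return t
        op, a, b = e                          # binary operator
        addr_a = gen(a)
        addr_b = gen(b)
        t = new_temp()
        code.append(f"{t} = {addr_a} {op} {addr_b}")
        return t

    addr = gen(expr)
    code.append(f"{target} = {addr}")
    return code


e = ("+", ("*", "b", ("minus", "c")), ("*", "b", ("minus", "c")))
for line in translate_assign("a", e):
    print(line)
```

Each instruction has at most one operator, and every intermediate result gets its own temporary name.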


Incremental Translation
• Instead of building up E.code as a long string, we can arrange to generate only the new
three-address instructions.
• In the incremental approach, gen not only constructs a three-address instruction, it
appends the instruction to the sequence of instructions generated so far.

Addressing Array Elements


• Layouts for a two-dimensional array:

Type Checking:

• A compiler has to do semantic checks in addition to syntactic checks.
• Semantic checks are either:
– Static: done during compilation
– Dynamic: done during run time
• Type checking is one of these static checking operations.
– We may not be able to do all type checking at compile time.
– Some systems also use dynamic type checking.
• A type system is a collection of rules for assigning type expressions to the parts of a program.
• A type checker implements a type system.
• A sound type system eliminates the need for run-time checking for type errors.

• A programming language is strongly typed if every program its compiler accepts will execute
without type errors.
Rules for Type Checking:
Type checking can take on two forms:
1. Synthesis
2. Inference.
Type synthesis builds up the type of an expression from the types of its subexpressions. It
requires names to be declared before they are used.
Type inference determines the type of a language construct from the way it is used.

Type Conversions:
Consider expressions like x + i, where x is of type float and i is of type integer. Since the
representation of integers and floating-point numbers is different within a computer and different
machine instructions are used for operations on integers and floats, the compiler may need to
convert one of the operands of + to ensure that both operands are of the same type when the
addition occurs.
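
A small sketch of how such a conversion might be inserted, assuming a two-level widening order int < float and an explicit cast instruction written as `(float)`; the names `max_type` and `translate_add` are hypothetical.

```python
import itertools

RANK = {"int": 0, "float": 1}  # assumed widening order: int < float


def max_type(t1, t2):
    """Return the wider of two basic types."""
    return t1 if RANK[t1] >= RANK[t2] else t2


def translate_add(a1, t1, a2, t2):
    """Return (code, result_type) for a1 + a2 with implicit widening."""
    code = []
    temp_names = (f"t{n}" for n in itertools.count(1))
    result_type = max_type(t1, t2)

    def widen(addr, t):
        # Emit a conversion only when the operand is narrower than the result.
        if t == result_type:
            return addr
        temp = next(temp_names)
        code.append(f"{temp} = ({result_type}) {addr}")
        return temp

    left = widen(a1, t1)
    right = widen(a2, t2)
    temp = next(temp_names)
    code.append(f"{temp} = {left} + {right}")
    return code, result_type


code, result_type = translate_add("x", "float", "i", "int")
print("\n".join(code))
```

For x + i the integer operand is converted first, and the addition is then performed on two floats.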

Control Flow:
The translation of statements such as if-else-statements and while-statements is tied to the
translation of Boolean expressions. In programming languages, boolean expressions are often
used to
1. Alter the flow of control. Boolean expressions are used as conditional expressions in
statements that alter the flow of control. The value of such boolean expressions is implicit in a
position reached in a program. For example, in if (E) S, the expression E must be true if
statement S is reached.
2. Compute logical values. A boolean expression can represent true or false as values. Such
boolean expressions can be evaluated in analogy to arithmetic expressions using three-address
instructions with logical operators.

Boolean Expressions:
Boolean expressions are composed of the boolean operators (which we denote &&, ||, and !,
using the C convention for the operators AND, OR, and NOT, respectively) applied to elements
that are boolean variables or relational expressions. Relational expressions are of the form E1 rel
E2, where E1 and E2 are arithmetic expressions. In this section, we consider boolean expressions
generated by the following grammar:

We use the attribute rel.op to indicate which of the six comparison operators <, <=, =, !=, >, or
>= is represented by rel. We assume that || and && are left-associative, and that || has lowest
precedence, then &&, then !.

Short-Circuit Code:
In short-circuit (or jumping) code, the boolean operators &&, ||, and ! translate into
jumps. The operators themselves do not appear in the code; instead, the value of a boolean
expression is represented by a position in the code sequence.
Example 6.21: The statement
if ( x < 100 || x > 200 && x != y ) x = 0;
might be translated into the code of Fig. 6.34. In this translation, the boolean expression is true if
control reaches label L2. If the expression is false, control goes immediately to L1, skipping L2
and the assignment x = 0.
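
The idea can be sketched with a small generator of jumping code. This is an illustration under stated assumptions: boolean expressions are encoded as tuples `("rel", text)`, `("not", b)`, `("and", b1, b2)` and `("or", b1, b2)`, the function names are hypothetical, and the output is a straightforward, unoptimized translation that may contain more jumps than the figure cited above.

```python
import itertools

_labels = itertools.count(3)  # L1 and L2 are used for the exits below


def new_label():
    return f"L{next(_labels)}"


def jump_code(b, true_label, false_label):
    """Return jumping code that goes to true_label or false_label."""
    kind = b[0]
    if kind == "rel":
        return [f"if {b[1]} goto {true_label}", f"goto {false_label}"]
    if kind == "not":
        # Negation simply swaps the two exits; no instruction is needed.
        return jump_code(b[1], false_label, true_label)
    if kind == "or":
        mid = new_label()  # reached only when the left operand is false
        return (jump_code(b[1], true_label, mid) + [f"{mid}:"]
                + jump_code(b[2], true_label, false_label))
    if kind == "and":
        mid = new_label()  # reached only when the left operand is true
        return (jump_code(b[1], mid, false_label) + [f"{mid}:"]
                + jump_code(b[2], true_label, false_label))
    raise ValueError(f"unknown node kind: {kind}")


cond = ("or", ("rel", "x < 100"),
        ("and", ("rel", "x > 200"), ("rel", "x != y")))
code = jump_code(cond, "L2", "L1") + ["L2:", "x = 0", "L1:"]
print("\n".join(code))
```

None of the operators &&, || or ! appears in the output; reaching L2 means the condition is true, and reaching L1 directly means it is false.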

Flow-of-Control Statements:
We now consider the translation of boolean expressions into three-address code in the context of
statements such as those generated by the following grammar:

In these productions, nonterminal B represents a boolean expression and nonterminal S
represents a statement.

Syntax-directed definition

Generating three-address code for Booleans:

Backpatching:
Backpatching is an approach in which lists of jumps are passed as synthesized attributes. Specifically, when a jump is
generated, the target of the jump is temporarily left unspecified. Each such jump is put on a list
of jumps whose labels are to be filled in when the proper label can be determined. All of the
jumps on a list have the same target label.

One-Pass Code Generation Using Backpatching:
Backpatching can be used to generate code for boolean expressions and flow-of-control
statements in one pass.
In this section, synthesized attributes truelist and falselist of nonterminal B are used to
manage labels in jumping code for boolean expressions. In particular, B.truelist will be a list of
jump or conditional jump instructions into which we must insert the label to which control goes
if B is true. B.falselist likewise is the list of instructions that eventually get the label to which
control goes when B is false. As code is generated for B, jumps to the true and false exits are left
incomplete, with the label field unfilled. These incomplete jumps are placed on lists pointed to
by B.truelist and B.falselist, as appropriate. Similarly, a statement S has a synthesized attribute
S.nextlist, denoting a list of jumps to the instruction immediately following the code for S.
For nonterminal B we use the two attributes B.truelist and B.falselist together with the following
functions:
➢ Makelist(i): creates a new list containing only i, an index into the array of instructions.
➢ Merge(p1,p2): concatenates the lists pointed to by p1 and p2 and returns a pointer to the
concatenated list.
➢ Backpatch(p,i): inserts i as the target label for each of the instructions on the list pointed
to by p.
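
One possible rendering of these three functions in Python is sketched below, with names lower-cased to follow Python conventions. It assumes that the array of instructions is a module-level list of dictionaries with a `target` field that stays `None` until it is backpatched; that representation is an assumption for illustration only.

```python
# Hypothetical instruction array: each jump has an unfilled target.
instructions = [
    {"op": "if a < b goto", "target": None},  # index 0
    {"op": "goto", "target": None},           # index 1
    {"op": "if c < d goto", "target": None},  # index 2
]


def makelist(i):
    """Create a new list containing only i, an index into instructions."""
    return [i]


def merge(p1, p2):
    """Return the concatenation of the two lists."""
    return p1 + p2


def backpatch(p, i):
    """Insert i as the target of every instruction on list p."""
    for index in p:
        instructions[index]["target"] = i


pending = merge(makelist(0), makelist(2))
backpatch(pending, 7)
for instr in instructions:
    print(instr["op"], instr["target"])
```

Both jumps on the merged list receive the same target, while the jump at index 1 is left incomplete until its own list is backpatched.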

Backpatching for Boolean Expressions:

• Annotated parse tree for x < 100 || x > 200 && x != y
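
The truelist and falselist values for this expression can be reproduced with a short sketch. It assumes instructions are numbered from 100, expressions are encoded as nested tuples, and the marker that records the start of the right operand's code is read from a global counter; all names are hypothetical.

```python
code = {}         # instruction index -> [text, target]
next_instr = 100  # assumed index of the first instruction


def emit(text):
    """Append an instruction with an unfilled target; return its index."""
    global next_instr
    code[next_instr] = [text, None]
    next_instr += 1
    return next_instr - 1


def makelist(i):
    return [i]


def merge(p1, p2):
    return p1 + p2


def backpatch(p, target):
    for i in p:
        code[i][1] = target


def translate(b):
    """Generate code for b in one pass; return (truelist, falselist)."""
    kind = b[0]
    if kind == "rel":
        true_jump = emit(f"if {b[1]} goto")
        false_jump = emit("goto")
        return makelist(true_jump), makelist(false_jump)
    if kind == "not":
        t, f = translate(b[1])
        return f, t
    if kind not in ("or", "and"):
        raise ValueError(f"unknown node kind: {kind}")
    t1, f1 = translate(b[1])
    m = next_instr  # marker: where the right operand's code begins
    t2, f2 = translate(b[2])
    if kind == "or":
        backpatch(f1, m)  # left false: go on to test the right operand
        return merge(t1, t2), f2
    backpatch(t1, m)      # left true: go on to test the right operand
    return t2, merge(f1, f2)


expr = ("or", ("rel", "x < 100"),
        ("and", ("rel", "x > 200"), ("rel", "x != y")))
truelist, falselist = translate(expr)
print(truelist, falselist)
for index, (text, target) in code.items():
    print(index, text, "_" if target is None else target)
```

The jumps still marked `_` are exactly those on the truelist and falselist of the whole expression; they are filled in later, once the enclosing statement knows where its true and false exits are.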


Switch-Statements:
The "switch" or "case" statement is available in a variety of languages. Our switch-statement
syntax is as follows:

• There is a selector expression E, which is to be evaluated, followed by n constant values
V1, V2, ..., Vn that the expression might take, perhaps including a default "value," which
always matches the expression if no other value does.

Translation of Switch-Statements:
1. Evaluate the expression E.
2. Find the value Vj in the list of cases that is the same as the value of the expression.
Recall that the default value matches the expression if none of the explicitly listed values does.
3. Execute the statement Sj associated with the value found.
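
The three steps above can be sketched as a small generator of three-address code. It assumes that each case body is a single instruction given as text, that the tests are placed after the case bodies, and that the labels `test`, `next` and `L1`, `L2`, ... are free to use; `translate_switch` is a hypothetical name.

```python
def translate_switch(selector, cases, default):
    """Return three-address code for a switch on selector.

    cases is a list of (value, statement) pairs; default is the statement
    executed when no listed value matches.
    """
    # Step 1: evaluate the selector expression into t.
    code = [f"t = {selector}", "goto test"]
    tests = []
    for n, (value, stmt) in enumerate(cases, start=1):
        # Step 3: the statement for case n, followed by an exit jump.
        code += [f"L{n}: {stmt}", "goto next"]
        # Step 2: one comparison per listed value.
        tests.append(f"if t == {value} goto L{n}")
    d = len(cases) + 1
    code += [f"L{d}: {default}", "goto next"]
    tests.append(f"goto L{d}")  # the default matches when nothing else does
    tests[0] = "test: " + tests[0]
    return code + tests + ["next:"]


code = translate_switch("i + j", [(1, "x = 10"), (2, "x = 20")], "x = 0")
print("\n".join(code))
```

Placing the sequence of tests at the end keeps all comparisons together, which would make it easier to replace them later with a table lookup when the number of cases is large.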

Example:

Intermediate Code for Procedures:


• We use the term function in this section for a procedure that returns a value.
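Example:
n = f(a[i]);
It might translate into the following three-address code:

1) t1 = i * 4
2) t2 = a[t1]
3) param t2
4) t3 = call f, 1
5) n = t3

Lines 1 and 2 compute the value of a[i] into temporary t2, line 3 makes t2 an actual parameter,
and line 4 calls f with one parameter.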

Adding functions to the source language:

• Nonterminals D and T generate declarations and types.
• A function definition generated by D consists of keyword define, a return type, the
function name, formal parameters in parentheses, and a function body consisting of a statement.
• Nonterminal F generates zero or more formal parameters.
• Nonterminals S and E generate statements and expressions.
• The production for E adds function calls, with actual parameters generated by A. An
actual parameter is an expression.
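Four aspects to handle when translating procedures:
• Function types: the return type and parameter types, e.g., void, integer, ...
• Symbol tables: the data structure that records the names declared in each function.
• Type checking: checking the types of operands and operators, including the actual parameters of calls.
• Function calls: passing parameters and returning values.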
