Compiler Design UNIT III
UNIT – III
Syntax Directed Definition:
Syntax Directed Definition (SDD) is a kind of abstract specification. It is a
generalization of a context-free grammar in which each grammar production X -->
a has an associated set of semantic rules of the form s = f(b1, b2, ..., bk),
where s is an attribute computed by the function f from the attributes b1 to bk.
An attribute can be a string, number, type or a memory location.
Semantic rules are fragments of code which are usually embedded at the end of a
production and enclosed in curly braces ({ }).
Example:
E --> E1 + T {E.val = E1.val + T.val}
Annotated Parse Tree – The parse tree showing the values of the attributes at
each node for a given input string is called an annotated or decorated parse tree.
Features
High level specification
Hides implementation details
Explicit order of evaluation is not specified
Types of attributes: There are two types of attributes:
1. Synthesized Attributes: These are attributes which derive their values from
their children nodes, i.e. the value of a synthesized attribute at a node is
computed from the values of the attributes at its children nodes in the parse tree.
Example:
E --> E1 + T { E.val = E1.val + T.val}
In this, E.val derives its value from E1.val and T.val.
Computation of Synthesized Attributes –
Write the SDD using appropriate semantic rules for each production in the given
grammar.
The annotated parse tree is generated and attribute values are computed in a
bottom-up manner.
The value obtained at the root node is the final output.
Example: Consider the following grammar
S --> E
E --> E1 + T
E --> T
T --> T1 * F
T --> F
F --> digit
The SDD for the above grammar can be written as follows:
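The bottom-up computation of the val attribute for this grammar can be sketched
in Python. This is only an illustration, not part of the original notes: the
function names (evaluate, parse_e, parse_t, parse_f) are hypothetical, each digit
is assumed to be a single character, and the left-recursive productions are
handled with loops, which yields the same synthesized values.

```python
def evaluate(text):
    """Parse a digit expression and return S.val (hypothetical helper)."""
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0  # index of the next unread token

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def parse_f():
        # F --> digit      { F.val = digit.lexval }
        nonlocal pos
        tok = peek()
        if tok is None or tok not in "0123456789":
            raise SyntaxError(f"digit expected at position {pos}")
        pos += 1
        return int(tok)

    def parse_t():
        # T --> T1 * F     { T.val = T1.val * F.val }
        # T --> F          { T.val = F.val }
        nonlocal pos
        val = parse_f()
        while peek() == "*":
            pos += 1
            val = val * parse_f()
        return val

    def parse_e():
        # E --> E1 + T     { E.val = E1.val + T.val }
        # E --> T          { E.val = T.val }
        nonlocal pos
        val = parse_t()
        while peek() == "+":
            pos += 1
            val = val + parse_t()
        return val

    result = parse_e()  # S --> E   { S.val = E.val }
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result


print(evaluate("2+3*4"))  # prints 14
```

Every value is computed only from the values returned by the sub-parsers, which
mirrors how a synthesized attribute depends only on the children of a node.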
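2. Inherited Attributes: These are attributes which derive their values from
their parent or sibling nodes in the parse tree.
Example: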
A --> BCD { C.in = A.in, C.type = B.type }
The value of the L nodes is obtained from T.type (a sibling), which is the
lexical value int, float or double. The L node then gives the type of the
identifiers a and c. The type is computed in a top-down manner, i.e. by a
preorder traversal. Using the function Enter_type, the type of the identifiers a
and c is inserted into the symbol table at the corresponding id.entry.
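The top-down flow of an inherited attribute can be sketched as follows. The
sketch assumes a declaration grammar of the form D --> T L, L --> L1 , id | id
(an assumption, since the grammar is not shown above), and the names visit_d,
visit_l and enter_type are hypothetical stand-ins for the Enter_type function
mentioned in the text.

```python
def enter_type(symbol_table, entry, type_name):
    """Record the type of one identifier (stand-in for Enter_type)."""
    symbol_table[entry] = type_name


def visit_l(names, l_in, symbol_table):
    # L --> L1 , id    { L1.in = L.in; Enter_type(id.entry, L.in) }
    # L --> id         { Enter_type(id.entry, L.in) }
    if len(names) > 1:
        visit_l(names[:-1], l_in, symbol_table)  # pass L.in down to L1
    enter_type(symbol_table, names[-1], l_in)


def visit_d(declaration):
    # D --> T L        { L.in = T.type }
    type_name, _, rest = declaration.strip().partition(" ")
    if type_name not in ("int", "float", "double"):
        raise ValueError(f"unknown type {type_name!r}")
    names = [name.strip() for name in rest.split(",")]
    if not all(names):
        raise ValueError("identifier expected")
    symbol_table = {}
    visit_l(names, type_name, symbol_table)  # T.type becomes L.in
    return symbol_table


print(visit_d("int a, c"))  # prints {'a': 'int', 'c': 'int'}
```

The type is never computed from the identifiers themselves; it is handed down
from the sibling T, which is what makes L.in an inherited attribute.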
Applications of Syntax-Directed Translation:
Here are some applications of SDT in Compiler Design:
1. Evaluating arithmetic expressions
2. Conversion from infix to postfix expression
3. Conversion from infix to prefix expression
4. Binary to decimal conversion
5. Counting the number of Reductions
6. Creating a Syntax Tree
7. Generating intermediate code
8. Storing information into the symbol table
9. Type checking
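As an illustration of the second application, infix to postfix conversion can
be sketched with semantic actions that emit an operator after both of its
operands have been processed. The function name to_postfix and the restriction
to single-character operands with + and * are assumptions made for this sketch.

```python
def to_postfix(text):
    """Translate an infix expression over + and * into postfix."""
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0
    output = []

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def factor():
        # F --> id         { print(id) }
        nonlocal pos
        tok = peek()
        if tok is None or not tok.isalnum():
            raise SyntaxError(f"operand expected at position {pos}")
        pos += 1
        output.append(tok)

    def term():
        # T --> T1 * F     { print('*') }
        nonlocal pos
        factor()
        while peek() == "*":
            pos += 1
            factor()
            output.append("*")

    def expr():
        # E --> E1 + T     { print('+') }
        nonlocal pos
        term()
        while peek() == "+":
            pos += 1
            term()
            output.append("+")

    expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return " ".join(output)


print(to_postfix("2+3*4"))  # prints 2 3 4 * +
```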
Syntax Directed Translation (SDT):
The Syntax Directed Translation (SDT) method is used in compiler design to
associate translation rules with grammar productions. The translation rules are
informal notations, called semantic rules, attached to the grammar.
The right side of a translation rule always corresponds to the attribute
values of the symbols on the right side of the production rule. From this, we
come to the conclusion that SDT in compiler design associates:
1. A set of attributes with every node (grammar symbol).
2. A set of translation rules with every production rule, written in terms of
those attributes.
Let’s take a string to see how semantic analysis happens: S = 2+3*4. The parse
tree corresponding to S would be:
[Figure: parse tree for 2+3*4]
S-attributed SDT:
An S-attributed definition is a syntax-directed definition that uses only
synthesized attributes. The attribute values of the symbols in a production's
body are used to calculate the attribute values of the non-terminal at its head.
The attributes of an S-attributed SDD can be evaluated in any bottom-up order of
the parse tree nodes, for example by performing a post-order traversal of the
parse tree and evaluating the attributes at a node when the traversal leaves
that node for the last time.
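The post-order evaluation can be sketched on an explicit parse tree. The Node
class, the annotate function and the hand-built tree for 2+3*4 are hypothetical
names introduced only for this illustration; the productions assumed are the
usual expression grammar with + and * over digits.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    symbol: str                      # grammar symbol or token, e.g. "E", "+"
    children: list = field(default_factory=list)
    lexval: int = None               # set only on digit leaves
    val: int = None                  # synthesized attribute, filled in later


def annotate(node):
    """Post-order traversal: children first, then the node itself."""
    for child in node.children:
        annotate(child)
    kids = node.children
    if node.symbol == "digit":
        node.val = node.lexval                   # F --> digit
    elif len(kids) == 1:
        node.val = kids[0].val                   # S --> E, E --> T, T --> F
    elif len(kids) == 3 and kids[1].symbol == "+":
        node.val = kids[0].val + kids[2].val     # E --> E1 + T
    elif len(kids) == 3 and kids[1].symbol == "*":
        node.val = kids[0].val * kids[2].val     # T --> T1 * F
    return node.val


def f_leaf(n):
    return Node("F", [Node("digit", lexval=n)])


# Parse tree for 2+3*4
tree = Node("S", [Node("E", [
    Node("E", [Node("T", [f_leaf(2)])]),
    Node("+"),
    Node("T", [Node("T", [f_leaf(3)]), Node("*"), f_leaf(4)]),
])])

print(annotate(tree))  # prints 14
```

After the call, every non-terminal node carries its val, so the tree is an
annotated parse tree in the sense used above.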
Let us see an example of S-attributed SDT.
The grammar is given below:
S → E
E → E1 + T
E → T
T → T1 * F
T → F
F → digit
The S-attributed SDT of the above grammar can be written in the following way.
Three-address statements take the following common forms:
Statement                     Meaning
X = Y op Z                    Binary Operation
X = op Z                      Unary Operation
X = Y                         Assignment
if X (rel op) Y goto L        Conditional Goto
goto L                        Unconditional Goto
A[i] = X, Y = A[i]            Array Indexing
P = addr X, Y = *P, *P = Z    Pointer Operations
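To show how the binary operation and assignment forms arise, the sketch below
flattens a nested expression into three-address statements. The function name
gen_tac, the tuple representation of expressions and the temporary names t1,
t2, ... are assumptions made for this illustration.

```python
import itertools


def gen_tac(target, expr):
    """Return three-address statements for: target = expr."""
    code = []
    temps = itertools.count(1)

    def walk(e):
        if isinstance(e, str):       # a name or constant needs no code
            return e
        op, left, right = e          # ("op", left operand, right operand)
        left_place = walk(left)
        right_place = walk(right)
        temp = f"t{next(temps)}"
        code.append(f"{temp} = {left_place} {op} {right_place}")  # X = Y op Z
        return temp

    place = walk(expr)
    code.append(f"{target} = {place}")                            # X = Y
    return code


for line in gen_tac("a", ("+", "b", ("*", "c", "d"))):
    print(line)
# prints:
# t1 = c * d
# t2 = b + t1
# a = t2
```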
The array type int [2][3][2] can be read as an "array of 2 arrays each of 3 arrays of
2 integers each" and written as a type expression array(2, array(3, array(2,
integer))).
Basic types
char, int, double, float are type expressions.
Type names
It is convenient to consider that names of types are type expressions.
Arrays
If E is a type expression and n is an int, then
array(n, E)
is a type expression indicating the type of an array with elements of type E and
indices in the range 0 ... n - 1.
Products
If E1 and E2 are type expressions then
E1 × E2
is a type expression indicating the type of an element of the Cartesian
product of E1 and E2. The products of more than two types are defined
similarly, and this product operation is left-associative. Hence
(E1 × E2) × E3 and E1 × E2 × E3
are equivalent.
Records
The only difference between a record and a product is that the fields of a
record have names.
If abc and xyz are two field names and E1 and E2 are type expressions, then
record(abc: E1, xyz: E2)
is a type expression indicating the type of an element of the Cartesian
product of E1 and E2.
Pointers
If E is a type expression, then
pointer(E)
is a type expression denoting the type pointer to an object of type E.
Function types
If E1 and E2 are type expressions then
E1 -> E2
is a type expression denoting the type of functions associating an element
from E1 with an element from E2.
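Type expressions of this kind map naturally onto small tree-shaped data
structures. The sketch below is one possible representation; the class names
Basic, Array, Pointer and Function are hypothetical and chosen to mirror the
constructors above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Basic:
    name: str                        # e.g. "int", "char", "integer"

    def __str__(self):
        return self.name


@dataclass(frozen=True)
class Array:
    size: int                        # indices run over 0 ... size - 1
    elem: object                     # element type expression

    def __str__(self):
        return f"array({self.size}, {self.elem})"


@dataclass(frozen=True)
class Pointer:
    target: object                   # type expression pointed to

    def __str__(self):
        return f"pointer({self.target})"


@dataclass(frozen=True)
class Function:
    domain: object
    result: object

    def __str__(self):
        return f"{self.domain} -> {self.result}"


# The array type int [2][3][2] from the text
nested = Array(2, Array(3, Array(2, Basic("integer"))))
print(nested)  # prints array(2, array(3, array(2, integer)))
```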
Type Equivalence
When graphs represent type expressions, two types are structurally equivalent if
and only if one of the following conditions is true:
They are of the same basic type.
They are formed by applying the identical constructor to structurally
equivalent types.
One is a type name that refers to the other.
If type names are treated as standing for themselves, the first two conditions
in the above definition lead to name equivalence of type expressions.
A common type-checking rule has the form "if two type expressions are equal,
then return a certain type, else error". Ambiguities can arise when type
expressions are given names and those names are then used in later type
expressions.
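The three conditions for structural equivalence can be sketched as a recursive
test. In this illustration, which is not from the original notes, basic types
and type names are strings, constructed types are tuples such as
("array", 2, "int"), and names is a hypothetical table mapping type names to the
expressions they refer to. Recursive (cyclic) type names are assumed not to
occur.

```python
def equivalent(s, t, names=None):
    """Return True if type expressions s and t are structurally equivalent."""
    names = names or {}
    # Condition 3: a type name refers to the type expression it stands for.
    while isinstance(s, str) and s in names:
        s = names[s]
    while isinstance(t, str) and t in names:
        t = names[t]
    # Condition 1: both are the same basic type.
    if isinstance(s, str) or isinstance(t, str):
        return s == t
    # Condition 2: same constructor applied to equivalent components.
    if s[0] != t[0] or len(s) != len(t):
        return False
    for a, b in zip(s[1:], t[1:]):
        if isinstance(a, int) or isinstance(b, int):
            if a != b:               # e.g. array sizes must match exactly
                return False
        elif not equivalent(a, b, names):
            return False
    return True


print(equivalent(("pointer", "int"), ("pointer", "cell"), {"cell": "int"}))
# prints True
```

Dropping the first step (expanding names) turns the test into a check closer to
name equivalence, where cell and int would be treated as different types.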
Declarations
When we encounter declarations, we need to lay out storage for the declared
variables.
For every local name in a procedure, we create a Symbol Table(ST) entry
including:
1. The type of the name
2. How much storage the name requires
Grammar:
D -> real, id
D -> integer, id
D -> D1, id
We use ENTER to enter a name and its type into the symbol table, and the
attribute ATTR to carry the data type along the declaration.
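A declaration such as "real, x, y" can be processed along these lines. The
sketch is illustrative only: the function name process_declaration, the
dictionary-based symbol table and the widths of 4 and 8 bytes are assumptions,
not values taken from the notes.

```python
WIDTH = {"integer": 4, "real": 8}    # assumed storage sizes in bytes


def process_declaration(text, symbol_table, offset=0):
    """Enter each declared name with its type, width and storage offset."""
    parts = [part.strip() for part in text.split(",")]
    attr = parts[0]                  # D.ATTR, set by D -> real, id / integer, id
    if attr not in WIDTH:
        raise ValueError(f"unknown type {attr!r}")
    if len(parts) < 2 or not all(parts[1:]):
        raise ValueError("identifier expected")
    for name in parts[1:]:
        # ENTER(id, D.ATTR): one symbol table entry per declared name
        symbol_table[name] = {"type": attr, "width": WIDTH[attr], "offset": offset}
        offset += WIDTH[attr]        # lay out the next variable after this one
    return offset


table = {}
next_free = process_declaration("real, x, y", table)
print(next_free)     # prints 16
print(table["y"])    # prints {'type': 'real', 'width': 8, 'offset': 8}
```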
Type checking:
Type checking is the process of verifying and enforcing type constraints on
values. A compiler must check that the source program follows the syntactic and
semantic conventions of the source language, and it must also check the type
rules of the language.
It checks the types of objects and reports a type error in the case of a
violation; where the language permits, incorrect types are corrected by
conversion.
A conversion from one type to another is called implicit if it is done
automatically by the compiler. Implicit type conversions are also called
coercions, and coercion is limited in many languages.
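A type-checking rule with coercion can be sketched as below. The function name
check_binary, the restriction to the types int and float, and the conversion
name inttofloat are assumptions made for this illustration.

```python
NUMERIC = ("int", "float")           # assumed numeric types, narrowest first


def check_binary(left_type, right_type):
    """Return (result_type, conversions) for an arithmetic operator."""
    if left_type not in NUMERIC or right_type not in NUMERIC:
        raise TypeError(f"type error: {left_type} and {right_type}")
    if left_type == right_type:
        return left_type, []         # types agree, nothing to convert
    # Coercion: the int operand is widened to float implicitly.
    operand = "left" if left_type == "int" else "right"
    return "float", [f"inttofloat({operand})"]


print(check_binary("int", "float"))  # prints ('float', ['inttofloat(left)'])
```

A violation such as check_binary("int", "char") raises TypeError, which
corresponds to the compiler reporting a type error.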
Types of Type Checking:
There are two kinds of type checking:
1. Static Type Checking.
2. Dynamic Type Checking.
Static Type Checking:
Static type checking is type checking performed at compile time. It checks the
types of variables at compile time, which means the type of each variable is
known at compile time.
The Benefits of Static Type Checking:
1. Runtime Error Protection.
2. It catches syntactic errors like spurious words or extra punctuation.
3. It catches wrong names, such as misspelled library names (e.g. Math) and
predefined names.
4. Detects incorrect argument types.
5. It catches the wrong number of arguments
Dynamic Type Checking:
Dynamic Type Checking is defined as the type checking being done at run time.
In Dynamic Type Checking, types are associated with values, not variables.
In implementations of dynamically type-checked languages, each runtime object
is generally associated with a type tag, which is a reference to a type
containing its type information. Dynamic typing is more flexible.
Languages like Pascal and C have static type checking. Type checking is used to
check the correctness of the program before its execution. The main purpose of
type checking is to verify that data type assignments and type casts are
correct before the program is executed.
Static Type-Checking is also used to determine the amount of memory needed to
store the variable.
Control flow
Control flow is the order in which statements are executed. The code executes in
order from the first line in the file to the last line unless it comes across
conditionals or loops that change the order.
Control flow statements include conditional statements, branching statements,
and looping statements. Control flow can be decided with the help of a boolean
expression.
1. Boolean Expressions
2. Short-Circuit Code
3. Flow-of-Control Statements
Boolean Expressions
Boolean expressions are composed of the boolean operators (which we denote &&,
||, and !, using the C convention for the operators AND, OR, and NOT,
respectively) applied to elements that are boolean variables or relational
expressions. Relational expressions are of the form E1 rel E2, where E1 and E2
are arithmetic expressions. In this section, we consider boolean expressions
generated by the following grammar:
B --> B || B | B && B | ! B | ( B ) | E rel E | true | false
We use the attribute rel.op to indicate which of the six comparison operators
<, <=, =, !=, >, or >= is represented by rel.
Short-Circuit Code
In short-circuit (or jumping) code, the boolean operators &&, ||, and !
translate into jumps. The operators themselves do not appear in the code;
instead, the value of a boolean expression is represented by a position in the
code sequence.
Example 6.21 : The statement
if ( x < 100 || x > 200 && x != y ) x = 0;
might be translated into the code of Fig. 6.34. In this translation, the boolean
expression is true if control reaches label L2. If the expression is false,
control goes immediately to L1, skipping L2 and the assignment x = 0.
if x < 100 goto L2
ifFalse x > 200 goto L1
ifFalse x != y goto L1
L2: x = 0
L1:
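The translation above can be reproduced with a small jumping-code generator.
This is a sketch under stated assumptions rather than a definitive
implementation: boolean expressions are given as tuples such as
("||", left, right) or ("rel", "x < 100"), the names gen_if and jump are
hypothetical, FALL marks a target that is reached by simply falling through,
and labels are printed on their own lines.

```python
import itertools

FALL = None   # special target: control falls through, no jump is needed


def gen_if(cond, body):
    """Return jumping code for: if (cond) body"""
    code = []
    labels = (f"L{n}" for n in itertools.count(1))

    def jump(b, true, false):
        kind = b[0]
        if kind == "rel":                        # ("rel", "x < 100")
            test = b[1]
            if true is not FALL and false is not FALL:
                code.append(f"if {test} goto {true}")
                code.append(f"goto {false}")
            elif true is not FALL:
                code.append(f"if {test} goto {true}")
            elif false is not FALL:
                code.append(f"ifFalse {test} goto {false}")
        elif kind == "||":                       # true as soon as b[1] is true
            b1_true = true if true is not FALL else next(labels)
            jump(b[1], b1_true, FALL)
            jump(b[2], true, false)
            if true is FALL:
                code.append(f"{b1_true}:")
        elif kind == "&&":                       # false as soon as b[1] is false
            b1_false = false if false is not FALL else next(labels)
            jump(b[1], FALL, b1_false)
            jump(b[2], true, false)
            if false is FALL:
                code.append(f"{b1_false}:")
        elif kind == "!":                        # swap the two targets
            jump(b[1], false, true)
        else:
            raise ValueError(f"unknown operator {kind!r}")

    after = next(labels)                         # label following the statement
    jump(cond, FALL, after)                      # true: fall into the body
    code.append(body)
    code.append(f"{after}:")
    return code


cond = ("||", ("rel", "x < 100"),
        ("&&", ("rel", "x > 200"), ("rel", "x != y")))
for line in gen_if(cond, "x = 0"):
    print(line)
# prints:
# if x < 100 goto L2
# ifFalse x > 200 goto L1
# ifFalse x != y goto L1
# L2:
# x = 0
# L1:
```

No instruction for || or && is emitted; the operators are expressed purely
through the choice of jump targets.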
Flow-of-Control Statements
In the above program we have the main function and inside the main function a
function call swap(x,y), where x and y are actual arguments. We also have a
function definition for swap, swap(int a, int b) where parameters a and b are
formal parameters.
In the three-address code, a function call is unraveled into the evaluation of
parameters in preparation for a call, followed by the call itself.
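The unraveling of a call such as swap(x, y) can be sketched as follows. The
function name gen_call is hypothetical, the arguments are assumed to be already
evaluated into names or temporaries, and the param / call notation follows the
common textbook form of three-address code.

```python
def gen_call(name, args, result=None):
    """Return three-address statements for a call to name with args."""
    code = [f"param {arg}" for arg in args]      # one param per actual argument
    call = f"call {name}, {len(args)}"           # callee and argument count
    code.append(f"{result} = {call}" if result else call)
    return code


for line in gen_call("swap", ["x", "y"]):
    print(line)
# prints:
# param x
# param y
# call swap, 2
```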