Compiler Unit3

The document discusses syntax-directed translation schemes in compiler design, explaining how grammar and semantic rules are used to generate intermediate code. It covers various types of attributes, including synthesized and inherited attributes, and introduces concepts like S-attributed and L-attributed definitions. Additionally, it details the generation of intermediate code in formats such as postfix notation, syntax trees, and three-address code, along with their respective implementations and comparisons.

Uploaded by

kasliwalkm
Unit 3: Syntax Directed Translation Scheme
By

Ms. C.B.Thaokar
Assistant Professor
Department of Information Technology
RCOEM, Nagpur
Structure of Compiler
[Figure: phases of a compiler. Source program (character stream) -> Scanner -> tokens -> Parser -> syntactic structure -> Semantic routines -> intermediate representation -> Optimizer -> Code generator -> target machine code. The symbol and error tables are used by all phases of the compiler.]
Eg. A = B + C => id1 = id2 + id3 => Parse tree
– Use the grammar to direct the translation.
• The grammar defines the syntax of the input language.
• Attributes are attached to grammar symbols.
• Semantic rules are associated with grammar productions.
• Attributes are values associated with grammar symbols representing programming language constructs. These values are computed by the semantic rules.
– There are 2 notations for associating semantic rules with productions:
• Syntax directed definitions
• Syntax directed translation schemes
%%
[aeiouAEIOU] { vowels++;}
[a-zA-Z] { cons++;}
%%
• Syntax directed definitions
- A syntax directed definition specifies the values of attributes by associating semantic rules with the grammar productions.
Production          Semantic Rule
E -> E1 + T         E.code = E1.code + T.code
- Syntax directed definitions hide implementation details.

• Syntax directed translation schemes
- Alternatively, the semantic actions may be inserted inside the grammar productions.
Production          Translation Scheme
E -> E + T          E -> E1 + T { print('+') }

Two types of attributes are associated with grammar symbols:
• Synthesized Attributes
An attribute is synthesized if its value at a parse tree node is determined from the attribute values at the child nodes or from constants.
eg. E -> E1 + T     E.val = E1.val + T.val

• Inherited Attributes
An attribute is inherited if its value at a parse tree node is determined from the attribute values of the parent or siblings of that node, or from the node's own attributes.
eg. D -> T L        L.in = T.type

S attributed definition:
A syntax directed definition that uses synthesized attributes exclusively is said to be an S-attributed definition.
Production          Semantic rules
E -> E1 + T         E.val = E1.val + T.val
E -> T              E.val = T.val
T -> T1 * F         T.val = T1.val * F.val
T -> F              T.val = F.val
F -> num            F.val = num.lexval
W = 2 + 2 * 3

Example S attributed definition
W=2+2*3
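
The S-attributed definition above can be sketched in Python as a small evaluator in which every function returns the synthesized attribute `val` of its nonterminal. This is only an illustrative sketch: the helper names (`tokenize`, `parse_E`, `parse_T`, `parse_F`, `evaluate`) are assumptions, and the left-recursive productions are handled with loops rather than with an LR parser.

```python
import re

def tokenize(text):
    # Split the input into numbers and the operators + and *.
    return re.findall(r"\d+|[+*]", text)

def parse_F(tokens, pos):
    # F -> num : F.val = num.lexval
    return int(tokens[pos]), pos + 1

def parse_T(tokens, pos):
    # T -> T1 * F | F : T.val = T1.val * F.val
    value, pos = parse_F(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "*":
        f_val, pos = parse_F(tokens, pos + 1)
        value = value * f_val
    return value, pos

def parse_E(tokens, pos):
    # E -> E1 + T | T : E.val = E1.val + T.val
    value, pos = parse_T(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "+":
        t_val, pos = parse_T(tokens, pos + 1)
        value = value + t_val
    return value, pos

def evaluate(text):
    tokens = tokenize(text)
    value, pos = parse_E(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError("unexpected input")
    return value

print(evaluate("2+2*3"))  # prints 8
```

Each value is computed only from values returned by the calls below it, which is what makes the definition S-attributed.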

Inherited attributed definition:
Production          Semantic rules
D -> T L            L.in = T.type
T -> int            T.type = integer
T -> real           T.type = real
L -> L1 , id        L1.in = L.in ; addtype(id.entry, L.in)
L -> id             addtype(id.entry, L.in)
w = real id1, id2, id3

Example Inherited attributed definition:
w = real id1, id2, id3
Production          Semantic rules
D -> T L            L.in = T.type
T -> int            T.type = integer
T -> real           T.type = real
L -> L1 , id        L1.in = L.in ; addtype(id.entry, L.in)
L -> id             addtype(id.entry, L.in)
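
As a rough illustration of how the inherited attribute travels down the L subtree, the following Python sketch passes the declared type from D to each L and records it for every identifier. The function names `declare` and `visit_L` and the dictionary standing in for the symbol table are assumptions made for this example.

```python
def declare(type_name, ids):
    """D -> T L : hand T.type down to L as its inherited attribute."""
    symtab = {}

    def visit_L(names, inherited):
        # L -> L1 , id : L1 inherits the same type, then addtype(id.entry, type)
        if len(names) > 1:
            visit_L(names[:-1], inherited)
        # L -> id : addtype(id.entry, type)
        symtab[names[-1]] = inherited

    visit_L(ids, type_name)
    return symtab

table = declare("real", ["id1", "id2", "id3"])
print(table)  # {'id1': 'real', 'id2': 'real', 'id3': 'real'}
```

The type is never computed from the children of L; it always arrives from the parent, which is the defining property of an inherited attribute.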

L-attributed definitions:
• A syntax directed definition is L-attributed if each inherited attribute of Xj, 1 <= j <= n, on the right side of A -> X1 X2 … Xn depends only on
– the attributes of the symbols X1, X2, …, Xj-1 to the left of Xj, and
– the inherited attributes of A.
• L stands for Left, since attribute information flows from left to right.
• Example:
A -> L M { L.i = A.i; M.i = L.s; A.s = M.s }   (L-attributed)
A -> Q R { R.i = A.i; Q.i = R.s; A.s = Q.s }   (not L-attributed: Q.i depends on R.s, an attribute of a symbol to its right)

• Given a syntax directed definition, how do we build a translator?
– For general definitions, the semantic rules must be evaluated in an order that respects the dependences among the attributes (as defined by the semantic rules).
• Build a dependency graph for the parse tree. Topologically sort the graph, then evaluate the rules in that order.
• Example: real id, id, id
– For some special definitions, translation can be performed while parsing,
• e.g. bottom-up evaluation of S-attributed definitions.
• Most L-attributed definitions can also be evaluated this way.
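
The dependency-graph approach can be sketched with Python's standard `graphlib` module (Python 3.9 or later). The attribute names below (`L3.in` for the L covering three identifiers, and so on) and the `rules` table are assumptions chosen for the `real id, id, id` example, not notation from the slides.

```python
from graphlib import TopologicalSorter

# Dependency graph: each attribute maps to the attributes it depends on.
deps = {
    "T.type": set(),
    "L3.in": {"T.type"},   # D -> T L   : L.in = T.type
    "L2.in": {"L3.in"},    # L -> L1,id : L1.in = L.in
    "L1.in": {"L2.in"},
}

# One semantic rule per attribute; 'v' holds the values computed so far.
rules = {
    "T.type": lambda v: "real",
    "L3.in": lambda v: v["T.type"],
    "L2.in": lambda v: v["L3.in"],
    "L1.in": lambda v: v["L2.in"],
}

# A topological order guarantees every rule runs after its dependencies.
order = list(TopologicalSorter(deps).static_order())
values = {}
for attr in order:
    values[attr] = rules[attr](values)

print(order)
print(values["L1.in"])
```

If the graph contained a cycle, `TopologicalSorter` would raise `graphlib.CycleError`, which corresponds to a definition whose attributes cannot be evaluated in any order.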

• Syntax directed translation scheme:
– a context-free grammar in which attributes are associated with grammar symbols, and the semantic actions are enclosed between {} and inserted within the right side of productions to indicate the order in which translation takes place.

– Example:
E -> T R
R -> + T { print('+') } R
R -> - T { print('-') } R | ε
T -> num { print(num.val) }
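
A translation scheme like this one maps naturally onto a recursive-descent parser in which each action runs at the point where it appears in the production. The Python sketch below is a hedged illustration: it assumes the second R production is meant to start with a minus sign, collects output in a list instead of printing, and uses the invented name `translate`.

```python
def translate(tokens):
    out = []
    pos = 0

    def T():
        nonlocal pos
        # T -> num { print(num.val) }
        tok = tokens[pos]
        if not tok.isdigit():
            raise SyntaxError(f"number expected, got {tok!r}")
        pos += 1
        out.append(tok)

    def R():
        nonlocal pos
        # R -> + T { print('+') } R  |  - T { print('-') } R  |  empty
        if pos < len(tokens) and tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1
            T()
            out.append(op)  # action placed after T and before the inner R
            R()

    T()  # E -> T R
    R()
    if pos != len(tokens):
        raise SyntaxError("unexpected input")
    return " ".join(out)

print(translate(["9", "+", "4"]))  # prints 9 4 +
```

Because the action sits between T and the trailing R, the operator is emitted right after its second operand, which yields postfix output.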

– S-attributed definitions can be directly translated into a translation scheme by placing the semantic actions at the end of each production.
• Perfect for bottom up parsing (LR parsing)

– Actions in the middle of productions can be moved to the end of productions by changing the grammar.
(Note: see the previous example)

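Example
E -> T R
R -> + T { print('+') } R
R -> - T { print('-') } R | Ɛ
T -> num { print(num.val) }
String = 9 + 4
[Figure: parse tree for 9 + 4. E has children T and R; T derives num (9); R derives + T R, where T derives num (4) and the inner R derives Ɛ.]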
Eg- SDTS
Production          Semantic Rules
S → E $             { S.VAL := E.VAL }
E → E1 + E2         { E.VAL := E1.VAL + E2.VAL }
E → E1 * E2         { E.VAL := E1.VAL * E2.VAL }
E → ( E1 )          { E.VAL := E1.VAL }
E → I               { E.VAL := I.VAL }
I → I1 digit        { I.VAL := 10 * I1.VAL + LEXVAL }
I → digit           { I.VAL := LEXVAL }
Syntax directed translation is implemented by constructing a parse tree and performing the actions in a left to right, depth first order.

Intermediate Code Generation
• If the compiler translates source code directly into machine code without generating intermediate code, then a full native compiler is required for each new machine.
• Intermediate code keeps the analysis portion the same for all target machines, so a full compiler is not needed for every unique machine.
• The intermediate code generator receives its input from its predecessor phase, the semantic analyzer, in the form of an annotated syntax tree.
• With intermediate code, only the second part of the compiler, the synthesis phase, changes according to the target machine.

Intermediate Code Generation
Intermediate code can be represented in the following ways:

• Postfix Notation
• Syntax Tree
• Three Address Code

SDTS of Postfix Notation
• Postfix notation is a useful form of intermediate code when the language being translated consists mainly of expressions.
• Postfix notation is a linear representation of a syntax tree.
Eg. D + Y => D Y +

E -> E1 + T    { E.code = concat(E1.code, T.code, '+'); }
E -> T         { E.code = T.code; }
T -> T1 * F    { T.code = concat(T1.code, F.code, '*'); }
T -> F         { T.code = F.code; }
F -> id        { F.code = getname(id.place); }

Syntax Tree
• In a parse tree, many nodes are the single child of their parent (for example along the chain E -> T -> F -> id).
• A syntax tree eliminates this extra information.
• A syntax tree is a condensed variant of the parse tree in which interior nodes are operators and leaves are operands.
• A syntax tree is the usual way to represent a program as a tree structure.

SDTS of Syntax Tree
E -> E1 + T    { E.ptr = mknode('+', E1.ptr, T.ptr) }
E -> T         { E.ptr = T.ptr }
T -> T1 * F    { T.ptr = mknode('*', T1.ptr, F.ptr) }
T -> F         { T.ptr = F.ptr }
F -> id        { F.ptr = mkleaf(id.place) }
mknode(operator, left, right) - Creates an operator node with pointers to its left and right subtrees.
mkleaf(id.place) - Creates an identifier node that holds a pointer to the symbol table entry.
String: id + id * id
Abstract Syntax Tree: the root '+' has the first id as its left child and a '*' node, whose children are the other two id leaves, as its right child.
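
The tree-building actions can be sketched in Python as follows. The `Node` class and the `show` helper are assumptions introduced for illustration, and plain strings stand in for symbol table pointers.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def mkleaf(name):
    # Leaf for an identifier; 'name' stands in for the symbol table entry.
    return Node(name)

def mknode(op, left, right):
    # Interior node for a binary operator.
    return Node(op, left, right)

def show(node):
    # Fully parenthesised rendering, to make the tree shape visible.
    if node.left is None:
        return node.label
    return f"({show(node.left)} {node.label} {show(node.right)})"

# id + id * id, built bottom-up in the order an LR parser would reduce.
f1, f2, f3 = mkleaf("id1"), mkleaf("id2"), mkleaf("id3")
t = mknode("*", f2, f3)   # T -> T1 * F
tree = mknode("+", f1, t) # E -> E1 + T

print(show(tree))  # prints (id1 + (id2 * id3))
```

The single-child productions (E -> T, T -> F) only copy the pointer, so they leave no trace in the resulting tree.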


Three Address Code
• In three-address code, the given expression is broken down into several separate instructions. These instructions translate easily into assembly language.
• Each three address code instruction has at most three addresses (two operands and one result). It typically combines an assignment with one binary operator.
eg : a = (-c * b) + (-c * d)
TAC :
t1 = -c
t2 = t1 * b
t3 = -c
t4 = t3 * d
t5 = t2 + t4
a = t5
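
A minimal sketch of how such a sequence could be produced from an expression tree is given below. The tuple encoding of the tree and the names `gen_tac`, `gentemp` and `walk` are assumptions made for this example; no common subexpressions are shared, so `-c` is computed twice as in the listing above.

```python
def gen_tac(target, expr):
    """Emit three-address code for 'target = expr'.

    expr is a variable name, ("uminus", e), or (op, left, right).
    """
    code = []
    count = 0

    def gentemp():
        nonlocal count
        count += 1
        return f"t{count}"

    def walk(e):
        if isinstance(e, str):          # a plain variable
            return e
        if e[0] == "uminus":            # unary minus
            place = walk(e[1])
            temp = gentemp()
            code.append(f"{temp} = -{place}")
            return temp
        op, left, right = e             # binary operator
        left_place = walk(left)
        right_place = walk(right)
        temp = gentemp()
        code.append(f"{temp} = {left_place} {op} {right_place}")
        return temp

    place = walk(expr)
    code.append(f"{target} = {place}")
    return code

expr = ("+", ("*", ("uminus", "c"), "b"), ("*", ("uminus", "c"), "d"))
for line in gen_tac("a", expr):
    print(line)
```

For the expression a = (-c * b) + (-c * d) this prints the same six instructions as the listing, t1 through t5 followed by the final copy into a.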

• Three address code can be represented in two forms: quadruples and triples (with indirect triples as a variant of triples).
Three Address Code
Quadruples: A quadruple has four fields to implement the three address code. The fields contain the name of the operator, the first source operand, the second source operand and the result, respectively.
Eg: a = -b * c + d
TAC :
t1 = -b
t2 = t1 * c
t3 = t2 + d
a = t3

Address   Operator   Source 1   Source 2   Result
(0)       uminus     b          -          t1
(1)       *          t1         c          t2
(2)       +          t2         d          t3
(3)       =          t3         -          a

Three Address Code
• Triples: A triple has three fields to implement the three address code. The fields contain the name of the operator, the first source operand and the second source operand.
• In triples, the result of a sub-expression is denoted by the position (address) of the triple that computes it.
Eg: a = -b * c + d
TAC :
t1 = -b
t2 = t1 * c
t3 = t2 + d
a = t3

Address   Operator   Source 1   Source 2
(0)       uminus     b          -
(1)       *          (0)        c
(2)       +          (1)        d
(3)       =          a          (2)
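
The relationship between the two tables can be sketched as a conversion from quadruples to triples. This Python sketch assumes the common textbook convention in which the assignment triple keeps the target name as its first operand; `to_triples` and the tuple layout are illustrative names, not part of the slides.

```python
quads = [
    ("uminus", "b", None, "t1"),
    ("*", "t1", "c", "t2"),
    ("+", "t2", "d", "t3"),
    ("=", "t3", None, "a"),
]

def to_triples(quads):
    where = {}      # temporary name -> address of the triple computing it
    triples = []

    def ref(operand):
        # Replace a temporary by the position of its defining triple.
        return f"({where[operand]})" if operand in where else operand

    for index, (op, arg1, arg2, result) in enumerate(quads):
        if op == "=":
            # Assignment: keep the target, refer to the value by position.
            triples.append((op, result, ref(arg1)))
        else:
            triples.append((op, ref(arg1), ref(arg2)))
            where[result] = index
    return triples

for address, triple in enumerate(to_triples(quads)):
    print(address, triple)
```

The temporaries t1, t2 and t3 disappear entirely, because a triple is referred to by its address rather than by a result name.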

Three Address Code
• Indirect Triples: This representation keeps a separate list of pointers to the triples, in execution order, in addition to the triples themselves. Its utility is similar to that of the quadruple representation.
Eg: a = -b * c + d
TAC :
t1 = -b
t2 = t1 * c
t3 = t2 + d
a = t3

Triples:
Address   Operator   Source 1   Source 2
(0)       uminus     b          -
(1)       *          (0)        c
(2)       +          (1)        d
(3)       =          a          (2)

Statement list:
Index     Address
(100)     (0)
(101)     (1)
(102)     (2)
(103)     (3)


Comparisons of TAC methods
• In quadruples, statement that computes t1 can be moved without
requiring any changes in the statements using t1, because the
result field is explicit.
• In a triple representation, by contrast, moving a statement that defines a temporary value requires changing all of the pointers in the operand1 and operand2 fields of the records in which this temporary value is used.
• Indirect triple representation presents no such problems, because a
separate list of pointers to the triple structure is maintained.
So when statements are moved, this list is reordered, and no
change in the triple structure is necessary; hence, the utility of
indirect triples is almost the same as that of quadruples.

SDTS of Three Address Code for Arithmetic
expression
E -> E1 + T   { t1 = gentemp();
                gencode('+', E1.place, T.place, t1);
                E.place = t1; }
E -> T        { E.place = T.place; }
T -> T1 * F   { t2 = gentemp();
                gencode('*', T1.place, F.place, t2);
                T.place = t2; }
T -> F        { T.place = F.place; }
F -> id       { F.place = id.place; }

Data structures required for TAC Implementation (quadruple)
• Makelist(quad. No.) / Makelist(i) :
This creates a list with the quad. No. as its only element. It returns a pointer to the list so created.
• Merge(P1, P2) / Merge(list1, list2) :
This concatenates the items in P1 and P2 and returns the concatenated list.
• Backpatch(list, label) / Backpatch(p, i) :
This fills the GOTO targets of the quads in the list with the label.
For eg. Truelist = {100, 103}   Falselist = {101, 104}
backpatch(Truelist, L1)
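
The three helpers can be sketched in a few lines of Python. The module-level `quads` dictionary, the `---` placeholder for an unfilled target, and the sample instructions are assumptions made for this illustration.

```python
# Quad number -> instruction text; '---' marks a jump target not yet known.
quads = {
    100: "if a < b goto ---",
    101: "goto ---",
    103: "if c > d goto ---",
    104: "goto ---",
}

def makelist(i):
    # A new list containing only quad number i.
    return [i]

def merge(p1, p2):
    # Concatenation of two lists of quad numbers.
    return p1 + p2

def backpatch(p, label):
    # Fill in the target of every jump whose quad number is in p.
    for i in p:
        quads[i] = quads[i].replace("---", str(label))

truelist = merge(makelist(100), makelist(103))
falselist = merge(makelist(101), makelist(104))
backpatch(truelist, "L1")

for number, text in sorted(quads.items()):
    print(number, text)
```

After the call, quads 100 and 103 jump to L1, while the jumps on the false list stay unfilled until their target becomes known.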

SDTS for Boolean Expression Relational Operator
(Short Circuit Code/ Jumping code)
E -> E1 relop E2
{
E.true = mklist(nextquad)
E.false = mklist(nextquad + 1)
gencode( if E1.place relop.val E2.place goto ---)
gencode(goto ---)
}
relop.val is one of '<', '>', '<=', '>=', '==', '!='
Eg : Write TAC for a < b
=> 100: if a < b goto ---
   101: goto ---
E.True = 100
E.False = 101
Both targets stay unfilled until an enclosing construct backpatches them.

SDTS for Boolean Operator AND
(Short Circuit Code/ Jumping code)
E -> E1 AND M E2
{
backpatch( E1.true, M.quad);
E.true = E2.true;
E.false = merge( E1.false , E2.false)
}
M -> ɛ { M.quad = nextquad ; }
Eg : Write TAC for a < b and c > d
=> 100: if a < b goto 102
   101: goto ---
   102: if c > d goto ---
   103: goto ---
E1.True = 100
E2.True = 102
E.True = 102
E1.False = 101
E2.False = 103
E.False = 101, 103
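
The AND rule can be traced with a small Python sketch. It assumes quads are numbered from 100, uses `---` for a target that has not been backpatched yet, and introduces the helper names `nextquad`, `relop` and `backpatch` for illustration only.

```python
START = 100
code = []  # instruction texts; quad number = START + index

def nextquad():
    return START + len(code)

def relop(a, op, b):
    # E -> E1 relop E2 : two jumps with unfilled targets.
    true_list, false_list = [nextquad()], [nextquad() + 1]
    code.append(f"if {a} {op} {b} goto ---")
    code.append("goto ---")
    return true_list, false_list

def backpatch(quad_numbers, target):
    for q in quad_numbers:
        code[q - START] = code[q - START].replace("---", str(target))

# E -> E1 AND M E2 for: a < b and c > d
e1_true, e1_false = relop("a", "<", "b")
m_quad = nextquad()                    # M -> empty : M.quad = nextquad
e2_true, e2_false = relop("c", ">", "d")
backpatch(e1_true, m_quad)             # a true E1 falls into the test of E2
e_true = e2_true
e_false = e1_false + e2_false          # merge(E1.false, E2.false)

listing = [f"{START + i}: {line}" for i, line in enumerate(code)]
print("\n".join(listing))
print("E.true =", e_true, "E.false =", e_false)
```

Only the jump at quad 100 is resolved here; the remaining targets are left for whichever statement encloses the expression.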
SDTS for Boolean Operator OR
E -> E1 OR M E2
{
backpatch( E1.false, M.quad);
E.false = E2.false;
E.true = merge( E1.true , E2.true );
}
M -> ɛ { M.quad = nextquad ; }
Eg : Write TAC for a < b OR c > d
=> 100: if a < b goto ---
   101: goto 102
   102: if c > d goto ---
   103: goto ---
E1.True = 100
E2.True = 102
E.True = 100, 102
E1.False = 101
E2.False = 103
E.False = 103

SDTS for Boolean Operator NOT
E -> NOT E1
{
E.false = E1.true;
E.true = E1.false;
}

Eg : Write TAC for not a < b
=> 100: if a < b goto ---
   101: goto ---
E1.True = 100
E1.False = 101
E.True = 101
E.False = 100

SDTS for IF THEN
S -> If E then M S1
{
backpatch( E.true , M.quad);
S.next = merge( E.false, S1.next)
}
M -> ɛ { M.quad = nextquad ; }
Eg : Write TAC for if a < b then x = x + 1
=> 100: if a < b goto 102
   101: goto ---
   102: t = x + 1
   103: x = t
E.True = 100
E.False = 101
S.Next = 101

SDTS for IF THEN ELSE
S -> If E then M1 S1 N else M2 S2 {
backpatch( E.true , M1.quad);
backpatch( E.false, M2.quad);
S.next = merge( S1.next, S2.next, N.next) }
M1 -> ɛ { M1.quad = nextquad ; }
M2 -> ɛ { M2.quad = nextquad ; }
N -> ɛ { N.next = mklist(nextquad);
gencode( goto ---) ; }

Eg : Write TAC for if a < b then x = x + 1 else y = y + 1
=> 100: if a < b goto 102
   101: goto 105
   102: t1 = x + 1
   103: x = t1
   104: goto ---
   105: t2 = y + 1
   106: y = t2
E.True = 100
E.False = 101
SDTS for WHILE stmt
S -> while M1 E do M2 S1 {
backpatch( E.true , M2.quad);
backpatch( S1.next, M1.quad);
S.next = E.false;
gencode( goto ( M1.quad)); }
M1 -> ɛ { M1.quad = nextquad ; }
M2 -> ɛ { M2.quad = nextquad ; }

Eg : Write TAC for while a < b do x = y + 2
=> 100: if a < b goto 102
   101: goto ---
   102: t1 = y + 2
   103: x = t1
   104: goto 100
E.True = 100
E.False = 101
S.next = 101
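
The while example can be reproduced step by step with the following Python sketch. The helpers `nextquad`, `emit` and `backpatch`, the start address 100, and the `---` placeholder are assumptions made for this illustration; the loop body is emitted by hand rather than parsed.

```python
START = 100
code = []  # instruction texts; quad number = START + index

def nextquad():
    return START + len(code)

def emit(text):
    code.append(text)

def backpatch(quad_numbers, target):
    for q in quad_numbers:
        code[q - START] = code[q - START].replace("---", str(target))

# S -> while M1 E do M2 S1   for: while a < b do x = y + 2
m1_quad = nextquad()                         # M1.quad: start of the test
e_true, e_false = [nextquad()], [nextquad() + 1]
emit("if a < b goto ---")
emit("goto ---")
m2_quad = nextquad()                         # M2.quad: start of the body
emit("t1 = y + 2")
emit("x = t1")
s1_next = []                                 # an assignment leaves no pending jumps

backpatch(e_true, m2_quad)                   # true condition enters the body
backpatch(s1_next, m1_quad)                  # jumps out of the body re-test
emit(f"goto {m1_quad}")                      # loop back to the condition
s_next = e_false                             # false condition leaves the loop

listing = [f"{START + i}: {line}" for i, line in enumerate(code)]
print("\n".join(listing))
print("S.next =", s_next)
```

The jump at quad 101 stays open: it is S.next, to be filled in by whatever statement follows the loop.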

SDTS for FOR stmt
S -> for ( E1; M1 E2 ; M2 E3 ) M3 S1 {
backpatch( E2.true , M3.quad);
backpatch( M3.next, M1.quad);
backpatch( S1.next, M2.quad);
gencode( goto ( M2.quad));
S.next = E2.false; }
M1 -> ɛ { M1.quad = nextquad ; }
M2 -> ɛ { M2.quad = nextquad ; }
M3 -> ɛ { M3.next = mklist(nextquad);
gencode( goto ---) ;
M3.quad = nextquad;
}

FOR stmt Example
Eg : Write TAC for
for (i = 1; i < 5; i++)
x = y + 2
=> 100: i = 1
   101: if i < 5 goto 106
   102: goto ---
   103: t1 = i + 1
   104: i = t1
   105: goto 101
   106: t2 = y + 2
   107: x = t2
   108: goto 103
S.next = 102

Array Reference
• How to refer to an array element?
1D – a[i] = 2 ;
Here 2 is assigned to the location addressed by a[i], where a is the base and i is the index.

Loc of a[i] = base address of a + offset
            = base address of a + (i - lb) * w
            = [base address of a - lb * w] + [i * w]
              (constant part)               (variable part)
            = [base address of a - w] + [i * w]   when lb = 1
w = width of an element, in bytes
lb = lower bound of the array [e.g. lb = 1]
* For C language lb = 0 by default.
Eg : for int a[] with base address 1000, w = 2 and lb = 0, Loc of a[2] = 1000 + (2 - 0) * 2 = 1004
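
The one-dimensional address formula can be checked with a short Python sketch. The function names `addr_1d` and `addr_1d_split` are assumptions; the second function separates the part that is constant for the whole array from the part that depends on the index.

```python
def addr_1d(base, i, w, lb=0):
    # Loc of a[i] = base + (i - lb) * w
    return base + (i - lb) * w

def addr_1d_split(base, i, w, lb=0):
    # Same address, split into a compile-time constant and a variable part.
    constant_part = base - lb * w
    variable_part = i * w
    return constant_part + variable_part

print(addr_1d(1000, 2, 2))               # prints 1004
print(addr_1d_split(1000, 2, 2, lb=0))   # prints 1004
```

Splitting the formula this way lets a compiler fold `base - lb * w` once and generate only the multiplication and one addition per access.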

Array Reference contd.
• How to refer to an array element?
2D – a[i][j] = 2 ;
Here 2 is assigned to the location addressed by a[i][j], where a is the base and i and j are the row and column indices.
Row major matrix representation
Loc of a[i][j] = base address of a + offset
               = base address of a + [(i - lb1)(ub2 - lb2 + 1) + (j - lb2)] * w
               = [base + (-ub2 - 1) * w] + [(i * ub2) + j] * w   when lb1 = lb2 = 1
                 (constant part)          (variable part)
where
lb1 and lb2 = lower bounds of row and column = 1
ub1 and ub2 = upper bounds of row and column; assumption w = 1
* Eg : with base address 1000 and ub2 = 2,
Loc of a[2][2] = [1000 + (-2 - 1) * 1] + [(2 * 2) + 2] * 1
               = 997 + 6
               = 1003
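
The row-major formula can be verified the same way. In this Python sketch the function name `addr_2d` and its parameter order are assumptions; the values in the call are those of the example (base 1000, bounds starting at 1, ub2 = 2, w = 1).

```python
def addr_2d(base, i, j, w, lb1, lb2, ub2):
    # Row-major: skip (i - lb1) full rows, then (j - lb2) elements.
    columns = ub2 - lb2 + 1
    return base + ((i - lb1) * columns + (j - lb2)) * w

print(addr_2d(1000, 2, 2, w=1, lb1=1, lb2=1, ub2=2))  # prints 1003
```

The result agrees with the split form 997 + 6 used in the worked example.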
SDTS for Array Reference
Production          Translation rule
1. A → L = E        if L.offset = null then
                        Gencode(L.value '=' E.value)
                    else
                        Gencode(L.value '[' L.offset ']' '=' E.value)
2. E → E1 + E2      E.value = newtemp()
                    Gencode(E.value '=' E1.value '+' E2.value)
3. E → E1 * E2      E.value = newtemp()
                    Gencode(E.value '=' E1.value '*' E2.value)
4. E → L            if L.offset = null then
                        E.value = L.value
                    else
                        E.value = newtemp()
                        Gencode(E.value '=' L.value '[' L.offset ']')
                    end
Array Reference
Production              Translation rule
5. L → Alist ]          L.value = newtemp()
                        L.offset = newtemp()
                        Gencode(L.value '=' Alist.array '-' C)
                        Gencode(L.offset '=' Alist.value '*' width(Alist.array))
6. L → id               L.value = id.value
                        L.offset = null
7. Alist → Alist1 , E   t = newtemp()
                        m = Alist1.ndim + 1
                        Gencode(t '=' Alist1.value '*' limit(Alist1.array, m))
                        Gencode(t '=' t '+' E.value)
                        Alist.array = Alist1.array
                        Alist.value = t
                        Alist.ndim = m
8. Alist → id [ E       Alist.array = id.value
                        Alist.value = E.value
                        Alist.ndim = 1
Here C denotes the constant part of the address computation for the array.
Array Reference Example
Eg: Write a TAC for the given code with bpw = w = 4, dim = 20 (array lower bound 1)
sum = 0
for ( i = 1; i <= 20; i++)
{ sum = sum + a[i] + b[i] }

100: sum = 0
101: i = 1
102: if i <= 20 goto 107
103: goto 117
104: t1 = i + 1
105: i = t1
106: goto 102
107: t2 = i * 4
108: t3 = addr(a) - 4
109: t4 = t3[t2]
110: t5 = i * 4
111: t6 = addr(b) - 4
112: t7 = t6[t5]
113: t8 = sum + t4
114: t8 = t8 + t7
115: sum = t8
116: goto 104
117:

Array Reference Annotated Parse tree
Example
X = A[ y,z]

SDTS for Switch stmt

Syntax:
switch (E)
{
case V1 : S1
case V2 : S2
…
case Vn-1 : Sn-1
default : Sn
}

Grammar:
S -> switch (E) { caselist }
caselist -> caselist case V : S
caselist -> case V : S
caselist -> default : S
caselist -> caselist default : S

TAC Switch stmt
      Code to evaluate E into t
      goto test
L1:   code for S1
      goto next
L2:   code for S2
      goto next
      …
Ln-1: code for Sn-1
      goto next
Ln:   code for Sn
      goto next
Ld:   code for default
      goto next
test: if t = V1 goto L1
      …
      if t = Vn-1 goto Ln-1
      if t = Vn goto Ln
      goto Ld
next:
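
The layout above can be generated mechanically. The Python sketch below is only an illustration under stated assumptions: `switch_tac` is an invented helper, each case body is a single pre-translated statement string, at least one case is present, and the default branch is labelled Ld.

```python
def switch_tac(expr, cases, default):
    """cases is a list of (value, statement) pairs; default is a statement."""
    lines = [f"t = {expr}", "goto test"]
    # Case bodies first, each followed by a jump past the switch.
    for k, (_, statement) in enumerate(cases, start=1):
        lines += [f"L{k}: {statement}", "goto next"]
    lines += [f"Ld: {default}", "goto next"]
    # The chain of tests comes last, so all labels are already known.
    for k, (value, _) in enumerate(cases, start=1):
        prefix = "test: " if k == 1 else ""
        lines.append(f"{prefix}if t = {value} goto L{k}")
    lines += ["goto Ld", "next:"]
    return lines

tac = switch_tac("x + 1", [(1, "y = 10"), (2, "y = 20")], "y = 0")
print("\n".join(tac))
```

Placing the tests after the bodies means every label is defined before it is referenced, so a single pass without backpatching is enough for this construct.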
Questions
• Write SDTS to count number of operators in the given input
expression.
• Write TAC for given code
i =0; j=0;
while ( i < =5){
sum[i][j] = a[i ] + 5;
i++; j ++;
}
• Why are S-attributed definitions also L-attributed definitions?
• Write TAC for switch case example. ( Assume suitable code)
