Chapter 5 Intermediate Code Generation
Compiler Design
Intermediate Code Generation
Source code can be translated directly into its target machine code.
Why, then, do we translate the source code into an intermediate code that is only afterwards translated to the target code?
If a compiler translates the source language directly to its target machine language, with no option to generate intermediate code, then a full native compiler is required for each new machine.
Intermediate code eliminates the need for a new full compiler for every unique machine by keeping the analysis portion the same for all compilers.
The second part of the compiler, synthesis, is changed according to the target machine.
It becomes easier to improve code performance by applying code optimization techniques to the intermediate code.
Intermediate Representation
Intermediate code can be represented in a variety of ways, each with its own benefits.
High Level IR
It is very close to the source language itself.
It can be generated easily from the source code, and code modifications that enhance performance are easy to apply to it.
It is less suited to target machine optimization.
Low Level IR
It is close to the target machine code, which makes it suitable for register and memory allocation, instruction selection, etc.
It is good for machine-dependent optimizations.
Intermediate code can be either language specific (e.g., bytecode for Java) or language independent (e.g., three-address code).
Intermediate code representation can be done in:
1. Postfix Notation
2. Three-address Code
Postfix notation is the notation in which operators are placed after their corresponding operands, e.g. a b +.
The reverse of this is known as prefix notation, in which the operator comes before its operands, e.g. + a b.
Neither notation needs parentheses to express operator precedence.
Rules for converting infix to postfix notation
Scan the given expression from left to right. An operand is placed directly in the postfix output; an operator is pushed onto the stack, subject to the following rules.
1. Consider operator priorities: exponentiation has the highest priority, * and / come next, and + and - have the lowest.
2. Two operators of the same priority cannot stay together on the stack. For example, + and - have the same priority, as do * and /. If this would happen, pop the operator on top of the stack to the postfix output before pushing the next operator.
3. A lower priority operator cannot be placed on top of a higher priority operator. If this would happen, first pop the higher priority operator from the stack to the postfix output, then push the lower priority operator.
4. When a closing parenthesis is scanned, pop the operators back to the matching opening parenthesis to the postfix output.
Example: let us convert the infix expression ( A+B/C*(D+E) - F ) to postfix notation.
Symbol  Stack    Postfix        Remark
(       (
A       (        A
+       (+       A
B       (+       AB
/       (+/      AB             + is lower than /, so + is not popped
C       (+/      ABC
*       (+*      ABC/           / and * cannot stay together, so / is popped to the postfix
(       (+*(     ABC/
D       (+*(     ABC/D
+       (+*(+    ABC/D
E       (+*(+    ABC/DE
)       (+*      ABC/DE+        + is popped to the postfix
-       (-       ABC/DE+*+      * is higher than -, so * is popped; - and + cannot stay together, so + is popped
F       (-       ABC/DE+*+F
)                ABC/DE+*+F-    - is popped to the postfix
Thus the expression ( A+B/C*(D+E) - F ) is converted to ABC/DE+*+F- in postfix notation.
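The stack rules above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: it assumes single-character operands, well-formed parentheses, and a right-associative exponent operator written as ^. The names PRECEDENCE and to_postfix are chosen here for illustration only.

```python
# Precedence table: a larger number binds tighter (assumed: ^ > * / > + -).
PRECEDENCE = {"^": 3, "*": 2, "/": 2, "+": 1, "-": 1}


def to_postfix(expression):
    """Convert an infix expression with single-character operands to postfix."""
    stack = []    # pending operators and open parentheses
    output = []   # postfix result, built left to right
    for symbol in expression:
        if symbol.isspace():
            continue
        if symbol == "(":
            stack.append(symbol)
        elif symbol == ")":
            # Pop operators back to the matching "(" and discard the pair.
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()
        elif symbol in PRECEDENCE:
            # Pop operators of higher priority, and of equal priority
            # unless the incoming operator is the right-associative "^".
            while (stack and stack[-1] != "("
                   and (PRECEDENCE[stack[-1]] > PRECEDENCE[symbol]
                        or (PRECEDENCE[stack[-1]] == PRECEDENCE[symbol]
                            and symbol != "^"))):
                output.append(stack.pop())
            stack.append(symbol)
        else:
            output.append(symbol)  # an operand goes straight to the output
    while stack:
        output.append(stack.pop())
    return "".join(output)


print(to_postfix("(A+B/C*(D+E)-F)"))  # ABC/DE+*+F-
```

Calling to_postfix on the example expression reproduces the result of the trace table, which is a quick way to check a hand conversion.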
2. Three-Address Code
The intermediate code generator receives input from its predecessor phase, the semantic analyzer, in the form of an annotated syntax tree.
That syntax tree can then be converted into a linear representation, e.g., postfix notation.
Intermediate code tends to be machine independent.
Therefore, the code generator assumes an unlimited number of memory locations (registers) when generating code.
Three-address code can be represented in three forms:
1. Quadruples
2. Triples
3. Indirect triples
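To make the first two forms concrete, the sketch below turns a postfix string into quadruples (operator, argument 1, argument 2, result) and then derives triples, which drop the result field and refer to earlier statements by position. It is an illustrative sketch under stated assumptions: operands are single characters, every operator is binary, and the temporary names t1, t2, ... and both function names are invented here.

```python
OPERATORS = set("+-*/^")


def postfix_to_quadruples(postfix):
    """Return a list of (op, arg1, arg2, result) tuples for a postfix string."""
    stack = []
    quads = []
    for symbol in postfix:
        if symbol in OPERATORS:
            right = stack.pop()
            left = stack.pop()
            # Fresh temporary: the supply of temporaries is assumed unlimited.
            temp = f"t{len(quads) + 1}"
            quads.append((symbol, left, right, temp))
            stack.append(temp)
        else:
            stack.append(symbol)
    return quads


def quadruples_to_triples(quads):
    """Drop the result field and refer to earlier results by position."""
    position = {quad[3]: f"({i})" for i, quad in enumerate(quads)}
    return [(op, position.get(a1, a1), position.get(a2, a2))
            for op, a1, a2, _ in quads]


quads = postfix_to_quadruples("ABC/DE+*+F-")
for op, a1, a2, result in quads:
    print(f"{result} = {a1} {op} {a2}")
for index, triple in enumerate(quadruples_to_triples(quads)):
    print(index, triple)
```

Indirect triples would keep the same triples and add a separate list of their positions that fixes the execution order, so statements can be reordered without renumbering the references.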
Declarations
A declaration in a program is a statement that gives the programming language translator the name and type of data objects. For example:
int a, b;
This declaration tells the translator that a and b are data objects of type integer that are needed during the execution of the subprogram.
For every local name in a procedure, we create a symbol table (ST) entry containing:
The type of the name
How much storage the name requires
The productions are:
1. D → integer, id
2. D → real, id
3. D → D1, id
A suitable translation scheme for declarations attaches to each production an action that enters the declared name and its type into the symbol table.
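One way to picture what such a scheme does is the Python sketch below, which enters each declared name into a symbol table with its type and storage requirement. It is not the scheme from the original slides. The widths in WIDTH, the offset field, and the names Entry and declare are assumptions made for this illustration.

```python
from dataclasses import dataclass

# Assumed storage widths in bytes; the source does not specify them.
WIDTH = {"integer": 4, "real": 8}


@dataclass
class Entry:
    type: str     # the type of the name
    width: int    # how much storage the name requires
    offset: int   # assumed extra field: relative address in the data area


def declare(type_name, names, table=None):
    """Enter each name with the given type, as for D -> integer, id and D -> D1, id."""
    table = {} if table is None else table
    offset = sum(entry.width for entry in table.values())
    for name in names:
        if name in table:
            raise ValueError(f"duplicate declaration of {name}")
        table[name] = Entry(type_name, WIDTH[type_name], offset)
        offset += WIDTH[type_name]
    return table


table = declare("integer", ["a", "b"])   # corresponds to: int a, b;
declare("real", ["x"], table)
for name, entry in table.items():
    print(name, entry)
```

Each later name in a list inherits the type of the first, which mirrors how D → D1, id passes the type of D1 along to the new id.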
Procedure Calls
When a procedure call occurs, the following actions take place:
1. Space is allocated for the activation record of the called procedure.
2. The arguments of the called procedure are evaluated.
3. The environment pointers are established so that the called procedure can access data in enclosing blocks.
4. The state of the calling procedure is saved so that it can resume execution after the call.
5. The return address is saved, that is, the location to which the called routine must transfer control when it finishes.
6. Finally, a jump to the beginning of the code for the called procedure is generated.
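Let us consider a grammar for a simple procedure call statement:
S → call id(Elist)
Elist → Elist, E
Elist → E
A suitable translation scheme for procedure calls collects the place of each argument while Elist is parsed, then emits a param statement for each argument, followed by the call itself.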
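A queue-based treatment of this grammar, as commonly described in compiler textbooks, can be sketched as follows. The statement forms "param x" and "call f, n" and the function name translate_call are assumptions of this sketch, not taken from the original slides.

```python
def translate_call(name, argument_places):
    """Emit three-address statements for: call name(arg1, arg2, ...)."""
    queue = []
    for place in argument_places:
        # Elist -> E starts the queue; Elist -> Elist, E appends to it.
        queue.append(place)
    # S -> call id(Elist): one param statement per argument, then the call.
    code = [f"param {place}" for place in queue]
    code.append(f"call {name}, {len(queue)}")
    return code


for line in translate_call("f", ["t1", "x"]):
    print(line)
```

Collecting the arguments first keeps all param statements together immediately before the call, even when evaluating an argument itself generates code.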
Backpatching
Backpatching is mainly used for:
1. Boolean expressions:
Boolean expressions are statements whose results can be either true or false.
2. Flow of control statements:
The flow of control needs to be directed during the execution of statements in a program.
Backpatching for Boolean Expressions
[Table: production rules]
To find the three-address code (TAC) for the following expression using backpatching:
(A < B) OR (C < D) AND (P < Q)
The expression has two kinds of operators: the relational operator < and the logical (Boolean) operators AND and OR.
The AND operator has higher priority than the OR operator.
Let T1 hold the result of A < B, T2 the result of C < D, and T3 the result of P < Q.
Then (A < B) OR (C < D) AND (P < Q) becomes T1 OR T2 AND T3.
Let T4 hold the result of T2 AND T3, and T5 the result of T1 OR T4.
T1  T4  T1 OR T4
0   0   0
0   1   1
1   0   1
1   1   1
When T1 is 1, the result is 1, so there is no need to check T4. When T1 is 0, the result depends on T4, so backpatching is required: the false exit of T1 must be patched to jump to the code that evaluates T4.
The call has the form Backpatch(list of jumps to fill, target quadruple).
Here it is Backpatch(11, 12): 11 is the quadruple for the false (0) exit of T1, and 12 is the next quadruple, where the evaluation of T4 begins.
To find the true list of the OR, merge the true lists of T1 and T4: T = (10, 14).
The false list of the OR is the false list of T4, which merges the false exits of T2 and T3: F = (13, 15). Quadruple 11 is not on it because it has already been backpatched to 12.
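The quadruple numbers in this walk-through can be reproduced with a short Python sketch of the standard textbook backpatching rules for OR and AND. It assumes that numbering starts at quadruple 10 and that each relational test emits one conditional jump followed by one unconditional jump. The names emit, relop and backpatch are chosen for this illustration. Under these rules quadruple 11 drops off the false list once it has been patched to 12.

```python
START = 10          # assumed first quadruple number, matching the text
quads = {}          # quadruple number -> [text, jump target or None]


def next_quad():
    return START + len(quads)


def emit(text):
    """Append a jump whose target is still unknown; return its number."""
    number = next_quad()
    quads[number] = [text, None]
    return number


def relop(left, right):
    """id < id: a conditional jump (true exit), then a plain jump (false exit)."""
    true_list = [emit(f"if {left} < {right} goto")]
    false_list = [emit("goto")]
    return true_list, false_list


def backpatch(jump_list, target):
    """Fill in the target of every jump on the list."""
    for number in jump_list:
        quads[number][1] = target


t1 = relop("A", "B")            # quadruples 10 and 11
m1 = next_quad()                # 12: where (C < D) AND (P < Q) starts
t2 = relop("C", "D")            # quadruples 12 and 13
m2 = next_quad()                # 14: where P < Q starts
t3 = relop("P", "Q")            # quadruples 14 and 15

# T4 = T2 AND T3: a true T2 must go on to test T3.
backpatch(t2[0], m2)
t4 = (t3[0], t2[1] + t3[1])

# T5 = T1 OR T4: a false T1 must go on to test T4.
backpatch(t1[1], m1)
t5 = (t1[0] + t4[0], t4[1])

for number, (text, target) in quads.items():
    print(number, text, "_" if target is None else target)
print("true list:", t5[0], "false list:", t5[1])
```

The jumps still printed with _ are exactly those on the true and false lists; they are filled in later, once the enclosing statement knows where control should go in each case.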
[Figure: parse tree for the expression]
Thus we can now fill in the unspecified information for the expression
(A < B) OR (C < D) AND (P < Q)