Code Generation and Code Optimization Eng 16
www.gradeup.co
Code Generation and Code optimization:
Content:
1. Type expressions
2. Writing simple type checker
3. Intermediate code generation
4. Intermediate form
5. Syntax tree
6. Three address code
7. Code optimization
8. Compile time evaluation
9. Dead code elimination
10. Code movement
11. Code generation
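Writing a simple type checker: A type checker checks for type compatibility, which
essentially means finding out whether two type expressions are equivalent. This type
equivalence may be name equivalence or structural equivalence. Under name equivalence,
two types are equivalent only if they have the same name; under structural equivalence,
they are equivalent if they have the same structure.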
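The difference between the two notions of equivalence can be sketched in Python. This is a
minimal illustration under stated assumptions, not a complete type checker: the classes
Basic, Pointer, Array and Named and the function equivalent are hypothetical names
introduced here, and recursive type definitions are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Basic:
    name: str              # e.g. "int" or "real"


@dataclass(frozen=True)
class Pointer:
    target: object         # type expression being pointed to


@dataclass(frozen=True)
class Array:
    size: int
    element: object        # type expression of each element


@dataclass(frozen=True)
class Named:
    name: str              # user-declared type name
    definition: object     # type expression the name stands for


def strip_names(t):
    """Replace a type name by its definition (non-recursive types assumed)."""
    while isinstance(t, Named):
        t = t.definition
    return t


def equivalent(s, t, by_name=False):
    """Structural equivalence by default; name equivalence if by_name is True."""
    if by_name and (isinstance(s, Named) or isinstance(t, Named)):
        # Under name equivalence, named types match only if the names match.
        return isinstance(s, Named) and isinstance(t, Named) and s.name == t.name
    s, t = strip_names(s), strip_names(t)
    if isinstance(s, Basic) and isinstance(t, Basic):
        return s.name == t.name
    if isinstance(s, Pointer) and isinstance(t, Pointer):
        return equivalent(s.target, t.target, by_name)
    if isinstance(s, Array) and isinstance(t, Array):
        return s.size == t.size and equivalent(s.element, t.element, by_name)
    return False


INT = Basic("int")
cell = Named("cell", Pointer(INT))
link = Named("link", Pointer(INT))
print(equivalent(cell, link))                 # same structure
print(equivalent(cell, link, by_name=True))   # different names
```

In this sketch, cell and link are structurally equivalent because both stand for a pointer
to int, but they are not name equivalent because they were declared under different names.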
Intermediate code generation: Compilers often generate intermediate code instead of
translating an input file directly to binary. It is not always possible to generate
target code for a program in one pass of the compiler, for the following reasons:
1. Sufficient core memory may not be available to accommodate a single-pass compiler.
2. A multi-pass structure may be required to satisfy the primary aims of the compiler:
to generate highly efficient target code or to occupy the minimum possible storage space.
Intermediate Forms: There are several intermediate code forms, viz. postfix notation,
syntax trees and three address code. The choice of intermediate code form to be
used in a compiler depends on two important considerations:
1. Ease of conversion from the source program to the intermediate code.
2. Ease of subsequent processing of the intermediate code, viz. generation of target
code in the simplest manner, or program optimization.
Syntax Trees: It is a condensed form of parse tree which is used to represent the
syntactic structure of the language constructs. Operators and keywords are associated
with the interior nodes in a syntax tree and do not appear as leaves.
Each operator node is applied to its children. These child nodes may represent the
subexpressions that constitute the operands of that operator.
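A syntax tree of this kind can be sketched in Python as follows. This is only an
illustration: the Node class, the to_postfix function and the example expression
a + b * 9 are assumptions made here. The post-order walk also shows how the postfix
notation mentioned earlier can be read off the tree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    label: str                       # operator for interior nodes, name or constant for leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self):
        return self.left is None and self.right is None


def to_postfix(node):
    """Post-order walk: both operands first, then the operator."""
    if node.is_leaf():
        return [node.label]
    return to_postfix(node.left) + to_postfix(node.right) + [node.label]


# Syntax tree for a + b * 9: '+' is the root, '*' is an interior node,
# and a, b and 9 are leaves.
tree = Node("+", Node("a"), Node("*", Node("b"), Node("9")))
print(" ".join(to_postfix(tree)))    # a b 9 * +
```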
Three Address Code: In three address code, each statement generally contains three
addresses: two for the operands and one for the result. This is why it is termed
three address code. It is a sequence of statements of the general form
a := b op c
where op is any operator and a, b and c can be variables, constants or temporary
variables generated by the compiler. Three address code is a linearized representation
of a syntax tree. For example, the expression a + b * 9 can be represented by the
following three address code:
t1 := b * 9
t2 := a + t1
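The two statements above can be produced mechanically from an expression tree. The
following Python function is a minimal sketch under assumptions made here: a tree is
either a string (a variable or constant) or a tuple (op, left, right), and gen_tac is a
hypothetical name, not part of any particular compiler.

```python
import itertools


def gen_tac(tree):
    """Return (code, place): the three address statements and the name holding the result."""
    code = []
    counter = itertools.count(1)

    def walk(node):
        if isinstance(node, str):            # leaf: already usable as an address
            return node
        op, left, right = node
        left_place = walk(left)
        right_place = walk(right)
        temp = f"t{next(counter)}"           # compiler-generated temporary
        code.append(f"{temp} := {left_place} {op} {right_place}")
        return temp

    place = walk(tree)
    return code, place


code, place = gen_tac(("+", "a", ("*", "b", "9")))
print("\n".join(code))
# t1 := b * 9
# t2 := a + t1
```

Each interior node of the tree yields exactly one statement, and the temporary that holds
its value becomes an operand of the statement generated for its parent.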
Implementation of the three address code: It can be implemented as records having
fields for the operators and operands. It can be realized in several ways viz.
quadruples and triples.
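Quadruples: A quadruple is a record structure having four fields, as follows:
operator  operand1  operand2  result
where operand1 and operand2 are the two operands of the operator, and the result field
contains the result of applying the operator to operand1 and operand2. For example, the
statement a := b + c is represented by the quadruple
+  b  c  a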
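The two record layouts can be sketched in Python as shown below. In the usual textbook
formulation, a quadruple has four fields (operator, two operands and result), while a
triple drops the result field and refers to an earlier result by the position of the
triple that computed it. The names Quad, Triple and show_quad, and the example statement
a := b + c * d, are assumptions made for this illustration.

```python
from collections import namedtuple

Quad = namedtuple("Quad", "op arg1 arg2 result")
Triple = namedtuple("Triple", "op arg1 arg2")

# a := b + c * d as quadruples: every result has an explicit name.
quads = [
    Quad("*", "c", "d", "t1"),
    Quad("+", "b", "t1", "a"),
]

# The same statement as triples: an integer operand is the position (index)
# of the triple whose result is being used.
triples = [
    Triple("*", "c", "d"),       # (0)
    Triple("+", "b", 0),         # (1) uses the result of triple (0)
    Triple(":=", "a", 1),        # (2) assigns the result of triple (1) to a
]


def show_quad(q):
    """Print a quadruple in the a := b op c form."""
    return f"{q.result} := {q.arg1} {q.op} {q.arg2}"


for q in quads:
    print(show_quad(q))
```

Triples avoid entering temporary names into the symbol table, at the cost of making
statements harder to reorder, because moving a triple changes the positions that other
triples refer to.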
Type conversion: In the above generation of three address code for assignment
statements, we assumed that all the operands are of the same type, which is not
always the case. In practice, programming languages allow certain operations on
mixed types. For example, C allows a*b, where a is of a real (floating-point) type and b
is of type integer. In such cases, the compiler may first have to convert one of the
operands so that both operands are of the same type before generating appropriate code.
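A conversion of this kind can be sketched in Python as follows. The sketch assumes only
two types, "integer" and "real", and uses the textbook-style notation inttoreal, int* and
real* for the conversion and the two multiply operators; these names and the function
gen_mul are illustrative assumptions, not the instructions of any real machine.

```python
def gen_mul(a, a_type, b, b_type):
    """Emit three address code for a * b, widening an integer operand to real if needed."""
    code = []
    if a_type == b_type:
        op = "int*" if a_type == "integer" else "real*"
    else:
        # Mixed types: convert the integer operand, then use the real multiply.
        if a_type == "integer":
            code.append(f"t1 := inttoreal {a}")
            a = "t1"
        else:
            code.append(f"t1 := inttoreal {b}")
            b = "t1"
        op = "real*"
    result = f"t{len(code) + 1}"
    code.append(f"{result} := {a} {op} {b}")
    return code


print("\n".join(gen_mul("a", "real", "b", "integer")))
# t1 := inttoreal b
# t2 := a real* t1
```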
Code optimization: Optimizing transformations must satisfy the following conditions:
• They must ensure that the transformed program is semantically equivalent to the
original program.
• The improvement in program efficiency must be achieved without changing the
algorithms used in the program.
The input program is generally written in a high-level language, and the output
program can be in a high-level language or in a low-level language. If both the input and
output programs are in a high-level language, the optimizer is known as a source-to-source
optimizer. If the output is in a low-level language, the optimizer is known as an
optimizing compiler.
The main goal of optimization is to produce a target program with high execution
efficiency. To obtain the best results with minimum effort, we can identify the frequently
executed parts of a program and make them as efficient as possible. Most programs
spend most of their execution time in a small fraction of their code. If we can optimize
this small fraction, efficiency may be improved drastically. This may be achieved by
rearranging the computations in a program without changing the meaning of the program.
• Allocation of scarce resources to achieve high efficiency in program execution, for
example top-of-stack registers or a small number of arithmetic registers.
• Use of special machine features that can be gainfully exploited: immediate instructions,
where a value is part of the instruction; increment instructions, where a memory location
can be incremented by some constant; indexing or indirection; and vector operations.
• If data is intermixed with the instruction sequence, it can be accessed more efficiently
on some machines.
Consider, for example, a situation in which the code can be optimized by using a
LOAD COMPLEMENT instruction.
• It should preserve the meaning of programs i.e. it should not change the output
produced by a program for a given input.
• It should speed up programs by a measurable amount on average.
• It should reduce the size of the program.
Common subexpression elimination: An expression need not be evaluated again if it was
previously computed and the values of the variables in the expression have not changed
since the earlier computation.
For example, consider the following code:
a := b ** c
d := b ** c + x - y
We can eliminate the second evaluation of b ** c from this code if none of the
intervening statements has changed its value. We can thus rewrite the code as
t1 := b ** c
a := t1
d := t1 + x - y
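Now let us consider the following code:
a := b ** c
b := x
d := b ** c + x - y
Here the second evaluation of b ** c cannot be eliminated, because the intervening
statement b := x changes the value of b.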
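Both cases can be checked with a small local pass over straight-line code. The Python
function below is a sketch under assumptions made here: statements are tuples
(dest, op, arg1, arg2), a plain copy is written as (dest, "copy", src, None), and only a
single basic block is considered. The function name is hypothetical.

```python
def eliminate_common_subexpressions(block):
    """Local common subexpression elimination over (dest, op, arg1, arg2) statements."""
    available = {}      # (op, arg1, arg2) -> variable currently holding that value
    out = []
    for dest, op, arg1, arg2 in block:
        key = (op, arg1, arg2)
        if op != "copy" and key in available:
            # The value was computed earlier and is still valid: reuse it.
            out.append((dest, "copy", available[key], None))
        else:
            out.append((dest, op, arg1, arg2))
        # Assigning dest kills every expression that reads dest or is held in dest.
        available = {k: v for k, v in available.items()
                     if dest not in k[1:] and v != dest}
        if op != "copy" and dest not in (arg1, arg2):
            available[key] = dest
    return out


# First example: b ** c is still available when it is needed again.
first = [("a", "**", "b", "c"), ("t1", "**", "b", "c"),
         ("t2", "+", "t1", "x"), ("d", "-", "t2", "y")]
print(eliminate_common_subexpressions(first)[1])     # ('t1', 'copy', 'a', None)

# Second example: b := x kills b ** c, so nothing can be eliminated.
second = [("a", "**", "b", "c"), ("b", "copy", "x", None), ("t1", "**", "b", "c")]
print(eliminate_common_subexpressions(second) == second)    # True
```

The pass is deliberately conservative: whenever a variable is assigned, every remembered
expression that mentions it is discarded.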
Dead code elimination: If the value contained in a variable at some point is not used
anywhere in the program subsequently, the variable is said to be dead at that point. An
assignment to a dead variable can be eliminated without changing the result of the program.
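For straight-line code, dead assignments can be found with a single backward scan. The
Python sketch below makes several assumptions that are stated here rather than taken from
the text: statements are (dest, operands) pairs, they have no side effects, the set of
variables needed after the block (live_out) is given, and the function name is
hypothetical.

```python
def eliminate_dead_code(block, live_out):
    """Remove assignments to dead variables from straight-line code.

    block    : list of (dest, operands) pairs, e.g. ("t1", ["b", "c"])
    live_out : set of variables still needed after the block
    """
    live = set(live_out)
    kept = []
    for dest, operands in reversed(block):
        if dest in live:
            kept.append((dest, operands))
            live.discard(dest)           # dest is defined here ...
            live.update(operands)        # ... and its operands are needed before it
        # Otherwise dest is dead at this point and the statement is dropped.
    kept.reverse()
    return kept


block = [("t1", ["b", "c"]), ("t2", ["x", "y"]), ("a", ["t1"])]
print(eliminate_dead_code(block, {"a"}))
# [('t1', ['b', 'c']), ('a', ['t1'])]
```

In this example t2 is never used after it is computed, so its assignment is removed, while
t1 is kept because a depends on it.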
Loop test replacement: We can replace a loop termination test phrased in terms of
one variable by an equivalent loop test phrased in terms of another variable.
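The idea can be illustrated with two Python functions that compute the same result. The
example is an assumption made here: t = 4 * i is a variable derived from the loop counter
i, so the test i < n can be rephrased as t < 4 * n, after which i is no longer needed.
The function names are hypothetical.

```python
def sum_offsets_original(n):
    total = 0
    i = 0
    while i < n:             # loop test phrased in terms of i
        t = 4 * i            # t is derived from i on every iteration
        total += t
        i += 1
    return total


def sum_offsets_replaced(n):
    total = 0
    t = 0
    while t < 4 * n:         # equivalent loop test phrased in terms of t
        total += t
        t += 4               # t is updated directly, so i can be removed
    return total


print(sum_offsets_original(5) == sum_offsets_replaced(5))    # True
```

In a compiler the bound 4 * n would be computed once before the loop; it is written inline
here only to keep the sketch short.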
Code Generation:
Code generation can be considered the final phase of compilation. Optimization can still
be applied to the code after code generation, but that can be seen as a part of the code
generation phase itself. The code generated by the compiler is object code in some
lower-level language, for example assembly language. We have seen that source code
written in a higher-level language is transformed into a lower-level language, resulting
in lower-level object code, which should have the following minimum properties:
a. It should carry the exact meaning of the source code.
b. It should be efficient in terms of CPU usage and memory management.