2024 CD Ch06 Intermediate & Ch07 Runtime & Ch08 Code Optimization
The terms intermediate code, intermediate language, and intermediate representation are used interchangeably.
Intermediate code is the interface between the front end and the back end.
It is an abstraction that lies between the high-level source code and the machine code.
The front end translates a source program into an intermediate representation, and then the back end generates target code from it.
1/11/2025
Intermediate Code/Languages (cont’d)
Why do we translate source code into intermediate code, which is then translated into target code?
What happens if source code is translated directly into target machine code?
Let us look at the reasons why intermediate code is needed.
If a compiler translates the source code directly into target machine code, without the option of generating intermediate code, then a full native compiler is required for each new machine.
Intermediate code eliminates the need for a new full compiler for every unique machine by keeping the analysis portion the same for all compilers.
The intermediate code generator takes its input from the preceding phase in the form of an annotated syntax tree and converts it into a linear representation, e.g., postfix notation.
Intermediate code also makes it easier to improve program performance, because code optimization techniques can be applied directly to the intermediate code.
Intermediate Code Representation
Intermediate code can be represented at two levels, each with its own benefits:
1. High-level intermediate representations:
High-level intermediate code is very close to the source language itself.
Code modifications that improve the performance of the source program are easy to apply at this level.
It is less suited to target machine optimization.
2. Low-level intermediate representations:
Low-level intermediate code is close to the target machine.
It is suitable for register and memory allocation, instruction selection, etc.
It is well suited to machine-dependent optimizations.
Intermediate code tends to be machine-independent code.
Since intermediate code tends to be machine independent, the code generator assumes an unlimited number of storage locations (registers) when generating code.
For example: a := b * -c + b * -c
The intermediate code generator divides this expression into sub-expressions and then generates the corresponding code as follows:
t1 := -c
t2 := b * t1
t3 := -c
t4 := b * t3
t5 := t2 + t4
a := t5
Here t1 to t5 are temporaries that are used as registers in the target program.
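The translation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the expression is given as a nested tuple (operator, operand, ...), and the names gen_tac and translate_assignment are made up for this example; they do not come from any particular compiler.

```python
import itertools

def gen_tac(node, code, temps):
    """Emit three-address code for an expression tree; return the name holding its value."""
    if isinstance(node, str):              # leaf: a variable name
        return node
    op, *operands = node                   # interior node: (operator, operand, ...)
    args = [gen_tac(child, code, temps) for child in operands]
    target = f"t{next(temps)}"             # fresh compiler-generated temporary
    if len(args) == 1:                     # unary operator, e.g. ("-", "c")
        code.append(f"{target} := {op}{args[0]}")
    else:                                  # binary operator
        code.append(f"{target} := {args[0]} {op} {args[1]}")
    return target

def translate_assignment(name, expr):
    """Translate 'name := expr' into a list of three-address statements."""
    code = []
    result = gen_tac(expr, code, itertools.count(1))
    code.append(f"{name} := {result}")
    return code

# a := b * -c + b * -c
expr = ("+", ("*", "b", ("-", "c")), ("*", "b", ("-", "c")))
tac = translate_assignment("a", expr)
print("\n".join(tac))
```

Because the operands of a node are translated before the node itself, each temporary is defined before it is used, which is why the printed statements come out in the same order as the listing above.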
Intermediate code can be either language specific (e.g., bytecode for Java) or language independent (e.g., three-address code).
There are three common forms of intermediate representation:
a. Syntax tree: graphically represents the hierarchical structure of expressions
b. Postfix notation: uses a stack-based approach to give a linearized representation of a syntax tree/expression
c. Three-address code: instructions with at most three operands
a. Syntax tree: a graphical representation
A syntax tree depicts the natural hierarchical structure of a source program.
A dag (Directed Acyclic Graph) gives the same information but in a more compact way, because common subexpressions are identified.
A syntax tree and a dag for the assignment statement a := b * -c + b * -c are as follows:
[Figure: syntax tree and dag for a := b * -c + b * -c. In the syntax tree the subtree for b * -c appears twice; in the dag a single node for b * -c is shared by both operands of +.]
b. Postfix notation: a linearized representation of a syntax tree
It is a list of the nodes of the tree in which a node appears immediately after its children.
The postfix notation for the above syntax tree is:
a b c uminus * b c uminus * + assign
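Listing each node immediately after its children is a post-order traversal of the tree. The sketch below assumes the syntax tree is written as nested tuples (label, child, ...), with plain strings as leaves; the function name postfix is chosen for this example only.

```python
def postfix(node):
    """Return the nodes of a syntax tree in postfix (post-order) order."""
    if isinstance(node, str):          # leaf: an identifier
        return [node]
    label, *children = node
    out = []
    for child in children:             # children first, left to right ...
        out.extend(postfix(child))
    out.append(label)                  # ... then the node itself
    return out

# Syntax tree for a := b * -c + b * -c
tree = ("assign", "a",
        ("+", ("*", "b", ("uminus", "c")),
              ("*", "b", ("uminus", "c"))))
print(" ".join(postfix(tree)))
```

The printed string is the postfix form of the whole assignment, with the unary minus written as uminus so that it cannot be confused with binary subtraction.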
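c. Three-address code
In three-address code there is at most one operator on the right side of an instruction.
Example: a := a + a * (b - c) + (b - c) * d
t1 = b - c
t2 = a * t1
t3 = a + t2
t4 = t1 * d
t5 = t3 + t4
Here t5 holds the value of the right-hand side.
[Figure: dag for the expression, in which the node for b - c is shared by a * (b - c) and (b - c) * d.]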
6.2. Three-Address Statements/Code
Three-address code is an abstract form of intermediate code that is easy to generate and can easily be converted to machine code.
Each instruction uses at most three addresses and one operator, and the value computed by each instruction is stored in a temporary variable generated by the compiler.
The order of operations is decided by the compiler and made explicit by the sequence of three-address instructions.
For instance, the general form of a three-address statement is:
x := y op z
where x, y and z are names, constants, or compiler-generated temporaries, and op stands for any operator, such as an arithmetic or logical operator.
Thus a source language expression like x + y * z might be translated into the sequence
t1 := y * z
t2 := x + t1
where t1 and t2 are compiler-generated temporary names.
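To make the general form x := y op z concrete, the sketch below stores each statement as a tuple (result, arg1, op, arg2) and executes the sequence against a table of variable values. The tuple layout and the name run_tac are assumptions made for this illustration; a real compiler would translate the statements to machine code rather than interpret them.

```python
import operator

# Operators supported by this small sketch
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def run_tac(instructions, env):
    """Execute (result, arg1, op, arg2) statements, updating and returning env."""
    def value(arg):
        # An operand is either a name (looked up in env) or a constant
        return env[arg] if isinstance(arg, str) else arg
    for result, arg1, op, arg2 in instructions:
        env[result] = OPS[op](value(arg1), value(arg2))
    return env

# x + y * z  becomes  t1 := y * z ; t2 := x + t1
program = [("t1", "y", "*", "z"),
           ("t2", "x", "+", "t1")]
env = run_tac(program, {"x": 2, "y": 3, "z": 4})
print(env["t2"])   # 2 + 3 * 4 = 14
```

Each statement has exactly one operator, so the precedence of * over + in the source expression shows up only as the order of the two statements.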
1/11/2025 WCU-CS Compiled by TM.
Three-address code corresponding to the syntax tree and the dag given above for the expression a := b * -c + b * -c:
(a) Code for the syntax tree
t1 := -c
t2 := b * t1
t3 := -c
t4 := b * t3
t5 := t2 + t4
a := t5
(b) Code for the dag
t1 := -c
t2 := b * t1
t5 := t2 + t2
a := t5
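The shorter code for the dag comes from recognising that b * -c has already been computed. One simple way to get this effect, sketched below, is to remember every (operator, operands) combination that has been emitted and reuse its temporary. The names tac_from_dag and seen are invented for this example, and the sketch assumes no variable is reassigned inside the expression. It numbers temporaries consecutively, so the sum is called t3 here rather than t5.

```python
def tac_from_dag(name, expr):
    """Emit three-address code for 'name := expr', sharing common subexpressions."""
    code = []
    seen = {}   # (op, operand names...) -> temporary that already holds the value

    def visit(node):
        if isinstance(node, str):                 # leaf: a variable name
            return node
        op, *operands = node
        key = (op, *[visit(child) for child in operands])
        if key not in seen:                       # first occurrence: emit code
            seen[key] = f"t{len(seen) + 1}"
            args = key[1:]
            rhs = f"{op}{args[0]}" if len(args) == 1 else f"{args[0]} {op} {args[1]}"
            code.append(f"{seen[key]} := {rhs}")
        return seen[key]                          # repeated occurrence: reuse

    result = visit(expr)
    code.append(f"{name} := {result}")
    return code

expr = ("+", ("*", "b", ("-", "c")), ("*", "b", ("-", "c")))
dag_code = tac_from_dag("a", expr)
print("\n".join(dag_code))
```

The second b * -c produces no new statements, so four statements are emitted instead of the six needed for the syntax tree.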
Chapter Seven
Run-time Environments
Outline
7.1. Overview of runtime environment
7.2. Symbol table
7.3. Hash table
By: Tseganesh M. (MSc.)
7.1. Overview of runtime environment
In compiler design, the runtime environment is the set of tools and resources needed to execute a program at run time.
It is responsible for managing the execution of a program, including memory management, control flow, stack allocation, library functions, and access to variables.
During execution of a program, memory locations are allocated and deallocated as required.
The runtime environment manages these memory locations to avoid conflicts and memory leaks.
Another crucial aspect of a runtime environment is stack allocation.
The stack is a data structure that keeps track of function calls and the local variables within a program.
The runtime environment makes sure that the correct values are pushed onto and popped from the stack as required.
Proper stack allocation ensures the correct execution of function calls and prevents information loss between function invocations.
Components of a Runtime Environment:
1. Activation Records: Store function call information, including parameters, return address, local variables, and temporary data.
2. Call Stack: Maintains the activation records of function calls in stack (last-in, first-out) order.
3. Heap: Dynamically allocated memory for objects and variables.
4. Static Area: Stores global variables and constants.
E.g., for the function call in the following program:
int add(int x, int y) {
    return x + y;
}
int main() {
    int result = add(5, 10);
}
The call to add creates an activation record on the call stack with parameters x = 5 and y = 10.
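The push and pop of an activation record can be imitated in Python. This is a simplified model, not how a real runtime lays out memory: the class ActivationRecord, its fields, the helper call, and the return address label "main+1" are all hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass, field

@dataclass
class ActivationRecord:
    function: str
    params: dict
    return_address: str
    local_vars: dict = field(default_factory=dict)

# main is already running, so its record is at the bottom of the stack
call_stack = [ActivationRecord("main", {}, return_address="<start>")]
trace = []   # snapshots of the stack, recorded while a call is active

def call(function, params, return_address, body):
    """Push a record for the callee, run its body, then pop the record."""
    record = ActivationRecord(function, params, return_address)
    call_stack.append(record)                         # push on call
    trace.append([r.function for r in call_stack])
    value = body(record)
    call_stack.pop()                                  # pop on return
    return value

# int result = add(5, 10);
result = call("add", {"x": 5, "y": 10}, "main+1",
              lambda rec: rec.params["x"] + rec.params["y"])
call_stack[0].local_vars["result"] = result
print(trace, result)
```

While add is running, the stack holds the records of main and add. After it returns, only the record of main remains, now with the local variable result set.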
7.2. Symbol Table
The symbol table is a crucial data structure used throughout the compilation process.
It serves as a "repository" that stores information about variables, functions, and objects, together with their relevant attributes and properties (such as name, data type, and scope).
In the semantic analysis (SA) phase, the symbol table provides a mechanism to verify semantic properties (such as variables being declared before use, type compatibility, and functions being properly defined and called).
As compilation progresses, the symbol table is continually updated with additional details about each symbol, such as its scope, memory location, and value.
This allows the compiler to perform tasks like type checking, which ensures the consistency and compatibility of the data types used within the program.
The uses of a symbol table:
1. Tracking Identifiers: Keeps track of variable names, their types, and scopes.
Implementation: a symbol table can be implemented as a hash table, a tree, or a linked list; a hash table gives the fastest lookups on average.
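A minimal sketch of a symbol table that tracks names, types, and scopes is shown below. It assumes block-structured scoping and uses one Python dictionary (itself a hash table) per scope; the class and method names are illustrative only.

```python
class SymbolTable:
    """A stack of scopes; each scope maps a name to its attributes."""

    def __init__(self):
        self.scopes = [{}]                 # scopes[0] is the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, type_):
        scope = self.scopes[-1]
        if name in scope:                  # same name twice in one scope
            raise KeyError(f"redeclaration of {name!r}")
        scope[name] = {"type": type_, "scope_level": len(self.scopes) - 1}

    def lookup(self, name):
        # Search from the innermost scope outwards
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None                        # undeclared identifier

table = SymbolTable()
table.declare("x", "int")
table.enter_scope()
table.declare("x", "float")                # shadows the outer x
inner = table.lookup("x")
table.exit_scope()
outer = table.lookup("x")
print(inner, outer, table.lookup("y"))
```

A lookup that returns None is how a semantic analyzer built on this sketch would detect a variable used before its declaration.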
7.3. Hash Table
A hash table is a data structure that stores key-value pairs and uses a hash function to retrieve information quickly, which makes it well suited to symbol table management.
By using a hash table, a compiler can quickly search for symbol attributes, which keeps compilation efficient.
It minimizes the time needed to search for symbol attributes: retrieval typically takes constant time on average.
During compilation, the hash function computes a hash value for the identifier, which is then used as an index for storage and retrieval.
Hash Function: Converts a key (e.g., a variable name) into an index for efficient storage and retrieval.
E.g.: For variable x, the hash function might compute:
Index = Hash("x") % Table_Size
Collision Handling:
1. Chaining: Uses linked lists to store multiple keys at the same index.
2. Open Addressing: Finds the next available slot in the table.
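The index computation and chaining can be sketched as follows. Several things here are assumptions made for the example: the hash function simply adds up character codes (chosen so that the result is easy to check by hand), the table has 7 slots, and each chain is a Python list rather than a hand-written linked list.

```python
TABLE_SIZE = 7

def hash_name(name):
    """Toy hash function: the sum of the character codes of the name."""
    return sum(ord(ch) for ch in name)

class ChainedHashTable:
    def __init__(self, size=TABLE_SIZE):
        self.buckets = [[] for _ in range(size)]    # one chain per slot

    def index(self, key):
        return hash_name(key) % len(self.buckets)   # Index = Hash(key) % Table_Size

    def insert(self, key, attributes):
        bucket = self.buckets[self.index(key)]
        for i, (existing, _) in enumerate(bucket):
            if existing == key:                     # key already present: update it
                bucket[i] = (key, attributes)
                return
        bucket.append((key, attributes))            # new key, or a collision: chain it

    def lookup(self, key):
        for existing, attributes in self.buckets[self.index(key)]:
            if existing == key:
                return attributes
        return None

symbols = ChainedHashTable()
symbols.insert("x", {"type": "int"})
symbols.insert("ab", {"type": "float"})
symbols.insert("ba", {"type": "char"})              # same character sum as "ab"
print(symbols.index("x"), symbols.index("ab"), symbols.index("ba"))
```

The names ab and ba hash to the same slot, so both end up in one chain, and a lookup compares keys within that chain to find the right entry.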
Chapter Eight
Code Generation and Optimization
Outline:
8.1. A simple code generation algorithm
8.2. Register allocation
8.3. Peephole Optimization
8.4. DAG Representation (reading assignment)
8.1. Overview of a simple code generation algorithm
Code generation is the final step of the compilation process, the one that turns the source program into machine code executable by a computer.
In this phase, the compiler analyzes the intermediate representation of the source code and translates it into a target language, usually assembly or machine code.
The code generator produces target code for three-address statements, and it uses registers to hold the operands of each three-address statement.
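A deliberately naive sketch of this step is shown below. It assumes a hypothetical target machine with instructions MOV, ADD, SUB and MUL written as "OP source, destination", and it uses a single register R0 for every statement. A practical code generator would also track which values are already in registers so that it can avoid redundant loads and stores; that bookkeeping is left out here.

```python
# Hypothetical mnemonics for the assumed target machine
MNEMONIC = {"+": "ADD", "-": "SUB", "*": "MUL"}

def generate(tac):
    """Translate (result, arg1, op, arg2) statements into assembly-like text."""
    asm = []
    for result, arg1, op, arg2 in tac:
        asm.append(f"MOV {arg1}, R0")               # load the first operand into R0
        asm.append(f"{MNEMONIC[op]} {arg2}, R0")    # R0 := R0 op second operand
        asm.append(f"MOV R0, {result}")             # store the result
    return asm

# t1 := y * z ; t2 := x + t1
target = generate([("t1", "y", "*", "z"),
                   ("t2", "x", "+", "t1")])
print("\n".join(target))
```

Every statement costs three instructions in this sketch, including a store of t1 followed by a use of t1 from memory, which is the kind of waste that register allocation and peephole optimization aim to remove.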
Steps in Code Generation:
1. Intermediate Code: Convert the intermediate representation (e.g., three-address code, TAC) to machine code.
Optimization techniques such as copy propagation, dead-code elimination, and loop unrolling are applied during code generation to reduce execution time, memory usage, code size, and power consumption.
By applying these techniques, the compiler produces code that makes better use of the available resources and executes faster.
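Two of these techniques, copy propagation and dead-code elimination, can be sketched on straight-line three-address code. The representation is an assumption made for this example: each statement is a pair (destination, right-hand side), where the right-hand side is either (source,) for a copy or (arg1, op, arg2). The sketch handles a single basic block only, with no branches or loops.

```python
def copy_propagate(code):
    """Replace uses of a copied name by the name it was copied from."""
    copies, out = {}, []
    for dst, rhs in code:
        rhs = tuple(copies.get(x, x) for x in rhs)          # substitute known copies
        # dst gets a new value, so copies that mention dst are no longer valid
        copies = {d: s for d, s in copies.items() if dst not in (d, s)}
        if len(rhs) == 1 and isinstance(rhs[0], str) and rhs[0] != dst:
            copies[dst] = rhs[0]                            # record the copy dst := src
        out.append((dst, rhs))
    return out

def eliminate_dead_code(code, live_out):
    """Drop statements whose result is never used (backward liveness scan)."""
    live, kept = set(live_out), []
    for dst, rhs in reversed(code):
        if dst in live:
            kept.append((dst, rhs))
            live.discard(dst)
            live.update(x for x in rhs if isinstance(x, str) and x.isidentifier())
    return kept[::-1]

block = [("x", ("a",)),                 # x := a
         ("t1", ("x", "+", "b")),       # t1 := x + b
         ("y", ("t1",)),                # y := t1
         ("z", ("y", "*", "y"))]        # z := y * y
propagated = copy_propagate(block)
optimized = eliminate_dead_code(propagated, live_out={"z"})
print(optimized)
```

After copy propagation the copies x := a and y := t1 are no longer used by any later statement, so the backward scan removes them. Only z is assumed to be needed after the block.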
Peephole Optimization (cont’d)
Constant folding: deducing at compile time that the value of an expression is a constant and using the constant instead.
When a conditional test folds to a constant false, both the test and the printing it guards can be eliminated from the object code.
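Constant folding on expression trees can be sketched as below. The tree format (op, left, right), with integers for constants and strings for variable names, is an assumption made for this example, and only +, - and * are handled.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(node):
    """Replace subexpressions whose operands are all constants by their value."""
    if not isinstance(node, tuple):
        return node                          # a variable name or a constant
    op, left, right = node
    left, right = fold(left), fold(right)    # fold the operands first
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)          # computed once, at compile time
    return (op, left, right)                 # still depends on a variable

seconds = fold(("+", "x", ("*", 60, 60)))    # x + 60 * 60
total = fold(("*", ("+", 2, 3), 4))          # (2 + 3) * 4
print(seconds, total)
```

The first expression keeps its variable but has the multiplication replaced by 3600; the second contains only constants and folds to a single number.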
Optimizations must preserve the meaning of a program: they must not change its output.