AUTOMATA THEORY & COMPILER DESIGN
UNIT V
CODE GENERATION
Code generator converts the intermediate representation of source code into a form
that can be readily executed by the machine. A code generator is expected to generate the
correct code. Designing of the code generator should be done in such a way that it can be
easily implemented, tested, and maintained.
2. Target program:
The target program is the output of the code generator. The output may be absolute
machine language, relocatable machine language, or assembly language.
Absolute machine language as output has the advantage that it can be placed in a
fixed memory location and immediately executed. For example, WATFIV is a compiler
that produces absolute machine code as output.
Assembly language as output makes code generation easier. We can generate
symbolic instructions and use the macro facilities of the assembler in generating code.
However, an additional assembly step is then needed after code generation.
3. Memory management
Mapping the names in the source program to the addresses of data objects is done by
the front end and the code generator. A name in the three address statements refers to the
symbol table entry for the name. Then from the symbol table entry, a relative address can be
determined for the name.
4. Instruction selection
Selecting the best instructions improves the efficiency of the program. The
instruction set should be complete and uniform. Instruction speeds and machine idioms
also play a major role when efficiency is considered. If we do not care about the efficiency
of the target program, however, instruction selection is straightforward.
For example, the three-address statements below would each be translated, statement by
statement, into the code sequence that follows:
P:=Q+R
S:=P+T
MOV Q, R0
ADD R, R0
MOV R0, P
MOV P, R0
ADD T, R0
MOV R0, S
Here the fourth statement is redundant, since it reloads the value of P that the
previous statement has just stored, leading to an inefficient code sequence. A given
intermediate representation can be translated into many code sequences, with significant
cost differences between the different implementations. Prior knowledge of instruction
cost is needed in order to design good sequences, but accurate cost information is
difficult to predict.
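Such a redundant reload can be removed by a simple peephole pass over the generated code. The sketch below is illustrative: the tuple instruction format and the helper name are assumptions, not part of the notes.

```python
# Minimal peephole pass: drop "MOV x, R" when it immediately follows "MOV R, x",
# since the value of x is still in register R. Instructions are assumed to be
# (opcode, src, dst) tuples for illustration.

def remove_redundant_loads(code):
    out = []
    for instr in code:
        if (out
                and instr[0] == "MOV"
                and out[-1][0] == "MOV"
                and out[-1][1] == instr[2]      # previous instruction stored from this register
                and out[-1][2] == instr[1]):    # into the location we would now reload
            continue  # the value is already in the register; skip the reload
        out.append(instr)
    return out

code = [
    ("MOV", "Q", "R0"),
    ("ADD", "R", "R0"),
    ("MOV", "R0", "P"),
    ("MOV", "P", "R0"),   # redundant: P was just stored from R0
    ("ADD", "T", "R0"),
    ("MOV", "R0", "S"),
]
print(remove_redundant_loads(code))
```

Applied to the six-instruction sequence above, the pass removes exactly the fourth statement.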
5. Register allocation
Registers can be accessed faster than memory, and instructions involving register
operands are shorter and faster than those involving memory operands. The following
sub-problems arise when we use registers:
During register allocation, we select the set of variables that will reside
in registers at each point in the program.
During a subsequent register assignment phase, the specific register in which each
variable resides is picked.
To understand the concept, consider the following three-address code sequence:
t:=a+b
t:=t*c
t:=t/d
Their efficient machine code sequence is as follows:
MOV a,R0
ADD b,R0
MUL c,R0
DIV d,R0
MOV R0,t
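The sequence above can be produced by a very simple generator that keeps the running value in one register. The sketch below is an assumption about how such a generator might look; the tuple statement format is invented for illustration.

```python
# Naive code generation for straight-line three-address code that accumulates
# a value in one register (R0). Statements are assumed to be tuples of the
# form (dst, lhs, op, rhs), meaning dst := lhs op rhs.

OPS = {"+": "ADD", "*": "MUL", "/": "DIV"}

def gen(stmts):
    code = []
    acc = None  # name whose value currently lives in R0
    for dst, lhs, op, rhs in stmts:
        if lhs != acc:                       # load only when the value is not already in R0
            code.append(f"MOV {lhs}, R0")
        code.append(f"{OPS[op]} {rhs}, R0")
        acc = dst
    code.append(f"MOV R0, {acc}")            # store the final result
    return code

print(gen([("t", "a", "+", "b"), ("t", "t", "*", "c"), ("t", "t", "/", "d")]))
```

On the three statements above this reproduces the five-instruction machine code sequence shown, with no intermediate stores of t.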
6. Evaluation order
The code generator decides the order in which the instruction will be executed. The
order of computations affects the efficiency of the target code. Among many computational
orders, some will require only fewer registers to hold the intermediate results. However,
picking the best order in the general case is a difficult NP-complete problem.
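For expression trees, the register demand of the best evaluation order can be computed by Sethi-Ullman labeling, which also underlies the gencode algorithm discussed later in this unit. A minimal sketch, where the tuple tree encoding ("op", left, right) is an illustrative assumption:

```python
# Sethi-Ullman labeling for a binary expression tree: the label of the root is
# the minimum number of registers needed to evaluate the tree without stores,
# i.e. the cost of the best evaluation order.

def label(node, leftmost=True):
    if isinstance(node, str):        # leaf operand
        return 1 if leftmost else 0  # only a leftmost leaf needs its own load
    _, left, right = node
    l = label(left, True)
    r = label(right, False)
    return max(l, r) if l != r else l + 1

# a + b fits in one register; (a - b) + (c * d) needs two whatever the order.
print(label(("+", "a", "b")))                          # → 1
print(label(("+", ("-", "a", "b"), ("*", "c", "d"))))  # → 2
```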
Approaches to code generation issues: Code generator must always generate the correct
code. It is essential because of the number of special cases that a code generator might face.
Some of the design goals of code generator are:
Correct
Easily maintainable
Testable
Efficient
Disadvantages in the design of a code generator:
Limited flexibility: Code generators are typically designed to produce a specific type
of code, and as a result, they may not be flexible enough to handle a wide range of inputs or
generate code for different target platforms. This can limit the usefulness of the code generator
in certain situations.
Maintenance overhead: Code generators can add a significant maintenance overhead
to a project, as they need to be maintained and updated alongside the code they generate. This
can lead to additional complexity and potential errors.
Debugging difficulties: Debugging generated code can be more difficult than
debugging hand-written code, as the generated code may not always be easy to read or
understand. This can make it harder to identify and fix issues that arise during development.
Performance issues: Depending on the complexity of the code being generated, a code
generator may not be able to generate optimal code that is as performant as hand-written code.
This can be a concern in applications where performance is critical.
Learning curve: Code generators can have a steep learning curve, as they typically
require a deep understanding of the underlying code generation framework and the
programming languages being used. This can make it more difficult to onboard new developers
onto a project that uses a code generator.
Over-reliance: It's important to ensure that the use of a code generator doesn't lead to
over-reliance on generated code, to the point where developers are no longer able to write code
manually when necessary. This can limit the flexibility and creativity of a development team,
and may also result in lower quality code overall.
Machine dependent code generation involves transformations that take into account the
target machine's properties, such as its registers and special machine instruction sequences. It's
closely tied to the architecture and instruction set of the target processor.
Machine dependent code generation is performed after the intermediate code has been
generated, when the code is transformed according to the target machine architecture. It involves
CPU registers and may use absolute memory references rather than relative references.
Machine dependent code generation can provide significant performance gains because
the code is specifically designed to take advantage of the specific features of the hardware.
Object code refers to the output of the compilation process after the source code has been
translated into machine code by a compiler or an assembler. Object code comes in different
forms depending on the stage of compilation, the programming language, and the target
platform. Here are some common forms of object code:
Machine Code: This is the binary representation of instructions that can be executed
directly by the CPU. Machine code is specific to the target architecture and is usually represented
in hexadecimal or binary format.
Object Files: These are binary files containing machine code along with additional
information such as symbol tables, relocation information, and metadata. Object files are
produced by the compilation process and can be linked together to create executable programs or
shared libraries.
Executable Files: These are fully linked object files that are ready to be executed by the
operating system. Executable files contain machine code, as well as headers and metadata
required by the operating system to load and execute the program.
Shared Libraries: Also known as dynamic link libraries (DLLs) on Windows or shared
objects (SO) on Unix-like systems, shared libraries contain reusable code and data that can be
dynamically linked into multiple executable programs at runtime. Shared libraries are similar to
executable files but are designed to be loaded into memory and shared among multiple
processes.
Each form of object code serves a specific purpose in the software development process,
from compilation and linking to program execution. The choice of object code form depends on
factors such as performance requirements, portability, and development workflow.
One approach to register allocation and assignment is to assign specific values in the
target program to certain registers. For example, we could decide to assign base addresses to one
group of registers, arithmetic computations to another, the top of the stack to fixed register, and
so on.
This approach has the advantage that it simplifies the design of a code generator. Its
disadvantage is that, applied too strictly, it uses registers inefficiently; certain registers may go
unused over substantial portions of code, while unnecessary loads and stores are generated into
the other registers. Nevertheless, it is reasonable in most computing environments to reserve a
few registers for base registers, stack pointers, and the like, and to allow the remaining registers
to be used by the code generator as it sees fit.
A primary task of the compiler is register allocation for the variables. The number of
registers available in any hardware architecture is very small compared to the number of
variables defined in a particular piece of a program. The getreg algorithm is simple but not
optimal, as it keeps all live variables in registers until the end of a block. The register
allocation problem is NP-complete. An alternative is global register allocation, which
assigns variables to the limited number of available registers and attempts to keep these
assignments consistent across basic block boundaries.
A key problem in code generation is deciding what values to hold in what registers.
Registers are the fastest computational unit on the target machine, but we usually do not have
enough of them to hold all values. Values not held in registers need to reside in memory.
Instructions involving register operands are invariably shorter and faster than those involving
operands in memory, so efficient utilization of registers is particularly important.
The use of registers is often subdivided into two sub problems
1. Register allocation, during which we select the set of variables that will reside in
registers at each point in the program.
2. Register assignment, during which we pick the specific register that a variable will
reside in.
Finding an optimal assignment of registers to variables is difficult, even with single-
register machines. Mathematically, the problem is NP-complete. The problem is further
complicated because the hardware and/or the operating system of the target machine may require
that certain register-usage conventions be observed.
a = b + c
d = a
c = a + d
Variables stored in Main Memory (offsets from the frame pointer fp):
a : 2(fp)    b : 4(fp)    c : 6(fp)    d : 8(fp)
Machine Level Instructions:
a=b+c
d=e+f
d=d+e
IFZ a goto L0
b=a+d
goto L1
L0 : b = a - d
L1 : i = b
Control Flow Graph:
At any point in time, the maximum number of live variables is 4 in this example. Thus we
require at most 4 registers for register allocation. If we draw a horizontal line at any
point in the live-range diagram, we can see that we need exactly 4 registers to perform
the operations in the program.
Spilling:
Sometimes the required number of registers may not be available. In such cases we may
need to move some variables to and from RAM; this is known as spilling.
Disadvantages:
The linear scan algorithm does not take into account the "lifetime holes" of a variable.
Variables are not live throughout the program, and this algorithm fails to record the holes
in the live range of a variable.
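This weakness comes from linear scan treating each live range as one contiguous interval. A minimal sketch, with illustrative interval values and a simplified spill policy (a variable that finds no free register is simply spilled):

```python
# Linear scan register allocation over live intervals {var: (start, end)}.
# Each variable occupies one contiguous interval, which is exactly why
# lifetime holes are missed: a variable dead in the middle of its range
# still holds its register. Interval values below are illustrative.

def linear_scan(intervals, k):
    order = sorted(intervals, key=lambda v: intervals[v][0])  # by start point
    active = []                       # (end, var, reg) currently holding a register
    free = [f"R{i}" for i in range(k)]
    assignment, spills = {}, []
    for v in order:
        start, end = intervals[v]
        for e, u, reg in list(active):       # expire intervals that already ended
            if e < start:
                active.remove((e, u, reg))
                free.append(reg)
        if free:
            reg = free.pop()
            assignment[v] = reg
            active.append((end, v, reg))
        else:
            spills.append(v)                 # no register free: spill (simplified)
    return assignment, spills

intervals = {"a": (0, 3), "b": (0, 4), "d": (1, 3), "e": (1, 2)}
print(linear_scan(intervals, 3))
```

With three registers one of the four overlapping intervals must be spilled; with four, none.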
Graph Coloring (Chaitin's Algorithm):
Register allocation is interpreted as a graph coloring problem.
Nodes represent live ranges of variables.
Edges represent interference between two live ranges.
Colors are assigned to the nodes such that no two adjacent nodes have the same color.
The number of colors used represents the minimum number of registers required.
A k-coloring of the graph is mapped to k registers.
Steps:
1. Choose an arbitrary node of degree less than k.
2. Push that node onto the stack and remove all of its edges.
3. Check whether every remaining node has degree less than k; if YES go to step 4, else go to #.
4. Repeat steps 1 and 2 until no nodes remain in the graph.
5. When the graph is empty, POP each node from the stack and color it such that no two
adjacent nodes have the same color.
6. The number of colors assigned to the nodes is the minimum number of registers needed.
# Spill some nodes based on their live ranges and then try again with the same k value. If the
problem persists, the assumed k value cannot be the minimum number of registers; try
increasing k by 1 and repeat the whole procedure.
For the same instructions mentioned above, the graph coloring will be as follows,
assuming k = 4.
Note: any color (register) can be assigned to 'i', as it has no edge to any other node.
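The steps above can be sketched as follows. The interference graph used here is an illustrative assumption modeled on the a/b/d/i variables of the earlier example, not the exact graph drawn in the notes.

```python
# Chaitin-style graph coloring sketch: repeatedly remove a node of degree < k
# and push it onto a stack; then pop nodes and assign each a color unused by
# its neighbours. Returns None when a spill would be required.

def color_graph(graph, k):
    g = {n: set(adj) for n, adj in graph.items()}   # working copy
    stack = []
    while g:
        node = next((n for n in g if len(g[n]) < k), None)
        if node is None:
            return None                 # every node has degree >= k: spill and retry
        stack.append(node)
        for m in g.pop(node):           # remove the node and its edges
            g[m].discard(node)
    colors = {}
    while stack:
        n = stack.pop()
        used = {colors[m] for m in graph[n] if m in colors}
        colors[n] = next(c for c in range(k) if c not in used)
    return colors

interference = {
    "a": {"b", "d"}, "b": {"a", "d"}, "d": {"a", "b"},
    "i": set(),                          # 'i' interferes with nothing
}
print(color_graph(interference, 3))
```

Since 'i' has no edges, any of the k colors works for it, matching the note above.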
One of the ideas behind DAG construction is code generation. The steps involved
include reordering the instructions, labeling the nodes with the number of registers required,
and using this information to generate target assembly language code.
The code generation algorithm uses a recursive procedure on a labeled DAG and
generates code based on the labels assigned to the nodes. It uses two stacks: a register stack,
"rstack", and a memory stack, "mstack". The stack rstack is used to allocate registers;
initially it contains all available registers. The algorithm retains the registers on rstack in the
same order it found them. The typical stack functions push() and pop() are used to
rearrange rstack, and in addition the algorithm uses a swap(rstack) function to interchange
the top two registers on rstack.
The algorithm, considers five different cases to generate code. They are discussed as
follows:
Case 0: This is the simple, terminating case of the recursive procedure. If n is a leaf
and the leftmost child of its parent, we generate just a load instruction.
Case 1: This is the situation when the right child is a leaf and the left child may be a
sub-tree. In this case, we generate code to evaluate n1 into register R = top(rstack),
followed by the instruction "op name, R".
Case 2: The right sub-tree requires more registers than the left sub-tree: n1 can be
evaluated without stores, but n2 is harder to evaluate than n1 because it requires more
registers. For this case, we swap the top two registers on rstack, then evaluate n2 into
R = top(rstack). We remove R from rstack and evaluate n1 into S = top(rstack).
Then we generate the instruction "op R, S", which produces the value of n in register S.
A final call to swap restores rstack to its original order.
Case 3: It is similar to case 2 except that here the left sub-tree is harder and is evaluated
first. There is no need to swap registers here.
Case 4: It occurs when both sub-trees require r or more registers to evaluate without
stores. Since we must use a temporary memory location, we first evaluate the right sub-
tree into the temporary T, then the left sub-tree, and finally the root. All these cases are
discussed in Algorithm 32.1, which generates code from the DAG; the algorithm is named
gencode(n), where n is the root of the DAG passed as argument.
Procedure gencode(n);
Begin
/* case 0 */
if n is a left leaf representing operand name and n is the leftmost child of its
parent then print 'MOV' || name || ',' || top(rstack)
else if n is an interior node with operator op, left child n1, and right child n2
then /* case 1 */
if label(n2) = 0 then begin
let name be the operand represented by n2;
gencode(n1);
print op || name || ',' || top(rstack)
end
/* case 2 */
else if 1 ≤ label(n1) < label(n2) and label(n1) < r then begin
swap(rstack);
gencode(n2);
R := pop(rstack); /* n2 was evaluated into register R */
gencode(n1);
print op || R || ',' || top(rstack);
push(rstack,R);
swap(rstack)
end
/* case 3 */
else if 1 ≤ label(n2) ≤ label(n1) and label(n2) < r then begin
gencode(n1);
R := pop(rstack); /* n1 was evaluated into register R */
gencode(n2);
print op || top(rstack) || ',' || R;
push(rstack,R)
end
/* case 4: both sub-trees need r or more registers */
else begin
gencode(n2);
let T be a new temporary memory location;
print 'MOV' || top(rstack) || ',' || T;
gencode(n1);
print op || T || ',' || top(rstack)
end
End
Case 1 checks whether the right child of an interior node is a leaf. If it is, we call the
function recursively with the left child as the root, and conclude this case by generating
an "op" instruction.
Case 2 applies when the right sub-tree is the heavier one. If it is, we swap the register
stack so that the right sub-tree is evaluated into the register beneath the top register. We
then recursively call gencode() with the right sub-tree's node as root and remove that
register from rstack. We then call gencode() to evaluate the left sub-tree using the
top-of-stack register. After that we swap the rstack contents again to ensure the initial
rstack order is restored.
Case 3 is just the opposite of Case 2: since the left sub-tree is evaluated first here, the
rstack contents are used as they are, not swapped.
Case 4 is the situation when neither sub-tree can be evaluated in the registers available
on rstack. We fall back on a memory-based operation, evaluating one operand into a
memory temporary.
The nodes of the DAG are labeled with the number of registers each requires for
computation. Assume that there are two registers, R0 and R1, with R0 on top of the stack.
The algorithm gencode() is called with the root node t4, whose children are t1 and t3.
Since the label of t3 is greater, case 2 applies: gencode() is called recursively with t3
after swapping the register stack. t3's left child is a leaf node and hence falls under
case 0.
Case 0 generates a load instruction, and the value 'e' is loaded into R1. After this call
returns, control goes to the next step of the previous call, which removes register R1 from
rstack using pop(), and gencode() is called with the right sub-tree node, t2. This falls
under case 1, as the label of its right leaf is 0, so gencode() is again called with node 'c'.
This again falls under case 0, emits a load into R0, and returns. The next instruction is
the next step of case 1, where an operator instruction is issued, followed by the next step
of case 2, where a SUB instruction is issued and the register is pushed and then
swapped.
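The labeling and gencode procedure can be sketched in Python as follows. The expression tree, (a+b) - (e - (c+d)), is an assumed reconstruction of the t1..t4 example consistent with the walkthrough above, since the DAG figure itself is not reproduced in the notes; the register names and "OP src, dst" instruction syntax follow the earlier examples.

```python
# Sketch of the labeled-tree code generator (cases 0-4 above). Trees are
# ("OP", left, right) tuples or operand-name strings; an instruction
# "OP src, dst" computes dst := dst OP src, as in the notes.

def label(n, leftmost=True):
    """Sethi-Ullman label: registers needed to evaluate n without stores."""
    if isinstance(n, str):
        return 1 if leftmost else 0
    l, r = label(n[1], True), label(n[2], False)
    return max(l, r) if l != r else l + 1

def gencode(n, rstack, code):
    """Emit code for n, leaving its value in the top register rstack[-1]."""
    if isinstance(n, str):                        # case 0: leftmost leaf
        code.append(f"MOV {n}, {rstack[-1]}")
        return
    op, n1, n2 = n
    l1, l2 = label(n1, True), label(n2, False)
    if l2 == 0:                                   # case 1: right child is a leaf
        gencode(n1, rstack, code)
        code.append(f"{op} {n2}, {rstack[-1]}")
    elif 1 <= l1 < l2 and l1 < len(rstack):       # case 2: right sub-tree harder
        rstack[-1], rstack[-2] = rstack[-2], rstack[-1]  # swap top two registers
        gencode(n2, rstack, code)
        R = rstack.pop()                          # n2 was evaluated into R
        gencode(n1, rstack, code)
        code.append(f"{op} {R}, {rstack[-1]}")    # value of n now in rstack[-1]
        rstack.append(R)
        rstack[-1], rstack[-2] = rstack[-2], rstack[-1]  # swap back
    elif 1 <= l2 <= l1 and l2 < len(rstack):      # case 3: left sub-tree harder
        gencode(n1, rstack, code)
        R = rstack.pop()                          # n1 was evaluated into R
        gencode(n2, rstack, code)
        code.append(f"{op} {rstack[-1]}, {R}")    # value of n now in R
        rstack.append(R)
    else:                                         # case 4: spill to memory temp
        gencode(n2, rstack, code)
        code.append(f"MOV {rstack[-1]}, T")       # single temporary T; a real
        gencode(n1, rstack, code)                 # implementation needs fresh names
        code.append(f"{op} T, {rstack[-1]}")

# t1 = a + b, t2 = c + d, t3 = e - t2, t4 = t1 - t3; R0 on top of the stack.
t4 = ("SUB", ("ADD", "a", "b"), ("SUB", "e", ("ADD", "c", "d")))
code = []
gencode(t4, ["R1", "R0"], code)
print("\n".join(code))
```

Running this on the assumed tree reproduces the order described in the walkthrough: 'e' is loaded into R1 first (case 0 inside case 2), then 'c' into R0, followed by the ADD and the two SUB instructions, with the final result in R0.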