
Unit-5

Runtime Environments, Stack Allocation of Space, Access to Non-Local Data on the Stack,
Heap Management, Code Generation: Issues in the Design of a Code Generator, The Target
Language, Addresses in the Target Code, Basic Blocks and Flow Graphs, A Simple Code
Generator.
--------------------------------------------------------------------------------------------------------
Symbol tables: Use and need of symbol tables
SYMBOL TABLES

• A symbol table is a data structure meant to collect information about the names appearing in the
source program.
• It keeps track of the scope/binding information about names.
• Each entry in the symbol table is a pair of the form (name, information).
• The information consists of attributes (e.g. type, location) depending on the language.
• Whenever a name is encountered, it is checked in the symbol table to see if it already occurs. If not, a
new entry is created.
• In some cases, the symbol table record is created by the lexical analyzer as soon as the name is
encountered in the input, and the attributes of the name are entered when the declarations are
processed.
• If the same name can be used to denote different program elements in the same block, the symbol
table record is created only when the name’s syntactic role is discovered.

Operations on a Symbol Table

• Determine whether a given name is in the table
• Add a new name to the table
• Access information associated with a given name
• Add new information for a given name
• Delete a name (or a group of names) from the table
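The five operations above can be sketched with a small dictionary-based table. This is a minimal illustration, not a production design; the attribute names (type, location, scope) are invented examples.

```python
# A minimal symbol-table sketch backed by a Python dict.
class SymbolTable:
    def __init__(self):
        self.entries = {}                  # name -> attribute record

    def lookup(self, name):
        # determine whether a given name is in the table (None if absent)
        return self.entries.get(name)

    def insert(self, name, **attrs):
        # add a new name to the table, with its attributes
        self.entries[name] = dict(attrs)

    def set_attr(self, name, key, value):
        # add new information for a given name
        self.entries[name][key] = value

    def delete(self, name):
        # delete a name from the table
        self.entries.pop(name, None)

table = SymbolTable()
table.insert("count", type="int", location=0)
table.set_attr("count", "scope", "local")
print(table.lookup("count"))   # {'type': 'int', 'location': 0, 'scope': 'local'}
table.delete("count")
print(table.lookup("count"))   # None
```

A real compiler would add scope handling (e.g. a stack of such tables) on top of these primitives.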
Implementation
• Each entry in a symbol table can be implemented as a record that consists of several fields.
• The entries in symbol table records are not uniform and depend on the program element
identified by the name.
• Some information about the name may be kept outside the symbol table record, and/or some
fields of the record may be left vacant for reasons of uniformity. A pointer to this information
may be stored in the record.
• The name may be stored in the symbol table record itself, or it can be stored in a separate array
of characters and a pointer to it in the symbol table.
• The information about runtime storage location, to be used at the time of code generation, is kept
in the symbol table.

• There are various approaches to symbol table organization e.g. Linear List.

Data Structure for Symbol table

Linear List
• It is the simplest approach to symbol table organization.
• New names are added to the table in the order they arrive.
• A name is searched for linearly.
• The average number of comparisons required is proportional to 0.5*(n+1), where n = number of entries
in the table.
• It takes less space but more access time.

Search Tree
• It is more efficient than a linear list.
• Each record has two links, left and right, which point to other records in the search tree.
• A new name is added at the proper location in the tree such that the names can be accessed
alphabetically.
• For any node name1 in the tree, all names accessible by following the left link precede name1
alphabetically.
• Similarly, for any node name1 in the tree, all names accessible by following the right link
succeed name1 alphabetically.
• The expected time for adding n names and making m inquiries is proportional to (m+n) log2 n.

Hash Table
• A hash table is a table of k pointers, indexed 0 to k-1, that point to lists of records within
the symbol table.
• To search for a name, we compute its hash value by applying a suitable hash function.
• The hash function maps the name into an integer value between 0 and k-1, which is used as an index in
the hash table to search the list of table records built on that hash index.
• To add a non-existent name, we create a record for that name and insert it at the head of the list.
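The chained hash table just described can be sketched as follows. The hash function and the table size k = 8 are arbitrary choices for illustration.

```python
# Sketch of a k-pointer hash table with chaining.
K = 8

def hash_name(name):
    # map the name to an integer in 0..K-1 (a toy hash function)
    return sum(ord(c) for c in name) % K

buckets = [[] for _ in range(K)]       # each slot heads a list of records

def insert(name, info):
    # add a non-existent name at the head of its bucket's list
    buckets[hash_name(name)].insert(0, (name, info))

def search(name):
    # scan only the list built on this name's hash index
    for entry_name, info in buckets[hash_name(name)]:
        if entry_name == name:
            return info
    return None

insert("i", "int, offset 0")
insert("prod", "int, offset 4")
print(search("prod"))   # int, offset 4
print(search("t1"))     # None
```

Note that "prod" and "t1" happen to hash to the same bucket here; chaining resolves the collision by comparing names within the list.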

Runtime Environment: Storage Organization

RUNTIME ENVIRONMENT

• A program as source code is merely a collection of text (code, statements, etc.); to come
alive, it requires actions to be performed on the target machine.
• A program needs memory resources to execute its instructions. A program contains names for
procedures, identifiers, etc. that require mapping to actual memory locations at runtime.
By runtime, we mean a program in execution.
• The runtime environment is the state of the target machine, which may include software libraries,
environment variables, etc., that provides services to the processes running in the system.
The runtime support system is a package, mostly generated with the executable program itself,
that facilitates communication between the process and the runtime environment.
• It takes care of memory allocation and de-allocation while the program is being executed.

Activation Trees
• A program is a sequence of instructions combined into a number of procedures. Instructions
in a procedure are executed sequentially.
• A procedure has a start and an end delimiter, and everything inside it is called the body of
the procedure. The procedure identifier and the sequence of finite instructions inside it make
up the body of the procedure. The execution of a procedure is called its activation.
• An activation record contains all the necessary information required to call a procedure. An
activation record may contain the following units, depending upon the source language used:
– Temporaries: store temporary and intermediate values of an expression.
– Local Data: stores local data of the called procedure.
– Machine Status: stores the machine status, such as registers and the program counter, before the procedure is called.
– Control Link: stores the address of the activation record of the caller procedure.
– Access Link: stores information about data outside the local scope.

Procedure Activations
• Each execution of a procedure is called an activation of that procedure.
• An execution of a procedure starts at the beginning of the procedure body.
• When the procedure completes, it returns control to the point immediately after the
place where that procedure was called.
• The lifetime of an activation of a procedure is the sequence of steps between the first and the
last steps in the execution of that procedure (including the other procedures called by that
procedure).
• If a and b are procedure activations, then their lifetimes are either non-overlapping or
nested.
• If a procedure is recursive, a new activation can begin before an earlier activation of the
same procedure has ended.

Control Stack
• The flow of the control in a program corresponds to a depth-first traversal of the
activation tree that:
starts at the root,
visits a node before its children, and
recursively visits children at each node in a left-to-right order.
• A stack (called control stack) can be used to keep track of live procedure activations.
An activation record is pushed onto the control stack as the activation starts.
That activation record is popped when that activation ends.
• When node n is at the top of the control stack, the stack contains the nodes along the path from n to the
root.
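The push/pop discipline above can be traced with a short simulation. The call sequence (main calls p, p calls q and then r) is invented for illustration.

```python
# Simulating the control stack for a hypothetical call sequence.
control_stack = []

def enter(procedure):
    # activation starts: push its record onto the control stack
    control_stack.append(procedure)
    print("enter", procedure, "stack =", control_stack)

def leave():
    # activation ends: pop its record off the control stack
    procedure = control_stack.pop()
    print("leave", procedure, "stack =", control_stack)

enter("main")
enter("p")
enter("q")    # stack now holds the path main -> p -> q, root to top
leave()
enter("r")
leave()
leave()
leave()
```

When q is on top, the stack holds exactly the nodes on the path from q back to the root of the activation tree, as the text states.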
Variable Scopes
• The same variable name can be used in the different parts of the program.
• The scope rules of the language determine which declaration of a name applies when the name
appears in the program.
• An occurrence of a variable (a name) is:
– local: if that occurrence is in the same procedure in which that name is declared.
– non-local: otherwise (i.e., it is declared outside of that procedure).

STORAGE ORGANIZATION

Fixed-size objects can be placed in predefined locations

Figure 5.1: Storage organization

Run-time stack and heap
The STACK is used to store:
• Procedure activations.
• The status of the machine just before calling a procedure, so that the status can be restored when the
called procedure returns.
The HEAP stores data allocated under program control (e.g. by malloc() in C).
Activation Records
• Information needed by a single execution of a procedure is managed using a contiguous
block of storage called an activation record.
• An activation record is allocated when a procedure is entered, and it is de-allocated when
that procedure exits.
• The size of each field can be determined at compile time (although the actual location of the
activation record is determined at run time).
• An exception arises if the procedure has a local array whose size depends on a value known
only at run time.
Stack allocation: Activation trees, Activation records, Calling sequences, Variable-
length data on the stack
STATIC ALLOCATION

Statically allocated names are bound to storage at compile time. Storage bindings of statically
allocated names never change, so even if a name is local to a procedure, it is always bound to
the same storage. The compiler uses the type of a name (retrieved from the symbol table) to determine
the storage size required. The required number of bytes (possibly aligned) is set aside for the name.
The address of the storage is fixed at compile time.

Limitations:
• The size required must be known at compile time.
• Recursive procedures cannot be implemented as all locals are statically allocated.
• No data structure can be created dynamically as all data is static.

Stack-dynamic allocation
• Storage is organized as a stack.
• Activation records are pushed and popped.
• Locals and parameters are contained in the activation records for the call.
• This means locals are bound to fresh storage on every call.
• If we have a stack growing downwards, we just need a stack_top pointer.
• To allocate a new activation record, we just increase stack_top.
• To deallocate an existing activation record, we just decrease stack_top.
Address generation in stack allocation
The position of an activation record on the stack cannot be determined statically.
Therefore, the compiler must generate addresses RELATIVE to the activation record. If we have a
downward-growing stack and a stack_top pointer, we generate addresses of the form stack_top + offset.
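The stack_top + offset scheme can be sketched as follows. The base address, frame size, and local-variable offsets are invented values for illustration.

```python
# Sketch of relative addressing: the compiler fixes each local's offset
# at compile time; the address is stack_top + offset at run time.
stack_top = 1000                   # current activation record base (assumed)

# compile-time offsets of locals within the activation record
offsets = {"x": 0, "y": 4, "temp": 8}

def address_of(name):
    # the compiler emits "stack_top + offset"; here we just evaluate it
    return stack_top + offsets[name]

print(address_of("y"))    # 1004

# a new activation moves stack_top; the same offsets yield new addresses
stack_top += 12
print(address_of("y"))    # 1016
```

This is why locals are bound to fresh storage on every call: the offsets never change, but stack_top does.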

Access to Nonlocal Names: Data access without procedures, issues with nested
procedures, nesting depth

How does the code find non-local data at run time?
• Globals: visible everywhere; a naming convention gives an address; initialization requires cooperation.
• Lexical nesting: view variables as (level, offset) pairs at compile time; a chain of non-local access
links makes access more expensive.
Two important problems arise:
• How do we map a name into a (level, offset) pair? Use a block-structured symbol table: look up a
name to find its most recent declaration, which may be at the current level or any enclosing level.
• Given a (level, offset) pair, what is the address? Two classic approaches: access links (also called
static links) and displays.

Access links, Manipulating access links, Displays


To find the value specified by (l, o), we need the current procedure’s level k:
• k = l ⇒ the value is local.
• k > l ⇒ find level l’s activation record by following access links.
• k < l cannot occur.
Maintaining access links (static links):
• When calling a procedure at level k + 1: pass my FP as the access link; my backward chain will
work for lower levels.
• When calling a procedure at level l < k: find the link to level l − 1 and pass it; its access link will
work for lower levels.
The idea behind the use of access links is as follows:
• Add a new field to each AR -- the access link field.
• If P is (lexically) nested inside Q, then at runtime, P's AR's access link will point to the access link
field in the AR of the most recent activation of Q.
• Therefore, at runtime, access links will form a chain corresponding to the nesting of sub-programs.
For each use of a non-local x:
• At compile time, use the "Level Number" attribute of x and the "Current Level Number" (of the
sub-program that is accessing x) to determine how many links of the chain to follow at runtime.
• If P at level i uses variable x, declared at level j, follow i-j links, then use x's "Offset" attribute to
find x's storage space inside the AR.
How to set up access links:
• The link is set up by the calling procedure.
• How to set up the link depends on the relative nesting levels of the calling and called
procedures.
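The "follow i − j links" rule can be sketched with activation records represented as dicts. The nesting (Q at level 1, P at level 2, R at level 3) and the variable x are invented for illustration.

```python
# Sketch of access-link traversal. Each activation record holds an
# 'access_link' to the record of its closest lexically enclosing
# procedure's most recent activation.

# level 1: procedure Q, declares x
ar_q = {"access_link": None, "locals": {"x": 42}}
# level 2: procedure P, nested in Q
ar_p = {"access_link": ar_q, "locals": {}}
# level 3: procedure R, nested in P
ar_r = {"access_link": ar_p, "locals": {}}

def fetch(ar, use_level, decl_level, name):
    # follow (use_level - decl_level) access links, then read the variable
    for _ in range(use_level - decl_level):
        ar = ar["access_link"]
    return ar["locals"][name]

# R (level 3) uses x declared in Q (level 1): follow 3 - 1 = 2 links
print(fetch(ar_r, 3, 1, "x"))   # 42
```

The number of links to follow is a compile-time constant (the level difference); only the pointer chasing happens at run time.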

Heap management: The Memory Manager, Heap Allocation, Hierarchical
organization of memory, Optimization in memory usage
Heap Allocation
Variables local to a procedure are allocated and de-allocated only at runtime. Heap allocation is
used to dynamically allocate memory to variables and to reclaim it when the variables are no
longer required.

Apart from the statically allocated memory area, both stack and heap memory can grow and shrink
dynamically and unexpectedly. Therefore, they cannot be provided with a fixed amount of
memory in the system.

Figure 5.2: Heap Allocation

As shown in the figure above, the text part of the code is allocated a fixed amount of memory.
Stack and heap memory are arranged at the extremes of total memory allocated to the program.
Both shrink and grow against each other.

Implicit and explicit memory allocation request, Memory Allocation strategies

Storage Allocation
The runtime environment manages runtime memory requirements for the following entities:
• Code: known as the text part of a program, it does not change at runtime. Its
memory requirements are known at compile time.
• Procedures: their text part is static, but they are called in a random order. That is
why stack storage is used to manage procedure calls and activations.
• Variables: variables are known only at runtime, unless they are global or constant.
The heap memory allocation scheme is used for managing allocation and de-allocation of
memory for variables at runtime.

Static Allocation
In this allocation scheme, the compilation data is bound to a fixed location in the memory and it
does not change when the program executes. As the memory requirement and locations are known
in advance, runtime support package for memory allocation and de-allocation is not required.

Stack Allocation
Procedure calls and their activations are managed by means of stack memory allocation. It works
in last-in-first-out (LIFO) method and this allocation strategy is very useful for recursive
procedure calls.
Heap Allocation
Variables local to a procedure are allocated and de-allocated only at runtime. Heap allocation is
used to dynamically allocate memory to the variables and claim it back when the variables are no
more required.
Apart from the statically allocated memory area, both stack and heap memory can grow and shrink
dynamically and unexpectedly. Therefore, they cannot be provided with a fixed amount of
memory in the system.

Parameter Passing Mechanisms

Parameter Passing
The communication medium among procedures is known as parameter passing. The values of the
variables from a calling procedure are transferred to the called procedure by some mechanism.
Before moving ahead, first go through some basic terminologies pertaining to the values in a
program.

r-value
The value of an expression is called its r-value. The value contained in a single variable also
becomes an r-value if it appears on the right-hand side of the assignment operator. r-values can
always be assigned to some other variable.

l-value
The memory location (address) where an expression is stored is known as the l-value of that
expression. It always appears on the left-hand side of an assignment operator.
For example:

day = 1;

week = day * 7;

month = 1;

year = month * 12;

From this example, we understand that constant values like 1, 7, 12, and variables like day, week,
month, and year, all have r-values. Only variables have l-values, as they also represent the
memory location assigned to them.

For example:

7 = x + y;

is an l-value error, as the constant 7 does not represent any memory location.

Formal Parameters

Variables that take the information passed by the caller procedure are called formal parameters.
These variables are declared in the definition of the called function.
Actual Parameters

Variables whose values or addresses are being passed to the called procedure are called actual
parameters. These variables are specified in the function call as arguments.

Example:

fun_one()
{
   int actual_parameter = 10;
   call fun_two(actual_parameter);
}
fun_two(int formal_parameter)
{
   print formal_parameter;
}

Formal parameters hold the information of the actual parameter, depending upon the parameter
passing technique used. It may be a value or an address.

Pass by Value
In pass by value mechanism, the calling procedure passes the r-value of actual parameters and the
compiler puts that into the called procedure’s activation record. Formal parameters then hold the
values passed by the calling procedure. If the values held by the formal parameters are changed,
it should have no impact on the actual parameters.
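The value-copy semantics just described can be demonstrated directly (Python's immutable integers behave like pass-by-value for this purpose; the names are invented):

```python
# Pass-by-value sketch: the formal parameter receives a copy of the
# actual parameter's r-value, so changes inside the callee do not
# affect the caller's variable.
def callee(formal):
    formal = 99          # changes only the local copy in the callee's
    return formal        # activation record

actual = 10
callee(actual)
print(actual)            # 10: the actual parameter is unchanged
```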

Pass by Reference
In pass by reference mechanism, the l-value of the actual parameter is copied to the activation
record of the called procedure. This way, the called procedure now has the address (memory
location) of the actual parameter and the formal parameter refers to the same memory location.
Therefore, if the value pointed by the formal parameter is changed, the impact should be seen on
the actual parameter, as they should also point to the same value.

Pass by Copy-restore
This parameter passing mechanism works similar to ‘pass-by-reference’ except that the changes
to actual parameters are made when the called procedure ends. Upon function call, the values of
actual parameters are copied in the activation record of the called procedure. Formal parameters,
if manipulated, have no real-time effect on actual parameters (as l-
values are passed), but when the called procedure ends, the l-values of formal parameters are
copied to the l-values of actual parameters.

Example:

int y;
calling_procedure()
{
   y = 10;
   copy_restore(y);   // value and l-value of y are passed
   print y;           // prints 99
}
copy_restore(int x)
{
   x = 99;            // y still has value 10 (unaffected)
   y = 0;             // y is now 0
}

When this function ends, the value of the formal parameter x is copied back to the actual
parameter y. Even though the value of y is changed before the procedure ends, the final value
of x is copied to the l-value of y, making it behave like call by reference.

Pass by Name
Languages like Algol provide a new kind of parameter passing mechanism that works like
preprocessor in C language. In pass by name mechanism, the name of the procedure being called
is replaced by its actual body. Pass-by-name textually substitutes the argument expressions in a
procedure call for the corresponding parameters in the body of the procedure so that it can now
work on actual parameters, much like pass-by-reference.

Introduction to garbage collection: Design goals for garbage collectors,
reachability
• So far, in C0 we have had only primitives for allocation of memory on the heap.
• Memory was never freed. In C, the free function accomplishes this, but it is very error-prone.
• Memory that is no longer needed may not be freed (a leak), or, worse, memory may be freed
that will be referenced later in the computation.
• In type-safe languages this can be avoided by using garbage collection, which automatically
reclaims storage that can no longer be referenced.
• Since it is undecidable whether memory might still be referenced, a garbage collector uses a
conservative approximation, and different techniques may approximate in different ways.
There are three basic garbage collection techniques.

Reference counting garbage collectors


• Reference Counting. Each heap object maintains an additional field containing the number of
references to the object.
• The compiler must generate code that maintains this reference count correctly. When the count
reaches 0, the object is deallocated, possibly triggering the reduction of other reference counts.
• Reference counts are hard to maintain, especially in the presence of optimizations. The other
problem is that reference counting does not work well for circular data structures because
reference counts in a cycle can remain positive even though the structure is unreachable.
• Nevertheless, reference counting appears to remain popular for scripting languages like Perl,
PHP, or Python. Another use of reference counting is in part of an operating system where we
know that no circularities can arise.
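The counting scheme described above can be sketched as follows. The object graph (a pointing to b) and field names are invented for illustration; a real collector would free actual memory rather than set a flag.

```python
# Minimal reference-counting sketch. Each object carries a count of
# incoming references; dropping the last one "frees" the object and
# recursively decrements the counts of the objects it refers to.
class Obj:
    def __init__(self, name):
        self.name = name
        self.refcount = 0
        self.refs = []           # objects this object points to
        self.freed = False

def add_ref(obj):
    obj.refcount += 1

def drop_ref(obj):
    obj.refcount -= 1
    if obj.refcount == 0:        # no references left: deallocate,
        obj.freed = True         # possibly triggering further drops
        for child in obj.refs:
            drop_ref(child)

a, b = Obj("a"), Obj("b")
a.refs.append(b); add_ref(b)     # a -> b
add_ref(a)                       # a root reference to a
drop_ref(a)                      # dropping the root frees a, then b
print(a.freed, b.freed)          # True True
```

If b also pointed back to a, both counts would stay positive after the root drop — the circular-structure weakness noted above.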
Code generation: Issues, The target language

If source code can be translated directly into its target machine code, why do we need to
translate the source code first into an intermediate code, which is then translated into the target
code? Let us see the reasons why we need an intermediate code.

Figure 5.3: A compiler translates the source language to its target machine language
• If a compiler translates the source language to its target machine language without having the
option for generating intermediate code, then for each new machine, a full native compiler
is required.
• Intermediate code eliminates the need of a new full compiler for every unique machine by
keeping the analysis portion same for all the compilers.
• The second part of compiler, synthesis, is changed according to the target machine.
• It becomes easier to apply the source code modifications to improve code performance by
applying code optimization techniques on the intermediate code.

Intermediate Representation
Intermediate codes can be represented in a variety of ways and they have their own benefits.
• High-Level IR - A high-level intermediate code representation is very close to the source
language itself. It can be easily generated from the source code, and we can easily apply code
modifications to enhance performance. But it is less preferred for target machine optimization.
• Low-Level IR - This one is close to the target machine, which makes it suitable
for register and memory allocation, instruction set selection, etc. It is good for machine-dependent
optimizations.
Intermediate code can be either language-specific (e.g., Byte Code for Java) or language-
independent (three-address code).
Basic blocks & flow graphs: Basic blocks, Next-use information
BASIC BLOCKS AND FLOW GRAPHS
A graph representation of three-address statements, called a flow graph, is useful for
understanding code-generation algorithms, even if the graph is not explicitly constructed by a
code-generation algorithm. Nodes in the flow graph represent computations, and the edges
represent the flow of control. Flow graph of a program can be used as a vehicle to collect
information about the intermediate program. Some register-assignment algorithms use flow
graphs to find the inner loops where a program is expected to spend most of its time.

BASIC BLOCKS
A basic block is a sequence of consecutive statements in which flow of control enters at the
beginning and leaves at the end without halt or possibility of branching except at the end. The
following sequence of three-address statements forms a basic block:
t1 := a*a
t2 := a*b
t3 := 2*t2
t4 := t1+t3

t5 := b*b
t6 := t4+t5
A three-address statement x := y+z is said to define x and to use y and z. A name in a
basic block is said to be live at a given point if its value is used after that point in the
program, perhaps in another basic block.
The following algorithm can be used to partition a sequence of three-address
statements into basic blocks.
Algorithm 1: Partition into basic blocks.
Input: A sequence of three-address statements.
Output: A list of basic blocks with each three-address statement in exactly one block. Method:
1. We first determine the set of leaders, the first statements of basic blocks. The rules we use are
the following:
I) The first statement is a leader.
II) Any statement that is the target of a conditional or unconditional goto is a leader.
III) Any statement that immediately follows a goto or conditional goto statement is a leader.
2. For each leader, its basic block consists of the leader and all statements up to, but not including,
the next leader or the end of the program.
Let us apply Algorithm 1 to the three-address code below to determine its basic blocks. Statement
(1) is a leader by rule (I), and statement (3) is a leader by rule (II), since the last statement can
jump to it. By rule (III), the statement following (12) is a leader. Therefore, statements (1) and (2)
form a basic block. The remainder of the program, beginning with statement (3), forms a second
basic block.
(1) prod := 0
(2) i := 1
(3) t1 := 4*i
(4) t2 := a [ t1 ]
(5) t3 := 4*i
(6) t4 :=b [ t3 ]
(7) t5 := t2*t4
(8) t6 := prod +t5
(9) prod := t6
(10) t7 := i+1
(11) i := t7
(12) if i<=20 goto (3)
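Algorithm 1 can be sketched in code. The statement list below encodes the example above as (text, jump_target) pairs, where jump_target is the 1-based number of the statement a goto transfers to (None for ordinary statements).

```python
# Find leaders, then split the statement list into basic blocks.
code = [
    ("prod := 0", None),        # (1)
    ("i := 1", None),           # (2)
    ("t1 := 4*i", None),        # (3)
    ("t2 := a[t1]", None),      # (4)
    ("t3 := 4*i", None),        # (5)
    ("t4 := b[t3]", None),      # (6)
    ("t5 := t2*t4", None),      # (7)
    ("t6 := prod+t5", None),    # (8)
    ("prod := t6", None),       # (9)
    ("t7 := i+1", None),        # (10)
    ("i := t7", None),          # (11)
    ("if i<=20 goto (3)", 3),   # (12)
]

def find_leaders(code):
    leaders = {1}                            # rule I: first statement
    for num, (_, target) in enumerate(code, start=1):
        if target is not None:
            leaders.add(target)              # rule II: jump target
            if num < len(code):
                leaders.add(num + 1)         # rule III: statement after a jump
    return sorted(leaders)

def basic_blocks(code):
    leaders = find_leaders(code)
    bounds = leaders + [len(code) + 1]
    return [list(range(bounds[i], bounds[i + 1]))
            for i in range(len(leaders))]

print(find_leaders(code))    # [1, 3]
print(basic_blocks(code))    # [[1, 2], [3, 4, 5, 6, 7, 8, 9, 10, 11, 12]]
```

The output matches the analysis in the text: statements (1)-(2) form one block and (3)-(12) form the second.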

TRANSFORMATIONS ON BASIC BLOCKS


• A basic block computes a set of expressions: the values of the names live on exit from the block.
• Two basic blocks are said to be equivalent if they compute the same set of
expressions.
• A number of transformations can be applied to a basic block without changing the
set of expressions computed by the block.
• Many of these transformations are useful for improving the quality of the code that will
ultimately be generated from a basic block. Two important classes of transformations can be
applied to basic blocks: the structure-preserving transformations and the algebraic
transformations.

Flow graphs, representation of flow graphs, Loops

Control Flow Graph


• Basic blocks in a program can be represented by means of control flow graphs.
• A control flow graph depicts how program control is passed among the blocks.
• It is a useful tool that helps in optimization by helping to locate any unwanted loops in the
program.

Figure 5.4: Control Flow Graph

Optimization of basic blocks

Basic Blocks
• Source code generally consists of a number of instructions that are always executed in
sequence; these are considered the basic blocks of the code.

• Basic blocks do not contain any jump statements within them, i.e., when the first
instruction is executed, all the instructions in the same basic block are executed in their
sequence of appearance without the program losing control flow.

• A program can have various constructs, like IF-THEN-ELSE and SWITCH-CASE conditional
statements and loops such as DO-WHILE, FOR, and REPEAT-UNTIL, whose branches delimit
basic blocks.
Basic block identification
We may use the following algorithm to find the basic blocks in a program:

• Search for the header statements of all the basic blocks, i.e., the statements where a basic block starts:

• First statement of a program.

• Statements that are target of any branch (conditional/unconditional).

• Statements that follow any branch statement.

• Header statements and the statements following them form a basic block.

• A basic block does not include any header statement of any other basic block.

Basic blocks are important concepts from both code generation and optimization point of view.

Basic blocks play an important role in identifying variables that are used more than
once in a single basic block. If any variable is used more than once, the register
allocated to that variable need not be emptied until the block finishes execution.

Simple code generator: Register and address descriptors, The code-generation
algorithm, Design of the function getReg

Code Generator
A code generator is expected to have an understanding of the target machine’s runtime
environment and its instruction set. The code generator should take the following things into
consideration to generate the code:
Target language : The code generator has to be aware of the nature of the target language for
which the code is to be transformed. That language may facilitate some machine-specific
instructions to help the compiler generate the code in a more convenient way. The target machine
can have either CISC or RISC processor architecture.

IR Type : Intermediate representation has various forms. It can be in Abstract Syntax Tree (AST)
structure, Reverse Polish Notation, or 3-address code.

Selection of instruction: The code generator takes Intermediate Representation as input and
converts (maps) it into target machine’s instruction set. One representation can have many ways
(instructions) to convert it, so it becomes the responsibility of the code generator to choose the
appropriate instructions wisely.

Register allocation : A program has a number of values to be maintained during the execution.
The target machine’s architecture may not allow all of the values to be kept in the CPU memory
or registers. Code generator decides what values to keep in the registers. Also, it decides the
registers to be used to keep these values.

Ordering of instructions : At last, the code generator decides the order in which the instruction
will be executed. It creates schedules for instructions to execute them.

Descriptors
The code generator has to track both the registers (for availability) and addresses (location of
values) while generating the code. For both of them, the following two descriptors are used:

Register descriptor : Register descriptor is used to inform the code generator about the
availability of registers. Register descriptor keeps track of values stored in each register.
Whenever a new register is required during code generation, this descriptor is consulted for
register availability.

Address descriptor : Values of the names (identifiers) used in the program might be stored at
different locations while in execution. Address descriptors are used to keep track of memory
locations where the values of identifiers are stored. These locations may include CPU registers,
heaps, stacks, memory or a combination of the mentioned locations.

The code generator keeps both descriptors updated in real time. For a load statement LD R1, x,
the code generator:

• updates the register descriptor of R1 to show that it holds the value of x, and
• updates the address descriptor of x to show that one instance of x is in R1.

Code Generation
Basic blocks comprise of a sequence of three-address instructions. Code generator takes these
sequence of instructions as input.

Note : If the value of a name is found at more than one place (register, cache, or memory), the
register’s value will be preferred over the cache and main memory. Likewise cache’s value will
be preferred over the main memory. Main memory is barely given any preference.

getReg : Code generator uses getReg function to determine the status of available registers and
the location of name values. getReg works as follows:

• If variable Y is already in register R, it uses that register.

• Else if some register R is available, it uses that register.

• Else if both the above options are not possible, it chooses a register that requires minimal
number of load and store instructions.

For an instruction x = y OP z, the code generator may perform the following actions. Let us
assume that L is the location (preferably register) where the output of y OP z is to be saved:

• Call the function getReg to determine the location L.

• Determine the present location (register or memory) of y by consulting the Address
Descriptor of y. If y is not presently in register L, then generate the following instruction
to copy the value of y to L:

MOV y’, L

where y’ represents the copied value of y.

• Determine the present location of z using the same method used in step 2 for y and
generate the following instruction:

OP z’, L

where z’ represents the copied value of z.

• Now L contains the value of y OP z, which is to be assigned to x. So, if L is a


register, update its descriptor to indicate that it contains the value of x, and update the
Address Descriptor of x to indicate that x is stored at location L.
• If y and z have no further uses, their registers can be returned to the system.
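The four actions can be sketched as follows; fixing R1 as the choice for L and emitting instructions as text are simplifying assumptions:

```python
# Illustrative sketch of code generation for "x = y OP z".
def gen_binary(x, y, z, op, address_desc, emitted):
    L = "R1"                              # step 1: pretend getReg chose R1
    if L not in address_desc.get(y, set()):
        emitted.append(f"MOV {y}, {L}")   # step 2: bring y into L if absent
    emitted.append(f"{op} {z}, {L}")      # step 3: L := y OP z
    address_desc[x] = {L}                 # step 4: x now lives in L
    return emitted

code = gen_binary("x", "y", "z", "ADD",
                  {"y": {"mem"}, "z": {"mem"}}, [])
print(code)                               # ['MOV y, R1', 'ADD z, R1']
```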

Peephole optimization

PEEPHOLE OPTIMIZATION

A statement-by-statement code-generation strategy often produces target code that


contains redundant instructions and suboptimal constructs. The quality of such target
code can be improved by applying “optimizing” transformations to the target program.

A simple but effective technique for improving the target code is peephole
optimization, a method that tries to improve the performance of the target program
by examining a short sequence of target instructions (called the peephole) and
replacing these instructions by a shorter or faster sequence, whenever possible.
The peephole is a small, moving window on the target program. The code in the peephole need
not be contiguous, although some implementations do require this. The following are
examples of program transformations that are characteristic of peephole optimizations:
• Redundant-instructions elimination
• Flow-of-control optimizations
• Algebraic simplifications
• Use of machine idioms

REDUNDANT LOADS AND STORES


If we see the instruction sequence
(1) MOV R0,a
(2) MOV a,R0
we can delete instruction (2), because whenever (2) is executed, (1) will have ensured that
the value of a is already in register R0. If (2) had a label, we could not be sure that (1)
was always executed immediately before (2), and so we could not remove (2).
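This transformation can be sketched as a one-instruction-lookback pass; treating any instruction that contains ':' as labeled is a simplifying assumption:

```python
# Illustrative sketch: delete "MOV a,R0" when it immediately follows
# "MOV R0,a" and carries no label (so control cannot enter between them).
def remove_redundant_moves(code):
    out = []
    for instr in code:
        if (out and ":" not in instr          # instr must be unlabeled
                and instr.startswith("MOV ")):
            src, dst = instr[4:].split(",")
            if out[-1] == f"MOV {dst},{src}": # store immediately before reload
                continue                      # drop the redundant reload
        out.append(instr)
    return out

print(remove_redundant_moves(["MOV R0,a", "MOV a,R0", "ADD b,R0"]))
# → ['MOV R0,a', 'ADD b,R0']
```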

UNREACHABLE CODE
Another opportunity for peephole optimization is the removal of unreachable instructions. An
unlabeled instruction immediately following an unconditional jump may be removed. This
operation can be repeated to eliminate a sequence of instructions. For example, for debugging
purposes, a large program may have within it certain segments that are executed only if a variable
debug is 1. In C, the source code might look like:
#define debug 0
….
if (debug) {
print debugging information
}
In the intermediate representation the if-statement may be translated as:
if debug = 1 goto L1
goto L2
L1: print debugging information
L2: …………………………(a)

One obvious peephole optimization is to eliminate jumps over jumps. Thus, no matter
what the value of debug, (a) can be replaced by:

if debug ≠ 1 goto L2
print debugging information
L2: ……………………………(b)
Since debug is set to 0 at the beginning of the program, constant propagation replaces (b) by:
if 0 ≠ 1 goto L2
print debugging information
L2: ……………………………(c)

As the argument of the first statement of (c) evaluates to the constant true, it can be replaced by
goto L2. Then all the statements that print debugging information are manifestly unreachable and
can be eliminated one at a time.
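The jump-over-jump rewrite can be sketched as a pattern match on three consecutive instructions; the `ifFalse` opcode standing for the negated condition is an assumed notation:

```python
# Illustrative sketch: rewrite "if cond goto L1 / goto L2 / L1: ..."
# as "ifFalse cond goto L2", so control falls through into L1's code.
def eliminate_jump_over_jump(code):
    out, i = [], 0
    while i < len(code):
        if (i + 2 < len(code)
                and code[i].startswith("if ") and " goto " in code[i]
                and code[i + 1].startswith("goto ")
                and code[i + 2].startswith(code[i].rsplit("goto ", 1)[1] + ":")):
            cond = code[i][3:].rsplit(" goto ", 1)[0]
            target = code[i + 1][5:]          # the label jumped over to
            out.append(f"ifFalse {cond} goto {target}")
            i += 2                            # keep L1: line; it is now fallen into
        else:
            out.append(code[i])
            i += 1
    return out

print(eliminate_jump_over_jump([
    "if debug = 1 goto L1",
    "goto L2",
    "L1: print debug info",
    "L2: nop",
]))
# → ['ifFalse debug = 1 goto L2', 'L1: print debug info', 'L2: nop']
```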

Register allocation and assignment: Global register allocation, usage counts


REGISTER ALLOCATION
Instructions involving register operands are usually shorter and faster than those involving
operands in memory. Therefore, efficient utilization of registers is particularly important in
generating good code. The use of registers is often subdivided into two subproblems:
• During register allocation, we select the set of variables that will reside in registers at a
point in the program.
• During a subsequent register assignment phase, we pick the specific register that a variable
will reside in.
Finding an optimal assignment of registers to variables is difficult, even with single-register
machines; mathematically, the problem is NP-complete. The problem is further
complicated because the hardware and/or the operating system of the target machine may
require that certain register-usage conventions be observed.
Certain machines require register pairs (an even and the next odd-numbered register) for some
operands and results. For example, in the IBM System/370 machines, integer multiplication and
integer division involve register pairs. The multiplication instruction is of the form M x, y, where
x, the multiplicand, is the even register of an even/odd register pair.
The multiplicand value is taken from the odd register of the pair. The multiplier y is a single
register. The product occupies the entire even/odd register pair.
The division instruction is of the form D x, y, where the 64-bit dividend occupies an even/odd
register pair whose even register is x; y represents the divisor. After division, the even register
holds the remainder and the odd register the quotient. Now consider two three-address code
sequences (a) and (b) in which the only difference is the operator in the second statement. The
shortest assembly sequences for (a) and (b) are given in (c). Ri stands for register i; L, ST and A
stand for load, store and add respectively.

Register assignment for outer loops, Register allocation by graph coloring

Register Allocation via Graph Coloring:

Once we have constructed the interference graph, we can pose the register allocation problem as
follows: construct an assignment of K colors (representing K registers) to the nodes of the graph
(representing variables) such that no two connected nodes have the same color.
If no such coloring exists, then we have to save some variables on the stack, which is called
spilling. Unfortunately, the problem of whether an arbitrary graph is K-colorable is NP-complete
for K ≥ 3. Chaitin [Cha82] proved that register allocation is also NP-complete, by showing
that for any graph G there exists some program which has G as its interference graph. In other
words, one cannot hope for a theoretically optimal and efficient register allocation algorithm that
works on all machine programs. Fortunately, in practice the situation is not so dire. One
particularly important intermediate form is static single assignment (SSA). Hack [Hac07]
observed that for programs in SSA form, the interference graph always has a specific form called
chordal. Coloring for chordal graphs can be accomplished in time O(|V| + |E|) and is quite
efficient in practice. Better yet, Pereira and Palsberg [PP05] noted that as many as 95% of the
programs occurring in practice have chordal interference graphs. Moreover, the algorithms
designed for chordal graphs behave well in practice even if the graph is not quite chordal.
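A greedy coloring of the interference graph can be sketched as follows; for chordal graphs, visiting nodes in a simplicial elimination order makes this greedy pass optimal, but here the node order is simply the one given:

```python
# Illustrative sketch: greedy coloring of an interference graph.
# Nodes are variables, an edge means "live at the same time", and the
# colors 0..K-1 stand for K registers. None means the node must be spilled.
def color_graph(edges, nodes, K):
    neighbors = {n: set() for n in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    coloring = {}
    for n in nodes:
        used = {coloring[m] for m in neighbors[n] if m in coloring}
        free = [c for c in range(K) if c not in used]
        coloring[n] = free[0] if free else None   # None → spill n to the stack
    return coloring

print(color_graph([("a", "b"), ("b", "c")], ["a", "b", "c"], 2))
# → {'a': 0, 'b': 1, 'c': 0}
```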
