13 CD - Runtime Env 3
Back-End Compiler Structure
There is a plausible seven-phase structure for a conventional compiler.
The first three phases (scanning, parsing, and semantic analysis) are language dependent.
The last two (target code generation and machine-specific code improvement) are machine
dependent.
The middle two (intermediate code generation and machine-independent code improvement)
are dependent on neither the language nor the machine.
The scanner and parser drive a set of action routines that build a syntax tree.
The semantic analyzer traverses the tree, performing all static semantic checks and initializing
various attributes of use to the back end.
Certain code improvements can be performed on syntax trees, but a less hierarchical
representation of the program makes most code improvement easier.
Contd…
Our example compiler therefore includes an explicit phase for intermediate code generation.
The code generator begins by grouping the nodes of the tree into basic blocks, each of which
consists of a maximal-length set of operations that should execute sequentially at run time, with
no branches in or out.
It then creates a control flow graph in which the nodes are basic blocks and the arcs represent
interblock control flow.
Within each basic block, operations are represented as instructions for an idealized RISC
machine with an unlimited number of registers. We will call these virtual registers.
By allocating a new one for every computed value, the compiler can avoid creating artificial
connections between otherwise independent computations too early in the compilation
process.
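The grouping into basic blocks and the construction of the control flow graph can be sketched as follows. This is a minimal illustration, not the compiler's actual algorithm: the tuple-based instruction format, the names split_into_blocks and build_cfg, and the GCD_IR listing are all assumptions made for the example. For brevity the toy IR reuses v1 and v2 instead of allocating a fresh virtual register for every computed value.

```python
# Hypothetical toy IR: each instruction is a tuple.
#   ("label", name)      marks a branch target
#   ("goto", name)       unconditional branch
#   ("if", cond, name)   conditional branch; falls through otherwise
#   anything else        ordinary straight-line operation

def split_into_blocks(instrs):
    """Partition a linear instruction list into basic blocks."""
    blocks, current = [], []
    for ins in instrs:
        if ins[0] == "label" and current:
            blocks.append(current)   # a label starts a new block
            current = []
        current.append(ins)
        if ins[0] in ("goto", "if"):
            blocks.append(current)   # a branch ends the current block
            current = []
    if current:
        blocks.append(current)
    return blocks

def build_cfg(blocks):
    """Return {block index: sorted list of successor block indices}."""
    label_to_block = {b[0][1]: i for i, b in enumerate(blocks)
                      if b[0][0] == "label"}
    cfg = {}
    for i, b in enumerate(blocks):
        last, succs = b[-1], set()
        if last[0] == "goto":
            succs.add(label_to_block[last[1]])
        else:
            if last[0] == "if":
                succs.add(label_to_block[last[2]])
            if i + 1 < len(blocks):
                succs.add(i + 1)     # fall-through arc
        cfg[i] = sorted(succs)
    return cfg

# Assumed pseudo-code for a subtraction-based GCD loop.
GCD_IR = [
    ("read", "v1"),
    ("read", "v2"),
    ("label", "top"),
    ("if", "v1 == v2", "done"),
    ("if", "v1 < v2", "else"),
    ("sub", "v1", "v1", "v2"),
    ("goto", "top"),
    ("label", "else"),
    ("sub", "v2", "v2", "v1"),
    ("goto", "top"),
    ("label", "done"),
    ("write", "v1"),
]

blocks = split_into_blocks(GCD_IR)
cfg = build_cfg(blocks)
```

Each key of cfg is a basic block and each listed successor is an arc, so the two loop bodies both point back to the block that begins at the label top.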
Contd…
The machine-independent code improvement phase performs transformations on the control
flow graph.
Local code improvement - it modifies the instruction sequence within each basic block to
eliminate redundant loads, stores, and arithmetic computations.
Global code improvement - it also identifies and removes a variety of redundancies across the
boundaries between basic blocks within a subroutine.
An expression whose value is computed immediately before an if statement need not be
recomputed after else.
An expression that appears within the body of a loop need only be evaluated once if its value
will not change in subsequent iterations.
Some global improvements change the number of basic blocks and/or the arcs among them.
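Local code improvement can be illustrated with a small sketch that removes repeated loads and arithmetic inside one basic block. It is only an approximation of what a real code improver does. The tuple format (dest, op, arg1, arg2) and the function name are hypothetical, and the sketch assumes that every destination is a fresh virtual register and that no store to memory occurs inside the block.

```python
def eliminate_local_redundancy(block):
    """Remove repeated computations inside one basic block.

    Assumes each destination is a fresh virtual register assigned exactly
    once, and that the block contains no stores, so an (op, arg1, arg2)
    key always denotes the same value.
    """
    available = {}   # (op, arg1, arg2) -> register already holding the value
    alias = {}       # eliminated register -> register that replaces it
    out = []
    for dest, op, a, b in block:
        a, b = alias.get(a, a), alias.get(b, b)   # rewrite operands
        key = (op, a, b)
        if key in available:
            alias[dest] = available[key]          # reuse; emit nothing
        else:
            available[key] = dest
            out.append((dest, op, a, b))
    return out, alias

block = [
    ("v1", "load", "a", None),
    ("v2", "load", "b", None),
    ("v3", "add", "v1", "v2"),
    ("v4", "load", "a", None),    # redundant load
    ("v5", "add", "v4", "v2"),    # redundant once v4 is known to equal v1
    ("v6", "mul", "v3", "v5"),
]

improved, alias = eliminate_local_redundancy(block)
```

Global code improvement applies the same idea across block boundaries, which requires analysis of the control flow graph and is not attempted here.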
Syntax tree and symbol table for the GCD program
Contd…
The next phase of compilation is target code generation.
This phase strings the basic blocks together into a linear program, translating each block into
the instruction set of the target machine and generating branch instructions (or “fall-throughs”)
that correspond to the arcs of the control flow graph.
The output of this phase differs from real assembly language primarily in its continued reliance
on virtual registers.
So long as the pseudo-instructions of the intermediate form are reasonably close to those of the
target machine, this phase of compilation, though tedious, is more or less straightforward.
To reduce programmer effort and increase the ease with which a compiler can be ported to a
new target machine, target code generators are often generated automatically from a formal
description of the machine.
Automatically generated code generators all rely on some sort of pattern-matching algorithm to
replace sequences of intermediate code instructions with equivalent sequences of target machine
instructions.
Control flow graph for the GCD program
Contd…
The final phase of our example compiler structure consists of register allocation and instruction
scheduling, both of which can be thought of as machine-specific code improvement.
Register allocation requires that we map the unlimited virtual registers onto the bounded set of
registers available in the target machine.
If there aren’t enough architectural registers to go around, we may need to generate additional
loads and stores to multiplex a given architectural register among two or more virtual
registers.
Instruction scheduling consists of reordering the instructions of each basic block to fill the
pipeline(s) of the target machine.
Phases and Passes
A pass of compilation is a phase or sequence of phases that is serialized with respect to the rest
of compilation.
It does not start until previous phases have completed.
It finishes before any subsequent phases start.
If desired, a pass may be written as a separate program, reading its input from a file and
writing its output to a file.
Two-pass compilers are particularly common.
They may be divided between the front end and the back end (between semantic analysis and
intermediate code generation), or between intermediate code generation and global code
improvement.
In the latter case, the first pass is still commonly referred to as the front end and the second
pass as the back end.
Intermediate Forms
An intermediate form (IF) provides the connection between the front end and the back end of
the compiler, and continues to represent the program during the various back-end phases.
IFs can be classified in terms of their level, or degree of machine dependence.
High-level IFs
High-level IFs are often based on trees or directed acyclic graphs (DAGs) that directly capture the
structure of modern programming languages.
They facilitate certain kinds of machine-independent code improvement, incremental program
updates, direct interpretation, and other operations based strongly on the structure of the
source.
Because the permissible structure of a tree can be described formally by a set of productions,
manipulations of tree-based forms can be written as attribute grammars.
Stack-based languages are another common type of high level IF.
Contd…
The most common medium level IFs consist of three-address instructions for a simple idealized
machine, typically one with an unlimited number of registers.
Since the typical instruction specifies two operands, an operator, and a destination, three-
address instructions are called quadruples.
In older compilers, one may sometimes find an intermediate form consisting of triples or
indirect triples in which destinations are specified implicitly.
• The index of a triple in the instruction stream is the name of the result.
• An operand is generally named by specifying the index of the triple that produced it.
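The difference between quadruples and triples can be made concrete with a small sketch. The encodings below are assumptions chosen for illustration: a quadruple is (op, arg1, arg2, dest), a triple is (op, arg1, arg2), and in a triple an integer operand is the index of the triple that produced the value while a string operand names a variable.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

# (a + b) * c as quadruples: the destination is named explicitly.
QUADS = [
    ("+", "a", "b", "t1"),
    ("*", "t1", "c", "t2"),
]

# The same computation as triples: the result of triple i is named by i.
TRIPLES = [
    ("+", "a", "b"),   # triple 0
    ("*", 0, "c"),     # operand 0 refers to the result of triple 0
]

def run_quads(quads, env):
    """Interpret quadruples; return the value of the last destination."""
    env = dict(env)
    for op, a, b, dest in quads:
        env[dest] = OPS[op](env[a], env[b])
    return env[quads[-1][3]]

def run_triples(triples, env):
    """Interpret triples; return the value of the last triple."""
    results = []
    def value(x):
        return results[x] if isinstance(x, int) else env[x]
    for op, a, b in triples:
        results.append(OPS[op](value(a), value(b)))
    return results[-1]

env = {"a": 2, "b": 3, "c": 4}
quad_result = run_quads(QUADS, env)
triple_result = run_triples(TRIPLES, env)
```

Because a triple is named by its position, reordering triples changes their names, which is one reason indirect triples add a level of indirection.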
Contd…
Different compilers use different IFs.
Many compilers use more than one IF internally, though in the common two-pass
organization one of these is distinguished as “the” intermediate form.
• It is the connection between the front end and the back end.
The syntax trees passed from semantic analysis to intermediate code generation constitute a
high level IF.
Control flow graphs containing pseudo-assembly language (passed in and out of machine-
independent code improvement) are a medium level IF.
The assembly language of the target machine serves as a low level IF.
Compilers that have back ends for different target architectures do as much work as possible on
a high or medium level IF.
The machine-independent parts of the code improver can be shared by different back ends.
Stack-Based Intermediate Forms
Stack-based languages are another type of IF.
E.g., JVM bytecode, Pascal’s P-code.
They are simple and compact.
They resemble post-order tree enumeration.
Operations
Take their operands from an implicit stack.
Return their result to an implicit stack.
These languages tend to make a language implementation easy to port, and the resulting code is very compact.
Ideal for network transfer of applets.
Stack-based versus three-address IF.
Consider Heron’s formula to compute the area of
a triangle given the lengths of its sides, $a$, $b$, and $c$:
$A = \sqrt{s(s-a)(s-b)(s-c)}$, where $s = \frac{a+b+c}{2}$
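The comparison can be tried out with two tiny interpreters. The instruction listings below are hand-written assumptions for this sketch rather than the output of any particular compiler, and the opcode names (push, pop, add, sub, mul, div, sqrt) are hypothetical.

```python
import math

BINOPS = {"add": lambda x, y: x + y, "sub": lambda x, y: x - y,
          "mul": lambda x, y: x * y, "div": lambda x, y: x / y}

# Stack-based form: operands are implicit, so instructions are tiny.
STACK_CODE = [
    ("push", "a"), ("push", "b"), ("push", "c"), ("add",), ("add",),
    ("push", 2), ("div",), ("pop", "s"),
    ("push", "s"), ("push", "s"), ("push", "a"), ("sub",), ("mul",),
    ("push", "s"), ("push", "b"), ("sub",), ("mul",),
    ("push", "s"), ("push", "c"), ("sub",), ("mul",),
    ("sqrt",),
]

# Three-address form: (op, dest, arg1, arg2) with explicit registers.
THREE_ADDRESS_CODE = [
    ("add", "r1", "a", "b"),
    ("add", "r1", "r1", "c"),
    ("div", "s", "r1", 2),
    ("sub", "r1", "s", "a"),
    ("sub", "r2", "s", "b"),
    ("sub", "r3", "s", "c"),
    ("mul", "r1", "s", "r1"),
    ("mul", "r1", "r1", "r2"),
    ("mul", "r1", "r1", "r3"),
    ("sqrt", "r1", "r1", None),
]

def run_stack(code, env):
    """Interpret the stack code; names are looked up in env."""
    env, stack = dict(env), []
    for ins in code:
        op = ins[0]
        if op == "push":
            x = ins[1]
            stack.append(env[x] if isinstance(x, str) else x)
        elif op == "pop":
            env[ins[1]] = stack.pop()
        elif op == "sqrt":
            stack.append(math.sqrt(stack.pop()))
        else:
            y, x = stack.pop(), stack.pop()   # right operand is on top
            stack.append(BINOPS[op](x, y))
    return stack.pop()

def run_three_address(code, env):
    """Interpret the three-address code; return the last destination."""
    env = dict(env)
    def val(x):
        return env[x] if isinstance(x, str) else x
    for op, dest, a, b in code:
        if op == "sqrt":
            env[dest] = math.sqrt(val(a))
        else:
            env[dest] = BINOPS[op](val(a), val(b))
    return env[code[-1][1]]

sides = {"a": 3, "b": 4, "c": 5}
stack_area = run_stack(STACK_CODE, sides)
reg_area = run_three_address(THREE_ADDRESS_CODE, sides)
```

In this sketch the stack version needs more instructions than the three-address version, but each stack instruction carries at most one operand, which is why stack code tends to be compact.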
Code Generation
Like semantic analysis, intermediate code generation can be formalized in terms of an attribute
grammar, though it is most commonly implemented via handwritten ad hoc traversal of a syntax
tree.
Contd…
Register Allocation:
Evaluation of the rules of the attribute grammar itself consists of two main tasks:
• In each subtree we first determine the registers that will be used to hold various quantities at
run time; then we generate code.
• Our naive register allocation strategy uses the next_free_reg inherited attribute to manage
registers r1, ... , rk as an expression evaluation stack.
To calculate the value of (a + b) × (c - (d / e)) for example, we would generate the following:
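A minimal sketch of the naive strategy follows. It models next_free_reg as a function parameter rather than as a true attribute-grammar attribute, represents the expression as nested tuples, and assumes enough registers are available so that no spilling is needed. The function name gen and the tree encoding are hypothetical.

```python
def gen(node, next_free_reg, code):
    """Emit code for an expression tree; return the register holding its value.

    node is either a variable name (str) or a tuple (op, left, right).
    next_free_reg plays the role of the inherited attribute: registers
    numbered below it are in use, so the result goes in r<next_free_reg>.
    """
    reg = f"r{next_free_reg}"
    if isinstance(node, str):
        code.append(f"{reg} := {node}")            # load a variable
    else:
        op, left, right = node
        gen(left, next_free_reg, code)             # left result lands in reg
        right_reg = gen(right, next_free_reg + 1, code)
        code.append(f"{reg} := {reg} {op} {right_reg}")
    return reg

# (a + b) * (c - (d / e))
expr = ("*", ("+", "a", "b"), ("-", "c", ("/", "d", "e")))
code = []
result_reg = gen(expr, 1, code)
# code is now:
#   r1 := a
#   r2 := b
#   r1 := r1 + r2
#   r2 := c
#   r3 := d
#   r4 := e
#   r3 := r3 / r4
#   r2 := r2 - r3
#   r1 := r1 * r2
```

The registers behave like a stack: the right operand always uses the register just above the left operand, and the result overwrites the left operand's register.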
Contd…
In a particularly complicated fragment of code it is possible to run out of architectural registers.
In this case we must spill one or more registers to memory.
Our naive register allocator pushes a register onto the program’s subroutine call stack.
In effect, architectural registers hold the top k elements of an expression evaluation stack of
effectively unlimited size.
It should be emphasized that our register allocation algorithm makes very poor use of machine
resources.
If we were generating medium level intermediate code, we would employ virtual registers,
rather than architectural ones.
Mapping of virtual registers to architectural registers would occur much later in the
compilation process.
Address Space Organization
Assemblers, linkers, and loaders typically operate on a pair of related file formats:
relocatable object code and executable object code.
Contd…
A relocatable object file includes the following descriptive information:
Import table: Identifies instructions that refer to named locations whose addresses are
unknown, but are presumed to lie in other files yet to be linked to this one.
Relocation table: Identifies instructions that refer to locations within the current file, but that
must be modified at link time to reflect the offset of the current file within the final,
executable program.
Export table: Lists the names and addresses of locations in the current file that may be
referred to in other files.
Imported and exported names are known as external symbols.
An executable object file is distinguished by the fact that it contains no references to external
symbols. It also defines a starting address for execution. An executable file may or may not be
relocatable, depending on whether it contains the tables above.
Contd…
Segments of a runnable program:
Code
Constants
Initialized data
Uninitialized data: may be allocated at load time or on demand in response to page faults
• Usually zero-filled, to provide repeatable symptoms for programs that erroneously read
data they have not yet written.
Stack: may be allocated in some fixed amount at load time
• more commonly, is given a small initial size, and then
• extended automatically by the operating system in response to (faulting) accesses beyond
the current segment end.
Contd…
Heap: may also be allocated in some fixed amount at load time.
• more commonly, is given a small initial size, and is then
• extended in response to explicit requests from heap-management library routines.
Files: In many systems, library routines allow a program to map a file into memory
• The map routine interacts with the operating system to create a new segment for the file,
and returns the address of the beginning of the segment.
• The contents of the segment are usually fetched from disk on demand, in response to page
faults.
Dynamic libraries: Modern operating systems typically arrange for most programs to share a
single copy of the code for popular libraries.
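Mapping a file into memory can be tried from Python with the standard mmap module, which wraps the operating system's mapping facility. The sketch below is only an illustration of the idea: the helper name first_bytes_via_mmap and the temporary file are assumptions made for the example.

```python
import mmap
import os
import tempfile

def first_bytes_via_mmap(path, n):
    """Map a file read-only and return its first n bytes."""
    with open(path, "rb") as f:
        # Length 0 asks for the whole file to be mapped.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as segment:
            return segment[:n]   # slicing reads from the mapped segment

# Create a small file to map, then clean it up afterwards.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello, segment")
    path = tmp.name
try:
    data = first_bytes_via_mmap(path, 5)
finally:
    os.remove(path)
```

The program reads the mapped object with ordinary indexing, and the operating system decides when the underlying pages are actually fetched.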
Layout of process address space in x86 Linux
Assembly
Some compilers translate source files directly into object files acceptable to the linker.
More commonly, they generate assembly language that must subsequently be processed by an
assembler to create an object file.
Assembly language is simply a symbolic (textual) notation for code.
Within a compiler the representation would still be symbolic, most likely consisting of
records and linked lists.
To translate this symbolic representation into executable code, we must
Replace opcodes and operands with their machine language encodings.
Replace uses of symbolic names with actual addresses.
Contd…
When passing assembly language from the compiler to the assembler, it makes sense to use
some internal (records and linked lists) representation.
At the same time, we must provide a textual front end to accommodate the occasional need for
human input.
Contd…
An alternative organization has the compiler generate object code directly.
This organization gives the compiler a bit more flexibility: operations normally performed by
an assembler (e.g., assignment of addresses to variables) can be performed earlier if desired.
Because there is no separate assembly pass, the overall translation to object code may be
slightly faster.
Contd…
Emitting Instructions:
The most basic task of the assembler is to translate symbolic representations of instructions into
binary form.
In some assemblers this is easy.
There is a one-to-one correspondence between mnemonic operations and instruction opcodes.
Many assemblers extend the instruction set in minor ways to make the assembly language
easier for human beings to read.
Most MIPS assemblers, for example, provide a large number of pseudo-instructions that
translate into different real instructions depending on their arguments, or that correspond to
multi-instruction sequences.
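Pseudo-instruction expansion can be sketched as a small rewriting function. The expansions shown for move and li are typical of MIPS assemblers, but the exact sequences vary from one assembler to another, so treat them as assumptions. The function name expand_pseudo is hypothetical.

```python
def expand_pseudo(mnemonic, *args):
    """Expand a pseudo-instruction into a list of real MIPS-style instructions."""
    if mnemonic == "move":
        rd, rs = args
        return [f"addu {rd}, $zero, {rs}"]
    if mnemonic == "li":
        rd, imm = args
        if -0x8000 <= imm <= 0x7FFF:
            # Fits in a signed 16-bit immediate: one real instruction.
            return [f"addiu {rd}, $zero, {imm}"]
        # Otherwise load the upper and lower halves separately.
        hi, lo = (imm >> 16) & 0xFFFF, imm & 0xFFFF
        return [f"lui {rd}, {hi:#x}", f"ori {rd}, {rd}, {lo:#x}"]
    # Anything else is assumed to be a real instruction already.
    return [f"{mnemonic} " + ", ".join(str(a) for a in args)]

small = expand_pseudo("li", "$t0", 5)
large = expand_pseudo("li", "$t0", 0x12345678)
copy = expand_pseudo("move", "$t1", "$t0")
```

The same mnemonic li yields one instruction for a small constant and two for a large one, which is why an assembler cannot know instruction addresses until pseudo-instructions have been sized.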
Contd…
Assemblers respond to a variety of directives (MIPS):
segment switching
• .text directive indicates that subsequent instructions and data should be placed in the code
(text) segment.
• .data directive indicates that subsequent instructions and data should be placed in the
initialized data segment.
• .space n directive indicates that n bytes of space should be reserved in the uninitialized data
segment.
data generation
• .byte, .hword, .word, .float, and .double directives each take a sequence of arguments,
which they place in successive locations in the current segment.
• related .ascii directive takes a single character string as argument, which it places in
consecutive bytes.
symbol identification: .globl name directive indicates that name should be entered into the
table of exported symbols.
alignment: .align n directive causes the subsequent output to be aligned at an address evenly
divisible by $2^n$.
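The arithmetic behind alignment can be written in one line. The sketch assumes the MIPS-style convention in which .align n requests a boundary of 2 to the power n bytes. The function name align is hypothetical.

```python
def align(address, n):
    """Round address up to the next multiple of 2**n (unchanged if already aligned)."""
    mask = (1 << n) - 1
    return (address + mask) & ~mask

next_word = align(13, 2)      # next 4-byte boundary at or after 13
already = align(16, 2)        # 16 is already a multiple of 4
next_double = align(1, 3)     # next 8-byte boundary at or after 1
```

Adding the mask and then clearing the low n bits rounds up without a division.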
Contd…
RISC assemblers implement a virtual machine whose instruction set is “nicer” than that of the real
hardware.
In addition to pseudo-instructions, the virtual machine may have non-delayed branches.
If desired, the compiler or assembly language programmer can ignore the existence of branch
delays.
The assembler will move nearby instructions to fill delay slots if possible, or generate nops if
necessary.
To support systems programmers, the assembler must also make it possible to specify that
delay slots have already been filled.
Contd…
Assemblers commonly work in several phases.
If the input is textual, an initial phase scans and parses the input, and builds an internal
representation.
There are then two additional phases.
• The first phase identifies all internal and external (imported) symbols, assigning locations to the
internal ones.
This is complicated by the length of some instructions (on a CISC machine)
or
by the number of real instructions produced by a pseudo-instruction (on a RISC
machine).
• The final phase produces object code.
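The two phases after parsing can be sketched for a toy RISC-like assembly language. Everything here is an assumption made for illustration: the tuple format, the rule that la expands to two real instructions, the fixed 4-byte instruction size, and the convention that register names start with "$". The output is a symbolic listing with addresses, not real binary object code.

```python
SIZE = 4   # every real instruction occupies one 4-byte word in this sketch

def real_count(ins):
    """Number of real instructions a (pseudo-)instruction expands to (assumed)."""
    return 2 if ins[0] == "la" else 1

def pass1(program):
    """Assign addresses to local labels and collect imported symbols."""
    symbols, used, addr = {}, set(), 0
    for ins in program:
        if ins[0] == "label":
            symbols[ins[1]] = addr
        else:
            used.update(x for x in ins[1:]
                        if isinstance(x, str) and not x.startswith("$"))
            addr += SIZE * real_count(ins)
    imports = sorted(used - set(symbols))   # names used but not defined here
    return symbols, imports

def pass2(program, symbols):
    """Produce (address, mnemonic, operands) with local symbols resolved."""
    out, addr = [], 0
    for ins in program:
        if ins[0] == "label":
            continue
        operands = tuple(symbols.get(x, x) if isinstance(x, str) else x
                         for x in ins[1:])
        out.append((addr, ins[0], operands))
        addr += SIZE * real_count(ins)
    return out

program = [
    ("label", "start"),
    ("la", "$t0", "msg"),        # pseudo-instruction; msg is defined elsewhere
    ("j", "end"),
    ("label", "loop"),
    ("addiu", "$t0", "$t0", 1),
    ("label", "end"),
    ("jr", "$ra"),
]

symbols, imports = pass1(program)
listing = pass2(program, symbols)
```

The forward reference to end is the reason for two phases: its address is not known when the j instruction is first seen, and it depends on how many real instructions la produces.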
Contd…
CISC assemblers distinguish between absolute and relocatable words in an object file.
Absolute words are known at assembly time; they need not be changed by the linker.
• Examples include constants and register-register instructions.
A relocatable word must be modified by adding to it the address within the final program of the
code or data segment of the current object file.
A CISC jump instruction might consist of a one-byte jmp opcode followed by a four-byte
target address.
For a local target, the address bytes in the object file would contain the symbol’s offset within
the file.
The linker finalizes the address by adding the offset of the file’s code segment within the final
program.
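The jump example can be worked through at the byte level. In this sketch the opcode value, the little-endian byte order, and the function names are all assumptions. Only the arithmetic (final address = offset within file + base of the file's code segment) comes from the description above.

```python
import struct

JMP_OPCODE = 0x01   # arbitrary value for this sketch, not a real encoding

def assemble_jump(target_offset):
    """Emit the opcode plus the 4-byte offset of the target within this file."""
    return bytes([JMP_OPCODE]) + struct.pack("<I", target_offset)

def relocate(code, reloc_offsets, segment_base):
    """Add segment_base to each 4-byte relocatable word listed in reloc_offsets."""
    patched = bytearray(code)
    for off in reloc_offsets:
        (value,) = struct.unpack_from("<I", patched, off)
        struct.pack_into("<I", patched, off, value + segment_base)
    return bytes(patched)

# The jump targets offset 0x20 within this object file.
object_code = assemble_jump(0x20)
relocation_table = [1]          # the address bytes start right after the opcode

# Suppose the linker places this file's code segment at 0x1000.
linked = relocate(object_code, relocation_table, 0x1000)
(final_target,) = struct.unpack_from("<I", linked, 1)
```

The opcode byte is an absolute word and is left untouched, while the four address bytes are relocatable and end up holding 0x1020.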
Linking
Most language implementations - certainly all that are intended for the construction of large
programs - support separate compilation.
Fragments of the program can be compiled and assembled more or less independently.
After compilation, these fragments (known as compilation units) are “glued together” by a
linker.
The programmer may explicitly divide the program into modules or files, each separately compiled.
Integrated environments may abandon the notion of a file in favor of a database of separately
compiled subroutines.
The linker joins together compilation units.
A static linker does its work prior to program execution, producing an executable object file.
A dynamic linker does its work after the program has been brought into memory for execution.
Contd…
Each of the compilation units of a program to be linked must be a relocatable object file.
Some files will have been produced by compiling fragments of the application being
constructed.
Others will be general purpose library packages needed by the application.
Since most programs make use of libraries, even a “one-file” application typically needs to be
linked.
Linking involves two subtasks: relocation and the resolution of external references.
Relocation is sometimes referred to as loading, and the entire “joining together” process is known
as “link-loading.”
Here “loading” refers to the process of bringing an executable object file into memory for
execution.
On very simple machines loading entails relocation.
More commonly, the operating system uses virtual memory to give each program the impression
that it starts at some standard address (zero).
Often loading also entails a certain amount of linking.
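The two subtasks of linking can be sketched with a toy object-file format. The dictionary layout (code, exports, relocs, imports), the word-addressed memory model, and the function name link are all assumptions made for this example. Real object formats are considerably richer.

```python
def link(objects):
    """Link relocatable objects into one flat list of words.

    Each object is a dict with:
      'code'    list of ints (one per word)
      'exports' {name: offset within this object}
      'relocs'  [index of a word holding a local address]
      'imports' {index of a word: external name to be filled in}
    """
    # Decide where each object's code will sit in the final program.
    bases, base = [], 0
    for obj in objects:
        bases.append(base)
        base += len(obj["code"])

    # Build the global symbol table from the export tables.
    global_symbols = {}
    for obj, b in zip(objects, bases):
        for name, off in obj["exports"].items():
            global_symbols[name] = b + off

    image = []
    for obj, b in zip(objects, bases):
        words = list(obj["code"])
        for i in obj["relocs"]:
            words[i] += b                       # relocation
        for i, name in obj["imports"].items():
            if name not in global_symbols:
                raise KeyError(f"undefined external symbol: {name}")
            words[i] = global_symbols[name]     # resolve external reference
        image.extend(words)
    return image, global_symbols

A = {"code": [10, 2, 0, 11], "exports": {"main": 0},
     "relocs": [1], "imports": {2: "helper"}}
B = {"code": [20, 1, 21], "exports": {"helper": 0},
     "relocs": [1], "imports": {}}

image, table = link([A, B])
```

After linking, the image contains no unresolved imports, which is the property that distinguishes an executable object file from a relocatable one.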
Mr. Sayantan Saha PLC_Lec_37 34
Linking relocatable object files A and B to make an
executable object file
Dynamic Linking
On a multi-user system, it is common for several instances of a program (an editor or web
browser, for example) to be executing simultaneously.
It would be highly wasteful to allocate space in memory for a separate, identical copy of the
code of such a program for every running instance.
Many operating systems therefore keep track of the programs that are running, and set up
memory mapping tables so that all instances of the same program share the same read-only copy
of the program’s code segment.
Each instance receives its own writable copy of the data segment.
Code segment sharing can save enormous amounts of space.
It does not work, however, for instances of programs that are similar but not identical.
Many sets of programs, while not identical, have large amounts of library code in common.
Moreover, if programs are statically linked, then much larger amounts of disk space may be
wasted on nearly identical copies of the library in separate executable object files.
Questions to practice
1) What is sometimes called the “middle end” of a compiler?
2) What are virtual registers? What purpose do they serve?
3) What is the difference between local and global code improvement?
4) What is register spilling?
5) Name two advantages of a stack-based IF. Name one disadvantage.
6) Explain what is meant by the “level” of an intermediate form (IF). What are the comparative
advantages and disadvantages of high-, medium-, and low-level IFs?
7) What are the distinguishing characteristics of a relocatable object file? An executable object
file?
8) List four tasks commonly performed by an assembler.
9) What is the difference between linking and loading? What are the principal tasks of a linker?
10) How can a linker enforce type checking across compilation units?
11) What is the motivation for dynamic linking?
Thank You