System Programming: Second Class
Assistant Lecturer: Manal Matloub
CHAPTER FIVE (Loaders and Linkers)
Introduction:
This chapter explains the concepts of linking and loading. As discussed
earlier, the source program is converted to an object program by the
assembler. The loader is a program that takes this object program, prepares
it for execution, and loads the resulting executable code into memory.
Definition of Loader:
A loader is a utility program that takes object code as input, prepares it
for execution, and loads the executable code into memory. Thus the loader is
actually responsible for initiating the execution process.
Functions of Loader:
The loader is responsible for four activities: allocation, linking, relocation
and loading.
1) It allocates space for the program in memory by calculating the size
of the program. This activity is called allocation.
2) It resolves the symbolic references (code/data) between the object
modules by assigning addresses to all user subroutines and library
subroutines. This activity is called linking.
3) Some locations in the program are address dependent. Such address
constants must be adjusted according to the allocated space. This
activity is called relocation.
4) Finally, it places all the machine instructions and data of the
corresponding programs and subroutines into memory. The program is then
ready for execution. This activity is called loading.
Loader Schemes:
Based on the various functionalities of loader, there are various types of loaders:
1) “Compile and go” loader: in this type of loader, the source is read
line by line, the machine code for each instruction is obtained, and it is
placed directly in main memory at some known address. That is, the
assembler runs in one part of memory, and the assembled machine
instructions and data are placed directly into their assigned memory
locations. After the assembly process is complete, the starting address of
the program is assigned to the location counter. A typical example is
WATFOR-77, a FORTRAN compiler that uses such a “load and go” scheme. This
loading scheme is also called “assemble and go”.
Advantages:
• This scheme is simple to implement, because the assembler is placed in
one part of memory and the loader simply loads the assembled machine
instructions into memory.
Disadvantages:
• Some portion of memory is occupied by the assembler, which is a waste
of memory. Because this scheme combines the assembler and loader
activities, the combined program occupies a large block of memory.
• No .obj file is produced; the source code is converted directly to
executable form. Hence, even when the source program has not been
modified, it must be assembled again for every execution, which is
time consuming.
• It cannot handle multiple source programs or multiple programs
written in different languages. This is because an assembler can
translate only one source language into one target language.
• It is very difficult for a programmer to write an orderly modular
program under this scheme, such a program is also difficult to
maintain, and the “compile and go” loader cannot handle such programs.
• The execution time is longer in this scheme, because the program is
assembled every time before it is executed.
2) General Loader Scheme: in this scheme, the source program is
converted to an object program by some translator (assembler). The loader
accepts these object modules and places the machine instructions and data,
in executable form, at their assigned memory locations. The loader
occupies some portion of main memory.
Advantages:
• The program need not be retranslated each time it is run. When the
source program is first translated, an object program is generated.
If the program is not modified, the loader can use this object
program and convert it to executable form.
• There is no wastage of memory, because the assembler is not
placed in memory; instead, the loader occupies some portion of
memory. The loader is smaller than the assembler, so more memory
is available to the user.
• It is possible to write a source program as multiple programs
in multiple languages, because the source programs are always
converted to object programs first, and the loader accepts
these object modules and converts them to executable form.
20 JMP 2000
21 END
In this example there are two segments, which are interdependent. At line
number 1 the assembler directive START specifies the physical starting address
that can be used during the execution of the first segment MAIN. Then at line
number 15 the JMP instruction is given, which specifies the physical starting
address that can be used by the second segment. The assembler creates the object
codes for these two segments by considering the starting addresses of these two
segments. During execution, the first segment will be loaded at address 1000
and the second segment will be loaded at address 5000, as specified by the
programmer. Thus the problem of linking is solved manually by the programmer,
who takes care of the mutually dependent addresses. Notice that control is
correctly transferred to address 5000 to invoke the other segment, and after
that, at line number 20, the JMP instruction transfers control to location
2000, where the instruction STORE of line number 16 is present. Thus
resolution of mutual references and linking is done by the programmer. The
task of the assembler is to create the object codes
for the above segments, along with information such as the starting address
in memory where the object code is to be placed at the time of
execution. The absolute loader accepts these object modules from the assembler
and, by reading the information about their starting addresses, places
(loads) them in memory at the specified addresses.
Advantages:
1. It is simple to implement.
2. This scheme allows multiple programs, or source programs written in
different languages. If there are multiple programs written in different
languages, the respective language assembler converts each one to the
target language, and a common object file can be prepared with all the
addresses resolved.
3. The task of the loader becomes simpler, as it simply obeys the instruction
regarding where to place the object code in main memory.
4. The process of execution is efficient.
Disadvantages:
1. In this scheme it is the programmer's duty to adjust all the inter-segment
addresses and to do the linking manually. For that, the programmer
must understand memory management.
2. If any modification is made to some segment, the starting
addresses of the immediately following segments may change. The programmer
has to take care of this and update the corresponding starting
addresses after every modification to the source.
Algorithm for absolute Loader
Input: Object codes and starting addresses of the program segments.
Output: Executable code for the corresponding source program. This
executable code is to be placed in the main memory.
Method: Begin
  For each program segment
  do Begin
    Read the first line from the object module to
    obtain information about the memory location.
    The starting address, say S, in the corresponding
    object module is the memory location where the
    executable code is to be placed.
    Hence
      Memory_location = S
      Line_counter = 1 ; as it is the first line
    While (! end of file)
    For the current object code
    do Begin
      1. Read next line
      2. Write line into location S
      3. S = S + 1
      4. Line_counter = Line_counter + 1
    End
  End
End
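The algorithm above can be sketched in Python. This is only a minimal illustration under stated assumptions: each object module is modelled as a (start address, list of words) pair, memory is modelled as a dictionary, and the name `absolute_load` and the instruction strings are hypothetical.

```python
def absolute_load(memory, modules):
    """Place each module's words at consecutive addresses from its start address."""
    for start, words in modules:
        location = start              # Memory_location = S
        for word in words:            # read the next line of object code
            memory[location] = word   # write the line into location S
            location += 1             # S = S + 1
    return memory


# Two hypothetical segments using the start addresses mentioned in the text.
memory = absolute_load({}, [
    (1000, ["LOAD 1", "ADD 2", "JMP 5000"]),
    (5000, ["STORE 3", "JMP 2000"]),
])
print(memory[1002])  # JMP 5000
```

The loader does no address arithmetic of its own here: it only obeys the start address recorded with each module, which is why the scheme is simple but leaves all linking to the programmer.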
external, we can use the assembler directive EXT. Thus a statement such as
EXT B should be added at the beginning of segment A. This informs the
assembler that B is defined somewhere else. Similarly, if a
subroutine or a variable is defined in the current segment and can be referred
to by other segments, it should be declared using the pseudo-op INT.
The assembler can thereby inform the loader that these are the
subroutines or variables used by other segments. This overall process of
establishing the relations between the subroutines is conceptually called
subroutine linkage.
For example
MAIN START
EXT B
.
.
.
CALL B
.
.
END
B START
.
.
RET
END
At the beginning of MAIN, the subroutine B is declared as external.
When a call to subroutine B is made, before the unconditional jump is taken,
the current content of the program counter is stored on the system stack,
which is maintained internally. Similarly, on returning from subroutine B
(at RET), a pop is performed to restore the program counter of the calling
routine with the address of the next instruction to be executed.
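The push and pop of the program counter described above can be illustrated with a toy interpreter. This is a sketch only: the instruction set (NOP, CALL, RET, END), the addresses, and the function name `run` are assumptions made for the illustration, not part of any real machine.

```python
def run(program, entry):
    """Run a toy program in which each address maps to an (opcode, operand) pair."""
    pc = entry
    stack = []    # system stack holding return addresses
    trace = []    # addresses executed, in order
    while True:
        op, arg = program[pc]
        trace.append(pc)
        if op == "CALL":
            stack.append(pc + 1)   # save the address of the next instruction
            pc = arg               # unconditional jump to the subroutine
        elif op == "RET":
            pc = stack.pop()       # restore the caller's program counter
        elif op == "END":
            return trace
        else:                      # any other instruction falls through
            pc += 1


# MAIN occupies addresses 0-2 and subroutine B occupies 10-11 (hypothetical).
program = {
    0: ("NOP", None),
    1: ("CALL", 10),
    2: ("END", None),
    10: ("NOP", None),
    11: ("RET", None),
}
print(run(program, 0))  # [0, 1, 10, 11, 2]
```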
Concept of relocations:
Relocation is the process of updating the addresses used in the
address-sensitive instructions of a program. Such modification is necessary
so that the program can execute from its designated area of memory.
The assembler generates the object code. This object code is executed
after it is loaded at its storage locations. The final addresses of the object
code are known only after the assembly process is over. Therefore, after loading,
Address of object code = assembled address of object code + relocation constant.
Two types of addresses are generated: absolute addresses and
relative addresses. An absolute address can be used directly to map the object
code into main memory, whereas a relative address can be used only after the
relocation constant has been added to the object code address. This
adjustment must be made for relative addresses before the actual execution
of the code. Typical examples of relative references are the addresses of
symbols defined in the Label field, the addresses of data defined by
assembler directives, literals, and redefinable symbols. Typical examples of
absolute addresses are the constants generated by the assembler.
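As a small illustration of the distinction above, the following sketch adds a relocation constant to relative addresses only and leaves absolute ones untouched. The pair representation, the name `relocate`, and the sample values are assumptions chosen for the example.

```python
def relocate(object_code, relocation_constant):
    """Return final addresses: relative ones are shifted, absolute ones are kept."""
    return [
        address + relocation_constant if is_relative else address
        for address, is_relative in object_code
    ]


# (assembled address, is_relative) pairs; the values are made up.
object_code = [(0, True), (55, True), (7, False)]
print(relocate(object_code, 1000))  # [1000, 1055, 7]
```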
The assembler determines which addresses are absolute and which addresses
are relative during the assembly process. During assembly, the
assembler calculates addresses with the help of simple expressions.
For example
LOAD A(X)+5
The expression A(X) means the address of variable X. The above instruction
loads the contents of the memory location that is 5 more than the address of
variable X. If the address of X is 50, the instruction accesses memory
location 50+5=55. Because the address of variable X is relative, A(X)+5 is
also relative. Simple expressions are allowed for calculating relative
addresses. The expression is expected to contain at most
addition and multiplication operations. A simple exercise can be carried out to
determine whether a given address is absolute or relative. In the expression,
put 0 in place of each absolute address and 1 in place of each relative
address. The expression is then transformed into a sum of
0's and 1's. If the resulting value of the expression is 0, the expression is
absolute. If the resulting value is 1, the expression is
relative. If the result is anything other than 0 or 1, the expression is
illegal. For example:
In the above expression, A, B and C are the variable names. The assembler has
to consider the relocation attribute and adjust the object code by the
relocation constant. The assembler is then responsible for conveying the
loading information for the object code to the loader. Let us now see how the
assembler generates code using relocation information.
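The 0/1 exercise described earlier can be sketched as a short function. This is an illustration under assumptions: an expression is given as a list of (sign, is_relative) terms, the name `classify` is hypothetical, and treating a subtracted relative term as -1 is an assumption that goes beyond the text, which mentions only sums.

```python
def classify(terms):
    """Classify an address expression given as (sign, is_relative) terms."""
    # Replace each relative term by 1 and each absolute term by 0, then sum.
    total = sum(sign * (1 if is_relative else 0) for sign, is_relative in terms)
    if total == 0:
        return "absolute"
    if total == 1:
        return "relative"
    return "illegal"


# A(X) + 5: one relative term plus one absolute constant.
print(classify([(+1, True), (+1, False)]))  # relative
# X + Y with both relative: 1 + 1 = 2.
print(classify([(+1, True), (+1, True)]))   # illegal
# X - Y with both relative: 1 - 1 = 0 (subtraction is an assumption here).
print(classify([(+1, True), (-1, True)]))   # absolute
```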
Sometimes a program may require more storage space than is available.
Execution of such a program is possible if all the segments are not required
to be present in main memory simultaneously. In such situations, only those
segments that are actually needed at the time of execution are resident in
memory. But the question arises: what happens if the
required segment is not present in memory? Naturally, the execution
process is delayed until the required segment is loaded into memory.
The overall effect is that the efficiency of the execution process degrades.
The efficiency can be improved by carefully selecting all the interdependent
segments. Of course, the assembler cannot do this task. Only the user can
specify such dependencies. The interdependency of the
segments can be specified by a tree-like structure called a static overlay
structure. The overlay structure consists of nodes (including a root) and
edges. Each node represents a segment. The specification of the required
amount of memory is also essential in this structure. Two segments can lie
in main memory simultaneously if they are on the same path. Let us take
an example to understand the concept. Various segments along with their
memory requirements are as shown below.
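Since only segments on the same path need to be resident together, the memory required by an overlay structure is the largest total along any root-to-leaf path. The sketch below computes this for a hypothetical tree; the segment names and sizes are invented for illustration and are not the figures from these notes.

```python
def overlay_memory(tree, sizes, node):
    """Largest total size along any path from `node` down to a leaf."""
    children = tree.get(node, [])
    deepest = max((overlay_memory(tree, sizes, c) for c in children), default=0)
    return sizes[node] + deepest


# Hypothetical overlay tree: A is the root, B and C are its children,
# and D and E are children of C. Sizes are in KB.
tree = {"A": ["B", "C"], "C": ["D", "E"]}
sizes = {"A": 20, "B": 20, "C": 30, "D": 10, "E": 20}

print(sum(sizes.values()))               # 100 (all segments resident at once)
print(overlay_memory(tree, sizes, "A"))  # 70 (longest path A-C-E)
```

With these made-up numbers the program needs 70 KB with overlays instead of 100 KB without them.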
Linkage Editor:
The execution of any program needs four basic functions:
allocation, relocation, linking and loading. As we have seen with the direct
linking loader, these four functions must be performed each time a program
is executed. Performing all of them every time is a time- and
space-consuming task. Moreover, if the program
contains many subroutines or functions and needs to be
executed repeatedly, this activity becomes annoyingly complex. Actually,
there is no need to redo all four
activities each time. Instead, if the results of some of these activities are
stored in a file, that file can be used by the other activities, and
repeating allocation, relocation, linking and loading each time can be avoided.
The idea is to separate these activities into groups. Dividing the
four essential functions into groups reduces the overall time complexity of the
loading process.
The program which performs allocation, relocation and linking is called a
binder. The binder performs relocation, creates linked executable text and stores
this text in a file in some systematic manner. The module prepared by
the binder is called a load module. This load module can then be
loaded into main memory by the loader. This loader is also called a
module loader. If the binder can produce an exact replica of the
executable code in the load module, the module loader simply loads this file
into main memory, which ultimately reduces the overall time
complexity. But for this the binder must know the current positions in
main memory. Even if the binder knows the main memory locations, this
alone is not sufficient. In a multiprogramming environment, the
region of main memory available for loading the program is decided by the host
operating system. The binder should also know which memory area is allocated
to the program being loaded, and it should modify the relocation information
accordingly. A binder that performs the linking function, produces
adequate information about allocation and relocation,
and writes this information along with the program code into the file is called a
linkage editor. The module loader then accepts this file as input, reads the
information stored in it, and, based on this information about allocation and
relocation, performs the task of loading into main memory.
Even though the program is executed repeatedly, the linking is done only
once. Moreover, the flexibility of allocation and relocation allows efficient
utilization of main memory.
Direct linking:
Advantages
1. The overhead on the loader is reduced. The required subroutine is
loaded into main memory only at the time of execution.
2. The system can be dynamically reconfigured.
Disadvantages
The linking and loading must be postponed until execution. If a subroutine
is needed during execution, the process of execution must
be suspended until the required subroutine is loaded into main memory.
Bootstrap Loader:
When we turn on the computer there is nothing meaningful in main memory
(RAM). A small program is written and stored in ROM. This
program initially loads the operating system from secondary storage into main
memory. The operating system then takes overall control. The program
responsible for booting up the system is called the bootstrap loader. It is
the program that must be executed first when the system is powered on. If
the program starts at location x, then to execute it the program
counter of the machine must be loaded with the value x. Thus the task of
setting the initial value of the program counter is done by the machine
hardware. The bootstrap loader is a very small program that must fit in
ROM. Its task is to load the necessary portion of the
operating system into main memory. The
initial address at which the bootstrap loader is placed is generally the
lowest location (possibly location 0) or the highest location.
Concept of Linking:
As discussed earlier, a program is executed with the help of the
following steps:
1. Translation of the program (done by assembler or compiler).
2. Linking of the program with all other programs which are needed for
execution. This also involves preparation of a program called load
module.
3. Loading of the load module prepared by linker to some specified memory
location.
The output of the translator is a program called an object module. The linker
processes these object modules, binds them with the necessary library routines
and prepares a ready-to-execute program. Such a program is called a binary
program. The binary program also contains some necessary information about
allocation and relocation. The loader then loads this program into memory for
execution.
[Figure: the Translator outputs the object module, which is given to the Linker; the Linker produces the output.]
The linking process is performed in two passes. Two passes are necessary
because the linker may encounter a forward reference before knowing its
address. So it is necessary to scan all the DEFINITION and USE tables at least
once. The linker then builds the global symbol table with the help of the USE
and DEFINITION tables. The global symbol table contains the name of each
externally referenced symbol along with its address relative to the beginning
of the load module. During pass 2, the addresses of external references are
filled in by obtaining them from the global symbol table.
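The two passes described above can be sketched as follows. This is a minimal illustration, not a real object file format: each module is modelled as a dictionary with hypothetical keys `definitions` (symbol to offset), `uses` (symbol to the offsets that reference it) and `code` (a list of integer words), and modules are assumed to be placed one after another starting at address 0.

```python
def link(modules):
    """Link modules in two passes and return (global symbol table, load module)."""
    # Pass 1: build the global symbol table from every DEFINITION table.
    global_symbols = {}
    base = 0
    for m in modules:
        for symbol, offset in m["definitions"].items():
            global_symbols[symbol] = base + offset   # address relative to load module
        base += len(m["code"])

    # Pass 2: copy the code and patch each external reference (USE table).
    load_module = []
    for m in modules:
        code = list(m["code"])
        for symbol, offsets in m["uses"].items():
            for offset in offsets:
                code[offset] = global_symbols[symbol]
        load_module.extend(code)
    return global_symbols, load_module


# MAIN refers to B before B has been seen: a forward reference.
main = {"definitions": {"MAIN": 0}, "uses": {"B": [1]}, "code": [10, 0, 11]}
sub_b = {"definitions": {"B": 0}, "uses": {}, "code": [20, 21]}

symbols, load_module = link([main, sub_b])
print(symbols)      # {'MAIN': 0, 'B': 3}
print(load_module)  # [10, 3, 11, 20, 21]
```

A single pass would fail on MAIN here, because the address of B is not known until the second module has been scanned.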