System Programming: Second Class

The document discusses different types of loaders and their functions. It describes three main types of loaders: 1) Compile and go loader - assembles and loads code directly into memory without producing object files. Simple but wastes memory and requires recompiling for each run. 2) General loader - produces object files from assembly then loads and links them at runtime. More efficient by not recompiling each time but requires additional loader program in memory. 3) Absolute loader - loads pre-linked and relocated object files directly into specified memory addresses determined by the programmer. Simplest for the loader but requires programmer to manually handle linking between modules.

Uploaded by

Rajat Nagpal
Copyright
© Attribution Non-Commercial (BY-NC)

System Programming

Second Class
Assistant Lecturer: Manal Matloob
CHAPTER FIVE (Loaders and Linkers)
Introduction:
In this chapter we study the concepts of linking and loading. As discussed
earlier, the assembler converts a source program into an object program. The
loader is a program which takes this object program, prepares it for execution,
and loads the resulting executable code into memory.

Definition of Loader:
A loader is a utility program which takes object code as input, prepares it for
execution, and loads the executable code into memory. The loader is thus
responsible for initiating the execution process.

Functions of Loader:
The loader is responsible for four activities: allocation, linking, relocation
and loading.

1) It allocates space for the program in memory by calculating the size of
the program. This activity is called allocation.
2) It resolves the symbolic references (to code or data) between the object
modules by assigning addresses to all user subroutines and library
subroutines. This activity is called linking.
3) Some locations in the program are address dependent. Such address
constants must be adjusted according to the allocated space. This
activity is called relocation.
4) Finally, it places all the machine instructions and data of the
corresponding programs and subroutines into memory. The program is now
ready for execution. This activity is called loading.

Loader Schemes:
Based on these functions, loaders can be classified into several types:

1) “Compile and go” loader: in this type of loader, the source is read line
by line, the machine code for each instruction is obtained, and it is placed
directly in main memory at some known address. That is, the assembler runs in
one part of memory, and the assembled machine instructions and data are put
directly into their assigned memory locations. When assembly is complete,
control is transferred to the starting address of the program. A typical
example is WATFOR-77, a FORTRAN compiler which uses such a “load and go”
scheme. This loading scheme is also called “assemble and go”.

Advantages:
• This scheme is simple to implement, because the assembler sits in one
part of memory and the loader simply places the assembled machine
instructions into memory.
Disadvantages:
• Part of memory is occupied by the assembler, which is wasted space at
execution time. Because the scheme combines assembler and loader
activities, the combined program occupies a large block of memory.
• No .obj file is produced; the source code is converted directly to
executable form. Hence, even if the source program has not been
modified, it must be assembled again for every run, which is time
consuming.
• It cannot handle multiple source programs or programs written in
different languages, because an assembler translates only one source
language into one target language.
• It is very difficult for a programmer to write an orderly modular
program under this scheme and to maintain it, because the “compile and
go” loader cannot handle separately translated modules.
• The total time to run a program is longer, because the program is
assembled every time before it is executed.

2) General Loader Scheme: in this scheme, the source program is converted to
an object program by a translator (assembler). The loader accepts these
object modules and places the machine instructions and data, in executable
form, at their assigned memory locations. The loader occupies some portion
of main memory.

Advantages:
• The program need not be retranslated each time it is run. When the
source program is first translated, an object program is generated.
If the program is not modified, the loader can use this object
program and convert it to executable form.
• Memory is not wasted, because the assembler is not resident in
memory; only the loader occupies some portion of it. The loader is
smaller than the assembler, so more memory is available to the user.
• A program can be built from multiple source programs, even in
multiple languages, because the source programs are always converted
to object programs first, and the loader accepts these object modules
and converts them to executable form.

3) Absolute Loader: an absolute loader is a loader for which already
relocated object files are created; the loader accepts these files and
places them at the specified locations in memory. This type of loader is
called absolute because it needs no relocation information; the addresses
are fixed in advance by the programmer or the assembler. The starting
address of every module is known to the programmer and is stored in the
object file, so the task of the loader becomes very simple: it places the
executable form of the machine instructions at the locations mentioned in
the object file. In this scheme, the programmer or assembler must have
knowledge of memory management. The resolution of external references,
that is, the linking of different subroutines, must be handled by the
programmer. The programmer has to take care of two things. The first is
the specification of the starting address of each module. If a module is
modified, its length may change. This changes the starting addresses of
the modules that immediately follow it, and it is then the programmer's
duty to make the necessary changes to those starting addresses. The second
is that, when branching from one segment to another, the programmer must
know the absolute starting address of the target module so that this
address can be specified in the corresponding JMP instruction. For example
Line number
1       MAIN    START   1000
.       .
.       .
.       .
15              JMP     5000
16              STORE           ; instruction at location 2000
                END
1       SUM     START   5000
2       .
.       .
20              JMP     2000
21              END

In this example there are two segments which are interdependent. At line
number 1 the assembler directive START specifies the physical starting address
to be used during the execution of the first segment, MAIN. At line number 15
the JMP instruction specifies the physical starting address used by the second
segment. The assembler creates the object code for these two segments using
their starting addresses. During execution, the first segment is loaded at
address 1000 and the second segment at address 5000, as specified by the
programmer. Thus the problem of linking is solved manually by the programmer,
who takes care of the mutually dependent addresses. Notice that control is
correctly transferred to address 5000 to invoke the other segment, and that
at line number 20 the JMP instruction transfers control to location 2000,
where the STORE instruction of line number 16 is present. Thus the resolution
of mutual references, that is, linking, is done by the programmer. The task
of the assembler is to create the object code for the above segments along
with information such as the starting address in memory where the object code
is to be placed at execution time. The absolute loader accepts these object
modules from the assembler and, by reading their starting addresses, places
(loads) them in memory at the specified addresses.

The entire process is modeled in the following figure.

Thus the absolute loader is simple to implement. In this scheme:
1) Allocation is done by either the programmer or the assembler
2) Linking is done by the programmer or the assembler
3) Relocation is done by the assembler
4) Only loading is done by the loader
As the name suggests, no relocation information is needed by the loader; if
relocation is required at all, it is done by either the programmer or the
assembler.

Advantages:
1. It is simple to implement.
2. This scheme allows multiple programs, or source programs written in
different languages. If there are multiple programs written in different
languages, the respective language translator converts each one to object
code, and a common object file can be prepared with all the addresses
resolved.
3. The task of the loader becomes simpler, as it simply obeys the instruction
regarding where to place the object code in main memory.
4. The process of execution is efficient.
Disadvantages:
1. In this scheme it is the programmer's duty to adjust all the inter-segment
addresses and to do the linking manually. For that, the programmer must
understand memory management.
If any segment is modified, the starting addresses of the segments that
immediately follow it may change. The programmer has to take care of this
and update the corresponding starting addresses after every modification
to the source.
Algorithm for Absolute Loader
Input: Object codes and starting addresses of the program segments.
Output: Executable code for the corresponding source program, placed in
main memory.
Method:
Begin
  For each program segment do
  Begin
    Read the first line of the object module to obtain the starting
    address S, the memory location where the executable code is to be
    placed.
    Memory_location = S
    Line_counter = 1        ; as it is the first line
    While (not end of file) do
    Begin
      1. Read next line
      2. Write line into location S
      3. S = S + 1
      4. Line_counter = Line_counter + 1
    End
  End
End
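
The algorithm above can be sketched in Python. This is a minimal illustration under stated assumptions, not a definitive implementation: the object-module layout (a first line holding the starting address, followed by one word per line), the dictionary standing in for main memory, and the name absolute_load are all hypothetical choices made for the example.

```python
def absolute_load(object_module, memory):
    """Place each word of an object module at consecutive addresses.

    object_module: list of lines; the first line is the starting address S
    (an assumed layout), the remaining lines are the words to load.
    memory: dict mapping address -> word, standing in for main memory.
    """
    lines = iter(object_module)
    s = int(next(lines))          # first line gives the starting address S
    start = s
    for word in lines:            # read next line until end of module
        memory[s] = word          # write line into location S
        s += 1                    # S = S + 1
    return start


memory = {}
# Two hypothetical segments, echoing MAIN at 1000 and SUM at 5000.
main_start = absolute_load(["1000", "LOAD", "JMP 5000"], memory)
sum_start = absolute_load(["5000", "ADD", "JMP 2000"], memory)
```

The loader never inspects or adjusts the words it copies, which is why all address decisions fall on the programmer or the assembler.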

Subroutine Linkage: To understand the concept of subroutine linkage, first
consider the following scenario:
"In program A a call to subroutine B is made. The subroutine B is not
written in the program segment of A; rather, B is defined in some other
program segment C."
Nothing is wrong with this. But from the assembler's point of view, while
generating the code for the call to B, B is not defined in segment A, so the
assembler cannot find the value of this symbolic reference and will report an
error. To overcome this problem, there should be some mechanism by which the
assembler is explicitly informed that subroutine B is really defined in some
other segment C. Therefore, whenever subroutine B is used in segment A and is
defined in C, B must be declared as an external routine in A. To declare a
subroutine as external, we can use the assembler directive EXT. Thus the
statement EXT B should be added at the beginning of segment A. This informs
the assembler that B is defined somewhere else. Similarly, if a subroutine or
a variable is defined in the current segment and can be referred to by other
segments, it should be declared using the pseudo-op INT. The assembler can
then inform the loader that these are the subroutines or variables used by
other segments. This overall process of establishing the relations between
subroutines is called subroutine linkage.

For example

MAIN    START
        EXT     B
        .
        .
        .
        CALL    B
        .
        .
        END
B       START
        .
        .
        RET
        END
At the beginning of MAIN the subroutine B is declared as external. When a
call to subroutine B is made, before the unconditional jump is taken, the
current content of the program counter is stored on the system stack, which is
maintained internally. Similarly, on returning from subroutine B (at RET), a
pop is performed to restore the program counter of the calling routine to the
address of the next instruction to be executed.
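
The push on CALL and pop on RET can be sketched with a toy interpreter in Python. The instruction format (an opcode and an argument per address), the addresses, and the name run are assumptions made for illustration only; a real machine does this in hardware.

```python
def run(program, entry):
    """Run a toy program; CALL pushes the return address, RET pops it."""
    stack = []                    # system stack holding return addresses
    trace = []                    # addresses executed, for inspection
    pc = entry
    while True:
        op, arg = program[pc]
        trace.append(pc)
        if op == "CALL":
            stack.append(pc + 1)  # save address of the next instruction
            pc = arg              # unconditional jump to the subroutine
        elif op == "RET":
            pc = stack.pop()      # restore the caller's program counter
        elif op == "END":
            return trace
        else:                     # any other instruction falls through
            pc += 1


# Hypothetical layout: MAIN at 0..2, subroutine B at 10..11.
program = {
    0: ("NOP", None),
    1: ("CALL", 10),
    2: ("END", None),
    10: ("NOP", None),
    11: ("RET", None),
}
trace = run(program, 0)
```

The trace shows execution leaving MAIN at the CALL, running B, and resuming at the instruction after the CALL.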

Concept of relocations:
Relocation is the process of updating the addresses used in the address
sensitive instructions of a program. This modification is necessary so that
the program can execute from its designated area of memory.
The assembler generates the object code. This object code is executed after
being loaded at its storage locations. The final addresses of the object code
are fixed only after the assembly process is over. Therefore, after loading,

Address of object code = Assembled address of object code + relocation constant.
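
As a small worked illustration of this formula, the Python sketch below adds a relocation constant to the address-sensitive words of a segment and leaves the other words untouched. The word values, the set of address-sensitive positions, and the name relocate are hypothetical.

```python
def relocate(words, relative_positions, relocation_constant):
    """Add the relocation constant to every address-sensitive word."""
    relocated = list(words)       # leave the assembled words unchanged
    for i in relative_positions:
        relocated[i] += relocation_constant
    return relocated


# Assembled values: positions 0 and 2 hold addresses, position 1 a constant.
words = [50, 5, 55]
relocated = relocate(words, relative_positions={0, 2}, relocation_constant=1000)
```

With a relocation constant of 1000, the address 50 becomes 1050 and 55 becomes 1055, while the constant 5 is not changed.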

Two types of addresses are generated: absolute addresses and relative
addresses. An absolute address can be used directly to map the object code
into main memory, whereas a relative address can be used only after the
relocation constant has been added to it. This adjustment must be made to
relative addresses before the code is actually executed. Typical examples of
relative references are the addresses of symbols defined in the label field,
the addresses of data defined by assembler directives, literals, and
redefinable symbols. Typical examples of absolute values are the constants
generated by the assembler.
The assembler determines which addresses are absolute and which are relative
during the assembly process. During assembly, the assembler calculates
addresses with the help of simple expressions.

For example
LOAD A(X)+5
The expression A(X) means the address of variable X. The instruction loads the
contents of the memory location which is 5 more than the address of variable
X. If the address of X is 50, then this instruction refers to memory location
50+5=55. Since the address of variable X is relative, A(X)+5 is also
relative. Only simple expressions are allowed for calculating relative
addresses; the expression is expected to contain at most addition and
subtraction operations. A simple exercise determines whether a given
expression is absolute or relative. In the expression, replace each absolute
address with 0 and each relative address with 1. The expression is then
transformed into a sum of 0's and 1's. If the resulting value is 0, the
expression is absolute. If the resulting value is 1, the expression is
relative. If the result is anything other than 0 or 1, the expression is
illegal. For example:

Expression    Computation    Relocation attribute
A-B           1-1 = 0        Absolute
A+B-C         1+1-1 = 1      Relative
A-B+5         1-1+0 = 0      Absolute
A+B           1+1 = 2        Illegal

In the above expressions A, B and C are variable names. The assembler has to
consider the relocation attribute and adjust the object code by the relocation
constant. The assembler is then responsible for conveying the information
needed for loading the object code to the loader. Let us now see how the
assembler generates code using relocation information.
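
The 0/1 exercise used in the table can be sketched in Python. This is only an illustration: the function name relocation_attribute, the way relative symbols are passed in as a set, and the restriction to terms joined by + or - are assumptions.

```python
import re


def relocation_attribute(expression, relative_symbols):
    """Classify an expression of symbols and constants joined by + or -."""
    total = 0
    # Each term is an optional sign followed by a symbol or a number.
    for sign, term in re.findall(r"([+-]?)\s*(\w+)", expression):
        weight = 1 if term in relative_symbols else 0   # relative=1, absolute=0
        total += -weight if sign == "-" else weight
    if total == 0:
        return "Absolute"
    if total == 1:
        return "Relative"
    return "Illegal"


relative = {"A", "B", "C"}
results = {e: relocation_attribute(e, relative)
           for e in ["A-B", "A+B-C", "A-B+5", "A+B"]}
```

Applied to the four expressions of the table, the sketch gives the same relocation attributes.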

Direct Linking Loaders


The direct linking loader is the most common type of loader. It is a
relocatable loader. The loader does not have direct access to the source
code. When placing the object code in memory there are two situations: either
the address of the object code is absolute, in which case the code can be
placed directly at the specified location, or the address is relative. If the
address is relative, it is the assembler which informs the loader about the
relative addresses.
The assembler should give the following information to the loader:
1) The length of the object code segment.
2) The list of all the symbols which are not defined in the current
segment but can be used in the current segment.
3) The list of all the symbols which are defined in the current segment
but can be referred to by other segments.
The list of symbols which are not defined in the current segment but can be
used in it is stored in a data structure called the USE table. The USE table
holds information such as the name of the symbol, its address, and its address
relativity.
The list of symbols which are defined in the current segment and can be
referred to by other segments is stored in a data structure called the
DEFINITION table. The DEFINITION table holds information such as the symbol
and its address.
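
One possible shape for this information is sketched below in Python. The class and field names are hypothetical, and the "address" of a USE entry is interpreted here as the location in the segment where the reference occurs, which is an assumption the text does not spell out.

```python
from dataclasses import dataclass, field


@dataclass
class UseEntry:
    name: str          # symbol used here but defined elsewhere
    address: int       # assumed: where in this segment the reference occurs
    relative: bool     # address relativity: True if relative


@dataclass
class DefinitionEntry:
    name: str          # symbol defined here and visible to other segments
    address: int       # its address within this segment


@dataclass
class ObjectSegment:
    name: str
    length: int        # length of the object code segment
    use_table: list = field(default_factory=list)
    definition_table: list = field(default_factory=list)


# A hypothetical segment MAIN that calls an external subroutine B.
main = ObjectSegment("MAIN", length=20)
main.use_table.append(UseEntry("B", address=7, relative=True))
main.definition_table.append(DefinitionEntry("MAIN", address=0))
```

The three pieces of information listed above map onto the length field and the two tables.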

Overlay Structures and Dynamic Loading:

Sometimes a program may require more storage space than is available.
Execution of such a program is still possible if not all of its segments need
to be present in main memory simultaneously. In such situations only those
segments that are actually needed at the time of execution are resident in
memory. But the question arises: what happens if a required segment is not
present in memory? Naturally, execution is delayed until the required segment
has been loaded into memory. The overall effect is that the efficiency of
execution is degraded. Efficiency can be improved by carefully selecting all
the interdependent segments. Of course the assembler cannot do this task;
only the user can specify such dependencies. The interdependency of the
segments can be specified by a tree-like structure called a static overlay
structure. The overlay structure contains a root, nodes and edges. Each node
represents a segment. The amount of memory required by each segment must also
be specified in this structure. Two segments can be in main memory
simultaneously if they are on the same path. Let us take an example to
understand the concept. Various segments along with their memory requirements
are as shown below.
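
Because only segments on the same path need to be resident together, the memory needed by an overlay structure is the size of its largest root-to-leaf path. The Python sketch below computes this. The tree, the segment sizes, and the name overlay_memory are hypothetical, chosen only for illustration; they are not the segments of the text's example.

```python
def overlay_memory(tree, sizes, node):
    """Return the memory needed for the largest root-to-leaf path from node."""
    children = tree.get(node, [])
    deepest = max((overlay_memory(tree, sizes, c) for c in children), default=0)
    return sizes[node] + deepest


# Hypothetical overlay tree and segment sizes in KB.
tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"]}
sizes = {"A": 20, "B": 20, "C": 30, "D": 10, "E": 20, "F": 30}
needed = overlay_memory(tree, sizes, "A")   # largest path: A, C, F
total = sum(sizes.values())                 # all segments resident at once
```

In this made-up case the overlay structure needs 80 KB, against 130 KB if every segment were kept in memory at the same time.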

Automatic Library Search:


Previously, library routines were available in absolute code, but now library
routines are provided in relocatable form, which reduces their size on disk
and in turn improves memory utilization. At execution time certain library
routines may be needed. If the assembler itself keeps track of which library
routines are required and how much storage they need, automatic library search
becomes simpler and more effective. Library routines can also make external
calls to other routines. The idea is to make a list of the calls made by the
routines. If this list is made available to the linker, the linker can
efficiently find the set of required routines and link the references
accordingly.
For an efficient search of library routines it is desirable to store all the
calling routines first and then the called routines. This avoids the time
wasted in winding and rewinding. For efficient automated search, a dictionary
of library routines can also be maintained. A table containing the names of
the library routines and the addresses where they are actually located in
relocatable form is prepared with the help of the translator and submitted to
the linker. Such a table is called a subroutine directory. If these routines
make any external calls, that information is also given in the subroutine
directory. The linker searches the subroutine directory and finds the address
of the desired library routine (the address where the routine is stored in
relocatable form). The linker then prepares a load module by appending the
necessary library routines to the user program and doing the necessary
relocation. If a library routine contains external calls, the linker searches
the subroutine directory, finds the addresses of the called routines, and
prepares the load module by resolving the external references.
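
The search through the subroutine directory, including the external calls made by library routines themselves, can be sketched in Python as follows. The directory layout (a routine name mapped to an address and a list of routines it calls), the routine names, and the function name routines_to_link are assumptions for the example.

```python
def routines_to_link(external_calls, directory):
    """Collect every library routine reachable from the program's calls.

    directory maps a routine name to (address, list of routines it calls).
    """
    needed = []
    pending = list(external_calls)
    while pending:
        name = pending.pop(0)
        if name in needed:
            continue                      # already scheduled for linking
        if name not in directory:
            raise KeyError(f"unresolved external reference: {name}")
        needed.append(name)
        _address, calls = directory[name]
        pending.extend(calls)             # follow the routine's own external calls
    return needed


# Hypothetical subroutine directory.
directory = {
    "SQRT": (4000, ["ABS"]),
    "ABS": (4100, []),
    "PRINT": (4200, ["ABS"]),
    "SIN": (4300, []),
}
needed = routines_to_link(["SQRT", "PRINT"], directory)
```

Here ABS is linked once even though two routines call it, and SIN is left out because nothing refers to it.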

Linkage Editor:
The execution of any program needs four basic functions: allocation,
relocation, linking and loading. As we have seen for the direct linking
loader, all four functions have to be performed each time a program is
executed. Performing all of them each time is costly in both time and space.
Moreover, if the program contains many subroutines or functions and needs to
be executed repeatedly, this activity becomes annoyingly complex. Actually,
there is no need to redo all four activities each time. Instead, if the
results of some of these activities are stored in a file, that file can be
used by the remaining activities, and repeating allocation, relocation,
linking and loading on every run can be avoided. The idea is to separate these
activities into groups. Dividing the four essential functions into groups
reduces the overall time taken by the loading process.
The program which performs allocation, relocation and linking is called a
binder. The binder performs relocation, creates linked executable text, and
stores this text in a file in some systematic manner. The module prepared by
the binder is called a load module. This load module can then be loaded into
main memory by the loader, which is also called a module loader. If the binder
can produce an exact replica of the executable code in the load module, the
module loader simply loads this file into main memory, which reduces the
overall time taken. But for this the binder must know the current state of
main memory, and even that is not sufficient. In a multiprogramming
environment, the region of main memory available for loading the program is
decided by the host operating system. The binder would also have to know
which memory area is allocated to the program being loaded and modify the
relocation information accordingly. A binder which performs the linking
function, produces adequate information about allocation and relocation, and
writes this information along with the program code to a file is called a
linkage editor. The module loader then accepts this file as input, reads the
allocation and relocation information stored in it, and based on this
information performs the task of loading into main memory.
Even though the program is executed repeatedly, the linking is done only
once. Moreover, the flexibility of allocation and relocation helps efficient
utilization of main memory.

Dynamic linking:

As we have seen in the overlay structure, only selected subroutines need to be
resident in memory. That is, it is not necessary to keep all the subroutines
in memory all the time. Only the necessary routines need be present in main
memory, and during execution the required subroutines can be loaded into
memory. This process of postponing the linking and loading of external
references until execution is called dynamic linking.
For example, suppose the subroutine main calls A, B, C and D. It is not
desirable to load A, B, C and D into memory along with main. Whether A, B, C
or D is actually called by main will be known only at the time of execution.
Hence keeping these routines in memory beforehand is not really needed. If
they are loaded in advance, the linking of all the subroutines has to be
performed and the code of all the subroutines remains resident in main memory,
so memory is occupied unnecessarily. Typically, error routines are routines
which are invoked rarely, so their loading can be postponed until execution.
If the linking and loading of such rarely invoked external references is
postponed until the point during execution at which it is found to be
absolutely necessary, the overhead of the loader is reduced.
In dynamic linking, the binder first prepares a load module in which the
allocation and relocation information is stored along with the program code.
The loader simply loads the main module into main memory. When a reference to
an external subroutine is encountered, execution is suspended for a while, the
loader brings the required subroutine into main memory, and then execution is
resumed. Thus in dynamic linking both loading and linking are done
dynamically.

Advantages
1. The overhead on the loader is reduced. A required subroutine is loaded
into main memory only at the time of execution.
2. The system can be dynamically reconfigured.
Disadvantages
Linking and loading are postponed until execution. If a subroutine is needed
during execution, execution must be suspended until the required subroutine
has been loaded into main memory.
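
The load-on-first-call behaviour can be sketched in Python. The class name DynamicLoader, the dictionary standing in for secondary storage, and the routines A, B and ERROR are hypothetical; the sketch models only the bookkeeping, not real address binding.

```python
class DynamicLoader:
    """Loads a subroutine only when it is first called."""

    def __init__(self, library):
        self.library = library    # routines available on secondary storage
        self.resident = {}        # routines currently in main memory
        self.load_log = []        # order in which routines were loaded

    def call(self, name, *args):
        if name not in self.resident:
            # Execution is suspended here while the routine is brought in.
            self.resident[name] = self.library[name]
            self.load_log.append(name)
        return self.resident[name](*args)


library = {
    "A": lambda x: x + 1,
    "B": lambda x: x * 2,
    "ERROR": lambda msg: "error: " + msg,
}
loader = DynamicLoader(library)
result = loader.call("A", 1) + loader.call("A", 2)
```

Routine A is loaded once and reused on the second call, while B and the rarely needed ERROR routine never occupy memory in this run.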

Bootstrap Loader:
When we turn on the computer there is nothing meaningful in main memory
(RAM). A small program is written and stored in ROM. This program initially
loads the operating system from secondary storage into main memory. The
operating system then takes overall control. This program, which is
responsible for booting up the system, is called the bootstrap loader. It is
the program which must be executed first when the system is powered on. If
the program starts at location x, then to execute it the program counter of
the machine must be loaded with the value x. The task of setting the initial
value of the program counter is therefore done by the machine hardware. The
bootstrap loader is a very small program which has to fit in ROM. Its task is
to load the necessary portion of the operating system into main memory. The
initial address at which the bootstrap loader is placed is generally the
lowest location (possibly location 0) or the highest location.
Concept of Linking:
As we have discussed earlier, a program is executed with the help of the
following steps:
1. Translation of the program (done by an assembler or compiler).
2. Linking of the program with all the other programs which are needed for
execution. This also involves the preparation of a program called the load
module.
3. Loading of the load module prepared by the linker at some specified memory
location.
The output of the translator is a program called an object module. The linker
processes these object modules, binds them with the necessary library
routines, and prepares a ready-to-execute program. Such a program is called a
binary program. The binary program also contains some necessary information
about allocation and relocation. The loader then loads this program into
memory for execution.

[Figure: Process of linking a program. The output of the translator is given
to the linker; the output of the linker is given to the loader, which loads
it in memory.]

Various tasks of the linker are:
1. Prepare a single load module and adjust all the addresses and
subroutine references with respect to the offset location.
2. To prepare the load module, concatenate all the object modules and
adjust all the operand address references as well as the external
references to the offset location.
3. Copy the binary machine instructions and constant data to the correct
locations in the load module in order to prepare a ready-to-execute
module.

The linking process is performed in two passes. Two passes are necessary
because the linker may encounter a forward reference before knowing its
address. So it is necessary to scan all the DEFINITION and USE tables at least
once. In pass 1 the linker builds the global symbol table with the help of the
USE and DEFINITION tables. The global symbol table contains the name of each
externally referenced symbol along with its address relative to the beginning
of the load module. During pass 2, the external references are replaced by
the addresses obtained from the global symbol table.
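
The two passes can be sketched in Python as follows. The module format (dictionaries with 'length', 'definitions', 'uses' and 'code' keys), the use of None as a placeholder for an unresolved reference, and the name link are assumptions made for this example, not a real object-file format.

```python
def link(modules):
    """Two-pass link of modules given as dicts with keys 'name', 'length',
    'definitions' {symbol: offset}, 'uses' {symbol: [offsets]} and
    'code' (list of words)."""
    # Pass 1: give each module an offset and build the global symbol table.
    global_symbols, base = {}, 0
    for m in modules:
        for symbol, offset in m["definitions"].items():
            global_symbols[symbol] = base + offset
        base += m["length"]

    # Pass 2: concatenate the code and patch the external references.
    load_module = []
    for m in modules:
        code = list(m["code"])
        for symbol, places in m["uses"].items():
            for place in places:
                code[place] = global_symbols[symbol]
        load_module.extend(code)
    return load_module, global_symbols


# MAIN refers to B, which is defined in a later module (a forward reference).
main = {"name": "MAIN", "length": 3, "definitions": {"MAIN": 0},
        "uses": {"B": [1]}, "code": ["CALL", None, "END"]}
sub = {"name": "B", "length": 2, "definitions": {"B": 0},
       "uses": {}, "code": ["NOP", "RET"]}
load_module, symbols = link([main, sub])
```

MAIN is processed before B is seen, so the address of B can be filled in only after pass 1 has scanned every DEFINITION table.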
