System Programming
Loaders and Linkers
Introduction:
In this chapter we will understand the concepts of linking and loading. As discussed earlier, the source program is converted to an object program by the assembler. The loader is a program which takes this object program, prepares it for execution, and loads the resulting executable code into memory for execution.
Definition of Loader:
A loader is a utility program which takes object code as input, prepares it for execution, and loads the executable code into memory. Thus the loader is actually responsible for initiating the execution process.
Functions of Loader:
The loader is responsible for activities such as allocation, linking, relocation and loading.
1) It allocates space for the program in memory by calculating the size of the program. This activity is called allocation.
2) It resolves the symbolic references between the object modules by filling in the addresses of the referenced subroutines and variables. This activity is called linking.
3) It adjusts all the address-sensitive locations so that they correspond to the allocated space. This activity is called relocation.
4) It physically places the machine instructions and data into their assigned memory locations. This activity is called loading.
Loader Schemes:
Based on the various functionalities of loader, there are various types of
loaders:
1) “Compile and go” loader: in this type of loader, each instruction is read, its machine code is obtained, and it is directly put into main memory at some known address. That means the assembler runs in one part of memory, and the assembled machine instructions and data are placed directly into their assigned locations in another part of memory.
Advantages:
• This scheme is simple to implement, because the assembler is placed in one part of memory and the loader simply loads the assembled machine instructions into memory.
Disadvantages:
• In this scheme some portion of memory is occupied by the assembler, which is simply a waste of memory. As this scheme combines assembler and loader activities, the combined program occupies a large block of memory.
• There is no production of an .obj file; the source code is directly converted to executable form. Hence, even if there is no modification in the source program, it needs to be assembled and executed each time, which becomes a time consuming activity.
• It cannot handle multiple source programs or multiple programs written in different languages. This is because the assembler can translate only one source language to its target language.
• For a programmer it is very difficult to write an orderly, modular program, and it also becomes difficult to maintain such a program; the “compile and go” loader cannot handle such programs.
• The execution time will be longer in this scheme, as the program is assembled each time before it is executed.
General Loader Scheme:
Advantages:
• The program need not be retranslated each time it is run. This is because when the source program is first translated, an object program gets generated. If the program is not modified, then the loader can make use of this object program to convert it to executable form.
• There is no wastage of memory, because the assembler is not placed in memory; instead, the loader occupies some portion of the memory, and the size of the loader is smaller than that of the assembler, so more memory is available to the user.
• It is possible to write the source program as multiple programs and in multiple languages, because the source programs are always first converted to object programs, and the loader accepts these object modules and converts them to executable form.
Absolute Loader:
Advantages:
1. It is simple to implement
2. This scheme allows multiple programs, or source programs written in different languages. If there are multiple programs written in different languages, then the respective language assemblers convert each to its object code, and a common object file can be prepared with all the address resolution.
3. The task of loader becomes simpler as it simply obeys the instruction
regarding where to place the object code in the main memory.
4. The process of execution is efficient.
Disadvantages:
1. In this scheme it is the programmer's duty to adjust all the inter-segment addresses and manually do the linking activity. For that, it is necessary for the programmer to know how the memory is managed.
2. If any modification is made to some segments, the starting addresses of the immediately following segments may change; the programmer has to take care of this issue and update the corresponding starting addresses on any modification of the source.
Algorithm for Absolute Loader
Input: Object codes and starting addresses of the program segments.
Output: An executable code for the corresponding source program. This executable code is to be placed in the main memory.
Method:
Begin
For each program segment do
Begin
Read the first line from the object module to obtain information about the memory location. The starting address, say S, in the corresponding object module is the memory location where the executable code is to be placed.
Hence
Memory_location = S
Line_counter = 1; as it is the first line
While (! end of file)
For the current object code do
Begin
1. Read the next line
2. Write the line into location S
3. S = S + 1
4. Line_counter = Line_counter + 1
End
End
End
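The following is a minimal C sketch of the algorithm above, using a simulated memory array. The object-module format assumed here (a hexadecimal starting address followed by one code word per line) and the file name segment.obj are simplifying assumptions, not a real object-file layout.

/* Minimal sketch of an absolute loader using a simulated memory array. */
#include <stdio.h>

#define MEM_SIZE 4096

static unsigned int memory[MEM_SIZE];      /* simulated main memory */

/* Load one object module: the first line holds the starting address S,
   the remaining lines hold the object-code words, one per line. */
static int load_segment(FILE *obj)
{
    unsigned int S, word;
    if (fscanf(obj, "%x", &S) != 1)        /* read the starting address */
        return -1;
    while (fscanf(obj, "%x", &word) == 1 && S < MEM_SIZE) {
        memory[S] = word;                  /* write the line into location S */
        S = S + 1;                         /* advance to the next location   */
    }
    return 0;
}

int main(void)
{
    FILE *obj = fopen("segment.obj", "r"); /* hypothetical object file */
    if (obj == NULL)
        return 1;
    load_segment(obj);
    fclose(obj);
    return 0;
}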
Subroutine Linkage:
A program segment may call subroutines or refer to variables that are defined in other segments. The assembler could inform the loader that these are the subroutines or variables used by other segments. This overall process of establishing the relations between the subroutines can be conceptually called subroutine linkage.
For example
MAIN START
EXT B
.
.
.
CALL B
.
.
END
B START
.
.
RET
END
At the beginning of MAIN the subroutine B is declared as external. When a call to subroutine B is made, before making the unconditional jump, the current content of the program counter is stored on the system stack maintained internally. Similarly, while returning from subroutine B (at RET), a pop is performed to restore the program counter of the caller routine with the address of the next instruction to be executed.
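As a rough illustration of this mechanism, the following C sketch simulates CALL and RET with an explicit return-address stack. The instruction addresses 100 and 500 are hypothetical, and real hardware manages the program counter and stack itself.

/* Simulate CALL/RET: CALL pushes the saved program counter, RET pops it. */
#include <stdio.h>

#define STACK_DEPTH 32

static unsigned int call_stack[STACK_DEPTH];   /* simulated system stack */
static int top = -1;

static void call(unsigned int *pc, unsigned int target)
{
    call_stack[++top] = *pc + 1;   /* save address of the next instruction */
    *pc = target;                  /* unconditional jump to the subroutine */
}

static void ret(unsigned int *pc)
{
    *pc = call_stack[top--];       /* restore the caller's program counter */
}

int main(void)
{
    unsigned int pc = 100;         /* MAIN executes CALL B at location 100 */
    call(&pc, 500);                /* subroutine B assumed to start at 500 */
    printf("in B at %u\n", pc);
    ret(&pc);
    printf("back in MAIN at %u\n", pc);
    return 0;
}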
Concept of relocations:
Relocation is the process of updating the addresses used in the address-sensitive instructions of a program. Such modification is necessary so that the program can execute correctly from the designated area of memory.
The assembler generates the object code. This object code gets executed after being loaded at its storage locations. The actual addresses of such object code get fixed only after loading. Therefore, after loading,
address of object code = assembled address of object code + relocation constant.
There are two types of addresses being generated: absolute addresses and relative addresses. An absolute address can be used directly to map the object code into main memory, whereas a relative address can be used only after the addition of a relocation constant to the object code address. This kind of adjustment needs to be done for relative addresses before actual execution of the code. Typical examples of relative references are: addresses of symbols defined in the label field, addresses of data defined by assembler directives, literals, and redefinable symbols.
In the above expression, A, B and C are variable names. The assembler has to consider the relocation attribute and adjust the object code by the relocation constant. The assembler is then responsible for conveying this relocation information, along with the object code, to the loader. Let us now see how the assembler generates code using relocation information.
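Since the worked example is not reproduced here, the following C sketch only illustrates the loader side of the process, under simplifying assumptions: each object-code word carries a flag saying whether it is address-sensitive (relative), and the loader adds the relocation constant to every flagged word.

/* Apply a relocation constant to the address-sensitive (relative) words. */
#include <stdio.h>

struct obj_word {
    unsigned int value;     /* assembled instruction or data word        */
    int          relative;  /* 1 = relative reference, needs relocation  */
};

/* actual address = assembled address + relocation constant */
static void relocate(struct obj_word code[], int n, unsigned int reloc_const)
{
    for (int i = 0; i < n; i++)
        if (code[i].relative)
            code[i].value += reloc_const;
}

int main(void)
{
    struct obj_word code[] = {
        { 0x0010, 1 },      /* relative reference, e.g. a label address  */
        { 0x00FF, 0 },      /* absolute constant, left unchanged         */
    };
    relocate(code, 2, 0x2000);               /* program loaded at 0x2000 */
    printf("%04X %04X\n", code[0].value, code[1].value);
    return 0;
}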
Direct Linking Loaders
The direct linking loader is the most common type of loader. This type of
loader is a relocatable loader. The loader can not have the direct access to
the source code. And to place the object code in the memory there are
two situations: either the address of the object code could be absolute
which then can be directly placed at the specified location or the address
can be relative. If at all the address is relative then it is the assembler who
informs the loader about the relative addresses.
The assembler should give the following information to the loader:
1) The length of the object code segment.
2) The list of all the symbols which are not defined in the current segment but are used in the current segment.
3) The list of all the symbols which are defined in the current segment but may be referred to by other segments.
The list of symbols which are not defined in the current segment but are used in the current segment is stored in a data structure called the USE table. The USE table holds information such as the name of the symbol, its address, and its address relativity.
The list of symbols which are defined in the current segment and may be referred to by other segments is stored in a data structure called the DEFINITION table. The definition table holds information such as the symbol and its address.
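A minimal C sketch of how these two tables might be represented is given below; the field widths, table sizes and field names are illustrative assumptions rather than a prescribed format.

/* Illustrative layouts for the USE and DEFINITION tables of one segment. */
#define NAME_LEN  8
#define MAX_ENTRY 64

/* Symbols used in this segment but defined elsewhere. */
struct use_entry {
    char name[NAME_LEN + 1];   /* symbol name                           */
    unsigned int address;      /* where the symbol is referenced        */
    int relative;              /* address relativity: 1 = relative      */
};

/* Symbols defined in this segment but referred to by other segments. */
struct def_entry {
    char name[NAME_LEN + 1];   /* symbol name                           */
    unsigned int address;      /* address assigned within this segment  */
};

struct segment_tables {
    struct use_entry use_table[MAX_ENTRY];
    int use_count;
    struct def_entry def_table[MAX_ENTRY];
    int def_count;
};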
Overlay Structures and Dynamic Loading:
Sometimes a program may require more storage space than the available
one Execution of such program can be possible if all the segments are not
required simultaneously to be present in the main memory. In such
situations only those segments are resident in the memory that are
actually needed at the time of execution But the question arises what will
happen if the required segment is not present in the memory? Naturally
the execution process will be delayed until the required segment gets
loaded in the memory. The overall effect of this is efficiency of execution
process gets degraded. The efficiency can then be improved by carefully
selecting all the interdependent segments. Of course the assembler can
not do this task. Only the user can specify such dependencies. The inter
dependency of the segments can be specified by a tree like structure
called static overlay structures. The overlay structure contain multiple
root/nodes and edges. Each node represents the segment. The
specification of required amount of memory is also essential in this
structure. The two segments can lie simultaneously in the main memory if
they are on the same path. Let us take an example to understand the
concept. Various segments along with their memory requirements is as
shown below.
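Because the figure with the segment sizes is not reproduced here, the following C sketch uses hypothetical segments A, B and C to show how the memory requirement of a static overlay structure can be computed: segments on the same root-to-leaf path must fit in memory together, so the requirement is the heaviest such path.

/* Compute the memory needed for a static overlay tree (hypothetical sizes). */
#include <stdio.h>

#define MAX_CHILD 4

struct segment {
    const char *name;
    unsigned int size;                 /* memory requirement in KB          */
    struct segment *child[MAX_CHILD];  /* segments overlaid below this one  */
    int nchild;
};

/* Memory needed = this segment's size + the largest need among its subtrees. */
static unsigned int overlay_memory(const struct segment *s)
{
    unsigned int worst = 0;
    for (int i = 0; i < s->nchild; i++) {
        unsigned int need = overlay_memory(s->child[i]);
        if (need > worst)
            worst = need;
    }
    return s->size + worst;
}

int main(void)
{
    struct segment b = { "B", 20, { 0 }, 0 };
    struct segment c = { "C", 30, { 0 }, 0 };
    struct segment root = { "A", 40, { &b, &c }, 2 };   /* A calls B or C   */
    printf("memory required: %u KB\n", overlay_memory(&root));   /* 70 KB   */
    return 0;
}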
Linkers:
If an object module contains external calls, then the linker searches the subroutine directory, finds the addresses of such external calls, and prepares the load module by resolving the external references.
Linkage Editor:
The execution of any program
needs four basic functionalities, and those are allocation, relocation, linking and loading. As we have seen for the direct linking loader, these four functionalities need to be performed each time a program is executed. But performing all of them every time is a time and space consuming task. Moreover, if the program contains many subroutines or functions and needs to be executed repeatedly, this activity becomes annoyingly complex: each time the program is executed, allocation, relocation, linking and loading need to be done, and doing these activities each time increases the time and space complexity. Actually, there is no need to redo all four activities each time. Instead, if the results of some of these activities are stored in a file, then that file can be used by the other activities, and repeating allocation, relocation, linking and loading each time can be avoided. The idea is to separate these activities into groups; dividing the essential four functions into groups reduces the overall time complexity of the loading process.
The program which performs allocation, relocation and linking is called a binder. The binder performs relocation, creates the linked executable text and stores this text in a file in some systematic manner. The module prepared by the binder for execution is called a load module. This load module can then be actually loaded into main memory by the loader, which in this case is also called a module loader. If the binder can produce an exact replica of the executable code in the load module, then the module loader simply loads this file into main memory, which ultimately reduces the overall time complexity. But in this process the binder must know the actual positions in main memory. Even if the binder knows the main memory locations, that alone is not sufficient: in a multiprogramming environment, the region of main memory available for loading the program is decided by the host operating system, so the binder should also know which memory area is allocated to the program being loaded and should modify the relocation information accordingly.
The binder which performs the linking function, produces adequate information about allocation and relocation, and writes this information along with the program code into a file is called a linkage editor. The module loader then accepts this file as input, reads the information stored in it, and based on this information about allocation and relocation performs the task of loading into main memory. Even though the program is repeatedly executed, the linking is done only once. Moreover, the flexibility of allocation and relocation helps efficient utilization of the main memory.
(Operating System)
Introduction:
1. Program execution: The operating system loads a program into memory and runs it. The program must be able to end its execution, either normally or abnormally.
2. I/O operation: I/O means any file or any specific I/O device. A program may require an I/O device while running, so the operating system must provide the required I/O.
3. File system manipulation: Program needs to read a file or write a file.
The operating system gives the permission to the program for operation
on file.
4. Communication: Data transfer between two processes is sometimes required. The two processes may be on the same computer or on different computers connected through a computer network. Communication may be implemented by two methods: shared memory and message passing (see the sketch after this list).
5. Error detection: Errors may occur in the CPU, in I/O devices or in the memory hardware. The operating system constantly needs to be aware of possible errors, and it should take appropriate action to ensure correct and consistent computing.
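As a minimal sketch of the message-passing method (assuming a POSIX environment), the following C program creates a pipe and passes one message from a child process to its parent; shared memory would be the other method mentioned above.

/* Message passing between two processes over a pipe (POSIX assumed). */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fd[2];
    char buf[32];

    if (pipe(fd) == -1)                      /* create the communication channel */
        return 1;
    if (fork() == 0) {                       /* child process: the sender        */
        const char *msg = "hello";
        write(fd[1], msg, strlen(msg) + 1);  /* send a message to the parent     */
        _exit(0);
    }
    read(fd[0], buf, sizeof buf);            /* parent process: the receiver     */
    printf("received: %s\n", buf);
    return 0;
}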
An operating system with multiple users provides the following services:
1. Resource allocation
2. Accounting
3. Protection
Batch Systems
• Some computer systems only did one thing at a time. They had a list of
instructions to carry out and these would be carried out one after the
other. This is called a serial system. The mechanics of development and preparation of programs in such environments are quite slow, and numerous manual operations are involved in the process.
• A batch operating system is one where programs and data are collected together in a batch before processing starts. A job is a predefined sequence of commands, programs and data that are combined into a single unit.
• Figure below shows the memory layout for a simple batch system.
Spooling:
• Acronym for simultaneous peripheral operations on line. Spooling
refers to putting jobs in a buffer, a special area in memory or on a disk
where a device can access them when it is ready.
• Spooling is useful because devices access data at different rates. The buffer provides a waiting station where data can rest while the slower device catches up.
• One difficulty with simple batch systems is that the computer still needs
to read the deck of cards before it can begin to execute the job. This
means that the CPU is idle during these relatively slow operations.
• Spooling batch systems were the first and are the simplest of the
multiprogramming systems.
Advantages of Spooling:
1. The spooling operation uses a disk as a very large buffer.
2. Spooling is however capable of overlapping I/O operation for one job
with processor operations for another job.
Multiprogramming:
• The operating system keeps several jobs in memory at a time. This set of jobs is a subset of the jobs kept in the job pool. The operating system picks and begins to execute one of the jobs in memory.
• Multiprogrammed systems provide an environment in which the
various system resources are utilized effectively, but they do not provide
for user interaction with the computer system.
• Jobs entering the system are kept in memory. The operating system picks a job and begins to execute one of the jobs in memory. Having several programs in memory at the same time requires some form of memory management.
• A multiprogramming operating system monitors the state of all active programs and system resources. This ensures that the CPU is never idle unless there are no jobs to execute.
Advantages
1. High CPU utilization.
2. It appears that many programs are allotted CPU almost simultaneously.
Disadvantages
1. CPU scheduling is required.
2. To accommodate many jobs in memory, memory management is
required.
Desktop Systems: During the late 1970s, computers had faster CPUs, creating an even greater disparity between their rapid processing speed and slower I/O access times. Multiprogramming schemes to increase CPU use were limited by the physical capacity of main memory, which was a limited and very expensive resource. These systems include PCs running Microsoft Windows and the Apple Macintosh. The Apple Macintosh OS supports newer, more advanced hardware, i.e. virtual memory and multitasking. With virtual memory, the entire program does not need to reside in memory before execution can begin.
• Linux, a UNIX-like OS available for the PC, has also become popular recently. The microcomputer was developed for single users in the late 1970s. Its physical size was smaller than the minicomputers of that time, though larger than the microcomputers of today.
• Microcomputers grew to accommodate software of larger capacity and greater speed. The distinguishing characteristic of a microcomputer is its single-user status. MS-DOS is an example of a microcomputer operating system.
• The most powerful microcomputers are used by commercial, educational and government enterprises. Hardware costs for microcomputers are sufficiently low that a single user (individual) may have sole use of a computer. Networking capability has been integrated into almost every system.
Multiprocessor System:
• Multiprocessor systems have more than one processor in close communication. The processors share the computer bus, the system clock, the input-output devices and sometimes memory. In a multiprocessing system, it is possible for two processes to run in parallel.
• Multiprocessor systems are of two types: symmetric multiprocessing and asymmetric multiprocessing.
• In symmetric multiprocessing, each processor runs an identical copy of the operating system and the processors communicate with one another as needed. All the CPUs share a common memory. In asymmetric multiprocessing, each processor is assigned a specific task, and a master processor controls the system.
Cluster System:
• It is a group of computer systems connected by a high-speed communication link. Each computer system has its own memory and peripheral devices. Clustering is usually performed to provide high availability. Clustered systems are integrated with a hardware cluster and a software cluster. A hardware cluster means sharing of high performance disks; a software cluster takes the form of unified control of the computer systems in a cluster.
• A layer of cluster software runs on the cluster nodes. Each node can monitor one or more of the others. If the monitored machine fails, the monitoring machine can take ownership of its storage and restart the applications that were running on the failed machine.
• Clustered system can be categorized into two groups: asymmetric
clustering and symmetric clustering.
• In asymmetric clustering, one machine is in hot standby mode while the other is running the applications. The machine in hot standby mode monitors the active server and becomes the active server when the original server fails.
• In symmetric clustering mode, two or more than two hosts are running
applications and they are monitoring each other.
• Parallel clusters and clustering over a WAN are also available.
Parallel clusters allow multiple hosts to access the same data on the
shared storage. A cluster provides all the key advantages of distributed
systems. A cluster provides better reliability than the symmetrical
multiprocessor system.
Real Time Systems:
• A hard real time system guarantees that critical tasks are completed on time. This goal requires that all delays in the system be bounded. A soft real time system is a less restrictive type: a critical real time task gets priority over other tasks, and retains that priority until it completes.
• A real time operating system uses a priority scheduling algorithm to meet the response requirements of a real time application.
• General real time applications with some examples are listed below.
Handheld System:
• Personal Digital Assistants (PDAs) are one type of handheld system. Developing such devices is a complex job and developers face many challenges. These systems are small, i.e. about 5 inches in height and 3 inches in width.
• Due to their limited size, most handheld devices have a small amount of memory, slow processors and small display screens. The memory of a handheld system is in the range of 512 KB to 8 MB. The operating system and
applications must manage memory efficiently. This includes returning all
allocated memory back to the memory manager once the memory is no
longer needed. Developers must work within the confines of limited physical memory because many handheld devices do not use virtual memory.
• The speed of a handheld system is a major factor. Faster processors would be desirable, but processors for most handheld devices run at only a fraction of the speed of a processor in a PC, because faster processors require more power and therefore larger batteries.
• To keep the size of handheld devices to a minimum, smaller, slower processors that consume less power are used. Typically, a small display screen is available in these devices; the display of a handheld device is not more than 3 inches square.
• At the same time, the display size of a monitor is up to 21 inches. But these handheld devices provide facilities for reading email and browsing web pages on the smaller display. Web clipping is used for displaying web pages on handheld devices.
Computing Environments:
• Different types of computing environments are:
a. Traditional computing
b. Web based computing
c. Embedded computing
1. Batch: Jobs with similar needs are batched together and run through
the computer as a group by an operator or automatic job sequencer.
Performance is increased by attempting to keep CPU and I/O devices
busy at all times through buffering, off line operation, spooling and
multiprogramming. A batch system is good for executing large jobs that need little interaction; jobs can be submitted and picked up later.
4. Real time system: Real time systems are usually dedicated, embedded
systems. They typically read from and react to sensor data. The system
must guarantee response to events within fixed periods of time to ensure
correct performance.
Process Management
• A process is a program in execution. The process abstraction is a fundamental operating system mechanism for the management of concurrent program execution. When asked to run a program, the operating system responds by creating a process.
• A process needs certain resources, such as CPU time, memory, files and
I/O devices. These resources are either given to the process when it is
created or allocated to it while it is running.
• When the process terminates, the operating system will reclaim any
reusable resources.
• The term process refers to an executing set of machine instructions.
Program by itself is not a process. A program is a passive entity.
• The operating system is responsible for the following activities of the
process management.
1. Creating and destroying the user and system processes.
2. Allocating hardware resources among the processes.
3. Controlling the progress of processes.
File Management
• Logically related data items on the secondary storage are usually
organized into named collections called files. In short, file is a logical
collection of information. Computer uses physical media for storing the
different information.
• A file may contain a report, an executable program or a set of
commands to the operating system. A file consists of a sequence of bits,
bytes, lines or records whose meanings are defined by their creators. For
storing the files, physical media (secondary storage device) is used.
• Physical media are of different types: magnetic disk, magnetic tape and optical disk. Each medium has its own characteristics and physical organization, and each medium is controlled by a device.
• The operating system is responsible for the following in connection
with file management.
Protection System:
• Modern computer systems support many users and allow the concurrent
execution of multiple processes. Organizations rely on computers to store
information. It is necessary that the information and devices be protected from unauthorized users or processes. Protection is any mechanism for controlling the access of programs, processes or users to the resources defined by a computer system.
• Protection mechanisms are implemented in operating systems to
support various security policies. The goal of the security system is to
authenticate subjects and to authorise their access to any object.
• Protection can improve reliability by detecting latent errors at the interfaces between component subsystems. Protection domains are extensions of the hardware supervisor-mode ability.
A) Resource allocation:
If more than one user or job is running at the same time, then resources must be allocated to each of them. The operating system manages
different types of resources. Some resources require special allocation
code, i.e., main memory, CPU cycles and file storage.
• There are some resources which require only general request and
release code. For allocating CPU, CPU scheduling algorithms are used
for better utilization of CPU. CPU scheduling routines consider the speed
of the CPU, number of available registers and other required factors.
B) Accounting:
• Logs of each user must be kept. It is also necessary to keep record of
which user uses how much and what kinds of computer resources. This
log is used for accounting purposes.
• The accounting data may be used for statistics or for billing. It is also used to improve system efficiency.
C) Protection:
• Protection involves ensuring that all access to system resources is
controlled.
Security starts with each user having to authenticate to the system,
usually by means of a password. External I/O devices must also be protected from invalid access attempts.
• In protection, all access to the resources is controlled. In a multiprocess environment it is possible for one process to interfere with another, or with the operating system, so protection is required.
System Calls:
• Modern processors provide instructions that can be used as system calls. System calls provide the interface between a process and the operating system. A system call instruction is an instruction that generates an interrupt, causing the operating system to gain control of the processor. System calls can be grouped roughly into the following categories (a minimal example follows the list):
1. File management
2. Interprocess communication
3. Process management
4. I/O device management
5. Information maintenance.
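A minimal C sketch (assuming a POSIX system) showing two of these categories in use: getpid() for information maintenance and write() for I/O, both of which trap into the operating system on behalf of the process.

/* Two system calls: getpid() (information maintenance) and write() (I/O). */
#include <stdio.h>      /* snprintf()          */
#include <unistd.h>     /* getpid(), write()   */

int main(void)
{
    char msg[64];
    /* Information maintenance: ask the kernel for this process's id. */
    int n = snprintf(msg, sizeof msg, "running as process %ld\n", (long)getpid());
    /* I/O operation: ask the kernel to write to file descriptor 1 (stdout). */
    write(1, msg, (size_t)n);
    return 0;
}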
Hardware Protection:
• In early single-user operating systems, the programmer had complete control over the system and operated it from the console. As new operating systems were developed with additional features, system control transferred from the programmer to the operating system.
• Early operating systems were called resident monitors, and starting with the resident monitor, the operating system began to perform many of these functions, such as input-output operations.
• Before resident monitors, the programmer was responsible for controlling input-output device operations. The requirements that programmers placed on computer systems kept increasing, and developments in the field of communication also helped operating systems evolve.
• Sharing of resources among different programmers became possible without increasing cost. It improves system utilization, but the problems increase as well. If a single system were used without sharing and an error occurred, that error could cause problems only for the one program which was running on that machine.
• With sharing, other programs are also affected by a single program. For example, a batch operating system faces the problem of an infinite loop; such a loop could prevent the correct operation of many jobs. In a multiprogramming system, one erroneous program can affect another program or the data of that program.
• For proper operation and error-free results, protection against errors is required. Without protection, either only a single process can execute at a time, or the output of every program must be treated as suspect. This type of care must be taken into consideration while designing the operating system.
• Many programming errors are detected by the computer hardware, and the operating system handles these types of errors. Execution of an illegal instruction, or access to memory that is not in the user's address space, is detected by the hardware and traps to the operating system.
• The trap transfers control through the interrupt vector to the operating
system. Operating system must abnormally terminate the program when
program error occurs. To handle such situations, different types of hardware protection are used.
CP/M
Control Program/Microcomputer. An operating system created by Gary Kildall, the
founder of Digital Research. Created for the old 8-bit microcomputers that used the
8080, 8085, and Z-80 microprocessors. Was the dominant operating system in the late
1970s and early 1980s for small computers used in a business environment.
DOS
Disk Operating System. A collection of programs stored on the DOS disk that contain
routines enabling the system and user to manage information and the hardware
resources of the computer. DOS must be loaded into the computer before other
programs can be started.
OS/2
A universal operating system developed through a joint effort by IBM and Microsoft
Corporation. The latest operating system from IBM for microcomputers using the Intel
386 or better microprocessors. OS/2 uses the protected mode operation of the
processor to expand memory from 1M to 4G and to support fast, efficient multitasking.
The OS/2 Workplace Shell, an integral part of the system, is a graphical interface
similar to Microsoft Windows and the Apple Macintosh system. The latest version runs
DOS, Windows, and OS/2-specific software.
Resources
• Process: An executing program
The First OS
• Resident monitors were the first, rudimentary, operating systems
– monitor is similar to OS kernel that must be resident in memory
– control-card interpreters eventually become command processors or shells
• There were still problems with computer utilization. Most of these problems
revolved around I/O operations
• multiprogrammed systems - several tasks can be started and left unfinished; the CPU is assigned to the individual tasks by rotation; tasks waiting for the completion of an I/O operation (or another event) are blocked to save CPU time
• time-sharing systems - the CPU switching is so frequent that several users can interact with the computer simultaneously - interactive processing
Classification
From the hardware point of view :
• software = set of instructions
o either always in memory (resident)
o or loaded on request (non-resident or transient)
From the user point of view: classification by functionality
• System software :
o Operating systems (including monitor, supervisor, …)
o Loaders
o Libraries and utility programs
• Support software (developers):
o Assemblers
o Compilers and interpreters
o Editors
o Debuggers
• Application software
(Macro processors)
Introduction:
Macro Expansion
• Macro call leads to macro expansion.
• A macro call statement is replaced by a sequence of assembly statements in the macro expansion.
• Two key notions are used in macro expansion
a) Expansion time control flow
b) Lexical substitution
• RDBUFF and WRBUFF are the two macro instructions used in the SIC/XE program of the figure. MACRO and MEND are two new assembler directives that are also used. RDBUFF is the name of the macro, which appears in the label field. The entries in the operand field identify the parameters of the macro instruction.
• Each parameter begins with the character &, which facilitates the substitution of parameters during macro expansion.
• The macro name and parameters define a pattern or prototype for the macro instruction used by the programmer. The body of the macro definition consists of the statements that follow the MACRO directive; macro expansion generates these statements. The end of the macro definition is marked by the MEND assembler directive.
• In the main program, a macro invocation statement gives the name of the macro instruction being invoked and the arguments to be used in expanding the macro.
• Each macro invocation statement is expanded into the statements that form the body of the macro, with the arguments from the macro invocation substituted for the parameters in the macro prototype. Parameters and arguments are associated with one another according to their positions.
• The argument F1 is substituted for the parameter &INDEV wherever it occurs in the body of the macro; BUFFER is substituted for &BUFADR, and LENGTH is substituted for &RECLTH.
After macro processing, the expanded file can be used as input to the
assembler.
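The macro body lines, parameter names and arguments in the following C sketch loosely follow the RDBUFF example described above, but they are illustrative assumptions; the sketch demonstrates only the lexical-substitution step of macro expansion, not a full macro processor with definition tables and expansion-time control flow.

/* Lexical substitution: replace each &PARAM in a body line with its argument. */
#include <stdio.h>
#include <string.h>

static void expand_line(const char *line,
                        const char *params[], const char *args[], int nparams,
                        char *out, size_t outsz)
{
    out[0] = '\0';
    while (*line) {
        int matched = 0;
        if (*line == '&') {                       /* possible parameter      */
            for (int i = 0; i < nparams; i++) {
                size_t len = strlen(params[i]);
                if (strncmp(line, params[i], len) == 0) {
                    strncat(out, args[i], outsz - strlen(out) - 1);
                    line += len;                  /* skip the parameter name */
                    matched = 1;
                    break;
                }
            }
        }
        if (!matched) {                           /* copy ordinary character */
            size_t pos = strlen(out);
            if (pos + 1 < outsz) { out[pos] = *line; out[pos + 1] = '\0'; }
            line++;
        }
    }
}

int main(void)
{
    /* Prototype: RDBUFF &INDEV,&BUFADR,&RECLTH  Invocation: RDBUFF F1,BUFFER,LENGTH */
    const char *params[] = { "&INDEV", "&BUFADR", "&RECLTH" };
    const char *args[]   = { "F1", "BUFFER", "LENGTH" };
    const char *body[]   = { "        TD     =X'&INDEV'",
                             "        STCH   &BUFADR,X",
                             "        STX    &RECLTH" };
    char expanded[128];

    for (int i = 0; i < 3; i++) {
        expand_line(body[i], params, args, 3, expanded, sizeof expanded);
        printf("%s\n", expanded);
    }
    return 0;
}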
Compilers