System Software Unit I
UNIT I
Introduction
Software is a set of instructions or programs written to carry out certain tasks on
digital computers. It is classified into system software and application software. System software
consists of a variety of programs that support the operation of a computer. Application software
focuses on an application or problem to be solved.
Examples of system software are the operating system, compiler, assembler, macro
processor, loader or linker, debugger, text editor, database management systems (some of them)
and software engineering tools. These programs make it possible for the user to focus on an
application or other problem to be solved, without needing to know the details of how the
machine works internally.
One characteristic in which most system software differs from application software is
machine dependency.
Compilers must generate machine language code, taking into account such hardware
characteristics as the number and type of registers and the machine instructions available.
Operating systems are directly concerned with the management of nearly all of the resources of a
computing system.
Some aspects of system software do not directly depend upon the type of
computing system. Examples are the general design and logic of an assembler, the general design
and logic of a compiler, and code optimization techniques that are independent of the target
machine. Likewise, the process of linking together independently assembled subprograms does not
usually depend on the computer being used.
The Simplified Instructional Computer (SIC) is a hypothetical machine that has been designed to
illustrate the most commonly encountered hardware features and concepts, while avoiding most of
the peculiarities that are often found in real machines.
SIC comes in two versions:
1. The standard model
2. An XE version (XE stands for "extra equipment" or "extra expensive")
MEMORY
Memory consists of 8-bit bytes; any 3 consecutive bytes form a word (24 bits).
All addresses on SIC are byte addresses; words are addressed by the location of their
lowest numbered byte.
There are a total of 32,768 (2^15) bytes in the computer memory.
REGISTERS
There are five registers, all of which have special uses.
Each register is 24 bits in length.
Mnemonic  Number  Special use
A         0       Accumulator; used for arithmetic operations
X         1       Index register; used for addressing
L         2       Linkage register; the Jump to Subroutine (JSUB) instruction
                  stores the return address in this register
PC        8       Program counter; contains the address of the next
                  instruction to be fetched for execution
SW        9       Status word; contains a variety of information,
                  including a Condition Code (CC)
DATA FORMATS
Integers are stored as 24-bit binary numbers; 2's complement representation is used for
negative values.
Characters are stored using their 8-bit ASCII codes.
There is no floating-point hardware on the standard version of SIC.
INSTRUCTION FORMATS
All machine instructions on the standard version of SIC have the following 24-bit format:
Bits:   8       1   15
Field:  opcode  x   address
The flag bit x is used to indicate indexed-addressing mode.
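ADDRESSING MODES
Two addressing modes are available, selected by the x bit in the instruction:
Direct (x = 0): target address = address. Ex: LDA TEN
Indexed (x = 1): target address = address + (X). Ex: STCH BUFFER,X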
INSTRUCTION SET
The instruction set includes instructions that load and store registers.
LDA – load accumulator
LDX – load index register
STA – store accumulator
STX – store index register
It also includes integer arithmetic instructions ADD, SUB, MUL, DIV.
All arithmetic operations involve register A and a word in memory, with the result being
left in the register.
It also includes an instruction COMP that compares the value in register A with a word in
memory.
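It also includes conditional jump instructions that test the condition code set by COMP:
JLT – jump if less than
JEQ – jump if equal
JGT – jump if greater than
Two instructions are provided for subroutine linkage:
1. JSUB – jumps to the subroutine, placing the return address in register L
2. RSUB – returns from the subroutine by jumping to the address contained in register L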
MEMORY
The maximum memory available on a SIC/XE system is 1 megabyte (2^20 bytes).
This increase leads to a change in instruction formats and addressing modes.
REGISTERS
DATA FORMATS
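The data formats are the same as on the standard version of SIC.
In addition, there is a 48-bit floating-point data type with the following format:
Bits:   1   11        36
Field:  s   exponent  fraction
The fraction is interpreted as a value between 0 and 1.
The exponent is an unsigned binary number between 0 and 2047.
If the exponent has value e and the fraction has value f, the number represented is
f * 2^(e-1024). The sign is given by s (0 = positive, 1 = negative).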
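The floating-point rule can be checked with a small decoder. This is only a sketch in Python: the function name decode_sicxe_float is made up for illustration, and the 48-bit value is assumed to be held in an ordinary integer with the sign bit as the most significant bit.

```python
def decode_sicxe_float(bits48: int) -> float:
    """Decode a 48-bit SIC/XE floating-point value held in an int (illustrative)."""
    sign = (bits48 >> 47) & 0x1               # 1 sign bit
    exponent = (bits48 >> 36) & 0x7FF         # 11-bit unsigned exponent, 0..2047
    frac_bits = bits48 & ((1 << 36) - 1)      # 36-bit fraction
    fraction = frac_bits / (1 << 36)          # binary point lies before the first bit
    value = fraction * 2.0 ** (exponent - 1024)
    return -value if sign else value


# 1.0 is stored as fraction 0.5 (high-order fraction bit set) with exponent 1025
one = (1025 << 36) | (1 << 35)
print(decode_sicxe_float(one))
```

With this layout, setting the top bit of the same pattern gives the negated value, and an all-zero pattern decodes to zero.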
Instruction Formats:
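The instruction formats of the SIC/XE machine architecture are as follows.
Format 1 (1 byte): contains only the operation code (straight from the table).
Bits:   8
Field:  op
Format 2 (2 bytes): contains the operation code followed by two register numbers.
Bits:   8   4   4
Field:  op  r1  r2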
Format 3 (3 bytes): the first 6 bits contain the operation code, the next 6 bits contain flags,
and the last 12 bits contain the displacement used to compute the address of the operand. Because
the operation code occupies only 6 bits, the second hex digit of the instruction is affected by
the values of the first two flags (n and i). The flags, in order, are n, i, x, b, p, and e. The
last flag, e, indicates the instruction format (e = 0 for format 3, e = 1 for format 4).
Bits:   6   1  1  1  1  1  1  12
Field:  op  n  i  x  b  p  e  disp
Format 4 (4 bytes): same as format 3, but with an extra 2 hex digits (8 bits), giving a 20-bit
address field for addresses that require more than 12 bits to be represented.
Bits:   6   1  1  1  1  1  1  20
Field:  op  n  i  x  b  p  e  address
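To make the bit layout concrete, the following Python sketch splits a format 3 or format 4 instruction into its fields. The function name decode_format34 and the returned dictionary layout are illustrative assumptions, not part of the SIC/XE definition. The two sample values are object codes that appear later in this unit (17202D and the extended-format 4B101036).

```python
def decode_format34(word, length):
    """Split a format 3 (length=3) or format 4 (length=4) instruction into fields."""
    addr_bits = 8 * length - 12          # 12-bit disp or 20-bit address
    head = word >> addr_bits             # top 12 bits: 6-bit op followed by n i x b p e
    flags = {name: (head >> shift) & 1
             for name, shift in zip("nixbpe", range(5, -1, -1))}
    return {
        "opcode": (head >> 6) << 2,      # opcode byte with the n and i bits cleared
        "flags": flags,
        "addr": word & ((1 << addr_bits) - 1),
    }


print(decode_format34(0x17202D, 3))      # format 3, p = 1
print(decode_format34(0x4B101036, 4))    # format 4, e = 1
```

Note how the first byte 4B carries opcode 48 plus n = i = 1, which is why the second hex digit differs from the opcode table entry.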
Addressing mode
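Two new relative addressing modes are available for use with instructions assembled
using format 3: base relative (b = 1, p = 0), in which the target address is (B) + disp, and
program-counter relative (b = 0, p = 1), in which the target address is (PC) + disp.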
Instruction Set
SIC/XE provides all of the instructions that are available on the standard version.
In addition, there are:
Instructions to load and store the new registers: LDB, STB, etc.
Floating-point arithmetic operations: ADDF, SUBF, MULF, DIVF.
A register move instruction: RMO.
Register-to-register arithmetic operations: ADDR, SUBR, MULR, DIVR.
A supervisor call instruction: SVC, which generates an interrupt that can be used for
communication with the OS.
Loading, which brings the object program into memory for execution.
Relocation, which modifies the object program so that it can be loaded at an address different
from the location originally specified.
The most fundamental function of a loader is bringing an object program into memory
and starting its execution.
Design of an absolute loader
This loader does not need to perform functions like linking and program relocation.
All operations are done in a single pass.
The header record is checked to verify that the correct program has been presented for
loading.
As each text record is read, the object code from the text record is moved to the
indicated address in memory.
When the end record is encountered, the loader jumps to the specified address to begin
execution of the loaded program.
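The single pass described above can be sketched in Python as follows. The record layout is an assumption made for illustration: a header record "H" + 6-character name + 6 hex digits of start address + 6 hex digits of length, text records "T" + 6 hex digits of address + 2 hex digits of length in bytes + object code in hex, and an end record "E" + 6 hex digits of the first executable address. The function name and the use of a bytearray for memory are likewise hypothetical.

```python
def absolute_load(records, memory, expected_name):
    """Load H/T/E records into `memory` (a bytearray); return the start address."""
    header = records[0]
    if header[0] != "H" or header[1:7].strip() != expected_name:
        raise ValueError("header record does not name the expected program")
    for rec in records[1:]:
        if rec[0] == "T":
            addr = int(rec[1:7], 16)                      # absolute load address
            length = int(rec[7:9], 16)                    # object code length in bytes
            code = bytes.fromhex(rec[9:9 + 2 * length])   # character form -> bytes
            memory[addr:addr + length] = code
        elif rec[0] == "E":
            return int(rec[1:7], 16)                      # where execution begins
    raise ValueError("missing end record")


memory = bytearray(0x2000)
program = ["HCOPY  001000000006", "T00100006141033482039", "E001000"]
start = absolute_load(program, memory, "COPY")
print(hex(start), memory[0x1000:0x1006].hex())
```

A real loader would jump to the returned address; here it is simply returned so the result can be inspected.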
In this section we consider the design and implementation of a more complex loader that
is suitable for use on a SIC/XE system.
This loader provides program relocation and linking in addition to the simple loading
function.
RELOCATION
The need for program relocation is an indirect consequence of the change to larger and more
powerful computers.
The way relocation is implemented in a loader is also dependent upon machine
characteristics.
Loaders that allow for program relocation are called relocating loaders or relative loaders.
The two methods for specifying relocation are:
1. Relocation by modification records.
2. Relocation by bit mask
Relocation by modification records
A modification record is used to describe each part of the object code that must be changed
when the program is relocated.
Modification record
col 1: M
col 2-7: Starting address of the field to be modified, relative to the beginning of the
control section (hexadecimal).
col 8-9: Length of the field to be modified, in half-bytes (hexadecimal).
col 10: Modification flag (+ or -).
col 11-16: External symbol whose value is to be added to or subtracted from the indicated
field.
The following SIC/XE program is used for specifying relocation.
NASC- Department of Computer Applications[System Software] Page 9
Line  Loc   Source statement            Object code
5     0000  COPY    START   0
10    0000  FIRST   STL     RETADR      17202D
. . .
. . .
15    0006  CLOOP   +JSUB   RDREC       4B101036
. . .
. . .
35    0013          +JSUB   WRREC       4B10105D
. . .
. . .
65    0026          +JSUB   WRREC       4B10105D
. . .
. . .
115         SUBROUTINE TO READ RECORD INTO BUFFER
. . .
125   1036  RDREC   CLEAR   X           B410
. . .
. . .
200         SUBROUTINE TO WRITE RECORD FROM BUFFER
. . .
. . .
210   105D  WRREC   CLEAR   X           B410
Most of the instructions in the above program use relative or immediate addressing.
The instructions on lines 15, 35 and 65 contain actual addresses (they are extended-format
instructions) whose values are affected by relocation.
The following is an object program corresponding to the above source program.
There is one modification record for each instruction that must be changed when the program
is relocated (3 modification records, for the instructions on lines 15, 35 and 65).
Each modification record specifies the starting address and length of the field whose
value is to be altered.
In the above example, all modifications add the value of the symbol COPY, which
represents the starting address of the program.
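As a rough illustration of how a loader might apply one modification record, the Python sketch below patches the address field of the +JSUB instruction at relative location 0006 (object code 4B101036). The helper name apply_modification, the bytearray memory model, the load address 5000 (hex) and the record string M00000705+COPY are all assumptions chosen for the example. The record follows the column layout given above, with the field starting at relative address 0007 and a length of 5 half-bytes.

```python
def apply_modification(memory, record, symbols, load_addr):
    """Apply one 'M' record to `memory`; `symbols` maps symbol names to addresses."""
    start = load_addr + int(record[1:7], 16)   # first byte of the field
    half_bytes = int(record[7:9], 16)          # field length in half-bytes
    sign = record[9]
    name = record[10:].strip()
    nbytes = (half_bytes + 1) // 2             # bytes that contain the field
    field = int.from_bytes(memory[start:start + nbytes], "big")
    mask = (1 << (4 * half_bytes)) - 1         # the low half-bytes form the field
    delta = symbols[name] if sign == "+" else -symbols[name]
    field = (field & ~mask) | ((field + delta) & mask)
    memory[start:start + nbytes] = field.to_bytes(nbytes, "big")


load_addr = 0x5000
memory = bytearray(0x8000)
memory[load_addr + 6:load_addr + 10] = bytes.fromhex("4B101036")
apply_modification(memory, "M00000705+COPY", {"COPY": load_addr}, load_addr)
print(memory[load_addr + 6:load_addr + 10].hex().upper())
```

Only the 20-bit address field changes; the opcode and flag bits in the first byte and a half are left untouched.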
Relocation by bit mask
Relocation by bit mask is mainly used on machines that primarily use direct
addressing and have a fixed instruction format.
The standard SIC program is used to illustrate this method.
The figure below shows the object program with relocation by bit mask.
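The idea can be sketched in Python under some stated assumptions, since the figure is not reproduced here. The sketch assumes that each text record carries a 12-bit mask, that each mask bit corresponds to one 3-byte word of object code with the leftmost bit for the first word, and that a 1 bit means the program's starting address is added to that word. The function name relocate_with_bitmask and the sample words are hypothetical.

```python
def relocate_with_bitmask(words, mask, start_addr):
    """Relocate up to 12 words of object code according to a 12-bit mask."""
    relocated = []
    for k, word in enumerate(words):
        bit = (mask >> (11 - k)) & 1           # leftmost mask bit = first word
        if bit:
            word = (word + start_addr) & 0xFFFFFF   # keep the result in 24 bits
        relocated.append(word)
    return relocated


words = [0x140033, 0x481039, 0x000036]
# mask C00 = 1100 0000 0000: relocate the first two words, leave the third alone
print([f"{w:06X}" for w in relocate_with_bitmask(words, 0xC00, 0x1000)])
```

Because every instruction occupies exactly one word, one bit per word is enough, which is why this scheme suits a fixed instruction format.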
PROGRAM LINKING
In this section we look at more complex examples of external references between
programs and examine the relationship between relocation and linking.
Consider the following three separate programs, each of which consists of a single control
section.
Loc Source statement Object code
PROGADDR: program load address
CSADDR: control section address
ESTAB:
1. Used to store the name and address of each external symbol in the set of control sections
being loaded.
2. The table also indicates in which control section the symbol is defined.
CSADDR:
It contains starting address assigned to the control section currently being scanned by the
loader.
This address is added to all relative addresses within the control section to convert them
to actual addresses.
PASS 1 ALGORITHM:
begin
  get PROGADDR from operating system
  set CSADDR to PROGADDR {for first control section}
  while not end of input do
    begin
      read next input record {Header record for control section}
      set CSLTH to control section length
      search ESTAB for control section name
      if found then
        set error flag {duplicate external symbol}
      else
        enter control section name into ESTAB with value CSADDR
      while record type != 'E' do
        begin
          read next input record
          if record type = 'D' then
            for each symbol in the record do
              begin
                search ESTAB for symbol name
                if found then
                  set error flag {duplicate external symbol}
                else
                  enter symbol into ESTAB with value (CSADDR + indicated address)
              end {for}
        end {while != 'E'}
      add CSLTH to CSADDR {starting address for next control section}
    end {while not EOF}
end {Pass 1}
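A compact Python sketch of Pass 1 is given below. It is not the loader's actual implementation: the function name pass1, the use of a dictionary for ESTAB, and the record layout (header "H" + 6-character name + 6 hex digits start + 6 hex digits length; define record "D" followed by repeated 6-character name and 6 hex digit address pairs) are assumptions made for illustration. The two sample control sections, PROGA and PROGB, and their lengths are also invented.

```python
def pass1(records, progaddr):
    """Build ESTAB (symbol name -> absolute address) from a flat list of records."""
    estab, errors = {}, []
    csaddr, cslth = progaddr, 0
    for rec in records:
        kind = rec[0]
        if kind == "H":
            name = rec[1:7].strip()
            cslth = int(rec[13:19], 16)           # control section length
            if name in estab:
                errors.append(f"duplicate external symbol {name}")
            else:
                estab[name] = csaddr
        elif kind == "D":
            body = rec[1:]
            for k in range(0, len(body), 12):     # 6-char name + 6-digit address
                name = body[k:k + 6].strip()
                if name in estab:
                    errors.append(f"duplicate external symbol {name}")
                else:
                    estab[name] = csaddr + int(body[k + 6:k + 12], 16)
        elif kind == "E":
            csaddr += cslth                       # start of the next control section
    return estab, errors


records = ["HPROGA 000000000063", "DLISTA 000040ENDA  000054", "E000020",
           "HPROGB 00000000007F", "DLISTB 000060ENDB  000070", "E"]
estab, errors = pass1(records, 0x4000)
print({name: f"{addr:04X}" for name, addr in estab.items()}, errors)
```

Text and modification records are simply skipped here, since Pass 1 only assigns addresses to external symbols.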
PASS 2 ALGORITHM:
begin
  set CSADDR to PROGADDR
  set EXECADDR to PROGADDR
  while not end of input do
    begin
      read next input record {Header record}
      set CSLTH to control section length
      while record type != 'E' do
        begin
          read next input record
          if record type = 'T' then
            begin
              {if object code is in character form, convert
               into internal representation}
              move object code from record to location
                (CSADDR + specified address)
            end {if 'T'}
          else if record type = 'M' then
            begin
              search ESTAB for modifying symbol name
              if found then
                add or subtract symbol value at location
                  (CSADDR + specified address)
              else
                set error flag {undefined external symbol}
            end {if 'M'}
        end {while != 'E'}
      if an address is specified {in End record} then
        set EXECADDR to (CSADDR + specified address)
      add CSLTH to CSADDR
    end {while not EOF}
  jump to location given by EXECADDR {to start execution of loaded program}
end {Pass 2}
Pass2 of the loader performs the actual loading, relocation and linking of the program.
As each text record is read, the object code is moved to the specified address.
When a modification record is encountered, the symbol whose value is to be used for
modification is looked up in ESTAB.
This value is then added to or subtracted from the indicated location in memory.
The last step performed by the loader is the transfer of control to the loaded program
to begin execution.
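The Pass 2 steps can be sketched in Python as follows. As before, this is an illustration under assumptions: the function name pass2, the bytearray memory model, and the record layouts (text "T" + 6 hex digits address + 2 hex digits length + object code; modification "M" + 6 hex digits address + 2 hex digits length in half-bytes + sign + symbol name; end "E" with an optional 6 hex digit address) are not taken from the source. ESTAB is assumed to have been built already by Pass 1, and the tiny control sections are invented.

```python
def pass2(records, progaddr, estab, memory):
    """Load, relocate and link; return EXECADDR instead of jumping to it."""
    csaddr = execaddr = progaddr
    cslth = 0
    for rec in records:
        kind = rec[0]
        if kind == "H":
            cslth = int(rec[13:19], 16)
        elif kind == "T":
            addr = csaddr + int(rec[1:7], 16)
            code = bytes.fromhex(rec[9:])              # character form -> bytes
            memory[addr:addr + len(code)] = code
        elif kind == "M":
            addr = csaddr + int(rec[1:7], 16)
            half = int(rec[7:9], 16)
            name = rec[10:].strip()
            if name not in estab:
                raise KeyError(f"undefined external symbol {name}")
            nbytes = (half + 1) // 2
            mask = (1 << (4 * half)) - 1
            field = int.from_bytes(memory[addr:addr + nbytes], "big")
            delta = estab[name] if rec[9] == "+" else -estab[name]
            field = (field & ~mask) | ((field + delta) & mask)
            memory[addr:addr + nbytes] = field.to_bytes(nbytes, "big")
        elif kind == "E":
            if len(rec) > 1:                           # End record names a start address
                execaddr = csaddr + int(rec[1:7], 16)
            csaddr += cslth
    return execaddr


estab = {"PROGA": 0x4000, "PROGB": 0x4006, "LISTB": 0x4006}
records = ["HPROGA 000000000006", "T00000003000004", "M00000006+LISTB", "E000000",
           "HPROGB 000000000003", "T00000003FFFFFF", "E"]
memory = bytearray(0x5000)
print(hex(pass2(records, 0x4000, estab, memory)), memory[0x4000:0x4003].hex())
```

In the sample, the word at the start of PROGA initially holds 4 and ends up holding LISTB + 4, which is the effect of the single modification record.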
LOADED PROGRAM
The linked program produced by the linkage editor is processed by a relocating loader.
All external references are resolved, and relocation is indicated by some method such as
modification records or a bit mask.
Even though all linking has been performed, information about external references is
often retained in the linked program.
This allows relinking of the program to replace control sections, modify external
references, etc.
FUNCTIONS OF LINKAGE EDITORS
A linkage editor can perform many useful functions using editor commands. For example:
Assume that a program (PLANNER) uses many subroutines.
One of the subroutines (PROJECT) has to be changed to a new version.
After the new version of PROJECT is assembled or compiled, the linkage editor is used
to replace this subroutine in the program (PLANNER).
The following linkage editor commands perform this replacement.
INCLUDE PLANNER (PROGLIB)
DELETE PROJECT {Delete from existing planner}
INCLUDE PROJECT (NEWLIB) {Include new version}
REPLACE PLANNER (PROGLIB)
A linkage editor is also used to build packages of subroutines or other control sections that
are generally used together.
It combines the related subroutines into a package using editor commands.
Ex:
INCLUDE BLOCK (FTNLIB)
INCLUDE DEBLOCK (FTNLIB)
INCLUDE ENCODE (FTNLIB)
INCLUDE DECODE (FTNLIB)
.
.
SAVE FTNIO (SUBLIB)
In the above command sequence, all the subroutines are linked into a module named
FTNIO.
This module is available in the directory SUBLIB.
A search of SUBLIB retrieves FTNIO instead of the separate routines.
This method saves search time.
DYNAMIC LINKING
Sometimes a subroutine is loaded and linked to the rest of the program only when it is first
called.
Here, the linking function is postponed until execution time.
This is called dynamic linking, dynamic loading, or load on call.
Loading and linking of a subroutine using dynamic linking
The following figure shows a method in which subroutines that are dynamically loaded
must be called through an operating system service request.
In this method, the user program makes a load-and-call service request to the
operating system.
The OS then checks its internal tables to determine whether the subroutine is already
loaded.
If the subroutine is not loaded, it is loaded from the system libraries.
Control is then passed from the dynamic loader to the subroutine being called.
After the subroutine completes, the memory that was allocated for it is retained
for later use, as long as the storage space is not needed for other processing.
If the subroutine is still in memory, a second call does not require another load
operation.
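The load-on-call behaviour can be modelled with a small Python sketch. Everything here is a stand-in: the class name DynamicLoader, the dictionary that plays the role of the system library, and the SQUARE routine are hypothetical, and no real operating system interface is implied.

```python
class DynamicLoader:
    """Toy model of a load-and-call service (illustrative only)."""

    def __init__(self, library):
        self.library = library     # stands in for the system libraries
        self.loaded = {}           # internal table of routines already in memory
        self.load_count = 0        # number of actual load operations performed

    def load_and_call(self, name, *args):
        if name not in self.loaded:            # not in memory yet: load it now
            self.loaded[name] = self.library[name]
            self.load_count += 1
        return self.loaded[name](*args)        # pass control to the routine


loader = DynamicLoader({"SQUARE": lambda v: v * v})
print(loader.load_and_call("SQUARE", 3), loader.load_and_call("SQUARE", 4))
print("loads performed:", loader.load_count)
```

The second call finds the routine in the internal table, so only one load operation is counted.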
How is the loader itself loaded into memory? When the computer is started, with no program
in memory, a program present in ROM (at an absolute address) can be executed. This may be
the OS itself or a bootstrap loader, which in turn loads the OS and prepares it for execution.
The first record (or records) is generally referred to as a bootstrap loader; it causes the
OS to be loaded.
Such a loader is added to the beginning of all object programs that are to be loaded into
an empty and idle system.
On some computers, an absolute program is permanently resident in a read-only memory.
When some hardware signal occurs, the machine begins to execute this ROM program.
On some computers, the program is executed directly in ROM; on others, the program is
copied from ROM to main memory and executed there.
Some machines do not have such read-only storage.
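[Figure: parse tree for the assignment statement A := B + C. The root derives
identifier := expression; the identifier is A, and the expression derives
expression + expression, whose operands are the identifiers B and C.]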
Solution
1. Many p-code compilers are designed for a single user running on a microcomputer
system. In that case, speed of execution may be relatively insignificant.
2. If execution speed is important some p-code compilers support the use of machine
language subroutines.
By rewriting a small number of commonly used routines in machine language, it is often
possible to achieve some improvements in performance.
Compiler-compilers
A compiler-compiler is a software tool that can be used to help in the task of compiler
construction.
Such tools are often called compiler generators or translator-writing systems.
In row-major order, the rightmost subscript varies most rapidly; in column-major
order, the leftmost subscript varies most rapidly.
Referring to an array element
To refer to an array element, we must calculate the address of the referenced element
relative to the base address of the array.
Ex: One-dimensional array
A: ARRAY [1…10] OF INTEGER
1. Suppose a statement refers to array element A[6]
2. There are five array elements preceding A[6]
3. On a SIC machine, each such element would occupy 3 bytes
4. Thus the address of A[6] relative to the starting address of the array is given by 5*3=15
Code generation for array references
1. If an array reference involves only constant subscripts (Ex: A[6]), the relative address
calculation can be performed during compilation.
2. If the subscripts involve variables (Ex: A[J]), however, the compiler must generate object
code to perform this calculation during execution.
Ex: A: ARRAY [l…u] OF INTEGER //array declaration
1. Suppose each array element occupies w bytes of storage.
2. If the value of the subscript is s, then the relative address of the referenced array element
A[s] is given by
w*(s-l)
3. The generation of code to perform such a calculation is illustrated in the following figure.
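The relative-address formula can be tried out with a short Python sketch. The function names are made up for illustration. The second function is an extension not spelled out in the text: it applies the row-major rule to a two-dimensional array B: ARRAY [l1..u1, l2..u2], where each row holds (u2 - l2 + 1) elements.

```python
def relative_address(s, l, w):
    """Relative address of A[s] for A: ARRAY [l..u] with w bytes per element."""
    return w * (s - l)


def relative_address_2d(s1, s2, l1, l2, u2, w):
    """Row-major relative address of B[s1, s2] for B: ARRAY [l1..u1, l2..u2]."""
    row_length = u2 - l2 + 1                 # number of elements in one row
    return w * ((s1 - l1) * row_length + (s2 - l2))


print(relative_address(6, 1, 3))             # A[6] in ARRAY [1..10], 3-byte words
print(relative_address_2d(2, 5, 0, 1, 6, 3)) # B[2,5] in ARRAY [0..3, 1..6]
```

The first call reproduces the 5*3 calculation from the one-dimensional example above.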
Code generation for array references
A: ARRAY [1…10] OF INTEGER
.
.
A[J]:=5
1) -   J    =1   i1
2) *   i1   =3   i2
3) :=  =5        A[i2]
2. The first action taken by MAIN is to store the return address from register L at a fixed
location RETADR within MAIN.
1. The above figure shows the stack as it would appear after SUB returns from the recursive
call.
2. Register B has been reset to point to the activation record for the previous invocation of
SUB.
Rules for automatic storage allocation
1. When automatic allocation is used, the compiler must generate code for references to
variables using some sort of relative addressing.
1. The above figure shows the outline of a block-structured program in a Pascal-like
language.
2. Each procedure forms a block.
3. In a block-structured program, blocks may be nested within other blocks. In the above
example, procedures B and D are nested within procedure A, and procedure C is nested
within procedure B.
4. Each block may contain declarations of variables.
5. An inner block may also refer to variables that are defined in any outer block, provided
that the same names are not redefined in the inner block.
Compiling & execution of block-structured programs
1. In compiling a program written in a block-structured language, it is convenient to number
the blocks as shown in the above figure.
2. The compiler constructs a table that describes the block structure, as shown below.
3. The table contains the block name, block number, block level and surrounding
block.
4. The block-level entry gives the nesting depth for each block.
5. The outermost block has a level number of 1, and each other block has a level number
that is one greater than that of the surrounding block.
Searching of identifiers in symbol table
The same name can be declared more than once in a program, in different blocks.
So there can be several symbol-table entries for the same name.
The entries that represent declarations of the same name by different blocks can be
linked together in the symbol table with a chain of pointers.
When a reference to an identifier appears in the source program, the compiler must first
check the symbol table for a definition of that identifier by the current block.
If no such definition is found, the compiler looks for a definition by the block that
surrounds the current block, then by the block that surrounds that, and so on.
If the outermost block is reached without finding a definition of the identifier, then the
reference is an error.
The search process just described can easily be implemented within a symbol table that
uses hashed addressing.
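The search order can be sketched in Python using a dictionary keyed by (name, block number), which plays the role of the hashed symbol table. The block numbering follows the earlier example (A = 1, B = 2, C = 3, D = 4, with B and D nested in A and C nested in B). The function name lookup, the variable names X and Y, and the stored strings are hypothetical.

```python
def lookup(name, block, symtab, surrounding):
    """Find the declaration of `name` visible from `block`, searching outward."""
    while block is not None:
        if (name, block) in symtab:          # hashed lookup for this block
            return symtab[(name, block)]
        block = surrounding[block]           # move to the enclosing block
    raise NameError(f"{name} is not declared in any surrounding block")


# block number -> number of the surrounding block (None for the outermost block)
surrounding = {1: None, 2: 1, 3: 2, 4: 1}
symtab = {("X", 1): "X declared in A", ("X", 3): "X declared in C",
          ("Y", 2): "Y declared in B"}

print(lookup("X", 3, symtab, surrounding))   # found in the current block C
print(lookup("X", 4, symtab, surrounding))   # found in the surrounding block A
```

A reference to Y from block D fails, because B does not surround D.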
Access to variables in surrounding block
One common method for providing access to variables in surrounding block uses a data
structure called a display.
The display contains pointers to the most recent activation records for the current block
and for all blocks that surround the current one in the source program.
When a block refers to a variable that is declared in some surrounding block, the
generated object code uses the display to find the activation record that contains this
variable.
Ex: The use of a display is illustrated in the following figures for the Pascal-like
procedures discussed previously.
1. Assume that procedure A has been invoked by the system, A has then called
procedure B, and B has called procedure C. The resulting situation is shown in the
following figure.
Another activation record for C is created on the stack as a result of this call.
The display pointer for C is changed accordingly.
Variables that correspond to the previous invocation of C are not accessible
for the moment.
3. Suppose now that procedure C calls D. The resulting stack and display are shown
below.
An activation record for D has been created in the usual way and added to the
stack.
Note, however, that the display now contains only two pointers: one each to
the activation records for D and A.
This is because procedure D cannot refer to variables in B or C.
Procedure D can refer only to the variables that are declared by D or by
some block that contains D in the source program (in this case, procedure A).
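[Figure: expression tree for B before optimization, with nodes S1, S4 and S2, the
operators * and /, and the operands 4, I and J]
S1 and S4 are common subexpressions. The duplicate can be eliminated as shown below:
[Figure: the same expression tree after elimination, with nodes S1 and S2 only]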
Loop unrolling
This reduces the number of tests carried out when the number of iterations is
constant.
i=1;
while (i<=100)
{
x[i]=0;
i++;
}
The test “i<=100” succeeds 100 times, once for every element.
This sequence can be replaced by the following set of statements:
i=1;
while (i<=100)
{
x[i]=0;
i++;
x[i]=0;
i++;
}
Replicating the body halves the number of tests: the condition now succeeds only 50 times.
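The saving can be measured with a small Python experiment that counts how often the loop condition is evaluated. The function names are made up, the unrolled version assumes the element count is even, and the count includes the final failing test that ends each loop.

```python
def clear_simple(n):
    """Clear x[1..n] one element per test; return the array and the test count."""
    x, tests, i = [None] * (n + 1), 0, 1
    while True:
        tests += 1                   # one evaluation of "i <= n"
        if not i <= n:
            break
        x[i] = 0
        i += 1
    return x, tests


def clear_unrolled(n):
    """Same effect with the body replicated once (assumes n is even)."""
    x, tests, i = [None] * (n + 1), 0, 1
    while True:
        tests += 1
        if not i <= n:
            break
        x[i] = 0
        i += 1
        x[i] = 0                     # second copy of the body, no test in between
        i += 1
    return x, tests


print(clear_simple(100)[1], clear_unrolled(100)[1])
```

Both versions leave the array in the same state; only the number of condition evaluations differs.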
Loop jamming
This is a technique of merging the bodies of two loops if they have the same number of
iterations.
for(i=0;i<=10;i++)
x[i]=0;
for(i=0;i<=10;i++)
y[i]=1;
The bodies of two ‘for’ loops that use the variable “i” over the same range can be
concatenated.
The result will be
for(i=0;i<=10;i++)
{
x[i]=0;
y[i]=1;
}
Advantages gained by code optimization
1. Code can be made to run faster.
An arrow from block x to block y indicates that control can pass directly from the last
quadruple of x to the first quadruple of y. This kind of representation is called a flow
graph.
2. Another possibility involves rearranging quadruples before machine code is
generated.
With a little analysis, an optimizing compiler could recognize this situation and
rearrange the quadruples so that the second operand of the subtraction is computed first.
The first two quadruples in the sequence have been interchanged.
The resulting machine code requires two fewer instructions and uses only one
temporary variable instead of two.
3. Other possibilities involve taking advantage of specific characteristics and
instructions of the target machine.
For example, there may be special addressing modes that can be used to create more
efficient object code.