Ssos Notes
UNIT I
Introduction – System Software and machine architecture. Loaders and Linkers: Basic Loader
Functions – Machine dependent loader features – Machine independent loader features – Loader
design options.
UNIT II
UNIT III
UNIT IV
UNIT V
UNIT I
INTRODUCTION
This subject introduces the design and implementation of system software. Software is a
set of instructions or programs written to carry out certain tasks on digital computers. It is
classified into system software and application software.
System software consists of a variety of programs that support the operation of a
computer. This software makes it possible for the user to focus on an application or other
problem to be solved, without needing to know the details of how the machine works internally.
E.g., operating system, compiler, assembler, macro processor, loader or linker, debugger, text
editor, database management systems (some of them) and software engineering tools.
Application software, in contrast, focuses on an application or a particular problem to be
solved.
Step 1: A program is written (created and modified) in a high-level language (C, C++, Pascal)
and typed in a text editor.
Step 2: It is translated into a machine language (object) program using a compiler. The compiler
in turn stores the .obj file on a secondary storage device.
Step 3: The resulting machine language program is loaded into memory and prepared
for execution by a loader or linker. There are different loading schemes, viz. absolute,
relocating and direct linking. In general, the loader must load, relocate and link the
object program.
Step 1: A program is written using macro instructions to read and write data.
Step 2: An assembler, which probably includes a macro processor, translates these
programs into machine language.
Assembler translates mnemonic instructions into machine code. The instruction formats,
addressing modes etc., are of direct concern in assembler design. Similarly, Compilers must
generate machine language code, taking into account such hardware characteristics as the
number and type of registers and the machine instructions available.
Operating systems are directly concerned with the management of nearly all of the resources of
a computing system.
UCAS
There are aspects of system software that do not directly depend upon the type of
computing system: the general design and logic of an assembler, the general design and logic of
a compiler, and code optimization techniques, which are independent of target machines.
LOADERS AND LINKERS
Introduction
The source program, written in assembly language or a high-level language, is
converted to an object program, which is in machine language form for execution. This
object program, produced either by an assembler or by a compiler, contains translated
instructions and data values from the source program, and specifies addresses in primary
memory where these items are to be loaded for execution.
This involves the following three processes:
Loading - which allocates memory location and brings the object program into memory
for execution - (Loader)
Linking- which combines two or more separate object programs and supplies the
information needed to allow references between them - (Linker)
Relocation - which modifies the object program so that it can be loaded at an address
different from the location originally specified - (Linking Loader)
An absolute loader simply brings the object program into the specified
locations in memory. At the end, the loader jumps to the specified address to begin
execution of the loaded program. The advantages of an absolute loader are its simplicity and
efficiency. The disadvantages are the need for the programmer to specify the actual load
address and the difficulty of using subroutine libraries.
The algorithm for this type of loader is given here. The object program, and the object
program loaded into memory by the absolute loader, are also shown. In the object program,
each byte of assembled code is given using its hexadecimal representation in character form,
which is easy for human beings to read. In memory, each byte of object code is stored as a
single byte. Most machines store object programs in binary form, and we must be sure that our
file and device conventions do not cause some of the program bytes to be interpreted as
control characters.
begin
    read Header record
    verify program name and length
    read first Text record
    while record type <> 'E' do
        begin
            {if object code is in character form, convert into internal representation}
            move object code to specified location in memory
            read next object program record
        end
    jump to address specified in End record
end
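As a rough illustration, the loader algorithm above can be sketched in Python. The tuple-based record format and the sample program here are invented for the sketch; a real SIC object program is a character file, not tuples.

```python
# Sketch of an absolute loader.  Records (hypothetical format):
#   ('H', name, start, length)  - Header: verify name and length
#   ('T', address, code_bytes)  - Text: object code for a given address
#   ('E', start_address)        - End: address where execution begins
def absolute_load(records, memory):
    rec_type, name, start, length = records[0]    # read Header record
    assert rec_type == 'H'                        # verify name and length
    for rec in records[1:]:                       # read remaining records
        if rec[0] == 'T':
            _, addr, code = rec                   # move object code to the
            memory[addr:addr + len(code)] = code  # specified location
        elif rec[0] == 'E':
            return rec[1]     # jump to address specified in End record

memory = bytearray(64)
records = [('H', 'COPY', 0x10, 3),
           ('T', 0x10, bytes([0x14, 0x10, 0x33])),
           ('E', 0x10)]
entry = absolute_load(records, memory)   # entry is 0x10
```

In a real loader the final step would transfer control to `entry`; here the caller simply receives the address.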
begin
    X ← 0x80    (the address of the next memory location to be loaded)
    loop
        A ← GETC    (and convert it from the ASCII character code
                     to the value of the hexadecimal digit)
        save the value in the high-order 4 bits of S
        A ← GETC
        combine the values to form one byte: A ← (A + S)
        store the value (in A) to the address in register X
        X ← X + 1
    end loop
end
It uses a subroutine GETC, which is:
GETC    A ← read one character
        if A = 0x04 then jump to 0x80    (end of input)
        if A < 48 then GETC              (skip characters below '0')
        A ← A - 48 (0x30)
        if A < 10 then return            (digits '0'..'9')
        A ← A - 7                        (letters 'A'..'F')
        return
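The same packing of two hexadecimal characters into one byte can be sketched in Python. `hex_digit` mirrors the GETC arithmetic (subtract 48, then 7 more for the letters A-F); the function and variable names are invented.

```python
def hex_digit(ch):
    a = ord(ch) - 48        # A <- A - 48 (0x30): '0'..'9' give 0..9
    if a < 10:
        return a
    return a - 7            # 'A'..'F' (0x41..0x46) give 10..15

def bootstrap_load(text, memory, x=0x80):
    # Pack each pair of hex characters into one byte, store it at the
    # address in X, and advance X, as in the loop above.
    for i in range(0, len(text), 2):
        s = hex_digit(text[i]) << 4          # high-order 4 bits of S
        a = s + hex_digit(text[i + 1])       # A <- (A + S)
        memory[x] = a
        x += 1                               # X <- X + 1
    return x
```

With `memory = bytearray(256)`, `bootstrap_load("14103C", memory)` stores the bytes 14, 10 and 3C starting at address 0x80.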
A relocating loader loads the program into memory wherever
there is room for it. The actual starting address of the object program is not known until load
time. Relocation provides efficient sharing of the machine when it has a larger memory and
several independent programs are to be run together. It also supports the efficient use of
subroutine libraries. Loaders that allow for program relocation are called relocating loaders or
relative loaders.
Methods for specifying relocation
Use of a modification record, and use of relocation bits, are the methods available for
specifying relocation. In the first case, a modification record M is used in the
object program to specify any relocation. In the second, each word of object code is
associated with one relocation bit, and these relocation bits in a Text record are gathered into
bit masks.
Modification records are used in complex machines and are also called Relocation and
Linkage Directory (RLD) specification. The format of the modification record (M) is as follows.
The object program with relocation by Modification records is also shown here.
Modification record
col 1: M
col 2-7: relocation address
col 8-9: length (halfbyte)
col 10: flag (+/-)
col 11-17: segment name
The relocation bit method is used for simple machines. If a relocation bit is 0, no
modification is necessary; if it is 1, modification is needed. The bits are specified in columns
10-12 of the Text record (T); the format of the Text record, along with relocation bits, is as
follows.
Text record
col 1: T
col 2-7: starting address
col 8-9: length (byte)
col 10-12: relocation bits
col 13-72: object code
A twelve-bit mask is used in each Text record (columns 10-12 hold the relocation bits).
Since each Text record contains at most 12 words, unused bits are set to 0, and any value that
is to be modified during relocation must coincide with one of these 3-byte segments. For an
absolute loader there are no relocation bits, so columns 10-69 contain object code. The object
program with relocation by bit mask is as shown below. Observe that FFC means all ten words
are to be modified, and E00 means the first three words are to be modified.
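Bit-mask relocation can be sketched in Python under the assumptions above: up to twelve 3-byte words per Text record, the leftmost mask bit governing the first word, and a set bit meaning that the program's starting address is added to that word. The names are invented.

```python
def relocate(words, mask, start_addr):
    # words: up to twelve 3-byte values from one Text record
    # mask:  12-bit relocation mask (e.g. 0xFFC or 0xE00)
    out = []
    for i, w in enumerate(words):
        bit = (mask >> (11 - i)) & 1               # leftmost bit = word 0
        out.append((w + start_addr) & 0xFFFFFF if bit else w)
    return out
```

`relocate([0] * 10, 0xFFC, 0x2000)` adds 0x2000 to all ten words, while mask E00 modifies only the first three.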
Program Linking
The goal of program linking is to resolve external references (EXTREF) and
external definitions (EXTDEF) across different control sections.
EXTDEF (external definition) - The EXTDEF statement names symbols that are defined in
this (present) control section and may be used by other sections.
ex: EXTDEF LISTA, ENDA
EXTREF (external reference) - The EXTREF statement names symbols used in this
(present) control section and defined elsewhere.
ex: EXTREF RDREC, WRREC
EXTREF LISTB, ENDB, LISTC, ENDC
Define record
The format of the Define record (D) along with examples is as shown here.
Col. 1 D
Col. 2-7 Name of external symbol defined in this control section
Col. 8-13 Relative address within this control section (hexadecimal)
Col.14-73 Repeat information in Col. 2-13 for other external symbols
Example records
D LISTA 000040 ENDA 000054
D LISTB 000060 ENDB 000070
Refer record
The format of the Refer record (R) along with examples is as shown here.
Col. 1 R
Col. 2-7 Name of external symbol referred to in this control section
Col. 8-73 Name of other external reference symbols
Example records
R LISTB ENDB LISTC ENDC
R LISTA ENDA LISTC ENDC
R LISTA ENDA LISTB ENDB
Here are the three programs named as PROGA, PROGB and PROGC, which are
separately assembled and each of which consists of a single control section. LISTA, ENDA in
PROGA, LISTB, ENDB in PROGB and LISTC, ENDC in PROGC are external definitions in
each of the control sections. Similarly LISTB, ENDB, LISTC, ENDC in PROGA, LISTA,
ENDA, LISTC,ENDC in PROGB, and LISTA, ENDA, LISTB, ENDB in PROGC, are external
references. These sample programs given here are used to illustrate linking and relocation. The
following figures give the sample programs and their corresponding object programs. Observe
the object programs, which contain D and R records along with other records.
M000057 06 +ENDC
M000057 06 -LISTC
M00005A 06 +ENDC
M00005A 06 -LISTC
M00005A 06 +PROGA
M00005D 06 -ENDB
M00005D 06 +LISTB
M000060 06 +LISTB
M000060 06 -PROGA
E000020
H PROGB 000000 00007F
D LISTB 000060 ENDB 000070
R LISTA ENDA LISTC ENDC
.
T 000036 0B 03100000 772027 05100000
.
T 000007 0F 000000 FFFFF6 FFFFFF FFFFF0 000060
M000037 05 +LISTA
M00003E 06 +ENDA
M00003E 06 -LISTA
M000070 06 +ENDA
M000070 06 -LISTA
M000070 06 +LISTC
M000073 06 +ENDC
M000073 06 -LISTC
M000073 06 +ENDC
M000076 06 -LISTC
M000076 06 +LISTA
M000079 06 +ENDA
M000079 06 -LISTA
M00007C 06 +PROGB
M00007C 06 -LISTA
E
H PROGC 000000 000051
D LISTC 000030 ENDC 000042
R LISTA ENDA LISTB ENDB
.
T 000018 0C 03100000 77100004 05100000
.
T 000042 0F 000030 000008 000011 000000 000000
M000019 05 +LISTA
M00001D 06 +LISTB
M000021 06 +ENDA
M000021 06 -LISTA
M000042 06 +ENDA
M000042 06 -LISTA
M000042 06 +PROGC
M000048 06 +LISTA
M00004B 06 +ENDA
M00004B 06 -LISTA
M00004B 06 -ENDB
M00004B 06 +LISTB
M00004E 06 +LISTB
M00004E 06 -LISTA
E
The following figure shows these three programs as they might appear in memory after
loading and linking. PROGA has been loaded starting at address 4000, with PROGB and
PROGC immediately following.
For example, the value for REF4 in PROGA is located at address 4054 (the beginning
address of PROGA plus 0054, the relative address of REF4 within PROGA). The following
figure shows the details of how this value is computed.
ESTAB - The ESTAB for the example (refer to the three programs PROGA, PROGB and
PROGC) is as shown below. Each ESTAB entry has four fields: the name of the control
section, a symbol appearing in the control section, its address, and the length of the control
section.

Control section   Symbol   Address   Length
PROGA                      4000      63
                  LISTA    4040
                  ENDA     4054
PROGB                      4063      7F
                  LISTB    40C3
                  ENDB     40D3
PROGC                      40E2      51
                  LISTC    4112
                  ENDC     4124
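The way ESTAB is built during the loader's first pass can be sketched in Python. The dictionaries standing in for the Header and Define records are invented, but the lengths and relative addresses are taken from the example, so the computed addresses reproduce the table above.

```python
def build_estab(object_programs, prog_addr):
    estab, csaddr = {}, prog_addr
    for obj in object_programs:
        estab[obj['H']] = csaddr               # control section address
        for sym, rel in obj['D']:              # externally defined symbols
            estab[sym] = csaddr + rel
        csaddr += obj['LEN']                   # next section loads after
    return estab

progs = [{'H': 'PROGA', 'LEN': 0x63, 'D': [('LISTA', 0x40), ('ENDA', 0x54)]},
         {'H': 'PROGB', 'LEN': 0x7F, 'D': [('LISTB', 0x60), ('ENDB', 0x70)]},
         {'H': 'PROGC', 'LEN': 0x51, 'D': [('LISTC', 0x30), ('ENDC', 0x42)]}]
estab = build_estab(progs, 0x4000)   # estab['PROGC'] is 0x40E2
```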
The object programs for PROGA, PROGB and PROGC are shown below, with the
above modification to the Refer records (observe the R records).
Symbols and addresses in PROGA, PROGB and PROGC are as shown below. These
are the entries of ESTAB. The main advantage of the reference-number mechanism is that it
avoids multiple searches of ESTAB for the same symbol during the loading of a control
section.
Ref No. Symbol Address
1 PROGA 4000
2 LISTB 40C3
3 ENDB 40D3
4 LISTC 4112
5 ENDC 4124
Ref No. Symbol Address
1 PROGB 4063
2 LISTA 4040
3 ENDA 4054
4 LISTC 4112
5 ENDC 4124
Ref No. Symbol Address
1 PROGC 40E2
2 LISTA 4040
3 ENDA 4054
4 LISTB 40C3
5 ENDB 40D3
Loader Options
Loader options allow the user to specify options that modify the standard processing.
The options may be specified in three different ways: using a command
language, as part of the job control language that is processed by the operating system,
or using loader control statements in the source program.
LIBRARY UTLIB
INCLUDE READ (UTLIB)
INCLUDE WRITE (UTLIB)
DELETE RDREC, WRREC
CHANGE RDREC, READ
CHANGE WRREC, WRITE
NOCALL SQRT, PLOT
These commands mean: use UTLIB (say, a utility library); include the READ and WRITE control
sections from the library; delete the control sections RDREC and WRREC from the load. The
CHANGE command causes all external references to the symbol RDREC to be changed to the
symbol READ, and similarly references to WRREC are changed to WRITE. Finally, NOCALL
specifies that no calls are made to the functions SQRT and PLOT, even if they are referenced in
the program.
Linking Loaders
The diagram above shows the processing of an object program using a linking loader.
The source program is first assembled or compiled, producing an object program. A linking
loader performs all linking and loading operations and loads the program into memory for
execution.
Linkage Editors
The figure below shows the processing of an object program using Linkage editor. A
linkage editor produces a linked version of the program – often called a load module or an
executable image – which is written to a file or library for later execution. The linked program
produced is generally in a form that is suitable for processing by a relocating loader.
Some useful functions of a linkage editor: an absolute object program can be created
if the starting address is already known; new versions of a library can be included without
changing the source program; and linkage editors can be used to build packages of
subroutines or other control sections that are generally used together. Linkage editors often
allow the user to specify that external references are not to be resolved by automatic library
search; the linking is then done later by a linking loader. This combination of linkage editor
and linking loader yields savings in space.
Dynamic Linking
Dynamic linking is a scheme that postpones the linking function until execution time. A
subroutine is loaded and linked to the rest of the program when it is first called; this is usually
called dynamic linking, dynamic loading or load on call. One advantage of dynamic linking is
that it allows several executing programs to share one copy of a subroutine or library. In an
object oriented system,
dynamic linking makes it possible for one object to be shared by several programs. Dynamic
linking provides the ability to load the routines only when (and if) they are needed. The actual
loading and linking can be accomplished using operating system service request.
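As a concrete illustration of load on call, Python's ctypes module asks the operating system's dynamic loader to bind the C math library at execution time. The library-name lookup is platform dependent; "libm.so.6" is a common Linux fallback and is an assumption of this sketch.

```python
import ctypes
from ctypes.util import find_library

# The sqrt routine is not linked into this program ahead of time; it is
# located and bound only now, via the OS dynamic-loading service.
libm = ctypes.CDLL(find_library("m") or "libm.so.6")
libm.sqrt.restype = ctypes.c_double
libm.sqrt.argtypes = [ctypes.c_double]
result = libm.sqrt(2.0)
```

Every process using the library this way shares the one loaded copy of its code.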
Bootstrap Loaders
If we ask how the loader itself is loaded into memory, the answer is: when the
computer is started, with no program in memory, a program present in ROM (at an
absolute address) can be executed; this may be the OS itself or a bootstrap loader, which in
turn loads the OS and prepares it for execution. The first record (or records) of such a program
is generally referred to as a bootstrap loader, and it causes the OS to be loaded. Such a loader
is added to the beginning of all object programs that are to be loaded into an empty and idle
system.
Implementation Examples
This section contains a brief description of loaders and linkers for actual computers:
the MS-DOS linker (Pentium architecture), the SunOS linkers (SPARC architecture), and the
Cray MPP linkers (T3E architecture).
MS-DOS Linker
This explains some of the features of Microsoft MS-DOS linker, which is a linker for
Pentium and other x86 systems. Most MS-DOS compilers and assemblers (MASM) produce
object modules, which are stored in .OBJ files. MS-DOS LINK is a linkage editor that
combines one or more object modules to produce a complete executable program in an .EXE
file; this file can later be executed.
The important record types in an MS-DOS object module are:
THEADR - specifies the name of the object module.
MODEND - specifies the end of the module.
PUBDEF - contains a list of the external symbols (called public names) defined in this module.
EXTDEF - contains a list of external symbols referred to in this module but defined elsewhere.
TYPDEF - defines data types.
SEGDEF - describes the segments in the object module (including name, length and alignment).
GRPDEF - specifies how segments are combined into groups.
LNAMES - contains all segment and class names.
LEDATA - contains translated instructions and data.
LIDATA - contains instructions and data in a repeating pattern.
FIXUPP - is used to resolve external references.
SunOS Linkers
SunOS Linkers are developed for SPARC systems. SunOS provides two different linkers
– link-editor and run-time linker.
The link-editor is invoked in the process of assembling or compiling a program and
produces a single output module of one of the following types:
A relocatable object module – suitable for further link-editing
A static executable – with all symbolic references bound and ready to run
A dynamic executable – in which some symbolic references may need to be bound at run time
A shared object – which provides services that can be bound at run time to one or more
dynamic executables
An object module contains one or more sections representing the instructions and data
areas from the source program, along with relocation and linking information and an external
symbol table.
The run-time linker uses the dynamic linking approach: it binds dynamic
executables and shared objects at execution time, performing the relocation and linking
operations needed to prepare the program for execution.
UNIT II
MACHINE DEPENDENT COMPILER FEATURES:
At an elementary level, all code generation is machine dependent. This is because
we must know the instruction set of a computer to generate code for it. There are many more
complex issues involved. They are:
Allocation of registers
such types of code optimization. In this intermediate form, the syntax and semantics of
the source statements have been completely analyzed, but the actual translation into
machine code has not yet been performed. It is easier to analyze and manipulate this intermediate
code than to perform the operations on either the source program or the machine code. The
intermediate form used in a compiler is not strictly dependent on the machine for which the
compiler is designed.
The intermediate form discussed here represents the executable instructions of the
program as a sequence of quadruples of the form (operation, OP1, OP2, result),
where
operation - is some function to be performed by the object code,
OP1 and OP2 - are the operands for the operation, and
result - designates where the resulting value is to be placed.
The entry i1 designates an intermediate result (SUM + VALUE); the second quadruple assigns
the value of this intermediate result to SUM. Assignment is treated as a separate operation (:=).
Note: Quadruples appear in the order in which the corresponding object code
instructions are to be executed. This greatly simplifies the task of analyzing the code for
purposes of optimization. It is also easy to translate quadruples into machine instructions.
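For instance, the statement SUM := SUM + VALUE becomes the following two quadruples, written here as Python tuples, with i1 as the intermediate result named in the text:

```python
quads = [
    ('+',  'SUM', 'VALUE', 'i1'),   # i1 := SUM + VALUE
    (':=', 'i1',  None,    'SUM'),  # assignment as a separate operation
]
```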
For the source program in Pascal shown in fig. 1, the corresponding quadruples are
shown in fig. 27. The READ and WRITE statements are represented with a CALL operation,
followed by PARM quadruples that specify the parameters of the READ or WRITE. The JGT
operation in quadruple 4 in fig. 27 compares the values of its two operands and jumps to
quadruple 15 if the first operand is greater than the second. The J operation in quadruple 14
jumps unconditionally to quadruple 4.
Assignment and use of registers: Here we concentrate on the use of registers as instruction
operands. The bottleneck preventing all computers from performing at high speed is the access
of data from memory. If machine instructions use registers as operands, the operation is much
faster. Therefore, we would prefer to keep in registers all variables and intermediate results that
will be used later in the program.
There are rarely as many registers available as we would like to use. The problem then
becomes which register value to replace when it is necessary to assign a register for some other
purpose. One reasonable approach is to scan the program for the next point at which each register
value would be used. The value that will not be needed for the longest time is the one that should
be replaced. If the register that is being reassigned contains the value of some variable already
stored in memory, the value can simply be discarded. Otherwise, this value must be saved using
a temporary variable. This is one of the functions performed by the GETA procedure. In making
register assignments, a compiler must also consider the control flow of the program. If there are
jump operations in the program, a register may not contain the value that is intended; the
contents may be changed. The existence of jump instructions creates difficulty in keeping
track of register contents. One way to deal with this problem is to divide the program into basic
blocks.
A basic block is a sequence of quadruples with one entry point, which is at the
beginning of the block, one exit point, which is at the end of the block, and no jumps within the
block. Since procedure calls can have unpredictable effects on register contents, a CALL
operation is usually considered to begin a new basic block. The assignment and use of registers
within a basic block can proceed as described previously. When control passes from one block to
another, all values currently held in registers are saved in temporary variables.
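Partitioning a quadruple sequence into basic blocks can be sketched as follows. The operation names, and the convention that a jump's fourth field holds the target quadruple index, are assumptions of this sketch.

```python
def basic_blocks(quads):
    jumps = {'J', 'JGT', 'JLT', 'JEQ', 'CALL'}    # transfer-of-control ops
    leaders = {0}                                  # first quadruple leads
    for i, (op, _, _, target) in enumerate(quads):
        if op in jumps:
            leaders.add(i + 1)                     # quad after the jump
            if isinstance(target, int):
                leaders.add(target)                # the jump target
    cuts = sorted(l for l in leaders if l < len(quads))
    return [quads[a:b] for a, b in zip(cuts, cuts[1:] + [len(quads)])]
```

A six-quadruple loop containing one JGT and one J splits into four blocks: the initialization, the test, the loop body, and the code after the loop.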
STRUCTURED VARIABLES
Structured variables discussed here are arrays, records, strings and sets. The
primarily consideration is the allocation of storage for such variable and then the
generation of code to reference then.
If each integer variable occupies one word of memory, then we require 10 words of
memory to store this array. In general, an array declaration is ARRAY [ l .. u ] OF INTEGER.
The data can be stored in memory in two different ways: row-major and column-major.
In row-major order, all array elements that have the same value of the first subscript are stored
in contiguous locations. Another way of looking at this is to scan the words of the array in
sequence and observe the subscript values: in row-major order, the rightmost subscript varies
most rapidly.
Fig. 30(b) shows the column-major way of storing the data in memory. All elements
that have the same value of the second subscript are stored together; this is called column-major
order. In column-major order, the leftmost subscript varies most rapidly.
To refer to an element, we must calculate the address of the referenced element relative
to the base address of the array. The compiler generates code to place this relative address in an
index register; indexed addressing then makes it easy to access the desired array element.
(1) One-Dimensional Array: Consider the declaration A: ARRAY [ 1 .. 10 ] OF INTEGER.
On a SIC machine each integer occupies 3 bytes, so the relative address of A[6] within the
array's data area is (6 - 1) x 3 = 15. Generated code should refer to A using indexed
addressing, after having placed the relative address of the desired element in the index
register.
(2) Two-Dimensional Array: Consider a matrix B, stored in row-major order, whose rows
each contain 6 elements. To access element B[ 2, 3 ], we must skip over two complete
rows before arriving at the beginning of row 2. Each row contains 6 elements, so we skip
6 x 2 = 12 array elements, and then skip over the first two elements of row 2 to arrive at
B[ 2, 3 ]. This makes a total of 12 + 2 = 14 elements between the beginning of the array and
element B[ 2, 3 ]. If each element occupies 3 bytes as in SIC, then B[ 2, 3 ] is located at
relative address 14 x 3 = 42 within the array.
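The address arithmetic above can be sketched in one small function. The indexing assumptions (row subscripts starting at 0, column subscripts at 1, 3 bytes per word) are chosen so that B[2, 3] with 6-element rows reproduces the 42 computed in the text.

```python
WORD = 3   # bytes per integer word on SIC

def row_major_offset(row, col, ncols, row_base=0, col_base=1):
    # complete rows skipped, plus elements skipped within the row
    words = (row - row_base) * ncols + (col - col_base)
    return words * WORD
```

`row_major_offset(2, 3, 6)` gives 14 words, i.e. relative address 42.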
The symbol-table entry for an array usually specifies the type of the elements, the number of
dimensions, and the subscript bounds. This information is sufficient for the compiler to
generate the code required for array references.
In some languages, like FORTRAN 90, the values of ROWS and COLUMNS are not known
at compilation time, so the compiler cannot directly generate such code. Instead, the compiler
creates a descriptor, called a dope vector, for the array. The descriptor includes space for
storing the lower and upper bounds of each array subscript. When storage is allocated for the
array, the values of these bounds are computed and stored in the descriptor. The generated
code for an array reference uses the values from the descriptor to calculate relative addresses
as required. The descriptor may also include the number of dimensions of the array, the type
of the array elements, and a pointer to the beginning of the array. This information can be
useful if the allocated array is passed as a parameter to another procedure.
The compilation of other structured variables like records, strings and sets requires the
same kinds of storage allocation. The compiler must store information concerning the
structure of the variable and use this information to generate code to access components of the
structure, and it must construct a descriptor for situations in which the required information is
not known at compilation time.
... FOR I := 1 TO 10 DO
X[ I, 2 * J - 1 ] := X[ I, 2 * J ]
Fig. 33(a)
The operand is not changed in value between quadruples 5 and 12, and it is not possible to
reach quadruple 12 without passing through quadruple 5 first, because the quadruples are part
of the same basic block. Therefore, quadruples 5 and 12 compute the same value. This means
we can delete quadruple 12 and replace any reference to its result (i10) with a reference to i3,
the result of quadruple 5. This eliminates the duplicate calculation of 2 * J, which we
identified previously as a common subexpression in the source statement.
After the substitution of i3 for i10, quadruples 6 and 13 are the same except for the
name of the result. Hence quadruple 13 can be removed and i4 substituted for i11 wherever it is
used. Similarly, quadruples 10 and 11 can be removed because they are equivalent to quadruples
3 and 4.
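The elimination just described can be sketched as a small pass over one basic block. This is a simplification: it remembers each (operation, operand, operand) triple, drops a quadruple that recomputes one, renames its result, and forgets triples whose operands are reassigned.

```python
def eliminate_common(quads):
    seen, rename, out = {}, {}, []
    for op, a, b, res in quads:
        a, b = rename.get(a, a), rename.get(b, b)
        key = (op, a, b)
        if key in seen:
            rename[res] = seen[key]     # reuse the earlier result instead
            continue                    # delete the duplicate quadruple
        seen[key] = res
        # forget remembered triples that used the variable just changed
        seen = {k: v for k, v in seen.items() if res not in (k[1], k[2])}
        out.append((op, a, b, res))
    return out
```

Fed four quadruples computing 2 * J and 2 * J - 1 twice each, it keeps only the first pair, mirroring the deletion of quadruples 12 and 13.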
STORAGE ALLOCATION
All the program-defined variables and temporary variables, including the locations used to
save the return address, use a simple type of storage assignment called static allocation.
When procedures are called recursively, static allocation cannot be used. This is
explained with an example. Fig. 38(a) shows the operating system calling the program MAIN.
The return address from register L is stored at a static memory location RETADR within
MAIN.
MAIN has called the procedure SUB. The return address for the call has been stored at a
fixed location within SUB (invocation 2). If SUB now calls itself recursively, as shown, a
problem occurs: SUB stores the return address for invocation 3 into RETADR from register L.
This destroys the return address for invocation 2. As a result, there is no possibility of ever
making a correct return to MAIN.
There is also no provision for saving the register contents. When the recursive call is made,
SUB may set a few variables, and their previous values may be destroyed. However, these
previous values may be needed by invocation 2 of SUB after the return from the recursive call.
Hence it is necessary to preserve the previous values of any variables used by SUB, including
parameters, temporaries, return addresses, register save areas etc., when a recursive call is
made. This is accomplished with a dynamic storage allocation technique. In this technique,
each procedure call creates an activation record that contains storage for all the variables used
by the procedure. If the procedure is called recursively, another activation record is created.
Each activation record is associated with a particular invocation of the procedure, not with the
procedure itself. An activation record is not deleted until a return has been made from the
corresponding invocation.
Activation records are typically allocated on a stack, with the current record at the top
of the stack. When the procedure MAIN has been called, its activation record appears on the
stack, and the base register B is set to indicate the starting address of this current activation
record. The first word in an activation record would normally contain a pointer PREV to the
previous record on the stack; since this record is the first, the pointer value is null. The second
word of the activation record contains a pointer NEXT to the first unused word of the stack,
which will be the starting address for the next activation record created. The third word
contains the return address for this invocation of the procedure, and the remaining words
contain the values of the variables used by the procedure.
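A minimal model of this stack of activation records, with invented return-address values, shows why the recursive call no longer destroys anything:

```python
class Frame:
    def __init__(self, prev, retaddr):
        self.prev = prev          # PREV: previous record on the stack
        self.retaddr = retaddr    # return address for this invocation
        self.locals = {}          # variables of this invocation only

stack = []

def call(retaddr):                # push a fresh activation record
    stack.append(Frame(stack[-1] if stack else None, retaddr))

def ret():                        # pop and return to the saved address
    return stack.pop().retaddr

call(0x100)   # OS calls MAIN    (invocation 1)
call(0x210)   # MAIN calls SUB   (invocation 2)
call(0x260)   # SUB calls SUB    (invocation 3)
```

Each invocation's return address lives in its own record, so returning yields 0x260 and then 0x210, leaving MAIN's record intact.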
Each procedure corresponds to a block. Note that blocks may be nested within other
blocks. Example: procedures B and D are nested within procedure A, and procedure C is nested
within procedure B. Each block may contain a declaration of variables. A block may also refer to
variables that are defined in any block that contains it, provided the same names are not
redefined in the inner block. Variables cannot be used outside the block in which they are
declared.
The compiler design is briefly discussed in this section. The compiler may be divided into one
or more passes.
COMPILER PASSES
A one-pass compiler for a subset of the Pascal language was discussed in section 1. In this
design the parsing process drove the compiler: the lexical scanner was called when the parser
needed another input token, and a code-generation routine was invoked as the parser recognized
each language construct. The code optimization techniques discussed cannot be applied in full
to a one-pass compiler without intermediate code generation. A one-pass compiler is, however,
efficient at generating object code.
A one-pass compiler cannot be used for the translation of all languages. FORTRAN and
Pascal programs declare variables at the beginning of the program, and any variable that is not
declared is assigned characteristics by default.
A one-pass compiler can fix up forward-reference jump instructions without problems, as in a
one-pass assembler. But it is difficult to fix up references if the declaration of an identifier
appears after it has been used in the program, as is allowed in some programming languages.
Example:
X:=Y*Z
If all the variables X, Y and Z are of type INTEGER, the object code for this statement
might consist of a simple integer multiplication followed by storage of the result. If the variables
are a mixture of REAL and INTEGER types, one or more conversion operations will need to be
included in the object code, and floating-point arithmetic instructions may be used. Obviously
the compiler cannot decide what machine instructions to generate for this statement unless
information about the operands is available. The statement may even be illegal for certain
combinations of operand types. Thus a language that allows forward references to data items
cannot be compiled in one pass.
(1) One-Pass Compilers: There are a number of factors that should be considered in deciding
between one-pass and multi-pass compiler designs. Computers running student jobs tend to
spend a large amount of time performing compilations. The resulting object code is usually
executed only once or twice for each compilation, and these test runs are normally very short.
In such an environment, improvement in the speed of compilation can lead to significant
benefits in system performance and job turnaround time.
(2) Multi-Pass Compilers: If programs are executed many times for each
compilation, or if they process large amounts of data, then speed of execution becomes more
important than speed of compilation. In such a case, we might prefer a multi-pass compiler
design that could incorporate sophisticated code-optimization techniques.
Multi-pass compilers are also used when the amount of memory, or other system
resources, is severely limited. The requirements of each pass can be kept smaller if the work of
compilation is divided into several passes.
Other factors may also influence the design of the compiler. If a compiler is divided into
several passes, each pass becomes simpler and therefore easier to understand, write and test.
Different passes can be assigned to different programmers and can be written and tested in
parallel, which shortens the overall time required for compiler construction.
INTERPRETERS
An interpreter performs the lexical and syntactic analysis functions of a compiler and
then translates the source program into an internal form. The internal form may, for example,
be a sequence of quadruples.
After translating the source program into an internal form, the interpreter executes the
operations specified by the program. During this phase, an interpreter can be viewed as a set of
subroutines; the internal form of the program drives the execution of these subroutines.
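This execution phase can be sketched as a dispatch loop over quadruples; the tiny operation set and the use of a dictionary for variables are simplifications of this sketch.

```python
def interpret(quads, env):
    def val(x):                       # operand: variable name or constant
        return env[x] if isinstance(x, str) else x
    pc = 0
    while pc < len(quads):
        op, a, b, res = quads[pc]
        if op == '+':                 # each op acts as a small subroutine
            env[res] = val(a) + val(b)
        elif op == ':=':
            env[res] = val(a)
        elif op == 'JGT':             # res holds the target quad index
            pc = res if val(a) > val(b) else pc + 1
            continue
        pc += 1
    return env
```

Running the two quadruples for X := X + 1 against an environment where X is 41 leaves X at 42.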
In some languages the type of a variable can change during the execution of a program.
Dynamic scoping may be used, in which the variables referred to by a function or subroutine
are determined by the sequence of calls made during execution, not by the nesting of blocks in
the source program. It is difficult to compile such languages efficiently while allowing for
dynamic changes in the types of variables and the scope of names. These features can be more
easily handled by an interpreter that provides delayed binding of symbolic variable names to
data types and locations.
P-CODE COMPILERS
P-code compilers, also called bytecode compilers, are very similar in concept to interpreters. In a P-code compiler, the intermediate form is the machine language for a hypothetical computer, often called a pseudo-machine or P-machine.
The main advantage of this approach is the portability of software. It is not necessary for the compiler to generate different code for different computers, because the P-code object program can be executed on any machine that has a P-code interpreter. Even the compiler itself can be transported if it is written in the language that it compiles. To accomplish this, the source version of the compiler is compiled into P-code; this P-code can then be interpreted on another machine. In this way a P-code compiler can be used without modification on a wide variety of systems if a P-code interpreter is written for each different machine.
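The portability argument can be sketched as follows; this is a hypothetical stack-oriented P-machine, and the instruction names are invented for illustration:

```python
# A hypothetical stack-oriented P-machine interpreter. The same P-code
# object program runs unchanged on any host machine that has this
# interpreter -- which is the source of the portability described above.

def interpret(pcode):
    stack, output = [], []
    for instr in pcode:
        op = instr[0]
        if op == "PUSH":          # push a literal onto the stack
            stack.append(instr[1])
        elif op == "ADD":         # pop two values, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":         # pop two values, push their product
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "PRINT":       # pop the top of stack and output it
            output.append(stack.pop())
    return output

# Same object program on any host: compute and print (2 + 3) * 4
program = [("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",), ("PRINT",)]
print(interpret(program))   # [20]
```

Porting the system to a new machine means rewriting only this interpreter loop, not the compiler or the P-code programs.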
The design of a P-machine and the associated P-code is often related to the requirements of the language being compiled. For example, the P-code for a Pascal compiler might include single P-instructions that perform operations commonly needed in Pascal programs.
This simplifies the code generation process, leading to a smaller and more efficient compiler.
The P-code object program is often much smaller than a corresponding machine code program. This is particularly useful on machines with severely limited memory size. However, the interpretive execution of a P-code program may be much slower than the execution of the equivalent machine code. Many P-code compilers are designed for a single user running on a dedicated microcomputer system. In that case, the speed of execution may be relatively insignificant, because the limiting factor in system performance may be the response time and "think time" of the user.
If execution speed is important, some P-code compilers support the use of machine-language subroutines. By rewriting a small number of commonly used routines in machine language, rather than P-code, it is often possible to improve performance. Of course, this approach sacrifices some of the portability associated with the use of P-code compilers.
COMPILER-COMPILERS
A compiler-compiler is a software tool that can be used to help automate compiler construction. Such tools are also called compiler generators or translator-writing systems.
The compiler writer also provides a set of semantic or code-generation routines. There is one such routine for each rule of the grammar; the parser calls this routine each time it recognizes the language construct described by the associated rule. Some compiler-compilers can parse a larger section of the program before calling a semantic routine. In that case, an internal form of the statements that have been analyzed, such as a portion of the parse tree, may be passed to the semantic routine. This approach is often used when code optimization is to be performed. Compiler-compilers frequently provide special languages, notations, data structures, and other similar facilities that can be used in the writing of semantic routines.
UNIT III
DEFINITION OF DOS:
In the 1960s an operating system was defined as the software that controls the hardware. A better definition is needed today: an operating system is the set of programs, implemented in either software or firmware, that make the hardware usable. Hardware provides "raw computing power"; operating systems make this computing power conveniently available to users, and they manage the hardware carefully to achieve good performance.
Operating systems are primarily resource managers; the main resource they manage is computer hardware, in the form of processors, storage, input/output devices, communication devices, and data. Operating systems perform many functions, such as implementing the user interface, sharing hardware among users, allowing users to share data among themselves, preventing users from interfering with one another, scheduling resources among users, facilitating input/output, recovering from errors, accounting for resource usage, facilitating parallel operations, organizing data for secure and rapid access, and handling network communications.
1) Convenience: an operating system makes a computer more convenient to use.
2) Efficiency: an operating system allows the computer system's resources to be used in an efficient manner.
3) Ability to evolve: an operating system should permit the development and introduction of new system functions without interfering with existing services.
HISTORY OF DOS:
Operating systems have evolved over the last 40 years through a number of distinct phases or generations. In the 1940s, the earliest electronic digital computers had no operating system; on machines of that time, programs were entered one bit at a time on rows of mechanical switches. Later, machine language programs were entered on punched cards, and assembly languages were developed to speed the programming process.
The 1950's:
The General Motors Research Laboratories implemented the first operating system in the early 1950s for their IBM 701. The systems of the 1950s generally ran only one job at a time and smoothed the transition between jobs to get maximum utilization of the computer system. These were called single-stream batch processing systems, because programs and data were submitted in groups or batches.
The 1960’s:
The systems of the 1960s were also batch processing systems, but they were able to take better advantage of the computer's resources by running several jobs at once. They contained many peripheral devices such as card readers, card punches, printers, tape drives and disk drives. Any one job rarely utilized all of a computer's resources effectively. Operating system designers realized that when one job was waiting for an I/O operation to complete before it could continue using the processor, some other job could use the idle processor. Similarly, when one job was using the processor, other jobs could be using the various input/output devices. In fact, running a mixture of diverse jobs appeared to be the best way to optimize computer utilization. So operating system designers developed the concept of multiprogramming, in which several jobs are in main memory at once; a processor is switched from job to job as needed to keep several jobs advancing while keeping the peripheral devices in use.
More advanced operating systems were developed to service multiple interactive users at once. Timesharing systems were developed to multiprogram large numbers of simultaneous interactive users. Many of the time-sharing systems of the 1960s were multimode systems that also supported batch processing.
The key time-sharing development efforts of this period included the CTSS system developed at MIT, the TSS system developed by IBM, and the Multics system developed at MIT as the successor to CTSS. Turnaround time, that is, the time between submission of a job and the return of results, was reduced to minutes or even seconds.
The 1980's:
The 1980s was the decade of the personal computer and the workstation. Individuals could have their own dedicated computers for performing the bulk of their work, and they used communication facilities for transmitting data between systems. Computing was distributed to the sites at which it was needed, rather than bringing the data to be processed to some central, large-scale computer installation. The key was to transfer information between computers in computer networks. E-mail, file transfer, and remote database access applications, as well as the client/server model, became widespread.
The 1990's:
In the 1990s distributed computing came into use, in which computations are partitioned into sub-computations that can be executed on other processors in multiprocessor computers and in computer networks. Networks can be dynamically configured as new devices and software are added or removed. When a new server is added, it makes itself known to the network: it tells the network about its capabilities, billing policies, accessibility, and so forth. Clients need not know all the details of the network; instead they contact locating brokers for the services provided by servers. The locating brokers know which servers are available, where they are, and how to access them. This kind of connectivity is facilitated by open system standards and protocols.
Computing is destined to become very powerful and very portable. In recent years, laptop computers have been introduced that enable people to carry their computers with them wherever they go. With the development of OSI communication protocols and the integrated services digital network (ISDN), people will be able to communicate and transmit data worldwide with high reliability.
UNIX:
The UNIX operating system was originally designed in the late 1960s; its elegance attracted researchers in the universities and in industry. UNIX is the only operating system that has been implemented on computers ranging from micros to supercomputers.
PROCESS CONCEPTS:
The notion of process is central to the understanding of today's computer systems, which perform and keep track of many simultaneous activities.
DEFINITIONS OF “PROCESS”:
The term "process" was first used by the designers of the Multics system in the 1960s. A process has been variously defined as:
* A program in execution.
* An asynchronous activity.
* The "animated spirit" of a procedure.
* The "locus of control" of a procedure in execution.
* That which is manifested by the existence of a "process control block" in the operating system.
* That entity to which processors are assigned.
* The "dispatchable" unit.
PROCESS STATES:
A process goes through a series of discrete process states, and various events can cause a process to change state.
A process is said to be running (i.e., in the running state) if it currently has the CPU. A process is said to be ready (i.e., in the ready state) if it could use a CPU if one were available. A process is said to be blocked (i.e., in the blocked state) if it is waiting for some event to happen (such as an I/O completion event) before it can proceed.
For example, consider a single-CPU system: only one process can run at a time, but several processes may be ready and several may be blocked. The system therefore maintains a ready list of ready processes and a blocked list of blocked processes. The ready list is maintained in priority order, so that the next process to receive the CPU is the first process on the list.
When a job is admitted to the system, a corresponding process is created and normally inserted at the back of the ready list. The process gradually moves to the head of the ready list as the processes before it complete their turns at using the CPU. When the process reaches the head of the list and the CPU becomes available, the process makes a state transition from the ready state to the running state. The assignment of the CPU to the first process on the ready list is called dispatching, and is performed by a system entity called the dispatcher. We indicate this transition as follows:
dispatch(processname): ready → running
To prevent any one process from monopolizing the system, the operating system sets a hardware interrupting clock (or interval timer) to allow a process to run for a specific time interval or quantum. If the process does not leave the CPU before the time interval expires, the interrupting clock generates an interrupt, causing the operating system to regain control. The operating system then makes the previously running process ready, and makes the first process on the ready list running. This state transition is indicated as:
timerrunout(processname): running → ready
If a running process initiates an input/output operation before its quantum expires, the running process voluntarily leaves the CPU. This state transition is:
block(processname): running → blocked
When an input/output operation (or some other event the process is waiting for) completes, the process makes the transition from the blocked state to the ready state. This transition is:
wakeup(processname): blocked → ready
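The four transitions above can be modeled as a small state machine. This is a Python sketch for illustration; the event names follow the text:

```python
# A tiny state machine for the process states described above. Each event
# maps to its (source state, destination state) pair.

TRANSITIONS = {
    "dispatch":    ("ready",   "running"),  # first process on ready list gets CPU
    "timerrunout": ("running", "ready"),    # quantum expires, clock interrupt
    "block":       ("running", "blocked"),  # process voluntarily initiates I/O
    "wakeup":      ("blocked", "ready"),    # awaited event (e.g. I/O) completes
}

def transition(state, event):
    """Apply an event to a process in `state`; reject illegal transitions."""
    src, dst = TRANSITIONS[event]
    if state != src:
        raise ValueError(f"cannot {event} a process in state {state!r}")
    return dst

# A process is dispatched, blocks for I/O, wakes up, is dispatched again,
# and finally has its quantum expire:
s = "ready"
for event in ("dispatch", "block", "wakeup", "dispatch", "timerrunout"):
    s = transition(s, event)
print(s)   # ready
```

Note that there is no direct blocked → running transition: a woken process must pass through the ready list and be dispatched like any other.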
The PCB (process control block) is a data structure containing certain important information about the process, including:
INTERRUPT PROCESSING:
An interrupt is an event that alters the sequence in which a processor executes instructions. It is generated by the hardware of the computer system. When an interrupt occurs:
INTERRUPT CLASSES:
* SVC (supervisor call) interrupts:
These are initiated by a running process that executes the SVC instruction. An SVC is a user-generated request for a particular system service, such as performing input/output, obtaining more storage, or communicating with the system operator.
* I/O interrupts:
These are initiated by the input/output hardware. They signal to the CPU that the status of a channel or device has changed; for example, they are caused when an I/O operation completes or when an I/O error occurs.
* External interrupts:
* Restart interrupts:
These occur when the operator presses the restart button, or when a restart signal-processor instruction arrives from another processor on a multiprocessor system.
* Program check interrupts:
These may occur as a program's machine language instructions are executed. The problems they signal include division by zero, arithmetic overflow or underflow, data in the wrong format, an attempt to execute an invalid operation code, an attempt to reference a memory location that does not exist, or an attempt to reference a protected resource.
* Machine check interrupts:
* Do we place the program as close as possible into available memory slots to minimize wasted space?
If a new program needs to be placed in main storage and main storage is currently full, which of the other programs do we displace? Should we replace the oldest programs, or should we replace those that are least frequently used or least recently used?
Storage management strategies are intended to obtain the best possible use of the main storage resource. They are divided into the following categories:
1. Fetch strategies
2. Placement strategies
3. Replacement strategies
Fetch strategies are concerned with when to obtain the next piece of program or data for transfer to main storage from secondary storage. In demand fetch, the next piece of program or data is brought into main storage when it is referenced by a running program. Placement strategies are concerned with determining where in main storage to place an incoming program. Replacement strategies are concerned with determining which piece of program or data to displace to make room for incoming programs.
In contiguous storage allocation each program had to occupy a single contiguous block of
storage locations. In noncontiguous storage allocation, a program is divided into several blocks
or segments that may be placed throughout main storage in pieces not necessarily adjacent to one
another.
The earliest computer systems allowed only a single person at a time to use the machine. All of the machine's resources were at the user's disposal. The user wrote all the code necessary to implement a particular application, including the highly detailed machine-level input/output instructions. Eventually, the code to implement these basic functions was consolidated into an input/output control system (IOCS).
If a particular program section is not needed for the duration of the program’s execution, then
another section of the program may be brought in from the secondary storage to occupy the
storage used by the program section that is no longer needed.
In single user contiguous storage allocation systems, the user has complete control over
all of main storage. Storage is divided into a portion holding operating system routines, a portion
holding the user’s program and an unused portion.
Suppose the user destroys the operating system; for example, suppose certain operating system areas are accidentally changed. The operating system should be protected from the user. Protection is implemented by the use of a single boundary register built into the CPU. Each time a user program refers to a storage address, the boundary register is checked to be certain that the user is not about to destroy the operating system. The boundary register contains the highest address used by the operating system. If the user tries to enter the operating system, the instruction is intercepted and the job terminates with an appropriate error message.
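The boundary check described above can be sketched as follows; the boundary address used here is a hypothetical value for illustration:

```python
# A sketch of single-boundary-register protection: every user storage
# reference is checked against the highest address used by the operating
# system before the reference is allowed to proceed.

BOUNDARY = 0x3FFF   # hypothetical highest address used by the OS

def check_reference(address):
    """Intercept any user reference at or below the boundary."""
    if address <= BOUNDARY:
        # In a real system this would terminate the job with an error message.
        raise MemoryError(f"address {address:#x} is inside the operating system")
    return address   # reference permitted

print(check_reference(0x8000))     # in user space: allowed
try:
    check_reference(0x1000)        # inside the OS: intercepted
except MemoryError as e:
    print("error:", e)
```

The check is performed by hardware on every reference, so it adds no software overhead to the running user program.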
The user needs to enter the operating system from time to time to obtain services such as input/output. This problem is solved by giving the user a specific instruction with which to request services from the operating system (i.e., a supervisor call instruction). A user wanting to read from tape will issue an instruction asking the operating system to do so on the user's behalf.
Early single-user real storage systems were dedicated to one job for more than the job's execution time: during job setup and job teardown the computer sat idle. Designers realized that if they could automate job-to-job transition, they could considerably reduce the amount of time wasted between jobs. In single-stream batch processing, jobs are grouped in batches by loading them consecutively onto tape or disk. A job stream processor reads the job control language statements and facilitates the setup of the next job. When the current job terminates, the job stream reader automatically reads in the control language statements for the next job and facilitates the transition to it.
FIXED-PARTITION MULTIPROGRAMMING:
Even with batch processing systems, single-user systems still waste a considerable amount of the computing resource: the program alternates between periods of CPU use and periods of waiting for I/O.
[Figure: a single job's activity over time — alternating "use" and "wait for I/O" intervals]
* The program consumes the CPU resource until an input or output operation is needed.
* When the input/output request is issued, the job often cannot continue until the requested data is either sent or received.
* Input and output speeds are extremely slow compared with CPU speeds.
* To increase the utilization of the CPU, multiprogramming systems are implemented, in which several users simultaneously compete for system resources.
* An advantage of multiprogramming is that several jobs reside in the computer's main storage at once. Thus when one job requests input/output, the CPU may immediately be switched to another job and may do calculations without delay. Both input/output and CPU calculations can therefore occur simultaneously. This greatly increases CPU utilization and system throughput.
* Multiprogramming normally requires considerably more storage than a single-user system, because multiple user programs must be stored in main storage at once.
* Jobs were translated with absolute assemblers and compilers to run only in a specified partition.
* If a job was ready to run and its partition was occupied, that job had to wait, even if other partitions were available.
[Figure: fixed-partition multiprogramming with absolute translation and loading — storage is divided into the operating system area and partitions 1, 2, and 3, with a separate job queue for each partition; queued jobs A, B, and C can run only in the partition for which they were translated]
* Relocating compilers, assemblers, and loaders are used to produce relocatable programs that can run in any available partition that is large enough to hold them.
* This scheme eliminates the storage waste inherent in multiprogramming with absolute translation and loading.
[Figure: fixed-partition multiprogramming with relocatable translation and loading — a single job queue feeds the operating system's partitions]
* With two registers, the low and high boundaries of a user partition can be delineated, or the low boundary (or high boundary) and the length of the region can be indicated.
* When the user wants any service to be performed by the operating system, the user requests it through a supervisor call instruction (SVC).
* This allows the user to cross the boundary into the operating system without compromising operating system security.
[Figure: storage protection in contiguous-allocation multiprogramming — two boundary registers delimit the partition of the currently active user (user B in partition 2); the CPU checks every storage reference against these boundaries]
There are two difficulties with the use of equal-size fixed partitions.
UCAS
* A program may be too big to fit into a partition. In this case, the programmer must design the program with the use of overlays, so that only a portion of the program need be in main memory at any one time.
* Main memory use is extremely inefficient. Any program, no matter how small, occupies an entire partition. In our example, there may be a program that occupies less than 128KB of memory, yet it takes up a 512KB partition whenever it is swapped in. This phenomenon, in which there is wasted space internal to a partition because the block of data loaded is smaller than the partition, is referred to as internal fragmentation.
* To overcome the problems of fixed-partition multiprogramming, variable-partition multiprogramming allows jobs to occupy as much space as they need.
An example of variable partition programming is shown below using 1MB of main memory.
Main memory is empty except for the operating system.(Fig a). The first three processes are
loaded in starting where the operating system ends, and occupy just enough space for each
process. (Fig b,c,d).
This leaves a "hole" (i.e., an unused space) at the end of memory that is too small for a fourth process. At some point, none of the processes in memory is ready. The operating system therefore swaps out process 2 (Fig e), which leaves sufficient room to load a new process, process 4 (Fig f). Because process 4 is smaller than process 2, another small hole is created.
[Figure: variable-partition allocation snapshots (a)–(h) — processes 1, 2, and 3 are loaded after the operating system; process 2 is swapped out; process 4 is loaded into part of the freed space, leaving a small unused hole]
After that, the operating system swaps out process 1 and swaps process 2 back in (Fig h). As this example shows, the method starts out well but eventually leads to a lot of small holes in memory. As time goes on, memory becomes more and more fragmented, and memory utilization declines. This phenomenon is called external fragmentation. One technique for overcoming external fragmentation is compaction.
COALESCING HOLES:
When a job finishes in a variable-partition multiprogramming system, we can check whether the storage being freed borders on other free storage areas (holes). If it does, then we may record in the free storage list either:
The process of merging adjacent holes to form a single larger hole is called coalescing. By coalescing holes, we reclaim the largest possible contiguous block of storage.
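The merging described above can be sketched as follows, under the assumption that the free storage list records (start, size) pairs:

```python
# A sketch of coalescing: when a region is freed, merge it with any
# bordering holes on the free storage list. Holes are (start, size) pairs.

def coalesce(free_list, start, size):
    """Add a freed block to the free list and merge adjacent holes."""
    holes = sorted(free_list + [(start, size)])
    merged = [holes[0]]
    for s, sz in holes[1:]:
        last_s, last_sz = merged[-1]
        if last_s + last_sz == s:            # borders the previous hole
            merged[-1] = (last_s, last_sz + sz)
        else:
            merged.append((s, sz))
    return merged

# Holes at 100 (size 50) and 200 (size 30); freeing 150..199 joins all three
# into the largest possible contiguous block.
print(coalesce([(100, 50), (200, 30)], 150, 50))   # [(100, 130)]
```

If the freed block borders no hole, it is simply inserted into the list as a new hole of its own.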
STORAGE COMPACTION:
Sometimes when a job requests a certain amount of main storage, no individual hole is large enough to hold the job, even though the sum of all the holes is larger than the storage needed by the new job.
[Figure: scattered holes (e.g. a 2K hole, a 5K hole, and a 7K hole) separated by in-use regions]
For example, suppose user 6 wants to execute a program requiring 100K of main storage. Under contiguous storage allocation the program cannot be loaded, because although 100K of storage is available, it is divided into holes of 20K, 40K and 40K. The memory space is thus wasted. To avoid this waste, the technique of storage compaction is used.
STORAGE "HOLES" IN VARIABLE PARTITION MULTIPROGRAMMING:
[Figure: as user 2 (20K) and user 4 (40K) complete and free their storage, the holes they leave are scattered among the regions still held by users 1 (10K), 3 (30K), and 5 (50K)]
The technique of storage compaction involves moving all occupied areas of storage to one end or the other of main storage. This leaves a single large hole instead of the numerous small holes common in variable-partition multiprogramming. Now all of the available free storage is contiguous, so a waiting job can run if its memory requirement is met by the single hole that results from compaction.
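Compaction can be sketched as follows; the region names and sizes here are illustrative:

```python
# A sketch of storage compaction: slide every occupied region toward
# address 0, leaving one large hole at the high end of storage.

def compact(regions, total_size):
    """regions: list of (name, size) in address order.
    Returns the relocated layout as (name, new_address, size) triples,
    plus the size of the single hole left at the end."""
    layout, addr = [], 0
    for name, size in regions:
        layout.append((name, addr, size))   # region relocated to `addr`
        addr += size
    hole = total_size - addr                # single contiguous hole remains
    return layout, hole

occupied = [("OS", 10), ("USER1", 10), ("USER3", 30), ("USER5", 50)]
layout, hole = compact(occupied, 200)
print(layout)   # [('OS', 0, 10), ('USER1', 10, 10), ('USER3', 20, 30), ('USER5', 50, 50)]
print(hole)     # 100
```

Note that every moved region gets a new address, which is why the relocation information mentioned below must be kept available: all addresses inside the moved jobs must be adjusted.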
[Figure: after compaction — the operating system places all "in use" blocks (users 1, 3, and 5) together, leaving free storage as a single, large hole]
Compaction has drawbacks:
* The system must stop everything while it performs the compaction. This can result in erratic response times for interactive users and could be devastating in real-time systems.
* Compaction involves relocating the jobs that are in storage. This means that relocation information, ordinarily lost when a program is loaded, must now be maintained in readily accessible form.
* With a normal, rapidly changing job mix, it is necessary to compact frequently.
Storage placement strategies are used to determine where in main storage to place incoming programs and data.
1) Best-fit strategy: An incoming job is placed in the hole in main storage in which it fits most tightly, leaving the smallest amount of unused space.
[Figure: best-fit strategy — place the job in the smallest hole in which it will fit. The free storage list is kept in ascending order by hole size: E (5K), C (14K), A (16K), G (30K). A request for 13K is placed in the 14K hole at C]
2) First-fit strategy: An incoming job is placed in main storage in the first available hole large enough to hold it.
[Figure: first-fit strategy — place the job in the first hole on the free storage list in which it will fit. The free storage list is kept in storage address order (or sometimes in random order): A (16K), C (14K), E (5K), G (30K). A request for 13K is placed in the 16K hole at A]
3) Worst-fit strategy: Worst fit places a program in the hole in main storage in which it fits worst, i.e., the largest possible hole. The idea is that after the program is placed in this large hole, the remaining hole is often also large and is thus able to hold a relatively large new program.
WORST-FIT STRATEGY:
[Figure: worst-fit strategy — the free storage list is kept in descending order by hole size: G (30K), A (16K), C (14K), E (5K). A request for 13K is placed in the 30K hole at G]
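The three placement strategies can be sketched together, using the hole sizes from the figures above (16K at A, 14K at C, 5K at E, 30K at G) and a 13K request:

```python
# Sketches of the three placement strategies. Holes are (label, size) pairs;
# sizes are in K, following the figures.

def best_fit(holes, request):
    """Smallest hole that is still large enough (tightest fit)."""
    candidates = [h for h in holes if h[1] >= request]
    return min(candidates, key=lambda h: h[1]) if candidates else None

def first_fit(holes, request):
    """First adequate hole; holes assumed in storage-address order."""
    return next((h for h in holes if h[1] >= request), None)

def worst_fit(holes, request):
    """Largest hole, leaving the biggest possible remainder."""
    candidates = [h for h in holes if h[1] >= request]
    return max(candidates, key=lambda h: h[1]) if candidates else None

holes = [("A", 16), ("C", 14), ("E", 5), ("G", 30)]
print(best_fit(holes, 13))    # ('C', 14) -- tightest fit, only 1K left over
print(first_fit(holes, 13))   # ('A', 16) -- first hole big enough
print(worst_fit(holes, 13))   # ('G', 30) -- largest hole, big usable remainder
```

The same hole list yields three different answers, which is exactly the trade-off the strategies embody: best fit minimizes immediate waste, first fit minimizes search time, and worst fit tries to keep the leftover hole usable.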
SWAPPING:
[Figure: swapping — main storage holds the operating system and a user area; user images are moved between main storage and a swapping area in secondary storage]
The term virtual storage is associated with the ability to address a storage space much larger than that available in the primary storage of a particular computer system.
The two most common methods of implementing virtual storage are paging and segmentation.
Fixed-Size blocks are called pages; variable-size blocks are called segments.
UNIT IV
When a page must be brought in and primary storage is full, the operating system's storage management routines must decide which page in primary storage to displace to make room for the incoming page.
OPTIMAL PAGE REPLACEMENT:
The principle of optimality states that to obtain optimum performance, the page to replace is the one that will not be used again for the furthest time into the future.
RANDOM PAGE REPLACEMENT:
Under random page replacement, all pages in main storage have an equal likelihood of being selected for replacement. This strategy could select any page for replacement, including the next page to be referenced.
FIFO PAGE REPLACEMENT:
When a page needs to be replaced, we choose the one that has been in storage the longest. First-in-first-out is likely to replace heavily used pages, because the reason a page has been in primary storage for a long time may be that it is in constant use.
FIFO ANOMALY:
Belady, Nelson and Shedler discovered that under FIFO page replacement, certain page reference patterns actually cause more page faults when the number of page frames allocated to a process is increased. This phenomenon is called the FIFO Anomaly or Belady's Anomaly.
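The anomaly can be demonstrated with the classic reference string 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 (a Python sketch, not from the text):

```python
# A FIFO page-replacement simulator demonstrating Belady's anomaly:
# giving the process MORE page frames produces MORE page faults.

from collections import deque

def fifo_faults(refs, frames):
    """Count page faults for a reference string under FIFO replacement."""
    memory, faults = deque(), 0
    for page in refs:
        if page not in memory:
            faults += 1
            if len(memory) == frames:
                memory.popleft()        # evict the page resident the longest
            memory.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))   # 9 faults with 3 frames
print(fifo_faults(refs, 4))   # 10 faults with 4 frames -- the anomaly
```

With four frames, pages 1 and 2 are evicted just before they are re-referenced, so the larger allocation faults more often than the smaller one.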
LEAST RECENTLY USED (LRU):
This strategy selects for replacement the page that has not been used for the longest time. LRU can be implemented with a list structure containing one entry for each occupied page frame. Each time a page frame is referenced, the entry for that page is placed at the head of the list; older entries migrate toward the tail. When a page must be replaced to make room for an incoming page, the entry at the tail of the list is selected, the corresponding page frame is freed, the incoming page is placed in that page frame, and the entry for that page frame is placed at the head of the list, because that page is now the most recently used.
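The list-structure implementation can be sketched with an ordered dictionary standing in for the list (a Python sketch; the same reference string as in the FIFO example is used):

```python
# A sketch of LRU replacement using OrderedDict as the reference-ordered
# list: the most recently referenced page sits at one end, and the least
# recently referenced page at the other end is the replacement victim.

from collections import OrderedDict

def lru_faults(refs, frames):
    """Count page faults for a reference string under LRU replacement."""
    memory, faults = OrderedDict(), 0
    for page in refs:
        if page in memory:
            memory.move_to_end(page)         # re-referenced: now most recent
        else:
            faults += 1
            if len(memory) == frames:
                memory.popitem(last=False)   # evict the least recently used
            memory[page] = True
    return faults

print(lru_faults([1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5], 3))   # 10 faults
```

Unlike FIFO, LRU never exhibits Belady's anomaly: adding frames can only keep more of the recently used pages resident.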
LEAST FREQUENTLY USED (LFU):
In this strategy the page to replace is the one that is least frequently used or least intensively referenced. The wrong page could be selected for replacement; for example, the least frequently used page could be the page brought into main storage most recently.
NOT USED RECENTLY (NUR):
Pages not used recently are not likely to be used in the near future, and they may be replaced with incoming pages. The NUR strategy is implemented with the addition of two hardware bits per page: a referenced bit and a modified bit.
The NUR strategy works as follows. Initially, the referenced bits of all pages are set to 0. As a
reference to a particular page occurs, the referenced bit of that page is set to 1. When a page is to
be replaced we first try to find a page which has not been referenced.
The second-chance variation of FIFO examines the referenced bit of the oldest page; if this bit is off, the page is immediately selected for replacement. If the referenced bit is on, it is set off and the page is moved to the tail of the FIFO list and treated essentially as a new arrival; this page gradually moves to the head of the list, from which it will be selected for replacement only if its referenced bit is still off. This essentially gives the page a second chance to remain in primary storage if its referenced bit is turned on before the page reaches the head of the list.
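A sketch of the second-chance scan follows; the page names and bit values are illustrative:

```python
# A sketch of the second-chance variation of FIFO. Pages are kept in a
# deque of [name, referenced_bit] entries, oldest at the left.

from collections import deque

def second_chance_victim(pages):
    """Return the name of the page to replace, mutating the list in place."""
    while True:
        name, referenced = pages[0]
        if not referenced:
            pages.popleft()
            return name                  # referenced bit off: replace it
        pages[0][1] = 0                  # clear the bit: one second chance
        pages.rotate(-1)                 # move to the tail as a new arrival

pages = deque([["P1", 1], ["P2", 0], ["P3", 1]])
print(second_chance_victim(pages))   # P2 -- P1 was referenced, so it survives
```

If every resident page has its referenced bit on, the scan clears them all and degenerates into plain FIFO on the second pass, so the loop always terminates.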
LOCALITY:
Locality is a property exhibited by running processes, namely that processes tend to favor a subset of their pages during an execution interval. Temporal locality means that if a process references a page, it will probably reference that page again soon. Spatial locality means that if a process references a page, it will probably reference adjacent pages in its virtual address space.
WORKING SETS:
Denning developed a view of program paging activity called the working set theory of program behavior. A working set is the collection of pages a process is actively referencing. To run a program efficiently, its working set of pages must be maintained in primary storage. Otherwise excessive paging activity, called thrashing, might occur as the program repeatedly requests pages from secondary storage.
A working-set storage management policy seeks to maintain the working sets of active programs in primary storage. The working set of pages of a process, W(t, w), at time t is the set of pages referenced by the process during the process-time interval t−w to t. Process time is the time during which a process has the CPU.
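Under the simplifying assumption that process time is measured in page references, W(t, w) can be sketched as:

```python
# A sketch of the working set W(t, w): the set of distinct pages referenced
# during the last w references up to and including process time t.

def working_set(refs, t, w):
    """refs[i] = page referenced at process time i; window size w."""
    return set(refs[max(0, t - w + 1): t + 1])

refs = [1, 2, 1, 3, 3, 3, 4, 4, 1]
print(working_set(refs, t=5, w=3))   # {3}: only page 3 touched at times 3..5
print(working_set(refs, t=8, w=3))   # {1, 4}: pages touched at times 6..8
```

The window size w is the policy's tuning knob: too small and the working set misses pages the process still needs; too large and it pins pages the process has abandoned.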
Processes attempting to execute without sufficient space for their working sets often
experience thrashing, a phenomenon in which they continually replace pages and then
immediately recall the replaced pages back to primary storage.
The page fault frequency algorithm adjusts a process’s resident page set, ie., those of its
pages which are currently in memory, based on the frequency at which the process is faulting.
DEMAND PAGING:
No page will be brought from secondary to primary storage until it is explicitly referenced by a running process. Demand paging guarantees that the only pages brought into main storage are those actually needed by processes. As each new page is referenced, the process must wait while the new page is transferred to primary storage.
ANTICIPATORY PAGING:
Anticipatory paging, sometimes called prepaging, is a method of reducing the amount of time users must wait for results from a computer. In anticipatory paging, the operating system
attempts to predict the pages a process will need, and then preloads these pages when space is
available. If correct decisions are made, the total run time of the process can be reduced
considerably. While the process runs with its current pages, the system loads new pages that will
be available when the process requests them.
PAGE RELEASE:
When a page will no longer be needed, a user could issue a page release command to free the
page frame. It could eliminate waste and speed program execution.
PAGE SIZE:
A number of issues affect the determination of optimum page size for a given system:
* A small page size causes larger page tables. The waste of storage due to excessively large tables is called table fragmentation.
* A large page size causes large amounts of information that may ultimately never be referenced to be paged into primary storage.
* I/O transfers are more efficient with large pages.
* Localities tend to be small.
* Internal fragmentation is reduced with small pages.
On balance, most designers feel that these factors point to the need for small pages.
The assignment of physical processors to processes allows processes to accomplish work. The problem of determining when processors should be assigned, and to which processes, is called processor scheduling.
SCHEDULING LEVELS:
High-Level Scheduling:
Sometimes called job scheduling, this determines which jobs shall be allowed to compete
actively for the resources of the system. This is sometimes called admission scheduling
because it determines which jobs gain admission to the system.
Intermediate-Level Scheduling:
This determines which processes shall be allowed to compete for the CPU.
Low-Level Scheduling:
This determines which ready process will be assigned the CPU when it next becomes
available, and actually assigns the CPU to this process.
A scheduling discipline is nonpreemptive if, once a process has been given the CPU, the
CPU cannot be taken away from that process. A scheduling discipline is preemptive if the CPU
can be taken away.
To make preemption effective, many processes must be kept in main storage so that the next
process is normally ready for the CPU when it becomes available. Keeping nonrunning program
in main storage also involves overhead.
In nonpreemptive systems, short jobs are made to wait by longer jobs, but the treatment of all
processes is fairer. Response times are more predictable because incoming high-priority jobs
cannot displace waiting jobs.
The processes to which the CPU is currently assigned is said to be running. To prevent
users from monopolizing the system the operating system has mechanisms for taking the CPU
away from the user. The operating system sets an interrupting clock or interval timer to generate
an interrupt at some specific future time. The CPU is then dispatched to the process. The process
retains control of the CPU until it voluntarily releases the CPU, or the clock interrupts or some
other interrupt diverts the attention of the CPU. If the user is running and the clock interrupts, the
interrupt causes the operating system to run. The operating system then decides which process
should get the CPU next. The interrupting clock helps guarantee reasonable response times to interactive users, prevents the system from getting hung up on a user in an infinite loop, and allows processes to respond to time-dependent events. Processes that need to run periodically depend on the interrupting clock.
PRIORITIES:
Priorities may be assigned automatically by the system or they may be assigned externally.
Static priorities do not change. Static priority mechanisms are easy to implement and have
relatively low overhead. They are not responsive to changes in environment, changes that might
make it desirable to adjust a priority.
Dynamic priority mechanisms are responsive to change. The initial priority assigned to a
process may have only a short duration, after which it is adjusted to a more appropriate value.
Dynamic priority schemes are more complex to implement and have greater overhead than static
schemes.
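One common dynamic scheme is aging, in which waiting time gradually raises a process's effective priority; the notes do not prescribe a particular mechanism, so the weighting factor below is an assumption for illustration:

```python
# Sketch of a dynamic-priority (aging) scheme. The aging_factor is an
# assumed illustrative weight, not a value prescribed by the notes.

def effective_priority(base_priority, waiting_time, aging_factor=0.1):
    # Higher value = served sooner; waiting gradually raises priority.
    return base_priority + aging_factor * waiting_time

def pick_next(processes):
    # processes: list of (name, base_priority, waiting_time) tuples.
    return max(processes, key=lambda p: effective_priority(p[1], p[2]))[0]
```

A low-priority process that has waited long enough eventually overtakes a fresh high-priority one, which is the responsiveness static schemes lack.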
PURCHASED PRIORITIES:
An operating system must provide competent and reasonable service to a large community of
users but must also provide for those situations in which a member of the user community needs
special treatment.
A user with a rush job may be willing to pay a premium, ie., purchase priority, for a higher
level of service. This extra charge is merited because resources may need to be withdrawn from
other paying customers. If there were no extra charge, then all users would request the higher
level of service.
DEADLINE SCHEDULING:
In deadline scheduling certain jobs are scheduled to be completed within a specific time or
deadline. These jobs may have very high value if delivered on time and may be worthless if
delivered later than the deadline. The user is often willing to pay a premium to have the system
ensure on-time completion. Deadline scheduling is complex for many reasons:
1) The user must supply the resource requirements of the job in advance. Such information is rarely available.
2) The system must run the deadline job without severely degrading service to other users.
3) The system must plan its resource requirements through to the deadline because new jobs may arrive and place unpredictable demands on the system.
4) If many deadline jobs are to be active at once, scheduling can become extremely complex.
5) The intensive resource management required by deadline scheduling may generate substantial overhead.
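One standard way to realize deadline scheduling is earliest-deadline-first (EDF) ordering; the notes do not name a specific algorithm, so this is an illustrative sketch:

```python
# Sketch of earliest-deadline-first (EDF) ordering for deadline jobs.
# Job names, deadlines, and run times below are hypothetical inputs.

def deadline_schedule(jobs):
    """jobs: list of (name, deadline). Return names, earliest deadline first."""
    return [name for name, deadline in sorted(jobs, key=lambda j: j[1])]

def all_deadlines_met(jobs, runtimes):
    """Check feasibility when jobs run back to back in EDF order."""
    t = 0
    for name, deadline in sorted(jobs, key=lambda j: j[1]):
        t += runtimes[name]
        if t > deadline:
            return False   # this job would finish after its deadline
    return True
```

The feasibility check illustrates why the system must know resource requirements in advance: without the run times, it cannot plan through to the deadline.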
UNIT V
DEVICE INFORMATION AND MANAGEMENT:
Before data can be accessed, the portion of the disk surface from which the data is to
be read (or the portion on which the data is to be written) must rotate until it is immediately
below (or above) the read-write head. The time it takes for data to rotate from its current position
to a position adjacent to the read-write head is called latency time.
Each of the several read-write heads, while fixed in position, sketches out a circular track of
data on a disk surface. All read-write heads are attached to a single boom or moving arm
assembly. The boom may move in or out. When the boom moves the read-write heads to a new
position, a different set of tracks becomes accessible. For a particular position of the boom, the
set of tracks sketched out by all the read-write heads forms a vertical cylinder. The process of
moving the boom to a new cylinder is called a seek operation.
Thus, in order to access a particular record of data on a moving-head disk, several operations
are usually necessary. First, the boom must be moved to the appropriate cylinder. Then the
portion of the disk on which the data record is stored must rotate until it is immediately under(or
over) the read-write head (ie., latency time).
Then the record, which is of arbitrary size, must be made to spin past the read-write head; this is called transmission time. This is tediously slow compared with the high processing speeds of the central computer system.
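The three components of an access can be combined in a short calculation; the timing figures used below are assumed, round numbers:

```python
# Disk access time = seek time + latency + transmission time, as described
# above. The millisecond figures passed in are assumed example values.

def access_time_ms(seek_ms, rotation_ms, bytes_to_read, bytes_per_ms):
    latency = rotation_ms / 2             # on average, half a rotation
    transmission = bytes_to_read / bytes_per_ms
    return seek_ms + latency + transmission
```

For instance, an 8 ms seek on a disk with a 10 ms rotation, reading 1000 bytes at 500 bytes/ms, totals 15 ms, dominated by the mechanical motions rather than the transfer itself.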
Because requests for disk service arrive faster than they can be serviced by the moving-head disks, waiting lines or queues build up for each device.
Some computing systems simply service these requests on a first-come-first-served (FCFS)
basis. Whichever request for service arrives first is serviced first. FCFS is a fair method of
allocating service, but when the request rate becomes heavy, FCFS can result in very long
waiting times.
[Figure: FCFS random seek pattern. The numbers indicate the order in which the requests arrived.]
FCFS exhibits a random seek pattern in which successive requests can cause time consuming
seeks from the innermost to the outermost cylinders. To minimize time spent seeking records, it
seems reasonable to order the request queue in some manner other than FCFS. This process is
called disk scheduling.
Disk scheduling involves a careful examination of pending requests to determine the
most efficient way to service the requests.
A disk scheduler examines the positional relationships among waiting requests. The request
queue is then reordered so that the requests will be serviced with minimum mechanical motion.
The two most common types of scheduling are seek optimization and rotation (or
latency) optimization.
SEEK OPTIMIZATION:
The most popular seek optimization strategies follow.
1) FCFS(First-Come-First Served) Scheduling:
In FCFS scheduling, the first request to arrive is the first one serviced. FCFS is fair in the sense that once a request has arrived, its place in the schedule is fixed. A request cannot be displaced by the arrival of a higher-priority request. FCFS will actually do a lengthy seek to service a distant waiting request even though another request may have just arrived on the same cylinder at which the read-write head is currently positioned. FCFS ignores the positional relationships among the pending requests in the queue. It is acceptable when the load on a disk is light; under heavier loads, FCFS tends to saturate the device and response times become large.
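The cost of an FCFS schedule can be measured as total head movement; the starting cylinder and request queue below are hypothetical examples:

```python
# Total head movement under FCFS: requests are serviced strictly in
# arrival order, whatever the seek cost. Cylinder numbers are assumed.

def fcfs_seek_distance(start, requests):
    total, pos = 0, start
    for cyl in requests:
        total += abs(cyl - pos)  # seek from current position to this request
        pos = cyl
    return total
```

Starting at cylinder 50 with arrivals [10, 90, 12], FCFS travels 40 + 80 + 78 = 198 cylinders, illustrating the time-consuming back-and-forth seeks of a random arrival pattern.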
2) SSTF(Shortest-Seek-Time-First) Scheduling:
In SSTF scheduling, the request that results in the shortest seek distance is serviced next, even if that request is not the first one in the queue. SSTF is a cylinder-oriented scheme. SSTF seek patterns tend to be highly localized, with the result that the innermost and outermost tracks can receive poor service compared with the mid-range tracks.
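The SSTF rule can be sketched directly; the cylinder numbers used in the test are hypothetical:

```python
# SSTF sketch: repeatedly service the pending request with the shortest
# seek distance from the current head position.

def sstf_order(start, requests):
    pending, pos, order = list(requests), start, []
    while pending:
        nearest = min(pending, key=lambda cyl: abs(cyl - pos))
        pending.remove(nearest)
        order.append(nearest)
        pos = nearest      # head is now at the serviced cylinder
    return order
```

Note how requests near the current head position are serviced first, which is exactly the localization that can starve the innermost and outermost tracks.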
3) SCAN Scheduling:
Denning developed the SCAN scheduling strategy to overcome the discrimination and high variance in response times of SSTF. SCAN operates like SSTF except that it chooses the request that results in the shortest seek distance in a preferred direction. If the preferred direction is currently outward, then the SCAN strategy chooses the shortest seek distance in the outward direction. SCAN does not change direction until it reaches the outermost cylinder or until there are no further requests pending in the preferred direction. It is sometimes called the elevator algorithm because an elevator normally continues in one direction until there are no more requests pending and then it reverses direction.
SCAN behaves very much like SSTF in terms of improved
throughput and improved mean response times, but it eliminates much of the discrimination
inherent in SSTF schemes and offers much lower variance.
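The elevator behavior can be sketched as follows; treating "out" as toward higher-numbered cylinders is an assumption of this sketch:

```python
# SCAN (elevator) sketch: service requests in the preferred direction
# until none remain that way, then reverse. "out" is taken to mean
# toward higher-numbered cylinders, an assumption for this illustration.

def scan_order(start, requests, direction="out"):
    higher = sorted(c for c in requests if c >= start)
    lower = sorted((c for c in requests if c < start), reverse=True)
    if direction == "out":
        return higher + lower   # sweep out, then reverse inward
    return lower + higher       # sweep in, then reverse outward
```

Unlike SSTF, a request behind the head simply waits for the return sweep rather than being indefinitely bypassed.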
[Figure: SCAN scheduling, showing the outward sweep and the inward sweep.]
4) N-STEP SCAN Scheduling:
In the N-STEP SCAN strategy, requests arriving during a sweep are grouped together and ordered for optimum service during the return sweep. N-STEP SCAN offers good performance in throughput and mean response time. N-STEP SCAN has a lower variance of response times than either SSTF or conventional SCAN scheduling. N-STEP SCAN avoids the possibility of indefinite postponement occurring if a large number of requests arrive for the current cylinder. It saves these requests for servicing on the return sweep.
5) C-SCAN SCHEDULING:
Another interesting modification to the basic SCAN strategy is called C-SCAN (for circular
SCAN). In C-SCAN strategy, the arm moves from the outer cylinder to the inner cylinder,
servicing requests on a shortest-seek basis. When the arm has completed its inward sweep, it
jumps (without servicing requests) to the request nearest the outermost cylinder, and then resumes
its inward sweep processing requests. Thus C-SCAN completely eliminates the
discrimination against requests for the innermost or outermost cylinder. It has a very small
variance in response times. At low loading, the SCAN policy is best. At medium to heavy
loading, C-SCAN yields the best results.
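A sketch of C-SCAN as described above; taking "inward" to mean toward lower-numbered cylinders is an assumption of this illustration:

```python
# C-SCAN sketch: service requests on the inward sweep only (here, toward
# lower-numbered cylinders), then jump back out and repeat. Cylinder
# numbers are assumed example inputs.

def cscan_order(start, requests):
    # Serviced while sweeping inward from the current position ...
    inward = sorted((c for c in requests if c <= start), reverse=True)
    # ... then, after a jump (without servicing) to the outermost pending
    # request, serviced on the next inward sweep.
    wrapped = sorted((c for c in requests if c > start), reverse=True)
    return inward + wrapped
```

Every request is serviced while moving in one direction only, which is why C-SCAN's response-time variance is so small: no cylinder is favored by being passed twice per cycle.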
[Figure: C-SCAN scheduling — the arm services requests on the inward sweep, then jumps to the request nearest the outermost cylinder and begins the next inward sweep.]
RAM DISKS:
A RAM disk is a disk device simulated in conventional random access memory. It
completely eliminates delays suffered in conventional disks because of the mechanical motions
inherent in seeks and in spinning a disk. RAM disks are especially useful in high-performance
applications.
Caching incurs a certain amount of CPU overhead in maintaining the contents of the cache and in searching for data in the cache before attempting to read the data from disk. If referenced records are rarely found in the cache, then the disk cache hit ratio will be small and the CPU's efforts in managing the cache will be wasted, possibly resulting in poor performance.
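The effect of the hit ratio can be expressed as an effective access time; the timings and management overhead below are assumed figures:

```python
# Effective access time with a disk cache: every reference pays the
# cache-management overhead, hits are served at cache speed, misses at
# disk speed. All millisecond figures are assumed for illustration.

def effective_access_ms(hit_ratio, cache_ms, disk_ms, overhead_ms=0.1):
    return overhead_ms + hit_ratio * cache_ms + (1 - hit_ratio) * disk_ms
```

With a 90% hit ratio the disk term nearly vanishes; with a low hit ratio the overhead is paid on every access while most references still go to disk, which is the poor-performance case described above.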
RAM disks are much faster than conventional disks because they involve no mechanical
motion. They are separate from main memory so they do not occupy space needed by the
operating system or applications. Reference times to individual data items are uniform rather
than widely variable as with conventional disks.
RAM disks are much more expensive than regular disks. Most forms of RAM in use today
are volatile ie., they lose their contents when power is turned off or when the power supply is
interrupted. Thus RAM disk users should perform frequent backups to conventional disks. As
memory prices continue decreasing, and as capacities continue increasing it is anticipated that
RAM disks will become increasingly popular.
OPTICAL DISKS:
Various recording techniques are used. In one technique, intense laser heat is used to burn
microscopic holes in a metal coating. In another technique, the laser heat causes raised blisters
on the surface. In a third technique, the reflectivity of the surface is altered.
The first optical disks were write-once-read-many(WORM) devices. This is not useful for
applications that require regular updating. Several rewritable optical disk products have
appeared on the market recently. Each person could have a disk with the sum total of human
knowledge and this disk could be updated regularly. Some estimates of capacities are so huge
that researchers feel it will be possible to store 10^21 bits on a single optical disk.
Suppose each user has on the order of 100 files. Thus, with a user community of several thousand users, a system's disks might contain 50,000 to 100,000 or more separate files. These files need to be accessed quickly to keep
response times small.
A file system for this type of environment may be organized as follows. A root is used to
indicate where on disk the root directory begins. The root directory points to the various user
directories. A user directory contains an entry for each of a user’s files; each entry points to
where the corresponding file is stored on disk.
File names should be unique within a given user directory. In hierarchically structured file systems, the system name of a file is usually formed as a pathname from the root directory to the file. For example, in a two-level file system with users A, B and C, in which A has files PAYROLL and INVOICES, the pathname for file PAYROLL is A:PAYROLL.
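The pathname convention in the example can be sketched as follows; the directory contents and disk locations are hypothetical:

```python
# Sketch of pathname formation and lookup in a two-level file system,
# using the user:file convention from the example above. The directory
# contents and disk locations are hypothetical.

def pathname(user, filename):
    return f"{user}:{filename}"

def lookup(root, path):
    # root maps user names to user directories (file name -> disk location).
    user, filename = path.split(":")
    return root[user][filename]

root = {"A": {"PAYROLL": 1040, "INVOICES": 2300}}   # assumed sample data
```

Resolving A:PAYROLL walks from the root directory to user directory A, then to the entry that points to where the file is stored on disk.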
[Figure: two-level file system — the root directory points to user directories; each user directory points to that user's files.]
FILE SYSTEM FUNCTIONS:
Some of the functions normally attributed to file systems follows.
1) users should be able to create, modify and delete files.
2) Users should be able to share each other's files in a carefully controlled manner in
order to build upon each other's work.
3) The mechanism for sharing files should provide various types of controlled access
such as read access, write access, execute access or various combinations of these.
4) Users should be able to structure their files in a manner most appropriate for each
application.
5) Users should be able to order the transfer of information between files.
6) Backup and recovery capabilities must be provided to prevent either accidental loss
or malicious destruction of information.
7) Users should be able to refer to their files by symbolic names rather than having to
use physical device names (i.e., device independence).
8) In sensitive environments in which information must be kept secure and private, the
file system may also provide encryption and decryption capabilities.
9) The file system should provide a user-friendly interface. It should give users a
logical view of their data and functions to be performed upon it rather than a physical
view. The user should not have to be concerned with the particular devices on which
data is stored, the form the data takes on those devices, or the physical means of
transferring data to and from these devices.
The EBCDIC character code is popular for representing data internally in mainframe computer systems, particularly those of IBM.
A field is a group of characters. A record is a group of fields. A record key is a control
field that uniquely identifies the record. A file is a group of related records. A database is a
collection of files.
FILE ORGANIZATION:
File organization refers to the manner in which the records of a file are arranged on secondary storage. The most popular file organization schemes in use today follow.
Sequential – Records are placed in physical order. The "next" record is the one that physically follows the previous record. This organization is natural for files stored on magnetic tape, an inherently sequential medium.
Direct – Records are directly (randomly) accessed by their physical addresses on a direct access storage device (DASD).
Indexed sequential – Records are arranged in logical sequence according to a key contained in each record. Indexed sequential records may be accessed sequentially in key order, or they may be accessed directly.
Partitioned – This is essentially a file of sequential subfiles. Each sequential subfile is called a member. The starting address of each member is stored in the file's directory.
The term volume is used to refer to the recording medium for each particular auxiliary
storage device. The volume used on a tape drive is a reel of magnetic tape; the volume used on a
disk drive is a disk.
As files are deleted, scattered free areas develop, and these free areas may be collected into a single block or a group of large blocks. This garbage collection is often done during system shutdown; some systems perform compaction dynamically while in operation. A system may choose to reorganize the files of users not currently logged in, or it may reorganize files that have not been referenced for a long time.
Designing a file system requires knowledge of the user community, including the
number of users, the average number and size of files per user, the average duration of user
sessions, the nature of application to be run on the system, and the like. Users searching a file for
information often use file scan options to locate the next record or the previous record.
In a paged systems, the smallest amount of information transferred between secondary
and primary storage is a page, so it makes sense to allocate secondary storage in blocks of the
page size or a multiple of a page size.
Locality tells us that once a process has referred to a data item on a page it is likely to
reference additional data items on that page; it is also likely to reference data items on pages
contiguous to that page in the user’s virtual address space.
CONTIGUOUS ALLOCATION:
In contiguous allocation, files are assigned to contiguous areas of secondary storage. A
user specifies in advance the size of the area needed to hold a file that is to be created. If the desired amount of contiguous space is not available, the file cannot be created.
One advantage of contiguous allocation is that successive logical records are normally
physically adjacent to one another. This speeds access compared to systems in which successive
logical records are dispersed throughout the disk.
The file directories in contiguous allocation systems are relatively straightforward to
implement. For each file it is necessary to retain the address of the start of the file and the file’s
length.
A disadvantage of contiguous allocation is that as files are deleted, the space they occupied on secondary storage is reclaimed. This space becomes available for the allocation of new files, but these new files must fit in the available holes. Thus contiguous allocation schemes exhibit the same types of fragmentation problems inherent in variable partition multiprogramming systems – adjacent secondary storage holes must be coalesced, and periodic compaction may need to be performed to reclaim storage areas large enough to hold new files.
NONCONTIGUOUS ALLOCATION:
Files tend to grow or shrink over time so generally we go for dynamic noncontiguous storage
allocation systems instead of contiguous allocation systems.
BLOCK ALLOCATION:
One scheme used to manage secondary storage more efficiently and reduce execution time
overhead is called block allocation. This is a mixture of both contiguous allocation and
noncontiguous allocation methods.
In this scheme, instead of allocating individual sectors, blocks of contiguous sectors
(sometimes called extents) are allocated. There are several common ways of implementing
block-allocation systems. These include block chaining, index block chaining, and block –
oriented file mapping.
In block chaining, entries in the user directory point to the first block of each file. The fixed-length blocks comprising a file each contain two portions: a data block and a pointer to the next block. Locating a particular record requires searching the block chain until the appropriate block is found, and then searching that block until the appropriate record is found. Insertions and deletions are straightforward.
[Figure: block chaining — user directory entries (file, location) point to the first block of each file; each block holds data and a pointer to the next block.]
With index block chaining, the pointers are placed into separate index blocks. Each index block contains a fixed number of items. Each entry contains a record identifier and a pointer to that record. If more than one index block is needed to describe a file, then a series of index blocks is chained together. The big advantage of index block chaining over simple block chaining is that searching may take place in the index blocks themselves. Once the appropriate record is located via the index blocks, the data block containing that record is read into primary storage. The disadvantage of this scheme is that insertions can require the complete reconstruction of the index blocks, so some systems leave a certain portion of the index blocks empty to provide for future insertions.
In block-oriented file mapping, instead of using pointers, the system uses block numbers. Normally, these are easily converted to actual block addresses because of the geometry of the disk. A file map contains one entry for each block on the disk. Entries in the user directory point to the first entry in the file map for each file. Each entry in the file map contains the block number of the next block in that file. Thus all the blocks in a file may be located by following the entries in the file map.
[Figure: index block chaining — a directory entry (file, location) points to a file's index block, which points to data records and to a continuation index block.]
The entry in the file map that corresponds to the last entry of a particular file is set to
some sentinel value like ‘Nil’ to indicate that the last block of a file has been reached. Some of
the entries in the file map are set to “Free” to indicate that the block is available for allocation.
The system may either search the file map linearly to locate a free block, or a free block list can
be maintained. An advantage of this scheme is that the physical adjacencies on the disk are
reflected in the file map. Insertions and deletions are straightforward in this scheme.
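Following a file's chain through the file map can be sketched as follows, using the 'Nil' and 'Free' sentinels described above; the sample directory and map in the usage note are hypothetical:

```python
# Sketch of block-oriented file mapping: the directory gives a file's
# first block number; each file map entry gives the next block number,
# with "Nil" ending a chain and "Free" marking unallocated blocks.

def file_blocks(directory, file_map, name):
    blocks, block = [], directory[name]
    while block != "Nil":
        blocks.append(block)
        block = file_map[block]   # follow the chain to the next block
    return blocks

def free_blocks(file_map):
    # Linear search for free blocks, as described in the text.
    return [i for i, entry in enumerate(file_map) if entry == "Free"]
```

For example, with a hypothetical directory {"A": 2} and file map ["Nil", "Free", 4, "Free", 0], file A occupies blocks 2, 4 and 0, and blocks 1 and 3 are free.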
[Figure: block-oriented file mapping — the user directory maps file A to file map entry 8, file B to entry 6, and file C to entry 2. Each file map entry contains the number of the next block in that file; 'Nil' marks the last block of a file and 'Free' marks blocks available for allocation.]
FILE DESCRIPTOR:
A file descriptor or file control block is a control block containing information the system
needs to manage a file.
A typical file descriptor might include
1) symbolic file name
2) location of file in secondary storage
3) file organization (Sequential, indexed sequential, etc.)
4) device type
5) access control data
6) type (data file, object program, c source program, etc.)
7) disposition (permanent vs temporary)
8) creation date and time
9) destroy date
10) date and time last modified
11) access activity counts (number of reads, for example)
File descriptors are maintained on secondary storage. They are brought to primary storage
when a file is opened.
To make the access control matrix concept useful, it would be necessary to use codes to indicate various kinds of access such as read only, write only, execute only, read/write, etc.
An access control matrix (rows denoting users, columns denoting files; 1 indicating that access is allowed) might look like this:

      1  2  3  4
  1   1  1  0  0
  2   0  0  1  0
  3   0  1  0  1
  4   1  0  0  0
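The matrix above can be checked in code; this is a minimal sketch, assuming rows index users and columns index files, both 1-based:

```python
# Sketch of an access control matrix check. The 4x4 matrix mirrors the
# one above; reading rows as users and columns as files is an assumption.

matrix = [
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 0, 0],
]

def can_access(user, file, acm=matrix):
    # user and file are 1-based indices, matching the table above.
    return acm[user - 1][file - 1] == 1
```

In a real system each entry would hold an access code (read only, write only, execute only, read/write) rather than a single bit, as the text notes.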