Operating System Concepts
Introduction
An operating system is a program that acts as an intermediary between the user and the
computer hardware. The primary objective of an operating system is to provide an
environment in which a user can execute programs in a convenient way by utilizing
computer hardware in an efficient manner.
The operating system directly controls computer hardware resources. Other programs rely on
facilities provided by the operating system to gain access to computer system resources.
There are two ways one can interact with the operating system:
1) By means of Operating System Call in a program.
2) Directly by means of Operating System Commands.
System Call
System calls provide the interface between a running program and the operating system. A user
program receives operating system services through a set of system calls. The use of
system calls in high-level languages like C, VB, etc. very much resembles pre-defined
function or subroutine calls. A user program makes heavy use of the operating system: all
interaction between the program and its environment must occur as the result of requests
from the program to the operating system.
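To make this concrete, here is a minimal C sketch (assuming a POSIX system; it is an illustration, not part of the original text) in which a program requests an operating system service directly through the write() system call and indirectly through the printf() library routine, which is itself built on top of system calls:

/* Minimal sketch: requesting an OS service directly (write) and via a
   library routine (printf). Assumes a POSIX system. */
#include <unistd.h>
#include <stdio.h>

int main(void)
{
    const char msg[] = "hello from a system call\n";
    write(1, msg, sizeof msg - 1);          /* direct system call: fd 1 is standard output */
    printf("hello from a library call\n");  /* library routine that ends in a system call  */
    return 0;
}

Both lines reach the terminal; the only difference is whether the request to the operating system is made explicitly or through the language library.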
An operating system may process its tasks sequentially or concurrently: the resources of the
computer system may be dedicated to a single program until its completion, or they may be
allocated among several programs in different stages of execution. The ability of an
operating system to execute multiple programs in an interleaved fashion or in different time
cycles is called multiprogramming.
Buffering is a method of overlapping the input, output and processing of a single job. After
data has been read and the CPU is about to start operating on it, the input device is
instructed to begin the next input immediately. The CPU and the input device are then both
busy. In the case of output, the CPU creates data that is put into a buffer until an output
device can accept it.
In today's context, as the CPU is much faster than an input device, it usually finds an
empty buffer and has to wait for the input device, so buffering is of little use for input. For
output, the CPU can proceed at full speed until, eventually, all system buffers are full.
Multiprogramming: a single user cannot always keep the CPU or I/O devices busy at all
times. Multiprogramming offers a more efficient approach to increase system
performance. In order to increase resource utilization, systems supporting the
multiprogramming approach allow more than one job/program to utilize CPU time at any
moment.
The main memory of such a system contains more than one program. The operating system
picks one of the programs and starts executing it. During execution, the currently
executing program may need to perform an I/O operation before it can continue. In a
sequential execution environment, the CPU would sit idle; in a multiprogramming system,
the operating system simply switches over to the next program. If there is no other new
program left in main memory, the CPU passes control back to the previous program.
Process Concept
A process is a running program with some specific task to do; e.g., in the UNIX operating
system, the shell/command interpreter is also a process, performing the task of listening to
whatever is typed on a terminal.
The key idea about a process is that it is an instance of a program in execution and consists
of a pattern of bytes. A single processor may be shared among several processes, with some
scheduling policy being used to allocate the processor to one process and take it away from
another.
The lifetime of a process can be divided into several stages, called states, each with certain
characteristics that describe the process. Each process may be in one of the following states:
(Figure: Process State Diagram)
The operating system groups all the information that it needs about a particular process into a
data structure called a Process Control Block (PCB). When a process is created, the
operating system creates a corresponding PCB, and when the process terminates, its PCB is
released to the pool of free memory. A process is eligible to compete for system resources
only when it has an active PCB associated with it. A PCB is implemented as a record
containing information associated with a specific process, such as its current state, program
counter, CPU register contents, scheduling priority, memory-management information, and
accounting and I/O status information.
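As an illustrative sketch, a PCB can be pictured as a C record; the field names below are assumptions made for illustration, not the layout of any particular operating system:

/* Illustrative PCB layout; the exact fields vary between operating systems. */
typedef enum { NEW, READY, RUNNING, SUSPENDED, TERMINATED } proc_state_t;

typedef struct pcb {
    int           pid;              /* unique process identifier           */
    proc_state_t  state;            /* current state of the process        */
    unsigned long program_counter;  /* address of the next instruction     */
    unsigned long registers[16];    /* saved CPU register contents         */
    int           priority;         /* scheduling priority                 */
    void         *memory_info;      /* base/limit values or page table     */
    void         *open_files;       /* I/O status and accounting data      */
    struct pcb   *next;             /* link used by scheduler queues       */
} pcb_t;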
Processor Scheduling
Scheduling refers to a set of policies and mechanisms supported by the operating system that
control the order in which work is done. A scheduler is an operating system
module/program that selects the next job to be admitted for execution. The main
objective of scheduling is to increase CPU utilization by supporting multiprogramming,
where the CPU is shared among a number of programs residing in computer memory
at the same time.
Types of Schedulers
Long-term/job Scheduler: there are always more processes than can be executed by
the CPU. These processes are kept on large storage devices like hard disks for later
processing. The long-term scheduler selects processes from this pool and loads them into
memory, where they join the ready queue. The short-term/CPU scheduler selects from
among the ready processes in memory and assigns the CPU to one of them. The long-term
scheduler executes less frequently; if the average rate at which processes arrive in memory
is equal to the rate at which they leave the system, the long-term scheduler may need to be
invoked only when a process departs the system. The long-term scheduler's performance
depends upon its selecting a balanced combination of CPU-bound and I/O-bound processes.
Medium-term Scheduler: most processes need some I/O operation and may be
suspended for I/O after running a while. It is beneficial to remove these
processes from main memory to the hard disk to make room for other processes. At some
later time, when the suspending condition is fulfilled, these suspended processes can be
reloaded into memory and continued from where they were left off earlier. A suspended
process is said to be swapped out or rolled out; it is swapped in and out by the medium-term
scheduler.
CPU Utilization: the key idea is that if the CPU is busy all the time, the utilization factor
of all the other components of the system will also be high.
Throughput: this refers to the amount of work completed in a unit of time. One way of
measuring throughput is by means of the number of processes completed in a unit
of time. To compare the throughput of different scheduling algorithms, we should always
consider processes with similar resource requirements.
Response Time: in a time-sharing system, it may be defined as the interval from the time
the last character of a command line of a program or transaction is entered to the time the
last result appears on the terminal. In a real-time system, it may be defined as the interval
from the time an internal or external event is signaled to the time the first instruction of
the respective service routine is executed.
Scheduling Algorithms
A scheduling discipline is non-preemptive if, once a process has been given the CPU, the
CPU cannot be taken away from that process. A scheduling discipline is preemptive if
the CPU can be taken away.
Preemptive scheduling is more useful for high-priority processes that need an immediate
response; for example, in a real-time system the consequence of missing one interrupt could
be dangerous. In non-preemptive systems, short jobs are made to wait by longer jobs, but
the treatment of all processes is fairer.
First-Come-First-Served (FCFS) Scheduling: in FCFS scheduling, jobs are served in
the order of their arrival. It is implemented with a FIFO (First-in-First-out)
queue. Once a process has the CPU, it runs to completion.
If the two processes P1 and P2 arrive in that order, the turnaround times are 20 and 24
respectively, giving an average of 22 units of time. The corresponding waiting times
are 0 and 20, with an average of 10 units of time. When the same processes arrive in the
reverse order, the turnaround times are 4 and 24 units of time respectively, giving an
average of 14 units of time, and the average waiting time is 2. This is a substantial reduction,
and it shows how shorter jobs may suffer when they arrive behind longer ones in FCFS
scheduling.
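The arithmetic above can be reproduced with a short C sketch. The burst times 20 and 4 are inferred from the example figures and are therefore an assumption:

#include <stdio.h>

/* Compute average waiting and turnaround times for FCFS,
   given CPU burst times in arrival order. */
static void fcfs(const int burst[], int n)
{
    int finish = 0, wait_sum = 0, turn_sum = 0;
    for (int i = 0; i < n; i++) {
        wait_sum += finish;          /* waiting time = start time          */
        finish   += burst[i];        /* process runs to completion         */
        turn_sum += finish;          /* turnaround time = completion time  */
    }
    printf("avg waiting = %.2f, avg turnaround = %.2f\n",
           (double)wait_sum / n, (double)turn_sum / n);
}

int main(void)
{
    int order1[] = { 20, 4 };        /* P1 then P2 */
    int order2[] = { 4, 20 };        /* P2 then P1 */
    fcfs(order1, 2);                 /* avg waiting 10, avg turnaround 22 */
    fcfs(order2, 2);                 /* avg waiting 2,  avg turnaround 14 */
    return 0;
}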
Shortest-Job-First (SJF) Scheduling: using SJF scheduling, the processes in the example
would be scheduled in the order P4-P1-P3-P2, giving an average waiting time of 6.75 units
of time. With FCFS scheduling, the average waiting time would be 10.75 units of time.
The SJF scheduling algorithm works optimally only when the exact future execution times of
jobs are known at the time of scheduling. This comes in the way of effective
implementation of SJF scheduling in practice, as it is difficult to estimate future
process behavior reliably except for very specialized deterministic cases.
This problem can be tackled by keeping the ready list sorted in increasing order of remaining
execution times. This approach can also improve the scheduler's performance by removing
the need to search for the shortest process. However, insertion into a sorted list is generally
more complex if the list is to remain sorted after insertion.
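A minimal C sketch of this sorted-ready-list idea (the remaining-time values are placeholders):

#include <stdio.h>
#include <stdlib.h>

/* SJF approximated by keeping the ready list sorted by (estimated)
   remaining execution time; the dispatcher then simply takes the head. */
static int by_remaining(const void *a, const void *b)
{
    return *(const int *)a - *(const int *)b;
}

int main(void)
{
    int remaining[] = { 8, 3, 12, 1 };                 /* placeholder estimates */
    int n = sizeof remaining / sizeof remaining[0];

    qsort(remaining, n, sizeof remaining[0], by_remaining);

    for (int i = 0; i < n; i++)                        /* dispatch in sorted order */
        printf("run job with remaining time %d\n", remaining[i]);
    return 0;
}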
Round Robin Scheduling: here, CPU time is divided into small time slices (of
10-100 milliseconds) and each process is allocated a time slice. No process can run for
more than one time slice when there are others waiting in the ready queue. If a process
needs more CPU time after exhausting one time slice, it goes to the end of the
ready queue to await the next allocation. Otherwise, if the running process voluntarily
releases control to the operating system due to an I/O request or termination, another
process is scheduled to run.
Round Robin scheduling utilizes the system resources in an equitable manner. A small
process may be executed in a single time slice, giving good response time, whereas long
processes need several time slices and are thus forced to pass through the ready queue a few
times before completion.
Gantt chart (time slice = 5): | P1 | P2 | P3 | P1 | P1 | P1 | P1 |  at times 0, 5, 10, 15, 20, 25, 30, 35
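The chart can be reproduced with a small simulation sketch; the burst times (P1 = 25, P2 = 5, P3 = 5) and the 5-unit time slice are assumptions chosen to match the chart, not values given in the text:

#include <stdio.h>

/* Round Robin simulation with a fixed time slice. */
#define N 3

int main(void)
{
    int burst[N] = { 25, 5, 5 };                      /* assumed burst times */
    const char *name[N] = { "P1", "P2", "P3" };
    int slice = 5, time = 0, remaining = N;

    while (remaining > 0) {
        for (int i = 0; i < N; i++) {
            if (burst[i] == 0)
                continue;                             /* already finished */
            int run = burst[i] < slice ? burst[i] : slice;
            printf("%2d-%2d: %s\n", time, time + run, name[i]);
            time     += run;
            burst[i] -= run;
            if (burst[i] == 0)
                remaining--;
        }
    }
    return 0;
}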
Priority Based Scheduling: a priority is attached to each process and the scheduler
always picks the highest-priority process in the ready queue for execution. Equal-priority
processes are scheduled FCFS. The level of priority may be determined on the
basis of resource requirements, process characteristics and run-time behavior.
A major problem with this scheduling is indefinite blocking of a low-priority process by
high-priority processes: completion of a process within a finite time cannot be guaranteed
with this scheduling policy.
A solution to the indefinite blockage of low-priority processes by high-priority processes is
provided by priority aging. Aging is a technique of gradually increasing the
priority of processes that have waited in the ready queue for a long time. Eventually, the older
processes attain high priority and are ensured of completion in a finite time span.
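A minimal sketch of priority scheduling with aging (the burst times, initial priorities and the aging increment are illustrative assumptions; a larger number means a higher priority):

#include <stdio.h>

#define NPROC 3

int main(void)
{
    int remaining[NPROC] = { 3, 3, 3 };   /* remaining CPU bursts (assumed)   */
    int priority[NPROC]  = { 1, 5, 9 };   /* initial priorities (assumed)     */

    for (int time = 0; ; time++) {
        int pick = -1;
        for (int i = 0; i < NPROC; i++)   /* select the highest-priority job  */
            if (remaining[i] > 0 && (pick < 0 || priority[i] > priority[pick]))
                pick = i;
        if (pick < 0)
            break;                        /* nothing left to run              */
        printf("t=%d run P%d (priority %d)\n", time, pick + 1, priority[pick]);
        remaining[pick]--;
        for (int i = 0; i < NPROC; i++)   /* aging: every waiter gains priority */
            if (i != pick && remaining[i] > 0)
                priority[i]++;
    }
    return 0;
}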
A multi-level queue scheduling algorithm partitions the ready queue into separate queues,
each with its own scheduling algorithm. The interactive queue might be scheduled
by a round robin algorithm while the batch queue may follow FCFS.
Concurrency: some operating system processes execute system code while the rest execute
user code. All these processes can potentially execute concurrently.
Concurrency refers to the parallel execution of programs. A sequential program specifies the
sequential execution of a list of statements; a concurrent program specifies two or more
sequential programs that may be executed concurrently as parallel processes.
Mutual Exclusion: processes that work together often share some common
storage that each can read and write. The shared storage may be in main memory or it may
be a shared file. Each process has a segment of code, called a critical section, which
accesses the shared memory or files. Mutual exclusion is a way of making sure that if one
process is executing in its critical section, the other processes will be excluded from
doing the same thing.
The Dutch mathematician Dekker is believed to be the first to solve the mutual exclusion
problem, but his original algorithm works for two processes only and cannot be
extended beyond that number.
When more than one process wishes to enter the critical section, the decision to grant
entrance to one of them must be made in finite time.
Module Mutex;
Var P1busy, P2busy: Boolean;

Process P1;
Begin
  While true do
  Begin
    P1busy := true;                 {announce intent to enter}
    While P2busy do {keep testing};
    Critical_section;
    P1busy := false;                {leave the critical section}
    Other_P1_processing;
  End {while}
End; {P1}

Process P2;
Begin
  While true do
  Begin
    P2busy := true;                 {announce intent to enter}
    While P1busy do {keep testing};
    Critical_section;
    P2busy := false;                {leave the critical section}
    Other_P2_processing;
  End {while}
End; {P2}

{Parent process}
Begin {Mutex}
  P1busy := false;
  P2busy := false;
  Initiate P1, P2;
End. {Mutex}
Semaphores
A semaphore S is an integer variable that is accessed only through two indivisible operations:
a. Wait(S): While S <= 0 do {keep testing}; S := S - 1;
b. Signal(S): S := S + 1;
Mutual Exclusion with Semaphore
Module Sem_mutex;
Var bsem: Semaphore; {binary semaphore}

Process P1;
Begin
  While true do
  Begin
    Wait(bsem);
    Critical_section;
    Signal(bsem);
    The_rest_of_P1_processing;
  End {while}
End; {P1}

Process P2;
Begin
  While true do
  Begin
    Wait(bsem);
    Critical_section;
    Signal(bsem);
    The_rest_of_P2_processing;
  End {while}
End; {P2}

Process P3;
Begin
  While true do
  Begin
    Wait(bsem);
    Critical_section;
    Signal(bsem);
    The_rest_of_P3_processing;
  End {while}
End; {P3}

{Parent process}
Begin {Sem_mutex}
  bsem := 1; {free}
  Initiate P1, P2, P3;
End. {Sem_mutex}
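The same structure can be expressed with POSIX threads and a POSIX semaphore. This is an illustrative sketch assuming a POSIX system (compile with -pthread), not part of the original module:

#include <stdio.h>
#include <pthread.h>
#include <semaphore.h>

static sem_t bsem;               /* binary semaphore guarding the critical section */
static long  shared_counter = 0; /* the shared resource */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        sem_wait(&bsem);         /* wait(bsem): enter critical section   */
        shared_counter++;        /* critical section                     */
        sem_post(&bsem);         /* signal(bsem): leave critical section */
    }
    return NULL;
}

int main(void)
{
    pthread_t p1, p2, p3;
    sem_init(&bsem, 0, 1);       /* initial value 1 = free */
    pthread_create(&p1, NULL, worker, NULL);
    pthread_create(&p2, NULL, worker, NULL);
    pthread_create(&p3, NULL, worker, NULL);
    pthread_join(p1, NULL);
    pthread_join(p2, NULL);
    pthread_join(p3, NULL);
    printf("shared_counter = %ld\n", shared_counter);   /* always 300000 */
    return 0;
}

Because wait (sem_wait) and signal (sem_post) bracket the increment, the final count is always 300000; removing them would allow updates to be lost.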
One of the main problems with semaphores is that they are not syntactically related to the
resources they protect. A semaphore does not warn the compiler that a specific data
structure/resource is being shared and that its access needs to be controlled. To remove this
bottleneck, Brinch Hansen proposed two language constructs: the critical region and the
conditional critical region.
A critical region is a language construct that strictly enforces mutually exclusive use of a
resource/data structure declared as shared. It enforces correct usage of shared variables and
prevents potential errors resulting from improper use of ordinary semaphores, but it cannot
be used to solve some general synchronization problems. A conditional critical region
allows a process to wait until a certain condition is satisfied within a critical section
without preventing other eligible processes from accessing the shared resource. The monitor
is another outgrowth of the semaphore and provides a structured way to control access
to shared variables. Unlike these techniques, message passing can be viewed as
extending semaphores to convey data/messages as well as to implement synchronization. It
is a relatively simple mechanism, suitable for both inter-process communication and
synchronization in centralized as well as distributed environments. When message passing
is used for communication and synchronization, processes send and receive messages
instead of reading and writing shared variables. Communication is achieved because a
process, upon receiving a message, obtains values from some sender process.
Synchronization is achieved because a message can be received only after it has been
sent, which constrains the order in which these two events can occur. The two basic
primitives are:
1) send(message) to destination
2) receive(message) from source
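A minimal sketch of the send/receive idea using a POSIX pipe between a parent and a child process; here the pipe plays the role of the message channel, which is only one of several possible implementations:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    int  fd[2];
    char buf[64];

    pipe(fd);                                    /* create the message channel    */
    if (fork() == 0) {                           /* child: the receiver           */
        close(fd[1]);
        ssize_t n = read(fd[0], buf, sizeof buf - 1);   /* receive(message)       */
        if (n < 0)
            n = 0;
        buf[n] = '\0';
        printf("child received: %s\n", buf);
        return 0;
    }
    close(fd[0]);                                /* parent: the sender            */
    const char msg[] = "unit of work #1";
    write(fd[1], msg, strlen(msg));              /* send(message) to destination  */
    close(fd[1]);
    wait(NULL);                                  /* synchronization: the message  */
    return 0;                                    /* must be sent before received  */
}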
Deadlock
A deadlock is a situation in which a group of processes is permanently blocked as a result of
each process having acquired a subset of the resources needed for its completion and
waiting for the release of the remaining resources held by others, thus making it impossible
for any of the processes to proceed.
Deadlock can occur in a concurrent environment as a result of the uncontrolled granting of
system resources to requesting processes.
The following conditions characterize deadlock:
1. Mutual Exclusion: only one process at a time can use a resource. If another
process requests that resource, the requesting process must be delayed until the
resource has been released.
2. Hold and Wait: there must be a process that is holding one resource and
waiting for another resource that is currently held by another process.
3. No Preemption: resources previously granted cannot be forcibly taken away
from a process. The process holding them must explicitly release them.
4. Circular Wait Condition: there must be a circular chain of two or more
processes, each of which is waiting for resources held by the next member of the
chain.
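The hold-and-wait and circular-wait conditions can be illustrated with a small POSIX threads sketch (not from the original text) in which two threads acquire the same two locks in opposite order; depending on timing, each may hold one lock while waiting for the other, and the program hangs:

#include <pthread.h>
#include <stdio.h>

/* Two threads take the same two locks in opposite order: a classic
   recipe for deadlock (hold-and-wait plus circular wait). */
static pthread_mutex_t A = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t B = PTHREAD_MUTEX_INITIALIZER;

static void *t1(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&A);       /* holds A ...          */
    pthread_mutex_lock(&B);       /* ... and waits for B  */
    pthread_mutex_unlock(&B);
    pthread_mutex_unlock(&A);
    return NULL;
}

static void *t2(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&B);       /* holds B ...          */
    pthread_mutex_lock(&A);       /* ... and waits for A  */
    pthread_mutex_unlock(&A);
    pthread_mutex_unlock(&B);
    return NULL;
}

int main(void)
{
    pthread_t x, y;
    pthread_create(&x, NULL, t1, NULL);
    pthread_create(&y, NULL, t2, NULL);
    pthread_join(x, NULL);
    pthread_join(y, NULL);
    puts("finished without deadlock this time");
    return 0;
}

Making both threads acquire the locks in the same order breaks the circular wait and removes the possibility of deadlock.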
In general, four strategies are used for dealing with deadlocks: prevention, avoidance,
detection and recovery, and simply ignoring the problem.
Memory Management
Single Contiguous Allocation: this is the simplest memory management approach. Memory is
divided into two sections, one for the operating system program (also called the monitor)
and one for the user program. In this approach, the operating system only keeps track of
the first and the last locations available for allocation to user programs. In order to provide
a contiguous area of free storage for the user program, the operating system is kept in
low memory along with the interrupt vector. This type of memory management scheme is
commonly used in single-process operating systems such as CP/M.
A register called the fence register is set to the highest address occupied by the operating
system code. A memory address generated by a user program to access a certain memory
location is first compared with the fence register's contents. If the generated address is
below the fence, it is trapped and permission is denied.
Due to its lack of support for multiprogramming, this memory management technique results
in lower utilization of the CPU and of memory capacity.
Static partitioning implies that the division of memory into a number of partitions and their
sizes are made during the system generation process and remain fixed thereafter. In
dynamic partitioning, the size and the number of partitions are decided at run
time by the operating system. The basic approach in static partitioning is to divide memory
into several fixed-size partitions, where each partition accommodates only one program for
execution. The number of programs in memory, i.e. the degree of multiprogramming, is
bounded by the number of partitions. When a program terminates, its partition becomes
free for another program waiting in a queue. Once partitions are defined, the operating
system keeps track of the status of memory partitions through a data structure called the
partition description table.
The two most common strategies for allocating free partitions to ready processes are (i)
first-fit and (ii) best-fit. The first-fit approach allocates the first free partition large enough
to accommodate the process. The best-fit approach allocates the smallest free partition that
meets the needs of the process. First-fit executes faster, whereas best-fit achieves higher
utilization of memory by searching for the smallest adequate free partition.
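A minimal C sketch of the two strategies over a made-up partition description table (all sizes are illustrative assumptions, in kilobytes):

#include <stdio.h>

#define NPART 5

static int part_size[NPART] = { 100, 500, 200, 300, 600 };
static int part_free[NPART] = { 1, 1, 1, 1, 1 };

/* First-fit: take the first free partition that is large enough. */
static int first_fit(int need)
{
    for (int i = 0; i < NPART; i++)
        if (part_free[i] && part_size[i] >= need)
            return i;
    return -1;
}

/* Best-fit: take the smallest free partition that is large enough. */
static int best_fit(int need)
{
    int best = -1;
    for (int i = 0; i < NPART; i++)
        if (part_free[i] && part_size[i] >= need &&
            (best < 0 || part_size[i] < part_size[best]))
            best = i;
    return best;
}

int main(void)
{
    printf("first-fit for 212K -> partition %d\n", first_fit(212));  /* the 500K partition */
    printf("best-fit  for 212K -> partition %d\n", best_fit(212));   /* the 300K partition */
    return 0;
}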
Whenever a new process is ready to be loaded into memory and no partition is free,
swapping of processes between main memory and secondary storage is done. Swapping helps
CPU utilization by replacing low-priority processes residing in main memory with ready-
to-execute high-priority processes from secondary storage. When the higher-priority
process terminates, the lower-priority process can be swapped back in and continued. The
main problem with swapping is that it takes considerable time to transfer a process to or
from a secondary storage device.
Suppose that the user program is 100K words and the secondary storage device is a fixed-head
disk with an average latency of 8 ms and a transfer rate of 250,000 words/sec. A transfer to
or from memory then takes 100,000 / 250,000 = 0.4 seconds = 400 ms, plus the 8 ms latency,
i.e. 408 ms; since a swap involves both a swap-out and a swap-in, the total swap time is
about 816 ms.
Such overhead must be considered when deciding whether to swap a process to make
room for another. Static partitioning eliminates the overhead of run-time allocation of
partitions at the expense of lower utilization of primary memory. On the other hand,
dynamic partitioning is much more flexible and utilizes memory more efficiently; its
only drawback is the run-time overhead of partition allocation.
Whether a process is loaded back into the same partition from which it was swapped out
depends on the relocation policy. If relocation is performed before or during the loading of a
program into memory by a relocating loader or linker, the approach is called static
relocation. Dynamic relocation refers to the run-time mapping of virtual addresses into
physical addresses with the support of base and limit registers.
A virtual address refers to information within a program's address space, while a physical
address specifies the actual memory location where program and data are stored during
execution. When a process is scheduled, the base register is loaded with its starting address.
Every memory address generated by the program automatically has the base register
contents added to it before being sent to main memory. If the base register holds 100K, a
MOVE R1, 200 instruction, which is supposed to load the content of virtual address 200
(relative to the beginning of the program) into the register, is effectively turned into
MOVE R1, 100K+200 without the instruction itself being modified. An advantage of using a
base register for relocation is that a program can be moved anywhere in memory after it has
started execution.
Protection & Sharing: the operating system must be protected from user programs, and
each user process must also be protected from malicious access by other processes. A
common approach is to use a limit/bound register for protection; it detects attempts to
access memory locations beyond the boundary assigned by the operating system. When a
process is scheduled, the limit register is loaded with the highest virtual address in the
program. Each memory access of the running program is first compared with the contents
of the limit register; if it exceeds the limit register, the access is trapped and denied.
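A small C sketch of dynamic relocation and limit checking (the base and limit values are illustrative assumptions):

#include <stdio.h>

/* Sketch of dynamic relocation with a base register and a limit register. */
static unsigned long base_reg  = 100 * 1024;  /* partition starts at 100K          */
static unsigned long limit_reg = 300;         /* highest legal virtual address     */

/* Returns the physical address, or "traps" (prints a message and returns -1)
   when the virtual address exceeds the limit register. */
static long translate(unsigned long vaddr)
{
    if (vaddr > limit_reg) {
        printf("trap: address %lu beyond limit %lu\n", vaddr, limit_reg);
        return -1;
    }
    return (long)(base_reg + vaddr);          /* relocation: base + virtual address */
}

int main(void)
{
    printf("virtual 200 -> physical %ld\n", translate(200));  /* 100K + 200 */
    translate(5000);                                          /* trapped    */
    return 0;
}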
A good memory management mechanism must also allow sharing of data and code
between cooperating processes. One traditional approach is to place data and code in a
dedicated common partition.
Fixed partitioning suffers from the following limitations:
i. No single program/process may exceed the size of the largest partition in a given
system.
ii. It does not support systems having dynamically growing data structures such as stacks,
queues, heaps, etc.
iii. It limits the degree of multiprogramming, which in turn may reduce the effectiveness
of short-term scheduling.
The main problem with fixed-size partitions is the wastage of memory by programs that
are smaller than their partitions (internal fragmentation). A different memory
management approach, known as dynamic partitioning (also called variable partitioning),
creates partitions dynamically to meet the requirements of each requesting process. When
a process terminates or is swapped out, the memory manager returns the vacated space to
the pool of free memory areas from which partition allocations are made.
Compared to fixed partitioning, in dynamic partitioning neither the size nor the number of
partitions need be fixed in advance. The memory manager continues creating and allocating
partitions to requesting processes until all physical memory is exhausted or the maximum
allowable degree of multiprogramming is reached.
The main difference between fixed and variable partitions is that the number, location and
size of partitions vary dynamically in the latter as processes are created and terminated,
whereas they are fixed in the former. The flexibility of not being tied to a fixed number of
partitions that may be too large or too small for requesting processes improves memory
utilization, but it also complicates the allocation and de-allocation of memory. With variable
partitions, the operating system keeps track of which parts of memory are available and
which are allocated.
Assume that we have 640K of main memory available, of which 40K is occupied by the
operating system. There are 5 jobs waiting for memory allocation in a job queue.
Applying an FCFS scheduling policy, Process 1, Process 2 and Process 3 can be allocated
memory immediately; Process 4 cannot be accommodated because only 600 - 550 = 50K is
left for it.
Assume that after some time Process 1 terminates, releasing 200K of memory. Control then
returns to the process queue and the next process (Process 4) is swapped into memory.
After Process 1, Process 3 terminates, releasing 100K of memory, but Process 5 cannot be
accommodated due to external fragmentation. After Process 2 is swapped out on
termination, Process 5 can be loaded for execution.
This example illustrates one important problem with variable-size partitions: external
fragmentation. It exists when there is enough total free memory for a requesting process,
but the request cannot be satisfied because the free memory is not contiguous; storage is
fragmented into a large number of small holes (free spaces). Depending upon the total size
of memory and the number and size of programs, external fragmentation may be either a
minor or a major problem.
One solution to this problem is compaction: it is possible to combine all the holes (free
spaces) into one large block by pushing all the processes downward as far as possible. The
following figure illustrates the compaction of memory: in figure (a) there are 4 holes of
sizes 30K, 20K, 40K and 20K, which have been compacted into one large hole or block of
110K (figure (b)).
Compaction is usually not done because it consumes a lot of CPU time: on a 1M
microcomputer that can copy at a rate of 1 megabyte/sec, the CPU takes one second to
compact memory. It is usually done on large machines like mainframes or supercomputers
because they have special hardware to perform this task at a rate of 40 megabytes/sec or
more. Finding an optimal compaction strategy is quite difficult.
If we apply the simplest algorithm, Processes 3 and 4 are moved to the end, for a total
movement of 500K. If we simply move Process 4 above Process 3, only 300K is moved, or
if we move Process 3 down below Process 4, only 200K is moved; note that the large
memory hole (block) is then in the middle. Therefore, with a large number of processes,
selecting which process or processes to shift downwards or upwards to meet the
requirement of a waiting process is a quite difficult task.
One advantage of variable partitions is that memory utilization is generally better than with
fixed-size partitions, since partitions are created according to the size of the process.
Protection and sharing in static and dynamic partitioning are quite similar, because the
hardware requirements are the same, except for some additional considerations due to
compaction of memory during dynamic partitioning.
Disadvantages:
i. Dynamic memory management requires a lot of operating system space and time, a
complex memory management algorithm and bookkeeping operations.
ii. Compaction time is very high. Although internal fragmentation is negligible, external
fragmentation may be a severe problem, imposing a time penalty for compaction.
Paging
When a programmer wants to transfer data from one location to another, he might write,
for example, MOVE R1, 2000 (i.e., move the content of memory address 2000 to register
R1). The address of a memory location can also be generated using indexing, base registers,
segment registers and other ways.
These program-generated addresses are called virtual addresses and form the virtual
address space. There has to be a mapping between the virtual address space and the physical
address space.
Address mapping in a paging system
The physical memory is conceptually divided into a number of fixed-size blocks called
frames (or page frames). The virtual address space, or logical memory, of a process is also
broken into blocks of the same size, called pages. When a program is to be run, its pages
are loaded from disk into any available frames.
An important component of the paging operation is the page map table (PMT), which
contains the starting (base) address of the frame in which each page is stored in physical
memory.
As shown in the figure, every address generated by the CPU contains two components: a
virtual (or logical) page number and an offset into that page. The page number works
as an index into the page map table. To form the physical memory address, the base
address found in the PMT is combined with the offset and sent to the physical memory unit.
The page size (and frame size) is typically between 512 bytes and 4K bytes, depending on the
computer architecture.
In the example, the first virtual page is assumed to be placed in the physical frame whose
starting address is FFF00H; this value is stored in the first entry of the PMT. All other PMT
entries are filled with the frame addresses where the corresponding pages are loaded.
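A minimal C sketch of the translation step, assuming a 4K page size and made-up frame addresses (the values are placeholders, not the ones in the figure):

#include <stdio.h>

/* Paging translation sketch: with a 4K page size, the low 12 bits of a
   virtual address are the offset and the remaining bits are the page
   number used to index the page map table (PMT). */
#define PAGE_SIZE   4096u
#define OFFSET_BITS 12u
#define NPAGES      8u

static unsigned long pmt[NPAGES] = {            /* frame base addresses (made up) */
    0xFFF00000u, 0x00022000u, 0x00100000u, 0x00009000u,
    0x00404000u, 0x00015000u, 0x00300000u, 0x00001000u
};

static unsigned long translate(unsigned long vaddr)
{
    unsigned long page   = vaddr >> OFFSET_BITS;        /* page number  */
    unsigned long offset = vaddr & (PAGE_SIZE - 1);     /* page offset  */
    if (page >= NPAGES)
        return 0;                                        /* would trap in a real system */
    return pmt[page] + offset;                           /* frame base + offset */
}

int main(void)
{
    unsigned long v = 0x00002123u;               /* page 2, offset 0x123 */
    printf("virtual %#lx -> physical %#lx\n", v, translate(v));
    return 0;
}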
The main objectives of hardware support for paging are to store the page map table and to
make virtual-to-physical address translation more efficient. Since every access to memory
must go through the PMT, efficiency is a major consideration.
To reduce the access time, registers can be used to hold the PMT if the number of entries is
quite small (around 256). For very large tables (around 1,000,000 entries), the use of
registers is not feasible, so the PMT is usually kept in primary memory, with a Page Table
Base Register (PTBR) pointing to the beginning of the PMT.
The problem with this approach is memory access time. Suppose we want to access a location
L. We first index into the PMT by adding the page number (p) for that location to the base
address of the PMT (b), i.e. b + p. This task requires a memory access, which in turn gives a
frame number; the frame number is combined with the page offset to produce the actual
address. With this scheme, two memory accesses are required (one for the PMT and one for
the data), so memory access is slowed by a factor of two. Therefore, faster translation
methods must be used.
The standard solution to this problem is to keep page map table entries in an associative
memory, also called a look-aside memory or content-addressable memory.
A set of associative registers is built of high-speed memory. Unlike primary memory,
associative memories are searched by content rather than by address. Each register
consists of two parts: a key and a value. When the associative registers are presented with
the question "Is a particular page stored in associative memory?", the page number is
compared with all keys simultaneously. If the page is found in the associative registers, its
corresponding value is output. Although the hardware is quite expensive, the searching
operation is very fast.
Associative memory works with the page map table in the following manner. The
associative memory contains only a part of the PMT; the page entries maintained in it
correspond to the most recently referenced pages, using the heuristic that a page referenced
in the recent past is likely to be referenced again in the near future.
When a virtual address containing page p and offset d is generated by the CPU, the address
translation mechanism first tries to find page p in the partial associative page map table,
which contains page numbers and their corresponding frame numbers. If the page number is
found in the associative registers, its frame number is immediately available and is
combined with offset d to form the real address corresponding to the virtual address.
This operation is shown in the figure.
If the target page number is not in the associative registers, a reference to the PMT in
memory is made, from which the frame number is obtained and used to access memory. In
addition, the page number and frame number are added to the associative registers so that
they can be found very quickly on the next reference.
The percentage of times that a page number is found in the associative registers is called the
hit ratio. A 70% hit ratio means that 70% of the time we find the desired page number in
the associative registers. If it takes 60 nanoseconds to search the associative registers and
740 nanoseconds to access physical memory, the total memory access time is 60 + 740 =
800 ns when the target page is in the associative registers. If the target page is not found in
the associative registers, we must first access primary memory (740 ns) for the page map
table to get the corresponding frame number and then access the desired location in memory
(another 740 ns), for a total of 1480 ns. With a 70% hit ratio, the effective access time is
therefore 0.70 x 800 + 0.30 x 1480 = 1004 ns.
The hit ratio is clearly related to the number of associative registers. With 8 or 16
associative registers, a hit ratio of 80 percent to 90 percent can be obtained.
Sharing: consider, for example, 20 users each executing a compiler of 50K code with 5K of
data space. Without sharing, the memory requirement would be 50K x 20 + 5K x 20 = 1100K,
a little over 1 megabyte. The obvious solution to this problem is to share the code pages
among all users.
Sharing must be controlled to prevent one process from modifying data that another
process is reading. In today's systems that support sharing, programs consist of two
separate parts, one for procedure code and another for data. To be sharable, the code must
be non-self-modifying, i.e. reentrant or pure code: reentrant code is read-only, and there
should be no attempt on the part of users to modify it. Modifiable data or code cannot be
shared. Since reentrant code never changes during execution, two or more processes can
execute the same code at the same time without each having a personal copy of it. The data
for two different processes running the same code will, of course, be different: each process
has its own copy of registers and data storage to hold the data for its own execution.
Only one copy of the compiler needs to be kept in primary memory. The page map table of
each process maps onto the same physical copy of the compiler, but the data pages are
mapped onto different frames.
Therefore, to support one copy of the compiler (50K) plus 20 copies of the 5K data space
per user, the total memory requirement is now just 150K instead of over 1 megabyte, a
significant saving.
Protection: memory protection is usually achieved by protection bits associated with each
page. These bits are usually kept in the page map table. One protection bit can define a
page as read/write or read-only. Since every reference to a page goes through the
page map table to find the correct frame number, the protection bits can be checked to verify
that no write operation is attempted on a read-only page. Any such attempt is easily trapped.
SEGMENTATION
Segmentation is a memory management scheme which supports programmers' view of
memory. Programmers never think of their programs as a linear array of words. Rather,
they think of their programs as a collection of logically related entities, such as
subroutines or procedures, functions, global or local data areas, stack etc.
Segments are formed at program translation time by grouping together logically related
entities. The formation of these segments varies from one compiler to another. A Pascal
compiler might create separate segments for (1) the code of each procedure, (2) global data,
(3) local data or variables and (4) the stack used for calling procedures and storing their
parameters.
A Fortran compiler, for example, might create a separate segment for each common
block, and arrays might be formed as separate segments. To simplify implementation,
each segment in a program is numbered and referred to by a segment number
rather than a segment name. In segmented systems, the components belonging to a single
segment reside in one contiguous area, but different segments belonging to the same
process may occupy non-contiguous areas of physical memory, because each segment is
individually relocated.
Address Mapping in a Segmented System
An important component of address mapping in a segmented system is a segment table.
Its use is illustrated in the following figure.
A virtual (logical) address consists of two parts: a segment number and an offset into that
segment. The segment number provided in the virtual address is used as an index into the
segment table. Each row of the segment table contains the starting address (base address)
of a segment and the size of that segment. The offset of the virtual address must be within
(less than or equal to) the size of the segment. If the offset is not within range, it is trapped
by the operating system; otherwise the offset is added to the base address of the segment to
produce the physical address of the desired word. In the example there are 5 segments,
numbered from 0 to 4. The segment table has a separate entry for each segment, giving the
starting address (base address) of the segment in physical memory and the size of that
segment.
For example, segment 1 is 1500 words long, starting at location 2500. Thus, a reference to
word 500 within segment 1 is mapped onto location 2500 (base address) + 500 = 3000
(actual physical address). A reference to segment 2 at location 300 is mapped to
6000 + 300 = 6300, and so on. But a reference to word 900 of segment 4 would cause a
segment size violation and would be trapped by the operating system.
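A small C sketch of this lookup; the entries for segments 1 and 2 follow the example above, while the remaining bases and sizes are assumptions:

#include <stdio.h>

/* Segmented address translation with a bounds check. */
#define NSEG 5

struct seg_entry { unsigned base, size; };

static struct seg_entry seg_table[NSEG] = {
    { 1000, 1200 },   /* segment 0 (assumed)  */
    { 2500, 1500 },   /* segment 1            */
    { 6000, 400  },   /* segment 2            */
    { 4000, 1000 },   /* segment 3 (assumed)  */
    { 7000, 800  }    /* segment 4 (assumed)  */
};

static long translate(unsigned seg, unsigned offset)
{
    if (seg >= NSEG || offset >= seg_table[seg].size) {  /* offset must lie in segment */
        printf("trap: illegal reference to segment %u offset %u\n", seg, offset);
        return -1;
    }
    return (long)(seg_table[seg].base + offset);          /* base + offset */
}

int main(void)
{
    printf("seg 1, word 500 -> %ld\n", translate(1, 500));   /* 2500 + 500 = 3000 */
    printf("seg 2, word 300 -> %ld\n", translate(2, 300));   /* 6000 + 300 = 6300 */
    translate(4, 900);                                        /* size violation    */
    return 0;
}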
Implementation of segment tables
Like the page map table, the segment table can be stored either in fast registers or in
primary memory. The main advantage of keeping the segment table in fast registers is that it
can be accessed very fast, because the comparison against the segment size and the addition
of the base address can be performed simultaneously.
In cases where a program consists of a large number of segments, it is not feasible to
keep the segment table in registers; it must be kept in memory. In general, the size of a
segment table is related to the size of the virtual address space of a program. When the table
is kept in memory, a segment table base register (STBR) points to the segment table and a
segment table length register (STLR) records the number of segments used by the program.
In this way, a segment table need contain only as many entries as there are segments
actually defined in a given program. An attempt to access a non-existent segment is trapped
using the STLR.
For a virtual address containing a segment number and an offset, the first step is to check
whether the segment number is valid (segment number < STLR). The segment number is
then added to the STBR to generate the address of the segment table entry in memory. This
entry is read from memory and the offset is compared with the segment size; if it is legal,
the physical address of the desired word is computed as the sum of the base address of the
segment and the offset.
As with paging, this mapping requires two memory references per logical address,
slowing the computer system by a factor of two. The normal solution to this problem is to
use a set of associative registers to keep the most recently used segment table entries. A
small set of 8 to 16 associative registers can reduce the delay of memory access to within
about 10 to 15 per cent of an unmapped memory access.
Sharing and Protection in a Segmented System
One of the advantages of segmentation over paging is that it is a logical rather than a
physical concept. Segments are generally not limited to a particular size as pages are;
rather, segments are allowed to be as large (within reasonable limits) as they need to be. A
segment for an array is as large as the array. A segment corresponding to a dynamic data
structure such as a stack, queue or binary tree may grow and shrink in size as the data
structure itself grows and shrinks.
VIRTUAL MEMORY
The common problem facing programmers a few years ago was how to fit large programs
into small memories. The solution developed was to break programs into small pieces
called overlays. The first overlay (overlay 0) would be loaded and executed first and would
in turn call another overlay. The overlays were kept on hard disk and swapped in and out
of memory by the operating system.
Although the swapping was done by the operating system, the work of splitting a program
into a number of overlays had to be done by the programmer, which was a time-consuming
job. Virtual memory is a memory management technique that performs both the splitting of
a program into pieces and the swapping automatically.
The basic idea behind virtual memory is that the combined size of the program, data and
stack may exceed the amount of physical memory. The operating system keeps those
parts of the program that are required during execution in memory and the rest on disk. For
example, a 2MB program can run on a 640K RAM machine by carefully choosing which
pieces to keep in memory at each instant and swapping pieces of the program between disk
and memory as needed.
The main objective of the various memory management strategies discussed so far was to
support multiprogramming, i.e. to keep many processes in memory simultaneously, but all
these schemes require the entire process to be in memory before execution starts.
Virtual memory is a memory management scheme that allows the execution of processes
that may not be completely in main memory; in other words, it allows the execution of
partially loaded processes. As a consequence, user programs can be larger than physical
memory.
Advantages of Virtual Memory
With virtual memory, programmers get the illusion of a much larger memory than the
physical memory, so they need not be concerned about fitting their programs into a small
physical memory. In most cases, the entire program is not needed. Take the case of an
assembler: it performs its task in two passes. In the first pass, it scans the entire program and
builds the symbol table, and in the second pass it generates object code. Therefore, only the
code related to a particular pass needs to be in memory at a time, not the entire assembler.
Even in those cases where the entire program is needed, it may not all be needed at the same
time.
1. The size of a user's program is no longer constrained by the available physical memory.
Users can write programs for a very large virtual address space, simplifying the
programming task.
2. Since each user utilizes less physical memory, more users can keep their programs in
memory simultaneously, which increases CPU utilization and throughput.
3. A process may be loaded into a space of arbitrary size, which in turn reduces
external fragmentation without the need to change the scheduled order of process
execution. Moreover, the amount of space in use by a given process may change
during its memory residence; as a result, the operating system may speed up the
execution of important processes by allocating them more physical memory.
But what happens if the program tries to access a page that has not been swapped into
memory? In that case, a page fault trap occurs. A page fault is the result of the operating
system not yet having brought a valid part of the program into memory, in its attempt to
minimize swapping overhead and physical memory requirements. When the running
program experiences a page fault, it must be suspended until the missing page is swapped
into main memory. Since disk access time is usually several orders of magnitude longer
than the main memory cycle time, the operating system usually schedules another process
during this period. Here is the list of steps the operating system follows in handling a page
fault:
1. If a process refers to a page which is not in physical memory, an internal table kept with
its process control block is checked to verify whether the memory reference to the page
was valid or invalid.
2. If the memory reference was valid but the page is missing, the process of bringing the
page into physical memory starts.
3. A free memory frame is located from the list of free frames.
4. By reading the disk, the desired page is brought into the free memory frame.
5. Once the page is in physical memory, the internal table kept with the process and the
page map table are updated to indicate that the page is now in memory.
6. The instruction that was interrupted due to the missing page is restarted.
Whenever there is an interrupt due to any reason, the system saves the state (registers,
stack, program counter) of the interrupted process. After the condition is fulfilled, the
interrupted process restarts from the same place where it was interrupted.
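A simulation sketch of these steps in C, using a valid bit in each page map table entry (all values are placeholders; a real handler would also deal with page replacement when no frame is free):

#include <stdio.h>

/* Demand-paging simulation: a reference to an invalid entry raises a
   "page fault", the page is "read from disk" into a free frame, and the
   page map table is updated before the access is retried. */
#define NPAGES 4

struct pmt_entry { int valid; int frame; };

static struct pmt_entry pmt[NPAGES];   /* all entries start invalid (not in memory) */
static int next_free_frame = 0;

static int access_page(int page)
{
    if (!pmt[page].valid) {                       /* page fault trap            */
        printf("page fault on page %d\n", page);
        pmt[page].frame = next_free_frame++;      /* grab a free frame          */
        /* ... read the page from disk into that frame ... */
        pmt[page].valid = 1;                      /* update the page map table  */
    }
    return pmt[page].frame;                       /* restart the access         */
}

int main(void)
{
    printf("page 2 -> frame %d\n", access_page(2));   /* faults, then resolves */
    printf("page 2 -> frame %d\n", access_page(2));   /* now found in memory   */
    return 0;
}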
The hardware required to support demand paging is the same as the hardware for paging
and swapping:
A page map table with the ability to mark each entry as valid or invalid.
A hard disk to hold those pages not in memory.
4. Placement policy: where to place an incoming new page in physical memory?