Operating Systems Notes by S Dhall

Introduction to Operating Systems

Operating System: An Operating System is a program that manages the computer hardware (or resources). It also provides a basis for application programs to execute & acts as an intermediary between the computer user & the computer hardware.

Computer System: A Computer System can be divided roughly into the following components:
- hardware & micro architecture
- operating system
- system & application programs

Figure 1: Abstract Layered view of components of a Computer System

Hardware: Registers, the data path, the CPU, memory and input/output devices provide the basic computing resources for the system. Hardware understands only machine language.

Operating System: To hide the complexity associated with managing/operating the hardware, an operating system is provided. It consists of a layer of software that (partially) hides the hardware & gives the programmer a more convenient set of instructions to work with.

System Software: On top of the operating system lies the system software, such as the command interpreter (shell), compilers, editors and similar application-independent programs.

Application programs: Finally, above the system programs come the application programs. These programs are purchased or written by users to solve their particular problems, such as word processing, spreadsheets, web browsing, storing information in a database, etc.

Note: The Operating System runs in kernel mode (also called supervisor mode). System programs & application programs run in user mode.

ROLE OF OPERATING SYSTEM

An operating system (OS) has two basically unrelated roles:
(i) Extending the machine
(ii) Managing resources

Extending the Machine: The role of the OS is to present the user with the equivalent of an extended machine that is easier to program than the underlying hardware. The OS provides the user with a convenient interface and acts as an intermediary between the user & the underlying hardware. It hides the details of the hardware from the programmer/user, making the application layer independent of the lower layers and thereby providing the desired convenience.

Operating System as a Resource Manager: Besides providing a convenient interface to the user, the operating system is also required to manage all the resources of a computer system, such as processors, memory, disks, I/O devices, network interfaces, etc. Thus, the job of the OS is to provide an orderly & controlled allocation of these system resources among the various processes (executing programs) competing for them. Its task is to keep track of which process is using which resource, to grant resource requests, to account for usage, and to mediate conflicting requests from different processes and users.

OPERATING SYSTEM FUNCTIONS

An operating system provides an environment for the execution of programs. It provides certain services to programs and to the users of those programs. Some of these services/functions are:

1. User interface
The Operating System provides an interface between the computer user and the computer hardware. Almost all operating systems have a user interface (UI) via which the user interacts with the applications and the hardware. This interface can take several forms:

 Command-line interface (CLI), which uses text commands and a method for entering them (say, a program that allows entering and editing of commands).

 Batch interface, in which commands and directives to control those commands are entered into files, and those files are executed.

 Graphical user interface (GUI), which uses a pointing device (like a mouse) to direct I/O, choose from menus, and make selections, and a keyboard to enter text.

2. Program execution & Process Management
The system must be able to load a program into memory and to run that program. This task is facilitated by the OS. The OS is responsible for creating, executing and deleting processes, cancelling or resuming a process, scheduling and synchronizing processes, and handling deadlock situations among processes.

3. I/O operations
A running program may require input output (I/O), which may involve a file or an
I/O device. For specific devices, special functions may be desired (such as
recording to a CD or DVD drive or blanking a CRT screen). For efficiency and
protection, users usually cannot control I/O devices directly. Therefore, the
operating system must provide a means to do I/O. The device management tasks
handled by OS are – open, close, communicate, control and monitor device
drivers.

4. File-system manipulation
Programs need to read and write files and directories. They also need to create
and delete them by name, search for a given file, allocate space for files, list file
information, perform permission management to allow or deny access to files or
directories based on file ownership & permissions. All these functionalities
related to file system management are supported by the OS.

5. Communication
One of the important functionalities of OS is to facilitate interprocess
communication where one process needs to exchange information with another
process. Such communication may occur between processes that are executing on
the same computer or between processes that are executing on different computer
systems connected to each other via a network. This interprocess communications
may be implemented via shared memory or through message passing, in which
packets of information are moved between processes by the operating system.

6. Error detection
The operating system needs to be constantly aware of possible errors. Errors may
occur in the CPU and memory hardware (such as a memory error or a power
failure), in I/O devices (such as a parity error on tape, a connection failure on a
network, or lack of paper in the printer), and in the user program (such as an
arithmetic overflow, an attempt to access an illegal memory location, or a too-
great use of CPU time). For each type of error, the operating system should take
the appropriate action to ensure correct and consistent computing. Debugging
facilities can greatly enhance the user's and programmer's abilities to use the
system efficiently.

7. Resource allocation
Systems with multiple users can gain efficiency by sharing the computer
resources among the users. When there are multiple users or multiple jobs running
at the same time, resources must be allocated to each of them. Many different
types of resources like CPU, main memory, file storage, I/O devices etc. are
managed by the operating system. For example, in determining how best to use
the CPU, operating systems have CPU-scheduling routines that take into account
the speed of the CPU, the jobs that must be executed, job sizes, priority of jobs
etc.

8. Memory Management
OS performs the activities of memory management like – allocate memory, free
memory, re-allocate memory to a process when a used block is freed, keep track
of memory usage etc.

9. Accounting
The OS keeps track of which users/processes use how much and what kinds of computer resources. Usage statistics may be valuable for reconfiguring the system to improve its overall efficiency and performance.

10. Protection and security
The owners of information being stored in a multiuser or networked computer system may want to control use of that information. When several separate processes execute concurrently, it should not be possible for one process to interfere with the others or with the operating system itself.

 Protection involves ensuring that all access to system resources is controlled.
 Security of the system from outsiders is also important. Such security starts with requiring each user to authenticate himself or herself to the system, usually by means of a password, to gain access to system resources. It extends to defending external I/O devices, including modems and network adapters, from invalid access attempts and to recording all such connections for detection of possible breaches (if any).

Ensuring protection & security of the system is another functionality of the OS.

KERNEL

The kernel is the central module of an operating system. It is the part of the operating
system that loads first, and it remains in main memory all the time as long as the system
is on. Because it stays in memory, it is important for the kernel to be as small as possible
while still providing all the essential services required by other parts of the operating
system and applications. The kernel code is usually loaded into a protected area of memory to prevent it from being overwritten by programs or other parts of the operating system. Typically, the kernel is responsible for memory management, process and task management, and disk management. The kernel facilitates the connection between the system hardware & the application software. Every operating system has a kernel. For example, the Linux kernel is used by numerous operating systems, including the many Linux distributions and Android.

What happens when a computer system is powered on or rebooted?

When a computer starts running, i.e. when it is powered up or rebooted, it needs an initial program to run. This initial program, or bootstrap program, tends to be simple. Typically, it is stored in read-only memory (ROM) or electrically erasable programmable read-only memory (EEPROM), known by the general term firmware, within the computer hardware. It initializes all aspects of the system, from CPU registers to device controllers to memory contents. The bootstrap program must know how to load the operating system and how to start executing that system. To accomplish this, the bootstrap program locates & loads the operating-system kernel into memory. Once the kernel is loaded in memory, it starts executing the first process, such as “init” (on Unix), which in turn can start many daemon processes (running in the background for the entire time the kernel is running). Once this phase is complete, the system is fully booted, and it waits for some event to occur.

The occurrence of an event is usually signalled by an interrupt from either the hardware
or the software. Hardware may trigger an interrupt at any time by sending a signal to the
CPU, usually by way of the system bus. Software may trigger an interrupt by executing a
special operation called a system call (in Unix OS).
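For illustration, the minimal C fragment below issues a write() system call on a Unix-like system; the call traps from user mode into the kernel, which carries out the I/O on the process's behalf.

#include <unistd.h>   /* declares the write() system call wrapper */

int main(void)
{
    const char msg[] = "hello from user mode\n";

    /* write() traps into the kernel (a software interrupt / system call);
       file descriptor 1 is standard output */
    write(1, msg, sizeof(msg) - 1);
    return 0;
}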

References:

1. A. Silberschatz, P.B. Galvin and G. Gagne, Operating System Concepts (7th ed.), John Wiley & Sons, Inc.
2. A.S. Tanenbaum, Modern Operating Systems (2nd ed.), Prentice-Hall of India.
3. https://www.webopedia.com/TERM/K/kernel.html
4. https://en.wikitolearn.org/User:Srijancse/Operating_Systems/Introduction/Computer-System_Organization

Types of Operating Systems

Mainframe Operating Systems


At the high end are the operating systems for the mainframes, which are room-sized computers
used in major corporate data centres. These computers distinguish themselves from personal
computers in terms of their I/O capacity.
The operating systems for mainframes are heavily oriented towards processing many jobs at
once, most of which need huge amounts of I/O. They typically offer three kinds of services:
batch, transaction processing, and timesharing. A batch system is one that processes routine
jobs without any interactive user present. For instance, claims processing in an insurance
company or sales reporting for a chain of stores are typically done in batch mode. Transaction
processing systems handle large numbers of small requests, for example, cheque processing at
a bank or airline reservations. Each unit of work is small, but the system must handle hundreds
or thousands per second. Timesharing systems allow multiple remote users to run jobs on the
same computer all at once, such as querying a big database. These functions are closely related
& mainframe operating systems often facilitate all of them. An example mainframe operating
system is OS/390, a descendant of OS/360.

Personal Computer Operating Systems


These are the operating systems loaded on personal computers. Besides performing the functions of a general operating system, their major job is to provide a good interface to a single user, making it convenient to operate the machine even without much technical know-how. They are widely used by users who work on simple applications like word processing, spreadsheets, media playing, Internet access, etc. Common examples are Windows 98, Windows 2000, the Macintosh operating system, and Linux.

Network Operating System


A network operating system (NOS) includes special functions for connecting computers and devices into a local-area network (LAN) and supporting peer-to-peer networking models, i.e. a NOS is an enhancement of a basic operating system obtained by adding networking features to it. NOSes must be able to handle typical network duties such as the following:
 Providing access to remote systems and recognizing when devices aren’t available to
the network.
 Enabling and managing access to files on remote systems, and determining who can
access what – and who can’t.
 Granting access to remote applications and resources, and making those resources
seem like local resources to the user i.e. the network is ideally transparent to the user.
 Providing routing services, including support for major networking protocols, so that
the operating system knows what data to send where.

 Monitoring the system and security, so as to provide proper security against viruses,
hackers, etc. and data corruption.
 Providing basic network administration utilities, enabling an administrator to perform
tasks involving managing network resources and users.
Some operating systems, such as UNIX and the Mac OS, have networking functions built in. Novell NetWare and Windows NT are a few more examples of a NOS.

Distributed Operating System


Distributed Operating System manages a distributed system where distributed applications are
running on multiple computers linked by communication channels. A distributed operating
system is an extension of the network operating system that supports higher levels of
communication and integration of the machines on the network.
Multiple computers of a distributed system possess no hardware connections at the CPU –
memory bus level, but are connected by external interfaces that run under the control of
software. Each processor has its own local memory and processors communicate with one
another through various communication lines, such as high-speed buses or telephone lines.
The Distributed OS facilitates a collection of autonomous computer systems to communicate
and cooperate with each other through a LAN/WAN. A Distributed OS provides a virtual
machine abstraction to its users and facilitates wide sharing of resources like computational
capacity, I/O and files etc.
The users of a true distributed system should not know on which machine their programs are
running and where their files are stored. A Distributed OS manages the shared system resources used by multiple processes, the process scheduling activity (how processes are allocated to available processors), the communication and synchronization between running processes, and
so on. Thus, using Distributed OS the system looks to its users like a single system with an
ordinary centralized operating system while actually it runs on multiple, independent central
processing units (CPUs) on separate physical computers.
LOCUS and MICROS are the examples of Distributed Operating Systems.

Multiprocessor Operating System


Multiprocessor refers to the use of two or more central processing units (CPU) within a single
computer system so as to enhance the computing power. All CPUs within the multiprocessor
share full access to a common RAM. Although all multiprocessors have the property that every
CPU can address all of the memory, some multiprocessors have the additional property that
every memory word can be read as fast as every other memory word. These machines are called
UMA (Uniform Memory Access) multiprocessors. Others that don’t have this property are
called NUMA (Nonuniform Memory Access) multiprocessors. Also, any of the processors can
access any of the I/O devices, although they may have to go through one of the other processors.
A multiprocessor has one operating system used by all processors. The multiprocessor
operating system provides interaction between processors and their tasks at the process and data-element level. For the most part, multiprocessor operating systems are just regular
operating systems. They handle system calls, do memory management, provide a file system
and manage I/O devices. Nevertheless, there are some areas in which multiprocessor operating
systems have unique features. These include process synchronization, resource management,
and scheduling. Most modern network operating systems support multiprocessing. These
operating systems include Windows NT, Windows 2000, Windows XP, Unix etc.

Server Operating System


A server operating system is an operating system specifically designed to run on servers, which
are specialized computers that operate within a client/server architecture to serve the requests
of client computers on the network.
The server operating system offers a software layer on top of which other software, or
application programs, can run on the server hardware. The server operating system facilitates serving multiple users at once over a network and allows them to share hardware and software resources. It helps to enable and facilitate typical server roles such as Web server, mail server,
file server, database server, application server etc.
Popular server operating systems include Windows Server, Mac OS X Server, variants of Linux
such as Red Hat Enterprise Linux (RHEL), SUSE Linux Enterprise Server etc.

Real Time Operating Systems


A real-time operating system runs on a real-time system. These systems are characterized by
having time as a key parameter. For example, in industrial process control systems, real-time
computers have to collect data about the production process and use it to control machines in
the factory. Often there are hard deadlines that must be met. For example, if a car is moving
down an assembly line, certain actions must take place at certain instants of time, like if a
welding robot welds too early or too late, the car will be ruined. If the action absolutely must
occur at a certain moment (or within a certain range), it is a hard real-time system.
Another kind of real-time system is a soft real-time system, in which missing an occasional
deadline is acceptable. Digital audio or multimedia systems fall in this category.
The OS running on such real-time systems is called a real-time OS, whose basic objective is to ensure that the real-time deadlines are met as per the requirements of the applications.
VxWorks and QNX are well-known real-time operating systems.

Embedded Operating Systems


Embedded operating systems run to control devices that are not generally thought of as
computers, such as TV sets, microwave ovens, airplane controls, digital cameras, elevators etc.
In contrast to an operating system for a general-purpose computer, an embedded OS is typically quite limited in terms of the functions it performs. Depending on the device in question, the system may run only a single application. However, that single application is crucial to the device's operation, so an embedded OS must be reliable & should be able to run with constraints on memory, size, processing power and time.
Examples of such operating systems are PalmOS and Windows CE (Consumer Electronics).

Smart Card Operating Systems


The smallest operating systems run on smart cards, which are credit card-sized devices
containing a CPU chip. They have very severe processing power and memory constraints.
Some of them can handle only a single function, such as electronic payments, but others can
handle multiple functions on the same smart card. Often these are proprietary systems.
Some smart cards are Java oriented i.e. the ROM on the smart card holds an interpreter for the
Java Virtual Machine (JVM). Java applets (small programs) are downloaded to the card and
are interpreted by the JVM interpreter. Some of these cards can handle multiple Java applets at
the same time, leading to multiprogramming and the need to schedule them. Resource
management and protection also become an issue when two or more applets are present at the
same time. These issues must be handled by the (usually extremely primitive) operating system
present on the card. The smart card OS is usually extremely primitive and small sized.
MULTOS, Java Card OpenPlatform (JCOP) are examples of Smart Card Operating Systems.

References:

1. Silberschatz, P.B.Galvin and G. Gagne, Operating System Concepts (7th ed.), John
Wiley & Sons, Inc.

2. A.S. Tanenbaum, Modern Operating Systems (2nd ed.), Prentice-Hall of India.

3. Other online resources.

Some Basic Concepts of OS

Multiprogramming: In a multiprogramming system there are one or more programs loaded in main memory which are ready to execute. Only one program at a time is able to get the CPU for executing its instructions (i.e., there is at most one process running on the system) while all the others are waiting for their turn. The main idea of multiprogramming is to maximize the use of CPU time. Indeed, suppose the currently running process is performing an I/O task (which, by definition, does not need the CPU); then the OS may interrupt that process and give control of the CPU to one of the other in-memory processes that are ready to execute. To do this, the OS performs a process context switch so that when the former process is rescheduled on the CPU (after completion of its I/O task), it resumes from exactly the same state at which it was previously interrupted. In this way, no CPU time is wasted by the system waiting for the I/O task to be completed, and a running process keeps executing until either it voluntarily releases the CPU or it blocks for an I/O operation. Therefore, the ultimate goal of multiprogramming is to keep the CPU busy as long as there are processes ready to execute.

Multitasking: It is a logical extension of multiprogramming. It means running multiple tasks on the same CPU apparently simultaneously by switching between them. New tasks start and interrupt already started ones before they have reached completion, instead of executing the tasks sequentially one after the other. As a result, a computer executes segments of multiple tasks in an interleaved manner, while the tasks share common processing resources such as central processing units (CPUs) and main memory.

The main differentiator between multiprogramming and multitasking is that multiprogramming works solely on the concept of context switching with the objective of maximizing CPU utilization, whereas multitasking is based on time sharing alongside context switching, which occurs so fast that the user is able to interact with each program (process) separately while it is running. In this way, the user is given the illusion that multiple processes/tasks are executing simultaneously.

Timesharing: Timesharing basically is the sharing of a computing resource among many users
by means of multiprogramming & multitasking.

Timesharing systems are multi-user systems designed to allow several programs to execute apparently simultaneously. Most time-sharing systems use time-slice (round-robin) scheduling of the CPU. In this approach, processes are executed on a rotating basis, where a process executing longer than the system-defined time slice is interrupted by the operating system and placed at the end of the queue of ready programs waiting for the CPU. A small simulation of this rotation is sketched below.
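The fragment below is purely illustrative: the number of processes, their remaining CPU bursts and the QUANTUM value are made up for the example, and a real scheduler obviously works on live processes rather than on an array of counters.

#include <stdio.h>

#define QUANTUM 4                        /* hypothetical time slice, in ticks */
#define NPROC   3

int main(void)
{
    int remaining[NPROC] = {9, 5, 12};   /* made-up CPU bursts */
    int left = NPROC;

    while (left > 0) {
        for (int p = 0; p < NPROC; p++) {
            if (remaining[p] == 0)
                continue;                /* this process has already finished */
            int run = remaining[p] < QUANTUM ? remaining[p] : QUANTUM;
            remaining[p] -= run;         /* run for at most one quantum, then pre-empt */
            printf("process %d runs %d ticks, %d left\n", p, run, remaining[p]);
            if (remaining[p] == 0)
                left--;                  /* process terminates */
        }
    }
    return 0;
}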

Multiprocessing: It refers to the ability of the simultaneous execution of two or more programs
or instructions sequences by separate CPUs under integrated control inside a single computer
system.

Multiuser: A system whose OS allows multiple terminals to connect to a host computer that handles the processing of the tasks of different users.

Multithreading: Multithreading extends the idea of multitasking into applications. Here, specific operations of a single application can be subdivided into individual threads. Thread management may be supported by the application, the operating system, or both.

References:

1. Silberschatz, P.B.Galvin and G. Gagne, Operating System Concepts (7th ed.), John
Wiley & Sons, Inc.
2. A.S. Tanenbaum, Modern Operating Systems (2nd ed.), Prentice-Hall of India.

PROCESSES
Informally, a process is a program in execution. A process is more than the program code,
which is sometimes known as the text section. It also includes the current activity, as
represented by the value of the program counter and the contents of the processor’s
registers. A process generally also includes the process stack, which contains temporary data
(such as function parameters, return addresses, and local variables), and a data section,
which contains global variables. A process may also include a heap, which is the memory
used for dynamic allocation during process run time. The structure of a process in memory is
as shown below:

It should be noted that a program by itself is not a process; a program is a passive entity, such
as a file containing a list of instructions stored on disk (often called an executable file),
whereas a process is an active entity, with a program counter specifying the address of the
next instruction to execute and a set of associated resources. A program becomes a process
when an executable file is loaded into memory.

To implement the process model, the operating system maintains a table (an array of structures), called the process table, with one entry per process. These entries are called process table entries or process control blocks. Each entry contains information about the process state, its program counter, stack pointer, memory allocation, the status of its open files, its accounting and scheduling information, and everything else about the process that must be saved when the process is switched from the running to the ready or blocked state, so that it can be restarted later as if it had never been stopped. A C-style sketch of such an entry is given after the field lists below.

Some fields of a typical Process Table Entry

Process Management
 Registers
 Program Counter
 PSW (Program Status Word)
 Stack Pointer
 Process State
 Priority
 Process ID
 Scheduling parameters
 Parent Process
 Process group
 Signals
 CPU time used etc…

Memory Management
 Pointer to text segment
 Pointer to data segment
 Pointer to stack segment

File Management
 Root directory
 Working directory
 File descriptors
 User ID
 Group ID
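As an illustration, a process control block can be pictured as a C structure holding the fields listed above. The sketch below is only indicative: the field names, types and sizes are invented for this example, and real kernels (for instance, Linux's task_struct) keep considerably more state.

struct pcb {
    /* process management */
    int           pid;             /* process ID */
    int           state;           /* new, ready, running, waiting, terminated */
    int           priority;
    unsigned long pc;              /* saved program counter */
    unsigned long registers[16];   /* saved general-purpose registers */
    unsigned long psw;             /* saved program status word */
    unsigned long stack_pointer;
    int           parent_pid;
    long          cpu_time_used;

    /* memory management */
    void         *text_segment;
    void         *data_segment;
    void         *stack_segment;

    /* file management */
    int           open_files[32];  /* file descriptors */
    int           user_id, group_id;
};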

PROCESS STATE
As a process executes, it changes state. The state of a process is defined in part by the current
activity of that process. Each process may be in one of the following states:

 New: The process is being created.
 Running: The process has been granted the processor and its instructions are being executed.
 Waiting: The process is waiting for some event to occur (such as an I/O completion or reception of a signal).
 Ready: The process is waiting to be assigned to a processor.
 Terminated: The process has finished execution.

It is to be noted that these names are arbitrary and they vary across operating systems but the
states that they represent exist on all systems. It is also important to realize that only one
process can be running on any processor at any instant of time. Many processes may be ready
and waiting, however. The state diagram showing transition among these states is as shown
below:
Figure: State Transition Diagram

INTERPROCESS COMMUNICATION

Often, processes need to communicate with each other and share information in order to carry out a task. Interprocess communication (IPC) refers to this transfer of data or information among processes. For example, suppose we wish to view the list of files and subdirectories in the current directory page by page (on Unix); then we first perform ls -l to list the contents of the current directory and then pass the output of this process to the more command to display it page by page:
ls -l | more
So here the process created to execute the ls command generates output which is treated as input by the process created to execute the more command.

There are three issues involved when dealing with IPC:

 How the information can be shared or passed from one process to another.
 To ensure that processes do not hinder other processes when doing critical activities
like contesting for sharable resources.
 Proper sequencing should be ensured in case where dependencies are present: if
process A produces data and process B prints them, then process B has to wait until
process A has produced some data before starting to print.

There are two fundamental models for inter-process communication:
a) Shared memory
b) Message passing

SHARED MEMORY
a) In the shared-memory model, a region of memory that is shared by the cooperating processes is established. Processes then exchange information by reading and writing data from/to the shared region.
b) Shared memory allows maximum speed and convenience of communication, as the shared-memory accesses involved are the same as any normal memory access. Hence it is faster than message passing.
c) System calls are required only to establish the shared-memory regions; no further intricacies are involved while doing shared-memory accesses.
d) No special assistance from the kernel is required once the shared-memory region has been established.

MESSAGE PASSING
a) In the message-passing model, communication takes place by means of messages exchanged between the cooperating processes. The memory address space is not shared.
b) Message passing is useful for exchanging smaller amounts of data or for inter-computer communication via a network, i.e. when the communicating processes reside on different computers connected via a network.
c) Message passing is implemented using system calls, and information sharing using message passing is much more complex compared to shared memory.
d) Message passing involves time-consuming kernel intervention on a regular basis.
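As a concrete illustration of the message-passing model on a Unix-like system, the sketch below passes a short message from a parent process to its child through a pipe; the kernel copies the data between the two address spaces. It is a minimal, illustrative example with error handling omitted.

#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fd[2];
    char buf[64];

    pipe(fd);                 /* fd[0]: read end, fd[1]: write end */

    if (fork() == 0) {        /* child: receives the message */
        close(fd[1]);
        int n = read(fd[0], buf, sizeof(buf) - 1);
        buf[n] = '\0';
        printf("child received: %s\n", buf);
    } else {                  /* parent: sends the message */
        close(fd[0]);
        const char msg[] = "hello via message passing";
        write(fd[1], msg, strlen(msg));
        close(fd[1]);
    }
    return 0;
}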

RACE CONDITIONS
The processes that are working together may share some common storage that each one can
read and write for communicating with each other. In general, the shared storage may be:
 in main memory (possibly in a kernel data structure)
 or it may be a shared file

The location of the shared storage does not change the nature of the communication or the
problems that arise.

Race conditions refer to the situations, where two or more processes are reading or writing
some shared data and the final result depends on or varies based on the order/sequence in
which the involved processes access the shared data.

For example: Consider, there are two processes A and B willing to print using the printer.
The spooler directory stores the information about the files to be printed. The spooler
directory is like a queue for the details of files to be printed. So, there will be information
about the next free slot maintained in the memory somewhere which acts as a shared location
for both these processes wishing to print, let it be called ‘next_free_slot’. At a particular
instant, suppose the slots 0-4 are already occupied by other files already in the queue and the
next vacant/free slot is 5 and at this instant the current running process is A. The process A
reads that the next vacant slot is 5 and the moment it reads this information i.e. before the
next_free_slot could be updated to the next value, the CPU scheduler switched the process
and now the current running process is B which again reads the next_free_slot as 5 (as read
by process A). Process B continues ahead and writes the details of the file to be printed at the
slot 5 in the spooler directory. After some time the Process A again gets hold of the CPU and
starts executing exactly after where it stopped i.e. writes the details of file to be printed at the
slot 5 in the spooler directory, thereby overwriting the file details of process B. Hence, the
result is that the file of process A will get printed while file of process B will not. But if the
CPU scheduler had not switched the process when A was executing initially, then the files of
both process A as well as process B would have got successfully printed. Another situation
could have been when the original sequence of execution of processes A and B were
completely reversed (i.e. Process B started first but got stopped and then Process A got the
CPU and it writes details of its file in the spooler directory and then Process B again gets hold
of the CPU and overwrites the details of files of Process A in the spooler directory), in that
case the file of Process B would have got printed while that of Process A would not have. So, the final result depends on the order in which the processes access the shared location; hence there occurs a RACE CONDITION.
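The same effect can be reproduced in a few lines with two threads sharing a counter analogous to next_free_slot. The POSIX-threads sketch below will usually print a final value smaller than 200000, because the read-modify-write of the shared variable is not atomic and the two threads race on it; the iteration count is arbitrary and chosen only for illustration.

#include <pthread.h>
#include <stdio.h>

int next_free_slot = 0;            /* shared, unprotected variable */

void *claim_slots(void *arg)
{
    for (int i = 0; i < 100000; i++) {
        int slot = next_free_slot; /* read the shared value */
        next_free_slot = slot + 1; /* write it back; the other thread may
                                      have updated it in between */
    }
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, claim_slots, NULL);
    pthread_create(&b, NULL, claim_slots, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("final value: %d (expected 200000)\n", next_free_slot);
    return 0;
}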

CRITICAL REGION
The part of the program where the shared memory/storage is accessed is called the critical
region or critical section.

The race condition problem can be solved if we ensure that at any instant of time only one process can access the shared data, which may be stored in memory (or other shared storage). In other words, to avoid races we require that no two processes communicating with each other ever be in their respective critical regions (accessing the shared memory/storage and hence the shared data) at the same time; this is called Mutual Exclusion.

Achieving mutual exclusion is definitely a necessary condition but is not a complete solution
for handling proper and efficient cooperation among several processes running (pseudo)
simultaneously and accessing data from shared locations. The following four conditions are
required to hold for a good solution:

 No two processes cooperating with each other and sharing common data may be
simultaneously inside their critical regions i.e. mutual exclusion.
 No assumptions may be made about speeds or the number of CPUs.
 No process running outside its critical region may block other processes.
 No process should have to wait forever to enter its critical region.

References:

1. Silberschatz, P.B. Galvin and G. Gagne, Operating System Concepts (7th ed.), John
Wiley & Sons, Inc.

2. A.S. Tanenbaum, Modern Operating Systems (2nd ed.), Prentice-Hall of India.


SOLUTIONS TO MUTUAL EXCLUSION PROBLEM

A) Disabling Interrupts
The simplest solution that appears to solve mutual exclusion problem is to allow each
process to disable all interrupts just after entering its critical region and re-enable them
just before leaving it.

With all interrupts disabled, no process switching will be possible as the CPU will not be
able to be switched to another process without interrupts. Thereby, allowing the process
to have a mutually exclusive access to the shared memory without fear of interference by
any other process.

But this is not considered a good approach, for the following reasons:
 It is not wise to allow user processes the power to turn off interrupts, as the user
processes cannot be trusted to be given this responsibility of dealing with
interrupts (They may accidentally or intentionally not re-enable the interrupts).
 Also, on a multiprocessor system, with two or more CPUs, disabling of interrupts
will be required separately for each CPU which is again not a feasible situation.

B) Lock Variables
A shared lock variable may at first be looked upon as a good software solution. The value 0 of the lock variable indicates that no process is inside the critical region, and the value 1 indicates that some process is inside the critical region. So, before entering the critical region, a process checks the value of the lock variable; if it is 0, the process sets it to 1 and enters the critical region, otherwise it waits till the lock variable becomes 0.

But the basic problem with this solution is that the lock variable is itself in shared memory. Hence, if a process A reads its value as 0 but, before setting it to 1, the CPU switches to process B, then B also sees the value of the lock as 0, sets the lock variable to 1 and enters the critical region. Now, while B is still in its critical section, process A again gets the CPU and enters the critical region, as it had earlier read the lock value as 0. So now both processes A and B are in the critical section, which is a violation of mutual exclusion. The sketch below makes this window explicit.
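A sketch of this (broken) lock-variable approach is shown below; the comment marks the window in which a context switch defeats mutual exclusion. The function names are illustrative only.

int lock = 0;                  /* shared: 0 = free, 1 = some process inside */

void enter_region(void)
{
    while (lock != 0)          /* wait until the lock appears to be free */
        ;                      /* busy wait */
    /* a context switch here lets another process also see lock == 0 */
    lock = 1;                  /* claim the lock (too late to be safe) */
}

void leave_region(void)
{
    lock = 0;                  /* release the lock */
}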

C) Strict Alternation
Strict Alternation is an approach in which the same process cannot enter the critical region a second time before the other process has entered the critical region, i.e. if there are two processes A and B requiring entry to the critical region, then it is mandatory that they enter the critical region only alternately: Process A, then Process B, then Process A and so on.

Process 0:

while (TRUE) {
    while (turn != 0) ;        /* busy wait until it is process 0's turn */
    critical_region();
    turn = 1;
    non_critical_region();
}

Process 1:

while (TRUE) {
    while (turn != 1) ;        /* busy wait until it is process 1's turn */
    critical_region();
    turn = 0;
    non_critical_region();
}

The problem with this approach is that if one process is much slower than the other in its non-critical region, the faster process will have to suffer and wait unnecessarily for the slower process to complete its turn of entering the critical region and finishing its non-critical region as well. Therefore, a situation may arise where a process in its non-critical region is able to restrict the entry of another process into its critical region, thereby violating the third condition, i.e. no process running outside its critical region should block another process from entering its critical region.

D) Peterson’s Solution
This solution uses the concepts of lock variables and warning variables to ensure mutual
exclusion when a process enters the critical region.

#define FALSE 0
#define TRUE  1
#define N     2                     /* number of processes */

int turn;                           /* whose turn is it? */
int interested[N];                  /* all values initially 0 (FALSE) */

void enter_region(int process)      /* process is 0 or 1 */
{
    int other;                      /* number of the other process */

    other = 1 - process;            /* the opposite of process */
    interested[process] = TRUE;     /* show that you are interested */
    turn = process;                 /* set flag */
    while (turn == process && interested[other] == TRUE)
        ;                           /* busy wait (null statement) */
}

void leave_region(int process)      /* process, who is leaving */
{
    interested[process] = FALSE;    /* indicate departure from critical region */
}

Before entering the critical region, a process needs to call enter_region with its own process number, 0 or 1, as parameter. This call will cause it to wait, if need be, until it is safe to enter. On exiting the critical region, the process calls leave_region to indicate that it no longer needs to access the shared memory (i.e. it is exiting its critical region).

When a process calls enter_region, besides setting the turn to itself and setting its interested value to true, it also checks whether, while it is its turn, the other process is also interested in entering the critical region. If so, it sits in a tight loop, busy waiting for the time when it is safe to enter the critical region, thereby ensuring mutual exclusion. At the time of exiting the critical section, by calling leave_region, the process resets its interested value back to false, thereby allowing the other process to enter the critical section if it so desires. A hypothetical usage sketch is given below.
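A hypothetical caller could wrap its critical region as follows; the helpers update_shared_data() and do_other_work() are made-up placeholders for the critical and non-critical work.

void process_0(void)            /* the code for process 1 is symmetric; */
{                               /* it simply passes 1 instead of 0      */
    while (TRUE) {
        enter_region(0);        /* wait here until it is safe to enter  */
        update_shared_data();   /* critical region: touch shared memory */
        leave_region(0);        /* let the other process in, if waiting */
        do_other_work();        /* non-critical region                  */
    }
}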

E) Sleep and Wakeup


Though Peterson's solution is correct, it involves busy waiting: as per this solution, whenever a process requires entry to the critical region, it checks whether it is safe to do so; otherwise it continues to wait in a loop until it is safe. Significant CPU time is wasted in busy waiting, and unexpected situations like Priority Inversion may also occur.

So, instead of busy waiting, another approach could be to block the process. The simplest primitives to achieve this are the sleep and wakeup pair. Sleep is a system call that makes the calling process block itself until another process wakes it up by the system call wakeup. The wakeup system call takes one input parameter to identify the process to be woken up.

PRIORITY INVERSION
Consider the situation with a process A running inside (or wanting to enter) the critical region at some instant of time. Let B be a higher priority process that comes into existence. Since B has a higher priority, the CPU schedules B, which now requires entry to the critical region but has to busy wait because process A was already inside (or was interested in entering) the critical region. As A has lower priority than B, the CPU will not schedule A back while B is running. Hence, B loops (busy waits) forever while A never gets a chance to execute and/or leave the critical region. This is called the Priority Inversion Problem: a higher priority process waits for a lower priority process to complete execution of the latter's critical region, whereas the latter never gets rescheduled, and this may therefore lead to a deadlock-like situation.

References:

1. A.S. Tanenbaum, Modern Operating Systems (2nd ed.), Prentice-Hall of India.

2. Silberschatz, P.B. Galvin and G. Gagne, Operating System Concepts (7th ed.), John
Wiley & Sons, Inc.
PRODUCER CONSUMER PROBLEM
Let there be two processes sharing a common buffer to hold the shared information. One of them produces the information while the other consumes it. The problem to be noted here is that a producer cannot produce if the buffer is full and a consumer cannot consume if the buffer does not contain any item. The solution to this is that the producer should go to sleep in case there is no space in the buffer, until the consumer consumes at least one item and wakes up the producer. Similarly, the consumer must sleep on finding no item in the buffer, until the producer produces at least one item and wakes up the consumer.

Let the maximum no. of items in the buffer be N. The variable count represents the no. of items currently present in the buffer. So, a possible solution using sleep and wakeup could be of the form:

#define TRUE 1
#define N 100                   /* number of slots in the buffer */
int count = 0;                  /* number of items in the buffer */

void producer(void)
{
    int item;
    while (TRUE)                /* repeat forever */
    {
        item = produce_item();  /* generate next item */

        if (count == N)
            sleep();            /* if buffer is full, go to sleep */

        insert_item(item);      /* put item in buffer */
        count = count + 1;      /* increment count of items in buffer */

        if (count == 1)
            wakeup(consumer);   /* was buffer empty? */
    }
}

void consumer(void)
{
    int item;
    while (TRUE)                /* repeat forever */
    {
        if (count == 0)
            sleep();            /* if buffer is empty, go to sleep */

        item = remove_item();   /* take item out of buffer */
        count = count - 1;      /* decrement count of items in buffer */

        if (count == N - 1)
            wakeup(producer);   /* was buffer full? */

        consume_item(item);     /* print item */
    }
}

Notice that the variable count is a shared piece of information and its access is unconstrained. Consider the situation where the buffer is empty and the consumer reads the count value as 0. Before it goes to sleep, the CPU scheduler schedules the producer process, which, on seeing the count value as 0, sends a wakeup signal to the consumer (after producing an item and inserting it in the buffer). But by this time the consumer had not yet fallen asleep, so this wakeup signal gets lost. After some time, the consumer process is scheduled, and it now goes to sleep forever, waiting for a wakeup from the producer which has already been sent but got lost.

A solution to this problem could be to keep a wakeup bit to catch such wakeup signals received by processes which have not yet gone to sleep, so as to store the wakeup signal for a future sleep. But keeping just one bit will not solve the problem when there are more processes involved in the situation, and not just two.

SEMAPHORES

To solve the problem of working in the critical section Dijkstra proposed to use an integer
variable called semaphore. It could be initialized and its value could be modified only by a
set of two atomic operations namely up (V or signal) and down (P or wait). To ensure that
these operations are indeed atomic normally the interrupts are disabled while executing these
operations. A possible implementation of these operations is as given below:

typedef struct {
int value;
process *queue;
} semaphore;

down(semaphore* s)
{
s->value--;
if (s->value < 0)
{
place this process in s->queue;
block this process
}
}

up(semaphore* s)
{
s->value++;
if (s->value <= 0)
{
remove a process P from s->queue;
wakeup P;
}
}

The operating system disables interrupts while it is checking or manipulating the semaphore value and blocking the process if so required. In short, interrupts are disabled while the up or down operations are being performed on the semaphore, and they are re-enabled on completion of these operations. This is doable on a single-CPU system, but in the case of a multiprocessor system either interrupts must be disabled for all processors or each semaphore must additionally be protected by other locking techniques such as TSL instructions; a sketch of this idea is given below.
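As an illustration of such a locking technique, the sketch below uses the C11 atomic_flag type as a software stand-in for a hardware TSL (test-and-set-lock) instruction; it shows the idea only and is not how any particular kernel actually protects its semaphores.

#include <stdatomic.h>

atomic_flag sem_lock = ATOMIC_FLAG_INIT;   /* guards the semaphore structure */

void acquire(void)
{
    /* atomically set the flag and obtain its old value; loop while it was already set */
    while (atomic_flag_test_and_set(&sem_lock))
        ;                                  /* spin until the lock becomes free */
}

void release(void)
{
    atomic_flag_clear(&sem_lock);          /* make the lock free again */
}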
A semaphore is thus associated with an integer value, and there may also be a queue of processes waiting on the semaphore. There are two types of semaphores:

 Counting Semaphore: These are semaphores whose value ranges over an unrestricted integer domain. They are normally used to control access to a resource consisting of a finite number of instances, and to ensure proper synchronization among the processes using such a shared resource. Before accessing an instance of the resource, a process performs a down on the semaphore, thereby decrementing its value by 1. Processes can keep acquiring instances of the resource as long as the value of the semaphore does not fall below 0; beyond this, a process will have to block itself (sleep) as part of the down operation. On releasing a resource instance, the process performs an up on the semaphore to indicate the availability of the just-released instance. If the value of the semaphore (after performing the up operation) is still less than or equal to 0, then one or more processes was/were waiting on the semaphore, and therefore one of the waiting processes is woken up as part of the up operation.

 Binary Semaphore: These are semaphores whose value can only be 0 or 1. They are also known as mutex locks, or simply mutexes, as they are normally used to ensure mutual exclusion. When multiple processes wish to access the critical section in a mutually exclusive way, they can use a mutex by initializing it to 1, performing a down when a process wishes to enter the critical section and an up after exiting from the critical section.

SOLUTION TO PRODUCER CONSUMER PROBLEM USING SEMAPHORES

#define N 100                     /* number of slots in the buffer */
typedef int semaphore;            /* semaphores are a special kind of int */
semaphore mutex = 1;              /* controls access to critical region */
semaphore empty = N;              /* counts empty buffer slots */
semaphore full = 0;               /* counts full buffer slots */

void producer(void)
{
    int item;

    while (TRUE)                  /* TRUE is the constant 1 */
    {
        item = produce_item();    /* generate something to put in buffer */
        down(&empty);             /* decrement empty count */
        down(&mutex);             /* enter critical region */
        insert_item(item);        /* put new item in buffer */
        up(&mutex);               /* leave critical region */
        up(&full);                /* increment count of full slots */
    }
}

void consumer(void)
{
    int item;

    while (TRUE)                  /* infinite loop */
    {
        down(&full);              /* decrement full count */
        down(&mutex);             /* enter critical region */
        item = remove_item();     /* take item from buffer */
        up(&mutex);               /* leave critical region */
        up(&empty);               /* increment count of empty slots */
        consume_item(item);       /* do something with the item */
    }
}

Three semaphores have been used in the solution:


 full: for counting the number of slots that are full, which is initialized to 0.
 empty: for counting the number of slots that are empty, which is initialized to N i.e.
the buffer size
 mutex: to ensure mutual exclusion i.e. the producer and consumer processes do not
access the buffer at the same time. mutex is initialized to 1.

The semaphores full and empty are used for achieving synchronization, i.e. the consumer stops when the buffer does not contain any element to consume, while the producer stops when the buffer is full. The semaphore mutex is used to ensure mutual exclusion, i.e. at a time only one process is writing or reading the buffer and the associated variables. full and empty are counting semaphores and mutex is a binary semaphore.

SYNCHRONIZATION PROBLEM

Processes work together to solve problems and they need to collaborate and coordinate with
each other in order to accomplish a task, otherwise things may go wrong. Synchronization
refers to the coordination of simultaneously running threads or processes to complete a task;
in order to obtain correct runtime order and avoid unexpected race conditions.

Concurrent access to shared data may result in data inconsistency. Maintaining data
consistency requires mechanisms to ensure the “orderly” execution of cooperating processes.
This is referred to as the Synchronization Problem.

Typical synchronization problems include the Dining Philosophers Problem, the Readers-Writers Problem and the Sleeping Barber Problem.

References:

1. Silberschatz, P.B. Galvin and G. Gagne, Operating System Concepts (7th ed.), John
Wiley & Sons, Inc.

2. A.S. Tanenbaum, Modern Operating Systems (2nd ed.), Prentice-Hall of India.


Figure: Solution to Dining Philosophers Problem using Semaphores
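The original figure is not reproduced here; below is a standard semaphore-based sketch of the dining philosophers solution, written in the style of the earlier listings (using the down/up operations and the semaphore type introduced above). think() and eat() are assumed helper routines, and this is an illustrative sketch rather than the exact contents of the missing figure.

#define N        5                 /* number of philosophers */
#define LEFT     ((i + N - 1) % N) /* philosopher to the left of i */
#define RIGHT    ((i + 1) % N)     /* philosopher to the right of i */
#define THINKING 0
#define HUNGRY   1
#define EATING   2

int state[N];                      /* state of each philosopher */
semaphore mutex = 1;               /* protects the state array */
semaphore s[N];                    /* one semaphore per philosopher, all initially 0 */

void test(int i)                   /* may let philosopher i start eating */
{
    if (state[i] == HUNGRY && state[LEFT] != EATING && state[RIGHT] != EATING) {
        state[i] = EATING;
        up(&s[i]);                 /* wake philosopher i if it was blocked */
    }
}

void take_forks(int i)
{
    down(&mutex);                  /* enter critical region */
    state[i] = HUNGRY;
    test(i);                       /* try to acquire both forks */
    up(&mutex);                    /* leave critical region */
    down(&s[i]);                   /* block if the forks were not free */
}

void put_forks(int i)
{
    down(&mutex);                  /* enter critical region */
    state[i] = THINKING;
    test(LEFT);                    /* see if the left neighbour can now eat */
    test(RIGHT);                   /* see if the right neighbour can now eat */
    up(&mutex);                    /* leave critical region */
}

void philosopher(int i)            /* i: philosopher number, 0 to N-1 */
{
    while (TRUE) {
        think();
        take_forks(i);
        eat();
        put_forks(i);
    }
}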
Journal of Computing and Information Technology - CIT 13, 2005, 1, 43-51

Process Synchronization with Readers and Writers Revisited

Jalal Kawash
Department of Computer Science, American University of Sharjah, Sharjah, UAE

The readers-writers problem is one of the very well known problems in concurrency theory. It was first introduced by Courtois et al. in 1971 [1] and requires the synchronization of processes trying to read and write a shared resource. Several readers are allowed to access the resource simultaneously, but a writer must be given exclusive access to that resource. Courtois et al. gave semaphore-based solutions to what they called the first and second readers-writers problems. Both of their solutions are prone to starvation. The first allows readers to indefinitely lock out writers and the second allows writers to indefinitely lock out readers. This paper presents and proves correct a third semaphore-based solution, which is starvation-free for both reader and writer processes.

Keywords: concurrency control, shared objects, mutual exclusion, formal verification, computing education.

1. Introduction

The readers-writers problem [1] requires the synchronization of concurrent processes simultaneously accessing a shared resource, such as a database object. This problem is different from the known mutual exclusion problem [9] in that it distinguishes between two categories of processes: those who only read the resource, called readers, and those who write it, called writers. Since reader processes only read the resource, it is more efficient to grant all such reader processes simultaneous access to the resource. However, a writer process is granted exclusive access to the resource. Thus, it is not acceptable to protect the resource using the traditional critical section [11] technique of mutual exclusion, allowing at most one process to access the resource at a time. The readers-writers requirements allow more concurrency and more efficient use of the resource.

Courtois et al. [1] developed two solutions to two versions of the readers-writers problem, which are known as the first and the second readers-writers problems. Both of these solutions use Dijkstra's semaphore [2]. A (binary) semaphore S is an object that has an associated integer value (val) and a FIFO queue (queue) with the support of two atomic operations wait(S) and signal(S) defined as follows. Initially, S.val is 1 and S.queue is empty.

wait(S) {
    if S.val = 0 then
        wait on S.queue (block the process)
    else S.val ← 0
}

signal(S) {
    if S.queue is not empty then
        remove one process from S.queue and unblock it
    else S.val ← 1
}

These operations are atomic, which requires them to appear as if they are executed in a critical section. When a process is executing wait(S) or signal(S), no other process can execute either of these two operations on the same semaphore S.

Most recent work on the readers-writers problem addresses building analytical models and studying performance implications (see [14, 10, 7] and references therein). That work, however, does not propose solutions to the problem. The group mutual exclusion problem proposed by Joung [4] is a generalization of the readers-writers problem. A solution to group
exclusion implies a solution to the readers-writers problem. Joung's solution uses only read/write primitives of shared memory. It produces high processor-to-memory traffic, making it less scalable. Keane and Moir [5] provide a more efficient solution to group mutual exclusion than Joung's. Their solution depends on the pre-existence of a fair "classical" mutual exclusion algorithm to implement their acquire and release operations. The algorithm also makes use of explicit local spinning or busy waiting to force processes to wait. Finally, the solution depends on using an explicit queue for waiting processes.

The solution presented in this paper is simpler, mainly because it solves a special case (readers-writers) of the more general problem (group mutual exclusion). We do not make use of explicit spinning. Given that semaphore operations can be efficiently built into an operating system using blocking instead of spinning, spinning can be altogether avoided in our solution. In this paper, we do not address the complexity of our algorithm, but it is obvious that it largely depends on the implementation of the semaphore and the underlying memory architecture (such as cache coherent or non-uniform memory access). The most widely used operating system books (for example, see [11, 12, 13]) still refer to the original unfair solutions of Courtois et al. [1], without explicitly detailing a fair alternative. Our algorithm can be of high educational value when it is used to complement the original solutions.

In Section 2, the original solutions to the first and second problems are restated. Section 3 introduces our third readers-writers problem and solution. In Section 3, we show that our algorithm is correct by automatically verifying the required properties using the SPIN formal verifier [3]. Finally, Section 4 concludes the paper.

Writer process:
repeat
    wait(resource)
    // write the resource
    signal(resource)
until done

Reader process:
repeat
    wait(mutex)
    readers ← readers + 1
    if readers = 1 then
        wait(resource)
    end-if
    signal(mutex)
    // read the resource
    wait(mutex)
    readers ← readers - 1
    if (readers = 0) then
        signal(resource)
    end-if
    signal(mutex)
until done

Fig. 1. Solution to the first readers-writers problem.

2. Previous Solutions

Given a group of processes portioned into readers and writers, a solution to the readers-writers problem must satisfy the following two properties:

Safety: if there are more than two processes using the resource at the same time, then all of these processes must be readers.

Progress: if there is more than one process trying to access the resource, then at least one process succeeds.

The first, second, and our third problem require different fairness properties. Courtois et al. [1] state:
"For the first problem it is possible that a writer could wait indefinitely while a stream of readers arrived."

Hence, the first problem requires:

Fairness-1: if some reader process is trying to access the resource, then this process eventually succeeds.

This property obviously favors readers and in the first problem there is no guarantee that a writer process does not starve. Similarly, the second problem favors writers. Courtois et al. [1] require:

"In the second problem we give priority to writers and allow readers to wait indefinitely while a stream of writers is working."

Hence, the fairness requirement of the second problem is as follows:

Fairness-2: if some writer process is trying to access the resource, then this process eventually succeeds.

The original solutions [1] to the first and second readers-writers problems are given in Figure 1 and Figure 2, respectively.

In Figure 1, if the first reader progresses to read the resource, it will block any potential writers until it is done. However, if a stream of readers keeps on arriving, they may all skip the if statement in the entry section. Therefore, it is possible that each such reader never waits for resource and writers can be locked out indefinitely. A similar argument applies to the solution in Figure 2, but here writers can lock out readers.

3. The Third Problem

For highly demanded resources both the first and second solutions could be undesirable in practice. In this section, we present a solution that gives the readers and writers equal priorities.

The fairness requirement for our third problem is stronger than that of both Fairness-1 and Fairness-2 since it does not restrict the eventual progress of any process by its type (reader or writer).
Writer process:
repeat
    wait(mutex2)
    writers <- writers + 1
    if writers = 1 then
        wait(read)
    end-if
    signal(mutex2)
    wait(write)
    // write the resource
    signal(write)
    wait(mutex2)
    writers <- writers - 1
    if (writers = 0) then
        signal(read)
    end-if
    signal(mutex2)
until done

Reader process:
repeat
    wait(mutex3)
    wait(read)
    wait(mutex1)
    readers <- readers + 1
    if readers = 1 then
        wait(write)
    end-if
    signal(mutex1)
    signal(read)
    signal(mutex3)
    // read the resource
    wait(mutex1)
    readers <- readers - 1
    if (readers = 0) then
        signal(write)
    end-if
    signal(mutex1)
until done

Fig. 2. Solution to the second readers-writers problem.
Fairness-3: if some process is trying to access the resource, then this process eventually succeeds.

That is, Fairness-3 is defined as Fairness-1 and Fairness-2.

Our solution is given in Figure 3. The solution uses two integer variables readers and writers to respectively count the number of reader and writer processes trying to gain access to the resource. Both of these variables are initialized to 0. Before a writer process gains access to the shared resource, the process increments the variable writers (line W2). After releasing the resource, the writer process decrements the variable writers (line W6). The same applies to reader processes and the variable readers. The algorithm makes use of two semaphores: mutex, which is used to guarantee mutually exclusive access to the variables readers and writers, and resource, which is used to synchronize access to the shared resource.

Writers simply check the availability of the resource at line W4. If the resource is busy, the wait(resource) operation forces the writer to wait in the associated queue. A reader executes wait(resource) at line R4 only if it is the first reader (readers = 0) trying to gain access to the shared resource, or if a writer process is trying to access the resource (writers > 0 at line R2).

If the first reader is trying and the resource is available, the wait(resource) on line R4 allows it to proceed, locking out any following writers. All subsequent readers will skip the if statement (lines R3 to R5) as long as there are no writers trying (writers = 0). Hence, the solution allows several readers to access the resource simultaneously. However, if a writer tries to use the resource, it will be forced to wait at line W4. The algorithm forces subsequent readers to execute the body of the if statement, forcing them to wait too (line R4). Eventually, all readers reading the shared resource will execute lines R8 to R12 and only the last such reader will execute line R11, allowing a waiting writer to proceed.

4. Proof of Correctness

In this section, we describe how we used the SPIN model checker [3] to verify the two properties of our algorithm: Safety and Fairness-3. Progress is implied by Fairness-3.
Writer process:
repeat
W1  wait(mutex)
W2  writers <- writers + 1
W3  signal(mutex)
W4  wait(resource)
    // write the resource
W5  wait(mutex)
W6  writers <- writers - 1
W7  signal(mutex)
W8  signal(resource)
until done

Reader process:
repeat
R1  wait(mutex)
R2  if writers > 0 or readers = 0 then
R3      signal(mutex)
R4      wait(resource)
R5      wait(mutex)
    end-if
R6  readers <- readers + 1
R7  signal(mutex)
    // read the resource
R8  wait(mutex)
R9  readers <- readers - 1
R10 if (readers = 0) then
R11     signal(resource)
    end-if
R12 signal(mutex)
until done

Fig. 3. Solution to the third readers-writers problem.
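The pseudocode of Figure 3 translates almost line for line into a runnable program. The following is a minimal Python sketch (our own illustration, not part of the original paper) using threading.Semaphore; whether Fairness-3 is preserved in practice depends on how fairly the underlying semaphore wakes its waiters. Thread names, sleep times and the small demo are illustrative.

import threading, time, random

mutex = threading.Semaphore(1)      # protects the counters readers and writers
resource = threading.Semaphore(1)   # guards the shared resource
readers = 0                         # readers trying/using the resource
writers = 0                         # writers trying/using the resource

def writer(name):
    global writers
    with mutex:                      # W1-W3: register the writer
        writers += 1
    resource.acquire()               # W4: wait for exclusive access
    print(name, "writing")
    time.sleep(random.random() / 10)
    with mutex:                      # W5-W7: deregister the writer
        writers -= 1
    resource.release()               # W8: let the next waiter in

def reader(name):
    global readers
    mutex.acquire()                  # R1
    if writers > 0 or readers == 0:  # R2: a writer is trying, or we are the first reader
        mutex.release()              # R3
        resource.acquire()           # R4: queue up behind any writers
        mutex.acquire()              # R5
    readers += 1                     # R6
    mutex.release()                  # R7
    print(name, "reading")
    time.sleep(random.random() / 10)
    with mutex:                      # R8-R12
        readers -= 1
        if readers == 0:
            resource.release()       # R11: last reader hands the resource over

if __name__ == "__main__":
    threads = [threading.Thread(target=reader, args=(f"reader-{i}",)) for i in range(3)]
    threads += [threading.Thread(target=writer, args=(f"writer-{i}",)) for i in range(2)]
    for t in threads: t.start()
    for t in threads: t.join()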
4.1. Assumptions

For the correctness of our algorithm, we assume the following:

- The execution is sequentially consistent. Lamport [6] requires for sequential consistency: "the result of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program."

- The execution either eventually terminates (the executing processes terminate and no new processes are admitted to the system) or, if it is infinite and there is at least one participating writer process, the execution continues indefinitely to have participating writer processes. That is, the Progress property requires that in an infinite execution with some participating writers, the execution does not come to a point where, from that point on, all the processes are indefinitely readers.

4.2. Formal Verification

Implementations of the wait and signal operations in Promela, SPIN's programming language, are given in Figure 5. Since Promela lacks constructs for blocking an active process, we must use busy waiting to delay the process. We choose to implement the wait and signal operations using Peterson's n-process mutual exclusion algorithm [8], reproduced in Figure 4. That is, the wait operation is the code to enter a critical section and the signal is the exit code. The fairness of Peterson's algorithm (a maximum fairness delay of (n^2 - n)/2) implies a fair semaphore implementation.

The Promela implementation of wait(s,i) in Figure 5 is an implementation of the enter(i) operation of Figure 4. The readers-writers solution of Figure 3 makes use of two semaphores, mutex and resource. The Promela integer constant s (0 or 1) identifies which semaphore wait(s,i) is being invoked on. Hence, the variables flag[i], k, j, and turn[k] of Figure 4 for process i and semaphore s are represented using the Promela variables flag[s].val[i], k[s].val[i], j[s].val[i], and turn[s].val[k[s].val[i]], respectively. The Promela do-od loop construct is used to represent for and while loops. The outermost do-od loop in Figure 5 corresponds to the for loop in Figure 4. The Promela statements

:: (k[s].val[i] == n-1) -> break
:: else ->
    flag[s].val[i] = k[s].val[i]
    turn[s].val[k[s].val[i]] = i
    do ...

read: if (k == n-1), then break the for loop; otherwise assign k to flag[i], assign i to turn[k], and so forth. The local variable busy and the innermost do-od loop represent a for-loop implementation of the condition "there exists j != i with flag[j] >= k". So the statements

do
:: (j[s].val[i] == n) -> break
:: else ->
    if
    :: (i != j[s].val[i]) ->
        if
        :: (flag[s].val[j[s].val[i]] >= k[s].val[i]) -> busy = true; break

read: when j reaches the value n, break the loop; otherwise, if there is a j != i where flag[j] >= k, then set busy to true and break the for loop.

Shared variables:
    flag[1 .. n]: values in {0 .. n-1}
    turn[1 .. n-1]: values in {1 .. n}

enter(i):
    for k <- 1 to n-1 do
        flag[i] <- k
        turn[k] <- i
        while (turn[k] = i) and (there exists j != i with flag[j] >= k) do skip

exit(i):
    flag[i] <- 0

Fig. 4. Peterson's n-process mutual exclusion algorithm.
The variable busy is checked to break the outer do-od loop corresponding to the while loop in Figure 4. Now, the rest of the Promela code should be readable for readers with even a little background in programming.

The code for the readers and writers in Promela is given in Figure 6 and it mimics the pseudocode given in Figure 3. The Safety property is verified using the assert statement in the protected sections for each reader and writer process. The extra variables inr and inw are introduced to verify the safety property. They respectively represent the number of readers and writers engaged in the critical section.

In the reader process, SPIN asserts that the number of writers writing the resource, while a reader is reading, is zero, indicated by assert(inw == 0) in Promela. In the writer process, SPIN asserts that the number of writer processes is one and the number of reader processes is zero when a writer process is writing the resource, indicated by assert(inw == 1 && inr == 0). SPIN's results (Figure 7) indicate that these properties are never violated, establishing the Safety property.

Fairness-3 is established using Promela's progress labels. SPIN checks for any scenario that violates the property that the progress-labeled instruction is always eventually reachable. There are two progress labels, one in the critical section of the writer and one in that of the reader process. SPIN's output indicates that both sections are always eventually reachable, establishing the Progress and Fairness-3 properties. Figure 7 shows a screen shot of SPIN's verification results.

inline wait(s,i) {
    k[s].val[i] = 0
    bool busy
    do
    :: (k[s].val[i] == n-1) -> break
    :: else ->
        flag[s].val[i] = k[s].val[i]
        turn[s].val[k[s].val[i]] = i
        do
        :: (turn[s].val[k[s].val[i]] != i) -> break
        :: else ->
            busy = false
            j[s].val[i] = 0
            do
            :: (j[s].val[i] == n) -> break
            :: else ->
                if
                :: (i != j[s].val[i]) ->
                    if
                    :: (flag[s].val[j[s].val[i]] >= k[s].val[i]) -> busy = true; break
                    :: else -> skip
                    fi
                :: else -> skip
                fi
                j[s].val[i]++
            od
            if
            :: (!busy) -> break
            :: else -> skip
            fi
        od
        k[s].val[i]++
    od
}

inline signal(s,i) {
    flag[s].val[i] = 0
}

Fig. 5. Wait and signal implementation in Promela.
proctype writer(int i) {
    do
    :: skip ->
        wait(mutex,i)
        writers++
        signal(mutex,i)
        wait(resource,i)
progress: inw++;
        assert(inr == 0 && inw == 1);
        inw--;
        wait(mutex,i)
        writers--
        signal(mutex,i)
        signal(resource,i)
    od
}

proctype reader(int i) {
    do
    :: skip ->
        wait(mutex,i)
        if
        :: ((writers > 0) || (readers == 0)) ->
            signal(mutex,i)
            wait(resource,i)
            wait(mutex,i)
        :: else -> skip
        fi
        readers++
        signal(mutex,i)
progress: inr++;
        assert(inw == 0);
        inr--;
        wait(mutex,i)
        readers--
        if
        :: (readers == 0) -> signal(resource,i)
        :: else -> skip
        fi
        signal(mutex,i)
    od
}

Fig. 6. The reader and writer processes of the third problem in Promela.

Fig. 7. SPIN's verification output for the third problem.
5. Conclusion

This paper introduced a new semaphore-based solution to the readers-writers concurrency problem. Previous specialized solutions either (a) did not permit more than one reader to simultaneously access the resource, (b) permitted readers to indefinitely lock out writers, or (c) permitted writers to indefinitely lock out readers. None of these solutions is practically appealing and our solution answers all of their limitations. There are, however, recent solutions to a more general problem, the group mutual exclusion problem. Our solution is a simpler solution to a simpler problem (the readers-writers problem versus the group mutual exclusion problem). It also has an educational value if the widely quoted unfair solutions in famous operating systems text books are supplemented with it.

We followed an automatic verification approach to prove the correctness of our algorithm, using the state-of-the-art SPIN model checker with the Promela programming language. We believe that the use of SPIN to establish the correctness of our algorithm is of an independent interest and deserves the attention given in this paper. This also can serve teaching purposes, especially at the undergraduate level, where students studying operating systems typically do not have the necessary background to construct formal proofs of correctness for concurrent algorithms.

Because our solution is extremely fair, it is possible, under certain circumstances, that only one process at a time is allowed to access the resource. This can take place when both readers and writers are lining up to use the resource. Precisely, if streams of writers and readers exist, the readers and the writers will be forced to wait on semaphore resource. When a process (reader or writer) exits, signal(resource) must be executed. (In the case of a reader process, the stream of readers will be blocked in entry because writers > 0 and the last reader exiting the protected section will execute the signal(resource) operation). The next waiting process will be allowed to proceed, regardless of its type. If such a process is a reader, it will be the only reader process accessing the resource at that time, even if the next waiting process is also a reader.

It may be more efficient to allow more than one reader to proceed with simultaneous reading. However, it is not clear to us how this could be achieved without indefinitely locking writers out. We are currently investigating if it is possible to optimize the algorithm to behave more efficiently in such a situation. Furthermore, we would like to consider the complexity implications of our algorithm.

6. Acknowledgments

We are thankful to the anonymous reviewers for their comments, which helped us improve the paper.

References

[1] P. J. COURTOIS, F. HEYMANS, AND D. L. PARNAS, Concurrent Control with 'Readers' and 'Writers', Communications of the ACM 14(10):667-668, 1971.
[2] E. DIJKSTRA, Cooperating Sequential Processes, in F. Genuys, editor, Programming Languages, Academic Press, 1968.
[3] G. J. HOLZMANN, The Model Checker SPIN, IEEE Transactions on Software Engineering 23(5):1-5, 1997.
[4] Y. J. JOUNG, Asynchronous Group Mutual Exclusion, in Proc. 17th ACM Symp. Principles of Distributed Computing, pp. 51-60, 1998.
[5] P. KEANE AND M. MOIR, A Simple Local-Spin Group Mutual Exclusion Algorithm, IEEE Trans. Parallel and Distributed Systems 12(7):673-685, 2001.
[6] L. LAMPORT, How to Make a Multiprocessor Computer that Correctly Executes Multiprocess Programs, IEEE Transactions on Computers C-28(9):690-691, 1979.
[7] C. LANGRIS AND E. MOUTZOUKIS, A Batch Arrival Reader-Writer Queue with Retrial Writers, Commun. Statist. Stochast. Models, 13(3):523-545, 1997.
[8] G. PETERSON, Myths About the Mutual Exclusion Problem, Parallel Processing Letters 12(3):115-116, 1981.
[9] M. RAYNAL, Algorithms for Mutual Exclusion, The MIT Press, 1986.
[10] T. SANLI AND V. KULKARNI, Optimal Admission to Reader-Writer Systems with no Queuing, Operations Research Letters 25:213-218, 1999.
[11] A. SILBERSCHATZ AND P. GALVIN, Operating System Concepts, 5th ed., Wiley, 1999.
[12] W. STALLINGS, Operating Systems, 4th ed., Prentice Hall, 2001.
[13] A. S. TANENBAUM AND A. S. WOODHULL, Operating Systems Design and Implementation, 2nd ed., Prentice Hall, 1997.
[14] E. XU AND A. S. ALFA, A Vacation Model for the Non-saturated Readers and Writers with a Threshold Policy, Performance Evaluation 50:233-244, 2002.

Received: November, 2003
Accepted: September, 2004

Contact address:
Jalal Kawash
Department of Computer Science
American University of Sharjah
P.O. Box 2666
Sharjah, UAE
e-mail: [email protected]

JALAL KAWASH received his Ph.D. from The University of Calgary, Canada in 2000. He is currently an assistant professor of computer science at the American University of Sharjah, UAE and an adjunct assistant professor at The University of Calgary, Canada. His research interests are in distributed systems and algorithms, Internet computing, and computing education.
Figure: Solution to the Sleeping Barber Problem using Semaphores
Process Scheduling

In multiprogramming systems, to obtain better CPU utilization, multiple processes are made to reside in memory and the CPU keeps switching among the processes to achieve this objective. Further, in timesharing systems the objective is to achieve quick response for the user-interacting processes, and therefore the CPU keeps switching among the processes so that different users can interact with the system concurrently.

In general, whenever there are multiple processes simultaneously in the ready state
competing for CPU availability at the same time, a decision has to be made which process is
to be scheduled to run on the CPU. This is called CPU scheduling.

CPU Scheduling Components

CPU Scheduler: Whenever the CPU is idle, the part of the operating system that makes the
choice of selecting one process from the ready queue to be executed on the CPU is called the
CPU scheduler or short-term scheduler and the algorithm it uses is called the scheduling
algorithm.

Dispatcher: The dispatcher is the module that gives control of the CPU to the process
selected by the CPU scheduler. It performs the following functions:
1. Switching context
2. Switching to user mode
3. Jumping to the proper location in the user process to restart (or start if it is a new
process) that process.

Dispatcher is invoked during every process switch so it should be as fast as possible. The
time it takes for the dispatcher to stop one process and start another running is known as the
dispatch latency.

Types of Processes
Nearly all processes alternate bursts of computing with (disk) I/O requests, but based on the predominance of CPU time required by the process, processes are broadly categorized into two categories:

Compute bound Processes: They have long CPU bursts and thus infrequent I/O waits. A compute bound process may be represented as follows:

I/O bound Processes: They are called I/O bound processes because they do not compute
much between I/O requests and not because they have especially long I/O requests.

When to schedule
Following are the situations when a CPU scheduler needs to choose a process to be scheduled
to run on the CPU:

a) When a running process exits, a decision is required to be made which process to run
next.
b) When a process blocks (on I/O, or on semaphore, or for some other reason like
dependency on another process), in such situation another process is to be selected to
be run on the CPU.

c) A scheduling decision may also be made when a preemptive scheduling algorithm is being used and certain criteria (based on the scheduling algorithm) are satisfied, specifically:

i. When a new process is created, a decision is required to be made on which process to run: the already running process or the new one.
ii. When an interrupt occurs, say an I/O interrupt from an I/O device that has completed its work, making an earlier blocked process ready, the CPU scheduler may decide whether to run the newly ready process or the one which was running when the interrupt arrived.
iii. When the CPU time slot of the running process is exhausted (in the case of the round robin scheduling algorithm), another process is required to be scheduled.

Three Level Scheduling
A batch system allows scheduling at three different levels. They are:

Admission scheduler: As jobs arrive at the system, they are initially placed in an input queue stored on the disk. The admission scheduler decides which jobs to admit to the system, while the rest remain in the queue until they are selected. A typical algorithm could be to admit a mix of compute-bound and I/O-bound jobs; another could be to choose the shortest jobs first.

Memory Scheduler: Once a job has been admitted to the system, a process can be created for
it and it can contend for the CPU. However, if there does not exist enough memory to keep
all the processes then, some of the processes have to be swapped out to disk. The second
level of scheduling i.e. deciding which processes should be kept in memory and which ones
to be kept on disk is done by memory scheduler. This decision has to be reviewed frequently
to allow the processes on disk to get some service. However, since bringing a process in from
disk is expensive, the review should not be very frequent.

CPU scheduler: The third level of scheduling is done by the CPU scheduler, which decides which of the ready processes in main memory is to be run next. Several preemptive and non-preemptive algorithms like FCFS, SJF, Round Robin, Priority Scheduling etc. may be applied for this level of scheduling.

Figure: Three Level Scheduling

Scheduling Criteria
Different CPU scheduling algorithms have different characteristics which may favour one
class of processes over the other. Hence, the decision of which algorithm is to be chosen in a
particular situation should be made based on studying the properties of various algorithms.
Various criteria have been proposed of which few are as follows:

a) Throughput: The number of processes that are completed per time unit is called
throughput.
b) CPU Utilization: The proportion of time in which the CPU remains busy executing
some process over the total available time. It may be represented in percentage.
c) Turnaround time: The interval of time from the time of submission of a process to the
time of completion is known as the turnaround time.
d) Waiting Time: It refers to the total time that the process spends waiting for the CPU in
the ready queue, i.e. it is the sum of all the time periods that it spent in the ready
queue.
e) Response time: It refers to the time from the submission of a process until the first
response is produced. This measure is normally considered important for interactive
systems.

Also, in different environments different scheduling algorithms are needed. This is because
different application areas (and different kinds of operating systems) have different objectives
to be met. Thus, the scheduler’s priorities to be met are not the same for all systems.
Following gives a brief on the different criteria to be focused on when dealing with different
types of systems:

All systems
Fairness - giving each process a fair share of the CPU
Policy enforcement – ensuring that stated policy is carried out
Balance - keeping all parts of the system busy

Batch systems
Throughput - maximize jobs per hour
Turnaround time - minimize time between submission and termination
CPU utilization - keep the CPU busy all the time

Interactive systems
Response time - respond to requests quickly
Proportionality - meet users’ expectations

Real-time systems
Meeting deadlines – strict time deadlines should be met so as to avoid losing data and/or any
catastrophic impact esp. in case of hard real-time systems.

Predictability – system should behave in a predictable manner for the users. For example: in
case of real-time multimedia systems the system should behave predictably to avoid quality
degradation in multimedia systems.

Scheduling Algorithms
Scheduling algorithms can be divided into two categories:

a) Non-preemptive: Once the CPU has been allocated to a process, the process keeps running on the CPU until it releases the CPU either by terminating or by switching to the wait state; the CPU scheduler makes no intermediate decision to switch the running process out as long as the process remains able to run.
b) Preemptive: If the algorithm is such that the CPU scheduler can pre-empt an already running process based on some criteria (without the process terminating or blocking), then such a scheduling algorithm is called preemptive.

First-Come First-Served (FCFS)

FCFS is a non-preemptive algorithm where processes are assigned the CPU in the order they request it. Basically, there is a single queue of ready processes and processes are made to run on the CPU on a first come first served basis. The advantage is that this algorithm is very simple to understand and implement and is also fair to processes to a certain extent. The disadvantage is that compute bound processes, once made to run on the CPU, can significantly delay other processes, especially I/O bound processes which otherwise require very little CPU time.

For example,

Note: Assume Burst time in milliseconds

The average turnaround time for processes arriving in the order P1, P2, P3 is
(24 + 27 + 30)/3 = 81/3 = 27 milliseconds.

The average turnaround time for processes arriving in the order P2, P3, P1 is
(3 + 6 + 30)/3 = 13 milliseconds.

Shortest Job First (SJF)

SJF is also a non-preemptive algorithm where the processes are assigned the CPU in increasing order of the CPU burst required by them, i.e., the shortest process will be scheduled on the CPU first, then the process with the next higher CPU burst requirement, and so on. In principle the algorithm is expected to give optimal turnaround time, but the drawback is that it behaves optimally only if all jobs arrive at the same time.
For example,

The turnaround time for P1 is 9, for P2 is 24, for P3 is 16 and for P4 is 3 and thus
the average turnaround time is (9 + 24 + 16 + 3)/4 = 52/4 = 13 milliseconds.
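For simultaneous arrivals, non-preemptive SJF is simply FCFS applied to the bursts in increasing order. The sketch below assumes bursts of 6, 8, 7 and 3 ms for P1 to P4, which are consistent with the turnaround times quoted above; the function name is our own.

def sjf_turnaround(bursts):
    """Turnaround times for non-preemptive SJF, all processes arriving at time 0."""
    order = sorted(range(len(bursts)), key=lambda i: bursts[i])  # shortest burst first
    clock, turnaround = 0, [0] * len(bursts)
    for i in order:
        clock += bursts[i]
        turnaround[i] = clock
    return turnaround

print(sjf_turnaround([6, 8, 7, 3]))  # [9, 24, 16, 3] -> average 13 ms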

Shortest Remaining Time Next or Preemptive SJF

A preemptive version of shortest job first is shortest remaining time next. With this
algorithm, the scheduler always chooses the process whose remaining run time is the shortest.
Here also, the run time has to be known in advance. When a new job arrives, its total time is
compared to the current process’ remaining time. If the new job needs less time to finish than
the current process, the current process is suspended and the new job started. This scheme
allows new short jobs to get good service.
For example,

The turnaround time for P1 is (17 - 0) = 17, for P2 is (5 - 1) = 4, for P3 is (26 - 2) = 24 and for P4 is (10 - 3) = 7. Thus, the average turnaround time is (17 + 4 + 24 + 7)/4 = 52/4 = 13 milliseconds.
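A short simulation reproduces these numbers. The sketch below assumes arrival times of 0, 1, 2 and 3 ms and CPU bursts of 8, 4, 9 and 5 ms for P1 to P4, which are consistent with the turnaround times quoted above; the function name is our own.

def srtn_turnaround(arrival, burst):
    """Shortest-remaining-time-next simulation, advancing one time unit per step."""
    n, clock = len(arrival), 0
    remaining, finish = list(burst), [0] * n
    while any(remaining):
        # pick the arrived, unfinished process with the least remaining time
        ready = [i for i in range(n) if arrival[i] <= clock and remaining[i] > 0]
        if not ready:
            clock += 1
            continue
        i = min(ready, key=lambda p: remaining[p])
        remaining[i] -= 1            # run it for one time unit
        clock += 1
        if remaining[i] == 0:
            finish[i] = clock
    return [finish[i] - arrival[i] for i in range(n)]

print(srtn_turnaround([0, 1, 2, 3], [8, 4, 9, 5]))  # [17, 4, 24, 7] -> average 13 ms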

Round Robin (RR)


In this algorithm, each process is assigned a time interval (called a quantum) for which it is allowed to run on the CPU. If the process is still running at the end of the quantum, the CPU is pre-empted and given to another process. When a process finishes its time quantum, it is put at the end of the queue (or list) of runnable processes, so that the other processes in the list get a chance on the CPU before this process again gets hold of the CPU. If the process finishes or blocks in the middle of a quantum, CPU switching also takes place.

An important aspect to be taken care of with the RR algorithm is the size of the quantum. Switching from one process to another requires a context switch, which is a time-consuming task involving a change of the memory map, registers, program counter etc. Thus, having too small a quantum will lead to frequent context switches, thereby making CPU use less efficient. On the other hand, keeping it too large may lead to pre-emption hardly happening at all, as processes may finish before the quantum elapses, causing poor response to short interactive requests. Normally the time quantum ranges from 10-100 milliseconds, and a quantum of around 20-50 milliseconds is considered reasonable.
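The effect of the quantum size can be explored with a small simulation. The sketch below assumes all processes arrive at time 0 and ignores context-switch overhead; the burst times reuse the FCFS example and the function name is our own.

from collections import deque

def round_robin_turnaround(bursts, quantum):
    """Completion times under round robin with the given quantum (all arrivals at time 0)."""
    queue = deque(range(len(bursts)))
    remaining, clock, finish = list(bursts), 0, [0] * len(bursts)
    while queue:
        i = queue.popleft()
        run = min(quantum, remaining[i])   # run for one quantum or until completion
        clock += run
        remaining[i] -= run
        if remaining[i] > 0:
            queue.append(i)                # unfinished: back to the tail of the queue
        else:
            finish[i] = clock
    return finish

print(round_robin_turnaround([24, 3, 3], quantum=4))  # [30, 7, 10]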

For example,

Q: Calculate the average turnaround time.

Priority Scheduling

The basic idea in this approach to scheduling is that each process is assigned a priority, and the runnable process with the highest priority is allowed to run. Priority could be decided internally based on factors like memory requirements, number of open files, ratio of average I/O burst to average CPU burst, time limits etc. Priority may also depend on external factors like the importance of the process, the type and amount of funds being paid for computer use, etc.

Priority Scheduling can be preemptive as well as non-preemptive. In the case of preemptive priority scheduling, when a new process arrives at the ready queue, its priority is compared with the priority of the currently running process. If the priority of the newly arrived process is higher than that of the running process, then the running process is pre-empted and the newly arrived process is scheduled. In the case of non-preemptive priority scheduling, the running process will not be pre-empted and the newly arrived higher priority process will be placed at the head of the ready queue so that it executes immediately next.

To prevent a high-priority process from running indefinitely, the scheduler may decrease the priority of the currently running process at each clock tick. If this action causes its priority to drop below that of the next highest priority process, a process switch occurs. Alternatively, each process may be assigned a maximum time quantum that it is allowed to run. When this quantum is used up, the next highest priority process is given a chance to run.

Priorities may be assigned statically or dynamically:

Statically: Certain processes can be given higher priority over others in a predefined or static
manner.
Dynamically: Priorities can also be assigned dynamically by the system to achieve certain
system goals. For example, I/O bound processes can be given higher priority. A simple
approach could be to set the priority to 1/f where f is the fraction of the last quantum that a
process used.
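As a rough illustration of non-preemptive priority scheduling with static priorities, the sketch below runs processes in priority order (lower number = higher priority); the burst times and priorities are made-up values, not the missing example table.

def priority_turnaround(bursts, priorities):
    """Non-preemptive priority scheduling, all arrivals at time 0, lower value = higher priority."""
    order = sorted(range(len(bursts)), key=lambda i: priorities[i])
    clock, turnaround = 0, [0] * len(bursts)
    for i in order:
        clock += bursts[i]
        turnaround[i] = clock
    return turnaround

# hypothetical processes P1..P4 with bursts and static priorities
print(priority_turnaround([10, 1, 2, 5], [3, 1, 4, 2]))  # [16, 1, 18, 6]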
Example for Priority Scheduling with static priority (assume lower numeric value represents
higher priority),

Q: Calculate the average turnaround time.

Multilevel Queue Scheduling

It partitions the ready queue into several separate queues. The processes are permanently
assigned one queue, generally based on some property of the process, such as memory size,
process priority or process type.

Each queue has its own scheduling algorithm. For example, the first queue may be scheduled
as per Round Robin, while the other as per FCFS.

In addition, there must be scheduling among the queues which is commonly implemented as
fixed-priority scheduling or another possibility could be time-slice among the queues like
first queue may have 80% of CPU time while second may get 20% of CPU time.

Example,

Multilevel Feedback Queue Scheduling
It allows a process to move between queues. The idea is to separate processes according to
the characteristics of their CPU bursts. If a process uses too much CPU time, it will be moved
to a lower priority queue. The scheme leaves I/O bound and interactive process in the higher
priority queues.

In addition, a process that waits too long in a lower priority queue may be moved to a higher
priority queue. In general, a multilevel feedback queue scheduler is defined by the following
parameters:

 The number of queues.


 The scheduling algorithm for each queue.
 The method used to upgrade a process to higher priority from a lower priority queue.
 The method used to demote a process from higher priority to a lower priority queue.
 The method used to determine which queue a process enters when it requires service.

The definition of a multilevel feedback queue scheduler makes it the most general CPU
scheduling algorithm. It can be configured to match a specific system under design but it is
very complex.

Example: consider a multilevel feedback-queue scheduler with three queues, numbered 0 to 2. Queue 0 employs round robin with an 8-millisecond quantum, queue 1 employs a 16-millisecond quantum, and queue 2 employs FCFS. The scheduler first executes all processes in
queue 0. Only when queue 0 is empty will it execute processes in queue 1. Similarly,
processes in queue 2 will only be executed if queues 0 and 1 are empty. A process that arrives
for queue 1 will pre-empt a process in queue 2. A process in queue 1 will in turn be pre-
empted by a process arriving for queue 0. Here, a ready process joins the queue 0. If a
process in queue 0 does not finish in 8 milliseconds, it is moved to the tail of queue 1. If

queue 0 is empty, the process at the head of queue 1 is given a quantum of 16 milliseconds. If
such a process does not complete in 16 milliseconds, it is pre-empted and is put at the tail of
queue 2. Process in queue 2 will run on FCFS basis when queue 0 and queue 1 are empty.

Shortest Process Next

Shortest Job First (SJF) always produces the minimum average response time for batch systems. But for an interactive system, figuring out which of the currently runnable processes is the shortest one is difficult. One approach is to make an estimate based on past behaviour and run the process with the shortest estimated running time. Suppose that the estimated time per command for some terminal is T0, and that its next run is measured to be T1. The estimate can be updated by taking a weighted sum of these two numbers, i.e.

a*T0 + (1 - a)*T1, with a < 1

This technique of estimating the next value by taking a weighted average of the currently measured value and the previous estimate is sometimes called aging. Aging is easy to implement with a = 1/2, as it just requires adding the new value to the current estimate and dividing by 2 (a right shift by 1 bit).
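The running estimate takes only a couple of lines to maintain; with a = 1/2 each update is just the average of the old estimate and the new measurement. The burst values below are illustrative.

def update_estimate(previous_estimate, measured_burst, a=0.5):
    """Exponential averaging (aging): new estimate = a*old_estimate + (1-a)*measurement."""
    return a * previous_estimate + (1 - a) * measured_burst

estimate = 10.0
for burst in [8, 6, 12, 4]:          # successive measured CPU bursts
    estimate = update_estimate(estimate, burst)
    print(round(estimate, 2))        # 9.0, 7.5, 9.75, 6.88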

References:

1. Silberschatz, P.B. Galvin and G. Gagne, Operating System Concepts (7th ed.), John
Wiley & Sons, Inc.

2. A.S. Tanenbaum, Modern Operating Systems (2nd ed.), Prentice-Hall of India.

THREADS & MULTITHREADING

Each process in a traditional operating system has an address space and a single thread of
control. But in several situations, it is desirable to have multiple threads of controls in the
same address space running in a quasi-parallel manner so that they can share some resources
and work together to perform some task. Multithreading refers to this situation of allowing
multiple threads to execute in the same process.

Differences between Process and Threads

There are significant differences between the concepts of processes and threads, which are discussed below:

- A thread is the entity within a process that can be scheduled for execution, while a process is a program in execution and it groups related resources.
- A process has an address space containing program text and data along with other resources, which may include open files, child processes, pending alarms, signal handlers, accounting information, etc. A thread (of execution), on the other hand, has a program counter, registers, a state and a stack to maintain its execution history.
- In systems supporting multithreading, a process may have multiple threads, while a thread is always associated with a single process.
- Different processes on a system share physical memory, disks, printers and other system resources, while different threads of a process share a common address space, open files, child processes, alarms and signals, etc.
- Different threads in a process are not quite as independent as different processes. Since all threads of a process have exactly the same address space, they share the same global variables.
- Another repercussion of sharing a common address space is that threads can read, write, or even completely wipe out another thread's stack, indicating that there is no protection among threads. In fact, unlike processes, protection is not as serious a concern in the case of threads, because processes may belong to different competing users, but threads share resources belonging to the same process and thus the same user, so they will be created to cooperate and not to compete.
- Processes have parent-child or hierarchical relationships, while threads may or may not have such a relationship; in fact, threads are normally treated as equals.
- Processes are always managed by the operating system, whereas threads may be implemented in user space as well as in kernel space.
- Since threads do not have many resources of their own, they are easier to create and destroy than processes.
Similarities between Processes and Threads

Besides the dissimilarities, there are evident similarities between threads and processes, which include:

 Processes and threads have inherent characteristic of sharing resources and allowing
multiple tasks to progress in a parallel or pseudo-parallel manner.

 Like processes, a thread can be in any one of the several states: running, blocked,
ready, or terminated.

 The way CPU switches back and forth among processes similarly the CPU switches
back and forth among different threads of a process giving illusion of parallel
activities running within the process.

Since threads have some properties of processes but do not have many resources of their own, they are also called lightweight processes.

Library Procedures for Threads

By default each process has one thread of control, which can call a library procedure, e.g. thread_create, to create more threads when multithreading is supported by the system. After a thread has finished its work, it can exit by calling a library procedure, say thread_exit, and will no longer be schedulable. One thread can wait for a specific thread to exit by calling a procedure like thread_wait. Another common thread call is thread_yield, which allows a thread to voluntarily give up the CPU to let another thread run, as threads are normally cooperating and not competing.
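These generic library procedures map onto real thread packages. As a rough illustration, in Python's threading module Thread(...).start() plays the role of thread_create, join() plays the role of thread_wait, and a thread exits simply by returning from its function; there is no direct thread_yield, though time.sleep(0) gives up the CPU in a similar cooperative spirit.

import threading, time

def worker(task_id):
    print(f"thread {task_id} started")
    time.sleep(0)                 # roughly a yield: give another thread a turn
    print(f"thread {task_id} finished")
    # returning here corresponds to thread_exit

# thread_create: one thread per task, all sharing this process's address space
threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()                      # thread_wait: block until that thread exits
print("all threads done")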

Motivation for Threads

Many applications require several activities going on at once some of which may be
dependent on others while others may not, some may be blocking while others may not. Thus
identifying separate activities and decomposing the application into multiple threads makes
the job simpler to model and efficient to execute. But this is a similar argument that is
proposed for having multiple processes. Following points discuss the motivation behind
introducing the concept of threads:

- Threads provide the ability for the parallel activities to share an address space, while processes do not.
- Since threads do not have many separate resources attached to them, they are easier and faster to create and destroy than processes.
- Like processes, with a mix of compute-bound and I/O-bound threads the performance of the application can be significantly improved. Also, as with processes, on systems with multiple CPUs multiple threads can run simultaneously on separate CPUs, achieving real parallelism.

Complications with Threads

Though threads are often useful, they introduce a significant level of complexity. Some complications are given below; they can actually be dealt with by appropriate design of multithreaded programs.

- If a parent process has multiple threads, does the child created by fork also have all of them?
- If a thread in the parent was blocked when the child was forked, will there now be two threads blocked on the same event, and what happens when the event occurs: will both of them be unblocked, or only the one in the parent, or only the one in the child?
- Threads share many data structures. Consider a situation where a thread detects a shortage of memory and starts allocating more; part way through, another thread is scheduled which also detects the shortage and starts allocating memory. The process may then end up allocating the memory twice.

Applications

 Word Processor having separate threads for saving or disk backup, typing, error
detection etc.
 Server serving multiple clients simultaneously

References:

1. Silberschatz, P.B.Galvin and G. Gagne, Operating System Concepts (7th ed.), John
Wiley & Sons, Inc.
2. A.S. Tanenbaum, Modern Operating Systems (2nd ed.), Prentice-Hall of India.
DEADLOCKS

RESOURCES

Preemptable Resource: A preemptable resource is one that can be taken away from the
process owning it with no ill-effects. Memory is an example of preemptable resource.

Non-preemptable Resource: A non-preemptable resource is one that cannot be taken away from its current owner process without causing the operations of the process to fail. A CD recorder or a scanner is an example of a non-preemptable resource.

The sequence of events required to use a resource are as follows:

1. Request the resource


2. Use the resource after the resource has been granted
3. Release the resource

DEADLOCK

Deadlock can be formally defined as – “A set of processes is deadlocked if all processes in


the set are waiting in a circular way such that each process is waiting for an event that only
next process in the chain can cause”.

Mostly, the event is release of some resource possessed by another member process of the
chain. Thus, each member of the set of deadlocked processes is waiting for a resource that is
owned by next deadlocked process in the chain.

COFFMAN CONDITIONS FOR DEADLOCK

Following are the Coffman conditions which are the four necessary conditions for a deadlock
to occur, and absence of any one of them guarantee that no deadlock can occur:

1. Mutual exclusion condition: Each resource is either currently assigned to exactly one
process or is available.
2. Hold and wait condition: Processes currently holding resources granted earlier can
request new resources.
3. No preemption condition: Resources previously granted cannot be forcibly taken
away from a process. They must be explicitly released by the process holding them.
4. Circular wait condition: There must be a circular chain of two or more processes,
each of which is waiting for a resource held by the next member of the chain.
STRATEGIES TO DEAL WITH DEADLOCK

In general, the following four strategies are used for dealing with deadlocks:

1. Ignore: Just ignore the problem of deadlock altogether if the possibility of it occurring for a particular resource is very low.
2. Detection and recovery: Let deadlocks occur, detect them, and take action to recover
from the situation.
3. Dynamic deadlock avoidance: Avoid occurrence of deadlock by careful resource
allocation.
4. Deadlock Prevention: Prevent deadlock by structurally negating one of the four
conditions necessary to cause a deadlock.

RESOURCE ALLOCATION GRAPH

Deadlocks can be described in terms of a directed graph known as a resource allocation graph. The graph has two kinds of nodes: processes, shown as circles, and resources, shown as squares. An arc from a resource node to a process node represents that a requested resource has been granted to and is currently held by that process. An arc from a process to a resource means that the process is currently blocked waiting for that resource.
Figure 1: Resource Allocation graph showing deadlock situation and how it is avoided
with altered sequence of requests of resources by processes
Example of resource allocation graph with multiple instances of same resources:

Figure 2: Resource allocation graph with multiple instances of same resources

DEADLOCK DETECTION & RECOVERY

With deadlock detection, the system does not attempt to prevent deadlocks from occurring,
rather, it lets them occur and later detects their occurrence to recover from the situation using
several means like recovery through pre-emption, killing process or rollback.
 Deadlock detection with one instance of each resource type – This is done by
detecting presence of cycles in resource allocation graph.

A formal algorithm for detecting deadlocks using resource allocation graph is


required which basically will detect presence of a cycle in the resource allocation
graph. Many algorithms for detecting cycles in directed graphs are known. Following
is a simple algorithm that inspects a graph and terminates either when it has found a
cycle or when it has shown that none exist. It uses one data structure, L, a list of
nodes. During the algorithm, arcs will be marked to indicate that they have already
been inspected, to prevent repeated inspections.
The algorithm operates by carrying out the following steps as specified:
1. For each node, N in the graph, perform the following 5 steps with N as the
starting node.
2. Initialize L to the empty list, and designate all the arcs as unmarked.
3. Add the current node to the end of L and check to see if the node now appears
in L two times. If it does, the graph contains a cycle (listed in L) and the
algorithm terminates.
4. From the given node, see if there are any unmarked outgoing arcs. If so, go to
step 5; if not, go to step 6.
5. Pick an unmarked outgoing arc at random and mark it. Then follow it to the
new current node and go to step 3.
6. We have now reached a dead end. Remove it and go back to the previous
node, that is, the one that was current just before this one, make that one the
current node, and go to step 3. If this node is the initial node, the graph does
not contain any cycles and the algorithm terminates.

This algorithm takes each node, in turn, as the root of what it hopes will be a tree, and
does a depth-first search on it. If it ever comes back to a node it has already
encountered, then it has found a cycle. If it exhausts all the arcs from any given node,
it backtracks to the previous node. If it backtracks to the root and cannot go further,
the subgraph reachable from the current node does not contain any cycles. If this
property holds for all nodes, the entire graph is cycle free, so the system is not
deadlocked.

To see how the algorithm works in practice, let us take an example:

Consider a system with seven processes, A though G, and six resources, R through W.
The state of which resources are currently owned and which ones are currently being
requested is as follows:
1. Process A holds R and wants S.
2. Process B holds nothing but wants T.
3. Process C holds nothing but wants S.
4. Process D holds U and wants S and T.
5. Process E holds T and wants V.
6. Process F holds W and wants S.
7. Process G holds V and wants U.
Figure 3: Corresponding Resource Allocation Graph

Now we attempt to detect presence of deadlock using the above mentioned algorithm.
The order of processing the nodes is arbitrary, so let us just inspect them from left to
right, top to bottom, first running the algorithm starting at R then successively, A, B,
C, S, D, T, E, F, and so forth. If we hit a cycle, the algorithm stops. We start at R and
initialize L to the empty list. Then we add R to the list and move to the only
possibility, A, and add it to L, giving L = [R, A]. From A we go to S, giving L = [R,
A, S]. S has no outgoing arcs, so it is a dead end, forcing us to backtrack to A. Since
A has no unmarked outgoing arcs, we backtrack to R, completing our inspection of R.
Now we restart the algorithm starting at A, resetting L to the empty list. This search,
too, quickly stops, so we start again at B. From B we continue to follow outgoing arcs
until we get to D, at which time L = [B, T, E, V, G, U, D]. Now we must make a
(random) choice. If we pick S we come to a dead end and backtrack to D. The second
time we pick T and update L to be [B, T, E, V, G, U, D, T], at which point we
discover the cycle and stop the algorithm.
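The same search can be written compactly as a depth-first traversal. The sketch below encodes the example's resource allocation graph as adjacency lists (an arc such as R -> A means resource R is held by process A, and A -> S means A is waiting for S) and reports the first cycle it finds; the function name is our own.

def find_cycle(graph):
    """Depth-first search for a cycle in a directed graph given as {node: [successors]}."""
    def dfs(node, path, on_path):
        if node in on_path:                      # node seen twice on the current path: cycle
            return path[path.index(node):] + [node]
        on_path.add(node)
        path.append(node)
        for succ in graph.get(node, []):
            cycle = dfs(succ, path, on_path)
            if cycle:
                return cycle
        path.pop()                               # dead end: backtrack
        on_path.discard(node)
        return None

    for start in graph:                          # try every node as the starting point
        cycle = dfs(start, [], set())
        if cycle:
            return cycle
    return None

# Processes A..G and resources R..W from the example above
graph = {
    'R': ['A'], 'A': ['S'], 'B': ['T'], 'C': ['S'],
    'U': ['D'], 'D': ['S', 'T'], 'T': ['E'], 'E': ['V'],
    'W': ['F'], 'F': ['S'], 'V': ['G'], 'G': ['U'], 'S': [],
}
print(find_cycle(graph))   # ['T', 'E', 'V', 'G', 'U', 'D', 'T']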

 Deadlock detection with multiple instances of each resource type – This is done by
maintaining existing and available resource vectors along with the current allocation and request matrices. Deadlock cannot be detected simply by finding a cycle in cases where there are multiple instances of the same resource.

It can be shown that if a resource allocation graph contains no cycle, then no process
in the system is deadlocked but if the graph does contain a cycle, then a deadlock may
exist but not necessarily. To understand this let us consider the following two resource
allocation graphs:
Figure 4: (a) Resource allocation graphs for multiple instances of same resources
having cycles where deadlock occurs (b) Resource allocation graphs for multiple
instances of same resources having cycles where deadlock does not occur

Both of the above resource allocation graphs contain cycles. Corresponding to Figure 4(a) there are two cycles, which are:

In this example deadlock actually occurs because every process in the cycles is waiting for a resource held by the next process in the chain, and there is no way any of these processes can progress any further.
Now consider Figure 4(b); the one cycle that exists here is:

But this cycle does not actually result in a deadlock situation, because if either the instance of R1 granted to process P2 or the instance of R2 granted to process P4 is returned by the respective process, the cycle will break on its own. This is because, in such a case, the newly freed resource instance will become available to the process waiting on it; hence it is not a deadlock situation.

Hence, detecting the presence of a cycle in the resource allocation graph is not a correct approach for detecting deadlock in cases where there are multiple instances of the same resource.

So, following is a matrix-based algorithm for detecting deadlock among n processes.


Let the number of resource classes be m, with E1 resources of class 1, E2 resources of
class 2, and generally, Ei resources of class i (1 ≤ i ≤ m). So, E is the existing resource
vector. It gives the total number of instances of each resource in existence.

Also, at any instant, some of the resources are assigned and are not available. Let A
be the available resource vector, with Ai giving the number of instances of resource i
that are currently available (i.e., unassigned). Further, there are two arrays, C, the
current allocation matrix, and R, the request matrix. The ith row of C tells how many
instances of each resource class the process i currently holds. Thus, Cij is the number
of instances of resource j that are held by process i. Similarly, Rij is the number of
instances of resource j that the process i wants. These four data structures are shown
below:

An important invariant holds for these four data structures. In particular, every resource is either allocated or is available. This observation means that, for every resource class j:

C1j + C2j + ... + Cnj + Aj = Ej

The meaning is that if we add up all the instances of resource j that have been allocated and to this add all the instances that are available, the result is the number of instances of that resource class that exist.
The deadlock detection algorithm is based on comparing vectors. Let us define the
relation A ≤ B on two vectors A and B to mean that each element of A is less than or
equal to the corresponding element of B. Mathematically, A ≤ B holds if and only if
Ai ≤ Bi for 1 ≤ i ≤ m.

Each process is initially said to be unmarked. As the algorithm progresses, processes


will be marked, indicating that they are able to complete and are thus not deadlocked.
When the algorithm terminates, any unmarked processes are known to be deadlocked.
The deadlock detection algorithm can now be given, as follows:
1. Look for an unmarked process, Pi, for which the ith row of R is less than or
equal to A.
2. If such a process is found, add the ith row of C to A, mark the process, and go
back to step 1.
3. If no such process exists, the algorithm terminates.

When the algorithm finishes, all the unmarked processes, if any, are deadlocked.

In step 1 the algorithm is looking for a process that can be run to completion. Such a
process is characterized as having resource demands that can be met by the currently
available resources. The selected process can be run until it finishes, at which time it
returns the resources it is holding to the pool of available resources. It is then marked
as completed. If all the processes are ultimately able to run, none of them are
deadlocked. If some of them can never run, they are deadlocked. Although the
algorithm is nondeterministic (because it may run the processes in any feasible order),
the result is always the same.

For example, consider there are three processes and four resource classes, say, tape
drives, plotters, scanner, and CD-ROM drive. Process 1 has one scanner. Process 2
has two tape drives and a CD-ROM drive. Process 3 has a plotter and two scanners.
Each process needs additional resources, as shown by the R matrix.

To run the deadlock detection algorithm, a process is to be identified whose resource


request can be satisfied. The first one cannot be satisfied because there is no CD-
ROM drive available. The second cannot be satisfied either, because there is no
scanner free. Fortunately, the third one can be satisfied, so process 3 can run and
eventually it will return all its resources, giving
A = (2 2 2 0)
At this point process 2 can be run because the row corresponding to process 2 in
request matrix has values <1, 0, 1, 0> which can be satisfied by the available
resources. Hence, it can run and return its resources, giving
A = (4 2 2 1)
Now the remaining process can run. There is no deadlock in the system.
Now consider a minor variation of the above situation. Suppose that process 3 needs a CD-ROM drive as well as the two tape drives and the plotter, i.e. the row corresponding to process 3 in the request matrix has the values <2, 1, 0, 1>. Now none of the requests can be satisfied, so the entire system is deadlocked.
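The detection algorithm translates directly into code. In the sketch below the allocation matrix C follows the holdings listed in the text; the request matrix R and the available vector A are not reproduced in these notes (the table is missing), so the values used here are assumptions chosen to match the narrative (process 1 waits for a CD-ROM drive, process 2 for a scanner, process 3 can run, and A becomes (2 2 2 0) and then (4 2 2 1)).

def deadlocked_processes(available, allocation, request):
    """Return the indices of the processes that remain unmarked, i.e. deadlocked."""
    work = list(available)
    marked = [False] * len(allocation)
    progress = True
    while progress:
        progress = False
        for i, row in enumerate(request):
            # an unmarked process whose request can be met can run to completion
            if not marked[i] and all(r <= w for r, w in zip(row, work)):
                work = [w + c for w, c in zip(work, allocation[i])]  # it returns its resources
                marked[i] = True
                progress = True
    return [i for i in range(len(allocation)) if not marked[i]]

# Resource classes: (tape drives, plotters, scanners, CD-ROM drives)
A = [2, 1, 0, 0]                                   # assumed available vector
C = [[0, 0, 1, 0], [2, 0, 0, 1], [0, 1, 2, 0]]     # current allocation, from the text
R = [[2, 0, 0, 1], [1, 0, 1, 0], [2, 1, 0, 0]]     # assumed request matrix
print(deadlocked_processes(A, C, R))               # [] -> no deadlock
print(deadlocked_processes(A, C, [[2, 0, 0, 1], [1, 0, 1, 0], [2, 1, 0, 1]]))  # [0, 1, 2] -> all deadlocked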

Recovery through Pre-emption: In some cases, it may be possible to temporarily take a


resource away from its current owner and give it to another process. The ability to take a
resource away from a process, have another process use it, and then give it back without the
process noticing it is highly dependent on the nature of the resource. Recovering this way is
frequently difficult or impossible. Choosing the process to suspend depends largely on which
ones have resources that can easily be taken back.

Recovery through Rollback: If deadlocks are frequent for a system, processes can be
checkpointed periodically and a whole sequence of checkpoint files are accumulated.
Checkpointing a process means that its state is written to a file so that it can be restarted later.
The checkpoint contains not only the memory image, but also the resource state, that is,
which resources are currently assigned to the process.

When a deadlock is detected, the needed resources are identified, and to do the recovery a
process that owns a needed resource is rolled back to a point in time before it acquired the
resource by starting one of its earlier checkpoints thereby resetting the process to an earlier
moment when it did not have the resource (i.e. the resource which is now assigned to one of
the deadlocked processes). If the restarted process tries to acquire the resource again, it will
have to wait until that resource becomes available.

Recovery through Killing Processes: The easiest way to break a deadlock is to kill one or
more processes. The process to be killed may be one among those in the deadlocked cycle or
an altogether separate process which is not part of the cycle but is holding resources required
to break the deadlock. For example, one process might hold a printer and want a plotter, with
another process holding a plotter and wanting a printer. These two are deadlocked. A third
process may hold another identical printer and another identical plotter and be happily
running. Killing the third process will release these resources and break the deadlock
involving the first two.

Where possible, it is best to kill a process that can be rerun from the beginning with no ill
effects. For example, process like “a compilation of a program” can always be rerun because
the first run has no influence on the second run. But, a process that updates a database cannot
always be run a second time safely as a transaction may be incomplete when it is chosen to be
killed thereby causing partial manipulations to be done in the database.

References:

1. A. Silberschatz, P.B. Galvin and G. Gagne, Operating System Concepts (7th ed.), John
Wiley & Sons, Inc.
2. A.S. Tanenbaum, Modern Operating Systems (2nd ed.), Prentice-Hall of India.
Deadlock Avoidance
Deadlock detection is done to detect a deadlock situation and recover from it after it has been
identified. Processes do not generally request all their required resources at once but
normally request a few resources at a time. To avoid deadlocks, the system must be
able to decide whether granting a resource request is safe or not and only make the allocation
when it is safe.

This way we can avoid deadlocks, but only if certain information is available in advance like
number of processes, maximum resource requirement of each process, number of existing
resources.

Following defines safe and unsafe states in the context of deadlock avoidance:

Safe State: A state is said to be safe if it is not deadlocked and there is some scheduling order
in which every process can be run to completion even if all of them suddenly request their
maximum number of resources at once.

Unsafe state: An unsafe state is the state which is not ‘safe’ i.e. there is no guarantee that a
deadlock will not occur. But an unsafe state is not necessarily a deadlocked state because it is
possible that the processes never ask for their complete or maximum requirement of
resources all at once.

For example, following demonstrates that the state in Figure 1 (a) is a safe state because there
exists a sequence of allocations that allows all processes to complete as shown below:

Figure 1: Demonstration for safe state

Further, following demonstrates that the state in figure 2 (a) is not a safe state or we may say
it is an unsafe state because there does not exist a sequence of allocations that allows all
processes to complete in case they demand their maximum requirement all at once as is
shown below:
Figure 2: Demonstration for unsafe state

Banker’s Algorithm for Deadlock Avoidance: The Banker’s algorithm considers each
request of resource from a process as it occurs and checks if granting the request leads to safe
state or not. If it does, the request is granted; otherwise, it is postponed until later. To see if a
state is safe, it is checked if there exist enough resources to satisfy the resource requirements
of all processes if the processes make their maximum remaining demand all at once. In other
words, there should exist at least one sequence of process scheduling so that maximum
demands of all processes are met completely and the processes can be run/execute to
completion. To check this, existing, available and possessed resource vectors are maintained
along with resource assigned and still needed matrices (or maximum resource requirement
matrix can be maintained using which still needed matrix can easily be calculated).

Consider a system having 10 resource instances of a resource and processes A, B, C and D
with their maximum demands as shown in figure 3 (a). Now consider a situation where at any
instant of time the allocation of resources is such that system has the state as shown in figure
3(b). And, at this instant assume that there is a request by process B to allocate 1 instance of
the resource, then using Banker’s algorithm the request cannot be granted because it will lead
to an unsafe state which is as shown in figure 3(c). This is because maximum remaining
resource requirements of Processes A, B, C and D are 5, 3, 2, and 3 respectively, none of
which can be satisfied by the currently available or free single/one resource instance. Hence,
such a resource request by process B will not be granted as per Banker’s algorithm so as to
avoid deadlock situation.

Please note that the state in figure 3 (b) by itself is safe because there will exist at least one
sequence of execution of processes such that all processes can execute to completion even if
they demand their maximum resource requirements all at once.
Figure 3: Banker’s algorithm for single type of resources

Now consider the case of a system having resources of multiple types. Consider figure 4
showing two matrices. The one on the left shows how many of each resource are currently
assigned to each of the five processes. The matrix on the right shows how many resources
each process still needs in order to complete. The three vectors at the right of the figure show
the existing resources, E, the possessed resources, P, and the available resources, A,
respectively. From E we see that the system has six tape drives, three plotters, four scanners,
and two CD-ROM drives. Of these, five tape drives, three plotters, two scanners, and two
CD-ROM drives are currently assigned. This fact can also be seen by adding up the four
resource columns in the Resources assigned matrix to get the possessed resources vector P.
Thus, the available resource vector A is simply the difference between what the system has in
total, i.e. the existing resources vector E, and what is currently in use, i.e. the possessed
resources vector P, that is, A = E - P.

Figure 4: Banker’s algorithm for multiple type of resources


The steps of the algorithm for checking to see if a state is safe can be stated as follows:

1. Look for a row, R, whose unmet resource needs are all smaller than or equal to A. If
no such row exists, the system will eventually deadlock since no process can run to
completion. In such case go to Step 3.

2. Assume the process of the row chosen requests all the resources it needs (which is
guaranteed to be possible) and finishes. Mark that process as terminated and add all its
resources to the A vector.

3. Repeat steps 1 and 2 until either all processes are marked terminated, in which case
the initial state was safe, or until a deadlock occurs. If a deadlock is identified the
initial state is not safe i.e. it is unsafe.

Note: If several processes are eligible to be chosen in step 1, it does not matter which one is
selected.
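
Following is a minimal Python sketch of this safety check (illustrative only; the function and variable names are not from the text). The still-needed matrix plays the role of the unmet needs examined in step 1, and the loop implements steps 1 to 3:

def is_safe(still_needed, assigned, available):
    """Banker's safety check: True if some completion order exists for all processes.

    still_needed[i][j] : maximum remaining demand of process i for resource j
    assigned[i][j]     : units of resource j currently assigned to process i
    available[j]       : free units of resource j (the vector A)
    """
    m = len(available)
    avail = list(available)
    terminated = [False] * len(still_needed)
    while True:
        # Step 1: look for a row whose unmet needs all fit within A.
        row = next((i for i in range(len(still_needed))
                    if not terminated[i]
                    and all(still_needed[i][j] <= avail[j] for j in range(m))), None)
        if row is None:
            # Step 3: nothing can run any more; safe only if every process terminated.
            return all(terminated)
        # Step 2: assume that process gets everything, finishes, and returns its resources.
        for j in range(m):
            avail[j] += assigned[row][j]
        terminated[row] = True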

It can be verified that the current state as shown in figure 4 is a safe state. Now, suppose that
process B now requests a scanner. It is to be now checked using Banker’s Algorithm whether
this request can be granted or not?

To apply Banker’s algorithm we create the new state which as shown below:

Now it is to be checked whether the new state is a safe state or not so as to determine whether
the resource request of Process B can be granted or not using Banker’s Algorithm. To check
whether the state is safe or not, we need to identify a sequence of execution of processes so
that the processes can still execute to completion even if all of the processes demand their
remaining maximum requirements all at once.

1. Process D’s still need requirement i.e. (0, 0, 1, 0) is less than the available resource
vector A i.e. (1, 0, 1, 0). Hence, Process D can eventually finish and will return its
occupied resources making the new available resource vector A as A = A + resource
assigned to Process D i.e.:
A = (1, 0, 1, 0) + (1, 1, 0, 1) = (2, 1, 1, 1)

2. Now, Process A’s or E’s demand can be met by the available resources. Let us
consider the Process A’s demand or still needed requirement i.e. (1, 1, 0, 0) which is
less than the latest available resource vector A i.e. (2, 1, 1 ,1). Hence, Process A can
eventually finish and will return its occupied resources making the new available
resource vector A as A = A + resource assigned to Process A i.e.:
A = (2, 1, 1, 1) + (3, 0, 1, 1) = (5, 1, 2, 2)

3. Now, any of the remaining Processes’ demands can be met by the available resources.
Let us consider the Process B’s demand or still needed requirement i.e. (0, 1, 0, 2)
which is less than the latest available resource vector A i.e. (5, 1, 2, 2). Hence, Process
B can eventually finish and will return its occupied resources making the new
available resource vector A as A = A + resource assigned to Process B i.e.:
A = (5, 1, 2, 2) + (0, 1, 1, 0) = (5, 2, 3, 2)

4. Again, any of the remaining Processes’ demands can be met by the available
resources. Let us consider the Process C’s demand or still needed requirement i.e. (3,
1, 0, 0) which is less than the latest available resource vector A i.e. (5, 2, 3, 2). Hence,
Process C can eventually finish and will return its occupied resources making the new
available resource vector A as A = A + resource assigned to Process C i.e.:
A = (5, 2, 3, 2) + (1, 1, 1, 0) = (6, 3, 4, 2)

5. Finally, Process E is the last process remaining and its demand can be met by the
available resources because the Process E’s demand or still needed requirement i.e. (2,
1, 1, 0) is less than the latest available resource vector A i.e. (6, 3, 4, 2). Hence,
Process E can eventually finish and will return its occupied resources making the new
available resource vector A as A = A + resource assigned to Process E i.e.:
A = (6, 3, 4, 2) + (0, 0, 0, 0) = (6, 3, 4, 2)

So, there is at least one sequence of execution of processes, i.e. D --> A --> B --> C --> E, in
which the processes can execute to completion even if all of them demand their maximum
remaining resource requirements all at once. Hence, this request of Process B for one scanner
can be granted because the resulting state is still safe.
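
Continuing with the is_safe() sketch given after the algorithm steps, the walkthrough above can be checked programmatically. The assigned and still-needed rows below are the ones used in steps 1 to 5 above (the remaining detail is as read from figure 4, so treat the data as illustrative):

# State after granting process B one scanner (rows in the order A, B, C, D, E).
assigned = [[3, 0, 1, 1],
            [0, 1, 1, 0],
            [1, 1, 1, 0],
            [1, 1, 0, 1],
            [0, 0, 0, 0]]
still_needed = [[1, 1, 0, 0],
                [0, 1, 0, 2],
                [3, 1, 0, 0],
                [0, 0, 1, 0],
                [2, 1, 1, 0]]
available = [1, 0, 1, 0]                      # A after the scanner is handed to B

print(is_safe(still_needed, assigned, available))    # True: B's request can be granted

# If process E were then also given the last scanner, the state would become unsafe:
assigned_e = [row[:] for row in assigned]
needed_e = [row[:] for row in still_needed]
assigned_e[4][2] += 1                         # E now holds the scanner
needed_e[4][2] -= 1                           # and needs one scanner fewer
print(is_safe(needed_e, assigned_e, [1, 0, 0, 0]))   # False: defer E's request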

Now imagine that after giving B one of the two remaining scanners, Process E wants the last
scanner. Granting that request would create a state such that the vector of available resources
will be reduced to (1 0 0 0), and this may lead to deadlock as none of the processes’
maximum remaining requirements will be able to be fulfilled by the available resources (1, 0,
0, 0). The resulting state is as shown below:
Thus, the resulting state is unsafe and hence clearly Process E’s request cannot be granted in
this situation as per Banker’s algorithm in order to avoid deadlock and the request has to be
deferred for a future time.

Problems with Banker’s Algorithm


1. For Banker’s algorithm maximum resource requirement of all processes should be
known well in advance which is not very practical.

2. Banker’s algorithm works in a very simplified environment, where the number of
processes competing for resources is considered to be fixed, whereas the competing
processes are normally not fixed and keep changing dynamically.

3. It also assumes that the number of existing resources remains fixed, while this is also not
true in general. New resources may become part of the system and, on the other hand,
some resources may wear out or stop working, thereby decreasing the total count of
existing resources.

Deadlock Prevention
Deadlock Prevention refers to attacking one of the four necessary conditions for deadlock to
happen in order to prevent a deadlock situation to happen.

Attacking mutual exclusion: Attacking mutual exclusion means the resource cannot be
assigned exclusively to a single process. For a resource like a printer, spooling printer output
is an example of attacking mutual exclusion: several processes can generate printer output at
the same time, and only the printer daemon process actually requests and gets allotted the
physical printer. The other processes requiring the printer therefore never hold the printer
exclusively themselves; the printer daemon accesses it on their behalf. Since the printer
daemon process never requests other resources being sought by the competing processes, no
deadlock can arise over the printer.
But this strategy can work for only a few types of resource and not for all in general. Further,
competition for disk space for spooling can itself lead to deadlock.

Attacking Hold and Wait Condition: One way to attack hold and wait condition is to require
all processes to request all their required resources before starting execution. If everything is
available, only then the process is started so that it can then run to completion. But this
approach leads to highly inefficient utilization of resources, because a process may hold
resources for its entire lifetime even though it may need them only towards the end of its
execution. Also, a process normally does not know at the start how many, and which,
resources it will require during the course of its execution.
Another approach could be that whenever a process requests a resource, it first
temporarily releases all the resources that it currently holds, and then tries to get everything it
needs all at once.

Attacking No Preemption Condition: As per this approach, a resource once granted to a
process may be forcibly taken away from it if required, so no deadlock can persist. But this
approach is not very promising because taking a resource away from a process (after it has
been granted) normally has ill effects, as the process may be part way through using the
resource. This approach has very limited use, only with certain specific types of resources
which, when taken away, do not cause a serious ill impact on the process.

Attacking Circular Wait: One way to attack circular wait is to have a rule saying that a
process is entitled only to have a single resource at any moment of time. But this is not
practically possible because most applications genuinely require more than one resource
at a time.

Another approach to attack circular wait is to assign a global number to every resource.
Processes can then request resources whenever they want to, but all requests must be made
in strictly ascending numerical order. For example, for a system having resources like printer,
scanner, plotter, tape drive and CD ROM Drive a possible ordering could be:

1. Printer
2. Scanner
3. Plotter
4. Tape drive
5. CD ROM Drive
Then a process which has already been allotted a Scanner cannot request a Printer, but a
process which has been granted a Printer can still request a Scanner. With this kind of rule,
the resource allocation graph can never have cycles.

A minor variation to this rule is to drop the requirement that resources be acquired in strictly
ascending order and merely insist that a process request only resources having a global
number greater than that of any resource it is currently holding. For example, if a process
initially requests resources numbered 9 and 10 and then releases both of them, it can now
request a resource with global number, say, 1 because it is no longer holding any resource
numbered higher than 1.

The problem with this approach is that it is practically very difficult to identify such an
ordering to satisfy the needs of all processes.
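
Following is a small Python sketch (hypothetical class and method names) of the numbering rule just described in its relaxed form: every new request is checked against the highest-numbered resource the process currently holds.

class OrderedResourceAllocator:
    """Grants resources only in ascending order of their global numbers."""

    ORDER = {"printer": 1, "scanner": 2, "plotter": 3, "tape drive": 4, "cdrom drive": 5}

    def __init__(self):
        self.held = {}                          # process id -> set of resources held

    def acquire(self, pid, resource):
        held = self.held.setdefault(pid, set())
        highest = max((self.ORDER[r] for r in held), default=0)
        if self.ORDER[resource] <= highest:
            raise RuntimeError("request violates the ascending-order rule")
        held.add(resource)                      # waiting/allocation details omitted

    def release(self, pid, resource):
        self.held.get(pid, set()).discard(resource)

alloc = OrderedResourceAllocator()
alloc.acquire("P1", "printer")                  # allowed: printer has number 1
alloc.acquire("P1", "scanner")                  # allowed: 2 is greater than 1
# alloc.acquire("P1", "printer")                # would raise: 1 is not greater than 2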

References:

1. A. Silberschatz, P.B. Galvin and G. Gagne, Operating System Concepts (7th ed.), John
Wiley & Sons, Inc.
2. A.S. Tanenbaum, Modern Operating Systems (2nd ed.), Prentice-Hall of India.
MEMORY MANAGEMENT

The part of the operating system that manages the memory is called the memory manager.
Its job is to keep track of which parts of memory are in use and which parts are not, to
allocate memory to processes when they need it and deallocate it when they have completed,
and to manage swapping between main memory and disk when main memory is too small to
hold all the processes.

Basic Memory Management


There are two types of basic memory management systems:
a) Those that move processes back and forth between main memory and disk (swapping
and virtual memory like paging etc.)
b) Those which do not move processes back and forth between memory and disk.

Monoprogramming without Swapping or Virtual memory like Paging etc.


Single program is in the memory along with operating system. This approach is highly simple
but very inflexible and not suitable for today’s multiuser, timesharing environment with
complex applications. Following figure shows few simple ways of organizing memory in
monoprogramming set up along with the operating system:

Clearly, in case of monoprogramming without swapping or virtual memory, only a single
user process can reside in memory, and the process size cannot exceed the physically
available size of the memory.

Multiprogramming with Fixed Partition (without Swapping or virtual memory like Paging
etc.)
Here the memory is divided into fixed partitions, possibly unequal in size. The partitioning may
be done once when the system is started up. There could be following variations:
 Multiple queues for different sized partitions – In this approach multiple queues of
jobs are maintained corresponding to the different partitions, which may be of unequal
sizes. It is efficient to locate the appropriate free partition for a job, but jobs may
have to wait unnecessarily for memory even if memory is presently available in a
different partition. If the queues for the different sized partitions vary significantly in
length (for example, many small jobs and few large jobs), the smaller jobs may keep
waiting for their chance in the smaller partition while a larger partition lies empty.

 Single queue for different sized partitions – In this approach a single queue of jobs
is maintained for all partitions. It involves CPU time in deciding which job from the
single queue is to be put in the newly freed partition. Various approaches with
different trade-offs may be there for deciding which job to be put in freed partition,
like:
o Identifying the first job small enough to fit the partition from the beginning of
the queue and putting this identified job in the partition. But this approach
wastes memory in case a very large partition is allocated to a very small job.
o Searching the whole queue for the largest job to fit in the partition. This
approach may require larger CPU time to identify the job to be put in the
partition. Also, this approach is biased towards larger jobs and may be unfair
to smaller jobs thus providing poor service to smaller jobs which should not be
the case.
 Another variation could be to have a rule stating that the job that fits in a free partition
cannot be skipped more than say k times. This approach is a kind of hybrid of the
above two approaches.

In general, there are few more problems that are associated with fixed partition approach:
1. Internal fragmentation happens when processes of smaller size are made to occupy
larger partitions.
2. It is difficult to determine the number and size of fixed partitions.
3. If process wants to grow dynamically then complexities arise because partitions are
not expandable.

Degree of Multiprogramming
The degree of multiprogramming describes the maximum number of processes residing in the
memory simultaneously which a single-processor system can accommodate efficiently. The
primary factor affecting the degree of multiprogramming is the amount of memory available
to be allocated to executing processes. Other factors affecting the degree of
multiprogramming are program I/O needs, program CPU needs and memory & disk access
speed.

Relocation and Protection


When a program is linked (i.e., the main program, user written procedures, and library
procedures are combined into a single address space), the linker must know at what address
the program will begin in memory. Relocation is the process of assigning load addresses or
absolute addresses to various parts of a program and adjusting the code and data in the
program to reflect the assigned absolute addresses.

One possible solution is to perform relocation during loading by actually modifying the
instructions when the program is loaded into memory. Programs loaded into partition 1 have
100K added to each address, programs loaded into partition 2 have 200K added to addresses,
and so forth (refer figure 2(a)). There are two problems associated with this approach:

1. The code performing relocation must be able to distinguish between which program
words are addresses (to be relocated) and which are opcodes, constants, or other items
that must not be relocated.

2. Relocation during loading does not solve the protection problem. Because programs
in this system use absolute memory addresses hence there is no way to prevent a
program from building an instruction that reads or writes any word in memory which
may not be part of the process. In multiuser systems, it is highly undesirable to let
processes read or write memory belonging to other users.
An alternative solution to both the relocation and protection problems is to equip the machine
with two special hardware registers, called the base and limit registers. When a process is
scheduled, the base register is loaded with the address of the start of its partition, and the limit
register is loaded with the length of the partition. Every memory address generated
automatically has the base register contents added to it before accessing the actual memory.
Thus if the base register contains the value 100K, a CALL 100 instruction is effectively
turned into a CALL 100K+100 instruction, without the instruction itself being modified.
Addresses are also checked against the limit register to make sure that they do not attempt to
address memory outside the current partition. The hardware protects the base and limit
registers to prevent user programs from modifying them.

A disadvantage of this scheme is the need to perform an addition and a comparison on every
memory reference. Comparisons can be done fast, but additions are slow.
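
Following is a minimal sketch (Python, illustrative only) of the base/limit translation: the base register is added to every address and the limit register is used for the protection check.

def translate(address, base, limit):
    """Relocate and protect a memory reference using base and limit registers.

    base  : starting physical address of the process's partition
    limit : length of the partition
    """
    if address >= limit:                  # comparison against the limit register
        raise MemoryError("address outside the current partition")
    return base + address                 # addition of the base register

K = 1024
# The CALL 100 example from the text, with the process loaded at 100K in a 100K partition:
print(translate(100, base=100 * K, limit=100 * K))    # 102500, i.e. 100K + 100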

Multiprogramming with Variable Partition


In this approach the partitions may be variable in number and may be variable sized, thus
allowing flexibility to accommodate dynamically growing processes and avoiding internal
fragmentation for improving memory utilization. But external fragmentation still remains a
problem when holes (or free memory) of unusable sizes occur in between occupied memory
partitions that normally happens because of processes leaving out of memory after
completion.

Swapping
Swapping refers to bringing in each process in its entirety, running it for a while and then
putting it back on disk. This ensures higher flexibility in managing memory among
competing processes and avoiding starvation in case the total memory requirement by all the
processes exceeds the existing physical memory available.

Example of Swapping
The following figure displays the following sequence of events:
1) Initially only process A is in memory.
2) Processes B and C are created or swapped in from disk.
3) Process A is swapped out to disk because enough space is not there in memory for
incoming process D.
4) Process D comes in the memory
5) Process B leaves out of the memory after completion
6) Finally Process A is again swapped in.

Since A is now at a different location in memory, addresses contained in it must be relocated,
either by software when it is swapped in or (more likely) by hardware during program
execution.
Fragmentation
Memory fragmentation occurs when a system contains memory that is technically free but the
computer can’t utilize such memory. It occurs when the memory allocation or the
deallocation of previously occupied memory segments leads to creation of blocks of free
memory that are too small and/or too isolated to be used, thereby making such free memory
unusable for future processes.

External Fragmentation: External fragmentation happens when some memory is allocated
to a process and a small piece of memory is left over that cannot be effectively used. It may
also happen when processes release memory back to the pool of free memory but the freed
blocks are separated by pieces of memory allocated to live processes. If too much external
fragmentation occurs, the amount of usable memory is drastically reduced. In case of external
fragmentation, the free memory holes are separated from one another by live processes.
Hence, even when the total free memory space is sufficient to satisfy a request, the request
cannot be satisfied because the free memory is not contiguous.

Internal Fragmentation: Internal fragmentation is the space wasted inside allocated memory
blocks as a result of restriction on the allowed sizes of allocated blocks. Allocated memory
block may be slightly larger than requested memory which leads to internal fragmentation.
This wasted memory is the memory internal to a partition or allocated memory block, and
cannot be used by other processes.

Memory Compaction
Compaction is a method used to overcome the external fragmentation. All free blocks of
memory are brought together as one large block of free space. The issues associated with
memory compaction are:
1) It requires dynamic relocation of processes.
2) It requires additional and significant CPU time to perform this activity.

Dealing with dynamically growing processes


While allocating memory to a process, it is important to identify how much memory should
be allotted to the process. If processes are created with a fixed size that never changes, then
the allocation is simple: the operating system allocates exactly what is needed, no more and
no less. But many times the memory requirements of processes are decided dynamically, i.e.
the processes may grow dynamically. Following are few approaches for dealing with memory
requirements of dynamically growing process:
1) If an adjacent hole is available, the same can be given to the process.
2) If no adjacent hole is present, process has to be moved to a bigger hole or another
process will have to be swapped out to create a bigger hole. But if swap area on the
disk is also full, process will have to get blocked.
3) Room for growth can be provided well in advance to the processes, in anticipation of
future memory requirement.

Memory Management with Bitmaps


With a bitmap, memory is divided up into allocation units, perhaps as small as a few words or
as large as several kilobytes. One bit per allocation unit is reserved to indicate whether the
allocation unit is busy or free. Deciding the size of the allocation unit is a critical aspect here.

 If the size of the allocation unit is too small, the size of the bit map will be very large,
hence will require more memory to store the bitmap itself.
 And if the size of the allocation unit is too large, then significant memory will be
wasted in the last allocation unit of memory allotted to the processes having size
which is not multiple of size of allocation unit.

The main problem with using bitmaps is that when it has been decided to bring a k unit
process into memory, the memory manager must search the bitmap to find a run of k
consecutive 0 bits in the map. Searching a bitmap for a run of a given length is a slow
operation as compared to searching a free memory block using the other approach employing
linked list as discussed in next section.
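
Following is an illustrative Python sketch of the bitmap search described above: the allocator scans for a run of k consecutive 0 bits and marks those allocation units as busy.

def allocate(bitmap, k):
    """Find a run of k free (0) allocation units, mark them busy (1) and return the start index.

    Returns None if no sufficiently long run of free units exists.
    """
    run_start, run_len = None, 0
    for i, bit in enumerate(bitmap):
        if bit == 0:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == k:
                for j in range(run_start, run_start + k):
                    bitmap[j] = 1          # mark the units as allocated
                return run_start
        else:
            run_len = 0
    return None

bitmap = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0]    # one bit per allocation unit
print(allocate(bitmap, 3))                  # 2 : units 2..4 are allocated
print(allocate(bitmap, 4))                  # 6 : units 6..9 are allocated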

Memory Management with Linked List


Another way of keeping track of memory is to maintain a linked list of allocated and free
memory segments, where a segment is either a process or a hole between two processes. Each
entry in the list specifies whether it corresponds to a hole (H) or process (P), the address at
which it starts, the length, and a pointer to the next entry.

The segment list may be kept sorted by address. Sorting this way has the advantage that when
a process terminates or is swapped out, updating the list is straightforward: if the
terminating process vacates memory which is adjacent to a hole on one or both sides, the
adjacent hole(s) can be merged with the newly vacated memory to create a single larger hole.
The following figure explains the four possibilities:
There are several memory allocation algorithms which may be used to decide how to satisfy a
request of size n from the list of free holes. Following are some of the algorithms:

First Fit: It means allocate the first hole that is big enough. Searching starts from the
beginning of the set of holes and stops as soon as a free hole which is large enough
to satisfy the demand is found. It is the fastest approach because it searches as little as possible.

Next Fit: A minor variation to First Fit is Next Fit where instead of starting search from
beginning, the search starts from where previous search ended.

Best Fit: It means allocate the smallest hole that is big enough to satisfy the memory
requirement. Best fit is slower than first fit because entire list must be searched if the list is
not sorted as per hole size. This strategy produces the smallest leftover hole.

Worst Fit: It means allocate the largest hole by searching the entire list unless it is sorted by
hole size. This strategy produces the largest leftover hole which may be more useful than
smaller leftover hole from the best fit approach.

Simulation results have shown that both first fit and best fit are better than worst fit in terms
of storage utilization. Neither first fit nor best fit is clearly better than the other in terms of
storage utilization, but first fit is generally faster.
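
Following is an illustrative Python sketch contrasting first fit and best fit over a list of free holes, each hole represented as a (start address, length) pair:

def first_fit(holes, size):
    """Index of the first hole that is big enough, or None."""
    for i, (start, length) in enumerate(holes):
        if length >= size:
            return i
    return None

def best_fit(holes, size):
    """Index of the smallest hole that is big enough, or None."""
    best = None
    for i, (start, length) in enumerate(holes):
        if length >= size and (best is None or length < holes[best][1]):
            best = i
    return best

# Free holes as (start address, length) pairs, kept sorted by address.
holes = [(0, 5), (20, 12), (45, 8), (70, 30)]
print(first_fit(holes, 8))    # 1 : the 12-unit hole at address 20
print(best_fit(holes, 8))     # 2 : the 8-unit hole at address 45 (smallest that fits)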

There are some variations which can be made to the approach of memory management with
Linked List.

 Single list for process and holes: By maintaining single list for processes and holes
the implementation becomes simpler, and also merging adjacent holes become easier
when a new hole is created adjacent to existing hole. But with this approach, effort is
unnecessarily wasted in checking whether the memory corresponding to the linked list
node is a hole or a process. Also, the list to be searched for appropriate sized hole
would have been shorter if it contained only holes and no processes, thereby making
such searching faster.
 Separate lists for processes and holes:

o By maintaining separate lists for holes and processes, we can sort the list of
holes as per hole size, thereby making it more efficient to perform best fit.
In fact, in this case both first fit and best fit will perform equally fast.

o Another algorithm, Quick Fit, can be implemented by maintaining multiple lists
for holes, one each for some common sizes and one to keep odd sized holes.
Thus, if a request for a common sized hole arises, it can be granted quickly by
assigning a hole directly from the list maintained for that size.

o Though keeping holes and processes in separate lists may make searching for an
appropriately sized hole more efficient, merging adjacent holes becomes difficult
when memory is newly vacated, i.e. when a new hole is created.

References:

1. A.S. Tanenbaum, Modern Operating Systems (2nd ed.), Prentice-Hall of India.


2. A. Silberschatz, P.B. Galvin and G. Gagne, Operating System Concepts (7th ed.), John
Wiley & Sons, Inc.
VIRTUAL MEMORY

The basic idea behind virtual memory is to allow the execution of processes that are not
completely in the memory. The main advantage of this memory management technique is
that the combined size of the program, data, heap and stack may exceed the amount of
physical memory available for it. The operating system keeps only those parts of the process
in main memory which are currently in use, and the rest of the process is kept on the disk.
The process is divided into pieces which are being swapped between disk and memory as
needed. For example, a 16-MB program can run on a 4-MB machine by carefully choosing
which 4 MB has to be kept in memory at each instant of time.

Virtual memory involves the separation of logical memory as perceived by users from
physical memory. The virtual address space of a process refers to the logical (or virtual) view
of how a process is stored in memory. Memory Management Unit (MMU) translates virtual
address to physical address.

Paging
In Paging approach, the virtual address space of the process is divided up into units called
pages. The corresponding units in the physical memory are called page frames. The pages
and page frames are always of the same size. Transfers between RAM and disk are always in
units of a page. Page sizes of 512 bytes to 64KB have been used in real systems.
A page table per process is maintained which stores the mapping of page to page frame for
each virtual page of the process. The page no. is used as an index into page table, which can
be used to find the page frame number corresponding to this virtual page. A bit called
Present/Absent bit is maintained in each page table entry to indicate whether a virtual page is
available in the physical memory or not. When a page is referenced and the corresponding
Present/Absent bit is 0, MMU notices that the page is unmapped i.e. it is not available in the
physical memory and causes the CPU to trap to the OS. This trap is called Page Fault. When
page fault occurs the operating system takes appropriate action to make the page available in
the physical memory by identifying a free page frame (if there is any) or by replacing an
existing page residing in a page frame in the physical memory with the required page to be
brought in.

With respect to Paging, there are two major issues that are required to be faced:
1) The page table can be extremely large and therefore keeping the entire page table in
memory is an issue. To address this issue solutions like caching, multilevel page
tables etc. exist.
2) The mapping (for all processes) or conversion of virtual memory address to physical
memory address must be fast.

For making the memory mapping (translating logical/virtual address to physical address) fast,
an option could be to keep separate hardware register to store the page table of the currently
running process. But this approach may be very expensive for large page tables.
Alternatively, page table would be required to be stored in memory, thereby requiring one or
more (in case of multilevel page table) extra memory references to read page table entries for
each required memory reference.

The following figure 2 represents conversion of virtual memory address 8196 to the
corresponding physical memory address using page table in a system having 64KB virtual
address space and 32KB physical memory. Each memory address addresses each byte and the
size of each page is 4KB.

Since the virtual address space is 64KB = 2^16 bytes in size, the virtual address uses
16 bits to address each byte. Similarly, since the physical memory is 32KB = 2^15 bytes in
size, the physical address uses 15 bits to address each byte. Also, each page is 4KB while the
virtual address space is 64KB, hence there are 64KB/4KB = 16 pages (16 = 2^4). So, each of
these pages can be identified using the most significant 4 bits of the virtual address. Further,
each page frame is also 4KB while the physical address space is 32KB, hence there are
32KB/4KB = 8 page frames (8 = 2^3). So, each of these page frames can be identified using the
most significant 3 bits of the physical address. Also, each page/page-frame is 4KB = 2^12 bytes
in size, therefore 12 bits of the virtual address/physical address are required to represent the
offset within a page/page-frame.
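
Following is an illustrative Python sketch of this translation. The page table contents are assumed for the purpose of the example only (virtual page 2 is taken to be resident in page frame 6, so that virtual address 8196 translates to physical address 24580):

PAGE_SIZE = 4 * 1024                 # 4KB pages, so a 12-bit offset
OFFSET_BITS = 12

def translate(virtual_address, page_table):
    """Translate a 16-bit virtual address using a per-process page table.

    page_table maps a virtual page number to (present_bit, page_frame_number).
    """
    page = virtual_address >> OFFSET_BITS           # most significant 4 bits: page number
    offset = virtual_address & (PAGE_SIZE - 1)      # least significant 12 bits: offset
    present, frame = page_table[page]
    if not present:
        raise RuntimeError("page fault")            # MMU traps to the operating system
    return (frame << OFFSET_BITS) | offset          # most significant 3 bits: frame number

# Assumed mapping: only virtual page 2 is resident, in page frame 6.
page_table = {page: (False, None) for page in range(16)}
page_table[2] = (True, 6)
print(translate(8196, page_table))    # 8196 is page 2, offset 4 -> 6*4096 + 4 = 24580
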
Structure of Page Table
The per-process page table contains page table entries for each page belonging to the
virtual/logical address space of the process. In general, the following information are
contained in page table entries:
1) Page frame number: This field contains the page frame number where the page
resides in the physical memory.
2) Present/absent bit: If this bit is 1, it indicates that the page is available in the memory
and this entry can be used for virtual/logical to physical address conversion. If it is 0,
the virtual page (to which the entry belongs) is not currently in the physical memory.
Accessing a page table entry with this bit set to 0 causes a page fault.
3) Protection bits: One or more bits are maintained to indicate the kind of access
permitted for the page. In the simplest form, this field can contain 1 bit, with value 0
for read/write and value 1 for read only. A more sophisticated arrangement is having 3
bits, one bit each for enabling reading, writing, and executing the page.
4) Modified bit: When a page (which is brought in physical memory) is written to, the
Modified bit in the corresponding page table entry is set. This bit may be used when
the operating system decides to reclaim or vacate a page frame. If the page in the
vacated page frame has been modified (i.e., is “dirty”), it must be written back to the
disk; otherwise (i.e. when the page is not “dirty”) the disk copy is still valid and hence
there is no need to write the page back to the disk. The bit is sometimes called the dirty
bit, since it reflects the page’s state i.e. whether it is dirty or not.
5) Referenced bit: The Referenced bit is set whenever a page is referenced, either for
reading or writing or executing. This bit may be used when the operating system
chooses a page to evict (so as to vacate a page frame) when a page fault occurs. This
is because pages that are not being used are better candidates for eviction as compared
to pages that are being currently used and this referenced bit helps in deciding the
same.
6) Caching bit: This bit indicates whether caching is to be disabled for the page.

Note: The page no. is used as the index for the page table during logical/virtual to physical
address conversion.

References:

1. A.S. Tanenbaum, Modern Operating Systems (2nd ed.), Prentice-Hall of India.


2. A. Silberschatz, P.B. Galvin and G. Gagne, Operating System Concepts (7th ed.),
John Wiley & Sons, Inc.
Figure: Multilevel Page Table
Page Replacement Algorithms
Page replacement algorithms are the techniques using which the operating system decides
which page in memory should be swapped out, written back to disk (if required) when
another page which is not already in the physical memory, needs memory to be allocated.
Page replacement is done whenever a page fault occurs and a free page frame is not directly
available in the physical memory. Therefore, operating system decides which page from the
physical memory is to be swapped out to vacate space for the incoming new page.

Reference String: The string of memory references (pages) is called reference string.
Reference strings are generated artificially or by tracing a given system and recording the
address of each memory reference.

Optimal Page replacement algorithm


 An optimal page replacement algorithm has the lowest page-fault rate of all
algorithms.
 It replaces the page that will not be used for the longest period of time.
 It is generally not practically feasible to implement this algorithm on normal real
systems because it requires knowledge of the entire reference string in advance. But it
may be used in case of simulations, where the reference string can be obtained by the
first run of the simulation.

Least Recently Used (LRU) algorithm


 The page which has not been used for the longest time in main memory is the one
which will be selected to be evicted out for performing page replacement in LRU.
 It is an approximation to optimal algorithm for practical purposes, which keeps a list
of pages currently in memory and replace pages by looking back into time.
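
Following is a minimal Python sketch of LRU replacement (illustrative only), using an ordered dictionary as the recency list:

from collections import OrderedDict

class LRUMemory:
    """Holds at most `frames` pages; evicts the least recently used page on overflow."""

    def __init__(self, frames):
        self.frames = frames
        self.pages = OrderedDict()              # least recently used page first

    def reference(self, page):
        """Return True if this reference causes a page fault."""
        if page in self.pages:
            self.pages.move_to_end(page)        # the page becomes most recently used
            return False
        if len(self.pages) >= self.frames:
            self.pages.popitem(last=False)      # evict the least recently used page
        self.pages[page] = True
        return True

mem = LRUMemory(frames=3)
faults = sum(mem.reference(p) for p in [7, 0, 1, 2, 0, 3, 0, 4])
print(faults)    # 6 page faults for this reference string with 3 page frames
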
First In First Out (FIFO) algorithm
 It is one of the simplest page replacement algorithm to implement. As per this
algorithm, the oldest page in memory is selected to be evicted out during page
replacement.
 It can be easily implemented by maintaining a queue of pages brought in memory,
where pages are evicted out (for replacement) from head of the queue and new pages
are added at the tail.
 This algorithm may replace a heavily referenced page, thereby leading to an increase
in overall number of page faults.

Second Chance Page Replacement Algorithm


 A simple modification to FIFO that avoids the problem of throwing out a heavily used
page by inspecting the R bit of the oldest page.
 If R bit is 0, the page is replaced as it is both oldest and not referenced. If R bit is 1,
the bit is cleared, the page is put onto the end of the queue of pages as though it had
just arrived in memory.
 It is called second chance because this algorithm gives second chance to the page
before it is being replaced.
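
Following is an illustrative Python sketch of the second chance victim selection just described:

from collections import deque

def second_chance_evict(queue, r_bit):
    """Select a victim page.

    queue : deque of page numbers, oldest at the left
    r_bit : dict mapping page number -> referenced (R) bit
    """
    while True:
        page = queue.popleft()
        if r_bit.get(page, 0) == 0:
            return page                 # oldest and not referenced: evict it
        r_bit[page] = 0                 # clear R and give the page a second chance
        queue.append(page)              # treat it as though it had just arrived

queue = deque([3, 5, 8])                # page 3 is the oldest
r_bit = {3: 1, 5: 0, 8: 1}
print(second_chance_evict(queue, r_bit))    # 5 : page 3 gets a second chance, page 5 is evicted
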
Not frequently Used (NFU) algorithm
 A counter associated with each page is maintained to count the number of times the
page has been referenced so far.
 Page with the smallest count is the one which will be selected for replacement.
 This algorithm suffers in the situation where there is a page which is used heavily
during the initial phase of a process, but then is never used again. Even in such a
situation the initially heavily used page will not be evicted due to its higher counter
value, while actually it would have been a good choice for eviction as it is not going to
be used again.

Not Recently Used


 It uses the Referenced bit R and Modified bit M of page table entry which are
initialized to 0 for each page.
 R bit is set when the page is referenced in the last clock interval, and M bit is set if the
page has been modified since it was brought from the disk to the memory.
 Periodically, the R bit is cleared by the operating system.
 When a page fault occurs, the operating system inspects all the pages and removes a
page at random from the lowest numbered nonempty class of the following four
classes. The classes are defined based on R and M bits as follows:
Class 0: not referenced (R = 0), not modified (M = 0)
Class 1: not referenced (R = 0), modified (M = 1)
Class 2: referenced (R = 1), not modified (M = 0)
Class 3: referenced (R = 1), modified (M = 1)

Demand Paging
Demand paging follows that pages should only be brought into memory if the executing
process demands them; pages that are not accessed till an instant of time are thus not loaded
into the physical memory until they are finally referenced. This is often referred to as lazy
evaluation as only those pages which are demanded by the process are brought
from secondary storage to main memory and not the other pages.

Segmentation
The virtual memory discussed so far is one-dimensional because the virtual addresses go
from 0 to some maximum address, one address after another. For many problems, having two
or more separate virtual address spaces may be much better than having only one, so as to
support logical user’s view of memory. This need is catered in segmentation technique of
virtual memory. In segmentation, the processes are allowed to be broken up into logically
independent address spaces to aid sharing, protection and dynamic process growth/shrinkage.

Segmentation provides a logical address space which is a collection of segments. Each


segment is a linear sequence of addresses and has a name and length. The length of each
segment may be anything to some maximum allowed. The virtual address specifies both the
segment name and the offset within the segment. For simplicity of implementation, segment
number is used instead of segment name. The logical address thus consists of a <segment
number, offset>. It is to be noted that a segment is a logical entity, which the programmer is
aware of and uses as a logical entity.

To convert the virtual or logical address to the corresponding physical memory address, a
mapping is maintained between the logical two-dimensional space to physical one-
dimensional space. This mapping is stored in the segment table. Each entry in the segment
table has a segment base and a segment limit. The segment base contains the starting physical
address where the segment resides in physical memory, whereas the segment limit specifies
the length of the segment.

Use of segment table for converting logical address to corresponding physical address: A
logical address consists of two parts: a segment number s, and an offset in that segment, d.
The segment number is used as an index into the segment table. The offset d must lie between 0
and the segment length (limit), and if it does not, a trap to the operating system is raised,
indicating an attempt to address beyond the end of the segment. If the offset is valid, it is added
to the segment base to produce the physical memory address.
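
Following is an illustrative Python sketch of this lookup; the segment table contents (bases 1400 and 6300, limits 1000 and 400) are assumed only for the example:

def translate(segment, offset, segment_table):
    """Map a logical address <segment, offset> to a physical address.

    segment_table maps a segment number to (base, limit), where base is the starting
    physical address of the segment and limit is its length.
    """
    base, limit = segment_table[segment]
    if offset < 0 or offset >= limit:
        raise RuntimeError("trap: addressing attempt beyond end of segment")
    return base + offset

segment_table = {0: (1400, 1000), 1: (6300, 400)}      # assumed contents
print(translate(1, 53, segment_table))                   # 6300 + 53 = 6353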

Example for segmentation: The user may logically define different segments for following
different components of the process (say corresponding to the user’s C program). When the
user’s C program is compiled, the compiler constructs following segments reflecting the
input program as per user’s need:

1) The code
2) Global variables
3) The heap (from which memory is allocated)
4) The stacks (used by each thread)
5) The standard C library

The following figure shows pictorially another example of logical address space as a
collection of segments and also displays the use of segment table to map the logical address
space to the corresponding physical address space:
Advantages of Segmentation
1) Different segments may and usually do, have different lengths. Moreover, segment
length may change during execution. Because each segment constitutes a separate
address space, therefore, different segments can grow or shrink independently,
without affecting each other.
2) Segmentation facilitates sharing procedures or data among several processes. A
common example is the shared library.
3) Because each segment forms a logical entity of which the programmer is aware,
different segments can have different kinds of protection. Example: a procedure
segment can be given execute only permission while data segment may be given read
and write.
4) If different procedures occupy different segments each starting with address 0, the
linking up of procedures compiled separately is greatly simplified as a procedure call
to the procedure in segment n will use the two-part address (n, 0) to address word 0.
Further, if the procedure in segment n is subsequently modified and recompiled, no
other procedures need be changed.

Comparison between Paging and Segmentation


Following gives a comparison between two virtual memory techniques – paging and
segmentation:

1) Paging technique does not require programmer to intervene or be aware of how the
pages are internally managed whereas the programmer is aware of the segments
constituting the process in case segmentation technique is being used.
2) In case of paging there is only one linear virtual address space while in case of
segmentation normally there are many linear virtual address spaces, one for each
segment. Thus, in case of segmentation we also call it 2-dimensional virtual address
space.
3) In case of paging the dynamic growth or shrinkage of different sections of the process
(stack, heap etc.) cannot be suitably accommodated (if room for growth in the process
itself is not provided) whereas segmentation can accommodate dynamically growing
and shrinking segments.
4) Segmentation supports sharing of procedures and/or data while paging does not.
5) Procedures and data cannot be distinguished and protected separately using paging,
whereas using segmentation procedures and data can be distinguished and be
separately protected as well.
6) Paging technique was invented to get a large linear virtual address space without
having to buy more physical memory. But, segmentation was invented to allow
programs and data to be broken up into logically independent address spaces and to
aid sharing and protection. So, there is a basic difference in objective for invention of
these different kinds of virtual memory techniques.

Thrashing

Thrashing occurs when a system spends more time processing page faults than executing
process transactions. While processing page faults is necessary in order to appreciate the
benefits of virtual memory, thrashing has a negative effect on the system.

As the page fault rate increases, the queue at the paging device grows, resulting in
increased service time for a page fault. As a result, the system spends more and more time
waiting for the paging device, and therefore the CPU utilization, system throughput and
system response time decrease, resulting in below optimal performance of the system.

Thrashing becomes a greater threat as the degree of multiprogramming of the system
increases.
This graph shows that there is a degree of multiprogramming that is optimal for system
performance such that CPU utilization reaches a maximum before a swift decline happens on
further increasing the degree of multiprogramming. This is because thrashing occurs in the
over-extended system. This indicates that controlling the load on the system is important for
avoiding thrashing. In the system represented by the graph, it is important to maintain the
multiprogramming degree that corresponds to the peak of the graph.

References:

1. A.S. Tanenbaum, Modern Operating Systems (2nd ed.), Prentice-Hall of India.


2. A. Silberschatz, P.B. Galvin and G. Gagne, Operating System Concepts (7th ed.),
John Wiley & Sons, Inc.
FILES

Files

A file is a named collection of related information that is recorded on secondary storage such
as magnetic disks, magnetic tapes and optical disks. In general, a file is a sequence of bits,
bytes, lines or records whose meaning is defined by the file’s creator and user. The operating
system hides the physical properties of its storage devices and provides user with the logical
storage entity in the form of files.

File Structure

 Disk systems typically have a well-defined block size determined by the size of a
sector. All blocks are of the same size.
 All disk I/O is performed in units of one block (physical record).
 Since the physical record size may differ from the logical record size, a number of
logical records are packed into each physical block.
 The logical record size, physical block size, and packing technique determine how
many logical records are in each physical block.
 Packing can be done either by the user's application program or by the operating
system. Thus, the file may be considered to be a sequence of blocks and all the basic
I/O functions operate in terms of blocks.
 Because disk space is always allocated in blocks, some portion of the last block of
each file is generally wasted which amounts to internal fragmentation.
 The larger the block size, the greater will be the internal fragmentation.
 There are various disk block allocation methods supported by different operating
systems to allocate disk blocks for storing file information and data.
 To access a logical record, conversion from logical records to physical blocks is done
and then the appropriate physical block is accessed.

File Attributes
A file's attributes vary from one operating system to another but typically consist of the
following:

 Name: The symbolic file name is the only information kept in human readable form.
User accesses a file using this file name along with specifying its logical path.
 Identifier: This is a unique tag, which usually a number, identifies the file uniquely
within the file system; it is the non-human-readable name for the file used inside the
file system.
 Type: This information specifies the type of the file. It is needed for systems that
support different types of files.
 Location: This information is a pointer to a device and to the physical location of the
file on that device.
 Size: The current size of the file (in bytes, words, or blocks) and possibly the
maximum allowed size are contained in this attribute.
 Protection: This attribute contains access-control information of the file which
determines who can perform read, write, execute operations on the file.
 Time, date, and user identification: This information about time, date and user may
be kept for file creation, last modification, and last use of the file. This data can be
useful for protection, security, and usage monitoring.

The information about all files is kept in the directory structure, which also is stored on
secondary storage. Typically, a directory entry consists of the file's name and its unique
identifier. The identifier in turn locates the other attributes of the file. Because directories,
like files, must be non-volatile, they are stored on the device (secondary storage) and brought
into memory in pieces, as and when required.

File Type
File type refers to the ability of the operating system to distinguish different types of file.
There may be different classifications of types of files. One classification of files could be as
text files, source files and binary files etc. The file’s structure depends on its type.
 A text file is a sequence of characters organized into line (and possibly pages).
 A source file is a sequence of subroutines and functions which may be written to solve
a problem or achieve some functionality.
 An object file is a sequence of bytes organized into blocks understandable by the
system’s linker.
 An executable file is a series of code sections that the loader can bring into memory
and execute.

Another classification of files supported by OS like UNIX is:

Ordinary or Regular files


 These are the files that contain user information.
 These may have text, data or executable program.
 The user can apply various operations on such files like add, modify, delete
information from the file or even remove the entire file.

Directory files
These files contain list of file names and other information related to files contained in the
directories. Operations like creation, renaming, removal of files (contained in the directory)
affect the content of the directory files. The operation of searching can also be done on
directory files to search within directories.
Special files
 These files are also known as device files.
 These files represent physical device like disks, terminals, printers, network interface,
tape drive etc.
 These files are of two types
o Character special files – represent physical devices in which data is handled
character by character as in case of terminals or printers.
o Block special files - represent physical devices in which data is handled in
blocks as in the case of disks and tapes.

File Operations
Following are the six basic operations that can be performed on files. The operating system
can provide system calls to create, write, read, reposition, delete, and truncate files:

Creating a file: File creation operation involves two steps:


1. Finding free space in the file system for the file and allocating space to the file for
storing its attributes and data.
2. Adding an entry for the new file in the directory.

Writing a file: To perform write in a file, name with path of the file and the information to be
written to the file are sent as input to the system call. Following are the steps involved in
write operation:
1. The system searches the directory to find the file's location.
2. The system keeps a write pointer to the location in the file where the next write is to
take place. The required data is written into the file at the position of the write pointer
and if space is required then storage is also allocated. In addition, the write pointer
and the file attributes are updated appropriately.

Reading a file: To perform read in a file, name with path of the file and address of memory
location where the read data should be put are sent as inputs. Following are the steps involved
in read operation:

1. The system searches the directory to find the file's location.


2. The system keeps a read pointer to the location in the file where the next read is to
take place. Reading starts from the read pointer’s location and the data read from the
file is placed at the memory address provided as input for the read operation. Further,
the read pointer is updated appropriately.

Because a process is usually either reading from or writing to a file at the same location, the
current operation location can be kept as a per-process current file-position pointer. Both
the read and write operations use this same pointer, saving space and reducing system
complexity.
Repositioning within a file: Repositioning within a file need not involve any actual I/O. This
file operation is also known as a file seek. Following are the steps involved in this operation:

1. The directory is searched to find the appropriate entry for the file.
2. The current-file-position pointer is repositioned to a given value.

Deleting a file: To delete a file, the steps performed are:

1. The directory for the named file is searched.


2. All disk space allocated to the file (data as well as file attribute information) is
released, so that it can be reused by other files in future.
3. The corresponding directory entry is removed.

Truncating a file: The user may want to erase the contents of a file but keep its attributes.
Rather than forcing the user to delete the file and then recreate it, this operation allows all
attributes of the file to remain unchanged—except for file length which is reset to length zero
and all disk space containing data of the file is released.

These six basic operations comprise the minimal set of required file operations. Other
common operations include appending new information to the end of an existing file, copying
an existing file into another etc. The primitive operations can be combined to perform such
other file operations. For instance, we can create a copy of a file by creating a new file and
then reading from the old and writing to the new.
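As a concrete illustration, the following is a minimal sketch of such a copy operation using POSIX system calls (open, read, write, close); error handling is kept to a minimum and the function name is illustrative only:

#include <fcntl.h>
#include <unistd.h>

int copy_file(const char *src, const char *dst)
{
    char buf[4096];
    ssize_t n = 0;

    int in = open(src, O_RDONLY);                  /* open the existing (old) file */
    if (in < 0)
        return -1;
    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);   /* create the new file */
    if (out < 0) {
        close(in);
        return -1;
    }
    while ((n = read(in, buf, sizeof buf)) > 0)    /* read from the old ... */
        if (write(out, buf, (size_t)n) != n) {     /* ... and write to the new */
            n = -1;
            break;
        }
    close(in);
    close(out);
    return n < 0 ? -1 : 0;
}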

Clearly, most of file operations involve searching the directory. The open() system call is
used to avoid multiple or repeated searches of the same file in the directory again and again
for several operations done on the same file by the same or different processes. The operating
system keeps a small table, called the open-file table, containing information about all open
files. The open() system call returns a pointer (or index) to the entry in open file table. This
pointer (or index), and not the actual file name, is used for all subsequent I/O operations
related to the file thereby avoiding repeated searching in the directory. The open() system call
also accepts access-mode information: create, read only, write only, read-write, append-only
etc. When the file is no longer being actively used, it is closed by the process, and the
operating system removes its entry from the open-file table. This open-file table is
maintained for each process. Further, in most systems more than one process can open the same file at
the same time, and therefore the operating system uses two levels of internal open-file tables:
a) system-wide: Entries in the system-wide open-file table contain process-independent
information like location of the file on disk, access dates, file size, copy of the FCB (file
control block) of the opened file, file-open count etc.
b) per-process: Entries in the per-process open-file table contain information related to the use
of the file by the process, i.e. process-dependent information like the current file-position
pointer, access rights to the file for the process, accounting information etc.,
along with a pointer to the corresponding system-wide open-file table entry. When a file
is opened, a pointer or index of the per-process open-file table entry is returned to the
process.

The open() system call always adds an entry in the per-process open-file table. An entry in the
system-wide open-file table is inserted only if it does not exist already, i.e. the file being
opened has not already been opened by any process. If the entry in the system-wide open-file
table already exists, then the open() system call increments the file-open count field in this entry
to indicate that one more process has now opened the file. The close() system call
removes the per-process open-file table entry and decrements this file-open count by 1 in the
system-wide open-file table entry. When the file-open count is reduced to 0, the system-wide
open-file table entry is also removed. Some important pieces of information associated with
open files are as follows:

1) Current File position pointer


 It is unique to each process operating on the file and is therefore kept in the
per-process open file table.
 It tracks the last read-write location within the file for this process.

2) File-Open Count: It tracks the opens and closes performed by different processes on
the file by recording the number of processes that currently have the file open. It is
stored in the system-wide open file table.

3) Disk location of the file: Most file operations require the system to access or modify
data within the file. The information needed to locate the file on disk is kept in
memory in the system-wide open file table so that the system does not have to read
this information from disk for each operation performed on the file by different
processes.

4) Access rights: Each process opens a file in an access mode. This information is stored
in the per-process open file table so that the operating system can allow or deny
subsequent I/O requests by the process.

Note:- Some operating systems provide facilities for locking an open file (or sections of a
file). File locks allow one process to lock a file and prevent other processes from gaining
access to it. File locks are useful for files that are shared by several processes. For example:
File locks can be useful for System log files which can be accessed and modified by a
number of processes in the system.
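As a small illustration (a sketch assuming Linux, where flock() provides advisory locks; the log-file path is made up), a process can take an exclusive lock before appending to a shared log file:

#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/var/log/app.log", O_WRONLY | O_APPEND);   /* hypothetical shared log */
    if (fd < 0) { perror("open"); return 1; }

    if (flock(fd, LOCK_EX) == 0) {          /* block until we hold an exclusive lock */
        write(fd, "log entry\n", 10);       /* other cooperating lockers are excluded */
        flock(fd, LOCK_UN);                 /* release the lock */
    }
    close(fd);
    return 0;
}

Note that such locks are advisory: they only protect against processes that also take the lock before accessing the file.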

References:

1. A. Silberschatz, P.B.Galvin and G. Gagne, Operating System Concepts (7th ed.), John
Wiley & Sons, Inc.
2. A.S. Tanenbaum, Modern Operating Systems (2nd ed.), Prentice-Hall of India.
File Access Mechanisms
File access mechanism refers to the manner in which the records of a file may be accessed.
Following are the several ways to access files:

 Sequential access
 Direct/Random access
 Indexed sequential access

Sequential access
 A sequential access is that in which the records are accessed in a sequence i.e. the
information in the file is processed in order, one record after the other.
 This access method is the most primitive one. Example: Compilers usually access
files in this fashion.
 A read operation—read next—reads the next portion of the file and automatically
advances the file position pointer, which tracks the I/O location in the file.
 Similarly, the write operation—write next—writes at the file position pointer and
advances the file position pointer to the end of the newly written material

Direct/Random access
 Random access allows the records to be directly accessed without any sequential
order hence the name ‘random’.
 A file is made up of fixed length logical records and this method allows programs to
read and write records in no fixed order.
 The direct-access method is based on a disk model of file, since disks allow random
access to any file block. For direct-access, the file is viewed as a numbered sequence
of blocks.
 Here, the file operations must be modified to include the block number as a
parameter. Thus, the read operation is read n, where n is the block number, rather than
read next, and the write operation is write n rather than write next. An alternative
approach is to retain read next and write next, as with sequential access, and to add an
operation seek n, where n is the block number. To achieve read n, the operation seek n
can be performed which is then followed by read next. Similarly, for write n, the
operation seek n followed by write next can be performed.

Note:
The block number provided by the user to the operating system is normally a relative
block number which is an index relative to the beginning of the file. The use of relative
block numbers allows the operating system to decide where the required block of file is
placed and helps to prevent the user from accessing portions of the file system that may
not be part of the accessed file. Let L be the logical record length; the request for record N
is turned into an I/O request for L bytes starting at logical location L * N within the file,
assuming the first record corresponds to N = 0, file blocks are sequentially allocated, and
L * N does not exceed the length of the file.
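A hypothetical sketch of this computation using POSIX lseek() and read(), where the record length L and the helper name are assumptions made for illustration:

#include <fcntl.h>
#include <unistd.h>

#define L 128                       /* assumed logical record length in bytes */

ssize_t read_record(int fd, long n, char rec[L])
{
    if (lseek(fd, (off_t)L * n, SEEK_SET) == (off_t)-1)   /* "seek n": move to byte L * n */
        return -1;
    return read(fd, rec, L);                              /* then "read next" */
}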

Indexed sequential access


 This mechanism is built up on the basis of direct access.
 An index is created for records in each file which contains pointers to physical blocks.
 To find a record, index is searched sequentially and the corresponding pointer is used
to access the file directly to reach to the desired record.
 Clearly, this mechanism can support both sequential access and direct access to
records, even if the data blocks are not stored sequentially on the disk.

Storage Structure & File System


A disk (or any storage device that is large enough) can be used in its entirety for a single file
system. Sometimes, though, it is desirable to place multiple file systems on a disk or to use
parts of a disk for a file system and other parts for other things, such as swap space or
unformatted (raw) disk space. These parts are known variously as partitions, slices,
minidisks etc... A file system can be created on each of these parts of the disk. The parts can
also be combined to form larger structures known as volumes, and file systems can be
created on these as well. For simplicity, a chunk of storage that holds a file system is referred
as a volume. Each volume that contains a file system must also contain information about the
files in the system.

For most users, the file system is the most visible aspect of an operating system. The file
system resides permanently on secondary storage, which is designed to hold large amount of
data permanently. To provide efficient and convenient access to the disk, the operating
system uses file system. It provides the mechanism for easy storage of and access to both data
and programs of the operating system and of all the users of the computer system. The file
system consists of two distinct parts: a collection of files, each storing related data, and a
directory structure, which organizes and provides information about all the files in the
system.

Directory Structure
To manage data in terms of a large number of files, we need to organize them. This
organization involves the use of directories. In a system with many files, the size of the
directory itself may be megabytes.

The directory can be viewed as a table or some other form of organization of data that
contains information about the files contained in that directory in the form of directory
entries, one for each file, and each directory entry helps in accessing the corresponding file’s
attributes like its name, location, size, type etc. The directory itself can be organized in many
ways. The operations performed on a directory include:
 Search for a file, thereby traversing a directory
 Create a file, thereby creating a new directory entry and hence writing in the
directory.
 Delete a file, thereby removing a directory entry and hence changing contents of the
directory.
 List all entries in a directory
 Rename a file and hence again changing the content of the corresponding directory
entry in the directory.
 Traverse the file system (accessing every directory and every file within a directory
structure)

Types of Directory Structure

Single-level Directory
In this type of directory structure, all the files are contained in the same directory.

Advantages
 Easy to implement
 Easy to support and understand.

Limitations
 Since all files are located in the same directory, they must have unique names. This is
neither viable nor convenient in the case of many files or multi-user systems.
 Even a single user on a single-level directory may find it difficult to remember the
names of all the files when the number of files increases.
 Searching also becomes more time consuming when files of multiple users are stored
in a single directory.

Two-level Directory
 A two-level directory can be thought of as a tree, or an inverted tree, of height 2. The
root of the tree is the system's master file directory (MFD). Its direct descendants
are the user file directories (UFDs). The descendants of the UFDs are the files
themselves. The files are the leaves of the tree.
 Each user has his own user file directory (UFD). The UFDs have similar structures,
but each lists only the files of a single user.
 When a user logs in, the MFD is searched. The MFD is indexed by user name or
account number, and each entry points to the UFD for that user. Since a user's files are
stored only in his respective UFD, when the user refers to a particular file, only
his own UFD is searched.
 With every new user, an entry in MFD is inserted and a UFD is created.
 Specifying a user name and a file name defines a path in the tree from the root (the
MFD) to a leaf (the specified file). Thus, a user name and a file name define a path.

Advantages
 This structure solves the name-collision problem among multiple users, while
this problem existed in single-directory structure.
 It facilitates protection against unauthorized access by other users.

Limitations
 It isolates one user from another which is a disadvantage if the users want to
cooperate on some task and wish to access one another’s files.
 However, file names must still be unique within a user's UFD, which is again
inconvenient when a user has a large number of files.

Tree-Structured Directory
 Users can create their own subdirectories and organize their files accordingly.
 Directory structure is organized as tree with root directory as the root.
 A directory contains a set of files and/or subdirectories. A directory is simply another
file, but treated in a special way.
 Every file has a unique path. Path names can be of two types:
o Absolute Path
o Relative Path
 Each process has a current working directory which can be redefined using system
call for change directory.
 An interesting policy decision in a tree-structured directory concerns how to handle
the deletion of a directory, especially when directory is not empty. Following may be
the options:
o To delete a directory, it must be empty so user must recursively delete all
subdirectories himself.
o When a directory is to be deleted, all the directory’s files and subdirectories
are also to be deleted.

Advantages
 It provides user with flexibility and convenience to organize files the way he may
wish to.
 With a tree-structured directory, users can be allowed to access, in addition to their
own files, the files of other users by using relative or absolute paths (if permission is
granted).

Limitations
 It is complex to maintain the directory structure to accommodate changes.
 It is complex to perform operations like searching, traversal etc. in tree-structured
directory.

Shared Files
When several users are working together on a project, they often need to share
files/directories. As a result, it is often convenient for a shared file/directory to appear
simultaneously in different directories which may belong to different users. A shared
directory or file will exist in the file system’s directory structure at two or more places but the
data will be stored on storage device only once. The file system itself is a Directed Acyclic
Graph, or DAG, rather than a tree so as to allow sharing of directories or files. It is important
to note that a shared file (or directory) is not same as two copies of the file (or directory) as in
case of shared file (or directory) its contents are not duplicated. If they were two separate
copies of the file (or directory) then changes done to one will not get reflected in the other
where as in case of shared file (or directory) since the changes are done to the same single file
(or directory), hence, the changes are visible from all places and to all users sharing the file
(or directory).

A common way to implement shared files is to create a new directory entry called a link
whenever a file is needed to be shared. A link is effectively a pointer to another file or
subdirectory. There are two types of links namely hard link and soft (or symbolic) link on
Unix/Linux OS.

Soft or Symbolic Link: A soft link does not contain any information about the destination
(target) file or its contents; instead, it simply contains a reference to the
destination/target file or directory in the form of an absolute or relative path. A soft link
contains a text string that is automatically interpreted and followed by the operating system as a
path to another file or directory. The soft link is basically a second file that exists
independently of its target. For a soft link, a new file is created with a new inode, which holds the
reference to the original or destination/target file. This is explained with the following
diagram:

Symbolic links are created with the "ln" command in Linux with the -s option. The syntax of the
command is:

ln -s file_path link_path
For example:
$ ln -s /usr/bin/gedit ~/Desktop/gedit

Hard Link: A hard link is a bit different from a soft link. For a soft link a new file
and a new inode are created, but for a hard link only an entry in the directory structure is created
for the file, and it points to the inode of the original file. This means no new inode is
created for a hard link. The following figure explains this:

So, a hard link references the inode of the original/target file directly on the disk, which means
that there should be a way to know how many hard links exist to a file. For this purpose, the
inode information contains a "links" field, which tells how many links exist to the
file. This information can be found using the following command:

$ stat <file name>

$ stat 01
Size: 923383 Blocks: 1816 IO Block: 4096 regular file
Device: 803h/2051d Inode: 12684895 Links: 3
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2012-09-07 01:46:54.000000000 -0500
Modify: 2012-04-27 06:22:02.000000000 -0500
Change: 2012-04-27 06:22:02.000000000 -0500

In this example, Links: 3 means that three directory entries (the original name plus two
additional hard links) refer to this inode.
A hard link can be created with the same "ln" command (without the -s option):
ln file_path link_path

For example:
$ ln /usr/bin/gedit ~/Desktop/gedit

Advantages and Applications of Soft and Hard Links


Following are the uses of Soft Links:
1. Link across filesystems: To link files across the filesystems, only symlinks/soft links
can be used.
2. Links to directory: To link directories, again only Soft links can be used. Hard link
can’t be created to a directory.

Overall, we may say that soft links are advantageous over hard links in some situations.
Firstly, hard links do not link paths on different volumes or file systems whereas soft links
may point to any file or directory irrespective of volumes on which the link and the target
resides. Secondly, hard links cannot be used to create links for directories.

In all other situations hard links may be used because of the following advantages offered by
hard links over soft links:

1. Storage Space: Hard links take a negligible amount of space, as no new
inodes are created while creating hard links. For a soft link we create a file which consumes
space for both its content and its inode (which may go up to a few KB depending upon the file
system).
2. Performance: Performance will be slightly better while accessing a hard link, as the
location of the file’s content on disk is directly accessed instead of going through
another file.
3. Moving file location: If the original file is moved to some other location in the same
file system, the hard link will still work, but soft link will fail in such a situation.
4. Safety: For ensuring safety of data, a hard link should be used, because with hard links the
data remains safe until all the links to the file are deleted. With soft links, the operating
system makes no check for existing soft links before deleting a file and its contents; thus, the
data is lost if the master (original) instance of the file is deleted.
5. Reliability: Hard links always refer to an existing file, whereas soft links may contain
an arbitrary path that does not point to anything. Soft links pointing to moved or non-
existing targets are sometimes called broken or orphaned or dead or dangling.

References:

1. A. Silberschatz, P.B.Galvin and G. Gagne, Operating System Concepts (7th ed.), John
Wiley & Sons, Inc.
2. A.S. Tanenbaum, Modern Operating Systems (2nd ed.), Prentice-Hall of India.
File System
File System

The file system resides permanently on secondary storage, which is designed to facilitate
storage of large amount of data permanently. Besides storage, file system facilitates efficient
and convenient access to the disk by allowing the data to be stored, located and retrieved
easily from the storage devices. The visible aspect of file system consists of two distinct
parts:

i) Collection of files, each storing related data, and


ii) A directory structure, which organizes and provides information about the logical
arrangement of all the files in the system (user’s view)

A file system poses two quite different design issues:

1. How the file system should look to the user i.e. defining a file to make it convenient to
the user to access the file and its attributes, the operations allowed on a file, and the
directory structure for organizing files.
2. Creating algorithms and data structures to map the logical file system onto the
physical secondary-storage devices to meet desired level of efficiency and
performance requirements.

File System Structure


The file system itself is generally composed of many different levels. A layered design is
depicted below:
The I/O control, consists of device drivers (or disk driver) and interrupt handlers to
transfer information between the main memory and the disk system. A disk driver acts as a
translator which takes the input regarding which physical block is to be accessed and outputs
low level, hardware-specific instructions that are used by the hardware controller (which
interfaces the I/O device to the rest of the system). The driver usually writes specific bit
patterns to special locations in the I/O controller's memory to tell the controller which device
location to act on and what actions to take.

The basic file system needs only to issue generic commands to the appropriate device driver
to read or write physical blocks on the disk. Each physical block is identified by its numeric
disk address (for example: drive 1, cylinder 73, track 2, sector 10).

The file-organization module knows about files and their logical blocks, as well as
corresponding physical blocks. By knowing the type of disk block allocation method used
and the starting location of the file, the file-organization module can translate logical block
addresses to physical block addresses for transferring it to the basic file system, where each
file's logical blocks are numbered from 0 (or 1) through N. The file-organization module also
includes the free-space manager, which tracks unallocated blocks so that these blocks may be
provided when requested.

The logical file system manages metadata information i.e. all information about the file-
system structure except the actual data (or contents of the files). It also manages the directory
structure to provide required information to the file organization module when given a
symbolic file name (with path). It maintains file structure using file-control blocks. A file-
control block (FCB) contains information about the file, including ownership, permissions,
and location of the file contents. The logical file system is also responsible for protection and
security of the files.

File System Implementation


Several on-disk and in-memory structures are used to implement a file system. These
structures vary depending on the operating system and the file system, but some of the
common component/information more or less remain similar which is discussed as below.

On disk, the file system may contain information about


 how to boot an operating system stored there,
 the total number of blocks, FCBs, the number and location of free blocks & FCBs,
 the directory structure,
 individual files

Following give brief details on these structures:


Boot control block (per volume):
 It contains information needed by the system to boot an operating system from that
volume.
 If the volume does not contain an operating system, this block can be empty.
 It is typically the first block of a volume. In UFS, it is called the boot block; in NTFS,
it is the partition boot sector.

Volume control block (per volume):


 It contains volume (or partition) details, such as
o the number of blocks in the partition,
o size of the blocks,
o free block count,
o free block pointers or start of the free block pointers,
o free FCB count,
o FCB pointers and information about free FCB pointers
 In UFS, this is called a superblock; in NTFS, it is stored in the master file table.

Directory structure per file system:


 It is used to organize the files.
 In UFS, directories are implemented as files which contain directory entries
specifying file names and associated inode numbers for the files. In NTFS it is stored
in the master file table.

Per-file FCB
 It contains all details about the file (except actual data), including
o file permissions,
o dates of creation/last access/modification
o ownership,
o size,
o location of the data blocks etc.

 In UFS, this is called the inode. In NTFS, this information is actually stored within the
master file table, and is organized in rows per file.

The in-memory information is used for both file-system management and performance
improvement via caching. The data is loaded into memory at mount time, updated during
file-system operations, and finally discarded at unmount. The structures may include the ones described
below:
 An in-memory mount table contains information about each mounted volume to
facilitate easy and convenient access of files stored in the mounted volumes and
makes it appear as a single directory hierarchy to the user.
 An in-memory directory-structure cache holds the directory information of recently
accessed directories for their faster access.
 The in-memory system-wide open-file table contains a copy of the FCB of each open
file in the system, along with other information like open-file count.
 The in-memory per-process open-file table contains entries for each file opened by
the process. Each of these entries contains a pointer to the appropriate entry in the
system-wide open-file table, along with other information like access mode, current
file-position pointer.

File operations (File System view)

The impact of different file operations on the file system’s on-disk and in-memory structures
is as discussed below:

Creating a new file


 Firstly, application program calls logical file system.
 The logical file system allocates a new FCB or a free FCB to the file.
 The logical file system knows the format of the directory structure. The appropriate
directory is read into the memory and is updated with the new file name and FCB.
Before updating the directory, it is ensured that a file with the same name does not
already exist in the directory otherwise appropriate error is shown. In case of no error,
the updated directory is written back to the disk.
 Optionally the data blocks may also be allocated to the file depending on the default
size of the file.
 Volume control block is also updated to account for the newly allocated FCB and data
blocks (if any) to the file.

Opening a file
 The open() call passes a file name (with path) to the file system.
 The system-wide open-file table is searched to see if the file is already in use by
another process. If it is, the file-open count in the system-wide open-file table is
incremented by one; if it is not already being used by another process, then an
entry in the system-wide open-file table is created for the newly opened file.
 When a file is opened for the first time in the system, the directory structure is
searched for the given file name (with path). Once the file is found and a system-wide
open-file table entry does not exist, the FCB is copied into the system-wide open-file
table in memory. Besides FCB, the system-wide open-file table entry also keeps track
of the number of processes that have this file opened using the file-open count.
 An entry is always made in the per-process open-file table, with a pointer/reference to
the entry in the system-wide open-file table and some other information like current
file position pointer to specify current read/write location in the file, access mode in
which the file is opened by the process etc.
 Open() call returns a pointer (or index) to the appropriate entry in the per-process
open-file table.
 All subsequent operations on this opened file are then performed using this pointer (or
index) returned by open() call.

Note:
 The name given to the pointer (or index) to the open-file entry returned by the open()
call differs from system to system: in Unix it is referred to as a file descriptor, and in
Windows it is referred to as a file handle.
 As long as the file is not closed, any changes done to the metadata of file due to file
operations will get updated in the copy of the FCB present in the open-file table.

Closing a file
 When a process closes the file, the per-process open-file table entry is removed and
the system-wide open-file table entry’s file-open count is decremented by one.
 When all processes that have opened the file close it, any updated metadata in the
copy of the FCB (present in the system-wide open-file table) is copied back to the disk and
the system-wide open-file table entry is removed.
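The following is a simplified sketch, not the actual data structures of any particular operating system, of how the two open-file tables and their bookkeeping might look:

#include <sys/types.h>

struct fcb {                         /* file control block (simplified) */
    long size;                       /* plus permissions, ownership, data-block locations, ... */
};

struct system_wide_entry {
    struct fcb fcb_copy;             /* in-memory copy of the file's FCB */
    int open_count;                  /* number of processes that currently have the file open */
    /* disk location of the file, access dates, ... */
};

struct per_process_entry {
    struct system_wide_entry *sys_entry;   /* pointer to the shared system-wide entry */
    off_t position;                        /* current file-position pointer */
    int access_mode;                       /* mode in which this process opened the file */
};

/* open(): add a per-process entry; create the system-wide entry only if the file is not
 * already open, otherwise just increment open_count.
 * close(): remove the per-process entry and decrement open_count; when it reaches 0,
 * write the (possibly updated) FCB back to disk and remove the system-wide entry. */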
Directory Implementation
The selection of directory-allocation and directory-management algorithms significantly
affect the efficiency, performance, and reliability of the file system.

Linear List
 This is the simplest method for implementing a directory which involves creating a
linear list of file names along with information about the location of the associated
data blocks.
 To create a new file, the directory is searched to be sure that no existing file has the
same name. Then, new entry is added at the end of the directory for this newly created
file, only if file with same name does not already exist in the directory.
 To delete a file, the directory is searched for the named file, then the space allocated
to it is released including the directory entry. A slight variation could be that instead
of a contiguous linear list, each directory entry may contain reference of the next valid
directory entry thereby making it a linked list kind of arrangement without physically
removing directory entries corresponding to deleted files.
 To reuse the directory entry (corresponding to the deleted file) several approaches can
be taken:
o Mark the entry as unused for later use
o The freed entry can be attached to a list maintained for free directory entries.
o A third alternative is to copy the last entry in the directory into the freed
location thereby decreasing the length of the directory after each file deletion
in the directory.
Advantage
 Easy and simple to implement and program.

Disadvantages/Issues
 It is time consuming to execute operations because only linear search is possible when
finding a file.

Solution
 Caching of the most recent directory information to avoid reread from the disk is a
possible solution to avoid the slow access of the directory structure implemented as
linear list.

Hash Table

 In this approach, a hash data structure is used in addition to the linear list of
directory entries.
 The hash table takes a value computed from the file name and returns a reference to
the file name's entry in the linear list.
Advantage
 The directory search time is significantly reduced.

Disadvantages/Issues
 Collision handling is required to be done for situations in which two filenames hash to
the same location.
 The hash table size is generally fixed and the hash function depends on that
size. Therefore, if the number of files grows and the hash table has to grow as
well, a new hash function is needed, and the existing directory entries must be
reorganized to reflect their new hash values.

Solution
 Solution to both the above mentioned problems is by maintaining a chained-overflow
hash table where each hash entry can be a linked list instead of an individual filename,
and collisions can be resolved by adding the new entry to the linked list. But now the
lookups will be slower because searching for a filename may require searching
through a linked list of colliding hash table entries.
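As a toy sketch of such a chained-overflow hash table (names, sizes and the hash function are illustrative only, not taken from any real file system):

#include <string.h>

#define TABLE_SIZE 128

struct dir_entry {
    char name[32];                   /* file name */
    long inode_no;                   /* or FCB / data-block location */
    struct dir_entry *next;          /* chain of entries whose names collide */
};

static struct dir_entry *table[TABLE_SIZE];

static unsigned hash(const char *name)
{
    unsigned h = 0;
    while (*name)
        h = h * 31 + (unsigned char)*name++;
    return h % TABLE_SIZE;
}

struct dir_entry *dir_lookup(const char *name)
{
    for (struct dir_entry *e = table[hash(name)]; e != NULL; e = e->next)
        if (strcmp(e->name, name) == 0)
            return e;                /* found in the collision chain */
    return NULL;                     /* no such file in this directory */
}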

References:

1. A. Silberschatz, P.B. Galvin and G. Gagne, Operating System Concepts (7th ed.),
John Wiley & Sons, Inc.
2. A.S. Tanenbaum, Modern Operating Systems (2nd ed.), Prentice-Hall of India.
Disk Block Allocation Methods
The main issues while designing disk block allocation methods are:
1. How to allocate space to these files so that disk space is utilized effectively?
2. How to ensure that the files can be accessed quickly?

Three major methods of allocating disk space are in wide use: contiguous, linked, and
indexed. Each method has its own advantages and disadvantages. These methods are as
discussed below:

Contiguous Allocation
 Contiguous allocation requires that each file occupy a set of contiguous blocks on the
disk.
 Contiguous allocation of a file is defined by the disk address of the first block and the
length of the file in block units. If the file is n blocks long and starts at location b, then
it occupies blocks b, b + 1, b + 2, ..., b + n - 1.
 The directory entry for each such file indicates the address of the starting block and
the length of this file.

Advantages
 The number of disk seeks required for accessing contiguously allocated files is
minimal.
 Accessing a file that has been allocated contiguously is easy. Both sequential and
direct access can be supported by contiguous allocation.
o For sequential access, the file system remembers the disk address of the last
block referenced and, when required, reads or writes the next block.
o For direct access to block i of a file that starts at block b, the block b + i can be
directly accessed.
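As a small sketch of the direct-access arithmetic described above (the function and parameter names are illustrative):

/* Logical-to-physical block mapping under contiguous allocation.
 * 'start' and 'length' would come from the file's directory entry. */
long contiguous_block(long start, long length, long i)
{
    if (i < 0 || i >= length)
        return -1;                   /* block i lies outside the file */
    return start + i;                /* physical block b + i, as described above */
}

For example, a file that starts at block 14 and is 5 blocks long occupies blocks 14 to 18, and its logical block 3 is physical block 17.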

Disadvantages/Issues
 One major issue is to decide how much space should be allocated to the file when it is
created and what is to be done when the file needs to grow. Possible solutions
could be:
o Restrict the file to be extended or grown beyond the blocks allocated to the
file. But this is not a user-friendly approach.
o To pre-allocate large number of blocks but it will not be efficient utilization of
storage space. Also, estimating total size of the file in advance may not be
feasible.
o Copy contents of the file to larger contiguous space when the file needs to
grow in size and release the previously occupied space. But this is a time-
consuming approach.
o Another approach could be to add extents (chunk of contiguous space) to the
already allocated space when the file needs to grow. The location of a file's
blocks is then recorded as a location and a block count along with a link to the
first block of the next extent.
 Another difficulty is finding contiguous space for a new file. Strategies like best
fit, first fit and worst fit could be used.
 Further, as files are allocated and deleted, the free disk space is broken into little
pieces leading to external fragmentation. Depending on the total amount of disk
storage and the average file size, external fragmentation may be a minor or a major
problem. On-line or off-line compaction is required to be performed to deal with the
external fragmentation problem.

Linked Allocation
 With linked allocation, each file is a linked list of disk blocks and the disk blocks may
be scattered anywhere on the disk.
 Here, each block of the file contains the address/link to the next block besides the
actual data.
 The directory contains an address/link to the first and last blocks of the file.
 To create a new file, a new entry in the directory is created. With linked allocation,
each directory entry has a link to the first disk block of the file which is initialized to
nil (the end-of-list address/link value) to signify an empty file. The size field of the
file is also set to 0 initially.
 A write to the file causes the free-space management system to find a free block. This
new block is written to and is linked to the end of the file.
 To read a file, the blocks are simply read by following the links from block to block.

For example, a file of five blocks might start at block 9 and continue at block 16, then block
1, then block 10, and finally block 25 as shown below:
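The following hypothetical C sketch shows how such a chain is followed to reach logical block i (blocks numbered from 0 here); read_block() and the on-disk block layout are assumed helpers for illustration, not a real API:

#define BLOCK_SIZE 512

struct disk_block {
    char data[BLOCK_SIZE - sizeof(long)];   /* file data stored in this block */
    long next;                              /* link to the next block, -1 at end of file */
};

int read_block(long block_no, struct disk_block *buf);   /* assumed low-level driver call */

int read_logical_block(long first, long i, struct disk_block *out)
{
    long cur = first;                       /* first block number, from the directory entry */
    for (long k = 0; k < i && cur != -1; k++) {
        struct disk_block b;
        if (read_block(cur, &b) != 0)       /* one disk read per hop */
            return -1;
        cur = b.next;                       /* follow the link to block k+1 */
    }
    if (cur == -1)
        return -1;                          /* file is shorter than i+1 blocks */
    return read_block(cur, out);            /* finally read logical block i */
}

The intermediate reads in the loop are exactly the overhead that makes direct access expensive under linked allocation.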

Advantages
 Linked allocation solves all problems of contiguous allocation.
o The size of a file need not be declared when that file is created.
o A file can continue to grow as long as free blocks are available.
o There is no problem of external fragmentation with linked allocation, and any
free block on the free-space list can be used to satisfy a request for a block by
any file.

Disadvantages
 It can be used effectively only for sequential-access files. Direct-access capability for
linked-allocation files is highly inefficient because, to read the ith block, i-1 blocks of
the file have to be read first, since each block contains the link to the next block.
Further, each access to a link to the next block requires a disk read, and many a times
require a disk seek as well, which makes direct-access inefficient with linked
allocation.
 Seek time required (even for sequential access) will be much more because the blocks
allocated to the file are not contiguous and may be scattered over the volume.
 The space required for keeping the links to the next block in each block of the file is
another disadvantage. Each file requires slightly more space than it would otherwise
require because each block allocated to the file contains a link to the next block of the
file which is over and above the contents of the file.
The solution is to collect blocks into multiples, called clusters, and to allocate clusters
rather than blocks so that these links consume smaller percentage of the file's disk
space. This also improves disk throughput as fewer disk-head seeks are required. But
this leads to an increase in internal fragmentation, because more space is wasted when
a cluster is partially full than when a block is partially full.
 Reliability is another problem with linked-allocation approach as each block contains
a link to the next block, and missing or losing a single link may lead to loss of
information about all the subsequent blocks in the file.
Solution is to maintain a doubly linked list but this approach requires even more
overhead (of storing backward links) for each file.

Indexed Allocation
 Indexed allocation brings all the links of the file blocks together into one location
called the index block.
 Each file has its own index block, which is an array of disk-block addresses. The ith
entry in the index block gives the address of the ith block of the file.
 The directory contains the address of the index block along with the file name.
 To find and access the ith block, the link in the ith index-block entry is used.

 When the file is created, all pointers in the index block are set to nil. When the ith
block is first written, a disk block is obtained from the free-space manager, and its
address is put in the ith index-block entry.
Advantages
 Both direct and sequential access to the file is supported in indexed-allocation
approach.
 There is no issue of external fragmentation because any free block on the disk can
satisfy a request for more storage space made by any file.

Disadvantages
 There is a performance issue faced in this approach because index block has to be
referenced for each file read or write operation. Though the index block may be
cached to achieve some improvement, the data blocks may still be spread all
over the volume, so the seek time may be large.
 Indexed allocation also suffers from the issue of wasted space. The pointer
overhead of the index block may often be greater than the pointer
overhead of linked allocation because, with indexed allocation, an entire index
block must be allocated even if only one or two pointers are non-nil (i.e. even
when the file size is small).
 Another related issue is to decide how large the index block should be.
o If it is too small, it will not be able to hold enough pointers for a large file.
o Mechanisms for handling large files with indexed-allocation include the
following:

 Linked scheme: An index block is normally one disk block. To allow for large files,
several index blocks can be linked together: the last word of an index block either
keeps the address of the next index block or is nil to indicate that it is the last
index block.
 Multilevel index or Indirect Index: Here, first level index block is
used to point to a set of second-level index blocks, which in turn
point to the file blocks. To access a file block, the operating system
uses the first-level index to find a second-level index block where
it finds the address of the desired data block. This approach could
be continued to a third or fourth level, depending on the desired
maximum file size.

 Combined scheme: Another alternative is to combine both direct
and indirect indexing. The Unix File System keeps 15 pointers/links
of the index block in the file's inode. The first 12 of these
pointers/links point to direct blocks; that is, they contain addresses
of blocks containing data of the file. Thus, the data for small files
(i.e. where the size of the file is no more than 12 blocks) do not need a
separate (indirect) index block. The next three pointers/links point
to indirect blocks. The first points to a single indirect block, the
second points to a double indirect block, and the last pointer
contains the address of a triple indirect block. Under this method,
the number of blocks that can be allocated to a file exceeds the
amount of space addressable by the 32-bit disk block addresses used by
many operating systems.
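As a worked illustration of the combined scheme (assuming 4 KB blocks and 4-byte block addresses, i.e. 1024 addresses per index block; these numbers are assumptions, not fixed by the scheme itself), the maximum file size is roughly:

12 direct blocks: 12 x 4 KB = 48 KB
single indirect block: 1024 x 4 KB = 4 MB
double indirect block: 1024 x 1024 x 4 KB = 4 GB
triple indirect block: 1024 x 1024 x 1024 x 4 KB = 4 TB
Maximum file size ≈ 48 KB + 4 MB + 4 GB + 4 TB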

Free-Space Management
Since disk space is limited, we need to reuse the space from deleted files for new files. To
keep track of free disk space, the system maintains a free-space list. The free-space list
records all free disk blocks i.e. the disk blocks which can be allocated to some file or
directory in future. To create a file, we search the free-space list for the required amount
of space and allocate that space to the new file. This space is then removed from the free-
space list. When a file is deleted, its disk space is added to the free-space list.

Bit Vector
 The free-space list is implemented as a bit map or bit vector.
 Each block is represented by 1 bit. If the block is free, the bit is 1; if the block is
allocated, the bit is 0.
 For example, consider a disk where blocks 2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25,
26, and 27 are free and the rest of the blocks are allocated. The free-space bit map
should be
001111001111110001100000011100000 ...
Advantages
 The main advantage is its relative simplicity.
 Also, it is efficient in finding the first free block or n consecutive free blocks on the
disk.
Disadvantages
 The bit vector should be kept in memory for efficiency, which is feasible for smaller
disks but not for larger ones.
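As a small sketch of how the first free block can be found in a bit vector (word size and names are illustrative; here, as in the example above, a 1 bit means the block is free):

#include <stdint.h>

long first_free_block(const uint32_t *bitmap, long n_words)
{
    for (long w = 0; w < n_words; w++) {
        if (bitmap[w] == 0)                  /* all 32 blocks in this word are allocated */
            continue;
        for (int b = 0; b < 32; b++)
            if (bitmap[w] & (1u << b))       /* a set bit means this block is free */
                return w * 32 + b;           /* block number */
    }
    return -1;                               /* no free block on the volume */
}

In practice the scan is done a word at a time, which is why finding the first free block (or a run of free blocks) is efficient with a bit vector.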

Linked List
 Another approach to free-space management is to link together all the free disk
blocks.
 Each free block contains pointer/link to the next free block. The pointer/link to the
first free block is kept in a special location on the disk to indicate start of the free-
space list and it is cached in the memory for faster access.

Advantage
 Only start of the free-space list is required to be brought in the memory to access the
free-space information.

Disadvantage
 This scheme is not efficient. To traverse the free-space list, each block is required to
be read, which requires substantial I/O time. For finding first n free blocks, n-1 free
blocks will have to be read, which is clearly time consuming and inefficient.
Grouping
 Here, the addresses of n free blocks are stored in the first free block. The first n-1 of
these blocks are actually free. The last block contains the addresses of another n free
blocks, and so on.

Advantage
 The addresses of a large number of free blocks can now be found quickly, unlike the
situation when the standard linked-list approach is used.

Counting
 Several contiguous blocks may be allocated or freed simultaneously, particularly
when disk space is allocated with the contiguous-allocation algorithm or through
clustering. Thus, rather than keeping a list of n free disk addresses, the address of the
first free block and the number n of free contiguous blocks that follow the first block
is kept.
 Each entry in the free-space list consists of a disk address and a count.
Advantage
 Although each entry requires more space than a simple disk address, the overall list
will be shorter, as long as the count is generally greater than 1.

References:

1. A. Silberschatz, P.B. Galvin and G. Gagne, Operating System Concepts (7th ed.),
John Wiley & Sons, Inc.
2. A.S. Tanenbaum, Modern Operating Systems (2nd ed.), Prentice-Hall of India.
Disk Structure
Magnetic disks provide the bulk of secondary storage for modern computer systems. A
magnetic disk drive comprises of several disk platters. Each disk platter has a flat circular
shape, like a CD. The two surfaces of a platter are covered with a magnetic material. The
information is stored by recording it magnetically on the platters.

A read-write head moves just above each surface of every platter. The heads are attached to
disk arms and the arm assembly moves all the heads as a unit. The surface of a platter is
logically divided into circular tracks, which are subdivided into sectors. The set of tracks
that are at one arm position makes up a cylinder. There may be thousands of concentric
cylinders in a disk drive, and each track may contain hundreds of sectors. The storage
capacity of common disk drives is measured in gigabytes.

Modern disk drives are addressed as large one-dimensional arrays of logical blocks, where
the logical block is the smallest unit of transfer. The size of a logical block on a hard drive is
usually 512 bytes but newer hard drives are also using 4KB block size. The one-dimensional
array of logical blocks is mapped onto the sectors of the disk sequentially. Sector 0 is the first
sector of the first track on the outermost cylinder. The mapping proceeds in order through
that track, then through the rest of the tracks in that cylinder, and then through the rest of the
cylinders from outermost to innermost.

Disk Scheduling
 Efficient use of hardware is one of the prime responsibilities of the operating system.
For the disk drives, the operating system needs to ensure fast access time and large
disk bandwidth.
 The disk access time has seek time and rotational latency as two major components
besides the actual data transfer time:
o The seek time is the time for the disk arm to move the heads to the cylinder
containing the desired sector.
o The rotational latency is the additional time for the disk to rotate to bring the
desired sector under the disk head.
o The transfer time is the actual time to transfer the data. It depends on the
rotating speed of the disk and number of bytes to be transferred.
 So, Disk Access Time = Seek Time + Rotational Latency + Transfer Time (a worked
numeric example is given after this list).
 Disk Response Time is the time a request spends waiting, from its arrival until its I/O
operation actually starts.
 The disk bandwidth is the total number of bytes transferred, divided by the total time
between the first request for service and the completion of the last transfer.
 Both the access time and the bandwidth can be improved by scheduling the servicing
of disk I/O requests in an appropriate order.
 Whenever a process needs I/O to or from the disk, it issues a system call to the
operating system. The request specifies several pieces of information:
o Whether this operation is input or output?
o What the disk address for the transfer is?
o What the memory address for the transfer is?
o What the number of sectors to be transferred is?

 If the desired disk drive and controller are available, the request can be serviced
immediately. If the drive or controller is busy, any new requests for service will be
placed in the queue of pending requests for that drive.

 For a multiprogramming system with many processes, the disk queue may often have
several pending requests. Thus, when one request is completed, the operating system
chooses which pending request to service next.

 Since the most significant contribution to the access time comes from the seek time,
reducing the access time primarily requires minimizing the seek time.

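Worked example of the access-time formula given above (all numbers are assumed for illustration): suppose the average seek time is 5 ms, the disk rotates at 7200 RPM, and 4 KB is to be transferred at a sustained rate of 100 MB/s.

One full rotation takes 60/7200 s ≈ 8.33 ms, so the average rotational latency is about half of that, ≈ 4.17 ms.
Transfer time = 4 KB / (100 MB/s) ≈ 0.04 ms.
Disk Access Time ≈ 5 + 4.17 + 0.04 ≈ 9.2 ms.

The seek and rotational components dominate, which is why disk scheduling concentrates on reducing head movement.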
Disk Scheduling Algorithms


1. FCFS: FCFS is the simplest of all the Disk Scheduling Algorithms. In FCFS, the
requests are addressed in the order they arrive in the disk queue.

Advantages:
 Every request gets a fair chance
 No chance of starvation of any request
Disadvantages:
 Does not try to optimize seek time
 May not provide the best possible service, i.e. the response time could be higher

2. SSTF: In SSTF (Shortest Seek Time First), requests having shortest seek time are
executed first. So, the seek time of every request in queue is calculated in advance and
then they are scheduled according to their calculated seek time. As a result, the request
near the disk arm will get executed first. SSTF is certainly a performance improvement
over FCFS as it decreases the average response time and increases the throughput of
system.

Advantages:
 Average Response Time decreases
 Throughput increases

Disadvantages:
 An extra overhead is incurred to calculate seek time in advance
 It can cause starvation for a request if the request has higher seek time as compared to
incoming requests
 It results in high variance of individual response times around the average, as
SSTF favours only some requests (the ones with smaller seek times)
 It is not a fair algorithm as it is biased towards requests having smaller seek time.

3. SCAN: In SCAN algorithm the disk arm moves into a particular direction and services
the requests coming in its path and after reaching the end of disk, it reverses its direction
and again services the request arriving in its path in reverse direction. So, this algorithm
works like an elevator and hence is also known as the elevator algorithm. As a result,
requests for mid-range cylinders are serviced more often, while requests arriving just
behind the disk arm, and especially those for cylinders at the two extremes of the disk,
have to wait much longer on average.

Advantages:
 High throughput
 Low average response time
 Avoids starvation because the disk arm movement is such that it reaches all the
cylinders in a systematic way and hence all requests are eventually catered

Disadvantages:
 Long waiting time for requests for cylinders behind the disk arm i.e. the ones not
falling in the path of the disk arm movement
4. C-SCAN: In SCAN algorithm, the disk arm again scans the path that has been scanned
after the arm reverses its direction due to which the same path gets scanned twice i.e.
both in forwards and backward directions. So, it may be possible that too many requests
are waiting at the other end of the disk while the arm continues to service the requests
that keep arriving on its way back.

This situation is avoided in C-SCAN algorithm in which the disk arm instead of
catering to requests on reversing its direction goes to the other end of the disk directly
and starts servicing the requests from there. So, the disk arm moves in a circular fashion
in a way that it continues till the end of the disk while moving in one direction before it
reverses its direction to reach to the other end of the disk directly. Since this algorithm
operates in a circular fashion and is also similar to the SCAN algorithm, hence, it is
known as C-SCAN (Circular SCAN).

Advantages over SCAN:


 C-SCAN provides more uniform wait time and response time for individual
requests as compared to SCAN algorithm

5. LOOK: It is similar to the SCAN disk scheduling algorithm except for the difference
that the disk arm instead of going to the end of the disk goes only to the last request to
be serviced in front of the head in that particular direction and then reverses its direction
from there only.

Advantages over SCAN and C-SCAN:


 It prevents the extra delay which occurred due to unnecessary traversal of the disk
arm to the end of the disk which happens in case of SCAN and C-SCAN algorithms.

6. C-LOOK: As LOOK is similar to the SCAN algorithm, in a similar way C-LOOK is similar
to the C-SCAN disk scheduling algorithm. In C-LOOK, the disk arm, instead of going to
the end, goes only to the last request to be serviced in front of the head in that particular
direction and then from there goes to the other end’s last request.

Advantages over LOOK:


 C-LOOK provides more uniform wait time and response time for individual requests
as compared to LOOK algorithm

Advantages over SCAN and C-SCAN:


 It prevents the extra delay which occurred due to unnecessary traversal of the disk
arm to the end of the disk which happens in case of SCAN and C-SCAN algorithms.
Exercise: Given the following queue of cylinder numbers corresponding to pending I/O requests --
95, 180, 34, 119, 11, 123, 62, 64 -- with the read-write head initially at cylinder 50 and the
last (highest-numbered) cylinder being 199, the following demonstrates the working of the
different disk scheduling algorithms:

1. FCFS:

Head movement in terms of number of cylinders:
|95-50| + |180-95| + |34-180| + |119-34| + |11-119| + |123-11| + |62-123| + |64-62|
= 45 + 85 + 146 + 85 + 108 + 112 + 61 + 2
= 644

2. SSTF:
Head movement in terms of number of
cylinders:
|62-50| + |64-62| + |34-64| + |11-34| +
|95-11| + |119-95| + |123-119| + |180-123|
= 12 + 2 + 30 + 23 + 84+ 24 + 4 + 57
= 236

3. SCAN: Assume Head movement is towards cylinder 0


Head movement in terms of number of
cylinders:
|34-50| + |11-34| + |0-11| + |62-0| +
|64-62| + |95-64| + |119-95| + |123-119| +
|180-123|
= 16 + 23 + 11 + 62 + 2 + 31 + 24 + 4 + 57
= 230

4. C-SCAN: Assume Head movement is towards cylinder 0


Head movement in terms of number of
cylinders:
|34-50| + |11-34| + |0-11| + |180-199| +
|123-180| + |119-123| + |95-119| + |64-95|
+ |62-64|
= 16 + 23 + 11 + 19 + 57 + 4 + 24 + 31 + 2
= 187
5. C-LOOK: Assume Head movement is towards cylinder 0
Head movement in terms of number of
cylinders:
|34-50| + |11-34| + |123-180| + |119-123| +
|95-119| + |64-95| + |62-64|
= 16 + 23 + 57 + 4 + 24 + 31 + 2
= 157

Note: In C-SCAN and C-LOOK when the head is quickly moved from one end of the disk to
other end, there is a displacement; but the time required is negligible (as no requests are
served in between) and hence this displacement is ignored.
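The totals above can be cross-checked with a small C sketch that computes the head movement for any given service order (the function name is illustrative); running it on the FCFS order reproduces 644 cylinders:

#include <stdio.h>
#include <stdlib.h>

long total_head_movement(int start, const int order[], int n)
{
    long total = 0;
    int pos = start;
    for (int i = 0; i < n; i++) {
        total += labs((long)order[i] - pos);   /* cylinders moved for this request */
        pos = order[i];                        /* head is now at the serviced cylinder */
    }
    return total;
}

int main(void)
{
    int fcfs[] = {95, 180, 34, 119, 11, 123, 62, 64};          /* FCFS service order from above */
    printf("FCFS: %ld\n", total_head_movement(50, fcfs, 8));   /* prints FCFS: 644 */
    return 0;
}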

References:

1. A. Silberschatz, P.B. Galvin and G. Gagne, Operating System Concepts (7th ed.),
John Wiley & Sons, Inc.
2. A.S. Tanenbaum, Modern Operating Systems (2nd ed.), Prentice-Hall of India.
3. http://www.cs.iit.edu/~cs561/cs450/disksched/disksched.html
