Introduction To Operating Systems PDF
A computer system has many resources (hardware and software) which may be required to
complete a task. The commonly required resources are input/output devices, memory, file
storage space, the CPU, etc. The operating system acts as a manager of these resources and
allocates them to specific programs and users whenever necessary to perform a particular
task. The operating system is therefore the resource manager, i.e. it manages the resources of a
computer system internally. The resources are the processor, memory, files, and I/O devices. In
simple terms, an operating system is the interface between the user and the machine.
Operating System: User View
The user view of the computer refers to the interface being used. Such systems are designed
for one user to monopolize its resources, to maximize the work that the user is performing. In
these cases, the operating system is designed mostly for ease of use, with some attention paid
to performance, and none paid to resource utilization.
Operating System: System View
The operating system can also be viewed as a resource allocator. A computer system consists of
many resources - hardware and software - that must be managed efficiently. The
operating system acts as the manager of these resources, decides between conflicting requests,
controls execution of programs, etc.
Early Evolution
1945: ENIAC, Moore School of Engineering, University of Pennsylvania.
1949: EDSAC and EDVAC
1949: BINAC - a successor to the ENIAC
1951: UNIVAC by Remington Rand
1952: IBM 701
1956: The interrupt
1954-1957: FORTRAN was developed
As research and development work continues, we are seeing new operating systems
being developed and existing ones getting improved and modified to enhance the overall user
experience, making operating systems faster and more efficient than ever before.
Also, with the onset of new devices like wearables, which include Smart Watches, Smart
Glasses, VR gear, etc., the demand for unconventional operating systems is also rising.
Types of Operating Systems
Following are some of the most widely used types of operating systems.
Time Sharing Systems
Time Sharing Systems are very similar to multiprogramming batch systems. In fact, time
sharing systems are an extension of multiprogramming systems.
In time sharing systems the prime focus is on minimizing the response time, while in
multiprogramming the prime focus is to maximize CPU usage.
Multiprocessor Systems
A Multiprocessor system consists of several processors that share a common physical
memory. A multiprocessor system provides higher computing power and speed. In a
multiprocessor system all processors operate under a single operating system. The multiplicity
of the processors, and how they act together, is transparent to the user.
Advantages of Multiprocessor Systems
1. Enhanced performance.
2. Execution of several tasks by different processors concurrently increases the system's
throughput without speeding up the execution of a single task.
3. If possible, the system divides a task into many subtasks which can then be executed in
parallel on different processors, thereby speeding up the execution of single tasks.
Desktop Systems
Earlier, CPUs and PCs lacked the features needed to protect an operating system from user
programs. PC operating systems therefore were neither multiuser nor multitasking.
However, the goals of these operating systems have changed with time; instead of
maximizing CPU and peripheral utilization, the systems opt for maximizing user convenience
and responsiveness. These systems are called Desktop Systems and include PCs running
Microsoft Windows and the Apple Macintosh. Operating systems for these computers have
benefited in several ways from the development of operating systems for mainframes.
Microcomputers were immediately able to adopt some of the technology developed for
larger operating systems. On the other hand, the hardware costs for microcomputers are
sufficiently low that individuals have sole use of the computer, and CPU utilization is no
longer a prime concern. Thus, some of the design decisions made in operating systems for
mainframes may not be appropriate for smaller systems.
Distributed Systems
These advancements in technology have made it possible to design and develop distributed
systems comprising many computers that are interconnected by communication networks.
The main benefit of distributed systems is their low price/performance ratio.
1. As there are multiple systems involved, a user at one site can utilize the resources of systems at
other sites for resource-intensive tasks.
2. Fast processing.
3. Less load on the host machine.
Types of Distributed Operating Systems
1. Client-Server Systems
2. Peer-to-Peer Systems
Client-Server Systems
Centralized systems today act as server systems to satisfy requests generated by client
systems. The general structure of a client-server system is depicted in the figure below:
Server systems can be broadly categorized as Compute Servers and File Servers.
Compute Server systems provide an interface to which clients can send requests to perform
an action; in response, they execute the action and send back results to the client.
File Server systems provide a file-system interface where clients can create, update, read,
and delete files.
Peer-to-Peer Systems
The growth of computer networks - especially the Internet and World Wide Web (WWW) -
has had a profound influence on the recent development of operating systems. When PCs
were introduced in the 1970s, they were designed for personal use and were generally
considered standalone computers. With the beginning of widespread public use of the
Internet in the 1990s for electronic mail and FTP, many PCs became connected to computer
networks.
In contrast to tightly coupled systems, the computer networks used in these applications
consist of a collection of processors that do not share memory or a clock. Instead, each
processor has its own local memory. The processors communicate with one another through
various communication lines, such as high-speed buses or telephone lines. These systems are
usually referred to as loosely coupled systems (or distributed systems).
Clustered Systems
Like parallel systems, clustered systems gather together multiple CPUs to accomplish
computational work.
Clustered systems differ from parallel systems, however, in that they are composed of two or
more individual systems coupled together.
The definition of the term clustered is not concrete; the generally accepted definition is that
clustered computers share storage and are closely linked via LAN networking.
Clustering is usually performed to provide high availability.
A layer of cluster software runs on the cluster nodes. Each node can monitor one or more of
the others. If the monitored machine fails, the monitoring machine can take ownership of its
storage, and restart the application(s) that were running on the failed machine. The failed
machine can remain down, but the users and clients of the application would only see a brief
interruption of service.
Asymmetric Clustering - In this, one machine is in hot standby mode while the other is
running the applications. The hot standby host (machine) does nothing but monitor the active
server. If that server fails, the hot standby host becomes the active server.
Symmetric Clustering - In this, two or more hosts are running applications, and they are
monitoring each other. This mode is obviously more efficient, as it uses all of the available
hardware.
Parallel Clustering - Parallel clusters allow multiple hosts to access the same data on the
shared storage. Because most operating systems lack support for this simultaneous data access
by multiple hosts, parallel clusters are usually accomplished by special versions of software
and special releases of applications.
Clustered technology is rapidly changing. Clustered system usage and its features should
expand greatly as Storage Area Networks (SANs) become common. SANs allow easy attachment of multiple
hosts to multiple storage units. Current clusters are usually limited to two or four hosts due to
the complexity of connecting the hosts to shared storage.
Real Time Systems
Real-time operating systems that can only guarantee that a critical task will get priority over
other tasks, but give no assurance of completing it within a defined time, are referred to as
Soft Real-Time Operating Systems.
Handheld Systems
Handheld systems include Personal Digital Assistants (PDAs), such as Palm Pilots, or cellular
telephones with connectivity to a network such as the Internet. They are usually of limited size,
due to which most handheld devices have a small amount of memory, include slow
processors, and feature small display screens.
Many handheld devices have between 512 KB and 8 MB of memory. As a result, the
operating system and applications must manage memory efficiently. This includes returning
all allocated memory back to the memory manager once the memory is no longer being used.
Currently, many handheld devices do not use virtual memory techniques, thus forcing
program developers to work within the confines of limited physical memory.
Processors for most handheld devices often run at a fraction of the speed of a processor in a
PC. Faster processors require more power. To include a faster processor in a handheld device
would require a larger battery that would have to be replaced more frequently.
The last issue confronting program designers for handheld devices is the small display screens
typically available. One approach for displaying the content in web pages is web clipping,
where only a small subset of a web page is delivered and displayed on the handheld device.
Some handheld devices may use wireless technology, such as Bluetooth, allowing remote
access to e-mail and web browsing. Cellular telephones with connectivity to the Internet fall
into this category. Their use continues to expand as network connections become more
available and other options, such as cameras and MP3 players, expand their utility.
What is a Process?
A process is a program in execution. A process is not the same as the program code, but a lot more
than it. A process is an 'active' entity, as opposed to a program, which is considered a
'passive' entity. Attributes held by a process include hardware state, memory, CPU, etc.
The Text section is made up of the compiled program code, read in from non-volatile
storage when the program is launched.
The Data section is made up of the global and static variables, allocated and initialized
prior to executing main.
The Heap is used for the dynamic memory allocation, and is managed via calls to
new, delete, malloc, free, etc.
The Stack is used for local variables. Space on the stack is reserved for local variables
when they are declared.
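As an illustration, here is a minimal C sketch (the variable names are just examples) showing where different kinds of variables live in these four sections:

#include <stdio.h>
#include <stdlib.h>

int global_count = 10;      /* Data section: global/static variables, initialized before main runs */

int main()                  /* the compiled instructions themselves live in the Text section */
{
    int local = 5;                              /* Stack: local variable */
    int *buffer = malloc(100 * sizeof(int));    /* Heap: dynamic allocation via malloc */

    printf("%d %d %p\n", global_count, local, (void *)buffer);

    free(buffer);           /* dynamic memory is returned to the heap with free */
    return 0;
}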
Process Scheduling
The prime aim of the process scheduling system is to keep the CPU busy all the time and to
deliver minimum response time for all programs. For achieving this, the scheduler must apply
appropriate rules for swapping processes in and out of the CPU.
A new process is initially put in the Ready queue. It waits in the ready queue until it is
selected for execution(or dispatched). Once the process is assigned to the CPU and is
executing, one of the following several events can occur:
The process could issue an I/O request, and then be placed in the I/O queue.
The process could create a new subprocess and wait for its termination.
The process could be removed forcibly from the CPU, as a result of an interrupt, and be put
back in the ready queue.
In the first two cases, the process eventually switches from the waiting state to the ready
state, and is then put back in the ready queue. A process continues this cycle until it
terminates, at which time it is removed from all queues and has its PCB and resources
deallocated.
Types of Schedulers
Long Term Scheduler
The long term scheduler runs less frequently. Long term schedulers decide which programs must
get into the job queue. From the job queue, the job scheduler selects processes and loads
them into memory for execution. The primary aim of the job scheduler is to maintain a good
degree of multiprogramming. An optimal degree of multiprogramming means the average
rate of process creation is equal to the average departure rate of processes from the execution
memory.
Short Term Scheduler
This is also known as the CPU scheduler and runs very frequently. The primary aim of this
scheduler is to enhance CPU performance and increase the process execution rate.
Medium Term Scheduler
This scheduler removes processes from memory (and from active contention for the
CPU), and thus reduces the degree of multiprogramming. At some later time, the process can
be reintroduced into memory and its execution can be continued where it left off. This
scheme is called swapping. The process is swapped out, and is later swapped in, by the
medium term scheduler.
Swapping may be necessary to improve the process mix, or because a change in memory
requirements has overcommitted available memory, requiring memory to be freed up. This
complete process is depicted in the below diagram:
Addition of medium-term scheduling to the queueing diagram.
Operations on Process
Below we have discussed the two major operations, Process Creation and Process
Termination.
Process Creation
Through appropriate system calls, such as fork or spawn, processes may create other
processes. The process which creates other processes is termed the parent,
while the created sub-process is termed its child.
Each process is given an integer identifier, termed as process identifier, or PID. The parent
PID (PPID) is also stored for each process.
On a typical UNIX system the process scheduler is termed sched, and is given PID 0. The
first thing it does at system start-up time is to launch init, which gives that process PID 1.
Init then launches all the system daemons and user logins, and becomes the ultimate
parent of all other processes.
A child process may receive some amount of shared resources with its parent depending on
system implementation. To prevent runaway children from consuming all of a certain system
resource, child processes may or may not be limited to a subset of the resources originally
allocated to the parent.
There are two options for the parent process after creating the child:
Wait for the child process to terminate before proceeding. The parent process makes a wait()
system call, for either a specific child process or for any child process, which
causes the parent process to block until the wait() returns. UNIX shells normally wait for their
children to complete before issuing a new prompt.
Run concurrently with the child, continuing to process without waiting. When a UNIX shell
runs a process as a background task, this is the operation seen. It is also possible for the parent
to run for a while, and then wait for the child later, which might occur in a sort of a parallel
processing operation.
There are also two possibilities in terms of the address space of the new process:
1. The child process is a duplicate of the parent process.
2. The child process has a new program loaded into it.
To illustrate these different implementations, let us consider the UNIX operating system. In
UNIX, each process is identified by its process identifier, which is a unique integer. A new
process is created by the fork system call. The new process consists of a copy of the address
space of the original process. This mechanism allows the parent process to communicate
easily with its child process. Both processes (the parent and the child) continue execution at
the instruction after the fork system call, with one difference: the return code for the fork
system call is zero for the new (child) process, whereas the (non-zero) process identifier of
the child is returned to the parent.
Typically, the execlp system call is used after the fork system call by one of the two
processes to replace the process memory space with a new program. The execlp system call
loads a binary file into memory - destroying the memory image of the program containing the
execlp system call – and starts its execution. In this manner the two processes are able to
communicate, and then to go their separate ways.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main()
{
    pid_t pid;

    /* Fork another process */
    pid = fork();

    if (pid < 0)
    {
        // Error occurred
        fprintf(stderr, "Fork Failed");
        exit(-1);
    }
    else if (pid == 0)
    {
        // Child process: replace its memory image with the ls program
        execlp("/bin/ls", "ls", NULL);
    }
    else
    {
        // Parent process
        // Parent will wait for the child to complete
        wait(NULL);
        printf("Child complete");
        exit(0);
    }
}
GATE Numerical Tip: If fork is called n times, the number of child processes or new processes
created will be 2^n - 1.
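To see the tip in action, here is a small C sketch (the value of n is only an example) that calls fork() in a loop; with n = 3 calls, 2^3 = 8 processes print a line, 7 of which are newly created children:

#include <stdio.h>
#include <unistd.h>

int main()
{
    int n = 3;                   /* number of fork() calls */
    for (int i = 0; i < n; i++)
        fork();                  /* every existing process forks again */

    /* 2^n processes reach this point; 2^n - 1 of them are new */
    printf("Hello from PID %d\n", getpid());
    return 0;
}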
Process Termination
By making the exit() system call, typically returning an int, processes may request their own
termination. This int is passed along to the parent if it is doing a wait(), and is typically zero
on successful completion and some non-zero code in the event of any problem.
Processes may also be terminated by the system for a variety of reasons, including :
When a process ends, all of its system resources are freed up, open files flushed and closed,
etc. The process termination status and execution times are returned to the parent if the parent
is waiting for the child to terminate, or eventually returned to init if the process has already
become an orphan.
Processes which have terminated but whose parent has not yet called wait() for them are
termed zombies. If the parent itself terminates without waiting, such processes are inherited
by init as orphans, which waits on them and removes them.
Whenever the CPU becomes idle, the operating system must select one of the processes in the
ready queue to be executed. The selection process is carried out by the short-term scheduler
(or CPU scheduler). The scheduler selects from among the processes in memory that are
ready to execute, and allocates the CPU to one of them.
The dispatcher is the module that gives control of the CPU to the process selected by the
short-term scheduler. This function involves:
Switching context
Switching to user mode
Jumping to the proper location in the user program to restart that program from where it left
off last time.
The dispatcher should be as fast as possible, given that it is invoked during every process
switch. The time taken by the dispatcher to stop one process and start another process is
known as the Dispatch Latency. Dispatch Latency can be explained using the below figure:
Types of CPU Scheduling
CPU scheduling decisions may take place under the following four circumstances:
1. When a process switches from the running state to the waiting state(for I/O request or
invocation of wait for the termination of one of the child processes).
2. When a process switches from the running state to the ready state (for example, when an
interrupt occurs).
3. When a process switches from the waiting state to the ready state(for example, completion of
I/O).
4. When a process terminates.
When Scheduling takes place only under circumstances 1 and 4, we say the scheduling
scheme is non-preemptive; otherwise the scheduling scheme is preemptive.
Non-Preemptive Scheduling
Under non-preemptive scheduling, once the CPU has been allocated to a process, the process
keeps the CPU until it releases the CPU either by terminating or by switching to the waiting
state.
This scheduling method is used by the Microsoft Windows 3.1 and by the Apple Macintosh
operating systems.
It is the only method that can be used on certain hardware platforms, because it does not
require the special hardware (for example, a timer) needed for preemptive scheduling.
Preemptive Scheduling
In this type of scheduling, tasks are usually assigned priorities. At times it is
necessary to run a certain task that has a higher priority before another task, even though the
other task is still running. Therefore, the running task is interrupted for some time and resumed
later when the higher-priority task has finished its execution.
CPU Utilization
To make the best use of the CPU and not waste any CPU cycle, the CPU should be working
most of the time (ideally 100% of the time). In a real system, CPU usage should
range from 40% (lightly loaded) to 90% (heavily loaded).
Throughput
It is the total number of processes completed per unit time, or rather, the total amount of work
done in a unit of time. This may range from 10 per second to 1 per hour, depending on the
specific processes.
Turnaround Time
It is the amount of time taken to execute a particular process, i.e. the interval from the time of
submission of the process to the time of completion of the process (wall clock time).
Waiting Time
It is the sum of the periods a process spends waiting in the ready queue to acquire control of
the CPU.
Load Average
It is the average number of processes residing in the ready queue waiting for their turn to get
into the CPU.
Response Time
Amount of time it takes from when a request was submitted until the first response is
produced. Remember, it is the time till the first response and not the completion of process
execution(final response).
In general CPU utilization and Throughput are maximized and other factors are reduced for
proper optimization.
Scheduling Algorithms
To decide which process to execute first and which to execute last to achieve
maximum CPU utilization, computer scientists have defined some algorithms. They are:
1. First Come First Serve (FCFS) Scheduling
2. Shortest Job First (SJF) Scheduling
3. Priority Scheduling
4. Round Robin Scheduling
5. Multilevel Queue Scheduling
6. Multilevel Feedback Queue Scheduling
We will be discussing all the scheduling algorithms, one by one, in detail in the next tutorials.
First Come First Serve (FCFS) Scheduling
First Come First Serve is just like a FIFO (First In First Out) queue data structure,
where the data element which is added to the queue first is the one that leaves the
queue first.
This is used in Batch Systems.
It's easy to understand and implement programmatically, using a queue data
structure, where a new process enters through the tail of the queue, and the scheduler
selects a process from the head of the queue.
A perfect real life example of FCFS scheduling is buying tickets at a ticket counter.
Consider the processes P1, P2, P3, P4 given in the below table, arriving for execution in the
same order, each with Arrival Time 0 and the given Burst Time. Let's find the average waiting time
using the FCFS scheduling algorithm.
For the above given processes, first P1 will be provided with the CPU resources, then P2, P3
and P4 in that order.
The GANTT chart above perfectly represents the waiting time for each process.
Problems with FCFS Scheduling
Below we have a few shortcomings or problems with the FCFS scheduling algorithm:
1. It is a Non Pre-emptive algorithm, which means process priority doesn't matter.
If a process with very low priority is being executed, such as a daily routine
backup process which takes a lot of time, and all of a sudden some other high priority
process arrives, like an interrupt to avoid a system crash, the high priority process will
have to wait, and hence in this case the system will crash, just because of improper
process scheduling.
2. Convoy Effect is a situation where many processes, which need to use a resource only for a
short time, are blocked by one process holding that resource for a long time.
This essentially leads to poor utilization of resources and hence poor performance.
In the program, we will be calculating the Average waiting time and Average turn around
time for a given array of Burst times for the list of processes.
#include<iostream>
using namespace std;
// calculate waiting time, turn around time and their averages
void findAverageTime(int processes[], int n, int bt[])
{
    int wt[n], tat[n], total_wt = 0, total_tat = 0;
    // waiting time: each process waits for the bursts of the processes before it
    wt[0] = 0;
    for (int i = 1; i < n; i++)
        wt[i] = wt[i-1] + bt[i-1];
    // turn around time = waiting time + burst time
    for (int i = 0; i < n; i++)
        tat[i] = wt[i] + bt[i];
    cout << "Processes\tBurst time\tWaiting time\tTurn around time\n";
    // calculate total waiting time and total turn around time
    for (int i = 0; i < n; i++)
    {
        total_wt = total_wt + wt[i];
        total_tat = total_tat + tat[i];
        cout << " " << i+1 << "\t\t" << bt[i] << "\t\t" << wt[i] << "\t\t" << tat[i] << endl;
    }
    cout << "Average waiting time = " << (float)total_wt / n << endl;
    cout << "Average turn around time = " << (float)total_tat / n << endl;
}
// main function
int main()
{
    // process ids
    int processes[] = { 1, 2, 3, 4 };
    int n = sizeof processes / sizeof processes[0];
    // burst time of each process (arrival time is 0 for all)
    int burst_time[] = { 21, 3, 6, 2 };
    findAverageTime(processes, n, burst_time);
    return 0;
}
Here we have simple formulae for calculating various times for given processes:
Completion Time : Time taken for the execution to complete, starting from arrival time.
Turn Around Time : Time taken to complete after arrival. In simple words, it is the
difference between the Completion time and the Arrival time.
Waiting Time: Total time the process has to wait before its execution begins. It is the
difference between the Turn Around time and the Burst time of the process.
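For example, if a process arrives at time 0 ms, has a burst time of 5 ms and completes at time 12 ms, then its Turn Around Time = 12 - 0 = 12 ms and its Waiting Time = 12 - 5 = 7 ms.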
For the program above, we have considered the arrival time to be 0 for all the processes. Try
to implement a program with variable arrival times.
Shortest Job First (SJF) Scheduling
We scheduled the same set of processes using the First Come First Serve algorithm in the
previous tutorial, and got an average waiting time of 18.75 ms, whereas with SJF the average
waiting time comes out to 4.5 ms.
If the arrival times of the processes are different, which means all the processes are not available
in the ready queue at time 0 and some jobs arrive after some time, then sometimes a
process with a short burst time has to wait for the current process's execution to
finish, because in Non Pre-emptive SJF, on arrival of a process with a short duration, the
existing job/process's execution is not halted/stopped to execute the short job first.
This leads to the problem of Starvation, where a longer process may have to wait for a long
time if shorter jobs keep coming; this can be solved using the concept of aging.
Pre-emptive Shortest Job First
In Preemptive Shortest Job First Scheduling, jobs are put into the ready queue as they arrive, but
when a process with a shorter burst time arrives, the existing process is preempted or removed
from execution, and the shorter job is executed first.
As you can see in the GANTT chart above, as P1 arrives first, its execution starts
immediately, but just after 1 ms, process P2 arrives with a burst time of 3 ms, which is less
than the burst time of P1; hence the process P1 (1 ms done, 20 ms left) is preempted and
process P2 is executed.
As P2 is getting executed, after 1 ms, P3 arrives, but it has a burst time greater than that of P2,
hence execution of P2 continues. But after another millisecond, P4 arrives with a burst time
of 2 ms; as a result P2 (2 ms done, 1 ms left) is preempted and P4 is executed.
After the completion of P4, process P2 is picked up and finishes, then P3 will get executed
and at last P1.
The Pre-emptive SJF is also known as Shortest Remaining Time First, because at any given
point of time, the job with the shortest remaining time is executed first.
Program for SJF Scheduling
In the below program, we consider the arrival time of all the jobs to be 0.
Also, in the program, we will sort all the jobs based on their burst time and then execute
them one by one, just like we did in FCFS scheduling program.
#include<bits/stdc++.h>
using namespace std;

struct Process
{
    int pid; // process ID
    int bt;  // burst Time
};

/*
this function is used for sorting all
processes in increasing order of burst time
*/
bool comparison(Process a, Process b)
{
    return (a.bt < b.bt);
}

// calculate waiting time, turn around time and their averages
void findAverageTime(Process proc[], int n)
{
    int wt[n], tat[n], total_wt = 0, total_tat = 0;
    // sort all processes in increasing order of burst time
    sort(proc, proc + n, comparison);
    cout << "Order in which process gets executed:";
    for (int i = 0; i < n; i++)
        cout << " " << proc[i].pid;
    cout << endl;
    // waiting time of the first process is 0; each later process
    // waits for the bursts of the processes scheduled before it
    wt[0] = 0;
    for (int i = 1; i < n; i++)
        wt[i] = wt[i-1] + proc[i-1].bt;
    // turn around time = waiting time + burst time
    for (int i = 0; i < n; i++)
        tat[i] = wt[i] + proc[i].bt;
    cout << "Processes\tBurst time\tWaiting time\tTurn around time\n";
    // calculate total waiting time and total turn around time
    for (int i = 0; i < n; i++)
    {
        total_wt = total_wt + wt[i];
        total_tat = total_tat + tat[i];
        cout << " " << proc[i].pid << "\t\t"
             << proc[i].bt << "\t\t" << wt[i]
             << "\t\t" << tat[i] << endl;
    }
    cout << "Average waiting time = " << (float)total_wt / n << endl;
    cout << "Average turn around time = " << (float)total_tat / n << endl;
}

// main function
int main()
{
    Process proc[] = {{1, 21}, {2, 3}, {3, 6}, {4, 2}};
    int n = sizeof proc / sizeof proc[0];
    findAverageTime(proc, n);
    return 0;
}
Order in which processes get executed: 4 2 3 1

Processes    Burst time    Waiting time    Turn around time
4            2             0               2
2            3             2               5
3            6             5               11
1            21            11              32

Average waiting time = 4.5
Average turn around time = 12.5
Try implementing the program for SJF with variable arrival time for different jobs, yourself.
Priority Scheduling
A priority is assigned to each process.
The process with the highest priority is executed first, and so on.
Processes with same priority are executed in FCFS manner.
Priority can be decided based on memory requirements, time requirements or any
other resource requirement.
Round Robin Scheduling
A fixed time is allotted to each process, called a quantum, for execution.
Once a process is executed for the given time period, that process is preempted and another
process executes for the given time period.
Context switching is used to save states of preempted processes. A minimal sketch of
Round Robin waiting-time calculation is given below.
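Here is that sketch in C, assuming three example processes that all arrive at time 0 and a quantum of 2; it repeatedly gives each unfinished process one time slice and records its waiting time when it completes:

#include <stdio.h>

int main()
{
    int bt[] = {10, 5, 8};              /* assumed burst times, all arriving at time 0 */
    int n = 3, quantum = 2;
    int rem[3], wt[3] = {0};
    for (int i = 0; i < n; i++)
        rem[i] = bt[i];                 /* remaining burst time of each process */

    int t = 0, done = 0;
    while (done < n)
    {
        done = 0;
        for (int i = 0; i < n; i++)
        {
            if (rem[i] == 0) { done++; continue; }
            int run = rem[i] > quantum ? quantum : rem[i];
            t += run;                   /* process i runs for one time slice */
            rem[i] -= run;
            if (rem[i] == 0)
                wt[i] = t - bt[i];      /* waiting time = completion time - burst (arrival 0) */
        }
    }
    for (int i = 0; i < n; i++)
        printf("P%d waiting time = %d\n", i + 1, wt[i]);
    return 0;
}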
Multilevel Queue Scheduling
A multi-level queue scheduling algorithm partitions the ready queue into several separate
queues. The processes are permanently assigned to one queue, generally based on some
property of the process, such as memory size, process priority, or process type. Each queue
has its own scheduling algorithm.
For example: separate queues might be used for foreground and background processes. The
foreground queue might be scheduled by Round Robin algorithm, while the background
queue is scheduled by an FCFS algorithm.
In addition, there must be scheduling among the queues, which is commonly implemented as
fixed-priority preemptive scheduling. For example: The foreground queue may have
absolute priority over the background queue.
Let's look at an example of a multilevel queue scheduling algorithm with five queues, listed
below in order of priority:
1. System Processes
2. Interactive Processes
3. Interactive Editing Processes
4. Batch Processes
5. Student Processes
Each queue has absolute priority over lower-priority queues. No process in the batch queue,
for example, could run unless the queues for system processes, interactive processes, and
interactive editing processes were all empty. If an interactive editing process entered the
ready queue while a batch process was running, the batch process would be preempted.
Multilevel Feedback Queue Scheduling
In a multilevel queue-scheduling algorithm, processes are permanently assigned to a queue
on entry to the system. Processes do not move between queues. This setup has the advantage
of low scheduling overhead, but the disadvantage of being inflexible.
Multilevel feedback queue scheduling, however, allows a process to move between queues.
The idea is to separate processes with different CPU-burst characteristics. If a process uses
too much CPU time, it will be moved to a lower-priority queue. Similarly, a process that
waits too long in a lower-priority queue may be moved to a higher-priority queue. This form
of aging prevents starvation.
The definition of a multilevel feedback queue scheduler makes it the most general CPU-
scheduling algorithm. It can be configured to match a specific system under design.
Unfortunately, it also requires some means of selecting values for all the parameters to define
the best scheduler. Although a multilevel feedback queue is the most general sche me, it is
also the most complex.
Introduction to Threads
As each thread has its own independent resources for execution, multiple processes
can be executed in parallel by increasing the number of threads.
Types of Thread
There are two types of threads:
1. User Threads
2. Kernel Threads
User threads are implemented above the kernel and without kernel support. These are the threads that
application programmers would use in their programs.
Kernel threads are supported within the kernel of the OS itself. All modern OSs support
kernel level threads, allowing the kernel to perform multiple simultaneous tasks and/or to
service multiple kernel system calls simultaneously.
Multithreading Models
The user threads must be mapped to kernel threads, by one of the following strategies:
One to One Model
The one to one model creates a separate kernel thread to handle each and every user thread.
Most implementations of this model place a limit on how many threads can be created.
Linux and Windows from 95 to XP implement the one-to-one model for threads.
Many to Many Model
The many to many model multiplexes any number of user threads onto an equal or smaller
number of kernel threads, combining the best features of the one-to-one and many-to-one
models.
Users can create any number of threads.
Blocking the kernel system calls does not block the entire process.
Processes can be split across multiple processors.
Thread Libraries
Thread libraries may be implemented either in user space or in kernel space. The user space
approach involves API functions implemented solely within user space, with no kernel support. The
kernel space approach involves system calls, and requires a kernel with thread library support.
1. POSIX Pthreads: may be provided as either a user or kernel library, as an extension to the
POSIX standard.
2. Win32 threads: are provided as a kernel-level library on Windows systems.
3. Java threads: since Java generally runs on a Java Virtual Machine, the implementation of
threads is based upon whatever OS and hardware the JVM is running on, i.e. either Pthreads
or Win32 threads depending on the system.
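As an illustration of using a kernel-supported thread library, here is a minimal Pthreads sketch (compile with -lpthread) that creates one thread and waits for it to finish:

#include <stdio.h>
#include <pthread.h>

/* function executed by the new thread */
void *worker(void *arg)
{
    int id = *(int *)arg;
    printf("Hello from thread %d\n", id);
    return NULL;
}

int main()
{
    pthread_t tid;
    int id = 1;

    pthread_create(&tid, NULL, worker, &id);   /* create the thread                    */
    pthread_join(tid, NULL);                   /* wait for it, like wait() for a child */

    printf("Thread has finished\n");
    return 0;
}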
Benefits of Multithreading
1. Responsiveness
2. Resource sharing, hence allowing better utilization of resources.
3. Economy. Creating and managing threads becomes easier.
4. Scalability. One thread runs on one CPU. In Multithreaded processes, threads can be
distributed over a series of processors to scale.
5. Context Switching is smooth. Context switching refers to the procedure followed by CPU to
change from one task to another.
Multithreading Issues
Below we have mentioned a few issues related to multithreading. Well, as the old saying goes,
all good things come at a price.
Thread Cancellation
Thread cancellation means terminating a thread before it has finished working. There can be
two approaches for this: one is Asynchronous cancellation, which terminates the target
thread immediately. The other is Deferred cancellation, which allows the target thread to
periodically check whether it should be cancelled.
Signal Handling
Signals are used in UNIX systems to notify a process that a particular event has occurred.
Now, when a multithreaded process receives a signal, to which thread should it be delivered?
It can be delivered to all threads, or to a single thread.
fork() System Call
fork() is a system call executed in the kernel through which a process creates a copy of itself.
Now the problem in a multithreaded process is: if one thread calls fork, should the entire process
be copied, or not?
Security Issues
Yes, there can be security issues because of extensive sharing of resources between multiple
threads.
There are many other issues that you might face in a multithreaded process, but there are
appropriate solutions available for them. Pointing out some issues here was just to study both
sides of the coin.
Process Synchronization
Process Synchronization means sharing system resources by processes in such a way that
concurrent access to shared data is handled, thereby minimizing the chance of inconsistent
data. Maintaining data consistency demands mechanisms to ensure synchronized execution of
cooperating processes.
Process Synchronization was introduced to handle problems that arose while multiple processes
were executing. Some of the problems are discussed below.
A solution to the critical section problem must satisfy the following three conditions:
1. Mutual Exclusion
Out of a group of cooperating processes, only one process can be in its critical section at a
given point of time.
2. Progress
If no process is in its critical section, and if one or more threads want to execute their critical
section then any one of these threads must be allowed to get into its critical section.
3. Bounded Waiting
After a process makes a request for getting into its critical section, there is a limit for how
many other processes can get into their critical section, before this process's request is
granted. So after the limit is reached, system must grant the process permission to get into its
critical section.
Synchronization Hardware
Many systems provide hardware support for critical section code. The critical section
problem could be solved easily in a single-processor environment if we could prevent
interrupts from occurring while a shared variable or resource is being modified.
In this manner, we could be sure that the current sequence of instructions would be allowed
to execute in order without pre-emption. Unfortunately, this solution is not feasible in a
multiprocessor environment, as disabling interrupts there means passing the message to all the
processors, which can be time consuming.
This message transmission lag delays entry of threads into the critical section, and system
efficiency decreases.
Mutex Locks
As the synchronization hardware solution is not easy to implement for everyone, a strict
software approach called Mutex Locks was introduced. In this approach, in the entry section
of code, a LOCK is acquired over the critical resources modified and used inside the critical
section, and in the exit section that LOCK is released.
As the resource is locked while a process executes its critical section, no other process
can access it.
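A minimal sketch of this entry/exit section idea, using a Pthreads mutex around a shared counter (the counter and thread function are only illustrative), could look like this:

#include <stdio.h>
#include <pthread.h>

pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
int shared_counter = 0;                  /* the critical resource */

void *increment(void *arg)
{
    for (int i = 0; i < 100000; i++)
    {
        pthread_mutex_lock(&lock);       /* entry section: acquire the LOCK */
        shared_counter++;                /* critical section                */
        pthread_mutex_unlock(&lock);     /* exit section: release the LOCK  */
    }
    return NULL;
}

int main()
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, increment, NULL);
    pthread_create(&t2, NULL, increment, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("shared_counter = %d\n", shared_counter);   /* always 200000 */
    return 0;
}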
Introduction to Semaphores
In 1965, Dijkstra proposed a new and very significant technique for managing concurrent
processes by using the value of a simple integer variable to synchronize the progress of
interacting processes. This integer variable is called a semaphore. So it is basically a
synchronizing tool and is accessed only through two standard atomic operations, wait
and signal, designated by P(S) and V(S) respectively.
In very simple words, a semaphore is a variable which can hold only a non-negative integer
value, shared between all the threads, with the operations wait and signal, which work as follows:
P(S): if S ≥ 1 then S := S - 1
      else <block and enqueue the process>;
V(S): if <some process is blocked on the queue> then <unblock a process>
      else S := S + 1;
Wait (P): decrements the value of its argument S if it is positive; if S is 0, the calling
process blocks until S becomes positive again.
Signal (V): wakes a blocked process if there is one waiting on the queue; otherwise it
increments the value of its argument S.
Properties of Semaphores
1. It is simple and always has a non-negative integer value.
2. Works with many processes.
3. Can have many different critical sections with different semaphores.
4. Each critical section has unique access semaphores.
5. Can permit multiple processes into the critical section at once, if desirable.
Types of Semaphores
Semaphores are mainly of two types:
1. Binary Semaphore: can take only the values 0 and 1, and is used to implement mutual
exclusion between processes.
2. Counting Semaphore: can take any non-negative integer value. These are used to
implement bounded concurrency, for example controlling access to a resource that has a
finite number of instances.
Example of Use
Here is a simple step wise implementation involving declaration and usage of semaphore.
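Since the step-wise example itself is not reproduced here, below is a minimal sketch using POSIX semaphores (sem_init to declare, sem_wait as the wait/P operation, sem_post as the signal/V operation) protecting a shared variable:

#include <stdio.h>
#include <pthread.h>
#include <semaphore.h>

sem_t s;                     /* the semaphore */
int shared = 0;

void *task(void *arg)
{
    sem_wait(&s);            /* wait / P(S): decrement or block     */
    shared++;                /* critical section                    */
    sem_post(&s);            /* signal / V(S): increment or wake up */
    return NULL;
}

int main()
{
    pthread_t t1, t2;
    sem_init(&s, 0, 1);      /* declare the semaphore with initial value 1 (binary) */

    pthread_create(&t1, NULL, task, NULL);
    pthread_create(&t2, NULL, task, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    printf("shared = %d\n", shared);
    sem_destroy(&s);
    return 0;
}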
Limitations of Semaphores
1. Priority Inversion is a big limitation of semaphores.
2. Their use is not enforced, but is by convention only.
3. With improper use, a process may block indefinitely. Such a situation is called a
Deadlock. We will be studying deadlocks in detail in the coming lessons.
Below are some of the classical problems depicting flaws of process synchronization in
systems where cooperating processes are present.
Because the buffer pool has a maximum size, this problem is often called the
Bounded Buffer problem.
The solution to this problem is to create two counting semaphores, "full" and "empty", to
keep track of the current number of full and empty buffers respectively.
Deadlock Prevention
Deadlocks can be prevented by ensuring that at least one of the following four necessary
conditions does not hold:
1. Mutual Exclusion
Shared resources such as read-only files do not lead to deadlocks, but resources such
as printers and tape drives require exclusive access by a single process.
2. Hold and Wait
In this condition, processes must be prevented from holding one or more resources
while simultaneously waiting for one or more others.
3. No Preemption
Resources already allocated to a process cannot be forcibly taken away from it; they
can only be released voluntarily by the process holding them.
4. Circular Wait
Circular wait can be avoided if we number all resources, and require that processes
request resources only in strictly increasing (or decreasing) order.
Handling Deadlock
The above points focus on preventing deadlocks. But what to do once a deadlock has
occurred? The following strategies can be used to remove a deadlock after its occurrence.
1. Preemption
We can take a resource from one process and give it to another. This will resolve the
deadlock situation, but sometimes it causes problems.
2. Rollback
In situations where deadlock is a real possibility, the system can periodically make a
record of the state of each process and when deadlock occurs, roll everything back to
the last checkpoint, and restart, but allocating resources differently so that deadlock
does not occur.
What is a Livelock?
There is a variant of deadlock called livelock. This is a situation in which two or more
processes continuously change their state in response to changes in the other process(es)
without doing any useful work. This is similar to deadlock in that no progress is made but
differs in that neither process is blocked or waiting for anything.
A human example of livelock would be two people who meet face-to-face in a corridor and
each moves aside to let the other pass, but they end up swaying from side to side without
making any progress because they always move the same way at the same time.
Introduction to Memory Management
Main Memory refers to a physical memory that is the internal memory to the computer. The
word main is used to distinguish it from external mass storage devices such as disk drives.
Main memory is also known as RAM. The computer is able to change only data that is in
main memory. Therefore, every program we execute and every file we access must be copied
from a storage device into main memory.
All programs are loaded into the main memory for execution. Sometimes the complete
program is loaded into memory, but sometimes a certain part or routine of the program is
loaded into main memory only when it is called by the program. This mechanism is called
Dynamic Loading, and it enhances performance.
Also, at times one program is dependent on some other program. In such a case, rather than
loading all the dependent programs, the CPU links the dependent programs to the main executing
program when required. This mechanism is known as Dynamic Linking.
Swapping
A process needs to be in memory for execution. But sometimes there is not enough main
memory to hold all the currently active processes in a timesharing system. So, excess processes
are kept on disk and brought in to run dynamically. Swapping is the process of bringing
each process into main memory, running it for a while, and then putting it back to the disk.
Memory Protection
Memory protection is a phenomenon by which we control memory access rights on a
computer. The main aim of it is to prevent a process from accessing memory that has not
been allocated to it. Hence it prevents a bug within a process from affecting other processes, or
the operating system itself, and instead results in a segmentation fault or storage violation
exception being sent to the offending process, generally killing the process.
Memory Allocation
Memory allocation is a process by which computer programs are assigned memory or space.
It is of three types:
1. First Fit: the process is allocated the first hole in memory that is big enough (a small
first-fit sketch is given below).
2. Best Fit: the process is allocated the smallest hole that is big enough, leaving the least
leftover space.
3. Worst Fit: the process is allocated the largest available hole.
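A small first-fit sketch, assuming an example list of free hole sizes and a single request, could look like this:

#include <stdio.h>

/* first fit: return the index of the first hole large enough, or -1 */
int first_fit(int holes[], int m, int request)
{
    for (int i = 0; i < m; i++)
        if (holes[i] >= request)
            return i;
    return -1;
}

int main()
{
    int holes[] = {100, 500, 200, 300, 600};   /* assumed free hole sizes in KB */
    int m = sizeof holes / sizeof holes[0];
    int request = 212;                         /* assumed process request in KB */

    int idx = first_fit(holes, m, request);
    if (idx >= 0)
    {
        printf("Request of %d KB placed in hole %d (%d KB)\n", request, idx, holes[idx]);
        holes[idx] -= request;                 /* the leftover space becomes a smaller hole */
    }
    else
        printf("No hole large enough\n");
    return 0;
}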
Fragmentation
Fragmentation occurs in a dynamic memory allocation system when most of the free blocks
are too small to satisfy any request. It is generally termed as the inability to use the available
memory.
In such a situation, processes are loaded into and removed from the memory. As a result of this,
free holes exist to satisfy a request, but they are non-contiguous, i.e. the memory is fragmented
into a large number of small holes. This phenomenon is known as External Fragmentation.
Also, at times the physical memory is broken into fixed-size blocks and memory is allocated
in units of block sizes. The memory allocated to a space may be slightly larger than the
requested memory. The difference between allocated and required memory is known as
Internal Fragmentation, i.e. the memory that is internal to a partition but is of no use.
Paging
A solution to the fragmentation problem is Paging. Paging is a memory management mechanism
that allows the physical address space of a process to be non-contiguous. Here physical
memory is divided into fixed-size blocks called frames, and logical memory is divided into
blocks of the same size called pages. The pages belonging to a certain
process are loaded into available memory frames.
Page Table
A Page Table is the data structure used by a virtual memory system in a computer operating
system to store the mapping between virtual addresses and physical addresses.
The virtual address is also known as the logical address and is generated by the CPU, while
the physical address is the address that actually exists in memory.
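As a small illustration of this mapping, the sketch below (the page size, page table contents and logical address are assumed example values) splits a logical address into a page number and an offset, and combines the frame number with the offset to form the physical address:

#include <stdio.h>

#define PAGE_SIZE 1024                   /* assumed page size of 1 KB */

int main()
{
    int page_table[] = {5, 2, 7, 0};     /* assumed: page_table[p] = frame holding page p */

    int logical_address = 3000;          /* assumed logical address generated by the CPU */
    int page_number = logical_address / PAGE_SIZE;   /* which page           */
    int offset      = logical_address % PAGE_SIZE;   /* position inside page */

    int frame = page_table[page_number];
    int physical_address = frame * PAGE_SIZE + offset;

    printf("page %d, offset %d -> frame %d, physical address %d\n",
           page_number, offset, frame, physical_address);
    return 0;
}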
Segmentation
Segmentation is another memory management scheme that supports the user view of
memory. Segmentation allows breaking the virtual address space of a single process into
segments that may be placed in non-contiguous areas of physical memory.
Both paging and segmentation have their advantages and disadvantages, so it is better to
combine these two schemes to improve on each. The combined scheme is known as paged
segmentation (segmentation with paging). Each segment in this scheme is divided into pages,
and each segment maintains a page table. So the logical address is divided into the following 3 parts:
Segment number (S)
Page number (P)
The displacement or offset number (D)
Virtual Memory
In real scenarios, most processes never need all their pages at once, for the following reasons:
Error handling code is not needed unless that specific error occurs, some of which are
quite rare.
Arrays are often over-sized for worst-case scenarios, and only a small fraction of the
arrays are actually used in practice.
Certain features of certain programs are rarely used.
Benefits of having Virtual Memory
1. Large programs can be written, as the virtual address space available is huge compared to
physical memory.
2. More processes can be kept in main memory at the same time, which increases CPU
utilization and throughput.
3. Less I/O is needed to load or swap a process, since only the required pages are brought in.
Demand Paging
Initially, only those pages are loaded which will be required by the process immediately.
The pages that are not moved into the memory are marked as invalid in the page table. For
an invalid entry, the rest of the table is empty. Pages that are loaded in the memory are marked
as valid, along with the information about where to find the swapped-out page.
When the process requires any of the pages that are not loaded into the memory, a page fault
trap is triggered and the following steps are followed:
1. The memory address which is requested by the process is first checked, to verify the
request made by the process.
2. If it is found to be invalid, the process is terminated.
3. In case the request by the process is valid, a free frame is located, possibly from a
free-frame list, where the required page will be moved.
4. A new operation is scheduled to move the necessary page from disk to the specified
memory location. (This will usually block the process on an I/O wait, allowing some
other process to use the CPU in the meantime.)
5. When the I/O operation is complete, the process's page table is updated with the new
frame number, and the invalid bit is changed to valid.
6. The instruction that caused the page fault must now be restarted from the beginning.
There are cases when no pages are loaded into the memory initially; pages are only loaded
when demanded by the process by generating page faults. This is called Pure Demand
Paging.
The major issue with demand paging is that, after a page fault, the faulting instruction has to
be restarted once the page is loaded. This is not a big issue for small programs, but for larger
programs that fault frequently it affects performance drastically.
Page Replacement
As studied in Demand Paging, only certain pages of a process are loaded initially into the
memory. This allows us to get more processes into the memory at the same time.
But what happens when a process requests more pages and no free memory is available to
bring them in? The following steps can be taken to deal with this problem:
1. Put the process in the wait queue, until any other process finishes its execution,
thereby freeing frames.
2. Or, remove some other process completely from the memory to free frames.
3. Or, find some pages that are not being used right now, move them to the disk to get
free frames. This technique is called Page Replacement and is the most commonly used.
We have some great algorithms to carry out page replacement efficiently.
Find the location of the page requested by the ongoing process on the disk.
Find a free frame. If there is a free frame, use it. If there is no free frame, use a page-
replacement algorithm to select an existing frame to be replaced; such a frame is
known as the victim frame.
Write the victim frame to disk. Change all related page tables to indicate that this page
is no longer in memory.
Move the required page and store it in the frame. Adjust all related page and frame
tables to indicate the change.
Restart the process that was waiting for this page.
FIFO Page Replacement
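In FIFO page replacement, the operating system keeps track of the order in which pages were brought into memory, and on a page fault the page that has been in memory the longest is chosen as the victim. A minimal sketch, using an assumed reference string and three frames, could look like this:

#include <stdio.h>

int main()
{
    int pages[]  = {7, 0, 1, 2, 0, 3, 0, 4};   /* assumed page reference string */
    int n        = sizeof pages / sizeof pages[0];
    int capacity = 3;                          /* assumed number of frames */

    int frames[3] = {-1, -1, -1};
    int next = 0, faults = 0;                  /* 'next' points to the oldest frame */

    for (int i = 0; i < n; i++)
    {
        int found = 0;
        for (int j = 0; j < capacity; j++)
            if (frames[j] == pages[i]) { found = 1; break; }

        if (!found)                            /* page fault: replace the oldest page */
        {
            frames[next] = pages[i];
            next = (next + 1) % capacity;
            faults++;
        }
    }
    printf("Total page faults = %d\n", faults);
    return 0;
}

For this reference string the sketch reports 7 page faults.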
Thrashing
A process that is spending more time paging than executing is said to be thrashing. In other
words it means, that the process doesn't have enough frames to hold all the pages for its
execution, so it is swapping pages in and out very frequently to keep executing. Sometimes,
the pages which will be required in the near future have to be swapped out.
Initially, when the CPU utilization is low, the process scheduling mechanism, in order to increase
the level of multiprogramming, loads multiple processes into the memory at the same time,
allocating a limited amount of frames to each process. As the memory fills up, processes start
to spend a lot of time waiting for the required pages to be swapped in, again leading to low CPU
utilization because most of the processes are waiting for pages. Hence the scheduler loads
more processes to increase CPU utilization; as this continues, at a point of time the complete
system comes to a stop.
To prevent thrashing we must provide processes with as many frames as they really need
"right now".
File Structure
A file has various kinds of structure. Some of them can be:
Attributes of a File
File Access Methods
1. Sequential Access: information in the file is processed in order, one record after the other.
2. Direct Access: records can be read or written in any order, by jumping directly to the
required block.
What is a Directory?
Information about files is maintained by Directories. A directory can contain multiple files. It
can even have directories inside of it. In Windows we also call these directories folders.
Banker's Algorithm
Consider there are n account holders in a bank and the sum of the money in all of their
accounts is S. Every time a loan has to be granted by the bank, it subtracts the loan amount
from the total money the bank has. Then it checks if that difference is greater than S. It is
done because, only then, the bank would have enough money even if all the n account holders
draw all their money at once.
Whenever a new process is created, it must specify the maximum instances of each resource
type that it needs, exactly.
Let us assume that there are n processes and m resource types. Some data structures that are
used to implement the banker's algorithm are:
1. Available
It is a vector of length m which indicates the number of available resources of each type. If
Available[j] = k, then there are k instances of resource type R(j) available.
2. Max
It is an n x m matrix which represents the maximum number of instances of each resource that
a process can request. If Max[i][j] = k, then the process P(i) can request at most k instances of
resource type R(j).
3. Allocation
It is an n x m matrix which represents the number of resources of each type currently allocated
to each process. If Allocation[i][j] = k, then process P(i) is currently allocated k instances of
resource type R(j).
4. Need
It is an n x m matrix which indicates the remaining resource needs of each process. If Need[i][j]
= k, then process P(i) may need k more instances of resource type R(j) to complete its task.
Resource Request Algorithm
1. If the number of requested instances of each resource is less than or equal to the need
(which was declared previously by the process), go to step 2.
2. If the number of requested instances of each resource type is less than or equal to the
available resources of each type, go to step 3. If not, the process has to wait because
sufficient resources are not available yet.
3. Now, assume that the resources have been allocated. Accordingly do,
Available = Available - Request(i)
Allocation(i) = Allocation(i) + Request(i)
Need(i) = Need(i) - Request(i)
This step is done because the system needs to assume that resources have been allocated. So
there will be fewer resources available after allocation. The number of allocated instances will
increase. The need of the resources by the process will reduce. That's what is represented by
the above three operations.
After completing the above three steps, check if the system is in safe state by applying the
safety algorithm. If it is in safe state, proceed to allocate the requested resources. Else, the
process has to wait longer.
Safety Algorithm
1. Let Work and Finish be vectors of length m and n, respectively. Initially,
Work = Available
Finish[i] = false for i = 0, 1, ..., n - 1.
This means, initially, no process has finished and the number of available resources is
represented by the Available array.
2. Find an index i such that both
Finish[i] == false
Need(i) <= Work
It means we need to find an unfinished process whose need can be satisfied by the
available resources. If no such process exists, just go to step 4.
3. Perform the following:
Work = Work + Allocation(i)
Finish[i] = true
Go to step 2.
When an unfinished process is found, then the resources are allocated and the process
is marked finished. And then, the loop is repeated to check the same for all other
processes.
4. If Finish[i] == true for all i, then the system is in a safe state.
That means if all processes are finished, then the system is in safe state.
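A minimal C sketch of this safety check (the Available, Allocation and Need values are assumed example data, not taken from the text) could look like this:

#include <stdio.h>
#include <stdbool.h>

#define N 5   /* number of processes (assumed example) */
#define M 3   /* number of resource types              */

int main()
{
    int available[M]     = {3, 3, 2};
    int allocation[N][M] = {{0,1,0},{2,0,0},{3,0,2},{2,1,1},{0,0,2}};
    int need[N][M]       = {{7,4,3},{1,2,2},{6,0,0},{0,1,1},{4,3,1}};

    int work[M];
    bool finish[N] = {false};
    for (int j = 0; j < M; j++)
        work[j] = available[j];                    /* Work = Available */

    int count = 0;
    bool progress = true;
    while (progress)
    {
        progress = false;
        for (int i = 0; i < N; i++)
        {
            if (finish[i]) continue;
            bool can_run = true;
            for (int j = 0; j < M; j++)
                if (need[i][j] > work[j]) { can_run = false; break; }
            if (can_run)                           /* Need(i) <= Work */
            {
                for (int j = 0; j < M; j++)
                    work[j] += allocation[i][j];   /* Work = Work + Allocation(i) */
                finish[i] = true;
                count++;
                progress = true;
                printf("P%d can finish\n", i);
            }
        }
    }
    printf("%s\n", count == N ? "System is in a safe state"
                              : "System is NOT in a safe state");
    return 0;
}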
The Producer-Consumer Problem
A producer tries to insert data into an empty slot of the buffer. A consumer tries to remove
data from a filled slot in the buffer. As you might have guessed by now, those two processes
won't produce the expected output if they are being executed concurrently.
There needs to be a way to make the producer and consumer work in an independent manner.
Here's a Solution
One solution to this problem is to use semaphores. The semaphores which will be used here
are:
mutex: a binary semaphore used to acquire and release the lock on the buffer.
empty: a counting semaphore whose initial value is the number of slots in the buffer
(initially, all slots are empty).
full: a counting semaphore whose initial value is 0.
At any instant, the current value of empty represents the number of empty slots in the buffer
and full represents the number of occupied slots in the buffer.
The Producer Operation
The pseudocode of the producer function looks like this:
do
{
    // wait until empty > 0 and then decrement 'empty'
    wait(empty);
    // acquire lock
    wait(mutex);
    /* perform the insert operation in a slot */
    // release lock
    signal(mutex);
    // increment 'full'
    signal(full);
}
while(TRUE);
Looking at the above code for a producer, we can see that a producer first waits until
there is at least one empty slot.
Then it decrements the empty semaphore because, there will now be one less empty
slot, since the producer is going to insert data in one of those slots.
Then, it acquires lock on the buffer, so that the consumer cannot access the buffer
until producer completes its operation.
After performing the insert operation, the lock is released and the value of full is
incremented because the producer has just filled a slot in the buffer.
The Consumer Operation
The pseudocode for the consumer function looks like this:
do
{
    // wait until full > 0 and then decrement 'full'
    wait(full);
    // acquire the lock
    wait(mutex);
    /* remove data from a full slot */
    // release the lock
    signal(mutex);
    // increment 'empty'
    signal(empty);
} while(TRUE);
The consumer waits until there is at least one full slot in the buffer.
Then it decrements the full semaphore because the number of occupied slots will be
decreased by one, after the consumer completes its operation.
After that, the consumer acquires lock on the buffer.
Following that, the consumer completes the removal operation so that the data from
one of the full slots is removed.
Then, the consumer releases the lock.
Finally, the empty semaphore is incremented by 1, because the consumer has just
removed data from an occupied slot, thus making it empty.
Dining Philosophers Problem
The dining philosophers problem is another classic synchronization problem which is used to
evaluate situations where there is a need of allocating multiple resources to multiple
processes.
At any instant, a philosopher is either eating or thinking. When a philosopher wants to eat, he
uses two chopsticks - one from his left and one from his right. When a philosopher wants
to think, he keeps down both chopsticks at their original place.
while(TRUE)
{
wait(stick[i]);
/*
mod is used because if i=5, next
chopstick is 1 (dining table is circular)
*/
wait(stick[(i+1) % 5]);
/* eat */
signal(stick[i]);
signal(stick[(i+1) % 5]);
/* think */
}
When a philosopher wants to eat the rice, he will wait for the chopstick at his left and picks
up that chopstick. Then he waits for the right chopstick to be available, and then picks it too.
After eating, he puts both the chopsticks down.
But if all five philosophers are hungry simultaneously, and each of them picks up one
chopstick, then a deadlock situation occurs because they will be waiting for another chopstick
forever. The possible solutions for this are:
A philosopher must be allowed to pick up the chopsticks only if both the left and right
chopsticks are available.
Allow only four philosophers to sit at the table. That way, if all the four philosophers
pick up four chopsticks, there will be one chopstick left on the table. So, one
philosopher can start eating and eventually, two chopsticks will be ava ilable. In this
way, deadlocks can be avoided.
The Readers-Writers Problem
Here, we use one mutex m and a semaphore w. An integer variable read_count is used to
maintain the number of readers currently accessing the resource. The variable read_count is
initialized to 0. A value of 1 is given initially to m and w.
Instead of having the process to acquire lock on the shared resource, we use the mutex m to
make the process to acquire and release lock whenever it is updating the read_count variable.
The code for the writer process looks like this:
while(TRUE)
{
    wait(w);
    /* perform the write operation */
    signal(w);
}
And, the code for the reader process looks like this:
while(TRUE)
{
//acquire lock
wait(m);
read_count++;
if(read_count == 1)
wait(w);
//release lock
signal(m);
/* perform the read operation */
// acquire lock
wait(m);
read_count--;
if(read_count == 0)
signal(w);
// release lock
signal(m);
}
Here is the Code uncoded(explained)
As seen above in the code for the writer, the writer just waits on the w semaphore
until it gets a chance to write to the resource.
After performing the write operation, it signals w so that the next reader or writer can
access the resource.
On the other hand, in the code for the reader, the lock is acquired whenever the
read_count is updated by a process.
When a reader wants to access the resource, first it increments the read_count value,
then accesses the resource and then decrements the read_count value.
The semaphore w is used by the first reader which enters the critical section and the
last reader which exits the critical section.
The reason for this is that, when the first reader enters the critical section, the writer is
blocked from the resource; only new readers can access the resource now.
Similarly, when the last reader exits the critical section, it signals the writer using the
w semaphore because there are zero readers now and a writer can have the chance to
access the resource.
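The pseudocode above maps almost directly onto POSIX threads and semaphores. Below is a minimal runnable sketch of the same reader-preference scheme; the shared integer, the single-pass threads and the print statements are illustrative assumptions.

// readers_writers.c -- illustrative sketch; compile with: gcc readers_writers.c -lpthread
#include <stdio.h>
#include <pthread.h>
#include <semaphore.h>

sem_t w;                                        // taken by writers and by the first/last reader
pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;  // protects read_count
int read_count = 0;
int shared_data = 0;                            // the shared resource (assumed to be an int)

void *writer(void *arg)
{
    sem_wait(&w);                   // wait for exclusive access
    shared_data++;                  // perform the write operation
    printf("writer wrote %d\n", shared_data);
    sem_post(&w);                   // let the next reader or writer in
    return NULL;
}

void *reader(void *arg)
{
    pthread_mutex_lock(&m);         // acquire lock to update read_count
    read_count++;
    if (read_count == 1)
        sem_wait(&w);               // first reader blocks writers
    pthread_mutex_unlock(&m);       // release lock

    printf("reader read %d\n", shared_data);   // perform the reading operation

    pthread_mutex_lock(&m);         // acquire lock to update read_count
    read_count--;
    if (read_count == 0)
        sem_post(&w);               // last reader lets writers in again
    pthread_mutex_unlock(&m);       // release lock
    return NULL;
}

int main(void)
{
    pthread_t r1, r2, wr;
    sem_init(&w, 0, 1);
    pthread_create(&r1, NULL, reader, NULL);
    pthread_create(&wr, NULL, writer, NULL);
    pthread_create(&r2, NULL, reader, NULL);
    pthread_join(r1, NULL);
    pthread_join(wr, NULL);
    pthread_join(r2, NULL);
    return 0;
}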
Secondary Storage and Disk Scheduling
A magnetic disk contains several platters. Each platter is divided into circular tracks, and
the tracks near the centre are shorter than the tracks farther from the centre. Each track is
further divided into sectors.
Tracks that are the same distance from the centre form a cylinder. A read-write head is used
to read data from a sector of the magnetic disk.
Transfer rate: This is the rate at which data moves from the disk to the computer.
Random access time: It is the sum of the seek time and the rotational latency.
Seek time is the time taken by the disk arm to move the head to the required track.
Rotational latency is the time taken for the required sector of the track to rotate under the
read-write head.
Even though the disk is physically arranged as tracks and sectors, the data is logically
arranged and addressed as an array of fixed-size blocks. The size of a block is typically 512
or 1024 bytes. Each logical block is mapped to a sector on the disk sequentially, so every
sector on the disk gets a logical address.
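As an illustration of this sequential mapping, the conventional formula for converting a (cylinder, head, sector) address into a logical block address can be written as a small function. The disk geometry used below is an assumption for the example, not taken from the text.

// lba.c -- illustrative sketch of the conventional CHS-to-LBA mapping
#include <stdio.h>

#define HEADS_PER_CYLINDER 16   // assumed geometry
#define SECTORS_PER_TRACK  63   // assumed geometry

// Blocks are numbered sequentially; sector numbers start at 1,
// while cylinder and head numbers start at 0.
long chs_to_lba(long cylinder, long head, long sector)
{
    return (cylinder * HEADS_PER_CYLINDER + head) * SECTORS_PER_TRACK + (sector - 1);
}

int main(void)
{
    // e.g. cylinder 2, head 3, sector 10 -> logical block 2214 with this geometry
    printf("LBA = %ld\n", chs_to_lba(2, 3, 10));
    return 0;
}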
FCFS algorithm
This algorithm serves requests in the order in which they arrive in the disk queue. Let's take
an example where the queue contains requests for a set of cylinder numbers.
Assume the head is initially at cylinder 56. The head moves through the queue in the given
order, i.e., 56→98→183→...→67.
SSTF algorithm
Here, the request closest to the current head position is served first. Consider the previous
example with the same disk queue.
Assume the head is initially at cylinder 56. The closest cylinder to 56 is 65, the next nearest
one is 67, then 37, then 14, and so on.
SCAN algorithm
This algorithm is also called the elevator algorithm because of its behavior. Here, the head
first moves in one direction (say backward) and serves all the requests in its path. Then it
moves in the opposite direction and serves the remaining requests. This behavior is similar
to that of an elevator. Let's take the previous example.
Assume the head is initially at cylinder 56. The head moves in backward direction and
accesses 37 and 14. Then it goes in the opposite direction and accesses the cylinders as they
come in the path.
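To make the comparison concrete, here is a minimal sketch that computes the total head movement for FCFS and SSTF. The starting head position of 56 comes from the text; the request queue below is assembled from the cylinder numbers mentioned above and is otherwise an assumption, since the full list of requests is not reproduced here.

// disk_sched.c -- illustrative comparison of FCFS and SSTF head movement
#include <stdio.h>
#include <stdlib.h>

#define NREQ 6

int fcfs_movement(const int *req, int n, int head)
{
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += abs(req[i] - head);   // move straight to the next request in queue order
        head = req[i];
    }
    return total;
}

int sstf_movement(const int *req, int n, int head)
{
    int done[NREQ] = {0};
    int total = 0;
    for (int served = 0; served < n; served++) {
        int best = -1;
        for (int i = 0; i < n; i++)    // pick the closest request not yet served
            if (!done[i] && (best < 0 || abs(req[i] - head) < abs(req[best] - head)))
                best = i;
        done[best] = 1;
        total += abs(req[best] - head);
        head = req[best];
    }
    return total;
}

int main(void)
{
    int queue[NREQ] = {98, 183, 37, 14, 65, 67};   // assumed request queue
    int head = 56;                                  // starting position from the text
    printf("FCFS total head movement: %d\n", fcfs_movement(queue, NREQ, head));
    printf("SSTF total head movement: %d\n", sstf_movement(queue, NREQ, head));
    return 0;
}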
Introduction to System Calls
To understand system calls, one first needs to understand the difference between kernel
mode and user mode of a CPU. Every modern operating system supports these two modes.
Kernel Mode
When CPU is in kernel mode, the code being executed can access any memory
address and any hardware resource.
Hence kernel mode is a very privileged and powerful mode.
If a program crashes in kernel mode, the entire system will be halted.
User Mode
When CPU is in user mode, the programs don't have direct access to memory and
hardware resources.
In user mode, if any program crashes, only that particular program is halted.
That means the system will be in a safe state even if a program in user mode crashes.
Hence, most programs in an OS run in user mode.
System Call
When a program in user mode requires access to RAM or a hardware resource, it must ask
the kernel to provide access to that resource. This is done via something called a system call.
When a program makes a system call, the CPU switches from user mode to kernel mode;
this transition is commonly called a mode switch (and is sometimes loosely described as a
context switch).
The kernel then provides the resource which the program requested. After that, the CPU
switches back from kernel mode to user mode.
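As a small illustration of this user-to-kernel transition, the sketch below uses the standard POSIX write() call, which on a UNIX-like system is a thin wrapper around the write system call.

// syscall_demo.c -- a user-mode program requesting a kernel service
#include <unistd.h>
#include <string.h>

int main(void)
{
    const char *msg = "hello from user mode\n";
    // Control enters the kernel here; the kernel copies the bytes to the
    // terminal (file descriptor 1) and then returns to user mode.
    write(STDOUT_FILENO, msg, strlen(msg));
    return 0;
}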
Generally, system calls are made by user-level programs when they need a service from the
kernel, for example to perform file operations, create processes, or access I/O devices.
In a typical UNIX system, there are around 300 system calls. Some of the important ones in
this context are described below.
Fork()
The fork() system call is used to create processes. When a process (a program in execution)
makes a fork() call, an exact copy of the process is created. Now there are two processes, one
being the parent process and the other being the child process.
The process which made the fork() call is the parent process, and the newly created process
is the child process. The child process will be an exact copy of the parent.
Note that the process state of the parent, i.e., the address space, variables, open files, etc., is
copied into the child process. This means that the parent and child processes have identical
but physically separate address spaces: a change of values in the parent process does not
affect the child, and vice versa.
Both processes start execution from the next line of code, i.e., the line after the fork() call.
Let's look at an example:
// example.c
#include <stdio.h>
#include <unistd.h>

int main()
{
    int val;
    val = fork();        // line A
    printf("%d\n", val); // line B
    return 0;
}
When the above code is executed and line A is reached, a child process is created. Both
processes then continue execution from line B. To differentiate between the child process
and the parent process, we need to look at the value returned by the fork() call.
The difference is that, in the parent process, fork() returns the process ID of the child
process, while in the child process, fork() returns 0.
So, according to the above program, the output of the parent process will be the process ID
of the child and the output of the child process will be 0.
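In practice, the return value of fork() is usually used to branch, so that the parent and the child run different code. A minimal sketch of this common pattern (the messages printed are illustrative):

// fork_branch.c -- branching on the return value of fork()
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>

int main(void)
{
    pid_t val = fork();
    if (val == 0) {
        // fork() returned 0: this is the child process
        printf("child: my pid is %d\n", (int)getpid());
    } else if (val > 0) {
        // fork() returned the child's pid: this is the parent process
        printf("parent: created child with pid %d\n", (int)val);
    } else {
        // fork() returns -1 when the child could not be created
        perror("fork");
    }
    return 0;
}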
Exec()
The exec() family of system calls is also used to run new programs, but there is one big
difference between fork() and exec(). A fork() call creates a new process while preserving
the parent process, whereas an exec() call replaces the address space, text segment, data
segment, etc. of the current process with those of the new program.
This means that after an exec() call, only the new program exists; the program image that
made the system call no longer does.
There are many flavors of exec() in UNIX, one being execl(), which is shown below as an
example:
// example2.c
#include <stdio.h>
#include <unistd.h>

int main()
{
    execl("/bin/ls", "ls", (char *)NULL); // line A
    printf("This text won't be printed unless an error occurs in execl().\n");
    return 0;
}
As shown above, the first parameter to the execl() function is the path of the program to be
executed, in this case the path of the ls utility in UNIX. It is followed by the name of the
program, which is ls in this case, and then by any optional arguments. The argument list
must be terminated by a NULL pointer.
When the above example is executed, the ls program is loaded and run at line A, replacing
the current program image. Hence the printf() function is never called, since the original
image no longer exists. The only exception is that if the execl() call itself fails, the printf()
function is executed.
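fork() and exec() are commonly used together: the parent forks a child, the child replaces itself with a new program using execl(), and the parent waits for the child to finish. A minimal sketch of this pattern (the choice of /bin/ls is just for illustration):

// fork_exec.c -- the common fork() + execl() + wait() pattern
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        // child: replace this process image with the ls program
        execl("/bin/ls", "ls", (char *)NULL);
        // reached only if execl() failed
        perror("execl");
        exit(1);
    } else if (pid > 0) {
        // parent: wait until the child terminates
        wait(NULL);
        printf("parent: child finished\n");
    } else {
        perror("fork");
    }
    return 0;
}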