Final CE Handbook OS
CE & IT Department
Hand Book
Operating System (3140702)
Year: 2019-2020
An operating system is an interface that provides services both to the user and to programs.
It provides an environment in which programs can execute, and it gives users a convenient
way to run those programs. The operating system provides some services to programs and
also to the users of those programs. The specific services provided differ from one
operating system to another.
Following are the common services provided by an operating system:
1. Program execution
2. I/O operations
3. File system manipulation
4. Communication
5. Error detection
6. Resource allocation
7. Protection
1) Program Execution
An operating system must be able to load a program into memory and run it. The program
must be able to end its execution, either normally or abnormally.
2) I/O Operations
The communication between the user and device drivers is managed by the operating
system.
I/O devices are required by running processes; an I/O operation may involve a file or an
I/O device.
I/O operations are the read and write operations performed with the help of input-output
devices.
The operating system gives access to the I/O devices when they are required.
3) File System Manipulation
A file is a collection of related information that represents some content. The computer
can store files on secondary storage devices for long-term storage; examples of storage
media include magnetic tape, magnetic disk and optical disks such as CD and DVD.
A file system is a collection of directories, organized for easy understanding and usage.
These directories contain files. There are some major activities which are performed by an
operating system with respect to file management:
The operating system gives programs access to perform operations on files.
Programs need to read and write files.
The user can create/delete a file by using an interface provided by the operating system.
The operating system provides an interface for the user to create/delete directories.
A backup of the file system can be created by using an interface provided by the
operating system.
4) Communication
In a computer system there may be a collection of processors which do not share memory,
peripheral devices or a clock; the operating system manages communication between all the
processes. Processes can communicate with one another through communication lines in the
network. There are some major activities that are carried out by an operating system with
respect to communication:
Two processes may require data to be transferred between them.
Both processes can be on one computer or on different computers connected through a
computer network.
6) Resource management
When there are multiple users or multiple jobs running at the same time, resources must be
allocated to each of them. Some major activities performed by an operating
system:
The OS manages all kinds of resources using schedulers.
CPU scheduling algorithms are used for better utilization of the CPU.
7) Protection
The owners of information stored in a multi-user computer system want to control its use. When
several disjoint processes execute concurrently, it should not be possible for one process to
interfere with another. Every process in the computer system must be secured and
controlled.
Process Control
These system calls perform tasks such as process creation, process termination, etc.
Functions:
End and Abort
Load and Execute
Create Process and Terminate Process
Wait and Signal Event
Allocate and free memory
File Management
File management system calls handle file manipulation jobs like creating a file, reading, and
writing, etc.
Functions:
Create a file
Delete file
Open and close file
Read, write, and reposition
Get and set file attributes
Information Maintenance
It handles information and its transfer between the OS and the user program.
Functions:
Get or set time and date
Get process and device attributes
Communication:
These types of system calls are specially used for interprocess communications.
Functions:
Create, delete communications connections
Send, receive message
Help OS to transfer status information
Attach or detach remote devices
fork()
A process uses this system call to create a new process that is a copy of itself. With the help of
this system call the parent process creates a child process; the parent can then use the wait()
system call to suspend itself until the child finishes executing.
kill():
The kill() system call is used to send a signal to a process, typically urging the
process to exit. However, a kill system call does not necessarily mean killing the process; the
signal sent can have various meanings.
exit():
The exit() system call is used to terminate program execution. Especially in a multi-threaded
environment, this call indicates that thread execution is complete. The OS reclaims the resources
that were used by the process after the exit() system call.
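A minimal C sketch (POSIX behaviour assumed here, not taken from the handbook) showing how
fork(), wait() and exit() fit together:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();              /* create a child process */

    if (pid < 0) {
        perror("fork failed");       /* fork() returns -1 on failure */
        exit(1);
    } else if (pid == 0) {
        /* child: runs a copy of the parent's code */
        printf("child: pid = %d\n", (int)getpid());
        exit(0);                     /* terminate; the OS reclaims resources */
    } else {
        /* parent: chooses to wait until the child finishes */
        int status;
        waitpid(pid, &status, 0);
        printf("parent: child %d exited\n", (int)pid);
    }
    return 0;
}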
The operating system can be observed from the point of view of the user or the system. This is
known as the user view and the system view respectively. More details about these are given as
follows:
It acts as an intermediary between hardware and software, providing a high-level interface to low-
level hardware and making it easier for software to access and use those resources.
Firstly, the code required to manage peripheral devices is not standardized; therefore the OS
provides device drivers as subroutines that perform these tasks on behalf of the program.
Secondly, the OS maintains a hardware abstraction to hide the details of communication between
hardware and software.
Thirdly, the OS divides the computer hardware into different sections, each of which can hold a
different program to execute; a process accesses the hardware through this abstraction.
This involves performing such tasks as keeping track of who is using which resource, granting
resource requests, accounting for resource usage, and mediating conflicting requests from
different programs and users
The motivation behind developing distributed operating systems is the availability of powerful
and inexpensive microprocessors and advances in communication technology.
These advancements in technology have made it possible to design and develop distributed
systems comprising many computers interconnected by communication networks.
The main benefit of distributed systems is their low price/performance ratio.
Client-Server Systems
Centralized systems today act as server systems to satisfy requests generated by client
systems. The general structure of a client-server system is depicted in the figure below:
Peer-to-Peer Systems
The growth of computer networks - especially the Internet and World Wide Web (WWW) – has
had a profound influence on the recent development of operating systems. When PCs were
introduced in the 1970s, they were designed for personal use and were generally considered
standalone computers. With the beginning of widespread public use of the Internet in the 1990s
for electronic mail and FTP, many PCs became connected to computer networks.
In contrast to the Tightly Coupled systems, the computer networks used in these applications
consist of a collection of processors that do not share memory or a clock. Instead, each processor
has its own local memory. The processors communicate with one another through various
communication lines, such as high-speed buses or telephone lines. These systems are usually
referred to as loosely coupled systems (or distributed systems). The general structure of such a
distributed system is depicted in the figure below:
This type of operating system does not interact with the computer directly. There is an operator
who takes similar jobs having the same requirements and groups them into batches. It is the
responsibility of the operator to sort the jobs with similar needs.
Advantages of Batch Operating System:
It is very difficult to guess or know the time required for any job to complete; processors
of batch systems know how long a job will take when it is in the queue
Multiple users can share batch systems
The idle time for a batch system is very small
It is easy to manage large work repeatedly in batch systems
Disadvantages of Batch Operating System:
The computer operators should be well acquainted with batch systems
Batch systems are hard to debug
They are sometimes costly
The other jobs will have to wait for an unknown time if any job fails
Examples of Batch based Operating System: Payroll System, Bank Statements etc.
Time-Sharing Operating Systems
Each task is given some time to execute so that all the tasks work smoothly. Each user gets CPU
time as they use a single system. These systems are also known as Multitasking Systems. The
tasks can be from a single user or from different users. The time that each task gets to execute
is called a quantum. After this time interval is over, the OS switches to the next task.
Real-Time Operating Systems
Real-time systems are used when the time requirements are very strict, as in missile
systems, air traffic control systems, robots, etc.
There are two types of Real-Time Operating Systems, as follows:
Hard Real-Time Systems:
These OSs are meant for applications where time constraints are very strict and even
the shortest possible delay is not acceptable. These systems are built for saving life, like
automatic parachutes or air bags, which are required to be readily available in case of an
accident. Virtual memory is almost never found in these systems.
Soft Real-Time Systems:
These OSs are for applications where the time constraint is less strict.
Advantages of RTOS:
Maximum Consumption: Maximum utilization of devices and the system, thus more output
from all the resources
Task Shifting: The time assigned to shifting tasks in these systems is very small. For
example, in older systems it takes about 10 microseconds to shift from one task to another,
while in the latest systems it takes 3 microseconds.
Focus on Application: Focus is on running applications, with less importance given to
applications waiting in the queue.
Real-time operating systems in embedded systems: Since the size of programs is small,
an RTOS can also be used in embedded systems, such as in transport and others.
Error Free: These types of systems are error free.
Memory Allocation: Memory allocation is best managed in these types of systems.
Disadvantages of RTOS:
Limited Tasks: Very few tasks run at the same time, and concentration is kept on very
few applications to avoid errors.
In this, jobs of a similar type are grouped together and treated as a batch. They are
stored on punch cards (stiff paper on which digital data is stored and represented using a
specific sequence of holes) which are submitted to the system for processing. The system
then performs all the required operations in sequence. So, we consider this a type of
serial processing.
Advantages:
1. Suppose a job takes a very long time (a day or so); such processes can then be performed
even in the absence of humans.
2. They don't require any special hardware or system support to input data.
Disadvantages:
1. It is very difficult to debug batch systems.
2. Lack of interaction between user and operating system.
3. Suppose an error occurs in one of the jobs of a batch. Then all the remaining jobs are
affected, i.e., they have to wait until the error is resolved.
7) Single Programming OS
A single programming OS allows only one program to execute at any point in time.
A program must be executed completely before the execution of the next program can begin.
8) Multi User OS
A multi-user operating system (OS) is a computer system that allows multiple users that are on
different computers to access a single system's OS resources simultaneously, as shown in Figure
1. Users on the system are connected through a network. The OS shares resources between users,
depending on what type of resources the users need. The OS must ensure that the system stays
well-balanced in resources to meet each user's needs and not affect other users who are
connected. Some examples of a multi-user OS are Unix, Virtual Memory System (VMS) and
mainframe OS.
Multi-user operating systems were originally used for time-sharing and batch processing on
mainframe computers. These types of systems are still in use today by large companies,
universities, and government agencies, and are usually used in servers, such as the Ubuntu
Server edition (18.04.1 LTS) or Windows Server 2016. The server allows multiple users to
access the same OS and share the hardware and the kernel, performing tasks for each user
concurrently.
9) Multi Programming OS
Sharing the processor, when two or more programs reside in memory at the same time, is
referred to as multiprogramming. Multiprogramming assumes a single shared processor.
Multiprogramming increases CPU utilization by organizing jobs so that the CPU always
has one to execute.
The following figure shows the memory layout for a multiprogramming system.
Advantages
High and efficient CPU utilization.
The user feels that many programs are allotted the CPU almost simultaneously.
Disadvantages
CPU scheduling is required.
To accommodate many jobs in memory, memory management is required.
10) Multitasking
Multitasking is when multiple jobs are executed by the CPU simultaneously by switching
between them. Switches occur so frequently that the users may interact with each program while
it is running. An OS does the following activities related to multitasking −
The user gives instructions to the operating system or to a program directly, and receives
an immediate response.
The OS handles multitasking in such a way that it can handle multiple operations/execute
multiple programs at a time.
Multitasking Operating Systems are also known as Time-sharing systems.
These Operating Systems were developed to provide interactive use of a computer system
at a reasonable cost.
A time-shared operating system uses the concept of CPU scheduling and
multiprogramming to provide each user with a small portion of a time-shared CPU.
Each user has at least one separate program in memory.
11)Multi Processing OS
Multiprocessing refers to the hardware (i.e., the CPU units) rather than the software (i.e., running
processes). If the underlying hardware provides more than one processor then that is
multiprocessing. It is the ability of the system to leverage multiple processors’ computing power.
Difference between Multi programming and Multi processing –
A system can be both multiprogrammed, by having multiple programs running at the
same time, and multiprocessing, by having more than one physical processor. The
difference between multiprocessing and multiprogramming is that multiprocessing is
basically executing multiple processes at the same time on multiple processors, whereas
multiprogramming is keeping several programs in main memory and executing them
concurrently using a single CPU only.
Multiprocessing occurs by means of parallel processing, whereas multiprogramming
occurs by switching from one process to another (a phenomenon called context switching).
Steps to execute a program:
1. Save it.
2. Compile the program, to check for errors and to obtain a machine-understandable program (.exe).
3. Run the program, to obtain the result (output).
Process
When a program is loaded into memory it becomes a process, which can be divided into four sections ─ stack,
heap, text and data. The following image shows a simplified layout of a process inside main memory −
Open files list – This information includes the list of files opened for a process.
The process control block also stores the register contents, known as the execution context of
the processor, saved when the process was blocked from running. This execution context
enables the operating system to restore a process's execution context when the process
returns to the running state. When the process makes a transition from one state to another,
the operating system updates its information in the process's PCB. The operating system
maintains pointers to each process's PCB in a process table so that it can access the PCB
quickly.
Definition:
Process scheduling is the activity of the process manager that handles the removal of
the running process from the CPU and the selection of another process on the basis of a
particular strategy. Process scheduling is an essential part of a multiprogramming operating
system. Such operating systems allow more than one process to be loaded into executable
memory at a time, and the loaded processes share the CPU using time multiplexing.
Scheduling Queues:
Scheduling queues refers to queues of processes or devices. When a process enters the
system, it is put into a job queue. This queue consists of all processes in the system.
The process could issue an I/O request and then be placed in an I/O queue. The process
could create a new sub-process and wait for its termination. The process could be removed
forcibly from the CPU, as a result of an interrupt, and put back in the ready queue.
Context Switch
A context switch is the mechanism to store and restore the state or context of a CPU in Process
Control block so that a process execution can be resumed from the same point at a later time.
Using this technique, a context switcher enables multiple processes to share a single CPU.
Context switching is an essential feature of a multitasking operating system.
When the scheduler switches the CPU from executing one process to executing another, the state
of the currently running process is stored in its process control block. After this, the state of
the process to run next is loaded from its own PCB and used to set the program counter,
registers, etc.
Context switches are computationally intensive, since register and memory state must be saved
and restored. To reduce the amount of context switching time, some hardware systems employ
two or more sets of processor registers. When the process is switched, the following information
is stored for later use.
Program Counter
Scheduling information
Base and limit register value
Currently used register
Changed State
I/O State information
Accounting information
Scheduling Criteria
Scheduling can be defined as a set of policies and mechanisms which controls the order in which
the work to be done is completed. The scheduling program which is a system software concerned
with scheduling is called the scheduler and the algorithm it uses is called the scheduling
algorithm.
Various criteria or characteristics that help in designing a good scheduling algorithm are:
Preemptive Scheduling
In Preemptive Scheduling, tasks are mostly assigned priorities. Sometimes it is
important to run a higher-priority task before a lower-priority task, even if the lower-
priority task is still running. The lower-priority task is put on hold for some time and
resumes when the higher-priority task finishes its execution.
Types of CPU Scheduling Algorithms
There are mainly six types of process scheduling algorithms:
1. First Come First Serve (FCFS)
2. Shortest-Job-First (SJF) Scheduling
3. Shortest Remaining Time
4. Priority Scheduling
5. Round Robin Scheduling
6. Multilevel Queue Scheduling
Threads
What is Thread?
A thread is a flow of execution through the process code, with its own program counter that
keeps track of which instruction to execute next, system registers which hold its current working
variables, and a stack which contains the execution history.
A thread shares with its peer threads some information, such as the code segment, data segment
and open files. When one thread alters a code segment memory item, all other threads see that.
A thread is also called a lightweight process. Threads provide a way to improve application
performance through parallelism. Threads represent a software approach to improving the
performance of the operating system by reducing overhead; in this sense a thread is a
lightweight equivalent of a classical process.
Each thread belongs to exactly one process and no thread can exist outside a process. Each
thread represents a separate flow of control. Threads have been successfully used in
implementing network servers and web servers. They also provide a suitable foundation for
parallel execution of applications on shared memory multiprocessors. The following figure
shows the working of a single-threaded and a multithreaded process.
Types of Thread
Threads are implemented in following two ways −
User Level Threads − User managed threads.
Kernel Level Threads − Operating System managed threads acting on kernel, an
operating system core.
1) User Level Threads
In this case, thread management is done in user space by a thread library; the kernel is not
aware of the existence of the threads.
Advantages
Thread switching does not require Kernel mode privileges.
Disadvantages
In a typical operating system, most system calls are blocking.
Multithreaded application cannot take advantage of multiprocessing.
2) Kernel Level Threads
In this case, thread management is done by the Kernel. There is no thread management code in
the application area. Kernel threads are supported directly by the operating system. Any
application can be programmed to be multithreaded. All of the threads within an application are
supported within a single process.
The Kernel maintains context information for the process as a whole and for individuals threads
within the process. Scheduling by the Kernel is done on a thread basis. The Kernel performs
thread creation, scheduling and management in Kernel space. Kernel threads are generally
slower to create and manage than the user threads.
Advantages
Kernel can simultaneously schedule multiple threads from the same process on multiple
processors.
If one thread in a process is blocked, the Kernel can schedule another thread of the same
process.
Kernel routines themselves can be multithreaded.
Disadvantages
Kernel threads are generally slower to create and manage than the user threads.
Transfer of control from one thread to another within the same process requires a mode
switch to the Kernel.
New − A new thread begins its life cycle in the new state. It remains in this state until the
program starts the thread. It is also referred to as a born thread.
Runnable − After a newly born thread is started, the thread becomes runnable. A thread
in this state is considered to be executing its task.
Waiting − Sometimes, a thread transitions to the waiting state while the thread waits for
another thread to perform a task. A thread transitions back to the runnable state only
when another thread signals the waiting thread to continue executing.
Timed Waiting − A runnable thread can enter the timed waiting state for a specified
interval of time. A thread in this state transitions back to the runnable state when that time
interval expires or when the event it is waiting for occurs.
Terminated (Dead) − A runnable thread enters the terminated state when it completes its
task or otherwise terminates.
Blocked State: The thread is waiting for an event to occur or waiting for an I/O device.
Sleep: A sleeping thread becomes ready after the designated sleep time expires.
In this example, Process 1 performs a bit flip, changing the memory value from 0 to 1. Process 2
then performs a bit flip and changes the memory value from 1 to 0.
If a race condition occurred causing these two processes to overlap, the sequence could
potentially look more like this:
In this example, the bit has an ending value of 1 when its value should be 0. This occurs because
Process 2 is unaware that Process 1 is performing a simultaneous bit flip.
If the order of execution is changed, then the result is also changed, and this generates
inconsistency.
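The inconsistency can be reproduced with two threads updating shared memory without
synchronization. A small illustrative C sketch (POSIX threads; the variable names and loop
counts are made up for the demonstration) — because the read-modify-write is not atomic,
the final value is unpredictable:

#include <pthread.h>
#include <stdio.h>

long counter = 0;                    /* shared memory value */

void *worker(void *arg)
{
    for (int i = 0; i < 1000000; i++)
        counter++;                   /* read-modify-write: not atomic */
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    /* expected 2000000, but the two threads' updates interleave */
    printf("counter = %ld\n", counter);
    return 0;
}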
1. Mutual Exclusion
Out of a group of cooperating processes, only one process can be in its critical section at a given
point of time.
2. Progress
Progress is satisfied if there is no strict alternation (processes are not forced to take turns)
and no deadlock.
3. Bounded Waiting
After a process makes a request for getting into its critical section, there is a limit for how many
other processes can get into their critical section, before this process's request is granted. So after
the limit is reached, system must grant the process permission to get into its critical section.
The turn variable is a synchronization mechanism that provides synchronization between two processes.
It uses a shared turn variable to provide the synchronization; a runnable sketch is given after the scenes below.
Working-
Scene-01:
Process P0 arrives.
It executes the turn != 0 instruction.
Since the turn value is set to 0, the condition returns value 0 to the while loop.
The while loop condition breaks.
Process P0 enters the critical section and executes.
Now, even if process P0 gets preempted in the middle, process P1 can not enter the critical
section.
Process P1 can not enter unless process P0 completes and sets the turn value to 1.
Scene-02:
Process P1 arrives.
It executes the turn != 1 instruction.
Since the turn value is set to 0, the condition returns value 1 to the while loop.
The returned value 1 does not break the while loop condition.
The process P1 is trapped inside an infinite while loop.
The while loop keeps the process P1 busy until the turn value becomes 1 and its condition
breaks.
Scene-03:
Process P0 comes out of the critical section and sets the turn value to 1.
The while loop condition of process P1 breaks.
Now, the process P1 waiting for the critical section enters the critical section and execute.
Now, even if process P1 gets preempted in the middle, process P0 can not enter the
critical section.
Process P0 can not enter unless process P1 completes and sets the turn value to 0.
Drawback-
Processes have to compulsorily enter the critical section alternately, whether they want to
or not.
This is because if one process does not enter the critical section, the other process will
never get a chance to execute again.
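A runnable C sketch of the turn-variable mechanism described above (POSIX threads; illustrative
only — real code on modern hardware would need atomic operations or memory barriers):

#include <pthread.h>
#include <stdio.h>

volatile int turn = 0;               /* shared turn variable */
int counter = 0;                     /* protected by strict alternation */

void *proc(void *arg)
{
    int i = *(int *)arg;             /* this process's id: 0 or 1 */
    for (int k = 0; k < 100000; k++) {
        while (turn != i)
            ;                        /* busy wait until it is Pi's turn */
        counter++;                   /* critical section */
        turn = 1 - i;                /* hand the turn to the other process */
    }
    return NULL;
}

int main(void)
{
    pthread_t t0, t1;
    int id0 = 0, id1 = 1;
    pthread_create(&t0, NULL, proc, &id0);
    pthread_create(&t1, NULL, proc, &id1);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    printf("counter = %d (expected 200000)\n", counter);
    return 0;
}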
Peterson's Algorithm
The algorithm uses two variables, flag and turn. A flag[n] value of true indicates that
process n wants to enter the critical section. Entrance to the critical section is granted for
process P0 if P1 does not want to enter its critical section or if P1 has given priority to P0 by
setting turn to 0. A C sketch is given below.
The algorithm satisfies the three essential criteria to solve the critical section problem, provided
that changes to the variables turn, flag[0], and flag[1] propagate immediately and atomically. The
while condition works even with preemption.
The three criteria are mutual exclusion, progress, and bounded waiting.
Since turn can take on one of two values, it can be replaced by a single bit, meaning that the
algorithm requires only three bits of memory.
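A C sketch of the algorithm as described above (flag[] and turn are the shared variables; the
busy-wait loop corresponds to the P1_gate label mentioned below; this relies on the immediate,
atomic propagation of writes and is illustrative only):

volatile int flag[2] = {0, 0};       /* flag[n]: process n wants to enter */
volatile int turn = 0;               /* which process has to wait */

void enter_critical_section(int i)   /* i is 0 or 1 */
{
    int j = 1 - i;                   /* the other process */
    flag[i] = 1;                     /* announce intent to enter */
    turn = j;                        /* give priority to the other process */
    while (flag[j] && turn == j)
        ;                            /* busy wait while the other process
                                        wants in and has priority */
}

void leave_critical_section(int i)
{
    flag[i] = 0;                     /* no longer interested */
}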
Mutual exclusion
P0 and P1 can never be in the critical section at the same time: If P0 is in its critical section, then
flag[0] is true. In addition, either flag[1] is false (meaning P1 has left its critical section), or turn
is 0 (meaning P1 is just now trying to enter the critical section, but graciously waiting), or P1 is
at label P1_gate (trying to enter its critical section, after setting flag[1] to true but before setting
turn to 0 and busy waiting). So if both processes are in their critical sections then we conclude
that the state must satisfy flag[0] and flag[1] and turn = 0 and turn = 1. No state can satisfy both
turn = 0 and turn = 1, so there can be no state where both processes are in their critical sections.
(This recounts an argument that can be made rigorous.)
Bounded waiting
Bounded waiting, or bounded bypass, means that the number of times a process is bypassed by
another process after it has indicated its desire to enter the critical section is bounded by a
function of the number of processes in the system. In Peterson's algorithm, a process will
never wait longer than one turn for entrance to the critical section.
Semaphore
Semaphores are integer variables that are used to solve the critical section problem by using two
atomic operations, wait and signal that are used for process synchronization.
The definitions of wait and signal are as follows −
Wait
The wait operation decrements the value of its argument S if it is positive. If S is zero
or negative, the process performing the wait is made to wait until S becomes positive.
Signal
The signal operation increments the value of its argument S.
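In the classical busy-wait form the two operations can be sketched as follows (each must
execute atomically; S is the semaphore's integer value):

/* both operations must execute atomically */
void wait(int *S)                    /* also called P() or down() */
{
    while (*S <= 0)
        ;                            /* wait until S becomes positive */
    (*S)--;                          /* take one unit of the resource */
}

void signal(int *S)                  /* also called V() or up() */
{
    (*S)++;                          /* release one unit of the resource */
}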
Types of Semaphores
There are two main types of semaphores, i.e. counting semaphores and binary semaphores:
A counting semaphore can take any non-negative integer value and is used to coordinate
access to a resource that has multiple instances.
A binary semaphore is restricted to the values 0 and 1 and behaves like a mutex lock.
Advantages of Semaphores
Some of the advantages of semaphores are as follows:
Semaphores allow only one process into the critical section. They follow the mutual
exclusion principle strictly and are much more efficient than some other methods of
synchronization.
There is no resource wastage because of busy waiting in semaphores as processor time is
not wasted unnecessarily to check if a condition is fulfilled to allow a process to access
the critical section.
Semaphores are implemented in the machine independent code of the microkernel. So
they are machine independent.
Disadvantages of Semaphores
Some of the disadvantages of semaphores are as follows −
Semaphores are complicated so the wait and signal operations must be implemented in
the correct order to prevent deadlocks.
Semaphores are impractical for large scale use as their use leads to loss of modularity. This
happens because the wait and signal operations prevent the creation of a structured layout
for the system.
Semaphores may lead to a priority inversion where low priority processes may access the
critical section first and high priority processes later.
Readers-Writers Problem
Reader Process
The code that defines the reader process is given below:
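The code itself is missing from this copy; the following is a standard sketch, consistent with
the explanation below (mutex and wrt initialized to 1, rc to 0):

// reader process
while (true) {
    wait(mutex);                     /* protect the reader count rc */
    rc = rc + 1;
    if (rc == 1)
        wait(wrt);                   /* first reader locks out writers */
    signal(mutex);

    /* ... reading is performed ... */

    wait(mutex);
    rc = rc - 1;
    if (rc == 0)
        signal(wrt);                 /* last reader lets writers in */
    signal(mutex);
}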
In the above code, mutex and wrt are semaphores that are initialized to 1. Also, rc is a variable
that is initialized to 0. The mutex semaphore ensures mutual exclusion, and wrt handles the
writing mechanism; it is common to the reader and writer process code.
The variable rc denotes the number of readers accessing the object. As soon as rc becomes 1,
the wait operation is used on wrt. This means that a writer cannot access the object anymore. After
the read operation is done, rc is decremented. When rc becomes 0, the signal operation is used on
wrt, so a writer can access the object again.
Writer Process
The code that defines the writer process is given below:
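The code is likewise missing here; a standard sketch consistent with the explanation that follows:

// writer process
while (true) {
    wait(wrt);                       /* gain exclusive access to the object */

    /* ... writing is performed ... */

    signal(wrt);                     /* release the object */
}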
If a writer wants to access the object, wait operation is performed on wrt. After that no other
writer can access the object. When a writer is done writing into the object, signal operation is
performed on wrt.
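Producer-Consumer (Bounded Buffer) Problem
The next paragraph refers to the producer code, which is missing from this copy; a standard
bounded-buffer producer sketch consistent with that explanation (mutex = 1, empty = n, full = 0):

// producer process
while (true) {
    /* ... produce an item ... */

    wait(empty);                     /* one fewer empty slot in the buffer */
    wait(mutex);                     /* enter the critical section */

    /* ... put the item into the buffer ... */

    signal(mutex);                   /* leave the critical section */
    signal(full);                    /* one more full slot in the buffer */
}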
In the above code, mutex, empty and full are semaphores. Here mutex is initialized to 1, empty is
initialized to n (the maximum size of the buffer) and full is initialized to 0. The mutex semaphore
ensures mutual exclusion. The empty and full semaphores count the number of empty and full
spaces in the buffer.
Before an item is produced, the wait operation is carried out on empty. This indicates that the
number of empty spaces in the buffer has decreased by 1. Then the wait operation is carried out
on mutex so that the consumer process cannot interfere.
After the item is put in the buffer, the signal operation is carried out on mutex and full. The
former indicates that the consumer process can now act, and the latter shows that the buffer has
one more full slot.
Consumer Process
The code that defines the consumer process is given below:
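The consumer code is also missing here; a standard sketch mirroring the producer:

// consumer process
while (true) {
    wait(full);                      /* wait until the buffer has an item */
    wait(mutex);                     /* enter the critical section */

    /* ... remove an item from the buffer ... */

    signal(mutex);                   /* leave the critical section */
    signal(empty);                   /* one more empty slot in the buffer */

    /* ... consume the item ... */
}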
Dining Philosophers Problem
Initially the elements of the chopstick array are initialized to 1, as the chopsticks are on the
table and not picked up by any philosopher.
The structure of a random philosopher i is given as follows:
do {
    wait( chopstick[i] );
    wait( chopstick[ (i+1) % 5] );

    /* ... eating the rice ... */

    signal( chopstick[i] );
    signal( chopstick[ (i+1) % 5] );

    /* ... thinking ... */

} while(1);
In the above structure, the wait operation is first performed on chopstick[i] and chopstick[(i+1) %
5]. This means that philosopher i has picked up the chopsticks on his sides. Then the eating
function is performed.
After that, signal operation is performed on chopstick[i] and chopstick[ (i+1) % 5]. This means
that the philosopher i has eaten and put down the chopsticks on his sides. Then the philosopher
goes back to thinking.
monitor MonitorName
{
// shared variable declarations
Procedure P1(....)
{
}
Procedure P2(....)
{
}
...
Procedure Pn(....)
{
}
Initialization Code(....)
{
}
}
Only one process can be active in a monitor at a time. Other processes that need to access the
shared variables in a monitor have to line up in a queue and are only granted access when the
previous process releases the shared variables.
monitor ProducerConsumer
{
int itemCount = 0;
procedure add(item)
{
if (itemCount == BUFFER_SIZE)
{
wait(full);
}
putItemIntoBuffer(item);
itemCount = itemCount + 1;
if (itemCount == 1)
{
notify(empty);
}
}
procedure remove()
{
if (itemCount == 0)
{
wait(empty);
}
item = removeItemFromBuffer();
itemCount = itemCount - 1;
if (itemCount == BUFFER_SIZE - 1)
{
notify(full);
}
return item;
}
}
procedure producer()
{
while (true)
{
item = produceItem();
ProducerConsumer.add(item);
}
}
procedure consumer()
{
while (true)
{
item = ProducerConsumer.remove();
consumeItem(item);
}
}
// Pickup chopsticks
Pickup(int i)
{
// indicate that I'm hungry
state[i] = hungry;
// try to acquire the two chopsticks
test(i);
// if unable to eat, wait to be signalled
if (state[i] != eating)
self[i].wait();
}
test(int i)
{
if (state[(i + 1) % 5] != eating
&& state[(i + 4) % 5] != eating
&& state[i] == hungry) {
state[i] = eating;
self[i].signal();
}
}
init()
{
for (int i = 0; i < 5; i++)
state[i] = thinking;
}
This allows philosopher i to delay herself when she is hungry but is unable to obtain the
chopsticks she needs. We are now in a position to describe our solution to the dining-
philosophers problem. The distribution of the chopsticks is controlled by the monitor Dining
Philosophers. Each philosopher, before starting to eat, must invoke the operation pickup(). This
act may result in the suspension of the philosopher process. After the successful completion of
the operation, the philosopher may eat. Following this, the philosopher invokes the putdown()
operation. Thus, philosopher i must invoke the operations pickup() and putdown() in the
following sequence:
DiningPhilosophers.pickup(i);
...
eat
...
DiningPhilosophers.putdown(i);
It is easy to show that this solution ensures that no two neighbors are eating simultaneously and
that no deadlocks will occur. We note, however, that it is possible for a philosopher to starve to
death.
Consider an example where two trains are coming toward each other on the same track
and there is only one track: neither train can move once they are in front of
each other. A similar situation occurs in operating systems when two or
more processes hold some resources and wait for resources held by the other(s). For
example, in the diagram below, Process 1 is holding Resource 1 and waiting for
Resource 2, which is acquired by Process 2, and Process 2 is waiting for Resource 1.
Deadlock can arise only if the following four necessary conditions hold simultaneously:
1. Mutual exclusion: at least one resource must be held in a non-shareable mode; only one
process can use it at a time.
2. Hold and wait or resource holding: a process is currently holding at least one resource
and requesting additional resources which are being held by other processes.
3. No preemption: a resource can be released only voluntarily by the process holding it.
4. Circular wait: a set of waiting processes exists such that each is waiting for a resource
held by the next, forming a cycle.
Deadlock Prevention-
This strategy involves designing a system that violates one of the four necessary conditions
required for the occurrence of deadlock.
This ensures that the system remains free from deadlock.
1. Mutual Exclusion-
To violate this condition, all the system resources must be such that they can be used in a
shareable mode.
In a system, there are always some resources which are mutually exclusive by nature.
So, this condition can not be violated.
2. Hold and Wait-
Approach-01:
In this approach,
A process has to first request for all the resources it requires for execution.
Once it has acquired all the resources, only then it can start its execution.
This approach ensures that the process does not hold some resources and wait for other
resources.
Drawbacks-
The drawbacks of this approach are-
It is less efficient.
It is not implementable since it is not possible to predict in advance which resources will
be required during execution.
Approach-03:
In this approach,
A timer is set after the process acquires any resource.
After the timer expires, a process has to compulsorily release the resource.
3. No Preemption-
This condition can be violated by preempting resources: if a process holding some resources
requests another resource that cannot be immediately allocated, all resources it currently
holds may be released.
4. Circular Wait-
This condition can be violated by not allowing the processes to wait for resources in a
cyclic manner.
To violate this condition, the following approach is followed-
Approach-
A natural number is assigned to every resource.
Each process is allowed to request for the resources either in only increasing or only
decreasing order of the resource number.
In case increasing order is followed, if a process requires a lower-numbered resource, it
must first release all the resources it holds with larger numbers, and vice versa.
This approach is the most practical and implementable one.
However, this approach may cause starvation, but it will never lead to deadlock.
Deadlock Avoidance-
This strategy involves maintaining a set of data using which a decision is made whether
to entertain a new request or not.
If entertaining the new request causes the system to move into an unsafe state, then it is
discarded.
discarded.
This strategy requires that every process declares its maximum requirement of each
resource type in the beginning.
The main challenge with this approach is predicting the requirement of the processes
before execution.
Banker’s Algorithm is an example of a deadlock avoidance strategy.
Banker’s Algorithms
Following Data structures are used to implement the Banker’s Algorithm:
Let ‘n’ be the number of processes in the system and ‘m’ be the number of resources types.
Available:
It is a 1-d array of size 'm' indicating the number of available resources of each type.
Available[ j ] = k means there are 'k' instances of resource type Rj available.
Max :
It is a 2-d array of size ‘n*m’ that defines the maximum demand of each process in a
system.
Max[ i, j ] = k means process Pi may request at most ‘k’ instances of resource type Rj.
Allocation :
It is a 2-d array of size ‘n*m’ that defines the number of resources of each type currently
allocated to each process.
Allocation[ i, j ] = k means process Pi is currently allocated ‘k’ instances of resource type
Rj
Need :
It is a 2-d array of size ‘n*m’ that indicates the remaining resource need of each process.
Need [ i, j ] = k means process Pi currently needs 'k' instances of resource type Rj
for its execution.
Need [ i, j ] = Max [ i, j ] – Allocation [ i, j ]
The algorithm for finding out whether or not a system is in a safe state can be described as
follows
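The step-by-step listing is missing from this copy; a C sketch of the standard safety algorithm,
using the data structures defined above (the array sizes N and M are illustrative assumptions):

#include <stdbool.h>

#define N 5                          /* number of processes (assumption) */
#define M 3                          /* number of resource types (assumption) */

/* returns true and fills safe_seq if the system is in a safe state */
bool is_safe(int available[M], int max[N][M], int alloc[N][M], int safe_seq[N])
{
    int need[N][M], work[M];
    bool finish[N] = { false };

    for (int i = 0; i < N; i++)      /* Need = Max - Allocation */
        for (int j = 0; j < M; j++)
            need[i][j] = max[i][j] - alloc[i][j];
    for (int j = 0; j < M; j++)      /* Work = Available */
        work[j] = available[j];

    for (int count = 0; count < N; count++) {
        int i, j;
        for (i = 0; i < N; i++) {    /* find an unfinished process ... */
            if (finish[i])
                continue;
            for (j = 0; j < M; j++)  /* ... whose need fits within work */
                if (need[i][j] > work[j])
                    break;
            if (j == M)
                break;
        }
        if (i == N)
            return false;            /* no such process: state is unsafe */
        for (j = 0; j < M; j++)
            work[j] += alloc[i][j];  /* Pi finishes and releases resources */
        finish[i] = true;
        safe_seq[count] = i;
    }
    return true;                     /* all processes can finish */
}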
Resource-Request Algorithm
Let Requesti be the request array for process Pi. Requesti [j] = k means process Pi wants k
instances of resource type Rj. When a request for resources is made by process Pi, the following
actions are taken:
Question 2. Is the system in a safe state? If yes, then what is the safe sequence?
Applying the Safety algorithm on the given system,
Hence the new system state is safe, so we can immediately grant the request for process P1 .
Steps of Algorithm:
1. Let Work and Finish be vectors of length m and n respectively. Initialize Work=
Available. For i=0, 1, …., n-1, if Requesti = 0, then Finish[i] = true; otherwise,
Finish[i]= false.
2. Find an index i such that both
a) Finish[i] == false
b) Requesti <= Work
If no such i exists, go to step 4.
3. Work = Work + Allocationi
Finish[i] = true
Go to Step 2.
4. If Finish[i] == false for some i, 0 <= i < n, then the system is in a deadlocked state;
moreover, process Pi is deadlocked.
Deadlock Recovery
When a Deadlock Detection Algorithm determines that a deadlock has occurred in the system,
the system must recover from that deadlock. There are two approaches of breaking a Deadlock:
1. Process Termination:
To eliminate the deadlock, we can simply kill one or more processes. For this, we use two
methods:
(a). Abort all deadlocked processes:
Aborting all the processes will certainly break the deadlock, but at great expense.
The deadlocked processes may have computed for a long time, and the results of those
partial computations must be discarded and may have to be recomputed later.
(b). Abort one process at a time until deadlock is eliminated:
Abort one deadlocked process at a time, until the deadlock cycle is eliminated from the
system. This method incurs considerable overhead, because after aborting each
process we have to run the deadlock detection algorithm to check whether any
processes are still deadlocked.
2. Resource Preemption:
To eliminate deadlocks using resource preemption, we preempt some resources from processes
and give those resources to other processes. This method raises three issues –
(a). Selecting a victim:
We must determine which resources and which processes are to be preempted and also
the order to minimize the cost.
(b). Rollback:
We must determine what should be done with the process from which resources are
preempted. One simple idea is total rollback. That means abort the process and restart it.
(c). Starvation:
In a system, it may happen that the same process is always picked as a victim. As a result,
that process will never complete its designated task. This situation is called starvation
and must be avoided. One solution is that a process may be picked as a victim only a
finite number of times.
The resource allocation graph (RAG) shows us the state of the system in terms of
processes and resources: how many resources are available, how many are allocated and
what the request of each process is. Everything can be represented in terms of a diagram. One
of the advantages of having a diagram is that sometimes it is possible to see a deadlock directly by
using the RAG, which you might not be able to see by looking at a table. But tables
are better if the system contains many processes and resources, while a graph is better if the
system contains fewer processes and resources.
We know that any graph contains vertices and edges. So RAG also contains vertices and edges.
In RAG vertices are two type –
1. Process vertex – Every process will be represented as a process vertex. Generally, the process
will be represented with a circle.
2. Resource vertex – Every resource will be represented as a resource vertex. It is of two types:
Single instance type resource – It is represented as a box with a single dot inside;
the number of dots indicates how many instances of the resource type are present.
Multi-instance type resource – It is also represented as a box, with many dots
inside.
Now coming to the edges of the RAG. There are two types of edges in a RAG –
1. Assign Edge – If a resource is already assigned to a process, it is represented by an assign edge.
2. Request Edge – It means that in the future the process might want some resource to complete its
execution; that is called a request edge.
So, if a process is using a resource, an arrow is drawn from the resource node to the process
node. If a process is requesting a resource, an arrow is drawn from the process node to the
resource node.
Example 1 (Single instances RAG) –
If there is a cycle in the Resource Allocation Graph and each resource in the cycle provides only
one instance, then the processes will be in deadlock. For example, if process P1 holds resource
R1, process P2 holds resource R2 and process P1 is waiting for R2 and process P2 is waiting for
R1, then process P1 and process P2 will be in deadlock.
Here’s another example, that shows Processes P1 and P2 acquiring resources R1 and R2 while
process P3 is waiting to acquire both resources. In this example, there is no deadlock because
there is no circular dependency.
So a cycle in a single-instance resource type graph is a sufficient condition for deadlock.
Example 2 (Multi-instances RAG) –
The total number of processes is three: P1, P2 and P3; the total number of resources
is two: R1 and R2.
Allocation matrix –
For constructing the allocation matrix, just go to each resource and see to which process it
is allocated.
So, there is no deadlock in this RAG. Even though there is a cycle, still there is no
deadlock. Therefore, in a multi-instance resource graph, a cycle is not a sufficient
condition for deadlock.
The above example is the same as the previous example except that process P3 is
requesting resource R1.
So, the available resources are (0, 0), but the requirements are (0, 1), (1, 0) and (1, 0). So no
requirement can be fulfilled. Therefore, the system is in deadlock.
Therefore, not every cycle in a multi-instance resource type graph is a deadlock, but if there
has to be a deadlock, there has to be a cycle. So, in the case of a RAG with multi-instance
resource types, a cycle is a necessary condition for deadlock, but not a sufficient one.
A Logical Address is generated by the CPU while a program is running. The logical address is a
virtual address, as it does not exist physically; therefore, it is also known as a Virtual Address.
This address is used by the CPU as a reference to access the physical memory location. The term
Logical Address Space is used for the set of all logical addresses generated from a program's
perspective. The hardware device called the Memory-Management Unit (MMU) is used for
mapping a logical address to its corresponding physical address.
A Physical Address identifies a physical location of required data in memory. The user never
directly deals with the physical address but can access it via its corresponding logical address.
The user program generates logical addresses and thinks that the program is running in this
logical address space, but the program needs physical memory for its execution; therefore, the
logical address must be mapped to a physical address by the MMU before it is used. The term
Physical Address Space is used for all physical addresses corresponding to the logical addresses
in a logical address space.
For Example
1. The CPU generates a logical address, e.g. 346.
2. The MMU adds the value of the relocation register (base register), e.g. 14000.
3. The physical memory address accessed is 346 + 14000 = 14346.
The instruction-execution cycle follows these steps:
1. First, an instruction is fetched from memory, e.g. ADD A,B.
2. Then the instruction is decoded, i.e., addition of A and B.
3. Further loading or storing at some particular memory location takes place.
Basic Hardware
Main memory and the registers built into the processor are the only storage the CPU can access
directly, so every instruction being executed must reside in one of these direct-access storage
devices.
1. If the CPU accesses an instruction from a register, it can be done in one CPU clock cycle,
as registers are built into the CPU.
2. If the instruction resides in main memory, it will be accessed via the memory bus, which
takes a lot of time. The remedy is to add fast memory between the CPU and main memory,
i.e., a cache.
3. Now we should ensure that each process resides in its legal address range.
4. The legal address range is defined by the base register (which holds the smallest legal
physical address) and the limit register (which holds the size of the range).
For example:
Base register = 300040
Limit register = 120900
Then the legal addresses run from 300040 up to 300040 + 120900 = 420940 (exclusive),
i.e. 300040 through 420939.
Legal range: base register <= address < base register + limit register
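The hardware check amounts to a simple comparison; a one-line C sketch (illustrative):

/* returns 1 if the physical address is legal for this process, else 0 */
int is_legal(unsigned addr, unsigned base, unsigned limit)
{
    return addr >= base && addr < base + limit;   /* otherwise trap to OS */
}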
Memory is a large array of bytes, where each byte has its own address. Memory allocation
can be classified into two methods: contiguous memory allocation and non-contiguous memory
allocation. The major difference between contiguous and non-contiguous memory allocation is
that contiguous memory allocation assigns consecutive blocks of memory to a process
requesting memory, whereas non-contiguous memory allocation assigns separate memory
blocks at different locations in memory space, in a non-consecutive manner, to a process
requesting memory. Some more differences between contiguous and non-contiguous memory
allocation are given in the comparison chart below.
Contiguous memory allocation can be achieved by dividing memory into fixed-sized
partitions and allocating each partition to a single process only. But this bounds the degree of
multiprogramming to the number of fixed partitions made in the memory. Contiguous
memory allocation also leads to internal fragmentation: if a fixed-sized memory block allocated
to a process is slightly larger than its requirement, then the leftover memory space in the block
is called internal fragmentation. When the process residing in the partition terminates, the
partition becomes available for another process.
Answer:
Paging is a memory management scheme that eliminates the need for contiguous allocation of
physical memory. This scheme permits the physical address space of a process to be
non-contiguous.
Logical Address or Virtual Address (represented in bits): An address generated by the
CPU
Logical Address Space or Virtual Address Space( represented in words or bytes): The set
of all logical addresses generated by a program
Physical Address (represented in bits): An address actually available on memory unit
Physical Address Space (represented in words or bytes): The set of all physical addresses
corresponding to the logical addresses
Example:
The Physical Address Space is conceptually divided into a number of fixed-size blocks,
called frames.
The Logical Address Space is also split into fixed-size blocks, called pages.
Page Size = Frame Size
Let us consider an example:
Physical Address = 12 bits, then Physical Address Space = 4 K words
Logical Address = 13 bits, then Logical Address Space = 8 K words
Page size = frame size = 1 K words (assumption)
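A short worked continuation of this example: the logical address space holds 8K / 1K = 8 pages,
so 3 bits of the 13-bit logical address select the page and the remaining 10 bits are the offset
within the page; the physical address space holds 4K / 1K = 4 frames, so 2 bits of the 12-bit
physical address select the frame and 10 bits are the offset.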
Answer:
The hardware implementation of the page table can be done by using dedicated registers. But the
use of registers for the page table is satisfactory only if the page table is small. If the page table
contains a large number of entries, then we can use a TLB (Translation Look-aside Buffer), a
special, small, fast-lookup hardware cache.
The TLB is associative, high speed memory.
Each entry in TLB consists of two parts: a tag and a value.
When this memory is used, an item is compared with all tags simultaneously. If the
item is found, the corresponding value is returned.
A demand paging system is quite similar to a paging system with swapping, where processes
reside in secondary memory and pages are loaded only on demand, not in advance. When a
context switch occurs, the operating system does not copy any of the old program's pages out to
disk or any of the new program's pages into main memory; instead, it just begins executing the
new program and fetches that program's pages as they are referenced.
While executing a program, if the program references a page which is not available in the main
memory because it was swapped out a little while ago, the processor treats this invalid memory
reference as a page fault and transfers control from the program to the operating system to
demand the page back into memory.
Advantages
Following are the advantages of Demand Paging −
Large virtual memory.
More efficient use of memory.
There is no limit on degree of multiprogramming.
Disadvantages
Number of tables and the amount of processor overhead for handling page interrupts are
greater than in the case of the simple paged management techniques.
1. If the CPU tries to refer to a page that is currently not available in main memory, it
generates an interrupt indicating a memory access fault.
2. The OS puts the interrupted process in a blocked state. For the execution to proceed, the
OS must bring the required page into memory.
3. The OS will search for the required page in secondary storage (the logical address space).
4. The required page will be brought from the logical address space into the physical address
space. Page replacement algorithms are used to decide which page to replace in the
physical address space.
5. The page table will be updated accordingly.
LRU (Least Recently Used) is simple to implement: keep a list, and replace pages by looking
back into time.
Draw the block diagram for DMA. Write steps for DMA data transfer.
A device controller need not necessarily control a single device. It can usually control multiple
I/O devices. It comes in the form of an electronic circuit board that plugs directly into the system
bus, and there is a cable from the controller to each device it controls. The cables coming out of
the controller are usually terminated at the back panel of the main computer box in the form of
connectors known as ports.
The figure below illustrates how I/O devices are connected to a computer system through device
controllers. Please note the following points in the diagram:
• Each I/O device is linked through a hardware interface called an I/O port.
• Single- and multi-port device controllers control single or multiple devices respectively.
• The communication between the I/O controller and memory goes directly over the bus in
the case of Direct Memory Access (DMA), whereas the path passes through the CPU for
such communication in the non-DMA case.
Using device controllers for connecting I/O devices to a computer system instead of connecting
them directly to the system bus has the following advantages:
A device controller can be shared among multiple I/O devices allowing many I/O
devices to be connected to the system.
I/O devices can be easily upgraded or changed without any change in the computer
system.
RAID 0 is unable to tolerate any disk failure, but RAID 1 provides reliability through mirroring.
Evaluation:
Assume a RAID system with mirroring level 2.
Reliability: 1 to N/2
One disk failure can be handled for certain, because the blocks of that disk have
duplicates on some other disk. If we are lucky and, say, disks 0 and 2 fail, then again
this can be handled, as the blocks of these disks have duplicates on disks 1 and 3. So, in
the best case, N/2 disk failures can be handled.
Capacity: N*B/2
Only half the space is used to store data; the other half is a mirror of the already
stored data.
RAID-4 (Block-Level Striping with Dedicated Parity)
Instead of duplicating data, this adopts a parity-based approach.
Assume that in the above figure, C3 is lost due to some disk failure. Then, we can
recompute the data bit stored in C3 by looking at the values of all the other columns and
the parity bit. This allows us to recover lost data.
Evaluation:
Reliability: 1
RAID-4 allows recovery of at most one disk failure (because of the way parity works). If
more than one disk fails, there is no way to recover the data.
Capacity: (N-1)*B
One disk in the system is reserved for storing the parity. Hence, (N-1) disks are made
available for data storage, each disk having B blocks.
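The parity mechanism can be demonstrated with XOR; a small C sketch showing a lost block
being rebuilt from the surviving blocks and the parity block (the block contents are made-up
values):

#include <stdio.h>

int main(void)
{
    unsigned char c0 = 0xA5, c1 = 0x3C, c2 = 0x0F, c3 = 0x77;

    /* the dedicated parity disk stores the XOR of all data blocks */
    unsigned char parity = c0 ^ c1 ^ c2 ^ c3;

    /* suppose the disk holding c3 fails: rebuild its contents by
       XOR-ing the surviving data blocks with the parity block */
    unsigned char rebuilt = c0 ^ c1 ^ c2 ^ parity;

    printf("lost = 0x%02X, rebuilt = 0x%02X\n", c3, rebuilt);  /* identical */
    return 0;
}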
One of the important jobs of an Operating System is to manage various I/O devices including
mouse, keyboards, touch pad, disk drives, display adapters, USB devices, Bit-mapped screen,
LED, Analog-to-digital converter, On/off switch, network connections, audio I/O, printers etc.
An I/O system is required to take an application I/O request and send it to the physical device,
then take whatever response comes back from the device and send it to the application. I/O
devices can be divided into two categories −
Block devices − A block device is one with which the driver communicates by sending
entire blocks of data. For example, hard disks, USB cameras, Disk-On-Key, etc.
Character devices − A character device is one with which the driver communicates by
sending and receiving single characters (bytes, octets). For example, serial ports, parallel
ports, sound cards, etc.
Memory-Mapped I/O
When using memory-mapped I/O, the OS allocates a buffer in memory and informs the I/O
device to use that buffer to send data to the CPU. The I/O device operates asynchronously with
the CPU, interrupting the CPU when finished.
The advantage of this method is that every instruction which can access memory can be used to
manipulate an I/O device. Memory-mapped I/O is used for most high-speed I/O devices like
disks and communication interfaces.
Polling I/O
Polling is the simplest way for an I/O device to communicate with the processor. The process of
periodically checking status of the device to see if it is time for the next I/O operation, is called
polling. The I/O device simply puts the information in a Status register, and the processor must
come and get the information.
Interrupts I/O
An alternative scheme for dealing with I/O is the interrupt-driven method. An interrupt is a
signal to the microprocessor from a device that requires attention.
A device controller puts an interrupt signal on the bus when it needs the CPU's attention. When
the CPU receives an interrupt, it saves its current state and invokes the appropriate interrupt
handler using the interrupt vector (the addresses of OS routines that handle various events).
When the interrupting device has been dealt with, the CPU continues with its original task as if
it had never been interrupted.
User Level Libraries − This provides simple interface to the user program to perform
input and output. For example, stdio is a library provided by C and C++ programming
languages.
Kernel Level Modules − This provides device driver to interact with the device
controller and device independent I/O modules used by the device drivers.
Hardware − This layer includes actual hardware and hardware controller which interact
with the device drivers and makes hardware alive.
A key concept in the design of I/O software is that it should be device independent where it
should be possible to write programs that can access any I/O device without having to specify the
device in advance. For example, a program that reads a file as input should be able to read a file
on a floppy disk, on a hard disk, or on a CD-ROM, without having to modify the program for
each different device.
Interrupt handlers
An interrupt handler, also known as an interrupt service routine or ISR, is a piece of software,
more specifically a callback function in an operating system or in a device driver, whose
execution is triggered by the reception of an interrupt.
Disk Response Time: Response time is the average time spent by a request waiting to
perform its I/O operation. Average response time is the response time of all requests.
Variance response time is a measure of how individual requests are serviced with respect to the
average response time. So the disk scheduling algorithm that gives the minimum variance
response time is better.
Output:
Total number of seek operations = 510
Seek Sequence is
176
79
34
60
92
11
41
114
Given an array of disk track numbers and the initial head position, our task is to find the total
number of seek operations done to access all the requested tracks if the Shortest Seek Time First
(SSTF) disk scheduling algorithm is used.
Shortest Seek Time First (SSTF) –
Basic idea is the tracks which are closer to current disk head position should be serviced first in
order to minimise the seek operations.
Algorithm –
1. Let Request array represents an array storing indexes of tracks that have been requested.
‘head’ is the position of disk head.
2. Find the positive distance of all tracks in the request array from head.
3. Find a track from requested array which has not been accessed/serviced yet and has
minimum distance from head.
4. Increment the total seek count with this distance.
5. Currently serviced track position now becomes the new head position.
6. Go to step 2 until all tracks in the request array have been serviced.
Example –
Request sequence = {176, 79, 34, 60, 92, 11, 41, 114}
Initial head position = 50
The following chart shows the sequence in which requested tracks are serviced using SSTF.
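A short C program along these lines reproduces the SSTF behaviour. This is only a sketch of one possible implementation; for the request sequence above and head position 50, it services the tracks in the order 41, 34, 11, 60, 79, 92, 114, 176, for a total of 204 seek operations.

#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

/* SSTF: repeatedly service the pending track closest to the head. */
int main(void) {
    int req[] = {176, 79, 34, 60, 92, 11, 41, 114};
    int n = sizeof req / sizeof req[0];
    int done[8] = {0}, head = 50, total = 0;

    printf("Seek Sequence is");
    for (int served = 0; served < n; served++) {
        int best = -1, best_dist = INT_MAX;
        for (int i = 0; i < n; i++) {              /* find nearest pending track */
            int d = abs(req[i] - head);
            if (!done[i] && d < best_dist) { best = i; best_dist = d; }
        }
        done[best] = 1;
        total += best_dist;                         /* add its seek distance */
        head = req[best];                           /* head moves to that track */
        printf(" %d", head);
    }
    printf("\nTotal number of seek operations = %d\n", total);
    return 0;
}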
Circular SCAN (C-SCAN) scheduling algorithm is a modified version of SCAN disk scheduling
algorithm that deals with the inefficiency of SCAN algorithm by servicing the requests more
uniformly. Like SCAN (Elevator Algorithm) C-SCAN moves the head from one end servicing
all the requests to the other end. However, as soon as the head reaches the other end, it
immediately returns to the beginning of the disk without servicing any requests on the return trip
(see chart below) and starts servicing again once it reaches the beginning. This is also known as the
“Circular Elevator Algorithm” as it essentially treats the cylinders as a circular list that wraps
around from the final cylinder to the first one.
Output:
Initial position of head: 50
Total number of seek operations = 190
Seek Sequence is
60
79
92
114
176
199
0
11
34
41
The following chart shows the sequence in which requested tracks are serviced using C-SCAN.
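One possible C sketch of the C-SCAN computation follows. It sorts the requests, sweeps right from the head to the last cylinder (199 here), then wraps to cylinder 0 and continues; like the output above, it does not count the wrap-around jump, giving a total of 190.

#include <stdio.h>
#include <stdlib.h>

static int cmp(const void *a, const void *b) { return *(const int *)a - *(const int *)b; }

/* C-SCAN: service to the right, run to the last cylinder, jump back to
 * cylinder 0 (the jump itself is not counted here) and keep going right. */
int main(void) {
    int req[] = {176, 79, 34, 60, 92, 11, 41, 114};
    int n = sizeof req / sizeof req[0];
    int head = 50, disk_end = 199, total = 0, prev = head;

    qsort(req, n, sizeof req[0], cmp);

    printf("Seek Sequence is");
    for (int i = 0; i < n; i++)                      /* tracks at or right of head */
        if (req[i] >= head) { total += req[i] - prev; prev = req[i]; printf(" %d", prev); }
    total += disk_end - prev;                        /* run on to the last cylinder */
    printf(" %d 0", disk_end);
    prev = 0;                                        /* wrap to the first cylinder */
    for (int i = 0; i < n; i++)                      /* service the remaining tracks */
        if (req[i] < head) { total += req[i] - prev; prev = req[i]; printf(" %d", prev); }
    printf("\nTotal number of seek operations = %d\n", total);
    return 0;
}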
Given an array of disk track numbers and initial head position, our task is to find the total
number of seek operations done to access all the requested tracks if LOOK disk scheduling
algorithm is used. Also, write a program to find the seek sequence using LOOK disk scheduling
algorithm.
LOOK Disk Scheduling Algorithm:
LOOK is an advanced version of the SCAN (elevator) disk scheduling algorithm which gives a
slightly better seek time than the others in the hierarchy (FCFS->SSTF->SCAN->C-
SCAN->LOOK). The LOOK algorithm services requests in the same way as the SCAN algorithm,
but it also "looks" ahead to check whether there are more tracks that need to be serviced in the
same direction. If there are no pending requests in the moving direction, the head reverses
direction and starts servicing requests in the opposite direction.
The main reason behind the better performance of the LOOK algorithm in comparison to SCAN is
that the head is not allowed to move till the end of the disk.
Algorithm:
1. Let Request array represents an array storing indexes of tracks that have been requested
in ascending order of their time of arrival. ‘head’ is the position of disk head.
2. The initial direction in which the head is moving is given and it services in the same direction.
Output:
Initial position of head: 50
Total number of seek operations = 291
Seek Sequence is
60
79
92
114
176
41
34
11
The following chart shows the sequence in which requested tracks are serviced using LOOK.
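Since the problem statement asks for a program that finds the LOOK seek sequence, here is a minimal C sketch. It sweeps right from the head only as far as the largest pending request and then reverses, reproducing the sequence and the total of 291 shown above.

#include <stdio.h>
#include <stdlib.h>

static int cmp(const void *a, const void *b) { return *(const int *)a - *(const int *)b; }

/* LOOK: sweep right up to the largest pending track, then reverse
 * and service the remaining tracks on the way back. */
int main(void) {
    int req[] = {176, 79, 34, 60, 92, 11, 41, 114};
    int n = sizeof req / sizeof req[0];
    int head = 50, total = 0, prev = head;

    qsort(req, n, sizeof req[0], cmp);

    printf("Seek Sequence is");
    for (int i = 0; i < n; i++)                  /* sweep right */
        if (req[i] >= head) { total += req[i] - prev; prev = req[i]; printf(" %d", prev); }
    for (int i = n - 1; i >= 0; i--)             /* reverse: sweep left */
        if (req[i] < head) { total += prev - req[i]; prev = req[i]; printf(" %d", prev); }
    printf("\nTotal number of seek operations = %d\n", total);
    return 0;
}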
C-LOOK is an enhanced version of both the SCAN and LOOK disk scheduling algorithms.
This algorithm also uses the idea of wrapping the tracks as a circular cylinder, like the C-SCAN
algorithm, but the seek time is better than with C-SCAN. We know that C-SCAN is used to
avoid starvation and to service all the requests more uniformly; the same goes for C-LOOK.
In this algorithm, the head services requests only in one direction (either left or right) until all the
requests in this direction have been serviced, and then jumps back to the farthest request in the
other direction and services the remaining requests, which gives more uniform servicing as well
as avoids wasting seek time going till the end of the disk.
Algorithm-
1. Let Request array represents an array storing indexes of the tracks that have been
requested in ascending order of their time of arrival and head is the position of the disk
head.
2. The initial direction in which the head is moving is given and it services in the same
direction.
3. The head services all the requests one by one in the direction it is moving.
4. The head continues to move in the same direction until all the requests in this direction
have been serviced.
5. While moving in this direction, calculate the absolute distance of the tracks from the
head.
6. Increment the total seek count with this distance.
7. Currently serviced track position now becomes the new head position.
8. Go to step 5 until we reach the last request in this direction.
9. If we reach the last request in the current direction then reverse the direction and move
the head in this direction until we reach the last request that is needed to be serviced in
this direction without servicing the intermediate requests.
10. Reverse the direction and go to step 3 until all the requests have been
serviced.
Examples:
Input:
Request sequence = {176, 79, 34, 60, 92, 11, 41, 114}
The following chart shows the sequence in which requested tracks are serviced using C-LOOK.
Therefore, the total seek count = (60 – 50) + (79 – 60) + (92 – 79) + (114 – 92) + (176 – 114) +
(176 – 11) + (34 – 11) + (41 – 34) = 321
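The same computation can be sketched in C: the head sweeps right, then jumps straight to the lowest pending track (this jump is counted, as in the arithmetic above) and sweeps right again, giving 321.

#include <stdio.h>
#include <stdlib.h>

static int cmp(const void *a, const void *b) { return *(const int *)a - *(const int *)b; }

/* C-LOOK: service requests moving right, then jump to the lowest
 * pending request (the jump IS counted) and continue moving right. */
int main(void) {
    int req[] = {176, 79, 34, 60, 92, 11, 41, 114};
    int n = sizeof req / sizeof req[0];
    int head = 50, total = 0, prev = head;

    qsort(req, n, sizeof req[0], cmp);

    printf("Seek Sequence is");
    for (int i = 0; i < n; i++)                 /* pass 1: tracks right of head */
        if (req[i] >= head) { total += req[i] - prev; prev = req[i]; printf(" %d", prev); }
    for (int i = 0; i < n; i++)                 /* pass 2: wrap to the lowest track */
        if (req[i] < head) { total += abs(req[i] - prev); prev = req[i]; printf(" %d", prev); }
    printf("\nTotal seek count = %d\n", total);
    return 0;
}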
Security refers to providing a protection system to computer system resources such as CPU,
memory, disk, software programs and most importantly data/information stored in the computer
system. If a computer program is run by an unauthorized user, then he/she may cause severe
damage to the computer or the data stored in it. So a computer system must be protected against
unauthorized access, malicious access to system memory, viruses, worms, etc. We're going to
discuss the following topics:
Authentication
One Time passwords
Program Threats
System Threats
Computer Security Classifications
Authentication
Authentication refers to identifying each user of the system and associating the executing
programs with those users. It is the responsibility of the Operating System to create a protection
system which ensures that a user who is running a particular program is authentic. Operating
Systems generally identify/authenticate users in the following three ways −
Username / Password − The user needs to enter a registered username and password with
the operating system to log in to the system.
User card/key − The user needs to punch a card into the card slot, or enter a key generated
by a key generator in the option provided by the operating system, to log in to the system.
User attribute - fingerprint/ eye retina pattern/ signature − The user needs to pass his/her
attribute via a designated input device used by the operating system to log in to the system.
System Threats
System threats refer to the misuse of system services and network connections to put the user in
trouble. System threats can be used to launch program threats on a complete network, called a
program attack. System threats create an environment in which operating system resources and
user files are misused. Following is the list of some well-known system threats.
Worm − A worm is a process which can choke down a system's performance by using
system resources to extreme levels. A worm process generates its multiple copies, where
each copy uses system resources and prevents all other processes from getting the required
resources. Worm processes can even shut down an entire network.
Port Scanning − Port scanning is a mechanism or means by which a hacker can detect
system vulnerabilities to make an attack on the system.
Denial of Service − Denial of service attacks normally prevent users from making legitimate
use of the system. For example, a user may not be able to use the internet if a denial of
service attack targets the browser's content settings.
Principles of Protection
The principle of least privilege dictates that programs, users, and systems be given just
enough privileges to perform their tasks.
This ensures that failures do the least amount of harm and allow the least harm to be
done.
For example, if a program needs special privileges to perform a task, it is better to make it
a SGID program with group ownership of "network" or "backup" or some other pseudo
group, rather than SUID with root ownership. This limits the amount of damage that can
occur if something goes wrong.
Typically each user is given their own account, and has only enough privilege to modify
their own files.
The root account should not be used for normal day-to-day activities - the System
Administrator should also have an ordinary account, and reserve use of the root account
for only those tasks which need root privileges.
Domain of Protection
A computer can be viewed as a collection of processes and objects ( both HW & SW ).
The need to know principle states that a process should only have access to those
objects it needs to accomplish its task, and furthermore only in the modes for which it
needs access and only during the time frame when it needs access.
The modes available for a particular object may depend upon its type.
Domain Structure
A protection domain specifies the resources that a process may access.
Each domain defines a set of objects and the types of operations that may be
invoked on each object.
An access right is the ability to execute an operation on an object.
A domain is defined as a set of < object, { access right set } > pairs, as shown
below. Note that some domains may be disjoint while others overlap.
An Example: UNIX
UNIX associates domains with users.
Certain programs operate with the SUID bit set, which effectively changes the
user ID, and therefore the access domain, while the program is running. ( and
similarly for the SGID bit. ) Unfortunately this has some potential for abuse.
An alternative used on some systems is to place privileged programs in
special directories, so that they attain the identity of the directory owner when
they run. This prevents crackers from placing SUID programs in random
directories around the system.
Yet another alternative is to not allow the changing of ID at all. Instead,
special privileged daemons are launched at boot time, and user processes send
messages to these daemons when they need special tasks performed.
An Example: MULTICS
The MULTICS system uses a complex system of rings, each corresponding to
a different protection domain, as shown below:
Rings are numbered from 0 to 7, with outer rings having a subset of the
privileges of the inner rings.
Each file is a memory segment, and each segment description includes an
entry that indicates the ring number associated with that segment, as well as
read, write, and execute privileges.
Each process runs in a ring, according to the current-ring-number, a counter
associated with each process.
A process operating in one ring can only access segments associated with
higher ( farther out ) rings, and then only according to the access bits.
Processes cannot access segments associated with lower rings.
Domain switching is achieved by a process in one ring calling upon a process
operating in a lower ring, which is controlled by several factors stored with
each segment descriptor:
An access bracket, defined by integers b1 <= b2.
A limit b3 > b2
A list of gates, identifying the entry points at which the segments may
be called.
If a process operating in ring i calls a segment whose bracket is such that b1
<= i <= b2, then the call succeeds and the process remains in ring i.
Otherwise a trap to the OS occurs, and is handled as follows:
If i < b1, then the call is allowed, because we are transferring to a
procedure with fewer privileges. However if any of the parameters
being passed are of segments below b1, then they must be copied to an
area accessible by the called procedure.
If i > b2, then the call is allowed only if i <= b3 and the call has been
directed to one of the designated entry points in the list of gates, so that
outer-ring processes can invoke inner-ring services only in a controlled way.
Access Matrix
The model of protection that we have been discussing can be viewed as an access matrix,
in which columns represent different system resources and rows represent different
protection domains. Entries within the matrix indicate what access that domain has to that
resource.
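For illustration, consider a small made-up access matrix with three domains (rows) and three objects (columns); the domain and object names are invented for this example:

            File1          File2          Printer
D1          read           -              -
D2          read, write    read           -
D3          -              read           print

A process executing in domain D2 may read and write File1 and read File2, but it has no access to the printer.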
A Lock-Key Mechanism
Each resource has a list of unique bit patterns, termed locks.
Each domain has its own list of unique bit patterns, termed keys.
Access is granted if one of the domain's keys fits one of the resource's locks.
Again, a process is not allowed to modify its own keys.
Comparison
Each of the methods here has certain advantages or disadvantages, depending
on the particular situation and task at hand.
Many systems employ some combination of the listed methods.
Access Control
Role-Based Access Control, RBAC, assigns privileges to users, programs, or roles as
appropriate, where "privileges" refer to the right to call certain system calls, or to use
certain parameters with those calls.
RBAC supports the principle of least privilege, and reduces the susceptibility to abuse as
opposed to SUID or SGID programs.
Access control
Access control defines rules and policies for limiting access to a system or to physical or virtual
resources. It is a process by which users are granted access and certain privileges to systems,
resources or information. In access control systems, users need to present credentials before they
can be granted access, such as a person's name or a computer's serial number. In physical
systems, these credentials may come in many forms, but credentials that can't be transferred
provide the most security.
Authentication
An authentication is a process that ensures and confirms a user's identity or role that someone
has. It can be done in a number of different ways, but it is usually based on a combination of-
something the person has (like a smart card or a radio key for storing secret keys),
something the person knows (like a password),
something the person is (like a human with a fingerprint).
Authentication is a necessity for every organization because it enables organizations to keep
their networks secure by permitting only authenticated users to access their protected resources.
These resources may include computer systems, networks, databases, websites and other
network-based applications or services.
Authorization
Authorization is a security mechanism which gives permission to do or have something. It is
used to determine whether a person or system is allowed access to resources, based on an access control
policy, including computer programs, files, services, data and application features. It is normally
preceded by authentication for user identity verification. System administrators are typically
assigned permission levels covering all system and user resources. During authorization, a
system verifies an authenticated user's access rules and either grants or refuses resource access.
Physical Security
Physical security describes measures designed to deny the unauthorized access of IT assets like
facilities, equipment, personnel, resources and other properties from damage. It protects these
assets from physical threats including theft, vandalism, fire and natural disasters.
Backups
Backup is the periodic archiving of data. It is a process of making copies of data or data files to
use in the event when the original data or data files are lost or destroyed. It is also used to make
copies for historical purposes, such as for longitudinal studies, statistics or for historical records
or to meet the requirements of a data retention policy. Many applications especially in a
Windows environment, produce backup files using the .BAK file extension.
Checksums
A checksum is a numerical value used to verify the integrity of a file or a data transfer. In other
words, it is the computation of a function that maps the contents of a file to a numerical value.
They are typically used to compare two sets of data to make sure that they are the same. A
checksum function depends on the entire contents of a file. It is designed in a way that even a
small change to the input file (such as flipping a single bit) is likely to result in a different output
value.
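As a toy illustration of this (real systems use stronger functions such as CRC32 or SHA-256), here is a C sketch of a simple additive checksum; changing even one byte of the input changes the result:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Toy additive checksum: sums every byte of the input modulo 2^32. */
static uint32_t checksum(const unsigned char *data, size_t len) {
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += data[i];                  /* every byte contributes to the value */
    return sum;
}

int main(void) {
    const char *a = "hello world";
    const char *b = "hello worle";       /* one byte changed */
    printf("%u vs %u\n",
           (unsigned)checksum((const unsigned char *)a, strlen(a)),
           (unsigned)checksum((const unsigned char *)b, strlen(b)));
    return 0;
}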
Physical Protections
Physical safeguards keep information available even in the event of physical challenges.
They ensure sensitive information and critical information technology are housed in secure areas.
Computational redundancies
Computational redundancy is applied as fault tolerance against accidental faults. It protects
computers and storage devices that serve as fallbacks in the case of failures.
The Unix file system is a logical method of organizing and storing large amounts of information in a way that makes it
easy to manage. A file is the smallest unit in which information is stored. The Unix file system has several important
features. All data in Unix is organized into files. All files are organized into directories. These directories are
organized into a tree-like structure called the file system.
Files in Unix System are organized into multi-level hierarchy structure known as a directory tree. At the very top of
the file system is a directory called “root” which is represented by a “/”. All other files are “descendants” of root.
1. Ordinary files – An ordinary file is a file on the system that contains data, text, or program instructions.
Used to store your information, such as some text you have written or an image you have drawn. This is the
type of file that you usually work with.
Always located within/under a directory file.
Do not contain other files.
In long-format output of ls -l, this type of file is specified by the “-” symbol.
2. Directories – Directories store both special and ordinary files. For users familiar with Windows or Mac OS,
UNIX directories are equivalent to folders. A directory file contains an entry for every file and subdirectory that it
houses. If you have 10 files in a directory, there will be 10 entries in the directory. Each entry has two components.
(1) The Filename
(2) A unique identification number for the file or directory (called the inode number)
Branching points in the hierarchical tree.
1. ls
ls lists the contents of a directory.
Examples
$ ls
fish hello.txt
$ ls -l
-rw-r--r-- 1 username groupname 0 Apr 11 00:09 fish
-rw-r--r-- 1 username groupname 11 Apr 11 00:10 hello.txt
$ ll
-rw-r--r-- 1 username groupname 0 Apr 11 00:09 fish
-rw-r--r-- 1 username groupname 11 Apr 11 00:10 hello.txt
$ ls -F /usr/X11R6/bin/X*
/usr/X11R6/bin/X@ /usr/X11R6/bin/Xnest* /usr/X11R6/bin/Xprt*
/usr/X11R6/bin/Xmark* /usr/X11R6/bin/Xorg* /usr/X11R6/bin/Xvfb*
We do not know yet if there is a symbolic link "X" and an executable "Xmark" or if "X@" and
"Xmark*" are just the names of normal files. (Though "@" and "*" are not much found in
filenames, they are possible.) So we check by dropping the -F:
$ ls /usr/X11R6/bin/X*
/usr/X11R6/bin/X /usr/X11R6/bin/Xnest /usr/X11R6/bin/Xprt
/usr/X11R6/bin/Xmark /usr/X11R6/bin/Xorg /usr/X11R6/bin/Xvfb
2. mkdir
mkdir is a utility for creating a directory.
Examples
$ mkdir newdirectoryname
$ mkdir foo
$ cd foo
$ mkdir bar
$ mkdir -p foo/bar
3.cd
cd changes the current directory of the shell. This current directory will be used by other
programs launched from the shell.
Because "cd" changes the state of the shell, it is a shell built-in command. In contrast, most
commands are separate programs which the shell starts.
$ cd foobar
Change to your home directory, cd command used without an option will drop you back into
your home directory.
$ cd
~ (tilde) stores the path to your home directory, this command has same effect as the previous
one.
$ cd ~
$ cd ..
$ cd -
Tips:
By setting "CDPATH" environment variable in your shell you can take advantage of shell
command completion facility.
$ echo $CDPATH
.:/usr/local:/usr/share/doc
If you have the $CDPATH set, then you press 'TAB' key and get possible path completions
$ cd bas [TAB]
base-config/ base-files/ base-passwd/ bash/ bastille/
4. pwd
pwd prints the absolute path of the current working directory.
$ pwd
/home/username
$ cd /usr
$ pwd
/usr
You can also use "pwd" in scripts. If you have enough experience with scripting, then you would
know that the next line complains if the current directory is /home/username.
5.cp
cp copies a file
Most used options are:
-r
copies directories (recursively)
-p
preserves permissions, ownership, and timestamps
-i
prompt before overwrite
-v
verbose, show filenames as they are being copied
Examples
Makes a copy of file 'debian' and call it 'Debian' (assuming 'Debian' is not already a directory)
$ cp -i debian Debian
Makes a copy of file 'debian' and put it at /tmp/debian
$ cp -i debian /tmp/debian
Same as the previous command (the filename defaults to be the same).
$ cp -i debian /tmp
Makes a copy of directory 'mydir' (and all its contents) and put it at /tmp/mydir
$ cp -ir mydir/ /tmp
Copy multiple files to directory /tmp
$ cp -i foo bar baz /tmp
6. mv
mv moves and/or renames files.
Examples
Rename file 'foo' to 'bar':
$ mv foo bar
Move file 'foo' into the directory /tmp:
$ mv foo /tmp
7. rm
rm deletes a file from the filesystem, like the "del" command in DOS.
The GNU long options (like --directory) are available on Linux, but not most other systems.
Some useful options are:
-d, --directory
unlink FILE, even if it is an empty directory (some systems let superuser unlink non-empty
directories too)
-f, --force
ignore nonexistent files, never prompt
-i, --interactive
prompt before any removal
-P
(*BSD only) overwrite file before deletion
-r, -R, --recursive
remove the contents of directories recursively (the force option must often be used to
successfully run rm recursively)
-v, --verbose
(GNU only) explain what is being done
--help
(GNU only) display help and exit
--version
(GNU only) output version information and exit
Examples:
The usage of "rm" is considered potentially more dangerous than equivalents in other operating
systems because of the way the shell parses wildcards and names of special directories and in its
non-verbose actions.
Here is a classic example. Instead of deleting files that end with .o ("*.o") it deletes all files in
the directory ("*") and also a file called .o. There is an unwanted space between the asterisk and
the period.
$ rm * .o
rm: cannot remove `.o': No such file or directory
To remove a file whose name starts with a `-', for example `-foo', use one of these commands:
$ rm -- -foo
$ rm ./-foo
It might be useful to create an alias such as "remove" which moves the files to a local "trash" file
so you can go there and recover files you accidentally "remove"d.
Secure deletion of files:
Note that if you use rm to remove a file, it is usually possible to recover the contents of that file
since rm does not remove it from the hard disk. It simply removes the file system's link to it.
On *BSD systems, the -P option overwrites the file's data before removing it.
$ rm -P secretfile
8. rmdir
rmdir is a utility for deleting empty directories.
Examples
$ rmdir directoryname
If the directory is not empty, the correct way to remove the directory and all its contents
recursively is to use
$ rm -r directoryname
9. cat
The original purpose of the cat command is to concatenate files (horizontally). It accepts any
number of files as its arguments and outputs the contents of each file, completely, one after
another, on the screen (the standard output device). Assume that you have three
files, file1, file2 and file3, with the following contents in them, respectively:
Process Utilities
1. PS
This command stands for 'Process Status'. It is similar to the "Task Manager" that pops up in a
Windows machine when we use Ctrl+Alt+Del. This command is similar to the 'top' command, but
the information displayed is different.
To check all the processes running under a user, use the command -
ps ux
You can also check the process status of a single process, use the syntax -
ps PID
2. KILL
To use these utilities you need to know the PID (process id) of the process you want to kill.
Syntax -
kill PID
3. NICE
Linux can run a lot of processes at a time, which can slow down the speed of some high priority
processes and result in poor performance.
To avoid this, you can tell your machine to prioritize processes as per your requirements.
This priority is called Niceness in Linux, and it has a value between -20 and 19. The lower the
Niceness value, the higher the priority given to that task.
To start a process with a niceness value other than the default, use the following syntax:
nice -n [niceness value] [process name]
If some process is already running on the system, then you can 'renice' its value using the syntax:
renice [niceness value] -p [PID]
To change niceness, you can use the 'top' command to determine the PID (process id) and its
nice value, then use the renice command to change the value.
4. DF
This utility reports the free disk space (hard disk) on all the file systems.
If you want the above information in a readable format, then use the command
'df -h'
1.grep
The grep filter searches a file for a particular pattern of characters, and displays all lines that
contain that pattern. The pattern that is searched in the file is referred to as the regular expression
(grep stands for globally search for regular expression and print out).
Syntax:
grep [options] pattern [files]
Options Description
-c : This prints only a count of the lines that match a pattern
-h : Display the matched lines, but do not display the filenames.
-i : Ignores, case for matching
-l : Displays list of a filenames only.
-n : Display the matched lines and their line numbers.
-v : This prints out all the lines that do not match the pattern
-e exp : Specifies expression with this option. Can use multiple times.
-f file : Takes patterns from file, one per line.
-E : Treats pattern as an extended regular expression (ERE)
-w : Match whole word
-o : Print only the matched parts of a matching line,
with each such part on a separate output line.
Sample Commands
Consider the file geekfile.txt as the input.
5. Displaying only the matched pattern using grep -o : each match is printed on its own line.
$ grep -o "unix" geekfile.txt
Output:
unix
unix
unix
unix
unix
unix
6. Show line number while displaying the output using grep -n : To show the line number of
file with the line matched.
$ grep -n "unix" geekfile.txt
Output:
1:unix is great os. unix is opensource. unix is free os.
4:uNix is easy to learn.unix is a multiuser os.Learn unix .unix is a powerful.
7. Matching patterns given in a file using grep -f : each line of pattern.txt is treated as a pattern.
$ cat pattern.txt
Agarwal
Aggarwal
Agrawal
$ grep -f pattern.txt geekfile.txt
2.cut
The cut command in UNIX is a command for cutting out the sections from each line of files and
writing the result to standard output. It can be used to cut parts of a line by byte position,
character and field. Basically the cut command slices a line and extracts the text. It is necessary
to specify an option with the command, otherwise it gives an error. If more than one file name is
provided, the data from each file is not preceded by its file name.
Syntax:
cut OPTION... [FILE]...
Let us consider two files having name state.txt and capital.txt contains 5 names of the Indian
states and capitals respectively.
$ cat state.txt
Andhra Pradesh
Arunachal Pradesh
Assam
Bihar
Chhattisgarh
3.finger
Finger command is used in Linux and Unix-like system to check the information of any currently
logged in users from the terminal. It is a command-line utility that can provide users login time,
tty (name), idle time, home directory, shell name, etc.
The finger package is not installed by default on most Linux systems, such as Ubuntu and other
Debian-flavored distributions, so it must be installed before use.
4.suid
SUID is a special file permission for executable files which enables other users to run the file
with effective permissions of the file owner. Instead of the normal x which represents execute
permissions, you will see an s (to indicate SUID) special permission for the user.
The example command below will find all files with SUID set in the current directory using the
-perm (print files only with permissions set to 4000) option:
$ find . -perm -4000 -type f
5.wc
wc stands for word count. As the name implies, it is mainly used for counting purpose.
It is used to find out number of lines, word count, byte and characters count in the
files specified in the file arguments.
By default it displays four-columnar output.
First column shows number of lines present in a file specified, second column shows
number of words present in the file, third column shows number of characters present in
file and fourth column itself is the file name which are given as argument.
Syntax:
wc [OPTION]... [FILE]...
Let us consider two files having name state.txt and capital.txt containing 5 names of the Indian
states and capitals respectively.
$ cat state.txt
Andhra Pradesh
Arunachal Pradesh
Assam
Bihar
Chhattisgarh
$ cat capital.txt
Hyderabad
Itanagar
Dispur
Patna
Raipur
Passing only one file name in the argument:
$ wc state.txt
5 7 58 state.txt
6.chmod
In Unix-like operating systems, the chmod command is used to change the access mode of a file.
The name is an abbreviation of change mode.
Syntax :
chmod [reference][operator][mode] file...
The references are used to distinguish the users to whom the permissions apply, i.e. they are a list
of letters that specifies whom to give permissions to. The references are represented by one or
more of the following letters:
u - owner (user)
g - group
o - others
a - all (owner, group and others)
7.man
man command in Linux is used to display the user manual of any command that we can run on
the terminal. It provides a detailed view of the command which includes NAME, SYNOPSIS,
DESCRIPTION, OPTIONS, EXIT STATUS, RETURN VALUES, ERRORS, FILES,
VERSIONS, EXAMPLES, AUTHORS and SEE ALSO.
Every manual is divided into the following numbered sections:
1. Executable programs or shell commands
2. System calls (functions provided by the kernel)
3. Library calls (functions within program libraries)
4. Special files (usually found in /dev)
5. File formats and conventions, e.g. /etc/passwd
6. Games
7. Miscellaneous (including macro packages and conventions), e.g. groff(7)
8. System administration commands (usually only for root)
9. Kernel routines [Non standard]
Syntax :
man [command name]
Example:
$ man printf
In this example, the manual pages of the command 'printf' are simply returned.
8.wall
wall command in Linux system is used to write a message to all users. This command displays a
message, or the contents of a file, or otherwise its standard input, on the terminals of all currently
logged in users. Lines longer than 79 characters are wrapped by this command. Short lines are
whitespace-padded to 79 characters. A carriage return and newline are always put at the end of
each line by the wall command. Only the superuser can write on the terminals of
users who have chosen to deny messages or are using a program which automatically denies
messages. Reading from a file is refused when the invoker is not superuser and the program
is suid(set-user-ID) or sgid(set-group-ID).
Syntax:
wall [-n] [-t timeout] [message | file]
Options:
-n : suppress the banner line printed before the message.
-t timeout : abandon the write attempt to the terminals after timeout seconds.
9. sort
The sort command arranges the lines of a text file in order (alphabetical by default).
Output :
abhishek
chitransh
divyam
harsh
naveen
rajan
satish
Directory Structure
A directory is a container that is used to contain folders and file. It organizes files and folders
into a hierarchical manner.
There are several logical structures of a directory, these are given below.
1. Single-level directory –
Single-level directory is the simplest directory structure. In it all files are contained in the same
directory, which makes it easy to support and understand.
A single-level directory has a significant limitation, however, when the number of files
increases or when the system has more than one user. Since all the files are in the same
directory, they must have unique names. If two users call their data file test, then the
unique-name rule is violated.
2. Two-level directory –
As we have seen, a single-level directory often leads to confusion of file names among
different users. The solution to this problem is to create a separate directory for each user.
In the two-level directory structure, each user has their own user file directory (UFD).
The UFDs have similar structures, but each lists only the files of a single user.
The system's master file directory (MFD) is searched whenever a user logs in. The
MFD is indexed by username or account number, and each entry points to the UFD for that
user.
Advantages:
We can give full path like /User-name/directory-name/.
Different users can have same directory as well as file name.
Searching of files become more easy due to path name and user-grouping.
Disadvantages:
A user is not allowed to share files with other users.
Still, it is not very scalable; two files of the same type cannot be grouped together for
the same user.
3. Tree-structured directory –
Once we have seen a two-level directory as a tree of height 2, the natural generalization is
to extend the directory structure to a tree of arbitrary height.
This generalization allows users to create their own subdirectories and to organize
their files accordingly.
4. Acyclic-graph directory –
An acyclic graph is a graph with no cycles; it allows sharing of subdirectories and files. The
same file or subdirectory may be in two different directories. It is a natural generalization
of the tree-structured directory.
It is used in situations where, for example, two programmers are working on a joint project
and need to access common files. The associated files are stored in a subdirectory, separating
them from other projects and files of other programmers. Since they are working on a joint
project, they want the subdirectory to appear in both of their own directories, and the common
subdirectory should be shared. So here we use acyclic-graph directories.
Note that a shared file is not the same as a copy of the file. If any programmer makes
changes in the shared subdirectory, the changes will be reflected in both directories.
5. General-graph directory –
In a general-graph directory structure, cycles are allowed within the directory structure, and
multiple directories can be derived from more than one parent directory.
The main problem with this kind of directory structure is calculating the total size or space
that has been taken by the files and directories.
In Unix-based operating systems each file is indexed by an Inode. Inodes are special disk blocks
that are created when the file system is created. The number of Inodes limits the total number of
files/directories that can be stored in the file system.
The Inode contains the following information:
Administrative information (permissions, timestamps, etc).
A number of direct block pointers (typically 12) that point to the first 12 blocks of the file.
A single indirect pointer that points to a disk block which in turn is used as an index
block, if the file is too big to be indexed entirely by the direct blocks.
A double indirect pointer that points to a disk block which is a collection of pointers to
disk blocks which are index blocks, used if the file is too big to be indexed by the direct and
single indirect blocks.
A triple indirect pointer that points to an index block of index blocks of index blocks.
Inode Total Size:
Number of disk block addresses that can be stored in one disk block = (Disk Block Size /
Disk Block Address Size).
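For example, assuming a 1 KB disk block and 4-byte disk block addresses, one index block holds 1024 / 4 = 256 addresses, so the maximum file size is (12 + 256 + 256^2 + 256^3) blocks = 16,843,020 blocks of 1 KB each, or roughly 16 GB.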
Small files need only the direct blocks, so there is little waste in space or extra disk reads
in those cases. Medium-sized files may use indirect blocks. Only large files make use of the
double or triple indirect blocks, and that is reasonable since those files are large
anyway. The disk is now broken into two different types of blocks: Inode blocks and data
blocks.
There must be some way to determine where the Inodes are, and to keep track of free
Inodes and disk blocks. This is done by a Superblock. Superblock is located at a fixed
position in the file system. The Superblock is usually replicated on the disk to avoid
catastrophic failure in case of corruption of the main Superblock.
Index allocation schemes suffer from some of the same performance problems as linked
allocation. For example, the index blocks can be cached in memory, but the data
blocks may be spread all over the partition.
Virtual Machines
Virtual Machine abstracts the hardware of our personal computer such as CPU, disk drives,
memory, NIC (Network Interface Card) etc, into many different execution environments as per
our requirements, hence giving us a feel that each execution environment is a single computer.
For example, VirtualBox.
When we run different processes on an operating system, it creates an illusion that each process
is running on a different processor having its own virtual memory, with the help of CPU
scheduling and virtual-memory techniques. There are additional features of a process that cannot
be provided by the hardware alone like system calls and a file system. The virtual machine
approach does not provide these additional functionalities but it only provides an interface that is
same as basic hardware. Each process is provided with a virtual copy of the underlying computer
system.
We can create a virtual machine for several reasons, all of which are fundamentally related to the
ability to share the same basic hardware yet can also support different execution environments,
i.e., different operating systems simultaneously.
The main drawback of the virtual-machine approach involves disk systems. Let us suppose
that the physical machine has only three disk drives but we want to support seven virtual machines.
Obviously, it cannot allocate a disk drive to each virtual machine, because virtual-machine
software itself will need substantial disk space to provide virtual memory and spooling. The
solution is to provide virtual disks.
Users are thus given their own virtual machines. After which they can run any of the operating
systems or software packages that are available on the underlying machine. The virtual-machine
software is concerned with multi-programming multiple virtual machines onto a physical
machine, but it does not need to consider any user-support software. This arrangement can
provide a useful way to divide the problem of designing a multi-user interactive system, into two
smaller pieces.
Advantages:
1. There are no protection problems because each virtual machine is completely isolated
from all other virtual machines.
2. Virtual machine can provide an instruction set architecture that differs from real
computers.
3. Easy maintenance, availability and convenient recovery.
Disadvantages:
1. When multiple virtual machines are simultaneously running on a host computer, one
virtual machine can be affected by other running virtual machines, depending on the
workload.
2. Virtual machines are not as efficient as a real one when accessing the hardware.
Virtualization
Operating system based virtualization refers to an operating system feature in which the kernel
enables the existence of various isolated user-space instances. The term also refers to installing
virtualization software on a pre-existing operating system, which is then called the host
operating system.
In this kind of virtualization, a user installs the virtualization software in the operating system of
the machine like any other program, and uses this application to create and operate various
virtual machines. Here, the virtualization software gives the user direct access to any of the
created virtual machines. Since the host OS must provide the hardware devices with the
mandatory support, operating system virtualization may cause hardware compatibility issues
when a hardware driver is not available to the virtualization software.
Virtualization software is able to convert hardware IT resources which require unique software
for operation into virtualized IT resources. As the host OS is a complete operating system in
itself, many OS based services are available as organizational management and administration
tools can be utilized for the virtualization host management.
A hypervisor is a form of virtualization software used in cloud hosting to divide and allocate
resources on various pieces of hardware. The program which provides partitioning, isolation or
abstraction is called a virtualization hypervisor. The hypervisor is a hardware virtualization
technique that allows multiple guest operating systems (OS) to run on a single host system at the
same time. A hypervisor is sometimes also called a virtual machine manager (VMM).
Types of Hypervisor –
TYPE-1 Hypervisor:
A Type-1 hypervisor runs directly on the underlying host system. It is also known as a "Native
Hypervisor" or "Bare metal hypervisor". It does not require any base server operating system and
has direct access to hardware resources. Examples of Type-1 hypervisors include VMware ESXi,
Citrix XenServer and the Microsoft Hyper-V hypervisor.
TYPE-2 Hypervisor:
A Type-2 hypervisor runs on a host operating system on the underlying host system. It is also
known as a "Hosted Hypervisor"; it is basically software installed on an operating system. The
hypervisor asks the operating system to make hardware calls. Examples of Type-2 hypervisors
include VMware Player and Parallels Desktop. Hosted hypervisors are often found on endpoints
like PCs.
Three main modules coordinate in order to emulate the underlying hardware:
1. Dispatcher
2. Allocator
3. Interpreter
DISPATCHER:
The dispatcher behaves like the entry point of the monitor and reroutes the instructions of the
virtual machine instance to one of the other two modules.
ALLOCATOR:
The allocator is responsible for deciding the system resources to be provided to the virtual
machine instance. This means that whenever a virtual machine tries to execute an instruction
that results in changing the machine resources associated with the virtual machine, the allocator
is invoked by the dispatcher.
INTERPRETER:
The interpreter module consists of interpreter routines. These are executed whenever the virtual
machine executes a privileged instruction.
Types of Virtualization –
1. Full Virtualization:
Software-assisted full virtualization relies completely on binary translation (BT) to trap and
virtualize the execution of sensitive, non-virtualizable instruction sets. It emulates the hardware
using software instruction sets. Due to binary translation, it is often criticized for performance
issues.
Hardware-assisted full virtualization eliminates binary translation and instead interacts directly
with the hardware using the virtualization technology that has been integrated into x86 processors
since 2005 (Intel VT-x and AMD-V). It allows the guest OS's privileged instructions to be executed
directly on the processor in a virtual context, even though the OS is virtualized.
2. Paravirtualization:
Paravirtualization works differently from full virtualization. It doesn't need to simulate the
hardware for the virtual machines. The hypervisor is installed on a physical server (host) and a
guest OS is installed into the environment. The virtual guest is aware that it has been virtualized,
unlike in full virtualization (where the guest doesn't know that it has been virtualized), so it can
take advantage of this. In this virtualization method, the guest kernel source code is modified so
that sensitive operations communicate with the host. Guest operating systems require extensions
to make API calls to the hypervisor. In full virtualization, guests issue hardware calls, but in
paravirtualization, guests communicate directly with the host (hypervisor) using drivers. Here is
the list of products which support paravirtualization:
Xen
IBM LPAR
Oracle VM for SPARC (LDOM)
3. Hybrid Virtualization:
In hardware-assisted full virtualization, guest operating systems are unmodified, which involves
many VM traps and thus high CPU overhead, limiting scalability. Paravirtualization is a complex
method where the guest kernel needs to be modified to inject the API. Considering these issues,
engineers came up with hybrid paravirtualization, a combination of both full virtualization and
paravirtualization. The virtual machine uses paravirtualization for specific hardware drivers
(where there is a bottleneck with full virtualization, especially with I/O- and memory-intensive
workloads), and the host uses full virtualization for other features. The following products
support hybrid virtualization.
The following diagram will help you to understand how VMware supports both full
virtualization and hybrid virtualization. RDMA uses the paravirtual driver to bypass the
VMkernel in hardware-assisted full virtualization.
4. OS level Virtualization:
Examples of OS-level virtualization include:
Linux LXC
Docker
AIX WPAR
Solaris Containers
Memory Virtualization
Beyond CPU virtualization, the next critical component is memory
virtualization. This involves sharing the physical system memory and dynamically allocating it
to virtual machines. Virtual machine memory virtualization is very similar to the virtual memory
support provided by modern operating systems. Applications see a contiguous address space that
is not necessarily tied to the underlying physical memory in the system. The operating system
keeps mappings of virtual page numbers to physical page numbers stored in page tables. All
modern x86 CPUs include a memory management unit (MMU) and a translation lookaside
buffer (TLB) to optimize virtual memory performance.
To run multiple virtual machines on a single system, another level of memory virtualization is
required. In other words, one has to virtualize the MMU to support the guest OS. The guest OS
continues to control the mapping of virtual addresses to the guest memory physical addresses,
but the guest OS cannot have direct access to the actual machine memory. The VMM is
responsible for mapping guest physical memory to the actual machine memory, and it uses
shadow page tables to accelerate the mappings. As depicted by the red line in Figure 8, the
VMM uses TLB hardware to map the virtual memory directly to the machine memory to avoid
the two levels of translation on every access. When the guest OS changes the virtual memory to
physical memory mapping, the VMM updates the shadow page tables to enable a direct lookup.
MMU virtualization creates some overhead for all virtualization approaches, but this is the area
where second generation hardware assisted virtualization will offer efficiency gains.
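A toy C model of the two mappings involved (all page numbers below are invented for illustration): the guest page table maps virtual pages to guest-physical pages, the VMM maps guest-physical pages to machine pages, and the shadow page table caches the composed mapping so each access needs only one lookup.

#include <stdio.h>

#define PAGES 8

/* guest_pt: guest VA -> guest PA (maintained by the guest OS);
 * vmm_map:  guest PA -> machine PA (maintained by the VMM). */
int main(void) {
    int guest_pt[PAGES] = {3, 1, 7, 0, 2, 5, 4, 6};
    int vmm_map[PAGES]  = {5, 0, 6, 2, 7, 1, 3, 4};
    int shadow[PAGES];

    for (int v = 0; v < PAGES; v++)      /* VMM precomputes the composed mapping */
        shadow[v] = vmm_map[guest_pt[v]];

    int vpage = 2;                       /* one lookup instead of two per access */
    printf("virtual page %d -> machine page %d\n", vpage, shadow[vpage]);
    return 0;
}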
Device and I/O Virtualization
The final component required beyond CPU and memory
virtualization is device and I/O virtualization. This involves managing the routing of I/O requests
between virtual devices and the shared physical hardware. Software based I/O virtualization and
management, in contrast to a direct pass-through to the hardware, enables a rich set of features
and simplified management. With networking for example, virtual NICs and switches create
virtual networks between virtual machines without the network traffic consuming bandwidth on
the physical network, NIC teaming allows multiple physical NICs to appear as one and fail over
transparently for virtual machines, and virtual machines can be seamlessly relocated to different
systems using VMotion while keeping their existing MAC addresses. The key to effective I/O
virtualization is to preserve these virtualization benefits while keeping the added CPU utilization
to a minimum. The hypervisor virtualizes the physical hardware and presents each virtual