OS Complete

The document provides an overview of the Operating Systems course, including important policies, course topics, and computing environments. It discusses dishonesty penalties, assignment policies, required Piazza participation, and attendance expectations. It also defines what an operating system is, describes the functions of an OS from both the user and system viewpoints, and discusses OS structure including multiprogramming and timesharing. Finally, it covers computing environments like single-processor, multiprocessor, multicore, and clustered systems.


OPERATING SYSTEMS (CS F372)

Introduction
Slides Courtesy: Dr. Barsha Mitra
CSIS Department, BITS Pilani, Hyderabad Campus
Handout Overview

Let’s go through the handout



Some Important Points

LOR requirement: ‘A’ grade and 90% attendance

Dishonesty penalty: 0 score in the component, including assignments, mid-semester
examination and comprehensive examination

Report to Disciplinary Committee for repeat offenders

Assignments and tutorials will run on Ubuntu 20.04

Late Days Policy for Assignments:

A total of 3 free late days is available to each student. The 3 days can be distributed between
assignments 1 and 2 as the student wishes. Beyond the 3 free days, a penalty of 10% per day
will apply

Sign up on Piazza – non-negotiable

Attend classes and interact



What is an Operating System?

Let’s hear your thoughts



Introduction


Program that manages a computer's hardware

Acts as an intermediary between the computer user and the computer h/w

Several types, each type optimising different aspects

mainframe operating systems

personal computer (PC) operating systems

operating systems for mobile platforms

(Question: what are the characteristics of these platforms that the OS
needs to optimise for?)



What the OS Does: User View
 Users want convenience, ease of use and good performance
 Don’t care about resource utilization
 Shared computer such as mainframe or minicomputer must keep all
users happy
 Users of dedicated systems such as workstations have dedicated
resources but frequently use shared resources from servers
 Handheld computers are resource poor, optimized for individual usability
and battery life
 Some computers have little or no user interface, such as embedded
computers in devices and automobiles



What the OS Does: System View

 OS is a resource allocator
 Manages all resources
 Decides between conflicting requests for efficient and fair resource use
 OS is a control program
 Controls execution of user programs to prevent errors and improper
use of the computer



How do we define an OS?
 Everything a vendor ships when you order an operating system is a good
approximation

 “The one program running at all times on the computer”


Computer System Organization
 Computer-system operation
 One or more CPUs, device controllers connect through common bus providing
access to shared memory
 Concurrent execution of CPUs and devices competing for memory cycles

(A device controller is a hardware component that works as a bridge between the hardware
device and the operating system or an application program.)


Computer-System Operation
 I/O devices and the CPU can execute concurrently
 Each device controller is in charge of a particular device type
 Each device controller has a local buffer
 CPU moves data from/to main memory to/from local buffers
 I/O is from the device to local buffer of controller
 Device controller informs CPU that it has finished its operation by causing
an interrupt
 What is an Interrupt?





Interrupt Handling
 Interrupt transfers control to the interrupt service routine through the
interrupt vector
 Interrupt vector table (IVT) contains the addresses of all the service routines
 Must save the address of the interrupted instruction
 Trap/exception is a software-generated interrupt caused either by an error
or a user request
 Operating system is interrupt driven
 Operating system preserves the state of the CPU by storing registers and
the program counter
Storage Structure
 Main memory –
 only large storage medium that the CPU can access directly
 instruction execution
 random access
 Volatile
 Secondary storage (an example of I/O devices)
 Cheaper, slower, non-volatile



Operating System Structure
 Multiprogramming (Batch system)
  Needed for efficiency
  Single process cannot keep CPU and I/O devices busy at all times
  Organizes jobs (code and data) so CPU always has one to execute
  A subset of the total jobs in the system (the job pool) is kept in memory
  One job is selected and run via job scheduling
  When it has to wait for I/O, OS switches to another job

CPU should not be idle!


Operating System Structure
 Timesharing (multitasking): CPU switches jobs so frequently that users can
interact with each job while it is running
  interactive computing
  User interaction via input devices
  Response time should be < 1 second
  Different users have different programs in memory - process
  If several jobs are ready to run at the same time - CPU scheduling
  If processes don't fit in memory, swapping moves them in and out to run
  Virtual memory allows execution of processes larger than physical memory

Here the focus is responsiveness / response time.
Operating System Operations

 Interrupt driven - hardware and software
  Hardware interrupt by one of the devices
  Software interrupt (exception or trap):
   Software error (e.g., division by zero, invalid memory access)
   Request for operating system service
  Other process problems include infinite loop, processes modifying each
other or the operating system


Operating System Operations
 Dual-mode operation allows OS to protect itself and protect users from one another
 User mode and Kernel/Supervisor/System/Privileged mode
 Mode bit provided by hardware
 Provides ability to distinguish when system is running user code or kernel code
 Some instructions designated as privileged, only executable in kernel mode
 System call changes mode to kernel, return from call resets it to user

(On a trap or interrupt, hardware locates the ISR via the interrupt vector and executes it.)


Operating System Operations

 Boot time: hardware starts in kernel mode
 After loading the OS, user applications are started in user mode
 When a trap / interrupt occurs, hardware switches from user mode to kernel mode


Computing Environments
Single-Processor Systems - one main CPU executing a general-purpose
instruction set, including instructions from user processes; some device-specific
processors (e.g., disk, keyboard and graphics controllers) and I/O processors
may also be present


Computing Environments
 Multiprocessors
  Also known as parallel systems, multicore systems
  2 or more processors in close communication, sharing the computer bus
and sometimes the clock, memory and peripheral devices
  Advantages:
   Increased throughput
   Economy of scale
   Increased reliability – graceful degradation, fault tolerance
  Two types:
   Asymmetric Multiprocessing – each processor is assigned a specific task;
a boss processor controls worker processors
   Symmetric Multiprocessing – each processor performs all tasks; peers
Computing Environments
 Multicore Systems
  include multiple computing cores on a single chip
  more efficient than multiple chips with single cores because on-chip
communication is faster than between-chip communication
  dual-core design: both cores on the same chip


Computing Environments
Clustered Systems
 Like multiprocessor systems, but multiple systems working
together
 Usually sharing storage via a storage-area network (SAN)
 Provides a high-availability service which survives failures, users
can see only a brief interruption of service
 Asymmetric clustering has one machine in hot-standby mode
 Symmetric clustering has multiple nodes running applications,
monitoring each other
 Some clusters are for high-performance computing (HPC)
 Applications must be written to use parallelization



Putting it All Together



Primary Tasks for an OS
 Process management
 Memory management
 Storage and I/O management



OPERATING SYSTEMS (CS F372)
OS Structures
Operating System Services
 User interface - almost all operating systems have a user interface (UI)
 Command-Line Interface (CLI), Graphical User Interface (GUI)
 Program execution - system must be able to load a program into memory and
to run that program, end execution, either normally or abnormally (indicating
error)
 I/O operations - a running program may require I/O, which may involve a file or
an I/O device; users don't control devices directly, the OS does
 File-system manipulation - read and write files and directories, create and
delete them, search them, list file information, permission management



Operating System Services
 Communications – processes may exchange information, on the same
computer or between computers over a network, shared memory or message
passing
 Error detection –
 OS needs to be constantly aware of possible errors
 May occur in CPU and memory h/w, in I/O devices, in user programs
 OS should take the appropriate action to ensure correct and consistent
computing
 Debugging



Operating System Services
 Resource allocation – allocating resources like CPU cycles, main memory, file
storage, I/O devices for multiple concurrently executing processes
 Accounting – keep track of which users use how much and what kinds of
computer resources
 Protection and security –
 owners of information stored in a multiuser or networked computer
system want to control use of that information
 concurrent processes should not interfere with each other or with OS
 ensuring that all accesses to system resources are controlled
 security of the system from outsiders requires user authentication,
extends to defending external I/O devices from invalid access attempts





User and Operating-System Interface: CLI
 CLI or command interpreter
 Sometimes implemented in kernel, sometimes by separate
program (Unix, Windows)
 Sometimes multiple flavors implemented – shells
 Primarily fetches a command from user and executes it



User and Operating-System Interface: GUI
 User-friendly interface
 Usually mouse, keyboard, and monitor
 Icons represent files, programs, actions, etc.
 Various mouse buttons over objects in the interface cause various actions
(provide information, options, execute function, open directory (known as a
folder))
 Many systems now include both CLI and GUI interfaces
 Microsoft Windows is GUI with CLI “command” shell
 Unix and Linux have CLI with optional GUI interfaces (CDE, KDE, GNOME)



User and Operating-System Interface: Touchscreen Interface
 Touchscreen devices require new interfaces
 Mouse not possible or not desired
 Actions and selection based on gestures
 Virtual keyboard for text entry
 Voice commands



User and Operating-System Interface: Choice of Interface


System Calls
 Interface to the services provided by the OS
 Typically written in a high-level language (C or C++)
 Mostly accessed by programs via a high-level Application Programming
Interface (API) rather than direct system call use
 API specifies a set of functions available to application programmers
 Three most common APIs are
 Win32 API for Windows
 POSIX API for POSIX-based systems (including all versions of UNIX, Linux,
and Mac OS X)
 Java API for the Java virtual machine (JVM)



System Calls
 A number is associated with each system call
 System-call interface maintains a table indexed according to these numbers
 The system call interface invokes the intended system call in OS kernel and
returns status of the system call and any return values
 The caller need know nothing about how the system call is implemented
 Just needs to obey API and understand what OS will do as a result of call
execution
 Most details of OS interface hidden from programmer by API
 Managed by run-time support library (set of functions built into libraries
included with compiler)





Types of System Calls
• Process control
  • create process, terminate process
  • end, abort
  • load, execute
  • get process attributes, set process attributes
• File management
  • create file, delete file
  • open, close file
  • read, write file
  • get and set file attributes
• Device management
  • request device, release device
  • read, write
  • get device attributes, set device attributes
  • logically attach or detach devices
• Information maintenance
• Communication
• Protection


Examples of System Calls



OS Structure

• Simple Structure/ Monolithic Kernel


• Layered Approach
• Microkernels
• Modules
• Hybrid System



Simple Structure
• not divided into modules
• interfaces and levels of
functionality are not well
separated
• application programs are able to
access the basic I/O routines to
write directly to the display and
disk drives
• vulnerable to malicious programs,
causing entire system crashes
when user programs fail



UNIX Architecture

• the original UNIX operating system had limited structuring
• consists of two separable parts
  • systems programs
  • kernel
    • consists of everything below the system-call interface and above the
physical hardware
    • provides the file system, CPU scheduling, memory management, and
other operating-system functions; a large number of functions for one level


Monolithic Kernel

• entire operating system works in kernel space
• larger in size
• little overhead in the system call interface or in communication within the kernel
• faster
• hard to extend
• if a service crashes, the whole system is affected
• e.g., MS-DOS
Layered Approach

• The operating system is divided into a number of layers (levels), each built
on top of lower layers
• The bottom layer (layer 0) is the hardware; the highest (layer N) is the
user interface
• With modularity, layers are selected such that each uses functions
(operations) and services of only lower-level layers


Microkernel
• user services and kernel services
are in separate address spaces
• smaller in size
• slower
• extendible, all new services are
added to user space
• if a service crashes, working of
microkernel is not affected
• more secure and reliable
• eg., Mach, QNX, Windows NT (initial
release)
• Drawback? Performance overhead of user-space to kernel-space communication
A Comparison

macOS X, Windows, iOS



Modules
 loadable kernel modules
 kernel has a core set of components
 links in additional services via modules, either at boot time or during run time
 each module has a well defined interface
 dynamically linking services is preferable to adding new features directly to
the kernel: does not require recompiling the kernel for every change
 better than a layered approach: any module can call any module
 better than microkernel: no message passing required to invoke modules



OPERATING SYSTEMS (CS F372)
Processes
Process Concept

 An operating system executes a variety of programs
  Batch system – jobs
  Time-shared systems – user programs or tasks
  Terms job and process used interchangeably
 Process – a program in execution; process execution must progress in
sequential fashion



Process Concept
 Multiple parts
 The program code, also called text section
 Current activity including program counter, processor registers
 Stack containing temporary data
 Function parameters, return addresses, local variables
 Data section containing global variables
 Heap containing memory dynamically allocated during run time



Process Concept
 Program is passive entity stored on
disk (executable file), process is
active
 Program becomes process when
executable file loaded into
memory
 Execution of program started via GUI
mouse clicks, command line entry of
its name, etc
 One program can be several
processes
 Consider multiple users executing
the same program
States of Process
 new: The process is being
created
 running: Instructions are
being executed
 waiting: The process is
waiting for some event to
occur
 ready: The process is
waiting to be assigned to a
processor
 terminated: The process
has finished execution



Process Control Block
 Process state – new, ready, running, waiting, etc.
 Program counter – location of the next instruction to execute
 CPU registers – contents of all process-centric registers
 CPU scheduling information- priorities, scheduling
queue pointers, scheduling parameters
 Memory-management information – memory
allocated to the process
 Accounting information – CPU used, clock time
elapsed since start, time limits, account nos., process
nos.
 I/O status information – I/O devices allocated to
process, list of open files
Process Creation
 Parent process creates children processes, which, in turn, create other
processes, forming a tree of processes
 Generally, process identified and managed via a process identifier
(pid), integer number
 Resource sharing options (CPU time, memory, files, I/O devices)
 Parent and children share all resources
 Children share subset of parent’s resources
 Parent and child share no resources
 Execution options
 Parent and children execute concurrently
 Parent waits until children terminate
Process Creation
 Address space
 Child duplicate of parent (same program and data)
 Child has a program loaded into it
 UNIX examples
 fork() system call creates new process
 exec() system call used after a fork() to replace the process’ memory space
with a new program



Process Creation
 fork()
 address space of child process is a copy of parent process
 both child and parent continue execution at the instruction after fork()
 return code for fork() is 0 for child
 return code for fork() is non-zero (child pid) for parent

 exec()
 loads a binary file into memory and starts execution
 destroys previous memory image
 call to exec() does not return unless an error occurs

 wait()
parent can issue wait() to suspend itself (leave the ready queue) until the child is done



Process Creation

#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <unistd.h>

int main()
{
    pid_t pid;
    pid = fork(); /* fork a child process */
    if (pid < 0) { /* error occurred */
        fprintf(stderr, "Fork Failed");
        return 1;
    }
    else if (pid == 0) { /* child process */
        printf("Child Process\n");
        execlp("/bin/ls", "ls", NULL);
    }
    else { /* parent process */
        wait(NULL); /* parent waits for child to complete */
        printf("Child Complete");
    }
    return 0;
}

OUTPUT:
Child Process
a.out    Documents  examples.desktop  MyPrograms  Pictures  Templates
Desktop  Downloads  Music             parent.c    Public    Videos
Child Complete
Process Creation

#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <unistd.h>

int main()
{
    pid_t pid;
    int x = 10;
    pid = fork();
    if (pid < 0) {
        fprintf(stderr, "Fork Failed");
        return 1;
    }
    else if (pid == 0) {
        printf("Child Process: x = %d\n", x);
        execlp("/bin/ls", "ls", NULL);
    }
    else {
        x++;
        printf("Parent Process: x = %d\n", x);
        wait(NULL);
        printf("Child Complete");
    }
    return 0;
}

OUTPUT:
Parent Process: x = ?
Child Process: x = ?
a.out    Documents  examples.desktop  MyPrograms  Pictures  Templates
Desktop  Downloads  Music             parent.c    Public    Videos
Child Complete
Process Termination
 Process executes last statement and then asks the operating system to
delete it using the exit() system call.
 May return status data from child to parent (via wait())
 Process’ resources are deallocated by operating system
 Parent may terminate the execution of children processes because:
 Child has exceeded allocated resources limit
 Task assigned to child is no longer required
 Parent is exiting and the operating system does not allow a child
to continue if its parent terminates



Process Termination
 Some operating systems do not allow child to exist if parent has terminated.
 If a process terminates, then all its children must also be terminated.
 cascading termination - All children, grandchildren, etc. are terminated.
 The termination is initiated by the operating system.
 The parent process may wait for termination of a child process by using the wait()
system call. The call returns status information and the pid of the terminated
process
pid_t pid;
int status;
pid = wait(&status); //parent can tell which child has terminated
 If no parent is waiting (has not yet invoked wait()), the process is a zombie
 If the parent terminated without invoking wait(), the process is an orphan



Zombie Process
#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main()
{
    pid_t child_pid = fork();
    if (child_pid > 0) {
        sleep(10);   /* child has exited: it is a zombie during this sleep */
        wait(NULL);  /* child reaped: no longer a zombie */
        sleep(200);
    }
    else {
        printf("\n%d", getpid());
        exit(0);
    }
    return 0;
}


Orphan Process
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main()
{
    pid_t child_pid = fork();
    if (child_pid > 0) {
        printf("\nParent process: %d\n", getpid());
        sleep(6);    /* parent exits while the child still sleeps */
    }
    else {
        printf("\nParent PID: %d\n", getppid());
        sleep(20);   /* parent has exited by now: child is an orphan */
        printf("\nChild Process: %d", getpid());
        printf("\nParent PID: %d", getppid()); /* adopted by init/systemd */
        exit(0);
    }
    return 0;
}


Interprocess Communication
 Processes within a system may be independent or cooperating
 Cooperating process can affect or be affected by other processes, including
sharing data
 Reasons for cooperating processes:
 Information sharing
 Computation speedup
 Modularity
 Convenience
 Cooperating processes need interprocess communication (IPC)
 Three models of IPC
  Pipes
  Shared memory
  Message passing
Interprocess Communication
(figures: the message-passing and shared-memory IPC models)


Pipe
 Acts as a conduit allowing two processes to communicate
 Issues:
 Is communication unidirectional or bidirectional?
 In the case of two-way communication, is it half or full-duplex?
 Must there exist a relationship (i.e., parent-child) between the
communicating processes?
 Can the pipes be used over a network?
 Ordinary pipes –
 cannot be accessed from outside the process that created it
 parent process creates a pipe and uses it to communicate with a child process
that it created
 Named pipes – can be accessed without a parent-child relationship



Ordinary Pipe
 Ordinary Pipes allow communication in standard producer-consumer style
 Producer writes to one end (the write-end of the pipe)
 Consumer reads from the other end (the read-end of the pipe)
 Ordinary pipes are unidirectional
 Require parent-child relationship between communicating processes
 Windows calls these anonymous pipes



Ordinary Pipe

ordinary pipe can’t be accessed from outside the process that created it
parent process creates a pipe and uses it to communicate with a child process that it
creates via fork()
child inherits the pipe from its parent process like any other file



Named Pipe
 Named Pipes are more powerful than ordinary pipes
 Communication is bidirectional
 No parent-child relationship is necessary between the
communicating processes
 Several processes can use the named pipe for communication
 Do not cease to exist if the communicating processes have
terminated
 Provided on both UNIX and Windows systems
 Referred to as FIFOs in UNIX systems



IPC – Shared Memory
 An area of memory shared among the processes that wish to communicate

 The communication is under the control of the user processes, not the
operating system

 Provide a mechanism that allows the user processes to synchronize their
actions when they access shared memory


IPC – Message Passing
 Mechanism for processes to communicate and to synchronize their actions

 Message system – processes communicate with each other without resorting
to shared variables; no sharing of address space

 IPC facility provides two operations:
  send(message)
  receive(message)

 The message size is either fixed or variable

 If processes P and Q wish to communicate, they need to:
  Establish a communication link between them
  Exchange messages via send() / receive()
Message Passing – Direct Communication
 Processes must name each other explicitly:
  send(P, message) – send a message to process P
  receive(Q, message) – receive a message from process Q

 Properties of communication link
  Links are established automatically
  Processes only need to know each other's identity
  A link is associated with exactly one pair of communicating processes
  Between each pair there exists exactly one link


Message Passing – Indirect Communication
 Messages are sent to and received from mailboxes (also referred to as ports)
 Each mailbox has a unique ID
 Processes can communicate only if they share a mailbox
 Properties of communication link
 Link established only if processes share a common mailbox
 A link may be associated with many processes
 Each pair of processes may share several communication
links, each link corresponds to one mailbox



Message Passing – Indirect Communication
 Operations
 create a new mailbox (port)
 send and receive messages through mailbox
 destroy a mailbox
 Primitives are defined as:
 send(A, message) – send a message to mailbox A
 receive(A, message) – receive a message from mailbox A



Synchronization
 Message passing may be either blocking or non-blocking
 Blocking is considered synchronous
 Blocking send -- the sender is blocked until the message is received
by the receiving process or mailbox
 Blocking receive -- the receiver is blocked until a message is
available
 Non-blocking is considered asynchronous
 Non-blocking send -- the sender sends the message and continues
 Non-blocking receive -- the receiver receives:
 A valid message, or
 Null message



Buffering
messages exchanged by communicating processes reside in a temporary
queue
implemented in one of three ways
 Zero capacity – no messages are queued, link can’t have any
waiting messages, sender must block until receiver receives message
 Bounded capacity – queue is of finite length of n messages, sender
need not block if queue is not full, sender must wait if queue full
 Unbounded capacity – infinite length queue, sender never blocks



Message Queue
• asynchronous communication
• messages placed onto the queue are stored until the recipient retrieves them
• Step 1 − Create a message queue or connect to an already existing message
queue (msgget())
• Step 2 − Write into the message queue (msgsnd())
• Step 3 − Read from the message queue (msgrcv())
• Step 4 − Perform control operations on the message queue (msgctl())


Context Switch
 When CPU switches to another
process, system must save state of the
old process and load the state for the
new process via a context switch
 Context of a process represented in
the PCB (CPU registers contents,
process state, memory management
info.)
 Context-switch time is overhead;
system does no useful work while
switching
 Time is dependent on hardware
support
Process Scheduling
 Maximize CPU use, quickly switch processes onto CPU for time
sharing
 Process scheduler selects among available processes for next
execution on CPU
 Maintains scheduling queues of processes
 Job queue – set of all processes in the system
 Ready queue – set of all processes residing in main memory,
ready and waiting to execute, generally stored as a linked list
 Device queues – set of processes waiting for an I/O device
 Processes migrate among the various queues



Various Queues



Process Scheduling

Queueing Diagram



Schedulers
 Short-term scheduler (or CPU scheduler) – selects which process should be executed next
and allocates CPU
 Sometimes the only scheduler in a system
 Short-term scheduler is invoked frequently (milliseconds)  (must be fast)
 Long-term scheduler (or job scheduler) – selects which processes should be brought into
the ready queue
 Long-term scheduler is invoked infrequently (seconds, minutes)  (may be slow)
 The long-term scheduler controls the degree of multiprogramming (number of
processes in main memory)
 Processes can be described as either:
 I/O-bound process – spends more time doing I/O than computations, many short CPU
bursts
 CPU-bound process – spends more time doing computations; few very long CPU bursts
 Long-term scheduler strives for good process mix
Schedulers
 Medium-term scheduler can be added if degree of multiprogramming needs to decrease
 Intermediate level of scheduling
 Remove process from memory, store on disk, bring back in from disk to continue
execution from where it left off: swapping
 Required for improving process mix or for freeing of memory



OPERATING SYSTEMS (CS F372)
Threads
Motivation
 Most modern applications are multithreaded
 Threads run within application
 Multiple tasks with the application can be implemented by separate threads
 Update display
 Fetch data
 Spell checking
 Answer a network request
 Process creation is heavy-weight while thread creation is light-weight
 Can simplify code, increase efficiency

BITS Pilani, Hyderabad Campus


Motivation

Multithreaded Server Architecture

BITS Pilani, Hyderabad Campus


What is a Thread?
 Basic unit of CPU utilization
 Comprises a thread ID, program counter, registers and stack
 Shares with other threads belonging to the same program
 code section
 data section
 OS resources like open files

BITS Pilani, Hyderabad Campus


Motivation

BITS Pilani, Hyderabad Campus


Benefits
 Responsiveness – may allow continued execution if part of process is blocked,
especially important for user interfaces in interactive environments
 Resource Sharing – threads share resources of process, easier than shared
memory or message passing
 Economy – cheaper than process creation, thread switching has lower overhead
than context switching
 Scalability – process can take advantage of multiprocessor architectures

BITS Pilani, Hyderabad Campus


Multicore Programming
 Multicore or multiprocessor systems putting pressure on programmers,
programming challenges include:
 Identifying tasks
 Balance
 Data splitting
 Data dependency
 Testing and debugging
 Parallelism implies a system can perform more than one task simultaneously
 Concurrency supports more than one task making progress
 Single processor / core, scheduler providing concurrency

BITS Pilani, Hyderabad Campus


Multicore Programming
 Types of parallelism

 Data parallelism – distributes subsets of the same data across multiple cores,
same operation on each subset

 Task parallelism – distributing tasks/threads across cores, each thread


performing unique operation, threads may be operating on same or different
data

BITS Pilani, Hyderabad Campus


Concurrency vs Parallelism
 Concurrent execution on single-core system:

 Parallelism on a multi-core system:

BITS Pilani, Hyderabad Campus


User Threads and Kernel Threads
 User threads - management done by user-level threads library without kernel
support
 Three primary thread libraries:
 POSIX Pthreads
 Windows threads
 Java threads
 Kernel threads - Supported and managed by the Kernel
 Examples – virtually all general purpose operating systems support kernel
threads, including:
 Windows
 Solaris
 Linux
 Tru64 UNIX
 Mac OS X
BITS Pilani, Hyderabad Campus
Multithreading Models

 Many-to-One

 One-to-One

 Many-to-Many

BITS Pilani, Hyderabad Campus


Many-to-One Model

 Many user-level threads mapped to


single kernel thread
 Thread management done by thread
library in user space
 One thread blocking causes all to block
 Multiple threads may not run in parallel
on muticore system because only one
can access kernel at a time
 Few systems currently use this model
 Examples:
 Solaris Green Threads

BITS Pilani, Hyderabad Campus


One-to-One Model

 Each user-level thread maps to kernel


thread
 Creating a user-level thread creates a
kernel thread
 More concurrency than many-to-one
 Number of threads per process
sometimes restricted due to overhead
 Examples:
 Windows
 Linux
 Solaris 9 and later

BITS Pilani, Hyderabad Campus


Many-to-Many Model

 Allows many user level threads to be


mapped to many kernel threads
 Allows the operating system to create a
sufficient number of kernel threads
 Solaris prior to version 9
 Windows with the ThreadFiber package

BITS Pilani, Hyderabad Campus


Two-Level Model

 Similar to M:M, except that it allows a


user thread to be bound to kernel
thread
 Examples
 IRIX
 HP-UX
 Tru64 UNIX
 Solaris 8 and earlier

BITS Pilani, Hyderabad Campus


Thread Libraries
 Thread library provides programmer with API for creating and managing threads
 Two primary ways of implementing
 Library entirely in user space
 Kernel-level library supported by the OS

BITS Pilani, Hyderabad Campus


Pthreads
 May be provided either as user-level or kernel-level
 A POSIX standard (IEEE 1003.1c) API for thread creation and synchronization
 Specification, not implementation
 API specifies behavior of the thread library, implementation is up to development
of the library
 Common in UNIX operating systems (Solaris, Linux, Mac OS X)

BITS Pilani, Hyderabad Campus


Pthreads Example
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

int sum; /* this data is shared by the thread(s) */
void *runner(void *param); /* threads call this function */

int main(int argc, char *argv[])
{
    pthread_t tid;       /* the thread identifier */
    pthread_attr_t attr; /* set of thread attributes */

    if (argc != 2) {
        fprintf(stderr, "usage: a.out <integer value>\n");
        return -1;
    }
    if (atoi(argv[1]) < 0) {
        fprintf(stderr, "%d must be >= 0\n", atoi(argv[1]));
        return -1;
    }
    pthread_attr_init(&attr); /* get the default attributes */
    pthread_create(&tid, &attr, runner, argv[1]); /* create the thread */
    pthread_join(tid, NULL); /* wait for the thread to exit */
    printf("sum = %d\n", sum);
}

/* The thread will begin control in this function */
void *runner(void *param)
{
    int i, upper = atoi(param);
    sum = 0;
    for (i = 1; i <= upper; i++)
        sum += i;
    pthread_exit(0);
}

This is synchronous threading: the parent waits for its child thread to finish.
BITS Pilani, Hyderabad Campus
Pthreads Example

Joining 10 threads:

#define NUM_THREADS 10

/* an array of threads to be joined upon */
pthread_t workers[NUM_THREADS];

for (int i = 0; i < NUM_THREADS; i++)
    pthread_join(workers[i], NULL);

BITS Pilani, Hyderabad Campus


Threading Issues
 fork() and exec() system calls
 Signal handling
 Thread cancellation of target thread
 Thread-local storage
 Scheduler Activations

BITS Pilani, Hyderabad Campus


fork() and exec()
 If one thread in a program calls fork(), does the new process duplicate all threads, or is
the new process single-threaded?
 Some UNIX systems have chosen to have two versions of fork(), one that duplicates all
threads and another that duplicates only the thread that invoked the fork() system call
 The exec() system call works in the same way
 if a thread invokes the exec() system call, the program specified in the parameter
to exec() will replace the entire process—including all threads
 If exec() is called immediately after forking, then duplicating all threads is unnecessary,
as the program specified in the parameters to exec() will replace the process
 duplicating only the calling thread is appropriate
 If the separate process does not call exec() after forking, the separate process should
duplicate all threads

BITS Pilani, Hyderabad Campus


Signal Handling

 Signals are used in UNIX systems to notify a process that a particular event has
occurred
 The signal is delivered to a process
 When delivered, signal handler is used to process signals
 Synchronous and asynchronous signals
 Synchronous signals
 illegal memory access, div. by 0
 delivered to the same process that performed the operation generating the signal
 Asynchronous signals
 generated by an event external to a running process
 the running process receives the signal asynchronously
 Ctrl + C, timer expiration

BITS Pilani, Hyderabad Campus


Signal Handling

 Signal is handled by one of two signal handlers:


 default
 user-defined
 Every signal has default handler that kernel runs when handling signal
 User-defined signal handler can override default signal handler
 Some signals can be ignored, others are handled by terminating the process
 For single-threaded, signal is delivered to process

BITS Pilani, Hyderabad Campus


Signal Handling

 Where should a signal be delivered for multi-threaded process?


 Deliver the signal to the thread to which the signal applies
 Deliver the signal to every thread in the process
 Deliver the signal to certain threads in the process
 Assign a specific thread to receive all signals for the process
 Method for delivering a signal depends on the type of signal generated
 synchronous signals need to be delivered to the thread causing the signal and not
to other threads in the process
 some asynchronous signals—such as <Ctrl + C> should be sent to all threads
 Signals can be delivered to a
 specific process – specify process id and type of signal
 specific thread – specify thread id and type of signal

BITS Pilani, Hyderabad Campus


Signal Handling

 Most multithreaded versions of UNIX allow a thread to specify which signals it will
accept and which it will block
 In some cases, an asynchronous signal may be delivered only to those threads that
are not blocking it
 Signals need to be handled only once, a signal is delivered only to the first thread
found that is not blocking it

BITS Pilani, Hyderabad Campus


Thread Cancellation
 Terminating a thread before it has finished
 Thread to be canceled is target thread
 Two general approaches:
 Asynchronous cancellation terminates the target thread immediately
 Deferred cancellation allows the target thread to periodically check if
it should be cancelled
 What about freeing resources??
 Pthread code to create and cancel a thread:
pthread_t tid;
/* create the thread */
pthread_create(&tid, &attr, worker, NULL);
...
/* cancel the thread */
pthread_cancel(tid);
BITS Pilani, Hyderabad Campus
Thread Cancellation
 Invoking thread cancellation requests cancellation, but actual cancellation depends on
how the target thread is set up to handle the request

default type

 If thread has cancellation disabled, cancellation remains pending until thread enables it
 Default type is deferred
 Cancellation only occurs when thread reaches cancellation point
 Establish cancellation point by calling pthread_testcancel()
 If cancellation request is pending, cleanup handler is invoked to release any
acquired resources

BITS Pilani, Hyderabad Campus


Thread-Local Storage

 Thread-local storage (TLS) allows each thread to have its own copy
of data

 Different from local variables


 Local variables visible only during single function invocation
 TLS visible across function invocations

BITS Pilani, Hyderabad Campus


Scheduler Activations
 Both M:M and Two-level models require communication to
maintain appropriate number of kernel threads allocated to
the application
 Use an intermediate data structure between user and kernel
threads – lightweight process (LWP)
 Appears to be a virtual processor on which process can
schedule user thread to run
 Each LWP attached to kernel thread which is
scheduled on a physical processor
 How many LWPs to create?

BITS Pilani, Hyderabad Campus


Scheduler Activations

 Scheduler activations – scheme for communication between the user-thread library and
the kernel
 Kernel provides an application with a set of virtual processors (LWPs)
 Application can schedule user threads onto an available virtual processor
 Kernel must inform an application about certain events via an upcall
 Upcalls are handled by the thread library with an upcall handler
 Upcall handlers must run on a virtual processor
 Upcall triggering occurs when an application thread is about to block

BITS Pilani, Hyderabad Campus


Kernel makes upcall to the application informing that a thread is about to block and identifying the thread

Kernel allocates a new virtual processor to the application

Application runs an upcall handler on this new virtual processor, which saves the state of the blocking thread
and relinquishes the virtual processor on which the blocking thread is running

Upcall handler then schedules another thread that is eligible to run on an available virtual processor

When the event that the blocking thread was waiting for occurs, the kernel makes another upcall to the
thread library informing it that the previously blocked thread is now eligible to run

Upcall handler for this event requires a virtual processor, and kernel may allocate a new virtual processor or
preempt one of the user threads and run the upcall handler on its virtual processor

After marking the unblocked thread as eligible to run, the application schedules an eligible thread to run
on an available virtual processor
BITS Pilani, Hyderabad Campus
Scheduler Activations

At time T1, the kernel


allocates the application two
processors. On each
processor, the kernel
schedules a user-level thread
taken from the ready list and
starts execution.

Source: Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism, THOMAS E. ANDERSON, BRIAN N.
BERSHAD, EDWARD D. LAZOWSKA, and HENRY M. LEVY, 1992

BITS Pilani, Hyderabad Campus


Scheduler Activations

At time T2, one of the user-


level threads (thread 1) blocks in
the kernel. To notify the user
level of this event, the kernel
takes the processor that had
been running thread 1 and
performs an upcall. The user-
level thread scheduler can then
use the processor to take
another thread off the ready list
and start running it.

Source: Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism, THOMAS E. ANDERSON, BRIAN N.
BERSHAD, EDWARD D. LAZOWSKA, and HENRY M. LEVY, 1992

BITS Pilani, Hyderabad Campus


Scheduler Activations

At time T3, the I/O completes. Again, the kernel


must notify the user-level thread system of the
event, but this notification requires a processor.
The kernel preempts one of the processors
running and uses it to do the upcall. (If there are
no processors available when the I/O completes,
the upcall must wait until the kernel allocates
one). The upcall puts the thread that had been
blocked on the ready list and puts the thread that
was preempted on the ready list.

Source: Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism, THOMAS E. ANDERSON, BRIAN N.
BERSHAD, EDWARD D. LAZOWSKA, and HENRY M. LEVY, 1992

BITS Pilani, Hyderabad Campus


Scheduler Activations
Finally, at time T4,
the upcall takes a
thread off the ready
list and starts
running it.

Source: Scheduler Activations: Effective Kernel


Support for the User-Level Management of
Parallelism, THOMAS E. ANDERSON, BRIAN N.
BERSHAD, EDWARD D. LAZOWSKA, and HENRY
M. LEVY, 1992

BITS Pilani, Hyderabad Campus


Thank You

BITS Pilani, Hyderabad Campus


OPERATING SYSTEMS (CS F372)
CPU Scheduling
BITS Pilani
Hyderabad Campus
Basics

• Maximum CPU utilization obtained with multiprogramming


• CPU–I/O Burst Cycle – Process execution consists of a cycle of
CPU execution and I/O wait
• CPU burst followed by I/O burst
• More number of short CPU bursts and less number of long CPU
bursts

BITS Pilani, Hyderabad Campus


CPU Scheduler
Short-term / CPU scheduler selects from among the processes in ready queue, and
allocates the CPU to one of them
CPU scheduling decisions may take place when a process:
1. Switches from running to waiting state
2. Switches from running to ready state
3. Switches from waiting to ready
4. Terminates
Preemptive scheduling – done in situations 2 and 3
Nonpreemptive / Cooperative scheduling – once a process is allocated the CPU it retains
the CPU until termination or switching to waiting state

BITS Pilani, Hyderabad Campus


Dispatcher

Dispatcher module gives control of the CPU to the process


selected by the short-term scheduler; this involves:
switching context
switching to user mode
jumping to the proper location in the user program to restart that
program
Dispatch latency – time it takes for the dispatcher to stop one
process and start another running

BITS Pilani, Hyderabad Campus


Scheduling Criteria
CPU utilization – keep the CPU as busy as possible
Throughput – no. of processes that complete their execution per time unit
Turnaround time – amount of time to execute a particular process, interval
from submission time to completion time, sum of durations spent waiting to
get into memory, waiting in ready queue, executing on CPU, doing I/O
Waiting time – amount of time a process has been waiting in the ready
queue
Response time – amount of time it takes from when a request was
submitted until the first response is produced, not output (for time-sharing
environment), depends on the speed of output device

BITS Pilani, Hyderabad Campus


First- Come, First-Served (FCFS)
Scheduling
Process Burst Time
P1 24
P2 3
P3 3
Suppose that the processes arrive in the order: P1 , P2 , P3

Waiting time for P1 = 0; P2 = 24; P3 = 27


Average waiting time: (0 + 24 + 27)/3 = 17
BITS Pilani, Hyderabad Campus
First- Come, First-Served (FCFS)
Scheduling
Suppose that the processes arrive in the order:P2 , P3 , P1
The Gantt chart for the schedule is:

Waiting time for P1 = 6; P2 = 0; P3 = 3


Average waiting time: (6 + 0 + 3)/3 = 3
Much better than previous case
Convoy effect - short process behind long process
Consider one CPU-bound and many I/O-bound processes
Not applicable for time sharing systems
BITS Pilani, Hyderabad Campus
Shortest-Job-First (SJF) Scheduling

Associate with each process the length of its next CPU burst
 Use these lengths to schedule the process with the shortest time
Use FCFS in case of tie
SJF is optimal – gives minimum average waiting time for a given
set of processes
The difficulty is knowing the length of the next CPU request
For long-term (job) scheduling in a batch system, use the process time
limit that a user specifies when the job is submitted

BITS Pilani, Hyderabad Campus


Determining Length of Next
CPU Burst
Can be estimated from the lengths of previous CPU bursts, using exponential averaging:

τ(n+1) = α t(n) + (1 − α) τ(n),  0 ≤ α ≤ 1

where t(n) is the measured length of the nth CPU burst and τ(n+1) is the predicted value of the next burst.
BITS Pilani, Hyderabad Campus


Prediction of Length of Next CPU
Burst
 α = 0
 τ(n+1) = τ(n)
 Recent history does not count
 α = 1
 τ(n+1) = t(n)
 Only the actual last CPU burst counts
 If we expand the formula, we get:
 τ(n+1) = α t(n) + (1 − α) α t(n−1) + … + (1 − α)^j α t(n−j) + … + (1 − α)^(n+1) τ(0)
 Since both α and (1 − α) are less than or equal to 1, each successive term has less weight
than its predecessor
 Commonly, α set to ½
BITS Pilani, Hyderabad Campus


Prediction of Length of Next CPU
Burst
Can be nonpreemptive or preemptive
The next CPU burst of a newly arrived process may be shorter than
what is left of the currently executing process
Preemptive version called shortest-remaining-time-first

BITS Pilani, Hyderabad Campus


Shortest-Job-First (SJF) Scheduling
Process Burst Time
P1 6
P2 8
P3 7
P4 3

SJF schedule: P4, P1, P3, P2

Average waiting time = (3 + 16 + 9 + 0) / 4 = 7 ms

BITS Pilani, Hyderabad Campus


Shortest-remaining-time-first
Process  Arrival Time  Burst Time
P1 0 8
P2 1 4
P3 2 9
P4 3 5
Preemptive SJF Gantt Chart

Average waiting time = [(10-1)+(1-1)+(17-2)+(5-3)]/4 = 26/4 = 6.5 msec


BITS Pilani, Hyderabad Campus
Priority Scheduling
A priority number (integer) is associated with each process, generally
starting from 0
The CPU is allocated to the process with the highest priority (smallest
integer  highest priority)
Preemptive
Nonpreemptive
SJF is priority scheduling where priority is the inverse of predicted next
CPU burst time
Problem  Starvation – low priority processes may never execute
Solution  Aging – as time progresses increase the priority of the process

BITS Pilani, Hyderabad Campus


Nonpreemptive Priority Scheduling
Process  Burst Time  Priority
P1 10 3
P2 1 1
P3 2 4
P4 1 5
P5 5 2

Average waiting time = 8.2 msec

Problem: indefinite blocking and starvation.

Solutions: aging and pre-emptive priority scheduling

BITS Pilani, Hyderabad Campus


Preemptive Priority Scheduling

BITS Pilani, Hyderabad Campus


Round Robin (RR) Scheduling
Each process gets a small unit of CPU time (time quantum q),
usually 10-100 milliseconds. After this time has elapsed, the process
is preempted and added to the end of the ready queue.
If there are n processes in the ready queue and the time quantum
is q, then each process gets 1/n of the CPU time in chunks of at most
q time units at once. No process waits more than (n-1)q time units.
Timer interrupts every quantum to schedule next process
Performance
q large ⇒ behaves like FIFO
q small ⇒ q must still be large with respect to context-switch time, otherwise
overhead is too high

BITS Pilani, Hyderabad Campus


Round Robin (RR) Scheduling
Process Burst Time
P1 24
P2 3
P3 3
• The Gantt chart is:

Typically, higher average turnaround than SJF, but better response


• q should be large compared to context switch time
• q usually 10ms to 100ms, context switch < 10 usec

BITS Pilani, Hyderabad Campus


Time Quantum and Context Switch
Time

BITS Pilani, Hyderabad Campus


Turnaround Time Varies With The
Time Quantum

80% of CPU bursts


should be shorter than q

BITS Pilani, Hyderabad Campus


Multilevel Queue Scheduling

fixed priority
preemptive scheduling
among queues

BITS Pilani, Hyderabad Campus


Multilevel Feedback Queue
A process can move between the various queues; aging can be
implemented this way
Separate processes according to the characteristics of their CPU bursts
Multilevel-feedback-queue scheduler defined by the following
parameters:
number of queues
scheduling algorithms for each queue
method used to determine when to upgrade a process
method used to determine when to demote a process
method used to determine which queue a process will enter when that process
needs service

BITS Pilani, Hyderabad Campus


Multilevel Feedback Queue

Three queues:
Q0 – RR with time quantum 8 milliseconds
Q1 – RR time quantum 16 milliseconds
Q2 – FCFS
Scheduling
A new process enters queue Q0, which uses RR
When it gains CPU, job receives 8 milliseconds
If it does not finish in 8 milliseconds, job is moved to queue
Q1
At Q1 process is again served using RR and receives 16
additional milliseconds
If it still does not complete, it is preempted and moved to
queue Q2
BITS Pilani, Hyderabad Campus
CFS

Red-Black Tree
 Every node has a color either red or black.
 The root of the tree is always black.
 There are no two adjacent red nodes (A red
node cannot have a red parent or red child).
 Every path from a node (including root) to any of
its descendants NULL nodes has the same
number of black nodes.
 All leaf nodes are black nodes.
 Self balancing: sub-trees are rotated when the
properties are violated
 Search, insert, delete: O(log n)

BITS Pilani, Hyderabad Campus


CFS

BITS Pilani, Hyderabad Campus


Thank You

BITS Pilani, Hyderabad Campus


OPERATING SYSTEMS (CS F372)
Synchronization
BITS Pilani Slides Courtesy: Dr. Barsha Mitra
Deptartment of CSIS, BITS Pilani, Hyderabad Campus
Hyderabad Campus
Background

Processes can execute concurrently


May be interrupted at any time, partially completing execution
Concurrent access to shared data may result in data inconsistency
Maintaining data consistency requires mechanisms to ensure the
orderly execution of cooperating processes
Consider the Producer-Consumer problem

BITS Pilani, Hyderabad Campus


Producer - Consumer

/* producer */
while (true) {
    /* produce an item in next_produced */
    while (counter == BUFFER_SIZE)
        ; /* do nothing */
    buffer[in] = next_produced;
    in = (in + 1) % BUFFER_SIZE;
    counter++;
}

/* consumer */
while (true) {
    while (counter == 0)
        ; /* do nothing */
    next_consumed = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    counter--;
    /* consume the item in next_consumed */
}

BITS Pilani, Hyderabad Campus


Race Condition
counter++ could be implemented as
    register1 = counter
    register1 = register1 + 1
    counter = register1

counter-- could be implemented as
    register2 = counter
    register2 = register2 - 1
    counter = register2

Consider this execution interleaving with counter = 5 initially:
S0: producer executes register1 = counter         {register1 = 5}
S1: producer executes register1 = register1 + 1   {register1 = 6}
S2: consumer executes register2 = counter         {register2 = 5}
S3: consumer executes register2 = register2 - 1   {register2 = 4}
S4: producer executes counter = register1         {counter = 6}
S5: consumer executes counter = register2         {counter = 4}
BITS Pilani, Hyderabad Campus
Critical Section Problem

Consider system of n processes {p0 , p1, … pn-1}


Each process has critical section segment of code
Process may be changing common variables, updating table,
writing file, etc.
When one process in critical section, no other may be in its
critical section
Critical section problem is to design protocol to solve
this
Each process must ask permission to enter critical
section in entry section, may follow critical section with
exit section, then remainder section

BITS Pilani, Hyderabad Campus


Requirements
1. Mutual Exclusion - If process Pi is executing in its critical section, then no
other processes can be executing in their critical sections
2. Progress - If no process is executing in its critical section and some
processes wish to enter their critical sections, then only those processes that
are not executing in their remainder section can participate in deciding which
will enter its critical section next, and this decision cannot be postponed
indefinitely
3. Bounded Waiting - A bound must exist on the number of times that other
processes are allowed to enter their critical sections after a process has made
a request to enter its critical section and before that request is granted

BITS Pilani, Hyderabad Campus


Critical-Section Handling in OS
 Two approaches depending on if kernel is preemptive or
non-preemptive
Preemptive – allows preemption of process when running in
kernel mode
Non-preemptive – runs until exits kernel mode, blocks, or
voluntarily yields CPU
Essentially free of race conditions in kernel mode
Why, then, would anyone prefer a preemptive kernel?

BITS Pilani, Hyderabad Campus


Peterson’s Solution
Two process software based solution
Assume that the load and store machine-language instructions are
atomic; that is, cannot be interrupted
The two processes share two variables:
int turn;
boolean flag[2];
The variable turn indicates whose turn it is to enter the critical
section
The flag array is used to indicate if a process is ready to enter the
critical section. flag[i] = true implies that process Pi is ready

BITS Pilani, Hyderabad Campus


Algorithm for Process Pi

do {
    flag[i] = true;
    turn = j;
    while (flag[j] && turn == j)
        ;
    /* critical section */
    flag[i] = false;
    /* remainder section */
} while (true);

Provable that the 3 CS requirements are met:
1. Mutual exclusion is preserved
2. Progress requirement is satisfied
3. Bounded-waiting requirement is met

BITS Pilani, Hyderabad Campus


Check 3 Requirements

Pi:                                 Pj:
do {                                do {
    flag[i] = true;                     flag[j] = true;
    turn = j;                           turn = i;
    while (flag[j] && turn == j)        while (flag[i] && turn == i)
        ;                                   ;
    /* critical section */              /* critical section */
    flag[i] = false;                    flag[j] = false;
    /* remainder section */             /* remainder section */
} while (true);                     } while (true);
BITS Pilani, Hyderabad Campus
Synchronization Hardware
Many systems provide hardware support for implementing the critical
section code
All solutions below based on idea of locking
Protecting critical regions via locks
Uniprocessors – could disable interrupts
Currently running code would execute without preemption
Generally too inefficient on multiprocessor systems
Modern machines provide special atomic hardware instructions
Atomic = non-interruptible
Either test memory word and set value
Or swap contents of two memory words

BITS Pilani, Hyderabad Campus


Solution to Critical-section Problem
Using Locks

do {

acquire lock

critical section

release lock

remainder section

} while (TRUE);

BITS Pilani, Hyderabad Campus


test_and_set Instruction
boolean test_and_set (boolean *target)
{
boolean rv = *target;
*target = true;
return rv;
}
1.Executed atomically
2.Returns the original value of passed parameter
3.Sets the new value of passed parameter to true

BITS Pilani, Hyderabad Campus


Solution using test_and_set
Instruction
Shared Boolean variable lock, initialized to false, supports mutual
exclusion and progress but not bounded waiting
do {
while (test_and_set(&lock))
; /* do nothing */
/* critical section */
lock = false;
/* remainder section */
} while (true);

BITS Pilani, Hyderabad Campus


compare_and_swap Instruction
int compare_and_swap(int *value, int expected, int new_value)
{
int temp = *value;
if (*value == expected)
*value = new_value;
return temp;
}
1.Executed atomically
2.Returns the original value of passed parameter value
3.Sets the variable value to the value of the passed parameter new_value, but
only if value == expected. That is, the swap takes place only under this
condition.
BITS Pilani, Hyderabad Campus
Solution using compare_and_swap
Shared (global) integer lock initialized to 0;
Solution:
do {
while (compare_and_swap(&lock, 0, 1) != 0)
; /* do nothing */
/* critical section */
lock = 0;
/* remainder section */
} while (true);

BITS Pilani, Hyderabad Campus


Bounded-waiting Mutual Exclusion
with test_and_set
common data structures – boolean waiting[n], boolean lock, initialized to false

do {
    waiting[i] = true;
    key = true;
    while (waiting[i] && key)
        key = test_and_set(&lock);
    waiting[i] = false;

    /* critical section */

    j = (i + 1) % n;
    while ((j != i) && !waiting[j])
        j = (j + 1) % n;
    if (j == i)
        lock = false;
    else
        waiting[j] = false;

    /* remainder section */
} while (true);
BITS Pilani, Hyderabad Campus


Mutex Locks
Previous solutions are complicated and generally inaccessible to
application programmers
OS designers build software tools to solve critical section problem
Simplest is mutex lock
Protect a critical section by first acquire() a lock then release() the lock
 mutex lock has a boolean variable indicating if lock is available or not
Calls to acquire() and release() must be atomic
 Usually implemented via hardware atomic instructions
But this solution requires busy waiting
 This lock therefore called a spinlock

BITS Pilani, Hyderabad Campus


acquire() and release()

acquire() {
    while (!available)
        ; /* busy wait */
    available = false;
}

release() {
    available = true;
}

do {
    acquire lock
        critical section
    release lock
        remainder section
} while (true);

Advantages of spinlocks:
1. In a multiprocessor system no context switch is required when a process must wait on a lock
2. Useful when locks are to be held for short times
3. On a multiprocessor system, one thread can spin on one processor while another executes its CS on another processor
Disadvantage of spinlocks: busy waiting wastes CPU cycles

BITS Pilani, Hyderabad Campus


Semaphore
Provides more sophisticated ways (than mutex locks) for process to synchronize
Semaphore S – integer variable
Apart from initialization, can only be accessed via two indivisible (atomic) operations,
wait() and signal(), Originally called P() and V()
Definition of the wait() operation
wait(S) {
while (S <= 0)
; // busy wait
S--;
}
Definition of the signal() operation
signal(S) {
S++;
}
BITS Pilani, Hyderabad Campus
Semaphore Usage
Counting semaphore – integer value can range over an unrestricted domain
Binary semaphore – integer value can range only between 0 and 1
Consider P1 and P2 that require S1 to happen before S2
 Create a semaphore “synch” initialized to 0
P1:
S1;
signal(synch);
P2:
wait(synch);
S2;
Must guarantee that no two processes can execute the wait() and signal() on the
same semaphore at the same time
BITS Pilani, Hyderabad Campus
Semaphore Implementation

With each semaphore there is an associated waiting queue


Each entry in a waiting queue has two data items:
 value (of type integer)
 pointer to next record in the list
Two operations:
block – place the process invoking the operation on the appropriate waiting queue
wakeup – remove one of processes in the waiting queue and place it in the ready
queue
typedef struct{
int value;
struct process *list;
} semaphore;
BITS Pilani, Hyderabad Campus
Semaphore Implementation

wait(semaphore *S) {
    S->value--;
    if (S->value < 0) {
        add this process to S->list;
        block();
    }
}

signal(semaphore *S) {
    S->value++;
    if (S->value <= 0) {
        remove a process P from S->list;
        wakeup(P);
    }
}

BITS Pilani, Hyderabad Campus


Deadlock and Starvation
Deadlock – two or more processes are waiting indefinitely for an event that can be
caused by only one of the waiting processes
Let S and Q be two semaphores initialized to 1
P0 P1
wait(S); wait(Q);
wait(Q); wait(S);
... ...
signal(S); signal(Q);
signal(Q); signal(S);
Starvation – indefinite blocking: a process may never be removed from the semaphore
queue in which it is suspended
Priority Inversion – scheduling problem in which a lower-priority process holds a lock needed
by a higher-priority process; solved via the priority-inheritance protocol
BITS Pilani, Hyderabad Campus
Classical Problems of
Synchronization
Classical problems used to test newly-proposed synchronization schemes
Bounded-Buffer Problem
Readers and Writers Problem
Dining-Philosophers Problem

BITS Pilani, Hyderabad Campus


Bounded-Buffer Problem
n buffers, each holding one item
Semaphore mutex initialized to the value 1
Semaphore full initialized to the value 0
Semaphore empty initialized to the value n

BITS Pilani, Hyderabad Campus


Bounded-Buffer Problem
Structure of the producer process:

do {
    ...
    /* produce item in next_produced */
    ...
    wait(empty);
    wait(mutex);
    ...
    /* add next_produced to the buffer */
    ...
    signal(mutex);
    signal(full);
} while (true);

Structure of the consumer process:

do {
    wait(full);
    wait(mutex);
    ...
    /* remove item from buffer to next_consumed */
    ...
    signal(mutex);
    signal(empty);
    ...
    /* consume the item in next_consumed */
    ...
} while (true);
BITS Pilani, Hyderabad Campus
Readers-Writers Problem
A data set is shared among a number of concurrent processes
Readers – only read the data set; they do not perform any updates
Writers – can both read and write
Problem – allow multiple readers to read at the same time
Only one single writer can access the shared data at the same time
Several variations of how readers and writers are considered – all involve
some form of priorities
First readers-writers problem: no reader is kept waiting unless a writer has
already obtained permission to use the shared object; writers may starve
Second readers-writers problem: if a writer is waiting to access the shared object,
no new readers may start reading; readers may starve
BITS Pilani, Hyderabad Campus
Readers-Writers Problem
Shared Data
Semaphore rw_mutex initialized to 1, common to both readers and writers, acts
as a mutual exclusion semaphore for writers, used by first reader that enters CS or last
reader that exits CS, not used by other readers

Integer read_count initialized to 0, keeps track of how many processes are reading
the shared obj.

Semaphore mutex initialized to 1, ensures mutual exclusion when read_count


is updated

BITS Pilani, Hyderabad Campus


Readers-Writers Problem
Structure of a writer process:

do {
    wait(rw_mutex);
    ...
    /* writing is performed */
    ...
    signal(rw_mutex);
} while (true);

Structure of a reader process:

do {
    wait(mutex);
    read_count++;
    if (read_count == 1) wait(rw_mutex);
    signal(mutex);
    ...
    /* reading is performed */
    ...
    wait(mutex);
    read_count--;
    if (read_count == 0) signal(rw_mutex);
    signal(mutex);
} while (true);

BITS Pilani, Hyderabad Campus


Dining-Philosophers Problem

Philosophers spend their lives alternating


thinking and eating
Don’t interact with their neighbors,
occasionally try to pick up 2 chopsticks (one at a
time) to eat from bowl
Can pickup only one chopstick at a time
Need both to eat, then release both when
done
In the case of 5 philosophers
Shared data
Bowl of rice (data set)
Semaphore chopstick [5] initialized to 1

BITS Pilani, Hyderabad Campus


Dining-Philosophers Problem
Algorithm
Structure of Philosopher i:
do {
wait (chopstick[i] );
wait (chopstick[ (i + 1) % 5] );

// eat

signal (chopstick[i] );
signal (chopstick[ (i + 1) % 5] );

// think

} while (TRUE);

What is the problem with this


algorithm?
BITS Pilani, Hyderabad Campus
Dining-Philosophers Problem
Algorithm
 Deadlock handling
 Allow at most 4 philosophers to be sitting simultaneously at the
table.

 Allow a philosopher to pick up the chopsticks only if both are


available (picking must be done in a critical section).

 Use an asymmetric solution -- an odd-numbered philosopher


picks up first the left chopstick and then the right chopstick.
Even-numbered philosopher picks up first the right chopstick
and then the left chopstick.

BITS Pilani, Hyderabad Campus


Problems with Semaphores
 Incorrect use of semaphore operations:

 signal (mutex) …. wait (mutex)

 wait (mutex) … wait (mutex)

 Omitting of wait (mutex) or signal (mutex) (or both)

 Deadlock and starvation are possible.

BITS Pilani, Hyderabad Campus


OPERATING SYSTEMS (CS F372)
Deadlocks
BITS Pilani
Hyderabad Campus
System Model

System consists of resources


Resource types R1, R2, . . ., Rm
Each resource type Ri has Wi instances.
Each process utilizes a resource as follows:
request
use
release

BITS Pilani, Hyderabad Campus


Necessary Conditions

Deadlock can arise if four conditions hold simultaneously


Mutual exclusion: only one process at a time can use a resource
Hold and wait: a process holding at least one resource is waiting
to acquire additional resources held by other processes
No preemption: a resource can be released only voluntarily by the
process holding it, after that process has completed its task
Circular wait: there exists a set {P0, P1, …, Pn} of waiting processes
such that P0 is waiting for a resource that is held by P1, P1 is waiting
for a resource that is held by P2, …, Pn–1 is waiting for a resource
that is held by Pn, and Pn is waiting for a resource that is held by P0.

BITS Pilani, Hyderabad Campus


Resource-Allocation Graph

A set of vertices V and a set of edges E


V is partitioned into two types:
P = {P1, P2, …, Pn}, the set consisting of all the
processes in the system

R = {R1, R2, …, Rm}, the set consisting of all


resource types in the system

request edge – directed edge Pi → Rj

assignment edge – directed edge Rj → Pi


BITS Pilani, Hyderabad Campus
Examples

BITS Pilani, Hyderabad Campus


Inferring Deadlock

If graph contains no cycles: no deadlock


If graph contains a cycle:
if only one instance per resource type, then deadlock
if several instances per resource type, possibility of deadlock

BITS Pilani, Hyderabad Campus


Deadlock Handling

Ensure that the system will never enter a deadlock state:


Deadlock prevention
Deadlock avoidance
Allow the system to enter a deadlock state and then recover
Ignore the problem and pretend that deadlocks never occur in the
system; used by most operating systems, including UNIX; application
programmers must then ensure that deadlocks don’t occur

BITS Pilani, Hyderabad Campus


Deadlock Prevention

Mutual Exclusion – not required for sharable resources (e.g., read-


only files); must hold for non-sharable resources

Hold and Wait – must guarantee that whenever a process


requests a resource, it does not hold any other resources
Require process to request and be allocated all its resources before it
begins execution, or allow process to request resources only when the
process has none allocated to it
Low resource utilization; starvation possible

BITS Pilani, Hyderabad Campus


Deadlock Prevention

No Preemption –
If a process that is holding some resources requests another resource that cannot
be immediately allocated to it, then all resources currently being held are released
Preempted resources are added to the list of resources for which the process is
waiting
Process will be restarted only when it can regain its old resources, as well as the
new ones that it is requesting
When a process P1 requests some resources and they are allocated to some other
waiting process P2, then preempt the desired resources from P2 and give them to P1
If the resources are not allocated to a waiting process, then P1 must wait
While waiting P1’s resources may be preempted

BITS Pilani, Hyderabad Campus


Deadlock Prevention

Circular Wait – impose a total ordering of all resource types, and require
that each process requests resources in an increasing order of enumeration
define a one-to-one function F : R → N (N is the set of natural numbers)
Say a process Pi has requested some instances of Ri
Later, Pi can request instances of type Rj iff F(Rj) > F(Ri)
Alternatively, when Pi requests an instance of Rj, it must have released all
instances of any Ri s.t. F(Ri) >= F(Rj)
Several instances of the same resource type must be requested in a single
request
Proof by contradiction
BITS Pilani, Hyderabad Campus
Deadlock Avoidance

• Requires that the system has some additional a priori information


available
• Requires that each process declare the maximum number of
resources of each type that it may need
• Resource-allocation state is defined by the number of available and
allocated resources, and the maximum demands of the processes
• Deadlock-avoidance algorithm dynamically examines the resource-
allocation state to ensure that there can never be a circular-wait
condition

BITS Pilani, Hyderabad Campus


Safe State
When a process requests an available resource, system must
decide if immediate allocation leaves the system in a safe state
System is in safe state if there exists a sequence <P1, P2, …, Pn> of
ALL the processes in the systems such that for each Pi, the
resources that Pi can still request can be satisfied by currently
available resources + resources held by all the Pj, with j < i
That is:
If resources needed by Pi are not immediately available, then Pi can wait
until all Pj have finished
When Pj is finished, Pi can obtain needed resources, execute, return
allocated resources, and terminate
When Pi terminates, Pi +1 can obtain its needed resources, and so on
BITS Pilani, Hyderabad Campus
Inferences

If a system is in safe state: no


deadlocks

If a system is in unsafe state:


possibility of deadlock

Avoidance: ensure that a


system will never enter an
unsafe state

BITS Pilani, Hyderabad Campus


Avoidance Algorithms

Single instance of a resource type


Use a resource-allocation graph

Multiple instances of a resource type


 Use the banker’s algorithm

BITS Pilani, Hyderabad Campus


Resource-Allocation-Graph
Algorithm
Claim edge Pi → Rj indicates that process Pi may request
resource Rj in the future; represented by a dashed line
Claim edge converts to request edge when a process
requests a resource
Request edge converted to an assignment edge when the
resource is allocated to the process
When a resource is released by a process, assignment edge
reconverts to a claim edge
Resources must be claimed a priori in the system
Suppose that process Pi requests a resource Rj
The request can be granted only if converting the request
edge to an assignment edge does not result in the formation
of a cycle in the resource allocation graph
BITS Pilani, Hyderabad Campus
Banker’s Algorithm

Multiple instances of a resource type

Each process must a priori declare maximum use

When a process requests a resource it may have to wait

When a process gets all its resources it must return them in a finite
amount of time

BITS Pilani, Hyderabad Campus


Data Structures for Banker’s
Algorithm
n = number of processes, and m = number of resources types
Available: Vector of length m. If Available [j] = k, there are k instances of resource type Rj
are available
Max: n x m matrix. If Max [i][j] = k, then process Pi may request at most k instances of
resource type Rj
Allocation: n x m matrix. If Allocation[i][j] = k then Pi is currently allocated k instances of
Rj
Need: n x m matrix. If Need[i][j] = k, then Pi may need k more instances of Rj to complete
its task Need [i][j] = Max[i][j] – Allocation [i][j]
Each row of Max, Allocation and Need can be treated as vectors
If X and Y are 2 vectors, then X <= Y iff, X[i] <= Y[i] for all i = 1, 2, ...., m
X < Y, if X<=Y and X ≠ Y
BITS Pilani, Hyderabad Campus
Safety Algorithm
1. Let Work and Finish be vectors of length m and n, respectively.
Initialize: Work = Available, Finish [i] = false for i = 0, 1, …, n – 1
2. Find an i such that both:
(a) Finish [i] == false
(b) Needi <= Work
If no such i exists, go to step 4
3. Work = Work + Allocationi
Finish[i] = true
go to step 2
4. If Finish[i] == true for all i, then the system is in a safe state
The algorithm may require O(m x n^2) operations to decide whether a state is safe

BITS Pilani, Hyderabad Campus


Banker’s Algorithm Example
5 processes P0 through P4;
3 resource types: A (10 instances), B (5 instances), and C (7 instances)
Snapshot at time T0:

Process Allocation Max Available Process Need


A B C A B C A B C A B C
P0 0 1 0 7 5 3 3 3 2 P0 7 4 3
P1 2 0 0 3 2 2 P1 1 2 2

P2 3 0 2 9 0 2 P2 6 0 0

P3 2 1 1 2 2 2 P3 0 1 1
P4 0 0 2 4 3 3 P4 4 3 1

The system is in a safe state since the sequence < P1, P3, P4, P2, P0> satisfies safety
criteria BITS Pilani, Hyderabad Campus
Resource-Request Algorithm
Requesti = request vector for process Pi. If Requesti [j] = k then process Pi wants k instances
of resource type Rj
1. If Requesti <= Needi go to step 2. Otherwise, raise error condition, since process
has exceeded its maximum claim
2. If Requesti <= Available, go to step 3. Otherwise Pi must wait, since resources are
not available
3. Pretend to allocate requested resources to Pi by modifying the state as follows:
Available = Available – Requesti ;
Allocationi = Allocationi + Requesti;
Needi = Needi – Requesti ;
 If safe: the resources are allocated to P
i
 If unsafe: P must wait, and the old resource-allocation state is restored
i

BITS Pilani, Hyderabad Campus


Banker’s Algorithm Example:
P1 Requests (1,0,2)
Check that Request <= Available (that is, (1,0,2) <= (3,3,2)): true
Process Allocation Max Available Process Need
A B C A B C A B C A B C
P0 0 1 0 7 5 3 2 3 0 P0 7 4 3
P1 3 0 2 3 2 2 P1 0 2 0

P2 3 0 2 9 0 2 P2 6 0 0

P3 2 1 1 2 2 2 P3 0 1 1

P4 0 0 2 4 3 3 P4 4 3 1

Executing safety algorithm shows that sequence < P1, P3, P4, P0, P2> satisfies safety
requirement
Can request for (3,3,0) by P4 be granted?
Can request for (0,2,0) by P0 be granted?
BITS Pilani, Hyderabad Campus
Banker’s Algorithm Example:
P1 Requests (1,0,2)
Check that Request <= Available (that is, (1,0,2) <= (3,3,2)): true
Process Allocation Max Available Process Need
A B C A B C A B C A B C
P0 0 1 0 7 5 3 2 3 0 P0 7 4 3
P1 3 0 2 3 2 2 P1 0 2 0

P2 3 0 2 9 0 2 P2 6 0 0
P3 2 1 1 2 2 2 P3 0 1 1

P4 0 0 2 4 3 3 P4 4 3 1

Executing safety algorithm shows that sequence < P1, P3, P4, P0, P2> satisfies safety
requirement
Can request for (3,3,0) by P4 be granted? NO: resources unavailable
Can request for (0,2,0) by P0 be granted? NO: results in unsafe state
BITS Pilani, Hyderabad Campus
Deadlock Detection
Allow system to enter deadlock state

Detection algorithm

Recovery scheme

BITS Pilani, Hyderabad Campus


Single Instance of Each Resource
Type
Maintain wait-for graph
Nodes are processes
Pi → Pj if Pi is waiting for Pj

Periodically invoke an algorithm that


searches for a cycle in the graph. If
there is a cycle, there exists a deadlock

An algorithm to detect a cycle in a


graph requires O(n^2) operations, where
n is the number of vertices in the graph

BITS Pilani, Hyderabad Campus


Multiple Instances of Resource Types

Available: A vector of length m indicates the number of available resources of each type.
Allocation: An n × m matrix defines the number of resources of each type currently
allocated to each thread.
Request: An n × m matrix indicates the current request of each thread. If Request[i][j] equals
k, then thread Ti is requesting k more instances of resource type Rj.

BITS Pilani, Hyderabad Campus


Multiple Instances of Resource Types

Algorithm:
1) Let Work and Finish be vectors of length m and n, respectively. Initialize Work = Available. For i = 0, 1, ..., n–1,
if Allocationi ≠ 0, then Finish[i] = false; otherwise, Finish[i] = true.
2) Find an index i such that both
a) Finish[i] == false
b) Requesti <= Work
If no such i exists, go to step 4.
3) Work = Work + Allocationi
Finish[i] = true
Go to step 2.
4) If Finish[i] == false for some i, 0 <= i < n, then the system is in a deadlocked state. Moreover, if Finish[i] == false,
then thread Ti is deadlocked.
BITS Pilani, Hyderabad Campus
Recovery from Deadlock:
Process Termination
Abort all deadlocked processes

Abort one process at a time until the deadlock cycle is eliminated

In which order should we choose to abort?


Priority of the process
How long process has computed, and how much longer to completion
Resources the process has used
Resources process needs to complete
How many processes will need to be terminated
Is process interactive or batch?

BITS Pilani, Hyderabad Campus


Recovery from Deadlock:
Resource Preemption
Preempt some resources from processes and give these resources to other
processes until the deadlock cycle is broken

Selecting a victim – which resources and which processes, minimize cost by


deciding order of preemption

Rollback – return to some safe state, restart process from that state; total
rollback or partial rollback

Starvation – same process may always be picked as victim, include number


of rollback in cost factor
BITS Pilani, Hyderabad Campus
Thank You

BITS Pilani, Hyderabad Campus


OPERATING SYSTEMS (CS F372)
Main Memory Management
BITS Pilani
Hyderabad Campus
Background

Program must be brought (from disk) into memory and placed


within a process for it to be run
Main memory and registers are only storage CPU can access
directly
Memory unit only sees a stream of addresses + read requests, or
address + data and write requests
Register access in one CPU clock (or less)
Main memory can take many cycles, causing a stall
Cache sits between main memory and CPU registers
Protection of memory required to ensure correct operation

BITS Pilani, Hyderabad Campus


Base and Limit Registers

A pair of base and limit registers


define the logical address space
CPU must check every memory
access generated in user mode to be
sure it is between base and limit for
that user

BITS Pilani, Hyderabad Campus


Hardware Address Protection

BITS Pilani, Hyderabad Campus


Address Binding
Programs on disk, ready to be brought into memory to execute form an input
queue
Further, addresses represented in different ways at different stages of a
program’s life
 Source code addresses usually symbolic
 Compiled code addresses bind to relocatable addresses
 like, “14 bytes from beginning of this module”
 Linker or loader binds relocatable addresses to absolute addresses
 like, 74014
 Each binding maps one address space to another

BITS Pilani, Hyderabad Campus


Binding of Instructions and Data to
Memory
Address binding of instructions and data to memory addresses can happen at
three different stages

Compile time: If memory location known a priori, absolute code can be


generated; must recompile code if starting location changes
Load time: Compiler must generate relocatable code if memory location
is not known at compile time; final binding is delayed till load time
Execution time: Binding delayed until run time if the process can be
moved during its execution from one memory segment to another
Need hardware support for address maps (e.g., base and limit
registers)

BITS Pilani, Hyderabad Campus


Logical vs. Physical Address Space

Logical address – generated by the CPU; also referred to as virtual


address
Physical address – address seen by the memory unit
Logical and physical addresses are the same in compile-time and
load-time address-binding schemes; logical (virtual) and physical
addresses differ in execution-time address-binding scheme
Logical address space is the set of all logical addresses generated
by a program
Physical address space is the set of all physical addresses
corresponding to the logical addresses generated by a program

BITS Pilani, Hyderabad Campus


Memory-Management Unit (MMU)
Hardware device that at run time maps
virtual address to physical address
Many methods possible, covered in the
rest of this chapter
Value in the relocation register is added
to every address generated by a user
process
Base register now called relocation
register
The user program deals with logical
addresses; it never sees the real physical
addresses
BITS Pilani, Hyderabad Campus
Dynamic Loading
Routine is not loaded until it is called
Better memory-space utilization; unused routine is never loaded
All routines kept on disk
Useful when large amounts of code are needed to handle
infrequently occurring cases

BITS Pilani, Hyderabad Campus


Swapping
A process can be swapped temporarily out of memory to a backing store, and
then brought back into memory for continued execution
Total physical memory space of processes can exceed physical memory
Backing store – fast disk large enough to accommodate copies of all memory
images for all users; must provide direct access to these memory images
Major part of swap time is transfer time; total transfer time is directly
proportional to the amount of memory swapped
System maintains a ready queue of ready-to-run processes which have
memory images on disk

BITS Pilani, Hyderabad Campus


Swapping

If next processes to be put on CPU is


not in memory, need to swap out a
process and swap in target process
Context switch time can then be very
high
100MB process swapping to hard
disk with transfer rate of 50MB/sec
Swap out time of 2000 ms
Plus swap in of same sized process
Total context switch swapping
component time of 4000ms (4 seconds)

BITS Pilani, Hyderabad Campus


Swapping

Other constraints on swapping


Pending I/O – can’t swap out as I/O would occur to wrong
process
Standard swapping not used in modern operating systems
But modified version common
Swap only when free memory extremely low
Swapping portions of processes

BITS Pilani, Hyderabad Campus


Contiguous Memory Allocation

Relocation registers used to protect user processes from each


other, and from changing operating-system code and data
Base register contains value of smallest physical address
Limit register contains range of logical addresses – each logical
address must be less than the limit register
MMU maps logical address dynamically

BITS Pilani, Hyderabad Campus


Multiple-Partition Allocation
• Fixed-size partitions: degree of multiprogramming limited by the number of partitions
• Variable-partition sizes for efficiency (sized to a given process’ needs)
• Hole – block of available memory; holes of various size are scattered throughout memory
• When a process arrives, it is allocated memory from a hole large enough to accommodate it
• Process exiting frees its partition, adjacent free partitions combined
• Operating system maintains information about:
a) allocated partitions b) free partitions (hole)

BITS Pilani, Hyderabad Campus


Dynamic Storage-Allocation Problem

How to satisfy a request of size n from a list of free holes?


First-fit: Allocate the first hole that is big enough

Best-fit: Allocate the smallest hole that is big enough; must


search entire list, unless ordered by size
Produces the smallest leftover hole

Worst-fit: Allocate the largest hole; must also search entire list
Produces the largest leftover hole
First-fit and best-fit better than worst-fit in terms of speed and storage
utilization
BITS Pilani, Hyderabad Campus
Fragmentation

External Fragmentation – total memory space exists to satisfy a


request, but it is not contiguous
Reduce external fragmentation by compaction
Shuffle memory contents to place all free memory together in
one large block
Compaction is possible only if relocation is dynamic, and is done
at execution time
Allow logical address space of processes to be non-contiguous
Internal Fragmentation – allocated memory may be slightly larger
than requested memory; this size difference is memory internal to a
partition, but not being used
BITS Pilani, Hyderabad Campus
Segmentation
A program is a collection of variable sized
segments
A segment is a logical unit such as:
main program
local variables, global variables
procedure
stack
function
symbol table
method
arrays
object
Memory-management scheme that supports user
view of memory
Logical address space is a collection of segments
BITS Pilani, Hyderabad Campus
Segmentation Architecture
Logical address consists of a two-tuple: <segment-number, offset>
Segment table – maps two-dimensional logical addresses to one-
dimensional physical addresses; each table entry has:
segment base – contains the starting physical address where the segment resides in
memory
segment limit – specifies the length of the segment
Segment-table base register (STBR) points to the segment table’s location
in memory
Segment-table length register (STLR) indicates number of segments used
by a program

BITS Pilani, Hyderabad Campus


Segmentation Hardware

BITS Pilani, Hyderabad Campus


Paging
Physical address space of a process can be noncontiguous; process is allocated physical
memory whenever the latter is available
Avoids external fragmentation
Avoids need for compaction
Divide physical memory into fixed-sized blocks called frames
Size is power of 2
Divide logical memory into blocks of same size called pages
Keep track of all free frames
To run a program of size N pages, need to find N free frames and load program
Set up a page table to translate logical to physical addresses
Backing store likewise split into pages
Still have Internal fragmentation
BITS Pilani, Hyderabad Campus
Address Translation Scheme
Address generated by CPU is divided into:
Page number (p) – used as an index into a page table which contains base address of
each page in physical memory
Page offset (d) – combined with base address to define the physical memory address
that is sent to the memory unit
page number (p): m - n bits | page offset (d): n bits

logical address space = 2^m bytes, page size = 2^n bytes

BITS Pilani, Hyderabad Campus


Paging Hardware

page table contains the base address of each page in physical memory

BITS Pilani, Hyderabad Campus


Paging Model of Logical and
Physical Memory

BITS Pilani, Hyderabad Campus


Paging Example

n = 2 and m = 4
32-byte memory and 4-byte pages

BITS Pilani, Hyderabad Campus


Paging
• Calculating internal fragmentation
• Page size = 8 bytes
• Process size = 100 bytes
• 12 full pages + 4 bytes, so 13 frames are needed
• Internal fragmentation of 4 bytes
• Worst case fragmentation = 1 frame – 1 byte
• So small frame sizes desirable?
• But each page table entry takes memory to track
• Process view and physical memory now very different
• By implementation process can only access its own memory

BITS Pilani, Hyderabad Campus


Free Frames

Frame table – one entry for each frame


indicating whether the frame is free or allocated
and if allocated, to which page of which process

Before allocation After allocation

BITS Pilani, Hyderabad Campus


Page Table Implementation
• Page table is kept in main memory
• Page-table base register (PTBR) points to the page table
• Page-table length register (PTLR) indicates size of the page table
• In this scheme every data/instruction access requires two memory accesses
• One for the page table and one for the data / instruction
• The two memory access problem can be solved by the use of a special fast-
lookup hardware cache called associative memory or translation look-aside
buffers (TLBs)

BITS Pilani, Hyderabad Campus


Associative Memory
• Associative memory – parallel search

Page # | Frame #

• Address translation (p, d)


• If p is in associative register, get frame # out
• Otherwise get frame # from page table in memory

BITS Pilani, Hyderabad Campus


Paging Hardware With TLB

BITS Pilani, Hyderabad Campus


Page Table Implementation
• Some TLBs store address-space identifiers (ASIDs) in each TLB entry –
uniquely identifies each process to provide address-space protection for
that process
• Otherwise need to flush at every context switch
• TLBs typically small (64 to 1,024 entries)
• On a TLB miss, value is loaded into the TLB for faster access next time
• Replacement policies must be considered
• Some entries can be wired down for permanent fast access

BITS Pilani, Hyderabad Campus


Effective Access Time
• Associative lookup = t time units
• Can be < 10% of memory access time
• Hit ratio = H – percentage of times that a page number is found in the
associative registers; ratio related to number of associative registers
• Effective Access Time (EAT) = H x (t + memory access time) + (1 - H) x (t + 2 x memory access time)
• Consider H = 80%, t = 20ns for TLB search, 100ns for memory access
• EAT = 0.80 x 120 + 0.20 x 220 = 140ns
• For H = 99%, t = 20ns for TLB search, 100ns for memory access
• EAT = 0.99 x 120 + 0.01 x 220 = 121ns

BITS Pilani, Hyderabad Campus


Memory Protection
• Memory protection implemented by associating
protection bit with each frame to indicate if read-only
or read-write access is allowed
• Can also add more bits to indicate page execute-
only, and so on
• Valid-invalid bit attached to each entry in the page
table:
• “valid” indicates that the associated page is in the
process’s logical address space, and is thus a legal
page
• “invalid” indicates that the page is not in the
process’s logical address space
• Or use page-table length register (PTLR)
• Any violations result in a trap to the kernel
BITS Pilani, Hyderabad Campus
Page Table Structure

• Memory structures for paging can get huge using straight-forward methods
• Cost a lot
• Don’t want to allocate that contiguously in main memory
• Hierarchical Paging
• Hashed Page Tables
• Inverted Page Tables

BITS Pilani, Hyderabad Campus


Hierarchical Page Tables
• Break up the logical address space
into multiple page tables
• Two-level page table
• We then page the page table

• p1 is an index into the outer page


table, and p2 is the displacement
within the page of the inner page
table
• Known as a forward-mapped page table
BITS Pilani, Hyderabad Campus
Hashed Page Table
• The virtual page number is hashed into
a page table
• Page table contains a chain of
elements hashing to the same location
• Each element contains (1) the virtual
page number (2) the value of the
mapped page frame (3) a pointer to
the next element
• Virtual page numbers are compared in
this chain searching for a match
• If a match is found, the corresponding
physical frame is extracted
BITS Pilani, Hyderabad Campus
Inverted Page Table
• Track all physical pages
• One entry for each real page of
memory
• Entry consists of the virtual address
of the page stored in that real
memory location, with information
about the process that owns that
page

Decreases memory needed to store each page


table, but increases time needed to search the
table when a page reference occurs
BITS Pilani, Hyderabad Campus
Thank You

BITS Pilani, Hyderabad Campus


OPERATING SYSTEMS (CS F372)
Virtual Memory Management
BITS Pilani
Hyderabad Campus
Background

Code needs to be in memory to execute, but entire program rarely


used
Error code, unusual routines, large data structures
Entire program code not needed at same time
Consider ability to execute partially-loaded program
Program no longer constrained by limits of physical memory
Each program takes less memory while running -> more programs run at
the same time
Increased CPU utilization and throughput with no increase in response time or
turnaround time
Less I/O needed to load or swap programs into memory -> each user
program runs faster

BITS Pilani, Hyderabad Campus


Background
Virtual memory – separation of user logical memory from physical memory
Only part of the program needs to be in memory for execution
Logical address space can therefore be much larger than physical address
space
Allows address spaces to be shared by several processes
Allows for more efficient process creation
More programs running concurrently
Less I/O needed to load or swap processes
• Virtual address space – logical view of how process is stored in memory
• Usually start at address 0, contiguous addresses until end of space
• Meanwhile, physical memory organized in page frames
• MMU must map logical to physical

BITS Pilani, Hyderabad Campus


Demand Paging
Could bring entire process into memory at load time
Or bring a page into memory only when it is needed
Less I/O needed, no unnecessary I/O
Less memory needed
Faster response
More users
Similar to paging system with swapping
Page is needed -> reference to it
invalid reference -> abort
not-in-memory -> bring to memory
Lazy swapper (pager) – never swaps a page into memory unless that page will be needed
Demand Paging

Pager guesses which pages will be used before swapping out again
Pager brings in only those pages into memory
How to determine that set of pages?
Need new MMU functionality to implement demand paging
If pages needed are already memory resident
If page needed and not memory resident
Need to detect and load the page into memory from storage
Without changing program behavior
Without programmer needing to change code
Valid-Invalid Bit

A valid–invalid bit is associated with each page table entry
  v  legal and in-memory – memory resident
  i  either illegal or not-in-memory
Initially, the valid–invalid bit is set to i on all entries
During MMU address translation, if the valid–invalid bit in the page table entry is i  page fault
Page Fault
If there is a reference to a page, first reference to that page will
trap to operating system: page fault
Operating system looks at another table to decide:
Invalid reference  abort
Just not in memory
Find free frame
Swap page into frame via scheduled disk operation
Reset tables to indicate page now in memory
Set validation bit = v
Restart the instruction that caused the page fault
Demand Paging
Pure demand paging – never bring a page into memory until it is required
Locality of reference - tendency of a processor to access the same
set of memory locations repetitively over a short period of time
Hardware support needed for demand paging
Page table with valid / invalid bit
Secondary memory (swap device with swap space)
Instruction restart
Performance of Demand Paging
• Three major activities
• Service the interrupt
• Read the page
• Restart the process
• Page fault rate p, 0 ≤ p ≤ 1
• if p = 0 no page faults
• if p = 1, every reference is a fault
• Effective Access Time (EAT)
EAT = (1 – p) x memory access + p (page fault overhead
+ swap page out + swap page in )
Demand Paging Example
• Memory access time = 200 nanoseconds
• Average page-fault service time = 8 milliseconds
• EAT = (1 – p) x 200 + p (8 milliseconds)
= (1 – p) x 200 + p x 8,000,000 = 200 + p x 7,999,800
• If one access out of 1,000 causes a page fault, then
EAT = 8.2 microseconds.
This is a slowdown by a factor of 40!!
• If want performance degradation < 10 percent
• 220 > 200 + 7,999,800 x p
20 > 7,999,800 x p
• p < .0000025
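The arithmetic above can be reproduced with a short script (a sketch; the function name and the default constants are ours, taken from the example's numbers):

```python
def effective_access_time(p, mem_ns=200, fault_ns=8_000_000):
    """EAT = (1 - p) * memory access + p * page-fault service time (in ns)."""
    return (1 - p) * mem_ns + p * fault_ns

# One fault per 1,000 accesses -> about 8,199.8 ns, i.e. ~8.2 microseconds
print(effective_access_time(1 / 1000))

# Keep degradation under 10%: solve 220 > 200 + 7,999,800 * p
print(20 / 7_999_800)   # p must stay below ~0.0000025
```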
Demand Paging Optimizations
• Swap space I/O faster than file system I/O even if on the same device
• Swap allocated in larger chunks, less management needed than file system
• Copy entire process image to swap space at process load time
• Then page in and out of swap space
• Dirty bit
Page Replacement
Use modify (dirty) bit to reduce overhead of page transfers – only modified pages are
written to disk
Page replacement completes separation between logical memory and physical
memory – large virtual memory can be provided on a smaller physical memory
Find the location of the desired page on disk
Find a free frame:
- If there is a free frame, use it
- If there is no free frame, use a page replacement algorithm to select a victim
frame
- Write victim frame to disk if dirty
Bring the desired page into the (newly) free frame; update the page and frame tables
Continue the process by restarting the instruction that caused the trap
First-In-First-Out (FIFO) Algorithm
• Reference string: 7,0,1,2,0,3,0,4,2,3,0,3,2,1,2,0,1,7,0,1
• 3 frames (3 pages can be in memory at a time per process)
7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1
7 7 7 2 2 2 4 4 4 0 0 0 7 7 7
0 0 0 3 3 3 2 2 2 1 1 1 0 0
1 1 1 0 0 0 3 3 3 2 2 2 1
• 15 page faults
• Can vary by reference string: consider 1,2,3,4,1,2,5,1,2,3,4,5


• Adding more frames can cause more page faults!
• Belady’s Anomaly
• How to track ages of pages?
• Just use a FIFO queue
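Belady's anomaly can be demonstrated with a small simulation (a sketch; `fifo_faults` is a name introduced here for illustration):

```python
from collections import deque

def fifo_faults(refs, nframes):
    """Count page faults under FIFO replacement; the queue tracks load order."""
    frames, queue, faults = set(), deque(), 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) == nframes:
                frames.discard(queue.popleft())   # evict the oldest page
            frames.add(page)
            queue.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))   # 9 faults
print(fifo_faults(refs, 4))   # 10 faults -- more frames, more faults!
```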
Optimal Algorithm
• Replace page that will not be used for longest period of time
• 9 is optimal for the example
• How do you know this?
• Can’t read the future
• Used for measuring how well your algorithm performs

7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1
7 7 7 2 2 2 2 2 7
0 0 0 0 4 0 0 0
1 1 3 3 3 1 1
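Because the full reference string is known in advance here, the optimal (MIN) policy can be simulated offline (a sketch for illustration only):

```python
def opt_faults(refs, nframes):
    """Count faults when evicting the page whose next use is farthest away."""
    frames, faults = set(), 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) == nframes:
            # victim = resident page not used again for the longest time
            def next_use(p):
                rest = refs[i + 1:]
                return rest.index(p) if p in rest else float("inf")
            frames.discard(max(frames, key=next_use))
        frames.add(page)
    return faults

refs = [7,0,1,2,0,3,0,4,2,3,0,3,2,1,2,0,1,7,0,1]
print(opt_faults(refs, 3))   # 9 faults
```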
Least Recently Used (LRU)
Algorithm
Use past knowledge rather than future
Replace page that has not been used in the most amount of time
Associate time of last use with each page
7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1

7 7 7 2 2 4 4 4 0 1 1 1
0 0 0 0 0 0 3 3 3 0 0
1 1 3 3 2 2 2 2 2 7

12 faults – better than FIFO but worse than OPT


Generally good algorithm and frequently used
But how to implement?
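The counter idea described next can be sketched as a simulation (illustrative only; the `last_used` dictionary plays the role of the per-page counter):

```python
def lru_faults(refs, nframes):
    """Count faults when the least recently used resident page is evicted."""
    frames, last_used, faults = set(), {}, 0
    for i, page in enumerate(refs):
        if page not in frames:
            faults += 1
            if len(frames) == nframes:
                # evict the page with the smallest (oldest) use time
                frames.discard(min(frames, key=last_used.__getitem__))
            frames.add(page)
        last_used[page] = i   # "copy the clock into the counter"
    return faults

refs = [7,0,1,2,0,3,0,4,2,3,0,3,2,1,2,0,1,7,0,1]
print(lru_faults(refs, 3))   # 12 faults
```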
Least Recently Used (LRU)
Algorithm
Counter implementation
Every page entry has a counter; every time page is referenced through this entry, copy the
clock into the counter
When a page needs to be changed, look at the counters to find smallest value
Search through table needed
Stack implementation
Keep a stack of page numbers in a double link form:
Page referenced:
move it to the top
requires pointers to be changed
But each update more expensive
No search for replacement
LRU and OPT don’t have Belady’s Anomaly
LRU Approximation Algorithms
Reference bit
With each page associate a bit, initially = 0
When page is referenced bit set to 1
Replace any with reference bit = 0 (if one exists)
We do not know the order of use, only whether each page was used
Second-chance algorithm
Generally FIFO, plus hardware-provided reference bit
If page to be replaced has
Reference bit = 0 -> replace it
reference bit = 1 then:
 set reference bit 0, leave page in memory
 replace next page, subject to same rules
Allocation of Frames
Each process needs a minimum number of frames, decided by the computer architecture
The maximum is the total number of frames in the system
Two major allocation schemes
  fixed allocation
  priority allocation
Fixed Allocation
Equal allocation – For example, if there are 100 frames (after allocating frames for the OS) and 5 processes, give each process 20 frames
  Keep some as a free frame buffer pool
Proportional allocation – Allocate according to the size of the process
  Dynamic, as the degree of multiprogramming and process sizes change
  s_i = size of process p_i
  S = Σ s_i
  m = total number of frames
  a_i = allocation for p_i = (s_i / S) × m
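The proportional-allocation formula can be sketched as follows (the two process sizes and the 62-frame total are illustrative values, and minimum-frame rules are ignored):

```python
def proportional_allocation(sizes, m):
    """a_i = (s_i / S) * m, truncated to whole frames."""
    S = sum(sizes)
    return [s * m // S for s in sizes]

# e.g. two processes of 10 and 127 pages sharing 62 frames
print(proportional_allocation([10, 127], 62))   # [4, 57]
```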
Priority Allocation
Use a proportional allocation scheme using priorities rather than size, or a combination of size and priority
If process Pi generates a page fault,
  select for replacement one of its frames, or
  select for replacement a frame from a process with a lower priority number
Global vs. Local Allocation
Global replacement – process selects a replacement frame from the set of all frames; one process can take a frame from another
  But then process execution time can vary greatly
  But greater throughput
Local replacement – each process selects from only its own set of allocated frames
  More consistent per-process performance
  But possibly underutilized memory
Thrashing

If a process does not have “enough” pages, the page-fault rate is
very high
Page fault to get page
Replace existing frame
But quickly need replaced frame back
This leads to:
Low CPU utilization
Operating system thinking that it needs to increase the degree of
multiprogramming
Another process added to the system
Thrashing: a process is busy swapping pages in and out, resulting in high paging activity
Demand Paging & Thrashing
Why does demand paging work?
  Locality model
    Locality is a set of pages that are actively used together
    Defined by program structure and data structures used
  A process migrates from one locality to another
  A process contains several localities
  Localities may overlap
Why does thrashing occur?
  Σ (size of localities) > total memory size
  Limit its effects by using local or priority page replacement
  Allocate enough frames for a single locality
Working Set Model
Δ ≡ working-set window: a fixed number of page references (usually the most recent), an approximation of the program's locality
Working Set Model
WSSi (working set of Pi) = total number of pages referenced in the most recent Δ (varies in time)
  if Δ too small, it will not encompass the entire locality
  if Δ too large, it will encompass several localities
  if Δ = ∞, it will encompass the entire program
D = Σ WSSi ≡ total demand for frames
  Approximation of locality
m = number of available frames
if D > m => thrashing
Policy: if D > m, then suspend or swap out one of the processes
Keeps the degree of multiprogramming as high as possible
Optimizes CPU utilization
Difficulty is to keep track of the working set
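A working set over a window of Δ references can be sketched as follows (illustrative; the reference string is made up):

```python
def working_set(refs, t, delta):
    """Pages referenced in the window of the last `delta` references ending at time t."""
    return set(refs[max(0, t - delta + 1): t + 1])

refs = [1, 2, 1, 5, 7, 7, 7, 5, 1, 1]
print(working_set(refs, 4, 4))        # {1, 2, 5, 7}
print(len(working_set(refs, 9, 4)))   # smaller WSS once the locality narrows
```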
Page Fault Rates
Page-Fault Frequency (PFF)
  Define upper and lower limits on the page-fault rate
  If the page-fault rate is too low -> take a frame away from the process
  If the page-fault rate is too high -> allocate one more frame to the process
Allocating Kernel Memory

Treated differently from user memory


Often allocated from a free-memory pool
Kernel requests memory for data structures of varying sizes
Some kernel memory needs to be contiguous
i.e. for device I/O, h/w devices interact directly with physical memory without the
benefit of a virtual memory interface
Buddy System
Allocates memory from fixed-size segment consisting of physically-contiguous
pages
Define maximum and minimum block sizes
Memory allocated using power-of-2 allocator
Satisfies requests in units sized as power of 2
Request rounded up to next highest power of 2
When smaller allocation needed than is available, current chunk split into
two buddies of next-lower power of 2
Continue until appropriate sized chunk available
Small blocks require more overhead to keep track of blocks
Large blocks ??????
Buddy System
Advantage – can quickly coalesce unused chunks into a larger chunk, reducing external fragmentation
Disadvantage – internal fragmentation, since requests are rounded up to the next power of 2
Slab Allocator
Slab is one or more physically
contiguous pages
Slab is the actual container of
data associated with objects of
the specific kind
Cache consists of one or more
slabs
Single cache for each unique
kernel data structure
Each cache filled with objects –
instantiations of the data
structure
Memory-Mapped Files
Memory-mapped file I/O allows file I/O to be treated as routine memory
access by mapping a disk block to a page/pages in memory
A file is initially read using demand paging
A page-sized portion of the file is read from the file system into a physical page
Subsequent reads/writes to/from the file are treated as ordinary memory accesses
Simplifies and speeds file access by driving file I/O through memory rather
than read() and write() system calls
Also allows several processes to map the same file allowing the pages in
memory to be shared
But when does written data make it to disk?
Periodically and / or at file close() time
For example, when the pager scans for dirty pages
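In Python, for example, the `mmap` module exposes this style of file I/O; the sketch below maps a scratch file and updates it through ordinary memory operations:

```python
import mmap
import os
import tempfile

# Create a small scratch file, then treat its contents as ordinary memory.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello mapped world")
os.close(fd)

with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as m:   # length 0 -> map the whole file
        print(m[:5])                      # reads go through memory: b'hello'
        m[0:5] = b"HELLO"                 # writes are ordinary stores
        m.flush()                         # force dirty pages out to the file

with open(path, "rb") as f:
    print(f.read())                       # b'HELLO mapped world'
os.remove(path)
```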
Thank You
OPERATING SYSTEMS (CS F372)
Mass Storage
BITS Pilani
Hyderabad Campus
Magnetic Disks
Mass Storage Structure
• Magnetic disks provide bulk of secondary storage of modern computers
• Disk in use rotates at 60 to 250 times per second (3,600 to 15,000 RPM)
• Transfer rate is rate at which data flow between drive and computer
• Positioning time (random-access time) is time to move disk arm to desired
cylinder (seek time) and time for desired sector to rotate under the disk head
(rotational latency)
• Head crash results from disk head making contact with the disk surface
• Disks can be removable
• Drive attached to computer via I/O bus
• Buses vary, including ATA, SATA, USB, Fibre Channel (FC), etc.
Disk Structure
Disk drives are addressed as large 1-dimensional arrays of logical blocks, where the logical block is the smallest unit of transfer
Low-level formatting creates logical blocks on the physical media
The 1-dimensional array of logical blocks is mapped onto the sectors of the disk sequentially
  Sector 0 is the first sector of the first track on the outermost cylinder
  Mapping proceeds in order through that track, then the rest of the tracks in that cylinder, and then through the rest of the cylinders from outermost to innermost
Logical to physical address translation should be easy
  Physical address – cylinder number, track number, sector number
Disk Scheduling
Minimize seek time
Disk bandwidth - total no. of bytes transferred divided by the total time
between the first request for service and the completion of the last transfer
Sources of disk I/O request – OS, System processes, User processes
I/O request includes input or output mode, disk address, memory address,
number of sectors to transfer
OS maintains queue of requests, per disk or device
Idle disk can immediately work on I/O request, busy disk means work must
queue
FCFS
Total head movement of 640 cylinders
SSTF

Total head movement of 236 cylinders
Shortest Seek Time First selects the request with the minimum seek time from the current head position
May cause starvation of some requests
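FCFS and SSTF can be compared with a small simulation over the request queue used in the figures (98, 183, 37, 122, 14, 124, 65, 67, head starting at cylinder 53); the function names are ours:

```python
def fcfs(start, queue):
    """Total head movement when requests are serviced in arrival order."""
    total, pos = 0, start
    for cyl in queue:
        total += abs(cyl - pos)
        pos = cyl
    return total

def sstf(start, queue):
    """Always service the pending request closest to the current head position."""
    pending, total, pos = list(queue), 0, start
    while pending:
        nxt = min(pending, key=lambda c: abs(c - pos))
        total += abs(nxt - pos)
        pos = nxt
        pending.remove(nxt)
    return total

queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(fcfs(53, queue))   # 640 cylinders
print(sstf(53, queue))   # 236 cylinders
```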
SCAN
The disk arm starts at one end of the disk and moves toward the other end, servicing requests along the way; at the other end, the direction of head movement is reversed and servicing continues. Also called the elevator algorithm.
C-SCAN
The head moves from one end of the disk to the other, servicing requests as it goes; when it reaches the other end, it immediately returns to the beginning of the disk without servicing any requests on the return trip, treating the cylinders as a circular list.
Disk Management
Low-level formatting, or physical formatting — Dividing a disk into sectors
that the disk controller can read and write
Each sector can hold header and trailer, plus data, header and trailer
contains error correcting code (ECC)
To use a disk to hold files, the operating system still needs to record its own
data structures on the disk
Partition the disk into one or more groups of cylinders, each treated as a
logical disk
Logical formatting- creation of a file system, OS stores the initial file
system data structures on the disk, data structures include maps of free
and allocated space and an initial empty directory
RAID
RAID – redundant arrays of independent disks
multiple disk drives provides reliability via redundancy
Increases the mean time to failure
Mean time to repair – avg. time to replace a failed disk and restore the data on it
Mean time to data loss based on above factors
Mirrored Disk (volume)
Data striping – bit-level striping, byte-level striping, block-level striping
RAID levels: RAID 0, RAID 1, RAID 2, RAID 3, RAID 4, RAID 5, RAID 6
RAID 0
Level 0: Non-redundant
• Data striping is used for increased performance, but no redundant information is maintained
• Striping is done at the block level but without any redundancy
• Write performance is best at this level because, with no redundant information, there is nothing extra to update
RAID 1
Level 1: Mirrored
Same data is copied on two different disks. This type of redundancy is called
as mirroring. It is the most expensive system. Because two copies of same
data are available in two different disks, it allows parallel read
RAID 2
Level 2: Error correcting codes
This level uses bit-level data striping in place of block-level striping. It is used with drives that have no built-in error detection. Error-correcting code (ECC) schemes store two or more extra bits and are used to reconstruct the data if a single bit is damaged.
RAID 3
Level 3: Bit-Interleaved parity
Bit-interleaved data striping is used, and a single parity bit serves for error correction as well as detection. The disk controller detects which disk has failed. RAID level 3 has a single check disk holding the parity bits.
RAID 4
Level 4: Block-Interleaved parity
RAID level 4 is similar to RAID level 3, but uses block-interleaved parity instead of bit-interleaved parity. Blocks can be accessed independently, so read performance is high.
RAID 5
Level 5: Block-Interleaved distributed parity
RAID level 5 distributes the parity blocks and data across all disks. For each block, one of the disks stores the parity and the others store data. RAID level 5 gives the best performance for large reads and writes.
RAID 6
Level 6: P+Q Redundancy Scheme
What happens if more than one disk fails at a time? This level stores extra
redundant information to save the data against multiple disk failures. It uses
Reed-Solomon codes (ECC) for data recovery. Two different algorithms are
employed
Thank You
OPERATING SYSTEMS (CS F372)
File System
Dr. Barsha Mitra
BITS Pilani CSIS Dept., BITS Pilani, Hyderabad Campus
Hyderabad Campus
File Attributes
Name – only information kept in human-readable form
Identifier – unique tag (number) that identifies the file within the file system; non-human-readable
Type – needed for systems that support different types
Location – pointer to file location on device
Size – current file size
Protection – controls who can do reading, writing, executing
Time, date, and user identification – data for protection, security, and usage
monitoring
Information about files is kept in the directory structure, maintained on the disk
File Operations
File is an abstract data type
Create
Write – at write pointer location
Read – at read pointer location
Reposition within file - seek
Delete
Truncate
OS maintains an open-file table
For a requested file operation, file is specified via an index into the table
When file is closed, OS removes entry from file table
File Types
Access Methods
• Sequential Access
read_next()
write_next()
reset()

• Direct Access/Relative Access – file consists of fixed length logical records


read(n)
write(n)
position_file(n)
read_next(n)
write_next(n)

n = relative block number w.r.t beginning of file
File System Organization
Directory
• The directory is organized logically to obtain
• Efficiency – locating a file quickly
• Naming – convenient to users
• Two users can have same name for different files
• The same file can have several different names
• Grouping – logical grouping of files by properties
• Directory Operations
• Search for a file
• Create a file
• Delete a file
• List a directory
• Rename a file
• Traverse the file system
Single-Level Directory
• A single directory for all users
• Entire system will contain only one directory which is supposed to mention all the files
present in the file system
• Directory contains one entry per each file present on the file system

• Naming problem
• Grouping problem
Two-Level Directory
Separate directory for each user

User name and file name define a path


MFD is indexed by user name/ account number
Can have the same file name for different user
Efficient searching, only UFD is searched for creation or deletion
Creation and deletion of user directories – admin
Sharing not possible
Tree-Structured Directory
 Efficient searching
 Directory is a special file
 Current directory (working directory)
 cd /spell/mail/prog
 type list
 Absolute or relative path name
 Creating a new file is done in current directory
 Delete a file - rm <file-name>
 Creating a new subdirectory is done in current directory
 mkdir <dir-name>
 With permissions, one user can access another’s files

Deleting “mail”  deleting the entire subtree rooted by “mail”
Acyclic-Graph Directory

 Have shared subdirectories and files


 Two different names
 Dangling pointer
 New directory entry type
 Link – another pointer to an existing file
 Resolve the link – follow pointer to locate the file
General Graph Directory

• How do we guarantee no cycles?


• Allow only links to file not subdirectories
• Garbage collection
• Every time a new link is added use a cycle
detection algorithm to determine whether it
is OK
File System Mounting
A file system must be mounted before it can be accessed
Requires the device name and mount point (location within the file structure where the
file system is to be attached)
An unmounted file system is mounted at a mount point
File Sharing
• Sharing of files on multi-user systems is desirable
• Sharing may be done through a protection scheme
• On distributed systems, files may be shared across a network
• Network File System (NFS) is a common distributed file-sharing method
• If multi-user system
• User IDs identify users, allowing permissions and protections to be per-user
• Group IDs allow users to be in groups, permitting group access rights
• Owner of a file / directory
• Group of a file / directory
File Sharing
Specify how multiple users are to access a shared file simultaneously
Require more directory and file attributes
Concept of owner and group
Owner can change attributes and grant access
Consistency semantics
Remote file sharing
ftp – transferring files between machines
Distributed file system – remote directories are visible from a local
machine
File Sharing – Remote File Systems
Allow a computer to mount one or more file systems from one or more
machines
Server- machine containing the files
Client – machine seeking access to the file
Server declares which files are accessible to which clients
Server can serve multiple clients
Clients can use multiple servers
Authentication
Thank You
OPERATING SYSTEMS (CS F372)
File System Implementation
Dr. Barsha Mitra
BITS Pilani CSIS Dept., BITS Pilani, Hyderabad Campus
Hyderabad Campus
File-System Structure
• File structure
• Logical storage unit
• Collection of related information
• File system resides on secondary storage (disks)
• Provides efficient and convenient access to the disk by allowing data to be stored, located, and retrieved easily
• Disk provides in-place rewrite and random access
• I/O transfers performed in blocks of sectors (usually 512 bytes)
• File control block – storage structure consisting of information about a file
• Device driver controls the physical device
• File system organized into layers
Layered File System
File System Layers
• Device drivers manage I/O devices at the I/O control layer
• Basic file system – gives generic commands to device driver to read and write
physical disk blocks
• Also manages memory buffers and caches (allocation, freeing, replacement)
• Buffers hold data in transit
• Caches hold frequently used file system metadata
• File organization module understands files, logical blocks, and physical
blocks, maps logical block address to physical block address
• Logical file system manages metadata information
• Translates file name into file no., file handle, location by maintaining file control blocks
• Directory management
• Protection
File System Implementation
• We have system calls at the API level, but how do we implement their
functions?
• On-disk and in-memory structures
• Boot control block contains info. needed by system to boot OS from that
volume
• Needed if volume contains OS, usually first block of volume
• Volume control block (superblock, master file table) contains
volume/partition details
• Total # of blocks, # of free blocks, block size, free block pointers or array
• Directory structure organizes the files
• Names and inode numbers, master file table
File System Implementation
• Per-file File Control Block (FCB) contains many details about the file
• unique identifier to allow association with a directory entry, permissions, size,
dates
In-Memory File System Structures
• in-memory mount table – info. about each mounted volume
• in-memory directory structure cache – directory info. of recently
accessed directories
• system-wide open-file table – copy of FCB of each open file
• per-process open-file table – pointer to the appropriate entry in the
system-wide open-file table
• buffers – hold file-system blocks while they are read from disk or written to disk
Virtual File System
• VFS allows the same system call
interface (the API) to be used for
different types of file systems
• Separates file-system generic operations
from implementation details
• Unique network wide representation of
file
• Implements vnodes which hold inodes or
network file details
• Dispatches operation to appropriate file
system implementation routines
Directory Implementation

• Linear list of file names with pointer to the data blocks


• Simple to program
• Time-consuming to execute
• Linear search time
• Could keep ordered alphabetically via linked list or use B+ tree
• Hash Table – linear list (for directory entries) with hash data structure
• Decreases directory search time
• Collisions – situations where two file names hash to the same location
• Only good if entries are fixed size, or use chained-overflow method
Allocation Methods - Contiguous
• Allocation method refers to how disk blocks are allocated for files
• Mapping from logical to physical addresses
• Contiguous allocation – each file
occupies set of contiguous blocks
• Best performance in most cases
• Simple – only starting location (block #)
and length (number of blocks) are
required
• Problems include finding space for a file, knowing the file size in advance, external fragmentation, and the need for compaction off-line (downtime) or on-line
Allocation Methods - Linked
• Linked allocation – each file a linked
list of blocks
• File ends at nil pointer
• Each block contains pointer to next block
• No compaction, no external
fragmentation
• Free space management system called
when new block needed
• Improve efficiency by clustering blocks
into groups
• Reliability can be a problem
• Locating a block can take many I/Os and
disk seeks
File Allocation Table

FAT (File Allocation Table)


Beginning of volume has table, indexed by block number
Much like a linked list
New block allocation simple
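Following a file's block chain through the FAT can be sketched with a toy in-memory table (the block numbers are illustrative, and `EOF` is a sentinel we introduce):

```python
EOF = -1   # end-of-file mark in this toy table

def fat_chain(fat, start):
    """Follow a file's block chain through the FAT until the end-of-file mark.

    `fat` maps block number -> next block number (or EOF), mimicking the
    on-disk table at the beginning of the volume.
    """
    chain, block = [], start
    while block != EOF:
        chain.append(block)
        block = fat[block]
    return chain

# A file occupying blocks 217 -> 618 -> 339
fat = {217: 618, 618: 339, 339: EOF}
print(fat_chain(fat, 217))   # [217, 618, 339]
```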
Allocation Methods - Indexed
Indexed allocation - Each file has its own index block of pointers to its data
blocks

index table
 Need index table
 Random access
 Dynamic access without external
fragmentation, but have overhead of index
block
Free Space Management
 File system maintains free-space list to track available blocks/clusters
 (Using term “block” for simplicity)
 Bit vector or bit map (n blocks), one bit per block:
   bit[i] = 1  block[i] free
   bit[i] = 0  block[i] occupied
 Block number calculation:
   first free block = (number of bits per word) × (number of 0-value words) + offset of first 1 bit
 CPUs have instructions to return the offset within a word of the first “1” bit
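The block-number calculation can be sketched as follows (illustrative; it assumes a 1 bit marks a free block, as above, and scans words in software rather than with a CPU instruction):

```python
def first_free_block(words, bits_per_word=8):
    """First free block = (bits per word) * (# of 0-valued words) + offset of
    the first 1 bit, scanning each word from its most significant bit."""
    for wi, word in enumerate(words):
        if word != 0:
            for offset in range(bits_per_word):
                if (word >> (bits_per_word - 1 - offset)) & 1:
                    return bits_per_word * wi + offset
    return None   # no free block

# Two all-zero words, then 0b00100000: first free block = 8*2 + 2 = 18
print(first_free_block([0, 0, 0b00100000]))   # 18
```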
Linked List

 Linked list (free list)


 Cannot get contiguous
space easily
 No waste of space
 No need to traverse the
entire list (if # free blocks
recorded)
Free Space Management
 Grouping
 Modify linked list to store address of next n-1 free blocks in first free
block, plus a pointer to next block that contains free-block-pointers
(like this one)

 Counting
 Because space is frequently contiguously used and freed
 Keep address of first free block and count of following free blocks
 Free space list then has entries containing addresses and counts
Thank You
OPERATING SYSTEMS (CS F372)
I/O Systems
BITS Pilani
Hyderabad Campus
Overview

• I/O management is a major component of operating system design and


operation
• Important aspect of computer operation
• I/O devices vary greatly
• Various methods to control them
• Performance management
• New types of devices frequently come up
• Ports, buses, device controllers connect to various devices
• Device drivers encapsulate device details
• Present uniform device-access interface to I/O subsystem
I/O Hardware
• Incredible variety of I/O devices
• Storage
• Transmission
• Human-interface
• Common concepts – signals from I/O devices interface with computer
• Port – connection point for device
• Bus - daisy chain or shared direct access
• PCI bus common in PCs
• expansion bus connects relatively slow devices
• Controller (host adapter) – electronics that operate port, bus, device
• Sometimes integrated
• Sometimes separate circuit board (host adapter)
• Contains processor, microcode, private memory, bus controller, etc
Typical PC Bus Structure
I/O Hardware

• I/O instructions control devices


• Controllers usually have registers where device driver places processor
commands, addresses, and data to write, or read data from registers after
command execution
• Data-in register – read by host to get i/p
• Data-out register – written by host to send o/p
• Status register – contains bits that can be read by host, bits indicate states
like completion of current command, availability of byte to be read from
data-in register or occurrence of device error
• Control register – written by host to start a command
Polling
1. Host reads busy bit from status register until 0
2. Host sets read or write bit in command register and if write bit is set, writes a data byte
into data-out register
3. Host sets command-ready bit in command register
4. Controller sees command-ready bit and sets busy bit
5. Controller sees write command in command register, reads byte from data-out register
and completes the I/O
6. Controller clears busy bit, error bit (status reg.), command-ready bit when transfer done
• Step 1 is busy-waiting or polling
• Reasonable if device is fast, but inefficient if device slow
• CPU switches to other tasks?
• How does CPU know controller is idle, controller buffer may overflow
Interrupts
• CPU Interrupt-request line triggered by I/O device
• Checked by processor after each instruction
• Interrupt handler receives interrupts
• Maskable to ignore or delay some interrupts
• Interrupt vector to dispatch interrupt to correct handler
• Context switch at start and end
• Based on priority
• Some nonmaskable (unrecoverable memory errors)
Interrupts
• Interrupt mechanism is also used for exceptions
• Terminating a process, or a system crash due to a hardware error
• Page-fault handler executes on a memory access error
• System call executes via a trap that triggers the kernel to execute the request
Direct Memory Access
• Used for large data movement
• Requires DMA controller (special-purpose processor)
• Bypasses CPU to transfer data directly between I/O device and memory
• OS writes DMA command block into memory
• Source and destination addresses
• Read or write mode
• Count of bytes
• CPU writes location of command block to DMA controller
• DMA performs transfer without help of CPU
Direct Memory Access
• Handshaking b/w DMA controller and device controller is done via DMA-request and
DMA-acknowledge wires
• Device controller places a signal on the DMA-request wire when a data word is available
for transfer
• DMA controller takes control of memory bus, places the desired address on memory-
address wires and places a signal on DMA-acknowledge wire
• Device controller on receiving DMA-acknowledge signal, transfers data word to memory
and removes DMA-request signal
• When transfer is complete, DMA controller interrupts CPU
• Cycle stealing: the DMA controller seizes memory-bus cycles from the CPU, which can slow CPU execution slightly during the transfer
I/O Interface
• I/O system calls encapsulate device behaviors in generic classes
• Device-driver layer hides differences among I/O controllers from kernel
• Help OS developers
• Devices vary in many dimensions
• Character-stream or block
• Sequential or random-access
• Synchronous or asynchronous (or both)
• Sharable or dedicated
• Speed of operation
• read-write, read only, or write only
Block & Character Devices
• Block devices include disk drives
• Commands include read, write, seek
• Raw I/O, direct I/O
• Memory-mapped file access possible
• File mapped to virtual memory and clusters brought via demand paging
• DMA
• Character devices include keyboards, mice, serial ports
• Commands include get(), put()
Clocks & Timers
• Provide current time, elapsed time, timer
• Normal resolution about 1/60 second
• Some systems provide higher-resolution timers
• Programmable interval timer used for timings, periodic interrupts
Types of I/O

• Blocking – process is suspended until the I/O completes
• Nonblocking – call returns quickly with whatever data is available (possibly none)
Kernel I/O Subsystem
• Scheduling
• Some I/O request ordering via per-device queue
• Some OSs try fairness
• Device-status table
• Buffering - store data in memory while transferring between devices
• To cope with device speed mismatch
• Double buffering
• Caching - faster device holding copy of data
• Always just a copy
• Key to performance
• Spooling - hold output for a device
• If device can serve only one request at a time
Other Aspects

• Error Handling
• OS can recover from disk read, device unavailable, transient write failures
• Retry a read or write, for example
• Most return an error number or code when I/O request fails
• System error logs hold problem reports
• I/O Protection
• User process may accidentally or purposefully attempt to disrupt normal
operation via illegal I/O instructions
• All I/O instructions defined to be privileged
• I/O must be performed via system calls
Thank You
How many times does this code segment print ‘GATE’?

int main() {
if (fork() || fork())
fork();
printf("GATE\n");
return 0;
}
Choose the most appropriate answer with respect to the following code snippet. Assume all required header files have been included.

1. int main()

2. {

3. int pfds[2] = {0, 1};

4. char buf[30];

5. printf("reading from file descriptor #%d\n", pfds[0]);

6. read(pfds[0], buf, 5);

7. printf("writing to file descriptor #%d\n", pfds[1]);

8. write(pfds[1], buf, 5);

9. printf("read \"%s\"\n", buf);

10. return 0;

11. }

a) Line 6 would cause an error

b) Line 8 would cause an error

c) Line 6 would cause a segmentation fault

d) None of the above


Consider the following code segment:

1. int main() {

2. int pfds[2];

3. char buf[1000];

4. pipe(pfds);

5. if (!fork()) {

6. for (int i=0; i<10; i++) write(pfds[1], "child", 5);

7. } else {

8. for (int i=0; i<1; i++) write(pfds[1], "parent", 6);

9. read(pfds[0], buf, 500);

10. printf("%s", buf);

11. wait(NULL);

12. return 0;

13. }

14. }
Which of the following is a possible output from line 10?
Assume all required header files have been included.
a) segmentation fault
b) parent
c) cphairlednt
d) Error because two processes can’t write to the same
end of the pipe simultaneously
