OS slides

The document provides an overview of the Operating Systems course (CS F372) at BITS Pilani, detailing objectives such as process management, memory management, and storage management. It outlines the topics to be covered, evaluation methods, and the structure of operating systems, including multiprogramming and multitasking. Additionally, it discusses the architecture of computer systems, the role of the operating system, and various computing environments.


OPERATING SYSTEMS (CS F372)

Introduction
Dr. Barsha Mitra
BITS Pilani CSIS Dept., BITS Pilani, Hyderabad Campus
Hyderabad Campus
What is an Operating System



Handout Overview
Objectives
• To learn about how process management is carried out by the OS. This
will include process creation, thread creation, CPU scheduling,
process synchronization and deadlocks.
• To learn about memory management carried out by OS. This will
include the concepts of paging, segmentation, swapping, and virtual
memory.
• To learn how permanent storage like files and disks are managed by
OS. This will include topics related to access methods, mounting, disk
scheduling, and disk management.
• Hands-on experience



Handout Overview
Text Book:
T1. Silberschatz, Galvin, and Gagne, “Operating System Concepts”, 9th
edition, John Wiley & Sons, 2012.

Reference Books:
R1. W. Stallings, “Operating Systems: Internals and Design Principles”, 6th
edition, Pearson, 2009.
R2. Tanenbaum, Woodhull, “Operating Systems Design & Implementation”,
3rd edition, Pearson, 2006.
R3. Dhamdhere, “Operating Systems: A Concept based Approach”, 2nd
edition, McGrawHill, 2009.
R4. Robert Love, “Linux Kernel Development”, 3rd edition, Pearson, 2010.
Topics to be covered
• Introduction • File System Interface
• OS Structures • File System Implementation
• Processes • I/O Systems
• Threads • Protection
• CPU Scheduling
• Process Synchronization
• Deadlocks
• Main Memory Management
• Virtual Memory
• Mass Storage



Evaluation

Component                   Duration     Weightage (%)   Date & Time         Nature of Component
Mid Semester Examination    90 minutes   30              As per Time Table   Open Book
Quiz 1                      -            10              TBA                 Open Book
Quiz 2                      -            10              TBA                 Open Book
Assignment                  -            15              TBA                 Open Book
Comprehensive Examination   120 minutes  35              As per Time Table   Open Book



Handout Overview

• Chamber Consultation
• Notices
• Make-up Policy



Introduction

• a program that manages the computer’s hardware
• acts as an intermediary between the computer user and the computer hardware
• mainframe operating systems
• personal computer (PC) operating systems
• operating systems for mobile devices



Computer System Architecture
 Hardware – provides basic computing resources
 CPU, memory, storage, I/O devices
 Operating system
 Controls and coordinates use of hardware among various applications and users
 Application programs – define the ways in which the system resources are used to
solve the computing problems of the users
 word processors, email, web browsers, database systems, video games, media
player
 Users
 People, machines, other computers



Computer System Architecture



What the OS Does: User View
 Users want convenience, ease of use and good performance
 Don’t care about resource utilization
 Shared computer such as mainframe or minicomputer must keep all users
happy
 Users of dedicated systems such as workstations have dedicated resources
but frequently use shared resources from servers
 Handheld computers are resource poor, optimized for individual usability and
battery life
 Some computers have little or no user interface, such as embedded
computers in devices and automobiles



What the OS Does: System View

 OS is a resource allocator
 Manages all resources
 Decides between conflicting requests for efficient and fair resource use
 OS is a control program
 Controls execution of user programs to prevent errors and improper use of
the computer (e.g., a program should not delete a section of the hard drive,
and a program should not interfere with other programs)



How Do We Define an OS?
Everything a vendor ships when you order an operating system is a good
approximation

“The one program running at all times on the computer”


Computer System Organization
 Computer-system operation
 One or more CPUs, device controllers connect through common bus
providing access to shared memory
 Concurrent execution of CPUs and devices competing for memory cycles



Computer-System Organization



Computer-System Operation
 Bootstrap program is loaded at power-up or reboot
 initial program
 stored in ROM or EPROM, generally known as firmware
 initializes all aspects of system like CPU registers, device controllers, memory contents
 locates and loads the operating system kernel and starts its execution
(the kernel is the core component of the OS, responsible among other things for memory
management; it is the very first component of the OS that is loaded when the computer starts)
To summarize: the bootstrap program is stored in EPROM, known as firmware; it is the very first thing
that runs when the computer is started/booted; it initializes all aspects of the system such as CPU
registers, device controllers and memory contents; and it loads the OS kernel, as it knows the kernel’s
location in secondary memory.



Computer-System Operation
 I/O devices and the CPU can execute concurrently
 Each device controller is in charge of a particular device type
 Each device controller has a local buffer
 CPU moves data from/to main memory to/from local buffers
 I/O is from the device to the local buffer of the controller (i.e., the device talks to the local
buffer, and the local buffer talks to the CPU)
 Device controller informs CPU that it has finished its operation by causing an interrupt
 Basically, the local buffer has a faster access time than the real device, i.e., loading and
extracting data from the local buffer is faster than from the device; since the CPU is really fast,
local buffers were introduced to maintain the efficiency of the CPU
 The local buffer is part of the device controller
Interrupt Handling
 an interrupt transfers control to the interrupt service routine (ISR, stored at a fixed location) through
the interrupt vector (the address for finding the ISR)
 IVT – table of pointers containing the addresses of all the interrupt service routines
 Basically, the IVT stores the addresses of all ISRs; each ISR knows what to do when its specific
interrupt occurs
 must save the address of the interrupted instruction (on the system stack)
 a trap/exception is a software-generated interrupt caused either by an error or a user request
(arithmetic, file-not-found, segmentation and index-out-of-bounds exceptions are error-type
interrupts; a user-request interrupt occurs when a program explicitly asks for an operating-system
service via a system call)
 operating system is interrupt driven
 operating system preserves the state of the CPU by storing the contents of the registers and the
program counter; this previous state is stored on the system stack
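The dispatch pattern above — an interrupt number indexing into a table of service routines — can be mimicked at user level with POSIX signals, where the signal number plays the role of the interrupt-vector index. A minimal sketch, assuming a POSIX system; the names `service_routine` and `handled` are illustrative, not part of any real kernel API:

```python
import os
import signal

handled = []

def service_routine(signum, frame):
    # 'frame' is the saved execution state of the interrupted code,
    # loosely analogous to the CPU state the OS saves before an ISR runs.
    handled.append(signum)

# Install the handler: the kernel records it in this process's handler
# table, much as the IVT maps an interrupt number to its ISR.
signal.signal(signal.SIGUSR1, service_routine)

# Raise the "interrupt": control transfers to service_routine, it runs
# to completion, then execution resumes after os.kill returns.
os.kill(os.getpid(), signal.SIGUSR1)
```

The signal number selects which handler runs, just as the interrupt vector selects which ISR the hardware jumps to.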
Computer-System Operation
 computer system consists of CPUs and multiple device
controllers that are connected through a common bus
 each device controller is in charge of a specific type of
device
 device controller
 maintains some local buffer storage and a set of
special-purpose registers
 moves data between the peripheral devices that it
controls and its local buffer storage
 operating systems have a device driver for each device
controller
 device driver understands the device controller and
provides the rest of the operating system with a uniform
interface to the device (generally the necessary device drivers
come pre-installed inside the OS, but some others have to
be installed by the user on demand)
I/O Structure
 to start an I/O operation, the device driver loads the registers within the device
controller
 device controller examines the contents of registers to determine what action to take
 controller starts the transfer of data from the device to its local buffer
 device controller informs the device driver via an interrupt that it has finished its
operation
 device driver then returns control to the operating system, possibly returning the data
or a pointer to the data if the operation was a read
 for other operations, the device driver returns status information



I/O Structure
 interrupt-driven I/O is fine for moving small amounts of data (usually one interrupt is
generated after each byte transfer, which makes it slow)
 can produce high overhead when used for bulk data movement such as disk I/O
 direct memory access (DMA)
 device controller sets up buffers, pointers, and counters for the I/O device
 in DMA, the device controller transfers an entire block of data (a chunk of many bytes)
directly between its own buffer storage and memory, with no intervention by the CPU
 only one interrupt is generated per block, to tell the device driver that the operation
has completed, instead of the one interrupt per byte generated for low-speed devices
 hence the CPU is available to accomplish other work
 to summarize: in DMA, a block of data is transferred with no CPU involvement, via direct
access to memory, which frees the CPU for other work; only one interrupt per block
I/O Structure



Storage Structure
 Main memory –
 only storage media that the CPU can access directly
 instruction execution
 random access
 volatile



Storage Structure
 Secondary storage –
 extension of main memory that provides large nonvolatile storage capacity
 stores both program and data



Storage Structure
 Hard disks/Magnetic disks –
 rigid metal or glass platters covered with magnetic recording material
 disk surface is logically divided into tracks, which are subdivided into sectors
 disk controller determines the interaction between the device and the computer



Storage Structure
 Solid-state disks –
 faster than magnetic disks, nonvolatile
 becoming more popular
 stores data in DRAM during normal operation
 also contains a hidden magnetic hard disk and a battery for
backup power
 if external power is interrupted, solid-state disk’s controller
copies the data from RAM to the magnetic disk
 when external power is restored, the controller copies the data
back into RAM
 another form of solid-state disk is flash memory, which is
popular in cameras and personal digital assistants (PDAs),
slower than DRAM but needs no power to retain its contents



Storage Structure

Moving down the storage hierarchy:
1) access speed decreases
2) physical size to store the same amount of data increases
3) cost decreases



Putting it All Together



Operating System Structure
 Multiprogramming (Batch system)
 Needed for efficiency
 Single process cannot keep CPU and
I/O devices busy at all times
 Organizes jobs (code and data) so
CPU always has one to execute
 A subset of the total jobs in the system (the job pool) is kept in memory
 One job is selected and run via job scheduling
 When it has to wait for I/O, OS
switches to another job



Operating System Structure
 Timesharing (multitasking): CPU switches jobs so
frequently that users can interact with each job
while it is running
 interactive computing
 User interaction via input devices
 Response time should be minimal
 Each user has at least one program executing
in memory  process (a program that is being
executed is a process)
 If several processes are ready to run at the
same time  CPU scheduling (deciding which
process gets the CPU next)
 If processes don’t fit in memory (main
memory/RAM), swapping moves them in and
out to run
 Virtual memory allows execution of processes
larger than physical memory
Operating System Operations

Interrupt driven - hardware and software


Hardware interrupt by one of the devices
Software interrupt (exception or trap):
Software error (e.g., division by zero, invalid memory access)
Request for operating system service
Other process problems include infinite loop, processes modifying
each other or the operating system



Operating System Operations
Dual-mode operation allows OS to protect itself and protect users from one
another
User mode and Kernel/Supervisor/System/Privileged mode
Mode bit provided by hardware
Provides ability to distinguish when system is running user code or kernel code
Some instructions designated as privileged, only executable in kernel mode
System call changes mode to kernel, return from call resets it to user

(on an interrupt: locate the ISR via the interrupt vector and execute it)



Operating System Operations

Boot time  hardware starts in kernel mode


After loading OS, user applications are started in user mode
When trap/interrupt occurs, hardware switches from user mode to kernel
mode
examples of privileged instructions – switch to kernel mode, I/O control,
timer management, interrupt management

If switching to kernel mode is itself privileged (only executable in kernel
mode), how do we switch to kernel mode in the first place?
Solution: in user mode, we can only invoke system calls (library functions
like printf or scanf ultimately invoke system calls such as write and read);
the hardware transitions to kernel mode automatically when such an
invocation happens.
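The user-to-kernel transition can be glimpsed from user code. A minimal sketch, assuming a POSIX-style system: Python's `os.write` and `os.read` are thin wrappers over the corresponding system calls, each of which traps into the kernel and returns in user mode (the pipe and message here are just illustration):

```python
import os

# os.write is a thin wrapper over the write() system call: invoking it
# traps into the kernel (user mode -> kernel mode), the kernel performs
# the privileged I/O, and control then returns in user mode.
r, w = os.pipe()                       # system call: two kernel-managed descriptors
n = os.write(w, b"hello, kernel")      # system call: data copied into a kernel buffer
os.close(w)                            # system call
data = os.read(r, 64)                  # system call: read the data back
os.close(r)
```

Each of these four calls crosses the user/kernel boundary; a buffered library call like `print()` sits one layer above and eventually issues the same `write` system call.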
Operating System Operations: Timer

User processes must return control to OS


Prevent infinite loop / process hogging resources
Set to interrupt the computer after some time period
Keep a counter that is decremented for every physical clock tick
Operating system sets the counter (privileged instruction) before
switching to user mode
When counter reaches zero, generate an interrupt
Set up before scheduling process to regain control or terminate
program that exceeds allotted time
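The timer mechanism can be imitated at user level with a POSIX alarm signal: set a "counter", keep running, and regain control when the interrupt fires. A hedged sketch (POSIX-only; `expired` and `on_timer` are illustrative names):

```python
import signal
import time

expired = False

def on_timer(signum, frame):
    # Stands in for the OS timer interrupt: regain control from running code.
    global expired
    expired = True

signal.signal(signal.SIGALRM, on_timer)
signal.alarm(1)          # "set the counter": interrupt after ~1 second

# This loop stands in for a user process that would otherwise run forever;
# the timer interrupt breaks us out, as the OS would preempt the process.
while not expired:
    time.sleep(0.01)
```

A real OS does the same with a hardware timer and a privileged instruction to set it, so a runaway process cannot disable its own preemption.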



Computing Environments
Single-Processor Systems - one main CPU executes instructions, including
instructions from user processes; device-specific processors (such as disk,
keyboard and graphics controllers) and I/O processors may also be present



Computing Environments
 Multiprocessors
 Also known as parallel systems, multi-core systems
 2 or more processors in close communication, sharing the
computer bus and sometimes the clock, memory and
peripheral devices
 Advantages:
 Increased throughput
 Economy of scale (10 separate single-processor systems will cost more than
one 10-processor system, since the processors of a multiprocessor share
memory and peripheral devices, which reduces cost)
 Increased reliability – graceful degradation (the ability of the system to
continue providing service proportional to the amount of healthy/surviving
hardware – here, processors) and fault tolerance (even if some processors
fail, normal operation can continue as if nothing happened)
 Two types:
 Asymmetric Multiprocessing – each processor is assigned a specific task; a
boss processor controls worker processors
 Symmetric Multiprocessing – each processor performs all tasks; processors are peers



Computing Environments
 Multicore Systems
 include multiple computing cores on a single chip
 more efficient than multiple chips with single cores because on-chip
communication is faster than between-chip communication

dual-core design with both cores on same chip



Computing Environments
Clustered Systems
 Like multiprocessor systems, but multiple
systems working together
 Usually sharing storage via a storage-area
network (SAN)
 Provides a high-availability service which
survives failures, users can see only a brief
interruption of service
 Asymmetric clustering has one machine in hot-standby mode
 Symmetric clustering has multiple nodes running applications, monitoring
each other
 Some clusters are for high-performance computing (HPC) – applications
must be written to use parallelization



Computing Environments
Traditional Computing
stand-alone general purpose machines
most systems interconnect with others (i.e., the Internet)
mobile computers interconnect via wireless networks
home computers



Computing Environments
Mobile Computing
handheld smartphones, tablets, etc.
portable, lightweight
allows different types of apps
use wireless, or cellular data networks for connectivity
Apple iOS, Google Android



Computing Environments
Distributed Computing
collection of separate, possibly heterogeneous, systems networked
together
access to shared resources
systems are connected through a network,
which is a communications path
Local Area Network (LAN)
Wide Area Network (WAN)
Metropolitan Area Network (MAN)
Personal Area Network (PAN)
systems exchange messages



Computing Environments
Client Server Computing
 terminals are PCs and mobile devices
 servers respond to requests generated
by clients
 servers can be of 2 types
 compute-server system provides an
interface to client to request
services, server executes the action
and sends the results to clients
 file-server system provides
interface for clients to create,
update, read and delete files
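A compute-server reduces to a short loop: the client sends a request, the server executes the action and returns the result. This toy example squares a number over a localhost socket; the function names, the one-shot protocol, and the port choice are all invented for illustration:

```python
import socket
import threading

def compute_server(sock):
    # Compute-server: accept a request, execute the action, send the result.
    conn, _ = sock.accept()
    with conn:
        number = int(conn.recv(64).decode())
        conn.sendall(str(number * number).encode())

# Server listens on an ephemeral localhost port.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]
threading.Thread(target=compute_server, args=(server,), daemon=True).start()

# Client side: request a computation and read back the result.
client = socket.create_connection(("127.0.0.1", port))
client.sendall(b"12")
result = int(client.recv(64).decode())
client.close()
server.close()
```

A file-server follows the same request/response shape, but the requested actions are file creates, reads, updates and deletes rather than computation.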



Computing Environments

Peer-to-peer Computing
 does not distinguish clients and
servers
 nodes join and may also leave P2P
network
 advantage over client-server systems: no central server to act as a
bottleneck or single point of failure

 Napster, Gnutella, BitTorrent, DC++, Skype (VoIP)



OPERATING SYSTEMS (CS F372)
OS Structures
Dr. Barsha Mitra
BITS Pilani CSIS Dept., BITS Pilani, Hyderabad Campus
Hyderabad Campus
Operating System Services
 User interface - almost all operating systems have a user interface (UI)
 Command-Line (CLI), Graphics User Interface (GUI)
 Program execution - system must be able to load a program into memory and
to run that program, end execution, either normally or abnormally (indicating
error)
 I/O operations - running program may require I/O, which may involve a file or
an I/O device
 File-system manipulation - read and write files and directories, create and
delete them, search them, list file information, permission management



Operating System Services
 Communications – processes may exchange information, on the same
computer or between computers over a network, shared memory or message
passing
 Error detection –
 OS needs to be constantly aware of possible errors
 May occur in CPU and memory h/w, in I/O devices, in user program
 OS should take the appropriate action to ensure correct and consistent
computing
 Take corrective actions



Operating System Services
 Resource allocation – allocating resources like CPU cycles, main memory, file
storage, I/O devices for multiple concurrently executing processes
 Accounting – keep track of which users use how much and what kinds of
computer resources
 Protection and security –
 owners of information stored in a multiuser or networked computer
system want to control use of that information
 concurrent processes should not interfere with each other or with OS
 ensuring that all accesses to system resources are controlled
 security of the system from outsiders requires user authentication,
extends to defending external I/O devices from invalid access attempts



Operating System Services

(Figure: a view of operating system services – the entire blue region in the
diagram, between the user programs and the hardware, is the OS)



User and Operating-System Interface:
CLI
 CLI or command interpreter
 Sometimes implemented in kernel, sometimes by separate
program (Unix, Windows)
 Sometimes multiple flavors implemented – shells
 Primarily fetches a command from user and executes it
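That core job — fetch a command, split it into a program and arguments, execute it — fits in a few lines. A minimal sketch, assuming a POSIX system with an `echo` executable on the PATH (`run_command` is an illustrative name, not any real shell's API):

```python
import shlex
import subprocess

def run_command(line):
    # The command interpreter's core loop body: fetch a command line,
    # split it into the program name and its arguments, and execute it.
    args = shlex.split(line)
    if not args:
        return 0
    completed = subprocess.run(args)   # fork/exec the program, wait for it
    return completed.returncode

status = run_command("echo hello from the shell")
```

A real shell wraps this in a read-evaluate loop and adds built-in commands, redirection and job control, but the fetch-and-execute skeleton is the same.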



User and Operating-System Interface:
GUI
 User-friendly interface
 Usually mouse, keyboard, and monitor
 Icons represent files, programs, actions, etc.
 Various mouse buttons over objects in the interface cause various actions
(provide information, options, execute function, open directory (known as a
folder))
 Many systems now include both CLI and GUI interfaces
 Microsoft Windows is GUI with CLI “command” shell
 Unix and Linux have CLI with optional GUI interfaces (CDE, KDE, GNOME)



User and Operating-System Interface:
Touchscreen Interface
 Touchscreen devices require new interfaces
 Mouse not possible or not desired
 Actions and selection based on gestures
 Virtual keyboard for text entry
 Voice commands



User and Operating-System Interface:

Choice of Interface



System Calls



System Calls
 Interface to the services provided by the OS
 Typically written in a high-level language (C or C++)
 Mostly accessed by programs via a high-level Application Programming
Interface (API) rather than direct system call use
 API specifies a set of functions available to application programmers
 Programmers access API via code library provided by the OS
 Three most common APIs are
 Win32 API for Windows
 POSIX API for POSIX-based systems (including all versions of UNIX, Linux,
and Mac OS X) (POSIX: Portable Operating System Interface, based on UNIX)
 Java API for the Java virtual machine (JVM)



System Calls
 A number is associated with each system call
 System-call interface maintains a table indexed according to these numbers
 The system call interface invokes the intended system call in OS kernel and
returns status of the system call and any return values
 The caller need know nothing about how the system call is implemented
 Just needs to obey API and understand what OS will do as a result of call
execution
 Most details of OS interface hidden from programmer by API
 Managed by run-time support library (set of functions built into libraries
included with compiler)
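The "number associated with each system call" can be demonstrated directly. The sketch below assumes an x86-64 Linux machine, where number 39 indexes `getpid` in the kernel's system-call table (that number is architecture-specific, which is exactly why programs go through the API instead); on other platforms it falls back to the portable wrapper:

```python
import ctypes
import os
import platform

def getpid_via_syscall_number():
    # On x86-64 Linux, system call number 39 is getpid; the raw number is
    # an assumption that holds only for that architecture/OS combination.
    if platform.system() == "Linux" and platform.machine() == "x86_64":
        libc = ctypes.CDLL(None, use_errno=True)
        return libc.syscall(39)        # SYS_getpid on x86-64 Linux
    return os.getpid()                 # fallback: the portable API wrapper

pid = getpid_via_syscall_number()
```

Both paths return the same value, which is the point: the API hides the table index and the trap mechanics from the programmer.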



System Calls



Types of System Calls
• Process control
  • create process, terminate process
  • end, abort
  • load, execute
  • get process attributes, set process attributes
• File management
  • create file, delete file
  • open, close file
  • read, write file
  • get and set file attributes
• Device management
  • request device, release device
  • read, write
  • get device attributes, set device attributes
  • logically attach or detach devices
• Information Maintenance
• Communication
• Protection
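The file-management calls in the list above map almost one-to-one onto the `os` module's thin wrappers. A small round-trip sketch (the file name and message are illustrative):

```python
import os
import tempfile

# create/open, write, close, open, read, close, delete -- the
# file-management system calls, via their os-module wrappers.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
fd = os.open(path, os.O_CREAT | os.O_WRONLY)   # create file + open
os.write(fd, b"stored via system calls")       # write file
os.close(fd)                                   # close file

fd = os.open(path, os.O_RDONLY)                # open for reading
contents = os.read(fd, 64)                     # read file
os.close(fd)
os.remove(path)                                # delete file
```

Each line here ends up as one kernel entry, which is what makes this slower than buffered library I/O for many small operations.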



Examples of System Calls



OS Structure

• Simple Structure/ Monolithic Kernel


• Layered Approach
• Microkernels
• Modules
• Hybrid System



Simple Structure
• not divided into modules
• interfaces and levels of
functionality are not well
separated
• application programs are able to
access the basic I/O routines to
write directly to the display and
disk drives
• vulnerable to malicious programs,
causing entire system crashes
when user programs fail



UNIX Architecture

• the original UNIX operating


system had limited structuring
• consists of two separable parts
• Systems programs
• kernel
• Consists of everything below the
system-call interface and above the
physical hardware
• Provides the file system, CPU
scheduling, memory management,
and other operating-system
functions; a large number of
functions for one level



Monolithic Kernel
• entire operating system is working in kernel space
• larger in size
• little overhead in system call interface or in
communication within kernel
• faster
• hard to extend
• if a service crashes, whole system is affected
• NOTE:
• There are 2 spaces – user space and kernel space – which are basically
memory regions/address spaces where user code and kernel code reside.
In a monolithic kernel, all OS services run together in kernel space, so
there is no real separation between them. (User mode is a different
concept from user space.)
• e.g., Linux, Solaris, MS-DOS



Layered Approach

• The operating system is divided into


a number of layers (levels), each built
on top of lower layers
• The bottom layer (layer 0), is the
hardware; the highest (layer N) is the
user interface
• With modularity, layers are selected
such that each uses functions
(operations) and services of only
lower-level layers
• Advantage- debugging made easy
Microkernel
• user services and kernel services are in separate address spaces
• smaller in size
• slower
• extendible – all new services are added to user space
• if a service crashes, the working of the microkernel is not affected
• more secure and reliable
• e.g., Mach, QNX, Windows NT (initial release)
• Drawback? Performance overhead of user-space to kernel-space
communication (even user-mode to user-mode communication needs to
pass through kernel mode)
(Figure: the application program, file system and device driver run in user
mode and exchange messages through the microkernel – interprocess
communication, memory management and CPU scheduling – which runs in
kernel mode above the hardware)
A Comparison

macOS X, Windows, iOS



Modules
 loadable kernel modules
 kernel has a core set of components
 links in additional services via modules, either at boot time or during run time
 each module has a well defined interface
 dynamically linking services is preferable to adding new features directly to
the kernel  does not require recompiling the kernel for every change
 better than a layered approach  any module can call any module
 better than microkernel  no message passing required to invoke modules



Modules

• Many modern operating systems implement loadable


kernel modules
• Uses object-oriented approach
• Each core component is separate
• Each talks to the others over known interfaces
• Each is loadable as needed within the kernel
• Overall, similar to layers but with more flexibility
• Solaris
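Loadable modules have a user-level analogy: linking in a new service at run time, through a well-defined interface, without rebuilding the running program. The sketch below uses Python's dynamic import machinery purely as that analogy — it is not a real kernel module; the module name, file and `service` function are invented:

```python
import importlib.util
import os
import tempfile

# Write a "module" providing an extra service, as if shipped separately.
source = "def service():\n    return 'extra service loaded'\n"
path = os.path.join(tempfile.mkdtemp(), "extra_mod.py")
with open(path, "w") as f:
    f.write(source)

# Link it in at run time through a well-defined interface -- the analogue
# of insmod/modprobe loading a kernel module without a kernel rebuild.
spec = importlib.util.spec_from_file_location("extra_mod", path)
extra_mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(extra_mod)

loaded = extra_mod.service()
```

Once loaded, the new code is called directly (no message passing), which mirrors why the module approach avoids the microkernel's communication overhead.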



Operating-System Debugging
• Debugging is finding and fixing errors, or bugs
• The OS generates log files containing error information on process failure
• Failure of an application can generate core dump file capturing
memory of the process for later analysis
• Operating system failure can generate crash dump file containing
kernel memory



Performance Tuning

• Beyond crashes, performance tuning


can optimize system performance
• Improve performance by removing
bottlenecks
• Sometimes using trace listings of
activities, recorded for analysis
• OS must provide means of computing
and displaying measures of system
behavior
• For example, “top” program or
Windows Task Manager
System Boot
• bootstrap program / bootstrap loader
• simple bootstrap loader fetches a more complex
boot program from disk, which in turn loads kernel
• instruction register is loaded with a predefined
memory location where the initial bootstrap
program is located
• diagnostics to determine the state of the machine
• POST (Power-On Self-Test) is the diagnostic testing
sequence that a computer's BIOS (basic
input/output system )(or "starting program") runs
to determine if the computer keyboard, random
access memory, disk drives, and other hardware
are working correctly



System Boot
• if the diagnostics pass, the program can continue with the booting steps
• bootstrap will execute the code present in boot block
• A dedicated block usually at the beginning (first block on first track) of a storage medium that
holds special data used to start a system
• Some systems use a boot block of several physical sectors, while some use only one boot
sector
• If a disk contains a boot block it is called a boot disk
• If a partition contains a boot block it is called a boot partition
• boot block will either contain the remaining bootstrap program or the address on disk and length
of the remainder of the bootstrap program
• GRUB (GRand Unified Bootloader) is an example of an open-source bootstrap program for Linux
systems. GRUB gives the user the choice to boot any of multiple OSes installed on a computer, or to
select a specific kernel configuration available for a particular OS.
• after the full bootstrap program is loaded, it traverses the file system to locate OS kernel, load
kernel into memory and start its execution
Booting process in a gist

• BIOS is stored in the ROM/EPROM called the firmware chip.
• The bootloader is called by the BIOS. It is stored on disk (the first sector or block of the disk). The
BIOS and the bootloader are different things.
• When the CPU starts, it needs instructions; RAM is empty/undefined at this point, so the CPU loads
instructions from the firmware chip (present on the motherboard) where the BIOS resides.
• The BIOS performs the POST test and does diagnostic checking.
• It initializes the different hardware, signalling success with a beep or failure with a combination of beeps.
• After hardware initialization, the BIOS checks the different storage media to find the bootloader,
which may be in the first sector of a disk.
• Once the bootloader is located, the BIOS goes out of the picture and the bootloader takes over.
• The bootloader loads the remaining part of the OS.
• This bootloader in the boot block is the first-level bootloader, which calls the second-level bootloader.
• The second-level bootloader locates the kernel image on secondary memory; the kernel image is then
loaded.
• In UNIX, the init process is the first process that executes when you power on.
OPERATING SYSTEMS (CS F372)
Processes
Dr. Barsha Mitra
BITS Pilani CSIS Dept., BITS Pilani, Hyderabad Campus
Hyderabad Campus
Process Concept
An operating system executes a variety of programs
Batch system – jobs
Time-shared systems – user programs or tasks
Terms job and process used interchangeably
Process – a program in execution; process execution must progress in
sequential fashion



Process Concept
Multiple parts
The program code, also called text section
Current activity including program counter, processor registers
Stack containing temporary data
Function parameters, return addresses, local variables
Data section containing global variables
Heap containing memory dynamically allocated during run time
Process Concept
Program is passive entity stored on
disk (executable file), process is
active
Program becomes process when
executable file loaded into
memory
Execution of program started via GUI
mouse clicks, command line entry of
its name, etc
One program can be several
processes
Consider multiple users
executing the same program
States of Process
new: The process is being created (not yet loaded into main memory and not ready for execution)
running: Instructions are
being executed
waiting: The process is
waiting for some event to
occur
ready: The process is
waiting to be assigned to a
processor
terminated: The process has finished execution
Process Control Block (PCB)
 PCB is a data structure which holds several kinds of information about a particular process
 Each process has its own PCB
 What the PCB holds:
1. Process state – new, ready, running, waiting, etc.
2. Program counter – location of the next instruction to execute
3. CPU registers – contents of all process-centric registers
4. CPU scheduling information- priorities, scheduling queue
pointers, scheduling parameters
5. Memory-management information – memory allocated to the
process
6. Accounting information – CPU used, clock time elapsed since
start, time limits, account nos., process nos.
7. I/O status information – I/O devices allocated to process, list of
open files
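The fields listed above can be collected into a C struct. The sketch below is purely illustrative (the field names and array sizes are invented for the example); a real kernel PCB, such as Linux's task_struct, holds far more state.

```c
#include <assert.h>
#include <string.h>

/* A hypothetical PCB layout mirroring the numbered list above.
 * Field names and sizes are invented for illustration only. */
enum proc_state { NEW, READY, RUNNING, WAITING, TERMINATED };

struct pcb {
    int pid;                           /* process number */
    enum proc_state state;             /* 1. process state */
    unsigned long program_counter;     /* 2. next instruction to execute */
    unsigned long registers[16];       /* 3. saved CPU register contents */
    int priority;                      /* 4. CPU scheduling information */
    unsigned long mem_base, mem_limit; /* 5. memory-management information */
    unsigned long cpu_time_used;       /* 6. accounting information */
    int open_files[16];                /* 7. I/O status: open file descriptors */
};

/* Set up the PCB of a newly created process. */
static void pcb_init(struct pcb *p, int pid)
{
    memset(p, 0, sizeof *p);
    p->pid = pid;
    p->state = NEW;   /* a new process starts in the "new" state */
}
```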
Process Creation
 Parent process creates children processes, which, in turn create other processes, forming a tree of processes
 Ex - In UNIX, the init process is the first process to be created; all other processes on the system are either direct children of init or descendants of init. (Note: nowadays init has been replaced with systemd, so the root process of the tree in a modern UNIX system is systemd)
 Ex - In macOS, launchd is the very first process to be created and executed, so the root process is launchd
 PID of the root process is 1.
 Generally, process identified and managed via a process identifier (pid), integer number
 Resource sharing options (CPU time, memory, files, I/O devices)
 Parent and children share all resources
 Children share subset of parent’s resources
 Parent and child share no resources
 Execution options
 Parent and children execute concurrently
 Parent waits until children terminate

BITS Pilani, Hyderabad Campus


Process Creation
 Address space
 Address space of child is a duplicate of address space of parent (same program and data). (Important: this does not mean both address spaces are the same; they are duplicates. The child has the same program and data variables as the parent, but they are not shared: if the parent has a variable x, the child also has a variable x, but these are two different, independent variables.)
 Child can have a new program loaded into it, which means the child can execute a different program and perform a different task than its parent
 UNIX examples
 fork() system call creates new process
 exec() system call used after a fork() to replace the process’s memory space with a
new program
 ps -el command gives information about all currently running processes (in Windows the same is done using the tasklist command)
Process Creation
 fork()
 address space of child process is a copy of parent process
 both child and parent continue execution at the instruction after fork()
 return code for fork() is 0 for child
 return code for fork() is non-zero (child pid) for parent

 exec()
 loads a binary file into memory and starts execution
 destroys previous memory image
 It tells/allows the child to perform some tasks different than its parent
 call to exec() does not return unless an error occurs, hence any statement written after execlp()
command will not be executed for that process

 wait()
 parent can issue wait() to move out of the ready queue until a child is done
 NOTE: Assume that a parent has 3 child processes and has called wait(). As soon as any one child terminates, the parent resumes execution; i.e., wait() doesn't wait for all children to terminate, even one child's termination is enough for wait() to return. In case two or more children have terminated, the pid of any one of those children is returned to the parent
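The note above can be demonstrated with wait(&status): the call returns the pid of whichever child finished first, and the WEXITSTATUS() macro recovers that child's exit code. A minimal sketch (the exit codes 11 and 22 are arbitrary markers chosen for the example):

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork two children; wait() returns as soon as *any* one of them
 * terminates, handing back that child's pid and exit status. */
static int reap_two_children(void)
{
    pid_t c1 = fork();
    if (c1 == 0)
        exit(11);                     /* first child terminates with code 11 */
    pid_t c2 = fork();
    if (c2 == 0)
        exit(22);                     /* second child terminates with code 22 */

    int reaped = 0, status;
    for (int i = 0; i < 2; i++) {
        pid_t done = wait(&status);   /* whichever child finished first */
        if ((done == c1 && WEXITSTATUS(status) == 11) ||
            (done == c2 && WEXITSTATUS(status) == 22))
            reaped++;
    }
    return reaped;                    /* 2 when both were reaped correctly */
}
```

Either child may be reported first; the pid/status pair always belongs to the child that actually terminated.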
Process Creation
#include <sys/types.h> //pid_t datatype is in this header file
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main()
{
    pid_t pid;
    pid = fork(); /* fork a child process: this creates a child process,
                     returns 0 to the child and the child's pid to the parent */

    /* important: two processes execute from here onwards */

    if (pid < 0) { /* error occurred */
        fprintf(stderr, "Fork Failed");
        return 1;
    }
    else if (pid == 0) { /* child process */
        printf("Child Process\n");
        execlp("/bin/ls", "ls", NULL);
        /* if exec() executes normally, anything written here,
           i.e., below the exec() call, will never be executed */
    }
    else { /* parent process */
        wait(NULL); /* parent waits for child to complete */
        printf("Child Complete");
    }
    return 0;
}
Example 1
Output:
hello
hello
Example 2
Output:
Child Process : 0
I am Parent : 1234
Another possible output:
I am Parent : 1234
Child Process : 0
Example 3
Output:
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
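The code for Example 3 is not included in the text (only the output survived extraction). A sketch consistent with that output, assuming three consecutive fork() calls: each fork doubles the number of processes, so 2 * 2 * 2 = 8 processes each print Hello once.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Each fork() doubles the process count: n forks -> 2^n processes. */
static int processes_after_forks(int nforks)
{
    return 1 << nforks;
}

/* Three forks, then every resulting process prints once. */
static void hello_three_forks(void)
{
    pid_t me = getpid();             /* remember the original process */
    fork();
    fork();
    fork();
    printf("Hello\n");               /* executed by all 8 processes */
    while (wait(NULL) > 0)           /* reap whatever children we created */
        ;
    if (getpid() != me)
        _exit(0);                    /* forked copies stop here; only the
                                        original process returns */
}
```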
Process Creation
int main()
{
    pid_t pid;
    pid = fork();
    if (pid < 0) {
        fprintf(stderr, "Fork Failed");
        return 1;
    }
    else if (pid == 0) {
        printf("Child Process\n");
        printf("child pid = %d\n", getpid());
    }
    else {
        printf("parent pid = %d\n", getpid());
        wait(NULL);
        printf("Child Complete");
    }
    return 0;
}

OUTPUT:
parent pid = 6597
Child Process
child pid = 6598
Child Complete
Process Creation
#include <sys/types.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main()
{
    pid_t pid;
    int x = 10;
    pid = fork();
    if (pid < 0) {
        fprintf(stderr, "Fork Failed");
        return 1;
    }
    else if (pid == 0) {
        printf("Child Process: x = %d\n", x);
        execlp("/bin/ls", "ls", NULL);
    }
    else {
        printf("Parent Process: x = %d\n", x);
        wait(NULL);
        printf("Child Complete");
    }
    return 0;
}

OUTPUT:
Parent Process: x = 10
Child Process: x = 10
a.out Documents examples.desktop MyPrograms Pictures Templates
Desktop Downloads Music parent.c Public Videos
Child Complete
Process Creation
int main()
{
    pid_t pid;
    int x = 10;
    pid = fork();
    if (pid < 0) {
        fprintf(stderr, "Fork Failed");
        return 1;
    }
    else if (pid == 0) {
        x = x + 10;
        printf("Child Process: x = %d\n", x);
    }
    else {
        wait(NULL);
        printf("Parent Process: x = %d\n", x);
        printf("Child Complete");
    }
    return 0;
}

OUTPUT:
Child Process: x = 20
Parent Process: x = 10
Child Complete
Process Creation
int main()
{
    pid_t pid;
    int x = 10;
    pid = fork();
    if (pid < 0) {
        fprintf(stderr, "Fork Failed");
        return 1;
    }
    else if (pid == 0) {
        for (long i = 0; i < 50000000000; i++);
        printf("Child Process: x = %d\n", x);
    }
    else {
        x = x + 10;
        wait(NULL);
        printf("Parent Process: x = %d\n", x);
        printf("Child Complete");
    }
    return 0;
}

OUTPUT:
Child Process: x = 10
Parent Process: x = 20
Child Complete
Process Termination
Process executes last statement and then asks the operating system to
delete it using the exit() system call.
May return status data from child to parent (via wait())
Process’ resources are deallocated by operating system
Parent may terminate the execution of children processes because:
Child has exceeded allocated resources limit
Task assigned to child is no longer required
Parent is exiting and the operating systems does not allow a child
to continue if its parent terminates
Process Termination
 Some operating systems do not allow child to exist if parent has terminated.
 If a process terminates, then all its children must also be terminated.
 cascading termination - All children, grandchildren, etc. are terminated.
 The termination is initiated by the operating system.
 The parent process may wait for termination of a child process by using the wait() system call. The call returns
status information and the pid of the terminated process
pid_t pid;
int status;
pid = wait(&status); //parent can tell which child has terminated
 If no parent is waiting (did not invoke wait() till then), the process is a zombie
(if the child terminates and the parent doesn't call wait(), or calls wait() only after some time, then in the meantime the child is in the zombie state: it has terminated but its entry is still present in the process table. When wait() is called, its entry is deleted and it is no longer a zombie)
 If the parent terminated without invoking wait(), the process is an orphan
(when the child is still running but the parent terminates without calling wait(), that child process is called an orphan process. An orphan process will eventually become a zombie once it terminates.)
How are orphan and zombie records removed from the process table?
The OS does it by having the root process (init/systemd in UNIX) call wait(), so that these zombie and orphan records can be removed from the process table
Zombie Process
int main()
{
    pid_t child_pid = fork();
    if (child_pid > 0) {
        sleep(10);      /* child is a zombie during this interval */
        wait(NULL);     /* after wait(), child is no longer a zombie */
        sleep(200);
    }
    else {
        printf("\n%d", getpid());
        exit(0);
    }
    return 0;
}

ps -el output while the child is a zombie:
F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
1 Z 1001 17404 17403 0 80 0 - 0 - pts/0 00:00:00 a.out <defunct>
Orphan Process
int main()
{
    pid_t child_pid = fork();
    if (child_pid > 0) {
        printf("\nParent process: %d\n", getpid());
        sleep(6);
    }
    else {
        printf("\nParent PID: %d\n", getppid());
        sleep(20);      /* the parent exits meanwhile; the child becomes an orphan */
        printf("\nChild Process: %d", getpid());
        printf("\nParent PID: %d", getppid());   /* now the pid of the new parent */
        exit(0);
    }
    return 0;
}
Interprocess Communication
 Processes within a system may be independent or cooperating
 Cooperating process can affect or be affected by other processes, including
sharing data
 Reasons for cooperating processes:
 Information sharing
 Computation speedup
 Modularity
 Convenience
 Cooperating processes need interprocess communication (IPC)
 Two models of IPC
 Shared memory
 Message passing
In message passing:
• Only small chunks of data can be communicated

In shared memory:
• It is faster than message passing
• A large amount of information can be shared
• One process creates the shared memory segment and other processes attach to that segment to communicate with each other (creation, attaching and detaching require system calls, but reads and writes in shared memory are ordinary memory accesses and don't require system calls)
• Two processes are not allowed to write simultaneously to the shared memory segment, but two or more can read simultaneously
Interprocess Communication
(figure: the two IPC models, message passing and shared memory)
IPC – Shared Memory
 An area of memory shared among the processes that wish to communicate
 The communication is under the control of the user processes, not the operating system, i.e., we don't require system calls to read and write information in the shared memory segment
 Provide a mechanism that allows the user processes to synchronize their actions when they access the shared memory
Producer-Consumer Problem
 Paradigm for cooperating processes, producer process produces information that
is consumed by a consumer process
 unbounded-buffer places no practical limit on the size of the buffer
 bounded-buffer assumes that there is a fixed buffer size
 Shared data reside in a region of memory shared by producer & consumer
#define BUFFER_SIZE 10
typedef struct {
    . . .
} item;

item buffer[BUFFER_SIZE];
int in = 0;
int out = 0;

 Solution is correct, but can only use BUFFER_SIZE-1 elements
Bounded Buffer - Producer
item next_produced;
while (true) {
    /* produce an item in next_produced */
    while (((in + 1) % BUFFER_SIZE) == out)   /* buffer is full */
        ; /* do nothing */
    buffer[in] = next_produced;
    in = (in + 1) % BUFFER_SIZE;
}

Bounded Buffer - Consumer
item next_consumed;
while (true) {
    while (in == out)   /* buffer is empty */
        ; /* do nothing */
    next_consumed = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    /* consume the item in next_consumed */
}
IPC – Message Passing
 Mechanism for processes to communicate and to synchronize their actions
 Processes communicate with each other without resorting to shared variables, no sharing of address space
 IPC facility provides two operations:
 send(message)
 receive(message)
 The message size is either fixed or variable
 If processes P and Q wish to communicate, they need to:
 Establish a communication link between them
 Exchange messages via send() / receive()
Message Passing – Direct
Communication
 Processes must name each other explicitly:
 send (P, message) – send a message to process P
 receive(Q, message) – receive a message from process Q
 Properties of communication link
 Links are established automatically
 Processes only need to know each other's identity
 A link is associated with exactly one pair of communicating processes
 Between each pair there exists exactly one link
Message Passing – Indirect
Communication
 Messages are directed and received from mailboxes (also
referred to as ports)
 Each mailbox has a unique ID
 Processes can communicate only if they share a mailbox
 Properties of communication link
 Link is established only if processes share a common
mailbox
 A link may be associated with many processes
 Each pair of processes may share several communication
links, each link corresponds to one mailbox
Message Passing – Indirect
Communication
 Operations
 create a new mailbox (port)
 send and receive messages through mailbox
 destroy a mailbox
 Primitives are defined as:
 send(A, message) – send a message to mailbox A
 receive(A, message) – receive a message from mailbox A
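The create/send/receive/destroy primitives above map naturally onto POSIX message queues, where the mailbox is a named kernel object. This sketch is not from the slides: the queue name /demo_mailbox and the attribute values are illustrative, and on older Linux systems the program must be linked with -lrt.

```c
#include <fcntl.h>
#include <mqueue.h>
#include <string.h>
#include <sys/stat.h>

/* Indirect communication through a named mailbox, using POSIX message
 * queues as a concrete stand-in for the abstract send(A, message) and
 * receive(A, message) primitives. */
static int mailbox_roundtrip(void)
{
    struct mq_attr attr = { .mq_maxmsg = 4, .mq_msgsize = 64 };

    /* create (or open) mailbox A */
    mqd_t mq = mq_open("/demo_mailbox", O_CREAT | O_RDWR, 0600, &attr);
    if (mq == (mqd_t)-1)
        return -1;

    mq_send(mq, "ping", 5, 0);                         /* send(A, message) */

    char msg[64];                                      /* >= mq_msgsize */
    ssize_t n = mq_receive(mq, msg, sizeof msg, NULL); /* receive(A, message) */

    mq_close(mq);
    mq_unlink("/demo_mailbox");                        /* destroy mailbox A */
    return (n == 5 && strcmp(msg, "ping") == 0) ? 0 : -1;
}
```

Here a single process both sends and receives; any process that opens the same queue name shares the mailbox.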
Synchronization
 Message passing may be either blocking or non-blocking
 Blocking is considered synchronous
 Blocking send -- the sender is blocked until the message is received
by the receiving process or mailbox
 Blocking receive -- the receiver is blocked until a message is
available
 Non-blocking is considered asynchronous
 Non-blocking send -- the sender sends the message and continues
 Non-blocking receive -- the receiver receives:
 A valid message, or
 Null message
Buffering
messages exchanged by communicating processes reside in a temporary
queue
implemented in one of three ways
 Zero capacity – no messages are queued, link can’t have any waiting
messages, sender must block until receiver receives message
 Bounded capacity – queue is of finite length of n messages, sender
need not block if queue is not full, sender must wait if queue full;
receiver need not be blocked irrespective of queue is full or not
 Unbounded capacity – infinite length queue, sender never blocks
Pipe
Acts as a conduit allowing two processes to communicate
Issues:
Is communication unidirectional or bidirectional?
In the case of two-way communication, is it half-duplex (like a walkie-talkie: data travels one way at a time) or full-duplex (like a mobile phone)?
Must there exist a relationship (i.e., parent-child) between the
communicating processes?
Can the pipes be used over a network?
Ordinary pipes –
cannot be accessed from outside the process that created it
parent process creates a pipe and uses it to communicate with a child process
that it created
Named pipes – can be accessed without a parent-child relationship
Ordinary Pipe
Ordinary Pipes allow communication in standard producer-consumer style
Producer writes to one end (the write-end of the pipe)
Consumer reads from the other end (the read-end of the pipe)
Ordinary pipes are unidirectional
Require parent-child relationship between communicating processes
Windows calls these anonymous pipes
fd[0] is the read end of the pipe
fd[1] is the write end of the pipe
Ordinary Pipe
ordinary pipe can’t be accessed from outside the process that
created it
parent process creates a pipe and uses it to communicate with a
child process that it creates via fork()
child inherits the pipe from its parent process like any other file
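The parent-child communication described above can be sketched with pipe() and fork(). This is a minimal illustration, not the slide's own code: the child produces a message on fd[1], the parent consumes it from fd[0], and each side closes the end it does not use.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Producer-consumer over an ordinary (anonymous) pipe inherited
 * across fork(). Returns the number of bytes received. */
static ssize_t pipe_roundtrip(char *buf, size_t len)
{
    int fd[2];
    if (pipe(fd) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == 0) {                    /* child: producer */
        close(fd[0]);                  /* close the unused read end */
        write(fd[1], "greetings", 10); /* 9 chars + terminating '\0' */
        close(fd[1]);
        _exit(0);
    }
    close(fd[1]);                      /* parent: consumer */
    ssize_t n = read(fd[0], buf, len); /* blocks until data arrives */
    close(fd[0]);
    wait(NULL);                        /* reap the child */
    return n;
}
```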
Named Pipe
 Named Pipes are more powerful than ordinary pipes
 Communication is bidirectional
 No parent-child relationship is necessary between the
communicating processes
 Several processes can use the named pipe for communication
 Do not cease to exist if the communicating processes have
terminated
 Provided on both UNIX and Windows systems
 Referred to as FIFOs in UNIX systems
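On UNIX a named pipe (FIFO) is created with mkfifo(). The sketch below is illustrative, not from the slides: the path /tmp/demo_fifo is invented, and a forked child stands in as the writer only for brevity - because the FIFO has a filesystem name, any unrelated process could open it instead.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Communication through a named pipe (FIFO in UNIX terminology). */
static ssize_t fifo_roundtrip(char *buf, size_t len)
{
    const char *path = "/tmp/demo_fifo";   /* illustrative FIFO name */
    unlink(path);                          /* start clean */
    if (mkfifo(path, 0600) == -1)
        return -1;

    if (fork() == 0) {                     /* writer process */
        int wfd = open(path, O_WRONLY);    /* blocks until a reader opens */
        write(wfd, "via fifo", 9);         /* 8 chars + '\0' */
        close(wfd);
        _exit(0);
    }
    int rfd = open(path, O_RDONLY);        /* reader process */
    ssize_t n = read(rfd, buf, len);
    close(rfd);
    wait(NULL);
    unlink(path);                          /* the FIFO persists until removed */
    return n;
}
```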
Message Queue
• asynchronous communication
• messages placed onto the queue are stored until the recipient retrieves them
Typical usage:
• Step 1 − Create a message queue or connect to an already existing message queue (msgget())
• Step 2 − Write into the message queue (msgsnd())
• Step 3 − Read from the message queue (msgrcv())
• Step 4 − Perform control operations on the message queue (msgctl())
Context Switch
 When CPU switches to another
process, system must save state of the
old process and load the state for the
new process via a context switch
 Context of a process represented in
the PCB (CPU registers contents,
process state, memory management
info.)
 Context-switch time is overhead;
system does no useful work while
switching
 Context switch Time is dependent on
hardware support
Process Scheduling
 Maximize CPU use, quickly switch processes onto CPU for time sharing
 Process scheduler selects among available processes for next execution on CPU
 Maintains scheduling queues of processes
 Job queue – set of all processes in the system (all processes submitted to the system; they are not processes whose images have been loaded into main memory; the job queue is stored in secondary storage; initially every process is present in the job queue)
 Ready queue – set of all processes residing in main memory, ready and waiting to execute, generally stored as a linked list of PCBs (the subset of processes in the job queue that have been brought into main memory forms the ready queue)
 Device queues – set of processes waiting for an I/O device (many processes may require the same I/O device, so such processes wait in that device's queue); device queues are present in main memory
 Processes migrate among the various queues
Various Queues
(figure: scheduling queues)
Schedulers
 Short-term scheduler (or CPU scheduler) – selects which process should be executed next
and allocates CPU
 Sometimes the only scheduler in a system
 Short-term scheduler is invoked frequently (milliseconds)  (must be fast)
 Long-term scheduler (or job scheduler) – selects which processes from job queue should
be brought into the ready queue
 Long-term scheduler is invoked infrequently (seconds, minutes)  (may be slow)
 The long-term scheduler controls the degree of multiprogramming (number of
processes in main memory)
 Processes can be described as either:
 I/O-bound process – spends more time doing I/O than computations, many short CPU
bursts
 CPU-bound process – spends more time doing computations; few very long CPU bursts
 Long-term scheduler strives for good process mix
Schedulers
 Medium-term scheduler can be added in time sharing systems if degree of
multiprogramming needs to decrease
 Intermediate level of scheduling
 Remove process from memory, store on disk, bring back in from disk to continue
execution from where it left off: swapping
 Required for improving process mix or for freeing of memory
#include <stdio.h> // READER
#include <stdlib.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

struct my_msgbuf {
    long mtype;
    char mtext[200];
};

int main(void)
{
    struct my_msgbuf buf;
    int msqid;
    long m;
    key_t key;

    if ((key = ftok("writer.c", 'B')) == -1) { /* same key as writer.c */
        perror("ftok");
        exit(1);
    }

    if ((msqid = msgget(key, 0644)) == -1) { /* connect to the queue */
        perror("msgget");
        exit(1);
    }

    printf("Reader: ready to receive messages\n");

    while (1) {
        /* msgtyp 0: receive the first message on the queue, of any type */
        if (msgrcv(msqid, &buf, sizeof(buf.mtext), 0, 0) == -1) {
            perror("msgrcv");
            exit(1);
        }
        m = buf.mtype;
        printf("Reader: %ld %s\n", m, buf.mtext);
    }

    return 0;
}
#include <stdio.h> // WRITER
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

struct my_msgbuf {
    long mtype;
    char mtext[200];
};

int main(void)
{
    struct my_msgbuf buf;
    int msqid; /* used by msgget */
    key_t key; /* used by ftok */

    /* generate a key */
    if ((key = ftok("writer.c", 'B')) == -1) {
        perror("ftok");
        exit(1);
    }

    /* create a message queue */
    if ((msqid = msgget(key, 0644 | IPC_CREAT)) == -1) {
        perror("msgget");
        exit(1);
    }

    printf("Enter lines of text, ^D to quit:\n");
    buf.mtype = 1; /* setting msg type */

    while (fgets(buf.mtext, sizeof(buf.mtext), stdin) != NULL) {
        int len = strlen(buf.mtext);

        /* ignore newline at end, if it exists */
        if (buf.mtext[len-1] == '\n')
            buf.mtext[len-1] = '\0';

        /* send the msg */
        if (msgsnd(msqid, &buf, len+1, 0) == -1) /* +1 for '\0' */
            perror("msgsnd");
    }

    /* remove the msg queue */
    if (msgctl(msqid, IPC_RMID, NULL) == -1) {
        perror("msgctl");
        exit(1);
    }

    return 0;
}
#include <stdio.h> //SHMEM
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <unistd.h>
#include <sys/wait.h>

int main()
{
    int shmid;
    char *shmPtr;
    int n;

    if (fork() == 0) {                   /* child: reader */
        sleep(5); /* to wait for the parent to write */
        if ((shmid = shmget(2041, 32, 0)) == -1)
            exit(1);
        shmPtr = shmat(shmid, 0, 0);
        if (shmPtr == (char *) -1)
            exit(2);
        printf("\nChild Reading ....\n\n");
        for (n = 0; n < 26; n++)
            putchar(shmPtr[n]);
        putchar('\n');
    }
    else {                               /* parent: writer */
        if ((shmid = shmget(2041, 32, 0666 | IPC_CREAT)) == -1)
            exit(1);
        shmPtr = shmat(shmid, 0, 0);
        if (shmPtr == (char *) -1)
            exit(2);
        for (n = 0; n < 26; n++)
            shmPtr[n] = 'a' + n;
        printf("\nParent Writing ....\n\n");
        for (n = 0; n < 26; n++)
            putchar(shmPtr[n]);
        putchar('\n');
        wait(NULL);
        if (shmctl(shmid, IPC_RMID, NULL) == -1) {
            perror("shmctl");
            exit(-1);
        }
    }
    return 0;
}
OPERATING SYSTEMS (CS F372)
Threads
Dr. Barsha Mitra
BITS Pilani CSIS Dept., BITS Pilani, Hyderabad Campus
Hyderabad Campus
Motivation
Most modern applications are multithreaded
Threads run within application
Multiple tasks within the application can be implemented by separate threads
Update display
Fetch data
Spell checking
Answer a network request
Process creation is heavy-weight while thread creation is light-weight
Can simplify code, increase efficiency
Motivation
(figure: Multithreaded Server Architecture)
What is Thread?
Basic unit of CPU utilization
Comprises a thread ID, program counter, registers and stack
Shares with other threads belonging to the same program
code section
data section
OS resources like open files
Benefits
 Responsiveness – may allow continued execution if part of process is blocked,
especially important for user interfaces in interactive environments
 Resource Sharing – threads share resources of process, easier than shared
memory or message passing
 Economy – cheaper than process creation, thread switching has lower overhead
than context switching
 Scalability – multithreaded process can take advantage of multiprocessor
architectures
Multicore Programming
 Multicore or multiprocessor systems put pressure on programmers because they have to write multithreaded programs to fully utilize the system; programming challenges include:
 Identifying tasks
 Balance
 Data splitting
 Data dependency
 Testing and debugging
 Parallelism implies a system can perform more than one task simultaneously;
multiprocessor system each processor does some task at the same time
 Concurrency (the illusion of parallelism) supports more than one task making progress; context switching is fast enough that tasks appear to run in parallel even though they do not
 Single processor / core: the scheduler provides concurrency but not parallelism
Multicore Programming
 Types of parallelism
 Data parallelism – distributes subsets of the same data across multiple cores, same operation on each subset
 Task parallelism – distributing tasks/threads across cores, each thread performing unique operation, threads may be operating on same or different data
Concurrency vs Parallelism
 Concurrent execution on single-core system:
 Parallelism on a multi-core system:
User Threads and Kernel Threads
 User threads - management done by user-level threads library without kernel support
 Three primary thread libraries:
 POSIX Pthreads
 Windows threads
 Java threads
 Kernel threads - Supported and managed by the Kernel
 Examples – virtually all general purpose operating systems support kernel threads,
including:
 Windows
 Solaris
 Linux
 Tru64 UNIX
 Mac OS X
 Note: each user thread is mapped to a kernel thread using a lightweight process (LWP); the minimum number of LWPs required equals the number of concurrent blocking system calls.
Multithreading Models
Many-to-One
One-to-One
Many-to-Many
Many-to-One Model
 Many user-level threads mapped to a single kernel thread
 Thread management done by thread library in user space
 One thread blocking causes all to block
 Multiple threads may not run in parallel on a multicore system because only one can access the kernel at a time
 Few systems currently use this model
 Examples:
 Solaris Green Threads
One-to-One Model
 Each user-level thread maps to a kernel thread
 Creating a user-level thread creates a kernel thread
 More concurrency than many-to-one
 Number of threads per process sometimes restricted due to overhead
 Examples:
 Windows
 Linux
 Solaris 9 and later
Many-to-Many Model
 Allows many user-level threads to be mapped to many kernel threads
 Allows the operating system to create a sufficient number of kernel threads
 When a thread performs a blocking system call, the kernel can schedule another thread for execution
 Solaris prior to version 9
Two-Level Model
 Similar to M:M (many-to-many), except that it allows a user thread to be bound to a kernel thread
 Combination of many-to-many and one-to-one
 Examples
 IRIX
 HP-UX
 Tru64 UNIX
 Solaris 8 and earlier
Thread Libraries
 Thread library provides programmer with API for creating and managing threads
 Two primary ways of implementing
 Library entirely in user space
 Kernel-level library supported by the OS
Pthreads
 May be provided either as user-level or kernel-level
 A POSIX standard (IEEE 1003.1c) API for thread creation and synchronization
 Specification, not implementation
 API specifies the behavior of the thread library; implementation is up to the developers of the library
 Common in UNIX operating systems (Solaris, Linux, Mac OS X)
Pthreads
 int pthread_attr_init(pthread_attr_t *attr);
 int pthread_create(pthread_t *tid, const pthread_attr_t *attr, void *(*start_routine)(void *), void *arg);
 int pthread_equal(pthread_t tid1, pthread_t tid2); // returns 0 if the ids are unequal
 int pthread_join(pthread_t tid, void **retval);
 void pthread_exit(void *ptr);
Pthreads Example
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h> /* for atoi() */

int sum; /* this data is shared by the thread(s) */
void *runner(void *param); /* threads call this function */

int main(int argc, char *argv[])
{
    pthread_t tid; /* the thread identifier */
    pthread_attr_t attr; /* set of thread attributes */

    if (argc != 2) {
        fprintf(stderr, "usage: a.out <integer value>\n");
        return -1;
    }
    if (atoi(argv[1]) < 0) {
        fprintf(stderr, "%d must be >= 0\n", atoi(argv[1]));
        return -1;
    }

    pthread_attr_init(&attr); /* set the default attributes */
    pthread_create(&tid, &attr, runner, argv[1]); /* create the thread */
    pthread_join(tid, NULL); /* wait for the thread to exit */
    printf("sum = %d\n", sum);
}

/* The thread will begin control in this function */
void *runner(void *param)
{
    int i, upper = atoi(param);
    sum = 0;
    for (i = 1; i <= upper; i++)
        sum += i;
    pthread_exit(0);
}

This is an example of synchronous threading.
Pthreads Example
#define NUM_THREADS 10

/* an array of threads to be joined upon */
pthread_t workers[NUM_THREADS];

for (int i = 0; i < NUM_THREADS; i++)
    pthread_join(workers[i], NULL);

Joining 10 threads.
Threading Issues
 fork() and exec() system calls
 Signal handling
 Thread cancellation of target thread
 Thread-local storage

fork() and exec()
 If one thread in a program calls fork(), does the new process duplicate all threads, or is
the new process single-threaded?
 Some UNIX systems have chosen to have two versions of fork(), one that duplicates all
threads and another that duplicates only the thread that invoked the fork() system call
 The exec() system call works in the same way
 if a thread invokes the exec() system call, the program specified in the parameter
to exec() will replace the entire process—including all threads
 If exec() is called immediately after forking, then duplicating all threads is unnecessary,
as the program specified in the parameters to exec() will replace the process
 duplicating only the calling thread is appropriate
 If the separate process does not call exec() after forking, the separate process should
duplicate all threads
Signal Handling
 Signals are used in UNIX systems to notify a process that a particular event has
occurred
 The signal is delivered to a process
 When delivered, signal handler is used to process signals
 Synchronous and asynchronous signals
 Synchronous signals
 illegal memory access, div. by 0
 delivered to the same process that performed the operation generating the signal
 Asynchronous signals
 generated by an event external to a running process
 the running process receives the signal asynchronously
 Ctrl + C, timer expiration
Signal Handling
 Signal is handled by one of two signal handlers:
 default
 user-defined
 Every signal has default handler that kernel runs when handling signal
 User-defined signal handler can override default signal handler
 Some signals can be ignored, others are handled by terminating the process
 For single-threaded, signal is delivered to process
Signal Handling
 Where should a signal be delivered for multi-threaded process?
 Deliver the signal to the thread to which the signal applies
 Deliver the signal to every thread in the process
 Deliver the signal to certain threads in the process
 Assign a specific thread to receive all signals for the process
 Method for delivering a signal depends on the type of signal generated
 synchronous signals need to be delivered to the thread causing the signal and not
to other threads in the process
 some asynchronous signals—such as <Ctrl + C> should be sent to all threads
 Most multithreaded versions of UNIX allow a thread to specify which signals it will
accept and which it will block
 In some cases, an asynchronous signal may be delivered only to those threads that
are not blocking it



Thread Cancellation
Terminating a thread before it has finished
Thread to be canceled is target thread
Two general approaches:
Asynchronous cancellation – one thread terminates the target thread immediately
Deferred cancellation allows the target thread to periodically check if it should be
cancelled, target thread can terminate itself in orderly fashion
What about freeing resources??
Pthread code to create and cancel a thread:
pthread_t tid;
/* create the thread */
pthread_create(&tid, &attr, worker, NULL);
...
/* cancel the thread */
pthread_cancel(tid);
Thread Cancellation
Invoking thread cancellation requests cancellation, but actual cancellation depends on
how the target thread is set up to handle the request
If the thread has cancellation disabled, cancellation remains pending until the thread
enables it
The default cancellation type is deferred
Cancellation occurs only when the thread reaches a cancellation point
Establish a cancellation point by calling pthread_testcancel()
If a cancellation request is pending, a cleanup handler is invoked to release any
acquired resources



Thread-Local Storage

 Thread-local storage (TLS) allows each thread to have its own copy
of data
 TLS data are unique to each thread

 Different from local variables


 Local variables visible only during single function invocation
 TLS visible across function invocations



OPERATING SYSTEMS (CS F372)
CPU Scheduling
Dr. Barsha Mitra
BITS Pilani CSIS Dept., BITS Pilani, Hyderabad Campus
Hyderabad Campus
Basics

• Maximum CPU utilization obtained with multiprogramming


• CPU–I/O Burst Cycle – Process execution consists of a cycle of
CPU execution and I/O wait
• CPU burst followed by I/O burst
• More number of short CPU bursts and less number of long CPU
bursts



CPU Scheduler
Short-term / CPU scheduler selects from among the processes in ready queue, and
allocates the CPU to one of them
CPU scheduling decisions may take place when a process:
1. Switches from running to waiting state
2. Switches from running to ready state
3. Switches from waiting to ready
4. Terminates
Preemptive scheduling – occurs in situations 2 and 3; preemption means forcibly
pausing the running process
Nonpreemptive / cooperative scheduling – once a process is allocated the CPU, it
retains the CPU until it terminates or switches to the waiting state



Dispatcher

Dispatcher module gives control of the CPU to the process


selected by the short-term scheduler; this involves:
switching context
switching to user mode
jumping to the proper location in the user program to restart that
program
Dispatch latency – time it takes for the dispatcher to stop one
process and start another running



Scheduling Criteria

CPU utilization – keep the CPU as busy as possible


Throughput – no. of processes that complete their execution per time unit
Turnaround time – amount of time to execute a particular process, interval
from submission time to completion time, sum of durations spent waiting to
get into memory, waiting in ready queue, executing on CPU, doing I/O
Waiting time – amount of time a process has been waiting in the ready
queue
Response time – amount of time it takes from when a request was
submitted until the first response is produced, not output (for time-sharing
environment), depends on the speed of output device



First-Come, First-Served (FCFS) Scheduling

Process Burst Time


P1 24
P2 3
P3 3
Arrival time is 0
GANTT CHART

Waiting time for P1 = 0; P2 = 24; P3 = 27


Average waiting time: (0 + 24 + 27)/3 = 17
First- Come, First-Served (FCFS)
Scheduling
Suppose that the processes arrive in the order: P2, P3, P1
The Gantt chart for the schedule is:

Waiting time for P1 = 6; P2 = 0; P3 = 3


Average waiting time: (6 + 0 + 3)/3 = 3
Much better than previous case
Non-preemptive
Not applicable for time sharing systems



FCFS: Example
• Draw Gantt chart
• Compute the average wait time, TAT and RT for processes
• Note: if all processes arrive at the same time, schedule them in ascending order of
process id; waiting time = TAT - burst time
• TAT = finish time - arrival time
• Response time: assumed here to equal waiting time (time at which the first response
is produced, measured w.r.t. arrival time)
Process AT BT FT TAT WT RT
P1 0 7 7 7 0 0
P2 0 3 10 10 7 7
P3 0 4 14 14 10 10
P4 0 6 20 20 14 14



FCFS: Example
• Draw Gantt chart
• Compute the average wait time, TAT and RT for processes

Process AT BT FT TAT WT RT
P1 0 7 7 7 0 0
P2 8 3 20 12 9 9
P3 3 4 11 8 4 4
P4 5 6 17 12 6 6



FCFS: Example

• Draw Gantt chart


• Compute the average wait time, TAT and RT for processes
Process AT BT FT TAT WT RT
P1 0 2 2 2 0 0
P2 3 1 4 1 0 0
P3 5 5 10 5 0 0



Shortest-Job-First (SJF) Scheduling

Associate with each process the length of its next CPU burst
 Use these lengths to schedule the process with the shortest time
Use FCFS in case of tie
SJF is optimal – gives minimum average waiting time for a given
set of processes
The difficulty is knowing the length of the next CPU request
For long-term (job) scheduling in a batch system, use the process time
limit that a user specifies when the job is submitted



Determining Length of Next
CPU Burst
Not possible to implement for short-term scheduling
Can only estimate the length – should be similar to the previous one
Then pick process with shortest predicted next CPU burst
Can be done by using the length of previous CPU bursts, using exponential
averaging:
tn = actual length of nth CPU burst
τn+1 = predicted value of the next CPU burst
0 ≤ α ≤ 1, τ0 = constant or overall system average
τn+1 = α tn + (1 − α) τn



Prediction of Length of Next CPU
Burst
 =0
n+1 = n
 =1
 n+1 = tn
If we expand the formula, we get:
n+1 =  tn+(1 - ) tn -1 + … +(1 -  )j  tn -j + … +(1 -  )n +1 0
0 – constant or system average
Since both  and (1 - ) are less than or equal to 1, each successive term has less weight
than its predecessor
Commonly, α set to ½



SJF: Nonpreemptive vs. Preemptive
Can be nonpreemptive or preemptive
The next CPU burst of a newly arrived process may be shorter than
what is left of the currently executing process
Preemptive version called shortest-remaining-time-first



Shortest-Job-First (SJF) Scheduling
Process Arrival Time Burst Time
P1 0.0 6
P2 2.0 8
P3 4.0 7
P4 5.0 3

• Assuming same arrival time for all


Average waiting time = (3 + 16 + 9 + 0) / 4 = 7 ms



Shortest-remaining-time-first
Process Arrival Time Burst Time
P1 0 8
P2 1 4
P3 2 9
P4 3 5
Preemptive SJF Gantt Chart

Average waiting time = [(10-1)+(1-1)+(17-2)+(5-3)]/4 = 26/4 = 6.5 msec


SJF (non-preemptive): Example
• Draw Gantt chart
• Compute the average wait time, TAT and RT for processes

Process AT BT FT TAT WT RT
P1 0 7
P2 0 3
P3 0 4
P4 0 6



SJF (non-preemptive): Example
• Draw Gantt chart
• Compute the average wait time, TAT and RT for processes

Process AT BT FT TAT WT RT
P1 0 7
P2 8 3
P3 3 4
P4 5 6



SJF (Preemptive) / SRTF: Example
• Draw Gantt chart
• Compute the average wait time, TAT and RT for processes

Process AT BT FT TAT WT RT
P1 0 7
P2 8 3
P3 3 2
P4 5 6



Priority Scheduling

A priority number (integer) is associated with each process, generally


starting from 0
The CPU is allocated to the process with the highest priority
(smallest integer  highest priority), tie broken using FCFS
Preemptive
Nonpreemptive
SJF is priority scheduling where priority is the inverse of predicted next
CPU burst time
Problem  Starvation/Indefinite blocking – low priority processes may
never execute
Solution  Aging – as time progresses increase the priority of the process
Nonpreemptive Priority Scheduling
Process Burst Time Priority
P1 10 3
P2 1 1
P3 2 4
P4 1 5
P5 5 2

Arrival time =0 for all


Average waiting time = 8.2 msec
Preemptive Priority Scheduling



Priority (non-preemptive): Example
• Draw Gantt chart (lower number means higher priority)
• Compute the average wait time, TAT and RT for processes
Process AT BT Pri FT TAT WT RT
P1 0 3 5
P2 2 2 3
P3 3 5 2
P4 4 4 4
P5 6 1 1



Priority (preemptive): Example
• Draw Gantt chart
• Compute the average wait time, TAT and RT for processes
Process AT BT Pri FT TAT WT RT
P1 0 3 5
P2 2 2 3
P3 3 5 2
P4 4 4 4
P5 6 1 1



Round Robin (RR) Scheduling
Each process gets a small unit of CPU time (time quantum q), usually 10-100
milliseconds. After this time has elapsed, the process is preempted and added
to the end of the ready queue.
If there are n processes in the ready queue and the time quantum is q, then
each process gets 1/n of the CPU time in chunks of at most q time units at once.
No process waits more than (n-1)q time units until its next time quantum.
If only one process is present in the ready queue, preemption at quantum expiry will
still happen, but no context switch will occur
Timer interrupts every quantum to schedule next process
Performance
q large  FCFS
q small  q must be large with respect to context switch, otherwise
overhead is too high
Round Robin (RR) Scheduling
Process Burst Time
P1 24
P2 3
P3 3
• The Gantt chart is:

Typically, higher average turnaround than SJF, but better response


• q should be large compared to context switch time
• q usually 10ms to 100ms, context switch < 10 usec



Multilevel Queue Scheduling

fixed priority
preemptive
scheduling among
queues
Note: only when a higher-priority queue is
empty will lower queues be
executed; a process arriving in a
higher-priority queue can preempt a
process running in a lower queue



Multilevel Feedback Queue
A process can move between the various queues; aging can be
implemented this way
Separate processes according to the characteristics of their CPU bursts
Multilevel-feedback-queue scheduler is defined by the following
parameters:
number of queues
scheduling algorithms for each queue
method used to determine when to upgrade a process
method used to determine when to demote a process
method used to determine which queue a process will enter when that process
needs service



Multilevel Feedback Queue
Note: only when a higher-priority queue is empty will lower queues be executed; a
process arriving in Q0 can preempt a process running in a lower queue
Three queues:
Q0 – RR with time quantum 8 milliseconds
Q1 – RR time quantum 16 milliseconds
Q2 – FCFS
Scheduling
A new process enters queue Q0, which uses RR
When it gains CPU, job receives 8 milliseconds
If it does not finish in 8 milliseconds, job is moved to queue tail of
Q1
At Q1 process is again served using RR and receives 16 additional
milliseconds
If it still does not complete, it is preempted and moved to tail of
queue Q2
FCFS with I/O
Process AT BT & I/O FT TAT WT RT
P1 0 3, 5, 2
P2 2 4, 3, 3
P3 4 4
P4 6 3, 2, 3



SJF (Non Pre-emptive) with I/O
Process AT CPU-BT BT & I/O Total BT FT TAT WT RT
P1 0 (4 – 2 – 4)
P2 0 (5 – 2 – 5)
P3 0 (2 – 2 – 2)



Thread Scheduling
Distinction between user-level and kernel-level threads
When threads supported, threads scheduled, not processes
Many-to-one and many-to-many models, thread library schedules
user-level threads to run on LWP
Scheme is known as process-contention scope (PCS) since scheduling
competition is within the process (local contention scope)
Typically done via priority set by programmer
Kernel thread scheduled onto available CPU is system-contention
scope (SCS) – competition among all threads in system (global
contention scope)



Pthread Scheduling

API allows specifying either PCS or SCS during thread creation, contention
scope values
PTHREAD_SCOPE_PROCESS schedules threads using PCS scheduling
PTHREAD_SCOPE_SYSTEM schedules threads using SCS scheduling
Can be limited by OS – Linux and Mac OS X only allow
PTHREAD_SCOPE_SYSTEM
• On systems implementing the M:M model, PTHREAD_SCOPE_PROCESS
policy schedules user-level threads onto available LWPs
• PTHREAD_SCOPE_SYSTEM policy will create and bind an LWP for each
user-level thread effectively mapping a user-level thread to a kernel level
thread
Pthread Scheduling

pthread_attr_setscope(pthread_attr_t *attr, int scope)


pthread_attr_getscope(pthread_attr_t *attr, int *scope)
 On success, these functions return 0; on error, they return a nonzero
error number



Pthread Scheduling
#include <pthread.h>
#include <stdio.h>
#define NUM_THREADS 5

void *runner(void *param);

int main(int argc, char *argv[])
{
    int i, scope;
    pthread_t tid[NUM_THREADS];
    pthread_attr_t attr;
    /* get the default attributes */
    pthread_attr_init(&attr);
    /* first inquire on the current scope */
    if (pthread_attr_getscope(&attr, &scope) != 0)
        fprintf(stderr, "Error\n");
    else {
        if (scope == PTHREAD_SCOPE_PROCESS)
            printf("PTHREAD_SCOPE_PROCESS");
        else if (scope == PTHREAD_SCOPE_SYSTEM)
            printf("PTHREAD_SCOPE_SYSTEM");
        else
            fprintf(stderr, "Illegal scope value.\n");
    }
    /* set the scheduling algorithm to PCS or SCS */
    pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    /* create the threads */
    for (i = 0; i < NUM_THREADS; i++)
        pthread_create(&tid[i], &attr, runner, NULL);
    /* now join on each thread */
    for (i = 0; i < NUM_THREADS; i++)
        pthread_join(tid[i], NULL);
}

/* Each thread will begin control in this function */
void *runner(void *param)
{
    /* do some work ... */
    pthread_exit(0);
}



CFS
• default scheduler of Linux kernel
• red-black tree
• processes are given a fair amount of processor time
• keeps track of the amount of processor time provided to a process –
virtual runtime
• smaller a process’s virtual runtime implies the process has been
permitted to access processor for a smaller amount of time, hence higher
is the need for processor



OPERATING SYSTEMS (CS F372)
Synchronization
Dr. Barsha Mitra
BITS Pilani CSIS Dept., BITS Pilani, Hyderabad Campus
Hyderabad Campus
Background

Processes can execute concurrently


May be interrupted at any time, partially completing execution
Concurrent access to shared data may result in data inconsistency
Maintaining data consistency requires mechanisms to ensure the
orderly execution of cooperating processes
Consider the Producer-Consumer problem



Producer - Consumer
Structure of the producer:

while (true) {
    /* produce an item in next_produced */
    while (counter == BUFFER_SIZE)
        ; /* do nothing */
    buffer[in] = next_produced;
    in = (in + 1) % BUFFER_SIZE;
    counter++;
}

Structure of the consumer:

while (true) {
    while (counter == 0)
        ; /* do nothing */
    next_consumed = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    counter--;
    /* consume the item in next_consumed */
}



Race Condition
counter++ could be implemented as
register1 = counter
register1 = register1 + 1
counter = register1

counter-- could be implemented as

register2 = counter
register2 = register2 - 1
counter = register2
Consider this execution interleaving with counter = 5 initially:
S0 producer execute register1 = counter register1 = 5
S1 producer execute register1 = register1 + 1 register1 = 6
S2 consumer execute register2 = counter register2 = 5
S3 consumer execute register2 = register2 – 1 register2 = 4
S4 producer execute counter = register1 counter = 6
S5 consumer execute counter = register2 counter = 4
Critical Section Problem

Consider system of n processes {p0 , p1, … pn-1}


Each process has critical section segment of code
Process may be changing common variables, updating table,
writing file, etc.
When one process in critical section, no other may be in its
critical section
Critical section problem is to design protocol to solve
this
Each process must ask permission to enter critical
section in entry section, may follow critical section with
exit section, then remainder section



Requirements
1. Mutual Exclusion - If process Pi is executing in its critical section, then no
other processes can be executing in their critical sections
2. Progress - If no process is executing in its critical section and some processes
wish to enter their critical sections, then only those processes that are not
executing in their remainder section can participate in deciding which will
enter its critical section next, and this decision cannot be postponed
indefinitely
3. Bounded Waiting - A bound must exist on the number of times that other
processes are allowed to enter their critical sections after a process has made
a request to enter its critical section and before that request is granted



Critical-Section Handling in OS
 Two approaches depending on if kernel is preemptive or
non-preemptive
Preemptive – allows preemption of process when running in
kernel mode
Non-preemptive – runs until exits kernel mode, blocks, or
voluntarily yields CPU
Essentially free of race conditions in kernel mode
Why then anyone would prefer preemptive kernel?????



Peterson’s Solution
Two process software based solution
Assume that the load and store machine-language instructions are
atomic; that is, cannot be interrupted
The two processes share two variables:
int turn;
boolean flag[2];
The variable turn indicates whose turn it is to enter the critical
section
The flag array is used to indicate if a process is ready to enter the
critical section. flag[i] = true implies that process Pi is ready



Algorithm for Process Pi
do {
    flag[i] = true;
    turn = j;
    while (flag[j] && turn == j);
    /* critical section */
    flag[i] = false;
    /* remainder section */
} while (true);

Provable that the 3 CS requirements are met:
1. Mutual exclusion is preserved
2. Progress requirement is satisfied
3. Bounded-waiting requirement is met



Check 3 Requirements

Pi:
do {
    flag[i] = true;
    turn = j;
    while (flag[j] && turn == j);
    /* critical section */
    flag[i] = false;
    /* remainder section */
} while (true);

Pj:
do {
    flag[j] = true;
    turn = i;
    while (flag[i] && turn == i);
    /* critical section */
    flag[j] = false;
    /* remainder section */
} while (true);



Synchronization Hardware
Many systems provide hardware support for implementing the critical
section code
All solutions based on idea of locking
Protecting critical regions via locks
Uniprocessors – could disable interrupts
Currently running code would execute without preemption
Generally too inefficient on multiprocessor systems
Modern machines provide special atomic hardware instructions
Atomic = non-interruptible
Either test memory word and set value
Or swap contents of two memory words



Solution to Critical-section Problem
Using Locks
do {
acquire lock
critical section
release lock
remainder section
} while (TRUE);



test_and_set Instruction
boolean test_and_set (boolean *target)
{
boolean rv = *target;
*target = true;
return rv;
}
1.Executed atomically
2.Returns the original value of passed parameter
3.Sets the new value of passed parameter to true



Solution using test_and_set
Instruction
Shared Boolean variable lock, initialized to false, supports mutual
exclusion and progress but not bounded waiting
do {
while (test_and_set(&lock))
; /* do nothing */
/* critical section */
lock = false;
/* remainder section */
} while (true);



compare_and_swap Instruction
int compare_and_swap(int *value, int expected, int new_value)
{
int temp = *value;
if (*value == expected)
*value = new_value;
return temp;
}
1.Executed atomically
2.Returns the original value of passed parameter value
3.Sets the variable value the value of the passed parameter new_value but
only if value == expected. That is, the swap takes place only under this
condition.
Solution using compare_and_swap
Shared (global) integer lock initialized to 0;
Solution:
do {
while (compare_and_swap(&lock, 0, 1) != 0)
; /* do nothing */
/* critical section */
lock = 0;
/* remainder section */
} while (true);



Bounded-waiting Mutual Exclusion
with test_and_set
common data structures – boolean waiting[n], boolean lock, all initialized to false
do {
waiting[i] = true;
key = true;
while (waiting[i] && key)
key = test_and_set(&lock);
waiting[i] = false;
/* critical section */
j = (i + 1) % n;
while ((j != i) && !waiting[j])
j = (j + 1) % n;
if (j == i)
lock = false;
else
waiting[j] = false;
/* remainder section */
} while (true);



Mutex Locks
Hardware solutions are complicated and generally inaccessible to
application programmers
OS designers build software tools to solve critical section problem
Simplest is mutex lock
Protect a critical section by first acquire() a lock then release() the lock
 mutex lock has a boolean variable available indicating if lock is available or not
Calls to acquire() and release() must be atomic
 Usually implemented via hardware atomic instructions
But this solution requires busy waiting
 This lock therefore called a spinlock



acquire() and release()
acquire() {
    while (!available)
        ; /* busy wait */
    available = false;
}

release() {
    available = true;
}

do {
    acquire lock
        critical section
    release lock
        remainder section
} while (true);

Advantages of spinlocks:
1. no context switch is required when a process must wait on a lock
2. useful when locks are to be held for short times
Disadvantage of spinlocks: busy waiting wastes CPU cycles
Busy waiting – the process, although waiting, is still in the running state



Semaphore
Provides more sophisticated ways (than mutex locks) for process to synchronize
Semaphore S – integer variable
Apart from initialization, can only be accessed via two indivisible (atomic) operations,
wait() and signal(), Originally called P() and V()
Definition of the wait() operation
wait(S) {
while (S <= 0)
; // busy wait
S--;
}
Definition of the signal() operation
signal(S) {
S++;
}
Semaphore Usage
Counting semaphore – integer value can range over an unrestricted domain
ex- when multiple instance of same resource present
Binary semaphore – integer value can range only between 0 and 1
Must guarantee that no two processes can execute the wait() and signal()
on the same semaphore at the same time
ex- when single instance of a resource present



Semaphore Implementation

With each semaphore there is an associated waiting queue


Each semaphore has:
 value (of type integer)
 a list of processes
Two operations:
block – place the process invoking the operation on the appropriate waiting queue
wakeup – remove one of processes in the waiting queue and place it in the ready
queue
typedef struct{
int value;
struct process *list;
} semaphore;
Semaphore Implementation

wait(semaphore *S) {
    S->value--;
    if (S->value < 0) {
        add this process to S->list;
        block(); /* suspend the invoking process */
    }
}

signal(semaphore *S) {
    S->value++;
    if (S->value <= 0) {
        remove a process P from S->list;
        wakeup(P); /* move P to the ready queue */
    }
}



Deadlock and Starvation
Deadlock – two or more processes are waiting indefinitely for an event that can be
caused by only one of the waiting processes
Let S and Q be two semaphores initialized to 1
P0 P1
wait(S); wait(Q);
wait(Q); wait(S);
... ...
signal(S); signal(Q);
signal(Q); signal(S);
Starvation – indefinite blocking; a process may never be removed from the semaphore
queue in which it is suspended



Classical Problems of
Synchronization

Bounded-Buffer Problem
Readers and Writers Problem
Dining-Philosophers Problem



Bounded-Buffer Problem
Buffer of length n, each location can hold one item (int n)
Semaphore mutex initialized to the value 1
Semaphore full initialized to the value 0
Semaphore empty initialized to the value n



Bounded-Buffer Problem
Structure of the producer process:

do {
    ...
    /* produce an item in next_produced */
    ...
    wait(empty);
    wait(mutex);
    ...
    /* add next_produced to the buffer */
    ...
    signal(mutex);
    signal(full);
} while (true);

Structure of the consumer process:

do {
    wait(full);
    wait(mutex);
    ...
    /* remove an item from buffer to next_consumed */
    ...
    signal(mutex);
    signal(empty);
    ...
    /* consume the item in next_consumed */
    ...
} while (true);



Readers-Writers Problem
A data set is shared among a number of concurrent processes
Readers – only read the data set; they do not perform any updates
Writers – can both read and write
Problem – allow multiple readers to read at the same time
Only a single writer can access the shared data at a time; no other process (reader
or writer) should be allowed while a writer is in its critical section
First readers-writers prob.  no reader is kept waiting unless a writer has
already obtained permission to use the shared obj.; writers may starve



Readers-Writers Problem
Shared Data
Semaphore rw_mutex initialized to 1, common to both readers and writers, acts
as a mutual exclusion semaphore for writers, used by first reader that enters CS or last
reader that exits CS, not used by other readers

Integer read_count initialized to 0, keeps track of how many processes are reading
the shared obj.

Semaphore mutex initialized to 1, ensures mutual exclusion when read_count


is updated



Readers-Writers Problem
Structure of a writer process:

do {
    wait(rw_mutex);
    ...
    /* writing is performed */
    ...
    signal(rw_mutex);
} while (true);

Structure of a reader process:

do {
    wait(mutex);
    read_count++;
    if (read_count == 1)
        wait(rw_mutex);
    signal(mutex);
    ...
    /* reading is performed */
    ...
    wait(mutex);
    read_count--;
    if (read_count == 0)
        signal(rw_mutex);
    signal(mutex);
} while (true);
Dining-Philosophers Problem

Philosophers spend their lives alternating b/w


thinking and eating
Don’t interact with their neighbors,
occasionally try to pick up 2 chopsticks (one at a
time) to eat from bowl
Can pickup only one chopstick at a time
Need both to eat, then release both when
done
Each chopstick is represented by a binary semaphore, since each is a distinct
resource that is either available or held
In the case of 5 philosophers
Shared data
Bowl of rice (data set)
Semaphore chopstick[5], each element initialized to 1
Dining-Philosophers Problem
Algorithm
Structure of Philosopher i:
do {
wait (chopstick[i] );
wait (chopstick[ (i + 1) % 5] );

// eat

signal (chopstick[i] );
signal (chopstick[ (i + 1) % 5] );

// think

} while (TRUE);

What is the problem with this algorithm?


Deadlock can occur



Dining-Philosophers Problem
Algorithm
 Deadlock handling
 Allow at most 4 philosophers to be sitting simultaneously at the
table.

 Allow a philosopher to pick up the chopsticks only if both are


available.

 Use an asymmetric solution -- an odd-numbered philosopher


picks up first the left chopstick and then the right chopstick.
Even-numbered philosopher picks up first the right chopstick
and then the left chopstick.



Problems with Semaphores
 Incorrect use of semaphore operations:
 signal(mutex) .... wait(mutex)
   violates mutual exclusion
 wait(mutex) ... wait(mutex)
   leads to deadlock
 Omitting wait(mutex) or signal(mutex) (or both)
   can cause deadlock, starvation, or violation of mutual exclusion
 Deadlock and starvation are possible.



OPERATING SYSTEMS (CS F372)
Deadlocks
Dr. Barsha Mitra
BITS Pilani CSIS Dept., BITS Pilani, Hyderabad Campus
Hyderabad Campus
System Model

System consists of resources


Resource types R1, R2, . . ., Rm
Each resource type Ri has Wi instances.
Each process utilizes a resource as follows:
request
use
release



Necessary Conditions

Deadlock can arise if four conditions hold simultaneously


Mutual exclusion: only one process at a time can use a resource
Hold and wait: a process holding at least one resource is waiting
to acquire additional resources held by other processes
No preemption: a resource can be released only voluntarily by
the process holding it, after that process has completed its task
Circular wait: there exists a set {P0, P1, …, Pn} of waiting processes
such that P0 is waiting for a resource that is held by P1, P1 is waiting
for a resource that is held by P2, …, Pn–1 is waiting for a resource
that is held by Pn, and Pn is waiting for a resource that is held by P0.



Resource-Allocation Graph

A set of vertices V and a set of edges E


V is partitioned into two types:
P = {P1, P2, …, Pn}, the set consisting of all the
processes in the system

R = {R1, R2, …, Rm}, the set consisting of all


resource types in the system

request edge – directed edge Pi  Rj


assignment edge – directed edge Rj  Pi



Examples



Inferring Deadlock

If graph contains no cycles  no deadlock


If graph contains a cycle 
if only one instance per resource type, then deadlock
if several instances per resource type, possibility of deadlock, look for knot

What is a knot?
A collection of vertices and edges s.t. every vertex in the knot has outgoing
edges that terminate at other vertices in the knot
A strongly connected subgraph of a directed graph s.t., starting from any
node in the subgraph, it is impossible to leave the knot by following the edges
of the graph
Deadlock Handling

Ensure that the system will never enter a deadlock state:


Deadlock prevention
Deadlock avoidance
Allow the system to enter a deadlock state and then recover
Ignore the problem and pretend that deadlocks never occur in the
system; used by most operating systems, including UNIX,
application programmers should ensure that deadlocks don’t occur



Deadlock Prevention

Mutual Exclusion – not required for sharable resources (e.g., read-


only files); must hold for non-sharable resources

Hold and Wait – must guarantee that whenever a process


requests a resource, it does not hold any other resources
Require process to request and be allocated all its resources before it
begins execution, or allow process to request resources only when the
process has none allocated to it
Low resource utilization; starvation possible



Deadlock Prevention

No Preemption –
If a process that is holding some resources requests another resource that cannot
be immediately allocated to it, then all resources currently being held are released
Preempted resources are added to the list of resources for which the process is
waiting
Process will be restarted only when it can regain its old resources, as well as the
new ones that it is requesting
When a process P1 requests some resources and they are allocated to some other
waiting process P2, then preempt the desired resources from P2 and give them to P1
If the resources are not allocated to a waiting process, then P1 must wait
While waiting P1’s resources may be preempted



Deadlock Prevention

Circular Wait – impose a total ordering of all resource types, and require that each process requests resources in an increasing order of enumeration
define a 1:1 function F : R → N (N is the set of natural numbers)
Say a process Pi has requested a number of instances of Ri
Later, Pi can request resources of type Rj iff F(Rj) > F(Ri)
Alternatively, if Pi requests an instance of Rj then Pi must have released all instances of Ri s.t. F(Ri) >= F(Rj)
Several instances of the same resource type must be requested in a single request
Proof by contradiction



Deadlock Avoidance

• Requires that the system has some additional a priori information


available
• Requires that each process declare the maximum number of
resources of each type that it may need
• Resource-allocation state is defined by the number of available and
allocated resources, and the maximum demands of the processes
• Deadlock-avoidance algorithm dynamically examines the resource-
allocation state to ensure that there can never be a circular-wait
condition



Safe State

When a process requests an available resource, the system must decide if immediate allocation leaves the system in a safe state
System is in safe state if there exists a sequence <P1, P2, …, Pn> of ALL the processes in the system such that for each Pi, the resources that Pi can still request can be satisfied by currently available resources + resources held by all the Pj, with j < i
That is:
If resources needed by Pi are not immediately available, then Pi can wait
until all Pj have finished
When Pj is finished, Pi can obtain needed resources, execute, return
allocated resources, and terminate
When Pi terminates, Pi +1 can obtain its needed resources, and so on
Inferences

If a system is in safe state ⇒ no deadlocks

If a system is in unsafe state ⇒ possibility of deadlock

Avoidance ⇒ ensure that a system will never enter an unsafe state


Avoidance Algorithms

Single instance of a resource type ⇒ use a resource-allocation graph

Multiple instances of a resource type ⇒ use the banker's algorithm


Resource-Allocation-Graph
Algorithm
Claim edge Pi → Rj indicates that process Pi may request
resource Rj in the future; represented by a dashed line
Claim edge converts to request edge when a process
requests a resource
Request edge converted to an assignment edge when the
resource is allocated to the process
When a resource is released by a process, assignment edge
reconverts to a claim edge
Resources must be claimed a priori in the system
Suppose that process Pi requests a resource Rj
The request can be granted only if converting the request
edge to an assignment edge does not result in the formation
of a cycle in the resource allocation graph
Banker’s Algorithm

Multiple instances

Each process must a priori declare maximum use

When a process requests a resource it may have to wait

When a process gets all its resources it must return them in a finite amount of time



Data Structures for Banker’s
Algorithm
n = number of processes, and m = number of resources types
Available: Vector of length m. If Available[j] = k, then k instances of resource type Rj are available
Max: n x m matrix. If Max [i][j] = k, then process Pi may request at most k instances of
resource type Rj
Allocation: n x m matrix. If Allocation[i][j] = k then Pi is currently allocated k instances of
Rj
Need: n x m matrix. If Need[i][j] = k, then Pi may need k more instances of Rj to complete
its task Need [i][j] = Max[i][j] – Allocation [i][j]
Each row of Max, Allocation and Need can be treated as vectors
If X and Y are two vectors, then X <= Y iff X[i] <= Y[i] for all i = 1, 2, ..., n;
X < Y iff X <= Y and X ≠ Y
Safety Algorithm
1.Let Work and Finish be vectors of length m and n, respectively.
Initialize: Work = Available, Finish [i] = false for i = 0, 1, …, n - 1
2.Find an i such that both:
(a) Finish [i] == false
(b) Needi <= Work
If no such i exists, go to step 4
(The algorithm may require O(m × n²) operations)

3.Work = Work + Allocationi


Finish[i] = true
go to step 2

4.If Finish [i] == true for all i, then the system is in a safe state
Banker’s Algorithm Example
5 processes P0 through P4;
3 resource types: A (10 instances), B (5 instances), and C (7 instances)
Snapshot at time T0:

Process Allocation Max Available Process Need


A B C A B C A B C A B C
P0 0 1 0 7 5 3 3 3 2 P0 7 4 3
P1 2 0 0 3 2 2 P1 1 2 2
P2 3 0 2 9 0 2 P2 6 0 0
P3 2 1 1 2 2 2 P3 0 1 1
P4 0 0 2 4 3 3 P4 4 3 1

The system is in a safe state since the sequence < P1, P3, P4, P2, P0> satisfies safety
criteria
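The safety algorithm can be sketched in Python and run on this snapshot. Variable names are illustrative; note that a snapshot may admit more than one safe sequence, and this sketch returns the one found by scanning processes in index order.

```python
# Snapshot from the example: 5 processes, 3 resource types (A, B, C).
alloc = [[0, 1, 0], [2, 0, 0], [3, 0, 2], [2, 1, 1], [0, 0, 2]]
maxm  = [[7, 5, 3], [3, 2, 2], [9, 0, 2], [2, 2, 2], [4, 3, 3]]
avail = [3, 3, 2]
need  = [[maxm[i][j] - alloc[i][j] for j in range(3)] for i in range(5)]

def safe_sequence(alloc, need, avail):
    """Return a safe sequence of process indices, or None if the state is unsafe."""
    work = list(avail)
    finish = [False] * len(alloc)
    seq = []
    progress = True
    while progress:
        progress = False
        for i in range(len(alloc)):
            if not finish[i] and all(n <= w for n, w in zip(need[i], work)):
                # Pi can run to completion and return its allocation to the pool
                work = [w + a for w, a in zip(work, alloc[i])]
                finish[i] = True
                seq.append(i)
                progress = True
    return seq if all(finish) else None

seq = safe_sequence(alloc, need, avail)
```

Here the sketch finds <P1, P3, P4, P0, P2>, which also satisfies the safety criteria, confirming the state is safe.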
Resource-Request Algorithm
Requesti = request vector for process Pi. If Requesti [j] = k then process Pi wants k instances
of resource type Rj
1. If Requesti <= Needi go to step 2. Otherwise, raise error condition, since the process has exceeded its maximum claim
2. If Requesti <= Available, go to step 3. Otherwise Pi must wait, since resources are not available
3. Pretend to allocate requested resources to Pi by modifying the state as follows:
Available = Available – Requesti ;
Allocationi = Allocationi + Requesti;
Needi = Needi – Requesti ;
If safe ⇒ the resources are allocated to Pi
If unsafe ⇒ Pi must wait, and the old resource-allocation state is restored



Banker’s Algorithm Example:
P1 Requests (1,0,2)
Check that Request <= Available: (1,0,2) <= (3,3,2) ⇒ true
Process Allocation Max Available Process Need
A B C A B C A B C A B C
P0 0 1 0 7 5 3 2 3 0 P0 7 4 3
P1 3 0 2 3 2 2 P1 0 2 0
P2 3 0 2 9 0 2 P2 6 0 0

P3 2 1 1 2 2 2 P3 0 1 1

P4 0 0 2 4 3 3 P4 4 3 1



Deadlock Detection
Allow system to enter deadlock state

Detection algorithm

Recovery scheme



Single Instance of Each Resource
Type
Maintain wait-for graph
Nodes are processes
Pi → Pj if Pi is waiting for Pj

Periodically invoke an algorithm that searches for a cycle in the graph. If there is a cycle, there exists a deadlock

An algorithm to detect a cycle in a graph requires O(n²) operations, where n is the number of vertices in the graph
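The cycle search over a wait-for graph can be sketched as follows. The adjacency-list shape and process names are illustrative, not from the slides.

```python
def has_cycle(wait_for):
    """DFS with three colors: reaching a GRAY vertex again (a back edge)
    means the wait-for graph has a cycle, i.e., a deadlock exists."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in wait_for}

    def dfs(v):
        color[v] = GRAY
        for w in wait_for.get(v, ()):
            if color.get(w, WHITE) == GRAY:      # back edge: cycle found
                return True
            if color.get(w, WHITE) == WHITE and dfs(w):
                return True
        color[v] = BLACK
        return False

    return any(color[v] == WHITE and dfs(v) for v in list(color))
```

For example, P1 → P2 → P3 → P1 is reported as a deadlock, while the same chain without the final edge is not.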



Recovery from Deadlock:
Process Termination
Abort all deadlocked processes

Abort one process at a time until the deadlock cycle is eliminated

In which order should we choose to abort?


Priority of the process
How long process has computed, and how much longer to completion
Resources the process has used
Resources process needs to complete
How many processes will need to be terminated
Is process interactive or batch?



Recovery from Deadlock:
Resource Preemption
Preempt some resources from processes and give these resources to other processes until the deadlock cycle is broken

Selecting a victim – which resources and which processes; minimize cost by deciding the order of preemption

Rollback – return to some safe state, restart the process from that state; total rollback or partial rollback

Starvation – the same process may always be picked as victim; include the number of rollbacks in the cost factor
OPERATING SYSTEMS (CS F372)
Main Memory Management
Dr. Barsha Mitra
BITS Pilani CSIS Dept., BITS Pilani, Hyderabad Campus
Background

Program must be brought (from disk) into memory
Main memory and registers are the only storage the CPU can access directly
Memory unit only sees a stream of addresses + read requests, or address + data and write requests
Register access in one CPU clock (or less)
Main memory can take many cycles, causing a stall
Cache sits between main memory and CPU registers
Protection of memory required to ensure correct operation



Base and Limit Registers

A pair of base and limit registers define the address space
CPU must check every memory access generated in user mode to be sure it is between base and limit for that user
The limit register contains the size of the range a process is allowed to access
Highest address accessible by a process = base + limit - 1
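The base/limit check can be sketched in one line; the base and limit values below are illustrative, not from the slides.

```python
def legal_access(addr, base, limit):
    """A user-mode access is legal iff base <= addr < base + limit."""
    return base <= addr < base + limit

# Illustrative values: a process loaded at 300040 with a 120900-byte range.
BASE, LIMIT = 300040, 120900
```

Any address outside [base, base + limit - 1] would trap to the kernel as an addressing error.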



Hardware Address Protection



Address Binding

Programs on disk, ready to be brought into memory to execute

Addresses are represented in different ways at different stages of a program's life:
 Source code addresses are usually symbolic (logical/virtual addresses)
 Compiled code addresses bind to relocatable addresses
  like, "14 bytes from beginning of this module"
 Linker or loader binds relocatable addresses to absolute addresses (physical addresses)
  like, 74014


Logical vs. Physical Address Space

Logical address – generated by the CPU; also referred to as virtual address
Physical address – address seen by the memory unit
Logical address space is the set of all logical addresses generated by a program
Physical address space is the set of all physical addresses corresponding to the logical addresses generated by a program



Memory-Management Unit (MMU)
Hardware device that at run time maps
virtual address to physical address
Value in the relocation register is added
to every address generated by a user
process
Relocation register
The user program deals with logical
addresses; it never sees the real physical
addresses



Dynamic Loading

Routine is not loaded until it is called
Better memory-space utilization; unused routine is never loaded
All routines kept on disk
Useful when large amounts of code are needed to handle infrequently occurring cases



Swapping

A process can be swapped temporarily out of memory to a backing store, and then brought back into memory for continued execution
Total physical memory space of processes can exceed physical memory; in such cases swapping is done
Backing store – (where the temporarily swapped-out processes are kept) fast disk large enough to accommodate copies of all memory images for all users; must provide direct access to these memory images
Major part of swap time is transfer time
System maintains a ready queue of ready-to-run processes which have memory images on disk
Swap only when free memory is extremely low
Optimize swapping by swapping only portions of processes
Swapping

If the next process to be put on the CPU is not in memory, need to swap out a process and swap in the target process
Context switch time can then be very high
Example: a 100MB process swapping to a hard disk with transfer rate of 50MB/sec
Swap-out time of 2000 ms
Plus swap-in of a same-sized process
Total context-switch swapping component time of 4000 ms (4 seconds)



Contiguous Memory Allocation

Relocation registers used to protect user processes from each other, and from changing operating-system code and data
Relocation register contains value of smallest physical address
Limit register contains range of logical addresses – each logical address must be less than the limit register
MMU maps logical address dynamically



Multiple-Partition Allocation
• Fixed size partitions, Degree of multiprogramming limited by number of partitions
• Variable-partition sizes for efficiency (sized to a given process’ needs)
• Hole – block of available memory; holes of various size are scattered throughout memory
• When a process arrives, it is allocated memory from a hole large enough to accommodate it
• Process exiting frees its partition, adjacent free partitions combined
• Operating system maintains information about:
a) allocated partitions b) free partitions (hole)



Dynamic Storage-Allocation Problem

How to satisfy a request of size n from a list of free holes?

First-fit: Allocate the first hole that is big enough

Best-fit: Allocate the smallest hole that is big enough; must search the entire list, unless ordered by size
Produces the smallest leftover hole

Worst-fit: Allocate the largest hole; must also search the entire list
Produces the largest leftover hole

First-fit and best-fit are better than worst-fit in terms of speed and storage utilization
Fragmentation

External Fragmentation – total memory space exists to satisfy a request, but it is not contiguous
Reduce external fragmentation by compaction
Shuffle memory contents to place all free memory together in one large block
But compaction is expensive, so another method is used: allow the logical address space of processes to be non-contiguous
Internal Fragmentation – allocated memory may be slightly larger than requested memory; this size difference is memory internal to a partition, but not being used



Segmentation (memory management scheme)
A program is a collection of variable-sized segments
A segment is a logical unit such as: main program, procedure, function, method, object, local variables, global variables, stack, symbol table, arrays
Memory-management scheme that supports the user view of memory; the logical space of a process is broken into segments, and these segments are mapped to physical addresses
Logical address space is a collection of segments
Segmentation Architecture
Logical address consists of a two-tuple: <segment-number, offset>
Segment table – maps logical addresses to physical addresses; each table entry has:
segment base – contains the starting physical address where the segment resides in memory
segment limit – specifies the length of the segment
Segment-table base register (STBR) points to the segment table's location in memory
Segment-table length register (STLR) indicates the number of segments used by a program
Segmentation has the problem of external fragmentation, hence compaction can be used
Segmentation Hardware



Problem
Given six memory partitions of 300 KB, 600 KB, 350 KB, 200 KB,
750 KB, and 125 KB (in order), how would the first-fit, best-fit,
and worst-fit algorithms place processes of size 115 KB, 500 KB,
358 KB, 200 KB, and 375 KB (in order)? Note: when a hole is
created it is treated as an independent partition.

First Fit:
300KB 600KB 350KB 200KB 750KB 125KB

Best Fit:
300KB 600KB 350KB 200KB 750KB 125KB

Worst Fit
300KB 600KB 350KB 200KB 750KB 125KB
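The three strategies can be checked with a short simulation (function and variable names are illustrative). Each result lists, per process, the index of the original partition whose hole it lands in, with None meaning the process must wait.

```python
def place(partitions, processes, strategy):
    """Simulate first/best/worst-fit. A leftover hole is treated as an
    independent partition (here: the hole is shrunk in place)."""
    holes = list(partitions)
    result = []
    for size in processes:
        fits = [i for i, h in enumerate(holes) if h >= size]
        if not fits:
            result.append(None)        # no hole is big enough: process waits
            continue
        if strategy == "first":
            i = fits[0]
        elif strategy == "best":
            i = min(fits, key=lambda j: holes[j])
        else:                          # "worst"
            i = max(fits, key=lambda j: holes[j])
        holes[i] -= size               # remainder becomes the new, smaller hole
        result.append(i)
    return result

parts = [300, 600, 350, 200, 750, 125]
procs = [115, 500, 358, 200, 375]
```

Running this shows that first-fit and best-fit place all five processes, while under worst-fit the 375 KB process cannot be placed and must wait.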



Paging (memory management scheme)
Physical address space of a process can be noncontiguous; process is allocated physical memory whenever the latter is available
Avoids external fragmentation
Avoids need for compaction
Divide physical memory into fixed-sized blocks called frames
Size is a power of 2
Divide logical memory into blocks of the same size as the frame size, called pages
Keep track of all free frames
To run a program of size N pages, need to find N free frames and load the program
Set up a page table to translate logical to physical addresses
Still have internal fragmentation, e.g., suppose logical memory size = 1275 KB and page size = 8 KB; then pages = 160, and 5 KB of the last page is wasted
Example
Process A : 4 pages
Process B : 3 pages
Process C : 4 pages
Process D : 5 pages
Main Memory : 15 frames



Address Translation Scheme
Address generated by CPU is divided into:
Page number (p) – used as an index into a page table which contains base address of
each page in physical memory
Page offset (d) – combined with base address to define the physical memory address
that is sent to the memory unit
page number: p (m - n bits)    page offset: d (n bits)

logical address space 2^m bytes and page size 2^n bytes
Here (m - n) = log2(total number of pages)
n = log2(size of one page)
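The split into page number and offset is just a shift and a mask; a minimal sketch (names are illustrative):

```python
def split_address(addr, n):
    """For page size 2**n: the high (m - n) bits are the page number p,
    the low n bits are the offset d."""
    return addr >> n, addr & ((1 << n) - 1)
```

For instance, with 4-byte pages (n = 2), logical address 13 (binary 1101) splits into page number 3 and offset 1.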



Paging Hardware

The page table contains the base address of each frame in physical memory



Paging Model of Logical and
Physical Memory

Actually the page table holds the frame base address, but for ease of understanding we show the frame number here


Paging Example

n = 2 and m = 4
32-byte memory and 4-byte pages



Paging
• Internal fragmentation
• So are small frame sizes desirable?
• But each page table entry takes memory to track



Free Frames

Frame table – one entry for each frame, indicating whether the frame is free or allocated and, if allocated, to which page of which process

Before allocation After allocation



Page Table Implementation
• Page table is kept in main memory
• Page-table base register (PTBR) points to the page table
• Page-table length register (PTLR) indicates size of the page table
• In this scheme every data/instruction access requires two main memory
accesses
• One for the page table stored in main memory and one for the data / instruction
stored in some frame of main memory
• The two memory access problem can be solved by the use of a special fast-
lookup hardware cache called associative memory or translation look-aside
buffers (TLBs)



Associative Memory/TLB
• Associative memory – parallel search, O(1) lookup
• Key-value pair fashion
• Made from cache memory
• Size of TLB << size of page table
Page # Frame #

• Address translation (p, d)
• If p is in associative memory, get frame # out
• Otherwise get frame # from page table in memory
Paging Hardware With TLB



Effective Access Time
• Associative/TLB lookup = ε time units
• Hit ratio = α
• Hit ratio – percentage of times that a page number is found in the associative memory/TLB
• Consider α = 80%, ε = 20ns for TLB search, 100ns for memory access
• Effective Access Time (EAT)
• EAT = 0.80 x 120 + 0.20 x 220 = 140ns


Problem

Consider a logical address space of 32 pages of 1,024 bytes each, mapped onto a physical memory of 64 frames.
a. How many bits are there in the logical address?
b. How many bits are there in the physical address?
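The counts follow directly from the powers of two involved; a quick check:

```python
import math

pages, frames, page_size = 32, 64, 1024

# logical address = page-number bits + offset bits = log2(32) + log2(1024)
logical_bits = int(math.log2(pages)) + int(math.log2(page_size))

# physical address = frame-number bits + offset bits = log2(64) + log2(1024)
physical_bits = int(math.log2(frames)) + int(math.log2(page_size))
```

So the logical address has 5 + 10 = 15 bits and the physical address has 6 + 10 = 16 bits.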



Memory Protection
• Memory protection implemented by associating
protection bit with each frame to indicate if read-only
or read-write access is allowed
• Can also add more bits to indicate page execute-only, and so on
• Valid-invalid bit attached to each entry in the page
table:
• “valid” indicates that the associated page is in the
process’s logical address space, and is thus a legal
page
• “invalid” indicates that the page is not in the
process’s logical address space
• Any violations result in a trap to the kernel



Page Table Structure

• Memory structures for paging can get huge using straight-forward methods
• Cost a lot
• Don’t want to allocate that contiguously in main memory
• Hierarchical Paging
• Hashed Page Tables
• Inverted Page Tables



Hierarchical Page Tables
• Break up the logical address space
into multiple page tables
• Two-level page table
• We then page the page table

• p1 is an index into the outer page table, and p2 is the displacement within the page of the inner page table



Hashed Page Table
• The virtual page number is hashed into a page table
• The page table contains a chain of elements hashing to the same location
• Each element contains (1) the virtual page number, (2) the value of the mapped page frame, and (3) a pointer to the next element
• Virtual page numbers are compared in this chain, searching for a match
• If a match is found, the corresponding physical frame is extracted
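The chained lookup can be sketched with a list-of-buckets structure; the bucket count and numbers below are illustrative, not from the slides.

```python
class HashedPageTable:
    """A sketch of a hashed page table with chaining (bucket count is illustrative)."""
    BUCKETS = 16

    def __init__(self):
        # each bucket holds a chain of (virtual page number, frame) pairs
        self.table = [[] for _ in range(self.BUCKETS)]

    def insert(self, vpn, frame):
        self.table[vpn % self.BUCKETS].append((vpn, frame))

    def lookup(self, vpn):
        # walk the chain at the hashed slot, comparing virtual page numbers
        for v, f in self.table[vpn % self.BUCKETS]:
            if v == vpn:
                return f
        return None  # no mapping: this would raise a page fault
```

Note that virtual page numbers 5 and 21 hash to the same bucket (21 mod 16 = 5), so the chain must be searched to distinguish them.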
Inverted Page Table
• We will have a single common page table for all processes
• Tracks all physical pages (frames)
• One entry for each real page of physical memory (one entry in the page table for one frame in physical memory)
• PID is present in the logical address, as all processes share the common page table
• Each entry consists of the virtual address of the page stored in that real memory location, with information about the process that owns that page
OPERATING SYSTEMS (CS F372)
Virtual Memory Management
Dr. Barsha Mitra
BITS Pilani CSIS Dept., BITS Pilani, Hyderabad Campus
Background

Code needs to be in memory to execute, but the entire program is rarely used
Error code, unusual routines, large data structures
Entire program code not needed at the same time
Consider the ability to execute a partially-loaded program
Program no longer constrained by limits of physical memory
Each program takes less memory while running -> more programs run at the same time
Increased CPU utilization and throughput with no increase in response time or turnaround time
Less I/O needed to load or swap programs into memory -> each user program runs faster



Background
Virtual memory – separation of user logical memory from physical memory
Only part of the program needs to be in memory for execution
Logical address space can therefore be much larger than physical address
space
Allows address spaces to be shared by several processes
More programs running concurrently
Less I/O needed to load or swap processes
• Virtual address space – logical view of how process is stored in memory
• Usually start at address 0, contiguous addresses until end of space
• Meanwhile, physical memory organized in page frames
• MMU must map logical to physical



Demand Paging (a way to implement the virtual memory concept)
Bring a page into memory only when it is needed
Less I/O needed, no unnecessary I/O
Less memory needed
Faster response
More users
Page is needed  reference to it
invalid reference  abort
not-in-memory  bring to memory
Lazy swapper – never swaps a page into memory unless that page will be needed; a swapper that deals with pages is called a pager



Demand Paging

Pager guesses which pages will be used before swapping out again
Pager brings in only those pages into memory
How to determine that set of pages?
Need new MMU functionality to implement demand paging
If pages needed are already memory resident
If page needed and not memory resident
Need to detect and load the page into memory from storage



Valid-Invalid Bit (for demand paging)

With each page table entry a valid-invalid bit is associated
v ⇒ legal and in-memory (memory resident)
i ⇒ either illegal or not-in-memory
Initially the valid-invalid bit is set to i on all entries
During MMU address translation, if the valid-invalid bit in the page table entry is i ⇒ page fault



Valid-Invalid Bit



Page Fault
If there is a reference to a page, the first reference to that page will trap to the operating system: page fault
Operating system looks at an internal table (usually kept with the PCB) to decide:
Invalid reference ⇒ abort
Just not in memory:
Find a free frame
Swap the page into the frame via a disk (secondary memory) operation
Reset the page table to indicate the page is now in memory
Set valid-invalid bit = v
Restart the instruction that caused the page fault
Page Fault



Demand Paging
Pure Demand Paging – even at the start, apart from the bare minimum, don't bring in any pages; only when a demand comes is a page brought in, with no guessing of which pages to bring
Locality of reference - tendency of a processor to access the same
set of memory locations repetitively over a short period of time
Hardware support needed for demand paging
Page table with valid / invalid bit
Secondary memory (swap device with swap space)
Instruction restart



Performance of Demand Paging
• Three major activities
• Service the interrupt
• Read the page
• Restart the process
• Page Fault Rate: 0 <= p <= 1
• if p = 0 no page faults
• if p = 1, every reference is a fault
• Effective Access Time (EAT)
EAT = (1 – p) x memory access + p (page fault overhead
+ swap page out + swap page in )



Demand Paging Example
• Memory access time = 200 nanoseconds
• Average page-fault service time = 8 milliseconds
• EAT = (1 – p) x 200 + p (8 milliseconds)
= (1 – p) x 200 + p x 8,000,000 = 200 + p x 7,999,800
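Plugging in values shows how sensitive EAT is to the fault rate; a minimal sketch using the numbers from this example:

```python
def eat_ns(p, mem_ns=200, fault_ns=8_000_000):
    """EAT = (1 - p) * memory access + p * page-fault service time, in ns."""
    return (1 - p) * mem_ns + p * fault_ns
```

Even a fault rate of just 1 in 1000 accesses (p = 0.001) pushes EAT from 200 ns to about 8.2 microseconds, a slowdown of roughly 40x.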



Page Replacement
Use modify (dirty) bit to reduce overhead of page transfers – only modified pages are
written to disk
Page replacement completes separation between logical memory and physical
memory – large virtual memory can be provided on a smaller physical memory
Find the location of the desired page on disk
Find a free frame:
- If there is a free frame, use it
- If there is no free frame, use a page replacement algorithm to select a victim
frame
- Write victim frame to disk if dirty
Bring the desired page into the (newly) free frame; update the page and frame tables
Continue the process by restarting the instruction that caused the trap
Note: reference string never contains same pages adjacent to one another
First-In-First-Out (FIFO) Algorithm
• Reference string: 7,0,1,2,0,3,0,4,2,3,0,3,2,1,2,0,1,7,0,1
• 3 frames (3 pages can be in memory at a time per process)
7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1
7 7 7 2 2 2 4 4 4 0 0 0 7 7 7
0 0 0 3 3 3 2 2 2 1 1 1 0 0
1 1 1 0 0 0 3 3 3 2 2 2 1

• In FIFO, replace the page which entered the earliest; it doesn't matter if it was a page hit in between — basically follow the fixed-pointer approach as above.
• Adding more frames can cause more page faults!
• Belady’s Anomaly
• How to track ages of pages?
• Just use a FIFO queue
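Both FIFO and LRU (covered below) can be simulated in a few lines; running them on this reference string with 3 frames reproduces the counts from these slides: 15 faults for FIFO and 12 for LRU.

```python
from collections import OrderedDict, deque

REF = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]

def fifo_faults(refs, nframes):
    frames, order, faults = set(), deque(), 0
    for p in refs:
        if p not in frames:
            faults += 1
            if len(frames) == nframes:
                frames.discard(order.popleft())  # evict the earliest arrival
            frames.add(p)
            order.append(p)                      # a hit never moves the pointer
    return faults

def lru_faults(refs, nframes):
    recency, faults = OrderedDict(), 0           # least recently used first
    for p in refs:
        if p in recency:
            recency.move_to_end(p)               # a hit refreshes recency
        else:
            faults += 1
            if len(recency) == nframes:
                recency.popitem(last=False)      # evict the LRU page
            recency[p] = True
    return faults
```

The only difference between the two is whether a hit updates the eviction order: FIFO ignores hits, LRU moves the page to the most-recent end.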
Optimal Algorithm
• Replace page that will not be used for longest period of time
• 9 is optimal for the example
• How do you know this?
• Can’t read the future
• Used for measuring how well your algorithm performs

7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1
7 7 7 2 2 2 2 2 7
0 0 0 0 4 0 0 0
1 1 3 3 3 1 1



Least Recently Used (LRU)
Algorithm
Use past knowledge rather than future
Replace page that has not been used in the most amount of time
Associate time of last use with each page
7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1

7 7 7 2 2 4 4 4 0 1 1 1
0 0 0 0 0 0 3 3 3 0 0
1 1 3 3 2 2 2 2 2 7

12 faults – better than FIFO but worse than OPT


Generally good algorithm and frequently used
But how to implement?
Least Recently Used (LRU)
Algorithm
Counter implementation
Every page table entry has a counter; every time page is referenced through this entry,
copy the clock into the counter
When a page needs to be changed, look at the counters to find smallest value
Stack implementation
Keep a stack of page numbers
Page referenced - move it to the top

LRU and OPT don’t have Belady’s Anomaly



Least Recently Used (LRU)
Algorithm
Using stack implementation



Allocation of Frames

Each process needs a minimum number of frames, decided by the computer architecture
The maximum, of course, is the total number of frames in the system (minus the frames allocated to the OS)
Two major allocation schemes
Equal allocation
Proportional allocation



Frame Allocation

Equal allocation – For example, if there are 100 frames (after allocating frames for the OS) and 5 processes, give each process 20 frames

Proportional allocation – Allocate according to the size of the process
Dynamic, as degree of multiprogramming and process sizes change

si = size of process pi, S = sum of all si
m = total number of frames available
ai = allocation for pi = (si / S) x m

Example: m = 64; assume 2 frames are reserved for the OS, leaving 62
s1 = 10, s2 = 127, S = 137
a1 = (10 / 137) x 62 ≈ 4
a2 = (127 / 137) x 62 ≈ 57
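The proportional split can be checked with integer arithmetic (a sketch; the floor rounding here matches the ≈ values above, and any leftover frames would be handed out separately):

```python
def proportional(sizes, m):
    """a_i = floor((s_i / S) * m), computed in integers to avoid rounding drift."""
    S = sum(sizes)
    return [s * m // S for s in sizes]
```

For s1 = 10, s2 = 127, and m = 62 this yields allocations of 4 and 57 frames, leaving 1 frame unassigned by the floor rounding.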



Global replacement of frames:
If a process wants more frames during execution, it can replace frames of other processes in main memory, i.e., the victim frame can be a frame allocated to some other process
The effect of thrashing is more severe when global replacement is used; still, thrashing is observed with local replacement also
Local replacement of frames:
If a process wants more frames during execution, it cannot replace frames of other processes in main memory



Thrashing

If a process does not have “enough” pages, the page-fault rate is
very high
Page fault to get page
Replace existing frame
But quickly need replaced frame back
This leads to:
Low CPU utilization
Operating system thinking that it needs to increase the degree of
multiprogramming
Another process added to the system

Thrashing ⇒ a process is busy swapping pages in and out; high paging activity
Thrashing



Demand Paging & Thrashing

Why does demand paging work?
Locality model
Locality is a set of pages that are actively used together
Defined by program structure and data structures used
Process migrates from one locality to another
Contains several localities
Localities may overlap

Why does thrashing occur?
⇒ size of locality > total memory size
Limit effects by using local or priority page replacement
Allocate enough frames for a single locality
Working Set Model

Δ ≡ working-set window ≡ a fixed number of page references (usually the most recent), an approximation of the program's locality



Working Set Model
WSS – working set size; the working-set model is a solution to thrashing
WSSi (working set size of Pi) = total number of pages referenced in the most recent Δ (varies in time)
if Δ too small, will not encompass entire locality
if Δ too large, will encompass several localities
if Δ = ∞, will encompass the entire program
D = Σ WSSi ≡ total demand for frames
An approximation of locality
m = number of available frames
if D > m ⇒ thrashing
Policy: if D > m, then suspend or swap out one of the processes
Keeps the degree of multiprogramming as high as possible
Optimizes CPU utilization
Difficulty is to keep track of the working set
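Computing a working-set size from a reference string is a one-liner (the reference string below is illustrative):

```python
def wss(refs, delta):
    """Working-set size: the number of distinct pages among the last
    `delta` page references (the working-set window)."""
    return len(set(refs[-delta:]))
```

For example, with references [1, 2, 1, 3, 1, 2] and Δ = 4, the window [1, 3, 1, 2] contains 3 distinct pages, so the WSS is 3.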
Page Fault Rates (another solution to thrashing)

Page-Fault Frequency
(PFF)
Define upper and lower limits on the page fault rate
If page fault rate is too low ⇒ take away a frame
If page fault rate is too high ⇒ allocate one more frame



Allocating Kernel Memory

Treated differently from user memory


Often allocated from a free-memory pool
Kernel requests memory for data structures of varying sizes
Some kernel memory needs to be contiguous
i.e., for device I/O, h/w devices interact directly with physical memory



Buddy System

Allocates memory from fixed-size segment consisting of physically-


contiguous blocks
Define a maximum and minimum block size
Memory allocated using power-of-2 allocator
Satisfies requests in units sized as power of 2
Request rounded up to next highest power of 2
When smaller allocation needed than is available, current chunk
split into two buddies of next-lower power of 2
Continue until appropriate sized chunk available



Buddy System

Advantage – quickly coalesce smaller chunks


into larger chunk, reduces external
fragmentation
Disadvantage – internal fragmentation



Buddy System
• The buddy system is a memory allocation and management
algorithm
• It manages memory in power-of-two increments
• It splits memory into halves to try to give a best fit



Contd…
• Provides two operations:
• Allocate(A, 2^k): allocates a block of size 2^k and marks it as allocated
• Free(A): marks the previously allocated block A as free and merges it with other
blocks to form a larger block

• Algorithm: assume that a process P of size X needs to be allocated
• If 2^(K-1) < X <= 2^K: allocate the entire block of size 2^K
• Else: recursively divide the block into equal halves and test the condition each time; when
it is satisfied, allocate the block

Note:
1) Allocate buddies best fit, filling from the left chunk to the right chunk
2) If an appropriately sized chunk is already available, do not split or merge any other
chunks; just allocate the available chunk (see Allocate(F) in the example on the next slide)
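The splitting and coalescing logic above can be sketched as a toy allocator (sizes and offsets in KB over a 2^max_order arena; an illustrative model, not a production allocator):

```python
class Buddy:
    """Toy buddy allocator over a 2**max_order KB arena (illustrative only)."""
    def __init__(self, max_order):
        self.max_order = max_order
        self.free = {k: [] for k in range(max_order + 1)}
        self.free[max_order].append(0)          # one big free block at offset 0

    def alloc(self, size):
        order = 0
        while (1 << order) < size:              # round request up to a power of 2
            order += 1
        k = order
        while k <= self.max_order and not self.free[k]:
            k += 1                              # smallest free block that fits
        if k > self.max_order:
            return None                         # out of memory
        addr = self.free[k].pop(0)
        while k > order:                        # split, keeping the left half
            k -= 1
            self.free[k].append(addr + (1 << k))    # right half becomes a free buddy
        return addr

    def dealloc(self, addr, size):
        order = 0
        while (1 << order) < size:
            order += 1
        while order < self.max_order:
            buddy = addr ^ (1 << order)         # a buddy's offset differs in one bit
            if buddy not in self.free[order]:
                break
            self.free[order].remove(buddy)      # coalesce with the free buddy
            addr = min(addr, buddy)
            order += 1
        self.free[order].append(addr)
```

Allocating A (3.5K, rounded to 4K) at offset 0 and then B (1.2K, rounded to 2K) at offset 4 reproduces the first steps of Problem 1; the XOR in `dealloc` is why only true buddies (same parent split) can be fused.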



Problem 1

Consider a memory block of 16K. Perform the following:


Allocate (A: 3.5K)
Allocate (B: 1.2K)
Allocate (C: 1.3K)
Allocate (D: 1.9K)
Allocate (E: 3.2K)
Free (C)
Free (B)
Allocate (F: 1.6K)
Allocate (G: 1.8K)



• 16K Memory Block
16K

• Allocate (A: 3.5K)


4k (A) 4K 8K

• Allocate (B: 1.2K)


4k (A) 2K(B) 2K 8K
• Allocate (C: 1.3K)
4k (A) 2K(B) 2K(C) 8K

• Allocate (D: 1.9K)


4k (A) 2K(B) 2K(C) 2K(D) 2K 4K



• Allocate (E: 3.2K)
4k (A) 2K(B) 2K(C) 2K(D) 2K 4K(E)
• Free (C)
4k (A) 2K(B) 2K 2K(D) 2K 4K(E)
• Free (B)
4k (A) 2K 2K 2K(D) 2K 4K(E)

4k (A) 4K 2K(D) 2K 4K(E)


• Allocate (F: 1.6K)
4k (A) 4k 2K(D) 2K(F) 4K(E)

• Allocate (G: 1.8K)


4k (A) 2K(G) 2K 2K(D) 2K(F) 4K(E)



Tree Structure

 Two free chunks belonging to two different subtrees are not buddies and cannot be fused



Problem 2

• Consider the state of memory after allocating 5 processes


A, B, C, D and E
4k (A) 2K(B) 2K(C) 2K(D) 2K(E) 4K

• What is the state of memory after freeing process D?


4k (A) 2K(B) 2K(C) 2K 2K(E) 4K

• What is the state of memory after freeing Process C?


4k (A) 2K(B) 2K 2K 2K(E) 4K

Note: the two free 2K chunks are not coalesced; they belong to different subtrees, so they are not buddies. Merging them into a single 4K block, as shown below, would be wrong:
4k (A) 2K(B) 4K 2K(E) 4K
Advantages and Disadvantages

Advantage –
• Easy to implement a buddy system (Linux)
• Allocates block of correct size
• It is easy to merge adjacent holes
• Fast to allocate and de-allocate memory
Disadvantage –
• It requires all allocation units to be powers of two
• It leads to internal fragmentation



OPERATING SYSTEMS (CS F372)
Mass Storage
Dr. Barsha Mitra
BITS Pilani CSIS Dept., BITS Pilani, Hyderabad Campus
Hyderabad Campus
Magnetic Disks



Mass Storage Structure
• Magnetic disks provide bulk of secondary storage of modern computers
• Disk in use rotates at 60 to 250 times per second (3,600 to 15,000 RPM)
• Transfer rate is rate at which data flow between drive and computer
• Positioning time (random-access time) is time to move disk arm to desired
cylinder (seek time) and time for desired sector to rotate under the disk head
(rotational latency)
• Head crash results from disk head making contact with the disk surface
• Disks can be removable
• Disk drive is attached to computer via I/O bus
• Buses vary, including ATA, SATA, USB, Fibre Channel (FC), etc.



Disk Structure

Disk drives are addressed as large 1-dimensional arrays of logical


blocks, where the logical block is the smallest unit of transfer

The 1-dimensional array of logical blocks is mapped into the


sectors of the disk sequentially
Sector 0 is the first sector of the first track on the outermost cylinder
Mapping proceeds in order through that track, then the rest of the tracks
in that cylinder, and then through the rest of the cylinders from
outermost to innermost
Physical Address – cylinder number, track number, sector number



Disk Scheduling
Minimize seek time
Disk bandwidth - total no. of bytes transferred divided by the total
time between the first request for service and the completion of the
last transfer
OS maintains queue of I/O requests, per disk or device
Idle disk can immediately work on I/O request, busy disk means
request must be queued
Next we look at some disk-scheduling algorithms: essentially, how to move the disk
arm when I/O requests pile up



FCFS
Total head movement of 640 cylinders



SSTF(Shortest Seek Time First )

Total head movement of 236 cylinders

 Shortest Seek Time First selects


the request with the minimum
seek time from the current head
position
 May cause starvation of some
requests



SCAN /ELEVATOR Algorithm

236 cylinders

•Direction of arm movement is needed
•While moving in a direction, the arm travels all the way to the boundary as long as
requests are still pending
•Once the last pending request has been serviced, there is no need to continue to the
boundary



C-SCAN (circular-scan)

(199 – 53) +
(199 – 0) +
(37 – 0) =
382 cylinders

Usually gives more uniform response time than SCAN
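The head movements quoted on these slides (640 cylinders for FCFS, 236 for SSTF and SCAN) can be checked with a short simulation of the textbook request queue, head starting at cylinder 53:

```python
def fcfs(head, reqs):
    """Service requests in arrival order, summing head movement."""
    total, pos = 0, head
    for r in reqs:
        total += abs(pos - r)
        pos = r
    return total

def sstf(head, reqs):
    """Always service the pending request closest to the current head position."""
    pending, total, pos = list(reqs), 0, head
    while pending:
        r = min(pending, key=lambda x: abs(x - pos))
        total += abs(pos - r)
        pos = r
        pending.remove(r)
    return total

def scan_down(head, reqs):
    """SCAN sweeping toward cylinder 0 first, then reversing; assumes requests
    exist on both sides of the head, and stops at the last request after reversing."""
    return head + max(reqs)     # head -> 0 costs `head`; 0 -> max(reqs) costs max(reqs)

queue = [98, 183, 37, 122, 14, 124, 65, 67]
```

Note that SSTF's 236 here happens to equal SCAN's; in general the two differ, and SSTF can starve requests while SCAN cannot.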
Disk Management
To use a disk to hold files, the operating system still needs to record its own
data structures on the disk
Partition the disk into one or more groups of cylinders, each treated as
a logical disk
Logical formatting- creation of a file system, OS stores the initial file
system data structures on the disk, data structures include maps of free
and allocated space and an initial empty directory



RAID
RAID – redundant arrays of independent disks
multiple disk drives provide reliability via redundancy
Increases the mean time to failure
Mean time to repair – avg. time to replace a failed disk and restore the data on it
Mean time to data loss based on above factors
Mirrored disk (volume)
Data striping – bit-level striping, byte-level striping, block-level striping
RAID levels: RAID 0, RAID 1, RAID 2, RAID 3, RAID 4, RAID 5, RAID 6
RAID 0
Level 0: Non redundant
• Data striping is used for increased
performance but no redundant
information is maintained.
• Striping is done at block level but
without any redundancy.
• Writing performance is best in this
level because due to absence of
redundant information there is no
need to update redundant
information



RAID 1
Level 1: Mirrored
Same data is copied on two different disks. This type of redundancy is called
mirroring. It is the most expensive system. Because two copies of same data
are available in two different disks, it allows parallel read



RAID 2
Level 2: Error correcting codes
This level uses bit-level data striping in place of block-level. It is used with
drives with no built-in error detection technique. Error-correcting codes (ECC)
store two or more extra bits and are used for reconstruction of the data if a
single bit is damaged.



RAID 3
Level 3: Bit-Interleaved parity
Data striping is used and a single parity bit is used for error correction as
well as for detection. Systems have disk controller that detects which disk has
failed. RAID level 3 has a single check disk with parity bit.



RAID 4
Level 4: Block-Interleaved parity
RAID level 4 is similar as RAID level 3 but it has Block-Interleaved parity instead
of bit parity. You can access data independently so read performance is high.



RAID 5
Level 5: Block-Interleaved distributed parity
RAID level 5 distributes the parity block and data on all disks. For each block,
one of the disks stores the parity and the others store data. RAID level 5 gives
best performance for large read and write.
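The parity used in RAID levels 3-5 is a bytewise XOR across the data blocks of a stripe; any single lost block can be rebuilt by XOR-ing the survivors. A minimal sketch:

```python
def parity(blocks):
    """Bytewise XOR over equal-length blocks (bytes objects)."""
    out = bytearray(len(blocks[0]))
    for blk in blocks:
        for i, byte in enumerate(blk):
            out[i] ^= byte
    return bytes(out)

# One-byte "blocks" for illustration
d1, d2, d3 = b"\x0f", b"\xf0", b"\xaa"
p = parity([d1, d2, d3])          # parity block, stored on one of the disks
recovered = parity([d1, p, d3])   # if d2's disk fails, XOR the survivors
```

In RAID 5 the disk holding `p` rotates from stripe to stripe, which spreads the parity-update load across all disks.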



RAID 6
Level 6: P+Q Redundancy Scheme
What happens if more than one disk fails at a time? This level stores extra
redundant information to save the data against multiple disk failures. It uses
Reed-Solomon codes (ECC) for data recovery. Two different algorithms are
employed



OPERATING SYSTEMS (CS F372)
File System
Dr. Barsha Mitra
BITS Pilani CSIS Dept., BITS Pilani, Hyderabad Campus
Hyderabad Campus
File Attributes
Name – only information kept in human-readable form
Identifier – unique tag (number) identifies file within file system, non-human-
readable
Type – needed for systems that support different types
Location – pointer to file location on device
Size – current file size
Protection – controls who can do reading, writing, executing
Time, date, and user identification – data for protection, security, and usage
monitoring
Information about files is kept in the directory structure maintained on the disk



File Operations
File is an abstract data type
Create
Write – at write pointer location
Read – at read pointer location
Reposition within file - seek
Delete
Truncate
OS maintains an open-file table
For a requested file operation, file is specified via an index into the table
When file is closed, OS removes entry from file table



File Types



Access Methods
• Sequential Access
read_next()
write_next()
reset()

• Direct Access/Relative Access – file consists of fixed length logical records


read(n)
write(n)
position_file(n)

n = relative block number w.r.t beginning of file



Directory
• Any entity containing a file system is called a volume
• The directory is organized logically to obtain
• Efficiency – locating a file quickly
• Naming – convenient to users
• Two users can have same name for different files
• The same file can have several different names
• Grouping – logical grouping of files by properties
• Directory Operations
• Search for a file
• Create a file
• Delete a file
• List a directory
• Rename a file
• Traverse the file system
Single-Level Directory
• A single directory for all users
• The entire system contains only one directory, which lists all files present in the file
system
• Directory contains one entry for each file present on the file system

• Naming problem
• Grouping problem



Two-Level Directory
Separate directory for each user

User name and file name define a path


MFD (master file directory) is indexed by user name/account number
Can have the same file name for different users
Efficient searching: only the UFD (user file directory) is searched for creation or deletion
Creation and deletion of user directories is done by the admin
Sharing not possible



Tree-Structured Directory
 Efficient searching
 Directory is a special file
 Current directory (working directory)
 cd /spell/mail/prog
 type list
 Absolute or relative path name
 Creating a new file is done in current directory
 Delete a file - rm <file-name>
 Creating a new subdirectory is done in current directory
 mkdir <dir-name>
 With permissions, one user can access another’s files

Deleting “mail” ⇒ deleting the entire subtree rooted at “mail”



Tree-Structured Directory
 Absolute path name –
root/spell/mail/copy/all
 If you are inside root/spell/mail, then the
relative path name for all is copy/all



Acyclic-Graph Directory

 Have shared subdirectories and files


 Two different absolute path names
 Dangling pointer



General Graph Directory

• How do we guarantee no cycles?


• Allow only links to file not subdirectories
• Garbage collection - mechanism to
determine when the last of the references of
a file has been deleted and the disk space
can be reallocated
• Every time a new link is added use a cycle
detection algorithm to determine whether it
is OK



File System Mounting
A file system must be mounted before it can be accessed
Requires the device name (e.g., pen drive, CD) and the mount point (location within the file
structure where the file system is to be attached)
An unmounted file system is mounted at a mount point



OPERATING SYSTEMS (CS F372)
File System Implementation
Dr. Barsha Mitra
BITS Pilani CSIS Dept., BITS Pilani, Hyderabad Campus
Hyderabad Campus
File-System Structure
• File structure
• Logical storage unit
• Collection of related information
• File system resides on secondary storage (disks)
• Provides efficient and convenient access to disk by allowing data to be stored, located
retrieved easily
• Disk provides in-place rewrite and random access
• I/O transfers performed in blocks of one or more sectors (a sector is usually 512 bytes)
• File control block – storage structure consisting of information about a file; in
UNIX it is the inode
• Device driver controls the physical device
• File system organized into layers
File System Layers
• Device drivers manage I/O devices at the I/O control layer
• Basic file system – gives generic commands to device driver to read and write
physical disk blocks
• Also manages memory buffers and caches (allocation, freeing, replacement)
• Buffers hold data in transit
• Caches hold frequently used file system metadata
• File organization module understands files, logical blocks, and physical
blocks, maps logical block address to physical block address
• Logical file system manages metadata information
• Translates file name into file number, file handle, location by maintaining FCBs (file
control blocks)
• Directory management
• Protection
File System Implementation
• Boot control block contains info. needed by system to boot OS from that
volume
• Needed if volume contains OS, usually first block of volume
• Volume control block (superblock, master file table) contains
volume/partition details
• Total # of blocks, # of free blocks, block size, free block pointers or array
• Directory structure organizes the files



File System Implementation
• Per-file File Control Block (FCB) contains many details about the file
• unique identifier to allow association with a directory entry, permissions, size,
dates



Directory Implementation

• Linear list of file names with pointer to the data blocks


• Simple to program
• Time-consuming to execute
• Linear search time
• Could keep ordered alphabetically via linked list or use B+ tree
• Hash Table – linear list (for directory entries) with hash data structure
• Decreases directory search time
• Collisions – situations where two file names hash to the same location
• Only good if entries are fixed size, or use chained-overflow method



Allocation Methods - Contiguous
• Allocation method refers to how disk blocks are allocated for files
 Mapping from logical to physical
• Contiguous allocation – each file
occupies set of contiguous blocks
• Best performance in most cases
• Simple – only starting location (block #)
and length (number of blocks) are
required
• Problems include finding space for file,
knowing file size, external
fragmentation, need for compaction off-
line (downtime) or on-line
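The logical-to-physical mapping for contiguous allocation is simple arithmetic; a sketch assuming 512-byte blocks:

```python
BLOCK_SIZE = 512

def contiguous_map(logical_addr, start_block):
    """Logical address -> (physical block, displacement) under contiguous allocation."""
    q, r = divmod(logical_addr, BLOCK_SIZE)   # q = block within the file, r = offset in block
    return start_block + q, r
```

For example, logical address 1030 in a file starting at block 100 maps to physical block 102, displacement 6.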



Allocation Methods - Linked
• Linked allocation – each file a linked
list of blocks
• Space wastage, as a pointer to the next block is also stored in every block (or cluster of
blocks)
• File ends at nil pointer
• Each block contains pointer to next block
• No compaction, no external
fragmentation
• Free space management system called
when new block needed
• Improve efficiency by clustering blocks
into groups
• Reliability can be a problem
• Locating a block can take many I/Os and disk seeks
Allocation Methods - File Allocation Table

FAT (File Allocation Table)


Beginning of volume has table, indexed by block number
Much like a linked list allocation but without the
overhead of pointers
Next Block-number stored instead of pointer
New block allocation simple
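Following a file's chain through the FAT is just repeated table lookup. The block numbers below match the classic textbook figure (start block 217 → 618 → 339) and are purely illustrative:

```python
# The FAT entry for each block holds the next block number; -1 marks end of file.
fat = {217: 618, 618: 339, 339: -1}

def file_blocks(start, fat):
    """Collect a file's blocks by following its FAT chain to end of file."""
    chain, b = [], start
    while b != -1:
        chain.append(b)
        b = fat[b]
    return chain
```

Because the whole table sits at the start of the volume, random access needs FAT lookups but no disk-resident pointer chasing through the data blocks themselves.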



Allocation Methods - Indexed
Indexed allocation - Each file has its own index block of pointers to its data
blocks

index table
Cluster pointers into single block called index block
 Need index table
 Random access
 Dynamic access without external fragmentation, but have
overhead of index block



Free Space Management (bitmap & linked list)
 File system maintains free-space list to track available blocks/clusters
 (Using term “block” for simplicity)
 Bit vector or bit map (n blocks)

For n blocks, the bit map is indexed 0, 1, 2, …, n-1:
bit[i] = 1 ⇒ block[i] free
bit[i] = 0 ⇒ block[i] occupied

First free block number = (number of bits per word) × (number of 0-value words) + offset of first 1 bit
CPUs have instructions to return offset within word of first “1” bit
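The formula can be sketched directly (8-bit words for brevity; bit = 1 marks a free block, scanning from the most significant bit):

```python
def first_free_block(words, bits_per_word=8):
    """Skip all-zero (fully occupied) words, then return
    bits_per_word * word_index + offset of the first 1 bit (MSB first)."""
    for w, word in enumerate(words):
        if word != 0:                       # word contains at least one free block
            offset = 0
            while not (word >> (bits_per_word - 1 - offset)) & 1:
                offset += 1                 # offset of first 1 bit within the word
            return bits_per_word * w + offset
    return -1                               # no free block

# Words 0 and 1 are fully occupied; the first free bit is bit 2 of word 2 -> block 18
```

Real CPUs replace the inner loop with a single find-first-set style instruction, which is what makes the bitmap scan fast in practice.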



Linked List

 Linked list (free list of free


blocks)
 Cannot get contiguous
space easily
 No waste of space
 No need to traverse the
entire list (if # free blocks
recorded)
 Some space wasted in
each block (to hold pointer
to next free block)



Free Space Management
IMPROVEMENT IN LINKED LIST APPROACH:
 Grouping
 Modify linked list to store address of next n-1 free blocks in first free
block, plus a pointer to next block that contains free-block-pointers
(like this one)

 Counting
 Because space is frequently contiguously used and freed
 Keep address of first free block and count of following free blocks
 Free space list then has entries containing addresses and counts
OPERATING SYSTEMS (CS F372)
I/O Systems
Dr. Barsha Mitra
BITS Pilani CSIS Dept., BITS Pilani, Hyderabad Campus
Hyderabad Campus
Overview
• I/O management is a major component
• Ports, buses, device controllers connect to various devices
• Device drivers encapsulate device details
• Common concepts – signals from I/O devices interface with computer
• Port – connection point for device
• Bus – daisy chain (device A connected to B, B to C, C to the computer) or shared direct access
• PCI bus common in PCs
• expansion bus connects relatively slow devices
• SCSI BUS
• Controller (host adapter) – electronics that operate port, bus, device
• Sometimes integrated on a chip
• Sometimes separate circuit board plugging into the computer, contains processor,
microcode, private memory
I/O Hardware

• I/O instructions control devices


• Controllers usually have registers where device driver places processor
commands, addresses, and data to write, or read data from registers after
command execution
• Data-in register – read by host/CPU to get i/p
• Data-out register – written by host to send o/p
• Status register – contains bits that can be read by host, bits indicate states
like completion of current command, availability of byte to be read from
data-in register or occurrence of device error
• Control register – written by host to start a command



I/O Protocol: Polling
1. Host reads busy bit from status register until made 0 by controller
2. Host sets read or write bit in command register and if write bit is set, writes a data byte
into data-out register
3. Host sets command-ready bit in command register
4. Controller sees command-ready bit and sets busy bit
5. Controller sees write command in command register, reads byte from data-out register
and completes the I/O
6. Controller clears busy bit, error bit (status reg.), clears command-ready bit when transfer
done
• Step 1 is busy-waiting or polling
• Reasonable if the device is fast, but inefficient if the device is slow; CPU cycles are wasted in polling
• Could the CPU switch to other tasks instead?
• Then how does the CPU know the controller is idle? The controller's buffer may overflow; the solution to this is interrupts
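The cost of the busy-wait in step 1 can be made concrete with a toy controller model (entirely illustrative) that counts the CPU cycles burned while polling the busy bit:

```python
class ToyController:
    """Toy device controller: reports busy for a fixed number of status reads."""
    def __init__(self, busy_reads):
        self.busy_reads = busy_reads

    def busy_bit(self):
        self.busy_reads -= 1
        return self.busy_reads >= 0

def polled_write(ctrl):
    wasted = 0
    while ctrl.busy_bit():   # step 1: spin on the busy bit in the status register
        wasted += 1          # each iteration is a CPU cycle doing no useful work
    # steps 2-3: write the data-out register, set command-ready bit (elided)
    return wasted
```

A slow device makes `busy_reads` large, and every one of those cycles is lost to the spin loop; interrupt-driven I/O lets the CPU run other work instead.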
Interrupts
Interrupt-driven I/O:
• CPU interrupt-request line triggered by I/O device
• Checked by processor after each instruction
• Two types of interrupts: maskable (can be ignored or delayed) and non-maskable (critical;
cannot be ignored by the CPU)
• Interrupt handler receives interrupts
• Maskable to ignore or delay some interrupts
• Interrupt vector to dispatch interrupt to correct handler
• Context switch at start and end
• Some interrupts are non-maskable (e.g., unrecoverable memory errors)



Interrupts

• Interrupt mechanism also


used for exceptions
• Terminate process, system
crash due to hardware
error
• Page fault executes when
memory access error
• System call executes via
trap to trigger kernel to
execute request



Direct Memory Access (IO-method)
• Used for large data movement
• Requires DMA controller (special-purpose processor)
• Bypasses CPU to transfer data directly between I/O device and memory
• OS writes DMA command block (a data structure) into memory; the DMA command block
contains:
• Source and destination addresses
• Read or write mode
• Count of bytes
• CPU writes location of command block to DMA controller
• DMA performs transfer without help of CPU



Direct Memory Access
• Handshaking b/w DMA controller and device controller is done via DMA-request and
DMA-acknowledge wires
• Device controller places a signal on the DMA-request wire when a data word is available
for transfer
• DMA controller takes control of memory bus, places the desired address on memory-
address wires and places a signal on DMA-acknowledge wire
• Device controller on receiving DMA-acknowledge signal, transfers data word to memory
and removes DMA-request signal
• When transfer is complete, DMA controller interrupts CPU
• Cycle stealing (for some cycles, the CPU cannot access the memory bus because the DMA
controller is using it)



I/O Interface
• I/O system calls encapsulate device behaviors in generic classes
• Device-driver layer hides differences among I/O controllers from I/O
subsystem of kernel
• Devices vary in many dimensions
• Character-stream (transfers one byte of data at a time, e.g., a monitor) or block (e.g., hard disk)
• Sequential (e.g., modem) or random-access (e.g., CD, Blu-ray disc)
• Synchronous or asynchronous (or both)
• Sharable or dedicated
• Speed of operation
• read-write, read only, or write only



I/O Interface
Block & Character Devices
• Block devices include disk drives
• Commands include read, write, seek
• Character devices include keyboards, mice, modems, printers
• Commands include get(), put()

Types of I/O
• Blocking
• Nonblocking – input from keyboard and mouse
• Asynchronous – disk and n/w I/O
Other Aspects

• Error Handling
• OS can recover from disk read errors, unavailable devices, and transient write failures
• Retry a read or write, for example
• Most return an error number or code when I/O request fails
• System error logs hold problem reports
• I/O Protection
• User process may accidentally or purposefully attempt to disrupt normal
operation via illegal I/O instructions
• All I/O instructions defined to be privileged
• I/O must be performed via system calls
OPERATING SYSTEMS (CS F372)
Protection
Dr. Barsha Mitra
BITS Pilani CSIS Dept., BITS Pilani, Hyderabad Campus
Hyderabad Campus
Goals of Protection

• Computer consists of a collection of objects, hardware or software


• Each object can be accessed through a well-defined set of operations
• Protection problem - ensure that each object is accessed correctly and only
by those processes that are allowed to do so
• policy and mechanism



Principles of Protection

• Guiding principle – principle of least privilege


• Programs, users and systems should be given just enough privileges to perform their
tasks
• Limits damage if entity has a bug, gets abused
• Can be static (during life of system, during life of process)
• Or dynamic (changed by process as needed) – privilege escalation
• “Need to know” a similar concept regarding access to data



Access Matrix
• View protection as a matrix (access matrix)
• Rows represent domains
• Columns represent objects
• Access(i, j) is the set of operations that a process executing in
Domain_i can invoke on Object_j
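A tiny illustrative encoding of the matrix (these domains, objects, and rights are hypothetical, in the style of the usual textbook figure):

```python
# Access(i, j): the set of operations domain i may invoke on object j.
# Missing entries mean no access at all.
access = {
    ("D1", "F1"): {"read"},
    ("D1", "F3"): {"read"},
    ("D2", "printer"): {"print"},
    ("D3", "F2"): {"read", "execute"},
}

def allowed(domain, obj, op):
    """A process in `domain` may perform `op` on `obj` iff op is in Access(domain, obj)."""
    return op in access.get((domain, obj), set())
```

Storing only the non-empty cells reflects the fact that real access matrices are sparse, which is why systems implement them as access lists (per column) or capability lists (per row) rather than as a full table.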


