
Geethanjali College of Engineering and Technology

(An UGC Autonomous Institution)


Cheeryal (V), Keesara (M), Medchal District – 501 301 (T.S)

OPERATING SYSTEMS
COURSE FILE

(Course code: 20CS22003)

DEPARTMENT OF
CSE-ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING
(2024-2025)

Course Coordinator: Mrs. M. Supriya, Assistant Professor
HOD-AIML: Dr. G. Bindu Madhavi

1
2
Contents

S.No Topic Page. No.


1 Cover Page 3
2 Syllabus copy 4
3 Vision of the Department 6
4 Mission of the Department 6
5 PEOs and POs 7
6 Course objectives and outcomes 9
7 Brief notes on the importance of the course and how it fits into the curriculum 9
8 Prerequisites if any 12
9 Instructional Learning Outcomes 13
10 Course mapping with POs 15

11 Class Time Table 16


12 Individual Time Table 16
13 Lecture schedule with methodology being used/adopted 17
14 Detailed notes 19

15 Additional topics 186

16 University Question papers of previous years 188

17 Question Bank 208

18 Assignment Questions 216

19 Unit wise Objective Questions 220


20 Tutorial problems 233

21 Known gaps, if any, and inclusion of the same in lecture schedule 234

22 Discussion topics, if any 234

23 References, Journals, websites and E-links if any 236

24 Quality Measurement Sheets 236


A Course End Survey
B Teaching Evaluation
25 Student List 236

26 Group-Wise students list for discussion topic 236

3
GEETHANJALI COLLEGE OF ENGINEERING AND TECHNOLOGY
DEPARTMENT OF AIML

Name of the Subject : OPERATING SYSTEMS


Course code : 20CS22003 Programme : UG

Branch: AIML (CSE) Version No : 1


Year: II Updated on : 28.02.2025
Semester: II No. of pages :276

Classification status (Unrestricted / Restricted )

Distribution List : Department, Library, Faculty, Students of II Year B.Tech. II Semester

Prepared by : 1) Name : Mrs. M. Supriya


Sign :
Design : Assistant Professor
Date :28-02-2025

Verified by : 1) Name : * For Q.C Only.


2) Sign : 1) Name :
3) Design : 2) Sign :
4) Date : 3) Design :
4) Date :
Approved by : (HOD ) 1) Name : Dr.G.Bindu Madhavi
2) Sign :
3) Date :

4
5
6
3. Vision of the Department

To produce globally competent and socially responsible computer science engineers contributing to the
advancement of engineering and technology, which involves creativity and innovation, by providing an excellent
learning environment with world-class facilities.

4. Mission of the Department

1. To be a Centre of excellence in instruction, innovation in research and scholarship, and service to the
stakeholders, the profession, and the public.

2. To prepare graduates to enter a rapidly changing field as competent computer science engineers.

3. To prepare graduates capable in all phases of software development, possessing a firm understanding of hardware
technologies, having the strong mathematical background necessary for scientific computing, and being sufficiently
well versed in general theory to allow growth within the discipline as it advances.

4. To prepare graduates to assume leadership roles by possessing good communication skills, the ability to work
effectively as team members, and an appreciation for their social and ethical responsibility in a global setting.

PROGRAM EDUCATIONAL OBJECTIVES (PEOs) OF B.Tech. (CSE) PROGRAM:

Program Educational Objectives (PEOs) are broad statements that describe what graduates are expected to
attain within a few years of graduation. The PEOs for Computer Science and Engineering graduates are:

PEO-I: To provide graduates with a good foundation in mathematics, sciences and engineering fundamentals
required to solve engineering problems, which will enable them to find employment in industry and/or to
pursue postgraduate studies with an appreciation for lifelong learning.

PEO-II: To provide graduates with analytical and problem-solving skills to design algorithms and other
hardware/software systems, and to inculcate professional ethics and inter-personal skills to work in a
multi-cultural team.

PEO-III: To facilitate graduates in becoming familiar with state-of-the-art software/hardware tools, imbibing
creativity and innovation that would enable them to develop cutting-edge technologies of a multi-disciplinary
nature for societal development.

6
PROGRAM OUTCOMES (POs) OF B.Tech.(AIML-CSE) PROGRAM:
Program Outcomes (POs) describe what students are expected to know and be able to do by the time of
graduation to accomplish Program Educational Objectives (PEOs). The Program Outcomes for Computer
Science and Engineering graduates are:
Engineering Graduates would be able to:

1. Engineering knowledge: Apply the knowledge of mathematics, science, engineering fundamentals, and an engineering specialization to the solution of complex engineering problems.

2. Problem analysis: Identify, formulate, review research literature, and analyze complex engineering problems reaching substantiated conclusions using first principles of mathematics, natural sciences, and engineering sciences.

3. Design/development of solutions: Design solutions for complex engineering problems and design system components or processes that meet the specified needs with appropriate consideration for the public health and safety, and the cultural, societal, and environmental considerations.

4. Conduct investigations of complex problems: Use research-based knowledge and research methods including design of experiments, analysis and interpretation of data, and synthesis of the information to provide valid conclusions.

5. Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern engineering and IT tools including prediction and modeling to complex engineering activities with an understanding of the limitations.

6. The engineer and society: Apply reasoning informed by the contextual knowledge to assess societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the professional engineering practice.

7. Environment and sustainability: Understand the impact of the professional engineering solutions in societal and environmental contexts, and demonstrate the knowledge of, and need for sustainable development.

8. Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of the engineering practice.

9. Individual and team work: Function effectively as an individual, and as a member or leader in diverse teams, and in multidisciplinary settings.

10. Communication: Communicate effectively on complex engineering activities with the engineering community and with society at large, such as, being able to comprehend and write effective reports and design documentation, make effective presentations, and give and receive clear instructions.

11. Project management and finance: Demonstrate knowledge and understanding of the engineering and management principles and apply these to one's own work, as a member and leader in a team, to manage projects and in multidisciplinary environments.

12. Life-long learning: Recognize the need for, and have the preparation and ability to engage in independent and life-long learning in the broadest context of technological change.

PROGRAM SPECIFIC OUTCOMES (PSOs):

PSO1: Demonstrate competency in Programming and problem solving skills and apply these
skills in solving computing problems.
PSO2: Select appropriate programming languages, data structures and algorithms in combination
with modern technologies and tools, and apply them in developing creative and innovative solutions.
PSO3: Demonstrate adequate knowledge of the concepts and techniques of artificial intelligence
and machine learning, and apply them in developing intelligent systems to solve real-world problems.

8
6. Course Objectives & Course Outcomes

Course Objectives
Develop ability to
1. Understand main components of Operating System (OS) and their working.
2. Implement the different CPU scheduling policies, and process synchronization.
3. Apply the different memory management techniques.
4. Understand File management techniques.
5. Handle deadlock situations and provide system protection.

Course Outcomes (COs)


At the end of the course, the student would be able to:
CO1. Explain the fundamental concepts, evolution and services of Operating Systems.
CO2. Apply concepts of process management in terms of scheduling and coordination for a
given problem.
CO3. Apply Memory Management techniques for a given problem.
CO4. Elucidate the concept of File Management System
CO5. Explain various protection and access control mechanisms.

7. Brief Importance of the Course and how it fits into the curriculum

a. What role does this course play within the Program?


• An Operating System (OS) is a collection of software that manages computer
hardware and provides services for programs.
• Specifically, it hides hardware complexity, manages computational resources, and
provides isolation and protection.
• Most importantly, it has direct privileged access to the underlying hardware.
b. How is the course unique or different from other courses of the Program?
• Introduction to Operating Systems is an introductory course in operating systems.
• This course teaches the basic operating system abstractions, mechanisms, and their
implementations.
c. What essential knowledge or skills should they gain from this experience?
• Students acquire the ability to analyze and design Operating Systems.
• Students gain knowledge of shell scripting.
• The course also promotes critical thinking and logical reasoning.
9
d. What knowledge or skills from this course will students need to have mastered to
perform well in future classes or later (Higher Education / Jobs)?
• The core of the course covers concurrent programming (threads and synchronization), inter-process
communication, and an introduction to distributed operating systems.
• Analytical and design skills for scheduling algorithms, memory management techniques, and file
handling methods, including protection.
e. Why is this course important for students to take?
• A thorough comprehension of this course is fundamental to understanding the basic operational
concepts by which various Operating Systems manage computer systems.
• The course is split into four sections: (1) Introduction, (2) Process and Thread
Management, (3) Resource Management and Communication, and (4) Distributed
Systems.
f. What is/are the prerequisite(s) for this course?
• Basic Data Structures
• Computer Organization
• Knowledge of a high-level programming language such as C or Java
g. When students complete this course, what do they need to know or be able to do?
Students will be able to:
• Compare synchronous and asynchronous communication mechanisms in their respective
Operating Systems.
• Analyze, design and evaluate Operating Systems.
• Demonstrate the necessity of process concurrency, synchronization and deadlock handling.
h. Is there specific knowledge that the students will need to know in the future?
• In the future, students will apply these concepts in designing and evaluating various
Operating Systems.
• The Operating System (OS):
– Controls all execution.
– Multiplexes resources between applications.
– Abstracts away from complexity.
i. Are there certain practical or professional skills that students will need to apply in the
future?
• YES. They should complete a mini project based on mobile Operating Systems.

10
• They will be able to create their own shell scripts.
j. Five years from now, what do you hope students will remember from this course?
• Analytical thinking and logical reasoning and techniques employed in the analysis and
design of Operating Systems.
k. What is it about this course that makes it unique or special?
The course contents help students understand and simulate the principles of operating systems,
which will enable them to perform well in future courses of study (like Linux).
l. Why does the program offer this course?
• This is the basic course in Operating Systems. Without this course, students cannot
understand and evaluate various Operating Systems available for use.
• This course is a prerequisite for Linux Programming, Embedded Systems.

m. Why can’t this course be “covered” as a sub-section of another course?


• It is not possible as it covers many topics such as functionalities of Operating Systems,
Process and their scheduling, Synchronization to achieve concurrency, Memory
Management, dealing with Deadlocks, File handling and Protection and if one tries to
cover these as part of another course, it would be too heavy to be taught in one semester.
n. What unique contributions to students’ learning experience does this course make?
• It helps in executing mini and major projects that require compatibility with different
platforms during the later years of the program.
o. What is the value of taking this course? How exactly does it enrich the program?
• This course plays a vital role in the design and development of Operating Systems within the
discipline of Computer Science and Engineering, and it also supports students' professional
career growth.
p. What are the major career options that require this course?
The major career options are:
▪ Software Developer
▪ Embedded systems designer

11
8. Prerequisites
• Data Structures
• Computer Organization
• Knowledge of a high-level language such as C or Java

12
9. Instructional Learning Outcomes

S.No / Unit / Contents / Objectives / Outcomes

1. Unit I
Contents: Computer System and Operating System: structure of OS, OS services and OS functions.
Objectives: Understand the main components of an Operating System (OS) and their working.
Outcomes: Explain the fundamental concepts, evolution and services of Operating Systems.

2. Unit II
Contents: Process Management: processes, threads and scheduling algorithms. Concurrency and Process Synchronization: semaphores, monitors, classic IPC problems.
Objectives: Understand the different scheduling policies of the OS; understand process concurrency and synchronization.
Outcomes: Apply concepts of process management in terms of scheduling and coordination for a given problem.

3. Unit III
Contents: Memory Management: swapping, fragmentation, paging, segmentation, virtual memory and page replacement algorithms.
Objectives: Learn the various ways to manage memory and the memory management techniques.
Outcomes: Apply Memory Management techniques for a given problem.

4. Unit IV
Contents: File System Interface; File System implementation: file and directory implementation, free space management. Mass-Storage structure: disk structure, disk scheduling.
Objectives: Understand the concepts of input/output, storage and file management.
Outcomes: Elucidate the concept of the File Management System.

5. Unit V
Contents: Deadlocks: principles of deadlock, deadlock characterization, deadlock prevention, deadlock detection, avoidance and recovery. Protection: goals of protection, principles of protection, access matrix, access control.
Objectives: Learn how to detect, prevent and avoid deadlocks; understand that processes in an OS must be protected from one another's activities, protection being a mechanism for controlling the access of programs, processes or users.
Outcomes: Explain various protection and access control mechanisms.

14
10. Course mapping with PEOs and POs

Course: OPERATING SYSTEMS
PEOs: PEO1, PEO2, PEO3
POs/PSOs: PO1, PO2, PO3, PO4, PO8, PO9, PO10, PO12, PSO1

CO-PO/PSO mapping for Operating Systems:

          PO1  PO2  PO3   PO4  PO5  PO6  PO7  PO8  PO9  PO10  PO11  PO12  PSO1  PSO2  PSO3
CO1        3    3    -     -    -    -    -    -    -    -     -     2     2     -     -
CO2        3    2    2     2    -    -    -    -    2    2     -     2     2     -     -
CO3        3    2    2     2    -    -    -    -    -    -     -     2     2     -     -
CO4        3    2    1     2    -    -    -    -    -    -     -     2     2     -     -
CO5        3    3    2     2    -    -    -    2    -    2     -     2     2     -     -
Average    3   2.4  1.75   2    -    -    -    2    2    2     -     2     2     -     -

CO1. Explain the fundamental concepts, evolution and services of Operating Systems.
CO2. Apply concepts of process management in terms of scheduling and coordination for a given problem.
CO3. Apply Memory Management techniques for a given problem.
CO4. Elucidate the concept of the File Management System.
CO5. Explain various protection and access control mechanisms.

15
11. Class Time Table

12. Individual Time Table.

16
13. Lecture Schedule

LESSON PLAN

Columns: S. No | Expected Date of Completion | No. of Periods | Topics to be covered | Regular / Additional | Teaching aids used (LCD/OHP/BB) | Remarks
UNIT 1 – Operating System Introduction
1 01 Introduction , OS Objectives and Functions regular BB/LCD
2 01 Computer System Architecture, OS Structure, OS regular BB/LCD
Operations
3 01 Evolution of Operating Systems: regular BB/LCD
Simple Batch, Multiprogrammed, Timeshared
4 01 Personal Computer, Parallel, Distributed Systems, regular BB/LCD
Real-time systems, Special Purpose Systems
5 01 Operating System Services, User – OS Interface regular BB/LCD
6 01 System Calls, Types of System Calls regular BB/LCD
7 01 System Programs, OS Design & Implementation regular BB/LCD
8 01 OS Structure, Virtual Machines regular BB/LCD
9 01 Discussion on : Objective Questions, regular LCD
University Question Papers (previous)
UNIT 2 Process and CPU scheduling
10 01 Process Concepts- The Process, Process State, PCB regular BB/LCD
11 01 Threads regular BB/LCD
12 01 Process Scheduling- Scheduling Queues, Schedulers, regular BB/LCD
Context Switch
13 01 Preemptive Scheduling, Dispatcher, Scheduling regular BB/LCD
Criteria, Scheduling Algorithms- FCFS, SJF
14 01 Priority Scheduling, Round Robin Scheduling regular BB/LCD
15 01 Multilevel Queue Scheduling, Multilevel Feedback regular BB/LCD
Queue Scheduling
16 01 Thread Scheduling, Multiple Processor Scheduling regular BB/LCD
17 01 Real Time Scheduling, regular BB/LCD
Case Studies: Linux, Windows
18 01 Process Coordination- Process Synchronization regular BB/LCD
Critical Section Problem, Peterson’s Solution
19 01 Synchronization Hardware, Semaphores regular BB/LCD
20 01 Classic Problems of Synchronization regular BB/LCD
21 01 Monitors, Case Studies: Linux, Windows regular BB/LCD
22 01 Discussion on : Objective Questions, regular LCD
University Question Papers (previous)
UNIT 3 Memory Management and Virtual Memory
23 01 Background- Basic H/W, Address Binding, Logical regular BB/LCD
vs Physical address space, Dynamic Loading,
Dynamic Linking and Shared Libraries
24 01 Swapping, Contiguous Memory Allocation regular BB/LCD
25 01 Paging- H/W Support, Protection, Shared Pages, regular BB/LCD
Structure of Page Table
26 01 Segmentation, Segmentation with paging regular BB/LCD

17
27 01 Virtual Memory- Demand Paging, Performance regular BB/LCD
28 01 Page Replacement Algorithms regular BB/LCD
29 01 Allocation of Frames, Thrashing regular BB/LCD
30 01 Discussion on : Objective Questions, regular LCD
University Question Papers (previous)
UNIT 4 File System Interface
31 01 The Concept of a File, Access Methods regular BB/LCD
32 01 Directory Structure, Directory implementation regular BB/LCD
33 01 File System Mounting, File Sharing and Protection regular BB/LCD
34 01 File System Structure, File System Implementation regular BB/LCD
35 01 Allocation Methods-Sequential, Indexed, Linked regular BB/LCD
36 01 Free Space Management: Efficiency & Performance regular BB/LCD
37 01 Mass Storage Structure- Overview, Disk Structure, regular BB/LCD
Disk Attachment
38 01 Disk Scheduling regular BB/LCD
39 01 Disk Management and Swap Space Management regular BB/LCD
40 01 Discussion on : Objective Questions, regular LCD
University Question Papers (previous)
UNIT 5 Deadlocks and Protection
41 01 Deadlocks- System Model, Deadlock Characterization regular BB/LCD
42 01 Methods for handling Deadlocks, Deadlock Prevention regular BB/LCD
43 01 Deadlock Avoidance regular BB/LCD
44 01 Deadlock Detection & Recovery regular BB/LCD
45 01 Protection- System Protection, Goals and Principles, regular BB/LCD
Domain Protection
46 01 Access Matrix- Implementation, Access Control- regular BB/LCD
Revocation of Access Rights
47 01 Capability based Systems& Language based Protection regular BB/LCD
48 01 Discussion on : Objective Questions, regular LCD
University Question Papers (previous)

18
14. Detailed Notes:

UNIT-1
What is an Operating System?
A program that acts as an intermediary between a user of a computer and the computer hardware
Operating system goals:
• Execute user programs and make solving user problems easier
• Make the computer system convenient to use
• Use the computer hardware in an efficient manner

1.1 Computer System Structure


Computer system can be divided into four components:
• Hardware – provides basic computing resources: CPU, memory, I/O devices
• Operating system – controls and coordinates use of hardware among various applications and users
• Application programs – define the ways in which the system resources are used to solve
the computing problems of the users: word processors, compilers, web browsers, database systems, video games
• Users – people, machines, other computers
Four Components of a Computer System

19
Operating System Definition

• OS is a resource allocator
• Manages all resources
• Decides between conflicting requests for efficient and fair resource use
• OS is a control program
• Controls execution of programs to prevent errors and improper use of the computer
• No universally accepted definition
• “Everything a vendor ships when you order an operating system” is a good approximation,
but it varies wildly.

• “The one program running at all times on the computer” is the kernel. Everything else is
either a system program (ships with the operating system) or an application program.

Computer System Organization

• Computer-system operation
• One or more CPUs, device controllers connect through common bus providing access to
shared memory
• Concurrent execution of CPUs and devices competing for memory cycles

Computer-System Operation

• I/O devices and the CPU can execute concurrently


• Each device controller is in charge of a particular device type
• Each device controller has a local buffer
• CPU moves data from/to main memory to/from local buffers
• I/O is from the device to local buffer of controller
• Device controller informs CPU that it has finished its operation by causing an interrupt

20
Interrupt Handling
• The operating system preserves the state of the CPU by storing registers and the program
counter
• Determines which type of interrupt has occurred:
• polling
• vectored interrupt system
• Separate segments of code determine what action should be taken for each type of interrupt
Interrupt Timeline

Storage Hierarchy
• Storage systems organized in hierarchy
• Speed
• Cost
• Volatility
Caching – copying information into a faster storage system; main memory can be viewed as a
cache for secondary storage

21
Caching

• Important principle, performed at many levels in a computer (in hardware, operating


system, software)
• Information in use copied from slower to faster storage temporarily
• Faster storage (cache) checked first to determine if information is there
• If it is, information used directly from the cache (fast)
• If not, data copied to cache and used there
• Cache smaller than storage being cached
• Cache management important design problem
• Cache size and replacement policy
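
The lookup-then-copy behaviour described above can be sketched in C. This is only an illustrative user-level analogue of caching, not how the hardware or OS implements it; the cache size, block size and the fetch_from_backing_store() helper are hypothetical.

#include <stdio.h>

#define CACHE_SLOTS 8
#define BLOCK_SIZE  64

struct cache_entry {
    int  valid;               /* does this slot hold a copy of a block?   */
    long block_no;            /* which backing-store block it mirrors     */
    char data[BLOCK_SIZE];
};

static struct cache_entry cache[CACHE_SLOTS];

/* Stand-in for the slower storage level (disk, lower cache level, ...). */
static void fetch_from_backing_store(long block_no, char *buf)
{
    snprintf(buf, BLOCK_SIZE, "contents of block %ld", block_no);
}

/* Check the faster level first; on a miss, copy the block in and reuse it. */
static const char *cache_read(long block_no)
{
    int slot = (int)(block_no % CACHE_SLOTS);     /* trivial placement policy */

    if (cache[slot].valid && cache[slot].block_no == block_no)
        return cache[slot].data;                  /* hit: use the cached copy */

    /* Miss: copy the block from slower storage, evicting the old occupant. */
    fetch_from_backing_store(block_no, cache[slot].data);
    cache[slot].valid = 1;
    cache[slot].block_no = block_no;
    return cache[slot].data;
}

int main(void)
{
    printf("%s\n", cache_read(3));   /* miss: fetched from backing store */
    printf("%s\n", cache_read(3));   /* hit: served from the cache       */
    return 0;
}

Calling cache_read(3) twice shows the pattern: the first call misses and copies the block in, the second is served directly from the faster level.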
Computer-System Architecture
• Most systems use a single general-purpose processor (PDAs through mainframes)
• Most systems have special-purpose processors as well
• Multiprocessors systems growing in use and importance
• Also known as parallel systems, tightly-coupled systems
Advantages include
1. Increased throughput
2. Economy of scale
3. Increased reliability – graceful degradation or fault tolerance

Two types:
1. Asymmetric Multiprocessing
2. Symmetric Multiprocessing

How a Modern Computer Works

Symmetric Multiprocessing Architecture

A Dual-Core Design
Clustered Systems
• Like multiprocessor systems, but multiple systems working together
• Usually sharing storage via a storage-area network (SAN)
• Provides a high-availability service which survives failures
Asymmetric clustering has one machine in hot-standby mode
Symmetric clustering has multiple nodes running applications, monitoring each other
• Some clusters are for high-performance computing (HPC)
Applications must be written to use parallelization

Operating System Structure

• Multiprogramming needed for efficiency


• Single user cannot keep CPU and I/O devices busy at all times
• Multiprogramming organizes jobs (code and data) so the CPU always has one to execute
• A subset of total jobs in the system is kept in memory
• One job is selected and run via job scheduling
• When it has to wait (for I/O for example), the OS switches to another job
• Timesharing (multitasking) is a logical extension in which the CPU switches jobs so
frequently that users can interact with each job while it is running, creating interactive
computing
• Response time should be < 1 second
• Each user has at least one program executing in memory → a process
• If several jobs are ready to run at the same time → CPU scheduling is needed
• If processes don't fit in memory, swapping moves them in and out to run
• Virtual memory allows execution of processes not completely in memory

Memory Layout for Multiprogramming System

Operating-System Operations

• Interrupt driven by hardware


• Software error or request creates exception or trap
• Division by zero, request for operating system service
• Other process problems include infinite loops, processes modifying each other or the
operating system
• Dual-mode operation allows OS to protect itself and other system components
• User mode and kernel mode
• Mode bit provided by hardware provides the ability to distinguish when the system is running
user code or kernel code
• Some instructions designated as privileged, only executable in kernel mode
• System call changes mode to kernel; return from call resets it to user

Transition from User to Kernel Mode


• Timer to prevent infinite loop / process hogging resources
• Set interrupt after specific period
• Operating system decrements counter
• When counter zero generate an interrupt
• Set up before scheduling process to regain control or terminate program that exceeds
allotted time
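
The hardware timer itself is programmed by the kernel, but the same idea can be demonstrated at user level with the POSIX alarm() call, which asks the kernel to deliver SIGALRM after a given number of seconds. The sketch below is only an analogy for the scheduling timer described above.

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

/* Invoked when the timer expires: here we simply terminate the runaway loop. */
static void on_timer(int sig)
{
    (void)sig;
    write(STDOUT_FILENO, "time slice expired\n", 19);
    _exit(1);
}

int main(void)
{
    signal(SIGALRM, on_timer);  /* install the handler                   */
    alarm(2);                   /* "set interrupt after specific period" */

    for (;;)                    /* simulate a program stuck in an infinite loop */
        ;
}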

Operating-System Services

Figure 1.1 - A view of operating system services


OSes provide environments in which programs run, and services for the users of the system,
including:

• User Interfaces - Means by which users can issue commands to the system. Depending
on the system these may be a command-line interface ( e.g. sh, csh, ksh, tcsh, etc. ), a
GUI interface (e.g. Windows, X-Windows, KDE, Gnome, etc.), or batch command
systems. The latter are generally older systems using punch cards of job-control
language (JCL), but may still be used today for specialty systems designed for a single
purpose.
• Program Execution - The OS must be able to load a program into RAM, run the
program, and terminate the program, either normally or abnormally.
• I/O Operations - The OS is responsible for transferring data to and from I/O devices,
including keyboards, terminals, printers, and storage devices.
• File-System Manipulation - In addition to raw data storage, the OS is also responsible
for maintaining directory and subdirectory structures, mapping file names to specific
blocks of data storage, and providing tools for navigating and utilizing the file system.
• Communications - Inter-process communications, IPC, either between processes
running on the same processor, or between processes running on separate processors or
separate machines. May be implemented as either shared memory or message passing,
(or some systems may offer both.)
• Error Detection - Both hardware and software errors must be detected and handled
appropriately, with a minimum of harmful repercussions. Some systems may include
complex error avoidance or recovery systems, including backups, RAID drives, and other
redundant systems. Debugging and diagnostic tools aid users and administrators in tracing down
the cause of problems.

Other systems aid in the efficient operation of the OS itself:

• Resource Allocation - E.g. CPU cycles, main memory, storage space, and peripheral
devices. Some resources are managed with generic systems and others with very
carefully designed and specially tuned systems, customized for a particular resource and
operating environment.
• Accounting - Keeping track of system activity and resource usage, either for billing
purposes or for statistical record keeping that can be used to optimize future performance.
• Protection and Security - Preventing harm to the system and to resources, either through
wayward internal processes or malicious outsiders. Authentication, ownership, and
restricted access are obvious parts of this system. Highly secure systems may log all
process activity down to excruciating detail, and security regulations dictate the storage of
those records on permanent non-erasable media for extended times in secure (off-site)
facilities.

1.2 User Operating-System Interface

1.2.1 Command Interpreter

• Gets and processes the next user request, and launches the requested programs.
• In some systems the CI may be incorporated directly into the kernel.
• More commonly the CI is a separate program that launches once the user logs in or otherwise accesses
the system.
• UNIX, for example, provides the user with a choice of different shells, which may either be configured
to launch automatically at login, or which may be changed on the fly. ( Each of these shells uses a
different configuration file of initial settings and commands that are executed upon startup. )
• Different shells provide different functionality, in terms of certain commands that are implemented
directly by the shell without launching any external programs. Most provide at least a rudimentary
25
command interpretation structure for use in shell script programming ( loops, decision constructs,
variables, etc. )
• An interesting distinction is the processing of wild card file naming and I/O re-direction. On UNIX
systems those details are handled by the shell, and the program which is launched sees only a list of
filenames generated by the shell from the wild cards. On a DOS system, the wild cards are passed along
to the programs, which can interpret the wild cards as the program sees fit.

Figure 1.2 - The Bourne shell command interpreter in Solaris 10

1.2.2 Graphical User Interface, GUI


➢ Generally implemented as a desktop metaphor, with file folders, trash cans, and
resource icons.
➢ Icons represent some item on the system, and respond accordingly when the icon
is activated.
➢ First developed in the early 1970's at Xerox PARC research facility.
➢ In some systems the GUI is just a front end for activating a traditional command
line interpreter running in the background. In others the GUI is a true graphical
shell in its own right.
➢ Mac has traditionally provided ONLY the GUI interface. With the advent of OSX (
based partially on UNIX ), a command line interface has also become available.
➢ Because mice and keyboards are impractical for small mobile devices, these
normally use a touch-screen interface today that responds to various patterns of
swipes or "gestures". When these first came out they often had a physical
keyboard and/or a trackball of some kind built in, but today a virtual keyboard is
more commonly implemented on the touch screen.

26
Figure 1.3 - The iPad touchscreen

1.2.3 Choice of interface


• Most modern systems allow individual users to select their desired interface, and to
customize its operation, as well as the ability to switch between different interfaces as
needed. System administrators generally determine which interface a user starts with when
they first log in.
• GUI interfaces usually provide an option for a terminal emulator window for entering
command-line commands.
• Command-line commands can also be entered into shell scripts, which can then be run like
any other programs.

Figure 1.4 - The Mac OS X GUI


27
1.3 System Calls
• System calls provide a means for user or application programs to call upon the services of
the operating system.
• Generally written in C or C++, although some are written in assembly for optimal
performance.
• Figure 1.5 illustrates the sequence of system calls required to copy a file:

Figure 1.5 - Example of how system calls are used.

• You can use "strace" to see more examples of the large number of system calls invoked
by a single simple command. Read the man page for strace, and try some simple
examples. ( strace mkdir temp, strace cd temp, strace date > t.t, strace cp t.t t.2, etc. )
• Most programmers do not use the low-level system calls directly, but instead use an
"Application Programming Interface", API. The following sidebar shows the read( ) call
available in the API on UNIX-based systems:
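
The sidebar itself is not reproduced in this file; as a stand-in, the POSIX read() wrapper declared in <unistd.h> has the following shape (see the read(2) manual page for the authoritative definition):

#include <unistd.h>   /* declares the POSIX read() system-call wrapper */

/*
 * ssize_t read(int fd, void *buf, size_t count);
 *
 *   fd    - descriptor of an already opened file or device
 *   buf   - memory into which the incoming data is placed
 *   count - maximum number of bytes to transfer
 *
 * Returns the number of bytes actually read, 0 at end of file,
 * or -1 on error (with errno set).
 */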

The use of APIs instead of direct system calls provides for greater program portability between
different systems. The API then makes the appropriate system calls through the system call
interface, using a table lookup to access specific numbered system calls, as shown in Figure 1.6:

Figure 1.6 - The handling of a user application invoking the open( ) system call
28
• Parameters are generally passed to system calls via registers, or less commonly, by values
pushed onto the stack. Large blocks of data are generally accessed indirectly, through a
memory address passed in a register or on the stack, as shown in Figure 1.7:

Figure 1.7 - Passing of parameters as a table

1.4 Types of System Calls

Six major categories, as outlined in Figure 1.8 and the following six subsections:

(Sixth type, protection, not shown here but described below)

29
• Standard library calls may also generate system calls, as shown here:

1.4.1 Process Control

• Process control system calls include end, abort, load, execute, create process,
terminate process, get/set process attributes, wait for time or event, signal event,
and allocate and free memory.
• Processes must be created, launched, monitored, paused, resumed, and eventually
stopped.
• When one process pauses or stops, then another must be launched or resumed.
• When processes stop abnormally it may be necessary to provide core dumps
and/or other diagnostic or recovery tools.
• Compare DOS ( a single-tasking system ) with UNIX ( a multi-tasking system ).
o When a process is launched in DOS, the command interpreter first unloads
as much of itself as it can to free up memory, then loads the process and
transfers control to it. The interpreter does not resume until the process has
completed, as shown in Figure 1.9:

30
Figure 1.9 - MS-DOS execution. (a) At system startup. (b) Running a program.

Because UNIX is a multi-tasking system, the command interpreter remains completely


resident when executing a process, as shown in Figure 1.10 below.
▪ The user can switch back to the command interpreter at any time, and can place the running
process in the background even if it was not originally launched as a background process.
▪ In order to do this, the command interpreter first executes a "fork" system call, which
creates a second process which is an exact duplicate ( clone ) of the original command
interpreter. The original process is known as the parent, and the cloned process is known
as the child, with its own unique process ID and parent ID.
▪ The child process then executes an "exec" system call, which replaces its code with that of
the desired process.
▪ The parent ( command interpreter ) normally waits for the child to complete before issuing
a new command prompt, but in some cases it can also issue a new prompt right away,
without waiting for the child process to complete. ( The child is then said to be running "in
the background", or "as a background process". )

Figure 1.10 - FreeBSD running multiple programs
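
A minimal C sketch of the fork / exec / wait sequence just described, assuming a UNIX-like system; the launched program (ls) is only an example:

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();                /* clone the current process              */

    if (pid < 0) {                     /* fork failed                            */
        perror("fork");
        exit(1);
    } else if (pid == 0) {             /* child: replace its image with a program */
        execlp("ls", "ls", "-l", (char *)NULL);
        perror("exec");                /* reached only if exec failed            */
        _exit(1);
    } else {                           /* parent: behave like the shell          */
        int status;
        waitpid(pid, &status, 0);      /* wait for the child to complete         */
        printf("child %d finished\n", (int)pid);
    }
    return 0;
}

Omitting the waitpid() call is what lets a shell print a new prompt immediately, leaving the child running "in the background".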

31
1.4.2 File Management

• File management system calls include create file, delete file, open, close, read,
write, reposition, get file attributes, and set file attributes.
• These operations may also be supported for directories as well as ordinary files.
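
A short C sketch that uses these calls to copy one file to another, roughly the sequence Figure 1.5 illustrates; the file names are placeholders:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char buf[4096];
    ssize_t n;

    int in  = open("in.txt", O_RDONLY);                            /* open the input file   */
    int out = open("out.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644); /* create the output file */
    if (in < 0 || out < 0) {
        perror("open");
        return 1;
    }

    while ((n = read(in, buf, sizeof buf)) > 0)  /* read until end of file */
        if (write(out, buf, (size_t)n) != n) {   /* write the same bytes   */
            perror("write");
            return 1;
        }

    close(in);                                   /* release both files     */
    close(out);
    return 0;
}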

1.4.3 Device Management

• Device management system calls include request device, release device, read,
write, reposition, get/set device attributes, and logically attach or detach devices.
• Devices may be physical ( e.g. disk drives ), or virtual / abstract ( e.g. files,
partitions, and RAM disks ).
• Some systems represent devices as special files in the file system, so that
accessing the "file" calls upon the appropriate device drivers in the OS. See for
example the /dev directory on any UNIX system.

1.4.4 Information Maintenance

• Information maintenance system calls include calls to get/set the time, date,
system data, and process, file, or device attributes.
• Systems may also provide the ability to dump memory at any time, single step
programs pausing execution after each instruction, and tracing the operation of
programs, all of which can help to debug programs.
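
Two of these calls on a UNIX-like system, sketched in C using the standard getpid() and time() interfaces:

#include <stdio.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    time_t now = time(NULL);                 /* get the current time             */
    printf("pid  : %d\n", (int)getpid());    /* attribute of the calling process */
    printf("time : %s", ctime(&now));        /* human-readable date and time     */
    return 0;
}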

1.4.5 Communication

• Communication system calls include create/delete communication connection,
send/receive messages, transfer status information, and attach/detach remote
devices.
• The message passing model must support calls to:
o Identify a remote process and/or host with which to communicate.
o Establish a connection between the two processes.
o Open and close the connection as needed.
o Transmit messages along the connection.
o Wait for incoming messages, in either a blocking or non-blocking state.
o Delete the connection when no longer needed.
• The shared memory model must support calls to:
o Create and access memory that is shared amongst processes (and threads. )
o Provide locking mechanisms restricting simultaneous access.
o Free up shared memory and/or dynamically allocate it as needed.
• Message passing is simpler and easier, ( particularly for inter-computer
communications ), and is generally appropriate for small amounts of data.
• Shared memory is faster, and is generally the better approach where large
amounts of data are to be shared, ( particularly when most processes are reading
the data rather than writing it, or at least when only one or a small number of
processes need to change any given data item. )
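
A compact example of the message-passing style on UNIX, using a pipe between a parent and a child process; a pipe is just one simple connection mechanism, but sockets and message queues follow the same establish / send / receive / close pattern:

#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int fd[2];
    if (pipe(fd) < 0)                          /* establish the connection       */
        return 1;

    if (fork() == 0) {                         /* child: the receiver            */
        char msg[64];
        close(fd[1]);                          /* close the unused write end     */
        ssize_t n = read(fd[0], msg, sizeof msg - 1);   /* wait for a message    */
        if (n > 0) {
            msg[n] = '\0';
            printf("child received: %s\n", msg);
        }
        close(fd[0]);
        _exit(0);
    }

    close(fd[0]);                              /* parent: the sender             */
    const char *msg = "hello from parent";
    write(fd[1], msg, strlen(msg));            /* transmit along the connection  */
    close(fd[1]);                              /* delete the connection          */
    wait(NULL);
    return 0;
}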

32
1.4.6 Protection

• Protection provides mechanisms for controlling which users / processes have


access to which system resources.
• System calls allow the access mechanisms to be adjusted as needed, and for non-privileged
users to be granted elevated access permissions under carefully controlled temporary
circumstances (one such call is sketched below).
• Once only of concern on multi-user systems, protection is now important on all
systems, in the age of ubiquitous network connectivity.
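
File permissions are one access mechanism that ordinary programs can adjust through a system call; on UNIX the chmod() call changes a file's permission bits (the file name below is a placeholder):

#include <stdio.h>
#include <sys/stat.h>

int main(void)
{
    /* Restrict "notes.txt" to read/write for its owner only (mode 0600). */
    if (chmod("notes.txt", S_IRUSR | S_IWUSR) < 0) {
        perror("chmod");
        return 1;
    }
    return 0;
}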

1.5 System Programs

• System programs provide OS functionality through separate applications, which are not
part of the kernel or command interpreters. They are also known as system utilities or
system applications.
• Most systems also ship with useful applications such as calculators and simple editors, (
e.g. Notepad ). Some debate arises as to the border between system and non-system
applications.
• System programs may be divided into these categories:
o File management - programs to create, delete, copy, rename, print, list, and
generally manipulate files and directories.
o Status information - Utilities to check on the date, time, number of users,
processes running, data logging, etc. System registries are used to store and recall
configuration information for particular applications.
o File modification - e.g. text editors and other tools which can change file
contents.
o Programming-language support - E.g. Compilers, linkers, debuggers, profilers,
assemblers, library archive management, interpreters for common languages, and
support for make.
o Program loading and execution - loaders, dynamic loaders, overlay loaders,
etc., as well as interactive debuggers.
o Communications - Programs for providing connectivity between processes and
users, including mail, web browsers, remote logins, file transfers, and remote
command execution.
o Background services - System daemons are commonly started when the system
is booted, and run for as long as the system is running, handling necessary
services. Examples include network daemons, print servers, process schedulers,
and system error monitoring services.
• Most operating systems today also come complete with a set of application programs to
provide additional services, such as copying files or checking the time and date.
• Most users' view of the system is determined by their command interpreter and the
application programs. Most never make system calls, even through the API (with the
exception of simple file I/O in user-written programs).

33
1.6 Operating-System Design and Implementation

1.6.1 Design Goals

• Requirements define properties which the finished system must have, and are a
necessary first step in designing any large complex system.
o User requirements are features that users care about and understand, and
are written in commonly understood vernacular. They generally do not
include any implementation details, and are written similar to the product
description one might find on a sales brochure or the outside of a shrink-
wrapped box.
o System requirements are written for the developers, and include more
details about implementation specifics, performance requirements,
compatibility constraints, standards compliance, etc. These requirements
serve as a "contract" between the customer and the developers, ( and
between developers and subcontractors ), and can get quite detailed.
• Requirements for operating systems can vary greatly depending on the planned
scope and usage of the system. ( Single user / multi-user, specialized system /
general purpose, high/low security, performance needs, operating environment,
etc. )

1.6.2 Mechanisms and Policies

• Policies determine what is to be done. Mechanisms determine how it is to be


implemented.
• If properly separated and implemented, policy changes can be easily adjusted
without re-writing the code, just by adjusting parameters or possibly loading new
data / configuration files. For example the relative priority of background versus
foreground tasks.

1.6.3 Implementation

• Traditionally OSes were written in assembly language. This provided direct


control over hardware-related issues, but inextricably tied a particular OS to a
particular HW platform.
• Recent advances in compiler efficiencies mean that most modern OSes are written
in C, or more recently, C++. Critical sections of code are still written in assembly
language, ( or written in C, compiled to assembly, and then fine-tuned and
optimized by hand from there. )
• Operating systems may be developed using emulators of the target hardware,
particularly if the real hardware is unavailable ( e.g. not built yet ), or not a
suitable platform for development, ( e.g. smart phones, game consoles, or other
similar devices. )

34
1.7 Operating-System Structure

For efficient performance and implementation an OS should be partitioned into separate


subsystems, each with carefully defined tasks, inputs, outputs, and performance characteristics.
These subsystems can then be arranged in various architectural configurations:

1.7.1 Simple Structure

When DOS was originally written its developers had no idea how big and important it would
eventually become. It was written by a few programmers in a relatively short amount of time,
without the benefit of modern software engineering techniques, and then gradually grew over
time to exceed its original expectations. It does not break the system into subsystems, and has no
distinction between user and kernel modes, allowing all programs direct access to the underlying
hardware. ( Note that user versus kernel mode was not supported by the 8088 chip set anyway,
so that really wasn't an option back then. )

Figure 1.11 - MS-DOS layer structure

The original UNIX OS used a simple layered approach, but almost all the OS was in one big
layer, not really breaking the OS down into layered subsystems:

Figure 1.12 - Traditional UNIX system structure

35
1.7.2 Layered Approach

• Another approach is to break the OS into a number of smaller layers, each of


which rests on the layer below it, and relies solely on the services provided by the
next lower layer.
• This approach allows each layer to be developed and debugged independently,
with the assumption that all lower layers have already been debugged and are
trusted to deliver proper services.
• The problem is deciding the order in which to place the layers, as no layer can
call upon the services of any higher layer, and so many chicken-and-egg
situations may arise.
• Layered approaches can also be less efficient, as a request for service from a
higher layer has to filter through all lower layers before it reaches the HW,
possibly with significant processing at each step.

Figure 1.13 - A layered operating system

1.7.3 Microkernels

• The basic idea behind micro kernels is to remove all non-essential services from
the kernel, and implement them as system applications instead, thereby making
the kernel as small and efficient as possible.
• Most microkernels provide basic process and memory management, and message
passing between other services, and not much more.
• Security and protection can be enhanced, as most services are performed in user
mode, not kernel mode.
• System expansion can also be easier, because it only involves adding more system
applications, not rebuilding a new kernel.
• Mach was the first and most widely known microkernel, and now forms a major
component of Mac OSX.
• Windows NT was originally a microkernel design, but suffered from performance
problems relative to Windows 95. NT 4.0 improved performance by moving more
services into the kernel, and now XP is back to being more monolithic.
• Another microkernel example is QNX, a real-time OS for embedded systems.

36
Figure 1.14 - Architecture of a typical microkernel

1.7.4 Modules

• Modern OS development is object-oriented, with a relatively small core kernel


and a set of modules which can be linked in dynamically. See for example the
Solaris structure, as shown in Figure 1.15 below.
• Modules are similar to layers in that each subsystem has clearly defined tasks and
interfaces, but any module is free to contact any other module, eliminating the
problems of going through multiple intermediary layers, as well as the chicken-
and-egg problems.
• The kernel is relatively small in this architecture, similar to microkernels, but the
kernel does not have to implement message passing since modules are free to
contact each other directly.

Figure 1.15 - Solaris loadable modules

1.7.5 Hybrid Systems

• Most OSes today do not strictly adhere to one architecture, but are hybrids of
several.

37
1.7.5.1 Mac OS X

• The Mac OS X architecture relies on the Mach microkernel for basic
system management services, and the BSD kernel for additional services.
Application services and dynamically loadable modules ( kernel
extensions ) provide the rest of the OS functionality:

Figure 1.16 - The Mac OS X structure

1.8 Virtual Machines

• The concept of a virtual machine is to provide an interface that looks like independent
hardware, to multiple different OSes running simultaneously on the same physical
hardware. Each OS believes that it has access to and control over its own CPU, RAM,
I/O devices, hard drives, etc.
• One obvious use for this system is for the development and testing of software that must
run on multiple platforms and/or OSes.
• One obvious difficulty involves the sharing of hard drives, which are generally
partitioned into separate smaller virtual disks for each guest OS.

Figure 1.17 - System models. (a) Nonvirtual machine. (b)Virtual machine.

38
1.8.1 History

• Virtual machines first appeared as the VM Operating System for IBM mainframes
in 1972.

1.8.2 Benefits

• Each OS runs independently of all the others, offering protection and security
benefits.
• ( Sharing of physical resources is not commonly implemented, but may be done
as if the virtual machines were networked together. )
• Virtual machines are a very useful tool for OS development, as they allow a user
full access to and control over a virtual machine, without affecting other users
operating the real machine.
• As mentioned before, this approach can also be useful for product development
and testing of SW that must run on multiple OSes / HW platforms.

1.8.3 Simulation

• An alternative to creating an entire virtual machine is to simply run an emulator,


which allows a program written for one OS to run on a different OS.
• For example, a UNIX machine may run a DOS emulator in order to run DOS
programs, or vice-versa.
• Emulators tend to run considerably slower than the native OS, and are also
generally less than perfect.

1.8.4 Para-virtualization

• Para-virtualization is another variation on the theme, in which an environment is


provided for the guest program that is similar to its native OS, without trying to
completely mimic it.
• Guest programs must also be modified to run on the para-virtual OS.
• Solaris 10 uses a zone system, in which the low-level hardware is not virtualized,
but the OS and its devices ( device drivers ) are.
o Within a zone, processes have the view of an isolated system, in which
only the processes and resources within that zone are seen to exist.
o Figure 1.18 shows a Solaris system with the normal "global" operating
space as well as two additional zones running on a small virtualization
layer.

39
Figure 1.18 - Solaris 10 with two zones.

1.8.5 Implementation

• Implementation may be challenging, partially due to the consequences of user


versus kernel mode.
o Each of the simultaneously running kernels needs to operate in kernel
mode at some point, but the virtual machine actually runs in user mode.
o So the kernel mode has to be simulated for each of the loaded OSes, and
kernel system calls passed through the virtual machine into a true kernel
mode for eventual HW access.
• The virtual machines may run slower, due to the increased levels of code between
applications and the HW, or they may run faster, due to the benefits of caching. (
And virtual devices may also be faster than real devices, such as RAM disks
which are faster than physical disks. )

1.8.6 Examples
1.8.6.1 VMware

• Abstracts the 80x86 hardware platform, allowing simultaneous operation


of multiple Windows and Linux OSes, as shown by example in Figure
1.19:

40
Figure 1.19 - VMWare Workstation architecture
1.8.6.2 The Java Virtual Machine
• Java was designed from the beginning to be platform independent, by running Java only
on a Java Virtual Machine, JVM, of which different implementations have been
developed for numerous different underlying HW platforms.
• Java source code is compiled into Java byte code in .class files. Java byte code is binary
instructions that will run on the JVM.
• The JVM implements memory management and garbage collection.
• Java byte code may be interpreted as it runs, or compiled to native system binary code
using just-in-time ( JIT ) compilation. Under this scheme, the first time that a piece of
Java byte code is encountered, it is compiled to the appropriate native machine binary
code by the Java interpreter. This native binary code is then cached, so that the next
time that piece of code is encountered it can be used directly.
• Some hardware chips have been developed to run Java byte code directly, which is an
interesting application of a real machine being developed to emulate the services of a
virtual one!

Figure 1.20 - The Java virtual machine

41
UNIT-II

Process Concepts

• An operating system executes a variety of programs:


• Batch system – jobs
• Time-shared systems – user programs or tasks
• Textbook uses the terms job and process almost interchangeably
Process – a program in execution; process execution must progress in sequential fashion

A process includes:

• program counter
• stack
• data section
Process in Memory

Process State
As a process executes, it changes state
• new: The process is being created
• running: Instructions are being executed
• waiting: The process is waiting for some event to occur
• ready: The process is waiting to be assigned to a processor
• terminated: The process has finished execution

Diagram of Process State

Process Control Block (PCB)

42
Information associated with each process
• Process state
• Program counter
• CPU registers
• CPU scheduling information
• Memory-management information
• Accounting information
• I/O status information
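
These fields can be pictured as a structure the kernel keeps for every process. The sketch below is purely illustrative; real kernels (for example Linux's task_struct) contain far more fields and differ in detail.

/* Illustrative process control block: one record per process. */
struct pcb {
    int            pid;              /* process identifier                    */
    int            state;            /* new, ready, running, waiting, ...     */
    unsigned long  program_counter;  /* where execution resumes               */
    unsigned long  registers[16];    /* saved CPU registers                   */
    int            priority;         /* CPU-scheduling information            */
    void          *page_table;       /* memory-management information         */
    long           cpu_time_used;    /* accounting information                */
    int            open_files[16];   /* I/O status: open file descriptors     */
    struct pcb    *next;             /* link in a scheduling queue            */
};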

CPU Switch from Process to Process (Context switch)

Process Scheduling Queues

• Job queue – set of all processes in the system


• Ready queue – set of all processes residing in main memory, ready and waiting to
execute
• Device queues – set of processes waiting for an I/O device
• Processes migrate among the various queues

Ready Queue And Various I/O Device Queues

43
Representation of Process Scheduling

Schedulers
• Long-term scheduler (or job scheduler) – selects which processes should be brought
into the ready queue
• Short-term scheduler (or CPU scheduler) – selects which process should be executed
next and allocates CPU

Addition of Medium Term Scheduling

• Short-term scheduler is invoked very frequently (milliseconds) ⇒ must be fast
• Long-term scheduler is invoked very infrequently (seconds, minutes) ⇒ may be slow
• The long-term scheduler controls the degree of multiprogramming
• Processes can be described as either:
• I/O-bound process – spends more time doing I/O than computations, many short CPU
bursts
• CPU-bound process – spends more time doing computations; few very long CPU bursts

44
Process Creation
• Parent process create children processes, which, in turn create other processes, forming
a tree of processes
• Generally, process identified and managed via a process identifier (pid)
• Resource sharing
• Parent and children share all resources
• Children share subset of parent’s resources
• Parent and child share no resources
• Child has a program loaded into it
• UNIX examples
• fork system call creates new process
• exec system call used after a fork to replace the process’ memory space with a new
program
Process Creation

A tree of processes on a typical Solaris

45
Process Termination:
• Process executes last statement and asks the operating system to delete it (exit)
• Output data from child to parent (via wait)
• Process’ resources are deallocated by operating system
• Parent may terminate execution of children processes (abort)
• Child has exceeded allocated resources
• Task assigned to child is no longer required
• If parent is exiting
Some operating systems do not allow a child to continue if its parent terminates

Multithreading Models
• Many-to-One
• One-to-One
• Many-to-Many
Many-to-One
Many user-level threads mapped to single kernel thread

Examples:

• Solaris Green Threads


• GNU Portable Threads

One-to-One

Each user-level thread maps to kernel thread

Examples Windows NT/XP/2000

Linux Solaris 9 and later

Many-to-Many Model

• Allows many user level threads to be mapped to many kernel threads


• Allows the operating system to create a sufficient number of kernel threads
• Solaris prior to version 9
Windows NT/2000 with the Thread Fiber package

41
Two-level Model
Similar to M:M, except that it allows a user thread to be bound to kernel thread

Examples

• IRIX
• HP-UX
• Tru64 UNIX
• Solaris 8 and earlier

Thread Libraries

• Thread library provides programmer with API for creating and managing threads
• Two primary ways of implementing
• Library entirely in user space
• Kernel-level library supported by the OS
Pthreads

• May be provided either as user-level or kernel-level


• A POSIX standard (IEEE 1003.1c) API for thread creation and synchronization
• The API specifies the behavior of the thread library; the implementation is up to the
developers of the library.
Common in UNIX operating systems (Solaris, Linux, Mac OS X)
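
A minimal Pthreads sketch ( the worker function and its argument are placeholders ): the main thread creates one thread with pthread_create( ) and waits for it with pthread_join( ). Compile with -lpthread on most systems.

#include <pthread.h>
#include <stdio.h>

static void *worker( void *arg ) {
    int n = *(int *) arg;                 /* read the argument passed in     */
    printf( "worker received %d\n", n );
    return NULL;
}

int main( void ) {
    pthread_t tid;
    int value = 42;
    if( pthread_create( &tid, NULL, worker, &value ) != 0 ) {  /* create     */
        fprintf( stderr, "pthread_create failed\n" );
        return 1;
    }
    pthread_join( tid, NULL );                                 /* wait       */
    return 0;
}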

CPU SCHEDULING

2.1 Basic Concepts

• Almost all programs have some alternating cycle of CPU number crunching and waiting
for I/O of some kind. (Even a simple fetch from memory takes a long time relative to
CPU speeds.)
• In a simple system running a single process, the time spent waiting for I/O is wasted, and
those CPU cycles are lost forever.
• A scheduling system allows one process to use the CPU while another is waiting for I/O,
thereby making full use of otherwise lost CPU cycles.
• The challenge is to make the overall system as "efficient" and "fair" as possible, subject
to varying and often dynamic conditions, and where "efficient" and "fair" are somewhat
subjective terms, often subject to shifting priority policies.

2.1.1 CPU-I/O Burst Cycle

• Almost all processes alternate between two states in a continuing cycle, as shown
in Figure 2.1 below:
o A CPU burst of performing calculations, and
o An I/O burst, waiting for data transfer in or out of the system.

Figure 2.1 - Alternating sequence of CPU and I/O bursts.

• CPU bursts vary from process to process, and from program to program, but an
extensive study shows frequency patterns similar to that shown in Figure 2.2:

Figure 2.2 - Histogram of CPU-burst durations.

2.1.2 CPU Scheduler

• Whenever the CPU becomes idle, it is the job of the CPU Scheduler ( a.k.a. the
short-term scheduler ) to select another process from the ready queue to run next.
• The storage structure for the ready queue and the algorithm used to select the next
process are not necessarily a FIFO queue. There are several alternatives to choose
from, as well as numerous adjustable parameters for each algorithm, which is the
basic subject of this entire chapter.

2.1.3. Preemptive Scheduling

• CPU scheduling decisions take place under one of four conditions:


1. When a process switches from the running state to the waiting state, such
as for an I/O request or invocation of the wait( ) system call.
2. When a process switches from the running state to the ready state, for
example in response to an interrupt.
3. When a process switches from the waiting state to the ready state, say at
completion of I/O or a return from wait( ).
4. When a process terminates.
• For conditions 1 and 4 there is no choice - A new process must be selected.
• For conditions 2 and 3 there is a choice - To either continue running the current
process, or select a different one.
• If scheduling takes place only under conditions 1 and 4, the system is said to be
non-preemptive, or cooperative. Under these conditions, once a process starts
running it keeps running, until it either voluntarily blocks or until it finishes.
Otherwise the system is said to be preemptive.
• Windows used non-preemptive scheduling up to Windows 3.x, and started using
pre-emptive scheduling with Win95. Macs used non-preemptive prior to OSX,
and pre-emptive since then. Note that pre-emptive scheduling is only possible on
hardware that supports a timer interrupt.
• Note that pre-emptive scheduling can cause problems when two processes share
data, because one process may get interrupted in the middle of updating shared
data structures. Chapter 5 examined this issue in greater detail.
• Preemption can also be a problem if the kernel is busy implementing a system call
( e.g. updating critical kernel data structures ) when the preemption occurs. Most
modern UNIXes deal with this problem by making the process wait until the
system call has either completed or blocked before allowing the preemption.
Unfortunately this solution is problematic for real-time systems, as real-time
response can no longer be guaranteed.
• Some critical sections of code protect themselves from concurrency problems by
disabling interrupts before entering the critical section and re-enabling interrupts
on exiting the section. Needless to say, this should only be done in rare situations,
and only on very short pieces of code that will finish quickly, ( usually just a few
machine instructions. )

2.1.4 Dispatcher

• The dispatcher is the module that gives control of the CPU to the process
selected by the scheduler. This function involves:
o Switching context.
o Switching to user mode.
o Jumping to the proper location in the newly loaded program.
• The dispatcher needs to be as fast as possible, as it is run on every context switch.
The time consumed by the dispatcher is known as dispatch latency.

2.2 Scheduling Criteria

• There are several different criteria to consider when trying to select the "best" scheduling
algorithm for a particular situation and environment, including:
o CPU utilization - Ideally the CPU would be busy 100% of the time, so as to
waste 0 CPU cycles. On a real system CPU usage should range from 40% ( lightly
loaded ) to 90% ( heavily loaded. )
o Throughput - Number of processes completed per unit time. May range from 10
/ second to 1 / hour depending on the specific processes.
o Turnaround time - Time required for a particular process to complete, from
submission time to completion. ( Wall clock time. )
o Waiting time - How much time processes spend in the ready queue waiting their
turn to get on the CPU.
▪ ( Load average - The average number of processes sitting in the ready
queue waiting their turn to get into the CPU. Reported in 1-minute, 5-
minute, and 15-minute averages by "uptime" and "who". )
o Response time - The time taken in an interactive program from the issuance of a
command to the commencement of a response to that command.
• In general one wants to optimize the average value of a criterion ( maximize CPU
utilization and throughput, and minimize all the others. ) However sometimes one wants
to do something different, such as to minimize the maximum response time.
• Sometimes it is more desirable to minimize the variance of a criterion than its average
value. I.e. users are more accepting of a consistent, predictable system than an
inconsistent one, even if it is a little bit slower.

2.3 Scheduling Algorithms

The following subsections will explain several common scheduling strategies, looking at only a
single CPU burst each for a small number of processes. Obviously real systems have to deal with
a lot more simultaneous processes executing their CPU-I/O burst cycles.

2.3.1 First-Come First-Serve Scheduling, FCFS

• FCFS is very simple - Just a FIFO queue, like customers waiting in line at the
bank or the post office or at a copying machine.
• Unfortunately, however, FCFS can yield some very long average wait times,
particularly if the first process to get there takes a long time. For example,
consider the following three processes:

Process Burst Time


P1 24
P2 3
P3 3

• In the first Gantt chart below, process P1 arrives first. The average waiting time
for the three processes is ( 0 + 24 + 27 ) / 3 = 17.0 ms.
• In the second Gantt chart below, the same three processes have an average wait
time of ( 0 + 3 + 6 ) / 3 = 3.0 ms. The total run time for the three bursts is the
same, but in the second case two of the three finish much quicker, and the other
process is only delayed by a short amount.

• FCFS can also block the system in a busy dynamic system in another way, known
as the convoy effect. When one CPU intensive process blocks the CPU, a number
of I/O intensive processes can get backed up behind it, leaving the I/O devices
idle. When the CPU hog finally relinquishes the CPU, then the I/O processes pass
through the CPU quickly, leaving the CPU idle while everyone queues up for I/O,
and then the cycle repeats itself when the CPU intensive process gets back to the
ready queue.
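
The waiting-time arithmetic above is easy to automate; here is a small sketch with the burst times of the P1, P2, P3 example hard-coded in arrival order:

#include <stdio.h>

int main( void ) {
    int burst[ ] = { 24, 3, 3 };     /* P1, P2, P3, in arrival order         */
    int n = 3, start = 0, total_wait = 0;

    for( int i = 0; i < n; i++ ) {
        total_wait += start;         /* process i waits for all earlier bursts */
        start += burst[ i ];
    }
    printf( "average waiting time = %.1f ms\n", (double) total_wait / n );   /* 17.0 */
    return 0;
}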

2.3.2 Shortest-Job-First Scheduling, SJF

• The idea behind the SJF algorithm is to pick the quickest fastest little job that
needs to be done, get it out of the way first, and then pick the next smallest fastest
job to do next.

• ( Technically this algorithm picks a process based on the next shortest CPU
burst, not the overall process time. )
• For example, the Gantt chart below is based upon the following CPU burst times,
( and the assumption that all jobs arrive at the same time. )

Process Burst Time


P1 6
P2 8
P3 7
P4 3

• In the case above the average wait time is ( 0 + 3 + 9 + 16 ) / 4 = 7.0 ms, ( as


opposed to 10.25 ms for FCFS for the same processes. )
• SJF can be proven to be the fastest scheduling algorithm, but it suffers from one
important problem: How do you know how long the next CPU burst is going to
be?
o For long-term batch jobs this can be done based upon the limits that users
set for their jobs when they submit them, which encourages them to set
low limits, but risks their having to re-submit the job if they set the limit
too low. However that does not work for short-term CPU scheduling on an
interactive system.
o Another option would be to statistically measure the run time
characteristics of jobs, particularly if the same tasks are run repeatedly and
predictably. But once again that really isn't a viable option for short term
CPU scheduling in the real world.
o A more practical approach is to predict the length of the next burst, based
on some historical measurement of recent burst times for this process. One
simple, fast, and relatively accurate method is the exponential average,
which can be defined as follows. ( The textbook uses tau and t for these
variables, but plainer names are used here because the two symbols are hard
to distinguish from one another. )

estimate[ i + 1 ] = alpha * burst[ i ] + ( 1.0 - alpha ) * estimate[ i ]

o In this scheme the previous estimate contains the history of all previous
times, and alpha serves as a weighting factor for the relative importance of
recent data versus past history. If alpha is 1.0, then past history is ignored,
and we assume the next burst will be the same length as the last burst. If

alpha is 0.0, then all measured burst times are ignored, and we just assume
a constant burst time. Most commonly alpha is set at 0.5, as illustrated in
Figure 2.3:

Figure 2.3 - Prediction of the length of the next CPU burst.
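
A small worked sketch of the exponential average ( the initial estimate and the burst values below are made-up illustration numbers ):

#include <stdio.h>

int main( void ) {
    double alpha = 0.5;
    double estimate = 10.0;                   /* initial guess               */
    double burst[ ] = { 6.0, 4.0, 13.0 };     /* measured CPU bursts         */

    for( int i = 0; i < 3; i++ ) {
        estimate = alpha * burst[ i ] + ( 1.0 - alpha ) * estimate;
        printf( "after a %.0f ms burst the next estimate is %.2f ms\n",
                burst[ i ], estimate );       /* prints 8.00, 6.00, 9.50     */
    }
    return 0;
}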

• SJF can be either preemptive or non-preemptive. Preemption occurs when a new


process arrives in the ready queue that has a predicted burst time shorter than the
time remaining in the process whose burst is currently on the CPU. Preemptive
SJF is sometimes referred to as shortest remaining time first scheduling.
• For example, the following Gantt chart is based upon the following data:

Process Arrival Time Burst Time


P1 0 8
P2 1 4
P3 2 9
p4 3 5

• The resulting schedule is P1 from 0 to 1, P2 from 1 to 5, P4 from 5 to 10, P1 again
from 10 to 17, and P3 from 17 to 26. The average wait time in this case is
( ( 10 - 1 ) + ( 1 - 1 ) + ( 17 - 2 ) + ( 5 - 3 ) ) / 4 = 26 / 4 = 6.5 ms. ( As opposed
to 7.75 ms for non-preemptive SJF or 8.75 ms for FCFS. )

2.3.3 Priority Scheduling

• Priority scheduling is a more general case of SJF, in which each job is assigned a
priority and the job with the highest priority gets scheduled first. ( SJF uses the
inverse of the next expected burst time as its priority - The smaller the expected
burst, the higher the priority. )
• Note that in practice, priorities are implemented using integers within a fixed
range, but there is no agreed-upon convention as to whether "high" priorities use
large numbers or small numbers. This book uses low numbers for high priorities,
with 0 being the highest possible priority.
• For example, the following Gantt chart is based upon these process burst times
and priorities, and yields an average waiting time of 8.2 ms:

Process Burst Time Priority


P1 10 3
P2 1 1
P3 2 4
P4 1 5
P5 5 2

• Priorities can be assigned either internally or externally. Internal priorities are


assigned by the OS using criteria such as average burst time, ratio of CPU to I/O
activity, system resource use, and other factors available to the kernel. External
priorities are assigned by users, based on the importance of the job, fees paid,
politics, etc.
• Priority scheduling can be either preemptive or non-preemptive.

• Priority scheduling can suffer from a major problem known as indefinite
blocking, or starvation, in which a low-priority task can wait forever because
there are always some other jobs around that have higher priority.
o If this problem is allowed to occur, then processes will either run
eventually when the system load lightens ( at say 2:00 a.m. ), or will
eventually get lost when the system is shut down or crashes. ( There are
rumors of jobs that have been stuck for years. )
o One common solution to this problem is aging, in which priorities of jobs
increase the longer they wait. Under this scheme a low-priority job will
eventually get its priority raised high enough that it gets run.

2.3.4 Round Robin Scheduling

• Round robin scheduling is similar to FCFS scheduling, except that CPU bursts are
assigned with limits called the time quantum.
• When a process is given the CPU, a timer is set for whatever value has been set
for a time quantum.
o If the process finishes its burst before the time quantum timer expires, then
it is swapped out of the CPU just like the normal FCFS algorithm.
o If the timer goes off first, then the process is swapped out of the CPU and
moved to the back end of the ready queue.
• The ready queue is maintained as a circular queue, so when all processes have had
a turn, then the scheduler gives the first process another turn, and so on.
• RR scheduling can give the effect of all processes sharing the CPU equally,
although the average wait time can be longer than with other scheduling
algorithms. In the following example ( with a time quantum of 4 ms ) the average wait
time is 5.66 ms.

Process Burst Time


P1 24
P2 3
P3 3

• The performance of RR is sensitive to the time quantum selected. If the quantum


is large enough, then RR reduces to the FCFS algorithm; If it is very small, then
each process gets 1/nth of the processor time and shares the CPU equally.
• BUT, a real system invokes overhead for every context switch, and the smaller
the time quantum the more context switches there are. ( See Figure 2.4 below. )
Most modern systems use time quantum between 10 and 100 milliseconds, and

context switch times on the order of 10 microseconds, so the overhead is small
relative to the time quantum.

Figure 2.4 - The way in which a smaller time quantum increases context switches.

• Turnaround time also varies with quantum time, in a non-apparent manner.


Consider, for example the processes shown in Figure 2.5:

Figure 2.5 - The way in which turnaround time varies with the time quantum.

• In general, turnaround time is minimized if most processes finish their next cpu
burst within one time quantum. For example, with three processes of 10 ms bursts
each, the average turnaround time for 1 ms quantum is 29, and for 10 ms quantum
it reduces to 20. However, if it is made too large, then RR just degenerates to
FCFS. A rule of thumb is that 80% of CPU bursts should be smaller than the time
quantum.

2.3.5 Multilevel Queue Scheduling

• When processes can be readily categorized, then multiple separate queues can be
established, each implementing whatever scheduling algorithm is most
appropriate for that type of job, and/or with different parametric adjustments.

• Scheduling must also be done between queues, that is scheduling one queue to get
time relative to other queues. Two common options are strict priority ( no job in a
lower priority queue runs until all higher priority queues are empty ) and round-
robin ( each queue gets a time slice in turn, possibly of different sizes. )
• Note that under this algorithm jobs cannot switch from queue to queue - Once
they are assigned a queue, that is their queue until they finish.

Figure 2.6 - Multilevel queue scheduling

2.3.6 Multilevel Feedback-Queue Scheduling

• Multilevel feedback queue scheduling is similar to the ordinary multilevel queue


scheduling described above, except jobs may be moved from one queue to
another for a variety of reasons:
o If the characteristics of a job change between CPU-intensive and I/O
intensive, then it may be appropriate to switch a job from one queue to
another.
o Aging can also be incorporated, so that a job that has waited for a long
time can get bumped up into a higher priority queue for a while.
• Multilevel feedback queue scheduling is the most flexible, because it can be tuned
for any situation. But it is also the most complex to implement because of all the
adjustable parameters. Some of the parameters which define one of these systems
include:
o The number of queues.
o The scheduling algorithm for each queue.
o The methods used to upgrade or demote processes from one queue to
another. ( Which may be different. )
o The method used to determine which queue a process enters initially.

Figure 2.7 - Multilevel feedback queues.

2.4 Thread Scheduling

• The process scheduler schedules only the kernel threads.


• User threads are mapped to kernel threads by the thread library - The OS ( and in
particular the scheduler ) is unaware of them.

2.4.1 Contention Scope

• Contention scope refers to the scope in which threads compete for the use of
physical CPUs.
• On systems implementing many-to-one and many-to-many threads, Process
Contention Scope, PCS, occurs, because competition occurs between threads that
are part of the same process. ( This is the management / scheduling of multiple
user threads on a single kernel thread, and is managed by the thread library. )
• System Contention Scope, SCS, involves the system scheduler scheduling kernel
threads to run on one or more CPUs. Systems implementing one-to-one threads (
XP, Solaris 9, Linux ), use only SCS.
• PCS scheduling is typically done with priority, where the programmer can set
and/or change the priority of threads created by his or her programs. Even time
slicing is not guaranteed among threads of equal priority.

2.4.2 Pthread Scheduling

• The Pthread library provides for specifying scope contention:


o PTHREAD_SCOPE_PROCESS schedules threads using PCS, by
scheduling user threads onto available LWPs using the many-to-many
model.
o PTHREAD_SCOPE_SYSTEM schedules threads using SCS, by binding
user threads to particular LWPs, effectively implementing a one-to-one
model.
• getscope and setscope methods provide for determining and setting the scope
contention respectively:

Figure 2.8 - Pthread scheduling API.
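
In the actual Pthreads API the "getscope and setscope methods" are pthread_attr_getscope( ) and pthread_attr_setscope( ), which operate on a thread-attributes object. A minimal sketch ( note that some systems support only one of the two scope values ):

#include <pthread.h>
#include <stdio.h>

static void *run( void *arg ) { return arg; }

int main( void ) {
    pthread_t tid;
    pthread_attr_t attr;
    int scope;

    pthread_attr_init( &attr );
    pthread_attr_getscope( &attr, &scope );               /* query default scope */
    printf( "default scope = %s\n",
            scope == PTHREAD_SCOPE_SYSTEM ? "SYSTEM" : "PROCESS" );

    pthread_attr_setscope( &attr, PTHREAD_SCOPE_SYSTEM ); /* request SCS         */
    pthread_create( &tid, &attr, run, NULL );
    pthread_join( tid, NULL );
    pthread_attr_destroy( &attr );
    return 0;
}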

2.5 Multiple-Processor Scheduling

• When multiple processors are available, then the scheduling gets more complicated,
because now there is more than one CPU which must be kept busy and in effective use at
all times.
• Load sharing revolves around balancing the load between multiple processors.
• Multi-processor systems may be heterogeneous, ( different kinds of CPUs ), or
homogenous, ( all the same kind of CPU ). Even in the latter case there may be special
scheduling constraints, such as devices which are connected via a private bus to only one
of the CPUs. This book will restrict its discussion to homogenous systems.

2.5.1 Approaches to Multiple-Processor Scheduling

• One approach to multi-processor scheduling is asymmetric multiprocessing, in


which one processor is the master, controlling all activities and running all kernel
code, while the other runs only user code. This approach is relatively simple, as
there is no need to share critical system data.
• Another approach is symmetric multiprocessing, SMP, where each processor
schedules its own jobs, either from a common ready queue or from separate ready
queues for each processor.
• Virtually all modern OSes support SMP, including XP, Win 2000, Solaris, Linux,
and Mac OSX.

2.5.2 Processor Affinity

• Processors contain cache memory, which speeds up repeated accesses to the same
memory locations.
• If a process were to switch from one processor to another each time it got a time
slice, the data in the cache ( for that process ) would have to be invalidated and re-
loaded from main memory, thereby obviating the benefit of the cache.
• Therefore SMP systems attempt to keep processes on the same processor, via
processor affinity. Soft affinity occurs when the system attempts to keep
processes on the same processor but makes no guarantees. Linux and some other
OSes support hard affinity, in which a process specifies that it is not to be moved
between processors.
• Main memory architecture can also affect process affinity, if particular CPUs
have faster access to memory on the same chip or board than to other memory
loaded elsewhere. ( Non-Uniform Memory Access, NUMA. ) As shown below, if
a process has an affinity for a particular CPU, then it should preferentially be
assigned memory storage in "local" fast access areas.

Figure 2.9 - NUMA and CPU scheduling.

2.5.3 Load Balancing

• Obviously an important goal in a multiprocessor system is to balance the load


between processors, so that one processor won't be sitting idle while another is
overloaded.
• Systems using a common ready queue are naturally self-balancing, and do not
need any special handling. Most systems, however, maintain separate ready
queues for each processor.
• Balancing can be achieved through either push migration or pull migration:
o Push migration involves a separate process that runs periodically, ( e.g.
every 200 milliseconds ), and moves processes from heavily loaded
processors onto less loaded ones.
o Pull migration involves idle processors taking processes from the ready
queues of other processors.
o Push and pull migration are not mutually exclusive.
• Note that moving processes from processor to processor to achieve load balancing
works against the principle of processor affinity, and if not carefully managed, the
savings gained by balancing the system can be lost in rebuilding caches. One
option is to only allow migration when imbalance surpasses a given threshold.

2.5.4 Multicore Processors

• Traditional SMP required multiple CPU chips to run multiple kernel threads
concurrently.
• Recent trends are to put multiple CPUs ( cores ) onto a single chip, which appear
to the system as multiple processors.
• Compute cycles can be blocked by the time needed to access memory, whenever
the needed data is not already present in the cache. ( Cache misses. ) In Figure
2.10, as much as half of the CPU cycles are lost to memory stall.

Figure 2.10 - Memory stall.

• By assigning multiple kernel threads to a single processor, memory stall can be


avoided ( or reduced ) by running one thread on the processor while the other
thread waits for memory.

Figure 2.11 - Multithreaded multicore system.

• A dual-threaded dual-core system has four logical processors available to the


operating system. The UltraSPARC T1 CPU has 8 cores per chip and 4 hardware
threads per core, for a total of 32 logical processors per chip.
• There are two ways to multi-thread a processor:
1. Coarse-grained multithreading switches between threads only when one
thread blocks, say on a memory read. Context switching is similar to
process switching, with considerable overhead.
2. Fine-grained multithreading occurs on smaller regular intervals, say on
the boundary of instruction cycles. However the architecture is designed to
support thread switching, so the overhead is relatively minor.
• Note that for a multi-threaded multi-core system, there are two levels of
scheduling, at the kernel level:
o The OS schedules which kernel thread(s) to assign to which logical
processors, and when to make context switches using algorithms as
described above.
o On a lower level, the hardware schedules logical processors on each
physical core using some other algorithm.
▪ The UltraSPARC T1 uses a simple round-robin method to
schedule the 4 logical processors ( kernel threads ) on each
physical core.
▪ The Intel Itanium is a dual-core chip which uses a 7-level priority
scheme ( urgency ) to determine which thread to schedule when
one of 5 different events occurs.

Virtualization and Scheduling

• Virtualization adds another layer of complexity and scheduling.


• Typically there is one host operating system operating on "real" processor(s) and
a number of guest operating systems operating on virtual processors.
• The Host OS creates some number of virtual processors and presents them to the
guest OSes as if they were real processors.
• The guest OSes don't realize their processors are virtual, and make scheduling
decisions on the assumption of real processors.
• As a result, interactive and especially real-time performance can be severely
compromised on guest systems. The time-of-day clock will also frequently be off.

PROCESS SYNCHRONIZATION

2.6 Background

• Recall that back in Chapter 3 we looked at cooperating processes ( those that can effect or
be effected by other simultaneously running processes ), and as an example, we used the
producer-consumer cooperating processes:

Producer code from chapter 3:

item nextProduced;

while( true ) {

/* Produce an item and store it in nextProduced */


nextProduced = makeNewItem( . . . );

/* Wait for space to become available */


while( ( ( in + 1 ) % BUFFER_SIZE ) == out )
; /* Do nothing */

/* And then store the item and repeat the loop. */


buffer[ in ] = nextProduced;
in = ( in + 1 ) % BUFFER_SIZE;
}

Consumer code from chapter 3:

item nextConsumed;

while( true ) {

/* Wait for an item to become available */


while( in == out )
; /* Do nothing */

/* Get the next available item */


nextConsumed = buffer[ out ];
out = ( out + 1 ) % BUFFER_SIZE;

/* Consume the item in nextConsumed


( Do something with it ) */
}

• The only problem with the above code is that the maximum number of items
which can be placed into the buffer is BUFFER_SIZE - 1. One slot is unavailable
because there always has to be a gap between the producer and the consumer.
• We could try to overcome this deficiency by introducing a counter variable, as
shown in the following code segments:
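
A sketch of those code segments, following the standard textbook version ( counter is shared and initialized to 0 ):

/* producer */
while( true ) {
    /* produce an item in nextProduced */
    while( counter == BUFFER_SIZE )
        ;   /* do nothing - buffer is full */
    buffer[ in ] = nextProduced;
    in = ( in + 1 ) % BUFFER_SIZE;
    counter++;
}

/* consumer */
while( true ) {
    while( counter == 0 )
        ;   /* do nothing - buffer is empty */
    nextConsumed = buffer[ out ];
    out = ( out + 1 ) % BUFFER_SIZE;
    counter--;
    /* consume the item in nextConsumed */
}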

• Unfortunately we have now introduced a new problem, because both the producer
and the consumer are adjusting the value of the variable counter, which can lead
to a condition known as a race condition. In this condition a piece of code may or
may not work correctly, depending on which of two simultaneous processes
executes first, and more importantly if one of the processes gets interrupted such
that the other process runs between important steps of the first process. ( Bank
balance example discussed in class. )
• The particular problem above comes from the producer executing "counter++" at
the same time the consumer is executing "counter--". If one process gets part way
through making the update and then the other process butts in, the value of
counter can get left in an incorrect state.
• But, you might say, "Each of those is a single statement - how can it get
interrupted halfway through?" The answer is that although counter++ and counter-- are
single statements in the source code, they are actually three steps each at the hardware level: (1)
Fetch counter from memory into a register, (2) increment or decrement the
register, and (3) Store the new value of counter back to memory. If the
instructions from the two processes get interleaved, there could be serious
problems, such as illustrated by the following:
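
One possible interleaving ( the standard textbook illustration, assuming counter starts at 5 ):

T0: producer executes register1 = counter         { register1 = 5 }
T1: producer executes register1 = register1 + 1   { register1 = 6 }
T2: consumer executes register2 = counter         { register2 = 5 }
T3: consumer executes register2 = register2 - 1   { register2 = 4 }
T4: producer executes counter = register1         { counter = 6 }
T5: consumer executes counter = register2         { counter = 4 }

The final value of counter is 4, even though the correct result after one produce and one consume should still be 5.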

• Exercise: What would be the resulting value of counter if the order of statements
T4 and T5 were reversed? ( What should the value of counter be after one
producer and one consumer, assuming the original value was 5? )
• Note that race conditions are notoriously difficult to identify and debug, because
by their very nature they only occur on rare occasions, and only when the timing
is just exactly right. ( or wrong! :-) ) Race conditions are also very difficult to
reproduce. :-(
• Obviously the solution is to only allow one process at a time to manipulate the
value "counter". This is a very common occurrence among cooperating processes,
so lets look at some ways in which this is done, as well as some classic problems
in this area.

2.7 The Critical-Section Problem

• The producer-consumer problem described above is a specific example of a more general


situation known as the critical section problem. The general idea is that in a number of
cooperating processes, each has a critical section of code, with the following conditions
and terminologies:
o Only one process in the group can be allowed to execute in their critical section at
any one time. If one process is already executing their critical section and another
process wishes to do so, then the second process must be made to wait until the
first process has completed their critical section work.
o The code preceding the critical section, and which controls access to the critical
section, is termed the entry section. It acts like a carefully controlled locking door.

o The code following the critical section is termed the exit section. It generally
releases the lock on someone else's door, or at least lets the world know that they
are no longer in their critical section.
o The rest of the code not included in either the critical section or the entry or exit
sections is termed the remainder section.

Figure 2.12 - General structure of a typical process Pi

• A solution to the critical section problem must satisfy the following three conditions:
1. Mutual Exclusion - Only one process at a time can be executing in their critical
section.
2. Progress - If no process is currently executing in their critical section, and one or
more processes want to execute their critical section, then only the processes not
in their remainder sections can participate in the decision, and the decision cannot
be postponed indefinitely. ( i.e. processes cannot be blocked forever waiting to get
into their critical sections. )
3. Bounded Waiting - There exists a limit as to how many other processes can get
into their critical sections after a process requests entry into their critical section
and before that request is granted. ( i.e. a process requesting entry into their
critical section will get a turn eventually, and there is a limit as to how many other
processes get to go first. )
• We assume that all processes proceed at a non-zero speed, but no assumptions can be
made regarding the relative speed of one process versus another.
• Kernel processes can also be subject to race conditions, which can be especially
problematic when updating commonly shared kernel data structures such as open file
tables or virtual memory management. Accordingly kernels can take on one of two
forms:
o Non-preemptive kernels do not allow processes to be interrupted while in kernel
mode. This eliminates the possibility of kernel-mode race conditions, but requires
kernel mode operations to complete very quickly, and can be problematic for real-
time systems, because timing cannot be guaranteed.
o Preemptive kernels allow for real-time operations, but must be carefully written to
avoid race conditions. This can be especially tricky on SMP systems, in which
multiple kernel processes may be running simultaneously on different processors.

Non-preemptive kernels include Windows XP, 2000, traditional UNIX, and Linux prior
to 2.6; Preemptive kernels include Linux 2.6 and later, and some commercial UNIXes such as
Solaris and IRIX.

2.8 Peterson's Solution

• Peterson's Solution is a classic software-based solution to the critical section problem. It


is unfortunately not guaranteed to work on modern hardware, due to vagaries of load and
store operations, but it illustrates a number of important concepts.
• Peterson's solution is based on two processes, P0 and P1, which alternate between their
critical sections and remainder sections. For convenience of discussion, "this" process is
Pi, and the "other" process is Pj. ( i.e. j = 1 - i )
• Peterson's solution requires two shared data items:
o int turn - Indicates whose turn it is to enter into the critical section. If turn = = i,
then process i is allowed into their critical section.
o boolean flag[ 2 ] - Indicates when a process wants to enter into their critical
section. When process i wants to enter their critical section, it sets flag[ i ] to true.
• In the following diagram, the entry and exit sections are enclosed in boxes.
o In the entry section, process i first raises a flag indicating a desire to enter the
critical section.
o Then turn is set to j to allow the other process to enter their critical section if
process j so desires.
o The while loop is a busy loop ( notice the semicolon at the end ), which makes
process i wait as long as process j has the turn and wants to enter the critical
section.
o Process i lowers the flag[ i ] in the exit section, allowing process j to continue if it
has been waiting.

Figure 2.13 - The structure of process Pi in Peterson's solution.
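
A sketch of the entry and exit sections for process Pi ( with j = 1 - i ), following the standard presentation:

do {
    flag[ i ] = true;                   /* entry: announce the intent to enter        */
    turn = j;                           /* give the other process the turn            */
    while( flag[ j ] && turn == j )
        ;                               /* busy-wait while j wants in and has the turn */

    /* critical section */

    flag[ i ] = false;                  /* exit: no longer interested                 */

    /* remainder section */
} while( true );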

• To prove that the solution is correct, we must examine the three conditions listed above:

1. Mutual exclusion - If one process is executing their critical section when the
other wishes to do so, the second process will become blocked by the flag of the
first process. If both processes attempt to enter at the same time, the last process
to execute "turn = j" will be blocked.
2. Progress - Each process can only be blocked at the while if the other process
wants to use the critical section ( flag[ j ] = = true ), AND it is the other process's
turn to use the critical section ( turn = = j ). If both of those conditions are true,
then the other process ( j ) will be allowed to enter the critical section, and upon
exiting the critical section, will set flag[ j ] to false, releasing process i. The shared
variable turn assures that only one process at a time can be blocked, and the flag
variable allows one process to release the other when exiting their critical section.
3. Bounded Waiting - As each process enters their entry section, they set the turn
variable to be the other processes turn. Since no process ever sets it back to their
own turn, this ensures that each process will have to let the other process go first
at most one time before it becomes their turn again.
• Note that the instruction "turn = j" is atomic, that is it is a single machine instruction
which cannot be interrupted.

2.9 Synchronization Hardware

• To generalize the solution(s) expressed above, each process when entering their critical
section must set some sort of lock, to prevent other processes from entering their critical
sections simultaneously, and must release the lock when exiting their critical section, to
allow other processes to proceed. Obviously it must be possible to attain the lock only
when no other process has already set a lock. Specific implementations of this general
procedure can get quite complicated, and may include hardware solutions as outlined in
this section.
• One simple solution to the critical section problem is to simply prevent a process from
being interrupted while in their critical section, which is the approach taken by non-
preemptive kernels. Unfortunately this does not work well in multiprocessor
environments, due to the difficulty of disabling and re-enabling interrupts on all
processors. There is also a question as to how this approach affects timing if the clock
interrupt is disabled.
• Another approach is for hardware to provide certain atomic operations. These operations
are guaranteed to operate as a single instruction, without interruption. One such operation
is the "Test and Set", which simultaneously sets a boolean lock variable and returns its
previous value, as shown in Figures 2.14 and 2.15:

Figures 2.14 and 2.15 illustrate "test_and_set( )" function
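
A sketch of test_and_set( ) and its use for mutual exclusion ( lock is a shared boolean initialized to false ):

boolean test_and_set( boolean *target ) {
    boolean rv = *target;      /* the whole function executes atomically */
    *target = true;
    return rv;
}

do {
    while( test_and_set( &lock ) )
        ;                      /* busy-wait until the lock was free      */

    /* critical section */

    lock = false;              /* release the lock                       */

    /* remainder section */
} while( true );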

• Another variation on the test-and-set is an atomic swap of two booleans.

• The above examples satisfy the mutual exclusion requirement, but unfortunately do not
guarantee bounded waiting. If there are multiple processes trying to get into their critical
sections, there is no guarantee of what order they will enter, and any one process could
have the bad luck to wait forever until they got their turn in the critical section. ( Since
there is no guarantee as to the relative rates of the processes, a very fast process could
theoretically release the lock, whip through their remainder section, and re-lock the lock
before a slower process got a chance. As more and more processes are involved vying for
the same resource, the odds of a slow process getting locked out completely increase. )

• Figure 2.16 illustrates a solution using test-and-set that does satisfy this requirement, using
two shared data structures, boolean lock and boolean waiting[ N ], where N is the number
of processes in contention for critical sections:

Figure 2.16 Bounded-waiting mutual exclusion with TestAndSet( ).

• The key feature of the above algorithm is that a process blocks on the AND of the critical
section being locked and that this process is in the waiting state. When exiting a critical
section, the exiting process does not just unlock the critical section and let the other
processes have a free-for-all trying to get in. Rather it first looks in an orderly progression
( starting with the next process on the list ) for a process that has been waiting, and if it
finds one, then it releases that particular process from its waiting state, without unlocking
the critical section, thereby allowing a specific process into the critical section while
continuing to block all the others. Only if there are no other processes currently waiting is
the general lock removed, allowing the next process to come along access to the critical
section.
• Unfortunately, hardware level locks are especially difficult to implement in multi-
processor architectures. Discussion of such issues is left to books on advanced computer
architecture.

2.10 Mutex Locks

• The hardware solutions presented above are often difficult for ordinary programmers to
access, particularly on multi-processor machines, and particularly because they are often
platform-dependent.
• Therefore most systems offer a software API equivalent called mutex locks or simply
mutexes. ( For mutual exclusion )
• The terminology when using mutexes is to acquire a lock prior to entering a critical
section, and to release it when exiting, as shown in Figure 2.17:

Figure 2.17 - Solution to the critical-section problem using mutex locks

• Just as with hardware locks, the acquire step will block the process if the lock is in use by
another process, and both the acquire and release operations are atomic.
• Acquire and release can be implemented as shown here, based on a boolean variable
"available":

• One problem with the implementation shown here, ( and in the hardware solutions
presented earlier ), is the busy loop used to block processes in the acquire phase. These
types of locks are referred to as spinlocks, because the CPU just sits and spins while
blocking the process.
• Spinlocks are wasteful of cpu cycles, and are a really bad idea on single-cpu single-
threaded machines, because the spinlock blocks the entire computer, and doesn't allow
any other process to release the lock. ( Until the scheduler kicks the spinning process off
of the cpu. )
• On the other hand, spinlocks do not incur the overhead of a context switch, so they are
effectively used on multi-threaded machines when it is expected that the lock will be
released after a short time.

2.11 Semaphores

• A more robust alternative to simple mutexes is to use semaphores, which are integer
variables for which only two ( atomic ) operations are defined, the wait and signal
operations, as shown in the following figure.
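
The classic busy-waiting definitions ( S is an integer semaphore ):

wait( S ) {
    while( S <= 0 )
        ;          /* busy wait */
    S--;
}

signal( S ) {
    S++;
}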

• Note that not only must the variable-changing steps ( S-- and S++ ) be indivisible, it is
also necessary that for the wait operation when the test proves false that there be no
interruptions before S gets decremented. It IS okay, however, for the busy loop to be
interrupted when the test is true, which prevents the system from hanging forever.

2.11.1 Semaphore Usage

• In practice, semaphores can take on one of two forms:


o Binary semaphores can take on one of two values, 0 or 1. They can be
used to solve the critical section problem as described above, and can be
used as mutexes on systems that do not provide a separate mutex
mechanism. The use of binary semaphores for this purpose is shown in the
figure below ( from the 8th edition ).

Mutual-exclusion implementation with semaphores. ( From 8th


edition. )

o Counting semaphores can take on any integer value, and are usually used
to count the number remaining of some limited resource. The counter is

initialized to the number of such resources available in the system, and
whenever the counting semaphore is greater than zero, then a process can
enter a critical section and use one of the resources. When the counter gets
to zero ( or negative in some implementations ), then the process blocks
until another process frees up a resource and increments the counting
semaphore with a signal call. ( The binary semaphore can be seen as just a
special case where the number of resources initially available is just one. )
o Semaphores can also be used to synchronize certain operations between
processes. For example, suppose it is important that process P1 execute
statement S1 before process P2 executes statement S2.
▪ First we create a semaphore named synch that is shared by the two
processes, and initialize it to zero.
▪ Then in process P1 we insert the code:

S1;
signal( synch );

▪ and in process P2 we insert the code:

wait( synch );
S2;

▪ Because synch was initialized to 0, process P2 will block on the


wait until after P1 executes the call to signal.

2.11.2 Semaphore Implementation

• The big problem with semaphores as described above is the busy loop in the wait
call, which consumes CPU cycles without doing any useful work. This type of
lock is known as a spinlock, because the lock just sits there and spins while it
waits. While this is generally a bad thing, it does have the advantage of not
invoking context switches, and so it is sometimes used in multi-processing
systems when the wait time is expected to be short - One thread spins on one
processor while another completes their critical section on another processor.
• An alternative approach is to block a process when it is forced to wait for an
available semaphore, and swap it out of the CPU. In this implementation each
semaphore needs to maintain a list of processes that are blocked waiting for it, so
that one of the processes can be woken up and swapped back in when the
semaphore becomes available. ( Whether it gets swapped back into the CPU
immediately or whether it needs to hang out in the ready queue for a while is a
scheduling problem. )
• The new definition of a semaphore and the corresponding wait and signal
operations are shown as follows:
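
A sketch of that definition, following the standard textbook version ( block( ) suspends the calling process and wakeup( P ) moves process P to the ready queue ):

typedef struct {
    int value;
    struct process *list;      /* queue of processes blocked on this semaphore */
} semaphore;

wait( semaphore *S ) {
    S->value--;
    if( S->value < 0 ) {
        /* add this process to S->list */
        block( );
    }
}

signal( semaphore *S ) {
    S->value++;
    if( S->value <= 0 ) {
        /* remove a process P from S->list */
        wakeup( P );
    }
}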

• Note that in this implementation the value of the semaphore can actually become
negative, in which case its magnitude is the number of processes waiting for that
semaphore. This is a result of decrementing the counter before checking its value.
• Key to the success of semaphores is that the wait and signal operations be atomic,
that is no other process can execute a wait or signal on the same semaphore at the
same time. ( Other processes could be allowed to do other things, including
working with other semaphores, they just can't have access to this semaphore. )
On single processors this can be implemented by disabling interrupts during the
execution of wait and signal; Multiprocessor systems have to use more complex
methods, including the use of spinlocking.

2.11.3 Deadlocks and Starvation

• One important problem that can arise when using semaphores to block processes
waiting for a limited resource is the problem of deadlocks, which occur when
multiple processes are blocked, each waiting for a resource that can only be freed
by one of the other (blocked) processes, as illustrated in the following example
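
The classic illustration uses two semaphores S and Q, both initialized to 1:

        P0                      P1
    wait( S );              wait( Q );
    wait( Q );              wait( S );
      . . .                   . . .
    signal( S );            signal( Q );
    signal( Q );            signal( S );

If P0 executes wait( S ) and then P1 executes wait( Q ), each process blocks on its second wait( ), waiting for a signal that only the other ( blocked ) process can issue - a deadlock.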

• Another problem to consider is that of starvation, in which one or more processes
gets blocked forever, and never get a chance to take their turn in the critical
section. For example, in the semaphores above, we did not specify the algorithms
for adding processes to the waiting queue in the semaphore in the wait( ) call, or
selecting one to be removed from the queue in the signal( ) call. If the method
chosen is a FIFO queue, then every process will eventually get their turn, but if a
LIFO queue is implemented instead, then the first process to start waiting could
starve.

2.11.4 Priority Inversion

• A challenging scheduling problem arises when a high-priority process gets


blocked waiting for a resource that is currently held by a low-priority process.
• If the low-priority process gets pre-empted by one or more medium-priority
processes, then the high-priority process is essentially made to wait for the
medium priority processes to finish before the low-priority process can release the
needed resource, causing a priority inversion. If there are enough medium-
priority processes, then the high-priority process may be forced to wait for a very
long time.
• One solution is a priority-inheritance protocol, in which a low-priority process
holding a resource for which a high-priority process is waiting will temporarily
inherit the high priority from the waiting process. This prevents the medium-
priority processes from preempting the low-priority process until it releases the
resource, blocking the priority inversion problem.
• The book has an interesting discussion of how a priority inversion almost doomed
the Mars Pathfinder mission, and how the problem was solved by enabling priority
inheritance on the mutex involved. Full details are available online at
http://research.microsoft.com/en-
us/um/people/mbj/mars_pathfinder/authoritative_account.htm

2.12 Classic Problems of Synchronization

The following classic problems are used to test virtually every new proposed synchronization
algorithm.

2.12.1 The Bounded-Buffer Problem

• This is a generalization of the producer-consumer problem wherein access is


controlled to a shared group of buffers of a limited size.
• In this solution, the two counting semaphores "full" and "empty" keep track of the
current number of full and empty buffers respectively ( and initialized to 0 and N
respectively. ) The binary semaphore mutex controls access to the critical section.
The producer and consumer processes are nearly identical - One can think of the
producer as producing full buffers, and the consumer producing empty buffers.

Figures 2.18 and 2.19 use variables next_produced and next_consumed
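
A sketch of the producer and consumer structures, using the semaphores mutex ( initialized to 1 ), empty ( initialized to N ) and full ( initialized to 0 ):

/* producer */
do {
    /* produce an item in next_produced */
    wait( empty );             /* wait for an empty slot       */
    wait( mutex );             /* enter the critical section   */
    /* add next_produced to the buffer */
    signal( mutex );
    signal( full );            /* one more full slot           */
} while( true );

/* consumer */
do {
    wait( full );              /* wait for a full slot         */
    wait( mutex );
    /* remove an item from the buffer into next_consumed */
    signal( mutex );
    signal( empty );           /* one more empty slot          */
    /* consume the item in next_consumed */
} while( true );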

2.12.2 The Readers-Writers Problem

• In the readers-writers problem there are some processes ( termed readers ) who
only read the shared data, and never change it, and there are other processes (
termed writers ) who may change the data in addition to or instead of reading it.
There is no limit to how many readers can access the data simultaneously, but
when a writer accesses the data, it needs exclusive access.
• There are several variations to the readers-writers problem, most centered around
relative priorities of readers versus writers.
o The first readers-writers problem gives priority to readers. In this problem,
if a reader wants access to the data, and there is not already a writer
accessing it, then access is granted to the reader. A solution to this
problem can lead to starvation of the writers, as there could always be
more readers coming along to access the data. ( A steady stream of readers
will jump ahead of waiting writers as long as there is currently already
another reader accessing the data, because the writer is forced to wait until
the data is idle, which may never happen if there are enough readers. )
o The second readers-writers problem gives priority to the writers. In this
problem, when a writer wants access to the data it jumps to the head of the
queue - All waiting readers are blocked, and the writer gets access to the
data as soon as it becomes available. In this solution the readers may be
starved by a steady stream of writers.
• The following code is an example of the first readers-writers problem, and
involves an important counter and two binary semaphores:
o readcount is used by the reader processes, to count the number of readers
currently accessing the data.
o mutex is a semaphore used only by the readers for controlled access to
readcount.
o rw_mutex is a semaphore used to block and release the writers. The first
reader to access the data will set this lock and the last reader to exit will
release it; The remaining readers do not touch rw_mutex. ( Eighth edition
called this variable wrt. )
o Note that the first reader to come along will block on rw_mutex if there is
currently a writer accessing the data, and that all following readers will
only block on mutex for their turn to increment readcount.
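
A sketch of the writer and reader structures ( mutex and rw_mutex initialized to 1, readcount to 0 ):

/* writer */
do {
    wait( rw_mutex );
    /* writing is performed */
    signal( rw_mutex );
} while( true );

/* reader */
do {
    wait( mutex );
    readcount++;
    if( readcount == 1 )
        wait( rw_mutex );      /* first reader locks out the writers */
    signal( mutex );

    /* reading is performed */

    wait( mutex );
    readcount--;
    if( readcount == 0 )
        signal( rw_mutex );    /* last reader lets the writers in    */
    signal( mutex );
} while( true );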

• Some hardware implementations provide specific reader-writer locks, which are
accessed using an argument specifying whether access is requested for reading or
writing. The use of reader-writer locks is beneficial for situation in which: (1)
processes can be easily identified as either readers or writers, and (2) there are
significantly more readers than writers, making the additional overhead of the
reader-writer lock pay off in terms of increased concurrency of the readers.

2.12.3 The Dining-Philosophers Problem

• The dining philosophers problem is a classic synchronization problem involving


the allocation of limited resources amongst a group of processes in a deadlock-
free and starvation-free manner:
o Consider five philosophers sitting around a table, in which there are five
chopsticks evenly distributed and an endless bowl of rice in the center, as
shown in the diagram below. ( There is exactly one chopstick between
each pair of dining philosophers. )
o These philosophers spend their lives alternating between two activities:
eating and thinking.
o When it is time for a philosopher to eat, it must first acquire two
chopsticks - one from their left and one from their right.
o When a philosopher thinks, it puts down both chopsticks in their original
locations.

Figure 2.20 - The situation of the dining philosophers

• One possible solution, as shown in the following code section, is to use a set of
five semaphores ( chopsticks[ 5 ] ), and to have each hungry philosopher first wait
on their left chopstick ( chopsticks[ i ] ), and then wait on their right chopstick (
chopsticks[ ( i + 1 ) % 5 ] )
• But suppose that all five philosophers get hungry at the same time, and each starts
by picking up their left chopstick. They then look for their right chopstick, but
because it is unavailable, they wait for it, forever, and eventually all the
philosophers starve due to the resulting deadlock.

Figure 2.21 - The structure of philosopher i.
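
The structure of philosopher i referred to above ( each chopsticks[ ] semaphore initialized to 1 ):

do {
    wait( chopsticks[ i ] );                  /* pick up left chopstick   */
    wait( chopsticks[ ( i + 1 ) % 5 ] );      /* pick up right chopstick  */

    /* eat for a while */

    signal( chopsticks[ i ] );                /* put down left chopstick  */
    signal( chopsticks[ ( i + 1 ) % 5 ] );    /* put down right chopstick */

    /* think for a while */
} while( true );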

• Some potential solutions to the problem include:


o Only allow four philosophers to dine at the same time. ( Limited
simultaneous processes. )
o Allow philosophers to pick up chopsticks only when both are available, in
a critical section. ( All or nothing allocation of critical resources. )

o Use an asymmetric solution, in which odd philosophers pick up their left
chopstick first and even philosophers pick up their right chopstick first. (
Will this solution always work? What if there are an even number of
philosophers? )
• Note carefully that a deadlock-free solution to the dining philosophers problem
does not necessarily guarantee a starvation-free one. ( While some or even most
of the philosophers may be able to get on with their normal lives of eating and
thinking, there may be one unlucky soul who never seems to be able to get both
chopsticks at the same time. :-(

2.13 Monitors

• Semaphores can be very useful for solving concurrency problems, but only if
programmers use them properly. If even one process fails to abide by the proper use of
semaphores, either accidentally or deliberately, then the whole system breaks down. (
And since concurrency problems are by definition rare events, the problem code may
easily go unnoticed and/or be heinous to debug. )
• For this reason a higher-level language construct has been developed, called monitors.

2.13.1 Monitor Usage

• A monitor is essentially a class, in which all data is private, and with the special
restriction that only one method within any given monitor object may be active at
the same time. An additional restriction is that monitor methods may only access
the shared data within the monitor and any data passed to them as parameters. I.e.
they cannot access any data external to the monitor.

Figure 2.22 - Syntax of a monitor.

• Figure 2.23 shows a schematic of a monitor, with an entry queue of processes
waiting their turn to execute monitor operations ( methods. )

Figure 2.23 - Schematic view of a monitor

• In order to fully realize the potential of monitors, we need to introduce one


additional new data type, known as a condition.
o A variable of type condition has only two legal operations, wait and
signal. I.e. if X was defined as type condition, then legal operations would
be X.wait( ) and X.signal( )
o The wait operation blocks a process until some other process calls signal,
and adds the blocked process onto a list associated with that condition.
o The signal process does nothing if there are no processes waiting on that
condition. Otherwise it wakes up exactly one process from the condition's
list of waiting processes. ( Contrast this with counting semaphores, which
always affect the semaphore on a signal call. )
• Figure 2.24 below illustrates a monitor that includes condition variables within its
data space. Note that the condition variables, along with the list of processes
currently waiting for the conditions, are in the data space of the monitor - The
processes on these lists are not "in" the monitor, in the sense that they are not
executing any code in the monitor.

Figure 2.24 - Monitor with condition variables

• But now there is a potential problem - If process P within the monitor issues a
signal that would wake up process Q also within the monitor, then there would be
two processes running simultaneously within the monitor, violating the exclusion
requirement. Accordingly there are two possible solutions to this dilemma:

Signal and wait - When process P issues the signal to wake up process Q, P then waits, either
for Q to leave the monitor or on some other condition.

Signal and continue - When P issues the signal, Q waits, either for P to exit the monitor or for
some other condition.

There are arguments for and against either choice. Concurrent Pascal offers a third alternative -
The signal call causes the signaling process to immediately exit the monitor, so that the waiting
process can then wake up and proceed.

• Java and C# ( C sharp ) offer monitors built-in to the language. Erlang offers
similar but different constructs.

2.13.2 Dining-Philosophers Solution Using Monitors

• This solution to the dining philosophers uses monitors, and the restriction that a
philosopher may only pick up chopsticks when both are available. There are also
two key data structures in use in this solution:
1. enum { THINKING, HUNGRY, EATING } state[ 5 ]; A philosopher may
only set their state to eating when neither of their adjacent neighbors is
eating. ( state[ ( i + 1 ) % 5 ] != EATING && state[ ( i + 4 ) % 5 ] !=
EATING ).
2. condition self[ 5 ]; This condition is used to delay a hungry philosopher
who is unable to acquire chopsticks.
• In the following solution philosophers share a monitor, DiningPhilosophers, and
eat using the following sequence of operations:

1. DiningPhilosophers.pickup( ) - Acquires chopsticks, which may block the
process.
2. eat
3. DiningPhilosophers.putdown( ) - Releases the chopsticks.

2.13.3 Implementing a Monitor Using Semaphores

• One possible implementation of a monitor uses a semaphore "mutex" to control


mutual exclusionary access to the monitor, and a counting semaphore "next" on
which processes can suspend themselves after they are already "inside" the
monitor ( in conjunction with condition variables, see below. ) The integer
next_count keeps track of how many processes are waiting in the next queue.
Externally accessible monitor processes are then implemented as:
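
A sketch of the wrapper around each external monitor procedure F, using mutex ( initialized to 1 ), next ( initialized to 0 ) and the integer next_count ( initialized to 0 ):

wait( mutex );
    /* . . . body of F . . . */
if( next_count > 0 )
    signal( next );     /* resume a process suspended inside the monitor */
else
    signal( mutex );    /* otherwise open the monitor to a new entrant   */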

• Condition variables can be implemented using semaphores as well. For a
condition x, a semaphore "x_sem" and an integer "x_count" are introduced, both
initialized to zero. The wait and signal methods are then implemented as follows.
( This approach to the condition implements the signal-and-wait option described
above for ensuring that only one process at a time is active inside the monitor. )
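
The corresponding implementations of x.wait( ) and x.signal( ) are:

/* x.wait( ) */
x_count++;
if( next_count > 0 )
    signal( next );
else
    signal( mutex );
wait( x_sem );
x_count--;

/* x.signal( ) */
if( x_count > 0 ) {
    next_count++;
    signal( x_sem );
    wait( next );
    next_count--;
}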

2.13.4 Resuming Processes Within a Monitor


• When there are multiple processes waiting on the same condition within a
monitor, how does one decide which one to wake up in response to a signal on
that condition? One obvious approach is FCFS, and this may be suitable in many
cases.
• Another alternative is to assign (integer) priorities, and to wake up the process
with the smallest ( best ) priority.
• Figure 2.25 illustrates the use of such a condition within a monitor used for
resource allocation. Processes wishing to access this resource must specify the
time they expect to use it using the acquire( time ) method, and must call the
release( ) method when they are done with the resource.

Figure 2.25 - A monitor to allocate a single resource.


• Unfortunately the use of monitors to restrict access to resources still only works if
programmers make the requisite acquire and release calls properly. One option
would be to place the resource allocation code into the monitor, thereby
eliminating the option for programmers to bypass or ignore the monitor, but then
that would substitute the monitor's scheduling algorithms for whatever other
scheduling algorithms may have been chosen for that particular resource. Chapter
14 on Protection presents more advanced methods for enforcing "nice"
cooperation among processes contending for shared resources.
• Concurrent Pascal, Mesa, C#, and Java all implement monitors as described here.
Erlang provides concurrency support using a similar mechanism.

2.15 Synchronization Examples (Optional)


This section looks at how synchronization is handled in a number of different systems.

2.15.1 Synchronization in Windows

Figure 2.26 - Mutex dispatcher object

2.15.2 Synchronization in Linux

UNIT-III
Main Memory
3.1 Background
• Obviously memory accesses and memory management are a very important part of
modern computer operation. Every instruction has to be fetched from memory before it
can be executed, and most instructions involve retrieving data from memory or storing
data in memory or both.
• The advent of multi-tasking OSs compounds the complexity of memory management,
because as processes are swapped in and out of the CPU, so must their code and data be
swapped in and out of memory, all at high speeds and without interfering with any other
processes.
• Shared memory, virtual memory, the classification of memory as read-only versus read-
write, and concepts like copy-on-write forking all further complicate the issue.

3.1.1 Basic Hardware


• It should be noted that from the memory chips point of view, all memory accesses are
equivalent. The memory hardware doesn't know what a particular part of memory is
being used for, nor does it care. This is almost true of the OS as well, although not
entirely.
• The CPU can only access its registers and main memory. It cannot, for example, make
direct access to the hard drive, so any data stored there must first be transferred into the
main memory chips before the CPU can work with it. ( Device drivers communicate with
their hardware via interrupts and "memory" accesses, sending short instructions for
example to transfer data from the hard drive to a specified location in main memory. The
disk controller monitors the bus for such instructions, transfers the data, and then notifies
the CPU that the data is there with another interrupt, but the CPU never gets direct access
to the disk. )
• Memory accesses to registers are very fast, generally one clock tick, and a CPU may be
able to execute more than one machine instruction per clock tick.
• Memory accesses to main memory are comparatively slow, and may take a number of
clock ticks to complete. This would require intolerable waiting by the CPU if it were not
for an intermediary fast memory cache built into most modern CPUs. The basic idea of
the cache is to transfer chunks of memory at a time from the main memory
to the cache, and then to access individual memory locations one at a time
from the cache.
• User processes must be restricted so that they only access memory locations that
"belong" to that particular process. This is usually implemented using a base register and
a limit register for each process, as shown in Figures 3.1 and 3.2 below. Every memory
access made by a user process is checked against these two registers, and if a memory
access is attempted outside the valid range, then a fatal error is generated. The OS
obviously has access to all existing memory locations, as this is necessary to swap users'
code and data in and out of memory. It should also be obvious that changing the contents
of the base and limit registers is a privileged activity, allowed only to the OS kernel.
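
In software terms the hardware check amounts to the comparison below, performed on every user-mode access ( a sketch only; on real hardware the MMU does this, not code ):

#include <stdint.h>
#include <stdbool.h>

/* True if a user-mode access to 'addr' is legal for a process whose base and
 * limit registers hold 'base' and 'limit'; otherwise the hardware traps to the OS. */
bool access_ok(uint32_t addr, uint32_t base, uint32_t limit)
{
    return addr >= base && addr < base + limit;
}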

Figure 3.1 - A base and a limit register define a logical address space

Figure 3.2 - Hardware address protection with base and limit registers

3.1.2 Address Binding

• User programs typically refer to memory addresses with symbolic names such as "i",
"count", and "averageTemperature". These symbolic names must be mapped or bound to
physical memory addresses, which typically occurs in several stages:
o Compile Time - If it is known at compile time where a program will reside in
physical memory, then absolute code can be generated by the compiler,
containing actual physical addresses. However if the load address changes at
some later time, then the program will have to be recompiled. DOS .COM
programs use compile time binding.
o Load Time - If the location at which a program will be loaded is not known at
compile time, then the compiler must generate relocatable code, which references
addresses relative to the start of the program. If that starting address changes, then
the program must be reloaded but not recompiled.
o Execution Time - If a program can be moved around in memory during the
course of its execution, then binding must be delayed until execution time. This
requires special hardware, and is the method implemented by most modern OSes.
• Figure 3.3 shows the various stages of the binding processes and the units involved in
each stage:

Figure 3.3 - Multistep processing of a user program

3.1.3 Logical Versus Physical Address Space

• The address generated by the CPU is a logical address, whereas the address actually seen
by the memory hardware is a physical address.
• Addresses bound at compile time or load time have identical logical and physical
addresses.
• Addresses created at execution time, however, have different logical and physical
addresses.
o In this case the logical address is also known as a virtual address, and the two
terms are used interchangeably by our text.
o The set of all logical addresses used by a program composes the logical address
space, and the set of all corresponding physical addresses composes the physical
address space.
• The run time mapping of logical to physical addresses is handled by the memory-
management unit, MMU.
o The MMU can take on many forms. One of the simplest is a modification of the
base-register scheme described earlier.
o The base register is now termed a relocation register, whose value is added to
every memory request at the hardware level.
• Note that user programs never see physical addresses. User programs work entirely in
logical address space, and any memory references or manipulations are done using purely
logical addresses. Only when the address gets sent to the physical memory chips is the
physical memory address generated.

Figure 3.4 - Dynamic relocation using a relocation register

3.1.4 Dynamic Loading

• Rather than loading an entire program into memory at once, dynamic loading loads up
each routine as it is called. The advantage is that unused routines need never be loaded,
reducing total memory usage and generating faster program startup times. The downside
is the added complexity and overhead of checking to see if a routine is loaded every time
it is called and then loading it up if it is not already loaded.

3.1.5 Dynamic Linking and Shared Libraries

• With static linking library modules get fully included in executable modules, wasting
both disk space and main memory usage, because every program that included a certain
routine from the library would have to have their own copy of that routine linked into
their executable code.
• With dynamic linking, however, only a stub is linked into the executable module,
containing references to the actual library module linked in at run time.
o This method saves disk space, because the library routines do not need to be fully
included in the executable modules, only the stubs.
o We will also learn that if the code section of the library routines is reentrant,
( meaning it does not modify the code while it runs, making it safe to re-enter it ),
then main memory can be saved by loading only one copy of dynamically linked
routines into memory and sharing the code amongst all processes that are
concurrently using it. ( Each process would have their own copy of the data
section of the routines, but that may be small relative to the code segments. )
Obviously the OS must manage shared routines in memory.
o An added benefit of dynamically linked libraries ( DLLs, also known as shared
libraries or shared objects on UNIX systems ) involves easy upgrades and
updates. When a program uses a routine from a standard library and the routine
changes, then the program must be re-built ( re-linked ) in order to incorporate the
changes. However if DLLs are used, then as long as the stub doesn't change, the
program can be updated merely by loading new versions of the DLLs onto the
system. Version information is maintained in both the program and the DLLs, so
that a program can specify a particular version of the DLL if necessary.
o In practice, the first time a program calls a DLL routine, the stub will recognize
the fact and will replace itself with the actual routine from the DLL library.
Further calls to the same routine will access the routine directly and not incur the
overhead of the stub access. ( Following the UML Proxy Pattern. )
o ( Additional information regarding dynamic linking is available at
http://www.iecc.com/linker/linker10.html )
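
On UNIX-like systems the same run-time linking can also be requested explicitly through the dlopen interface, as in the sketch below, which loads the math library and looks up cos. The library file name libm.so.6 is Linux-specific, and the program may need to be linked with -ldl on older systems.

#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    /* Load a shared library at run time rather than at link time. */
    void *handle = dlopen("libm.so.6", RTLD_LAZY);
    if (!handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }

    /* Look up a symbol in the loaded library. */
    double (*cosine)(double) = (double (*)(double)) dlsym(handle, "cos");
    if (cosine)
        printf("cos(0.0) = %f\n", cosine(0.0));

    dlclose(handle);
    return 0;
}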

3.2 Swapping
• A process must be loaded into memory in order to execute.
• If there is not enough memory available to keep all running processes in memory at the
same time, then some processes who are not currently using the CPU may have their
memory swapped out to a fast local disk called the backing store.

3.2.1 Standard Swapping


• If compile-time or load-time address binding is used, then processes must be swapped
back into the same memory location from which they were swapped out. If execution
time binding is used, then the processes can be swapped back into any available location.
• Swapping is a very slow process compared to other operations. For example, if a user
process occupied 10 MB and the transfer rate for the backing store were 40 MB per
second, then it would take 1/4 second ( 250 milliseconds ) just to do the data transfer.
Adding in a latency lag of 8 milliseconds and ignoring head seek time for the moment,
and further recognizing that swapping involves moving old data out as well as new data
in, the overall transfer time required for this swap is 516 milliseconds, or over half a
second. For efficient processor scheduling, the CPU time slice should be significantly
longer than this lost transfer time.
• To reduce swapping transfer overhead, it is desired to transfer as little information as
possible, which requires that the system know how much memory a process is using, as
opposed to how much it might use. Programmers can help with this by freeing up
dynamic memory that they are no longer using.
• It is important to swap processes out of memory only when they are idle, or more to the
point, only when there are no pending I/O operations. ( Otherwise the pending I/O
operation could write into the wrong process's memory space. ) The solution is to either
swap only totally idle processes, or do I/O operations only into and out of OS buffers,
which are then transferred to or from process's main memory as a second step.
• Most modern OSes no longer use swapping, because it is too slow and there are faster
alternatives available. ( e.g. Paging. ) However some UNIX systems will still invoke
swapping if the system gets extremely full, and then discontinue swapping when the load
reduces again. Windows 3.1 used a modified version of swapping that was
somewhat controlled by the user, swapping processes out if necessary and then only
swapping them back in when the user focused on that particular window.

Figure 3.5 - Swapping of two processes using a disk as a backing store

3.3 Contiguous Memory Allocation

• One approach to memory management is to load each process into a contiguous space.
The operating system is allocated space first, usually at either low or high memory
locations, and then the remaining available memory is allocated to processes as needed.
( The OS is usually loaded low, because that is where the interrupt vectors are located,
but on older systems part of the OS was loaded high to make more room in low memory (
within the 640K barrier ) for user processes. )

3.3.1 Memory Protection

• The system shown in Figure 3.6 below allows protection against user programs accessing
areas that they should not, allows programs to be relocated to different memory starting
addresses as needed, and allows the memory space devoted to the OS to grow or shrink
dynamically as needs change.

Figure 3.6 - Hardware support for relocation and limit registers

3.3.2 Memory Allocation

• One method of allocating contiguous memory is to divide all available memory into
equal sized partitions, and to assign each process to their own partition. This restricts both
the number of simultaneous processes and the maximum size of each process, and is no
longer used.
• An alternate approach is to keep a list of unused ( free ) memory blocks ( holes ), and to
find a hole of a suitable size whenever a process needs to be loaded into memory. There
are many different strategies for finding the "best" allocation of memory to processes,
including the three most commonly discussed:
1. First fit - Search the list of holes until one is found that is big enough to satisfy
the request, and assign a portion of that hole to that process. Whatever fraction of
the hole not needed by the request is left on the free list as a smaller hole.
Subsequent requests may start looking either from the beginning of the list or
from the point at which this search ended.
2. Best fit - Allocate the smallest hole that is big enough to satisfy the request. This
saves large holes for other process requests that may need them later, but the
resulting unused portions of holes may be too small to be of any use, and will
therefore be wasted. Keeping the free list sorted can speed up the process of
finding the right hole.
3. Worst fit - Allocate the largest hole available, thereby increasing the likelihood
that the remaining portion will be usable for satisfying future requests.

• Simulations show that either first or best fit are better than worst fit in terms of both time
and storage utilization. First and best fits are about equal in terms of storage utilization,
but first fit is faster.
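
A minimal first-fit allocator over a singly linked free list might look like the sketch below; hole splitting is shown, while coalescing of freed blocks is omitted. The structure and function names are illustrative, not from the text.

#include <stddef.h>

struct hole {
    size_t start;                        /* starting address of the free block */
    size_t size;                         /* size of the free block             */
    struct hole *next;
};

/* First fit: return the start of an allocated region of 'request' bytes,
 * or (size_t)-1 if no single hole is large enough. */
size_t first_fit(struct hole *free_list, size_t request)
{
    for (struct hole *h = free_list; h != NULL; h = h->next) {
        if (h->size >= request) {
            size_t addr = h->start;
            h->start += request;         /* the leftover stays on the free list */
            h->size  -= request;
            return addr;
        }
    }
    return (size_t)-1;                   /* no hole fits: external fragmentation */
}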

3.3.3 Fragmentation
• All the memory allocation strategies suffer from external fragmentation, though first and
best fits experience the problems more so than worst fit. External fragmentation means
that the available memory is broken up into lots of little pieces, none of which is big
enough to satisfy the next memory requirement, although the sum total could.
• The amount of memory lost to fragmentation may vary with algorithm, usage patterns,
and some design decisions such as which end of a hole to allocate and which end to save
on the free list.
• Statistical analysis of first fit, for example, shows that for N blocks of allocated memory,
another 0.5 N will be lost to fragmentation.
• Internal fragmentation also occurs, with all memory allocation strategies. This is caused
by the fact that memory is allocated in blocks of a fixed size, whereas the actual memory
needed will rarely be that exact size. For a random distribution of memory requests, on
the average 1/2 block will be wasted per memory request, because on the average the last
allocated block will be only half full.
o Note that the same effect happens with hard drives, and that modern hardware
gives us increasingly larger drives and memory at the expense of ever larger block
sizes, which translates to more memory lost to internal fragmentation.
o Some systems use variable size blocks to minimize losses due to internal
fragmentation.
• If the programs in memory are relocatable, ( using execution-time address binding ), then
the external fragmentation problem can be reduced via compaction, i.e. moving all
processes down to one end of physical memory. This only involves updating the
relocation register for each process, as all internal work is done using logical addresses.
• Another solution as we will see in upcoming sections is to allow processes to use non-
contiguous blocks of physical memory, with a separate relocation register for each block.

3.4 Segmentation
3.4.1 Basic Method
• Most users (programmers) do not think of their programs as existing in one continuous
linear address space.
• Rather they tend to think of their memory in multiple segments, each dedicated to a
particular use, such as code, data, the stack, the heap, etc.
• Memory segmentation supports this view by providing addresses with a segment number
(mapped to a segment base address) and an offset from the beginning of that segment.
• For example, a C compiler might generate 5 segments for the user code, library code,
global ( static ) variables, the stack, and the heap, as shown in Figure 3.7:

Figure 3.7 Programmer's view of a program.

3.4.2 Segmentation Hardware


• A segment table maps segment-offset addresses to physical addresses, and
simultaneously checks for invalid addresses, using a system similar to the page tables and
relocation base registers discussed previously. (Note that at this point in the discussion of
segmentation, each segment is kept in contiguous memory and may be of different sizes,
but that segmentation can also be combined with paging as we shall see shortly.)

Figure 3.8 - Segmentation hardware
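
In code, the translation performed by the segmentation hardware of Figure 3.8 reduces to a bounds check followed by an addition; the sketch below assumes a simple array-based segment table.

#include <stdint.h>

struct segment { uint32_t base; uint32_t limit; };

/* Translate the logical address ( s, d ) to a physical address.
 * Returns 0 on success, -1 on an addressing error ( trap ). */
int seg_translate(const struct segment *table, int nsegs,
                  int s, uint32_t d, uint32_t *physical)
{
    if (s >= nsegs || d >= table[s].limit)
        return -1;                       /* offset outside the segment */
    *physical = table[s].base + d;
    return 0;
}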

Figure 3.9 - Example of segmentation

3.5 Paging

• Paging is a memory management scheme that allows a process's physical memory to be
non-contiguous, and which eliminates external fragmentation by allocating memory
in equal-sized blocks known as pages.

• Paging eliminates most of the problems of the other methods discussed previously, and is
the predominant memory management technique used today.

3.5.1 Basic Method

• The basic idea behind paging is to divide physical memory into a number of equal-sized
blocks called frames, and to divide a program's logical memory space into blocks of the
same size called pages.
• Any page ( from any process ) can be placed into any available frame.
• The page table is used to look up what frame a particular page is stored in at the moment.
In the following example, for instance, page 2 of the program's logical memory is
currently stored in frame 3 of physical memory:

Figure 3.10 - Paging hardware

Figure 3.11 - Paging model of logical and physical memory

• A logical address consists of two parts: A page number in which the address resides, and
an offset from the beginning of that page. ( The number of bits in the page number limits

how many pages a single process can address. The number of bits in the offset determines
the maximum size of each page, and should correspond to the system frame size. )
• The page table maps the page number to a frame number, to yield a physical address
which also has two parts: The frame number and the offset within that frame. The number
of bits in the frame number determines how many frames the system can address, and the
number of bits in the offset determines the size of each frame.
• Page numbers, frame numbers, and frame sizes are determined by the architecture, but
are typically powers of two, allowing addresses to be split at a certain number of bits. For
example, if the logical address size is 2^m and the page size is 2^n, then the high-order
m-n bits of a logical address designate the page number and the remaining n bits
represent the offset.
• Note also that the number of bits in the page number and the number of bits in the frame
number do not have to be identical. The former determines the address range of the
logical address space, and the latter relates to the physical address space.

• ( DOS used an addressing scheme with 16-bit segment numbers and 16-bit offsets, on
hardware that only supported 20-bit physical addresses. The result was a resolution of
starting segment addresses finer than the size of a single segment, and multiple
segment-offset combinations that mapped to the same physical hardware address. )
• Consider the following micro example, in which a process has 16 bytes of logical
memory, mapped in 4 byte pages into 32 bytes of physical memory. ( Presumably some
other processes would be consuming the remaining 16 bytes of physical memory. )

Figure 3.12 - Paging example for a 32-byte memory with 4-byte pages
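
Because page sizes are powers of two, the split of a logical address into page number and offset is just a shift and a mask. The sketch below assumes 4 KB pages ( 12 offset bits ) and a flat, in-memory page table of frame numbers.

#include <stdint.h>

#define OFFSET_BITS 12                           /* 4 KB pages */
#define OFFSET_MASK ((1u << OFFSET_BITS) - 1)

uint32_t page_translate(uint32_t logical, const uint32_t *page_table)
{
    uint32_t page   = logical >> OFFSET_BITS;    /* high-order bits: page number */
    uint32_t offset = logical &  OFFSET_MASK;    /* low-order bits: offset       */
    uint32_t frame  = page_table[page];          /* page-table lookup            */
    return (frame << OFFSET_BITS) | offset;
}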

• Note that paging is like having a table of relocation registers, one for each page of the
logical memory.
• There is no external fragmentation with paging. All blocks of physical memory are used,
and there are no gaps in between and no problems with finding the right sized hole for a
particular chunk of memory.
• There is, however, internal fragmentation. Memory is allocated in chunks the size of a
page, and on the average, the last page will only be half full, wasting on the average half
a page of memory per process. ( Possibly more, if processes keep their code and data in
separate pages. )
• Larger page sizes waste more memory, but are more efficient in terms of overhead.
Modern trends have been to increase page sizes, and some systems even have multiple
size pages to try and make the best of both worlds.
• Page table entries ( frame numbers ) are typically 32 bit numbers, allowing access to
2^32 physical page frames. If those frames are 4 KB in size each, that translates to 16 TB
of addressable physical memory. ( 32 + 12 = 44 bits of physical address space. )
• When a process requests memory ( e.g. when its code is loaded in from disk ), free
frames are allocated from a free-frame list, and inserted into that process's page table.
• Processes are blocked from accessing anyone else's memory because all of their memory
requests are mapped through their page table. There is no way for them to generate an
address that maps into any other process's memory space.
• The operating system must keep track of each individual process's page table, updating it
whenever the process's pages get moved in and out of memory, and applying the correct
page table when processing system calls for a particular process. This all increases the

overhead involved when swapping processes in and out of the CPU. ( The currently
active page table must be updated to reflect the process that is currently running. )

Figure 3.13 - Free frames (a) before allocation and (b) after allocation

3.5.2 Hardware Support

• Page lookups must be done for every memory reference, and whenever a process gets
swapped in or out of the CPU, its page table must be swapped in and out too, along with
the instruction registers, etc. It is therefore appropriate to provide hardware support for
this operation, in order to make it as fast as possible and to make process switches as fast
as possible also.
• One option is to use a set of registers for the page table. For example, the DEC PDP-11
uses 16-bit addressing and 8 KB pages, resulting in only 8 pages per process. ( It takes 13
bits to address 8 KB of offset, leaving only 3 bits to define a page number. )
• An alternate option is to store the page table in main memory, and to use a single register
( called the page-table base register, PTBR ) to record where in memory the page table is
located.
o Process switching is fast, because only the single register needs to be changed.
o However memory access is now twice as slow, because every memory access
requires two memory accesses - one to fetch the frame number from the page
table in memory, and then another to access the desired memory location.
o The solution to this problem is to use a very special high-speed memory device
called the translation look-aside buffer, TLB.
▪ The benefit of the TLB is that it can search an entire table for a key value
in parallel, and if it is found anywhere in the table, then the corresponding
lookup value is returned.

Figure 3.14 - Paging hardware with TLB

▪ The TLB is very expensive, however, and therefore very small. ( Not large
enough to hold the entire page table. ) It is therefore used as a cache
device.
▪ Addresses are first checked against the TLB, and if the info is not
there ( a TLB miss ), then the frame is looked up from main
memory and the TLB is updated.
▪ If the TLB is full, then replacement strategies range from least-
recently used, LRU to random.
▪ Some TLBs allow some entries to be wired down, which means
that they cannot be removed from the TLB. Typically these would
be kernel frames.
▪ Some TLBs store address-space identifiers, ASIDs, to keep track
of which process "owns" a particular entry in the TLB. This allows
entries from multiple processes to be stored simultaneously in the
TLB without granting one process access to some other process's
memory location. Without this feature the TLB has to be flushed
clean with every process switch.
▪ The percentage of time that the desired information is found in the TLB is
termed the hit ratio.
▪ ( Eighth Edition Version: ) For example, suppose that it takes 100
nanoseconds to access main memory, and only 20 nanoseconds to search
the TLB. So a TLB hit takes 120 nanoseconds total ( 20 to find the frame
number and then another 100 to go get the data ), and a TLB miss takes
220 ( 20 to search the TLB, 100 to go get the frame number, and then
another 100 to go get the data. ) So with an 80% TLB hit ratio, the average
memory access time would be:

0.80 * 120 + 0.20 * 220 = 140 nanoseconds

for a 40% slowdown to get the frame number. A 98% hit rate would yield 122
nanoseconds average access time ( you should verify this ), for a 22% slowdown.

▪ ( Ninth Edition Version: ) The ninth edition ignores the 20 nanoseconds
required to search the TLB, yielding

0.80 * 100 + 0.20 * 200 = 120 nanoseconds

for a 20% slowdown to get the frame number. A 99% hit rate would yield 101
nanoseconds average access time ( you should verify this ), for a 1% slowdown.

3.5.3 Protection

• The page table can also help to protect processes from accessing memory that they
shouldn't, or their own memory in ways that they shouldn't.
• A bit or bits can be added to the page table to classify a page as read-write, read-only,
read-write-execute, or some combination of these sorts of things. Then each memory
reference can be checked to ensure it is accessing the memory in the appropriate mode.
• Valid / invalid bits can be added to "mask off" entries in the page table that are not in use
by the current process, as shown by example in Figure 3.15 below.
• Note that the valid / invalid bits described above cannot block all illegal memory
accesses, due to the internal fragmentation. ( Areas of memory in the last page that are
not entirely filled by the process, and may contain data left over by whoever used that
frame last. )
• Many processes do not use all of the page table available to them, particularly in modern
systems with very large potential page tables. Rather than waste memory by creating a
full-size page table for every process, some systems use a page-table length register,
PTLR, to specify the length of the page table.

Figure 3.15 - Valid (v) or invalid (i) bit in page table

3.5.4 Shared Pages

• Paging systems can make it very easy to share blocks of memory, by simply mapping
the same frame number into multiple processes' page tables. This may be done with either code or data.

• If code is reentrant, that means that it does not write to or change the code in any way ( it
is non self-modifying ), and it is therefore safe to re-enter it. More importantly, it means
the code can be shared by multiple processes, so long as each has their own copy of the
data and registers, including the instruction register.
• In the example given below, three different users are running the editor simultaneously,
but the code is only loaded into memory ( in the page frames ) one time.
• Some systems also implement shared memory in this fashion.

Figure 3.16 - Sharing of code in a paging environment

3.6 Structure of the Page Table


3.6.1 Hierarchical Paging

• Most modern computer systems support logical address spaces of 2^32 to 2^64.
• With a 2^32 address space and 4K ( 2^12 ) page sizes, this leaves 2^20 entries in the page
table. At 4 bytes per entry, this amounts to a 4 MB page table, which is too large to
reasonably keep in contiguous memory. ( And to swap in and out of memory with each
process switch. ) Note that with 4K pages, this would take 1024 pages just to hold the
page table!
• One option is to use a two-tier paging system, i.e. to page the page table.
• For example, the 20 bits described above could be broken down into two 10-bit page
numbers. The first identifies an entry in the outer page table, which identifies where in
memory to find one page of an inner page table. The second 10 bits finds a specific entry
in that inner page table, which in turn identifies a particular frame in physical memory.
( The remaining 12 bits of the 32 bit logical address are the offset within the 4K frame. )

Figure 3.17 A two-level page-table scheme

Figure 3.18 - Address translation for a two-level 32-bit paging architecture
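
The 10 + 10 + 12 split described above corresponds to the index arithmetic sketched below; page-directory and page-table entries are shown simply as frame numbers, with valid bits and other flags omitted.

#include <stdint.h>

/* Translate a 32-bit logical address with a two-level ( 10 / 10 / 12 ) page table. */
uint32_t two_level_translate(uint32_t logical, uint32_t **page_directory)
{
    uint32_t outer  = (logical >> 22) & 0x3FF;   /* index into the outer page table */
    uint32_t inner  = (logical >> 12) & 0x3FF;   /* index into one inner page table */
    uint32_t offset =  logical        & 0xFFF;   /* offset within the 4 KB page     */

    uint32_t *inner_table = page_directory[outer];
    uint32_t  frame       = inner_table[inner];
    return (frame << 12) | offset;
}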

• The VAX architecture divides 32-bit addresses into 4 equal-sized sections, and each page is
512 bytes, yielding an address made up of a 2-bit section number, a 21-bit page number,
and a 9-bit offset.

• With a 64-bit logical address space and 4K pages, there are 52 bits worth of page
numbers, which is still too many even for two-level paging. One could increase the
paging level, but with 10-bit page tables it would take 7 levels of indirection, which
would be prohibitively slow memory access. So some other approach must be used.

With 64-bit addresses, a two-tiered scheme ( 10-bit inner page numbers and a 12-bit offset )
still leaves 42 bits for the outer page table, and even adding a third level still leaves 32 bits
in the outermost table.

3.6.2 Hashed Page Tables

• One common data structure for accessing data that is sparsely distributed over a broad
range of possible values is a hash table. Figure 3.19 below illustrates a hashed page
table using chain-and-bucket hashing:

Figure 3.19 - Hashed page table

3.6.3 Inverted Page Tables

• Another approach is to use an inverted page table. Instead of a table listing all of the
pages for a particular process, an inverted page table lists all of the pages currently loaded
in memory, for all processes. ( I.e. there is one entry per frame instead of one entry per
page. )
• Access to an inverted page table can be slow, as it may be necessary to search the entire
table in order to find the desired page ( or to discover that it is not there. ) Hashing the
table can help speedup the search process.
• Inverted page tables prohibit the normal method of implementing shared memory, which
is to map multiple logical pages to a common physical frame. ( Because each frame is
now mapped to one and only one process. )

Figure 3.20 - Inverted page table

3.7.1 IA-32 Architecture

• The Pentium CPU provides both pure segmentation and segmentation with paging. In the
latter case, the CPU generates a logical address ( segment-offset pair ), which the
segmentation unit converts into a logical linear address, which in turn is mapped to a
physical frame by the paging unit, as shown in Figure 3.21:

Figure 3.21 - Logical to physical address translation in IA-32

3.7.1.1 IA-32 Segmentation

• The Pentium architecture allows segments to be as large as 4 GB ( a 32-bit offset ).


• Processes can have as many as 16K segments, divided into two 8K groups:
o 8K private to that particular process, stored in the Local Descriptor Table, LDT.
o 8K shared among all processes, stored in the Global Descriptor Table, GDT.
• Logical addresses are ( selector, offset ) pairs, where the selector is made up of 16 bits:
o A 13 bit segment number ( up to 8K )
o A 1 bit flag for LDT vs. GDT.
o 2 bits for protection codes.

o The descriptor tables contain 8-byte descriptions of each segment, including base
and limit registers.
o Logical linear addresses are generated by looking the selector up in the descriptor
table and adding the appropriate base address to the offset, as shown in Figure
3.22:

Figure 3.22 - IA-32 segmentation

3.7.1.2 IA-32 Paging

• Pentium paging normally uses a two-tier paging scheme, with the first 10 bits being a
page number for an outer page table ( a.k.a. page directory ), and the next 10 bits being a
page number within one of the 1024 inner page tables, leaving the remaining 12 bits as an
offset into a 4K page.

• A special bit in the page directory can indicate that this page is a 4MB page, in which
case the remaining 22 bits are all used as offset and the inner tier of page tables is not
used.
• The CR3 register points to the page directory for the current process, as shown in Figure
3.23 below.
• If the inner page table is currently swapped out to disk, then the page directory will have
an "invalid bit" set, and the remaining 31 bits provide information on where to find the
swapped out page table on the disk.

Figure 3.23 - Paging in the IA-32 architecture.

Figure 3.24 - Page address extensions.

Virtual Memory Background
• Preceding sections talked about how to avoid memory fragmentation by breaking process
memory requirements down into smaller bites (pages), and storing the pages non-
contiguously in memory. However the entire process still had to be stored in memory
somewhere.
• In practice, most real processes do not need all their pages, or at least not all at once, for
several reasons:
1. Error handling code is not needed unless that specific error occurs, some of which
are quite rare.
2. Arrays are often over-sized for worst-case scenarios, and only a small fraction of
the arrays are actually used in practice.
3. Certain features of certain programs are rarely used, such as the routine to balance
the federal budget. :-)
• The ability to load only the portions of processes that were actually needed ( and only
when they were needed ) has several benefits:
o Programs could be written for a much larger address space ( virtual memory space
) than physically exists on the computer.
o Because each process is only using a fraction of their total address space, there is
more memory left for other programs, improving CPU utilization and system
throughput.
o Less I/O is needed for swapping processes in and out of RAM, speeding things
up.
Figure below shows the general layout of virtual memory, which can be much larger than
physical memory:

Figure 3.25 - Diagram showing virtual memory that is larger than physical memory

• Figure 3.25 shows the virtual address space, which is the programmer's logical view of
process memory storage. The actual physical layout is controlled by the process's page
table.

• Note that the address space shown in Figure 3.26 is sparse - a great hole in the middle of
the address space is never used, unless the stack and/or the heap grow to fill the hole.

Figure 3.26 - Virtual address space

• Virtual memory also allows the sharing of files and memory by multiple processes, with
several benefits:
o System libraries can be shared by mapping them into the virtual address space of
more than one process.
o Processes can also share virtual memory by mapping the same block of memory
to more than one process.
o Process pages can be shared during a fork( ) system call, eliminating the need to
copy all of the pages of the original ( parent ) process.

Figure 3.27 - Shared library using virtual memory

3.2 Demand Paging

• The basic idea behind demand paging is that when a process is swapped in, its pages are
not swapped in all at once. Rather they are swapped in only when the process needs them.
( on demand. ) This is termed a lazy swapper, although a pager is a more accurate term.

Figure 3.28 - Transfer of a paged memory to contiguous disk space

3.2.1 Basic Concepts

• The basic idea behind demand paging is that when a process is swapped in, the pager only
loads into memory those pages that it expects the process to need ( right away. )
• Pages that are not loaded into memory are marked as invalid in the page table, using the
invalid bit. ( The rest of the page table entry may either be blank or contain information
about where to find the swapped-out page on the hard drive. )
• If the process only ever accesses pages that are loaded in memory ( memory resident
pages ), then the process runs exactly as if all the pages were loaded in to memory.

Figure 3.29 - Page table when some pages are not in main memory.

• On the other hand, if a page is needed that was not originally loaded up, then a page fault
trap is generated, which must be handled in a series of steps:
1. The memory address requested is first checked, to make sure it was a valid
memory request.
2. If the reference was invalid, the process is terminated. Otherwise, the page must
be paged in.
3. A free frame is located, possibly from a free-frame list.
4. A disk operation is scheduled to bring in the necessary page from disk. ( This will
usually block the process on an I/O wait, allowing some other process to use the
CPU in the meantime. )
5. When the I/O operation is complete, the process's page table is updated with the
new frame number, and the invalid bit is changed to indicate that this is now a
valid page reference.
6. The instruction that caused the page fault must now be restarted from the
beginning, ( as soon as this process gets another turn on the CPU. )

Figure 3.30 - Steps in handling a page fault
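
The six steps above can be mirrored in a toy simulation. The sketch below services a fault by taking a frame from a free-frame list and marking the page-table entry valid; the disk read and the restart of the faulting instruction are outside the scope of the sketch, and the data structures are illustrative only.

#include <stdbool.h>

struct pte { int frame; bool valid; };

/* Simulated memory access: returns 0 on a hit, 1 after servicing a page fault,
 * or -1 if no free frame is available ( page replacement would then be needed ). */
int access_page(struct pte *page_table, int page, int *free_frames, int *nfree)
{
    if (page_table[page].valid)
        return 0;                                     /* memory resident: no fault   */

    if (*nfree == 0)
        return -1;                                    /* no free frame: see 3.3      */

    page_table[page].frame = free_frames[--(*nfree)]; /* step 3: allocate a frame    */
    /* step 4: the page would be read in from disk here                              */
    page_table[page].valid = true;                    /* step 5: update page table   */
    return 1;                                         /* step 6: restart instruction */
}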

• In an extreme case, NO pages are swapped in for a process until they are requested by
page faults. This is known as pure demand paging.

• In theory each instruction could generate multiple page faults. In practice this is very rare,
due to locality of reference, covered in section 9.6.1.
• The hardware necessary to support virtual memory is the same as for paging and
swapping: A page table and secondary memory. ( Swap space, whose allocation is
discussed in chapter 12. )
• A crucial part of the process is that the instruction must be restarted from scratch once the
desired page has been made available in memory. For most simple instructions this is not
a major difficulty. However there are some architectures that allow a single instruction to
modify a fairly large block of data, ( which may span a page boundary ), and if some of
the data gets modified before the page fault occurs, this could cause problems. One
solution is to access both ends of the block before executing the instruction, guaranteeing
that the necessary pages get paged in before the instruction begins.

3.2.2 Performance of Demand Paging

• Obviously there is some slowdown and performance hit whenever a page fault occurs and
the system has to go get it from memory, but just how big a hit is it exactly?
• There are many steps that occur when servicing a page fault ( see book for full details ),
and some of the steps are optional or variable. But just for the sake of discussion, suppose
that a normal memory access requires 200 nanoseconds, and that servicing a page fault
takes 8 milliseconds. ( 8,000,000 nanoseconds, or 40,000 times a normal memory
access. ) With a page fault rate of p, ( on a scale from 0 to 1 ), the effective access time is
now:

( 1 - p ) * ( 200 ) + p * 8000000

= 200 + 7,999,800 * p

which clearly depends heavily on p! Even if only one access in 1000 causes a page fault, the
effective access time rises from 200 nanoseconds to 8.2 microseconds, a slowdown by a factor
of about 40. In order to keep the slowdown less than 10%, the page fault rate must be less than
0.0000025, or one in 399,990 accesses.

• A subtlety is that swap space is faster to access than the regular file system, because it
does not have to go through the whole directory structure. For this reason some systems
will transfer an entire process from the file system to swap space before starting up the
process, so that future paging all occurs from the ( relatively ) faster swap space.
• Some systems use demand paging directly from the file system for binary code ( which
never changes, and hence does not have to be written back out when its pages are
replaced ), and reserve the swap space for data segments that must be stored. This
approach is used by both Solaris and BSD Unix.

3.3 Page Replacement

• In order to make the most use of virtual memory, we load several processes into memory
at the same time. Since we only load the pages that are actually needed by each process at

any given time, there is room to load many more processes than if we had to load in the
entire process.
• However memory is also needed for other purposes ( such as I/O buffering ), and what
happens if some process suddenly decides it needs more pages and there aren't any free
frames available? There are several possible solutions to consider:
1. Adjust the memory used by I/O buffering, etc., to free up some frames for user
processes. The decision of how to allocate memory for I/O versus user processes
is a complex one, yielding different policies on different systems. ( Some allocate
a fixed amount for I/O, and others let the I/O system contend for memory along
with everything else. )
2. Put the process requesting more pages into a wait queue until some free frames
become available.
3. Swap some process out of memory completely, freeing up its page frames.
4. Find some page in memory that isn't being used right now, and swap that page
only out to disk, freeing up a frame that can be allocated to the process requesting
it. This is known as page replacement, and is the most common solution. There
are many different algorithms for page replacement, which is the subject of the
remainder of this section.

Figure 3.31 - Need for page replacement.

3.3.1 Basic Page Replacement

• The previously discussed page-fault processing assumed that there would be free frames
available on the free-frame list. Now the page-fault handling must be modified to free up
a frame if necessary, as follows:
1. Find the location of the desired page on the disk, either in swap space or in the file
system.
2. Find a free frame:
a. If there is a free frame, use it.
b. If there is no free frame, use a page-replacement algorithm to select an
existing frame to be replaced, known as the victim frame.
c. Write the victim frame to disk. Change all related page tables to indicate
that this page is no longer in memory.

3. Read in the desired page and store it in the frame. Adjust all related page and
frame tables to indicate the change.
4. Restart the process that was waiting for this page.

Figure 3.32 - Page replacement.

• Note that step 3c adds an extra disk write to the page-fault handling, effectively doubling
the time required to process a page fault. This can be alleviated somewhat by assigning a
modify bit, or dirty bit to each page, indicating whether or not it has been changed since it
was last loaded in from disk. If the dirty bit has not been set, then the page is unchanged,
and does not need to be written out to disk. Otherwise the page write is required. It
should come as no surprise that many page replacement strategies specifically look for
pages that do not have their dirty bit set, and preferentially select clean pages as victim
pages. It should also be obvious that unmodifiable code pages never get their dirty bits
set.
• There are two major requirements to implement a successful demand paging system. We
must develop a frame-allocation algorithm and a page-replacement algorithm. The
former centers around how many frames are allocated to each process ( and to other
needs ), and the latter deals with how to select a page for replacement when there are no
free frames available.
• The overall goal in selecting and tuning these algorithms is to generate the fewest number
of overall page faults. Because disk access is so slow relative to memory access, even
slight improvements to these algorithms can yield large improvements in overall system
performance.
• Algorithms are evaluated using a given string of memory accesses known as a reference
string, which can be generated in one of ( at least ) three common ways:
1. Randomly generated, either evenly distributed or with some distribution curve
based on observed system behavior. This is the fastest and easiest approach, but
may not reflect real performance well, as it ignores locality of reference.
2. Specifically designed sequences. These are useful for illustrating the properties of
comparative algorithms in published papers and textbooks, ( and also for
homework and exam problems. :-) )
3. Recorded memory references from a live system. This may be the best approach,
but the amount of data collected can be enormous, on the order of a million

addresses per second. The volume of collected data can be reduced by making
two important observations:
1. Only the page number that was accessed is relevant. The offset within that
page does not affect paging operations.
2. Successive accesses within the same page can be treated as a single page
request, because all requests after the first are guaranteed to be page hits.
( Since there are no intervening requests for other pages that could remove
this page from the page table. )
▪ So for example, if pages were of size 100 bytes, then the sequence of
address requests ( 0100, 0432, 0101, 0612, 0634, 0688, 0132, 0038, 0420 )
would reduce to page requests ( 1, 4, 1, 6, 1, 0, 4 )
As the number of available frames increases, the number of page faults should decrease,
as shown in Figure 3.33:

Figure 3.33 - Graph of page faults versus number of frames.

3.3.2 FIFO Page Replacement

• A simple and obvious page replacement strategy is FIFO, i.e. first-in-first-out.


• As new pages are brought in, they are added to the tail of a queue, and the page at the
head of the queue is the next victim. In the following example, 20 page requests result in
15 page faults:

Figure 3.34 - FIFO page-replacement algorithm.

• Although FIFO is simple and easy, it is not always optimal, or even efficient.
• An interesting effect that can occur with FIFO is Belady's anomaly, in which increasing
the number of frames available can actually increase the number of page faults that
occur! Consider, for example, the following chart based on the page sequence ( 1, 2, 3, 4,
1, 2, 5, 1, 2, 3, 4, 5 ) and a varying number of available frames. Obviously the maximum
number of faults is 12 ( every request generates a fault ), and the minimum number is 5 (
each page loaded only once ), but in between there are some interesting results:

Figure 3.35 - Page-fault curve for FIFO replacement on a reference string.
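
FIFO replacement is easy to simulate. The sketch below counts faults for a reference string with a fixed number of frames; run on the standard textbook reference string ( assumed here, since the figure is not reproduced ) with three frames it reports the 15 faults quoted above.

#include <stdio.h>

/* Count FIFO page faults for reference string 'ref' of length 'n' with 'frames' frames. */
int fifo_faults(const int *ref, int n, int frames)
{
    int mem[64];                     /* resident pages, frames assumed <= 64 */
    int head = 0, used = 0, faults = 0;

    for (int i = 0; i < n; i++) {
        int hit = 0;
        for (int j = 0; j < used; j++)
            if (mem[j] == ref[i]) { hit = 1; break; }
        if (hit)
            continue;
        faults++;
        if (used < frames)
            mem[used++] = ref[i];    /* a free frame is still available      */
        else {
            mem[head] = ref[i];      /* replace the oldest page ( the head ) */
            head = (head + 1) % frames;
        }
    }
    return faults;
}

int main(void)
{
    int ref[] = { 7,0,1,2,0,3,0,4,2,3,0,3,2,1,2,0,1,7,0,1 };
    printf("%d faults\n", fifo_faults(ref, 20, 3));   /* prints 15 */
    return 0;
}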

3.3.3 Optimal Page Replacement

• The discovery of Belady's anomaly lead to the search for an optimal page-replacement
algorithm, which is simply that which yields the lowest of all possible page-faults, and
which does not suffer from Belady's anomaly.
• Such an algorithm does exist, and is called OPT or MIN. This algorithm is simply
"Replace the page that will not be used for the longest time in the future."
• For example, Figure 3.36 shows that by applying OPT to the same reference string used
for the FIFO example, the minimum number of possible page faults is 9. Since 6 of the
page-faults are unavoidable ( the first reference to each new page ), FIFO can be shown

to require 3 times as many ( extra ) page faults as the optimal algorithm. ( Note: The book
claims that only the first three page faults are required by all algorithms, indicating that
FIFO is only twice as bad as OPT. )
• Unfortunately OPT cannot be implemented in practice, because it requires foretelling the
future, but it makes a nice benchmark for the comparison and evaluation of real proposed
new algorithms.
• In practice most page-replacement algorithms try to approximate OPT by predicting
( estimating ) in one fashion or another what page will not be used for the longest period
of time. The basis of FIFO is the prediction that the page that was brought in the longest
time ago is the one that will not be needed again for the longest future time, but as we
shall see, there are many other prediction methods, all striving to match the performance
of OPT.

Figure 3.36 - Optimal page-replacement algorithm

3.4 LRU Page Replacement

• The prediction behind LRU, the Least Recently Used, algorithm is that the page that has
not been used in the longest time is the one that will not be used again in the near future.
( Note the distinction between FIFO and LRU: The former looks at the oldest load time,
and the latter looks at the oldest use time. )
• Some view LRU as analogous to OPT, except looking backwards in time instead of
forwards. ( OPT has the interesting property that for any reference string S and its reverse
R, OPT will generate the same number of page faults for S and for R. It turns out that
LRU has this same property. )
• Figure 3.37 illustrates LRU for our sample string, yielding 12 page faults, ( as compared
to 15 for FIFO and 9 for OPT. )

Figure 3.37 - LRU page-replacement algorithm.

• LRU is considered a good replacement policy, and is often used. The problem is how
exactly to implement it. There are two simple approaches commonly used:
1. Counters. Every memory access increments a counter, and the current value of
this counter is stored in the page table entry for that page. Then finding the LRU
page involves simply searching the table for the page with the smallest counter
value. Note that overflow of the counter must be considered.
2. Stack. Another approach is to use a stack, and whenever a page is accessed, pull
that page from the middle of the stack and place it on the top. The LRU page will
always be at the bottom of the stack. Because this requires removing objects from
the middle of the stack, a doubly linked list is the recommended data structure.
• Note that both implementations of LRU require hardware support, either for incrementing
the counter or for managing the stack, as these operations must be performed for every
memory access.
• Neither LRU nor OPT exhibits Belady's anomaly. Both belong to a class of page-
replacement algorithms called stack algorithms, which can never exhibit Belady's
anomaly. A stack algorithm is one in which the pages kept in memory for a frame set of
size N will always be a subset of the pages kept for a frame size of N + 1. In the case of
LRU, ( and particularly the stack implementation thereof ), the top N pages of the stack
will be the same for all frame set sizes of N or anything larger.

Figure 3.38 - Use of a stack to record the most recent page references.

3.5 LRU-Approximation Page Replacement

• Unfortunately full implementation of LRU requires hardware support, and few systems
provide the full hardware support necessary.
• However many systems offer some degree of HW support, enough to approximate LRU
fairly well. ( In the absence of ANY hardware support, FIFO might be the best available
choice. )
• In particular, many systems provide a reference bit for every entry in a page table, which
is set anytime that page is accessed. Initially all bits are set to zero, and they can also all
be cleared at any time. One bit of precision is enough to distinguish pages that have been
accessed since the last clear from those that have not, but does not provide any finer grain
of detail.

3.5.1 Additional-Reference-Bits Algorithm

• Finer grain is possible by storing the most recent 8 reference bits for each page in an 8-bit
byte in the page table entry, which is interpreted as an unsigned int.
o At periodic intervals ( clock interrupts ), the OS takes over, and right-shifts each
of the reference bytes by one bit.
o The high-order ( leftmost ) bit is then filled in with the current value of the
reference bit, and the reference bits are cleared.
o At any given time, the page with the smallest value for the reference byte is the
LRU page.
• Obviously the specific number of bits used and the frequency with which the reference
byte is updated are adjustable, and are tuned to give the fastest performance on a given
hardware platform.
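
At each timer interrupt the update is just one shift and one OR per page, as in the sketch below; the array sizes and names are illustrative.

#define NPAGES 8

unsigned char ref_byte[NPAGES];      /* 8-bit history of reference bits per page    */
unsigned char ref_bit[NPAGES];       /* bit set by the hardware since the last tick */

void reference_byte_update(void)     /* run at each clock interrupt */
{
    for (int i = 0; i < NPAGES; i++) {
        ref_byte[i] = (unsigned char)((ref_byte[i] >> 1) | (ref_bit[i] << 7));
        ref_bit[i] = 0;              /* clear for the next interval */
    }
}

int lru_victim(void)                 /* page with the smallest history value */
{
    int victim = 0;
    for (int i = 1; i < NPAGES; i++)
        if (ref_byte[i] < ref_byte[victim])
            victim = i;
    return victim;
}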

3.5.2 Second-Chance Algorithm

• The second chance algorithm is essentially a FIFO, except the reference bit is used to
give pages a second chance at staying in the page table.
o When a page must be replaced, the page table is scanned in a FIFO ( circular
queue ) manner.
o If a page is found with its reference bit not set, then that page is selected as the
next victim.
o If, however, the next page in the FIFO does have its reference bit set, then it is
given a second chance:
▪ The reference bit is cleared, and the FIFO search continues.
▪ If some other page is found that did not have its reference bit set, then that
page will be selected as the victim, and this page ( the one being given the
second chance ) will be allowed to stay in the page table.
▪ If, however, there are no other pages that do not have their reference bit
set, then this page will be selected as the victim when the FIFO search
circles back around to this page on the second pass.
• If all reference bits in the table are set, then second chance degrades to FIFO, but also
requires a complete search of the table for every page-replacement.

• As long as there are some pages whose reference bits are not set, then any page
referenced frequently enough gets to stay in the page table indefinitely.
• This algorithm is also known as the clock algorithm, from the hands of the clock moving
around the circular queue.

Figure 3.39 - Second-chance ( clock ) page-replacement algorithm.
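
The clock hand's scan can be written directly from the description above; the sketch below selects a victim from an array of frames, each carrying its page's reference bit.

struct frame { int page; unsigned ref; };

/* Second-chance ( clock ) victim selection. '*hand' is the position of the clock
 * hand and is advanced as the scan proceeds. */
int clock_victim(struct frame *frames, int nframes, int *hand)
{
    for (;;) {
        if (frames[*hand].ref == 0) {            /* no second chance left: victim  */
            int victim = *hand;
            *hand = (*hand + 1) % nframes;
            return victim;
        }
        frames[*hand].ref = 0;                   /* give this page a second chance */
        *hand = (*hand + 1) % nframes;
    }
}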

3.5.3 Enhanced Second-Chance Algorithm

• The enhanced second chance algorithm looks at the reference bit and the modify bit
( dirty bit ) as an ordered pair, and classifies pages into one of four classes:
1. ( 0, 0 ) - Neither recently used nor modified.
2. ( 0, 1 ) - Not recently used, but modified.
3. ( 1, 0 ) - Recently used, but clean.
4. ( 1, 1 ) - Recently used and modified.
• This algorithm searches the page table in a circular fashion ( in as many as four passes ),
looking for the first page it can find in the lowest numbered category. I.e. it first makes a
pass looking for a ( 0, 0 ), and then if it can't find one, it makes another pass looking for a
( 0, 1 ), etc.
• The main difference between this algorithm and the previous one is the preference for
replacing clean pages if possible.

3.6 Counting-Based Page Replacement

• There are several algorithms based on counting the number of references that have been
made to a given page, such as:
o Least Frequently Used, LFU: Replace the page with the lowest reference count.
A problem can occur if a page is used frequently initially and then not used any
more, as the reference count remains high. A solution to this problem is to right-
shift the counters periodically, yielding a time-decaying average reference count.
o Most Frequently Used, MFU: Replace the page with the highest reference count.
The logic behind this idea is that pages that have already been referenced a lot

have been in the system a long time, and we are probably done with them,
whereas pages referenced only a few times have only recently been loaded, and
we still need them.
• In general counting-based algorithms are not commonly used, as their implementation is
expensive and they do not approximate OPT well.

3.7 Page-Buffering Algorithms

There are a number of page-buffering algorithms that can be used in conjunction with the afore-
mentioned algorithms, to improve overall performance and sometimes make up for inherent
weaknesses in the hardware and/or the underlying page-replacement algorithms:

• Maintain a certain minimum number of free frames at all times. When a page-fault
occurs, go ahead and allocate one of the free frames from the free list first, to get the
requesting process up and running again as quickly as possible, and then select a victim
page to write to disk and free up a frame as a second step.
• Keep a list of modified pages, and when the I/O system is otherwise idle, have it write
these pages out to disk, and then clear the modify bits, thereby increasing the chance of
finding a "clean" page for the next potential victim.
• Keep a pool of free frames, but remember what page was in it before it was made free.
Since the data in the page is not actually cleared out when the page is freed, it can be
made an active page again without having to load in any new data from disk. This is
useful when an algorithm mistakenly replaces a page that in fact is needed again soon.

3.8 Applications and Page Replacement

• Some applications ( most notably database programs ) understand their data accessing
and caching needs better than the general-purpose OS, and should therefore be given
free rein to do their own memory management.
• Sometimes such programs are given a raw disk partition to work with, containing raw
data blocks and no file system structure. It is then up to the application to use this disk
partition as extended memory or for whatever other reasons it sees fit.

3.4 Allocation of Frames

We said earlier that there were two important tasks in virtual memory management: a page-
replacement strategy and a frame-allocation strategy. This section covers the second part of that
pair.

3.4.1 Minimum Number of Frames

• The absolute minimum number of frames that a process must be allocated is dependent
on system architecture, and corresponds to the worst-case scenario of the number of
pages that could be touched by a single ( machine ) instruction.
• If an instruction ( and its operands ) spans a page boundary, then multiple pages could be
needed just for the instruction fetch.

• Memory references in an instruction touch more pages, and if those memory locations
can span page boundaries, then multiple pages could be needed for operand access also.
• The worst case involves indirect addressing, particularly where multiple levels of indirect
addressing are allowed. Left unchecked, a pointer to a pointer to a pointer to a pointer to
a . . . could theoretically touch every page in the virtual address space in a single machine
instruction, requiring every virtual page be loaded into physical memory simultaneously.
For this reason architectures place a limit ( say 16 ) on the number of levels of indirection
allowed in an instruction, which is enforced with a counter initialized to the limit and
decremented with every level of indirection in an instruction - If the counter reaches zero,
then an "excessive indirection" trap occurs. This example would still require a minimum
frame allocation of 17 per process.

3.4.2 Allocation Algorithms

• Equal Allocation - If there are m frames available and n processes to share them, each
process gets m / n frames, and the leftovers are kept in a free-frame buffer pool.
• Proportional Allocation - Allocate the frames proportionally to the size of the process,
relative to the total size of all processes. So if the size of process i is S_i, and S is the sum
of all S_i, then the allocation for process P_i is a_i = m * S_i / S. ( A small worked sketch follows this list. )
• Variations on proportional allocation could consider priority of process rather than just
their size.
• Obviously all allocations fluctuate over time as the number of available free frames, m,
fluctuates, and all are also subject to the constraints of minimum allocation. ( If the
minimum allocations cannot be met, then processes must either be swapped out or not
allowed to start until more free frames become available. )
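
A small worked sketch of equal versus proportional allocation. The numbers are just an example: m = 62 free frames shared by two processes of 10 and 127 pages.

    #include <stdio.h>

    int main(void) {
        int m = 62;                    /* free frames to divide                  */
        int size[] = { 10, 127 };      /* virtual size of each process, in pages */
        int n = 2, S = 0;

        for (int i = 0; i < n; i++) S += size[i];

        for (int i = 0; i < n; i++) {
            int equal = m / n;                         /* equal allocation  */
            int prop  = (int)((long)m * size[i] / S);  /* a_i = m * S_i / S */
            printf("P%d: equal = %d frames, proportional = %d frames\n",
                   i, equal, prop);
        }
        return 0;
    }

Here the small process gets 4 frames and the large one 57; the frame left over by integer truncation would go back to the free-frame pool.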

3.4.3 Global versus Local Allocation

• One big question is whether frame allocation ( page replacement ) occurs on a local or
global level.
• With local replacement, the number of pages allocated to a process is fixed, and page
replacement occurs only amongst the pages allocated to this process.
• With global replacement, any page may be a potential victim, whether it currently
belongs to the process seeking a free frame or not.
• Local page replacement allows processes to better control their own page fault rates, and
leads to more consistent performance of a given process over different system load levels.
• Global page replacement is overall more efficient, and is the more commonly used
approach.

3.4.4 Non-Uniform Memory Access

• The above arguments all assume that all memory is equivalent, or at least has equivalent
access times.
• This may not be the case in multiple-processor systems, especially where each CPU is
physically located on a separate circuit board which also holds some portion of the
overall system memory.

• In these latter systems, CPUs can access memory that is physically located on the same
board much faster than the memory on the other boards.
• The basic solution is akin to processor affinity - At the same time that we try to schedule
processes on the same CPU to minimize cache misses, we also try to allocate memory for
those processes on the same boards, to minimize access times.
• The presence of threads complicates the picture, especially when the threads get loaded
onto different processors.
• Solaris uses an lgroup as a solution, in a hierarchical fashion based on relative latency.
For example, all processors and RAM on a single board would probably be in the same
lgroup. Memory assignments are made within the same lgroup if possible, or to the next
nearest lgroup otherwise. ( Where "nearest" is defined as having the lowest access time. )

3.5 Thrashing

• If a process cannot maintain its minimum required number of frames, then it must be
swapped out, freeing up frames for other processes. This is an intermediate level of CPU
scheduling.
• But what about a process that can keep its minimum, but cannot keep all of the frames
that it is currently using on a regular basis? In this case it is forced to page out pages that
it will need again in the very near future, leading to large numbers of page faults.
• A process that is spending more time paging than executing is said to be thrashing.

3.5.1 Cause of Thrashing

• Early process scheduling schemes would control the level of multiprogramming allowed
based on CPU utilization, adding in more processes when CPU utilization was low.
• The problem is that when memory filled up and processes started spending lots of time
waiting for their pages to page in, then CPU utilization would drop, causing the scheduler
to add in even more processes and exacerbating the problem! Eventually the system
would essentially grind to a halt.
• Local page replacement policies can prevent one thrashing process from taking pages
away from other processes, but it still tends to clog up the I/O queue, thereby slowing
down any other process that needs to do even a little bit of paging ( or any other I/O for
that matter. )

Figure 3.40 - Thrashing


• To prevent thrashing we must provide processes with as many frames as they really need
"right now", but how do we know what that is?

3.5.2 Working-Set Model

• The working set model is based on the concept of locality, and defines a working set
window, of length delta. Whatever pages are included in the most recent delta page
references are said to be in the process's working-set window, and comprise its current
working set, as illustrated in Figure 3.41 below:

Figure 3.41 - Working-set model.

• The selection of delta is critical to the success of the working set model - If it is too small
then it does not encompass all of the pages of the current locality, and if it is too large,
then it encompasses pages that are no longer being frequently accessed.
• The total demand, D, is the sum of the sizes of the working sets for all processes. If D
exceeds the total number of available frames, then at least one process is thrashing,
because there are not enough frames available to satisfy its minimum working set. If D is
significantly less than the currently available frames, then additional processes can be
launched.
• The hard part of the working-set model is keeping track of what pages are in the current
working set, since every reference adds one to the set and removes one older page. An
approximation can be made using reference bits and a timer that goes off after a set
interval of memory references:
o For example, suppose that we set the timer to go off after every 5000 references (
by any process ), and we can store two additional historical reference bits in
addition to the current reference bit.
o Every time the timer goes off, the current reference bit is copied to one of the two
historical bits, and then cleared.
o If any of the three bits is set, then that page was referenced within the last 15,000
references, and is considered to be in that process's working set.
o Finer resolution can be achieved with more historical bits and a more frequent
timer, at the expense of greater overhead.
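
The working set itself is easy to state precisely: it is the set of distinct pages touched in the most recent delta references. A minimal sketch that computes working-set sizes from a reference string ( purely illustrative; a real kernel uses the reference-bit approximation above precisely because it cannot afford this per-reference bookkeeping ):

    #include <stdio.h>

    #define MAXPAGE 16

    /* Count the distinct pages among the last `delta` references
     * ending at position `pos` of the reference string. */
    static int working_set_size(const int refs[], int pos, int delta) {
        int seen[MAXPAGE] = {0}, count = 0;
        int start = pos - delta + 1;
        if (start < 0) start = 0;
        for (int i = start; i <= pos; i++) {
            if (!seen[refs[i]]) { seen[refs[i]] = 1; count++; }
        }
        return count;
    }

    int main(void) {
        int refs[] = { 1,2,1,3,4,4,4,4,3,3,2,2,2,1 };
        int n = (int)(sizeof refs / sizeof refs[0]);
        int delta = 5;
        for (int t = delta - 1; t < n; t++)
            printf("t=%2d  working-set size = %d\n",
                   t, working_set_size(refs, t, delta));
        return 0;
    }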

3.5.3 Page-Fault Frequency


• A more direct approach is to recognize that what we really want to control is the page-
fault rate, and to allocate frames based on this directly measurable value. If the page-fault
rate exceeds a certain upper bound then that process needs more frames, and if it is below
a given lower bound, then it can afford to give up some of its frames to other processes.
• ( I suppose a page-replacement strategy could be devised that would select victim frames
based on the process with the lowest current page-fault frequency. )

Figure 3.42 - Page-fault frequency.

• Note that there is a direct relationship between the page-fault rate and the working-set, as
a process moves from one locality to another.
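
A hedged sketch of the page-fault-frequency idea as a simple control rule; the thresholds and the way the fault rate is measured are assumptions made for illustration only:

    #include <stdio.h>

    /* Adjust a process's frame allocation based on its measured page-fault
     * rate ( here, faults per 1000 references ).  Purely illustrative numbers. */
    static int adjust_frames(int frames, double fault_rate) {
        const double upper = 20.0;   /* too many faults: give it more frames   */
        const double lower = 2.0;    /* very few faults: it can give some back */

        if (fault_rate > upper)
            return frames + 1;
        if (fault_rate < lower && frames > 1)
            return frames - 1;
        return frames;               /* within bounds: leave it alone */
    }

    int main(void) {
        printf("%d\n", adjust_frames(10, 35.0));  /* 11: faulting heavily, grow  */
        printf("%d\n", adjust_frames(10,  0.5));  /*  9: plenty of room, shrink  */
        printf("%d\n", adjust_frames(10, 10.0));  /* 10: within bounds, no change */
        return 0;
    }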

UNIT-IV

FILE-SYSTEM INTERFACE

4.1 File Concept

4.1.1 File Attributes

• Different OSes keep track of different file attributes, including:


o Name - Some systems give special significance to names, and particularly
extensions ( .exe, .txt, etc. ), and some do not. Some extensions may be of
significance to the OS ( .exe ), and others only to certain applications (.jpg)
o Identifier ( e.g. inode number )
o Type - Text, executable, other binary, etc.
o Location - on the hard drive.
o Size
o Protection
o Time & Date
o User ID

4.1.2 File Operations


• The file ADT supports many common operations:
o Creating a file
o Writing a file
o Reading a file
o Repositioning within a file
o Deleting a file
o Truncating a file.
• Most OSes require that files be opened before access and closed after all access is
complete. Normally the programmer must open and close files explicitly, but
some rare systems open the file automatically at first access. Information about
currently open files is stored in an open file table, containing for example:
o File pointer - records the current position in the file, for the next read or
write access.
o File-open count - How many times has the current file been opened
( simultaneously by different processes ) and not yet closed? When this
counter reaches zero the file can be removed from the table.
o Disk location of the file.
o Access rights
• Some systems provide support for file locking.
o A shared lock is for reading only.
o An exclusive lock is for writing as well as reading.
o An advisory lock is informational only, and not enforced. ( A "Keep Out"
sign, which may be ignored. )
o A mandatory lock is enforced. ( A truly locked door. )
o UNIX uses advisory locks, and Windows uses mandatory locks.
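
As a concrete illustration of the open-file and locking ideas above, a short POSIX sketch that opens a file, takes a shared ( read ) advisory lock with fcntl( ), reads a few bytes, and closes it. Error handling is minimal and the file name is only an example:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("/etc/hostname", O_RDONLY);   /* example file only */
        if (fd < 0) { perror("open"); return 1; }

        /* Advisory shared ( read ) lock over the whole file. */
        struct flock lk = { .l_type = F_RDLCK, .l_whence = SEEK_SET,
                            .l_start = 0, .l_len = 0 };
        if (fcntl(fd, F_SETLK, &lk) < 0) perror("fcntl(F_SETLK)");

        char buf[64];
        ssize_t n = read(fd, buf, sizeof buf - 1);  /* file pointer advances */
        if (n > 0) { buf[n] = '\0'; printf("read %zd bytes: %s", n, buf); }

        close(fd);                                  /* locks released on close */
        return 0;
    }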

4.1.3 File Types

• Windows ( and some other systems ) use special file extensions to indicate
the type of each file:

Figure 4.1- Common file types.

• Macintosh stores a creator attribute for each file, according to the program that
first created it with the create( ) system call.
• UNIX stores magic numbers at the beginning of certain files. ( Experiment with
the "file" command, especially in directories such as /bin and /dev )

4.1.4 File Structure

• Some files contain an internal structure, which may or may not be known to the
OS.
• For the OS to support particular file formats increases the size and complexity of
the OS.
• UNIX treats all files as sequences of bytes, with no further consideration of the
internal structure. ( With the exception of executable binary programs, which it
must know how to load and find the first executable statement, etc. )
• Macintosh files have two forks - a resource fork, and a data fork. The resource
fork contains information relating to the UI, such as icons and button images, and
can be modified independently of the data fork, which contains the code or data as
appropriate.

4.1.5 Internal File Structure

• Disk files are accessed in units of physical blocks, typically 512 bytes or some
power-of-two multiple thereof. ( Larger physical disks use larger block sizes, to
keep the range of block numbers within the range of a 32-bit integer. )
• Internally, files are organized in logical units ( records ), which may be as small as a
single byte, or may be a larger size corresponding to some data record or structure
size.
• The number of logical units which fit into one physical block determines its
packing, and has an impact on the amount of internal fragmentation ( wasted
space ) that occurs.
• As a general rule, half a physical block is wasted for each file, and the larger the
block sizes the more space is lost to internal fragmentation.

4.2 Access Methods

4.2.1 Sequential Access

• A sequential access file emulates magnetic tape operation, and generally supports
a few operations:
o read next - read a record and advance the tape to the next position.
o write next - write a record and advance the tape to the next position.
o rewind
o skip n records - May or may not be supported. N may be limited to
positive numbers, or may be limited to +/- 1.

Figure 4.2 - Sequential-access file.

4.2.2 Direct Access

• Jump to any record and read that record. Operations supported include:
o read n - read record number n. ( Note an argument is now required. )
o write n - write record number n. ( Note an argument is now required. )
o jump to record n - could be 0 or the end of file.
o Query current record - used to return back to this record later.
o Sequential access can be easily emulated using direct access. The inverse
is complicated and inefficient.

Figure 4.3- Simulation of sequential access on a direct-access file.
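
A minimal POSIX sketch of the "read n" direct-access operation listed above: the byte offset is computed from the record number and the file position is moved there with lseek( ). The record size and file name are assumptions for the example:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    #define RECSIZE 128   /* fixed logical record size, chosen for the example */

    /* read n: position at record n and read it ( direct access ). */
    static ssize_t read_record(int fd, long n, char buf[RECSIZE]) {
        if (lseek(fd, (off_t)n * RECSIZE, SEEK_SET) == (off_t)-1)
            return -1;
        return read(fd, buf, RECSIZE);
    }

    int main(void) {
        int fd = open("records.dat", O_RDONLY);     /* hypothetical data file */
        if (fd < 0) { perror("open"); return 1; }

        char buf[RECSIZE];
        if (read_record(fd, 7, buf) == RECSIZE)     /* jump straight to record 7 */
            printf("got record 7\n");

        close(fd);
        return 0;
    }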

4.2.3 Other Access Methods

• An indexed access scheme can be easily built on top of a direct access system.
Very large files may require a multi-tiered indexing scheme, i.e. indexes of
indexes.

Figure 4.4 - Example of index and relative files.

4.3 Directory Structure

4.3.1 Storage Structure

• A disk can be used in its entirety for a file system.


• Alternatively a physical disk can be broken up into multiple partitions, slices, or
mini-disks, each of which becomes a virtual disk and can have its own filesystem.
( or be used for raw storage, swap space, etc. )
• Or, multiple physical disks can be combined into one volume, i.e. a larger virtual
disk, with its own filesystem spanning the physical disks.

Figure 4.4 - A typical file-system organization.

4.3.2 Directory Overview

• Directory operations to be supported include:


o Search for a file
o Create a file - add to the directory
o Delete a file - erase from the directory
o List a directory - possibly ordered in different ways.
o Rename a file - may change sorting order
o Traverse the file system.

4.3.3. Single-Level Directory

• Simple to implement, but each file must have a unique name.

Figure 4.5 - Single-level directory.

4.3.4 Two-Level Directory

• Each user gets their own directory space.


• File names only need to be unique within a given user's directory.
• A master file directory is used to keep track of each user's directory, and must be
maintained when users are added to or removed from the system.
• A separate directory is generally needed for system ( executable ) files.
• Systems may or may not allow users to access other directories besides their own

o If access to other directories is allowed, then provision must be made to
specify the directory being accessed.
o If access is denied, then special consideration must be made for users to
run programs located in system directories. A search path is the list of
directories in which to search for executable programs, and can be set
uniquely for each user.

Figure 4.6 - Two-level directory structure.

4.3.5 Tree-Structured Directories

• An obvious extension to the two-tiered directory structure, and the one with
which we are all most familiar.
• Each user / process has the concept of a current directory from which all
( relative ) searches take place.
• Files may be accessed using either absolute pathnames ( relative to the root of the
tree ) or relative pathnames ( relative to the current directory. )
• Directories are stored the same as any other file in the system, except there is a bit
that identifies them as directories, and they have some special structure that the
OS understands.
• One question for consideration is whether or not to allow the removal of
directories that are not empty - Windows requires that directories be emptied first,
and UNIX provides an option for deleting entire sub-trees.

Figure 4.7 - Tree-structured directory structure.

4.3.6 Acyclic-Graph Directories

• When the same files need to be accessed in more than one place in the directory
structure ( e.g. because they are being shared by more than one user / process ), it
can be useful to provide an acyclic-graph structure. ( Note the directed arcs from
parent to child. )
• UNIX provides two types of links for implementing the acyclic-graph structure. (
See "man ln" for more details. )
o A hard link ( usually just called a link ) involves multiple directory entries
that both refer to the same file. Hard links are only valid for ordinary files
in the same filesystem.
o A symbolic link, that involves a special file, containing information about
where to find the linked file. Symbolic links may be used to link
directories and/or files in other filesystems, as well as ordinary files in the
current filesystem.
• Windows only supports symbolic links, termed shortcuts.
• Hard links require a reference count, or link count for each file, keeping track of
how many directory entries are currently referring to this file. Whenever one of
the references is removed the link count is reduced, and when it reaches zero, the
disk space can be reclaimed.
• For symbolic links there is some question as to what to do with the symbolic links
when the original file is moved or deleted:
o One option is to find all the symbolic links and adjust them also.
o Another is to leave the symbolic links dangling, and discover that they are
no longer valid the next time they are used.
o What if the original file is removed, and replaced with another file having
the same name before the symbolic link is next used?

Figure 4.8 - Acyclic-graph directory structure.
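
A small POSIX sketch of the two link types described above, using link( ) for a hard link and symlink( ) for a symbolic link; the file names are placeholders:

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        /* Hard link: a second directory entry referring to the same inode.
         * Only valid for ordinary files within the same filesystem. */
        if (link("data.txt", "data-hard.txt") < 0)
            perror("link");

        /* Symbolic link: a small special file holding the target's path.
         * May cross filesystems and may point at directories. */
        if (symlink("data.txt", "data-sym.txt") < 0)
            perror("symlink");

        /* Removing the original would decrement the hard-link count; the hard
         * link would still work, while the symbolic link would be left dangling. */
        return 0;
    }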

4.3.7 General Graph Directory

• If cycles are allowed in the graphs, then several problems can arise:
o Search algorithms can go into infinite loops. One solution is to not follow
links in search algorithms. ( Or not to follow symbolic links, and to only
allow symbolic links to refer to directories. )
o Sub-trees can become disconnected from the rest of the tree and still not
have their reference counts reduced to zero. Periodic garbage collection is
required to detect and resolve this problem. ( chkdsk in DOS and fsck in
UNIX search for these problems, among others, even though cycles are
not supposed to be allowed in either system. Disconnected disk blocks that
are not marked as free are added back to the file systems with made-up file
names, and can usually be safely deleted. )

Figure 4.9 - General graph directory.

4.4 File-System Mounting

• The basic idea behind mounting file systems is to combine multiple file systems into one
large tree structure.

• The mount command is given a filesystem to mount and a mount point ( directory ) on
which to attach it.
• Once a file system is mounted onto a mount point, any further references to that directory
actually refer to the root of the mounted file system.
• Any files ( or sub-directories ) that had been stored in the mount point directory prior to
mounting the new filesystem are now hidden by the mounted filesystem, and are no
longer available. For this reason some systems only allow mounting onto empty
directories.
• Filesystems can only be mounted by root, unless root has previously configured certain
filesystems to be mountable onto certain pre-determined mount points. ( E.g. root may
allow users to mount floppy filesystems to /mnt or something like it. ) Anyone can run
the mount command to see what filesystems are currently mounted.
• Filesystems may be mounted read-only, or have other restrictions imposed.

Figure 4.10 - File system. (a) Existing system. (b) Unmounted volume.

Figure 4.11 - Mount point.

• The traditional Windows OS runs an extended two-tier directory structure, where the first
tier of the structure separates volumes by drive letters, and a tree structure is implemented
below that level.
• Macintosh runs a similar system, where each new volume that is found is automatically
mounted and added to the desktop when it is found.
• More recent Windows systems allow filesystems to be mounted to any directory in the
filesystem, much like UNIX.

4.5 File Sharing

4.5.1 Multiple Users

• On a multi-user system, more information needs to be stored for each file:


o The owner ( user ) who owns the file, and who can control its access.
o The group of other user IDs that may have some special access to the file.
o What access rights are afforded to the owner ( User ), the Group, and to
the rest of the world ( the universe, a.k.a. Others. )
o Some systems have more complicated access control, allowing or denying
specific accesses to specifically named users or groups.

4.5.2 Remote File Systems

• The advent of the Internet introduces issues for accessing files stored on remote
computers
o The original method was ftp, allowing individual files to be transported
across systems as needed. Ftp can be either account and password
controlled, or anonymous, not requiring any user name or password.
o Various forms of distributed file systems allow remote file systems to be
mounted onto a local directory structure, and accessed using normal file
access commands. ( The actual files are still transported across the
network as needed, possibly using ftp as the underlying transport
mechanism. )
o The WWW has made it easy once again to access files on remote systems
without mounting their filesystems, generally using ( anonymous ) ftp as
the underlying file transport mechanism.

4.5.2.1 The Client-Server Model


• When one computer system remotely mounts a filesystem that is
physically located on another system, the system which physically owns
the files acts as a server, and the system which mounts them is the client.
• User IDs and group IDs must be consistent across both systems for the
system to work properly. ( I.e. this is most applicable across multiple
computers managed by the same organization, shared by a common group
of users. )
• The same computer can be both a client and a server. ( E.g. cross-linked
file systems. )
• There are a number of security concerns involved in this model:
o Servers commonly restrict mount permission to certain trusted
systems only. Spoofing ( a computer pretending to be a different
computer ) is a potential security risk.
o Servers may restrict remote access to read-only.

o Servers restrict which filesystems may be remotely mounted.
Generally the information within those subsystems is limited,
relatively public, and protected by frequent backups.
• The NFS ( Network File System ) is a classic example of such a system.

4.5.2.2 Distributed Information Systems


• The Domain Name System, DNS, provides for a unique naming system
across all of the Internet.
• Domain names are maintained by the Network Information System, NIS,
which unfortunately has several security issues. NIS+ is a more secure
version, but has not yet gained the same widespread acceptance as NIS.
• Microsoft's Common Internet File System, CIFS, establishes a network
login for each user on a networked system with shared file access. Older
Windows systems used domains, and newer systems ( XP, 2000 ), use
active directories. User names must match across the network for this
system to be valid.
• A newer approach is the Lightweight Directory-Access Protocol, LDAP,
which provides a secure single sign-on for all users to access all resources
on a network. This is a secure system which is gaining in popularity, and
which has the maintenance advantage of combining authorization
information in one central location.

4.5.2.3 Failure Modes


• When a local disk file is unavailable, the result is generally known immediately,
and is generally non-recoverable. The only reasonable response is for the request
to fail.
• However when a remote file is unavailable, there are many possible reasons, and
whether or not it is unrecoverable is not readily apparent. Hence most remote
access systems allow for blocking or delayed response, in the hopes that the
remote system (or the network) will come back up eventually.

4.5.3 Consistency Semantics


• Consistency Semantics deals with the consistency between the views of shared
files on a networked system. When one user changes the file, when do other users
see the changes?
• At first glance this appears to have all of the synchronization issues discussed in
Chapter 6. Unfortunately the long delays involved in network operations prohibit
the use of atomic operations as discussed in that chapter.

4.5.3.1 UNIX Semantics


• The UNIX file system uses the following semantics:
o Writes to an open file are immediately visible to any other user
who has the file open.
o One implementation uses a shared location pointer, which is
adjusted for all sharing users.
• The file is associated with a single exclusive physical resource, which may
delay some accesses.
4.5.3.2 Session Semantics
• The Andrew File System, AFS uses the following semantics:
o Writes to an open file are not immediately visible to other users.
o When a file is closed, any changes made become available only to
users who open the file at a later time.
• According to these semantics, a file can be associated with multiple
( possibly different ) views. Almost no constraints are imposed on
scheduling accesses. No user is delayed in reading or writing their
personal copy of the file.
• AFS file systems may be accessible by systems around the world. Access
control is maintained through ( somewhat ) complicated access control
lists, which may grant access to the entire world ( literally ) or to
specifically named users accessing the files from specifically named
remote environments.

4.5.3.3 Immutable-Shared-Files Semantics


• Under this system, when a file is declared as shared by its creator, it
becomes immutable and the name cannot be re-used for any other
resource. Hence it becomes read-only, and shared access is simple.

4.6 Protection
• Files must be kept safe for reliability (against accidental damage), and protection
(against deliberate malicious access.) The former is usually managed with backup copies.
This section discusses the latter.
• One simple protection scheme is to remove all access to a file. However this makes the
file unusable, so some sort of controlled access must be arranged.

4.6.1 Types of Access


• The following low-level operations are often controlled:
o Read - View the contents of the file
o Write - Change the contents of the file.
o Execute - Load the file onto the CPU and follow the instructions contained
therein.
o Append - Add to the end of an existing file.
o Delete - Remove a file from the system.
o List -View the name and other attributes of files on the system.
• Higher-level operations, such as copy, can generally be performed through
combinations of the above.

4.6.2 Access Control


• One approach is to have complicated Access Control Lists, ACL, which specify
exactly what access is allowed or denied for specific users or groups.
o The AFS uses this system for distributed access.
o Control is very finely adjustable, but may be complicated, particularly
when a specific user involved is unknown. (AFS allows some wild cards,
so for example all users on a certain remote system may be trusted, or a
given username may be trusted when accessing from any remote system.)
• UNIX uses a set of 9 access control bits, in three groups of three. These
correspond to R, W, and X permissions for each of the Owner, Group, and Others.
( See "man chmod" for full details. ) The RWX bits control the following
privileges for ordinary files and directories:

o R - For files: read ( view ) file contents. For directories: read directory contents;
required to get a listing of the directory.
o W - For files: write ( change ) file contents. For directories: change directory
contents; required to create or delete files.
o X - For files: execute file contents as a program. For directories: access detailed
directory information; required to get a long listing, or to access any specific file
in the directory. Note that if a user has X but not R permission on a directory,
they can still access specific files, but only if they already know the name of the
file they are trying to access.

• In addition there are some special bits that can also be applied:
o The set user ID ( SUID ) bit and/or the set group ID ( SGID ) bits applied
to executable files temporarily change the identity of whoever runs the
program to match that of the owner / group of the executable program.
This allows users running specific programs to have access to files ( while
running that program ) to which they would normally be unable to
access. Setting of these two bits is usually restricted to root, and must be
done with caution, as it introduces a potential security leak.
o The sticky bit on a directory modifies write permission, allowing users to
only delete files for which they are the owner. This allows everyone to
create files in /tmp, for example, but to only delete files which they have
created, and not anyone else's.
o The SUID, SGID, and sticky bits are indicated with an s, s, and t in the
positions for execute permission for the user, group, and others,
respectively. If the letter is lower case ( s, s, t ), then the corresponding
execute permission IS also given. If it is upper case ( S, S, T ), then the
corresponding execute permission is NOT given.
o The numeric form of chmod is needed to set these advanced bits.

Sample permissions in a UNIX system.
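
A short sketch of setting such permissions programmatically, including the setuid and sticky bits, with the chmod( ) system call and the symbolic constants from <sys/stat.h>; the path names are placeholders:

    #include <stdio.h>
    #include <sys/stat.h>

    int main(void) {
        /* rwxr-xr-x plus setuid, i.e. numeric mode 4755 */
        if (chmod("myprog", S_ISUID | S_IRWXU | S_IRGRP | S_IXGRP |
                            S_IROTH | S_IXOTH) < 0)
            perror("chmod myprog");

        /* rwxrwxrwx plus the sticky bit, i.e. 1777, as used on /tmp */
        if (chmod("shared_dir", S_ISVTX | S_IRWXU | S_IRWXG | S_IRWXO) < 0)
            perror("chmod shared_dir");

        return 0;
    }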

• Windows adjusts file access through a simple GUI:

Figure 4.12 - Windows 7 access-control list management.


4.6.3 Other Protection Approaches and Issues
• Some systems can apply passwords, either to individual files, or to specific sub-
directories, or to the entire system. There is a trade-off between the number of
passwords that must be maintained ( and remembered by the users ) and the
amount of information that is vulnerable to a lost or forgotten password.
• Older systems which did not originally have multi-user file access permissions (
DOS and older versions of Mac ) must now be retrofitted if they are to share files
on a network.
• Access to a file requires access to all the files along its path as well. In a cyclic
directory structure, users may have different access to the same file accessed
through different paths.
• Sometimes just the knowledge of the existence of a file of a certain name is a
security ( or privacy ) concern. Hence the distinction between the R and X bits on
UNIX directories.

4.7 FILE-SYSTEM STRUCTURE


• Hard disks have two important properties that make them suitable for secondary storage
of files in file systems: (1) Blocks of data can be rewritten in place, and (2) they are direct
access, allowing any block of data to be accessed with only ( relatively ) minor
movements of the disk heads and rotational latency.
• Disks are usually accessed in physical blocks, rather than a byte at a time. Block sizes
may range from 512 bytes to 4K or larger.
• File systems organize storage on disk drives, and can be viewed as a layered design:
o At the lowest layer are the physical devices, consisting of the magnetic media,
motors & controls, and the electronics connected to them and controlling them.
Modern disks put more and more of the electronic controls directly on the disk
drive itself, leaving relatively little work for the disk controller card to perform.
o I/O Control consists of device drivers, special software programs ( often written
in assembly ) which communicate with the devices by reading and writing special
codes directly to and from memory addresses corresponding to the controller
card's registers. Each controller card ( device ) on a system has a different set of
addresses ( registers, a.k.a. ports ) that it listens to, and a unique set of command
codes and results codes that it understands.
o The basic file system level works directly with the device drivers in terms of
retrieving and storing raw blocks of data, without any consideration for what is in
each block. Depending on the system, blocks may be referred to with a single
block number, ( e.g. block # 234234 ), or with head-sector-cylinder combinations.
o The file organization module knows about files and their logical blocks, and how
they map to physical blocks on the disk. In addition to translating from logical to
physical blocks, the file organization module also maintains the list of free blocks,
and allocates free blocks to files as needed.
o The logical file system deals with all of the meta data associated with a file ( UID,
GID, mode, dates, etc ), i.e. everything about the file except the data itself. This
level manages the directory structure and the mapping of file names to file control
blocks, FCBs, which contain all of the meta data as well as block number
information for finding the data on the disk.
• The layered approach to file systems means that much of the code can be used uniformly
for a wide variety of different file systems, and only certain layers need to be filesystem
specific. Common file systems in use include the UNIX file system, UFS, the Berkeley
Fast File System, FFS, Windows systems FAT, FAT32, NTFS, CD-ROM systems ISO
9660, and for Linux the extended file systems ext2 and ext3 ( among 40 others
supported.)

Figure 4.13- Layered file system.


4.8 FILE-SYSTEM IMPLEMENTATION

4.8.1Overview

• File systems store several important data structures on the disk:


o A boot-control block, ( per volume ) a.k.a. the boot block in UNIX or the
partition boot sector in Windows contains information about how to boot
the system off of this disk. This will generally be the first sector of the
volume if there is a bootable system loaded on that volume, or the block
will be left vacant otherwise.
o A volume control block ( per volume ), a.k.a. the superblock in UNIX or the
master file table in NTFS ( Windows ), which contains information such as
the partition table, number of blocks on each filesystem, and pointers to
free blocks and free FCB blocks.
o A directory structure ( per file system ), containing file names and pointers
to corresponding FCBs. UNIX uses inode numbers, and NTFS uses a
master file table.
o The File Control Block, FCB, ( per file ) containing details about
ownership, size, permissions, dates, etc. UNIX stores this information in
inodes, and NTFS in the master file table as a relational database structure.

Figure 4.14- A typical file-control block.

• There are also several key data structures stored in memory:


o An in-memory mount table.
o An in-memory directory cache of recently accessed directory information.
o A system-wide open file table, containing a copy of the FCB for every
currently open file in the system, as well as some other related
information.
o A per-process open file table, containing a pointer to the system open file
table as well as some other information. ( For example the current file
position pointer may be either here or in the system file table, depending
on the implementation and whether the file is being shared or not. )
• Figure below illustrates some of the interactions of file system components when
files are created and/or used:
o When a new file is created, a new FCB is allocated and filled out with
important information regarding the new file. The appropriate directory is
modified with the new file name and FCB information.
o When a file is accessed during a program, the open( ) system call reads in
the FCB information from disk, and stores it in the system-wide open file
table. An entry is added to the per-process open file table referencing the
system-wide table, and an index into the per-process table is returned by
the open( ) system call. UNIX refers to this index as a file descriptor, and
Windows refers to it as a file handle.
o If another process already has a file open when a new request comes in for
the same file, and it is sharable, then a counter in the system-wide table is
incremented and the per-process table is adjusted to point to the existing
entry in the system-wide table.
o When a file is closed, the per-process table entry is freed, and the counter
in the system-wide table is decremented. If that counter reaches zero, then
the system wide table is also freed. Any data currently stored in memory
cache for this file is written out to disk if necessary.

Figure 4.15- In-memory file-system structures. (a) File open. (b) File read.
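
A hedged sketch of the in-memory structures just described: a system-wide open-file table entry holding a copy of the FCB and an open count, and a per-process table entry pointing into it. All field names are illustrative, not taken from any real kernel:

    #include <stdio.h>
    #include <sys/types.h>

    /* Simplified file control block: the per-file metadata. */
    struct fcb {
        uid_t  owner;
        mode_t permissions;
        off_t  size;
        long   first_block;      /* where the data lives on disk */
    };

    /* One entry in the system-wide open file table. */
    struct sys_open_file {
        struct fcb fcb;          /* copy of the on-disk FCB        */
        int        open_count;   /* how many opens are outstanding */
    };

    /* One entry in a per-process open file table ( indexed by fd ). */
    struct proc_open_file {
        struct sys_open_file *sys;   /* points into the system-wide table */
        off_t                 pos;   /* this process's file position      */
    };

    int main(void) {
        struct sys_open_file sys = { .fcb = { .size = 4096 }, .open_count = 1 };
        struct proc_open_file fd3 = { .sys = &sys, .pos = 0 };

        fd3.pos += 512;    /* a read advances only this process's file pointer */
        printf("open_count=%d pos=%lld size=%lld\n",
               sys.open_count, (long long)fd3.pos, (long long)sys.fcb.size);
        return 0;
    }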

4.8 .2 Partitions and Mounting


• Physical disks are commonly divided into smaller units called partitions. They can
also be combined into larger units, but that is most commonly done for RAID
installations and is left for later chapters.
• Partitions can either be used as raw devices ( with no structure imposed upon
them ), or they can be formatted to hold a filesystem ( i.e. populated with FCBs
and initial directory structures as appropriate. ) Raw partitions are generally used
for swap space, and may also be used for certain programs such as databases that
choose to manage their own disk storage system. Partitions containing filesystems
can generally only be accessed using the file system structure by ordinary users,
but can often be accessed as a raw device also by root.
• The boot block is accessed as part of a raw partition, by the boot program prior to
any operating system being loaded. Modern boot programs understand multiple
OSes and filesystem formats, and can give the user a choice of which of several
available systems to boot.
• The root partition contains the OS kernel and at least the key portions of the OS
needed to complete the boot process. At boot time the root partition is mounted,
and control is transferred from the boot program to the kernel found there. ( Older
systems required that the root partition lie completely within the first 1024
cylinders of the disk, because that was as far as the boot program could reach.
Once the kernel had control, then it could access partitions beyond the 1024
cylinder boundary. )
• Continuing with the boot process, additional filesystems get mounted, adding
their information into the appropriate mount table structure. As a part of the
mounting process the file systems may be checked for errors or inconsistencies,
either because they are flagged as not having been closed properly the last time
they were used, or just on general principles. Filesystems may be mounted either
automatically or manually. In UNIX a mount point is indicated by setting a flag in
the in-memory copy of the inode, so all future references to that inode get re-
directed to the root directory of the mounted filesystem.

4.8 .3 Virtual File Systems


• Virtual File Systems, VFS, provide a common interface to multiple different
filesystem types. In addition, it provides for a unique identifier ( vnode ) for files
across the entire space, including across all filesystems of different types. ( UNIX
inodes are unique only across a single filesystem, and certainly do not carry
across networked file systems. )
• The VFS in Linux is based upon four key object types:
o The inode object, representing an individual file
o The file object, representing an open file.
o The superblock object, representing a filesystem.
o The dentry object, representing a directory entry.
• Linux VFS provides a set of common functionalities for each filesystem, using
function pointers accessed through a table. The same functionality is accessed
through the same table position for all filesystem types, though the actual
functions pointed to by the pointers may be filesystem-specific. See
/usr/include/linux/fs.h for full details. Common operations provided include open(
), read( ), write( ), and mmap( ).
Figure 4.16- Schematic view of a virtual file system.
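
A minimal sketch of the function-pointer idea behind the VFS layer: every filesystem fills in the same table of operations, and the generic code calls through that table without knowing which filesystem it is talking to. The structure and names here are illustrative, not the actual Linux definitions in fs.h:

    #include <stdio.h>
    #include <stddef.h>

    /* Generic table of operations every filesystem must provide. */
    struct file_ops {
        int  (*open)(const char *path);
        long (*read)(int fd, void *buf, size_t len);
    };

    /* A toy filesystem's implementation of those operations. */
    static int  myfs_open(const char *path) {
        printf("myfs open %s\n", path);
        return 3;
    }
    static long myfs_read(int fd, void *buf, size_t len) {
        (void)buf;
        printf("myfs read fd=%d len=%zu\n", fd, len);
        return (long)len;
    }

    static const struct file_ops myfs_ops = { myfs_open, myfs_read };

    /* The VFS layer calls through the table, filesystem-independent. */
    static int vfs_open(const struct file_ops *ops, const char *path) {
        return ops->open(path);
    }

    int main(void) {
        char buf[32];
        int fd = vfs_open(&myfs_ops, "/mnt/data/file.txt");
        myfs_ops.read(fd, buf, sizeof buf);
        return 0;
    }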

4.9 DIRECTORY IMPLEMENTATION

• Directories need to be fast to search, insert, and delete, with a minimum of wasted disk
space.

Linear List

• A linear list is the simplest and easiest directory structure to set up, but it does
have some drawbacks.
• Finding a file ( or verifying one does not already exist upon creation ) requires a
linear search.
• Deletions can be done by moving all entries, flagging an entry as deleted, or by
moving the last entry into the newly vacant position.
• Sorting the list makes searches faster, at the expense of more complex insertions
and deletions.
• A linked list makes insertions and deletions into a sorted list easier, with overhead
for the links.
• More complex data structures, such as B-trees, could also be considered.

Hash Table

• A hash table can also be used to speed up searches.


• Hash tables are generally implemented in addition to a linear or other structure

4.10 ALLOCATION METHODS

• There are three major methods of storing files on disks: contiguous, linked, and indexed.

4.10.1 Contiguous Allocation

• Contiguous Allocation requires that all blocks of a file be kept together


contiguously.
• Performance is very fast, because reading successive blocks of the same file
generally requires no movement of the disk heads, or at most one small step to the
next adjacent cylinder.
• Storage allocation involves the same issues discussed earlier for the allocation of
contiguous blocks of memory ( first fit, best fit, fragmentation problems, etc. )
The distinction is that the high time penalty required for moving the disk heads
from spot to spot may now justify the benefits of keeping files contiguously when
possible.
• ( Even file systems that do not by default store files contiguously can benefit from
certain utilities that compact the disk and make all files contiguous in the process.
)
• Problems can arise when files grow, or if the exact size of a file is unknown at
creation time:
o Over-estimation of the file's final size increases external fragmentation
and wastes disk space.
o Under-estimation may require that a file be moved or a process aborted if
the file grows beyond its originally allocated space.
o If a file grows slowly over a long time period and the total final space
must be allocated initially, then a lot of space becomes unusable before the
file fills the space.
• A variation is to allocate file space in large contiguous chunks, called extents.
When a file outgrows its original extent, then an additional one is allocated. ( For
example an extent may be the size of a complete track or even cylinder, aligned
on an appropriate track or cylinder boundary. ) The high-performance files system
Veritas uses extents to optimize performance.

Figure 4.17 -Contiguous allocation of disk space.

4.10.2 Linked Allocation


• Disk files can be stored as linked lists, with the expense of the storage space
consumed by each link. ( E.g. a block may be 508 bytes instead of 512. )
• Linked allocation involves no external fragmentation, does not require pre-known
file sizes, and allows files to grow dynamically at any time.
• Unfortunately linked allocation is only efficient for sequential access files, as
random access requires starting at the beginning of the list for each new location
access.
• Allocating clusters of blocks reduces the space wasted by pointers, at the cost of
internal fragmentation.
• Another big problem with linked allocation is reliability if a pointer is lost or
damaged. Doubly linked lists provide some protection, at the cost of additional
overhead and wasted space.

Figure 4.18 - Linked allocation of disk space.

The File Allocation Table, FAT, used by DOS is a variation of linked allocation, where all
the links are stored in a separate table at the beginning of the disk. The benefit of this
approach is that the FAT table can be cached in memory, greatly improving random access
speeds.

Figure 4.19- File-allocation table.
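
A tiny sketch of how a FAT-style linked allocation is followed: the directory entry gives the first block, and each FAT entry gives the next block of the same file, with a sentinel marking end-of-file. The table contents are made up for the example:

    #include <stdio.h>

    #define EOF_MARK -1   /* end-of-file marker in the FAT */

    int main(void) {
        /* fat[b] = block that follows block b in the same file, or EOF_MARK. */
        int fat[16] = {0};
        fat[2] = 9;  fat[9] = 5;  fat[5] = EOF_MARK;   /* file occupies 2 -> 9 -> 5 */

        int first_block = 2;    /* would come from the directory entry */

        /* Random access to block k of the file still means k hops through the
         * chain -- but the whole FAT can be cached in memory, so no disk seeks. */
        for (int b = first_block; b != EOF_MARK; b = fat[b])
            printf("data block %d\n", b);
        return 0;
    }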

4.10.3 Indexed Allocation


• Indexed Allocation combines all of the indexes for accessing each file into a
common block ( for that file ), as opposed to spreading them all over the disk or
storing them in a FAT table.

Figure 4.20- Indexed allocation of disk space.

• Some disk space is wasted ( relative to linked lists or FAT tables ) because an
entire index block must be allocated for each file, regardless of how many data
blocks the file contains. This leads to questions of how big the index block should
be, and how it should be implemented. There are several approaches:
o Linked Scheme - An index block is one disk block, which can be read
and written in a single disk operation. The first index block contains some
header information, the first N block addresses, and if necessary a pointer
to additional linked index blocks.
o Multi-Level Index - The first index block contains a set of pointers to
secondary index blocks, which in turn contain pointers to the actual data
blocks.
o Combined Scheme - This is the scheme used in UNIX inodes, in which
the first 12 or so data block pointers are stored directly in the inode, and
then singly, doubly, and triply indirect pointers provide access to more
data blocks as needed. (See below.) The advantage of this scheme is that
for small files (which many are), the data blocks are readily accessible (up
to 48K with 4K block sizes); files up to about 4144K (using 4K blocks)
are accessible with only a single indirect block (which can be cached), and
huge files are still accessible using a relatively small number of disk
accesses (larger in theory than can be addressed by a 32-bit address, which
is why some systems have moved to 64-bit file pointers.)

Figure 4.21- The UNIX inode.
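
The file-size figures quoted above follow from simple arithmetic. A short sketch, assuming 4 KB blocks, 4-byte block pointers ( so 1024 pointers per index block ), and 12 direct pointers in the inode:

    #include <stdio.h>

    int main(void) {
        long long block = 4096;          /* bytes per data block      */
        long long ptrs  = block / 4;     /* 4-byte pointers per block */

        long long direct = 12 * block;                     /* 48 KB */
        long long single = ptrs * block;                    /* 4 MB  */
        long long dbl    = ptrs * ptrs * block;             /* 4 GB  */
        long long triple = ptrs * ptrs * ptrs * block;      /* 4 TB  */

        printf("direct blocks only : %lld KB\n", direct / 1024);
        printf("+ single indirect  : %lld KB\n", (direct + single) / 1024);
        printf("+ double indirect  : %lld MB\n", (direct + single + dbl) / (1024*1024));
        printf("+ triple indirect  : %lld GB\n",
               (direct + single + dbl + triple) / (1024LL*1024*1024));
        return 0;
    }

The 48 KB and 4144 KB numbers given in the Combined Scheme bullet come straight out of this calculation.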

Performance
• The optimal allocation method is different for sequential access files than for
random access files, and is also different for small files than for large files.
• Some systems support more than one allocation method, which may require
specifying how the file is to be used (sequential or random access) at the time it is
allocated. Such systems also provide conversion utilities.
• Some systems have been known to use contiguous access for small files, and
automatically switch to an indexed scheme when file sizes surpass a certain
threshold.
• And of course some systems adjust their allocation schemes (e.g. block sizes) to
best match the characteristics of the hardware for optimum performance.

4.11 FREE-SPACE MANAGEMENT
• Another important aspect of disk management is keeping track of and allocating
free space.

Bit Vector
• One simple approach is to use a bit vector, in which each bit represents a disk
block, set to 1 if free or 0 if allocated.
• Fast algorithms exist for quickly finding contiguous blocks of a given size
• The down side is the size of the bitmap: for example, a 40 GB disk with 1 KB blocks
has 40 million blocks, so the bitmap alone occupies 40 million bits, i.e. about 5 MB.

Linked List
• A linked list can also be used to keep track of all free blocks.
• Traversing the list and/or finding a contiguous block of a given size are not easy,
but fortunately are not frequently needed operations. Generally the system just
adds and removes single blocks from the beginning of the list.
• The FAT table keeps track of the free list as just one more linked list on the table.

Figure 4.22- Linked free-space list on disk.

Grouping
• A variation on linked list free lists is to use links of blocks of indices of free
blocks. If a block holds up to N addresses, then the first block in the linked-list
contains up to N-1 addresses of free blocks and a pointer to the next block of free
addresses.

Counting
• When there are multiple contiguous blocks of free space then the system can keep
track of the starting address of the group and the number of contiguous free
blocks. As long as the average length of a contiguous group of free blocks is
greater than two this offers a savings in space needed for the free list. (Similar to
compression techniques used for graphics images when a group of pixels all the same
color is encountered.)

Space Maps
• Sun's ZFS file system was designed for HUGE numbers and sizes of files,
directories, and even file systems.
• The resulting data structures could be VERY inefficient if not implemented
carefully. For example, freeing up a 1 GB file on a 1 TB file system could involve
updating thousands of blocks of free list bit maps if the file was spread across the
disk.
• ZFS uses a combination of techniques, starting with dividing the disk up into (
hundreds of ) metaslabs of a manageable size, each having their own space map.
• Free blocks are managed using the counting technique, but rather than write the
information to a table, it is recorded in a log-structured transaction record.
Adjacent free blocks are also coalesced into a larger single free block.
• An in-memory space map is constructed using a balanced tree data structure,
constructed from the log data.
• The combination of the in-memory tree and the on-disk log provide for very fast
and efficient management of these very large files and free blocks.

4.12 EFFICIENCY AND PERFORMANCE


Efficiency
• UNIX pre-allocates inodes, which occupies space even before any files are
created.
• UNIX also distributes inodes across the disk, and tries to store data files near their
inode, to reduce the distance of disk seeks between the inodes and the data.
• Some systems use variable size clusters depending on the file size.
• The more data that is stored in a directory ( e.g. last access time ), the more often
the directory blocks have to be re-written.
• As technology advances, addressing schemes have had to grow as well.
o Sun's ZFS file system uses 128-bit pointers, which should theoretically
never need to be expanded. ( The mass required to store 2^128 bytes with
atomic storage would be at least 272 trillion kilograms! )
• Kernel table sizes used to be fixed, and could only be changed by rebuilding the
kernels. Modern tables are dynamically allocated, but that requires more
complicated algorithms for accessing them.

Performance
• Disk controllers generally include on-board caching. When a seek is requested,
the heads are moved into place, and then an entire track is read, starting from
whatever sector is currently under the heads ( reducing latency ). The requested
sector is returned, and the unrequested portion of the track is cached in the disk
controller's electronics.
• Some OSes cache disk blocks they expect to need again in a buffer cache.
• A page cache connected to the virtual memory system is actually more efficient
as memory addresses do not need to be converted to disk block addresses and
back again.
• Some systems ( Solaris, Linux, Windows 2000, NT, XP ) use page caching for
both process pages and file data in a unified virtual memory.
• Figures below show the advantages of the unified buffer cache found in some
versions of UNIX and Linux - Data does not need to be stored twice, and
problems of inconsistent buffer information are avoided.
Figure 4.23- I/O without a unified buffer cache.

Figure 4.24 -I/O using a unified buffer cache.


• Page replacement strategies can be complicated with a unified cache, as one needs
to decide whether to replace process or file pages, and how many pages to
guarantee to each category of pages. Solaris, for example, has gone through many
variations, resulting in priority paging giving process pages priority over file I/O
pages, and setting limits so that neither can knock the other completely out of
memory.
• Another issue affecting performance is the question of whether to implement
synchronous writes or asynchronous writes. Synchronous writes occur in the
order in which the disk subsystem receives them, without caching; Asynchronous
writes are cached, allowing the disk subsystem to schedule writes in a more
efficient order ( See Chapter 12. ) Metadata writes are often done synchronously.
Some systems support flags to the open call requiring that writes be synchronous,
for example for the benefit of database systems that require their writes be
performed in a required order.
• The type of file access can also have an impact on optimal page replacement
policies. For example, LRU is not necessarily a good policy for sequential access
files. For these types of files progression normally goes in a forward direction
only, and the most recently used page will not be needed again until after the file
has been rewound and re-read from the beginning, ( if it is ever needed at all. ) On
the other hand, we can expect to need the next page in the file fairly soon. For this
reason sequential access files often take advantage of two special policies:
o Free-behind frees up a page as soon as the next page in the file is
requested, with the assumption that we are now done with the old page
and won't need it again for a long time.
o Read-ahead reads the requested page and several subsequent pages at the
same time, with the assumption that those pages will be needed in the near
future. This is similar to the track caching that is already performed by the
disk controller, except it saves the future latency of transferring data from
the disk controller memory into motherboard main memory.
• The caching system and asynchronous writes speed up disk writes considerably,
because the disk subsystem can schedule physical writes to the disk to minimize
head movement and disk seek times. ( See Chapter 12. ) Reads, on the other hand,
must be done more synchronously in spite of the caching system, with the result
that disk writes can counter-intuitively be much faster on average than disk reads.

MASS-STORAGE STRUCTURE

4.13 Overview of Mass-Storage Structure


Magnetic Disks
• Traditional magnetic disks have the following basic structure:
o One or more platters in the form of disks covered with magnetic media.
Hard disk platters are made of rigid metal, while "floppy" disks are made
of more flexible plastic.
o Each platter has two working surfaces. Older hard disk drives would
sometimes not use the very top or bottom surface of a stack of platters, as
these surfaces were more susceptible to potential damage.
o Each working surface is divided into a number of concentric rings called
tracks. The collection of all tracks that are the same distance from the
edge of the platter, ( i.e. all tracks immediately above one another in the
following diagram ) is called a cylinder.
o Each track is further divided into sectors, traditionally containing 512
bytes of data each, although some modern disks occasionally use larger
sector sizes. ( Sectors also include a header and a trailer, including
checksum information among other things. Larger sector sizes reduce the
fraction of the disk consumed by headers and trailers, but increase internal
fragmentation and the amount of disk that must be marked bad in the case
of errors. )
o The data on a hard drive is read by read-write heads. The standard
configuration ( shown below ) uses one head per surface, each on a
separate arm, and controlled by a common arm assembly which moves all
heads simultaneously from one cylinder to another. ( Other configurations,
including independent read-write heads, may speed up disk access, but
involve serious technical difficulties. )
o The storage capacity of a traditional disk drive is equal to the number of
heads ( i.e. the number of working surfaces ), times the number of tracks
per surface, times the number of sectors per track, times the number of
bytes per sector. A particular physical block of data is specified by
providing the head-sector-cylinder number at which it is located.
Figure 4.25 -Moving-head disk mechanism.

• In operation the disk rotates at high speed, such as 7200 rpm ( 120 revolutions per
second. ) The rate at which data can be transferred from the disk to the computer
is composed of several steps:
o The positioning time, a.k.a. the seek time or random access time is the
time required to move the heads from one cylinder to another, and for the
heads to settle down after the move. This is typically the slowest step in
the process and the predominant bottleneck to overall transfer rates.
o The rotational latency is the amount of time required for the desired
sector to rotate around and come under the read-write head.This can range
anywhere from zero to one full revolution, and on the average will equal
one-half revolution. This is another physical step and is usually the second
slowest step behind seek time. ( For a disk rotating at 7200 rpm, the
average rotational latency would be 1/2 revolution / 120 revolutions per
second, or just over 4 milliseconds, a long time by computer standards. )
o The transfer rate, which is the time required to move the data
electronically from the disk to the computer. ( Some authors may also use
the term transfer rate to refer to the overall transfer rate, including seek
time and rotational latency as well as the electronic data transfer rate. )
• Disk heads "fly" over the surface on a very thin cushion of air. If they should
accidentally contact the disk, then a head crash occurs, which may or may not
permanently damage the disk or even destroy it completely. For this reason it is
normal to park the disk heads when turning a computer off, which means to move
the heads off the disk or to an area of the disk where there is no data stored.
• Floppy disks are normally removable. Hard drives can also be removable, and
some are even hot-swappable, meaning they can be removed while the computer
is running, and a new hard drive inserted in their place.
• Disk drives are connected to the computer via a cable known as the I/O Bus.
Some of the common interface formats include Enhanced Integrated Drive
Electronics, EIDE; Advanced Technology Attachment, ATA; Serial ATA, SATA,
Universal Serial Bus, USB; Fiber Channel, FC, and Small Computer Systems
Interface, SCSI.
• The host controller is at the computer end of the I/O bus, and the disk controller
is built into the disk itself. The CPU issues commands to the host controller via
I/O ports. Data is transferred between the magnetic surface and onboard cache by
the disk controller, and then the data is transferred from that cache to the host
controller and the motherboard memory at electronic speeds.

Solid-State Disks - New


• As technologies improve and economics change, old technologies are often used
in different ways. One example of this is the increasing use of solid-state disks,
or SSDs.
• SSDs use memory technology as a small fast hard disk. Specific implementations
may use either flash memory or DRAM chips protected by a battery to sustain the
information through power cycles.
• Because SSDs have no moving parts they are much faster than traditional hard
drives, and certain problems such as the scheduling of disk accesses simply do not
apply.
• However SSDs also have their weaknesses: They are more expensive than hard
drives, generally not as large, and may have shorter life spans.
• SSDs are especially useful as a high-speed cache of hard-disk information that
must be accessed quickly. One example is to store filesystem meta-data, e.g.
directory and inode information, that must be accessed quickly and often. Another
variation is a boot disk containing the OS and some application executables, but
no vital user data. SSDs are also used in laptops to make them smaller, faster, and
lighter.
• Because SSDs are so much faster than traditional hard disks, the throughput of the
bus can become a limiting factor, causing some SSDs to be connected directly to
the system PCI bus for example.

Magnetic Tapes - Magnetic tapes were once used for common secondary storage before the
days of hard disk drives, but today are used primarily for backups.
• Accessing a particular spot on a magnetic tape can be slow, but once reading or
writing commences, access speeds are comparable to disk drives.
• Capacities of tape drives can range from 20 to 200 GB, and compression can
double that capacity.

4.14 Disk Structure


• The traditional head-sector-cylinder, HSC numbers are mapped to linear block addresses
by numbering the first sector on the first head on the outermost track as sector 0.
Numbering proceeds with the rest of the sectors on that same track, and then the rest of
the tracks on the same cylinder before proceeding through the rest of the cylinders to the
center of the disk. In modern practice these linear block addresses are used in place of the
HSC numbers for a variety of reasons:
1. The linear length of tracks near the outer edge of the disk is much longer than for
those tracks located near the center, and therefore it is possible to squeeze many
more sectors onto outer tracks than onto inner ones.
2. All disks have some bad sectors, and therefore disks maintain a few spare sectors
that can be used in place of the bad ones. The mapping of spare sectors to bad
sectors is managed internally by the disk controller.
3. Modern hard drives can have thousands of cylinders, and hundreds of sectors per
track on their outermost tracks. These numbers exceed the range of HSC numbers
for many ( older ) operating systems, and therefore disks can be configured for
any convenient combination of HSC values that falls within the total number of
sectors physically on the drive.
• There is a limit to how closely packed individual bits can be placed on a physical media,
but that limit is growing increasingly more packed as technological advances are made.
• Modern disks pack many more sectors into outer cylinders than inner ones, using one of
two approaches:
o With Constant Linear Velocity, CLV, the density of bits is uniform from cylinder
to cylinder. Because there are more sectors in outer cylinders, the disk spins
slower when reading those cylinders, causing the rate of bits passing under the
R/W head to remain constant. This is the approach used by modern CDs/DVDs.
o With Constant Angular Velocity, CAV, the disk rotates at a constant angular
speed, with the bit density decreasing on outer cylinders. (These disks would have
a constant number of sectors per track on all cylinders.)
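• The numbering scheme described above ( sector 0 is the first sector on the first head of the outermost cylinder ) can be written down directly. The sketch below assumes an idealized drive with a fixed number of heads and sectors per track, which real drives no longer expose; it illustrates the mapping only, not any particular drive's firmware.

    /* Idealized mapping from (cylinder, head, sector) to a linear block
     * address, following the numbering described above.  Real drives hide
     * their actual geometry (zoned recording, remapped sectors), so this
     * is only a sketch with assumed constants. */
    #include <stdio.h>

    #define HEADS           16   /* heads (working surfaces) - assumed */
    #define SECTORS_PER_TRK 63   /* sectors per track        - assumed */

    long lba_from_chs(long cylinder, long head, long sector /* 0-based */) {
        return (cylinder * HEADS + head) * SECTORS_PER_TRK + sector;
    }

    int main(void) {
        /* e.g. the 5th sector on head 2 of cylinder 100 */
        printf("LBA = %ld\n", lba_from_chs(100, 2, 4));
        return 0;
    }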

4.15 DISK ATTACHMENT


Disk drives can be attached either directly to a particular host ( a local disk ) or to a network.
Host-Attached Storage
• Local disks are accessed through I/O Ports as described earlier.
• The most common interfaces are IDE or ATA, each of which allow up to two
drives per host controller.
• SATA is similar with simpler cabling.
• High end workstations or other systems in need of larger number of disks
typically use SCSI disks:
o The SCSI standard supports up to 16 targets on each SCSI bus, one of
which is generally the host adapter and the other 15 of which can be disk
or tape drives.
o A SCSI target is usually a single drive, but the standard also supports up to
8 units within each target. These would generally be used for accessing
individual disks within a RAID array. ( See below. )
o The SCSI standard also supports multiple host adapters in a single
computer, i.e. multiple SCSI busses.
o Modern advancements in SCSI include "fast" and "wide" versions, as well
as SCSI-2.
o SCSI cables may be either 50 or 68 conductors. SCSI devices may be
external as well as internal.
o See wikipedia for more information on the SCSI interface.
• FC is a high-speed serial architecture that can operate over optical fiber or four-
conductor copper wires, and has two variants:
o A large switched fabric having a 24-bit address space. This variant allows
for multiple devices and multiple hosts to interconnect, forming the basis
for the storage-area networks, SANs, to be discussed in a future section.
o The arbitrated loop, FC-AL, that can address up to 126 devices (drives
and controllers.)
Network-Attached Storage
• Network attached storage connects storage devices to computers using a remote
procedure call, RPC, interface, typically with something like NFS filesystem
mounts. This is convenient for allowing several computers in a group common
access and naming conventions for shared storage.
• A related approach is iSCSI, which carries the SCSI protocol over Internet
protocols and standard network connections, allowing long-distance remote access to
shared storage.
• NAS allows computers to easily share data storage, but tends to be less efficient
than standard host-attached storage.

Figure 4.26- Network-attached storage.

Storage-Area Network

• A Storage-Area Network, SAN, connects computers and storage devices in a
network, using storage protocols instead of network protocols.
• One advantage of this is that storage access does not tie up regular networking
bandwidth.
• SAN is very flexible and dynamic, allowing hosts and devices to attach and
detach on the fly.
• SAN is also controllable, allowing restricted access to certain hosts and devices.

Figure 4.27- Storage-area network.

4.16 DISK SCHEDULING

• As mentioned earlier, disk transfer speeds are limited primarily by seek times and
rotational latency. When multiple requests are to be processed there is also some
inherent delay in waiting for other requests to be processed.
• Bandwidth is measured by the amount of data transferred divided by the total amount of
time from the first request being made to the last transfer being completed, ( for a series
of disk requests. )
• Both bandwidth and access time can be improved by processing requests in a good order.
• Disk requests include the disk address, memory address, number of sectors to transfer,
and whether the request is for reading or writing.

FCFS Scheduling

• First-Come First-Serve is simple and intrinsically fair, but not very efficient.
Consider in the following sequence the wild swing from cylinder 122 to 14 and
then back to 124:

Figure 4.28 -FCFS disk scheduling.

SSTF Scheduling

• Shortest Seek Time First scheduling is more efficient, but may lead to starvation
if a constant stream of requests arrives for the same general area of the disk.
• SSTF reduces the total head movement to 236 cylinders, down from 640 required
for the same set of requests under FCFS. Note, however that the distance could be
reduced still further to 208 by starting with 37 and then 14 first before processing
the rest of the requests.

Figure 4.29- SSTF disk scheduling.
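• The totals quoted above can be checked with a few lines of C. The request queue is assumed here to be the usual textbook queue used in Figures 4.28 and 4.29 ( 98, 183, 37, 122, 14, 124, 65, 67 ), with the head initially at cylinder 53; running the program reproduces the 640 versus 236 cylinder totals.

    /* Total head movement under FCFS and SSTF for the assumed request queue
     * (head initially at cylinder 53). */
    #include <stdio.h>
    #include <stdlib.h>

    #define NREQ 8

    static int fcfs(int head, const int *q, int n) {
        int total = 0;
        for (int i = 0; i < n; i++) { total += abs(q[i] - head); head = q[i]; }
        return total;
    }

    static int sstf(int head, const int *q, int n) {
        int total = 0, done[NREQ] = { 0 };
        for (int served = 0; served < n; served++) {
            int best = -1, bestdist = 0;
            for (int i = 0; i < n; i++) {          /* pick the closest pending request */
                int d = abs(q[i] - head);
                if (!done[i] && (best < 0 || d < bestdist)) { best = i; bestdist = d; }
            }
            done[best] = 1;
            total += bestdist;
            head = q[best];
        }
        return total;
    }

    int main(void) {
        int queue[NREQ] = { 98, 183, 37, 122, 14, 124, 65, 67 };
        printf("FCFS: %d cylinders\n", fcfs(53, queue, NREQ));   /* 640 */
        printf("SSTF: %d cylinders\n", sstf(53, queue, NREQ));   /* 236 */
        return 0;
    }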

SCAN Scheduling

• The SCAN algorithm, a.k.a. the elevator algorithm moves back and forth from
one end of the disk to the other, similarly to an elevator processing requests in a
tall building.

Figure 4.30- SCAN disk scheduling.

• Under the SCAN algorithm, if a request arrives just ahead of the moving head
then it will be processed right away, but if it arrives just after the head has passed,
then it will have to wait for the head to pass going the other way on the return trip.
This leads to a fairly wide variation in access times which can be improved upon.
• Consider, for example, when the head reaches the high end of the disk: Requests
with high cylinder numbers just missed the passing head, which means they are
all fairly recent requests, whereas requests with low numbers may have been
waiting for a much longer time. Making the return scan from high to low then
ends up accessing recent requests first and making older requests wait that much
longer.

C-SCAN Scheduling

• The Circular-SCAN algorithm improves upon SCAN by treating all requests in a
circular queue fashion - Once the head reaches the end of the disk, it returns to the
other end without processing any requests, and then starts again from the
beginning of the disk:

Figure 4.31- C-SCAN disk scheduling.

LOOK Scheduling

• LOOK scheduling improves upon SCAN by looking ahead at the queue of
pending requests, and not moving the heads any farther towards the end of the
disk than is necessary. The following diagram illustrates the circular form of
LOOK:

Figure 4.32- C-LOOK disk scheduling.
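• The C-LOOK head movement can be computed the same way. The sketch below again assumes the same request queue as above ( head at cylinder 53, initially moving toward higher cylinder numbers ): the head sweeps upward to the highest pending request, then jumps to the lowest pending request and sweeps upward again.

    /* Head movement under C-LOOK for the assumed request queue. */
    #include <stdio.h>
    #include <stdlib.h>

    static int cmp_int(const void *a, const void *b) {
        return *(const int *)a - *(const int *)b;
    }

    int main(void) {
        int q[] = { 98, 183, 37, 122, 14, 124, 65, 67 };
        int n = sizeof q / sizeof q[0], head = 53, total = 0;

        qsort(q, n, sizeof q[0], cmp_int);         /* service in cylinder order */

        int first_up = 0;                          /* first request >= head */
        while (first_up < n && q[first_up] < head) first_up++;

        int pos = head;
        for (int i = first_up; i < n; i++) {       /* sweep upward */
            total += q[i] - pos;
            pos = q[i];
        }
        for (int i = 0; i < first_up; i++) {       /* jump to lowest, sweep up again */
            total += abs(q[i] - pos);
            pos = q[i];
        }
        printf("C-LOOK: %d cylinders\n", total);
        return 0;
    }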

Selection of a Disk-Scheduling Algorithm
• With very low loads all algorithms are equal, since there will normally only be
one request to process at a time.
• For slightly larger loads, SSTF offers better performance than FCFS, but may lead
to starvation when loads become heavy enough.
• For busier systems, SCAN and LOOK algorithms eliminate starvation problems.
• The actual optimal algorithm may be something even more complex than those
discussed here, but the incremental improvements are generally not worth the
additional overhead.
• Some improvement to overall filesystem access times can be made by intelligent
placement of directory and/or inode information. If those structures are placed in
the middle of the disk instead of at the beginning of the disk, then the maximum
distance from those structures to data blocks is reduced to only one-half of the
disk size. If those structures can be further distributed and furthermore have their
data blocks stored as close as possible to the corresponding directory structures,
then that reduces still further the overall time to find the disk block numbers and
then access the corresponding data blocks.
• On modern disks the rotational latency can be almost as significant as the seek
time; however, it is not within the OS's control to account for that, because
modern disks do not reveal their internal sector mapping schemes, ( particularly
when bad blocks have been remapped to spare sectors. )
o Some disk manufacturers provide for disk scheduling algorithms directly
on their disk controllers, ( which do know the actual geometry of the disk
as well as any remapping ), so that if a series of requests are sent from the
computer to the controller then those requests can be processed in an
optimal order.
o Unfortunately there are some considerations that the OS must take into
account that are beyond the abilities of the on-board disk-scheduling
algorithms, such as priorities of some requests over others, or the need to
process certain requests in a particular order. For this reason OSes may
elect to spoon-feed requests to the disk controller one at a time in certain
situations.

4.17 Disk Management

Disk Formatting
• Before a disk can be used, it has to be low-level formatted, which means laying
down all of the headers and trailers marking the beginning and ends of each
sector. Included in the header and trailer are the linear sector numbers, and error-
correcting codes, ECC, which allow damaged sectors to not only be detected, but
in many cases for the damaged data to be recovered ( depending on the extent of
the damage. ) Sector sizes are traditionally 512 bytes, but may be larger,
particularly in larger drives.
• ECC calculation is performed with every disk read or write, and if damage is
detected but the data is recoverable, then a soft error has occurred. Soft errors are
generally handled by the on-board disk controller, and never seen by the OS. ( See
below. )
• Once the disk is low-level formatted, the next step is to partition the drive into one
or more separate partitions. This step must be completed even if the disk is to be
used as a single large partition, so that the partition table can be written to the
beginning of the disk.
• After partitioning, then the filesystems must be logically formatted, which
involves laying down the master directory information ( FAT table or inode
structure ), initializing free lists, and creating at least the root directory of the
filesystem. ( Disk partitions which are to be used as raw devices are not logically
formatted. This saves the overhead and disk space of the filesystem structure, but
requires that the application program manage its own disk storage requirements. )

Boot Block
• Computer ROM contains a bootstrap program ( OS independent ) with just
enough code to find the first sector on the first hard drive on the first controller,
load that sector into memory, and transfer control over to it. ( The ROM bootstrap
program may look in floppy and/or CD drives before accessing the hard drive,
and is smart enough to recognize whether it has found valid boot code or not. )
• The first sector on the hard drive is known as the Master Boot Record, MBR, and
contains a very small amount of code in addition to the partition table. The
partition table documents how the disk is partitioned into logical disks, and
indicates specifically which partition is the active or boot partition.
• The boot program then looks to the active partition to find an operating system,
possibly loading up a slightly larger / more advanced boot program along the way.
• In a dual-boot ( or larger multi-boot ) system, the user may be given a choice of
which operating system to boot, with a default action to be taken in the event of
no response within some time frame.
• Once the kernel is found by the boot program, it is loaded into memory and then
control is transferred over to the OS. The kernel will normally continue the boot
process by initializing all important kernel data structures, launching important
system services ( e.g. network daemons, sched, init, etc. ), and finally providing
one or more login prompts. Boot options at this stage may include single-user
a.k.a. maintenance or safe modes, in which very few system services are started -
These modes are designed for system administrators to repair problems or
otherwise maintain the system.

Figure 4.33- Booting from disk in Windows 2000.

Bad Blocks
• No disk can be manufactured to 100% perfection, and all physical objects wear
out over time. For these reasons all disks are shipped with a few bad blocks, and
additional blocks can be expected to go bad slowly over time. If a large number of
blocks go bad then the entire disk will need to be replaced, but a few here and
there can be handled through other means.
• In the old days, bad blocks had to be checked for manually. Formatting of the disk
or running certain disk-analysis tools would identify bad blocks, and attempt to
read the data off of them one last time through repeated tries. Then the bad blocks
would be mapped out and taken out of future service. Sometimes the data could
be recovered, and sometimes it was lost forever. ( Disk analysis tools could be
either destructive or non-destructive. )
• Modern disk controllers make much better use of the error-correcting codes, so
that bad blocks can be detected earlier and the data usually recovered. ( Recall
that blocks are tested with every write as well as with every read, so often errors
can be detected before the write operation is complete, and the data simply written
to a different sector instead. )
• Note that re-mapping of sectors from their normal linear progression can throw
off the disk scheduling optimization of the OS, especially if the replacement
sector is physically far away from the sector it is replacing. For this reason most
disks normally keep a few spare sectors on each cylinder, as well as at least one
spare cylinder. Whenever possible a bad sector will be mapped to another sector
on the same cylinder, or at least a cylinder as close as possible. Sector slipping
may also be performed, in which all sectors between the bad sector and the
replacement sector are moved down by one, so that the linear progression of
sector numbers can be maintained.
• If the data on a bad block cannot be recovered, then a hard error has occurred,
which requires replacing the file(s) from backups, or rebuilding them from
scratch.

4.18 Swap-Space Management


• Modern systems typically swap out pages as needed, rather than swapping out entire
processes. Hence the swapping system is part of the virtual memory management system.
• Managing swap space is obviously an important task for modern OSes.

Swap-Space Use

• The amount of swap space needed by an OS varies greatly according to how it is
used. Some systems require an amount equal to physical RAM; some want a
multiple of that; some want an amount equal to the amount by which virtual
memory exceeds physical RAM, and some systems use little or none at all!
• Some systems support multiple swap spaces on separate disks in order to speed up
the virtual memory system.

Swap-Space Location
Swap space can be physically located in one of two locations:

• As a large file which is part of the regular filesystem. This is easy to
implement, but inefficient. Not only must the swap space be accessed
through the directory system, the file is also subject to fragmentation
issues. Caching the block location helps in finding the physical blocks, but
that is not a complete fix.
• As a raw partition, possibly on a separate or little-used disk. This allows
the OS more control over swap space management, which is usually faster
and more efficient. Fragmentation of swap space is generally not a big
issue, as the space is re-initialized every time the system is rebooted. The
downside of keeping swap space on a raw partition is that it can only be
grown by repartitioning the hard drive.

Swap-Space Management: An Example


• Historically OSes swapped out entire processes as needed. Modern systems swap
out only individual pages, and only as needed. ( For example process code blocks
and other blocks that have not been changed since they were originally loaded are
normally just freed from the virtual memory system rather than copying them to
swap space, because it is faster to go find them again in the filesystem and read
them back in from there than to write them out to swap space and then read them
back. )
• In the mapping system shown below for Linux systems, a map of swap space is
kept in memory, where each entry corresponds to a 4K block in the swap space.
Zeros indicate free slots and non-zeros refer to how many processes have a
mapping to that particular block (>1 for shared pages only.)

Figure 4.34- The data structures for swapping on Linux systems.
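• The figure itself is not reproduced here, so the toy sketch below only captures the idea of the swap map described above - an in-memory array with one counter per 4 KB swap slot - and is not actual Linux kernel code; the slot count and counter type are assumed for illustration.

    /* Toy version of the swap-map idea: one counter per 4 KB swap slot,
     * 0 = free, N > 0 = number of processes mapping that slot. */
    #include <stdio.h>

    #define SWAP_SLOTS 1024                     /* 1024 x 4 KB = 4 MB of swap (toy size) */
    static unsigned char swap_map[SWAP_SLOTS];  /* all slots start out free (0)          */

    /* Find a free slot and mark it in use; return -1 if swap is full. */
    int swap_alloc(void) {
        for (int i = 0; i < SWAP_SLOTS; i++)
            if (swap_map[i] == 0) { swap_map[i] = 1; return i; }
        return -1;
    }

    /* Another process shares an already-swapped page. */
    void swap_dup(int slot)  { swap_map[slot]++; }

    /* Drop one reference; the slot becomes free when the count reaches 0. */
    void swap_free(int slot) { if (swap_map[slot] > 0) swap_map[slot]--; }

    int main(void) {
        int s = swap_alloc();
        swap_dup(s);                            /* now shared by two processes */
        printf("slot %d has %d mapping(s)\n", s, swap_map[s]);
        swap_free(s); swap_free(s);
        printf("slot %d now %s\n", s, swap_map[s] == 0 ? "free" : "in use");
        return 0;
    }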

UNIT-V
Deadlocks
5.1 System Model
• For the purposes of deadlock discussion, a system can be modeled as a collection of
limited resources, which can be partitioned into different categories, to be allocated to a
number of processes, each having different needs.
• Resource categories may include memory, printers, CPUs, open files, tape drives, CD-
ROMS, etc.
• By definition, all the resources within a category are equivalent, and a request of this
category can be equally satisfied by any one of the resources in that category. If this is
not the case ( i.e. if there is some difference between the resources within a category ),
then that category needs to be further divided into separate categories. For example,
"printers" may need to be separated into "laser printers" and "color inkjet printers".
• Some categories may have a single resource.
• In normal operation a process must request a resource before using it, and release it when
it is done, in the following sequence:
1. Request - If the request cannot be immediately granted, then the process must
wait until the resource(s) it needs become available. For example the system calls
open( ), malloc( ), new( ), and request( ).
2. Use - The process uses the resource, e.g. prints to the printer or reads from the
file.
3. Release - The process relinquishes the resource, so that it becomes available for
other processes. For example, close( ), free( ), delete( ), and release( ).
• For all kernel-managed resources, the kernel keeps track of what resources are free and
which are allocated, to which process they are allocated, and a queue of processes waiting
for this resource to become available. Application-managed resources can be controlled
using mutexes or wait( ) and signal( ) calls, ( i.e. binary or counting semaphores. )
• A set of processes is deadlocked when every process in the set is waiting for a resource
that is currently allocated to another process in the set ( and which can only be released
when that other waiting process makes progress. )
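• The request / use / release pattern, and the way two processes can end up waiting on each other, can be shown with two pthread mutexes. The sketch below is deliberately deadlock-prone ( the two threads acquire the locks in opposite order ), so it illustrates the definition above rather than a recommended coding pattern. ( Compile with the -pthread flag; with the sleep( ) calls in place it will usually hang. )

    /* Two threads follow request -> use -> release, but acquire the two
     * locks in opposite order.  If thread A gets lock1 while thread B gets
     * lock2, each then waits forever for the other's lock: a deadlock
     * exactly matching the definition above.  Deliberately unsafe. */
    #include <stdio.h>
    #include <unistd.h>
    #include <pthread.h>

    static pthread_mutex_t lock1 = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t lock2 = PTHREAD_MUTEX_INITIALIZER;

    static void *thread_a(void *arg) {
        (void)arg;
        pthread_mutex_lock(&lock1);     /* request + hold resource 1      */
        sleep(1);                       /* widen the window for deadlock  */
        pthread_mutex_lock(&lock2);     /* wait for resource 2 ...        */
        puts("A: using both resources");
        pthread_mutex_unlock(&lock2);   /* release                        */
        pthread_mutex_unlock(&lock1);
        return NULL;
    }

    static void *thread_b(void *arg) {
        (void)arg;
        pthread_mutex_lock(&lock2);     /* request + hold resource 2      */
        sleep(1);
        pthread_mutex_lock(&lock1);     /* wait for resource 1 ...        */
        puts("B: using both resources");
        pthread_mutex_unlock(&lock1);
        pthread_mutex_unlock(&lock2);
        return NULL;
    }

    int main(void) {
        pthread_t a, b;
        pthread_create(&a, NULL, thread_a, NULL);
        pthread_create(&b, NULL, thread_b, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        return 0;
    }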
5.2 Deadlock Characterization
5.2.1 Necessary Conditions
• There are four conditions that are necessary to achieve deadlock:
1. Mutual Exclusion - At least one resource must be held in a non-sharable
mode; If any other process requests this resource, then that process must
wait for the resource to be released.
2. Hold and Wait - A process must be simultaneously holding at least one
resource and waiting for at least one resource that is currently being held
by some other process.
3. No preemption - Once a process is holding a resource ( i.e. once its
request has been granted ), then that resource cannot be taken away from
that process until the process voluntarily releases it.
4. Circular Wait - A set of processes { P0, P1, P2, . . ., PN } must exist such
that every P[ i ] is waiting for P[ ( i + 1 ) % ( N + 1 ) ]. ( Note that this
condition implies the hold-and-wait condition, but it is easier to deal with
the conditions if the four are considered separately. )
5.2.2 Resource-Allocation Graph
• In some cases deadlocks can be understood more clearly through the use of
Resource-Allocation Graphs, having the following properties:
o A set of resource categories, { R1, R2, R3, . . ., RN }, which appear as
square nodes on the graph. Dots inside the resource nodes indicate specific
instances of the resource. ( E.g. two dots might represent two laser
printers. )
o A set of processes, { P1, P2, P3, . . ., PN }
o Request Edges - A set of directed arcs from Pi to Rj, indicating that
process Pi has requested Rj, and is currently waiting for that resource to
become available.
o Assignment Edges - A set of directed arcs from Rj to Pi indicating that
resource Rj has been allocated to process Pi, and that Pi is currently
holding resource Rj.
o Note that a request edge can be converted into an assignment edge by
reversing the direction of the arc when the request is granted. ( However
note also that request edges point to the category box, whereas assignment
edges emanate from a particular instance dot within the box. )
o For example:

Figure 5.1 - Resource allocation graph

• If a resource-allocation graph contains no cycles, then the system is not
deadlocked. ( When looking for cycles, remember that these are directed graphs. )
See the example in Figure 5.2.
• If a resource-allocation graph does contain cycles AND each resource category
contains only a single instance, then a deadlock exists.
• If a resource category contains more than one instance, then the presence of a
cycle in the resource-allocation graph indicates the possibility of a deadlock, but
does not guarantee one. Consider, for example, Figures 5.3 and 5.4 below:

Figure 5.2 - Resource allocation graph with a deadlock Figure 5.3 - Resource allocation graph with a cycle
but no deadlock

5.3 Methods for Handling Deadlocks
• Generally speaking there are three ways of handling deadlocks:
1. Deadlock prevention or avoidance - Do not allow the system to get into a
deadlocked state.
2. Deadlock detection and recovery - Abort a process or preempt some resources
when deadlocks are detected.
3. Ignore the problem all together - If deadlocks only occur once a year or so, it may
be better to simply let them happen and reboot as necessary than to incur the
constant overhead and system performance penalties associated with deadlock
prevention or detection. This is the approach that both Windows and UNIX take.
• In order to avoid deadlocks, the system must have additional information about all
processes. In particular, the system must know what resources a process will or may
request in the future. ( Ranging from a simple worst-case maximum to a complete
resource request and release plan for each process, depending on the particular algorithm.
)
• Deadlock detection is fairly straightforward, but deadlock recovery requires either
aborting processes or preempting resources, neither of which is an attractive alternative.
• If deadlocks are neither prevented nor detected, then when a deadlock occurs the system
will gradually slow down, as more and more processes become stuck waiting for
resources currently held by the deadlock and by other waiting processes. Unfortunately
this slowdown can be indistinguishable from a general system slowdown when a real-
time process has heavy computing needs.

5.4 Deadlock Prevention

• Deadlocks can be prevented by preventing at least one of the four required conditions:

5.4.1 Mutual Exclusion


• Shared resources such as read-only files do not lead to deadlocks.
• Unfortunately some resources, such as printers and tape drives, require exclusive
access by a single process.

5.4.2 Hold and Wait


• To prevent this condition processes must be prevented from holding one or more
resources while simultaneously waiting for one or more others. There are several
possibilities for this:
o Require that all processes request all resources at one time. This can be
wasteful of system resources if a process needs one resource early in its
execution and doesn't need some other resource until much later.
o Require that processes holding resources must release them before
requesting new resources, and then re-acquire the released resources along
with the new ones in a single new request. This can be a problem if a
process has partially completed an operation using a resource and then
fails to get it re-allocated after releasing it.
o Either of the methods described above can lead to starvation if a process
requires one or more popular resources.

5.4.3 No Preemption
• Preemption of process resource allocations can prevent this condition of
deadlocks, when it is possible.
o One approach is that if a process is forced to wait when requesting a new
resource, then all other resources previously held by this process are
implicitly released, ( preempted ), forcing this process to re-acquire the old
resources along with the new resources in a single request, similar to the
previous discussion.
o Another approach is that when a resource is requested and not available,
then the system looks to see what other processes currently have those
resources and are themselves blocked waiting for some other resource. If
such a process is found, then some of their resources may get preempted
and added to the list of resources for which the process is waiting.
o Either of these approaches may be applicable for resources whose states
are easily saved and restored, such as registers and memory, but are
generally not applicable to other devices such as printers and tape drives.

5.4.4 Circular Wait


• One way to avoid circular wait is to number all resources, and to require that
processes request resources only in strictly increasing ( or decreasing ) order.
• In other words, in order to request resource Rj, a process must first release all Ri
such that i >= j.
• One big challenge in this scheme is determining the relative ordering of the
different resources. ( A common practical version of this idea - always acquiring
locks in a fixed order - is sketched below. )
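    /* Preventing circular wait by always acquiring two locks in a fixed
     * global order (here: by address, assuming the two locks are distinct).
     * Every thread takes any pair of locks in the same order, so a cycle of
     * waiting threads cannot form.  A sketch, not a complete locking API. */
    #include <stdint.h>
    #include <pthread.h>

    void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b) {
        if ((uintptr_t)a < (uintptr_t)b) {   /* lower-numbered lock first */
            pthread_mutex_lock(a);
            pthread_mutex_lock(b);
        } else {
            pthread_mutex_lock(b);
            pthread_mutex_lock(a);
        }
    }

    void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b) {
        pthread_mutex_unlock(a);             /* unlock order does not affect deadlock */
        pthread_mutex_unlock(b);
    }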

5.5 Deadlock Avoidance


• The general idea behind deadlock avoidance is to prevent deadlocks from ever
happening, by preventing at least one of the aforementioned conditions.
• This requires more information about each process, AND tends to lead to low device
utilization. ( I.e. it is a conservative approach. )
• In some algorithms the scheduler only needs to know the maximum number of each
resource that a process might potentially use. In more complex algorithms the scheduler
can also take advantage of the schedule of exactly what resources may be needed in what
order.
• When a scheduler sees that starting a process or granting resource requests may lead to
future deadlocks, then that process is just not started or the request is not granted.
• A resource allocation state is defined by the number of available and allocated resources,
and the maximum requirements of all processes in the system.

5.5.1 Safe State


• A state is safe if the system can allocate all resources requested by all processes
(up to their stated maximums) without entering a deadlock state.
• More formally, a state is safe if there exists a safe sequence of processes { P0, P1,
P2, ..., PN } such that all of the resource requests for Pi can be granted using the
resources currently allocated to Pi and all processes Pj where j < i. (I.e. if all the
processes prior to Pi finish and free up their resources, then Pi will be able to
finish also, using the resources that they have freed up.)
• If a safe sequence does not exist, then the system is in an unsafe state, which
MAY lead to deadlock. (All safe states are deadlock free, but not all unsafe states
lead to deadlocks.)

Figure 5.6 - Safe, unsafe, and deadlocked state spaces.


• For example, consider a system with 12 tape drives, allocated as follows. Is this a
safe state? What is the safe sequence?
Process   Maximum Needs   Current Allocation
P0        10              5
P1        4               2
P2        9               2
• What happens to the above table if process P2 requests and is granted one more
tape drive?
• Key to the safe state approach is that when a request is made for resources, the
request is granted only if the resulting allocation state is a safe one.
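• ( A worked answer to the questions above, added for clarity: 12 drives minus the 9 currently allocated leaves 3 available. P1 needs at most 4 - 2 = 2 more, so it can finish and return its 4 drives, leaving 5 available; P0 then needs at most 10 - 5 = 5 and can finish, returning 10; finally P2's remaining need of 7 can be met. So < P1, P0, P2 > is a safe sequence and the state is safe. If P2 is granted one more drive, only 2 remain available; P1 can still finish, but that leaves only 4 available while P0 still needs 5 and P2 still needs 6, so no safe sequence exists and the new state is unsafe. )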

5.5.2 Resource-Allocation Graph Algorithm


• If resource categories have only single instances of their resources, then deadlock
states can be detected by cycles in the resource-allocation graphs.
• In this case, unsafe states can be recognized and avoided by augmenting the
resource-allocation graph with claim edges, noted by dashed lines, which point
from a process to a resource that it may request in the future.
• In order for this technique to work, all claim edges must be added to the graph for
any particular process before that process is allowed to request any resources.
( Alternatively, processes may only make requests for resources for which they
have already established claim edges, and claim edges cannot be added to any
process that is currently holding resources. )
• When a process makes a request, the claim edge Pi->Rj is converted to a request
edge. Similarly when a resource is released, the assignment reverts back to a
claim edge.
• This approach works by denying requests that would produce cycles in the
resource-allocation graph, taking claim edges into effect.
• Consider for example what happens when process P2 requests resource R2:

Figure 5.7 - Resource allocation graph for deadlock avoidance

• The resulting resource-allocation graph would have a cycle in it, and so the
request cannot be granted.

Figure 5.8 - An unsafe state in a resource allocation graph

5.5.3 Banker's Algorithm


• For resource categories that contain more than one instance the resource-
allocation graph method does not work, and more complex ( and less efficient )
methods must be chosen.
• The Banker's Algorithm gets its name because it is a method that bankers could
use to assure that when they lend out resources they will still be able to satisfy all
their clients. ( A banker won't loan out a little money to start building a house
unless they are assured that they will later be able to loan out the rest of the
money to finish the house. )
• When a process starts up, it must state in advance the maximum allocation of
resources it may request, up to the amount available on the system.
• When a request is made, the scheduler determines whether granting the request
would leave the system in a safe state. If not, then the process must wait until the
request can be granted safely.
• The banker's algorithm relies on several key data structures: ( where n is the
number of processes and m is the number of resource categories. )
o Available[ m ] indicates how many resources are currently available of
each type.
o Max[ n ][ m ] indicates the maximum demand of each process of each
resource.
o Allocation[ n ][ m ] indicates the number of each resource category
allocated to each process.
o Need[ n ][ m ] indicates the remaining resources needed of each type for
each process. ( Note that Need[ i ][ j ] = Max[ i ][ j ] - Allocation[ i ][ j ]
for all i, j. )
• For simplification of discussions, we make the following notations / observations:
o One row of the Need vector, Need[ i ], can be treated as a vector
corresponding to the needs of process i, and similarly for Allocation and
Max.
o A vector X is considered to be <= a vector Y if X[ i ] <= Y[ i ] for all i.

5.5.3.1 Safety Algorithm


• In order to apply the Banker's algorithm, we first need an algorithm for
determining whether or not a particular state is safe.
• This algorithm determines if the current state of a system is safe,
according to the following steps:
1. Let Work and Finish be vectors of length m and n respectively.
▪ Work is a working copy of the available resources, which
will be modified during the analysis.
▪ Finish is a vector of booleans indicating whether a
particular process can finish. ( or has finished so far in the
analysis. )
▪ Initialize Work to Available, and Finish to false for all
elements.
2. Find an i such that both (A) Finish[ i ] == false, and (B) Need[ i ] <=
Work. This process has not finished, but could finish with the given
available working set. If no such i exists, go to step 4.
3. Set Work = Work + Allocation[ i ], and set Finish[ i ] to true. This
corresponds to process i finishing up and releasing its resources
back into the work pool. Then loop back to step 2.
4. If finish[ i ] == true for all i, then the state is a safe state, because a
safe sequence has been found.
• ( JTB's Modification:
1. In step 1. instead of making Finish an array of booleans initialized
to false, make it an array of ints initialized to 0. Also initialize an
int s = 0 as a step counter.
2. In step 2, look for Finish[ i ] == 0.
3. In step 3, set Finish[ i ] to ++s. S is counting the number of
finished processes.
4. For step 4, the test can be either Finish[ i ] > 0 for all i, or s >= n.
The benefit of this method is that if a safe state exists, then Finish[
] indicates one safe sequence (of possibly many.)

5.5.3.2 Resource-Request Algorithm (The Bankers Algorithm)
• Now that we have a tool for determining if a particular state is safe or not,
we are now ready to look at the Banker's algorithm itself.
• This algorithm determines if a new request is safe, and grants it only if it is
safe to do so.
• When a request is made ( that does not exceed currently available
resources ), pretend it has been granted, and then see if the resulting state
is a safe one. If so, grant the request, and if not, deny the request, as
follows:
1. Let Request[ n ][ m ] indicate the number of resources of each type
currently requested by processes. If Request[ i ] > Need[ i ] for any
process i, raise an error condition.
2. If Request[ i ] > Available for any process i, then that process must
wait for resources to become available. Otherwise the process can
continue to step 3.
3. Check to see if the request can be granted safely, by pretending it
has been granted and then seeing if the resulting state is safe. If so,
grant the request, and if not, then the process must wait until its
request can be granted safely.The procedure for granting a request
( or pretending to for testing purposes ) is:
▪ Available = Available - Request
▪ Allocation = Allocation + Request
▪ Need = Need - Request

5.5.3.3 An Illustrative Example


• Consider the following situation:

• And now consider what happens if process P1 requests 1 instance of A
and 2 instances of C. ( Request[ 1 ] = ( 1, 0, 2 ) )

Figure 5.9 - (a) Resource allocation graph. (b) Corresponding wait-for graph

• What about requests of ( 3, 3, 0 ) by P4? or ( 0, 2, 0 ) by P0? Can these be
safely granted? Why or why not? ( A sketch of the Banker's algorithm that can be
used to check such requests follows. )
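• The table for this example appears only in the figure, which is not reproduced in these notes, so the sketch below hard-codes the usual textbook values ( five processes and resource types A, B, C with 10, 5 and 7 total instances ) - treat that data as an assumption to be checked against the figure. The program implements the safety algorithm of 5.5.3.1 and the resource-request algorithm of 5.5.3.2, and tests P1's request of ( 1, 0, 2 ); the same request_resources( ) call can be used to experiment with the ( 3, 3, 0 ) and ( 0, 2, 0 ) requests asked about above.

    /* Banker's algorithm sketch: safety check plus resource-request test.
     * The data below is the usual textbook example (assumed, since the
     * course-file figure is not reproduced here). */
    #include <stdio.h>
    #include <stdbool.h>

    #define N 5                       /* number of processes      */
    #define M 3                       /* number of resource types */

    int Available[M]     = { 3, 3, 2 };
    int Max[N][M]        = { {7,5,3}, {3,2,2}, {9,0,2}, {2,2,2}, {4,3,3} };
    int Allocation[N][M] = { {0,1,0}, {2,0,0}, {3,0,2}, {2,1,1}, {0,0,2} };
    int Need[N][M];                   /* Need = Max - Allocation  */

    static bool leq(const int *a, const int *b) {   /* a <= b, componentwise */
        for (int j = 0; j < M; j++) if (a[j] > b[j]) return false;
        return true;
    }

    /* Safety algorithm (5.5.3.1): true if a safe sequence exists. */
    static bool is_safe(void) {
        int Work[M];
        bool Finish[N] = { false };
        int count = 0;
        bool progress = true;
        for (int j = 0; j < M; j++) Work[j] = Available[j];
        while (progress) {
            progress = false;
            for (int i = 0; i < N; i++) {
                if (!Finish[i] && leq(Need[i], Work)) {
                    for (int j = 0; j < M; j++) Work[j] += Allocation[i][j];
                    Finish[i] = true;
                    count++;
                    progress = true;
                }
            }
        }
        return count == N;
    }

    /* Resource-request algorithm (5.5.3.2): grant only if the result is safe. */
    static bool request_resources(int i, const int Request[M]) {
        if (!leq(Request, Need[i]) || !leq(Request, Available)) return false;
        for (int j = 0; j < M; j++) {             /* pretend to grant it */
            Available[j]     -= Request[j];
            Allocation[i][j] += Request[j];
            Need[i][j]       -= Request[j];
        }
        if (is_safe()) return true;
        for (int j = 0; j < M; j++) {             /* unsafe - roll back  */
            Available[j]     += Request[j];
            Allocation[i][j] -= Request[j];
            Need[i][j]       += Request[j];
        }
        return false;
    }

    int main(void) {
        for (int i = 0; i < N; i++)
            for (int j = 0; j < M; j++)
                Need[i][j] = Max[i][j] - Allocation[i][j];

        int req[M] = { 1, 0, 2 };                 /* P1 requests ( 1, 0, 2 ) */
        printf("request by P1 %s\n",
               request_resources(1, req) ? "granted" : "denied");
        return 0;
    }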

5.6 Deadlock Detection

• If deadlocks are not avoided, then another approach is to detect when they have occurred
and recover somehow.
• In addition to the performance hit of constantly checking for deadlocks, a policy /
algorithm must be in place for recovering from deadlocks, and there is potential for lost
work when processes must be aborted or have their resources preempted.

5.6.1 Single Instance of Each Resource Type


• If each resource category has a single instance, then we can use a variation of the
resource-allocation graph known as a wait-for graph.
• A wait-for graph can be constructed from a resource-allocation graph by
eliminating the resources and collapsing the associated edges, as shown in the
figure below.
• An arc from Pi to Pj in a wait-for graph indicates that process Pi is waiting for a
resource that process Pj is currently holding.
• As before, cycles in the wait-for graph indicate deadlocks.
• This algorithm must maintain the wait-for graph, and periodically search it for
cycles.
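• Searching a wait-for graph for cycles is a small depth-first search. In the sketch below wf[i][j] = 1 means Pi is waiting for a resource held by Pj; the example edges in main( ) are invented purely for illustration.

    /* Deadlock detection on a wait-for graph: a cycle in this directed
     * graph means the processes on the cycle are deadlocked. */
    #include <stdio.h>

    #define NPROC 4
    static int wf[NPROC][NPROC];    /* wf[i][j] = 1: Pi waits for Pj              */
    static int state[NPROC];        /* 0 = unvisited, 1 = on current path, 2 = done */

    static int dfs(int p) {
        state[p] = 1;
        for (int q = 0; q < NPROC; q++) {
            if (!wf[p][q]) continue;
            if (state[q] == 1) return 1;            /* back edge -> cycle -> deadlock */
            if (state[q] == 0 && dfs(q)) return 1;
        }
        state[p] = 2;
        return 0;
    }

    int deadlocked(void) {
        for (int p = 0; p < NPROC; p++)
            if (state[p] == 0 && dfs(p)) return 1;
        return 0;
    }

    int main(void) {
        /* Invented example: P0 -> P1 -> P2 -> P0 form a cycle, P3 is independent. */
        wf[0][1] = wf[1][2] = wf[2][0] = 1;
        printf("deadlock detected: %s\n", deadlocked() ? "yes" : "no");
        return 0;
    }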

5.6.2 Several Instances of a Resource Type


• The detection algorithm outlined here is essentially the same as the Banker's
algorithm, with two subtle differences:
o In step 1, the Banker's Algorithm sets Finish[ i ] to false for all i. The
algorithm presented here sets Finish[ i ] to false only if Allocation[ i ] is
not zero. If the currently allocated resources for this process are zero, the
algorithm sets Finish[ i ] to true. This is essentially assuming that IF all of
the other processes can finish, then this process can finish also.
Furthermore, this algorithm is specifically looking for which processes are
involved in a deadlock situation, and a process that does not have any
resources allocated cannot be involved in a deadlock, and so can be
removed from any further consideration.
o Steps 2 and 3 are unchanged
o In step 4, the basic Banker's Algorithm says that if Finish[ i ] == true for
all i, that there is no deadlock. This algorithm is more specific, by stating
that if Finish[ i ] == false for any process Pi, then that process is
specifically involved in the deadlock which has been detected.
• ( Note: An alternative method was presented above, in which Finish held integers
instead of booleans. This vector would be initialized to all zeros, and then filled
with increasing integers as processes are detected which can finish. If any
processes are left at zero when the algorithm completes, then there is a deadlock,
and if not, then the integers in finish describe a safe sequence. To modify this
algorithm to match this section of the text, processes with allocation = zero could
be filled in with N, N - 1, N - 2, etc. in step 1, and any processes left with Finish =
0 in step 4 are the deadlocked processes. )
• Consider, for example, the following state, and determine if it is currently
deadlocked:

• Now suppose that process P2 makes a request for an additional instance of type C,
yielding the state shown below. Is the system now deadlocked?

5.6.3 Detection-Algorithm Usage
• When should the deadlock detection be done? Frequently, or infrequently?
• The answer may depend on how frequently deadlocks are expected to occur, as
well as the possible consequences of not catching them immediately. ( If
deadlocks are not removed immediately when they occur, then more and more
processes can "back up" behind the deadlock, making the eventual task of
unblocking the system more difficult and possibly damaging to more processes. )
• There are two obvious approaches, each with trade-offs:
1. Do deadlock detection after every resource allocation which cannot be
immediately granted. This has the advantage of detecting the deadlock
right away, while the minimum number of processes are involved in the
deadlock. ( One might consider that the process whose request triggered
the deadlock condition is the "cause" of the deadlock, but realistically all
of the processes in the cycle are equally responsible for the resulting
deadlock. ) The down side of this approach is the extensive overhead and
performance hit caused by checking for deadlocks so frequently.
2. Do deadlock detection only when there is some clue that a deadlock may
have occurred, such as when CPU utilization reduces to 40% or some
other magic number. The advantage is that deadlock detection is done
much less frequently, but the down side is that it becomes impossible to
detect the processes involved in the original deadlock, and so deadlock
recovery can be more complicated and damaging to more processes.
3. (As I write this, a third alternative comes to mind: Keep a historical log of
resource allocations, since that last known time of no deadlocks. Do
deadlock checks periodically (once an hour or when CPU usage is low?),
and then use the historical log to trace through and determine when the
deadlock occurred and what processes caused the initial deadlock.
Unfortunately I'm not certain that breaking the original deadlock would
then free up the resulting log jam.)

5.7 Recovery From Deadlock


• There are three basic approaches to recovery from deadlock:
1. Inform the system operator, and allow him/her to take manual intervention.
2. Terminate one or more processes involved in the deadlock
3. Preempt resources.

5.7.1 Process Termination
• Two basic approaches, both of which recover resources allocated to terminated
processes:
o Terminate all processes involved in the deadlock. This definitely solves
the deadlock, but at the expense of terminating more processes than would
be absolutely necessary.
o Terminate processes one by one until the deadlock is broken. This is more
conservative, but requires doing deadlock detection after each step.
• In the latter case there are many factors that can go into deciding which processes
to terminate next:
1. Process priorities.
2. How long the process has been running, and how close it is to finishing.
3. How many and what type of resources is the process holding. ( Are they
easy to preempt and restore? )
4. How many more resources does the process need to complete.
5. How many processes will need to be terminated
6. Whether the process is interactive or batch.
7. Whether or not the process has made non-restorable changes to any
resource.

5.7.2 Resource Preemption


• When preempting resources to relieve deadlock, there are three important issues
to be addressed:
1. Selecting a victim - Deciding which resources to preempt from which
processes involves many of the same decision criteria outlined above.
2. Rollback - Ideally one would like to roll back a preempted process to a
safe state prior to the point at which that resource was originally allocated
to the process. Unfortunately it can be difficult or impossible to determine
what such a safe state is, and so the only safe rollback is to roll back all the
way back to the beginning. ( I.e. abort the process and make it start over. )
3. Starvation - How do you guarantee that a process won't starve because its
resources are constantly being preempted? One option would be to use a
priority system, and increase the priority of a process every time its
resources get preempted. Eventually it should get a high enough priority
that it won't get preempted any more.

PROTECTION
5.8 Goals of Protection
• Obviously to prevent malicious misuse of the system by users or programs.
• To ensure that each shared resource is used only in accordance with system
policies, which may be set either by system designers or by system administrators.
• To ensure that errant programs cause the minimal amount of damage possible.
• Note that protection systems only provide the mechanisms for enforcing policies and
ensuring reliable systems. It is up to administrators and users to implement those
mechanisms effectively.

5.9 Principles of Protection


• The principle of least privilege dictates that programs, users, and systems be given just
enough privileges to perform their tasks.
• This ensures that failures do the least amount of harm, and that the least amount of
harm can be done deliberately.
• For example, if a program needs special privileges to perform a task, it is better to make
it a SGID program with group ownership of "network" or "backup" or some other pseudo
group, rather than SUID with root ownership. This limits the amount of damage that can
occur if something goes wrong.
• Typically each user is given their own account, and has only enough privilege to modify
their own files.
• The root account should not be used for normal day-to-day activities - the System
Administrator should also have an ordinary account, and reserve use of the root account
for only those tasks which need the root privileges.

5.10 Domain of Protection


• A computer can be viewed as a collection of processes and objects ( both HW & SW ).
• The need to know principle states that a process should only have access to those objects
it needs to accomplish its task, and furthermore only in the modes for which it needs
access and only during the time frame when it needs access.
• The modes available for a particular object may depend upon its type.

5.10.1 Domain Structure


• A protection domain specifies the resources that a process may access.
• Each domain defines a set of objects and the types of operations that may be
invoked on each object.
• An access right is the ability to execute an operation on an object.
• A domain is defined as a set of < object, { access right set } > pairs, as shown
below. Note that some domains may be disjoint while others overlap.

Figure 5.10 - System with three protection domains.


• The association between a process and a domain may be static or dynamic.
o If the association is static, then the need-to-know principle requires a way
of changing the contents of the domain dynamically.
o If the association is dynamic, then there needs to be a mechanism for
domain switching.
• Domains may be realized in different fashions - as users, or as processes, or as
procedures. E.g. if each user corresponds to a domain, then that domain defines
the access of that user, and changing domains involves changing user ID.

5.10.2 An Example: UNIX


• UNIX associates domains with users.
• Certain programs operate with the SUID bit set, which effectively changes the
user ID, and therefore the access domain, while the program is running. ( and
similarly for the SGID bit. ) Unfortunately this has some potential for abuse.
• An alternative used on some systems is to place privileged programs in special
directories, so that they attain the identity of the directory owner when they run.
This prevents crackers from placing SUID programs in random directories around
the system.
• Yet another alternative is to not allow the changing of ID at all. Instead, special
privileged daemons are launched at boot time, and user processes send messages
to these daemons when they need special tasks performed.

5.10.3 An Example: MULTICS


• The MULTICS system uses a complex system of rings, each corresponding to a
different protection domain, as shown below:

Figure 5.11 - MULTICS ring structure.


• Rings are numbered from 0 to 7, with outer rings having a subset of the privileges
of the inner rings.
• Each file is a memory segment, and each segment description includes an entry
that indicates the ring number associated with that segment, as well as read, write,
and execute privileges.
• Each process runs in a ring, according to the current-ring-number, a counter
associated with each process.
• A process operating in one ring can only access segments associated with higher
( farther out ) rings, and then only according to the access bits. Processes cannot
access segments associated with lower rings.
• Domain switching is achieved by a process in one ring calling upon a process
operating in a lower ring, which is controlled by several factors stored with each
segment descriptor:
o An access bracket, defined by integers b1 <= b2.
o A limit b3 > b2
o A list of gates, identifying the entry points at which the segments may be
called.
• If a process operating in ring i calls a segment whose bracket is such that b1 <= i
<= b2, then the call succeeds and the process remains in ring i.
• Otherwise a trap to the OS occurs, and is handled as follows:
o If i < b1, then the call is allowed, because we are transferring to a
procedure with fewer privileges. However if any of the parameters being
passed are of segments below b1, then they must be copied to an area
accessible by the called procedure.
o If i > b2, then the call is allowed only if i <= b3 and the call is directed to
one of the entries on the list of gates.
• Overall this approach is more complex and less efficient than other protection
schemes.
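• The call rules above can be condensed into a single decision function. The sketch below only encodes the decision logic described in the text ( access bracket b1 <= b2, limit b3, and a gate check ); is_gate_entry( ) is an assumed helper, not part of any real MULTICS interface.

    /* Decision logic for a MULTICS-style cross-ring call, as described above.
     * 'i' is the caller's current ring; b1 <= b2 is the segment's access
     * bracket, b3 > b2 its limit. */
    #include <stdbool.h>

    bool is_gate_entry(int segment, int entry);     /* assumed, defined elsewhere */

    typedef enum { CALL_OK, CALL_OK_COPY_ARGS, CALL_TRAP } call_result;

    call_result check_ring_call(int i, int b1, int b2, int b3,
                                int segment, int entry) {
        if (b1 <= i && i <= b2)
            return CALL_OK;                 /* call succeeds, stays in ring i      */
        if (i < b1)
            return CALL_OK_COPY_ARGS;       /* calling outward: allowed, but any
                                               arguments from rings below b1 must
                                               be copied somewhere accessible      */
        /* i > b2: calling inward, only within the limit and through a gate */
        if (i <= b3 && is_gate_entry(segment, entry))
            return CALL_OK;
        return CALL_TRAP;                   /* otherwise the call traps to the OS  */
    }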
5.11 Access Matrix

• The model of protection that we have been discussing can be viewed as an access matrix,
in which columns represent different system resources and rows represent different
protection domains. Entries within the matrix indicate what access that domain has to that
resource.

Figure 5.12 - Access matrix.

• Domain switching can be easily supported under this model, simply by providing
"switch" access to other domains:

Figure 5.13 - Access matrix of Figure 5.12 with domains as objects.

• The ability to copy rights is denoted by an asterisk, indicating that processes in that
domain have the right to copy that access within the same column, i.e. for the same
object. There are two important variations:
o If the asterisk is removed from the original access right, then the right is
transferred, rather than being copied. This may be termed a transfer right as
opposed to a copy right.
o If only the right and not the asterisk is copied, then the access right is added to the
new domain, but it may not be propagated further. That is the new domain does
not also receive the right to copy the access. This may be termed a limited copy
right, as shown in Figure 5.14 below:

Figure 5.14 - Access matrix with copy rights.

• The owner right adds the privilege of adding new rights or removing existing ones:

Figure 5.15 - Access matrix with owner rights.

• Copy and owner rights only allow the modification of rights within a column. The
addition of control rights, which only apply to domain objects, allow a process operating
in one domain to affect the rights available in other domains. For example in the table
below, a process operating in domain D2 has the right to control any of the rights in
domain D4.
Figure 5.16 - Modified access matrix of Figure 5.13

5.12 Implementation of Access Matrix


5.12.1 Global Table
• The simplest approach is one big global table with < domain, object, rights >
entries.
• Unfortunately this table is very large ( even if sparse ) and so cannot be kept in
memory ( without invoking virtual memory techniques. )
• There is also no good way to specify groupings - If everyone has access to some
resource, then it still needs a separate entry for every domain.

5.12.2 Access Lists for Objects


• Each column of the table can be kept as a list of the access rights for that
particular object, discarding blank entries.
• For efficiency a separate list of default access rights can also be kept, and checked
first.

5.12.3 Capability Lists for Domains


• In a similar fashion, each row of the table can be kept as a list of the capabilities
of that domain.
• Capability lists are associated with each domain, but not directly accessible by the
domain or any user process.
• Capability lists are themselves protected resources, distinguished from other data
in one of two ways:
o A tag, possibly hardware implemented, distinguishing this special type of
data. ( other types may be floats, pointers, booleans, etc. )
o The address space for a program may be split into multiple segments, at
least one of which is inaccessible by the program itself, and used by the
operating system for maintaining the process's access right capability list.

5.12.4 A Lock-Key Mechanism


• Each resource has a list of unique bit patterns, termed locks.
• Each domain has its own list of unique bit patterns, termed keys.
• Access is granted if one of the domain's keys fits one of the resource's locks.
• Again, a process is not allowed to modify its own keys.
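• The lock-key check itself is just a comparison of bit patterns. The sketch below shows one way to express it; the structure layout and key width are invented for illustration.

    /* Lock-key access check: access is granted if any key held by the domain
     * matches any lock attached to the resource. */
    #include <stdbool.h>
    #include <stdint.h>
    #include <stddef.h>

    struct domain   { const uint64_t *keys;  size_t nkeys;  };
    struct resource { const uint64_t *locks; size_t nlocks; };

    bool access_allowed(const struct domain *d, const struct resource *r) {
        for (size_t i = 0; i < d->nkeys; i++)
            for (size_t j = 0; j < r->nlocks; j++)
                if (d->keys[i] == r->locks[j])
                    return true;            /* one of the keys fits one of the locks */
        return false;
    }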

5.12.5 Comparison
• Each of the methods here has certain advantages or disadvantages, depending on
the particular situation and task at hand.
• Many systems employ some combination of the listed methods.

5.13 Access Control


• Role-Based Access Control, RBAC, assigns privileges to users, programs, or roles as
appropriate, where "privileges" refer to the right to call certain system calls, or to use
certain parameters with those calls.
• RBAC supports the principle of least privilege, and reduces the susceptibility to abuse as
opposed to SUID or SGID programs.

Figure 5.17 - Role-based access control in Solaris 10.

5.14 Revocation of Access Rights

• The need to revoke access rights dynamically raises several questions:


o Immediate versus delayed - If delayed, can we determine when the revocation
will take place?
o Selective versus general - Does revocation of an access right to an object affect all
users who have that right, or only some users?
o Partial versus total - Can a subset of rights for an object be revoked, or are all
rights revoked at once?
o Temporary versus permanent - If rights are revoked, is there a mechanism for
processes to re-acquire some or all of the revoked rights?
• With an access list scheme revocation is easy, immediate, and can be selective, general,
partial, total, temporary, or permanent, as desired.
• With capabilities lists the problem is more complicated, because access rights are
distributed throughout the system. A few schemes that have been developed include:
o Reacquisition - Capabilities are periodically revoked from each domain, which
must then re-acquire them.
o Back-pointers - A list of pointers is maintained from each object to each
capability which is held for that object.
o Indirection - Capabilities point to an entry in a global table rather than to the
object. Access rights can be revoked by changing or invalidating the table entry,
which may affect multiple processes, which must then re-acquire access rights to
continue.
o Keys - A unique bit pattern is associated with each capability when created, which
can be neither inspected nor modified by the process.
▪ A master key is associated with each object.
▪ When a capability is created, its key is set to the object's master key.
▪ As long as the capability's key matches the object's key, then the
capabilities remain valid.
▪ The object master key can be changed with the set-key command, thereby
invalidating all current capabilities.
▪ More flexibility can be added to this scheme by implementing a list of
keys for each object, possibly in a global table.
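
• The key scheme can be sketched as follows, under the assumption that keys are random
64-bit values managed entirely by the kernel; every name here is hypothetical. Each
capability is stamped with the object's master key when it is created, and a set-key
operation revokes every outstanding capability at once.

import java.security.SecureRandom;

// A capability stamped with the key that was current when it was issued.
class KeyedCapability {
    final Object target;
    final long key;                              // the process can neither inspect nor modify this
    KeyedCapability(Object target, long key) { this.target = target; this.key = key; }
}

class RevocableObject {
    private static final SecureRandom RNG = new SecureRandom();
    private final Object resource = new Object();   // the thing being protected
    private long masterKey = RNG.nextLong();

    // Creating a capability copies the current master key into it.
    KeyedCapability createCapability() { return new KeyedCapability(resource, masterKey); }

    // A capability stays valid only while its key still matches the master key.
    boolean isValid(KeyedCapability c) { return c.target == resource && c.key == masterKey; }

    // set-key: choosing a new master key invalidates all capabilities issued so far.
    void setKey() { masterKey = RNG.nextLong(); }
}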

5.18 Capability-Based Systems


5.18.1 An Example: Hydra
• Hydra is a capability-based system that includes both system-defined rights and
user-defined rights. The interpretation of user-defined rights is up to the specific
user programs, but the OS provides support for protecting access to those rights,
whatever they may be.
• Operations on objects are defined procedurally, and those procedures are
themselves protected objects, accessed indirectly through capabilities.
• The names of user-defined procedures must be identified to the protection system
if it is to deal with user-defined rights.
• When an object is created, the names of operations defined on that object become
auxiliary rights, described in a capability for an instance of the type. For a
process to act on an object, the capabilities it holds for that object must contain
the name of the operation being invoked. This allows access to be controlled on
an instance-by-instance and process-by-process basis.
• Hydra also allows rights amplification, in which a procedure certified as
trustworthy is allowed to act on the objects passed to it as parameters with
rights beyond those held by the calling process.
• Programmers can make direct use of the Hydra protection system, using suitable
libraries which are documented in appropriate reference manuals.

5.18.2 An Example: Cambridge CAP System


• The CAP system has two kinds of capabilities:
o Data capability, used to provide read, write, and execute access to objects.
These capabilities are interpreted by microcode in the CAP machine.
o Software capability, which is protected but not interpreted by the CAP
microcode.
▪ Software capabilities are interpreted by protected ( privileged )
procedures, possibly written by application programmers.
▪ When a process executes a protected procedure, it temporarily
gains the ability to read or write the contents of a software
capability.
▪ This leaves the interpretation of the software capabilities up to the
individual subsystems, and limits the potential damage that could
be caused by a faulty privileged procedure.
▪ Note, however, that protected procedures only get access to
software capabilities for the subsystem of which they are a part.
Checks are made when passing software capabilities to protected
procedures that they are of the correct type.
▪ Unfortunately the CAP system does not provide libraries, making
it harder for an individual programmer to use than the Hydra
system.

5.19 Language-Based Protection


o As systems have developed, protection systems have become more powerful, and
also more specific and specialized.
o To refine protection even further requires putting protection capabilities into the
hands of individual programmers, so that protection policies can be implemented
on the application level, i.e. to protect resources in ways that are known to the
specific applications but not to the more general operating system.

5.19.1 Compiler-Based Enforcement


• In a compiler-based approach to protection enforcement, programmers directly
specify the protection needed for different resources at the time the resources are
declared.
• This approach has several advantages:
1. Protection needs are simply declared, as opposed to a complex series of
procedure calls.
2. Protection requirements can be stated independently of the support
provided by a particular OS.
3. The means of enforcement need not be provided directly by the developer.
4. Declarative notation is natural, because access privileges are closely
related to the concept of data types.
• Regardless of the means of implementation, compiler-based protection relies upon
the protection mechanisms provided by the underlying OS, such as the
Cambridge CAP or Hydra systems.
• Even if the underlying OS does not provide advanced protection mechanisms, the
compiler can still offer some protection, such as treating memory accesses
differently in code versus data segments. ( E.g. code segments can't be modified,
data segments can't be executed. )
• There are several areas in which compiler-based protection can be compared to
kernel-enforced protection:
o Security. Security provided by the kernel offers better protection than that
provided by a compiler. The security of the compiler-based enforcement is
dependent upon the integrity of the compiler itself, as well as requiring
that files not be modified after they are compiled. The kernel is in a better
position to protect itself from modification, as well as protecting access to
specific files. Where hardware support of individual memory accesses is
available, the protection is stronger still.
o Flexibility. A kernel-based protection system is not flexible enough to provide
the specific protection needed by an individual programmer, though it may
provide support which the programmer can make use of. Compilers are
more easily changed and updated when it becomes necessary to change the
protection services offered or their implementation.
o Efficiency. The most efficient protection mechanism is one supported by
hardware and microcode. Insofar as software based protection is
concerned, compiler-based systems have the advantage that many checks
can be made off-line, at compile time, rather than during execution.
The concept of incorporating protection mechanisms into programming
languages is in its infancy, and still remains to be fully developed. However the
general goal is to provide mechanisms for three functions:
1. Distributing capabilities safely and efficiently among customer processes.
In particular a user process should only be able to access resources for
which it was issued capabilities.
2. Specifying the type of operations a process may execute on a resource,
such as reading or writing.
3. Specifying the order in which operations are performed on the resource,
such as opening before reading.
5.19.2 Protection in Java
• Java was designed from the very beginning to operate in a distributed
environment, where code would be executed from a variety of trusted and
untrusted sources. As a result, the Java Virtual Machine ( JVM ) incorporates many
protection mechanisms.
• When a Java program runs, it loads classes dynamically, in response to requests
to instantiate objects of particular types. These classes may come from a variety
of different sources, some trusted and some not, which requires that the protection
mechanism be implemented at the resolution of individual classes, something not
supported by the basic operating system.
• As each class is loaded, it is placed into a separate protection domain. The
capabilities of each domain depend upon whether the source URL is trusted or
not, the presence or absence of any digital signatures on the class ( Chapter 15 ),
and a configurable policy file indicating which servers a particular user trusts, etc.
• When a request is made to access a restricted resource in Java, ( e.g. open a local
file ), some method on the current call stack must specifically assert a privilege to
perform the operation. In essence this method assumes responsibility for the
restricted access. Naturally the method must be part of a class which resides in a
protection domain that includes the capability for the requested operation. This
approach is termed stack inspection, and works like this:
o When a caller may not be trusted, a method executes an access request
within a doPrivileged( ) block, which is noted on the calling stack.
o When access to a protected resource is requested, checkPermissions( )
inspects the call stack to see if a method has asserted the privilege to
access the protected resource.
▪ If a suitable doPrivileged block is encountered on the stack before
a domain in which the privilege is disallowed, then the request is
granted.
▪ If a domain in which the request is disallowed is encountered first,
then the access is denied and an AccessControlException is thrown.
▪ If neither is encountered, then the response is implementation
dependent.
• In the example below the untrusted applet's call to get( ) succeeds, because the
trusted URL loader asserts the privilege of opening the specific URL lucent.com.
However when the applet tries to make a direct call to open( ) it fails, because it
does not have privilege to access any sockets.

Figure 5.18 - Stack inspection.
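
The pattern can be sketched with the legacy java.security.AccessController API ( doPrivileged
and checkPermission are real methods, though this API is deprecated in recent JDK releases );
the class, method, and resource names below are made up for illustration and are not the
figure's actual code.

import java.io.FilePermission;
import java.security.AccessController;
import java.security.PrivilegedAction;

class TrustedLoader {
    // Trusted code asserts its own privilege before touching the protected resource.
    byte[] get(String name) {
        return AccessController.doPrivileged((PrivilegedAction<byte[]>) () -> {
            // checkPermission() walks the call stack and stops at this doPrivileged frame,
            // so the ( possibly untrusted ) caller's permissions are not consulted.
            // If this frame's protection domain lacks the permission, an
            // AccessControlException is thrown instead.
            AccessController.checkPermission(new FilePermission(name, "read"));
            return new byte[0];                  // placeholder for the actual read
        });
    }
}

An untrusted caller invoking get( ) succeeds because the privilege is asserted by trusted code;
the same caller performing the checked operation directly, outside any doPrivileged block,
would be denied.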

15. Additional Topics in Operating Systems


Android
Android, Inc. was founded in Palo Alto, California in October 2003 by Andy Rubin (co-founder
of Danger), Rich Miner (co-founder of Wildfire Communications, Inc.), Nick Sears (once VP
at T-Mobile), and Chris White (headed design and interface development at WebTV) to develop,
in Rubin's words, "smarter mobile devices that are more aware of its owner's location and
preferences". The company's early intention was to develop an advanced operating system
for digital cameras; when it was realized that the market for such devices was not large enough,
it diverted its efforts to producing a smartphone operating system to rival those
of Symbian and Windows Mobile. Despite the past accomplishments of the founders and early
employees, Android Inc. operated secretly, revealing only that it was working on software for
mobile phones.
Google acquired Android Inc. on August 17, 2005; key employees of Android Inc., including
Rubin, Miner, and White, stayed at the company after the acquisition. Not much was known
about Android Inc. at the time, but many assumed that Google was planning to enter the mobile
phone market with this move. At Google, the team led by Rubin developed a mobile device
platform powered by the Linux kernel. Google marketed the platform to handset makers
and carriers on the promise of providing a flexible, upgradable system. Google had lined up a
series of hardware component and software partners and signaled to carriers that it was open to
various degrees of cooperation on their part.
LINUX
Linux was originally developed as a free operating system for Intel x86-based personal
computers. It has since been ported to more computer hardware platforms than any other
operating system. It is a leading operating system on servers and other big iron systems such
as mainframe computers and supercomputers. As of June 2013, more than 95% of the
world's 500 fastest supercomputers run some variant of Linux, including all of the 44 fastest. Linux
also runs on embedded systems, which are devices whose operating system is typically built into
the firmware and is highly tailored to the system; this includes mobile phones, tablet computers,
network routers, facility automation controls, televisions and video game consoles. Android,
which is a widely used operating system for mobile devices, is built on top of the Linux kernel.

16. University Question papers of previous year

17. QUESTION BANK
UNIT-I
Short Answer Questions:
1. What is a system call? List the different types of system calls used in developing programs.
2. What are the advantages and disadvantages of a real-time operating system?
3. Explain Operating system structure.
4. Define user mode and kernel mode. Why two modes are required?
5. What is Multiprocessor System?
6. What are the advantages of multiprocessors?
7. What are the modes of operation in Hardware Protection?
8. What is meant by Batch Systems?
9. List the various services provided by operating systems.
10. What is meant by Distributed systems?

Long Answer Questions:


1. Explain the functions of operating systems.
2. Differentiate Batch Operating System and Time Sharing Operating System.
3. Explain the operating system services in detail.
4. Explain the various types of System calls in detail.
5. Explain the various types of System programs in detail.
6. Differentiate multiprogramming and multiprocessing OS.
7. Differentiate Parallel systems and Distributed systems.
8. Differentiate hard real time systems and soft real time systems with suitable examples.
9. Explain the simple and layered approach of operating system design in detail.
10. Explain how modular approach of designing OS is different from microkernel approach.

UNIT-II

Short answer questions with answers


1) What is a process in an operating system? -----L1 Remembering
Answer: A process is an instance of a program in execution. It consists of the program code, data, and resources
allocated by the operating system.

2) Define the process state in terms of operating systems.------L2 Understanding


Answer: The process state represents the current condition or status of a process. Common states include ready,
running, waiting, and terminated.

3) What is a Process Control Block (PCB) and its significance? --------L2 Understanding
Answer: A PCB is a data structure maintained by the operating system to store information about each process,
including process state, program counter, CPU registers, and other relevant information.

4) What are threads in the context of operating systems?-----L1 Remembering


Answer: Threads are lightweight processes within a single process. They share the same resources but have
their own execution contexts, such as program counters and stack space.

5) Explain process scheduling queues…… L2 Understanding


Answer: Scheduling queues are data structures used by the operating system to organize processes based on
their scheduling attributes, such as priority or arrival time. Common queues include ready queue, waiting
queue, and various priority queues.

6)What is a scheduler in operating systems?…… L2 Understanding


Answer: A scheduler is a component of the operating system responsible for selecting processes from the ready
queue and allocating CPU time to them based on scheduling algorithms.

7)Define context switch and its role in multitasking environments.…… L2 Understanding


Answer: A context switch is the process of saving the state of a running process, loading the state of another
process, and resuming execution. It enables multitasking by allowing the CPU to switch between different
processes efficiently.

8) What is pre-emptive scheduling?…… L2 Understanding


Answer: Pre-emptive scheduling is a scheduling policy where the operating system can interrupt a running
process and allocate CPU time to another process with higher priority. This helps in ensuring fairness and
responsiveness in a multitasking environment.

9) Explain the role of a dispatcher in process scheduling.…… L2 Understanding


Answer: The dispatcher is responsible for transferring control of the CPU from the currently running process to
a selected process from the ready queue. It involves saving the state of the current process and loading the state
of the new process.

10) What criteria are considered in scheduling algorithms?…… L2 Understanding


Answer: Scheduling algorithms consider various criteria, including CPU utilization, throughput, turnaround
time, waiting time, and response time, to make scheduling decisions.

11) Describe multiple-processor scheduling and its challenges.…… L2 Understanding

Answer: Multiple-processor scheduling involves distributing processes among multiple CPUs or cores.
Challenges include load balancing, minimizing contention for shared resources, and coordinating
communication between processors.

12) What is real-time scheduling, and how does it differ from other scheduling strategies?…… L2
Understanding
Answer: Real-time scheduling prioritizes tasks based on their deadlines or time constraints. Unlike traditional
scheduling, real-time scheduling aims to meet deadlines consistently, even at the expense of overall system
throughput.

13) Explain thread scheduling in operating systems.…… L2 Understanding


Answer: Thread scheduling involves allocating CPU time to individual threads within a process. Thread
scheduling decisions are based on factors such as thread priority, synchronization requirements, and CPU
affinity.

14) What is process synchronization, and why is it necessary?…… L2 Understanding


Answer: Process synchronization is the coordination of multiple processes or threads to ensure correct and
orderly execution. It is necessary to prevent issues like race conditions, deadlocks, and resource contention.

15) Describe the critical section problem and its significance in concurrent programming.…… L2
Understanding
Answer: The critical section problem involves ensuring that only one process or thread executes a critical
section of code at a time to avoid data inconsistency or race conditions. It is a fundamental challenge in
concurrent programming.

16) How does Peterson's solution address the critical section problem?…… L2 Understanding
Answer: Peterson's solution is a software-based algorithm for achieving mutual exclusion in a shared memory
environment. It uses shared variables and flags to coordinate access to critical sections between two processes.
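
For illustration, Peterson's protocol for two threads can be sketched in Java as below ( the
volatile qualifiers are assumed to give the ordering the algorithm relies on; the class is an
illustration, not library code ):

class Peterson {
    private volatile boolean flag0 = false, flag1 = false;   // "I want to enter"
    private volatile int turn = 0;                            // whose turn it is to yield

    void lock(int self) {                 // self is 0 or 1
        int other = 1 - self;
        setFlag(self, true);              // announce intent to enter
        turn = other;                     // politely let the other thread go first
        while (getFlag(other) && turn == other) { /* busy-wait */ }
    }

    void unlock(int self) { setFlag(self, false); }

    private void setFlag(int i, boolean v) { if (i == 0) flag0 = v; else flag1 = v; }
    private boolean getFlag(int i) { return i == 0 ? flag0 : flag1; }
}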

17) What is synchronization hardware, and how does it aid in process synchronization?…… L2 Understanding
Answer: Synchronization hardware refers to specialized instructions or hardware mechanisms provided by
CPUs to support synchronization primitives like atomic operations and memory barriers. It helps in
implementing efficient synchronization algorithms.

18) Explain the concept of semaphores in operating systems. …… L2 Understanding


Answer: Semaphores are a synchronization mechanism used to control access to shared resources by multiple
processes or threads. They consist of a counter and associated atomic operations for acquiring and releasing
resources.
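
For illustration, the counting-semaphore idea maps directly onto java.util.concurrent.Semaphore
( acquire corresponds to wait/P, release to signal/V ); the printer-pool scenario is invented
for the example:

import java.util.concurrent.Semaphore;

class PrinterPool {
    // Counting semaphore initialised to the number of identical printers available.
    private final Semaphore slots = new Semaphore(2);

    void print(String job) throws InterruptedException {
        slots.acquire();                  // wait()/P(): block until a printer is free
        try {
            System.out.println("printing " + job);
        } finally {
            slots.release();              // signal()/V(): give the printer back
        }
    }
}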

19) What are classic problems of synchronization, and give examples.…… L2 Understanding
Answer: Classic synchronization problems include the producer-consumer problem, the dining philosophers
problem, and the readers-writers problem. These problems illustrate challenges in coordinating access to shared
resources.

20) How do monitors facilitate synchronization in concurrent programming?…… L2 Understanding


Answer: Monitors are high-level synchronization constructs that encapsulate shared data and synchronization
primitives like condition variables and mutexes. They provide a structured approach to managing concurrency
and simplifying synchronization logic.
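
For illustration, every Java object offers a monitor-like facility: synchronized methods give
mutual exclusion and wait/notifyAll act as a condition variable. The single-slot buffer below is
an invented example of the construct:

class OneSlotBuffer {
    private Integer slot = null;          // shared data encapsulated by the monitor

    synchronized void put(int value) throws InterruptedException {
        while (slot != null) wait();      // wait until the slot is empty
        slot = value;
        notifyAll();                      // wake any consumer waiting for data
    }

    synchronized int take() throws InterruptedException {
        while (slot == null) wait();      // wait until the slot is full
        int value = slot;
        slot = null;
        notifyAll();                      // wake any producer waiting for space
        return value;
    }
}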

Long answer questions

1) Analyzing (Level 4):


Explain the process scheduling algorithms used in Linux and Windows operating systems. Compare and
contrast their approaches, considering factors such as efficiency, fairness, and responsiveness.

2) Applying (Level 3):


Design a scheduling algorithm suitable for a real-time operating system used in critical medical equipment.
Justify your choice of algorithm and discuss how it addresses the specific requirements of real-time systems.

3) Evaluating (Level 5):


Critically assess the impact of pre-emptive scheduling on overall system performance and user experience.
Discuss its advantages and disadvantages compared to non-pre-emptive scheduling strategies.

4) Creating (Level 6):


Develop a simulation model to demonstrate the functioning of a semaphore-based solution for the producer-
consumer problem. Include multiple producers and consumers, and analyze the behavior of the system under
different scenarios.

5) Analyzing (Level 4):


Investigate the challenges and benefits of implementing multiple-processor scheduling in a distributed
computing environment. Analyze how such scheduling strategies affect resource utilization and communication
overhead.

6) Evaluating (Level 5):


Evaluate the effectiveness of using monitors as a synchronization mechanism in modern operating systems.
Assess their advantages over traditional synchronization primitives like semaphores, considering factors such as
simplicity, efficiency, and ease of use.

7) Creating (Level 6):


Design and implement a real-time scheduling algorithm suitable for an autonomous vehicle control system.
Consider the specific requirements of such systems, such as responsiveness to external stimuli and
predictability of task execution.

8) Applying (Level 3):


Apply the principles of process synchronization to solve the readers-writers problem. Design a solution using
semaphores or monitors and explain how it ensures concurrent access to a shared resource while maintaining
data consistency.

9) Evaluating (Level 5):


Assess the effectiveness of Peterson's solution for achieving mutual exclusion in a multi-threaded environment.
Discuss its limitations and potential pitfalls, and compare it with other synchronization algorithms in terms of
performance and scalability.

10) Creating (Level 6):


Propose a novel synchronization hardware mechanism that enhances the performance of modern CPUs in
handling synchronization primitives. Describe its architecture and functionality, and discuss how it addresses
the limitations of existing hardware support for synchronization.
Long answer numerical problems
1) Applying (Level 3):
You are designing a round-robin scheduling algorithm for a CPU with three processes: P1, P2, and P3. Each
process requires the CPU for a fixed time quantum of 4 milliseconds. Calculate the turnaround time and waiting
time for each process if the arrival times are as follows: P1 at 0 ms, P2 at 3 ms, and P3 at 6 ms.

2) Analyzing (Level 4):


Analyze the efficiency of the shortest job next (SJN) scheduling algorithm by considering the following
processes and their respective burst times: P1 (6 ms), P2 (3 ms), P3 (7 ms), and P4 (4 ms). Compare the average
waiting time under SJN with that under round-robin scheduling with a time quantum of 5 ms.

3) Applying (Level 3):


Suppose you have a CPU with two cores and four processes: P1, P2, P3, and P4. Each process requires access
to both cores for a certain duration. Develop a scheduling plan that maximizes CPU utilization while
minimizing contention for shared resources.

4) Analyzing (Level 4):


Analyze the impact of increasing the time quantum in a round-robin scheduling algorithm on system throughput
and average waiting time. Provide numerical examples and graphical representations to support your analysis.

5) Applying (Level 3):


Design a scheduling algorithm for a real-time operating system that ensures a maximum response time of 10
milliseconds for critical tasks. Given a set of tasks with specified execution times and deadlines, determine
whether the system can meet the timing constraints.

6) Evaluating (Level 5):


Evaluate the effectiveness of using priority-based scheduling in a multi-tasking environment. Consider
scenarios where processes with high priorities continuously arrive and assess the impact on system fairness,
starvation, and overall performance.

7) Analyzing (Level 4):


Analyze the performance of a semaphore-based solution for the producer-consumer problem in terms of
throughput and resource utilization. Consider varying numbers of producers and consumers and different buffer
sizes to evaluate the scalability of the solution.

8) Evaluating (Level 5):


Evaluate the advantages and disadvantages of using synchronization hardware mechanisms such as compare-
and-swap instructions and memory fences. Assess their impact on system performance, complexity, and
scalability in multi-core architectures.

9) Creating (Level 6):


Design a multi-level feedback queue scheduling algorithm that dynamically adjusts priorities based on process
behavior. Develop algorithms for priority aging, quantum expiration, and queue migration, and justify their
effectiveness in improving system responsiveness and fairness.

10) Creating (Level 6):


Develop a real-time scheduling algorithm for a distributed computing environment with heterogeneous
processors. Consider factors such as task dependencies, communication overhead, and resource constraints, and
design a scheduling strategy that optimizes overall system performance.
UNIT-III
SMALL ANSWER QUESTIONS:
1) Write any two functions of Memory Management
2) What are the parameters required by memory management
3) Explain compile time and load time
4) Define fragmentation
5) What is paging in MM
6) How segmentation can be done in MM
7) Explain the concept of swapping, and why it is required
8) Write a note on virtual memory
9) Give the advantages and disadvantages of demand paging
10) What is Belady's anomaly?
11) Write a note on thrashing
12) Describe the working set model
13) List the advantages and disadvantages of contiguous memory allocation
Long Answer Questions:
1) A) compare Logical and physical address space BTL 4
B) Compare and contrast first fit, best fit ,worst fit BTL 4
2) Explain in detail paging with example BTL 5
3) A) compare paging and segmentation BTL 4
B) List the advantages and disadvantages of segmentation BTL 4
4) Explain the following: BTL 5
A) TLB
B) Hash based table
C) Inverted page table
5) Consider the following page reference string: BTL 5
0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3,4,5,6,7
Evaluate how many page faults would occur for the following page replacement
Algorithms. Assume three or four frames and initially all the frames are empty, so
Those first unique pages will also cost one page fault.
i) LRU
ii) FIFO
iii) Optimal page replacement

6) Explain Demand paging and virtual memory in memory management and write its
Importance BTL 5
7) Explain the hardware and software support in demand paging BTL 5
8) List the criteria steps to handle a page fault in MM BTL 4
9) Evaluate the working of page replacement Algorithm BTL 5
10) Consider the following page reference string: BTL 5
1,2,3,4,2,1,5,6,2,1,2,3,7,6,3,2,1,2,3,6
Evaluate how many page faults would occur for the following page replacement
Algorithms. Assume five frames and initially all the frames are empty, so
Those first unique pages will also cost one page fault.
i) Counter LRU ii) LFU
UNIT-IV
Short:
1. Define a File and list the various operations over files. BTL1
2. What are the different file access methods? BTL1
3. List various file allocation methods. BTL1
4. Explain the need for file system mounting? BTL2
5. Define a Directory and list its operations. BTL1
6. Outline various disk scheduling algorithms. BTL2
7. Classify file attributes and types. BTL2
8. Interpret naming and grouping problems in directory structure. BTL2
9. What is seek time and rotational latency? BTL1
10. What is swap-space management? BTL1
11. How file sharing is done in Linux? BTL1
12. How file protection can be achieved? BTL1
Long:
1. Compare sequential and indexed file access methods with a suitable example. (BTL4)
2. Analyze file sharing and file protection mechanisms. (BTL4)
3. Explain the file allocation methods with a neat diagram. (BTL5)
4. Explain the four approaches of free-space management.(BTL5)
5. Compare and contrast hierarchical directory with DAG directory organization methods. (BTL4)
6. Discuss the factors that show effect on efficiency and performance of file system. (BTL6)
7. Discuss the merits and demerits of all 4 directory organization methods. (BTL6)
8. Assess the merits and demerits of all 3 file allocation methods. (BTL5)
9. Estimate the total head movement in cylinders using FCFS, SSTF, SCAN and C-SCAN by considering
the following disk queue with requests for I/O to blocks on cylinders 98,183,37,122,14,124,65,67 in that
order, with the disk head initially at cylinder 53. Also provide the necessary diagram to show the head
movement for the above queue. (BTL5)
10. Estimate the total head movement in cylinders using FCFS, SSTF, LOOK and C-LOOK by considering
the following disk queue with requests for I/O to blocks on cylinders 98,183,37,122,14,124,65,67 in that
order, with the disk head initially at cylinder 53. Also provide the necessary diagram to show the head
movement for the above queue. (BTL5)
11. Explain the directory implementation through i) Linear List ii) Hash Table. (BTL5)
12. Explain Disk attachment and Disk space management in detail. (BTL5)

UNIT-V
Short:
1. Explain deadlock and illustrate a real-time example of deadlock situation. BTL2
2. What are the necessary conditions for the occurrence of a deadlock? BTL1
3. Infer the methods to deal with deadlocks. BTL2
4. Define Protection and give the principles of protection. BTL1
5. List the Goals of Protection. BTL1
6. Outline the merits and demerits of using Access matrix for system protection. BTL2
7. Interpret the various notations used in Resource Allocation Graph. BTL2
8. Interpret Wait-For-Graph form a Resource Allocation Graph with an example. BTL2
9. What is the difference between Resource Allocation Graph and Wait For Graph?BTL1
10. What is the domain of protection and what does each domain consist of? BTL1
Long:
1. Explain the goals and principles of system protection in detail. BTL5
2. Discuss about Hydra system and Cambridge CAP capability based systems. BTL6
3. Explain Access matrix along with some of its implementation methods. BTL5
4. Explain the language based protection with some examples. BTL5
5. Interpret various implementation methods used for revocation of access rights. BTL5
6. Interpret Banker’s algorithm with proper example. BTL5
7. Inspect the deadlock recovery methods with suitable examples. BTL4
8. Explain the steps involved in safety algorithm. BTL5
9. Explain the Resource Allocation Graph in dealing deadlocks with examples BTL5
10. Consider the following snapshot of a system with resource allocation BTL5
Allocation Max Available
A B C D A B C D A B C D
P0 0 0 1 2 0 0 1 2 1 5 2 0
P1 1 0 0 0 1 7 5 0
P2 1 3 5 4 2 3 5 6
P3 0 6 3 2 0 6 5 2
P4 0 0 1 4 0 6 5 6
Answer the following questions using Banker’s algorithm
i. What is the content of matrix Need?
ii. Is the system in a safe state?
iii. If a request from process P1 arrives for (0, 4, 2, 0), can the request be granted immediately?

18. Assignment Questions
Assignment-I OS Basics
1. Explain the following operating systems in detail.
i) Simple Batch Systems ii) Multiprogramming Systems
iii) Time shared systems iv) Parallel systems v) Real Time systems

2. a) Explain OS Structure.
b) Explain the dual mode of operation.
c) Write short notes on Special-purpose systems.

3. a) Explain the services provided by OS.


b) Explain the following User Operating System Interfaces i) Command Interpreter ii) GUI.

4. a) Discuss the various approaches of designing an Operating System.


b) Explain the following OS structures i) Layered ii) Microkernel iii) Modular.

5. Explain the following i) System Programs ii) Virtual Machines.

Assignment-II Process, CPU Scheduling & Process Synchronization


1. a) Define Process. Explain process states and process state transition diagram.
b) Explain process scheduling.
2. Consider the following set of processes, with the length of the CPU burst given in
Milliseconds

The processes are assumed to have arrived in the order P1, P2, P3, P4, P5, all at time 0.
a. Draw four Gantt charts that illustrate the execution of these processes using the
following scheduling algorithms: FCFS, SJF, non preemptive priority and RR
(quantum=1).
b. What is the turnaround time and waiting time of each process for each of these
scheduling algorithms?
c. Which of the algorithms results in the minimum average waiting time?

3. Consider the following set of processes, with the length of the CPU burst given in
milliseconds
Process Arrival Time Burst Time Priority
P1 1 7 3
P2 2 8 2
P3 3 6 5
P4 4 9 4
P5 5 4 1
Give the Gantt chart illustrating the execution of these processes using Shortest
Remaining Time First (SRTF) and preemptive priority scheduling. Find the average
waiting time and average turnaround time for each of these algorithms.

4. a. What is race condition?

b. Explain in detail the Critical-Section Problem.

c. Explain the Peterson’s solution for process synchronization.

5. Explain in detail how semaphores and monitors are used to solve the following problems
i) Producer-Consumer problem ii) Readers-Writers problem iii) Dining-Philosophers
problem

Assignment-III Memory Management & Virtual Memory


1. a) Explain swapping in memory management.
b) Explain the following i) First Fit ii) Best Fit iii) Worst Fit

2. a) Explain the difference between internal and external fragmentation.


b) Explain the paging memory management in detail with a neat diagram.
c) Explain the segmentation with a neat diagram. Give an example.

3. Explain the following:


a) Translation Look-aside Buffer b) Hierarchical Paging
c) Hashed Page Table d) Inverted Page Table

4. Consider the following page reference string 9,2,3,9,2,1,5,6,2,1,2,9,7,6,3,2,1,2,3,6


How many page faults would occur for the following page replacement algorithms,
assuming three, four or five frames? Initially all frames are empty, so the first unique pages
will also cost one page fault.
LRU, FIFO, Optimal and Second chance Replacement algorithms.
5. a) Explain various techniques of allocating frames to processes.
b) Define Thrashing. Discuss the cause of thrashing. How Thrashing can be avoided?

Assignment-IV File System


1. a. What is a file? Discuss its attributes. Explain the various types of operations that can be
performed on
files.
b. Explain the access methods: a) Sequential b) Direct c) Indexed
c. Explain the following directory structures
i) Single level ii) Two-level iii) Tree structured iv) Acyclic graph v) General graph.

2. a. Explain the directory implementation through i) Linear List ii) Hash Table.
b. Compare various file allocation techniques.
c. Explain the following Free Space Management Techniques:
i) Bit Map ii) Linked List iii) Grouping

3. a. Write about file system mounting.


b. Explain in detail file sharing.
c. Write about File System Implementation.

4. Explain in detail the following:


i) Magnetic Disk ii) Disk Structure
iii) Disk Attachment iv) Disk Space Management

5. a. Define the following: i) Seek time ii) Rotational Latency

iii) Constant Linear Velocity iv) Constant Angular Velocity v) Bandwidth

b. Suppose that a disk drive has 4,000 cylinders, numbered 0 to 3999. The drive is
currently serving a request at cylinder 143, and the previous request was at cylinder 120.
The queue of pending requests, in FIFO order is:
87, 2465,918, 1784, 998, 509, 122, 750, 130
Starting from the current head position, what is the total distance (in cylinders) that
the disk arm moves to satisfy all the pending requests for each of the following
Disk Scheduling algorithms?

a) FCFS b) SSTF c) SCAN d) C-SCAN e) LOOK f) C-LOOK

Assignment-V
1. a) Define Deadlock. Give an example.
b) Explain the necessary and sufficient conditions for deadlock.
c) Explain the use of Resource Allocation Graph in dealing with deadlocks. Give examples.

2. a) Explain the various ways of preventing deadlocks.


b) Explain Banker’s algorithm.
c) Consider the following snapshot of a system
Allocation Max Available
A B C D A B C D A B C D
P0 0 0 1 2 0 0 1 2 1 5 2 0
P1 1 0 0 0 1 7 5 0
P2 1 3 5 4 2 3 5 6
P3 0 6 3 2 0 6 5 2
P4 0 0 1 4 0 6 5 6

Answer the following questions using Banker’s algorithm


i. What is the content of matrix Need?
ii. Is the system in a safe state?
iii. If a request from process P1 arrives for (0, 4, 2, 0), can the request be granted immediately?

3. Explain the following a) Goals of Protection b) Principles of Protection c) Protection Domain.

4. Explain the following Capability Based Protection System:

a) Hydra system b) Cambridge CAP

5. a) What is an access matrix? How can it be used for protection? Explain some implementation schemes.
b) What is the need for revocation of access rights? Discuss various ways of implementing.

19.Objective Type Questions
UNIT – I
MCQ’s
1. What is an operating system?
a) interface between the hardware and application programs
b) collection of programs that manages hardware resources
c) system service provider to the application programs
d) all of the mentioned
Answer: d
2. What is the main function of the command interpreter?
a) to provide the interface between the API and application program
b) to handle the files in the operating system
c) to get and execute the next user-specified command
d) none of the mentioned
Answer: c
3. To access the services of the operating system, the interface is provided by the ___________
a) Library
b) System calls
c) Assembly instructions
d) API
Answer: b

4. Which one of the following is not true?


a) kernel remains in the memory during the entire computer session
b) kernel is made of various modules which can not be loaded in running operating system
c) kernel is the first part of the operating system to load into memory during booting
d) kernel is the program that constitutes the central core of the operating system
Answer: b
5. Which one of the following errors will be handled by the operating system?
a) lack of paper in printer
b) connection failure in the network
c) power failure
d) all of the mentioned
Answer: d
6. If a process fails, most operating systems write the error information to a ______
a) new file
b) another running process
c) log file
d) none of the mentioned
Answer: c
7. Which one of the following is not a real time operating system?
a) RTLinux
b) Palm OS
c) QNX
d) VxWorks
Answer: b
8. In operating system, each process has its own __________
a) open files
b) pending alarms, signals, and signal handlers
c) address space and global variables
d) all of the mentioned
Answer: d
9. Hard real time operating system has ______________ jitter than a soft real time operating system.
a) equal
b) more
c) less
d) none of the mentioned
Answer: c
10. For real time operating systems, interrupt latency should be ____________
a) zero
b) minimal
c) maximum
d) dependent on the scheduling
Answer: b
Fill in the blanks:
1. Multiprogramming was made possible by _____.
Answer: OS
2. Operating system_________.
Answer: provides a user-friendly interface layer
3. UNIX is written in the _______ language.
Answer: C
4. OS classifies the threads as_____
Answer: kernel and user level
5. Main memory of a computer system is_______
Answer: Volatile
6. Multitasking operating systems are also known as________
Answer: Time-sharing system
7. Batch means a group of __________ jobs
Answer: similar
8. Processor Time which is shared among multiple users simultaneously is termed as____.
Answer: Time sharing
9. To access the services of the OS, the interface is provided by the ________
Answer: System calls
10. There are two types of real-time OS: ____ real-time systems and _____ real-time systems.
Answer: hard, soft
UNIT-II
Multiple choice questions with answers
1) What is a process in an operating system?
A) A type of CPU scheduling algorithm
B) A data structure used for synchronization
C) An instance of a program in execution
D) A method for inter-process communication
Answer: C) An instance of a program in execution

2) Which of the following represents the current condition or status of a process?


A) Process Scheduler
B) Process Control Block
C) Process State
D) Process Queue
Answer: C) Process State

3) What does PCB stand for in the context of operating systems?


A) Process Control Barrier
B) Process Control Buffer
C) Process Control Block
D) Process Control Bit
Answer: C) Process Control Block

4) Threads are:
A) Independent processes
B) Lightweight processes within a single process
C) A type of scheduling algorithm
D) Used for inter-process communication
Answer: B) Lightweight processes within a single process

5) Scheduling queues are used for organizing processes based on:


A) Priority
B) Memory size
C) CPU speed
D) Input/output operations
Answer: A) Priority

6) The component of the operating system responsible for selecting processes from the ready queue is called:
A) Dispatcher
B) Scheduler
C) PCB
D) Context Switch
Answer: B) Scheduler

7) What is the purpose of a context switch?


A) To switch between different users
B) To switch between different processes
C) To switch between different CPUs
D) To switch between different scheduling algorithms
Answer: B) To switch between different processes

8) Pre-emptive scheduling allows:


A) Processes to voluntarily give up CPU
B) Processes to hold CPU until completion
C) Operating system to interrupt a running process
D) Operating system to assign fixed time slices to processes
Answer: C) Operating system to interrupt a running process

9) The component responsible for transferring control of the CPU from one process to another is called:
A) Scheduler
B) PCB
C) Dispatcher
D) Context Switch
Answer: C) Dispatcher

10) Scheduling algorithms consider which of the following criteria?


A) CPU temperature
B) Number of instructions executed
C) CPU utilization, throughput, turnaround time
D) Disk fragmentation
Answer: C) CPU utilization, throughput, turnaround time

11) Multiple-processor scheduling involves distributing processes among:


A) Multiple users
B) Multiple CPUs or cores
C) Multiple disks
D) Multiple networks
Answer: B) Multiple CPUs or cores

12) Real-time scheduling prioritizes tasks based on:


A) CPU utilization
B) Deadline or time constraints
C) Process arrival time
D) Process priority
Answer: B) Deadline or time constraints

13) Thread scheduling involves allocating CPU time to:


A) Different processes
B) Different users
C) Individual threads within a process
D) Disk operations
Answer: C) Individual threads within a process

14) Process synchronization is necessary to prevent issues such as:


A) CPU overheating
B) Disk fragmentation
C) Deadlocks and race conditions
D) Network congestion
Answer: C) Deadlocks and race conditions

15) The critical section problem involves ensuring:


A) Efficient utilization of CPU
B) Mutual exclusion in accessing shared resources
C) Fair allocation of CPU time
D) High throughput of processes
Answer: B) Mutual exclusion in accessing shared resources

16) Peterson's solution is a software-based algorithm for achieving:


A) CPU scheduling
B) Process synchronization
C) Disk scheduling
D) Network communication
Answer: B) Process synchronization

17) Semaphores are a synchronization mechanism used to control access to:


A) Memory
B) Disk
C) CPU
D) Shared resources
Answer: D) Shared resources

18) Classic problems of synchronization include:


A) Task prioritization
B) Deadlocks
C) Disk fragmentation
D) CPU scheduling
Answer: B) Deadlocks

19) Monitors provide a structured approach to managing concurrency and:


A) Resource allocation
B) Process scheduling
C) Synchronization
D) Disk management
Answer: C) Synchronization

20) Which of the following is not a synchronization primitive?


A) Mutex
B) Semaphore
C) Monitor
D) CPU Scheduler
Answer: D) CPU Scheduler
Fill in the blanks with answers
1) Process synchronization is necessary to prevent ________ conditions and ensure orderly execution.
2) A ___________ is a data structure containing information about each process, including its state and resource
usage.
3) ____________ is the process of saving the state of a running process and loading the state of another process.
4) Peterson's solution is a software-based algorithm for achieving mutual ___________ in a shared memory
environment.
5) Semaphore is a synchronization primitive used to control access to _________ resources.
6) Monitors provide a structured approach to managing ________ and simplifying synchronization logic.
7) Multiple-processor scheduling involves distributing processes among multiple CPUs or __________.
8) Real-time scheduling prioritizes tasks based on their _________ or time constraints.
9) The critical section problem involves ensuring that only one process executes a critical section at a _________.
10) Thread scheduling involves allocating CPU time to individual _________ within a process.
11) Scheduling criteria include CPU utilization, ___________, turnaround time, and waiting time.
12) Deadlocks and race conditions are common issues in concurrent programming that can be prevented through
__________ synchronization.
13) Context switch is essential for supporting multitasking by allowing the CPU to switch between different
_________.
14) Synchronization hardware provides specialized instructions or mechanisms for supporting synchronization
_________.
15) The dispatcher is responsible for transferring control of the CPU from one process to another during a
_________.
16) Classic synchronization problems include the producer-consumer problem and the _________ problem.
17) Process coordination involves managing interactions and synchronization among multiple _________.
18) The process state represents the current condition or _________ of a process.
19) Process scheduling involves selecting processes from the _________ queue and allocating CPU time to them.
20) Pre-emptive scheduling allows the operating system to interrupt a running process to allocate CPU time to
_________ processes.

Answers:
1)race
2)Process Control Block (PCB)
3)context switch
4)exclusion
5)shared
6)concurrency
7)cores
8)deadlines
9)time
10)threads
11)throughput
12)proper
13)processes
14)primitives
15)context switch
16)dining philosophers
17)processes,
18)status,
19)ready,
20)other

UNIT-III
Multiple Choice Question:
1) Memory management technique in which system stores and retrieves data from secondary storage for use in main
memory is called?
a) fragmentation
b) paging
c) mapping
d) none of the mentioned
Ans: b
2) Program always deals with ____________
a) logical address
b) absolute address
c) physical address
d) relative address
Ans: a
3) What is compaction?
a) a technique for overcoming internal fragmentation
b) a paging technique
c) a technique for overcoming external fragmentation
d) a technique for overcoming fatal error
Ans: c
4) Operating System maintains the page table for ____________
a) each process
b) each thread
c) each instruction
d) each address
Ans: a
5) Run time mapping from virtual to physical address is done by ____________
a) Memory management unit
b) CPU
c) PCI
d) None of the mentioned
Ans: a
6) The page table contains ____________
a) base address of each page in physical memory
b) page offset
c) page size
d) none of the mentioned
Ans: a
7) Relocation registers are used to:
a) a different address space to processes
b) providing less address space to processes

c) to protect the address spaces of processes
d) providing more address space to processes
Ans: c
8) Each partition may contain ________ when memory is divided into several fixed sized partitions.
a) multiple processes at once
b) exactly one process
c) Two process
d) at least one process
Ans: b
9) In fixed-sized partitioning, to what extent is the degree of multiprogramming bounded?

a) All of these
b) the memory size
c) the CPU utilization
d) the number of partitions
Ans: d
10) The number of ______ can be granted by the owner of an address space.
A. Computers
B. Modules
C. Pages
D. Devices
Ans: c
11) The strategies like the first fit, best fit and worst fit are used to select a ______.
A. process from a queue to put in storage
B. process from a queue to put in memory
C. processor to run the next process
D. free hole from a set of available holes
Ans: d

FILL IN THE BLANKS:


1) A compiler will bind the symbolic addresses to relocatable addresses.
2) Compile time address binding generate absolute code.
3) Logical address is generated by CPU.
4) Logical address is also called Virtual Address.
5) Logical and Physical address are different for execution time address binding.
6) Memory Management unit is used to map from virtual to physical address.
7) Physical memory is broken into fixed sized block called Frames.
8) Logical memory is broken into blocks of same size called Pages
9) Paging eliminates external fragmentation.
10) In Segmentation programs is divided into Variable size segments

UNIT-IV
MCQs:
1. A __________ is a unique tag, usually a number, that identifies the file within the file system.
a) File identifier b) File name c) File type d) none of the mentioned
2. File type can be represented by
a) file name b) file extension c) file identifier d) none of the mentioned
3. What is the mounting of file system?
a) creating a file system b) deleting a file system
c) attaching portion of the file system into a directory structure
d) removing portion of the file system into a directory structure
4. Mapping of file is managed by
a) file metadata b) page table c) virtual memory d) file system
5. Mapping of network file system protocol to local file system is done by
a) network file system b) local file system c) volume manager d) remote mirror.
6. Which one of the following explains the sequential file access method?
a) random access according to the byte number b) read bytes one at a time, in order
c) read/write sequentially by record d) read/write randomly by record
7. In which type of allocation method does each file occupy a set of contiguous blocks on the disk?
a) contiguous allocation b) dynamic-storage allocation
c) linked allocation d) indexed allocation
8. If a block in the free-space list is free, then its bit will be
a) 1 b) 0 c) Any of 0 or 1 d) none of the mentioned
9. File attributes consist of :
a) name b) type c) size d) ALL
10. The information about all files is kept in :
a) swap space b) operating system c) separate directory structure d) None of these
11. In the sequential access method, information in the file is processed :
a) one disk after the other, record access doesn’t matter b) one record after the other
c) one text document after the other d) None of these
12. In the single level directory:
a) All files are contained in different directories all at the same level
b) All files are contained in the same directory
c) Depends on the operating system
d) None of these
13. In the single level directory:
a) all directories must have unique names b) all files must have unique names
c) all files must have unique owners d) All of these
14. An absolute path name begins at the:
a) leaf b) stem c) current directory d) root
15. A relative path name begins at the:
a) leaf b) stem c) current directory d) root
16. The heads of the magnetic disk are attached to a______that moves all the heads as a unit.
a) spindle b) disk arm c) track d) None of these
17. The set of tracks that are at one arm position make up a ________.
a) magnetic disks b) electrical disks c) assemblies d) cylinders
18. The time taken to move the disk arm to the desired cylinder is called the :
a) positioning time b) random access time c) seek time d) rotational latency
19. The time taken for the desired sector to rotate to the disk head is called :
a) positioning time b) random access time c) seek time d) rotational latency
20. When the head damages the magnetic surface, it is known as ________.
a) disk failure b) head crash c) magnetic damage d) All of these

Fill in the blanks:


1. Tracks are divided into_______ (sectors)
2. If there are 30 platters, _______ read/write heads will be there? (60)
3. SSTF stands for ____________(Shortest Seek Time First)
4. Consider disk queue with 98 183 37 122 14 124 65 67 & head at 53. Head movement to serve 1st
request using FCFS is _______________ (45)
5. Consider disk queue with 98 183 37 122 14 124 65 67 & head at 53. Head movement to serve 1st
request using SSTF is _______________ (12)
6. __________disk scheduling may lead to starvation. (SSTF)
7. The actual data in the disk is stored on______ (sectors)
8. Each platter on the disk is divided into_________ (tracks)
9. Virtual memory uses disk space as an extension of _________ (Primary memory)
10. Hard disk comes under________storage (secondary)

UNIT-V
MCQs:

1. When two or more processes attempt to access the same resource a _________ occurs

a) Critical section problem b) Fight

c) Communication problem d) Race condition

2. A set of processes is deadlock if

a) each process is terminated b) each process is blocked and remain blocked forever

c) all processes killing each other d) None of the above

3. To avoid the race condition, the number of processes that may be simultaneously inside their critical section is

a) Only 2 b) Atleast 1 c) Only 1 d) 0

4. A system has 3 processes and 4 resources. If each process needs a maximum of 2 units, then

a) deadlock may occur b) deadlock will occur

c) deadlock will never occur d) Data insufficient

5. The Banker's algorithm is used as part of__________mechanism

a) Deadlock Recovery b) Deadlock Avoidance

c) Deadlock Prevention d) ALL

6. Four necessary conditions for deadlock are non pre-emption, circular wait, hold and wait and

a) race condition b) buffer overflow

c) mutual exclusion d) None of the above

7. The wait-for graph is a deadlock detection algorithm that is applicable when

a) all resources have multiple instances b) all resources have a single instance

c) Both A and B d) None of the above

8. An edge from process Pi to Pj in a wait for graph indicates that

a) Pj is waiting for Pi to release a resource that Pj needs

b) Pi is waiting for Pj to leave the system

c) Pi is waiting for Pj to release a resource that Pi needs

d) Pj is waiting for Pi to leave the system

9. Which of the following situation may cause deadlock?

a) Airplane Traffic b) More trains in same route

c) Road Traffic d) ALL

10. Which of the following is not an approach to dealing with deadlock?

a) Prevention b) Avoidance

c) Detection d) Deletion

11. What is principle of least privilege?

a) Less privileges provide difficulty in executing admin tasks

b) Users should be given just enough privileges to perform their tasks

c) Users can get temporary high privilege access

d) ALL

12. What is need of protection?

a) Prevent mischievous violation

b) Prevent unintentional and intentional violations

c) Ensure that each program uses resources allotted to it only

d) ALL

13. Which of the following is simple way to implementation an Access Matrix?

a) Lock-Key b) Capability List

c) Global Table d) Access Lists

14. Which of the following is not an implementation of Access Matrix

a) Lock-Key b) Capability List

c) Access Lists d) Array

15. In access matrix, objects can be

a) files b) users

c) both A & B d) None of the mentioned

16. In access matrix, domains can be

a) files b) users
c) both A & B d) None of the mentioned

17. _______ is an approach to restricting system access to authorized users.


a) Role-based access control b) Process-based access control
c) Job-based access control d) None of the mentioned

18. Access matrix model for user authentication contains _____________

a) a list of objects b) a list of domains

c) a function which returns an object’s type d) ALL

19. Which one of the following is capability based protection system?

a) hydra b) cambridge CAP system

c) both hydra and cambridge CAP system d) none of the mentioned

20. For a domain _______ is a list of objects together with the operation allowed on these objects.

a) capability list b) access list

c) both capability and access list d) none of the mentioned

Fill in the blanks:

1. A state of resource allocation such that the system can allocate resources to each process in some order and still
avoid deadlock is known as ________ (safe state)

2. Deadlocks can be described by ___________ graph (Resource Allocation Graph)

3. In resource allocation graph, edge from resource to process is called_____(Assignment edge)

4. In resource allocation graph, edge from process to resource is called______(Request edge)

5. Preventing_____________ is the simplest method for deadlock prevention (circular wait)

6. ACL stands for__________ (Access Control List)

7. RBAC stands for________ (Role Based Access Control)

8. In Access matrix, rows represent_______ (Domains)

9. In Access matrix, columns represent_______ (Objects)

10. In the read* operation, * indicates __________ (copy operation)

20.Tutorial Problems
Tutorial Sheet-I (Unit-I)

1. Explain system calls and types of system calls.

2. Explain the following: a) Parallel systems b) Distribution Systems c) Real Time systems

3. Explain in detail the Operating system functions and operations.

Tutorial Sheet-II (Unit-II)

1. Five processes arrive at time 0, in the order given, with the length of the CPU-burst time in
milliseconds, as shown below.
Processes Burst time
P1 8
P2 9
P3 5
P4 6
P5 4
a) Find the average waiting time under each of the following algorithms:
(i) FCFS (ii) SJF (iii) RR (time quantum = 4 milliseconds).
b) Which algorithm gives the minimum average waiting time? (A worked C sketch for FCFS and SJF follows this sheet.)

2. Explain Classic problems of synchronization.

3. Explain Semaphores with example.
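
The following is a minimal C sketch, added here for illustration and not part of the original sheet, that checks the FCFS and SJF answers to problem 1. Since all five processes arrive at time 0, SJF reduces to FCFS over the bursts sorted in ascending order; RR with a 4 ms quantum needs a ready-queue simulation and is not shown. The helper names avg_wait_fcfs and avg_wait_sjf are assumptions made for this sketch.

#include <stdio.h>
#include <stdlib.h>

/* Average waiting time under FCFS: each process waits for the total
 * burst time of every process that ran before it.                    */
static double avg_wait_fcfs(const int burst[], int n) {
    int elapsed = 0, total_wait = 0;
    for (int i = 0; i < n; i++) {
        total_wait += elapsed;      /* waiting time of process i */
        elapsed += burst[i];
    }
    return (double)total_wait / n;
}

static int cmp_int(const void *a, const void *b) {
    return *(const int *)a - *(const int *)b;
}

/* SJF with all arrivals at time 0 is FCFS over bursts sorted ascending. */
static double avg_wait_sjf(const int burst[], int n) {
    int sorted[16];                 /* assumes n <= 16 */
    for (int i = 0; i < n; i++) sorted[i] = burst[i];
    qsort(sorted, n, sizeof(int), cmp_int);
    return avg_wait_fcfs(sorted, n);
}

int main(void) {
    int burst[] = {8, 9, 5, 6, 4};  /* P1..P5 from problem 1 */
    int n = sizeof(burst) / sizeof(burst[0]);
    printf("FCFS average waiting time: %.2f ms\n", avg_wait_fcfs(burst, n));
    printf("SJF  average waiting time: %.2f ms\n", avg_wait_sjf(burst, n));
    return 0;
}

The sketch reports 15.0 ms for FCFS and 10.2 ms for SJF; a hand simulation of RR with a 4 ms quantum gives 20.8 ms, so SJF yields the minimum average waiting time for this workload.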

Tutorial Sheet-IV (Unit-III)

1. Explain paging and differentiate between internal fragmentation and external fragmentation.

2. What is a dispatcher? Explain short-term scheduling and long-term scheduling.

3. Explain First fit, Best fit and Worst fit algorithms with an example.

4. Consider the virtual page reference string 4, 2, 3, 1, 4, 2, 3, 2, 4, 1 on a demand-paged virtual
memory system running on a computer whose main memory has 3 page frames, which are
initially empty. How many page faults would occur for the following page replacement
algorithms: i) FIFO ii) Optimal iii) LRU
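
The following is a minimal C sketch, added for illustration and not part of the original sheet, that counts page faults for the FIFO policy on the reference string of problem 4. Optimal and LRU can be checked the same way by changing only the victim-selection rule. The helper name fifo_faults is an assumption made for this sketch.

#include <stdio.h>

/* Count page faults under FIFO replacement with `frames` frames. */
static int fifo_faults(const int ref[], int n, int frames) {
    int frame[8];                       /* assumes frames <= 8 */
    int used = 0, next_victim = 0, faults = 0;

    for (int i = 0; i < n; i++) {
        int hit = 0;
        for (int j = 0; j < used; j++)
            if (frame[j] == ref[i]) { hit = 1; break; }
        if (hit) continue;

        faults++;
        if (used < frames) {
            frame[used++] = ref[i];              /* fill an empty frame */
        } else {
            frame[next_victim] = ref[i];         /* evict the oldest page */
            next_victim = (next_victim + 1) % frames;
        }
    }
    return faults;
}

int main(void) {
    int ref[] = {4, 2, 3, 1, 4, 2, 3, 2, 4, 1};  /* reference string from problem 4 */
    int n = sizeof(ref) / sizeof(ref[0]);
    printf("FIFO page faults with 3 frames: %d\n", fifo_faults(ref, n, 3));
    return 0;
}

With three initially empty frames the sketch reports 8 page faults for FIFO.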

Tutorial Sheet-VI (Unit-IV)

1. Compare the various File Access Methods.

2. Explain how protection is achieved in a system.

3. Explain the layered file system.


4. Suppose that a disk drive has 5,000 cylinders, numbered 0 to 4999. The drive is currently
serving a request at cylinder 143, and the previous request was at cylinder 125. The queue of
pending requests, in FIFO order, is:
86, 1470, 913, 1774, 948, 1509, 1022, 1750, 130
Starting from the current head position, what is the total distance (in cylinders) that the
disk arm moves to satisfy all the pending requests for each of the following Disk Scheduling
algorithms? a) FCFS b) SSTF c) SCAN d) C-SCAN e) LOOK f) C-LOOK
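
The following is a minimal C sketch, added for illustration and not part of the original sheet, that computes the total head movement for FCFS and SSTF on the request queue of problem 4. SCAN, C-SCAN, LOOK and C-LOOK can be handled similarly by ordering the requests around the current head position. The helper names fcfs_distance and sstf_distance are assumptions made for this sketch.

#include <stdio.h>
#include <stdlib.h>

#define NREQ 9

/* Total head movement when requests are served in the given (FIFO) order. */
static int fcfs_distance(int head, const int req[], int n) {
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += abs(req[i] - head);
        head = req[i];
    }
    return total;
}

/* Total head movement when the closest pending request is served next (SSTF). */
static int sstf_distance(int head, const int req[], int n) {
    int done[NREQ] = {0}, total = 0;

    for (int served = 0; served < n; served++) {
        int best = -1;
        for (int i = 0; i < n; i++) {
            if (done[i]) continue;
            if (best < 0 || abs(req[i] - head) < abs(req[best] - head))
                best = i;
        }
        total += abs(req[best] - head);
        head = req[best];
        done[best] = 1;
    }
    return total;
}

int main(void) {
    int req[NREQ] = {86, 1470, 913, 1774, 948, 1509, 1022, 1750, 130};
    int head = 143;                  /* current head position from problem 4 */
    printf("FCFS total movement: %d cylinders\n", fcfs_distance(head, req, NREQ));
    printf("SSTF total movement: %d cylinders\n", sstf_distance(head, req, NREQ));
    return 0;
}

For this queue and a head at cylinder 143, the sketch reports 7081 cylinders for FCFS and 1745 cylinders for SSTF.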

Tutorial Sheet-VIII (Unit-V)

1. What are the characteristics of deadlock?

2. Consider the following snapshot of a system:

            Allocation     Max         Available
            A B C D        A B C D     A B C D
     P0     0 0 1 2        0 0 1 2     1 5 2 0
     P1     1 0 0 0        1 7 5 0
     P2     1 3 5 4        2 3 5 6
     P3     0 6 3 2        0 6 5 2
     P4     0 0 1 4        0 6 5 6
Answer the following questions using Banker’s algorithm
i. What is the content of matrix Need?
ii. Is the system in a safe state?
iii. If a request from process P1 arrives for (0, 4, 2, 0), can the request be granted
immediately?
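
The following is a minimal C sketch, added for illustration and not part of the original sheet, of the Banker's safety algorithm applied to the snapshot in problem 2. It derives Need as Max minus Allocation and then searches for a safe sequence; the array names alloc, maxdem and avail are assumptions made for this sketch.

#include <stdio.h>

#define NP 5   /* processes P0..P4    */
#define NR 4   /* resource types A..D */

int alloc[NP][NR]  = { {0,0,1,2}, {1,0,0,0}, {1,3,5,4}, {0,6,3,2}, {0,0,1,4} };
int maxdem[NP][NR] = { {0,0,1,2}, {1,7,5,0}, {2,3,5,6}, {0,6,5,2}, {0,6,5,6} };
int avail[NR]      = { 1,5,2,0 };

/* Safety algorithm: repeatedly pick an unfinished process whose Need fits
 * in Work, pretend it runs to completion, and release its allocation.     */
int main(void) {
    int need[NP][NR], work[NR], finish[NP] = {0}, order[NP], count = 0;

    for (int i = 0; i < NP; i++)
        for (int j = 0; j < NR; j++)
            need[i][j] = maxdem[i][j] - alloc[i][j];
    for (int j = 0; j < NR; j++) work[j] = avail[j];

    int progress = 1;
    while (progress) {
        progress = 0;
        for (int i = 0; i < NP; i++) {
            if (finish[i]) continue;
            int fits = 1;
            for (int j = 0; j < NR; j++)
                if (need[i][j] > work[j]) { fits = 0; break; }
            if (!fits) continue;
            for (int j = 0; j < NR; j++) work[j] += alloc[i][j];
            finish[i] = 1;
            order[count++] = i;
            progress = 1;
        }
    }

    if (count == NP) {
        printf("Safe state; safe sequence:");
        for (int i = 0; i < NP; i++) printf(" P%d", order[i]);
        printf("\n");
    } else {
        printf("NOT in a safe state\n");
    }
    return 0;
}

For this snapshot Need works out to P0 (0,0,0,0), P1 (0,7,5,0), P2 (1,0,0,2), P3 (0,0,2,0) and P4 (0,6,4,2), and the sketch finds the safe sequence P0, P2, P3, P4, P1. The request (0,4,2,0) from P1 can be analysed by tentatively granting it (subtract it from Available and Need, add it to Allocation) and re-running the same safety check.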
21. Known gaps

As the stated prerequisites are taught in previous semesters, there are no known gaps.

22. DISCUSSION TOPICS

Unit-1:

1. Distributed Systems
2. Real-time Embedded System
3. Operating Systems Generation (SYSGEN)
4. Handheld Systems
5. Multimedia Systems
6. Computing Environments
a) Traditional Computing
b) Client server Computing
c) Peer-to-peer Computing
d) Web-based Computing
7. Case study on UNIX Operating Systems
8. Case study on LINUX Operating Systems
9. Case study on WINDOWS Operating Systems
10. Threading issues
11. Scheduling Algorithms
12. Evaluation of scheduling Algorithms

Unit-2:

1. Synchronization examples
2. Atomic transactions
3. Case study on UNIX
4. Case study on Linux
5. Case study on Windows
6. Classic problems of synchronization

Unit-3:

1. Page-Replacement Algorithms
2. Memory Management
3. Case study on UNIX
4. Case study on Linux
5. Case study on Windows
6. Paging Techniques

Unit-4:
1. Free-space management
2. Case study on UNIX
3. Case study on Linux
4. Case study on Windows
5. File system
6. Protection in file systems
7. Mass-storage structure
8. Disk scheduling

Unit-5:
1. Deadlock Detection
2. Deadlock Avoidance
3. Recovery from deadlock
4. Protection

5. Case study on UNIX
6. Case study on Linux
7. Case study on Windows
8. Implementation of Access Matrix

23. References, Journals, websites and E-links


1. The Journal of Computer and System Sciences (JCSS) is a peer-reviewed scientific
journal in the field of computer science, published by Elsevier since 1967; it has carried
a number of Gödel Prize-winning papers.

2. The Computing Research and Education Association of Australasia, Journal Rankings, July 2008.
http://www.core.edu.au/journal%20rankings/Journal%20Rankings.html. Retrieved 22 August 2009.

24. Quality Measurement Sheets


a. Course End Survey
b. Teaching Evaluation

25. Student List


SlNo AdmnNo StudentName SlNo AdmnNo StudentName

26. Group-wise Student List


Batch1
SlNo AdmnNo StudentName

Batch2
SlNo AdmnNo StudentName

