Module 5
Disk attachment; Disk scheduling; Disk management; Swap space management; Protection:
Case Study: The Linux operating system: Linux history; Design principles; Kernel modules;
Process management; Scheduling; memory management; File systems, Input and output; Inter-
process communication
OPERATING SYSTEMS MODULE 5
MASS-STORAGE STRUCTURE
Hard-Disks
With a neat diagram, briefly explain the structure of a moving-head disk.
Hard-disks provide the bulk of secondary-storage for modern computer-systems as shown in figure.
Each disk-platter has a flat circular-shape, like a CD. The 2 surfaces of a platter are covered with a
magnetic material. Information is stored on the platters by recording magnetically.
a) Seek-time refers to the time necessary to move the disk-arm to the desired
cylinder.
b) Rotational-latency refers to the time necessary for the desired sector to rotate to
the disk-head.
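Both delays can be quantified with simple arithmetic. The average rotational latency, for instance, is the time for half a revolution at the spindle speed. A minimal sketch (the RPM figures are common drive speeds, used here only as examples):

```python
def avg_rotational_latency_ms(rpm):
    """Return the average rotational latency in milliseconds.
    On average, the desired sector is half a revolution away."""
    full_revolution_ms = 60_000 / rpm  # 60 s per minute, expressed in ms
    return full_revolution_ms / 2

print(round(avg_rotational_latency_ms(7200), 2))  # 4.17 ms
print(round(avg_rotational_latency_ms(5400), 2))  # 5.56 ms
```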
A disk can be removable which allows different disks to be mounted as needed. A disk-drive is
attached to a computer by an I/O bus.
Different kinds of buses: Advanced technology attachment (ATA), Serial ATA (SATA), eSATA,
Universal serial bus (USB) and Fiber channel (FC).
SOLID-STATE DISKS
An SSD is non-volatile memory that is used like a hard-drive. There are many variations of this
technology, ranging from DRAM with a battery (to maintain its state through a power failure) to
flash-memory technologies.
Advantages compared to Hard-disks: SSDs are more reliable because they have no moving parts,
faster because they have no seek-time or rotational latency, and they consume less power.
Disadvantages: SSDs are more expensive per megabyte, have less capacity, and may have shorter
life spans than hard-disks, so their uses are somewhat limited.
Applications: One use for SSDs is in storage-arrays, where they hold file-system metadata that
require high performance. SSDs are also used in laptops to make them smaller, faster, and more
energy-efficient.
MAGNETIC TAPES
Magnetic tape was used as an early secondary-storage medium.
Advantages: It is relatively permanent and can hold large quantities of data.
Disadvantages: Its access time is slow compared with that of main memory and Hard-disk.
In addition, random access to magnetic tape is about a thousand times slower than random access to
Hard-disk, so tapes are not very useful for secondary-storage.
Applications: Tapes are used mainly for backup, for storage of infrequently used information.
Tapes are used as a medium for transferring information from one system to another.
DISK STRUCTURE
Modern Hard-disk-drives are addressed as large one-dimensional arrays of logical blocks. The
logical block is the smallest unit of transfer.
How is the one-dimensional array of logical blocks mapped onto the sectors of the disk?
Sector 0 is the first sector of the first track on the outermost cylinder. The mapping proceeds in order
through that track, then through the rest of the tracks in that cylinder, and then through the rest of the
cylinders from outermost to innermost.
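The sector-number mapping described above amounts to a simple formula. A sketch, assuming an idealized disk with a constant number of sectors per track (the geometry values are hypothetical, and sectors are numbered from 0 for simplicity):

```python
def chs_to_lba(cylinder, head, sector, heads_per_cyl, sectors_per_track):
    """Map a (cylinder, head, sector) address to a logical block number,
    following the outermost-to-innermost ordering described above."""
    return (cylinder * heads_per_cyl + head) * sectors_per_track + sector

# Hypothetical geometry: 4 heads (2 platters), 16 sectors per track.
print(chs_to_lba(0, 0, 0, 4, 16))  # 0  -> first sector of the outermost track
print(chs_to_lba(0, 1, 0, 4, 16))  # 16 -> next track in the same cylinder
print(chs_to_lba(1, 0, 0, 4, 16))  # 64 -> first track of the next cylinder
```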
In practice, it is difficult to perform this mapping, for two reasons.
Most disks have some defective sectors, but the mapping hides this by substituting spare
sectors from elsewhere on the disk.
The number of sectors per track is not a constant on some drives.
DISK ATTACHMENT
Computers access disk storage in two ways.
via I/O ports (or host-attached storage); this is common on small systems.
via a remote host in a distributed file system; this is referred to as network-attached storage.
Host-Attached Storage
Host-attached storage is storage accessed through local I/O ports. These ports use several
technologies.
The desktop PC uses an I/O bus architecture called IDE or ATA. This architecture supports a
maximum of two drives per I/O bus.
High-end workstations( and servers) use fibre channel (FC), a high-speed serial architecture
that can operate over optical fiber.
It has two variants:
One is a large switched fabric having a 24-bit address space. This variant is the basis of
storage-area networks (SANs).
The other FC variant is an arbitrated loop (FC-AL) that can address 126 devices.
A wide variety of storage devices are suitable for use as host-attached storage. For ex: Hard-disk-
drives, RAID arrays, and CD, DVD, and tape drives.
Network-Attached Storage
A network-attached storage (NAS) device is a special-purpose storage system that is accessed
remotely over a data network as shown in figure. Clients access NAS via a remote-procedure-call
interface such as NFS for UNIX systems and CIFS for Windows machines.
The remote procedure calls (RPCs) are carried via TCP or UDP over a local area network (LAN).
Usually, the same local area network (LAN) carries all data traffic to the clients. The NAS device is
usually implemented as a RAID array with software that implements the RPC interface.
Advantage: All computers on a LAN can share a pool of storage with the same ease of naming and
access as local host-attached storage.
Disadvantages: NAS is less efficient and has lower performance than some direct-attached storage
options. Storage I/O operations consume bandwidth on the data network, thereby increasing the
latency of network communication.
iSCSI is the latest network-attached storage protocol. iSCSI uses the IP network protocol to carry
the SCSI protocol. Thus, networks rather than SCSI cables can be used as the interconnects between
hosts and their storage.
Storage-Area Network
A storage-area network (SAN) is a private network connecting servers and storage units as shown in
figure. The power of a SAN lies in its flexibility.
1. Multiple hosts and multiple storage-arrays can attach to the same SAN.
2. Storage can be dynamically allocated to hosts.
3. A SAN switch allows or prohibits access between the hosts and the storage.
4. SANs make it possible for clusters of servers to share the same storage and for storage arrays
to include multiple direct host connections.
5. SANs typically have more ports than storage-arrays.
FC is the most common SAN interconnect.
Another SAN interconnect is InfiniBand, a special-purpose bus architecture that provides hardware
and software support for high-speed interconnection networks for servers and storage units.
Storage-area network
DISK SCHEDULING
Explain in brief the selection of disk scheduling algorithm.
Whenever a process needs I/O to or from the disk, it issues a system call to the operating system.
The request specifies several pieces of information:
Whether this operation is input or output
What the disk address for the transfer is
What the memory address for the transfer is
What the number of sectors to be transferred is
If the desired disk-drive and controller are available, the request can be serviced immediately.
If the drive or controller is busy, any new requests for service will be placed in the queue of
pending requests for that drive.
For a multiprogramming system with many processes, the disk queue may often have several
pending requests. Thus, when one request is completed, the operating system chooses which
pending request to service next. Any one of several disk-scheduling algorithms can be used.
DISK SCHEDULING ALGORITHMS
What is disk scheduling? Explain different disk scheduling algorithms. (Any two)
In a multiprogramming system with many processes, the disk queue may often have several pending
disk requests. Thus, when one request is completed, the operating system chooses which
pending request to service next. This mechanism is called as disk scheduling.
The various disk scheduling methods are: FCFS, SSTF, SCAN, C-SCAN, LOOK, C-LOOK
FCFS DISK SCHEDULING ALGORITHM: FCFS stands for First Come First Served. The
requests are serviced in the same order as they are received. For example:
Starting with cylinder 53, the disk-head will first move from 53 to 98, then to 183, 37, 122, 14, 124,
65, and finally to 67 as shown in above figure.
Head movement from 53 to 98 = 45
Head movement from 98 to 183 = 85
Head movement from 183 to 37 = 146
Head movement from 37 to 122 =85
Head movement from 122 to 14 =108
Head movement from 14 to 124 =110
Head movement from 124 to 65 =59
Head movement from 65 to 67 = 2 Total head movement = 640
Advantage: This algorithm is simple and fair, and no request starves.
Disadvantage: Generally, this algorithm does not provide the fastest service.
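The 640-cylinder total above can be verified with a short simulation (the function name is ours; the queue and start position come from the example):

```python
def fcfs_total_movement(start, requests):
    """Total head movement when requests are serviced in arrival order."""
    total, pos = 0, start
    for cyl in requests:
        total += abs(cyl - pos)
        pos = cyl
    return total

queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(fcfs_total_movement(53, queue))  # 640
```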
SSTF SCHEDULING
SSTF stands for Shortest Seek-Time First. This algorithm selects the request with the minimum
seek-time from the current head-position. Since seek-time increases with the number of cylinders
traversed by the head, SSTF chooses the pending request closest to the current head-position.
For example:
The closest request to the initial head position 53 is at cylinder 65. Once we are at cylinder 65, the
next closest request is at cylinder 67. From there, the request at cylinder 37 is closer than 98, so 37 is
served next. Continuing, we service the request at cylinder 14, then 98, 122, 124, and finally 183. It
is shown in above Figure.
Head movement from 53 to 65 = 12
Head movement from 65 to 67 = 2
Head movement from 67 to 37 = 30
Head movement from 37 to 14 =23
Head movement from 14 to 98 =84
Head movement from 98 to 122 =24
Head movement from 122 to 124 =2
Head movement from 124 to 183 = 59 Total head movement = 236
Advantage: SSTF gives a substantial improvement in performance over FCFS.
Disadvantage: SSTF is not optimal. Essentially, SSTF is a form of SJF scheduling, and it may cause
starvation of some requests.
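The SSTF order and its 236-cylinder total can likewise be verified in a few lines (again, the helper name is ours):

```python
def sstf_total_movement(start, requests):
    """Repeatedly service the pending request closest to the head."""
    pending, total, pos = list(requests), 0, start
    while pending:
        nearest = min(pending, key=lambda c: abs(c - pos))
        total += abs(nearest - pos)
        pos = nearest
        pending.remove(nearest)
    return total

print(sstf_total_movement(53, [98, 183, 37, 122, 14, 124, 65, 67]))  # 236
```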
SCAN SCHEDULING
The SCAN algorithm is sometimes called the elevator algorithm, since the disk-arm behaves just
like an elevator in a building.
Here is how it works:
The disk-arm starts at one end of the disk. Then, the disk-arm moves towards the other end,
servicing the request as it reaches each cylinder. At the other end, the direction of the head
movement is reversed and servicing continues. The head continuously scans back and forth across
the disk. For example:
Before applying SCAN algorithm, we need to know the current direction of head movement.
Assume that disk-arm is moving toward 0, the head will service 37 and then 14. At cylinder 0, the
arm will reverse and will move toward the other end of the disk, servicing the requests at 65,67,98,
122, 124, and 183. It is shown in above figure.
Head movement from 53 to 37 = 16
Head movement from 37 to 14 = 23
Head movement from 14 to 0 = 14
Head movement from 0 to 65 =65
Head movement from 65 to 67 =2
Head movement from 67 to 98 =31
Head movement from 98 to 122 =24
Head movement from 122 to 124 = 2
Head movement from 124 to 183 = 59
Total head movement = 236
Disadvantage: If a request arrives just in front of the head, it will be serviced almost immediately.
On the other hand, if a request arrives just behind the head, it will have to wait until the arm reaches
the other end of the disk and reverses direction.
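The SCAN example can be simulated as well. The sketch below assumes a 200-cylinder disk (0 to 199) and an initial head direction toward cylinder 0, as in the example:

```python
def scan_total_movement(start, requests, low=0, high=199, direction=-1):
    """SCAN: sweep to one physical end of the disk, then reverse.
    direction=-1 means the head is initially moving toward cylinder `low`."""
    below = sorted((c for c in requests if c < start), reverse=True)
    above = sorted(c for c in requests if c >= start)
    # The head visits pending requests in sweep order, touching the
    # physical end of the disk before reversing.
    path = below + [low] + above if direction == -1 else above + [high] + below
    total, pos = 0, start
    for cyl in path:
        total += abs(cyl - pos)
        pos = cyl
    return total

print(scan_total_movement(53, [98, 183, 37, 122, 14, 124, 65, 67]))  # 236
```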
C-SCAN SCHEDULING
Circular SCAN (C-SCAN) scheduling is a variant of SCAN designed to provide a more uniform
wait time. Like SCAN, C-SCAN moves the head from one end of the disk to the other, servicing
requests along the way. When the head reaches the other end, however, it immediately returns to the
beginning of the disk, without servicing any requests on the return trip as shown in below figure.
The C-SCAN scheduling algorithm essentially treats the cylinders as a circular list that wraps
around from the final cylinder to the first one.
Before applying C - SCAN algorithm, we need to know the current direction of head movement.
Assume that disk-arm is moving toward cylinder number 199, the head will service 65, 67, 98, 122,
124, 183. Then it will move to 199 and the arm will reverse and move towards 0.
While moving towards 0, it will not service any requests. After reaching 0, it will reverse again and
then service the requests at 14 and 37. It is as shown in below figure.
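A sketch of C-SCAN for the same request queue, again assuming cylinders 0 to 199. Note that the full return sweep (199 back to 0) is counted as head movement here, which is one common convention; some texts exclude it:

```python
def cscan_total_movement(start, requests, low=0, high=199):
    """C-SCAN: sweep toward `high` servicing requests, then return to `low`
    without servicing anything, and continue the upward sweep."""
    above = sorted(c for c in requests if c >= start)
    below = sorted(c for c in requests if c < start)
    path = above + [high, low] + below  # the high->low jump services nothing
    total, pos = 0, start
    for cyl in path:
        total += abs(cyl - pos)
        pos = cyl
    return total

print(cscan_total_movement(53, [98, 183, 37, 122, 14, 124, 65, 67]))  # 382
```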
LOOK SCHEDULING
As described above, the SCAN algorithm moves the disk-arm across the full width of the disk. In
practice, SCAN is not implemented in this way. Usually, the arm goes only as far as the final request
in each direction; then the arm reverses, without going all the way to the end of the disk. This
version of SCAN is called LOOK scheduling, because the arm looks for a request before continuing
to move in a given direction.
Example:
Look scheduling
C-LOOK SCHEDULING
Circular LOOK (C-LOOK) scheduling is a variant of LOOK designed to provide a more uniform
wait time. Like LOOK, C-LOOK moves the head only as far as the final request in each direction.
Then the arm reverses and moves immediately to the first request at the other end, without servicing
any requests on the return trip; from there, servicing continues in the original direction.
Assume that disk-arm is moving toward 199, the head will service 65, 67, 98, 122, 124, 183. Then
the arm will reverse and move towards 14. Then it will serve 37. It is as shown in below Figure
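C-LOOK for the same queue can be sketched the same way; the jump from the highest request (183) down to the lowest (14) is counted as movement here:

```python
def clook_total_movement(start, requests):
    """C-LOOK: sweep toward the highest pending request, then jump straight
    to the lowest pending request and continue upward."""
    above = sorted(c for c in requests if c >= start)
    below = sorted(c for c in requests if c < start)
    path = above + below  # after 183 the head jumps down to 14, then 37
    total, pos = 0, start
    for cyl in path:
        total += abs(cyl - pos)
        pos = cyl
    return total

print(clook_total_movement(53, [98, 183, 37, 122, 14, 124, 65, 67]))  # 322
```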
EXERCISE PROBLEMS
Suppose that the disk-drive has 5000 cylinders numbered from 0 to 4999. The drive is
currently serving a request at cylinder 143, and the previous request was at cylinder 125. The
queue of pending requests in FIFO order is 86, 1470, 913, 1774, 948, 1509, 1022, 1750, 130.
Starting from the current (location) head position, what is the total distance (in cylinders) that
the disk-arm moves to satisfy all the pending requests, for each of the following disk-
scheduling algorithms?
i. FCFS
ii. SSTF
iii. SCAN
iv. LOOK
v. C-SCAN
vi. C-LOOK
Solution:
i. FCFS
For FCFS schedule, the total seek distance is = (143-86) + (1470-86) + (1470-913) + (1774-913) +
(1774-948) + (1509-948) + (1509-1022) + (1750-1022) + (1750-130) = 7081.
ii. SSTF
For SSTF schedule, the total seek distance is = (143-130) + (130-86) + (1774-86) = 1745.
iii. SCAN
For SCAN schedule, the total seek distance is = (4999- 143) + (4999 -86) = 9769
iv. C-SCAN
For C-SCAN schedule, the total seek distance is = (4999 – 143) + (4999 – 0)+(130-0) = 9985
v. LOOK
For LOOK schedule, the total seek distance is = (1774 – 143) + (1774 – 86) = 3319.
vi. C-LOOK
For C- LOOK schedule, the total seek distance is = (1774 – 143) +(1774 -86) + (130-86) = 3363
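All six answers can be cross-checked with one small helper that sums the head movement along an explicit service path (the variable names are ours):

```python
def total(path, start):
    """Sum of head movements when cylinders are visited in `path` order."""
    pos, dist = start, 0
    for c in path:
        dist += abs(c - pos)
        pos = c
    return dist

start = 143
reqs = [86, 1470, 913, 1774, 948, 1509, 1022, 1750, 130]
above = sorted(c for c in reqs if c >= start)                  # upward sweep
below = sorted((c for c in reqs if c < start), reverse=True)   # downward sweep

print(total(reqs, start))                               # FCFS:   7081
print(total(below + above, start))                      # SSTF (same order for this data): 1745
print(total(above + [4999] + below, start))             # SCAN:   9769
print(total(above + [4999, 0] + sorted(below), start))  # C-SCAN: 9985
print(total(above + below, start))                      # LOOK:   3319
print(total(above + sorted(below), start))              # C-LOOK: 3363
```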
Suppose that a disk has 50 cylinders named 0 to 49. The R/W head is currently serving at
cylinder 15. The queue of pending request are in order: 4, 40, 11, 35, 7, 14 starting from the
current head position, what is the total distance traveled (in cylinders) by the disk-arm to
satisfy the request using algorithms
i. FCFS
ii. SSTF and
iii. LOOK.
Illustrate with figure in each case.
Solution:
i. FCFS
Head starts at 15. Service order: 4, 40, 11, 35, 7, 14.
Total head movement = (15-4) + (40-4) + (40-11) + (35-11) + (35-7) + (14-7)
= 11 + 36 + 29 + 24 + 28 + 7 = 135.
ii. SSTF
Head starts at 15. Service order: 14, 11, 7, 4, 35, 40.
Total head movement = 1 + 3 + 4 + 3 + 31 + 5 = 47.
iii. LOOK
Head starts at 15. Assuming the head is initially moving toward the higher-numbered cylinders,
the service order is 35, 40, then 14, 11, 7, 4.
Total head movement = (40-15) + (40-4) = 25 + 36 = 61.
DISK MANAGEMENT
The operating system is responsible for several other aspects of disk management. For example:
Disk initialization, Booting from disk and Bad-block recovery.
Disk Formatting: Usually, a new Hard-disk is a blank slate: it is just a platter of a magnetic
recording material. Before a disk can store data, it must be divided into sectors that the disk
controller can read and write. This process is called low-level formatting, or physical formatting.
Low-level formatting fills the disk with a special data structure for each sector. The data structure
for a sector typically consists of a header, a data area (usually 512 bytes in size), and a trailer. The
header and trailer contain information used by the disk controller, such as sector number and error-
correcting code (ECC). Before a disk can store data, the operating system still needs to record its
own data structures on the disk. It does so in two steps.
1. Partition the disk into one or more groups of cylinders.
The operating system can treat each partition as a separate disk.
For example: one partition can hold a copy of the operating system’s executable code, while another
partition holds user files.
2. Logical formatting, or creation of a file system: The operating system stores the initial file-system
data structures onto the disk. These data structures may include maps of free and allocated space and
an initial empty directory. To increase efficiency, most file systems group blocks together into larger
chunks, frequently called clusters. Disk I/O is done via blocks, File system I/O is done via clusters.
BOOT BLOCK
What are boot blocks? Explain.
For a computer to start running, it must have a bootstrap program to run. The Bootstrap program
initializes CPU registers, device controllers and the contents of main memory and then starts the
operating system. For most computers, the bootstrap is stored in read-only memory (ROM). To
change the bootstrap code, the ROM hardware chips have to be changed. To solve this problem, most
systems store a tiny bootstrap loader program in the boot-ROM. This loader program in ROM will
bring bootstrap program from disk. The full bootstrap program is stored in the form of boot blocks at
a fixed location on the disk. A disk that has a boot partition is called a boot disk or system disk.
In the boot-ROM, the code instructs the disk-controller to read the boot blocks into memory and
then starts executing that code.
BAD BLOCKS
What are bad blocks? Explain
Because disks have moving parts and small tolerances, they are prone to failure. Sometimes a disk
needs to be replaced entirely, and its contents restored from backup media to the new disk. More
often, only one or more sectors become defective. In fact, most disks come from the factory with
bad-blocks.
How to handle bad-blocks?
On simple disks, bad-blocks are handled manually.
One strategy is to scan the disk to find bad-blocks while the disk is being formatted. Any bad-blocks
that are discovered are flagged as unusable. Thus, the file system does not allocate them.
If blocks go bad during normal operation, a special program (such as Linux bad-blocks command)
must be run manually to search for the bad-blocks and to lock the bad-blocks. Usually, data that
resided on the bad-blocks are lost.
Bad blocks are recovered by using:
1. Sector sparing method 2. Sector slipping
The controller can be told to replace each bad sector logically with one of the spare sectors. This
scheme is known as sector sparing or forwarding.
As an alternative to sector sparing, some controllers can be instructed to replace a bad block by
sector slipping.
Example: A typical bad-sector transaction might be as follows:
The operating system tries to read logical block 87. The controller calculates the ECC and finds that
the sector is bad. It reports this finding to the operating system. The next time the system is rebooted,
a special command is run to tell the controller to replace the bad sector with a spare. After that,
whenever the system requests logical block 87, the request is translated into the replacement sector’s
address by the controller.
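The controller's remapping can be pictured as a small translation table. A toy sketch (the spare-sector addresses are made up):

```python
SPARE_POOL = [5000, 5001, 5002]   # hypothetical spare-sector addresses
spare_map = {}                    # bad logical block -> spare sector

def mark_bad(block):
    """Forward a bad block to the next free spare sector (sector sparing)."""
    spare_map[block] = SPARE_POOL.pop(0)

def translate(block):
    """Address the controller actually uses for a requested logical block."""
    return spare_map.get(block, block)

mark_bad(87)                # the scenario from the text: block 87 goes bad
print(translate(87))        # 5000 -> requests for block 87 reach the spare
print(translate(88))        # 88   -> healthy blocks are unaffected
```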
SWAP-SPACE MANAGEMENT
The main goal of swap space is to provide the best throughput for the virtual-memory system.
Here, we discuss 1) swap-space use and 2) swap-space location.
Swap-Space Use: Swap space can be used in 2 ways.
Swapping-Systems may use swap space to hold an entire process image, including the code
and data segments.
Paging-systems may simply store pages that have been pushed out of main memory.
The amount of swap space needed on a system can therefore vary from a few megabytes of disk
space to gigabytes, depending on amount of physical memory, amount of virtual memory it is
backing, and way in which the virtual memory is used.
Swap-Space Location: A swap space can reside in one of two places:
1. The swap space can be a large file within the file system: Here, normal file-system routines can be
used to create it, name it, and allocate its space.
Advantage: This approach is easy to implement.
Disadvantage: This approach is inefficient. This is because navigating the directory structure and the
disk structures takes time and extra disk accesses. External fragmentation can greatly increase
swapping times by forcing multiple seeks during reading or writing of a process image.
2. The swap space can be in a separate raw (disk) partition: No file system or directory structure is
placed in the swap space. Rather, a separate swap-space storage manager is used to allocate and de-
allocate the blocks from the raw partition. This manager uses algorithms optimized for speed rather
than for storage efficiency, because swap space is accessed much more frequently than file system.
PROTECTION
Protection vs. Security
Protection
Protection controls access to the system-resources by Programs, Processes or Users.
Protection ensures that only processes that have gained proper authorization from the OS can
operate on memory-segments, CPU and other resources.
Protection must provide means for specifying the controls to be imposed, means of enforcing
the controls.
Protection is an internal problem. Security, in contrast, must consider both the computer-
system and the environment within which the system is used.
Security
Security ensures the authentication of system-users, in order to protect the integrity of the
information stored in the system (both data and code) and the physical resources of the
computer-system.
The security-system prevents unauthorized access, malicious destruction or alteration of data, and
accidental introduction of inconsistency.
Goals of Protection
Explain the goals of protection
Operating system consists of a collection of objects, hardware or software. Each object has a unique
name and can be accessed through a well-defined set of operations.
Protection problem: ensure that each object is accessed correctly & only by those processes that are
allowed to do so.
Reasons for providing protection: To prevent mischievous violation of an access restriction. To
ensure that each program component active in a system uses system resources only in ways
consistent with policies.
Mechanisms are distinct from policies:
Mechanisms determine how something will be done. Policies decide what will be done. This
principle provides flexibility.
Principles of Protection
Explain the principles of protection
A key principle for protection is the principle of least privilege. Principle of Least Privilege:
Programs, users, and even systems are given just enough privileges to perform their tasks. The
principle of least privilege can help produce a more secure computing environment. An operating
system that follows the principle of least privilege implements its features, programs, system-calls,
and data structures so that failure or compromise of a component results in minimum damage. An operating
system also provides system-calls and services that allow applications to be written with fine-
grained access controls.
Access Control provides mechanisms to enable privileges when they are needed, to disable
privileges when they are not needed. Audit-trails for all privileged function-access can be created.
Audit-trail can be used to trace all protection/security activities on the system.
The audit-trail can be used by Programmer, System administrator or Law-enforcement officer.
Managing users with the principle of least privilege requires creating a separate account for each
user, with just the privileges that the user needs. Computers implemented in a computing facility
under the principle of least privilege can be limited to running specific services, accessing specific
remote hosts via specific services, and accessing only during specific times. Typically, these restrictions are
implemented through enabling or disabling each service and through using Access Control Lists.
DOMAIN OF PROTECTION
A process operates within a protection domain. Protection domain specifies the resources that the
process may access. Each domain defines set of objects and types of operations that may be invoked
on each object. The ability to execute an operation on an object is an access-right.
A domain is a collection of access-rights. The access-rights are an ordered pair <object-name, rights-
set>.
For example:
If domain D has the access-right <file F, {read, write}>; Then a process executing in domain D can
both read and write on file F. As shown in below Figure, domains may share access-rights. The
access-right <O4, {print}> is shared by D2 and D3.
Access matrix
Domain switching allows the process to switch from one domain to another. When we switch a
process from one domain to another, we are executing an operation (switch) on an object (the
domain).We can include domains in the matrix to control domain switching. Consider the access
matrix shown in Figure below: A process executing in domain D2 can switch to domain D3 or to
domain D4.
Access matrix with Copy rights, Owner rights & Control rights
Access-matrix provides mechanism for specifying a variety of policies. The access matrix is used to
implement policy decisions concerning protection in an operating system. In the matrix, 1) Rows
represent domains.2) Columns represent objects. Each entry consists of a set of access-rights (such
as read, write or execute).
In general, Access(i, j) is the set of operations that a process executing in domain Di can invoke on
object Oj.
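A sparse access matrix is easy to model as a dictionary keyed by (domain, object). A toy sketch (the entries are illustrative, not taken from a specific figure):

```python
# access[(domain, obj)] is the rights set; missing entries mean no rights.
access = {
    ("D1", "F1"): {"read"},
    ("D2", "F1"): {"read", "write"},
    ("D2", "D3"): {"switch"},  # domains appear as objects to control switching
}

def can(domain, obj, op):
    """Check whether operation `op` is in Access(domain, obj)."""
    return op in access.get((domain, obj), set())

print(can("D2", "F1", "write"))   # True
print(can("D1", "F1", "write"))   # False
print(can("D2", "D3", "switch"))  # True -> a D2 process may switch to D3
```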
The different methods of implementing access matrix are:
1. By using Global table
2. Access lists for objects
3. Capability lists for domains
4. A Lock-Key mechanism
Global Table
A global table consists of a set of ordered triples <domain, object, rights-set>. Whenever an
operation M is executed on an object Oj within domain Di, the global table is searched for a triple
<Di, Oj, Rk> with M ∈ Rk. If this triple is found, the access operation is allowed; otherwise,
access is denied, and an exception condition occurs.
Disadvantages: The table is usually large and can't be kept in main memory.
It is difficult to take advantage of groupings; e.g., if every domain may read an object, a separate
entry is still needed for each domain.
Access Lists for Objects
In the access-matrix, each column can be implemented as an access-list for one object. Obviously,
the empty entries can be discarded. For each object, the access-list consists of ordered pairs
<domain, rights-set>.
Working: Whenever an operation M is executed on an object Oj within domain Di, the access list
for Oj is searched for an entry <Di, Rk> with M ∈ Rk.
If this entry is found, the access operation is allowed; otherwise, we check the default-set. If M
is in the default-set, the access operation is allowed; otherwise, access is denied, and an exception
condition occurs.
Advantages:
The strength is the control that comes from storing the access privileges along with each object.
This allows the object to revoke or expand the access privileges in a localized manner.
Disadvantages:
The weakness is the overhead of checking whether the requesting domain appears on the access list.
This check would be expensive and needs to be performed every time the object is accessed. Usually,
the table is large & thus cannot be kept in main memory, so additional I/O is needed. It is difficult to
take advantage of special groupings of objects or domains.
Capability Lists for Domains
For a domain, a capability list is a list of objects & operations allowed on the objects. Often, an
object is represented by its physical name or address, called a capability. To execute operation M on
object Oj, the process executes the operation M, specifying the capability (or pointer) for object Oj
as a parameter. The capability list is associated with a domain. But capability list is never directly
accessible by a process. Rather, the capability list is maintained by the OS & accessed by the user
only indirectly. Capabilities are distinguished from other data in two ways:
Each object has a tag to denote whether it is a capability or accessible data.
Program address space can be split into 2 parts. One part contains normal data, accessible to
the program. Another part contains the capability list, accessible only to the OS.
A Lock–Key Mechanism
The lock–key scheme is a compromise between 1) Access-lists and 2) Capability lists.
Each object has a list of unique bit patterns, called locks. Similarly, each domain has a list of unique
bit patterns, called keys. A process executing in a domain can access an object only if that domain
has a key that matches one of the locks of the object.
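The matching rule can be stated in one line of set arithmetic. A toy sketch (the bit patterns are arbitrary):

```python
object_locks = {"F1": {0b1010, 0b0110}}          # locks per object
domain_keys = {"D1": {0b1010}, "D2": {0b0001}}   # keys per domain

def can_access(domain, obj):
    """A domain may access an object iff one of its keys matches a lock."""
    return bool(domain_keys.get(domain, set()) & object_locks.get(obj, set()))

print(can_access("D1", "F1"))  # True  -> key 0b1010 matches a lock on F1
print(can_access("D2", "F1"))  # False -> no key of D2 matches any lock
```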
ACCESS CONTROL
Protection can be applied to non-file resources as shown in below figure:
Solaris 10 provides role-based access control (RBAC) to implement least privilege. A privilege is
the right to execute a system call or to use an option within a system call. Privileges can be assigned
to processes. Users are assigned roles that grant access to privileges and programs.
5.15.3 Linux-Distributions
• Linux-Distributions include
→ system-installation and management utilities
→ ready-to-install packages of common UNIX tools (ex: text-processing, web browser).
• The first distributions managed these packages by simply providing a means of unpacking all the
files into the appropriate places.
• Early distributions included SLS and Slackware.
• RedHat and Debian are popular distributions from commercial and non-commercial sources,
respectively.
• RPM Package file format permits compatibility among the various Linux-Distributions.
• If clone() is passed the flags CLONE_FS, CLONE_VM, CLONE_SIGHAND, and CLONE_FILES, the parent and child tasks will share
→ same file-system information (such as the current working directory)
→ same memory space
→ same signal handlers and
→ same set of open files.
However, if none of these flags is set when clone() is invoked, the associated resources are not
shared.
• Separate data-structures are used to hold the information of a process. This information includes:
→ file-system context
→ file-descriptor table
→ signal-handler table and
→ virtual-memory context
• The process data-structure contains pointers to these other structures.
• So any number of processes can easily share a sub-context by
→ pointing to the same sub-context and
→ incrementing a reference count.
• The arguments to the clone() system-call tell it
→ which sub-contexts to copy and
→ which sub-contexts to share.
• The new process is always given a new identity and a new scheduling context.
5.19 Scheduling
• Scheduling is a process of allocating CPU-time to different tasks within an OS.
• Like all UNIX systems, Linux supports preemptive multitasking.
• In such a system, the process-scheduler decides which process runs and when.
5.20 Memory Management
• The page-allocator is used to
→ allocate and free all physical-pages and
→ allocate ranges of physically-contiguous pages on demand.
• Page-allocator uses a buddy-heap algorithm to keep track of available physical-pages (Figure 5.17).
• Each allocatable memory-region is paired with an adjacent partner (hence, the name buddy-heap).
1) When two allocated partner regions are freed, they are combined to form a larger region.
2) Conversely, if a small memory-request cannot be satisfied by allocation of an existing small
free region, then a larger free region will be subdivided into two partners to satisfy the request.
• Memory allocations occur either
→ statically (drivers reserve a contiguous area of memory during system boot time) or
→ dynamically (via the page-allocator).
Figure 5.17 Splitting of memory in the buddy system.
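The splitting half of the buddy algorithm can be sketched as follows. This is a simplification of what the kernel does (free lists of offsets, block sizes in powers of two); coalescing of freed buddies is omitted:

```python
def buddy_split(free_lists, order):
    """Take a block of size 2**order, splitting larger blocks into buddy
    pairs as needed.  free_lists[k] holds offsets of free 2**k-sized blocks."""
    k = order
    while k < len(free_lists) and not free_lists[k]:
        k += 1                                # find the smallest big-enough block
    if k == len(free_lists):
        raise MemoryError("no block large enough")
    offset = free_lists[k].pop()
    while k > order:                          # split down to the requested size
        k -= 1
        free_lists[k].append(offset + 2**k)   # give one buddy back to the list
    return offset, order

# One free 256 KB region (order 8, offsets in KB); request a 64 KB block.
free_lists = [[] for _ in range(9)]
free_lists[8] = [0]
print(buddy_split(free_lists, 6))    # (0, 6): a 64 KB block at offset 0
print(free_lists[7], free_lists[6])  # [128] [64]: the buddy remainders left free
```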
5.20.2.1 Virtual-Memory-Regions
• Virtual-memory-regions can be classified by backing-store.
• Backing-store defines from where the pages for the region come.
• Most memory-regions are backed either 1) by a file or 2) by nothing.
1) By Nothing
Here, a region is backed by nothing.
The region represents demand-zero memory.
When a process reads a page in such a region, it is returned a page of memory filled
with zeros.
2) By File
A region backed by a file acts as a viewport onto a section of that file.
When the process tries to access a page within that region, the page-table is filled with the
address of a page within the kernel’s page-cache.
The same page of physical-memory is used by both the page-cache and the process’s page
tables.
• A virtual-memory-region can also be classified by its reaction to writes. 1) Private or 2) Shared.
1) If a process writes to a private-region, then the pager detects that a copy-on-write is necessary
to keep the changes local to the process.
2) If a process writes to a shared-region, the object mapped is updated into that region.
Thus, the change will be visible immediately to any other process that is mapping that object.
5.20.2.3 Swapping and Paging
• A VM system relocates pages of memory from physical-memory out to disk when that memory is
needed for another purpose.
• Paging refers to movement of individual pages of virtual-memory between physical-memory & disk.
• Paging-system is divided into 2 sections:
1) Policy algorithm decides
→ which pages to write out to disk and
→ when to write those pages.
2) Paging mechanism
→ carries out the transfer and
→ pages data back into physical-memory when they are needed again.
• Linux’s pageout policy uses a modified version of the standard clock algorithm.
• A multiple pass clock is used, and every page has an age that is adjusted on each pass of the clock.
• The age is a measure of the page’s youthfulness, or how much activity the page has seen recently.
• Frequently accessed pages will attain a higher age value, but the age of infrequently accessed pages
will drop toward zero with each pass. (LFU → least frequently used)
• This age valuing allows the pager to select pages to page out based on a LFU policy.
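The aging idea can be sketched in a few lines. The increment, cap, and decay values below are made up for illustration; Linux's actual constants and bookkeeping differ:

```python
def clock_pass(ages, referenced, boost=3, max_age=20):
    """One pass of a multi-pass clock: referenced pages gain age,
    idle pages decay toward zero (illustrative aging policy)."""
    for page in ages:
        if page in referenced:
            ages[page] = min(ages[page] + boost, max_age)
        else:
            ages[page] = max(ages[page] - 1, 0)
    return ages

def pick_victim(ages):
    """Page out the page with the lowest age (an LFU-style choice)."""
    return min(ages, key=ages.get)

ages = {"A": 5, "B": 1, "C": 8}
clock_pass(ages, referenced={"A"})   # only page A was touched this pass
print(ages)                          # {'A': 8, 'B': 0, 'C': 7}
print(pick_victim(ages))             # B -> the least frequently used page
```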
• The paging mechanism supports paging both to
1) dedicated swap devices and partitions and
2) normal files
• Blocks are allocated from the swap devices according to a bitmap of used blocks, which is
maintained in physical-memory at all times.
• The allocator uses a next-fit algorithm to try to write out pages to continuous runs of disk blocks for
improved performance.
• Each process has a pointer, brk, that points to the current extent of its data region.
Figure 5.19 Memory layout for ELF programs
Figure 5.21 Device-driver block structure.
5.22.1 Block Devices
• Block devices allow random access to completely independent, fixed-sized blocks of data.
• For example: hard disks and floppy disks, CD-ROMs and Blu-ray discs, and flash memory.
• Block devices are typically used to store file-systems.
• Block devices provide the main interface to all disk devices in a system.
• A block represents the unit with which the kernel performs I/O.
• When a block is read into memory, it is stored in a buffer.
• The request manager is the layer of software that manages the reading and writing of buffer
contents to and from a block-device-driver.
• A separate list of requests is kept for each block-device-driver.
• These requests are scheduled according to a C-SCAN algorithm.
• C-SCAN algorithm exploits the order in which requests are inserted in and removed from the lists.
• The request lists are maintained in sorted order of increasing starting-sector number.