Textual Learning Material - Module 4
Objectives
After studying this unit, you should be able to:
● Explain the concept of memory management
● Describe the process of memory swapping
● Identify different strategies used for contiguous memory allocation
● Describe the concept of paging
● Explain segmentation that supports the user view of memory
6.1 Introduction
In the previous unit, you studied the methods used for handling deadlock. In this unit, you will study memory management. Memory is the electronic holding place for instructions and data that the computer's microprocessor can reach quickly. When the computer is in normal operation, its memory usually contains the main parts of the operating system and some or all of the application programs and related data that are being used. Memory is often used as a shorter synonym for Random Access Memory (RAM). This lesson will cover the concepts of swapping and contiguous memory allocation. You will also learn about paging and segmentation.
Memory management is a very significant operating system function because multi-tasking requires the system to switch memory space from one process to another.
6.2.1 Background
Memory is central to the operation of a modern computer system. Memory consists of a large array of words or bytes, each with its own address. The CPU fetches instructions from memory according to the value of the program counter. These instructions may cause additional loading from and storing to specific memory addresses.
A typical instruction-execution cycle, for example, first fetches an instruction from
memory. The instruction is then decoded and may cause operands to be fetched from
memory. After the instruction has been executed on the operands, results may be
stored back in memory. The memory unit sees only a stream of memory addresses; it
does not know how they are generated (by the instruction counter, indexing, indirection,
literal addresses, and so on) or what they are for (instructions or data).
Accordingly, you can ignore how a memory address is generated by a program. We are interested only in the sequence of memory addresses generated by the running program.
You must understand that a program resides on a disk as a binary executable file. The program must be brought into memory and placed within a process for it to be executed. Depending on the memory management scheme in use, the process may be moved between disk and memory during its execution. The collection of processes on the disk that are waiting to be brought into memory for execution forms the input queue.
The normal procedure is to select one of the processes in the input queue and to
load that process into memory. As the process is executed, it accesses instructions and
data from memory. Eventually, the process terminates, and its memory space is
declared available.
It is important to note that address binding of instructions and data to memory
addresses can happen at three different stages.
1. Compile Time: If you know at compile time where the process will reside in
memory, then absolute code can be generated. If, at some later time, the starting
location changes, then it will be necessary to recompile this code.
2. Load Time: If it is not known at compile time where the process will reside in
memory, then the compiler must generate relocatable code.
3. Execution Time: If the process can be moved during its execution from one
memory segment to another, then binding must be delayed until run time.
Figure 6.2 is a depiction of the multi-step processing of a user program.
[Figure 6.2: Multi-step Processing of a User Program. A source program is translated by the compiler or assembler into an object module (compile time); the linkage editor combines it with other object modules into a load module; the loader, together with the system library, produces an in-memory binary memory image (load time); dynamically loaded system libraries are bound through dynamic linking at execution (run) time.]
6.2.5 Overlays
Overlaying means the replacement of a block of stored instructions or data with another. Overlaying is a programming method that allows programs to be larger than the central processing unit's main memory. It is essential to note that an embedded system would normally use overlays because of the limitation of physical memory, which is internal memory for a system-on-chip.
The method involves dividing a program into self-contained object code blocks
called overlays. The size of an overlay is limited according to memory constraints. The
place in memory where an overlay is loaded is called an overlay region or destination
region. Although the idea is to reuse the same block of main memory, multiple region
systems could be defined. The regions can be of different sizes. An overlay manager,
possibly part of the operating system, will load the required overlay from external
memory into its destination region in order to be used. Some linkers provide support for
overlays.
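As a rough illustration of the idea, here is a minimal overlay-manager sketch in Python; the overlay names, region size, and `load_from_disk` helper are hypothetical stand-ins, not drawn from any real system:

```python
REGION_SIZE = 4096   # bytes available in the single overlay region (assumed)

def load_from_disk(name):
    """Stand-in for reading an overlay's object code from external memory."""
    overlays = {"pass1": b"\x90" * 3000, "pass2": b"\x90" * 3500}
    return overlays[name]

class OverlayManager:
    def __init__(self):
        self.region = None        # contents currently occupying the overlay region
        self.loaded_name = None

    def ensure_loaded(self, name):
        """Load overlay `name` into the region, replacing whatever was there."""
        if self.loaded_name != name:
            code = load_from_disk(name)
            if len(code) > REGION_SIZE:
                raise MemoryError(f"overlay {name} exceeds the region size")
            self.region = code    # the previous overlay is simply overwritten
            self.loaded_name = name
        return self.region

mgr = OverlayManager()
mgr.ensure_loaded("pass1")   # e.g., a two-pass assembler's first pass
mgr.ensure_loaded("pass2")   # the same memory region is reused for pass 2
```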
6.3 Swapping
In this section, you will study the swapping mechanism. Any operating system has a fixed amount of physical memory available. Usually, applications need more than the physical memory installed on the system; for that purpose, the operating system uses a swap mechanism: instead of storing data in physical memory, it uses a disk file.
Swapping is the act of moving processes between memory and a backing store.
This is done to free up available memory. Swapping is necessary when there are more
processes than available memory. At the coarsest level, swapping is done a process at
a time.
To move a program from fast-access memory to a slow-access memory is known
as "swap out", and the reverse operation is known as "swap in". The term often refers
specifically to the use of a hard disk (or a swap file) as virtual memory or "swap space".
Figure 6.4: Diagram of Best Fit, Worst Fit and First Fit Memory Allocation Method
[In both layouts, memory runs from address 0 to 512K, with the operating system at one end and user processes occupying the remainder.]
Figure 6.5: Operating System Loaded in (a) Low Memory (b) High Memory
Since both the operating system and user processes reside in memory, some mechanism needs to be enforced to protect the memory allocated to the operating system from being accessed by user processes. This protection may be enforced with the help of two registers: the relocation register and the limit register. The relocation register contains the value of the smallest physical address, while the limit register contains the range of allowable logical addresses. The hardware support for this type of partitioning is shown in Figure 6.6.
[Figure 6.6: Hardware Support with Relocation and Limit Registers. Each logical address produced by the CPU is compared with the limit register; if it is smaller, it is added to the relocation register to form the physical memory address, otherwise the hardware traps to the operating system.]
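The check this hardware performs can be sketched in a few lines of Python; the register values below are assumed purely for illustration:

```python
LIMIT = 16384        # limit register: range of allowable logical addresses (assumed)
RELOCATION = 40000   # relocation register: smallest physical address (assumed)

def translate(logical_address):
    """Mimic the hardware: compare with the limit, then add the relocation value."""
    if logical_address >= LIMIT:
        raise MemoryError("trap: addressing error - logical address out of range")
    return logical_address + RELOCATION

print(translate(100))      # 40100: a legal access, relocated into the partition
# translate(20000) would trap, since 20000 >= LIMIT
```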
[Figure 6.8 shows the operating system at one end of memory with processes P1, P2 and P3 occupying the remainder, under fixed partitioning in one case and variable partitioning in the other.]
Figure 6.8 above shows a comparison between fixed and variable partition allocation.
6.5 Paging
In this section, you will learn about the concept and process of paging. A possible
solution to the external fragmentation problem is to permit the logical address space of
a process to be non-contiguous, thus allowing a process to be allocated physical
memory wherever the latter is available. One way of implementing this solution is
through the use of a paging scheme. Paging avoids the significant problem of fitting the
varying-sized memory chunks onto the backing store, from which most of the previous
memory-management schemes suffered.
When some code fragments or data residing in main memory need to be swapped
out, space must be found on the backing store.
The fragmentation problems discussed in connection with main memory are also
prevalent with backing store, except that access is much slower, so compaction is
impossible. Because of its advantages over the previous methods, paging in its various
forms is commonly used in many operating systems.
Physical memory is broken into fixed-sized blocks called frames. Logical memory is
also broken into blocks of the same size called pages. When a process is to be
executed, its pages are loaded into any available memory frames from the backing
store. The backing store is divided into fixed-sized blocks that are of the same size as
the memory frames.
Every address generated by the CPU is divided into two parts: a page number (p) and a page offset (d), where p is an index into the page table and d is the displacement within the page.
Example: For a concrete, although minuscule, example, consider the memory of Figure 6.11.
Using a page size of 4 bytes and a physical memory of 32 bytes (8 pages), we show an
example of how the user's view of memory can be mapped into physical memory.
Logical address 0 is page 0, offset 0, indexing into the page table; we find that page 0 is
in frame 5. Thus, logical address 0 maps to physical address 20 (= (5 × 4) + 0). Logical
address 3 (page 0, offset 3) maps to physical address 23 (= (5 × 4) + 3). Logical
address 4 is page 1, offset 0; according to the page table, page 1 is mapped to frame 6.
Thus, logical address 4 maps to physical address 24 (= (6 × 4) + 0). Logical address 13
maps to physical address 9.
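The arithmetic of this example can be checked with a short sketch; the first two page-table entries are stated in the text, and the entries for pages 2 and 3 are read off Figure 6.11:

```python
PAGE_SIZE = 4              # bytes per page, as in the example
page_table = [5, 6, 1, 2]  # frame number for each page (from Figure 6.11)

def translate(logical_address):
    """Split the address into page number and offset, then map through the table."""
    p, d = divmod(logical_address, PAGE_SIZE)
    return page_table[p] * PAGE_SIZE + d

for la in (0, 3, 4, 13):
    print(la, "->", translate(la))   # 0 -> 20, 3 -> 23, 4 -> 24, 13 -> 9
```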
Figure 6.11: Paging Example for a 32-byte Memory with 4-byte Pages
Every logical address is bound by the paging hardware to some physical address. The observant reader will have realised that paging is similar to using a table of base (relocation) registers, one for each frame of memory.
When using a paging scheme, you have no external fragmentation: Any free frame
can be allocated to a process that needs it. However, you may have some internal
fragmentation. Notice that frames are allocated as units. If the memory requirements of
a process do not happen to fall on page boundaries, the last frame allocated may not be
completely full.
Example: If pages are 2048 bytes, a process of 72,766 bytes would need 35 pages
plus 1,086 bytes. It would be allocated 36 frames, resulting in an internal fragmentation
of 2048 – 1086 = 962 bytes.
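The figures in this example are straightforward to reproduce:

```python
import math

PAGE_SIZE = 2048                  # bytes per page
process_size = 72766              # bytes

frames = math.ceil(process_size / PAGE_SIZE)        # 36 frames allocated
internal_frag = frames * PAGE_SIZE - process_size   # 73728 - 72766 = 962 bytes wasted
print(frames, internal_frag)                        # 36 962
```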
You will find it important to note that in the worst case, a process would need n
pages plus one byte. It would be allocated n + 1 frames, resulting in an internal
fragmentation of almost an entire frame. If process size is independent of page size, we
expect internal fragmentation to average one-half page per process. This consideration
suggests that small page sizes are desirable. However, disk I/O is more efficient when the amount of data being transferred is larger. Generally, page sizes have grown over time
as processes, data sets, and main memory have become larger. Today, pages typically
are either 2 or 4 kilobytes.
You may already be aware that when a process arrives in the system to be
executed, its size, expressed in pages, is examined. Each page of the process needs
one frame. Thus, if the process requires n pages, there must be at least n frames
available in memory. If there are n frames available, they are allocated to this arriving
process. The first page of the process is loaded into one of the allocated frames, and
the frame number is put in the page table for this process. The next page is loaded into
another frame, and its frame number is put into the page table, and so on (Figure 6.12).
[Figure 6.12 shows frames 13, 14, 18 and 20 on the free-frame list before allocation; after allocation, the new process's pages 0 to 3 occupy frames 14, 13, 18 and 20 respectively, and these frame numbers are recorded in its page table.]
Figure 6.12: Free Frames (a) Before Allocation, and (b) After Allocation
Changing page tables requires changing only this one register (the page-table base register, PTBR), substantially reducing context-switch time.
The problem with this approach is the time required to access a user memory location. If you want to access location i, you must first index into the page table, using the value in the PTBR offset by the page number for i. This task requires a memory access. It provides us with the frame number, which is combined with the page offset to produce the actual address. You can then access the desired place in memory. With this scheme, two memory accesses are needed to access a byte (one for the page-table entry, one for the byte).
Thus, memory access is slowed by a factor of 2. This delay would be intolerable
under most circumstances. We might as well resort to swapping. The standard solution
to this problem is to use a special, small, fast-lookup hardware cache, variously called
associative registers or translation look-aside buffers (TLBs). A set of associative
registers is built of especially high-speed memory. Each register consists of two parts: a
key and a value. When the associative registers are presented with an item, it is
compared with all keys simultaneously. If the item is found, the corresponding value
field is output. The search is fast; the hardware, however, is expensive. Typically, the
number of entries in a TLB varies between 8 and 2048.
Associative registers are used with page tables in the following way. The
associative registers contain only a few of the page-table entries. When a logical
address is generated by the CPU, its page number is presented to a set of associative
registers that contain page numbers and their corresponding frame numbers. If the
page number is found in the associative registers, its frame number is immediately
available and is used to access memory. The whole task may take less than 10%
longer than it would were an unmapped memory reference used.
If the page number is not in the associative registers, a memory reference to the
page table must be made. When the frame number is obtained, you can use it to
access memory. In addition, we add the page number and frame number to the
associative registers, so that they will be found quickly on the next reference. If the TLB
is already full of entries, the operating system must select one for replacement.
Unfortunately, every time a new page table is selected (for instance, each context
switch), the TLB must be flushed (erased) to ensure that the next executing process
does not use the wrong translation information. Otherwise, there could be old entries in
the TLB that contain valid virtual addresses but have incorrect or invalid physical
addresses left over from the previous process.
The percentage of times that a page number is found in the associative registers is called the hit ratio. An 80% hit ratio means that you find the desired page number in the associative registers 80% of the time.
Example: If it takes 20 nanoseconds to search the associative registers and 100
nanoseconds to access memory, then a mapped memory access takes 120
nanoseconds when the page number is in the associative registers. If you fail to find the
page number in associative registers (20 nanoseconds), then you must first access
memory for the page table and frame number (100 nanoseconds), and then access the
desired byte in memory (100 nanoseconds), for a total of 220 nanoseconds. To find the
effective memory-access time, you must weigh each case by its probability:
Effective access time = 0.80 × 120 + 0.20 × 220
= 140 nanoseconds.
In this example, you suffer a 40 per cent slowdown in memory access time (from 100 to 140 nanoseconds). With a 98% hit ratio, the effective access time would be 0.98 × 120 + 0.02 × 220 = 122 nanoseconds; this increased hit rate produces only a 22% slowdown in memory access time.
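A short sketch reproduces both computations:

```python
def effective_access_time(hit_ratio, tlb_search=20, mem_access=100):
    """Weighted average of TLB-hit and TLB-miss times, in nanoseconds."""
    hit_time = tlb_search + mem_access        # 120 ns: one memory access after a hit
    miss_time = tlb_search + 2 * mem_access   # 220 ns: page-table access plus the byte
    return hit_ratio * hit_time + (1 - hit_ratio) * miss_time

print(effective_access_time(0.80))   # 140.0 ns -> a 40% slowdown over 100 ns
print(effective_access_time(0.98))   # 122.0 ns -> a 22% slowdown
```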
You must be aware that the hit ratio is clearly related to the number of associative
registers. With the number of associative registers ranging between 16 and 512, a hit
ratio of 80% to 98% can be obtained.
Example: The Motorola 68030 processor (used in Apple Macintosh systems) has a 22-entry TLB. The Intel 80486 CPU (found in some PCs) has 32 registers, and claims a 98% hit ratio.
6.5.3 Protection
It is important to note that memory protection in a paged environment is accomplished
by protection bits that are associated with each frame. Normally, these bits are kept in
the page table. One bit can define a page to be read and write or read-only. Every
reference to memory goes through the page table to find the correct frame number. At
the same time that the physical address is being computed, the protection bits can be
checked to verify that no writes are being made to a read-only page. An attempt to write
to a read-only page causes a hardware trap to the operating system (memory-protection
violation). This approach to protection can be expanded easily to provide a finer level of
protection. You can create hardware to provide read-only, read-write, or execute-only
protection. Or, by providing separate protection bits for each kind of access, any
combination of these accesses can be allowed, and illegal attempts will be trapped to
the operating system.
One more bit is generally attached to each entry in the page table: a valid-invalid
bit. When this bit is set to "valid," this value indicates that the associated page is in the
process's logical address space, and is thus a legal (valid) page. If the bit is set to
"invalid," this value indicates that the page is not in the process's logical address space.
Illegal addresses are trapped by using the valid-invalid bit. The operating system sets
this bit for each page to allow or disallow accesses to that page.
Example: In a system with a 14-bit address space (0 to 16,383), we may have a program that should use only addresses 0 to 10,468. Given a page size of 2K, we get the situation shown in Figure 6.12. Addresses in pages 0, 1, 2, 3, 4, and 5 are mapped normally through the page table. Any attempt to generate an address in pages 6 or 7, however, finds that the valid-invalid bit is set to invalid, and the computer traps to the operating system (invalid page reference).
6.5.4 Sharing
You need to know that in a multi-programming or multi-user environment it is common for many users to be executing the same program. If individual copies of these programs were given to each user, a significant part of primary storage would be wasted. The solution is to share those pages that can be shared.
The concept of shared pages is shown in Figure 6.13.
Sharing must be carefully regulated to prevent one process from altering data that
another process is accessing.
In most systems, shared programs are divided into separate pages; that is, code and data are kept separate. Sharing is achieved by having the page-map-table entries of different processes point to the same page frame; that page frame is then shared among those processes.
6.6 Segmentation
In this section, you will understand the concept and meaning of segmentation. An
important aspect of memory management that became unavoidable with paging is the
separation of the user's view of memory and the actual physical memory. The user's
view of memory is not the same as the actual physical memory. The user's view is
mapped onto physical memory. The mapping allows differentiation between logical
memory and physical memory.
What is the user's view of memory? Does the user think of memory as a linear array
of bytes, some containing instructions and others containing data, or is there some
other preferred memory view? There is general agreement that the user or programmer
of a system does not think of memory as a linear array of bytes. Rather, the user
prefers to view memory as a collection of variable-sized segments, with no necessary
ordering among segments as you can see in figure 6.14.
[Figure 6.14: User's View of a Program. A collection of variable-sized segments such as a subroutine, the stack, the symbol table, the Sqrt routine, and the main program.]
[Figure 6.15: Segmentation Hardware. A logical address consists of a segment number s and an offset d; s indexes the segment table, and d is compared with the segment's limit; if d is legal, it is added to the segment's base, otherwise the hardware traps to the operating system.]
A legal offset is added to the segment base to produce the address in physical memory of the desired byte. The segment table is thus essentially an array of base-limit register pairs.
Example: Consider the situation shown in Figure 6.16. We have segments numbered from 0 through 4. The segments are stored in physical memory as shown. The segment table has a separate entry for each segment, giving the beginning address of the segment in physical memory (the base) and the length of that segment (the limit). For example, segment 2 is 400 bytes long and begins at location 4300. Thus, a reference to byte 53 of segment 2 is mapped onto location 4300 + 53 = 4353. A reference to segment 3, byte 852, is mapped to 3200 (the base of segment 3) + 852 = 4052. A reference to byte 1222 of segment 0 would result in a trap to the operating system, as this segment is only 1,000 bytes long.
[Figure 6.16: Example of Segmentation. Segment table: segment 0 (limit 1000, base 1400), segment 1 (limit 400, base 6300), segment 2 (limit 400, base 4300), segment 3 (limit 1100, base 3200), segment 4 (limit 1000, base 4700); the segments occupy physical memory between addresses 1400 and 6700.]
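A short sketch reproduces the example's arithmetic using the segment table of Figure 6.16:

```python
# (limit, base) pairs for segments 0 through 4, as in Figure 6.16.
segment_table = [(1000, 1400), (400, 6300), (400, 4300), (1100, 3200), (1000, 4700)]

def translate(s, d):
    """Map a (segment number, offset) pair to a physical address, or trap."""
    limit, base = segment_table[s]
    if d >= limit:
        raise MemoryError(f"trap: offset {d} exceeds segment {s} limit {limit}")
    return base + d

print(translate(2, 53))    # 4353
print(translate(3, 852))   # 4052
# translate(0, 1222) raises, since segment 0 is only 1,000 bytes long
```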
[Figure 6.17: Sharing of Segments in a Segmented Memory System. A single editor segment (limit 25286, base 43062) appears as segment 0 in the segment tables of both process P1 and process P2; P1's data segment 1 has limit 4425 and base 68348, while P2's data segment 1 has limit 8850 and base 90003.]
Each process would, however, also need separate, unique segments to store local variables. These segments, of course, would not be shared.
It is also possible to share only parts of programs.
Example: Common subroutine packages can be shared among many users if they are defined as sharable, read-only segments. Two FORTRAN programs, for instance, may
use the same Sqrt subroutine, but only one physical copy of the Sqrt routine would be
needed.
Although this sharing appears simple, there are subtle considerations. Code
segments typically contain references to themselves.
Example: A conditional jump normally has a transfer address.
The transfer address is a segment number and offset. The segment number of the
transfer address will be the segment number of the code segment. If you try to share
this segment, all sharing processes must define the shared code segment to have the
same segment number.
Example: If you want to share the Sqrt routine, and one process wants to make
segment 4 and another wants to make it segment 17, how should the Sqrt routine refer
to itself? Because there is only one physical copy of Sqrt, it must refer to itself in the
same way for both users – it must have a unique segment number.
As the number of users sharing the segment increases, so does the difficulty of
finding an acceptable segment number. Read-only data segments that contain no physical pointers may be shared as different segment numbers, as may code segments that refer to themselves not directly, but only indirectly.
Example: Conditional branches that specify the branch address as an offset from
the current program counter or relative to a register containing the current segment
number would allow code to avoid direct reference to the current segment number.
Source: http://fourier.eng.hmc.edu/e85_old/lectures/memory/node9.html
6.8 Summary
Memory consists of a large array of words or bytes, each with its own address. Dynamic
loading is the process in which one can attach a shared library to the address space of
the process during execution. Dynamic linking is accomplished by placing the name of a
sharable library in the executable image. The method assumes dividing a program into
self-contained object code blocks called overlays. A memory management unit (MMU)
is a computer hardware component responsible for handling accesses to memory
requested by the CPU. Swapping is necessary when there are more processes than
available memory. To move a program from fast-access memory to a slow-access
memory is known as "swap out", and the reverse operation is known as "swap in".
Different strategies used to allocate space to processes competing for memory are:
Best fit, Worst fit, and First fit. Paging avoids the considerable problem of fitting the
varying-sized memory chunks onto the backing store, from which most of the previous
memory-management schemes suffered. Segmentation is a memory-management
scheme that supports the user view of memory. A segmentation-with-paging scheme removes external fragmentation.
Objectives
After studying this unit, you should be able to:
● Explain the concept of virtual memory
● Describe the process of demand paging
● Identify the methods for process creation
● Recognise the need for page replacement
● Describe page replacement algorithms
● Explain the standard methods of frame allocation
● Describe the concept of thrashing
7.1 Introduction
In the previous unit, you studied memory management concepts such as swapping and contiguous memory allocation. You also learned about paging and segmentation.
In this unit, you will study the concept of virtual memory. Today, applications are getting bigger and bigger, and therefore require more system memory to hold their data, instructions, and threads and to load them. The system needs to copy the application data from the HDD into system memory in order to process and execute it. Once the memory fills up with data, the system will stop loading the program. In this case, users need to add more memory to their system to support that intense application. However, adding more system memory costs money, and a normal user may need to run the memory-intensive application for only one or two days. Virtual memory was therefore introduced to solve this type of problem. This lesson will cover the concepts of demand paging, process creation, page replacement, and allocation of frames. You will also learn about thrashing in this lesson.
Virtual memory is utilised to extend the effective RAM capacity and to speed up the operations of the computer. When RAM runs low, data is transferred from RAM to a disk area named the paging file. The process of transferring data to the paging file frees up RAM to perform its task.
[Figure: virtual memory larger than physical memory. Pages 0 through n are mapped through a memory map onto physical memory.]
[Figure: swapping. Program A is swapped out of main memory and program B is swapped in, one process at a time.]
[Figure 7.3 shows logical memory with pages A through H; the page table marks pages A (frame 4), C (frame 6) and F (frame 9) as valid (v), while the remaining pages are marked invalid (i) because they are not in main memory.]
Figure 7.3: Page Table when some Pages are not in Main Memory
The procedure for handling this page fault is simple, as you can see in Figure 7.4:
1. Check an internal table (usually kept with the process control block) for the process
under consideration, to determine whether the reference was a valid or invalid
memory access.
2. If the reference was invalid, we terminate the process. If it was valid, but we have not yet brought that page into memory, we now page it in.
3. We find a free frame (for example, by taking one from the free-frame list).
4. We schedule a disk operation to read the desired page into the newly allocated frame.
5. When the disk read is complete, we modify the internal table kept with the process and the page table to indicate that the page is now in memory.
6. We restart the instruction that was interrupted by the illegal address trap. The process can now access the page as though it had always been in memory.
[Figure 7.4: Steps in Handling a Page Fault. (1) The process references a page, (2) the hardware traps to the operating system, (3) the page is located in the backing store, (4) the missing page is brought into a free frame, (5) the page table is reset, (6) the interrupted instruction is restarted.]
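The six steps above can be expressed as a small sketch; the table names and the `read_from_disk` helper are hypothetical illustrations, not a real kernel interface:

```python
# Hypothetical per-process structures; none of these names are a real kernel API.
valid = {0: True, 1: True, 2: True}   # pages in the process's logical address space
page_table = {0: 5, 1: 6}             # page -> frame for pages already in memory
free_frames = [7, 3]                  # the free-frame list

def read_from_disk(page, frame):
    print(f"disk read: page {page} -> frame {frame}")   # stand-in for the real I/O

def handle_page_fault(page):
    if not valid.get(page, False):    # steps 1-2: was the reference even legal?
        raise MemoryError("trap: invalid memory access; terminate the process")
    frame = free_frames.pop()         # step 3: take a frame from the free-frame list
    read_from_disk(page, frame)       # step 4: schedule the disk read
    page_table[page] = frame          # step 5: record that the page is now in memory
    # step 6: the hardware then restarts the interrupted instruction

handle_page_fault(2)   # afterwards page_table == {0: 5, 1: 6, 2: 3}
```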
It is important to realise that, because we save the state (registers, condition code, instruction counter) of the interrupted process when the page fault occurs, you can restart the process in exactly the same place and state, except that the desired page is now in memory and is accessible. In this way, you are able to execute a process even though portions of it are not (yet) in memory. When the process tries to access locations that are not in memory, the hardware traps to the operating system; this is known as a "page fault". The operating system reads the desired page into memory and restarts the process as though the page had always been in memory.
In the extreme case, you could start executing a process with no pages in memory. When the operating system sets the instruction pointer to the first instruction of the process, which is on a non-resident page, the process immediately faults for the page. After this page is brought into memory in response to the page fault, the process continues to execute, faulting as necessary until every page that it needs is actually in memory. At that point, it can execute with no more faults. This scheme is pure demand paging: never bring a page into memory until it is required.
Theoretically, you must understand that some programs may access several new
pages of memory with each instruction execution (one page for the instruction and
many for data), possibly causing multiple page faults per instruction. This situation
would result in unacceptable system performance because of the overheads involved.
Fortunately, analyses of running processes show that this behaviour is exceedingly
unlikely. Programs tend to have locality of reference, which results in reasonable
performance from demand paging.
The hardware to support demand paging is the same as the hardware for paging
and swapping:
● Page Table: This table has the ability to mark an entry invalid through a valid-invalid bit or special value of protection bits.
● Secondary Memory: This memory holds those pages that are not present in main memory. The secondary memory is usually a high-speed disk. It is known as the swap device, and the section of disk used for this purpose is known as swap space or backing store.
In addition to this hardware support, considerable software is needed, as we shall
see. Some additional architectural constraints must be imposed. A crucial issue is the
need to be able to restart any instruction after a page fault. In most cases, this
requirement is easy to meet. A page fault could occur at any memory reference. If the
page fault occurs on the instruction fetch, we can restart by fetching the instruction
again.
If a page fault occurs while we are fetching an operand, not an instruction, then we
must re-fetch the instruction, decode it again, and then fetch the operand.
As a worst-case scenario, consider a three-address instruction such as ADD the content of A to B, placing the result in C. The steps to execute this instruction would be:
● Fetch and decode the instruction (ADD).
● Fetch A.
● Fetch B.
● Add A and B.
● Store the sum in C.
If we faulted when we tried to store in C (because C is in a page not currently in
memory), we would have to get the desired page in it, correct the page table, and
restart the instruction. The restart would require fetching the instruction again, decoding
it again, fetching the two operands again, and then adding again. However, there is
really not much repeated work (less than one complete instruction), and the repetition is
necessary only when a page fault occurs.
Let p be the probability of a page fault (0 ≤ p ≤ 1). We would expect p to be close to zero; that is, there will be only a few page faults. The effective access time is then
effective access time = (1 - p) × ma + p × page-fault time,
where ma denotes the memory-access time.
To compute the effective access time, you must know how much time is needed to
service a page fault.
A page fault causes the following sequence to occur:
1. Cause trap to the operating system.
2. Save the user registers and process state on the memory stack of the process.
3. Determine that the interrupt was a page fault.
4. Check that the page reference was legal and determine location of the page on the
disk.
5. Issue a read from the disk to a free frame:
(a) Wait in a queue for the device until the read request is serviced.
(b) Wait for the device seek and/or latency time.
(c) Begin the transfer of the page to a free frame.
6. While waiting, allocate the CPU to some other user (CPU scheduling; optional).
7. Interrupt from the disk (I/O completed).
8. Save the registers and process state for the other user (if step 6 executed).
9. Determine that the interrupt was from the disk.
10. Correct the page table and other tables to show that the desired page is now in
memory.
11. Wait for the CPU to be allocated to this process again.
12. Restore the user registers, process state, and new page table, and then resume the
interrupted instruction.
Not all of these steps may be necessary in every case.
Example: We are assuming that, in step 6, the CPU is allocated to another process
while the I/O occurs. This arrangement allows multi-programming to maintain CPU
utilisation, but requires additional time to resume the page-fault service routine when
the I/O transfer is complete.
In any case, we are faced with three major components of the page-fault service
time:
● Service the page-fault interrupt
● Read in the page
● Restart the process
The first and third tasks may be reduced, with careful coding, to several hundred
instructions. These tasks may take from 1 to 100 microseconds each. The page-switch
time, on the other hand, will probably be close to 24 milliseconds. A typical hard disk
has an average latency of 8 milliseconds, a seek-time of 15 milliseconds, and a transfer
time of 1 millisecond. Thus, the total paging time would be close to 25 milliseconds,
including hardware and software time. Remember also that you are looking at only the device-service time. If a queue of processes is waiting for the device (other processes that have caused page faults), you have to add device-queuing time as you wait for the paging device to be free to service your request, increasing the swap time even further.
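Plugging these numbers into the effective-access-time formula shows how sensitive performance is to the page-fault rate; the fault probability used below is an assumed value for illustration:

```python
MA = 100                   # memory-access time in nanoseconds (from the text)
FAULT_TIME = 25_000_000    # page-fault service time: 25 milliseconds in nanoseconds

def effective_access_time(p):
    """effective access time = (1 - p) * ma + p * page-fault time."""
    return (1 - p) * MA + p * FAULT_TIME

print(effective_access_time(0.001))   # ~25,100 ns for one fault per 1,000 accesses
# Even p = 0.001 slows memory access down by a factor of roughly 250.
```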
It is essential to note that this way of creating an address space involved many memory accesses, used up many CPU cycles, and completely spoiled the cache contents. Last but not least, it was often pointless, because many child processes start their execution by loading a new program, thus discarding entirely the inherited address space.
Modern Unix kernels, including Linux, follow a more efficient approach called Copy
On Write (COW). The idea is quite simple: instead of duplicating page frames, they are
shared between the parent and the child process. However, as long as they are shared,
they cannot be modified. Whenever the parent or the child process attempts to write
into a shared page frame, an exception occurs. At this point, the kernel duplicates the
page into a new page frame that it marks as writable. The original page frame remains
write-protected: when the other process tries to write into it, the kernel checks whether
the writing process is the only owner of the page frame; in such a case, it makes the
page frame writable for the process.
Example: Copy-on-write method is shown in Figure 7.5 and Figure 7.6, which
display the physical memory contents before and after process 1 modifies page C.
Figure 7.5 shows the contents before process 1 modifies page C:
Source: http://www.cs.odu.edu/~cs471w/spring11/lectures/virtualmemory.htm
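A minimal sketch of the copy-on-write bookkeeping, with a reference count standing in for the kernel's page-frame descriptor (all names here are illustrative):

```python
# One physical frame "f1" holding page C, shared by parent and child after fork().
frames = {"f1": {"data": b"page C", "refcount": 2}}
page_tables = {"parent": {"C": "f1"}, "child": {"C": "f1"}}
next_frame = [2]   # simple counter for naming newly allocated frames

def write(process, page, new_data):
    """Write to a page, duplicating its frame first if it is shared (COW)."""
    frame = page_tables[process][page]
    if frames[frame]["refcount"] > 1:             # shared: must copy before writing
        frames[frame]["refcount"] -= 1
        new = f"f{next_frame[0]}"
        next_frame[0] += 1
        frames[new] = {"data": frames[frame]["data"], "refcount": 1}
        page_tables[process][page] = new          # remap only the writing process
        frame = new
    frames[frame]["data"] = new_data              # sole owner: write in place

write("parent", "C", b"modified")   # parent now maps C -> f2; child still sees f1
```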
[Figure: page replacement. Pages held on the secondary memory medium (hard disk) are swapped into physical memory (RAM), where resident pages that are no longer needed may be replaced.]
We can free a frame by writing its contents to swap space and changing the page table to indicate that the page is no longer in memory. The freed frame can now be used to hold the page for which the process faulted. The page-fault service routine is now modified to include page replacement:
● Find the location of the desired page on the disk.
● Find a free frame:
   – If there is a free frame, use it.
   – Otherwise, use a page-replacement algorithm to select a victim frame; write the victim page to the disk and change the page and frame tables accordingly.
● Read the desired page into the (newly) free frame; change the page and frame tables.
● Restart the user process.
You must notice that, if no frames are free, two page transfers (one out and one in)
are required. This situation effectively doubles the page-fault service time and will
increase the effective access time accordingly. This overhead can be reduced by the
use of modify (dirty) bit. Each page or frame may have a modify bit associated with it in
the hardware. The modify bit for a page is set by the hardware whenever any word or
byte in the page is written into, indicating that the page has been modified.
When you select a page for replacement, you examine its modify bit. If the bit is
set, you know that the page has been modified since it was read in from the disk. In this
case, you must write that page to the disk. If the modify bit is not set, however, the page
has not been modified since it was read into memory. Therefore, if the copy of the page
on the disk has not been overwritten (by some other page, for example), you can avoid
writing the memory page to the disk; it is already there. This technique also applies to
read-only pages (for example, pages of binary code). Such pages cannot be modified;
thus, they may be discarded when desired. This scheme can reduce significantly the
time to service a page fault, since it reduces I/O time by one-half if the page is not
modified.
[Figure: page replacement. (1) Swap out the victim page, (2) change its page-table entry to invalid, (3) swap the desired page in, (4) reset the page table for the new page.]
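Putting the replacement steps and the modify-bit optimisation together, a sketch might look like the following; the frame table, victim queue, and I/O helpers are hypothetical stand-ins:

```python
import collections

free_frames = []                                  # assume no free frames remain
fifo_queue = collections.deque([3, 8])            # candidate victims, oldest first
frame_table = {3: {"page": 7, "dirty": True},     # page 7 has been written to
               8: {"page": 1, "dirty": False}}

def write_to_disk(page): print(f"write page {page} to swap space")
def read_from_disk(page, frame): print(f"read page {page} into frame {frame}")

def get_frame():
    """Return a usable frame, evicting a victim if no frame is free."""
    if free_frames:
        return free_frames.pop()
    victim = fifo_queue.popleft()                 # select a victim (FIFO here)
    if frame_table[victim]["dirty"]:              # only modified pages need write-back
        write_to_disk(frame_table[victim]["page"])
    return victim

frame = get_frame()        # evicts frame 3; its page 7 is dirty, so it is written out
read_from_disk(4, frame)   # bring the faulting page (say, page 4) into the frame
```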
FIFO Algorithm
It will be important for you to understand that the simplest page-replacement algorithm is a FIFO algorithm. A FIFO replacement algorithm associates with each page the time when that page was brought into memory. When a page must be replaced, the oldest page is selected. It is not strictly necessary to record the actual time; instead, you can create a FIFO queue to hold all pages in memory. You replace the page at the head of the queue, and when a page is brought into memory, you insert it at the tail of the queue.
Example: Consider the reference string shown in Figure 7.10, with three frames that are initially empty. The first three references (7, 0, 1) cause page faults, and are brought into these empty frames. The next reference (2) replaces page 7, because page 7 was brought in first.
Since 0 is the next reference and 0 is already in memory, we have no fault for this
reference. The first reference to 3 results in page 0 being replaced, since it was the
oldest of the three pages in memory (0, 1, and 2) to be brought in. This replacement
means that the next reference, to 0, will fault. Page 1 is then replaced by page 0. This
process continues as shown in Figure 7.10. Every time a fault occurs, we show which
pages are in our three frames. There are 15 faults altogether.
Reference string: 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1
Page frames at each of the 15 faults:
7 7 7 2 2 2 4 4 4 0 0 0 7 7 7
  0 0 0 3 3 3 2 2 2 1 1 1 0 0
    1 1 1 0 0 0 3 3 3 2 2 2 1
Figure 7.10: FIFO Page-Replacement Algorithm
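A short simulation confirms the count of 15 faults:

```python
from collections import deque

def fifo_faults(reference_string, n_frames):
    """Count page faults under FIFO replacement."""
    frames, queue, faults = set(), deque(), 0
    for page in reference_string:
        if page not in frames:
            faults += 1
            if len(frames) == n_frames:          # memory full: evict the oldest page
                frames.discard(queue.popleft())
            frames.add(page)
            queue.append(page)
    return faults

refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(fifo_faults(refs, 3))   # 15
```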
[Figure: page-fault curve for FIFO replacement. The number of page faults is plotted against the number of frames (1 to 7), showing that faults do not always decrease as frames are added.]
Optimal Algorithm
It is important to note that one result of the discovery of Belady's anomaly was the
search for an optimal page-replacement algorithm. An optimal page-replacement
algorithm has the lowest page-fault rate of all algorithms.
An optimal algorithm will never suffer from Belady's anomaly.
An optimal page-replacement algorithm exists, and has been called OPT or MIN.
Stated simply, it is:
● Replace the page that will not be used for the longest period of time.
● Use of this page-replacement algorithm guarantees the lowest possible page-fault rate for a fixed number of frames.
Example: On our sample reference string, the optimal page-replacement algorithm
would yield nine page faults, as shown in Figure 7.12. The reference to page 2 replaces
page 7, because 7 will not be used until reference 18, whereas page 0 will be used at 5,
and page 1 at 14. The reference to page 3 replaces page 1, as page 1 will be the last of
the three pages in memory to be referenced again. With only nine page faults, optimal
replacement is much better than a FIFO algorithm, which had 15 faults.
LRU Algorithm
As stated earlier, you must be aware that it is very difficult to implement the optimal algorithm. Yet an approximation to this algorithm may be employed, and that is LRU (Least Recently Used) replacement. The algorithm associates with each page brought into memory the time at which the page was last used. When a page needs to be replaced, the page that has not been used for the longest period of time is selected for replacement.
Example: You can see this in Figure 7.13.
Figure 7.13: Replace the Page that has not been used for the Longest Period of Time
However, even this algorithm needs a lot of hardware support for its implementation. LRU looks backward in time, whereas the optimal algorithm looks forward. The algorithm requires extensive hardware support in terms of counters and a stack.
Example: Figure 7.14 shows the replacement scheme.
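Both the optimal and the LRU algorithms are easy to simulate. On the sample reference string with three frames, the sketch below reproduces the nine faults quoted for the optimal algorithm and yields twelve faults for LRU:

```python
def lru_faults(refs, n_frames):
    """Count page faults when the least recently used page is replaced."""
    frames, faults = [], 0            # list kept in order from LRU to MRU
    for page in refs:
        if page in frames:
            frames.remove(page)       # hit: refresh the page's recency
        else:
            faults += 1
            if len(frames) == n_frames:
                frames.pop(0)         # evict the least recently used page
        frames.append(page)
    return faults

def opt_faults(refs, n_frames):
    """Count faults when the page unused for the longest future time is replaced."""
    frames, faults = [], 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) == n_frames:
            def next_use(p):          # index of the page's next reference
                return refs.index(p, i + 1) if p in refs[i + 1:] else float("inf")
            frames.remove(max(frames, key=next_use))
        frames.append(page)
    return faults

refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(opt_faults(refs, 3), lru_faults(refs, 3))   # 9 12
```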
7.7 Thrashing
Now let us discuss the concept of thrashing. A process that is spending more time paging than executing is said to be thrashing. Thrashing happens when a hard drive has to move its heads over the swap area many times due to a high number of page faults, which occur because the needed pages are not located in main memory.
Thrashing arises as memory pages are swapped out to disk only to be paged in again soon afterwards. Instead of memory accesses being satisfied mainly from main memory, they go mainly to disk; the processes become slow because disk access is required for many memory pages.
It is important to note that the operating system monitors CPU utilisation. Early process-scheduling schemes would control the multiprogramming level based on CPU utilisation, admitting more processes when CPU utilisation was low.
The difficulty is that when memory filled up and processes began spending a huge amount of time waiting for their pages to be paged in, CPU utilisation would drop, causing the scheduler to add even more processes and worsening the problem. Ultimately the system would essentially grind to a halt.
You can see this in Figure 7.15.
Source: http://www.cs.uic.edu/~jbell/CourseNotes/OperatingSystems/9_VirtualMemory.html
7.10 Summary
Virtual memory permits the execution of processes that may not be completely resident
in the memory. Virtual memory is commonly implemented by Demand Paging. It can
also be implemented in a segmentation system. In demand paging, rather than
swapping the entire process into memory, a lazy swapper is used, which swaps a page
into memory only when that page is needed. The idea of Copy on Write (COW) is quite simple: instead of duplicating page frames, they are shared between the parent and the child process. Page replacement is basic to demand paging. It completes the separation between logical memory and physical memory. A FIFO replacement algorithm associates with each page the time when that page was brought into memory. An optimal page-replacement algorithm has the lowest page-fault rate of all algorithms. The LRU algorithm associates with each page brought into memory the time at which the page was last used. Standard methods used for frame allocation include equal allocation and proportional allocation. Thrashing happens when a hard drive has to move its heads over the swap area many times due to a high number of page faults.