Unit-4 Class
Execution time: Binding delayed until run time if the process can be
moved during its execution from one memory segment to another
Logical vs. Physical Address Space
Worst-fit: Allocate the largest hole; must also search entire list
Produces the largest leftover hole
First-fit and best-fit better than worst-fit in terms of speed and storage
utilization
Fragmentation
To run a program of size N pages, need to find N free frames and load
program
Page offset (d) – combined with base address to define the physical memory address that is sent to the memory unit
In this scheme every data/instruction access requires two memory accesses: one for the page table and one for the data/instruction.
The two-memory-access problem can be solved by the use of a special fast-lookup hardware cache called associative memory or translation look-aside buffer (TLB)
Implementation of Page Table (Cont.)
Some TLBs store address-space identifiers (ASIDs) in each TLB
entry – uniquely identifies each process to provide address-space
protection for that process
On a TLB miss, value is loaded into the TLB for faster access next
time
Replacement policies must be considered
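As a sketch of how ASID-tagged lookups and a simple replacement policy interact, here is a toy fully associative TLB (the class and field names are illustrative, not taken from any real MMU; random replacement is just one possible policy):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Toy fully associative TLB tagged with address-space identifiers (ASIDs).
// A lookup hits only when both the ASID and the virtual page number match,
// which is what gives each process address-space protection.
public class TinyTlb {
    record Entry(int asid, int vpn, int frame) {}

    private final int capacity;
    private final List<Entry> entries = new ArrayList<>();
    private final Random rng = new Random(42);

    public TinyTlb(int capacity) { this.capacity = capacity; }

    // Returns the frame number, or -1 on a TLB miss.
    public int lookup(int asid, int vpn) {
        for (Entry e : entries)
            if (e.asid() == asid && e.vpn() == vpn) return e.frame();
        return -1;
    }

    // On a miss, the translation fetched from the page table is loaded
    // into the TLB; a random victim is evicted when the TLB is full.
    public void insert(int asid, int vpn, int frame) {
        if (entries.size() == capacity)
            entries.remove(rng.nextInt(entries.size()));
        entries.add(new Entry(asid, vpn, frame));
    }
}
```

Note that a lookup with the right page number but the wrong ASID still misses, so one process can never use another process's cached translations.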
Paging Hardware With TLB
Shared Pages
Shared code
One copy of read-only (reentrant) code shared among processes (e.g., text editors, compilers, window systems).
Private code and data
Each process keeps a separate copy of the code and data.
The pages for the private code and data can appear anywhere in the logical address space.
Shared Pages Example
Structure of the Page Table
Hierarchical Paging
Since the page table is paged, the page number is further divided into:
a 10-bit page number (p1)
a 10-bit page offset (p2)
Thus, a 32-bit logical address with a 12-bit page offset d is divided as: p1 (10 bits) | p2 (10 bits) | d (12 bits)
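Extracting the two page-table indices and the offset is plain bit manipulation. A minimal sketch, assuming the usual two-level example of a 32-bit logical address split into a 10-bit outer index, a 10-bit inner index, and a 12-bit offset:

```java
// Splitting a 32-bit logical address for a two-level page table:
// outer page number p1 (10 bits), inner page number p2 (10 bits),
// page offset d (12 bits).
public class TwoLevelAddress {
    static final int OFFSET_BITS = 12;  // page size 4 KB
    static final int INNER_BITS = 10;   // 1024 entries per inner table

    static int p1(long addr) { return (int) (addr >>> (OFFSET_BITS + INNER_BITS)) & 0x3FF; }
    static int p2(long addr) { return (int) (addr >>> OFFSET_BITS) & 0x3FF; }
    static int d(long addr)  { return (int) (addr & 0xFFF); }
}
```

For example, address 0x00403ABC splits into p1 = 1, p2 = 3, d = 0xABC.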
Hashed Page Tables
Common in address spaces larger than 32 bits.
The virtual page number is hashed into a page table; this table contains a chain of elements hashing to the same location.
Each element contains (1) the virtual page number, (2) the value of the mapped page frame, and (3) a pointer to the next element.
Virtual page numbers are compared in this chain, searching for a match; if a match is found, the corresponding physical frame is extracted.
Inverted Page Table
One entry for each real page (frame) of memory.
Each entry consists of the virtual address of the page stored in that real memory location, with information about the process that owns that page.
Decreases memory needed to store each page table, but increases time needed to search the table when a page reference occurs.
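The hashed page table's chain-per-bucket structure can be sketched directly (a toy model; real implementations store the chains in physical memory and hash much larger virtual page numbers):

```java
// Toy hashed page table: the virtual page number hashes to a bucket,
// each bucket holds a chain of (vpn, frame, next) elements, and the
// chain is searched for a matching vpn.
public class HashedPageTable {
    static class Element {
        final long vpn;
        final int frame;
        final Element next;
        Element(long vpn, int frame, Element next) {
            this.vpn = vpn; this.frame = frame; this.next = next;
        }
    }

    private final Element[] buckets;

    public HashedPageTable(int size) { buckets = new Element[size]; }

    private int hash(long vpn) { return (int) Math.floorMod(vpn, buckets.length); }

    // Colliding entries are simply chained in front of the bucket.
    public void map(long vpn, int frame) {
        int h = hash(vpn);
        buckets[h] = new Element(vpn, frame, buckets[h]);
    }

    // Walk the chain comparing virtual page numbers; -1 if no match.
    public int lookup(long vpn) {
        for (Element e = buckets[hash(vpn)]; e != null; e = e.next)
            if (e.vpn == vpn) return e.frame;
        return -1;
    }
}
```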
Logical address space can therefore be much larger than physical address
space.
Page replacement – find some page in memory, but not really in use, and swap it out.
Need a page-replacement algorithm.
Performance – want an algorithm which will result in a minimum number of page faults.
- Copy-on-Write
- Memory-Mapped Files
Copy-on-Write
Copy-on-Write (COW) allows both parent and child processes to
initially share the same pages in memory.
If either process modifies a shared page, only then is the page
copied.
COW allows more efficient process creation as only modified
pages are copied.
Copy-on-Write is a common technique used by several
operating systems when duplicating processes, including
Windows 2000, Linux, and Solaris 2.
Free pages for the stack and heap or copy-on-write are
allocated from a pool of zeroed-out pages.
vfork() (for virtual memory fork) is available in several versions
of UNIX (including Solaris 2) and can be used when the child
process calls exec() after creation.
Memory-Mapped Files
Memory-mapped file I/O allows file I/O to be treated as routine memory
access by mapping a disk block to a page in memory.
A file is initially read using demand paging. A page-sized portion of the file is
read from the file system into a physical page. Subsequent reads/writes
to/from the file are treated as ordinary memory accesses.
Simplifies file access by treating file I/O through memory rather than read()
and write() system calls.
Also allows several processes to map the same file allowing the pages in
memory to be shared.
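Java exposes memory-mapped file I/O through `java.nio`: `FileChannel.map()` maps a region of a file into memory, after which reads and writes are ordinary memory accesses. A minimal round-trip sketch (the file name and sizes are arbitrary):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Memory-mapped file I/O: after FileChannel.map(), touching the buffer
// reads and writes the underlying file without read()/write() calls.
public class MappedFileDemo {
    public static byte roundTrip() {
        try {
            Path p = Files.createTempFile("mmap-demo", ".bin");
            try (FileChannel ch = FileChannel.open(p,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
                buf.put(0, (byte) 42);  // a plain memory store into the file
                return buf.get(0);      // and a plain memory load back out
            } finally {
                Files.deleteIfExists(p);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Two processes that map the same file see the same physical pages, which is how mapping also provides page sharing.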
Memory Mapped Files
Page Replacement
An operating system has to decide how much memory to allocate to I/O and how much to program pages, so as to prevent over-allocation of memory.
1. Find the location of the desired page on disk.
2. Find a free frame: if there is a free frame, use it; if not, use a page-replacement algorithm to select a victim frame.
3. Read the desired page into the (newly) free frame. Update the page and frame tables.
4. Restart the process.
Only the page which has been modified will be written back to
the disk.
Page Replacement
Page Replacement Algorithms
We must solve two major problems to implement demand paging. We must develop:
A frame-allocation algorithm – How many frames are
allocated to each process.
A page-replacement algorithm – Which frames are
selected to be replaced.
Which algorithm? We want lowest page-fault rate.
Evaluate algorithm by running it on a particular string of memory
references (reference string) and computing the number of
page faults on that string.
Reference strings can be generated by a random-number generator or taken from a trace of the system's memory references.
In all our examples, the reference string is
1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5.
Graph of Page Faults Versus The Number of Frames
First-In-First-Out (FIFO) Algorithm
Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
3 frames (3 pages can be in memory at a time per process)
3 frames (3 pages can be in memory at a time per process): 9 page faults
4 frames: 10 page faults
FIFO replacement exhibits Belady's Anomaly: adding more frames can produce more page faults.
FIFO Page Replacement
FIFO Illustrating Belady's Anomaly
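The FIFO fault counts (and Belady's anomaly) can be checked with a short simulation; on the reference string 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 it reports 9 faults with 3 frames but 10 with 4:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Count FIFO page faults: on a fault with all frames full, the page
// loaded earliest is evicted, regardless of how recently it was used.
public class FifoFaults {
    public static int count(int[] refs, int frames) {
        Deque<Integer> queue = new ArrayDeque<>(); // front = oldest page
        Set<Integer> resident = new HashSet<>();
        int faults = 0;
        for (int page : refs) {
            if (resident.contains(page)) continue;    // hit: nothing changes
            faults++;
            if (queue.size() == frames)
                resident.remove(queue.removeFirst()); // evict oldest page
            queue.addLast(page);
            resident.add(page);
        }
        return faults;
    }
}
```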
Optimal Algorithm
Replace page that will not be used for longest period of time.
4 frames example
1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
Result: 6 page faults.
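The optimal policy needs future knowledge of the reference string, which is why it is only usable as a benchmark; a simulation sketch that confirms the 6 faults above:

```java
import java.util.HashSet;
import java.util.Set;

// Count page faults for the optimal (OPT) algorithm: on a fault with all
// frames full, evict the resident page whose next use lies farthest in
// the future (or that is never used again).
public class OptFaults {
    public static int count(int[] refs, int frames) {
        Set<Integer> resident = new HashSet<>();
        int faults = 0;
        for (int i = 0; i < refs.length; i++) {
            if (resident.contains(refs[i])) continue;   // hit
            faults++;
            if (resident.size() == frames) {
                int victim = -1, farthest = -1;
                for (int page : resident) {
                    int next = Integer.MAX_VALUE;       // never used again
                    for (int j = i + 1; j < refs.length; j++)
                        if (refs[j] == page) { next = j; break; }
                    if (next > farthest) { farthest = next; victim = page; }
                }
                resident.remove(victim);
            }
            resident.add(refs[i]);
        }
        return faults;
    }
}
```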
Counter implementation
Every page entry has a counter; every time the page is referenced through this entry, copy the clock into the counter.
When a page needs to be replaced, look at the counters to determine which page to replace (the one with the smallest time value).
LRU Page Replacement
LRU Algorithm (Cont.)
How to implement LRU replacement?
Counter implementation – associate with each page-table entry
a time-of-use field, and add a logical clock or counter.
The contents of the clock register are copied to the time-of-use field.
Search the page table and replace the page with the smallest time value.
Stack implementation – keep a stack of page numbers in a
double link form:
Page referenced:
move it to the top
requires 6 pointers to be changed at worst
No search for replacement
Neither implementation of LRU would be conceivable without
hardware assistance beyond the standard TLB registers.
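In software, the doubly linked stack can be simulated with an access-ordered `LinkedHashMap`: each reference moves the page to the most-recently-used end and the eldest entry is the victim. A sketch (it gives 8 faults on the example string with 4 frames):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Count LRU page faults. The access-ordered LinkedHashMap plays the role
// of the doubly linked stack: every put() moves the page to the
// most-recently-used end, and removeEldestEntry() evicts the LRU page.
public class LruFaults {
    public static int count(int[] refs, int frames) {
        Map<Integer, Boolean> stack = new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> e) {
                return size() > frames; // evict least recently used
            }
        };
        int faults = 0;
        for (int page : refs)
            if (stack.put(page, Boolean.TRUE) == null) faults++; // absent = fault
        return faults;
    }
}
```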
Use Of A Stack to Record The Most Recent Page References
LRU Approximation Algorithms
Not-recently-used (NRU)
Each page has a reference bit and modified bit.
R = referenced (read or written recently)
M = modified (written to)
When a process starts, both bits R and M are set to 0 for all
pages.
Bits are set when page is referenced and/or modified.
Pages are classified into four classes
class R M
----- --- ---
0 0 0
1 0 1
2 1 0
3 1 1
LRU Approximation Algorithms
Not-recently-used (NRU)
Periodically (e.g., on each clock interrupt, every 20 msec), the R bit is cleared (i.e., R = 0).
The NRU algorithm selects to remove a page at random
from the lowest numbered, nonempty class.
Easy to understand and implement.
Performance adequate (though not optimal)
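NRU victim selection reduces to computing class = 2·R + M per page and picking at random within the lowest nonempty class. A minimal sketch with illustrative names:

```java
import java.util.List;
import java.util.Random;

// NRU sketch: classify pages by their (R, M) bits into class 2*R + M
// and evict a random page from the lowest-numbered nonempty class.
public class Nru {
    record Page(int id, boolean referenced, boolean modified) {
        int nruClass() { return (referenced ? 2 : 0) + (modified ? 1 : 0); }
    }

    public static Page selectVictim(List<Page> pages, Random rng) {
        int lowest = pages.stream().mapToInt(Page::nruClass).min().orElseThrow();
        List<Page> candidates = pages.stream()
                .filter(p -> p.nruClass() == lowest)
                .toList();
        return candidates.get(rng.nextInt(candidates.size()));
    }
}
```

With pages in classes 3, 1, and 2, the class-1 page (not referenced recently, but modified) is the victim.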
LRU Approximation Algorithms
First-In, First-Out: (FIFO)
a list is maintained, with the oldest page at the front of the
list.
MFU Algorithm: based on the argument that the page with the
smallest count was probably just brought in and has yet to be
used.
Page-buffering algorithm
Keep a pool of free frames.
Maintain a list of modified pages – write out the modified
page when the paging device is idle.
Keep a pool of free frames, but remember which page was
in each frame.
Summary of LRU Algorithm
Few computers have the necessary hardware to implement full
LRU.
Counter-based method could be done, but it’s slow to find
the desired page.
Linked-list method (indicated above) impractical in hardware
Approximate LRU with Not Frequently Used (NFU)
At each clock interrupt, scan through page table
If the reference bit (R) is 1 for a page, add one to its counter
value.
On replacement, pick the page with the lowest counter
value.
Problem: no notion of age – pages referenced heavily long ago keep their high counter values and tend to stay in memory.
Solution: put aging information into the counter.
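The aging refinement is a shift-and-or per clock tick: every counter shifts right one bit and the page's R bit is added at the leftmost position, so recent references outweigh old ones. A sketch with 8-bit counters:

```java
// Aging sketch: at each clock tick every counter is shifted right one
// bit and the page's reference bit is OR-ed into the leftmost position.
public class Aging {
    // counters[i]: 8-bit aging counter; referenced[i]: R bit since last tick.
    public static void tick(int[] counters, boolean[] referenced) {
        for (int i = 0; i < counters.length; i++) {
            counters[i] = (counters[i] >>> 1) | (referenced[i] ? 0x80 : 0);
            referenced[i] = false; // R bit is cleared on every tick
        }
    }
}
```

After two ticks, a page referenced only on the first tick holds 0x40, while one referenced only on the second holds 0x80, so the more recently used page correctly has the larger counter.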
Summary of Page Replacement Algorithm
OPT (Optimal): Not implementable, but useful as a benchmark
NRU (Not Recently Used): Crude
FIFO (First-In First-Out): Might throw out useful pages
Second chance: Improvement over FIFO
Clock: Better implementation of second chance
LRU (Least Recently Used): Excellent, but hard to implement
exactly
NFU (Not Frequently Used): Poor approximation to LRU
MFU (Most Frequently Used): Poor approximation to LRU
Aging: Good approximation to LRU, efficient to implement
Allocation of Frames
Each process needs minimum number of pages.
Example: IBM 370 – 6 pages to handle SS MOVE instruction:
instruction is 6 bytes, might span 2 pages.
2 pages to handle from.
2 pages to handle to.
Two major allocation schemes.
fixed allocation
priority allocation
Fixed Allocation
Equal allocation – e.g., if there are 100 frames and 5 processes, give each process 20 frames.
Proportional allocation – Allocate according to the size of
process.
s_i = size of process p_i
S = Σ s_i
m = total number of frames
a_i = allocation for p_i = (s_i / S) × m

Example: m = 64, s_1 = 10, s_2 = 127
a_1 = (10 / 137) × 64 ≈ 5
a_2 = (127 / 137) × 64 ≈ 59
Priority Allocation
Use a proportional allocation scheme using priorities rather than size.
TLB reach – the amount of memory accessible from the TLB (TLB size × page size). Ideally, the working set of each process is stored in the TLB; otherwise there is a high degree of page faults.
Increasing the Size of the TLB
Increase the Page Size. This may lead to an increase in fragmentation as
not all applications require a large page size.
Provide Multiple Page Sizes. This allows applications that require larger page sizes to use them without an increase in fragmentation. For example, Solaris 2 supports page sizes of 8 KB, 64 KB, 512 KB, and 4 MB.
Recent trends indicate a move towards software-managed TLBs and
operating-system support for multiple page sizes.
The UltraSparc, MIPS, Alpha architectures employ software-managed TLBs.
The PowerPC and Pentium manage the TLB in hardware.
Other Considerations (Cont.)
Inverted Page Table
We save memory by creating a table that has one entry per physical
memory page.
However, the inverted page table no longer contains complete
information about the logical address space.
An external page table (one per process) must be kept.
It requires careful handling in the kernel and a delay in the page-lookup
processing.
Other Considerations (Cont.)
Program structure
int[][] A = new int[1024][1024];
Each row is stored in one page.
Program 1:
    for (int j = 0; j < A.length; j++)
        for (int i = 0; i < A.length; i++)
            A[i][j] = 0;
1024 × 1024 page faults (the column-by-column traversal touches a different page on every access)