Page Replacement Algorithms
• First-In-First-Out (FIFO)
  • Throw out the oldest page
  • Pros: Low-overhead implementation
• Question: What hardware mechanisms are required to implement LRU?
• Faithful implementation:
  • Use a timestamp on each reference
  • Keep a list of pages, ordered by time of reference
  • Impractical: the list must be modified on every memory reference
• Approximation:
  • Use reference bits
  • If the page to be replaced has its reference bit set:
    • Do not discard the page, but clear the reference bit
• FIFO clock algorithm
  • Hand points to the oldest page
  • On a page fault, follow the hand to inspect pages
• Second chance
  • If the reference bit is 1, set it to 0 and advance the hand
  • If the reference bit is 0, use the page for replacement
• What does it mean if the clock hand is moving slowly?
• What does it mean if the clock hand is moving fast?
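As a concrete illustration, here is a minimal C sketch of the clock hand (the frames array, NFRAMES, and the in-memory reference bits are assumptions made for this example, not details from the slides):

#define NFRAMES 64

struct frame { int page; int ref; };    /* ref = reference bit */

static struct frame frames[NFRAMES];
static int hand = 0;                    /* points at the oldest page */

/* Second chance: sweep until a page with ref == 0 is found. */
int clock_evict(void) {
    for (;;) {
        if (frames[hand].ref) {         /* recently used: second chance */
            frames[hand].ref = 0;
            hand = (hand + 1) % NFRAMES;
        } else {                        /* not recently used: victim */
            int victim = hand;
            hand = (hand + 1) % NFRAMES;
            return victim;
        }
    }
}

Note that the hand advances only inside fault handling, so how fast it sweeps reflects how often replacement runs and how many pages are being re-referenced.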
Enhanced FIFO with 2nd chance
• Same as the basic FIFO with 2nd chance, except that this method considers both the reference bit and the modified bit (see the sketch after the next slide):
  • (0,0): neither recently used nor modified
  • (0,1): not recently used but modified
  • (1,0): recently used but clean
  • (1,1): recently used and modified
• Pros: prefers to replace clean, unreferenced pages, avoiding the cost of writing a modified page back to disk

Dealing with modified pages
• Replacement cost includes:
  • Cost of writing the modified page out
  • Cost of reading the new page in
• Decrease this cost by:
  • Having a background process constantly writing out modified pages
  • Reasoning: keep the I/O device occupied
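One way to implement the enhanced scheme above is to scan for the best (reference, modified) class first; a sketch under the same illustrative frame layout:

struct eframe { int page; int ref; int mod; };  /* reference / modified bits */

/* Pass 1 looks for class (0,0); pass 2 looks for (0,1) while clearing
   reference bits, so a later iteration is guaranteed to find a victim. */
int enhanced_evict(struct eframe f[], int n, int *hand) {
    for (;;) {
        for (int i = 0; i < n; i++) {   /* (0,0): best victim */
            int j = (*hand + i) % n;
            if (!f[j].ref && !f[j].mod) { *hand = (j + 1) % n; return j; }
        }
        for (int i = 0; i < n; i++) {   /* (0,1): not recently used, dirty */
            int j = (*hand + i) % n;
            if (!f[j].ref && f[j].mod) { *hand = (j + 1) % n; return j; }
            f[j].ref = 0;               /* demote (1,x) pages to (0,x) */
        }
    }
}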
More page frames → fewer faults?
• Consider the following reference string with 4 page frames, FIFO replacement:
  • 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
  • 10 page faults
• Consider the same reference string with 3 page frames, FIFO replacement:
  • 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
  • 9 page faults!
• This is called Belady’s anomaly (reproduced by the simulation after the next slide)
• LRU does not suffer from this anomaly
• In general, if an algorithm satisfies the “stack property” (pages that are resident with a larger memory are a superset of pages that are resident with a smaller memory), then the algorithm avoids this anomaly

State per page table entry
Many machines maintain four bits per page table entry:
• use (aka reference): set when the page is referenced, cleared by the “clock algorithm”
• modified (aka dirty): set when the page is modified, cleared when the page is written to disk
• valid (aka present): ok for the program to reference this page
• read-only: ok for the program to read the page, but not to modify it (e.g., for catching modifications to code pages)
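Belady’s anomaly above is easy to reproduce. This self-contained FIFO simulation (written for this handout; only the reference string comes from the slides) prints 10 faults with 4 frames and 9 with 3:

#include <stdio.h>

int fifo_faults(const int *refs, int nrefs, int nframes) {
    int frames[16], next = 0, used = 0, faults = 0;
    for (int i = 0; i < nrefs; i++) {
        int hit = 0;
        for (int j = 0; j < used; j++)
            if (frames[j] == refs[i]) { hit = 1; break; }
        if (hit) continue;
        faults++;
        if (used < nframes) frames[used++] = refs[i];     /* free frame */
        else { frames[next] = refs[i]; next = (next + 1) % nframes; }
    }
    return faults;
}

int main(void) {
    int refs[] = {1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5};
    printf("4 frames: %d faults\n", fifo_faults(refs, 12, 4));  /* 10 */
    printf("3 frames: %d faults\n", fifo_faults(refs, 12, 3));  /*  9 */
    return 0;
}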
• Trick to avoid setting the use bit on every reference (sketched in code below):
  • When the use bit is to be cleared, also set the invalid bit in the PTE
  • When the page is accessed, trap to the OS
  • Reset the invalid bit and set the use bit as well
• Trick to avoid setting the modified bit on every reference:
  • When a page is brought in, mark it “read-only”
  • When a write happens, trap to the OS
  • Set the modified bit, and clear “read-only”
• Insight: the TLB filters out most of the overhead

• Hardware sets the use bit in the TLB; when a TLB entry is replaced, software copies the use bit back to the page table
  • Setting the use bit is a very fast operation, combined with the address translation operation
• Software manages TLB entries as a FIFO list; everything not in the TLB is on a second-chance list (managed as strict LRU)
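Both tricks can be sketched as fault-handler pseudocode in C (pte_t and the handler names are hypothetical; real PTE layouts and trap entry points are architecture-specific):

typedef struct {
    unsigned valid : 1, writable : 1, use : 1, modified : 1;
} pte_t;   /* hypothetical software view of a page table entry */

/* Use-bit trick: clearing the use bit also marks the PTE invalid,
   so the next reference traps and the OS records the "use". */
void clear_use(pte_t *pte) {
    pte->use = 0;
    pte->valid = 0;
}
void on_invalid_trap(pte_t *pte) {   /* page is actually resident */
    pte->valid = 1;
    pte->use = 1;
}

/* Modified-bit trick: map new pages read-only so the first write traps. */
void on_page_in(pte_t *pte) {
    pte->modified = 0;
    pte->writable = 0;
}
void on_write_trap(pte_t *pte) {
    pte->modified = 1;
    pte->writable = 1;   /* later writes proceed at full speed */
}

Because the TLB keeps serving the translation after the first trap, each page pays the trap cost only once per clearing cycle, which is the insight noted above.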
How many pages are allocated to each process?
• Each process needs a minimum number of pages.
• Example: IBM 370 – 6 pages to handle the SS MOVE instruction:
  • the instruction is 6 bytes, might span 2 pages
  • 2 pages to handle the “from” operand
  • 2 pages to handle the “to” operand
• Two major allocation schemes:
  • fixed allocation
  • priority allocation

Fixed allocation
• Equal allocation – e.g., if 100 frames and 5 processes, give each 20 pages.
• Proportional allocation – allocate according to the size of the process:
  • si = size of process pi
  • S = sum of all si
  • m = total number of frames
  • ai = allocation for pi = (si / S) × m
• Example: m = 64, s1 = 10, s2 = 127:
  • a1 = (10 / 137) × 64 ≈ 5
  • a2 = (127 / 137) × 64 ≈ 59
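The proportional-allocation formula in runnable form, reproducing the slide’s example (the program structure is an illustration; only the numbers come from the slide):

#include <stdio.h>

int main(void) {
    int s[] = {10, 127};            /* process sizes s1, s2 */
    int m = 64;                     /* total number of frames */
    int S = s[0] + s[1];            /* S = 137 */
    for (int i = 0; i < 2; i++)
        printf("a%d = (%d / %d) * %d ~ %.0f frames\n",
               i + 1, s[i], S, m, (double)s[i] / S * m);
    return 0;
}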
• Local replacement – each process selects from only its own set of allocated frames.

Thrashing
• Processes spend their time blocked, waiting for pages to be fetched from disk
• I/O devices at 100% utilization, but the system is not getting much useful work done
• What we wanted: virtual memory the size of disk, with the access time of physical memory
• What we have: memory with access time = disk access
Making the best of a bad situation
• Single process thrashing?
  • If the process does not fit or does not reuse memory, the OS can do nothing!
• System thrashing?
  • If thrashing arises because of the sum of several processes, then adapt:
    • figure out how much memory each process needs
    • change scheduling priorities to run processes in groups whose memory needs can be satisfied (shedding load)
    • if new processes try to start, can refuse (admission control)

Methodology for solving?
• Approach 1: working set
  • thrashing viewed from a caching perspective
  • given locality of reference, how big a cache does the process need?
  • Or: how much memory does the process need in order to make “reasonable” progress (its working set)?
  • Only run processes whose memory requirements can be satisfied.
• Approach 2: page fault frequency (PFF)
  • thrashing viewed as a poor ratio of fetch to work
  • PFF = page faults / instructions executed
  • if PFF rises above a threshold, the process needs more memory
  • not enough memory on the system? Swap out.
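A sketch of a PFF-style check in C (the struct, thresholds, and the lower-threshold reclaim policy are illustrative assumptions; the slides only specify the upper-threshold case):

struct proc { long faults, instrs; int frames; };

/* Hypothetical thresholds: adjust a process's frame allocation
   based on its page-fault frequency. */
void pff_adjust(struct proc *p) {
    double pff = (double)p->faults / (double)p->instrs;
    if (pff > 1e-4)
        p->frames++;        /* faulting too often: needs more memory */
    else if (pff < 1e-6)
        p->frames--;        /* rarely faulting: can reclaim a frame */
}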
• T: the working set parameter (the working set is the set of pages referenced in the last T time units)
• Uses?
  • Page replacement: preferentially discard non-working-set pages
  • Scheduling: a process is not executed unless its working set is in main memory
• Balance set: sum of the working sets of all active processes
• Long-term scheduler:
  • Keep moving processes from active → inactive until the balance set is less than the memory size.
  • Must allow inactive processes to become active. What happens if this changes too frequently?
• As working sets change, must update the balance set…
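For concreteness, a C sketch that computes the working set size over a recorded reference trace (a real kernel approximates this with use bits rather than a trace; the function and limits are illustrative):

#define MAX_WS 64

/* Number of distinct pages referenced in the last T references,
   i.e., the window (now - T, now] of the trace. */
int working_set_size(const int *refs, int now, int T) {
    int pages[MAX_WS], n = 0;
    int start = (now - T > 0) ? now - T : 0;
    for (int i = start; i < now; i++) {
        int seen = 0;
        for (int j = 0; j < n; j++)
            if (pages[j] == refs[i]) { seen = 1; break; }
        if (!seen && n < MAX_WS)
            pages[n++] = refs[i];
    }
    return n;
}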
Working sets of real programs
• [Figure: working set size over time]
  • The working set size balloons during transitions between program phases
  • The working set of one phase may have little to do with that of another

Working set less important
• The concept is a good perspective on system behavior.
• As a technique for system optimization it is less important
  • a research topic in the 80s and 90s, less so now