VM Os
Ali Mashtizadeh
University of Waterloo
1 / 44
Outline
1 Paging
2 Eviction policies
3 Thrashing
4 User-level API
2 / 44
Paging
Working set model
[Figure: # of accesses plotted against virtual address]
• Disk much, much slower than memory
  - Goal: run at memory speed, not disk speed
• 80/20 rule: 20% of memory gets 80% of memory accesses
  - Keep the hot 20% in memory
  - Keep the cold 80% on disk
4 / 44
Paging challenges
5 / 44
Re-starting instructions
6 / 44
What to fetch
7 / 44
Selecting physical pages
8 / 44
Superpages
9 / 44
Straw man: FIFO eviction
11 / 44
Belady’s Anomaly
13 / 44
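Belady's anomaly can be reproduced with a few lines of C. The following is a minimal sketch, not code from the lecture: `fifo_faults` and `MAX_FRAMES` are hypothetical names, and the frames are assumed to start empty. It counts page faults under FIFO eviction for a given reference string and frame count.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FRAMES 16   /* illustrative upper bound, not from the slides */

/* Count page faults under FIFO replacement (hypothetical helper).
 * refs    : reference string of page numbers
 * nrefs   : length of the reference string
 * nframes : number of physical frames, all initially empty */
int fifo_faults(const int *refs, size_t nrefs, size_t nframes)
{
    int frames[MAX_FRAMES];
    size_t used = 0;   /* frames currently holding a page */
    size_t next = 0;   /* index of the oldest page, i.e. the next victim */
    int faults = 0;

    assert(nframes > 0 && nframes <= MAX_FRAMES);

    for (size_t i = 0; i < nrefs; i++) {
        int hit = 0;
        for (size_t j = 0; j < used; j++) {
            if (frames[j] == refs[i]) {
                hit = 1;
                break;
            }
        }
        if (hit)
            continue;

        faults++;
        if (used < nframes) {
            frames[used++] = refs[i];      /* a free frame is available */
        } else {
            frames[next] = refs[i];        /* evict the oldest page */
            next = (next + 1) % nframes;
        }
    }
    return faults;
}
```

For the reference string 1 2 3 4 1 2 5 1 2 3 4 5, this sketch counts 9 faults with 3 frames and 10 faults with 4 frames: adding memory makes FIFO worse on this string.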
Optimal page replacement
13 / 44
LRU page replacement
15 / 44
Clock algorithm
[Figure: clock of pages, each marked with its accessed bit (A=1 or A=0)]
16 / 44
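The victim selection of a one-handed clock can be sketched as below. This is an illustration under assumptions: `struct frame`, `struct clock_state`, and `clock_evict` are hypothetical names, and the accessed bit is modeled as a plain integer rather than read from hardware page tables.

```c
#include <assert.h>
#include <stddef.h>

/* One physical frame; `accessed` stands in for the hardware A bit. */
struct frame {
    int page;       /* virtual page currently held */
    int accessed;   /* 1 if referenced since the hand last passed */
};

/* Hypothetical clock state: the frames and the hand position. */
struct clock_state {
    struct frame *frames;
    size_t nframes;
    size_t hand;
};

/* Sweep the hand until it finds a frame with A == 0 and return its index.
 * Frames with A == 1 get a second chance: the bit is cleared and the
 * hand moves on. Terminates within two sweeps because bits are cleared. */
size_t clock_evict(struct clock_state *c)
{
    assert(c->nframes > 0);

    for (;;) {
        size_t idx = c->hand;
        struct frame *f = &c->frames[idx];

        c->hand = (c->hand + 1) % c->nframes;   /* advance past this frame */
        if (!f->accessed)
            return idx;                         /* not recently used: victim */
        f->accessed = 0;                        /* clear and keep going */
    }
}
```

If every frame has A=1, the hand clears all bits in one full sweep and evicts the frame it started on, so the policy degrades to FIFO in that case.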
Clock algorithm (continued)
[Figure: clock of pages with accessed bits (A=1 or A=0)]
• Large memory may be a problem
  - Most pages referenced in long interval
• Add a second clock hand
• Random eviction
  - Simple to implement
  - Not overly horrible (avoids Belady & pathological cases)
  - Used in hypervisors to avoid double swap [Waldspurger]
• LFU (least frequently used) eviction
• MFU (most frequently used) algorithm
• Neither LFU nor MFU used very commonly
• Workload specific policies: Databases
18 / 44
Naïve paging
20 / 44
Page allocation
21 / 44
Thrashing
23 / 44
Reasons for thrashing
[Figure: process P1 and memory]
24 / 44
Dealing with thrashing
25 / 44
Working sets
[Figure: working set size over time, with transitions marked]
26 / 44
Recall typical virtual address space
kernel
stack
breakpoint
heap
uninitialized data (bss)
initialized data
read-only data
code (text)
• Dynamically allocated memory goes in heap
• Top of heap called breakpoint
  - Addresses between breakpoint and stack all invalid
28 / 44
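The breakpoint can be observed and moved from user code with `sbrk`. The following is a minimal sketch, assuming a system where `sbrk` is available and functional (it is a legacy interface and some C libraries refuse non-zero increments); `move_break` is a hypothetical helper name.

```c
#define _DEFAULT_SOURCE   /* expose sbrk under strict glibc feature modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Move the breakpoint by `delta` bytes and report how far it actually
 * moved through `*moved`. Returns 0 on success, -1 if sbrk fails. */
int move_break(intptr_t delta, intptr_t *moved)
{
    assert(moved != NULL);

    char *before = sbrk(0);            /* sbrk(0) returns the current break */
    if (sbrk(delta) == (void *)-1)
        return -1;                     /* could not move the break */
    char *after = sbrk(0);

    *moved = (intptr_t)(after - before);
    return 0;
}
```

After a successful `move_break(4096, &moved)`, the 4096 bytes just below the new breakpoint are valid heap addresses, while addresses above it remain invalid until the stack region.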
Early VM system calls
29 / 44
Memory mapped files
kernel
stack
mmapped regions
heap
uninitialized data (bss)
initialized data
read-only data
code (text)
31 / 44
More VM system calls
32 / 44
Exposing page faults
struct sigaction {
    union {                   /* signal handler */
        void (*sa_handler)(int);
        void (*sa_sigaction)(int, siginfo_t *, void *);
    };
    sigset_t sa_mask;         /* signal mask to apply */
    int sa_flags;
};
33 / 44
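A user-level fault handler built on `sigaction` can be sketched as follows. This is an illustration under assumptions, not the lecture's code: `on_fault` and `faults_when_touching` are hypothetical names, the mapping is anonymous (`MAP_ANONYMOUS` is widely available but not in POSIX), and `mprotect` is called from the handler even though POSIX does not list it as async-signal-safe. The handler makes the faulting page accessible, and the kernel then restarts the faulting instruction.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON   /* older BSD spelling */
#endif

static char *region;                       /* start of the watched mapping */
static size_t region_len;                  /* its length in bytes */
static size_t page_size;                   /* cached so the handler avoids sysconf */
static volatile sig_atomic_t fault_count;  /* faults serviced so far */

/* SA_SIGINFO-style handler: info->si_addr is the faulting address. */
static void on_fault(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    char *addr = (char *)info->si_addr;

    if (addr < region || addr >= region + region_len)
        _exit(1);                          /* not our region: a real crash */

    char *page = region + ((size_t)(addr - region) / page_size) * page_size;
    if (mprotect(page, page_size, PROT_READ | PROT_WRITE) != 0)
        _exit(1);
    fault_count++;                         /* returning restarts the access */
}

/* Map `npages` inaccessible pages, write one byte to each, and return the
 * number of faults the handler serviced (-1 if the mapping fails). */
int faults_when_touching(size_t npages)
{
    struct sigaction sa, old_segv, old_bus;

    assert(npages > 0);
    page_size = (size_t)sysconf(_SC_PAGESIZE);
    region_len = npages * page_size;
    region = mmap(NULL, region_len, PROT_NONE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);      /* some systems report SIGBUS */

    fault_count = 0;
    for (size_t i = 0; i < npages; i++)
        ((volatile char *)region)[i * page_size] = 1;

    sigaction(SIGSEGV, &old_segv, NULL);   /* restore previous handlers */
    sigaction(SIGBUS, &old_bus, NULL);
    munmap(region, region_len);
    return (int)fault_count;
}
```

Each page faults exactly once here, because the handler leaves it readable and writable afterwards. The same pattern underlies user-level tricks such as tracking which pages a program touches.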
Example: OpenBSD/i386 siginfo
struct sigcontext {
    int sc_gs; int sc_fs; int sc_es; int sc_ds;
    int sc_edi; int sc_esi; int sc_ebp; int sc_ebx;
    int sc_edx; int sc_ecx; int sc_eax;
    int sc_trapno;
    int sc_err;
};
• Linux uses ucontext_t – same idea, just uses nested structures that won’t all
fit on one slide
34 / 44
VM tricks at user level
35 / 44
Overview
• Windows and most UNIX systems separate the VM system into two parts
  - VM PMap: Manages the hardware interface (e.g. TLB in MIPS)
  - VM Map: Machine-independent representation of memory
37 / 44
Operation
• Page faults
  - Exception handler calls into the VM PMap to load the TLB
  - If the page isn’t in the PMap we call VM Map code
38 / 44
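The two-level fault path above can be sketched with toy data structures. Everything here is a hypothetical illustration: `struct pmap`, `struct vm_map`, `toy_pmap_lookup`, `toy_pmap_enter`, and `handle_fault` are simplified stand-ins and do not match the signatures of any real kernel.

```c
#include <assert.h>
#include <stddef.h>

#define PMAP_SLOTS 8   /* toy capacity, chosen for illustration */

enum fault_result { TLB_RELOADED, PAGE_MAPPED, BAD_ADDRESS };

/* Machine-dependent layer: known virtual page -> physical frame pairs. */
struct pmap {
    unsigned long vpn[PMAP_SLOTS];
    unsigned long pfn[PMAP_SLOTS];
    size_t count;
};

/* Machine-independent layer: valid ranges of virtual page numbers. */
struct vm_map_entry {
    unsigned long start;   /* first valid page */
    unsigned long end;     /* one past the last valid page */
};

struct vm_map {
    const struct vm_map_entry *entries;
    size_t count;
    unsigned long next_free_pfn;   /* toy frame allocator */
};

static int toy_pmap_lookup(const struct pmap *p, unsigned long vpn,
                           unsigned long *pfn)
{
    for (size_t i = 0; i < p->count; i++) {
        if (p->vpn[i] == vpn) {
            *pfn = p->pfn[i];
            return 1;
        }
    }
    return 0;
}

static void toy_pmap_enter(struct pmap *p, unsigned long vpn,
                           unsigned long pfn)
{
    assert(p->count < PMAP_SLOTS);
    p->vpn[p->count] = vpn;
    p->pfn[p->count] = pfn;
    p->count++;
}

/* Fault path: try the pmap first, then fall back to the VM map. */
enum fault_result handle_fault(struct pmap *p, struct vm_map *m,
                               unsigned long vpn)
{
    unsigned long pfn;

    if (toy_pmap_lookup(p, vpn, &pfn))
        return TLB_RELOADED;   /* a real handler would load (vpn, pfn) into the TLB */

    for (size_t i = 0; i < m->count; i++) {
        if (vpn >= m->entries[i].start && vpn < m->entries[i].end) {
            toy_pmap_enter(p, vpn, m->next_free_pfn++);
            return PAGE_MAPPED;   /* valid address: allocate and record a frame */
        }
    }
    return BAD_ADDRESS;           /* no map entry covers this page */
}
```

The first fault on a valid page takes the slow path through the VM map; later faults on the same page are satisfied from the pmap alone.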
4.4 BSD VM system [McKusick]
39 / 44
4.4 BSD VM data structures
[Figure: a vmspace holds a vm_map, a vm_pmap, and stats; the vm_map points to a list of vm_map_entry structures; each vm_map_entry refers to a shadow object or a vnode/object, and objects hold vm_page structures]
40 / 44
Pmap (machine-dependent) layer
41 / 44
Example uses
42 / 44
What happens on a fault?
43 / 44
Paging in day-to-day use
• Demand paging
  - Read pages from vm_object of executable file
• Copy-on-write (fork, mmap, etc.)
  - Use shadow objects
• Growing the stack, BSS page allocation
  - A bit like copy-on-write for /dev/zero
  - Can have a single read-only zero page for reading
  - Special-case write handling with pre-zeroed pages
• Shared text, shared libraries
  - Share vm_object (shadow will be empty where read-only)
• Shared memory
  - Two processes mmap same file, have same vm_object (no shadow)
44 / 44
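The difference between copy-on-write and shared memory is visible from user space. The sketch below is an illustration under assumptions: `value_after_child_write` is a hypothetical helper, and it uses an anonymous mapping inherited across `fork` rather than two processes mapping the same file (`MAP_ANONYMOUS` is widely available but not in POSIX).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON   /* older BSD spelling */
#endif

/* Map one int with `sharing` (MAP_SHARED or MAP_PRIVATE), store 1, let a
 * forked child store 2, and return the value the parent reads afterwards.
 * Returns -1 if a system call fails. */
int value_after_child_write(int sharing)
{
    assert(sharing == MAP_SHARED || sharing == MAP_PRIVATE);

    int *cell = mmap(NULL, sizeof *cell, PROT_READ | PROT_WRITE,
                     sharing | MAP_ANONYMOUS, -1, 0);
    if (cell == MAP_FAILED)
        return -1;
    *cell = 1;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(cell, sizeof *cell);
        return -1;
    }
    if (pid == 0) {
        *cell = 2;    /* private mapping: this write goes to the child's own copy */
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        munmap(cell, sizeof *cell);
        return -1;
    }

    int seen = *cell; /* 2 if the page is shared, 1 if it was copied on write */
    munmap(cell, sizeof *cell);
    return seen;
}
```

With `MAP_SHARED` the parent sees the child's store, since both processes use the same underlying pages. With `MAP_PRIVATE` the child's store triggers a copy, so the parent still reads 1.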