Dynamic Memory
Management
1
Agenda
The need for DMM
DMM using the heap section
DMMgr 1: Minimal implementation
DMMgr 2: Pad implementation
Fragmentation
DMMgr 3: List implementation
DMMgr 4: Doubly-linked list implementation
DMMgr 5: Bins implementation
DMM using virtual memory
DMMgr 6: VM implementation
Why Allocate Memory Dynamically?
Problem
• Unknown object size
• E.g. unknown element count in array
• E.g. unknown node count in linked list or tree
• How much memory to allocate?
Solution 1
• Guess (i.e., use fixed-size buffers, which invites problems!)
Solution 2
• Allocate memory dynamically
5
Why Free Memory Dynamically?
Solution
• Free dynamically allocated memory that is no longer needed
6
Option A: Automatic Freeing
[Diagram: once the original Car object can no longer be accessed, the system can free it automatically]
7
Option B: Manual Freeing
Programmer frees unneeded memory
• C, C++, Objective-C, …
Pros
• Less overhead
• No unexpected pauses
Cons
• More complex for programmer
• Opens possibility of memory-related bugs
• Dereferences of dangling pointers, double frees, memory leaks
8
Option A vs. Option B
Implications…
9
Standard C DMM Functions
10
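For reference, the standard C DMM functions are declared in <stdlib.h>:

void *malloc(size_t size);               /* allocate size bytes of uninitialized memory   */
void *calloc(size_t nmemb, size_t size); /* allocate nmemb * size bytes, initialized to 0 */
void *realloc(void *ptr, size_t size);   /* grow or shrink a previously allocated chunk   */
void free(void *ptr);                    /* release a previously allocated chunk          */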
Goals for DMM
Note
• Easy to reduce time or space
• Hard to reduce time and space
11
Implementing malloc() and free()
Question:
• How to implement malloc() and free()?
• How to implement a DMMgr?
Answer 1:
• Use the heap section of memory
Answer 2:
• (Later in this lecture)
12
Agenda
The need for DMM
DMM using the heap section
DMMgr 1: Minimal implementation
DMMgr 2: Pad implementation
Fragmentation
DMMgr 3: List implementation
DMMgr 4: Doubly-linked list implementation
DMMgr 5: Bins implementation
DMM using virtual memory
DMMgr 6: VM implementation
The Heap Section of Memory
[Diagram: the heap section, stretching from low memory toward high memory]
Data structures
• pBrk points to the end of the heap; everything below pBrk is in use
17
Minimal Impl malloc(n) Example
• Initially p and pBrk coincide (p = current pBrk)
• Call brk(p+n) to increase the heap size by n bytes, moving pBrk
• Return p, which now points to an n-byte chunk ending at the new pBrk
Minimal Impl free(p) Example
Do nothing!
19
Minimal Impl
20
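A minimal sketch of such an implementation, using sbrk() to grow the heap on every call (a reconstruction, not necessarily the slide's exact code):

#include <stddef.h>
#include <unistd.h>   /* sbrk() */

void *malloc(size_t n)
{
   char *p = sbrk(n);         /* old program break = start of the new chunk */
   if (p == (char*)-1)        /* sbrk() failed */
      return NULL;
   return p;
}

void free(void *p)
{
   /* Do nothing! */
}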
Minimal Impl Performance
21
What’s Wrong?
Problem
• malloc() executes a system call each time
Solution
• Redesign malloc() so it does fewer system calls
• Maintain a pad at the end of the heap…
22
Agenda
The need for DMM
DMM using the heap section
DMMgr 1: Minimal implementation
DMMgr 2: Pad implementation
Fragmentation
DMMgr 3: List implementation
DMMgr 4: Doubly-linked list implementation
DMMgr 5: Bins implementation
DMM using virtual memory
DMMgr 6: VM implementation
Pad Impl
Data structures
[Diagram: the heap consists of an in-use region followed by a pad; pPad marks the end of the in-use region, pBrk the end of the heap]
24
Pad Impl malloc(n) Example 1
• Are there at least n bytes between pPad and pBrk? Yes!
• Save pPad as p; add n to pPad
• Return p, which now points to an n-byte chunk between p and the new pPad
25
Pad Impl malloc(n) Example 2
• Are there at least n bytes between pPad and pBrk? No!
• Call brk() to move pBrk so that at least n bytes lie beyond pPad
• Proceed as previously!
Pad Impl free(p) Example
Do nothing!
27
Pad Impl
void *malloc(size_t n)
{
   enum {MIN_ALLOC = 8192};
   static char *pPad = NULL;
   static char *pBrk = NULL;
   char *p;
   if (pBrk == NULL)
      pPad = pBrk = sbrk(0);
   if (pPad + n > pBrk)   /* move pBrk */
   {
      char *pNewBrk = max(pPad + n, pBrk + MIN_ALLOC);  /* max(): the larger of its two arguments */
      if (brk(pNewBrk) == -1)
         return NULL;
      pBrk = pNewBrk;
   }
   p = pPad;
   pPad += n;
   return p;
}

void free(void *p)
{
}
28
Pad Impl Performance
29
What’s Wrong?
Problem
• malloc() doesn’t reuse freed chunks
Solution
• free() marks freed chunks as “free”
• malloc() uses marked chunks whenever possible
• malloc() extends size of heap only when necessary
30
Agenda
The need for DMM
DMM using the heap section
DMMgr 1: Minimal implementation
DMMgr 2: Pad implementation
Fragmentation
DMMgr 3: List implementation
DMMgr 4: Doubly-linked list implementation
DMMgr 5: Bins implementation
DMM using virtual memory
DMMgr 6: VM implementation
Fragmentation
At any given time, some heap memory chunks are
in use, some are marked “free”
[Diagram: heap with in-use and free chunks interleaved]
32
Internal Fragmentation
Internal fragmentation: waste within chunks
100 bytes
Client asks for 90 bytes
DMMgr provides chunk of size 100 bytes
10 bytes wasted
Generally
Program asks for n bytes
DMMgr provides chunk of size n+Δ bytes
Δ bytes wasted
Space efficiency => DMMgr should reduce internal fragmentation
External Fragmentation
External fragmentation: waste because free memory is split across non-contiguous chunks, so a request can fail even though enough total free memory exists
35
DMMgr Desired Behavior Demo
Client code (the original slides show the heap after each step):

char *p1 = malloc(3);
char *p2 = malloc(1);
char *p3 = malloc(4);
free(p2);
char *p4 = malloc(6);
free(p3);
char *p5 = malloc(2);
free(p1);
free(p4);
free(p5);

[Diagram sequence: the memory layout, from address 0 (heap) toward 0xffffffff (stack), after each call. The DMMgr places p1 through p4 at successive addresses and reuses freed space where possible; in particular, p5 ends up at the address previously occupied by p2.]
DMMgr Desired Behavior Demo
DMMgr cannot:
• Reorder requests
• Client may allocate & free in arbitrary order
• Any allocation may request arbitrary number of bytes
• Move memory chunks to improve performance
• Client stores addresses
• Moving a memory chunk would invalidate client pointer!
46
Agenda
The need for DMM
DMM using the heap section
DMMgr 1: Minimal implementation
DMMgr 2: Pad implementation
Fragmentation
DMMgr 3: List implementation
DMMgr 4: Doubly-linked list implementation
DMMgr 5: Bins implementation
DMM using virtual memory
DMMgr 6: VM implementation
List Impl
Data structures
• Free list of free chunks; each free chunk's header stores the address of the next chunk in the free list
• Each chunk consists of a header and a payload; the header stores the chunk's size

List Impl: malloc(n) Example 1
• Search the free list for a big-enough chunk
• First chunk is too small (< n); next chunk is big enough (>= n) and of reasonable size => remove it from the free list and return it
49
List Impl: malloc(n) Example 2
• Search the free list for a big-enough chunk
• First chunk is too small (< n); next chunk is big enough but much too big (>> n) => split it, keep the remainder in the free list, and return the n-byte tail end
50
List Impl: free(p) Example
• Search the free list for the proper (address-ordered) insertion spot and insert the freed chunk
51
List Impl: free(p) Example (cont.)
• The freed chunk is adjacent in memory to another free chunk => remove both from the list, coalesce them into one larger chunk, and insert the result
List Impl: malloc(n) Example 3
• Search the free list for a big-enough chunk; none is found
• Call brk() to increase the heap size, creating a new large chunk (≥ n bytes)
• Insert the new chunk at the end of the free list
• (Not finished yet!)
54
List Impl: malloc(n) Example 3 (cont.)
• Look at the previous chunk in the free list: it is adjacent in memory to the new large chunk => remove both chunks from the list, coalesce, and insert the coalesced chunk into the list
• Then proceed to use the new chunk, as before
• (Finished!)
55
List Impl
Algorithms (see precepts for more precision)
malloc(n)
• Search free list for big-enough chunk
• Chunk found & reasonable size => remove, use
• Chunk found & too big => split, use tail end
• Chunk not found => increase heap size, create new chunk
• New chunk reasonable size => remove, use
• New chunk too big => split, use tail end
free(p)
• Search free list for proper insertion spot
• Insert chunk into free list
• Next chunk in memory also free => remove both, coalesce, insert
• Prev chunk in memory free => remove both, coalesce, insert
56
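As a concrete illustration of the malloc() search described above, here is a minimal sketch of a chunk header and a first-fit scan of the free list (Chunk, g_freeList, and findChunk are illustrative names, not the precept's actual code):

#include <stddef.h>

struct Chunk {
   size_t size;            /* total chunk size, in bytes                   */
   struct Chunk *next;     /* next chunk in free list (valid only if free) */
};

static struct Chunk *g_freeList;   /* head of the address-ordered free list */

/* Return the first free chunk whose payload can hold n bytes, or NULL.
   This is the O(n) traversal discussed on the next slide. */
static struct Chunk *findChunk(size_t n)
{
   struct Chunk *c;
   for (c = g_freeList; c != NULL; c = c->next)
      if (c->size >= n + sizeof(struct Chunk))
         return c;
   return NULL;
}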
List Impl Performance
Space
• Some internal & external fragmentation is unavoidable
• Headers are overhead
• Overall: good
Time: malloc()
• Must search free list for big-enough chunk
• Bad: O(n)
• But often acceptable
Time: free()
• Must search free list for insertion spot
• Bad: O(n)
• Often very bad
59
What’s Wrong?
Problem
• free() must traverse (long) free list, so can be (very) slow
Solution
• Use a doubly linked list…
60
Agenda
The need for DMM
DMM using the heap section
DMMgr 1: Minimal implementation
DMMgr 2: Pad implementation
Fragmentation
DMMgr 3: List implementation
DMMgr 4: Doubly-linked list implementation
DMMgr 5: Bins implementation
DMM using virtual memory
DMMgr 6: VM implementation
Doubly-Linked List Impl
Data structures
[Diagram: a chunk with a header (status bit, size, next pointer) and a footer (size, prev pointer); status bit 0 => free, 1 => in use]
Free list is doubly-linked
Each chunk contains header, payload, footer
Payload is used by client
Header contains status bit, chunk size, & (if free) addr of next chunk in list
Footer contains redundant chunk size & (if free) addr of prev chunk in list
Free list is unordered
62
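A sketch of the chunk layout just described (struct and field names are illustrative, not the precept's actual code):

#include <stddef.h>

struct Header {               /* at the start of every chunk */
   unsigned uStatus;          /* 0 => free, 1 => in use                          */
   size_t uSize;              /* chunk size                                      */
   struct Header *psNext;     /* next chunk in free list (valid only while free) */
};

struct Footer {               /* at the end of every chunk */
   size_t uSize;              /* redundant copy of the chunk size                */
   struct Header *psPrev;     /* prev chunk in free list (valid only while free) */
};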
Doubly-Linked List Impl
[Diagram: free chunks scattered through the heap, doubly linked into an unordered free list]
63
Doubly-Linked List Impl
Algorithms (see precepts for more precision)
malloc(n)
• Search free list for big-enough chunk
• Chunk found & reasonable size => remove, set status, use
• Chunk found & too big => remove, split, insert tail, set status,
use front
• Chunk not found => increase heap size, create new chunk, insert
• New chunk reasonable size => remove, set status, use
• New chunk too big => remove, split, insert tail, set status, use front
64
Doubly-Linked List Impl
free(p)
• Set the chunk's status to free
• Insert chunk at the front of the free list
• Next chunk in memory also free => remove both, coalesce, insert
• Prev chunk in memory free => remove both, coalesce, insert
65
Doubly-Linked List Impl Performance
Consider sub-algorithms of free()…
Insert chunk into free list
• Linked list version: slow
• Traverse list to find proper spot
• Doubly-linked list version: fast
• Insert at front!
66
Doubly-Linked List Impl Performance
Consider sub-algorithms of free()…
Determine if next chunk in memory is free
• Linked list version: slow
• Traverse free list to see if next chunk in memory is in list
• Doubly-linked list version: fast
• Use the current chunk's size to find the next chunk in memory, and read its status bit
[Diagram: current chunk and the next chunk in memory, each with header and footer]
67
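A sketch of that constant-time check, reusing the illustrative Header layout sketched earlier and assuming uSize is the total chunk size (a real implementation would also verify that the computed address is still inside the heap):

static int nextInMemoryIsFree(struct Header *c)
{
   struct Header *next = (struct Header *)((char *)c + c->uSize);
   return next->uStatus == 0;   /* 0 => free */
}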
Doubly-Linked List Impl Performance
Consider sub-algorithms of free()…
Determine if prev chunk in memory is free
• Linked list version: slow
• Traverse free list to see if prev chunk in memory is in list
• Doubly-linked list version: fast
• Use the previous chunk's footer (immediately before the current chunk's header) to find its size, step back to its header, and read its status bit
[Diagram: previous chunk and current chunk in memory, each with header and footer]
Observation:
• All sub-algorithms of free() are fast
• free() is fast!
69
Doubly-Linked List Impl Performance
Space
• Some internal & external fragmentation is unavoidable
• Headers & footers are overhead
• Overall: Good
Time: free()
• All steps are fast
• Good: O(1)
Time: malloc()
• Must search free list for big-enough chunk
• Bad: O(n)
• Often acceptable
• Subject to bad worst-case behavior
• E.g. long free list with big chunks at end
70
What’s Wrong?
Problem
• malloc() must traverse doubly-linked list, so can be slow
Solution
• Use multiple doubly-linked lists (bins)…
71
Agenda
The need for DMM
DMM using the heap section
DMMgr 1: Minimal implementation
DMMgr 2: Pad implementation
Fragmentation
DMMgr 3: List implementation
DMMgr 4: Doubly-linked list implementation
DMMgr 5: Bins implementation
DMM using virtual memory
DMMgr 6: VM implementation
Bins Impl
Data structures
• An array of bins; bin i is a doubly-linked list containing free chunks of size i (e.g., bin 10 holds the free chunks of size 10), and the final bin holds all larger chunks
free(p)
• Set status
• Insert chunk into the proper bin's free list
• Next chunk in memory also free => remove both, coalesce, insert
• Prev chunk in memory free => remove both, coalesce, insert
74
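A sketch of how a chunk size might map to a bin, reusing the illustrative Header type from earlier (NUM_BINS, aBins, and binIndex are illustrative names and bounds):

enum { NUM_BINS = 1024 };

static struct Header *aBins[NUM_BINS];   /* aBins[i]: doubly-linked list of free chunks
                                            of size i; the final bin holds all larger chunks */

static size_t binIndex(size_t uSize)
{
   if (uSize >= NUM_BINS - 1)
      return NUM_BINS - 1;               /* final "large" bin */
   return uSize;                         /* bin i holds free chunks of size i */
}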
Bins Impl Performance
Space
• Pro: For small chunks, uses best-fit (not first-fit) strategy
• Could decrease external fragmentation and splitting
• Con: Some internal & external fragmentation is unavoidable
• Con: Headers, footers, bin array are overhead
• Overall: good
Time: malloc()
• Pro: Binning limits list searching
• Search for chunk of size i begins at bin i and proceeds downward
• Con: Could be bad for large chunks (i.e. those in final bin)
• Performance degrades to that of list version
• Overall: good O(1)
Time: free()
• ???
75
iClicker Question
Q: How fast is free() in the Bins implementation?
Time: free()
• Good: O(1) with a small constant
77
DMMgr Impl Summary (so far)
Problem
• How can heap mgr unmap memory effectively?
Solution
• Don’t use the heap!
79
What’s Wrong?
Reprising a previous slide…
Question:
• How to implement malloc() and free()?
• How to implement a DMMgr?
Answer 1:
• Use the heap section of memory
Answer 2:
• Make use of virtual memory concept…
80
Agenda
The need for DMM
DMM using the heap section
DMMgr 1: Minimal implementation
DMMgr 2: Pad implementation
Fragmentation
DMMgr 3: List implementation
DMMgr 4: Doubly-linked list implementation
DMMgr 5: Bins implementation
DMM using virtual memory
DMMgr 6: VM implementation
Unix VM Mapping Functions
Unix allows application programs to map/unmap VM
explicitly
void *mmap(void *p, size_t n, int prot, int flags,
int fd, off_t offset);
• Creates a new mapping in the virtual address space of the calling
process
• p: the starting address for the new mapping
• n: the length of the mapping
• If p is NULL, then the kernel chooses the address at which to create
the mapping; this is the most portable method of creating a new
mapping
• On success, returns address of the mapped area
int munmap(void *p, size_t n);
• Deletes the mappings for the specified address range
82
Unix VM Mapping Functions
Typical call of mmap() for allocating memory
p = mmap(NULL, n, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANON, 0, 0);
• Asks OS to map a new read/write area of virtual memory containing
n bytes
• Returns the virtual address of the new area on success, (void*)-1
on failure
See Bryant & O’Hallaron book and man pages for details
83
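A tiny self-contained usage sketch of the two calls, written with MAP_ANONYMOUS and fd = -1 as documented on Linux (the slide's version passes 0 for the fd argument):

#include <stddef.h>
#include <sys/mman.h>

int main(void)
{
   size_t n = 4096;
   char *p = mmap(NULL, n, PROT_READ|PROT_WRITE,
                  MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
   if (p == MAP_FAILED)
      return 1;
   p[0] = 'x';                /* the newly mapped memory is immediately usable */
   if (munmap(p, n) == -1)    /* give the pages back to the OS */
      return 1;
   return 0;
}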
Agenda
The need for DMM
DMM using the heap section
DMMgr 1: Minimal implementation
DMMgr 2: Pad implementation
Fragmentation
DMMgr 3: List implementation
DMMgr 4: Doubly-linked list implementation
DMMgr 5: Bins implementation
DMM using virtual memory
DMMgr 6: VM implementation
VM Mapping Impl
Data structures
• Each chunk consists of a header and a payload; the header stores the chunk's total size (header + payload)
85
VM Mapping Impl
Algorithms
void *malloc(size_t n)
{ size_t *ps;
if (n == 0) return NULL;
ps = mmap(NULL, n + sizeof(size_t), PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, 0, 0);
if (ps == (size_t*)-1) return NULL;
*ps = n + sizeof(size_t); /* Store size in header */
ps++; /* Move forward from header to payload */
return (void*)ps;
}
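The slide shows only malloc(); a matching free() for this layout might look like the following sketch (the header holds the total mapped size, so munmap() can release the whole chunk; requires <sys/mman.h>):

void free(void *p)
{
   size_t *ps = (size_t*)p;
   if (p == NULL) return;
   ps--;               /* Move backward from payload to header */
   munmap(ps, *ps);    /* Unmap the entire chunk */
}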
Time
• For small chunks
• One system call (mmap()) per call of malloc()
• One system call (munmap()) per call of free()
• Overall: poor
• For large chunks
• free() unmaps (large) chunks of memory, and so shrinks
page table
• Overall: maybe good!
87
The GNU Implementation
Observation
• malloc() and free() on ArmLab are from glibc, the GNU C Library
Question
• How are GNU malloc() and free() implemented?
Answer
• For small chunks
• Use heap (sbrk() and brk())
• Use bins implementation
• For large chunks
• Use VM directly (mmap() and munmap())
88
Summary
The need for DMM
• Unknown object size
89
iClicker Question
Q: When is coalescing most useful?
A. Always
B. When most of the program’s objects are the same size
C. When the program simultaneously uses objects of
different sizes
D. When the program allocates many objects of size A, then
frees most of them, then allocates many objects of size B
E. Never
Appendix: Additional Approaches
91
Using payload space for management
or, only free chunks need to be in the free-list
[Diagram: three chunks in memory. In-use chunks (status bit 1) carry only a status bit and size in their header and footer; the payload holds the client's data (e.g. the string "The rain in Spain is mainly in the plain"). A free chunk (status bit 0) has an unused payload, so the next and prev free-list pointers are stored there instead of enlarging the header and footer.]
93
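A sketch of the layout idea: the free-list links live in the payload, which is unused while a chunk is free, so in-use chunks carry only size and status (Tag, FreeLinks, and linksOf are illustrative names):

#include <stddef.h>

struct Tag {                 /* header/footer kept for every chunk */
   size_t uSize;             /* total chunk size       */
   unsigned uStatus;         /* 0 => free, 1 => in use */
};

struct FreeLinks {           /* stored in the payload, only while the chunk is free */
   struct Tag *psNext;       /* next chunk in free list */
   struct Tag *psPrev;       /* prev chunk in free list */
};

/* The links of a free chunk begin right after its header tag. */
static struct FreeLinks *linksOf(struct Tag *psChunk)
{
   return (struct FreeLinks *)(psChunk + 1);
}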
Selective Splitting
Observation
• In previous implementations, malloc() splits whenever chosen
chunk is too big
Pro
• Reduces external fragmentation
Con
• Increases internal fragmentation
94
Deferred Coalescing
Observation
• Previous implementations do coalescing whenever possible
Pro
• Handles malloc(n);free();malloc(n) sequences well
Con
• Complicates algorithms
95
Segregated Data
Observation
• Splitting and coalescing consume lots of overhead
Problem
• How to eliminate that overhead?
96
Segregated Data
Segregated data
• Each bin contains chunks of fixed sizes
• E.g. 32, 64, 128, …
• All chunks within a bin are from same virtual memory page
• malloc() never splits! Examples:
• malloc(32) => provide 32
• malloc(5) => provide 32
• malloc(100) => provide 128
• free() never coalesces!
• Free block => examine address, infer virtual memory page,
infer bin, insert into that bin
97
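A sketch of the rounding rule in the examples above, assuming power-of-two size classes starting at 32 bytes (roundToBinSize is an illustrative name):

#include <stddef.h>

/* malloc(5) -> 32, malloc(32) -> 32, malloc(100) -> 128 */
static size_t roundToBinSize(size_t n)
{
   size_t size = 32;          /* smallest bin size */
   while (size < n)
      size *= 2;              /* 32, 64, 128, ...  */
   return size;
}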
Segregated Data
Pros
• Eliminates splitting and coalescing overhead
• Eliminates most meta-data; only forward links required
• No backward links, sizes, status bits, footers
Con
• Some usage patterns cause excessive external fragmentation
• E.g. Only one malloc(32) wastes all but 32 bytes of one
virtual page
98
Segregated Meta-Data
Observations
• Meta-data (chunk sizes, status flags, links, etc.) are scattered across
the heap, interspersed with user data
• Heap mgr often must traverse meta-data
Problem 1
• User error easily can corrupt meta-data
Problem 2
• Frequent traversal of meta-data can cause excessive page faults
(poor locality)
[Diagram: the meta-data is kept apart from user data; each bin (e.g. sizes 2, 4, 6, …) hands out chunks from its own large (1 megabyte, contiguous) region]
Free: find size (somehow), put it back at the head of that bin's list
100
How free() finds the size
[Diagram: each large region occupies its own address range, e.g. 006FA8B0000–006FA8BFFFF holds chunks of size 2 and 00381940000–0038194FFFF holds chunks of size 4]
Hash table mapping “page” number to chunk size:
• 006FA8B → 2
• 0038194 → 4
• 0058217 → 6
• etc.
An address such as 006FA8B0080 splits into a “page” number (006FA8B) and an offset in the page (0080); the page number is the key into the hash table.
101
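A sketch of that lookup: split the address into a “page” number and an offset (16-bit offsets here, matching the 4-hex-digit offsets in the example), and use the page number as the hash key (hashLookup stands in for the hypothetical table):

#include <stddef.h>
#include <stdint.h>

extern size_t hashLookup(uintptr_t uPage);   /* hypothetical: page number -> chunk size */

size_t chunkSize(const void *p)
{
   uintptr_t uAddr = (uintptr_t)p;
   uintptr_t uPage = uAddr >> 16;    /* e.g. 0x006FA8B0080 -> 0x006FA8B */
   return hashLookup(uPage);         /* e.g. 0x006FA8B -> 2             */
}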
Segregated metadata performance
Space
• No overhead for headers: very, very good
• No coalescing, so fragmentation may occur: possibly bad
Time
• malloc: very very good, O(1)
• free: hash-table lookup, good, O(1)
102
Trade-off