0019 Lecture6 Mem Alloc
in C
Brad Karp
UCL Computer Science
CS 0019
29th January 2019
1
Dynamic Heap Memory Allocation in
C
¢ Properties of malloc()/free()
§ N.B. not the design of cs0019_{malloc(),free()}!
Rather, the design of the underlying system software’s
malloc() and free()
¢ Simple Design: Implicit Free Lists
¢ Menagerie of malloc() and free() programming
errors (and undefined behaviors!)
2
Dynamic Memory Allocation: Context
¢ Programmers use dynamic memory allocators (such as malloc) to acquire virtual
memory (VM) at run time.
§ for data structures whose size is only known at runtime
¢ Dynamic memory allocators manage an area of process VM known as the heap.
[Figure: the application calls the dynamic memory allocator, which manages the heap.]
[Figure: process virtual address space, from high addresses to low:
 Kernel virtual memory (memory invisible to user code)
 User stack (created at runtime); %rsp is the stack pointer
 Memory-mapped region for shared libraries
 Run-time heap (created by malloc); brk marks its top
 Read/write segment (.data, .bss), loaded from the executable file
 Read-only segment (.init, .text, .rodata), loaded from the executable file, starting at 0x400000
 Unused, down to address 0]
4
Styles of Dynamic Memory Allocation
5
malloc Standard C Library Allocator
#include <stdlib.h>
void *malloc(size_t size)
§ Successful:
§ Returns pointer to memory block of at least size bytes
aligned to a 16-byte boundary (on x86-64)
§ If size == 0, the result is implementation-defined: NULL or a unique pointer
that must not be dereferenced
§ Unsuccessful: returns NULL (0) and sets errno to ENOMEM
void free(void *p)
§ Returns the block pointed to by p to pool of available memory
§ p must come from a previous call to malloc(), calloc(), or realloc()
Other functions
§ calloc(): version of malloc() that initializes allocated block to
contain zero bytes
§ realloc(): changes the size of a previously allocated block
§ sbrk(): used internally by allocators to grow or shrink the heap
6
malloc Example
#include <stdio.h>
#include <stdlib.h>
void foo(int n) {
int i, *p;
7
Simplifying Assumptions Made in This
Lecture
¢ Memory is word-addressed
¢ Words are int-sized
¢ Allocations are double-word aligned
8
Allocation Example
#define SIZ sizeof(int)
p1 = malloc(4*SIZ)
p2 = malloc(5*SIZ)
p3 = malloc(6*SIZ)
free(p2)
p4 = malloc(2*SIZ)
9
Constraints
¢ Applications
§ Can issue arbitrary sequence of malloc() and free() requests
§ free() request must be to a malloc()’d block
¢ Explicit Allocators
§ Can’t control number or size of allocated blocks
§ Must respond immediately to malloc() requests
§ i.e., can’t reorder or buffer requests
§ Must allocate blocks from free memory
§ i.e., can only place allocated blocks in free memory
§ Must align blocks so they satisfy all alignment requirements
§ 16-byte (x86-64) alignment on Linux boxes
§ Can manipulate and modify only free memory
§ Can’t move the allocated blocks once they are malloc()’d
§ i.e., compaction is not allowed. Why not?
14
Performance Goal: Throughput
¢ Given some sequence of malloc() and free() requests:
§ R0, R1, ..., Rk, ... , Rn-1
¢ Throughput:
§ Number of completed requests per unit time
§ Example:
§ 5,000 malloc() calls and 5,000 free() calls in 10 seconds
§ Throughput is 1,000 operations/second
15
Performance Goal: Peak Memory
Utilization
¢ Given some sequence of malloc() and free()
requests:
§ R0, R1, ..., Rk, ... , Rn-1
¢ Def: Aggregate payload Pk
§ malloc(p) results in a block with a payload of p bytes
§ After request Rk has completed, the aggregate payload Pk is the
sum of currently allocated payloads
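The slide stops before defining peak memory utilization itself. The usual textbook continuation, stated here as an assumption rather than as this lecture's own wording, introduces the current heap size H_k (assumed never to shrink) and defines the utilization after the first k+1 requests as:

```latex
U_k = \frac{\max_{i \le k} P_i}{H_k}
```

Under that definition, U_k = 1 would mean the heap is exactly as large as the largest aggregate payload seen so far, with no overhead at all.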
16
Fragmentation
¢ Poor memory utilization caused by fragmentation
§ internal fragmentation
§ external fragmentation
17
Internal Fragmentation
¢ For a given block, internal fragmentation occurs if payload is
smaller than block size
[Figure: a block consisting of internal fragmentation, then the payload, then more internal fragmentation]
¢ Caused by
§ Overhead of maintaining heap data structures
§ Padding for alignment purposes
§ Explicit policy decisions
(e.g., to return a big block to satisfy a small request)
External Fragmentation
¢ Occurs when there is enough aggregate free heap memory, but no single free
block is large enough to satisfy a request
#define SIZ sizeof(int)
p2 = malloc(5*SIZ)
p3 = malloc(6*SIZ)
free(p2)
p4 = malloc(7*SIZ)
24
Knowing How Much to Free
¢ Standard method
§ Keep the length of a block in the word preceding the block.
§ This word is often called the header field or header
§ Requires an extra word for every allocated block
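p0 = malloc(4*SIZ)
[Figure: the block occupies 5 words: a header word holding the size 5, followed by the 4-word payload that p0 points to]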
28
Keeping Track of Free Blocks
¢ Method 1: Implicit list using length—links all blocks
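§ Need to tag each block as allocated/free
[Figure: heap blocks of sizes 4, 6, 4, 2, each reachable from the previous one via its length; the rest of the heap is unused]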
29
Simple Design: Implicit Free List
¢ For each block we need both size and allocation status
§ Could store this information in two words, but that’d be wasteful
¢ Standard trick
§ When blocks are aligned, some low-order address bits are always 0
§ Instead of storing an always-0 bit, use it as an allocated/free flag
§ When reading the Size word, must mask out this bit
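[Figure: implicit free list with 1-word headers written as size/allocated: after an unused word at the start of the heap come blocks 2/0, 4/1, 8/0, 4/1, followed by the end block 0/1]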
36
Implicit List: Finding a Free Block
¢ First fit:
§ Search list from beginning, choose first free block that fits:
p = start;
while ((p < end) &&        // not passed end
       ((*p & 1) ||        // already allocated
        (*p <= len)))      // too small
    p = p + (*p & -2);     // goto next block (word addressed)
§ Can take linear time in total number of blocks (allocated and free)
§ In practice it can cause “splinters” at beginning of list
¢ Next fit:
§ Like first fit, but search list starting where previous search finished
§ Should often be faster than first fit: avoids re-scanning unhelpful blocks
§ Some research suggests that fragmentation is worse
¢ Best fit:
§ Search the list, choose the best free block: fits, with fewest bytes left
over
§ Keeps fragments small—usually improves memory utilization
§ Will typically run slower than first fit
39
Implicit List: Allocating in Free Block
¢ Allocating within a free block: splitting
§ Since allocated space might be smaller than free space, we might
want to split the block
addblock(p, 4)
[Figure: header sizes 4, 4, 6, 2 with end marker 0, p pointing at the 6-word free block; after addblock(p, 4) the headers are 4, 4, 4, 2, 2, 0: the 6-word block has been split into an allocated 4-word block and a free 2-word remainder]
44
Implicit List: Freeing a Block
¢ Simplest implementation:
§ Need only clear the “allocated” flag
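void free_block(ptr p) { *p = *p & -2; }
[Figure: header sizes 4, 4, 4, 2, 2 with end marker 0; free(p) clears the allocated flag of the 4-word block at p, which sits directly before a free 2-word block]
malloc(5*SIZ): Yikes!
There is enough contiguous
free space, but the allocator
won’t be able to find it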
48
Implicit List: Coalescing
¢ Join (coalesce) with next/previous blocks, if they are free
§ Coalescing with next block
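[Figure: before free(p) the header sizes are 4, 4, 4, 2, 2 (end marker 0); afterwards the block at p has size 6 and the absorbed block's old header, 2, is logically gone]
void free_block(ptr p) {
    ptr next;
    *p = *p & -2;          // clear allocated flag
    next = p + *p;         // find next block
    if ((*next & 1) == 0)
        *p = *p + *next;   // add to this block if not allocated
}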
Implicit List: Bidirectional Coalescing
¢ Boundary tags [Knuth73]
§ Replicate size/allocated word at “bottom” (end) of free blocks
§ Allows us to traverse the “list” backwards, but requires extra space
§ Important and general technique!
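[Figure: heap in which every block carries its size in both a header and a footer: 4|4, 4|4, 6|6, 4|4, with a 0 word at each end of the heap]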
52
Constant Time Coalescing
Each block carries a header and an identical footer, written here as size/allocated.
The block being freed has size n; the previous and next blocks have sizes m1 and m2.
Case  Previous   Next       Before freeing      After freeing
1     allocated  allocated  m1/1  n/1  m2/1     m1/1  n/0  m2/1
2     allocated  free       m1/1  n/1  m2/0     m1/1  n+m2/0
3     free       allocated  m1/0  n/1  m2/1     n+m1/0  m2/1
4     free       free       m1/0  n/1  m2/0     n+m1+m2/0
60
Disadvantages of Boundary Tags
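¢ Internal fragmentation
¢ Can it be optimized?
§ Which blocks need the footer tag?
§ How can we exploit this?
[Figure: block format with a header word (Size, a), then payload and padding, then a footer word (Size, a), where a is the allocated bit]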
61
No Boundary Tag Needed in Allocated
Blocks!
¢ Boundary tag needed only for free blocks
¢ When block sizes are multiples of 4 or more, the size word has 2 or more
spare low-order bits
[Figure: layouts of an allocated block and a free block, each drawn 1 word wide]
62
No Boundary Tag for Allocated Blocks
Each header is written as a size followed by two flag bits: whether the previous block
is allocated, then whether this block is allocated (? marks a bit the case does not depend on).
Free blocks also carry a footer identical to their header.
The block being freed has size n; the previous and next blocks have sizes m1 and m2.
Case  Previous   Next       Before freeing           After freeing
1     allocated  allocated  m1 ?1   n 11   m2 11     m1 ?1   n 10   m2 01
2     allocated  free       m1 ?1   n 11   m2 10     m1 ?1   n+m2 10
3     free       allocated  m1 ?0   n 01   m2 11     n+m1 ?0   m2 01
4     free       free       m1 ?0   n 01   m2 10     n+m1+m2 ?0
70
Summary of Key Allocator Policies
¢ Placement policy:
§ First-fit, next-fit, best-fit, etc.
§ Trades off lower throughput for less fragmentation
§ Interesting observation: segregated free lists (one free list per
allocation size limit, typically spaced in powers of two) can
approximate best-fit placement policy without having to search
entire free list
¢ Splitting policy:
§ When do we go ahead and split free blocks?
§ How much internal fragmentation are we willing to tolerate?
¢ Coalescing policy:
§ Immediate coalescing: coalesce each time free() called
§ Deferred coalescing: try to improve performance of free() by
deferring coalescing until needed, e.g.:
§ Coalesce as you scan free list for malloc()
§ Coalesce when some threshold number of malloc() requests
has failed for lack of large enough free block (triggering
sbrk())
71
Implicit Lists: Summary
¢ Implementation: very simple
¢ Allocate cost:
§ worst-case linear time
¢ Free cost:
§ worst-case constant time
§ even with coalescing
¢ Memory usage efficiency:
§ will depend on placement policy
§ first-fit, next-fit, or best-fit
¢ Not used in practice for malloc()/free() because of
worst-case linear-time allocation
§ but used in some special-purpose applications
¢ Nevertheless, concepts of splitting and boundary tag
coalescing are general to all allocators
72
Dynamic Heap Memory Allocation in
C
¢ Properties of malloc()/free()
¢ Simple Design: Implicit Free Lists
¢ Menagerie of malloc() and free() programming
errors (and undefined behaviors!)
73
C operators
Operators                            Associativity
() [] -> . ++ --                     left to right
! ~ ++ -- + - * & (type) sizeof      right to left
* / %                                left to right
+ -                                  left to right
<< >>                                left to right
< <= > >=                            left to right
== !=                                left to right
&                                    left to right
^                                    left to right
|                                    left to right
&&                                   left to right
||                                   left to right
?:                                   right to left
= += -= *= /= %= &= ^= |= <<= >>=    right to left
,                                    left to right
¢ The ++ and -- in the first row are the postfix forms; those in the second row are the prefix forms
¢ The + - * & in the second row are the unary forms; the same symbols in later rows are binary
¢ ->, (), and [] have high precedence, with * and & just below
¢ Unary +, -, and * have higher precedence than binary forms
Source: ANSI K&R p. 53
75
C Pointer Declarations: Test Yourself!
int *p
int *p[13]
int *(p[13])
int **p
int (*p)[13]
int *f()
int (*f)()
int (*(*x[3])())[5]
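Dereferencing Bad Pointers
¢ The classic scanf bug
int val;
...
scanf("%d", val);   /* bug: passes the value of val; scanf needs the address, &val */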
85
Reading Uninitialized Memory
¢ Assuming that heap data is initialized to zero
/* return y = Ax */
int *matvec(int **A, int *x) {
int *y = malloc(N*sizeof(int));
int i, j;
86
Overwriting Memory
¢ Allocating the (possibly) wrong sized object
int **p;
p = malloc(N*sizeof(int));   /* bug where int and int * differ in size: should be sizeof(int *) */
87
Overwriting Memory
¢ Off-by-one errors
char **p;
p = malloc(N*sizeof(int *));
char *p;
p = malloc(strlen(s));   /* bug: no room for the terminating '\0' */
strcpy(p,s);
88
Overwriting Memory
¢ Not validating input length vs. buffer size
char s[8];
int i;
89
Overwriting Memory
¢ Misunderstanding pointer arithmetic
return p;
}
90
Overwriting Memory
¢ Referencing a pointer instead of the object it points to
int *BinheapDelete(int **binheap, int *size) {
int *packet;
packet = binheap[0];
binheap[0] = binheap[*size - 1];
*size--;   /* bug: parses as *(size--), which moves the pointer; (*size)-- was intended */
Heapify(binheap, *size, 0);
return(packet);
}
91
Referencing Nonexistent Variables
¢ Forgetting that local variables disappear when a function
returns
int *foo () {
int val;
return &val;
}
92
Freeing Blocks Multiple Times
¢ Nasty!
x = malloc(N*sizeof(int));
<manipulate x>
free(x);
y = malloc(M*sizeof(int));
<manipulate y>
free(x);
93
Referencing Freed Blocks
¢ Evil!
x = malloc(N*sizeof(int));
<manipulate x>
free(x);
...
y = malloc(M*sizeof(int));
for (i=0; i<M; i++)
y[i] = x[i]++;
94
Failing to Free Blocks (Memory Leaks)
¢ Slow, long-term killer!
foo() {
int *x = malloc(N*sizeof(int));
...
return;
}
95
Failing to Free Blocks (Memory Leaks)
¢ Freeing only part of a data structure
struct list {
int val;
struct list *next;
};
foo() {
struct list *head = malloc(sizeof(struct list));
head->val = 0;
head->next = NULL;
<create and manipulate the rest of the list>
...
free(head);
return;
}
96