CSC 252/452: Computer Organization

Fall 2024: Lecture 22

Instructor: Yanan Guo

Department of Computer Science


University of Rochester
Carnegie Mellon

Announcements
• Programming Assignment 5 will be released tomorrow.
• Due Dec. 4th (Wednesday)


Today
• Memory mapping
• Dynamic memory allocation

End-to-End Core i7 Address Translation

[Figure: Core i7 address translation, from the CPU's virtual address to the data returned by the memory hierarchy]
• The CPU issues a virtual address (VA): 36-bit VPN + 12-bit VPO.
• For the L1 TLB (16 sets, 4 entries/set), the VPN is split into a 32-bit TLBT and a 4-bit TLBI.
• TLB hit: the TLB supplies the 40-bit PPN.
• TLB miss: the VPN is split into four 9-bit fields (VPN1, VPN2, VPN3, VPN4). Starting from CR3, each field selects a PTE in one level of the page tables; the last PTE supplies the 40-bit PPN.
• Physical address (PA): 40-bit PPN + 12-bit PPO.
• For the L1 d-cache (64 sets, 8 lines/set), the PA is split into a 40-bit CT, a 6-bit CI, and a 6-bit CO.
• L1 hit: a 32/64-bit result returns to the CPU.
• L1 miss: the request goes to L2, L3, and main memory.
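
The field widths in the Core i7 figure can be checked with a little bit arithmetic. The following is a minimal sketch, not the hardware's actual logic; the struct and function names (va_fields, pa_fields, split_va, split_pa) are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical containers for the fields named in the figure. */
typedef struct {
    uint64_t vpn, vpo;        /* 36-bit VPN, 12-bit VPO */
    uint64_t tlbt, tlbi;      /* 32-bit TLB tag, 4-bit TLB index */
    uint64_t vpn_level[4];    /* VPN1..VPN4, 9 bits each */
} va_fields;

typedef struct {
    uint64_t ct, ci, co;      /* 40-bit tag, 6-bit index, 6-bit offset */
} pa_fields;

/* Split a 48-bit virtual address into the fields used by the TLB
   and by the four-level page table walk. */
va_fields split_va(uint64_t va) {
    va_fields f;
    f.vpo  = va & 0xFFF;
    f.vpn  = (va >> 12) & ((1ULL << 36) - 1);
    f.tlbi = f.vpn & 0xF;     /* low 4 bits select 1 of 16 sets */
    f.tlbt = f.vpn >> 4;      /* remaining 32 bits are the tag */
    for (int i = 0; i < 4; i++)
        f.vpn_level[i] = (f.vpn >> (9 * (3 - i))) & 0x1FF;
    return f;
}

/* Split a 52-bit physical address into L1 d-cache fields. */
pa_fields split_pa(uint64_t pa) {
    assert(pa < (1ULL << 52));      /* 40-bit PPN + 12-bit PPO */
    pa_fields f;
    f.co = pa & 0x3F;               /* 64-byte lines */
    f.ci = (pa >> 6) & 0x3F;        /* 64 sets */
    f.ct = pa >> 12;
    return f;
}
```

Note that CI and CO together cover exactly the 12 offset bits, which are identical in the VA and the PA.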

Memory Mapping For Sharing


• Multiple processes often share data
• Different processes that run the same code (e.g., shell)
• Different processes linked to the same standard libraries
• Different processes share the same file

• It is wasteful to create exact copies of the shared object


• Memory mapping allows us to easily share objects
• Different VM pages point to the same physical page/object

Sharing Revisited: Shared Objects

[Figure: Process 1 virtual memory, physical memory, and Process 2 virtual memory; both processes map the same shared object in physical memory]

• Process 1 maps the shared object.
• The kernel remembers that the object (backed by a unique file) is mapped by Proc. 1 to some physical pages.
• Process 2 maps the shared object.
• Now when Proc. 2 wants to access the same object, the kernel can simply point the PTEs of Proc. 2 to the already-mapped physical pages.

The Problem…
• What if Proc. 2 now wants to modify the shared object, but doesn’t want the modification to be visible to Proc. 1?
• Simplest solution: always create duplicate copies of shared objects, at the cost of wasting space. Not ideal.
• Idea: Copy-on-write (COW)
• First pretend that both processes will share the objects without modifying them. If a modification happens, create separate copies.

Private Copy-on-write (COW) Objects

[Figure: Process 1 virtual memory, physical memory, and Process 2 virtual memory; both processes map one private copy-on-write object through a private copy-on-write area]

• Two processes mapping a private copy-on-write (COW) object.
• Area flagged as private copy-on-write (COW)
• PTEs in private areas are flagged as read-only

Private Copy-on-write (COW) Objects

[Figure: a write to a private COW page; copy-on-write gives the writer its own copy of that page in physical memory]

• Instruction writing to private page triggers page (protection) fault.
• Handler checks the area protection, and sees that it’s a COW object
• Handler then creates new R/W page.
• Instruction restarts upon handler return.
• Copying deferred as long as possible!
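
The difference between a shared mapping and a private copy-on-write mapping can be observed from user code. The sketch below assumes a POSIX system that provides MAP_ANONYMOUS; it uses an anonymous mapping instead of a file-backed object to stay short, and the function name value_seen_by_parent is hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Map one int, fork, let the child overwrite it with 42, and return
   the value the parent sees afterwards (-1 on error).
   sharing_flag must be MAP_SHARED or MAP_PRIVATE. */
int value_seen_by_parent(int sharing_flag) {
    assert(sharing_flag == MAP_SHARED || sharing_flag == MAP_PRIVATE);

    int *cell = mmap(NULL, sizeof *cell, PROT_READ | PROT_WRITE,
                     sharing_flag | MAP_ANONYMOUS, -1, 0);
    if (cell == MAP_FAILED)
        return -1;
    *cell = 1;                    /* value before the fork */

    pid_t pid = fork();
    if (pid < 0) {
        munmap(cell, sizeof *cell);
        return -1;
    }
    if (pid == 0) {               /* child: write, then exit */
        *cell = 42;               /* private mapping: write goes to the child's own copy */
        _exit(0);
    }
    waitpid(pid, NULL, 0);

    int seen = *cell;             /* parent reads after the child is done */
    munmap(cell, sizeof *cell);
    return seen;
}
```

With MAP_SHARED both processes keep pointing at the same physical page, so the parent should observe the child's write. With MAP_PRIVATE the child's write lands on its own copy of the page and the parent still sees the old value.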

Today
• Memory mapping
• Dynamic memory allocation
• Basic concepts
• Implicit free lists


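Process Address Space

[Figure: process virtual address space, listed from high addresses to low]
• Kernel space: memory invisible to user code
• User stack (created at runtime); %rsp (stack pointer) marks the top of the stack
• Memory-mapped region for shared libraries
• Run-time heap (created by malloc); brk marks the top of the heap
• Read/write data segment (.data, .bss), loaded from the executable file
• Read-only code segment (.init, .text, .rodata), loaded from the executable file, starting at 0x400000; the program counter points into this segment
• Unused, from address 0 up to 0x400000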

Dynamic Memory Allocation

• Programmers use dynamic memory allocators (such as malloc) to acquire VM at run time.
• Dynamic memory allocators manage an area of process virtual memory known as the heap.

[Figure: address space from top to bottom: user stack, heap (via malloc) whose top is marked by the brk ptr, uninitialized data (.bss), initialized data (.data), program text (.text)]

The malloc/free Functions

#include <stdlib.h>

void *malloc(size_t size);
• Successful:
• Returns a pointer to a memory block of at least size bytes, aligned to an 8-byte (x86) or 16-byte (x86-64) boundary
• If size == 0, returns either NULL or a unique pointer that can be passed to free (implementation-defined)
• Unsuccessful: returns NULL (0) and sets errno

void free(void *p);
• Returns the block pointed at by p to the pool of available memory
• p must come from a previous call to malloc, calloc, or realloc

Other functions
• calloc: Version of malloc that initializes allocated block to zero.
• realloc: Changes the size of a previously allocated block.
• sbrk: Used internally by allocators to grow or shrink the heap
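
As an illustration of how realloc is typically combined with malloc-style error checking, here is a small growable array. This is only a sketch: the int_vec type, the vec_push name, and the doubling growth policy are assumptions made for the example, not something the lecture prescribes.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array of ints. Start with all fields zero. */
typedef struct {
    int *data;
    size_t len;   /* elements in use */
    size_t cap;   /* elements allocated */
} int_vec;

/* Append value, growing the block with realloc when it is full.
   Returns 0 on success, -1 if the allocation fails. */
int vec_push(int_vec *v, int value) {
    assert(v != NULL);
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        /* realloc(NULL, n) behaves like malloc(n) */
        int *grown = realloc(v->data, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;            /* the old block is still valid */
        v->data = grown;
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}
```

The result of realloc is stored in a temporary first, because overwriting v->data with NULL on failure would leak the old block. The caller releases the storage with free(v.data) when done.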

malloc Example

#include <stdio.h>
#include <stdlib.h>

void foo(int n) {
    int i, *p;

    /* Allocate a block of n ints */
    p = (int *) malloc(n * sizeof(int));
    if (p == NULL) {
        perror("malloc");
        exit(1);    /* nonzero status signals failure */
    }

    /* Initialize allocated block */
    for (i=0; i<n; i++)
        p[i] = i;

    /* Return allocated block to the heap */
    free(p);
}

[Figure: foo's stack frame holds p, i, and n; p points to a block of n * sizeof(int) bytes in the heap (via malloc), which sits above uninitialized data (.bss), initialized data (.data), and program text (.text)]

Why Do We Need Dynamic Allocation?

• Some data structures’ size is only known at runtime. Statically allocating the space would be a waste.
• More importantly: access data across function calls. Variables on stack are destroyed when the function returns!!!

int* foo(int n) {
    int i, *p;

    p = (int *) malloc(n * sizeof(int));
    if (p == NULL) exit(1);

    for (i=0; i<n; i++)
        p[i] = i;

    return p;    /* the heap block outlives foo's stack frame */
}

void bar() {
    int *p = foo(5);
    printf("%d\n", p[0]);
    free(p);     /* the caller now owns the block */
}

[Figure: bar's stack frame holds p; foo's stack frame holds p, i, and n; the block of n * sizeof(int) bytes lives in the heap (via malloc), so it is still valid after foo returns]

Why Do We Need Dynamic Allocation?

• Counter-example: the same code with the array placed on the stack

int* foo() {
    int i;
    int p[5];

    for (i=0; i<5; i++)
        p[i] = i;

    return p;    /* BUG: p lives in foo's stack frame */
}

void bar() {
    int *p = foo();
    printf("%d\n", p[0]);    /* undefined behavior: foo's frame is gone */
}

[Figure: bar's stack frame holds p; foo's stack frame holds the array p and i; nothing is placed in the heap, so once foo returns, bar's p points into a destroyed stack frame]

Dynamic Memory Allocation

• Allocator maintains heap as collection of variable-sized blocks/chunks, which are either allocated or free
• Blocks that are no longer used should be freed to save space

[Figure: a heap drawn as a row of words, containing an allocated block (4 words) and a free block (3 words); legend: allocated word, free word]

• Assumptions Made in This Lecture
• Memory is word addressed
• Words are int-sized

Dynamic Memory Allocation


• Types of allocators
• Explicit allocator: application (i.e., programmer) allocates and frees space
• E.g., malloc and free in C
• Implicit allocator: application allocates, but does not free space
• E.g. garbage collection in Java, JavaScript, Python, etc…

• Will discuss simple explicit memory allocation today


Allocation Example

p1 = malloc(4)
p2 = malloc(5)
p3 = malloc(6)
free(p2)
p4 = malloc(2)

[Figure: heap contents, in words, after each request]

Constraints
• Applications
• Can issue arbitrary sequence of malloc and free requests
• free request must be to a malloc’d block

• Allocators
• Can’t control number or size of allocated blocks
• Must respond immediately to malloc requests
• i.e., can’t reorder or buffer requests
• Must allocate blocks from free memory
• i.e., can place allocated blocks only in free memory
• Must align blocks so they satisfy all alignment requirements
• 8-byte (x86) or 16-byte (x86-64) alignment on Linux boxes
• Can manipulate and modify only free memory
• Can’t move the allocated blocks once they are malloc’d

External Fragmentation
• Occurs when there is enough aggregate heap memory, but no single free block is large enough

p1 = malloc(4)
p2 = malloc(5)
p3 = malloc(6)
free(p2)
p4 = malloc(6) Oops! (what would happen now?)

• Depends on the pattern of future requests
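
The definition of external fragmentation can be stated as a simple check: the total free space covers the request, yet no single free block does. The sketch below assumes the free blocks are given as an array of sizes in words; the function names and the sizes used in the usage note are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of all free block sizes (in words). */
int total_free_words(const int *free_sizes, size_t n) {
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += free_sizes[i];
    return total;
}

/* Returns 1 if some single free block can hold the request, else 0. */
int fits_in_single_block(const int *free_sizes, size_t n, int request) {
    assert(request > 0);
    for (size_t i = 0; i < n; i++)
        if (free_sizes[i] >= request)
            return 1;
    return 0;
}
```

For example, with hypothetical free blocks of 5 and 2 words, a 6-word request is covered by the 7 free words in aggregate but fits in neither block, which is exactly the external fragmentation case.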

Key Issues in Dynamic Memory Allocation


• Free:
• How do we know how much memory to free given just a pointer?
• How do we keep track of the free blocks?
• How do we reinsert freed block?

• Allocation:
• What do we do with the extra space when allocating a structure
that is smaller than the free block it is placed in?
• How do we pick a block to use for allocation -- many might fit?


Knowing How Much to Free

• Standard method
• Keep the length of a block in the word preceding the block.
• This word is often called the header field or header
• Requires an extra word for every allocated block

p0 = malloc(4)
free(p0)

[Figure: after p0 = malloc(4), the block occupies 5 words: a header word holding the length 5, followed by the 4-word payload that p0 points to]
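
The header idea can be sketched with a toy bump allocator over an int array, following the lecture's assumption that memory is word addressed and words are int-sized. The names toy_malloc and toy_block_size and the heap size are hypothetical, and this toy never reuses freed space.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_WORDS 64             /* hypothetical toy heap size */

static int heap[HEAP_WORDS];
static size_t next_free = 0;      /* index of the first unused word */

/* Allocate `words` payload words plus one header word.
   Returns a pointer to the payload, or NULL if the heap is full. */
int *toy_malloc(int words) {
    if (words <= 0 || next_free + (size_t)words + 1 > HEAP_WORDS)
        return NULL;
    int *header = &heap[next_free];
    *header = words + 1;          /* block length includes the header */
    next_free += (size_t)words + 1;
    return header + 1;            /* payload starts after the header */
}

/* Recover the block length from nothing but the payload pointer. */
int toy_block_size(const int *p) {
    assert(p != NULL);
    return p[-1];                 /* the word preceding the payload */
}
```

A request for 4 words therefore consumes 5 words of heap, and a free routine handed only p0 can find the length in the word just before it.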

Internal Fragmentation

[Figure: a block made of the payload with internal fragmentation on either side of it]

• For a given block, internal fragmentation occurs if payload is smaller than block size
• Caused by
• Overhead of maintaining heap data structures
• Padding for alignment purposes
• Explicit policy decisions (e.g., to return a big block to satisfy a small request)
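
To make the first two causes concrete, the sketch below computes the block size an allocator might choose for a payload. The 4-byte header and 8-byte alignment are assumptions picked for the example, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum {
    HEADER_BYTES = 4,   /* assumed header size */
    ALIGNMENT    = 8    /* assumed block alignment */
};

/* Smallest multiple of ALIGNMENT that holds header + payload. */
size_t block_size_for(size_t payload) {
    assert(payload > 0);
    size_t raw = payload + HEADER_BYTES;
    return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
}

/* Bytes in the block that do not hold application data. */
size_t internal_fragmentation(size_t payload) {
    return block_size_for(payload) - payload;
}
```

Under these assumptions a 5-byte payload gets a 16-byte block, so 11 bytes are lost to the header and padding, while a 12-byte payload in the same 16-byte block loses only the 4 header bytes.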

Keeping Track of Free Blocks

• Method 1: Implicit list using length: links all blocks
[Figure: heap with blocks of length 5, 4, 6, 2; each block's length leads to the next block]

• Method 2: Explicit list among the free blocks using pointers
[Figure: the same heap with blocks of length 5, 4, 6, 2; pointers link only the free blocks]

• Method 3: Segregated free list
• Different free lists for different size classes

• Method 4: Blocks sorted by size
• Can use a balanced tree (e.g. Red-Black tree) with pointers within each free block, and the length used as a key

Today
• Memory mapping
• Dynamic memory allocation
• Basic concepts
• Implicit free lists


Implicit List
• For each block we need both size and allocation status
• Could store this information in two words: wasteful!

[Figure: heap with blocks of length 5, 4, 6, 2]

• Standard trick
• If blocks are aligned, some low-order bits of the block size are always 0
• Instead of storing an always-0 bit, use it as an allocated/free flag
• When reading size word, must mask out this bit

Format of allocated and free blocks
• Header (1 word): Size, with the flag a in the lowest bit
• a = 1: Allocated block
• a = 0: Free block
• Size: block size
• Payload: application data (allocated blocks only)
• Optional padding
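
The standard trick amounts to three one-line helpers. This is a minimal sketch assuming block sizes are always even, so that bit 0 is free to carry the flag; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Combine an even block size and an allocated flag into one header word. */
uint32_t pack(uint32_t size, uint32_t alloc) {
    assert((size & 1u) == 0);     /* bit 0 must be free for the flag */
    return size | (alloc & 1u);
}

/* Mask out the flag bit to recover the size. */
uint32_t get_size(uint32_t header) {
    return header & ~1u;
}

/* Extract the allocated flag: 1 = allocated, 0 = free. */
uint32_t get_alloc(uint32_t header) {
    return header & 1u;
}
```

For instance, an allocated 16-byte block is stored as the header value 17, and masking recovers the size 16.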

Detailed Implicit Free List Example

[Figure: heap layout from the start of the heap: one unused word, then blocks with headers 8/0, 16/1, 32/0, 16/1, then a terminating header 0/1]

• Double-word aligned
• Allocated blocks: shaded
• Free blocks: unshaded
• Headers: labeled with size in bytes/allocated bit
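
The example heap can be rebuilt and traversed in a few lines. The sketch assumes 4-byte words and headers, and it models the heap as a 20-word array; build_example_heap and walk_heap are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD 4   /* assumed word and header size in bytes */

/* Lay out the headers from the figure: 8/0, 16/1, 32/0, 16/1, 0/1. */
void build_example_heap(uint32_t heap[20]) {
    for (int i = 0; i < 20; i++)
        heap[i] = 0;
    /* heap[0] is the unused word that keeps payloads double-word aligned */
    heap[1]  = 8  | 0;    /* free block, 2 words */
    heap[3]  = 16 | 1;    /* allocated block, 4 words */
    heap[7]  = 32 | 0;    /* free block, 8 words */
    heap[15] = 16 | 1;    /* allocated block, 4 words */
    heap[19] = 0  | 1;    /* terminating header: size 0, allocated */
}

/* Walk from the first header to the size-0 terminator.
   Returns the number of free blocks and stores their total size. */
size_t walk_heap(const uint32_t *first_header, uint32_t *free_bytes) {
    size_t free_blocks = 0;
    *free_bytes = 0;
    for (const uint32_t *h = first_header; (*h & ~1u) != 0; ) {
        uint32_t size = *h & ~1u;
        assert(size % WORD == 0);
        if ((*h & 1u) == 0) {
            free_blocks++;
            *free_bytes += size;
        }
        h += size / WORD;         /* the size leads to the next header */
    }
    return free_blocks;
}
```

The list is implicit because no pointers are stored: each header's size is enough to reach the next header.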

Finding a Free Block

• First fit:
• Search list from beginning, choose first free block that fits
• Can take linear time in total number of blocks (allocated and free)
• In practice it can cause “splinters” at beginning of list

• Next fit:
• Like first fit, but search list starting where previous search finished
• Should often be faster than first fit: avoids re-scanning unhelpful blocks
• Some research suggests that fragmentation is worse

• Best fit:
• Search the list, choose the best free block: fits, with fewest bytes left over
• Keeps fragments small, which usually improves memory utilization
• Will typically run slower than first fit
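
First fit and best fit differ only in when the scan stops. The sketch below assumes an implicit list of 4-byte headers (size in bytes, allocated flag in bit 0) ended by a size-0 header, and it takes the needed block size in bytes, header included; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD 4   /* assumed word and header size in bytes */

/* First fit: return the first free block of at least `need` bytes. */
const uint32_t *first_fit(const uint32_t *h, uint32_t need) {
    for (; (*h & ~1u) != 0; h += (*h & ~1u) / WORD) {
        assert((*h & ~1u) % WORD == 0);
        if ((*h & 1u) == 0 && (*h & ~1u) >= need)
            return h;
    }
    return NULL;                  /* no block fits */
}

/* Best fit: scan every block, keep the smallest free one that fits. */
const uint32_t *best_fit(const uint32_t *h, uint32_t need) {
    const uint32_t *best = NULL;
    for (; (*h & ~1u) != 0; h += (*h & ~1u) / WORD) {
        uint32_t size = *h & ~1u;
        int is_free = (*h & 1u) == 0;
        if (is_free && size >= need &&
            (best == NULL || size < (*best & ~1u)))
            best = h;
    }
    return best;
}
```

Best fit always visits the whole list, which is why it typically runs slower. Next fit would additionally remember where the previous search ended and is not shown here.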

Allocating in Free Block

• Allocated space might be smaller than free space
• We could simply leave the extra space there. Simple to implement but causes internal fragmentation
• Or we could split the block

[Figure: heap with block lengths 4, 4, 6, 2; placing a 4-word block in the 6-word free block splits it, giving 4, 4, 4, 2, 2]

typedef int *ptr;                            // pointer to a block's header word

void addblock(ptr p, int len) {
    int newsize = ((len + 1) >> 1) << 1;     // round up to even
    int oldsize = *p & -2;                   // mask out low bit
    *p = newsize | 1;                        // set new length
    if (newsize < oldsize)
        *(p+newsize) = oldsize - newsize;    // set length in remaining
}                                            // part of block

Freeing a Block
• Simplest implementation:
• Need only clear the “allocated” flag
    void free_block(ptr p) { *p = *p & -2; }
• But can lead to “false fragmentation”

[Figure: heap with block lengths 4, 4, 4, 2, 2; free(p) frees a 4-word block that sits next to a free 2-word block, but the two stay separate blocks]

malloc(5) Oops!

There is enough free space, but the allocator won’t be able to find it

Coalescing
• Join (coalesce) with next/previous blocks, if they are free
• Coalescing with next block

[Figure: heap with block lengths 4, 4, 4, 2, 2; after free(p) the freed 4-word block absorbs the free 2-word block that follows it and becomes a 6-word block; the absorbed block's header is logically gone]

void free_block(ptr p) {
    *p = *p & -2;          // clear allocated flag
    ptr next = p + *p;     // find next block
    if ((*next & 1) == 0)
        *p = *p + *next;   // add to this block if
}                          // not allocated

Coalescing
• How about now?
• How do we coalesce with previous block?
• Linear time solution: scans from beginning

[Figure: heap with block lengths 4, 4, 4, 2, 2; free(p) frees a block whose previous neighbor is already free; merging them means rewriting the previous block's header to 8]

Bidirectional Coalescing (Constant Time)

• Boundary tags [Knuth73]
• Replicate size/allocated word at “bottom” (end) of free blocks
• Allows us to traverse the “list” backwards, but requires extra space
• Important and general technique!
• Disadvantages? (Think of small blocks…)

[Figure: heap of four blocks with lengths 4, 4, 6, 4; each length appears twice, once in the block's header and once in its footer]

Format of allocated and free blocks
• Header: Size, with the flag a in the lowest bit
• a = 1: Allocated block
• a = 0: Free block
• Size: Total block size
• Payload and padding
• Payload: Application data (allocated blocks only)
• Boundary tag (footer): Size, with the flag a in the lowest bit
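
With a footer on every block, the previous block's size is sitting in the word just before the current header, so coalescing in both directions takes constant time. The sketch below follows the lecture's word-addressed model with even block lengths. It assumes the heap is bracketed by an allocated sentinel word on each side, and set_tags and free_and_coalesce are hypothetical names.

```c
#include <assert.h>

typedef int *ptr;   /* pointer to a block's header word */

/* Write matching header and footer for a block of `size` words. */
void set_tags(ptr p, int size, int alloc) {
    assert((size & 1) == 0);      /* bit 0 is reserved for the flag */
    p[0] = size | alloc;          /* header */
    p[size - 1] = size | alloc;   /* boundary tag (footer) */
}

/* Free the block at p and merge it with free neighbors.
   Returns the header of the resulting free block. */
ptr free_and_coalesce(ptr p) {
    int size = *p & -2;

    ptr next = p + size;          /* header of the next block */
    if ((*next & 1) == 0)
        size += *next & -2;       /* absorb the next block */

    int prev_footer = p[-1];      /* footer of the previous block */
    if ((prev_footer & 1) == 0) {
        p -= prev_footer & -2;    /* step back to its header */
        size += prev_footer & -2; /* absorb the previous block */
    }

    set_tags(p, size, 0);
    return p;
}
```

The sentinel words (value 1, meaning size 0 and allocated) keep the neighbor checks from running off either end of the heap. The cost is one extra word per block, which matters most for small blocks.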

Summary of Key Allocator Policies

• Placement policy:
• First-fit, next-fit, best-fit, etc.
• Trades off lower throughput for less fragmentation

• Splitting policy:
• When do we split free blocks?
• How much internal fragmentation are we willing to tolerate?

• Coalescing policy:
• Immediate coalescing: coalesce each time free is called
• Deferred coalescing: try to improve performance of free by deferring coalescing until needed. Examples:
• Coalesce as you scan the free list for malloc
• Coalesce when the amount of external fragmentation reaches some threshold

Implicit Lists: Summary

• Implementation: very simple
• Allocate cost:
• linear time worst case
• Identifying free blocks requires scanning all the blocks!
• Free cost:
• constant time worst case
• Memory usage:
• Will depend on placement policy
• First-fit, next-fit, or best-fit
• Not used in practice for general-purpose malloc/free because of linear-time allocation
• Still used in many special-purpose applications
• However, the concepts of splitting and boundary tag coalescing are general to all allocators