Dynamic Memory Allocation
Overview of functions
The C dynamic memory allocation functions are defined in the stdlib.h header (the cstdlib
header in C++).[1]
Function  Description
malloc    allocates the specified number of bytes
realloc   resizes the specified block of memory, moving it to a new location if necessary
calloc    allocates the specified number of bytes and initializes them to zero
free      releases the specified block of memory back to the system
There are two differences between malloc() and calloc(). First, malloc() takes a single
argument (the amount of memory to allocate in bytes), while calloc() takes two arguments (the
number of elements to allocate, and the size in bytes of a single element). Second, malloc()
does not initialize the memory allocated, while calloc() initializes all bytes of the allocated
memory block to zero.
Usage example
Creating an array of 10 integers with automatic scope is straightforward:
int array[10];
However, the size of the array is fixed at compile time. If one wishes to allocate a similar array
dynamically, the following code can be used:
/* Allocate space for an array with ten elements of type int. Some
   programmers place an optional "(int *)" cast before malloc. */
int *array = malloc(10 * sizeof(int));

/* Check whether the memory could be allocated; if not, handle the
   error as appropriate. */
if (NULL == array) {
    /* Handle error... */
}

/* When done with the array, free (release) the block of memory. */
free(array);

/* Ensure the freed pointer is not used again by setting it to NULL (or
   to another allocated memory region). */
array = NULL;
malloc() returns a null pointer to indicate that no memory is available, or that some other error
occurred which prevented memory being allocated.
Type safety
malloc returns a void pointer (void *), which indicates that it is a pointer to a region of
unknown data type. The use of casting is required in C++ due to the strong type system, whereas
this is not the case in C. The lack of a specific pointer type returned from malloc is type-unsafe
behaviour according to some programmers: malloc allocates based on byte count but not on
type. This is different from the C++ new operator that returns a pointer whose type relies on the
operand. (See C Type Safety.)
One may "cast" (see type conversion) this pointer to a specific type:
int *ptr;
ptr = malloc(10 * sizeof (*ptr));                         /* without a cast */
ptr = (int *)malloc(10 * sizeof (*ptr));                  /* with a cast */
ptr = reinterpret_cast<int *>(malloc(10 * sizeof (*ptr))); /* with a cast, for C++ */
Common errors
The improper use of dynamic memory allocation can frequently be a source of bugs.
Not checking for allocation failures. Memory allocation is not guaranteed to succeed. If no
check for successful allocation is implemented, this usually leads to a crash of the
program, or of the entire system.
Memory leaks. Failure to deallocate memory using free leads to a buildup of non-reusable
memory, which is no longer used by the program. This wastes memory resources and can
lead to allocation failures when these resources are exhausted.
Logical errors. All allocations must follow the same pattern: allocation using malloc,
usage to store data, deallocation using free. Failure to adhere to this pattern, such as
using memory after a call to free or before a call to malloc, or calling free twice ("double
free"), usually leads to a crash of the program.
Implementations
The implementation of memory management depends greatly upon operating system and
architecture. Some operating systems supply an allocator for malloc, while others supply
functions to control certain regions of data. The same dynamic memory allocator is often used to
implement both malloc and the operator new in C++[citation needed]. Hence, it is referred to below as
the allocator rather than malloc.
Heap-based
Implementation of the allocator on IA-32 architectures is commonly done using the heap, or data
segment. The allocator will usually expand and contract the heap to fulfill allocation requests.
The heap method suffers from a few inherent flaws, stemming entirely from fragmentation. Like
any method of memory allocation, the heap will become fragmented; that is, there will be
sections of used and unused memory in the allocated space on the heap. A good allocator will
attempt to find an unused area of already allocated memory to use before resorting to expanding
the heap. The major problem with this method is that the heap has only two significant attributes:
base, or the beginning of the heap in virtual memory space; and length, or its size. The heap
requires enough system memory to fill its entire length, and its base can never change. Thus, any
large areas of unused memory are wasted. The heap can get "stuck" in this position if a small
used segment exists at the end of the heap, which can waste any amount of address space,
from a few megabytes to a few hundred megabytes.
dlmalloc
Doug Lea has developed dlmalloc ("Doug Lea's Malloc") as a general-purpose allocator, starting
in 1987. The GNU C library (glibc) uses ptmalloc,[10] an allocator based on dlmalloc.[11]
Memory on the heap is allocated as "chunks": 8-byte-aligned data structures that contain a
header and usable memory. Allocated memory carries an 8- or 16-byte overhead for the size of
the chunk and usage flags. Unallocated chunks also store pointers to other free chunks in the
usable space area, making the minimum chunk size 24 bytes.[11]
Unallocated memory is grouped into "bins" of similar sizes, implemented by using a doubly
linked list of chunks (with pointers stored in the unallocated space inside the chunk).[11]
For requests below 256 bytes (a "smallbin" request), a simple power-of-two best-fit allocator is
used. If there are no free blocks in that bin, a block from the next highest bin is split in two.
For requests of 256 bytes or above but below the mmap threshold, recent versions of dlmalloc
use an in-place bitwise trie algorithm. If there is no free space left to satisfy the request, dlmalloc
tries to increase the size of the heap, usually via the brk system call.
For requests above the mmap threshold (a "largebin" request), the memory is always allocated
using the mmap system call. The threshold is usually 256 KB.[12] The mmap method averts
problems with huge buffers trapping a small allocation at the end after their expiration, but
always allocates an entire page of memory, which on many architectures is 4096 bytes in size.[13]
jemalloc
Since FreeBSD 7.0 and NetBSD 5.0, the old malloc implementation (phkmalloc) has been
replaced by jemalloc, written by Jason Evans. The main reason for this was phkmalloc's lack of
scalability in terms of multithreading. To avoid lock contention, jemalloc uses separate
"arenas" for each CPU. Experiments measuring the number of allocations per second in
multithreaded applications have shown that this makes it scale linearly with the number of
threads, while the performance of both phkmalloc and dlmalloc was inversely proportional to the
number of threads.[14]
OpenBSD's malloc
OpenBSD's implementation of the malloc function makes use of mmap. For requests greater in
size than one page, the entire allocation is retrieved using mmap; smaller sizes are assigned from
memory pools maintained by malloc within a number of "bucket pages," also allocated with
mmap. On a call to free, memory is released and unmapped from the process address space using
munmap. This system is designed to improve security by taking advantage of the address space
layout randomization and gap page features implemented as part of OpenBSD's mmap system
call, and to detect use-after-free bugs—as a large memory allocation is completely unmapped
after it is freed, further use causes a segmentation fault and termination of the program.
Hoard's malloc
The Hoard memory allocator is designed for scalable memory allocation performance. Like
OpenBSD's allocator, Hoard uses mmap exclusively, but manages memory in
chunks of 64 kilobytes called superblocks. Hoard's heap is logically divided into a single global
heap and a number of per-processor heaps. In addition, there is a thread-local cache that can hold
a limited number of superblocks. By allocating only from superblocks on the local per-thread or
per-processor heap, and moving mostly-empty superblocks to the global heap so they can be
reused by other processors, Hoard keeps fragmentation low while achieving near linear
scalability with the number of threads.[15]
TCMalloc
In TCMalloc, a malloc developed by Google,[16] every thread has local storage for small
allocations. For large allocations, mmap or sbrk can be used. TCMalloc has garbage collection
for the local storage of dead threads. It is considered to be more than twice as fast as glibc's
ptmalloc for multithreaded programs.[17][18]
In-kernel
Operating system kernels need to allocate memory just as application programs do. The
implementation of malloc within a kernel often differs significantly from the implementations
used by C libraries, however. For example, memory buffers might need to conform to special
restrictions imposed by DMA, or the memory allocation function might be called from interrupt
context.[19] This necessitates a malloc implementation tightly integrated with the virtual memory
subsystem of the operating system kernel.