Smart Dynamic Memory Allocator For Embedded Systems
Ramakrishna M, Jisung Kim, Woohyong Lee and Youngki Chung
Embedded Software Platform, System LSI Division, Samsung Semiconductor Business
978-1-4244-2881-6/08/$25.00 ©2008 IEEE

Abstract—Dynamic memory (DM) allocation is one of the most crucial components of modern software engineering. It offers great flexibility in software system design; nevertheless, developers of real-time systems often avoid dynamic memory allocation because of problems such as unbounded or long bounded response times and memory fragmentation. However, modern complex applications such as multimedia streaming and network applications have made dynamic memory allocation mandatory in application design. The major challenges for a memory allocator are minimizing fragmentation, providing a good response time, and maintaining good locality among the memory blocks. This paper introduces a new smart dynamic memory allocator intended particularly for embedded systems that have limited memory and processing power. It aims to address the major challenges of dynamic memory allocation. The smart allocator predicts the short-lived objects and allocates those objects on one side of the heap memory and the remaining objects on the other side of the heap memory, for effective utilization of the memory footprint. The allocator is implemented with an enhanced multilevel segregated mechanism using lookup tables and hierarchical bitmaps that ensure very good response time and reliable timing performance. The proposed algorithm has shown excellent experimental results, with respect to both response time and memory footprint usage, compared with well-known algorithms. In addition, this paper presents a memory intensive synthetic (MIS) workload, which can model the allocation behavior of most real applications.

Index Terms—Dynamic memory allocation, Embedded systems, Memory workload

I. INTRODUCTION

Dynamic memory allocation has been one of the most active research areas in computer systems for over four decades. A large number of algorithms for dynamic memory allocation have been proposed in the literature. In particular, some algorithms achieve a good and reliable timing response at the cost of a high memory footprint. However, the use of DM allocators is still considered unreliable for embedded systems because of their unbounded or long response times or their inefficient memory usage. DM allocators for embedded systems should be implemented within their constrained operating systems, taking the limited available resources into account. Hence, the allocators should provide both optimum memory footprint usage and good response time simultaneously.

An important issue in the design of a DM allocator is response time. However, issues such as fragmentation, locality, false sharing and mutual exclusion are equally important in the design of DM allocators [13]. Ideally, a memory allocator targeted at a wide range of applications should take constant time, independent of the number of free blocks and the length of execution; should use a small memory footprint; and should provide good locality by placing blocks that are allocated close in time, as well as objects of similar size, close together in space.

The majority of memory blocks used in any real application are small, typically smaller than a few kilobytes [1, 2]. Moreover, most memory blocks used in any application last for short durations [3]. If all memory requests are allocated from a single memory chunk (heap) without considering life-spans, the heap memory will be splintered over time by the short-lived blocks. Since the short-lived blocks are most likely the small blocks, the effect of those objects on memory fragmentation is more severe than that of the long-lived objects. If the memory requirements and average life-span of the objects were known, using a special chunk of memory for short-lived objects would solve the problem. Unfortunately, the memory requirements of recent applications such as multimedia streaming and wireless applications are unpredictable; moreover, the average memory requirement varies widely from one configuration to another [11]. Hence, dedicating a special memory chunk sized for the worst-case memory requirements would lead to a high overhead in memory space. The problem of handling short-lived and long-lived objects effectively, without incurring any extra memory overhead while still providing the advantage of a special memory chunk, is addressed in this memory allocator.

In this paper, a new allocation methodology is introduced that allows a custom dynamic memory allocation mechanism to be designed with reduced memory fragmentation and excellent response time. In addition, the allocation mechanism ensures that the worst-case response time is always bounded and almost independent of the application's execution time.

The remainder of the paper is organized as follows. Section II describes the background and related work. Section III introduces the design of the smart memory allocator. Section IV describes the workloads used for the performance evaluation. Section V presents the performance of the proposed allocator and other state-of-the-art allocators. Section VI presents the conclusion.

II. BACKGROUND

A vast number of DM allocators have been proposed in the literature. The basic algorithms are first-fit, best-fit, next-fit, worst-fit, segregated-fit, bitmapped algorithms and buddy systems. The remaining algorithms are variants and combinations of these basic algorithms. However, describing the exhaustive list of all algorithms is beyond the
scope of this paper. An overview of DM allocators is available in the literature [4, 5, 12].

The major issues of any memory allocator are response time, fragmentation, locality and cache pollution. Other issues such as mutual exclusion and synchronization are not described here, since the proposed algorithm aims at minimizing fragmentation and serving requests within a bounded response time. However, the smart allocator framework can be used with any fine-grained locking mechanism for mutual exclusion.

Fragmentation denotes wasted space in memory. It occurs in two forms: internal and external fragmentation. Internal fragmentation occurs when memory space is allocated to a program without ever being intended for use, such as headers, footers and padding around an allocated object. External fragmentation is the phenomenon in which free space is splintered into numerous small blocks over time, and those blocks cannot satisfy future memory requests for large blocks. Unlike internal fragmentation, external fragmentation is difficult to quantify, since it depends entirely on future memory requests. External fragmentation is usually measured as the proportion of total free memory space available at the time of memory overflow.

Cache pollution is another issue in the design of a memory allocator. A memory allocator may degrade cache performance by accessing several objects before finding a suitable free block for a memory allocation request. If several objects are accessed for each request, the allocator may cause several cache misses [6].

Memory allocation algorithms are broadly classified into the following categories [5]: sequential fits (first-fit, next-fit, best-fit and worst-fit), segregated fits, buddy systems [13] and bitmaps.

number of allocation events is a good measure for objects' prediction. The proposed allocator uses a combination of object size and number of allocation events for the prediction of object lifetimes.

Fig. 1 Data Structures of Free and Allocated Blocks

Fig. 1 shows the structure of free and used data blocks. The memory allocator inserts header information into each used and free block. The header of a free block holds the following fields: BS (32 bits; the last two bits are always zero, since block sizes are always multiples of 4), which specifies the size of the block; BT (1 bit), which specifies the block type; AV (1 bit), which specifies the block status; and Prev_Physical_BlkPtr and Next_Physical_BlkPtr, which are required for identifying the status of physically adjacent blocks. Prev_FreeListPtr and Next_FreeListPtr are required for locating the previous and next free blocks in a segregated free list. The header of a used block holds all the fields of the free block header except the free-list pointers, which are not required for used blocks because those blocks are not linked into any segregated free list. However, the used block's header holds an extra field called BlkAllocStat, which keeps the block allocation statistics for the block prediction algorithm. The header overhead is counted as internal fragmentation.
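
The block header layout described for Fig. 1 can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: packing BT and AV into the two always-zero low bits of BS, sharing storage between the free-list pointers and BlkAllocStat through a union, the polarity of the AV bit (set means free), the 32-bit width of BlkAllocStat, and all identifier names are choices made only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: BT and AV live in the two low bits of BS, which the text
 * says are always zero because block sizes are multiples of 4. */
#define BT_MASK   0x1u                 /* BT: block type                 */
#define AV_MASK   0x2u                 /* AV: block status, set = free   */
#define SIZE_MASK (~(uint32_t)0x3u)    /* remaining bits hold the size   */

typedef struct block_header {
    uint32_t bs;                         /* BS plus packed BT/AV flags    */
    struct block_header *prev_physical;  /* Prev_Physical_BlkPtr          */
    struct block_header *next_physical;  /* Next_Physical_BlkPtr          */
    union {                              /* assumption: fields overlap    */
        struct {                         /* meaningful for free blocks    */
            struct block_header *prev_free;  /* Prev_FreeListPtr          */
            struct block_header *next_free;  /* Next_FreeListPtr          */
        } links;
        uint32_t blk_alloc_stat;         /* used blocks: BlkAllocStat     */
    } u;
} block_header;

/* Size in bytes with the flag bits masked off. */
static uint32_t block_size(const block_header *b) { return b->bs & SIZE_MASK; }

/* Nonzero when the AV bit marks the block as free. */
static int block_is_free(const block_header *b) { return (b->bs & AV_MASK) != 0; }

/* Value of the BT bit (0 or 1); its meaning is not defined in this section. */
static int block_type(const block_header *b) { return (int)(b->bs & BT_MASK); }

/* Initialise a header; size must be a multiple of 4. */
static void block_init(block_header *b, uint32_t size, int type, int is_free)
{
    assert((size & 0x3u) == 0u);
    b->bs = size | (type ? BT_MASK : 0u) | (is_free ? AV_MASK : 0u);
    b->prev_physical = NULL;
    b->next_physical = NULL;
    b->u.links.prev_free = NULL;
    b->u.links.next_free = NULL;
}

/* Mark a block as used: clear AV and store the allocation statistics in
 * the space that held the free-list links. */
static void block_mark_used(block_header *b, uint32_t stat)
{
    b->bs &= ~AV_MASK;
    b->u.blk_alloc_stat = stat;
}
```

Under these assumptions, a used block pays no extra header space for BlkAllocStat, because the free-list pointers are not needed while the block is allocated. Whatever header space remains is the overhead that the text counts as internal fragmentation.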