Embedded Software 5
• Linux configures the memory management unit (MMU) of the CPU to present a
virtual address space to a running program
• Begins at zero and ends at the highest address, 0xffffffff, on a 32-bit processor
• Divided into pages of 4 KiB (there are rare examples of systems using other page sizes)
• Linux divides this virtual address space into an area for applications, called user
space, and an area for the kernel, called kernel space
• The split between the two is set by a kernel configuration parameter named PAGE_OFFSET
• In a typical 32-bit embedded system, PAGE_OFFSET is 0xc0000000
• This allocates the lower 3 GiB to user space and the top 1 GiB to kernel space
• The user address space is allocated per process so that each process runs in a
sandbox, separated from the others
• The kernel address space is the same for all processes: there is only one kernel
• Pages in this virtual address space are mapped to physical addresses by the
MMU, which uses page tables to perform the mapping
VIRTUAL MEMORY BASICS (CONTD.)
• Linux employs a lazy allocation strategy for user space, only mapping
physical pages of memory when the program accesses them
• For example, allocating a buffer of 1 MiB using malloc() returns a pointer
to a block of memory addresses but no actual physical memory
• A flag is set in the page table entries such that any read or write access is
trapped by the kernel
• This is known as a page fault
• Only at this point does the kernel attempt to find a page of physical memory
and add it to the page table mapping for the process
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>
#define BUFFER_SIZE (1024 * 1024)

int main (int argc, char *argv[])
{
    unsigned char *p;
    printf("Initial state\n");

Output:
Initial state
• You can see the memory map for a process through the proc filesystem
• As an example, here is the map for the init process, PID 1:
SWAPPING
• A process starts with a certain amount of memory mapped to the text (the code)
and data segments of the program file, together with the shared libraries that it is
linked with
• It can allocate memory on its heap at runtime using malloc() and on the stack through
locally scoped variables and memory allocated through alloca()
• It may also load libraries dynamically at runtime using dlopen(3)
• A process can also manipulate its memory map explicitly using mmap():
• void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
• This function maps length bytes of memory from the file with the descriptor fd, starting at
offset in the file, and returns a pointer to the mapping, assuming it is successful
• Since the underlying hardware works in pages, length is rounded up to the nearest whole
number of pages
• The protection parameter, prot, is a combination of read, write, and execute permissions and
the flags parameter contains at least MAP_SHARED or MAP_PRIVATE
USING MMAP TO ALLOCATE PRIVATE MEMORY
• A driver can allow its device node to be mapped with mmap() and so share some
of the device memory with an application
• The exact implementation is dependent on the driver
• One example is the Linux framebuffer, /dev/fb0
• The interface is defined in /usr/include/linux/fb.h, including an ioctl function
to get the size of the display and the bits per pixel
• You can then use mmap to ask the video driver to share the framebuffer with the
application and read and write pixels
HOW MUCH MEMORY DOES MY APPLICATION USE?
• As with kernel space, the different ways of allocating, mapping, and sharing user
space memory make it quite difficult to answer this seemingly simple question
• To begin, you can ask the kernel how much memory it thinks is available, which
you can do using the free command:
• You can force the kernel to free up caches by writing a number between 1 and 3
to /proc/sys/vm/drop_caches:
PER-PROCESS MEMORY USAGE
• There are several metrics to measure the amount of memory a
process is using.
• The two that are easiest to obtain are the virtual set size (Vss) and the
resident memory size (Rss):
• Vss: Called VSZ in the ps command and VIRT in top
• Total amount of memory mapped by a process
• It is the sum of all the regions shown in /proc/<PID>/maps
• This number is of limited interest since only part of the virtual memory is committed to
physical memory at any time
• Rss: Called RSS in ps and RES in top
• Sum of memory that is mapped to physical pages of memory
• This gets closer to the actual memory budget of the process, but there is a problem:
• If you add the Rss of all the processes, you will get an overestimate of the memory in use
because some pages are shared between processes
IDENTIFYING MEMORY LEAKS
• A memory leak occurs when memory is allocated but not freed when
it is no longer needed
• Memory leakage is not unique to embedded systems
• But it becomes an issue because targets don't have much memory and they
often run for long periods of time without rebooting
• You will realize that there is a leak when you run free or top and
see that free memory is continually going down even if you drop
caches
• There are several tools to identify memory leaks in a program
• Here: mtrace
MTRACE
#include <mcheck.h>
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int j;

    mtrace();

    for (j = 0; j < 2; j++)
        malloc(100); /* Never freed: a memory leak */

    calloc(16, 16); /* Never freed: a memory leak */

    exit(EXIT_SUCCESS);
}

$ export MALLOC_TRACE=mtrace.log
$ ./mtrace-example
$ mtrace mtrace-example mtrace.log

Memory not freed:
-----------------
           Address     Size     Caller
0x0000000001479460     0x64  at /home/chris/mtrace-example.c:11
0x00000000014794d0     0x64  at /home/chris/mtrace-example.c:11
0x0000000001479540    0x100  at /home/chris/mtrace-example.c:15
RUNNING OUT OF MEMORY
• I/O memory is simply a region of RAM-like locations that the device makes
available to the processor over the bus
• This memory can be used for a number of purposes, such as holding video
data or Ethernet packets, as well as implementing device registers that
behave just like I/O ports
• According to the computer platform and bus being used, I/O memory may
or may not be accessed through page tables
• Device memory regions must be allocated prior to use
• int check_mem_region(unsigned long start, unsigned long len);
• struct resource *request_mem_region(unsigned long start, unsigned long len, char *name);
• Returns a non-NULL pointer on success and NULL if the region cannot be claimed
• void release_mem_region(unsigned long start, unsigned long len);