Libelf by Example
Joseph Koshy
March 8, 2012
Contents
1 Introduction
1.1 Tutorial Overview
1.2 Tutorial Structure
2 Getting Started
2.1 Example: Getting started with libelf
3 Peering Inside an ELF Object
3.1 The Layout of an ELF file
3.2 Example: Reading an ELF executable header
4 Examining the Program Header Table
4.1 The ELF Program Header Table
4.2 Example: Reading a Program Header Table
5 Looking at Sections
5.1 ELF section handling with libelf
5.2 Example: Listing section names
6 Creating new ELF objects
6.1 Example: Creating an ELF object
6.2 The finer points in creating ELF objects
7 Processing ar(1) archives
7.1 Archive structure
7.2 Example: Stepping through an ar(1) archive
8 Conclusion
8.1 Further Reading
8.2 Getting Further Help
Preface
This tutorial introduces the libelf library being developed at the ElfToolChain
project on SourceForge.Net. It shows how this library can be used to create tools
that can manipulate ELF objects for native and non-native architectures.
The ELF(3)/GELF(3) APIs are discussed, as is handling of ar(1) archives.
The ELF format is discussed to the extent needed to understand the use of the
ELF(3) library.
Knowledge of the C programming language is a pre-requisite.
Legal Notice
© 2006-2012 Joseph Koshy. All rights reserved.
Copyright
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
Disclaimer
THIS DOCUMENTATION IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR AND CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
USE OF THIS DOCUMENTATION, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Acknowledgements
The following people (names in alphabetical order) offered constructive criticism
of this tutorial: Cherry George Mathew, Douglas Fraser, Hyogeol Lee, Kai
Wang, Prashanth Chandra, Ricardo Nabinger Sanchez, Sam Arun Raj, WeiJen Chen and Y. Giridhar Appaji Nag. Thank you, all.
Chapter 1
Introduction
ELF stands for Executable and Linking Format (originally, Extensible Linking Format). It is a format for use by compilers,
linkers, loaders and other tools that manipulate object code.
The ELF specification was released to the public in 1990 as an open standard by a group of vendors. As a result of its ready availability it has been
widely adopted by industry and the open-source community. The ELF standard
supports 32- and 64-bit architectures of both big and little-endian kinds, and
supports features like cross-compilation and dynamic shared libraries. ELF also
supports the special compilation needs of the C++ language.
Among open-source operating systems, the RedHat™ RHL 2.0 Beta release
(late summer 1995) and the Slackware v3.0 (November 1995) release were among
the first Linux™-based operating systems to use ELF. The first ELF-based
release for NetBSD™ was for the DEC Alpha™ architecture, in release 1.3
(January 1998). FreeBSD™ switched to using ELF as its object format in
FreeBSD 3.0 (October 1998).
The libelf library provides an API set (ELF(3) and GELF(3)) for application writers to read and write ELF objects with. The library eases the task
of writing cross-tools that can run on one machine architecture and manipulate
ELF objects for another.
There are multiple implementations of the ELF(3)/GELF(3) APIs in the
open-source world. This tutorial is based on the libelf library being developed
as part of the elftoolchain project on SourceForge.Net.
Target Audience
This tutorial would be of interest to developers wanting to create ELF processing
tools using the libelf library.
1.1 Tutorial Overview
1.2 Tutorial Structure
One of the goals of this tutorial is to illustrate how to write programs using
libelf. So we will jump into writing code at the earliest opportunity. As we
progress through the examples, we introduce the concepts necessary to understand what is happening behind the scenes.
Chapter 2 covers the basics involved in getting started with the
ELF(3) library: how to compile and link an application that uses libelf. We
look at the way a working ELF version number is established by an application,
how a handle to an ELF object is obtained, and how error messages from the
ELF library are reported. The functions used in this chapter include elf_begin,
elf_end, elf_errmsg, elf_errno, elf_kind and elf_version.
Chapter 3 shows how an application can look inside an ELF
object and understand its basic structure. Along the way we will examine
the way ELF objects are laid out. Other key concepts covered are the
notions of file representation and memory representation of ELF data types.
New APIs covered include elf_getident, elf_getphdrnum, elf_getshdrnum,
elf_getshdrstrndx, gelf_getehdr and gelf_getclass.
Chapter 4 describes the ELF program header table and shows
how an application can retrieve this table from an ELF object. This chapter
introduces the gelf_getphdr function.
[Figure: concept map of the tutorial. Programming with libelf is linked to basic concepts (memory and file representations, ELF version, class and byte order, object layout, extended numbering, symbol and string tables), to the ELF(3) and GELF(3) APIs (library data types, ordering of calls, memory management), to ar(1) archives (archive structure, sequential and random access), and to the executable header, program headers and ELF segments, and section headers and ELF sections (type, flags and alignment, iterating over sections, reading and writing data, layout constraints).]
Chapter 5 then looks at how data is stored in ELF sections. A program that looks at ELF sections is examined. The Elf_Scn and Elf_Data data
types used by the library are introduced. The functions covered in this chapter include elf_getscn, elf_getdata, elf_nextscn, elf_strptr, and gelf_getshdr.
Chapter 6 looks at how we create ELF objects. We cover the
rules for ordering the individual API calls when creating ELF objects. We look
at the library's object layout rules and how an application can choose to override these. The APIs covered include elf_fill, elf32_getshdr, elf32_newehdr, elf32_newphdr, elf_flagphdr, elf_ndxscn, elf_newdata, elf_newscn,
and elf_update.
The libelf library also assists applications that need to read ar archives.
Chapter 7 covers how to use the ELF(3) library to handle ar archives.
This chapter covers the use of the elf_getarhdr, elf_getarsym, elf_next and
elf_rand functions.
Chapter 8 ends the tutorial with suggestions for further reading.
Chapter 2
Getting Started
Let us dive in and get a taste of programming with libelf.
2.1 Example: Getting started with libelf
Our first program (Program 1, listing 2.1) will open a filename presented to it
on its command line and retrieve the file type as recognized by the ELF library.
This example covers the basics involved in using libelf: how to compile
a program using libelf, how to initialize the library, how to report errors, and
how to wind up.
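Listing 2.1: Program 1
/*
 * Getting started with libelf.
 *
 * $Id: prog1.txt 2133 2011-11-10 08:28:22Z jkoshy $
 */

#include <err.h>
#include <fcntl.h>
#include <libelf.h>             /* (1) */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
    int fd;
    Elf *e;                     /* (2) */
    char *k;
    Elf_Kind ek;                /* (3) */

    if (argc != 2)
        errx(EXIT_FAILURE, "usage: %s file-name", argv[0]);

    /* (4) Establish the working ELF version before any other call. */
    if (elf_version(EV_CURRENT) == EV_NONE)
        errx(EXIT_FAILURE, "ELF library initialization "
            "failed: %s", elf_errmsg(-1));

    if ((fd = open(argv[1], O_RDONLY, 0)) < 0)
        err(EXIT_FAILURE, "open \"%s\" failed", argv[1]);

    if ((e = elf_begin(fd, ELF_C_READ, NULL)) == NULL)
        errx(EXIT_FAILURE, "elf_begin() failed: %s.",
            elf_errmsg(-1));

    ek = elf_kind(e);           /* (7) */

    switch (ek) {
    case ELF_K_AR:
        k = "ar(1) archive";
        break;
    case ELF_K_ELF:
        k = "elf object";
        break;
    case ELF_K_NONE:
        k = "data";
        break;
    default:
        k = "unrecognized";
    }

    (void) printf("%s: %s\n", argv[1], k);

    (void) elf_end(e);          /* (8) */
    (void) close(fd);

    exit(EXIT_SUCCESS);
}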
(1) The functions and datatypes that make up the ELF(3) API are declared in
the header libelf.h. This file must be included in every application that
uses the libelf library.
(2) The ELF(3) library uses an opaque type Elf as a handle for the ELF object
being processed.
(4) Before the functions in the library can be invoked, an application must
indicate to the library the version of the ELF specification it is expecting
to use. This is done by the call to elf_version.
A call to elf_version is mandatory before other functions in the ELF
library can be invoked.
There are multiple version numbers that come into play when an application is manipulating an ELF object.
[Figure: the ELF version numbers (v1, v2) involved when an application uses libelf to process an ELF object.]
(7) The ELF library can operate on ar archives and ELF objects. The
function elf_kind returns the kind of object associated with an Elf handle. The return value of the elf_kind function is one of the values defined
by the Elf_Kind enumeration in libelf.h.
(8) When you are done with a handle, it is good practice to release its resources
using the elf_end function.
Now it is time to get something running.
Save listing 2.1 to file prog1.c and then compile
and run it as shown in listing 2.2.
Chapter 3
Peering Inside an ELF Object
3.1 The Layout of an ELF file
[Figure 3.1: The layout of an ELF file: the executable header, the program header table, section data 1 to n, and the section header table.]
typedef struct {
    unsigned char e_ident[16];
    uint16_t      e_type;
    uint16_t      e_machine;      /* (3) */
    uint32_t      e_version;
    uint32_t      e_entry;
    uint32_t      e_phoff;        /* (5) */
    uint32_t      e_shoff;        /* (5) */
    uint32_t      e_flags;
    uint16_t      e_ehsize;
    uint16_t      e_phentsize;
    uint16_t      e_phnum;        /* (7) */
    uint16_t      e_shentsize;
    uint16_t      e_shnum;        /* (7) */
    uint16_t      e_shstrndx;     /* (8) */
} Elf32_Ehdr;

typedef struct {
    unsigned char e_ident[16];
    uint16_t      e_type;
    uint16_t      e_machine;      /* (3) */
    uint32_t      e_version;
    uint64_t      e_entry;
    uint64_t      e_phoff;        /* (5) */
    uint64_t      e_shoff;        /* (5) */
    uint32_t      e_flags;
    uint16_t      e_ehsize;
    uint16_t      e_phentsize;
    uint16_t      e_phnum;        /* (7) */
    uint16_t      e_shentsize;
    uint16_t      e_shnum;        /* (7) */
    uint16_t      e_shstrndx;     /* (8) */
} Elf64_Ehdr;
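[Figure: layout of the e_ident array: the EI_MAG0 to EI_MAG3 magic bytes, followed by EI_CLASS, EI_DATA, EI_VERSION, EI_OSABI and EI_ABIVERSION.]
[Figure 3.3: The ELF Executable Header describes the layout of the rest of the
ELF object. The Ehdr occupies e_ehsize bytes; the Phdr table starts at e_phoff and spans Nphdr × e_phentsize bytes; the Shdr table starts at e_shoff and spans Nshdr × e_shentsize bytes.]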
(3) The e_machine member describes the machine architecture this ELF object
is for. Example values are 3 (EM_386) for the Intel® i386™ architecture
and 20 (EM_PPC) for the 32-bit PowerPC™ architecture.
4
(5) The ELF executable header also describes the layout of the rest of the
ELF object (Figure 3.3). The e_phoff and e_shoff fields contain the file
offsets where the ELF program header table and ELF section header table
reside. These fields are zero if the file does not have a program header table
or section header table respectively. The sizes of these components are
determined by the e_phentsize and e_shentsize members respectively
in conjunction with the number of entries in these tables.
The ELF executable header describes its own size (in bytes) in field
e_ehsize.
(7) The e_phnum and e_shnum fields usually contain the number of ELF
program header table entries and section header table entries. Note that
these fields are only 2 bytes wide, so if an ELF object has a large number of
sections or program header table entries, then a scheme known as Extended
Numbering (see section 3.1) is used to encode the actual number
of sections or program header table entries. When extended numbering is
in use these fields will contain magic numbers instead of actual counts.
(8) If the ELF object contains sections, then we need a way to get at the names
of sections. Section names are stored in a string table. The e_shstrndx field
stores the section index of this string table (see section 3.1) so that
processing tools know which string table to use for retrieving the names
of sections. We will cover ELF string tables in more detail in section 5.1.1.
The fields e_entry and e_flags are used for executables and are placed in
the executable header for easy access at program load time. We will not look
at them further in this tutorial.
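The size arithmetic described for the program and section header tables can be sketched in plain C. This is only an illustration: the helper names table_size and table_in_bounds are hypothetical and are not part of libelf, the sketch ignores extended numbering, and a real tool would normally let the library perform such checks.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a libelf function): size in bytes of a
 * header table holding `count` entries of `entsize` bytes each,
 * i.e. e_phnum * e_phentsize or e_shnum * e_shentsize.
 */
static uint64_t
table_size(uint16_t entsize, uint16_t count)
{
    return ((uint64_t)entsize * count);
}

/*
 * Hypothetical helper: returns 1 if a table of `size` bytes at file
 * offset `off` lies after the executable header (of `ehsize` bytes)
 * and inside a file of `file_size` bytes.  An offset of zero means
 * "no such table" and is accepted.
 */
static int
table_in_bounds(uint64_t off, uint64_t size, uint64_t ehsize,
    uint64_t file_size)
{
    if (off == 0)
        return (1);
    if (off < ehsize)
        return (0);
    if (off > file_size || size > file_size - off)
        return (0);
    return (1);
}
```

With the values printed for prog2 later in this chapter (e_phoff 0x40, e_phentsize 0x38, five program headers), the program header table would occupy 280 bytes immediately after the executable header.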
ELF Class- and Endianness- Independent Processing
Now let us look at the way the libelf API set abstracts out ELF class and
endianness for us.
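[Figure 3.4: The relationship between the file and memory representations of an ELF data structure, showing the %falign and %malign alignments and the xlatetom() translation.]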
Imagine that you are writing an ELF processing application that is going
to support processing of non-native binaries (say for a machine with a different
native endianness and word size). It should be evident that ELF data structures
would have two distinct representations: an in-memory representation that follows the rules for the machine architecture that the application is running on, and
an in-file representation that corresponds to the target architecture for the ELF
object.
The application would like to manipulate data in its native memory representation. This memory representation would conform to the native endianness
of the host's CPU and would conform to the address alignment and structure
padding requirements set by the host's machine architecture.
When this data is written into the target object it may need to be formatted
differently. For example, it could be packed differently compared to the native
memory representation and may have to be laid out according to a different set of
rules for alignment. The endianness of the data in-file could be different from
that of the in-memory representation.
Figure 3.4 depicts the relationship between the file and memory representation of an ELF data structure. As shown in the figure, the size of an ELF data
structure in the file could be different from its size in memory. The alignment
restrictions (%falign and %malign in the figure) could be different. The byte
ordering of the data could be different too.
The ELF(3) and GELF(3) API set can handle the conversion of ELF data
structures to and from their file and memory representations automatically. For
example, when we read in the ELF executable header in listing 3.1
below, the libelf library will automatically do the necessary byte swapping and
alignment adjustments for us.
For applications that desire finer-grained control over the conversion process,
the elfNN_xlatetof and elfNN_xlatetom functions are available. These functions will translate data buffers containing ELF data structures between their
memory and file representations.
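To make the file-versus-memory distinction concrete, the following sketch decodes 16- and 32-bit values from their in-file byte sequences into host integers, independently of the host's own byte order. The names read_u16, read_u32, DATA_LSB and DATA_MSB are illustrative assumptions; DATA_LSB and DATA_MSB merely mirror the values of ELFDATA2LSB (1) and ELFDATA2MSB (2). The libelf translation routines do this work, and more, for whole data structures.

```c
#include <stdint.h>

/* Illustrative constants mirroring ELFDATA2LSB and ELFDATA2MSB. */
enum { DATA_LSB = 1, DATA_MSB = 2 };

/* Decode a 16-bit value stored in the file with the given encoding. */
static uint16_t
read_u16(const unsigned char *p, int encoding)
{
    if (encoding == DATA_LSB)
        return ((uint16_t)(p[0] | (p[1] << 8)));
    return ((uint16_t)(p[1] | (p[0] << 8)));
}

/* Decode a 32-bit value stored in the file with the given encoding. */
static uint32_t
read_u32(const unsigned char *p, int encoding)
{
    if (encoding == DATA_LSB)
        return ((uint32_t)p[0] | ((uint32_t)p[1] << 8) |
            ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24));
    return ((uint32_t)p[3] | ((uint32_t)p[2] << 8) |
        ((uint32_t)p[1] << 16) | ((uint32_t)p[0] << 24));
}
```

Because the decoding works byte by byte, the same code gives the same answer on big- and little-endian hosts, which is the property a cross-tool needs.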
Extended numbering
The e_shnum, e_phnum and e_shstrndx fields of the ELF executable header are
only 2 bytes long and are not physically capable of representing numbers larger
than 65535. For ELF objects with a large number of sections, we need a different
way of encoding section numbers.
ELF objects with such a large number of sections can arise due to the way
GCC copes with C++ templates. When compiling C++ code which uses
templates, GCC generates many sections with names following the pattern
.gnu.linkonce.name. While each compiled ELF relocatable object will now
contain replicated data, the linker is expected to treat such sections specially at
the final link stage, discarding all but one of each section.
When extended numbering is in use:
The e_shnum field of the ELF executable header is always zero and the
true number of sections is stored in the sh_size field of the section header
table entry at index 0.
The true index of the section name string table is stored in the sh_link
field of the zeroth entry of the section header table, while the e_shstrndx
field of the executable header is set to SHN_XINDEX (0xFFFF).
For extended program header table numbering the scheme is similar, with
the e_phnum field of the executable header holding the value PN_XNUM
(0xFFFF) and the sh_info field of the zeroth section header table entry holding
the actual number of program header table entries.
An application may use the functions elf_getphdrnum, elf_getshdrnum and
elf_getshdrstrndx to retrieve the correct value of these fields when extended
numbering is in use.
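The decoding rules above can be written out as a small sketch. It is not how libelf is implemented; the struct shdr0 type and the real_* helper names are hypothetical, and the sketch follows the ELF gABI in reading the extended program header count from the sh_info field of section header 0. In practice elf_getshdrnum, elf_getshdrstrndx and elf_getphdrnum return these values directly.

```c
#include <stddef.h>
#include <stdint.h>

#define X_SHN_XINDEX 0xFFFFu    /* same value as SHN_XINDEX */
#define X_PN_XNUM    0xFFFFu    /* same value as PN_XNUM */

/* Hypothetical view of the fields of section header 0 that matter here. */
struct shdr0 {
    uint64_t sh_size;   /* true section count, if e_shnum is 0 */
    uint32_t sh_link;   /* true string table index, if SHN_XINDEX */
    uint32_t sh_info;   /* true program header count, if PN_XNUM */
};

/* `s0` may be NULL when the object has no section header table. */
static uint64_t
real_shnum(uint16_t e_shnum, const struct shdr0 *s0)
{
    if (e_shnum != 0 || s0 == NULL)
        return (e_shnum);
    return (s0->sh_size);
}

static uint32_t
real_shstrndx(uint16_t e_shstrndx, const struct shdr0 *s0)
{
    if (e_shstrndx != X_SHN_XINDEX || s0 == NULL)
        return (e_shstrndx);
    return (s0->sh_link);
}

static uint32_t
real_phnum(uint16_t e_phnum, const struct shdr0 *s0)
{
    if (e_phnum != X_PN_XNUM || s0 == NULL)
        return (e_phnum);
    return (s0->sh_info);
}
```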
3.2 Example: Reading an ELF executable header
We will now look at a small program that will print out the ELF executable
header in an ELF object. For this example we will introduce the GELF(3) API
set.
The ELF(3) API is defined in terms of ELF class-dependent types (Elf32_Ehdr, Elf64_Shdr, etc.) and consequently has many operations that have both
32- and 64-bit variants. So, in order to retrieve an ELF executable header from
a 32-bit ELF object we would need to use the function elf32_getehdr, which
would return a pointer to an Elf32_Ehdr structure. For a 64-bit ELF object,
the function we would need to use would be elf64_getehdr, which would return
a pointer to an Elf64_Ehdr structure. This duplication is awkward when you
want to write applications that can transparently process either class of ELF
objects.
The GELF(3) APIs provide an ELF class-independent way of writing ELF
applications. These functions are defined in terms of generic types that are
large enough to hold the values of their corresponding 32- and 64-bit ELF types.
Further, the GELF(3) APIs always work on copies of ELF data structures,
thus bypassing the problem of 32- and 64-bit ELF data structures having
incompatible memory layouts. You can freely mix calls to GELF(3) and ELF(3)
functions.
The downside of using the GELF(3) APIs is the extra copying and conversion
of data that occurs. This overhead is usually not significant for most applications.
Listing 3.1: Program 2
/*
 * Print the ELF Executable Header from an ELF object.
 *
 * $Id: prog2.txt 2133 2011-11-10 08:28:22Z jkoshy $
 */

#include <err.h>
#include <fcntl.h>
#include <gelf.h>
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>
#include <vis.h>

int
main(int argc, char **argv)
{
    int i, fd;
    Elf *e;
    char *id, bytes[5];
    size_t n;
    GElf_Ehdr ehdr;

    if (argc != 2)
        errx(EXIT_FAILURE, "usage: %s file-name", argv[0]);

    if (elf_version(EV_CURRENT) == EV_NONE)
        errx(EXIT_FAILURE, "ELF library initialization "
            "failed: %s", elf_errmsg(-1));

    if ((fd = open(argv[1], O_RDONLY, 0)) < 0)
        err(EXIT_FAILURE, "open \"%s\" failed", argv[1]);

    if ((e = elf_begin(fd, ELF_C_READ, NULL)) == NULL)
        errx(EXIT_FAILURE, "elf_begin() failed: %s.",
            elf_errmsg(-1));

    if (elf_kind(e) != ELF_K_ELF)
        errx(EXIT_FAILURE, "\"%s\" is not an ELF object.",
            argv[1]);

    if (gelf_getehdr(e, &ehdr) == NULL)
        errx(EXIT_FAILURE, "getehdr() failed: %s.",
            elf_errmsg(-1));

    /* Report the ELF class of the object. */
    if ((i = gelf_getclass(e)) == ELFCLASSNONE)
        errx(EXIT_FAILURE, "getclass() failed: %s.",
            elf_errmsg(-1));

    (void) printf("%s: %d-bit ELF object\n", argv[1],
        i == ELFCLASS32 ? 32 : 64);

    /* Print the bytes of the e_ident array. */
    if ((id = elf_getident(e, NULL)) == NULL)
        errx(EXIT_FAILURE, "getident() failed: %s.",
            elf_errmsg(-1));

    (void) printf("    e_ident[0..%1d]        ", EI_ABIVERSION);

    for (i = 0; i <= EI_ABIVERSION; i++) {
        (void) vis(bytes, id[i], VIS_WHITE, 0);
        (void) printf(" [%s %X]", bytes, (unsigned char) id[i]);
    }

    (void) printf("\n");

#define PRINT_FMT       "    %-20s 0x%jx\n"
#define PRINT_FIELD(N)  do {                                        \
        (void) printf(PRINT_FMT, #N, (uintmax_t) ehdr.N);          \
    } while (0)

    PRINT_FIELD(e_type);
    PRINT_FIELD(e_machine);
    PRINT_FIELD(e_version);
    PRINT_FIELD(e_entry);
    PRINT_FIELD(e_phoff);
    PRINT_FIELD(e_shoff);
    PRINT_FIELD(e_flags);
    PRINT_FIELD(e_ehsize);
    PRINT_FIELD(e_phentsize);
    PRINT_FIELD(e_shentsize);

    /* These calls return the true values under extended numbering. */
    if (elf_getshdrnum(e, &n) != 0)
        errx(EXIT_FAILURE, "getshdrnum() failed: %s.",
            elf_errmsg(-1));
    (void) printf(PRINT_FMT, "(shnum)", (uintmax_t) n);

    if (elf_getshdrstrndx(e, &n) != 0)
        errx(EXIT_FAILURE, "getshdrstrndx() failed: %s.",
            elf_errmsg(-1));
    (void) printf(PRINT_FMT, "(shstrndx)", (uintmax_t) n);

    if (elf_getphdrnum(e, &n) != 0)
        errx(EXIT_FAILURE, "getphdrnum() failed: %s.",
            elf_errmsg(-1));
    (void) printf(PRINT_FMT, "(phnum)", (uintmax_t) n);

    (void) elf_end(e);
    (void) close(fd);
    exit(EXIT_SUCCESS);
}
Save the program in listing 3.1 to file prog2.c and then compile
and run it as shown in listing 3.2.
Listing 3.2: Compiling and Running prog2
% cc -o prog2 prog2.c -lelf                              (1)
% ./prog2 prog2                                          (2)
prog2: 64-bit ELF object
    e_ident[0..8]         [\^? 7F] [E 45] [L 4C] [F 46] \
                          [\^B 2] [\^A 1] [\^A 1] [\^I 9] [\^@ 0]
    e_type               0x2
    e_machine            0x3e
    e_version            0x1
    e_entry              0x400a10
    e_phoff              0x40
    e_shoff              0x16f8
    e_flags              0x0
    e_ehsize             0x40
    e_phentsize          0x38
    e_shentsize          0x40
    (shnum)              0x18
    (shstrndx)           0x15
    (phnum)              0x5
(1) The process for compiling and linking an application that uses GELF(3) is the
same as that for ELF(3) programs.
(2) We run our program on itself. The listing in this tutorial was generated on
an AMD64™ machine running FreeBSD™.
You should now run prog2 on other object files that you have lying around.
Try it on a few non-native ELF object files too.
Chapter 4
Examining the Program Header Table
4.1 The ELF Program Header Table
The ELF program header table describes the segments present in an ELF file.
The location of the program header table is described by the e_phoff field of the
ELF executable header (see section 3.1). The program header table
is a contiguous array of program header table entries, one entry per segment.
Figure 4.1 shows graphically how the fields of a program
header table entry specify the segment's placement in the file and in memory.
The structure of each program header table entry is shown in table 4.1.
[Figure 4.1: ELF segment placement. In the ELF object, the Phdr entry's p_offset and p_filesz fields locate segment n in the file; in memory, the segment is placed at p_vaddr, occupies p_memsz bytes and is aligned to p_align.]
typedef struct {
    Elf32_Word   p_type;
    Elf32_Off    p_offset;
    Elf32_Addr   p_vaddr;
    Elf32_Addr   p_paddr;
    Elf32_Word   p_filesz;
    Elf32_Word   p_memsz;
    Elf32_Word   p_flags;
    Elf32_Word   p_align;
} Elf32_Phdr;

typedef struct {
    Elf64_Word   p_type;
    Elf64_Word   p_flags;
    Elf64_Off    p_offset;
    Elf64_Addr   p_vaddr;
    Elf64_Addr   p_paddr;
    Elf64_Xword  p_filesz;
    Elf64_Xword  p_memsz;
    Elf64_Xword  p_align;
} Elf64_Phdr;
... 0x70000000 (PT_LOPROC) to 0x7FFFFFFF (PT_HIPROC) are similarly reserved for
processor-specific information.
(2) The p_offset field holds the file offset in the ELF object to the start of the
segment being described by this table entry.
(3) The p_vaddr field holds the virtual address this segment should be loaded at.
(4) The p_paddr field holds the physical address this segment should be loaded at. This field does not
apply to userland objects.
(5) The p_filesz field holds the number of bytes the segment takes up in the file. This number is zero
for segments that do not have data associated with them in the file.
(6) The p_memsz field holds the number of bytes the segment takes up in memory.
(7) The p_flags field holds additional flags that specify segment properties. For example, flag PF_X
specifies that the segment in question should be made executable and flag
PF_W denotes that the segment should be writable.
(8) The p_align field holds the alignment requirements of the segment both in memory and in the file.
This field holds a value that is a power of two.
Note: The careful reader will note that the 32- and 64-bit Elf_Phdr structures are laid out differently in memory. These differences are handled for you
by the functions in the libelf library.
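As a small illustration of how the segment flags are interpreted, the sketch below turns a p_flags value into an "rwx" style string. The function name format_pflags and the FLAG_* constants are assumptions made for this example; the constants simply repeat the standard values of PF_X (0x1), PF_W (0x2) and PF_R (0x4) so that the snippet compiles without the system's ELF headers.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Same values as PF_X, PF_W and PF_R in the system's ELF headers. */
enum { FLAG_X = 0x1, FLAG_W = 0x2, FLAG_R = 0x4 };

/*
 * Hypothetical helper: write an "rwx" style summary of `flags`
 * into `buf` (at least 4 bytes) and return `buf`.
 */
static const char *
format_pflags(uint32_t flags, char *buf, size_t len)
{
    (void) snprintf(buf, len, "%c%c%c",
        (flags & FLAG_R) ? 'r' : '-',
        (flags & FLAG_W) ? 'w' : '-',
        (flags & FLAG_X) ? 'x' : '-');
    return (buf);
}
```

A p_flags value of 0x5, for example, has the read and execute bits set and would be rendered as "r-x".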
4.2 Example: Reading a Program Header Table
We will now look at a program that will print out the program header table
associated with an ELF object. We will continue to use the GELF(3) API set
for this example. The ELF(3) API set also offers two ELF class-dependent APIs
that retrieve the program header table from an ELF object: elf32_getphdr and
elf64_getphdr, but these require us to know the ELF class of the object being
handled.
Listing 4.1: Program 3
/*
 * Print the ELF Program Header Table in an ELF object.
 *
 * $Id: prog3.txt 2133 2011-11-10 08:28:22Z jkoshy $
 */

#include <err.h>
#include <fcntl.h>
#include <gelf.h>
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Print a symbolic name for a program header type. */
void
print_ptype(size_t pt)
{
    char *s;

#define C(V) case PT_##V: s = #V; break
    switch (pt) {
        C(NULL);        C(LOAD);        C(DYNAMIC);
        C(INTERP);      C(NOTE);        C(SHLIB);
        C(PHDR);        C(TLS);         C(SUNW_UNWIND);
        C(SUNWDTRACE);
    default:
        s = "unknown";
        break;
    }
    (void) printf(" \"%s\"", s);
#undef C
}

#define PRINT_FMT       "    %-20s 0x%jx"
#define PRINT_FIELD(N)  do {                                        \
        (void) printf(PRINT_FMT, #N, (uintmax_t) phdr.N);          \
    } while (0)
#define NL()            do { (void) printf("\n"); } while (0)

int
main(int argc, char **argv)
{
    int i, fd;
    Elf *e;
    size_t n;
    GElf_Phdr phdr;

    if (argc != 2)
        errx(EXIT_FAILURE, "usage: %s file-name", argv[0]);

    if (elf_version(EV_CURRENT) == EV_NONE)
        errx(EXIT_FAILURE, "ELF library initialization "
            "failed: %s", elf_errmsg(-1));

    if ((fd = open(argv[1], O_RDONLY, 0)) < 0)
        err(EXIT_FAILURE, "open \"%s\" failed", argv[1]);

    if ((e = elf_begin(fd, ELF_C_READ, NULL)) == NULL)
        errx(EXIT_FAILURE, "elf_begin() failed: %s.",
            elf_errmsg(-1));

    if (elf_kind(e) != ELF_K_ELF)
        errx(EXIT_FAILURE, "\"%s\" is not an ELF object.",
            argv[1]);

    if (elf_getphdrnum(e, &n) != 0)
        errx(EXIT_FAILURE, "elf_getphdrnum() failed: %s.",
            elf_errmsg(-1));

    for (i = 0; i < (int) n; i++) {             /* (5) */
        if (gelf_getphdr(e, i, &phdr) != &phdr)
            errx(EXIT_FAILURE, "getphdr() failed: %s.",
                elf_errmsg(-1));

        (void) printf("PHDR %d:\n", i);

        PRINT_FIELD(p_type);
        print_ptype(phdr.p_type);       NL();
        PRINT_FIELD(p_offset);          NL();
        PRINT_FIELD(p_vaddr);           NL();
        PRINT_FIELD(p_paddr);           NL();
        PRINT_FIELD(p_filesz);          NL();
        PRINT_FIELD(p_memsz);           NL();
        PRINT_FIELD(p_flags);
        (void) printf(" [");
        if (phdr.p_flags & PF_X)
            (void) printf(" execute");
        if (phdr.p_flags & PF_R)
            (void) printf(" read");
        if (phdr.p_flags & PF_W)
            (void) printf(" write");
        (void) printf(" ]");            NL();
        PRINT_FIELD(p_align);           NL();
    }

    (void) elf_end(e);
    (void) close(fd);
    exit(EXIT_SUCCESS);
}
(5) We iterate over all valid indices for the object's program header table,
retrieving the table entry at each index using the gelf_getphdr function.
Save the program in listing 4.1 to file prog3.c and then compile
and run it as shown in listing 4.2.
Listing 4.2: Compiling and Running prog3
% cc -o prog3 prog3.c -lelf
% ./prog3 prog3
PHDR 0:
    p_type               0x6 "PHDR"
    p_offset             0x34
    p_vaddr              0x8048034
    p_paddr              0x8048034
    p_filesz             0xc0
    p_memsz              0xc0
    p_flags              0x5 [ execute read ]
    p_align              0x4
PHDR 1:
    p_type               0x3 "INTERP"
    p_offset             0xf4
    p_vaddr              0x80480f4
    p_paddr              0x80480f4
    p_filesz             0x15
    p_memsz              0x15
    p_flags              0x4 [ read ]
    p_align              0x1
PHDR 2:
    p_type               0x1 "LOAD"
    p_offset             0x0
    p_vaddr              0x8048000
    p_paddr              0x8048000
    p_filesz             0xe67
    p_memsz              0xe67
    p_flags              0x5 [ execute read ]
    p_align              0x1000
PHDR 3:                                                  (6)
    p_type               0x1 "LOAD"
    p_offset             0xe68
    p_vaddr              0x8049e68
    p_paddr              0x8049e68
    p_filesz             0x11c
    p_memsz              0x13c
    p_flags              0x6 [ read write ]
    p_align              0x1000
PHDR 4:
    p_type               0x2 "DYNAMIC"
    p_offset             0xe78
    p_vaddr              0x8049e78
    p_paddr              0x8049e78
    p_filesz             0xb8
    p_memsz              0xb8
    p_flags              0x6 [ read write ]
    p_align              0x4
PHDR 5:
    p_type               0x4 "NOTE"
    p_offset             0x10c
    p_vaddr              0x804810c
    p_paddr              0x804810c
    p_filesz             0x18
    p_memsz              0x18
    p_flags              0x4 [ read ]
    p_align              0x4
(6) This object has two loadable segments: one with execute and read
permissions and one with read and write permissions. Both these segments
require page alignment.
You should now run prog3 on other object files.
Try a relocatable object file created by a cc -c invocation. Does it have
a program header table?
Try prog3 on shared libraries. What do their program header tables look
like?
Can you locate ELF objects on your system that have PT_TLS header
entries?
Chapter 5
Looking at Sections
In the previous chapter we looked at the way executable ELF objects are
viewed by the operating system. In this chapter we will look at the features of
the ELF format that are used by compilers and linkers.
For linking, data in an ELF object is grouped into sections. Each ELF
section represents one kind of data. For example, a section could contain a table
of strings used for program symbols, another could contain debug information,
and another could contain machine code. Non-empty sections do not overlap in
the file.
ELF sections are described by entries in an ELF section header table. This
table is usually placed at the very end of the ELF object (see figure 3.1).
Table 5.1 describes the elements of a section header table entry and
figure 5.1 shows graphically how the fields of an ELF section header
specify the section's placement.
32-bit SHDR Table Entry
typedef struct {
    Elf32_Word   sh_name;
    Elf32_Word   sh_type;
    Elf32_Word   sh_flags;
    Elf32_Addr   sh_addr;
    Elf32_Off    sh_offset;
    Elf32_Word   sh_size;
    Elf32_Word   sh_link;
    Elf32_Word   sh_info;
    Elf32_Word   sh_addralign;
    Elf32_Word   sh_entsize;
} Elf32_Shdr;

64-bit SHDR Table Entry
typedef struct {
    Elf64_Word   sh_name;
    Elf64_Word   sh_type;
    Elf64_Xword  sh_flags;
    Elf64_Addr   sh_addr;
    Elf64_Off    sh_offset;
    Elf64_Xword  sh_size;
    Elf64_Word   sh_link;
    Elf64_Word   sh_info;
    Elf64_Xword  sh_addralign;
    Elf64_Xword  sh_entsize;
} Elf64_Shdr;
(2) The sh_type field specifies the section type. Section types are defined by
the SHT_* constants defined in the system's ELF headers. For example, a
section of type SHT_PROGBITS holds information defined by the program, such as executable code, while
a section of type SHT_SYMTAB denotes a section containing a symbol table.
The ELF specification reserves values in the range 0x60000000 to 0x6FFFFFFF to denote OS-specific section types, and values in the range 0x70000000 to 0x7FFFFFFF for processor-specific section types. In addition,
applications have been given the range 0x80000000 to 0xFFFFFFFF for
their own use.
(3) Section flags indicate whether a section has specific properties, e.g., whether
it contains writable data or instructions, or whether it has special link
ordering requirements. Flag values from 0x00100000 to 0x08000000 (8
flags) are reserved for OS-specific uses. Flag values from 0x10000000 to
0x80000000 (4 flags) are reserved for processor-specific uses.
(4) The sh_size member specifies the size of the section in bytes.
5
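The reserved ranges for section types and section flags described above can be expressed as a short sketch. The enumeration, the function classify_sh_type and the two mask macros are names invented for this example; the numeric ranges are the ones quoted in the text (the masks correspond to the SHF_MASKOS and SHF_MASKPROC values found in ELF headers).

```c
#include <stdint.h>

/* Hypothetical classification of an sh_type value by reserved range. */
enum sht_range {
    SHT_RANGE_GENERIC,      /* below 0x60000000 */
    SHT_RANGE_OS,           /* 0x60000000 to 0x6FFFFFFF */
    SHT_RANGE_PROC,         /* 0x70000000 to 0x7FFFFFFF */
    SHT_RANGE_USER          /* 0x80000000 to 0xFFFFFFFF */
};

static enum sht_range
classify_sh_type(uint32_t sh_type)
{
    if (sh_type >= 0x80000000u)
        return (SHT_RANGE_USER);
    if (sh_type >= 0x70000000u)
        return (SHT_RANGE_PROC);
    if (sh_type >= 0x60000000u)
        return (SHT_RANGE_OS);
    return (SHT_RANGE_GENERIC);
}

/* The 8 OS-specific and 4 processor-specific flag bits. */
#define X_FLAGS_OS      0x0FF00000u
#define X_FLAGS_PROC    0xF0000000u

/* Return the OS-specific bits set in a section's flags. */
static uint64_t
os_specific_flags(uint64_t sh_flags)
{
    return (sh_flags & X_FLAGS_OS);
}

/* Return the processor-specific bits set in a section's flags. */
static uint64_t
proc_specific_flags(uint64_t sh_flags)
{
    return (sh_flags & X_FLAGS_PROC);
}
```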
5.1 ELF section handling with libelf
You can conveniently retrieve the contents of sections and section headers using the APIs in the ELF(3) library. Function elf_getscn will retrieve section
information for a requested section number. Iteration through the sections of an
ELF file is possible using function elf_nextscn. These routines will take care
of translating between in-file and in-memory representations, thus simplifying
your application.
[Figure 5.1: Section layout. The sh_offset and sh_size fields of a section header table entry locate section n in the ELF object; sh_type and sh_addralign give its type and alignment.]
[Figure 5.2: Coverage of an ELF section by Elf_Scn and Elf_Data descriptors. An Elf_Scn descriptor holds a list of Elf_Data descriptors (D1 to D4) that cover the section contents in the ELF object.]
In the ELF(3) API set, ELF sections are managed using Elf_Scn descriptors.
There is one Elf_Scn descriptor per ELF section in the ELF object. Functions
elf_getscn and elf_nextscn retrieve pointers to Elf_Scn descriptors for preexisting sections in the ELF object. (Chapter 6 covers the use of
function elf_newscn for allocating new sections.)
Given an Elf_Scn descriptor, functions elf32_getshdr and elf64_getshdr
retrieve its associated section header table entry. The GELF(3) API set offers
an equivalent ELF class-independent function gelf_getshdr.
Each Elf_Scn descriptor can be associated with zero or more Elf_Data descriptors. Elf_Data descriptors describe regions of application memory that
contain the actual data in the ELF section. Elf_Data descriptors for a given
Elf_Scn descriptor are retrieved using the elf_getdata function.
Figure 5.2 shows graphically how an Elf_Scn descriptor could conceptually
cover the content of a section with Elf_Data descriptors.
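Figure 5.3 depicts how an Elf_Data structure describes a [...]
[Figure 5.3: An Elf_Data descriptor. The d_buf member refers to a memory buffer of d_size bytes, and d_off gives the offset of the corresponding data in the file representation.]
[Figure 5.4: String Table Layout. The table starts with a NUL byte, each string ends with a NUL terminator, and a final NUL byte closes the table.]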
4 The d_off member contains the file offset from the start of the section of
the data in this buffer. This field is usually managed by the library, but
is under application control if the application has requested full control of
the ELF file's layout (see chapter 6).
5 The d_size member contains the size of the memory buffer.
6 The d_type member specifies the ELF type of the data contained in the
data buffer. Legal values for this member are precisely those defined by
the Elf_Type enumeration in libelf.h.
7 The d_version member specifies the working version for the data in this
descriptor. It must be one of the values supported by the libelf library.
Before we look at an example program we need to understand how string
tables are implemented by libelf.
5.1.1 String Tables
String tables hold variable length strings, allowing other structures in an ELF
object to refer to strings using offsets into the string table. Sections containing
string tables have type SHT_STRTAB.
Figure 5.4 shows the layout of a string table graphically:
The initial byte of a string table is NUL (a \0). This allows a string
offset value of zero to denote the null (empty) string.
Subsequent strings are separated by NUL bytes.
The final byte in the section is again a NUL so as to terminate the last
string in the string table.
An ELF file can have multiple string tables; for example, section names
could be kept in one string table and symbol names in another.
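The layout rules above can be illustrated with a small lookup routine that works on a string table held in memory. This is a sketch, not library code: strtab_lookup and sample_strtab are hypothetical names, and the bounds checks reflect one cautious way to treat offsets taken from an untrusted file.

```c
#include <stddef.h>
#include <string.h>

/* A sample string table: offset 0 is NUL, then ".foo" and ".shstrtab". */
const char sample_strtab[] = {
    '\0',
    '.', 'f', 'o', 'o', '\0',
    '.', 's', 'h', 's', 't', 'r', 't', 'a', 'b', '\0'
};

/*
 * Return the string starting at `offset` in a string table of `size`
 * bytes, or NULL if the offset lies outside the table or no NUL
 * terminator follows it inside the table.
 */
const char *
strtab_lookup(const char *tab, size_t size, size_t offset)
{
    if (offset >= size)
        return NULL;
    /* memchr confirms that a terminator exists before the table ends. */
    if (memchr(tab + offset, '\0', size - offset) == NULL)
        return NULL;
    return tab + offset;
}
```

With this table, offset 1 yields ".foo", offset 6 yields ".shstrtab", and offset 0 yields the empty string. An offset may also point into the middle of a stored string: offset 3 yields "oo".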
5.2 Example: Listing section names
Let us now write a program that would retrieve and print the names of the
sections present in an ELF object. This example will show you how to use:
Functions elf_nextscn and elf_getscn to retrieve Elf_Scn descriptors.
Function gelf_getshdr to retrieve a section header table entry corresponding to a section descriptor.
Function elf_strptr to convert section name indices to NUL-terminated
strings.
Function elf_getdata to retrieve translated data associated with a section.
Listing 5.2: Program 4
/*
 * Print the names of ELF sections.
 *
 * $Id: prog4.txt 2133 2011-11-10 08:28:22Z jkoshy $
 */
#include <err.h>
#include <fcntl.h>
#include <gelf.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <vis.h>

int
main(int argc, char **argv)
{
    int fd;
    Elf *e;
    char *name, *p, pc[5];    /* vis() emits up to 4 bytes plus a NUL */
    Elf_Scn *scn;
    Elf_Data *data;
    GElf_Shdr shdr;
    size_t n, shstrndx;

    if (argc != 2)
        errx(EXIT_FAILURE, "usage: %s file-name", argv[0]);

    if (elf_version(EV_CURRENT) == EV_NONE)
        errx(EXIT_FAILURE, "ELF library initialization "
            "failed: %s", elf_errmsg(-1));

    if ((fd = open(argv[1], O_RDONLY, 0)) < 0)
        err(EXIT_FAILURE, "open \"%s\" failed", argv[1]);

    if ((e = elf_begin(fd, ELF_C_READ, NULL)) == NULL)
        errx(EXIT_FAILURE, "elf_begin() failed: %s.",
            elf_errmsg(-1));

    if (elf_kind(e) != ELF_K_ELF)
        errx(EXIT_FAILURE, "%s is not an ELF object.", argv[1]);

    if (elf_getshdrstrndx(e, &shstrndx) != 0)             /* (1) */
        errx(EXIT_FAILURE, "elf_getshdrstrndx() failed: %s.",
            elf_errmsg(-1));

    scn = NULL;                                           /* (2) */
    while ((scn = elf_nextscn(e, scn)) != NULL) {         /* (3) */
        if (gelf_getshdr(scn, &shdr) != &shdr)            /* (4) */
            errx(EXIT_FAILURE, "getshdr() failed: %s.",
                elf_errmsg(-1));

        if ((name = elf_strptr(e, shstrndx, shdr.sh_name))
            == NULL)                                      /* (5) */
            errx(EXIT_FAILURE, "elf_strptr() failed: %s.",
                elf_errmsg(-1));

        (void) printf("Section %-4.4ju %s\n", (uintmax_t)
            elf_ndxscn(scn), name);
    }

    if ((scn = elf_getscn(e, shstrndx)) == NULL)          /* (6) */
        errx(EXIT_FAILURE, "getscn() failed: %s.",
            elf_errmsg(-1));

    if (gelf_getshdr(scn, &shdr) != &shdr)
        errx(EXIT_FAILURE, "getshdr(shstrndx) failed: %s.",
            elf_errmsg(-1));

    (void) printf(".shstrab: size=%ju\n", (uintmax_t)
        shdr.sh_size);

    data = NULL; n = 0;
    while (n < shdr.sh_size &&
        (data = elf_getdata(scn, data)) != NULL) {        /* (7) */
        p = (char *) data->d_buf;
        while (p < (char *) data->d_buf + data->d_size) {
            if (vis(pc, *p, VIS_WHITE, 0))
                printf("%s", pc);
            n++; p++;
            (void) putchar((n % 16) ? ' ' : '\n');
        }
    }
    (void) putchar('\n');

    (void) elf_end(e);
    (void) close(fd);
    exit(EXIT_SUCCESS);
}
1 We retrieve the section index of the ELF section containing the string
table of section names using function elf_getshdrstrndx. The use of
elf_getshdrstrndx allows our program to work correctly when the object being examined has a very large number of sections.
2 Function elf_nextscn has the useful property that it returns the pointer
to section number 1 if a NULL section pointer is passed in. Recall that
section number 0 is always of type SHT_NULL and is not interesting to
applications.
3 We loop over all sections in the ELF object. Function elf_nextscn will
return NULL at the end, which is a convenient way to exit the processing
loop.
4 Given an Elf_Scn pointer, we retrieve the associated section header using
function gelf_getshdr. The sh_name member of this structure holds the
required offset into the section name string table.
5 We convert the string offset in member sh_name to a char * pointer using
function elf_strptr. This value is then printed using printf.
6 We retrieve the section descriptor associated with the string table holding
section names. Variable shstrndx was retrieved by a prior call to function
elf_getshdrstrndx.
7 We cycle through the Elf_Data descriptors associated with the section in
question, printing the characters in each data buffer.
Save the program in listing 5.2 on page 38 to file prog4.c and then compile
and run it as shown in listing 5.3.
Listing 5.3: Compiling and Running prog4
% cc -o prog4 prog4.c -lelf
% ./prog4 prog4
Section 0001 .interp
Section 0002 .note.ABI-tag
Section 0003 .hash
Section 0004 .dynsym
Section 0005 .dynstr
Section 0006 .rela.plt
Section 0007 .init
Section 0008 .plt
Section 0009 .text
Section 0010 .fini
Section 0011 .rodata
Section 0012 .data
Section 0013 .eh_frame
Section 0014 .dynamic
Section 0015 .ctors
Section 0016 .dtors
Section 0017 .jcr
Section 0018 .got
Section 0019 .bss
Section 0020 .comment
Section 0021 .shstrtab
Section 0022 .symtab
Section 0023 .strtab
.shstrab: size=287
\^@ . s y m t a b \^@ . s t r t a b
\^@ . s h s t r t a b \^@ . i n t e
r p \^@ . h a s h \^@ . d y n s y m
...etc...
Chapter 6
Creating new ELF objects
6.1 Example: Creating an ELF object
In listing 6.1 we will look at a program that creates a
simple ELF object with a program header table, one ELF section containing
translatable data and one ELF section containing a section name string table.
We will mark the object as a 32-bit ELF object using MSB-first data ordering.
Listing 6.1: Program 5
/*
 * Create an ELF object.
 *
 * $Id: prog5.txt 2133 2011-11-10 08:28:22Z jkoshy $
 */
#include <err.h>
#include <fcntl.h>
#include <libelf.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

uint32_t hash_words[] = {
    0x01234567,
    0x89abcdef,
    0xdeadc0de
};

char string_table[] = {                                   /* (3) */
    /* Offset 0 */ '\0',
    /* Offset 1 */ '.', 'f', 'o', 'o', '\0',
    /* Offset 6 */ '.', 's', 'h', 's', 't',
                   'r', 't', 'a', 'b', '\0'
};

int
main(int argc, char **argv)
{
    int fd;
    Elf *e;
    Elf_Scn *scn;
    Elf_Data *data;
    Elf32_Ehdr *ehdr;
    Elf32_Phdr *phdr;
    Elf32_Shdr *shdr;

    if (argc != 2)
        errx(EXIT_FAILURE, "usage: %s file-name", argv[0]);

    if (elf_version(EV_CURRENT) == EV_NONE)
        errx(EXIT_FAILURE, "ELF library initialization "
            "failed: %s", elf_errmsg(-1));

    if ((fd = open(argv[1], O_WRONLY | O_CREAT, 0777)) < 0)
        err(EXIT_FAILURE, "open \"%s\" failed", argv[1]);

    if ((e = elf_begin(fd, ELF_C_WRITE, NULL)) == NULL)   /* (5) */
        errx(EXIT_FAILURE, "elf_begin() failed: %s.",
            elf_errmsg(-1));

    if ((ehdr = elf32_newehdr(e)) == NULL)                /* (6) */
        errx(EXIT_FAILURE, "elf32_newehdr() failed: %s.",
            elf_errmsg(-1));

    ehdr->e_ident[EI_DATA] = ELFDATA2MSB;
    ehdr->e_machine = EM_PPC;    /* 32-bit PowerPC object */
    ehdr->e_type = ET_EXEC;

    if ((phdr = elf32_newphdr(e, 1)) == NULL)             /* (7) */
        errx(EXIT_FAILURE, "elf32_newphdr() failed: %s.",
            elf_errmsg(-1));

    if ((scn = elf_newscn(e)) == NULL)                    /* (8) */
        errx(EXIT_FAILURE, "elf_newscn() failed: %s.",
            elf_errmsg(-1));

    if ((data = elf_newdata(scn)) == NULL)
        errx(EXIT_FAILURE, "elf_newdata() failed: %s.",
            elf_errmsg(-1));

    data->d_align = 4;
    data->d_off = 0LL;
    data->d_buf = hash_words;
    data->d_type = ELF_T_WORD;
    data->d_size = sizeof(hash_words);
    data->d_version = EV_CURRENT;

    if ((shdr = elf32_getshdr(scn)) == NULL)
        errx(EXIT_FAILURE, "elf32_getshdr() failed: %s.",
            elf_errmsg(-1));

    shdr->sh_name = 1;           /* offset of ".foo" in string_table */
    shdr->sh_type = SHT_HASH;
    shdr->sh_flags = SHF_ALLOC;
    shdr->sh_entsize = 0;

    if ((scn = elf_newscn(e)) == NULL)                    /* (9) */
        errx(EXIT_FAILURE, "elf_newscn() failed: %s.",
            elf_errmsg(-1));

    if ((data = elf_newdata(scn)) == NULL)
        errx(EXIT_FAILURE, "elf_newdata() failed: %s.",
            elf_errmsg(-1));

    data->d_align = 1;
    data->d_buf = string_table;
    data->d_off = 0LL;
    data->d_size = sizeof(string_table);
    data->d_type = ELF_T_BYTE;
    data->d_version = EV_CURRENT;

    if ((shdr = elf32_getshdr(scn)) == NULL)
        errx(EXIT_FAILURE, "elf32_getshdr() failed: %s.",
            elf_errmsg(-1));

    shdr->sh_name = 6;           /* offset of ".shstrtab" in string_table */
    shdr->sh_type = SHT_STRTAB;
    shdr->sh_flags = SHF_STRINGS | SHF_ALLOC;
    shdr->sh_entsize = 0;

    elf_setshstrndx(e, elf_ndxscn(scn));                  /* (10) */

    if (elf_update(e, ELF_C_NULL) < 0)                    /* (11) */
        errx(EXIT_FAILURE, "elf_update(NULL) failed: %s.",
            elf_errmsg(-1));

    phdr->p_type = PT_PHDR;
    phdr->p_offset = ehdr->e_phoff;
    phdr->p_filesz = elf32_fsize(ELF_T_PHDR, 1, EV_CURRENT);

    (void) elf_flagphdr(e, ELF_C_SET, ELF_F_DIRTY);

    if (elf_update(e, ELF_C_WRITE) < 0)                   /* (12) */
        errx(EXIT_FAILURE, "elf_update() failed: %s.",
            elf_errmsg(-1));

    (void) elf_end(e);
    (void) close(fd);
    exit(EXIT_SUCCESS);
}
6 We allocate an ELF executable header and set the EI_DATA byte in its
e_ident member. The machine type is set to EM_PPC denoting the PowerPC architecture, and the object is marked as an ELF executable.
7 We allocate an ELF program header table with one entry. At this point
in time we do not know how the ELF object will be laid out, so we don't
know where the ELF program header table will reside. We will update
this entry later.
8 We create a section descriptor for the section containing the hash values,
and associate the data in the hash_words array with this descriptor. The
type of the section is set to SHT_HASH. The library will compute its size and
location in the final object and will byte-swap the values when creating
the ELF object.
9 We allocate another section for holding the string table. We use the prefabricated string table in variable string_table. The type of the section
is set to SHT_STRTAB. Its offset and size in the file will be computed by the
library.
10 We set the string table index field in the ELF executable header using the
function elf_setshstrndx.
11 Calling function elf_update with parameter ELF_C_NULL indicates that
the libelf library is to compute the layout of the object, updating all
internal data structures, but not write it out. We can thus fill in the
values in the ELF program header table entry that we had allocated using
the new values in the executable header after this call to elf_update.
The program header table is then marked dirty using a call to function
elf_flagphdr, so that a subsequent call to elf_update will use the new
contents.
12 A call to function elf_update with parameter ELF_C_WRITE causes the
object file to be written out.
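The byte-swapping mentioned in item 8 can be made concrete with a small routine that writes 32-bit words most-significant byte first, whatever the byte order of the host. This is only a sketch of what an MSB-first file representation of ELF_T_WORD data looks like; encode_words_msb is a hypothetical helper and not part of the library.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Store `n` 32-bit words into `out`, most-significant byte first,
 * independent of host byte order. `out` must hold 4 * n bytes.
 */
void
encode_words_msb(const uint32_t *words, size_t n, unsigned char *out)
{
    for (size_t i = 0; i < n; i++) {
        out[4 * i + 0] = (unsigned char)(words[i] >> 24);
        out[4 * i + 1] = (unsigned char)(words[i] >> 16);
        out[4 * i + 2] = (unsigned char)(words[i] >> 8);
        out[4 * i + 3] = (unsigned char)(words[i]);
    }
}
```

Applied to the three values in hash_words, the first four output bytes are 01 23 45 67. A little-endian host stores the same word in memory as 67 45 23 01, which is why a translation step is needed for non-native objects.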
Save the program in listing 6.1 on page 44 to file prog5.c and then compile
and run it as shown in listing 6.2.
Listing 6.2: Compiling and Running prog5
% cc -o prog5 prog5.c -lelf        # (1)
% ./prog5 foo
% file foo                         # (2)
foo: ELF 32-bit MSB executable, PowerPC or cisco 4500, \
    version 1 (SYSV), statically linked, stripped
% readelf -a foo                   # (3)
ELF Header:
  Magic:   7f 45 4c 46 01 02 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, big endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           PowerPC
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          52 (bytes into file)
  Start of section headers:          112 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         1
  Size of section headers:           40 (bytes)
  Number of section headers:         3
  Section header string table index: 2
...etc...
3 We use the file and readelf programs to examine the object that we
have created.
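The offsets reported by readelf can be cross-checked by hand. The sketch below adds up the sizes of the pieces created by prog5, assuming that they are laid out in the order executable header, program header table, section data, section header table, and that the section header table is aligned to 4 bytes. The function names align_up and prog5_shoff are hypothetical, and the ordering is an assumption of this sketch that happens to agree with the output above.

```c
#include <stddef.h>

/* Round `off` up to a multiple of `align` (a power of two). */
size_t
align_up(size_t off, size_t align)
{
    return (off + align - 1) & ~(align - 1);
}

/*
 * Offset of the section header table in the object built by prog5,
 * under the layout order assumed in the text above.
 */
size_t
prog5_shoff(void)
{
    size_t off = 52;              /* size of the ELF32 executable header */

    off += 1 * 32;                /* one ELF32 program header table entry */
    off = align_up(off, 4) + 12;  /* three 32-bit hash words, d_align 4 */
    off = align_up(off, 1) + 16;  /* string table bytes, d_align 1 */
    return align_up(off, 4);      /* start of the section header table */
}
```

The result is 112, matching the "Start of section headers" line, and the program header table starts at 52, directly after the executable header.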
6.2 The finer points in creating ELF objects
Some of the finer points in creating ELF objects using the libelf library are
examined below. We cover memory management rules, ELF data structure
lifetimes, and how an application can take full control over an object's layout.
We also briefly cover how to modify an existing ELF object.
6.2.1
By default, the libelf library will lay out your ELF objects for you. The default
layout is shown in figure 3.1. An application may request fine-grained
control over the ELF object's layout by setting the flag ELF_F_LAYOUT on the
ELF descriptor using function elf_flagelf.
Once an ELF descriptor has been flagged with flag ELF_F_LAYOUT the following members of the ELF data structures come under application control:
The e_phoff and e_shoff fields, which determine where the ELF program header table and section header table start.
For each section, the sh_addralign, sh_offset, and sh_size fields in its
section header.
These fields must be set prior to calling function elf_update.
The library will fill gaps between parts of the ELF file with a fill character.
An application may set the fill character using the function elf_fill. The
default fill character is a zero byte.
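What filling a gap means can be modelled on a plain memory buffer. The sketch below places section data at an offset chosen by the application and pads the bytes between the end of the previous part and that offset with a fill byte. The function place_with_fill is a hypothetical helper for illustration; it is a conceptual model and makes no claim about how the library writes files.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy `len` bytes of section data to offset `sh_offset` in `image`,
 * after padding the gap that starts at `prev_end` with `fill`.
 * Returns the offset just past the copied data, or 0 if the request
 * overlaps the previous part or does not fit in `image_size` bytes.
 */
size_t
place_with_fill(unsigned char *image, size_t image_size, size_t prev_end,
    size_t sh_offset, const void *data, size_t len, int fill)
{
    if (sh_offset < prev_end || sh_offset > image_size ||
        len > image_size - sh_offset)
        return 0;
    memset(image + prev_end, fill, sh_offset - prev_end);
    memcpy(image + sh_offset, data, len);
    return sh_offset + len;
}
```

The overlap check mirrors the responsibility that comes with ELF_F_LAYOUT: once the application chooses offsets itself, it must ensure that the parts of the file do not collide.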
6.2.2 Memory Management
6.2.3
As part of the process of writing out an ELF object, the libelf library may
release or reallocate its internal bookkeeping structures.
A rule to be followed when using the libelf library is that all pointers to
returned data structures (e.g., pointers to Elf_Scn and Elf_Data structures or
to other ELF headers) become invalid after a call to function elf_update with
parameter ELF_C_WRITE.
After a successful call to function elf_update all ELF data structures will
need to be retrieved afresh.
6.2.4
The libelf library also allows existing ELF objects to be modified. The process
is similar to that for creating ELF objects, the differences being:
The underlying file object would need to be opened for reading and writing, and the call to function elf_begin would use parameter ELF_C_RDWR
instead of ELF_C_WRITE.
The application would use the elf_get* APIs to retrieve existing ELF
data structures in addition to the elf_new* APIs used for allocating new
data structures. The libelf library would be informed of modifications
to ELF data structures by calls to the appropriate elf_flag* functions.
The rest of the program flow would be similar to the object creation case.
An important point to note when modifying an existing ELF object is that
it is the application's responsibility to ensure that the changed object remains
compliant with the ELF standard and internally consistent. For example, if the
sections in an ELF executable are moved around, then the information in the
executable's Program Header Table would also need to be updated appropriately. An in-depth discussion of this topic is, however, out of scope for this
introductory tutorial.
Chapter 7
Processing ar(1) archives
7.1 Archive structure
Each ar(1) archive starts with a sequence of 8 signature bytes (see the constant
ARMAG defined in the system header ar.h). The members of the archive follow,
each member preceded by an archive header describing the metadata associated
with the member. Figure 7.1 on the next page depicts the structure of an ar(1)
archive pictorially.
Each archive header is a collection of fixed size ASCII strings. Archive
headers are required to reside at even offsets in the archive file. Listing 7.1
shows the layout of the archive header as a C structure.
Listing 7.1: Archive Header Layout
struct ar_hdr {
    char ar_name[16];    /* file name */
    char ar_date[12];    /* file modification time */
    char ar_uid[6];      /* creator user id */
    char ar_gid[6];      /* creator group id */
    char ar_mode[8];     /* octal file permissions */
    char ar_size[10];    /* size in bytes */
#define ARFMAG "`\n"
    char ar_fmag[2];     /* consistency check */
} __packed;
[Figure 7.1: The structure of ar(1) archives. The archive magic is followed by the members (File 0, File 1, File 2, ...), each preceded by an archive header holding the name, date, uid, gid, mode, size and fmag fields.]
and / characters being used for string termination. File names that exceed the length limits of the ar_name member are handled by placing them
in a special string table (not to be confused with ELF string tables) and
storing the offset of the file name in the ar_name member as a string of
decimal digits.
The archive handling functions offered by the libelf library insulate the
application from these details of the layout of ar(1) archives.
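To give a feel for the details that the library hides, the sketch below decodes the numeric fields of one archive header by hand. The structure ar_hdr_text repeats the field widths of listing 7.1 under a different name so that the example stands alone; it and the two helper functions are hypothetical and are not part of libelf or ar.h. The parser assumes left-justified, space-padded digits in a base of at most 10.

```c
#include <stddef.h>
#include <string.h>

/* Same field widths as struct ar_hdr in listing 7.1 (60 bytes in total). */
struct ar_hdr_text {
    char ar_name[16];
    char ar_date[12];
    char ar_uid[6];
    char ar_gid[6];
    char ar_mode[8];
    char ar_size[10];
    char ar_fmag[2];
};

/*
 * Parse a fixed-width, space-padded numeric field in the given base
 * (2 to 10). Returns -1 if the field holds no leading digits or holds
 * anything other than padding spaces after the digits.
 */
long
ar_field_to_long(const char *field, size_t width, int base)
{
    long value = 0;
    size_t i = 0;

    while (i < width && field[i] >= '0' && field[i] < '0' + base)
        value = value * base + (field[i++] - '0');
    if (i == 0)
        return -1;
    for (; i < width; i++)
        if (field[i] != ' ')
            return -1;
    return value;
}

/* Check the consistency bytes, then return the member size or -1. */
long
ar_member_size(const struct ar_hdr_text *h)
{
    if (memcmp(h->ar_fmag, "`\n", 2) != 0)
        return -1;
    return ar_field_to_long(h->ar_size, sizeof(h->ar_size), 10);
}
```

The same helper decodes the permissions when called with base 8 on the ar_mode field. None of the fields is NUL-terminated, which is why every access is bounded by the field width.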
7.2 Example: Stepping through an ar(1) archive
We now illustrate (listing 7.2) how an application may iterate through the members of an ar(1) archive. The steps involved are:
1. Archives are opened using elf_begin in the usual way.
2. Each archive managed by the libelf library tracks the next member to
be opened. This information is updated using the functions elf_next and
elf_rand.
3. Nested calls to function elf_begin retrieve ELF descriptors for the members in the archive.
Figure 7.2 pictorially depicts how functions elf_begin
and elf_next are used to step through an ar(1) archive.
We now look at an example program that illustrates these concepts.
[Figure 7.2: Iterating through ar(1) archives with elf_begin and elf_next.]
Listing 7.2: Program 6
/*
 * Iterate through an ar(1) archive.
 *
 * $Id: prog6.txt 2135 2011-11-10 08:59:47Z jkoshy $
 */
#include <err.h>
#include <fcntl.h>
#include <libelf.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
    int fd;
    Elf *ar, *e;
    Elf_Arhdr *arh;

    if (argc != 2)
        errx(EXIT_FAILURE, "usage: %s file-name", argv[0]);

    if (elf_version(EV_CURRENT) == EV_NONE)
        errx(EXIT_FAILURE, "ELF library initialization "
            "failed: %s", elf_errmsg(-1));

    if ((fd = open(argv[1], O_RDONLY, 0)) < 0)            /* (1) */
        err(EXIT_FAILURE, "open \"%s\" failed", argv[1]);

    if ((ar = elf_begin(fd, ELF_C_READ, NULL)) == NULL)   /* (2) */
        errx(EXIT_FAILURE, "elf_begin() failed: %s.",
            elf_errmsg(-1));

    if (elf_kind(ar) != ELF_K_AR)
        errx(EXIT_FAILURE, "%s is not an ar(1) archive.",
            argv[1]);

    while ((e = elf_begin(fd, ELF_C_READ, ar)) != NULL) { /* (3) */
        if ((arh = elf_getarhdr(e)) == NULL)              /* (4) */
            errx(EXIT_FAILURE, "elf_getarhdr() failed: %s.",
                elf_errmsg(-1));

        (void) printf("%20s %zu\n", arh->ar_name,
            arh->ar_size);

        (void) elf_next(e);                               /* (5) */
        (void) elf_end(e);                                /* (6) */
    }

    (void) elf_end(ar);
    (void) close(fd);
    exit(EXIT_SUCCESS);
}
2 We open the ar(1) archive for reading and obtain a descriptor in the
usual manner.
3 Function elf_begin is used to iterate through the members of the
archive. The third parameter in the call to elf_begin is a pointer to the
descriptor for the archive itself. The return value of function elf_begin
is a descriptor that references an archive member.
4 We retrieve the translated ar(1) header using function elf_getarhdr. We
then print out the name and size of the member. Note that function
elf_getarhdr translates names to null-terminated C strings suitable for
use with printf.
Listing 7.3 shows the translated information returned by elf_getarhdr.
Listing 7.3: The Elf_Arhdr Structure
typedef struct {
    time_t  ar_date;       /* time of creation */
    char    *ar_name;      /* archive member name */
    gid_t   ar_gid;        /* creator's group */
    mode_t  ar_mode;       /* file creation mode */
    char    *ar_rawname;   /* raw member name */
    size_t  ar_size;       /* member size in bytes */
    uid_t   ar_uid;        /* creator's user id */
} Elf_Arhdr;
5 The elf_next function sets up the parent archive descriptor (referenced by
variable ar in this example) to return the next archive member on the
next call to function elf_begin.
6 It is good programming practice to call elf_end on descriptors that are no
longer needed.
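The stepping that elf_begin and elf_next perform can be pictured as hopping from one archive header to the next. The sketch below does this by hand on an archive image held in memory, using only the facts stated in section 7.1: an 8-byte signature, 60-byte headers, a decimal size field, and headers at even offsets. The function ar_count_members and the AR_* constants are hypothetical names. The sketch counts every header it meets, including any special members such as the archive symbol table or string table, and it is not a description of the library's implementation.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define AR_MAGIC     "!<arch>\n"   /* the 8 signature bytes (ARMAG) */
#define AR_MAGIC_LEN 8
#define AR_HDR_LEN   60            /* size of one archive header */
#define AR_SIZE_OFF  48            /* offset of ar_size within a header */
#define AR_SIZE_LEN  10            /* width of the ar_size field */

/*
 * Count the members of an archive image by walking its headers.
 * Returns -1 if the image is malformed.
 */
int
ar_count_members(const unsigned char *image, size_t len)
{
    size_t off = AR_MAGIC_LEN;
    int count = 0;

    if (len < AR_MAGIC_LEN || memcmp(image, AR_MAGIC, AR_MAGIC_LEN) != 0)
        return -1;
    while (off < len) {
        char size_text[AR_SIZE_LEN + 1];
        char *end;
        unsigned long size;

        if (len - off < AR_HDR_LEN)
            return -1;
        /* The size field is not NUL-terminated, so copy it first. */
        memcpy(size_text, image + off + AR_SIZE_OFF, AR_SIZE_LEN);
        size_text[AR_SIZE_LEN] = '\0';
        size = strtoul(size_text, &end, 10);
        if (end == size_text)
            return -1;
        off += AR_HDR_LEN;
        if (size > len - off)
            return -1;
        off += size;
        off += off & 1;            /* headers reside at even offsets */
        count++;
    }
    return count;
}
```

In this picture, the offset of the next header plays the role of the "next member" state that the parent archive descriptor tracks, and that elf_next advances.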
Save the program in listing 7.2 on page 52 to file prog6.c and then compile
and run it as shown in listing 7.4.
Listing 7.4: Compiling and Running prog6
1
7.2.1
Random access in the archive is supported by the function elf_rand. However,
in order to use this function you need to know the file offsets in the archive for
the desired archive member. For archives containing object files this information
is present in the archive symbol table.
If an archive has an archive symbol table, it can be retrieved using the
function elf_getarsym. Function elf_getarsym returns an array of Elf_Arsym
structures. Each Elf_Arsym structure (listing 7.5) maps one program symbol to
the file offset inside the ar(1) archive of the member that contains its definition.
Listing 7.5: The Elf_Arsym structure
typedef struct {
    off_t         as_off;     /* byte offset to member header */
    unsigned long as_hash;    /* elf_hash() value for name */
    char          *as_name;   /* null terminated symbol name */
} Elf_Arsym;
Once the file offset of the member is known, the function elf_rand can be
used to set the parent archive to open the desired archive member at the next
call to elf_begin.
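The lookup step that precedes a call to elf_rand can be sketched as a search through a symbol table for a given name. The structure arsym below is a hypothetical mirror of the three fields in listing 7.5, and the sketch assumes that the table ends with an entry whose as_name is NULL. The hash routine is the one published in the System V ABI for ELF hash tables, written out here under the name sysv_hash so that the example needs no library.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the Elf_Arsym fields shown in listing 7.5. */
struct arsym {
    long          as_off;    /* byte offset to member header */
    unsigned long as_hash;   /* hash of the symbol name */
    const char   *as_name;   /* NUL-terminated name, NULL in the last entry */
};

/* The symbol hash function given in the System V ABI. */
unsigned long
sysv_hash(const char *name)
{
    uint32_t h = 0, g;

    while (*name != '\0') {
        h = (h << 4) + (unsigned char)*name++;
        if ((g = h & 0xf0000000u) != 0)
            h ^= g >> 24;
        h &= ~g;
    }
    return h;
}

/*
 * Return the archive offset of the member that defines `name`,
 * or -1 if the symbol is not present in the table.
 */
long
arsym_find(const struct arsym *tab, const char *name)
{
    unsigned long h = sysv_hash(name);

    /* Compare the cheap hash first, the full name only on a match. */
    for (; tab->as_name != NULL; tab++)
        if (tab->as_hash == h && strcmp(tab->as_name, name) == 0)
            return tab->as_off;
    return -1;
}
```

An offset found this way is the kind of value that would then be handed to elf_rand, followed by a call to elf_begin on the archive descriptor to open that member.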
Chapter 8
Conclusion
This tutorial covered the following topics:
We gained an overview of the facilities for manipulating ELF objects offered by the ELF(3) and GELF(3) API sets.
We studied the basics of the ELF format, including the key data structures
involved and their layout inside ELF objects.
We looked at example programs that retrieve ELF data structures from
existing ELF objects.
We looked at how to create new ELF objects using the ELF(3) library.
We looked at accessing information in the ar(1) archives.
8.1 Further Reading
8.1.1 On the Web
8.1.2
The source code for the tools being developed at the ElfToolChain Project at
SourceForge.Net shows the use of the ELF(3)/GELF(3) APIs in useful programs.
For readers looking for smaller programs to study, Emmanuel Azencot offers
a website with example programs.
8.1.3 Books
8.1.4 Standards
The current specification of the ELF format, the Tool Interface Standard (TIS)
Executable and Linking Format (ELF) Specification, Version 1.2 is freely available to download.
8.2 Getting Further Help
If you have further questions about the use of libelf, please feel free to use
our discussion list: [email protected].