13 Linking
Linking
15-213: Introduction to Computer Systems
13th Lecture, Oct. 13, 2015
Instructors:
Randal E. Bryant and David R. O’Hallaron
Today
Linking
Case study: Library interpositioning
Example C Program
Static Linking
Programs are translated and linked using a compiler driver:
▪ linux> gcc -Og -o prog main.c sum.c
▪ linux> ./prog
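The command above compiles two source files. main.c appears later in these slides under Relocation Entries, but sum.c is not reproduced here. The following is a minimal sketch of what sum.c presumably contains, inferred from the call sum(array, 2) in main.c and from the disassembly of sum shown later. The parameter and variable names are assumptions.

```c
/* sum.c (sketch): return the sum of the n ints starting at a. */
int sum(int *a, int n)
{
    int i, s = 0;

    for (i = 0; i < n; i++)
        s += a[i];
    return s;
}
```

With array initialized to {1, 2}, main would return 3 as the exit status of ./prog.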
[Figure: static linking. Each source file passes through the translators (cpp, cc1, as); the linker (ld) combines their output.]
Why Linkers?
Reason 1: Modularity
Reason 2: Efficiency
▪ Space: Libraries
▪ Common functions can be aggregated into a single file...
▪ Yet executable files and running memory images contain only
code for the functions they actually use.
Step 1: Symbol resolution
▪ Symbol definitions are stored in the object file (by the assembler) in a symbol table.
▪ Symbol table is an array of structs
▪ Each entry includes name, size, and location of symbol.
▪ During symbol resolution step, the linker associates each symbol reference
with exactly one symbol definition.
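The bullets above describe the symbol table as an array of structs and resolution as matching each reference to exactly one definition. A toy model of that idea can be sketched as follows. The struct sym_entry layout and the resolve function are hypothetical simplifications, not the real ELF format, and the sketch ignores the strong/weak rules covered later.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified symbol table entry (real ELF entries carry more fields). */
struct sym_entry {
    const char *name;   /* symbol name */
    size_t size;        /* size of the object in bytes */
    size_t value;       /* location: offset within its section */
    int defined;        /* 1 if the module defines it, 0 if it only references it */
};

/* Return the single definition of name, or NULL if there is no definition
   or more than one (both count as resolution errors in this toy model). */
const struct sym_entry *resolve(const struct sym_entry *tab, size_t n,
                                const char *name)
{
    const struct sym_entry *found = NULL;

    for (size_t i = 0; i < n; i++) {
        if (!tab[i].defined || strcmp(tab[i].name, name) != 0)
            continue;
        if (found != NULL)
            return NULL;    /* duplicate definition */
        found = &tab[i];
    }
    return found;
}
```

A reference to sum from one module would resolve to the entry where another module defines sum; a name that no module defines yields NULL, which corresponds to an "undefined reference" error.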
Linker Symbols
Global symbols
▪ Symbols defined by module m that can be referenced by other modules.
▪ E.g.: non-static C functions and non-static global variables.
External symbols
▪ Global symbols that are referenced by module m but defined by some
other module.
Local symbols
▪ Symbols that are defined and referenced exclusively by module m.
▪ E.g.: C functions and global variables defined with the static
attribute.
▪ Local linker symbols are not local program variables
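The three kinds of linker symbols can be seen in one small module. This is an illustrative sketch only: the names call_count, scale, scaled, and scaled_length are hypothetical and do not come from the lecture.

```c
#include <string.h>   /* strlen is referenced here but defined in libc: an external symbol */

int call_count = 0;        /* global symbol: non-static global variable */
static int scale = 2;      /* local symbol: static global, visible only in this module */

/* local symbol: static function */
static int scaled(int v)
{
    return v * scale;
}

/* global symbol: non-static function */
int scaled_length(const char *s)
{
    int len = (int)strlen(s);   /* len lives on the stack; the linker never sees it */

    call_count++;
    return scaled(len);
}
```

Other modules can call scaled_length and read call_count, but they cannot name scale or scaled, and no symbol table entry exists for len.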
[Figure: main.c and sum.c annotated with linker symbols. Labels: "Defining a global"; "Referencing a global… …that’s defined here"; "Linker knows nothing of val"; "Linker knows nothing of i or s".]
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 14
Carnegie Mellon
Local Symbols
Local non-static C variables vs. local static C variables
▪ local non-static C variables: stored on the stack
▪ local static C variables: stored in either .bss or .data
int f()
{
    static int x = 0;
    return x;
}
Compiler allocates space in .data or .bss for each definition of x
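To see that each static local gets its own storage that outlives the call, consider the following sketch. The second function g and the increments are hypothetical additions for illustration; the slide only shows f.

```c
/* Each function owns a separate x, and both persist across calls. */
int f(void)
{
    static int x = 0;     /* one storage location, private to f */
    return ++x;
}

int g(void)
{
    static int x = 100;   /* a different location, private to g */
    return ++x;
}
```

Successive calls to f count upward independently of g, which would not happen if x lived on the stack.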
p1.c: int foo=5;   (strong)
p2.c: int foo;     (weak)
Linker Puzzles
Module 1                 Module 2              Result
int x; p1() {}           p1() {}               Link time error: two strong symbols (p1)
int x; int y; p1() {}    double x; p2() {}     Writes to x in p2 might overwrite y! Evil!
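The "writes to x might overwrite y" puzzle cannot be reproduced in a single file, but its effect can be mimicked. The sketch below assumes a 4-byte int and an 8-byte double, and it assumes the linker happened to place y directly after x; the struct and function names are hypothetical.

```c
#include <string.h>

/* Hypothetical layout: assume the linker placed int y directly after int x. */
struct module1_globals {
    int x;
    int y;
};

_Static_assert(sizeof(struct module1_globals) >= sizeof(double),
               "sketch assumes a double fits in the combined x, y storage");

/* Mimic p2 storing a double through the symbol x; return the resulting y. */
int y_after_double_store(double value)
{
    struct module1_globals g = { 0, 42 };

    memcpy(&g, &value, sizeof value);   /* 8 bytes written starting at &x */
    return g.y;
}
```

Storing 1.0 leaves y with some of the bytes of the double instead of 42, which is exactly the silent corruption the puzzle warns about.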
Global Variables
Avoid if you can
Otherwise
▪ Use static if you can
▪ Initialize if you define a global variable
▪ Use extern if you reference an external global variable
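The three rules above can be illustrated in a few lines. This is a sketch with hypothetical names (retry_limit, attempts, try_again); in a real project the extern declaration would live in a header or in the referencing module.

```c
/* How another module would reference the global: a declaration, not a definition. */
extern int retry_limit;

/* The one definition. It is initialized, so the linker treats it as strong. */
int retry_limit = 3;

/* static: private to this module, so it cannot collide with a symbol elsewhere. */
static int attempts = 0;

/* Returns 1 while another attempt is allowed, 0 afterwards. */
int try_again(void)
{
    if (attempts >= retry_limit)
        return 0;
    attempts++;
    return 1;
}
```

Because retry_limit is initialized, a second initialized definition in another module produces a link-time error rather than a silent merge.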
Step 2: Relocation
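[Figure: relocation merges relocatable object files into one executable object file. Inputs: system data (.data); main.o with main() (.text) and int array[2]={1,2} (.data); sum.o with sum() (.text). Output: a combined .text holding main() and sum(), a combined .data holding the system data and array, plus .symtab and .debug.]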
Relocation Entries
int sum(int *a, int n);

int array[2] = {1, 2};

int main()
{
    int val = sum(array, 2);
    return val;
}
main.c
0000000000000000 <main>:
   0: 48 83 ec 08       sub  $0x8,%rsp
   4: be 02 00 00 00    mov  $0x2,%esi
   9: bf 00 00 00 00    mov  $0x0,%edi     # %edi = &array
            a: R_X86_64_32 array           # Relocation entry
Source: objdump -r -d main.o
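The relocation entry above tells the linker to patch the four zero bytes at offset 0xa with the address of array. A minimal sketch of that patching step follows. The struct reloc layout, the function names, and every address used with it are hypothetical; real relocation entries use the ELF format.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified relocation entry. */
struct reloc {
    size_t offset;    /* where in the section to patch */
    int pc_relative;  /* 0: absolute 32-bit (like R_X86_64_32), 1: PC-relative 32-bit */
    int32_t addend;   /* constant added to the symbol address */
};

/* Store a 32-bit value in little-endian byte order, as x86-64 expects. */
static void put32(unsigned char *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (unsigned char)(v >> (8 * i));
}

/* Patch one reference. section_addr is the run-time address chosen for the
   section, sym_addr the run-time address chosen for the symbol. */
void relocate(unsigned char *section, uint64_t section_addr,
              const struct reloc *r, uint64_t sym_addr)
{
    uint64_t value = sym_addr + (uint64_t)(int64_t)r->addend;

    if (r->pc_relative)
        value -= section_addr + r->offset;   /* distance from the patched field */
    put32(section + r->offset, (uint32_t)value);
}
```

If the linker placed array at, say, 0x601018, the bytes after the bf opcode would become 18 10 60 00, so the instruction would load that address into %edi.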
00000000004004e8 <sum>:
4004e8: b8 00 00 00 00 mov $0x0,%eax
4004ed: ba 00 00 00 00 mov $0x0,%edx
4004f2: eb 09 jmp 4004fd <sum+0x15>
4004f4: 48 63 ca movslq %edx,%rcx
4004f7: 03 04 8f add (%rdi,%rcx,4),%eax
4004fa: 83 c2 01 add $0x1,%edx
4004fd: 39 f2 cmp %esi,%edx
4004ff: 7c f3 jl 4004f4 <sum+0xc>
400501: f3 c3 repz retq
Source: objdump -dx prog
[Figure: loading an executable object file. File sections shown: .data, .bss, .symtab, .debug, .line, .strtab, and the section header table (required for relocatables). Memory image, from low to high addresses: unused region starting at 0; read-only code segment (.init, .text, .rodata) starting at 0x400000 and read/write data segment (.data, .bss), both loaded from the executable file; run-time heap (created by malloc), whose top is marked by brk.]
Archiver (ar)
unix> ar rs libc.a \
        atoi.o printf.o … random.o
Linking with Static Libraries
libvector.a

int addcnt = 0;    /* from addvec.c */

int multcnt = 0;
void multvec(int *x, int *y, int *z, int n)
{
    int i;
    multcnt++;
    for (i = 0; i < n; i++)
        z[i] = x[i] * y[i];
}
multvec.c
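The addvec member of libvector.a is only hinted at on the slide. The following sketch assumes it mirrors multvec, with the signature taken from the function pointer declared in the dynamic linking example later in this lecture. The caller add_example is hypothetical.

```c
int addcnt = 0;   /* counts calls to addvec */

/* z[i] = x[i] + y[i] for 0 <= i < n */
void addvec(int *x, int *y, int *z, int n)
{
    int i;

    addcnt++;
    for (i = 0; i < n; i++)
        z[i] = x[i] + y[i];
}

/* Hypothetical caller: adds {1, 2} and {3, 4} into z. */
void add_example(int z[2])
{
    int x[2] = {1, 2};
    int y[2] = {3, 4};

    addvec(x, y, z, 2);
}
```

A program that calls only addvec pulls addvec.o out of the archive at link time; multvec.o stays behind because nothing references it.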
[Figure: linking with static libraries. Elements shown: addvec.o, multvec.o, Linker (ld), and prog2c, the fully linked executable object file.]
Problem:
▪ Command line order matters!
▪ Moral: put libraries at the end of the command line.
unix> gcc -L. libtest.o -lmine
unix> gcc -L. -lmine libtest.o
libtest.o: In function `main':
libtest.o(.text+0x4): undefined reference to `libfun'
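Why the second command fails can be modeled with a toy version of the left-to-right scan. The sketch below is a strong simplification under stated assumptions: each input defines at most one symbol and needs at most one, -lmine is assumed to name an archive libmine.a that defines libfun, and an archive is used only if it defines something currently undefined. All type and function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 16

/* Hypothetical, highly simplified model of one command-line input. */
struct input {
    const char *name;
    int is_archive;        /* 1 for a .a archive, 0 for a .o file */
    const char *defines;   /* the one symbol it defines, or NULL */
    const char *needs;     /* the one symbol it references, or NULL */
};

static int find(const char *set[], int n, const char *sym)
{
    for (int i = 0; i < n; i++)
        if (strcmp(set[i], sym) == 0)
            return i;
    return -1;
}

/* Scan inputs left to right; return how many references stay unresolved. */
int count_unresolved(const struct input *in, int n)
{
    const char *undef[MAX_SYMS];   /* symbols referenced but not yet defined */
    const char *def[MAX_SYMS];     /* symbols defined so far */
    int nu = 0, nd = 0;

    for (int i = 0; i < n; i++) {
        const struct input *f = &in[i];
        int k = f->defines ? find(undef, nu, f->defines) : -1;

        /* An archive that satisfies no pending reference is skipped for good. */
        if (f->is_archive && k < 0)
            continue;
        if (k >= 0)
            undef[k] = undef[--nu];            /* reference is now resolved */
        if (f->defines && nd < MAX_SYMS)
            def[nd++] = f->defines;
        if (f->needs && find(def, nd, f->needs) < 0 &&
            find(undef, nu, f->needs) < 0 && nu < MAX_SYMS)
            undef[nu++] = f->needs;
    }
    return nu;
}

/* The two command lines from the slide, in this toy model. */
const struct input order_ok[] = {
    {"libtest.o", 0, "main", "libfun"}, {"libmine.a", 1, "libfun", NULL}
};
const struct input order_bad[] = {
    {"libmine.a", 1, "libfun", NULL}, {"libtest.o", 0, "main", "libfun"}
};
```

With the library last, libfun is already pending when the archive is scanned, so it resolves. With the library first, nothing is pending yet, the archive is skipped, and libfun is left undefined.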
[Figure: dynamic linking. Elements shown: Linker (ld), Loader (execve), libc.so, libvector.so.]
int main()
{
    void *handle;
    void (*addvec)(int *, int *, int *, int);
    char *error;
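The declarations above (a handle, a function pointer, an error string) are the usual ingredients of run-time dynamic linking with dlopen and dlsym. Since libvector.so is not available here, the sketch below looks up strlen instead, which is an assumption made purely so the example can run. The function name length_via_dlsym is hypothetical. Passing NULL as the library name asks dlopen for the running program's own symbol scope.

```c
#include <dlfcn.h>
#include <stddef.h>

typedef size_t (*strlen_fn)(const char *);

/* Open lib (NULL means the running program and its loaded libraries),
   look up "strlen" by name, and call it on s.
   Returns 0 and stores the length in *out on success, -1 on failure. */
int length_via_dlsym(const char *lib, const char *s, size_t *out)
{
    void *handle;
    strlen_fn fn;

    handle = dlopen(lib, RTLD_LAZY);
    if (handle == NULL)
        return -1;

    dlerror();                                  /* clear any stale error */
    fn = (strlen_fn)dlsym(handle, "strlen");    /* look up the symbol by name */
    if (fn == NULL || dlerror() != NULL) {
        dlclose(handle);
        return -1;
    }

    *out = fn(s);                               /* call through the pointer */
    dlclose(handle);
    return 0;
}
```

On older glibc versions the program must be linked with -ldl, as in the build commands for the interpositioning example later in this lecture.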
Linking Summary
Linking is a technique that allows programs to be
constructed from multiple object files.
Today
Linking
Case study: Library interpositioning
Example program
Goal: trace the addresses and sizes of the allocated and freed blocks, without breaking the program, and without modifying the source code.
Three solutions: interpose on the lib malloc and free functions at compile time, link time, and load/run time.

#include <stdio.h>
#include <malloc.h>

int main()
{
    int *p = malloc(32);
    free(p);
    return(0);
}
int.c
Compile-time Interpositioning
#ifdef COMPILETIME
#include <stdio.h>
#include <malloc.h>
Compile-time Interpositioning
#define malloc(size) mymalloc(size)
#define free(ptr) myfree(ptr)
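The macro trick can be tried in a single file. This is a sketch rather than the lecture's actual file layout: the wrappers are defined before the macros so that they still reach the library functions, and the counter traced_calls and the function run_client are hypothetical additions.

```c
#include <stdio.h>
#include <stdlib.h>

static int traced_calls = 0;   /* hypothetical counter, for demonstration only */

/* The wrappers come before the macros below, so the malloc and free
   they call are still the library versions. */
void *mymalloc(size_t size)
{
    void *ptr = malloc(size);

    traced_calls++;
    printf("malloc(%d) = %p\n", (int)size, ptr);
    return ptr;
}

void myfree(void *ptr)
{
    traced_calls++;
    printf("free(%p)\n", ptr);   /* print before freeing the block */
    free(ptr);
}

/* From here on, apparent calls to malloc/free expand to the wrappers. */
#define malloc(size) mymalloc(size)
#define free(ptr) myfree(ptr)

/* Client code: the same statements as the example program's main. */
int run_client(void)
{
    int *p = malloc(32);

    free(p);
    return traced_calls;
}
```

The client code is textually unchanged; only the preprocessor decides which functions it ends up calling.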
Link-time Interpositioning
#ifdef LINKTIME
#include <stdio.h>
Link-time Interpositioning
linux> make intl
gcc -Wall -DLINKTIME -c mymalloc.c
gcc -Wall -c int.c
gcc -Wall -Wl,--wrap,malloc -Wl,--wrap,free -o intl int.o mymalloc.o
linux> make runl
./intl
malloc(32) = 0x1aa0010
free(0x1aa0010)
linux>
Load/Run-time Interpositioning
#ifdef RUNTIME
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>
Load/Run-time Interpositioning
/* free wrapper function */
void free(void *ptr)
{
    void (*freep)(void *) = NULL;
    char *error;

    if (!ptr)
        return;
Load/Run-time Interpositioning
linux> make intr
gcc -Wall -DRUNTIME -shared -fpic -o mymalloc.so mymalloc.c -ldl
gcc -Wall -o intr int.c
linux> make runr
(LD_PRELOAD="./mymalloc.so" ./intr)
malloc(32) = 0xe60010
free(0xe60010)
linux>
Interpositioning Recap
Compile Time
▪ Apparent calls to malloc/free get macro-expanded into calls to
mymalloc/myfree
Link Time
▪ Use the linker's --wrap option to get special name resolutions
▪ malloc → __wrap_malloc
▪ __real_malloc → malloc
Load/Run Time
▪ Implement custom versions of malloc/free that use dynamic linking
to load the library malloc/free under different names