Lecture 11 - Linking
Linking
Carnegie Mellon
Example C Program
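main.c:
void swap();
int buf[2] = {1, 2};
int main() { swap(); return 0; }

swap.c:
extern int buf[];
int *bufp0 = &buf[0];
static int *bufp1;
void swap()
{
    int temp;
    bufp1 = &buf[1];
    temp = *bufp0;
    *bufp0 = *bufp1;
    *bufp1 = temp;
}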
Static Linking
Programs are translated and linked using a compiler driver:
▪ unix> gcc -O2 -g -o p main.c swap.c
▪ unix> ./p
[Figure: main.c and swap.c each pass through the translators (cpp, cc1, as) to produce main.o and swap.o, which the linker (ld) combines into the executable p.]
Why Linkers?
Reason 1: Modularity
Reason 2: Efficiency
▪ Space: Libraries
▪ Common functions can be aggregated into a single file...
▪ Yet executable files and running memory images contain only
code for the functions they actually use.
▪ Linker associates each symbol reference with exactly one symbol definition.
Linker Symbols
Global symbols
▪ Symbols defined by module m that can be referenced by other modules.
▪ E.g.: non-static C functions and non-static global variables.
External symbols
▪ Global symbols that are referenced by module m but defined by some
other module.
Local symbols
▪ Symbols that are defined and referenced exclusively by module m.
▪ E.g.: C functions and variables defined with the static attribute.
▪ Local linker symbols are not local program variables
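To make the three categories concrete, here is a minimal sketch of a single hypothetical module. The names counter, hits, helper, and bump are invented for this illustration and do not come from the lecture; strlen stands in for "a symbol defined by some other module" because the C library defines it.

```c
#include <assert.h>
#include <string.h>   /* declares strlen; libc defines it, so it is an external symbol here */

int counter = 0;             /* global symbol: non-static, other modules may reference it */

static int hits = 0;         /* local linker symbol: static, visible only in this module */

static int helper(int n)     /* local linker symbol: static function */
{
    return n + 1;
}

int bump(const char *s)      /* global symbol: non-static function */
{
    assert(s != NULL);
    int temp = (int)strlen(s);   /* local program variable: lives on the stack, no linker symbol */
    hits = helper(hits);
    counter += temp;
    return counter;
}
```

Under these assumptions, the symbol table of the resulting object file would be expected to list counter and bump as global, hits and helper as local, and strlen as undefined, with no entry at all for temp.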
Resolving Symbols
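▪ main.c: buf is a global symbol (defined here); swap is an external symbol (referenced here, defined in swap.c).
▪ swap.c: buf is an external symbol; bufp0 and swap are global symbols; bufp1 is a local symbol (static).
▪ temp is a local program variable, so the linker knows nothing about it.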
main.o:
▪ main() in .text
▪ int buf[2]={1,2} in .data
swap.o:
▪ swap() in .text
▪ int *bufp0=&buf[0] in .data
▪ static int *bufp1 in .bss
Executable (after the linker merges sections):
▪ .text: main(), swap(), more system code
▪ .data: system data, int buf[2]={1,2}, int *bufp0=&buf[0]
▪ .bss: int *bufp1
▪ .symtab, .debug
void swap()
{
    int temp;
    bufp1 = &buf[1];
    temp = *bufp0;
    *bufp0 = *bufp1;
    *bufp1 = temp;
}
08048380 <main>:
8048380: 8d 4c 24 04 lea 0x4(%esp),%ecx
8048384: 83 e4 f0 and $0xfffffff0,%esp
8048387: ff 71 fc pushl 0xfffffffc(%ecx)
804838a: 55 push %ebp
804838b: 89 e5 mov %esp,%ebp
804838d: 51 push %ecx
804838e: 83 ec 04 sub $0x4,%esp
8048391: e8 1a 00 00 00 call 80483b0 <swap>
8048396: 83 c4 04 add $0x4,%esp
8048399: 31 c0 xor %eax,%eax
804839b: 59 pop %ecx
804839c: 5d pop %ebp
804839d: 8d 61 fc lea 0xfffffffc(%ecx),%esp
80483a0: c3 ret
0: 8b 15 00 00 00 00 mov 0x0,%edx
2: R_386_32 buf
6: a1 04 00 00 00 mov 0x4,%eax
7: R_386_32 buf
...
e: c7 05 00 00 00 00 04 movl $0x4,0x0
15: 00 00 00
10: R_386_32 .bss
14: R_386_32 buf
. . .
1d: 89 0d 04 00 00 00 mov %ecx,0x4
1f: R_386_32 buf
23: c3 ret
080483b0 <swap>:
80483b0: 8b 15 20 96 04 08 mov 0x8049620,%edx
80483b6: a1 24 96 04 08 mov 0x8049624,%eax
80483bb: 55 push %ebp
80483bc: 89 e5 mov %esp,%ebp
80483be: c7 05 30 96 04 08 24 movl $0x8049624,0x8049630
80483c5: 96 04 08
80483c8: 8b 08 mov (%eax),%ecx
80483ca: 89 10 mov %edx,(%eax)
80483cc: 5d pop %ebp
80483cd: 89 0d 24 96 04 08 mov %ecx,0x8049624
80483d3: c3 ret
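The arithmetic behind these listings can be sketched in a few lines of C. This is an illustration under stated assumptions, not the linker's actual code: the helper names get32, put32, reloc_abs32, call_disp, and demo_swap_offset7 are invented here. It assumes an absolute relocation such as R_386_32 adds the symbol's run-time address to the value already stored in the instruction bytes, and that a call displacement is measured from the address of the instruction following the call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value from a byte array. */
static uint32_t get32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write a 32-bit little-endian value into a byte array. */
static void put32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xff);
    p[1] = (unsigned char)((v >> 8) & 0xff);
    p[2] = (unsigned char)((v >> 16) & 0xff);
    p[3] = (unsigned char)((v >> 24) & 0xff);
}

/* Absolute relocation: stored addend + run-time address of the symbol. */
void reloc_abs32(unsigned char *section, size_t offset, uint32_t sym_addr)
{
    assert(section != NULL);
    put32(section + offset, get32(section + offset) + sym_addr);
}

/* Displacement for a 5-byte call (opcode e8 plus 4 displacement bytes):
   target address minus the address of the next instruction. */
uint32_t call_disp(uint32_t call_addr, uint32_t target)
{
    return target - (call_addr + 5);
}

/* Apply the relocation at offset 7 of swap's .text, where the stored
   addend is 4 and buf ends up at 0x8049620. */
uint32_t demo_swap_offset7(void)
{
    unsigned char text[] = { 0x8b, 0x15, 0, 0, 0, 0,   /* mov 0x0,%edx */
                             0xa1, 0x04, 0, 0, 0 };    /* mov 0x4,%eax */
    reloc_abs32(text, 7, 0x8049620u);
    return get32(text + 7);
}
```

With buf at 0x8049620, the addend 4 at offset 7 becomes 0x8049624, which matches the bytes 24 96 04 08 in the relocated swap. Likewise, call_disp(0x8048391, 0x80483b0) gives 0x1a, the displacement encoded in main's call instruction.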
08049620 <buf>:
8049620: 01 00 00 00 02 00 00 00
08049628 <bufp0>:
8049628: 20 96 04 08
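The .data bytes are easier to read once the little-endian byte order is undone. The following sketch decodes the bytes listed above; the names le32, buf_elem, and bufp0_value are invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Interpret four bytes as a little-endian 32-bit word. */
uint32_t le32(const unsigned char *bytes)
{
    assert(bytes != NULL);
    return (uint32_t)bytes[0] | ((uint32_t)bytes[1] << 8) |
           ((uint32_t)bytes[2] << 16) | ((uint32_t)bytes[3] << 24);
}

/* Bytes shown at 0x8049620 (buf) and 0x8049628 (bufp0). */
static const unsigned char buf_bytes[8]   = { 0x01, 0, 0, 0, 0x02, 0, 0, 0 };
static const unsigned char bufp0_bytes[4] = { 0x20, 0x96, 0x04, 0x08 };

/* Value of buf[i] as stored in the executable's .data section. */
uint32_t buf_elem(size_t i)
{
    assert(i < 2);
    return le32(buf_bytes + 4 * i);
}

/* Value of bufp0 as stored in the executable's .data section. */
uint32_t bufp0_value(void)
{
    return le32(bufp0_bytes);
}
```

Decoded this way, buf holds 1 and 2, and bufp0 holds 0x08049620, the address of buf[0], so the initializer &buf[0] has been resolved to a concrete address.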
p1.c: int foo=5;   (strong)
p2.c: int foo;     (weak)
Linker Puzzles
Module 1: int x; p1() {}
Module 2: p1() {}
▪ Link time error: two strong symbols (p1)

Module 1: int x; int y; p1() {}
Module 2: double x; p2() {}
▪ Writes to x in p2 might overwrite y! Evil!
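The "might overwrite y" puzzle can be simulated in a single file. The sketch below is a model, not the real two-module program: it assumes the linker reserves only sizeof(int) bytes for x and happens to place y directly after it (modeled here with a struct), and that a double is as wide as two ints. The names image, p2_write_x, and y_was_clobbered are invented for this illustration.

```c
#include <assert.h>
#include <string.h>

/* Assumed layout: x occupies sizeof(int) bytes and y follows immediately. */
struct image {
    int x;
    int y;
};

/* What p2 effectively does when it believes x is a double:
   it stores sizeof(double) bytes starting at the address of x. */
void p2_write_x(struct image *img, double value)
{
    assert(img != NULL);
    /* The model only makes sense if a double spans both ints (assumption). */
    assert(sizeof(double) == sizeof(struct image));
    memcpy(img, &value, sizeof value);
}

/* Returns 1 if y changed as a side effect of p2's write to x. */
int y_was_clobbered(void)
{
    struct image img = { 0, 12345 };
    p2_write_x(&img, 1.0);
    return img.y != 12345;
}
```

The puzzle says "might" because the outcome depends on where the linker actually places y; the struct simply fixes the unlucky layout so the effect is visible.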
Role of .h Files
global.h:
#ifdef INITIALIZE
int g = 23;
static int init = 1;
#else
int g;
static int init = 0;
#endif

c1.c:
#include "global.h"

int f() {
    return g+1;
}

c2.c:
#include <stdio.h>
#include "global.h"

int main() {
    if (!init)
        g = 37;
    int t = f();
    printf("Calling f yields %d\n", t);
    return 0;
}
Running Preprocessor
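c1.c includes global.h, and the preprocessor inserts the header text verbatim, so the result depends on whether INITIALIZE is defined.
With -DINITIALIZE:
int g = 23;
static int init = 1;
int f() {
    return g+1;
}
With no initialization:
int g;
static int init = 0;
int f() {
    return g+1;
}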
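The effect of the macro can be tried in one file by pasting the header text where the #include would go. This is a simplified sketch: it collapses c1.c and c2.c into a single translation unit, so it shows the preprocessor's choice but not the cross-module symbol resolution. The functions run and check are invented for this illustration.

```c
#include <assert.h>

/* Contents of global.h, pasted in place of #include "global.h". */
#ifdef INITIALIZE
int g = 23;
static int init = 1;
#else
int g;
static int init = 0;
#endif

/* From c1.c. */
int f(void)
{
    return g + 1;
}

/* Body of main from c2.c, returning the value instead of printing it. */
int run(void)
{
    if (!init)
        g = 37;
    return f();
}

/* Self-check: the result depends on whether INITIALIZE was defined. */
void check(void)
{
#ifdef INITIALIZE
    assert(run() == 24);   /* g starts at 23 and init is 1, so g is left alone */
#else
    assert(run() == 38);   /* init is 0, so g is set to 37 before f() runs */
#endif
}
```

Compiling this file with and without -DINITIALIZE selects one branch or the other; in the real two-file program each file gets its own private copy of init, because init is static.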
Global Variables
Avoid if you can
Otherwise
▪ Use static if you can
▪ Initialize if you define a global variable
▪ Use extern if you reference an external global variable
unix> ar rs libc.a \
      atoi.o printf.o … random.o
▪ The archiver (ar) combines the listed object files into the static library libc.a.
[Figure labels: addvec.o, multvec.o; Linker (ld); p2, a fully linked executable object file.]
Problem:
▪ Command line order matters!
▪ Moral: put libraries at the end of the command line.
unix> gcc -L. libtest.o -lmine
unix> gcc -L. -lmine libtest.o
libtest.o: In function `main':
libtest.o(.text+0x4): undefined reference to `libfun'
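Why the order matters can be modeled with a toy version of the scan. The usual explanation is that the linker reads its inputs left to right and pulls a member out of an archive only if that member defines a symbol still unresolved at that point. The sketch below encodes just that rule; the struct layout, the one-symbol-per-file simplification, and all function names are assumptions made for this illustration, and libmine is modeled as an archive with a single member that defines libfun.

```c
#include <assert.h>
#include <string.h>

#define MAX_SYMS 8

/* One command-line input, simplified to one defined and one referenced symbol. */
struct input {
    int is_archive;         /* 1 if it comes from a library such as -lmine */
    const char *defines;    /* the symbol this input defines */
    const char *needs;      /* the symbol it references, or NULL */
};

struct symset { const char *names[MAX_SYMS]; int n; };

static int has(const struct symset *s, const char *name)
{
    for (int i = 0; i < s->n; i++)
        if (strcmp(s->names[i], name) == 0)
            return 1;
    return 0;
}

static void add(struct symset *s, const char *name)
{
    if (!has(s, name)) {
        assert(s->n < MAX_SYMS);
        s->names[s->n++] = name;
    }
}

static void drop(struct symset *s, const char *name)
{
    for (int i = 0; i < s->n; i++)
        if (strcmp(s->names[i], name) == 0) {
            s->names[i] = s->names[--s->n];
            return;
        }
}

/* Scan inputs left to right; return how many symbols remain undefined. */
int link_scan(const struct input *in, int count)
{
    struct symset defined = { {0}, 0 }, undefined = { {0}, 0 };
    for (int i = 0; i < count; i++) {
        /* An archive member is used only if it resolves a pending reference. */
        if (in[i].is_archive && !has(&undefined, in[i].defines))
            continue;
        add(&defined, in[i].defines);
        drop(&undefined, in[i].defines);
        if (in[i].needs != NULL && !has(&defined, in[i].needs))
            add(&undefined, in[i].needs);
    }
    return undefined.n;
}

static const struct input libtest_o = { 0, "main", "libfun" };
static const struct input libmine_a = { 1, "libfun", NULL };

/* Models: gcc -L. libtest.o -lmine */
int undefined_libs_last(void)
{
    struct input order[2];
    order[0] = libtest_o;
    order[1] = libmine_a;
    return link_scan(order, 2);
}

/* Models: gcc -L. -lmine libtest.o */
int undefined_libs_first(void)
{
    struct input order[2];
    order[0] = libmine_a;
    order[1] = libtest_o;
    return link_scan(order, 2);
}
```

In this model, putting the library last leaves nothing undefined, while putting it first skips the archive member (nothing needs libfun yet) and leaves libfun unresolved, mirroring the error message above.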
Shared Libraries
Static libraries have the following disadvantages:
▪ Duplication in the stored executables (nearly every program needs the standard libc)
▪ Duplication in the running executables
▪ Minor bug fixes of system libraries require each application to explicitly
relink
[Figure labels: Linker (ld); p2, a partially linked executable object file; Loader (execve); libc.so, libvector.so.]
int main()
{
void *handle;
void (*addvec)(int *, int *, int *, int);
char *error;
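The declarations above (a handle, a function pointer, and an error string) fit the usual POSIX pattern for linking at run time with dlopen, dlsym, dlerror, and dlclose. The sketch below shows that pattern under stated assumptions and is not the lecture's listing: call_by_name is a hypothetical helper, and instead of libvector.so (which may not exist on the reader's machine) the usage note looks up abs in the libraries the program already has loaded.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Look up a function of type int(int) by name at run time and call it.
   Returns 0 on success and stores the result in *out; returns -1 on failure. */
int call_by_name(const char *libpath, const char *symbol, int arg, int *out)
{
    void *handle;
    int (*fn)(int) = NULL;
    char *error;

    assert(symbol != NULL && out != NULL);

    /* A NULL path asks for a handle on the running program and the
       shared libraries it has already loaded. */
    handle = dlopen(libpath, RTLD_LAZY);
    if (handle == NULL)
        return -1;

    dlerror();                                  /* clear any stale error state */
    *(void **)(&fn) = dlsym(handle, symbol);    /* POSIX idiom for function pointers */
    error = dlerror();
    if (error != NULL || fn == NULL) {
        dlclose(handle);
        return -1;
    }

    *out = fn(arg);
    dlclose(handle);
    return 0;
}
```

For example, call_by_name(NULL, "abs", -7, &r) is expected to succeed in a dynamically linked program and leave 7 in r. On some older systems the program must also be linked with -ldl.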