
14 Linkers W

The document discusses the process of compiling, assembling, linking, and loading a program from source code to machine-executable code. It explains that compilers output assembly files, assemblers output object files, linkers join object files into an executable, and loaders bring the executable into memory to start execution. It provides an example of a simple C program that calculates the sum from 1 to 100, and shows the assembly output and the steps of assembling, linking, and loading/running the program.

Uploaded by

Khaled Mohamed

Assemblers, Linkers, and Loaders

Hakim Weatherspoon
CS 3410, Spring 2013
Computer Science
Cornell University
See: P&H Appendix B.3-4 and 2.12
Goal for Today: Putting it all Together
Review Calling Convention

Compiler output is assembly files

Assembler output is obj files

Linker joins object files into one executable

Loader brings it into memory and starts execution


Recap: Calling Conventions
• first four arg words passed in $a0, $a1, $a2, $a3
• remaining arg words passed in parent’s stack frame
• return value (if any) in $v0, $v1
• stack frame at $sp
  – contains $ra (clobbered on JAL to sub-functions)
  – contains $fp
  – contains local vars (possibly clobbered by sub-functions)
  – contains extra arguments to sub-functions (i.e. argument “spilling”)
  – contains space for first 4 arguments to sub-functions
• callee save regs are preserved
• caller save regs are not
• Global data accessed via $gp

Stack frame layout, from $fp down to $sp:
  $fp → saved ra
        saved fp
        saved regs ($s0 ... $s7)
        locals
  $sp → outgoing args

Warning: There is no one true MIPS calling convention.


lecture != book != gcc != spim != web
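To make the argument-passing rules concrete, here is a minimal C sketch. The function name `sum6` is hypothetical, and the register comments assume the convention recapped above (as the warning says, other MIPS conventions differ).

```c
/* Six int arguments. Under the convention recapped above:
 *   a, b, c, d  would be passed in $a0, $a1, $a2, $a3
 *   e, f        would be passed in the caller's (parent's) stack frame
 * This is an illustration of the rule, not a statement about any
 * particular compiler's output. */
int sum6(int a, int b, int c, int d, int e, int f)
{
    /* The result would be returned in $v0. */
    return a + b + c + d + e + f;
}
```

Compiling a caller of `sum6` with `mipsel-linux-gcc -S` (the command used later in these slides) and reading the assembly is one way to check where each argument actually ends up.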
MIPS Register Conventions
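Reg  Name   Use                        Reg  Name  Use
r0   $zero  zero                       r16  $s0   saved (callee save)
r1   $at    assembler temp             r17  $s1   saved (callee save)
r2   $v0    function return values     r18  $s2   saved (callee save)
r3   $v1    function return values     r19  $s3   saved (callee save)
r4   $a0    function arguments         r20  $s4   saved (callee save)
r5   $a1    function arguments         r21  $s5   saved (callee save)
r6   $a2    function arguments         r22  $s6   saved (callee save)
r7   $a3    function arguments         r23  $s7   saved (callee save)
r8   $t0    temps (caller save)        r24  $t8   more temps (caller save)
r9   $t1    temps (caller save)        r25  $t9   more temps (caller save)
r10  $t2    temps (caller save)        r26  $k0   reserved for kernel
r11  $t3    temps (caller save)        r27  $k1   reserved for kernel
r12  $t4    temps (caller save)        r28  $gp   global data pointer
r13  $t5    temps (caller save)        r29  $sp   stack pointer
r14  $t6    temps (caller save)        r30  $fp   frame pointer
r15  $t7    temps (caller save)        r31  $ra   return address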
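The table above can be written down as a small lookup, which is handy when reading disassembly by hand. This is only a sketch: `reg_name` and `is_callee_saved` are hypothetical helper names, and `is_callee_saved` covers only the registers the table marks as "callee save" ($s0 to $s7).

```c
#include <stddef.h>

/* Conventional names for r0..r31, in the order given by the table. */
static const char *const REG_NAMES[32] = {
    "$zero", "$at", "$v0", "$v1", "$a0", "$a1", "$a2", "$a3",
    "$t0",   "$t1", "$t2", "$t3", "$t4", "$t5", "$t6", "$t7",
    "$s0",   "$s1", "$s2", "$s3", "$s4", "$s5", "$s6", "$s7",
    "$t8",   "$t9", "$k0", "$k1", "$gp", "$sp", "$fp", "$ra"
};

/* Returns the conventional name of register r, or NULL if r > 31. */
const char *reg_name(unsigned r)
{
    return r < 32 ? REG_NAMES[r] : NULL;
}

/* True for r16..r23 ($s0..$s7), the rows marked "callee save". */
int is_callee_saved(unsigned r)
{
    return r >= 16 && r <= 23;
}
```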
Anatomy of an executing program
0xfffffffc  (top)     system reserved
0x80000000
0x7ffffffc            stack

                      dynamic data (heap)

0x10000000            static data (.data)

                      code (text) (.text)
0x00400000
0x00000000  (bottom)  system reserved
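As a rough illustration of the address map above, the following sketch classifies a 32-bit address by region. The enum and function names are hypothetical. Because the slide gives no fixed boundary between static data, heap, and stack, the sketch lumps them into one region.

```c
#include <stdint.h>

enum region { RESERVED_LOW, TEXT, DATA_HEAP_STACK, RESERVED_HIGH };

/* Classify an address using only the boundaries shown on the slide:
 * 0x00400000 (start of text), 0x10000000 (start of static data),
 * and 0x80000000 (start of the upper system-reserved area). */
enum region region_of(uint32_t addr)
{
    if (addr < 0x00400000u) return RESERVED_LOW;
    if (addr < 0x10000000u) return TEXT;
    if (addr < 0x80000000u) return DATA_HEAP_STACK;
    return RESERVED_HIGH;
}
```

For example, a `$pc` value should land in `TEXT`, and a `$sp` value should land in `DATA_HEAP_STACK`, near its upper end.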
Anatomy of an executing program
[Figure: five-stage pipelined datapath (Instruction Fetch, Instruction Decode, Execute, Memory, Write-Back) with pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB; register file containing $0 (zero), $1 ($at), $29 ($sp), $31 ($ra); ALU, control, hazard detection and forwarding units. Annotations: code is stored in memory (also data and stack).]
Takeaway
We need a calling convention to coordinate use of
registers and memory. Registers exist in the
Register File. Stack, Code, and Data exist in
memory. Both instruction memory and data
memory are accessed through caches (modified Harvard
architecture) and a shared bus to memory (von
Neumann).
Next Goal
Given a running program (a process), how do we
know what is going on (what function is executing,
what arguments were passed to where, where is
the stack and current stack frame, where is the
code and data, etc)?
Activity #1: Debugging
Symbols:
  init():        0x400000
  printf(s, …):  0x4002B4
  vnorm(a,b):    0x40107C
  main(a,b):     0x4010A0
  pi:            0x10000000
  str1:          0x10000004

CPU:
  $pc = 0x004003C0
  $sp = 0x7FFFFFAC
  $ra = 0x00401090

Stack contents (one word per line, in the order shown on the slide; the word 0x10000004 is labeled with address 0x7FFFFFB0):
  0x00000000
  0x0040010c
  0x7FFFFFF4
  0x00000000
  0x00000000
  0x00000000
  0x00000000
  0x004010c4
  0x7FFFFFDC
  0x00000000
  0x00000000
  0x00000015
  0x10000004
  0x00401090

Questions:
  What func is running?
  Who called it?
  Has it called anything?
  Will it?
  Args?
  Stack depth?
  Call trace?
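One mechanical step in this activity is mapping a code address (such as `$pc`, `$ra`, or a saved return address on the stack) back to a function. A minimal sketch follows, assuming the four functions listed in the activity are laid out back to back so that each one extends up to the next one's start address; `func_containing` is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

struct func_sym { const char *name; uint32_t start; };

/* Start addresses from the activity, sorted in ascending order. */
static const struct func_sym FUNCS[] = {
    { "init",   0x00400000u },
    { "printf", 0x004002B4u },
    { "vnorm",  0x0040107Cu },
    { "main",   0x004010A0u },
};

/* Returns the name of the last function whose start address is <= addr,
 * or NULL if addr lies below every known function. */
const char *func_containing(uint32_t addr)
{
    const char *best = NULL;
    for (size_t i = 0; i < sizeof FUNCS / sizeof FUNCS[0]; i++)
        if (FUNCS[i].start <= addr)
            best = FUNCS[i].name;
    return best;
}
```

Applying `func_containing` to `$pc` suggests what is running, and applying it to `$ra` and to the code addresses saved on the stack suggests the chain of callers.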
Compilers and Assemblers
Next Goal
How do we compile a program from source to
assembly to machine object code?
Big Picture
Compiler output is assembly files

Assembler output is obj files

Linker joins object files into one executable

Loader brings it into memory and starts execution


Example: Add 1 to 100
#include <stdio.h>

int n = 100;

int main (int argc, char* argv[ ]) {
    int i;
    int m = n;
    int sum = 0;

    for (i = 1; i <= m; i++)
        sum += i;

    printf ("Sum 1 to %d is %d\n", n, sum);
}

export PATH=${PATH}:/courses/cs3410/mipsel-linux/bin:/courses/cs3410/mips-sim/bin
or
setenv PATH ${PATH}:/courses/cs3410/mipsel-linux/bin:/courses/cs3410/mips-sim/bin

# Compile
[csug03] mipsel-linux-gcc -S add1To100.c
Example: Add 1 to 100
        .data
        .globl  n
        .align  2
n:      .word   100
        .rdata
        .align  2
$str0:  .asciiz "Sum 1 to %d is %d\n"
        .text
        .align  2
        .globl  main
main:   addiu   $sp,$sp,-48
        sw      $31,44($sp)
        sw      $fp,40($sp)
        move    $fp,$sp
        sw      $4,48($fp)
        sw      $5,52($fp)
        la      $2,n
        lw      $2,0($2)
        sw      $2,28($fp)
        sw      $0,32($fp)
        li      $2,1
        sw      $2,24($fp)
$L2:    lw      $2,24($fp)
        lw      $3,28($fp)
        slt     $2,$3,$2
        bne     $2,$0,$L3
        lw      $3,32($fp)
        lw      $2,24($fp)
        addu    $2,$3,$2
        sw      $2,32($fp)
        lw      $2,24($fp)
        addiu   $2,$2,1
        sw      $2,24($fp)
        b       $L2
$L3:    la      $4,$str0
        lw      $5,28($fp)
        lw      $6,32($fp)
        jal     printf
        move    $sp,$fp
        lw      $31,44($sp)
        lw      $fp,40($sp)
        addiu   $sp,$sp,48
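To read the listing above, it can help to write the same control flow back in C with explicit labels. This is a sketch for illustration, not compiler output: the function name is hypothetical, it returns `sum` instead of calling `printf`, and the comments pair each statement with the instructions it mirrors.

```c
int n = 100;                  /* the .data word labeled n */

int sum_like_the_assembly(void)
{
    int i, m, sum;            /* stack slots 24($fp), 28($fp), 32($fp) */

    m = n;                    /* la $2,n ; lw $2,0($2) ; sw $2,28($fp) */
    sum = 0;                  /* sw $0,32($fp) */
    i = 1;                    /* li $2,1 ; sw $2,24($fp) */
L2:
    if (m < i) goto L3;       /* slt $2,$3,$2 ; bne $2,$0,$L3 */
    sum = sum + i;            /* addu $2,$3,$2 ; sw $2,32($fp) */
    i = i + 1;                /* addiu $2,$2,1 ; sw $2,24($fp) */
    goto L2;                  /* b $L2 */
L3:
    return sum;               /* the original passes sum to printf here */
}
```

Note how every variable access goes through a load or store to the stack frame, which is typical of unoptimized output.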
Example: Add 1 to 100
# Assemble
[csug01] mipsel-linux-gcc -c add1To100.s

# Link
[csug01] mipsel-linux-gcc -o add1To100 add1To100.o ${LINKFLAGS}
# -nostartfiles -nodefaultlibs
# -static -mno-xgot -mno-embedded-pic
# -mno-abicalls -G 0 -DMIPS -Wall

# Load
[csug01] simulate add1To100
Sum 1 to 100 is 5050
MIPS program exits with status 0 (approx. 2007 instructions in 143000 nsec at 14.14034 MHz)
Globals and Locals
Variables Visibility Lifetime Location
Function-Local

Global

Dynamic
#include <stdio.h>
#include <stdlib.h>

int n = 100;
int main (int argc, char* argv[ ]) {
    int i, m = n, sum = 0, *A = malloc(4 * m);
    for (i = 1; i <= m; i++) { sum += i; A[i - 1] = sum; }
    printf ("Sum 1 to %d is %d\n", n, sum);
}
Globals and Locals
Variables                  Visibility                Lifetime              Location
Function-Local: i, m, sum  w/in func                 func invocation       stack
Global: n, str             whole prgm                prgm execution        .data
Dynamic: A                 anywhere that has a ptr   b/w malloc and free   heap

C Pointers can be trouble
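The three rows of the table can be seen side by side in a short sketch. The function name `running_sums` is hypothetical; the point is that the heap array outlives the call that created it, while `sum` and `i` do not, so the caller becomes responsible for calling `free`.

```c
#include <stdlib.h>

int n = 100;    /* global: visible program-wide, lives in .data for the whole run */

/* Returns a heap array holding the running sums 1, 1+2, ..., 1+...+m
 * (m must be positive). The caller owns the array and must free() it. */
int *running_sums(int m)
{
    int sum = 0;                              /* function-local: on the stack */
    int *A = malloc(sizeof *A * (size_t)m);   /* dynamic: on the heap */
    if (A == NULL)
        return NULL;
    for (int i = 1; i <= m; i++) {
        sum += i;
        A[i - 1] = sum;                       /* valid indices are 0 .. m-1 */
    }
    return A;    /* sum and i die here; the heap block stays alive */
}
```

Returning the address of `sum` instead would hand the caller a pointer to a dead stack slot, which is one way C pointers cause trouble.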
Example #2: Review of Program Layout
calc.c
    vector* v = malloc(8);
    v->x = prompt("enter x");
    v->y = prompt("enter y");
    int c = pi + tnorm(v);
    print("result %d", c);

math.c
    int tnorm(vector* v) {
        return abs(v->x)+abs(v->y);
    }

lib3410.o
    global variable: pi
    entry point: prompt
    entry point: print
    entry point: malloc

Memory layout (top to bottom):
    system reserved
    stack
    dynamic data (heap)
    static data
    code (text)
    system reserved
[Figure: toolchain overview. C source files (calc.c, math.c) → Compiler → assembly files (calc.s, math.s, io.s) → Assembler → obj files (calc.o, math.o, io.o, libc.o, libm.o) → linker → executable program calc.exe (exists on disk) → loader → process executing in memory]
Next Goal
How do we understand the machine object code
that an assembler creates?
Big Picture
math.c → math.s → math.o        (.o = Linux, .obj = Windows)

Output is obj files
• Binary machine code, but not executable
• May refer to external symbols, i.e. needs a “symbol table”
• Each object file has illusion of its own address space
  – Addresses will need to be fixed later
    e.g. .text (code) starts at addr 0x00000000
         .data starts at addr 0x00000000
Symbols and References

Global labels: externally visible “exported” symbols
• Can be referenced from other object files
• Exported functions, global variables
  e.g. pi (from a couple of slides ago)

Local labels: internally visible symbols only
• Only used within this object file
• static functions, static variables, loop labels, …
  e.g. static foo, static bar, static baz
  e.g. $str, $L0, $L2
Object file
Header
• Size and position of pieces of file
Text Segment
• instructions
Data Segment
• static data (local/global vars, strings, constants)
Debugging Information
• line number → code address map, etc.
Symbol Table
• External (exported) references
• Unresolved (imported) references
Example
math.c
int pi = 3;
int e = 2;
static int randomval = 7;

extern char *username;


extern int printf(char *str, ...);

int square(int x) { … }
static int is_prime(int x) { … }
int pick_prime() { … }
int pick_random() {
return randomval;
}
Objdump disassembly
csug01 ~$ mipsel-linux-objdump --disassemble math.o
math.o: file format elf32-tradlittlemips
Disassembly of section .text:
00000000 <pick_random>:
0: 27bdfff8 addiu sp,sp,-8
4: afbe0000 sw s8,0(sp)
8: 03a0f021 move s8,sp
c: 3c020000 lui v0,0x0
10: 8c420008 lw v0,8(v0)
14: 03c0e821 move sp,s8
18: 8fbe0000 lw s8,0(sp)
1c: 27bd0008 addiu sp,sp,8
20: 03e00008 jr ra
24: 00000000 nop

00000028 <square>:
28: 27bdfff8 addiu sp,sp,-8
2c: afbe0000 sw s8,0(sp)
30: 03a0f021 move s8,sp
34: afc40008 sw a0,8(s8)
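Each hex word in the disassembly packs the instruction's fields into 32 bits. As a sketch, the decoder below splits an I-type word into its standard MIPS fields (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit signed immediate). The struct and function names are hypothetical.

```c
#include <stdint.h>

struct itype {
    unsigned opcode;   /* bits 31..26 */
    unsigned rs;       /* bits 25..21 */
    unsigned rt;       /* bits 20..16 */
    int32_t  imm;      /* bits 15..0, sign-extended */
};

/* Split a 32-bit MIPS I-type instruction word into its fields. */
struct itype decode_itype(uint32_t word)
{
    struct itype d;
    d.opcode = word >> 26;
    d.rs     = (word >> 21) & 0x1Fu;
    d.rt     = (word >> 16) & 0x1Fu;
    d.imm    = (int32_t)(word & 0xFFFFu);
    if (d.imm & 0x8000)          /* sign-extend the 16-bit immediate */
        d.imm -= 0x10000;
    return d;
}
```

For the first word in the listing, `27bdfff8`, this yields opcode 9 (`addiu`), rs = rt = 29 (`sp`), and immediate -8, which matches `addiu sp,sp,-8`.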
Objdump symbols
csug01 ~$ mipsel-linux-objdump --syms math.o
math.o: file format elf32-tradlittlemips
SYMBOL TABLE:
00000000 l df *ABS* 00000000 math.c
00000000 l d .text 00000000 .text
00000000 l d .data 00000000 .data
00000000 l d .bss 00000000 .bss
00000000 l d .mdebug.abi32 00000000 .mdebug.abi32
00000008 l O .data 00000004 randomval
00000060 l F .text 00000028 is_prime
00000000 l d .rodata 00000000 .rodata
00000000 l d .comment 00000000 .comment
00000000 g O .data 00000004 pi
00000004 g O .data 00000004 e
00000000 g F .text 00000028 pick_random
00000028 g F .text 00000038 square
00000088 g F .text 0000004c pick_prime
00000000 *UND* 00000000 username
00000000 *UND* 00000000 printf
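The named entries of the symbol table above can be modeled as plain data, which makes the three kinds of symbol easy to tell apart. This is a sketch with hypothetical type and function names: a NULL section stands for `*UND*`, and treating the undefined entries as global is an assumption, since the dump shows no binding flag for them.

```c
#include <stddef.h>

enum binding { LOCAL, GLOBAL };

struct symbol {
    unsigned     value;     /* offset within its section */
    enum binding bind;      /* the l / g flag in the dump */
    const char  *section;   /* NULL models *UND* (undefined) */
    unsigned     size;
    const char  *name;
};

/* Named symbols from the objdump --syms output for math.o. */
static const struct symbol MATH_SYMS[] = {
    { 0x08, LOCAL,  ".data", 0x04, "randomval"   },
    { 0x60, LOCAL,  ".text", 0x28, "is_prime"    },
    { 0x00, GLOBAL, ".data", 0x04, "pi"          },
    { 0x04, GLOBAL, ".data", 0x04, "e"           },
    { 0x00, GLOBAL, ".text", 0x28, "pick_random" },
    { 0x28, GLOBAL, ".text", 0x38, "square"      },
    { 0x88, GLOBAL, ".text", 0x4c, "pick_prime"  },
    { 0x00, GLOBAL, NULL,    0x00, "username"    },  /* assumed global */
    { 0x00, GLOBAL, NULL,    0x00, "printf"      },  /* assumed global */
};

/* Symbols this file needs from elsewhere (imports). */
size_t count_undefined(void)
{
    size_t count = 0;
    for (size_t i = 0; i < sizeof MATH_SYMS / sizeof MATH_SYMS[0]; i++)
        if (MATH_SYMS[i].section == NULL)
            count++;
    return count;
}

/* Symbols this file defines and makes visible to others (exports). */
size_t count_exported(void)
{
    size_t count = 0;
    for (size_t i = 0; i < sizeof MATH_SYMS / sizeof MATH_SYMS[0]; i++)
        if (MATH_SYMS[i].bind == GLOBAL && MATH_SYMS[i].section != NULL)
            count++;
    return count;
}
```

The `static` declarations in math.c (`randomval`, `is_prime`) show up as local entries, so they count as neither exports nor imports.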
Separate Compilation
Q: Why separate compile/assemble and linking steps?
A: Can recompile one object, then just relink.
Takeaway
We need a calling convention to coordinate use of
registers and memory. Registers exist in the Register
File. Stack, Code, and Data exist in memory. Both
instruction memory and data memory are accessed
through caches (modified Harvard architecture) and a
shared bus to memory (von Neumann).

We need to compile from a high-level source language to
assembly, then assemble to machine object code. The
objdump command can help us understand the structure
of machine code, which is broken into header, text, and
data segments, debugging information, and a symbol
table.
Linkers
Next Goal
How do we link together separately compiled and
assembled machine object files?
Big Picture
[Figure: toolchain overview. calc.c → calc.s → calc.o; math.c → math.s → math.o; io.s → io.o; libc.o; libm.o. All obj files → linker → calc.exe → executing in memory]
Linkers
Linker combines object files into an executable file
• Relocate each object’s text and data segments
• Resolve as-yet-unresolved symbols
• Record top-level entry point in executable file

End result: a program on disk, ready to execute
• E.g. ./calc           (Linux)
       ./calc.exe       (Windows)
       simulate calc    (class MIPS simulator)
Linker Example
main.o
  .text
    ...
    0C000000
    21035000
    1b80050C
    4C040000
    21047002
    0C000000
    ...
  Symbol tbl
    00 T main
    00 D uname
    *UND* printf
    *UND* pi
  Relocation info
    40, JL, printf
    4C, LW/gp, pi
    54, JL, square

math.o
  .text
    ...
    21032040
    0C000000
    1b301402
    3C040000
    34040000
    ...
  Symbol tbl
    20 T square
    00 D pi
    *UND* printf
    *UND* uname
  Relocation info
    28, JL, printf
    30, LUI, uname
    34, LA, uname

printf.o
  ...
  3C T printf
Linker Example
main.o, math.o, and printf.o are as on the previous Linker Example slide; the linker combines them into calc.exe.

calc.exe
  Text (from math.o)
    ...
    21032040
    0C40023C
    1b301402
    3C041000
    34040004
    ...
  Text (from main.o)
    0C40023C
    21035000
    1b80050c
    4C048004
    21047002
    0C400020
    ...
  Text (from printf.o)
    10201000
    21040330
    22500102
    ...
  Data (words for pi and uname)
    00000003
    0077616B
  Entry: 0040 0100
  text:  0040 0000
  data:  1000 0000
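The "relocate, then resolve" idea behind this example can be sketched in a few lines: give each object file's text a base address, then compute every symbol's final address as base plus offset. The symbol offsets (20, 00, 3C) come from the slide's symbol tables. The three base addresses are assumptions chosen to agree with the slide's `text:0040 0000` and `Entry:0040 0100` lines, and all type and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct object { const char *name; uint32_t text_base; };
struct def    { const char *symbol; size_t object; uint32_t offset; };

/* Step 1 (relocate): where each object's text segment is placed.
 * These bases are assumed for illustration. */
static const struct object OBJECTS[] = {
    { "math.o",   0x00400000u },
    { "main.o",   0x00400100u },
    { "printf.o", 0x00400200u },
};

/* Text symbols and their offsets within the defining object file. */
static const struct def DEFS[] = {
    { "square", 0, 0x20u },
    { "main",   1, 0x00u },
    { "printf", 2, 0x3Cu },
};

/* Step 2 (resolve): final address of a symbol, or 0 if nobody defines it. */
uint32_t resolve(const char *symbol)
{
    for (size_t i = 0; i < sizeof DEFS / sizeof DEFS[0]; i++)
        if (strcmp(DEFS[i].symbol, symbol) == 0)
            return OBJECTS[DEFS[i].object].text_base + DEFS[i].offset;
    return 0;
}
```

With these assumed bases, `resolve("main")` gives 0x00400100, the entry point recorded in the executable. A real linker would also report an error for a symbol that stays undefined rather than return 0.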
Object file
Header
• location of main entry point (if any)
Text Segment
• instructions
Data Segment
• static data (local/global vars, strings, constants)
Relocation Information
• Instructions and data that depend on actual addresses
• Linker patches these bits after relocating segments
Symbol Table
• Exported and imported references
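"Patching these bits" can be shown for one instruction kind. The sketch below fills in the target of a MIPS `jal`, using the standard J-type layout: a 6-bit opcode followed by a 26-bit field holding the target address divided by 4. The function name `patch_jal` is hypothetical, and the hex words in the slides' figure may be simplified, so results here need not match them digit for digit.

```c
#include <stdint.h>

/* Return `word` with its 26-bit jump-target field replaced so that the
 * instruction jumps to `target`. The opcode bits (31..26) are kept.
 * Assumes `target` is word-aligned and lies in the same 256 MB region
 * as the instruction, as J-type jumps require. */
uint32_t patch_jal(uint32_t word, uint32_t target)
{
    uint32_t index = (target >> 2) & 0x03FFFFFFu;
    return (word & 0xFC000000u) | index;
}
```

An unresolved `jal` appears in the object file as 0x0C000000 (opcode 3, target 0). The relocation entry tells the linker which word to patch and which symbol supplies the target.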
Object File Formats
Unix
• a.out
• COFF: Common Object File Format
• ELF: Executable and Linking Format
• …
Windows
• PE: Portable Executable

All support both executable and object files
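File formats are usually told apart by their first few bytes. The sketch below checks two well-known signatures: ELF files begin with the bytes 0x7F 'E' 'L' 'F', and Windows PE executables begin with a DOS stub whose first two bytes are 'M' 'Z'. The enum and function names are hypothetical, and Windows .obj files (which have no DOS stub) are not detected by this check.

```c
#include <stddef.h>

enum obj_format { FMT_UNKNOWN, FMT_ELF, FMT_PE_OR_DOS };

/* Guess a file's format from its leading bytes. */
enum obj_format sniff_format(const unsigned char *buf, size_t len)
{
    if (len >= 4 && buf[0] == 0x7F && buf[1] == 'E' &&
        buf[2] == 'L' && buf[3] == 'F')
        return FMT_ELF;
    if (len >= 2 && buf[0] == 'M' && buf[1] == 'Z')
        return FMT_PE_OR_DOS;
    return FMT_UNKNOWN;
}
```

Reading the first four bytes of `math.o` with `fread` and passing them to `sniff_format` would be expected to report ELF, consistent with the `elf32-tradlittlemips` line in the objdump output.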


Recap
Compiler output is assembly files

Assembler output is obj files

Linker joins object files into one executable

Loader brings it into memory and starts execution


Administrivia
Upcoming agenda
• Schedule PA2 Design Doc Mtg for next Monday, Mar 11th
• HW3 due next Wednesday, March 13th
• PA2 Work-in-Progress circuit due before spring break

• Spring break: Saturday, March 16th to Sunday, March 24th

• Prelim2 Thursday, March 28th, right after spring break


• PA2 due Thursday, April 4th
