Lecture 11

The document discusses the roles of assemblers and linkers in the compilation process, highlighting how assembly language translates to machine code and the importance of address resolution. It explains the two-pass assembler approach and contrasts it with modern one-pass assemblers, as well as the concepts of static and dynamic linking in libraries. Additionally, it touches on the evolution of compilers and their optimization capabilities compared to manual assembly programming.

Assemblers and Linkers

Long, long, time ago, I can still remember


How mnemonics used to make me smile...
Cause I knew with just those opcode names
that I could play some assembly games
and I’d be hacking kernels in just awhile.
But Comp 411 made me shiver,
With every new lecture that was delivered,
There was bad news at the doorstep,
I just didn’t get the problem sets.
I can’t remember if I cried,
When inspecting my stack frame’s insides,
All I know is that it crushed my pride,
On the day the joy of software died.
And I was singing…

When I find my code in tons of trouble,
Friends and colleagues come to me,
Speaking words of wisdom:
"Write in C."

● Problem set #2 due tonight at 11:59:59pm
● 1st midterm next Monday (10/8)
● Midterm study session first 45 mins of Friday’s lab

10/1/2018 Comp 411 - Fall 2018 1


A Route from Program to Bits
∙ Traditional Compilation

C or C++ program: high-level, portable (architecture independent) program description
  ↓ Compiler
Assembly Code: architecture-, ISA-, and machine-language-dependent program description with symbolic memory references
  ↓ Assembler
“Object Code”: machine language with “some” remaining symbolic memory references
  ↓ Linker (also takes “Library Routines”: a collection of precompiled object code modules)
“Executable”: with all memory references resolved
  ↓ Loader
“Memory”: program and data bits loaded into memory



What an Assembler does
Assembly is just a recipe for sequentially filling memory locations.

    .word 0x03fffffc, 0x00000020
    .space 6
    .word 0xE3A00000, 0xE2900001, 0x1AFFFFFD

Address      Contents     in decimal
0x00000000 : 0x03FFFFFC   67108860
0x00000004 : 0x00000020   32
0x00000008 : 0x00000000   0
0x0000000C : 0x00000000   0
0x00000010 : 0x00000000   0
0x00000014 : 0x00000000   0
0x00000018 : 0x00000000   0
0x0000001C : 0x00000000   0
0x00000020 : 0xE3A00000   -476053504
0x00000024 : 0xE2900001   -493879295
0x00000028 : 0x1AFFFFFD   452984829
0x0000002C : 0x00000000   0

You can even assemble and run this program.
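The "recipe" idea can be sketched in a few lines of Python. This is only an illustration: the function names are invented here, and it assumes, as the table above implies, that .space N reserves N zeroed words rather than N bytes.

```python
def fill_memory(directives):
    """Lay out .word/.space directives sequentially, one 32-bit word per slot.

    Assumption: .space N reserves N zeroed words, matching the slide's table.
    """
    memory = []
    for name, args in directives:
        if name == ".word":
            memory.extend(value & 0xFFFFFFFF for value in args)
        elif name == ".space":
            memory.extend([0] * args[0])
        else:
            raise ValueError(f"unsupported directive: {name}")
    return memory


def as_signed(word):
    """Interpret a 32-bit word as a two's-complement integer."""
    return word - (1 << 32) if word & 0x80000000 else word


program = [
    (".word", [0x03FFFFFC, 0x00000020]),
    (".space", [6]),
    (".word", [0xE3A00000, 0xE2900001, 0x1AFFFFFD]),
]
memory = fill_memory(program)

# Print address, contents, and the signed decimal view of each word.
for index, word in enumerate(memory):
    print(f"0x{index * 4:08X} : 0x{word:08X} {as_signed(word)}")
```

The "in decimal" column is just the same 32 bits read as a signed integer, which is why the instruction words starting with 0xE come out negative.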



What an Assembler does
Assembly is just a recipe for sequentially filling memory locations.

      .word 0x03fffffc, 0x00000020
      .space 6
main: mov r0,#0
loop: adds r0,r0,#1
      bne loop
      andeq r0,r0,r0

Address      Contents     in decimal
0x00000000 : 0x03FFFFFC   67108860
0x00000004 : 0x00000020   32
0x00000008 : 0x00000000   0
0x0000000C : 0x00000000   0
0x00000010 : 0x00000000   0
0x00000014 : 0x00000000   0
0x00000018 : 0x00000000   0
0x0000001C : 0x00000000   0
0x00000020 : 0xE3A00000   -476053504
0x00000024 : 0xE2900001   -493879295
0x00000028 : 0x1AFFFFFD   452984829
0x0000002C : 0x00000000   0

And this recipe is equivalent to the first.
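To see why the two recipes are equivalent, the four mnemonics can be packed into words by hand. The sketch below follows the standard ARM data-processing and branch field layouts; the helper names and the tiny lookup tables are made up for this example, and only small immediates (no rotation) are handled.

```python
COND = {"eq": 0x0, "ne": 0x1, "al": 0xE}      # condition field, bits 31..28
OPCODE = {"and": 0x0, "add": 0x4, "mov": 0xD}  # data-processing opcode, bits 24..21


def data_processing(op, rd, rn, operand2, cond="al", set_flags=False, immediate=True):
    """Pack the fields of an ARM data-processing instruction into 32 bits.

    operand2 is either a small immediate (0..255, no rotation) or a register number.
    """
    return (
        (COND[cond] << 28)
        | (int(immediate) << 25)
        | (OPCODE[op] << 21)
        | (int(set_flags) << 20)
        | (rn << 16)
        | (rd << 12)
        | (operand2 & 0xFFF)
    )


def branch(address, target, cond="al"):
    """Encode B<cond>; the offset is in words, relative to address + 8."""
    offset = (target - (address + 8)) >> 2
    return (COND[cond] << 28) | (0b101 << 25) | (offset & 0xFFFFFF)


words = [
    data_processing("mov", rd=0, rn=0, operand2=0),                  # mov r0,#0
    data_processing("add", rd=0, rn=0, operand2=1, set_flags=True),  # adds r0,r0,#1
    branch(address=0x28, target=0x24, cond="ne"),                    # bne loop
    data_processing("and", rd=0, rn=0, operand2=0,
                    cond="eq", immediate=False),                     # andeq r0,r0,r0
]
print([f"0x{w:08X}" for w in words])
```

The last instruction, andeq r0,r0,r0, encodes as all zeros, which is why the zero word at 0x0000002C can be read either as data or as a harmless instruction.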



How an Assembler Works
Three major components of assembly
1) Allocating and initializing data storage
2) Conversion of mnemonics to binary instructions
3) Resolving addresses

       .word 0x03fffffc, main   ; main is a forward reference too
array: .space 11
total: .word 0
main:  mov r1,#array            ; need to figure out this immediate value
       mov r2,#0
       mov r3,#1
       ldr r0,total             ; this one is a PC-relative offset
       b test                   ; this is a forward reference
loop:  add r0,r0,r3
       str r3,[r1,r2,lsl #2]
       add r3,r3,r3
       add r2,r2,#1
test:  cmp r2,#11
       blt loop
       str r0,total             ; this offset is completely different than the one a few instructions ago
*halt: b halt
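The point about the two different offsets to total can be checked with a little arithmetic. The sketch below assumes the program is assembled starting at address 0 and that .space counts words (as on the earlier slides), which puts total at 0x34, loop at 0x4C, and test at 0x5C. Those addresses are worked out by hand here, not taken from the slides, and the helper names are illustrative.

```python
WORD = 4
PIPELINE_OFFSET = 8  # on ARM, reading the PC yields the instruction's address + 8


def pc_relative_bytes(instr_address, target_address):
    """Byte offset used by ldr/str when the operand is a label."""
    return target_address - (instr_address + PIPELINE_OFFSET)


def branch_words(instr_address, target_address):
    """Word offset stored in the 24-bit field of a branch."""
    return pc_relative_bytes(instr_address, target_address) // WORD


# Hand-derived addresses (assumption: origin 0, .space counts words).
total, loop, test = 0x34, 0x4C, 0x5C

ldr_offset = pc_relative_bytes(0x44, total)  # ldr r0,total
str_offset = pc_relative_bytes(0x64, total)  # str r0,total
b_test = branch_words(0x48, test)            # b test   (forward)
blt_loop = branch_words(0x60, loop)          # blt loop (backward)
print(ldr_offset, str_offset, b_test, blt_loop)
```

Both ldr and str name the same label, but because the offset is measured from each instruction's own PC, the encoded values differ.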

Resolving Addresses: 1st Pass
“Old-style” 2-pass assembler approach
● In the first pass, data and instructions are encoded and assigned offsets, while a symbol table is constructed.
● Unresolved address references are set to 0



Resolving Addresses in 2nd Pass
“Old-style” 2-pass assembler approach
● In the second pass, the address references that were left at 0 are filled in using the completed symbol table.
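A minimal sketch of the two-pass idea follows. The source format (label, mnemonic, referenced label) is a hypothetical simplification, and the address field simply holds the target's absolute address rather than a real encoded offset.

```python
def first_pass(lines):
    """Assign an address to every line and record each label in a symbol table.

    Address fields are left at 0 because forward labels are not known yet.
    """
    symbols, records = {}, []
    for index, (label, op, target) in enumerate(lines):
        address = index * 4
        if label:
            symbols[label] = address
        records.append({"address": address, "op": op, "target": target, "field": 0})
    return symbols, records


def second_pass(symbols, records):
    """Fill each address field, now that every label's address is known."""
    for record in records:
        if record["target"] is not None:
            record["field"] = symbols[record["target"]]
    return records


# (label, mnemonic, referenced label): a hypothetical, simplified source format.
source = [
    ("main", "b", "test"),   # forward reference
    ("loop", "add", None),
    ("test", "cmp", None),
    (None, "blt", "loop"),   # backward reference
]
symbols, records = first_pass(source)
after_first = [r["field"] for r in records]
resolved = [r["field"] for r in second_pass(symbols, records)]
print(symbols, after_first, resolved)
```

The cost of this approach is that every instruction is visited twice, even those whose references could have been resolved immediately.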



Modern 1-pass Assembler
Modern assemblers keep more information in their symbol table, which allows them to resolve addresses in a single pass.
● Known addresses (backward references) are immediately resolved.
● Unknown or unresolved addresses (forward references) are “back-filled” once they are resolved.

[Figure: state of the symbol table after the instruction str r3,[r1,r2,lsl #2] is assembled]
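One way to picture back-filling is a symbol table that also remembers which instructions are still waiting on each undefined label. The sketch below is an assumption about how such a table might be organized, not the layout of any particular assembler; the (label, referenced label) input format is hypothetical.

```python
def assemble_one_pass(lines):
    """Resolve label references in a single pass, back-filling forward ones."""
    symbols = {}   # label -> address, for labels already seen
    pending = {}   # label -> indices of instructions waiting on it
    fields = []    # one address field per instruction (None if unused)

    for index, (label, target) in enumerate(lines):
        if label is not None:
            symbols[label] = index * 4
            # Back-fill every earlier instruction that referenced this label.
            for waiting in pending.pop(label, []):
                fields[waiting] = symbols[label]
        if target is None:
            fields.append(None)
        elif target in symbols:
            fields.append(symbols[target])   # backward reference: resolve now
        else:
            fields.append(0)                 # forward reference: placeholder
            pending.setdefault(target, []).append(index)

    if pending:
        raise KeyError(f"undefined labels: {sorted(pending)}")
    return symbols, fields


# (label, referenced label) pairs: a hypothetical, simplified source format.
source = [("main", "test"), ("loop", None), ("test", None), (None, "loop")]
symbols, fields = assemble_one_pass(source)
print(symbols, fields)
```

Anything still in the pending list at the end is a label that was referenced but never defined in this file, which is exactly the information a linker needs later.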



Role of a Linker
Some aspects of address resolution cannot be handled by the assembler alone.

1. References to data or routines in other object modules
2. The layout of all segments in memory
3. Support for REUSABLE code modules
4. Support for RELOCATABLE code modules

This final step of resolution is the job of a LINKER.

To handle this an object file includes a symbol table with:
1) Unresolved references
2) Addresses of labels declared to be “global” (i.e. accessible to other object modules).

[Diagram: each source file → Assembler → object file; the object files, together with Libraries → Linker → executable file]
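The linker's job can be sketched as: place the modules one after another, merge their global symbols, then patch every unresolved reference. The object-file layout below (the "code", "globals", and "refs" keys) is invented for this illustration and is far simpler than a real object format.

```python
def link(modules):
    """Concatenate modules, then patch every unresolved reference.

    Each module is a dict with hypothetical keys:
      "code":    list of words
      "globals": label -> word offset inside the module
      "refs":    word offset inside the module -> referenced global label
    """
    image, global_addresses, fixups = [], {}, []
    for module in modules:
        base = len(image)  # word index where this module lands in the image
        for label, offset in module["globals"].items():
            if label in global_addresses:
                raise ValueError(f"duplicate symbol: {label}")
            global_addresses[label] = (base + offset) * 4
        for offset, label in module["refs"].items():
            fixups.append((base + offset, label))
        image.extend(module["code"])

    # All modules are placed, so every global now has a final address.
    for position, label in fixups:
        if label not in global_addresses:
            raise KeyError(f"undefined symbol: {label}")
        image[position] = global_addresses[label]
    return image, global_addresses


main_obj = {"code": [0x11, 0, 0x22], "globals": {"main": 0}, "refs": {1: "square"}}
lib_obj = {"code": [0x33, 0x44], "globals": {"square": 0}, "refs": {}}
image, table = link([main_obj, lib_obj])
print(image, table)
```

Because addresses are computed from each module's base, the same object file can land anywhere in the image, which is the sense in which the modules are relocatable.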



Static and Dynamic Libraries
● LIBRARIES are commonly used routines stored as a concatenation of “Object files”. A global symbol table is maintained for the entire library with entry points for each routine.

● When a routine in a LIBRARY is referenced by an assembly module, the routine’s address is resolved by the LINKER, and the appropriate code is added to the executable. This sort of linking is called STATIC linking.

● Many programs use common libraries. It is wasteful of both memory and disk space to include the same code in multiple executables. The modern alternative to STATIC linking is to allow the LOADER and THE PROGRAM ITSELF to resolve the addresses of library routines. This form of linking is called DYNAMIC linking (e.g. .dll).



Dynamically Linked Libraries
● C call to library function:
    printf("sqr[%d] = %d\n", x, y);
● Assembly code
    mov R0,#1
    mov R1,ctrlstring
    ldr R2,x
    ldr R3,y
    mov IP,#__stdio__
    mov LR,PC
    ldr PC,[IP,#16]

How does dynamic linking work?
Why are we loading the PC from a memory location rather than branching?

Two things:
1) This is the first time we’ve seen the IP (r12) register used
2) At the mov instruction the PC is pointing to the instruction after the ldr



Dynamically Linked Libraries
• Lazy address resolution:

sysload: stmfd sp!,{r0-r10,lr}
         .
         .
         ; check if stdio module
         ; is loaded, if not load it
         .
         .
         ; backpatch jump table
         mov r1,__stdio__
         mov r0,dfopen
         str r0,[r1]
         mov r0,dfclose
         str r0,[r1,#4]
         mov r0,dfgetc
         str r0,[r1,#8]
         mov r0,dfputc
         str r0,[r1,#12]
         mov r0,dfprintf
         str r0,[r1,#16]

Before any call is made to a procedure in “stdio.dll”:

         .globl __stdio__
__stdio__:
fopen:   .word sysload
fclose:  .word sysload
fgetc:   .word sysload
fputc:   .word sysload
fprintf: .word sysload

After the first call is made to any procedure in “stdio.dll”:

         .globl __stdio__
__stdio__:
fopen:   .word dfopen
fclose:  .word dfclose
fgetc:   .word dfgetc
fputc:   .word dfputc
fprintf: .word dfprintf

Why load the PC from memory? Because the entry points to dynamic library routines are stored in a TABLE. And the contents of this table are loaded on an “as needed” basis!
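The same lazy scheme can be mimicked in Python, with a dictionary standing in for the jump table and ordinary functions standing in for the library routines. Everything here (the table name, the d_ prefixed routines, the load counter) is a stand-in chosen for illustration, and only two of the five entries are modeled.

```python
state = {"loads": 0}  # counts how many times the "library" gets loaded


def d_fopen(name):
    """Stand-in for the real fopen inside the dynamic library."""
    return f"opened {name}"


def d_fprintf(text):
    """Stand-in for the real fprintf inside the dynamic library."""
    return f"printed {text}"


def make_stub(slot):
    """Build the entry that a table slot holds before the library is loaded."""
    def sysload(*args):
        state["loads"] += 1                  # stands in for loading the module
        stdio_table["fopen"] = d_fopen       # backpatch the jump table
        stdio_table["fprintf"] = d_fprintf
        return stdio_table[slot](*args)      # then complete the original call
    return sysload


# Before any call, every slot points at the loader stub.
stdio_table = {name: make_stub(name) for name in ("fopen", "fprintf")}


def call(slot, *args):
    """Indirect call through the table, like ldr PC,[IP,#offset]."""
    return stdio_table[slot](*args)


first = call("fprintf", "hello")      # goes through sysload, patches the table
second = call("fopen", "data.txt")    # goes straight to d_fopen
print(first, second, state["loads"])
```

The caller never changes: it always jumps through the table, and only the table's contents are rewritten after the first call.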

Modern Languages
Intermediate “object code language”

Java program: high-level, portable (architecture independent) program description
  ↓ Compiler
JVM bytecodes: PORTABLE mnemonic program description with symbolic memory references
  ↓ Interpreter (also takes “Library Routines”)

Interpreter: an application that EMULATES a virtual machine. Can be written for any Instruction Set Architecture. In the end, machine language instructions must be executed for each JVM bytecode.



Modern Languages
Intermediate “object code language”

Java program: high-level, portable (architecture independent) program description
  ↓ Compiler
JVM bytecodes: PORTABLE mnemonic program description with symbolic memory references
  ↓ JIT Compiler (also takes “Library Routines”)
Machine code

JIT Compiler: while interpreting on the first pass, the JIT keeps a copy of the machine language instructions used. Future references access machine language code, avoiding further interpretation.

Today’s JITs are nearly as fast as native compiled code.
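The caching idea behind a JIT can be illustrated with a toy stack machine. In this sketch a generated Python function stands in for machine code, the three bytecodes (push, add, mul) are invented for the example, and translation happens on the first call to a method. It is a loose analogy under those assumptions, not how a JVM is actually built.

```python
cache = {}                    # method name -> translated callable
stats = {"translations": 0}   # how many times translation actually ran


def translate(bytecode):
    """Turn bytecode into one Python function (standing in for machine code)."""
    lines = ["def run(stack):"]
    for op, arg in bytecode:
        if op == "push":
            lines.append(f"    stack.append({arg!r})")
        elif op == "add":
            lines.append("    stack.append(stack.pop() + stack.pop())")
        elif op == "mul":
            lines.append("    stack.append(stack.pop() * stack.pop())")
        else:
            raise ValueError(f"unknown bytecode: {op}")
    lines.append("    return stack")
    namespace = {}
    exec("\n".join(lines), namespace)  # build the function from generated source
    return namespace["run"]


def run_method(name, bytecode):
    """Translate on first use, then reuse the cached translation."""
    if name not in cache:
        cache[name] = translate(bytecode)
        stats["translations"] += 1
    return cache[name]([])


# (2 + 3) * 4 expressed in the toy bytecode.
program = [("push", 2), ("push", 3), ("add", None), ("push", 4), ("mul", None)]
first = run_method("calc", program)
second = run_method("calc", program)
print(first, second, stats["translations"])
```

The second call skips decoding entirely, which is the saving the slide describes: the per-bytecode dispatch cost is paid once instead of on every execution.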

Assembly? Really?
● In the early days compilers were dumb
○ Literal line-by-line generation of assembly code from “C” source
○ This was efficient in terms of S/W development time
■ C is portable, ISA independent, write once, run anywhere
■ C is easier to read and understand
■ Details of stack allocation and memory management are hidden
○ However, a savvy programmer could nearly always generate code that would execute faster
● Enter the modern era of Compilers
○ Focused on optimized code-generation
○ Captured the common tricks that low-level programmers used
○ Meticulous bookkeeping (i.e. will I ever use this variable again?)
○ It is hard for even the best hacker to improve on code generated by good optimizing compilers

Next Time
● Play with the ARM compiler
● Compiler code optimization
● We look deeper into the Rabbit hole

