CSE451 Linking and Loading Autumn 2002: Gary Kimura Lecture #21 December 9, 2002

CSE451 Linking and Loading

Autumn 2002
Gary Kimura
Lecture #21
December 9, 2002
Today's Topic
How do programs actually get loaded into memory?
The Windows executable image format

From source to execution
A programmer writes a
Source file (helloworld.c file)
A compiler then translates it into an
Object module (helloworld.obj file)
The linker combines various object modules into an
Executable image (helloworld.exe file)
The loader does the final work in getting the image
executing on the system

But what does a .obj or a .exe file really contain?
First, a little theory then the real stuff
Three ways a program can get loaded
Absolute loading: Load program at the same address
(virtual and/or physical) every time
Relocatable loading: Load program at different
addresses based on what is available
Dynamic run-time loading: Load and reload the
program at different addresses while the program is
running

Address Binding
Where a symbolic label/name is translated (bound) to
an actual address
The actual binding can be specified in the program, or
resolved at compile time, link time, load time, or run
time.
COFF and PE Files
Common Object File Format (COFF)
Portable Executable (PE) File Format


We are going to concentrate on the PE File format for
executable images. Roughly the same format is used for
object modules and dynamic link libraries.

The PE file closely resembles what is needed in memory to
run the program.
The PE file itself is divided into various sections
representing code, data, etc.

There are many more formats, such as ELF, etc.


Overall PE File Mapping
Copyright 2001 Microsoft Corporation, One Microsoft Way, Redmond,
Washington 98052-6399 U.S.A. All rights reserved.
Relative Virtual Addresses
Some important things to note:
The address where the program runs is not equal to the
file offset where code is stored in the PE file.
The address where the program runs may not be known
at link time.
So any addresses stored in the code by the linker need
to all be relative.
A Relative Virtual Address (RVA) is an offset in
memory relative to where the PE file is loaded
We'll see an example of this later with base relocation
fixups
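
The RVA arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not loader code: the Section fields, the SECTIONS layout, and the helper names are all hypothetical, chosen only to show that an RVA maps to a virtual address by adding the load base, and to a file offset by going through the section that contains it.

```python
from typing import NamedTuple, Optional, Sequence


class Section(NamedTuple):
    name: str
    virtual_address: int  # RVA where the section starts in memory
    raw_offset: int       # file offset where the section's bytes are stored
    raw_size: int         # number of bytes actually stored in the file


def rva_to_va(image_base: int, rva: int) -> int:
    """Virtual address = where the image was loaded + the RVA."""
    return image_base + rva


def rva_to_file_offset(sections: Sequence[Section], rva: int) -> Optional[int]:
    """Find the section holding the RVA and translate it to a file offset."""
    for section in sections:
        delta = rva - section.virtual_address
        if 0 <= delta < section.raw_size:
            return section.raw_offset + delta
    return None  # the RVA is not backed by bytes in the file


# Hypothetical layout, made up for this example.
SECTIONS = [
    Section(".text", virtual_address=0x1000, raw_offset=0x400, raw_size=0x2000),
    Section(".data", virtual_address=0x3000, raw_offset=0x2400, raw_size=0x200),
]
```

With this layout, RVA 0x1234 lands at virtual address 0x401234 when the image is loaded at 0x400000, but its bytes sit at file offset 0x634, which is why the two kinds of address must not be confused.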
The PE File
The PE File starts with a DOS Header
Signature (MZ)
Offset to the PE Header
Followed by a PE Header
Machine type
Number of sections
Timestamp
Data Directory (table recording where in the image the
export, import, resource, exception, security, base
relocation, debug, etc. data is stored)
Followed by a Section Table
Named list of the sections in the PE file (.text, .data,
.rdata, .idata, .edata, .rsrc, .reloc, etc.)
Followed by the sections
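
As a rough sketch of the layout just listed, the following Python reads the first few header fields from an image held in memory. It assumes the usual on-disk layout (the 4-byte offset to the PE header stored at 0x3C in the DOS header, then a "PE\0\0" signature followed by machine type, number of sections, and timestamp). The function names and the tiny synthetic image are made up for illustration, and the sketch stops well short of the data directory and section table.

```python
import struct


def parse_pe_headers(image: bytes) -> dict:
    """Read the DOS signature, the PE header offset, and three PE header fields."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # The DOS header stores the file offset of the PE header at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Little-endian: 2-byte machine, 2-byte section count, 4-byte timestamp.
    machine, num_sections, timestamp = struct.unpack_from("<HHI", image, pe_offset + 4)
    return {
        "pe_offset": pe_offset,
        "machine": machine,
        "num_sections": num_sections,
        "timestamp": timestamp,
    }


def make_sample_image() -> bytes:
    """Build a tiny fake image (hypothetical values) just to exercise the parser."""
    dos_header = bytearray(0x40)
    dos_header[0:2] = b"MZ"
    struct.pack_into("<I", dos_header, 0x3C, 0x40)  # PE header follows directly
    # 0x014C is the machine type for 32-bit x86; 3 sections; arbitrary timestamp.
    pe_header = b"PE\0\0" + struct.pack("<HHI", 0x014C, 3, 0x3B7DDFD8)
    return bytes(dos_header) + pe_header
```

Calling parse_pe_headers(make_sample_image()) returns the four fields as a dictionary. A real tool would go on to read the rest of the header and the section table.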
The usual suspects
.text
Executable code
.data
Read/write initialized data
.rdata
Read only data

Note that the linker combines text and data from various
object modules to form the executable image.
Compilers can append $ and a suffix to a section name to
dictate the ordering within a section. For example,
.text$X is placed before .text$Y in the .text section
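
The $ ordering rule can be sketched as a sort over the pieces that object modules contribute. This is only an illustration under stated assumptions: merge_sections and its input format are hypothetical, contributions are ordered by the text after the $, and a contribution with no suffix is assumed to sort first.

```python
from collections import defaultdict


def merge_sections(contributions):
    """Combine (section_name, payload) pieces into final sections.

    Pieces named like ".text$X" are grouped under ".text" and ordered by
    the suffix after "$"; ties keep their original input order.
    """
    groups = defaultdict(list)
    for index, (name, payload) in enumerate(contributions):
        base, _, suffix = name.partition("$")
        groups[base].append((suffix, index, payload))

    merged = {}
    for base, parts in groups.items():
        parts.sort(key=lambda part: (part[0], part[1]))
        merged[base] = b"".join(payload for _, _, payload in parts)
    return merged
```

For example, feeding in .text$Y, .data, .text$X, and .text (in that order) yields a single .text section with the unsuffixed piece first, then the X piece, then the Y piece.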
Exporting names and ordinals
To run an image that requires calling a dll the loader needs
to be able to find the entry points into the dll
Conceptually associated with a dll is a list of addresses
(RVAs) that other modules can call
Each exported entry point is assigned a unique ordinal
value
The module that wants to call an entry point then only
needs to know the dll's name and the ordinal value.
However we as programmers really know the name and
not the ordinal value that gets assigned by the linker.
The export table saves us by specifying ordinal values and
translating names to their ordinal value
The .edata section (what I export)
Kernel32 Exports
exports table:
Name: KERNEL32.dll
Characteristics: 00000000
TimeDateStamp: 3B7DDFD8 -> Fri Aug 17 23:24:08 2001
Version: 0.00
Ordinal base: 00000001
# of functions: 000003A0
# of Names: 000003A0

Entry Pt Ordn Name
00012ADA 1 ActivateActCtx
000082C2 2 AddAtomA
remainder of exports omitted
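
The name-to-ordinal-to-address translation can be sketched with the two entries visible in the dump above. The ExportTable class and its method names are hypothetical, and only the two listed exports are modeled. The idea is that names are kept sorted for a binary search, each name maps to an index, and the ordinal is that index plus the ordinal base.

```python
import bisect


class ExportTable:
    """Toy model of an export table: an address array plus a sorted name list."""

    def __init__(self, ordinal_base, function_rvas, names):
        self.ordinal_base = ordinal_base
        self.function_rvas = function_rvas  # index i holds ordinal base + i
        self.names = sorted(names)          # (name, index into function_rvas)
        self._keys = [name for name, _ in self.names]

    def ordinal_of(self, name):
        """Binary-search the sorted names and return the matching ordinal."""
        i = bisect.bisect_left(self._keys, name)
        if i == len(self._keys) or self._keys[i] != name:
            raise KeyError(name)
        return self.ordinal_base + self.names[i][1]

    def rva_of_ordinal(self, ordinal):
        """An ordinal is just an index into the address array, offset by the base."""
        index = ordinal - self.ordinal_base
        if not 0 <= index < len(self.function_rvas):
            raise KeyError(ordinal)
        return self.function_rvas[index]

    def rva_of_name(self, name):
        return self.rva_of_ordinal(self.ordinal_of(name))


# Only the two exports shown in the dump; the remaining entries are not modeled.
KERNEL32 = ExportTable(
    ordinal_base=1,
    function_rvas=[0x00012ADA, 0x000082C2],
    names=[("ActivateActCtx", 0), ("AddAtomA", 1)],
)
```

Here KERNEL32.ordinal_of("AddAtomA") gives 2 and KERNEL32.rva_of_name("AddAtomA") gives 0x82C2, matching the dump.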



Function calls
Consider these three ways to call the function AddAtomA

1. call AddAtomA
2. call DWORD PTR [0x1234]
3. call 0x67890
...
0x67890: jmp DWORD PTR [0x1234]
(where 0x1234 contains the address of AddAtomA)

But compilers usually output

call 0x00000000

and expect the linker to fill in the correct address, whether
or not the function is imported. Because the instruction is
already a direct call, the linker is stuck using method #3.
Importing functions
The PE contains a table of imported modules (identified by
the imported dll name)
Each table entry identifies the module and lists the
functions that need to be imported
There are three ways of naming the imported function
Virtual address (nice if the dll never moves and the
linker knows this address)
Ordinal value (nice if the linker knows the ordinal
value)
Function name (refer back to how exports works)
This information is stored in two tables
Import Address Table (IAT)
Import Name Table (INT)
The IAT and INT
The IAT and INT are simply arrays of dwords (4 bytes)
Each dword is either the function address (see earlier
discussion on function calls), ordinal value, or a pointer to
the function name.
The loader changes the IAT values at load time to function
addresses.

For faster image startup, images can be bound.
Binding an image means resolving the imports ahead of time
and overwriting the IAT in the actual PE file.
However, if the imported dll changes, the binding needs to
be redone. The INT is used for this purpose.
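
A minimal sketch of what the loader does with these tables, assuming a 32-bit image in which the high bit of a dword marks an import by ordinal and any other value points at a function name. The function resolve_imports, its arguments, and the addresses used are hypothetical. A real loader reads these values out of the mapped image rather than from Python dictionaries.

```python
ORDINAL_FLAG = 0x80000000  # assumed: high bit set means "import by ordinal"


def resolve_imports(int_entries, names_by_pointer, dll_base,
                    exports_by_name, exports_by_ordinal):
    """Produce the IAT contents the loader would write for one imported dll.

    int_entries        -- dwords from the INT (ordinal or pointer to a name)
    names_by_pointer   -- stand-in for reading a name string at a pointer
    dll_base           -- address where the imported dll is loaded
    exports_by_name    -- the dll's exports, name -> RVA
    exports_by_ordinal -- the dll's exports, ordinal -> RVA
    """
    iat = []
    for entry in int_entries:
        if entry & ORDINAL_FLAG:
            rva = exports_by_ordinal[entry & 0xFFFF]
        else:
            rva = exports_by_name[names_by_pointer[entry]]
        iat.append(dll_base + rva)  # the IAT slot now holds a real address
    return iat
```

Because the INT keeps the original ordinal or name, the same pass can be rerun to rebuild the IAT when a bound image's addresses turn out to be stale.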
The .idata section (what I import)
Base Relocation
Each module has a preferred load address.
However, the loader may not always be able to honor the
request.
If the module is relocated then the loader must fix up the
addresses.
The .reloc section specifies each location that needs to be
fixed if the module is moved.
Don't do this too often because it is a big performance hit
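
The fixup itself is simple arithmetic: add the difference between the actual and preferred load addresses to every location that .reloc lists. The sketch below assumes 32-bit absolute addresses and a flat list of RVAs. The function name and arguments are hypothetical, and the real .reloc section stores its entries in a more compact per-page form.

```python
import struct


def apply_base_relocations(image: bytearray, preferred_base: int,
                           actual_base: int, fixup_rvas) -> None:
    """Patch each listed 32-bit address in place by the load-address delta.

    `image` is the module as mapped in memory, indexed by RVA.
    """
    delta = actual_base - preferred_base
    if delta == 0:
        return  # loaded at the preferred address: nothing to fix
    for rva in fixup_rvas:
        (value,) = struct.unpack_from("<I", image, rva)
        struct.pack_into("<I", image, rva, (value + delta) & 0xFFFFFFFF)
```

For instance, a stored address of 0x00401000 in a module that preferred 0x00400000 but was loaded at 0x00500000 becomes 0x00501000. Every such location has to be touched, which is where the performance cost comes from.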
Other sections
.rsrc
Resources for the image such as icons, bitmaps, etc.
Organized like a file system
.debug
Debug information
Was COFF up to NT 4.0 and has moved to PDB in
Windows XP
Debugging
Speaking of debugging
Things to come
Wednesday we'll wrap everything up

Final is on Tuesday, December 17th, at 2:30
