MIPS Assembly Language Programmer's Guide
ASM-01-DOC
Preface: About This Book
Audience
This book assumes that you are an experienced assembly language
programmer. The assembler produces object modules from the assembly
instructions that the C, Fortran 77, and Pascal compilers generate. It therefore
lacks many functions normally present in assemblers. You should use the
assembler only when you need to:
• Maximize the efficiency of a routine, which might not be possible in
C, Fortran 77, Pascal, or another high-level language; for example, to
write low-level I/O drivers.
• Access machine functions unavailable in high-level languages or
satisfy special constraints such as restricted register usage.
• Change the operating system.
• Change the compiler system.
Further system information can be obtained from the manuals listed at the end
of this section.
Topics Covered
This book has these chapters:
• Chapter 1: Registers describes the format for the general registers,
the special registers, and the floating point registers.
• Chapter 2: Addressing describes how addressing works.
• Chapter 3: Exceptions describes exceptions you might encounter
with assembly programs.
• Chapter 4: Lexical Conventions describes the lexical conventions
that the assembler follows.
• Chapter 5: Instruction Set describes the main processor’s
instruction set, including notation, load and store instructions,
computational instructions, and jump and branch instructions.
• Chapter 6: Coprocessor Instruction Set describes the coprocessor
instruction sets.
• Chapter 7: Linkage Conventions describes linkage conventions for
all supported high-level languages. It also discusses memory
allocation and register use.
• Chapter 8: Pseudo-Op-Codes describes the assembler’s pseudo-
operations (directives).
• Chapter 9: MIPS Object File Format provides an overview of the
components comprising the object file and describes the headers and
sections of the object file.
• Chapter 10: Symbol Table describes the purpose of the Symbol
Table and the format of entries in the table. This chapter also lists the
symbol table routines that are supplied.
• Chapter 11: Execution and Linking Format describes Execution
and Linking Format (ELF) for object files. This chapter also
describes the components of an ELF object file, symbol table format,
global data area, register information, and relocation.
• Chapter 12: Program Loading and Dynamic Linking describes the
object file structures that relate to program execution. This chapter
also describes how the process image is created from executable files
and object files.
• Appendix A: Instruction Summary summarizes all assembler
instructions.
• Appendix B: Basic Machine Definition describes instructions that
generate more than one machine instruction.
• Index. Contains index entries for this publication.
❋ Note
A note presents information of greater-than-normal importance.
☞ Detail
Printed in a sans serif font, a detail presents additional information
that is of ancillary importance.
1 Registers
    Register Format
    Special Registers
2 Addressing
    Address Formats
    Address Descriptions
3 Exceptions
    Main Processor Exceptions
    Floating-Point Exceptions
4 Lexical Conventions
    Tokens
    Comments
    Identifiers
    Constants
        Scalar Constants
        Floating Point Constants
        String Constants
5 Instruction Set
    Instruction Classes
    Reorganization Constraints and Rules
    Instruction Notation
    Load and Store Instructions
        Load and Store Formats
        Load Instruction Descriptions
        Store Instruction Descriptions
    Computational Instructions
        Computational Formats
        Computational Instruction Descriptions
    Jump and Branch Instructions
        Jump and Branch Formats
        Jump and Branch Instruction Descriptions
    Special Instructions
        Special Formats
        Special Instruction Descriptions
    Coprocessor Interface Instructions
        Coprocessor Interface Formats
        Coprocessor Interface Instruction Descriptions
6 Coprocessor Instruction Set
    Instruction Notation
    Floating-Point Instructions
        Floating-Point Formats
        Floating-Point Load and Store Formats
7 Linkage Conventions
    Introduction
    Program Design
        Register Use and Linkage
        The Stack Frame
        The Shape of Data
    Examples
    Learning by Doing
        Calling a High-Level Language Routine
    Calling an Assembly Language Routine
    Memory Allocation
8 Pseudo Op-Codes
9 MIPS Object File Format
    Overview
    The File Header
        File Header Magic Field (f_magic)
        Flags (f_flags)
    Optional Header
        Optional Header Magic Field (magic)
    Section Headers
10 Symbol Table
    Overview
    Format of Symbol Table Entries
        Symbolic Header
        Line Numbers
        Procedure Descriptor Table
        Local Symbols
        Optimization Symbols
        Auxiliary Symbols
        File Descriptor Table
        External Symbols
11 Execution and Linking Format
    Object File Format
    ELF Header
    Sections
        Section Header Table
12 Program Loading and Dynamic Linking
    Program Header
        Base Address
        Segment Permissions
        Segment Contents
    Program Loading
    Dynamic Linking
        Program Interpreter
        Dynamic Linker
        Dynamic Section
    Shared Object Dependencies
    Global Offset Table (GOT)
        Calling Position Independent Functions
        Symbols
        Relocations
    Hash Table
    Initialization and Termination Functions
    Quickstart
        Shared Object List
        Conflict Section
        Ordering
A Instruction Summary
Index
Chapter 1: Registers
This chapter describes the organization of data in memory, and the naming
and usage conventions that the assembler applies to the CPU and FPU
registers. See Chapter 7 for information regarding register use and linkage.
Register Format
The CPU’s byte ordering scheme (or endian issues) affects memory
organization and defines the relationship between address and byte position
of data in memory.
The byte ordering is configurable (configuration occurs during hardware
reset) into either big-endian or little-endian byte ordering. When configured
as a big-endian system, byte 0 is always the most-significant (leftmost) byte.
When configured as a little-endian system, byte 0 is always the
least-significant (rightmost) byte.
Figure 1-1 and Figure 1-2 illustrate the ordering of bytes within words and the
ordering of halfwords for big and little endian systems.
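[Figure 1-1 and Figure 1-2: byte ordering within words and halfwords for
big-endian and little-endian systems. Recoverable little-endian word layout:
bits 31..24 hold byte 3, bits 23..16 hold byte 2, bits 15..8 hold byte 1, and
bits 7..0 hold byte 0. Recoverable halfword orderings: byte 0, byte 1 and
byte 1, byte 0.]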
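As a rough illustration of the two byte orderings, the following C sketch
assembles a 32-bit word from four bytes in memory under either ordering and
reports which ordering the host machine uses. The function names are
hypothetical; they are not part of the assembler or of any MIPS toolchain.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns byte 0 (the byte at the lowest address)
 * of a 32-bit word as it is stored in the host machine's memory. */
unsigned byte0_of_word(uint32_t word)
{
    unsigned char bytes[sizeof word];
    memcpy(bytes, &word, sizeof word);      /* copy the in-memory image */
    return bytes[0];
}

/* Returns 1 on a big-endian host, 0 on a little-endian host. */
int host_is_big_endian(void)
{
    /* 0x11 is the most-significant byte of the test word. */
    return byte0_of_word(0x11223344u) == 0x11;
}

/* Assembles a word from four consecutive bytes, b[0] being byte 0. */
uint32_t word_from_bytes(const unsigned char b[4], int big_endian)
{
    if (big_endian)     /* byte 0 is the most-significant byte */
        return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
               ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
    /* little-endian: byte 0 is the least-significant byte */
    return ((uint32_t)b[3] << 24) | ((uint32_t)b[2] << 16) |
           ((uint32_t)b[1] << 8)  |  (uint32_t)b[0];
}
```

Given the bytes 0x11, 0x22, 0x33, 0x44 at ascending addresses, the sketch
yields 0x11223344 for a big-endian system and 0x44332211 for a little-endian
system.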
General Registers
The CPU has thirty-two 32-bit registers. In the mips3 architecture, each of
the thirty-two integer registers is 64 bits wide.
Table 1-1 summarizes the assembler’s usage conventions and restrictions
for these registers. The assembler reserves all register names; you must use
lowercase for the names. All register names start with a dollar sign ($).
The general registers have the names $0..$31. By including the file regdef.h
(use #include <regdef.h>) in your program, you can use software names for
some general registers. The operating system and the assembler use the
general registers $1, $26, $27, $28, and $29 for specific purposes.
NOTE: Attempts to use these general registers in other ways can produce
unexpected results. If a program uses the names $1, $26, $27, $28, $29 rather
than the names $at, $kt0, $kt1, $gp, $sp respectively, the assembler issues
warning messages.
NOTE: General register $0 always contains the value 0. All other general
registers are equivalent, except that general register $31 also serves as the
implicit link register for jump and link instructions. See Chapter 7 for a
description of register assignments.
Special Registers
The CPU defines three 32-bit special registers: PC (program counter), HI and
LO, as shown in Table 1-2. The HI and LO special registers hold the results
of the multiplication (mult and multu) and division (div and divu)
instructions.
You usually do not need to refer explicitly to these special registers;
instructions that use the special registers refer to them automatically.
Table 1-2: Special Registers
Name    Description
PC      Program Counter
HI      Multiply/Divide special register holds the most-significant 32 bits
        of multiply, remainder of divide
LO      Multiply/Divide special register holds the least-significant 32 bits
        of multiply, quotient of divide
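The way results are split between HI and LO can be sketched in C. This is a
minimal model, assuming 32-bit operands; the struct and function names are
hypothetical, and the division model leaves out the zero-divisor case, which
C does not define.

```c
#include <stdint.h>

/* Hypothetical model of the HI and LO special registers. */
struct hilo {
    uint32_t hi;
    uint32_t lo;
};

/* Models mult: the signed 64-bit product is split across HI and LO. */
struct hilo model_mult(int32_t rs, int32_t rt)
{
    uint64_t product = (uint64_t)((int64_t)rs * (int64_t)rt);
    struct hilo r = { (uint32_t)(product >> 32), (uint32_t)product };
    return r;
}

/* Models multu: the unsigned 64-bit product is split across HI and LO. */
struct hilo model_multu(uint32_t rs, uint32_t rt)
{
    uint64_t product = (uint64_t)rs * (uint64_t)rt;
    struct hilo r = { (uint32_t)(product >> 32), (uint32_t)product };
    return r;
}

/* Models div: LO receives the quotient and HI the remainder.
 * Assumption: the caller passes a nonzero divisor and avoids
 * INT32_MIN / -1, because C leaves both cases undefined. */
struct hilo model_div(int32_t rs, int32_t rt)
{
    struct hilo r = { (uint32_t)(rs % rt), (uint32_t)(rs / rt) };
    return r;
}
```

For example, multiplying -2 by 3 leaves 0xFFFFFFFF in HI and 0xFFFFFFFA in
LO, and dividing 7 by 2 leaves the remainder 1 in HI and the quotient 3 in LO.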
Floating-Point Registers
The FPU has sixteen floating-point registers. Each register can hold either a
single-precision (32-bit) or double-precision (64-bit) value. In the case of a
double-precision value, $f0 holds the least-significant half, and $f1 holds the
most-significant half. All references to these registers use an even register
number (e.g., $f4). Table 1-3 summarizes the assembler’s usage conventions
and restrictions for these registers.
Table 1-3: Floating-Point Registers
Register Name   Use and Linkage
$f0..$f2        Used to hold floating-point type function results ($f0)
                and complex type function results ($f0 has the real part,
                $f2 has the imaginary part).
$f4..$f10       Temporary registers, used for expression evaluation, whose
                values are not preserved across procedure calls.
$f12..$f14      Used to pass the first two single or double precision
                actual arguments, whose values are not preserved across
                procedure calls.
$f16..$f18      Temporary registers, used for expression evaluation, whose
                values are not preserved across procedure calls.
$f20..$f30      Saved registers, whose values must be preserved across
                procedure calls.
Chapter 2: Addressing
This chapter describes the formats that you can use to specify addresses. The
machine uses a byte addressing scheme. Access to halfwords requires
alignment on even byte boundaries, and access to words requires alignment
on byte boundaries that are divisible by four. Any attempt to address a data
item that does not have the proper alignment causes an alignment exception.
The unaligned assembler load and store instructions may generate multiple
machine language instructions. They do not raise alignment exceptions.
These instructions load and store unaligned data:
• Load word left (lwl)
• Load word right (lwr)
• Store word left (swl)
• Store word right (swr)
• Unaligned load word (ulw)
• Unaligned load halfword (ulh)
• Unaligned load halfword unsigned (ulhu)
• Unaligned store word (usw)
• Unaligned store halfword (ush)
These instructions load and store aligned data:
• Load word (lw)
• Load halfword (lh)
• Load halfword unsigned (lhu)
• Load byte (lb)
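The alignment rules above can be sketched in C as follows. The sketch checks
whether an address satisfies the alignment for a given access size, and it
builds a word from memory one byte at a time so that no alignment is
required. Both function names are hypothetical, and the byte-by-byte load is
only a stand-in for the effect of an unaligned load word; it does not show
the machine instructions that the assembler actually generates.

```c
#include <stdint.h>

/* Returns 1 when addr meets the alignment required for an access of
 * `size` bytes (2 for a halfword, 4 for a word), 0 otherwise. */
int is_aligned(uintptr_t addr, unsigned size)
{
    return (addr % size) == 0;
}

/* Hypothetical stand-in for an unaligned load word: reads four bytes
 * starting at p, which may have any alignment, and combines them
 * according to the selected byte ordering. */
uint32_t load_word_unaligned(const unsigned char *p, int big_endian)
{
    uint32_t word = 0;
    int i;

    for (i = 0; i < 4; i++) {
        /* Big-endian: byte 0 is most significant; little-endian: least. */
        unsigned shift = big_endian ? 8u * (unsigned)(3 - i)
                                    : 8u * (unsigned)i;
        word |= (uint32_t)p[i] << shift;
    }
    return word;
}
```

An aligned load such as lw would require is_aligned(addr, 4) to hold; the
unaligned sketch accepts any address.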
Address Formats
The assembler accepts the formats shown in Table 2-1 for addresses.
Table 2-1: Address Formats
Format Address
Address Descriptions
The assembler accepts any combination of the constants and operations described
in this chapter for expressions in address descriptions.
expression (base-register)
    Specifies a based address. To get the address, the machine adds the
    value of the expression to the contents of the base-register.
relocatable-symbol
    Specifies a relocatable address. The assembler generates the necessary
    instruction(s) to address the item and generates relocatable
    information for the link editor.
relocatable-symbol ± expression
    Specifies a relocatable address. To get the address, the assembler adds
    or subtracts the value of the expression, which has an absolute value,
    from the relocatable symbol. The assembler generates the necessary
    instruction(s) to address the item and generates relocatable
    information for the link editor. If the symbol name does not appear as
    a label anywhere in the assembly, the assembler assumes that the symbol
    is external.
relocatable-symbol (index-register)
    Specifies an indexed relocatable address. To get the address, the
    machine adds the index-register to the relocatable symbol’s address.
    The assembler generates the necessary instruction(s) to address the
    item and generates relocatable information for the link editor. If the
    symbol name does not appear as a label anywhere in the assembly, the
    assembler assumes that the symbol is external.
relocatable ± expression (index-register)
    Specifies an indexed relocatable address. To get the address, the
    assembler adds or subtracts the relocatable symbol, the expression, and
    the contents of the index-register. The assembler generates the
    necessary instruction(s) to address the item and generates relocation
    information for the link editor. If the symbol does not appear as a
    label anywhere in the assembly, the assembler assumes that the symbol
    is external.
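The address arithmetic can be sketched in C. The first function models a
based address as described above. The second shows one common way to reach a
full 32-bit address with a signed 16-bit offset: split the address into an
upper half and a signed lower half. That split is an assumption used for
illustration; this chapter does not state which instructions the assembler
generates. All names are hypothetical.

```c
#include <stdint.h>

/* Based address: the value of the expression is added to the contents
 * of the base register (wrapping at 32 bits). */
uint32_t based_address(uint32_t base_register, int32_t expression)
{
    return base_register + (uint32_t)expression;
}

/* Hypothetical pair: an upper 16-bit half plus a signed lower half
 * in the range -32768..32767. */
struct addr_halves {
    uint32_t upper;
    int32_t  lower;
};

/* Splits a 32-bit address so that (upper << 16) + lower == address. */
struct addr_halves split_address(uint32_t address)
{
    struct addr_halves h;

    h.lower = (int32_t)(address & 0xFFFFu);
    if (h.lower >= 0x8000)
        h.lower -= 0x10000;             /* treat the low half as signed */
    h.upper = (address - (uint32_t)h.lower) >> 16;
    return h;
}

/* Rebuilds the address from its two halves. */
uint32_t join_address(struct addr_halves h)
{
    return based_address(h.upper << 16, h.lower);
}
```

Because the lower half is signed, an address such as 0x1000ABCD splits into
the upper half 0x1001 and the lower half -21555 rather than 0x1000 and
0xABCD.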
Chapter 3: Exceptions
This chapter describes the exceptions that you can encounter while running
assembly programs. The machine detects some exceptions directly, and the
assembler inserts specific tests that signal other exceptions. This chapter lists
only those exceptions that occur frequently.
Main Processor Exceptions
The following exceptions are the most common to the main processor:
• Address error exceptions, which occur when the machine references a
data item that is not on its proper memory alignment or when an
address is invalid for the executing process.
• Overflow exceptions, which occur when arithmetic operations
compute signed values and the destination lacks the precision to store
the result.
• Bus exceptions, which occur when an address is invalid for the
executing process.
• Divide-by-zero exceptions, which occur when a divisor is zero.
Floating-Point Exceptions
The following are the most common floating-point exceptions:
• Invalid operation exceptions, which include:
– Magnitude subtraction of infinities, for example: (+∞) + (-∞).
– Multiplication of 0 by ∞ with any signs.
– Division of 0/0 or ∞/∞ with any signs.
– Conversion of a binary floating-point number to an integer
format when an overflow or the operand value for the infinity or
NaN precludes a faithful representation in the format (see
Chapter 4).
– Comparison of predicates that have unordered operands, and that
involve Greater Than or Less Than without Unordered.
– Any operation on a signaling NaN.
• Divide-by-zero exceptions.
• Overflow exceptions, which occur when a rounded floating-point
result exceeds the destination format’s largest finite number.
• Underflow exceptions, which occur when a result has lost accuracy
and also when a nonzero result is between ±2^Emin (plus or minus 2 to
the minimum expressible exponent).
• Inexact exceptions.
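The invalid operations listed above can be observed from C through their
default result. This sketch assumes an IEEE 754 host on which floating-point
traps are disabled, so that an invalid operation quietly produces a NaN; it
inspects only that result and does not read any hardware exception flag. The
function names are hypothetical.

```c
#include <math.h>

/* Returns 1 when x is a NaN; a NaN is the only value that compares
 * unequal to itself. */
int is_nan_value(double x)
{
    return x != x;
}

/* Returns 1 when two non-NaN operands produced a NaN result, which is
 * the default (untrapped) outcome of an invalid operation. */
int was_invalid(double a, double b, double result)
{
    return !is_nan_value(a) && !is_nan_value(b) && is_nan_value(result);
}
```

With inf set to INFINITY, the expression was_invalid(inf, inf, inf - inf)
reports an invalid operation, whereas 1.0 / 0.0 does not: that case is a
divide-by-zero and produces an infinity rather than a NaN.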
Chapter 4: Lexical Conventions
This chapter discusses lexical conventions for these topics:
• Tokens
• Comments
• Identifiers
• Constants
• Multiple lines per physical line
• Sections and location counters
• Statements
• Expressions
This chapter uses the following notation to describe syntax:
• | (vertical bar) means “or”
• [ ] (square brackets) enclose options
• ± indicates both addition and subtraction operations
Tokens
The assembler has these tokens:
• Identifiers
• Constants
• Operators
The assembler lets you put blank characters and tab characters anywhere
between tokens; however, it does not allow these characters within tokens
(except for character constants). A blank or tab must separate adjacent
identifiers or constants that are not otherwise separated.
Comments
The pound sign character (#) introduces a comment. Comments that start with
a # extend through the end of the line on which they appear. You can also use
C-language notation /*...*/ to delimit comments.
The assembler uses cpp (the C language preprocessor) to preprocess
assembler code. Because cpp interprets #s in the first column as pragmas
(compiler directives), do not start a # comment in the first column.
Identifiers
An identifier consists of a case-sensitive sequence of alphanumeric
characters, including these:
• . (period)
• _ (underscore)
• $ (dollar sign)
Identifiers can be up to 31 characters long, and the first character cannot be
numeric.
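The identifier rules can be expressed as a small C check. This is a sketch of
the rules as stated here only; it does not account for reserved register
names, and the function names are hypothetical.

```c
#include <ctype.h>
#include <string.h>

#define MAX_IDENTIFIER_LENGTH 31

/* Returns nonzero for an alphanumeric character, period, underscore,
 * or dollar sign. */
int is_identifier_char(int c)
{
    return isalnum(c) || c == '.' || c == '_' || c == '$';
}

/* Returns 1 when s is 1 to 31 characters long, contains only identifier
 * characters, and does not begin with a digit. */
int is_valid_identifier(const char *s)
{
    size_t len = strlen(s);
    size_t i;

    if (len == 0 || len > MAX_IDENTIFIER_LENGTH)
        return 0;
    if (isdigit((unsigned char)s[0]))
        return 0;
    for (i = 0; i < len; i++) {
        if (!is_identifier_char((unsigned char)s[i]))
            return 0;
    }
    return 1;
}
```

Under these rules main, _start, and .loop are accepted, while 1abc and a-b
are rejected.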
If an identifier is not defined to the assembler (only referenced), the assembler
assumes that the identifier is an external symbol. The assembler treats the
identifier like a .globl pseudo-operation (see Chapter 8). If the identifier is
defined to the assembler and the identifier has not been specified as global,
the assembler treats it as a local symbol.
Constants
The assembler has these constants:
• Scalar constants
• Floating point constants
• String constants
Scalar Constants
The assembler interprets all scalar constants as twos complement numbers.
Scalar constants can be any of the digits 0123456789abcdefABCDEF.
.float and .double directives may optionally use hexadecimal floating point
constants instead of decimal ones. A hexadecimal floating point constant
consists of:
<+ or –> 0x <1 or 0 or nothing> . <hex digits> H 0x <hex digits>
The assembler places the first set of hex digits (excluding the 0 or 1 preceding
the decimal point) in the mantissa field of the floating point format without
attempting to normalize it. It stores the second set of hex digits into the
exponent field without biasing them. It checks that the exponent is
appropriate if the mantissa appears to be denormalized. Hexadecimal
floating point constants are useful for generating IEEE special symbols, and
for writing hardware diagnostics.
For example, either of the following generates a single-precision “1.0”:
.float 1.0e+0
.float 0x1.0h0x7f
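The second form of the example can be checked with a short C sketch that
builds a single-precision value directly from its fields, storing the
exponent field as given (without biasing). It assumes that the host's float
type is IEEE 754 single precision. It takes the mantissa field as a number
and does not model how the assembler positions the hex digits within that
field, which is not specified here. The function name is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Builds a single-precision value from a sign bit, an 8-bit exponent
 * field (stored unbiased, exactly as given), and a 23-bit mantissa
 * field. Assumes float is IEEE 754 single precision. */
float float_from_fields(unsigned sign, uint32_t exponent_field,
                        uint32_t mantissa_field)
{
    uint32_t bits = ((uint32_t)(sign & 1u) << 31) |
                    ((exponent_field & 0xFFu) << 23) |
                    (mantissa_field & 0x7FFFFFu);
    float value;

    memcpy(&value, &bits, sizeof value);   /* reinterpret the bit pattern */
    return value;
}
```

An exponent field of 0x7f with a zero mantissa field gives 1.0, which
matches the example. An exponent field of 0xff gives the IEEE special values:
infinity when the mantissa field is zero, and a NaN otherwise.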
String Constants
String constants begin and end with double quotation marks (").
The assembler observes C language backslash conventions. For octal
notation, the backslash conventions require three characters when the next
character could be confused with the octal number. For hexadecimal
notation, the backslash conventions require two characters when the next
character could be confused with the hexadecimal number (i.e., use a 0 for the
first character of a single character hex number).
The assembler follows the backslash conventions shown in Table 4-1.
Convention    Meaning
\a            Alert (0x07)
\b            Backspace (0x08)
\f            Form feed (0x0c)
\n            Newline (0x0a)
\r            Carriage return (0x0d)
\t            Horizontal tab (0x09)
\v            Vertical tab (0x0b)
\\            Backslash (0x5c)
\"            Quotation mark (0x22)
\'            Single quote (0x27)
\000          Character whose octal value is 000.
The assembler always generates the text section before other sections.
Additions to the text section happen in four-byte units. Each section has an
implicit location counter, which begins at zero and increments by one for each
byte assembled in the section.
The bss section holds zero-initialized data. If a .lcomm pseudo-op defines a
variable (see Chapter 8), the assembler assigns that variable to the bss (block
started by storage) section or to the sbss (short block started by storage)
section depending on the variable’s size. The default variable size for sbss is
8 or fewer bytes.
The command line option -G for each compiler (C, Pascal, Fortran 77, or the
assembler) can increase the size of sbss to cover all but extremely large data
items. The link editor issues an error message when the -G value gets too
Statements
Each statement consists of an optional label, an operation code, and the
operand(s). The machine allows these statements:
• Null statements
• Keyword statements
Label Definitions
A label definition consists of an identifier followed by a colon. Label
definitions assign the current value and type of the location counter to the
name. An error results when the name is already defined, the assigned value
changes the label definition, or both conditions exist.
Label definitions always end with a colon. You can put a label definition on
a line by itself.
A generated label is a single numeric value (1...255). To reference a generated
label, put an f (forward) or a b (backward) immediately after the digit. The
reference tells the assembler to look for the nearest generated label that
corresponds to the number in the lexically forward or backward direction.
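The lookup for generated labels can be sketched in C. The sketch assumes that
label definitions are recorded in ascending line order and that a reference
matches only definitions strictly before or strictly after its own line; the
text above does not say how a definition on the same line is treated. The
struct and function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical record of one generated label definition. */
struct generated_label {
    int number;     /* label number, 1...255 */
    int line;       /* line on which the label is defined */
};

/* Resolves a reference such as 1f or 1b made on line from_line.
 * `labels` must be sorted by ascending line. Returns the line of the
 * nearest matching definition, or -1 when there is none. */
int resolve_generated_label(const struct generated_label *labels,
                            size_t count, int number, int from_line,
                            char direction)
{
    int found = -1;
    size_t i;

    for (i = 0; i < count; i++) {
        if (labels[i].number != number)
            continue;
        if (direction == 'f' && labels[i].line > from_line)
            return labels[i].line;      /* first definition after */
        if (direction == 'b' && labels[i].line < from_line)
            found = labels[i].line;     /* latest definition before */
    }
    return found;
}
```

With label 1 defined on lines 10, 30, and 50, a reference to 1f on line 25
resolves to line 30, and a reference to 1b on the same line resolves to
line 10.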
Null Statements
A null statement is an empty statement that the assembler ignores. Null
statements can have label definitions. For example, this line has three null
statements in it:
label: ; ;
Keyword Statements
A keyword statement begins with a predefined keyword. The syntax for the
rest of the statement depends on the keyword. All instruction opcodes are
keywords. All other keywords are assembler pseudo-operations (directives).
Expressions
An expression is a sequence of symbols that represent a value. Each
expression and its result have data types. The assembler does arithmetic in
twos complement integers with 32 bits of precision. Expressions follow
precedence rules and consist of:
• Operators
• Identifiers
• Constants
Also, you may use a single character string in place of an integer within an
expression. Thus:
.byte "a" ; .word "a"+0x19
is equivalent to:
.byte 0x61 ; .word 0x7a
Precedence
Unless parentheses enforce precedence, the assembler evaluates all operators
of the same precedence strictly from left to right. Because parentheses also
designate index-registers, ambiguity can arise from parentheses in
expressions. To resolve this ambiguity, put a unary + in front of parentheses
in expressions.
The assembler has three precedence levels, which are listed here from lowest
to highest precedence:
least binding,
lowest precedence:     binary +, -
                       binary *, /, %, <<, >>, ^, &, |
most binding,
highest precedence:    unary -, +, ~
Expression Operators
For expressions, you can rely on the precedence rules, or you can group
expressions with parentheses. The assembler has the operators listed in Table
4-2.
Operator Meaning
+ Addition
- Subtraction
* Multiplication
/ Division
% Remainder
<< Shift Left
>> Shift Right (sign NOT extended)
^ Bitwise Exclusive-OR
& Bitwise AND
| Bitwise OR
- Minus (unary)
+ Identity (unary)
~ Complement
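Because all of the middle-level operators share one precedence, some
expressions evaluate differently than they would in C. The following sketch
is a minimal evaluator for constant expressions under the three levels
described above, using 32-bit twos complement arithmetic and a right shift
that does not extend the sign. It is not the assembler's implementation. It
handles only numeric constants and grouping parentheses, treats division by
zero as yielding 0 (an assumption), and relies on the C library's strtoul
for number syntax. All names are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

static const char *cursor;              /* current parse position */

static uint32_t parse_sum(void);

static void skip_blanks(void)
{
    while (*cursor == ' ' || *cursor == '\t')
        cursor++;
}

/* Highest precedence: unary -, +, ~, a constant, or a parenthesized group. */
static uint32_t parse_unary(void)
{
    char *end;
    uint32_t value;

    skip_blanks();
    if (*cursor == '-') { cursor++; return 0u - parse_unary(); }
    if (*cursor == '+') { cursor++; return parse_unary(); }
    if (*cursor == '~') { cursor++; return ~parse_unary(); }
    if (*cursor == '(') {
        cursor++;
        value = parse_sum();
        skip_blanks();
        if (*cursor == ')')
            cursor++;
        return value;
    }
    value = (uint32_t)strtoul(cursor, &end, 0);
    cursor = end;
    return value;
}

/* Signed division or remainder without undefined behavior. */
static uint32_t divide(uint32_t left, uint32_t right, int want_remainder)
{
    int32_t a = (int32_t)left, b = (int32_t)right;

    if (b == 0)
        return 0;                       /* assumption: no error reporting */
    if (b == -1)                        /* avoids INT32_MIN / -1 overflow */
        return want_remainder ? 0u : 0u - left;
    return (uint32_t)(want_remainder ? a % b : a / b);
}

/* Middle precedence: *, /, %, <<, >>, ^, &, | applied left to right. */
static uint32_t parse_product(void)
{
    uint32_t left = parse_unary();

    for (;;) {
        uint32_t right;
        char op;

        skip_blanks();
        op = *cursor;
        if ((op == '<' || op == '>') && cursor[1] == op)
            cursor += 2;
        else if (op == '*' || op == '/' || op == '%' ||
                 op == '^' || op == '&' || op == '|')
            cursor += 1;
        else
            return left;
        right = parse_unary();
        switch (op) {
        case '*': left *= right; break;
        case '/': left = divide(left, right, 0); break;
        case '%': left = divide(left, right, 1); break;
        case '<': left <<= (right & 31u); break;
        case '>': left >>= (right & 31u); break;  /* sign not extended */
        case '^': left ^= right; break;
        case '&': left &= right; break;
        case '|': left |= right; break;
        }
    }
}

/* Lowest precedence: binary + and - applied left to right. */
static uint32_t parse_sum(void)
{
    uint32_t left = parse_product();

    for (;;) {
        skip_blanks();
        if (*cursor == '+') { cursor++; left += parse_product(); }
        else if (*cursor == '-') { cursor++; left -= parse_product(); }
        else return left;
    }
}

/* Evaluates a constant expression held in a NUL-terminated string. */
uint32_t evaluate(const char *text)
{
    cursor = text;
    return parse_sum();
}
```

Under these rules, 1 | 2 << 1 evaluates as (1 | 2) << 1, which is 6, whereas
C would evaluate it as 1 | (2 << 1), which is 5.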
Data Types
The assembler manipulates several types of expressions. Each symbol you
reference or define belongs to one of the categories shown in Table 4-3.
Type    Description
undefined
    Any symbol that is referenced but not defined becomes global undefined,
    and this module will attempt to import it. The assembler uses 32-bit
    addressing to access these symbols. (Declaring such a symbol in a .globl
    pseudo-op merely makes its status clearer.)
sundefined
    A symbol defined by a .extern pseudo-op becomes global small undefined
    if its size is greater than zero but less than the number of bytes
    specified by the -G option on the command line (which defaults to 8).
    The linker places these symbols within a 64k byte region pointed to by
    the $gp register, so that the assembler can use economical 16-bit
    addressing to access them.
absolute
    A constant defined in an “=” expression.
text
    The text section contains the program’s instructions, which are not
    modifiable during execution. Any symbol defined while the .text
    pseudo-op is in effect belongs to the text section.
data
    The data section contains memory which the linker can initialize to
    nonzero values before your program begins to execute. Any symbol defined
    while the .data pseudo-op is in effect belongs to the data section. The
    assembler uses 32-bit addressing to access these symbols.
sdata
    This category is similar to data, except that defining a symbol while
    the .sdata (“small data”) pseudo-op is in effect causes the linker to
    place it within a 64k byte region pointed to by the $gp register, so
    that the assembler can use economical 16-bit addressing to access it.
rdata
    Any symbol defined while the .rdata pseudo-op is in effect belongs to
    this category, which is similar to data, but may not be modified during
    execution.
bss and sbss
    The bss and sbss sections consist of memory which the kernel loader
    initializes to zero before your program begins to execute. Any symbol
    defined in a .comm or .lcomm pseudo-op belongs to these sections (except
    that a .data, .sdata, or .rdata pseudo-op can override a .comm
    directive). If its size is less than the number of bytes specified by
    the -G option on the command line (which defaults to 8), it belongs to
    sbss (“small bss”), and the linker places it within a 64k byte region
    pointed to by the $gp register so that the assembler can use economical
    16-bit addressing to access it. Otherwise, it belongs to bss and the
    assembler uses 32-bit addressing. Local symbols in bss or sbss defined
    by .lcomm are allocated memory by the assembler; global symbols are
    allocated memory by the link editor; and symbols defined by .comm are
    overlaid upon like-named symbols (in the fashion of Fortran “COMMON”
    blocks) by the link editor.
Symbols in the undefined and small undefined categories are always global
(that is, they are visible to the link editor and can be shared with other
modules of your program). Symbols in the absolute, text, data, sdata, rdata,
bss, and sbss categories are local unless declared in a .globl pseudo-op.
Chapter 5: Instruction Set
This chapter describes instruction notation and discusses assembler
instructions for the main processor. Chapter 6 describes coprocessor notation
and instructions.
Instruction Classes
The assembler has these classes of instructions for the main processor:
• Load and Store Instructions. These instructions load immediate
values and move data between memory and general registers.
• Computational Instructions. These instructions do arithmetic and
logical operations for values in registers.
• Jump and Branch Instructions. These instructions change program
control flow.
• Coprocessor Interface. These instructions provide standard
interfaces to the coprocessors.
• Special Instructions. These instructions do miscellaneous tasks.
Instruction Notation
The tables in this chapter list the assembler format for each load, store,
computational, jump, branch, coprocessor, and special instruction. The
format consists of an op-code and a list of operand formats. The tables list
groups of closely related instructions; for those instructions, you can use any
op-code with any specified operand.
Operands can take any of these formats:
• Memory references. For example, a relocatable symbol +/– an
expression(register).
• Expressions (for immediate values).
• Two or three operands. For example, add $3,$4 is the same as add
$3,$3,$4.
Operand Description
destination Destination register
address Symbolic expression (see Chapter 2)
source Source register
expression Absolute value
Store Word sw
Unaligned Store Halfword ush
Unaligned Store Word usw
* Not valid in mips1 architectures
Table 5-2: Load and Store Formats for mips3 Architecture Only
Description Op-code Operands
Load Doubleword ld destination, address
Load Linked Doubleword lld
Load Word Unsigned lwu
Load Doubleword Left ldl
Load Doubleword Right ldr
Unaligned Load Double uld
Store Doubleword sd source, address
Store Conditional Doubleword scd
Store Double Left sdl
Store Double Right sdr
Unaligned Store Doubleword usd
Load Halfword (lh) Loads the two least-significant bytes of the destination register with the contents
of the halfword at the memory location specified by the effective address.
The machine treats the loaded halfword as a signed value. If the effective
address is not even, the machine signals an address error exception.
Unaligned Load Halfword (ulh) Loads a halfword into the destination register from the specified address and
extends the sign of the halfword. Unaligned Load Halfword loads a halfword
regardless of the halfword’s alignment in memory.
Unaligned Load Halfword Unsigned (ulhu) Loads a halfword into the destination register from the specified address and
zero extends the halfword. Unaligned Load Halfword Unsigned loads a halfword
regardless of the halfword’s alignment in memory.
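The difference between sign extension (lh, ulh) and zero extension (ulhu) can be sketched in C. This is only an illustrative model, not how the assembler expands these instructions: the function names are hypothetical, and the unaligned helper assumes big-endian byte order.

```c
#include <stdint.h>

/* Hypothetical model of lh: sign-extend a 16-bit halfword to 32 bits. */
int32_t load_halfword_signed(uint16_t halfword)
{
    /* Flip the sign bit, then subtract its weight. This avoids relying on
       implementation-defined narrowing conversions. */
    return (int32_t)(halfword ^ 0x8000u) - 0x8000;
}

/* Hypothetical model of the ulhu extension step: zero-extend to 32 bits. */
uint32_t load_halfword_unsigned(uint16_t halfword)
{
    return (uint32_t)halfword;
}

/* Hypothetical model of ulh: build the halfword one byte at a time so the
   address may be odd. Big-endian byte order is an assumption of this sketch. */
int32_t unaligned_load_halfword(const uint8_t *address)
{
    uint16_t halfword = (uint16_t)(((unsigned)address[0] << 8) | address[1]);
    return load_halfword_signed(halfword);
}
```

For the halfword 0xFFFF, the signed model yields -1 and the unsigned model yields 65535.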
effective address specifies the lowest numbered byte. Only the bytes which
share the same aligned doubleword in memory are merged into the destination
register.
Unaligned Load Doubleword (uld) Loads a doubleword into the destination register from the specified address. uld
loads a doubleword regardless of the doubleword’s alignment in memory.
(swl) the effective address. The contents of the word at the memory location,
specified by the effective address, are shifted right so that the leftmost byte of
the unaligned word is in the addressed byte position. The stored bytes replace
the corresponding bytes of the effective address. The effective address’s last
two bits determine how many bytes are involved.
Computational Instructions
The machine has general-purpose and coprocessor-specific computational
instructions (for example, the floating-point coprocessor). This part of the
book describes general-purpose computational instructions.
Computational Formats
In Table 5-7, operands have the following meanings:
Operand Description
destination/src1 Destination register is also source register 1
destination Destination register
immediate Immediate value
src1,src2 Source registers
Exclusive-OR (xor) Computes the XOR of two values. This instruction XORs (bit-wise) the contents
of src1 with the contents of src2, or it can XOR the contents of src1 with the
immediate value. The immediate value is not sign extended. Exclusive-OR puts
the result in the destination register.
Move (move) Moves the contents of src1 to the destination register.
Negate with Overflow (neg) Computes the negative of a value. This instruction negates the contents of src1
and puts the result in the destination register. If the value in src1 is
–2147483648, the machine signals an overflow exception.
Negate without Overflow (negu) Negates the integer contents of src1 and puts the result in the destination
register. The machine does not report overflows.
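The one value that makes neg overflow is the most negative 32-bit integer, which has no positive counterpart. The following C sketch models both instructions; the function names are hypothetical, and the overflow exception is represented by a false return value rather than a trap.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of neg: returns false where the machine would signal
   an overflow exception, and leaves *destination untouched in that case. */
bool negate_with_overflow(int32_t src1, int32_t *destination)
{
    if (src1 == INT32_MIN) {
        return false;          /* -(-2147483648) is not representable */
    }
    *destination = -src1;
    return true;
}

/* Hypothetical model of negu: two's-complement negation that wraps silently. */
uint32_t negate_without_overflow(uint32_t src1)
{
    return 0u - src1;
}
```

In this model, negating 0x80000000 without overflow checking returns 0x80000000 again, which is why negu never reports an overflow.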
NOT (not) Computes the Logical NOT of a value. This instruction complements (bit-wise)
the contents of src1 and puts the result in the destination register.
NOT OR (nor) Computes the NOT OR of two values. This instruction combines the contents of
src1 with the contents of src2 (or the immediate value). NOT OR complements
the result and puts it in the destination register.
zero.
Set Greater/Equal (sge) Compares two signed 32-bit values. If the contents of src1 are greater than or
equal to the contents of src2 (or src1 is greater than or equal to the immediate
value), this instruction sets the destination register to one; otherwise, it sets the
destination register to zero.
(srl) zeros at the most-significant bit. The contents of src1 specify the value to shift,
and the contents of src2 (or the immediate value) specify the amount to shift. If
src2 (or the immediate value) is greater than 31 or less than 0, src1 shifts by the
result of src2 MOD 32.
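The MOD 32 rule means that only the low five bits of the shift amount matter. A minimal C sketch of a logical right shift under that rule follows; the function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of srl: zeros enter at the most-significant bit, and
   the shift amount is reduced MOD 32 by keeping its low five bits. */
uint32_t shift_right_logical(uint32_t value, int32_t amount)
{
    uint32_t count = (uint32_t)amount & 31u;   /* amount MOD 32 */
    return value >> count;
}
```

Under this model a shift amount of 32 behaves like 0, 36 behaves like 4, and -1 behaves like 31.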
(dmultu) overflow is possible. Note: The dmultu instruction is a real machine language
instruction.
src2 (or the immediate value) specify the amount to shift. If src2 (or the
immediate value) is greater than 63, src1 shifts by src2 MOD 64.
Operand Description
address An expression.
immediate An expression with an absolute value.
label A symbol label.
return Register containing the return address.
src1, src2 The source registers.
target Register containing the target.
Branch on Less Than Zero (bltz) Branches to the specified label when the contents of src1 are less than zero. The
program must define the destination.
Branch on Less (blt) Branches to the specified label when the contents of src1 are less than the
contents of src2, or it can branch when the contents of src1 are less than the
immediate value. The comparison treats the comparands as signed 32-bit values.
* Likely Same as the ordinary branch instruction (without the "Likely"), except that in a branch
likely instruction, the instruction in the delay slot is nullified if the conditional
branch is not taken. Note: The branch likely instructions should be used only
inside a .set noreorder section of an assembly program. The assembler does
not attempt to schedule the delay slot of a branch likely instruction.
Special Instructions
The main processor’s special instructions do miscellaneous tasks.
Special Formats
In Table 5-13, operands have the following meanings:
Operand Description
register Destination or source register
breakcode Value that determines the break type
Operand Description
address A symbolic expression
destination The destination coprocessor register
dest-gpr The destination general register
label A symbolic label
operation The coprocessor specific operation
source A coprocessor register from which values are assigned
src-gpr A general register from which values are assigned
z A coprocessor number in the range 0...2
NOTE: You cannot use coprocessor load and store instructions with the
system control coprocessor (cp0).
Move From Coprocessor z (mfcz) Stores the contents of the coprocessor register specified by the source in the
general register specified by dest-gpr.
Chapter 6: Coprocessor Instruction Set
This chapter describes the coprocessor instructions for these coprocessors:
• System control coprocessor (cp0) instructions
• Floating-point coprocessor instructions
See Chapter 5 for a description of the main processor’s instructions and the
coprocessor interface instructions.
Instruction Notation
The tables in this chapter list the assembler format for each coprocessor’s
load, store, computational, jump, branch, and special instructions. The format
consists of an op-code and a list of operand formats. The tables list groups of
closely related instructions; for those instructions, you can use any op-code
with any specified operand.
NOTE: The system control coprocessor instructions do not have operands.
Operands can have any of these formats:
• Memory references: for example, a relocatable symbol +/– an
expression(register)
• Expressions (for immediate values)
• Two or three operands: for example, add $3,$4 is the same as add
$3,$3,$4
Floating-Point Instructions
The floating-point coprocessor has these classes of instructions:
• Load and Store Instructions: Load values and move data between
memory and coprocessor registers.
• Move Instructions: Move data between registers.
• Computational Instructions: Do arithmetic and logical operations
on values in coprocessor registers.
• Relational Instructions: Compare two floating-point values.
A particular floating-point instruction may be implemented in hardware,
software, or a combination of hardware and software.
Floating-Point Formats
The formats for the single- and double-precision floating-point constants are
shown below:
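Single-Precision: sign (1 bit), exponent (8 bits), fraction (23 bits)
Bit positions, big-endian numbering: 0, 1 to 8, 9 to 31
Bit positions, little-endian numbering: 31, 30 to 23, 22 to 0
Double-Precision: sign (1 bit), exponent (11 bits), fraction (52 bits)
Bit positions, big-endian numbering: 0, 1 to 11, 12 to 63
Bit positions, little-endian numbering: 63, 62 to 52, 51 to 0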
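The field widths above (1, 8 and 23 bits for single precision; 1, 11 and 52 bits for double precision) can be checked with a short C sketch that splits a value into its fields. The type and function names are hypothetical, bits are numbered in the little-endian style (sign in the highest bit), and the sketch assumes the host stores float and double in these same 32-bit and 64-bit formats.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical holder for the three fields of a floating-point constant. */
struct fp_fields {
    uint64_t sign;
    uint64_t exponent;
    uint64_t fraction;
};

/* Split a single-precision value: 1 sign bit, 8 exponent bits, 23 fraction bits. */
struct fp_fields split_single(float value)
{
    uint32_t bits;
    struct fp_fields fields;
    memcpy(&bits, &value, sizeof bits);        /* assumes a 32-bit float */
    fields.sign = bits >> 31;
    fields.exponent = (bits >> 23) & 0xFFu;
    fields.fraction = bits & 0x7FFFFFu;
    return fields;
}

/* Split a double-precision value: 1 sign bit, 11 exponent bits, 52 fraction bits. */
struct fp_fields split_double(double value)
{
    uint64_t bits;
    struct fp_fields fields;
    memcpy(&bits, &value, sizeof bits);        /* assumes a 64-bit double */
    fields.sign = bits >> 63;
    fields.exponent = (bits >> 52) & 0x7FFu;
    fields.fraction = bits & 0xFFFFFFFFFFFFFull;
    return fields;
}
```

For example, 1.0 has a zero fraction and a biased exponent of 127 in single precision and 1023 in double precision.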
Store instructions.
Instruction Description
Load Fp Instructions Load eight bytes for double-precision and four bytes for single-precision from
the specified effective address into the destination register, which must be an
even register. The bytes must be word aligned. Note: We recommend that
you use doubleword alignment for double-precision operands. It is required in
the mips2 architecture (R4000 & R6000).
Store Fp Instructions Store eight bytes for double-precision and four bytes for single-precision
from the source floating-point register, which must be an even register, to the
specified effective address. Note: We recommend that you use doubleword
alignment for double-precision operands. It is required in the mips2
architecture (R4000 & R6000).
Single div.s
Multiply Fp
Double mul.d
Single mul.s
Subtract Fp
Double sub.d
Single sub.s
Convert Source to Specified Fp Precision
Double to Single Fp cvt.s.d destination, src1
Fixed Point to Single Fp cvt.s.w
Single to Double Fp cvt.d.s
Fixed Point to Double Fp cvt.d.w
Single to Fixed Point Fp cvt.w.s
Double to Fixed Point Fp cvt.w.d
Truncate and Round Operations
Truncate to Single Fp trunc.w.s destination, src, gpr
Truncate to Double Fp trunc.w.d
Round to Single Fp round.w.s
Round to Double Fp round.w.d
Ceiling to Double Fp ceil.w.d
Ceiling to Single Fp ceil.w.s
Ceiling to Double Fp, Unsigned ceilu.w.d
Ceiling to Single Fp, Unsigned ceilu.w.s
Floor to Double Fp floor.w.d
Floor to Single Fp floor.w.s
Floor to Double Fp, Unsigned flooru.w.d
Floor to Single Fp, Unsigned flooru.w.s
Round to Double Fp, Unsigned roundu.w.d
Round to Single Fp, Unsigned roundu.w.s
Truncate to Double Fp, Unsigned truncu.w.d
Truncate to Single Fp, Unsigned truncu.w.s
Computational instructions.
Instruction Description
Absolute Value Fp Instructions Compute the absolute value of the contents of src1 and put the specified
precision floating-point result in the destination register.
Add Fp Instructions Add the contents of src1 (or the destination) to the contents of src2 and
put the result in the destination register. When the sum of two operands
with opposite signs is exactly zero, the sum has a positive sign for all
rounding modes except round toward –∞. For that rounding mode, the
sum has a negative sign.
Convert Source to Another Precision Fp Instructions Convert the contents of src1 to the specified precision, round according
to the rounding mode, and put the result in the destination register.
Truncate and Round instructions The trunc instructions truncate the value in the source floating-point
register and put the resulting integer in the destination floating-point
register, using the third (general-purpose) register to hold a temporary
value. (This is a macro-instruction.) The round instructions work like
trunc, but round the floating-point value to an integer instead of
truncating it.
Divide Fp Instructions Compute the quotient of two values. These instructions treat src1 as the
dividend and src2 as the divisor. Divide Fp instructions divide the
contents of src1 by the contents of src2 and put the result in the
destination register. If the divisor is zero, the machine signals an error if
the divide-by-zero exception is enabled.
Multiply Fp Instructions Multiply the contents of src1 (or the destination) with the contents of
src2 and put the result in the destination register.
Negate Fp Instructions Compute the negative value of the contents of src1 and put the specified
precision floating-point result in the destination register.
Subtract Fp Instructions Subtract the contents of src2 from the contents of src1 (or the
destination). These instructions put the result in the destination register.
When the difference of two operands with the same signs is exactly zero,
the difference has a positive sign for all rounding modes except round
toward –∞. For that rounding mode, the difference has a negative sign.
                  Condition Relations                          Invalid Operation
Mnemonic          Greater   Less                               Exception if
True    False     Than      Than      Equal     Unordered      Unordered
F       T         F         F         F         F              no
UN      OR        F         F         F         T              no
EQ      NEQ       F         F         T         F              no
UEQ     OLG       F         F         T         T              no
OLT     UGE       F         T         F         F              no
ULT     OGE       F         T         F         T              no
OLE     UGT       F         T         T         F              no
ULE     OGT       F         T         T         T              no
SF      ST        F         F         F         F              yes
NGLE    GLE       F         F         F         T              yes
SEQ     SNE       F         F         T         F              yes
NGL     GL        F         F         T         T              yes
LT      NLT       F         T         F         F              yes
NGE     GE        F         T         F         T              yes
LE      NLE       F         T         T         F              yes
NGT     GT        F         T         T         T              yes
UN Unordered                                  OR Ordered
EQ Equal                                      NEQ Not Equal
UEQ Unordered or Equal                        OLG Ordered or Less than or Greater than
OLT Ordered Less Than                         UGE Unordered or Greater than or Equal
ULT Unordered or Less Than                    OGE Ordered Greater than or Equal
OLE Ordered Less than or Equal                UGT Unordered or Greater Than
ULE Unordered or Less than or Equal           OGT Ordered Greater Than
SF Signaling False                            ST Signaling True
NGLE Not Greater than or Less than or Equal   GLE Greater than, or Less than or Equal
SEQ Signaling Equal                           SNE Signaling Not Equal
NGL Not Greater than or Less than             GL Greater Than or Less Than
LT Less Than                                  NLT Not Less Than
NGE Not Greater Than or Equal                 GE Greater Than or Equal
LE Less Than or Equal                         NLE Not Less Than or Equal
NGT Not Greater Than                          GT Greater Than
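The sixteen "true" mnemonics can be read as combinations of three relation flags (less than, equal, unordered) plus a flag that requests an invalid operation exception on unordered operands. The C sketch below evaluates a condition from those flags. The numeric codes are an assumption of this sketch: they simply follow the row order of the condition table (F = 0 through NGT = 15), and all names are hypothetical.

```c
#include <math.h>
#include <stdbool.h>

/* Flag values assumed from the row order of the condition table. */
enum {
    COND_UNORDERED = 1,   /* true when the operands are unordered */
    COND_EQUAL     = 2,   /* true when src1 equals src2           */
    COND_LESS      = 4,   /* true when src1 is less than src2     */
    COND_SIGNALING = 8    /* invalid operation if unordered       */
};

/* A few mnemonics expressed with those flags. */
enum {
    COND_F   = 0,
    COND_EQ  = COND_EQUAL,
    COND_UEQ = COND_EQUAL | COND_UNORDERED,
    COND_LT  = COND_SIGNALING | COND_LESS,
    COND_NGE = COND_SIGNALING | COND_LESS | COND_UNORDERED
};

struct compare_result {
    bool condition;   /* the true/false outcome of the compare      */
    bool invalid;     /* an invalid operation would be signaled     */
};

/* Hypothetical model of a double-precision floating-point compare. */
struct compare_result fp_compare(unsigned cond, double src1, double src2)
{
    struct compare_result result;
    bool unordered = isnan(src1) || isnan(src2);
    bool less = !unordered && src1 < src2;
    bool equal = !unordered && src1 == src2;

    result.condition = ((cond & COND_LESS) && less)
                    || ((cond & COND_EQUAL) && equal)
                    || ((cond & COND_UNORDERED) && unordered);
    result.invalid = (cond & COND_SIGNALING) && unordered;
    return result;
}
```

With a NaN operand, COND_NGE yields a true condition and flags an invalid operation, while COND_UEQ yields a true condition without flagging one, matching the "yes" and "no" rows of the table.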
Double c.lt.d
Single c.lt.s
Compare NGE
Double c.nge.d
Single c.nge.s
*Compare LE
Double c.le.d
Single c.le.s
Compare NGT
Double c.ngt.d
Single c.ngt.s
Instruction Description
Compare EQ Instructions Compare the contents of src1 with the contents of src2. If src1 equals
src2, a true condition results; otherwise, a false condition results. The
machine does not signal an exception for unordered values.
Compare F Instructions Compare the contents of src1 with the contents of src2. These
instructions always produce a false condition. The machine does not
signal an exception for unordered values.
Compare LE Compare the contents of src1 with the contents of src2. If src1 is less
than or equal to src2, a true condition results; otherwise, a false condition
results. The machine signals an exception for unordered values.
Compare LT Compare the contents of src1 with the contents of src2. If src1 is less
than src2, a true condition results; otherwise, a false condition results.
The machine signals an exception for unordered values.
Compare NGE Compare the contents of src1 with the contents of src2. If src1 is less
than src2 (or the contents are unordered), a true condition results;
otherwise, a false condition results. The machine signals an exception for
unordered values.
Compare NGL Compare the contents of src1 with the contents of src2. If src1 equals
src2 or the contents are unordered, a true condition results; otherwise, a
false condition results. The machine signals an exception for unordered
values.
Compare NGLE Compare the contents of src1 with the contents of src2. If src1 is
unordered, a true condition results; otherwise, a false condition results.
The machine signals an exception for unordered values.
Compare NGT Compare the contents of src1 with the contents of src2. If src1 is less
than or equal to src2 or the contents are unordered, a true condition
results; otherwise, a false condition results. The machine signals an
exception for unordered values.
Compare OLE Instructions Compare the contents of src1 with the contents of src2. If src1 is less
than or equal to src2, a true condition results; otherwise, a false condition
results. The machine does not signal an exception for unordered values.
Compare OLT Instructions Compare the contents of src1 with the contents of src2. If src1 is less
than src2, a true condition results; otherwise, a false condition results.
The machine does not signal an exception for unordered values.
Compare SEQ Instructions Compare the contents of src1 with the contents of src2. If src1 equals
src2, a true condition results; otherwise, a false condition results. The
machine signals an exception for unordered values.
Compare SF Instructions Compare the contents of src1 with the contents of src2. This always
produces a false condition. The machine signals an exception for
unordered values.
Compare ULE Instructions Compare the contents of src1 with the contents of src2. If src1 is less
than or equal to src2 (or src1 is unordered), a true condition results;
otherwise, a false condition results. The machine does not signal an
exception for unordered values.
Compare UEQ Instructions Compare the contents of src1 with the contents of src2. If src1 equals
src2 (or src1 and src2 are unordered), a true condition results; otherwise,
a false condition results. The machine does not signal an exception for
unordered values.
Compare ULT Instructions Compare the contents of src1 with the contents of src2. If src1 is less
than src2 (or the contents are unordered), a true condition results;
otherwise, a false condition results. The machine does not signal an
exception for unordered values.
Compare UN Instructions Compare the contents of src1 with the contents of src2. If either src1 or
src2 is unordered, a true condition results; otherwise, a false condition
results. The machine does not signal an exception for unordered values.
Table 6-9: Floating-Point Move Instruction Formats
Description Op-code Operand
Move FP
Single mov.s destination,src1
Double mov.d
Instruction Description
Move FP Instructions Move the double or single-precision contents of src1 to the destination
register, maintaining the specified precision.
Instruction Description
Cache (cache) ** Cache is the R4000 instruction to perform cache operations. The 16-bit
offset is sign-extended and added to the contents of general register
base to form a virtual address. The virtual address is translated to a
physical address using the TLB. The 5-bit sub-opcode (“op”) specifies
the cache operation for that address. Part of the virtual address is used to
specify the cache block for the operation. Possible operations include
invalidating a cache block, writeback to a secondary cache or memory,
etc.
** This instruction is not valid in mips1 or mips2 architectures.
Translation Lookaside Buffer Probe (tlbp) Probes the translation lookaside buffer (TLB) to see if the TLB has an
entry that matches the contents of the EntryHi register. If a match occurs,
the machine loads the Index register with the number of the entry that
matches the EntryHi register. If no TLB entry matches, the machine sets
the high-order bit of the Index register.
Translation Lookaside Buffer Read (tlbr) Loads the EntryHi and EntryLo registers with the contents of the
translation lookaside buffer (TLB) entry specified in the TLB Index
register.
Translation Lookaside Buffer Write Random (tlbwr) Loads the specified translation lookaside buffer (TLB) entry with the
contents of the EntryHi and EntryLo registers. The contents of the TLB
Random register specify the TLB entry to be loaded.
Translation Lookaside Buffer Write Index (tlbwi) Loads the specified translation lookaside buffer (TLB) entry with the
contents of the EntryHi and EntryLo registers. The contents of the TLB
Index register specify the TLB entry to be loaded.
Synchronize (sync) * Ensures that all loads and stores fetched before the sync are completed
before allowing any following loads or stores. Use of sync to serialize
certain memory references may be required in multiprocessor
environments.
* This instruction is not valid in the mips1 architecture.
user-level traps, and indicates exceptions that occurred in the most recently
executed instruction, and any exceptions that may have occurred without
being trapped:
Control and Status Register (c = compare bit)
Bits       Width   Field
31 to 24   8       0
23         1       c
22 to 18   5       0
17 to 12   6       exceptions (E, V, Z, O, U, I in bits 17, 16, 15, 14, 13, 12)
11 to 7    5       enables (V, Z, O, U, I in bits 11, 10, 9, 8, 7)
6 to 2     5       sticky bits (V, Z, O, U, I in bits 6, 5, 4, 3, 2)
1 to 0     2       RM
The exception bits are set for instructions that cause an IEEE standard
exception or an optional exception used to emulate some of the more
hardware-intensive features of the IEEE standard.
The exception field is loaded as a side-effect of each floating-point operation
(excluding loads, stores, and unformatted moves). The exceptions which
were caused by the immediately previous floating-point operation can be
determined by reading the exception field.
The meaning of each bit in the exception field is given below. If two
exceptions occur together on one instruction, the field will contain the
inclusive-OR of the bits for each exception:
Exception Field Bit Description
E Unimplemented Operation
I Inexact Exception
O Overflow Exception
U Underflow Exception
V Invalid Operation
Z Division-by-Zero
Field Description
I Inexact Exception
O Overflow Exception
U Underflow Exception
V Invalid Operation
Z Division-by-Zero
Each of the five exceptions is associated with a trap under user control, which
is enabled by setting one of the five bits of the enable field, shown above.
When an exception occurs, both the corresponding exception and status bits
are set. If the corresponding enable flag bit is set, a trap is taken. In some cases
the result of an operation is different if a trap is enabled.
The status flags are never cleared as a side effect of floating-point operations,
but may be set or cleared by writing a new value into the status register, using
a “move to coprocessor control” instruction.
The floating-point compare instruction places the condition which was
detected into the ‘c’ bit of the control and status register, so that the state of
the condition line may be saved and restored. The ‘c’ bit is set if the condition
is true, and cleared if the condition is false, and is affected only by compare
and move to control register instructions.
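The register layout described above (RM in bits 1 to 0, sticky bits in 6 to 2, enables in 11 to 7, exceptions in 17 to 12, and the ‘c’ bit at 23) lends itself to simple shift-and-mask accessors. The following C sketch is illustrative only: the function and constant names are hypothetical, and it works on a plain 32-bit copy of the register value rather than reading the coprocessor.

```c
#include <stdint.h>

/* Extract a field of the given width starting at low_bit. */
uint32_t csr_field(uint32_t csr, unsigned low_bit, unsigned width)
{
    return (csr >> low_bit) & ((1u << width) - 1u);
}

uint32_t csr_rounding_mode(uint32_t csr) { return csr_field(csr, 0, 2); }
uint32_t csr_sticky(uint32_t csr)        { return csr_field(csr, 2, 5); }
uint32_t csr_enables(uint32_t csr)       { return csr_field(csr, 7, 5); }
uint32_t csr_exceptions(uint32_t csr)    { return csr_field(csr, 12, 6); }
uint32_t csr_compare_bit(uint32_t csr)   { return csr_field(csr, 23, 1); }

/* Flag positions inside an extracted field (I is the lowest bit).
   FP_E exists only in the six-bit exceptions field. */
enum {
    FP_I = 0x01,   /* Inexact                 */
    FP_U = 0x02,   /* Underflow               */
    FP_O = 0x04,   /* Overflow                */
    FP_Z = 0x08,   /* Division-by-Zero        */
    FP_V = 0x10,   /* Invalid Operation       */
    FP_E = 0x20    /* Unimplemented Operation */
};

/* Exceptions from the last operation whose enable bit is also set,
   that is, the ones for which a trap would be taken. The E bit has no
   enable bit and is outside the scope of this sketch. */
uint32_t csr_trapping_exceptions(uint32_t csr)
{
    return csr_exceptions(csr) & csr_enables(csr);
}
```

A value with bits 15 and 10 set, for example, reports a division-by-zero exception whose trap is enabled.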
without a trap, is a quiet NaN when the destination has a floating-point
format, and is indeterminate if the result has a fixed-point format. The invalid
operations are:
• Addition or subtraction: magnitude subtraction of infinities, such as
(+∞) + (–∞).
• Multiplication: 0 times ∞, with any signs.
• Division: 0 over 0 or ∞ over ∞, with any signs.
• Square root: √x, where x is less than zero.
• Conversion of a floating-point number to a fixed-point format when
an overflow, or operand value of infinity or NaN, precludes a faithful
representation in that format.
• Comparison of predicates involving < or > without ?, when the
operands are “unordered”.
• Any operation on a signaling NaN.
Software may simulate this exception for other operations that are invalid for
the given source operands. Examples of these operations include IEEE-
specified functions implemented in software, such as Remainder: x REM y,
where y is zero or x is infinite; conversion of a floating-point number to a
decimal format whose value causes an overflow or is infinity or NaN; and
transcendental functions, such as ln(–5) or cos^–1(3).
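The arithmetic cases in the list of invalid operations can be written as operand checks. The C sketch below classifies operands for addition, multiplication, division and square root. The function names are hypothetical, and signaling NaNs, conversions and comparisons are deliberately left out of the sketch.

```c
#include <math.h>
#include <stdbool.h>

/* Addition of infinities with opposite signs is a magnitude subtraction. */
bool invalid_add(double a, double b)
{
    return isinf(a) && isinf(b) && (!signbit(a) != !signbit(b));
}

/* Zero times infinity, with any signs. */
bool invalid_multiply(double a, double b)
{
    return (a == 0.0 && isinf(b)) || (isinf(a) && b == 0.0);
}

/* Zero over zero or infinity over infinity, with any signs. */
bool invalid_divide(double a, double b)
{
    return (a == 0.0 && b == 0.0) || (isinf(a) && isinf(b));
}

/* Square root of a value less than zero (negative zero is not less than zero). */
bool invalid_square_root(double x)
{
    return x < 0.0;
}
```

Note that a finite nonzero dividend over zero is not an invalid operation in this model; that case belongs to the division-by-zero exception.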
Division-by-zero Exception
The division by zero exception is signaled on an implemented divide
operation if the divisor is zero and the dividend is a finite nonzero number.
The result, when no trap occurs, is a correctly signed infinity.
If division by zero traps are enabled, the result register is not modified, and
the source registers are preserved.
Software may simulate this exception for other operations that produce a
signed infinity, such as ln(0), sec(π/2), csc(0), or 0^–1.
Overflow Exception
The overflow exception is signaled when what would have been the
magnitude of the rounded floating-point result, were the exponent range
unbounded, is larger than the destination format’s largest finite number. The
result, when no trap occurs, is determined by the rounding mode and the sign
of the intermediate result.
If overflow traps are enabled, the result register is not modified, and the
source registers are preserved.
Underflow Exception
Two related events contribute to underflow. One is the creation of a tiny
nonzero result between ±2^Emin (where Emin is the minimum expressible
exponent) which, because it is tiny, may cause some other exception later.
The other is extraordinary
loss of accuracy during the approximation of such tiny numbers by
denormalized numbers.
The IEEE standard permits a choice in how these events are detected, but
requires that they must be detected the same way for all operations.
The IEEE standard specifies that “tininess” may be detected either “after
rounding” (when a nonzero result computed as though the exponent range
were unbounded would lie strictly between ±2^Emin) or “before rounding”
(when a nonzero result computed as though the exponent range and the
precision were unbounded would lie strictly between ±2^Emin). The
architecture requires that tininess be detected after rounding.
Loss of accuracy may be detected as either “denormalization loss” (when the
delivered result differs from what would have been computed if the exponent
range were unbounded), or “inexact result” (when the delivered result differs
from what would have been computed if the exponent range and precision
were both unbounded). The architecture requires that loss of accuracy be
detected as inexact result.
When an underflow trap is not enabled, underflow is signaled (via the
underflow flag) only when both tininess and loss of accuracy have been
detected. The delivered result might be zero, denormalized, or ±2^Emin. When
an underflow trap is enabled, underflow is signaled when tininess is detected
regardless of loss of accuracy.
If underflow traps are enabled, the result register is not modified, and the
source registers are preserved.
Inexact Exception
If the rounded result of an operation is not exact or if it overflows without an
overflow trap, then the inexact exception is signaled. The rounded or
overflowed result is delivered to the destination register, when no inexact trap
occurs. If inexact exception traps are enabled, the result register is not
modified, and the source registers are preserved.
Floating-Point Rounding
Bits 0 and 1 of coprocessor control register 31 set the rounding mode for
floating-point. The machine allows four rounding modes:
Chapter 7: Linkage Conventions
This chapter gives rules and examples to follow when designing an assembly
language program. The chapter concludes with a “learn by doing” technique
that you can use if you still have any doubts about how a particular calling
sequence should work. This involves writing a skeleton version of your
prospective assembly routine using a high level language, and then compiling
it with the –S option to generate a human-readable assembly language file.
The assembly language file can then be used as the starting point for coding
your routine.
Introduction
When you write assembly language routines, you should follow the same
calling conventions that the compilers observe, for two reasons:
• Often your code must interact with compiler-generated code,
accepting and returning arguments or accessing shared global data.
• The symbolic debugger gives better assistance in debugging
programs using standard calling conventions.
The conventions for the compiler system are a bit more complicated than
some, mostly to enhance the speed of each procedure call. Specifically:
• The compilers use the full, general calling sequence only when
necessary; where possible, they omit unneeded portions of it. For
example, the compilers avoid using a register as a frame pointer
whenever possible.
• The compilers and debugger observe certain implicit rules rather than
communicating via instructions or data at execution time. For
example, the debugger looks at information placed in the symbol
Program Design
This section describes three general areas of concern to the assembly
language programmer:
• Usable and restricted registers.
• Stack frame requirements on entering and exiting a routine.
• The “shape” of data (scalars, arrays, records, sets) laid out by the
• Leaf routines, that is, routines that do not themselves execute any
procedure calls. Leaf routines are of two types:
• Leaf routines that require stack storage for local variables
• Leaf routines that do not require stack storage for local variables.
You must decide the routine category before determining the calling
sequence.
To write a program with proper stack frame usage and debugging capabilities,
use the following procedure:
1. Regardless of the type of routine, you should include a .ent pseudo-op
and an entry label for the procedure. The .ent pseudo-op is for use by the
debugger, and the entry label is the procedure name. The syntax is:
.ent procedure_name
procedure_name:
2. If you are writing a leaf procedure that does not use the stack, skip to
step 3. For a leaf procedure that uses the stack, or for a non-leaf procedure,
you must allocate all the stack space that the routine requires. The syntax
to adjust the stack size is:
subu $sp,framesize
where framesize is the size of frame required; framesize must be a
multiple of 8. Space must be allocated for:
• Local variables.
• Saved general registers. Space should be allocated only for those
registers saved. For non-leaf procedures, you must save $31, which is
used in the calls to other procedures from this routine. If you use
registers $16–$23, you must also save them.
• Saved floating-point registers. Space should be allocated only for
those registers saved. If you use registers $f20–$f30 you must also
save them.
• Procedure call argument area. You must allocate the maximum
number of bytes for arguments of any procedure that you call from
this routine.
NOTE: Once you have modified $sp, you should not modify it again for the
rest of the routine.
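The frame size in step 2 is the sum of the four areas listed there, rounded up to a multiple of 8. The C sketch below shows that arithmetic. The function name is hypothetical, and the sketch assumes 4 bytes for each saved general register and 8 bytes for each saved floating-point register, which the text above does not state.

```c
#include <stdint.h>

/* Hypothetical helper: total frame size in bytes, rounded up to a multiple
   of 8. Assumes 4 bytes per saved general register and 8 bytes per saved
   floating-point register. */
uint32_t frame_size(uint32_t local_bytes,
                    uint32_t saved_general_registers,
                    uint32_t saved_fp_registers,
                    uint32_t argument_area_bytes)
{
    uint32_t total = local_bytes
                   + 4u * saved_general_registers
                   + 8u * saved_fp_registers
                   + argument_area_bytes;
    return (total + 7u) & ~7u;     /* round up to the next multiple of 8 */
}
```

As a usage example, a routine that saves only $31 and reserves 16 bytes of argument area needs 20 bytes, which rounds up to a framesize of 24.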
3. Now include a .frame pseudo-op:
.frame framereg,framesize,returnreg
[Figure: stack organization, from high memory to low memory: argument n through argument 1; virtual frame pointer ($fp); argument build area; stack pointer ($sp) (framereg)]
The returnreg specifies the register containing the return address (usually
$31). These usual values may change if you use a varying stack pointer or
are specifying a kernel trap routine.
4. If the procedure is a leaf procedure that does not use the stack, skip to
step 7. Otherwise you must save the registers you allocated space for in
step 2.
The .mask directive specifies the registers to be stored and where they are
stored. A bit should be on in bitmask for each register saved (for example,
if register $31 is saved, bit 31 should be ‘1’ in bitmask). Bits are set in
bitmask in little-endian order, even if the machine configuration is
big-endian. The frameoffset is the offset from the virtual frame pointer (this
number is usually negative). N should be 0 for the highest numbered
register saved and then incremented by four for each subsequently lower
numbered register saved. For example:
sw $31,framesize+frameoffset($sp)
sw $17,framesize+frameoffset-4($sp)
sw $16,framesize+frameoffset-8($sp)
Figure 7-2 illustrates this example.
[Figure 7-2: saved registers on the stack, shown from high memory down to the stack pointer ($sp) at low memory]
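The bitmask and the store offsets used with .mask follow directly from the rules above: one bit per saved register, and a slot offset that starts at framesize+frameoffset and decreases by four for each lower-numbered register. The C sketch below computes both; the function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: build the .mask bitmask from a list of saved
   general register numbers (bit n is set when register $n is saved). */
uint32_t save_mask(const unsigned *registers, unsigned count)
{
    uint32_t mask = 0;
    unsigned i;
    for (i = 0; i < count; i++) {
        if (registers[i] < 32u) {            /* ignore invalid numbers */
            mask |= (uint32_t)1 << registers[i];
        }
    }
    return mask;
}

/* Hypothetical helper: offset from $sp of the k-th saved register, where
   k is 0 for the highest numbered register, 1 for the next lower, and so on. */
int32_t save_slot_offset(int32_t framesize, int32_t frameoffset, unsigned k)
{
    return framesize + frameoffset - 4 * (int32_t)k;
}
```

For registers $31, $17 and $16 the mask is 0x80030000. With a framesize of 24 and a frameoffset of -4, the three registers are stored at offsets 20, 16 and 12 from $sp.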
Now save any floating-point registers that you allocated space for in step
2 as follows:
.fmask bitmask,frameoffset
s.[sd] reg,framesize+frameoffset-N($sp)
6. Next, you must restore registers that were saved in step 4. To restore
general purpose registers:
lw reg,framesize+frameoffset-N($sp)
To restore the floating-point registers:
l.[sd] reg,framesize+frameoffset-N($sp)
The Shape of Data
In most cases, high-level language routines and assembly routines
communicate via simple variables: pointers, integers, booleans, and single-
and double-precision real numbers. Describing the details of the various high-
level data structures (arrays, records, sets, and so on) is beyond our scope
here. If you need to access such a structure as an argument or as a shared
global variable, refer to Chapter 4 of the RISCompiler and C Programmer’s
Guide, and the “Learn by Doing” technique described at the end of this
section.
Examples
This section contains examples that illustrate program design rules. Each
example shows a procedure written in C and its equivalent written in
assembly language.
Figure 7-3 shows a non-leaf procedure. Notice that it creates a stack frame,
and also saves its return address, since it must put a new return address
into register $31 when it invokes its callee:
float
nonleaf(i, j)
int i, *j;
{
double atof();
int temp;
temp = i - *j;
if (i < *j) temp = -temp;
return atof(temp);
}
.globl nonleaf
# 1 float
# 2 nonleaf(i, j)
# 3 int i, *j;
# 4 {
.ent nonleaf 2
nonleaf:
subu $sp, 24 ## Create stackframe
sw $31, 20($sp) ## Save the return address
.mask 0x80000000, -4
.frame $sp, 24, $31
# 5 double atof();
# 6 int temp;
# 7
# 8 temp = i - *j;
lw $2, 0($5) ## Arguments are in $4 and $5
subu $3, $4, $2
# 9 if (i < *j) temp = -temp;
bge $4, $2, $32 ## Note: $32 is a label, not a reg
negu $3, $3
$32:
# 10 return atof(temp);
move $4, $3
jal atof
cvt.s.d $f0, $f0 ## Return value goes in $f0
lw $31, 20($sp) ## Restore return address
addu $sp, 24 ## Delete stackframe
j $31 ## Return to caller
.end nonleaf
Figure 7-4 shows a leaf procedure that does not require stack space for local
variables. Notice that it creates no stackframe, and saves no return address.
int
leaf(p1, p2)
int p1, p2;
{
return (p1 > p2) ? p1 : p2;
}
.globl leaf
# 1 int
# 2 leaf(p1, p2)
# 3 int p1, p2;
# 4 {
.ent leaf 2
leaf:
.frame $sp, 0, $31
# 5 return (p1 > p2) ? p1 : p2;
ble $4, $5, $32 ## Arguments in $4 and $5
move $3, $4
b $33
$32:
move $3, $5
$33:
move $2, $3 ## Return value goes in $2
j $31 ## Return to caller
# 6 }
.end leaf
Figure 7-4: Leaf Procedure Without Stack Space for Local Variables
Figure 7-5 shows a leaf procedure that requires stack space for local variables.
Notice that it creates a stack frame, but does not save a return address.
char
leaf_storage(i)
int i;
{
char a[16];
int j;
for (j = 0; j < 10; j++)
a[j] = '0' + j;
for (j = 10; j < 16; j++)
a[j] = 'a' + j;
return a[i];
}
.globl leaf_storage
# 1 char
# 2 leaf_storage(i)
# 3 int i;
# 4 {
.ent leaf_storage 2 ## "2" is the lexical level of
## the procedure. You may omit it.
leaf_storage:
subu $sp, 24 ## Create stackframe.
.frame $sp, 24, $31
# 5 char a[16];
# 6 int j;
# 7
# 8 for (j = 0; j < 10; j++)
sw $0, 4($sp)
addu $3, $sp, 24
$32:
# 9 a[j] = '0' + j;
lw $14, 4($sp)
addu $15, $14, 48
addu $24, $3, $14
sb $15, -16($24)
lw $25, 4($sp)
addu $8, $25, 1
sw $8, 4($sp)
blt $8, 10, $32
# 10 for (j = 10; j < 16; j++)
li $9, 10
sw $9, 4($sp)
$33:
# 11 a[j] = 'a' + j;
lw $10, 4($sp)
addu $11, $10, 97
addu $12, $3, $10
sb $11, -16($12)
lw $13, 4($sp)
addu $14, $13, 1
sw $14, 4($sp)
blt $14, 16, $33
# 12 return a[i];
addu $15, $3, $4 ## Argument is in $4.
lbu $2, -16($15) ## Return value goes in $2
addu $sp, 24 ## Delete stackframe.
j $31 ## Return to caller.
.end leaf_storage
Figure 7-5: Leaf Procedure With Stack Space for Local Variables
Learning by Doing
The rules and parameter requirements for interfacing assembly language
with other languages are varied and complex. The simplest approach to coding
an interface between an assembly routine and a routine written in a high-level
language is to do the following:
• Use the high-level language to write a skeletal version of the routine
that you plan to code in assembly language.
• Compile the program using the -S option, which creates an assembly
language (.s) version of the compiled source file.
• Study the assembly-language listing and then, imitating the rules and
conventions used by the compiler, write your assembly language
code.
The next two sections illustrate techniques to use in creating an interface
between assembly language and high-level language routines. The examples
shown are merely to illustrate what to look for in creating your interface.
Details such as register numbers will vary according to the number, order, and
data types of the arguments. You should write and compile realistic examples
of your own code when writing your particular interface.
cc -S -O caller.c
The -S option causes the compiler to produce the assembly-language
listing; the -O option, though not required, reduces the amount of code
generated, making the listing easier to read.
After compilation, look at the file caller.s (shown below). The highlighted
section of the listing shows how the parameters are passed, the execution of
the call, and how the returned values are retrieved:
.globl c
.align 2
c:
.word 875638323 : 1
.word 13617 : 1
.comm d 8
.comm f 4
.globl caller
.text
.ent caller 2
caller:
subu $sp, 24
sw $31, 20($sp)
.mask 0x80000000, -4
.frame $sp, 24, $31
# 1 char c[] = "3.1415";
# 2 double d, atof();
# 3 float f;
# 4 caller()
# 5 {
# 6 d = atof(c);
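The two .word values at the label c in the listing above are the initializer string "3.1415" packed four bytes per word. The following C sketch, whose function name unpack_words_le is hypothetical, unpacks the words least-significant byte first. That the values decode correctly this way suggests the listing was produced for a little-endian configuration; this is an inference from the values, not something the listing states.

```c
#include <stdint.h>
#include <stddef.h>

/* Unpack 32-bit words into bytes, least-significant byte first, which is
   how a little-endian target lays out each .word in memory.
   `out` must have room for 4 * nwords bytes. */
void unpack_words_le(const uint32_t *words, size_t nwords, char *out)
{
    for (size_t i = 0; i < nwords; i++)
        for (int b = 0; b < 4; b++)
            out[4 * i + b] = (char)((words[i] >> (8 * b)) & 0xff);
}
```

Unpacking 875638323 (0x34312E33) gives the bytes '3', '.', '1', '4', and 13617 (0x3531) gives '1', '5', followed by two zero bytes, one of which is the string terminator.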
type
str = packed array [1 .. 10] of char;
subr = 2 .. 5;
var
global_r: real;
global_c: subr;
global_s: str;
global_b: boolean;
.lcomm $dat 0
.comm global_r 4
.comm global_c 1
.comm global_s 10
.comm global_b 1
.text
.globl callee
# 10 function callee(var r: real; c: subr; s: str): boolean;
.ent callee 2
callee:
.frame $sp, 0, $31
sw $5, 4($sp)
sw $6, 8($sp)
Memory Allocation
The machine’s default memory allocation scheme gives every process two
storage areas that can grow without bound. A process exceeds virtual storage
only when the sum of the two areas exceeds virtual storage space. The link
editor and assembler use the scheme shown in Figure 7-6. An explanation of
each area in the allocation scheme follows the figure.
[Figure 7-6: memory allocation scheme, from high to low addresses:
1. 0xffffffff to 0x8fffffff (2GB): Reserved for Kernel (accessible from
Kernel Mode)
2. 0x7fffffff to 0x7ffff000 (4KB): Not Accessible (by convention, not a
hardware implementation)
3. From 0x7fffefff: Activation Stack (grows toward zero), ending at $sp
4. Protected (grows from either edge)
5. Heap (grows up)
6. .bss, .sbss, .sdata ($gp), .lit4, .lit8, .data]
section.
d. data (data) - Data initialized and specified for the data section.
7. Reserved for any shared libraries.
8. Contains the .text section, .rdata section and all dynamic tables.
9. Reserved.
8
This chapter describes pseudo op-codes (directives). These pseudo op-codes
influence the assembler’s later behavior. In the text, boldface type specifies a
keyword and italics represents an operand that you define.
The assembler has the pseudo op-codes listed in Table 8-1.
Chapter 8
Pseudo-Op Description
.aent name, symno
Sets an alternate entry point for the current procedure. Use this
directive when you want to generate information for the
debugger. It must appear inside an .ent/.end pair.
.alias reg1, reg2
Indicates that memory references through the two registers (reg1,
reg2) will overlap. The compiler uses this form to improve
instruction scheduling.
.align expression
Advances the location counter until its low-order expression bits
are zero. Normally, the .half, .word, .float, and .double directives
automatically align their data appropriately. For example, .word
does an implicit .align 2 (.double does an .align 3). You disable
the automatic alignment feature with .align 0. The assembler
reinstates automatic alignment at the next .text, .data, .rdata, or
.sdata directive.
.ascii string [, string]...
Assembles each string from the list into successive locations. The
.ascii directive does not null-pad the string. You MUST put
quotation marks (") around each string. You can use the backslash
escape characters. For a list of the backslash characters, see
Chapter 4.
.asciiz string [, string]...
Assembles each string in the list into successive locations and
adds a null. You can use the backslash escape characters. For a
list of the backslash characters, see Chapter 4.
.asm0
Tells the assembler’s second pass that this assembly came from
the first pass. (For use by compilers.)
.bgnb symno
(For use by compilers.) Sets the beginning of a language block.
The .bgnb and .endb directives delimit the scope of a variable set.
The scope can be an entire procedure, or it can be a nested scope
(for example a “{}” block in the C language). The symbol number
symno refers to a dense number in a .T file. For an explanation of
.T files, see the RISCompiler and C Programmer’s Guide. To set
the end of a language block, see .endb.
Truncates the expressions from the comma-separated list to 8-bit
values, and assembles the values in successive locations. The
.double expression [ , expression2] ...[, expressionN]
Initializes memory to 64-bit floating point numbers. The operands
can optionally have the form: expression1 [ : expression2 ]. The
expression1 is the floating point value. The optional expression2
is a non-negative expression that specifies a repetition count. The
expression2 replicates expression1’s value expression2 times.
This directive automatically aligns its data and any preceding
labels to a double-word boundary. You can disable this feature by
using .align 0.
.dword expression [ , expression2 ] ...[, expressionN]
Truncates the expressions in the comma-separated list to 64 bits
and assembles the values in successive locations. The
expressions must be absolute. The operands can optionally have
the form: expression1 [:expression2]. The expression2 replicates
expression1’s value expression2 number of times. The directive
automatically aligns its data and preceding labels to a doubleword
boundary. You can disable this feature by using .align 0.
.end [proc_name]
Sets the end of a procedure. Use this directive when you want to
generate information for the debugger. To set the beginning of a
procedure, see .ent.
.endb symno
Sets the end of a language block. To set the beginning of a
language block, see .bgnb.
.endr
Signals the end of a repeat block. To start a repeat block, see
.repeat.
.ent proc_name
Sets the beginning of the procedure proc_name. Use this directive
when you want to generate information for the debugger. To set
the end of a procedure, see .end.
.extern name expression
name is a global undefined symbol whose size is assumed to be
expression bytes. The advantage of using this directive, instead of
permitting an undefined symbol to become global by default, is
that the assembler can decide whether to use the economical
$gp-relative addressing mode, depending on the value of the -G
option. As a special case, if expression is zero, the assembler
refrains from using $gp to address this symbol regardless of the
size specified by -G.
.err
Signals an error. Any compiler front-end that detects an error
condition puts this directive in the input stream. When the
assembler encounters a .err, it quietly ceases to assemble the
source file. This prevents the assembler from continuing to
process a program that is incorrect. (For use by compilers.)
.file file_number file_name_string
Specifies the source file corresponding to the assembly
instructions that follow. For use only by compilers, not by
programmers; when the assembler sees this, it refrains from
generating line numbers for dbx to use unless it also sees .loc
directives.
.float expression1 [ , expression2 ] ... [, expressionN]
Initializes memory to single precision 32-bit floating point
numbers. The operands can optionally have the form: expression1
[ : expression2 ]. The optional expression2 is a non-negative
expression that specifies a repetition count. This optional form
replicates expression1’s value expression2 times. This directive
automatically aligns its data and preceding labels to a word
boundary. You can disable this feature by using .align 0.
.fmask mask offset
Sets a mask with a bit turned on for each floating point register
that the current routine saved. The least-significant bit
corresponds to register $f0. The offset is the distance in bytes
from the virtual frame pointer at which the floating point registers
are saved. The assembler saves higher register numbers closer to
the virtual frame pointer. You must use .ent before .fmask and
only one .fmask may be used per .ent. Space should be allocated
.gpword local-sym
This directive is similar to .word except that the relocation entry
for local-sym has the R_MIPS_GPREL32 type. After linkage, this
results in a 32-bit value that is the distance between local-sym and
“gp”. local-sym must be local. This directive is used by the code
generator for PIC switch tables.
.half expression1 [ , expression2 ] ... [ , expressionN]
Truncates the expressions in the comma-separated list to 16-bit
values and assembles the values in successive locations. The
expressions must be absolute. This directive can optionally have
the form: expression1 [ : expression2 ]. The expression2
replicates expression1’s value expression2 times. This directive
automatically aligns its data appropriately. You can disable this
feature by using .align 0.
.lab label_name
Associates a named label with the current location in the program
text. (For use by compilers.)
.lcomm name, expression
Makes the name’s data type bss. The assembler allocates the
named symbol to the bss area, and the expression defines the
named symbol’s length. If a .globl directive also specifies the
name, the assembler allocates the named symbol to external bss.
The assembler puts bss symbols in one of two bss areas. If the
defined size is smaller than (or equal to) the size specified by the
assembler or compiler’s -G command line option, the assembler
puts the symbols in the sbss area and uses $gp to address the
data.
.livereg int_bitmask fp_bitmask
For use by compilers. Affects the next jump instruction even if it is
not the successive instruction. The .livereg directive may come
before any of the following instructions: JAL, JR, and SYSCALL.
By default, external J instructions and JR instructions through a
register other than $ra are treated as external calls; that is, all
registers are assumed live. The directive .livereg cannot appear
before an external J (it will affect the next JR, JAL, or SYSCALL
instead of the J instruction). .livereg may appear before a JR
instruction through a register other than $ra. The directive can’t be
used before a BREAK instruction. For BREAK instructions, the
assembler also assumes all registers are live.
.loc file_number line_number
Specifies the source file and the line within that file that
corresponds to the assembly instructions that follow. The
assembler ignores the file number when this directive appears in
the assembly source file. Then, the assembler assumes that the
directive refers to the most recent .file directive. When a .loc
directive appears in the binary assembly language .G file, the file
number is a dense number pointing at a file symbol in the symbol
table .T file. For more information about .G and .T files, see the
RISCompiler and C Programmer’s Guide. (For use by
compilers.)
.mask mask, offset
Sets a mask with a bit turned on for each general purpose register
that the current routine saved. Bit one corresponds to register $1.
The offset is the distance in bytes from the virtual frame pointer
where the registers are saved. The assembler saves higher
register numbers closer to the virtual frame pointer. Space should
be allocated for those registers appearing in the mask. If bit zero is
set it is assumed that space is allocated for all 31 registers
regardless of whether they appear in the mask. (For use by
compilers.)
Instructs the assembler to enable or to disable certain options.
Use .set options only for hand-crafted assembly routines. The
assembler has these default options: reorder, macro, and at. You
can specify only one option for each .set directive. You can
specify these .set options:
The nomacro option causes the assembler to print a warning
whenever an assembler operation generates more than one
machine language instruction. You must select the noreorder
option before using the nomacro option; otherwise, an error
results.
The at option lets the assembler use the $at register for macros,
but generates warnings if the source program uses $at. When you
use the noat option and an assembler operation requires the $at
register, the assembler issues a warning message; however, the
noat option does let source programs use $at without issuing
warnings.
.space expression
Advances the location counter by the value of the specified
expression bytes. The assembler fills the space with zeros.
.struct expression
This permits you to lay out a structure using labels plus directives
like .word, .byte, and so forth. It ends at the next segment directive
(.data, .text, etc.). It does not emit any code or data, but defines
the labels within it to have values which are the sum of expression
plus their offsets from the .struct itself.
(symbolic equate)
Takes one of these forms: name = expression or name = register.
You must define the name only once in the assembly, and you
CANNOT redefine the name. The expression must be computable
when you assemble the program, and the expression must involve
operators, constants, and equated symbols. You can use the
name as a constant in any later statement.
.text
Tells the assembler to add subsequent code to the text section.
(This is the default.)
.verstamp major minor
Specifies the major and minor version numbers (for example,
version 0.15 would be .verstamp 0 15).
.vreg register offset symno
(For use by compilers.) Describes a register variable by giving the
offset from the virtual frame pointer and the symbol number
symno (the dense number) of the surrounding procedure.
.word expression1 [, expression2 ] ... [ , expressionN]
Truncates the expressions in the comma-separated list to 32 bits
and assembles the values in successive locations. The
expressions must be absolute. The operands can optionally have
the form: expression1 [ : expression2 ]. The expression2
replicates expression1’s value expression2 times. This directive
automatically aligns its data and preceding labels to a word
boundary. You can disable this feature by using .align 0.
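The alignment arithmetic behind the .align entry in Table 8-1 can be written out in a few lines. The following C sketch only models the rounding of a location counter; the function name align_counter is hypothetical, and the code does not model the special meaning of .align 0 (disabling automatic alignment).

```c
#include <stdint.h>

/* Advance a location counter until its low `n` bits are zero, that is,
   round it up to the next multiple of 2^n. n must be in the range 0..31. */
uint32_t align_counter(uint32_t counter, unsigned n)
{
    uint32_t mask = ((uint32_t)1 << n) - 1;
    return (counter + mask) & ~mask;
}
```

For example, a counter of 5 becomes 8 under the implicit .align 2 of a .word directive, and a counter of 9 becomes 16 under the implicit .align 3 of a .double directive. A counter that is already aligned is left unchanged.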
9
This chapter provides information on the object file format and has the
following major topics:
• An overview of the components that make up the object file, and the
differences between the MIPS object-file format and the UNIX
System V common object file format (COFF).
• A description of the headers and sections of the object file. Detailed
information is given on the logic followed by the assembler and link
editor in handling relocation entries.
• The format of object files (OMAGIC, NMAGIC, ZMAGIC, and
LIBMAGIC), and information used by the system loader in loading
object files at run-time.
• Archive files and link editor defined symbols.
Chapter 9
Overview
The assembler and the link editor generate object files that have sections
ordered as shown in Figure 9-1. Any areas empty of data are omitted, except
that the File Header, Optional Header, and Section Header are always present.
The sections of the Symbol table portion (indicated in Figure 9-1) that appear
in the final object file format vary, as follows:
• The Line Numbers, Optimization Symbols, and Auxiliary Symbols
tables appear only when debugging is on (when the user specifies one
of the compiler –g1, –g2 or –g3 options).
• When the user specifies the –x option (strip non-globals) for the link
edit phase, the link editor strips the Line Number, Local Symbols,
Optimization Symbols, Auxiliary Symbols, Local Strings, and
Relative File Descriptor tables from the object file, and updates the
Procedure Descriptor table.
• The link editor strips the entire Symbol table from the object file
when the user specifies the –s option (strip) for the link edit phase.
Any new assembler or link editor designed to work with the compiler system
should lay out the object file sections in the order shown in Figure 9-1. The
link editor can process object files that are ordered differently, but
performance may be degraded.
[Figure 9-1: object file sections, in order:
File Header
Optional Headers
Section Headers
Section Data: text, initialization text, read-only data, large data,
8-byte literal pool, 4-byte literal pool, small data, small bss (0 size),
large bss (0 size), shared library info., ucode (ucode objects only)
Section Relocation Information: text, read-only data, large data,
small data
Symbol table: Symbolic Header, Comments, Line Numbers*, Dense Numbers
(ucode objects only), Procedure Descriptor Table, Local Symbols,
Optimization Symbols*, Auxiliary Symbols*, Local Strings, External Strings]
f_symptr points to the Symbolic Header of the Symbol table, and f_nsyms
gives the size of the header. For a description of the Symbolic Header, see
Chapter 10.
NOTE: The “_2” magic numbers are defined for mips2 object files. They
cannot be used on a mips1 implementation.
NOTE: The “_3” magic numbers are defined for mips3 object files. They
cannot be used on a mips1 or mips2 implementation.
Flags (f_flags)
The f_flags field describes the object file characteristics. Table 9-3 describes
the flags and gives their hexadecimal values. The table notes those flags that
do not apply to compiler system object files.
Optional Header
The link editor and the assembler fill in the Optional Header, and the system
(kernel) loader (or other program that loads the object module at run-time)
uses the information it contains, as described in the section Loading Object
Files in this chapter.
Table 9-4 shows the format of the Optional Header (defined in the header file
aouthdr.h).
See the Object Files section in this chapter for information on the format of
OMAGIC, NMAGIC, ZMAGIC, and LIBMAGIC files.
Section Headers
Table 9-6 shows the format of the Section Header (defined in the header file
scnhdr.h).
_DSOLIST ".dsolist" * Dynamic shared object list table.
_CONFLICT ".conflict" * Additional dynamic linking information.
_REGINFO ".reginfo" * Register usage information.
* These sections exist only in ZMAGIC type files and are used during
dynamic linking.
Flags (s_flags)
Table 9-8 shows the defined values for s_flags; the header file scnhdr.h
contains the definitions (those flags that are not used by compiler system
object files are noted).
s_nreloc contains the value 0xffff and the s_flags field has the
S_NRELOC_OVFL flag set; the value true is in the r_vaddr field of the first
relocation entry for that section. That relocation entry has a type of R_ABS
and all other fields are zero, causing it to be ignored under normal
circumstances.
NOTE: For performance reasons, the link editor uses the s_flags entry
instead of s_name to determine the type of section. However, the link editor
does correctly fill in the s_name entry.
union gp_table {
struct {
long current_g_value; /* actual value */
long unused;
} header;
struct {
long g_value; /* hypothetical value */
long bytes; /* section size corresponding */
/* to hypothetical value */
} entry;
};
Section Data
The .text section contains the machine instructions that are to be executed;
the .rdata, .data, .lit8, .lit4, and .sdata sections contain initialized data;
and the .sbss and .bss sections reserve space for uninitialized data that the
kernel loader creates for the program before execution and fills with zeros.
[Figure 9-2: organization of section data:
text segment: .dynamic, .liblist, .rel.dyn, .conflict, .dynstr, .dynsym,
.hash, .rdata, .text, .init, .fini
data segment: .data, .lit8, .lit4, .sdata, .got
bss segment: .sbss, .bss]
As noted in Figure 9-2, the sections are grouped into the text segment
(containing the .text, .init, and .fini sections), the data segment (.rdata, .data,
.lit8, .lit4, and .sdata), and the bss segment (.sbss and .bss). A section is
described by and referenced through the Section Header; the Optional Header
provides the same information for segments.
The link editor references the data shown in Figure 9-2 both as sections and
segments, through the Section Header and Optional Header respectively.
However, the system (kernel) loader, when loading the object file at run-time,
references the same data only by segment, through the Optional Header.
Symbol Value Description
R_SN_TEXT 1 .text section.
R_SN_INIT 7 .init section.
R_SN_RDATA 2 .rdata section.
R_SN_DATA 3 .data section.
R_SN_SDATA 4 .sdata section.
R_SN_SBSS 5 .sbss section.
R_SN_BSS 6 .bss section.
R_SN_LIT8 8 .lit8 section.
R_SN_LIT4 9 .lit4 section.
R_SN_FINI 12 .fini section.
[Figure: external relocation entry (labels: value=0, r_extern=1, constant)]
Sets r_extern to 1.
NOTE: The assembler always sets the value of the undefined entry in
External Symbols to 0. It may assign a constant value to be added to the
relocated value at the address where the relocation is to be done. If the
width of the constant is less than a full word, and an overflow occurs after
relocation, the link editor flags this as an error.
When the link editor determines that an external symbol is defined, it changes
the Relocation Table entry for the symbol to a local relocation entry. Figure
9-4 gives an overview of the new entry.
[Figure 9-4: local relocation entry (labels: r_type, s_vaddr, r_extern=0)]
Sign-extended to 32 bits.
Examples
The examples that follow use external relocation entries.
Example 1: 32-Bit Reference – R_REFWORD. This example shows
assembly statements that set the value at location b to the global data value y.
.globl y
.data
b: .word y # R_REFWORD relocation type at address b for
           # symbol y
sign-extending it), and places the results back into the low 26 bits at address c.
R_JMPADDR relocation entries are produced for the assembler’s j (Jump)
and jal (Jump and Link) instructions. These instructions take the high four
bits of the target address from the address of the delay slot of their instruction.
The link editor makes sure that the same four bits are in the target address
after relocation; if not, it generates an error message.
If the entry is a local relocation type, the target of the Jump instruction is
assembled in the instruction at the address to be relocated. The high four bits
of the jump target are taken from the high 4 bits of the address of the delay
slot of the instruction to be relocated.
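The handling of the 26-bit jump field and the high four bits described above can be sketched in C. This is a minimal illustration of the arithmetic, not the link editor's actual code; the function names are hypothetical, and instruction words are treated as plain 32-bit integers.

```c
#include <stdint.h>

/* Compute the target of a j/jal instruction word located at `instr_addr`.
   The low 26 bits of the instruction hold the target's word index; the high
   four bits come from the address of the delay slot (instr_addr + 4). */
uint32_t jump_target(uint32_t instr, uint32_t instr_addr)
{
    uint32_t delay_slot = instr_addr + 4;
    return (delay_slot & 0xF0000000u) | ((instr & 0x03FFFFFFu) << 2);
}

/* Return 1 if `target` is reachable from a jump at `instr_addr`, that is,
   if the high four bits of the target match those of the delay slot
   address. A link editor would report an error when this returns 0. */
int jump_in_range(uint32_t target, uint32_t instr_addr)
{
    return ((target ^ (instr_addr + 4)) & 0xF0000000u) == 0;
}

/* Place `target` back into the low 26 bits of a j/jal instruction word,
   leaving the 6-bit opcode untouched. */
uint32_t patch_jump(uint32_t instr, uint32_t target)
{
    return (instr & 0xFC000000u) | ((target >> 2) & 0x03FFFFFFu);
}
```

For instance, patching a jal word (opcode bits 0x0C000000) at address 0x00400000 with the target 0x00400120 gives 0x0C100048, and jump_target recovers 0x00400120 from it. A target of 0x10000000 fails the range check from that address because its high four bits differ.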
Example 3: High/Low Reference - R_REFHI/R_REFLO. This example
shows an assembler macro that loads the absolute address y, plus a constant,
into Register 6:
lw $r6,0x10008000
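A high/low reference splits a 32-bit address into two 16-bit halves. The following C sketch assumes the usual MIPS expansion, in which the high half is loaded with lui and the low half is a signed 16-bit offset that the hardware sign-extends; the function names refhi, reflo, and recombine are hypothetical. The address 0x10008000 from the example is a case where the high half must be adjusted by one to compensate for the sign extension of the low half.

```c
#include <stdint.h>

/* High half for a lui instruction: 0x8000 is added first so that the carry
   compensates for the sign extension of the low half. */
uint16_t refhi(uint32_t addr)
{
    return (uint16_t)((addr + 0x8000u) >> 16);
}

/* Low half as the signed 16-bit offset that the load will sign-extend. */
int16_t reflo(uint32_t addr)
{
    int32_t v = (int32_t)(addr & 0xFFFFu);
    if (v >= 0x8000)
        v -= 0x10000;
    return (int16_t)v;
}

/* Recombine as the hardware would: (hi << 16) + sign_extend(lo). */
uint32_t recombine(uint16_t hi, int16_t lo)
{
    return ((uint32_t)hi << 16) + (uint32_t)(int32_t)lo;
}
```

For 0x10008000, refhi returns 0x1001 and reflo returns -32768, and recombining the two halves yields 0x10008000 again.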
Object Files
This section describes the object-file formats created by the link editor,
namely the Impure (OMAGIC), Shared Text (NMAGIC), Demand Paged
(ZMAGIC), and target-shared libraries (LIBMAGIC) formats. Before
reading this section, you should be familiar with the format and contents of
the text, data, and bss segments as described in the Section Data section of
this chapter.
NOTE: This chapter discusses the creation of LIBMAGIC files (shared
libraries). These are not to be confused with dynamic shared objects that have
type ZMAGIC. Dynamic shared objects are discussed in Chapters 11 and 12.
The following constraints are imposed on the address at which an object can
be loaded and the boundaries of its segments. The operating system can
dictate additional constraints.
• Segments must not overlap and all addresses must be less than
0x80000000.
• Space should be reserved for the stack, which starts below
0x80000000 and grows through lower addresses; that is, the value of
each subsequent address is less than that of the previous address.
• The default text segment address for ZMAGIC and NMAGIC files is
0x00400000 and the default data segment address is 0x10000000.
• The default text segment address for OMAGIC files is 0x10000000
with the data segment following the text segment.
• The –B num option (specifying a bss segment origin) cannot be
specified for OMAGIC files; the default, which specifies that the bss
segment follow the data segment, must be used.
• RISC/os requires a 2-megabyte boundary for segments.
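[Figure: segment layout, from high to low addresses: bss segment (.bss,
.sbss); data segment (.sdata, .lit4, .lit8, .data, .rdata); text segment
(.init, .text)]
[Figure: segment layout, from high to low addresses: bss segment (.bss,
.sbss); data segment (.sdata, .lit4, .lit8, .data); text segment (.init,
.fini, .text, .rdata)]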
• Only the start of the text and data segments, using the link editor’s
-T and -D options, can be specified for a shared text format file; the
start of the text and data segments must be a multiple of the pagesize.
[Figure: segment layout, from high to low addresses: data segment (blocked
by pagesize): .sdata, .lit4, .lit8, .data, starting at 256M; empty; fill
area; text segment (blocked by pagesize): .init, .text, .rdata, starting at
4 Mbyte + header; header at 4 Mbyte; empty down to 0]
[Figure: file layout: Symbol Table; fill area; .got, .init, .text, .rdata,
.hash, .rel.dyn, .liblist, .dynamic; headers]
shared library specification
#init bar.o
_libfoo_ext ext
generates these instructions in the .init section:
la $2,ext
sw $2,_libfoo_ext
Initialization instructions are not bounded by any procedure; the initialization
instructions from each .init section are concatenated and the runtime startup
(crt1.o) branches to its label in its .init section. Then the execution falls
through all the concatenated .init sections until reaching crtn.o (the last object
with a .init section) which contains the RETURN instruction.
Object files without shared libraries contain a small .init section that executes,
producing no significant results.
Ucode objects
Ucode objects contain only a file header, the ucode section header, the ucode
section and all of the symbolic information. A ucode section never appears in
a machine code object file.
Optional Header (Table 9-4) contains the size of the text segment and
text_start contains the address at which it is to be loaded.
The starting offset of the data segment follows the text segment. The dsize
field in the Section Header (Table 9-6) contains the size of the data segment;
data_start contains the address at which it is to be loaded.
The system (kernel) loader must fill the .bss segment with zeros. The
bss_start field in the Optional Header specifies the starting address; bsize
specifies the number of bytes to be filled with zeros. In ZMAGIC files, the
link editor adjusts bsize to account for the zero filled area it created in the data
segment that is part of the .sbss or .bss sections.
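The zero fill that the loader performs on the bss segment can be sketched as follows. This C example operates on a flat in-memory image rather than on a real process; the struct bss_info, its field types, and the function zero_bss are assumptions made for illustration, with only the field names bss_start and bsize taken from the text.

```c
#include <stdint.h>
#include <string.h>

/* Subset of the Optional Header fields named in the text. The struct name,
   field types, and layout are assumptions for illustration only. */
struct bss_info {
    uint32_t bss_start;  /* starting address of the bss segment */
    uint32_t bsize;      /* number of bytes to fill with zeros */
};

/* Zero the bss segment inside a flat memory image whose first byte
   corresponds to address `base`. Returns 0 on success, or -1 if the
   segment does not fit inside the image. */
int zero_bss(uint8_t *image, uint32_t base, uint32_t image_size,
             const struct bss_info *h)
{
    if (h->bss_start < base)
        return -1;
    uint32_t off = h->bss_start - base;
    if (off > image_size || h->bsize > image_size - off)
        return -1;
    memset(image + off, 0, h->bsize);
    return 0;
}
```

The bounds checks are a design choice of this sketch: a loader that trusted bss_start and bsize without checking them could write outside the memory it had reserved for the program.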
If the object file itself does not load the global pointer register, the
register must be set to the value of the gp_value field in the Optional
Header (Table 9-4).
The other fields in the Optional Header are gprmask and cprmask[4], whose
bits show the registers used in the .text, .init , and .fini sections. They can be
used by the operating system, if desired, to avoid save register relocations on
context-switch.
Archive files
The link editor can link object files in archives created by the archiver. The
archiver and the format of the archives are based on the UNIX System V
portable archive format. To improve performance, the format of the archive’s
symbol table was changed so that it is a hash table, not a linear list.
The archive hash table is accessed through the ranhashinit() and ranlookup()
library routines in libmld.a, which are documented in the manual page
ranhash(3x). The archive format definition is in the header file ar.h.
The dynamic linker also reserves and defines certain symbols; see Chapters
11 and 12 for more information.
The first three symbols come from the standard UNIX system link editors and
the rest are compiler system specific. The last symbol is used by the start up
routine to set the value of the global pointer, as shown in the following
assembly language statements:
.globl _GP
la $gp,_GP
which would cause the correct value of the global pointer to be loaded.
The link editor symbol _COBOL_MAIN is set to the symbol value of the first
external symbol with the cobol_main bit set. COBOL objects use this
symbol to determine the main routine.
The comments in the header file exception.h describe the routines in that
library.
The Runtime Procedure Table is sorted by procedure address and always has
a dummy entry with a zero address and a 0xffffffff address. When required,
the table is padded with an extra zero entry to ensure that the total number of
entries is an uneven (odd) number.
The Runtime Procedure Table and String Table for the runtime procedure
table are placed at the end of the .data section in the object file.
10
This chapter describes the symbol table and symbol table routines used to
create and make entries in the table. The chapter contains the following major
sections:
• Overview, which gives the purpose of the Symbol table, a summary
of its components, and their relationship to each other.
• Format of Symbol Table Entries, which shows the structures of
Symbol table entries and the values you assign them through the
Symbol Table routines.
• Symbol Table Routine Reference, which lists the symbol table
routines supplied with the compiler and summarizes the function of
each.
NOTE: Third Eye Software, Inc. owns the copyright (dated 1984) to the
format and nomenclature of the Symbol Table used by the compiler system
as documented in this chapter.
Third Eye Software, Inc. grants reproduction and use rights to all parties,
PROVIDED that this comment is maintained in the copy.
Third Eye makes no claims about the applicability of this symbol table to a
particular use.
Chapter 10
Overview
The symbol table is created by the compiler front-end as a stand-alone file.
The purpose of the table is to provide information to the link editor and the
debugger in performing their respective functions. At the option of the user,
the link editor includes information from the Symbol table in the final object
file for use by the debugger. See Figure 9-1 in Chapter 9 for details.
Symbolic Header
Comment Section*
Dense Numbers
Local Symbols
Optimization Symbols
Auxiliary Symbols
Local Strings
External Strings
File Descriptor
The elements that make up the Symbol table are shown in Figure 10-1. The
front-end creates one group of tables (the shaded areas in Figure 10-1) that
contain global information relative to the entire compilation. It also creates a
unique group of tables (the unshaded areas in the figure) for the source file
and each of its include files.
Compiler front-ends, the assembler, and the link editor interact with the
symbol table as summarized below:
• The front-end, using calls to routines supplied with the compiler
system, enters symbols and their descriptions in the table.
• The assembler fills in line numbers and optimization symbols, updates
Local Symbols and External Symbols, and updates the Procedure
Descriptor table.
• The link editor eliminates duplicate information in the External
Symbols and the External Strings tables, removes tables with
duplicate information, updates Local Symbols with relocation
information, and creates the Relative File Descriptor table.
The major elements of the table are summarized in the paragraphs that follow.
Some of these elements are explored in more detail later in the chapter.
Symbolic Header. The Symbolic Header (HDRR for HeadDeR Record)
contains the sizes and locations (as an offset from the beginning of the file)
of the subtables that make up the Symbol Table. Figure 10-2 shows the
symbolic relationship of the header to the other tables.
Symbolic Header
Line Numbers
Dense Numbers
Procedure Descriptor Table
Local Symbols
Optimization Symbols
Auxiliary Symbols
Local Strings
External Strings
File Descriptor Table
Line Numbers. The assembler creates the Line Number table. It creates an
entry for every instruction. Internally, the information is stored in an encoded
form. The debugger uses the entries to map instructions to source lines and
vice versa.
Dense Numbers. The Dense Number table is an array of pairs. An index into
this table is called a dense number. Each pair consists of a file table index (ifd)
and an index (isym) into Local Symbols. The table facilitates symbol look-up
for the assembler, optimizer, and code generator by allowing direct table
access rather than hashing.
Procedure Descriptor Table. The Procedure Descriptor table contains
register and frame information, and offsets into other tables that provide
detailed information on the procedure. The front-end creates the table and
links it to the Local Symbols table. The assembler enters information on
registers and frames. The debugger uses the entries in determining the line
numbers for procedures and frame information for stack traces.
Local Symbols. The Local Symbols table contains descriptions of program
variables, types, and structures, which the debugger uses to locate and
interpret runtime values. The table gives the symbol type, storage class, and
offsets into other tables that further define the symbol.
A unique Local Symbols table exists for every source and include file; the
compiler locates the table through an offset from the file descriptor entry that
exists for every file. The entries in Local Symbols can reference related
information in the Local Strings and Auxiliary Symbols subtables. This
relationship is shown in Figure 10-3.
Figure 10-3: Logical Relationship between the File Descriptor Table and
Local Symbols
The format of an auxiliary entry depends on the symbol type and storage
class. Table entries are required only when the compiler debugging option is
ON.
Local Strings. The Local Strings subtables contain the names of local
symbols.
External Strings. The External Strings table contains the names of external
symbols.
File Descriptor. The File Descriptor table contains one entry for each
source file and each of its include files. (The structure of an entry is given in
Table 10-14 later in this chapter.) The entry is composed of pointers to a
group of subtables related to the file. The physical layout of the subtables is
shown in Figure 10-4.
Local Symbols
Optimization Symbols
Auxiliary Symbols
Local Strings
Relative File Descriptor
The file descriptor entry allows the compiler to access a group of subtables
unique to one file. The logical relationship between entries in this table and
in its subtables is shown in Figure 10-5.
Line Numbers
Procedure Descriptor Table
Local Symbols
Optimization Symbols
Auxiliary Symbols
Local Strings
Relative File Descriptor
Figure 10-5: Logical Relationship between the File Descriptor Table and
Other Tables
Relative File Descriptor. See the section Link Editor Processing later in this
chapter.
Symbolic Header
The structure of the Symbolic Header is shown below in Table 10-1; the
sym.h header file contains the header declaration.
The lower byte of the vstamp field contains LS_STAMP and the upper byte
contains MS_STAMP; both values are defined in the stamp.h header file. If
one of the tables shown in Table 10-1 is not present, its iMax and cbOffset
fields must be set to 0. The magic field must contain the constant magicSym,
defined in symconst.h.
Line Numbers
Table 10-2 shows the format of an entry in the Line Numbers table; the sym.h
header file contains its declaration.
Declaration Name
typedef long LINER, *pLINER
The line number section in the Symbol table is rounded to the nearest four-
byte boundary.
Line numbers map executable instructions to source lines; one line number is
stored for each instruction associated with a source line. It is stored as a long
integer in memory and in packed format on disk.
The layout on disk is as follows:
Bits 7-4: Delta
Bits 3-0: Count
The compiler assigns a line number to only those lines of source code that
generate one or more executable instructions.
Delta is a four-bit value in the range –7...7 that gives the number of source
lines between the current source line and the previous line that generated
executable instructions. The Delta of the first line number entry is the
displacement from the lnLow field in the Procedure Descriptor Table.
Count is a four-bit field with a value in the range 0...15 indicating the number
(1...16) of executable instructions associated with a source line. If more than
16 instructions (15+1) are associated with a source line, new line number
entries are generated with Delta = 0.
An extended format of the line number entry is used when Delta is outside the
range of –7...7.
The layout of the extended field on disk is as follows:
Byte 0, bits 7-4: the constant 1000 (binary), shown as -8 in Table 10-3
Byte 0, bits 3-0: Count
Bytes 1-2: Delta
1 #include <stdio.h>
2 main ()
3 {
4 char c;
5
6 printf ("this program just prints its input\n");
7 while ((c = getc(stdin)) != EOF) {
8 /* this is a greater than a seven line comment
9 * 1
10 * 2
11 * 3
12 * 4
13 * 5
14 * 6
15 * 7
16 */
17 printf("%c", c);
18 } /* end while */
19 } /* end main */
Figure 10-7 shows the instructions generated for lines 3, 7, 17, 18, and 19.
Table 10-3 shows the compiler-generated liner entries for each source line.
Source Line    Liner Contents    Meaning
3              02                delta 0, count 2
6              31                delta 3, count 1
7              1f                delta 1, count 15
7              03                delta 0, count 3
17             82 00 0a          -8 (extended format), count 2, delta 10
18             1f                delta 1, count 15
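The packed encoding can be illustrated with a short decoding routine. The following C sketch is not part of the compiler system: the type liner_entry and the function liner_decode are hypothetical names chosen for this example. It assumes that the two bytes following the 1000 marker hold a signed 16-bit Delta with the high-order byte first, as the entry for source line 17 suggests.

```c
#include <assert.h>
#include <stddef.h>

/* One decoded line number entry (hypothetical type, not from sym.h). */
struct liner_entry {
    int delta;  /* source lines since the previous entry */
    int count;  /* raw Count field; instructions covered = count + 1 */
};

/*
 * Decode the packed entry that starts at p[0]; n is the number of bytes
 * available. Returns the number of bytes consumed (1 or 3), or 0 if the
 * buffer is too short.
 * Assumption: the extended Delta is signed, 16 bits, high byte first.
 */
size_t liner_decode(const unsigned char *p, size_t n, struct liner_entry *out)
{
    assert(p != NULL && out != NULL);
    if (n < 1)
        return 0;

    int hi = p[0] >> 4;        /* Delta nibble, bits 7-4 */
    out->count = p[0] & 0x0f;  /* Count nibble, bits 3-0 */

    if (hi != 0x8) {
        /* Short form: sign-extend the four-bit Delta (-7...7). */
        out->delta = (hi & 0x8) ? hi - 16 : hi;
        return 1;
    }

    /* Extended form: the 1000 marker is followed by two bytes of Delta. */
    if (n < 3)
        return 0;
    int wide = (p[1] << 8) | p[2];
    out->delta = (wide & 0x8000) ? wide - 0x10000 : wide;
    return 3;
}
```

Applied to the bytes 82 00 0a from the table, the routine consumes three bytes and reports a Delta of 10 with a Count field of 2. A caller would advance through the line number section by the returned byte count.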
*If NULL, and cycm field in file descriptor table = 0, then this field is indexed
to the actual table.
Local Symbols
Table 10-5 shows the format of an entry in the Local Symbols table; the sym.h
header file contains its declaration.
The meanings of the fields in a local symbol entry are explained in the
following paragraphs.
iss. The iss (for index into string space) is an offset from the issBase field of
an entry in the file descriptor table, to the name of the symbol.
value. An integer representing an address, a size, or an offset from a frame
pointer. The value is determined by the symbol type, as illustrated in Table 10-6.
st and sc. The symbol type (st) defines the symbol; the storage class (sc),
where applicable, explains how to access the symbol type in memory. The
valid st and sc constants are given in Table 10-7 and Table 10-8. These
constants are defined in symconst.h.
index. The index is an offset into either Local Symbols or Auxiliary Symbols,
depending on the symbol type (st) as shown in Table 10-6. The compiler uses
isymBase in the file descriptor entry as the base for a Local Symbol entry and
iauxBase for an Auxiliary Symbols entry.
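The base-plus-offset arithmetic described for iss and index can be sketched in C. The structures below are hypothetical, trimmed-down stand-ins for the records declared in sym.h; only the fields named in the text are kept. The sketch also assumes that the Local Strings of all files are held in one contiguous array, so that issBase selects the start of one file's strings.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of a file descriptor entry. */
struct file_desc {
    long issBase;   /* start of this file's strings in Local Strings */
    long isymBase;  /* start of this file's entries in Local Symbols */
    long iauxBase;  /* start of this file's entries in Auxiliary Symbols */
};

/* Hypothetical subset of a local symbol entry. */
struct local_sym {
    long iss;    /* offset of the name, relative to issBase */
    long index;  /* file-relative index; meaning depends on st */
};

/* Name of a symbol: issBase + iss is an offset into Local Strings. */
const char *local_sym_name(const char *local_strings,
                           const struct file_desc *fd,
                           const struct local_sym *sym)
{
    assert(local_strings != NULL && fd != NULL && sym != NULL);
    return local_strings + fd->issBase + sym->iss;
}

/* Position in the whole Local Symbols table of a file-relative index. */
long local_sym_position(const struct file_desc *fd, long index)
{
    assert(fd != NULL);
    return fd->isymBase + index;
}

/* Position in the whole Auxiliary Symbols table of a file-relative index. */
long aux_sym_position(const struct file_desc *fd, long index)
{
    assert(fd != NULL);
    return fd->iauxBase + index;
}
```

Which of the two position functions applies to a given entry depends on its symbol type, as Table 10-6 indicates.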
Table 10-6: Index and Value as a Function of Symbol Type and Storage
Class
The link editor ignores all symbols except the types stProc, stStatic, stLabel,
and stStaticProc, which it relocates. Other symbols are used only by the
debugger, and need be entered in the table only when the compiler debugger
option is ON.
Symbol Type (st). Table 10-7 gives the allowable constants that can be
specified in the st field of Local Symbols entries; the symconst.h header file
contains the declaration for the constants.
Storage Class (sc) Constants. Table 10-8 gives the allowable constants that
can be specified in the sc field of Local Symbols entries; the symconst.h
header file contains the declaration for the constants.
reserved 7
scBits 8 This is a bit field
scDbx 9 Dbx internal use
scRegImage 10 Register value saved on stack
scInfo 11 Symbol contains debugger information
Optimization Symbols
Reserved for future use.
Auxiliary Symbols
Table 10-9 shows the format of an entry, which is a union, in Auxiliary
Symbols; the sym.h file contains its declaration.
All of the fields except the ti field are explained in the order they appear in
the above layout. The ti field is explained last.
rndx. Relative File Index. The front-end fills this field in describing
structures, enumerations, and other complex types. The relative file index is
a pair of indexes. One index is an offset from the start of the File Descriptor
table to one of its entries. The second is an offset from the file descriptor entry
to an entry in the Local Symbols or Auxiliary Symbols table.
dnLow. Low Dimension of Array.
dnHigh. High Dimension of Array.
isym. Index into Local Symbols. This index is always an offset to an stEnd
entry denoting the end of a procedure.
width. Width of Structured Fields.
count. Range Count. Used in describing case variants. Gives how many
elements are separated by commas in a case variant.
ti. Type Information Record. Table 10-10 shows the format of a ti entry; the
sym.h file contains its declaration.
All groups of auxiliary entries have a type information record with the
following entries:
• fbitfield – Set if the basic type (bt) is of non-standard width.
• bt (for basic type) specifies whether the symbol is an integer, a real or
complex number, a structure, etc. The valid entries for this field are shown in
tqVol 5 Volatile
tqMax 8
External Symbols
The External Symbols table has the same format as Local Symbols, except an
offset (ifd) field into the File Descriptor table has been added. This field is
used to locate information associated with the symbol in an Auxiliary
Symbols table. Table 10-14 shows the format of an entry in External
Symbols; the sym.h file contains its declaration.
long cbLine
short reserved Reserved for future use
short ifd Pointer to file descriptor entry
SYMR asym Same as Local Symbols.
Chapter 11
11
This chapter describes the Execution and Linking Format (ELF) for object
files. The following topics are covered:
• The Components of an ELF object file
• Symbol Table Format
• Global Data Area
• Register Information
• Relocation
Program loading and dynamic linking are discussed in Chapter 12.
There are three types of object files:
• Relocatable files contain code and data and are linked with other
object files to create an executable file or shared object file.
• Executable files contain a program that can be executed.
• Shared object files contain code and data that can be linked. These
files may be linked with relocatable or shared object files to create
other object files. They may also be linked with an executable file
and other shared objects to create a process image.
ELF header
Section 1
.
.
.
Section n
Each object file begins with an ELF header that describes the file. Sections
contain information that is used when the file is linked with other objects (e.g.
code, data, relocation information). The Section Header Table contains
information describing the sections of the file and has an entry for each
section. Files that will be linked with other objects must contain a Section
Header Table.
If the Program Header Table is present, it contains information that is used to
create a process image. Files used to build an executable program must have
a Program Header Table; relocatable files do not need one.
ELF Header
The ELF header has the following format:
Declaration Field
unsigned char e_ident[EI_NIDENT];
Elf32_Half e_type;
Elf32_Half e_machine;
Elf32_Word e_version;
Elf32_Addr e_entry;
Elf32_Off e_phoff;
Elf32_Off e_shoff;
Elf32_Word e_flags;
Elf32_Half e_ehsize;
Elf32_Half e_phentsize;
Elf32_Half e_phnum;
Elf32_Half e_shentsize;
Elf32_Half e_shnum;
Elf32_Half e_shstrndx;
#define EI_NIDENT 16
e_ident
contains machine-independent data concerning the file contents.
The index values for the e_ident member are:
EI_MAG0 0 File identification.
EI_MAG1 1 File identification.
EI_MAG2 2 File identification.
EI_MAG3 3 File identification.
EI_CLASS 4 File class.
EI_DATA 5 Byte order.
EI_VERSION 6 File version.
EI_PAD 7 Start of padding bytes.
EI_NIDENT 16 Size of e_ident[].
e_ident[EI_CLASS]
indicates the file class or capacity and must have the value
ELFCLASS32.
e_ident[EI_DATA]
indicates the byte ordering of processor-specific data in the object file
and must be either ELFDATA2LSB (little-endian byte order) or
ELFDATA2MSB (big-endian byte order).
e_ident[EI_VERSION]
indicates the version number of the ELF header and must be
EV_CURRENT.
e_ident[EI_PAD]
marks the beginning of unused bytes in the ELF header. These bytes
are reserved and set to zero.
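The constraints on e_ident can be collected into a single check. The C sketch below is illustrative only; the function name elf_ident_ok is hypothetical. The four identification bytes (0x7f, 'E', 'L', 'F') and the numeric values of ELFCLASS32, ELFDATA2LSB, and ELFDATA2MSB are taken from the generic ELF specification, since this chapter does not list them.

```c
#include <assert.h>
#include <stddef.h>

/* Index values for e_ident, as listed above. */
enum {
    EI_MAG0, EI_MAG1, EI_MAG2, EI_MAG3,
    EI_CLASS, EI_DATA, EI_VERSION, EI_PAD,
    EI_NIDENT = 16
};

/* Values from the generic ELF specification (assumed, see lead-in). */
enum { ELFCLASS32 = 1, ELFDATA2LSB = 1, ELFDATA2MSB = 2, EV_CURRENT = 1 };

/* Returns 1 when ident satisfies the constraints described above, else 0. */
int elf_ident_ok(const unsigned char ident[EI_NIDENT])
{
    assert(ident != NULL);

    /* File identification bytes. */
    if (ident[EI_MAG0] != 0x7f || ident[EI_MAG1] != 'E' ||
        ident[EI_MAG2] != 'L'  || ident[EI_MAG3] != 'F')
        return 0;

    /* Only the 32-bit class is permitted. */
    if (ident[EI_CLASS] != ELFCLASS32)
        return 0;

    /* Either byte order is acceptable, but one must be declared. */
    if (ident[EI_DATA] != ELFDATA2LSB && ident[EI_DATA] != ELFDATA2MSB)
        return 0;

    return ident[EI_VERSION] == EV_CURRENT;
}
```

The padding bytes are deliberately not examined here; a stricter reader could also verify that they are zero.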
e_type
identifies the type of the object file and can have the following values:
ET_NONE 0 No file type
ET_REL 1 Relocatable
ET_EXEC 2 Executable
ET_DYN 3 Shared object
ET_CORE 4 Core file
ET_LOPROC 0xff00 Processor specific
ET_HIPROC 0xffff Processor specific
e_machine
indicates the required architecture and must have the value
EM_MIPS.
e_version
indicates the object file version and must have the value
EV_CURRENT. The value of EV_CURRENT is 1; in the future, this
value will increase as extensions are added to the ELF header.
e_entry
contains the virtual address to which the system transfers control
when the process is started. If the file has no entry point, this value is
zero.
e_phoff
contains the offset in bytes of the Program Header Table and may be
zero if the table is not present.
e_shoff
contains the offset in bytes of the Section Header Table. If the file
has no Section Header Table, its value is zero.
e_flags
contains bit flags associated with the file. The following flags are
defined:
EF_MIPS_NOREORDER 0x00000001
EF_MIPS_PIC 0x00000002
EF_MIPS_CPIC 0x00000004
EF_MIPS_ARCH 0xf0000000
EF_MIPS_ARCH_2 0x10000000
EF_MIPS_ARCH_3 0x20000000
EF_MIPS_NOREORDER is asserted when at least one .noreorder directive in an
assembly source program contributes to the object module.
If EF_MIPS_PIC is set, the file contains position-independent code
that is relocatable.
If EF_MIPS_CPIC is set, the file contains code that conforms to the
standard calling sequence rules for calling position-independent code.
The code in this file is not necessarily position-independent.
The bits indicated by EF_MIPS_ARCH identify extensions to the
MIPS1 architecture. An ABI-compliant file must have zero in these
four bits.
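As an illustration of how the e_flags bits might be tested, the following C sketch unpacks the word into separate fields. The structure mips_eflags and the function decode_e_flags are hypothetical names. Treating a zero EF_MIPS_ARCH field as architecture level 1 is this example's reading of the text above, not a definition taken from a header file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EF_MIPS_NOREORDER 0x00000001u
#define EF_MIPS_PIC       0x00000002u
#define EF_MIPS_CPIC      0x00000004u
#define EF_MIPS_ARCH      0xf0000000u
#define EF_MIPS_ARCH_2    0x10000000u
#define EF_MIPS_ARCH_3    0x20000000u

/* Decoded view of e_flags (hypothetical type for this example). */
struct mips_eflags {
    int noreorder;   /* a .noreorder directive contributed to the module */
    int pic;         /* position-independent code */
    int cpic;        /* follows the calling sequence for calling PIC */
    int arch_level;  /* 1, 2, or 3; -1 if the field is not recognized */
    int abi_ok;      /* architecture field is zero, as the ABI requires */
};

void decode_e_flags(uint32_t e_flags, struct mips_eflags *out)
{
    assert(out != NULL);
    out->noreorder = (e_flags & EF_MIPS_NOREORDER) != 0;
    out->pic       = (e_flags & EF_MIPS_PIC) != 0;
    out->cpic      = (e_flags & EF_MIPS_CPIC) != 0;

    switch (e_flags & EF_MIPS_ARCH) {
    case 0:              out->arch_level = 1;  break;  /* no extensions */
    case EF_MIPS_ARCH_2: out->arch_level = 2;  break;
    case EF_MIPS_ARCH_3: out->arch_level = 3;  break;
    default:             out->arch_level = -1; break;
    }
    out->abi_ok = (e_flags & EF_MIPS_ARCH) == 0;
}
```

EF_MIPS_ARCH is a four-bit field rather than a single flag, so it is masked and compared as a whole instead of being tested bit by bit.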
e_ehsize
contains the size in bytes of the ELF header.
e_phentsize
contains the size in bytes of an entry in the file’s Program Header
Table.
e_phnum
indicates the number of entries in the Program Header Table. If a file
has no Program Header Table, this value is zero. The product of
e_phnum and e_phentsize gives the size in bytes of the Program
Header Table.
e_shentsize
contains the size in bytes of an entry in the Section Header Table
(also referred to as a Section Header).
e_shnum
indicates the number of entries in the Section Header Table. If a file
has no Section Header Table, this value is zero. The product of
Sections
Each section has a section header (an entry in the Section Header Table).
There may be entries in the Section Header Table that have no associated
section. Each section occupies a contiguous, possibly empty, sequence of
bytes and may not overlap any other section.
SHN_COMMON
indicates that the corresponding references are common symbols,
such as FORTRAN COMMON or unallocated C external variables.
SHN_HIRESERVE
specifies the upper bound of the reserved indexes. The Section
Header Table does not contain entries for the reserved indexes.
SHN_MIPS_ACOMMON
indicates that the corresponding references are common symbols. The
st_value member for a common symbol contains its virtual address. If
the section is relocated, the alignment indicated by the virtual address
is preserved, up to modulo 65536.
SHN_MIPS_SCOMMON
indicates that the corresponding references are common symbols
which can be placed in the global data area (are gp-addressable). This
section only occurs in relocatable object files.
SHN_MIPS_SUNDEFINED
Undefined symbols with this special index in the st_shndx field can
be placed in the global data area (are gp-addressable). This section
only occurs in relocatable object files.
Section Header
A section header (an entry in the Section Header Table) has the following
structure:
Declaration Field
Elf32_Word sh_name;
Elf32_Word sh_type;
Elf32_Word sh_flags;
Elf32_Addr sh_addr;
Elf32_Off sh_offset;
Elf32_Word sh_size;
Elf32_Word sh_link;
Elf32_Word sh_info;
Elf32_Word sh_addralign;
Elf32_Word sh_entsize;
sh_name
specifies the name of the section. Its value is an index into the section
header string table section and gives the location of a null terminated
string that is the name of the section.
sh_type
indicates the type of the section and may have the following values:
Name Value
SHT_NULL 0
SHT_PROGBITS 1
SHT_SYMTAB 2
SHT_STRTAB 3
SHT_RELA 4
SHT_HASH 5
SHT_DYNAMIC 6
SHT_NOTE 7
SHT_NOBITS 8
SHT_REL 9
SHT_SHLIB 10
SHT_DYNSYM 11
SHT_LOPROC 0x70000000
SHT_HIPROC 0x7fffffff
SHT_LOUSER 0x80000000
SHT_HIUSER 0xffffffff
SHT_MIPS_LIBLIST 0x70000000
SHT_MIPS_CONFLICT 0x70000002
SHT_MIPS_GPTAB 0x70000003
SHT_MIPS_UCODE 0x70000004
SHT_NULL
marks the section header as inactive. There is no associated section.
Other members of the section header have undefined values.
SHT_PROGBITS
indicates that the section contains information defined by the
program. The format and meaning of the information are determined
by the program.
SHT_SYMTAB and SHT_DYNSYM
sections contain a symbol table. An object file may have only one
section of each type. SHT_SYMTAB contains symbols used in link
editing, but may also be used for dynamic linking. It may contain
many symbols unnecessary for dynamic linking. Consequently, an
object may also contain a SHT_DYNSYM section that contains a
minimal set of dynamic linking symbols.
SHT_STRTAB
indicates that the section holds a string table. An object file may have
multiple string table sections.
SHT_RELA
indicates that the section contains relocation entries with explicit
addends, such as type Elf32_Rela for the 32-bit class of object files.
SHT_MIPS_CONFLICT
marks a section that contains a list of symbols in an executable object
whose definitions conflict with symbols defined in shared objects.
SHT_MIPS_GPTAB
indicates that the section contains the global pointer table. The global
pointer table contains a list of possible global data area sizes which
allows the linker to provide the user with information on the optimal
size criteria to use for gp register relative addressing. See the Global
Data Area section of this chapter.
SHT_MIPS_UCODE
indicates that the section contains MIPS ucode instructions.
Other section type values are reserved. The section header for index 0
(SHN_UNDEF) marks undefined section references. This entry has the
following values:
Name Value Note
sh_name 0 No name
sh_type SHT_NULL Inactive
sh_flags 0 No flags
sh_addr 0 No address
sh_offset 0 No file offset
sh_size 0 No size
sh_link SHN_UNDEF No link information
sh_info 0 No auxiliary information
sh_addralign 0 No alignment
sh_entsize 0 No entries
sh_flags
contains bit flags describing attributes of the section. The following flags
are defined:
SHF_WRITE 0x1
SHF_ALLOC 0x2
SHF_EXECINSTR 0x4
SHF_MASKPROC 0xf0000000
SHF_MIPS_GPREL 0x10000000
SHF_WRITE
If this bit is set, the section contains data that must be writable during
process execution.
SHF_ALLOC
This bit indicates that the section occupies memory during process
execution.
SHF_EXECINSTR
If this bit is set, the section contains executable machine instructions.
SHF_MASKPROC
All the bits included in this mask are reserved for processor-specific
semantics.
SHF_MIPS_GPREL
This bit indicates that the section contains data that must be made
part of the global data area during program execution. Data in this
section is addressable with a gp relative address. The sh_link field of
a section with this attribute must be a Section Header Index of a
section of type SHT_MIPS_GPTAB.
sh_addr
If the section appears in the memory image of a process, this member
contains the address of the first byte of the section. Otherwise, its
value is zero.
sh_offset
Contains the byte offset from the beginning of the file of this section.
sh_size
Contains the size of the section in bytes.
sh_link
Contains a Section Header Table index link. The interpretation of this
value depends on the section type (see Table 11-1).
sh_info
Contains miscellaneous information. The interpretation of the value
depends on the section type (see Table 11-1).
SHT_SYMTAB, SHT_DYNSYM
sh_link: The section header index of the associated string table.
sh_info: One greater than the symbol table index of the last local symbol
(bind STB_LOCAL).
SHT_MIPS_LIBLIST
sh_link: The section header index of the string table used by entries in this
section.
sh_info: The number of entries in this section.
SHT_MIPS_GPTAB
sh_link: not used
sh_info: The section header index of the SHF_ALLOC + SHF_WRITE section.
other
sh_link: SHN_UNDEF
sh_info: 0
sh_addralign
Indicates address alignment constraints for the section. For example,
if a section contains a doubleword value, the entire section must be
aligned on a doubleword boundary. The value of this member may be
0 or a positive integral power of 2; 0 or 1 indicates that the section
has no alignment constraints.
sh_entsize
If the section holds a table of fixed-size entries, such as a symbol
table, this member gives the size in bytes of each entry. A value of
zero indicates that the section does not contain a table of fixed-size
entries.
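The size and alignment fields lend themselves to two small calculations: the number of entries in a table section, and whether a section address satisfies its alignment constraint. The C sketch below is a minimal illustration; the structure tag section_header and both function names are hypothetical, and the Elf32_* types are assumed to be 32 bits wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed widths for the ELF types used in the declaration above. */
typedef uint32_t Elf32_Word;
typedef uint32_t Elf32_Addr;
typedef uint32_t Elf32_Off;

struct section_header {
    Elf32_Word sh_name;
    Elf32_Word sh_type;
    Elf32_Word sh_flags;
    Elf32_Addr sh_addr;
    Elf32_Off  sh_offset;
    Elf32_Word sh_size;
    Elf32_Word sh_link;
    Elf32_Word sh_info;
    Elf32_Word sh_addralign;
    Elf32_Word sh_entsize;
};

/* Number of fixed-size entries, or 0 if the section is not a table. */
Elf32_Word section_entry_count(const struct section_header *sh)
{
    assert(sh != NULL);
    return sh->sh_entsize ? sh->sh_size / sh->sh_entsize : 0;
}

/* Returns 1 if sh_addralign is legal and sh_addr honors it, else 0. */
int section_align_ok(const struct section_header *sh)
{
    assert(sh != NULL);
    Elf32_Word a = sh->sh_addralign;

    if ((a & (a - 1)) != 0)
        return 0;  /* neither 0 nor a power of 2 */
    if (a <= 1)
        return 1;  /* 0 or 1: no alignment constraint */
    return (sh->sh_addr & (a - 1)) == 0;
}
```

For a symbol table section, section_entry_count gives the number of symbol table entries, which bounds the indexes that sh_info and relocation entries may refer to.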
Special Sections
An object file has the following special sections:
NOTE: A MIPS ABI compliant system must support the .sdata, .sbss, .lit4,
.lit8, .reginfo, and .gptab sections. A MIPS ABI compliant system must
recognize, but may choose to ignore, the .liblist, .msym, and .conflict sections.
However, if any one of these sections is supported, all three must be
supported. A MIPS ABI compliant system is not required to support the
.ucode section, but if this section is present, it must conform to the description
in this manual.
.bss
This section holds uninitialized data. The system initializes the data
to zeros when the program is started. This section occupies no file
space.
.comment
This section holds version information.
.data
This section contains initialized data.
.debug
This section holds information used for symbolic debugging.
.dynamic
This section contains information used for dynamic linking. See
Chapter 12 for more details on dynamic linking.
.dynstr
This section contains strings needed for dynamic linking, usually
strings representing the names associated with symbol table entries.
.fini
This section holds executable instructions that are executed when
the program terminates normally.
.got
This section holds the Global Offset Table.
.hash
This section contains a hash table for symbols. See the Symbol Table
section of this chapter for a description of the symbol table.
.init
This section holds executable instructions that are executed before
the system calls the main entry point for the program.
.interp
This section holds the path name of a program interpreter. If the file
has a loadable segment that includes the section, the section attributes
include SHF_ALLOC.
.line
This section contains line number information describing the
correspondence between source code lines and machine code. This
.text
This section contains executable instructions.
.sdata
This section holds initialized short data.
.sbss
This section holds uninitialized short data. The system sets the data to
zeros when the program is started. Unlike the .bss section, this
section occupies file space.
.lit4
This section holds 4 byte read-only literals. The section is part of a
non-writable segment in the process image.
.lit8
This section holds 8 byte read-only literals. The section is part of a
non-writable segment in the process image.
.reginfo
This section contains information on the program’s register usage.
.liblist
This section contains information on the libraries used at static link
time.
.conflict
This section provides additional dynamic linking information for
symbols in an executable file that conflict with symbols defined in
the dynamic shared libraries.
.gptab
This section contains a Global Pointer Table. The section is named
.gptab.sbss, .gptab.sdata, .gptab.bss, or .gptab.data depending on the
data section to which the section refers.
.ucode
This section holds U-code instructions generated by the compiler.
.mdebug
This section contains MIPS specific symbol table information. The
contents of this section are described in Chapter 10. The information
in this section is dependent on the location of other sections in the
file. If an object is relocated, this section must be updated. This
section must be discarded if an object file is relocated and the ABI
compliant system chooses not to update the section.
.got
This section contains the Global Offset Table. The sh_info field holds
the Global Pointer value used for this Global Offset Table.
.dynamic
This section is a MIPS-specific dynamic section. It is the same as the
previously mentioned .dynamic section except that its attributes do
String Tables
String table sections contain null-terminated character sequences (strings)
that represent symbol and section names. A string is referenced by an index
into the String Table Section.
The first byte of a string section, accessed by index zero, contains a null
character. The last byte also contains a null character, ensuring that all strings
are null terminated. A string whose index is zero specifies either no name or
a null name, depending on the context.
A String Table Section may be empty. In this case, the sh_size field for the
section would contain zero. Non-zero indexes are invalid for an empty string
table.
The following figure shows an example of a string table:
Index   0   1   2   3   4   5   6   7   8   9
0       \0  a   b   c   d   \0  v   a   r   n
10      a   m   e   \0  f   o   o   \0  b   a
20      r   \0
A string table index may refer to any byte in the section; references to
substrings are permitted. A single string may be referenced multiple times
and unreferenced strings may exist.
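The indexing rules above can be sketched as a bounds-checked lookup. The function strtab_get is a hypothetical helper written for this example, not a routine supplied with the compiler system. It returns an empty string for index zero of an empty table, which is one possible reading of "no name".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the string that starts at byte `index` of a string table of
 * `size` bytes, or NULL if the index is invalid.
 */
const char *strtab_get(const char *tab, size_t size, size_t index)
{
    assert(tab != NULL || size == 0);

    /* Empty table: only index zero (no name) is meaningful. */
    if (size == 0)
        return index == 0 ? "" : NULL;

    if (index >= size)
        return NULL;

    /* Refuse a string that would run past the end of the section. */
    if (memchr(tab + index, '\0', size - index) == NULL)
        return NULL;

    return tab + index;
}
```

With the example table shown above, index 1 yields "abcd" and index 6 yields "varname". Index 9 points into the middle of that string and yields the substring "name".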
A symbol table index is a subscript into this array. Index zero is the first entry
in the table and is also the undefined symbol index. A symbol table entry has
the following format:
Declaration Name
Elf32_Word st_name;
Elf32_Addr st_value;
Elf32_Word st_size;
unsigned char st_info;
unsigned char st_other;
Elf32_Half st_shndx;
st_name
Holds an index into the symbol string table. If its value is non-zero, it
indicates a string that is the symbol name. Otherwise, the symbol
table entry has no associated name.
st_value
Contains the value of the associated symbol.
st_size
Contains the size (the number of bytes comprising the data object) of
the associated symbol. This value is zero if the symbol has no size or
the size is unknown.
st_info
Specifies the type of the symbol and its binding attributes. The
following code fragment shows how to manipulate the binding and
type:
#define ELF32_ST_BIND(i) ((i)>>4)
#define ELF32_ST_TYPE(i) ((i)&0xf)
#define ELF32_ST_INFO(b,t) (((b)<<4)+((t)&0xf))
A symbol’s binding determines the linkage visibility. The binding extracted
from st_info may be one of the following:
STB_LOCAL 0
STB_GLOBAL 1
STB_WEAK 2
STB_LOPROC 13
STB_HIPROC 15
STB_LOCAL indicates local symbols. These symbols are not visible
outside of the object file containing the definition. Local symbols
with the same name may exist in multiple object files without causing
conflicts.
file.
STB_WEAK indicates weak symbols. Weak symbols are similar to global
symbols, but have lower precedence.
STB_LOPROC through STB_HIPROC values are reserved for processor-
specific semantics.
In each symbol table, all local symbols precede the global and weak
symbols. As indicated in the Section Header section of this chapter,
the sh_info field of the section header contains the symbol table
index for the first non-local symbol. Global and weak symbols differ
in two ways:
• When the link editor combines several relocatable object files, it
does not allow multiple definitions of STB_GLOBAL symbols
with the same name. If a defined global symbol exists, the
appearance of a weak symbol with the same name does not cause
an error. The link editor ignores the weak symbol and uses the
global definition.
• When the link editor searches archive libraries, it extracts
members of the archive that contain definitions of undefined
global symbols. The definition in the extracted member may be
either a global or a weak symbol. The link editor does not extract
archive members to resolve undefined weak symbols; unresolved
weak symbols have a value of zero.
st_other
Contains zero and is currently unused.
st_shndx
Contains the Section Header Table index for the symbol table entry.
Symbol Type
The following symbol types are defined:
Name Value
STT_NOTYPE 0
STT_OBJECT 1
STT_FUNC 2
STT_SECTION 3
STT_FILE 4
STT_LOPROC 13
STT_HIPROC 15
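The following is a minimal sketch of how the binding and type values above might be packed into and extracted from st_info. The helper is_weak_function is a hypothetical name used only for illustration. The packing macro is written with explicit parentheses because, in C, addition binds more tightly than the shift operator.

```c
#include <stdint.h>

/* Binding and type values as listed in this chapter. */
enum { STB_LOCAL = 0, STB_GLOBAL = 1, STB_WEAK = 2 };
enum { STT_NOTYPE = 0, STT_OBJECT = 1, STT_FUNC = 2,
       STT_SECTION = 3, STT_FILE = 4 };

/* The binding occupies the upper four bits, the type the lower four. */
#define ELF32_ST_BIND(i)    ((i) >> 4)
#define ELF32_ST_TYPE(i)    ((i) & 0xf)
#define ELF32_ST_INFO(b, t) (((b) << 4) + ((t) & 0xf))

/* Hypothetical helper: nonzero if st_info describes a weak function. */
int is_weak_function(uint8_t st_info)
{
    return ELF32_ST_BIND(st_info) == STB_WEAK &&
           ELF32_ST_TYPE(st_info) == STT_FUNC;
}
```

For example, ELF32_ST_INFO(STB_GLOBAL, STT_FUNC) yields 0x12.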
Symbol Values
Symbol table entries for different object file types have slightly different
interpretation for the st_value field:
• In relocatable files, st_value contains the alignment constraints for a
symbol whose section index is SHN_COMMON.
• In relocatable files, st_value contains a section offset for a defined
symbol; st_value is an offset from the beginning of the section that
st_shndx indicates.
• In executable and shared object files, st_value contains a virtual
address. The section offset gives way to a virtual address for which
the section number is irrelevant.
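The three cases above can be sketched as a small classification routine. This is only an illustration: the names st_value_kind and classify_st_value are hypothetical, the ET_* and SHN_COMMON values are the standard ELF constants (they are not listed in this section), and the sketch does not handle undefined or absolute symbols.

```c
#include <stdint.h>

/* Standard ELF file types and the common-block section index. */
enum { ET_REL = 1, ET_EXEC = 2, ET_DYN = 3 };
enum { SHN_COMMON = 0xfff2 };

/* Hypothetical result type naming the three interpretations. */
typedef enum {
    VALUE_ALIGNMENT,       /* alignment constraint of a common symbol */
    VALUE_SECTION_OFFSET,  /* offset from the start of section st_shndx */
    VALUE_VIRTUAL_ADDRESS  /* virtual address */
} st_value_kind;

/* Reports how st_value is to be read for a defined symbol. */
st_value_kind classify_st_value(uint16_t e_type, uint16_t st_shndx)
{
    if (e_type == ET_REL) {
        return st_shndx == SHN_COMMON ? VALUE_ALIGNMENT
                                      : VALUE_SECTION_OFFSET;
    }
    /* Executable and shared object files. */
    return VALUE_VIRTUAL_ADDRESS;
}
```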
If an executable file contains a reference to a function defined in a shared object,
the symbol table section for the file contains an entry for that symbol. The
st_shndx field of the symbol table entry for the function contains
SHN_UNDEF. If there is a stub for the function in the executable file, and the
st_value field for the symbol table entry is non-zero, the field contains the
virtual address of the first instruction of the function’s stub. Otherwise, the
st_value field contains zero. This stub is used to call the dynamic linker at
runtime for lazy text evaluation.
contains short data items which can be addressed by the gp register relative
addressing mode. The global data area comprises all the sections with the
SHF_MIPS_GPREL attribute.
The compilers generate short-form (one machine instruction) gp relative
addressing for all data items in any of these sections with the
SHF_MIPS_GPREL attribute. The compilers must generate two machine
instructions to load or store data items outside the global data area. A program
executes faster if more data items are placed in the global data area.
The size of the global data area is limited by the addressing constraints on gp
relative addressing, namely plus or minus 32 kilobytes relative to gp. This
limits the size of the global data area to 64 kilobytes.
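The addressing limit can be illustrated with a short check. This is a sketch only; gp_reachable is a hypothetical name, and it assumes the gp relative offset is a signed 16-bit quantity, which gives the range of -32768 through 32767 bytes.

```c
#include <stdint.h>

/* Nonzero if addr lies within a signed 16-bit offset of gp. */
int gp_reachable(uint32_t addr, uint32_t gp)
{
    int64_t offset = (int64_t)addr - (int64_t)gp;
    return offset >= -32768 && offset <= 32767;
}
```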
The compilers decide whether or not a data item is placed in the global data
area based on its size. All data items less than or equal to a given size are
placed in the global data area. Initialized data items are placed in a .sdata
section, uninitialized data items are placed in a .sbss section, and floating-
point literals are placed in .lit4 and .lit8 sections. The .got section is also
combined into the global data area.
In order to provide the user with information on the optimal size criteria for
placement of data items in the .sdata and .sbss sections, the linker maintains
tables of possible global data area sizes for each of these sections. These
tables are maintained in .gptab sections. Each .gptab section contains both the
actual value used as the size criteria for an object file and a sorted list of
possible short data and bss area sizes based on different data item size
selections. The size criteria value is also known as the –G num.
The .gptab section is an array of structures that have the following format:
typedef union {
struct {
Elf32_Word gt_current_g_value;
Elf32_Word gt_unused;
} gt_header;
struct {
Elf32_Word gt_g_value;
Elf32_Word gt_bytes;
} gt_entry;
} Elf32_gptab;
gt_header.gt_current_g_value
Is the –G num used for this object file. Data items of this size or
smaller are referenced with gp relative addressing and reside in a
SHF_MIPS_GPREL section.
gt_header.gt_unused
Is not used in the first entry of the array.
gt_entry.gt_g_value
Register Information
The compilers and assembler collect information on the registers used by the
code in the object file. This information is communicated to the operating
system kernel in the .reginfo section. The operating system kernel could use
this information to decide what registers it might not need to save or which
coprocessors the program uses. The section also contains a field which
specifies the initial value for the gp register, based on the final location of the
global data area in memory. The register information structure has the
following format:
typedef struct {
Elf32_Word ri_gprmask;
Elf32_Word ri_cprmask[4];
Elf32_Word ri_gp_value;
} ELF_RegInfo;
ri_gprmask
contains a bit-mask of general registers used by the program. Each
set bit indicates a general integer register used by the program. Each
clear bit indicates a general integer register not used by the program.
For instance, bit 31 set indicates register $31 is used by the program;
bit 27 clear indicates register $27 is not used by the program.
ri_cprmask
contains the bit-mask of co-processor registers used by the program.
The MIPS RISC architecture can support up to four co-processors,
each with 32 registers. Each array element corresponds to one set of
co-processor registers. Each of the bits within the element
corresponds to individual registers in the co-processor register set.
The 32 bits of the words correspond to the 32 registers, with bit
number 31 corresponding to register 31, bit number 30 to register 30,
etc. Set bits indicate the corresponding register is used by the
program; clear bits indicate the program does not use the
corresponding register.
ri_gp_value
contains the gp register value. In relocatable object files it is used for
relocation of the R_MIPS_GPREL and R_MIPS_LITERAL
relocation types.
NOTE: Only co-processor 1 may be used by ABI compliant programs. This
means that only the ri_cprmask[1] array element may have a non-zero value.
ri_cprmask[0], ri_cprmask[2], and ri_cprmask[3] must all be zero in an ABI
compliant program.
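As an illustration of how the masks might be examined, the sketch below tests a single bit of ri_gprmask and checks the ABI restriction on ri_cprmask. The function names gpr_used and cprmask_abi_compliant are hypothetical, and Elf32_Word is assumed to be a 32-bit unsigned integer.

```c
#include <stdint.h>

typedef uint32_t Elf32_Word;

typedef struct {
    Elf32_Word ri_gprmask;
    Elf32_Word ri_cprmask[4];
    Elf32_Word ri_gp_value;
} ELF_RegInfo;

/* Nonzero if general register $reg (0 through 31) is marked as used. */
int gpr_used(const ELF_RegInfo *ri, unsigned reg)
{
    return (int)((ri->ri_gprmask >> (reg & 31u)) & 1u);
}

/* Nonzero if only co-processor 1 may have registers marked as used. */
int cprmask_abi_compliant(const ELF_RegInfo *ri)
{
    return ri->ri_cprmask[0] == 0 &&
           ri->ri_cprmask[2] == 0 &&
           ri->ri_cprmask[3] == 0;
}
```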
Relocation
Relocation entries describe how to alter instruction and data fields for
relocation; bit numbers appear in the lower box corners.
Field     Bit numbers shown
half16    31 15 0
word32    31 0
targ26    31 0
hi16      31 15 0
lo16      31 15 0
rel16     31 15 0
lit16     31 15 0
p         31 15 0
Calculations below assume the actions are transforming a relocatable file into
either an executable or a shared object file. Conceptually, the linker merges
one or more relocatable files to form the output. It first decides how to
combine and locate the input files, then updates the symbol values, and finally
performs the relocation.
Relocations applied to executable or shared object files are similar and
accomplish the same result. The descriptions in Table 11-3 use the following
notation:
A The addend used to compute the value of the relocatable field.
AHL Another type of addend used to compute the value of the relocatable
field. See the note below for more detail.
P The location (section offset or address) of the storage unit being
relocated (computed using r_offset).
S The value of the symbol whose index resides in the relocation entry,
unless the symbol is STB_LOCAL and is of type STT_SECTION,
in which case it means the original sh_addr minus the final sh_addr.
G The offset into the global offset table at which the address of the
relocation entry’s symbol resides during execution.
GP The final gp value that is used for the relocatable, executable, or
shared object file being produced.
GPO The gp value used to create the relocatable object.
EA The effective address of the symbol prior to relocation.
L The .lit4 or .lit8 literal table offset. Prior to relocation, the addend
field of a literal reference contains the offset into the global data
area. During relocation, each literal section from each contributing
file is merged into one and sorted, after which duplicate entries are
removed and the section compressed, leaving only unique entries.
The relocation factor L is the mapping from the old offset from the
original gp to the value of gp used in the final file.
A relocation entry’s r_offset value designates the offset or virtual address of
the first byte of the affected storage unit. The relocation type specifies which
bits to change and how to calculate their values. Because MIPS uses only
Elf32_Rel relocation entries, the field to be relocated holds the addend.
The AHL addend is a composite computed from the addends of two
consecutive relocation entries. Each relocation type of R_MIPS_HI16 must
have an associated R_MIPS_LO16 entry immediately following it in the list
of relocations. These relocation entries are always processed as a pair and
both addend fields contribute to the AHL addend. If AHI and ALO are the
addends from the paired R_MIPS_HI16 and R_MIPS_LO16 entries, then the
addend AHL is computed as ((AHI << 16) + (short)ALO). R_MIPS_LO16
entries without an immediately preceding R_MIPS_HI16 entry are orphaned
and the previously defined R_MIPS_HI16 is used for computing the addend.
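The AHL computation can be sketched as follows. The function name compute_ahl is hypothetical, and the sketch assumes AHI and ALO have already been extracted as the 16-bit immediate fields of the two instructions. The low half is sign-extended before the addition, which is the effect of the (short) cast in the expression above.

```c
#include <stdint.h>

/* Combines the HI16 and LO16 addends into the composite AHL addend. */
uint32_t compute_ahl(uint16_t ahi, uint16_t alo)
{
    /* Sign-extend the low 16 bits, as (short)ALO does. */
    int32_t lo = (int16_t)alo;
    return ((uint32_t)ahi << 16) + (uint32_t)lo;
}
```

A negative low half borrows from the high half: with AHI = 0x1235 and ALO = 0xfff0 (that is, -16), the result is 0x1234fff0.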
NOTE: Field names in the following table tell whether the relocation type
checks for overflow. A calculated relocation value may be larger than the
intended field, and a relocation type may verify (V) that the value fits or
truncate(T) the result. As an example, V–half16 means that the computed
value may not have significant non-zero bits outside the half16 field.
In the Symbol column in Table 11-3, if the symbol referenced by the symbol
table index in the relocation entry is STB_LOCAL/STT_SECTION, then it is
a local relocation. If it is not, the relocation is considered an external
relocation.
The R_MIPS_32 and R_MIPS_REL32 relocation types are the only
relocations performed by the dynamic linker.
If an R_MIPS_GOT16 refers to a locally defined symbol, the relocation is
done differently than if it refers to an external symbol. In the local case it must
be followed immediately by an R_MIPS_LO16 relocation. The AHL addend
is extracted and the section in which the referenced data item resides is
determined (this requires all sections in an object module to have unique
addresses and no overlap). From this address the final address of the data item
is calculated. If necessary, a global offset table entry is created to hold the
high 16 bits of this address (an existing entry is used when possible). The
rel16 field is replaced by the offset of this entry in the global offset table. The
lo16 field in the following R_MIPS_LO16 relocation is replaced by the low
16 bits of the actual destination address. This is meant for local data
references in position-independent code so that only one global offset table
entry is necessary for every 64 kilobytes of local data.
The first instance of R_MIPS_GOT16 causes the link editor to build a global
offset table if one has not already been built.
12
Chapter 12
Executable files and object files are used to create a process image when a
program is started by the system. This chapter describes the object file
structures that relate to program execution and also describes how the process
image is created from these files. Topics in this chapter include:
• Program Header
• Object File Segments
• Program Loading
• Dynamic Linking
• Quickstart
Program Header
The Program Header table is an array of structures, each of which describes
a segment or other data used to create a process image. A Program Header is
meaningful only for a shared object or executable file. A description of the
Program Header for MIPS COFF format is in Chapter 9. The structure of an
ELF Program Header entry is as follows:
Declaration Field
Elf32_Word p_type;
Elf32_Off p_offset;
Elf32_Addr p_vaddr;
Elf32_Addr p_paddr;
Elf32_Word p_filesz;
Elf32_Word p_memsz;
Elf32_Word p_flags;
Elf32_Word p_align;
The size of the Program Header table is determined by the ELF Header
e_phentsize (size of one entry) and e_phnum (number of entries) fields (see
Chapter 11).
p_type indicates what kind of segment this entry describes or how to interpret
the array element’s information. p_type may have the following values:
Name Value
PT_NULL 0
PT_LOAD 1
PT_DYNAMIC 2
PT_INTERP 3
PT_NOTE 4
PT_SHLIB 5
PT_PHDR 6
PT_LOPROC 0x70000000
PT_HIPROC 0x7fffffff
PT_MIPS_REGINFO 0x70000000
The file size may not be larger than the memory size. Loadable
segments appear in the Program Header table in ascending order
based on the p_vaddr field.
PT_DYNAMIC indicates that the entry contains dynamic linking
information. See the Dynamic Section section of this chapter for
more details.
PT_INTERP indicates that the entry specifies the location and size of
a null-terminated path name to invoke as an interpreter. This type is
meaningful only for executable files (though it may occur for shared
objects) and may not occur more than once in a file. If a segment of
this type is present, it must precede any loadable segment entry.
PT_NOTE indicates that the entry gives the location and size of
auxiliary information.
PT_SHLIB is reserved and has unspecified semantics. A program
which contains a Program Header entry of this type does not conform
to the ABI.
PT_PHDR indicates that the entry specifies the location and size of
the Program Header table, both in the file and in the memory image
of the program. This type may not occur more than once in a file and
it may only occur if the Program Header table is part of the memory
image of the program. If present, it must precede any loadable
segment entries.
PT_LOPROC through PT_HIPROC values are reserved for
processor-specific semantics.
PT_MIPS_REGINFO indicates that this entry specifies register usage
information. This type may not occur more than once in a file. Its
presence is mandatory and it must precede any loadable segment
entry. See Register Information in Chapter 11.
p_offset gives the offset from the beginning of the file to the first byte of the
segment.
p_vaddr gives the virtual address in memory of the first byte of the segment.
p_paddr is reserved for the segment’s physical address (on systems for which
physical addressing is relevant).
p_filesz contains the number of bytes in the file image of the segment; the
value may be zero.
p_memsz holds the number of bytes in the memory image of the segment;
the value may be zero.
p_flags contains the flags associated with the segment. The following flags
are defined:
Base Address
Executable and shared object files have a base address, which is the
lowest virtual address associated with the process image of the program. The
base address is used to relocate the process image of the program during
dynamic linking.
During execution, the base address is calculated from the memory load
address, the maximum page size, and the lowest virtual address of the
program’s loadable segment. The virtual addresses in the Program Header
might not represent actual virtual addresses (see the Program Loading section
of this chapter). The base address is computed by determining the memory
address associated with the lowest p_vaddr for a PT_LOAD segment, and
then truncating this memory address to the nearest multiple of the maximum
page size. The memory address may or may not match the p_vaddr values.
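The truncation step can be sketched in a few lines. The function name base_address is hypothetical; the sketch assumes the maximum page size is a power of two and that load_addr is the memory address at which the PT_LOAD segment with the lowest p_vaddr resides.

```c
#include <stdint.h>

/* Truncates load_addr to a multiple of max_page_size (a power of two). */
uint32_t base_address(uint32_t load_addr, uint32_t max_page_size)
{
    return load_addr & ~(max_page_size - 1u);
}
```

With a 64 K maximum page size, a load address of 0x5ffe1234 gives a base address of 0x5ffe0000.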
Segment Permissions
A program that is to be loaded by the system must have at least one loadable
segment, even though this is not required by the file format. When the process
image is created, it has access permissions as specified in the p_flags field.
If a permission bit is zero, that type of access is denied. All flag combinations
are valid but the system may allow more access than requested. However, a
segment does not have write permission unless it is specified explicitly. Table
12-1 shows the exact and allowable interpretations for p_flags.
Flags             Value   Exact                  Allowable
PF_R              4       Read only              Read, execute
PF_R+PF_X         5       Read, execute          Read, execute
PF_R+PF_W         6       Read, write            Read, write, execute
PF_R+PF_W+PF_X    7       Read, write, execute   Read, write, execute
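A sketch of decoding p_flags is shown below. The individual bit values (PF_X = 1, PF_W = 2, PF_R = 4) are the standard ELF values and agree with the sums in the table; the function names are hypothetical.

```c
#include <stdint.h>

/* Standard ELF segment permission bits. */
enum { PF_X = 0x1, PF_W = 0x2, PF_R = 0x4 };

/* Writes the requested permissions into out as an "rwx" style string. */
void describe_p_flags(uint32_t p_flags, char out[4])
{
    out[0] = (p_flags & PF_R) ? 'r' : '-';
    out[1] = (p_flags & PF_W) ? 'w' : '-';
    out[2] = (p_flags & PF_X) ? 'x' : '-';
    out[3] = '\0';
}

/* Write access is granted only when PF_W is specified explicitly. */
int segment_is_writable(uint32_t p_flags)
{
    return (p_flags & PF_W) != 0;
}
```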
Segment Contents
An object file segment may contain one or more sections. The number of
sections in a segment is not important for program loading, but specific
information must be present for linking and execution. The figures below
illustrate typical segment contents for a MIPS executable or shared object. The
order of sections within a segment may vary.
Text segments contain read-only instructions and data, typically including the
following sections:
.reginfo
.dynamic
.liblist
.rel.dyn
.conflict
.dynstr
.dynsym
.hash
.rodata
.text
Data segments contain writable data and instructions, typically including the
following sections:
.got
.lit4
.lit8
.sdata
.data
.sbss
.bss
Program Loading
As the system creates or augments a process image, it logically copies a file’s
segment to a virtual memory segment. When, and if, the system physically
reads the file depends on the program’s execution behavior, system load, etc.
A process does not require a physical page unless it references the logical
page during execution, and processes commonly leave many pages
unreferenced. Therefore, delaying physical reads frequently obviates them,
improving system performance. To obtain this efficiency in practice,
executable and shared object files must have segment images whose virtual
addresses are zero, modulo the file system block size.
Virtual addresses for MIPS text segments must be aligned on 4 K (0x1000)
or larger powers of 2 boundaries. MIPS text segments include ELF headers
and program headers. MIPS data segments must be aligned on 64 K
(0x10000) or larger powers of 2 boundaries. File offsets for MIPS segments
must be aligned on 4 K (0x1000) or larger powers of 2 boundaries. Regardless
of the 4 K alignment, segments may not overlap in any given 256 K chunk of
virtual memory; this helps prevent alias problems in systems with virtual
caches. Page size on MIPS systems may vary, but does not exceed 64 K
(0x10000).
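The alignment rules above might be checked as in the following sketch. The macro and function names are hypothetical. Because any larger power of two is also a multiple of the minimum, testing against the minimum alignment is sufficient. The 256 K overlap rule is not modeled here.

```c
#include <stdint.h>

#define MIPS_TEXT_ALIGN   0x1000u   /* 4 K: text segment virtual address */
#define MIPS_DATA_ALIGN   0x10000u  /* 64 K: data segment virtual address */
#define MIPS_OFFSET_ALIGN 0x1000u   /* 4 K: segment file offset */

/* Nonzero if value is a multiple of alignment (a power of two). */
int is_aligned(uint32_t value, uint32_t alignment)
{
    return (value & (alignment - 1u)) == 0;
}

int text_segment_ok(uint32_t p_vaddr, uint32_t p_offset)
{
    return is_aligned(p_vaddr, MIPS_TEXT_ALIGN) &&
           is_aligned(p_offset, MIPS_OFFSET_ALIGN);
}

int data_segment_ok(uint32_t p_vaddr, uint32_t p_offset)
{
    return is_aligned(p_vaddr, MIPS_DATA_ALIGN) &&
           is_aligned(p_offset, MIPS_OFFSET_ALIGN);
}
```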
Figure 12-1 shows an example of an executable file and Table 12-2 shows the
Program Header entries for the example text and data segments.
...
Because the page size can be larger than the alignment restriction of a
segment’s file offset, up to four file pages hold impure text or data (depending
on page size and file system block size).
• The first text page contains the ELF header, the Program Header
table, and other information.
• The last text page may hold a copy of the beginning of data.
• The first data page may have a copy of the end of text.
• The last data page should be zero or else it will conflict with sbrk
call.
Logically, the system enforces the memory permissions as if each segment
were complete and separate; segments’ addresses are adjusted to ensure that
each logical page in the address space has a single set of permissions. In the
example above, with 16KB pages, the region of the file holding the end of text
and the beginning of data is mapped twice, at one virtual address for text and
at another virtual address for data.
The end of the data segment requires special handling for uninitialized data,
which must be set to zeros. If a file’s last data page includes information not
in the logical memory page, the extraneous data must be set to zero, not the
unknown contents of the executable file.
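The zero-fill requirement can be modeled in user space as follows. This is a sketch under stated assumptions, not the system's loader: load_segment_image is a hypothetical name, dest is assumed to be a buffer of at least p_memsz bytes, and file_image is assumed to hold p_filesz bytes read from the file.

```c
#include <stdint.h>
#include <string.h>

/* Copies the file image of a segment and zero-fills the remainder of
   its memory image. Returns 0 on success, or -1 if the file size
   exceeds the memory size. */
int load_segment_image(unsigned char *dest, const unsigned char *file_image,
                       uint32_t p_filesz, uint32_t p_memsz)
{
    if (p_filesz > p_memsz) {
        return -1;
    }
    memcpy(dest, file_image, p_filesz);
    /* Bytes beyond the file image must read as zero. */
    memset(dest + p_filesz, 0, p_memsz - p_filesz);
    return 0;
}
```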
One aspect of segment loading differs between executable files and shared
objects. Executable file segments typically contain absolute code. To let the
process execute correctly, these segments must reside at the virtual addresses
used to build the executable file. Thus the system uses the unchanged p_vaddr
Dynamic Linking
Program Interpreter
An executable file can have only one PT_INTERP Program Header entry.
When the system calls exec(2) to start the process, the path name of the
interpreter is retrieved from the PT_INTERP segment and the initial process
image is created from the interpreter file’s segments. It is then the
interpreter’s responsibility to receive control from the system and create the
application program’s environment.
The interpreter receives control in one of two ways. First, it may receive the
file descriptor of the executable file, positioned at the beginning of the file.
The file descriptor can then be used to read or map the executable file’s
segments into memory. Second, depending on the executable file format, the
system may load the executable file into memory before giving control to the
interpreter. With the possible exception of the file descriptor, the interpreter’s
initial process state is the same as what the executable file would have
received. The interpreter cannot require a second interpreter and may be
either a shared object or an executable file.
A shared object is loaded as position-independent with addresses that may
vary from one process to another; the system creates the segments in the
dynamic segment area used by mmap(2) and related services. As a result, a
shared object interpreter typically does not conflict with the executable file’s
original segment addresses.
An executable file is loaded at fixed addresses; the system creates its
segments using the virtual addresses from the Program Header table.
Consequently, an executable file interpreter’s virtual addresses may conflict
with those of the executable file. The interpreter is responsible for resolving
any conflicts.
Dynamic Linker
When building an executable file that uses dynamic linking, the link editor
adds a Program Header entry of type PT_INTERP to the executable file. This
entry tells the system to invoke the dynamic linker as the program interpreter.
Typically, the dynamic linker requested is libsys, the system library. exec(2)
and the dynamic linker cooperate to create the process image, which involves
the following:
• Adding the file segments to the process image.
• Adding shared object segments to the process image.
• Performing relocations for the executable file and its shared objects.
• Closing the file descriptor for the executable file, if a file descriptor
was passed to the dynamic linker.
• Transferring control to the program, making it appear that the
program received control directly from exec(2).
The link editor also constructs various data for shared objects and executable
files that assist the dynamic linker. These data are located in loadable
segments, are available during execution, and consist of the following:
• A dynamic section of type SHT_DYNAMIC holds various data,
including a structure that resides at the beginning of the section and
holds the addresses of other dynamic linking information.
• The .hash section of type SHT_HASH contains a symbol hash table.
Dynamic Section
An object file that is used in dynamic linking has an entry in its Program
Header Table of type PT_DYNAMIC. This segment contains the .dynamic
section, which is labeled _DYNAMIC and is an array with entries of the
following type:
typedef struct {
Elf32_Sword d_tag;
union {
Elf32_Word d_val;
Elf32_Addr d_ptr;
} d_un;
} Elf32_Dyn;
d_tag indicates how the d_un field is to be interpreted.
d_val represents integer values.
d_ptr represents program virtual addresses. A file’s virtual addresses may not
match the memory virtual addresses during execution. The dynamic linker
computes actual addresses based on the virtual address from the file and the
memory base address. Object files do not contain relocation entries to correct
addresses in the dynamic structure.
The tag (d_tag) requirements for executable and shared object files are
summarized in the following table. If the executable entry indicates
mandatory, the dynamic linking array must contain an entry of that type.
Optional indicates that an entry for the tag may exist but is not required.
DT_NULL
An entry of this type marks the end of the _DYNAMIC array.
DT_NEEDED
This element contains the string table offset of a null terminated
string that is the name of a library. The offset is an index into the
table indicated in the DT_STRTAB entry. The dynamic array may
contain multiple entries of this type. The order of these entries is
significant.
DT_PLTRELSZ
This element contains the total size in bytes of the relocation entries
associated with the Procedure Linkage Table. If an entry of type
DT_JMPREL is present, it must have an associated DT_PLTRELSZ
entry.
DT_PLTGOT
This element contains an address associated with the
Procedure Linkage Table and/or the Global Offset Table.
DT_HASH
This element contains the address of the symbol hash table.
DT_STRTAB
This entry contains the address of the string table.
DT_SYMTAB
This entry contains the address of the symbol table with Elf32_Sym
entries for the 32-bit class of files.
DT_RELA
This element contains the address of a relocation table. Entries in the
table have explicit addends, such as Elf32_Rela. An object file may
have multiple relocation sections. When the link editor builds the
relocation table for an executable or shared object, these sections are
concatenated to form a single table. While the sections are
independent in the object file, the dynamic linker sees a single table.
When the dynamic linker creates a process image or adds a shared
object to a process image, it reads the relocation table and performs
the associated actions. If this element is present, the dynamic
structure must also contain DT_RELASZ and DT_RELAENT
entries. When relocation is mandatory for a file, either DT_RELA or
DT_REL may be present.
DT_RELASZ
This entry contains the size in bytes of the DT_RELA relocation
table.
DT_RELAENT
This entry contains the size in bytes of the DT_RELA relocation
entry.
DT_STRSZ
This element contains the size in bytes of the string table.
DT_SYMENT
This entry contains the size in bytes of a symbol table entry.
DT_INIT
This element contains the address of the initialization function.
DT_FINI
This element contains the address of the termination function.
DT_SONAME
This entry contains the string table offset of a null-terminated string
that gives the name of the shared object. The offset is an index into
the table indicated in the DT_STRTAB entry.
DT_RPATH
This element contains the string table offset of a null-terminated
library search path string. The offset is an index into the table
indicated in the DT_STRTAB entry.
DT_SYMBOLIC
DT_JMPREL
If this element is present, its d_ptr field contains the address of
relocation entries associated only with the Procedure Linkage Table.
The dynamic linker may ignore these entries during process
initialization if lazy binding is enabled.
DT_LOPROC through DT_HIPROC
The values in this range are reserved for processor-specific
semantics.
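Scanning the _DYNAMIC array can be sketched as follows. The functions find_dyn and count_needed are hypothetical names. The numeric tag values used (DT_NULL = 0, DT_NEEDED = 1, DT_STRTAB = 5) are the standard ELF values and are not listed in this section. The sketch relies on the array being terminated by a DT_NULL entry.

```c
#include <stddef.h>
#include <stdint.h>

typedef int32_t  Elf32_Sword;
typedef uint32_t Elf32_Word;
typedef uint32_t Elf32_Addr;

typedef struct {
    Elf32_Sword d_tag;
    union {
        Elf32_Word d_val;
        Elf32_Addr d_ptr;
    } d_un;
} Elf32_Dyn;

/* Standard ELF tag values. */
enum { DT_NULL = 0, DT_NEEDED = 1, DT_STRTAB = 5 };

/* Returns the first entry with the given tag, or NULL if the
   terminating DT_NULL entry is reached first. */
const Elf32_Dyn *find_dyn(const Elf32_Dyn *dyn, Elf32_Sword tag)
{
    for (; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == tag) {
            return dyn;
        }
    }
    return NULL;
}

/* Counts the DT_NEEDED entries; several may be present. */
size_t count_needed(const Elf32_Dyn *dyn)
{
    size_t n = 0;
    for (; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == DT_NEEDED) {
            n++;
        }
    }
    return n;
}
```

A d_ptr value found this way is a virtual address from the file, so it may still need adjusting by the base address before use.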
Table 12-4 lists MIPS-specific tags.
Name Value d_un Executable Shared Object
DT_MIPS_RLD_VERSION 0x70000001 d_val mandatory mandatory
DT_MIPS_TIME_STAMP 0x70000002 d_val optional optional
DT_MIPS_ICHECKSUM 0x70000003 d_val optional optional
DT_MIPS_IVERSION 0x70000004 d_val optional optional
DT_MIPS_FLAGS 0x70000005 d_val mandatory mandatory
DT_MIPS_BASE_ADDRESS 0x70000006 d_ptr mandatory mandatory
DT_MIPS_CONFLICT 0x70000008 d_ptr optional optional
DT_MIPS_LIBLIST 0x70000009 d_ptr optional optional
DT_MIPS_LOCAL_GOTNO 0x7000000a d_val mandatory mandatory
DT_MIPS_CONFLICTNO 0x7000000b d_val optional optional
DT_MIPS_LIBLISTNO 0x70000010 d_val optional optional
DT_MIPS_SYMTABNO 0x70000011 d_val optional optional
DT_MIPS_UNREFEXTNO 0x70000012 d_val optional optional
DT_MIPS_GOTSYM 0x70000013 d_val mandatory mandatory
DT_MIPS_HIPAGENO 0x70000014 d_val mandatory mandatory
DT_MIPS_RLD_MAP 0x70000016 d_val optional optional
DT_MIPS_RLD_VERSION
This element holds an index into the object file’s string table, which
holds the version of the Runtime Linker Interface. The version is
currently 1.
DT_MIPS_TIME_STAMP
This entry contains a 32-bit time stamp.
DT_MIPS_ICHECKSUM
This element’s value is the sum of all external strings and common
sizes.
DT_MIPS_IVERSION
This element holds an index into the object file’s string table. The
version string is a series of colon (:) separated version strings. An
index value of zero means no version string was specified.
DT_MIPS_FLAGS
This entry contains a set of 1-bit flags. Flag definitions appear below.
DT_MIPS_BASE_ADDRESS
This element contains the base address as defined in the generic ABI.
DT_MIPS_CONFLICT
This entry contains the address of the .conflict section.
DT_MIPS_LIBLIST
independence and shareability of a program’s text. A program references its
Global Offset Table using position-independent addressing and extracts
absolute values, thus redirecting position-independent references to absolute
locations.
The Global Offset Table is split into two logically separate subtables: locals
and externals. Local entries reside in the first part of the table; these are
entries for which there are standard local relocation entries. These entries
only require relocation if they occur in a shared object and the shared object’s
memory load address differs from the virtual address of the shared object’s
loadable segments. As with the defined external entries in the Global Offset
Table, these local entries contain actual addresses.
External entries reside in the second part of the section. Each entry in the
external part of the GOT corresponds to an entry in the .dynsym section. The
first symbol in the .dynsym section corresponds to the first word of the table,
the second symbol corresponds to the second word, and so on. Each word in
the external entry part of the GOT contains the actual address for its
corresponding symbol. The external entries for defined symbols must contain
actual addresses. If an entry corresponds to an undefined symbol and the table
entry contains a zero, the entry must be resolved by the dynamic linker, even
if the dynamic linker is performing a quickstart. See the Quickstart section of
this chapter for more information.
After the system creates memory segments for a loadable object file, the
dynamic linker may process the relocation entries. The only relocation entries
remaining are type R_MIPS_REL32, referring to local entries in the GOT and
data containing addresses. The dynamic linker determines the associated
symbol (or section) values, calculates their absolute addresses, and sets the
proper values. Although the absolute addresses may be unknown when the
link editor builds an object file, the dynamic linker knows the addresses of all
memory segments and can find the correct symbols and calculate the absolute
addresses.
jal t9
li t8, .dynsym_index
The stub code loads register t9 with an entry from the GOT which contains a
well-known entry point in the dynamic linker; it also loads register t8 with the
index into the .dynsym section of the referenced external. The code saves
register ra and transfers control to the dynamic linker. The dynamic linker
determines the correct address for the called function and replaces the address
of the stub in the GOT with the address of the function.
Most undefined text references can be handled by lazy text evaluation except
when the address of a function is used other than in a jump and link
instruction. In this case, rather than the actual address of the function you
would get the address of the stub.
The dynamic linker detects this usage in the following manner:
The link editor generates symbol table entries for all function references with
the st_shndx field containing SHN_UNDEF and the type in the st_info field set to
STT_FUNC. The dynamic linker examines each symbol table entry when it
starts execution. If the st_value field for one of these symbols is non-zero then
there were only jump and link references to the function and nothing need be
done to the GOT entry; if the field is zero, then there was some other kind of
reference to the function and the GOT entry must be replaced with the actual
address of the referenced function.
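The test described above can be sketched as a predicate over a symbol table entry. The function name needs_eager_resolution is hypothetical, and the Elf32_Sym layout shown is the standard 32-bit ELF layout. The sketch covers only the decision itself, not the GOT update.

```c
#include <stdint.h>

typedef uint32_t Elf32_Word;
typedef uint32_t Elf32_Addr;
typedef uint16_t Elf32_Half;

typedef struct {
    Elf32_Word    st_name;
    Elf32_Addr    st_value;
    Elf32_Word    st_size;
    unsigned char st_info;
    unsigned char st_other;
    Elf32_Half    st_shndx;
} Elf32_Sym;

enum { SHN_UNDEF = 0, STT_FUNC = 2 };
#define ELF32_ST_TYPE(i) ((i) & 0xf)

/* Nonzero when an undefined function symbol has no stub address, so
   its GOT entry must be replaced with the actual function address
   when the dynamic linker starts. */
int needs_eager_resolution(const Elf32_Sym *sym)
{
    return sym->st_shndx == SHN_UNDEF &&
           ELF32_ST_TYPE(sym->st_info) == STT_FUNC &&
           sym->st_value == 0;
}
```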
The LD_BIND_NOW environment variable can also change dynamic linking
behavior. If its value is non-null, the dynamic linker evaluates all symbol
table entries of type STT_FUNC, replacing their stub addresses in the GOT
with the actual address of the referenced function.
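A check of this kind might look like the sketch below. The function name bind_now_requested is hypothetical, and treating "non-null" as "set to a non-empty string" is an assumption made for illustration.

```c
#include <stddef.h>

/* Nonzero if value, as returned by getenv("LD_BIND_NOW"), is set and
   non-empty (an assumed reading of "non-null"). */
int bind_now_requested(const char *value)
{
    return value != NULL && value[0] != '\0';
}
```

A caller would typically pass the result of getenv("LD_BIND_NOW") from <stdlib.h>.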
NOTE: Lazy binding generally improves overall application performance,
because unused symbols do not incur the dynamic linking overhead.
Nevertheless, two situations make lazy binding undesirable for some
applications. First, the initial reference to a shared object function takes
longer than subsequent calls, because the dynamic linker intercepts the call to
resolve the symbol. Some applications cannot tolerate this unpredictability.
Second, if an error occurs and the dynamic linker cannot resolve the symbol,
the dynamic linker terminates the program. Under lazy binding, this might
occur at arbitrary times. Once again, some applications cannot tolerate this
unpredictability. By turning off lazy binding, the dynamic linker forces the
failure to occur during process initialization, before the application receives
control.
Symbols
All externally visible symbols, both defined and undefined, must be hashed
into the hash table.
Undefined symbols of type STT_FUNC which have been referenced only by
jump and link instructions may contain non-zero values in their st_value
field denoting the stub address used for lazy evaluation for this symbol. The
run-time linker uses this to reset the GOT entry for this external to its stub
address when unlinking a shared object. All other undefined symbols must
contain zero in their st_value fields.
Defined symbols in an executable may not be preempted. The symbol table
in the executable is always searched first to resolve any symbol references.
Relocations
There may be only one dynamic relocation section to resolve addresses in
data and local entries in the GOT. It must be called .rel.dyn. Executables
may contain normal relocation sections in addition to a dynamic relocation
section. The normal relocation sections may contain resolutions for any
absolute values in the main program. The dynamic linker does not resolve
these or relocate the main program.
As noted previously, only R_MIPS_REL32 relocation entries are supported
in the dynamic relocation section.
Hash table
A hash table of Elf32_Word entries supplies symbol table access. The hash
table can be viewed as follows:
nbucket
nchain
bucket[0]
...
bucket[nbucket – 1]
chain[0]
...
chain[nchain – 1]
nbucket indicates the number of entries in the bucket array and nchain
indicates the number of entries in the chain array. Both bucket and chain hold
symbol table indexes; the entries in chain parallel the symbol table. The
number of symbol table entries should be equal to nchain; symbol table
indexes also select chain entries.
A hashing function accepts a symbol name and returns a value that may be
used to compute a bucket index. If the hashing function returns the value X
for a name, bucket[X % nbucket] gives an index, Y, into the symbol table and
chain array. If the symbol table entry indicated is not the correct one, chain[Y]
indicates the next symbol table entry with the same hash value. The chain
links can be followed until either the desired symbol table entry is located, or
the chain entry contains the value STN_UNDEF.
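The lookup procedure above can be sketched as follows. The text does not name the hashing function, so this sketch assumes the hash function published in the System V ABI. The names array is a hypothetical stand-in for the symbol table and its string table, and hash_lookup is an illustrative name, not part of any linker interface.

```c
#include <stdint.h>
#include <string.h>

#define STN_UNDEF 0

/* Assumption: the System V ABI hashing function. */
uint32_t elf_hash(const char *name)
{
    uint32_t h = 0, g;

    while (*name) {
        h = (h << 4) + (unsigned char)*name++;
        g = h & 0xf0000000u;
        if (g)
            h ^= g >> 24;
        h &= ~g;
    }
    return h;
}

/* table points at nbucket, nchain, bucket[], chain[] laid out as shown above.
 * names[i] is the name of symbol table entry i (hypothetical simplification).
 * Returns the symbol table index, or STN_UNDEF if the name is not present. */
uint32_t hash_lookup(const uint32_t *table, const char *const *names,
                     const char *want)
{
    uint32_t nbucket = table[0];
    const uint32_t *bucket = table + 2;
    const uint32_t *chain = bucket + nbucket;
    uint32_t y = bucket[elf_hash(want) % nbucket];

    while (y != STN_UNDEF) {
        if (strcmp(names[y], want) == 0)
            return y;           /* found the desired entry */
        y = chain[y];           /* next entry with the same bucket */
    }
    return STN_UNDEF;
}
```

Because chain parallels the symbol table, the same index Y selects both the symbol and its successor link, so no separate node structure is needed.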
Quickstart
MIPS supports several sections which are useful for faster startup of
programs that have been linked with shared objects. Some ordering
constraints are imposed on these sections. The group of structures defined in
these sections and the ordering constraints allow the dynamic linker to
operate more efficiently. These additional sections are also used for more
complete dynamic shared object version control.
NOTE: An ABI compliant system may ignore any of the three sections
defined here, but if it supports one of these sections, it must support all three.
typedef struct {
    Elf32_Word l_name;
    Elf32_Word l_time_stamp;
    Elf32_Word l_checksum;
    Elf32_Word l_version;
    Elf32_Word l_flags;
} Elf32_Lib;
l_name
This member specifies the name of a shared object. Its value is a
string table index. This name may be a trailing component of the path
to be used with RPATH + LD_LIBPATH, a name containing ‘/’s
which is relative to ‘.’, or it may be a full path name.
l_time_stamp
This member’s value is a 32-bit time stamp. The value can be
combined with the l_checksum value and the l_version string to form
a unique id for this shared object.
l_checksum
This member’s value is the sum of all externally visible symbols’
string names and common sizes.
l_version
This member specifies the interface version. Its value is a string table
index. The interface version is a single string containing no colons. It
is compared to a colon separated string of versions pointed to by a
dynamic section entry of the shared object. Shared objects with
matching names may be considered incompatible if the interface
version strings are deemed incompatible. An index value of zero
means no version string is specified.
l_flags
This member is a set of 1-bit flags. The following flags are defined:
LL_EXACT_MATCH       0x00000001   require exact match
LL_IGNORE_INT_VER    0x00000002   ignore interface version
LL_EXACT_MATCH
Conflict Section
Each .conflict section is an array of indexes into the .dynsym section. Each
index identifies a symbol whose attributes conflict with a shared object on
which it depends, either in type or size, such that this definition preempts the
shared object’s definition. The dependent shared object is identified at static
link time. The .conflict section is an array of Elf32_Conflict elements.
typedef Elf32_Addr Elf32_Conflict;
Ordering
In order to take advantage of Quickstart functionality, ordering constraints
are imposed on the .dynsym and .rel.dyn sections. The .dynsym section must
be ordered on increasing values of the st_value field. Note that this requires
the .got section to be ordered in the same way, since it must correspond to the
.dynsym section.
The .rel.dyn section must have all local entries first, followed by the external
entries. Within these sub-sections, the entries must be ordered by symbol
index. This groups each symbol’s relocations together.
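The .rel.dyn ordering rule can be sketched as a sort comparator. The Elf32_Rel layout and the ELF32_R_SYM macro follow the standard ELF definitions. How a link editor decides that an entry is local or external is not described here, so the is_external tag and the TaggedRel wrapper are assumptions made only for this illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Standard 32-bit ELF relocation entry (no addend). */
typedef struct {
    uint32_t r_offset;
    uint32_t r_info;
} Elf32_Rel;

#define ELF32_R_SYM(info) ((info) >> 8)

/* Hypothetical wrapper: is_external is 0 for a local entry, 1 otherwise. */
typedef struct {
    Elf32_Rel rel;
    int       is_external;
} TaggedRel;

/* Local entries sort first; within each group, order by symbol index. */
static int compare_rel(const void *a, const void *b)
{
    const TaggedRel *x = a;
    const TaggedRel *y = b;
    uint32_t sx = ELF32_R_SYM(x->rel.r_info);
    uint32_t sy = ELF32_R_SYM(y->rel.r_info);

    if (x->is_external != y->is_external)
        return x->is_external - y->is_external;
    return (sx > sy) - (sx < sy);
}

void order_rel_dyn(TaggedRel *rels, size_t count)
{
    qsort(rels, count, sizeof *rels, compare_rel);
}
```

Sorting on the symbol index as the secondary key is what places all relocations for one symbol next to each other.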
Appendix A
The tables in this appendix summarize the assembly language instruction set.
Most of the assembly language instructions have direct machine equivalents.
Refer to Appendix A and Appendix B of the MIPS RISC Architecture book
published by Prentice-Hall for detailed instruction descriptions. In the tables
in this appendix, the operand terms have the following meanings:
Operand Description
destination Destination register
address Symbolic expression (see Chapter 2)
source Source register
expression An absolute value
immediate Immediate value
label Symbol label
breakcode Value that determines the break
Trap if Less than, Unsigned tltu
Trap if Greater Than or Equal tge
Trap if Greater than or Equal, Unsigned tgeu
Absolute Value abs destination,src1
Negate with Overflow neg destination/src1
Negate without Overflow negu
NOT not
Doubleword Absolute Value dabs
Doubleword Negate with Overflow dneg
Doubleword Negate without Overflow dnegu
Add with Overflow add destination, src1, src2
Add without Overflow addu destination, src1, src2
AND and destination, src1, immediate
Divide Signed div destination/src1, immediate
Divide Unsigned divu
Exclusive-OR xor
Multiply mul
Multiply with Overflow mulo
Multiply with Overflow Unsigned mulou
NOT OR nor
OR or
Set Equal seq
Set Greater sgt
Set Greater/Equal sge
Set Greater/Equal Unsigned sgeu
Set Greater Unsigned sgtu
Set Less slt
Set Less/Equal sle
Set Less/Equal Unsigned sleu
Set Less Unsigned sltu
Set Not Equal sne
Subtract with Overflow sub
Subtract without Overflow subu
Remainder Signed rem
Remainder Unsigned remu
Rotate Left rol
Rotate Right ror
Shift Right Arithmetic sra
Shift Left Logical sll
Shift Right Logical srl
Multiply mult src1,src2
Multiply Unsigned multu
Control From Coprocessor z cfcz dest-gpr, source
Control To Coprocessor z ctcz src-gpr, destination
Branch on Equal Likely beql src1,src2,label
Branch on Greater Likely bgtl src1,immediate,label
Branch on Greater/Equal Likely bgel
Branch on Greater/Equal Unsigned Likely bgeul
Branch on Greater Unsigned Likely bgtul
Branch on Less Likely bltl
Branch on Less/Equal Likely blel
Branch on Less/Equal Unsigned Likely bleul
Branch on Less Unsigned Likely bltul
Branch on Not Equal Likely bnel
Branch on Equal to Zero beqz src1,label
Branch on Greater/Equal Zero bgez
Branch on Greater Than Zero bgtz
Branch on Greater or Equal to Zero and Link bgezal
Branch on Less Than Zero and Link bltzal
Branch on Less/Equal Zero blez
Branch on Less Than Zero bltz
Branch on Not Equal to Zero bnez
Branch on Equal to Zero Likely beqzl
Branch on Greater/Equal Zero Likely bgezl
Branch on Greater Than Zero Likely bgtzl
Branch on Greater or Equal to Zero and Link Likely bgezall
Branch on Less Than Zero and Link Likely bltzall
Branch on Less/Equal Zero Likely blezl
Branch on Less Than Zero Likely bltzl
Branch on Not Equal to Zero Likely bnezl
Break break breakcode
Exception Return eret
Restore From Exception rfe
Syscall syscall
Move From HI Register mfhi register
Move To HI Register mthi
Move From LO Register mflo
Move To LO Register mtlo
Move move destination,src1
Cache cache
Translation Lookaside Buffer Probe tlbp
Translation Lookaside Buffer Read tlbr
Translation Lookaside Buffer Write Random tlbwr
Translation Lookaside Write Index tlbwi
Synchronize sync
Fixed Point to Double Fp cvt.d.w
Single to Fixed Point Fp cvt.w.s
Double to Fixed Point Fp cvt.w.d
Long Fixed Point to Single Fp cvt.s.l
Long Fixed Point to Double FP cvt.d.l
Single to Long Fixed Point FP cvt.l.s
Double to Long Fixed Point FP cvt.l.d
Truncate and Round Operations
Truncate to Single Fp trunc.w.s destination, src, gpr
Truncate to Double Fp trunc.w.d
Round to Single Fp round.w.s
Round to Double Fp round.w.d
Ceiling to Double Fp ceil.w.d
Ceiling to Single Fp ceil.w.s
Ceiling to Double Fp, Unsigned ceilu.w.d
Ceiling to Single Fp, Unsigned ceilu.w.s
Floor to Double Fp floor.w.d
Floor to Single Fp floor.w.s
Floor to Double Fp, Unsigned flooru.w.d
Floor to Single Fp, Unsigned flooru.w.s
Round to Double Fp, Unsigned roundu.w.d
Round to Single Fp, Unsigned roundu.w.s
Truncate to Double Fp, Unsigned truncu.w.d
Truncate to Single Fp, Unsigned truncu.w.s
Truncate Single to Long Fixed Point trunc.l.s destination, src, gpr
Truncate Double to Long Fixed Point trunc.l.d
Round Single to Long Fixed Point round.l.s
Round Double to Long Fixed Point round.l.d
Ceiling Single to Long Fixed Point ceil.l.s
Ceiling Double to Long Fixed Point ceil.l.d
Floor Single to Long Fixed Point floor.l.s
Floor Double to Long Fixed Point floor.l.d
Compare F
Double c.f.d src1,src2
Single c.f.s
Compare UN
Double c.un.d
Single c.un.s
*Compare EQ
Double c.eq.d
Single c.eq.s
Compare UEQ
Single c.ueq.s
Compare OLT
Double c.olt.d
Single c.olt.s
Compare ULT
Double c.ult.d
Single c.ult.s
Compare OLE
Double c.ole.d
Single c.ole.s
Compare ULE
Double c.ule.d
Single c.ule.s
Compare SF
Double c.sf.d
Single c.sf.s
Compare NGLE
Double c.ngle.d src1, src2
Single c.ngle.s
Compare SEQ
Double c.seq.d
Single c.seq.s
Compare NGL
Double c.ngl.d
Single c.ngl.s
*Compare LT
Double c.lt.d
Single c.lt.s
Compare NGE
Double c.nge.d
Single c.nge.s
*Compare LE
Double c.le.d
Single c.le.s
Compare NGT
Double c.ngt.d
Single c.ngt.s
Move FP
Single mov.s destination,src1
Double mov.d
Appendix B
The assembly language instructions described in this book are distinct from
the actual machine instructions.
Generally, the assembly language instructions match the machine
instructions; however, in some cases the assembly language instructions are
macros that generate more than one machine instruction (the assembly
language multiplication instructions are examples).
Some machine instructions are not available as assembly language
instructions. For example, the jr machine instruction is not a valid assembly
language instruction. However, the j assembly language instruction with a
register operand gets translated into the jr machine instruction by the
assembler.
You can, in most instances, consider the assembly instructions as machine
instructions; however, for routines that require tight coding for performance
reasons, you must be aware of the assembly instructions that generate more
than one machine language instruction, as described in this appendix.
Computational Instructions
If a computational instruction immediate value falls outside the 0...65535
range for Logical ANDs, Logical ORs, or Logical XORs (exclusive or), the
immediate field causes the machine to explicitly load a constant to a
temporary register. Other instructions generate a single machine instruction
when a value falls in the –32768...32767 range.
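The two immediate ranges above can be expressed as a small C predicate. This is a sketch of the range test only; the OpClass type and the function name are hypothetical and do not correspond to anything inside the assembler.

```c
#include <stdint.h>

/* Hypothetical classification of a computational instruction. */
typedef enum {
    OP_LOGICAL,   /* logical AND, OR, XOR */
    OP_OTHER      /* other computational instructions */
} OpClass;

/* Returns 1 when the immediate lies in the range for which the text says a
 * single machine instruction is generated, 0 when it falls outside. */
int fits_immediate_field(OpClass cls, int32_t imm)
{
    if (cls == OP_LOGICAL)
        return imm >= 0 && imm <= 65535;
    return imm >= -32768 && imm <= 32767;
}
```

For example, the value 40000 fits for a logical OR but not for an add, while -1 fits for an add but not for a logical OR.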
The assembler’s seq (set equal) and sne (set not equal) instructions generate
three machine instructions each.
If one operand is a literal outside the range –32768...32767, the assembler’s
sge (set greater than or equal to) and sle (set less/equal) instructions generate
two machine instructions each.
The assembler’s mulo and mulou (multiply) instructions generate machine
instructions to test for overflow and to move the result to a general register;
if the destination register is $0, the check and move are not generated.
The assembler’s mul (multiply unsigned) instruction generates a machine
instruction to move the result to a general register; if the destination register
is $0, the move and divide–by–zero checking is not generated. The
assembler’s divide instructions, div (divide with overflow) and divu (divide
without overflow), generate machine instructions to check for division by
zero and to move the quotient into a general register; if the destination register
is $0, the move is not generated.
The assembler’s rem (signed) and remu (unsigned) instructions also generate
multiple instructions.
The rotate instructions ror (rotate right) and rol (rotate left) generate three
machine instructions each.
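One way to see why a rotate needs three operations is to write it with shifts in C. The exact sequence the assembler emits is not given here, so the decomposition below (one left shift, one right shift, one OR) is an assumption for illustration, and rol32 and ror32 are hypothetical names.

```c
#include <stdint.h>

/* Rotate left by n bits: one shift left, one shift right, one OR. */
uint32_t rol32(uint32_t x, unsigned n)
{
    n &= 31;
    if (n == 0)
        return x;                   /* avoid an undefined 32-bit shift */
    return (x << n) | (x >> (32 - n));
}

/* Rotate right by n bits, built the same way. */
uint32_t ror32(uint32_t x, unsigned n)
{
    n &= 31;
    if (n == 0)
        return x;
    return (x >> n) | (x << (32 - n));
}
```

Each function combines two shifts and an OR, which matches the count of three machine instructions stated above.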
The abs (absolute value) instruction generates three machine instructions.
Coprocessor Instructions
For symbolic addresses, the coprocessor interface Load and Store
instructions, lcz (load coprocessor z) and scz (store coprocessor z), can
generate a lui (load upper immediate) machine instruction.
Special Instructions
The assembler’s break instruction packs the breakcode operand in unused
register fields. An operating system convention determines the position.