Vasm PDF
Volker Barthelmann
Table of Contents
1 General
1.1 Introduction
1.2 Legal
1.3 Installation
2 The Assembler
2.1 General Assembler Options
2.2 Expressions
2.3 Symbols
2.4 Include Files
2.5 Macros
2.6 Structures
2.7 Conditional Assembly
2.8 Known Problems
2.9 Credits
2.10 Error Messages
25 Interface
25.1 Introduction
25.2 Building vasm
25.2.1 Directory Structure
25.2.2 Adapting the Makefile
25.2.3 Building vasm
25.3 General data structures
25.3.1 Source
25.3.2 Sections
25.3.3 Symbols
25.3.4 Register symbols
vi vasm manual
1 General
1.1 Introduction
vasm is a portable and retargetable assembler able to create linkable objects in different
formats as well as absolute code. Different CPU-, syntax- and output-modules are supported.
Many common directives/pseudo-opcodes are supported (depending on the syntax module)
as well as CPU-specific extensions.
The assembler supports optimizations and relaxations (e.g. choosing the shortest possible
branch instruction or addressing mode as well as converting a branch to an absolute jump
if necessary).
The concept is that you get a special vasm binary for any combination of CPU- and syntax-
module. All output modules, which make sense for the current CPU, are included in the
vasm binary and you have to make sure to choose the output file format you need (refer to
the next chapter and look for the -F option). The default is a test output, only useful for
debugging or analyzing the output.
1.2 Legal
vasm is copyright in 2002-2017 by Volker Barthelmann.
This archive may be redistributed without modifications and used for non-commercial
purposes.
An exception for commercial usage is granted, provided that the target CPU is M68k and
the target OS is AmigaOS. Resulting binaries may be distributed commercially without
further licensing.
In all other cases you need my written consent.
Certain modules may fall under additional copyrights.
1.3 Installation
The vasm binaries do not need additional files, so no further installation is necessary. To use
vasm with vbcc, copy the binary to vbcc/bin after following the installation instructions
for vbcc.
The vasm binaries are named vasm<cpu>_<syntax> with <cpu> representing the CPU-
module and <syntax> the syntax-module, e.g. vasm for PPC with the standard syntax
module is called vasmppc_std.
Sometimes the syntax-modifier may be omitted, e.g. vasmppc.
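The naming scheme can be illustrated with a short Python sketch. The helper name vasm_binary_name is hypothetical and only mirrors the vasm<cpu>_<syntax> pattern described above; it is not part of vasm.

```python
def vasm_binary_name(cpu, syntax=None):
    """Build a binary name following the vasm<cpu>_<syntax> pattern.

    Hypothetical helper for illustration only; vasm ships no such function.
    """
    if syntax is None:
        # Models the shortened form where the syntax part is omitted.
        return f"vasm{cpu}"
    return f"vasm{cpu}_{syntax}"


print(vasm_binary_name("ppc", "std"))  # name for PPC with the standard syntax module
print(vasm_binary_name("ppc"))         # shortened form without the syntax part
```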
Detailed instructions on how to build vasm can be found in the last chapter.
Chapter 2: The Assembler 3
2 The Assembler
This chapter describes the module-independent part of the assembler. It documents the
options and extensions which are not specific to a certain target, syntax or output driver.
Be sure to also read the chapters on the backend, syntax- and output-module you are
using. They will likely contain important additional information like data-representation
or additional options.
-Lnf Do not emit any form feed code into the listing file, for starting a new page.
-Lns Do not include symbols in the listing file.
-maxerrors=<n>
Defines the maximum number of errors to display before assembly is aborted.
When <n> is 0 then there is no limit. Defaults to 5.
-maxmacrecurs=<n>
Defines the maximum number of recursions within a macro. Defaults to
1000.
-nocase Disables case-sensitivity for everything - identifiers, directives and instructions.
Note that directives and instructions may already be case-insensitive by default
in some modules.
-noesc No escape character sequences. This will make vasm treat the escape character
\ as any other character. Might be useful for compatibility.
-noialign
Perform no automatic alignment for instructions. Note that unaligned instructions
make your code crash when executed! Only set this when you know what you are
doing!
-nosym Strips all local symbols from the output file and doesn’t include any other
symbols than those which are required for external linkage.
-nowarn=<n>
Disable warning message <n>. <n> has to be the number of a valid warning
message, otherwise an error is generated.
-o <ofile>
Write the generated assembler output to <ofile> rather than a.out.
-pic Try to generate position independent code. Every relocation is flagged by an
error message.
-quiet Do not print the copyright notice and the final statistics.
-unnamed-sections
Sections are no longer distinguished by their name, but only by their attributes.
This has the effect that when defining a second section with a different name
but same attributes as a first one, it will switch to the first, instead of starting
a new section.
-unsshift
The shift-right operator (>>) treats the value to shift as unsigned, which has
the effect that 0-bits are inserted on the left side. The number of bits in a
value depends on the target address type (refer to the appropriate cpu module
documentation).
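The difference between a signed and an unsigned right shift can be sketched in Python. This is only a model: the function name shift_right and the 32-bit default width are assumptions, and the sketch assumes that the default (without -unsshift) is a sign-preserving shift, which the text implies but does not state.

```python
def shift_right(value, count, bits=32, unsigned=False):
    """Model '>>' on a value of the given bit width (hypothetical helper)."""
    mask = (1 << bits) - 1
    value &= mask
    if unsigned:
        # 0-bits are inserted on the left side.
        return value >> count
    # Signed case: reinterpret as two's complement so the sign bit is replicated.
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return (value >> count) & mask


print(hex(shift_right(0x80000000, 4)))                 # signed shift
print(hex(shift_right(0x80000000, 4, unsigned=True)))  # unsigned shift
```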
-w Hide all warning messages.
-x Show an error message, when referencing an undefined symbol. The default
behaviour is to declare this symbol as externally defined.
Note that while most options allow an argument without any separating blank, some others
require it (e.g. -o and -L).
2.2 Expressions
Standard expressions are usually evaluated by the main part of vasm rather than by one of
the modules (unless this is necessary).
All expressions evaluated by the frontend are calculated in terms of target address values,
i.e. the range depends on the backend.
The available operators include all those which are common in assembler as well as in C
expressions.
C like operators:
• Unary: + - ! ~
• Arithmetic: + - * / % << >>
• Bitwise: & | ^
• Logical: && ||
• Comparative: < > <= >= == !=
Assembler like operators:
• Unary: + - ~
• Arithmetic: + - * / // << >>
• Bitwise: & ! ~
• Comparative: < > <= >= = <>
Up to version 1.4b the operators had the same precedence and associativity as in the C
language. Newer versions have changed the operator priorities to comply with the common
assembler behaviour. The expression evaluation priorities, from highest to lowest, are:
1. + - ! ~ (unary +/- sign, not, complement)
2. << >> (shift left, shift right)
3. & (bitwise and)
4. ^ ~ (bitwise exclusive-or)
5. | ! (bitwise inclusive-or)
6. * / % // (multiply, divide, modulo)
7. + - (plus, minus)
8. < > <= >= (less, greater, less or equal, greater or equal)
9. == != = <> (equality, inequality)
10. && (logical and)
11. || (logical or)
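The effect of these priorities can be checked with a small Python model. It is a sketch, not vasm's evaluator: it covers only a subset of the binary operators, assumes left associativity within a priority level, and all names (PRIORITY, APPLY, tokenize, evaluate) are hypothetical.

```python
import re

# Binary operator priorities taken from the list above (lower number binds tighter).
PRIORITY = {"<<": 2, ">>": 2, "&": 3, "^": 4, "|": 5, "*": 6, "+": 7, "-": 7}

APPLY = {
    "<<": lambda a, b: a << b,
    ">>": lambda a, b: a >> b,
    "&": lambda a, b: a & b,
    "^": lambda a, b: a ^ b,
    "|": lambda a, b: a | b,
    "*": lambda a, b: a * b,
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
}

TOKEN = re.compile(r"\s*(\d+|<<|>>|[-+*&^|()])")


def tokenize(text):
    """Split an expression into numbers, operators and parentheses."""
    tokens, pos = [], 0
    text = text.strip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(text):
    """Evaluate text using the priorities in PRIORITY (assumed left-associative)."""
    tokens = tokenize(text)
    pos = 0

    def primary():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = binary(7)
            pos += 1  # skip the closing parenthesis
            return value
        if token == "-":  # unary minus has the highest priority
            return -primary()
        return int(token)

    def binary(level):
        nonlocal pos
        if level < 2:
            return primary()
        left = binary(level - 1)
        while pos < len(tokens) and PRIORITY.get(tokens[pos]) == level:
            operator = tokens[pos]
            pos += 1
            left = APPLY[operator](left, binary(level - 1))
        return left

    return binary(7)


print(evaluate("1 + 2 << 3"))    # shift binds tighter than plus here
print(evaluate("(1 + 2) << 3"))  # parentheses give the C-style grouping
```

With C precedence, 1 + 2 << 3 groups as (1 + 2) << 3. Under the priorities listed above the shift is evaluated first, so sources written for versions up to 1.4b may need explicit parentheses.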
Operands are integral values of the target address type. They can either be specified as
integer constants of different bases (see the documentation on the syntax module to see how
the base is specified) or character constants. Character constants are introduced by ’ or "
and have to be terminated by the same character that started them.
Multiple characters are allowed and a constant is built according to the endianness of the
target.
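As an illustration, the following Python sketch builds a multi-character constant for both byte orders. It assumes that on a big-endian target the first character ends up in the most significant byte, and the reverse on a little-endian target; the helper name char_constant is hypothetical, and the exact packing used by vasm should be checked against the cpu module documentation.

```python
def char_constant(text, byteorder):
    """Pack ASCII characters into one integer ("big" or "little" byte order)."""
    return int.from_bytes(text.encode("ascii"), byteorder)


print(hex(char_constant("AB", "big")))     # first character in the high byte
print(hex(char_constant("AB", "little")))  # first character in the low byte
```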
When the -esc option was specified, or automatically enabled by a syntax module, vasm
interprets escape character sequences as in the C language:
\\ Produces a single \.
\f Form feed.
\n Line feed.
\r Carriage return.
\t Tabulator.
\’ Produces a single ’.
\<octal-digits>
One character with the code specified by the digits as octal value.
\x<hexadecimal-digits>
One character with the code specified by the digits as hexadecimal value.
\X<hexadecimal-digits>
Same as \x.
Note that the default behaviour of vasm has changed since V1.7! Escape sequence handling
has been the default in older versions. This has been changed to increase compatibility with
other assemblers. Use -esc to assemble sources with escape character sequences. It is still
the default in the std syntax module, though.
2.3 Symbols
You can define as many symbols as your available memory permits. A symbol may have
any length and can be of global or local scope. Internally, there are three types of symbols:
Expression
These symbols are usually not visible outside the source, unless they are
explicitly exported.
Label Labels are always addresses inside a program section. By default they have
local scope for the linker.
Imported These symbols are externally defined and must be resolved by the linker.
Beginning with vasm V1.5c one expression symbol is always defined to allow conditional
assembly depending on the assembler being used: __VASM. Its value depends on the selected
cpu module. There may be other symbols which are pre-defined by the syntax- or by the
cpu module.
2.5 Macros
Macros are supported by vasm, but the directives for defining them have to be implemented
in the syntax module. The assembler core supports 9 macro arguments by default to be
passed in the operand field, which can be extended to any number by the syntax module.
They can be referenced inside the macro either by name (\name) or by number (\1 to \9),
or both, depending on the syntax module. Recursions and early exits are supported.
Refer to the selected syntax module for more details.
2.6 Structures
Vasm supports structures, but the directives for defining them have to be implemented in
the syntax module.
2.9 Credits
All those who wrote parts of the vasm distribution, made suggestions, answered my
questions, tested vasm, reported errors or were otherwise involved in the development of vasm
(in descending alphabetical order, under work, not complete):
• Joseph Zatarski
• Frank Wille
• Henryk Richter
• Sebastian Pachuta
• Esben Norby
• Gunther Nikl
• George Nakos
• Timm S. Mueller
• Gareth Morris
• Dominic Morris
• Mauricio Muñoz Lucero
• Jörg van de Loo
• Robert Leffmann
• Andreas Larsson
• Miro Kropacek
• Mikael Kalms
• Matthew Hey
• Philippe Guichardon
• Romain Giot
• Francois Galea
• Tom Duin
• Karoly Balogh
3.1 Legal
This module is written in 2002-2017 by Volker Barthelmann and is covered by the vasm
copyright without modifications.
3.4 Directives
All directives are case-insensitive. The following directives are supported by this syntax
module (if the CPU- and output-module allow it):
.2byte <exp1>[,<exp2>...]
See .uahalf.
.4byte <exp1>[,<exp2>...]
See .uaword.
.8byte <exp1>[,<exp2>...]
See .uaquad.
.ascii <exp1>[,<exp2>,"<string1>"...]
See .byte.
.abort <message>
Print an error and stop assembly immediately.
.asciiz "<string1>"[,"<string2>"...]
See .string.
.align <bitorbyte_count>[,<fill>][,<maxpad>]
Depending on the current CPU backend .align either behaves like .balign
(x86) or like .p2align (PPC).
.balign <byte_count>[,<fill>][,<maxpad>]
Insert as many fill bytes as required to reach an address which is divisible by
<byte_count>. For example .balign 2 would make an alignment to the next
16-bit boundary. The padding bytes are initialized by <fill>, when given. The
optional third argument defines a maximum number of padding bytes to use.
When more are needed, the alignment is not done at all.
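The padding rule for .balign can be sketched in Python. The function name balign_padding is hypothetical and the code only models the description above, including the rule that no padding is inserted when more than <maxpad> bytes would be needed.

```python
def balign_padding(address, byte_count, maxpad=None):
    """Return the number of fill bytes a .balign-style alignment would insert."""
    # Distance from address to the next multiple of byte_count (0 if aligned).
    padding = (-address) % byte_count
    if maxpad is not None and padding > maxpad:
        # More padding than allowed: the alignment is skipped entirely.
        return 0
    return padding


print(balign_padding(0x1001, 2))             # align to the next 16-bit boundary
print(balign_padding(0x1001, 16, maxpad=4))  # too much padding needed, skipped
```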
.balignl <bit_count>[,<fill>][,<maxpad>]
Works like .balign, with the only difference that the optional fill value can
be specified as a 32-bit word. Padding locations which are not already 32-bit
aligned will cause a warning and are padded with zero-bytes.
.balignw <bit_count>[,<fill>][,<maxpad>]
Works like .balign, with the only difference that the optional fill value can
be specified as a 16-bit word. Padding locations which are not already 16-bit
aligned will cause a warning and are padded with zero-bytes.
.byte <exp1>[,<exp2>,"<string1>"...]
Assign the integer or string constant operands into successive bytes of memory
in the current section. Any combination of integer and character string constant
operands is permitted.
.comm <symbol>,<size>[,<align>]
Defines a common symbol which has a size of <size> bytes. The final size and
alignment will be assigned by the linker, which will use the highest size and
alignment values of all common symbols with the same name found. A common
symbol is allocated in the .bss section in the final executable. ".comm"-areas
Chapter 3: Standard Syntax Module 13
of less than 8 bytes in size are aligned to word boundaries, otherwise to
doubleword boundaries.
.double <exp1>[,<exp2>...]
Parse one or more double precision floating point expressions and write them
into successive blocks of 8 bytes into memory using the backend’s endianness.
.endm Ends a macro definition.
.endr Ends a repetition block.
.equ <symbol>,<expression>
See .set.
.equiv <symbol>,<expression>
Assign the <expression> to <symbol> similar to .equ and .set, but signals an
error when <symbol> has already been defined.
.err <message>
Print a user error message. Do not create an output file.
.extern <symbol>[,<symbol>...]
See .global.
.fail <expression>
Cause a warning when <expression> is greater than or equal to 500. Otherwise cause an
error.
.file "string"
Set the filename of the input source. This may be used by some output modules.
By default, the input filename passed on the command line is used.
.float <exp1>[,<exp2>...]
Parse one or more single precision floating point expressions and write them
into successive blocks of 4 bytes into memory using the backend’s endianness.
.global <symbol>[,<symbol>...]
Flag <symbol> as an external symbol, which means that <symbol> is visible to
all modules in the linking process. It may be either defined or undefined.
.globl <symbol>[,<symbol>...]
See .global.
.half <exp1>[,<exp2>...]
Assign the values of the operands into successive 16-bit words of memory in the
current section using the backend’s endianness.
.if <expression>
Conditionally assemble the following lines if <expression> is non-zero.
.ifeq <expression>
Conditionally assemble the following lines if <expression> is zero.
.ifne <expression>
Conditionally assemble the following lines if <expression> is non-zero.
.ifgt <expression>
Conditionally assemble the following lines if <expression> is greater than zero.
.ifge <expression>
Conditionally assemble the following lines if <expression> is greater than zero
or equal.
.iflt <expression>
Conditionally assemble the following lines if <expression> is less than zero.
.ifle <expression>
Conditionally assemble the following lines if <expression> is less than zero or
equal.
.ifb <operand>
Conditionally assemble the following lines when <operand> is completely blank,
except an optional comment.
.ifnb <operand>
Conditionally assemble the following lines when <operand> is non-blank.
.ifdef <symbol>
Conditionally assemble the following lines if <symbol> is defined.
.ifndef <symbol>
Conditionally assemble the following lines if <symbol> is undefined.
.incbin <file>
Inserts the binary contents of <file> into the object code at this position. The
file will be searched first in the current directory, then in all paths defined by
-I or .incdir in the order of occurrence.
.incdir <path>
Add another path to search for include files to the list of known paths. Paths
defined with -I on the command line are searched first.
.include <file>
Include source text of <file> at this position. The include file will be searched
first in the current directory, then in all paths defined by -I or .incdir in the
order of occurrence.
.int <exp1>[,<exp2>...]
See .long.
.irp <symbol>[,<val>...]
Iterates the block between .irp and .endr for each <val>. The current <val>,
which may be embedded in quotes, is assigned to \symbol. If no value is given,
then the block is assembled once, with \symbol set to an empty string.
.irpc <symbol>[,<val>...]
Iterates the block between .irpc and .endr for each character in each <val>,
and assigns it to \symbol. If no value is given, then the block is assembled once,
with \symbol set to an empty string.
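The iteration rules of the two directives can be modelled with plain text substitution in Python. This is a sketch under simplifying assumptions: the function names are hypothetical, quote handling around values is not modelled, and the body text push \reg is just placeholder source text.

```python
def expand_irp(symbol, values, body):
    """Model .irp: one copy of body per value, with \\symbol replaced by the value."""
    items = list(values) or [""]  # no values: assemble once with an empty string
    return [body.replace("\\" + symbol, item) for item in items]


def expand_irpc(symbol, values, body):
    """Model .irpc: one copy of body per character of each value."""
    chars = [ch for value in values for ch in value] or [""]
    return [body.replace("\\" + symbol, ch) for ch in chars]


print(expand_irp("reg", ["r1", "r2"], "push \\reg"))
print(expand_irpc("d", ["12", "3"], ".byte \\d"))
```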
.lcomm <symbol>,<size>[,<alignment>]
Allocate <size> bytes of space in the .bss section and assign the address of that
location to <symbol>. If <alignment> is given, then the space will be aligned
to an address having <alignment> low zero bits or 2, whichever is greater.
<symbol> may be made globally visible by the .globl directive.
.list The following lines will appear in the listing file, if it was requested.
.local <symbol>[,<symbol>...]
Flag <symbol> as a local symbol, which means that <symbol> is local for the
current file and invisible to other modules in the linking process.
.long <exp1>[,<exp2>...]
Assign the values of the operands into successive 32-bit words of memory in the
current section using the backend’s endianness.
.org <exp>
Before any other section directive, <exp> defines the absolute start address of
the program. Within a section <exp> defines the offset from the start of this
section for the subsequent code. When <exp> starts with a current-pc symbol
followed by a plus (+) operator, then the directive behaves like .space.
.p2align <bit_count>[,<fill>][,<maxpad>]
Insert as many fill bytes as required to reach an address where <bit_count> low
order bits are zero. For example .p2align 2 would make an alignment to the
next 32-bit boundary. The padding bytes are initialized by <fill>, when given.
The optional third argument defines a maximum number of padding bytes to
use. When more are needed, the alignment is not done at all.
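In contrast to .balign, the argument of .p2align is a power of two exponent. The following Python sketch models the description above; the function name p2align_padding is hypothetical.

```python
def p2align_padding(address, bit_count, maxpad=None):
    """Return the number of fill bytes a .p2align-style alignment would insert."""
    alignment = 1 << bit_count  # bit_count low-order zero bits: 2**bit_count bytes
    padding = (-address) & (alignment - 1)
    if maxpad is not None and padding > maxpad:
        # More padding than allowed: the alignment is skipped entirely.
        return 0
    return padding


print(p2align_padding(0x1002, 2))            # align to the next 32-bit boundary
print(p2align_padding(0x1001, 4, maxpad=8))  # too much padding needed, skipped
```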
.p2alignl <bit_count>[,<fill>][,<maxpad>]
Works like .p2align, with the only difference that the optional fill value can
be specified as a 32-bit word. Padding locations which are not already 32-bit
aligned will cause a warning and are padded with zero-bytes.
.p2alignw <bit_count>[,<fill>][,<maxpad>]
Works like .p2align, with the only difference that the optional fill value can
be specified as a 16-bit word. Padding locations which are not already 16-bit
aligned will cause a warning and are padded with zero-bytes.
.quad <exp1>[,<exp2>...]
Assign the values of the operands into successive quadwords (64-bit) of memory
in the current section using the backend’s endianness.
.rept <expression>
Repeats the assembly of the block between .rept and .endr <expression>
number of times. <expression> has to be positive.
.section <name>[,"<attributes>"][[,@<type>]|[,%<type>]|[,<mem_flags>]]
Starts a new section named <name> or reactivates an old one. If attributes are
given for an already existing section, they must match exactly. The section’s
name will also be defined as a new symbol, which represents the section’s start
address. The "<attributes>" string may consist of the following characters:
Section Contents:
c section has code
d section has initialized data
u section has uninitialized data
i section has directives (info section)
n section can be discarded
R remove section at link time
a section is allocated in memory
Section Protection:
r section is readable
w section is writable
x section is executable
s section is sharable
Section Alignment: A digit, which is ignored. The assembler will automatically
align the section to the highest alignment restriction used within.
Memory flags (Amiga hunk format only):
C load section to Chip RAM
F load section to Fast RAM
The optional <type> argument is mainly used for ELF output and may be
introduced either by a ’%’ or a ’@’ character. Allowed are:
progbits This is the default value, which means the section data occupies
space in the file and may have initialized data.
nobits These sections do not occupy any space in the file and will be
allocated filled with zero bytes by the OS loader.
When the optional, non-standard, <mem_flags> argument is given it defines
a 32-bit memory attribute, which defines where to load the section (platform
specific). The memory attributes are currently only used in the hunk-format
output module.
.set <symbol>,<expression>
Create a new program symbol with the name <symbol> and assign to it the
value of <expression>. If <symbol> is already assigned, it will contain a new
value from now on.
.size <symbol>,<size>
Set the size in bytes of an object defined at <symbol>.
.short <exp1>[,<exp2>...]
See .half.
.single <exp1>[,<exp2>...]
Same as .float.
.skip <exp>[,<fill>]
Insert <exp> zero or <fill> bytes into the current section.
.space <exp>[,<fill>]
Insert <exp> zero or <fill> bytes into the current section.
.stabs "<name>",<type>,<other>,<desc>,<exp>
Add a stab entry for debugging, including a symbol-string and an expression.
.stabn <type>,<other>,<desc>,<exp>
Add a stab entry for debugging, without a symbol-string.
.stabd <type>,<other>,<desc>
Add a stab entry for debugging, without symbol-string and value.
.string "<string1>"[,"<string2>"...]
Like .byte, but adds a terminating zero-byte.
.swbeg <op>
Just for compatibility. Do nothing.
.type <symbol>,<type>
Set type of symbol called <symbol> to <type>, which must be one of:
1: Object
2: Function
3: Section
4: File
The predefined symbols @object and @function are available for this purpose.
.uahalf <exp1>[,<exp2>...]
Assign the values of the operands into successive 16-bit areas of memory in the
current section regardless of current alignment.
.ualong <exp1>[,<exp2>...]
Assign the values of the operands into successive 32-bit areas of memory in the
current section regardless of current alignment.
.uaquad <exp1>[,<exp2>...]
Assign the values of the operands into successive 64-bit areas of memory in the
current section regardless of current alignment.
.uashort <exp1>[,<exp2>...]
Assign the values of the operands into successive 16-bit areas of memory in the
current section regardless of current alignment.
.uaword <exp1>[,<exp2>...]
Assign the values of the operands into successive 16-bit areas of memory in the
current section regardless of current alignment.
.weak <symbol>[,<symbol>...]
Flag <symbol> as a weak symbol, which means that <symbol> is visible to all
modules in the linking process and may be replaced by any global symbol with
the same name. When a weak symbol remains undefined its value defaults to
0.
.word <exp1>[,<exp2>...]
Assign the values of the operands into successive 16-bit words of memory in the
current section using the backend’s endianness.
.zero <exp>[,<fill>]
Insert <exp> zero or <fill> bytes into the current section.
Predefined section directives:
.bss .section ".bss","aurw"
.data .section ".data","adrw"
.rodata .section ".rodata","adr"
.sbss .section ".sbss","aurw"
.sdata .section ".sdata","adrw"
.sdata2 .section ".sdata2","adr"
.stab .section ".stab","dr"
.stabstr .section ".stabstr","dr"
.text .section ".text","acrx"
.tocd .section ".tocd","adrw"
4.1 Legal
This module is written in 2002-2017 by Frank Wille and is covered by the vasm copyright
without modifications.
-warncomm
Warn about all lines, which have comments in the operand field, introduced by
a blank character. For example in: dc.w 1 + 2.
4.4 Directives
The following directives are supported by this syntax module (provided the CPU- and
output-module support them):
<symbol> = <expression>
Equivalent to <symbol> equ <expression>.
<symbol> =.s <expression>
Equivalent to <symbol> fequ.s <expression>. PhxAss compatibility.
<symbol> =.d <expression>
Equivalent to <symbol> fequ.d <expression>. PhxAss compatibility.
Chapter 4: Mot Syntax Module 23
dcb.l <exp>[,<fill>]
Insert <exp> zero or <fill> 32-bit words into the current section.
dcb.q <exp>[,<fill>]
Insert <exp> zero or <fill> 64-bit words into the current section.
dcb.s <exp>[,<fill>]
Insert <exp> zero or <fill> 32-bit words into the current section. <fill> might
also be an IEEE single precision constant.
dcb.w <exp>[,<fill>]
Insert <exp> zero or <fill> 16-bit words into the current section.
dcb.x <exp>[,<fill>]
Insert <exp> zero or <fill> 96-bit words into the current section. <fill> might
also be an IEEE extended precision constant.
dr.b <exp1>[,<exp2>...]
Calculates <expN> - <current pc value> and stores it into successive bytes of
memory in the current section.
dr.w <exp1>[,<exp2>...]
Calculates <expN> - <current pc value> and stores it into successive 16-bit words
of memory in the current section.
dr.l <exp1>[,<exp2>...]
Calculates <expN> - <current pc value> and stores it into successive 32-bit words
of memory in the current section.
ds.b <exp>
Equivalent to dcb.b <exp>,0.
ds.d <exp>
Equivalent to dcb.d <exp>,0.
ds.l <exp>
Equivalent to dcb.l <exp>,0.
ds.q <exp>
Equivalent to dcb.q <exp>,0.
ds.s <exp>
Equivalent to dcb.s <exp>,0.
ds.w <exp>
Equivalent to dcb.w <exp>,0.
ds.x <exp>
Equivalent to dcb.x <exp>,0.
dseg Equivalent to section data,data.
echo <string>
Prints <string> to stdout.
einline End a block of isolated local labels, started by inline.
else Assemble the following lines if the previous if condition was false.
end Assembly will terminate behind this line.
endif Ends a section of conditional assembly.
endm Ends a macro definition.
endr Ends a repetition block.
<symbol> equ <expression>
Define a new program symbol with the name <symbol> and assign to it the
value of <expression>. Defining <symbol> twice will cause an error.
<symbol> equ.s <expression>
Equivalent to <symbol> fequ.s <expression>. PhxAss compatibility.
<symbol> equ.d <expression>
Equivalent to <symbol> fequ.d <expression>. PhxAss compatibility.
<symbol> equ.x <expression>
Equivalent to <symbol> fequ.x <expression>. PhxAss compatibility.
<symbol> equ.p <expression>
Equivalent to <symbol> fequ.p <expression>. PhxAss compatibility.
erem Ends a commented-out block. Assembly will continue.
even Aligns to an even address. Equivalent to cnop 0,2.
fail <message>
Show an error message including the <message> string. Do not generate an
output file.
<symbol> fequ.s <expression>
Define a new program symbol with the name <symbol> and assign to it the
floating point value of <expression>. Defining <symbol> twice will cause an
error. The extension is for Devpac-compatibility, but will be ignored.
<symbol> fequ.d <expression>
Equivalent to <symbol> fequ.s <expression>.
<symbol> fequ.x <expression>
Equivalent to <symbol> fequ.s <expression>.
<symbol> fequ.p <expression>
Equivalent to <symbol> fequ.s <expression>.
<label> fo.<size> <expression>
Assigns the current value of the stack-frame offset counter to <label>.
Afterwards the counter is decremented by the instruction’s <size> multiplied by
<expression>. Any valid M68k size extension is allowed for <size>: b, w, l, q,
s, d, x, p. The offset counter can also be referenced directly under the name
__FO.
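The bookkeeping done by fo can be modelled in Python as follows. This is a sketch that follows the description literally (assign first, then decrement). The class name FrameOffsets, the start value of 0 and the byte sizes in SIZES (the usual M68k operand sizes) are assumptions made for this illustration.

```python
# Assumed byte sizes of the M68k size extensions.
SIZES = {"b": 1, "w": 2, "l": 4, "q": 8, "s": 4, "d": 8, "x": 12, "p": 12}


class FrameOffsets:
    """Hypothetical model of the stack-frame offset counter (__FO)."""

    def __init__(self, start=0):
        self.fo = start   # current value of the offset counter
        self.labels = {}  # label name -> assigned offset

    def fo_directive(self, label, size, count):
        """Model '<label> fo.<size> <count>'."""
        self.labels[label] = self.fo      # label gets the current counter value
        self.fo -= SIZES[size] * count    # afterwards the counter is decremented


frame = FrameOffsets()
frame.fo_directive("saved", "l", 1)
frame.fo_directive("buffer", "w", 2)
print(frame.labels, frame.fo)
```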
idnt <name>
Sets the file or module name in the generated object file to <name>, when the
selected output module supports it. By default, the input filename passed on
the command line is used.
if <expression>
Conditionally assemble the following lines if <expression> is non-zero.
ifeq <expression>
Conditionally assemble the following lines if <expression> is zero.
ifne <expression>
Conditionally assemble the following lines if <expression> is non-zero.
ifgt <expression>
Conditionally assemble the following lines if <expression> is greater than zero.
ifge <expression>
Conditionally assemble the following lines if <expression> is greater than zero
or equal.
iflt <expression>
Conditionally assemble the following lines if <expression> is less than zero.
ifle <expression>
Conditionally assemble the following lines if <expression> is less than zero or
equal.
ifb <operand>
Conditionally assemble the following lines when <operand> is completely blank,
except an optional comment.
ifnb <operand>
Conditionally assemble the following lines when <operand> is non-blank.
ifc <string1>,<string2>
Conditionally assemble the following lines if <string1> matches <string2>.
ifnc <string1>,<string2>
Conditionally assemble the following lines if <string1> does not match <string2>.
ifd <symbol>
Conditionally assemble the following lines if <symbol> is defined.
ifnd <symbol>
Conditionally assemble the following lines if <symbol> is undefined.
ifmacrod <macro>
Conditionally assemble the following line if <macro> is defined.
ifmacrond <macro>
Conditionally assemble the following line if <macro> is undefined.
incbin <file>[,<offset>[,<length>]]
Inserts the binary contents of <file> into the object code at this position. When
<offset> is specified, then the given number of bytes will be skipped at the
beginning of the file. The optional <length> argument specifies the maximum
number of bytes to be read from that file. The file will be searched first in
the current directory, then in all paths defined by -I or incdir in the order of
occurrence.
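The effect of the optional <offset> and <length> arguments can be sketched in Python on an in-memory byte string. The function name incbin_bytes is hypothetical, and the search through the include paths is not modelled.

```python
def incbin_bytes(data, offset=0, length=None):
    """Return the part of data that an incbin with offset/length would insert."""
    chunk = data[offset:]          # skip <offset> bytes at the beginning
    if length is not None:
        chunk = chunk[:length]     # read at most <length> bytes
    return chunk


print(incbin_bytes(b"ABCDEFGH", 2, 3))  # skip two bytes, then take three
print(incbin_bytes(b"ABCDEFGH", 6, 10)) # fewer bytes available than requested
```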
incdir <path>
Add another path to search for include files to the list of known paths. Paths
defined with -I on the command line are searched first.
include <file>
Include source text of <file> at this position. The include file will be searched
first in the current directory, then in all paths defined by -I or incdir in the
order of occurrence.
inline Local labels in the following block are isolated from previous local labels and
those after einline.
list The following lines will appear in the listing file, if it was requested.
llen <len>
Set the line length in a listing file to a maximum of <len> characters. Currently
without any effect.
macro <name>
Defines a macro which can be referenced by <name>. The <name> may also
appear at the left side of the macro directive, starting at the first column.
Then the operand field is ignored. The macro definition is closed by an endm
directive. When calling a macro you may pass up to 9 arguments, separated by
comma. Those arguments are referenced within the macro context as \1 to \9.
Parameter \0 is set to the macro’s first qualifier (mnemonic extension), when
given. In Devpac- and PhxAss-compatibility mode, or with option -allmp, up
to 35 arguments are accepted, where argument 10-35 can be referenced by \a
to \z.
Special macro parameters:
\@ Insert a unique id, useful for defining labels. Every macro call gets
its own unique id.
\@! Push the current unique id onto a global id stack, then insert it.
\@? Push the current unique id below the top element of the global id
stack, then insert it.
\@@ Pull the top element from the global id stack and insert it. The
macro’s current unique id is not affected by this operation.
\# Insert the number of arguments that have been passed to this
macro. Equivalent to the contents of NARG.
\?n Insert the length of the n’th macro argument.
\. Insert the argument which is selected by the current value of the
CARG symbol (first argument, when CARG is 1).
\+ Works like \., but increments the value of CARG after that.
\- Works like \., but decrements the value of CARG after that.
\<symbolname>
Inserts the current decimal value of the absolute symbol
symbolname.
\<$symbolname>
Inserts the current hexadecimal value of the absolute symbol
symbolname, without leading $.
mexit Leave the current macro and continue with assembling the parent context. Note
that this directive also resets the level of conditional assembly to a state before
the macro was invoked (which means that it works as a ’break’ command on
all new if directives).
nolist The following lines will not be visible in a listing file.
nopage Never start a new page in the listing file. This implementation will only prevent
emitting the formfeed code.
nref <symbol>[,<symbol>...]
Flag <symbol> as externally defined, similar to xref, but also indicate that ref-
erences should be optimized to base-relative addressing modes, when possible.
This directive is only present in PhxAss-compatibility mode.
odd Aligns to an odd address. Equivalent to cnop 1,2.
offset [<expression>]
Switches to a special offset-section. The contents of such a section are not
included in the output. Its labels may be referenced as absolute offset symbols.
Can be used to define structure offsets. The optional <expression> gives the
start offset for this section. When missing, the last offset of the previous
offset-section is used, or 0.
org <expression>
Sets the base address for the subsequent code. Note that it is allowed to embed
such an absolute ORG block into a section. Return to relocatable mode with
any new section directive. In Devpac-compatibility mode, however, the previous
section will stay absolute.
output <name>
Sets the output file name to <name> when no output name was given on the
command line. A special case for Devpac-compatibility is when <name> starts
with a ’.’ and an output name was already given. Then the current output
name gets <name> appended as an extension. When an extension already exists,
then it is replaced.
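The naming rules of the output directive can be summarised by the following Python sketch. It is an illustration of the text above under stated assumptions: the function name resolve_output_name is hypothetical, and os.path.splitext stands in for however vasm actually detects an existing extension.

```python
import os
from typing import Optional


def resolve_output_name(current: Optional[str], name: str) -> str:
    """Model the output directive.

    current: output name given on the command line, or None when there was none.
    name:    operand of the output directive.
    """
    if current is None:
        return name                          # no name on the command line: use <name>
    if name.startswith("."):
        # Devpac special case: <name> is an extension for the existing name.
        base, _old_ext = os.path.splitext(current)
        return base + name                   # appended, or replacing an old extension
    return current                           # otherwise the command line takes precedence
```

With a command-line name of prog.bin, the operand .o would give prog.o in this model.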
page Start a new page in the listing file (not implemented). Make sure to start a
new page when the maximum page length is reached.
plen <len>
Sets the page length for a listing file to <len> lines. Currently ignored.
printt <string>[,<string>...]
Prints <string> to stdout. Each additional string is printed on a new line.
Quotes are optional.
printv <expression>[,<expression>...]
Evaluate <expression> and print it to stdout in hexadecimal, decimal, ASCII
and binary format.
public <symbol>[,<symbol>...]
Flag <symbol> as an external symbol, which means that <symbol> is visible to
all modules in the linking process. It may be either defined or undefined.
rem The assembler will ignore everything from encountering the rem directive until
an erem directive is found.
rept <expression>
Repeats the assembly of the block between rept and endr <expression> number
of times. <expression> has to be positive. The internal symbol REPTN always
holds the iteration counter of the inner repeat loop, starting with 0. REPTN is
-1 outside of any repeat block.
rorg <expression>
Sets the program counter <expression> bytes behind the start of the current
section. The new program counter must not be smaller than the current one.
The space will be padded with zeros.
<label> rs.<size> <expression>
Works like the so directive, with the only difference that the offset symbol is
named __RS.
rsreset Equivalent to clrso, but the symbol manipulated is __RS.
rsset Equivalent to setso, but the symbol manipulated is __RS.
section [<name>,]<sec_type>[,<mem_type>]
Starts a new section named <name> or reactivates an old one. <sec_type>
defines the section type and may be code, text (same as code), data or bss.
<sec_type> defaults to code in PhxAss mode. Otherwise a single argument
will start a section with the type and name of <sec_type>. When <mem_type>
is given it defines a 32-bit memory attribute, which defines where to load the
section. <mem_type> is either a numerical constant or one of the keywords chip
(for Chip-RAM) or fast (for Fast-RAM). Optionally it is also possible to attach
the suffix _C, _F or _P to the <sec_type> argument for defining the memory
type. The memory attributes are currently only used in the hunk-format output
module.
<symbol> set <expression>
Create a new symbol with the name <symbol> and assign the value of <expres-
sion>. If <symbol> is already assigned, it will contain a new value from now
on.
setfo <expression>
Sets the stack-frame offset counter to <expression>. See fo directive.
setso <expression>
Sets the structure offset counter to <expression>. See so directive.
<label> so.<size> <expression>
Assigns the current value of the structure offset counter to <label>. Afterwards
the counter is incremented by the instruction’s <size> multiplied by <expres-
sion>. Any valid M68k size extension is allowed for <size>: b, w, l, q, s, d, x,
p. The offset counter can also be referenced directly under the name __SO.
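How the structure offset counter advances can be illustrated with a small Python model. This is a sketch and not vasm code: the class name is hypothetical, and the byte widths in the table are the usual M68k operand sizes, which the text above does not spell out.

```python
# Assumed byte widths of the M68k size extensions (not stated in the text above).
SIZE_BYTES = {"b": 1, "w": 2, "l": 4, "q": 8, "s": 4, "d": 8, "x": 12, "p": 12}


class StructOffsetCounter:
    """Models the counter that is visible as __SO."""

    def __init__(self, start: int = 0):
        self.so = start                          # setso would assign this value

    def so_directive(self, size: str, count: int) -> int:
        """Model '<label> so.<size> <count>' and return the value given to <label>."""
        label_value = self.so                    # the label receives the current counter
        self.so += SIZE_BYTES[size] * count      # then the counter is incremented
        return label_value


counter = StructOffsetCounter()
first = counter.so_directive("w", 1)    # one word
second = counter.so_directive("l", 2)   # two longwords
third = counter.so_directive("b", 1)    # one byte
```

In this model the three labels receive 0, 2 and 10, and the counter ends at 11. The rs directive behaves the same way, only with __RS as the counter.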
spc <lines>
Output <lines> number of blank lines in the listing file. Currently without any
effect.
text Equivalent to section code,code.
ttl <name>
PhxAss syntax. Equivalent to idnt <name>.
<name> ttl
Motorola syntax. Equivalent to idnt <name>.
weak <symbol>[,<symbol>...]
Flag <symbol> as a weak symbol, which means that <symbol> is visible to all
modules in the linking process and may be replaced by any global symbol with
the same name. When a weak symbol remains undefined its value defaults to
0.
xdef <symbol>[,<symbol>...]
Flag <symbol> as a global symbol, which means that <symbol> is visible to
all modules in the linking process. See also public.
xref <symbol>[,<symbol>...]
Flag <symbol> as externally defined, which means it has to be imported from
another module in the linking process. See also public.
5.1 Legal
This module is written in 2015 by Frank Wille and is covered by the vasm copyright without
modifications.
5.3 Directives
The following directives are supported by this syntax module (if the CPU- and output-
module allow it). Note that all directives, besides the equals-character, may be optionally
preceded by a dot (.).
<symbol> = <expression>
Equivalent to <symbol> equ <expression>.
<symbol> == <expression>
Equivalent to <symbol> equ <expression>, but declare <symbol> as externally
visible.
assert <expression>[,<expression>...]
Assert that all conditions are true (non-zero), otherwise issue a warning.
bss The following data (space definitions) are going into the BSS section. The BSS
section cannot contain any initialized data.
data The following data are going into the data section, which usually contains pre-
initialized data and no executable code.
dc <exp1>[,<exp2>...]
Equivalent to dc.w.
dc.b <exp1>[,<exp2>,"<string1>",’<string2>’...]
Assign the integer or string constant operands into successive bytes of memory
in the current section. Any combination of integer and character string constant
operands is permitted.
dc.i <exp1>[,<exp2>...]
Assign the values of the operands into successive 32-bit words of memory in
the current section. In contrast to dc.l the high and low half-words will be
swapped as with the Jaguar-RISC movei instruction.
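The half-word swap performed by dc.i can be written as a one-line computation. The Python sketch below only shows the arithmetic on a 32-bit value; the function name is hypothetical and nothing here is taken from the vasm sources.

```python
def swap_halfwords(value: int) -> int:
    """Exchange the high and low 16-bit halves of a 32-bit value."""
    value &= 0xFFFFFFFF                          # restrict to 32 bits
    return ((value & 0xFFFF) << 16) | (value >> 16)
```

For example, the operand 0x12345678 would be stored as 0x56781234, whereas dc.l stores it unchanged.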
dc.l <exp1>[,<exp2>...]
Assign the values of the operands into successive 32-bit words of memory in the
current section.
dc.w <exp1>[,<exp2>...]
Assign the values of the operands into successive 16-bit words of memory in the
current section.
dcb Equivalent to dcb.w.
dcb.b <exp>[,<fill>]
Insert <exp> zero or <fill> bytes into the current section.
dcb.l <exp>[,<fill>]
Insert <exp> zero or <fill> 32-bit words into the current section.
dcb.w <exp>[,<fill>]
Insert <exp> zero or <fill> 16-bit words into the current section.
dphrase Align the program counter to the next integral double phrase boundary (16
bytes).
ds <exp> Equivalent to dcb.w <exp>,0.
ds.b <exp>
Equivalent to dcb.b <exp>,0.
ds.l <exp>
Equivalent to dcb.l <exp>,0.
ds.w <exp>
Equivalent to dcb.w <exp>,0.
else Else-part of a conditional-assembly block. Refer to ’if’.
end End the assembly of the current file. Parsing of an include file is terminated
here and assembling of the parent source commences. It also works to break
the current conditional block, repetition or macro.
endif Ends a block of conditional assembly.
endm Ends a macro definition.
endr Ends a repetition block.
<symbol> equ <expression>
Define a new program symbol with the name <symbol> and assign to it the
value of <expression>. Defining <symbol> twice will cause an error.
even Align the program counter to an even value, by inserting a zero-byte when it is
odd.
exitm Exit the current macro (proceed to endm) at this point and continue assembling
the parent context. Note that this directive also resets the level of conditional
assembly to a state before the macro was invoked (which means that it works
as a ’break’ command on all new if directives).
extern <symbol>[,<symbol>...]
Declare the given symbols as externally defined. Internally there is no difference
to globl, as both declare the symbols, no matter if defined or not, as externally
visible.
globl <symbol>[,<symbol>...]
Declare the given symbols as externally visible in the object file for the linker.
Note that you can have the same effect by using a double-colon (::) on labels
or a double-equal (==) on equate-symbols.
if <expression>
Start of block of conditional assembly. If <expression> is true, the block between
’if’ and the matching ’endif’ or ’else’ will be assembled. When false, ignore
all lines until an ’else’ or ’endif’ directive is encountered. It is possible to
leave such a block early from within an include file (with end) or a macro (with
endm).
iif <expression>, <statement>
A single-line conditional assembly. The <statement> will be parsed when <ex-
pression> evaluates to true (non-zero). <statement> may be a normal source
line, including labels, operators and operands.
incbin "<file>"
Inserts the binary contents of <file> into the object code at this position. The
file will be searched first in the current directory, then in all paths defined by
-I in the order of occurrence.
include "<file>"
Include source text of <file> at this position. The include file will be searched
first in the current directory, then in all paths defined by -I in the order of
occurrence.
list The following lines will appear in the listing file, if it was requested.
long Align the program counter to the next integral longword boundary (4 bytes),
by inserting as many zero-bytes as needed.
macro <name> [<argname>[,<argname>...]]
Defines a macro which can be referenced by <name> (case-sensitive). The macro
definition is terminated by an endm directive and may be exited by exitm. When
calling a macro you may pass up to 64 arguments, separated by commas. The
first ten arguments are referenced within the macro context as \1 to \9 and
\0 for the tenth. Optionally you can specify a list of argument names, which
are referenced with a leading backslash character (\) within the macro. The
special code \~ inserts a unique id, useful for defining labels. \# is replaced
by the number of arguments. \! writes the size-qualifier (M68k) including
the dot. \?argname expands to 1 when the named argument is specified and
non-empty, otherwise it expands to 0. It is also allowed to enclose argument
names in curly braces, which is useful in situations where the argument name
is followed by another valid identifier character.
macundef <name>[,<name>...]
Undefine one or more already defined macros, making them unknown for the
following source to assemble.
nlist The following lines will not be visible in a listing file.
nolist The following lines will not be visible in a listing file.
org <expression>
Sets the base address for the subsequent code and switches into absolute mode.
Such a block is terminated by any section directive or by .68000 (Jaguar only).
phrase Align the program counter to the next integral phrase boundary (8 bytes).
print <expression>[,<expression>...]
Prints strings and formatted expressions to the assembler’s console. <expres-
sion> is either a string in quotes or an expression, which is optionally preceded
by special format flags:
Several flags can be used to format the output of expressions. The default is a
16-bit signed decimal.
/x hexadecimal
/d signed decimal
/u unsigned decimal
/w 16-bit word
/l 32-bit longword
For example:
.print "Value: ", /d/l xyz
qphrase Align the program counter to the next integral quad phrase boundary (32
bytes).
rept <expression>
The block between rept and endr will be repeated <expression> times, which
has to be positive.
<symbol> set <expression>
Create a new symbol with the name <symbol> and assign the value of <expres-
sion>. If <symbol> is already assigned, it will contain a new value from now
on.
text The following code and data is going into the text section, which usually is the
first program section, containing the executable code.
6.1 Legal
This module is written in 2002-2017 by Frank Wille and is covered by the vasm copyright
without modifications.
6.4 Directives
The following directives are supported by this syntax module (if the CPU- and output-
module allow it):
<symbol> = <expression>
Equivalent to <symbol> equ <expression>.
abyte <offset>,<exp1>[,<exp2>,"<string1>"...]
Write the integer or string constant operands into successive bytes of memory
in the current section while adding the constant <offset> to each byte. Any
combination of integer and character string constant operands is permitted.
addr <exp1>[,<exp2>...]
Equivalent to word <exp1>[,<exp2>...].
align <bitcount>
Insert as many zero bytes as required to reach an address where the <bitcount>
low-order bits are zero. For example, align 2 would align to the next 32-bit
boundary.
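The number of padding bytes follows directly from the address and the bit count. The Python sketch below shows that arithmetic; the function name align_padding is hypothetical.

```python
def align_padding(address: int, bitcount: int) -> int:
    """Number of zero bytes needed so that the low <bitcount> bits of the address are zero."""
    mask = (1 << bitcount) - 1       # e.g. bitcount 2 -> mask 0b11
    return (-address) & mask         # distance to the next aligned address, 0 if already aligned
```

With align 2, an address of 5 needs three padding bytes to reach 8, while an address of 8 needs none.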
asc <exp1>[,<exp2>,"<string1>"...]
Equivalent to byte <exp1>[,<exp2>,"<string1>"...].
ascii <exp1>[,<exp2>,"<string1>"...]
See defm.
asciiz "<string1>"[,"<string2>"...]
See string.
assert <expression>[,<message>]
Display an error with the optional <message> when the expression is false.
binary <file>
Inserts the binary contents of <file> into the object code at this position. The
file will be searched first in the current directory, then in all paths defined by
-I or incdir in the order of occurrence.
blk <exp>[,<fill>]
Insert <exp> zero or <fill> bytes into the current section.
blkw <exp>[,<fill>]
Insert <exp> zero or <fill> 16-bit words into the current section, using the
endianness of the target CPU.
bsz <exp>[,<fill>]
Equivalent to blk <exp>[,<fill>].
byt Increases the program counter by one. Equivalent to blk 1,0.
byte <exp1>[,<exp2>,"<string1>"...]
Assign the integer or string constant operands into successive bytes of memory
in the current section. Any combination of integer and character string constant
operands is permitted.
data <exp1>[,<exp2>,"<string1>"...]
Equivalent to byte <exp1>[,<exp2>,"<string1>"...].
db <exp1>[,<exp2>,"<string1>"...]
Equivalent to byte <exp1>[,<exp2>,"<string1>"...].
dc <exp>[,<fill>]
Equivalent to blk <exp>[,<fill>].
defb <exp1>[,<exp2>,"<string1>"...]
Equivalent to byte <exp1>[,<exp2>,"<string1>"...].
defc <symbol> = <expression>
Define a new program symbol with the name <symbol> and assign to it the
value of <expression>. Defining <symbol> twice will cause an error.
defl <exp1>[,<exp2>...]
Assign the values of the operands into successive 32-bit integers of memory in
the current section, using the endianness of the target CPU.
defp <exp1>[,<exp2>...]
Assign the values of the operands into successive 24-bit integers of memory in
the current section, using the endianness of the target CPU.
defm "string"
Equivalent to text "string".
defw <exp1>[,<exp2>...]
Equivalent to word <exp1>[,<exp2>...].
dfb <exp1>[,<exp2>,"<string1>"...]
Equivalent to byte <exp1>[,<exp2>,"<string1>"...].
dfw <exp1>[,<exp2>...]
Equivalent to word <exp1>[,<exp2>...].
defs <exp>[,<fill>]
Equivalent to blk <exp>[,<fill>].
dephase Equivalent to rend.
ds <exp>[,<fill>]
Equivalent to blk <exp>[,<fill>].
dsb <exp>[,<fill>]
Equivalent to blk <exp>[,<fill>].
dsw <exp>[,<fill>]
Equivalent to blkw <exp>[,<fill>].
dw <exp1>[,<exp2>...]
Equivalent to word <exp1>[,<exp2>...].
end Assembly will terminate behind this line.
endif Ends a section of conditional assembly.
el Equivalent to else.
else Assemble the following lines when the previous if-condition was false.
ei Equivalent to endif. (Not available for Z80 CPU)
endm Ends a macro definition.
endmac Ends a macro definition.
endmacro Ends a macro definition.
endr Ends a repetition block.
endrep Ends a repetition block.
endrepeat
Ends a repetition block.
endstruct
Ends a structure definition.
endstructure
Ends a structure definition.
<symbol> eq <expression>
Equivalent to <symbol> equ <expression>.
<symbol> equ <expression>
Define a new program symbol with the name <symbol> and assign to it the
value of <expression>. Defining <symbol> twice will cause an error.
extern <symbol>[,<symbol>...]
See global.
even Aligns to an even address. Equivalent to align 1.
fail <message>
Show an error message including the <message> string. Do not generate an
output file.
fill <exp>
Equivalent to blk <exp>,0.
fcb <exp1>[,<exp2>,"<string1>"...]
Equivalent to byte <exp1>[,<exp2>,"<string1>"...].
fcc "<string>"
Equivalent to text.
fdb <exp1>[,<exp2>,"<string1>"...]
Equivalent to word <exp1>[,<exp2>...].
global <symbol>[,<symbol>...]
Flag <symbol> as an external symbol, which means that <symbol> is visible to
all modules in the linking process. It may be either defined or undefined.
if <expression>
Conditionally assemble the following lines if <expression> is non-zero.
ifdef <symbol>
Conditionally assemble the following lines if <symbol> is defined.
ifndef <symbol>
Conditionally assemble the following lines if <symbol> is undefined.
ifd <symbol>
Conditionally assemble the following lines if <symbol> is defined.
ifnd <symbol>
Conditionally assemble the following lines if <symbol> is undefined.
ifeq <expression>
Conditionally assemble the following lines if <expression> is zero.
ifne <expression>
Conditionally assemble the following lines if <expression> is non-zero.
ifgt <expression>
Conditionally assemble the following lines if <expression> is greater than zero.
ifge <expression>
Conditionally assemble the following lines if <expression> is greater than zero
or equal.
iflt <expression>
Conditionally assemble the following lines if <expression> is less than zero.
ifle <expression>
Conditionally assemble the following lines if <expression> is less than zero or
equal.
ifused <symbol>
Conditionally assemble the following lines if <symbol> has been previously ref-
erenced in an expression or in a parameter of an opcode. Issue a warning, when
<symbol> is already defined. Note that ifused does not work, when the symbol
has only been used in the following lines of the source.
incbin <file>[,<offset>[,<nbytes>]]
Inserts the binary contents of <file> into the object code at this position. When
<offset> is specified, then the given number of bytes will be skipped at the
beginning of the file. The optional <nbytes> argument specifies the maximum
number of bytes to be read from that file. The file will be searched first in
the current directory, then in all paths defined by -I or incdir in the order of
occurrence.
incdir <path>
Add another path to search for include files to the list of known paths. Paths
defined with -I on the command line are searched first.
include <file>
Include source text of <file> at this position. The include file will be searched
first in the current directory, then in all paths defined by -I or incdir in the
order of occurrence.
mac <name>
Equivalent to macro <name>.
list The following lines will appear in the listing file, if it was requested.
local <symbol>[,<symbol>...]
Flag <symbol> as a local symbol, which means that <symbol> is local for the
current file and invisible to other modules in the linking process.
macro <name>[,<argname>...]
Defines a macro which can be referenced by <name>. The <name> may also
appear on the left side of the macro directive, starting at the first column. The
macro definition is closed by an endm directive. When calling a macro you
may pass up to 9 arguments, separated by commas. Those arguments are
referenced within the macro context as \1 to \9, or optionally by named arguments,
which you have to specify in the operand. Argument \0 is set to the macro’s
first qualifier (mnemonic extension), when given. The special argument \@ in-
serts an underscore followed by a six-digit unique id, useful for defining labels.
\() may be used as a separator between the name of a macro argument and
the subsequent text. \<symbolname> inserts the current decimal value of the
absolute symbol symbolname.
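The substitution of \0 to \9 and \@ described above can be approximated by a simple text replacement. The Python sketch below is a model for illustration only, not vasm's macro engine: the function name expand_macro is hypothetical, missing arguments are assumed to expand to nothing, and the unique id is assumed to be zero-padded to six digits.

```python
import re


def expand_macro(body: str, args: list, unique_id: int, qualifier: str = "") -> str:
    """Replace \\0..\\9 and \\@ in a macro body (simplified model)."""

    def replace(match):
        token = match.group(1)
        if token == "@":
            return "_%06d" % unique_id           # underscore plus six-digit id (padding assumed)
        index = int(token)
        if index == 0:
            return qualifier                     # \0 is the macro's first qualifier
        # Assumption: an argument that was not passed expands to an empty string.
        return args[index - 1] if index <= len(args) else ""

    return re.sub(r"\\(@|[0-9])", replace, body)


line = expand_macro(r"loop\@: dec \1", ["count"], unique_id=7)
```

Here line becomes "loop_000007: dec count". Named arguments, \() and \<symbolname> are left out of this model.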
mdat <file>
Equivalent to incbin <file>.
nolist The following lines will not be visible in a listing file.
org <expression>
Sets the base address for the subsequent code. This is equivalent to
*=<expression>.
phase <expression>
Equivalent to rorg <expression>.
repeat <expression>
Equivalent to rept <expression>.
rept <expression>
Repeats the assembly of the block between rept and endr <expression> number
of times. <expression> has to be positive.
reserve <exp>
Equivalent to blk <exp>,0.
rend Ends a rorg block of label relocation. Following labels will be based on org
again.
rorg <expression>
Relocate all labels between rorg and rend based on the new origin from
<expression>.
section <name>[,"<attributes>"]
Starts a new section named <name> or reactivates an old one. If attributes are
given for an already existing section, they must match exactly. The section’s
name will also be defined as a new symbol, which represents the section’s start
address. The "<attributes>" string may consist of the following characters:
Section Contents:
c section has code
d section has initialized data
u section has uninitialized data
i section has directives (info section)
n section can be discarded
R remove section at link time
a section is allocated in memory
Section Protection:
r section is readable
w section is writable
x section is executable
s section is sharable
<symbol> set <expression>
Create a new symbol with the name <symbol> and assign the value of <expres-
sion>. If <symbol> is already assigned, it will contain a new value from now
on.
spc <exp> Equivalent to blk <exp>,0.
string "<string1>"[,"<string2>"...]
Like text, but adds a terminating zero-byte.
struct <name>
Defines a structure which can be referenced by <name>. Labels within a
structure definition can be used as field offsets. They will be defined as local labels
of <name> and can be referenced through <name>.<label>. All directives are
allowed, but instructions will be ignored when such a structure is used. Data
definitions can be used as default values when the structure is used as initializer.
The structure name, <name>, is defined as a global symbol with the structure’s
size. A structure definition is ended by endstruct.
structure <name>
Equivalent to struct <name>.
text "<string>"
Places a single string constant operand into successive bytes of memory in the
current section. The string delimiters may be any printable ASCII character.
weak <symbol>[,<symbol>...]
Flag <symbol> as a weak symbol, which means that <symbol> is visible to all
modules in the linking process and may be replaced by any global symbol with
the same name. When a weak symbol remains undefined its value defaults to
0.
wor <exp1>[,<exp2>...]
Equivalent to word <exp1>[,<exp2>...].
wrd Increases the program counter by two. Equivalent to blkw 1,0.
word <exp1>[,<exp2>...]
Assign the values of the operands into successive 16-bit words of memory in the
current section, using the endianness of the target CPU.
xdef <symbol>[,<symbol>...]
See global.
xlib <symbol>[,<symbol>...]
See global.
xref <symbol>[,<symbol>...]
See global.
6.5 Structures
The oldstyle syntax is able to manage structures. Structures can be defined in two ways:
mylabel struct[ure]
<fields>
endstruct[ure]
or:
struct[ure] mylabel
<fields>
endstruct[ure]
Any directive is allowed to define the structure fields. Labels can be used to define offsets
into the structure. The initialized data is used as default value, whenever no value is given
for a field when the structure is referenced.
Some examples of structure declarations:
struct point
x db 4
y db 5
z db 6
endstruct
This will create the following labels:
point.x ; 0 offsets
point.y ; 1
point.z ; 2
point ; 3 size of the structure
The structure can be used by optionally redefining the field values:
point1 point
point2 point 1, 2, 3
point3 point ,,4
is equivalent to
point1
db 4
db 5
db 6
point2
db 1
db 2
db 3
point3
db 4
db 5
db 4
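The rule that an omitted field keeps its default value can be modelled as follows. This Python sketch reproduces the point example above under simplifying assumptions: the function name instantiate is hypothetical, and field values are plain integers.

```python
def instantiate(defaults: list, overrides: str = "") -> list:
    """Return the field values emitted when a structure is used as an initializer.

    defaults:  default field values from the structure definition
    overrides: operand text of the structure reference, e.g. ",,4"
    """
    given = [field.strip() for field in overrides.split(",")] if overrides.strip() else []
    values = []
    for index, default in enumerate(defaults):
        if index < len(given) and given[index] != "":
            values.append(int(given[index]))     # field explicitly redefined
        else:
            values.append(default)               # empty or missing field keeps its default
    return values


point_defaults = [4, 5, 6]                       # x, y, z from the struct point example
```

instantiate(point_defaults, ",,4") gives [4, 5, 4], which matches the bytes shown for point3.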
7.1 Legal
This module is written in 2002 by Volker Barthelmann and is covered by the vasm copyright
without modifications.
7.3 General
This output module outputs a textual description of the contents of all sections. It is mainly
intended for debugging.
7.4 Restrictions
None.
8.1 Legal
This module is written in 2002-2016 by Frank Wille and is covered by the vasm copyright
without modifications.
8.3 General
This output module outputs the ELF (Executable and Linkable Format) format, which is a
portable object file format which works for a variety of 32- and 64-bit operating systems.
8.4 Restrictions
The ELF output format, as implemented in vasm, currently supports the following architec-
tures:
− PowerPC
− M68k
− ARM
− i386
− x86_64
− Jaguar RISC
The supported relocation types depend on the selected architecture.
9.1 Legal
This module is written in 2008-2012 by Frank Wille and is covered by the vasm copyright
without modifications.
9.3 General
This output module outputs the a.out (assembler output) format, which is an older 32-bit
format for Unix-like operating systems, originally invented by AT&T.
9.4 Restrictions
The a.out output format, as implemented in vasm, currently supports the following archi-
tectures:
− M68k
− i386
The following standard relocations are supported by default:
− absolute, 8, 16, 32 bits
− pc-relative, 8, 16, 32 bits
− base-relative
Standard relocations occupy 8 bytes and don’t include an addend, so they are not suitable
for most RISC CPUs. The extended relocations format occupies 12 bytes and also allows
more relocation types.
10.1 Legal
This module is written in 2009-2014 by Frank Wille and is covered by the vasm copyright
without modifications.
10.3 General
This module outputs the TOS executable file format, which is used on Atari 16/32-bit
computers with 68000 up to 68060 CPU. The symbol table uses the DRI format.
10.4 Restrictions
− All symbols must be defined, otherwise the generation of the executable fails. Unknown
symbols are listed by vasm.
− The only relocations allowed in this format are 32-bit absolute.
Those are restrictions of the output format, not of vasm.
11.1 Legal
This module is written in 2002-2016 by Frank Wille and is covered by the vasm copyright
without modifications.
11.3 General
This output module outputs the hunk object (standard for M68k and extended for PowerPC)
and hunkexe executable format, which is a proprietary file format used by AmigaOS and
WarpOS.
The hunkexe module will generate directly executable files, without the need for another
linker run. But you have to make sure that there are no undefined symbols, common
symbols, or unusual relocations (e.g. small data) left.
It is allowed to define sections with the same name but different attributes. They will be
regarded as different entities.
11.4 Restrictions
The hunk/hunkexe output format is only intended for M68k and PowerPC cpu modules and
will abort when used otherwise.
The hunk module supports the following relocation types:
− absolute, 32-bit
− absolute, 16-bit
− absolute, 8-bit
− relative, 8-bit
− relative, 14-bit (mask 0xfffc) for PPC branch instructions.
− relative, 16-bit
− relative, 24-bit (mask 0x3fffffc) for PPC branch instructions.
− relative, 32-bit
− base-relative, 16-bit
− common symbols are supported as 32-bit absolute and relative references
The hunkexe module supports absolute 32-bit relocations only.
12.1 Legal
This module is written in 2002-2014 by Volker Barthelmann and is covered by the vasm
copyright without modifications.
12.3 General
This output module outputs the vobj object format, a simple portable proprietary object
file format of vasm.
As this format is not yet fixed, it is not described here.
12.4 Restrictions
None.
13.1 Legal
This module is written in 2002-2009,2013 by Volker Barthelmann and is covered by the
vasm copyright without modifications.
13.3 General
This output module outputs the contents of all sections as simple binary data without any
header or additional information. When there are multiple sections, they must not overlap.
Gaps between sections are filled with zero bytes. Undefined symbols are not allowed.
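The layout rules for multiple sections can be sketched in Python as shown below. This is an illustrative model rather than the module's actual code: the function name build_image is hypothetical, and the image is assumed to start at the lowest section address.

```python
def build_image(sections: list) -> bytes:
    """Concatenate (start_address, data) pairs into one raw binary image."""
    ordered = sorted(sections, key=lambda section: section[0])
    image = bytearray()
    end = ordered[0][0] if ordered else 0        # assumed: image begins at the first section
    for start, data in ordered:
        if start < end:
            raise ValueError("sections overlap at address %#x" % start)
        image.extend(bytes(start - end))         # fill the gap with zero bytes
        image.extend(data)
        end = start + len(data)
    return bytes(image)
```

Two sections at 0x10 (two bytes) and 0x14 (one byte) would produce a five-byte image with two zero bytes in between.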
14.1 Legal
This module is written in 2015 by Joseph Zatarski and is covered by the vasm copyright
without modifications.
14.3 General
This output module outputs the contents of all sections in Motorola srecord format, which
is a simple ASCII output of hexadecimal digits. Each record starts with ’S’ and a one-digit
ID. It is followed by the data and terminated by a checksum and a newline character. Every
section starts with a new header record.
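The record layout can be made concrete with a short Python sketch. It follows the general Motorola S-record convention, in which a byte count and an address precede the data and the checksum is the one's complement of the low byte of their sum. The function name srecord is hypothetical, and the sketch says nothing about which record types or header contents vasm actually emits.

```python
def srecord(record_type: int, address: int, data: bytes, address_bytes: int = 2) -> str:
    """Format one S-record line (without the trailing newline)."""
    payload = address.to_bytes(address_bytes, "big") + bytes(data)
    count = len(payload) + 1                     # address + data + checksum byte
    checksum = ~(count + sum(payload)) & 0xFF    # one's complement of the low byte of the sum
    return "S%d%02X%s%02X" % (record_type, count, payload.hex().upper(), checksum)
```

A header record carrying the three bytes "HDR" at address 0 comes out as S00600004844521B.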
15.1 Legal
This module is written in 2002-2017 by Frank Wille and is covered by the vasm copyright
without modifications.
-mcfv4 Generate code for the V4 ColdFire core. This option selects ISA B and MAC
as supported by the 5407.
-mcfv4e Generate code for the V4e ColdFire core. This option selects ISA B, USP-,
FPU-, MAC- and EMAC-instructions (no hardware division) as supported by
all 547x and 548x CPUs.
-m68851 Generate code for the MC68851 MMU. May be used in combination with an-
other -m option.
-m68881 Generate code for the MC68881 FPU. May be used in combination with another
-m option.
-m68882 Generate code for the MC68882 FPU. May be used in combination with another
-m option.
-no-fpu Ignore any FPU options or directives, which has the effect that no 68881/2 FPU
instructions will be accepted. This option can override the default of -gas to
enable the FPU.
-opt-movem
Enables optimization from MOVEM <ea>,Rn into MOVE <ea>,Rn (or the other way
around). This optimization will modify the flags when the destination is not an
address register.
-opt-mul Immediate multiplication factors, which are a power of two (from 2 to 256), are
optimized to shifts. Multiplications with zero are replaced by a MOVEQ #0,Dn,
with -1 are replaced by a NEG.L Dn and with 1 by EXT.L Dn or TST.L Dn
(long-form). Not all optimizations are available for all cpu types (e.g. MULU.W can only
be optimized on ColdFire by using the MVZ.W instruction). This optimization
will leave the flags in a different state than can normally be expected after a
multiplication instruction, and the size of the optimized code may be bigger
than before in a few situations (e.g. MULS.W #4,Dn). The latter will additionally
require the -opt-speed flag.
-opt-div Unsigned immediate divisors, which are a power of two (from 2 to 256), are
optimized to shifts. Divisions by 1 are replaced by TST.L Dn (32-bit) or MVZ.W
Dn,Dn (16-bit, ColdFire only). Divisions by -1 are replaced by NEG.L Dn (32-bit)
or by a combination of NEG.W Dn and MVZ.W Dn,Dn (16-bit, ColdFire only).
This optimization will leave the flags in a different state than would normally be
expected after a division instruction.
-opt-pea Enables optimization from MOVE #x,-(SP) into PEA x. This optimization will
leave the flags unmodified, which might not be intended.
-opt-speed
Optimize for speed, even if this would increase code size. For example, it enables
optimization of ASL.W #2,Dn into two ADD.W Dn,Dn instructions, or of MULS.W
#-4,Dn into EXT.L Dn + ASL.L #2,Dn + NEG.L Dn. Generally the assembler will
never optimize a single instruction into multiple instructions without this option.
-opt-st Enables optimization from MOVE.B #-1,<ea> into ST <ea>. This optimization
will leave the flags unmodified, which might not be intended.
-sc All JMP and JSR instructions to external labels will be converted into 16-bit
PC-relative jumps.
-sd References to absolute symbols in a small data section (named "__MERGED")
are optimized into a base-relative addressing mode using the current base
register set by an active NEAR directive. This option is automatically enabled in
-phxass mode.
-showcrit
Print all critical optimizations which have side effects. Among those are
-opt-lsl, -opt-mul, -opt-st, -opt-pea, -opt-movem and -opt-clr.
-showopt Print all optimizations and translations vasm is doing (same as opt ow+).
In its default setting (no -devpac or -phxass option) vasm performs the following opti-
mizations:
− Absolute to PC-relative.
− Branches without explicit size.
-regsymredef
Allow redefining register symbols with EQUR. This should only be used for
compatibility with old sources. Not many assemblers support that.
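The power-of-two rule that -opt-mul and -opt-div describe above can be sketched in a few lines of Python. This is only an illustration of the stated range (2 to 256), not vasm's implementation; the function name shift_count is hypothetical.

```python
def shift_count(value):
    """Return n when value == 2**n and 2 <= value <= 256, else None.

    Hypothetical helper: it models only the range named for
    -opt-mul and -opt-div, not vasm's actual code.
    """
    if 2 <= value <= 256 and value & (value - 1) == 0:
        return value.bit_length() - 1
    return None


if __name__ == "__main__":
    for v in (2, 4, 256, 3, 512):
        print(v, shift_count(v))
```

A factor of 4 therefore maps to a shift by 2, while 3 and 512 fall outside the rule and are left alone.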
15.3 General
This backend accepts M68k and CPU32 instructions as described in Motorola’s M68000
Family Programmer’s Reference Manual. Additionally it supports ColdFire instructions as
described in Motorola’s ColdFire Microprocessor Family Programmer’s Reference Manual.
The syntax for the scale factor in ColdFire MAC instructions is << for left- and >> for right-
shift. The scale factor may be appended as an optional operand, when needed. Example:
mac d0.l,d1.u,<<.
The mask flag in MAC instructions is written as & and is appended directly to the effective
address operand. Example: mac d0,d1,(a0)&,d2.
The target address type is 32bit.
Default alignment for instructions is 2 bytes. The default alignment for data is 2 bytes,
when the data size is larger than 8 bits.
Depending on the selected cpu type the __VASM symbol will have a value defined by the
following bits:
bit 0 MC68000 instruction set. Also used by MC6830x, MC68322, MC68356.
bit 1 MC68010 instruction set.
bit 2 MC68020 instruction set.
bit 3 MC68030 instruction set.
bit 4 MC68040 instruction set.
bit 5 MC68060 instruction set.
bit 6 MC68881 or MC68882 FPU.
bit 7 MC68851 PMMU.
bit 8 CPU32. Any MC6833x or MC6834x CPU.
bit 9 ColdFire ISA A.
bit 10 ColdFire ISA A+.
bit 11 ColdFire ISA B.
bit 12 ColdFire ISA C.
bit 13 ColdFire hardware division support.
bit 14 ColdFire MAC instructions.
bit 15 ColdFire enhanced MAC instructions.
bit 16 ColdFire USP register.
bit 17 ColdFire FPU instructions.
bit 18 ColdFire MMU instructions.
bit 20 Apollo Core AC68080 instruction set.
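As an illustration of how the bits listed above combine, the following Python sketch decodes a __VASM value into feature names. The table is copied from the list above (bit 19 is not listed there); the function name decode_vasm and the sample value are assumptions made for this example, and the source does not state which bits a particular -m option sets.

```python
# Bit positions as listed in the table above; bit 19 is not documented there.
VASM_BITS = {
    0: "MC68000",
    1: "MC68010",
    2: "MC68020",
    3: "MC68030",
    4: "MC68040",
    5: "MC68060",
    6: "MC68881/MC68882 FPU",
    7: "MC68851 PMMU",
    8: "CPU32",
    9: "ColdFire ISA A",
    10: "ColdFire ISA A+",
    11: "ColdFire ISA B",
    12: "ColdFire ISA C",
    13: "ColdFire hardware division",
    14: "ColdFire MAC",
    15: "ColdFire EMAC",
    16: "ColdFire USP",
    17: "ColdFire FPU",
    18: "ColdFire MMU",
    20: "Apollo Core AC68080",
}


def decode_vasm(value):
    """Return the names of all documented bits that are set in value."""
    return [name for bit, name in sorted(VASM_BITS.items())
            if value & (1 << bit)]


if __name__ == "__main__":
    # Arbitrary sample value with bits 2 and 6 set (not taken from vasm).
    print(decode_vasm((1 << 2) | (1 << 6)))
```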
15.4 Extensions
This backend extends the selected syntax module by the following directives:
.sdreg <An>
Equivalent to near <An>.
basereg <expression>,<An>
Starts a block of base-relative addressing through register An (remember that
A7 is not allowed as a base register). The programmer has to make sure that
<expression> is placed into An first, while the assembler automatically subtracts
<expression>, which is usually a program label with an optional offset, from each
displacement in a (d,An) addressing mode. basereg has priority over the near
directive. Its effect can be suspended with the endb directive. It is allowed to
use several base registers in parallel.
cpu32 Generate code for the CPU32 family.
endb <An> Ends a basereg block and suspends its effect on the specified base register An.
It may be reused with a different base expression thereafter (refer to basereg).
far Disables small data (base-relative) mode. All data references will be absolute.
fpu <cpID>
Enables 68881/68882 FPU code generation. The <cpID> is inserted into the
FPU instructions to select the correct coprocessor. Note that <cpID> is always
1 for the on-chip FPUs in the 68040 and 68060. A <cpID> of zero will disable
FPU code generation.
initnear Initializes the selected small data register. In contrast to PhxAss, where this
directive comes from, just a reference to _LinkerDB is generated, which has to
be resolved by a linker.
machine <cpu_type>
Makes the assembler generate code for <cpu_type>, which can be one of the
following: 68000, 68010, 68020, 68030, 68040, 68060, 68080, 68851, 68881, 68882,
cpu32. Various ColdFire CPUs, whose names start with 5..., are accepted as well.
mc68000 Generate code for the MC68000 CPU.
mc68010 Generate code for the MC68010 CPU.
mc68020 Generate code for the MC68020 CPU.
mc68030 Generate code for the MC68030 CPU.
mc68040 Generate code for the MC68040 CPU.
mc68060 Generate code for the MC68060 CPU.
ac68080 Generate code for the Apollo Core AC68080 FPGA CPU.
mcf5... Generate code for a ColdFire CPU. The recognized models are listed in the
assembler-options section.
near [<An>]
Enables small data (base-relative) mode and sets the base register to An. near
without an argument will reactivate a previously defined small data mode, which
might have been switched off by a far directive.
near code All JMP and JSR instructions to external labels will be converted into 16-bit PC-
relative jumps. The small code mode can be switched off by a far directive.
opt <option>[,<option>...]
Sets Devpac-compatible options. When option -phxass is given, then it will
parse PhxAss options instead (which is discouraged for new code, so there is no
detailed description here). Most supported Devpac2-style options are always
suffixed by a + or - to enable or disable the option:
a Automatically optimize absolute to PC-relative references. Default
is off in Devpac-compatibility mode, otherwise on.
c Case-sensitivity for all symbols, instructions and macros. Default
is on.
d Include all symbols for debugging in the output file. May also
generate line debugging information in some output formats. Default
is off in Devpac-compatibility mode, otherwise on.
l Generate a linkable object file. The default is defined by the se-
lected output format via the assembler’s -F option. This option
was supported by Devpac-Amiga only.
o Enable all optimizations (o1 to o12), or disable all optimizations.
The default is that all are disabled in Devpac-compatibility mode
and enabled otherwise. When running in native vasm mode this
option will also enable PC-relative (opt a) and the following safe
vasm-specific optimizations (see below): og, of.
o1 Optimize branches without an explicit size extension.
o2 Standard displacement optimizations (e.g. (0,An) -> (An)).
o3 Optimize absolute addresses to short words.
o4 Optimize move.l to moveq.
o5 Optimize add #x and sub #x into their quick forms.
o6 No effect in vasm.
o7 Convert bra.b to nop, when branching to the next instruction.
o8 Optimize 68020+ base displacements to 16 bit.
o9 Optimize 68020+ outer displacements to 16 bit.
o10 Optimize add/sub #x,An to lea.
o11 Optimize lea (d,An),An to addq/subq.
o12 Optimize <op>.l #x,An to <op>.w #x,An.
ow Show all optimizations being performed. Default is on in
Devpac-compatibility mode, otherwise off.
p Check if code is position independent. This will cause an error on
each relocation being required. Default is off.
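The displacement arithmetic described for the basereg directive above (the assembler subtracts <expression> from each displacement in a (d,An) addressing mode) can be sketched as follows. This Python snippet is an illustration only; the function names and the addresses are hypothetical, and the 16-bit range check is an assumption about the plain (d16,An) form rather than a statement about vasm's behaviour.

```python
def basereg_displacement(target, base_expression):
    """Displacement left in (d,An) after the base expression is subtracted."""
    return target - base_expression


def fits_16_bit(value):
    """True when value fits into a signed 16-bit displacement."""
    return -32768 <= value <= 32767


if __name__ == "__main__":
    base = 0x1000 + 0x8000   # hypothetical label address plus optional offset
    target = 0x1010          # hypothetical address of a referenced symbol
    d = basereg_displacement(target, base)
    print(d, fits_16_bit(d))
```

The programmer remains responsible for loading the same base value into An, as stated in the basereg description.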
15.5 Optimizations
This backend performs the following operand optimizations:
− (0,An) optimized to (An).
− (d16,An) translated to (bd32,An,ZDn.w), when d16 is not between -32768 and 32767
and the selected CPU allows it (68020 up or CPU32).
− (d16,PC) translated to (bd32,PC,ZDn.w), when d16 is not between -32768 and 32767
and the selected CPU allows it (68020 up or CPU32).
− (d8,An,Rn) translated to (bd,An,Rn), when d8 is not between -128 and 127 and the
selected CPU allows it (68020 up or CPU32).
− (d8,PC,Rn) translated to (bd,PC,Rn), when d8 is not between -128 and 127 and the
selected CPU allows it (68020 up or CPU32).
− <exp>.l optimized to <exp>.w, when <exp> is absolute and between -32768 and 32767.
− <exp>.w translated to <exp>.l, when <exp> is a program label or absolute and not
between -32768 and 32767.
− (0,An,...) optimized to (An,...) (which means the base displacement will be sup-
pressed). This allows further optimization to (An), when the index is suppressed.
− (bd16,An,...) translated to (bd32,An,...), when bd16 is not between -32768 and
32767.
− (bd32,An,...) optimized to (bd16,An,...), when bd32 is between -32768 and 32767.
− (bd32,An,ZRn) optimized to (d16,An), when bd32 is between -32768 and 32767, and
the index is suppressed (zero-Rn).
− (An,ZRn) optimized to (An), when the index is suppressed.
− (0,PC,...) optimized to (PC,...) (which means the base displacement will be sup-
pressed).
− (bd16,PC,...) translated to (bd32,PC,...), when bd16 is not between -32768 and
32767.
− (bd32,PC,...) optimized to (bd16,PC,...), when bd32 is between -32768 and 32767.
− (bd32,PC,ZRn) optimized to (d16,PC), when bd32 is between -32768 and 32767, and
the index is suppressed (zero-Rn).
− ([0,An,...],...) optimized to ([An,...],...) (which means the base displacement
will be suppressed).
− ([bd16,An,...],...) translated to ([bd32,An,...],...), when bd16 is not between
-32768 and 32767.
− ([bd32,An,...],...) optimized to ([bd16,An,...],...), when bd32 is between
-32768 and 32767.
− ([...],0) optimized to ([...]) (which means the outer displacement will be sup-
pressed).
− ([...],od16) translated to ([...],od32), when od16 is not between -32768 and
32767.
− ([...],od32) optimized to ([...],od16), when od32 is between -32768 and 32767.
Note that an operand optimization will only take place when a displacement’s size was not
enforced by the programmer (e.g. (4.l,a0))!
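The size selection behind the base displacement rules above can be summarised in a short Python sketch. It is a simplified model under stated assumptions: the function name and the forced_size parameter are hypothetical, CPU availability (68020 up or CPU32) is ignored, and only the three cases named in the list (suppressed, 16-bit, 32-bit) are modelled.

```python
def displacement_size(d, forced_size=None):
    """Return 0 (suppressed), 16 or 32 as the size in bits for displacement d.

    forced_size models a size enforced by the programmer, e.g. (4.l,a0);
    in that case no optimization takes place.
    """
    if forced_size is not None:
        return forced_size
    if d == 0:
        return 0
    if -32768 <= d <= 32767:
        return 16
    return 32


if __name__ == "__main__":
    for d in (0, 4, -32768, 32768):
        print(d, displacement_size(d))
    print(4, displacement_size(4, forced_size=32))
```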
This backend performs the following instruction optimizations:
− <op>.L #x,An optimized to <op>.W #x,An, when x is between -32768 and 32767.
− ADD.? #x,<ea> optimized to ADDQ.? #x,<ea>, when x is between 1 and 8.
− ADD.? #x,<ea> optimized to SUBQ.? #-x,<ea>, when x is between -1 and -8.
− ADDA.? #0,An and SUBA.? #0,An will be deleted.
− ADDA.? #x,An optimized to LEA (x,An),An, when x is between -32768 and 32767.
− ANDI.L #$ff,Dn optimized to MVZ.B Dn,Dn, for ColdFire ISA B/C.
− ANDI.L #$ffff,Dn optimized to MVZ.W Dn,Dn, for ColdFire ISA B/C.
− ANDI.? #0,<ea> optimized to CLR.? <ea>, when allowed by the option -opt-clr or a
different CPU than the MC68000 was selected.
− ANDI.? #-1,<ea> optimized to TST.? <ea>.
− ASL.? #1,Dn optimized to ADD.? Dn,Dn for 68000 and 68010.
− ASL.? #2,Dn optimized into a sequence of two ADD.? Dn,Dn for 68000 and 68010, when
the operation size is either byte or word and the options -opt-speed and -opt-lsl
are given.
− B<cc> <label> translated into a combination of B<!cc> *+8 and JMP <label>, when
<label> is not defined in the same section (and option -opt-brajmp is given), or outside
the range of -32768 to 32767 bytes from the current address when the selected CPU is
not 68020 up, CPU32 or ColdFire ISA B/C.
− B<cc> <label> is automatically optimized to 8-bit, 16-bit or 32-bit (68020 up, CPU32,
MCF5407 only), whatever fits best. When the selected CPU doesn’t support 32-
bit branches it will try to change the conditional branch into a B<!cc> *+8 and JMP
<label> sequence.
− BRA <label> translated to JMP <label>, when <label> is not defined in the same section
(and option -opt-brajmp is given), or outside the range of -32768 to 32767 bytes
from the current address when the selected CPU is not 68020 up, CPU32 or ColdFire
ISA B/C.
− BSR <label> translated to JSR <label>, when <label> is not defined in the same section
(and option -opt-brajmp is given), or outside the range of -32768 to 32767 bytes
from the current address when the selected CPU is not 68020 up, CPU32 or ColdFire
ISA B/C.
− <cp>B<cc> <label> is automatically optimized to 16-bit or 32-bit, whatever fits best.
<cp> means coprocessor and is P for the PMMU and F for the FPU.
− CLR.L Dn optimized to MOVEQ #0,Dn.
− CMP.? #0,<ea> optimized to TST.? <ea>. The selected CPU type must be MC68020
up, ColdFire or CPU32 to support address register direct as effective address (<ea>).
− DIVS.W/DIVU.W #1,Dn optimized to MVZ.W Dn,Dn, for ColdFire ISA B/C (-opt-div).
− DIVS.W #-1,Dn optimized to the sequence of NEG.W Dn and MVZ.W Dn,Dn (-opt-div
and -opt-speed).
− DIVS.L/DIVU.L #1,Dn optimized to TST.L Dn (-opt-div).
− DIVS.L #-1,Dn optimized to NEG.L Dn (-opt-div).
− DIVU.L #2..256,Dn optimized to LSR.L #x,Dn (-opt-div).
− EORI.? #-1,<ea> optimized to NOT.? <ea>.
− EORI.? #0,<ea> optimized to TST.? <ea>.
− FMOVEM.? <reglist> is deleted, when the register list was empty.
− FxDIV.? #m,FPn optimized to FxMUL.? #1/m,FPn when m is a power of 2 and option
-opt-fconst is given.
− JMP <label> optimized to BRA.? <label>, when <label> is defined in the same section
and in the range of -32768 to 32767 bytes from the current address. Note that JMP
(<lab>,PC) is never optimized.
− JSR <label> optimized to BSR.? <label>, when <label> is defined in the same section
and in the range of -32768 to 32767 bytes from the current address. Note that JSR
(<lab>,PC) is never optimized.
− LEA 0,An optimized to SUBA.L An,An.
− LEA (0,An),An and LEA (An),An will be deleted.
− LEA (d,An),An is optimized to ADDQ.L #d,An when d is between 1 and 8 and to SUBQ.L
#-d,An when d is between -1 and -8.
− LEA (d,Am),An will be translated into a combination of MOVEA and ADDA.L for 68000
and 68010, when d is lower than -32768 or higher than 32767. The MOVEA will be
omitted when Am and An are identical. Otherwise -opt-speed is required.
− LINK.L An,#x optimized to LINK.W An,#x, when x is between -32768 and 32767.
− LINK.W An,#x translated to LINK.L An,#x, when x is not between -32768 and 32767
and selected CPU supports this instruction.
− LSL.? #1,Dn optimized to ADD.? Dn,Dn for 68000 and 68010, when option -opt-lsl is
given.
− LSL.? #2,Dn optimized into a sequence of two ADD.? Dn,Dn for 68000 and 68010, when
the operation size is either byte or word and the options -opt-speed and -opt-lsl
are given.
− MOVE.? #0,<ea> optimized to CLR.? <ea>, when allowed by the option -opt-clr or a
different CPU than the MC68000 was selected.
− MOVE.? #x,-(SP) optimized to PEA x, when allowed by the option -opt-pea. The
move-size must not be byte (.b).
− MOVE.B #-1,<ea> optimized to ST <ea>, when allowed by the option -opt-st.
− MOVE.L #x,Dn optimized to MOVEQ #x,Dn, when x is between -128 and 127.
− MOVE.L #x,<ea> optimized to MOV3Q #x,<ea>, for ColdFire ISA B and ISA C, when x
is -1 or between 1 and 7.
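Two of the immediate-range rules from the list above (ADD into ADDQ/SUBQ, and MOVE.L #x,Dn into MOVEQ) can be sketched in Python. Each rule is considered in isolation; the function names are hypothetical and the interaction with other optimizations, options and CPU types is deliberately not modelled.

```python
def optimize_add_immediate(x):
    """Pick the mnemonic and immediate for ADD.? #x,<ea>."""
    if 1 <= x <= 8:
        return ("ADDQ", x)
    if -8 <= x <= -1:
        return ("SUBQ", -x)   # the quick form takes the negated value
    return ("ADD", x)


def optimize_move_long_to_dn(x):
    """Pick the mnemonic for MOVE.L #x,Dn."""
    return "MOVEQ" if -128 <= x <= 127 else "MOVE.L"


if __name__ == "__main__":
    for x in (1, 8, -3, 9):
        print(x, optimize_add_immediate(x))
    for x in (-128, 127, 128):
        print(x, optimize_move_long_to_dn(x))
```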
16.1 Legal
This module is written in 2002-2016 by Frank Wille and is covered by the vasm copyright
without modifications.
-no-regnames
Don’t predefine any register-name symbols.
-opt-branch
Enables translation of 16-bit branches into "B<!cc> $+8 ; B label" sequences
when destination is out of range.
-sd2reg=<n>
Sets the 2nd small data base register to Rn.
-sdreg=<n>
Sets small data base register to Rn.
The default setting is to generate code for a 32-bit PPC G2, G3, G4 CPU with Altivec
support.
16.3 General
This backend accepts PowerPC instructions as described in the instruction set manuals
from IBM, Motorola, Freescale and AMCC.
The full instruction set of the following families is supported: POWER, POWER2, 40x,
44x, 46x, 60x, 620, 750, 74xx, 860, Book-E, e300 and e500.
The target address type is 32 or 64 bits, depending on the selected CPU model.
Default alignment for sections and instructions is 4 bytes. Data is aligned to its natural
alignment by default.
16.4 Extensions
This backend provides the following specific extensions:
− When not disabled by the option -no-regnames, the registers r0 - r31, f0 - f31, v0 -
v31, cr0 - cr7, vrsave, sp, rtoc, fp, fpscr, xer, lr, ctr, and the symbols lt, gt, so and un
will be predefined on startup and may be referenced by the program.
This backend extends the selected syntax module by the following directives:
.sdreg <n>
Sets the small data base register to Rn.
.sd2reg <n>
Sets the 2nd small data base register to Rn.
16.5 Optimizations
This backend performs the following optimizations:
− 16-bit branches, where the destination is out of range, are translated into B<!cc> $+8
and a 26-bit unconditional branch.
17.1 Legal
This module is written in 2002-2004 by Volker Barthelmann and is covered by the vasm
copyright without modifications.
17.3 General
This backend accepts c16x/st10 instructions as described in the Infineon instruction set
manuals.
The target address type is 32bit.
Default alignment for sections and instructions is 2 bytes.
17.4 Extensions
This backend provides the following specific extensions:
− There is a pseudo instruction jmp that will be translated either to a jmpr or jmpa
instruction, depending on the offset.
− The sfr pseudo opcode can be used to declare special function registers. It takes two,
three or four arguments. The first argument is the identifier to be declared as special
function register. The second argument is either the 16bit sfr address or its 8bit base
address (0xfe for normal sfrs and 0xf0 for extended special function registers). In the
latter case, the third argument is the 8bit sfr number. If another argument is given, it
specifies the bit-number in the sfr (i.e. the declaration declares a single bit).
Example:
.sfr zeros,0xfe,0x8e
− SEG and SOF can be used to obtain the segment or segment offset of a full address.
Example:
mov r3,#SEG farfunc
17.5 Optimizations
This backend performs the following optimizations:
− jmp is translated to jmpr, if possible. Also, if -no-translations was not specified,
jmpr and jmpa are translated.
− Relative jump instructions with an offset that does not fit into 8 bits are translated to
a jmps instruction or an inverted jump around a jmps instruction.
− For instructions that have the two forms gpr,#IMM3/4 and reg,#IMM16, the smaller
form is used, if possible.
18.1 Legal
This module is written in 2002,2006,2008-2012,2014-2017 by Frank Wille and is covered by
the vasm copyright without modifications.
18.3 General
This backend accepts 6502 family instructions as described in the instruction set reference
manuals from MOS and Rockwell, which are valid for the following CPUs: 6502, 65C02,
65CE02, 65C102, 65C112, 6503, 6504, 6505, 6507, 6508, 6509, 6510, 6511, 65F11, 6512 -
6518, 65C00/21, 65C29, 6570, 6571, 6280, 6702, 740, 7501, 8500, 8502, 65802, 65816.
The target address type is 16 bit.
Instructions consist of one to three bytes and require no alignment. There is also no
alignment requirement for sections and data.
All known mnemonics for illegal instructions are recognized (e.g. dcm and dcp refer to the
same instruction). Some illegal instructions (e.g. $ab) are known to show unpredictable
behaviour, or do not always work the same on different CPUs.
18.4 Extensions
This backend provides the following specific extensions:
− The parser understands a lo/hi-modifier to select low- or high-byte of a 16-bit word.
The character < is used to select the low-byte and > for the high-byte. It has to be the
first character before an expression.
− When applying the operation /256, %256 or &256 on a label, an appropriate lo/hi-byte
relocation will automatically be generated.
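The effect of the lo/hi-modifiers described above can be illustrated with plain integer arithmetic. The Python sketch below shows what the < and > modifiers select for a known 16-bit value, and that the /256 and %256 operations yield the same bytes. The function names are hypothetical, and the sketch does not model relocations for labels whose address is unknown at assembly time.

```python
def lo_byte(value):
    """Low byte of a 16-bit word, as selected by the < modifier."""
    return value & 0xFF


def hi_byte(value):
    """High byte of a 16-bit word, as selected by the > modifier."""
    return (value >> 8) & 0xFF


if __name__ == "__main__":
    address = 0x1234  # hypothetical label address
    print(hex(lo_byte(address)), hex(hi_byte(address)))
    # The same bytes, expressed with the /256 and %256 operations:
    print(hex(address % 256), hex(address // 256))
```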
18.5 Optimizations
This backend performs the following operand optimizations:
− Branches, where the destination is out of range, are translated into B<!cc> *+3 and an
absolute JMP instruction.
19.1 Legal
This module is written in 2004,2006,2010-2015 by Frank Wille and is covered by the vasm
copyright without modifications.
19.3 General
This backend accepts ARM instructions as described in various ARM CPU data sheets. Ad-
ditionally some architectures support a second, more dense, instruction set, called THUMB.
There are special directives to switch between those two instruction sets.
The target address type is 32bit.
Default alignment for instructions is 4 bytes for ARM and 2 bytes for THUMB. Sections
will be aligned to 4 bytes by default. Data is aligned to its natural alignment by default.
19.4 Extensions
This backend extends the selected syntax module by the following directives:
.arm Generate 32-bit ARM code.
.thumb Generate 16-bit THUMB code.
19.5 Optimizations
This backend performs the following optimizations and translations for the ARM instruction
set:
− LDR/STR Rd,symbol, with a distance between symbol and PC larger than 4KB, is
translated to ADD/SUB Rd,PC,#offset&0xff000 + LDR/STR Rd,[Rd,#offset&0xfff],
when allowed by the option -opt-ldrpc.
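The two masks used in the translation above split an offset into a part for the ADD/SUB instruction and a 12-bit part for the LDR/STR instruction. The following Python sketch shows only this split; it assumes a non-negative offset below 0x100000, the function name is hypothetical, and the choice between ADD and SUB for negative distances is not modelled.

```python
def split_offset(offset):
    """Split offset into (offset & 0xff000, offset & 0xfff).

    Assumes 0 <= offset < 0x100000, so that both parts together
    reproduce the original offset.
    """
    if not 0 <= offset < 0x100000:
        raise ValueError("offset outside the range covered by this sketch")
    return offset & 0xFF000, offset & 0xFFF


if __name__ == "__main__":
    high, low = split_offset(0x12345)
    print(hex(high), hex(low), hex(high + low))
```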
20.1 Legal
This module is written in 2005-2006,2011,2015-2016 by Frank Wille and is covered by the
vasm copyright without modifications.
20.3 General
This backend accepts 80x86 instructions as described in the Intel Architecture Software
Developer’s Manual.
The target address type is 32 bits. It is 64 bits when the x86_64 architecture is selected
(-m64).
Instructions do not need any alignment. Data is aligned to its natural alignment by default.
The backend uses MIT-syntax! This means the left operands are always the source and the
right operand is the destination. Register names have to be prefixed by a ’%’.
The operation size is indicated by a ’b’, ’w’, ’l’, etc. suffix directly appended to the
mnemonic. The assembler can also determine the operation size from the size of the registers
being used.
20.4 Extensions
Predefined register symbols in this backend:
− 8-bit registers: al cl dl bl ah ch dh bh axl cxl dxl spl bpl sil dil r8b r9b r10b
r11b r12b r13b r14b r15b
− 16-bit registers: ax cx dx bx sp bp si di r8w r9w r10w r11w r12w r13w r14w r15w
− 32-bit registers: eax ecx edx ebx esp ebp esi edi r8d r9d r10d r11d r12d r13d
r14d r15d
− 64-bit registers: rax rcx rdx rbx rsp rbp rsi rdi r8 r9 r10 r11 r12 r13 r14 r15
− segment registers: es cs ss ds fs gs
− control registers: cr0 cr1 cr2 cr3 cr4 cr5 cr6 cr7 cr8 cr9 cr10 cr11 cr12 cr13
cr14 cr15
− debug registers: dr0 dr1 dr2 dr3 dr4 dr5 dr6 dr7 dr8 dr9 dr10 dr11 dr12 dr13
dr14 dr15
− test registers: tr0 tr1 tr2 tr3 tr4 tr5 tr6 tr7
− MMX and SIMD registers: mm0 mm1 mm2 mm3 mm4 mm5 mm6 mm7 xmm0 xmm1 xmm2 xmm3
xmm4 xmm5 xmm6 xmm7 xmm8 xmm9 xmm10 xmm11 xmm12 xmm13 xmm14 xmm15
− FPU registers: st st(0) st(1) st(2) st(3) st(4) st(5) st(6) st(7)
This backend extends the selected syntax module by the following directives:
.code16 Sets the assembler to 16-bit addressing mode.
.code32 Sets the assembler to 32-bit addressing mode, which is the default.
.code64 Sets the assembler to 64-bit addressing mode.
20.5 Optimizations
This backend performs the following optimizations:
− Immediate operands are optimized to the smallest size which can still represent the
absolute value.
− Displacement operands are optimized to the smallest size which can still represent the
absolute value.
− Jump instructions are optimized to 8-bit displacements, when possible.
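The idea of choosing the smallest operand size can be sketched in Python. This is an illustration under assumptions: the value is treated as a signed constant, the candidate sizes are 8, 16 and 32 bits, and the function name is hypothetical. Which sizes are actually available depends on the instruction and is not stated here.

```python
def smallest_signed_size(value):
    """Return the smallest of 8, 16 or 32 bits that holds the signed value."""
    for bits in (8, 16, 32):
        lowest = -(1 << (bits - 1))
        highest = (1 << (bits - 1)) - 1
        if lowest <= value <= highest:
            return bits
    raise ValueError("value does not fit into 32 bits")


if __name__ == "__main__":
    for v in (127, 128, -129, 70000):
        print(v, smallest_signed_size(v))
```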
21.1 Legal
This module is copyright in 2009 by Dominic Morris.
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ‘‘AS IS’’ AND ANY EXPRESS OR
* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-swapixiy
Swaps the usage of ix and iy registers. This is useful for compiling generic code
that uses an index register that is reserved on the target machine.
-z80asm Switches on z80asm mode. This translates ASMPC to $ and accepts some
pseudo opcodes that z80asm supports. Most emulation of z80asm directives is
provided by the oldsyntax syntax module.
21.3 General
This backend accepts z80 family instructions in standard Zilog syntax. Rabbit opcodes are
accepted as defined in the publicly available reference material from Rabbit Semiconductor,
with the exception that the ljp and lcall opcodes need to be supplied with a 24 bit
number rather than an 8 bit xpc and a 16 bit address.
The target address type is 16 bit.
Instructions consist of one to six bytes and require no alignment. There is also no
alignment requirement for sections and data.
21.4 Extensions
This backend provides the following specific extensions:
− Certain Rabbit opcodes can be prefixed by the altd and/or the ioi/ioe modifier.
For details of which instructions these are valid for please see the documentation from
Rabbit.
− The parser understands a lo/hi-modifier to select low- or high-byte of a 16-bit word.
The character < is used to select the low-byte and > for the high-byte. It has to be the
first character before an expression.
− When applying the operation /256, %256 or &256 on a label, an appropriate lo/hi-byte
relocation will automatically be generated.
21.5 Optimisations
This backend supports the emulation of certain z80 instructions on the Rabbit/gbz80
processor. These instructions are rld, rrd, cpi, cpir, cpd and cpdr. The link stage should
provide routines with the opcode name prefixed with rcmx_ (e.g. rcmx_rld) which
implement the same functionality. Example implementations are available within the z88dk
CVS tree.
Additionally, for the Rabbit targets the missing call cc, opcodes will be emulated.
22.1 Legal
This module is written in 2013-2016 by Esben Norby and Frank Wille and is covered by
the vasm copyright without modifications.
22.3 General
This backend accepts 6800 family instructions for the following CPUs:
• 6800 code generation: 6800, 6802, 6808.
• 6801 code generation: 6801, 6803.
• 68HC11.
The 6804, 6805, 68HC08 and 6809 are not supported. They use a similar instruction set, but
are not opcode compatible.
The target address type is 16 bit.
Instructions consist of one to five bytes and require no alignment. There is also no
alignment requirement for sections and data.
22.4 Extensions
This backend provides the following specific extensions:
− When an instruction supports direct and extended addressing mode the < character
can be used to force direct mode and the > character forces extended mode. Otherwise
the assembler selects the best mode automatically, which defaults to extended mode
for external symbols.
− When applying the operation /256, %256 or &256 on a label, an appropriate lo/hi-byte
relocation will automatically be generated.
22.5 Optimizations
None.
23.1 Legal
This module is written in 2014-2015 by Frank Wille and is covered by the vasm copyright
without modifications.
23.3 General
This backend accepts RISC instructions for the GPU or DSP in Atari’s Jaguar custom chip
set according to the "Jaguar Technical Reference Manual for Tom & Jerry", Revision 8.
Documentation bugs were fixed by using various sources on the net.
The target address type is 32 bits.
Default alignment for instructions is 2 bytes. Data is aligned to its natural alignment by
default.
23.4 Optimizations
This backend performs the following optimizations and translations for the GPU/DSP RISC
instruction set:
− load (Rn+0),Rm is optimized to load (Rn),Rm.
− store Rn,(Rm+0) is optimized to store Rn,(Rm).
23.5 Extensions
This backend extends the selected syntax module by the following directives (note that a
leading dot is optional):
<symbol> ccdef <expression>
Allows defining a symbol for the condition codes used in jump and jr
instructions. Must be a constant number in the range of 0 to 31 or another
condition code symbol.
ccundef <symbol>
Undefine a condition code symbol previously defined via ccdef.
dsp Select DSP instruction set.
<symbol> equr <Rn>
Define a new symbol named <symbol> and assign the address register Rn to it.
<Rn> may also be another register symbol. Note that a register symbol must
be defined before it can be used.
equrundef <symbol>
Undefine a register symbol previously defined via equr.
gpu Select GPU instruction set.
<symbol> regequ <Rn>
Equivalent to equr.
regundef <symbol>
Undefine a register symbol previously defined via regequ.
All directives may be optionally preceded by a dot (.), for compatibility with various syntax
modules.
24.1 Legal
This module is written in 2014 by Luis Panadero Guardeño and is covered by the vasm
copyright without modifications.
24.2 General
This backend accepts TR3200 instructions as described in the TR3200 specification
(https://github.com/trillek-team/trillek-computer).
The target address type is 32 bits.
Default alignment for sections is 4 bytes. Instruction alignment is 4 bytes. Data is aligned
to its natural alignment by default, i.e. 2-byte wide data is aligned to 2 bytes and 4-byte
wide data is aligned to 4 bytes.
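Natural alignment as described above means that an address is rounded up to the next multiple of the data width. A minimal Python sketch, assuming the width is a power of two (the function name align_up and the sample addresses are hypothetical):

```python
def align_up(address, width):
    """Round address up to the next multiple of width (a power of two)."""
    if width <= 0 or width & (width - 1):
        raise ValueError("width must be a power of two")
    return (address + width - 1) & ~(width - 1)


if __name__ == "__main__":
    # A 2-byte item after one byte of data, then a 4-byte item.
    print(hex(align_up(0x100501, 2)))
    print(hex(align_up(0x100502, 4)))
```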
The backend uses TR3200 syntax! This means the left operand is always the destination
and the right operand is the source (except for single operand instructions). Register names
have to be prefixed by a ’%’ (%bp, %r0, etc.). This means that it should accept WaveAsm
assembly files, provided that the oldstyle syntax module is used, the instructions are
lowercase, the -dotdir option is given and directives are not in the first column.
24.3 Extensions
Predefined register symbols in this backend:
− register by number: r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11 r12 r13 r14 r15
− special registers by name: bp sp y ia flags
24.6 Example
The following small example illustrates TR3200 assembly using the oldstyle syntax module
(option -dotdir required):
const .equ 0xBEBACAFE ; A constant
an_addr .equ 0x100 ; Other constant
; ROM code
.org 0x100000
.text
_start ; Label with or without an ending ":"
mov %sp, 0x1000 ; Set the initial stack
mov %r0, 0
mov %r1, 0xA5
mov %r2, 0
storeb %r0, an_addr, %r1
mov %r2, 0
mov %r3, 0x10200
push %r0
.repeat 2 ; directives to repeat stuff!
push const
.endrepeat
.repeat 2
pop %r5
.endrepeat
pop %r0
foo: ; Subroutine
ifneq %r5, 0
mul %r5, %r5, 2
sub %r5, %r5, 1
ret
; ROM data
.org 0x100500
var1 .db 0x20 ; A byte size variable
.even ; Enforce to align to even address
var3 .dw 0x1020 ; A word size variable
var4 .dd 0x0A0B0C20 ; A double word size variable
str1 .asciiz "Hello world!" ; ASCII string with null termination
str2 .string "Hello world!" ; ASCII string with null termination
.fill 5, 0xFF ; Fill 5 bytes with 0xFF
.reserve 5 ; Reserves space for 5 bytes
25 Interface
25.1 Introduction
This chapter is under construction!
This chapter describes some of the internals of vasm and tries to explain what has to be
done to write a cpu module, a syntax module or an output module for vasm. However, if
someone wants to write one, I suggest contacting me first, so that it can be integrated into
the source tree.
Note that this documentation may mention explicit values when introducing symbolic
constants. This is due to copying and pasting from the source code. These values may not be
up to date and in some cases can be overridden. Therefore never use the absolute values,
but rather the symbolic representations.
CC Here you have to insert a command that invokes the ANSI C compiler you want
to use to build vasm. It must support the -I option in the same way as e.g. vc or
gcc.
COPTS Here you will usually define an option like -c to instruct the compiler to generate
an object file. Additional options, like the optimization level, should also be
inserted here. When the host operating system is not a Unix (MacOSX and MiNT
count as Unix), you have to define one of the following
preprocessor macros:
-DAMIGA AmigaOS (M68k or PPC), MorphOS, AROS.
-DATARI Atari TOS.
-DMSDOS CP/M, MS-DOS, Windows.
CCOUT Here you define the option which is used to specify the name of an output file,
which is usually -o.
LD Here you insert a command which starts the linker. This may be the same
as under CC.
LDFLAGS Here you have to add options which are necessary for linking. E.g. some
compilers need special libraries for floating-point.
LDOUT Here you define the option which is used by the linker to specify the output file
name.
RM Specify a command to delete a file, e.g. rm -f.
An example for the Amiga using vbcc would be:
TARGET = _os3
TARGETEXTENSION =
CC = vc +aos68k
CCOUT = -o
COPTS = -c -c99 -cpu=68020 -DAMIGA -O1
LD = $(CC)
LDOUT = $(CCOUT)
LDFLAGS = -lmieee
RM = delete force quiet
An example for a typical Unix-installation would be:
TARGET =
TARGETEXTENSION =
CC = gcc
CCOUT = -o
COPTS = -c -O2
LD = $(CC)
LDOUT = $(CCOUT)
LDFLAGS = -lm
RM = rm -f
Open/Net/Free/Any BSD i386 systems will probably require an additional
-D_ANSI_SOURCE in COPTS.
For Windows and various Amiga targets there are already Makefiles included, which you
may either copy on top of the default Makefile, or call explicitly with make’s -f option:
make -f Makefile.OS4 CPU=ppc SYNTAX=std
25.3.1 Source
A source structure represents a source text module, which can be either the main source
text, an included file or a macro. There is always a link to the parent source from where
the current source context was included or called.
struct source *parent;
Pointer to the parent source context. Assembly continues there when the current
source context ends.
int parent_line;
Line number in the parent source context, from where we were called. This
information is needed, because line numbers are only reliable during parsing
and later from the atoms. But an include directive doesn’t create an atom.
char *name;
File name of the main source or include file, or macro name.
char *text;
Pointer to the source text start.
size_t size;
Size of the source text to assemble in bytes.
macro *macro;
Pointer to macro structure, when currently inside a macro (see also num_params).
unsigned long repeat;
Number of repetitions of this source text. Usually this is 1, but for text blocks
between a rept and endr directive it allows any number of repetitions, which
is decremented every time the end of this source text block is reached.
char *irpname;
Name of the iterator symbol in special repeat loops which use a sequence of
arbitrary values, being assigned to this symbol within the loop. Example: irp
directive in std-syntax.
struct macarg *irpvals;
A list of arbitrary values to iterate over in a loop. With each iteration the
frontmost value is removed from the list until it is empty.
int cond_level;
Current level of conditional nesting while entering this source text. It is
automatically restored to the previous level when leaving the source prematurely
through end_source().
struct macarg *argnames;
The current list of named macro arguments.
int num_params;
Number of macro parameters passed at the invocation point from the parent
source. For normal source files this entry will be -1. For macros it is 0 (no
parameters) or higher.
char *param[MAXMACPARAMS];
Pointer to the macro parameters.
int param_len[MAXMACPARAMS];
Number of characters per macro parameter.
int num_quals;
(If MAX_QUALIFIERS!=0.) Number of qualifiers for a macro. When not passed
on invocation, these are the default qualifiers.
char *qual[MAX_QUALIFIERS];
(If MAX_QUALIFIERS!=0.) Pointer to macro qualifiers.
int qual_len[MAX_QUALIFIERS];
(If MAX_QUALIFIERS!=0.) Number of characters per macro qualifier.
unsigned long id;
Every source has its unique id. Useful for macros supporting the special \@
parameter.
char *srcptr;
The current source text pointer, pointing to the beginning of the next line to
assemble.
int line; Line number in the current source context. After parsing the line number of
the current atom is stored here.
size_t bufsize;
Current size of the line buffer (linebuf). The size of the line buffer is extended
automatically, when an overflow happens.
char *linebuf;
A buffer for the current line being assembled in this source text. A child-source,
like a macro, can refer to arguments from this buffer, so every source has got
its own. When returning to the parent source, the linebuf is deallocated to save
memory.
expr *cargexp;
(If CARGSYM was defined.) Pointer to the current expression assigned to the
CARG-symbol (used to select a macro argument) in this source instance. So it
can be restored when reentering this instance.
long reptn;
(If REPTNSYM was defined.) Current value of the repetition counter symbol in
this source instance. So it can be restored when reentering this instance.
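The parent and parent_line members are what make it possible to report where a nested
include or macro call came from. The following is a minimal sketch of that idea, not code
from vasm: it uses a reduced stand-in for the source structure (only four of the members
above), and format_backtrace() is a hypothetical helper introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Reduced stand-in for vasm's source structure: only the members
   needed for this sketch are present. */
struct source {
  struct source *parent;  /* context we were included or called from */
  int parent_line;        /* line in the parent that entered us */
  const char *name;       /* file name or macro name */
  int line;               /* current line in this context */
};

/* Hypothetical helper (not part of vasm): write "name:line" for the
   current context, followed by one line per enclosing context.
   Returns the number of contexts written. */
static int format_backtrace(const struct source *src, char *buf, size_t len)
{
  int depth = 0;
  int line;
  size_t used = 0;

  assert(buf != NULL && len > 0);
  buf[0] = '\0';
  if (src == NULL)
    return 0;
  line = src->line;
  while (src != NULL) {
    int n = snprintf(buf + used, len - used, "%s%s:%d\n",
                     depth ? "  from " : "", src->name, line);
    if (n < 0 || (size_t)n >= len - used) {
      buf[used] = '\0';           /* buffer full: drop the partial entry */
      break;
    }
    used += (size_t)n;
    depth++;
    line = src->parent_line;      /* where the parent entered us */
    src = src->parent;
  }
  return depth;
}
```

For an include file defs.i at line 3, included from line 7 of main.s, the buffer would hold
"defs.i:3" followed by "  from main.s:7".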
25.3.2 Sections
One of the top level structures is a linked list of sections describing continuous blocks of
memory. A section is specified by an object of type section with the following members
that can be accessed by the modules:
struct section *next;
A pointer to the next section in the list.
char *name;
The name of the section.
char *attr;
A string describing the section flags in ELF notation (see, for example, the
documentation of the .section directive of the standard syntax module).
atom *first;
atom *last;
Pointers to the first and last atom of the section. See following sections for
information on atoms.
taddr align;
Alignment of the section in bytes.
uint32_t flags;
Flags of the section. Currently available flags are:
HAS_SYMBOLS
At least one symbol is defined in this section.
RESOLVE_WARN
The current atom changed its size multiple times, so atom_size()
is now called with this flag set in its section to make the backend
(e.g. instruction_size()) aware of it and do less aggressive
optimizations.
UNALLOCATED
Section is unallocated, which means it doesn’t use any memory
space in the output file. Such a section will be removed before
creating the output file and all its labels converted into absolute
expression symbols. Used for "offset" sections. Refer to
switch_offset_section().
LABELS_ARE_LOCAL
As long as this flag is set new labels in a section are defined as local
labels, with the section name as global parent label.
ABSOLUTE Section is loaded at an absolute address in memory.
PREVABS Remembers state of the ABSOLUTE flag before entering relocated-org
mode (IN_RORG). So it can be restored later.
IN_RORG Section has entered relocated-org mode, which also sets the
ABSOLUTE flag. In this mode code is written into the current
section, but relocated to an absolute address. No relocation
information is generated.
NEAR_ADDRESSING
Section is marked as suitable for cpu-specific "near" addressing
modes. For example, base-register relative. The cpu backend can
use this information as an optimization hint when referencing symbols
from this section.
taddr org;
Start address of a section. Usually zero.
taddr pc; Current address in this section. Can be used while traversing through the
section. Has to be updated by a module using it. Is set to org at the beginning.
unsigned long idx;
A member usable by the output module for private purposes.
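Because pc has to be maintained by the module that uses it, a traversal typically resets it
to org and advances it atom by atom. The sketch below illustrates this with reduced stand-in
structures. Note the assumptions: the size member of the stand-in atom and the helpers
align_up() and section_extent() are hypothetical and exist only for this example; in vasm
the size of an atom is computed by the assembler.

```c
#include <assert.h>
#include <stddef.h>

typedef long taddr;               /* stand-in for vasm's target address type */

/* Reduced stand-ins. The "size" member is an assumption of this sketch. */
struct atom {
  struct atom *next;
  taddr align;                    /* required alignment in bytes */
  taddr size;                     /* hypothetical: size in bytes */
};

struct section {
  struct section *next;
  struct atom *first;
  taddr org;                      /* start address */
  taddr pc;                       /* current address, maintained by us */
};

/* Round a non-negative pc up to the next multiple of align (align >= 1). */
static taddr align_up(taddr pc, taddr align)
{
  assert(align >= 1);
  return (pc + align - 1) / align * align;
}

/* Walk all atoms of one section and keep sec->pc up to date.
   Returns the number of bytes covered, including alignment padding. */
static taddr section_extent(struct section *sec)
{
  struct atom *a;

  sec->pc = sec->org;             /* pc starts at org */
  for (a = sec->first; a != NULL; a = a->next) {
    sec->pc = align_up(sec->pc, a->align);
    /* an output module would emit the atom for address sec->pc here */
    sec->pc += a->size;
  }
  return sec->pc - sec->org;
}
```

With a 3-byte atom followed by a 4-byte atom aligned to 4, one padding byte is inserted and
the section covers 8 bytes.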
25.3.3 Symbols
Symbols are represented by a linked list of type symbol with the following members that
can be accessed by the modules:
int type; Type of the symbol. Available are:
#define LABSYM 1
The symbol is a label defined at a specific location.
#define IMPORT 2
The symbol is imported from another file.
#define EXPRESSION 3
The symbol is defined using an expression.
uint32_t flags;
Flags of this symbol. Available are:
#define TYPE_UNKNOWN 0
The symbol has no type information.
#define TYPE_OBJECT 1
The symbol defines an object.
#define TYPE_FUNCTION 2
The symbol defines a function.
#define TYPE_SECTION 3
The symbol defines a section.
#define TYPE_FILE 4
The symbol defines a file.
#define EXPORT (1<<3)
The symbol is exported to other files.
#define INEVAL (1<<4)
Used internally.
#define COMMON (1<<5)
The symbol is a common symbol.
#define WEAK (1<<6)
The symbol is weak, which means the linker may overwrite it with
any global definition of the same name. Weak symbols may also
stay undefined, in which case the linker would assign them a value
of zero.
expr *size;
The size of the symbol, if specified.
section *sec;
The section a LABSYM symbol is defined in.
taddr pc; The address of a LABSYM symbol.
taddr align;
The alignment of the symbol in bytes.
unsigned long idx;
A member usable by the output module for private purposes.
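The flags member combines a small type number (TYPE_UNKNOWN to TYPE_FILE) with independent
flag bits (EXPORT, INEVAL, COMMON, WEAK). The following sketch shows one way to separate the
two parts. The numeric values are copied from the listing above purely so that the example
is self-contained; real module code should include the vasm headers and use the symbolic
names only. SKETCH_TYPE_MASK and the sk_ helpers are hypothetical names, and the mask relies
on the assumption that the type occupies the bits below EXPORT.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the listing above for illustration only. */
#define TYPE_UNKNOWN  0
#define TYPE_OBJECT   1
#define TYPE_FUNCTION 2
#define TYPE_SECTION  3
#define TYPE_FILE     4
#define EXPORT (1<<3)
#define INEVAL (1<<4)
#define COMMON (1<<5)
#define WEAK   (1<<6)

/* Assumption of this sketch: the type uses all bits below EXPORT. */
#define SKETCH_TYPE_MASK (EXPORT - 1)

/* Extract the type information from a flags word. */
static int sk_symbol_type(uint32_t flags)
{
  return (int)(flags & SKETCH_TYPE_MASK);
}

/* Test whether a single flag bit such as EXPORT or WEAK is set. */
static int sk_symbol_has(uint32_t flags, uint32_t bit)
{
  return (flags & bit) != 0;
}

/* Replace the type information while preserving all flag bits. */
static uint32_t sk_set_symbol_type(uint32_t flags, int type)
{
  assert(type >= TYPE_UNKNOWN && type <= TYPE_FILE);
  return (flags & ~(uint32_t)SKETCH_TYPE_MASK) | (uint32_t)type;
}
```

Keeping the type in the low bits means that changing it never disturbs EXPORT or WEAK, which
is why the setter masks before combining.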
25.3.5 Atoms
The contents of each section are a linked list built out of non-separable atoms. The general
structure of an atom is:
typedef struct atom {
struct atom *next;
int type;
taddr align;
taddr lastsize;
unsigned changes;
source *src;
int line;
listing *list;
union {
instruction *inst;
dblock *db;
symbol *label;
sblock *sb;
defblock *defb;
void *opts;
int srcline;
char *ptext;
printexpr *pexpr;
expr *roffs;
taddr *rorg;
assertion *assert;
aoutnlist *nlist;
} content;
} atom;
The members have the following meaning:
#define PRINTTEXT 8
A string is printed to stdout during the final assembler pass. A
newline is automatically appended.
#define PRINTEXPR 9
Prints the value of an expression during the final assembler pass to
stdout.
#define ROFFS 10
Set the program counter to an address relative to the section’s start
address. These atoms will be translated into SPACE atoms in the
final pass.
#define RORG 11
Assemble this block under the given base address, while the code
is still written into the original memory region.
#define RORGEND 12
Ends a RORG block and returns to the original addressing.
#define ASSERT 13
The assertion expression is checked in the final pass and an error
message is generated (using the expression string and an optional
message out of this atom) when it evaluates to 0.
#define NLIST 14
Defines a stab-entry for the a.out object file format. nlist-style stabs
can also occur embedded in other object file formats, like ELF.
taddr align;
The alignment of this atom. Address must be divisible by align.
taddr lastsize;
The size of this atom in the last resolver pass. When the size has changed in
the current pass, the assembler will request another resolver run through the
section.
unsigned changes;
Number of changes in the size of this atom since pass number FASTOPTPHASE.
An increasing number usually indicates a problem in the cpu backend’s optimizer
and will be flagged by setting RESOLVE_WARN in the section flags, as
soon as changes exceeds MAXSIZECHANGES. So the backend can choose not to
optimize this atom as aggressively as before.
source *src;
Pointer to the source text object to which this atom belongs.
int line; The source line number that created this atom.
listing *list;
Pointer to the listing object to which this atom belongs.
instruction *inst;
(In union content.) Pointer to an instruction structure in the case of an
INSTRUCTION-atom. Contains the following elements:
defblock *defb;
(In union content.) Pointer to a defblock structure in the case of a DATADEF-
atom. Contains the following elements:
taddr bitsize;
The size of the definition in bits.
operand *op;
Pointer to a cpu-specific operand structure.
void *opts;
(In union content.) Points to a cpu module specific options object in the case
of an OPTS-atom.
int srcline;
(In union content.) Line number for source level debugging in the case of a
LINE-atom.
char *ptext;
(In union content.) A string to print to stdout in case of a PRINTTEXT-atom.
printexpr *pexpr;
(In union content.) Pointer to a printexpr structure in the case of a PRINTEXPR-
atom. Contains the following elements:
expr *print_exp;
Pointer to an expression to evaluate and print.
short type;
Format type of the printed value. We can print as hexadecimal
(PEXP_HEX), signed decimal (PEXP_SDEC), unsigned decimal (PEXP_UDEC),
binary (PEXP_BIN) or ASCII (PEXP_ASC).
short size;
Size (precision) of the printed value in bits. Excessive bits will be
masked out, and sign-extended when requested.
expr *roffs;
(In union content.) The expression holds the relative section offset to align to
in case of a ROFFS-atom.
taddr *rorg;
(In union content.) Assemble the code under the base address in rorg in case
of a RORG-atom.
assertion *assert;
(In union content.) Pointer to an assertion structure in the case of an ASSERT-
atom. Contains the following elements:
expr *assert_exp;
Pointer to an expression which should evaluate to non-zero.
char *exprstr;
Pointer to the expression as text (to be used in the output).
char *msgstr;
Pointer to the message, which would be printed when assert_exp
evaluates to zero.
aoutnlist *nlist;
(In union content.) Pointer to an nlist structure, describing an aout stab entry,
in case of an NLIST-atom. Contains the following elements:
char *name;
Name of the stab symbol.
int type; Symbol type. Refer to stabs.h for definitions.
int other;
Defines the nature of the symbol (function, object, etc.).
int desc; Debugger information.
expr *value;
Symbol’s value.
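The size member of the printexpr structure described above says that excessive bits are
masked out and that the value is sign-extended when requested. As a rough illustration of
what this means, the sketch below masks a value to a given number of bits and optionally
interprets it as a two's-complement number. The functions mask_bits() and sign_extend() are
hypothetical and not taken from vasm. The final conversion to int64_t assumes the usual
two's-complement behaviour of the host compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Keep the low `bits` bits of value (1 <= bits <= 64). */
static uint64_t mask_bits(uint64_t value, int bits)
{
  assert(bits >= 1 && bits <= 64);
  return bits == 64 ? value : value & ((UINT64_C(1) << bits) - 1);
}

/* Interpret the low `bits` bits of value as a two's-complement number. */
static int64_t sign_extend(uint64_t value, int bits)
{
  uint64_t v = mask_bits(value, bits);
  uint64_t sign = UINT64_C(1) << (bits - 1);

  /* (v ^ sign) - sign copies the sign bit into all higher bits */
  return (int64_t)((v ^ sign) - sign);
}
```

For example, an 8-bit value 0xFF would be printed as 255 in unsigned decimal and as -1 in
signed decimal.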
25.3.6 Relocations
DATA and SPACE atoms can have a relocation list attached that describes how this data must
be modified when linking/relocating. They always refer to the data in this atom only.
There are a number of predefined standard relocations and it is possible to add other
cpu-specific relocations. Note however, that it is always preferable to use standard
relocations, if possible. Chances that an output module supports a certain relocation are
much higher if it is a standard relocation.
A relocation list uses this structure:
typedef struct rlist {
struct rlist *next;
void *reloc;
int type;
} rlist;
Type identifies the relocation type. All the standard relocations have type numbers
between FIRST_STANDARD_RELOC and LAST_STANDARD_RELOC. Consult reloc.h to see which
standard relocations are available.
The detailed information can be accessed via the pointer reloc. It will point to a structure
that depends on the relocation type, so a module must only use it if it knows the relocation
type.
All standard relocations point to a type nreloc with the following members:
size_t byteoffset;
Offset in bytes, from the start of the current DATA atom, to the beginning of the
relocation field. This may also be the address which is used as a basis for PC-
relative relocations. Or a common basis for several separated relocation fields,
which will be translated into a single relocation type by the output module.
size_t bitoffset;
Offset in bits to the beginning of the relocation field, adds to
byteoffset*bitsperbyte. Bits are counted in a bit-stream from lower to
higher address bytes. But note, that inside a little-endian byte they are
counted from the LSB to the MSB, while they are counted from the MSB to
the LSB for big-endian targets.
int size; The size of the relocation field in bits.
taddr mask;
The mask defines which portion of the relocated value is set by this relocation
field.
taddr addend;
Value to be added to the symbol value.
symbol *sym;
The symbol referred to by this relocation.
To describe the meaning of these entries, we will define the steps that shall be executed
when performing a relocation:
1. Extract the size bits from the data atom, starting with bit number
byteoffset*bitsperbyte+bitoffset. We start counting bits from the lowest to the
highest numbered byte in memory. Inside a big-endian byte we count from the MSB
to the LSB. Inside a little-endian byte we count from the LSB to the MSB.
2. Determine the relocation value of the symbol. For a simple absolute relocation, this
will be the value of the symbol sym plus the addend. For other relocation types,
more complex calculations will be needed. For example, in a program-counter relative
relocation, the value will be obtained by subtracting the address of the data atom plus
byteoffset from the value of sym plus addend.
3. Calculate the bit-wise "and" of the value obtained in the step above and the mask
value.
4. Normalize, i.e. shift the value above right as many bit positions as there are low order
zero bits in mask.
5. Add this value to the value extracted in step 1.
6. Insert the low order size bits of this value into the data atom starting with bit
byteoffset*bitsperbyte+bitoffset.
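The six steps above can be sketched in C for the simplest case. This is only an
illustration under stated assumptions, not the code vasm or a linker uses: it assumes 8 bits
per byte, a big-endian target (bits counted from the MSB to the LSB inside a byte) and a
simple absolute relocation. The structure sketch_reloc is a reduced stand-in for nreloc in
which the symbol is replaced by its already resolved value (symvalue), and all function
names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t taddr;            /* stand-in for vasm's target address type */

/* Reduced stand-in for nreloc; the symbol is replaced by its value. */
struct sketch_reloc {
  size_t byteoffset;
  size_t bitoffset;
  int size;                       /* field size in bits */
  taddr mask;
  taddr addend;
  taddr symvalue;                 /* hypothetical: resolved value of sym */
};

/* Read `size` bits starting at bit `pos`, MSB first within each byte. */
static uint64_t get_bits(const uint8_t *d, size_t pos, int size)
{
  uint64_t v = 0;
  int i;

  for (i = 0; i < size; i++, pos++)
    v = (v << 1) | ((d[pos / 8] >> (7 - pos % 8)) & 1u);
  return v;
}

/* Write the low `size` bits of v starting at bit `pos`, MSB first. */
static void put_bits(uint8_t *d, size_t pos, int size, uint64_t v)
{
  int i;

  for (i = size - 1; i >= 0; i--, pos++) {
    uint8_t bit = (uint8_t)(0x80u >> (pos % 8));
    if ((v >> i) & 1u)
      d[pos / 8] |= bit;
    else
      d[pos / 8] &= (uint8_t)~bit;
  }
}

/* Apply one absolute relocation to the bytes of a data atom. */
static void apply_abs_reloc(uint8_t *data, const struct sketch_reloc *r)
{
  size_t pos = r->byteoffset * 8 + r->bitoffset;
  uint64_t mask = (uint64_t)r->mask;
  uint64_t field, val;

  assert(r->size >= 1 && r->size <= 64);
  field = get_bits(data, pos, r->size);          /* step 1: extract */
  val = (uint64_t)(r->symvalue + r->addend);     /* step 2: absolute value */
  val &= mask;                                   /* step 3: apply mask */
  while (mask != 0 && (mask & 1u) == 0) {        /* step 4: normalize */
    mask >>= 1;
    val >>= 1;
  }
  val += field;                                  /* step 5: add old contents */
  put_bits(data, pos, r->size, val);             /* step 6: insert */
}
```

A 16-bit field with mask 0xFFFF0000 receives the upper half of a 32-bit symbol value, while
the same field with mask 0xFFFF receives the lower half. This is how one symbol reference
can be split over two instructions.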
25.3.7 Errors
Each module can provide a list of possible error messages contained e.g. in
syntax_errors.h or cpu_errors.h. They are a comma-separated list of a printf-format string
and error flags. Allowed flags are WARNING, ERROR, FATAL, MESSAGE and NOLINE. They can
be combined using or (|). NOLINE has to be set for error messages during initialization or
while writing the output, when no source text is available. Errors cause the assembler to
return false. FATAL causes the assembler to terminate immediately.
The errors can be emitted using the function syntax_error(int n,...), cpu_error(int
n,...) or output_error(int n,...). The first argument is the number of the error
message (starting from zero). Additional arguments must be passed according to the format
string of the corresponding error message.
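To make the relation between the message list and the error number concrete, here is a
small self-contained sketch. It does not reproduce vasm's implementation: the SK_ flag
values, the table sk_cpu_errors, the message texts and the function sk_cpu_error() are all
hypothetical. Unlike the real cpu_error(), the sketch formats into a caller-supplied buffer
and takes the line number as a parameter, so that it can be tried in isolation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Flag values are assumptions of this sketch; vasm defines its own. */
#define SK_WARNING 1
#define SK_ERROR   2
#define SK_FATAL   4
#define SK_MESSAGE 8
#define SK_NOLINE  16

struct sk_err { const char *fmt; int flags; };

/* A module's message list: printf-format string and flags, in order. */
static const struct sk_err sk_cpu_errors[] = {
  { "illegal register %s", SK_ERROR },                  /* 0 */
  { "operand %d out of range", SK_WARNING },            /* 1 */
  { "cannot open output file", SK_FATAL | SK_NOLINE }   /* 2 */
};

static int sk_error_count;        /* ERROR and FATAL messages so far */

/* Format message number n into buf and return its flags. */
static int sk_cpu_error(char *buf, size_t len, int line, int n, ...)
{
  const struct sk_err *e;
  const char *kind;
  va_list ap;
  int used;

  assert(n >= 0 &&
         (size_t)n < sizeof sk_cpu_errors / sizeof sk_cpu_errors[0]);
  e = &sk_cpu_errors[n];
  kind = (e->flags & SK_WARNING) ? "warning" :
         (e->flags & SK_MESSAGE) ? "message" :
         (e->flags & SK_FATAL) ? "fatal error" : "error";
  if (e->flags & SK_NOLINE)       /* no source text available */
    used = snprintf(buf, len, "%s %d: ", kind, n);
  else
    used = snprintf(buf, len, "%s %d in line %d: ", kind, n, line);
  if (used >= 0 && (size_t)used < len) {
    va_start(ap, n);
    vsnprintf(buf + used, len - (size_t)used, e->fmt, ap);
    va_end(ap);
  }
  if (e->flags & (SK_ERROR | SK_FATAL))
    sk_error_count++;
  return e->flags;
}
```

A call such as sk_cpu_error(buf, sizeof buf, 7, 0, "r99") selects the first entry of the
list and passes "r99" for its %s conversion.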
#define MAXMACPARAMS 35
Optionally defines the maximum number of macro arguments, if you need more
than the default number of 9.
#define SKIP_MACRO_ARGNAME(p) skip_identifier(p)
An optional function to skip a named macro argument in the macro definition.
Argument is the current source stream pointer. The default is to skip an
identifier.
#define MACRO_ARG_OPTS(m,n,a,p) NULL
An optional function to parse and skip options, default values and qualifiers
for each macro argument. Returns NULL when no argument options have been
found. Arguments are:
struct macro *m;
Pointer to the macro structure being currently defined.
int n; Argument index, starting with zero.
char *a; Name of this argument.
char *p; Current source stream pointer. An updated pointer will be returned.
Defaults to unused.
#define MACRO_ARG_SEP(p) (*p==’,’ ? skip(p+1) : NULL)
An optional function to skip a separator between the macro argument names
in the macro definition. Returns NULL when no valid separator is found.
Argument is the current source stream pointer. Defaults to using comma as
the only valid separator.
#define MACRO_PARAM_SEP(p) (*p==’,’ ? skip(p+1) : NULL)
An optional function to skip a separator between the macro parameters in a
macro call. Returns NULL when no valid separator is found. Argument is
the current source stream pointer. Defaults to using comma as the only valid
separator.
#define EXEC_MACRO(s)
An optional function to be called just before a macro starts execution. Parameters
and qualifiers are already parsed. Argument is the source pointer of the
new macro. Defaults to unused.
char *defsectname;
Name of a default section which vasm creates when a label or code occurs in
the source, but the programmer forgot to specify a section. Assigning NULL
means that there is no default and vasm will show an error in this case.
char *defsecttype;
Type of the default section (see above). May be NULL.
int init_syntax();
Will be called during startup, after argument parsing. Must return zero if
initializations failed, non-zero otherwise.
int syntax_args(char *);
This function will be called with the command line arguments (unless they were
already recognized by other modules). If an argument was recognized, return
non-zero.
char *skip(char *);
A function to skip whitespace etc.
char *skip_operand(char *);
A function to skip an instruction’s operand. Will terminate at end of line or
the next comma, returning a pointer to the rest of the line behind the comma.
void eol(char *);
This function should check that the argument points to the end of a line (only
comments or whitespace following). If not, an error or warning message should
be emitted.
char *const_prefix(char *,int *);
Check if the first argument points to the start of a constant. If yes return a
pointer to the real start of the number (i.e. skip a prefix that may indicate the
base) and write the base of the number through the pointer passed as second
argument. Return zero if it does not point to a number.
char *const_suffix(char *,char *);
First argument points to the start of the constant (including prefix) and the
second argument to first character after the constant (excluding suffix). Checks
for a constant-suffix and skips it. Return pointer to the first character after
that constant. Example: constants with a ’h’ suffix to indicate a hexadecimal
base.
void parse(void);
This is the main parsing function. It has to read lines via the read_next_line()
function, parse them and create sections, atoms and symbols. Pseudo
directives are usually handled by the syntax module. Instructions can be parsed
by the cpu module using parse_instruction().
char *parse_macro_arg(struct macro *,char *,struct namelen *,struct namelen
*);
Called to parse a macro parameter by using the source stream pointer in the
second argument. The start pointer and length of a single passed parameter are
written to the first struct namelen, while the optionally selected named macro
argument is passed in the second struct namelen. When the len field of the
second namelen is zero, then the argument is selected by position instead of by
name. Returns the updated source stream pointer after successful parsing.
int expand_macro(source *,char **,char *,int);
Expand parameters and special commands inside a macro source. The second
argument is a pointer to the current source stream pointer, which is updated
on any successful expansion. The function will return the number of characters
written to the destination buffer (third argument) in this case. Returning
-1 means: no expansion took place. The last argument defines the space in
characters which is left in the destination buffer.
char *get_local_label(char **);
Gets a pointer to the current source pointer. Has to check if a valid local label
is found at this point. If yes return a pointer to the vasm-internal symbol name
representing the local label and update the current source pointer to point
behind the label.
Have a look at the support functions provided by the frontend to help.
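As an illustration of the contract described for const_prefix() above, the following sketch
recognizes three prefixes. The prefix conventions chosen here ($ and 0x for hexadecimal,
% for binary) are assumptions made for the example and are not the rules of any particular
vasm syntax module. The function is named sk_const_prefix() to make clear that it is
hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Sketch of a const_prefix()-style function. Returns a pointer to the
   first digit and stores the base, or returns NULL if s does not point
   to a number. */
static char *sk_const_prefix(char *s, int *base)
{
  assert(s != NULL && base != NULL);
  if (s[0] == '$' && isxdigit((unsigned char)s[1])) {
    *base = 16;                   /* $1F */
    return s + 1;
  }
  if (s[0] == '%' && (s[1] == '0' || s[1] == '1')) {
    *base = 2;                    /* %1010 */
    return s + 1;
  }
  if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X') &&
      isxdigit((unsigned char)s[2])) {
    *base = 16;                   /* 0x1F */
    return s + 2;
  }
  if (isdigit((unsigned char)s[0])) {
    *base = 10;                   /* no prefix: decimal */
    return s;
  }
  return NULL;                    /* not a constant */
}
```

Checking the character after the prefix keeps a lone $ or % from being taken as the start
of a number, which matters when the same characters have other meanings in the syntax.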
#define VASM_CPU_<cpu> 1
Insert the cpu specifier.
#define INST_ALIGN 2
Minimum instruction alignment.
#define DATA_ALIGN(n) ...
Default alignment for n-bit data. Can also be a function.
#define DATA_OPERAND(n) ...
Operand class for n-bit data definitions. Can also be a function. Negative
values denote a floating point data definition of -n bits.
typedef ... operand;
Structure to store an operand.
typedef ... mnemonic_extension;
Mnemonic extension.
Optional features, which can be enabled by defining the following macros:
#define HAVE_INSTRUCTION_EXTENSION 1
If cpu-specific data should be added to all instruction atoms.
typedef ... instruction_ext;
Type for the above extension.
#define NEED_CLEARED_OPERANDS 1
Backend requires a zeroed operand structure when calling parse_operand()
for the first time. Defaults to undefined.
START_PARENTH(x)
Valid opening parenthesis for instruction operands. Defaults to ’(’.
END_PARENTH(x)
Valid closing parenthesis for instruction operands. Defaults to ’)’.
#define MNEMONIC_VALID(i)
An optional function with the arguments (int idx). Returns true when the
mnemonic with index idx is valid for the current state of the backend (e.g. it
is available for the selected cpu architecture).
#define MNEMOHTABSIZE 0x4000
You can optionally overwrite the default hash table size defined in vasm.h. May
be necessary for larger mnemonic tables.
#define OPERAND_OPTIONAL(p,t)
When defined, this is a function with the arguments (operand *op,int type),
which returns true when the given operand type (type) is optional. The function
is only called for missing operands and should also initialize op with default
values (e.g. 0).
Implementing additional target-specific unary operations is done by defining the following
optional macros:
#define EXT_UNARY_NAME(s)
Should return true when the string in s points to an operation name we want
to handle.
#define EXT_UNARY_TYPE(s)
Returns the operation type code for the string in s. Note that the last valid
standard operation is defined as LAST_EXP_TYPE, so the target-specific types
will start with LAST_EXP_TYPE+1.
#define EXT_UNARY_EVAL(t,v,r,c)
Defines a function with the arguments (int t, taddr v, taddr *r, int c) to
handle the operation type t returning an int to indicate whether this type has
been handled or not. Your operation will be applied to the value v and the
result is stored in *r. The flag c is passed as 1 when the value is constant (no
relocatable addresses involved).
#define EXT_FIND_BASE(b,e,s,p)
Defines a function with the arguments (symbol **b, expr *e, section *s,
taddr p) to save a pointer to the base symbol of expression e into the symbol
pointer, pointed to by b. The type of this base is given by an int return code.
Furthermore, e->type has to be checked to be one of the operations to handle.
The section pointer s and the current pc p are needed to call the standard
find_base() function.
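A minimal sketch of the first three macros might look as follows. Everything specific in it
is an assumption made for illustration: the operation names lobyte and hibyte, the sk_
helper functions, the placeholder value of LAST_EXP_TYPE (which normally comes from the vasm
headers) and the decision to handle constant values only. How the name string is delimited
when EXT_UNARY_NAME is called is not described above, so the sketch simply compares the
leading characters.

```c
#include <assert.h>
#include <string.h>

typedef long taddr;               /* stand-in for vasm's target address type */
#define LAST_EXP_TYPE 100         /* placeholder; normally from vasm's headers */

enum {
  SK_LOBYTE = LAST_EXP_TYPE + 1,  /* hypothetical operation "lobyte" */
  SK_HIBYTE                       /* hypothetical operation "hibyte" */
};

/* Return the type code for a known operation name, or 0 if unknown. */
static int sk_unary_type(const char *s)
{
  if (strncmp(s, "lobyte", 6) == 0)
    return SK_LOBYTE;
  if (strncmp(s, "hibyte", 6) == 0)
    return SK_HIBYTE;
  return 0;
}

/* Apply operation t to v and store the result in *r.
   Returns 1 when handled. This sketch handles constants only (c != 0). */
static int sk_unary_eval(int t, taddr v, taddr *r, int c)
{
  unsigned long u = (unsigned long)v;

  if (!c)
    return 0;
  switch (t) {
    case SK_LOBYTE:
      *r = (taddr)(u & 0xff);
      return 1;
    case SK_HIBYTE:
      *r = (taddr)((u >> 8) & 0xff);
      return 1;
    default:
      return 0;
  }
}

#define EXT_UNARY_NAME(s) (sk_unary_type(s) != 0)
#define EXT_UNARY_TYPE(s) sk_unary_type(s)
#define EXT_UNARY_EVAL(t,v,r,c) sk_unary_eval(t,v,r,c)
```

Numbering the new types from LAST_EXP_TYPE+1 keeps them apart from the standard operations,
as required by the description of EXT_UNARY_TYPE.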
− vasm has a mechanism to specify rather complex relocations in a standard way (see
the section on general data structures). They can be extended with CPU specific
relocations, but usually CPU modules will try to create standard relocations (sometimes
several standard relocations can be used to implement a CPU specific relocation).
An output module should try to find appropriate relocations supported by the object
format. The goal is to avoid special CPU specific relocations as much as possible.
Volker Barthelmann