Assembly Language
Preface
Introduction
A Short History of Assemblers and Loaders
Types of Assemblers and Loaders
1 Basic Principles
1.1 Assembler Operation
1.1.1 The source line
1.2 The Two-Pass Assembler
1.3 The One-Pass Assembler
1.4 Absolute and Relocatable Object Files
1.4.1 Relocation bits
1.4.2 One-pass, relocatable object files
1.4.3 The task of relocating
1.5 Two Historical Notes
1.5.1 Early relocation
1.5.2 One-and-a-half pass assemblers
1.6 Forcing Upper
1.6.1 Relocating packed instructions
My computing experience dates back to the early 1960s, when higher-level lan-
guages were fairly new. It is therefore no wonder that my introduction to computers
and computing came through assembler language; specifically, the IBM 7040 assem-
bler language. After programming in assembler exclusively (and enthusiastically)
for more than a year, I finally studied Fortran. However, my love affair with as-
semblers has continued, and I very quickly discovered the lack of literature in this
field. In strict contrast to compilers, for which a wide range of literature exists,
very little has ever been written on assemblers and loaders. References [1,2,3,64]
are the best ones known to me, that describe and discuss the principles of opera-
tion of assemblers and loaders. Assembler language textbooks are—of course—very
common, but they only talk about what assemblers do, not about how they do it.
One reason for this situation is that, for many years—from the mid 1950s to
the mid 1970s—assemblers were in decline. The development of Fortran and other
higher-level languages in the mid 1950s overshadowed assemblers. The growth of
higher-level languages was taken by many a programmer to signal the demise of
assemblers, with the result that the use of assemblers dwindled. The advent of the
microprocessor, around 1975, caused a significant change, however.
Initially, there were no compilers available, so programmers had to use assem-
blers, even primitive ones. This situation did not last long, of course, and today, in
the early 1990s, there are many compilers available for microcomputers, but assem-
blers have not been neglected. Virtually all software development systems available
for modern computers include an assembler. The assemblers described in Ch. 8 are
typical examples.
References [5–7] list three Z-80 assemblers running under CP/M. In spite of
being obsolete, they are good examples of modern assemblers. They are all state
of the art, relocatable assemblers that support macros and conditional assembly.
References [5,6] also include linking loaders. These assemblers reflect the interest
in the Z-80 and CP/M in the early 80s. Current processors, such as the 80x86 and
the 680x0 families, continue the tradition. Modern, sophisticated assemblers are
available for these processors, and are used extensively by programmers who need
optimized code in certain procedures.
The situation with loaders is different. Loaders have always been used. They
are used with compilers as well as with assemblers, but their use is normally transparent to
the programmer. The average programmer hardly notices the existence of loaders,
which may explain the lack of literature in this area.
This book differs from the typical assembler text in that it is not a programming
manual, and it is not concerned with any specific assembler language. Instead
it concentrates on the design and implementation of assemblers and loaders. It
assumes that the reader has some knowledge of computers and programming, and it
aims to explain how assemblers and loaders work. Most of the discussion is general,
and most of the examples are in a hypothetical, simple, assembler language. Certain
examples are in the assembler languages of actual machines, and those are always
specified. Some good references for specific assembler languages are [5, 6, 7, 13, 26,
27, 30, 31, 32, 35, 37, 39, 101].
This work has its origins at a point, a few years ago, when my students started
complaining about a lack of literature in this field. Since I include assemblers and
loaders in classes that I teach every semester, I responded by developing class notes.
The notes were an immediate success, and have grown each semester, until I had
enough material for an expository paper on the subject. Since I was too busy to
polish the paper and submit it, I pretty soon found myself in a situation where the
work was too large for a paper. So here it is at last, in the form of a book.
This is mostly a professional book, intended for computer professionals in gen-
eral, and especially for systems programmers. However, it can be used as a sup-
plementary text in a systems programming or computer organization class at any
level.
Chapter 1 introduces the one-pass and two-pass assemblers, discusses other
important concepts—such as absolute- and relocatable object files—and describes
assembler features such as local labels and multiple location counters.
Data structures for implementing the symbol table are discussed in chapter 2.
Chapter 3 presents many directives and discusses their formats, meaning, and
implementation. These directives are supported by many actual assemblers and,
while not complete, this collection of directives is quite extensive.
The two important topics of macros and conditional assembly are introduced
in chapter 4. The treatment of macros is as complete as practically possible. I have
tried to include every possible feature of macros and the way it is implemented,
so this chapter can serve as a guide to practical macro implementation. At the
same time, I have tried not to concentrate on the macro features and syntax of any
specific assembler.
Features of the listing file are outlined, with examples, in chapter 5, while
chapter 6 is a general description of the properties of disassemblers, and of three
special types of assemblers. Those topics, especially meta-assemblers and high-level
assemblers, are of special interest to the advanced reader. They are not new, but
even experienced programmers are not always familiar with them.
Chapter 7 covers loaders. There is a very detailed example of the basic operation
of a one-pass linking loader, followed by features and concepts such as dynamic
loading, bootstrap loader, overlays, and others.
Finally, chapter 8 contains a survey of four modern, state of the art, assemblers.
Their main characteristics are described, as well as features that distinguish them
from their older counterparts.
To make it possible to use the book as a textbook, each chapter is sprinkled
with exercises, all solved in appendix C. At the end of each chapter there are review
problems and projects. The review questions vary from very easy questions to tasks
that require the student to find some topic in textbooks and study it. The projects
are programming assignments, arranged from simple to more complex, that propose
various assemblers and loaders to be implemented. They should be done in the order
specified, since most of them are extensions of their predecessors. Some instructors
would find appendix A, on addressing modes, useful.
References are indicated by square brackets. Thus [14] (or Ref. [14]) refers to
Grishman’s book listed in the reference section.
This is the attention symbol. It is placed in front of paragraphs that require
special attention, that present fundamental concepts, or that are judged important
for other reasons.
Acknowledgement: I would like to acknowledge the help received from B. A.
Wichmann of the National Physical Laboratory in England. He sent me information
on the PL516 high-level assembler, the BABBAGE language, and the GE 4000
family of minicomputers. His was the only help I have received in collecting and
analyzing the material for this book. Johnny Tolliver, of Oak Ridge National Labs,
should also be mentioned. His version of the MakeIndex program proved invaluable
in preparing the extensive index of this book.
This means that each source instruction is translated into exactly one target in-
struction.
This definition has the advantage of clearly describing the translation process
of an assembler. It is not a precise definition, however, because an assembler can
do (and usually does) much more than just translation. It offers a lot of help to
the programmer in many aspects of writing the program. The many types of help
offered by the assembler are grouped under the general term directives (or pseudo-
instructions). All the important directives are discussed in chapters 3 and 4.
Another good definition of assemblers is:
An assembler is a translator that translates a machine-
oriented language into machine language.
It tells the assembler to create a pass 1 listing (/D), to create a variable opt and
set its value to 5, to convert all letters read from the source file to upper case (MU),
and to include certain information in the listing file (the V, or verbose, option).
The second input is the source file. It includes the symbolic instructions and
directives. The assembler translates each symbolic instruction into one machine
instruction. The directives, however, are not translated. The directives are our
way of asking the assembler for help. The assembler provides the help by executing
(rather than translating) the directives. A modern assembler can support as many
as a hundred directives. They range from ORG, which is very simple to execute,
to MACRO, which can be very complex. All the common directives are listed and
explained in chapters 3 and 4.
The first and most important output of the assembler is the object file. It
contains the assembled instructions (the machine language program) to be loaded
later into memory and executed. The object file is an important component of the
assembler-loader system. It makes it possible to assemble a program once, and later
load and run it many times. It also provides a natural place for the assembler to
leave information to the loader, instructing the loader in several aspects of loading
the program. This information is called loader directives and is covered in chapters
3 and 7. Note, however, that the object file is optional. The user may specify no
object file, in which case the assembler generates only a listing.
The second output of the assembler is the listing file. For each line in the source
file, a line is created in the listing file, containing:
The Location Counter (see chapter 1).
The source line itself.
The machine instruction (if the source line is an instruction), or some other
relevant information (if the source line is a directive).
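The three items of a listing line can be illustrated with a minimal Python sketch. The column order, the field widths, the hexadecimal formatting, and the name `listing_line` are assumptions made for this example; they are not the layout of any particular assembler.

```python
def listing_line(lc, code, source):
    """Format one listing line: location counter, machine code, source text.

    `code` is the assembled instruction as a hex string, or None when the
    source line is a directive that generates no code (assumed convention).
    """
    code_field = code if code is not None else ""
    # LC as four hex digits, code left-justified in eight columns.
    return f"{lc:04X}  {code_field:<8}  {source}"


print(listing_line(0x104, "45500104", "      BAL  5,XYZ"))
print(listing_line(0x108, None, "      END"))
```

A real assembler would typically add line numbers, page headers, and error flags to each line.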
The listing file is generated by the assembler, sent to the printer, gets printed,
and is then discarded. The user, however, can specify either not to generate a listing
file or not to print it. There are also directives that control the listing. They can
be used to suppress parts of the listing, to print page headers, or to control the
printing of macro expansions.
The cross-reference information is normally a part of the listing file, although
the MASM assembler creates it in a separate file and uses a special utility to print
it. The cross-reference is a list of all symbols used in the program. For each symbol,
the point where it is defined and all the places where it is used, are listed.
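As a rough illustration of how such a cross-reference could be collected, the Python sketch below scans a toy source file. The source format is an assumption of this example: a label starts in column 1, operands are separated by commas, a semicolon starts a comment, and registers are written as numbers. `DATA` is a hypothetical directive, and `cross_reference` is a name invented here.

```python
import re
from collections import defaultdict


def cross_reference(lines):
    """Return {symbol: (definition_line, [use_lines])} for a toy source format."""
    defined = {}
    used = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        text = line.split(";", 1)[0]            # drop the comment, if any
        has_label = bool(text) and not text[0].isspace()
        fields = text.split()
        if not fields:
            continue                            # blank or comment-only line
        if has_label:
            defined[fields[0]] = number         # symbol defined on this line
            fields = fields[1:]
        operands = fields[1] if len(fields) > 1 else ""
        for name in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", operands):
            used[name].append(number)           # symbol used on this line
    symbols = sorted(set(defined) | set(used))
    return {s: (defined.get(s), used[s]) for s in symbols}


program = [
    "XYZ   A    5,ABC   ;THE SUBROUTINE STARTS HERE",
    "      BAL  5,XYZ",
    "ABC   DATA 7",
]
for symbol, (where_defined, where_used) in cross_reference(program).items():
    print(symbol, "defined at line", where_defined, "used at lines", where_used)
```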
Exercise .1 Why would anyone want to suppress the listing file or not to print it?
As mentioned above, the first assemblers were assemble-go type systems. They
did not generate any object file. Their main output was machine instructions loaded
directly into memory. Their secondary output was a listing. Such assemblers are
also in use today (for reasons explained in chapter 1) and are called one-pass assemblers.
In principle, a one-pass assembler can produce an object file, but such a
file would be absolute and its use is limited.
Most assemblers today are of the two-pass variety. They generate an object file
that is relocatable and can be linked and loaded by a loader.
A loader, as the name implies, is a program that loads programs into memory.
Modern loaders, however, do much more than that. Their main tasks (chapter
7) are loading, relocating, linking and starting the program. In a typical run,
a modern linking-loader can read several object files, load them one by one into
memory, relocating each as it is being loaded, link all the separate object files into
one executable module, and start execution at the right point. Using such a loader
has several advantages (see below), the most important being the ability to write
and assemble a program in several, separate, parts.
Writing a large program in several parts is advantageous, for reasons that will
be briefly mentioned but not fully discussed here. The individual parts can be
written by different programmers (or teams of programmers), each concentrating
on his own part. The different parts can be written in different languages. It is
common to write the main program in a higher-level language and the procedures in
assembler language. The individual parts are assembled (or compiled) separately,
and separate object files are produced. The assembler or compiler can only see one
part at a time and does not see the whole picture. It is only the loader that loads
the separate parts and combines them into a single program. Thus when a program
is assembled, the assembler does not know whether this is a complete program or
just a part of a larger program. It therefore assumes that the program will start at
address zero and assembles it based on that assumption. Before the loader loads the
program, it determines its true start address, based on the memory areas available
at that moment and on the previously loaded object files. The loader then loads the
program, making sure that all instructions fit properly in their memory locations.
This process involves adjusting memory addresses in the program, and is called
relocation.
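The adjustment of addresses can be sketched in a few lines of Python. The sketch assumes that the object file marks, for every word, whether it holds an address that depends on the load point; this per-word flag list and the name `relocate` are assumptions of the example, not a description of a specific loader.

```python
def relocate(words, relocatable, start):
    """Return a copy of `words` with `start` added to every relocatable word.

    `words` holds values as assembled for start address 0; `relocatable[i]`
    is True when word i contains an address that depends on where the
    program is loaded (a hypothetical per-word flag).
    """
    return [w + start if flag else w for w, flag in zip(words, relocatable)]


assembled = [0x104, 7, 0x020]      # an address, a constant, an address
flags = [True, False, True]
print([hex(w) for w in relocate(assembled, flags, 0x1000)])
```

The constant 7 is left alone, while the two addresses are moved by the true start address.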
Since the assembler works on one program at a time, it cannot link individual
programs. When it assembles a source file containing a main program, the assembler
knows nothing about the existence of any other source files containing, perhaps,
procedures called by the main program. As a result, the assembler may not be
able to properly assemble a procedure call instruction (to an external procedure) in
the main program. The object file of the main program will, in such a case, have
missing parts (holes or gaps) that the assembler cannot fill. The loader has access
to all the object files that make up the entire program. It can see the whole picture,
and one of its tasks is to fill up any missing parts in the object files. This task is
called linking.
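Filling the holes can be sketched as follows. The object-file layout used here (a list of words, a table of exported symbols, and a table of holes) is hypothetical and chosen only to keep the example short. The sketch also ignores relocation of internal addresses, which a real linking loader performs at the same time.

```python
def link(modules):
    """Lay modules out one after another and fill external-reference holes.

    Each module is a dict (a hypothetical object-file layout) with:
      'code'    - list of words, with 0 in every hole
      'exports' - {symbol: offset within this module}
      'holes'   - {index of the word to fill: external symbol name}
    """
    # First sweep: assign start addresses and collect exported symbols.
    start, global_symbols = 0, {}
    for module in modules:
        for name, offset in module["exports"].items():
            global_symbols[name] = start + offset
        start += len(module["code"])
    # Second sweep: copy the code, filling each hole with the symbol's address.
    image = []
    for module in modules:
        code = list(module["code"])
        for index, name in module["holes"].items():
            code[index] = global_symbols[name]
        image.extend(code)
    return image


main_module = {"code": [10, 0, 11], "exports": {}, "holes": {1: "SQRT"}}
sqrt_module = {"code": [20, 21], "exports": {"SQRT": 0}, "holes": {}}
print(link([main_module, sqrt_module]))
```

The hole in the main program is filled with 3, the address at which the second module ends up in the combined image.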
The task of preparing a source program for execution includes translation (as-
sembling or compiling), loading, relocating, and linking. It is divided between the
assembler (or compiler) and the loader, and dual assembler-loader systems are very
common. The main exception to this arrangement is interpretation. Interpretive
languages such as BASIC or APL use the services of one program, the interpreter,
for their execution, and do not require an assembler or a loader. It should be clear
from the above discussion that the main reason for keeping the assembler and loader
separate is the need to develop programs (especially large ones) in separate parts.
The detailed reasons for this will not be discussed here. We will, however, point out
the advantages of having a dual assembler-loader system. They are listed below, in
order of importance.
It makes it possible to write programs in separate parts that may also be in
different languages.
It keeps the assembler small. This is an important advantage. The size of the
assembler depends on the size of its internal tables (especially the symbol table and
the macro definition table). An assembler designed to assemble large programs is
large because of its large tables. Separate assembly makes it possible to assemble
very large programs with a small assembler.
When a change is made in the source code, only the modified program needs to
be reassembled. This property is a benefit if one assumes that assembly is slow and
loading is fast. Many times, however, loading is slower than assembling, and this
property is just a feature, not an advantage, of a dual assembler-loader system.
The loader automatically loads routines from a library. This is considered by
some an advantage of a dual assembler-loader system but, actually, it is not. It
could easily be done in a single assembler-loader program. In such a program, the
library would have to contain the source code of the routines, but this is typically
not larger than the object code.
The words of the wise are as goads, and as nails fastened by the masters of
assemblies
— Ecclesiastes 12:11
A Short History of
Assemblers and Loaders
One of the first stored program computers was the EDSAC (Electronic De-
lay Storage Automatic Calculator) developed at Cambridge University in 1949 by
Maurice Wilkes and W. Renwick [4, 8 & 97]. From its very first days the EDSAC
had an assembler, called Initial Orders. It was implemented in a read-only memory
formed from a set of rotary telephone selectors, and it accepted symbolic instruc-
tions. Each instruction consisted of a one letter mnemonic, a decimal address, and
a third field that was a letter. The third field caused one of 12 constants preset by
the programmer to be added to the address at assembly time.
It is interesting to note that Wilkes was also the first to propose the use of
labels (which he called floating addresses), the first to use an early form of macros
(which he called synthetic orders), and the first to develop a subroutine library [4].
Reference [65] is a very early description of the use of labels in an assembler.
The IBM 650 computer was first delivered around 1953 and had an assembler very similar
to present day assemblers. SOAP (Symbolic Optimizer and Assembly Program) did
symbolic assembly in the conventional way, and was perhaps the first assembler to
do so. However, its main feature was the optimized calculation of the address of
the next instruction. The IBM 650 (a decimal computer, incidentally), was based
on a magnetic drum memory and the program was stored in that memory. Each
instruction had to be fetched from the drum and had to contain the address of
its successor. For maximum speed, an instruction had to be placed on the drum
in a location that would be under the read head as soon as its predecessor was
completed. SOAP calculated those addresses, based on the execution times of the
individual instructions. Chapter 7 has more details, and a programming project,
on this process.
One of the first commercially successful computers was the IBM 704. It had
features such as floating-point hardware and index registers. It was first delivered
in 1956 and its first assembler, the UASAP-1, was written in the same year by Roy
Nutt of United Aircraft Corp. (hence the name UASAP—United Aircraft Symbolic
Assembly Program). It was a simple binary assembler, did practically nothing
but one-to-one translation, and left the programmer in complete control over the
program. SHARE, the IBM users’ organization, adopted a later version of that
assembler [9] and distributed it to its members together with routines produced
and contributed by members. UASAP has pointed the way to early assembler
writers, and many of its design principles are used by assemblers to this day. The
UASAP was later modified to support macros [62].
In the same year another assembler, the IBM Autocoder, was developed by R.
Goldfinger [10] for use on the IBM 702/705 computers. This assembler (actually
several different Autocoder assemblers) was apparently the first to use macros. The
Autocoder assemblers were used extensively and were eventually developed into
large systems with large macro libraries used by many installations.
Another pioneering early assembler was the UNISAP [47], for the UNIVAC I
& II computers, developed in 1958 by M. E. Conway. It was a one-and-a-half pass
assembler, and was the first one to use local labels. Both concepts are covered in
chapter 1.
By the late fifties, IBM had released the 7000 series of computers. These came
with a macro assembler, SCAT, that had all the features of modern assemblers.
It had many directives (pseudo instructions in the IBM terminology), an extensive
macro facility, and it generated relocatable object files.
The SCAT assembler (Symbolic Coder And Translator) was originally written
for the IBM 709 [56] and was modified to work on the IBM 7090. The GAS (Gen-
eralized Assembly System) assembler was another powerful 7090 assembler [58].
The idea of macros originated with several people. McIlroy [22] was probably
the first to propose the modern form of macros and the idea of conditional assembly.
He implemented these ideas in the GAS assembler mentioned above. Reference [60]
is a short early paper presenting some details of macro definition table handling.
One of the first full-feature loaders, the linking loader for the IBM 704–709–
7090 computers [59], is an example of an early loader supporting both relocation
and linking.
The earliest work discussing meta-assemblers seems to be Ferguson [24]. The
idea of high-level assemblers originated with Wirth [61] and had been extended,
a few years later, by an anonymous software designer at NCR, who proposed the
main ideas of the NEAT/3 language [85,86].
The diagram summarizes the main phases in the historical development of
assemblers and loaders.
[Diagram: Machine Language; Assembler Language; Absolute Assembler; Directives; Relocation Bits; External Routines; Relocatable Assembler and Loader; Conditional Assembly; Full-Feature, Relocatable Macro Assembler, with Conditional Assembly]
A One-Pass Assembler: One that performs all its functions by reading the source
file once.
A Two-Pass Assembler: One that reads the source file twice.
A Resident Assembler: One that is permanently loaded in memory. Typically
such an assembler resides in ROM, is very simple (supports only a few directives
and no macros), and is a one-pass assembler. The above assemblers are described
in chapter 1.
A Macro-Assembler: One that supports macros (chapter 4).
A Cross-Assembler: An assembler that runs on one computer and assembles pro-
grams for another. Many cross-assemblers are written in a higher-level language to
make them portable. They run on a large machine and produce object code for a
small machine.
A Meta-Assembler: One that can handle many different instruction sets.
A Disassembler: This, in a sense, is the opposite of an assembler. It translates
machine code into a source program in assembler language.
The basic principles of assembler operation are simple, involving just one prob-
lem, that of unresolved references. This is a simple problem that has two simple
solutions. The problem is important, however, since its two solutions introduce,
in a natural way, the two main types of assemblers namely, the one-pass and the
two-pass.
1.1 Assembler Operation
As mentioned in the introduction, the main input of the assembler is the source
file. Each record on the source file is a source line that specifies either an assembler
instruction or a directive.
1.1.1 The source line
A typical source line has four fields: a label (or a location), a mnemonic (or
operation), an operand, and a comment.
Example: LOOP ADD R1,ABC PRODUCING THE SUM
The comment is for the programmer’s use only. It is read by the assembler, it
is listed in the listing file, and is otherwise ignored.
The label field is only necessary if the instruction is referred to from some
other point in the program. It may be referred to by another instruction in the
same program, by another instruction in a different program (the two programs
should eventually be linked), or by itself.
The word mnemonic comes from the Greek μνημονικός, meaning pertaining to
memory; it is a memory aid. The mnemonic is always necessary. It is the operation.
It tells the assembler what instruction needs to be assembled or what directive to
execute (but see the comment below about blank lines).
The operand depends on the mnemonic. Instructions can have zero, one, or
two operands (very few computers also have three-operand instructions). Directives
also have operands. The operands supply information to the assembler about the
source line.
As a result, only the mnemonic is mandatory, but there are even exceptions
to this rule. One exception is blank lines. Many assemblers allow blank lines—in
which all fields, including the mnemonic, are missing—in the source file. They make
the listing more readable but are otherwise ignored.
Another exception is comment lines. A line that starts with a special sym-
bol (typically a semicolon, sometimes an asterisk and, in a few cases, a slash) is
considered a comment line and, of course, has no mnemonic. Many modern assem-
blers (see, e.g., references [37], [99]–[102]) support a COMMENT directive that has the
following form:
COMMENT delimiter text delimiter
Where the text between the delimiters is a comment. This way the programmer
can enter a long comment, spread over many lines, without having to start each
line with the special comment symbol. Example:
COMMENT =This is a long
comment that . . .
.
.
. . . sufficient to describe what you want=
Columns Field
1-6 Label
7 Blank
8- Mnemonic
The semicolon guarantees that the word ENABLE will not be considered an
operand by the assembler. This is why many assemblers require that comments
start with a semicolon.
Exercise 1.1 Why a semicolon and not some other character such as ‘$’ or ‘@’ ?
Many modern assemblers allow labels without an identifying ‘:’. They simply
have to work harder in order to identify labels.
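The four fields of a source line, as described in this section, can be separated with a short routine. The Python sketch below assumes one common convention: a label, if present, starts in column 1, fields are separated by blanks, and a comment starts with a semicolon. The function name `parse_line` is invented for the example.

```python
def parse_line(line):
    """Split a source line into (label, mnemonic, operand, comment).

    Assumptions for this sketch: a label, if present, starts in column 1;
    fields are separated by blanks; a comment starts with ';'.
    Missing fields are returned as empty strings.
    """
    text, _, comment = line.partition(";")
    has_label = bool(text) and not text[0].isspace()
    # Split off at most the label and the mnemonic; the rest is the operand.
    fields = text.split(None, 2 if has_label else 1)
    label = fields.pop(0) if has_label and fields else ""
    mnemonic = fields.pop(0) if fields else ""
    operand = fields.pop(0).strip() if fields else ""
    return label, mnemonic, operand, comment.strip()


print(parse_line("LOOP ADD R1,ABC ;PRODUCING THE SUM"))
print(parse_line("     BAL 5,XYZ"))
```

An assembler that accepts labels anywhere on the line, or comments without a leading symbol, cannot rely on column 1 and the semicolon and has to inspect the fields themselves.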
The instruction sets of some computers are designed such that the mnemonic
specifies more than just the operation. It may also contain part of the operand.
The Signetics 2650 microprocessor, for example, has many mnemonics that include
one of the operands [13]. A ‘Store Relative’ instruction on the 2650 may be written
STRR,R0 SAV; the mnemonic field includes R0 (the register to be stored in location
SAV), which is an operand.
On other computers, the operation may partly be specified in the operand
field. The instruction IX7 X2+X5, on the CDC Cyber computers [14] means: “add
register X2 and register X5 as integers, and store the sum in register X7.” The
operation appears partly in the operation field (‘I’) and partly in the operand field
(‘+’), whereas X7 (an operand) appears in the mnemonic. This makes it harder
for the assembler to identify the operation and the operands and, as a result, such
instruction formats are not common.
Exercise 1.2 What is the meaning of the Cyber instruction FX7 X2+X5?
To translate an instruction, the assembler uses the OpCode table, which is a
static data structure. The two important columns in the table are the mnemonic
and OpCode. Table 1–1 is an example of a simple OpCode table. It is part of the
IBM 360 OpCode table and it includes other information.
The mnemonics are from one to four letters long (in many assemblers they
may include digits). The OpCodes are two hexadecimal digits (8 bits) long, and the
types (which are irrelevant for now) provide more information to the assembler.
The OpCode table should allow for a quick search. For each source line input,
the assembler has to search the OpCode table. If it finds the mnemonic, it uses the
OpCode to start assembling the instruction. It also uses the other information in
the OpCode table to complete the assembly. If it does not find the mnemonic in the
table, it assumes that the mnemonic is that of a directive and proceeds accordingly
(see chapter 3).
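The search described above can be sketched with a hash table, which is one way (an assumption of this example, not the only choice) to get a quick lookup. The Python dictionary below holds a tiny excerpt with three IBM 360 instructions; the name `classify` and the shape of its result are invented for the illustration.

```python
# A tiny excerpt of an OpCode table: mnemonic -> (OpCode, type).
OPCODES = {
    "AR":  (0x1A, "RR"),   # add register to register
    "A":   (0x5A, "RX"),   # add storage operand to register
    "BAL": (0x45, "RX"),   # branch and link
}


def classify(mnemonic):
    """Return ('instruction', opcode, type) or ('directive', None, None)."""
    entry = OPCODES.get(mnemonic.upper())
    if entry is None:
        # Not in the OpCode table: assume the mnemonic is that of a directive.
        return ("directive", None, None)
    opcode, kind = entry
    return ("instruction", opcode, kind)


print(classify("BAL"))
print(classify("ORG"))
```

A full assembler would next search its directive table and report an error if the mnemonic is found in neither table.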
The OpCode table thus provides for an easy first step of assembling an instruc-
tion. The next step is using the operand to complete the assembly. The OpCode
table should contain information about the number and types of operands for each
instruction. In table 1–1 above, the type column provides this information. Type
RR means a Register-Register instruction. This is an instruction with two operands,
both registers. The assembler expects two operands, both numbers between 0 and
15 (the IBM 360 has 16 general-purpose registers). Each register number is assem-
bled as a 4 bit field.
Exercise 1.3 Why does the IBM 360 have 16 general purpose registers and not a
round number such as 15 or 20?
Example: The instruction ‘AR 4,6’ means: add register 6 (the source) to reg-
ister 4 (the destination operand). It is assembled as the 16-bit machine instruction
1A46, in which 1A is the OpCode and 46, the two operands.
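The packing of an RR instruction can be written out as a short Python sketch. The function name `assemble_rr` is hypothetical; the field sizes (an 8-bit OpCode and two 4-bit register numbers) are those given in the text.

```python
def assemble_rr(opcode, r1, r2):
    """Pack an 8-bit OpCode and two 4-bit register numbers into 16 bits."""
    for r in (r1, r2):
        if not 0 <= r <= 15:
            raise ValueError(f"register number out of range: {r}")
    return (opcode << 8) | (r1 << 4) | r2


print(f"{assemble_rr(0x1A, 4, 6):04X}")   # AR 4,6 -> prints 1A46
```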
Type RX stands for Register-indeX. In these instructions the operand consists
of a register followed by an address.
Example: ‘BAL 5,14’. This instruction calls a procedure at location 14, and
saves the return address in register 5 (BAL stands for Branch And Link). It is
assembled as the 32-bit machine instruction 4550000E in which 00E is a 12-bit
address field (E is hexadecimal 14), 45 is the OpCode, 5 is register 5, and the two
zeros in the middle are irrelevant to our discussion. (A note to readers familiar
with the IBM 360—This example ignores base registers as they do not contribute
anything to our discussion of assemblers.)
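The same packing for this simplified RX format can be sketched in Python. As in the example above, the two middle 4-bit fields are simply left at zero; the function name `assemble_rx` is hypothetical.

```python
def assemble_rx(opcode, r1, address):
    """Pack OpCode (8 bits), register (4 bits) and a 12-bit address into 32 bits.

    The two 4-bit fields between the register and the address are left zero,
    as in the simplified example in the text.
    """
    if not 0 <= r1 <= 15:
        raise ValueError(f"register number out of range: {r1}")
    if not 0 <= address < (1 << 12):
        raise ValueError(f"address does not fit in 12 bits: {address}")
    return (opcode << 24) | (r1 << 20) | address


print(f"{assemble_rx(0x45, 5, 14):08X}")   # BAL 5,14 -> prints 4550000E
```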
Exercise 1.4 What are the two zeros in the middle of the instruction used for?
This example is not a typical one. Numeric addresses are rarely used in assem-
bler programming, since keeping track of their values is a tedious task better left to
the assembler. In practice, symbols are used instead of numeric addresses. Thus the
above example is likely to be written as ‘BAL 5,XYZ’, where XYZ is a symbol whose
value is an address. Symbol XYZ should be the label of some source line. Typically
the program will contain the two lines
XYZ A 5,ABC ;THE SUBROUTINE STARTS HERE
BAL 5,XYZ
Besides the basic task of assembling instructions, the assembler offers many
services to the user, the most important of which is handling symbols. This task
consists of two different parts, defining symbols, and using them. A symbol is
defined by writing it as a label. The symbol is used by writing it in the operand
field of a source line. A symbol can only be defined once but it can be used any
number of times. To understand how a value is assigned to a symbol, consider the
example above. The ‘add’ instruction A is assembled and is eventually loaded into
memory as part of the program. The value of symbol XYZ is the memory address of
that instruction. This means that the assembler has to keep track of the addresses
where instructions are loaded, since some of them will become values of symbols.
To do this, the assembler uses two tools, the location counter (LC), and the symbol
table.
The LC is a variable, maintained by the assembler, that contains the address
into which the current instruction will eventually be loaded. When the assembler
starts, it clears the LC, assuming that the first instruction will go into location 0.
After each instruction is assembled, the assembler increments the LC by the size
of the instruction (the size in words, not in bits). Thus the LC always contains
the current address. Note that the assembler does not load the instructions into
memory. It writes them on the object file, to be eventually loaded into memory by
the loader. The LC, therefore, does not point to the current instruction. It just
shows where the instruction will eventually be loaded. When the source line has
a label (a newly defined symbol), the label is assigned the current value of the LC
as its value. Both the label and its value (plus some other information) are then
placed in the symbol table.
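The way the LC drives label definition can be sketched as a single loop. In the Python example below the source lines are already split into label and mnemonic, and the instruction sizes come from a small table; the mnemonics `LOD`, `ADD`, and `JMP`, their sizes, and the name `build_symbol_table` are all hypothetical.

```python
def build_symbol_table(lines, sizes, start=0):
    """Walk the source once, advancing the LC and recording label values.

    `lines` is a list of (label, mnemonic) pairs, with label '' when absent;
    `sizes` maps each mnemonic to its instruction size in words.
    """
    lc = start                       # the assembler starts with LC = 0
    symbols = {}
    for label, mnemonic in lines:
        if label:
            if label in symbols:
                raise ValueError(f"symbol defined twice: {label}")
            symbols[label] = lc      # the label gets the current LC value
        lc += sizes[mnemonic]        # advance past this instruction
    return symbols, lc


sizes = {"LOD": 1, "ADD": 1, "JMP": 2}
source = [("", "LOD"), ("LOOP", "ADD"), ("", "JMP"), ("DONE", "LOD")]
print(build_symbol_table(source, sizes))
```

Here LOOP receives the value 1 and DONE the value 4, the addresses at which their instructions will eventually be loaded if the program starts at location 0.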
The symbol table is an internal, dynamic table that is generated, maintained,
and used by the assembler. Each entry in the table contains the definition of a
symbol and has fields for the name, value, and type of the symbol. Some symbol
18 Basic Principles Ch. 1
tables contain other information about the symbols. The symbol table starts empty,
labels are entered into it as their definitions are found in the source, and the table
is also searched frequently to find the values and types of symbols whose names are
known. Chapter 2 discusses various ways to implement symbol tables.
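The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the names Symbol and define_labels are ours, every label is given the type REL, and instruction sizes come from a small, hypothetical table in which every instruction is one word long.

```python
from dataclasses import dataclass

@dataclass
class Symbol:
    name: str
    value: int   # address copied from the LC when the label is defined
    type: str    # always "REL" in this sketch

def define_labels(lines, sizes):
    """lines: list of (label or None, mnemonic); sizes: mnemonic -> size in words."""
    lc = 0                          # the assembler clears the LC when it starts
    table = {}
    for label, mnemonic in lines:
        if label is not None:       # a newly defined symbol gets the current LC
            table[label] = Symbol(label, lc, "REL")
        lc += sizes[mnemonic]       # advance by the instruction size, in words
    return table, lc

# Hypothetical two-line program: a BAL followed by the line labeled XYZ.
program = [(None, "BAL"), ("XYZ", "A")]
table, final_lc = define_labels(program, {"BAL": 1, "A": 1})
print(table["XYZ"])   # Symbol(name='XYZ', value=1, type='REL')
```

Note that nothing is loaded into memory here; the LC only records where each instruction would eventually go.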
In the above example, when the assembler encounters the line
XYZ A 5,ABC ;THE SUBROUTINE STARTS HERE
it performs two independent operations. It stores symbol XYZ and its value (the
current value of the LC) in the symbol table, and it assembles the instruction.
These two operations have nothing to do with each other. Handling the symbol
definition and assembling the instruction are done by two different parts of the
assembler. Many times they are performed in different phases of the assembly.
If the LC happens to have the value 260, then the entry
    name  value  type
    XYZ   104    REL
will be added to the symbol table (104 is the hex value of decimal 260, and the type
REL will be explained later).
When the assembler encounters the line
BAL 5,XYZ
it assembles the instruction but, in order to assemble the operand, the assembler
needs to search the symbol table, find symbol XYZ, fetch its value and make it part
of the assembled instruction. The instruction is, therefore, assembled as 45500104.
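The value 45500104 can be checked by hand. The sketch below assumes the IBM 360 RX instruction layout (8-bit OpCode, 4-bit register, 4-bit index, 4-bit base, 12-bit displacement) and the hex OpCode 45 for BAL; the function name encode_rx is ours, and the range check is just a guard in this sketch, not an answer to the exercise that follows.

```python
def encode_rx(opcode, r1, x2, b2, d2):
    """Pack the five RX fields into one 32-bit word."""
    if not 0 <= d2 < 2**12:
        raise ValueError("displacement does not fit in 12 bits")
    return (opcode << 24) | (r1 << 20) | (x2 << 16) | (b2 << 12) | d2

# BAL 5,XYZ with XYZ = hex 104, no index register and no base register.
word = encode_rx(0x45, 5, 0, 0, 0x104)
print(f"{word:08X}")   # 45500104
```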
Exercise 1.5 The address in our example, 104, is a relatively small number. Many
times, instructions have a 12-bit field for the address, allowing addresses up to
2^12 − 1 = 4095. What if the value of a certain symbol exceeds that number?
This is, in a very general way, what the assembler has to do in order to assemble
instructions and handle symbols. It is a simple process and it involves only one
problem which is illustrated by the following example.
In this case the value of symbol XYZ is needed before label XYZ is defined. When the
assembler gets to the first line (the BAL instruction), it searches the symbol table
for XYZ and, of course, does not find it. This situation is called the future symbol
problem or the problem of unresolved references. The XYZ in our example is a future
symbol or an unresolved reference.
Sec. 1.1 Assembler Operation 19
Obviously, future symbols are not an error and their use should not be prohib-
ited. The programmer should be able to refer to source lines which either precede
or follow the current line. Thus the future symbol problem has to be solved. It
turns out to be a simple problem and there are two solutions, a one-pass assembler
and a two-pass assembler. They represent not just different solutions to the future
symbol problem but two different approaches to assembler design and operation.
The one-pass assembler, as the name implies, solves the future symbol problem
by reading the source file once. Its most important feature, however, is that it
does not generate a relocatable object file but rather loads the object code (the
machine language program) directly into memory. Similarly, the most important
feature of the two-pass assembler is that it generates a relocatable object file, that
is later loaded into memory by a loader. It also solves the future symbol problem
by performing two passes over the source file. It should be noted at this point that
a one-pass assembler can generate an object file. Such a file, however, would be
absolute, rather than relocatable, and its use is limited. Absolute and relocatable
object files are discussed later in this chapter. Figure 1–1 is a summary of the most
important components and operations of an assembler.
[Figure 1–1: the main components and operations of an assembler. Labeled parts: source file, source line buffer, lexical scan routine, main program, pass indicator, location counter, table search procedures, object code assembly area, object file, error proc.]
tion, and is not found in the symbol table?
To assign values to labels in pass 1, the assembler has to maintain the LC. This
in turn means that the assembler has to determine the size of each instruction (in
words), even though the instructions themselves are not assembled.
In many cases it is easy to figure out the size of an instruction. On the IBM 360,
the mnemonic determines the size uniquely. An assembler for this machine keeps
the size of each instruction in the OpCode table together with the mnemonic and
the OpCode (see table 1–1). On the DEC PDP-11 the size is determined both
by the type of the instruction and by the addressing mode(s) that it uses. Most
instructions are one word (16 bits) long. However, if they use either the index or
index deferred modes, one more word is added to the instruction. If the instruction
has two operands (source and destination) both using those modes, its size will be
3 words. On most modern microprocessors, instructions are between 1 and 4 bytes
long and the size is determined by the OpCode and the specific operands used.
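As a small illustration, the PDP-11 rule quoted above can be written as a function. It covers only the rule stated in the text (one extra word for each operand in the index or index deferred mode); any other cases on the real machine are ignored, and the mode names are plain strings chosen for this sketch.

```python
EXTRA_WORD_MODES = {"index", "index deferred"}

def pdp11_size(operand_modes):
    """Size of an instruction in 16-bit words, given the mode of each operand."""
    return 1 + sum(1 for mode in operand_modes if mode in EXTRA_WORD_MODES)

print(pdp11_size(["register", "register"]))       # 1
print(pdp11_size(["index", "register"]))          # 2
print(pdp11_size(["index", "index deferred"]))    # 3
```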
This means that, in many cases, the assembler has to work hard in the first
pass just to determine the size of an instruction. It has to look at the mnemonic
and, sometimes, at the operands and the modes, even though it does not assemble
the instruction in the first pass. All the information about the mnemonic and
the operand collected by the assembler in the first pass is extremely useful in the
second pass, when instructions are assembled. This is why many assemblers save
all the information collected during the first pass and transmit it to the second pass
through an intermediate file. Each record on the intermediate file contains a copy
of a source line plus all the information that has been collected about that line in
the first pass. At the end of the first pass the original source file is closed and is no
longer used. The intermediate file is reopened and is read by the second pass as its
input file.
A record in a typical intermediate file contains:
The record type. It can be an instruction, a directive, a comment, or an invalid
line.
The LC value for the line.
A pointer to a specific entry in the OpCode table or the directive table. The
second pass uses this pointer to locate the information necessary to assemble or
execute the line.
Sec. 1.2 The Two-Pass Assembler 21
A copy of the source line. Notice that a label, if any, is not used by pass 2 but
must be included in the intermediate file since it is needed in the final listing.
Fig. 1–2 is a flow chart summarizing the operations in the two passes.
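The two passes and the intermediate file can be sketched as follows. This is a minimal illustration, not a complete assembler: the OpCode table, the mnemonics, and the Record layout are hypothetical, the mnemonic itself stands in for the pointer into the OpCode table, and multiply-defined labels and directives are not handled.

```python
from dataclasses import dataclass

# Hypothetical OpCode table: mnemonic -> (OpCode, size in words).
OPTABLE = {"LOD": (1, 1), "ADD": (2, 1), "JMP": (3, 1)}

@dataclass
class Record:        # one record of the intermediate file
    kind: str        # "instruction" or "invalid" in this sketch
    lc: int          # LC value for the line
    mnemonic: str    # stands in for the pointer into the OpCode table
    source: tuple    # copy of the source line: (label, mnemonic, operand)

def pass1(source):
    lc, symtab, intermediate = 0, {}, []
    for label, mnemonic, operand in source:
        if label:
            symtab[label] = lc                      # define the label
        kind = "instruction" if mnemonic in OPTABLE else "invalid"
        intermediate.append(Record(kind, lc, mnemonic, (label, mnemonic, operand)))
        if kind == "instruction":
            lc += OPTABLE[mnemonic][1]              # LC := LC + size
    return symtab, intermediate

def pass2(symtab, intermediate):
    obj = []
    for rec in intermediate:
        if rec.kind != "instruction":
            continue
        operand = rec.source[2]
        address = symtab[operand] if operand in symtab else int(operand)
        obj.append((rec.lc, OPTABLE[rec.mnemonic][0], address))
    return obj

source = [
    (None, "JMP", "XYZ"),     # XYZ is a future symbol at this point
    (None, "LOD", "7"),
    ("XYZ", "ADD", "XYZ"),
]
symtab, inter = pass1(source)
print(pass2(symtab, inter))   # [(0, 3, 2), (1, 1, 7), (2, 2, 2)]
```

Because every label is already in the symbol table when pass 2 starts, the future symbol XYZ causes no difficulty.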
There can be two problems with labels in the first pass; multiply-defined labels
and invalid labels. Before a label is inserted into the symbol table, the table has to
be searched for that label. If the label is already in the table, it is doubly (or even
multiply-) defined. The assembler should treat this label as an error and the best
way of doing this is by inserting a special code in the type field in the symbol table.
Thus a situation such as:
AB ADD 5,X
.
.
AB SUB 6,Y
.
.
JMP AB
Exercise 1.7 What is the advantage of allowing characters other than letters and
digits in a label?
The only problem with symbols in the second pass is bad symbols. These are
either multiply-defined or undefined symbols. When a source line uses a symbol in
the operand field, the assembler looks it up in the symbol table. If the symbol is
found but has a type of MTDF, or if the symbol is not found in the symbol table (i.e.,
it has not been defined), the assembler responds as follows.
It flags the instruction in the listing file.
It assembles the instruction as far as possible, and writes it on the object file.
It flags the entire object file. The flag instructs the loader not to start execution
of the program. The object file is still generated and the loader will read and load
it, but not start it. Loading such a file may be useful if the user wants to see a
memory map (see discussion of memory maps in chapter 7).
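The treatment of bad symbols can be sketched as below. The sketch makes several assumptions that are not in the text: symbol table entries are (value, type) pairs, a bad symbol is assembled with an address of 0, the listing flag is an asterisk, and the object-file flag is a single Boolean.

```python
MTDF = "MTDF"

def enter_label(symtab, name, lc):
    """Pass 1: a second definition marks the entry as multiply defined."""
    if name in symtab:
        symtab[name] = (symtab[name][0], MTDF)
    else:
        symtab[name] = (lc, "REL")

def resolve(symtab, name):
    """Pass 2: return (value, ok); a bad symbol yields 0 and ok=False."""
    entry = symtab.get(name)
    if entry is None or entry[1] == MTDF:
        return 0, False
    return entry[0], True

symtab = {}
enter_label(symtab, "AB", 10)
enter_label(symtab, "AB", 20)      # doubly defined
enter_label(symtab, "X", 30)

no_execute = False                 # flag for the entire object file
listing = []
for operand in ["X", "AB", "Y"]:   # Y is never defined
    value, ok = resolve(symtab, operand)
    listing.append((operand, value, "" if ok else "*"))
    no_execute = no_execute or not ok

print(listing)      # [('X', 30, ''), ('AB', 0, '*'), ('Y', 0, '*')]
print(no_execute)   # True
```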
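[Figure 1–2: flow chart of the two-pass assembler. Pass 1: read a line from the source file; at eof, go to pass 2; if a label is defined, store its name and value in the symbol table; determine the size of the instruction; LC:=LC+size. Pass 2: at eof, stop; otherwise assemble the instruction.]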
assemblers. This point is the reason why a one-pass assembler can only produce
an absolute object file (which has only limited use), whereas a two-pass assembler
can produce a relocatable object file, which is much more general. This important
topic is explained later in this chapter.
LC
36 BEQ AB ;BRANCH ON EQUAL
.
.
67 BNE AB ;BRANCH ON NOT EQUAL
.
.
89 JMP AB ;UNCONDITIONALLY
.
.
126 AB anything
Symbol AB is used three times as a future symbol. On the first reference, when
the LC happens to stand at 36, the assembler searches the symbol table for AB, does
not find it, and therefore assumes that it is a future symbol. It then inserts AB into
the symbol table but, since AB has no value yet, it gets a special type. Its type is
U (undefined). Even though it is still undefined, it now occupies an entry in the
symbol table, an entry that will be used to keep track of AB as long as it is a future
symbol. The next step is to set the ‘value’ field of that entry to 36 (the current
value of the LC). This means that the symbol table entry for AB is now pointing
to the instruction in which AB is needed. The ‘value’ field is an ideal place for the
pointer since it is the right size, it is currently empty, and it is associated with
AB. The BEQ instruction itself is only partly assembled and is stored, incomplete,
in memory location 36. The field in the instruction where the value of AB should be
stored (the address field) remains empty.
When the assembler gets to the BNE instruction (at which point the LC stands
at 67), it searches the symbol table for AB, and finds it. However, AB has a type
of U, which means that it is a future symbol and thus its ‘value’ field (=36) is not
a value but a pointer. It should be noted that, at this point, a type of U does not
necessarily mean an undefined symbol. While the assembler is performing its single
pass, any undefined symbols must be considered future symbols. Only at the end of
the pass can the assembler identify undefined symbols (see below). The assembler
handles the BNE instruction by:
Partly assembling it and storing it in memory location 67.
Copying the pointer 36 from the symbol table to the partly assembled instruction
in location 67. The instruction has an empty field (where the value of AB should
have been), where the pointer is now stored. There may be cases where this field
Sec. 1.3 The One-Pass Assembler 25
in the instruction is too small to store a pointer. In such a case the assembler must
resort to other methods, one of which is discussed below.
Copying the LC (=67) into the ‘value’ field of the symbol table entry for AB,
overwriting the 36.
When the assembler reaches the JMP AB instruction, it repeats the three steps
above. The situation at those three points is summarized below.
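     after BEQ (LC=36)            after BNE (LC=67)            after JMP (LC=89)
memory         symbol table   memory         symbol table   memory         symbol table
loc  contents  n   v   t      loc  contents  n   v   t      loc  contents  n   v   t
36   BEQ -     AB  36  U      36   BEQ -     AB  67  U      36   BEQ -     AB  89  U
                              67   BNE 36                   67   BNE 36
                                                            89   JMP 67
(A dash marks an address field that is still empty.)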
printed with asterisks ‘*’ or question marks ‘?’, instead of the value of AB.
The key to the operation of a one-pass assembler is the fact that it loads the
object code directly in memory and does not generate an object file. This makes it
possible for the assembler to go back and complete instructions in memory at any
time during assembly.
The one-pass assembler can, in principle, generate an object file by simply
writing the object program from memory to a file. Such an object file, however,
would be absolute. Absolute and relocatable object files are discussed below.
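The pointer chain that runs through the incomplete instructions can be simulated as follows. This is a sketch only: memory is modeled as a dictionary, each instruction as a [mnemonic, address field] pair, an empty field as None, and NOP is a hypothetical stand-in for the instruction on the line labeled AB.

```python
def one_pass(source):
    """source: list of (lc, label, mnemonic, operand). Returns (memory, undefined)."""
    memory = {}    # loc -> [mnemonic, address field]
    symtab = {}    # name -> [value, type], where type is "U" or "D"
    for lc, label, mnemonic, operand in source:
        if label:
            entry = symtab.get(label)
            if entry and entry[1] == "D":
                raise ValueError(f"{label} is doubly defined")
            if entry:                          # type U: follow the chain
                ptr = entry[0]
                while ptr is not None:
                    nxt = memory[ptr][1]       # pointer stored in the empty field
                    memory[ptr][1] = lc        # complete the instruction
                    ptr = nxt
            symtab[label] = [lc, "D"]
        address = None
        if operand:
            entry = symtab.get(operand)
            if entry is None:                  # first use of a future symbol
                symtab[operand] = [lc, "U"]
            elif entry[1] == "U":              # later use: extend the chain
                address = entry[0]
                entry[0] = lc
            else:                              # symbol already defined
                address = entry[0]
        memory[lc] = [mnemonic, address]
    undefined = [name for name, e in symtab.items() if e[1] == "U"]
    return memory, undefined

source = [
    (36, None, "BEQ", "AB"),
    (67, None, "BNE", "AB"),
    (89, None, "JMP", "AB"),
    (126, "AB", "NOP", None),
]
memory, undefined = one_pass(source)
print(memory[36], memory[67], memory[89], undefined)
# ['BEQ', 126] ['BNE', 126] ['JMP', 126] []
```

Going back to patch locations 89, 67, and 36 is possible only because the instructions are held in memory rather than written to a file.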
One more point needs to be mentioned here. It is the case where the address
field in the instruction is too small for a pointer. This is a common case, since
machine instructions are designed to be short and normally do not contain a full
address. Instead of a full address, a typical machine instruction contains two fields,
mode and displacement (or offset), such that the mode tells the computer how to
obtain the full address from the displacement (see appendix A). The displacement
field is small (typically 8–12 bits) and has no room for a full address.
To handle this situation, the one-pass assembler has an additional data struc-
ture, a collection of linked lists, each corresponding to a future symbol. Each linked
list contains, in its nodes, pointers to instructions that are waiting to be completed.
The list for symbol AB is shown below in three successive stages of its construction.
When symbol AB is found, the assembler uses the information in the list to
complete all incomplete instructions. It then returns the entire list to the pool of
available memory.
[Figure: flow chart of the one-pass assembler, part 1. Boxes: start, lc=0; read line from source; eof?; label defined?; scan line; symbol used?; search symbol table (found / not found); type=D?; enter name, pointer & type of U; copy pointer from s.t. to instruction being assembled; place LC in s.t. to point to current instruction being assembled; assemble line; load in memory; print LC, source & object codes. The numbers 1–7 are connectors between boxes.]
An easy way to maintain such a collection of lists is to house them in an array.
Fig. 1–5 shows our list, occupying positions 5,9,3 of such an array. Each position
has two locations, the first being the data item stored (a pointer to an incomplete
instruction) and the second, the array index of the next node of the list.
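One possible way to house the lists in an array is sketched below. The class and method names are ours, two parallel Python lists play the role of the two locations in each position, and the positions actually used depend on the free pool, so they will differ from the 5,9,3 of the figure.

```python
class FutureLists:
    """Linked lists of pointers to incomplete instructions, housed in one array."""

    def __init__(self, size):
        self.data = [None] * size        # pointer to an incomplete instruction
        self.next = [None] * size        # array index of the next node, or None
        self.free = list(range(size))    # pool of available positions

    def push(self, head, pointer):
        """Add a node in front of the list that starts at head; return the new head."""
        pos = self.free.pop()
        self.data[pos], self.next[pos] = pointer, head
        return pos

    def release(self, head):
        """Yield every pointer on the list and return its nodes to the pool."""
        while head is not None:
            yield self.data[head]
            self.free.append(head)
            head = self.next[head]

lists = FutureLists(10)
head = None                    # this index would sit in the value field of AB
for lc in (36, 67, 89):
    head = lists.push(head, lc)
pointers = list(lists.release(head))
print(pointers)                # [89, 67, 36]
```

The head index is what the symbol table entry of a type U symbol would hold in its value field.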
[Figure: flow chart of the one-pass assembler, part 2. When a label is defined: search s.t.; if not found, enter name, LC & type=D in s.t.; if found with type=U, follow the pointer in the value field, complete all instr. waiting for the value of the symbol, store LC in the value field of s.t., and change the type to D; if found with any other type, Error! label is doubly defined. At eof: scan s.t.; if any U symbol remains, Error! undefined symbol(s); stop.]
Exercise 1.8 What would be good Pascal declarations for such a future symbol
list:
a. Using absolute pointers.
b. Housed in an array.
The JMP instruction is assembled as ‘JMP 104’ and is written onto the object file.
When the object file is loaded starting at address 0, the JMP instruction is loaded
at location 86 and the ADD instruction, at location 104. When the JMP is executed,
it will cause a branch to 104, i.e., to the ADD instruction.
Sec. 1.4 Absolute and Relocatable Object Files 29
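The list for AB in three successive stages:
  stage 1:  AB (type U) -> 36
  stage 2:  AB (type U) -> 67 -> 36
  stage 3:  AB (type U) -> 89 -> 67 -> 36

The same list housed in an array:
  symbol table        position   3    4    5    6    7    8    9
  n    v   t          data      36        89                  67
  AB   5   U          next       /         9                   3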
On subsequent loads, however, the loader may decide to load the program
starting at a different address. On large computers, it is common to have several
programs loaded in memory at the same time, each occupying a different area.
Assuming that our program is loaded starting at location 500, the JMP instruction
will go into location 586 and the ADD, into location 604. The JMP should branch to
location 604 but, since it has not been modified, it will still branch to location 104
which is not only the wrong location, but is even outside the memory area of the
program.
In its simplest form, flagging each item is done by adding an extra bit, a
relocation bit, to it. The relocation bit is set by the assembler to 0, if the item
is absolute and to 1, if it is relocatable. The loader, when reading the object file
and loading instructions, reads the relocation bits. If an object instruction has a
relocation bit of 0, the loader simply loads it into memory. If it has a relocation bit
of 1, the loader relocates it by adding the start address to it. It is then loaded into
memory in the usual way. In our example, the ‘JMP TO’ instruction will be relocated
by adding 500 to it. It will thus be loaded as ‘JMP 604’ and, when executed, will
branch to location 604, i.e., to the ADD instruction.
The relocation bits themselves are not loaded into memory since memory should
contain only the object code. When the computer executes the program, it expects
to find just instructions and data in memory. Any relocation bits in memory would
be interpreted by the computer as either instructions or data.
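The loader's use of relocation bits can be sketched as follows. The sketch assumes that each item on the object file is a triple (OpCode, address, relocation bit) occupying one memory location; the two items shown are a hypothetical excerpt of a longer object file.

```python
def load(object_file, start):
    """Load the items at consecutive locations beginning at start."""
    memory = {}
    for offset, (opcode, address, bit) in enumerate(object_file):
        if bit == 1:
            address += start                         # relocatable: add start address
        memory[start + offset] = (opcode, address)   # the bit itself is not loaded
    return memory

object_file = [("LOD", 7, 0), ("JMP", 104, 1)]
memory = load(object_file, 500)
print(memory)   # {500: ('LOD', 7), 501: ('JMP', 604)}
```

Loading the same file with a start address of 0 leaves every address unchanged, which is the first case discussed above.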
This explains why a one-pass assembler cannot generate a relocatable object
file. The type of the instruction (absolute or relocatable) can be determined only
by examining the original source instruction. The one-pass assembler loads the
machine instructions directly in memory. Once in memory, the instruction is just
a number. By looking at a machine instruction in memory, it is impossible to tell
whether the original instruction was absolute or relocatable. Writing the machine
instructions from memory to a file will create an object file without any relocation
bits, i.e., an absolute object file. Such an object file is useful on computers where the
program is always loaded at the same place. In general, however, such files have
limited value.
Some readers are tempted, at this point, to find ways to allow a one-pass
assembler to generate relocation bits. Such ways exist, and two of them will be
described here. The point is, however, that the one-pass assembler is a simple,
fast, assemble-load-go program. Any modifications may result in a slow, complex
assembler, thereby losing the main advantages of one-pass assembly. It is preferable
to keep the one-pass assembler simple and, if a relocatable object file is necessary, to
use a two-pass assembler (see also the discussion of a one-and-a-half pass assembler
below).
Another point to realize is that a relocatable object file contains more than
relocation bits. It contains loader directives and linking information (covered in
chapter 7). All this is easy for a two-pass assembler to generate but hard for a
one-pass one.
Two ways are discussed below to modify the one-pass assembler to generate a
relocatable object file.
1. A common approach to modify the basic one-pass assembler is to have it generate
a relocation bit each time an instruction is assembled. The instruction is then
loaded into memory and the relocation bit may be stored in a special, packed array
outside the program area. When the object code is finally written on the object
file, the relocation bits may be read from the special array and attached each to its
instruction.
Such a method may work, but is cumbersome, especially because of future sym-
bols. In the case of a future symbol, the assembler does not know the type (absolute
or relocatable) of the missing symbol. It thus cannot generate the relocation bit,
resulting in a hole in the special array. When the symbol definition is finally found,
the assembler should complete all the instructions that use this symbol, and also
generate the relocation bit and store it in the special array (a process involving bit
operations).
2. Another possible modification to the one-pass assembler will be briefly outlined.
The assembler can write each machine instruction on the object file as soon as it is
generated and loaded in memory. At that point the source instruction is available
and can be examined, so a relocation bit can be prepared and written on the object
file with the instruction. The only problem is, as before, instructions using future
symbols. They must go on the object file incomplete and without relocation bits.
At the end of the single pass, the assembler writes the entire symbol table on the
object file.
The task of completing those instructions is left to the loader. The loader
initially skips the first part of the object file and reads the symbol table. It then
rereads the file, and loads instructions in memory. Each time it comes across an
incomplete instruction, it uses the symbol table to complete it and, if necessary, to
relocate it as well.
The trouble with this method is that it shifts assembler tasks to the loader,
forcing the loader to do what is essentially a two-pass job.
None of these modifications is satisfactory. The lesson to learn from these
attempts is that, traditionally, the one-pass and two-pass assemblers have been
developed as two different types of assemblers. The first is fast and simple; the
second, a general purpose program which can support many features.
Chapter 7 discusses typical formats of relocatable object files and other items
added by the assembler to those files, to be used by the loader.
1.4.3 The task of relocating
The role of the loader is not as simple as may seem from the above discussion.
Relocating an instruction is not always as simple as adding a start address to it.
On the IBM 7090/7094 computers [65,66], for example, many instructions have the
format:
Field           OpCode  Decrement  Tag  Address
Size (in bits)  3       15         3    15
The exact meaning of the fields is irrelevant except that the Address and Decrement
fields may both contain addresses. The assembler must determine the types of both
fields (either can be absolute or relocatable), and prepare two relocation bits. The
loader has to read the two bits and should be able to relocate either field. Relocating
the Decrement field means adding the start address just to that field and not to the
entire instruction.
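One possible way for a loader to relocate a single field is shown below, using shifts and masks. The sketch assumes that the four fields are packed left to right in a 36-bit word, so that the Address occupies the 15 low-order bits and the Decrement starts 18 bits from the right; wrapping the sum to 15 bits is also an assumption.

```python
MASK15 = (1 << 15) - 1
ADDRESS_SHIFT = 0
DECREMENT_SHIFT = 18      # skips the 15 Address bits and the 3 Tag bits

def relocate_field(word, shift, start):
    """Add start to the 15-bit field located shift bits from the right."""
    field = (word >> shift) & MASK15
    field = (field + start) & MASK15              # keep the sum inside the field
    return (word & ~(MASK15 << shift)) | (field << shift)

def relocate(word, start, dec_bit, addr_bit):
    """Apply the two relocation bits prepared by the assembler."""
    if dec_bit:
        word = relocate_field(word, DECREMENT_SHIFT, start)
    if addr_bit:
        word = relocate_field(word, ADDRESS_SHIFT, start)
    return word

# OpCode 3, Decrement 100, Tag 2, Address 200; only the Decrement is relocatable.
word = (0b011 << 33) | (100 << 18) | (0b010 << 15) | 200
new = relocate(word, 500, dec_bit=1, addr_bit=0)
print((new >> 18) & MASK15, new & MASK15)         # 600 200
```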
Exercise 1.9 How can the loader add something to a field in the middle of an
instruction?
The discussion of separate assembly in chapter 3 (the EXTRN and ENTRY di-
rectives) shows that each field can in fact have three different types, Absolute,
Relocatable, and special relocation. Thus the assembler generally has to generate
two relocation bits for each field which, in the case of the IBM 7090/7094 (or simi-
lar computers), implies a total of four relocation bits. Chapter 7 shows how those
pairs of relocation bits are used as identification bits, identifying each line in the
relocatable object file as one of four types: an absolute instruction, a relocatable
instruction, an instruction requiring special relocation, or a loader directive.
On the IBM PC, an absolute object file uses the extension .COM, and a relo-
catable object file, the extension .EXE.
The mathematician John von Neumann, the principal contributor to early com-
puter design, was using relocatable code as early as 1945 [11].
Exercise 1.10 What are other ways to explicitly request a forcing upper?
An interesting problem is, how does the assembler handle relocation bits when
several instructions are packed in one word?
In a computer such as the Cyber, only 30- and 60-bit instructions may contain
addresses. There are only six ways of combining instructions in a 60-bit word, as
the following diagram shows.
60
30 30
30 15 15
15 30 15
15 15 30
15 15 15 15
The assembler has to generate one of the values 0–5 as a 3-bit relocation field
attached to each word as it is written on the object file. The loader reads this field
and uses it to perform the actual relocation.
0   60
1   30 30
2   30 15 15
3   15 30 15
4   15 15 30
5   15 15 15 15
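The six layouts can be kept in a small table indexed by the relocation field. The sketch below only locates the parcels that are long enough to contain an address (30 or 60 bits); how the loader then modifies them is not shown, and the function name is ours.

```python
LAYOUTS = {
    0: (60,),
    1: (30, 30),
    2: (30, 15, 15),
    3: (15, 30, 15),
    4: (15, 15, 30),
    5: (15, 15, 15, 15),
}

def address_parcels(code):
    """Return (bit offset from the left, width) of parcels that may hold an address."""
    result, offset = [], 0
    for width in LAYOUTS[code]:
        if width >= 30:
            result.append((offset, width))
        offset += width
    return result

print(address_parcels(3))   # [(15, 30)]
print(address_parcels(5))   # []
```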
The value of A is 16 and its type is relative (meaning A is a regular label, defined
by writing it to the left of an instruction). Thus A represents address 16 from the
start of the program. Similarly B is address 27 from the start of the program. It is
thus reasonable to define the expression B-A as having a value of 27 − 16 = 11 and
a type of absolute. It represents the distance between the two locations, and that
Exercise 1.12 What about the expression A-B? Is it valid? If yes, what are its
value and type?
On the other hand, an expression of the form rel + rel has no well-defined
type and is, therefore, invalid. Both A & B above are relative and represent certain
addresses. The sum A+B, however, does not represent any address. In a similar
way abs ∗ abs is abs, rel ∗ abs is rel but rel ∗ rel is invalid. abs/abs is abs, rel/abs
is rel but rel/rel is invalid. All expressions are evaluated at the last possible
moment. Expressions in any pass 0 directives (see Ch. 4 for a discussion of pass 0)
are evaluated when the directive is executed, in pass 0. Expressions in any pass 1
directives are, similarly, evaluated in pass 1. All other expressions (in instructions
or in pass 2 directives) are evaluated in pass 2.
An extreme example of an address expression is ‘A-B+C-D+E’ where all the
symbols involved are relative. It is evaluated from left to right as ‘(((A-B)+C)-D)+E’,
generating the intermediate types: (((rel − rel) + rel) − rel) + rel → ((abs + rel) −
rel) + rel → (rel − rel) + rel → abs + rel → rel. A valid expression.
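The left-to-right derivation of the type can be mechanized with a small rule table. The sketch covers only addition and subtraction; treating abs − rel as invalid is our assumption, since that case is not discussed in the text.

```python
ABS, REL = "abs", "rel"

# (left type, operator, right type) -> result type. Any combination not listed,
# such as rel + rel, is invalid; abs - rel is assumed to be invalid as well.
RULES = {
    (ABS, "+", ABS): ABS, (ABS, "+", REL): REL, (REL, "+", ABS): REL,
    (ABS, "-", ABS): ABS, (REL, "-", ABS): REL, (REL, "-", REL): ABS,
}

def expression_type(types, operators):
    """Fold the operand types from left to right, as in (((A-B)+C)-D)+E."""
    result = types[0]
    for op, t in zip(operators, types[1:]):
        key = (result, op, t)
        if key not in RULES:
            raise ValueError(f"invalid expression: {result} {op} {t}")
        result = RULES[key]
    return result

print(expression_type([REL] * 5, ["-", "+", "-", "+"]))   # rel
```

Calling expression_type([REL, REL], ["+"]) raises ValueError, which corresponds to the invalid sum A+B.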
In general, expressions of the type ‘X+A-B+C-D+· · ·+M-N+Y’ are valid when ‘X,Y’
are absolute and ‘A,B,· · ·,M,N’ are relative. The relative symbols must come in
pairs like A-B except the last one M-N, where N may be missing. If N is missing, the
Exercise 1.13 How does the assembler handle an expression such as ‘A-B+K-L’ in
which all the symbols are relative but K,L are external?
1.7.1 Summary
The two-pass assembler generates the machine instructions in pass two, where it
has access to the source instructions. It checks each source instruction and generates
a relocation bit according to:
If the instruction uses a relative symbol, then it is relocatable and the relocation
bit is 1.
If the instruction uses an absolute symbol (see the discussion of EQU in chapter 3)
or uses no symbols at all, the instruction is absolute and the relocation bit is 0.
An instruction in the relative mode contains an offset, not the full address, and
is therefore absolute (see App. A for the relative mode).
The one-pass assembler generates the object file at the end of its single pass,
by dumping all the machine instructions from memory to the file. It has no access
to the source at that point and therefore cannot generate relocation bits.
As a result, those two types of assemblers have evolved along different lines,
and represent two different approaches to the overall assembler design, not just to
the problem of resolving future symbols.
1.8 Local Labels
In principle, a label may have any name that obeys the simple syntax rules of
the assembler. In practice, though, label names should be descriptive. Names such
as DATE, MORE, LOSS, RED are preferable to A001, A002,. . .
There are exceptions, however. The use of the non-descriptive label A1 in the
following example:
.
JMP A1
DDCT DS 12 reserve 12 locations for array DDCT
A1 .
.
Sec. 1.8 Local Labels 37
is justified since it is only used to jump over the array DDCT. (Note that the array’s
name is descriptive, possibly meaning deductions or double-dictionary.) The DS
directive is explained in chapter 3. We say that A1 is used only locally, to serve a
limited purpose.
As a result, many assemblers support a feature called local labels. It is due
to M. E. Conway who used it in the early UNISAP assembler for the UNIVAC I
computer [47]. The main idea is that if a label is used locally and does not require
a descriptive name, why not give it a name that will signify this fact. Conway used
names such as 1H, 2H for the local labels. The name of a local label in our examples
is a single decimal digit. When such a label is referred to (in the operand field), the
digit is followed by either B or F (for Backward or Forward).
LC
.
.
13 1: ...
.
.
17 JMP 1F jump to 24
.
.
24 1: LOD R2,1B 1B here means address 13
.
.
31 1: ADD R1,2F 2F is address 102
.
.
102 2: DC 1206,-17
.
.
115 SUB R3,2B-1 102-1=101
The example shows that local labels are a simple, useful concept that is easy to
implement. In a two-pass assembler, each local label is entered into the symbol table
as any other symbol, in pass 1. Thus the symbol table in our example contains:
Symbol Table
n v
1 13
1 24
1 31
2 102
The order of the labels in the symbol table is important. If the symbol table is
sorted between the two passes, all occurrences of each local label should remain
sorted by value. In pass 2, when an instruction uses a local label such as 1F, the
assembler identifies the specific occurrence of label 1 by comparing all local labels
1 to the current value of the LC. The first such instruction in our example is the
‘JMP 1F’ at LC=17. Clearly, the assembler should look for a local label with the
name ‘1’ and a value ≥ 17. The smallest such label has value 24. In the second
case, LC=24 and the assembler is looking for a 1B. It needs the label with name ‘1’
and a value which is the largest among all values < 24. It therefore identifies the
label as the ‘1’ at 13.
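The search just described can be sketched with a binary search over the sorted values of one local label. The function name is ours; the comparisons (≥ for a forward reference, < for a backward one) follow the text.

```python
from bisect import bisect_left

def resolve_local(values, direction, lc):
    """values: sorted LC values of all occurrences of one local label.
    direction: 'F' or 'B'; lc: LC of the instruction making the reference."""
    i = bisect_left(values, lc)       # values[:i] < lc <= values[i:]
    if direction == "F":
        if i == len(values):
            raise ValueError("no forward occurrence of this local label")
        return values[i]              # smallest value >= lc
    if i == 0:
        raise ValueError("no backward occurrence of this local label")
    return values[i - 1]              # largest value < lc

ones, twos = [13, 24, 31], [102]
print(resolve_local(ones, "F", 17))        # 24
print(resolve_local(ones, "B", 24))        # 13
print(resolve_local(twos, "F", 31))        # 102
print(resolve_local(twos, "B", 115) - 1)   # 101
```

The binary search works only if the occurrences of each local label stay sorted by value, which is the requirement stated above.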
In a one-pass assembler, again the labels are recognized and put into the symbol
table in the single pass. An instruction using a local label iB is no problem, since
it needs the most recent occurrence of the local label ‘i’ in the table. An instruction
using an iF is handled like any other future symbol case. An entry is opened in the
symbol table with the name iF, a type of U, and a value which is a pointer to the
instruction.
In the example above, a snapshot of the symbol table at LC=32 is:
Symbol Table
n v t
1 13 D
1 24 D
1 31 D 31 is the value of the third 1
2 31 U 31 is a pointer to the ADD instruction
An advantage of this feature is that the local labels are easy to identify as such,
since their names start with a digit. Most assemblers require regular label names
to start with a letter.
In modern assemblers, local labels sometimes use a syntax different from the
one shown here. See Ch. 8 for examples.
Virtually all assemblers allow a notation such as ‘BPL *+6’ where ‘*’ stands
for the current value of the LC. The operand in this case is located at a point 6
locations following the BPL instruction.
The LC symbol can be part of any address expression and is, of course, relo-
catable. Thus *+A is valid if A is absolute, while *-A is always okay (and is absolute
if A is relative, relative if A is absolute). This feature is easy to implement. The
address expression involving the ‘*’ is calculated, using the current value of the LC,
and the value is used to assemble the instruction, or execute the directive, on the
current source line. Nothing is stored in the symbol table.
Some assemblers use the asterisk for multiplication, and may designate the
period ‘.’ or the ‘$’ for the LC symbol.
On the PDP-11 the notation ‘X: .=.+8’ is used to increment the LC by 8, and
However, at run time, the hardware, after executing the ADD instruction, would
try to execute the first element of array D as an instruction. Obviously, instruc-
tions and data have to be separated, and normally all the arrays and constants are
declared at the end of the program, following the last executable instruction (HLT).
Multiple location counters make it possible to enjoy the best of both worlds.
The data can be declared when first used, and can be loaded at the end of the
program or anywhere else the programmer wishes. This feature uses several direc-
tives, covered in chapter 3, the most important of which will be described here.
It is based on the principle that new location counters can be declared and given
names, at assembly time, at any point in the source code. The example above can
be handled by declaring a location counter with a name (such as DATA) instructing
the assembler to assemble the DS directive under that LC, and to switch back to
the main LC—which now must have a name—like any other LC. Its name is ‘ ’ (a
space).
This is done by the special directive USE:
ADD D, ...
USE DATA
D DS 12
USE *
.
.
This directive directs the assembler to start using (or to resume the use of) a new
location counter. The name is specified in the operand field, so an empty operand
means the main LC. The asterisk ‘*’ implies the previous LC, the one that was used
before the last USE.
Exercise 1.16 The previous section discusses the use of asterisk as the LC value.
When executing a USE *, how does the assembler know that the asterisk is not the
LC value?
The USE directives divide the program into several sections, which are loaded,
by the loader, into separate memory areas. The sections are loaded in the order in
which their names appear in the source. Fig. 1–8 is a good example:
.
. (1)
.
USE DATA
.
. (2)
.
USE *
.
. (3)
.
USE BETA
.
. (4)
.
USE DATA
.
. (5)
.
USE <space>
.
. (6)
.
USE GAMMA
.
. (7)
.
END
At load time, the different sections would be loaded in the order MAIN, DATA,
BETA, GAMMA or 1,3,6,2,5,4,7. Chapter 7 explains the details of such a load, which
Exercise 1.17 Can we start a program with a USE ABC? In other words, can the
first section be other than the main section?
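The grouping of Fig. 1–8 can be traced with a short sketch. It is only an illustration, not the way any particular assembler or loader is written: the function name load_order, the list encoding of the figure, and the use of the string "<space>" for the main LC are all assumptions made here.

```python
def load_order(source):
    """Group numbered source blocks by location counter.

    `source` is a list in which an int stands for a block of source lines
    and a string starting with "USE" stands for the directive.
    """
    MAIN = " "                       # the main LC is named by a single space
    sections = {MAIN: []}            # a dict keeps names in first-appearance order
    current, previous = MAIN, MAIN
    for item in source:
        if isinstance(item, int):
            sections[current].append(item)
            continue
        operand = item[3:].strip()   # the text that follows "USE"
        if operand == "*":
            target = previous        # the LC in use before the last USE
        elif operand in ("", "<space>"):
            target = MAIN
        else:
            target = operand
        sections.setdefault(target, [])
        previous, current = current, target
    names = ["MAIN" if n == MAIN else n for n in sections]
    blocks = [b for blks in sections.values() for b in blks]
    return names, blocks

fig_1_8 = [1, "USE DATA", 2, "USE *", 3, "USE BETA", 4,
           "USE DATA", 5, "USE <space>", 6, "USE GAMMA", 7]
names, blocks = load_order(fig_1_8)
print(names)
print(blocks)
```

The sketch keeps one list of blocks per LC name; a real assembler would keep one LC value per name and let the loader place the sections.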
Another example of the same feature is procedures. In assembler language, a
procedure can be written as part of the main program. However, the procedure
should only be executed when called from the main program. Therefore, it should
be separated from the main instruction stream, since otherwise the hardware would
execute it when it runs into the first instruction of the procedure. So something
like:
.
.
0 LOD ...
.
.
15 SUB ...
16 CALL P
17 P ADD R5,N
.
.
45 RET
46 CLR ...
.
.
104 END
is wrong. The procedure is defined on lines 17–45 and is called on line 16. This
makes the source program more readable, since the procedure is written next to its
call. However, the hardware would run into the procedure and start executing it
right after it executes line 16, i.e., right after it has been called. The solution is to
use a new LC (named, perhaps, PROC) by placing a USE PROC between lines 16 and 17,
and a USE * between lines 45 and 46.
Fortran programmers are familiar with the COMMON statement. This is a block
of memory reserved in a common area, accessible to the main program and to all
its procedures. It is allocated by the loader high in memory, overlaying the loader
itself. The common area cannot be preloaded with constants, since it uses the same
memory area occupied by the loader. In many assemblers, designed to interface with
Fortran programs, there is a preassigned LC, called //, such that all data declared
under it end up being loaded in the common area. The concept of labeled common
in Fortran also has its equivalent in assembler language. The Fortran statement
‘COMMON/NAM/A(12),B(5)’ can be written in assembler language as:
.
USE /NAM/
A DS 12
B DS 5
USE DAT
C DC 56,-90
USE
.
.
The two arrays A, B would be loaded in the labeled common /NAM/, while the
constants labeled C would end up as part of section DAT.
The IBM 360 assembler has a CSECT directive, declaring the start of a control
section. However, a control section on the 360 is a general feature. It can be used to
declare sections like those described here, or to declare sections that are considered
separate programs and are assembled separately. They are later loaded together,
by the loader, to form one executable program. The different control sections are
linked by global symbols, declared either as external or as entry points. The entire
concept is explained in chapter 3, as part of the discussion of the EXTRN, ENTRY
directives.
The VAX Macro assembler (see Ch. 8) [77] has a .PSECT directive similar to
CSECT, and it does not support multiple LCs. A typical VAX example is:
.TITLE CALCULATE PI
.PSECT DATA, NOEXE,WRT
A=2000
B: .WORD 6
C: .LONG 8
.PSECT CODE, EXE,NOWRT
.ENTRY PI,0
.
.
<instructions>
.
.
$EXIT
.PSECT CONS, NOEXE,NOWRT
K: .WORD 1230
.END PI
Each .PSECT includes the name of the section, followed by attributes such as
EXE, NOEXE, WRT, NOWRT.
The memory on the 80x86 microprocessors is organized in 64k (highly over-
lapping) segments. The microprocessor can only generate 16-bit addresses, i.e., it
can only specify an address within a segment. A physical address is created by
combining the 16-bit processor generated address with the contents of one of the
segment registers in a special way (see refs. [38, 57] for the details). There are four
such registers: The DS (data segment), CS (code segment), SS (stack segment) and
ES (extra segment).
When an instruction is fetched, the PC is combined with the CS register and
the result is used as the address of the next instruction (in the code segment). When
an instruction specifies the address of a piece of data, that address is combined with
the DS register, to obtain a full address in the data segment. The extra segment
is normally used for string operations, and the stack segment, for stack-oriented
instructions (PUSH, POP or any instructions that use the SP or BP registers).
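On the 8086/8088, the "special way" of combining the two values is to shift the 16-bit segment register four bits to the left and add the 16-bit offset, producing a 20-bit physical address. The following sketch shows the arithmetic; the function name physical_address is an assumption made for this illustration.

```python
def physical_address(segment, offset):
    """Combine a 16-bit segment register value with a 16-bit offset."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # Shift the segment left by 4 bits, add the offset, keep 20 bits
    # (the 8086 wraps around at the top of its 1M address space).
    return ((segment << 4) + offset) & 0xFFFFF

a = physical_address(0x1234, 0x0010)
b = physical_address(0x1235, 0x0000)
print(hex(a), hex(b))
```

The two calls yield the same physical address, which is why the segments are described as highly overlapping: a new segment starts every 16 bytes.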
1.10 Literals
Many instructions require their operands to be addresses. The ADD instruction
is typically written ‘ADD AB,R3’ or ‘ADD R3,AB’ where AB is a symbol and the in-
struction adds the contents of location AB to register 3. Sometimes, however, the
programmer wants to add to register 3, not the contents of any memory location
but a certain constant, say the number −7. Modern computers support the im-
mediate mode which allows the programmer to write ‘ADD #-7,R3’. The number
sign ‘#’ indicates the immediate mode and it implies that the instruction contains
the operand itself, not the address of the operand. Most old computers, however,
do not support this mode; their instructions have to contain addresses, not the
operands themselves. Also, in many computers, an immediate operand must be a
small number.
To help the programmer in such cases, some assemblers support literals. A notable
example is the MPW assembler for the Macintosh computer (see Ch. 8). A
literal is a constant preceded by an equal sign ‘=’. Using literals, the programmer
can write ‘ADD =-7,R3’ and the assembler handles this by:
1. Preloading the constant −7 in the first memory location in the literal table
(however, see the LITORG directive in chapter 3 for an exception). The literal
table is loaded in memory immediately following the program.
2. Assembling the instruction as ‘ADD TMP,R3’ where TMP is the address where the
constant was loaded.
Such assemblers may also support octal (=O377 or =377B), hex (=HFF0A),
real (=1.37 or =12E-5) or other literals.
To handle literals, the assembler maintains a table, the literal table, similar
to the symbol table. It has columns for the name, value, address and type of each
literal. In pass 1, when the assembler finds an instruction that uses a literal, such as
−7, it stores the name (−7) in the first available entry in the literal table, together
with the value (1...11001 in binary) and the type (decimal). The instruction itself is treated
as any other instruction with a future symbol. At the end of pass 1, the assembler
uses the LC to assign addresses to the literals in the table. In pass 2, the table
is used, in much the same way as the symbol table, to assemble instructions using
literals. At the end of pass 2, every entry in the literal table is treated as a DC
directive and is written on the object file in the usual way. There are three points
to consider when implementing literals.
Two literals with the same name are considered identical; only one entry is gener-
ated in the literal table. On the other hand, literals with different names are treated
as different even if they have identical values, such as =12.5 and =12.50.
All literals are loaded following the end of the program. If the programmer wants
certain literals to be loaded elsewhere, the LITORG directive can be used. It is fully
described in chapter 3, but the following example clarifies a point that should be
mentioned here.
.
ADD =-7,R3
.
LITORG
.
SUB =-7,R4
.
The first −7 is loaded, perhaps with other literals, at the point in the program
where the LITORG is specified. The second −7, even though identical to the first, is
loaded separately, together with all the literals used since the LITORG, at the end of
the program.
The LITORG directive is commonly used to make sure that a literal is loaded
in memory close to the instruction using it. This may be important in case the
relative mode is used.
The LC can be used as a literal ‘=*’. This is an example of a literal whose name
is always the same, but whose value is different for each use.
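The handling of literal names and of LITORG can be sketched as follows. This is a minimal illustration under stated assumptions, not the data structure of any actual assembler: the class LiteralTable, its methods use and flush, the LC values 100 and 200, and the one-word size of each literal are all hypothetical.

```python
class LiteralTable:
    """Literal pool keyed by the literal's name (its source text)."""

    def __init__(self):
        self.pending = {}    # name -> entry, literals not yet given an address
        self.placed = []     # entries that already have an address

    def use(self, name, value):
        """Record a literal found in pass 1; identical names share one entry."""
        if name not in self.pending:
            self.pending[name] = {"name": name, "value": value, "address": None}
        return self.pending[name]

    def flush(self, lc):
        """Assign addresses starting at lc (at a LITORG or at the end of pass 1)."""
        for entry in self.pending.values():
            entry["address"] = lc
            self.placed.append(entry)
            lc += 1          # assumption: every literal occupies one word
        self.pending = {}
        return lc

table = LiteralTable()
first = table.use("-7", -7)        # ADD =-7,R3
table.use("12.5", 12.5)
table.use("12.50", 12.5)           # a different name, so a separate entry
lc = table.flush(100)              # LITORG at a hypothetical LC of 100
second = table.use("-7", -7)       # SUB =-7,R4, after the LITORG
lc = table.flush(200)              # end of program, hypothetical LC of 200
```

A literal such as ‘=*’ would need an exception to the lookup in use, since its name is the same on every use while its value is not.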
1.10.2 Examples
As has been mentioned before, some assemblers support literals even though
the computer may have an immediate mode, because an immediate operand is
normally limited in size. However, more and more modern computers, such as the
68000 and the VAX, support immediate operands of several sizes. Their assemblers
do not have to support any literals. Some interesting VAX examples are:
1. MOVL #7,R6 is assembled into ‘D0 07 56’. D0 is the OpCode, 07 is a byte with
two mode bits and six bits of operand. The two mode bits (00) specify the
short literal mode. This is really a short immediate mode. Even though the
word ‘literal’ is used, it is not a use of literal but rather an immediate mode.
The difference is that, in the immediate mode, the operand is part of the
instruction whereas, when a literal is used, the instruction contains the address
of the operand, not the operand itself. The third byte (56) specifies the use of
register 6 in mode 5 (register mode). The assembler has generated a three-byte
MOVL instruction in the short literal mode. This mode is automatically selected
by the assembler if the operand fits in six bits.
2. MOVW I^#7,R6 is assembled into ‘B0 8F 0007 56’. Here the user has forced
the assembler to use the immediate mode by specifying I^. The immediate
operand becomes a word (2 bytes or 16 bits) and the instruction is now 5 bytes
long. The second byte specifies register F (which happens to be the PC on
the VAX) in mode 8 (autoincrement). This combination is equivalent to the
immediate mode, where the immediate operand is stored in the third byte of
the instruction. The last byte (56) is as before.
3. Again, a MOVL instruction but in a different context.
LC
MOVL #DATA,R6 assembled into D0 8F 00000037’ 56
.
.
0037 DATA .BYTE ...
.
.
Even though the operand is small (0037) and fits in six bits, the assembler
has automatically selected the immediate mode (8F) and has generated the
constant as a long word (32 bits). The reason is that the source instruction
uses a future symbol (DATA). The assembler has to determine the instruction
size in pass 1 and, since DATA is a future symbol, the assembler does not have
its value and has to assume the largest possible value. The result is a seven
byte instruction instead of the three bytes in the first example!
Incidentally, the quote in (00000037’) indicates that the constant is relocatable.
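The choice between the short literal and the immediate mode in the three examples can be summarized in a sketch. It only reproduces the listing fields shown above for a MOVL with a constant source and a register destination; the function names and the value_known flag (standing for "not a future symbol") are assumptions of this illustration.

```python
def movl_immediate_fields(value, reg, value_known=True):
    """Return the listing fields for MOVL #value,Rreg as hex strings."""
    opcode = "D0"                          # MOVL, as in the examples above
    reg_spec = f"5{reg:X}"                 # mode 5 (register mode), register number
    if value_known and 0 <= value <= 63:
        # Short literal: two mode bits 00 followed by six bits of operand.
        return [opcode, f"{value:02X}", reg_spec]
    # Immediate mode: specifier 8F followed by a 32-bit constant.
    constant = value & 0xFFFFFFFF
    return [opcode, "8F", f"{constant:08X}", reg_spec]

def size_in_bytes(fields):
    """Two hex digits per byte."""
    return sum(len(field) // 2 for field in fields)

short = movl_immediate_fields(7, 6)
forced = movl_immediate_fields(0x37, 6, value_known=False)
print(short, size_in_bytes(short))
print(forced, size_in_bytes(forced))
```

The second call models example 3: the value would fit in six bits, but because it is not known in pass 1 the longer form is chosen.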
programmer. Thus things such as ‘T’A, L’B’ specify the type and length of symbols
and can be used throughout the program. Examples such as:
H DC L’X the length of X (in words) is preloaded in location H
G DS L’X array G has L’X elements
AIF (T’X=ABS).Z a conditional assembly directive, see chapter 4.
Even if the operands are of the right type, their values may be out of range.
In a seemingly innocent instruction such as ‘LOD R16,#70000’, either operand, or
even both, may be invalid. If the computer has 16 registers, R0-R15, then R16 is
out of range. If the computer supports 16-bit integers, then the number 70000 is
too large.
Even if the operands are valid, there may still be errors such as a bad addressing
mode. Certain instructions can only use certain modes. A specific mode can only
use addresses in a certain range. Project 1–4 at the end of this chapter describes
an assembler where such restrictions exist and quite a few errors are possible.
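A range check of the kind described for ‘LOD R16,#70000’ might look as follows. The function name check_operands and its defaults (16 registers, 16-bit integers) are taken from the example in the text; accepting both signed and unsigned 16-bit values is an assumption of this sketch.

```python
def check_operands(reg, value, num_regs=16, bits=16):
    """Return a list of error messages for a register number and a constant."""
    errors = []
    if not 0 <= reg < num_regs:
        errors.append(f"register R{reg} out of range (R0-R{num_regs - 1})")
    # Accept anything that fits in `bits` bits, signed or unsigned.
    lowest, highest = -(1 << (bits - 1)), (1 << bits) - 1
    if not lowest <= value <= highest:
        errors.append(f"constant {value} does not fit in {bits} bits")
    return errors

for message in check_operands(16, 70000):
    print(message)
```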
4. General errors do not pertain to any individual line and have to do with the general
status of the assembler. Examples are ‘out of memory’, ‘cannot read/write
file xxx’, ‘illegal character read from source file’, ‘table xxx overflow’, and ‘phase
error between passes’.
The last example is particularly interesting and will be described in some de-
tail. It is issued when pass 1 makes an assumption that turns out, in pass 2,
to be wrong. This is a severe error that requires a reassembly. Phase errors re-
quire a computer with sophisticated instructions and complex memory manage-
ment; they don’t exist on computers with simple architectures. The Intel 80x86
microprocessors—with variable-size instructions, several offset sizes, and segmented
memory management—are a good example of computer architecture where phase
errors may easily occur.
Here are two examples of phase errors on those microprocessors. (Refs. [38, 57]
are good introductions to 8086/8088 architecture and instruction set):
An instruction in a certain code segment refers to a variable declared in a data
segment following the code segment. In pass 1, the assembler assumes that the
variable is declared in the same segment as the instruction, and is a future symbol.
The instruction size is determined accordingly. In pass 2, when the time comes to
assemble the instruction, all the variables are known, and the assembler discovers
that the variable in question is far. A longer instruction is necessary, the pass-1
assumption turns out to be wrong, and pass 2 cannot assemble the instruction. This
error is illustrated below.
CODE_S SEGMENT PUBLIC
..
..
MOV AL,ABC
..
..
CODE_S ENDS
DATA_S SEGMENT PUBLIC
ABC DB 123
..
DATA_S ENDS
END START
An instruction in the relative mode has a field for the relative address (the offset).
Several offset sizes are possible on the 80x86, depending on the distance
between the instruction and its operand. If the operand is a future symbol, even
in the same segment as the instruction, the assembler has to guess a size for the
offset. In pass 2 the operand is known and, if it is too far from the instruction, the
offset size guessed in pass 1 may turn out to be too small.
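The second kind of phase error can be sketched as a guess made in pass 1 and checked in pass 2. The sketch is deliberately simplified: it assumes only two offset sizes (one or two bytes, signed), an optimistic one-byte guess for future symbols, and hypothetical names (PhaseError, pass1_size, pass2_check, FAR_LABEL). It does not describe what MASM or any other assembler actually does.

```python
class PhaseError(Exception):
    """Raised when a pass-1 assumption turns out to be wrong in pass 2."""

def offset_size(distance):
    """Smallest offset field, in bytes, that can hold a signed distance."""
    return 1 if -128 <= distance <= 127 else 2

def pass1_size(lc, target, symbols):
    """Pass 1: a future symbol forces a guess; this sketch guesses one byte."""
    if target in symbols:
        return offset_size(symbols[target] - lc)
    return 1

def pass2_check(lc, target, symbols, guessed_size):
    """Pass 2: the symbol is known, so the pass-1 size can be verified."""
    needed = offset_size(symbols[target] - lc)
    if needed > guessed_size:
        raise PhaseError(f"phase error: {target} needs a {needed}-byte offset")
    return guessed_size

symbols = {}
guess = pass1_size(10, "FAR_LABEL", symbols)    # still a future symbol
symbols["FAR_LABEL"] = 500                      # defined later in pass 1
try:
    pass2_check(10, "FAR_LABEL", symbols, guess)
    outcome = "ok"
except PhaseError as err:
    outcome = str(err)
print(outcome)
```

Guessing the largest size instead would avoid the error at the cost of longer instructions, which is the choice the VAX example in the section on literals illustrates.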
The Microsoft macro assembler (MASM), a typical modern assembler for the
80x86 microprocessors, features a list of about 100 error messages. Even an early
assembler such as IBMAP for the IBM 7090 [65] had a list of 125 error messages,
divided into four classes according to the severity of the error.
AR 1,2
N LR 3,4
SBI M,3
Q CR 4,5
P MVI #5,Q
MR 6,7
Z JMP D
11. The asterisk ‘*’ is used, among other things, to indicate the LC value. It can
be used in address expressions such as *+4. Since such an expression indicates an
address greater than the current address, is it an unresolved reference?
12. What feature of assembler language leads to the idea of a two-pass assembler?
13. Compare and contrast literals and the immediate mode. What are the advan-
tages and disadvantages of each?
Test Program
                          Object
LC  Label  Source    OpCode  Op
 0         INP       000111  0
 1         STO 50    000010  0000110010
 2         INP       000111  0
 3         STO 51    000010  0000110011
 4         BZE X     000100  0000001000
 5         ADD 50    000011  0000110010
 6         OUT       001000  0
 7         BRA Y     000110  0000001010
 8  X      LOD 50    000001  0000110010
 9         ADD 50    000011  0000110010
10  Y      STO 52    000010  0000110100
11         HLT       000000  0
Table 1–2.
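The object code of Table 1–2 can be reproduced by a very small two-pass sketch. The OpCode values are read off the table (a 6-bit OpCode and a 10-bit Op); the names OPCODES, PROGRAM, and assemble, and the tuple encoding of the source lines, are assumptions made for this illustration.

```python
OPCODES = {"HLT": 0, "LOD": 1, "STO": 2, "ADD": 3, "BZE": 4,
           "BRA": 6, "INP": 7, "OUT": 8}          # values read off Table 1-2

PROGRAM = [                                       # (label, mnemonic, operand)
    (None, "INP", None), (None, "STO", "50"), (None, "INP", None),
    (None, "STO", "51"), (None, "BZE", "X"), (None, "ADD", "50"),
    (None, "OUT", None), (None, "BRA", "Y"), ("X", "LOD", "50"),
    (None, "ADD", "50"), ("Y", "STO", "52"), (None, "HLT", None),
]

def assemble(program):
    """Return a list of (OpCode bits, Op bits), one pair per source line."""
    # Pass 1: every instruction occupies one location, so the LC is the index.
    symbols = {label: lc for lc, (label, _, _) in enumerate(program) if label}
    # Pass 2: translate each line into its OpCode and Op fields.
    obj = []
    for _, mnemonic, operand in program:
        opcode = format(OPCODES[mnemonic], "06b")
        if operand is None:
            op = "0"                              # unused Op field, shown as 0 in the table
        else:
            value = int(operand) if operand.isdigit() else symbols[operand]
            op = format(value, "010b")
        obj.append((opcode, op))
    return obj

obj = assemble(PROGRAM)
for lc, (opcode, op) in enumerate(obj):
    print(lc, opcode, op)
```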
The most important point about this choice of M, N is the many unused
Op fields (in a third of the instructions in our example). This clearly points to
a very common principle of instruction set design: variable length instructions.
Instructions with an Op field should be longer than ones with only an OpCode.
Exercise 1.20 There is now room for 256 OpCodes, too many for our simple
computer. We probably need only a 4- or 5-bit OpCode, leaving either 3 or 4
unused bits in the first word of each instruction. What would be a good way to
extend the hardware, so we can use those bits?
Testing: After a choice of M, N is made, the assembler (actually, two assemblers)
can be implemented. Testing is a very important task in each of the projects
described here. You should prepare enough test programs to test every instruction
in every valid mode, every directive, and every error situation. Since the present
project is so simple, there are no directives, no modes, and only a few possible
errors. Subsequent projects will have, of course, more possible errors.
Exercise 1.22 Why is it a good idea to require a label to start with a letter? What
is wrong with 1A as a label?
When local labels are supported, the assembler loses this advantage, since local
label names do start with a digit. On the other hand, it becomes easier to distinguish
a local symbol from a regular one.
2. As a result, the source line format may have to be redesigned. Source lines
should have a free format, and you can select one of the two possibilities:
a. A label, if it exists, must start in position 1. Without a label, position 1
must be left blank.
b. A label must be followed by a colon, and a comment must be preceded by a
semicolon. This requires the programmer to work harder but simplifies the lexical
analysis. The individual fields of a source line should be separated by commas.
3. The three directives END, DS, and DC should be supported.
Extend project 1–2 by adding relocation bits. This can only be done for the
two-pass assembler. The relocation bit of an instruction is determined in pass 2,
when the instruction is assembled. If the instruction uses a relocatable symbol, the
relocation bit should be 1. If the instruction uses an absolute symbol (as, e.g., in
‘EQU 7’) or no symbol at all, the relocation bit should be 0. Instructions without
operands should always have a relocation bit of zero.
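The rule for the relocation bit is short enough to state as a function. In this sketch the symbol table is assumed to map each name to a (value, type) pair with type "rel" or "abs"; that layout, the function name, and the sample symbols are hypothetical.

```python
def relocation_bit(operand, symbols):
    """Return 1 if the operand is a relocatable symbol, otherwise 0."""
    if operand is None or operand not in symbols:
        return 0                       # no operand, or a plain number
    _, symbol_type = symbols[operand]
    return 1 if symbol_type == "rel" else 0

symbols = {"LOOP": (12, "rel"), "SEVEN": (7, "abs")}   # SEVEN defined by EQU 7
print(relocation_bit("LOOP", symbols),
      relocation_bit("SEVEN", symbols),
      relocation_bit(None, symbols))
```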
If the assembler produces variable length instructions, they should be written
on the object file in a way that would make it easy for the loader to read and
identify the relocation bits. The choice of M = 8, N = 8 mentioned before, for
instance, has resulted in instructions being either 8 or 16 bits long. A reasonable
design of the object file in such a case is to write the instructions as 10-bit records,
where the first 8 bits are the instruction (or part of it), bit 9 identifies the record
The DC generates the record 0...0108 01 since the value of label A is 108
locations from the start of the program.
Notice that now we have three types of records on the object file, namely, abso-
lute instructions, relocatable instructions, and loader directives. They should each
be identified by two bits (see also discussion of special relocation in this chapter).
1.13.4 Project 1–4
This is more than an extension of previous projects. Here we describe a more
realistic assembler that can handle more registers, more directives, and addressing
modes. It still is a very simple assembler, though. A special feature of this assembler
is that it can operate as either a one-pass or a two-pass assembler. As usual, it
should read a source file with a test program, assemble it into machine code and
either load the machine code directly in memory (the one-pass version), or write it
on an object file (the two-pass one). It is suggested to write the one-pass and two-
pass assemblers as two separate parts of the project, sharing common procedures.
Examples of common procedures are symbol table search, OpCode table search,
and lexical scan of the source line.
and are 16 bits long. Each instruction is loaded into one memory word, implying
a choice of N = 16. This roughly corresponds to the architecture of early third-
generation computers. The concept of a status flag is explained in any book on
computer organization and in many books on assembler language. As mentioned
before, it is not necessary to understand status flags in order to implement the
assembler.
All other OpCodes are not implemented yet and are reserved for future use. It
is important to understand that the assembler only assembles the instructions, and
does not execute them. Therefore the description of the instructions is not important.
Omitting that column from the table above would not make it impossible (or even
harder) to implement the assembler. An assembler generally does not know what
the instructions do, what the addressing modes mean, and when and how the status
flags are used. These are all run-time features, handled by the hardware.
The modes:
0 - Direct, 1 - Relative, 2 - Indirect, 3 - Immediate. They are special cases
of the modes explained in appendix A. Note that certain instructions can only use
certain modes. An error should be detected (and handled as shown below) if such
an instruction tries to use an invalid mode. Also note that mode 1 can generate,
when the instruction is executed, addresses > 64k. Those are illegal addresses on
our machine, but can only be detected at run time, so you don’t have to worry
about them.
The indirect and immediate modes are explicitly selected by the programmer. If
a mode is not specified, the assembler should try the direct mode (as on lines 50,
52 in the example below). Since the Op field is six bits, a direct address must fit
in six bits and is therefore limited to the range 0–63. If the direct mode cannot be
used (the Op is > 63), the assembler should try the relative mode (as on line 55).
If that mode cannot be used either, the assembler should issue an error, assemble
the line as all zeros, and set a flag that will prevent execution of the program.
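The default mode selection can be sketched as below. The text fixes the direct range (0–63) but this passage does not say how a relative displacement is encoded, so the sketch assumes a signed six-bit displacement (−32 to 31) measured from the LC of the instruction; that range, the function name choose_mode, and the returned tuple are assumptions, not part of the project specification.

```python
DIRECT, RELATIVE = 0, 1

def choose_mode(address, lc):
    """Return (mode, op_field, ok) for an operand with no explicit mode."""
    if 0 <= address <= 63:
        return DIRECT, address, True          # fits in the six-bit Op field
    displacement = address - lc               # assumption: measured from the LC
    if -32 <= displacement <= 31:             # assumption: signed six-bit field
        return RELATIVE, displacement & 0x3F, True
    # Neither mode works: assemble as all zeros; the caller issues the
    # error and sets the flag that prevents execution.
    return 0, 0, False

print(choose_mode(50, 55))
print(choose_mode(70, 55))
print(choose_mode(200, 55))
```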
The Directives
1. PASS n where n is either 1 or 2. This should always be the first line and it
specifies the number of passes.
2. ORG n where n is a non-negative integer. This directive resets the LC to n.
It is executed in pass 1 and only affects the LC.
3. label EQU n where n is as above. This directive places the label in the symbol
table with a value n and a type of abs. It is executed in pass 1 and only affects the
symbol table.
4. DC m initializes the current location to m, where m is a number as above or
a string of up to three characters.
5. END n indicates the end of the source program. Here n is a symbol and, if
present, it indicates the address of the first executable instruction.
See chapter 3 for more information on those directives.
Notice that the program starts at address 49 but the first executable instruction
is at address 50. The program is loaded in memory locations 49–70 so it would
straddle address 63. This is done in order to illustrate both the direct and relative
modes.
General Notes
1. Pass 1 is especially simple since all the instructions have the same size.
You will also find that the intermediate file contains a copy of the source and
almost nothing else. To make this project more interesting, you should redesign the
instruction set to have a few long instructions. Perhaps all the instructions with an
Op field should be 32 bits long.
2. Notice that the Op field is only 6 bits wide. In a one-pass assembler, this
may make it impossible to store a pointer in this field and you may have to resort
to storing all the unresolved references of each future symbol, in a linked list, as
described earlier in this chapter.
The shortage of a single kind of bolt would hold up the entire assembly. . .
— Henry Ford, Today and Tomorrow (1926)
2. The Symbol Table
The organization of the symbol table is the key to fast assembly. Even when
working on a small program, the assembler may use the symbol table hundreds
of times and, consequently, an efficient implementation of the table can cut the
assembly time significantly even for short programs.
The symbol table is a dynamic structure. It starts empty and should support
two operations, insertion and search. In a two-pass assembler, insertions are done
only in the first pass and searches, only in the second. In a one-pass assembler,
both insertions and searches occur in the single pass. The symbol table does not
have to support deletions, and this fact affects the choice of data structure for
implementing the table. A symbol table can be implemented in many different
ways but the following methods are almost always used, and will be discussed here:
A linear array.
A sorted array with binary search.
Buckets with linked lists.
A binary search tree.
A hash table.
A search is done by first locating the bucket (one step), and then performing
the same comparisons as in the insertion process above. The average search thus
also takes 1 + N/52 steps.
Such a symbol table has a variable size. More nodes can be allocated and added
to the buckets, and the table can, in principle, use the entire available memory.
Disadvantages: Although the number of steps is small, each step involves the use
of a pointer and is therefore slower than a step in the previous methods (that use
arrays). Also, some programmers always tend to assign names that start with an A.
In such a case all the symbols will go into the first bucket, and the table will behave
essentially as a linear array.
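A bucket table of this kind can be sketched as follows. It assumes one bucket per first letter (26 buckets) and uses Python lists in place of linked lists; the class name BucketTable and the sample symbols are hypothetical, and names are assumed to start with a letter.

```python
class BucketTable:
    """Symbol table with one bucket per first letter of the name."""

    def __init__(self):
        self.buckets = [[] for _ in range(26)]

    def _bucket(self, name):
        # Locating the bucket takes one step.
        return self.buckets[ord(name[0].upper()) - ord("A")]

    def insert(self, name, value):
        bucket = self._bucket(name)
        for entry_name, _ in bucket:          # compare against the bucket's symbols
            if entry_name == name:
                raise KeyError(f"doubly-defined symbol {name}")
        bucket.append((name, value))

    def search(self, name):
        for entry_name, value in self._bucket(name):
            if entry_name == name:
                return value
        return None                           # undefined symbol

table = BucketTable()
table.insert("ALPHA", 10)
table.insert("ABLE", 11)                      # same bucket as ALPHA
table.insert("ZED", 12)
print(table.search("ABLE"), table.search("MISSING"))
```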
Exercise 2.1 What if symbol names can start with a character other than a letter?
Can this data structure still be used? If yes, how?
This is a general data structure used not just for symbol tables, and is quite
efficient. It can be used by either a one-pass or two-pass assembler with the same
efficiency.
The table starts as an empty binary tree, and the first symbol inserted into
the table becomes the root of the tree. Every subsequent symbol is inserted into
the table by (lexicographically) comparing it with the root. If the new symbol is
less than the root, the program moves to the left son of the root and compares the
new symbol with that son. If the new symbol is greater than the root, the program
moves to the right son of the root and compares as above. If the new symbol turns
out to be equal to any of the existing tree nodes, then it is a doubly-defined symbol.
Otherwise, the comparisons continue until a node is reached that does not have a
son. The new symbol becomes the (left or right) son of that node.
Example: Assuming that the following symbols are defined, in this order, in
a program.
Symbol BGH becomes the root of the tree, and the final binary search tree is
shown in Fig. 2–1.
              BGH
             /   \
         A345     J12
                 /   \
               CC     MED
                         \
                          ON
                            \
                             TOM
                            /   \
                         QUE     ZIP
                         /
                     PETS
Reference [15] is a good source for binary search trees and it also discusses the
average times for insertion, search, and deletion (which, in the case of a symbol
table, is unnecessary). The minimum number of steps for insertion or search is
obviously 1. The maximum number of steps depends on the height of the tree. The
tree in Fig. 2–1 above has a height of 7, so the next insertion will require from 1
to 7 steps. The height of a binary tree with N nodes varies between log2 N (which
is the height of a fully balanced tree), and N (the height of a skewed tree). It can
be proved that an average binary tree is closer to a balanced tree than to a skewed
tree, and this implies that the average time for insertion or search in a binary search
tree is of the order of log2 N.
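The insertion procedure can be sketched in a few lines. The names Node, TreeTable, and height are hypothetical, and the insertion order used below is only one order that reproduces the tree of Fig. 2–1; the order intended by the text is not assumed.

```python
class Node:
    def __init__(self, name, value):
        self.name, self.value = name, value
        self.left = self.right = None

class TreeTable:
    """Symbol table kept as an (unbalanced) binary search tree."""

    def __init__(self):
        self.root = None

    def insert(self, name, value):
        if self.root is None:
            self.root = Node(name, value)     # the first symbol becomes the root
            return
        node = self.root
        while True:
            if name == node.name:
                raise KeyError(f"doubly-defined symbol {name}")
            side = "left" if name < node.name else "right"
            child = getattr(node, side)
            if child is None:                 # no son on that side: insert here
                setattr(node, side, Node(name, value))
                return
            node = child

    def search(self, name):
        node = self.root
        while node is not None and node.name != name:
            node = node.left if name < node.name else node.right
        return None if node is None else node.value

def height(node):
    """Number of levels in the subtree rooted at node."""
    return 0 if node is None else 1 + max(height(node.left), height(node.right))

table = TreeTable()
for lc, name in enumerate(["BGH", "A345", "J12", "CC", "MED",
                           "ON", "TOM", "QUE", "ZIP", "PETS"]):
    table.insert(name, lc)
print(height(table.root), table.search("PETS"))
```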
Advantages: Efficient operation (as measured by the average number of steps).
Flexible size.
Disadvantages: Each step is more complex than in an array-based symbol table.
The recommendations for use are the same as for the previous method.
2.5 A Hash Table
This method comes in two varieties, open hash, which uses pointers and has
a variable size, and closed hash, which is a fixed-size array.
2.5.1 Closed hashing
A closed hash table is an array (actually three arrays, for the name, value, and
type), normally of size 2^N, where each symbol is stored in an entry. To insert a
new symbol, it is necessary to obtain an index to the entry where the symbol will
be stored. This is done by performing an operation on the name of the symbol, an
operation that results in an N-bit number. An N-bit number has a value between
0 and 2^N − 1 and can thus serve as an index to the array. The operation is called
hashing and is done by hashing, or scrambling, the bits that constitute the name
of the symbol. For example, consider 6-character names, such as abcdef. Each
character is stored in memory as an 8-bit ASCII code. The name is divided into
three groups of two characters (16-bits) each, ab cd ef. The three groups are
added, producing an 18-bit sum. The sum is split into two 9-bit halves which
are then multiplied to give an 18-bit product. Finally N bits are extracted from
the middle of the product to serve as the hash index. The hashing operations are
meaningless since they operate on codes of characters, not on numbers. However,
they produce an N-bit number that depends on all the bits of the original name.
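The hashing steps just described can be written out as a sketch. Two details are assumptions made here, since the text does not fix them: names shorter than six characters are padded with spaces, and "the middle" of the 18-bit product is taken to mean a field centred by shifting right (18 − N)/2 bits. The function name hash_name is hypothetical.

```python
def hash_name(name, n_bits):
    """Hash a name of up to six ASCII characters into an n_bits-wide index."""
    codes = name.ljust(6)[:6].encode("ascii")      # pad short names with spaces (assumption)
    # Three groups of two characters (16 bits each): ab, cd, ef.
    groups = [(codes[i] << 8) | codes[i + 1] for i in (0, 2, 4)]
    total = sum(groups)                            # at most 18 bits
    high, low = total >> 9, total & 0x1FF          # two 9-bit halves
    product = high * low                           # at most 18 bits
    shift = (18 - n_bits) // 2                     # centre the extracted field (assumption)
    return (product >> shift) & ((1 << n_bits) - 1)

print(hash_name("abcdef", 9), hash_name("SYMB", 9))
```

The result always lies between 0 and 2^N − 1, so it can index a table of size 2^N directly.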
A good hash function should have the following two properties:
It should consider all the bits in the original name. Thus when two names that are
slightly different are hashed, there should be a good chance of producing different
hash indexes.
For a group of names that are uniformly distributed over the alphabet, the function
should produce indexes uniformly distributed over the range 0...2^N − 1.
Once the hash index is produced, it is used to insert the symbol into the array.
Searching for symbols is done in an identical way. The given name is hashed, and
the hashed index is used to retrieve the value and the type from the array.
Ideally, a hash table requires fixed time for insert and search, and can be
an excellent choice for a large symbol table. There are, however, two problems
associated with this method, namely collisions and overflow, that make hash tables
less than ideal.
Collisions involve the case where two entirely different symbol names are hashed
into identical indexes. Names such as SYMB and ZWYG6 can be hashed into the same
value, say, 54. If SYMB is encountered first in the program, it will be inserted
into entry 54 of the hash table. When ZWYG6 is found, it will be hashed, and
the assembler should discover that entry 54 is already taken. The collision problem
cannot be avoided just by designing a better hash function. The problem stems from
the fact that the set of all possible symbols is very large, but any given program
uses a small part of it. Typically, symbol names start with a letter, and consist of
letters and digits only. If such a name is limited to six characters, then there are
26 × 36^5 (≈ 1.572 billion) possible names. A typical program rarely contains more
than, say, 500 names, and a hash table of size 512 (= 2^9) may be sufficient. When
1.572 billion names are mapped into 512 positions, more than 3 million names will
map into each position. Thus even the best hash function will generate the same
index for many different names, and a good solution to the collision problem is the
key to an efficient hash table.
The simplest solution involves a linear search. All entries in the symbol table
are originally marked as vacant. When the symbol SYMB is inserted into entry 54,
that entry is marked occupied. If symbol ZWYG6 should be inserted into entry 54
and that entry is occupied, the assembler tries entries 55, 56 and so on. This implies
that, in the case of a collision, the hash table degrades to a linear table.
Another solution involves trying entry 54 + P, where P and the table size are
relatively prime. In either case, the assembler tries until a vacant entry is found or
until the entire table is searched and found to be all occupied.
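Both collision strategies fit one sketch, with the probe step as a parameter: a step of 1 gives the linear search, and a step P that is relatively prime to the table size gives the second method. The class name ClosedHashTable is hypothetical, and the constant hash function below exists only to force the SYMB/ZWYG6 collision of the text.

```python
class ClosedHashTable:
    """Fixed-size hash table that resolves collisions by probing."""

    def __init__(self, size, hash_fn, step=1):
        self.size, self.hash_fn, self.step = size, hash_fn, step
        self.slots = [None] * size            # None marks a vacant entry

    def _probe(self, name):
        """Yield the entries to try, starting at the hash index."""
        index = self.hash_fn(name) % self.size
        for _ in range(self.size):
            yield index
            index = (index + self.step) % self.size

    def insert(self, name, value):
        for index in self._probe(name):
            if self.slots[index] is None:
                self.slots[index] = (name, value)
                return index
            if self.slots[index][0] == name:
                raise KeyError(f"doubly-defined symbol {name}")
        raise OverflowError("hash table is full")

    def search(self, name):
        for index in self._probe(name):
            if self.slots[index] is None:
                return None                   # a vacant entry ends the search
            if self.slots[index][0] == name:
                return self.slots[index][1]
        return None

table = ClosedHashTable(512, hash_fn=lambda name: 54)   # force the collision
where_symb = table.insert("SYMB", 1)
where_zwyg = table.insert("ZWYG6", 2)
print(where_symb, where_zwyg, table.search("ZWYG6"))
```

When the step and the table size are relatively prime, the probe sequence visits every entry before repeating, so a vacant entry is found whenever one exists.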
Morris [16] presents a complete analysis of hash tables, where it is shown that
the average number of steps to insert (or search for a) symbol is 1/(1 − p) where p
is the fraction of the table that is full. p = 0 corresponds to an empty table, p = 0.5 means
a half-full table, etc. The following table gives the average number of steps for a
few values of p.
 p      number of steps
 0      1
 .4     1.66
 .5     2
 .6     2.5
 .7     3.33
 .8     5
 .9     10
 .95    20
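The entries of the table follow directly from the formula 1/(1 − p), as this short computation shows. The function name average_steps is hypothetical. Values are printed rounded to two decimals, so the entry for p = .4 prints as 1.67 where the table truncates it to 1.66.

```python
def average_steps(p):
    """Average number of steps for a table that is a fraction p full."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return 1 / (1 - p)

for p in (0, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95):
    print(f"{p:<5} {average_steps(p):.2f}")
```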
It is clear that when the hash table gets more than 50%–60% full, performance
suffers, no matter how good the hashing function is. Thus a good hash table design
makes sure that the table never gets more than 60% occupied. At that point the
table is considered overflowed.
The problem of hash table overflow can be handled in a number of ways. Tradi-
tionally, a new, larger table is opened and the original table is moved to the new one
by rehashing each element. The space taken by the original table is then released.
Hopgood [17] gives a good analysis of this method. A better solution, though, is to use
open hashing.
Implement a linear array symbol table and a sorted array symbol table. Use
them in one of the assemblers implemented in the chapter 1 projects. Prepare several
test programs, each with successively more labels defined and more symbols used.
Assemble each test program twice, using the two symbol table implementations
above, and measure the time it takes to assemble each test program. If the computer
does not have an internal clock, use your watch. The aim is to find the point where
the sorted symbol table becomes faster than the linear one. At how many symbols
does it occur in your case?
3.1 Introduction
In addition to assembling instructions, the assembler offers help to the pro-
grammer in the form of directives. The directives are commands to the assembler,
directing it to perform operations other than assembling instructions. The direc-
tives are thus executed by the assembler, not assembled by it. They may affect all
the operations of the assembler. Directives may affect the object code, the symbol
table, the listing file, and the values of internal assembler parameters. Certain di-
rectives are executed in pass 1 and others, in pass 2. Many directives are executed
in both passes. Directives that have to do with macros and conditional assembly
are normally executed in a special pass, pass 0. Regardless of when a directive is
executed, it must be passed over to pass 2, to be included in the listing file.
Some directives are passed by the assembler to the loader, through the object
file. They are eventually executed by the loader. Other directives are used as
programming tools, to simplify the process of writing the program and preparing
the source file.
Simple assemblers may support only a few directives, while large, modern
assemblers may support about a hundred. Perhaps the fastest way for the assembler
to identify each directive is to include all the directives in the OpCode table. The
table should, in such a case, contain more information, identifying each item as
either a machine instruction or a directive. The table should also specify the pass
70 Directives Ch. 3
(0,1 or 2) in which the directive should be executed, and should contain the start
address of the routine that executes the directive. If a directive is executed in both
Exercise 3.1 Some assemblers require each directive to start with a period ‘.’, for
easy identification. Why isn’t such a convention adopted by every assembler?
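An OpCode table extended in this way can be sketched as follows. Everything here is hypothetical: the field names, the opcode value for ADD, the choice of passes for ORG, and the assumption that every instruction is one word long. A Python function stands in for the start address of the routine that executes the directive.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class OpEntry:
    is_directive: bool
    opcode: Optional[int] = None        # used by machine instructions only
    passes: tuple = ()                  # passes in which a directive is executed
    routine: Optional[Callable] = None  # stands in for the routine's start address

def do_org(state, operand):
    state["LC"] = operand               # ORG resets the location counter

OPCODE_TABLE = {
    "ADD": OpEntry(False, opcode=0x1A),                        # hypothetical opcode
    "ORG": OpEntry(True, passes=(1, 2), routine=do_org),
}

def process(state, mnemonic, operand, current_pass):
    """Look up one source line and either execute or 'assemble' it."""
    entry = OPCODE_TABLE[mnemonic]
    if entry.is_directive:
        if current_pass in entry.passes:
            entry.routine(state, operand)
        return "directive"
    state["LC"] += 1                    # assumption: one-word instructions
    return "instruction"
```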
Figure 3–1 is a generalization of figure 1–2, that includes directives.
[Figure 3–1: flowchart of the two-pass assembler with directives. Pass 1: read a line from the source file; if it has a label, store the name and value in the symbol table; if it is a pass 1 directive, execute it; otherwise determine the size of the instruction and set LC:=LC+size; generate a record on the intermediate file; on END, generate an END record on the intermediate file and start pass 2. Pass 2: on END, stop; if the record is a pass 2 directive, execute it; otherwise assemble the instruction.]
The directives listed below are classified by function and, within each function
group, they are listed alphabetically (except when certain directives are needed to
explain others). For each directive, the general format is shown, followed by a short
description and by the way it is executed by the assembler (its implementation).
An important point to keep in mind about directive execution is that all directives
that affect the LC must be executed in pass 1. In general, a pass 1 directive is any
directive that affects the layout of the program (as a result, it must—directly or
indirectly—affect the LC).
Note that the names of directives are determined by the assembler writer and,
as a result, the same directive may have different names in different assemblers. In
several cases, two or three names are mentioned for the same directive.
This chapter lists many directives used by many assemblers. Most are rarely
used and are only supported by a few assemblers. Some directives, however, are
commonly used and are widely supported; those are identified below by an asterisk.
The following is a list of the directives described here, classified by function.
Every assembler manual should describe all the directives supported by the
assembler. References [26, 27, 30, 32, 37, 99, 100, 102, 103] are typical examples of
large, sophisticated assemblers supporting many directives.
Sec. 3.2 Program Identification Directives 73
1. IDENT
Label Operation Operand
none IDENT name,origin
The name in the operand field becomes the name of the program. This direc-
tive is for the benefit of the loader, which uses the program name to identify the
individual programs loaded, and to print a memory map. This directive is normally
not required and many assemblers do not even support it. Some assemblers use the
TITLE directive for this purpose. To execute it, the assembler writes the name on
the object file together with a special code (a loader directive) that tells the loader
what it is. Chapter 7 contains examples of loader directives and memory maps. The
‘origin’ field, if used, indicates the start address of the program. It is only used for
absolute assembly.
A few words about loader directives are in order. The relocatable object file
contains machine instructions (each with its relocation bit) and should also contain
loader directives. Both machine instructions and loader directives are written as
binary numbers on the object file and should be explicitly distinguished. The as-
sembler does this by adding one more bit to the relocation bit. With two such bits
(identifying bits), the assembler can identify a record on the object file as one of
four different types. So far we have seen three types, machine instructions (absolute
and relative), and loader directives. The fourth type is machine instructions that
require special relocation. They are discussed later in this chapter, in the section
on the ENTRY, EXTRN directives.
Throughout this book we will assume that the identification bits have the
following meaning:
00—an absolute machine instruction.
01—a relative machine instruction.
10—a machine instruction that needs special relocation.
11—a loader directive.
See chapter 7 about specific loader directives.
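The four record types can be illustrated with a short sketch. The placement of the two identification bits in the low-order end of the record is an assumption made here for simplicity; an actual object file format may store them elsewhere.

```python
ABSOLUTE, RELATIVE, SPECIAL, LOADER_DIRECTIVE = 0b00, 0b01, 0b10, 0b11

def tag_record(word, id_bits):
    """Append the two identification bits to a binary word."""
    return (word << 2) | id_bits

def split_record(record):
    """Recover (word, id_bits) from a tagged record, as a loader would."""
    return record >> 2, record & 0b11
```

The loader would call split_record on each record and use the id to decide whether to load the word unchanged, relocate it, resolve it, or execute it as a loader directive.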
2. END
Label Operation Operand
none END address expression
Indicates to the assembler the end of the source program. Upon reading the
END from the source file, the assembler stops reading and completes assembling the
program. Certain assemblers allow the source file to have more than one program,
each having its own END, to indicate separate assembly. In such a case, the assembler
completes one assembly and then tries to read beyond the END to see if there is
another program on the same source file.
The expression in the operand field indicates the address of the first executable
instruction. It is thus possible to start the source program with some data items or
with a procedure, and follow them by the main program. In such a case the first
source line is not the first executable instruction. The assembler does not know
where execution should start and, if it should start at any point other than the
beginning, that point should be specified in the operand field of the END.
Example:
3. ICTL
Label Operation Operand
none ICTL b,e,c
Exercise 3.2 How should the assembler test the values of b,e,c for validity?
4. INCLUDE
Label Operation Operand
none INCLUDE filename
When the INCLUDE directive is encountered, the assembler opens the file spec-
ified in the INCLUDE and uses it as a source file. When that file is fully read, the
assembler returns to the original source file. The new source file may itself include
an INCLUDE directive, that will direct the assembler to start reading a third source
file.
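Nested INCLUDEs are naturally handled by recursion (or by a stack of open files). The sketch below keeps the "files" in a dictionary so that it is self-contained; the function name and the check against a file including itself are additions made for this illustration.

```python
def read_source(name, files, active=()):
    """Yield the source lines of `name`, expanding INCLUDE directives.

    `files` maps a file name to its list of lines; `active` holds the
    chain of files currently being read.
    """
    if name in active:
        raise ValueError("recursive INCLUDE of " + name)
    for line in files[name]:
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "INCLUDE":
            # switch to the included file, then resume the current one
            yield from read_source(fields[1], files, active + (name,))
        else:
            yield line
```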
Note. A similar loader directive is mentioned in chapter 7. It is used to explicitly
include an object file in the load.
5. PUNCH
Label Operation Operand
optional PUNCH string
The string in the operand field is punched on a card. The PUNCH directive has
no other effect on the assembly.
6. REPRO
Label Operation Operand
none REPRO address
The next source line is punched on a card. There is no effect on the assembly.
Exercise 3.3 If it does not affect the assembly, what is this directive used for?
7. MACHINE
Label Operation Operand
none MACHINE processor code
This directive is used by assemblers that serve a family of computers, such as the
CDC 6000 series, 7000 series, and Cyber series computers. The different computers
in such a family are upward compatible, but not identical. Some of them support
instructions and hardware features that others do not. The processor code tells
the assembler on what machine the program is supposed to run and the assembler,
based on this information, can detect errors in the source code that stem from small
incompatibilities between different machines in the same family.
8. p286
Label Operation Operand
none p286 none
The 80x86 is a family of microprocessors that are different but are upward
compatible. A program intended for the 80286 microprocessor must use this directive,
so that the assembler can generate instructions specific to that machine. There are
also p186 and similar directives.
9. PPU
Label Operation Operand
none PPU string
This directive is specific to the Cyber 70/76 or CDC 7600. Those machines have
one CP (Central Processor) and several PPUs (Peripheral Processing Units). The
CP and the PPUs have different architectures and, as a result, different assembler
languages. The same assembler, however, can handle programs in either language.
The directive declares the program as a PPU, rather than as a CP, program. The
string operand has to do with the way certain PPU instructions are assembled [30].
Sec. 3.4 Machine Identification Directives 77
10. LCC
Label Operation Operand
none LCC loader directive
The loader directive in the operand field is written on the object file. The
loader directive should be coded as a number. This directive allows the advanced
user to explicitly insert loader directives in the object file.
11. ABS
Label Operation Operand
none ABS none
Directs the assembler to generate an absolute object file, rather than a relocat-
able one.
12. BASE
Label Operation Operand
none BASE or RADIX number base
13. CODE
Label Operation Operand
none CODE char
Computers use different character codes, such as ASCII and EBCDIC. If
the assembler supports more than one code, this directive tells it what code to use
for the current assembly. The operand can be a single-letter code such as A (for
ASCII), E (for EBCDIC), D (for Display, the CDC 6-bit code) or anything else.
14. QUAL
Label Operation Operand
none QUAL qualifier or blank
15. COL
Label Operation Operand
none COL column number
A source line with a comment but without an operand field may present a
problem to the assembler. Consider the line ABC HLT 2 TIMES. On many com-
Sec. 3.6 Mode Control Directives 79
puters, the HLT instruction may be written with or without an operand, and the
assembler has to determine whether the 2 is an operand or part of the comment.
Most assemblers require a special character, such as a semicolon, to precede a
comment. Some old, punched-card-based assemblers consider a certain column as the
beginning of comments. The COL directive can alter that column. Thus COL 40
means that anything in column 40 or beyond is a comment.
16. BEGIN
Label Operation Operand
none BEGIN Op1,Op2
Where Op1 is a location counter name, and Op2 is a start value for that location
counter.
Some assemblers can use several location counters to assemble one program
(see example below). The different location counters are assigned names, can be
initialized by the programmer to any values, and can be used in any order. The
default location counter (the one used in the absence of any BEGIN) is called ‘blank’
and is initialized to zero. This directive causes the assembler to start using the
location counter named in Op1, and initializes that location counter to the value of
Op2. The directive may appear anywhere in the program and is used to separate
the program into parts that are written in a certain order and are eventually loaded
in memory in a different order.
If the first line of the program is not a BEGIN, the assembler generates a
‘BEGIN ,0’ as the first line. Such a program starts by using the blank location
counter, which is initialized to zero (but see also the discussion of ORG).
17. COMM
Label Operation Operand
name COMM expr
This directive declares the name to be that of an array of size expr. The
array will later be located in blank common. This is handy for compatibility with
Fortran and other languages that use COMMON. The actual storage space is allocated
by the loader, and the directive is executed by the assembler by generating a loader
directive and placing it in the object file.
18. DS
Label Operation Operand
optional DS an expression
To execute the DS, the assembler evaluates the expression in the operand field
(the array size) and increments the LC by this amount. This guarantees that the
next instruction will be loaded in the word following the reserved area.
Example:
LC
15 N EQU 5
15 CLR R3
16 B DS 3
19 ADD R2,AB
20 C DS N N=5
25 ...
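The LC values in this example can be reproduced by a tiny pass 1 loop. This is a sketch under simplifying assumptions: every instruction is one word long, operands of DS and EQU are either a number or a previously defined symbol, and the names pass1 and evaluate are made up for the illustration.

```python
def evaluate(operand, symbols):
    """Value of a DS/EQU operand: a known symbol or a decimal number."""
    return symbols[operand] if operand in symbols else int(operand)

def pass1(lines, start_lc=0):
    """Return (symbol table, list of (LC, label, op), final LC)."""
    symbols, listing, lc = {}, [], start_lc
    for label, op, operand in lines:
        listing.append((lc, label, op))
        if op == "EQU":                  # the symbol gets the operand, not the LC
            symbols[label] = evaluate(operand, symbols)
            continue
        if label:
            symbols[label] = lc          # an ordinary label gets the LC
        if op == "DS":
            lc += evaluate(operand, symbols)   # reserve the array
        else:
            lc += 1                      # assumption: one-word instructions
    return symbols, listing, lc
```

Feeding it the five source lines above with a start value of 15 assigns 16 to B and 20 to C, and leaves the LC at 25.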
Exercise 3.6 What are some ways of mixing instructions and data?
Modern computers have a word size that’s a multiple of 8, so their assemblers
have directives such as BYTE (to reserve bytes), WORD (to reserve words), LONG (to
reserve longwords). Older assemblers use names such as BSS (Block Starting with
Symbol), or BES (Block Ending with Symbol). Those are all similar to DS.
Exercise 3.7 The directive ‘B BES 5’ causes five words of storage to be reserved,
and assigns B a value that is the address of the first word following this array. Why
isn’t B assigned the address of the last word of the array, and what is the use of
BES?
Sec. 3.7 Block Control & LC Directives 81
Exercise 3.8 The directive DS 0 seems useless. Still, it has an important use on
some computers. What is it?
19. =
Label Operation Operand
* = address expr
The ‘=’ is similar to DS and is used to reserve storage. The directive ‘*=*+7’
increments the LC by 7 and thus amounts to reserving 7 words of storage. In some
assemblers (such as the VAX Macro assembler, see Ch. 8) this directive can be
used to modify the value of any modifiable symbol, but we refer to such symbols as
SET symbols and describe the SET directive in chapter 4.
20. EVEN
Label Operation Operand
none EVEN none
21. LIMIT
Label Operation Operand
none LIMIT none
22. ODD
Label Operation Operand
none ODD none
23. ORG
Label Operation Operand
optional ORG expression
The ORG directive instructs the assembler to continue the assembly from the
memory location specified by the operand. The operand must be an expression
that can be immediately evaluated, and its value must be a valid address (i.e., it
cannot be negative). Thus the operand can be a number, a known symbol, or an
expression that can be evaluated by the assembler at this point. Such an operand
is called “definable.”
Example: AB ORG 100
Means—continue the assembly from location 100 and assign 100 as the value
of symbol AB.
To execute this directive, the assembler:
Evaluates the operand (pass 1).
Resets the LC to the value of the operand (pass 1).
Stores the label, if there is one, in the symbol table, with the LC as its value
(pass 1).
Prepares a loader directive on the object file (pass 2). This tells the loader where
to load subsequent instructions.
The LC points to the current address and, by resetting it, the programmer
instructs the assembler to load the instructions that follow into a different memory
area.
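The steps listed above for executing ORG can be collected into one routine. In this sketch both passes are folded into a single function for brevity, the loader directive is represented by a tuple, and all names are hypothetical.

```python
def execute_org(label, operand, symbols, state):
    """Execute 'label ORG operand' and return the new LC value."""
    # Evaluate the operand; it must be definable at this point.
    value = symbols[operand] if operand in symbols else int(operand)
    if value < 0:
        raise ValueError("ORG operand must be a valid address")
    state["LC"] = value                       # reset the LC (pass 1)
    if label:
        symbols[label] = value                # the label gets the new LC (pass 1)
    # In a real assembler the next step belongs to pass 2.
    state["loader_directives"].append(("ORG", value))
    return value
```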
Example:
LC Source
0000 LOAD
.
.
0150 ADD
ORG 170
0170 COMP
0171 SUB
.
.
The assembler starts at location 0: it initializes the LC to 0, and the first block of
instructions, from the LOAD to the ADD, will be assembled and loaded into memory
locations 0000 through 0150. The ‘ORG 170’ causes the COMP instruction to be
loaded into memory location 170, followed by SUB in location 171, and so on. The
area from location 151 to location 169 remains empty and can be used to store data.
There are two problems, however:
It is hard to refer to this area since it is not labeled (but see the description of
the EQU directive for ways to label any memory location).
Sec. 3.7 Block Control & LC Directives 83
At run time, the computer will execute the ADD instruction and will then proceed
to location 0151, trying to execute its contents as an instruction. Since location
0151 does not contain an instruction, an error will occur.
The precise error depends on the contents of location 0151. If it contains a bit
pattern that happens to be the code of an instruction, the computer will execute it
and proceed to location 0152. If, however, location 0151 contains a bit pattern that
is not the code of any instruction, an interrupt will be generated by the hardware.
Most 8-bit microprocessors, unfortunately, simply skip such invalid bit patterns,
with no interrupts generated.
If the ORG above is changed to ‘ORG 150’, the assembler will reset the LC to
150, with the result that the COMP instruction will go into location 0150 (overwriting
the ADD), the SUB, into location 0151, and so on. The assembler executes the ORG
without checking its operand, and it is therefore easy to make mistakes.
24. OVERLAY
Label Operation Operand
none OVERLAY n
Declares the following code an overlay. The overlay ends at the next OVERLAY
directive or at the end of the program. Overlays are useful when a large program
has to fit in a small memory, or in a multi-user computer where many application
programs compete for limited memory. The program can be logically divided into a
main part and a number of overlays. At run time, the main part resides in memory
and calls any overlay when needed. The overlays are loaded on top of each other
(they overlay each other). Overlays are discussed in detail in chapter 7.
25. POS
Label Operation Operand
none POS expression
In a computer with a large word size, several instructions can be packed in one
word. An assembler for such a computer uses, in addition to the LC, a position
counter, SC. The SC tells the assembler where, in the current word, to assemble
the next instruction. If the next instruction is longer than the rest of the word,
the assembler fills the current word with NOP, resets the SC to 0, and increments
the LC by 1. The next instruction thus goes into the next word. This process is
called a forcing up of the next instruction, and it is discussed in chapter 1. The
POS directive resets the SC to the value of the expression in the operand field. The
expression should not be greater than the word size.
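The interplay of the LC and the SC when an instruction is forced up can be sketched as follows. The function assumes that no instruction is longer than a word and that the counters advance to the next word when a word is exactly filled; both are assumptions of this illustration.

```python
def place(lc, sc, length, word_size):
    """Place an instruction of `length` bits at word lc, bit position sc.

    Returns (position used, new (lc, sc), number of bits padded with NOP).
    """
    padding = 0
    if length > word_size - sc:     # does not fit in the rest of the word
        padding = word_size - sc    # fill the rest of the word with NOP
        lc, sc = lc + 1, 0          # force up to the next word
    position = (lc, sc)
    sc += length
    if sc == word_size:             # word exactly full: move on
        lc, sc = lc + 1, 0
    return position, (lc, sc), padding
```

For instance, with a 60-bit word, a 30-bit instruction arriving when the SC is 45 is forced up: 15 bits are padded and the instruction starts at bit 0 of the next word.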
26. USE
Label Operation Operand
none USE LC name or *
This directive tells the assembler to start using the LC named in the operand.
This can be an existing LC or a new one. An asterisk in the operand means returning
to the previously used LC. The special name // refers to the special LC used for
the blank common block.
Example:
LC
00 USE
00 A LOD ... the value of symbol A is 0 relative to the main block
02 C STO ... C has a value of 2 relative to the main block
.
.
56 ADD ...
00 USE XY a new LC is initialized to 0
00 D DS 10 symbol D has value of 0 relative to block XY
10 E DC 1,2
12 .
57 USE * switch to the previous LC. Its current value is 57.
More information on the use of USE and on multiple LCs can be found in
chapter 1 and in references [26, 27].
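Bookkeeping for several named LCs can be sketched with a small class. The class name, the empty string as the name of the main block, and the decision to remember only one previous LC are assumptions; the special name // for blank common is not modeled.

```python
class LocationCounters:
    """A set of named location counters, one of which is current."""

    def __init__(self):
        self.values = {"": 0}      # "" names the main (default) block
        self.current = ""
        self.previous = ""

    def use(self, name):
        """Execute 'USE name'; 'USE *' returns to the previously used LC."""
        if name == "*":
            self.current, self.previous = self.previous, self.current
        else:
            self.values.setdefault(name, 0)   # a new LC starts at 0
            self.previous, self.current = self.current, name

    def advance(self, size):
        self.values[self.current] += size

    @property
    def lc(self):
        return self.values[self.current]
```

Following the example above: after 57 words in the main block, USE XY yields an LC of 0, the DS and DC advance it to 12, and USE * restores 57.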
‘*’ The asterisk is not a directive in the usual sense. It is not written on a separate
source line but is rather part of the operand of instructions. It stands for the LC
value, and is very useful. A typical simple example is:
BPL *+2
ADD ...
COM ...
the BPL instruction branches to the COM instruction. This is useful in short local
communication between instructions, but requires the programmer to know the size
of each instruction. The example above would work if the instructions involved are
1 word long each. If the ADD instruction is two words long, the BPL would branch
to the middle of the ADD.
Sec. 3.8 Segment Control Directives 85
27. SEGMENT
Label Operation Operand
symbol SEGMENT parameters
Used to declare the start of a segment. The end of the segment is defined by
an ENDS directive. Example:
MINE SEGMENT PUBLIC
DB 0
B DW 0
MINE ENDS
All segments using the parameter PUBLIC and having the same name are concate-
nated into one module (see reference [35] for more information on possible param-
eters).
28. ASSUME
Label Operation Operand
none ASSUME seg reg:name
When a program uses segments, every address must have two parts, a segment
address and an offset within the segment. The segment address must be loaded
by the program into one of the segment registers. The offset is included in the
individual instructions. Thus when an instruction is executed that uses an address,
the hardware fetches the offset from the instruction, adds it to the contents of one
of the segment registers, and ends up with a complete address. This process is
called address mapping. The user is responsible for loading the start address of
each segment into a segment register. The user then needs to tell the assembler
what each of the segment registers contains. This is done by the ASSUME directive.
Thus ‘ASSUME DS:MINE’ tells the assembler that segment register DS contains the
start address of segment MINE. The ASSUME directive does not load the address into
the register—that must be done by an instruction at run time—it only tells the
assembler to assume that the register has been loaded.
29. GROUP
Label Operation Operand
optional GROUP list of segment names
This directs the assembler to group several segments into one contiguous mod-
ule.
The assembler generates a loader directive, and the actual grouping is done by
the loader.
30. EQU
Label Operation Operand
symbol EQU or = expression
Where the label is mandatory and the operand is an expression that cannot
include any future symbols. The directive is executed by assigning the operand as
the value of the symbol. The symbol is stored in the symbol table with its value
and with the proper type. This is the first example of a symbol whose value is not
the LC. The EQU directive makes it possible to define symbols with any values and
any type, absolute or relative.
Example:
LC
AB EQU 0
CD EQU 5
0000 EF ADD 1,2
GH EQU AB+1
IJ EQU EF+1
Sec. 3.9 Symbol Definition Directives 87
The symbol types are worth noting in this example. GH is absolute since it is
defined by absolute quantities only. IJ is relative since its definition contains at least
one relative quantity. See also the discussion of address expressions in chapter 1.
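The typing rule just stated can be sketched for the simple case where the EQU operand is a sum of numbers and previously defined symbols. The representation of the symbol table as name → (value, type) and the function name are assumptions; the fuller rules for address expressions in chapter 1 are not reproduced here.

```python
def define_equ(label, terms, symbols):
    """Execute 'label EQU t1+t2+...' and return the (value, type) stored.

    Each term is an int or the name of an already defined symbol; a
    future symbol raises KeyError, since EQU cannot use one.
    """
    value, kind = 0, "abs"
    for term in terms:
        if isinstance(term, int):
            value += term
        else:
            term_value, term_kind = symbols[term]
            value += term_value
            if term_kind == "rel":    # one relative term makes the result relative
                kind = "rel"
    symbols[label] = (value, kind)
    return symbols[label]
```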
The difference between GH and IJ is in the way they are used. GH is identical
to the number 1 and the two can be used interchangeably. Thus:
ADD R1,R2
ADD R’GH,R2
are identical instructions that add registers 1 and 2. They are assembled into
identical machine instructions and are treated by the loader identically. Symbol IJ,
however, is relative and thus stands for address 1. As a result, ‘ADD R’IJ,R2’ is
wrong and would produce an assembler error message, since the assembler expects
a register number (an absolute number) at this point, but
LOAD R3,1
LOAD R3,IJ
is okay since the second operand of LOAD should be an address. Note, however,
that the two instructions are not identical. They go on the object file with different
identification bits.
Chapter 1 explains how the type of a symbol is used by the assembler to
generate relocation bits.
The main use of EQU is in assigning meaningful names to registers and locations
that are frequently used in the program. In the Apple II computer, for example,
addresses C000 and C010 have a special meaning. They are the data register and
the strobe (see reference [40] p. 79) of the keyboard. The directives:
KYBD EQU $C000
STRB EQU $C010
make it easy to input from the keyboard, since the programmer only has to memorize
the symbols KYBD, STRB instead of the addresses C000, C010. Notice that the ‘$’ is
Exercise 3.10 A 2-pass assembler can handle future symbols and an instruction
can therefore use a future symbol as an operand. This is not always true for direc-
tives. The EQU directive, for example, cannot use a future symbol. The directive
‘A EQU B+1’ is easy to execute if B is previously defined, but impossible if B is a
future symbol. What’s the reason for this?
Exercise 3.11 Suggest a way for the assembler to eliminate this limitation such
that any source line could use future symbols.
The label is mandatory and is assigned the value of the largest (smallest)
expression in the operand field.
Example:
AB EQU 0
CD EQU 5
GH EQU LQ+AB where LQ is a defined symbol
NOW MAX AB,CD,GH
Symbol NOW is assigned the maximum of the three operands. These directives are
similar to EQU and are used in those rare cases where the largest (or smallest) of a
group of symbols is needed.
32. MICCNT
Label Operation Operand
symbol MICCNT micro name
The symbol is set to the number of characters in the value of the micro (micros
are discussed in a later section). What makes this directive interesting is that the
symbol can be redefined. Thus:
33. SET
Label Operation Operand
symbol SET expression
The SET directive is similar to the EQU directive except that it allows a redefi-
nition of the symbol. Thus the usage:
.
.
G SET 1 point 1
.
.
G SET 2 point 2
.
.
is valid. Between points 1 and 2, the value of symbol G will be 1. From point 2 on,
the value of G will be 2. The SET directive is another example where a symbol can
be redefined. The execution of the SET directive and the reason for having such a
directive are discussed in chapter 4. Certain assemblers (see Ch. 8) use ‘=’ instead
of SET.
34. USING
Label Operation Operand
none USING name
35. DROP
Label Operation Operand
none DROP name
36. CSECT
Label Operation Operand
none CSECT a symbol
This directive is used on the IBM 360 computers to divide a program into parts,
to be assembled separately. Each part starts with a CSECT and ends with the next
CSECT or with the END directive. The PSECT directive on the VAX is similar. It is
described in Ch. 1.
37. EXTRN
Label Operation Operand
none EXTRN list of symbols
The symbols in the operand field are declared as external symbols, symbols
that are used in the current program but are declared elsewhere. In the example
above, the declaration:
EXTRN Q or EXTRN P,Q
should be added to program A, preferably at the beginning of the program. The
assembler still does not have a value for Q but, because of the EXTRN declaration, it
treats the JSR instruction in a special way and does not consider it an error. The
programs are separately assembled and, therefore, when the assembler assembles A,
it knows nothing about program B (it does not have a value for symbol Q). When
the assembler gets to program B, it has already forgotten everything about A (it has
erased the symbol table). Thus, in principle, the assembler cannot assemble the
JSR instruction. Only the loader, while loading all the programs as one executable
module, can use values of special symbols from one program in another. Thus our
JSR instruction presents a special case, a case that cannot be fully handled by the
assembler, and requires the help of the loader. It is interesting to note here that a
one-pass assembler, since it does not use a separate loader, cannot handle separate
source files and thus cannot support the EXTRN and ENTRY directives.
The assembler writes the JSR instruction on the object file without the value of
Q, and with special identification bits (id) indicating that the instruction is
incomplete, and that it is missing the value of symbol Q. The loader eventually reads the
object file, reads the JSR instruction, recognizes the id bits, and completes the in-
struction when it finds the value of symbol Q. This process is explained in chapter 7,
and in this section we will concentrate on the execution of the EXTRN directive.
The EXTRN specifies that Q is a special symbol, a symbol defined outside the
current program. The assembler stores Q in the symbol table with a special type
ext, and without a value. The instruction itself is assembled, and is written on the
object file, without the value of Q in its address field. The assembler stores a pointer
in the empty address field, to point to the symbol table entry for Q. At the end of
pass 2, the symbol table is erased, but the special entries in the symbol table are
copied by the assembler onto the object file, for later use by the loader. The object
file thus contains the object code followed by the special symbol table. It will later
be shown that there are other items in the object file. The following diagram shows
the symbol table entry (entry 5) for symbol Q.
The JSR instruction goes on the object file as the record ‘OpCode, 5, id=10’.
The id indicates (to the loader) that the 5 in the address field is not the value of a
symbol but a pointer to the special-symbol table at the end of the object file. That
table includes the entry ‘5 Q ext’ among, perhaps, other ones.
After recognizing the id and finding the entry, the loader knows that the JSR
instruction needs the value of symbol Q. This value is defined in program B and
should, therefore, be included in the object file generated by that program.
The id is a generalization of the concept of a relocation bit, and is discussed in
this chapter, in conjunction with the IDENT directive.
The EXTRN directive is executed in both passes. In pass 1 the symbol is entered
into the symbol table as a special symbol of type ‘ext’. In pass 2, the object
instruction is generated and the id bits added to it. Also, the special symbol table
38. ENTRY
Label Operation Operand
none ENTRY list of symbols
The symbols in the operand field are defined as entry points to the current
program. An entry point is a point in the program where it can be entered by a call
from another program. In program B above, Q (and possibly P) are entry points,
and the directive:
ENTRY Q or ENTRY P,Q
should appear at the beginning of B. The assembler executes the directive by storing
the symbols in the symbol table with a type of ent. An example follows:
name value type
1. .. .. rel
2. .. .. abs
3. P 12 ent
4. .. .. rel
5. .. .. rel
6. Q 23 ent
7. ..
At the end of pass 2, the symbol table is erased, but the special entries, those with
a type of ext or ent, go on the object file. When the loader reads the object file for
B, it finds the entry for Q and uses the value of Q to complete the JSR instruction
from program A.
This directive is executed partly in pass 1 (setting the type in the symbol table)
and partly in pass 2 (writing the special symbol table on the object file).
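The way the loader combines the ext entries of one object file with the ent entries of another can be sketched as follows. The record layout (opcode, address, id), the dictionaries standing in for the special symbol tables, and the function name are all assumptions; relocation of the entry values by the load address of program B is ignored.

```python
SPECIAL = 0b10   # id of an instruction that needs special relocation

def resolve_externals(records, ext_table, ent_values):
    """Complete instructions whose address field points to an ext entry.

    records    : list of (opcode, address, id) from program A's object file
    ext_table  : entry number -> external symbol name (from A)
    ent_values : entry point name -> value (from B's object file)
    """
    completed = []
    for opcode, address, id_bits in records:
        if id_bits == SPECIAL:
            name = ext_table[address]     # the address field is a pointer
            address = ent_values[name]    # the value exported by ENTRY
        completed.append((opcode, address))
    return completed
```

Using the values of the example table, the record ‘JSR, 5, id=10’ becomes a JSR to address 23 once the loader has seen the entry for Q.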
3.12 Data Generation Directives
(ASCII, ASCIIZ, BSSZ, CON, *DATA, DEC, DEF, DIS, LIT, LITORG, PACKED, RECORD,
STRUC, VFD)
39. ASCII
Label Operation Operand
optional ASCII char string
The ASCII codes of all the characters in the string are generated and stored in
consecutive bytes in memory. If there is a label, its value is the address of the first
byte. Example:
ASCII "HELLO THERE"
generates (in octal): 110 105 114 114 117 040 124 110 105 122 105 in 11
consecutive bytes.
40. ASCIIZ
Label Operation Operand
optional ASCIIZ char string
Same as the ASCII directive above except that the string of bytes is followed
by a zero byte. This feature is for generating strings that C programs can use.
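The bytes generated by ASCII and ASCIIZ can be computed directly. The sketch below is only an illustration; the function name and the zero_terminated flag are invented here.

```python
def ascii_bytes(text, zero_terminated=False):
    """Bytes generated by ASCII (or by ASCIIZ when zero_terminated is True)."""
    data = text.encode("ascii")          # one byte per character
    return data + b"\0" if zero_terminated else data

# The codes of a string, in three-digit octal, one per byte.
codes = [format(b, "03o") for b in ascii_bytes("HELLO THERE")]
```

Printing " ".join(codes) lists the octal code of each byte, which makes it easy to check a listing by hand.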
41. BSSZ
Label Operation Operand
symbol BSSZ expression
Similar to BSS except the block being reserved is initialized to all zeros.
Exercise 3.14 Why is BSSZ described here and not next to BSS?
42. CON
Label Operation Operand
symbol CON expr1,expr2,. . .
The expressions are evaluated and stored in consecutive words, one value per
word. This directive is similar to DATA (see below) except that expressions, rather
than constants, can be used, and each value is stored in one word.
The DATA or DC (Define Code) directive preloads the constants in the operand
field in successive memory locations. The constants can be of different types
(some assemblers support a large number of constant types, both numeric and
non-numeric) and can have different lengths.
Example:
AB DC 1,-5.3,‘DOD$’,XY
The operand field contains four constants (notice the three separating commas)
that are of different types. The first is the integer 1, the second, a real number,
the third, a string of length four, and the fourth, the value of symbol XY. The
assembler generates the binary values of all items and stores them in consecutive
locations, updating the LC in the process.
The assembler executes the DC in both passes. In pass 1 it evaluates the size of each
constant and updates the LC. In pass 2 it generates the binary constants and writes
them on the object file. Many assemblers permit expressions in a DC directive, thus
‘DC A-1,B-C,. . .’ is valid but all the symbols involved should be known at the time
the DC is encountered in pass 1. Such a DC may include an external symbol and,
in such a case, the assembler cannot evaluate the expression and has to generate a
modify loader directive.
If the programmer knows the sizes of the data items, they can refer to any of
the items by calculating the sizes of the preceding ones. Assuming that an integer
number occupies one word, a real, two words, and each word can contain two
characters, the total size of the first three items in the example above is five and the
following is true: The instruction ‘LOD AB+1,R1’ will load the real constant −5.3,
whereas ‘STO R1,AB+4’ will store register 1 in the word containing the characters
D$ (thereby erasing those characters). The instruction ‘CMP AB+5,R2’ will compare
the value of symbol XY (stored in location ‘AB+5’) with the contents of register 2.
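The address arithmetic above can be checked with a short Python sketch. It uses the sizes assumed in the text (an integer is one word, a real is two words, two characters per word) and further assumes that the value of a symbol occupies one word. The names size_in_words and layout are ours.

```python
import math

CHARS_PER_WORD = 2   # assumption taken from the text: two characters per word

def size_in_words(kind, value):
    """Size of one DC operand, using the sizes assumed in the text."""
    if kind in ("int", "sym"):
        return 1                                   # integer or symbol value: one word
    if kind == "real":
        return 2                                   # real number: two words
    if kind == "str":
        return math.ceil(len(value) / CHARS_PER_WORD)
    raise ValueError(f"unknown constant type: {kind}")

def layout(items):
    """Return (offsets, total size); the same bookkeeping pass 1 does with the LC."""
    offsets, lc = [], 0
    for kind, value in items:
        offsets.append(lc)
        lc += size_in_words(kind, value)
    return offsets, lc

# AB DC 1,-5.3,'DOD$',XY
items = [("int", 1), ("real", -5.3), ("str", "DOD$"), ("sym", "XY")]
offsets, total = layout(items)
print(offsets, total)
```

The offsets come out as 0, 1, 3 and 5, so the real constant starts at AB+1, the characters D$ are in the word at AB+4, and the value of XY is at AB+5.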
Some assemblers use DATA instead of DC while others support separate directives
for different data types. OCT, DEC, BCD are a few such directives, supported
mostly by old assemblers, that preload octal numbers, decimal numbers, and strings
in storage. The examples below are from the ASM 86 assembler (references [33–35]),
a modern, powerful assembler for Intel 16-bit microprocessors, but they are typical
of the data directives supported by many modern assemblers.
DB (Define Byte), DW (Define Word), DD (Define Double), DQ (Define Quad;
a quad is 8 consecutive bytes or 64 bits), DT (Define Ten; ten consecutive bytes).
These directives are powerful, as the following examples illustrate:
DB 12DUP(0)           12 bytes are preloaded with zeros
DW ?,?,?              3 words are reserved but not preloaded
DW 3DUP(?)            same as above
DB 2DUP(1,3DUP(2))    eight bytes are preloaded with 1,2,2,2,1,2,2,2
DQ 1.234E             a 64-bit real value
DD -9.8E-34           a 32-bit real value
DB ‘HELLO’,0DH,0AH    7 bytes, the last 2 with CR (0D hex) & LF (0A hex).
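The nested DUP operands above are easy to expand recursively. The following Python sketch is our own illustration, not code from ASM 86; it handles only the compact form used in the examples (a count followed immediately by DUP and a parenthesized list), and the name expand_data_dup is hypothetical.

```python
import re

DUP_HEAD = re.compile(r"(\d+)DUP\(")
ITEM = re.compile(r"[^,()]+")

def _parse_list(s, pos):
    """Parse a comma-separated list starting at s[pos]; return (items, new position)."""
    items = []
    while True:
        m = DUP_HEAD.match(s, pos)
        if m:                                   # count DUP( list )
            inner, pos = _parse_list(s, m.end())
            if pos >= len(s) or s[pos] != ")":
                raise ValueError("missing ')'")
            pos += 1
            items.extend(inner * int(m.group(1)))
        else:                                   # a single item such as 0, ? or 2
            m = ITEM.match(s, pos)
            if not m:
                raise ValueError(f"item expected at position {pos}")
            items.append(m.group(0))
            pos = m.end()
        if pos < len(s) and s[pos] == ",":
            pos += 1                            # more items follow
        else:
            return items, pos

def expand_data_dup(operand):
    """Expand an operand field such as '2DUP(1,3DUP(2))' into a flat list of items."""
    s = operand.replace(" ", "")
    items, pos = _parse_list(s, 0)
    if pos != len(s):
        raise ValueError(f"unexpected text at position {pos}")
    return items

print(expand_data_dup("2DUP(1,3DUP(2))"))
```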
44. DIS
Label Operation Operand
symbol DIS n,string
A total of n characters from the string are preloaded into successive words in
memory (as many words as necessary to hold n characters). If n is larger than
the length of the string, the string is repeatedly loaded into memory until all n
characters have been loaded.
45. DEF
Label Operation Operand
symbol DEF m[,I]
46. DEC
Label Operation Operand
symbol DEC d1[,d2,...,dn]
47. LIT
Label Operation Operand
none LIT list of expressions
The expressions are evaluated and their values inserted into the literal table.
This directive is normally not necessary since literals are added to the literal table
when they are found in the source. However, if LIT is used early in the program to
load the literal table, the assembler will not load it with duplicate values. Some ex-
perienced programmers like to know what their literal table contains. They initially
preload it with all the literals they think their program is using. At the end of the
second pass, they print the literal table and, if it contains anything not inserted by
them (through the LIT), they check for a possible error.
48. LITORG
Label Operation Operand
none LITORG expression
by the first pass and is read by the second pass. It contains a copy of the source
file plus information on literals. The literal table contains the name, value, size and
other attributes of each literal used in the program.
Example:
name value size
1000 4
13.6 8
-1 4
DATUM 5
13.60 8
Where the ‘value’ field contains the binary values of the literals. In the first pass,
each time a literal is found in the source program, the literal table is checked for
a literal with the same name. If no such literal exists, the new literal is added
to the table, otherwise, the existing literal is used. The relative address of the
literal is then written on the intermediate file following the instruction that uses
the literal. Notice that literals with different names and identical values (like 13.6,
13.60) appear as different entries in the literal table. When a LITORG is encountered
(or, in the absence of a LITORG, at the end of the first pass), the assembler copies
the ‘value’ fields from the literal table to memory (to the memory address specified
in the LITORG or, with no LITORG, the address following the end of the program)
and erases the table. Thus, several literal tables may be generated during the first
pass. In the second pass the intermediate file is read and, when an instruction is
encountered that uses a literal, the assembler finds the relative address of the literal
in the file, following the instruction. It uses this address to calculate the absolute
address of the literal, and assembles the instruction with that absolute address.
Literals are treated in some detail in chapter 1.
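The handling of the literal table in the first pass can be sketched in Python as follows. This is a simplified model under stated assumptions: literals are identified by their names (the source text of the literal), relative addresses are counted from the start of the current pool, and the class and method names are ours.

```python
class LiteralTable:
    """A minimal model of the literal table used during pass 1."""

    def __init__(self):
        self.entries = {}        # literal name -> (relative address, size)
        self.next_address = 0

    def use(self, name, size):
        """Called each time a literal is found; returns its relative address."""
        if name not in self.entries:             # new literal: add it to the table
            self.entries[name] = (self.next_address, size)
            self.next_address += size
        return self.entries[name][0]             # existing literal: reuse it

    def litorg(self):
        """Return the current pool (to be copied to memory) and erase the table."""
        pool = [(name, addr, size) for name, (addr, size) in self.entries.items()]
        self.entries = {}
        self.next_address = 0
        return pool

table = LiteralTable()
for name, size in [("1000", 4), ("13.6", 8), ("-1", 4), ("DATUM", 5), ("13.60", 8), ("13.6", 8)]:
    print(name, table.use(name, size))
```

The second use of 13.6 returns the address of the existing entry, while 13.60 gets an entry of its own because its name is different.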
49. PACKED
Label Operation Operand
symbol PACKED list of expressions
The expressions are calculated and their values are stored in successive memory
locations in packed decimal form.
50. RECORD
Label Operation Operand
symbol RECORD field1:size1,field2:size2,. . .
Example:
ID RECORD YR:3,CODE:4,GEN:1
Symbol ID is now the name of a record (just a template, not any actual area in
memory) containing three fields: YR, a 3-bit value, CODE, a number between 0 and
15, and GEN, a single bit. The total length of the record is thus 8 bits. Each ‘size’
field is the size (in bits) of a field in the record. To actually allocate memory to
such a record, other directives are used. For example:
ACT ID 6DUP(?)
ACT is 6 consecutive bytes, each a record of type ID, uninitialized.
SAM ID 12DUP(5,0,1)
SAM is 12 consecutive bytes, each a record of type ID. The individual fields are
initialized to ‘YR=5, CODE=0, GEN=1’.
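To see what one byte of SAM might contain, the fields can be packed in Python. This is a sketch under two assumptions that the text does not state: the first field occupies the high-order bits of the byte, and CODE is 4 bits wide (inferred from its range of 0 to 15). The name pack_record is ours.

```python
def pack_record(layout, values):
    """Pack field values into one integer, first field in the high-order bits (assumption)."""
    word = 0
    for (name, size), value in zip(layout, values):
        if not 0 <= value < (1 << size):
            raise ValueError(f"{name}={value} does not fit in {size} bits")
        word = (word << size) | value            # append the field on the right
    return word

ID = [("YR", 3), ("CODE", 4), ("GEN", 1)]        # 3 + 4 + 1 = 8 bits
byte = pack_record(ID, (5, 0, 1))
print(f"{byte:08b}")
```

With these assumptions each byte of SAM holds the bit pattern 10100001.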
Records allow for easy manipulation of small fields and individual bits while
packing them tightly in either bytes or words. The following example specifies
default values for the individual fields.
VOL RECORD X:8=‘A’,Y:8=‘B’
The following source lines illustrate actual storage allocation.
BAT VOL <,‘Q’>
BAT is a record of type VOL with field Y initialized to ‘Q’ and field X having its
default value ‘A’.
CAT VOL <>
CAT is a record of type VOL (2 bytes) where the X, Y fields have the default values.
To access a record, an instruction such as the one below can be used.
MOV AX,OFFSET CAT
This is an 8086 MOV instruction [37,38] whose general form is
MOV destination,source
The MOV instruction above moves OFFSET CAT (the address of record CAT) to
the 16-bit register AX. The instruction
MOV AH,OFFSET SAM[3]
moves the address of the third component of the array SAM of records to the 8-bit
register AH.
For more information on records see [34,35,37].
51. STRUC
Label Operation Operand
symbol STRUC none
This defines a structure called DIM with 4 fields, the first one is named LEN and is
2 bytes long. The last one has the name MISC and is ten bytes long. To actually
build such a structure in memory, source lines such as:
DAN DIM
BAN DIM 5DUP(?)
can be used. The first declares DAN as a structure of type DIM, the second one
declares BAN as an array of 5 such structures. To access a field in a structure, the
name of the structure should be followed by a .field name.
MOV DAN.LEN,AL moves the 8-bit register AL to field LEN of structure DAN.
52. VFD
Label Operation Operand
symbol VFD l1/expr1,l2/expr2,. . .
53. DECMIC
Label Operation Operand
name DECMIC expr,n
Like OCTMIC below except that the rightmost n decimal digits are used.
54. MICRO
Label Operation Operand
name MICRO n1,n2,dstring
dstring is a delimited string (it is preceded and followed by the same char-
acter). A substring of n2 characters, starting at position n1 is extracted and is
assigned the name in the label field. The name then becomes a micro name. If n2
is zero, the substring starts at position n1 and goes to the end of the original string.
Examples:
AN MICRO 1,,/STRING/ AN is the name of string STRING
BC MICRO 2,3,*NONESUCH* BC is the name of ONE
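The substring rule can be expressed in a few lines of Python. In this sketch (the function name micro is ours) positions are counted from 1, as in the examples, and a missing n2 is treated as zero.

```python
def micro(n1, n2, dstring):
    """Return the substring that a MICRO directive would assign to its name."""
    delim = dstring[0]
    if len(dstring) < 2 or dstring[-1] != delim:
        raise ValueError("the string must start and end with the same delimiter")
    body = dstring[1:-1]                 # strip the two delimiters
    start = n1 - 1                       # positions are 1-based
    if not n2:                           # n2 zero or missing: go to the end of the string
        return body[start:]
    return body[start:start + n2]

print(micro(1, 0, "/STRING/"))
print(micro(2, 3, "*NONESUCH*"))
```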
55. OCTMIC
Label Operation Operand
name OCTMIC expr,n
The expression (which must be numeric) is evaluated by the assembler and the
rightmost n octal digits extracted. They become the string assigned as the value of
name.
Example: Suppose B is a symbol (EQU or SET) with absolute value 1024. Then
‘J OCTMIC B,6’ will assign the string 002000 to micro name J (since 2000 in octal equals 1024).
It is done by placing a micro name between the two micro delimiters (=) at
any point in any source line (except a comment line). The string is then substituted
for the micro name. Examples:
COMP =BC=,R1 is assembled as COMP ONE,R1
ADD #=J=,R2 is assembled as ADD #002000,R2.
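The following Python sketch imitates OCTMIC, DECMIC and the substitution of micro names. It assumes non-negative values, pads short results with leading zeros (as in the 002000 example), and uses = as the micro delimiter. All function names are ours.

```python
import re

def octmic(value, n):
    """Rightmost n octal digits of value, as a string."""
    return format(value, "o").zfill(n)[-n:]

def decmic(value, n):
    """Rightmost n decimal digits of value, as a string."""
    return str(value).zfill(n)[-n:]

def substitute(line, micros, delim="="):
    """Replace every =name= in a source line by the string assigned to that micro name."""
    pattern = re.escape(delim) + r"(\w+)" + re.escape(delim)
    # An unknown micro name raises KeyError; a real assembler would flag an error instead.
    return re.sub(pattern, lambda m: micros[m.group(1)], line)

micros = {"BC": "ONE", "J": octmic(1024, 6)}
print(substitute("COMP =BC=,R1", micros))
print(substitute("ADD #=J=,R2", micros))
```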
56. ERR
Label Operation Operand
flag ERR none
This directive generates an assembler error of the type indicated by the flag.
This can be useful in certain conditional assemblies. The example below assumes
knowledge of macros and conditional assembly.
57. ERRxx
Label Operation Operand
flag ERRxx expression
Will generate an assembler error of type flag when a condition detected during
pass 2 is true.
Example:
.
.
H DS 0              H is an array of length 0. It is the last address in the program
R ERRPL H-X’FF00    if H-FF00 (hex) is positive (PL), generate an error of type R.
END
The condition xx can be PL, MI, ZR, NZ.
58. EJECT
Label Operation Operand
none EJECT none
It causes the next line of the listing to appear at the top of the next page
following the page heading. Some assemblers use PAGE instead of EJECT.
59. LIST
Label Operation Operand
none LIST ON or OFF
Used to control the listing of the program. LIST ON instructs the assembler
to list subsequent lines. LIST OFF instructs it to suppress listing. The LIST direc-
tive may appear anywhere in the program and is effective until the next LIST is
encountered or until the end of the program, whichever occurs first. Some assem-
blers support two directives LIST, NOLIST instead of a LIST ON/OFF. It should be
emphasized that the LIST directive affects the listing file only. It does not affect
the assembly of the program. The entire program is assembled but the programmer
may decide to list only certain parts of it.
60. PDC
Label Operation Operand
none PDC none
61. SBTTL
Label Operation Operand
none SBTTL a string
Specifies a one-line subheading for the listing file. The assembler prints the
subheading as the second line of each listing page. The first line is the heading (see
TITLE below for other features of the subheading).
62. SPACE
Label Operation Operand
none SPACE n
Inserts n blank lines in the listing. When the number of blank lines exceeds
the number of lines left on the page, a page is ejected, a new heading is printed,
and the rest of the blank lines are allocated on the new page.
63. TITLE
Label Operation Operand
none TITLE a string
Specifies a one line heading for the listing file. The assembler prints the heading
at the beginning of every page. The TITLE directive can appear anywhere in the
program, and each TITLE supersedes its predecessor. Each TITLE except the first
one also causes a page eject. This directive is similar to SBTTL above and both have
the same features.
64. XREF
Label Operation Operand
none XREF string
Controls the printing of the cross-reference table. This table lists all the sym-
bols defined in the program and for each symbol, the numbers of all source lines
referring to it. The operand is a string containing characters that specify whether
to print the table and how to list references to a symbol, by line number within the
source file, or by page number and line within the page.
65. %OUT
Label Operation Operand
none %OUT string
It is useful for displaying progress through a long assembly. When the %OUT is
encountered, the operand (a string) is displayed on the standard output device.
3.18 Remote Assembly Directives
(HERE, RMT)
A pair of RMT directives defines source code that is to be remotely saved for
later assembly. The HERE directive directs the assembler to assemble part of the
remote code at a certain point. If any remote code remains unassembled when END
is encountered, it is assembled at the end of the program.
66. HERE
Label Operation Operand
optional HERE none
A remote code that was saved before is fetched and assembled. If the label field
has a name, only remote code with that name is assembled. If there is no name, all
unlabeled remote code existing at this point is fetched and assembled.
67. RMT
Label Operation Operand
name RMT none
The name is optional. A pair of RMT directives with the same name (or both
without names) delimit a section of code that is saved, to be later assembled by a
HERE directive.
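The interplay of RMT, HERE and END can be sketched in Python. The sketch represents each source line as a hypothetical (label, operation, operand) tuple, uses an empty label for unnamed remote code, and assumes that leftover remote code is placed just before the END line. The name process_remote is ours.

```python
def process_remote(lines):
    """lines: (label, operation, operand) tuples; returns the lines in assembly order."""
    saved = {}            # remote name ('' when unnamed) -> saved source lines
    out = []
    open_name = None      # name of the RMT block being saved, or None
    for line in lines:
        label, op, _ = line
        if op == "RMT":
            if open_name is None:
                open_name = label               # first RMT of a pair: start saving
                saved.setdefault(label, [])
            elif label == open_name:
                open_name = None                # second RMT of the pair: stop saving
            else:
                raise ValueError(f"RMT {label!r} inside remote block {open_name!r}")
        elif open_name is not None:
            saved[open_name].append(line)       # saved, not assembled yet
        elif op == "HERE":
            out.extend(saved.pop(label, []))    # fetch remote code with this name
        elif op == "END":
            for block in saved.values():        # leftovers go at the end of the program
                out.extend(block)
            saved.clear()
            out.append(line)
        else:
            out.append(line)
    return out

source = [("", "LOD", "A"), ("T", "RMT", ""), ("", "ADD", "B"), ("T", "RMT", ""),
          ("", "RMT", ""), ("", "SUB", "C"), ("", "RMT", ""),
          ("", "STO", "D"), ("T", "HERE", ""), ("", "END", "")]
print([op for _, op, _ in process_remote(source)])
```

Here the named block T is assembled at the HERE line, and the unnamed block, never fetched by a HERE, is assembled when END is reached.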
68. DUP
Label Operation Operand
name DUP rep,lcnt
rep specifies how many times to duplicate and assemble the sequence. lcnt
specifies the number of source lines in the sequence. The sequence starts on the line
following the DUP and is lcnt lines long. The lcnt parameter is optional and if it
is missing, the sequence goes up to the nearest ENDD. The name is also optional and
serves to identify the sequence duplicated. DUP sequences may be nested, in which
case the names are important and define the inner and outer DUP ranges.
Example:
OUT DUP 3 duplicate 3 times
ADD
INN DUP 2 an inner DUP duplicated twice
SUB
INN ENDD
LOD
OUT ENDD
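The expansion of the nested example can be traced with a short recursive Python sketch. Source lines are represented as hypothetical (label, operation, operand) tuples, and an ENDD is matched to its DUP by comparing labels. The name expand_dup_lines is ours.

```python
def expand_dup_lines(lines):
    """lines: (label, operation, operand) tuples; returns the lines with DUPs expanded."""
    out, i = [], 0
    while i < len(lines):
        label, op, operand = lines[i]
        if op != "DUP":
            out.append(lines[i])
            i += 1
            continue
        parts = operand.split(",")
        rep = int(parts[0])
        if len(parts) > 1 and parts[1].strip():
            end = i + 1 + int(parts[1])          # lcnt given: body is the next lcnt lines
            nxt = end
        else:
            end = next((j for j in range(i + 1, len(lines))
                        if lines[j][1] == "ENDD" and lines[j][0] == label), None)
            if end is None:
                raise ValueError(f"DUP {label!r} has no matching ENDD")
            nxt = end + 1                        # skip the ENDD itself
        body = expand_dup_lines(lines[i + 1:end])  # inner DUPs are expanded first
        out.extend(body * rep)
        i = nxt
    return out

source = [("OUT", "DUP", "3"), ("", "ADD", ""), ("INN", "DUP", "2"), ("", "SUB", ""),
          ("INN", "ENDD", ""), ("", "LOD", ""), ("OUT", "ENDD", "")]
print([op for _, op, _ in expand_dup_lines(source)])
```

The result is the sequence ADD, SUB, SUB, LOD repeated three times.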
69. ECHO
Label Operation Operand
name ECHO lcnt,p1=list1,p2=list2,. . .
The name and lcnt fields are as before. p1, p2,. . . are parameters that appear
in the sequence to be duplicated. Each time the sequence is duplicated, each
occurrence of p1 in the body of the sequence is replaced with something from list1,
each occurrence of p2 is replaced with something from list2, etc.
Each listi should have the form ‘(a1,a2,...,an)’ where a1 is substituted for
pi on the first duplication, a2 is substituted for pi in the second duplication, and
so on.
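A possible reading of ECHO is sketched below in Python. It assumes that the number of duplications equals the length of the lists, that all lists have the same length, and that a parameter is recognized only as a whole word. The body lines, parameter names and the function name echo are made up for the illustration.

```python
import re

def echo(body, **lists):
    """Duplicate body once per list element, substituting the k-th element on the k-th copy."""
    if not lists:
        raise ValueError("ECHO needs at least one parameter list")
    lengths = {len(subs) for subs in lists.values()}
    if len(lengths) != 1:
        raise ValueError("all lists are assumed to have the same length")
    # One pattern for all parameters, so a substituted value is never substituted again.
    pattern = re.compile(r"\b(" + "|".join(re.escape(p) for p in lists) + r")\b")
    out = []
    for k in range(lengths.pop()):
        for line in body:
            out.append(pattern.sub(lambda m: lists[m.group(1)][k], line))
    return out

for line in echo(["LOD P1", "STO P2"], P1=["A", "B"], P2=["X", "Y"]):
    print(line)
```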
70. ENDD
Label Operation Operand
name ENDD none
71. STOPDUP
Label Operation Operand
none STOPDUP none
72. OPDEF
Label Operation Operand
name OPDEF parameters
73. PURGDEF
Label Operation Operand
name PURGDEF none
The instruction named in the label field is removed. It can only be something
originally defined with an OPDEF.
74. OPSYN
Label Operation Operand
mnemonic OPSYN mnemonic
This makes the mnemonic in the label field synonymous with the mnemonic in
the operand field. However, if the operand field is blank, this directive deletes the
instruction from the OpCode table (actually, it is deactivated in the table).
Example: STORE OPSYN STA
Declares the new mnemonic STORE to be the same as the existing mnemonic
STA. The programmer may use this directive to change the names of mnemonics
to more familiar ones, or to ones easier to remember. Notice that STA can still be
used. This directive applies to mnemonics of instructions and of directives. Thus
‘EQUATE OPSYN EQU’ declares EQUATE as another way of saying EQU.
This directive is rarely supported by assemblers since it is hard to implement. It
requires the OpCode table to be dynamic, so that the assembler can insert the new
mnemonics. The OpCode table is usually static, which allows for a fast search. In a
dynamic OpCode table the search is slower and, since OpCode table search is done
very often, a dynamic OpCode table should be very carefully designed. The main
design guideline is: If the OPSYN directive has not been used (no new mnemonics
added to the OpCode table), the search time in the dynamic table should be the
same as that of a static table. Only the actual insertion of new mnemonics should
degrade the search time. Such an OpCode table can be designed in two ways:
1. As a hash table (chapter 2). A hash table can be designed such that originally
any mnemonic can be found in a one step search. As mnemonics are added,
the search time degrades.
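The design guideline above can be illustrated with a small Python sketch. A Python dictionary is itself a hash table, so this is only a model of the idea: the original table is never modified, and the extra tables are consulted only after OPSYN has actually been used. The class name, the method names and the opcode values are hypothetical.

```python
class OpCodeTable:
    """A sketch of an OpCode table that stays fast until OPSYN is first used."""

    def __init__(self, static_entries):
        self._static = dict(static_entries)   # built once; never changed afterwards
        self._added = {}                      # mnemonics inserted by OPSYN
        self._inactive = set()                # mnemonics deactivated by OPSYN

    def lookup(self, mnemonic):
        if self._inactive and mnemonic in self._inactive:
            return None
        if self._added and mnemonic in self._added:
            return self._added[mnemonic]
        return self._static.get(mnemonic)

    def opsyn(self, new, old=""):
        if not old:                           # blank operand: deactivate the mnemonic
            self._inactive.add(new)
            return
        entry = self.lookup(old)
        if entry is None:
            raise KeyError(f"unknown mnemonic: {old}")
        self._added[new] = entry              # new name shares the entry of the old one
        self._inactive.discard(new)

table = OpCodeTable({"STA": ("instruction", 0o23), "EQU": ("directive", None)})
table.opsyn("STORE", "STA")                   # STORE OPSYN STA
table.opsyn("EQUATE", "EQU")                  # EQUATE OPSYN EQU
print(table.lookup("STORE"), table.lookup("STA"), table.lookup("EQUATE"))
```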
3.22 Summary
Directives illustrate the hidden power of the assembler. It should now be clear
to the reader that the main task of assembling instructions is simple, and consumes
only a fraction of the power of the assembler. The many directives presented here
range from very simple to very complex (the most complicated ones are covered in
the next chapter), they provide the user with many services, and constitute more
than half the volume of a typical two-pass assembler. One-pass assemblers, designed
for speed, support just a few directives and are, therefore, much simpler.
Extend project 1–3 by adding the EXTRN, ENTRY directives plus five more di-
rectives of your choice (except macros, conditional assembly and listing control
directives).
The OPDEF, PURGDEF and OPSYN directives affect the OpCode table. If those
directives are supported, the OpCode table can no longer be static. Design and
implement a dynamic OpCode table and add it to any of chapter 1 projects, to
support the three directives above.
Pseudo instructions (also called directives or simply pseudos) provide
information or directions to the assembler that affect the translation
process
— Roy S. Ellizey, Computer System Software (1987)
4. Macros
Webster [42] defines the word macro (derived from the Greek μακρός) as meaning
long, great, excessive or large. The word is used as a prefix in many compound
technical terms, e.g., Macroeconomics, Macrograph, Macronesia. We will see that
a single macro directive can result in many source lines being generated, which jus-
tifies the use of the word macro in assemblers. As mentioned in the introduction,
macros were introduced into assemblers very early in the history of computing, in
the 1950s.
4.1 Introduction
Strictly speaking, macros are directives but, since they are so commonly used
(and also not easy for the assembler to execute), most assemblers consider them
features, rather than directives. The concept of a macro is not limited to assemblers. It
is useful in many applications and has been used in many software systems.
References [19–22] describe the use and implementation of macros in general. Knuth [36]
describes the use of macros in a large software system. Dellert [92] is an interesting
example of the use of macros in reprogramming (translating or rewriting a program
from computer A to computer B).
A macro is similar to a subroutine (or a procedure), but there are important
differences between them. A subroutine is a section of the program that is written
once, and can be used many times by simply calling it from any point in the program.
Similarly, a macro is a section of code that the programmer writes (defines) once,
and then can use many times. The main difference between a subroutine and a
macro is that the former is stored in memory once (just one copy), whereas the
latter is duplicated as many times as necessary.
A subroutine call is specified by an instruction that is executed by the hardware.
The hardware saves the return address and causes a branch to the start of the
subroutine. Macros, however, are handled in a completely different way. A macro
call is specified by a directive that is executed by the assembler. The assembler
generates a copy of the macro and places it in the program, in place of the directive.
This process is called macro expansion, and each expansion causes another copy of
the macro to be generated and placed in the program, thus increasing the size of
the object code.
As a result, subroutines are completely handled by the hardware, at run time;
macros are completely handled by the assembler, at assembly time. The assembler
knows nothing about subroutines; the hardware knows nothing about macros. A
subroutine call instruction is assembled in the usual way and is treated by the
assembler as any other instruction. Macro definition and macro expansion, however,
are executed by the assembler, so it has to know all the features, options and
exceptions associated with them. The hardware, on the other hand, executes the
subroutine call instruction, so it has to know how to save the return address and
how to branch to the subroutine. The hardware, however, gets to execute the object
code after it has been assembled, with the macros expanded in place. By looking
at an instruction in the object code, it is impossible to tell whether it came from
the main program or from an expanded macro. The hardware thus executes the
program in total ignorance of macros.
Figure 4–1 illustrates the differences between subroutines and macros:
When program A in figure 4–1a is executed, execution starts at label K and each
‘CALL N’ instruction causes the subroutine (section 1) to be executed. The order of
execution will thus be 2,1,3,1,4. Even though section 1 is the first in the program,
it is not the first to be executed since it constitutes the definition of a subroutine.
When the program is assembled, the assembler reads and assembles the source file
straight through. It does not change the positions of the different sections, and
does not treat section 1 (the subroutine) in any special way. The order of execution
has therefore to do only with the way the CALL instruction works. This is why
subroutines are a hardware feature.
Program B (Fig. 4–1b) is handled in a different way. The assembler reads the
MACRO, ENDM directives and thus recognizes the two instructions DIV, OUT as the
body of macro N. It then places a copy of that body wherever it finds a source line
with N in the operation field. The output is the same program, with the macro
definition removed, and with all the expansions in place. This output (Fig. 4–1c) is
ready to be assembled, in the usual way, by passes 1 and 2. This is why macros are
an assembler feature and handling them is done by a special pass, pass 0, where a
new source file is generated (Fig. 4–1d) to be read by pass 1 as its source file.
Having a separate pass 0 simplifies the design of the assembler, since it divides
the entire assembly job in a logical way between the three passes. The user, of
course, has to pay a price in the form of increased assembly time, but this is a
reasonable price to pay for the added power of the assembler. It is possible to
combine passes 0 and 1 into one pass, which speeds up the assembler. However,
this results in a very complex pass 1, which takes more time to write and debug,
and reduces assembler reliability.
The task of pass 0 is thus to read the source file, handle all macro definitions
and expansions, and generate a new source file that is identical to the original
file, except that it does not have the macro definitions, and it has all the macro
expansions in place. In principle, the new source file should have no mention of
macros; in practice, it needs to have some macro information which eventually is
transferred to pass 2, to be written on the listing file. This point is further discussed
below.
Those two directives always come in pairs. The MACRO directive defines the start of
the macro definition, and should have the macro name in the label field. The ENDM
directive specifies the end of the definition.
Some assemblers use different syntax to define a macro. The IBM 360 assembler
uses the following syntax:
MACRO
&p1 name &p2,&p3,. . .
.
.
MEND instead of ENDM
where &p1, &p2 are parameters (explained later), each starting with an ampersand
‘&’.
To expand a macro, the name of the macro is placed in the operation field, and
no special directives are necessary.
.
.
COMP ..
NU
SUB ..
.
.
The assembler recognizes NU as the name of a macro, and expands the macro by
placing a copy of the macro definition between the COMP and SUB instructions. The
object code generated will contain the codes of the five instructions:
COMP ..
LOD A
ADD B
STO C
SUB ..
Handling macros involves two separate phases: handling the definition and
handling the expansions. A macro can only be defined once (see the discussion of
nested macros later for exceptions, however), but it can be expanded many times.
Handling the definition is a relatively simple process. The assembler reads the
definition from the source file and saves it in a special table, the Macro Definition
Table (MDT). The assembler does not try to check the definition for errors, to
assemble it, execute it, or do anything else with it. It just saves the definition as it
is (again, there is an exception, mentioned below, that has to do with identifying
parameters). On encountering the MACRO directive, the assembler switches from the
normal mode of operation to a special macro-definition mode in which it:
1. Locates available space in the MDT.
2. Reads source lines and saves them in the MDT until an ENDM is read.
Upon reading ENDM from the source file, the assembler switches back to the
normal mode. If the ENDM is missing, the assembler stays in the macro definition
mode and saves source lines in the MDT until an obvious error is found, such as
another MACRO, or the END of the entire program. In such a case, the assembler
issues an error (runaway definition) and aborts the assembly.
Handling a macro expansion starts when the assembler reads a source line that
is not any instruction or directive. The assembler searches the MDT for a macro
with that name and, on locating it, switches from the normal mode of operation to
a special macro-expansion mode in which it:
1. Reads a source line from the MDT.
2. Writes it on the new source file, unless it is a pass 0 directive, in which case it is
immediately executed.
3. Repeats the two steps until the end of the macro is located in the MDT.
The following example illustrates this process. The macro definition contains
an error and a label.
BAD MACRO
ADD #1,R4
A$D R5 wrong mnemonic
LAN CMP R3,R5
ENDM
The definition is stored in the MDT with the error (A$D) and the label. Since the
assembler copies the macro definition verbatim, it does not recognize LAN as a label
at this point. The macro may later be expanded several times, causing several
copies to be written onto the new source file. Pass 0 does not check these copies in
any way and, as a result, does not issue any error messages (note that pass 0 does
not handle labels and does not maintain the symbol table). When pass 1 reads the
new source file, it discovers the multiple definitions of LAN and issues an error on
the second and subsequent definitions. When pass 2 assembles the instructions, it
discovers the bad A$D instructions and flags each of them.
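The two modes of pass 0 described above can be sketched in Python and tried on macro BAD. The sketch represents source lines as hypothetical (label, operation, operand) tuples, keeps the MDT as a dictionary, and ignores parameters and pass 0 directives. The name pass0 is ours.

```python
def pass0(source):
    """source: (label, operation, operand) tuples; returns the new source file."""
    mdt = {}               # Macro Definition Table: macro name -> saved body lines
    out = []
    defining = None        # name of the macro being defined, or None in normal mode
    for line in source:
        label, op, _ = line
        if defining is not None:                  # macro-definition mode
            if op == "ENDM":
                defining = None                   # back to normal mode
            elif op in ("MACRO", "END"):
                raise SyntaxError(f"runaway definition of macro {defining}")
            else:
                mdt[defining].append(line)        # saved verbatim, not checked
        elif op == "MACRO":
            defining = label                      # the macro name is in the label field
            mdt[defining] = []
        elif op in mdt:                           # macro-expansion mode
            out.extend(mdt[op])                   # copy the body to the new source file
        else:
            out.append(line)
    if defining is not None:
        raise SyntaxError(f"runaway definition of macro {defining}")
    return out

source = [("BAD", "MACRO", ""), ("", "ADD", "#1,R4"), ("", "A$D", "R5"),
          ("LAN", "CMP", "R3,R5"), ("", "ENDM", ""),
          ("", "BAD", ""), ("", "BAD", ""), ("", "END", "")]
new_source = pass0(source)
labels = [label for label, _, _ in new_source if label]
print(labels)
```

The new source file contains label LAN twice and the bad mnemonic A$D twice, and pass 0 reports neither. Both are left for the later passes to detect.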
Exercise 4.1 In such a case, how can we ever define a macro with a label?
This does not sound like a good way to implement macros. It would seem
better to assemble the macro when it is first encountered, i.e., when its definition is
found, and to store the assembled version in the MDT. The reason why assemblers
do not do that but rather treat macros as described above, is because of the use of
parameters.
X. The process of placing the actual arguments in place of the formal parameters
Exercise 4.2 Consider the case of an actual argument that happens to be identical
to a formal parameter. If the macro of example 2 above is expanded as ‘MG2 B,X,Y’,
we would end up with the expansion
LOD B
ADD X
STO Y
However, B is the name of the second parameter. Would the assembler perform
double substitution, to end up with LOD X?
Example 3 is even more striking. Here the parameters are used in the operation
field. The operands are always the same. When such a macro is expanded, the
user should specify three arguments which are valid mnemonics. The expansion
‘MG3 LOD,SUB,STO’ would generate:
LOD G
SUB H
STO I
whereas the expansion ‘MG3 LOD,CMP,JNE’ would generate
LOD G
CMP H
JNE I
which is a very different macro. It is obvious now that such a macro cannot be
assembled when it is defined.
In example 4 the parameter is in the label field. Each expansion of this macro
will have to specify an argument for the parameter, and that argument will become
a new label. Thus ‘MG4 NON’ generates
LOD G
NON ADD H
STO I
and ‘MG4 BON’ generates
LOD G
BON ADD H
STO I
Each expansion involves the creation of a new label that will be added to the
symbol table in pass 1. To avoid multiply-defined labels, each expansion should use
a different argument. The argument could also be null (see below), which would
generate no label. It is important to realize, however, that the label becomes known
only when the macro is expanded and not before. Macro expansions, therefore, must
be done early in the assembly process. They cannot be done in the second pass
because the symbol table must be stable during that pass. All macro expansions
are done in pass 0 (except in assemblers that combine passes 0 and 1).
The last example of the use of parameters is a macro whose arguments may be
compound.
C MACRO L1,L2,L3,L4,L5,L6
ADD L1,L2(2) L2 is assumed compound and its 2nd component used
L3
B’L4 DEST
C’L5’D L6
.
.
ENDM
which illustrates the following points about handling arguments in a macro expan-
sion:
1. There are two spaces between the ADD and the SUM on the first line. This
is because the macro definition has two spaces between the ADD and the L1.
In the second line, though, there is only one space between the SUB and the
R1. This is because the argument in the expansion has one space in it. The
assembler expands a macro by copying the arguments, as strings, into the
original macro line without any editing (except that when the argument is
compound, its parentheses are stripped off). In the third line, the parameter
occupies three positions (’L4) but the argument Z only takes one position.
When Z is substituted for ’L4, the source line becomes two positions shorter,
which means that the rest of the line is moved two positions to the left. The
assembler also preserves the original space between L4 and DEST. As a result,
the expanded line has one space between BZ and DEST.
2. The second line of the definition simply reads L3. The assembler replaces L3
by the corresponding argument ‘SUB R1,L1’ and, before trying to assemble
it, scans the line again, looking for more occurrences of parameters. In our
example it finds that the newly generated line has an occurrence of L1 in it.
This is immediately replaced by the argument corresponding to L1 (SUM) and
the line is then completely expanded. The process of macro expansion turns
out to be a little more complex than originally described.
3. In line three of the definition the quote (’) separates the character B from the
parameter L4. On expanding this line, the assembler treats the quote as a
separator, removes it, and concatenates the argument Z to the character B,
thus forming BZ. If BZ is a valid mnemonic, the line can be assembled. This is
another example of a macro line that has no meaning before being completely
expanded.
4. The argument corresponding to parameter L5 is null. The result is the string
CD with nothing between the C and the D. Again, if CD is a valid mnemonic (or
directive), the line can eventually be assembled (or executed). Otherwise, the
assembler flags it as an error.
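The four points above can be tried out with a Python sketch. It is our own reading of the rules, not the algorithm of any particular assembler: arguments are copied as strings, the parentheses of a compound argument are stripped, a selector such as L2(2) picks one component, the quote acts as a separator that is removed, and each line is rescanned until no parameter remains. The expansion used is ‘C SUM,(D,T,U),(SUB R1,L1),Z,,SN’, with a null fifth argument. All function names are hypothetical.

```python
import re

def split_args(text):
    """Split an argument string on top-level commas; parentheses group a compound argument."""
    args, depth, current = [], 0, ""
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            args.append(current)
            current = ""
        else:
            current += ch
    args.append(current)
    return args

def expand_line(line, params, args, max_rescans=10):
    """Substitute arguments for parameters in one macro line, rescanning the result."""
    table = dict(zip(params, args))
    names = "|".join(re.escape(p) for p in params)
    # optional quote separator, parameter name, optional (n) selector, optional quote
    pattern = re.compile(rf"[’']?\b({names})\b(?:\((\d+)\))?[’']?")

    def substitute(match):
        arg = table[match.group(1)]
        if arg.startswith("(") and arg.endswith(")"):
            arg = arg[1:-1]                                  # strip the parentheses
            if match.group(2):                               # e.g. L2(2): pick one component
                arg = split_args(arg)[int(match.group(2)) - 1]
        return arg

    for _ in range(max_rescans):                             # rescan after each substitution
        new_line = pattern.sub(substitute, line)
        if new_line == line:
            break
        line = new_line
    return line

params = ["L1", "L2", "L3", "L4", "L5", "L6"]
args = split_args("SUM,(D,T,U),(SUB R1,L1),Z,,SN")
body = ["ADD  L1,L2(2)", "L3", "B’L4 DEST", "C’L5’D L6"]
for line in body:
    print(expand_line(line, params, args))
```

The limit on the number of rescans is a safety measure of the sketch, to stop an argument that keeps regenerating a parameter name.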
Note that there is a difference between a null argument and an argument that is
blank. In the expansion ‘C SUM,(D,T,U),(SUB␣R1,L1),Z,␣,SN’ (where ␣ denotes a
blank space), the fifth argument
is a blank space, which ends up being inserted between the C and the D in the
expanded source line. The final result is C␣D␣SN which is not the same as CD␣SN. It
could even be interpreted by the assembler as C=label, D=mnemonic, SN=operand.
If a mnemonic D exists, the instruction would be assembled and C would be placed
in the symbol table, without any error messages or warnings.
Exercise 4.4 What if the last argument of an expansion is null? How can the
assembler distinguish between a missing last argument and a null one?
A little thinking shows that the precise names of the formal parameters are
not important. The parameters are only used when the macro is expanded. No
parameter names are used after pass 0 (except that the original names should appear
in the listing). As a result, the assembler replaces the parameter names with serial
numbers when the macro is placed in the MDT. This makes it easier to locate
occurrences of the parameters when the macro is later expanded. Thus in one of the
examples above:
D MACRO A,B,C
LOD A
ADD B
STO C
ENDM
The vertical bar is used to separate source lines in the MDT. The ‘3’ in the
second field indicates that the macro has three parameters, and #1, #2 etc., refer
to the parameters. The bold vertical bar signals the end of the macro in the
MDT. This editing slows down the assembler when handling a macro definition but
it speeds up every expansion. Since a macro is defined once but may be expanded
many times, the advantage is clear.
It is now clear that three special characters are needed when macros are placed
in the MDT. In general, the assembler should use characters that cannot appear in
the original definition of the macro, and each assembler has a list of characters that
are invalid inside macros, or are even prohibited anywhere in the source file.
The example above is simple since each parameter is a single letter. In the
general case, a parameter may be a string of characters, called a token. The editing
process mentioned above is done by breaking up each source line into tokens. A
token is an indivisible unit of the line and is made up of a string of letters and digits,
or a single special character. Each token is compared to all parameter names and,
in case of a match, the token is replaced by its serial number. The example above
may now be rewritten with the parameter names changed to more than one letter.
D MACRO AD,BCD,ST7
LOD AD
ADD BCD
STO ST7
ENDM
Let’s follow the editing of the second line. It is broken up into the two tokens
ADD and BCD, and each token is compared with all three parameter names. The first
token ‘ADD’ almost matches the first parameter ‘AD’. The second one ‘BCD’ exactly
matches the second parameter. That token is replaced by the serial number ‘#2’
of the parameter, and the entire line—including the serial number—is stored in the
MDT. Our new example will be stored in the MDT in exactly the same way as
before, illustrating the fact that parameter names can be changed without affecting
the meaning of the macro.
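The editing step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name edit_definition is hypothetical, an ordinary ‘|’ separates the stored lines, and a doubled bar ‘||’ stands in for the bold vertical bar that ends the macro. A real assembler would use characters that cannot appear in a source line.

```python
import re

def edit_definition(lines):
    """Turn a macro definition into one MDT string with numbered parameters.

    Assumed layout (illustrative only): name | parameter count | body lines,
    separated by '|' and terminated by '||'.
    """
    header = lines[0].split()                 # e.g. ['D', 'MACRO', 'AD,BCD,ST7']
    name = header[0]
    params = header[2].split(",") if len(header) > 2 else []
    serial = {p: f"#{i}" for i, p in enumerate(params, start=1)}
    entry = [name, str(len(params))]
    for line in lines[1:-1]:                  # body only: skip the MACRO and ENDM lines
        # A token is a string of letters and digits, or a single special character.
        tokens = re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9]", line.strip())
        # A token is replaced only when it matches a parameter name exactly.
        entry.append("".join(serial.get(t, t) for t in tokens))
    return "|".join(entry) + "||"

short_names = ["D MACRO A,B,C", " LOD A", " ADD B", " STO C", " ENDM"]
long_names = ["D MACRO AD,BCD,ST7", " LOD AD", " ADD BCD", " STO ST7", " ENDM"]
print(edit_definition(short_names))   # D|3|LOD #1|ADD #2|STO #3||
print(edit_definition(long_names))    # D|3|LOD #1|ADD #2|STO #3||
```

Both definitions produce the same MDT entry, and the mnemonic ADD is left alone because it only almost matches the parameter AD.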
[Flow chart: the main operations of pass 0 in the normal (N), macro definition (D), and macro expansion (E) modes.]
2. If it is MACRO, read the entire macro definition and store in MDT. Goto 1.
5. In any other case, write the line on the new source file. Goto 1.
6. If current line was the END directive, stop (end of pass 0).
Exercise 4.5 Normally, the definition of a macro must precede any expansions of
it. If we eliminate that restriction, what modifications do we have to make to the
assembler?
where the last separator is immediately followed by the name of the next macro.
Such an MDT is easy to search by following the pointers and comparing names.
Since the pointers point backwards, the table is searched from the end to the be-
ginning; an important feature. It guarantees that when a multiply-defined macro is
expanded, the back search will always find the last definition of the macro. Multiply-
defined macros are a result of nested macro definitions (see later), or of macros that
users write to supersede system macros. In either case, it is reasonable to require
that the most recent definition always be used. The advantage of this organization
is its flexibility. It is only limited by one parameter, the size of the array. The total
size of all the definitions cannot exceed this parameter. Thus we can define a few
long macros or many short ones.
The other way to organize the MDT is to store the macros in an MDT array
with separators as described before, but without pointers. An additional array,
called the MNT, contains pairs <macro name, pointer> where the pointers point
to the start of a definition in the MDT array. The advantage of this organization is
that the MNT has fixed-size entries, allowing for a faster search of names. However,
the total number of macros that can be stored in such an MDT is limited by two
parameters: the size of the MDT array, which limits the total size of all the
macros, and the size of the MNT, which limits the number of macros that can
be defined. The following diagram is an example of such an MDT organization. It
shows an MDT array with 3 macros. The first has 3 parameters, the second, 4, and
the third, 2. The MNT array has fixed-size entries.
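[Diagram: an MDT array holding three macro definitions that start with the parameter counts 3, 4, and 2, and an MNT array of fixed-size <macro name, pointer> entries pointing to the start of each definition.]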
Regardless of the way the MDT is organized, if the assembler supports system
macros, it should also support a directive to remove a macro from the MDT. A user
writing a macro to redefine an existing system macro may want the redefinition
to be temporary. They define their macro, expand it as many times as necessary,
and then remove it so that the original system macro can be used again. Such a
directive is typically called REMOVE.
In old assemblers this directive is executed by removing the pointer that points
to the macro, not the macro itself. Removing the macro itself and freeing the space
in the MDT is done by many new assemblers (see Ch. 8). It is done by changing
the macro definition to a null string, and storing another (smaller) macro in the
space thus freed. After defining and removing many macros, the MDT becomes
fragmented; it can be defragmented by moving macros around and changing pointers
in the MNT.
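The second organization, together with the backward search and the old-style REMOVE, can be sketched as follows. This is a minimal Python illustration, not a description of any particular assembler: the class MacroTable, its method names, and the use of None as the end-of-macro separator are all assumptions made for the example.

```python
END = None  # stand-in for the end-of-macro separator (an assumption)

class MacroTable:
    """An MDT array of stored lines plus an MNT of (name, start index) pairs."""

    def __init__(self, mdt_size, mnt_size):
        self.mdt, self.mdt_size = [], mdt_size
        self.mnt, self.mnt_size = [], mnt_size

    def define(self, name, nparams, body):
        # Two independent limits: number of macros and total size of definitions.
        if len(self.mnt) >= self.mnt_size:
            raise MemoryError("MNT full: too many macros")
        if len(self.mdt) + len(body) + 2 > self.mdt_size:
            raise MemoryError("MDT full: definitions too long")
        self.mnt.append((name, len(self.mdt)))
        self.mdt += [nparams, *body, END]

    def lookup(self, name):
        # Search from the newest entry back, so the last definition wins.
        for entry_name, start in reversed(self.mnt):
            if entry_name == name:
                return start
        return None

    def remove(self, name):
        # Old-assembler style: drop the pointer, leave the text in the MDT.
        for i in range(len(self.mnt) - 1, -1, -1):
            if self.mnt[i][0] == name:
                del self.mnt[i]
                return True
        return False

table = MacroTable(mdt_size=20, mnt_size=4)
table.define("M", 1, ["LOD #1"])      # plays the role of a system macro
table.define("M", 1, ["STO #1"])      # the user's temporary redefinition
print(table.mdt[table.lookup("M") + 1])   # STO #1
table.remove("M")
print(table.mdt[table.lookup("M") + 1])   # LOD #1
```

Removing only the pointer makes the earlier definition visible again, but the space used by the removed macro is not reclaimed in this sketch.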
Exercise 4.6 What could be a practical example justifying the actual removal of
a macro from the MDT?
Pass 0 has to identify each source line as either a macro name, a pass 0 directive,
or something else. To achieve quick identification, the pass 0 directives should be
stored in the MDT with a special flag, identifying them as directives. This way,
only the MDT has to be searched in pass 0.
An assembler combining passes 0 and 1 has to do more searching. Each source
line has to be identified as either an instruction (its size should be determined), as
a pass 0 or pass 1 directive (to be executed), as a pass 2 directive (to be written on
the intermediate file), or as a macro name (to be expanded).
One approach is to search the OpCode table first (it should contain all the
mnemonics and directives) and the MDT next, the idea being that most source
lines are either instructions or directives; macro expansions are relatively rare.
The alternative approach is to search the MDT first and the OpCode table next.
This way, the programmer can define macros which redefine existing instructions or
directives. If this approach is used, the MDT must be designed to allow for quick
search.
Exercise 4.8 There is, however, a way to assign such an expansion a unique mean-
ing. What is it?
The third method uses a special parameter named SYSLIST such that SYS-
LIST(i) refers to the ith argument of the current macro expansion. A possible
definition is
M MACRO no parameters are declared.
LOD SYSLIST(2)
STO SYSLIST(1)
ENDM
The expansion ‘M X,Y’ will generate
LOD Y
STO X
In all the examples shown above, macro parameters were delimited by a comma
‘,’. It is possible, however, to use other characters as delimiters, or to use a scheme
where parameters can be delimited in a general way. Such a scheme is used by the
TeX typesetting system [36] that has many features of a programming language,
although its main task is to set type.
In TeX, a macro can be defined by \def\xyz#1.#2./{. . .}. This defines a
macro xyz with two parameters. The first is delimited by a period ‘.’, and the
second, by a period and a space ‘./’. In the expansion
\xyz-12.45,=a98.62./abc. . .
the first parameter would be bound to ‘-12’, and the second, to ‘45,=a98.62’. Note
that the comma is part of the second actual argument, not a delimiter. Also, the
period in ‘98.62’ is considered part of the second argument, not a delimiter.
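The matching of delimited parameters can be sketched in Python. This is a simplified illustration, not TeX's actual algorithm: the function name match_delimited is hypothetical, and the sketch ignores the brace nesting that TeX also respects when it scans for a delimiter. The space is written here as an ordinary blank.

```python
def match_delimited(text, delimiters):
    """Split text into arguments, each one ended by its own delimiter string.

    Returns the list of arguments and the unread remainder of the text.
    """
    args, pos = [], 0
    for delim in delimiters:
        end = text.find(delim, pos)       # first occurrence of this delimiter
        if end < 0:
            raise ValueError(f"delimiter {delim!r} not found")
        args.append(text[pos:end])
        pos = end + len(delim)            # skip the delimiter itself
    return args, text[pos:]

# \xyz with parameter 1 ended by '.' and parameter 2 ended by '. '
args, rest = match_delimited("-12.45,=a98.62. abc", [".", ". "])
print(args)   # ['-12', '45,=a98.62']
print(rest)   # abc
```

The period inside ‘98.62’ does not end the second argument because it is not followed by a space, so it does not match the two-character delimiter.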
5. Macro arguments are normally treated as strings. However, Macro, the VAX
assembler, can optionally use the value, rather than the name, of an argument.
This is specified by means of a ‘\’. A simple example is:
After defining this macro, we assign ‘CONS=5’, and expand CLEER twice. The
expansion ‘CLEER CONS’ generates ‘CLRL RCONS’ (which is probably wrong),
whereas the expansion ‘CLEER \CONS’ generates ‘CLRL R5’.
Each argument in a macro expansion has attributes that can be used to make
decisions—inside the macro definition—each time the macro is expanded. At the
time the macro definition is written, the arguments are unknown. They only become
known when the macro is expanded, and may have different attributes each time
the macro is expanded.
M MACRO P1
P1
ENDM
creates a block of 1–4 words each time it is expanded, depending on the number of
arguments in the expansion.
The directive .NCHR returns the size of a character string. The general format of
this directive is ‘.NCHR symbol,<string>’. Thus after defining:
L MACRO
P1 ..
BNZ G12 branch on non-zero to label G12
A LOD .. just a line with a label
.
.
G12 STO ..
ENDM
Each time this macro is expanded, symbols A, G12 will be defined. As a result, the
second and subsequent expansions will cause assembler errors.
Most assemblers offer help in the form of automatically generated unique
names. They are called local labels or automatic labels. Here are two examples,
the first one from the MPW assembler for the Macintosh computer. The two lines
with labels in the above definition can be written as:
and every time the macro is expanded, the assembler will append a suffix of the
form 00001, 00002,. . . to any label generated. This way all labels generated are
unique. In our example they will be A00001, G1200002, A00003, . . .
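The suffix scheme can be sketched in Python. The sketch assumes that the assembler already knows which labels in the body are local (here they are simply passed in as a list), and the names LabelMaker and expand are hypothetical. It follows the numbering described above, where the counter advances once per generated label.

```python
import itertools
import re

class LabelMaker:
    """Appends a fresh five-digit suffix to each local label of an expansion."""

    def __init__(self):
        self.counter = itertools.count(1)     # 00001, 00002, ...

    def expand(self, body, local_labels):
        # One new name per local label, shared by its definition and all uses
        # inside this expansion.
        renamed = {lab: f"{lab}{next(self.counter):05d}" for lab in local_labels}
        token = re.compile(r"[A-Za-z0-9]+")
        return [token.sub(lambda m: renamed.get(m.group(), m.group()), line)
                for line in body]

body = ["     BNZ G12", "A    LOD X", "G12  STO Y"]   # X and Y are placeholders
maker = LabelMaker()
for line in maker.expand(body, ["A", "G12"]):
    print(line)       # uses A00001 and G1200002
for line in maker.expand(body, ["A", "G12"]):
    print(line)       # uses A00003 and G1200004
```

The branch to G12 and the line labeled G12 receive the same generated name within one expansion, which is what keeps the expanded code consistent.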
The second example is from Macro, the VAX assembler. When a local label
is needed in a macro, a parameter should be added, preceded by a ‘?’. Thus in:
A pair of IRP directives, placed inside a macro, define a sequence of lines. They
direct the assembler to repeatedly duplicate and assemble the sequence a number
of times determined by a compound parameter. Here is an example from the MPW
assembler for the Macintosh computer:
The sequence to be duplicated consists of the single instruction ADD. Each time it is
duplicated, one of the components of the compound parameter P1 is selected. The
expansion:
MAC (A,B,#3),H will generate
ADD A
ADD B
ADD #3
.
.
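The duplication performed for a compound parameter can be sketched in Python. The sketch does not reproduce the IRP syntax of any assembler; the function names and the plain text substitution of the parameter name are assumptions made for illustration.

```python
def split_compound(arg):
    """Strip the parentheses of a compound argument and return its components."""
    if arg.startswith("(") and arg.endswith(")"):
        return arg[1:-1].split(",")
    return [arg]                      # a simple argument has one component

def expand_irp(template, parameter, compound_arg):
    """Duplicate the template line once per component of the argument."""
    return [template.replace(parameter, component)
            for component in split_compound(compound_arg)]

print(expand_irp("ADD P1", "P1", "(A,B,#3)"))   # ['ADD A', 'ADD B', 'ADD #3']
```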
When a program contains many long macros that are expanded many times,
the programmer may not want to see all the expansions listed in the printed out-
put. The PRINT directive may be used to suppress listing of macro expansions
(‘PRINT NOGEN’) or to turn on such listings (‘PRINT GEN’). This directive does not
affect the listing of the macro definitions or of the body of the program. Those
listings are controlled by the LIST, NOLIST directives.
The comments should be printed, together with the definition of the macro, in the
listing file, but should they also be printed with each expansion? The most general
answer is: It depends. Some comments refer to the lines in the body of the macro
and should be printed each time an expansion is printed (as mentioned elsewhere,
the printing of macro expansions is optional). Other comments refer to the formal
parameters of the macro, and should be printed only when the macro definition
is printed. The decision should be made by the programmer, which means that
the assembler should have two types of comment lines, the regular type, which is
indicated by an asterisk, and the special type, indicated by another character, such
as a ‘!’, for comments that should be printed only as part of a macro definition.
This is the case where the expansion of one macro causes another macro (or
even more than one macro) to be expanded. Example:
C MACRO
COMP
JMP
ENDM
A MACRO
ADD
C
SUB
ENDM
B MACRO
LOD
A
STO
ENDM
mode is very similar to the normal mode. Each line is identified as either the name
of a macro, a pass 0 directive, or something else. If the line is the name of a
macro, the assembler suspends the expansion of B, fully expands the new macro,
and then returns to complete B’s expansion. If the line is a pass 0 directive, the
assembler executes it. If the line is something else, the assembler scans it, substitutes
parameters, and writes the line on the new source file. During this process the
assembler maintains a pointer that always points to the current line in the MDT.
When the expansion of B starts, the macro is located in the MDT, and a
pointer is set to point to its first line. That line (LOD) is fetched and identified as an
instruction. It is written on the new source file, and the pointer is updated to point
to the second line (A). The second line is then fetched and is identified as a macro
name. The assembler then suspends the expansion of B and starts expanding A by
performing the following steps:
It locates macro A in the macro definition table.
It sets a new pointer to the first line of macro A.
It saves the values of the actual arguments of macro B.
From then on, macro A is expanded in the usual way until its second line (C) is
fetched. At that point the new pointer points to that line. The assembler suspends
the expansion of A and starts the expansion of macro C by performing three steps as
above. While expanding macro C, the assembler uses a third pointer and, since C is
not nested, the expansion terminates normally and the third pointer is discarded.
At that point the assembler returns to the expansion of macro A and resumes using
the second pointer (which should be incremented to point to the next waiting line
of A). When the expansion of A is completed, the assembler discards the second
pointer and switches to the first one—which means resuming the expansion of the
original macro B. Three typical steps in this process are shown in figure 4–4 below.
In part I, the second line of B has been fetched and the (first) pointer points to
that line. The expansion of macro A has just started and the second pointer points
to the first line of A.
In part II, the second pointer points to the second line of A. This means that
the line being processed is the second line of A. The expansion of C has just started.
In part III, the expansion of C has been completed, the third pointer discarded,
and the assembler is in the process of fetching the third line of A.
The rules for nested macro expansion therefore are:
In the macro expansion mode, when encountering the name of a macro, find it in
the MDT, set up a new pointer to point to the first line, save the arguments of the
current macro, and continue expanding, using the new pointer.
After fetching and expanding the last source line of a macro, discard the current
pointer and start using the previous one (and the previous set of arguments).
If there is no previous pointer, the (nested) macro expansion is over.
[Figure 4–4: three typical steps (parts I, II, and III) in the nested expansion of macro B.]
From this discussion it is clear that the pointers are used in a Last In First Out
(LIFO) order, and should thus be stored in a stack. This stack is called the macro
expansion stack (MES), and its size determines the number of possible pointers and
thus the maximum depth of nesting.
Implementing nested macro expansions is, therefore, done by declaring a stack
and using it while expanding a macro. All other details of the expansion remain
the same as in a normal expansion (however, see conditional assembly below).
The following is a set of rules and a flow chart (figure 4–5, a generalized subset
of figure 4–2) which illustrate the main operations of a pass 0 supporting nested
macro expansions.
1. Input line from MDT (since mode=E).
2. If it is a pass 0 directive, execute it. Goto 1.
3. If it is a macro name, start a new expansion. Goto 1.
4. If it is an end-of-macro character, stop current expansion and look back in
MES.
• If MES empty, change mode to N. Goto main input.
• Else—start using previous pointer in MES, remain in E mode. Goto 1.
5. Line is something else, substitute parameters and write line on new source file.
Goto 1.
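The rules above can be sketched in Python, using the macros B, A, and C of the earlier example. The sketch is a simplification under stated assumptions: the MDT is a dictionary from macro names to body lines, the macros have no parameters (so no arguments are saved), pass 0 directives are omitted, and the name expand and the depth limit are hypothetical.

```python
# The three macros of the example; each body is a list of source lines.
MDT = {
    "C": ["COMP", "JMP"],
    "A": ["ADD", "C", "SUB"],
    "B": ["LOD", "A", "STO"],
}

def expand(name, max_depth=8):
    """Expand a macro, using an explicit macro expansion stack (MES)."""
    output = []
    mes = [[name, 0]]                 # each entry: [macro name, index of next line]
    while mes:                        # an empty MES means the expansion is over
        macro, index = mes[-1]
        body = MDT[macro]
        if index == len(body):        # end of macro: discard the current pointer
            mes.pop()
            continue
        mes[-1][1] = index + 1        # point to the next waiting line
        line = body[index]
        if line in MDT:               # a macro name: start a nested expansion
            if len(mes) == max_depth: # the size of the MES limits the nesting
                raise RecursionError("MES overflow")
            mes.append([line, 0])
        else:                         # something else: write it out
            output.append(line)
    return output

print(expand("B"))   # ['LOD', 'ADD', 'COMP', 'JMP', 'SUB', 'STO']
```

The pointer of the suspended macro is advanced before the nested expansion starts, so when the inner macro ends, the outer one resumes at its next waiting line.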
[Figure 4–5: flow chart of the macro expansion mode of pass 0, supporting nested macro expansions with the MES.]
NOGOOD MACRO
INST1
Q SET Q+1
NOGOOD
INST2
ENDM
N. This test is another pass 0 directive, typically called AIF (Assembler IF). Its
general form is ‘AIF exp.symbol’, where exp is a boolean expression containing
only quantities known to the assembler. The assembler evaluates the expression
and, if its value is true, the assembler goes to the line labeled .symbol.
The next version of the same macro is now:
GOOD MACRO
INST1
Q SET Q+1
AIF Q=N.F if Q equals N then go to line labeled .F
GOOD
.F INST2
ENDM
An expansion such as
N EQU 2
Q SET 0
GOOD
Of the 9 lines expanded, lines 2, 3, 4, 6, 7 are pass 0 directives. They are executed
and do not appear in the final program. Lines 1, 5, 8, 9 are instructions, and are
the only ones actually expanded and written onto the new source file.
The macro has been recursively expanded to a depth of 2 because of the way
symbols Q, N have been defined. It is also possible to say ‘AIF Q=2.F’, in which case
the depth of recursion will depend only on the initial value of Q.
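The recursive expansion of GOOD can be traced with a short Python sketch. It is only a model: a Python function call stands in for a nested macro expansion, a dictionary stands in for the temporary symbol table, and the function name expand_good is hypothetical.

```python
def expand_good(symbols, output):
    """Model one expansion of macro GOOD; recurse until Q equals N."""
    output.append("INST1")                 # INST1
    symbols["Q"] += 1                      # Q SET Q+1
    if symbols["Q"] != symbols["N"]:       # AIF Q=N.F (skip the call when equal)
        expand_good(symbols, output)       # GOOD (nested expansion)
    output.append("INST2")                 # .F INST2
    return output

print(expand_good({"N": 2, "Q": 0}, []))   # ['INST1', 'INST1', 'INST2', 'INST2']
```

With N=2 and Q=0 the function is entered twice, and only the four instructions reach the output, in agreement with the count of lines given above. If Q never becomes equal to N, the expansion does not terminate, which is the problem of macro NOGOOD.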
it, store the symbol in the temporary symbol table, and write the EQU on the new
source file for a second execution in pass 1. In pass 1, all EQU directives are executed
and all EQU symbols, absolute and relative, end up in the permanent symbol table.
Some assemblers go one more step and combine pass 0 with pass 1. This makes
for a very complex pass but, on the other hand, it means that conditional assembly
directives can use any quantities known in pass 1. They can use the values of any
symbols in the symbol table (i.e., any non-future symbols), the value of the LC,
and other things known in pass 1 such as the size of the last instruction read. The
following is an example along these lines.
.
.
P DS 10 P is the start address of an array of length 10
N DS 3 N is the start address of an array of length 3 immediately
. following array P. Thus N=P+10
.
Q SET P The value of Q is an address
GOOD depth of recursion will be 10 since
. it takes 10 steps to increment the
. value of Q from address P to address N.
The next example is more practical. It is a recursive macro FACT that calculates
a factorial. Calculating a factorial is a common process and is done by computers
all the time. In our example, however, it is done at assembly time, not at run time.
FACT MACRO N
S SET S+1
K SET K*S
AIF S=N.DON
FACT N
.DON ENDM
The expansion:
S SET 0
K SET 1
FACT 4
will calculate 4! (=24) and store it in the temporary symbol table as the value of the
SET symbol K. The result can only be used at assembly time, since the SET symbols
are wiped out at the end of pass 0. Symbol K could be used, for example, in an
array declaration such as: ‘FACT4 DS K’ which declares FACT4 as an array of length
4!. This, of course, can only be done if the assembler combines passes 0 and 1.
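The values taken by the SET symbols during the expansion ‘FACT 4’ can be followed with a small Python trace. As before, this is a model only: the dictionary plays the role of the temporary symbol table, and the name fact_trace is hypothetical.

```python
def fact_trace(n):
    """Trace SET symbols S and K through the recursive expansion of FACT n."""
    sym = {"S": 0, "K": 1}                 # S SET 0 and K SET 1
    trace = []

    def fact():
        sym["S"] += 1                      # S SET S+1
        sym["K"] *= sym["S"]               # K SET K*S
        trace.append((sym["S"], sym["K"]))
        if sym["S"] != n:                  # AIF S=N.DON
            fact()                         # FACT N

    fact()
    return sym["K"], trace

print(fact_trace(4))   # (24, [(1, 1), (2, 2), (3, 6), (4, 24)])
```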
The FACT macro is rewritten below using the IIF directive. IIF stands for
Immediate IF. It has the form ‘IIF condition,source line’. If the condition is true,
FACT MACRO N
S SET S+1
K SET K*S
IIF S≠N,(FACT N)
ENDM
IF X=2
line 1
ELSE
line 2
line 3
ENDIF
If X=2, then line 1 will be expanded, otherwise, lines 2 & 3 will be expanded.
and thus only exists in pass 0. When the assembler executes an AIF and decides to
go to, say, symbol .F, it searches, in the MDT, for a line labeled .F, sets the current
MES pointer to point to that line, and continues the macro expansion. If a line
fetched from the MDT has such a label, the line goes on the new source file without
the label, which guarantees that only pass 0 will see the sequence symbols. In con-
trast, regular symbols (address symbols and absolute symbols) do not participate
in pass 0. They are defined in pass 1, and their values used in pass 2.
Examples:
AIF (&A(&I) EQ ‘ABC’).TGT where &A is a compound parameter and &I is a SET
symbol used to select one component of &A.
AIF (T‘&X EQ O).KL if the type attribute of argument &X is O, meaning a null
argument, then go to symbol .KL.
AIF (&J GE 8).HJ where &J could be either a parameter or a SET symbol.
AIF (&X AND &B EQ ‘(’ ).LL where &X is a B type SET symbol (see below) and
&B is either a C type SET symbol or a parameter.
The AGO Directive: The general format is ‘AGO SeqSymbol’. It directs the
assembler to the line labeled by the symbol. (This is an unconditional goto at
assembly time, not run time.)
The ANOP (Assembler No OPeration) directive. The assembler does nothing in
response to this directive, and its only use is to provide a line that can be labeled
with a sequence symbol. This directive is used in one of the examples at the end of
this chapter.
SET symbols. They can be of three types: A (arithmetic), B (boolean), or C
(character). An A type SET symbol has an integer value. A B type SET symbol has a
boolean value of true/false or 1/0. The value of a C type SET symbol is a string.
Any SET symbol has to be declared as one of the three types and its type cannot
be changed. Thus:
LCLA &A,&B
LCLB &C,&D
LCLC &E,&F
declare the six symbols as local SET symbols of the appropriate types. A local SET
symbol is only known inside the macro in which it is declared (or inside a control
section, but those will not be discussed here). There are also three directives to
assign values to the different types of SET symbols.
&A SETA 1
&A SETA &A+1
&B SETA 1+(B‘1011’*X‘FF1’-15)/&A-N‘SYSLIST(&A) where B‘1011’ is a binary constant,
X‘FF1’ is a hex constant, and N’ is the number attribute
(the number of components) of the second argument of the current
expansion (the second, because &A=2).
&C SETB (&A LT 5)
The three directives GBLA, GBLB, GBLC declare global SET symbols. This feature
is very similar to the COMMON statement in Fortran. Once a symbol has been declared
global in one macro definition, it can be declared global in other macro definitions
that follow, and all those declarations will use the same symbol.
N1 MACRO
GBLA &S
&S SETA 1
AR &S,2
N2
ENDM
N2 MACRO
LCLA &S
&S SETA 2
SR &S,2 the local symbol is used
N3
ENDM
N3 MACRO
GBLA &S
CR &S,2 the global symbol is used
ENDM
AR 1,2
SR 2,2
CR 1,2
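The difference between the local and the global &S in macros N1, N2, and N3 can be modeled in Python. The sketch is an analogy, not an implementation: a module-level dictionary stands in for the global SET symbols, a dictionary created inside a function stands in for the local ones, and the function names are hypothetical.

```python
GLOBAL_SET = {}                            # global SET symbols, shared by all macros

def n1(out):
    GLOBAL_SET["&S"] = 1                   # GBLA &S and &S SETA 1
    out.append(f"AR {GLOBAL_SET['&S']},2")
    n2(out)                                # nested expansion of N2

def n2(out):
    local_set = {"&S": 2}                  # LCLA &S and &S SETA 2 (hides the global)
    out.append(f"SR {local_set['&S']},2")  # the local symbol is used
    n3(out)                                # nested expansion of N3

def n3(out):
    out.append(f"CR {GLOBAL_SET['&S']},2") # GBLA &S: the global symbol is used

lines = []
n1(lines)
print(lines)   # ['AR 1,2', 'SR 2,2', 'CR 1,2']
```

The local &S of N2 disappears when N2 ends and never changes the global value, so N3 still sees 1.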
Declarations such as ‘LCLA &A(7)’ are allowed, and generate a SET symbol
which is an array. Such symbols can be very useful in sophisticated applications.
References [25–27] contain more information on the macro facilities of the .
The following examples summarize many of the features of conditional assembly
discussed here.
SUM MACRO &L,&P1,&P2,&P3
LCLA &SS
&SS SETA 1
&L ST&P1 5,TEMP save register 5 in temp
L&P1 5,&P2(1) load first component of P2
.B AIF (&SS GE N‘&P2).F if done go to .F
&SS SETA &SS+1 use &SS as a loop index
A&P1 5,&P2(&SS) add next component of P2
AGO .B loop
.F ST&P1 5,&P3 store sum in P3
L&P1 5,TEMP restore register 5
MEND
This macro uses the conditional assembly directives to loop and generate several
AD (Add Double) instructions. The number of instructions generated equals the
number of components (the N attribute) of the third argument.
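The assembly-time loop of macro SUM can be followed with a Python sketch that generates the same lines. The arguments used in the call (LAB, D, (A,B,C), TOT) are made up for the illustration, and the function name expand_sum is hypothetical.

```python
def expand_sum(label, suffix, p2, p3):
    """Generate the lines of one expansion of SUM &L,&P1,&P2,&P3."""
    comps = p2.strip("()").split(",")          # components of compound argument &P2
    out = [f"{label} ST{suffix} 5,TEMP",       # save register 5 in TEMP
           f"L{suffix} 5,{comps[0]}"]          # load &P2(1)
    ss = 1                                     # &SS SETA 1
    while ss < len(comps):                     # .B AIF (&SS GE N'&P2).F
        ss += 1                                # &SS SETA &SS+1
        out.append(f"A{suffix} 5,{comps[ss - 1]}")   # add &P2(&SS)
    out.append(f"ST{suffix} 5,{p3}")           # .F: store the sum in &P3
    out.append(f"L{suffix} 5,TEMP")            # restore register 5
    return out

for line in expand_sum("LAB", "D", "(A,B,C)", "TOT"):
    print(line)
# LAB STD 5,TEMP
# LD 5,A
# AD 5,B
# AD 5,C
# STD 5,TOT
# LD 5,TEMP
```

For a compound argument with N components the loop produces one load and N−1 add instructions.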
A more sophisticated version of this macro lets the user specify, in argument
&REG, which register to use. If argument &REG is omitted (its T attribute equals O)
the macro selects register 5.
SUM MACRO &L,&P1,&P2,&P3,&REG
LCLC &CC
LCLC &RR
LCLA &AA
&AA SETA 1
&CC SETC ‘&L’
&RR SETC ‘&REG’
AIF (T‘&REG NE ‘O’).Q is argument &REG a null?
&RR SETC ‘5’ yes, use register 5
The definition of macro Y is nested inside the definition of X. This feature is not
as useful as nested macro expansion but many modern assemblers support it (older
assemblers typically did not). In this section we will see what this feature means,
and look at two different ways of implementing it.
The first thing that needs to be modified, in order to implement this feature, is
the macro definition mode. In this mode, the assembler reads source lines and stores
them in the MDT until it encounters an ENDM. The first modification has to do with
the MACRO directive itself. When the assembler reads a line with a MACRO directive
while it is in the macro definition mode, it treats it as any other line (i.e., it stores
it in the MDT) but it also increments a special counter, the definition level counter,
by 1. This counter starts at 1 when entering definition mode, and is updated to
reflect the current level of nested definitions. When the assembler encounters an
ENDM directive, it decrements the counter by 1 and tests it. If the counter is positive,
the assembler treats the ENDM as any other line. If the counter is zero, the assembler
knows that the ENDM signals the end of the entire nested definition, and it switches
back to the normal mode. This process is described in more detail below.
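The use of the definition level counter can be sketched in Python. The sketch assumes that the outer MACRO line has already been read, that a line is a MACRO directive if one of its fields is the word MACRO, and that ENDM appears as the first field of its line. The function name read_definition is hypothetical.

```python
def read_definition(source):
    """Collect the body of a (possibly nested) macro definition.

    source is an iterator over the lines that follow the outer MACRO line.
    """
    level = 1                          # the counter starts at 1 in definition mode
    stored = []
    for line in source:
        fields = line.split()
        if "MACRO" in fields:          # an inner definition starts
            level += 1
        elif fields and fields[0] == "ENDM":
            level -= 1
            if level == 0:             # the ENDM of the outer macro
                return stored          # switch back to the normal mode
        stored.append(line)            # any other line, inner ENDM included
    raise SyntaxError("missing ENDM")

lines = iter(["MULT A", "Y MACRO C", "C", "ENDM Y", "DIV C", "ENDM X", "LOD Z"])
print(read_definition(lines))   # ['MULT A', 'Y MACRO C', 'C', 'ENDM Y', 'DIV C']
print(next(lines))              # LOD Z
```

The inner MACRO and ENDM lines are stored like any other line; only the ENDM that brings the counter to zero ends the definition.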
Exercise 4.16 In the case where both macros X, Y end at the same point
X MACRO
-
-
Y MACRO
-
-
ENDM Y
ENDM X
do we still need two ENDM lines?
The next thing to modify is the macro expansion mode. While expanding
macro X, the assembler will encounter the inner MACRO directive. This implies that
the assembler should be able to switch to the macro definition mode from the macro
expansion mode, and not only from the normal mode. This feature does not add
any special difficulties and is straightforward to implement. While in the macro
expansion mode, if the assembler encounters a MACRO directive, it switches to the
macro definition mode, allocates an available area in the MDT for the new definition
and creates that definition. Specifically, the assembler:
fetches the next source line from the MDT, not from the source file.
stores the line in the macro definition table, as part of the definition of the new
macro.
repeats until it encounters an ENDM line (more precisely, until it encounters an ENDM
line that matches the current MACRO line). At this point the assembler switches back
to the macro expansion mode, and continues with the normal expansion.
Now may be a good point to mention that such nesting can be done in one
direction only. A macro expansion may cause the assembler to temporarily switch to
the macro definition mode; a macro definition, however, cannot cause the assembler
to switch to the expansion mode. When the assembler is in a macro expansion
mode, it may expand a macro with nested definition, so it has to enter the macro
definition mode from within the macro expansion mode. However, if the assembler
is in the macro definition mode and the macro being defined specifies the expansion
of an inner macro, the assembler does not start the nested expansion while in the
process of defining the outer macro. It therefore does not enter the macro expansion
mode from within the macro definition mode. The reason is that when a macro
is defined, its definition is stored in the MDT without any attempt to assemble,
execute, or expand anything. As a result, the assembler can only be in one of the
four following modes: normal, definition, expansion, & definition inside expansion.
They will be numbered 1–4 and denoted N,D,E,& DE, respectively.
In the example above, when macro X is defined, macro Y becomes part of the
body of X and is not recognized as an independent macro. When X is expanded,
though, macro Y is defined and, from that point on, Y can also be expanded. The
only reason to implement and use this feature is that macro X can be expanded
several times, each expansion of X creating a definition of Y in the MDT, and those
definitions—because of the use of parameters—do not have to be the same.
Each time Y is expanded, the assembler should, of course, expand the most
recent definition of Y (otherwise nested macro definition would be a completely
useless feature). Expanding the most recent definition of Y is simple. All that the
assembler has to do is to search the MDT in reverse order; start from the new macros
and continue with the older ones. This feature of backward search has already been
mentioned, in connection with macros that redefine existing instructions.
The above example can be written, with the inclusion of parameters, and with
some slight changes, to make it more realistic.
X MACRO A,B,C,D
MULT A
Y MACRO C
C
ADD DIR
JMP B
ENDM Y
DIV C
Y D
ENDM X
The body of X now contains a definition of Y and also an expansion of it. An
expansion of X will generate:
A MULT instruction.
A definition of Y in the MDT.
A DIV instruction.
An expansion of Y, consisting of three lines, the last two of which are ADD, JMP.
The expansion X SEC,TOR,DIC,BPL will generate:
The only thing that may be a surprise in this example is the fact that macro Y
is stored in the MDT without C being substituted. In other words, Y is defined as
‘Y MACRO C’ and not as ‘Y MACRO DIC’. The rule in such a case is the same as in a
block structured language. Parameters of the outer macro are global and are known
inside the inner macro unless they are redefined by that macro. Thus parameter B
is replaced by its value TOR when Y is defined, but parameter C is not replaced by
DIC.
Since we are interested in how things are done by the assembler, the imple-
mentation of this feature will be discussed in detail. In fact, we will describe in
detail two ways to implement nested macro definitions. One is the traditional way,
described in [28,61]. The other, due to Revesz [81] is newer and more elegant.
Normally, when a macro definition is entered into the MDT, each parameter
is replaced with a serial number #1, #2, . . . To support nested macro definition,
the assembler replaces each parameter, not with a single serial number, but with a
pair of numbers (definition level, serial number). To determine those pairs, a stack,
called the macro definition stack (MDS), is used.
When the assembler starts pass 0, it clears the stack and initializes a special
counter (the Dlevel counter mentioned above) to 0. Every time the assembler en-
counters a MACRO line, it increments the level counter by 1 and pushes the names
of the parameters of that level into the stack, each with a pair (level counter, i )
attached, where i is the serial number of the parameter. The assembler then starts
copying the definition into the MDT, comparing every token on every line with the
stack entries (starting with the most recent stack entry). If a token in one of the
macro lines matches a stack entry, the assembler considers it to be a parameter (of
the current level or any outer one). It fetches the pair (l,i ) from the stack entry
that matched, and stores #(l,i ) in the MDT instead of storing the token itself. If
the token does not match any stack entry, it is considered a stand-alone token and
is copied into the MDT as part of the source line.
When an ENDM is encountered, the stack entries for the current level are popped
out and Dlevel is decremented by 1. After the last ENDM in a nested definition is
encountered, the stack is left empty and Dlevel should be 0.
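The stack discipline described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name define_classical, the tokenizer (split on blanks and commas), and the decision to store MACRO and ENDM lines verbatim are all hypothetical. The parameter names follow the stack shown for this section; the body lines are made up.

```python
def define_classical(lines):
    """Copy nested macro definitions into an MDT, replacing every
    parameter by a (level, serial) pair found on the definition stack."""
    mds = []     # macro definition stack: (name, level, serial)
    mdt = []     # macro definition table, one string per line
    dlevel = 0   # current nesting level of definitions

    def substitute(token):
        # Search from the most recent entry, so inner levels win.
        for name, level, serial in reversed(mds):
            if name == token:
                return f"#({level},{serial})"
        return token  # no match: a stand-alone token

    for line in lines:
        tokens = line.replace(",", " ").split()
        if len(tokens) >= 2 and tokens[1] == "MACRO":
            dlevel += 1
            for i, param in enumerate(tokens[2:], start=1):
                mds.append((param, dlevel, i))
            mdt.append(line)  # header stored verbatim (an assumption)
        elif tokens[:1] == ["ENDM"]:
            if dlevel == 0:
                raise ValueError("unmatched ENDM")
            while mds and mds[-1][1] == dlevel:
                mds.pop()     # pop the parameters of the current level
            dlevel -= 1
            mdt.append(line)
        else:
            mdt.append(" ".join(substitute(t) for t in tokens))
    if dlevel != 0 or mds:
        raise ValueError("missing ENDM")
    return mdt


source = [
    "P MACRO A,B,C,D",
    "Q MACRO A,B,E,F",
    "ADD A,C",          # hypothetical body line of Q
    "R MACRO A,C,E,G",
    "SUB E,H",          # hypothetical body line of R
    "ENDM",
    "ENDM",
    "MUL E,F",          # back at level 1: E and F are stand-alone
    "ENDM",
]
mdt = define_classical(source)
```

The sketch assumes the input is a single outermost definition; a real pass 0 would also handle lines outside any definition.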
The example below shows three nested definitions and the contents of the MDT.
It also shows the macro definition stack when the third, innermost, definition is
processed (the stack is at its maximum length at this point).
Lines 3,5,8,10 in the MDT show that the assembler did not treat the inner macros
Q, R as independent ones. They are just part of the body of macro P.
On line 4, the #(2,1) in the MDT means parameter 1 (A) of level 2 (Q), while the
#(1,3) means parameter 3 (C) of level 1 (P).
On line 7, #(3,3) is parameter 3 (E) of level 3 (R) and not that of level 2 (Q). The
H is not found in the stack and is therefore considered a stand-alone symbol, not a
parameter.
On line 11, the assembler is back to level 1 where none of the symbols is a pa-
rameter. The stack at this point only contains the four bottom lines, and symbols
E,F,G,H are all considered stand-alone.
Figure 4–6 is a flow chart (a generalized subset of figure 4–2) summarizing the
operations described above.
146 Macros Ch. 4
stack
G (3,4) top
E (3,3)
C (3,2)
A (3,1)
F (2,4)
E (2,3)
B (2,2)
A (2,1)
D (1,4)
C (1,3)
B (1,2)
A (1,1) bottom
[Figure 4–6. The classical method for nested macro definitions (parts I and II).]
the last MEND, stack P should be empty again, signifying a non-macro definition
mode. The assembler should then switch back to the original mode (either 1
or 3).
6. If the source line is neither of the above, it is scanned, token by token, to de-
termine what tokens are parameters. Each token is compared with all elements
on stack P, from top to bottom. There are three cases:
a: No match. The token is not a parameter of any macro and is not replaced
by anything.
b: The token matches a name in P that does not have a serial number at-
tached. The token is thus a parameter of an inner macro, and can be
ignored for now. We simply leave it alone, knowing that it will be re-
placed by some serial number when the inner macro is eventually defined
(that will happen when some outer macro is eventually expanded).
c: The token matches a name in P that has a serial number attached. The
token is thus a parameter of the currently defined macro (the level 1 macro),
and is replaced by the serial number.
[Figure 4–7a. Revesz’s method for nested macro definitions (part I).]
After comparing all the tokens of a source line, the line is stored in the MDT.
It is a part of the currently defined macro.
The important point is that the formal parameters are replaced by serial num-
bers only for level 1, i.e., only for the outermost macro definition. For all other
nested definitions, only the names of the parameters are placed on the stack, re-
flecting the fact that only level 1 is processed in macro definition mode. That level
ends up being stored in the MDT with each of its parameters being replaced by a
serial number.
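The three cases of rule 6 can be illustrated with a small Python sketch. It is a simplification under assumptions, not the method's definitive form: the name define_outer, the tokenizer, and the verbatim storage of MACRO and ENDM lines are hypothetical. Inner-level parameters go on stack P with no serial number (None here), so only level 1 parameters are replaced.

```python
def define_outer(lines):
    """Store an outermost macro in the MDT; only its own (level 1)
    parameters are replaced by serial numbers #1, #2, ..."""
    stack_p = []   # entries: (name, serial or None, level)
    mdt = []
    dlevel = 0
    for line in lines:
        tokens = line.replace(",", " ").split()
        if len(tokens) >= 2 and tokens[1] == "MACRO":
            dlevel += 1
            for i, name in enumerate(tokens[2:], start=1):
                # Serial numbers are attached at level 1 only.
                stack_p.append((name, i if dlevel == 1 else None, dlevel))
            mdt.append(line)
        elif tokens[:1] in (["ENDM"], ["MEND"]):
            while stack_p and stack_p[-1][2] == dlevel:
                stack_p.pop()
            dlevel -= 1
            mdt.append(line)
        else:
            out = []
            for tok in tokens:
                for name, serial, _ in reversed(stack_p):  # top to bottom
                    if name == tok:
                        if serial is not None:      # case c
                            tok = f"#{serial}"
                        break                       # case b: leave it alone
                out.append(tok)                     # case a: no match
            mdt.append(" ".join(out))
    return mdt


mdt = define_outer([
    "X MACRO A,B",
    "Y MACRO A,C",
    "ADD A,B,C,D",   # A and C belong to Y, B to X, D to nobody
    "ENDM",
    "SUB A,B",       # both are parameters of X again
    "ENDM",
])
```

Note how A inside the inner definition is left alone because the inner macro redefines it, while B is still replaced by #2.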
On recognizing the name of a macro, which can happen either in mode 1
(normal) or 3 (expansion), the assembler enters mode 3, where it:
[Figure 4–7b. Revesz’s method for nested macro definitions (part II).]
7. Loads stack A with a new level containing the actual arguments. If the new
macro has no arguments, the new level is empty, but it should always be in
stack A as a level.
8. Inserts a new pointer in the MES, pointing to the first line—in the MDT—of
the macro to be expanded.
9. Starts expanding the macro by bringing in individual lines from the MDT.
Each line is scanned and all serial numbers are replaced by arguments from
stack A. The line is then checked to distinguish three cases:
d: The line is MACRO, the assembler enters mode 4 (definition mode within
expansion mode) and follows rules 1–6 above.
e: The current line is MEND, the assembler again follows the rules above and
may decide, if this is the last MEND, to return from mode 4 back to mode
3.
f: For any other line, the line is written on the new source file as explained
earlier.
[Figure 4–7c. Revesz’s method for nested macro definitions (part III).]
On recognizing the end of the macro (no more lines to expand), the assembler:
10. Removes the highest level from stack A.
11. Deletes the top pointer from the MES.
12. Checks the MES and stack A. There are three cases:
g: Neither is empty. There is an outer level of expansion, the assembler stays
in mode 3 and resumes the expansion as above.
h: Only one is empty. This should never happen and can only result from
a bug in the assembler itself. The assembler should issue a message like
‘impossible error, inform a systems programmer’.
i: Both are empty. This is the end of the expansion. The assembler switches
to mode 1.
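Steps 10 through 12 amount to a short consistency check, sketched below in Python. The function name and the representation of stack A (a list of levels, each level a list of arguments) and of the MES (a list of MDT pointers) are assumptions made for illustration; the mode numbers 1 and 3 are the ones used in the text.

```python
NORMAL, EXPANSION = 1, 3   # mode numbers used in the text


def end_of_expansion(stack_a, mes):
    """Finish one macro expansion and return the mode to continue in."""
    stack_a.pop()   # step 10: remove the highest level from stack A
    mes.pop()       # step 11: delete the top pointer from the MES
    if stack_a and mes:            # case g: an outer expansion remains
        return EXPANSION
    if not stack_a and not mes:    # case i: the expansion is finished
        return NORMAL
    # case h: only one of them is empty
    raise RuntimeError("impossible error, inform a systems programmer")
```

A level with no arguments is still a level, which is why an empty list is pushed for a macro without arguments; otherwise the two stacks would go out of step and case h would be triggered.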
[Figure 4–7d. Revesz’s method for nested macro definitions (part IV).]
after line:   1      4      7      10     13     16
(top)         D #4   C      G      C      D #4   empty
              C #3   D      C      D      C #3
              B #2   F      E      E      B #2
              A #1   E      A      F      A #1
                     D #4   C      D #4
                     C #3   D      C #3
                     B #2   F      B #2
                     A #1   E      A #1
                            D #4
                            C #3
                            B #2
                            A #1
4.9.3 A note
The rule is that each pair ## is reduced to one #. This way, macro definitions
can be nested to any depth.
Example:
\def\a#1{‘#1’\def\b##1{[#1,##1] \def\x####1{(#1,##1,####1)}\x X}\b Y}
Defining macro b.
Macro x simply prints all three arguments, a’s, b’s and its own, in parentheses.
The reader should convince himself that the rule above really allows for unlim-
ited nesting of macro definitions.
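The reduction rule can be tried out with a small Python sketch. It is not TeX, only a model of one expansion step under simplifying assumptions: each parameter #n (single digit) is replaced by an argument, and each pair ## is reduced to one #. The function name expand_body is hypothetical.

```python
import re


def expand_body(body, args):
    """Model one expansion step: #n becomes args[n-1], ## becomes #."""
    def repl(match):
        if match.group(0) == "##":
            return "#"
        return args[int(match.group(1)) - 1]
    # The scan is left to right, so "####1" is read as "##", "##", "1".
    return re.sub(r"##|#(\d)", repl, body)


# Part of the body of \a from the example, expanded with argument P.
body_a = r"\def\b##1{[#1,##1] \def\x####1{(#1,##1,####1)}\x X}\b Y"
after_a = expand_body(body_a, ["P"])

# The body of \b as it now stands, expanded with argument Y.
after_b = expand_body(r"[P,#1] \def\x##1{(P,#1,##1)}\x X", ["Y"])

# The body of \x as it now stands, expanded with argument X.
after_x = expand_body("(P,Y,#1)", ["X"])
```

Each level of expansion halves the number of # characters in front of the parameters of deeper levels, so a parameter written with 2^k characters reaches its own macro as a plain #1.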
expects macro expansions and will flag any macro definition as an error. Example:
BEGIN EX
A MACRO P,Q
.
.
ENDM
M MACRO C,II
.
.
ENDM
X DS 12 an array declaration
N MACRO
.
.
ENDM
.
.
The definition of macro N will be flagged since it occurs too late in the source.
With this separation of macro definition and expansion, pass 0 is easy to im-
plement. In the normal mode such an assembler simply copies the source file into
the new source. Macro definitions only affect the MDT, while macro expansions
are written onto the new source file.
Such an assembler, however, cannot support nested macro definitions. They
generate new macro definitions too late in pass 0.
An assembler that supports nested macro definitions cannot impose the re-
striction above, and can only require the user to define each macro before it is
expanded.
Having a separate pass 0 simplifies pass 1. The flow charts in this chapter prove
that pass 0 is at least as complicated as pass 1, so keeping them separate reduces
the total size and the complexity of the assembler. Combining pass 0 and pass 1,
however, has the following advantages:
No new source file needed, saving both file space and assembly time.
Procedures—such as input procedures or procedures for checking the type of a
symbol—that are needed in both passes, only need to be implemented once.
All pass 1 facilities (such as EQU and other directives) are available during macro
expansion. This simplifies the handling of conditional assembly.
1. The process of expanding a macro includes two main steps: binding the formal
parameters to their actual arguments, and substituting each parameter in the
macro’s body by its bound argument. Write a procedure, in pseudo-code, to
implement the two steps. Assume that nested macro definition is not allowed, but
nested macro expansion is allowed.
2. The syntax of macro definition and expansion differs from assembler to
assembler. Use several textbooks and assembler manuals to study different
conventions of defining macros and naming parameters. Summarize the results in a
table of the form:
Assembler A
Assembler B
–
–
3. Continue the table of question 2 with columns for: Types of SET symbols, Syntax
of SET symbol name, Syntax of sequence symbol name.
4. A macro named LOD can be used to redefine the LOD instruction. An assembler
supporting this feature simply searches the MDT backwards. What is another way
of redefining assembler instructions?
5. Review the block structure feature found in higher-level languages. What is the
main difference between this feature and the scope of parameters in a nested macro
definition?
6. What stacks are involved in nested macro definition and expansion? When (in
what passes) are these stacks used? Is it possible to use the memory occupied by
these stacks for something else?
7. What is a pass 0 directive? Make a list of all pass 0 directives mentioned in this
chapter.
8. Compare the two methods described in this chapter for MDT implementation,
the one with the MNT and the one without it.
9. Apply the two methods described in this chapter for handling nested macro
definition to the case:
Sec. 4.11 Review Questions and Projects 157
A MACRO
-
-
B MACRO
-
-
ENDM B
-
-
C MACRO
-
-
ENDM C
-
-
ENDM A
The definitions of the two macros B, C are nested in the definition of A, but are
separate (not nested inside each other).
10. What is the main difference between the parameters of a macro and those of a
procedure?
11. How should a programmer decide whether to use a macro or a procedure in a
given situation?
Modify project 3–1 to include macros with parameters. The macros should
also support unique label generation but should not be nested in any way. No
conditional assembly is necessary.
The EQU, DC directives are very useful. In fact, they can be used, together with
the MACRO, ENDM directives, to implement an assembler. The following macro:
may be used to assemble an ADD instruction. The expansion ‘ADD X,2,Y’ would
result in:
X EQU * Label X defined and set to the current LC
BYTE 2A OpCode of ADD
BYTE 2 Register 2
BYTE Y The value of Y (from the symbol table) is substituted
After writing such a macro for every instruction, source programs can easily be
assembled. Most instructions contain fields that are longer or shorter than one byte,
so a data-generating directive is necessary, that would be more general than BYTE.
The DC directive, for example, could be modified such that ‘DC 17(2,6)’ would
generate the constant 17 = 10001₂ as a 6-bit field, starting at bit position 2 of the
current byte. The result is xx010001 where the xx field should be filled by the next
DC.
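The bit-field placement can be checked with a short Python sketch. The function place_field is hypothetical, and it assumes that bit positions are counted from the left (position 0 is the most significant bit of the byte), which is what the result xx010001 suggests.

```python
def place_field(value, start, width, byte=0):
    """Place `value` in a `width`-bit field whose leftmost bit is at
    position `start` of an 8-bit byte (position 0 = leftmost bit)."""
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the field")
    if start < 0 or start + width > 8:
        raise ValueError("field does not fit in the byte")
    shift = 8 - (start + width)            # distance from the right edge
    mask = ((1 << width) - 1) << shift     # bits occupied by the field
    return (byte & ~mask & 0xFF) | (value << shift)


first = place_field(17, 2, 6)              # models DC 17(2,6)
second = place_field(3, 0, 2, byte=first)  # a later DC fills the xx field
```

Here format(first, "08b") gives 00010001, with the two leftmost bits still zero, and the second call fills them without disturbing the 6-bit field.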
Your task: Select a simple assembler language, such as the one described in
project 1–1, and implement it by writing a full set of such macros.
The listing file is the second output of the assembler. It is generated in pass 2,
and is eventually printed. To be useful, such a file should include, for each source
line, a copy of the source (including comments) and the object codes, the value of
the LC for the instruction, and any error messages pertaining to the line.
In addition, the listing may, optionally, contain a cross reference table of all the
symbols used in the program. This is a list of all symbols, sorted alphabetically by
symbol name, with their values and attributes and, for each symbol, the LC values
of all the instructions using it. If a symbol is used a lot in the program, the cross
reference for it may require more than one line of print. The cross reference can be
an invaluable debugging tool and is generated by many assemblers. It contains all
the information in the symbol table and more.
It has already been mentioned, in chapter 1, that a one-pass assembler cannot
print the values of future symbols in the listing. Thus, for this type of assembler,
the cross reference table is certainly important.
To implement the cross reference table, another column is added to the symbol
table, with a pointer to a linked list. Each node in this list contains the LC value for
an instruction using that symbol. The lists are built in pass 2, while instructions are
assembled. Each time an instruction uses a symbol, the symbol table is searched.
On locating the symbol, its value is used in assembling the instruction, and another
node is added to the list of that symbol.
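A minimal Python sketch of this scheme follows. It uses an ordinary Python list in place of the linked list, and the dictionary layout, the function names, and the output format are all assumptions chosen for illustration.

```python
def use_symbol(symtab, name, lc):
    """Called in pass 2 whenever an instruction at address `lc` uses a
    symbol: record the reference and return the symbol's value."""
    entry = symtab[name]
    entry["refs"].append(lc)      # one more node in the symbol's list
    return entry["value"]


def cross_reference(symtab):
    """Return one line per symbol, sorted alphabetically by name."""
    lines = []
    for name, entry in sorted(symtab.items()):
        refs = " ".join(f"{lc:04X}" for lc in entry["refs"])
        lines.append(f"{name:<8}{entry['value']:04X}  {refs}")
    return lines


symtab = {
    "INCNT": {"value": 0x0007, "refs": []},
    "BASE": {"value": 0x0001, "refs": []},
}
v1 = use_symbol(symtab, "BASE", 0x008A)    # e.g. STA A BASE at LC 008A
v2 = use_symbol(symtab, "INCNT", 0x008D)   # e.g. DEC INCNT at LC 008D
table = cross_reference(symtab)
```

Because the references are appended while instructions are assembled, each list ends up sorted by LC without any extra work.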
160 The Listing File Ch. 5
Exercise 5.1 What should be printed in the object code field when a macro defi-
nition is listed?
‘Y: DC 11,2,‘$’’. The first constant (11) should be printed. The complete list
of constants generated by the DC may be too long but, if the programmer wants to
see it, a special directive such as PDC (print defined constants) may be used.
Chapter 3 mentions several directives used to control the listing. They can
be used to print a title and a sub-title (including, perhaps, the date and time) on
every page, to suppress the listing or parts of it, to eject a page at any point in the
listing, and to control the listing of the cross-reference and of macros. In general,
such directives are executed in pass 2 and are easy to implement.
00002 ************************
00003 * THIS PROGRAMS SOLVES...
00004 * IT ALSO GENERATES THE ...
00005 * AND, FINALLY, THE ...
00006 ************************
00007 *
00008 0000 ORG 0
00009 *
00010 * EXTERNAL PROCEDURES DEFINITION
00011 *
00012 E1D1 OUTCH EQU $E1D1 OUTPUT SINGLE CHAR.
00013 E1AC INCH EQU $E1AC INPUT CHAR.
00014 E07E PDATA1 EQU $E07E PRINT MESSAGE
00015 *
00016 * INTERNAL STORAGE
00017 *
00018 0000 0A DEFLT BYTE 10 DEFAULT IS DECIMAL
00019 0001 00 BASE BYTE 0 BASE=DEFAULT
00020 0002 00 ROTCNT BYTE 0 COUNT FOR ROTATE
00021 0003 0000 INVAL WORD 0 VALUE OF INPUT
00022 0005 0000 TMPVAL WORD 0 TEMP. INPUT
00023 0007 00 INCNT BYTE 0 COUNT OF INPUT CHARS.
00024 0008 00 LSTCHR BYTE 0 LAST CHAR BEFORE CR
.
.
00108 *
00109 008A 97 01 PRINIT STA A BASE REPLACE BASE
00110 008C 08 INX
00111 008D 7A 0007 DEC INCNT
.
.
Sec. 5.1 A 6800 Example 163
.
00450 A06D 20444543 DECM FCC / DEC/
00451 A071 04 BYTE 4
00452 A072 20484558 HEXM FCC / HEX/
00453 A076 04 BYTE 4
00454 A077 0D0A0000 CRLFM BYTE $D,$A,0,0,4
A078 04
00455 *
00456 A018 ORG $A018
00457 A018 BUFFER RMB 20 INPUT BUFFER
00458 *
00459 A02C END
ERROR SUMMARY
0 ERROR(S) IN ASSEMBLY
BINARY SUMMARY
684 BYTES OUTPUT
684 BYTES TO RAM
0 BYTES TO ROM
CROSS-REFERENCE LISTING
BASE 0001 ABS 17D 27 58 66 69 109 146 155 162 235 258 288 329
BASERR 006A ABS 41 53 63 73 82 87D
BASTOR 0066 ABS 76 77 80 83D
.
.
.
VALLP 0224 ABS 370 372D 377
VALOUT 0134 233D
END OF LISTING
The subtitle is different on each of our pages. This is the effect of the SUBTTL
directive, which itself is not listed.
The object code is simple and short. It occupies between one and three bytes
per source line and can therefore be easily included in the listing line. On other
computers, the object code may be longer and may have very different sizes for each
source line. In such cases, the assembler may devote, say, 8 printing positions to
the object code per listing line and, if the object code is longer, print the rest on a
second, and even a third, line. This is the situation on the VAX computer (see next
example) where each source line can generate between one and three 32-bit words.
Source line 00454 produces several constants. Its listing is, therefore, long, con-
sisting of two lines.
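The continuation-line idea can be sketched in Python as follows. The layout (a 5-digit line number, a 4-digit hex LC, then 8 printing positions of object code) imitates the listing above but is an assumption, as is the choice to print, on each continuation line, the address of the first byte shown there.

```python
def listing_lines(line_no, lc, obj_hex, source, width=8):
    """Format one source line; object code longer than `width` hex
    digits spills onto continuation lines."""
    chunks = [obj_hex[i:i + width] for i in range(0, len(obj_hex), width)]
    chunks = chunks or [""]                 # a line with no object code
    lines = [f"{line_no:05d} {lc:04X} {chunks[0]:<{width}} {source}"]
    for k, chunk in enumerate(chunks[1:], start=1):
        addr = lc + k * width // 2          # two hex digits per byte
        lines.append(f"{'':5} {addr:04X} {chunk}")
    return lines


out = listing_lines(454, 0xA077, "0D0A000004", "CRLFM BYTE $D,$A,0,0,4")
```

With five bytes of object code and 8 printing positions, the first four bytes go on the main line and the fifth on a continuation line.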
.ABS. 00000000 ( 0.) 00 (0.) NOPIC USR CON ABS LCL NOSHR NOEXE NORD NOWRT NOVEC BYTE
CONS 00000004 ( 4.) 01 (1.) NOPIC USR CON REL LCL NOSHR NOEXE RD NOWRT NOVEC BYTE
CODE 00000065 (101.) 02 (2.) NOPIC USR CON REL LCL NOSHR EXE RD WRT NOVEC BYTE
DATA 00000008 ( 8.) 03 (3.) NOPIC USR CON REL LCL NOSHR NOEXE RD WRT NOVEC BYTE
+--------------------------+
! Macro library statistics !
+--------------------------+
Macro library name Macros defined
SYS$COMMON:[SYSLIB]STARLET.MLB;2 1
title example
cgr group mycod, mydat
assume cs:cgr, ds:cgr
mydat segment public
if var lt 10
array db var
else
array db 10
endif
abc df 2
mydat ends
mycod segment public
if1
begin:
endif
mov al,xyz
if2
begin:
endif
mov ax,2 eq 4
add ax,dx
mycod ends
mydat segment public
xyz db 3
mydat ends
end begin
The second set is the final, pass-2, listing, containing the phase error. Note how
array becomes a single byte loaded with the constant 4, and how the df directive
reserves 6 bytes. The data segment mydat is thus 7 bytes long, stretching from
address 0 to address 6. The next part of mydat defines xyz as the byte at address
7.
if2
begin:
a:prog.ASM(19): error A2006: Phase error between passes
endif
0000 B8 0000 mov ax,2 eq 4
0003 03 C2 add ax,dx
0005 mycod ends
0007 mydat segment public
0007 03 xyz db 3
0008 mydat ends
end begin
Next come the segment table, symbol table, and general analysis:
The cross reference below was generated by the special cref utility.
Note the ‘PRINT OFF’, ‘PRINT ON’ directives. They suppress the listing of the
four INCLUDEd files. The main points worth noting in the listing itself are the
different instruction sizes, and the absence of tables and statistics at the end. This
listing, in fact, resembles the one generated by the old IBM 360 assembler.
Sec. 5.4 An MPW Example 173
MC680xx Assembler - Ver 3.10 MPW Example 06-Nov-91 Page
1
Copyright Apple Computer, Inc. 1984-1989
Loc F Object Code Addr M Source Statement
TITLE ’MPW Example’
00000 MAIN
00000 PRINT ON
00000 DATitle
00000 1146726565204D DC.B ’Free Mem (#Bytes)’ ;Name & Window
Title
00012 0000 0012 ALIGN 2 ; Word align
00012 0142 000A 0152 theWindow DC.W 322,10,338,500 ;windo top,lft,bot,rt
0001A DAOpen
0001A 48E7 0078 MOVEM.L A1-A4,-(SP) ; preserve A1-A4
0001E G 2849 MOVE.L A1,A4 ; MOVE DCE pointer to a reg
00020
00020 598F SUBQ.L #4,SP ; FUNCTION = GrafPtr
00022 2F0F MOVE.L SP,-(SP) ; push a pointer to it
00024 A874 GetPort ; push it on top of stack
00026 4AAC 001E TST.L DCtlWindow(A4) ; do we have a window?
0002A 662E 0005A BNE.S StdReturn ; If so, return, Else...
0002C 42A7 CLR.L -(SP) ; FUNCTION = windowPtr
0002E 42A7 CLR.L -(SP) ; allocate on the heap
00030 487A FFE0 00012 PEA theWindow ; boundsRect
00034 487A FFCA 00000 PEA DATitle ; title
00038 4267 CLR.W -(SP) ; visible flag FALSE
0003A 3F3C 0004 MOVE.W #noGrowDocProc,-(SP) ; window
proc
0003E 2F3C FFFF FFFF MOVE.L #-1,-(SP) ; window in front
00044 3F3C 0100 MOVE.W #$0100,-(SP) ; goAway box TRUE
00048 42A7 CLR.L -(SP) ; refCon is 0
0004A A913 NewWindow
0004C
0004C G 205F MOVE.L (SP)+,A0
0004E 2948 001E MOVE.L A0,DCtlWindow(A4) ; save windowPtr
00052 316C 0018 006C MOVE.W DCtlRefNum(A4),WindowKind(A0)
00058
00058 A11D MaxMem
0005A
0005A StdReturn
Extend project 4–1 by adding listing control directives EJECT, LIST, TITLE,
XREF. The main modification is to the symbol table to support cross-reference listing
as explained above.
One may argue that the second definition defines a machine dependent higher-level
language rather than a high-level assembler language, the main reason being the
definition of assembler language, given in the introduction. The basis of this defini-
tion is the one-to-one correspondence between assembler- and machine instructions.
The languages discussed here do not have such a one-to-one correspondence and, in
this respect, resemble more a higher-level language than an assembler language.
The best known examples of HLAs are the NEAT/3 for the NCR Century
computers [85,86], Wirth’s PL360 [61], a language designed specifically for the IBM
360 computers, Bell and Wichmann’s PL516 [63], an Algol-like assembler language
for the Honeywell DDP516 computer, and BABBAGE [87] a language specifically
designed as a HLA for the GE 4000 family of minicomputers. NEAT/3 and BAB-
BAGE are HLAs according to the first definition above, while PL360 and PL516
have been designed in the spirit of the second definition. Following is a short de-
scription of the main features of those languages.
Exercise 6.1 The second definition above defines an HLA as a higher-level,
machine-dependent language. Is it also possible to design the logical opposite,
namely, a lower-level (assembler), machine-independent language? What would be a
possible use for such a language?
6.1.1 NEAT/3
is translated into one machine instruction. Only when conversion between data
types is necessary, the translator (we will call it a translator, not a compiler or an
assembler) generates more machine instructions.
The main feature that makes NEAT/3 look like a higher-level language is the
data definitions. The way data items and files are declared in NEAT/3 closely
resembles COBOL. Concepts such as working storage area, constants area, data
records divided into fields, and data types, are all borrowed from COBOL, which
makes it easy to write a COBOL compiler in NEAT/3. Even editing masks (for
data to be printed) are supported and use the same characters ‘9’,‘Z’,‘$’,‘+’, ‘−’,‘.’
as in COBOL.
The data types are: Characters (coded in USASI, not ASCII), signed & un-
signed decimal, signed & unsigned packed decimal, binary, and hex.
1. The code field can have values of: R for a record, F for a field, and A, for an
area.
2. Field DAY is located at the same position as DATE. Thus DAY is a subfield of
DATE (and, as a result, also MONTH, YEAR) which is how a record becomes a tree
structure, same as in COBOL.
4. No ‘picture’ is shown in this example, but pictures are heavily used in NEAT/3
and are virtually identical to the COBOL picture clause.
Next we examine some of the NEAT/3 instructions (or, as the NCR literature
calls them, statements). This is the area where one can really justify the name
high-level assembler. Most of the instructions are simple, resemble typical assembler
instructions, and are easy to translate. The main difference in translation is the
automatic conversion between data types, which NEAT/3 supports, but a typical
assembler does not.
Exercise 6.2 What are typical valid and invalid type conversions in languages
such as COBOL and NEAT/3?
1. Comparisons. The instruction ‘COMP A,B’ compares its two operands and
sets status flags like any typical comparison in machine language. The only
difference is that the operands can be of different types, and the translator
takes care of type conversion.
2. Conditional Branches. The ‘BRE CALCTAX’ instruction is executed by testing
the status flags and branching to CALCTAX on ‘equal’. There are several such
branches, each translated into one machine instruction. They are the only way
to make decisions in NEAT/3, except for the simple IF described below.
3. Test and branch. The if alphabetic (IFAL) instruction tests its first operand
and branches to the second operand (a label) if the first operand is alphabetic
(of type character).
4. Moves. Those instructions work the same as in COBOL. In the simplest case,
a move is translated into one machine instruction but, if it also involves type
conversion and editing, it is translated into several instructions.
5. End of execution (FINISH). This instruction is translated into machine in-
structions that close all open files and perform a software interrupt to the
operating system.
In summary, NEAT/3 justifies the name high-level assembler language, but
also the name low-level programming language. Its translator may be called a high-
level assembler but also a compiler. It seems to lie somewhere in between assembler
language and COBOL.
Sec. 6.1 High-Level Assemblers 181
6.1.2 PL360
The main justification for this language, to use the developer’s own words [61],
is:
‘. . . PL360 was designed to improve the readability of
programs which must take into account specific char-
acteristics and limitations of a particular computer.
It represents an attempt to further the state of the
art of programming by encouraging and even forcing
the programmer to improve his style . . . Because of
its inherent simplicity, the language is particularly well
suited for tutorial purposes . . . a tool that encourages
the programmer to write programs in a disciplined, lu-
cid and readable style while still maintaining control
over the optimal use of specific machine characteris-
tics . . . ’.
The main low-level feature of the language is its heavy dependence on the
IBM360 architecture, allowing the programmer to use specific registers and machine
instructions. Its main high-level features are block structure, procedures, typed
variables, and the use of statements rather than instructions. There is also a goto
statement, which is rarely used.
Specifically, the following 9 points summarize the most important design fea-
tures of the language. The first 6 are low level features, and the rest, high level
ones.
1. Variables can be declared and their types are the types available on the machine
itself. For instance, an integer can be declared as a byte, a short integer, an
integer, or a long integer. Those types correspond to a byte, halfword, fullword,
and doubleword in the machine. This makes it easy to assemble (compile?
translate?) declarations, as each is directly translated into a DS directive.
2. Only one-dimensional arrays can be declared, again making it easy to translate
an array declaration into a DS directive. It also makes it easy to generate an
index to an array. Accessing a multi-dimensional array like B[i,j] would require
the compiler to generate an index of the form i*n+j (or something similar).
This, however, is against the design philosophy of PL360, which requires simple
translation.
4. Expressions are simple, they use no parentheses, and are executed strictly from
left to right, with equal precedence for all operators (see example with R9
below). This also implies that no temporary storage is automatically used by the
translator when an expression is evaluated; more in the spirit of an assembler
than a compiler.
5. Machine instructions can easily be used, each is declared as a function in PL360.
6. Functions and procedures can be declared, and can have parameters. A good
programming practice, however, is to pass parameters through registers, avoid-
ing the complexities of call by name, value, or address.
7. The language does not allow the use of absolute addresses, and a program
cannot modify itself.
8. High-level control structures such as if-then-else, case, for, or while, can be
used.
9. Compound statements (sandwiched between a begin end pair) and blocks can
be used. A PL360 variable thus has scope (it can be local or global).
A program written in PL360 superficially resembles an Algol 60 program. It
consists of statements, not instructions. It has the same control structures and block
structure. Even variable declarations are very similar. However, on a closer look,
one can easily see the common use of machine registers and of machine instructions.
Some simple examples are shown in Fig. 6.1 below:
Assignments are executed from left to right without parentheses and with no
operator precedence. The example involving R9 above is executed as:
R9:=R8; R9:=R9 and R7; R9:=R9 shll 8; R9:=R9 or R6;
The shll keyword means shift left logical. The ++ notation means either a
logical (unsigned) integer addition or an unnormalized f.p. addition.
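Strict left-to-right evaluation is easy to model. The Python sketch below is only an illustration: the function name, the small operator table, and the 32-bit masking are assumptions, and the statement it runs, R9 := R8 and R7 shll 8 or R6, is reconstructed from the step-by-step expansion given above.

```python
def assign_left_to_right(target, expression, regs):
    """Evaluate `expression` strictly from left to right, with no
    operator precedence, accumulating the result in `target`."""
    ops = {
        "and": lambda a, b: a & b,
        "or": lambda a, b: a | b,
        "shll": lambda a, b: (a << b) & 0xFFFFFFFF,   # 32-bit register
        "+": lambda a, b: (a + b) & 0xFFFFFFFF,
    }

    def operand(tok):
        # A token is either a register name or an integer literal.
        return regs[tok] if tok in regs else int(tok)

    tokens = expression.split()
    regs[target] = operand(tokens[0])                 # R9 := R8
    for op, arg in zip(tokens[1::2], tokens[2::2]):
        # Every later operator works on the target register itself,
        # so no temporary storage is needed.
        regs[target] = ops[op](regs[target], operand(arg))
    return regs[target]


regs = {"R6": 0x0F, "R7": 0xFF, "R8": 0x1234, "R9": 0}
result = assign_left_to_right("R9", "R8 and R7 shll 8 or R6", regs)
```

With these register values the intermediate results are 1234, 34, 3400 and 340F (hex), one per step of the expansion.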
6.1.3 PL516
This language was designed and implemented [63] in 1969–1970, at the Na-
tional Physical Laboratory in England, as a high-level assembler language for a
small, 16-bit machine, the Honeywell DDP516. It was influenced by PL360 and
was an attempt to create a language that would be similar to Algol 60 and, at the
same time, would be machine dependent and would allow the direct use of machine
instructions. To quote from the introduction:
‘. . . It must be emphasized that the language is no sub-
stitute for Fortran or Algol 60, but rather a more conve-
nient and flexible system for writing long, machine-code
programs. The main advantage over DAP, the assembler
for the DDP-516, is that program texts are largely self-
documenting due to their Algol-like structure . . .’
The main features of PL516 are: variables (but no real variables), with scope;
type checking; expressions (but no temporary working storage); procedures (but
not more than one parameter per procedure and no call by name); compound state-
ments; conditional statements; simple for loops; arrays (only one-dimensional); no
dynamic memory allocation; direct control of the machine registers; machine in-
structions can be included in the code.
The following two short examples, taken from the user’s manual, give the flavor
of the language:
1.
for xsymbol←176 do
if mcode[xsymbol]=ident then goto found
else codesymbol IRS,0;
This is a loop, searching array mcode. IRS is a machine instruction (IncRement
in Store, by 1) and, in our example, it increments memory location 0, which also
happens to be the X (index) register.
2.
accumulator conditional procedure bsis;
begin
xsymbol←accumulator;
more: when cleft optable[xsymbol]=bs then
begin
codesymbol STX,addbs;
184 Special Assembler Types Ch. 6
exittrue
end;
when accumulator nonzero then
begin
x←inc icleft optable[xsymbol]+x;
goto more;
end
end;
This is a procedure bsis that searches an array optable to find the type of a
basic symbol (basic symbols are tokens read by the compiler from the source file,
on paper tape). On finding a match in the array, variable addbs is set to the array
index by the machine instruction STX (store X register). The keyword accumulator
stands for the accumulator (the A register).
6.1.4 BABBAGE
ADD p
STA q
MULT n
LOD a
CMP b
BZ L1
LOD c
CMP d
BNZ L2
L1 LOD #0
STA g
L2 --
2. IF a EQ b THEN << 0=>a 1=>b >>. The symbols ‘<<’ , ‘>>’ act as a ‘begin-
end’ pair to delimit compound statements. In our case, the compound statement
is the two simple expressions setting operands a, b to 0,1 respectively. This is
translated into:
LOD a
CMP b
BNZ L1
LOD #0
STA a
LOD #1
STA b
L1 --
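The translation pattern of example 2 can be sketched in Python. This is a hypothetical helper, not part of any BABBAGE translator: it handles only an EQ test followed by a compound statement of simple assignments, and it takes the label as a parameter instead of generating it.

```python
def translate_if_eq(left, right, assignments, label):
    """Translate IF left EQ right THEN << v1=>t1 v2=>t2 ... >> into the
    load/compare/branch pattern shown above."""
    code = [
        f"LOD {left}",
        f"CMP {right}",
        f"BNZ {label}",       # skip the compound statement if not equal
    ]
    for value, target in assignments:
        code.append(f"LOD #{value}")
        code.append(f"STA {target}")
    code.append(f"{label} --")  # the label marks the next instruction
    return code


code = translate_if_eq("a", "b", [(0, "a"), (1, "b")], "L1")
```

Each simple assignment costs exactly two instructions, and the whole test costs three, which shows how close the language stays to the machine.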
Records can be declared and used in a way very similar to Pascal records.
Pointers can be set to combine records into large data structures.
The above description is incomplete but it shows the main characteristics of
the language, namely, that it combines higher-level language features with simple
syntax and heavy machine dependence. BABBAGE is therefore a HLA in the sense of
both definitions given in this section.
6.2 Summary
In the 1970s, while the 360/370 computers were in wide use, PL360 had gener-
ated interest among programmers, and had enjoyed a certain amount of use. Today
(in the 1990s), however, that interest has virtually disappeared. Still, there
seem to be language designers who believe in the concept of a high-level assembler
language. As a result, we may perhaps see more interest in this approach in the
future. More future programming languages may be combinations of the higher-
level and assembler languages of today, providing the benefits of both types. If this
proves true, we may see the demise of conventional assembler languages in favor of
such hybrids.
RR AR,R4,R6
RR SR,R4,R‘SYM+1’
SR LD,R4,ABC
The first instruction is easy to assemble. The MA only needs an OpCode table.
The second and third instructions require the value of symbols that should be in
the symbol table. SYM should be an absolute symbol, and ABC, a relative one. The
MA should first evaluate the expression ‘SYM+1’, and then assemble the instructions
in the usual way.
Exercise 6.4 What does the following line mean, if anything: RR X’R R4,R6?
Procedures and Functions: A procedure can be defined, which describes an
instruction, or even more than one instruction, in terms of a parameter or several
parameters. The SR instruction LD, mentioned earlier, may be described by the
procedure:
PROC LD,Q
SR $A8,Q(1),Q(2)
ENDP
This is similar to a macro. The procedure name is LD, and it has one (compound)
parameter Q. It generates a type SR instruction with the OpCode A8 (hexadecimal), followed
by the two components of the parameter Q. A typical call could be ‘LD,(R4,ABC)’,
which is very similar to the way the LD instruction is normally written.
In a similar way, a function may be defined, to calculate and return a certain
result. The INC function below assigns a value to a symbol, a value that depends
on the LC.
FUNC INC,Q,P
Q EQU LC+P
ENDF Q
The ENDF specifies that the result of the function is Q. When such a function is used,
as, for example, in: ‘. . . INC S,68 . . .’, it generates ‘S EQU LC+68’ and returns the
value of S, which can be used on the same line.
Library Routines: A good assembler should have access to a library of routines,
and should be able to search the library and add routines to the program being
assembled. With a MA, the problem is that there may be several libraries, each
with a different format. A good MA can therefore accept a special command with
the name of a specific library and information on how to search it and read routines
from it.
Directives: Most directives are independent of the specific source language
used, making it easy to execute them by a MA. The most important exceptions are
the data generating directives. They are machine dependent since computers use
different types of data. The size of numbers, the character codes, the method used to
represent signed numbers, all those depend on the architecture of the computer, and
may be very different for different computers. However, computers use a relatively
small number of data types, which simplifies the task of executing those directives.
Sec. 6.3 Meta Assemblers 189
Once the MA receives specifications such as: word size, 2’s complement or 1’s
complement, how many bits in the fraction and exponent parts of a floating point
number, ASCII or EBCDIC, it can execute the data generating directives.
6.4 Disassemblers
Practically speaking, a disassembler is the opposite of an assembler. It trans-
lates in the opposite direction, from machine code to assembler language. Because
of the nature of assembler language, a disassembler is a relatively simple program.
Since each assembler instruction generates one machine instruction, the opposite
translation can be done with relative ease.
The disassembler goes through the following steps:
It reads the next machine instruction from the source file.
It identifies the OpCode. If the OpCode has fixed length, this is very easy. Oth-
erwise, the disassembler has to perform several tests starting with the first few bits
of the machine instruction.
Once the OpCode has been identified, an OpCode table, similar to the one used
by the assembler, is searched. The table provides the mnemonic, the number of
operands, and other information.
Once the number of operands is known, the disassembler scans each operand in
the machine instruction to determine the addressing modes, registers, and addresses
used by each.
After obtaining all this information, the original assembler instruction is known
and can be printed or written on a file.
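The steps above can be sketched in a few lines of Python. The sketch assumes a hypothetical machine, not any real one: every instruction starts with a one-byte OpCode, and each operand occupies one byte. The OpCode table, the mnemonics, and the function name disassemble are invented for illustration only.

```python
# Toy disassembler for a hypothetical machine (all encodings are assumptions).
# Each instruction is a 1-byte OpCode followed by zero or more 1-byte operands.
OPCODE_TABLE = {
    0x00: ("HLT", 0),
    0x01: ("LOD", 1),
    0x02: ("STA", 1),
    0x03: ("ADD", 1),
    0x04: ("JMP", 1),
}

def disassemble(code, start=0):
    """Return a list of (address, text) pairs for the bytes in code."""
    lines = []
    pc = start
    while pc < len(code):
        # Read the next byte and look it up in the OpCode table.
        entry = OPCODE_TABLE.get(code[pc])
        if entry is None or pc + 1 + entry[1] > len(code):
            # Unknown OpCode or truncated instruction: emit the byte as data.
            lines.append((pc, f"DC {code[pc]:02X}"))
            pc += 1
            continue
        mnemonic, n_operands = entry
        # Scan the operands that follow the OpCode.
        operands = code[pc + 1 : pc + 1 + n_operands]
        text = mnemonic
        if operands:
            text += " " + ",".join(f"{b:02X}" for b in operands)
        lines.append((pc, text))
        pc += 1 + n_operands
    return lines

program = bytes([0x01, 0x10, 0x03, 0x11, 0x02, 0x12, 0x00])
for address, text in disassemble(program):
    print(f"{address:04X} {text}")
```

With a fixed-length OpCode the table lookup is a single step; a variable-length OpCode would replace that lookup with a sequence of tests on the leading bits.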
The simple steps described above are not sufficient to disassemble machine
code. To end up with a correct, readable, assembler program, the disassembler has
to handle the following problems:
The symbol names used in the original program cannot be figured out by the
disassembler. Thus an instruction such as ‘JMP ABC’ may be disassembled to read
‘JMP A0001’. When the disassembler finds an instruction with an address operand,
it assumes that the original instruction has used a symbol, not an absolute address,
and it generates a unique symbol name such as Adddd where dddd are digits. The
resulting disassembled program is technically identical to the original one, but may
be harder to read since meaningful symbol names are important.
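One possible way to generate such names is sketched below. The input format is an assumption made for the example: each decoded instruction is a tuple (address, mnemonic, target), where target is the address operand or None. Names are numbered in order of first appearance, which is only one of several reasonable choices.

```python
def assign_labels(instructions):
    """Map each distinct address operand to a generated name A0001, A0002, ..."""
    labels = {}
    for _, _, target in instructions:
        if target is not None and target not in labels:
            labels[target] = f"A{len(labels) + 1:04d}"
    return labels

def render(instructions):
    """Produce source lines, with generated labels in the label field."""
    labels = assign_labels(instructions)
    lines = []
    for address, mnemonic, target in instructions:
        label = labels.get(address, "")
        operand = labels[target] if target is not None else ""
        lines.append(f"{label:<6}{mnemonic} {operand}".rstrip())
    return lines

# Hypothetical decoded program: (address, mnemonic, address operand or None).
decoded = [
    (0, "LOD", 6),
    (2, "JMP", 4),
    (4, "HLT", None),
    (6, "DC", None),
]
for line in render(decoded):
    print(line)
```

The generated program assembles to the same object code as the original, but a name such as A0002 carries none of the meaning the programmer's own label had.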
A similar problem exists with regard to macros. Some instructions in the object
code come from the expansion of macros while others come from the original pro-
gram. When the object code is disassembled, it is impossible to tell whether the
original program has used macros and, if yes, which ones. The disassembled source
can therefore have no macros.
The same thing is true for any feature that is completely handled by the assem-
bler. EQU symbols, for instance, are transparent to the disassembler and cannot be
reflected in the listing generated by it. Thus when something like:
REG EQU 5
CMP REG,. . .
is assembled and then disassembled, the following is generated:
CMP 5,. . .
Since the contents of address F949 is not the OpCode of any 6502 instruction.
Example: PC DOS, the operating system of the IBM PC, has a simple debug-
ger, called DEBUG, that can disassemble 80x86 machine code. The three examples
below show the results of disassembling code starting from address 0 of a certain
memory segment (the segment number is irrelevant and has been omitted).
0000 CD20 INT 20
0002 C0 DB C0
0003 9F LAHF
0004 009AEEFE ADD [BP+SI+FEEE],BL
0008 1DF0F4 SBB AX,F4F0
000B 02BA102F ADD BH,[BP+SI+2F10]
000F 03BA10BC ADD DI,[BP+SI+BC10]
0013 02BA10FC ADD BH,[BP+SI+FC10]
0017 0C01 OR AL,01
0019 0301 ADD AX,[BX+DI]
001B 0002 ADD [BP+SI],AL
001D FFFF ??? DI
001F FFFF ??? DI
When DEBUG is directed to start disassembling from address 1, the bytes ‘20 C0’,
that were interpreted as parts of two instructions, now constitute an ‘AND AL,AL’
instruction. The first few lines are different, but the rest is the same.
A similar result is obtained when disassembling from address 2. The first byte
is C0 and, since no instruction can start with this value, the disassembler considers
it to be data. It generates a ‘DB C0’ directive, and assumes that the next byte (9F)
starts the next instruction.
0002 C0 DB C0
0003 9F LAHF
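The dependence on the start address can be reproduced with a deliberately tiny decoder. The sketch below is not DEBUG and recognizes only three 8086 encodings: CD (INT with a one-byte operand), 9F (LAHF), and 20 in its register-to-register form (AND between two 8-bit registers). Every other byte is emitted as a DB directive. The function name decode and the four-byte memory image are assumptions for the example.

```python
# Names of the 8086 8-bit registers, indexed by their 3-bit register number.
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]

def decode(code, start):
    """Disassemble code from offset start, using a three-instruction subset."""
    out = []
    pc = start
    while pc < len(code):
        op = code[pc]
        if op == 0xCD and pc + 1 < len(code):
            # INT imm8
            out.append((pc, f"INT {code[pc + 1]:02X}"))
            pc += 2
        elif op == 0x9F:
            out.append((pc, "LAHF"))
            pc += 1
        elif op == 0x20 and pc + 1 < len(code) and code[pc + 1] >> 6 == 3:
            # AND r/m8,r8 where the second byte selects two registers.
            second = code[pc + 1]
            out.append((pc, f"AND {REG8[second & 7]},{REG8[(second >> 3) & 7]}"))
            pc += 2
        else:
            # Anything else is treated as data.
            out.append((pc, f"DB {op:02X}"))
            pc += 1
    return out

memory = bytes([0xCD, 0x20, 0xC0, 0x9F])   # the first four bytes of the listing
for start in (0, 1, 2):
    print(start, decode(memory, start))
```

The three runs decode the same bytes into three different instruction streams, which is why a disassembler must be told (or must guess) where the code really begins.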
Clean Up Windows
Empty Trash
Erase Disk
Restart
Shut Down
— Items of ‘SPECIAL’ menu Macintosh computer, 1990.
7. Loaders
of the operating system (because of its memory allocation task), while the assembler
is more a stand-alone program, having little to do with the rest of the OS.
Most of this chapter is devoted to linking loaders, but it starts with two short
sections describing assemble-go loaders and absolute loaders. It ends with a number
of sections devoted to special features of loaders and to special types of loaders.
Before we start, here is a comment on the word ‘relocate’. Loaders do not
relocate a program in the sense that they do not move it in memory from one
area to another. The loader may reload the same program in different memory
areas but, once loaded, the program normally is not relocated. There are some
exceptions where a program is relocated, at run time, to another memory area but,
in general, the term ‘relocate’ is a misnomer.
Exercise 7.2 How can the user estimate the size of a new program?
The following two points show that a one pass assembly-load operation in a
large computer has serious disadvantages. As a result, it is used in practice only in
single-user computers.
Sec. 7.1 Assemble-Go Loaders 197
Figure 7–2.
a. Program and SQRT routine loaded in memory.
b. Larger user program loaded in two parts.
c. First part of program JMPs to the second part, over the SQRT routine.
The discussion above illustrates the limitations of this type of loader. They are
the reasons why most loaders are of the relocating type. Only a few are absolute
loaders (see below) or are part of the assembler.
Figure 7–3.
a. The Files Associated with a Loader
b. Layout of a Relocatable Object File (program-size directive, then object instructions, each with its id bits).
01011001 11 Y
10000010 11 Entry. length=2
01000011 11 C
01000100 11 D
0 EXTRN LB,Y
0 ENTRY CD
0 Z DS 5 01000101 11
5 ST LOD R5,15 1 5 00 15 00001101 00
00000000 00
00001111 00
8 CD COMP R5 2 5 00010101 00
Sec. 7.3 Linking Loaders 203
1. The final value of the LC in each program (the value printed to the left of
the END directive) is the size of the program. It becomes known at the end of pass
1, and is the first item to go on the object file, at the start of pass 2. The loader
reads it and uses it to determine the memory area (and thus the start address) of
the program.
2. Symbol CD is declared as ENTRY in one program but is never declared as an
EXTRN in any other program. This is an unusual situation created either by an error
(the programmer has forgotten to declare CD as an EXTRN in another program) or by
a change in the original design of the program (CD was supposed to be an EXTRN
in another program, plans were changed, CD has become just a regular symbol, but
the programmer forgot to delete the declaration ENTRY CD).
The loader should detect this and indicate a warning (‘unmatched entry CD’).
This is not necessarily an error but may provide a hint to the programmer that
something in the program needs to be corrected.
The opposite case—that of an unmatched EXTRN—is different. If a symbol
is declared EXTRN (and is also used by the declaring program) then it should be
declared as an ENTRY in another program. A failure to do so is an error, since the
loader would not be able to link the two programs. However, such a case may mean
that the symbol is the name of a library routine and is declared as an ENTRY in
the library. This is discussed in point 8 below and also in the section on library
routines.
3. Each line in the example object files is 8 bits long plus 2 identification
bits. They identify the line as either an absolute item (00), a relocatable item (01),
something that requires special relocation (10), or as a loader directive (11). Each
line thus becomes 8 + 2 = 10 bits long, of which only 8 bits actually get loaded in
memory. In certain computers, the operating system makes it hard, or inefficient,
to output lines with sizes which are not multiples of 8. In such a case the assembler
may write all the id bits, packed four pairs to a byte, at the end of the object file.
The loader would have to read this information first, reopen the object file, and
start loading. This is the reason why many loaders perform two passes over each
object file.
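Packing the id bits four pairs to a byte can be sketched as follows. The bit order (first id in the high-order bits), the padding of the last byte with 00, and the need to know the number of ids separately are all assumptions of this sketch; the text does not specify them.

```python
def pack_ids(ids):
    """Pack 2-bit id values, four to a byte, first id in the high-order bits."""
    packed = bytearray()
    for i in range(0, len(ids), 4):
        group = list(ids[i:i + 4])
        group += [0] * (4 - len(group))      # pad the last byte with 00 pairs
        byte = 0
        for value in group:
            byte = (byte << 2) | (value & 0b11)
        packed.append(byte)
    return bytes(packed)

def unpack_ids(packed, count):
    """Recover the first count id values from the packed bytes."""
    ids = []
    for byte in packed:
        for shift in (6, 4, 2, 0):
            ids.append((byte >> shift) & 0b11)
    return ids[:count]

# 11 = loader directive, 00 = absolute, 01 = relocatable, 10 = special relocation
ids = [0b11, 0b00, 0b01, 0b10, 0b00]
packed = pack_ids(ids)
print(packed.hex(), unpack_ids(packed, len(ids)))
```

A loader that uses this layout reads the packed bytes first, unpacks them into a list, and then consumes one id per 8-bit line on its second pass over the file.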
4. In this example, we limit ourselves to two id bits and thus can have only
four types of records on the object file. Specifically, we can have just one type of
special relocation (identified by 10). Ideally, it would be desirable to have two types
of special relocation, absolute and relative. In such a case one could write:
ENTRY AB
.
AB EQU 7
in one program and:
EXTRN AB
.
CALL AB
assembly and the assembler, in response, completes pass 2 by writing the special
symbols from the symbol table on the object file (as loader directives, codes ‘101’,
‘100’, of variable size), followed by an end of file (eof). It is the programmer’s
responsibility to make sure that only one source file in the entire program has a
start address in the operand field of the END directive. The loader expects only one
such address and issues an error message on reading the second (and subsequent)
ones.
Exercise 7.4 What if the loader does not find any code ‘110’ directives?
Using the example, it is easy to see how the loader works. It reads the names
of the object files either from the command line (the keyboard command used to
invoke the loader) or from a special command file. It opens the two object files,
reads the first directive from each, adds the two program sizes (27 + 8 = 35) and
locates an available memory area of size ≥ 35. If that area starts, say, at address 64,
then 64 becomes the first relocation term. The loader reads the object files in the
order in which their names are specified. Assuming that it starts with file SUB, the
other relocation terms can now also be calculated. In our case, there is one more
such term, namely 64 + 8 = 72. In general, each relocation term equals the sum of
its predecessor and the length of the previous program.
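The calculation of the relocation terms is a running sum, as the following short sketch shows. The function name relocation_terms is an assumption; the sizes and the load address 64 are those of the example.

```python
from itertools import accumulate

def relocation_terms(sizes, load_address):
    """Start address of each program when programs are loaded back to back."""
    # Each term is its predecessor plus the length of the previous program.
    return list(accumulate(sizes[:-1], initial=load_address))

sizes = {"SUB": 8, "NOM": 27}              # program sizes, in loading order
terms = relocation_terms(list(sizes.values()), 64)
print(dict(zip(sizes, terms)))             # relocation term for each program
print(sum(sizes.values()))                 # memory needed for the whole load
```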
At this point, the loader can print the memory map which, in our case, is:
Exercise 7.5 What other information can the loader include in the memory map?
Next the loader reads the directives for the ENTRY symbols from the first file.
It stores the special symbol information in the SST after relocating their values
(point 4 above shows that special symbols are always relative). Thus LB gets a
value of 0 + 64 = 64 and Y, a value of 7 + 64 = 71.
The loader repeats this for the next object file. It reads the special symbol
directives from that file into the SST.
The second file contains three special symbols LB, Y, CD. They are entered
into the symbol table with CD—an ENTRY—relocated by 72. Its value becomes
8 + 72 = 80. Symbols LB, Y are of type EXTRN and their values are still unknown.
They are entered, each with its index, instead of a value. The SST now contains:
Note that the program name goes in the SST with each symbol. It is later used to
perform the linking.
After reading all the special symbol information from all the object files, the
loader proceeds to merge symbols in the SST. It scans the SST looking for symbols
of type ext. For each such symbol, the loader searches the SST for the corresponding
ent symbol. On finding it, the value of the ent symbol is stored in the entry for the
ext symbol, and the entry for the ent symbol is deleted.
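The merging step might look like the sketch below. The dictionary layout of an SST entry and the index assumed for Y are illustrative; the symbol values are those computed above. The sketch reports an unmatched ext symbol as an error immediately, whereas a real loader would first search the library.

```python
def merge_sst(sst):
    """Resolve ext entries against ent entries; return (gest, warnings, errors)."""
    entries = {e["name"]: e for e in sst if e["type"] == "ent"}
    used = set()
    gest, warnings, errors = [], [], []
    for e in sst:
        if e["type"] != "ext":
            continue
        match = entries.get(e["name"])
        if match is None:
            errors.append(f"unresolved external {e['name']}")
            continue
        # Store the value of the ent symbol in the entry for the ext symbol.
        gest.append({**e, "value": match["value"]})
        used.add(e["name"])
    # Matched ent entries are dropped; unmatched ones stay and cause a warning.
    for name, e in entries.items():
        if name not in used:
            gest.append(e)
            warnings.append(f"unmatched entry {name}")
    return gest, warnings, errors

sst = [
    {"name": "LB", "type": "ent", "program": "SUB", "value": 0 + 64},
    {"name": "Y",  "type": "ent", "program": "SUB", "value": 7 + 64},
    {"name": "LB", "type": "ext", "program": "NOM", "index": 1},
    {"name": "Y",  "type": "ext", "program": "NOM", "index": 2},  # index assumed
    {"name": "CD", "type": "ent", "program": "NOM", "value": 8 + 72},
]
gest, warnings, errors = merge_sst(sst)
for e in gest:
    print(e["name"], e["type"], e["program"], e["value"])
print(warnings, errors)
```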
Notice that, ideally, a GEST should only have ext type symbols. In our example
one ent symbol remained in the GEST because the declaration ‘EXTRN CD’ did not
appear in any program. This case has already been dealt with earlier in this chapter
and it results in a loader warning.
There could also be the case where, for a certain ext entry, the loader cannot
find the corresponding ent entry. This may mean one of two things: either the
programmer has forgotten to declare an entry point, or the missing entry is the name
of a library routine.
When a library routine is needed in a program, the programmer can simply write
‘CALL xx’ and declare ‘EXTRN xx’ (where xx stands for the name of the routine).
The loader, in such a case, searches the library for such an entry. If it finds one,
it loads the routine into memory and it becomes part of the program. If not, the
loader considers this case a fatal error. It issues an error message and will not
execute the program. Library routines are discussed later in this chapter.
Exercise 7.6 What if there are two ent symbols in the SST, with the same name
but from different programs, both matching the same ext symbol?
If the user has asked to see the GEST, this is the time for the loader to print
it. This is also the reason why the symbol names are retained in the table. We will
see that they are not used in the actual linking.
In the absence of errors, the loader is now ready to load the object files. It starts
with the first object file, SUB, reads instructions, and loads them into consecutive
locations starting from location 64. Since there are 8 bytes in that file, the last one
will be loaded in location 71. Each byte with id bits 01 will get relocated by adding
the first relocation term, 64, to it. The result in memory will be:
208 Loaders Ch. 7
64 00111001
65 01000101
66 00111001
67 00000101
68 01000000
69 reserved for
70 array X
71 00000010
There are no linking problems in this program since it does not use any EXTRN
symbols. Notice that none of the records on the object file has id = 10.
On reading the eof, the loader switches to the next object file (NOM) and starts
using the next relocation term (72). This file is loaded in the same way as the
previous one, and ends up occupying locations 72–96. The only difficulty in loading
the second file is the items with id bits of 10. Each of them contains an index
rather than the value of a symbol.
To find the value of such a symbol, the loader searches the GEST for the name
of the program (NOM) and the specific index. On finding the entry, the value becomes
available. The first such line in our example is ‘CALL LB’. It is read from the object
file with id = 10 which means that the corresponding byte (=00000001) is an index,
not a value. The loader searches the GEST for an entry with program=NOM and
index=1. On finding it, the loader uses the ‘value’ field, which has already been
relocated, to complete the instruction. It is loaded in locations 83–84 as the two
bytes 00100000 01000000, the first being the OpCode (=4) and the second, the
value (=64) of symbol LB.
72 reserved 84 01000000 LB
73 85 00101100 BEQ
74 for 86 01011001 X=17+72=89
75 87 00100000 CALL
76 array Z 88 01000000 LB
77 00001101 LOD 89 00001101 LOD
78 00000000 90 00000000
79 00001111 15 91 01000111 Y
80 00010101 COMP 92 00110000 HLT
81 00000100 LSHFT 93 00000001 constants
82 00100000 94 00100100 $
83 00100000 CALL 95 01011001 X
96 01000000 LB
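The way the loader treats the three kinds of id bits while loading can be sketched as follows. The record format, a list of (byte, id) pairs, and the function name load are assumptions; loader directives (id 11) are taken to have been processed already. The four records are the ‘CALL LB’ and ‘BEQ X’ bytes that end up in locations 83–86 above.

```python
def load(records, address, relocation_term, program, gest, memory):
    """Store (byte, id) records in memory at consecutive addresses."""
    for byte, id_bits in records:
        if id_bits == 0b01:
            # Relocatable item: add the relocation term of this program.
            byte += relocation_term
        elif id_bits == 0b10:
            # Special relocation: the byte is an index into the GEST.
            byte = next(e["value"] for e in gest
                        if e["program"] == program and e["index"] == byte)
        memory[address] = byte & 0xFF
        address += 1
    return address

gest = [{"name": "LB", "program": "NOM", "index": 1, "value": 64}]
records = [
    (0b00100000, 0b00),    # CALL, absolute
    (0b00000001, 0b10),    # index 1, to be replaced by the value of LB
    (0b00101100, 0b00),    # BEQ, absolute
    (0b00010001, 0b01),    # X = 17, relocatable
]
memory = {}
next_free = load(records, 83, 72, "NOM", gest, memory)
for location in sorted(memory):
    print(location, f"{memory[location]:08b}")
```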
Sec. 7.4 The Modify Loader Directive 209
The directive code in our example will be 011. The ‘index’ is as before, and the
‘modification code’ can take the four values: replace (00), add (01), subtract (10),
logical OR (11). The ‘address’ field specifies the byte in the current object file to
be modified, and the ‘skip’ & ‘field’ fields specify the exact field in the byte to be
modified. The case skip=4, field=3 specifies field yyy in byte ‘xxxxyyyz’. In the
example above, the special relocation of the bytes at addresses 12 & 19 can now be
achieved by the two ‘modify’ directives:
011 00001 0000 1000 00 001100 011 00010 0000 1000 00 010011
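The two directives above can be decoded and applied with the sketch below. The order and widths of the fields (code 3 bits, index 5, skip 4, field 4, modification code 2, address 6) are inferred from the two bit strings and should be read as an assumption, as should the function names.

```python
FIELD_WIDTHS = [("code", 3), ("index", 5), ("skip", 4), ("field", 4),
                ("mod", 2), ("address", 6)]

def parse_modify(bits):
    """Split a 'modify' directive, given as a string of 0s and 1s."""
    bits = bits.replace(" ", "")
    fields, pos = {}, 0
    for name, width in FIELD_WIDTHS:
        fields[name] = int(bits[pos:pos + width], 2)
        pos += width
    return fields

def apply_modify(byte, value, skip, field, mod):
    """Combine value into the field of `field` bits that follows `skip` bits."""
    shift = 8 - skip - field
    mask = (1 << field) - 1
    old = (byte >> shift) & mask
    # 00 = replace, 01 = add, 10 = subtract, 11 = logical OR
    new = {0b00: value, 0b01: old + value, 0b10: old - value, 0b11: old | value}[mod]
    return (byte & ~(mask << shift) & 0xFF) | ((new & mask) << shift)

first = parse_modify("011 00001 0000 1000 00 001100")
second = parse_modify("011 00010 0000 1000 00 010011")
print(first)
print(second)
# skip=4, field=3 selects yyy in 'xxxxyyyz': replace it with 110.
print(f"{apply_modify(0b10100101, 0b110, 4, 3, 0b00):08b}")
```

With skip=0 and field=8, as in both directives, the whole byte is the field, so a ‘replace’ simply stores the value of the external symbol in the byte.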
Note that the instruction itself does not need to contain the index of the ex-
ternal symbol any more. Also the id bits can simply be 00, and there is no need
for the special relocation id of 10. The loader loads the instruction in memory
without any modification and, when it finds the ‘modify’ directive later, it modifies
the instruction in memory. The ‘modify’ directive only needs to appear in the same
object file and should not precede the instruction to be modified.
This loader directive adds power to the assembler language. It is now possible
to write something like ‘DC Y-LB’ and the assembler would produce a constant
of all zeros and two ‘modify’ directives, one to replace Y (or to add it) and the
other, to subtract LB. An address expression such as ‘Y-LB’ involves two relative
quantities (that can be external) and, as explained in chapter 1, is itself absolute.
However, to make it meaningful, we always require that all external symbols used
in the expression be declared (as entry) in one program. If they are declared in two
different programs, then the difference between them, even though still absolute, is
totally unpredictable and therefore useless.
The ‘modify’ can be used for both linking and relocation. Normally it is a waste
to use it for relocation, since relocation only requires a bit or two per instruction.
However, on a computer where the relative mode is heavily used, relatively few
instructions need relocation, and the ‘modify’ may be used instead of relocation
bits.
LC
-
EXTRN M,N
-
12 B JMP -
-
25 A SUB -
-
- X EQU A-B+M-N
62 Y DC A-B+M-N
0 7
1 5 8
2 7 8
3 7 8
As a result, a ‘modify’ needs to specify only the address of the first byte of
the instruction and a number in the range 0–3 (the ‘case’ field below). A possible
format is:
(Figure: the linkage editor receives commands and the names of all object files as initial input. Its main output is passed to the relocating loader, which produces a single executable module in memory; its secondary output is a memory map and error messages.)
The linkage editor performs linking and also relocates all the programs (or all
the control sections) relative to the start of the first program. The load module
therefore contains one stream of instructions and is easy to load. To load it, the
relocating loader only needs to add the start address to all the relocatable instruc-
tions. It has no linking to do, and thus no GEST to maintain. It treats the program
as one unit and thus does not have to update the relocation term.
If the start address of the program is known in advance, the linkage editor
can relocate all the object files relative to that start address, and produce a load
module that can only be loaded starting at that address. Such a load module is, of
course, an absolute object file, and can easily be loaded later, by an absolute loader,
without any relocation. This is commonly done on modern personal computers and
workstations.
and the assembler assembles the CALL instruction into a software interrupt instruc-
tion (called SVC, BREAK or some other name). Dynamic loading is therefore not just
a loader feature; the assembler is also involved.
Sec. 7.7 Loader Control 213
Normally, when the loader detects errors, it does not start execution. It is
possible to override this and instruct the loader to execute the program in the
presence of certain errors. This may make sense if certain external symbols remain
unresolved, but the programmer knows that this particular run will not use their
values.
To change the list of object files to be loaded, the loader accepts INCLUDE and
DELETE commands. The INCLUDE command has the general form
INCLUDE file name,device name
It specifies an object file, perhaps a library routine, to be included in the load. The
DELETE command tells the loader that a certain object file should be deleted from
the list of object files to be loaded.
The ‘CHANGE old name,new name’ command instructs the loader to change, in
the GEST, the name of an external symbol. This command is used when the user
decides to change a certain routine after the program has been assembled. Imagine
a program with a ‘CALL COS’ instruction, calling a routine COS in object file (or
library file) MATH. The user has decided to use another routine COS1, located in file
NEW. The following loader commands may be necessary:
DELETE MATH
INCLUDE NEW
CHANGE COS,COS1
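The effect of these three commands on the loader's tables can be sketched as follows. The representation is an assumption: the list of object files and the list of external names in the GEST are plain Python lists, and MAIN is a hypothetical name for the user's own object file.

```python
def apply_commands(commands, files, externals):
    """Apply INCLUDE, DELETE and CHANGE commands to copies of the two lists."""
    files, externals = list(files), list(externals)
    for line in commands:
        verb, _, rest = line.partition(" ")
        args = [a.strip() for a in rest.split(",")]
        if verb == "INCLUDE":
            files.append(args[0])          # a device name, if given, is ignored here
        elif verb == "DELETE":
            files.remove(args[0])
        elif verb == "CHANGE":
            old, new = args
            externals = [new if name == old else name for name in externals]
        else:
            raise ValueError(f"unknown loader command: {verb}")
    return files, externals

files, externals = apply_commands(
    ["DELETE MATH", "INCLUDE NEW", "CHANGE COS,COS1"],
    files=["MAIN", "MATH"],
    externals=["COS"],
)
print(files, externals)
```

After the three commands, the external symbol used by ‘CALL COS’ is matched against the entry COS1 in file NEW, without reassembling the program.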
Other loader commands are mentioned in the following sections, in connection
with library search, overlays, and other loader features.
Exercise 7.9 Library routines are in the form of object files. Can such a routine
be an absolute object file?
When such a routine is loaded, it may call other routines, which means more
library searches but no special complications for the loader. If a library search fails,
the loader issues an error message and will not execute the program.
Sec. 7.8 Library Routines 215
7.9 Overlays
Many modern computers use virtual memories that make it possible to run
programs larger than the physical memory. Either one program or several programs
can be executed even if the total size is greater than the entire memory available.
When a computer does not use virtual memory, running a large program becomes a
problem. One solution is overlays (or chaining), which will be discussed here since
its implementation involves the loader.
Overlays are based on the fact that many programs can be broken into logical
parts such that only one part is needed in memory at any time. The program is
divided, by the programmer, into a main part (the overlay root), that resides in
memory during the entire execution, and several overlays (links or segments) that
can be called, one at a time, by the root, loaded and executed. All the links share
the same memory area whose size should be the maximum size of the links. A link
may contain one program or several programs, linked in the usual way. At any
given time, only the root and one link are active (but see the discussion of sublinks
and tree structure below). Two features are needed to implement overlays:
A directive declaring the start of each overlay. Those directives are recognized by
the assembler which, in turn, prepares a separate object file for each overlay.
A special ‘CALL OVERLAY’ instruction to load an overlay (a link) at run time.
Such an instruction calls a special loader routine, the overlay manager, resident in
memory with the main program, which loads the specific overlay from the object file
into the shared memory area. The last executable instruction in the overlay must
be a return. It should return to the calling program, which is typically the main
part, but could also be another overlay. Such a return works either by popping the
return address from the stack, or by generating a software interrupt, that transfers
control to the overlay manager in the OS.
A typical directive declaring an overlay is ‘OVERLAY n’ (or ‘LINK n’) where n is
the overlay number. Each such directive directs the assembler to finish the previous
assembly, write an object file for the current overlay, and start a new assembly for
the next overlay. The END directive terminates the last link. The result is a number
of object files, the first of which is a regular one, containing the main program. All
the rest are special, each containing a loader directive declaring it to be an overlay
and specifying the number of the overlay.
The loader receives the names of all the object files. It loads the first one but,
upon opening the other ones, finds that they are overlays. As a result, the other
object files are not loaded but are left open, accessible to the loader. The loader
uses the maximum size of those files as the size of the shared memory area and
loads, following the main program, a routine that can locate and load an overlay.
At run time, each ‘CALL OVERLAY[n]’ (or ‘CALL LINK’) instruction, invokes that
routine which loads the overlay, on top of the previous one, into the shared area.
As for relocating the different overlays, there are two possibilities: The first one
is to relocate each overlay while it is loaded. The other possibility is to prepare
a pre-relocated (absolute) version of each overlay and load the absolute versions.
This requires more load time work but speeds up loading the overlays at run time.
Generally, an overlay is a large part of the program and is not loaded many times.
In such a case, the first alternative, of relocating the overlay each time it is loaded,
seems a better choice.
In general, each overlay may be very large, and sub-overlays can be declared.
The result is a program organized as a tree where each branch corresponds to an
overlay, each smaller branch, to a sub-overlay, etc. Figure 7–7 is an example of such
a tree.
Figure 7–7. An overlay tree: root A has links B, C, D; B has sublinks E, F, G; D has sublinks H, I; H has sublink J.
The table below assumes certain sizes for the different links and a start address
of 0 for the root A. It then shows the start addresses of each link and the total size
of the program when that link is loaded.
name   size   start   total size
A      1000       0         1000
B       500    1000         1500
E       350    1500         1850
F       700    1500         2200
G       100    1500         1600
C       200    1000         1200
D       250    1000         1250
H       100    1250         1350
J       200    1350         1550
I       250    1250         1500
Links B, C, D all start at the same address. Also E, F, G start at address 1500
and, similarly, H, I start at 1250. The combined size of all the links is 3650, but the
maximum amount of memory needed is only 2200 locations (when link F is loaded).
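The start addresses in the table follow from one rule: a link starts where its parent ends. The sketch below recomputes them from the sizes in the table and the parent of each link; the dictionary names and function names are assumptions.

```python
SIZES = {"A": 1000, "B": 500, "E": 350, "F": 700, "G": 100,
         "C": 200, "D": 250, "H": 100, "J": 200, "I": 250}
PARENT = {"A": None, "B": "A", "E": "B", "F": "B", "G": "B",
          "C": "A", "D": "A", "H": "D", "J": "H", "I": "D"}

def start_address(link):
    """A link starts where its parent ends; the root starts at 0."""
    parent = PARENT[link]
    return 0 if parent is None else start_address(parent) + SIZES[parent]

def total_size(link):
    """Memory in use when link and all links on its path to the root are loaded."""
    return start_address(link) + SIZES[link]

for link in SIZES:
    print(link, SIZES[link], start_address(link), total_size(link))

largest = max(SIZES, key=total_size)
print(largest, total_size(largest))
```

The last line identifies the link that determines how much memory the whole program needs.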
In keeping with tradition, we say that the root is the lowest level in the tree.
The main rule concerning the layout of links in memory is: if link X is loaded in
memory, all other links lower than X, that connect X to the root, must also be loaded
at the same time. This rule implies that, when X calls a lower link, that link should
be active. If it is not, the loader should issue an error. When X calls a link on the
same level as itself, it is always an error. When X calls a link higher than itself, it
must be exactly one level higher, and it must be a son of X in the tree.
To implement such a tree overlay structure the programmer should define the
structure in advance, to make it possible for the loader to check and verify every
call to a link and every return. To define the tree, the programmer must start each
overlay with an OVERLAY command specifying the name of the overlay and the name
of its parent. The general format is OVERLAY name, parent. In the above example
the loader sees the following commands
id pair
OVERLAY A,-- (0,0) The root is always (0,0)
OVERLAY B,A (1,0) The parent is A, so B’s level is 1
OVERLAY E,B (2,0) The serial numbers start at 0
OVERLAY F,B (2,1) Level 2 has 5 overlays, numbered
OVERLAY G,B (2,2) 0 thru 4
OVERLAY C,A (1,1)
OVERLAY D,A (1,2)
OVERLAY H,D (2,3)
OVERLAY J,H (3,0) Level 3 has only one overlay
OVERLAY I,D (2,4)
that identify the tree structure. Note that this identification is not unique. Switch-
ing the declarations for overlays E, F would make no difference for the loader al-
though, in principle, it would declare a different tree. The loader assigns id numbers
to the overlays based on their place in the tree. Each id is a pair (level, serial) where
‘level’ is one greater than the level of the parent, and the serial number is unique in
the level. The id pairs are shown in the list above. To control overlay loading, the
loader maintains one table, the overlay table (OT), containing the name, id pair,
and status of each overlay. The OT is used to decide whether an overlay call is valid
and where to load the next overlay. The OT starts with just the root active:
A 0,0 a B 1,0 - E 2,0 - F 2,1 - G 2,2 -
C 1,1 - D 1,2 - H 2,3 - J 3,0 - I 2,4 -
A status of ‘a’ means active and a status of ‘-’ means not loaded. Suppose that a
while later overlays D, H are loaded. The OT should be updated to:
A 0,0 a B 1,0 - E 2,0 - F 2,1 - G 2,2 -
C 1,1 - D 1,2 a H 2,3 a J 3,0 - I 2,4 -
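Building the OT from the OVERLAY commands can be sketched as follows. The sketch assumes that serial numbers are handed out per level in the order in which the declarations are seen, which reproduces the id pairs listed above; the table layout and the function name are illustrative.

```python
from collections import defaultdict

def build_ot(declarations):
    """declarations: (name, parent) pairs in the order the loader sees them."""
    next_serial = defaultdict(int)      # next free serial number on each level
    table = {}                          # name -> [level, serial, status]
    for name, parent in declarations:
        # The level is one greater than the level of the parent.
        level = 0 if parent is None else table[parent][0] + 1
        table[name] = [level, next_serial[level], "-"]
        next_serial[level] += 1
    return table

DECLARATIONS = [("A", None), ("B", "A"), ("E", "B"), ("F", "B"), ("G", "B"),
                ("C", "A"), ("D", "A"), ("H", "D"), ("J", "H"), ("I", "D")]
ot = build_ot(DECLARATIONS)
ot["A"][2] = "a"                        # the root is active from the start
for name in ("D", "H"):                 # later, overlays D and H are loaded
    ot[name][2] = "a"
for name, (level, serial, status) in ot.items():
    print(name, f"{level},{serial}", status)
```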
To see how the OT is used to manage overlay loading and deleting, consider the
case where overlays A, D, H are active. From figure 7–7 it is obvious that the only
overlay that can be loaded at this point is J. The rule in such a case is: Only the
highest-level active overlay can issue an overlay-call instruction.
Similarly, the only overlay that can be deleted, by a RET instruction, is H,
implying that only the highest-level active overlay can issue a RET. The main step
in deleting an overlay is to change its status in the OT from a to -.
One more rule is needed, to determine what overlays can be called at any time.
To understand this rule, let’s examine the case where overlays A, D are active. Again,
figure 7–7 shows that either H or I can be called, which suggests the rule: When
overlay X, with level xl, calls Y then:
1. the level of Y must be xl+1.
2. When scanning the OT, from left to right, starting with X, Y must be found
before reaching an entry with a level ≤ xl (or before reaching the end of the table).
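The two conditions translate directly into a scan of the OT. In the sketch below the table is reduced to (name, level) pairs in table order, and the function name may_call is an assumption; the status field, which the loader would also check, is left out.

```python
# (name, level) in the order in which the entries appear in the OT.
OT = [("A", 0), ("B", 1), ("E", 2), ("F", 2), ("G", 2),
      ("C", 1), ("D", 1), ("H", 2), ("J", 3), ("I", 2)]

def may_call(caller, callee, ot=OT):
    """Return True if overlay caller is allowed to call overlay callee."""
    names = [name for name, _ in ot]
    start = names.index(caller)
    xl = ot[start][1]
    # Scan to the right of the caller.
    for name, level in ot[start + 1:]:
        if level <= xl:
            return False                # left the subtree of the caller
        if name == callee:
            return level == xl + 1      # must be exactly one level higher
    return False                        # reached the end of the table

for caller, callee in [("D", "H"), ("D", "I"), ("D", "J"), ("B", "H"), ("H", "J")]:
    print(caller, callee, may_call(caller, callee))
```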
Exercise 7.11 The serial numbers of the overlays in the OT do not seem to be
necessary. What are they used for?
We have already seen that, at load time, the different sections are loaded in
the order MAIN, DATA, BETA, GAMMA or 1,3,6,2,5,4,7. When the loader sees the first
‘use’ loader directive, at the beginning of the object file, it loads the first section
and reads the rest of the file, looking for more uses of the same section. It finds two
more (3 & 6) and loads them. In the second pass, it starts with section DATA (2)
and reads the rest of the file to find and load the remaining part of that section (5).
Pass 3 loads the single BETA section (4) and pass 4, the single GAMMA section (7).
This is a simple process involving one simple data structure, a list of the ‘use’
directives. In the first pass, while loading all the sections of the first location
counter, the loader reads all the ‘use’ directives and stores them in a list. Before
each pass, the loader finds the first remaining item on that list and thus knows what
LC to load in that pass. Each time a section is found in the object file and loaded,
the loader deletes the corresponding item from the list. When the list is empty, the
loader is finished.
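The process can be sketched in a few lines of Python. The sketch assumes the object file is reduced to the list of LC names found in its ‘use’ directives, in file order; the function name load_order is hypothetical, and the actual loading of code is omitted.

```python
def load_order(uses):
    """Return the 1-based section numbers in the order they are loaded.

    `uses` lists the LC name of each section in object-file order.
    Each iteration of the loop corresponds to one pass of the loader.
    """
    pending = list(enumerate(uses, start=1))   # the list of 'use' directives
    order = []
    while pending:                             # list empty: loader is finished
        lc = pending[0][1]                     # first remaining item selects the LC
        order += [n for n, name in pending if name == lc]
        pending = [(n, name) for n, name in pending if name != lc]
    return order


sections = ["MAIN", "DATA", "MAIN", "BETA", "DATA", "MAIN", "GAMMA"]
print(load_order(sections))
```

For the seven sections of the example, the sketch performs four passes and produces the order 1,3,6,2,5,4,7.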
The ‘use’ loader directive contains the name and current value of the LC to be
used in the following section. The loader compares that information to the current
load address, which constitutes a good check of the integrity of the object file (see
review question 7–7).
Typically, the bootstrap ROM contains a very small loader that reads a fixed-
size record from an I/O device (the device can be specified by the operator) into
a predetermined memory area, then jumps to the start of that area. The record
should contain a small loader (the second one in the bootstrap process) which, in
turn, can load the main loader (the third one so far). The main loader is then used
to load the rest of the OS. The advantage is that the fixed-length record can
easily be modified if the OS is updated.
Several first-generation computers had most (or even all) of their storage on a
magnetic drum. In such a computer, loading a program meant writing the object
code on the drum, ready for execution. The best way to position instructions on the
drum is to consider the execution time of each instruction. If a certain instruction is
fetched from address x on the drum, and if it takes time y to execute the instruction,
then one can easily calculate the drum address that would be under the read/write
head when the instruction is completed. That drum address is, of course, the ideal
position for loading the next instruction. Loading it anywhere else on the drum
would cause a delay in fetching it, since the control unit would have to wait for
the slow drum to bring that instruction under the r/w head. Such computers are
called n+1 address machines, since each instruction should contain, in addition to
its n operands, also the drum address of its successor. If the instructions contain
addresses of operands, each operand also has an ideal drum address, depending on
the time interval between the moment the instruction is fetched and the moment
each operand is needed.
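The calculation is simple modular arithmetic. The sketch below is a minimal illustration under two assumptions that are not in the text: time is measured in word times (the time it takes one drum word to pass under the head), and y counts from the moment address x arrives under the head until the instruction is completed. The function names are hypothetical.

```python
def ideal_successor(x, y, track_len):
    """Drum position under the head when the instruction fetched at x completes."""
    return (x + y) % track_len


def rotational_delay(ideal, actual, track_len):
    """Word times lost if the successor sits at `actual` instead of `ideal`."""
    return (actual - ideal) % track_len


# An instruction at position 48 of a 50-word track that takes 5 word times.
print(ideal_successor(48, 5, 50))
print(rotational_delay(3, 2, 50))
```

Note that a successor placed just one word before its ideal position costs almost a full revolution, which is why placement matters so much on such a machine.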
The 650 [88, 98] was a decimal computer. It had a drum with a total capacity
of 2000 locations, each a word with a capacity of 10 decimal digits. The words were
recorded on the drum on 40 bands, each a circular track with 50 words. Addresses
ranged from 0000 through 1999 where each band had 50 consecutive addresses.
The drum had a separate read/write head for each band but the processor could
use only one head at any time. As a result if, at a certain point in time, the r/w
head was above location 0003, the processor could read either that location or any
of locations 0053, 0103, 0153, ..., 0653, ..., 1953 at that time.
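The set of addresses that are readable at the same moment follows directly from the numbers above (40 bands of 50 consecutive addresses each). A minimal sketch, with a hypothetical function name:

```python
WORDS_PER_BAND = 50
BANDS = 40


def simultaneously_readable(address):
    """All drum addresses that share the angular position of `address`."""
    position = address % WORDS_PER_BAND
    return [band * WORDS_PER_BAND + position for band in range(BANDS)]


print(simultaneously_readable(3)[:4])
```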
The computer had no random access memory, but it had additional hardware
that could be accessed by the machine instructions, using special, 800x addresses.
Ten console switches could be read as address 8000, a temporary register called the
distributor had address 8001, and a 20-digit accumulator had addresses 8002 (for
the lower 10 digits) and 8003 (for the upper 10 digits). Each instruction was 10
digits long, with the format:
[Listing omitted: source and object code columns.]
The first MPY instruction multiplies (the acc) by 240 . The constant itself is stored,
as part of a NOP instruction, in the word that follows the MPY (drum location 0022).
The same thing is true of the second MPY, which uses the constant 12.
Symbol TY005 has a value of 0028 and is defined outside this section of the pro-
gram. Addresses 8001, 8002 have been explained earlier.
The number 16 is a good choice both for a decimal and for a binary computer.
Binary computers use 16-bit words because they equal two bytes. A decimal com-
puter uses 4-bit digits. Therefore, each drum location on such a computer is a few
digits long, 16 bits implying 4 digits.
We start with just a few instructions but, since there is room for an instruction
set of up to 16 instructions, you might want to add more instructions of your own
design.
Mnemonic   Operand fetch   Next instr.   Note
ADD        4               5
LOD        2               1
STO        2               1
COM        4               3             1
BPL        3               3             2
HLT        -               -
Notes:
1. It does not matter what COM does. It could be ‘compare’, ‘complement’, or
anything else.
2. If the acc is ≥ 0 then go to the next instruction, else go to the instruction
whose address is in the operand fetch field.
The operand-fetch column indicates the distance, on the drum, between an
instruction and its operand. The next-instruction column indicates the distance
between the operand and the next instruction. There should also be the directives END,
DC, DS. Each instruction may have, in addition to its operand, a symbol indicating
The LOD instruction is assembled with the address of the ADD as its successor.
Before implementing the assembler-loader, it is necessary to understand the
way our drum operates. We will consider three cases, each to be implemented
separately.
Case 1. The drum has just one head which starts at band 0, location 0 and
advances to the next address in each clock cycle. The address following (0,15) is
(1,0) which means that, when the head reaches the end of a band, it automatically
moves to the start of the next band. In such a case, when an instruction is loaded
at address (x,13) and its execution time is 4, the ideal location for its successor
is (x+1,1). If this location is occupied, the assembler should advance and test
successive locations until it finds an empty one. We will assume, for the sake of
simplicity, that our test programs are short and can never overflow the drum.
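One possible way to implement the placement for case 1 is sketched below in Python. The band length of 16 follows from the address sequence above; the number of bands (num_bands), the function name, and the representation of occupied locations as a set of linear addresses are assumptions made for the sketch.

```python
LOCS_PER_BAND = 16   # from the text: address (0,15) is followed by (1,0)


def place_successor(occupied, addr, exec_time, num_bands=8):
    """Pick a drum address for the successor of the instruction at `addr`.

    `occupied` is a set of linear addresses already in use; `num_bands`
    is an assumed drum size. Returns ((band, loc), probes), where probes
    counts the preoccupied locations that had to be skipped.
    """
    total = num_bands * LOCS_PER_BAND
    if len(occupied) >= total:
        raise RuntimeError("drum is full")
    band, loc = addr
    # Ideal location: where the single head will be after exec_time cycles.
    linear = (band * LOCS_PER_BAND + loc + exec_time) % total
    probes = 0
    while linear in occupied:        # advance until an empty location is found
        linear = (linear + 1) % total
        probes += 1
    occupied.add(linear)
    return divmod(linear, LOCS_PER_BAND), probes


drum = set()
print(place_successor(drum, (2, 13), 4))   # ideal location is free
print(place_successor(drum, (2, 13), 4))   # same request again: one probe needed
```

The probe count returned by the function is also the number of clock cycles lost when the instruction is executed, so it can be accumulated to measure the quality of the placement.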
can be easily handled even though X is a future symbol. On reading the first line
from the source, the assembler searches the symbol table and, if it does not find
X, assumes that X is a future symbol. Since this is an unconditional jump, the ADD
instruction is the successor of the LOD and thus the assembler determines its address
(x,y)—which is also the value of X—from the third column of the OpCode table.
The address of the ADD is known, even though it has not been read from the source
yet. The assembler then stores X in the symbol table, which amounts to reserving
address (x,y). Later, when the ADD is read from the source, the assembler finds X in
the symbol table. This also means that, before storing any instruction in address
(x,y), the assembler should verify that the address is not reserved for any future
instruction.
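The bookkeeping for reserved addresses can be sketched as follows. The class and method names are hypothetical, and the sketch keeps only the two tables involved (the symbol table and a table of reservations); it assumes that define is called only for labels that were previously referenced.

```python
class DrumTables:
    """Symbol table plus reservations for future symbols (a sketch)."""

    def __init__(self):
        self.symbols = {}     # label -> (band, loc)
        self.reserved = {}    # (band, loc) -> label waiting for that address

    def forward_reference(self, label, address):
        """Called when `label` is used before its instruction has been read."""
        if label not in self.symbols:
            self.symbols[label] = address
            self.reserved[address] = label
        return self.symbols[label]

    def may_store(self, address, label=None):
        """An address is usable unless it is reserved for a different label."""
        owner = self.reserved.get(address)
        return owner is None or owner == label

    def define(self, label):
        """Called when the labeled instruction is finally read from the source."""
        address = self.symbols[label]
        self.reserved.pop(address, None)
        return address


tables = DrumTables()
tables.forward_reference("X", (1, 3))
print(tables.may_store((1, 3)))        # reserved for X, so not available
print(tables.may_store((1, 3), "X"))   # available to X itself
```

Once the labeled instruction has been stored, its address is occupied rather than reserved, so occupancy has to be tracked separately.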
A conditional branch poses another problem. In a case like
BPL B
SUB
.
B ADD
the assembler uses the OpCode table to find out where to place the SUB and ADD
instructions. The relevant entry is ‘BPL 3 5’, which means that the SUB instruction
should be placed 3 locations from the BPL, and the ADD, 5 locations from the SUB.
There is no need, of course, to actually simulate the execution of the test
programs. Rather, you should gather information on the way the program is loaded.
The most important items to be evaluated are:
The number of preoccupied drum locations encountered while assembling and
loading instructions.
The total execution time lost because of instructions placed in less than ideal
drum addresses.
The same thing for operands. Note that the DC, DS directives specify constants
and arrays as part of the test program. Those should, of course, be placed on the
drum, but where? In a case such as
LOD A
.
ADD A
.
A DC 5
the assembler places A on the drum when it encounters the first usage of A (the LOD
instruction). This means that the ADD instruction will have its operand placed unfa-
vorably and, as a result, would execute slowly. When a DS directive is encountered,
the assembler should place the array wherever it finds room on the drum. This
implies that any instruction using the array as its operand would execute slowly.
Four modern assemblers are reviewed in this chapter, and their main features
described. They are all two-pass, sophisticated macro assemblers, and each is part
of a complete development system including a debugger, loader and compilers. They
are:
The Microsoft Macro Assembler (MASM) for the Intel 80x86 & 80x88 micropro-
cessors.
The Borland Turbo Assembler (TASM) for the same processors.
The VAX Macro assembler.
The MPW assembler for the Macintosh computer.
They all support many directives and advanced features. In principle they work
just like the older assemblers, but in practice they differ in a number of points, the
most notable of which are:
The way expressions are evaluated. Older assemblers normally calculate an ex-
pression such as 3 + 2 ∗ 4 strictly from left to right, and will end up with 20. The
assemblers described here use the normal operator precedence, and will produce 11
as the value of this expression.
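The difference between the two methods is easy to demonstrate. The Python sketch below handles only unsigned integers and the four operators +, -, *, / (with integer division), without parentheses; the function names are hypothetical and the sketch is not meant to reflect how any of the four assemblers is actually written.

```python
import re


def tokenize(expr):
    """Split an expression into numbers and single-character operators."""
    return re.findall(r"\d+|[-+*/]", expr)


def apply_op(op, a, b):
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == "*":
        return a * b
    return a // b


def eval_left_to_right(expr):
    """Older style: every operator is applied as soon as it is read."""
    tokens = tokenize(expr)
    value = int(tokens[0])
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        value = apply_op(op, value, int(operand))
    return value


def eval_with_precedence(expr):
    """Normal precedence: * and / bind tighter than + and -."""
    tokens = tokenize(expr)
    total, sign, term = 0, 1, int(tokens[0])
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        n = int(operand)
        if op in "*/":
            term = apply_op(op, term, n)   # extend the current term
        else:
            total += sign * term           # close the term, start a new one
            sign = 1 if op == "+" else -1
            term = n
    return total + sign * term


print(eval_left_to_right("3 + 2 * 4"), eval_with_precedence("3 + 2 * 4"))
```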
230 A Survey of Some Modern Assemblers Ch. 8
MASM is actually an assembler for the Intel 80x86 and 80x88 family of mi-
croprocessors, including the 80x87 coprocessors. Before delving into the details of
MASM, a few words about this important family are in order.
The 8086 was developed by Intel as a 16-bit microprocessor. It was very suc-
cessful and led to the development of other, 16- and 32-bit microprocessors. Because
of the huge investment in software development, all those microprocessors were de-
veloped as a family, which means that they had to be upward compatible with the
8086. The 8086 can address a megabyte of memory, and was designed for a single-
user environment (we say that it runs in real mode only). No memory protection
exists.
The 8088 differs from the 8086 in that it has an 8-bit data bus. It moves a
16-bit word in two halves, and is therefore somewhat slower than the 8086.
The 80186 has a few additional instructions and runs faster than the 8086.
There is also an 80188.
The 80286 is faster than the 80186, can address 16 megabytes (2^24), and was
designed for a multi-user environment. It is possible to load several user programs
in memory, and switch the processor between them. The 80286 supports memory
protection and privileged instructions, and can run in either real mode or protected
mode (multi-user). The older IBM PC, XT & AT computers can only run in real
Sec. 8.1 The Microsoft Macro Assembler (MASM) 231
mode. The newer PS/2 computers can run in protected mode. There are no 80288
or 80388 processors.
The 80386 has 32-bit registers, and can be used as either a 16-bit or a 32-bit
processor. It can address up to 4 gigabytes (2^32) of memory. It supports virtual
memory, multiple processes, and additional instructions to handle its new features.
It is also faster than the 80286.
The 8087, 80287 & 80387 coprocessors can all operate on floating-point numbers,
decimal numbers, and large integers. They can perform arithmetic operations, and compute
several common functions, such as sine & logarithm. The main difference between
them is that the 80287 can also function in the protected mode, and the 80387, in
addition, supports several new operations.
One important feature of all members of the family is the way they address
memory. They create 16-bit addresses (except the 80386, which creates 32-bit ones),
so they can directly address only 64Kbytes of memory. A larger memory has to be
divided into several 64Kbyte segments. In such a memory, a physical address has
the form segment:offset, where the segment part comes from one of the segment
registers and the offset part comes from the instruction. Addresses now have to be
mapped, i.e., the final (physical) address has to be computed from the two parts.
The mapping details are important to the assembly language programmer, and any
assembler for this family should support directives to handle segments. It should
be noted that segments have a lot of overlap, and are thus a source of confusion.
Ref. 57 is a good source of information on 80x86 memory segmentation.
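The mapping itself is short enough to show here. In real mode, the processor shifts the 16-bit segment value 4 bits to the left and adds the 16-bit offset, which yields a 20-bit physical address (on the 8086 the sum wraps around at one megabyte). The Python function below is only an illustration of this arithmetic; its name is hypothetical.

```python
def physical_address(segment, offset):
    """Real-mode mapping: segment * 16 + offset, kept to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


# Two different segment:offset pairs that denote the same memory location.
print(hex(physical_address(0x1234, 0x0005)))
print(hex(physical_address(0x1000, 0x2345)))
```

Since segments start every 16 bytes but are 64Kbytes long, many segment:offset pairs map to the same physical address, which is the overlap mentioned above.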
Now back to MASM. An important MASM feature is the ability to maintain
a library of pre-assembled programs. These can easily be located by the librarian,
and loaded and linked with any new program.
MASM is invoked by the MASM command, which specifies the names of all files
involved (source, object, listing, and cross reference) and can select many options.
Following is a list of some interesting or unusual options.
Many options are available for the command line, so the /H option helps the user
by displaying them all on the screen. The result of the command ‘masm /h ...’ is:
Usage: masm /options source(.asm),[out(.obj)],[list(.lst)],[cref(.crf)][;]
/a Alphabetize segments
/c Generate cross-reference
/d Generate pass 1 listing
/D<sym>[=<val>] Define symbol
/e Emulate floating point instructions and IEEE format
/I<path> Search directory for include files
/l[a] Generate listing, a-list all
/M{lxu} Preserve case of labels: l-All, x-Globals, u-Uppercase Globals
/n Suppress symbol tables in listing
/p Check for pure code
A pass-1 listing can be created by the /d option. Since pass 1 does not handle
future symbols, the listing will flag those symbols as undefined, in addition to other
pass 1 errors. Chapter 5 features a sample MASM listing with the pass-1 errors
shown.
ifdef width
page 60,130
endif
..
..
if opt LT 10
optv db opt
else
optv db 10
endif
will check symbol width and, since it is defined (on the command line), will execute
the ‘page 60,130’ directive, limiting the height of a listing page to 60 lines and its
width, to 130 characters. The second if will execute the ‘optv db opt’ directive
and skip the ‘optv db 10’. This is a typical example of late binding.
The /MU option tells MASM to convert all names read from the source file to
upper case. The /ML option means MASM is to be case sensitive (names such as
segment & Segment should be considered different). The /MX option directs MASM
to make only public and external names case sensitive.
The /V (for verbose) and /T (for terse) options control the amount of listing
produced.
The /W option restricts the assembler to report only the more serious warnings.
Warnings are issued for ambiguous, inefficient, or unclear instructions that are not
illegal and can be assembled.
The listing produced by MASM contains the source and object codes, the LC
values, and the source file line numbers. The line numbers are used by the debugger
to direct the user to offending lines. In addition, certain tables can optionally be
listed, giving information about the macros, the structures and records, and the
groups and segments used in the program. In addition to that, things such as the
symbol table, pass-1 listing, and assembly statistics, can also be listed. A special
utility, CREF, can be used to generate the cross reference information. Chapter 5
contains examples of MASM’s pass 1 & pass 2 listings, and a cross-reference.
The only thing unusual about the source line format in MASM is that only
labels of instructions need to terminate with a ‘:’. Labels of directives don’t require
a ‘:’. Lines are written in free format and may start in column 1 even if there is no
label.
MASM supports more than 50 directives. Some directive names must start
with a period, but there does not seem to be any consistency. A few of the more
interesting or unusual ones are described here.
The DF (Define Far) directive allocates 6 bytes in memory. It is normally used on
the 80386 to store a pointer.
A question mark ‘?’ is used to indicate a zero value, and the notation ‘(?)’
indicates an undefined value. Thus ‘x db ?,?,?’ allocates 3 bytes with zeros, while
‘y db 3 dup(?)’ allocates 3 undefined bytes.
The public directive declares symbols to be entry points. In most assemblers
this directive is called entry. The comm directive declares symbols to be communal
(both an external and an entry point). Such a thing is normally done when a
variable should be shared by several modules. The variable is declared communal
(typically with other such variables) in a module which then becomes an include
file. The file is included in all the other modules. An alternative is to declare such
a variable public in one source module and external in all the other ones.
The two directives if1 and if2 evaluate to true in pass 1 and pass 2, respectively
(see example in Ch. 5).
The ifdef directive evaluates to true if its operand is a defined name (if it appears
in the symbol table). A future symbol is undefined in pass 1, but there are no future
symbols in pass 2. Any undefined symbol in pass 2 is really undefined (an error).
A similar directive, .errndef, performs the same test and also generates an error
if the symbol is undefined.
The equal sign ‘=’ is used for redefinable symbols (many assemblers use SET for
the same purpose). Thus ‘x=0’ can be followed by ‘x=x+1’. The equ directive is
used, as usual, for symbols that should be uniquely defined. Note that on MASM,
such symbols can have character-string as well as numeric values.
The words short, near & far are directives but, since they are used differently
from other directives, they are called operators. An instruction using the relative
mode has an offset field for the relative address. On the 80x86 microprocessors,
such an instruction has three versions—with offsets of 1-, 2-, and 4 bytes—called
short, near and far, respectively. When such an instruction refers to a future
symbol, the assembler does not know which of the 3 versions to use, and how much
room to reserve for the instruction in pass 1. The user may help the assembler by
using one of the operators above. Thus if dest is a future symbol, the instruction
‘jmp near dest’ would be assembled with a 2-byte offset. A short means a 1-byte
offset, and a far means a 4-byte one.
When no operator is specified, the assembler reserves room (in pass 1) for a
near offset. If this turns out to be too much (only 1 byte is needed), the extra byte
is padded with a nop instruction. If, on the other hand, 2 bytes turn out not to
be enough (the destination is far), a phase error is generated by pass 2, and the
program has to be reassembled.
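The way pass 2 fills the three bytes reserved by pass 1 can be sketched as follows. The opcode values are the standard 8086 encodings (EB for a short jump, E9 for a near jump, 90 for nop); the function name and the treatment of any destination outside the 16-bit range as "far" are simplifying assumptions of the sketch, not a description of MASM's internals.

```python
NOP = 0x90
JMP_SHORT = 0xEB   # followed by a 1-byte signed offset
JMP_NEAR = 0xE9    # followed by a 2-byte signed offset (16-bit code)


def fill_reserved_jump(jump_lc, target_lc):
    """Return the 3 bytes placed in the room reserved for a jmp in pass 1.

    Offsets are relative to the address of the next instruction.
    """
    short_disp = target_lc - (jump_lc + 2)     # short jmp is 2 bytes long
    if -128 <= short_disp <= 127:
        # One byte too many was reserved: pad with a nop.
        return bytes([JMP_SHORT, short_disp & 0xFF, NOP])
    near_disp = target_lc - (jump_lc + 3)      # near jmp is 3 bytes long
    if -32768 <= near_disp <= 32767:
        return bytes([JMP_NEAR, near_disp & 0xFF, (near_disp >> 8) & 0xFF])
    raise ValueError("phase error: destination requires a far jump")


print(fill_reserved_jump(0x100, 0x110).hex())
print(fill_reserved_jump(0x100, 0x300).hex())
```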
MASM can handle expressions with many operators. Operands must be ab-
solute, except that the binary ‘+’ operator can operate on one absolute and one
relative operand, and the binary ‘-’ operator can, in addition to that, be used to
subtract two relative operands (but only if they are located in the same segment).
Expressions can be shifted by the shl & shr operators. They are not the
80x86 shift instructions (even though they have the same name), and are executed
at assembly time. A typical example is ‘mov ax,01010001b shl 3’ which moves
the binary number ‘01010001000’ into register ax.
The not, and, or & xor logical operators can be used in expressions, and are
thus executed at assembly time. Again, they should not be confused with the 80x86
instructions (which happen to have the same names).
Relational operators are allowed. The instruction ‘mov ax,2 eq 4’ moves the
value zero (false) to ax (a value of −1 corresponds to true).
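Both examples can be reproduced with ordinary integer arithmetic, which is all the assembler does when it evaluates such operators. In the Python sketch below, the function names and the 16-bit width (chosen because the destination is ax) are assumptions.

```python
def asm_shl(value, count, bits=16):
    """Assembly-time shift left, truncated to the operand width."""
    return (value << count) & ((1 << bits) - 1)


def asm_eq(a, b):
    """Assembly-time relational operator: -1 for true, 0 for false."""
    return -1 if a == b else 0


print(bin(asm_shl(0b01010001, 3)))   # value of '01010001b shl 3'
print(asm_eq(2, 4))                  # value of '2 eq 4'
print(hex(asm_eq(4, 4) & 0xFFFF))    # true, as a 16-bit pattern
```

The value −1 is convenient for true because its 16-bit pattern consists of all 1's, so it can be combined with the logical operators.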
A ‘$’ is used for the LC value. Thus the following is very common:
and equates the value of labelB with the length of the string.
The normal operator precedence is used, and parentheses can be employed
to modify it. There is also strong typing for operands. The seemingly innocent
instruction ‘mov ax,sors[1]’ can cause a warning if symbol sors is defined as a
byte (‘sors db 123’). This is because ax is a register, and thus of type word.
The instruction will be assembled and will move the two bytes starting at sors to
register ax. This may or may not be what the user wants, and it is an example of
a confusing instruction.
8.1.7 MASM macros
As can be expected, MASM supports an extensive macro facility. Macros can
have many parameters (as many as will fit on one line), both macro expansions
and macro definitions can be nested, and an extensive set of conditional assembly
directives permits recursive macros. Macros can be redefined and even deleted
(with the purge directive) from the MDT, to make room for new definitions. An
interesting point is that a macro can purge itself, but the purge command must be
the last line of the macro.
In addition to macros, powerful directives such as irp can be used to create
copies of blocks of code (repeat blocks) that depend on parameters. A common
example is:
TASM is invoked with the TASM command where the user can specify certain
options. Among them is the maximum number of passes. Thus the command
‘TASM /m1 test’ invokes the assembler, limits the number of passes to 1, and supplies
the name of the input file ‘test.asm’. This makes sense if fast assembly is
important, and the user knows that there will be no complex forward references.
This makes it somewhat easier to write the program, but makes it harder for the
assembler to recognize labels.
No special syntax is used on the source lines for the immediate mode. The int
instruction (which causes an artificial interrupt) does not use the immediate mode,
so ‘int 21h’ causes an interrupt to address 21 (hexadecimal). The mov instruction, on the other
hand, uses the immediate mode, so ‘mov cx,100’ moves the decimal constant 100
to register cx. Because of the way memory is organized into segments, there is no
absolute mode on the 80x86 microprocessors. It is possible to reach any absolute
address by selecting the segment where the address is located.
Expressions. TASM supports expressions with nested parentheses, and logical and
relational operators. There is operator precedence, and there are many operators.
Extended call. TASM has been written by Borland International, a company
known for its Turbo compilers. As a result, the TASM call instruction has been
extended to allow easy interface with Turbo Pascal and Turbo C. A call instruction
can specify any of these languages, and arguments will be passed in a consistent
manner.
Extended push and pop. These instructions have been extended to allow for more
than one operand. Thus things such as ‘push cx dx’, ‘pop dx cx’ will each handle
two registers.
Predefined variables. The following names can be used to get useful informa-
tion about the current assembly run. ??date, ??time, ??filename, ??version.
They are self-explanatory.
Sec. 8.2 The Borland Turbo Assembler (TASM) 237
The forward referencing problem in TASM is general and not limited to con-
ditional jumps. The unconditional jmp instruction, e.g., has several varieties. The
‘jmp short’ with a 1-byte opcode and a 1-byte offset; the ‘jmp near’ with a 1-byte
opcode and a 2-byte address, and the ‘jmp far’ with a 1-byte opcode and a 4-byte
offset. If TASM is limited to one or two passes, then it reserves three bytes for each
jmp. Eventually these bytes are either filled with a ‘jmp near’ or with a ‘jmp short’
followed by a nop. If it turns out that a ‘jmp far’ is needed, an error results and
reassembly is necessary.
The user can help TASM create better code if he can estimate the distances
involved. The user can always write ‘jmp short x’ or ‘jmp far y’, which will
create the specified version of jmp.
A similar problem exists with other instructions. The mov instruction, e.g.,
has a 4-byte version that can hold a full address, and a 2-byte version that can
hold a small constant. When TASM reads a source line such as ‘mov bl,abc’, and
finds that abc hasn’t been defined yet, it assembles the mov as a 4-byte instruction,
assuming that abc will turn out to be a normal label. If abc turns out to be a
constant (e.g., ‘abc equ 5’), then the short version of mov is created, followed by
two nop instructions.
8.2.5 Conclusions:
The programmer should develop a style that minimizes forward references. Ar-
rays and other variables should be declared at the start of the program. Also,
subprograms should precede the main program.
While the program is in the debugging stages, the programmer should allow for
one or two passes, and be prepared for less-than-optimum code. When
the program is deemed clean of bugs, it should be assembled one more time, allowing
for more passes. The resulting code should be optimum.
Most directives are not explicitly identified as such to TASM, but some have to
start with a period. Here are some interesting and unusual directives and notation
supported by TASM:
The union directive allows two variables to share the same memory area. It is
similar to the union statement in the C programming language. The first step is
to define a data type such as debit by:
debit union
small db ?
large dw ?
debit ends
The next step is to declare an actual variable of type debit by, e.g.:
‘joe@debt debit <?,?>’. When this is done, ‘joe@debt’ becomes a word in memory
shared by the two variables ‘joe@debt.small’ (a byte) and ‘joe@debt.large’
(a word).
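The effect of the sharing can be illustrated outside the assembler. The Python sketch below models the shared word as a two-byte area with a byte field small and a word field large laid over it. The class name Debit and the use of little-endian byte order (that of the 80x86) are assumptions of the sketch.

```python
import struct


class Debit:
    """Model of a union with a byte field `small` and a word field `large`."""

    def __init__(self):
        self._mem = bytearray(2)      # the shared memory area, one word

    @property
    def small(self):
        return self._mem[0]           # the byte overlays the low half of the word

    @small.setter
    def small(self, value):
        self._mem[0] = value & 0xFF

    @property
    def large(self):
        return struct.unpack_from("<H", self._mem)[0]

    @large.setter
    def large(self, value):
        struct.pack_into("<H", self._mem, 0, value & 0xFFFF)


d = Debit()
d.large = 0x1234
print(hex(d.small))    # storing the word also changed the byte
```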
The ‘?’ symbol indicates an uninitialized memory location. Thus ‘abc dw 10’
preloads the word at address abc with the constant 10, while ‘xyz dw ?’ just re-
serves one word at xyz. An example is the directive ‘ghk dw 20 dup (?)’ which
duplicates ‘dw (?)’ 20 times, and results in an array ghk of 20 uninitialized words.
The align directive advances the LC to a multiple of the specified power of 2. Thus
‘align 8’ will advance the LC to the next multiple of 8 (leaving it unchanged if it
already is one), and will insert as many nop instructions as necessary to pad the
empty bytes created.
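The arithmetic behind align is a standard rounding trick for powers of 2. A minimal Python sketch (the function name is hypothetical) that returns both the new LC and the number of padding bytes:

```python
def align(lc, boundary):
    """Round `lc` up to a multiple of `boundary`, which must be a power of 2.

    Returns (new_lc, padding), where padding is the number of nop bytes
    needed to fill the gap.
    """
    if boundary <= 0 or boundary & (boundary - 1):
        raise ValueError("boundary must be a power of 2")
    new_lc = (lc + boundary - 1) & ~(boundary - 1)
    return new_lc, new_lc - lc


print(align(13, 8))
print(align(16, 8))
```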
The command line options: The assembler is invoked by the macro command.
Many options may be specified on the line, the most interesting (not necessarily
important) of which are:
addrsize size. This sets the addresses displayed in the listing file to size digits
(typically 4 or 5).
check. No object file is generated. The assembler does a syntax check only.
font fontname, fontsize. Sets the font used in the listing file to the one specified.
print noobj. Creates a short listing (which does not include the object code).
Sec. 8.3 The VAX Macro Assembler 241
They are supported and have the format ‘nn$’ where ‘nn’ is a 16-bit number.
Local labels are only valid in their block, and the block can be terminated by:
A user-defined label.
The .psect directive. This directive is used to divide the program up into blocks
with different access codes that may be loaded into separate memory areas.
The .enable and .disable directives. They are used, among other things, to define
local blocks overriding user-defined labels and .psect directives.
All directives start with a period. Macro supports more than 80 directives!
They are classified into 19 categories, eight of which are directives for handling
macros. Some interesting or unusual directives are:
.debug—Any symbols declared with this directive are made known to the VAX
debugger. In an interactive session, the user can refer to those symbols by name,
in order to get their current values from the debugger.
.mdelete—Deletes a macro definition from the MDT, freeing memory for new
definitions.
8.3.5 Macros
The VAX Macro assembler supports an extensive macro facility. Macros can
have local labels, and can use both positional and keyword arguments. There are
extensive string-manipulation functions that can handle string arguments. Nested
macros are allowed, and the following short example is a useful application of nested
macro definition:
.macro for reg,from,to,?lab
        movl    #'from,r'reg
lab:
        .macro endfor
        incl    r'reg
        cmpl    r'reg,#'to
        blss    lab
        .endm endfor
.endm for
It can be used to implement the high-level construct for in assembler. Macro for
generates a mov instruction, declares label lab, and then defines macro endfor.
That macro increments the loop counter, compares it to the final value and, if
necessary, branches back to the same label. Note the notation ?lab which means
that lab is a local label.
The VAX computer has 24 addressing modes, and Macro uses a special no-
tation to select them. Table 8–1 is a summary of all the VAX addressing modes.
Of special interest is the assembler syntax used. It ranges from simple (the regis-
ter mode and relative mode have the simplest syntax) to complex (the longword
displacement deferred mode, with the syntax @L^dis(Rn) is perhaps the most com-
plex).
Mode  Syntax     Name                            Effective Address        Index base?
General register addressing
0–3   S^#N       Short literal                   None. Operand is N       N
4     b[Rn]      Index                           c(b+s·Rn)                N
5     Rn         Register                        None. Operand is in Rn   N
6     (Rn)       Register deferred               c(Rn)                    Y
7     -(Rn)      Autodecrement                   Rn←Rn-s, EA=c(Rn)        ?
8     (Rn)+      Autoincrement                   EA=c(Rn), Rn←Rn+s        ?
9     @(Rn)+     Autoincrement deferred          EA=c(c(Rn)), Rn←Rn+4     ?
A     B^D(Rn)    Byte displacement               c(Rn+D)                  Y
B     @B^D(Rn)   Byte displacement deferred      c(c(Rn+D))               Y
C     W^D(Rn)    Word displacement               c(Rn+D)                  Y
D     @W^D(Rn)   Word displacement deferred      c(c(Rn+D))               Y
E     L^D(Rn)    Longword displacement           c(Rn+D)                  Y
F     @L^D(Rn)   Longword displacement deferred  c(c(Rn+D))               Y
Program counter addressing
8     I^#N       Immediate                       None. Operand is N       Y
9     @#A        Absolute                        A                        Y
A     B^A        Byte relative                   A=c(PC+D)                Y
B     @B^A       Byte relative deferred          c(A)=c(c(PC+D))          Y
C     W^A        Word relative                   A=c(PC+D)                Y
D     @W^A       Word relative deferred          c(A)=c(c(PC+D))          Y
E     L^A        Longword relative               A=c(PC+D)                Y
F     @L^A       Longword relative deferred      c(A)=c(c(PC+D))          Y
Notes:
The operand is the contents of the EA, except where there is no EA
N Absolute (literal number)
c(...) “contents of mem loc. ...”
b Index mode base address
s operand size, dependent on operation context:
s=1,2,4,8,16 according to whether the operation is on a byte, word, ..., octaword
? Yes, provided that Rn is not the index register
D Displacement. If byte or word, sign is extended to a longword
A Memory address
8.4.1 Modules
Pass 1 creates three symbol tables: A global one—for symbols defined outside
any code or data modules; a macro symbol table—for macro symbols and macro
definitions (this is the MDT); and a local symbol table—for symbols defined inside
code or data modules.
The program is divided into modules that are assembled separately and ap-
pended to the object file. Each module is a contiguous piece of instructions or data
and, ideally, should be small enough to fit entirely in memory. Labels declared in
the module can either be local (to the module) or global. They can also be exported
to other object files (meaning they can be external) or imported from other object
files. The words IMPORT, EXPORT are directives.
Pass 1 translates a module, as it is being read, into postfix notation, and stores
this in memory, with its local symbol table. (If the postfix module is too big, part
of it is written on a disk, but this increases the assembly time considerably.) Pass 2
assembles the postfix from memory and appends the results to the object file. The
local symbol table is then erased.
The linker can read several object files, each with several modules. It groups all
the data modules together and all the code modules together. They end up being
loaded in memory as two separate units.
A module must start with one of the directives PROC, FUNC, MAIN or RECORD. It
must end with one of ENDP, ENDF, ENDM or ENDR.
Local labels are called @-labels. They must start with an ‘@’.
A listing file can be created and, if the user selects this option, the assembler
uses a temporary file, the scratch file, to help create the listing.
During assembly, warnings and error messages appear in an active window,
called the diagnostic output window. Optionally, a progress report can also be
displayed. The diagnostic output window can later be saved to a file if necessary.
Source files have a suffix of .a, object files, a suffix of .o, and listing files, a
suffix of .lst.
8.4.3 Segments
There is also the concept of segments. They are different from segments on the
80x86 microprocessors. A 680x0 program can be divided into segments that share
the same memory area. This way a large program, exceeding the entire memory
available, can run one segment at a time. The SEG directive is used to define
segments.
Source line format. A label must either start at position 1, or be terminated
with a ‘:’. A statement without a label must start at position 2 or later. Comments
are preceded by a ‘;’. Fields are separated by a space or tab. Many mnemonics
must specify the data type of the operands. Valid types are:
Specifier  Type              Size
B          Byte              8 bits
W          Word              16 bits
L          Long word         32 bits
S          Short             8-bit signed offset (in the range −128 . . . +127)
D          Double long word  64 bits. For use with 68851 only
S          Single precision  32-bit IEEE floating-point format. For 68881 only
D          Double precision  64-bit as above
X          Extended          96-bit as above
P          Packed BCD        96-bit packed characters for f.p. strings. For 68881 only
8.4.5 Directives
The MACHINE directive must come with one of the following operands: MC68000,
MC68010, MC68020, MC68030. It tells the assembler which specific 680x0
microprocessor is used.
The STRING directive should have one of the operands ASIS, PASCAL, C. It tells
the assembler how to prepare character strings.
The BRANCH & FORWARD directives tell the assembler what size to reserve for the
displacements of certain instructions if they refer to a future symbol. The BRANCH
directive relates to branch instructions. Its operand can be one of S, B, W, L. The
FORWARD directive relates to instructions in modes 6 and 7. Its operand can be one
of W, L.
There is an OPT directive to tell the assembler what level of optimization is
desired. The operand is one of ALL, NONE, NOCLR. An example of optimization is
the ‘SUBA An,An’ instruction. It clears register An, but a ‘MOV #0,An’ instruction is
faster, so the assembler substitutes it if optimization is requested. On certain models
of 680x0, ‘CLR An’ is even better but, on other models, it does not produce
identical results. The NOCLR parameter means to optimize but not to substitute
‘CLR ...’ for ‘MOV #0,...’.
1. Barron, D. W., Assemblers and Loaders, 3rd ed., New York, N.Y.: American
Elsevier 1968.
2. Kent, W., Assembler Language Macroprogramming, ACM Computing Surveys
1,4(Dec. 1969) 183–196.
3. Presser, L., and J. R. White, Linkers and Loaders, ACM Computing Surveys
4,3(Sep. 1972) 149–167.
4. Wilkes, M. V., D. J. Wheeler, and S. Gill, The Preparation of Programs for an
Electronic Digital Computer. Reading, MA.: Addison-Wesley, 1951.
5. Z80ASM Ver. 1.05 from SLR Systems, Butler, PA. 1984.
6. ASMZ80 Ver 3.6C from Relational Memory Systems, San Jose, CA. 1984.
7. MOPI Ver.2.0 from Voice Operated Computer Systems, Minneapolis, Minn. 1984.
8. Wilkes, M. V., The EDSAC, MTAC 4,(1950) p. 61. Also reprinted in Randell,
B., The Origins of Digital Computers, Springer Verlag, Berlin, 1982.
9. Melcher, W. P., SHARE Assembler UASAP 3-7. SHARE distribution 564, 1958.
10. Goldfinger R., The IBM Type 705 Autocoder. Proc. East Joint Comp. Conf.,
San Francisco, 1956.
11. Knuth, D. E., Von Neumann’s First Computer Program, Computing Surveys
2,4(Dec. 1970) 247–260.
12. IBM 7040 & 7044 Data Processing Systems, Student Text. IBM Form No. C22-
6732.
250 References
13. Signetics 2650 Microprocessor Manual. Sunnyvale, CA.: Signetics Corp., 1977.
14. Grishman, Ralph, Assembly Language Programming for the Control Data 6000
and the Cyber 70 Series. New York, NY.: Algorithmics Press, 1974.
15. Langsam Y., M. J. Augenstein and A. M. Tenenbaum, Data Structures for
Personal Computers, Englewood Cliffs, NJ.: Prentice-Hall, 1985.
16. Morris, R., Scatter Storage Techniques, Comm. ACM 11,(1)p. 38, (Jan. 1968).
17. Hopgood, F. R. A., A solution to the Table Overflow Problem for Hash Tables,
Comp. Bull. 11, p. 297 (1968).
18. Aho, A. V., J. E. Hopcroft and J. D. Ullman, Data Structures and Algorithms,
Reading, Mass.: Addison-Wesley, 1983.
19. Brown, P. J., A Survey of Macro Processors, Ann. Rev. in Aut. Prog. Vol. 6.
Oxford & New York, NY.: Pergamon Press, 1966, pp. 37–88.
20. Campbell-Kelly, M., An Introduction to Macros, New York, NY.: American
Elsevier, 1973.
21. Cole, A. J., Macro Processors, Cambridge: Cambridge University Press, 1976.
22. McIlroy, M. D., Macro Instruction Extensions of Compiler Languages, in
Comm. ACM 3,(4), p. 214 (1960).
23. Graham, M. L., P. Z. Ingerman, An Assembly Language for Reprogramming,
Comm. ACM 8,(12) p. 769, (1965).
24. Ferguson, D. E., The Evolution of the Meta-Assembly Program, Comm. ACM
9, p. 190 (1966).
25. Freeman, D.N., Macro Language Design for System/360, IBM Sys. J. 5, (1966)6–
77.
26. IBM System/360 Operating System Assembler Language, IBM Form No. GC28-
6514.
27. IBM System/360 OS/VS and DOS/VS Assembler Language, IBM Form No.
GC33-4010.
28. Wegner, P., Programming Languages, Information Structures, and Machine Or-
ganization. New York, NY.: McGraw-Hill 1968.
29. Patterson, D. A., Reduced Instruction Set Computers, Comm. ACM 28(1)8–20,
Jan. 1985.
30. CDC COMPASS Version 3 Reference Manual, #60492600.
31. SUN Microsystems Assembler Language Reference Manual, part #800-1179.
32. PDP-11 Macro-11 Language Reference Manual, Order #AA-5075B-TC.
33. Gorsline, G. W., 16-Bit Modern Microcomputers, Englewood Cliffs, NJ.:
Prentice-Hall 1985.
34. An Introduction to ASM 86, Intel Corp., Order #121689, 1981.
35. ASM 86 Language Reference Manual, Intel Corp., Order #121703.
36. Knuth, D. E., The TeXbook, Reading, MA.: Addison-Wesley, 1984.
37. IBM PC Macro Assembler, IBM #6172234.
38. Rector, R., G. Alexy, The 8086 Book, Berkeley, CA.: Osborne/McGraw-Hill,
1980.
39. NOVA Computer Assembler Manual, Data General Corp. #093-000017.
40. Apple II Reference Manual, Apple Corp. Product #A2L0001A.
41. Donovan, J. and S. Madnick, Operating Systems, New York, NY.: McGraw-Hill,
1974.
42. The Random House Dictionary of the English Language, New York, NY.: Ran-
dom House, 1970.
43. Ralston, A. (ed.), Encyclopedia of Computer Science & Engineering, Van Nos-
trand, 1985.
44. Barron, D. W., Assemblers, ibid, pp. 124–132.
45. Brown P. J., Macroinstructions, ibid pp. 904–906.
46. Barron, D. W., Loaders, ibid pp. 874–876. Linkage Editors, pp. 851–852.
47. Conway, M. E., UNISAP, Symbolic Assembly Program for UNIVAC I and
UNIVAC II, The Computer Center, Case Institute of Technology, Cleveland, 1958.
48. Skordalakis, E., Meta-Assemblers, IEEE Micro, 3(2)6–16 (April 1983).
49. CDC 3170/3300/3500 Computer Systems, META/MASTER Reference Manual,
Pub. No. 60236400, Control Data Corp 1968.
50. Yackle, B. E., An Assembler For All Microprocessors, Hewlett-Packard J.,
Oct. 1980, pp. 28–30.
51. Heath, J. R. and S. M. Patel, How To Write a Universal Cross-Assembler, IEEE
Micro, 1(3)45–66 (Aug. 1981).
52. Mezzalana, M., et. al., DEFASM: A Microprogram Meta-Assembler With Se-
mantic Capability, IEEE Proc. 128E(4)133–142 (1981).
53. Habib, S., and X. L. Yang, The Use of a Meta-Assembler to Design an M-code
Interpreter on AMD2900 chips, ACM SigMicro Newsletter, 12(4)38–50, 1981.
54. Gill, C. F., and M. T. Holden, On the Evolution of an Adaptive Support System,
AIAA/ NASA/ IEEE/ ACM Comp. in Aerospace Conf., Los Angeles, 1977 (AIAA
Paper No. 77-1420).
55. Holden, M. T., The B-1 Support Software System for Development and Mainte-
nance of Operational Flight Software, Proc. NAECON 76 Conf., 1976, pp. 250–262.
56. Boehm, E. M., and T. B. Steel, The SHARE 709 System, Machine Implemen-
tation and Symbolic Programming, J. ACM 6,2,134–140 (Apr. 1959).
57. Norton, Peter & John Socha, Peter Norton’s Assembly Language Book for the
IBM PC, New York, NY.: Brady, 1986.
58. Mealy, G. H., A Generalized Assembly System, in Rosen S., Programming Sys-
tems and Languages, New York, NY.: McGraw-Hill, 1969, pp. 535–559.
59. McCarthy, J. et. al., The Linking Segment Subprogram Language and Linking
Loader, ibid pp. 572–581.
A.1 Introduction
One of the main tasks in designing a new computer is to design its instruction
set. Machine language is often called low level which means, among other things,
that every hardware feature should have a machine instruction to control or to use
it. The instruction set of a computer thus reflects its architecture. This fact has long
been recognized and even serves as the basis for defining the concept of computer
architecture [29]. Designing the instruction set of a new computer is, therefore, an
important task involving many considerations. One important design criterion is
that instructions should be short. This is important because instructions have to
be fetched from memory before they can be executed. A long instruction may have
to be stored in two or even three words in memory, and thus takes two or three
times longer to fetch than a short instruction.
The size of an instruction is determined by the size of the individual fields that
make up the instruction. Of those fields, the operand field is by far the largest. To
get an idea of the sizes involved, let’s look at typical fields in a simple machine in-
struction. The OpCode is typically 7–8 bits long, allowing for 128–256 instructions.
In modern computers, the OpCode has variable size, averaging 6–8 bits. A register
field is typically 3–6 bits long, reflecting the fact that most computers have between
8 and 64 registers. (Some computers have more registers, but an instruction on
such a computer can only access a register from a certain group.) In contrast to
those short fields, the ‘operand’ field should contain an address and thus could be
quite long.
Exercise A.1 How can one minimize the size of the OpCode field?
Up until the mid 1970s, memories were expensive, and computers supported
small memories. A typical memory size in a second generation computer was
16k–32k words, and in an early third generation one, 32k–64k words. Today,
however, with much lower hardware prices, modern computers can access much
larger memories. Most of the early microprocessors could address 64k words and
most of today's microprocessors can address between 1M and 32M words (normally
8- or 16-bit words). As a result, those computers must handle long addresses.
Since 1M (1 mega) is defined as $1024\text{k} = 1024 \times 2^{10} = 2^{20}$, an
address in a 1M memory is 20 bits long. In a 32M memory, an address is 25 bits
long, since $32\text{M} = 2^{5} \times 2^{20} = 2^{25}$. The 68000 microprocessor
[68] generates 24-bit addresses. The 80386 microprocessor [69] generates 32-bit
addresses, and can thus physically address 4 gigabytes (its virtual address space
is 64 terabytes, $= 2^{46}$ bytes). Computers that are on the drawing boards right
now could require much longer addresses.
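The relation between memory size and address length used above can be checked
with a few lines of Python. This is only a sketch; the function name address_bits
is an illustrative assumption, not something defined in the text.

```python
def address_bits(memory_words: int) -> int:
    """Return the number of bits needed to address `memory_words` locations."""
    if memory_words < 1:
        raise ValueError("memory must contain at least one word")
    # The highest address is memory_words - 1; its bit length is the answer.
    return max(1, (memory_words - 1).bit_length())


K = 1024
M = 1024 * K

for label, size in [("64k", 64 * K), ("1M", M), ("32M", 32 * M), ("4G", 4 * K * M)]:
    print(f"{label:>4}: {address_bits(size)} bits")
```

The loop reports 16, 20, 25 and 32 bits, matching the sizes quoted in the
paragraph.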
Let’s therefore assume a range of 20–24 bits for a typical address size, which
results in the following representative instruction formats:
The operand field takes up about 60–70% of the total instruction size, and is thus
the main contributor to long instructions. A good instruction design should result
in much shorter operands, and this is achieved by the use of addressing modes.
An addressing mode is a function, or a rule of calculation, used to calculate the
address. In an instruction set that uses addressing modes, an instruction normally
does not contain the full address (which we will call the effective address or EA),
but rather a short number, called a displacement or an offset, and the mode. The
mode field is 3–4 bits long, and the result is the following typical format:
The operand (the mode and displacement fields) is now 11–14 bits long. It is still
the largest field in the instruction, making up 50–55% of the instruction size, but
is considerably shorter. Many instructions do not need an address operand and,
therefore, do not use a mode either. Adding a mode field, therefore, does not
increase the size of those instructions.
There is, of course, a trade-off. If an instruction uses a mode, the EA has
to be calculated before the instruction can be executed, which takes time. This
calculation is done by the hardware, though, so it is fast.
Since the mode is a function, it is possible to write
$EA = f_m(\text{displacement}, \ldots)$, which implies that the EA is calculated
by applying a function $f$ to the displacement (and to other arguments), and that
the function itself depends on the mode $m$. So for different modes we have
different functions, different ways of calculating the EA. We will see that those
functions depend on things like the PC, index registers, and the contents of
memory locations. The notation $EA = f_m(\text{displacement}, \ldots)$
therefore illustrates the nature of addressing modes.
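The idea of a mode as a function can be sketched in Python as a table that maps
each mode to its own rule of calculation. Everything here is an assumption made
for illustration: the Machine class, the mode names used as keys, and the choice
of a single index register are not taken from any particular computer.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    pc: int = 0                                  # address of the next instruction
    index: int = 0                               # a single index register (assumed)
    memory: dict = field(default_factory=dict)   # sparse memory: address -> word


# One function per mode; each takes the displacement and the machine state.
EA_FUNCTIONS = {
    "direct":   lambda disp, m: disp,             # EA = disp
    "relative": lambda disp, m: disp + m.pc,      # EA = disp + PC
    "indexed":  lambda disp, m: disp + m.index,   # EA = disp + index register
    "indirect": lambda disp, m: m.memory[disp],   # EA = Mem(disp)
}


def effective_address(mode: str, disp: int, machine: Machine) -> int:
    """Apply the function selected by `mode` to the displacement."""
    return EA_FUNCTIONS[mode](disp, machine)


m = Machine(pc=58, index=5, memory={124: 11387})
print(effective_address("direct", 19, m))      # 19
print(effective_address("relative", -39, m))   # 19
print(effective_address("indexed", 100, m))    # 105
print(effective_address("indirect", 124, m))   # 11387
```

The immediate mode is left out on purpose, since it supplies a constant rather
than an address.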
Before looking at specific modes, it is important to mention the second property
of addressing modes. They serve to make the machine instructions more powerful.
We will see examples where a single instruction, using a powerful mode, can do the
work of several instructions.
discussion of less important ones.
The five main addressing modes, found on all modern computers and many old
ones, are: direct, relative, immediate, index, and indirect.
When an instruction uses a small address, an address that fits in the displace-
ment field, there is no need for any calculations and the EA is simply the displace-
ment. This is the direct mode. It is a simple mode and, in the form described here,
used only by absolute assemblers. A typical example is:
LC                Obj. Code
 0      .
        .
19  A   ..
        .
        .         Opc m dis
xx      MUL A     xxx 0 19
Since the value of A is a small number, the assembler uses the direct mode
for the MUL instruction. It should again be emphasized that the assembler does not
need to know what the MUL instruction is, how it works, or even the precise meaning
of the direct mode. It follows a simple rule that says: if the value of the symbol
does not exceed the maximum value of the displacement field, put the symbol in the
displacement field and set the mode field to 0 (or whatever the code for direct mode is). This,
however, implies that the instruction cannot be relocated by the loader. Assuming
that the loader decides to load the program starting at address 1000, it would want
to relocate the displacement field to 1019, which may be too large. This mode is
thus not useful in a relocatable assembler but, as we will see later, certain versions
of it are.
Exercise A.2 What if A is a future symbol? Does that generate a problem for the
assembler?
Sec. A.2 Examples of Modes 257
Here, the displacement field is called offset, and is set to the distance between
the instruction and its operand. This is a useful mode that is sometimes automati-
cally selected by the assembler, and not explicitly specified by the programmer. It
also results in an instruction that does not need relocation.
Using the concept of a function mentioned earlier, we can write the definition
of this mode as EA=offset+PC. This means that, at run time, before the instruc-
tion can be executed, the hardware has to add the offset and the PC to obtain the
effective address. The hardware can easily do that, but how does the assembler
figure out the offset in the first place? The expression above can be written as
‘disp=EA-PC’ and, since the PC always points to the next instruction,
‘disp=EA-(LC+size of current instr.)=EA-LC-size of current instr.’
In the second pass, the LC contains the address of the current instruction, so
‘LC+size’ equals the address of the next one. Since the EA is simply the value
of the symbol used by the instruction, the assembler can calculate the offset and, if
it fits in the displacement field, store it there and use the relative mode. Example:
LC                Obj. Code
 0      .
        .
19  A   --
        .
        .         Opc m dis
57      JMP A     xxx 1 -39    (= 19 - 57 - 1)
Note that the offset is negative. A little thinking shows that this is always the case
if the operand precedes the instruction. Thus, in this mode, the offset is a signed
number. Since the sign uses one of the offset bits, the maximum value of a signed
offset is only half that of an unsigned one. An unsigned 8-bit offset, for instance,
has the range 0 . . . 255, and a signed one, the range −128 . . . + 127. The ranges are
the same size, but the maximum values aren’t.
The example above also shows the absolute aspect of this mode. The offset is
essentially the distance between the operand and the instruction, and this distance
does not depend on the start address of the program. We say that this mode
generates position independent code, and an instruction using it should always have
a relocation bit of 0.
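The offset calculation done by the assembler in the second pass can be sketched
as follows. The function name, the default instruction size of one word, and the
8-bit displacement field are assumptions chosen only to reproduce the example
above.

```python
def relative_offset(ea, lc, size=1, bits=8):
    """Return the signed offset EA - (LC + size), or None if it does not fit.

    ea   -- value of the symbol used by the instruction
    lc   -- location counter (address of the current instruction)
    size -- size of the current instruction, in words (assumed 1)
    bits -- width of the displacement field (assumed 8)
    """
    disp = ea - (lc + size)        # the PC points to the next instruction
    lowest = -(1 << (bits - 1))    # e.g. -128 for 8 bits
    highest = (1 << (bits - 1)) - 1   # e.g. +127 for 8 bits
    if lowest <= disp <= highest:
        return disp
    return None                    # relative mode cannot be used


print(relative_offset(19, 57))              # -39, as in the example
print(relative_offset(1234, 57))            # None: 1176 does not fit in 8 bits
print(relative_offset(1234, 57, bits=12))   # 1176
```

When the function returns None, the assembler has to fall back on some other
mode.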
This mode is used when the instruction needs the value of a constant, not
the contents of a memory location. An ADD instruction is sometimes written as
‘ADD R3,XY’ and adds the contents of location XY to register 3. If, however, the
programmer wants to add 67 to register 3, the same instruction can be used in the
immediate mode by saying ‘ADD R3,#67’. The number sign ‘#’ is normally used
to indicate the immediate mode. Assuming that the code of this mode is 2, the
Exercise A.3 What instruction can be used instead of LOD R5,M above, to load
the value of ARY?
This mode can also be used a little differently, as in ‘LOD R1,ARY(R5)’ where
ARY is the start address of the array and R5 is initialized to 0. In this case, the index
register really contains the index of the current array element used. This form can
be used if the value of ARY fits in the displacement field.
On the Intel 8086, certain instructions can use two index registers. Thus
‘mov ah,ary[bx][di]’ is valid.
This is a more complex mode, requiring the hardware to work harder in order
to calculate the EA. The rule of calculation is: EA=Mem(disp.) meaning, the
hardware should fetch the contents of the memory location whose address is ‘disp.’,
and that contents is the EA. Obviously this mode is slower than the former ones
since it involves a memory read. The EA has to be read from memory before
execution can start. A simple example is:
LC                   Obj. Code
        .            Opc m disp
 24     JMP @TO      xxx 3 124
        .
        .
124 TO  DC 11387
        .
The EA is 11387 and could, in principle, be any address. Notice that the DC
directive itself generates the constant 11387 on the object file with a relocation bit
of 0. However, a directive of the form ‘TO: DC AB’ would generate the value of AB
on the object file as a relocatable quantity (assuming, of course, that AB is a relative
symbol). The value of TO is called the indirect address and, in the simplest version
of the indirect mode, it has to be small enough to fit in the displacement field. This
is one reason why several versions of this mode exist (see below).
What is this mode good for? As this is a discussion of assemblers, and not of
assembly language programming, a complete answer cannot be given here. We will
just show a typical use of this mode, namely a return from a procedure. When a
procedure is called, the return address has to be saved. Most computers save the
return address in the stack, in one of the registers, or in the first word of the
procedure (in which case the first executable instruction of the procedure should
be stored in the second word). If the last method is used, a return from the
procedure is a jump to the memory location whose address is contained in the first
word of the procedure. This is therefore a classical, simple example of the use of
the indirect mode. If the procedure name is P, then an instruction such as ‘JMP @P’
(where ‘@’ specifies the indirect mode) can easily accomplish the job.
Incidentally, if the return address is saved in the stack, a special instruction,
such as RET, is necessary to return from the procedure. Such an instruction should
jump to the memory location whose address is contained in the top of the stack, and
also remove that address from the stack. Thus a RET instruction uses a combination
of the indirect and stack modes.
Other common extensions of the indirect mode combine it with either the
relative or the index modes. The JMP instruction above could be assembled as
‘xxx 3 99’ since the value of TO relative to the JMP instruction is 124 − 24 − 1 = 99.
This discussion assumes that the size of the JMP instruction is one word and that
mode 3 is the combination indirect-relative. A combination indirect-index can also
be used and, in fact, the 6502 microprocessor uses two such combinations, a pre-
indexed indirect (that can only be used with index register X), and a post-indexed
one (that can only be used with index register Y). Their definitions are:
Pre-indexed indirect: EA=Mem(disp+X)
Post-indexed indirect: EA=Mem(disp)+Y
In the former version, the indexing (disp+X) is done first and the indirect
operation (the memory read), later. In the latter version, the indirect is done first
and the indexing (...+Y), later. Leventhal [70] illustrates the use of those modes.
The two instructions ‘LDA ($80,X)’ & ‘LDA ($80),Y’ are typical examples. The
‘$’ stands for hexadecimal and the parentheses ‘()’ imply the indirect mode. It is
interesting to note that, in order to keep the instructions short, there is no separate
mode field in the 6502, and the mode is implied in the OpCode. Thus an instruction
may have several OpCodes, one for each valid mode. The two instructions above
have OpCodes of A1, B1 respectively.
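The difference between the two combinations is easy to see in a small
simulation. The sketch below follows the two definitions given above literally
and, as a simplifying assumption, treats memory as a dictionary in which one
location holds a complete address; the real 6502 stores a 16-bit address in two
consecutive bytes. The function names and memory contents are made up for the
example.

```python
def pre_indexed_indirect(memory, disp, x):
    """EA = Mem(disp + X): index first, then read the address from memory."""
    return memory[disp + x]


def post_indexed_indirect(memory, disp, y):
    """EA = Mem(disp) + Y: read the address from memory first, then index."""
    return memory[disp] + y


# Two address cells (hypothetical contents).
memory = {0x80: 0x2000, 0x84: 0x3000}

# Both calls use displacement $80 and an index register holding 4.
print(hex(pre_indexed_indirect(memory, 0x80, 4)))    # 0x3000
print(hex(post_indexed_indirect(memory, 0x80, 4)))   # 0x2004
```

With the same displacement and index value, the first form selects a different
address cell, while the second selects a different location relative to the
address found in one fixed cell.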
This is a rare version of the basic indirect mode. It is suitable for a computer
with, e.g., 16-bit words and 15-bit addresses (or $N$-bit words and $(N-1)$-bit
addresses). The extra bit in such a word serves as a flag. If it is 1, another level of
indirect is required. The original instruction contains an indirect address, and the
hardware examines the contents of that address. If the flag is 1, the hardware uses
the contents as another indirect address. This process continues until the hardware
gets to a memory location where the flag is zero. The contents of that location is
the EA. The HP1000 and 2100 minicomputers [79] are good examples.
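The chain of indirect addresses can be followed with a simple loop. This sketch
assumes 16-bit words whose leftmost bit is the flag and whose remaining 15 bits
are an address; the names and the limit on the number of levels (a guard against
a chain that never ends) are assumptions added for the example.

```python
FLAG = 0x8000        # leftmost bit of a 16-bit word: "another level follows"
ADDR_MASK = 0x7FFF   # the remaining 15 bits hold an address


def resolve_indirect(memory, disp, max_levels=64):
    """Follow indirect addresses until a word with a zero flag is found."""
    word = memory[disp]          # contents of the indirect address
    levels = 1
    while word & FLAG:           # flag is 1: use contents as another indirect address
        if levels >= max_levels:
            raise RuntimeError("indirect chain too long (possibly circular)")
        word = memory[word & ADDR_MASK]
        levels += 1
    return word & ADDR_MASK      # flag is 0: the contents is the EA


memory = {
    100: FLAG | 200,   # flag set, continue at 200
    200: FLAG | 300,   # flag set, continue at 300
    300: 4567,         # flag clear, 4567 is the EA
}
print(resolve_indirect(memory, 100))   # 4567
```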
Exercise A.4 How can the programmer generate both an address and a flag in a
memory word?
Computers use many other modes, some simpler and others, more powerful
than the basic modes described so far. We will mention a few other modes, not
trying to provide a complete list but merely to show the possibilities.
This is a different name for the direct mode described earlier. If the
displacement field is $N$ bits long, the (unsigned) displacement can have $2^N$
values. Memory may be divided into pages, each $2^N$ words long, and page zero
(the first memory page) is special in the sense that any instruction accessing it
may be assembled in the direct mode. Hence the name zero page mode. Good examples
are the 6502 microprocessor [70] and the PDP-8 minicomputer [71].
The PDP-8 was a 12-bit minicomputer. It also had 12-bit addresses and thus
a 4k word memory. The memory is divided into pages, each 128 words long. The
first memory page, addresses 0–127, is called the base page. Many instructions have
the format:
OpCode  m  disp
  4     1    7     = 12 bits long.
If m=0, the direct mode is used and EA=displacement. This is zero page addressing.
However, if m=1, the EA becomes the logical OR of the 5 most significant bits of
the PC and the 7 bits of the displacement, pppppddddddd. The displacement is thus
the address in the current page, and the mode is called current page direct mode.
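The two cases can be written as a short function. This is a sketch of the rule
as stated above, with a 1-bit mode and a 7-bit displacement; the function name
and the sample PC value are assumptions for illustration.

```python
PAGE_MASK = 0o7600   # the 5 most significant bits of a 12-bit address
DISP_MASK = 0o0177   # the 7 least significant bits


def page_mode_ea(m, disp, pc):
    """Return the EA for zero page (m=0) or current page (m=1) addressing."""
    if not 0 <= disp <= DISP_MASK:
        raise ValueError("displacement must fit in 7 bits")
    if m == 0:
        return disp                    # zero page: EA = displacement
    return (pc & PAGE_MASK) | disp     # current page: pppppddddddd


pc = 0o1234                            # an address inside the page starting at 0o1200
print(oct(page_mode_ea(0, 0o17, pc)))  # 0o17
print(oct(page_mode_ea(1, 0o17, pc)))  # 0o1217
```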
This is the simple case where the instruction has no operands, and is not
using any mode. Instructions such as HLT or TAY (transfer A to Y) are good exam-
ples. Those instructions do not use any modes but many times the manufacturers’
literature refers to them as using the implicit or implied mode.
In a computer with a stack, some instructions use the stack and should update
the stack pointer. Such instructions are said to use the stack mode, even though
they need not have a separate mode field. The use of the stack is implied in the
OpCode. Examples are POP & PUSH. The first pops a data item from the stack
and then updates the stack pointer (say, by incrementing it); the second updates
the stack pointer (by decrementing it) and then pushes the new data item into the
stack.
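The order of the two steps in each operation is what keeps the stack pointer
consistent. The sketch below assumes a stack that grows toward lower addresses,
as in the description above; the class name and the stack size are arbitrary.

```python
class Stack:
    """A downward-growing stack kept in a small memory array."""

    def __init__(self, size=16):
        self.memory = [0] * size
        self.sp = size              # empty stack: SP is just past the last cell

    def push(self, value):
        if self.sp == 0:
            raise OverflowError("stack overflow")
        self.sp -= 1                # first update the stack pointer ...
        self.memory[self.sp] = value   # ... then store the new item

    def pop(self):
        if self.sp == len(self.memory):
            raise IndexError("stack underflow")
        value = self.memory[self.sp]   # first fetch the top item ...
        self.sp += 1                # ... then update the stack pointer
        return value


s = Stack()
s.push(10)
s.push(20)
print(s.pop(), s.pop())   # 20 10
```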
Exercise A.5 A stack is a LIFO data structure, which implies using the top ele-
ment. What could be a reason for accessing a stack element other than the top?
The instruction specifies a register that contains the operand. This mode does
not use an EA and there is no memory access.
The register contains the EA. It is thus a pointer to the operand in memory.
[Figure: block diagrams of the Direct, Relative, Indirect, Immediate, and Indexed
modes, each showing how the instruction's disp field leads to the operand.]
above for the relative mode will be used to illustrate base registers.
LC              Obj. Code
 0      .
        .
19  A   ..
        .       Opc dis
57      JMP A   xxx 19
The JMP instruction is assembled with a displacement of 19, which is the value of
symbol A relative to the start of the program. The loader loads the program starting
at, say, address 1000. No relocation is necessary, and the JMP instruction is loaded
in memory as xxx 19. The instruction labeled A is loaded into location 1019. At run
time, the base register should contain 1000 and, every time an address is generated
by the program, the hardware calculates the EA by adding disp.+base. In our case,
EA=19+1000=1019.
[Figure: block diagrams of the Indirect Pre-Indexed, Indirect Post-Indexed, Register,
Register Indirect, Base Relative, and Stack Relative modes.]
There is one problem associated with this feature: the case where the value
of a symbol is too large to fit in the displacement field. A simple example is shown
in the table below.
LC              Obj. Code
 0      .
        .       Opc dis
57      JMP A   xxx 1234   too large!
        .
1234 A  --
A general solution is to switch to another base register, one containing an
address closer to the value of symbol A. The user can select, say, register 4, load it
with address 1200 and tell the assembler to assemble the JMP instruction using R4
as the base register. The assembler would then generate a displacement of 34. A
directive is necessary for this purpose and, in the case of the IBM 360, the directive
is called USING. Thus USING 1200,4 would do the job. Notice that the directive
only informs the assembler; it cannot load the value 1200 into register 4, which the
program itself must do at run time.
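The bookkeeping that a USING-like directive implies can be sketched as a table
that maps base registers to the values the programmer has promised they will
contain. The function names, the 10-bit displacement field, and the use of
register 0 as the default base are assumptions made for this example; they are
not the conventions of the IBM 360 assembler.

```python
def choose_base(using, address, disp_bits=10):
    """Pick a base register whose promised value puts `address` in range.

    using -- dict: register number -> base value declared by the programmer
    Returns (register, displacement), or None if no register is suitable.
    """
    limit = 1 << disp_bits           # displacements are 0 .. limit-1
    best = None
    for reg, base in sorted(using.items()):
        disp = address - base
        if 0 <= disp < limit and (best is None or disp < best[1]):
            best = (reg, disp)       # prefer the smallest displacement
    return best


def effective_address(registers, reg, disp):
    """What the hardware does at run time: EA = base + displacement."""
    return registers[reg] + disp


using = {0: 0}                       # default base: start of the program
print(choose_base(using, 19))        # (0, 19)
print(choose_base(using, 1234))      # None, the displacement is too large

using[4] = 1200                      # the effect of a directive like USING 1200,4
print(choose_base(using, 1234))      # (4, 34)

# At run time the program itself must have loaded the registers.
load_origin = 1000
registers = {0: load_origin, 4: load_origin + 1200}
print(effective_address(registers, 4, 34))   # 2234, i.e. 1000 + 1234
```

The table only records a promise. If the program fails to load register 4 with
the matching value, the displacement of 34 leads to the wrong location.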
Divide the bits into groups of four, going from right to left. For our example we
get 001 1111 (the leftmost group may be shorter than 4).
Convert each group into one hex digit. Our example becomes 1F.
Note that a few zeros may have to be added to the leftmost group. Conversion
in the opposite direction is as easy.
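The grouping procedure, and its reverse, can be sketched in Python as follows.
The function names are illustrative assumptions.

```python
HEX_DIGITS = "0123456789ABCDEF"


def binary_to_hex(bits: str) -> str:
    """Convert a string of 0s and 1s to hex by grouping bits in fours."""
    if not bits or set(bits) - {"0", "1"}:
        raise ValueError("expected a non-empty string of 0s and 1s")
    width = -(-len(bits) // 4) * 4        # round the length up to a multiple of 4
    padded = bits.zfill(width)            # add zeros to the leftmost group
    groups = [padded[i:i + 4] for i in range(0, width, 4)]
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)


def hex_to_binary(digits: str) -> str:
    """Convert each hex digit to its group of four bits."""
    return "".join(format(int(digit, 16), "04b") for digit in digits)


print(binary_to_hex("0011111"))   # 1F
print(hex_to_binary("1F"))        # 00011111
```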
I.1: Suppressing the listing file makes sense if a listing exists from a previous as-
sembly. Also, if a printer is not immediately available, the user may want to save
the listing file and to print it later.
1.3: The number of registers should be of the form $2^k$. This implies that a register
number is $k$ bits long, and fits in a $k$-bit field in the instruction. A choice of
20 registers would mean a 5-bit register number (and thus a 5-bit field in many
instructions), but that field would not be fully utilized since with 5 bits it is possible
to indicate 32 registers.
1.4: To indicate an index register and a base register. Appendix A contains more
information on indexing and base addressing.
1.6: The symbol is simply undefined, an error situation discussed, among other
errors, at the end of chapter 1.
272 Answers to Exercises Append. C
1.7: Certain characters, such as ‘-’, ‘_’ allow for a natural division of the name
into easy-to-read components. Other characters, such as ‘$’, ‘=’ make labels more
descriptive. Examples: NO_OF_FONTS is more readable than NoOfFonts. REG=DATA
is more descriptive than RegEqData.
1.8:
a.
type
  address=0..4095;   { an address in a 4k memory }
  node=record
    info: address;
    next: ^node
  end;
  list=^node;        { the type ‘list’ is a pointer to the beginning of such a list }
b.
const
  lim=500; max=1000;   { some suitable constants }
type
  list=0..lim;   { type ‘list’ is the index of the first list element in array house }
var
  house: array[1..lim] of 0..max;    { lists of pointers are housed here }
  house2: array[1..lim] of 0..lim;   { pointers pointing to the next element
                                       inside each list are housed here }
1.9: Using logical operations and, perhaps, shifts to mask the rest of the instruction.
1.10: Using either the POS or the ‘DS 0’ directives. Both are described in chapter
3.
1.11: A branch instruction. Any instruction following a branch must be forced into
position 0 of the next word, even if it is not labeled. The reason is that the only
way to execute such an instruction is to branch to it, and to make it possible to
branch to it, it must be located in position 0.
1.12: Yes, it is valid, its value is −11 and its type, absolute. It simply represents
a negative distance.
1.13: External symbols are described in chapter 3, and the way they are handled,
by the loader, in chapter 7. In the above example, the assembler calculates as much
of the expression as it can (A-B) and generates two modify loader directives (see
chapter 7), one to add the value of K and the other, to subtract L, both executed at
load time.
1.14: Address 24, because of the phrasing above: “. . . with the name ‘1’ and a value
≥ 17 . . .”
1.15: ‘JMP *’ means jump to the current instruction. Such an instruction jumps
to itself and thus causes an infinite loop. In the early microprocessors, this was a
common way to end the program, since many of those microprocessors did not have
a HLT instruction.
‘JMP *-*’ means jump to location 0. However, since the LC symbol ‘*’ is relative,
the expression *-* is rel − rel and is therefore absolute. The instruction would
jump to location 0 absolute, not to the first location in the program.
Note. Knuth [84] mentions another reason for those instructions. See exercise 2
in his section 1.3.2.
1.16: Because the USE is supposed to have the name, not the value, of a LC, in its
operand field.
1.17: Yes. The only problem is that the loader needs to be told where to start
the execution of the entire program. This, however, is specified in the END directive
(see chapter 3) and is a standard feature, used even where no multiple LCs are
supported.
1.18: It is identical to:
      JMP TMP
      .
      .
TMP   DC *
The computer will branch to the location labeled TMP but that location contains an
address, not an instruction. The likely result will be an interrupt (invalid OpCode).
1.19:
                          Object Code
LC  Label  Source      OpCode    Op
 0         INP         00000111
 1         STO 50      00000010  00110010
 3         INP         00000111
 4         STO 51      00000010  00110011
 6         BZE X       00000100  00001101
 8         ADD 50      00000011  00110010
10         OUT         00001000
11         BRA Y       00000110  00010001
13  X      LOD 50      00000001  00110010
15         ADD 50      00000011  00110010
17  Y      STO 52      00000010  00110100
19         HLT         00000000
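The table in answer 1.19 can be reproduced by a very small two-pass assembler.
The sketch below is an assumption-laden illustration: the OpCode values are
simply read off the table, each source line is represented as a (label,
mnemonic, operand) tuple, and an instruction occupies one byte without an
operand and two bytes with one.

```python
OPCODES = {"HLT": 0, "LOD": 1, "STO": 2, "ADD": 3,
           "BZE": 4, "BRA": 6, "INP": 7, "OUT": 8}

SOURCE = [
    ("",  "INP", None), ("",  "STO", "50"), ("",  "INP", None),
    ("",  "STO", "51"), ("",  "BZE", "X"),  ("",  "ADD", "50"),
    ("",  "OUT", None), ("",  "BRA", "Y"),  ("X", "LOD", "50"),
    ("",  "ADD", "50"), ("Y", "STO", "52"), ("",  "HLT", None),
]


def assemble(source):
    """Return a list of (LC, [bytes]) pairs, one per source line."""
    # Pass 1: assign a value to every label.
    symbols, lc = {}, 0
    for label, _mnemonic, operand in source:
        if label:
            symbols[label] = lc
        lc += 1 if operand is None else 2

    # Pass 2: generate the object code, now that all symbols are known.
    listing, lc = [], 0
    for _label, mnemonic, operand in source:
        code = [OPCODES[mnemonic]]
        if operand is not None:
            value = symbols[operand] if operand in symbols else int(operand)
            code.append(value)
        listing.append((lc, code))
        lc += len(code)
    return listing


for lc, code in assemble(SOURCE):
    print(f"{lc:2d}  " + " ".join(format(byte, "08b") for byte in code))
```

Because X and Y are future symbols when BZE and BRA are read, their values (13
and 17) become available only after the first pass has finished.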
1.20: Add either addressing modes or more registers. With 4 unused bits we can
have 2 bits specify one of 4 addressing modes, and the other 2 bits, one of 4 general-
purpose registers. Alternatively, we can use 3 bits to specify 8 modes, and the fourth
bit (if one is available) to select one of two index registers. These are special purpose
registers, used to hold an address, not an operand. The arithmetic operations would
still be performed in the Acc.
1.21: Invalid mnemonic, invalid label (not a single letter), multiply-defined la-
bel, and invalid operand (syntactically wrong, such as ‘STO :’, undefined symbol,
operand greater than maximum address).
1.22: It makes it easier for the assembler to distinguish a symbol from a constant
in the operand field. In an instruction such as ‘ADD R1,1A’, the assembler has to
scan the entire name of the symbol 1A to verify that it is a symbol. The restriction
to a letter allows the assembler to scan an instruction such as ‘ADD R1,A1’ and,
immediately after scanning the first character of the symbol name, decide that the
instruction uses a symbol. This simplifies the lexical analysis phase of the assembler.
1.23: There is no point in writing 5000 unnecessary records on the object file. The
assembler places a special code (a loader directive) in the object
file, instructing the loader to reserve as many locations as necessary.
1.24: In the relative mode, the Op field is essentially the distance between the
instruction and its operand. If that distance does not fit in six bits, the relative
mode cannot be used. In other words, this mode can only be used if the instruction
is not too far away from its operand. See appendix A for more details.
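The test an assembler would apply here can be sketched as follows. The sketch assumes, purely for illustration, that the six-bit field holds a signed two's-complement distance measured from the address of the instruction itself; the actual convention is the one defined in appendix A and may differ. The name fits_relative is hypothetical.

```python
def fits_relative(instr_addr, operand_addr, bits=6):
    # Assumption: the Op field is a signed two's-complement distance,
    # so a 6-bit field covers the range -32..31.
    distance = operand_addr - instr_addr
    low = -(1 << (bits - 1))
    high = (1 << (bits - 1)) - 1
    return low <= distance <= high

print(fits_relative(100, 120))   # operand 20 locations ahead
print(fits_relative(100, 200))   # operand 100 locations ahead
```

When the test fails, the assembler has to fall back on a mode with a longer address field.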
2.1: It depends on the characters allowed. The ASCII codes of the characters ‘<’,
‘=’, ‘>’, ‘?’, ‘@’ immediately precede the code of ‘A’. Similarly, the codes of ‘[’, ‘\’,
‘]’ immediately follow ‘Z’ in the ASCII sequence. If those codes are used, then it is
still easy to use buckets. Given the first character of a symbol name, we only need
to subtract from it the ASCII code of the first of the allowed characters, say ‘<’, to
get the bucket number. If other characters are allowed, then buckets may not be a
good data structure for the symbol table.
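As a small illustration of the subtraction described above, the following Python sketch assumes that the allowed first characters are exactly the contiguous ASCII range from ‘<’ to ‘]’ (which contains the uppercase letters). The names bucket_number and NUM_BUCKETS are illustrative only.

```python
FIRST_ALLOWED = "<"   # lowest ASCII code among the allowed first characters
LAST_ALLOWED = "]"    # highest ASCII code among them

# One bucket per character code in the contiguous range '<' .. ']'.
NUM_BUCKETS = ord(LAST_ALLOWED) - ord(FIRST_ALLOWED) + 1

def bucket_number(symbol):
    first = symbol[0]
    if not FIRST_ALLOWED <= first <= LAST_ALLOWED:
        raise ValueError(f"symbol {symbol!r} starts with a disallowed character")
    # Subtract the code of the first allowed character to get the bucket.
    return ord(first) - ord(FIRST_ALLOWED)

print(NUM_BUCKETS, bucket_number("ALPHA"), bucket_number("ZED"))
```

The buckets stay dense only because the allowed codes are contiguous; a character far from this range would leave many empty buckets, which is the point made in the answer.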
3.1: It adds more work to the programmer and doesn’t speed up the identification
by much. Even if the assembler identifies a source line as a directive by the period, it
still needs to search some table to find the start address of the routine that executes
it.
3.7: a. To make BSS and BES really different. Compare this difference to the
difference between the auto-increment and the auto-decrement modes discussed in
appendix A.
b. The BES directive was useful on old computers with subtractive index registers
where most loops went backwards.
3.8: To produce a forcing upper of the next source line. This is common on com-
puters with a large word size, such as the CDC Cyber and the Cray computers.
The concept of forcing upper is discussed in chapter 1. Also, the IBM 360 uses
halfwords, fullwords, and doublewords in memory. The directive ‘DS 0D’ (reserve 0
double words) on those computers is used to force the LC to the start of the next
doubleword.
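The effect of a directive such as ‘DS 0D’ on the LC can be sketched in a few lines of Python, assuming byte addresses and an 8-byte doubleword as on the IBM 360. The function name align_lc is hypothetical.

```python
def align_lc(lc, boundary=8):
    # Round the LC up to the next multiple of `boundary`;
    # an LC that is already aligned is left unchanged.
    return (lc + boundary - 1) // boundary * boundary

print(align_lc(13))      # next doubleword boundary after 13
print(align_lc(16))      # already on a doubleword boundary
print(align_lc(5, 4))    # fullword alignment
```

No storage is reserved by the directive itself; only the LC moves.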
3.9: The most common use for ORG is to specify a start address for the program
in a computer without an operating system. On such a machine, the user may
select a start address and may want to load different programs starting at different
addresses. In such a case, the first source line is an ORG and is the only ORG in the
program.
3.10: The reason is that instructions are assembled in pass 2, where all the symbols
are already in the symbol table; certain directives, however, are executed in pass 1,
where future symbols have not been found yet. Thus pass 1 directives cannot use
future symbols.
3.11: The simplest way is to add another pass. The directive ‘A EQU B+1’ can be
handled in three passes. In the first pass, label A cannot be defined, since label B
is not yet in the symbol table. However, later in the same pass, B is found and is
stored in the symbol table. In the second pass label A can be defined and, in the
third pass, the program can be assembled. This, of course, is not a general solution,
since it is possible to nest future symbols very deep. Imagine something like:
A EQU B
-
B EQU C
-
C EQU D
-
-
D -
Such a program requires four passes just to collect all the symbol definitions, fol-
lowed by another pass to assemble instructions. Generally one could design a per-
colative assembler that would perform as many passes as necessary, until no more
future symbols remain. This may be a nice theoretical concept but its practical
value is nil. Cases such as ‘A EQU B’, where B is a future symbol, are not important
and can be considered invalid.
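The idea of repeating the symbol-collection pass until no future symbols remain can be sketched as follows. This is only an illustration in Python, not a description of any real assembler; the tuple format, the function name collect_symbols, and the address 100 given to D are all assumptions made for the example.

```python
def collect_symbols(lines):
    """lines is a list of (label, kind, value) in source order.
    kind "EQU": value is the name of another symbol.
    kind "DEF": value is the address of an ordinary label."""
    symbols, passes = {}, 0
    while True:
        passes += 1
        progress = unresolved = False
        for label, kind, value in lines:
            if label in symbols:
                continue                      # defined in an earlier pass
            if kind == "DEF":
                symbols[label] = value
                progress = True
            elif value in symbols:            # operand is already known
                symbols[label] = symbols[value]
                progress = True
            else:                             # operand is a future symbol
                unresolved = True
        if not unresolved:
            return symbols, passes
        if not progress:
            raise ValueError("undefined or circular symbols remain")

program = [("A", "EQU", "B"), ("B", "EQU", "C"),
           ("C", "EQU", "D"), ("D", "DEF", 100)]
symbols, passes = collect_symbols(program)
print(passes, symbols)
```

For the nested example above, the loop needs four passes to define all the symbols, in agreement with the count given in the answer.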
3.13: The modify loader directive, discussed in chapter 7, can be generated by the
assembler to instruct the loader to modify any item loaded, in any way, instead of
just storing in it the value of an external symbol.
3.14: Because the BSSZ generates Data. The BSS (or DS) directive only reserves
storage, so it is one of the Block Control directives.
4.1: Either make the label a parameter, use local labels (see chapter 1), or use
automatic label generation (see Ch. 4).
4.2: Generally not. In the above example it makes no sense for the assembler to
substitute X for B since it does not know if this is what the programmer wants. How-
ever, there are two cases where double substitution (or even multiple substitution)
makes sense. The first is where an argument is the name of a macro. The second
is the case of an assembler where parameters are identified syntactically, with an
‘&’ or some other character. The first case is called nested macro expansion and is
discussed later in this chapter. The second case occurs in the IBM 360, where
parameters must start with an ‘&’ and are therefore easy to identify. The 360
assembler performs multiple substitution until no ‘&’ is left in the source line.
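Substitution that repeats until no ‘&’ is left can be sketched as follows. This is a simplified Python illustration, not the algorithm of the 360 assembler; the function name substitute, the bindings, and the limit on the number of rounds are assumptions made for the example.

```python
import re

# A parameter is assumed to be '&' followed by a letter and then
# letters or digits.
PARAM = re.compile(r"&([A-Za-z][A-Za-z0-9]*)")

def substitute(line, bindings, max_rounds=20):
    for _ in range(max_rounds):
        if "&" not in line:
            return line
        # Replace every parameter by the argument bound to it; the
        # argument may itself contain an '&', hence the outer loop.
        line = PARAM.sub(lambda m: bindings[m.group(1)], line)
    raise ValueError("substitution did not terminate")

bindings = {"P": "&Q", "Q": "R5"}      # hypothetical bindings
print(substitute("LOD &P,X", bindings))
```

The round limit guards against bindings that refer to each other in a cycle, a case the loop would otherwise never leave.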
4.3: It depends on the MDT organization and on the size of the data structures
used for binding. Typically, the maximum number of parameters ranges between a
few tens and a few hundreds.
4.4: If the last argument is null, it should be preceded by a comma. A missing last
comma indicates a missing argument.
4.5: Add another pass (pass −1?) to collect all macro definitions and store them
in the MDT. Pass 0 would now be concerned only with macro expansions.
4.6: Nested macro definition. In such a case, each expansion of certain macros
causes another macro definition to be stored in the MDT, and space in the MDT
may be exhausted very rapidly.
4.7: All three parameters are bound to the same argument and are therefore sub-
stituted by the same string.
4.8: To retain the last binding and use default values. Thus P2 is first bound to
MAN and then to DAN. Since P3 is not bound to any argument, it should get a default
value.
4.9: The macro processor would continue scanning the source, reading and assign-
ing more and more text to the second argument, until one of the following happens:
1. It finds a period-space combination somewhere in the source. This would
terminate the scan and would bind the second parameter to a (long) argument.
2. It gets to the end of the source and realizes that something went wrong.
3. It finds a character (rather, a token) in a context that’s invalid inside a macro
argument. In the case of TEX, such a token could be the start of a new paragraph.
In cases 2 and 3, the macro processor would issue a ‘runaway argument’ error message,
and either terminate the macro expansion or give the user a chance to correct the
source file interactively.
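The three outcomes listed above can be sketched in Python. The sketch is loosely modeled on the behavior described and is not the implementation of TEX or of any real macro processor; it assumes that the delimiter is a period followed by a space and that a blank line marks the start of a new paragraph. The names scan_argument and RunawayArgument are hypothetical.

```python
class RunawayArgument(Exception):
    pass

def scan_argument(source, start, delimiter=". ", paragraph_break="\n\n"):
    end = source.find(delimiter, start)
    stop = source.find(paragraph_break, start)
    # Case 3: a token that is invalid inside an argument comes first.
    if stop != -1 and (end == -1 or stop < end):
        raise RunawayArgument("paragraph ended before the delimiter")
    # Case 2: the end of the source is reached without a delimiter.
    if end == -1:
        raise RunawayArgument("source ended before the delimiter")
    # Case 1: a delimiter is found, possibly far away, so the argument
    # may be much longer than the user intended.
    return source[start:end], end + len(delimiter)

print(scan_argument("first word. rest of the text", 0))
```

The second element of the result is the position where scanning of the source resumes.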
4.10: The difference is in scope. Attributes of macro arguments exist only while
the macro is expanded. Attributes of a symbol exist while the symbol is stored in
the symbol table.
4.11: The programmer probably meant a default value of null for parameter M2.
However, depending on the syntax rules of the assembler, this may also be consid-
ered an error.
4.12: It should be, since it alerts the reader to the fact that some of the listing is
suppressed.
4.14: Yes, many assemblers support simple expressions in the AIF directive.
4.16: No, one ‘ENDM X’ is enough. It signals to the assembler the end of macro X
and all its inner definitions.
5.1: Nothing, since the assembler does not process the macro lines at definition
time.
5.2: .show me. Note that me stands for ‘Macro Expansion’. Therefore, there are
no directives such as .show you or .noshow you. (Both .show and .noshow are
described in Ch. 4.)
5.3: It finds the new definition of ‘begin’, and discovers that it is different from
the one already in the symbol table.
6.2: Refer to the codes in the table; some valid conversions are:
B→D, K→B, D→P, D→K, K→D.
Invalid ones are:
X→D, H→E, E→B.
Also see any COBOL text, e.g. [90], for a complete list of COBOL type conversions.
6.3: If something like A[5,3] is needed, the programmer may either declare three
arrays A1[5], A2[5], A3[5], or one large array A[15]. The operation ‘A[3,2]:=0;’
can then be written either as ‘A2[3]:=0;’ or ‘A[i]:=0;’ where ‘i:=3+5*(2-1);’.
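The index computation for the single large array can be checked with a short Python sketch. It assumes 1-based indices and the same column-by-column layout as the formula ‘i:=3+5*(2-1);’ above; the name flat_index is illustrative.

```python
ROWS = 5   # size of the first dimension of A[5,3]
COLS = 3   # size of the second dimension

def flat_index(row, col, rows=ROWS):
    # 1-based position of A[row,col] inside the one-dimensional A[15].
    return row + rows * (col - 1)

print(flat_index(3, 2))   # the element addressed by A[3,2]
```

Every pair of valid indices maps to a different position between 1 and 15, so no two elements of A[5,3] collide in A[15].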
6.4: X is a parameter that should be in the symbol table. Its value should be a
string, such as ‘A’ or ‘S’. The parameter is then replaced by its value and the result
7.10: ‘Undefined External Symbol’. The loader cannot tell if a library routine is
really missing or if the name is that of a bad external symbol.
7.11: It is possible to generalize the concept of overlay load and delete, such that
any overlay can call any other one, and there is no RETurn. Let’s assume that A and
D are active and D calls B. In such a case, D and all its active descendants are deleted
automatically and B is loaded. Even more, if D calls E, then D and its descendants
are deleted as before, and both overlays B and E are loaded.
7.12: There are two ways.
1. The bootstrap ROM is a separate memory, and is copied into main memory as
the first step in a bootstrap.
2. The bootstrap ROM can be switched in and out of the address space. Let’s
assume that the bootstrap ROM occupies addresses xxx-yyy in the address space.
It is possible, although probably not worth it, to have a small RAM with the same
address range, and a multiplexor that can switch either the RAM or the ROM into
the address space. When the computer is reset, the multiplexor should switch in
the ROM, ready for the next bootstrap. When the bootstrap loader is executed,
its last instruction should be to switch back the RAM, so address range xxx-yyy
could be used by application programs.
7.13: As explained in chapter 7, the assembler should maintain a table (perhaps a
packed array) of 64 boolean values, one for each drum location, identifying its state
as either free or occupied.
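A packed table of this kind can be sketched in Python by keeping the 64 boolean values in the bits of a single integer. The class name DrumMap and its methods are hypothetical and serve only to illustrate the data structure.

```python
class DrumMap:
    SIZE = 64                      # number of drum locations

    def __init__(self):
        self.bits = 0              # bit i set means location i is occupied

    def _check(self, loc):
        if not 0 <= loc < self.SIZE:
            raise IndexError(f"drum location {loc} out of range")

    def occupy(self, loc):
        self._check(loc)
        self.bits |= 1 << loc

    def release(self, loc):
        self._check(loc)
        self.bits &= ~(1 << loc)

    def is_free(self, loc):
        self._check(loc)
        return ((self.bits >> loc) & 1) == 0

    def first_free(self):
        # Return the lowest free location, or None if the drum is full.
        for loc in range(self.SIZE):
            if self.is_free(loc):
                return loc
        return None

drum = DrumMap()
drum.occupy(0)
drum.occupy(1)
print(drum.first_free())
```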
8.1: It directs MASM to create a pass-1 listing in addition to, not instead of, the
usual, pass-2, listing.
A.1: By designing variable-length OpCodes. Most texts on computer organization
explain the method, which is based on assigning short OpCodes to long instructions
and long OpCodes to short instructions. Another, intriguing, possibility is to assign
the short OpCodes to commonly used instructions. Such a method is used on the
B1700 computers and is discussed by Organick [93]. An interesting theoretical
possibility is to use the Huffman method [94] to determine the shortest possible
OpCodes based on the frequency of use of each instruction.
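The Huffman approach can be illustrated with a short Python sketch that computes only the length of each OpCode. The instruction frequencies below are invented for the example and are not measurements of any real computer; the function name opcode_lengths is hypothetical.

```python
import heapq
from itertools import count

def opcode_lengths(freq):
    # Each heap entry is (total frequency, tie breaker, {name: depth}).
    tie = count()
    heap = [(f, next(tie), {name: 0}) for name, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees; every instruction in
        # them moves one level deeper, i.e. gets one more OpCode bit.
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        merged = {name: depth + 1 for name, depth in {**a, **b}.items()}
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]

# Hypothetical usage frequencies, in percent.
freq = {"LOD": 40, "STO": 30, "ADD": 15, "BRA": 10, "HLT": 5}
lengths = opcode_lengths(freq)
print(lengths)
```

The most frequently used instruction receives the shortest OpCode, and the rarely used ones receive the longest, which is the property the answer relies on.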
A.2: Yes. The instruction is only assembled in pass 2 and, at that time, the values
of all symbols are in the symbol table. However, in pass 1 the assembler has to
determine the size of each instruction, and that size may depend on the mode. If
the mode cannot be determined because of a future symbol, the assembler selects
the mode with the largest size.
A.3: A good choice is ‘LOD R5,#ARY’. This instruction uses the immediate mode.
A.4: A computer using cascaded indirect should have a special directive for this
purpose.
A.5: Many stack operations use the one or two elements on top of the stack as
their operands. Such an operation removes the operands from the stack and pushes
the result into the stack. If the programmer wants to use the same operands in
the future, they should be left in the stack. An easy way to accomplish this is to
generate their copies before executing a stack instruction. Example:
The reader should try to figure out the successive states of the stack at the 3 labeled
instructions above.