Contents
Chapter 1 Background
1.1 Introduction
1.2 System Software and Machine Architecture
1.3 The Simplified Instructional Computer (SIC)
1.3.1 SIC Machine Architecture
1.3.2 SIC/XE Machine Architecture
1.3.3 SIC Programming Examples
1.4 Traditional (CISC) Machines
1.4.1 VAX Architecture
1.4.2 Pentium Pro Architecture
1.5 RISC Machines
1.5.1 UltraSPARC Architecture
1.5.2 PowerPC Architecture
1.5.3 Cray T3E Architecture
Exercises
Chapter 2 Assemblers
2.1 Basic Assembler Functions
2.1.1 A Simple SIC Assembler
2.1.2 Assembler Algorithm and Data Structures
2.2 Machine-Dependent Assembler Features
2.2.1 Instruction Formats and Addressing Modes
2.2.2 Program Relocation
2.3 Machine-Independent Assembler Features
2.3.1 Literals
2.3.2 Symbol-Defining Statements
2.3.3 Expressions
2.3.4 Program Blocks
2.3.5 Control Sections and Program Linking
2.4 Assembler Design Options
2.4.1 One-Pass Assemblers
2.4.2 Multi-Pass Assemblers
2.5 Implementation Examples
2.5.1 MASM Assembler
2.5.2 SPARC Assembler
https://hemanthrajhemu.github.io
Chapter 1
Background
1.1 INTRODUCTION
difficult to distinguish between those features of the software that are truly
fundamental and those that depend solely on the idiosyncrasies of a particular
machine. To avoid this problem, we present the fundamental functions of each
piece of software through discussion of a Simplified Instructional Computer
(SIC). SIC is a hypothetical computer that has been carefully designed to
include the hardware features most often found on real machines, while
avoiding unusual or irrelevant complexities. In this way, the central concepts
of a piece of system software can be clearly separated from the implementation
details associated with a particular machine. This approach provides the
reader with a starting point from which to begin the design of system software
for a new or unfamiliar computer.
Each major chapter in this text first introduces the basic functions of the
type of system software being discussed. We then consider machine-dependent
and machine-independent extensions to these functions, and examples of
implementations on actual machines. Specifically, the major chapters are
divided into the following sections:
This chapter contains brief descriptions of SIC and of the real machines
that are used as examples. You are encouraged to read these descriptions now,
and refer to them as necessary when studying the examples in each chapter.
Like many other products, SIC comes in two versions: the standard model and
an XE version (XE stands for “extra equipment,” or perhaps “extra
expensive”). The two versions have been designed to be upward compatible;
that is, an object program for the standard SIC machine will also execute
properly on a SIC/XE system. (Such upward compatibility is often found on real
computers that are closely related to one another.) Section 1.3.1 summarizes
the standard features of SIC. Section 1.3.2 describes the additional features
that are included in SIC/XE. Section 1.3.3 presents simple examples of SIC and
SIC/XE programming. These examples are intended to help you become more
familiar with the SIC and SIC/XE instruction sets and assembler language.
Practice exercises in SIC and SIC/XE programming can be found at the end of
this chapter.
Memory
Memory consists of 8-bit bytes; any 3 consecutive bytes form a word (24 bits).
All addresses on SIC are byte addresses; words are addressed by the location
of their lowest numbered byte. There are a total of 32,768 (2^15) bytes in the
computer memory.
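The word-addressing rule can be sketched in Python. This is only an illustration under stated assumptions: the helper names `read_word` and `write_word` are hypothetical, and the sketch assumes that the lowest-numbered byte of a word holds its high-order 8 bits.

```python
MEMORY_SIZE = 2 ** 15  # 32,768 bytes on the standard SIC


def _check(address):
    # A word occupies bytes address, address + 1, and address + 2.
    if not 0 <= address <= MEMORY_SIZE - 3:
        raise ValueError("word does not fit in memory")


def read_word(memory, address):
    """Return the 24-bit word addressed by its lowest-numbered byte."""
    _check(address)
    # Assumption: the lowest-numbered byte is the most significant one.
    return int.from_bytes(memory[address:address + 3], "big")


def write_word(memory, address, value):
    """Store the low-order 24 bits of value as 3 consecutive bytes."""
    _check(address)
    memory[address:address + 3] = (value & 0xFFFFFF).to_bytes(3, "big")


memory = bytearray(MEMORY_SIZE)
write_word(memory, 0x3030, 0x003600)
print(f"{read_word(memory, 0x3030):06X}")
```

Keeping memory as a flat byte array mirrors the fact that every SIC address is a byte address; a word is just a 3-byte view onto it.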
Registers
There are five registers, all of which have special uses. Each register is 24 bits
in length. The following table indicates the numbers, mnemonics, and uses of
these registers. (The numbering scheme has been chosen for compatibility
with the XE version of SIC.)
Data Formats
Instruction Formats
All machine instructions on the standard version of SIC have the following
24-bit format:
8 1 15
opcode x address
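As a rough sketch of this layout, the three fields can be pulled out of a 24-bit instruction with shifts and masks. The function name `decode_sic` and the sample instruction value are chosen only for illustration.

```python
def decode_sic(instruction):
    """Split a 24-bit SIC instruction into (opcode, x, address)."""
    opcode = (instruction >> 16) & 0xFF   # high-order 8 bits
    x = (instruction >> 15) & 0x1         # 1-bit indexing flag
    address = instruction & 0x7FFF        # low-order 15 bits
    return opcode, x, address


opcode, x, address = decode_sic(0x509039)
print(f"opcode={opcode:02X} x={x} address={address:04X}")
```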
Addressing Modes
There are two addressing modes available, indicated by the setting of the x
bit in the instruction. The following table describes how the target address
is calculated from the address given in the instruction. Parentheses are used
to indicate the contents of a register or a memory location. For example, (X)
represents the contents of register X.
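The calculation can be sketched as follows. The sketch assumes the usual rule for the two modes (direct, x = 0: TA = address; indexed, x = 1: TA = address + (X)); the function name and the truncation of the sum to 15 bits are assumptions made for illustration.

```python
ADDRESS_MASK = 0x7FFF  # 15-bit addresses on the standard SIC


def target_address(address, x_bit, reg_x):
    """Compute the target address from the instruction's address field."""
    if x_bit:
        # Indexed addressing: add the contents of register X.
        # Assumption: the sum is truncated to 15 bits.
        return (address + reg_x) & ADDRESS_MASK
    return address & ADDRESS_MASK  # direct addressing


print(hex(target_address(0x1039, 1, 0x0005)))
```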
Instruction Set
SIC provides a basic set of instructions that are sufficient for most simple
tasks. These include instructions that load and store registers (LDA, LDX,
STA, STX, etc.), as well as integer arithmetic operations (ADD, SUB, MUL,
DIV). All arithmetic operations involve register A and a word in memory, with
the result being left in the register. There is an instruction (COMP) that
compares the value in register A with a word in memory; this instruction sets
a condition code CC to indicate the result (<, =, or >). Conditional jump
instructions (JLT, JEQ, JGT) can test the setting of CC, and jump accordingly.
Two instructions are
provided for subroutine linkage. JSUB jumps to the subroutine, placing the
return address in register L; RSUB returns by jumping to the address
contained in register L.
Appendix A gives a complete list of all SIC (and SIC/XE) instructions, with
their operation codes and a specification of the function performed by each.
On the standard version of SIC, input and output are performed by
transferring 1 byte at a time to or from the rightmost 8 bits of register A.
Each device is assigned a unique 8-bit code. There are three I/O instructions,
each of which specifies the device code as an operand.
The Test Device (TD) instruction tests whether the addressed device is
ready to send or receive a byte of data. The condition code is set to indicate
the result of this test. (A setting of < means the device is ready to send or
receive, and = means the device is not ready.) A program needing to transfer
data must wait until the device is ready, then execute a Read Data (RD) or
Write Data (WD). This sequence must be repeated for each byte of data to be
read or written. The program shown in Fig. 2.1 (Chapter 2) illustrates this
technique for performing I/O.
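The test-then-transfer sequence can be modeled with a small Python sketch. The `Device` class is a hypothetical stand-in for real hardware: it simply reports "not ready" a fixed number of times before each byte, so that the waiting loop has something to wait for.

```python
class Device:
    """Toy device that reports 'not ready' a few times before each byte."""

    def __init__(self, data, busy_polls=2):
        self._data = list(data)
        self._busy_polls = busy_polls
        self._remaining = busy_polls

    def test(self):
        """Model TD: return '<' if ready, '=' if not ready."""
        if self._remaining > 0:
            self._remaining -= 1
            return "="
        return "<"

    def read(self):
        """Model RD: hand over one byte and become busy again."""
        self._remaining = self._busy_polls
        return self._data.pop(0)


def read_byte(device):
    while device.test() == "=":   # TD, then jump back while CC is '='
        pass
    return device.read()          # RD


keyboard = Device(b"OK")
print(chr(read_byte(keyboard)), chr(read_byte(keyboard)))
```

The loop in `read_byte` must run once per byte, which is why programmed I/O of this kind keeps the CPU busy for the whole transfer.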
Memory
The memory structure for SIC/XE is the same as that previously described for
SIC. However, the maximum memory available on a SIC/XE system is 1 megabyte
(2^20 bytes). This increase leads to a change in instruction formats and
addressing modes.
Registers
Data Formats
SIC/XE provides the same data formats as the standard version. In addition,
there is a 48-bit floating-point data type with the following format:
1 11 36
s exponent fraction
The fraction is interpreted as a value between 0 and 1; that is, the assumed
binary point is immediately before the high-order bit. For normalized
floating-point numbers, the high-order bit of the fraction must be 1. The
exponent is interpreted as an unsigned binary number between 0 and 2047. If
the exponent has value e and the fraction has value f, the absolute value of
the number represented is
f * 2^(e-1024).
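A short Python sketch of this interpretation follows. It assumes the 48 bits are packed with the 1-bit field first, then the 11-bit exponent, then the 36-bit fraction; the leading bit is ignored because only the absolute value is computed. The function name is hypothetical.

```python
import math


def sicxe_float_magnitude(word48):
    """Absolute value of a 48-bit SIC/XE floating-point number."""
    exponent = (word48 >> 36) & 0x7FF          # 11-bit unsigned exponent e
    fraction_bits = word48 & ((1 << 36) - 1)   # 36-bit fraction
    f = fraction_bits / (1 << 36)              # binary point before high-order bit
    return math.ldexp(f, exponent - 1024)      # f * 2^(e-1024)


# e = 1025 and fraction = 0.1000...b = 0.5, so the value is 0.5 * 2^1 = 1.0
one = (1025 << 36) | (1 << 35)
print(sicxe_float_magnitude(one))
```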
Instruction Formats
The larger memory available on SIC/XE means that an address will (in general)
no longer fit into a 15-bit field; thus the instruction format used on the
standard version of SIC is no longer suitable. There are two possible options:
either use some form of relative addressing, or extend the address field to 20
bits. Both of these options are included in SIC/XE (Formats 3 and 4 in the
following description). In addition, SIC/XE provides some instructions that do
not reference memory at all. Formats 1 and 2 in the following description are
used for such instructions.
The new set of instruction formats is as follows. The settings of the flag
bits in Formats 3 and 4 are discussed under Addressing Modes. Bit e is used to
distinguish between Formats 3 and 4 (e = 0 means Format 3, e = 1 means Format
4). Appendix A indicates the format to be used with each machine instruction.
Format 1 (1 byte):
8
op
Format 2 (2 bytes):
8 4 4
op r1 r2
Format 3 (3 bytes):
6 1 1 1 1 1 1 12
op n i x b p e disp
Format 4 (4 bytes):
6 1 1 1 1 1 1 20
op n i x b p e address
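The Format 3 and Format 4 layouts (a 6-bit op field, six 1-bit flags n, i, x, b, p, e, then a 12-bit disp or a 20-bit address) can be unpacked as sketched below. The function name and the dictionary representation are illustrative choices, and the sketch deliberately ignores the special case in which n and i are both 0.

```python
def decode_format34(raw):
    """Unpack the fields of a Format 3 or Format 4 instruction.

    raw holds the instruction bytes in storage order (at least 3 bytes,
    4 if bit e turns out to be 1).
    """
    first, second = raw[0], raw[1]
    fields = {
        "op": first >> 2,          # 6-bit operation code
        "n": (first >> 1) & 1,
        "i": first & 1,
        "x": (second >> 7) & 1,
        "b": (second >> 6) & 1,
        "p": (second >> 5) & 1,
        "e": (second >> 4) & 1,
    }
    if fields["e"]:
        # Format 4: 20-bit address spread over the last 2.5 bytes.
        fields["address"] = ((second & 0x0F) << 16) | (raw[2] << 8) | raw[3]
    else:
        # Format 3: 12-bit displacement.
        fields["disp"] = ((second & 0x0F) << 8) | raw[2]
    return fields


print(decode_format34(bytes.fromhex("032600")))
print(decode_format34(bytes.fromhex("0310C303")))
```

Because bit e sits in the second byte, a decoder can read two bytes, inspect e, and only then decide whether one or two more bytes belong to the instruction.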
Addressing Modes
Two new relative addressing modes are available for use with instructions
assembled using Format 3. These are described in the following table:
Bits i and n in Formats 3 and 4 are used to specify how the target address is
used. If bit i = 1 and n = 0, the target address itself is used as the operand
value; no memory reference is performed. This is called immediate addressing.
If bit i = 0 and n = 1, the word at the location given by the target address
is fetched; the value contained in this word is then taken as the address of
the operand value. This is called indirect addressing. If bits i and n are
both 0 or both 1, the target address is taken as the location of the operand;
we will refer to this as simple addressing. Indexing cannot be used with
immediate or indirect addressing modes.
Many authors use the term effective address to denote what we have called
the target address for an instruction. However, there is disagreement
concerning the meaning of effective address when referring to an instruction
that uses indirect addressing. To avoid confusion, we use the term target
address throughout this book.
SIC/XE instructions that specify neither immediate nor indirect addressing
are assembled with bits n and i both set to 1. Assemblers for the standard
version of SIC will, however, set the bits in both of these positions to 0.
(This is because the 8-bit binary codes for all of the SIC instructions end in
00.) All SIC/XE machines have a special hardware feature designed to provide
the upward compatibility mentioned earlier. If bits n and i are both 0, then
bits b, p, and e are considered to be part of the address field of the
instruction (rather than flags indicating addressing modes). This makes
Instruction Format 3 identical to the format used on the standard version of
SIC, providing the desired compatibility.
Figure 1.1 gives examples of the different addressing modes available on
SIC/XE. Figure 1.1(a) shows the contents of registers B, PC, and X, and of
selected memory locations. (All values are given in hexadecimal.) Figure 1.1(b)
gives the machine code for a series of LDA instructions. The target address
generated by each instruction, and the value that is loaded into register A,
are also shown. You should carefully examine these examples, being sure you
understand the different addressing modes illustrated.
For ease of reference, all of the SIC/XE instruction formats and addressing
modes are summarized in Appendix A.
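The addressing rules can be tried out against the register and memory contents of Figure 1.1(a). The Python sketch below is an illustration, not a definitive model: the function name and the flag combinations passed to it are chosen for this example, and it assumes that a base-relative displacement is an unsigned 12-bit value added to (B) while a program-counter relative displacement is a signed 12-bit value added to (PC).

```python
# Register and memory contents taken from Figure 1.1(a), in hexadecimal.
REGS = {"B": 0x006000, "PC": 0x003000, "X": 0x000090}
MEM = {0x3030: 0x003600, 0x3600: 0x103000, 0x6390: 0x00C303, 0xC303: 0x003030}


def load_operand(n, i, x, b, p, e, field):
    """Return (target address, value loaded) for the given flag settings.

    field is the 12-bit disp (Format 3) or the 20-bit address (Format 4).
    """
    if n == 0 and i == 0:
        # Standard SIC compatibility: b, p, e extend the address field.
        ta = (b << 14) | (p << 13) | (e << 12) | field
    elif e:
        ta = field                                   # Format 4: direct
    elif p:
        # Assumption: PC-relative displacement is signed (12 bits).
        disp = field - 0x1000 if field & 0x800 else field
        ta = REGS["PC"] + disp
    elif b:
        ta = REGS["B"] + field                       # base relative
    else:
        ta = field                                   # direct, 12-bit
    if x:
        ta += REGS["X"]                              # indexing
    ta &= 0xFFFFF                                    # 20-bit address space
    if i == 1 and n == 0:
        return ta, ta                                # immediate
    if n == 1 and i == 0:
        return ta, MEM[MEM[ta]]                      # indirect
    return ta, MEM[ta]                               # simple


for flags in [(1, 1, 0, 0, 1, 0, 0x600), (1, 1, 1, 1, 0, 0, 0x300),
              (1, 0, 0, 0, 1, 0, 0x030), (0, 1, 0, 0, 0, 0, 0x030),
              (0, 0, 0, 0, 1, 1, 0x600), (1, 1, 0, 0, 0, 1, 0x0C303)]:
    ta, value = load_operand(*flags)
    print(f"TA={ta:05X} value={value:06X}")
```

The six calls cover PC-relative, base-relative with indexing, indirect, immediate, the standard SIC compatibility case, and Format 4 direct addressing, in that order.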
Instruction Set
(B)  = 006000
(PC) = 003000
(X)  = 000090

Address  Contents
3030     003600
3600     103000
6390     00C303
C303     003030
(a)
SUBF, MULF, DIVF). There are also instructions that take their operands from
registers. Besides the RMO (register move) instruction, these include
register-to-register arithmetic operations (ADDR, SUBR, MULR, DIVR). A
special supervisor call instruction (SVC) is provided. Executing this
instruction generates an interrupt that can be used for communication with the
operating system. (Supervisor calls and interrupts are discussed in Chapter
6.)
There are also several other new instructions. Appendix A gives a complete
list of all SIC/XE instructions, with their operation codes and a
specification of the function performed by each.
The I/O instructions we discussed for SIC are also available on SIC/XE. In
addition, there are I/O channels that can be used to perform input and output
while the CPU is executing other instructions. This allows overlap of
computing and I/O, resulting in more efficient system operation. The
instructions SIO, TIO, and HIO are used to start, test, and halt the operation
of I/O channels. (These concepts are discussed in detail in Chapter 6.)
This section presents simple examples of SIC and SIC/XE assembler language
programming. These examples are intended to help you become more familiar
with the SIC and SIC/XE instruction sets and assembler language. It is
assumed that the reader is already familiar with the assembler language of at
least one machine and with the basic ideas involved in assembly-level
programming.
The primary subject of this book is systems programming, not assembler
language programming. The following chapters contain discussions of various
types of system software, and in some cases SIC programs are used to
illustrate the points being made. This section contains material that may help
you to understand these examples more easily. However, it does not contain any
new material on system software or systems programming. Thus, this section
can be skipped without any loss of continuity.
Figure 1.2 contains examples of data movement operations for SIC and
SIC/XE. There are no memory-to-memory move instructions; thus, all data
movement must be done using registers. Figure 1.2(a) shows two examples of
data movement. In the first, a 3-byte word is moved by loading it into
register A and then storing the register at the desired destination. Exactly
the same thing could be accomplished using register X (and the instructions
LDX, STX) or register L (LDL, STL). In the second example, a single byte of
data is moved using the instructions LDCH (Load Character) and STCH (Store
Character).
Figure 1.2 Sample data movement operations for (a) SIC and (b) SIC/XE.
The instructions shown in Fig. 1.2(a) would also work on SIC/XE; however,
they would not take advantage of the more advanced hardware features
available. Figure 1.2(b) shows the same two data-movement operations as
they might be written for SIC/XE. In this example, the value 5 is loaded into
register A using immediate addressing. The operand field for this instruction
contains the flag # (which specifies immediate addressing) and the data value
to be loaded. Similarly, the character “Z” is placed into register A by using
immediate addressing to load the value 90, which is the decimal value of the
ASCII code that is used internally to represent the character “Z”.
Figure 1.3(a) shows examples of arithmetic instructions for SIC. All
arithmetic operations are performed using register A, with the result being
left in register A. Thus this sequence of instructions stores the value
(ALPHA + INCR - 1) in BETA and the value (GAMMA + INCR - 1) in DELTA.
Figure 1.3(b) illustrates how the same calculations could be performed on
SIC/XE. The value of INCR is loaded into register S initially, and the
register-to-register instruction ADDR is used to add this value to register A
when it is needed. This avoids having to fetch INCR from memory each time it
is used in a calculation, which may make the program more efficient. Immediate
addressing is used for the constant 1 in the subtraction operations.
Looping and indexing operations are illustrated in Fig. 1.4. Figure 1.4(a)
shows a loop that copies one 11-byte character string to another. The index
register (register X) is initialized to zero before the loop begins. Thus,
during the first execution of the loop, the target address for the LDCH
instruction will be the address of the first byte of STR1. Similarly, the STCH
instruction will store the character being copied into the first byte of STR2.
The next instruction, TIX, performs two functions. First it adds 1 to the
value in register X, and then it compares the new value of register X to the
value of the operand (in this case, the constant value 11). The condition code
is set to indicate the result of this comparison. The JLT instruction jumps if
the condition code is set to “less than.” Thus, the JLT causes a jump back to
the beginning of the loop if the new value in register X is less than 11.
During the second execution of the loop, register X will contain the value
1. Thus, the target address for the LDCH instruction will be the second byte
of STR1, and the target address for the STCH instruction will be the second
byte of STR2. The TIX instruction will again add 1 to the value in register X,
and the loop will continue in this way until all 11 bytes have been copied
from STR1 to STR2. Notice that after the TIX instruction is executed, the
value in register X is equal to the number of bytes that have already been
copied.
Figure 1.4(b) shows the same loop as it might be written for SIC/XE. The
main difference is that the instruction TIXR is used in place of TIX. TIXR
works exactly like TIX, except that the value used for comparison is taken
from another register (in this case, register T), not from memory. This makes
the loop more efficient, because the value does not have to be fetched from
memory each time the loop is executed. Immediate addressing is used to
initialize register T to the value 11 and to initialize register X to 0.
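The behavior of the copy loop can be traced with a Python sketch. It models only what the text describes (LDCH and STCH indexed by X, then an increment-and-compare step followed by a jump on "less than"); the function name and the sample string are assumptions made for illustration.

```python
def copy_string(str1, length=11):
    """Copy `length` bytes from str1 into a new buffer, one byte per pass."""
    str2 = bytearray(length)
    x = 0                          # index register X starts at zero
    while True:
        a = str1[x]                # LDCH: load the character at STR1 + (X)
        str2[x] = a                # STCH: store it at STR2 + (X)
        x += 1                     # TIX, step 1: add 1 to X
        # TIX, step 2: compare X with the operand and set the condition code
        cc = "<" if x < length else ("=" if x == length else ">")
        if cc != "<":              # JLT: loop again only on "less than"
            break
    return str2


print(copy_string(b"TEST STRING"))
```

On SIC the comparison value comes from memory (TIX); on SIC/XE the same step takes it from a register (TIXR), which changes where `length` lives but not the control flow.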
Figure 1.3 Sample arithmetic operations for (a) SIC and (b) SIC/XE.
Figure 1.4 Sample looping and indexing operations for (a) SIC and (b) SIC/XE.
         LDS    #3          INITIALIZE REGISTER S TO 3
         LDT    #300        INITIALIZE REGISTER T TO 300
         LDX    #0          INITIALIZE INDEX REGISTER TO 0
ADDLP    LDA    ALPHA,X     LOAD WORD FROM ALPHA INTO REGISTER A
         ADD    BETA,X      ADD WORD FROM BETA
         STA    GAMMA,X     STORE THE RESULT IN A WORD IN GAMMA
         ADDR   S,X         ADD 3 TO INDEX VALUE
         COMPR  X,T         COMPARE NEW INDEX VALUE TO 300
         JLT    ADDLP       LOOP IF INDEX VALUE IS LESS THAN 300
(b)
Figure 1.5 Sample indexing and looping operations for (a) SIC and (b) SIC/XE.
In Fig. 1.5(a), we define a variable INDEX that holds the value to be used
for indexing for each iteration of the loop. Thus, INDEX should be 0 for the
first iteration, 3 for the second, and so on. INDEX is initialized to 0 before
the start of the loop. The first instruction in the body of the loop loads the
current value of INDEX into register X so that it can be used for target
address calculation. The next three instructions in the loop load a word from
ALPHA, add the corresponding word from BETA, and store the result in the
corresponding word of GAMMA. The value of INDEX is then loaded into register
A, incremented by 3, and stored back into INDEX. After being stored, the new
value of INDEX is still present in register A. This value is then compared to
300 (the length of the arrays in bytes) to determine whether or not to
terminate the loop. If the value of INDEX is less than 300, then not all bytes
of the arrays have been processed yet. In that case, the JLT instruction
causes a jump back to the beginning of the loop, where the new value of INDEX
is loaded into register X.
This particular loop is cumbersome on SIC, because register A must be
used for adding the array elements together and also for incrementing the
index value. The loop can be written much more efficiently for SIC/XE, as
shown in Fig. 1.5(b). In this example, the index value is kept permanently in
register X. The amount by which to increment the index value (3) is kept in
register S, and the register-to-register ADDR instruction is used to add this
increment to register X. Similarly, the value 300 is kept in register T, and
the instruction COMPR is used to compare registers X and T in order to decide
when to terminate the loop.
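The SIC/XE version of the loop can be mirrored in Python to make the role of each register visible. This is a sketch under assumptions: the arrays are modeled as 300-byte buffers of 3-byte words with the high-order byte first, sums are truncated to 24 bits, and the function name is hypothetical.

```python
def add_arrays(alpha, beta):
    """Add two 100-word arrays word by word, indexing by byte offset."""
    gamma = bytearray(300)
    s, t, x = 3, 300, 0            # registers S, T, and X
    while True:
        a = int.from_bytes(alpha[x:x + 3], "big")           # load ALPHA + (X)
        a = (a + int.from_bytes(beta[x:x + 3], "big")) & 0xFFFFFF  # add BETA + (X)
        gamma[x:x + 3] = a.to_bytes(3, "big")               # store GAMMA + (X)
        x += s                     # ADDR: add 3 to the index value
        if not x < t:              # COMPR X,T then JLT: loop while X < 300
            break
    return gamma


alpha = b"".join(k.to_bytes(3, "big") for k in range(100))
beta = b"".join((1000).to_bytes(3, "big") for _ in range(100))
gamma = add_arrays(alpha, beta)
print(int.from_bytes(gamma[0:3], "big"), int.from_bytes(gamma[297:300], "big"))
```

The index advances by 3 rather than 1 because X holds a byte offset and each array element is a 3-byte word.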
Figure 1.6 shows a simple example of input and output on SIC; the same
instructions would also work on SIC/XE. (The more advanced input and output
facilities available on SIC/XE, such as I/O channels and interrupts, are
discussed in Chapter 6.) This program fragment reads 1 byte of data from
device F1 and copies it to device 05. The actual input of data is performed
using the RD (Read Data) instruction. The operand for the RD is a byte in
memory that contains the hexadecimal code for the input device (in this case,
F1). Executing the RD instruction transfers 1 byte of data from this device
into the rightmost byte of register A. If the input device is
character-oriented (for example, a keyboard), the value placed in register A
is the ASCII code for the character that was read.
Before the RD can be executed, however, the input device must be ready to
transmit the data. For example, if the input device is a keyboard, the operator
must have typed a character. The program checks for this by using the TD
(Test Device) instruction. When the TD is executed, the status of the addressed
device is tested and the condition code is set to indicate the result of this
test. If the device is ready to transmit data, the condition code is set to
“less than”; if the device is not ready, the condition code is set to “equal.”
As Fig. 1.6
illustrates, the program must execute the TD instruction and then check the
condition code by using a conditional jump. If the condition code is “equal”
(device not ready), the program jumps back to the TD instruction. This two-
instruction loop will continue until the device becomes ready; then the RD will
be executed.
Output is performed in the same way. First the program uses TD to check
whether the output device is ready to receive a byte of data. Then the byte to
be written is loaded into the rightmost byte of register A, and the WD (Write
Data) instruction is used to transmit it to the device.
Figure 1.7 shows how these instructions can be used to read a 100-byte
record from an input device into memory. The read operation in this example
is placed in a subroutine. This subroutine is called from the main program by
using the JSUB (Jump to Subroutine) instruction. At the end of the subroutine
there is an RSUB (Return from Subroutine) instruction, which returns control
to the instruction that follows the JSUB.
The READ subroutine itself consists of a loop. Each execution of this loop
reads 1 byte of data from the input device, using the same techniques
illustrated in Fig. 1.6. The bytes of data that are read are stored in a
100-byte buffer area labeled RECORD. The indexing and looping techniques that
are used in storing characters in this buffer are essentially the same as
those illustrated in Fig. 1.4(a).
Figure 1.7(b) shows the same READ subroutine as it might be written for
SIC/XE. The main differences from Fig. 1.7(a) are the use of immediate
addressing and the TIXR instruction, as was illustrated in Fig. 1.4(b).
         JSUB   READ        CALL READ SUBROUTINE
Figure 1.7 Sample subroutine call and record input operations for (a) SIC and
(b) SIC/XE.
1.4 TRADITIONAL (CISC) MACHINES
This section introduces the architectures of two of the machines that will be
used as examples later in the text. Section 1.4.1 describes the VAX
architecture, and Section 1.4.2 describes the architecture of the Intel x86
family of processors.
The machines described in this section are classified as Complex Instruction
Set Computers (CISC). CISC machines generally have a relatively large and
complicated instruction set, several different instruction formats and
lengths, and many different addressing modes. Thus the implementation of such
an architecture in hardware tends to be complex.
You may want to compare the examples in this section with the Reduced
Instruction Set Computer (RISC) examples in Section 1.5. Further discussion of
CISC versus RISC designs can be found in Tabak (1995).
Memory
The VAX memory consists of 8-bit bytes. All addresses used are byte
addresses. Two consecutive bytes form a word; four bytes form a longword;
eight bytes form a quadword; sixteen bytes form an octaword. Some operations
are more efficient when operands are aligned in a particular way (for example,
a longword operand that begins at a byte address that is a multiple of 4).
All VAX programs operate in a virtual address space of 2^32 bytes. This
virtual memory allows programs to operate as though they had access to an
extremely large memory, regardless of the amount of memory actually present
on the system. Routines in the operating system take care of the details of
memory management. We discuss virtual memory in connection with our study of
operating systems in Chapter 6. One half of the VAX virtual address space is
called system space, which contains the operating system, and is shared by all
programs. The other half of the address space is called process space, and
is defined separately for each program. A part of the process space contains
stacks that are available to the program. Special registers and machine
instructions aid in the use of these stacks.
Registers
Data Formats
is used to represent numeric values with one digit per byte. In this format,
the sign may appear either in the last byte, or as a separate byte preceding
the first digit. These two variations are called trailing numeric and leading
separate numeric.
VAX also supports queues and variable-length bit strings. Data structures
such as these can, of course, be implemented on any machine; however, VAX
provides direct hardware support for them. There are single machine
instructions that insert and remove entries in queues, and perform a variety
of operations on bit strings. The existence of such powerful machine
instructions and complex primitive data types is one of the more unusual
features of the VAX architecture.
Instruction Formats
Addressing Modes
VAX provides a large number of addressing modes. With few exceptions, any
of these addressing modes may be used with any instruction. The operand
itself may be in a register (register mode), or its address may be specified
by a register (register deferred mode). If the operand address is in a
register, the register contents may be automatically incremented or
decremented by the operand length (autoincrement and autodecrement modes).
There are several base relative addressing modes, with displacement fields of
different lengths; when used with register PC, these become program-counter
relative modes. All of these addressing modes may also include an index
register, and many of them are available in a form that specifies indirect
addressing (called deferred modes on VAX). In addition, there are immediate
operands and several special-purpose addressing modes. For further details,
see Baase (1992).
Instruction Set
One of the goals of the VAX designers was to produce an instruction set that
is symmetric with respect to data type. Many instruction mnemonics are formed
by combining the following elements:
For example, the instruction ADDW2 is an add operation with two operands,
each a word in length. Likewise, MULL3 is a multiply operation with three
longword operands, and CVTWL specifies a conversion from word to longword.
(In the latter case, a two-operand instruction is assumed.) For a typical
instruction, operands may be located in registers, in memory, or in the
instruction itself (immediate addressing). The same machine instruction code
is used, regardless of operand locations.
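The way these mnemonics decompose can be illustrated with a small Python sketch. It is a toy parser, not a description of any real assembler: it assumes the suffix letters B, W, L, Q, and O for byte, word, longword, quadword, and octaword, handles only an optional operand count of 2 or 3, and treats CVT as the one operation that carries two type letters.

```python
import re

# Assumption: suffix letters for the integer data types named earlier.
DATA_TYPES = {"B": "byte", "W": "word", "L": "longword",
              "Q": "quadword", "O": "octaword"}
TYPE_LETTERS = "".join(DATA_TYPES)


def parse_mnemonic(mnemonic):
    """Split a VAX-style mnemonic into operation, data types, operand count."""
    convert = re.fullmatch(rf"CVT([{TYPE_LETTERS}])([{TYPE_LETTERS}])", mnemonic)
    if convert:
        source, target = convert.groups()
        # A conversion is assumed to be a two-operand instruction.
        return "CVT", [DATA_TYPES[source], DATA_TYPES[target]], 2
    match = re.fullmatch(rf"([A-Z]+?)([{TYPE_LETTERS}])([23])?", mnemonic)
    if not match:
        raise ValueError(f"unrecognized mnemonic: {mnemonic}")
    operation, type_letter, count = match.groups()
    return operation, [DATA_TYPES[type_letter]], int(count) if count else None


for name in ("ADDW2", "MULL3", "CVTWL"):
    print(name, parse_mnemonic(name))
```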
VAX provides all of the usual types of instructions for computation, data
movement and conversion, comparison, branching, etc. In addition, there are a
number of operations that are much more complex than the machine instructions
found on most computers. These operations are, for the most part, hardware
realizations of frequently occurring sequences of code. They are implemented
as single instructions for efficiency and speed. For example, VAX provides
instructions to load and store multiple registers, and to manipulate queues
and variable-length bit fields. There are also powerful instructions for
calling and returning from procedures. A single instruction saves a designated
set of registers, passes a list of arguments to the procedure, maintains the
stack, frame, and argument pointers, and sets a mask to enable error traps for
arithmetic operations. For further information on all of the VAX instructions,
see Baase (1992).
Input and output on the VAX are accomplished by I/O device controllers.
Each controller has a set of control/status and data registers, which are
assigned locations in the physical address space. The portion of the address
space into which the device controller registers are mapped is called I/O
space.
No special instructions are required to access registers in I/O space. An
I/O device driver issues commands to the device controller by storing values
into the appropriate registers, exactly as if they were physical memory
locations. Likewise, software routines may read these registers to obtain
status information. The association of an address in I/O space with a physical
register in a device controller is handled by the memory management routines.
The Pentium Pro microprocessor, introduced near the end of 1995, is the latest
in the Intel x86 family. Other recent microprocessors in this family are the
80486 and Pentium. Processors of the x86 family are presently used in a major-
ity of personal computers, and there is a vast amount of software for these
processors. It is expected that additional generations of the x86 family will be
developed in the future.
The various x86 processors differ in implementation details and operating
speed. However, they share the same basic architecture. Each succeeding
generation has been designed to be compatible with the earlier versions. This
section contains an overview of the x86 architecture, which will serve as
background for the examples to be discussed later in the book. Further
information about the x86 family can be found in Intel (1995), Anderson and
Shanley (1995), and Tabak (1995).
Memory
Memory in the x86 architecture can be described in at least two different ways.
At the physical level, memory consists of 8-bit bytes. All addresses used are
byte addresses. Two consecutive bytes form a word; four bytes form a
doubleword (also called a dword). Some operations are more efficient when
operands are aligned in a particular way (for example, a doubleword operand
that begins at a byte address that is a multiple of 4).
However, programmers usually view the x86 memory as a collection of
segments. From this point of view, an address consists of two parts: a segment
number and an offset that points to a byte within the segment. Segments can
be of different sizes, and are often used for different purposes. For example,
some segments may contain executable instructions, and other segments may
be used to store data. Some data segments may be treated as stacks that can be
used to save register contents, pass parameters to subroutines, and for other
purposes.
It is not necessary for all of the segments used by a program to be in
physical memory. In some cases, a segment can also be divided into pages. Some
of the pages of a segment may be in physical memory, while others may be
stored on disk. When an x86 instruction is executed, the hardware and the
operating system make sure that the needed byte of the segment is loaded into
physical memory. The segment/offset address specified by the programmer is
automatically translated into a physical byte address by the x86 Memory
Management Unit (MMU).
Registers
There are eight general-purpose registers, which are named EAX, EBX, ECX,
EDX, ESI, EDI, EBP, and ESP. Each general-purpose register is 32 bits long (i.e.,
one doubleword). Registers EAX, EBX, ECX, and EDX are generally used for
data manipulation; it is possible to access individual words or bytes from
these registers. The other four registers can also be used for data, but are more
commonly used to hold addresses. The general-purpose register set is identical
for all members of the x86 family beginning with the 80386. This set is also
compatible with the more limited register sets found in earlier members of the
family.
There are also several different types of special-purpose registers in the x86
architecture. EIP is a 32-bit register that contains a pointer to the next instruction
to be executed. FLAGS is a 32-bit register that contains many different bit
flags. Some of these flags indicate the status of the processor; others are used
to record the results of comparisons and arithmetic operations. There are also
six 16-bit segment registers that are used to locate segments in memory.
Segment register CS contains the address of the currently executing code seg-
ment, and SS contains the address of the current stack segment. The other seg-
ment registers (DS, ES, FS, and GS) are used to indicate the addresses of data
segments.
Floating-point computations are performed using a special floating-point
unit (FPU). This unit contains eight 80-bit data registers and several other control
and status registers.
All of the registers discussed so far are available to application programs.
There are also a number of registers that are used only by system programs
such as the operating system. Some of these registers are used by the MMU to
translate segment addresses into physical addresses. Others are used to con-
trol the operation of the processor, or to support debugging operations.
Data Formats
The x86 architecture provides for the storage of integers, floating-point values,
characters, and strings. Integers are normally stored as 8-, 16-, or 32-bit binary
numbers. Both signed and unsigned integers (also called ordinals) are sup-
ported; 2’s complement is used for negative values. The FPU can also handle
64-bit signed integers. In memory, the least significant part of a numeric value
is stored at the lowest-numbered address. (This is commonly called
little-endian byte ordering, because the “little end” of the value comes first in
memory.)
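The byte ordering described above can be seen in a minimal Python sketch. The value 0x12345678 and the variable names are our own choices for illustration; Python's built-in integer conversion is used here only to mimic the memory layout, not to model the x86 hardware.

```python
# Illustration only: how a 32-bit value is laid out in little-endian order.
value = 0x12345678

# Byte 0 of the result is the byte stored at the lowest-numbered address.
little = value.to_bytes(4, byteorder="little")
print(little.hex(" "))    # 78 56 34 12

# Reading the bytes back in the same order recovers the original value.
restored = int.from_bytes(little, byteorder="little")

# A negative 16-bit word in 2's complement: -2 is FFFE, stored as FE FF.
negative = (-2).to_bytes(2, byteorder="little", signed=True)
print(negative.hex(" "))  # fe ff
```

The least significant byte (78) appears first, which is exactly the "little end first" arrangement.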
Integers can also be stored in binary coded decimal (BCD). In the unpacked
BCD format, each byte represents one decimal digit. The value of this digit is
encoded (in binary) in the low-order 4 bits of the byte; the high-order bits are
normally zero. In the packed BCD format, each byte represents two decimal
digits, with each digit encoded using 4 bits of the byte.
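A short Python sketch may help to contrast the two BCD formats. The function names are hypothetical, the input is assumed to be a non-negative integer, and the most significant digit is placed first purely for readability; the ordering of multi-byte BCD values in memory is not addressed here.

```python
def to_unpacked_bcd(n: int) -> bytes:
    """One decimal digit per byte, held in the low-order 4 bits."""
    return bytes(int(d) for d in str(n))


def to_packed_bcd(n: int) -> bytes:
    """Two decimal digits per byte, 4 bits each."""
    digits = str(n)
    if len(digits) % 2:            # pad to an even number of digits
        digits = "0" + digits
    return bytes(
        (int(digits[i]) << 4) | int(digits[i + 1])
        for i in range(0, len(digits), 2)
    )


print(to_unpacked_bcd(1995).hex(" "))  # 01 09 09 05
print(to_packed_bcd(1995).hex(" "))    # 19 95
```

The packed form needs half as many bytes, at the cost of shifting and masking to reach an individual digit.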
There are three different floating-point data formats. The single-precision
format is 32 bits long. It provides 24 significant bits of the floating-point value
(23 are stored; the leading 1 bit is implied) and an 8-bit exponent field (power
of 2). (The remaining bit is used to store the sign of the floating-point value.)
The double-precision format is 64 bits long. It provides 53 significant bits (52
stored) and an 11-bit exponent field. The extended-precision format is 80 bits
long. It stores 64 significant bits, and allows for a 15-bit exponent.
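The field layout of the single-precision format can be examined with a small Python sketch. It assumes the IEEE 754 layout used by the x86 FPU (1 sign bit, an 8-bit biased exponent field, and 23 stored fraction bits); the function name is our own.

```python
import struct


def single_fields(x: float):
    """Split a 32-bit single-precision value into (sign, exponent, fraction)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8-bit exponent field, biased by 127
    fraction = bits & 0x7FFFFF        # 23 explicitly stored significand bits
    return sign, exponent, fraction


print(single_fields(1.0))    # (0, 127, 0)
print(single_fields(-6.5))   # (1, 129, 5242880)
```

For -6.5 = -1.625 x 2^2, the exponent field holds 127 + 2 = 129 and the fraction holds 0.625 x 2^23; the leading 1 of the significand is not stored.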
Characters are stored one per byte, using their 8-bit ASCII codes. Strings
may consist of bits, bytes, words, or doublewords; special instructions are
provided to handle each type of string.
Instruction Formats
All of the x86 machine instructions use variations of the same basic format.
This format begins with optional prefixes containing flags that modify the op-
eration of the instruction. For example, some prefixes specify a repetition
count for an instruction. Others specify a segment register that is to be used
for addressing an operand (overriding the normal default assumptions made
by the hardware). Following the prefixes (if any) is an opcode (1 or 2 bytes);
some operations have different opcodes, each specifying a different variant of
the operation. Following the opcode are a number of bytes that specify the
operands and addressing modes to be used. (See the description of addressing
modes in the next section for further information.)
The opcode is the only element that is always present in every instruction.
Other elements may or may not be present, and may be of different lengths,
depending on the operation and the operands involved. Thus, there are a large
number of different potential instruction formats, varying in length from
1 byte to 10 bytes or more.
Addressing Modes
Instruction Set
The x86 architecture has a large and complex instruction set, containing more
than 400 different machine instructions. An instruction may have zero, one,
two, or three operands. There are register-to-register instructions, register-to-
memory instructions, and a few memory-to-memory instructions. In some
cases, operands may also be specified in the instruction as immediate values.
Most data movement and integer arithmetic instructions can use operands
that are 1, 2, or 4 bytes long. String manipulation instructions, which use repetition
prefixes, can deal directly with variable-length strings of bytes, words,
or doublewords. There are many instructions that perform logical and bit
manipulations, and support control of the processor and memory-management
systems.
The x86 architecture also includes special-purpose instructions to perform
operations frequently required in high-level programming languages—for ex-
ample, entering and leaving procedures and checking subscript values against
the bounds of an array.
This section introduces the architectures of three RISC machines that will be
used as examples later in the text. Section 1.5.1 describes the architecture of the
SPARC family of processors. Section 1.5.2 describes the PowerPC family of mi-
croprocessors for personal computers. Section 1.5.3 describes the architecture
of the Cray T3E supercomputing system.
All of these machines are examples of RISC (Reduced Instruction Set
Computers), in contrast to traditional CISC (Complex Instruction Set
Computer) implementations such as Pentium and VAX. The RISC concept, de-
veloped in the early 1980s, was intended to simplify the design of processors.
This simplified design can result in faster and less expensive processor development,
greater reliability, and faster instruction execution times.
In general, a RISC system is characterized by a standard, fixed instruction
length (usually equal to one machine word), and single-cycle execution of
most instructions. Memory access is usually done by load and store instructions
only. All instructions except for load and store are register-to-register operations.
There are typically a relatively large number of general-purpose
registers. The number of machine instructions, instruction formats, and addressing
modes is relatively small.
The discussions in the following sections will illustrate some of these RISC
characteristics. Further information about the RISC approach, including its advantages
and disadvantages, can be found in Tabak (1995).
Memory
Memory consists of 8-bit bytes; all addresses used are byte addresses. Two
consecutive bytes form a halfword; four bytes form a word; eight bytes form a
doubleword. Halfwords are stored in memory beginning at byte addresses that
are multiples of 2. Similarly, words begin at addresses that are multiples of 4,
and doublewords at addresses that are multiples of 8.
UltraSPARC programs can be written using a virtual address space of
2^64 bytes. This address space is divided into pages; multiple page sizes are supported.
Some of the pages used by a program may be in physical memory,
while others may be stored on disk. When an instruction is executed, the hardware
and the operating system make sure that the needed page is loaded into
physical memory. The virtual address specified by the instruction is automatically
translated into a physical address by the UltraSPARC Memory Management
Unit (MMU). Chapter 6 contains a brief discussion of methods that can
be used in this kind of address translation.
Registers
Data Formats
Instruction Formats
There are three basic instruction formats in the SPARC architecture. All of
these formats are 32 bits long; the first 2 bits of the instruction word identify
which formatis being used. Format 1 is used for the Call instruction. Format 2
is used for branch instructions (and one special instruction that enters a value
into a register). The remaining instructions use Format 3, which provides for
register loads and stores, and three-operand arithmetic operations.
The fixed instruction length in the SPARC architecture is typical of RISC
systems, and is intended to speed the process of instruction fetching and de-
coding. Compare this approach with the complex variable-length instructions
found on CISC systems such as VAX and x86.
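Because every SPARC instruction is one 32-bit word whose first 2 bits select the format, format identification reduces to a shift and a mask. The sketch below illustrates this. The specific 2-bit values (01 for Call, 00 for branches and the special register-loading instruction, 10 and 11 for Format 3) and the sample instruction words are taken from our recollection of the SPARC architecture manual and should be treated as assumptions; the function name is hypothetical.

```python
def sparc_format(word: int) -> int:
    """Return the format number (1, 2, or 3) of a 32-bit SPARC instruction."""
    op = (word >> 30) & 0b11       # the first (high-order) 2 bits
    if op == 0b01:
        return 1                   # Call instruction
    if op == 0b00:
        return 2                   # branches and the special SETHI instruction
    return 3                       # loads, stores, three-operand arithmetic


print(sparc_format(0x40000000))    # 1  (a call with zero displacement)
print(sparc_format(0x01000000))    # 2  (nop, encoded as a SETHI)
print(sparc_format(0x86004002))    # 3  (add %g1, %g2, %g3)
```

A variable-length CISC decoder, by contrast, cannot even locate the next instruction until it has examined prefixes, opcode, and addressing bytes of the current one.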
Addressing Modes
Instruction Set
1.5.2 PowerPC Architecture
IBM first introduced the POWER architecture early in 1990 with the RS/6000.
(POWER is an acronym for Performance Optimization With Enhanced RISC.)
It was soon realized that this architecture could form the basis for a new fam-
ily of powerful and low-cost microprocessors. In October 1991, IBM, Apple,
and Motorola formed an alliance to develop and market such microprocessors,
which were named PowerPC. The first products using PowerPC chips were
delivered near the end of 1993. Recent implementations of the PowerPC archi-
tecture include the PowerPC 601, 603, and 604; others are expected in the near
future.
As its name implies, PowerPC is a RISC architecture. As we shall see, it has
much in common with other RISC systems such as SPARC. There are also a
few differences in philosophy, which we will note in the course of the discus-
sion. This section contains an overview of the PowerPC architecture, which
will serve as background for the examples to be discussed later in the book.
Further information about PowerPC can be found in IBM (1994a) and Tabak
(1995).
Memory
Memory consists of 8-bit bytes; all addresses used are byte addresses. Two
consecutive bytes form a halfword; four bytes form a word; eight bytes form a
doubleword; sixteen bytes form a quadword. Many instructions may execute
Registers
Data Formats
Instruction Formats
There are seven basic instruction formats in the PowerPC architecture, some of
which have subforms. All of these formats are 32 bits long. Instructions must
be aligned beginning at a word boundary (i.e., a byte address that is a multiple
of 4). The first 6 bits of the instruction word always specify the opcode; some
instruction formats also have an additional “extended opcode”field.
The fixed instruction length in the PowerPC architecture is typical of RISC
systems. The variety and complexity of instruction formats is greater than that
found on most RISC systems (such as SPARC). However, the fixed length
makes instruction decoding faster and simpler than on CISC systems like VAX
and x86.
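The two fixed properties mentioned above (word alignment and a 6-bit opcode at the start of every instruction) can be checked with a few lines of Python. The function names are hypothetical, and the sample word 0x38600001 with primary opcode 14 (an add-immediate instruction) is quoted from memory as an illustration rather than taken from this text.

```python
def is_word_aligned(address: int) -> bool:
    """Instructions must begin at a byte address that is a multiple of 4."""
    return address % 4 == 0


def primary_opcode(word: int) -> int:
    """The first 6 bits of the 32-bit instruction word."""
    return (word >> 26) & 0x3F


print(is_word_aligned(0x1000), is_word_aligned(0x1002))  # True False
print(primary_opcode(0x38600001))                        # 14
```

Formats that carry an "extended opcode" need a second field extraction, whose position depends on the format; that step is omitted here.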
Addressing Modes
Instruction Set
Memory
Each processing element in the T3E has its own local memory with a capacity
of from 64 megabytes to 2 gigabytes. The local memory within each PE is part
Registers
Data Formats
The Alpha architecture provides for the storage of integers, floating-point values,
and characters. Integers are stored as longwords or quadwords; 2's complement
is used for negative values. When interpreted as an integer, the bits of
a longword or quadword have steadily increasing significance beginning with
bit 0 (which is stored in the lowest-addressed byte).
There are two different types of floating-point data formats in the Alpha
architecture. One group of three formats is included for compatibility with the
VAX architecture. The other group consists of four IEEE standard formats,
which are compatible with those used on most modern systems.
Characters may be stored one per byte, using their 8-bit ASCII codes.
However, there are no byte load or store operations in the Alpha architecture;
only longwords and quadwords can be transferred between a register and
memory. As a consequence, characters that are to be manipulated separately
are usually stored one per longword.
Instruction Formats
There are five basic instruction formats in the Alpha architecture, some of
which have subforms. All of these formats are 32 bits long. (As we have noted
before, this fixed length is typical of RISC systems.) The first 6 bits of the instruction
word always specify the opcode; some instruction formats also have
an additional “function” field.
Addressing Modes
Register indirect with displacement mode is used for load and store operations
and for subroutine jumps. PC-relative mode is used for conditional and
unconditional branches.
Instruction Set
The T3E system performs I/O through multiple ports into one or more I/O
channels, which can be configured in a number of ways. These channels are
integrated into the network that interconnects the processing nodes. A system
may be configured with up to one I/O channel for every eight PEs. All chan-
nels are accessible and controllable from all PEs.
Further information about this “scalable” I/O architecture can be found in
Cray Research (1995c).
EXERCISES
Section 1.3
integer portion of BETA + GAMMA. Assume that ALPHA and BETA
are defined as in Fig. 1.3(a).
Write a sequence of instructions for SIC/XE to divide BETA by
GAMMA, setting ALPHA to the integer portion of the quotient and
DELTA to the remainder. Use register-to-register instructions to make
the calculation as efficient as possible.
Write a sequence of instructions for SIC/XE to divide BETA by
GAMMA, setting ALPHA to the value of the quotient, rounded to
the nearest integer. Use register-to-register instructions to make the
calculation as efficient as possible.
Write a sequence of instructions for SIC to clear a 20-byte string to all
blanks.
Write a sequence of instructions for SIC/XE to clear a 20-byte string
to all blanks. Use immediate addressing and register-to-register instructions
to make the process as efficient as possible.
Suppose that ALPHA is an array of 100 words, as defined in Fig.
1.5(a). Write a sequence of instructions for SIC to set all 100 elements
of the array to 0.
Suppose that ALPHA is an array of 100 words, as defined in Fig.
1.5(b). Write a sequence of instructions for SIC/XE to set all 100
Chapter 2
Assemblers
to introduce concepts and techniques that can be used in new and unfamiliar
situations.
Section 2.4 examines some important alternative design schemes for an assembler.
These are features of an assembler that are not reflected in the assembler
language. For example, some assemblers process a source program in one
pass instead of two; other assemblers may make more than two passes. We are
concerned with the implementation of such assemblers, and also with the environments
in which each might be useful.
Finally, in Section 2.5 we briefly consider some examples of actual assem-
blers for real machines. We do not attempt to discuss all aspects of these assemblers
in detail. Instead, we focus on the most interesting features that are
introduced by hardware or software design decisions.
The program contains a main routine that reads records from an input device
(identified with device code F1) and copies them to an output device
(code 05). This main routine calls subroutine RDREC to read a record into a
buffer and subroutine WRREC to write the record from the buffer to the output
device. Each subroutine must transfer the record one character at a time
because the only I/O instructions available are RD and WD. The buffer is necessary
because the I/O rates for the two devices, such as a disk and a slow
printing terminal, may be very different. (In Chapter 6, we see how to use
channel programs and operating system calls on a SIC/XE system to accomplish
the same functions.) The end of each record is marked with a null character
(hexadecimal 00). If a record is longer than the length of the buffer (4096
bytes), only the first 4096 bytes are copied. (For simplicity, the program does
not deal with error recovery when a record containing 4096 bytes or more is
read.) The end of the file to be copied is indicated by a zero-length record.
When the end of file is detected, the program writes EOF on the output device
and terminates by executing an RSUB instruction. We assume that this program
was called by the operating system using a JSUB instruction; thus, the
RSUB will return control to the operating system.
 85   THREE    WORD    3
 90   ZERO     WORD    0
 95   RETADR   RESW    1
100   LENGTH   RESW    1            LENGTH OF RECORD
105   BUFFER   RESB    4096         4096-BYTE BUFFER AREA
110   .
115   .        SUBROUTINE TO READ RECORD INTO BUFFER
120   .
125   RDREC    LDX     ZERO         CLEAR LOOP COUNTER
130            LDA     ZERO         CLEAR A TO ZERO
135   RLOOP    TD      INPUT        TEST INPUT DEVICE
140            JEQ     RLOOP        LOOP UNTIL READY
145            RD      INPUT        READ CHARACTER INTO REGISTER A
150            COMP    ZERO         TEST FOR END OF RECORD (X'00')
155            JEQ     EXIT         EXIT LOOP IF EOR
160            STCH    BUFFER,X     STORE CHARACTER IN BUFFER
165            TIX     MAXLEN       LOOP UNLESS MAX LENGTH
170            JLT     RLOOP          HAS BEEN REACHED
175   EXIT     STX     LENGTH       SAVE RECORD LENGTH
180            RSUB                 RETURN TO CALLER
185   INPUT    BYTE    X'F1'        CODE FOR INPUT DEVICE
190   MAXLEN   WORD    4096
195   .
200   .        SUBROUTINE TO WRITE RECORD FROM BUFFER
205   .
210   WRREC    LDX     ZERO         CLEAR LOOP COUNTER
215   WLOOP    TD      OUTPUT       TEST OUTPUT DEVICE
220            JEQ     WLOOP        LOOP UNTIL READY
225            LDCH    BUFFER,X     GET CHARACTER FROM BUFFER
230            WD      OUTPUT       WRITE CHARACTER
235            TIX     LENGTH       LOOP UNTIL ALL CHARACTERS
240            JLT     WLOOP          HAVE BEEN WRITTEN
245            RSUB                 RETURN TO CALLER
250   OUTPUT   BYTE    X'05'        CODE FOR OUTPUT DEVICE
255            END     FIRST
Figure 2.2 shows the same program as in Fig. 2.1, with the generated object
code for each statement. The column headed Loc gives the machine address
(in hexadecimal) for each part of the assembled program. We have assumed
that the program starts at address 1000. (In an actual assembler listing, of
course, the comments would be retained; they have been eliminated here to
save space.)
The translation of source program to object code requires us to accomplish
the following functions (not necessarily in the order given):
Header record:
Col. 1      H
Col. 2-7    Program name
Col. 8-13   Starting address of object program (hexadecimal)
Col. 14-19  Length of object program in bytes (hexadecimal)
Text record:
Col. 1      T
Col. 2-7    Starting address for object code in this record (hexadecimal)
Col. 8-9    Length of object code in this record in bytes (hexadecimal)
Col. 10-69  Object code, represented in hexadecimal (2 columns per
            byte of object code)
End record:
Col. 1      E
Col. 2-7    Address of first executable instruction in object program
            (hexadecimal)
To avoid confusion, we have used the term column rather than byte to refer to
positions within object program records. This is not meant to imply the use of
any particular medium for the object program.
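The three record layouts can be expressed directly as formatting functions. The following Python sketch is one possible rendering, not part of the assembler described in the text; the function names are our own, and the sample values (program name COPY, starting address 1000, length 107A) are used only for illustration.

```python
def header_record(name, start, length):
    """Col. 1 'H', col. 2-7 name, col. 8-13 start, col. 14-19 length."""
    return f"H{name:<6.6}{start:06X}{length:06X}"


def text_record(start, object_codes):
    """Col. 1 'T', col. 2-7 start, col. 8-9 byte count, col. 10-69 code."""
    body = "".join(object_codes)
    if len(body) > 60 or len(body) % 2:
        raise ValueError("object code must be whole bytes, at most 30 of them")
    return f"T{start:06X}{len(body) // 2:02X}{body}"


def end_record(first_executable):
    """Col. 1 'E', col. 2-7 address of first executable instruction."""
    return f"E{first_executable:06X}"


print(header_record("COPY", 0x1000, 0x107A))              # HCOPY  00100000107A
print(text_record(0x2073, ["382064", "4C0000", "05"]))    # T002073073820644C000005
print(end_record(0x1000))                                 # E001000
```

The 60-column limit on object code (30 bytes) is what forces Pass 2 to start a new Text record when the current one is full.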
Figure 2.3 shows the object program corresponding to Fig. 2.2, using this
format. In this figure, and in the other object programs we display, the symbol
^ is used to separate fields visually. Of course, such symbols are not present in
the actual object program. Note that there is no object code corresponding to
addresses 1033-2038. This storage is simply reserved by the loader for use by
the program during execution. (Chapter 3 contains a detailed discussion of the
operation of the loader.)
We can now give a general description of the functions of the two passes of
our simple assembler.
H^COPY  ^001000^00107A
T^001000^1E^141033^482039^001036^281030^301015^482061^3C1003^00102A^0C1039^00102D
T^00101E^15^0C1036^482061^081033^4C0000^454F46^000003^000000
T^002039^1E^041030^001030^E0205D^30203F^D8205D^281030^302057^549039^2C205E^38203F
T^002057^1C^101036^4C0000^F1^001000^041030^E02079^302064^509039^DC2079^2C1036
T^002073^07^382064^4C0000^05
E^001000
In the next section we discuss these functions in more detail, describe the in-
ternal tables required by the assembler, and give an overall description of the
logic flow of each pass.
Our simple assembler uses two major internal data structures: the Operation
Code Table (OPTAB) and the Symbol Table (SYMTAB). OPTAB is used to look
up mnemonic operation codes and translate them to their machine language
equivalents. SYMTAB is used to store values (addresses) assigned to labels.
We also need a Location Counter LOCCTR. This is a variable that is used
to help in the assignment of addresses. LOCCTR is initialized to the beginning
address specified in the START statement. After each source statement is
processed, the length of the assembled instruction or data area to be generated
is added to LOCCTR. Thus whenever we reach a label in the source program,
the current value of LOCCTR gives the address to be associated with that
label.
The Operation Code Table must contain (at least) the mnemonic operation
code and its machine language equivalent. In more complex assemblers, this
table also contains information about instruction format and length. During
Pass 1, OPTAB is used to look up and validate operation codes in the source
program. In Pass 2, it is used to translate the operation codes to machine language.
Actually, in our simple SIC assembler, both of these processes could be
done together in either Pass 1 or Pass 2. However, for a machine (such as
SIC/XE) that has instructions of different lengths, we must search OPTAB in
the first pass to find the instruction length for incrementing LOCCTR.
2.2 MACHINE-DEPENDENT
ASSEMBLER FEATURES
Pass 1:
begin
  read first input line
  if OPCODE = 'START' then
    begin
      save #[OPERAND] as starting address
      initialize LOCCTR to starting address
      write line to intermediate file
      read next input line
    end {if START}
  else
    initialize LOCCTR to 0
  while OPCODE ≠ 'END' do
    begin
      if this is not a comment line then
        begin
          if there is a symbol in the LABEL field then
            begin
              search SYMTAB for LABEL
              if found then
                set error flag (duplicate symbol)
              else
                insert (LABEL,LOCCTR) into SYMTAB
            end {if symbol}
          search OPTAB for OPCODE
          if found then
            add 3 {instruction length} to LOCCTR
          else if OPCODE = 'WORD' then
            add 3 to LOCCTR
          else if OPCODE = 'RESW' then
            add 3 * #[OPERAND] to LOCCTR
          else if OPCODE = 'RESB' then
            add #[OPERAND] to LOCCTR
          else if OPCODE = 'BYTE' then
            begin
              find length of constant in bytes
              add length to LOCCTR
            end {if BYTE}
          else
            set error flag (invalid operation code)
        end {if not a comment}
      write line to intermediate file
      read next input line
    end {while not END}
  write last line to intermediate file
  save (LOCCTR - starting address) as program length
end {Pass 1}
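The Pass 1 logic above can be sketched in runnable form. This Python version is a simplified illustration under several assumptions of our own: each source statement is already split into a (label, opcode, operand) tuple, comment lines and the intermediate file are omitted, the START operand is read as hexadecimal, and OPTAB holds only a handful of SIC mnemonics.

```python
# Hypothetical, abbreviated OPTAB: mnemonic -> machine opcode.
OPTAB = {"LDA": 0x00, "LDX": 0x04, "STA": 0x0C, "STL": 0x14,
         "JSUB": 0x48, "RSUB": 0x4C, "COMP": 0x28, "JEQ": 0x30}


def byte_length(operand):
    """Length in bytes of a BYTE constant such as C'EOF' or X'F1'."""
    kind, body = operand[0], operand[2:-1]
    if kind == "C":
        return len(body)            # one byte per character
    if kind == "X":
        return len(body) // 2       # two hex digits per byte
    raise ValueError(f"bad constant {operand}")


def pass1(source):
    """Return (SYMTAB, program length, errors) for a list of statements."""
    symtab, errors = {}, []
    start = 0
    if source and source[0][1] == "START":
        start = int(source[0][2], 16)
        source = source[1:]
    locctr = start
    for label, opcode, operand in source:
        if opcode == "END":
            break
        if label:
            if label in symtab:
                errors.append(f"duplicate symbol {label}")
            else:
                symtab[label] = locctr
        if opcode in OPTAB or opcode == "WORD":
            locctr += 3
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        elif opcode == "BYTE":
            locctr += byte_length(operand)
        else:
            errors.append(f"invalid operation code {opcode}")
    return symtab, locctr - start, errors


program = [
    ("COPY",   "START", "1000"),
    ("FIRST",  "STL",   "RETADR"),
    ("",       "JSUB",  "RDREC"),
    ("EOF",    "BYTE",  "C'EOF'"),
    ("THREE",  "WORD",  "3"),
    ("RETADR", "RESW",  "1"),
    ("BUFFER", "RESB",  "4096"),
    ("",       "END",   "FIRST"),
]
symtab, length, errors = pass1(program)
print({k: f"{v:04X}" for k, v in symtab.items()}, f"{length:04X}", errors)
```

Note that the forward reference to RDREC causes no difficulty here: Pass 1 only assigns addresses to labels and never looks up operands.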
Pass 2:
begin
  read first input line {from intermediate file}
  if OPCODE = 'START' then
    begin
      write listing line
      read next input line
    end {if START}
  write Header record to object program
  initialize first Text record
  while OPCODE ≠ 'END' do
    begin
      if this is not a comment line then
        begin
          search OPTAB for OPCODE
          if found then
            begin
              if there is a symbol in OPERAND field then
                begin
                  search SYMTAB for OPERAND
                  if found then
                    store symbol value as operand address
                  else
                    begin
                      store 0 as operand address
                      set error flag (undefined symbol)
                    end
                end {if symbol}
              else
                store 0 as operand address
              assemble the object code instruction
            end {if opcode found}
          else if OPCODE = 'BYTE' or 'WORD' then
            convert constant to object code
          if object code will not fit into the current Text record then
            begin
              write Text record to object program
              initialize new Text record
            end
          add object code to Text record
        end {if not comment}
      write listing line
      read next input line
    end {while not END}
  write last Text record to object program
  write End record to object program
  write last listing line
end {Pass 2}
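The step "assemble the object code instruction" in Pass 2 can be illustrated for the standard SIC machine. The sketch assumes the SIC instruction format of Section 1.3.1 (an 8-bit opcode, a 1-bit indexing flag x, and a 15-bit address); the function name is hypothetical, and the opcode values and symbol addresses in the examples are illustrative inputs.

```python
def assemble_sic(opcode, address=0, indexed=False):
    """Build a 24-bit SIC instruction and return it as 6 hex digits."""
    if not 0 <= address < 0x8000:
        raise ValueError("address does not fit in 15 bits")
    x_bit = 0x8000 if indexed else 0      # bit x sits just above the address
    word = (opcode << 16) | x_bit | address
    return f"{word:06X}"


# STCH BUFFER,X with STCH = 54 and BUFFER at address 1039
print(assemble_sic(0x54, 0x1039, indexed=True))   # 549039
# LDA ZERO with LDA = 00 and ZERO at address 1030
print(assemble_sic(0x00, 0x1030))                 # 001030
# RSUB has no operand, so 0 is stored as the operand address
print(assemble_sic(0x4C))                         # 4C0000
```

In a complete Pass 2, the opcode would come from OPTAB and the address from SYMTAB, with an undefined symbol producing address 0 and an error flag as in the algorithm above.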
(see line 70). Immediate operands are denoted with the prefix # (lines 25, 55,
133). Instructions that refer to memory are normally assembled using either
the program-counter relative or the base relative mode. The assembler directive
BASE (line 13) is used in conjunction with base relative addressing. (See
Section 2.2.1 for a discussion and examples.) If the displacements required for
both program-counter relative and base relative addressing are too large to fit
into a 3-byte instruction, then the 4-byte extended format (Format 4) must be
used. The extended instruction format is specified with the prefix + added to
the operation code in the source statement (see lines 15, 35, 65). It is the programmer’s
responsibility to specify this form of addressing when it is required.
The main differences between this version of the program and the version
in Fig. 2.1 involve the use of register-to-register instructions (in place of
register-to-memory instructions) wherever possible. For example, the statement on
line 150 is changed from COMP ZERO to COMPR A,S. Similarly, line 165 is
changed from TIX MAXLEN to TIXR T. In addition, immediate and indirect
addressing have been used as much as possible (for example, lines 25, 55, and
70).
These changes take advantage of the more advanced SIC/XE architecture
to improve the execution speed of the program. Register-to-register instructions
are faster than the corresponding register-to-memory operations because
they are shorter, and, more importantly, because they do not require another
memory reference. (Fetching an operand from a register is much faster than
retrieving it from main memory.) Likewise, when using immediate addressing,
the operand is already present as part of the instruction and need not be
fetched from anywhere. The use of indirect addressing often avoids the need
for another instruction (as in the “return” operation on line 70). You may notice
that some of the changes require the addition of other instructions to the
program. For example, changing COMP to COMPR on line 150 forces us to
add the CLEAR instruction on line 132. This still results in an improvement in
execution speed. The CLEAR is executed only once for each record read,
whereas the benefits of COMPR (as opposed to COMP) are realized for every
byte of data transferred.
In Section 2.2.1, we examine the assembly of this SIC/XE program, focus-
ing on the differences in the assembler that are required by the new addressing
modes. (You may want to briefly review the instruction formats and target address
calculations described in Section 1.3.2.) These changes are direct consequences
of the extended hardware functions.
Section 2.2.2 discusses an indirect consequence of the change to SIC/XE.
The larger main memory of SIC/XE means that we may have room to load
and run several programs at the same time. This kind of sharing of the machine
between programs is called multiprogramming. Such sharing often results
in more productive use of the hardware. (We discuss this concept, and its
Figure 2.6 shows the object code generated for each statement in the program
of Fig. 2.5. In this section we consider the translation of the source statements,
paying particular attention to the handling of different instruction formats and
different addressing modes. Note that the START statement now specifies a
beginning program address of 0. As we discuss in the next section, this indi-
cates a relocatable program. For the purposes of instruction assembly, how-
ever, the program will be translated exactly as if it were really to be loaded at
machine address 0.
Translation of register-to-register instructions such as CLEAR (line 125)
and COMPR (line 150) presents no new problems. The assembler must simply
convert the mnemonic operation code to machine language (using OPTAB)
and change each register mnemonic to its numeric equivalent. This translation
is done during Pass 2, at the same point at which the other types of instruc-
tions are assembled. The conversion of register mnemonics to numbers can be
done with a separate table; however, it is often convenient to use the symbol
table for this purpose. To do this, SYMTAB would be preloaded with the register
names (A, X, etc.) and their values (0, 1, etc.).
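The translation of a register-to-register instruction can be sketched as a table lookup followed by simple packing. The Python below assumes the 2-byte Format 2 layout of Section 1.3.2 (an 8-bit opcode followed by two 4-bit register numbers) and the usual SIC/XE register numbering; the table is kept separate from SYMTAB for clarity, the function name is hypothetical, and the opcode values (CLEAR = B4, COMPR = A0, TIXR = B8) are quoted from the SIC/XE instruction set.

```python
# Register mnemonics and their numeric equivalents.
REGISTERS = {"A": 0, "X": 1, "L": 2, "B": 3, "S": 4, "T": 5, "F": 6}


def assemble_format2(opcode, r1, r2=None):
    """Return the 4 hex digits of a Format 2 instruction."""
    n1 = REGISTERS[r1]
    n2 = REGISTERS[r2] if r2 is not None else 0   # unused field is zero
    return f"{opcode:02X}{n1:X}{n2:X}"


print(assemble_format2(0xB4, "X"))        # CLEAR X    -> B410
print(assemble_format2(0xA0, "A", "S"))   # COMPR A,S  -> A004
print(assemble_format2(0xB8, "T"))        # TIXR  T    -> B850
```

Preloading the register names into SYMTAB, as suggested above, would let the same lookup routine serve both labels and registers.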
Most of the register-to-memory instructions are assembled using either
program-counter relative or base relative addressing. The assembler must, in
either case, calculate a displacement to be assembled as part of the object instruction.
This is computed so that the correct target address results when the
displacement is added to the contents of the program counter (PC) or the base
register (B). Of course, the resulting displacement must be small enough to fit
in the 12-bit field in the instruction. This means that the displacement must be
between 0 and 4095 (for base relative mode) or between -2048 and +2047 (for
program-counter relative mode).
If neither program-counter relative nor base relative addressing can be
used (because the displacements are too large), then the 4-byte extended in-
struction format (Format 4) must be used. This 4-byte format contains a 20-bit
address field, which is large enough to contain the full memory address. In
this case, there is no displacement to be calculated. For example, in the instruc-
tion
the operand address is 1036. This full address is stored in the instruction, with
bit e set to 1 to indicate extended instruction format.
Note that the programmer must specify the extended format by using the
prefix + (as on line 15). If extended format is not specified, our assembler first
attempts to translate the instruction using program-counter relative addressing.
If this is not possible (because the required displacement is out of range),
the assembler then attempts to use base relative addressing. If neither form of
relative addressing is applicable and extended format is not specified, then the
instruction cannot be properly assembled. In this case, the assembler must
generate an error message.
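The order in which the assembler tries the addressing modes can be summarized in a short Python sketch. The function name and return convention are our own, and the addresses in the examples are illustrative values; `pc` stands for the address of the next instruction, and `base` for the value declared by the most recent BASE directive (or None if there is none).

```python
def choose_displacement(target, pc, base=None):
    """Return (mode, 12-bit displacement field) for a Format 3 instruction."""
    disp = target - pc
    if -2048 <= disp <= 2047:                 # try PC relative first
        return "pc", disp & 0xFFF             # 2's complement in 12 bits
    if base is not None and 0 <= target - base <= 4095:
        return "base", target - base          # then base relative
    raise ValueError("displacement out of range; extended format required")


# Target 0030 from an instruction whose successor is at 0003
print(choose_displacement(0x0030, 0x0003))               # ('pc', 45)
# Target 0036 is too far from PC = 1051, but B contains 0033
print(choose_displacement(0x0036, 0x1051, base=0x0033))  # ('base', 3)
# Target 1036 is out of range for both modes
try:
    choose_displacement(0x1036, 0x0009, base=0x0033)
except ValueError as err:
    print("error:", err)
```

The final case corresponds to the situation in which the programmer should have written the + prefix; the assembler does not switch to Format 4 on its own.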
We now examine the details of the displacement calculation for program-counter
relative and base relative addressing modes. The computation that the
assembler needs to perform is essentially the target address calculation in
reverse. You may want to review this from Section 1.3.2.
The instruction
Here the operand address is 0006. During instruction execution, the program
counter will contain the address 0001A. Thus the displacement required is
6 - 1A = -14 (hexadecimal). This is represented (using 2’s complement for negative numbers) in
a 12-bit field as FEC, which is the displacement assembled into the object code.
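The arithmetic in this example is easy to verify. The following Python sketch, with hypothetical function names, encodes the displacement into a 12-bit 2's complement field and then reverses the process, as the hardware does during target address calculation.

```python
def encode_disp12(target, pc):
    """Displacement (target - pc) as a 12-bit 2's complement field."""
    disp = target - pc
    if not -2048 <= disp <= 2047:
        raise ValueError("displacement does not fit in 12 bits")
    return disp & 0xFFF


def decode_disp12(field):
    """Sign-extend a 12-bit field back to a signed integer."""
    return field - 0x1000 if field & 0x800 else field


field = encode_disp12(0x0006, 0x001A)
print(f"{field:03X}")                            # FEC
print(decode_disp12(field))                      # -20, i.e. -14 hexadecimal
print(f"{0x001A + decode_disp12(field):04X}")    # 0006, the operand address
```

Adding the decoded displacement to the program counter value recovers the operand address, which confirms that the assembler's computation is the target address calculation in reverse.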
The displacement calculation process for base relative addressing is much
the same as for program-counter relative addressing. The main difference is
that the assembler knows what the contents of the program counter will be at
execution time. The base register, on the other hand, is under control of the
programmer. Therefore, the programmer must tell the assembler what the
base register will contain during execution of the program so that the assembler
can compute displacements. This is done in our example with the assembler
directive BASE. The statement BASE LENGTH (line 13) informs the
assembler that the base register will contain the address of LENGTH. The preceding
instruction (LDB #LENGTH) loads this value into the register during
program execution. The assembler assumes for addressing purposes that register
B contains this address until it encounters another BASE statement. Later
in the program, it may be desirable to use register B for another purpose (for
example, as temporary storage for a data value). In such a case, the programmer
must use another assembler directive (perhaps NOBASE) to inform the
assembler that the contents of the base register can no longer be relied upon
for addressing.
It is important to understand that BASE and NOBASE are assembler
directives, and produce no executable code. The programmer must provide
instructions that load the proper value into the base register during
execution. If this is not done properly, the target address calculation will
not produce the correct operand address.
The instruction
is a typical example of this, with the operand stored in the instruction as 003,
and bit i set to 1 to indicate immediate addressing. Another example can be
found in the instruction
In this case the operand (4096) is too large to fit into the 12-bit displacement
field, so the extended instruction format is called for. (If the operand were too
large even for this 20-bit address field, immediate addressing could not be
used.)
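The size check implied by these two examples can be sketched as below. The function name is hypothetical. The sketch assumes a nonnegative immediate constant held either in the 12-bit displacement field (Format 3) or in the 20-bit address field (Format 4); in our assembler the programmer still has to request Format 4 with the + prefix.

```python
def immediate_format(value: int) -> int:
    """Return the instruction format (3 or 4) needed for an immediate constant."""
    if 0 <= value <= 0xFFF:        # fits in the 12-bit displacement field
        return 3
    if 0 <= value <= 0xFFFFF:      # fits only in the 20-bit address field
        return 4
    raise ValueError("constant too large for immediate addressing")


print(immediate_format(3), immediate_format(4096))
```

The constant 3 fits in Format 3, while 4096 is one more than the largest 12-bit value and therefore requires the extended format.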
A different way of using immediate addressing is shown in the instruction
In this statement the immediate operand is the symbol LENGTH. Since the
value of this symbol is the address assigned to it, this immediate instruction
has the effect of loading register B with the address of LENGTH. Note here
that we have combined program-counter relative addressing with immediate
addressing. Although this may appear unusual, the interpretation is consistent
with our previous uses of immediate operands. In general, the target address
calculation is performed; then, if immediate mode is specified, the target
address (not the contents stored at that address) becomes the operand. (In the
LDA statement on line 55, for example, bits x, b, and p are all 0. Thus the
target address is simply the displacement 003.)
The assembly of instructions that specify indirect addressing presents
nothing really new. The displacement is computed in the usual way to
produce the target address desired. Then bit n is set to indicate that the
contents stored at this location represent the address of the operand, not the
operand itself. Line 70 shows a statement that combines program-counter
relative and indirect addressing in this way.
from Fig. 2.2. In the object program (Fig. 2.3), this statement is translated as
00102D, specifying that register A is to be loaded from memory address 102D.
Suppose we attempt to load and execute the program at address 2000 instead
of address 1000. If we do this, address 102D will not contain the value that we
expect—in fact, it will probably be part of some other user’s program.
Obviously we need to make some change in the address portion of this
instruction so we can load and execute our program at address 2000. On the
other hand, there are parts of the program (such as the constant 3 generated
from line 85) that should remain the same regardless of where the program is
loaded. Looking at the object code alone, it is in general not possible to tell
which values represent addresses and which represent constant data items.
Since the assembler does not know the actual location where the program
will be loaded, it cannot make the necessary changes in the addresses used by
the program. However, the assembler can identify for the loader those parts of
the object program that need modification. An object program that contains
the information necessary to perform this kind of modification is called a
relocatable program.
To look at this in more detail, consider the program from Figs. 2.5 and 2.6.
In the preceding section, we assembled this program using a starting address
of 0000. Figure 2.7(a) shows this program loaded beginning at address 0000.
The JSUB instruction from line 15 is loaded at address 0006. The address field
of this instruction contains 01036, which is the address of the instruction
labeled RDREC. (These addresses are, of course, the same as those assigned by
the assembler.)
Now suppose that we want to load this program beginning at address
5000, as shown in Fig. 2.7(b). The address of the instruction labeled RDREC is
then 6036. Thus the JSUB instruction must be modified as shown to contain
this new address. Likewise, if we loaded the program beginning at address
7420 (Fig. 2.7c), the JSUB instruction would need to be changed to 4B108456 to
correspond to the new address of RDREC.
[Figure 2.7: the example program loaded at (a) address 0000, (b) address 5000, and (c) address 7420. In (b), the instruction +JSUB RDREC at 5006 is 4B106036, and RDREC (B410) is at 6036.]
Note that no matter where the program is loaded, RDREC is always 1036
bytes past the starting address of the program. This means that we can solve
the relocation problem in the following way:
1. When the assembler generates the object code for the JSUB instruction
we are considering, it will insert the address of RDREC relative to
the start of the program. (This is the reason we initialized the location
counter to 0 for the assembly.)
2. The assembler will also produce a command for the loader, instructing
it to add the beginning address of the program to the address
field in the JSUB instruction at load time.
The command for the loader, of course, must also be a part of the object
program. We can accomplish this with a Modification record having the following
format:
Modification record:
Col. 1 M
Col. 2-7 Starting location of the address field to be modified,
relative to the beginning of the program (hexadecimal)
Col. 8-9 Length of the address field to be modified, in
half-bytes (hexadecimal)
The length is stored in half-bytes (rather than bytes) because the address
field to be modified may not occupy an integral number of bytes. (For example,
the address field in the JSUB instruction we considered above occupies 20
bits, which is 5 half-bytes.) The starting location is the location of the byte
containing the leftmost bits of the address field to be modified. If this field
occupies an odd number of half-bytes, it is assumed to begin in the middle of
the first byte at the starting location. These conventions are, of course,
closely related to the architecture of SIC/XE. For other types of machines, the
half-byte approach might not be appropriate (see Exercise 2.2.9).
For the JSUB instruction we are using as an example, the Modification
record would be
M00000705
This record specifies that the beginning address of the program is to be added
to a field that begins at address 000007 (relative to the start of the program)
and is 5 half-bytes in length. Thus in the assembled instruction 4B101036, the
first 12 bits (4B1) will remain unchanged. The program load address will be
added to the last 20 bits (01036) to produce the correct operand address. (You
should check for yourself that this gives the results shown in Fig. 2.7.)
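The check suggested above can be carried out with a small sketch of what a loader might do with a Modification record. The function names are hypothetical, and the six zero bytes in the usage lines are placeholders for the instructions that precede the JSUB; only the record layout and the half-byte convention come from the text.

```python
def parse_modification(record: str) -> tuple:
    """Split a record such as M00000705 into (start, length_in_half_bytes)."""
    if len(record) != 9 or record[0] != "M":
        raise ValueError("not a Modification record")
    return int(record[1:7], 16), int(record[7:9], 16)


def apply_modification(code: bytearray, start: int, half_bytes: int, load_addr: int) -> None:
    """Add load_addr to the address field described by the record, in place."""
    n_bytes = (half_bytes + 1) // 2          # bytes that contain the field
    field = int.from_bytes(code[start:start + n_bytes], "big")
    mask = (1 << (4 * half_bytes)) - 1       # the low half_bytes nibbles
    relocated = ((field & mask) + load_addr) & mask
    # Bits above the field (e.g. the leading half-byte when the length
    # is odd) are left unchanged.
    field = (field & ~mask) | relocated
    code[start:start + n_bytes] = field.to_bytes(n_bytes, "big")


# Placeholder bytes for locations 0000-0005, then the JSUB at 0006.
program = bytearray(6) + bytes.fromhex("4B101036")
start, length = parse_modification("M00000705")
apply_modification(program, start, length, 0x5000)
print(program[6:10].hex().upper())
```

Loading at 5000 turns 4B101036 into 4B106036, and loading at 7420 gives 4B108456, in agreement with the values quoted for Fig. 2.7.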
Exactly the same kind of relocation must be performed for the instructions
on lines 35 and 65 in Fig. 2.6. The rest of the instructions in the program,
however, need not be modified when the program is loaded. In some cases this is
because the instruction operand is not a memory address at all (e.g., CLEAR S
or LDA #3). In other cases no modification is needed because the operand is
specified using program-counter relative or base relative addressing. For
example, the instruction on line 10 (STL RETADR) is assembled using
program-counter relative addressing with displacement 02D. No matter where the
program is loaded in memory, the word labeled RETADR will always be 2D
HCOPY 000000001077
T0000001D17202D69202D4B1010360320262900003320074B10105D3F2FEC032010
T00001D130F20160100030F200D4B10105D3E2003454F46
T0010361DB410B400B44075101000E32019332FFADB2013A00433200857C003B850
T0010531D3B2FEA1340004F0000F1B410774000E32011332FFA53C003DF2008B850
T001070073B2FEF4F000005
M00000705
M00001405
M00002705
E000000
2.3 MACHINE-INDEPENDENT ASSEMBLER FEATURES
In this section, we discuss some common assembler features that are not
closely related to machine architecture. Of course, more advanced machines
tend to have more complex software; therefore the features we consider are
more likely to be found on larger and more complex machines. However, the
presence or absence of such capabilities is much more closely related to issues
such as programmer convenience and software environment than it is to
machine architecture.
In Section 2.3.1 we discuss the implementation of literals within an assem-
bler, including the required data structures and processing logic. Section 2.3.2
discusses two assembler directives (EQU and ORG) whose main function is
the definition of symbols. Section 2.3.3 briefly examines the use of expressions
in assembler language statements, and discusses the different types of expres-
sions and their evaluation and use.
In Sections 2.3.4 and 2.3.5 we introduce the important topics of program
blocks and control sections. We discuss the reasons for providing such capabil-
ities and illustrate some different uses with examples. We also introduce a set
of assembler directives for supporting these features and discuss their imple-
mentation.
2.3.1 Literals
specifies a 1-byte literal with the hexadecimal value 05. The notation used for
literals varies from assembler to assembler; however, most assemblers use
some symbol (as we have used =) to make literal identification easier.
It is important to understand the difference between a literal and an
immediate operand. With immediate addressing, the operand value is assembled as
part of the machine instruction. With a literal, the assembler generates the
specified value as a constant at some other memory location. The address of
this generated constant is used as the target address for the machine
instruction. The effect of using a literal is exactly the same as if the
programmer had defined the constant explicitly and used the label assigned to
the constant as the instruction operand. (In fact, the generated object code
for lines 45 and 215 in Fig. 2.10 is identical to the object code for the
corresponding lines in Fig. 2.6.) You should compare the object instructions
generated for lines 45 and 55 in Fig. 2.10 to make sure you understand how
literals and immediate operands are handled.
All of the literal operands used in a program are gathered together into
one or more literal pools. Normally literals are placed into a pool at the end
of the program. The assembly listing of a program containing literals usually
includes a listing of this literal pool, which shows the assigned addresses and
the generated data values. Such a literal pool listing is shown in Fig. 2.10
immediately following the END statement. In this case, the pool consists of the
single literal =X’05’.
In some cases, however, it is desirable to place literals into a pool at some
other location in the object program. To allow this, we introduce the assembler
directive LTORG (line 93 in Fig. 2.9). When the assembler encounters a LTORG
statement, it creates a literal pool that contains all of the literal operands
used since the previous LTORG (or the beginning of the program). This literal
pool is placed in the object program at the location where the LTORG directive
was encountered (see Fig. 2.10). Of course, literals placed in a pool by LTORG
will not be repeated in the pool at the end of the program.
If we had not used the LTORG statement on line 93, the literal =C’EOF’
would be placed in the pool at the end of the program. This literal pool would
begin at address 1073. This means that the literal operand would be placed too
far away from the instruction referencing it to allow program-counter relative
addressing. The problem, of course, is the large amount of storage reserved for
BUFFER. By placing the literal pool before this buffer, we avoid having to use
extended format instructions when referring to the literals. The need for an
assembler directive such as LTORG usually arises when it is desirable to keep
the literal operand close to the instruction that uses it.
Most assemblers recognize duplicate literals (that is, the same literal used
in more than one place in the program) and store only one copy of the
specified data value. For example, the literal =X’05’ is used in our program on
lines
215 and 230. However, only one data area with this value is generated. Both
instructions refer to the same address in the literal pool for their operand.
The easiest way to recognize duplicate literals is by comparison of the
character strings defining them (in this case, the string =X’05’). Sometimes a
slight additional saving is possible if we look at the generated data value in-
stead of the defining expression. For example, the literals =C’EOF’ and
=X’454F46’ would specify identical operand values. The assembler might
avoid storing both literals if it recognized this equivalence. However, the
benefits realized in this way are usually not great enough to justify the
additional complexity in the assembler.
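To make the two comparison strategies concrete, here is a minimal sketch that converts a literal's defining string into its generated data value. The function name is hypothetical, the sketch handles only the C'...' and X'...' forms used in our examples, and it assumes ASCII character codes and plain apostrophes in the source text.

```python
def literal_value(literal: str) -> bytes:
    """Return the data bytes generated for a literal such as =C'EOF' or =X'05'."""
    if len(literal) < 4 or literal[0] != "=" or literal[2] != "'" or literal[-1] != "'":
        raise ValueError("unrecognized literal: " + literal)
    kind, body = literal[1], literal[3:-1]
    if kind == "C":
        return body.encode("ascii")      # character string
    if kind == "X":
        return bytes.fromhex(body)       # hexadecimal digits
    raise ValueError("unrecognized literal type: " + kind)


# Different defining strings, identical generated values.
print(literal_value("=C'EOF'") == literal_value("=X'454F46'"))
```

Comparing the strings treats =C'EOF' and =X'454F46' as distinct literals; comparing the values returned by a function like this one would let the assembler store the three bytes only once.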
If we use the character string defining a literal to recognize duplicates, we
must be careful of literals whose value depends upon their location in the
program. Suppose, for example, that we allow literals that refer to the current
value of the location counter (often denoted by the symbol *). Such literals are
sometimes useful for loading base registers. For example, the statements
BASE *
LDB =*
as the first lines of a program would load the beginning address of the
program into register B. This value would then be available for base relative
addressing.
Such a notation can, however, cause a problem with the detection of
duplicate literals. If a literal =* appeared on line 13 of our example program,
it would specify an operand with value 0003. If the same literal appeared on
line 55, it would specify an operand with value 0020. In such a case, the
literal operands have identical names; however, they have different values, and
both must appear in the literal pool. The same problem arises if a literal
refers to any other item whose value changes between one point in the program
and another.
Now we are ready to describe how the assembler handles literal operands.
The basic data structure needed is a literal table LITTAB. For each literal
used, this table contains the literal name, the operand value and length, and
the address assigned to the operand when it is placed in a literal pool. LITTAB
is often organized as a hash table, using the literal name or value as the key.
As each literal operand is recognized during Pass 1, the assembler searches
LITTAB for the specified literal name (or value). If the literal is already
present in the table, no action is needed; if it is not present, the literal is
added to LITTAB (leaving the address unassigned). When Pass 1 encounters a
LTORG statement or the end of the program, the assembler makes a scan of the
literal table. At this time each literal currently in the table is assigned an
address (unless such an address has already been filled in). As these addresses
are assigned, the location counter is updated to reflect the number of bytes
occupied by each literal.
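The Pass 1 handling just described can be sketched as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, a Python dictionary stands in for the hash table, literals are keyed by name, and the addresses in the usage lines are chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class LiteralEntry:
    value: bytes                      # operand value (its length is len(value))
    address: Optional[int] = None     # unassigned until a pool is placed


class LitTab:
    def __init__(self) -> None:
        self.entries: Dict[str, LiteralEntry] = {}

    def note(self, name: str, value: bytes) -> None:
        """Called when Pass 1 recognizes a literal operand."""
        if name not in self.entries:          # duplicates need no action
            self.entries[name] = LiteralEntry(value)

    def place_pool(self, locctr: int) -> int:
        """Called at LTORG or END; returns the updated location counter."""
        for entry in self.entries.values():
            if entry.address is None:         # skip literals already placed
                entry.address = locctr
                locctr += len(entry.value)
        return locctr


littab = LitTab()
littab.note("=C'EOF'", b"EOF")
locctr = littab.place_pool(0x002D)            # LTORG encountered here
littab.note("=X'05'", b"\x05")
littab.note("=X'05'", b"\x05")                # duplicate, ignored
locctr = littab.place_pool(0x1076)            # end of program
```

Because a placed literal keeps its address, the second scan assigns an address only to =X'05', so a literal placed by LTORG is not repeated in the pool at the end of the program.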
During Pass 2, the operand address for use in generating object code is
obtained by searching LITTAB for each literal operand encountered. The data
values specified by the literals in each literal pool are inserted at the
appropriate places in the object program exactly as if these values had been
generated by BYTE or WORD statements. If a literal value represents an address
in the program (for example, a location counter value), the assembler must also
generate the appropriate Modification record.
To be sure you understand how LITTAB is created and used by the
assembler, you may want to apply the procedure we just described to the source
statements in Fig. 2.9. The object code and literal pools generated should be
the same as those in Fig. 2.10.
+LDT #4096
to load the value 4096 into register T. This value represents the
maximum-length record we could read with subroutine RDREC. The meaning is not,
however, as clear as it might be. If we include the statement
+LDT #MAXLEN
A EQU 0
X EQU 1
L EQU 2
...
These statements cause the symbols A, X, L, ... to be entered into SYMTAB with
their corresponding values 0, 1, 2, .... An instruction like RMO A,X would then
be allowed. The assembler would search SYMTAB, finding the values 0 and 1
for the symbols A and X, and assemble the instruction.
On a machine like SIC, there would be little point in doing this; it is just
as easy to have the standard register mnemonics built into the assembler.
Furthermore, the standard names (base, index, etc.) reflect the usage of the
registers. Consider, however, a machine that has general-purpose registers.
These registers are typically designated by 0, 1, 2, ... (or R0, R1, R2, ...).
In a particular program, however, some of these may be used as base registers,
some as index registers, some as accumulators, etc. Furthermore, this usage of
registers changes from one program to the next. By writing statements like
BASE EQU R1
COUNT EQU R2
INDEX EQU R3
the programmer can establish and use names that reflect the logical function
of the registers in the program.
ORG value
We might want to refer to entries in the table using indexed addressing
(placing in the index register the offset of the desired entry from the
beginning of the table). Of course, we want to be able to refer to the fields
SYMBOL, VALUE, and FLAGS individually, so we must also define these labels. One
way of doing this would be with EQU statements:
LDA VALUE,X
to fetch the VALUE field from the table entry indicated by the contents of
register X. However, this method of definition simply defines the labels; it
does not make the structure of the table as clear as it might be.
We can accomplish the same symbol definition using ORG in the following
way.
The first ORG resets the location counter to the value of STAB (i.e., the
beginning address of the table). The label on the following RESB statement
defines SYMBOL to have the current value in LOCCTR; this is the same address
assigned to STAB. LOCCTR is then advanced so the label on the RESW statement
assigns to VALUE the address (STAB+6), and so on. The result is a set of
labels with the same values as those defined with the EQU statements above.
This method of definition makes it clear, however, that each entry in STAB
consists of a 6-byte SYMBOL, followed by a one-word VALUE, followed by a
2-byte FLAGS.
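The effect of the ORG statements on LOCCTR can be traced with a short sketch. The function name and the two addresses passed in the usage line are hypothetical; the field sizes come from the description above, with one SIC word taken as 3 bytes.

```python
WORD = 3  # bytes per SIC word


def define_fields(stab: int, next_free: int):
    """Trace LOCCTR through ORG STAB, the three field definitions, and the final ORG."""
    symtab = {}
    saved = next_free        # LOCCTR value before the first ORG
    locctr = stab            # ORG STAB
    for label, directive, count in [
        ("SYMBOL", "RESB", 6),
        ("VALUE", "RESW", 1),
        ("FLAGS", "RESB", 2),
    ]:
        symtab[label] = locctr                        # label gets current LOCCTR
        locctr += count * (WORD if directive == "RESW" else 1)
    locctr = saved           # last ORG: restore the previous value
    return symtab, locctr


# Illustrative addresses only: table at 1000, next free byte at 144C.
fields, locctr = define_fields(0x1000, 0x144C)
```

With the table at 1000, the labels receive 1000, 1006, and 1009, that is, STAB, STAB+6, and STAB+9, and LOCCTR ends at the value it had before the first ORG.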
The last ORG statement is very important. It sets LOCCTR back to its
previous value, the address of the next unassigned byte of memory after the
table STAB. This is necessary so that any labels on subsequent statements,
which do not represent part of STAB, are assigned the proper addresses. In
some assemblers the previous value of LOCCTR is automatically remembered,
so we can simply write
ORG
ALPHA RESW 1
BETA EQU ALPHA
would not. The reason for this is the symbol definition process. In the second
example above, BETA cannot be assigned a value when it is encountered
during Pass 1 of the assembly (because ALPHA does not yet have a value).
However, our two-pass assembler design requires that all symbols be defined
during Pass 1.
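The restriction can be seen in a minimal sketch of symbol definition during Pass 1. The function name and the statement tuples are hypothetical, and only RESW and EQU with a single symbol operand are handled.

```python
def pass1(statements):
    """Build SYMTAB from (label, operation, operand) tuples, in source order."""
    symtab = {}
    locctr = 0
    for label, op, operand in statements:
        if op == "EQU":
            # The operand must already have a value when EQU is processed.
            if operand not in symtab:
                raise ValueError(f"{label}: symbol {operand} is not yet defined")
            symtab[label] = symtab[operand]
        elif op == "RESW":
            symtab[label] = locctr
            locctr += 3 * int(operand)       # 3 bytes per word
        else:
            raise ValueError(f"unsupported operation {op}")
    return symtab


print(pass1([("ALPHA", "RESW", "1"), ("BETA", "EQU", "ALPHA")]))
```

Reversing the two statements makes the function raise an error, because BETA is reached before ALPHA has been entered into the symbol table.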
A similar restriction applies to ORG: all symbols used to specify the new
location counter value must have been previously defined. Thus, for example,
the sequence
ORG ALPHA
BYTE1 RESB 1
BYTE2 RESB 1
BYTE3 RESB 1
ORG
ALPHA RESB 1
could not be processed. In this case, the assembler would not know (during
Pass 1) what value to assign to the location counter in response to the first
ORG statement. As a result, the symbols BYTE1, BYTE2, and BYTE3 could not
be assigned addresses during Pass 1.
It may appear that this restriction is a result of the particular way in which
we defined the two passes of our assembler. In fact, it is a more general
product of the forward-reference problem. You can easily see, for example, that
the sequence of statements
2.3.3 Expressions
use of expressions wherever such a single operand is permitted. Each such
expression must, of course, be evaluated by the assembler to produce a single
operand address or value.
Assemblers generally allow arithmetic expressions formed according to
the normal rules using the operators +, -, *, and /. Division is usually defined
to produce an integer result. Individual terms in the expression may be
constants, user-defined symbols, or special terms. The most common such special
term is the current value of the location counter (often designated by *). This
term represents the value of the next unassigned memory location. Thus in
Fig. 2.9 the statement
gives BUFEND a value that is the address of the next byte after the buffer area.
In Section 2.2 we discussed the problem of program relocation. We saw
that some values in the object program are relative to the beginning of the
program, while others are absolute (independent of program location). Similarly,
the values of terms and expressions are either relative or absolute. A constant
is, of course, an absolute term. Labels on instructions and data areas, and
references to the location counter value, are relative terms. A symbol whose
value is given by EQU (or some similar assembler directive) may be either an
absolute term or a relative term depending upon the expression used to define
its value.
Expressions are classified as either absolute expressions or relative expressions
depending upon the type of value they produce. An expression that contains
only absolute terms is, of course, an absolute expression. However, absolute
expressions may also contain relative terms provided the relative terms occur
in pairs and the terms in each such pair have opposite signs. It is not
necessary that the paired terms be adjacent to each other in the expression;
however, all relative terms must be capable of being paired in this way. None
of the relative terms may enter into a multiplication or division operation.
A relative expression is one in which all of the relative terms except one
can be paired as described above; the remaining unpaired relative term must
have a positive sign. As before, no relative term may enter into a
multiplication or division operation. Expressions that do not meet the
conditions given for either absolute or relative expressions should be flagged
by the assembler as errors.
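The pairing rules can be restated as a count: add 1 for each relative term with a positive sign and subtract 1 for each with a negative sign. A net count of 0 means every relative term can be paired, and a net count of 1 means exactly one positive relative term is left over. The sketch below applies that count; the function name and the tuple layout for terms are hypothetical.

```python
def classify(terms):
    """Classify an expression as absolute ('A') or relative ('R').

    terms: list of (sign, kind, in_mul_div) tuples, where sign is +1 or -1,
    kind is 'A' or 'R', and in_mul_div tells whether the term takes part
    in a multiplication or division.
    """
    net = 0
    for sign, kind, in_mul_div in terms:
        if kind == "R":
            if in_mul_div:
                raise ValueError("relative term in multiplication or division")
            net += sign
    if net == 0:
        return "A"       # all relative terms pair off with opposite signs
    if net == 1:
        return "R"       # one unpaired relative term with positive sign
    raise ValueError("expression is neither absolute nor relative")


# BUFEND - BUFFER: two relative terms with opposite signs.
print(classify([(+1, "R", False), (-1, "R", False)]))
```

A difference of two relative symbols is classified as absolute, a single relative symbol plus a constant as relative, and a sum of two relative symbols or a relative symbol multiplied by a constant is flagged as an error.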
Although the rules given above may seem arbitrary, they are actually quite
reasonable. The expressions that are legal under these definitions include
exactly those expressions whose value remains meaningful when the program is
relocated. A relative term or expression represents some value that may be
written as (S + r), where S is the starting address of the program and r is the
value of the term or expression relative to the starting address. Thus a relative
term usually represents some location within the program. When relative
terms are paired with opposite signs, the dependency on the program starting
address is canceled out; the result is an absolute value. Consider, for example,
the program of Fig. 2.9. In the statement
RETADR R 0030
BUFFER R 0036
BUFEND R 1036
MAXLEN A 1000
With this information the assembler can easily determine the type of each
expression used as an operand and generate Modification records in the object
program for relative values.
In Section 2.3.5 we consider programs that consist of several parts that can
be relocated independently of each other. As we discuss in the later section,
our rules for determining the type of an expression must be modified in such
instances.
In all of the examples we have seen so far the program being assembled was
treated as a unit. The source programs logically contained subroutines, data
areas, etc. However, they were handled by the assembler as one entity,
resulting in a single block of object code. Within this object program the
generated machine instructions and data appeared in the same order as they were
written in the source program.
Many assemblers provide features that allow more flexible handling of the
source and object programs. Some features allow the generated machine
instructions and data to appear in the object program in a different order from
the corresponding source statements. Other features result in the creation of
several independent parts of the object program. These parts maintain their
identity and are handled separately by the loader. We use the term program
blocks to refer to segments of code that are rearranged within a single object
program unit, and control sections to refer to segments that are translated into
independent object program units. (This terminology is, unfortunately, far
from uniform. As a matter of fact, in some systems the same assembler lan-
guage feature is used to accomplish both of these logically different functions.)
In this section we consider the use of program blocks and how they are han-
dled by the assembler. Section 2.3.5 discusses control sections and their uses.
Figure 2.11 shows our example program as it might be written using
program blocks. In this case three blocks are used. The first (unnamed) program
block contains the executable instructions of the program. The second (named
CDATA) contains all data areas that are a few words or less in length. The
third (named CBLKS) contains all data areas that consist of larger blocks of
memory. Some possible reasons for making such a division are discussed later
in this section.
The assembler directive USE indicates which portions of the source
program belong to the various blocks. At the beginning of the program,
statements are assumed to be part of the unnamed (default) block; if no USE
statements are included, the entire program belongs to this single block. The
USE statement on line 92 signals the beginning of the block named CDATA.
Source statements are associated with this block until the USE statement on
line 103, which begins the block named CBLKS. The USE statement may also
indicate a continuation of a previously begun block. Thus the statement on
line 123 resumes the default block, and the statement on line 183 resumes the
block named CDATA.
As we can see, each program block may actually contain several separate
segments of the source program. The assembler will (logically) rearrange these
segments to gather together the pieces of each block. These blocks will then be
assigned addresses in the object program, with the blocks appearing in the
same order in which they were first begun in the source program. The result is
the same as if the programmer had physically rearranged the source
statements to group together all the source lines belonging to each block.
The assembler accomplishes this logical rearrangement of code by
maintaining, during Pass 1, a separate location counter for each program block.
The location counter for a block is initialized to 0 when the block is first
begun. The current value of this location counter is saved when switching to
another block, and the saved value is restored when resuming a previous block.
Thus during Pass 1 each label in the program is assigned an address that is
relative to the start of the block that contains it. When labels are entered
into the symbol table, the block name or number is stored along with the
assigned relative address. At the end of Pass 1 the latest value of the
location counter for each block indicates the length of that block. The
assembler can then assign to each block a starting address in the object
program (beginning with relative location 0).
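The bookkeeping described above can be sketched in a few lines. The function name is hypothetical, and the way each block is split into segments in the usage lines is illustrative; only the total block lengths (0066, 000B, and 1000) follow from the address ranges given for the sample program later in this section.

```python
def layout_blocks(segments):
    """Compute block lengths and starting addresses.

    segments: list of (block_name, size_in_bytes) in source order; a block
    name may appear several times when the block is resumed by USE.
    """
    counters = {}                         # one location counter per block
    for name, size in segments:
        # A counter starts at 0 when its block is first begun and
        # continues from its saved value when the block is resumed.
        counters[name] = counters.get(name, 0) + size
    starts = {}
    address = 0
    for name, length in counters.items():  # order in which blocks were begun
        starts[name] = address
        address += length
    return starts, counters


starts, lengths = layout_blocks([
    ("", 0x27), ("CDATA", 6), ("CBLKS", 0x1000),
    ("", 0x26), ("CDATA", 1), ("", 0x19), ("CDATA", 4),
])
print({name: format(addr, "04X") for name, addr in starts.items()})
```

The unnamed default block starts at 0000, CDATA at 0066, and CBLKS at 0071. The sketch relies on Python dictionaries preserving insertion order, which matches the rule that blocks appear in the order in which they were first begun.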
For code generation during Pass 2, the assembler needs the address for
each symbol relative to the start of the object program (not the start of an
individual program block). This is easily found from the information in SYMTAB.
The assembler simply adds the location of the symbol, relative to the start of
its block, to the assigned block starting address.
Figure 2.12 demonstrates this process applied to our sample program. The
column headed Loc/Block shows the relative address (within a program
block) assigned to each source line and a block number indicating which
program block is involved (0 = default block, 1 = CDATA, 2 = CBLKS). This is
essentially the same information that is stored in SYMTAB for each symbol.
Notice that the value of the symbol MAXLEN (line 107) is shown without a
block number. This indicates that MAXLEN is an absolute symbol, whose
value is not relative to the start of any program block.
At the end of Pass 1 the assembler constructs a table that contains the
starting addresses and lengths for all blocks. For our sample program, this
table looks like
SYMTAB shows the value of the operand (the symbol LENGTH) as relative
location 0003 within program block 1 (CDATA). The starting address for CDATA
is 0066. Thus the desired target address for this instruction is 0003 + 0066 =
0069. The instruction is to be assembled using program-counter relative
addressing. When the instruction is executed, the program counter contains the
address of the following instruction (line 25). The address of this instruction
is relative location 0009 within the default block. Since the default block
starts at location 0000, this address is simply 0009. Thus the required
displacement is 0069 - 0009 = 60. The calculation of the other addresses during
Pass 2 follows a similar pattern.
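The arithmetic in this example can be reproduced with a short sketch. The table and function names are hypothetical; the block starting addresses and relative locations are the ones quoted in the text.

```python
# Starting address of each program block (0 = default, 1 = CDATA, 2 = CBLKS).
BLOCK_START = {0: 0x0000, 1: 0x0066, 2: 0x0071}


def absolute_address(block: int, relative: int) -> int:
    """Address relative to the start of the object program."""
    return BLOCK_START[block] + relative


target = absolute_address(1, 0x0003)   # LENGTH: location 0003 in CDATA
pc = absolute_address(0, 0x0009)       # next instruction, default block
disp = target - pc                     # program-counter relative displacement
print(format(target, "04X"), format(disp & 0xFFF, "03X"))
```

The target address comes out as 0069 and the displacement as 060, matching the values derived above.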
We can immediately see that the separation of the program into blocks has
considerably reduced our addressing problems. Because the large buffer area
is moved to the end of the object program, we no longer need to use extended
format instructions on lines 15, 35, and 65. Furthermore, the base register is no
longer necessary; we have deleted the LDB and BASE statements previously
on lines 13 and 14. The problem of placement of literals (and literal references)
in the program is also much more easily solved. We simply include a LTORG
statement in the CDATA block to be sure that the literals are placed ahead of
any large data areas.
Of course the use of program blocks has not accomplished anything we
could not have done by rearranging the statements of the source program. For
example, program readability is often improved if the definitions of data areas
are placed in the source program close to the statements that reference them.
This could be accomplished in a long subroutine (without using program
blocks) by simply inserting data areas in any convenient position. However,
the programmer would need to provide Jump instructions to branch around
the storage thus reserved.
In the situation just discussed, machine considerations suggested that the
parts of the object program appear in memory in a particular order. On the
other hand, human factors suggested that the source program should be in a
different order. The use of program blocks is one way of satisfying both of
these requirements, with the assembler providing the required reorganization.
It is not necessary to physically rearrange the generated code in the object
program to place the pieces of each program block together. The assembler can
simply write the object code as it is generated during Pass 2 and insert the
proper load address in each Text record. These load addresses will, of course,
reflect the starting address of the block as well as the relative location of the
code within the block. This processisillustrated in Fig. 2.13. Thefirst two Text
records are generated from the source program lines 5 through 70. When the
USEstatement on line 92 is recognized, the assembler writes out the current
Text record (even thoughthereis still room left in it). The assembler then pre-
pares to begin a new Text record for the new program block. Asit happens, the
statements onlines 95 through 105 result in no generated code, so no new Text
records are created. The next two Text records come from lines 125 through
180. This time the statements that belong to the next program block do result
in the generation of object code. The fifth Text record contains the single byte
of data from line 185. The sixth Text record resumes the default program block
and the rest of the object program continues in similar fashion.
It does not matter that the Text records of the object program are not in
sequence by address; the loader will simply load the object code from each
record at the indicated address. When this loading is completed, the generated
code from the default block will occupy relative locations 0000 through 0065;
the generated code and reserved storage for CDATA will occupy locations
0066 through 0070; and the storage reserved for CBLKS will occupy locations
0071 through 1070. Figure 2.14 traces the blocks of the example program
through this process of assembly and loading. Notice that the program
segments marked CDATA(1) and CBLKS(1) are not actually present in the object
program. Because of the way the addresses are assigned, storage will
automatically be reserved for these areas when the program is loaded.
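To see why the order of the Text records does not matter, consider a minimal sketch of the loading step. The function name and the record contents are hypothetical (the bytes are placeholders, not the object code of Fig. 2.13); only the idea of placing each record at its own load address comes from the text.

```python
def load_text_records(records, memory_size):
    """Place each record's bytes at its load address; order does not matter."""
    memory = [None] * memory_size  # None marks a byte that no Text record supplied
    for load_address, data in records:
        for offset, byte in enumerate(data):
            memory[load_address + offset] = byte
    return memory


# Hypothetical records: default block, then a CDATA piece, then default again.
records = [
    (0x0000, bytes.fromhex("AABBCC")),
    (0x006C, bytes.fromhex("F1")),
    (0x0003, bytes.fromhex("DDEEFF")),
]
memory = load_text_records(records, 0x0080)
default_code = bytes(memory[0x0000:0x0006]).hex().upper()
print(default_code)
```

The two pieces of the default block end up adjacent in memory even though a CDATA record was written between them, and locations that no record touches (such as reserved storage) are simply left as they were.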
You should carefully examine the generated code in Fig. 2.12, and work
through the assembly of several more instructions to be sure you understand
how the assembler handles multiple program blocks. To understand how the
pieces of each program block are gathered together, you may also want to
simulate (by hand) the loading of the object program of Fig. 2.13.
[Figure 2.14: diagram with three columns (Source program, Object program,
Program loaded in memory) showing the pieces Default(1)-(3), CDATA(1)-(3),
and CBLKS(1) and their relative addresses.]
Figure 2.14 Program blocks from Fig. 2.11 traced through the assembly
and loading processes.
Figure 2.15 shows our example program as it might be written using
multiple control sections. In this case there are three control sections: one for the
main program and one for each subroutine. The START statement identifies
the beginning of the assembly and gives a name (COPY) to the first control
section. The first section continues until the CSECT statement on line 109. This
assembler directive signals the start of a new control section named RDREC.
Similarly, the CSECT statement on line 193 begins the control section named
WRREC. The assembler establishes a separate location counter (beginning at
0) for each control section, just as it does for program blocks.
Control sections differ from program blocks in that they are handled
separately by the assembler. (It is not even necessary for all control sections in a
program to be assembled at the same time.) Symbols that are defined in one
control section may not be used directly by another control section; they must
be identified as external references for the loader to handle. Figure 2.15 shows
the use of two assembler directives to identify such references: EXTDEF
(external definition) and EXTREF (external reference). The EXTDEF statement in a
control section names symbols, called external symbols, that are defined in this
control section and may be used by other sections. Control section names (in
this case COPY, RDREC, and WRREC) do not need to be named in an EXTDEF
statement because they are automatically considered to be external symbols.
The EXTREF statement names symbols that are used in this control section
and are defined elsewhere. For example, the symbols BUFFER, BUFEND, and
LENGTH are defined in the control section named COPY and made available
to the other sections by the EXTDEF statement on line 6. The third control
section (WRREC) uses two of these symbols, as specified in its EXTREF statement
(line 207). The order in which symbols are listed in the EXTDEF and EXTREF
statements is not significant.
Now we are ready to look at how external references are handled by the
assembler. Figure 2.16 shows the generated object code for each statement in
the program. Consider first the instruction
15 0003 CLOOP +JSUB RDREC 4B100000
The operand (RDREC) is named in the EXTREF statement for the control
section, so this is an external reference. The assembler has no idea where the
control section containing RDREC will be loaded, so it cannot assemble the
address for this instruction. Instead the assembler inserts an address of zero
and passes information to the loader, which will cause the proper address to
be inserted at load time. The address of RDREC will have no predictable
relationship to anything in this control section; therefore relative addressing is not
possible. Thus an extended format instruction must be used to provide room
for the actual address to be inserted. This is true of any instruction whose
operand involves an external reference.
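As a rough illustration of this step, the following Python sketch assembles an extended format instruction whose operand is an external reference. The function name is hypothetical; the bit layout is the standard SIC/XE Format 4 layout (6-bit opcode, flags n i x b p e, 20-bit address), and the opcode values 48 (JSUB) and 54 (STCH) are the usual SIC/XE values, assumed here rather than taken from this section. The address field is left at zero and a Modification record is produced for the loader.

```python
def assemble_format4_external(opcode, location, symbol, indexed=False):
    """Assemble a Format 4 instruction with a zero address field, plus the
    Modification record that asks the loader to add the symbol's address."""
    n, i = 1, 1                      # simple addressing
    x = 1 if indexed else 0
    b, p, e = 0, 0, 1                # no relative addressing; extended format
    first_byte = (opcode & 0xFC) | (n << 1) | i
    flags = (x << 3) | (b << 2) | (p << 1) | e
    address = 0                      # unknown at assembly time
    word = (first_byte << 24) | (flags << 20) | address
    # The 20-bit address field (5 half-bytes) starts one byte into the instruction.
    modification = f"M{location + 1:06X}05+{symbol}"
    return f"{word:08X}", modification


jsub = assemble_format4_external(0x48, 0x0003, "RDREC")
stch = assemble_format4_external(0x54, 0x0017, "BUFFER", indexed=True)
print(jsub)
print(stch)
```

For the +JSUB at relative location 0003, the sketch yields the object code 4B100000 and the record M00000405+RDREC, which agrees with the records shown for control section COPY.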
Line   Source statement
193    WRREC    CSECT
195
200             SUBROUTINE TO WRITE RECORD FROM BUFFER
205
207             EXTREF  LENGTH,BUFFER
210             CLEAR   X               CLEAR LOOP COUNTER
212            +LDT     LENGTH
215    WLOOP    TD      =X'05'          TEST OUTPUT DEVICE
220             JEQ     WLOOP           LOOP UNTIL READY
225            +LDCH    BUFFER,X        GET CHARACTER FROM BUFFER
230             WD      =X'05'          WRITE CHARACTER
235             TIXR    T               LOOP UNTIL ALL CHARACTERS
240             JLT     WLOOP           HAVE BEEN WRITTEN
245             RSUB                    RETURN TO CALLER
255             END     FIRST
is only slightly different. Here the value of the data word to be generated is
specified by an expression involving two external references: BUFEND and
BUFFER. As before, the assembler stores this value as zero. When the program
is loaded, the loader will add to this data area the address of BUFEND and
subtract from it the address of BUFFER, which results in the desired value.
Note the difference between the handling of the expression on line 190 and
the similar expression on line 107. The symbols BUFEND and BUFFER are
defined in the same control section with the EQU statement on line 107. Thus
the value of the expression can be calculated immediately by the assembler.
This could not be done for line 190; BUFEND and BUFFER are defined in
another control section, so their values are unknown at assembly time.
As we can see from the above discussion, the assembler must remember
(via entries in SYMTAB) in which control section a symbol is defined. Any
attempt to refer to a symbol in another control section must be flagged as an
error unless the symbol is identified (using EXTREF) as an external reference.
The assembler must also allow the same symbol to be used in different control
sections. For example, the conflicting definitions of MAXLEN on lines 107 and
190 should cause no problem. A reference to MAXLEN in the control section
COPY would use the definition on line 107, whereas a reference to MAXLEN
in RDREC would use the definition on line 190.
So far we have seen how the assembler leaves room in the object code for
the values of external symbols. The assembler must also include information
in the object program that will cause the loader to insert the proper values
where they are required. We need two new record types in the object program
and a change in a previously defined record type. As before, the exact format
of these records is arbitrary; however, the same information must be passed to
the loader in some form.
The two new record types are Define and Refer. A Define record gives
information about external symbols that are defined in this control section, that
is, symbols named by EXTDEF. A Refer record lists symbols that are used as
external references by the control section, that is, symbols named by EXTREF.
The formats of these records are as follows.
Define record:
Col. 1       D
Col. 2-7     Name of external symbol defined in this control section
Col. 8-13    Relative address of symbol within this control section
             (hexadecimal)
Col. 14-73   Repeat information in Col. 2-13 for other external
             symbols
Refer record:
Col. 1       R
Col. 2-7     Name of external symbol referred to in this control
             section
Col. 8-73    Names of other external reference symbols
The first three items in this record are the same as previously discussed. The
two new items specify the modification to be performed: adding or
subtracting the value of some external symbol. The symbol used for modification may
be defined either in this control section or in another one.
Figure 2.17 shows the object program corresponding to the source in Fig.
2.16. Notice that there is a separate set of object program records (from Header
through End) for each control section. The records for each control section are
exactly the same as they would be if the sections were assembled separately.
The Define and Refer records for each control section include the symbols
named in the EXTDEF and EXTREF statements. In the case of Define, the
record also indicates the relative address of each external symbol within the
control section. For EXTREF symbols, no address information is available.
These symbols are simply named in the Refer record.
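The column layouts of the Define and Refer records translate directly into string formatting. The sketch below is one possible way to build them; the function names are hypothetical, and the limits of 6 symbols per Define record and 12 names per Refer record follow from the column ranges given in the record formats (columns 2 through 73).

```python
def define_record(symbols):
    """symbols: list of (name, relative_address) pairs named by EXTDEF."""
    if len(symbols) > 6:             # Col. 2-73 holds six 12-column entries
        raise ValueError("at most 6 symbols fit in one Define record")
    return "D" + "".join(f"{name:<6}{address:06X}" for name, address in symbols)


def refer_record(names):
    """names: list of external reference symbols named by EXTREF."""
    if len(names) > 12:              # Col. 2-73 holds twelve 6-column names
        raise ValueError("at most 12 names fit in one Refer record")
    return "R" + "".join(f"{name:<6}" for name in names)


d_record = define_record([("BUFFER", 0x0033), ("BUFEND", 0x1033), ("LENGTH", 0x002D)])
r_record = refer_record(["RDREC", "WRREC"])
print(d_record)
print(r_record)
```

With the symbol values of control section COPY, this produces DBUFFER000033BUFEND001033LENGTH00002D, and the Refer record pads each name to six columns.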
HCOPY  000000001033
DBUFFER000033BUFEND001033LENGTH00002D
RRDREC WRREC
T0000001D1720274B1000000320232900003320074B1000003F2FEC0320160F2016
T00001D0D0100030F200A4B1000003E2000
T00003003454F46
M00000405+RDREC
M00001105+WRREC
M00002405+WRREC
E000000
HRDREC 00000000002B
RBUFFERLENGTHBUFEND
T0000001DB410B400B44077201FE3201B332FFADB2015A00433200957900000B850
T00001D0E3B2FE9131000004F0000F1000000
M00001805+BUFFER
M00002105+LENGTH
M00002806+BUFEND
M00002806-BUFFER
E
HWRREC 00000000001C
RLENGTHBUFFER
T0000001CB41077100000E32012332FFA53900000DF2008B8503B2FEE4F000005
M00000305+LENGTH
M00000D05+BUFFER
E
M00000405+RDREC
instructions on lines 35 and 65. Likewise, the first Modification record in
control section RDREC fills in the proper address for the external reference on
line 160.
The handling of the data word generated by line 190 is only slightly
different. The value of this word is to be BUFEND-BUFFER, where both BUFEND
and BUFFER are defined in another control section. The assembler generates
an initial value of zero for this word (located at relative address 0028 within
control section RDREC). The last two Modification records in RDREC direct
that the address of BUFEND be added to this field, and the address of
BUFFER be subtracted from it. This computation, performed at load time,
results in the desired value for the data word.
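The load-time arithmetic can be sketched as follows. This is a simplified illustration, not the loader design of Chapter 3: the function name, the load address of RDREC, and the addresses assigned to BUFFER and BUFEND are all hypothetical, and the record layout is read off the Modification records shown above (starting address in columns 2-7, length in half-bytes in columns 8-9, then a sign and a symbol name).

```python
def apply_modification(memory, section_start, record, symbol_addresses):
    """Apply one revised Modification record, e.g. 'M00002806+BUFEND'."""
    offset = int(record[1:7], 16)        # start of field, relative to the section
    half_bytes = int(record[7:9], 16)    # field length in half-bytes
    sign, symbol = record[9], record[10:].strip()
    n_bytes = (half_bytes + 1) // 2
    start = section_start + offset
    stored = int.from_bytes(memory[start:start + n_bytes], "big")
    mask = (1 << (4 * half_bytes)) - 1   # the field is the low-order half-bytes
    delta = symbol_addresses[symbol] if sign == "+" else -symbol_addresses[symbol]
    field = ((stored & mask) + delta) & mask
    memory[start:start + n_bytes] = ((stored & ~mask) | field).to_bytes(n_bytes, "big")


# Hypothetical load-time values: BUFFER and BUFEND are 4096 (hex 1000) bytes apart.
symbol_addresses = {"BUFFER": 0x4033, "BUFEND": 0x5033}
rdrec_start = 0x4063
memory = bytearray(0x5000)
memory[rdrec_start + 0x17:rdrec_start + 0x1B] = bytes.fromhex("57900000")  # line 160
memory[rdrec_start + 0x28:rdrec_start + 0x2B] = bytes.fromhex("000000")    # line 190

for record in ["M00001805+BUFFER", "M00002806+BUFEND", "M00002806-BUFFER"]:
    apply_modification(memory, rdrec_start, record, symbol_addresses)

instruction = memory[rdrec_start + 0x17:rdrec_start + 0x1B].hex().upper()
data_word = memory[rdrec_start + 0x28:rdrec_start + 0x2B].hex().upper()
print(instruction, data_word)
```

Under these assumed addresses the data word becomes 001000 (BUFEND minus BUFFER), while the 5-half-byte modification changes only the address field of the instruction and leaves its flag bits intact.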
In Chapter 3 we discuss in detail how the required modifications are
performed by the loader. At this time, however, you should be sure that you
understand the concepts involved in the linking process. You should carefully
examine the other Modification records in Fig. 2.17, and reconstruct for
yourself how they were generated from the source program statements.
Note that the revised Modification record may still be used to perform
program relocation. In the case of relocation, the modification required is adding
the beginning address of the control section to certain fields in the object
program. The symbol used as the name of the control section has as its value the
required address. Since the control section name is automatically an external
symbol, it is available for use in Modification records. Thus, for example, the
Modification records from Fig. 2.8 are changed from
M00000705
M00001405
M00002705
to
M00000705+COPY
M00001405+COPY
M00002705+COPY
In this way, exactly the same mechanism can be used for program relocation
and for program linking. There are more examples in the next chapter.
The existence of multiple control sections that can be relocated
independently of one another makes the handling of expressions slightly more
complicated. Our earlier definitions required that all of the relative terms in an
expression be paired (for an absolute expression), or that all except one be
paired (for a relative expression). We must now extend this restriction to
specify that both terms in each pair must be relative within the same control
section. The reason is simple: if the two terms represent relative locations in the
same control section, their difference is an absolute value (regardless of where
the control section is located). On the other hand, if they are in different
control sections, their difference has a value that is unpredictable (and therefore
probably useless). For example, the expression
BUFEND-BUFFER
has as its value the length of BUFFER in bytes. On the other hand, the value of
the expression
RDREC-COPY
is the difference in the load addresses of the two control sections. This value
depends on the way run-time storage is allocated; it is unlikely to be of any
use whatsoever to an application program.
When an expression involves external references, the assembler cannot in
general determine whether or not the expression is legal. The pairing of
relative terms to test legality cannot be done without knowing which of the terms
occur in the same control sections, and this is unknown at assembly time. In
such a case, the assembler evaluates all of the terms it can, and combines these
to form an initial expression value. It also generates Modification records so
the loader can finish the evaluation. The loader can then check the expression
for errors. We discuss this further in Chapter 3 when we examine the design of
a linking loader.
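The pairing rule can be expressed as a small check that becomes possible once the control section of every relative term is known (for external references, this is at load time). The sketch below is an assumption-laden simplification: the function name and the term representation are hypothetical, and only addition and subtraction of terms are considered.

```python
from collections import Counter


def classify_expression(terms):
    """terms: list of (sign, section) pairs, where sign is '+' or '-' and
    section is the control section of a relative term, or None for an
    absolute term. Returns 'absolute', 'relative', or 'illegal'."""
    balance = Counter()
    for sign, section in terms:
        if section is not None:
            balance[section] += 1 if sign == "+" else -1
    unpaired = {section: n for section, n in balance.items() if n != 0}
    if not unpaired:
        return "absolute"            # every relative term is paired within its section
    if list(unpaired.values()) == [1]:
        return "relative"            # exactly one unpaired positive relative term
    return "illegal"


buffer_length = classify_expression([("+", "COPY"), ("-", "COPY")])      # BUFEND-BUFFER
section_gap = classify_expression([("+", "RDREC"), ("-", "COPY")])       # RDREC-COPY
offset_in_buffer = classify_expression([("+", "COPY"), ("+", None)])     # BUFFER+3
print(buffer_length, section_gap, offset_in_buffer)
```

Here BUFEND-BUFFER is classified as absolute because both terms belong to COPY, whereas RDREC-COPY is rejected because its two relative terms come from different control sections.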
forward reference list for that symbol is scanned (if one exists), and the proper
address is inserted into any instructions previously generated.
155    204F             JEQ     EXIT        30205B
160    2052             STCH    BUFFER,X    54900F
165    2055             TIX     MAXLEN      2C203A
170    2058             JLT     RLOOP       382043
175    205B    EXIT     STX     LENGTH      10100C
180    205E             RSUB                4C0000
195
200                     SUBROUTINE TO WRITE RECORD FROM BUFFER
205
206    2061    OUTPUT   BYTE    X'05'       05
207
210    2062    WRREC    LDX     ZERO        041006
215    2065    WLOOP    TD      OUTPUT      E02061
220    2068             JEQ     WLOOP       302065
225    206B             LDCH    BUFFER,X    50900F
230    206E             WD      OUTPUT      DC2061
235    2071             TIX     LENGTH      2C100C
240    2074             JLT     WLOOP       382065
245    2077             RSUB                4C0000
255                     END     FIRST
An example should help to make this process clear. Figure 2.19(a) shows
the object code and symbol table entries as they would be after scanning line
40 of the program in Fig. 2.18. The first forward reference occurred on line 15.
Since the operand (RDREC) was not yet defined, the instruction was
assembled with no value assigned as the operand address (denoted in the figure by
----). RDREC was then entered into SYMTAB as an undefined symbol
(indicated by *); the address of the operand field of the instruction (2013) was
inserted in a list associated with RDREC. A similar process was followed with
the instructions on lines 30 and 35.
Now consider Fig. 2.19(b), which corresponds to the situation after
scanning line 160. Some of the forward references have been resolved by this time,
while others have been added. When the symbol ENDFIL was defined (line
45), the assembler placed its value in the SYMTAB entry; it then inserted this
value into the instruction operand field (at address 201C) as directed by the
forward reference list. From this point on, any references to ENDFIL would
not be forward references, and would not be entered into a list. Similarly, the
definition of RDREC (line 125) resulted in the filling in of the operand address
at location 2013. Meanwhile, two new forward references have been added: to
WRREC (line 65) and EXIT (line 155). You should continue tracing through
this process to the end of the program to show yourself that all of the forward
Memory
address   Contents
2000      xxxxxxxx xxxxxxxx xxxxxxxx xxxxxx14
2010      10094820 3D00100C 28100630 202448--
2020      --3C2012 0010000C 100F0010 030C100C
2030      48----08 10094C00 00F10010 00041006
2040      001006E0 20393020 43D82039 28100630
2050      ----5490 0F

Symbol    Value
THREE     1003
ZERO      1006
WRREC     *     (forward reference list: 201F, 2031)
EOF       1000
ENDFIL    2024
RETADR    1009
BUFFER    100F
CLOOP     2012
FIRST     200F
MAXLEN    203A
INPUT     2039
EXIT      *     (forward reference list: 2050)
RLOOP     2043
Figure 2.19(b) Object code in memory and symbol table entries for
the program in Fig. 2.18 after scanning line 160.
references will be filled in properly. At the end of the program, any SYMTAB
entries that are still marked with * indicate undefined symbols. These should
be flagged by the assembler as errors.
When the end of the program is encountered, the assembly is complete. If
no errors have occurred, the assembler searches SYMTAB for the value of the
symbol named in the END statement (in this case, FIRST) and jumps to this
location to begin execution of the assembled program.
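The bookkeeping for forward references in a load-and-go assembler can be sketched with a small symbol table class. The class and method names are hypothetical, memory is modeled as a dictionary from address to byte, and the sketch assumes non-indexed standard SIC instructions so that the two bytes after the opcode hold just the operand address.

```python
class OnePassSymtab:
    """Symbol table that keeps a forward reference list for undefined symbols."""

    def __init__(self):
        self.value = {}      # defined symbol -> address
        self.pending = {}    # undefined symbol -> operand-field addresses to fill

    def reference(self, symbol, operand_address, memory):
        if symbol in self.value:
            self._store(memory, operand_address, self.value[symbol])
        else:
            self.pending.setdefault(symbol, []).append(operand_address)

    def define(self, symbol, address, memory):
        if symbol in self.value:
            raise ValueError(f"duplicate symbol {symbol}")
        self.value[symbol] = address
        for operand_address in self.pending.pop(symbol, []):
            self._store(memory, operand_address, address)

    def undefined(self):
        """Symbols still marked undefined; at the end these are errors."""
        return sorted(self.pending)

    @staticmethod
    def _store(memory, operand_address, value):
        memory[operand_address] = (value >> 8) & 0xFF
        memory[operand_address + 1] = value & 0xFF


memory = {}
symtab = OnePassSymtab()
symtab.reference("RDREC", 0x2013, memory)    # line 15
symtab.reference("ENDFIL", 0x201C, memory)   # line 30
symtab.reference("WRREC", 0x201F, memory)    # line 35
symtab.define("ENDFIL", 0x2024, memory)      # line 45
symtab.reference("WRREC", 0x2031, memory)    # line 65
symtab.define("RDREC", 0x203D, memory)       # line 125
symtab.reference("EXIT", 0x2050, memory)     # line 155
print(symtab.undefined(), symtab.pending)
```

After these calls the state matches the situation described for Fig. 2.19(b): the operand fields at 2013 and 201C have been filled in, while WRREC and EXIT still carry forward reference lists.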
We used an absolute program as our example because, for a load-and-go
assembler, the actual address must be known at assembly time. Of course it is
not necessary for this address to be specified by the programmer; it might be
assigned by the system. In either case, however, the assembly process would
be the same: the location counter would be initialized to the actual program
starting address.
One-pass assemblers that produce object programs as output are often
used on systems where external working-storage devices (for the intermediate
file between the two passes) are not available. Such assemblers may also be
useful when the external storage is slow or is inconvenient to use for some
other reason. One-pass assemblers that produce object programs follow a
slightly different procedure from that previously described. Forward
references are entered into lists as before. Now, however, when the definition of a
symbol is encountered, instructions that made forward references to that
symbol may no longer be available in memory for modification. In general, they
will already have been written out as part of a Text record in the object
program. In this case the assembler must generate another Text record with the
correct operand address. When the program is loaded, this address will be
inserted into the instruction by the action of the loader.
Figure 2.20 illustrates this process. The second Text record contains the
object code generated from lines 10 through 40 in Fig. 2.18. The operand
addresses for the instructions on lines 15, 30, and 35 have been generated as
0000. When the definition of ENDFIL on line 45 is encountered, the assembler
generates the third Text record. This record specifies that the value 2024 (the
address of ENDFIL) is to be loaded at location 201C (the operand address field
of the JEQ instruction on line 30). When the program is loaded, therefore, the
value 2024 will replace the 0000 previously loaded. The other forward
references in the program are handled in exactly the same way. In effect, the
services of the loader are being used to complete forward references that could
not be handled by the assembler. Of course, the object program records must
be kept in their original order when they are presented to the loader.
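A short sketch shows how an extra Text record completes a forward reference, and why record order matters. The helper names are hypothetical, the Text record layout (T, 6-digit address, 2-digit length in bytes, object code) is assumed to be the one defined earlier in the chapter, and the first record here holds only the single JEQ instruction rather than the full contents of Fig. 2.20.

```python
def text_record(address, data):
    """Build a Text record: T, starting address, length in bytes, object code."""
    return f"T{address:06X}{len(data):02X}{data.hex().upper()}"


def load(records):
    """Load records in the order given; later records overwrite earlier bytes."""
    memory = {}
    for record in records:
        address = int(record[1:7], 16)
        length = int(record[7:9], 16)
        data = bytes.fromhex(record[9:9 + 2 * length])
        for offset, byte in enumerate(data):
            memory[address + offset] = byte
    return memory


records = [
    text_record(0x201B, bytes.fromhex("300000")),       # JEQ ENDFIL, address still 0000
    text_record(0x201C, (0x2024).to_bytes(2, "big")),   # patch written when ENDFIL is defined
]
memory = load(records)
patched = bytes(memory[a] for a in (0x201B, 0x201C, 0x201D)).hex().upper()
print(records[1], patched)
```

If the two records were presented in the opposite order, the 0000 in the original instruction would overwrite the patch, which is why the records must stay in their original order.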
In this section we considered only simple one-pass assemblers that
handled absolute programs. Instruction operands were assumed to be single
symbols, and the assembled instructions contained the actual (not relative)
addresses of the operands. More advanced assembler features such as literals
were not allowed. You are encouraged to think about ways of removing some
of these restrictions (see the Exercises for this section for some suggestions).
The assembler examples we discuss are for the Pentium (x86), SPARC, and
PowerPC architectures. You may want to review the descriptions of these
architectures in Chapter 1 before proceeding.
This section describes some of the features of the Microsoft MASM assembler
for Pentium and other x86 systems. Further information about MASM can be
found in Barkakati (1992).
As we discussed in Section 1.4.2, the programmer of an x86 system views
memory as a collection of segments. An MASM assembler language program
is written as a collection of segments. Each segment is defined as belonging to
a particular class, corresponding to its contents. Commonly used classes are
CODE, DATA, CONST, and STACK.
During program execution, segments are addressed via the x86 segment
registers. In most cases, code segments are addressed using register CS, and
stack segments are addressed using register SS. These segment registers are
automatically set by the system loader when a program is loaded for
execution. Register CS is set to indicate the segment that contains the starting label
specified in the END statement of the program. Register SS is set to indicate
the last stack segment processed by the loader.
Data segments (including constant segments) are normally addressed
using DS, ES, FS, or GS. The segment register to be used can be specified
explicitly by the programmer (by writing it as part of the assembler language
instruction). If the programmer does not specify a segment register, one is
selected by the assembler.
By default, the assembler assumes that all references to data segments use
register DS. This assumption can be changed by the assembler directive
ASSUME. For example, the directive
ASSUME ES:DATASEG2
tells MASM to assume that register ES indicates the data segment DATASEG2;
it does not itself load the register. Notice the similarities
between the ASSUME directive and the BASE directive we discussed for
SIC/XE. The BASE directive tells a SIC/XE assembler the contents of register
B; the programmer must provide executable instructions to load this value
into the register. Likewise, ASSUME tells MASM the contents of a segment
register; the programmer must provide instructions to load this register when
the program is executed.
Jump instructions are assembled in two different ways, depending on
whether the target of the jump is in the same code segment as the jump
instruction. A near jump is a jump to a target in the same code segment; a far jump
is a jump to a target in a different code segment. A near jump is assembled
using the current code segment register CS. A far jump must be assembled using
a different segment register, which is specified in an instruction prefix. The
assembled machine instruction for a near jump occupies 2 or 3 bytes (depending
upon whether the jump address is within 128 bytes of the current instruction).
The assembled instruction for a far jump requires 5 bytes.
Forward references to labels in the source program can cause problems.
For example, consider a jump instruction like
JMP TARGET
If the definition of the label TARGET occurs in the program before the JMP
instruction, the assembler can tell whether this is a near jump or a far jump.
However, if this is a forward reference to TARGET, the assembler does not
know how many bytes to reserve for the instruction.
By default, MASM assumes that a forward jump is a near jump. If the
target of the jump is in another code segment, the programmer must warn the
assembler by writing
JMP FAR PTR TARGET
If the jump address is within 128 bytes of the current instruction, the
programmer can specify the shorter (2-byte) near jump by writing
JMP SHORT TARGET
If the JMP to TARGET is a far jump, and the programmer does not specify FAR
PTR, a problem occurs. During Pass 1, the assembler reserves 3 bytes for the
jump instruction. However, the actual assembled instruction requires 5 bytes.
In the earlier versions of MASM, this caused an assembly error (called a phase
error). In later versions of MASM, the assembler can repeat Pass 1 to generate
the correct location counter values.
Notice the similarities between the far jump and the forward references in
SIC/XE that require the use of extended format instructions.
There are also many other situations in which the length of an assembled
instruction depends on the operands that are used. For example, the operands
of an ADD instruction may be registers, memory locations, or immediate
operands. Immediate operands may occupy from 1 to 4 bytes in the
instruction. An operand that specifies a memory location may take varying amounts
of space in the instruction, depending upon the location of the operand.
This means that Pass 1 of an x86 assembler must be considerably more
complex than Pass 1 of a SIC assembler. The first pass of the x86 assembler
must analyze the operands of an instruction, in addition to looking at the
operation code. The operation code table must also be more complicated, since it
must contain information on which addressing modes are valid for each
operand.
Segments in an MASM source program can be written in more than one
part. If a SEGMENT directive specifies the same name as a previously defined
segment, it is considered to be a continuation of that segment. All of the parts
of a segment are gathered together by the assembly process. Thus, segments
can perform a similar function to the program blocks we discussed for
SIC/XE.
References between segments that are assembled together are
automatically handled by the assembler. External references between separately
assembled modules must be handled by the linker. The MASM directive PUBLIC
has approximately the same function as the SIC/XE directive EXTDEF. The
MASM directive EXTRN has approximately the same function as EXTREF. We
will consider the action of the linker in more detail in the next chapter.
The object program from MASM may be in several different formats, to
allow easy and efficient execution of the program in a variety of operating
environments. MASM can also produce an instruction timing listing that
shows the number of clock cycles required to execute each machine
instruction. This allows the programmer to exercise a great deal of control in
optimizing timing-critical sections of code.
This section describes some of the features of the SunOS SPARC assembler.
Further information about this assembler can be found in Sun Microsystems
(1994a).
CMP %L0, 10
BLE LOOP
ADD %L2, %L3, %L4
the ADD instruction is executed before the conditional branch BLE. This ADD
instruction is said to be in the delay slot of the branch; it is executed regardless
of whether or not the conditional branch is taken.
To simplify debugging, SPARC assembly language programmers often
place NOP (no-operation) instructions in delay slots when a program is
written. The code is later rearranged to move useful instructions into the delay
slots. For example, the instruction sequence illustrated above might originally
have been
LOOP:
Moving the ADD instruction into the delay slot would produce the version
discussed earlier. (Notice that the CMP instruction could not be moved into
the delay slot, because it sets the condition codes that must be tested by the
BLE.)
However, there is another possibility. Suppose that the original version of
the loop had been
CMP %L0, 10
BLE LOOP
NOP
Now the ADD instruction is logically the first instruction in the loop. It could
still be moved into the delay slot, as previously described. However, this
would create a problem. On the last execution of the loop, the ADD instruction
(which is the beginning of the next loop iteration) should not be executed.
The SPARC architecture defines a solution to this problem. A conditional
branch instruction like BLE can be annulled. If a branch is annulled, the in-
struction in its delay slot is executed if the branch is taken, but not executed if
the branch is not taken. Annulled branches are indicated in SPARC assembler
language by writing “,A” following the operation code. Thus the loop just dis-
cussed could be rewritten as
LOOP:
CMP %L0, 10
BLE,A LOOP
ADD %L2, %L3, %L4
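The rule for conditional branches can be captured in one line and checked against the loop just discussed. This is only a model of the behavior described in the text, with hypothetical function names; it does not simulate SPARC instructions themselves.

```python
def delay_slot_executes(branch_taken: bool, annulled: bool) -> bool:
    """For a conditional branch: the delay-slot instruction is skipped only
    when the branch is annulled and not taken."""
    return branch_taken or not annulled


def count_delay_slot_executions(taken_sequence, annulled):
    """Count how often the delay-slot instruction runs over a series of
    executions of the branch."""
    return sum(1 for taken in taken_sequence if delay_slot_executes(taken, annulled))


# A loop whose branch is taken twice and then falls through on the last pass.
branch_outcomes = [True, True, False]
plain = count_delay_slot_executions(branch_outcomes, annulled=False)
with_annul = count_delay_slot_executions(branch_outcomes, annulled=True)
print(plain, with_annul)
```

Without the annul bit the ADD in the delay slot runs one extra time on the final, untaken branch; with BLE,A it runs only when control actually returns to LOOP.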
This section describes some of the features of the AIX assembler for PowerPC
and other similar systems. Further information about this assembler can be
found in IBM (1994b).
The AIX assembler includes support for various models of PowerPC
microprocessors, as well as earlier machines that implement the original POWER
architecture. The programmer can declare which architecture is being used
with the assembler directive .MACHINE. The assembler automatically checks
for POWER or PowerPC instructions that are not valid for the specified
environment. When the object program is generated, the assembler includes a flag
that indicates which processors are capable of running the program. This flag
depends on which instructions are actually used in the program, not on the
.MACHINE directive. For example, a PowerPC program that contains only
instructions that are also in the original POWER architecture would be
executable on either type of system.
As we discussed in Section 1.5.2, PowerPC load and store instructions use
a base register and a displacement value to specify an address in memory. Any
of the general-purpose registers (except GPR0) can be used as a base register.
Decisions about which registers to use in this way are left to the programmer.
In a long program, it is not unusual to have several different base registers in
use at the same time. The programmer specifies which registers are available
for use as base registers, and the contents of these registers, with the .USING
.USING LENGTH, 1
.USING BUFFER, 4
would identify GPR1 and GPR4 as base registers. GPR1 would be assumed to
contain the address of LENGTH, and GPR4 would be assumed to contain the
address of BUFFER. As with SIC/XE, the programmer must provide
instructions to place these values into the registers at execution time. Additional
.USING statements may appear at any point in the program. If a base register
is to be used later for some other purpose, the programmer indicates with the
.DROP statement that this register is no longer available for addressing
purposes.
This additional flexibility in register usage means more work for the
assembler. A base register table is used to remember which of the general-purpose
registers are currently available as base registers, and what base addresses
they contain. Processing a .USING statement causes an entry to be made in
this table (or an existing entry to be modified); processing a .DROP statement
removes the corresponding table entry. For each instruction whose operand is
an address in memory, the assembler scans the table to find a base register that
can be used to address that operand. If more than one register can be used, the
assembler selects the base register that results in the smallest signed
displacement. If no suitable base register is available, the instruction cannot be
assembled. The process of displacement calculation is the same as we described for
SIC/XE.
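A base register table of this kind might be sketched as follows. The class and method names are hypothetical, the addresses assigned to LENGTH and BUFFER are made up for the example, and the 16-bit signed displacement range is an assumption about the PowerPC load and store format. The sketch also reads "smallest signed displacement" literally as the numerically smallest signed value; if the smallest magnitude were intended, the selection would compare absolute values instead.

```python
class BaseRegisterTable:
    """Tracks which registers are declared as base registers and what they hold."""

    def __init__(self):
        self.bases = {}                      # register number -> base address

    def using(self, base_address, register):
        self.bases[register] = base_address  # add or modify an entry (.USING)

    def drop(self, register):
        self.bases.pop(register, None)       # remove an entry (.DROP)

    def select(self, target):
        """Return (register, displacement) for the operand address `target`."""
        candidates = [
            (target - base, register)
            for register, base in self.bases.items()
            if -32768 <= target - base <= 32767   # assumed 16-bit signed field
        ]
        if not candidates:
            raise ValueError("no suitable base register; cannot assemble")
        displacement, register = min(candidates)
        return register, displacement


LENGTH, BUFFER = 0x0100, 0x0200              # hypothetical addresses
table = BaseRegisterTable()
table.using(LENGTH, 1)                       # .USING LENGTH, 1
table.using(BUFFER, 4)                       # .USING BUFFER, 4
first_choice = table.select(BUFFER + 8)
table.drop(4)                                # .DROP 4
second_choice = table.select(BUFFER + 8)
print(first_choice, second_choice)
```

With both registers declared, an operand 8 bytes into BUFFER is addressed as displacement 8 from GPR4; after GPR4 is dropped, the assembler would fall back to GPR1 with a larger displacement.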
The AIX assembler language also allows the programmer to write base
registers and displacements explicitly in the source program. For example, the
instruction
L 2,8(4)
does not continue to the second pass. In this case, the assembly listing contains
only errors that could be detected during Pass 1.
If no errors are detected during the first pass, the assembler proceeds to
Pass 2. The second pass reads the source program again, instead of using an
intermediate file as we discussed for SIC. This means that location counter
values must be recalculated during Pass 2. It also means that any warning
messages that were generated during Pass 1 (but were not serious enough to
terminate the assembly) are lost. The assembly listing will contain only errors
and warnings that are generated during Pass 2.
Assembled control sections are placed into the object program according to
their storage mapping class. Executable instructions, read-only data, and
various kinds of debugging tables are assigned to an object program section
named .TEXT. Read/write data and TOC entries are assigned to an object
program section named .DATA. Uninitialized data is assigned to a section named
.BSS. When the object program is generated, the assembler first writes all of
the .TEXT control sections, followed by all of the .DATA control sections
except for the TOC. The TOC is written after the other .DATA control sections.
Relocation and linking operations are specified by entries in a relocation table,
similar to the Modification records we discussed for SIC.
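The ordering rule above can be illustrated with a short Python sketch. The storage-class names used here (code, read_only, debug, read_write, toc, uninitialized) are descriptive placeholders chosen for the example, not the assembler's real class names, and the function lists only the sections the text says are written. How .BSS is represented in the output is not covered.

```python
# Assumed mapping from (placeholder) storage mapping class to section name.
SECTION_FOR_CLASS = {
    "code": ".TEXT",
    "read_only": ".TEXT",
    "debug": ".TEXT",
    "read_write": ".DATA",
    "toc": ".DATA",
    "uninitialized": ".BSS",
}


def output_order(control_sections):
    """Order control sections as they would be written to the object program.

    `control_sections` is a list of (name, storage_class) pairs in source
    order. The result holds all .TEXT sections first, then the .DATA
    sections other than the TOC, and finally the TOC.
    """
    text = [n for n, c in control_sections if SECTION_FOR_CLASS[c] == ".TEXT"]
    data = [n for n, c in control_sections
            if SECTION_FOR_CLASS[c] == ".DATA" and c != "toc"]
    toc = [n for n, c in control_sections if c == "toc"]
    return text + data + toc


sections = [("T1", "toc"), ("D1", "read_write"), ("C1", "code"),
            ("B1", "uninitialized"), ("R1", "read_only")]
print(output_order(sections))
```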
EXERCISES
Section 2.1
1. Apply the algorithm described in Fig. 2.4 to assemble the source program
in Fig. 2.1. Your results should be the same as those shown in
Figs. 2.2 and 2.3.
2. Apply the algorithm described in Fig. 2.4 to assemble the following
SIC source program:
RESB nc!
this assembler would give an error message only for the second (i.e.,
duplicate) definition. For example, it would give an error message
only for line 5 of the program below.
1 P3 START 1000
2 LDA ALPHA
3 STA ALPHA
4 ALPHA RESW 1
5 ALPHA WORD 0
6 END
Suppose that you want to change the assembler to give error messages
for all definitions of a doubly defined symbol (e.g., lines 4 and
5), and also for all references to a doubly defined symbol (e.g., lines 2
and 3). Describe the changes you would make to accomplish this. In
making this modification, you should change the existing assembler
as little as possible.
B3 START 1000
LDA DELTA
ADD BETA
LOOP STA DELTA
Warning: label is never referenced
RSUB
ALPHA RESW 1
Warning: label is never referenced
BETA RESW 1
DELTA RESW 1
END
Section 2.2
SUM START 0
FIRST LDX #0
LDA #0
+LDB #TABLE2
BASE TABLE2
LOOP ADD TABLE,X
ADD TABLE2,X
TIX COUNT
JLT LOOP
+STA TOTAL
RSUB
COUNT RESW 1
TABLE RESW 2000
TABLE2 RESW 2000
TOTAL RESW 1
END FIRST
4. Generate the complete object program for the source program given
in Exercise 3.
5. Modify the algorithm described in Fig. 2.4 to handle all of the
SIC/XE addressing modes discussed. How would these modifica-
tions be reflected in the assembler designs discussed in Chapter 8?
11. Suppose that you are writing an assembler for a machine that has
only program-counter relative addressing. (That is, there are no
direct-addressing instruction formats and no base relative addressing.)
Suppose that you wish to assemble an instruction whose operand is
an absolute address in memory, for example,
LDA 100
a. ADD ALPHA
ALPHA DC I(3)
b. ADD ALPHA
ALPHA DC F(3.1)
c. ADD ALPHA
ALPHA DC D(3.14159)
Section 2.3
LDA =W'3'
LDA ALPHA-BETA
LDA ALPHA
SUB BETA
LDA #THREE
c. THREE EQU 5
LDA THREE
15. Suppose that for some reason it is desirable to separate the parts of
an assembler language program that require initialization (e.g.,
instructions and data items defined with WORD or BYTE) from the
parts that do not require initialization (e.g., storage reserved with
RESW or RESB). Thus, when the program is loaded for execution it
should look like
Instructions and
initialized data items
Reserved storage
(uninitialized data items)
b. LDA LENGTH-1
d. BUFFER+MAXLEN-1
e. BUFFER-MAXLEN
f. 2*LENGTH
g. 2*MAXLEN-1
h. MAXLEN-BUFFER
i. FIRST+BUFFER
j. FIRST-BUFFER+BUFEND
18. In the program of Fig. 2.9, what is the advantage of writing (on line
107)
instead of
and line 133 to
+LDT #MAXLEN
as we did in Fig. 2.9?
20. The assembler could simply assume that any reference to a symbol
not defined within a control section is an external reference. This
change would eliminate the need for the EXTREF statement. Would
this be a good idea?
21. How could an assembler that allows external references avoid the
need for an EXTDEF statement? What would be the advantages and
disadvantages of doing this?
22. The assembler could automatically use extended format for instructions
whose operands involve external references. This would eliminate
the need for the programmer to code + in such statements. What
would be the advantages and disadvantages of doing this?
23. On some systems, control sections can be composed of several different
parts, just as program blocks can. What problems does this pose
for the assembler? How might these problems be solved?
24. Assume that the symbols RDREC and COPY are defined as in Fig.
2.15. According to our rules, the expression
RDREC-COPY
would be illegal (that is, the assembler and/or the loader would re-
ject it). Suppose that for some reason the program really needs the
value of this expression. How could such a thing be accomplished
without changing the rules for expressions?
25. We discussed a large number of assembler directives, and many
more could be implemented in an actual assembler. Checking for
them one at a time using comparisons might be quite inefficient.
How could we use a table, perhaps similar to OPTAB, to speed
recognition and handling of assembler directives? (Hint: the answer
to this problem may depend upon the language in which the assembler
itself is written.)
26. Other than the listing of the source program with generated object
code, what assembler outputs might be useful to the programmer?
Suggest some optional listings that might be generated and discuss
any data structures or algorithms involved in producing them.
Section 2.4
COPY 5
FIRST 10 255
CLOOP 15 40
ENDFIL 45 30
EOF 80 45
RETADR 95 10,70
LENGTH 100 12) fo 207 60h Eppes
JEQ ENDFIL+3
that produces an object program is very similar to the linking process
described in Section 2.3.5. Why didn’t we just use Modification
records to fix up the forward references?
10. How could we extend the methods of Section 2.4.2 to handle forward
references in ORG statements?
Section 2.5
Chapter 4
Macro Processors
Figure 4.2 Program from Fig. 4.1 with macros expanded.
Lines 190a through 190m show the complete expansion of the macro invocation
on line 190. The comment lines within the macro body have been
deleted, but comments on individual statements have been retained. Note that
the macro invocation statement itself has been included as a comment line.
This serves as documentation of the statement written by the programmer.
The label on the macro invocation statement (CLOOP) has been retained as a
label on the first statement generated in the macro expansion. This allows the
programmer to use a macro instruction in exactly the same way as an assembler
language mnemonic. The macro invocations on lines 210 and 220 are expanded
in the same way. Note that the two invocations of WRBUFF specify
different arguments, so they produce different expansions.
After macro processing, the expanded file (Fig. 4.2) can be used as input to
the assembler. The macro invocation statements will be treated as comments,
and the statements generated from the macro expansions will be assembled
exactly as though they had been written directly by the programmer.
A comparison of the expanded program in Fig. 4.2 with the program in
Fig. 2.5 shows the most significant differences between macro invocation
and subroutine call. In Fig. 4.2, the statements from the body of the macro
WRBUFF are generated twice: lines 210a through 210h and lines 220a through
220h. In the program of Fig. 2.5, the corresponding statements appear only
once: in the subroutine WRREC (lines 210 through 240). In general, the statements
that form the expansion of a macro are generated (and assembled) each
time the macro is invoked. Statements in a subroutine appear only once,
regardless of how many times the subroutine is called.
Note also that our macro instructions have been written so that the body of
the macro contains no labels. In Fig. 4.1, for example, line 140 contains the
statement “JEQ *-3” and line 155 contains “JLT *-14.” The corresponding
statements in the WRREC subroutine (Fig. 2.5) are “JEQ WLOOP” and “JLT
WLOOP,” where WLOOP is a label on the TD instruction that tests the output
device. If such a label appeared on line 135 of the macro body, it would be
generated twice, on lines 210d and 220d of Fig. 4.2. This would result in an error
(a duplicate label definition) when the program is assembled. To avoid duplication
of symbols, we have eliminated labels from the body of our macro
definitions.
The use of statements like “JLT *-14” is generally considered to be a poor
programming practice. It is somewhat less objectionable within a macro
definition; however, it is still an inconvenient and error-prone method. In Section
4.2.2 we discuss ways of avoiding this problem.
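The duplicate-label problem is easy to reproduce in a small Python sketch. The two macro bodies below are shortened, hypothetical stand-ins for the WRBUFF body (the device code 05 and the helper names expand and duplicate_labels are assumptions made for the example). Each statement is a (label, opcode, operand) tuple.

```python
from collections import Counter


def expand(body, invocations):
    """Copy the macro body once per invocation, as a macro processor would."""
    generated = []
    for _ in range(invocations):
        generated.extend(body)
    return generated


def duplicate_labels(statements):
    """Return, in sorted order, every label that is defined more than once."""
    counts = Counter(label for label, _, _ in statements if label)
    return sorted(label for label, n in counts.items() if n > 1)


# Body written with a label, as in a subroutine.
labeled_body = [("WLOOP", "TD", "=X'05'"), ("", "JEQ", "WLOOP")]
# Body written with relative addressing, as in our macro definitions.
relative_body = [("", "TD", "=X'05'"), ("", "JEQ", "*-3")]

print(duplicate_labels(expand(labeled_body, 2)))
print(duplicate_labels(expand(relative_body, 2)))
```

Two invocations of the labeled body define WLOOP twice, which the assembler would reject. The relative-addressing body generates no labels at all, at the cost of the fragile *-3 operand.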
Figure 4.3 Example of the definition of macros within a macro body.
NAMTAB          DEFTAB

RDBUFF  --->    RDBUFF   &INDEV,&BUFADR,&RECLTH
                CLEAR    A
                CLEAR    S
                +LDT     #4096
                TD       =X'?1'
                JEQ      *-3
                RD       =X'?1'
                COMPR    A,S
                JEQ      *+11
                STCH     ?2,X
                TIXR     T
                JLT      *-19
                STX      ?3
        --->    MEND
                        (a)

ARGTAB

1   F1
2   BUFFER
3   LENGTH
                        (b)

Figure 4.4 Contents of macro processor tables for the program in
Fig. 4.1: (a) entries in NAMTAB and DEFTAB defining macro RDBUFF,
(b) entries in ARGTAB for invocation of RDBUFF on line 190.
The macro processor algorithm itself is presented in Fig. 4.5. The procedure
DEFINE, which is called when the beginning of a macro definition is
recognized, makes the appropriate entries in DEFTAB and NAMTAB. EXPAND is
called to set up the argument values in ARGTAB and expand a macro invocation
statement. The procedure GETLINE, which is called at several points in
the algorithm, gets the next line to be processed. This line may come from
DEFTAB (the next line of a macro being expanded), or from the input file,
depending upon whether the Boolean variable EXPANDING is set to TRUE or
FALSE.
One aspect of this algorithm deserves further comment: the handling of
macro definitions within macros (as illustrated in Fig. 4.3). When a macro
definition is being entered into DEFTAB, the normal approach would be to
continue until an MEND directive is reached. This would not work for the
example in Fig. 4.3, however. The MEND on line 3 (which actually marks the
end of the definition of RDBUFF) would be taken as the end of the definition
of MACROS. To solve this problem, our DEFINE procedure maintains a
counter named LEVEL. Each time a MACRO directive is read, the value of
LEVEL is increased by 1; each time an MEND directive is read, the value of
LEVEL is decreased by 1. When LEVEL reaches 0, the MEND that corresponds
to the original MACRO directive has been found. This process is very
much like matching left and right parentheses when scanning an arithmetic
expression.
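The LEVEL counter can be tried out in isolation with a few lines of Python. In this sketch the function name find_matching_mend and the sample list of opcodes are illustrative assumptions. The list only mimics the shape of a definition that contains two inner definitions.

```python
def find_matching_mend(opcodes, start):
    """Return the index of the MEND that closes the MACRO at opcodes[start]."""
    if opcodes[start] != "MACRO":
        raise ValueError("start must point at a MACRO directive")
    level = 0
    for i in range(start, len(opcodes)):
        if opcodes[i] == "MACRO":
            level += 1            # one more definition is open
        elif opcodes[i] == "MEND":
            level -= 1            # innermost open definition is closed
            if level == 0:
                return i
    raise ValueError("MACRO without a matching MEND")


# Outer definition (index 0) containing two inner definitions.
opcodes = ["MACRO", "MACRO", "CLEAR", "MEND",
           "MACRO", "CLEAR", "MEND", "MEND"]

print(find_matching_mend(opcodes, 0))   # index found with the LEVEL counter
print(opcodes.index("MEND"))            # index found by stopping at the first MEND
```

Stopping at the first MEND would end the outer definition at the close of its first inner definition, which is exactly the failure described above.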
procedure PROCESSLINE
   begin
      search NAMTAB for OPCODE
      if found then
         EXPAND
      else if OPCODE = 'MACRO' then
         DEFINE
      else write source line to expanded file
   end {PROCESSLINE}

procedure DEFINE
   begin
      enter macro name into NAMTAB
      enter macro prototype into DEFTAB
      LEVEL := 1
      while LEVEL > 0 do
         begin
            GETLINE
            if this is not a comment line then
               begin
                  substitute positional notation for parameters
                  enter line into DEFTAB
                  if OPCODE = 'MACRO' then
                     LEVEL := LEVEL + 1
                  else if OPCODE = 'MEND' then
                     LEVEL := LEVEL - 1
               end {if not comment}
         end {while}
      store in NAMTAB pointers to beginning and end of definition
   end {DEFINE}

procedure EXPAND
   begin
      EXPANDING := TRUE
      get first line of macro definition {prototype} from DEFTAB
      set up arguments from macro invocation in ARGTAB
      write macro invocation to expanded file as a comment
      while not end of macro definition do
         begin
            GETLINE
            PROCESSLINE
         end {while}
      EXPANDING := FALSE
   end {EXPAND}

procedure GETLINE
   begin
      if EXPANDING then
         begin
            get next line of macro definition from DEFTAB
            substitute arguments from ARGTAB for positional notation
         end {if}
      else
         read next line from input file
   end {GETLINE}
You may want to apply this algorithm by hand to the program in Fig. 4.1
to be sure you understand its operation. The result should be the same as
shown in Fig. 4.2.
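As an aid to such hand tracing, the four procedures can be transcribed into a short runnable Python sketch. It is an approximation under several assumptions, not a definitive implementation: source lines are plain strings with a label only in column 1 and no blanks inside operands, the names MacroProcessor, parse, and is_comment are invented here, the sample program is a shortened hypothetical one rather than Fig. 4.1, and the label of an invocation is kept only in the comment line instead of being moved to the first generated statement. Like the algorithm it follows, it does not handle a macro invocation inside a macro body.

```python
def parse(line):
    """Split a line into (label, opcode, operand); a label starts in column 1."""
    fields = line.split()
    label = ""
    if fields and not line[0].isspace():
        label = fields.pop(0)
    opcode = fields[0] if fields else ""
    operand = fields[1] if len(fields) > 1 else ""
    return label, opcode, operand


def is_comment(line):
    return line.lstrip().startswith(".")


class MacroProcessor:
    def __init__(self, source_lines):
        self.source = iter(source_lines)
        self.namtab = {}        # macro name -> (first, last) index in DEFTAB
        self.deftab = []        # prototypes and bodies, parameters as ?1, ?2, ...
        self.argtab = []        # arguments of the invocation being expanded
        self.expanding = False
        self.next_def = 0       # next DEFTAB line to read while expanding
        self.expanded = []      # the expanded file

    def run(self):
        while (line := self.getline()) is not None:
            self.processline(line)
        return self.expanded

    def processline(self, line):
        label, opcode, operand = parse(line)
        if is_comment(line):
            self.expanded.append(line)
        elif opcode in self.namtab:
            self.expand(line, opcode, operand)
        elif opcode == "MACRO":
            self.define(label, operand)
        else:
            self.expanded.append(line)

    def define(self, name, operand):
        params = operand.split(",") if operand else []
        # Replace longer names first so &A cannot clobber part of &AB.
        order = sorted(enumerate(params, 1), key=lambda ip: -len(ip[1]))
        first = len(self.deftab)
        self.deftab.append(f"{name} {operand}".rstrip())    # prototype
        level = 1
        while level > 0:
            line = self.getline()
            if line is None:
                raise ValueError(f"missing MEND for macro {name}")
            if is_comment(line):
                continue                    # comment lines are not stored
            for index, param in order:
                line = line.replace(param, f"?{index}")
            self.deftab.append(line)
            opcode = parse(line)[1]
            if opcode == "MACRO":
                level += 1
            elif opcode == "MEND":
                level -= 1
        self.namtab[name] = (first, len(self.deftab) - 1)

    def expand(self, invocation, name, operand):
        first, last = self.namtab[name]
        self.expanding = True
        self.argtab = operand.split(",") if operand else []
        self.expanded.append("." + invocation)   # invocation kept as a comment
        self.next_def = first + 1                # skip the prototype line
        while self.next_def < last:              # stop before the closing MEND
            self.processline(self.getline())
        self.expanding = False

    def getline(self):
        if self.expanding:
            line = self.deftab[self.next_def]
            self.next_def += 1
            # Highest index first so ?1 cannot clobber part of ?10.
            for index in range(len(self.argtab), 0, -1):
                line = line.replace(f"?{index}", self.argtab[index - 1])
            return line
        return next(self.source, None)


program = [
    "WRBUFF MACRO &OUTDEV,&BUFADR",
    ". comment lines in the body are dropped",
    " TD =X'&OUTDEV'",
    " LDCH &BUFADR",
    " MEND",
    "FIRST WRBUFF 05,BUFFER",
    " WRBUFF 06,EOF",
]
for line in MacroProcessor(program).run():
    print(line)
```

Keeping DEFTAB as one flat list with index pairs in NAMTAB mirrors the pointer arrangement of Fig. 4.4, and it lets DEFINE append a nested definition while EXPAND is still reading the enclosing one.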
Most macro processors allow the definitions of commonly used macro instructions
to appear in a standard system library, rather than in the source program.
This makes the use of such macros much more convenient. Definitions
are retrieved from this library as they are needed during macro processing.
The extension of the algorithm in Fig. 4.5 to include this sort of processing
appears as an exercise at the end of this chapter.
In this section we discuss several extensions to the basic macro processor func-
tions presented in Section 4.1. As we have mentioned before, these extended
features are not directly related to the architecture of the computer for which
the macro processor is written. Section 4.2.1 describes a method for concate-
nating macro instruction parameters with other character strings. Section 4.2.2
discusses one method for generating unique labels within macro expansions,
which avoids the need for extensive use of relative addressing at the source
statement level. Section 4.2.3 introduces the important topic of conditional
macro expansion and illustrates the concepts involved with several examples.
This ability to alter the expansion of a macro by using control statements
makes macro instructions a much more powerful and useful tool for the pro-
grammer. Section 4.2.4 describes the definition and use of keyword parameters
in macro instructions.