SYSTEM SOFTWARE AND COMPILERS MOD1


Visit: https://hemanthrajhemu.github.io

Join Telegram to get Instant Updates: https://bit.ly/VTU_TELEGRAM

Contact: MAIL: [email protected]

INSTAGRAM: www.instagram.com/hemanthraj_hemu/

INSTAGRAM: www.instagram.com/futurevisionbie/

WHATSAPP SHARE: https://bit.ly/FVBIESHARE



Contents

Chapter 1 Background
1.1 Introduction
1.2 System Software and Machine Architecture
1.3 The Simplified Instructional Computer (SIC)
1.3.1 SIC Machine Architecture
1.3.2 SIC/XE Machine Architecture
1.3.3 SIC Programming Examples
1.4 Traditional (CISC) Machines
1.4.1 VAX Architecture
1.4.2 Pentium Pro Architecture
1.5 RISC Machines
1.5.1 UltraSPARC Architecture
1.5.2 PowerPC Architecture
1.5.3 Cray T3E Architecture
Exercises

Chapter 2 Assemblers
2.1 Basic Assembler Functions
2.1.1 A Simple SIC Assembler
2.1.2 Assembler Algorithm and Data Structures
2.2 Machine-Dependent Assembler Features
2.2.1 Instruction Formats and Addressing Modes
2.2.2 Program Relocation
2.3 Machine-Independent Assembler Features
2.3.1 Literals
2.3.2 Symbol-Defining Statements
2.3.3 Expressions
2.3.4 Program Blocks
2.3.5 Control Sections and Program Linking
2.4 Assembler Design Options
2.4.1 One-Pass Assemblers
2.4.2 Multi-Pass Assemblers
2.5 Implementation Examples
2.5.1 MASM Assembler
2.5.2 SPARC Assembler
2.5.3 AIX Assembler
Exercises

Chapter 3 Loaders and Linkers
3.1 Basic Loader Functions
3.1.1 Design of an Absolute Loader
3.1.2 A Simple Bootstrap Loader
3.2 Machine-Dependent Loader Features
3.2.1 Relocation
3.2.2 Program Linking
3.2.3 Algorithm and Data Structures for a Linking Loader
3.3 Machine-Independent Loader Features
3.3.1 Automatic Library Search
3.3.2 Loader Options
3.4 Loader Design Options
3.4.1 Linkage Editors
3.4.2 Dynamic Linking
3.4.3 Bootstrap Loaders
3.5 Implementation Examples
3.5.1 MS-DOS Linker
3.5.2 SunOS Linkers
3.5.3 Cray MPP Linker
Exercises

Chapter 4 Macro Processors
4.1 Basic Macro Processor Functions
4.1.1 Macro Definition and Expansion
4.1.2 Macro Processor Algorithm and Data Structures
4.2 Machine-Independent Macro Processor Features
4.2.1 Concatenation of Macro Parameters
4.2.2 Generation of Unique Labels
4.2.3 Conditional Macro Expansion
4.2.4 Keyword Macro Parameters
4.3 Macro Processor Design Options
4.3.1 Recursive Macro Expansion
4.3.2 General-Purpose Macro Processors
4.3.3 Macro Processing within Language Translators
4.4 Implementation Examples
4.4.1 MASM Macro Processor
4.4.2 ANSI C Macro Language

Chapter 1

Background

This chapter contains a variety of information that serves as background for
the material presented later. Section 1.1 gives a brief introduction to system
software and an overview of the structure of this book. Section 1.2 begins a
discussion of the relationships between system software and machine
architecture, which continues throughout the text. Section 1.3 describes the
Simplified Instructional Computer (SIC) that is used to present fundamental
software concepts. Sections 1.4 and 1.5 provide an introduction to the
architecture of several computers that are used as examples throughout the
text. Further information on most of the machine architecture topics discussed
can be found in Tabak (1995) and Patterson and Hennessy (1996).
Most of the material in this chapter is presented at a summary level, with
many details omitted. The level of detail given here is sufficient background
for the remainder of the text. You should not attempt to memorize the material
in this chapter, or be overly concerned with minor points. Instead, it is
recommended that you read through this material, and then use it for reference
as needed in later chapters. References are provided throughout the chapter for
readers who want further information.

1.1 INTRODUCTION

This text is an introduction to the design and implementation of system
software. System software consists of a variety of programs that support the
operation of a computer. This software makes it possible for the user to focus
on an application or other problem to be solved, without needing to know the
details of how the machine works internally.
When you took your first programming course, you were already using
many different types of system software. You probably wrote programs in a
high-level language like C++ or Pascal, using a text editor to create and modify
the program. You translated these programs into machine language using a
compiler. The resulting machine language program was loaded into memory
and prepared for execution by a loader or linker. You may have used a debugger
to help detect errors in the program.

In later courses, you probably wrote programs in assembler language. You
may have used macro instructions in these programs to read and write data,
or to perform other higher-level functions. You used an assembler, which
probably included a macro processor, to translate these programs into machine
language. The translated programs were prepared for execution by the loader or
linker, and may have been tested using the debugger.
You controlled all of these processes by interacting with the operating
system of the computer. If you were using a system like UNIX or DOS, you
probably typed commands at a keyboard. If you were using a system like MacOS or
Windows, you probably specified commands with menus and a point-and-click
interface. In either case, the operating system took care of all the
machine-level details for you. Your computer may have been connected to a
network, or may have been shared by other users. It may have had many
different kinds of storage devices, and several ways of performing input and
output. However, you did not need to be concerned with these issues. You could
concentrate on what you wanted to do, without worrying about how it was
accomplished.
As you read this book, you will learn about several important types of
system software. You will come to understand the processes that were going on
“behind the scenes” as you used the computer in previous courses. By
understanding the system software, you will gain a deeper understanding of how
computers actually work.
The major topics covered in this book are assemblers, loaders and linkers,
macro processors, compilers, and operating systems; each of Chapters 2
through 6 is devoted to one of these subjects. We also consider
implementations of these types of software on several real machines. One
central theme of the book is the relationship between system software and
machine architecture: the design of an assembler, operating system, etc., is
influenced by the architecture of the machine on which it is to run. Some of
these influences are discussed in the next section; many other examples appear
throughout the text.
Chapter 7 contains a survey of some other important types of system
software: database management systems, text editors, and interactive debugging
systems. Chapter 8 contains an introduction to software engineering concepts
and techniques, focusing on the use of such methods in writing system
software. This chapter can be read at any time after the introduction to
assemblers in Section 2.1.
The depth of treatment in this text varies considerably from one topic to
another. The chapters on assemblers, loaders and linkers, and macro
processors contain enough implementation details to prepare the reader to write
these types of software for a real computer. Compilers and operating systems,
on the other hand, are very large topics; each has, by itself, been the subject
of many complete books and courses. It is obviously impossible to provide a
full coverage of these subjects in a single chapter of any reasonable size.
Instead, we provide an introduction to the most important concepts and issues
related to compilers and operating systems, stressing the relationships between
software design and machine architecture. Other subtopics are discussed as
space permits, with references provided for readers who wish to explore these
areas further. Our goal is to provide a good overview of these subjects that
can also serve as background for students who will later take more advanced
software courses. This same approach is also applied to the other topics
surveyed in Chapter 7.

1.2 SYSTEM SOFTWARE AND MACHINE ARCHITECTURE

One characteristic in which most system software differs from application
software is machine dependency. An application program is primarily concerned
with the solution of some problem, using the computer as a tool. The focus is
on the application, not on the computing system. System programs, on the
other hand, are intended to support the operation and use of the computer
itself, rather than any particular application. For this reason, they are
usually related to the architecture of the machine on which they are to run.
For example, assemblers translate mnemonic instructions into machine code; the
instruction formats, addressing modes, etc., are of direct concern in assembler
design. Similarly, compilers must generate machine language code, taking into
account such hardware characteristics as the number and type of registers and
the machine instructions available. Operating systems are directly concerned
with the management of nearly all of the resources of a computing system.
Many other examples of such machine dependencies may be found throughout
this book.
On the other hand, there are some aspects of system software that do not
directly depend upon the type of computing system being supported. For
example, the general design and logic of an assembler is basically the same on
most computers. Some of the code optimization techniques used by compilers
are independent of the target machine (although there are also
machine-dependent optimizations). Likewise, the process of linking together
independently assembled subprograms does not usually depend on the computer
being used. We will also see many examples of such machine-independent
features in the chapters that follow.
Because most system software is machine-dependent, we must include real
machines and real pieces of software in our study. However, most real
computers have certain characteristics that are unusual or even unique. It can
be difficult to distinguish between those features of the software that are
truly fundamental and those that depend solely on the idiosyncrasies of a
particular machine. To avoid this problem, we present the fundamental functions
of each piece of software through discussion of a Simplified Instructional
Computer (SIC). SIC is a hypothetical computer that has been carefully designed
to include the hardware features most often found on real machines, while
avoiding unusual or irrelevant complexities. In this way, the central concepts
of a piece of system software can be clearly separated from the implementation
details associated with a particular machine. This approach provides the reader
with a starting point from which to begin the design of system software for a
new or unfamiliar computer.
Each major chapter in this text first introduces the basic functions of
the type of system software being discussed. We then consider
machine-dependent and machine-independent extensions to these functions, and
examples of implementations on actual machines. Specifically, the major
chapters are divided into the following sections:

1. Features that are fundamental, and that should be found in any
example of this type of software.
2. Features whose presence and character are closely related to the
machine architecture.
3. Other features that are commonly found in implementations of this
type of software, and that are relatively machine-independent.
4. Major design options for structuring a particular piece of software
(for example, single-pass versus multi-pass processing).
5. Examples of implementations on actual machines, stressing unusual
software features and those that are related to machine characteristics.

This chapter contains brief descriptions of SIC and of the real machines
that are used as examples. You are encouraged to read these descriptions now,
and refer to them as necessary when studying the examples in each chapter.

1.3 THE SIMPLIFIED INSTRUCTIONAL COMPUTER (SIC)

In this section we describe the architecture of our Simplified Instructional
Computer (SIC). This machine has been designed to illustrate the most
commonly encountered hardware features and concepts, while avoiding most of
the idiosyncrasies that are often found in real machines.

Like many other products, SIC comes in two versions: the standard model
and an XE version (XE stands for “extra equipment,” or perhaps “extra
expensive”). The two versions have been designed to be upward compatible; that
is, an object program for the standard SIC machine will also execute properly
on a SIC/XE system. (Such upward compatibility is often found on real
computers that are closely related to one another.) Section 1.3.1 summarizes
the standard features of SIC. Section 1.3.2 describes the additional features
that are included in SIC/XE. Section 1.3.3 presents simple examples of SIC and
SIC/XE programming. These examples are intended to help you become more
familiar with the SIC and SIC/XE instruction sets and assembler language.
Practice exercises in SIC and SIC/XE programming can be found at the end of
this chapter.

1.3.1 SIC Machine Architecture

Memory

Memory consists of 8-bit bytes; any 3 consecutive bytes form a word (24 bits).
All addresses on SIC are byte addresses; words are addressed by the location
of their lowest numbered byte. There are a total of 32,768 (2^15) bytes in the
computer memory.

Registers

There are five registers, all of which have special uses. Each register is 24 bits
in length. The following table indicates the numbers, mnemonics, and uses of
these registers. (The numbering scheme has been chosen for compatibility
with the XE version of SIC.)

Mnemonic   Number   Special use

A          0        Accumulator; used for arithmetic operations
X          1        Index register; used for addressing
L          2        Linkage register; the Jump to Subroutine (JSUB)
                    instruction stores the return address
                    in this register
PC         8        Program counter; contains the address of the
                    next instruction to be fetched for execution
SW         9        Status word; contains a variety of
                    information, including a Condition Code (CC)

Data Formats

Integers are stored as 24-bit binary numbers; 2’s complement representation is
used for negative values. Characters are stored using their 8-bit ASCII codes
(see Appendix B). There is no floating-point hardware on the standard version
of SIC.
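
The 24-bit 2's complement representation can be illustrated with a short Python sketch. This is only an illustration, not part of SIC itself; the function names `to_word` and `from_word` are hypothetical helpers chosen for this example.

```python
WORD_BITS = 24
WORD_MASK = (1 << WORD_BITS) - 1      # 0xFFFFFF
SIGN_BIT = 1 << (WORD_BITS - 1)       # 0x800000


def to_word(value: int) -> int:
    """Encode a signed integer as a 24-bit 2's complement word."""
    if not -SIGN_BIT <= value < SIGN_BIT:
        raise ValueError("value does not fit in a 24-bit word")
    return value & WORD_MASK


def from_word(word: int) -> int:
    """Decode a 24-bit word back into a signed integer."""
    word &= WORD_MASK
    return word - (1 << WORD_BITS) if word & SIGN_BIT else word


print(f"{to_word(-5):06X}")   # negative values have the high-order bit set
print(from_word(0xFFFFFB))
print(ord("Z"))               # characters are stored as 8-bit ASCII codes
```

Masking with `WORD_MASK` keeps exactly the low-order 24 bits, which is why the same mask works for both positive and negative values.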

Instruction Formats

All machine instructions on the standard version of SIC have the following
24-bit format:

   8       1      15
 opcode    x    address

The flag bit x is used to indicate indexed-addressing mode.

Addressing Modes

There are two addressing modes available, indicated by the setting of the x bit
in the instruction. The following table describes how the target address is
calculated from the address given in the instruction. Parentheses are used to
indicate the contents of a register or a memory location. For example, (X)
represents the contents of register X.

Mode       Indication   Target address calculation

Direct     x = 0        TA = address
Indexed    x = 1        TA = address + (X)
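
As a minimal sketch of the format and addressing modes just described, the following Python function splits a 24-bit standard SIC instruction into its opcode, x bit, and 15-bit address, and then computes the target address. The function name `decode_sic` and the sample instruction values are assumptions made for illustration; the sketch does not model what the hardware does if address + (X) exceeds the memory size.

```python
def decode_sic(instruction: int, x_register: int = 0) -> tuple:
    """Split a 24-bit SIC instruction and compute its target address.

    Returns (opcode, x_bit, address, target_address).
    """
    opcode = (instruction >> 16) & 0xFF    # high-order 8 bits
    x_bit = (instruction >> 15) & 0x1      # indexed-addressing flag
    address = instruction & 0x7FFF         # low-order 15 bits
    # Direct: TA = address.  Indexed: TA = address + (X).
    target = address + (x_register if x_bit else 0)
    return opcode, x_bit, address, target


# Direct addressing: x = 0, so the target address is the address field.
print([hex(v) for v in decode_sic(0x003600)])
# Indexed addressing: x = 1, so the contents of X are added.
print([hex(v) for v in decode_sic(0x009000, x_register=0x90)])
```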

Instruction Set

SIC provides a basic set of instructions that are sufficient for most simple
tasks. These include instructions that load and store registers (LDA, LDX, STA,
STX, etc.), as well as integer arithmetic operations (ADD, SUB, MUL, DIV). All
arithmetic operations involve register A and a word in memory, with the result
being left in the register. There is an instruction (COMP) that compares the
value in register A with a word in memory; this instruction sets a condition
code CC to indicate the result (<, =, or >). Conditional jump instructions
(JLT, JEQ, JGT) can test the setting of CC, and jump accordingly. Two
instructions are provided for subroutine linkage. JSUB jumps to the subroutine,
placing the return address in register L; RSUB returns by jumping to the
address contained in register L.
Appendix A gives a complete list of all SIC (and SIC/XE) instructions, with
their operation codes and a specification of the function performed by each.

Input and Output

On the standard version of SIC, input and output are performed by
transferring 1 byte at a time to or from the rightmost 8 bits of register A.
Each device is assigned a unique 8-bit code. There are three I/O instructions,
each of which specifies the device code as an operand.
The Test Device (TD) instruction tests whether the addressed device is
ready to send or receive a byte of data. The condition code is set to indicate
the result of this test. (A setting of < means the device is ready to send or
receive, and = means the device is not ready.) A program needing to transfer
data must wait until the device is ready, then execute a Read Data (RD) or
Write Data (WD). This sequence must be repeated for each byte of data to be
read or written. The program shown in Fig. 2.1 (Chapter 2) illustrates this
technique for performing I/O.
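
The test-then-transfer sequence described above can be modeled in Python as a simple polling loop. This is only a sketch: the `Device` class, its every-other-poll readiness behavior, and the function `read_bytes` are hypothetical and exist purely to show the control flow; they are not taken from Fig. 2.1.

```python
class Device:
    """Hypothetical input device that is ready on every second poll."""

    def __init__(self, data: bytes):
        self._data = list(data)
        self._polls = 0

    def test(self) -> str:
        """Model TD: return '<' if the device is ready, '=' if it is not."""
        self._polls += 1
        return "<" if self._polls % 2 == 0 else "="

    def read(self) -> int:
        """Model RD: deliver the next byte (rightmost 8 bits of register A)."""
        return self._data.pop(0)


def read_bytes(device: Device, count: int) -> bytes:
    """Read `count` bytes, testing the device before every transfer."""
    received = bytearray()
    for _ in range(count):
        while device.test() == "=":   # device not ready: test it again
            pass
        received.append(device.read())
    return bytes(received)


print(read_bytes(Device(b"SIC"), 3))
```

The inner loop corresponds to repeating TD until the condition code is no longer =, and the outer loop repeats the whole sequence once per byte.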

1.3.2 SIC/XE Machine Architecture

Memory

The memory structure for SIC/XE is the same as that previously described for
SIC. However, the maximum memory available on a SIC/XE system is
1 megabyte (2^20 bytes). This increase leads to a change in instruction formats
and addressing modes.

Registers

The following additional registers are provided by SIC/XE:

Mnemonic   Number   Special use

B          3        Base register; used for addressing
S          4        General working register; no special use
T          5        General working register; no special use
F          6        Floating-point accumulator (48 bits)

Data Formats

SIC/XE provides the same data formats as the standard version. In addition,
there is a 48-bit floating-point data type with the following format:

 1      11          36
 s   exponent    fraction

The fraction is interpreted as a value between 0 and 1; that is, the assumed
binary point is immediately before the high-order bit. For normalized
floating-point numbers, the high-order bit of the fraction must be 1. The
exponent is interpreted as an unsigned binary number between 0 and 2047. If the
exponent has value e and the fraction has value f, the absolute value of the
number represented is

f * 2^(e-1024).

The sign of the floating-point number is indicated by the value of s (0 =
positive, 1 = negative). A value of zero is represented by setting all bits
(including sign, exponent, and fraction) to 0.
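
To make the interpretation of the three fields concrete, here is a small Python sketch that decodes a 48-bit SIC/XE floating-point value into an exact rational number. The function name `decode_float48` is a hypothetical helper, and exact `Fraction` arithmetic is used only to avoid rounding in the illustration.

```python
from fractions import Fraction

FRACTION_BITS = 36
EXPONENT_BITS = 11


def decode_float48(bits: int) -> Fraction:
    """Decode a 48-bit value laid out as s (1), exponent (11), fraction (36)."""
    if bits == 0:
        return Fraction(0)                       # all bits 0 represents zero
    sign = (bits >> (FRACTION_BITS + EXPONENT_BITS)) & 0x1
    exponent = (bits >> FRACTION_BITS) & ((1 << EXPONENT_BITS) - 1)
    # The binary point sits immediately before the high-order fraction bit,
    # so the fraction field is divided by 2**36.
    fraction = Fraction(bits & ((1 << FRACTION_BITS) - 1), 1 << FRACTION_BITS)
    magnitude = fraction * Fraction(2) ** (exponent - 1024)
    return -magnitude if sign else magnitude


# 1.0 in normalized form: f = 1/2 (high-order fraction bit set), e = 1025.
one = (1025 << 36) | (1 << 35)
print(decode_float48(one))
print(decode_float48(one | (1 << 47)))   # same value with the sign bit set
```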

Instruction Formats

The larger memory available on SIC/XE means that an address will (in
general) no longer fit into a 15-bit field; thus the instruction format used on
the standard version of SIC is no longer suitable. There are two possible
options: either use some form of relative addressing, or extend the address
field to 20 bits. Both of these options are included in SIC/XE (Formats 3 and 4
in the following description). In addition, SIC/XE provides some instructions
that do not reference memory at all. Formats 1 and 2 in the following
description are used for such instructions.
The new set of instruction formats is as follows. The settings of the flag bits
in Formats 3 and 4 are discussed under Addressing Modes. Bit e is used to
distinguish between Formats 3 and 4 (e = 0 means Format 3, e = 1 means Format
4). Appendix A indicates the format to be used with each machine instruction.

Format 1 (1 byte):

  8
  op

Format 2 (2 bytes):

  8     4     4
  op    r1    r2

Format 3 (3 bytes):

  6     1  1  1  1  1  1   12
  op    n  i  x  b  p  e   disp

Format 4 (4 bytes):

  6     1  1  1  1  1  1   20
  op    n  i  x  b  p  e   address

Addressing Modes

Two new relative addressing modes are available for use with instructions
assembled using Format 3. These are described in the following table:

Mode                       Indication     Target address calculation

Base relative              b = 1, p = 0   TA = (B) + disp    (0 ≤ disp ≤ 4095)
Program-counter relative   b = 0, p = 1   TA = (PC) + disp   (-2048 ≤ disp ≤ 2047)

For base relative addressing, the displacement field disp in a Format 3
instruction is interpreted as a 12-bit unsigned integer. For program-counter
relative addressing, this field is interpreted as a 12-bit signed integer, with
negative values represented in 2’s complement notation.
If bits b and p are both set to 0, the disp field from the Format 3 instruction
is taken to be the target address. For a Format 4 instruction, bits b and p are
normally set to 0, and the target address is taken from the address field of the
instruction. We will call this direct addressing, to distinguish it from the
relative addressing modes described above.
Any of these addressing modes can also be combined with indexed
addressing: if bit x is set to 1, the term (X) is added in the target address
calculation. Notice that the standard version of the SIC machine uses only
direct addressing (with or without indexing).

Bits i and n in Formats 3 and 4 are used to specify how the target address is
used. If bit i = 1 and n = 0, the target address itself is used as the operand
value; no memory reference is performed. This is called immediate addressing.
If bit i = 0 and n = 1, the word at the location given by the target address is
fetched; the value contained in this word is then taken as the address of the
operand value. This is called indirect addressing. If bits i and n are both 0 or
both 1, the target address is taken as the location of the operand; we will
refer to this as simple addressing. Indexing cannot be used with immediate or
indirect addressing modes.
Many authors use the term effective address to denote what we have called
the target address for an instruction. However, there is disagreement
concerning the meaning of effective address when referring to an instruction
that uses indirect addressing. To avoid confusion, we use the term target
address throughout this book.
SIC/XE instructions that specify neither immediate nor indirect addressing
are assembled with bits n and i both set to 1. Assemblers for the standard
version of SIC will, however, set the bits in both of these positions to 0.
(This is because the 8-bit binary codes for all of the SIC instructions end in
00.) All SIC/XE machines have a special hardware feature designed to provide
the upward compatibility mentioned earlier. If bits n and i are both 0, then
bits b, p, and e are considered to be part of the address field of the
instruction (rather than flags indicating addressing modes). This makes
Instruction Format 3 identical to the format used on the standard version of
SIC, providing the desired compatibility.
Figure 1.1 gives examples of the different addressing modes available on
SIC/XE. Figure 1.1(a) shows the contents of registers B, PC, and X, and of
selected memory locations. (All values are given in hexadecimal.) Figure 1.1(b)
gives the machine code for a series of LDA instructions. The target address
generated by each instruction, and the value that is loaded into register A, are
also shown. You should carefully examine these examples, being sure you
understand the different addressing modes illustrated.
For ease of reference, all of the SIC/XE instruction formats and addressing
modes are summarized in Appendix A.
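
The rules in this section can be checked against Figure 1.1 with a short Python sketch. It is an illustration under stated assumptions, not a definitive model of the hardware: the function `resolve_operand`, the dictionary-based memory (one entry per word address), and the decision to treat every Format 4 instruction as direct are all choices made for this example. The register and memory contents are the ones given in Figure 1.1(a).

```python
def sign_extend_12(disp: int) -> int:
    """Interpret a 12-bit field as a signed 2's complement integer."""
    return disp - 0x1000 if disp & 0x800 else disp


def resolve_operand(hex_code: str, regs: dict, memory: dict) -> tuple:
    """Return (target_address, operand_value) for a 3- or 4-byte instruction."""
    raw = bytes.fromhex(hex_code)
    n, i = (raw[0] >> 1) & 1, raw[0] & 1
    x, b = (raw[1] >> 7) & 1, (raw[1] >> 6) & 1
    p, e = (raw[1] >> 5) & 1, (raw[1] >> 4) & 1

    if n == 0 and i == 0:
        # Standard SIC compatibility: b, p, and e belong to a 15-bit address.
        ta = ((raw[1] & 0x7F) << 8) | raw[2]
        if x:
            ta += regs["X"]
        return ta, memory[ta]

    if e:
        # Format 4: 20-bit address field (b and p assumed to be 0 here).
        ta = ((raw[1] & 0x0F) << 16) | (raw[2] << 8) | raw[3]
    else:
        # Format 3: 12-bit disp field, possibly relative to B or PC.
        disp = ((raw[1] & 0x0F) << 8) | raw[2]
        if b and p:
            raise ValueError("b = 1 and p = 1 is not covered by this sketch")
        if b:
            ta = regs["B"] + disp                    # base relative
        elif p:
            ta = regs["PC"] + sign_extend_12(disp)   # program-counter relative
        else:
            ta = disp                                # direct
    if x:
        ta += regs["X"]                              # indexed

    if i and not n:
        return ta, ta                                # immediate
    if n and not i:
        return ta, memory[memory[ta]]                # indirect
    return ta, memory[ta]                            # simple


# Contents taken from Figure 1.1(a); all values are hexadecimal.
REGS = {"B": 0x006000, "PC": 0x003000, "X": 0x000090}
MEMORY = {0x3030: 0x003600, 0x3600: 0x103000, 0x6390: 0x00C303, 0xC303: 0x003030}

for code in ("032600", "03C300", "022030", "010030", "003600", "0310C303"):
    ta, value = resolve_operand(code, REGS, MEMORY)
    print(f"{code:<8}  TA = {ta:X}  value = {value:06X}")
```

Running the loop should reproduce the target-address and loaded-value columns of Figure 1.1(b), which is a convenient way to confirm one's reading of the flag bits.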

(a) Register contents and selected memory locations:

(B)  = 006000
(PC) = 003000
(X)  = 000090

Address   Contents
3030      003600
3600      103000
6390      00C303
C303      003030

(b) Machine code for a series of LDA instructions:

Machine instruction                                               Target    Value loaded
Hex        op      n i x b p e   disp/address                     address   into register A

032600     000000  1 1 0 0 1 0   0110 0000 0000                   3600      103000
03C300     000000  1 1 1 1 0 0   0011 0000 0000                   6390      00C303
022030     000000  1 0 0 0 1 0   0000 0011 0000                   3030      103000
010030     000000  0 1 0 0 0 0   0000 0011 0000                   30        000030
003600     000000  0 0 0 0 1 1   0110 0000 0000                   3600      103000
0310C303   000000  1 1 0 0 0 1   0000 1100 0011 0000 0011         C303      003030

Figure 1.1 Examples of SIC/XE instructions and addressing modes.

Instruction Set

SIC/XE provides all of the instructions that are available on the standard
version. In addition, there are instructions to load and store the new registers
(LDB, STB, etc.) and to perform floating-point arithmetic operations (ADDF,
SUBF, MULF, DIVF). There are also instructions that take their operands from
registers. Besides the RMO (register move) instruction, these include
register-to-register arithmetic operations (ADDR, SUBR, MULR, DIVR). A
special supervisor call instruction (SVC) is provided. Executing this
instruction generates an interrupt that can be used for communication with the
operating system. (Supervisor calls and interrupts are discussed in Chapter 6.)
There are also several other new instructions. Appendix A gives a complete
list of all SIC/XE instructions, with their operation codes and a specification
of the function performed by each.

Input and Output

The I/O instructions we discussed for SIC are also available on SIC/XE. In
addition, there are I/O channels that can be used to perform input and output
while the CPU is executing other instructions. This allows overlap of
computing and I/O, resulting in more efficient system operation. The
instructions SIO, TIO, and HIO are used to start, test, and halt the operation
of I/O channels. (These concepts are discussed in detail in Chapter 6.)

1.3.3 SIC Programming Examples

This section presents simple examples of SIC and SIC/XE assembler language
programming. These examples are intended to help you become more familiar
with the SIC and SIC/XE instruction sets and assembler language. It is
assumed that the reader is already familiar with the assembler language of at
least one machine and with the basic ideas involved in assembly-level
programming.
The primary subject of this book is systems programming, not assembler
language programming. The following chapters contain discussions of various
types of system software, and in some cases SIC programs are used to
illustrate the points being made. This section contains material that may help
you to understand these examples more easily. However, it does not contain any
new material on system software or systems programming. Thus, this section
can be skipped without any loss of continuity.
Figure 1.2 contains examples of data movement operations for SIC and
SIC/XE. There are no memory-to-memory move instructions; thus, all data
movement must be done using registers. Figure 1.2(a) shows two examples of
data movement. In the first, a 3-byte word is moved by loading it into register
A and then storing the register at the desired destination. Exactly the same
thing could be accomplished using register X (and the instructions LDX, STX)
or register L (LDL, STL). In the second example, a single byte of data is moved
using the instructions LDCH (Load Character) and STCH (Store Character).
These instructions operate by loading or storing the rightmost 8-bit byte of
register A; the other bits in register A are not affected.
Figure 1.2(a) also shows four different ways of defining storage for data
items in the SIC assembler language. (These assembler directives are discussed
in more detail in Section 2.1.) The statement WORD reserves one word of
storage, which is initialized to a value defined in the operand field of the
statement. Thus the WORD statement in Fig. 1.2(a) defines a data word labeled
FIVE whose value is initialized to 5. The statement RESW reserves one or
more words of storage for use by the program. For example, the RESW
statement in Fig. 1.2(a) defines one word of storage labeled ALPHA, which will
be used to hold a value generated by the program.
The statements BYTE and RESB perform similar storage-definition
functions for data items that are characters instead of words. Thus in
Fig. 1.2(a) CHARZ is a 1-byte data item whose value is initialized to the
character “Z”, and C1 is a 1-byte variable with no initial value.

          LDA     FIVE      LOAD CONSTANT 5 INTO REGISTER A
          STA     ALPHA     STORE IN ALPHA
          LDCH    CHARZ     LOAD CHARACTER 'Z' INTO REGISTER A
          STCH    C1        STORE IN CHARACTER VARIABLE C1

ALPHA     RESW    1         ONE-WORD VARIABLE
FIVE      WORD    5         ONE-WORD CONSTANT
CHARZ     BYTE    C'Z'      ONE-BYTE CONSTANT
C1        RESB    1         ONE-BYTE VARIABLE

(a)

          LDA     #5        LOAD VALUE 5 INTO REGISTER A
          STA     ALPHA     STORE IN ALPHA
          LDA     #90       LOAD ASCII CODE FOR 'Z' INTO REG A
          STCH    C1        STORE IN CHARACTER VARIABLE C1

ALPHA     RESW    1         ONE-WORD VARIABLE
C1        RESB    1         ONE-BYTE VARIABLE

(b)

Figure 1.2 Sample data movement operations for (a) SIC and
(b) SIC/XE.
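
The effect of LDCH and STCH on register A can be sketched in Python as follows. The functions `ldch` and `stch` are hypothetical helpers that model only the register behavior stated in the text (the rightmost 8 bits are loaded or stored, and the other bits of register A are left alone).

```python
def ldch(register_a: int, byte: int) -> int:
    """Model LDCH: replace the rightmost byte of A, keep the other 16 bits."""
    return (register_a & 0xFFFF00) | (byte & 0xFF)


def stch(register_a: int) -> int:
    """Model STCH: the byte stored is the rightmost 8 bits of register A."""
    return register_a & 0xFF


a = 0x123456
a = ldch(a, ord("Z"))     # load character 'Z' (ASCII code 90)
print(f"{a:06X}")         # the upper two bytes of A are unchanged
print(stch(a))            # the byte that would be stored into C1
```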

The instructions shown in Fig. 1.2(a) would also work on SIC/XE; however,
they would not take advantage of the more advanced hardware features
available. Figure 1.2(b) shows the same two data-movement operations as
they might be written for SIC/XE. In this example, the value 5 is loaded into
register A using immediate addressing. The operand field for this instruction
contains the flag # (which specifies immediate addressing) and the data value
to be loaded. Similarly, the character “Z” is placed into register A by using
immediate addressing to load the value 90, which is the decimal value of the
ASCII code that is used internally to represent the character “Z”.
Figure 1.3(a) shows examples of arithmetic instructions for SIC. All
arithmetic operations are performed using register A, with the result being
left in register A. Thus this sequence of instructions stores the value
(ALPHA + INCR - 1) in BETA and the value (GAMMA + INCR - 1) in DELTA.
Figure 1.3(b) illustrates how the same calculations could be performed on
SIC/XE. The value of INCR is loaded into register S initially, and the
register-to-register instruction ADDR is used to add this value to register A
when it is needed. This avoids having to fetch INCR from memory each time it is
used in a calculation, which may make the program more efficient. Immediate
addressing is used for the constant 1 in the subtraction operations.
Looping and indexing operations are illustrated in Fig. 1.4. Figure 1.4(a)
shows a loop that copies one 11-byte character string to another. The index
register (register X) is initialized to zero before the loop begins. Thus, during
the first execution of the loop, the target address for the LDCH instruction will
be the address of the first byte of STR1. Similarly, the STCH instruction will
store the character being copied into the first byte of STR2. The next
instruction, TIX, performs two functions. First it adds 1 to the value in
register X, and then it compares the new value of register X to the value of
the operand (in this case, the constant value 11). The condition code is set to
indicate the result of this comparison. The JLT instruction jumps if the
condition code is set to "less than." Thus, the JLT causes a jump back to the
beginning of the loop if the new value in register X is less than 11.
During the second execution of the loop, register X will contain the value
1. Thus, the target address for the LDCH instruction will be the second byte of
STR1, and the target address for the STCH instruction will be the second byte
of STR2. The TIX instruction will again add 1 to the value in register X, and
the loop will continue in this way until all 11 bytes have been copied from
STR1 to STR2. Notice that after the TIX instruction is executed, the value in
register X is equal to the number of bytes that have already been copied.
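
The behavior of this loop can be traced with a short Python sketch. It is a model of the control flow only, not an SIC simulator; the function name `copy_string` and its parameters are illustrative assumptions.

```python
def copy_string(src: bytes, length: int) -> bytes:
    """Mimic the LDCH / STCH / TIX / JLT loop described for Fig. 1.4(a)."""
    dst = bytearray(length)        # plays the role of STR2 (RESB 11)
    x = 0                          # LDX ZERO: index register starts at 0
    while True:
        a = src[x]                 # LDCH STR1,X: target address is STR1 + X
        dst[x] = a                 # STCH STR2,X: target address is STR2 + X
        x += 1                     # TIX, part 1: add 1 to register X
        if not x < length:         # TIX, part 2: compare X to the operand;
            break                  # JLT loops again only on "less than"
    return bytes(dst)

result = copy_string(b"TEST STRING", 11)
```

As in the assembler version, the test comes at the bottom of the loop, so the body always executes at least once, and after each increment `x` equals the number of bytes already copied.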
Figure 1.4(b) shows the same loop as it might be written for SIC/XE. The
main difference is that the instruction TIXR is used in place of TIX. TIXR
works exactly like TIX, except that the value used for comparison is taken
from another register (in this case, register T), not from memory. This makes
the loop more efficient, because the value does not have to be fetched from
memory each time the loop is executed. Immediate addressing is used to
initialize register T to the value 11 and to initialize register X to 0.

          LDA     ALPHA      LOAD ALPHA INTO REGISTER A
          ADD     INCR       ADD THE VALUE OF INCR
          SUB     ONE        SUBTRACT 1
          STA     BETA       STORE IN BETA
          LDA     GAMMA      LOAD GAMMA INTO REGISTER A
          ADD     INCR       ADD THE VALUE OF INCR
          SUB     ONE        SUBTRACT 1
          STA     DELTA      STORE IN DELTA

ONE       WORD    1          ONE-WORD CONSTANT
.         ONE-WORD VARIABLES
ALPHA     RESW    1
BETA      RESW    1
GAMMA     RESW    1
DELTA     RESW    1
INCR      RESW    1

(a)

          LDS     INCR       LOAD VALUE OF INCR INTO REGISTER S
          LDA     ALPHA      LOAD ALPHA INTO REGISTER A
          ADDR    S,A        ADD THE VALUE OF INCR
          SUB     #1         SUBTRACT 1
          STA     BETA       STORE IN BETA
          LDA     GAMMA      LOAD GAMMA INTO REGISTER A
          ADDR    S,A        ADD THE VALUE OF INCR
          SUB     #1         SUBTRACT 1
          STA     DELTA      STORE IN DELTA

.         ONE-WORD VARIABLES
ALPHA     RESW    1
BETA      RESW    1
GAMMA     RESW    1
DELTA     RESW    1
INCR      RESW    1

(b)

Figure 1.3 Sample arithmetic operations for (a) SIC and (b) SIC/XE.

          LDX     ZERO       INITIALIZE INDEX REGISTER TO 0
MOVECH    LDCH    STR1,X     LOAD CHARACTER FROM STR1 INTO REG A
          STCH    STR2,X     STORE CHARACTER INTO STR2
          TIX     ELEVEN     ADD 1 TO INDEX, COMPARE RESULT TO 11
          JLT     MOVECH     LOOP IF INDEX IS LESS THAN 11

STR1      BYTE    C'TEST STRING'   11-BYTE STRING CONSTANT
STR2      RESB    11         11-BYTE VARIABLE
.         ONE-WORD CONSTANTS
ZERO      WORD    0
ELEVEN    WORD    11

(a)

          LDT     #11        INITIALIZE REGISTER T TO 11
          LDX     #0         INITIALIZE INDEX REGISTER TO 0
MOVECH    LDCH    STR1,X     LOAD CHARACTER FROM STR1 INTO REG A
          STCH    STR2,X     STORE CHARACTER INTO STR2
          TIXR    T          ADD 1 TO INDEX, COMPARE RESULT TO 11
          JLT     MOVECH     LOOP IF INDEX IS LESS THAN 11

STR1      BYTE    C'TEST STRING'   11-BYTE STRING CONSTANT
STR2      RESB    11         11-BYTE VARIABLE

(b)

Figure 1.4 Sample looping and indexing operations for (a) SIC and (b) SIC/XE.

Figure 1.5 contains another example of looping and indexing operations.
The variables ALPHA, BETA, and GAMMA are arrays of 100 words each. In
this case, the task of the loop is to add together the corresponding elements of
ALPHA and BETA, storing the results in the elements of GAMMA. The general
principles of looping and indexing are the same as previously discussed.
However, the value in the index register must be incremented by 3 for each
iteration of this loop, because each iteration processes a 3-byte (i.e.,
one-word) element of the arrays. The TIX instruction always adds 1 to
register X, so it is not suitable for this program fragment. Instead, we use
arithmetic and comparison instructions to handle the index value.

          LDA     ZERO       INITIALIZE INDEX VALUE TO 0
          STA     INDEX
ADDLP     LDX     INDEX      LOAD INDEX VALUE INTO REGISTER X
          LDA     ALPHA,X    LOAD WORD FROM ALPHA INTO REGISTER A
          ADD     BETA,X     ADD WORD FROM BETA
          STA     GAMMA,X    STORE THE RESULT IN A WORD IN GAMMA
          LDA     INDEX      ADD 3 TO INDEX VALUE
          ADD     THREE
          STA     INDEX
          COMP    K300       COMPARE NEW INDEX VALUE TO 300
          JLT     ADDLP      LOOP IF INDEX IS LESS THAN 300

INDEX     RESW    1          ONE-WORD VARIABLE FOR INDEX VALUE
.         ARRAY VARIABLES--100 WORDS EACH
ALPHA     RESW    100
BETA      RESW    100
GAMMA     RESW    100
.         ONE-WORD CONSTANTS
ZERO      WORD    0
K300      WORD    300
THREE     WORD    3

(a)

          LDS     #3         INITIALIZE REGISTER S TO 3
          LDT     #300       INITIALIZE REGISTER T TO 300
          LDX     #0         INITIALIZE INDEX REGISTER TO 0
ADDLP     LDA     ALPHA,X    LOAD WORD FROM ALPHA INTO REGISTER A
          ADD     BETA,X     ADD WORD FROM BETA
          STA     GAMMA,X    STORE THE RESULT IN A WORD IN GAMMA
          ADDR    S,X        ADD 3 TO INDEX VALUE
          COMPR   X,T        COMPARE NEW INDEX VALUE TO 300
          JLT     ADDLP      LOOP IF INDEX VALUE IS LESS THAN 300

.         ARRAY VARIABLES--100 WORDS EACH
ALPHA     RESW    100
BETA      RESW    100
GAMMA     RESW    100

(b)

Figure 1.5 Sample indexing and looping operations for (a) SIC and (b) SIC/XE.

In Fig. 1.5(a), we define a variable INDEX that holds the value to be used
for indexing for each iteration of the loop. Thus, INDEX should be 0 for the
first iteration, 3 for the second, and so on. INDEX is initialized to 0 before
the start of the loop. The first instruction in the body of the loop loads the
current value of INDEX into register X so that it can be used for target
address calculation. The next three instructions in the loop load a word from
ALPHA, add the corresponding word from BETA, and store the result in the
corresponding word of GAMMA. The value of INDEX is then loaded into
register A, incremented by 3, and stored back into INDEX. After being stored,
the new value of INDEX is still present in register A. This value is then
compared to 300 (the length of the arrays in bytes) to determine whether or
not to terminate the loop. If the value of INDEX is less than 300, then not all
bytes of the arrays have been processed yet. In that case, the JLT instruction
causes a jump back to the beginning of the loop, where the new value of INDEX
is loaded into register X.
This particular loop is cumbersome on SIC, because register A must be
used for adding the array elements together and also for incrementing the
index value. The loop can be written much more efficiently for SIC/XE, as
shown in Fig. 1.5(b). In this example, the index value is kept permanently in
register X. The amount by which to increment the index value (3) is kept in
register S, and the register-to-register ADDR instruction is used to add this
increment to register X. Similarly, the value 300 is kept in register T, and
the instruction COMPR is used to compare registers X and T in order to decide
when to terminate the loop.
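
The SIC/XE version of the loop can be modeled in Python as follows. This is a sketch under assumptions: memory is a flat `bytearray`, each word occupies 3 bytes, and the bytes of a word are taken to be stored most significant first. The helper names `load_word`, `store_word`, and `add_arrays`, and the array base addresses, are hypothetical.

```python
WORD = 3  # bytes per word

def load_word(mem: bytearray, addr: int) -> int:
    """Read the 3-byte word that starts at byte address addr."""
    return int.from_bytes(mem[addr:addr + WORD], "big")

def store_word(mem: bytearray, addr: int, value: int) -> None:
    """Write value (truncated to 24 bits) as a 3-byte word at addr."""
    mem[addr:addr + WORD] = (value & 0xFFFFFF).to_bytes(WORD, "big")

def add_arrays(mem, alpha, beta, gamma, nbytes):
    s, t, x = 3, nbytes, 0                   # LDS #3 / LDT #300 / LDX #0
    while True:
        a = load_word(mem, alpha + x)        # LDA ALPHA,X
        a += load_word(mem, beta + x)        # ADD BETA,X
        store_word(mem, gamma + x, a)        # STA GAMMA,X
        x += s                               # ADDR S,X: step to next word
        if not x < t:                        # COMPR X,T then JLT ADDLP
            break

# Three 100-word arrays laid out one after another (addresses are assumed).
mem = bytearray(900)
ALPHA, BETA, GAMMA = 0, 300, 600
for i in range(100):
    store_word(mem, ALPHA + WORD * i, i)
    store_word(mem, BETA + WORD * i, 2 * i)
add_arrays(mem, ALPHA, BETA, GAMMA, 300)
```

The index advances 0, 3, 6, ..., 297, so the loop body runs exactly 100 times and the comparison against 300 (the array length in bytes) ends the loop.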
Figure 1.6 shows a simple example of input and output on SIC; the same
instructions would also work on SIC/XE. (The more advanced input and output
facilities available on SIC/XE, such as I/O channels and interrupts, are
discussed in Chapter 6.) This program fragment reads 1 byte of data from
device F1 and copies it to device 05. The actual input of data is performed
using the RD (Read Data) instruction. The operand for the RD is a byte in
memory that contains the hexadecimal code for the input device (in this case,
F1). Executing the RD instruction transfers 1 byte of data from this device
into the rightmost byte of register A. If the input device is
character-oriented (for example, a keyboard), the value placed in register A
is the ASCII code for the character that was read.
Before the RD can be executed, however, the input device must be ready to
transmit the data. For example, if the input device is a keyboard, the operator
must have typed a character. The program checks for this by using the TD
(Test Device) instruction. When the TD is executed, the status of the addressed
device is tested and the condition code is set to indicate the result of this
test. If the device is ready to transmit data, the condition code is set to
"less than"; if the device is not ready, the condition code is set to "equal."
As Fig. 1.6 illustrates, the program must execute the TD instruction and then
check the condition code by using a conditional jump. If the condition code is
"equal" (device not ready), the program jumps back to the TD instruction. This
two-instruction loop will continue until the device becomes ready; then the RD
will be executed.

INLOOP    TD      INDEV      TEST INPUT DEVICE
          JEQ     INLOOP     LOOP UNTIL DEVICE IS READY
          RD      INDEV      READ ONE BYTE INTO REGISTER A
          STCH    DATA       STORE BYTE THAT WAS READ

OUTLP     TD      OUTDEV     TEST OUTPUT DEVICE
          JEQ     OUTLP      LOOP UNTIL DEVICE IS READY
          LDCH    DATA       LOAD DATA BYTE INTO REGISTER A
          WD      OUTDEV     WRITE ONE BYTE TO OUTPUT DEVICE

INDEV     BYTE    X'F1'      INPUT DEVICE NUMBER
OUTDEV    BYTE    X'05'      OUTPUT DEVICE NUMBER
DATA      RESB    1          ONE-BYTE VARIABLE

Figure 1.6 Sample input and output operations for SIC.
Output is performed in the same way. First the program uses TD to check
whether the output device is ready to receive a byte of data. Then the byte to
be written is loaded into the rightmost byte of register A, and the WD (Write
Data) instruction is used to transmit it to the device.
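
The test-then-transfer pattern described above can be sketched in Python. The `Device` class below is entirely hypothetical: it stands in for a device that reports "not ready" for a few tests before each transfer, and its method names merely echo the TD, RD, and WD instructions.

```python
from collections import deque

class Device:
    """Hypothetical device that is busy for a few tests before each transfer."""

    def __init__(self, data: bytes = b"", busy_polls: int = 2):
        self.inbox = deque(data)       # bytes waiting to be read
        self.outbox = bytearray()      # bytes that have been written
        self.busy_polls = busy_polls
        self._wait = busy_polls

    def td(self) -> str:
        """Test Device: '<' means ready, '=' means not ready."""
        if self._wait > 0:
            self._wait -= 1
            return "="
        return "<"

    def rd(self) -> int:
        """Read Data: deliver one byte, then become busy again."""
        self._wait = self.busy_polls
        return self.inbox.popleft()

    def wd(self, byte: int) -> None:
        """Write Data: accept one byte, then become busy again."""
        self._wait = self.busy_polls
        self.outbox.append(byte)

def copy_byte(indev: Device, outdev: Device) -> int:
    while indev.td() == "=":       # INLOOP  TD INDEV / JEQ INLOOP
        pass
    data = indev.rd()              # RD INDEV, then STCH DATA
    while outdev.td() == "=":      # OUTLP   TD OUTDEV / JEQ OUTLP
        pass
    outdev.wd(data)                # LDCH DATA, then WD OUTDEV
    return data

f1 = Device(b"A")
dev05 = Device()
copied = copy_byte(f1, dev05)
```

Each `while` loop is the two-instruction TD/JEQ loop: the program does nothing useful until the device reports that it is ready.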
Figure 1.7 shows how these instructions can be used to read a 100-byte
record from an input device into memory. The read operation in this example
is placed in a subroutine. This subroutine is called from the main program by
using the JSUB (Jump to Subroutine) instruction. At the end of the subroutine
there is an RSUB (Return from Subroutine) instruction, which returns control
to the instruction that follows the JSUB.
The READ subroutine itself consists of a loop. Each execution of this loop
reads 1 byte of data from the input device, using the same techniques
illustrated in Fig. 1.6. The bytes of data that are read are stored in a
100-byte buffer area labeled RECORD. The indexing and looping techniques that
are used in storing characters in this buffer are essentially the same as
those illustrated in Fig. 1.4(a).
Figure 1.7(b) shows the same READ subroutine as it might be written for
SIC/XE. The main differences from Fig. 1.7(a) are the use of immediate
addressing and the TIXR instruction, as was illustrated in Fig. 1.4(b).

          JSUB    READ       CALL READ SUBROUTINE

.         SUBROUTINE TO READ 100-BYTE RECORD

READ      LDX     ZERO       INITIALIZE INDEX REGISTER TO 0
RLOOP     TD      INDEV      TEST INPUT DEVICE
          JEQ     RLOOP      LOOP IF DEVICE IS BUSY
          RD      INDEV      READ ONE BYTE INTO REGISTER A
          STCH    RECORD,X   STORE DATA BYTE INTO RECORD
          TIX     K100       ADD 1 TO INDEX AND COMPARE TO 100
          JLT     RLOOP      LOOP IF INDEX IS LESS THAN 100
          RSUB               EXIT FROM SUBROUTINE

INDEV     BYTE    X'F1'      INPUT DEVICE NUMBER
RECORD    RESB    100        100-BYTE BUFFER FOR INPUT RECORD
.         ONE-WORD CONSTANTS
ZERO      WORD    0
K100      WORD    100

(a)

          JSUB    READ       CALL READ SUBROUTINE

.         SUBROUTINE TO READ 100-BYTE RECORD

READ      LDX     #0         INITIALIZE INDEX REGISTER TO 0
          LDT     #100       INITIALIZE REGISTER T TO 100
RLOOP     TD      INDEV      TEST INPUT DEVICE
          JEQ     RLOOP      LOOP IF DEVICE IS BUSY
          RD      INDEV      READ ONE BYTE INTO REGISTER A
          STCH    RECORD,X   STORE DATA BYTE INTO RECORD
          TIXR    T          ADD 1 TO INDEX AND COMPARE TO 100
          JLT     RLOOP      LOOP IF INDEX IS LESS THAN 100
          RSUB               EXIT FROM SUBROUTINE

INDEV     BYTE    X'F1'      INPUT DEVICE NUMBER
RECORD    RESB    100        100-BYTE BUFFER FOR INPUT RECORD

(b)

Figure 1.7 Sample subroutine call and record input operations for
(a) SIC and (b) SIC/XE.

1.4 TRADITIONAL (CISC) MACHINES

This section introduces the architectures of two of the machines that will be
used as examples later in the text. Section 1.4.1 describes the VAX
architecture, and Section 1.4.2 describes the architecture of the Intel x86
family of processors.
The machines described in this section are classified as Complex Instruction
Set Computers (CISC). CISC machines generally have a relatively large
and complicated instruction set, several different instruction formats and
lengths, and many different addressing modes. Thus the implementation of
such an architecture in hardware tends to be complex.
You may want to compare the examples in this section with the Reduced
Instruction Set Computer (RISC) examples in Section 1.5. Further discussion of
CISC versus RISC designs can be found in Tabak (1995).

1.4.1 VAX Architecture

The VAX family of computers was introduced by Digital Equipment
Corporation (DEC) in 1978. The VAX architecture was designed for
compatibility with the earlier PDP-11 machines. A compatibility mode was
provided at the hardware level so that many PDP-11 programs could run
unchanged on the VAX. It was even possible for PDP-11 programs and VAX
programs to share the same machine in a multi-user environment.
This section summarizes some of the main characteristics of the VAX archi-
tecture. For further information, see Baase (1992).

Memory

The VAX memory consists of 8-bit bytes. All addresses used are byte
addresses. Two consecutive bytes form a word; four bytes form a longword;
eight bytes form a quadword; sixteen bytes form an octaword. Some operations
are more efficient when operands are aligned in a particular way; for example,
a longword operand that begins at a byte address that is a multiple of 4.
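
The alignment condition can be written as a one-line check. In the Python sketch below, the table `SIZES` and the function `is_aligned` are illustrative names; extending the multiple-of-4 rule for longwords to "a multiple of the operand length" for the other sizes is an assumption made for the example, not something stated above.

```python
# Operand lengths in bytes, as listed in the text.
SIZES = {"byte": 1, "word": 2, "longword": 4, "quadword": 8, "octaword": 16}

def is_aligned(address: int, kind: str) -> bool:
    """True if address is a multiple of the operand length (assumed rule)."""
    return address % SIZES[kind] == 0

longword_ok = is_aligned(0x1000, "longword")    # multiple of 4
longword_bad = is_aligned(0x1002, "longword")   # not a multiple of 4
```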
All VAX programs operate in a virtual address space of 2^32 bytes. This
virtual memory allows programs to operate as though they had access to an
extremely large memory, regardless of the amount of memory actually present
on the system. Routines in the operating system take care of the details of
memory management. We discuss virtual memory in connection with our
study of operating systems in Chapter 6. One half of the VAX virtual address
space is called system space, which contains the operating system, and is shared
by all programs. The other half of the address space is called process space,
and is defined separately for each program. A part of the process space
contains stacks that are available to the program. Special registers and
machine instructions aid in the use of these stacks.

Registers

There are 16 general-purpose registers on the VAX, denoted by R0 through
R15. Some of these registers, however, have special names and uses. All
general registers are 32 bits in length. Register R15 is the program counter,
also called PC. It is updated during instruction execution to point to the
next instruction byte to be fetched. R14 is the stack pointer SP, which points
to the current top of the stack in the program's process space. Although it is
possible to use other registers for this purpose, hardware instructions that
implicitly use the stack always use SP. R13 is the frame pointer FP. VAX
procedure call conventions build a data structure called a stack frame, and
place its address in FP. R12 is the argument pointer AP. The procedure call
convention uses AP to pass a list of arguments associated with the call.
Registers R6 through R11 have no special functions, and are available for
general use by the program. Registers R0 through R5 are likewise available for
general use; however, these registers are also used by some machine
instructions.
In addition to the general registers, there is a processor status longword
(PSL), which contains state variables and flags associated with a process. The
PSL includes, among many other items of information, a condition code and a
flag that specifies whether PDP-11 compatibility mode is being used by a
process. There are also a number of control registers that are used to support
various operating system functions.

Data Formats

Integers are stored as binary numbers in a byte, word, longword, quadword,
or octaword; 2's complement representation is used for negative values.
Characters are stored using their 8-bit ASCII codes.
There are four different floating-point data formats on the VAX, ranging in
length from 4 to 16 bytes. Two of these are compatible with those found on the
PDP-11, and are standard on all VAX processors. The other two are available
as options, and provide for an extended range of values by allowing more bits
in the exponent field. In each case, the principles are the same as those we
discussed for SIC/XE: a floating-point value is represented as a fraction that
is to be multiplied by a specified power of 2.
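
The fraction-times-power-of-2 idea can be seen with Python's standard `math.frexp`, which splits a value into a fraction in the range [0.5, 1) and an integer exponent. This illustrates only the general principle; it does not reproduce the bit layout of any VAX or SIC/XE format.

```python
import math

# 6.5 is represented as 0.8125 * 2**3.
fraction, exponent = math.frexp(6.5)

# Multiplying the fraction by the power of 2 recovers the original value.
rebuilt = math.ldexp(fraction, exponent)
```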
VAX processors provide a packed decimal data format. In this format, each
byte represents two decimal digits, with each digit encoded using 4 bits of the
byte. The sign is encoded in the last 4 bits. There is also a numeric format
that is used to represent numeric values with one digit per byte. In this
format, the sign may appear either in the last byte, or as a separate byte
preceding the first digit. These two variations are called trailing numeric
and leading separate numeric.
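
A packed decimal encoder along these lines might look like the following Python sketch. The function name `pack_decimal` is hypothetical, and two details are assumptions rather than facts taken from the text: the sign codes (hexadecimal C for plus, D for minus) and the leading zero digit used to fill out a whole number of bytes.

```python
def pack_decimal(value: int) -> bytes:
    """Pack the decimal digits of value two per byte, with the sign last."""
    sign = 0xC if value >= 0 else 0xD          # assumed sign codes
    nibbles = [int(d) for d in str(abs(value))] + [sign]
    if len(nibbles) % 2:                       # pad to a whole number of bytes
        nibbles.insert(0, 0)
    # Combine each pair of 4-bit values into one byte, high digit first.
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))

plus_123 = pack_decimal(123)     # digits 1, 2, 3 followed by the sign
minus_45 = pack_decimal(-45)     # padded: 0, 4, 5 followed by the sign
```

Three digits plus the sign fill exactly two bytes; a two-digit value needs one padding digit so that the sign still lands in the last 4 bits.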
VAX also supports queues and variable-length bit strings. Data structures
such as these can, of course, be implemented on any machine; however, VAX
provides direct hardware support for them. There are single machine
instructions that insert and remove entries in queues, and perform a variety
of operations on bit strings. The existence of such powerful machine
instructions and complex primitive data types is one of the more unusual
features of the VAX architecture.

Instruction Formats

VAX machine instructions use a variable-length instruction format. Each
instruction consists of an operation code (1 or 2 bytes) followed by up to six
operand specifiers, depending on the type of instruction. Each operand specifier
designates one of the VAX addressing modes and gives any additional
information necessary to locate the operand. (See the description of addressing
modes in the following section for further information.)

Addressing Modes

VAX provides a large number of addressing modes. With few exceptions, any
of these addressing modes may be used with any instruction. The operand
itself may be in a register (register mode), or its address may be specified by
a register (register deferred mode). If the operand address is in a register,
the register contents may be automatically incremented or decremented by the
operand length (autoincrement and autodecrement modes). There are several
base relative addressing modes, with displacement fields of different lengths;
when used with register PC, these become program-counter relative modes.
All of these addressing modes may also include an index register, and many of
them are available in a form that specifies indirect addressing (called deferred
modes on VAX). In addition, there are immediate operands and several
special-purpose addressing modes. For further details, see Baase (1992).

Instruction Set

One of the goals of the VAX designers was to produce an instruction set that is
symmetric with respect to data type. Many instruction mnemonics are formed
by combining the following elements:

1. a prefix that specifies the type of operation,
2. a suffix that specifies the data type of the operands,
3. a modifier (on some instructions) that gives the number of operands
involved.

For example, the instruction ADDW2 is an add operation with two operands,
each a word in length. Likewise, MULL3 is a multiply operation with three
longword operands, and CVTWL specifies a conversion from word to longword.
(In the latter case, a two-operand instruction is assumed.) For a typical
instruction, operands may be located in registers, in memory, or in the
instruction itself (immediate addressing). The same machine instruction code is
used, regardless of operand locations.
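
The way these mnemonics are put together can be sketched with a small Python decoder. It handles only the three operations and three data types that appear in the examples above; the tables `OPS` and `TYPES` and the function `decode` are hypothetical names, and the real VAX instruction set has many more prefixes and suffixes than are modeled here.

```python
import re

OPS = {"ADD": "add", "MUL": "multiply", "CVT": "convert"}
TYPES = {"B": "byte", "W": "word", "L": "longword"}

def decode(mnemonic: str):
    """Split a mnemonic into (operation, data types, number of operands)."""
    m = re.fullmatch(r"(ADD|MUL|CVT)([BWL]+)([23]?)", mnemonic)
    if m is None:
        raise ValueError(f"unrecognized mnemonic: {mnemonic}")
    op, types, count = m.groups()
    # With no modifier, a two-operand instruction is assumed.
    return OPS[op], [TYPES[t] for t in types], int(count) if count else 2

addw2 = decode("ADDW2")
mull3 = decode("MULL3")
cvtwl = decode("CVTWL")
```

For CVTWL the suffix names two data types, the source and the destination of the conversion, rather than one type shared by all operands.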
VAX provides all of the usual types of instructions for computation, data
movement and conversion, comparison, branching, etc. In addition, there are a
number of operations that are much more complex than the machine
instructions found on most computers. These operations are, for the most part,
hardware realizations of frequently occurring sequences of code. They are
implemented as single instructions for efficiency and speed. For example, VAX
provides instructions to load and store multiple registers, and to manipulate
queues and variable-length bit fields. There are also powerful instructions for
calling and returning from procedures. A single instruction saves a designated
set of registers, passes a list of arguments to the procedure, maintains the
stack, frame, and argument pointers, and sets a mask to enable error traps for
arithmetic operations. For further information on all of the VAX instructions,
see Baase (1992).

Input and Output

Input and output on the VAX are accomplished by I/O device controllers.
Each controller has a set of control/status and data registers, which are
assigned locations in the physical address space. The portion of the address
space into which the device controller registers are mapped is called I/O space.
No special instructions are required to access registers in I/O space. An
I/O device driver issues commands to the device controller by storing values
into the appropriate registers, exactly as if they were physical memory
locations. Likewise, software routines may read these registers to obtain
status information. The association of an address in I/O space with a physical
register in a device controller is handled by the memory management routines.

1.4.2 Pentium Pro Architecture

The Pentium Pro microprocessor, introduced near the end of 1995, is the latest
in the Intel x86 family. Other recent microprocessors in this family are the
80486 and Pentium. Processors of the x86 family are presently used in a major-
ity of personal computers, and there is a vast amount of software for these
processors. It is expected that additional generations of the x86 family will be
developed in the future.
The various x86 processors differ in implementation details and operating
speed. However, they share the same basic architecture. Each succeeding
generation has been designed to be compatible with the earlier versions. This
section contains an overview of the x86 architecture, which will serve as
background for the examples to be discussed later in the book. Further
information about the x86 family can be found in Intel (1995), Anderson and
Shanley (1995), and Tabak (1995).

Memory

Memory in the x86 architecture can be described in at least two different ways.
At the physical level, memory consists of 8-bit bytes. All addresses used are
byte addresses. Two consecutive bytes form a word; four bytes form a
doubleword (also called a dword). Some operations are more efficient when
operands are aligned in a particular way; for example, a doubleword operand
that begins at a byte address that is a multiple of 4.
However, programmers usually view the x86 memory as a collection of
segments. From this point of view, an address consists of two parts: a segment
number and an offset that points to a byte within the segment. Segments can
be of different sizes, and are often used for different purposes. For example,
some segments may contain executable instructions, and other segments may
be used to store data. Some data segments may be treated as stacks that can be
used to save register contents, pass parameters to subroutines, and for other
purposes.
It is not necessary for all of the segments used by a program to be in
physical memory. In some cases, a segment can also be divided into pages. Some
of the pages of a segment may be in physical memory, while others may be
stored on disk. When an x86 instruction is executed, the hardware and the
operating system make sure that the needed byte of the segment is loaded into
physical memory. The segment/offset address specified by the programmer is
automatically translated into a physical byte address by the x86 Memory
Management Unit (MMU). Chapter 6 contains a brief discussion of methods
that can be used in this kind of address translation.

Registers

There are eight general-purpose registers, which are named EAX, EBX, ECX,
EDX, ESI, EDI, EBP, and ESP. Each general-purpose register is 32 bits long
(i.e., one doubleword). Registers EAX, EBX, ECX, and EDX are generally used
for data manipulation; it is possible to access individual words or bytes from
these registers. The other four registers can also be used for data, but are more
commonly used to hold addresses. The general-purpose register set is
identical for all members of the x86 family beginning with the 80386. This set
is also compatible with the more limited register sets found in earlier members
of the family.
There are also several different types of special-purpose registers in the x86
architecture. EIP is a 32-bit register that contains a pointer to the next
instruction to be executed. FLAGS is a 32-bit register that contains many
different bit flags. Some of these flags indicate the status of the processor;
others are used to record the results of comparisons and arithmetic operations.
There are also six 16-bit segment registers that are used to locate segments in
memory. Segment register CS contains the address of the currently executing
code segment, and SS contains the address of the current stack segment. The
other segment registers (DS, ES, FS, and GS) are used to indicate the addresses
of data segments.
Floating-point computations are performed using a special floating-point
unit (FPU). This unit contains eight 80-bit data registers and several other
control and status registers.
All of the registers discussed so far are available to application programs.
There are also a number of registers that are used only by system programs
such as the operating system. Some of these registers are used by the MMU to
translate segment addresses into physical addresses. Others are used to
control the operation of the processor, or to support debugging operations.

Data Formats

The x86 architecture provides for the storage of integers, floating-point values,
characters, and strings. Integers are normally stored as 8-, 16-, or 32-bit binary
numbers. Both signed and unsigned integers (also called ordinals) are
supported; 2's complement is used for negative values. The FPU can also handle
64-bit signed integers. In memory, the least significant part of a numeric value
is stored at the lowest-numbered address. (This is commonly called
little-endian byte ordering, because the "little end" of the value comes first
in memory.)
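
Little-endian ordering is easy to see with Python's built-in integer conversions. The value 0x12345678 below is an arbitrary example chosen so that each byte is distinct.

```python
value = 0x12345678

# Byte at the lowest address comes first in the resulting bytes object.
stored = value.to_bytes(4, "little")

# Reading the bytes back in the same order recovers the value.
restored = int.from_bytes(stored, "little")
```

The least significant byte, 0x78, occupies the lowest-numbered position, and the most significant byte, 0x12, the highest.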
Integers can also be stored in binary coded decimal (BCD). In the unpacked
BCD format, each byte represents one decimal digit. The value of this digit is
encoded (in binary) in the low-order 4 bits of the byte; the high-order bits are
normally zero. In the packed BCD format, each byte represents two decimal
digits, with each digit encoded using 4 bits of the byte.
There are three different floating-point data formats. The single-precision
format is 32 bits long. It provides 24 significant bits of the floating-point
value (23 stored explicitly, with an implied leading bit), and allows for an
8-bit exponent (power of 2). (The remaining bit is used to store the sign of
the floating-point value.) The double-precision format is 64 bits long. It
provides 53 significant bits (52 stored explicitly), and allows for an 11-bit
exponent. The extended-precision format is 80 bits long. It stores 64
significant bits, and allows for a 15-bit exponent.
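
The fields of a single-precision value can be pulled apart with Python's `struct` module. The sketch assumes the IEEE 754 single-precision layout: 1 sign bit, an 8-bit biased exponent field, and 23 stored fraction bits (the leading significant bit is implied rather than stored). The function name `split_single` is illustrative.

```python
import struct

def split_single(x: float):
    """Return (sign, exponent field, fraction field) of x as a 32-bit float."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    sign = bits >> 31                   # 1 bit
    exponent = (bits >> 23) & 0xFF      # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF          # 23 explicitly stored bits
    return sign, exponent, fraction

one = split_single(1.0)        # 1.0  = +1.0  * 2**0
neg = split_single(-2.5)       # -2.5 = -1.25 * 2**1
```

For -2.5 the exponent field holds 1 + 127 = 128, and the fraction field holds the bits of 0.25, the part of 1.25 that follows the implied leading 1.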
Characters are stored one per byte, using their 8-bit ASCII codes. Strings
may consist of bits, bytes, words, or doublewords; special instructions are
provided to handle each type of string.

Instruction Formats

All of the x86 machine instructions use variations of the same basic format.
This format begins with optional prefixes containing flags that modify the
operation of the instruction. For example, some prefixes specify a repetition
count for an instruction. Others specify a segment register that is to be used
for addressing an operand (overriding the normal default assumptions made
by the hardware). Following the prefixes (if any) is an opcode (1 or 2 bytes);
some operations have different opcodes, each specifying a different variant of
the operation. Following the opcode are a number of bytes that specify the
operands and addressing modes to be used. (See the description of addressing
modes in the next section for further information.)
The opcode is the only element that is always present in every instruction.
Other elements may or may not be present, and may be of different lengths,
depending on the operation and the operands involved. Thus, there are a large
number of different potential instruction formats, varying in length from
1 byte to 10 bytes or more.

Addressing Modes

The x86 architecture provides a large number of addressing modes. An
operand value may be specified as part of the instruction itself (immediate
mode), or it may be in a register (register mode).
Operands stored in memory are often specified using variations of the
general target address calculation

TA = (base register) + (index register) * (scale factor) + displacement

Any general-purpose register may be used as a base register; any
general-purpose register except ESP can be used as an index register. The scale
factor may have the value 1, 2, 4, or 8, and the displacement may be an 8-, 16-,
or 32-bit value. The base and index register numbers, scale, and displacement
are encoded as parts of the operand specifiers in the instruction. Various
combinations of these items may be omitted, resulting in eight different
addressing modes. The address of an operand in memory may also be specified as
an absolute location (direct mode), or as a location relative to the EIP
register (relative mode).
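
The general target address calculation, base plus scaled index plus displacement, can be written out as a small Python function. The name `target_address` and its keyword parameters are hypothetical, and wrapping the result to 32 bits is an assumption made for the sketch. Omitted components default to values that leave the sum unchanged, which mirrors how the different addressing modes arise from leaving items out.

```python
def target_address(base: int = 0, index: int = 0, scale: int = 1,
                   displacement: int = 0) -> int:
    """Compute base + index * scale + displacement as a 32-bit address."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale factor must be 1, 2, 4, or 8")
    return (base + index * scale + displacement) & 0xFFFFFFFF

# Element 3 of an array of doublewords that starts 8 bytes past the base.
ta_full = target_address(base=0x1000, index=3, scale=4, displacement=8)

# Base register only: index and displacement omitted.
ta_base_only = target_address(base=0x1000)
```

A scale factor equal to the element size (here 4, for doublewords) lets the index register hold an element number rather than a byte offset.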

Instruction Set

The x86 architecture has a large and complex instruction set, containing more
than 400 different machine instructions. An instruction may have zero, one,
two, or three operands. There are register-to-register instructions,
register-to-memory instructions, and a few memory-to-memory instructions. In
some cases, operands may also be specified in the instruction as immediate
values.
Most data movement and integer arithmetic instructions can use operands
that are 1, 2, or 4 bytes long. String manipulation instructions, which use
repetition prefixes, can deal directly with variable-length strings of bytes,
words, or doublewords. There are many instructions that perform logical and bit
manipulations, and support control of the processor and memory-management
systems.
The x86 architecture also includes special-purpose instructions to perform
operations frequently required in high-level programming languages; for
example, entering and leaving procedures and checking subscript values against
the bounds of an array.

Input and Output

Input is performed by instructions that transfer one byte, word, or
doubleword at a time from an I/O port into register EAX. Output instructions
transfer one byte, word, or doubleword from EAX to an I/O port. Repetition
prefixes allow these instructions to transfer an entire string in a single
operation.

1.5 RISC MACHINES

This section introduces the architectures of three RISC machines that will be
used as examples later in the text. Section 1.5.1 describes the architecture of the
SPARC family of processors. Section 1.5.2 describes the PowerPC family of
microprocessors for personal computers. Section 1.5.3 describes the architecture
of the Cray T3E supercomputing system.
All of these machines are examples of RISC (Reduced Instruction Set
Computers), in contrast to traditional CISC (Complex Instruction Set
Computer) implementations such as Pentium and VAX. The RISC concept, de-
veloped in the early 1980s, was intended to simplify the design of processors.
This simplified design can result in faster and less expensive processor
development, greater reliability, and faster instruction execution times.
In general, a RISC system is characterized by a standard, fixed instruction
length (usually equal to one machine word), and single-cycle execution of
most instructions. Memory access is usually done by load and store
instructions only. All instructions except for load and store are
register-to-register operations. There are typically a relatively large
number of general-purpose registers. The number of machine instructions,
instruction formats, and addressing modes is relatively small.
The discussions in the following sections will illustrate some of these RISC
characteristics. Further information about the RISC approach, including its
advantages and disadvantages, can be found in Tabak (1995).

1.5.1 UltraSPARC Architecture

The UltraSPARC processor, announced by Sun Microsystems in 1995, is the
latest member of the SPARC family. Other members of this family include a
variety of SPARC and SuperSPARC processors. The original SPARC architecture
was developed in the mid-1980s, and has been implemented by a number
of manufacturers. The name SPARC stands for scalable processor architecture.
This architecture is intended to be suitable for a wide range of
implementations, from microcomputers to supercomputers.
Although SPARC, SuperSPARC, and UltraSPARC architectures differ
slightly, they are upward compatible and share the same basic structure. This
section contains an overview of the UltraSPARC architecture, which will serve
as background for the examples to be discussed later in the book. Further in-
formation about the SPARC family can be found in Tabak (1995) and Sun
Microsystems (1995a).

Memory

Memory consists of 8-bit bytes; all addresses used are byte addresses. Two
consecutive bytes form a halfword; four bytes form a word; eight bytes form a
doubleword. Halfwords are stored in memory beginning at byte addresses that
are multiples of 2. Similarly, words begin at addresses that are multiples
of 4, and doublewords at addresses that are multiples of 8.
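
The alignment rule just described can be sketched as a small check. This is
only an illustration in Python; the names ALIGNMENT and is_aligned are
hypothetical and do not come from any SPARC tool.

```python
# Required alignment, in bytes, for each UltraSPARC data size described above.
ALIGNMENT = {"byte": 1, "halfword": 2, "word": 4, "doubleword": 8}

def is_aligned(address: int, kind: str) -> bool:
    """Return True if a byte address is a multiple of the size of `kind`."""
    return address % ALIGNMENT[kind] == 0

print(is_aligned(0x1004, "word"))        # 0x1004 is a multiple of 4
print(is_aligned(0x1004, "doubleword"))  # but not a multiple of 8
```
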
UltraSPARC programs can be written using a virtual address space of
2^64 bytes. This address space is divided into pages; multiple page sizes are
supported. Some of the pages used by a program may be in physical memory,
while others may be stored on disk. When an instruction is executed, the
hardware and the operating system make sure that the needed page is loaded
into physical memory. The virtual address specified by the instruction is
automatically translated into a physical address by the UltraSPARC Memory
Management Unit (MMU). Chapter 6 contains a brief discussion of methods that
can be used in this kind of address translation.

Registers

The SPARC architecture includes a large register file that usually contains
more than 100 general-purpose registers. (The exact number varies from one
implementation to another.) However, any procedure can access only 32
registers, designated r0 through r31. The first eight of these registers (r0
through r7) are global; that is, they can be accessed by all procedures on
the system. (Register r0 always contains the value zero.)
The other 24 registers available to a procedure can be visualized as a
window through which part of the register file can be seen. These windows
overlap, so some registers in the register file are shared between
procedures. For example, registers r8 through r15 of a calling procedure are
physically the same registers as r24 through r31 of the called procedure.
This facilitates the passing of parameters.
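
The overlap between windows can be modeled with a little index arithmetic.
The following Python sketch is an illustration only: the number of windows
(8) and the function name physical_index are assumptions, and real
implementations differ in how they number the physical registers.

```python
WINDOWS = 8                  # assumed number of windows; real chips vary
FILE_SIZE = 16 * WINDOWS     # each window adds 16 new registers (8 are shared)

def physical_index(window: int, r: int) -> int:
    """Map windowed register r (8..31) of a given window to a slot in the file.

    Window w+1 is the one used by a procedure called from window w.
    """
    if not 8 <= r <= 31:
        raise ValueError("r0-r7 are global and are not part of a window")
    return ((r - 8) - 16 * window) % FILE_SIZE

caller, callee = 0, 1
for n in range(8):
    # r8..r15 of the caller are the same slots as r24..r31 of the callee.
    assert physical_index(caller, 8 + n) == physical_index(callee, 24 + n)
```

Because the index is taken modulo the file size, the last window overlaps
the first one, which is why a deep enough chain of calls eventually needs
registers that are already in use.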
The SPARC hardware manages the windows into the register file. If a set of
concurrently running procedures needs more windows than are physically
available, a “window overflow” interrupt occurs. The operating system must
then save the contents of some registers in the file (and restore them later)
to provide the additional windows that are needed.
In the original SPARC architecture, the general-purpose registers were
32 bits long. Later implementations (including UltraSPARC) expanded these
registers to 64 bits. Some SPARC implementations provide several physically
different sets of global registers, for use by application procedures and by
various hardware and operating system functions.
Floating-point computations are performed using a special floating-point
unit (FPU). On UltraSPARC, this unit contains a file of 64 double-precision
floating-point registers, and several other control and status registers.

Besides these register files, there are a program counter PC (which contains
the address of the next instruction to be executed), condition code registers,
and a number of other control registers.

Data Formats

The UltraSPARC architecture provides for the storage of integers,
floating-point values, and characters. Integers are stored as 8-, 16-, 32-,
or 64-bit binary numbers. Both signed and unsigned integers are supported;
2’s complement is used for negative values. In the original SPARC
architecture, the most significant part of a numeric value is stored at the
lowest-numbered address. (This is commonly called big-endian byte ordering,
because the “big end” of the value comes first in memory.) UltraSPARC
supports both big-endian and little-endian byte orderings.
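
The two byte orderings can be seen by laying out the same 32-bit value both
ways. This Python sketch uses only the built-in integer conversion methods;
the value 0x12345678 is an arbitrary example.

```python
value = 0x12345678

big = value.to_bytes(4, byteorder="big")        # most significant byte first
little = value.to_bytes(4, byteorder="little")  # least significant byte first

print(big.hex())     # 12345678
print(little.hex())  # 78563412
```
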
There are three different floating-point data formats. The single-precision
format is 32 bits long. It stores 23 significant bits of the floating-point
value, and allows for an 8-bit exponent (power of 2). (The remaining bit is
used to store the sign of the floating-point value.) The double-precision
format is 64 bits long. It stores 52 significant bits, and allows for an
11-bit exponent. The
quad-precision format stores 112 significant bits, and allows for a 15-bit
exponent.
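
As an illustration of the single-precision layout (1 sign bit, 8 exponent
bits, 23 significant bits), the following Python sketch splits a value into
its three fields. It assumes the IEEE 754 encoding, in which the stored
exponent is biased by 127; the function name split_single is hypothetical.

```python
import struct

def split_single(x: float) -> tuple[int, int, int]:
    """Return (sign, biased exponent, fraction) of x as a 32-bit float."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits
    fraction = bits & 0x7FFFFF        # 23 bits
    return sign, exponent, fraction

# -6.5 is -1.625 * 2**2, so the biased exponent is 127 + 2 = 129.
print(split_single(-6.5))  # (1, 129, 5242880)
```
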
Characters are stored one per byte, using their 8-bit ASCII codes.

Instruction Formats

There are three basic instruction formats in the SPARC architecture. All of
these formats are 32 bits long; the first 2 bits of the instruction word
identify which format is being used. Format 1 is used for the Call
instruction. Format 2 is used for branch instructions (and one special
instruction that enters a value into a register). The remaining instructions
use Format 3, which provides for register loads and stores, and three-operand
arithmetic operations.
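
A decoder can select the format by examining only the top 2 bits of the
word. The sketch below is an illustration; the specific bit patterns (01 for
Format 1, 00 for Format 2, 10 and 11 for Format 3) are taken from the SPARC
architecture manual rather than from this text, and the function name is
hypothetical.

```python
def sparc_format(word: int) -> int:
    """Return the instruction format (1, 2, or 3) of a 32-bit SPARC word."""
    op = (word >> 30) & 0b11      # the first (most significant) 2 bits
    if op == 0b01:
        return 1                  # Call
    if op == 0b00:
        return 2                  # branches, and the one special instruction
    return 3                      # op 10 or 11: arithmetic, loads, stores

print(sparc_format(0x40000000))   # a Call with displacement 0
```
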
The fixed instruction length in the SPARC architecture is typical of RISC
systems, and is intended to speed the process of instruction fetching and de-
coding. Compare this approach with the complex variable-length instructions
found on CISC systems such as VAX and x86.

Addressing Modes

As in most architectures, an operand value may be specified as part of the
instruction itself (immediate mode), or it may be in a register (register
direct mode). Operands in memory are addressed using one of the following
three modes:

Mode                          Target address calculation

PC-relative                   TA = (PC) + displacement {30 bits, signed}
Register indirect             TA = (register) + displacement
  with displacement                {13 bits, signed}
Register indirect indexed     TA = (register-1) + (register-2)
PC-relative mode is used only for branch instructions.
The relatively few addressing modes of SPARC allow for more efficient
implementations than the 10 or more modes found on CISC systems such as x86.
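
The target address calculations for operands in memory can be sketched as
follows. This is a simplified Python illustration that follows the formulas
in the table; the function names are hypothetical, and register contents are
passed in as plain integers.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a 2's complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value

def register_indirect_with_displacement(reg: int, disp13: int) -> int:
    """TA = (register) + displacement, with a 13-bit signed displacement."""
    return reg + sign_extend(disp13, 13)

def register_indirect_indexed(reg1: int, reg2: int) -> int:
    """TA = (register-1) + (register-2)."""
    return reg1 + reg2

# A displacement field of 0x1FFC encodes -4 in 13 bits.
print(hex(register_indirect_with_displacement(0x2000, 0x1FFC)))  # 0x1ffc
```
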

Instruction Set

The basic SPARC architecture has fewer than 100 machine instructions,
reflecting its RISC philosophy. (Compare this with the 300 to 400
instructions often found in CISC systems.) The only instructions that access
memory are loads and stores. All other instructions are register-to-register
operations.
Instruction execution on a SPARC system is pipelined: while one instruction
is being executed, the next one is being fetched from memory and decoded. In
most cases, this technique speeds instruction execution. However, an
ordinary branch instruction might cause the process to “stall.” The
instruction following the branch (which had already been fetched and decoded)
would have to be discarded without being executed.
To make the pipeline work more efficiently, SPARC branch instructions
(including subroutine calls) are delayed branches. This means that the
instruction immediately following the branch instruction is actually executed
before the branch is taken. For example, in the instruction sequence

     SUB   %L0, 11, %L1
     BA    NEXT
     MOV   %L1, %O3

the MOV instruction is executed before the branch BA. This MOV instruction
is said to be in the delay slot of the branch. The programmer must take this
characteristic into account when writing an assembler language program.
Further discussions and examples of the use of delayed branches can be found
in Section 2.5.2.
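
The effect of a delay slot can be imitated with a tiny interpreter that
keeps both a current and a next instruction index. This Python sketch is an
illustration only: instructions are represented as (mnemonic, operand text)
pairs, only the unconditional branch BA is interpreted, and the ADD and NOP
lines and the position of the label NEXT are made up for the example.

```python
def run(program, labels, start=0, max_steps=20):
    """Trace (op, arg) pairs; 'BA' is an unconditional delayed branch."""
    trace = []
    pc, npc = start, start + 1        # current and next instruction indexes
    for _ in range(max_steps):
        if pc >= len(program):
            break
        op, arg = program[pc]
        trace.append(op if arg is None else f"{op} {arg}")
        pc, npc = npc, npc + 1        # default: continue sequentially
        if op == "BA":
            npc = labels[arg]         # takes effect after the delay slot
    return trace

program = [
    ("SUB", "%L0, 11, %L1"),
    ("BA", "NEXT"),
    ("MOV", "%L1, %O3"),   # delay slot: executed before the branch is taken
    ("ADD", "skipped"),    # hypothetical instruction that is jumped over
    ("NOP", None),         # hypothetical instruction at label NEXT
]
print(run(program, {"NEXT": 4}))
```

The trace lists SUB, BA, MOV, and then the instruction at NEXT; the ADD
between the delay slot and NEXT never appears.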
The UltraSPARC architecture also includes special-purpose instructions to
provide support for operating systems and optimizing compilers. For example,
high-bandwidth block load and store operations can be used to speed
common operating system functions. Communication in a multi-processor
system is facilitated by special “atomic” instructions that can execute
without allowing other memory accesses to intervene. Conditional move
instructions may allow a compiler to eliminate many branch instructions in
order to optimize program execution.

Input and Output

In the SPARC architecture, communication with I/O devices is accomplished
through memory. A range of memory locations is logically replaced by device
registers. Each I/O device has a unique address, or set of addresses, assigned
to it. When a load or store instruction refers to this device register area of
memory, the corresponding device is activated. Thus input and output can be
performed with the regular instruction set of the computer, and no special I/O
instructions are needed.
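
This memory-mapped approach can be pictured with a small model in which one
address is backed by a device instead of by storage. The Python sketch below
is purely illustrative: the class name Bus, the device address 0xF0, and the
behavior of reading the device are all assumptions.

```python
class Bus:
    """Memory in which one address is backed by a device register."""

    def __init__(self, size: int, device_base: int):
        self.ram = bytearray(size)
        self.device_base = device_base   # hypothetical device register address
        self.output = []                 # bytes "sent" to the device

    def store(self, address: int, byte: int) -> None:
        if address == self.device_base:
            self.output.append(byte)     # an ordinary store activates the device
        else:
            self.ram[address] = byte

    def load(self, address: int) -> int:
        if address == self.device_base:
            return 0                     # assumed: the device always reads as 0
        return self.ram[address]

bus = Bus(size=256, device_base=0xF0)
bus.store(0x10, 65)      # normal memory write
bus.store(0xF0, 72)      # same operation, but this one reaches the device
print(bus.load(0x10), bus.output)
```
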

1.5.2 PowerPC Architecture

IBM first introduced the POWER architecture early in 1990 with the RS/6000.
(POWER is an acronym for Performance Optimization With Enhanced RISC.)
It was soon realized that this architecture could form the basis for a new
family of powerful and low-cost microprocessors. In October 1991, IBM, Apple,
and Motorola formed an alliance to develop and market such microprocessors,
which were named PowerPC. The first products using PowerPC chips were
delivered near the end of 1993. Recent implementations of the PowerPC
architecture include the PowerPC 601, 603, and 604; others are expected in
the near future.
As its name implies, PowerPC is a RISC architecture. As we shall see, it has
much in common with other RISC systems such as SPARC. There are also a
few differences in philosophy, which we will note in the course of the discus-
sion. This section contains an overview of the PowerPC architecture, which
will serve as background for the examples to be discussed later in the book.
Further information about PowerPC can be found in IBM (1994a) and Tabak
(1995).

Memory

Memory consists of 8-bit bytes; all addresses used are byte addresses. Two
consecutive bytes form a halfword; four bytes form a word; eight bytes form a
doubleword; sixteen bytes form a quadword. Many instructions may execute
more efficiently if operands are aligned at a starting address that is a
multiple of their length.
PowerPC programs can be written using a virtual address space of 2^64
bytes. This address space is divided into fixed-length segments, which are 256
megabytes long. Each segment is divided into pages, which are 4096 bytes
long. Some of the pages used by a program may be in physical memory, while
others may be stored on disk. When an instruction is executed, the hardware
and the operating system make sure that the needed page is loaded into
physical memory. The virtual address specified by the instruction is
automatically translated into a physical address. Chapter 6 contains a brief
discussion of methods that can be used in this kind of address translation.
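
Since a segment is 256 megabytes (2^28 bytes) and a page is 4096 bytes
(2^12 bytes), a virtual address can be split into a segment number, a page
number within the segment, and a byte offset by shifting and masking. The
following Python sketch shows only this arithmetic; it does not model the
actual translation tables, and the function name is hypothetical.

```python
SEGMENT_BITS = 28   # 256 megabytes = 2**28 bytes per segment
PAGE_BITS = 12      # 4096 bytes = 2**12 bytes per page

def split_address(va: int) -> tuple[int, int, int]:
    """Return (segment number, page number within segment, byte offset)."""
    offset = va & ((1 << PAGE_BITS) - 1)
    page = (va >> PAGE_BITS) & ((1 << (SEGMENT_BITS - PAGE_BITS)) - 1)
    segment = va >> SEGMENT_BITS
    return segment, page, offset

print(split_address(0x3000_5ABC))  # (3, 5, 2748)
```
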

Registers

There are 32 general-purpose registers, designated GPR0 through GPR31. In
the full PowerPC architecture, each register is 64 bits long. PowerPC can also
be implemented in a 32-bit subset, which uses 32-bit registers. The
general-purpose registers can be used to store and manipulate integer data
and addresses.
Floating-point computations are performed using a special floating-point
unit (FPU). This unit contains thirty-two 64-bit floating-point registers, and a
status and control register.
A 32-bit condition register reflects the result of certain operations, and can
be used as a mechanism for testing and branching. This register is divided into
eight 4-bit subfields, named CR0 through CR7. These subfields can be set and
tested individually by PowerPC instructions.
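
Reading or setting one of the eight subfields amounts to a shift and a mask
on the 32-bit register value. The sketch below is illustrative; it assumes
that CR0 occupies the leftmost (most significant) four bits, and the
function names are hypothetical.

```python
def cr_field(cr: int, n: int) -> int:
    """Return the 4-bit subfield CRn (n = 0..7) of a 32-bit condition register."""
    if not 0 <= n <= 7:
        raise ValueError("n must be 0..7")
    shift = 28 - 4 * n          # assumes CR0 is the most significant field
    return (cr >> shift) & 0xF

def set_cr_field(cr: int, n: int, value: int) -> int:
    """Return cr with subfield CRn replaced by the low 4 bits of value."""
    if not 0 <= n <= 7:
        raise ValueError("n must be 0..7")
    shift = 28 - 4 * n
    cleared = cr & ~(0xF << shift) & 0xFFFFFFFF
    return cleared | ((value & 0xF) << shift)

print(cr_field(0x80000000, 0))          # 8
print(hex(set_cr_field(0, 1, 0xF)))     # 0xf000000
```
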
The PowerPC architecture includes a Link Register (LR) and a Count
Register (CR), which are used by some branch instructions. There is also a
Machine Status Register (MSR) and a variety of other control and status
registers, some of which are implementation dependent.

Data Formats

The PowerPC architecture provides for the storage of integers, floating-point
values, and characters. Integers are stored as 8-, 16-, 32-, or 64-bit binary
numbers. Both signed and unsigned integers are supported; 2's complement is
used for negative values. By default, the most significant part of a numeric
value is stored at the lowest-numbered address (big-endian byte ordering). It
is possible to select little-endian byte ordering by setting a bit in a control
register.

There are two different floating-point data formats. The single-precision
format is 32 bits long. It stores 23 significant bits of the floating-point
value, and allows for an 8-bit exponent (power of 2). (The remaining bit is
used to store the sign of the floating-point value.) The double-precision
format is 64 bits long. It stores 52 significant bits, and allows for an
11-bit exponent.
Characters are stored one per byte, using their 8-bit ASCII codes.

Instruction Formats

There are seven basic instruction formats in the PowerPC architecture, some of
which have subforms. All of these formats are 32 bits long. Instructions must
be aligned beginning at a word boundary (i.e., a byte address that is a
multiple of 4). The first 6 bits of the instruction word always specify the
opcode; some instruction formats also have an additional “extended opcode”
field.
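
The two rules above (word alignment and a 6-bit opcode at the start of the
word) translate directly into code. This Python sketch is an illustration
with hypothetical function names; the sample word 0x38600001 is an encoding
of an add-immediate instruction, whose opcode is 14.

```python
def is_word_aligned(address: int) -> bool:
    """Instructions must begin at a byte address that is a multiple of 4."""
    return address % 4 == 0

def primary_opcode(word: int) -> int:
    """Return the first (most significant) 6 bits of a 32-bit instruction."""
    return (word >> 26) & 0x3F

print(is_word_aligned(0x1008))        # True
print(primary_opcode(0x38600001))     # 14
```
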
The fixed instruction length in the PowerPC architecture is typical of RISC
systems. The variety and complexity of instruction formats is greater than that
found on most RISC systems (such as SPARC). However, the fixed length
makes instruction decoding faster and simpler than on CISC systems like VAX
and x86.

Addressing Modes

As in most architectures, an operand value may be specified as part of the
instruction itself (immediate mode), or it may be in a register (register
direct mode). The only instructions that address memory are load and store
operations, and branch instructions.
Load and store operations use one of the following three addressing
modes:

Mode                          Target address calculation

Register indirect             TA = (register)
Register indirect with index  TA = (register-1) + (register-2)
Register indirect with        TA = (register) + displacement
  immediate index                  {16 bits, signed}

The register numbers and displacement are encoded as part of the instruction.

Branch instructions use one of the following four addressing modes:

Mode                          Target address calculation

Absolute                      TA = actual address
Relative                      TA = current instruction address +
                                   displacement {25 bits, signed}
Link Register                 TA = (LR)
Count Register                TA = (CR)

The absolute address or displacement is encoded as part of the instruction.

Instruction Set

The PowerPC architecture has approximately 200 machine instructions. Some
instructions are more complex than those found in most RISC systems. For
example, load and store instructions may automatically update the index
register to contain the just-computed target address. There are floating-point
“multiply and add” instructions that take three input operands and perform a
multiplication and an addition in one instruction. Such instructions reflect
the PowerPC approach of using more powerful instructions, so fewer
instructions are required to perform a task. This is in contrast to the more
usual RISC approach, which keeps instructions simple so they can be executed
as fast as possible.
In spite of this difference in philosophy, PowerPC is generally considered
to be a true RISC architecture. Further discussions of these issues can be found
in Smith and Weiss (1994).
Instruction execution on a PowerPC system is pipelined, as we discussed
for SPARC. However, the pipelining is more sophisticated than on the original
SPARC systems, with branch prediction used to speed execution. As a result,
the delayed branch technique we described for SPARC is not used on
PowerPC (and most other modern architectures). Further discussion of
pipelining and branch prediction can be found in Tabak (1995).

Input and Output

The PowerPC architecture provides two different methods for performing I/O
operations. In one approach, segments in the virtual address space are
mapped onto an external address space (typically an I/O bus). Segments that
are mapped in this way are called direct-store segments. This method is similar
to the approach used in the SPARC architecture.

A reference to an address that is not in a direct-store segment represents a
normal virtual memory access. In this situation, I/O is performed using the
regular virtual memory management hardware and software.

1.5.3 Cray T3E Architecture

The T3E series of supercomputers was announced by Cray Research, Inc., near
the end of 1995. The T3E is a massively parallel processing (MPP) system,
designed for use on technical applications in scientific computing. The earlier
Cray T3D system had a similar (but not identical) architecture.
A T3E system contains a large number of processing elements (PE),
arranged in a three-dimensional network as illustrated in Fig. 1.8. This
network provides a path for transferring data between processors. It also
implements control functions that are used to synchronize the operation of
the PEs used by a program. The interconnect network is circular in each
dimension. Thus PEs at “opposite” ends of the three-dimensional array are
adjacent with respect to the network. This is illustrated by the dashed lines
in Fig. 1.8; for simplicity, most of these “circular” connections have been
omitted from the drawing.
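
The circular connections mean that the neighbors of a PE can be found with
modular arithmetic on its coordinates. The following Python sketch is only
an illustration: the coordinate scheme, the 4 x 4 x 4 size, and the function
name neighbors are assumptions, not details of the T3E hardware.

```python
def neighbors(pe, dims):
    """Return the six PEs adjacent to pe = (x, y, z) in a network of size dims."""
    result = []
    for axis in range(3):
        for step in (-1, 1):
            coord = list(pe)
            # Wrap around so that "opposite" ends of each dimension are adjacent.
            coord[axis] = (coord[axis] + step) % dims[axis]
            result.append(tuple(coord))
    return result

print(neighbors((0, 0, 0), (4, 4, 4)))
```

For the PE at (0, 0, 0), the result includes (3, 0, 0), (0, 3, 0), and
(0, 0, 3), which lie at the far end of each dimension.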
Each PE consists of a DEC Alpha EV5 RISC microprocessor (currently
model 21164), local memory, and performance-accelerating control logic devel-
oped by Cray. A T3E system may contain from 16 to 2048 processing elements.
This section contains an overview of the architecture of the T3E and the
DEC Alpha microprocessor. Sections 3.5.3 and 5.5.3 discuss some of the ways
programs can take advantage of the multiprocessor architecture of this
machine. Further information about the T3E can be found in Cray Research
(1995c). Further information about the DEC Alpha architecture can be found in
Sites (1992) and Tabak (1995).

Memory

Each processing element in the T3E has its own local memory with a capacity
of from 64 megabytes to 2 gigabytes. The local memory within each PE is part
of a physically distributed, logically shared memory system. System memory
is physically distributed because each PE contains local memory. System
memory is logically shared because the microprocessor in one PE can access the
memory of another PE without involving the microprocessor in that PE.

Figure 1.8 Overall T3E architecture.

The memory within each processing element consists of 8-bit bytes; all
addresses used are byte addresses. Two consecutive bytes form a word; four
bytes form a longword; eight bytes form a quadword. Many Alpha instructions
may execute more efficiently if operands are aligned at a starting address that
is a multiple of their length. The Alpha architecture supports 64-bit virtual
addresses.

Registers

The Alpha architecture includes 32 general-purpose registers, designated R0
through R31; R31 always contains the value zero. Each general-purpose
register is 64 bits long. These general-purpose registers can be used to store
and manipulate integer data and addresses.
There are also 32 floating-point registers, designated F0 through F31; F31
always contains the value zero. Each floating-point register is 64 bits long.
In addition to the general-purpose and floating-point registers, there is a
64-bit program counter PC and several other status and control registers.

Data Formats

The Alpha architecture provides for the storage of integers, floating-point
values, and characters. Integers are stored as longwords or quadwords; 2's
complement is used for negative values. When interpreted as an integer, the
bits of a longword or quadword have steadily increasing significance beginning
with bit 0 (which is stored in the lowest-addressed byte).
There are two different types of floating-point data formats in the Alpha
architecture. One group of three formats is included for compatibility with the
VAX architecture. The other group consists of four IEEE standard formats,
which are compatible with those used on most modern systems.
Characters may be stored one per byte, using their 8-bit ASCII codes.
However, there are no byte load or store operations in the Alpha architecture;
only longwords and quadwords can be transferred between a register and
memory. As a consequence, characters that are to be manipulated separately
are usually stored one per longword.

Instruction Formats

There are five basic instruction formats in the Alpha architecture, some of
which have subforms. All of these formats are 32 bits long. (As we have noted
before, this fixed length is typical of RISC systems.) The first 6 bits of the
instruction word always specify the opcode; some instruction formats also have
an additional “function” field.

Addressing Modes

As in most architectures, an operand value may be specified as part of the
instruction itself (immediate mode), or it may be in a register (register
direct mode). As in most RISC systems, the only instructions that address
memory are load and store operations, and branch instructions.
Operands in memory are addressed using one of the following two modes:

Mode                          Target address calculation

PC-relative                   TA = (PC) + displacement {23 bits, signed}
Register indirect             TA = (register) + displacement
  with displacement                {16 bits, signed}

Register indirect with displacement mode is used for load and store
operations and for subroutine jumps. PC-relative mode is used for conditional
and unconditional branches.

Instruction Set

The Alpha architecture has approximately 130 machine instructions, reflecting
its RISC orientation. The instruction set is designed so that an implementation
of the architecture can be as fast as possible. For example, there are no byte or
word load and store instructions. This means that the memory access interface
does not need to include shift-and-mask operations. Further discussion of this
approach can be found in Smith and Weiss (1994).
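
Without byte loads, a program that needs a single byte must load the whole
quadword containing it and then do the shift and mask itself. The Python
sketch below illustrates that arithmetic; it assumes the byte at the lowest
address holds the least significant bits, as described under Data Formats,
and the function name extract_byte is hypothetical.

```python
def extract_byte(quadword: int, byte_address: int) -> int:
    """Return the byte at byte_address from the aligned quadword containing it."""
    position = byte_address % 8           # byte number within the quadword
    return (quadword >> (8 * position)) & 0xFF

# Quadword holding the ASCII text "ABCDEFGH", with "A" at the lowest address.
quad = int.from_bytes(b"ABCDEFGH", byteorder="little")
print(chr(extract_byte(quad, 0x1002)))    # C
```
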

Input and Output

The T3E system performs I/O through multiple ports into one or more I/O
channels, which can be configured in a number of ways. These channels are
integrated into the network that interconnects the processing nodes. A system
may be configured with up to one I/O channel for every eight PEs. All
channels are accessible and controllable from all PEs.
Further information about this “scalable” I/O architecture can be found in
Cray Research (1995c).

EXERCISES

Section 1.3

 1. Write a sequence of instructions for SIC to set ALPHA equal to the
    product of BETA and GAMMA. Assume that ALPHA, BETA, and
    GAMMA are defined as in Fig. 1.3(a).
 2. Write a sequence of instructions for SIC/XE to set ALPHA equal to
    4 * BETA - 9. Assume that ALPHA and BETA are defined as in Fig.
    1.3(b). Use immediate addressing for the constants.
 3. Write a sequence of instructions for SIC to set ALPHA equal to the
    integer portion of BETA ÷ GAMMA. Assume that ALPHA and BETA
    are defined as in Fig. 1.3(a).
 4. Write a sequence of instructions for SIC/XE to divide BETA by
    GAMMA, setting ALPHA to the integer portion of the quotient and
    DELTA to the remainder. Use register-to-register instructions to make
    the calculation as efficient as possible.
 5. Write a sequence of instructions for SIC/XE to divide BETA by
    GAMMA, setting ALPHA to the value of the quotient, rounded to
    the nearest integer. Use register-to-register instructions to make the
    calculation as efficient as possible.
 6. Write a sequence of instructions for SIC to clear a 20-byte string to all
    blanks.
 7. Write a sequence of instructions for SIC/XE to clear a 20-byte string
    to all blanks. Use immediate addressing and register-to-register
    instructions to make the process as efficient as possible.
 8. Suppose that ALPHA is an array of 100 words, as defined in Fig.
    1.5(a). Write a sequence of instructions for SIC to set all 100 elements
    of the array to 0.
 9. Suppose that ALPHA is an array of 100 words, as defined in Fig.
    1.5(b). Write a sequence of instructions for SIC/XE to set all 100
    elements of the array to 0. Use immediate addressing and
    register-to-register instructions to make the process as efficient as
    possible.
10. Suppose that RECORD contains a 100-byte record, as in Fig. 1.7(a).
    Write a subroutine for SIC that will write this record onto device 05.
11. Suppose that RECORD contains a 100-byte record, as in Fig. 1.7(b).
    Write a subroutine for SIC/XE that will write this record onto device
    05. Use immediate addressing and register-to-register instructions to
    make the subroutine as efficient as possible.
12. Write a subroutine for SIC that will read a record into a buffer, as in
    Fig. 1.7(a). The record may be any length from 1 to 100 bytes. The
    end of the record is marked with a “null” character (ASCII code 00).
    The subroutine should place the length of the record read into a
    variable named LENGTH.
13. Write a subroutine for SIC/XE that will read a record into a buffer, as
    in Fig. 1.7(b). The record may be any length from 1 to 100 bytes. The
    end of the record is marked with a “null” character (ASCII code 00).
    The subroutine should place the length of the record read into a
    variable named LENGTH. Use immediate addressing and
    register-to-register instructions to make the subroutine as efficient as
    possible.

Chapter 2

Assemblers

In this chapter we discuss the design and implementation of assemblers. There
are certain fundamental functions that any assembler must perform, such as
translating mnemonic operation codes to their machine language equivalents
and assigning machine addresses to symbolic labels used by the programmer.
If we consider only these fundamental functions, most assemblers are very
much alike.
Beyond this most basic level, however, the features and design of an
assembler depend heavily upon the source language it translates and the
machine language it produces. One aspect of this dependence is, of course, the
existence of different machine instruction formats and codes to accomplish
(for example) an ADD operation. As we shall see, there are also many subtler
ways that assemblers depend upon machine architecture. On the other hand,
there are some features of an assembler language (and the corresponding
assembler) that have no direct relation to machine architecture; they are, in
a sense, arbitrary decisions made by the designers of the language.
We begin by considering the design of a basic assembler for the standard
version of our Simplified Instructional Computer (SIC). Section 2.1 introduces
the most fundamental operations performed by a typical assembler, and
describes common ways of accomplishing these functions. The algorithms and
data structures that we describe are shared by almost all assemblers. Thus this
level of presentation gives us a starting point from which to approach the
study of more advanced assembler features. We can also use this basic
structure as a framework from which to begin the design of an assembler for a
completely new or unfamiliar machine.
In Section 2.2, we examine some typical extensions to the basic assembler
structure that might be dictated by hardware considerations. We do this by
discussing an assembler for the SIC/XE machine. Although this SIC/XE
assembler certainly does not include all possible hardware-dependent features,
it does contain some of the ones most commonly found in real machines. The
principles and techniques should be easily applicable to other computers.
Section 2.3 presents a discussion of some of the most commonly encountered
machine-independent assembler language features and their implementation.
Once again, our purpose is not to cover all possible options, but rather
to introduce concepts and techniques that can be used in new and unfamiliar
situations.
Section 2.4 examines some important alternative design schemes for an
assembler. These are features of an assembler that are not reflected in the
assembler language. For example, some assemblers process a source program in
one pass instead of two; other assemblers may make more than two passes. We
are concerned with the implementation of such assemblers, and also with the
environments in which each might be useful.
Finally, in Section 2.5 we briefly consider some examples of actual
assemblers for real machines. We do not attempt to discuss all aspects of
these assemblers in detail. Instead, we focus on the most interesting features
that are introduced by hardware or software design decisions.

2.1 BASIC ASSEMBLER FUNCTIONS


Figure 2.1 shows an assembler language program for the basic version of SIC.
We use variations of this program throughout this chapter to show different
assembler features. The line numbers are for reference only and are not part of
the program. These numbers also help to relate corresponding parts of
different versions of the program. The mnemonic instructions used are those
introduced in Section 1.3.1 and Appendix A. Indexed addressing is indicated by
adding the modifier “,X” following the operand (see line 160). Lines beginning
with “.” contain comments only.
In addition to the mnemonic machine instructions, we have used the fol-
lowing assembler directives:

START   Specify name and starting address for the program.
END     Indicate the end of the source program and (optionally) specify
        the first executable instruction in the program.
BYTE    Generate character or hexadecimal constant, occupying as
        many bytes as needed to represent the constant.
WORD    Generate one-word integer constant.
RESB    Reserve the indicated number of bytes for a data area.
RESW    Reserve the indicated number of words for a data area.

Line   Source statement

5      COPY    START   1000        COPY FILE FROM INPUT TO OUTPUT
10     FIRST   STL     RETADR      SAVE RETURN ADDRESS
15     CLOOP   JSUB    RDREC       READ INPUT RECORD
20             LDA     LENGTH      TEST FOR EOF (LENGTH = 0)
25             COMP    ZERO
30             JEQ     ENDFIL      EXIT IF EOF FOUND
35             JSUB    WRREC       WRITE OUTPUT RECORD
40             J       CLOOP       LOOP
45     ENDFIL  LDA     EOF         INSERT END OF FILE MARKER
50             STA     BUFFER
55             LDA     THREE       SET LENGTH = 3
60             STA     LENGTH
65             JSUB    WRREC       WRITE EOF
70             LDL     RETADR      GET RETURN ADDRESS
75             RSUB                RETURN TO CALLER
80     EOF     BYTE    C'EOF'
85     THREE   WORD    3
90     ZERO    WORD    0
95     RETADR  RESW    1
100    LENGTH  RESW    1           LENGTH OF RECORD
105    BUFFER  RESB    4096        4096-BYTE BUFFER AREA
110    .
115    .       SUBROUTINE TO READ RECORD INTO BUFFER
120    .
125    RDREC   LDX     ZERO        CLEAR LOOP COUNTER
130            LDA     ZERO        CLEAR A TO ZERO
135    RLOOP   TD      INPUT       TEST INPUT DEVICE
140            JEQ     RLOOP       LOOP UNTIL READY
145            RD      INPUT       READ CHARACTER INTO REGISTER A
150            COMP    ZERO        TEST FOR END OF RECORD (X'00')
155            JEQ     EXIT        EXIT LOOP IF EOR
160            STCH    BUFFER,X    STORE CHARACTER IN BUFFER
165            TIX     MAXLEN      LOOP UNLESS MAX LENGTH
170            JLT     RLOOP         HAS BEEN REACHED
175    EXIT    STX     LENGTH      SAVE RECORD LENGTH
180            RSUB                RETURN TO CALLER
185    INPUT   BYTE    X'F1'       CODE FOR INPUT DEVICE
190    MAXLEN  WORD    4096
195    .
200    .       SUBROUTINE TO WRITE RECORD FROM BUFFER
205    .
210    WRREC   LDX     ZERO        CLEAR LOOP COUNTER
215    WLOOP   TD      OUTPUT      TEST OUTPUT DEVICE
220            JEQ     WLOOP       LOOP UNTIL READY
225            LDCH    BUFFER,X    GET CHARACTER FROM BUFFER
230            WD      OUTPUT      WRITE CHARACTER
235            TIX     LENGTH      LOOP UNTIL ALL CHARACTERS
240            JLT     WLOOP         HAVE BEEN WRITTEN
245            RSUB                RETURN TO CALLER
250    OUTPUT  BYTE    X'05'       CODE FOR OUTPUT DEVICE
255            END     FIRST

Figure 2.1 Example of a SIC assembler language program.

The program contains a main routine that reads records from an input device
(identified with device code F1) and copies them to an output device
(code 05). This main routine calls subroutine RDREC to read a record into a
buffer and subroutine WRREC to write the record from the buffer to the output
device. Each subroutine must transfer the record one character at a time
because the only I/O instructions available are RD and WD. The buffer is
necessary because the I/O rates for the two devices, such as a disk and a slow
printing terminal, may be very different. (In Chapter 6, we see how to use
channel programs and operating system calls on a SIC/XE system to accomplish
the same functions.) The end of each record is marked with a null character
(hexadecimal 00). If a record is longer than the length of the buffer (4096
bytes), only the first 4096 bytes are copied. (For simplicity, the program
does not deal with error recovery when a record containing 4096 bytes or more
is read.) The end of the file to be copied is indicated by a zero-length
record. When the end of file is detected, the program writes EOF on the output
device and terminates by executing an RSUB instruction. We assume that this
program was called by the operating system using a JSUB instruction; thus, the
RSUB will return control to the operating system.

2.1.1 A Simple SIC Assembler

Figure 2.2 shows the same program as in Fig. 2.1, with the generated object
code for each statement. The column headed Loc gives the machine address
(in hexadecimal) for each part of the assembled program. We have assumed
that the program starts at address 1000. (In an actual assembler listing, of
course, the comments would be retained; they have been eliminated here to
save space.)
The translation of source program to object code requires us to accomplish
the following functions (not necessarily in the order given):

1. Convert mnemonic operation codes to their machine language
   equivalents, e.g., translate STL to 14 (line 10).
2. Convert symbolic operands to their equivalent machine addresses,
   e.g., translate RETADR to 1033 (line 10).
3. Build the machine instructions in the proper format.
4. Convert the data constants specified in the source program into their
   internal machine representations, e.g., translate EOF to 454F46 (line
   80).
5. Write the object program and the assembly listing.
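
Functions 1, 3, and 4 can be illustrated with a minimal Python sketch. It
assumes the standard SIC instruction layout from Section 1.3.1 (8-bit opcode,
index bit x, 15-bit address). The table holds only a few of the opcodes that
appear in Fig. 2.2, and the names assemble_sic and byte_constant are
hypothetical helpers introduced here for illustration. Operand addresses are
passed in directly, since translating symbols (function 2) is the harder
problem discussed next.

```python
# Hypothetical helpers for illustration; not taken from any real assembler.
OPCODES = {"STL": 0x14, "JSUB": 0x48, "LDA": 0x00, "STCH": 0x54}

def assemble_sic(mnemonic, address, indexed=False):
    """Build a 24-bit SIC instruction: 8-bit opcode, x bit, 15-bit address."""
    word = (OPCODES[mnemonic] << 16) | (address & 0x7FFF)
    if indexed:
        word |= 0x8000          # set the index bit x
    return f"{word:06X}"

def byte_constant(operand):
    """Convert a BYTE operand such as C'EOF' or X'F1' to hexadecimal text."""
    kind, text = operand[0], operand[2:-1]
    if kind == "C":             # character constant: one ASCII code per character
        return "".join(f"{ord(ch):02X}" for ch in text)
    if kind == "X":             # hexadecimal constant: digits are used as written
        return text.upper()
    raise ValueError(f"unsupported constant: {operand}")

print(assemble_sic("STL", 0x1033))                 # line 10 of Fig. 2.2
print(assemble_sic("STCH", 0x1039, indexed=True))  # line 160 of Fig. 2.2
print(byte_constant("C'EOF'"))                     # line 80 of Fig. 2.2
```

The three calls correspond to the object code shown for lines 10, 160, and 80
of Fig. 2.2.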

All of these functions except number 2 can easily be accomplished by
sequential processing of the source program, one line at a time. The
translation of addresses, however, presents a problem. Consider the statement

10 1000 FIRST STL RETADR 141033



Line   Loc    Source statement              Object code

5      1000   COPY    START   1000
10     1000   FIRST   STL     RETADR        141033
15     1003   CLOOP   JSUB    RDREC         482039
20     1006           LDA     LENGTH        001036
25     1009           COMP    ZERO          281030
30     100C           JEQ     ENDFIL        301015
35     100F           JSUB    WRREC         482061
40     1012           J       CLOOP         3C1003
45     1015   ENDFIL  LDA     EOF           00102A
50     1018           STA     BUFFER        0C1039
55     101B           LDA     THREE         00102D
60     101E           STA     LENGTH        0C1036
65     1021           JSUB    WRREC         482061
70     1024           LDL     RETADR        081033
75     1027           RSUB                  4C0000
80     102A   EOF     BYTE    C'EOF'        454F46
85     102D   THREE   WORD    3             000003
90     1030   ZERO    WORD    0             000000
95     1033   RETADR  RESW    1
100    1036   LENGTH  RESW    1
105    1039   BUFFER  RESB    4096
110           .
115           .       SUBROUTINE TO READ RECORD INTO BUFFER
120           .
125    2039   RDREC   LDX     ZERO          041030
130    203C           LDA     ZERO          001030
135    203F   RLOOP   TD      INPUT         E0205D
140    2042           JEQ     RLOOP         30203F
145    2045           RD      INPUT         D8205D
150    2048           COMP    ZERO          281030
155    204B           JEQ     EXIT          302057
160    204E           STCH    BUFFER,X      549039
165    2051           TIX     MAXLEN        2C205E
170    2054           JLT     RLOOP         38203F
175    2057   EXIT    STX     LENGTH        101036
180    205A           RSUB                  4C0000
185    205D   INPUT   BYTE    X'F1'         F1
190    205E   MAXLEN  WORD    4096          001000
195           .
200           .       SUBROUTINE TO WRITE RECORD FROM BUFFER
205           .
210    2061   WRREC   LDX     ZERO          041030
215    2064   WLOOP   TD      OUTPUT        E02079
220    2067           JEQ     WLOOP         302064
225    206A           LDCH    BUFFER,X      509039
230    206D           WD      OUTPUT        DC2079
235    2070           TIX     LENGTH        2C1036
240    2073           JLT     WLOOP         382064
245    2076           RSUB                  4C0000
250    2079   OUTPUT  BYTE    X'05'         05
255                   END     FIRST

Figure 2.2 Program from Fig. 2.1 with object code.



This instruction contains a forward reference, that is, a reference to a label
(RETADR) that is defined later in the program. If we attempt to translate the
program line by line, we will be unable to process this statement because we
do not know the address that will be assigned to RETADR. Because of this,
most assemblers make two passes over the source program. The first pass
does little more than scan the source program for label definitions and assign
addresses (such as those in the Loc column in Fig. 2.2). The second pass
performs most of the actual translation previously described.
In addition to translating the instructions of the source program, the
assembler must process statements called assembler directives (or
pseudo-instructions). These statements are not translated into machine
instructions (although they may have an effect on the object program).
Instead, they provide instructions to the assembler itself. Examples of
assembler directives are statements like BYTE and WORD, which direct the
assembler to generate constants as part of the object program, and RESB and
RESW, which instruct the assembler to reserve memory locations without
generating data values. The other assembler directives in our sample program
are START, which specifies the starting memory address for the object program,
and END, which marks the end of the program.
Finally, the assembler must write the generated object code onto some output
device. This object program will later be loaded into memory for execution.
The simple object program format we use contains three types of records:
Header, Text, and End. The Header record contains the program name, starting
address, and length. Text records contain the translated (i.e., machine
code) instructions and data of the program, together with an indication of the
addresses where these are to be loaded. The End record marks the end of the
object program and specifies the address in the program where execution is to
begin. (This is taken from the operand of the program's END statement. If no
operand is specified, the address of the first executable instruction is used.)
The formats we use for these records are as follows. The details of the
formats (column numbers, etc.) are arbitrary; however, the information
contained in these records must be present (in some form) in the object
program.

Header record:
Col. 1      H
Col. 2-7    Program name
Col. 8-13   Starting address of object program (hexadecimal)
Col. 14-19  Length of object program in bytes (hexadecimal)

Text record:
Col. 1      T
Col. 2-7    Starting address for object code in this record (hexadecimal)
Col. 8-9    Length of object code in this record in bytes (hexadecimal)
Col. 10-69  Object code, represented in hexadecimal (2 columns per
            byte of object code)

End record:
Col. 1      E
Col. 2-7    Address of first executable instruction in object program
            (hexadecimal)

To avoid confusion, we have used the term column rather than byte to refer to
positions within object program records. This is not meant to imply the use of
any particular medium for the object program.
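
The record layouts above map naturally onto fixed-width string formatting.
The following Python sketch is one possible rendering, under the assumption
that records are plain text lines with no field separators. The function names
are our own and are only illustrative.

```python
def header_record(name, start, length):
    """Col. 1 'H', cols. 2-7 program name, cols. 8-13 start, cols. 14-19 length."""
    return f"H{name:<6}{start:06X}{length:06X}"

def text_record(start, object_codes):
    """Col. 1 'T', cols. 2-7 start address, cols. 8-9 byte count, then the code."""
    body = "".join(object_codes)
    if len(body) > 60:                      # cols. 10-69 hold at most 30 bytes
        raise ValueError("object code does not fit in one Text record")
    return f"T{start:06X}{len(body) // 2:02X}{body}"

def end_record(first_instruction):
    """Col. 1 'E', cols. 2-7 address of the first executable instruction."""
    return f"E{first_instruction:06X}"

print(header_record("COPY", 0x1000, 0x107A))
print(text_record(0x2073, ["382064", "4C0000", "05"]))
print(end_record(0x1000))
```

The byte count in a Text record is half the number of hexadecimal columns,
because each byte of object code occupies two columns.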
Figure 2.3 shows the object program corresponding to Fig. 2.2, using this
format. In this figure, and in the other object programs we display, the
symbol ^ is used to separate fields visually. Of course, such symbols are not
present in the actual object program. Note that there is no object code
corresponding to addresses 1033-2038. This storage is simply reserved by the
loader for use by the program during execution. (Chapter 3 contains a detailed
discussion of the operation of the loader.)
We can now give a general description of the functions of the two passes of
our simple assembler.

H^COPY  ^001000^00107A
T^001000^1E^141033^482039^001036^281030^301015^482061^3C1003^00102A^0C1039^00102D
T^00101E^15^0C1036^482061^081033^4C0000^454F46^000003^000000
T^002039^1E^041030^001030^E0205D^30203F^D8205D^281030^302057^549039^2C205E^38203F
T^002057^1C^101036^4C0000^F1^001000^041030^E02079^302064^509039^DC2079^2C1036
T^002073^07^382064^4C0000^05
E^001000

Figure 2.3 Object program corresponding to Fig. 2.2.



Pass 1 (define symbols):

1. Assign addresses to all statements in the program.
2. Save the values (addresses) assigned to all labels for use in Pass 2.
3. Perform some processing of assembler directives. (This includes
   processing that affects address assignment, such as determining
   the length of data areas defined by BYTE, RESW, etc.)

Pass 2 (assemble instructions and generate object program):

1. Assemble instructions (translating operation codes and looking
   up addresses).
2. Generate data values defined by BYTE, WORD, etc.
3. Perform processing of assembler directives not done during Pass 1.
4. Write the object program and the assembly listing.

In the next section we discuss these functions in more detail, describe the in-
ternal tables required by the assembler, and give an overall description of the
logic flow of each pass.

2.1.2 Assembler Algorithm and Data Structures

Our simple assembler uses two major internal data structures: the Operation
Code Table (OPTAB) and the Symbol Table (SYMTAB). OPTAB is used to look
up mnemonic operation codes and translate them to their machine language
equivalents. SYMTAB is used to store values (addresses) assigned to labels.
We also need a Location Counter LOCCTR. This is a variable that is used
to help in the assignment of addresses. LOCCTR is initialized to the beginning
address specified in the START statement. After each source statement is
processed, the length of the assembled instruction or data area to be generated
is added to LOCCTR. Thus whenever we reach a label in the source program,
the current value of LOCCTR gives the address to be associated with that
label.
The Operation Code Table must contain (at least) the mnemonic operation
code and its machine language equivalent. In more complex assemblers, this
table also contains information about instruction format and length. During
Pass 1, OPTAB is used to look up and validate operation codes in the source
program. In Pass 2, it is used to translate the operation codes to machine
language. Actually, in our simple SIC assembler, both of these processes could
be done together in either Pass 1 or Pass 2. However, for a machine (such as
SIC/XE) that has instructions of different lengths, we must search OPTAB in
the first pass to find the instruction length for incrementing LOCCTR.
Likewise, we must have the information from OPTAB in Pass 2 to tell us
which instruction format to use in assembling the instruction, and any
peculiarities of the object code instruction. We have chosen to retain this
structure in the current discussion because it is typical of most real
assemblers.
OPTAB is usually organized as a hash table, with mnemonic operation
code as the key. (The information in OPTAB is, of course, predefined when the
assembler itself is written, rather than being loaded into the table at
execution time.) The hash table organization is particularly appropriate,
since it provides fast retrieval with a minimum of searching. In most cases,
OPTAB is a static table, that is, entries are not normally added to or deleted
from it. In such cases it is possible to design a special hashing function or
other data structure to give optimum performance for the particular set of
keys being stored. Most of the time, however, a general-purpose hashing method
is used. Further information about the design and construction of hash tables
may be found in any good data structures text, such as Lewis and Denenberg
(1991) or Knuth (1973).
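
As a rough illustration of a static OPTAB, the sketch below uses a Python
dictionary (itself a general-purpose hash table) wrapped in a read-only view,
so that entries cannot be added or deleted once the assembler is running. The
table is only an excerpt, and storing the pair (opcode, length) per entry is
an assumption made for this example rather than a required layout.

```python
from types import MappingProxyType

# mnemonic -> (machine opcode, instruction length in bytes); excerpt only
OPTAB = MappingProxyType({
    "LDA":   (0x00, 3),
    "STL":   (0x14, 3),
    "COMP":  (0x28, 3),
    "JSUB":  (0x48, 3),
    "COMPR": (0xA0, 2),
    "CLEAR": (0xB4, 2),
})

def lookup(mnemonic):
    """Return (opcode, length) for a valid mnemonic, or None if it is unknown."""
    return OPTAB.get(mnemonic)

print(lookup("STL"))     # valid operation code
print(lookup("STORE"))   # not in OPTAB: Pass 1 would flag an invalid opcode
```

Keeping the length alongside the opcode lets Pass 1 advance LOCCTR and Pass 2
translate the mnemonic using the same table.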
The symbol table (SYMTAB) includes the name and value (address) for
each label in the source program, together with flags to indicate error
conditions (e.g., a symbol defined in two different places). This table may
also contain other information about the data area or instruction labeled, for
example, its type or length. During Pass 1 of the assembler, labels are
entered into SYMTAB as they are encountered in the source program, along with
their assigned addresses (from LOCCTR). During Pass 2, symbols used as
operands are looked up in SYMTAB to obtain the addresses to be inserted in the
assembled instructions.
SYMTAB is usually organized as a hash table for efficiency of insertion and
retrieval. Since entries are rarely (if ever) deleted from this table,
efficiency of deletion is not an important consideration. Because SYMTAB is
used heavily throughout the assembly, care should be taken in the selection of
a hashing function. Programmers often select many labels that have similar
characteristics, for example, labels that start or end with the same
characters (like LOOP1, LOOP2, LOOPA) or are of the same length (like A, X, Y,
Z). It is important that the hashing function used perform well with such
non-random keys. Division of the entire key by a prime table length often
gives good results.
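
The division method mentioned above can be sketched in a few lines of Python.
Here the whole label is treated as one large integer and reduced modulo a
prime table length. The table size of 101 and the name hash_label are
arbitrary choices for this illustration; a real assembler would size the table
to the expected number of symbols.

```python
TABLE_SIZE = 101   # a prime, chosen arbitrarily for this sketch

def hash_label(label):
    """Divide the entire key (read as a base-256 integer) by a prime table length."""
    key = int.from_bytes(label.encode("ascii"), "big")
    return key % TABLE_SIZE

labels = ["LOOP1", "LOOP2", "LOOPA", "A", "X", "Y", "Z"]
slots = {label: hash_label(label) for label in labels}
for label, slot in slots.items():
    print(f"{label:<6} -> slot {slot}")
```

For these seven similar labels the slots happen to be distinct. That is not
guaranteed in general, so a practical table still needs a collision-handling
scheme.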
It is possible for both passes of the assembler to read the original source
program as input. However, there is certain information (such as location
counter values and error flags for statements) that can or should be
communicated between the two passes. For this reason, Pass 1 usually writes an
intermediate file that contains each source statement together with its
assigned address, error indicators, etc. This file is used as the input to
Pass 2. This working copy of the source program can also be used to retain the
results of certain operations that may be performed during Pass 1 (such as
scanning the operand field for symbols and addressing flags), so these need
not be performed again during Pass 2. Similarly, pointers into OPTAB and
SYMTAB may be retained for each operation code and symbol used. This avoids
the need to repeat many of the table-searching operations.
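
To make the role of LOCCTR, SYMTAB, and the intermediate file concrete, here
is a minimal Python sketch of Pass 1 for standard SIC. It rests on several
assumptions of our own: statements arrive already split into (label, opcode,
operand) tuples with comment lines removed, the intermediate "file" is an
in-memory list, OPTAB holds only the mnemonics needed by the example, and the
input always ends with an END statement. The short program is a cut-down
variant written for this example, not the program of Fig. 2.1.

```python
# Mnemonics only: Pass 1 needs to validate opcodes, not translate them.
OPTAB = {"STL", "LDL", "LDA", "STA", "JSUB", "RSUB", "J", "JEQ", "COMP"}

def byte_length(operand):
    """Length in bytes of a BYTE constant such as C'EOF' or X'F1'."""
    text = operand[2:-1]
    return len(text) if operand[0] == "C" else (len(text) + 1) // 2

def pass1(statements):
    symtab, intermediate, errors = {}, [], []
    stream = iter(statements)
    label, opcode, operand = next(stream)
    start = locctr = 0
    if opcode == "START":
        start = locctr = int(operand, 16)       # START operand is hexadecimal
        intermediate.append((locctr, label, opcode, operand))
        label, opcode, operand = next(stream)
    while opcode != "END":
        loc = locctr                            # address of this statement
        if label:
            if label in symtab:
                errors.append(f"duplicate symbol {label}")
            else:
                symtab[label] = locctr
        if opcode in OPTAB or opcode == "WORD":
            locctr += 3
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        elif opcode == "BYTE":
            locctr += byte_length(operand)
        else:
            errors.append(f"invalid operation code {opcode}")
        intermediate.append((loc, label, opcode, operand))
        label, opcode, operand = next(stream)
    intermediate.append((locctr, label, opcode, operand))   # the END line
    return symtab, intermediate, locctr - start, errors

program = [
    ("COPY", "START", "1000"),
    ("FIRST", "STL", "RETADR"),
    ("", "LDL", "RETADR"),
    ("", "RSUB", ""),
    ("EOF", "BYTE", "C'EOF'"),
    ("RETADR", "RESW", "1"),
    ("BUFFER", "RESB", "4096"),
    ("", "END", "FIRST"),
]
symtab, intermediate, length, errors = pass1(program)
print({name: f"{addr:04X}" for name, addr in symtab.items()})
print(f"program length = {length:04X}")
```

Note that the forward reference to RETADR in the first instruction causes no
difficulty here, because Pass 1 never looks at operand symbols; it only
records where each label is defined.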
Figures 2.4(a) and (b) show the logic flow of the two passes of our
assembler. Although described for the simple assembler we are discussing, this
is also the underlying logic for more complex two-pass assemblers that we
consider later. We assume for simplicity that the source lines are written in
fixed format with fields LABEL, OPCODE, and OPERAND. If one of these fields
contains a character string that represents a number, we denote its numeric
value with the prefix # (for example, #[OPERAND]).
At this stage, it is very important for you to understand thoroughly the
algorithms in Fig. 2.4. You are strongly urged to follow through the logic in
these algorithms, applying them by hand to the program in Fig. 2.1 to produce
the object program of Fig. 2.3.
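
As an aid to that hand exercise, the following Python sketch mirrors the main
steps of Pass 2 for standard SIC. It is a simplified reading of the algorithm
under our own assumptions: the intermediate file is a list of (address,
opcode, operand) tuples, the symbol table is supplied as a dictionary, OPTAB
is a small excerpt, error flags and the assembly listing are omitted, and a
new Text record is started whenever reserved storage interrupts the object
code. The function name pass2 and its parameters are illustrative only.

```python
OPTAB = {"STL": 0x14, "LDL": 0x08, "RSUB": 0x4C, "STCH": 0x54}   # excerpt

def pass2(name, start, length, intermediate, symtab, first):
    records = [f"H{name:<6}{start:06X}{length:06X}"]
    rec_start, rec_body = None, ""

    def flush():
        """Write the current Text record, if any, and begin an empty one."""
        nonlocal rec_start, rec_body
        if rec_body:
            records.append(f"T{rec_start:06X}{len(rec_body) // 2:02X}{rec_body}")
        rec_start, rec_body = None, ""

    for loc, opcode, operand in intermediate:
        if opcode in OPTAB:
            symbol, _, index = operand.partition(",")
            address = symtab.get(symbol, 0) if symbol else 0
            if index == "X":
                address |= 0x8000               # indexed addressing: set bit x
            code = f"{OPTAB[opcode]:02X}{address:04X}"
        elif opcode == "WORD":
            code = f"{int(operand) & 0xFFFFFF:06X}"
        elif opcode == "BYTE":
            text = operand[2:-1]
            code = text if operand[0] == "X" else "".join(f"{ord(c):02X}" for c in text)
        else:
            flush()                             # RESW/RESB generate no object code
            continue
        too_long = len(rec_body) + len(code) > 60
        gap = bool(rec_body) and rec_start + len(rec_body) // 2 != loc
        if too_long or gap:
            flush()
        if not rec_body:
            rec_start = loc
        rec_body += code
    flush()
    records.append(f"E{first:06X}")
    return records

symtab = {"FIRST": 0x1000, "EOF": 0x1009, "RETADR": 0x100C, "BUFFER": 0x100F}
intermediate = [
    (0x1000, "STL", "RETADR"), (0x1003, "LDL", "RETADR"), (0x1006, "RSUB", ""),
    (0x1009, "BYTE", "C'EOF'"), (0x100C, "RESW", "1"), (0x100F, "RESB", "4096"),
]
for record in pass2("COPY", 0x1000, 0x100F, intermediate, symtab, symtab["FIRST"]):
    print(record)
```

The input here is a cut-down program of our own (three instructions, one
constant, and two reserved areas), so the output is a Header record, a single
Text record, and an End record.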
Much of the detail of the assembler logic has, of course, been left out to
emphasize the overall structure and main concepts. You should think about
these details for yourself, and you should also attempt to identify those
functions of the assembler that should be implemented as separate procedures
or modules. (For example, the operations “search symbol table” and “read input
line” might be good candidates for such implementation.) This kind of
thoughtful analysis should be done before you make any attempt to actually
implement an assembler or any other large piece of software.
Chapter 8 contains an introduction to software engineering tools and tech-
niques, and illustrates the use of such techniques in designing and implement-
ing a simple assembler. You may want to read this material now to gain
further insight into how an assembler might be constructed.

2.2 MACHINE-DEPENDENT ASSEMBLER FEATURES

In this section, we consider the design and implementation of an assembler for
the more complex XE version of SIC. In doing so, we examine the effect of the
extended hardware on the structure and functions of the assembler. Many real
machines have certain architectural features that are similar to those we
consider here. Thus our discussion applies in large part to these machines as
well as to SIC/XE.
Figure 2.5 shows the example program from Fig. 2.1 as it might be rewritten
to take advantage of the SIC/XE instruction set. In our assembler language,
indirect addressing is indicated by adding the prefix @ to the operand
Pass 1:

begin
   read first input line
   if OPCODE = 'START' then
      begin
         save #[OPERAND] as starting address
         initialize LOCCTR to starting address
         write line to intermediate file
         read next input line
      end {if START}
   else
      initialize LOCCTR to 0
   while OPCODE ≠ 'END' do
      begin
         if this is not a comment line then
            begin
               if there is a symbol in the LABEL field then
                  begin
                     search SYMTAB for LABEL
                     if found then
                        set error flag (duplicate symbol)
                     else
                        insert (LABEL,LOCCTR) into SYMTAB
                  end {if symbol}
               search OPTAB for OPCODE
               if found then
                  add 3 {instruction length} to LOCCTR
               else if OPCODE = 'WORD' then
                  add 3 to LOCCTR
               else if OPCODE = 'RESW' then
                  add 3 * #[OPERAND] to LOCCTR
               else if OPCODE = 'RESB' then
                  add #[OPERAND] to LOCCTR
               else if OPCODE = 'BYTE' then
                  begin
                     find length of constant in bytes
                     add length to LOCCTR
                  end {if BYTE}
               else
                  set error flag (invalid operation code)
            end {if not a comment}
         write line to intermediate file
         read next input line
      end {while not END}
   write last line to intermediate file
   save (LOCCTR - starting address) as program length
end {Pass 1}

Figure 2.4(a) Algorithm for Pass 1 of assembler.



Pass 2:

begin
   read first input line {from intermediate file}
   if OPCODE = 'START' then
      begin
         write listing line
         read next input line
      end {if START}
   write Header record to object program
   initialize first Text record
   while OPCODE ≠ 'END' do
      begin
         if this is not a comment line then
            begin
               search OPTAB for OPCODE
               if found then
                  begin
                     if there is a symbol in OPERAND field then
                        begin
                           search SYMTAB for OPERAND
                           if found then
                              store symbol value as operand address
                           else
                              begin
                                 store 0 as operand address
                                 set error flag (undefined symbol)
                              end
                        end {if symbol}
                     else
                        store 0 as operand address
                     assemble the object code instruction
                  end {if opcode found}
               else if OPCODE = 'BYTE' or 'WORD' then
                  convert constant to object code
               if object code will not fit into the current Text record then
                  begin
                     write Text record to object program
                     initialize new Text record
                  end
               add object code to Text record
            end {if not comment}
         write listing line
         read next input line
      end {while not END}
   write last Text record to object program
   write End record to object program
   write last listing line
end {Pass 2}

Figure 2.4(b) Algorithm for Pass 2 of assembler.



Line   Source statement

5      COPY    START   0           COPY FILE FROM INPUT TO OUTPUT
10     FIRST   STL     RETADR      SAVE RETURN ADDRESS
12             LDB     #LENGTH     ESTABLISH BASE REGISTER
13             BASE    LENGTH
15     CLOOP   +JSUB   RDREC       READ INPUT RECORD
20             LDA     LENGTH      TEST FOR EOF (LENGTH = 0)
25             COMP    #0
30             JEQ     ENDFIL      EXIT IF EOF FOUND
35             +JSUB   WRREC       WRITE OUTPUT RECORD
40             J       CLOOP       LOOP
45     ENDFIL  LDA     EOF         INSERT END OF FILE MARKER
50             STA     BUFFER
55             LDA     #3          SET LENGTH = 3
60             STA     LENGTH
65             +JSUB   WRREC       WRITE EOF
70             J       @RETADR     RETURN TO CALLER
80     EOF     BYTE    C'EOF'
95     RETADR  RESW    1
100    LENGTH  RESW    1           LENGTH OF RECORD
105    BUFFER  RESB    4096        4096-BYTE BUFFER AREA
110    .
115    .       SUBROUTINE TO READ RECORD INTO BUFFER
120    .
125    RDREC   CLEAR   X           CLEAR LOOP COUNTER
130            CLEAR   A           CLEAR A TO ZERO
132            CLEAR   S           CLEAR S TO ZERO
133            +LDT    #4096
135    RLOOP   TD      INPUT       TEST INPUT DEVICE
140            JEQ     RLOOP       LOOP UNTIL READY
145            RD      INPUT       READ CHARACTER INTO REGISTER A
150            COMPR   A,S         TEST FOR END OF RECORD (X'00')
155            JEQ     EXIT        EXIT LOOP IF EOR
160            STCH    BUFFER,X    STORE CHARACTER IN BUFFER
165            TIXR    T           LOOP UNLESS MAX LENGTH
170            JLT     RLOOP         HAS BEEN REACHED
175    EXIT    STX     LENGTH      SAVE RECORD LENGTH
180            RSUB                RETURN TO CALLER
185    INPUT   BYTE    X'F1'       CODE FOR INPUT DEVICE
195    .
200    .       SUBROUTINE TO WRITE RECORD FROM BUFFER
205    .
210    WRREC   CLEAR   X           CLEAR LOOP COUNTER
212            LDT     LENGTH
215    WLOOP   TD      OUTPUT      TEST OUTPUT DEVICE
220            JEQ     WLOOP       LOOP UNTIL READY
225            LDCH    BUFFER,X    GET CHARACTER FROM BUFFER
230            WD      OUTPUT      WRITE CHARACTER
235            TIXR    T           LOOP UNTIL ALL CHARACTERS
240            JLT     WLOOP         HAVE BEEN WRITTEN
245            RSUB                RETURN TO CALLER
250    OUTPUT  BYTE    X'05'       CODE FOR OUTPUT DEVICE
255            END     FIRST

Figure 2.5 Example of a SIC/XE program.



(see line 70). Immediate operands are denoted with the prefix # (lines 25, 55,
133). Instructions that refer to memory are normally assembled using either
the program-counter relative or the base relative mode. The assembler
directive BASE (line 13) is used in conjunction with base relative addressing.
(See Section 2.2.1 for a discussion and examples.) If the displacements
required for both program-counter relative and base relative addressing are
too large to fit into a 3-byte instruction, then the 4-byte extended format
(Format 4) must be used. The extended instruction format is specified with the
prefix + added to the operation code in the source statement (see lines 15,
35, 65). It is the programmer's responsibility to specify this form of
addressing when it is required.
The main differences between this version of the program and the version
in Fig. 2.1 involve the use of register-to-register instructions (in place of
register-to-memory instructions) wherever possible. For example, the statement
on line 150 is changed from COMP ZERO to COMPR A,S. Similarly, line 165 is
changed from TIX MAXLEN to TIXR T. In addition, immediate and indirect
addressing have been used as much as possible (for example, lines 25, 55, and
70).
These changes take advantage of the more advanced SIC/XE architecture
to improve the execution speed of the program. Register-to-register
instructions are faster than the corresponding register-to-memory operations
because they are shorter, and, more importantly, because they do not require
another memory reference. (Fetching an operand from a register is much faster
than retrieving it from main memory.) Likewise, when using immediate
addressing, the operand is already present as part of the instruction and need
not be fetched from anywhere. The use of indirect addressing often avoids the
need for another instruction (as in the “return” operation on line 70). You
may notice that some of the changes require the addition of other instructions
to the program. For example, changing COMP to COMPR on line 150 forces us to
add the CLEAR instruction on line 132. This still results in an improvement in
execution speed. The CLEAR is executed only once for each record read,
whereas the benefits of COMPR (as opposed to COMP) are realized for every
byte of data transferred.
In Section 2.2.1, we examine the assembly of this SIC/XE program, focusing
on the differences in the assembler that are required by the new addressing
modes. (You may want to briefly review the instruction formats and target
address calculations described in Section 1.3.2.) These changes are direct
consequences of the extended hardware functions.
Section 2.2.2 discusses an indirect consequence of the change to SIC/XE.
The larger main memory of SIC/XE means that we may have room to load
and run several programs at the same time. This kind of sharing of the
machine between programs is called multiprogramming. Such sharing often
results in more productive use of the hardware. (We discuss this concept, and
its implications for operating systems, in Chapter 6.) To take full advantage
of this capability, however, we must be able to load programs into memory
wherever there is room, rather than specifying a fixed address at assembly
time. Section 2.2.2 introduces the idea of program relocation and discusses
its implications for the assembler.

2.2.1 Instruction Formats and Addressing Modes

Figure 2.6 shows the object code generated for each statement in the program
of Fig. 2.5. In this section we consider the translation of the source
statements, paying particular attention to the handling of different
instruction formats and different addressing modes. Note that the START
statement now specifies a beginning program address of 0. As we discuss in the
next section, this indicates a relocatable program. For the purposes of
instruction assembly, however, the program will be translated exactly as if it
were really to be loaded at machine address 0.
Translation of register-to-register instructions such as CLEAR (line 125)
and COMPR (line 150) presents no new problems. The assembler must simply
convert the mnemonic operation code to machine language (using OPTAB)
and change each register mnemonic to its numeric equivalent. This translation
is done during Pass 2, at the same point at which the other types of
instructions are assembled. The conversion of register mnemonics to numbers
can be done with a separate table; however, it is often convenient to use the
symbol table for this purpose. To do this, SYMTAB would be preloaded with the
register names (A, X, etc.) and their values (0, 1, etc.).
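
A minimal Python sketch of this translation is given below. It assumes the
Format 2 layout from Section 1.3.2 (8-bit opcode followed by two 4-bit
register numbers) and uses a separate register table rather than a preloaded
SYMTAB. The function name and the small opcode excerpt are our own.

```python
REGISTERS = {"A": 0, "X": 1, "L": 2, "B": 3, "S": 4, "T": 5, "F": 6}
FORMAT2_OPCODES = {"CLEAR": 0xB4, "COMPR": 0xA0, "TIXR": 0xB8}   # excerpt

def assemble_format2(mnemonic, operand):
    """Assemble a register-to-register instruction into 4 hexadecimal digits."""
    names = [name.strip() for name in operand.split(",")]
    r1 = REGISTERS[names[0]]
    r2 = REGISTERS[names[1]] if len(names) > 1 else 0   # unused field is 0
    return f"{FORMAT2_OPCODES[mnemonic]:02X}{r1:X}{r2:X}"

print(assemble_format2("CLEAR", "X"))     # line 125 of Fig. 2.6
print(assemble_format2("COMPR", "A,S"))   # line 150 of Fig. 2.6
print(assemble_format2("TIXR", "T"))      # line 165 of Fig. 2.6
```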
Most of the register-to-memory instructions are assembled using either
program-counter relative or base relative addressing. The assembler must, in
either case, calculate a displacement to be assembled as part of the object
instruction. This is computed so that the correct target address results when
the displacement is added to the contents of the program counter (PC) or the
base register (B). Of course, the resulting displacement must be small enough
to fit in the 12-bit field in the instruction. This means that the
displacement must be between 0 and 4095 (for base relative mode) or between
-2048 and +2047 (for program-counter relative mode).
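
The two range checks can be written down directly. The following Python
sketch uses hypothetical function names and returns None when the
displacement does not fit in the 12-bit field for the mode in question.

```python
def pc_relative_disp(target, pc):
    """Signed displacement for PC-relative mode, or None if out of range."""
    disp = target - pc
    return disp if -2048 <= disp <= 2047 else None

def base_relative_disp(target, base):
    """Unsigned displacement for base relative mode, or None if out of range."""
    disp = target - base
    return disp if 0 <= disp <= 4095 else None

# RETADR (0030) referenced from line 10, where PC holds 0003
print(pc_relative_disp(0x0030, 0x0003))
# RDREC (1036) referenced from line 15, where PC would hold 0009 in Format 3
print(pc_relative_disp(0x1036, 0x0009))
# BUFFER (0036) with register B holding the address of LENGTH (0033)
print(base_relative_disp(0x0036, 0x0033))
```

The second call returns None: the reference to RDREC is too far away for
program-counter relative addressing.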
If neither program-counter relative nor base relative addressing can be
used (because the displacements are too large), then the 4-byte extended
instruction format (Format 4) must be used. This 4-byte format contains a
20-bit address field, which is large enough to contain the full memory
address. In this case, there is no displacement to be calculated. For example,
in the instruction

15 0006 CLOOP +JSUB RDREC 4B101036



Line   Loc    Source statement              Object code

5      0000   COPY    START   0
10     0000   FIRST   STL     RETADR        17202D
12     0003           LDB     #LENGTH       69202D
13                    BASE    LENGTH
15     0006   CLOOP   +JSUB   RDREC         4B101036
20     000A           LDA     LENGTH        032026
25     000D           COMP    #0            290000
30     0010           JEQ     ENDFIL        332007
35     0013           +JSUB   WRREC         4B10105D
40     0017           J       CLOOP         3F2FEC
45     001A   ENDFIL  LDA     EOF           032010
50     001D           STA     BUFFER        0F2016
55     0020           LDA     #3            010003
60     0023           STA     LENGTH        0F200D
65     0026           +JSUB   WRREC         4B10105D
70     002A           J       @RETADR       3E2003
80     002D   EOF     BYTE    C'EOF'        454F46
95     0030   RETADR  RESW    1
100    0033   LENGTH  RESW    1
105    0036   BUFFER  RESB    4096
110           .
115           .       SUBROUTINE TO READ RECORD INTO BUFFER
120           .
125    1036   RDREC   CLEAR   X             B410
130    1038           CLEAR   A             B400
132    103A           CLEAR   S             B440
133    103C           +LDT    #4096         75101000
135    1040   RLOOP   TD      INPUT         E32019
140    1043           JEQ     RLOOP         332FFA
145    1046           RD      INPUT         DB2013
150    1049           COMPR   A,S           A004
155    104B           JEQ     EXIT          332008
160    104E           STCH    BUFFER,X      57C003
165    1051           TIXR    T             B850
170    1053           JLT     RLOOP         3B2FEA
175    1056   EXIT    STX     LENGTH        134000
180    1059           RSUB                  4F0000
185    105C   INPUT   BYTE    X'F1'         F1
195           .
200           .       SUBROUTINE TO WRITE RECORD FROM BUFFER
205           .
210    105D   WRREC   CLEAR   X             B410
212    105F           LDT     LENGTH        774000
215    1062   WLOOP   TD      OUTPUT        E32011
220    1065           JEQ     WLOOP         332FFA
225    1068           LDCH    BUFFER,X      53C003
230    106B           WD      OUTPUT        DF2008
235    106E           TIXR    T             B850
240    1070           JLT     WLOOP         3B2FEF
245    1073           RSUB                  4F0000
250    1076   OUTPUT  BYTE    X'05'         05
255                   END     FIRST

Figure 2.6 Program from Fig. 2.5 with object code.



the operand address is 1036. This full address is stored in the instruction,
with bit e set to 1 to indicate extended instruction format.
Note that the programmer must specify the extended format by using the
prefix + (as on line 15). If extended format is not specified, our assembler
first attempts to translate the instruction using program-counter relative
addressing. If this is not possible (because the required displacement is out
of range), the assembler then attempts to use base relative addressing. If
neither form of relative addressing is applicable and extended format is not
specified, then the instruction cannot be properly assembled. In this case,
the assembler must generate an error message.
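
The order of attempts just described (extended format if requested, otherwise
program-counter relative, then base relative, then an error) can be sketched
in Python as follows. The function name and parameter list are our own
invention. The bit layout assumed is that of Section 1.3.2: a 6-bit opcode
followed by the flags n, i, x, b, p, e and a 12-bit displacement or 20-bit
address. Immediate operands that are plain numbers (such as #3) are not
handled by this sketch.

```python
def assemble_format34(opcode, target, pc, base=None, *,
                      extended=False, n=1, i=1, x=0):
    """Return the object code (hex text) for one Format 3 or Format 4 instruction."""
    first_byte = (opcode & 0xFC) | (n << 1) | i
    if extended:                                    # Format 4: e = 1, full address
        flags = (x << 3) | 0b0001
        return f"{first_byte:02X}{flags:X}{target & 0xFFFFF:05X}"
    disp = target - pc                              # first choice: PC-relative
    if -2048 <= disp <= 2047:
        flags = (x << 3) | 0b0010                   # p = 1
        return f"{first_byte:02X}{flags:X}{disp & 0xFFF:03X}"
    if base is not None and 0 <= target - base <= 4095:
        flags = (x << 3) | 0b0100                   # b = 1
        return f"{first_byte:02X}{flags:X}{target - base:03X}"
    raise ValueError("displacement out of range; extended format (+) is required")

print(assemble_format34(0x48, 0x1036, pc=0x000A, extended=True))   # line 15
print(assemble_format34(0x14, 0x0030, pc=0x0003))                  # line 10
print(assemble_format34(0x54, 0x0036, pc=0x1051, base=0x0033, x=1))  # line 160
```

Calling the function for JSUB RDREC without extended=True raises the error,
which corresponds to the message the assembler must produce when the
programmer omits the + prefix on line 15.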
We now examine the details of the displacement calculation for
program-counter relative and base relative addressing modes. The computation
that the assembler needs to perform is essentially the target address
calculation in reverse. You may want to review this from Section 1.3.2.
The instruction

10 0000 FIRST STL RETADR 17202D

is a typical example of program-counter relative assembly. During execution
of instructions on SIC (as in most computers), the program counter is
advanced after each instruction is fetched and before it is executed. Thus
during the execution of the STL instruction, PC will contain the address of
the next instruction (that is, 0003). From the Loc column of the listing, we
see that RETADR (line 95) is assigned the address 0030. (The assembler would,
of course, get this address from SYMTAB.) The displacement we need in the
instruction is 30 - 3 = 2D. At execution time, the target address calculation
performed will be (PC) + disp, resulting in the correct address (0030). Note
that bit p is set to 1 to indicate program-counter relative addressing, making
the last 2 bytes of the instruction 202D. Also note that bits n and i are both
set to 1, indicating neither indirect nor immediate addressing; this makes the
first byte 17 instead of 14. (See Fig. 1.1 in Section 1.3.2 for a review of
the location and setting of the addressing-mode bit flags.)
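
Since the assembler's computation is the target address calculation in
reverse, one way to check an assembled instruction is to decode it and redo
the forward calculation. The Python sketch below does this for Format 3 only.
The function name and the returned dictionary are our own choices, and the
register contents are supplied by the caller.

```python
def decode_format3(hex_code, pc, base=0, x_reg=0):
    """Split a Format 3 instruction into its fields and compute the target address."""
    word = int(hex_code, 16)
    fields = {
        "opcode": (word >> 16) & 0xFC,
        "n": (word >> 17) & 1, "i": (word >> 16) & 1,
        "x": (word >> 15) & 1, "b": (word >> 14) & 1,
        "p": (word >> 13) & 1, "e": (word >> 12) & 1,
    }
    disp = word & 0xFFF
    if fields["p"] and disp >= 0x800:
        disp -= 0x1000                  # PC-relative displacements are signed
    target = disp
    if fields["p"]:
        target += pc
    if fields["b"]:
        target += base
    if fields["x"]:
        target += x_reg
    fields["disp"], fields["target"] = disp, target
    return fields

result = decode_format3("17202D", pc=0x0003)
print(f"opcode={result['opcode']:02X} n={result['n']} i={result['i']} "
      f"p={result['p']} disp={result['disp']:X} target={result['target']:04X}")
```

For 17202D with PC = 0003, the decoded opcode is 14 (STL), n, i, and p are 1,
and the target address comes out as 0030, the address of RETADR.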
Another example of program-counter relative assembly is the instruction

40 0017 J CLOOP 3F2FEC

Here the operand address is 0006. During instruction execution, the program
counter will contain the address 001A. Thus the displacement required is
6 - 1A = -14. This is represented (using 2's complement for negative numbers)
in a 12-bit field as FEC, which is the displacement assembled into the object
code.
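
The 2's complement step can be checked with a short Python sketch. The two
helper names are hypothetical; the first packs a signed value into a 12-bit
field, and the second recovers the signed value from the field.

```python
def to_field(value, bits=12):
    """Represent a signed value in a fixed-width 2's complement field."""
    return value & ((1 << bits) - 1)

def from_field(field, bits=12):
    """Recover the signed value stored in a 2's complement field."""
    sign = 1 << (bits - 1)
    return (field ^ sign) - sign

disp = 0x0006 - 0x001A                  # operand address minus (PC)
print(f"{to_field(disp):03X}")          # field assembled into the object code
print(f"{from_field(0xFEC):X}")         # signed value seen at execution time
```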
The displacement calculation process for base relative addressing is much
the same as for program-counter relative addressing. The main difference is
that the assembler knows what the contents of the program counter will be at
execution time. The base register, on the other hand, is under control of the
programmer. Therefore, the programmer must tell the assembler what the
base register will contain during execution of the program so that the
assembler can compute displacements. This is done in our example with the
assembler directive BASE. The statement BASE LENGTH (line 13) informs the
assembler that the base register will contain the address of LENGTH. The
preceding instruction (LDB #LENGTH) loads this value into the register during
program execution. The assembler assumes for addressing purposes that
register B contains this address until it encounters another BASE statement.
Later in the program, it may be desirable to use register B for another purpose
(for example, as temporary storage for a data value). In such a case, the
programmer must use another assembler directive (perhaps NOBASE) to inform the
assembler that the contents of the base register can no longer be relied upon
for addressing.
It is important to understand that BASE and NOBASE are assembler directives,
and produce no executable code. The programmer must provide instructions
that load the proper value into the base register during execution. If this
is not done properly, the target address calculation will not produce the correct
operand address.
The instruction

160 104E STCH BUFFER,X 57C003

is a typical example of base relative assembly. According to the BASE
statement, register B will contain 0033 (the address of LENGTH) during execution.
The address of BUFFER is 0036. Thus the displacement in the instruction must
be 36 - 33 = 3. Notice that bits x and b are set to 1 in the assembled instruction
to indicate indexed and base relative addressing. Another example is the
instruction STX LENGTH on line 175. Here the displacement calculated is 0.
Notice the difference between the assembly of the instructions on lines 20
and 175. On line 20, LDA LENGTH is assembled with program-counter relative
addressing. On line 175, STX LENGTH uses base relative addressing, as
noted previously. (If you calculate the program-counter relative displacement
that would be required for the statement on line 175, you will see that it is too
large to fit into the 12-bit displacement field.) The statement on line 20 could
also have used base relative mode. In our assembler, however, we have
arbitrarily chosen to attempt program-counter relative assembly first.
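The order of attempts described above (program-counter relative first, then base relative) might be sketched as follows. The function name and its return convention are hypothetical, and the 0 to 4095 range used for base relative displacements is taken from the SIC/XE definition of that mode as an unsigned 12-bit field, not from this section.

```python
from typing import Optional, Tuple


def choose_relative_mode(target: int, next_pc: int,
                         base: Optional[int]) -> Tuple[str, int]:
    """Pick an addressing mode for a Format 3 instruction.

    base is the value announced by the BASE directive, or None if no
    BASE statement is in effect (for example, after NOBASE).
    """
    pc_disp = target - next_pc
    if -2048 <= pc_disp <= 2047:          # signed 12-bit field
        return "pc", pc_disp & 0xFFF
    if base is not None:
        base_disp = target - base
        if 0 <= base_disp <= 4095:        # unsigned 12-bit field
            return "base", base_disp
    raise ValueError("no relative mode applies; extended format is needed")


# Line 20 (LDA LENGTH at 000A) and line 175 (STX LENGTH at 1056), B = 0033.
print(choose_relative_mode(0x0033, 0x000D, 0x0033))
print(choose_relative_mode(0x0033, 0x1059, 0x0033))
```

For line 20 the sketch selects program-counter relative mode with displacement 026; for line 175 that displacement is out of range, so it falls back to base relative mode with displacement 0.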
The assembly of an instruction that specifies immediate addressing is
simpler because no memory reference is involved. All that is necessary is to
convert the immediate operand to its internal representation and insert it into
the instruction. The instruction

55 0020 LDA #3 010003

is a typical example of this, with the operand stored in the instruction as 003,
and bit i set to 1 to indicate immediate addressing. Another example can be
found in the instruction

133 103C +LDT #4096 75101000

In this case the operand (4096) is too large to fit into the 12-bit displacement
field, so the extended instruction format is called for. (If the operand were too
large even for this 20-bit address field, immediate addressing could not be
used.)
A different way of using immediate addressing is shown in the instruction

12 0003 LDB #LENGTH 69202D

In this statement the immediate operand is the symbol LENGTH. Since the
value of this symbol is the address assigned to it, this immediate instruction has
the effect of loading register B with the address of LENGTH. Note here that
we have combined program-counter relative addressing with immediate
addressing. Although this may appear unusual, the interpretation is consistent
with our previous uses of immediate operands. In general, the target address
calculation is performed; then, if immediate mode is specified, the target
address (not the contents stored at that address) becomes the operand. (In the
LDA statement on line 55, for example, bits x, b, and p are all 0. Thus the target
address is simply the displacement 003.)
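The object codes quoted in this section can be reproduced by packing the opcode, the flag bits n, i, x, b, p, e, and the displacement or address into one value. The following is a rough sketch only; the function names are assumptions, and the opcode values used in the examples are read off the listed object code (for instance, the text gives 14 for STL).

```python
def encode_format3(opcode: int, n: int, i: int, x: int, b: int, p: int,
                   disp: int) -> str:
    """Assemble a 3-byte instruction: 6-bit opcode, n i x b p e, 12-bit disp."""
    first_byte = (opcode & 0xFC) | (n << 1) | i
    flags = (x << 3) | (b << 2) | (p << 1)      # e = 0 for Format 3
    word = (first_byte << 16) | (flags << 12) | (disp & 0xFFF)
    return f"{word:06X}"


def encode_format4(opcode: int, n: int, i: int, x: int, address: int) -> str:
    """Assemble a 4-byte extended instruction with a 20-bit address field."""
    first_byte = (opcode & 0xFC) | (n << 1) | i
    flags = (x << 3) | 1                        # b = p = 0, e = 1
    word = (first_byte << 24) | (flags << 20) | (address & 0xFFFFF)
    return f"{word:08X}"


print(encode_format3(0x14, 1, 1, 0, 0, 1, 0x02D))  # line 10,  STL RETADR
print(encode_format3(0x00, 0, 1, 0, 0, 0, 0x003))  # line 55,  LDA #3
print(encode_format3(0x68, 0, 1, 0, 0, 1, 0x02D))  # line 12,  LDB #LENGTH
print(encode_format4(0x74, 0, 1, 0, 0x01000))      # line 133, +LDT #4096
```

Comparing the second and third calls shows the point made above: both are immediate (i = 1, n = 0), but LDB #LENGTH also sets p, so the operand is the computed target address rather than the bare displacement.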
The assembly of instructions that specify indirect addressing presents
nothing really new. The displacement is computed in the usual way to
produce the target address desired. Then bit n is set to indicate that the contents
stored at this location represent the address of the operand, not the operand
itself. Line 70 shows a statement that combines program-counter relative and
indirect addressing in this way.

2.2.2 Program Relocation

As we mentioned before, it is often desirable to have more than one program
at a time sharing the memory and other resources of the machine. If we knew
in advance exactly which programs were to be executed concurrently in this
way, we could assign addresses when the programs were assembled so that
they would fit together without overlap or wasted space. Most of the time,
however, it is not practical to plan program execution this closely. (We usually
do not know exactly when jobs will be submitted, exactly how long they will
run, etc.) Because of this, it is desirable to be able to load a program into
memory wherever there is room for it. In such a situation the actual starting
address of the program is not known until load time.
The program we considered in Section 2.1 is an example of an absolute
program (or absolute assembly). This program must be loaded at address 1000
(the address that was specified at assembly time) in order to execute properly.
To see this, consider the instruction

55 101B LDA THREE 00102D

from Fig. 2.2. In the object program (Fig. 2.3), this statement is translated as
00102D, specifying that register A is to be loaded from memory address 102D.
Suppose we attempt to load and execute the program at address 2000 instead
of address 1000. If we do this, address 102D will not contain the value that we
expect—in fact, it will probably be part of some other user’s program.
Obviously we need to make some change in the address portion of this
instruction so we can load and execute our program at address 2000. On the
other hand, there are parts of the program (such as the constant 3 generated
from line 85) that should remain the same regardless of where the program is
loaded. Looking at the object code alone, it is in general not possible to tell
which values represent addresses and which represent constant data items.
Since the assembler does not know the actual location where the program
will be loaded, it cannot make the necessary changes in the addresses used by
the program. However, the assembler can identify for the loader those parts of
the object program that need modification. An object program that contains
the information necessary to perform this kind of modification is called a
relocatable program.
To look at this in more detail, consider the program from Figs. 2.5 and 2.6.
In the preceding section, we assembled this program using a starting address
of 0000. Figure 2.7(a) shows this program loaded beginning at address 0000.
The JSUB instruction from line 15 is loaded at address 0006. The address field
of this instruction contains 01036, which is the address of the instruction
labeled RDREC. (These addresses are, of course, the same as those assigned by
the assembler.)
Now suppose that we want to load this program beginning at address
5000, as shown in Fig. 2.7(b). The address of the instruction labeled RDREC is
then 6036. Thus the JSUB instruction must be modified as shown to contain
this new address. Likewise, if we loaded the program beginning at address
7420 (Fig. 2.7c), the JSUB instruction would need to be changed to 4B108456 to
correspond to the new address of RDREC.

                    (a)         (b)         (c)
Program start       0000        5000        7420
+JSUB RDREC at      0006        5006        7426
  object code       4B101036    4B106036    4B108456
RDREC (B410) at     1036        6036        8456

Figure 2.7 Examples of program relocation.

Note that no matter where the program is loaded, RDREC is always 1036
bytes past the starting address of the program. This means that we can solve
the relocation problem in the following way:

1. When the assembler generates the object code for the JSUB instruction
we are considering, it will insert the address of RDREC relative to
the start of the program. (This is the reason we initialized the location
counter to 0 for the assembly.)
2. The assembler will also produce a command for the loader, instructing
it to add the beginning address of the program to the address
field in the JSUB instruction at load time.

The command for the loader, of course, must also be a part of the object
program. We can accomplish this with a Modification record having the following
format:

Modification record:

Col. 1     M
Col. 2-7   Starting location of the address field to be modified, relative
           to the beginning of the program (hexadecimal)
Col. 8-9   Length of the address field to be modified, in half-bytes
           (hexadecimal)

The length is stored in half-bytes (rather than bytes) because the address
field to be modified may not occupy an integral number of bytes. (For example,
the address field in the JSUB instruction we considered above occupies 20
bits, which is 5 half-bytes.) The starting location is the location of the byte
containing the leftmost bits of the address field to be modified. If this field
occupies an odd number of half-bytes, it is assumed to begin in the middle of the
first byte at the starting location. These conventions are, of course, closely
related to the architecture of SIC/XE. For other types of machines, the half-byte
approach might not be appropriate (see Exercise 2.2.9).
For the JSUB instruction we are using as an example, the Modification
record would be

M00000705

This record specifies that the beginning address of the program is to be added
to a field that begins at address 000007 (relative to the start of the program)
and is 5 half-bytes in length. Thus in the assembled instruction 4B101036, the
first 12 bits (4B1) will remain unchanged. The program load address will be
added to the last 20 bits (01036) to produce the correct operand address. (You
should check for yourself that this gives the results shown in Fig. 2.7.)
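The check suggested above can also be done mechanically. The sketch below shows how a loader might apply one Modification record to a program image held in a Python `bytearray`; the function names are hypothetical, and the half-byte handling follows the convention just described (an odd-length field begins in the middle of the first byte).

```python
def apply_modification(image: bytearray, record: str, load_address: int) -> None:
    """Add load_address to the address field named by a Modification record."""
    start = int(record[1:7], 16)        # cols 2-7: start of field (relative)
    half_bytes = int(record[7:9], 16)   # cols 8-9: field length in half-bytes
    nbytes = (half_bytes + 1) // 2      # bytes that contain the field
    mask = (1 << (4 * half_bytes)) - 1  # selects the rightmost half-bytes
    value = int.from_bytes(image[start:start + nbytes], "big")
    field = ((value & mask) + load_address) & mask
    value = (value & ~mask) | field     # leading half-byte stays unchanged
    image[start:start + nbytes] = value.to_bytes(nbytes, "big")


def relocated_jsub(load_address: int) -> str:
    """Return the +JSUB RDREC instruction at offset 0006 after relocation."""
    image = bytearray(6) + bytearray.fromhex("4B101036")
    apply_modification(image, "M00000705", load_address)
    return image[6:10].hex().upper()


print(relocated_jsub(0x5000))  # Fig. 2.7(b)
print(relocated_jsub(0x7420))  # Fig. 2.7(c)
```

The two calls produce 4B106036 and 4B108456, the values given for the relocated JSUB instruction.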
Exactly the same kind of relocation must be performed for the instructions
on lines 35 and 65 in Fig. 2.6. The rest of the instructions in the program,
however, need not be modified when the program is loaded. In some cases this is
because the instruction operand is not a memory address at all (e.g., CLEAR S
or LDA #3). In other cases no modification is needed because the operand is
specified using program-counter relative or base relative addressing. For
example, the instruction on line 10 (STL RETADR) is assembled using
program-counter relative addressing with displacement 02D. No matter where the
program is loaded in memory, the word labeled RETADR will always be 2D
bytes away from the STL instruction; thus no instruction modification is
needed. When the STL is executed, the program counter will contain the
(actual) address of the next instruction. The target address calculation process
will then produce the correct (actual) operand address corresponding to
RETADR.
Similarly the distance between LENGTH and BUFFER will always be
3 bytes. Thus the displacement in the base relative instruction on line 160 will
be correct without modification. (The contents of the base register will, of
course, depend upon where the program is loaded. However, this will be
taken care of automatically when the program-counter relative instruction
LDB #LENGTH is executed.)
By now it should be clear that the only parts of the program that require
modification at load time are those that specify direct (as opposed to relative)
addresses. For this SIC/XE program, the only such direct addresses are found
in extended format (4-byte) instructions. This is an advantage of relative
addressing: if we were to attempt to relocate the program from Fig. 2.1, we
would find that almost every instruction required modification.
Figure 2.8 shows the complete object program corresponding to the source
program of Fig. 2.5. Note that the Text records are exactly the same as those
that would be produced by an absolute assembler (with program starting
address of 0). However, the load addresses in the Text records are interpreted as
relative, rather than absolute, locations. (The same is, of course, true of the
addresses in the Modification and End records.) There is one Modification record
for each address field that needs to be changed when the program is relocated
(in this case, the three +JSUB instructions). You should verify these
Modification records yourself and make sure you understand the contents of each. In
Chapter 3 we consider in detail how the loader performs the required program
modification. It is important that you understand the concepts involved now,
however, because we build on these concepts in the next section.

HCOPY  000000001077
T0000001D17202D69202D4B1010360320262900003320074B10105D3F2FEC032010
T00001D130F20160100030F200D4B10105D3E2003454F46
T0010361DB410B400B44075101000E32019332FFADB2013A00433200857C003B850
T0010531D3B2FEA1340004F0000F1B410774000E32011332FFA53C003DF2008B850
T001070073B2FEF4F000005
M00000705
M00001405
M00002705
E000000

Figure 2.8 Object program corresponding to Fig. 2.6.
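The three Modification records in Fig. 2.8 follow directly from the locations of the +JSUB instructions. As a small sketch (the function name is an assumption), an assembler could emit one record for each extended format instruction whose address field holds a direct address. Note that +LDT #4096 on line 133 is also in extended format, but its operand is an immediate constant rather than an address, so it is not included.

```python
from typing import Iterable, List


def modification_records(direct_format4_locs: Iterable[int]) -> List[str]:
    """Build one Modification record per Format 4 direct-address instruction.

    The 20-bit address field starts in the second byte of the instruction
    (loc + 1) and is 5 half-bytes long.
    """
    return [f"M{loc + 1:06X}05" for loc in direct_format4_locs]


# +JSUB on lines 15, 35, and 65 (Loc 0006, 0013, 0026).
for record in modification_records([0x0006, 0x0013, 0x0026]):
    print(record)
```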



2.3 MACHINE-INDEPENDENT ASSEMBLER FEATURES

In this section, we discuss some common assembler features that are not
closely related to machine architecture. Of course, more advanced machines
tend to have more complex software; therefore the features we consider are
more likely to be found on larger and more complex machines. However, the
presence or absence of such capabilities is much more closely related to issues
such as programmer convenience and software environment than it is to
machine architecture.
In Section 2.3.1 we discuss the implementation of literals within an assem-
bler, including the required data structures and processing logic. Section 2.3.2
discusses two assembler directives (EQU and ORG) whose main function is
the definition of symbols. Section 2.3.3 briefly examines the use of expressions
in assembler language statements, and discusses the different types of expres-
sions and their evaluation and use.
In Sections 2.3.4 and 2.3.5 we introduce the important topics of program
blocks and control sections. We discuss the reasons for providing such capabil-
ities and illustrate some different uses with examples. We also introduce a set
of assembler directives for supporting these features and discuss their imple-
mentation.

2.3.1 Literals

It is often convenient for the programmer to be able to write the value of a
constant operand as a part of the instruction that uses it. This avoids having to
define the constant elsewhere in the program and make up a label for it. Such
an operand is called a literal because the value is stated "literally" in the
instruction. The use of literals is illustrated by the program in Fig. 2.9. The object
code generated for the statements of this program is shown in Fig. 2.10. (This
program is a modification of the one in Fig. 2.5; other changes are discussed
later in Section 2.3.)
In our assembler language notation, a literal is identified with the prefix =,
which is followed by a specification of the literal value, using the same
notation as in the BYTE statement. Thus the literal in the statement

45 001A ENDFIL LDA =C'EOF' 032010

specifies a 3-byte operand whose value is the character string EOF. Likewise
the statement

215 1062 WLOOP TD =X'05' E32011


Line  Source statement

5     COPY     START   0                COPY FILE FROM INPUT TO OUTPUT
10    FIRST    STL     RETADR           SAVE RETURN ADDRESS
13             LDB     #LENGTH          ESTABLISH BASE REGISTER
14             BASE    LENGTH
15    CLOOP   +JSUB    RDREC            READ INPUT RECORD
20             LDA     LENGTH           TEST FOR EOF (LENGTH = 0)
25             COMP    #0
30             JEQ     ENDFIL           EXIT IF EOF FOUND
35            +JSUB    WRREC            WRITE OUTPUT RECORD
40             J       CLOOP            LOOP
45    ENDFIL   LDA     =C'EOF'          INSERT END OF FILE MARKER
50             STA     BUFFER
55             LDA     #3               SET LENGTH = 3
60             STA     LENGTH
65            +JSUB    WRREC            WRITE EOF
70             J       @RETADR          RETURN TO CALLER
93             LTORG
95    RETADR   RESW    1
100   LENGTH   RESW    1                LENGTH OF RECORD
105   BUFFER   RESB    4096             4096-BYTE BUFFER AREA
106   BUFEND   EQU     *
107   MAXLEN   EQU     BUFEND-BUFFER    MAXIMUM RECORD LENGTH
110   .
115   .        SUBROUTINE TO READ RECORD INTO BUFFER
120   .
125   RDREC    CLEAR   X                CLEAR LOOP COUNTER
130            CLEAR   A                CLEAR A TO ZERO
132            CLEAR   S                CLEAR S TO ZERO
133           +LDT     #MAXLEN
135   RLOOP    TD      INPUT            TEST INPUT DEVICE
140            JEQ     RLOOP            LOOP UNTIL READY
145            RD      INPUT            READ CHARACTER INTO REGISTER A
150            COMPR   A,S              TEST FOR END OF RECORD (X'00')
155            JEQ     EXIT             EXIT LOOP IF EOR
160            STCH    BUFFER,X         STORE CHARACTER IN BUFFER
165            TIXR    T                LOOP UNLESS MAX LENGTH
170            JLT     RLOOP              HAS BEEN REACHED
175   EXIT     STX     LENGTH           SAVE RECORD LENGTH
180            RSUB                     RETURN TO CALLER
185   INPUT    BYTE    X'F1'            CODE FOR INPUT DEVICE
195   .
200   .        SUBROUTINE TO WRITE RECORD FROM BUFFER
205   .
210   WRREC    CLEAR   X                CLEAR LOOP COUNTER
212            LDT     LENGTH
215   WLOOP    TD      =X'05'           TEST OUTPUT DEVICE
220            JEQ     WLOOP            LOOP UNTIL READY
225            LDCH    BUFFER,X         GET CHARACTER FROM BUFFER
230            WD      =X'05'           WRITE CHARACTER
235            TIXR    T                LOOP UNTIL ALL CHARACTERS
240            JLT     WLOOP              HAVE BEEN WRITTEN
245            RSUB                     RETURN TO CALLER
255            END     FIRST

Figure 2.9 Program demonstrating additional assembler features.



Line  Loc   Source statement                   Object code

5     0000  COPY     START   0
10    0000  FIRST    STL     RETADR            17202D
13    0003           LDB     #LENGTH           69202D
14                   BASE    LENGTH
15    0006  CLOOP   +JSUB    RDREC             4B101036
20    000A           LDA     LENGTH            032026
25    000D           COMP    #0                290000
30    0010           JEQ     ENDFIL            332007
35    0013          +JSUB    WRREC             4B10105D
40    0017           J       CLOOP             3F2FEC
45    001A  ENDFIL   LDA     =C'EOF'           032010
50    001D           STA     BUFFER            0F2016
55    0020           LDA     #3                010003
60    0023           STA     LENGTH            0F200D
65    0026          +JSUB    WRREC             4B10105D
70    002A           J       @RETADR           3E2003
93                   LTORG
      002D  *        =C'EOF'                   454F46
95    0030  RETADR   RESW    1
100   0033  LENGTH   RESW    1
105   0036  BUFFER   RESB    4096
106   1036  BUFEND   EQU     *
107   1000  MAXLEN   EQU     BUFEND-BUFFER
110         .
115         .        SUBROUTINE TO READ RECORD INTO BUFFER
120         .
125   1036  RDREC    CLEAR   X                 B410
130   1038           CLEAR   A                 B400
132   103A           CLEAR   S                 B440
133   103C          +LDT     #MAXLEN           75101000
135   1040  RLOOP    TD      INPUT             E32019
140   1043           JEQ     RLOOP             332FFA
145   1046           RD      INPUT             DB2013
150   1049           COMPR   A,S               A004
155   104B           JEQ     EXIT              332008
160   104E           STCH    BUFFER,X          57C003
165   1051           TIXR    T                 B850
170   1053           JLT     RLOOP             3B2FEA
175   1056  EXIT     STX     LENGTH            134000
180   1059           RSUB                      4F0000
185   105C  INPUT    BYTE    X'F1'             F1
195         .
200         .        SUBROUTINE TO WRITE RECORD FROM BUFFER
205         .
210   105D  WRREC    CLEAR   X                 B410
212   105F           LDT     LENGTH            774000
215   1062  WLOOP    TD      =X'05'            E32011
220   1065           JEQ     WLOOP             332FFA
225   1068           LDCH    BUFFER,X          53C003
230   106B           WD      =X'05'            DF2008
235   106E           TIXR    T                 B850
240   1070           JLT     WLOOP             3B2FEF
245   1073           RSUB                      4F0000
255                  END     FIRST
      1076  *        =X'05'                    05

Figure 2.10 Program from Fig. 2.9 with object code.



specifies a 1-byte literal with the hexadecimal value 05. The notation used for
literals varies from assembler to assembler; however, most assemblers use
some symbol (as we have used =) to make literal identification easier.
It is important to understand the difference between a literal and an
immediate operand. With immediate addressing, the operand value is assembled as
part of the machine instruction. With a literal, the assembler generates the
specified value as a constant at some other memory location. The address of
this generated constant is used as the target address for the machine
instruction. The effect of using a literal is exactly the same as if the programmer had
defined the constant explicitly and used the label assigned to the constant as
the instruction operand. (In fact, the generated object code for lines 45 and 215
in Fig. 2.10 is identical to the object code for the corresponding lines in
Fig. 2.6.) You should compare the object instructions generated for lines 45 and
55 in Fig. 2.10 to make sure you understand how literals and immediate
operands are handled.
All of the literal operands used in a program are gathered together into
one or more literal pools. Normally literals are placed into a pool at the end of
the program. The assembly listing of a program containing literals usually
includes a listing of this literal pool, which shows the assigned addresses and
the generated data values. Such a literal pool listing is shown in Fig. 2.10
immediately following the END statement. In this case, the pool consists of the
single literal =X'05'.
In some cases, however, it is desirable to place literals into a pool at some
other location in the object program. To allow this, we introduce the assembler
directive LTORG (line 93 in Fig. 2.9). When the assembler encounters a LTORG
statement, it creates a literal pool that contains all of the literal operands used
since the previous LTORG (or the beginning of the program). This literal pool
is placed in the object program at the location where the LTORG directive was
encountered (see Fig. 2.10). Of course, literals placed in a pool by LTORG will
not be repeated in the pool at the end of the program.
If we had not used the LTORG statement on line 93, the literal =C'EOF'
would be placed in the pool at the end of the program. This literal pool would
begin at address 1073. This means that the literal operand would be placed too
far away from the instruction referencing it to allow program-counter relative
addressing. The problem, of course, is the large amount of storage reserved for
BUFFER. By placing the literal pool before this buffer, we avoid having to use
extended format instructions when referring to the literals. The need for an
assembler directive such as LTORG usually arises when it is desirable to keep
the literal operand close to the instruction that uses it.
Most assemblers recognize duplicate literals (that is, the same literal used
in more than one place in the program) and store only one copy of the
specified data value. For example, the literal =X'05' is used in our program on lines
215 and 230. However, only one data area with this value is generated. Both
instructions refer to the same address in the literal pool for their operand.
The easiest way to recognize duplicate literals is by comparison of the
character strings defining them (in this case, the string =X'05'). Sometimes a
slight additional saving is possible if we look at the generated data value
instead of the defining expression. For example, the literals =C'EOF' and
=X'454F46' would specify identical operand values. The assembler might
avoid storing both literals if it recognized this equivalence. However, the
benefits realized in this way are usually not great enough to justify the additional
complexity in the assembler.
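The equivalence of =C'EOF' and =X'454F46' is easy to see once each literal is converted to the bytes it generates. The helper below is a minimal sketch under two assumptions: the name `literal_value` is invented for illustration, and character literals are encoded as ASCII, which matches the object code 454F46 shown for EOF.

```python
def literal_value(literal: str) -> bytes:
    """Convert a literal such as =C'EOF' or =X'05' to its generated bytes."""
    if len(literal) < 4 or literal[0] != "=" or literal[2] != "'" \
            or not literal.endswith("'"):
        raise ValueError(f"malformed literal: {literal}")
    kind, body = literal[1].upper(), literal[3:-1]
    if kind == "C":
        return body.encode("ascii")   # one byte per character
    if kind == "X":
        return bytes.fromhex(body)    # two hexadecimal digits per byte
    raise ValueError(f"unknown literal type: {kind}")


# Different defining strings, identical operand values.
print(literal_value("=C'EOF'") == literal_value("=X'454F46'"))
```

An assembler that keyed its literal table on this byte value, rather than on the defining string, would store only one copy of the two literals.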
If we use the character string defining a literal to recognize duplicates, we
must be careful of literals whose value depends upon their location in the
program. Suppose, for example, that we allow literals that refer to the current
value of the location counter (often denoted by the symbol *). Such literals are
sometimes useful for loading base registers. For example, the statements

BASE *
LDB  =*

as the first lines of a program would load the beginning address of the pro-
gram into register B. This value would then be available for base relative ad-
dressing.
Such a notation can, however, cause a problem with the detection of
duplicate literals. If a literal =* appeared on line 13 of our example program, it
would specify an operand with value 0003. If the same literal appeared on line
55, it would specify an operand with value 0020. In such a case, the literal
operands have identical names; however, they have different values, and both
must appear in the literal pool. The same problem arises if a literal refers to
any other item whose value changes between one point in the program and
another.
Now we are ready to describe how the assembler handles literal operands.
The basic data structure needed is a literal table LITTAB. For each literal used,
this table contains the literal name, the operand value and length, and the
address assigned to the operand when it is placed in a literal pool. LITTAB is
often organized as a hash table, using the literal name or value as the key.
As each literal operand is recognized during Pass 1, the assembler searches
LITTAB for the specified literal name (or value). If the literal is already present
in the table, no action is needed; if it is not present, the literal is added to
LITTAB (leaving the address unassigned). When Pass 1 encounters a LTORG
statement or the end of the program, the assembler makes a scan of the literal
table. At this time each literal currently in the table is assigned an address
(unless such an address has already been filled in). As these addresses are
assigned, the location counter is updated to reflect the number of bytes occupied
by each literal.
During Pass 2, the operand address for use in generating object code is
obtained by searching LITTAB for each literal operand encountered. The data
values specified by the literals in each literal pool are inserted at the
appropriate places in the object program exactly as if these values had been generated
by BYTE or WORD statements. If a literal value represents an address in the
program (for example, a location counter value), the assembler must also
generate the appropriate Modification record.
To be sure you understand how LITTAB is created and used by the
assembler, you may want to apply the procedure we just described to the source
statements in Fig. 2.9. The object code and literal pools generated should be
the same as those in Fig. 2.10.
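The Pass 1 handling of LITTAB described above can be tried out with a short sketch. The class and method names here are hypothetical, a Python dictionary stands in for the hash table, and literals are keyed by name only (so location-dependent literals such as =* would need extra care).

```python
from typing import Dict, Optional


def literal_value(literal: str) -> bytes:
    """Bytes generated for a literal of the form =C'...' or =X'...'."""
    kind, body = literal[1].upper(), literal[3:-1]
    return body.encode("ascii") if kind == "C" else bytes.fromhex(body)


class LiteralTable:
    """A toy LITTAB: literal name -> operand value and assigned address."""

    def __init__(self) -> None:
        self.entries: Dict[str, dict] = {}

    def add(self, name: str) -> None:
        """Pass 1: enter a literal operand unless it is already present."""
        if name not in self.entries:
            self.entries[name] = {"value": literal_value(name), "address": None}

    def assign_addresses(self, locctr: int) -> int:
        """At LTORG or END: place unassigned literals, return new LOCCTR."""
        for entry in self.entries.values():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += len(entry["value"])
        return locctr


littab = LiteralTable()
littab.add("=C'EOF'")                              # line 45
locctr = littab.assign_addresses(0x002D)           # LTORG on line 93
littab.add("=X'05'")                               # line 215
littab.add("=X'05'")                               # line 230 (duplicate)
end = littab.assign_addresses(0x1076)              # END on line 255
for name, entry in littab.entries.items():
    print(f"{entry['address']:04X} {name} {entry['value'].hex().upper()}")
```

Applied to the literals of Fig. 2.9, the sketch places =C'EOF' at 002D and a single copy of =X'05' at 1076, which agrees with the literal pools in Fig. 2.10.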

2.3.2 Symbol-Defining Statements

Up to this point the only user-defined symbols we have seen in assembler
language programs have appeared as labels on instructions or data areas. The
value of such a label is the address assigned to the statement on which it
appears. Most assemblers provide an assembler directive that allows the
programmer to define symbols and specify their values. The assembler directive
generally used is EQU (for “equate”). The general form of such a statement is

symbol EQU value

This statement defines the given symbol (i.e., enters it into SYMTAB) and
assigns to it the value specified. The value may be given as a constant or as any
expression involving constants and previously defined symbols. We discuss
the formation and use of expressions in the next section.
One common use of EQU is to establish symbolic names that can be used
for improved readability in place of numeric values. For example, on line 133
of the program in Fig. 2.5 we used the statement

+LDT #4096

to load the value 4096 into register T. This value represents the
maximum-length record we could read with subroutine RDREC. The meaning is not,
however, as clear as it might be. If we include the statement

MAXLEN EQU 4096

in the program, we can write line 133 as

+LDT #MAXLEN

When the assembler encounters the EQU statement, it enters MAXLEN into
SYMTAB (with value 4096). During assembly of the LDT instruction, the
assembler searches SYMTAB for the symbol MAXLEN, using its value as the
operand in the instruction. The resulting object code is exactly the same as in
the original version of the instruction; however, the source statement is easier
to understand. It is also much easier to find and change the value of MAXLEN
if this becomes necessary, since we would not have to search through the source
code looking for places where #4096 is used.
Another common use of EQU is in defining mnemonic names for registers.
We have assumed that our assembler recognizes standard mnemonics for
registers (A, X, L, etc.). Suppose, however, that the assembler expected register
numbers instead of names in an instruction like RMO. This would require the
programmer to write (for example) RMO 0,1 instead of RMO A,X. In such a
case the programmer could include a sequence of EQU statements like

A EQU 0
X EQU 1
L EQU 2
...

These statements cause the symbols A, X, L, ... to be entered into SYMTAB with
their corresponding values 0, 1, 2, .... An instruction like RMO A,X would then
be allowed. The assembler would search SYMTAB, finding the values 0 and 1
for the symbols A and X, and assemble the instruction.
On a machine like SIC, there would be little point in doing this; it is just
as easy to have the standard register mnemonics built into the assembler.
Furthermore, the standard names (base, index, etc.) reflect the usage of the
registers. Consider, however, a machine that has general-purpose registers.
These registers are typically designated by 0, 1, 2, ... (or R0, R1, R2, ...). In a
particular program, however, some of these may be used as base registers, some
as index registers, some as accumulators, etc. Furthermore, this usage of
registers changes from one program to the next. By writing statements like

BASE  EQU R1
COUNT EQU R2
INDEX EQU R3

the programmer can establish and use names that reflect the logical function
of the registers in the program.
There is another common assembler directive that can be used to indirectly
assign values to symbols. This directive is usually called ORG (for “origin”).
Its form is

ORG value

where value is a constant or an expression involving constants and previously
defined symbols. When this statement is encountered during assembly of a
program, the assembler resets its location counter (LOCCTR) to the specified
value. Since the values of symbols used as labels are taken from LOCCTR, the
ORG statement will affect the values of all labels defined until the next ORG.
Of course the location counter is used to control assignment of storage in
the object program; in most cases, altering its value would result in an
incorrect assembly. Sometimes, however, ORG can be useful in label definition.
Suppose that we were defining a symbol table with the following structure:

[Figure: the table STAB (100 entries); each entry consists of the fields
SYMBOL, VALUE, and FLAGS.]

In this table, the SYMBOL field contains a 6-byte user-defined symbol; VALUE
is a one-word representation of the value assigned to the symbol; FLAGS is a
2-byte field that specifies symbol type and other information.
We could reserve space for this table with the statement

STAB RESB 1100

We might want to refer to entries in the table using indexed addressing
(placing in the index register the offset of the desired entry from the beginning of
the table). Of course, we want to be able to refer to the fields SYMBOL,
VALUE, and FLAGS individually, so we must also define these labels. One
way of doing this would be with EQU statements:

SYMBOL EQU STAB
VALUE  EQU STAB+6
FLAGS  EQU STAB+9

This would allow us to write, for example,

LDA VALUE,X

to fetch the VALUE field from the table entry indicated by the contents of
register X. However, this method of definition simply defines the labels; it does
not make the structure of the table as clear as it might be.
We can accomplish the same symbol definition using ORG in the following
way.

STAB     RESB   1100
         ORG    STAB
SYMBOL   RESB   6
VALUE    RESW   1
FLAGS    RESB   2
         ORG    STAB+1100

The first ORG resets the location counter to the value of STAB (i.e., the
beginning address of the table). The label on the following RESB statement
defines SYMBOL to have the current value in LOCCTR; this is the same address
assigned to STAB. LOCCTR is then advanced so the label on the RESW statement
assigns to VALUE the address (STAB+6), and so on. The result is a set of
labels with the same values as those defined with the EQU statements above.
This method of definition makes it clear, however, that each entry in STAB
consists of a 6-byte SYMBOL, followed by a one-word VALUE, followed by a
2-byte FLAGS.

The last ORG statement is very important. It sets LOCCTR back to its
previous value, the address of the next unassigned byte of memory after the
table STAB. This is necessary so that any labels on subsequent statements,
which do not represent part of STAB, are assigned the proper addresses. In
some assemblers the previous value of LOCCTR is automatically remembered,
so we can simply write

ORG

(with no value specified) to return to the normal use of LOCCTR.
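
The equivalence of the two definitions of STAB can be checked with a small
simulation. The following is a minimal sketch in Python, not part of any real
assembler: it assumes a Pass 1 that understands only RESB, RESW, EQU, and ORG,
and the label NEXT is a hypothetical symbol added to show that LOCCTR has been
restored after the table.

```python
# Minimal Pass 1 sketch: only RESB, RESW, EQU and ORG are handled.
WORD_SIZE = 3  # a SIC word is 3 bytes


def evaluate(operand, symtab):
    """Evaluate 'SYM', 'SYM+n' or a decimal constant using defined symbols only."""
    total = 0
    for term in operand.split("+"):
        term = term.strip()
        # An undefined symbol raises KeyError, mirroring the "previously
        # defined" restriction on EQU and ORG operands.
        total += int(term) if term.isdigit() else symtab[term]
    return total


def assign_labels(statements, start=0):
    """statements: list of (label, opcode, operand); returns the symbol table."""
    symtab, locctr = {}, start
    for label, opcode, operand in statements:
        if opcode == "ORG":
            locctr = evaluate(operand, symtab)   # reset the location counter
            continue
        if opcode == "EQU":
            symtab[label] = evaluate(operand, symtab)
            continue
        if label:
            symtab[label] = locctr               # label takes the current LOCCTR
        locctr += int(operand) * (WORD_SIZE if opcode == "RESW" else 1)
    return symtab


with_equ = assign_labels([
    ("STAB", "RESB", "1100"),
    ("SYMBOL", "EQU", "STAB"),
    ("VALUE", "EQU", "STAB+6"),
    ("FLAGS", "EQU", "STAB+9"),
    ("NEXT", "RESB", "1"),        # hypothetical label after the table
])
with_org = assign_labels([
    ("STAB", "RESB", "1100"),
    (None, "ORG", "STAB"),
    ("SYMBOL", "RESB", "6"),
    ("VALUE", "RESW", "1"),
    ("FLAGS", "RESB", "2"),
    (None, "ORG", "STAB+1100"),
    ("NEXT", "RESB", "1"),
])
print(with_equ == with_org)
```

Removing the final ORG from the second list would leave NEXT at relative
address 11 instead of 1100, which is the error the last ORG statement
prevents.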


The descriptions of the EQU and ORG statements contain restrictions that
are common to all symbol-defining assembler directives. In the case of EQU,
all symbols used on the right-hand side of the statement (that is, all terms
used to specify the value of the new symbol) must have been defined
previously in the program. Thus, the sequence

ALPHA    RESW   1
BETA     EQU    ALPHA

would be allowed, whereas the sequence

BETA     EQU    ALPHA
ALPHA    RESW   1

would not. The reason for this is the symbol definition process. In the second
example above, BETA cannot be assigned a value when it is encountered during
Pass 1 of the assembly (because ALPHA does not yet have a value).
However, our two-pass assembler design requires that all symbols be defined
during Pass 1.

A similar restriction applies to ORG: all symbols used to specify the new
location counter value must have been previously defined. Thus, for example,
the sequence

         ORG    ALPHA
BYTE1    RESB   1
BYTE2    RESB   1
BYTE3    RESB   1
         ORG
ALPHA    RESB   1

could not be processed. In this case, the assembler would not know (during
Pass 1) what value to assign to the location counter in response to the first
ORG statement. As a result, the symbols BYTE1, BYTE2, and BYTE3 could not
be assigned addresses during Pass 1.

It may appear that this restriction is a result of the particular way in which
we defined the two passes of our assembler. In fact, it is a more general
product of the forward-reference problem. You can easily see, for example,
that the sequence of statements

ALPHA    EQU    BETA
BETA     EQU    DELTA
DELTA    RESW   1

cannot be resolved by an ordinary two-pass assembler regardless of how the
work is divided between the passes. In Section 2.4.2, we briefly consider ways
of handling such sequences in a more complex assembler structure.
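
The forward-reference problem can be illustrated with a short sketch. The
Python below is only an illustration under simplifying assumptions: each
definition is either an integer value or the name of one other symbol, and
DELTA is given the arbitrary address 0 to stand in for a label whose value
comes from LOCCTR.

```python
def define_in_order(definitions):
    """Process (symbol, operand) pairs in source order, as Pass 1 would.

    An operand is either an integer value or the name of another symbol.
    Returns (symtab, unresolved), where unresolved lists the symbols whose
    operand was not yet defined when the statement was encountered.
    """
    symtab, unresolved = {}, []
    for symbol, operand in definitions:
        if isinstance(operand, int):
            symtab[symbol] = operand
        elif operand in symtab:
            symtab[symbol] = symtab[operand]
        else:
            unresolved.append(symbol)      # forward reference: no value yet
    return symtab, unresolved


# Source order from the text: ALPHA and BETA both refer forward.
forward = [("ALPHA", "BETA"), ("BETA", "DELTA"), ("DELTA", 0)]
# The same statements with every definition preceding its use.
backward = list(reversed(forward))

print(define_in_order(forward))
print(define_in_order(backward))
```

In source order both ALPHA and BETA remain unresolved after one scan, and
ALPHA would still be unresolved after a second scan that only looked one
level deep; with the statements reversed, a single scan defines all three.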

2.3.3 Expressions

Our previous examples of assembler language statements have used single
terms (labels, literals, etc.) as instruction operands. Most assemblers allow
the use of expressions wherever such a single operand is permitted. Each such
expression must, of course, be evaluated by the assembler to produce a single
operand address or value.

Assemblers generally allow arithmetic expressions formed according to
the normal rules using the operators +, -, *, and /. Division is usually
defined to produce an integer result. Individual terms in the expression may
be constants, user-defined symbols, or special terms. The most common such
special term is the current value of the location counter (often designated
by *). This term represents the value of the next unassigned memory location.
Thus in Fig. 2.9 the statement

106      BUFEND   EQU    *

gives BUFEND a value that is the address of the next byte after the buffer
area.

In Section 2.2 we discussed the problem of program relocation. We saw
that some values in the object program are relative to the beginning of the
program, while others are absolute (independent of program location).
Similarly, the values of terms and expressions are either relative or
absolute. A constant is, of course, an absolute term. Labels on instructions
and data areas, and references to the location counter value, are relative
terms. A symbol whose value is given by EQU (or some similar assembler
directive) may be either an absolute term or a relative term depending upon
the expression used to define its value.
Expressions are classified as either absolute expressions or relative
expressions depending upon the type of value they produce. An expression that
contains only absolute terms is, of course, an absolute expression. However,
absolute expressions may also contain relative terms provided the relative
terms occur in pairs and the terms in each such pair have opposite signs. It
is not necessary that the paired terms be adjacent to each other in the
expression; however, all relative terms must be capable of being paired in
this way. None of the relative terms may enter into a multiplication or
division operation.

A relative expression is one in which all of the relative terms except one
can be paired as described above; the remaining unpaired relative term must
have a positive sign. As before, no relative term may enter into a
multiplication or division operation. Expressions that do not meet the
conditions given for either absolute or relative expressions should be
flagged by the assembler as errors.

Although the rules given above may seem arbitrary, they are actually quite
reasonable. The expressions that are legal under these definitions include
exactly those expressions whose value remains meaningful when the program is
relocated. A relative term or expression represents some value that may be
written as (S + r), where S is the starting address of the program and r is
the value of the term or expression relative to the starting address. Thus a
relative term usually represents some location within the program. When
relative terms are paired with opposite signs, the dependency on the program
starting address is canceled out; the result is an absolute value. Consider,
for example, the program of Fig. 2.9. In the statement

107      MAXLEN   EQU    BUFEND-BUFFER

both BUFEND and BUFFER are relative terms, each representing an address
within the program. However, the expression represents an absolute value: the
difference between the two addresses, which is the length of the buffer area in
bytes. Notice that the assembler listing in Fig. 2.10 shows the value
calculated for this expression (hexadecimal 1000) in the Loc column. This
value does not represent an address, as do most of the other entries in that
column. However, it does show the value that is associated with the symbol
that appears in the source statement (MAXLEN).
Expressions such as BUFEND + BUFFER, 100 - BUFFER, or 3 * BUFFER
represent neither absolute values nor locations within the program. The values
of these expressions depend upon the program starting address in a way that
is unrelated to anything within the program itself. Because such expressions
are very unlikely to be of any use, they are considered errors.

To determine the type of an expression, we must keep track of the types of
all symbols defined in the program. For this purpose we need a flag in the
symbol table to indicate type of value (absolute or relative) in addition to
the value itself. Thus for the program of Fig. 2.10, some of the symbol table
entries might be

Symbol    Type    Value

RETADR    R       0030
BUFFER    R       0036
BUFEND    R       1036
MAXLEN    A       1000

With this information the assembler can easily determine the type of each
expression used as an operand and generate Modification records in the object
program for relative values.

In Section 2.3.5 we consider programs that consist of several parts that can
be relocated independently of each other. As we discuss in the later section,
our rules for determining the type of an expression must be modified in such
instances.
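
The pairing rules for a single program unit can be sketched as a simple count
of signed relative terms. The Python below is an illustrative sketch, not the
assembler's actual algorithm: it assumes expressions are sums and differences
only (no multiplication or division), represents each term as a (sign, name)
pair, and uses the symbol table entries listed above.

```python
# Symbol table entries from the text: symbol -> (type, value).
SYMTAB = {
    "RETADR": ("R", 0x0030),
    "BUFFER": ("R", 0x0036),
    "BUFEND": ("R", 0x1036),
    "MAXLEN": ("A", 0x1000),
}


def classify(terms, symtab):
    """Classify a sum of signed terms as absolute ('A') or relative ('R').

    terms: list of (sign, name) with sign +1 or -1; name is a symbol or an
    integer constant. Returns (type, value); raises ValueError if the
    expression is neither absolute nor relative.
    """
    net_relative = 0     # (+ relative terms) minus (- relative terms)
    value = 0
    for sign, name in terms:
        kind, val = ("A", name) if isinstance(name, int) else symtab[name]
        if kind == "R":
            net_relative += sign
        value += sign * val
    if net_relative == 0:        # every relative term can be paired
        return "A", value
    if net_relative == 1:        # exactly one unpaired, positive relative term
        return "R", value
    raise ValueError("expression is neither absolute nor relative")


print(classify([(+1, "BUFEND"), (-1, "BUFFER")], SYMTAB))
print(classify([(+1, "BUFFER"), (+1, 3)], SYMTAB))
```

A net count of 0 means the relative terms can all be paired with opposite
signs, and a net count of +1 means exactly one positive relative term is left
over; any other count, as in BUFEND + BUFFER or 100 - BUFFER, is an error.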

2.3.4 Program Blocks

In all of the examples we have seen so far the program being assembled was
treated as a unit. The source programs logically contained subroutines, data
areas, etc. However, they were handled by the assembler as one entity,
resulting in a single block of object code. Within this object program the
generated machine instructions and data appeared in the same order as they
were written in the source program.

Many assemblers provide features that allow more flexible handling of the
source and object programs. Some features allow the generated machine
instructions and data to appear in the object program in a different order
from the corresponding source statements. Other features result in the
creation of several independent parts of the object program. These parts
maintain their identity and are handled separately by the loader. We use the
term program blocks to refer to segments of code that are rearranged within a
single object program unit, and control sections to refer to segments that are
translated into independent object program units. (This terminology is,
unfortunately, far from uniform. As a matter of fact, in some systems the same
assembler language feature is used to accomplish both of these logically
different functions.) In this section we consider the use of program blocks
and how they are handled by the assembler. Section 2.3.5 discusses control
sections and their uses.

Figure 2.11 shows our example program as it might be written using program
blocks. In this case three blocks are used. The first (unnamed) program
block contains the executable instructions of the program. The second (named
CDATA) contains all data areas that are a few words or less in length. The
third (named CBLKS) contains all data areas that consist of larger blocks of
memory. Some possible reasons for making such a division are discussed later
in this section.

The assembler directive USE indicates which portions of the source program
belong to the various blocks. At the beginning of the program, statements
are assumed to be part of the unnamed (default) block; if no USE
statements are included, the entire program belongs to this single block. The
USE statement on line 92 signals the beginning of the block named CDATA.
Source statements are associated with this block until the USE statement on
line 103, which begins the block named CBLKS. The USE statement may also
indicate a continuation of a previously begun block. Thus the statement on
line 123 resumes the default block, and the statement on line 183 resumes the
block named CDATA.

As we can see, each program block may actually contain several separate
segments of the source program. The assembler will (logically) rearrange these
segments to gather together the pieces of each block. These blocks will then be
assigned addresses in the object program, with the blocks appearing in the

Line Source statement

5    COPY    START   0               COPY FILE FROM INPUT TO OUTPUT
10   FIRST   STL     RETADR          SAVE RETURN ADDRESS
15   CLOOP   JSUB    RDREC           READ INPUT RECORD
20           LDA     LENGTH          TEST FOR EOF (LENGTH = 0)
25           COMP    #0
30           JEQ     ENDFIL          EXIT IF EOF FOUND
35           JSUB    WRREC           WRITE OUTPUT RECORD
40           J       CLOOP           LOOP
45   ENDFIL  LDA     =C'EOF'         INSERT END OF FILE MARKER
50           STA     BUFFER
55           LDA     #3              SET LENGTH = 3
60           STA     LENGTH
65           JSUB    WRREC           WRITE EOF
70           J       @RETADR         RETURN TO CALLER
92           USE     CDATA
95   RETADR  RESW    1
100  LENGTH  RESW    1               LENGTH OF RECORD
103          USE     CBLKS
105  BUFFER  RESB    4096            4096-BYTE BUFFER AREA
106  BUFEND  EQU     *               FIRST LOCATION AFTER BUFFER
107  MAXLEN  EQU     BUFEND-BUFFER   MAXIMUM RECORD LENGTH
110  .
115  .       SUBROUTINE TO READ RECORD INTO BUFFER
120  .
123          USE
125  RDREC   CLEAR   X               CLEAR LOOP COUNTER
130          CLEAR   A               CLEAR A TO ZERO
132          CLEAR   S               CLEAR S TO ZERO
133         +LDT     #MAXLEN
135  RLOOP   TD      INPUT           TEST INPUT DEVICE
140          JEQ     RLOOP           LOOP UNTIL READY
145          RD      INPUT           READ CHARACTER INTO REGISTER A
150          COMPR   A,S             TEST FOR END OF RECORD (X'00')
155          JEQ     EXIT            EXIT LOOP IF EOR
160          STCH    BUFFER,X        STORE CHARACTER IN BUFFER
165          TIXR    T               LOOP UNLESS MAX LENGTH
170          JLT     RLOOP           HAS BEEN REACHED
175  EXIT    STX     LENGTH          SAVE RECORD LENGTH
180          RSUB                    RETURN TO CALLER
183          USE     CDATA
185  INPUT   BYTE    X'F1'           CODE FOR INPUT DEVICE
195  .
200  .       SUBROUTINE TO WRITE RECORD FROM BUFFER
205  .
208          USE
210  WRREC   CLEAR   X               CLEAR LOOP COUNTER
212          LDT     LENGTH
215  WLOOP   TD      =X'05'          TEST OUTPUT DEVICE
220          JEQ     WLOOP           LOOP UNTIL READY
225          LDCH    BUFFER,X        GET CHARACTER FROM BUFFER
230          WD      =X'05'          WRITE CHARACTER
235          TIXR    T               LOOP UNTIL ALL CHARACTERS
240          JLT     WLOOP           HAVE BEEN WRITTEN
245          RSUB                    RETURN TO CALLER
252          USE     CDATA
253          LTORG
255          END     FIRST

Figure 2.11 Example of a program with multiple program blocks.


same order in which they were first begun in the source program. The result is
the same as if the programmer had physically rearranged the source statements
to group together all the source lines belonging to each block.

The assembler accomplishes this logical rearrangement of code by maintaining,
during Pass 1, a separate location counter for each program block. The
location counter for a block is initialized to 0 when the block is first
begun. The current value of this location counter is saved when switching to
another block, and the saved value is restored when resuming a previous block.
Thus during Pass 1 each label in the program is assigned an address that is
relative to the start of the block that contains it. When labels are entered
into the symbol table, the block name or number is stored along with the
assigned relative address. At the end of Pass 1 the latest value of the
location counter for each block indicates the length of that block. The
assembler can then assign to each block a starting address in the object
program (beginning with relative location 0).

For code generation during Pass 2, the assembler needs the address for
each symbol relative to the start of the object program (not the start of an
individual program block). This is easily found from the information in
SYMTAB. The assembler simply adds the location of the symbol, relative to the
start of its block, to the assigned block starting address.
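
The bookkeeping described above can be sketched in a few lines. The Python
below is a minimal illustration under stated assumptions: statements are
given as (label, opcode, argument) tuples with instruction sizes already
known, and the statement list is a shortened, made-up program rather than the
full listing of Fig. 2.11, so its addresses differ from those in the figures.

```python
def pass1_blocks(statements):
    """Assign block-relative addresses, keeping one location counter per block.

    statements: (label, opcode, arg) tuples. For USE, arg is the block name
    (None for the unnamed block); otherwise arg is the size in bytes.
    Returns (symtab, locctr): symtab maps label -> (block, relative address),
    and locctr ends up holding the length of each block.
    """
    locctr = {"": 0}          # the unnamed (default) block is active at start
    current = ""
    symtab = {}
    for label, opcode, arg in statements:
        if opcode == "USE":
            current = arg or ""
            locctr.setdefault(current, 0)   # counter is 0 when first begun
            continue
        if label:
            symtab[label] = (current, locctr[current])
        locctr[current] += arg
    return symtab, locctr


def absolute_address(symbol, symtab, lengths):
    """Pass 2 view: block starting address plus the symbol's offset in its block."""
    starts, next_start = {}, 0
    for block, length in lengths.items():   # dicts keep first-use order
        starts[block] = next_start
        next_start += length
    block, offset = symtab[symbol]
    return starts[block] + offset


# Shortened, hypothetical program; sizes are in bytes.
statements = [
    ("FIRST", "STL", 3),
    (None, "USE", "CDATA"),
    ("RETADR", "RESW", 3),
    ("LENGTH", "RESW", 3),
    (None, "USE", "CBLKS"),
    ("BUFFER", "RESB", 4096),
    (None, "USE", None),
    ("RDREC", "CLEAR", 2),
    (None, "USE", "CDATA"),
    ("INPUT", "BYTE", 1),
]
symtab, lengths = pass1_blocks(statements)
print(symtab["INPUT"], lengths)
print(absolute_address("BUFFER", symtab, lengths))
```

Because the counter for each block is kept in the dictionary while another
block is active, switching blocks with USE needs no explicit save and restore
step in this sketch.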
Figure 2.12 demonstrates this process applied to our sample program. The
column headed Loc/Block shows the relative address (within a program
block) assigned to each source line and a block number indicating which
program block is involved (0 = default block, 1 = CDATA, 2 = CBLKS). This is
essentially the same information that is stored in SYMTAB for each symbol.
Notice that the value of the symbol MAXLEN (line 107) is shown without a
block number. This indicates that MAXLEN is an absolute symbol, whose
value is not relative to the start of any program block.

At the end of Pass 1 the assembler constructs a table that contains the
starting addresses and lengths for all blocks. For our sample program, this
table looks like

Block name    Block number    Address    Length

(default)     0               0000       0066
CDATA         1               0066       000B
CBLKS         2               0071       1000

Now consider the instruction

20   0006 0          LDA     LENGTH          032060

Line Loc/Block    Source statement                Object code

5    0000 0  COPY    START   0
10   0000 0  FIRST   STL     RETADR          172063
15   0003 0  CLOOP   JSUB    RDREC           4B2021
20   0006 0          LDA     LENGTH          032060
25   0009 0          COMP    #0              290000
30   000C 0          JEQ     ENDFIL          332006
35   000F 0          JSUB    WRREC           4B203B
40   0012 0          J       CLOOP           3F2FEE
45   0015 0  ENDFIL  LDA     =C'EOF'         032055
50   0018 0          STA     BUFFER          0F2056
55   001B 0          LDA     #3              010003
60   001E 0          STA     LENGTH          0F2048
65   0021 0          JSUB    WRREC           4B2029
70   0024 0          J       @RETADR         3E203F
92   0000 1          USE     CDATA
95   0000 1  RETADR  RESW    1
100  0003 1  LENGTH  RESW    1
103  0000 2          USE     CBLKS
105  0000 2  BUFFER  RESB    4096
106  1000 2  BUFEND  EQU     *
107  1000    MAXLEN  EQU     BUFEND-BUFFER
110          .
115          .       SUBROUTINE TO READ RECORD INTO BUFFER
120          .
123  0027 0          USE
125  0027 0  RDREC   CLEAR   X               B410
130  0029 0          CLEAR   A               B400
132  002B 0          CLEAR   S               B440
133  002D 0         +LDT     #MAXLEN         75101000
135  0031 0  RLOOP   TD      INPUT           E32038
140  0034 0          JEQ     RLOOP           332FFA
145  0037 0          RD      INPUT           DB2032
150  003A 0          COMPR   A,S             A004
155  003C 0          JEQ     EXIT            332008
160  003F 0          STCH    BUFFER,X        57A02F
165  0042 0          TIXR    T               B850
170  0044 0          JLT     RLOOP           3B2FEA
175  0047 0  EXIT    STX     LENGTH          13201F
180  004A 0          RSUB                    4F0000
183  0006 1          USE     CDATA
185  0006 1  INPUT   BYTE    X'F1'           F1
195          .
200          .       SUBROUTINE TO WRITE RECORD FROM BUFFER
205          .
208  004D 0          USE
210  004D 0  WRREC   CLEAR   X               B410
212  004F 0          LDT     LENGTH          772017
215  0052 0  WLOOP   TD      =X'05'          E3201B
220  0055 0          JEQ     WLOOP           332FFA
225  0058 0          LDCH    BUFFER,X        53A016
230  005B 0          WD      =X'05'          DF2012
235  005E 0          TIXR    T               B850
240  0060 0          JLT     WLOOP           3B2FEF
245  0063 0          RSUB                    4F0000
252  0007 1          USE     CDATA
253                  LTORG
     0007 1  *       =C'EOF'                 454F46
     000A 1  *       =X'05'                  05
255                  END     FIRST

Figure 2.12 Program from Fig. 2.11 with object code.



SYMTAB shows the value of the operand (the symbol LENGTH) as relative
location 0003 within program block 1 (CDATA). The starting address for CDATA
is 0066. Thus the desired target address for this instruction is 0003 + 0066 =
0069. The instruction is to be assembled using program-counter relative
addressing. When the instruction is executed, the program counter contains the
address of the following instruction (line 25). The address of this
instruction is relative location 0009 within the default block. Since the
default block starts at location 0000, this address is simply 0009. Thus the
required displacement is 0069 - 0009 = 60. The calculation of the other
addresses during Pass 2 follows a similar pattern.
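
The arithmetic for line 20 can be reproduced with a short sketch. The Python
below is illustrative only: the function name is hypothetical, the block
starting addresses are copied from the table above, and the instruction is
assumed to be a Format 3 instruction with simple addressing (n = i = 1) and
program-counter relative mode.

```python
# Block starting addresses, from the table built at the end of Pass 1.
BLOCK_START = {"(default)": 0x0000, "CDATA": 0x0066, "CBLKS": 0x0071}


def pc_relative_code(opcode, target, next_instruction):
    """Assemble a Format 3 instruction with n=i=1 and PC-relative addressing."""
    disp = target - next_instruction
    if not -2048 <= disp <= 2047:
        raise ValueError("displacement does not fit in 12 bits")
    first_byte = opcode | 0b11       # n and i occupy the low 2 bits of byte 1
    flags = 0b0010                   # x=0, b=0, p=1, e=0
    return (first_byte << 16) | (flags << 12) | (disp & 0xFFF)


target = BLOCK_START["CDATA"] + 0x0003                 # LENGTH
next_instruction = BLOCK_START["(default)"] + 0x0009   # line 25
lda_length = pc_relative_code(0x00, target, next_instruction)  # LDA opcode is 00
print(f"{target:04X} {lda_length:06X}")                # prints "0069 032060"
```

The same function gives 3F2FEE for line 40 (J CLOOP, opcode 3C, target 0003,
next instruction at 0015), where the displacement is negative and is stored
in 12-bit two's complement form.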
We can immediately see that the separation of the program into blocks has
considerably reduced our addressing problems. Because the large buffer area
is moved to the end of the object program, we no longer need to use extended
format instructions on lines 15, 35, and 65. Furthermore, the base register is
no longer necessary; we have deleted the LDB and BASE statements previously
on lines 13 and 14. The problem of placement of literals (and literal
references) in the program is also much more easily solved. We simply include
a LTORG statement in the CDATA block to be sure that the literals are placed
ahead of any large data areas.

Of course the use of program blocks has not accomplished anything we
could not have done by rearranging the statements of the source program. For
example, program readability is often improved if the definitions of data
areas are placed in the source program close to the statements that reference
them. This could be accomplished in a long subroutine (without using program
blocks) by simply inserting data areas in any convenient position. However,
the programmer would need to provide Jump instructions to branch around
the storage thus reserved.

In the situation just discussed, machine considerations suggested that the
parts of the object program appear in memory in a particular order. On the
other hand, human factors suggested that the source program should be in a
different order. The use of program blocks is one way of satisfying both of
these requirements, with the assembler providing the required reorganization.

It is not necessary to physically rearrange the generated code in the object
program to place the pieces of each program block together. The assembler can
simply write the object code as it is generated during Pass 2 and insert the
proper load address in each Text record. These load addresses will, of course,
reflect the starting address of the block as well as the relative location of
the code within the block. This process is illustrated in Fig. 2.13. The first
two Text records are generated from the source program lines 5 through 70.
When the USE statement on line 92 is recognized, the assembler writes out the
current Text record (even though there is still room left in it). The
assembler then prepares to begin a new Text record for the new program block.
As it happens, the statements on lines 95 through 105 result in no generated
code, so no new Text

H^COPY  ^000000^001071
T^000000^1E^172063^4B2021^032060^290000^332006^4B203B^3F2FEE^032055^0F2056^010003
T^00001E^09^0F2048^4B2029^3E203F
T^000027^1D^B410^B400^B440^75101000^E32038^332FFA^DB2032^A004^332008^57A02F^B850
T^000044^09^3B2FEA^13201F^4F0000
T^00006C^01^F1
T^00004D^19^B410^772017^E3201B^332FFA^53A016^DF2012^B850^3B2FEF^4F0000
T^00006D^04^454F46^05
E^000000

Figure 2.13 Object program corresponding to Fig. 2.11.

records are created. The next two Text records come from lines 125 through
180. This time the statements that belong to the next program block do result
in the generation of object code. The fifth Text record contains the single
byte of data from line 185. The sixth Text record resumes the default program
block and the rest of the object program continues in similar fashion.

It does not matter that the Text records of the object program are not in
sequence by address; the loader will simply load the object code from each
record at the indicated address. When this loading is completed, the generated
code from the default block will occupy relative locations 0000 through 0065;
the generated code and reserved storage for CDATA will occupy locations
0066 through 0070; and the storage reserved for CBLKS will occupy locations
0071 through 1070. Figure 2.14 traces the blocks of the example program
through this process of assembly and loading. Notice that the program
segments marked CDATA(1) and CBLKS(1) are not actually present in the object
program. Because of the way the addresses are assigned, storage will
automatically be reserved for these areas when the program is loaded.

You should carefully examine the generated code in Fig. 2.12, and work
through the assembly of several more instructions to be sure you understand
how the assembler handles multiple program blocks. To understand how the
pieces of each program block are gathered together, you may also want to
simulate (by hand) the loading of the object program of Fig. 2.13.
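
The hand simulation suggested above can also be sketched in code. The Python
below is a minimal illustration, not a real loader: it assumes each Text
record is a string laid out as T, a 6-digit starting address, a 2-digit
length in bytes, and then the object code, written without any visual field
separators. The record contents are those of Fig. 2.13.

```python
TEXT_RECORDS = [
    "T0000001E1720634B20210320602900003320064B203B3F2FEE0320550F2056010003",
    "T00001E090F20484B20293E203F",
    "T0000271DB410B400B44075101000E32038332FFADB2032A00433200857A02FB850",
    "T000044093B2FEA13201F4F0000",
    "T00006C01F1",
    "T00004D19B410772017E3201B332FFA53A016DF2012B8503B2FEF4F0000",
    "T00006D04454F4605",
]


def load(records, load_address=0):
    """Place the bytes of each Text record at its indicated address."""
    memory = {}
    for record in records:
        start = int(record[1:7], 16)       # Col. 2-7: starting address
        length = int(record[7:9], 16)      # Col. 8-9: length in bytes
        data = bytes.fromhex(record[9:9 + 2 * length])
        for offset, byte in enumerate(data):
            memory[load_address + start + offset] = byte
    return memory


memory = load(TEXT_RECORDS)
loaded = sorted(memory)
print(f"{loaded[0]:04X}..{loaded[-1]:04X}, {len(loaded)} bytes loaded")
print(f"{memory[0x006C]:02X}")             # the byte generated for INPUT
```

The order of the records does not affect the result, since each byte is
stored at the address carried by its own record. Locations 0066 through 006B
(RETADR and LENGTH) and the whole of CBLKS receive no bytes at all; they are
reserved only because the addresses assigned around them leave room.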

2.3.5 Control Sections and Program Linking

In this section, we discuss the handling of programs that consist of multiple
control sections. A control section is a part of the program that maintains
its identity after assembly; each such control section can be loaded and
relocated independently of the others. Different control sections are most
often used for subroutines or other logical subdivisions of a program. The
programmer can assemble, load, and manipulate each of these control sections
separately. The

[Diagram: three panels labeled Source program, Object program, and Program
loaded in memory, showing how the segments Default(1), Default(2),
Default(3), CDATA(1), CDATA(2), CDATA(3), and CBLKS(1) are rearranged. The
loaded program is marked with relative addresses 0000, 0027, 004D, 0066,
006D, 0071, and 1070.]

Figure 2.14 Program blocks from Fig. 2.11 traced through the assembly
and loading processes.

resulting flexibility is a major benefit of using control sections. We
consider examples of this when we discuss linkage editors in Chapter 3.

When control sections form logically related parts of a program, it is
necessary to provide some means for linking them together. For example,
instructions in one control section might need to refer to instructions or
data located in another section. Because control sections are independently
loaded and relocated, the assembler is unable to process these references in
the usual way. The assembler has no idea where any other control section will
be located at execution time. Such references between control sections are
called external references. The assembler generates information for each
external reference that will allow the loader to perform the required linking.
In this section we describe how external references are handled by our
assembler. Chapter 3 discusses in detail how the actual linking is performed.

Figure 2.15 shows our example program as it might be written using multiple
control sections. In this case there are three control sections: one for the
main program and one for each subroutine. The START statement identifies
the beginning of the assembly and gives a name (COPY) to the first control
section. The first section continues until the CSECT statement on line 109.
This assembler directive signals the start of a new control section named
RDREC. Similarly, the CSECT statement on line 193 begins the control section
named WRREC. The assembler establishes a separate location counter (beginning
at 0) for each control section, just as it does for program blocks.

Control sections differ from program blocks in that they are handled
separately by the assembler. (It is not even necessary for all control
sections in a program to be assembled at the same time.) Symbols that are
defined in one control section may not be used directly by another control
section; they must be identified as external references for the loader to
handle. Figure 2.15 shows the use of two assembler directives to identify such
references: EXTDEF (external definition) and EXTREF (external reference). The
EXTDEF statement in a control section names symbols, called external symbols,
that are defined in this control section and may be used by other sections.
Control section names (in this case COPY, RDREC, and WRREC) do not need to be
named in an EXTDEF statement because they are automatically considered to be
external symbols. The EXTREF statement names symbols that are used in this
control section and are defined elsewhere. For example, the symbols BUFFER,
BUFEND, and LENGTH are defined in the control section named COPY and made
available to the other sections by the EXTDEF statement on line 6. The third
control section (WRREC) uses two of these symbols, as specified in its EXTREF
statement (line 207). The order in which symbols are listed in the EXTDEF and
EXTREF statements is not significant.

Now we are ready to look at how external references are handled by the
assembler. Figure 2.16 shows the generated object code for each statement in
the program. Consider first the instruction

15   0003  CLOOP  +JSUB    RDREC           4B100000

The operand (RDREC) is named in the EXTREF statement for the control section,
so this is an external reference. The assembler has no idea where the control
section containing RDREC will be loaded, so it cannot assemble the
address for this instruction. Instead the assembler inserts an address of zero
and passes information to the loader, which will cause the proper address to
be inserted at load time. The address of RDREC will have no predictable
relationship to anything in this control section; therefore relative
addressing is not possible. Thus an extended format instruction must be used
to provide room for the actual address to be inserted. This is true of any
instruction whose operand involves an external reference.
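
The object code generated for such an instruction can be reproduced with a
short sketch. The Python below is illustrative only: the function name is
hypothetical, and it assumes simple addressing (n = i = 1) with the base and
program-counter relative bits both off, as in the external references of
Fig. 2.16.

```python
def format4_external(opcode, indexed=False):
    """Assemble an extended-format instruction whose address field is zero.

    The 20-bit address is left as 0 for the loader to fill in; only the
    opcode and the flag bits are known at assembly time.
    """
    first_byte = opcode | 0b11                 # n=1, i=1 (simple addressing)
    flags = (int(indexed) << 3) | 0b0001       # x as given, b=0, p=0, e=1
    address = 0                                # placeholder for the loader
    return (first_byte << 24) | (flags << 20) | address


jsub_rdrec = format4_external(0x48)                  # +JSUB RDREC (opcode 48)
stch_buffer = format4_external(0x54, indexed=True)   # +STCH BUFFER,X (opcode 54)
print(f"{jsub_rdrec:08X} {stch_buffer:08X}")         # prints "4B100000 57900000"
```

The only difference between the two flag nibbles is the x bit, which the
assembler can set because indexing is specified in the source statement and
does not depend on where BUFFER is eventually loaded.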
Line Source statement

5    COPY    START   0                     COPY FILE FROM INPUT TO OUTPUT
6            EXTDEF  BUFFER,BUFEND,LENGTH
7            EXTREF  RDREC,WRREC
10   FIRST   STL     RETADR                SAVE RETURN ADDRESS
15   CLOOP  +JSUB    RDREC                 READ INPUT RECORD
20           LDA     LENGTH                TEST FOR EOF (LENGTH = 0)
25           COMP    #0
30           JEQ     ENDFIL                EXIT IF EOF FOUND
35          +JSUB    WRREC                 WRITE OUTPUT RECORD
40           J       CLOOP                 LOOP
45   ENDFIL  LDA     =C'EOF'               INSERT END OF FILE MARKER
50           STA     BUFFER
55           LDA     #3                    SET LENGTH = 3
60           STA     LENGTH
65          +JSUB    WRREC                 WRITE EOF
70           J       @RETADR               RETURN TO CALLER
95   RETADR  RESW    1
100  LENGTH  RESW    1                     LENGTH OF RECORD
103          LTORG
105  BUFFER  RESB    4096                  4096-BYTE BUFFER AREA
106  BUFEND  EQU     *
107  MAXLEN  EQU     BUFEND-BUFFER

109  RDREC   CSECT
110  .
115  .       SUBROUTINE TO READ RECORD INTO BUFFER
120  .
122          EXTREF  BUFFER,LENGTH,BUFEND
125          CLEAR   X                     CLEAR LOOP COUNTER
130          CLEAR   A                     CLEAR A TO ZERO
132          CLEAR   S                     CLEAR S TO ZERO
133          LDT     MAXLEN
135  RLOOP   TD      INPUT                 TEST INPUT DEVICE
140          JEQ     RLOOP                 LOOP UNTIL READY
145          RD      INPUT                 READ CHARACTER INTO REGISTER A
150          COMPR   A,S                   TEST FOR END OF RECORD (X'00')
155          JEQ     EXIT                  EXIT LOOP IF EOR
160         +STCH    BUFFER,X              STORE CHARACTER IN BUFFER
165          TIXR    T                     LOOP UNLESS MAX LENGTH
170          JLT     RLOOP                 HAS BEEN REACHED
175  EXIT   +STX     LENGTH                SAVE RECORD LENGTH
180          RSUB                          RETURN TO CALLER
185  INPUT   BYTE    X'F1'                 CODE FOR INPUT DEVICE
190  MAXLEN  WORD    BUFEND-BUFFER

193  WRREC   CSECT
195  .
200  .       SUBROUTINE TO WRITE RECORD FROM BUFFER
205  .
207          EXTREF  LENGTH,BUFFER
210          CLEAR   X                     CLEAR LOOP COUNTER
212         +LDT     LENGTH
215  WLOOP   TD      =X'05'                TEST OUTPUT DEVICE
220          JEQ     WLOOP                 LOOP UNTIL READY
225         +LDCH    BUFFER,X              GET CHARACTER FROM BUFFER
230          WD      =X'05'                WRITE CHARACTER
235          TIXR    T                     LOOP UNTIL ALL CHARACTERS
240          JLT     WLOOP                 HAVE BEEN WRITTEN
245          RSUB                          RETURN TO CALLER
255          END     FIRST

Figure 2.15 Illustration of control sections and program linking.
Line Loc   Source statement                Object code

5    0000  COPY    START   0
6                  EXTDEF  BUFFER,BUFEND,LENGTH
7                  EXTREF  RDREC,WRREC
10   0000  FIRST   STL     RETADR          172027
15   0003  CLOOP  +JSUB    RDREC           4B100000
20   0007          LDA     LENGTH          032023
25   000A          COMP    #0              290000
30   000D          JEQ     ENDFIL          332007
35   0010         +JSUB    WRREC           4B100000
40   0014          J       CLOOP           3F2FEC
45   0017  ENDFIL  LDA     =C'EOF'         032016
50   001A          STA     BUFFER          0F2016
55   001D          LDA     #3              010003
60   0020          STA     LENGTH          0F200A
65   0023         +JSUB    WRREC           4B100000
70   0027          J       @RETADR         3E2000
95   002A  RETADR  RESW    1
100  002D  LENGTH  RESW    1
103                LTORG
     0030  *       =C'EOF'                 454F46
105  0033  BUFFER  RESB    4096
106  1033  BUFEND  EQU     *
107  1000  MAXLEN  EQU     BUFEND-BUFFER

109  0000  RDREC   CSECT
110        .
115        .       SUBROUTINE TO READ RECORD INTO BUFFER
120        .
122                EXTREF  BUFFER,LENGTH,BUFEND
125  0000          CLEAR   X               B410
130  0002          CLEAR   A               B400
132  0004          CLEAR   S               B440
133  0006          LDT     MAXLEN          77201F
135  0009  RLOOP   TD      INPUT           E3201B
140  000C          JEQ     RLOOP           332FFA
145  000F          RD      INPUT           DB2015
150  0012          COMPR   A,S             A004
155  0014          JEQ     EXIT            332009
160  0017         +STCH    BUFFER,X        57900000
165  001B          TIXR    T               B850
170  001D          JLT     RLOOP           3B2FE9
175  0020  EXIT   +STX     LENGTH          13100000
180  0024          RSUB                    4F0000
185  0027  INPUT   BYTE    X'F1'           F1
190  0028  MAXLEN  WORD    BUFEND-BUFFER   000000

193  0000  WRREC   CSECT
195        .
200        .       SUBROUTINE TO WRITE RECORD FROM BUFFER
205        .
207                EXTREF  LENGTH,BUFFER
210  0000          CLEAR   X               B410
212  0002         +LDT     LENGTH          77100000
215  0006  WLOOP   TD      =X'05'          E32012
220  0009          JEQ     WLOOP           332FFA
225  000C         +LDCH    BUFFER,X        53900000
230  0010          WD      =X'05'          DF2008
235  0013          TIXR    T               B850
240  0015          JLT     WLOOP           3B2FEE
245  0018          RSUB                    4F0000
255                END     FIRST
     001B  *       =X'05'                  05

Figure 2.16 Program from Fig. 2.15 with object code.

Similarly, the instruction

160 0017 +STCH BUFFER, X 57900000

makes an external reference to BUFFER. The instruction is assembled using
extended format with an address of zero. The x bit is set to 1 to indicate
indexed addressing, as specified by the instruction. The statement

190 0028 MAXLEN WORD BUFEND-BUFFER 000000

is only slightly different. Here the value of the data word to be generated is
specified by an expression involving two external references: BUFEND and
BUFFER. As before, the assembler stores this value as zero. When the program
is loaded, the loader will add to this data area the address of BUFEND and
subtract from it the address of BUFFER, which results in the desired value.

Note the difference between the handling of the expression on line 190 and
the similar expression on line 107. The symbols BUFEND and BUFFER are
defined in the same control section with the EQU statement on line 107. Thus
the value of the expression can be calculated immediately by the assembler.
This could not be done for line 190; BUFEND and BUFFER are defined in
another control section, so their values are unknown at assembly time.
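
The loader's adjustment of the word on line 190 can be sketched as follows.
The Python below is a minimal illustration with hypothetical names: the load
addresses 4000 and 7A00 (hexadecimal) for COPY are arbitrary assumptions, and
the relative addresses of BUFFER and BUFEND within COPY are taken from
Fig. 2.16.

```python
def apply_modifications(initial, modifications, external_symbols):
    """Add or subtract external symbol addresses to a field assembled as zero.

    modifications: list of (flag, symbol) pairs with flag '+' or '-'.
    """
    value = initial
    for flag, symbol in modifications:
        address = external_symbols[symbol]
        value += address if flag == "+" else -address
    return value & 0xFFFFFF            # a SIC word holds 24 bits


def external_table(copy_load_address):
    """Actual addresses of COPY's external symbols for a given load address."""
    return {
        "BUFFER": copy_load_address + 0x0033,
        "BUFEND": copy_load_address + 0x1033,
    }


changes = [("+", "BUFEND"), ("-", "BUFFER")]
maxlen_word = apply_modifications(0x000000, changes, external_table(0x4000))
print(f"{maxlen_word:06X}")            # prints "001000"
```

Loading COPY at a different address changes both symbol values by the same
amount, so the word still ends up as 001000, the same value the assembler
computed directly for the EQU on line 107.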
As we can see from the above discussion, the assembler must remember
(via entries in SYMTAB) in which control section a symbol is defined. Any
attempt to refer to a symbol in another control section must be flagged as an
error unless the symbol is identified (using EXTREF) as an external reference.
The assembler must also allow the same symbol to be used in different control
sections. For example, the conflicting definitions of MAXLEN on lines 107 and
190 should cause no problem. A reference to MAXLEN in the control section
COPY would use the definition on line 107, whereas a reference to MAXLEN
in RDREC would use the definition on line 190.
So far we have seen how the assembler leaves room in the object code for
the values of external symbols. The assembler must also include information
in the object program that will cause the loader to insert the proper values
where they are required. We need two new record types in the object program
and a change in a previously defined record type. As before, the exact format
of these records is arbitrary; however, the same information must be passed to
the loader in some form.

The two new record types are Define and Refer. A Define record gives
information about external symbols that are defined in this control section
(that is, symbols named by EXTDEF). A Refer record lists symbols that are
used as external references by the control section (that is, symbols named by
EXTREF). The formats of these records are as follows.

Define record:
Col. 1      D
Col. 2-7    Name of external symbol defined in this control section
Col. 8-13   Relative address of symbol within this control section
            (hexadecimal)
Col. 14-73  Repeat information in Col. 2-13 for other external symbols

Refer record:
Col. 1      R
Col. 2-7    Name of external symbol referred to in this control section
Col. 8-73   Names of other external reference symbols
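
The column layouts above translate directly into fixed-width slicing. The following is a minimal Python sketch of how a loader might read these two record types; the function names `parse_define` and `parse_refer` are hypothetical, and the sample records are the COPY records as they appear in Fig. 2.17.

```python
def parse_define(record):
    """Return [(name, relative_address), ...] from one Define record."""
    if not record.startswith("D"):
        raise ValueError("not a Define record")
    body = record[1:].rstrip("\n")
    pairs = []
    # Each entry is a 6-column name followed by a 6-digit hexadecimal address.
    for i in range(0, len(body), 12):
        name = body[i:i + 6].strip()
        address = int(body[i + 6:i + 12], 16)
        pairs.append((name, address))
    return pairs


def parse_refer(record):
    """Return the external reference names listed in one Refer record."""
    if not record.startswith("R"):
        raise ValueError("not a Refer record")
    body = record[1:].rstrip("\n")
    # Names occupy consecutive 6-column fields, padded with blanks.
    names = [body[i:i + 6].strip() for i in range(0, len(body), 6)]
    return [name for name in names if name]


# Define and Refer records of control section COPY.
defined = parse_define("DBUFFER000033BUFEND001033LENGTH00002D")
referred = parse_refer("RRDREC WRREC ")
```

Here `defined` pairs each EXTDEF symbol with its relative address, while `referred` holds names only, since no address information is available for EXTREF symbols at assembly time.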

The other information needed for program linking is added to the
Modification record type. The new format is as follows.

Modification record (revised):
Col. 1      M
Col. 2-7    Starting address of the field to be modified, relative to
            the beginning of the control section (hexadecimal)
Col. 8-9    Length of the field to be modified, in half-bytes
            (hexadecimal)
Col. 10     Modification flag (+ or -)
Col. 11-16  External symbol whose value is to be added to or
            subtracted from the indicated field

The first three items in this record are the same as previously discussed. The
two new items specify the modification to be performed: adding or subtracting
the value of some external symbol. The symbol used for modification may
be defined either in this control section or in another one.
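
As an illustration of the revised format, the sketch below splits a Modification record into its fields. It is only one possible reading of the column layout; the `Modification` tuple and the function name `parse_modification` are hypothetical names chosen for this example.

```python
from typing import NamedTuple


class Modification(NamedTuple):
    address: int     # starting address of the field, relative to the section
    half_bytes: int  # length of the field in half-bytes
    sign: int        # +1 to add the symbol value, -1 to subtract it
    symbol: str      # external symbol named in Col. 11-16


def parse_modification(record):
    """Split a revised Modification record into its fields."""
    if not record.startswith("M"):
        raise ValueError("not a Modification record")
    flag = record[9]
    if flag not in ("+", "-"):
        raise ValueError("modification flag must be + or -")
    return Modification(
        address=int(record[1:7], 16),
        half_bytes=int(record[7:9], 16),
        sign=1 if flag == "+" else -1,
        symbol=record[10:16].strip(),
    )


mod = parse_modification("M00000405+RDREC")
```

For the sample record, `mod` describes a 5-half-byte field at relative address 0004 to which the value of RDREC is to be added.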
Figure 2.17 shows the object program corresponding to the source in Fig.
2.16. Notice that there is a separate set of object program records (from Header
through End) for each control section. The records for each control section are
exactly the same as they would be if the sections were assembled separately.

The Define and Refer records for each control section include the symbols
named in the EXTDEF and EXTREF statements. In the case of Define, the
record also indicates the relative address of each external symbol within the
control section. For EXTREF symbols, no address information is available.
These symbols are simply named in the Refer record.

HCOPY  000000001033
DBUFFER000033BUFEND001033LENGTH00002D
RRDREC WRREC
T0000001D1720274B1000000320232900003320074B1000003F2FEC0320160F2016
T00001D0D0100030F200A4B1000003E2000
T00003003454F46
M00000405+RDREC
M00001105+WRREC
M00002405+WRREC
E000000

HRDREC 00000000002B
RBUFFERLENGTHBUFEND
T0000001DB410B400B44077201FE3201B332FFADB2015A00433200957900000B850
T00001D0E3B2FE9131000004F0000F1000000
M00001805+BUFFER
M00002105+LENGTH
M00002806+BUFEND
M00002806-BUFFER
E

HWRREC 00000000001C
RLENGTHBUFFER
T0000001CB41077100000E32012332FFA53900000DF2008B8503B2FEE4F000005
M00000305+LENGTH
M00000D05+BUFFER
E

Figure 2.17 Object program corresponding to Fig. 2.15.

Now let us examine the process involved in linking up external references,
beginning with the source statements we discussed previously. The address
field for the JSUB instruction on line 15 begins at relative address 0004. Its
initial value in the object program is zero. The Modification record

M00000405+RDREC

in control section COPY specifies that the address of RDREC is to be added to
this field, thus producing the correct machine instruction for execution. The
other two Modification records in COPY perform similar functions for the
instructions on lines 35 and 65. Likewise, the first Modification record in
control section RDREC fills in the proper address for the external reference on
line 160.

The handling of the data word generated by line 190 is only slightly
different. The value of this word is to be BUFEND-BUFFER, where both BUFEND
and BUFFER are defined in another control section. The assembler generates
an initial value of zero for this word (located at relative address 0028 within
control section RDREC). The last two Modification records in RDREC direct
that the address of BUFEND be added to this field, and the address of
BUFFER be subtracted from it. This computation, performed at load time,
results in the desired value for the data word.
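
The load-time computation described above can be sketched in a few lines of Python. This is an illustrative model only, not the loader algorithm developed in Chapter 3. The load address 4000 for COPY, the dictionary `estab` of external symbol addresses, and the function name `apply_modification` are all assumptions made for the example. The sketch also assumes that a field with an odd number of half-bytes occupies the low-order half-bytes of its bytes, so the leading half-byte of an extended format instruction is preserved.

```python
def apply_modification(memory, section_start, record, estab):
    """Add or subtract an external symbol value in the field named by `record`."""
    address = section_start + int(record[1:7], 16)
    half_bytes = int(record[7:9], 16)
    sign = 1 if record[9] == "+" else -1
    symbol = record[10:16].strip()

    nbytes = (half_bytes + 1) // 2            # bytes spanned by the field
    field = int.from_bytes(memory[address:address + nbytes], "big")
    mask = (1 << (4 * half_bytes)) - 1        # low-order half-bytes only
    updated = (field + sign * estab[symbol]) & mask
    field = (field & ~mask) | updated         # keep any leading half-byte
    memory[address:address + nbytes] = field.to_bytes(nbytes, "big")


# Assumed load addresses: COPY at 4000, so RDREC follows at 4000 + 1033 = 5033.
estab = {"BUFFER": 0x4033, "BUFEND": 0x5033, "LENGTH": 0x402D}
rdrec = 0x5033
memory = bytearray(0x6000)
memory[rdrec + 0x17:rdrec + 0x1B] = bytes.fromhex("57900000")  # line 160
memory[rdrec + 0x20:rdrec + 0x24] = bytes.fromhex("13100000")  # line 175
memory[rdrec + 0x28:rdrec + 0x2B] = bytes.fromhex("000000")    # line 190

for record in ("M00001805+BUFFER", "M00002105+LENGTH",
               "M00002806+BUFEND", "M00002806-BUFFER"):
    apply_modification(memory, rdrec, record, estab)
```

After the four records are applied, the word generated by line 190 holds 001000 (the length of BUFFER), and the two extended format instructions contain the assumed addresses of BUFFER and LENGTH.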
In Chapter 3 we discuss in detail how the required modifications are
performed by the loader. At this time, however, you should be sure that you
understand the concepts involved in the linking process. You should carefully
examine the other Modification records in Fig. 2.17, and reconstruct for
yourself how they were generated from the source program statements.

Note that the revised Modification record may still be used to perform
program relocation. In the case of relocation, the modification required is
adding the beginning address of the control section to certain fields in the
object program. The symbol used as the name of the control section has as its
value the required address. Since the control section name is automatically an
external symbol, it is available for use in Modification records. Thus, for
example, the Modification records from Fig. 2.8 are changed from

M00000705
M00001405
M00002705

to

M00000705+COPY
M00001405+COPY
M00002705+COPY

In this way, exactly the same mechanism can be used for program relocation
and for program linking. There are more examples in the next chapter.

The existence of multiple control sections that can be relocated
independently of one another makes the handling of expressions slightly more
complicated. Our earlier definitions required that all of the relative terms in
an expression be paired (for an absolute expression), or that all except one be
paired (for a relative expression). We must now extend this restriction to
specify that both terms in each pair must be relative within the same control
section. The reason is simple: if the two terms represent relative locations in
the same control section, their difference is an absolute value (regardless of
where the control section is located). On the other hand, if they are in
different control sections, their difference has a value that is unpredictable
(and therefore probably useless). For example, the expression

BUFEND-BUFFER

has as its value the length of BUFFER in bytes. On the other hand, the value of
the expression

RDREC-COPY

is the difference in the load addresses of the two control sections. This value
depends on the way run-time storage is allocated; it is unlikely to be of any
use whatsoever to an application program.

When an expression involves external references, the assembler cannot in
general determine whether or not the expression is legal. The pairing of
relative terms to test legality cannot be done without knowing which of the
terms occur in the same control sections, and this is unknown at assembly time.
In such a case, the assembler evaluates all of the terms it can, and combines
these to form an initial expression value. It also generates Modification
records so the loader can finish the evaluation. The loader can then check the
expression for errors. We discuss this further in Chapter 3 when we examine the
design of a linking loader.
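
The pairing rule can be expressed as a count of signed relative terms per control section. The sketch below is one way such a check might be written once the control section of every symbol is known (for example, by the loader). The function name `classify_expression` and the term representation are hypothetical.

```python
from collections import Counter


def classify_expression(terms, section_of):
    """Classify an expression as 'absolute', 'relative', or 'illegal'.

    terms      -- list of (sign, symbol) pairs, with sign +1 or -1
    section_of -- maps each relative symbol to its control section name
    """
    net = Counter()
    for sign, symbol in terms:
        net[section_of[symbol]] += sign
    # Sections whose positive and negative terms did not pair off completely.
    unpaired = [count for count in net.values() if count != 0]
    if not unpaired:
        return "absolute"
    if unpaired == [1]:
        return "relative"
    return "illegal"


section_of = {"BUFFER": "COPY", "BUFEND": "COPY",
              "COPY": "COPY", "RDREC": "RDREC"}
length = classify_expression([(+1, "BUFEND"), (-1, "BUFFER")], section_of)
mixed = classify_expression([(+1, "RDREC"), (-1, "COPY")], section_of)
```

BUFEND-BUFFER pairs off within COPY and is classified as absolute, whereas RDREC-COPY leaves one unpaired term in each of two control sections and is rejected.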

2.4 ASSEMBLER DESIGN OPTIONS

In this section we discuss two alternatives to the standard two-pass assembler
logic. Section 2.4.1 describes the structure and logic of one-pass assemblers.
These assemblers are used when it is necessary or desirable to avoid a second
pass over the source program. Section 2.4.2 introduces the notion of a
multi-pass assembler, an extension to the two-pass logic that allows an
assembler to handle forward references during symbol definition.

2.4.1 One-Pass Assemblers

In this section we examine the structure and design of one-pass assemblers. As
we discussed in Section 2.1, the main problem in trying to assemble a program
in one pass involves forward references. Instruction operands often are
symbols that have not yet been defined in the source program. Thus the
assembler does not know what address to insert in the translated instruction.

It is easy to eliminate forward references to data items; we can simply
require that all such areas be defined in the source program before they are
referenced. This restriction is not too severe. The programmer merely places
all storage reservation statements at the start of the program rather than at
the end. Unfortunately, forward references to labels on instructions cannot be
eliminated as easily. The logic of the program often requires a forward jump
(for example, in escaping from a loop after testing some condition). Requiring
that the programmer eliminate all such forward jumps would be much too
restrictive and inconvenient. Therefore, the assembler must make some special
provision for handling forward references. To reduce the size of the problem,
many one-pass assemblers do, however, prohibit (or at least discourage)
forward references to data items.
There are two main types of one-pass assembler. One type produces object
code directly in memory for immediate execution; the other type produces the
usual kind of object program for later execution. We use the program in Fig.
2.18 to illustrate our discussion of both types. This example is the same as in
Fig. 2.2, with all data item definitions placed ahead of the code that references
them. The generated object code shown in Fig. 2.18 is for reference only; we
will discuss how each type of one-pass assembler would actually generate the
object program required.

We first discuss one-pass assemblers that generate their object code in
memory for immediate execution. No object program is written out, and no
loader is needed. This kind of load-and-go assembler is useful in a system that
is oriented toward program development and testing. A university computing
system for student use is a typical example of such an environment. In such a
system, a large fraction of the total workload consists of program translation.
Because programs are re-assembled nearly every time they are run, efficiency
of the assembly process is an important consideration. A load-and-go
assembler avoids the overhead of writing the object program out and reading it
back in. This can be accomplished with either a one- or a two-pass assembler.
However, a one-pass assembler also avoids the overhead of an additional pass
over the source program.
Because the object program is produced in memory rather than being
written out on secondary storage, the handling of forward references becomes
less difficult. The assembler simply generates object code instructions as it
scans the source program. If an instruction operand is a symbol that has not yet
been defined, the operand address is omitted when the instruction is assembled.
The symbol used as an operand is entered into the symbol table (unless such
an entry is already present). This entry is flagged to indicate that the symbol
is undefined. The address of the operand field of the instruction that refers to
the undefined symbol is added to a list of forward references associated with
the symbol table entry. When the definition for a symbol is encountered, the
forward reference list for that symbol is scanned (if one exists), and the
proper address is inserted into any instructions previously generated.
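
The bookkeeping just described might be sketched as follows. This is a simplified Python model under stated assumptions, not a complete assembler: it handles only standard-format SIC instructions with a single symbolic operand, and the class name `LoadAndGoSymtab` and its method names are hypothetical. The addresses and opcodes are taken from lines 15, 30, 35, 45, and 125 of Fig. 2.18.

```python
class LoadAndGoSymtab:
    """Symbol table with forward reference lists for a load-and-go assembler."""

    def __init__(self, memory):
        self.memory = memory   # bytearray that receives the object code
        self.values = {}       # defined symbols -> address
        self.forward = {}      # undefined symbols -> operand field addresses

    def _store(self, field, address):
        # Write a 15-bit address, keeping the x (indexing) bit of the first byte.
        self.memory[field] = (self.memory[field] & 0x80) | ((address >> 8) & 0x7F)
        self.memory[field + 1] = address & 0xFF

    def emit(self, loc, opcode, symbol):
        """Assemble one instruction at `loc`, noting any forward reference."""
        self.memory[loc] = opcode
        if symbol in self.values:
            self._store(loc + 1, self.values[symbol])
        else:
            # Operand address omitted for now; remember where it belongs.
            self.forward.setdefault(symbol, []).append(loc + 1)

    def define(self, symbol, address):
        """Enter a definition and fill in every instruction waiting for it."""
        self.values[symbol] = address
        for field in self.forward.pop(symbol, []):
            self._store(field, address)

    def undefined(self):
        """Symbols still undefined; at end of program these are errors."""
        return sorted(self.forward)


memory = bytearray(0x3000)
symtab = LoadAndGoSymtab(memory)
symtab.emit(0x2012, 0x48, "RDREC")   # line 15: JSUB RDREC
symtab.emit(0x201B, 0x30, "ENDFIL")  # line 30: JEQ ENDFIL
symtab.emit(0x201E, 0x48, "WRREC")   # line 35: JSUB WRREC
symtab.define("ENDFIL", 0x2024)      # line 45
symtab.define("RDREC", 0x203D)       # line 125
```

At this point the instructions at 2012 and 201B contain their final operand addresses, while WRREC is still listed as undefined with one pending reference at 201F.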

Line  Loc   Source statement              Object code

      1000  COPY    START   1000
      1000  EOF     BYTE    C'EOF'        454F46
      1003  THREE   WORD    3             000003
      1006  ZERO    WORD    0             000000
      1009  RETADR  RESW    1
      100C  LENGTH  RESW    1
      100F  BUFFER  RESB    4096

10    200F  FIRST   STL     RETADR        141009
15    2012  CLOOP   JSUB    RDREC         48203D
20    2015          LDA     LENGTH        00100C
25    2018          COMP    ZERO          281006
30    201B          JEQ     ENDFIL        302024
35    201E          JSUB    WRREC         482062
40    2021          J       CLOOP         3C2012
45    2024  ENDFIL  LDA     EOF           001000
50    2027          STA     BUFFER        0C100F
55    202A          LDA     THREE         001003
60    202D          STA     LENGTH        0C100C
65    2030          JSUB    WRREC         482062
70    2033          LDL     RETADR        081009
75    2036          RSUB                  4C0000

            SUBROUTINE TO READ RECORD INTO BUFFER

      2039  INPUT   BYTE    X'F1'         F1
      203A  MAXLEN  WORD    4096          001000

125   203D  RDREC   LDX     ZERO          041006
130   2040          LDA     ZERO          001006
135   2043  RLOOP   TD      INPUT         E02039
140   2046          JEQ     RLOOP         302043
145   2049          RD      INPUT         D82039
150   204C          COMP    ZERO          281006
155   204F          JEQ     EXIT          30205B
160   2052          STCH    BUFFER,X      54900F
165   2055          TIX     MAXLEN        2C203A
170   2058          JLT     RLOOP         382043
175   205B  EXIT    STX     LENGTH        10100C
180   205E          RSUB                  4C0000
195
200         SUBROUTINE TO WRITE RECORD FROM BUFFER
205
206   2061  OUTPUT  BYTE    X'05'         05
207
210   2062  WRREC   LDX     ZERO          041006
215   2065  WLOOP   TD      OUTPUT        E02061
220   2068          JEQ     WLOOP         302065
225   206B          LDCH    BUFFER,X      50900F
230   206E          WD      OUTPUT        DC2061
235   2071          TIX     LENGTH        2C100C
240   2074          JLT     WLOOP         382065
245   2077          RSUB                  4C0000
255                 END     FIRST

Figure 2.18 Sample program for a one-pass assembler.



An example should help to make this process clear. Figure 2.19(a) shows
the object code and symbol table entries as they would be after scanning line
40 of the program in Fig. 2.18. The first forward reference occurred on line 15.
Since the operand (RDREC) was not yet defined, the instruction was assembled
with no value assigned as the operand address (denoted in the figure by
----). RDREC was then entered into SYMTAB as an undefined symbol
(indicated by *); the address of the operand field of the instruction (2013) was
inserted in a list associated with RDREC. A similar process was followed with
the instructions on lines 30 and 35.

Memory
address   Contents
1000      454F4600 00030000 00xxxxxx xxxxxxxx
1010      xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
  .
  .
2000      xxxxxxxx xxxxxxxx xxxxxxxx xxxxxx14
2010      100948-- --00100C 28100630 ----48--
2020      --3C2012
  .

Symbol    Value   Forward reference list
LENGTH    100C
RDREC     *       2013
THREE     1003
ZERO      1006
WRREC     *       201F
EOF       1000
ENDFIL    *       201C
RETADR    1009
BUFFER    100F
CLOOP     2012
FIRST     200F

Figure 2.19(a) Object code in memory and symbol table entries for
the program in Fig. 2.18 after scanning line 40.

Now consider Fig. 2.19(b), which corresponds to the situation after
scanning line 160. Some of the forward references have been resolved by this
time, while others have been added. When the symbol ENDFIL was defined (line
45), the assembler placed its value in the SYMTAB entry; it then inserted this
value into the instruction operand field (at address 201C) as directed by the
forward reference list. From this point on, any references to ENDFIL would
not be forward references, and would not be entered into a list. Similarly, the
definition of RDREC (line 125) resulted in the filling in of the operand address
at location 2013. Meanwhile, two new forward references have been added: to
WRREC (line 65) and EXIT (line 155). You should continue tracing through
this process to the end of the program to show yourself that all of the forward
references will be filled in properly. At the end of the program, any SYMTAB
entries that are still marked with * indicate undefined symbols. These should
be flagged by the assembler as errors.

Memory
address   Contents
1000      454F4600 00030000 00xxxxxx xxxxxxxx
1010      xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
  .
  .
2000      xxxxxxxx xxxxxxxx xxxxxxxx xxxxxx14
2010      10094820 3D00100C 28100630 202448--
2020      --3C2012 0010000C 100F0010 030C100C
2030      48----08 10094C00 00F10010 00041006
2040      001006E0 20393020 43D82039 28100630
2050      ----5490 0F
  .

Symbol    Value   Forward reference list
LENGTH    100C
RDREC     203D
THREE     1003
ZERO      1006
WRREC     *       201F, 2031
EOF       1000
ENDFIL    2024
RETADR    1009
BUFFER    100F
CLOOP     2012
FIRST     200F
MAXLEN    203A
INPUT     2039
EXIT      *       2050
RLOOP     2043

Figure 2.19(b) Object code in memory and symbol table entries for
the program in Fig. 2.18 after scanning line 160.

When the end of the program is encountered, the assembly is complete. If
no errors have occurred, the assembler searches SYMTAB for the value of the
symbol named in the END statement (in this case, FIRST) and jumps to this
location to begin execution of the assembled program.

We used an absolute program as our example because, for a load-and-go
assembler, the actual address must be known at assembly time. Of course it is
not necessary for this address to be specified by the programmer; it might be
assigned by the system. In either case, however, the assembly process would
be the same: the location counter would be initialized to the actual program
starting address.
One-pass assemblers that produce object programs as output are often
used on systems where external working-storage devices (for the intermediate
file between the two passes) are not available. Such assemblers may also be
useful when the external storage is slow or is inconvenient to use for some
other reason. One-pass assemblers that produce object programs follow a
slightly different procedure from that previously described. Forward
references are entered into lists as before. Now, however, when the definition
of a symbol is encountered, instructions that made forward references to that
symbol may no longer be available in memory for modification. In general, they
will already have been written out as part of a Text record in the object
program. In this case the assembler must generate another Text record with the
correct operand address. When the program is loaded, this address will be
inserted into the instruction by the action of the loader.
Figure 2.20 illustrates this process. The second Text record contains the
object code generated from lines 10 through 40 in Fig. 2.18. The operand
addresses for the instructions on lines 15, 30, and 35 have been generated as
0000. When the definition of ENDFIL on line 45 is encountered, the assembler
generates the third Text record. This record specifies that the value 2024 (the
address of ENDFIL) is to be loaded at location 201C (the operand address field
of the JEQ instruction on line 30). When the program is loaded, therefore, the
value 2024 will replace the 0000 previously loaded. The other forward
references in the program are handled in exactly the same way. In effect, the
services of the loader are being used to complete forward references that could
not be handled by the assembler. Of course, the object program records must
be kept in their original order when they are presented to the loader.

HCOPY  00100000107A
T00100009454F46000003000000
T00200F1514100948000000100C2810063000004800003C2012
T00201C022024
T002024190010000C100F0010030C100C4800000810094C0000F1001000
T00201302203D
T00203D1E041006001006E02039302043D8203928100630000054900F2C203A382043
T00205002205B
T00205B0710100C4C000005
T00201F022062
T002031022062
T00206218041006E0206130206550900FDC20612C100C3820654C0000
E00200F

Figure 2.20 Object program from one-pass assembler for program
in Fig. 2.18.

In this section we considered only simple one-pass assemblers that
handled absolute programs. Instruction operands were assumed to be single
symbols, and the assembled instructions contained the actual (not relative)
addresses of the operands. More advanced assembler features such as literals
were not allowed. You are encouraged to think about ways of removing some
of these restrictions (see the Exercises for this section for some suggestions).
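
The use of extra Text records to complete forward references can be sketched briefly in Python. The two functions below are hypothetical helpers written for illustration: `patch_records` produces one 2-byte Text record per pending reference when a symbol is defined, and `load` is a toy absolute loader that shows why the records must be processed in their original order. The sample data comes from Fig. 2.20.

```python
def patch_records(pending, symbol, address):
    """Return one Text record per forward reference to `symbol`.

    pending maps each undefined symbol to the operand field addresses
    of instructions that have already been written out.
    """
    records = []
    for field in pending.pop(symbol, []):
        # T + 6-digit start address + 2-digit length + 2 bytes of object code.
        records.append("T%06X%02X%04X" % (field, 2, address))
    return records


def load(records, memory):
    """Copy each Text record into memory; later records overwrite earlier ones."""
    for record in records:
        if not record.startswith("T"):
            continue
        start = int(record[1:7], 16)
        length = int(record[7:9], 16)
        memory[start:start + length] = bytes.fromhex(record[9:9 + 2 * length])


pending = {"ENDFIL": [0x201C], "WRREC": [0x201F, 0x2031]}
endfil_patch = patch_records(pending, "ENDFIL", 0x2024)
wrrec_patch = patch_records(pending, "WRREC", 0x2062)

memory = bytearray(0x3000)
load(["T00200F1514100948000000100C2810063000004800003C2012"] + endfil_patch,
     memory)
```

The generated records match the third, ninth, and tenth Text records of Fig. 2.20, and after loading, the JEQ instruction at 201B contains 302024.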

2.4.2 Multi-Pass Assemblers

In our discussion of the EQU assembler directive, we required that any symbol
used on the right-hand side (i.e., in the expression giving the value of the new
symbol) be defined previously in the source program. A similar requirement
was imposed for ORG. As a matter of fact, such a restriction is normally
applied to all assembler directives that (directly or indirectly) define symbols.

The reason for this is the symbol definition process in a two-pass
assembler. Consider, for example, the sequence

ALPHA EQU BETA
BETA EQU DELTA
DELTA RESW 1

The symbol BETA cannot be assigned a value when it is encountered during
the first pass because DELTA has not yet been defined. As a result, ALPHA
cannot be evaluated during the second pass. This means that any assembler
that makes only two sequential passes over the source program cannot resolve
such a sequence of definitions.
Restrictions such as prohibiting forward references in symbol definition
are not normally a serious inconvenience for the programmer. As a matter of
fact, such forward references tend to create difficulty for a person reading the
program as well as for the assembler. Nevertheless, some assemblers are
designed to eliminate the need for such restrictions. The general solution is a
multi-pass assembler that can make as many passes as are needed to process
the definitions of symbols. It is not necessary for such an assembler to make
more than two passes over the entire program. Instead, the portions of the
program that involve forward references in symbol definition are saved
during Pass 1. Additional passes through these stored definitions are made as
the assembly progresses. This process is followed by a normal Pass 2.

There are several ways of accomplishing the task outlined above. The
method we describe involves storing those symbol definitions that involve
forward references in the symbol table. This table also indicates which symbols
are dependent on the values of others, to facilitate symbol evaluation.

Figure 2.21(a) shows a sequence of symbol-defining statements that
involve forward references; the other parts of the source program are not
important for our discussion, and have been omitted. The following parts of
Fig. 2.21 show information in the symbol table as it might appear after
processing each of the source statements shown.

Figure 2.21(b) displays symbol table entries resulting from Pass 1
processing of the statement

HALFSZ EQU MAXLEN/2

MAXLEN has not yet been defined, so no value for HALFSZ can be
computed. The defining expression for HALFSZ is stored in the symbol table in
place of its value. The entry &1 indicates that one symbol in the defining
expression is undefined. In an actual implementation, of course, this definition
might be stored at some other location. SYMTAB would then simply contain a
pointer to the defining expression. The symbol MAXLEN is also entered in the
symbol table, with the flag * identifying it as undefined. Associated with this
entry is a list of the symbols whose values depend on MAXLEN (in this case,
HALFSZ). (Note the similarity to the way we handled forward references in a
one-pass assembler.)

(a)
1   HALFSZ  EQU   MAXLEN/2
2   MAXLEN  EQU   BUFEND-BUFFER
3   PREVBT  EQU   BUFFER-1
      .
      .
4   BUFFER  RESB  4096
5   BUFEND  EQU   *

(b)
Symbol   Flag   Value or definition   Dependent symbols
HALFSZ   &1     MAXLEN/2
MAXLEN   *                            HALFSZ

(c)
Symbol   Flag   Value or definition   Dependent symbols
BUFEND   *                            MAXLEN
HALFSZ   &1     MAXLEN/2
MAXLEN   &2     BUFEND-BUFFER         HALFSZ
BUFFER   *                            MAXLEN

(d)
Symbol   Flag   Value or definition   Dependent symbols
BUFEND   *                            MAXLEN
HALFSZ   &1     MAXLEN/2
PREVBT   &1     BUFFER-1
MAXLEN   &2     BUFEND-BUFFER         HALFSZ
BUFFER   *                            MAXLEN, PREVBT

(e)
Symbol   Flag   Value or definition   Dependent symbols
BUFEND   *                            MAXLEN
HALFSZ   &1     MAXLEN/2
PREVBT          1033
MAXLEN   &1     BUFEND-BUFFER         HALFSZ
BUFFER          1034

(f)
Symbol   Flag   Value or definition   Dependent symbols
BUFEND          2034
HALFSZ          800
PREVBT          1033
MAXLEN          1000
BUFFER          1034

Figure 2.21 Example of multi-pass assembler operation.


The same procedure is followed with the definition of MAXLEN [see
Fig. 2.21(c)]. In this case there are two undefined symbols involved in the
definition: BUFEND and BUFFER. Both of these are entered into SYMTAB with
lists indicating the dependence of MAXLEN upon them. Similarly, the
definition of PREVBT causes this symbol to be added to the list of dependencies
on BUFFER [as shown in Fig. 2.21(d)].

So far we have simply been saving symbol definitions for later processing.
The definition of BUFFER on line 4 lets us begin evaluation of some of these
symbols. Let us assume that when line 4 is read, the location counter contains
the hexadecimal value 1034. This address is stored as the value of BUFFER.
The assembler then examines the list of symbols that are dependent on
BUFFER. The symbol table entry for the first symbol in this list (MAXLEN)
shows that it depends on two currently undefined symbols; therefore,
MAXLEN cannot be evaluated immediately. Instead, the &2 is changed to &1
to show that only one symbol in the definition (BUFEND) remains undefined.
The other symbol in the list (PREVBT) can be evaluated because it depends
only on BUFFER. The value of the defining expression for PREVBT is
calculated and stored in SYMTAB. The result is shown in Fig. 2.21(e).

The remainder of the processing follows the same pattern. When BUFEND
is defined by line 5, its value is entered into the symbol table. The list
associated with BUFEND then directs the assembler to evaluate MAXLEN, and
entering a value for MAXLEN causes the evaluation of the symbol in its list
(HALFSZ). As shown in Fig. 2.21(f), this completes the symbol definition
process. If any symbols remained undefined at the end of the program, the
assembler would flag them as errors.

The procedure we have just described applies to symbols defined by
assembler directives like EQU. You are encouraged to think about how this
method could be modified to allow forward references in ORG statements as
well.
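
The symbol table processing traced in Fig. 2.21 can be modeled compactly. The Python sketch below is an illustration under several assumptions: expressions contain only symbols, decimal constants, and the operators +, -, and /, evaluated strictly left to right; the set of missing names plays the role of the &n count; and the class name `MultiPassSymtab` and its methods are hypothetical.

```python
import re

NAME = re.compile(r"[A-Za-z][A-Za-z0-9]*")
TOKEN = re.compile(r"[A-Za-z][A-Za-z0-9]*|\d+|[-+/]")


def evaluate(expression, values):
    """Evaluate terms joined by +, -, / from left to right (a simplification)."""
    tokens = TOKEN.findall(expression)

    def term(token):
        return int(token) if token.isdigit() else values[token]

    result = term(tokens[0])
    for operator, token in zip(tokens[1::2], tokens[2::2]):
        if operator == "+":
            result += term(token)
        elif operator == "-":
            result -= term(token)
        else:
            result //= term(token)
    return result


class MultiPassSymtab:
    def __init__(self):
        self.values = {}      # symbols whose values are known
        self.pending = {}     # symbol -> (expression, undefined names in it)
        self.dependents = {}  # undefined name -> symbols waiting on it

    def define_expr(self, symbol, expression):
        """Handle `symbol EQU expression`, saving it if it cannot be evaluated."""
        missing = {n for n in NAME.findall(expression) if n not in self.values}
        if not missing:
            self.define_value(symbol, evaluate(expression, self.values))
            return
        self.pending[symbol] = (expression, missing)
        for name in missing:
            self.dependents.setdefault(name, []).append(symbol)

    def define_value(self, symbol, value):
        """Record a value, then revisit every symbol that depends on it."""
        self.values[symbol] = value
        for waiter in self.dependents.pop(symbol, []):
            expression, missing = self.pending[waiter]
            missing.discard(symbol)  # for example, &2 becomes &1
            if not missing:
                del self.pending[waiter]
                self.define_value(waiter, evaluate(expression, self.values))

    def undefined(self):
        """Symbols still unresolved; at end of program these are errors."""
        return sorted(self.pending)


table = MultiPassSymtab()
table.define_expr("HALFSZ", "MAXLEN/2")       # line 1
table.define_expr("MAXLEN", "BUFEND-BUFFER")  # line 2
table.define_expr("PREVBT", "BUFFER-1")       # line 3
table.define_value("BUFFER", 0x1034)          # line 4, location counter 1034
table.define_value("BUFEND", 0x2034)          # line 5, EQU *
```

After line 5 is processed, the table holds the values shown in Fig. 2.21(f), and no symbol remains pending.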

2.5 IMPLEMENTATION EXAMPLES

We discussed many of the most common assembler features in the preceding
sections. However, the variety of machines and assembler languages is very
great. Most assemblers have at least some unusual features that are related to
machine architecture or language design. In this section we discuss three
examples of assemblers for real machines. We are obviously unable to give a
full description of any of these in the space available. Instead we focus on some
of the most interesting or unusual features of each assembler. We are also
particularly interested in areas where the assembler design differs from the
basic algorithm and data structures described earlier.

The assembler examples we discuss are for the Pentium (x86), SPARC, and
PowerPC architectures. You may want to review the descriptions of these
architectures in Chapter 1 before proceeding.

2.5.1 MASM Assembler

This section describes some of the features of the Microsoft MASM assembler
for Pentium and other x86 systems. Further information about MASM can be
found in Barkakati (1992).

As we discussed in Section 1.4.2, the programmer of an x86 system views
memory as a collection of segments. An MASM assembler language program
is written as a collection of segments. Each segment is defined as belonging to
a particular class, corresponding to its contents. Commonly used classes are
CODE, DATA, CONST, and STACK.

During program execution, segments are addressed via the x86 segment
registers. In most cases, code segments are addressed using register CS, and
stack segments are addressed using register SS. These segment registers are
automatically set by the system loader when a program is loaded for
execution. Register CS is set to indicate the segment that contains the starting
label specified in the END statement of the program. Register SS is set to
indicate the last stack segment processed by the loader.

Data segments (including constant segments) are normally addressed
using DS, ES, FS, or GS. The segment register to be used can be specified
explicitly by the programmer (by writing it as part of the assembler language
instruction). If the programmer does not specify a segment register, one is
selected by the assembler.

By default, the assembler assumes that all references to data segments use
register DS. This assumption can be changed by the assembler directive
ASSUME. For example, the directive

ASSUME ES:DATASEG2

tells the assembler to assume that register ES indicates the segment
DATASEG2. Thus, any references to labels that are defined in DATASEG2 will
be assembled using register ES. It is also possible to collect several segments
into a group and use ASSUME to associate a segment register with the group.

Registers DS, ES, FS, and GS must be loaded by the program before they
can be used to address data segments. For example, the instructions

MOV AX, DATASEG2
MOV ES, AX

would set ES to indicate the data segment DATASEG2. Notice the similarities
between the ASSUME directive and the BASE directive we discussed for
SIC/XE. The BASE directive tells a SIC/XE assembler the contents of register
B; the programmer must provide executable instructions to load this value
into the register. Likewise, ASSUME tells MASM the contents of a segment
register; the programmer must provide instructions to load this register when
the program is executed.

Jump instructions are assembled in two different ways, depending on
whether the target of the jump is in the same code segment as the jump
instruction. A near jump is a jump to a target in the same code segment; a far
jump is a jump to a target in a different code segment. A near jump is assembled
using the current code segment register CS. A far jump must be assembled using
a different segment register, which is specified in an instruction prefix. The
assembled machine instruction for a near jump occupies 2 or 3 bytes (depending
upon whether the jump address is within 128 bytes of the current instruction).
The assembled instruction for a far jump requires 5 bytes.
Forward references to labels in the source program can cause problems.
For example, consider a jump instruction like

JMP TARGET

If the definition of the label TARGET occurs in the program before the JMP
instruction, the assembler can tell whether this is a near jump or a far jump.
However, if this is a forward reference to TARGET, the assembler does not
know how many bytes to reserve for the instruction.

By default, MASM assumes that a forward jump is a near jump. If the
target of the jump is in another code segment, the programmer must warn the
assembler by writing

JMP FAR PTR TARGET

If the jump address is within 128 bytes of the current instruction, the
programmer can specify the shorter (2-byte) near jump by writing

JMP SHORT TARGET

If the JMP to TARGET is a far jump, and the programmer does not specify FAR
PTR, a problem occurs. During Pass 1, the assembler reserves 3 bytes for the
jump instruction. However, the actual assembled instruction requires 5 bytes.
In the earlier versions of MASM, this caused an assembly error (called a phase
error). In later versions of MASM, the assembler can repeat Pass 1 to generate
the correct location counter values.

Notice the similarities between the far jump and the forward references in
SIC/XE that require the use of extended format instructions.
There are also many other situations in which the length of an assembled
instruction depends on the operands that are used. For example, the operands
of an ADD instruction may be registers, memory locations, or immediate
operands. Immediate operands may occupy from 1 to 4 bytes in the
instruction. An operand that specifies a memory location may take varying amounts
of space in the instruction, depending upon the location of the operand.
This means that Pass 1 of an x86 assembler must be considerably more
complex than Pass 1 of a SIC assembler. The first pass of the x86 assembler
must analyze the operands of an instruction, in addition to looking at the
operation code. The operation code table must also be more complicated, since it
must contain information on which addressing modes are valid for each
operand.
Segments in an MASM source program can be written in more than one
part. If a SEGMENT directive specifies the same name as a previously defined
segment, it is considered to be a continuation of that segment. All of the parts
of a segment are gathered together by the assembly process. Thus, segments
can perform a similar function to the program blocks we discussed for
SIC/XE.
References between segments that are assembled together are automatically
handled by the assembler. External references between separately assembled
modules must be handled by the linker. The MASM directive PUBLIC
has approximately the same function as the SIC/XE directive EXTDEF. The
MASM directive EXTRN has approximately the same function as EXTREF. We
will consider the action of the linker in more detail in the next chapter.
The object program from MASM may be in several different formats, to
allow easy and efficient execution of the program in a variety of operating
environments. MASM can also produce an instruction timing listing that
shows the number of clock cycles required to execute each machine
instruction. This allows the programmer to exercise a great deal of control in
optimizing timing-critical sections of code.

2.5.2 SPARC Assembler

This section describes some of the features of the SunOS SPARC assembler.
Further information about this assembler can be found in Sun Microsystems
(1994a).
A SPARC assembler language program is divided into units called sections.
The assembler provides a set of predefined section names. Some examples of
these are

.TEXT      Executable instructions
.DATA      Initialized read/write data
.RODATA    Read-only data
.BSS       Uninitialized data areas

It is also possible to define other sections, specifying section attributes such as
“executable” and “writeable.”
The programmer can switch between sections at any time in the source
program by using assembler directives. The assembler maintains a separate
location counter for each named section. Each time the assembler switches to a
different section, it also switches to the location counter associated with that
section. In this way, sections are similar to the program blocks we discussed
for SIC. However, references between different sections are resolved by the
linker, not by the assembler.
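
The bookkeeping described above, one location counter per named section, can be sketched as follows. This is a minimal Python illustration; the class name SectionCounters and its methods are hypothetical and do not correspond to anything in the SunOS assembler itself.

```python
class SectionCounters:
    """Keeps a separate location counter for each named section."""

    def __init__(self, default=".TEXT"):
        self.locctr = {default: 0}   # section name -> next free offset
        self.current = default

    def switch(self, name):
        """Switch to section `name`, resuming its saved location counter."""
        self.locctr.setdefault(name, 0)
        self.current = name

    def emit(self, nbytes):
        """Reserve nbytes in the current section; return (section, offset)."""
        offset = self.locctr[self.current]
        self.locctr[self.current] = offset + nbytes
        return self.current, offset


sc = SectionCounters()
a = sc.emit(4)          # first instruction in .TEXT
sc.switch(".DATA")
b = sc.emit(8)          # a data item in .DATA
sc.switch(".TEXT")
c = sc.emit(4)          # .TEXT resumes where it left off
```

Switching back to .TEXT continues at offset 4 rather than restarting at 0, which is the behavior the paragraph attributes to both SPARC sections and SIC program blocks.
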
By default, symbols used in a source program are assumed to be local to
that program. (However, a section may freely refer to local symbols defined in
another section of the same program.) Symbols that are used in linking
separately assembled programs may be declared to be either global or weak. A
global symbol is either a symbol that is defined in the program and made
accessible to others, or a symbol that is referenced in a program and defined
externally. (Notice that this combines the functions of the EXTDEF and EXTREF
directives we discussed for SIC.) A weak symbol is similar to a global symbol.
However, the definition of a weak symbol may be overridden by a global
symbol with the same name. Also, weak symbols may remain undefined when the
program is linked, without causing an error.
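
The two rules for weak symbols can be made concrete with a small Python sketch of link-time resolution. The function resolve and its input format are hypothetical. Treating two global definitions of the same name as an error, and keeping the first of two weak definitions, are assumptions of this sketch that the text does not state.

```python
def resolve(definitions, references):
    """Resolve symbol references the way the paragraph above describes.

    definitions: list of (name, binding, value), binding "global" or "weak".
    references:  list of (name, binding) for symbols used but not defined locally.
    Returns a dict name -> value (None for a weak symbol left undefined).
    """
    table = {}
    for name, binding, value in definitions:
        if name not in table:
            table[name] = (binding, value)
        elif table[name][0] == "weak" and binding == "global":
            table[name] = (binding, value)      # global overrides weak
        elif table[name][0] == "global" and binding == "global":
            # Assumption of this sketch: duplicate globals are rejected.
            raise ValueError(f"duplicate global symbol {name}")
        # Otherwise a weak definition arrives second and is ignored.

    resolved = {}
    for name, binding in references:
        if name in table:
            resolved[name] = table[name][1]
        elif binding == "weak":
            resolved[name] = None               # may remain undefined
        else:
            raise ValueError(f"undefined global symbol {name}")
    return resolved


result = resolve(
    definitions=[("BUF", "weak", 0x100), ("BUF", "global", 0x200)],
    references=[("BUF", "global"), ("HOOK", "weak")],
)
```

Here the global definition of BUF replaces the weak one, and the weak reference to HOOK is left undefined without an error.
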
The object file written by the SPARC assembler contains translated
versions of the segments of the program and a list of relocation and linking
operations that need to be performed. References between different segments of the
same program are resolved when the program is linked. The object program
also includes a symbol table that describes the symbols used during relocation
and linking (global symbols, weak symbols, and section names).
SPARC assembler language has an unusual feature that is directly related
to the machine architecture. As we discussed in Section 1.5.1, SPARC branch
instructions (including subroutine calls) are delayed branches. The instruction
immediately following a branch instruction is actually executed before the
branch is taken. For example, in the instruction sequence

        CMP     %L0, 10
        BLE     LOOP
        ADD     %L2, %L3, %L4

the ADD instruction is executed before the conditional branch BLE. This ADD
instruction is said to be in the delay slot of the branch; it is executed regardless
of whether or not the conditional branch is taken.
To simplify debugging, SPARC assembly language programmers often
place NOP (no-operation) instructions in delay slots when a program is
written. The code is later rearranged to move useful instructions into the delay
slots. For example, the instruction sequence illustrated above might originally
have been

LOOP:   .
        .
        ADD     %L2, %L3, %L4
        CMP     %L0, 10
        BLE     LOOP
        NOP

Moving the ADD instruction into the delay slot would produce the version
discussed earlier. (Notice that the CMP instruction could not be moved into
the delay slot, because it sets the condition codes that must be tested by the
BLE.)
However, there is another possibility. Suppose that the original version of
the loop had been

LOOP:   ADD     %L2, %L3, %L4
        CMP     %L0, 10
        BLE     LOOP
        NOP

Now the ADD instruction is logically the first instruction in the loop. It could
still be moved into the delay slot, as previously described. However, this
would create a problem. On the last execution of the loop, the ADD instruction
(which is the beginning of the next loop iteration) should not be executed.
The SPARC architecture defines a solution to this problem. A conditional
branch instruction like BLE can be annulled. If a branch is annulled, the in-
struction in its delay slot is executed if the branch is taken, but not executed if
the branch is not taken. Annulled branches are indicated in SPARC assembler
language by writing “,A” following the operation code. Thus the loop just dis-
cussed could be rewritten as

LOOP:   .
        .
        CMP     %L0, 10
        BLE,A   LOOP
        ADD     %L2, %L3, %L4
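
The effect of the annul option on a loop of this kind can be checked with a short Python model. It covers only conditional branches, as described above; the function names delay_slot_executes and run_loop are hypothetical, and the model assumes the closing branch is taken on every iteration except the last.

```python
def delay_slot_executes(branch_taken, annulled):
    """Does the delay-slot instruction of a conditional branch execute?"""
    if annulled:
        return branch_taken      # ",A": executed only if the branch is taken
    return True                  # ordinary delayed branch: always executed


def run_loop(iterations, annulled):
    """Count executions of the delay-slot instruction over a whole loop.

    The closing branch is taken on every iteration except the last one.
    """
    executed = 0
    for i in range(iterations):
        taken = i < iterations - 1
        if delay_slot_executes(taken, annulled):
            executed += 1
    return executed


plain = run_loop(5, annulled=False)     # ADD in the slot of BLE
annulled = run_loop(5, annulled=True)   # ADD in the slot of BLE,A
```

With an ordinary BLE the instruction in the delay slot runs on all five passes, including the final one where the loop exits. With BLE,A it runs four times, so the ADD that belongs to the next iteration is skipped when there is no next iteration.
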

The SPARC assembler provides warning messages to alert the programmer to
possible problems with delay slots. For example, a label on an instruction in a
delay slot usually indicates an error. A segment that ends with a branch
instruction (with nothing in the delay slot) is also likely to be incorrect. Before
the branch is executed, the machine will attempt to execute whatever happens
to be stored at the memory location immediately following the branch.

2.5.3 AIX Assembler

This section describes some of the features of the AIX assembler for PowerPC
and other similar systems. Further information about this assembler can be
found in IBM (1994b).
The AIX assembler includes support for various models of PowerPC
microprocessors, as well as earlier machines that implement the original POWER
architecture. The programmer can declare which architecture is being used
with the assembler directive .MACHINE. The assembler automatically checks
for POWER or PowerPC instructions that are not valid for the specified
environment. When the object program is generated, the assembler includes a flag
that indicates which processors are capable of running the program. This flag
depends on which instructions are actually used in the program, not on the
.MACHINE directive. For example, a PowerPC program that contains only
instructions that are also in the original POWER architecture would be
executable on either type of system.
As we discussed in Section 1.5.2, PowerPC load and store instructions use
a base register and a displacement value to specify an address in memory. Any
of the general-purpose registers (except GPR0) can be used as a base register.
Decisions about which registers to use in this way are left to the programmer.
In a long program, it is not unusual to have several different base registers in
use at the same time. The programmer specifies which registers are available
for use as base registers, and the contents of these registers, with the .USING
assembler directive. This is similar in function to the BASE statement in our
SIC/XE assembler language. Thus the statements

        .USING  LENGTH,1
        .USING  BUFFER,4

would identify GPR1 and GPR4 as base registers. GPR1 would be assumed to
contain the address of LENGTH, and GPR4 would be assumed to contain the
address of BUFFER. As with SIC/XE, the programmer must provide
instructions to place these values into the registers at execution time. Additional
.USING statements may appear at any point in the program. If a base register
is to be used later for some other purpose, the programmer indicates with the
.DROP statement that this register is no longer available for addressing
purposes.
This additional flexibility in register usage means more work for the
assembler. A base register table is used to remember which of the general-purpose
registers are currently available as base registers, and what base addresses
they contain. Processing a .USING statement causes an entry to be made in
this table (or an existing entry to be modified); processing a .DROP statement
removes the corresponding table entry. For each instruction whose operand is
an address in memory, the assembler scans the table to find a base register that
can be used to address that operand. If more than one register can be used, the
assembler selects the base register that results in the smallest signed
displacement. If no suitable base register is available, the instruction cannot be
assembled. The process of displacement calculation is the same as we described for
SIC/XE.
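
The base register table lends itself to a small Python sketch. The class BaseRegisterTable and its method names are hypothetical. The sketch assumes a 16-bit signed displacement field (-32768 to 32767), and it reads "smallest signed displacement" as the displacement of smallest magnitude, which is one possible interpretation of the text.

```python
class BaseRegisterTable:
    """Tracks which registers are usable as base registers, and their contents."""

    def __init__(self):
        self.entries = {}                # register number -> base address

    def using(self, address, register):
        """Process a .USING statement: add or modify a table entry."""
        self.entries[register] = address

    def drop(self, register):
        """Process a .DROP statement: remove the table entry."""
        self.entries.pop(register, None)

    def select(self, target, lo=-32768, hi=32767):
        """Return (register, displacement) for the operand address `target`."""
        best = None
        for register, base in self.entries.items():
            disp = target - base
            in_range = lo <= disp <= hi
            if in_range and (best is None or abs(disp) < abs(best[1])):
                best = (register, disp)
        if best is None:
            raise ValueError("no suitable base register: cannot assemble")
        return best


table = BaseRegisterTable()
table.using(0x1000, 1)                   # e.g. .USING LENGTH,1
table.using(0x2000, 4)                   # e.g. .USING BUFFER,4
first = table.select(0x2010)             # both registers reach; GPR4 is closer
table.drop(4)                            # .DROP 4
second = table.select(0x2010)            # only GPR1 remains
```

The addresses 0x1000 and 0x2000 are made-up values for LENGTH and BUFFER. After the .DROP, the same operand is still addressable, but only through GPR1 and with a much larger displacement.
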
The AIX assembler language also allows the programmer to write base
registers and displacements explicitly in the source program. For example, the
instruction

L 2,8(4)

specifies an operand address that is 8 bytes past the address contained in
GPR4. This form of addressing may be useful when some register is known to
contain the starting address of a table or data record, and the programmer
wishes to refer to a fixed location within that table or record. The assembler
simply inserts the specified values into the object code instruction: in this case
base register GPR4 and displacement 8. The base register table is not involved,
and the register used in this way need not have appeared in a .USING
statement.
An AIX assembler language program can be divided into control sections
using the .CSECT assembler directive. Each control section has an associated
storage mapping class that describes the kind of data it contains. Some of the
most commonly used storage mapping classes are PR (executable
instructions), RO (read-only data), RW (read/write data), and BS (uninitialized
read/write data). AIX control sections combine some of the features of the SIC
control sections and program blocks that we discussed in Section 2.3. One
control section may consist of several different parts of the source program. These
parts are gathered together by the assembler, as with SIC program blocks. The
control sections themselves remain separate after assembly, and are handled
independently by the loader or linkage editor.
The AIX assembler language provides a special type of control section
called a dummy section. Data items included in a dummy section do not
actually become part of the object program; they serve only to define labels within
the section. Dummy sections are most commonly used to describe the layout
of a record or table that is defined externally. The labels define symbols that
can be used to address fields in the record or table (after an appropriate base
register is established). AIX also provides common blocks, which are
uninitialized blocks of storage that can be shared between independently assembled
programs.
Linking of control sections can be accomplished using methods like the
ones we discussed for SIC. The assembler directive .GLOBL makes a symbol
available to the linker, and the directive .EXTERN declares that a symbol is
defined in another source module. These directives are essentially the same as
the SIC directives EXTDEF and EXTREF. Expressions that involve relocatable
and external symbols are classified and handled using rules similar to those
discussed in Sections 2.3.3 and 2.3.5.
The AIX assembler also provides a different method for linking control
sections. By using assembler directives, the programmer can create a table of
contents (TOC) for the assembled program. The TOC contains addresses of control
sections and global symbols defined within the control sections. To refer to one
of these symbols, the program retrieves the needed address from the TOC, and
then uses that address to refer to the needed data item or procedure. (Some
types of frequently used data items can be stored directly in the TOC for
efficiency of retrieval.) If all references to external symbols are done in this way,
then the TOC entries are the only parts of the program involved in relocation
and linking when the program is loaded.
The AIX assembler itself has a two-pass structure similar to the one we
discussed for SIC. However, there are some significant differences. The first pass
of the AIX assembler writes a listing file that contains warnings and error
messages. If errors are found during the first pass, the assembler terminates and
does not continue to the second pass. In this case, the assembly listing contains
only errors that could be detected during Pass 1.
If no errors are detected during the first pass, the assembler proceeds to
Pass 2. The second pass reads the source program again, instead of using an
intermediate file as we discussed for SIC. This means that location counter
values must be recalculated during Pass 2. It also means that any warning
messages that were generated during Pass 1 (but were not serious enough to
terminate the assembly) are lost. The assembly listing will contain only errors
and warnings that are generated during Pass 2.
Assembled control sections are placed into the object program according to
their storage mapping class. Executable instructions, read-only data, and
various kinds of debugging tables are assigned to an object program section
named .TEXT. Read/write data and TOC entries are assigned to an object
program section named .DATA. Uninitialized data is assigned to a section named
.BSS. When the object program is generated, the assembler first writes all of
the .TEXT control sections, followed by all of the .DATA control sections
except for the TOC. The TOC is written after the other .DATA control sections.
Relocation and linking operations are specified by entries in a relocation table,
similar to the Modification records we discussed for SIC.
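
The placement rules above can be summarized in a short Python sketch. The storage mapping classes PR, RO, RW, and BS are the ones named earlier in this section; the tag "TOC" used here to mark the table of contents is a placeholder chosen for the sketch, and the function name layout and its input format are hypothetical.

```python
# Storage mapping class -> object program section, following the text.
# "TOC" is a placeholder tag for this sketch, not an AIX class name.
SECTION_OF_CLASS = {
    "PR": ".TEXT",    # executable instructions
    "RO": ".TEXT",    # read-only data
    "RW": ".DATA",    # read/write data
    "TOC": ".DATA",   # table of contents entries
    "BS": ".BSS",     # uninitialized read/write data
}


def layout(csects):
    """Group control sections by object section, in the order they are written.

    csects: list of (name, storage_class) pairs in source order.
    Returns a dict: object section name -> list of control section names.
    """
    result = {".TEXT": [], ".DATA": [], ".BSS": []}
    toc = []
    for name, storage_class in csects:
        if storage_class == "TOC":
            toc.append(name)             # held back until the end of .DATA
        else:
            result[SECTION_OF_CLASS[storage_class]].append(name)
    result[".DATA"].extend(toc)          # the TOC follows the other .DATA csects
    return result


placed = layout([
    ("buf", "BS"),
    ("toc", "TOC"),
    ("main", "PR"),
    ("vars", "RW"),
    ("msg", "RO"),
])
```

Whatever order the control sections have in the source program, the instruction and read-only sections end up together in .TEXT, and the TOC is placed last within .DATA.
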

EXERCISES

Section 2.1

1. Apply the algorithm described in Fig. 2.4 to assemble the source
program in Fig. 2.1. Your results should be the same as those shown in
Figs. 2.2 and 2.3.
2. Apply the algorithm described in Fig. 2.4 to assemble the following
SIC source program:

SUM     START   4000
FIRST   LDX     ZERO
        LDA     ZERO
LOOP    ADD     TABLE,X
        TIX     COUNT
        JLT     LOOP
        STA     TOTAL
        RSUB
TABLE   RESW    2000
COUNT   RESW    1
ZERO    WORD    0
TOTAL   RESW    1
        END     FIRST
3. As mentioned in the text, a number of operations in the algorithm of
Fig. 2.4 are not explicitly spelled out. (One example would be
scanning the instruction operand field for the modifier “,X”.) List as
many of these implied operations as you can, and think about how
they might be implemented.
4. Suppose that you are to write a “disassembler”: that is, a system
program that takes an ordinary object program as input and
produces a listing of the source version of the program. What tables and
data structures would be required, and how would they be used?
How many passes would be needed? What problems would arise in
recreating the source program?
5. Many assemblers use free-format input. Labels must start in Column
1 of the source statement, but other fields (opcode, operands,
comments) may begin in any column. The various fields are separated by
blanks. How could our assembler logic be modified to allow this?
6. The algorithm in Fig. 2.4 provides for the detection of some assembly
errors; however, there are many more such errors that might occur.
List error conditions that might arise during the assembly of a SIC
program. When and how would each type of error be detected, and
what action should the assembler take for each?
7. Suppose that the SIC assembler language is changed to include a
new form of the RESB statement, such as

        RESB    n'c'

which reserves n bytes of memory and initializes all of these bytes to
the character 'c'. For example, line 105 in Fig. 2.5 could be changed to

BUFFER  RESB    4096' '

This feature could be implemented by simply generating the required
number of bytes in Text records. However, this could lead to a
large increase in the size of the object program; for example, the
object program in Fig. 2.8 would be about 40 times its previous size.
Propose a way to implement this new form of RESB without such a
large increase in object program size.

8. Suppose that you have a two-pass assembler that is written
according to the algorithm in Fig. 2.4. In the case of a duplicate symbol,
this assembler would give an error message only for the second (i.e.,
duplicate) definition. For example, it would give an error message
only for line 5 of the program below.

1       P3      START   1000
2               LDA     ALPHA
3               STA     ALPHA
4       ALPHA   RESW    1
5       ALPHA   WORD    0
6               END

Suppose that you want to change the assembler to give error
messages for all definitions of a doubly defined symbol (e.g., lines 4 and
5), and also for all references to a doubly defined symbol (e.g., lines 2
and 3). Describe the changes you would make to accomplish this. In
making this modification, you should change the existing assembler
as little as possible.

9. Suppose that you have a two-pass assembler that is written
according to the algorithm in Fig. 2.4. You want to change this assembler so
that it gives a warning message for labels that are not referenced in
the program, as illustrated by the following example.

B3      START   1000
        LDA     DELTA
        ADD     BETA
LOOP    STA     DELTA
Warning: label is never referenced
        RSUB
ALPHA   RESW    1
Warning: label is never referenced
BETA    RESW    1
DELTA   RESW    1
        END

The warning messages should appear in the assembly listing directly
below the line that contains the unreferenced label, as shown above.
Describe the changes you would make in the assembler to add this
new diagnostic feature. In making this modification, you should
change the existing assembler as little as possible.

Section 2.2

1. Could the assembler decide for itself which instructions need to be
assembled using extended format? (This would avoid the necessity
for the programmer to code + in such instructions.)
2. As we have described it, the BASE statement simply gives
information to the assembler. The programmer must also write an instruction
like LDB to load the correct value into the base register. Could the
assembler automatically generate the LDB instruction from the BASE
statement? If so, what would be the advantages and disadvantages
of doing this?
3. Generate the object code for each statement in the following SIC/XE
program:

SUM     START   0
FIRST   LDX     #0
        LDA     #0
       +LDB     #TABLE2
        BASE    TABLE2
LOOP    ADD     TABLE,X
        ADD     TABLE2,X
        TIX     COUNT
        JLT     LOOP
       +STA     TOTAL
        RSUB
COUNT   RESW    1
TABLE   RESW    2000
TABLE2  RESW    2000
TOTAL   RESW    1
        END     FIRST

4. Generate the complete object program for the source program given
in Exercise 3.
5. Modify the algorithm described in Fig. 2.4 to handle all of the
SIC/XE addressing modes discussed. How would these modifica-
tions be reflected in the assembler designs discussed in Chapter 8?
6. Modify the algorithm described in Fig. 2.4 to handle relocatable
programs. How would these modifications be reflected in the assembler
designs discussed in Chapter 8?
7. Suppose that you are writing a disassembler for SIC/XE (see Exercise
2.1.4). How would your disassembler deal with the various
addressing modes and instruction formats?
8. Our discussion of SIC/XE Format 4 instructions specified that the
20-bit “address” field should contain the actual target address, and
that addressing mode bits b and p should be set to 0. (That is, the
instruction should contain a direct address; it should not use base
relative or program-counter relative addressing.)
However, it would be possible to use program-counter relative
addressing with Format 4. In that case, the “address” field would
actually contain a displacement, and bit p would be set to 1. For example,
the instruction on line 15 in Fig. 2.6 could be assembled as

0006    CLOOP   +JSUB   RDREC           4B30102C

(using program-counter relative addressing with displacement
102C).
What would be the advantages (if any) of assembling Format 4
instructions in this way? What would be the disadvantages (if any)?
Are there any situations in which it would not be possible to
assemble a Format 4 instruction using program-counter relative
addressing?

9. Our Modification record format is well suited for SIC/XE programs
because all address fields in instructions and data words fall neatly
into half-bytes. What sort of Modification record could we use if this
were not the case (that is, if address fields could begin anywhere
within a byte and could be of any length)?
10. Suppose that we made the program in Fig. 2.1 a relocatable program.
This program is written for the standard version of SIC, so all operand
addresses are actual addresses, and there is only one instruction
format. Nearly every instruction in the object program would need to
have its operand address modified at load time. This would mean a
large number of Modification records (more than doubling the size of
the object program). How could we include the required relocation
information without this large increase in object program size?
11. Suppose that you are writing an assembler for a machine that has
only program-counter relative addressing. (That is, there are no
direct-addressing instruction formats and no base relative addressing.)
Suppose that you wish to assemble an instruction whose operand is
an absolute address in memory, for example,

        LDA     100

to load register A from address (hexadecimal) 100 in memory. How
might such an instruction be assembled in a relocatable program?
What relocation operations would be required?
12. Suppose that you are writing an assembler for a machine on which
the length of an assembled instruction depends upon the type of the
operand. Consider, for example, the following three fragments of
code:

a.          ADD     ALPHA

    ALPHA   DC      I(3)

b.          ADD     ALPHA

    ALPHA   DC      F(3.1)

c.          ADD     ALPHA

    ALPHA   DC      D(3.14159)

In case (a), ALPHA is an integer operand; the ADD instruction
generates 2 bytes of object code. In case (b), ALPHA is a single-precision
floating-point operand; the ADD instruction generates 3 bytes of
object code. In case (c), ALPHA is a double-precision floating-point
operand; the ADD instruction generates 4 bytes of object code.
What special problems does such a machine present for an
assembler? Briefly describe how you would solve these problems; that is,
how your assembler for this machine would be different from the
assembler structure described in Section 2.1.

Section 2.3

1. Modify the algorithm described in Fig. 2.4 to handle literals.
2. In the program of Fig. 2.9, could we have used literals on lines 135
and 145? Why might we prefer not to use literals here?
3. With a minor extension to our literal notation, we could write the
instruction on line 55 of Fig. 2.9 as

        LDA     =W'3'

specifying as the literal operand a word with the value 3. Would this
be a good idea?
4. Immediate operands and literals are both ways of specifying an
operand value in a source statement. What are the advantages and
disadvantages of each? When might each be preferable to the other?
5. Suppose that you have a two-pass SIC/XE assembler that does not
support literals. Now you want to modify the assembler to handle
literals. However, you want to place the literal pool at the beginning
of the assembled program, not at the end as is commonly done. (You
do not have to worry about LTORG statements; your assembler
should always place all literals in a pool at the beginning of the
program.) Describe how you could accomplish this. If possible, you
should do so without adding another pass to the assembler. Be sure
to describe any data structures that you may need, and explain how
they are used in the assembler.
6. Suppose we made the following changes to the program in Fig. 2.9:
a. Delete the LTORG statement on line 93.
b. Change the statement on line 45 to +LDA... .
c. Change the operands on lines 135 and 145 to use literals (and
delete line 185).
Show the resulting object code for lines 45, 135, 145, 215, and 230.
Also show the literal pool with addresses and data values. Note: you
do not need to retranslate the entire program to do this.
7. Assume that the symbols ALPHA and BETA are labels in a source
program. What is the difference between the following two
sequences of statements?

            LDA     ALPHA-BETA

            LDA     ALPHA
            SUB     BETA

8. What is the difference between the following sequences of
statements?
a.          LDA     #3
b.  THREE   EQU     3

            LDA     #THREE

c.  THREE   EQU     3

LDA THREE

9. Modify the algorithm described in Fig. 2.4 to handle multiple
program blocks.
10. Modify the algorithm described in Fig. 2.4 to handle multiple control
sections.
11. Suppose all the features we described in Section 2.3 were to be
implemented in an assembler. How would the symbol table required be
different from the one discussed in Section 2.1?
12. Which of the features described in Section 2.3 would create
additional problems in the writing of a disassembler (see Exercise 2.1.4)?
Describe these problems, and discuss possible solutions.
13. When different control sections are assembled together, some
references between them could be handled by the assembler (instead of
being passed on to the loader). In the program of Fig. 2.15, for
example, the expression on line 190 could be evaluated directly by the
assembler because its symbol table contains all of the required
information. What would be the advantages and disadvantages of
doing this?
14. In the program of Fig. 2.11, suppose we used only two program
blocks: the default block and CBLKS. Assume that the data items in
CDATA are to be included in the default block. What changes in the
source program would accomplish this? Show the object program
(corresponding to Fig. 2.13) that would result.
15. Suppose that for some reason it is desirable to separate the parts of
an assembler language program that require initialization (e.g.,
instructions and data items defined with WORD or BYTE) from the
parts that do not require initialization (e.g., storage reserved with
RESW or RESB). Thus, when the program is loaded for execution it
should look like

        Instructions and
        initialized data items

        Reserved storage
        (uninitialized data items)

Suppose that it is considered too restrictive to require the
programmer to perform this separation. Instead, the assembler should take
the source program statements in whatever order they are written,
and automatically perform the rearrangement as described above.
Describe a way in which this separation of the program could be
accomplished by a two-pass assembler.
16. Suppose LENGTH is defined as in the program of Fig. 2.9. What
would be the difference between the following sequences of
statements?
a.      LDA     LENGTH
        SUB     #1

b.      LDA     LENGTH-1

17. Referring to the definitions of symbols in Fig. 2.10, give the value,
type, and intuitive meaning (if any) of each of the following
expressions:
a. BUFFER-FIRST
b. BUFFER+4095
c. MAXLEN-1
d. BUFFER+MAXLEN-1
e. BUFFER-MAXLEN
f. 2*LENGTH
g. 2*MAXLEN-1
h. MAXLEN-BUFFER
i. FIRST+BUFFER
j. FIRST-BUFFER+BUFEND

18. In the program of Fig. 2.9, what is the advantage of writing (on line
107)

MAXLEN  EQU     BUFEND-BUFFER

instead of

MAXLEN  EQU     4096 ?

19. In the program of Fig. 2.15, could we change line 190 to

MAXLEN  EQU     BUFEND-BUFFER

and line 133 to

       +LDT     #MAXLEN

as we did in Fig. 2.9?
20. The assembler could simply assume that any reference to a symbol
not defined within a control section is an external reference. This
change would eliminate the need for the EXTREF statement. Would
this be a good idea?
21. How could an assembler that allows external references avoid the
need for an EXTDEF statement? What would be the advantages and
disadvantages of doing this?
22. The assembler could automatically use extended format for
instructions whose operands involve external references. This would
eliminate the need for the programmer to code + in such statements. What
would be the advantages and disadvantages of doing this?
23. On some systems, control sections can be composed of several
different parts, just as program blocks can. What problems does this pose
for the assembler? How might these problems be solved?
24. Assume that the symbols RDREC and COPY are defined as in Fig.
2.15. According to our rules, the expression

        RDREC-COPY

would be illegal (that is, the assembler and/or the loader would
reject it). Suppose that for some reason the program really needs the
value of this expression. How could such a thing be accomplished
without changing the rules for expressions?
25. We discussed a large number of assembler directives, and many
more could be implemented in an actual assembler. Checking for
them one at a time using comparisons might be quite inefficient.
How could we use a table, perhaps similar to OPTAB, to speed
recognition and handling of assembler directives? (Hint: the answer
to this problem may depend upon the language in which the
assembler itself is written.)
26. Other than the listing of the source program with generated object
code, what assembler outputs might be useful to the programmer?
Suggest some optional listings that might be generated and discuss
any data structures or algorithms involved in producing them.

Section 2.4

1. The process of fixing up a few forward references should involve
less overhead than making a complete second pass of the source
program. Why don’t all assemblers use the one-pass technique for
efficiency?
2. Suppose we wanted our assembler to produce a cross-reference
listing for all symbols used in the program. For the program of Fig. 2.5,
such a listing might look like

Symbol    Defined on line    Used on lines

COPY      5
FIRST     10                 255
CLOOP     15                 40
ENDFIL    45                 30
EOF       80                 45
RETADR    95                 10, 70
LENGTH 100 12) fo 207 60h Eppes

How might this be done by the assembler? Indicate changes to the logic
and tables discussed in Section 2.1 that would be required.
3. Could a one-pass assembler produce a relocatable object program and
handle external references? Describe the processing logic that would be
involved and identify any potential difficulties.
4. How could literals be implemented in a one-pass assembler?
5. We discussed one-pass assemblers as though instruction operands could
only be single symbols. How could a one-pass assembler handle an
instruction like

JEQ ENDFIL+3

where ENDFIL has not yet been defined?
6. Outline the logic flow for a simple one-pass load-and-go assembler.
7. Using the methods outlined in Chapter 8, develop a modular design for
a one-pass assembler that produces object code in memory.
8. Suppose that an instruction involving a forward reference is to be
assembled using program-counter relative addressing. How might this be
handled by a one-pass assembler?
9. The process of fixing up forward references in a one-pass assembler
that produces an object program is very similar to the linking process
described in Section 2.3.5. Why didn’t we just use Modification records
to fix up the forward references?
10. How could we extend the methods of Section 2.4.2 to handle forward
references in ORG statements?

Section 2.5

1. Consider the description of the VAX architecture in Section 1.4.1.
What characteristics would you expect to find in a VAX assembler?

2. Consider the description of the T3E architecture in Section 1.5.3.
What characteristics would you expect to find in a T3E assembler?
Chapter 4

Macro Processors

In this chapter we study the design and implementation of macro processors.
A macro instruction (often abbreviated to macro) is simply a notational
convenience for the programmer. A macro represents a commonly used group of
statements in the source programming language. The macro processor replaces
each macro instruction with the corresponding group of source language
statements. This is called expanding the macros. Thus macro instructions
allow the programmer to write a shorthand version of a program, and leave
the mechanical details to be handled by the macro processor.
For example, suppose that it is necessary to save the contents of all
registers before calling a subprogram. On SIC/XE, this would require a
sequence of seven instructions (STA, STB, etc.). Using a macro instruction,
the programmer could simply write one statement like SAVEREGS. This macro
instruction would be expanded into the seven assembler language
instructions needed to save the register contents. A similar macro
instruction (perhaps named LOADREGS) could be used to reload the register
contents after returning from the subprogram.
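
The SAVEREGS idea can be illustrated with a small Python sketch. It is not
part of the original text, and it makes several assumptions: statements
carry no labels, the seven SIC/XE registers are A, X, L, B, S, T, and F,
and the save-area names (SAVEA, SAVEX, and so on) are invented for the
example.

```python
# Hypothetical macro table: each macro name maps to the statements it stands for.
# The save-area names SAVEA, SAVEX, ... are invented for this sketch.
REGISTERS = "AXLBSTF"
MACROS = {
    "SAVEREGS": [f"ST{r}     SAVE{r}" for r in REGISTERS],
    "LOADREGS": [f"LD{r}     SAVE{r}" for r in REGISTERS],
}


def expand(source_lines):
    """Replace each macro instruction with its group of statements."""
    output = []
    for line in source_lines:
        fields = line.split()
        opcode = fields[0] if fields else ""
        if opcode in MACROS:
            output.extend(MACROS[opcode])   # pure text substitution
        else:
            output.append(line)             # everything else passes through
    return output


program = ["SAVEREGS", "JSUB    SUBR", "LOADREGS"]
expanded = expand(program)
print("\n".join(expanded))
```

The three-line shorthand program becomes fifteen statements: seven stores,
the subroutine call, and seven loads.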
The functions of a macro processor essentially involve the substitution of
one group of characters or lines for another. Except in a few specialized
cases, the macro processor performs no analysis of the text it handles. The
design and capabilities of a macro processor may be influenced by the form
of the programming language statements involved. However, the meaning of
these statements, and their translation into machine language, are of no
concern whatsoever during macro expansion. This means that the design of a
macro processor is not directly related to the architecture of the computer
on which it is to run.
The most common use of macro processors is in assembler language
programming. We use SIC assembler language examples to illustrate most of
the concepts being discussed. However, macro processors can also be used
with high-level programming languages, operating system command languages,
etc. In addition, there are general-purpose macro processors that are not
tied to any particular language. In the later sections of this chapter, we
briefly discuss these more general uses of macros.


Section 4.1 introduces the basic concepts of macro processing, including
macro definition and expansion. We also present an algorithm for a simple
macro processor. Section 4.2 discusses extended features that are commonly
found in macro processors. These features include the generation of unique
labels within macro expansions, conditional macro expansion, and the use of
keyword parameters in macros. All these features are machine-independent.
Because the macro processor is not directly related to machine architecture,
this chapter contains no section on machine-dependent features.
Section 4.3 describes some macro processor design options. One of these
options (recursive macro expansion) involves the internal structure of the
macro processor itself. The other options are concerned with how the macro
processor is related to other pieces of system software such as assemblers
or compilers.
Finally, Section 4.4 briefly presents three examples of actual macro
processors. One of these is a macro processor designed for use by assembler
language programmers. Another is intended to be used with a high-level
programming language. The third is a general-purpose macro processor,
which is not tied to any particular language. Additional examples may be
found in the references cited throughout this chapter.

4.1 BASIC MACRO PROCESSOR FUNCTIONS


In this section we examine the fundamental functions that are common to all
macro processors. Section 4.1.1 discusses the processes of macro definition, in-
vocation, and expansion with substitution of parameters. These functions are
illustrated with examples using the SIC/XE assembler language. Section 4.1.2
presents a one-pass algorithm for a simple macro processor together with a
description of the data structures needed for macro processing. Later sections
in this chapter discuss extensions to the basic capabilities introduced in this
section.

4.1.1 Macro Definition and Expansion

Figure 4.1 shows an example of a SIC/XE program using macro instructions.
This program has the same functions and logic as the sample program in Fig.
2.5; however, the numbering scheme used for the source statements has been
changed.

This program defines and uses two macro instructions, RDBUFF and WRBUFF.
The functions and logic of the RDBUFF macro are similar to those of the
RDREC subroutine in Fig. 2.5; likewise, the WRBUFF macro is similar to the
WRREC subroutine. The definitions of these macro instructions appear in
the source program following the START statement.
Two new assembler directives (MACRO and MEND) are used in macro
definitions. The first MACRO statement (line 10) identifies the beginning
of a macro definition. The symbol in the label field (RDBUFF) is the name
of the macro, and the entries in the operand field identify the parameters
of the macro instruction. In our macro language, each parameter begins with
the character &, which facilitates the substitution of parameters during
macro expansion. The macro name and parameters define a pattern or
prototype for the macro instructions used by the programmer. Following the
MACRO directive are the statements that make up the body of the macro
definition (lines 15 through 90). These are the statements that will be
generated as the expansion of the macro. The MEND assembler directive
(line 95) marks the end of the macro definition. The definition of the
WRBUFF macro (lines 100 through 160) follows a similar pattern.
The main program itself begins on line 180. The statement on line 190 is a
macro invocation statement that gives the name of the macro instruction
being invoked and the arguments to be used in expanding the macro. (A macro
invocation statement is often referred to as a macro call. To avoid
confusion with the call statements used for procedures and subroutines, we
prefer to use the term invocation. As we shall see, the processes of macro
invocation and subroutine call are quite different.) You should compare the
logic of the main program in Fig. 4.1 with that of the main program in
Fig. 2.5, remembering the similarities in function between RDBUFF and RDREC
and between WRBUFF and WRREC.
The program in Fig. 4.1 could be supplied as input to a macro processor.
Figure 4.2 shows the output that would be generated. The macro instruction
definitions have been deleted since they are no longer needed after the
macros are expanded. Each macro invocation statement has been expanded into
the statements that form the body of the macro, with the arguments from the
macro invocation substituted for the parameters in the macro prototype. The
arguments and parameters are associated with one another according to their
positions. The first argument in the macro invocation corresponds to the
first parameter in the macro prototype, and so on. In expanding the macro
invocation on line 190, for example, the argument F1 is substituted for the
parameter &INDEV wherever it occurs in the body of the macro. Similarly,
BUFFER is substituted for &BUFADR, and LENGTH is substituted for &RECLTH.
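
The positional pairing of arguments and parameters can be sketched in
Python as follows. This is only an illustration: the function name
expand_invocation is hypothetical, and the body shown is three statements
taken from the RDBUFF definition rather than the whole macro.

```python
def expand_invocation(prototype_params, body, arguments):
    """Substitute arguments for parameters according to their positions."""
    if len(arguments) != len(prototype_params):
        raise ValueError("argument count does not match the prototype")
    # Pair each parameter with the argument in the same position.
    # Longer names are replaced first, so a parameter whose name is a
    # prefix of another parameter cannot be substituted by mistake.
    pairs = sorted(zip(prototype_params, arguments),
                   key=lambda pair: len(pair[0]), reverse=True)
    expanded = []
    for line in body:
        for param, arg in pairs:
            line = line.replace(param, arg)
        expanded.append(line)
    return expanded


params = ["&INDEV", "&BUFADR", "&RECLTH"]
body = ["TD      =X'&INDEV'", "STCH    &BUFADR,X", "STX     &RECLTH"]
result = expand_invocation(params, body, ["F1", "BUFFER", "LENGTH"])
print("\n".join(result))
```

With the arguments from line 190, the three statements come out with F1,
BUFFER, and LENGTH in place of &INDEV, &BUFADR, and &RECLTH.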
Line    Source statement

5       COPY     START   0                         COPY FILE FROM INPUT TO OUTPUT
10      RDBUFF   MACRO   &INDEV,&BUFADR,&RECLTH
15      .
20      .        MACRO TO READ RECORD INTO BUFFER
25      .
30               CLEAR   X                         CLEAR LOOP COUNTER
35               CLEAR   A
40               CLEAR   S
45              +LDT     #4096                     SET MAXIMUM RECORD LENGTH
50               TD      =X'&INDEV'                TEST INPUT DEVICE
55               JEQ     *-3                       LOOP UNTIL READY
60               RD      =X'&INDEV'                READ CHARACTER INTO REG A
65               COMPR   A,S                       TEST FOR END OF RECORD
70               JEQ     *+11                      EXIT LOOP IF EOR
75               STCH    &BUFADR,X                 STORE CHARACTER IN BUFFER
80               TIXR    T                         LOOP UNLESS MAXIMUM LENGTH
85               JLT     *-19                        HAS BEEN REACHED
90               STX     &RECLTH                   SAVE RECORD LENGTH
95               MEND
100     WRBUFF   MACRO   &OUTDEV,&BUFADR,&RECLTH
105     .
110     .        MACRO TO WRITE RECORD FROM BUFFER
115     .
120              CLEAR   X                         CLEAR LOOP COUNTER
125              LDT     &RECLTH
130              LDCH    &BUFADR,X                 GET CHARACTER FROM BUFFER
135              TD      =X'&OUTDEV'               TEST OUTPUT DEVICE
140              JEQ     *-3                       LOOP UNTIL READY
145              WD      =X'&OUTDEV'               WRITE CHARACTER
150              TIXR    T                         LOOP UNTIL ALL CHARACTERS
155              JLT     *-14                        HAVE BEEN WRITTEN
160              MEND
165     .
170     .        MAIN PROGRAM
175     .
180     FIRST    STL     RETADR                    SAVE RETURN ADDRESS
190     CLOOP    RDBUFF  F1,BUFFER,LENGTH          READ RECORD INTO BUFFER
195              LDA     LENGTH                    TEST FOR END OF FILE
200              COMP    #0
205              JEQ     ENDFIL                    EXIT IF EOF FOUND
210              WRBUFF  05,BUFFER,LENGTH          WRITE OUTPUT RECORD
215              J       CLOOP                     LOOP
220     ENDFIL   WRBUFF  05,EOF,THREE              INSERT EOF MARKER
225              J       @RETADR
230     EOF      BYTE    C'EOF'
235     THREE    WORD    3
240     RETADR   RESW    1
245     LENGTH   RESW    1                         LENGTH OF RECORD
250     BUFFER   RESB    4096                      4096-BYTE BUFFER AREA
255              END     FIRST

Figure 4.1 Use of macros in a SIC/XE program.


Line    Source statement

5       COPY     START   0                         COPY FILE FROM INPUT TO OUTPUT
180     FIRST    STL     RETADR                    SAVE RETURN ADDRESS
190     .CLOOP   RDBUFF  F1,BUFFER,LENGTH          READ RECORD INTO BUFFER
190a    CLOOP    CLEAR   X                         CLEAR LOOP COUNTER
190b             CLEAR   A
190c             CLEAR   S
190d            +LDT     #4096                     SET MAXIMUM RECORD LENGTH
190e             TD      =X'F1'                    TEST INPUT DEVICE
190f             JEQ     *-3                       LOOP UNTIL READY
190g             RD      =X'F1'                    READ CHARACTER INTO REG A
190h             COMPR   A,S                       TEST FOR END OF RECORD
190i             JEQ     *+11                      EXIT LOOP IF EOR
190j             STCH    BUFFER,X                  STORE CHARACTER IN BUFFER
190k             TIXR    T                         LOOP UNLESS MAXIMUM LENGTH
190l             JLT     *-19                        HAS BEEN REACHED
190m             STX     LENGTH                    SAVE RECORD LENGTH
195              LDA     LENGTH                    TEST FOR END OF FILE
200              COMP    #0
205              JEQ     ENDFIL                    EXIT IF EOF FOUND
210     .        WRBUFF  05,BUFFER,LENGTH          WRITE OUTPUT RECORD
210a             CLEAR   X                         CLEAR LOOP COUNTER
210b             LDT     LENGTH
210c             LDCH    BUFFER,X                  GET CHARACTER FROM BUFFER
210d             TD      =X'05'                    TEST OUTPUT DEVICE
210e             JEQ     *-3                       LOOP UNTIL READY
210f             WD      =X'05'                    WRITE CHARACTER
210g             TIXR    T                         LOOP UNTIL ALL CHARACTERS
210h             JLT     *-14                        HAVE BEEN WRITTEN
215              J       CLOOP                     LOOP
220     .ENDFIL  WRBUFF  05,EOF,THREE              INSERT EOF MARKER
220a    ENDFIL   CLEAR   X                         CLEAR LOOP COUNTER
220b             LDT     THREE
220c             LDCH    EOF,X                     GET CHARACTER FROM BUFFER
220d             TD      =X'05'                    TEST OUTPUT DEVICE
220e             JEQ     *-3                       LOOP UNTIL READY
220f             WD      =X'05'                    WRITE CHARACTER
220g             TIXR    T                         LOOP UNTIL ALL CHARACTERS
220h             JLT     *-14                        HAVE BEEN WRITTEN
225              J       @RETADR
230     EOF      BYTE    C'EOF'
235     THREE    WORD    3
240     RETADR   RESW    1
245     LENGTH   RESW    1                         LENGTH OF RECORD
250     BUFFER   RESB    4096                      4096-BYTE BUFFER AREA
255              END     FIRST

Figure 4.2 Program from Fig. 4.1 with macros expanded.

Lines 190a through 190m show the complete expansion of the macro
invocation on line 190. The comment lines within the macro body have been
deleted, but comments on individual statements have been retained. Note that
the macro invocation statement itself has been included as a comment line.
This serves as documentation of the statement written by the programmer.
The label on the macro invocation statement (CLOOP) has been retained as a
label on the first statement generated in the macro expansion. This allows
the programmer to use a macro instruction in exactly the same way as an
assembler language mnemonic. The macro invocations on lines 210 and 220 are
expanded in the same way. Note that the two invocations of WRBUFF specify
different arguments, so they produce different expansions.
After macro processing, the expanded file (Fig. 4.2) can be used as input to
the assembler. The macro invocation statements will be treated as comments,
and the statements generated from the macro expansions will be assembled
exactly as though they had been written directly by the programmer.
A comparison of the expanded program in Fig. 4.2 with the program in
Fig. 2.5 shows the most significant differences between macro invocation
and subroutine call. In Fig. 4.2, the statements from the body of the macro
WRBUFF are generated twice: lines 210a through 210h and lines 220a through
220h. In the program of Fig. 2.5, the corresponding statements appear only
once: in the subroutine WRREC (lines 210 through 240). In general, the
statements that form the expansion of a macro are generated (and assembled)
each time the macro is invoked. Statements in a subroutine appear only
once, regardless of how many times the subroutine is called.
Note also that our macro instructions have been written so that the body of
the macro contains no labels. In Fig. 4.1, for example, line 140 contains
the statement “JEQ *-3” and line 155 contains “JLT *-14.” The corresponding
statements in the WRREC subroutine (Fig. 2.5) are “JEQ WLOOP” and “JLT
WLOOP,” where WLOOP is a label on the TD instruction that tests the output
device. If such a label appeared on line 135 of the macro body, it would be
generated twice, on lines 210d and 220d of Fig. 4.2. This would result in an
error (a duplicate label definition) when the program is assembled. To
avoid duplication of symbols, we have eliminated labels from the body of
our macro definitions.
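
The duplicate-label problem is easy to reproduce. The Python sketch below
is an illustration only: the two-statement body is a hypothetical
WRBUFF-like fragment that uses the label WLOOP, and the helper
defined_labels assumes that a label is any non-comment text starting in
column 1.

```python
from collections import Counter


def defined_labels(lines):
    """Collect labels, assuming a label is any non-comment text in column 1."""
    return [ln.split()[0] for ln in lines
            if ln and not ln[0].isspace() and not ln.startswith(".")]


# Hypothetical WRBUFF-like body that uses a label instead of "*-3".
body = [
    "WLOOP    TD      =X'05'",
    "         JEQ     WLOOP",
]

expanded = body + body          # the body is generated once per invocation
counts = Counter(defined_labels(expanded))
duplicates = [label for label, n in counts.items() if n > 1]
print(duplicates)               # prints ['WLOOP']
```

Two invocations yield two definitions of WLOOP, which is exactly the
condition an assembler reports as a duplicate label.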
The use of statements like “JLT *-14” is generally considered to be a poor
programming practice. It is somewhat less objectionable within a macro
definition; however, it is still an inconvenient and error-prone method. In
Section 4.2.2 we discuss ways of avoiding this problem.

4.1.2 Macro Processor Algorithm and Data Structures

It is easy to design a two-pass macro processor in which all macro
definitions are processed during the first pass, and all macro invocation
statements are expanded during the second pass. However, such a two-pass
macro processor would not allow the body of one macro instruction to
contain definitions of other macros (because all macros would have to be
defined during the first pass before any macro invocations were expanded).
Such definitions of macros by other macros can be useful in certain cases.
Consider, for example, the two macro instruction definitions in Fig. 4.3.
The body of the first macro (MACROS) contains statements that define RDBUFF,
WRBUFF, and other macro instructions for a SIC system (standard version).
The body of the second macro instruction (MACROX) defines these same
macros for a SIC/XE system. A program that is to be run on a standard SIC
system could invoke MACROS to define the other utility macro instructions.
A program for a SIC/XE system could invoke MACROX to define these same
macros in their XE versions. In this way, the same program could run on
either a standard SIC machine or a SIC/XE machine (taking advantage of the
extended features). The only change required would be the invocation of
either MACROS or MACROX. It is important to understand that defining MACROS
or MACROX does not define RDBUFF and the other macro instructions. These
definitions are processed only when an invocation of MACROS or MACROX
is expanded.
A one-pass macro processor that can alternate between macro definition
and macro expansion is able to handle macros like those in Fig. 4.3. In this
section we present an algorithm and a set of data structures for such a
macro processor. Because of the one-pass structure, the definition of a
macro must appear in the source program before any statements that invoke
that macro. This restriction does not create any real inconvenience for the
programmer. In fact, a macro invocation statement that preceded the
definition of the macro would be confusing for anyone reading the program.
There are three main data structures involved in our macro processor. The
macro definitions themselves are stored in a definition table (DEFTAB),
which contains the macro prototype and the statements that make up the macro
body (with a few modifications). Comment lines from the macro definition are
not entered into DEFTAB because they will not be part of the macro
expansion. References to the macro instruction parameters are converted to
a positional notation for efficiency in substituting arguments. The macro
names are also entered into NAMTAB, which serves as an index to DEFTAB. For
each macro instruction defined, NAMTAB contains pointers to the beginning
and end of the definition in DEFTAB.
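
One possible in-memory layout for these two tables is sketched below in
Python. The representation is an assumption made for illustration: DEFTAB
is a flat list of lines, NAMTAB is a dictionary from macro name to a
(begin, end) pair of DEFTAB indexes, and the helper define is a
hypothetical name. The ?n positional notation follows the description in
the text.

```python
import re

DEFTAB = []   # prototype and body lines of every macro, in definition order
NAMTAB = {}   # macro name -> (index of first line, index of last line)


def define(name, params, body):
    """Store one macro definition, skipping comment lines."""
    begin = len(DEFTAB)
    DEFTAB.append(f"{name} {','.join(params)}")        # macro prototype
    for line in body + ["MEND"]:
        if line.lstrip().startswith("."):              # comment line
            continue
        # Convert each &NAME reference to the positional notation ?n.
        for n, param in enumerate(params, start=1):
            line = re.sub(re.escape(param) + r"\b", f"?{n}", line)
        DEFTAB.append(line)
    NAMTAB[name] = (begin, len(DEFTAB) - 1)


define("RDBUFF", ["&INDEV", "&BUFADR", "&RECLTH"], [
    ".  MACRO TO READ RECORD INTO BUFFER",
    "TD    =X'&INDEV'",
    "STCH  &BUFADR,X",
    "STX   &RECLTH",
])
print(NAMTAB)
print("\n".join(DEFTAB))
```

The comment line is dropped, the three parameter references become ?1, ?2,
and ?3, and NAMTAB records that RDBUFF occupies DEFTAB entries 0 through 4.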

1   MACROS   MACRO                  {Defines SIC standard version macros}
2   RDBUFF   MACRO   &INDEV,&BUFADR,&RECLTH
                                    {SIC standard version}
3            MEND                   {End of RDBUFF}
4   WRBUFF   MACRO   &OUTDEV,&BUFADR,&RECLTH
                                    {SIC standard version}
5            MEND                   {End of WRBUFF}
6            MEND                   {End of MACROS}

                     (a)

1   MACROX   MACRO                  {Defines SIC/XE macros}
2   RDBUFF   MACRO   &INDEV,&BUFADR,&RECLTH
                                    {SIC/XE version}
3            MEND                   {End of RDBUFF}
4   WRBUFF   MACRO   &OUTDEV,&BUFADR,&RECLTH
                                    {SIC/XE version}
5            MEND                   {End of WRBUFF}
6            MEND                   {End of MACROX}

                     (b)

Figure 4.3 Example of the definition of macros within a macro body.

The third data structure is an argument table (ARGTAB), which is used
during the expansion of macro invocations. When a macro invocation
statement is recognized, the arguments are stored in ARGTAB according to
their position in the argument list. As the macro is expanded, arguments
from ARGTAB are substituted for the corresponding parameters in the macro
body.
Figure 4.4 shows portions of the contents of these tables during the
processing of the program in Fig. 4.1. Figure 4.4(a) shows the definition of
RDBUFF stored in DEFTAB, with an entry in NAMTAB identifying the beginning
and end of the definition. Note the positional notation that has been used

NAMTAB                   DEFTAB

   .                        .
   .                        .
RDBUFF  (begin) ------>  RDBUFF   &INDEV,&BUFADR,&RECLTH
                         CLEAR    X
                         CLEAR    A
                         CLEAR    S
                        +LDT      #4096
                         TD       =X'?1'
                         JEQ      *-3
                         RD       =X'?1'
                         COMPR    A,S
                         JEQ      *+11
                         STCH     ?2,X
                         TIXR     T
                         JLT      *-19
                         STX      ?3
        (end)   ------>  MEND
   .                        .

                  (a)

ARGTAB

1   F1
2   BUFFER
3   LENGTH

   (b)

Figure 4.4 Contents of macro processor tables for the program in
Fig. 4.1: (a) entries in NAMTAB and DEFTAB defining macro RDBUFF,
(b) entries in ARGTAB for invocation of RDBUFF on line 190.

for the parameters: the parameter &INDEV has been converted to ?1
(indicating the first parameter in the prototype), &BUFADR has been
converted to ?2, and so on. Figure 4.4(b) shows ARGTAB as it would appear
during expansion of the RDBUFF statement on line 190. For this invocation,
the first argument is F1, the second is BUFFER, etc. This scheme makes
substitution of macro arguments much more efficient. When the ?n notation
is recognized in a line from DEFTAB, a simple indexing operation supplies
the proper argument from ARGTAB.
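
The indexing operation can be sketched in a few lines of Python. The
function name substitute is hypothetical, and ARGTAB is assumed to be a
zero-based list, so argument n is found at index n - 1.

```python
import re


def substitute(deftab_line, argtab):
    """Replace each ?n in a DEFTAB line with entry n of ARGTAB."""
    return re.sub(r"\?(\d+)",
                  lambda match: argtab[int(match.group(1)) - 1],
                  deftab_line)


ARGTAB = ["F1", "BUFFER", "LENGTH"]   # arguments of the invocation on line 190
print(substitute("TD       =X'?1'", ARGTAB))
print(substitute("STCH     ?2,X", ARGTAB))
print(substitute("STX      ?3", ARGTAB))
```

No search by parameter name is needed at expansion time; the digit after
the question mark selects the argument directly.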

The macro processor algorithm itself is presented in Fig. 4.5. The
procedure DEFINE, which is called when the beginning of a macro definition
is recognized, makes the appropriate entries in DEFTAB and NAMTAB. EXPAND is
called to set up the argument values in ARGTAB and expand a macro
invocation statement. The procedure GETLINE, which is called at several
points in the algorithm, gets the next line to be processed. This line may
come from DEFTAB (the next line of a macro being expanded), or from the
input file, depending upon whether the Boolean variable EXPANDING is set to
TRUE or FALSE.
One aspect of this algorithm deserves further comment: the handling of
macro definitions within macros (as illustrated in Fig. 4.3). When a macro
definition is being entered into DEFTAB, the normal approach would be to
continue until an MEND directive is reached. This would not work for the
example in Fig. 4.3, however. The MEND on line 3 (which actually marks the
end of the definition of RDBUFF) would be taken as the end of the definition
of MACROS. To solve this problem, our DEFINE procedure maintains a
counter named LEVEL. Each time a MACRO directive is read, the value of
LEVEL is increased by 1; each time an MEND directive is read, the value of
LEVEL is decreased by 1. When LEVEL reaches 0, the MEND that corresponds to
the original MACRO directive has been found. This process is very
much like matching left and right parentheses when scanning an arithmetic
expression.
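
The LEVEL counter can be tried out on the structure of Fig. 4.3(a) with
the Python sketch below. The helper names opcode_of and end_of_definition
are hypothetical, and the sketch assumes that a label, when present,
starts in column 1 and that statements without a label are indented.

```python
def opcode_of(line):
    """Return the opcode field; a label, if present, starts in column 1."""
    fields = line.split()
    if not fields:
        return ""
    has_label = not line[0].isspace()
    return fields[1] if has_label and len(fields) > 1 else fields[0]


def end_of_definition(lines, start):
    """Index of the MEND that matches the MACRO directive at lines[start]."""
    level = 0
    for i in range(start, len(lines)):
        op = opcode_of(lines[i])
        if op == "MACRO":
            level += 1
        elif op == "MEND":
            level -= 1
            if level == 0:
                return i
    raise ValueError("MACRO directive without a matching MEND")


fig_4_3a = [
    "MACROS   MACRO",
    "RDBUFF   MACRO   &INDEV,&BUFADR,&RECLTH",
    "         MEND",
    "WRBUFF   MACRO   &OUTDEV,&BUFADR,&RECLTH",
    "         MEND",
    "         MEND",
]
print(end_of_definition(fig_4_3a, 0))   # prints 5: the last MEND ends MACROS
print(end_of_definition(fig_4_3a, 1))   # prints 2: the first MEND ends RDBUFF
```

Stopping at the first MEND would end MACROS at index 2; counting levels
finds index 5 instead.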

begin {macro processor}
   EXPANDING := FALSE
   while OPCODE ≠ 'END' do
      begin
         GETLINE
         PROCESSLINE
      end {while}
end {macro processor}

procedure PROCESSLINE
   begin
      search NAMTAB for OPCODE
      if found then
         EXPAND
      else if OPCODE = 'MACRO' then
         DEFINE
      else write source line to expanded file
   end {PROCESSLINE}

Figure 4.5 Algorithm for a one-pass macro processor.



procedure DEFINE
   begin
      enter macro name into NAMTAB
      enter macro prototype into DEFTAB
      LEVEL := 1
      while LEVEL > 0 do
         begin
            GETLINE
            if this is not a comment line then
               begin
                  substitute positional notation for parameters
                  enter line into DEFTAB
                  if OPCODE = 'MACRO' then
                     LEVEL := LEVEL + 1
                  else if OPCODE = 'MEND' then
                     LEVEL := LEVEL - 1
               end {if not comment}
         end {while}
      store in NAMTAB pointers to beginning and end of definition
   end {DEFINE}

procedure EXPAND
   begin
      EXPANDING := TRUE
      get first line of macro definition {prototype} from DEFTAB
      set up arguments from macro invocation in ARGTAB
      write macro invocation to expanded file as a comment
      while not end of macro definition do
         begin
            GETLINE
            PROCESSLINE
         end {while}
      EXPANDING := FALSE
   end {EXPAND}

procedure GETLINE
   begin
      if EXPANDING then
         begin
            get next line of macro definition from DEFTAB
            substitute arguments from ARGTAB for positional notation
         end {if}
      else
         read next line from input file
   end {GETLINE}

Figure 4.5 (cont'd)
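
For readers who want to run the algorithm, the following Python sketch
mirrors the four procedures of Fig. 4.5. It is an illustration under
stated assumptions, not a reference implementation: the class and method
names are invented, a label must start in column 1, statements without a
label must be indented, operands contain no blanks, and trailing comments
are ignored when parsing. The sketch makes no attempt to support a macro
invocation inside a macro body.

```python
import re


def parse(line):
    """Split a line into (label, opcode, operand); a label starts in column 1."""
    fields = line.split()
    if not fields or fields[0].startswith("."):
        return "", "", ""                          # blank or comment line
    label = fields.pop(0) if not line[0].isspace() else ""
    opcode = fields[0] if fields else ""
    operand = fields[1] if len(fields) > 1 else ""
    return label, opcode, operand


class MacroProcessor:
    def __init__(self, source):
        self.source = iter(source)
        self.deftab, self.namtab, self.argtab = [], {}, []
        self.expanding, self.pos = False, 0        # pos: next DEFTAB line to read
        self.output = []

    def getline(self):
        if not self.expanding:
            return next(self.source)
        line = self.deftab[self.pos]
        self.pos += 1
        # Substitute arguments from ARGTAB for the positional notation ?n.
        return re.sub(r"\?(\d+)",
                      lambda m: self.argtab[int(m.group(1)) - 1], line)

    def processline(self, line):
        label, opcode, operand = parse(line)
        if opcode in self.namtab:
            self.expand(label, opcode, operand, line)
        elif opcode == "MACRO":
            self.define(label, operand)
        else:
            self.output.append(line)

    def define(self, name, operand):
        params = operand.split(",") if operand else []
        begin = len(self.deftab)
        self.deftab.append(f"{name} {operand}")    # macro prototype
        level = 1
        while level > 0:
            line = self.getline()
            if line.lstrip().startswith("."):
                continue                           # comment lines are not stored
            for n, param in enumerate(params, start=1):
                line = re.sub(re.escape(param) + r"\b", f"?{n}", line)
            self.deftab.append(line)
            opcode = parse(line)[1]
            if opcode == "MACRO":
                level += 1
            elif opcode == "MEND":
                level -= 1
        self.namtab[name] = (begin, len(self.deftab) - 1)

    def expand(self, label, name, operand, line):
        begin, end = self.namtab[name]
        self.argtab = operand.split(",") if operand else []
        self.output.append("." + line)             # invocation kept as a comment
        self.expanding, self.pos = True, begin + 1 # start after the prototype
        while self.pos < end:                      # stop before the closing MEND
            body = self.getline()
            if label:                              # label goes on first statement
                body, label = label + " " + body.lstrip(), ""
            self.processline(body)
        self.expanding = False

    def run(self):
        opcode = ""
        while opcode != "END":
            line = self.getline()
            opcode = parse(line)[1]
            self.processline(line)
        return self.output


source = [
    "COPY     START   0",
    "WRBUFF   MACRO   &OUTDEV,&BUFADR,&RECLTH",
    ".        MACRO TO WRITE RECORD FROM BUFFER",
    "         CLEAR   X",
    "         LDT     &RECLTH",
    "         LDCH    &BUFADR,X",
    "         TD      =X'&OUTDEV'",
    "         MEND",
    "ENDFIL   WRBUFF  05,EOF,THREE",
    "         END     COPY",
]
result = MacroProcessor(source).run()
print("\n".join(result))
```

Because DEFINE reads its lines through the same getline routine that
EXPAND uses, a definition nested inside a macro body, as in Fig. 4.3, is
entered into the tables only when the enclosing macro is expanded.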



You may want to apply this algorithm by hand to the program in Fig. 4.1
to be sure you understand its operation. The result should be the same as
shown in Fig. 4.2.
Most macro processors allow the definitions of commonly used macro
instructions to appear in a standard system library, rather than in the
source program. This makes the use of such macros much more convenient.
Definitions are retrieved from this library as they are needed during macro
processing. The extension of the algorithm in Fig. 4.5 to include this sort
of processing appears as an exercise at the end of this chapter.

4.2 MACHINE-INDEPENDENT MACRO PROCESSOR FEATURES

In this section we discuss several extensions to the basic macro processor func-
tions presented in Section 4.1. As we have mentioned before, these extended
features are not directly related to the architecture of the computer for which
the macro processor is written. Section 4.2.1 describes a method for concate-
nating macro instruction parameters with other character strings. Section 4.2.2
discusses one method for generating unique labels within macro expansions,
which avoids the need for extensive use of relative addressing at the source
statement level. Section 4.2.3 introduces the important topic of conditional
macro expansion and illustrates the concepts involved with several examples.
This ability to alter the expansion of a macro by using control statements
makes macro instructions a much more powerful and useful tool for the pro-
grammer. Section 4.2.4 describes the definition and use of keyword parameters
in macro instructions.

4.2.1 Concatenation of Macro Parameters

Most macro processors allow parameters to be concatenated with other
character strings. Suppose, for example, that a program contains one series
of variables named by the symbols XA1, XA2, XA3,..., another series named by
XB1, XB2, XB3,..., etc. If similar processing is to be performed on each
series of variables, the programmer might want to incorporate this
processing into a macro instruction. The parameter to such a macro
instruction could specify the series of variables to be operated on (A, B,
etc.). The macro processor would use this parameter to construct the
symbols required in the macro expansion (XA1, XB1, etc.).
