Open navigation menu
Close suggestions
Search
Search
en
Change Language
Upload
Sign in
Sign in
Download free for days
0 ratings
0% found this document useful (0 votes)
55 views
264 pages
System Software
Uploaded by
Hemavathi B
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content,
claim it here
.
Available Formats
Download as PDF or read online on Scribd
Download
Save
Save System Software For Later
0%
0% found this document useful, undefined
0%
, undefined
Embed
Share
Print
Report
0 ratings
0% found this document useful (0 votes)
55 views
264 pages
System Software
Uploaded by
Hemavathi B
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content,
claim it here
.
Available Formats
Download as PDF or read online on Scribd
Carousel Previous
Carousel Next
Download
Save
Save System Software For Later
0%
0% found this document useful, undefined
0%
, undefined
Embed
Share
Print
Report
Download now
Download
You are on page 1
/ 264
Search
Fullscreen
Contents Chapter 1 Background 1 1d 12 13 14 15 Introduction 1 System Software and Machine Architecture 3 The Simplified Instructional Computer (SIC) 4 1.3.1 SIC Machine Architecture 5 132 SIC/XE Machine Architecture 7 13.3. SIC Programming Examples 12 Traditional (CISC) Machines 21 14.1 VAX Architecture 21 142 Pentium Pro Architecture 25 RISC Machines 29 1.5.1 UltraSPARC Architecture 29 15.2 PowerPC Architecture 33 153 Cray T3E Architecture 37 Exercises 40 Chapter 2 Assemblers 43 21 22 23 24 25 Basic Assembler Functions 44 21.1 ASimpleSIC Assembler 46 2.1.2 Assembler Algorithm and Data Structures 50 Machine-Dependent Assembler Features 52 2.2.1 Instruction Formats and Addressing Modes 57 22.2 Program Relocation 61 Machine-Independent Assembler Features 66 23.1 Literals 66 232 Symbol-Defining Statements 71 23.3 Expressions 75 234 Program Blocks 78 235 Control Sections and Program Linking 83 Assembler Design Options 92 24.1 One-Pass Assemblers 92 24.2 Multi-Pass Assemblers 98 Implementation Examples 102 25.1 MASM Assembler 103 252 SPARC Assembler 105 ix25.3 AIX Assembler 108 Exercises 111 Chapter 3 Loaders and Linkers 123 3.1 3.2 33 34 35 Basic Loader Functions 124 3.1.1 Design of an Absolute Loader 124 3.1.2 ASimple Bootstrap Loader 127 Machine-Dependent Loader Features 129 3.2.1 Relocation 130 3.2.2 Program Linking 134 3.23 Algorithm and Data Structures for a Linking Loader 141 Machine-Independent Loader Features 147 3.3.1 Automatic Library Search 147 33.2 Loader Options 149 Loader Design Options 151 34.1 Linkage Editors 152 34.2 Dynamic Linking 155 34.3 Bootstrap Loaders 158 Implementation Examples 159 35.1 MS-DOS Linker 160 35.2 SunOS Linkers 162 35.3 CrayMPPLinker 164 Exercises 166 Chapter 4 Macro Processors 175 41 42 43 44 Basic Macro Processor Functions 176 4.1.1 Macro Definition and Expansion 176 4.1.2 Macro Processor Algorithm and Data Structures 181 Machine-Independent Macro Processor Features 186 42.1 Concatenation of Macro Parameters 186 422 Generation of Unique Labels 187 423 Conditional Macro Expansion 189 4.24 Keyword Macro Parameters 196 Macro Processor Design Options 197 43.1 Recursive Macro Expansion 199 43.2 General-Purpose Macro Processors 202 43.3 Macro Processing within Language Translators 204 Implementation Examples 206 4.4.1 MASM Macro Processor 206 44.2 ANSIC Macro Language 20944.3 The ELENA Macro Processor 213 Exercises 218 Chapter 5 Compilers 225 5.1 52 5.3 54 55 Basic Compiler Functions 225 5.1.1 Grammars 227 5.1.2 Lexical Analysis 231 5.1.3 Syntactic Analysis 242 5.14 Code Generation 260 Machine-Dependent Compiler Features 272 5.21 Intermediate Form of the Program 272 5.2.2 Machine-Dependent Code Optimization 274 Machine-Independent Compiler Features 278 53.1 Structured Variables 279 53.2 Machine-Independent Code Optimization 284 53.3. Storage Allocation 289 5.34 — Block-Structured Languages 294 Compiler Design Options 299 5.4.1 Division into Passes 299 5.42 Interpreters 301 5.43 P-CodeCompilers 302 544 Compiler-Compilers 304 Implementation Examples 305 55.1 SunOSC Compiler 305 55.2 GNUNYUAda Translator 308 553 Cray MPP FORTRAN Compiler 310 554 Java Compiler and Environment 313 555 The YACC Compiler-Compiler 315 Exercises 318 Chapter 6 Operating Systems 325 61 62 Basic Operating System Functions 325 6.1.1 Types of Operating Systems 327 6.12 User Interface 328 6.1.3 Run-Time Environment 329 Machine-Dependent Operating System Features 331 6.21 Interrupt Processing 332 622 Process Scheduling 339 623 1/OSupervision 344 6.24 Management of Real Memory 354 xiChapter 1 Background This chapter contains a variety of information that serves as background for the material presented later. Section 1.1 gives a brief introduction to system software and an overview of the structure of this book. Section 1.2 begins a discussion of the relationships between system software and machine architec ture, which continues throughout the text. Section 1.3 describes the Simplified Instructional Computer (SIC) that is used to present fundamental software concepts. Sections 1.4 and 1.5 provide an introduction to the architecture of several computers that are used as examples throughout the text. Further in- formation on most of the machine architecture topics discussed can be found in Tabak (1995) and Patterson and Hennessy (1996). Most of the material in this chapter is presented at a summary level, with many details omitted. The level of detail given here is sufficient background for the remainder of the text. You should not attempt to memorize the material in this chapter, or be overly concerned with minor points. Instead, it is recom- mended that you read through this material, and then use it for reference as needed in later chapters. References are provided throughout the chapter for readers who want further information. 1.1 INTRODUCTION This text is an introduction to the design and implementation of system soft- ware. System software consists of a variety of programs that support the opera- tion of a computer. This software makes it possible for the user to focus on an application or other problem to be solved, without needing to know the de- tails of how the machine works internally. When you took your first programming course, you were already using many different types of system software. You probably wrote programs in a high-level language like C++ or Pascal, using a text editor to create and modify the program. You translated these programs into machine language using a compiler. The resulting machine language program was loaded into memory and prepared for execution by a londer or linker. You may have used a debugger to help detect errors in the program.Chapter 1 Background In later courses, you probably wrote programs in assembler language. You may have used macro instructions in these programs to read and write data, or to perform other higher-level functions. You used an assembler, which prob- ably included a macro processor, to translate these programs into machine lan- guage. The translated programs were prepared for execution by the loader or linker, and may have been tested using the debugger. You controlled all of these processes by interacting with the operating sys- tem of the computer. If you were using a system like UNIX or DOS, you proba- bly typed commands at a keyboard. If you were using a system like MacOS or Windows, you probably specified commands with menus and a point-and- click interface. In either case, the operating system took care of all the ma- chine-level details for you. Your computer may have been connected to a network, or may have been shared by other users. It may have had many dif- ferent kinds of storage devices, and several ways of performing input and out- put. However, you did not need to be concerned with these issues. You could concentrate on what you wanted to do, without worrying about how it was accomplished. As you read this book, you will learn about several important types of sys- tem software. You will come to understand the processes that were going on “behind the scenes” as you used the computer in previous courses. By under- standing the system software, you will gain a deeper understanding of how computers actually work. The major topics covered in this book are assemblers, loaders and linkers, macro processors, compilers, and operating systems; each of Chapters 2 through 6 is devoted to one of these subjects. We also consider implementa- tions of these types of software on several real machines. One central theme of the book is the relationship between system software and machine architec- ture: the design of an assembler, operating system, etc,, is influenced by the ar- chitecture of the machine on which it is to run. Some of these influences are discussed in the next section; many other examples appear throughout the text. Chapter 7 contains a survey of some other important types of system soft- ware: database management systems, text editors, and interactive debugging systems. Chapter 8 contains an introduction to software engineering concepts and techniques, focusing on the use of such methods in writing system soft- ware. This chapter can be read at any time after the introduction to assemblers in Section 2.1 The depth of treatment in this text varies considerably from one topic to another. The chapters on assemblers, loaders and linkers, and macro proces- sors contain enough implementation details to prepare the reader to write these types of software for a real computer. Compilers and operating systems, on the other hand, are very large topics; each has, by itself, been the subject of1.2. System Software and Machine Architecture many complete books and courses. It is obviously impossible to provide a full coverage of these subjects in a single chapter of any reasonable size. Instead, we provide an introduction to the most important concepts and issues related to compilers and operating systems, stressing the relationships between soft- ware design and machine architecture. Other subtopics are discussed as space permits, with references provided for readers who wish to explore these areas further. Our goal is to provide a good overview of these subjects that can also serve as background for students who will later take more advanced software courses. This same approach is also applied to the other topics surveyed in Chapter 7. 1.2 SYSTEM SOFTWARE AND MACHINE ARCHITECTURE One characteristic in which most system software differs from application soft- ware is machine dependency. An application program is primarily concerned with the solution of some problem, using the computer as a tool. The focus is on the application, not on the computing system. System programs, on the other hand, are intended to support the operation and use of the computer i self, rather than any particular application. For this reason, they are usually re- lated to the architecture of the machine on which they are to run. For example, assemblers translate mnemonic instructions into machine code; the instruction formats, addressing modes, etc., are of direct concern in assembler design. Similarly, compilers must generate machine language code, taking into ac- count such hardware characteristics as the number and type of registers and the machine instructions available. Operating systems are directly concerned with the management of nearly all of the resources of a computing system. Many other examples of such machine dependencies may be found through- out this book. On the other hand, there are some aspects of system software that do not directly depend upon the type of computing system being supported. For example, the general design and logic of an assembler is basically the same on most computers. Some of the code optimization techniques used by compilers are independent of the target machine (although there are also machine- dependent optimizations). Likewise, the process of linking together indepen- dently assembled subprograms does not usually depend on the computer being used. We will also see many examples of such machine-independent features in the chapters that follow. Because most system software is machine-dependent, we must include real machines and real pieces of software in our study. However, most real com- puters have certain characteristics that are unusual or even unique. It can beChapter 1 Background difficult to distinguish between those features of the software that are truly fundamental and those that depend solely on the idiosyncrasies of a particular machine. To avoid this problem, we present the fundamental functions of each piece of software through discussion of a Simplified Instructional Computer (SIC). SIC is a hypothetical computer that has been carefully designed to in- clude the hardware features most often found on real machines, while avoid- ing unusual or irrelevant complexities. In this way, the central concepts of a piece of system software can be clearly separated from the implementation de- tails associated with a particular machine. This approach provides the reader with a starting point from which to begin the design of system software for a new or unfamiliar computer. Each major chapter in this text first introduces the basic functions of the type of system software being discussed. We then consider machine- dependent and machine-independent extensions to these functions, and exam- ples of implementations on actual machines. Specifically, the major chapters are divided into the following sections: 1. Features that are fundamental, and that should be found in any example of this type of software. 2. Features whose presence and character are closely related to the machine architecture. 3. Other features that are commonly found in implementations of this type of software, and that are relatively machine-independent. 4. Major design options for structuring a particular piece of software— for example, single-pass versus multi-pass processing, 5. Examples of implementations on actual machines, stressing unusual software features and those that are related to machine characteristics. This chapter contains brief descriptions of SIC and of the real machines that are used as examples. You are encouraged to read these descriptions now, and refer to them as necessary when studying the examples in each chapter. 1.3 THE SIMPLIFIED INSTRUCTIONAL COMPUTER (SIC) In this section we describe the architecture of our Simplified Instructional Computer (SIC). This machine has been designed to illustrate the most com- monly encountered hardware features and concepts, while avoiding most of the idiosyncrasies that are often found in real machines.1.3. The Simplified Instructional Computer (SIC) Like many other products, SIC comes in two versions: the standard model and an XE version (XE stands for “extra equipment,” or perhaps “extra expen- sive”). The two versions have been designed to be upward compatible—that is, an object program for the standard SIC machine will also execute properly on a SIC/XE system. (Such upward compatibility is often found on real comput- ers that are closely related to one another.) Section 1.3.1 summarizes the stan- dard features of SIC. Section 1.3.2 describes the additional features that are included in SIC/XE. Section 1.3.3 presents simple examples of SIC and SIC/XE programming. These examples are intended to help you become more familiar with the SIC and SIC/XE instruction sets and assembler language. Practice exercises in SIC and SIC/XE programming can be found at the end of this chapter. 1.3.1 SIC Machine Architecture Memory Memory consists of 8-bit bytes; any 3 consecutive bytes form a word (24 bits). All addresses on SIC are byte addresses; words are addressed by the location of their lowest numbered byte. There are a total of 32,768 (215) bytes in the computer memory. Registers There are five registers, all of which have special uses. Each register is 24 bits in length. The following table indicates the numbers, mnemonics, and uses of these registers. (The numbering scheme has been chosen for compatibility with the XE version of SIC.) Mnemonic __Number _ Special use A 0 Accumulator; used for arithmetic operations x 1 Index register; used for addressing L 2 Linkage register; the Jump to Subroutine (JSUB) instruction stores the return address in this register PC 8 Program counter; contains the address of the next instruction to be fetched for execution sw 9 Status word; contains a variety of information, including a Condition Code (CC)Chapter 1 Background Data Formats Integers are stored as 24-bit binary numbers; 2's complement representation is used for negative values. Characters are stored using their 8-bit ASCII codes (see Appendix B). There is no floating-point hardware on the standard version of SIC Instruction Formats All machine instructions on the standard version of SIC have the following 24-bit format: 8 1 15 | opcode x address The flag bit x is used to indicate indexed-addressing mode. Addressing Modes There are two addressing modes available, indicated by the setting of the x bit in the instruction. The following table describes how the target address is calcu- lated from the address given in the instruction. Parentheses are used to indi- cate the contents of a register or a memory location. For example, (X) represents the contents of register X. Mode Indication _Target address calculation Direct x=0 TA = address Indexed — x=1 TA = address + (X) Instruction Set SIC provides a basic set of instructions that are sufficient for most simple tasks. These include instructions that load and store registers (LDA, LDX, STA, STX, etc.), as well as integer arithmetic operations (ADD, SUB, MUL, DIV). All arithmetic operations involve register A and a word in memory, with the result being left in the register. There is an instruction (COMP) that compares the value in register A with a word in memory; this instruction sets a condition code CC to indicate the result (<, =, or >). Conditional jump instructions (LT, JEQ, JGT) can test the setting of CC, and jump accordingly. Two instructions are1.3. The Simplified Instructional Computer (SIC) provided for subroutine linkage. JSUB jumps to the subroutine, placing the return address in register L; RSUB returns by jumping to the address con- tained in register L. * ‘Appendix A gives a complete list of all SIC (and SIC/XE) instructions, with their operation codes and a specification of the function performed by each. Input and Output On the standard version of SIC, input and output are performed by transfer- ring 1 byte at a time to or from the rightmost 8 bits of register A. Each device is assigned a unique 8-bit code. There are three I/O instructions, each of which specifies the device code as an operand. The Test Device (TD) instruction tests whether the addressed device is ready to send or receive a byte of data. The condition code is set to indicate the result of this test. (A setting of < means the device is ready to send or receive, and = means the device is not ready.) A program needing to transfer data must wait until the device is ready, then execute a Read Data (RD) or Write Data (WD). This sequence must be repeated for each byte of data to be read or writ- ten. The program shown in Fig. 2.1 (Chapter 2) illustrates this technique for performing I/O. 1.3.2 SIC/XE Machine Architecture Memory The memory structure for SIC/XE is the same as that previously described for SIC. However, the maximum memory available on a SIC/XE system is 1 megabyte (220 bytes). This increase leads to a change in instruction formats and addressing modes. Registers The following additional registers are provided by SIC/XE: Mnemonic Number _ Special u: B 3 Base register; used for addressing s 4 General working register—no special use T 5 General working register—no special use F 6 Floating-point accumulator (48 bits)Chapter 1 Background Data Formats SIC/XE provides the same data formats as the standard version. In addition, there is a 48-bit floating-point data type with the following format: 1 u 36 : ‘exponent fraction The fraction is interpreted as a value between 0 and 1; that is, the assumed bi- nary point is immediately before the high-order bit. For normalized floating- point numbers, the high-order bit of the fraction must be 1. The exponent is interpreted as an unsigned binary number between 0 and 2047. If the exponent has value ¢ and the fraction has value f, the absolute value of the number rep- resented is f+ 2fe-1024), The sign of the floating-point number is indicated by the value of s (0 = positive, 1 = negative). A value of zero is represented by setting all bits (including sign, exponent, and fraction) to 0. Instruction Formats The larger memory available on SIC/XE means that an address will (in gen- eral) no longer fit into a 15-bit field; thus the instruction format used on the standard version of SIC is no longer suitable. There are two possible options— either use some form of relative addressing, or extend the address field to 20 bits. Both of these options are included in SIC/XE (Formats 3 and 4 in the fol- lowing description). In addition, SIC/XE provides some instructions that do not reference memory at all. Formats 1 and 2 in the following description are used for such instructions. The new set of instruction formats is as follows. The settings of the flag bits in Formats 3 and 4 are discussed under Addressing Modes. Bit ¢ is used to dis- tinguish between Formats 3 and 4 (¢ = 0 means Format 3, ¢ = 1 means Format 4), Appendix A indicates the format to be used with each machine instruction. Format 1 (1 byte): 8 op13 The Simplified Instructional Computer (SIC) Format 2 (2 bytes): 8 4 4 op n [2 Format 3 (3 bytes): 6 pratt 12 op Format 4 (4 bytes): 6 satay 20 PP pple address Addressing Modes ‘Two new relative addressing modes are available for use with instructions assembled using Format 3. These are described in the following table: Mode h jon Target address calculation Base relative b=1,p=0 TA=(B)+disp (0< disp < 4095) Program-counter_ b=0,p=1 TA=(PC)+disp (-2048 < disp < 2047) relative For base relative addressing, the displacement field disp in a Format 3 instruc- tion is interpreted as a 12-bit unsigned integer. For program-counter relative ad- dressing, this field is interpreted as a 12-bit signed integer, with negative values represented in 2's complement notation If bits b and p are both set to 0, the disp field from the Format 3 instruction is taken to be the target address. For a Format 4 instruction, bits b and p are normally set to 0, and the target address is taken from the address field of the instruction. We will call this direct addressing, to distinguish it from the rela- tive addressing modes described above. Any of these addressing modes can also be combined with indexed ad- dressing—if bit x is set to 1, the term (X) is added in the target address calcula- tion. Notice that the standard version of the SIC machine uses only direct addressing (with or without indexing).10 Chupter 1 Background Bits i and 1 in Formats 3 and 4 are used to specify how the target address is used. If bit i= 1 and 1 = 0, the target address itself is used as the operand value; no memory reference is performed. This is called immediate addressing, If bit ( = 0 and n = 1, the word at the location given by the target address is fetched; the value contained in this word is then taken as the address of the operand value. This is called indirect addressing. If bits i and 1 are both 0 or both 1, the target address is taken as the location of the operand; we will refer to this as simple addressing. Indexing cannot be used with immediate or indi- rect addressing modes. Many authors use the term effective address to denote what we have called the target address for an instruction. However, there is disagreement concern- ing the meaning of effective address when referring to an instruction that uses indirect addressing. To avoid confusion, we use the term target address throughout this book. SIC/XE instructions that specify neither immediate nor indirect addressing are assembled with bits 1 and i both set to 1. Assemblers for the standard ver- sion of SIC will, however, set the bits in both of these positions to 0. (This is be- cause the 8-bit binary codes for all of the SIC instructions end in 00.) All SIC/XE machines have a special hardware feature designed to provide the up- ward compatibility mentioned earlier. If bits » and i are both 0, then bits b, p, and ¢ are considered to be part of the address field of the instruction (rather than flags indicating addressing modes). This makes Instruction Format 3 identical to the format used on the standard version of SIC, providing the de- sired compatibility. Figure 1.1 gives examples of the different addressing modes available on SIC/XE. Figure 1.1(a) shows the contents of registers B, PC, and X, and of se- lected memory locations. (All values are given in hexadecimal.) Figure 1.1(b) gives the machine code for a series of LDA instructions. The target address generated by each instruction, and the value that is loaded into register A, are also shown. You should carefully examine these examples, being sure you un- derstand the different addressing modes illustrated. For ease of reference, all of the SIC/XE instruction formats and addressing, modes are summarized in Appendix A. Instruction Set SIC/XE provides all of the instructions that are available on the standard version. In addition, there are instructions to load and store the new registers (LDB, STB, etc.) and to perform floating-point arithmetic operations (ADDF,1.3 The Simplified Instructional Computer (SIC) n (B) = 006000 . . (PC) = 003000 : : (x) = 000080 3030 | 003600 3600 | 103000 (006303 303 | 003030 @) Machine instruction Value loaded Hex Binary into register op on i x bo pe displaddress A 032600 000000 «1:«1 «0 ~©0 1 0 0110 0000 0000 3600 103000 o3c300 000000 «1 «1110 0 _ 0022 0000 0000 6390 00303 022030 000000 ««:«8 = 0 1 0 0000 0011 0000 3030 103000 010030 900000 «9 «S10 «8 «OO 0000 0011 0000 30 000030 003600 000000. :« SO -««O-—-1_—«-1_—(0210 0000 0000 3600 103000 0310¢303 000000 ««1«1 «8 «© «010000 1100 0011 0000 0011 c303.«S(003030 ) Figure 1.1 Examples of SIC/XE instructions and addressing modes.2 Chapter 1 Background SUBF, MULE, DIVF). There are also instructions that take their operands from registers. Besides the RMO (register move) instruction, these include register-to-register arithmetic operations (ADDR, SUBR, MULR, DIVR). A spe- cial supervisor call instruction (SVC) is provided. Executing this instruction generates an interrupt that can be used for communication with the operating system. (Supervisor calls and interrupts are discussed in Chapter 6.) There are also several other new instructions. Appendix A gives a complete list of all SIC/XE instructions, with their operation codes and a specification of the function performed by each. Input and Output The I/O instructions we discussed for SIC are also available on SIC/XE. In ad- dition, there are I/O channels that can be used to perform input and output while the CPU is executing other instructions. This allows overlap of comput- ing and I/O, resulting in more efficient system operation. The instructions SIO, TIO, and HIO are used to start, test, and halt the operation of I/O chan- nels. (These concepts are discussed in detail in Chapter 6.) 1.3.3 SIC Programming Examples This section presents simple examples of SIC and SIC/XE assembler language programming. These examples are intended to help you become more familiar with the SIC and SIC/XE instruction sets and assembler language. It is as- sumed that the reader is already familiar with the assembler language of at least one machine and with the basic ideas involved in assembly-level pro- gramming. The primary subject of this book is systems programming, not assembler language programming. The following chapters contain discussions of various types of system software, and in some cases SIC programs are used to illus- trate the points being made. This section contains material that may help you to understand these examples more easily. However, it does not contain any new material on system software or systems programming. Thus, this section can be skipped without any loss of continuity. Figure 1.2 contains examples of data movement operations for SIC and SIC/XE. There are no memory-to-memory move instructions; thus, all data movement must be done using registers. Figure 1.2(a) shows two examples of data movement. In the first, a 3-byte word is moved by loading it into register Aand then storing the register at the desired destination. Exactly the same thing could be accomplished using register X (and the instructions LDX, STX) or register L (LDL, STL). In the second example, a single byte of data is moved using the instructions LDCH (Load Character) and STCH (Store Character).1a The Simplified Instructional Computer (SIC) These instructions operate by loading or storing the rightmost 8-bit byte of register A; the other bits in register A are not affected. Figure 1.2(a) also shows four different ways of defining storage for data items in the SIC assembler language. (These assembler directives are discussed in more detail in Section 2.1.) The statement WORD reserves one word of stor- age, which is initialized to a value defined in the operand field of the state- ment. Thus the WORD statement in Fig. 1.2(a) defines a data word labeled FIVE whose value is initialized to 5. The statement RESW reserves one or more words of storage for use by the program. For example, the RESW state- ment in Fig. 1.2(a) defines one word of storage labeled ALPHA, which will be used to hold a value generated by the program. The statements BYTE and RESB perform similar storage-definition func- tions for data items that are characters instead of words. Thus in Fig. 1.2(a) CHARZ is a 1-byte data item whose value is initialized to the character “Z’ and C1 is a 1-byte variable with no initial value. LDA FIVE LOAD CONSTANT 5 INTO REGISTER A STA ALPHA STORE IN ALPHA LOCH CHARZ LOAD CHARACTER ‘Z’ INTO REGISTER A sTH cl STORE IN CHARACTER VARIABLE C1 ALPHA RESW FIVE WORD ONE-WORD VARIABLE ONE-WORD CONSTANT ONE-BYTE CONSTANT i 4 . cl RESB ONE-BYTE VARIABLE @) LDA O45 LOAD VALUE 5 INTO REGISTER A STA ALPHA STORE IN ALPHA LDA #90 LOAD ASCII CODE FOR 'Z' INTO REG A stcH cl STORE IN CHARACTER VARIABLE Cl ALPHA RESW 1 ONE-WORD VARIABLE ct RESBO 1 ONE-BYTE VARIABLE (b) Figure 1.2 Sample data movement operations for (a) SIC and (b) SIC/XE. 13vy Chapter 1 Background The instructions shown in Fig. 1.2(a) would also work on SIC/XE; how- ever, they would not take advantage of the more advanced hardware features available. Figure 1.2(b) shows the same two data-movement operations as they might be written for SIC/XE. In this example, the value 5 is loaded into register A using immediate addressing. The operand field for this instruction contains the flag # (which specifies immediate addressing) and the data value to be loaded. Similarly, the character “Z” is placed into register A by using im- mediate addressing to load the value 90, which is the decimal value of the ASCII code that is used internally to represent the character “2”. Figure 1.3(a) shows examples of arithmetic instructions for SIC. All arith- metic operations are performed using register A, with the result being left in register A. Thus this sequence of instructions stores the value (ALPHA + INCR ~1) in BETA and the value (GAMMA + INCR ~ 1) in DELTA. Figure 1.3(b) illustrates how the same calculations could be performed on SIC/XE. The value of INCR is loaded into register S initially, and the register- to-register instruction ADDR is used to add this value to register A when it is needed. This avoids having to fetch INCR from memory each time it is used in a calculation, which may make the program more efficient. Immediate ad- dressing is used for the constant 1 in the subtraction operations. Looping and indexing operations are illustrated in Fig. 1.4. Figure 1.4(a) shows a loop that copies one 11-byte character string to another. The index register (register X) is initialized to zero before the loop begins. Thus, during the first execution of the loop, the target address for the LDCH instruction will be the address of the first byte of STR1. Similarly, the STCH instruction will store the character being copied into the first byte of STR2. The next instruc- tion, TIX, performs two functions. First it adds 1 to the value in register X, and then it compares the new value of register X to the value of the operand (in this case, the constant value 11). The condition code is set to indicate the result of this comparison. The JLT instruction jumps if the condition code is set to “less than.” Thus, the JLT causes a jump back to the beginning of the loop if the new value in register X is less than 11. During the second execution of the loop, register X will contain the value 1. Thus, the target address for the LDCH instruction will be the second byte of STRI, and the target address for the STCH instruction will be the second byte of STR2. The TIX instruction will again add 1 to the value in register X, and the loop will continue in this way until all 11 bytes have been copied from STRI to STR2. Notice that after the TIX instruction is executed, the value in register X is equal to the number of bytes that have already been copied Figure 1.4(b) shows the same loop as it might be written for SIC/XE. The main difference is that the instruction TIXR is used in place of TIX. TIXR works exactly like TIX, except that the value used for comparison is taken from another register (in this case, register T), not from memory. This makes1.8 The Simplified Instructional Computer (SIC) 15 the loop more efficient, because the value does not have to be fetched from memory each time the loop is executed. Immediate addressing is used to ini- tialize register T to thg value 11 and to initialize register X to 0. BETA DELTA BETA DELTA ‘LDA ADD SUB STA ‘LDA ADD SUB STA ALPHA, INCR ONE SA ce BETA SA #1 DELTA LOAD ALPHA INTO REGISTER A ADD THE VALUE OF INCR SUBTRACT 1 STORE IN BETA LOAD GAMMA INTO REGISTER A ADD THE VALUE OF INCR SUBTRACT 1 STORE IN DELTA ONE-WORD CONSTANT ONE-WORD VARIABLES @ LOAD VALUE OF INCR INTO REGISTER S LOAD ALPHA INTO REGISTER A ADD THE VALUE OF INCR SUBTRACT 1 STORE IN BETA LOAD GAMMA INTO REGISTER A ADD THE VALUE OF INCR SUBTRACT 1 STORE IN DELTA ONE WORD VARIABLES ©) Figure 1.3 Sample arithmetic operations for (a) SIC and (b) SIC/XE.16 Chapter 1 Background LOX ZERO INITIALIZE INDEX REGISTER TO 0 MOVECH LDCH = STRI,X LOAD CHARACTER FROM STR1 INTO REG A SICH STR2,X STORE CHARACTER INTO STR2 TIX ELEVEN ADD 1 TO INDEX, COMPARE RESULT TO 11 JLT MOVECH LOOP IF INDEX IS LESS THAN 11 STRL BYTE C’TEST STRING’ 11-BYTE STRING CONSTANT STR2 RESB lL 11-BYTE VARIABLE : ONE-WORD CONSTANTS ZERO WORD 0 @) wor fll INITIALIZE REGISTER T TO 11 Lx #0 INITIALIZE INDEX REGISTER TO 0 MOVECH LDCH —STR1,x LOAD CHARACTER FROM STR1 INTO REG A SICH STR2,X STORE CHARACTER INTO STR2 TDR oT ADD 1 TO INDEX, COMPARE RESULT TO 11 JLT = MOVECH Loop IF INDEX IS LESS THAN 11 STRL BYTE C/TEST STRING’ 11-BYTE STRING CONSTANT ‘STR2 RESB oll 11-BYTE VARIABLE ) Figure 1.4 Sample looping and indexing operations for (a) SIC and (b) SICIXE. Figure 1.5 contains another example of looping and indexing operations. The variables ALPHA, BETA, and GAMMA are arrays of 100 words each. In this case, the task of the loop is to add together the corresponding elements of ALPHA and BETA, storing the results in the elements of GAMMA. The gen- eral principles of looping and indexing are the same as previously discussed. However, the value in the index register must be incremented by 3 for each it- eration of this loop, because each iteration processes a 3-byte (i.e., one-word) element of the arrays. The TIX instruction always adds 1 to register X, so it is not suitable for this program fragment. Instead, we use arithmetic and com- parison instructions to handle the index value.ADDLP = LDK ALPHA = RESW BETA RESW ZERO ‘WORD 300 ‘WORD ADDLP BETA RESW ZERO INDEX ALPHA, X BETA, X 100 100 100 #3 #300 #0 ALPHA, X BETA, X GAMA, X 8.X XT 100 100 100 1.3 The Simplified Instructional Computer (SIC) 7 INITIALIZE INDEX VALUE TO 0 LOAD INDEX VALUE INTO REGISTER X LOAD WORD FROM ALPHA INTO REGISTER A ADD WORD FROM BETA STORE THE RESULT IN A WORD IN GAMMA ADD 3 TO INDEX VALUE COMPARE NEW INDEX VALUE TO 300 LOOP IF INDEX IS LESS THAN 300 ONE-WORD VARIABLE FOR INDEX VALUE, ARRAY VARIABLES--100 WORDS EACH ONE-WORD CONSTANTS (@) INITIALIZE REGISTER S TO 3 INITIALIZE REGISTER T TO 300 INITIALIZE INDEX REGISTER TO 0 LOAD WORD FROM ALPHA INTO REGISTER A ADD WORD FROM BETA STORE THE RESULT IN A WORD IN GAMMA ADD 3 TO INDEX VALUE COMPARE NEW INDEX VALUE TO 300 LOOP IF INDEX VALUE IS LESS THAN 300 ARRAY VARIABLES--100 WORDS EACH (eo) Figure 1.5 Sample indexing and looping operations for (a) SIC and (b) SICIXE.18 Chapter 1 Background In Fig. 1.5(a), we define a variable INDEX that holds the value to be used for indexing for each iteration of the loop. Thus, INDEX should be 0 for the first iteration, 3 for the second, and so on. INDEX is initialized to 0 before the start of the loop. The first instruction in the body of the loop loads the current value of INDEX into register X so that it can be used for target address calcula- tion. The next three instructions in the loop load a word from ALPHA, add the corresponding word from BETA, and store the result in the corresponding word of GAMMA. The value of INDEX is then loaded into register A, incre- mented by 3, and stored back into INDEX. After being stored, the new value of INDEX is still present in register A. This value is then compared to 300 (the length of the arrays in bytes) to determine whether or not to terminate the loop. If the value of INDEX is less than 300, then all bytes of the arrays have not yet been processed. In that case, the JLT instruction causes a jump back to the beginning of the loop, where the new value of INDEX is loaded into regis- ter X, This particular loop is cumbersome on SIC, because register A must be used for adding the array elements together and also for incrementing the in- dex value. The loop can be written much more efficiently for SIC/XE, as shown in Fig. 1.5(b). In this example, the index value is kept permanently in register X. The amount by which to increment the index value (3) is kept in register S, and the register-to-register ADDR instruction is used to add this in- crement to register X. Similarly, the value 300 is kept in register T, and the in- struction COMPR is used to compare registers X and T in order to decide when to terminate the loop. Figure 1.6 shows a simple example of input and output on SIC; the same instructions would also work on SIC/XE. (The more advanced input and out- put facilities available on SIC/XE, such as 1/O channels and interrupts, are discussed in Chapter 6.) This program fragment reads 1 byte of data from de- vice F1 and copies it to device 05. The actual input of data is performed using the RD (Read Data) instruction. The operand for the RD is a byte in memory that contains the hexadecimal code for the input device (in this case, F1) Executing the RD instruction transfers 1 byte of data from this device into the rightmost byte of register A. If the input device is character-oriented (for ex- ample, a keyboard), the value placed in register A is the ASCII code for the character that was read. Before the RD can be executed, however, the input device must be ready to transmit the data. For example, if the input device is a keyboard, the operator must have typed a character. The program checks for this by using the TD (Test Device) instruction. When the TD is executed, the status of the addressed device is tested and the condition code is set to indicate the result of this test. If the device is ready to transmit data, the condition code is set to “less than”; if the device is not ready, the condition code is set to “equal.” As Fig. 1.61.¥ The Simplified Instructional Computer (SIC) INLOOP. 1D INDEV TEST INPUT DEVICE JQ INLOOP LOOP UNTIL DEVICE IS READY RD INDEV READ ONE BYTE INTO REGISTER A STCH DATA STORE BYTE THAT WAS READ courte = 1D ourDEV TEST OUTPUT DEVICE JEQ (OUTLP LOOP UNTIL DEVICE IS READY LOCH DATA LOAD DATA BYTE INTO REGISTER A wD ouTDEV WRITE ONE BYTE TO OUTPUT DEVICE INDEV BYTE X’F1l’ INPUT DEVICE NUMBER OUTDEV BYTE = -x’05' OUTPUT DEVICE NUMBER DATA RESB OL ONE-BYTE VARTABLE Figure 1.6 Sample input and output operations for SIC. illustrates, the program must execute the TD instruction and then check the condition code by using a conditional jump. If the condition code is “equal” (device not ready), the program jumps back to the TD instruction. This two- instruction loop will continue until the device becomes ready; then the RD will be executed. Output is performed in the same way. First the program uses TD to check whether the output device is ready to receive a byte of data. Then the byte to be written is loaded into the rightmost byte of register A, and the WD (Write Data) instruction is used to transmit it to the device. Figure 1.7 shows how these instructions can be used to read a 100-byte record from an input device into memory. The read operation in this example is placed in a subroutine. This subroutine is called from the main program by using the JSUB (Jump to Subroutine) instruction. At the end of the subroutine there is an RSUB (Return from Subroutine) instruction, which returns control to the instruction that follows the JSUB. The READ subroutine itself consists of a loop. Each execution of this loop reads 1 byte of data from the input device, using the same techniques illus- trated in Fig. 1.6. The bytes of data that are read are stored in a 100-byte buffer area labeled RECORD. The indexing and looping techniques that are used in storing characters in this buffer are essentially the same as those illustrated in Fig. 1.4(a). Figure 1.7(b) shows the same READ subroutine as it might be written for SIC/XE. The main differences from Fig. 1.7(a) are the use of immediate addressing and the TIXR instruction, as was illustrated in Fig. 1.4(a).20 Chapter 1 RLOOP INDEV RECORD 100 RLOOP RECORD Background JSUB READ MDX = ZERO ™ INDEV JEQ —-RLOOP RD INDEV STCH RECORD, X TIX 100 JLT ——-RLOOP RSUB BYTE X’F’ RESB 100 WORD 0 WoRD 100 JSUB READ LX #0 wor #100 .D INDEV JEQ -RLOOP RD INDEV STCH RECORD, X TR OT JLT RLOOP. RSUB BYTE X'Fl’ RESB 100 CALL READ SUBROUTINE SUBROUTINE TO READ 100-BYTE RECORD INITIALIZE INDEX REGISTER TO 0 TEST INPUT DEVICE LOOP IF DEVICE IS BUSY READ ONE BYTE INTO REGISTER A STORE DATA BYTE INTO RECORD ADD 1 TO INDEX AND COMPARE TO 100 LOOP IF INDEX IS LESS THAN 100 EXIT FROM SUBROUTINE INPUT DEVICE NUMBER 100-BYTE BUFFER FOR INPUT RECORD ONE-WORD CONSTANTS @ CALL READ SUBROUTINE SUBROUTINE TO READ 100-BYTE RECORD INITIALIZE INDEX REGISTER TO 0 INITIALIZE REGISTER T TO 100 TEST INPUT DEVICE Loop IF DEVICE IS BUSY READ ONE BYTE INTO REGISTER A STORE DATA BYTE INTO RECORD ADD 1 TO INDEX AND COMPARE TO 100 LOOP IF INDEX IS LESS THAN 100 EXIT FROM SUBROUTINE INPUT DEVICE NUMBER -100-BYTE BUFFER FOR INPUT RECORD ) Figure 1.7 Sample subroutine call and record input operations for (a) SIC and (b) SIC/XE.. 1.4 Traditional (CISC) Machines 1.4 TRADITIONAL (CISC) MACHINES This section introducgs the architectures of two of the machines that will be used as examples later in the text. Section 1.4.1 describes the VAX architecture, and Section 1.4.2 describes the architecture of the Intel x86 family of proces- sors. The machines described in this section are classified as Complex Instruc- tion Set Computers (CISC). CISC machines generally have a relatively large and complicated instruction set, several different instruction formats and lengths, and many different addressing modes. Thus the implementation of such an architecture in hardware tends to be complex. You may want to compare the examples in this section with the Reduced Instruction Set Computer (RISC) examples in Section 1.5. Further discussion of CISC versus RISC designs can be found in Tabak (1995). 1.4.1 VAX Architecture The VAX family of computers was introduced by Digital Equipment Corporation (DEC) in 1978. The VAX architecture was designed for compati- bility with the earlier PDP-11 machines. A compatibility mode was provided at the hardware level so that many PDP-11 programs could run unchanged on the VAX. It was even possible for PDP-11 programs and VAX programs to share the same machine in a multi-user environment. This section summarizes some of the main characteristics of the VAX archi- tecture. For further information, see Baase (1992). Memory The VAX memory consists of 8-bit bytes. All addresses used are byte ad- dresses. Two consecutive bytes form a word; four bytes form a longword; eight bytes form a quadword; sixteen bytes form an octaword. Some operations are more efficient when operands are aligned in a particular way—for example, a longword operand that begins at a byte address that is a multiple of 4. All VAX programs operate in a virtual address space of 232 bytes. This vir- tual memory allows programs to operate as though they had access to an ex- tremely large memory, regardless of the amount of memory actually present ‘on the system. Routines in the operating system take care of the details of memory management. We discuss virtual memory in connection with our study of operating systems in Chapter 6. One half of the VAX virtual address space is called system space, which contains the operating system, and is shared by all programs. The other half of the address space is called process space, and 21Chapter 1 Background is defined separately for each program. A part of the process space contains stacks that are available to the program. Special registers and machine instruc- tions aid in the use of these stacks. Registers There are 16 general-purpose registers on the VAX, denoted by RO through R15, Some of these registers, however, have special names and uses. All gen- eral registers are 32 bits in length. Register R15 is the program counter, also called PC. It is updated during instruction execution to point to the next in- struction byte to be fetched. R14 is the stack pointer SP, which points to the cur- rent top of the stack in the program's process space. Although it is possible to use other registers for this purpose, hardware instructions that implicitly use the stack always use SP. R13 is the frame pointer FP. VAX procedure call con- ventions build a data structure called a stack frame, and place its address in FP. R12 is the argument pointer AP. The procedure call convention uses AP to passa list of arguments associated with the call Registers R6 through R11 have no special functions, and are available for general use by the program. Registers RO through R35 are likewise available for general use; however, these registers are also used by some machine instruc- tions. In addition to the general registers, there is a processor status longword (PSL), which contains state variables and flags associated with a process. The PSL includes, among many other items of information, a condition code and a flag that specifies whether PDP-11 compatibility mode is being used by a process. There are also a number of control registers that are used to support various operating system functions. Data Formats Integers are stored as binary numbers in a byte, word, longword, quadword, or octaword; 2's complement representation is used for negative values. Characters are stored using their 8-bit ASCII codes. There are four different floating-point data formats on the VAX, ranging in length from 4 to 16 bytes. Two of these are compatible with those found on the PDP-11, and are standard on all VAX processors. The other two are available as options, and provide for an extended range of values by allowing more bits in the exponent field. In each case, the principles are the same as those we dis- cussed for SIC/XE: a floating-point value is represented as a fraction that is to be multiplied by a specified power of 2. VAX processors provide a packed decimal data format. In this format, each byte represents two decimal digits, with each digit encoded using 4 bits of the byte. The sign is encoded in the last 4 bits. There is also a numeric format that. 1.4 Traditional (CISC) Machines is used to represent numeric values with one digit per byte. In this format, the sign may appear either in the last byte, or as a separate byte preceding the first digit. These two variations are called trailing numeric and leading separate nu- meric. VAX also supports queues and variable-length bit strings. Data structures such as these can, of course, be implemented on any machine; however, VAX provides direct hardware support for them. There are single machine instruc- tions that insert and remove entries in queues, and perform a variety of opera tions on bit strings. The existence of such powerful machine instructions and complex primitive data types is one of the more unusual features of the VAX architecture. Instruction Formats VAX machine instructions use a variable-length instruction format. Each in- struction consists of an operation code (1 or 2 bytes) followed by up to six operand specifiers, depending on the type of instruction. Each operand specifier designates one of the VAX addressing modes and gives any additional infor- mation necessary to locate the operand. (See the description of addressing modes in the following section for further information.) Addressing Modes VAX provides a large number of addressing modes. With few exceptions, any of these addressing modes may be used with any instruction. The operand it- self may be in a register (register mode), or its address may be specified by a register (register deferred mode). If the operand address is in a register, the reg- ister contents may be automatically incremented or decremented by the operand length (autoincrement and autodecrement modes). There are several base relative addressing modes, with displacement fields of different lengths; when used with register PC, these become program-counter relative modes. All of these addressing modes may also include an index register, and many of them are available in a form that specifies indirect addressing (called deferred modes on VAX). In addition, there are immediate operands and several spe- ial-purpose addressing modes. For further details, see Baase (1992). Instruction Set One of the goals of the VAX designers was to produce an instruction set that is symmetric with respect to data type. Many instruction mnemonics are formed by combining the following elements: 2324 Chapter 1 Background 1. a prefix that specifies the type of operation, 2. a suffix that specifies the data type of the operands, 3. a modifier (on some instructions) that gives the number of operands involved. For example, the instruction ADDW2? is an add operation with two operands, each a word in length. Likewise, MULL3 is a multiply operation with three longword operands, and CVTWL specifies a conversion from word to long- word. (In the latter case, a two-operand instruction is assumed.) For a typical instruction, operands may be located in registers, in memory, or in the instruc- tion itself (immediate addressing). The same machine instruction code is used, regardless of operand locations. VAX provides all of the usual types of instructions for computation, data movement and conversion, comparison, branching, etc. In addition, there are a number of operations that are much more complex than the machine instruc- tions found on most computers. These operations are, for the most part, hard- ware realizations of frequently occurring sequences of code. They are implemented as single instructions for efficiency and speed. For example, VAX provides instructions to load and store multiple registers, and to manipulate queues and variable-length bit fields. There are also powerful instructions for calling and returning from procedures. A single instruction saves a designated set of registers, passes a list of arguments to the procedure, maintains the stack, frame, and argument pointers, and sets a mask to enable error traps for arithmetic operations. For further information on all of the VAX instructions, see Baase (1992) Input and Output Input and output on the VAX are accomplished by I/O device controllers. Each controller has a set of control/status and data registers, which are as signed locations in the physical address space. The portion of the address space into which the device controller registers are mapped is called I/O space. No special instructions are required to access registers in I/O space. An 1/0 device driver issues commands to the device controller by storing values into the appropriate registers, exactly as if they were physical memory loca- tions. Likewise, software routines may read these registers to obtain status in- formation. The association of an address in 1/O space with a physical register ina device controller is handled by the memory management routines.1.4 Traditional (CISC) Machines 1.4.2 Pentium Pro Architecture The Pentium Pro microprocessor, introduced near the end of 1995, is the latest in the Intel x86 family. Other recent microprocessors in this family are the 80486 and Pentium. Processors of the x86 family are presently used in a major- ity of personal computers, and there is a vast amount of software for these processors. It is expected that additional generations of the x86 family will be developed in the future The various x86 processors differ in implementation details and operating speed. However, they share the same basic architecture. Each succeeding gen- eration has been designed to be compatible with the earlier versions. This sec- tion contains an overview of the x86 architecture, which will serve as background for the examples to be discussed later in the book. Further infor- mation about the x86 family can be found in Intel (1995), Anderson and Shanley (1995), and Tabak (1995). Memory Memory in the x86 architecture can be described in at least two different ways. At the physical level, memory consists of 8-bit bytes. All addresses used are byte addresses. Two consecutive bytes form a word; four bytes form a double- word (also called a dword). Some operations are more efficient when operands are aligned in a particular way—for example, a doubleword operand that be- gins at a byte address that is a multiple of 4. However, programmers usually view the x86 memory as a collection of segments, From this point of view, an address consists of two parts—a segment number and an offset that points to a byte within the segment. Segments can be of different sizes, and are often used for different purposes. For example, some segments may contain executable instructions, and other segments may be used to store data. Some data segments may be treated as stacks that can be used to save register contents, pass parameters to subroutines, and for other purposes. It is not necessary for all of the segments used by a program to be in physi- cal memory. In some cases, a segment can also be divided into pages. Some of the pages of a segment may be in physical memory, while others may be stored on disk. When an x86 instruction is executed, the hardware and the op- erating system make sure that the needed byte of the segment is loaded into physical memory. The segment/offset address specified by the programmer is automatically translated into a physical byte address by the x86 Memory 2526 Chapter 1 Background Management Unit (MMU). Chapter 6 contains a brief discussion of methods that can be used in this kind of address translation. Registers There are eight general-purpose registers, which are named EAX, EBX, ECX, EDX, ESI, EDI, EBP, and ESP. Each general-purpose register is 32 bits long (i.e., one doubleword). Registers EAX, EBX, ECX, and EDX are generally used for data manipulation; it is possible to access individual words or bytes from these registers. The other four registers can also be used for data, but are more commonly used to hold addresses. The general-purpose register set is identi- cal for all members of the x86 family beginning with the 80386. This set is also compatible with the more limited register sets found in earlier members of the family. There are also several different types of special-purpose registers in the x86 architecture. EIP is a 32-bit register that contains a pointer to the next instruc- tion to be executed. FLAGS is a 32-bit register that contains many different bit flags. Some of these flags indicate the status of the processor; others are used to record the results of comparisons and arithmetic operations. There are also six 16-bit segment registers that are used to locate segments in memory. Segment register CS contains the address of the currently executing code seg- ment, and SS contains the address of the current stack segment. The other seg- ment registers (DS, ES, FS, and GS) are used to indicate the addresses of data segments. Floating-point computations are performed using a special floating-point unit (FPU). This unit contains eight 80-bit data registers and several other con- trol and status registers. All of the registers discussed so far are available to application programs. There are also a number of registers that are used only by system programs such as the operating system. Some of these registers are used by the MMU to translate segment addresses into physical addresses. Others are used to con- trol the operation of the processor, or to support debugging operations. Data Formats The x86 architecture provides for the storage of integers, floating-point values, characters, and strings. Integers are normally stored as 8-, 16-, or 32-bit binary numbers. Both signed and unsigned integers (also called ordinals) are sup- ported; 2's complement is used for negative values. The FPU can also handle 64-bit signed integers. In memory, the least significant part of a numeric value is stored at the lowest-numbered address. (This is commonly called1.4 Traditional (CISC) Machines little-endian byte ordering, because the “little end” of the value comes first in memory.) Integers can also be stored in binary coded decimal (BCD). In the unpacked BCD format, each byte represents one decimal digit. The value of this digit is encoded (in binary) in the low-order 4 bits of the byte; the high-order bits are normally zero. In the packed BCD format, each byte represents two decimal digits, with each digit encoded using 4 bits of the byte. There are three different floating-point data formats. The single-precision format is 32 bits long. It stores 24 significant bits of the floating-point value, and allows for a 7-bit exponent (power of 2). (The remaining bit is used to store the sign of the floating-point value.) The double-precision format is 64 bits long. It stores 53 significant bits, and allows for a 10-bit exponent. The extended-precision format is 80 bits long. It stores 64 significant bits, and allows for a 15-bit exponent. Characters are stored one per byte, using their 8-bit ASCII codes. Strings may consist of bits, bytes, words, or doublewords; special instructions are provided to handle each type of string. Instruction Formats All of the x86 machine instructions use variations of the same basic format. This format begins with optional prefixes containing flags that modify the op- eration of the instruction. For example, some prefixes specify a repetition count for an instruction. Others specify a segment register that is to be used for addressing an operand (overriding the normal default assumptions made by the hardware). Following the prefixes (if any) is an opcode (1 or 2 bytes); some operations have different opcodes, each specifying a different variant of the operation. Following the opcode are a number of bytes that specify the operands and addressing modes to be used. (See the description of addressing modes in the next section for further information.) The opcode is the only element that is always present in every instruction. Other elements may or may not be present, and may be of different lengths, depending on the operation and the operands involved. Thus, there are a large number of different potential instruction formats, varying in length from 1 byte to 10 bytes or more. Addressing Modes The x86 architecture provides a large number of addressing modes. An operand value may be specified as part of the instruction itself (immediate mode), or it may be in a register (register mode). 2728 Chapter 1 Background Operands stored in memory are often specified using vari eral target address calculation ions of the gen- TA = (base register) + (index register) * (scale factor) + displacement Any general-purpose register may be used as a base register; any general- purpose register except ESP can be used as an index register. The scale factor may have the value 1, 2, 4, or 8, and the displacement may be an 8-, 16-, or 32- bit value. The base and index register numbers, scale, and displacement are encoded as parts of the operand specifiers in the instruction. Various combina- tions of these items may be omitted, resulting in eight different addressing modes. The address of an operand in memory may also be specified as an ab- solute location (direct mode), or as a location relative to the EIP register (relative mode). Instruction Set The x86 architecture has a large and complex instruction set, containing more than 400 different machine instructions. An instruction may have zero, one, two, or three operands. There are register-to-register instructions, register-to- memory instructions, and a few memory-to-memory instructions. In some cases, operands may also be specified in the instruction as immediate values. Most data movement and integer arithmetic instructions can use operands that are 1, 2, or 4 bytes long. String manipulation instructions, which use repe- tition prefixes, can deal directly with variable-length strings of bytes, words, or doublewords. There are many instructions that perform logical and bit ma- nipulations, and support control of the processor and memory-management systems. The x86 architecture also includes special-purpose instructions to perform operations frequently required in high-level programming languages—for ex- ample, entering and leaving procedures and checking subscript values against the bounds of an array. Input and Output Input is performed by instructions that transfer one byte, word, or double- word at a time from an I/O port into register EAX. Output instructions trans- fer one byte, word, or doubleword from EAX to an I/O port. Repetition prefixes allow these instructions to transfer an entire string in a single operation.1.5. RISC Machines 1.5 RISC MACHINES This section introducés the architectures of three RISC machines that will be used as examples later in the text. Section 1.5.1 describes the architecture of the SPARC family of processors. Section 1.5.2 describes the PowerPC family of mi- croprocessors for personal computers. Section 1.5.3 describes the architecture of the Cray T3E supercomputing system. All of these machines are examples of RISC (Reduced Instruction Set Computers), in contrast to traditional CISC (Complex Instruction Set Computer) implementations such as Pentium and VAX. The RISC concept, de- veloped in the early 1980s, was intended to simplify the design of processors. This simplified design can result in faster and less expensive processor devel- opment, greater reliability, and faster instruction execution times. In general, a RISC system is characterized by a standard, fixed instruction length (usually equal to one machine word), and single-cycle execution of most instructions. Memory access is usually done by load and store instruc- tions only. Alll instructions except for load and store are register-to-register op- erations. There are typically a relatively large number of general-purpose registers. The number of machine instructions, instruction formats, and ad- dressing modes is relatively small. The discussions in the following sections will illustrate some of these RISC characteristics. Further information about the RISC approach, including its ad- vantages and disadvantages, can be found in Tabak (1995). 1.5.1 UltraSPARC Architecture The UltraSPARC processor, announced by Sun Microsystems in 1995, is the latest member of the SPARC family. Other members of this family include a variety of SPARC and SuperSPARC processors. The original SPARC architec- ture was developed in the mid-1980s, and has been implemented by a number of manufacturers. The name SPARC stands for scalable processor architecture. This architecture is intended to be suitable for a wide range of implementa- tions, from microcomputers to supercomputers. Although SPARC, SuperSPARC, and UltraSPARC architectures differ slightly, they are upward compatible and share the same basic structure. This section contains an overview of the UltraSPARC architecture, which will serve as background for the examples to be discussed later in the book. Further in- formation about the SPARC family can be found in Tabak (1995) and Sun Microsystems (1995a). 2930 Chapter 1 Background Memory Memory consists of 8-bit bytes; all addresses used are byte addresses. Two consecutive bytes form a halfword; four bytes form a word; eight bytes form a doubleword. Halfwords are stored in memory beginning at byte addresses that are multiples of 2. Similarly, words begin at addresses that are multiples of 4, and doublewords at addresses that are multiples of 8. UltraSPARC programs can be written using a virtual address space of 264 bytes. This address space is divided into pages; multiple page sizes are sup- ported. Some of the pages used by a program may be in physical memory, while others may be stored on disk. When an instruction is executed, the hard- ware and the operating system make sure that the needed page is loaded into physical memory. The virtual address specified by the instruction is automati- cally translated into a physical address by the UltraSPARC Memory Manage- ment Unit (MMU). Chapter 6 contains a brief discussion of methods that can be used in this kind of address translation. Registers The SPARC architecture includes a large register file that usually contains more than 100 general-purpose registers. (The exact number varies from one imple- mentation to another.) However, any procedure can access only 32 registers, designated r0 through r31. The first eight of these registers (r0 through r7) are global—that is, they can be accessed by all procedures on the system. (Register 10 always contains the value zero.) The other 24 registers available to a procedure can be visualized as a win- dow through which part of the register file can be seen. These windows over- lap, so some registers in the register file are shared between procedures. For example, registers r8 through r15 of a calling procedure are physically the same registers as 124 through r31 of the called procedure. This facilitates the passing of parameters The SPARC hardware manages the windows into the register file. If a set of concurrently running procedures needs more windows than are physically available, a “window overflow” interrupt occurs. The operating system must then save the contents of some registers in the file (and restore them later) to provide the additional windows that are needed In the original SPARC architecture, the general-purpose registers were 32 bits long. Later implementations (including UltraSPARC) expanded these registers to 64 bits. Some SPARC irhplementations provide several physically different sets of global registers, for use by application procedures and by vari- ous hardware and operating system functions. Floating-point computations are performed using a special floating-point unit (FPU). On UltraSPARC, this unit contains a file of 64 double-precision floating-point registers, and several other control and status registers.1.5 RISC Machines Besides these register files, there are a program counter PC (which contains the address of the next instruction to be executed), condition code registers, and a number of othet control registers. Data Formats The UltraSPARC architecture provides for the storage of integers, floating- point values, and characters. Integers are stored as 8-, 16-, 32-, or 64-bit binary numbers. Both signed and unsigned integers are supported; 2's complement is used for negative values. In the original SPARC architecture, the most signifi- cant part of a numeric value is stored at the lowest-numbered address. (This is commonly called big-endian byte ordering, because the “big end” of the value comes first in memory.) UltraSPARC supports both big-endian and little- endian byte orderings. There are three different floating-point data formats. The single-precision format is 32 bits long. It stores 23 significant bits of the floating-point value, and allows for an 8-bit exponent (power of 2). (The remaining bit is used to store the sign of the floating-point value.) The double-precision format is 64 bits long. It stores 52 significant bits, and allows for a 11-bit exponent. The quad-precision format stores 63 significant bits, and allows for a 15-bit expo- nent. Characters are stored one per byte, using their 8-bit ASCII codes. Instruction Formats There are three basic instruction formats in the SPARC architecture. All of these formats are 32 bits long; the first 2 bits of the instruction word identify which format is being used. Format 1 is used for the Call instruction. Format 2 is used for branch instructions (and one special instruction that enters a value into a register). The remaining instructions use Format 3, which provides for register loads and stores, and three-operand arithmetic operations. The fixed instruction length in the SPARC architecture is typical of RISC systems, and is intended to speed the process of instruction fetching and de- coding. Compare this approach with the complex variable-length instructions found on CISC systems such as VAX and x86. Addressing Modes As in most architectures, an operand value may be specified as part of the in- struction itself (immediate mode), or it may be in a register (register direct mode). Operands in memory are addressed using one of the following three modes: 3132 Chapter 1 Background Mode Target address calculation PC-relative TA = (PC) + displacement (30 bits, signed] Register indirect TA = (register) + displacement with displacement {13 bits, signed] Register indirect indexed TA = (register-1) + (register-2) PC-relative mode is used only for branch instructions. The relatively few addressing modes of SPARC allow for more efficient im- plementations than the 10 or more modes found on CISC systems such as x86. Instruction Set The basic SPARC architecture has fewer than 100 machine instructions, reflect- ing its RISC philosophy. (Compare this with the 300 to 400 instructions often found in CISC systems.) The only instructions that access memory are loads and stores. All other instructions are register-to-register operations. Instruction execution on a SPARC system is pipelined—while one instruc- tion is being executed, the next one is being fetched from memory and de- coded. In most cases, this technique speeds instruction execution. However, an ordinary branch instruction might cause the process to “stall.” The instruction following the branch (which had already been fetched and decoded) would have to be discarded without being executed. To make the pipeline work more efficiently, SPARC branch instructions (in- cluding subroutine calls) are delayed branches. This means that the instruction immediately following the branch instruction is actually executed before the branch is taken. For example, in the instruction sequence SUB LO, 11, LL BA NEXT Mov BL, 803 the MOV instruction is executed before the branch BA. This MOV instruction is said to be in the delay slot of the branch. The programmer must take this characteristic into account when writing an assembler language program. Further discussions and examples of the use of delayed branches can be found in Section 2.5.2. The UltraSPARC architecture also includes special-purpose instructions to provide support for operating systems and optimizing compilers. For exam- ple, high-bandwidth block load and store operations can be used to speed1.5 RISC Machines common operating system functions. Communication in a multi-processor system is facilitated by special “atomic” instructions that can execute without allowing other memoty accesses to intervene. Conditional move instructions may allow a compiler to eliminate many branch instructions in order to opti- mize program execution. Input and Output In the SPARC architecture, communication with 1/O devices is accomplished through memory. A range of memory locations is logically replaced by device registers. Each 1/O device has a unique address, or set of addresses, assigned to it. When a load or store instruction refers to this device register area of memory, the corresponding device is activated. Thus input and output can be performed with the regular instruction set of the computer, and no special 1/O instructions are needed. 1.5.2 PowerPC Architecture IBM first introduced the POWER architecture early in 1990 with the RS/6000. (POWER is an acronym for Performance Optimization With Enhanced RISC.) It was soon realized that this architecture could form the basis for a new fam- ily of powerful and low-cost microprocessors. In October 1991, IBM, Apple, and Motorola formed an alliance to develop and market such microprocessors, which were named PowerPC. The first products using PowerPC chips were delivered near the end of 1993. Recent implementations of the PowerPC archi- tecture include the PowerPC 601, 603, and 604; others are expected in the near future. As its name implies, PowerPC is a RISC architecture. As we shall see, it has much in common with other RISC systems such as SPARC. There are also a few differences in philosophy, which we will note in the course of the discus- sion. This section contains an overview of the PowerPC architecture, which will serve as background for the examples to be discussed later in the book. Further information about PowerPC can be found in IBM (1994a) and Tabak (1995). Memory Memory consists of 8-bit bytes; all addresses used are byte addresses. Two consecutive bytes form a halfword; four bytes form a word; eight bytes form a doubleword; sixteen bytes form a quadword. Many instructions may execute 3334 Chapter 1 Background more efficiently if operands are aligned at a starting address that is a multiple of their length. PowerPC programs can be written using a virtual address space of 264 bytes. This address space is divided into fixed-length segments, which are 256 megabytes long. Each segment is divided into pages, which are 4096 bytes long. Some of the pages used by a program may be in physical memory, while others may be stored on disk. When an instruction is executed, the hardware and the operating system make sure that the needed page is loaded into physi- cal memory. The virtual address specified by the instruction is automatically translated into a physical address. Chapter 6 contains a brief discussion of methods that can be used in this kind of address translation. Registers There are 32 general-purpose registers, designated GPRO through GPR31. In the full PowerPC architecture, each register is 64 bits long. PowerPC can also be implemented in a 32-bit subset, which uses 32-bit registers. The general- purpose registers can be used to store and manipulate integer data and addresses. Floating-point computations are performed using a special floating-point unit (FPU). This unit contains thirty-two 64bit floating-point registers, and a status and control register. A 32-bit condition register reflects the result of certain operations, and can be used as a mechanism for testing and branching. This register is divided into eight 4-bit subfields, named CRO through CR7. These subfields can be set and tested individually by PowerPC instructions. The PowerPC architecture includes a Link Register (LR) and a Count Register (CR), which are used by some branch instructions. There is also a Machine Status Register (MSR) and variety of other control and status regis- ters, some of which are implementation dependent. Data Formats The PowerPC architecture provides for the storage of integers, floating-point values, and characters. Integers are stored as 8-, 16-, 32-, or 6+-bit binary num- bers. Both signed and unsigned integers are supported; 2’s complement is used for negative values. By default, the most significant part of a numeric value is stored at the lowest-numbered address (big-endian byte ordering). It is possible to select little-endian byte ordering by setting a bit in a control register.1.5. RISC Machines There are two different floating-point data formats. The single-precision format is 32 bits long. It stores 23 significant bits of the floating-point value, and allows for an 8-bit exponent (power of 2). (The remaining bit is used to store the sign of the floating-point value.) The double-precision format is 64 bits long. It stores 52 significant bits, and allows for a 11-bit exponent. Characters are stored one per byte, using their 8-bit ASCII codes. Instruction Formats There are seven basic instruction formats in the PowerPC architecture, some of which have subforms. All of these formats are 32 bits long. Instructions must be aligned beginning at a word boundary (i.e, a byte address that is a multiple of 4). The first 6 bits of the instruction word always specify the opcode; some instruction formats also have an additional “extended opcode” field. The fixed instruction length in the PowerPC architecture is typical of RISC systems. The variety and complexity of instruction formats is greater than that found on most RISC systems (such as SPARC). However, the fixed length makes instruction decoding faster and simpler than on CISC systems like VAX and x86. ‘Addressing Modes As in most architectures, an operand value may be specified as part of the in- struction itself (immediate mode), or it may be in a register (register direct mode). The only instructions that address memory are load and store opera- tions, and branch instructions. Load and store operations use one of the following three addressing modes: Mode Target address calculation Register indirect TA = (register) Register indirect with index TA = (register-1) + (register-2) Register indirect with TA = (register) + displacement immediate index {16 bits, signed] ‘The register numbers and displacement are encoded as part of the instruction. 35Chapter 1 Background Branch instructions use one of the following three addressing modes: Mode Target address calculation Absolute TA = actual address Relative TA = current instruction address + displacement {25 bits, signed} Link Register TA = (LR) (CR) Count Register. T/ The absolute address or displacement is encoded as part of the instruction. Instruction Set The PowerPC architecture has approximately 200 machine instructions. Some instructions are more complex than those found in most RISC systems. For ex- ample, load and store instructions may automatically update the index regis- ter to contain the just-computed target address. There are floating-point “multiply and add” instructions that take three input operands and perform a multiplication and an addition in one instruction. Such instructions reflect the PowerPC approach of using more powerful instructions, so fewer instructions are required to perform a task. This is in contrast to the more usual RISC ap- proach, which keeps instructions simple so they can be executed as fast as possible. In spite of this difference in philosophy, PowerPC is generally considered to be a true RISC architecture. Further discussions of these issues can be found in Smith and Weiss (1994). Instruction execution on a PowerPC system is pipelined, as we discussed for SPARC. However, the pipelining is more sophisticated than on the original SPARC systems, with branch prediction used to speed execution. As a result, the delayed branch technique we described for SPARC is not used on PowerPC (and most other modern architectures). Further discussion of pipelining and branch prediction can be found in Tabak (1995). Input and Output The PowerPC architecture provides two different methods for performing 1/O operations. In one approach, segments in the virtual address space are mapped onto an external address space (typically an I/O bus). Segments that are mapped in this way are called direct-store segments. This method is similar to the approach used in the SPARC architecture.1.5. RISC Machines A reference to an address that is not in a direct-store segment represents a normal virtual memory access. In this situation, 1/O is performed using the regular virtual memory management hardware and software. 1.5.3 Cray T3E Architecture The TSE series of supercomputers was announced by Cray Research, Inc., near the end of 1995. The T3E is a massively parallel processing (MPP) system, de- signed for use on technical applications in scientific computing. The earlier Cray T3D system had a similar (but not identical) architecture. ‘A TBE system contains a large number of processing elements (PE), arranged in a three-dimensional network as illustrated in Fig. 1.8. This net- work provides a path for transferring data between processors. It also imple- ments control functions that are used to synchronize the operation of the PEs used by a program. The interconnect network is circular in each dimension. Thus PEs at “opposite” ends of the three-dimensional array are adjacent with respect to the network. This is illustrated by the dashed lines in Fig. 1.8; for simplicity, most of these “circular” connections have been omitted from the drawing. Each PE consists of a DEC Alpha EV5 RISC microprocessor (currently model 21164), local memory, and performance-accelerating control logic devel- oped by Cray. A T3E system may contain from 16 to 2048 processing elements. This section contains an overview of the architecture of the T3E and the DEC Alpha microprocessor. Sections 3.5.3 and 5.5.3 discuss some of the ways programs can take advantage of the multiprocessor architecture of this ma- chine. Further information about the T3E can be found in Cray Research (1995c). Further information about the DEC Alpha architecture can be found in Sites (1992) and Tabak (1995). Memory Each processing element in the T3E has its own local memory with a capacity of from 64 megabytes to 2 gigabytes. The local memory within each PE is part Ra<— Interconnect network 4 "Processing element node Figure 1.8 Overall T3E architecture. 3738 Chapter 1 Background of a physically distributed, logically shared memory system. System memory is physically distributed because each PE contains local memory. System mem- ory is logically shared because the microprocessor in one PE can access the memory of another PE without involving the microprocessor in that PE. The memory within each processing element consists of 8-bit bytes; all addresses used are byte addresses. Two consecutive bytes form a word; four bytes form a longword; eight bytes form a quadword. Many Alpha instructions may execute more efficiently if operands are aligned at a starting address that is a multiple of their length. The Alpha architecture supports 64-bit virtual addresses. Registers The Alpha architecture includes 32 general-purpose registers, designated RO through R31; R31 always contains the value zero. Each general-purpose regis- ter is 64 bits long. These general-purpose registers can be used to store and manipulate integer data and addresses. There are also 32 floating-point registers, designated FO through F31; F31 always contains the value zero. Each floating-point register is 64 bits long. In addition to the general-purpose and floating-point registers, there is a 64-bit program counter PC and several other status and control registers. Data Formats The Alpha architecture provides for the storage of integers, floating-point val- ues, and characters. Integers are stored as longwords or quadwords; 2's com- plement is used for negative values. When interpreted as an integer, the bits of a longword or quadword have steadily increasing significance beginning with bit 0 (which is stored in the lowest-addressed byte). There are two different types of floating-point data formats in the Alpha architecture. One group of three formats is included for compatibility with the VAX architecture. The other group consists of four IEEE standard formats, which are compatible with those used on most modern systems. Characters may be stored one per byte, using their 8-bit ASCII codes. However, there are no byte load or store operations in the Alpha architecture; only longwords and quadwords can be transferred between a register and memory. As a consequence, characters that are to be manipulated separately are usually stored one per longword.15. RISC Machines Instruction Formats There are five basic Instruction formats in the Alpha architecture, some of which have subforms. All of these formats are 32 bits long. (As we have noted before, this fixed length is typical of RISC systems.) The first 6 bits of the in- struction word always specify the opcode; some instruction formats also have an additional “function” field. Addressing Modes As in most architectures, an operand value may be specified as part of the in- struction itself (immediate mode), or it may be in a register (register direct mode). As in most RISC systems, the only instructions that address memory are load and store operations, and branch instructions. Operands in memory are addressed using one of the following two modes: Mode Target address calculation PC-relative TA = (PC) + displacement (23 bits, signed} Register indirect TA = (register) + displacement with displacement {16 bits, signed} Register indirect with displacement mode is used for load and store opera- tions and for subroutine jumps. PC-relative mode is used for conditional and unconditional branches. Instruction Set The Alpha architecture has approximately 130 machine instructions, reflecting its RISC orientation. The instruction set is designed so that an implementation of the architecture can be as fast as possible. For example, there are no byte or word load and store instructions. This means that the memory access interface does not need to include shift-and-mask operations. Further discussion of this approach can be found in Smith and Weiss (1994). Input and Output The T3E system performs I/O through multiple ports into one or more 1/O channels, which can be configured in a number of ways. These channels are 39Chapter 1 Background integrated into the network that interconnects the processing nodes. A system may be configured with up to one I/O channel for every eight PEs. All chan- nels are accessible and controllable from all PEs. Further information about this “scalable” I/O architecture can be found in Cray Research (1995c). EXERCISES Section 1.3 1. Write a sequence of instructions for SIC to set ALPHA equal to the product of BETA and GAMMA. Assume that ALPHA, BETA, and GAMMA are defined as in Fig. 1.3(a). 2. Write a sequence of instructions for SIC/XE to set ALPHA equal to 4* BETA - 9. Assume that ALPHA and BETA are defined as in Fig. 1.3(b). Use immediate addressing for the constants. 3. Write a sequence of instructions for SIC to set ALPHA equal to the integer portion of BETA + GAMMA. Assume that ALPHA and BETA are defined as in Fig. 1.3(a). 4, Write a sequence of instructions for SIC/XE to divide BETA by GAMMA, setting ALPHA to the integer portion of the quotient and DELTA to the remainder. Use register-to-register instructions to make the calculation as efficient as possible. 5. Write a sequence of instructions for SIC/XE to divide BETA by GAMMA, setting ALPHA to the value of the quotient, rounded to the nearest integer. Use register-to-register instructions to make the calculation as efficient as possible. 6. Write a sequence of instructions for SIC to clear a 20-byte string to all blanks. 7. Write a sequence of instructions for SIC/XE to clear a 20-byte string to all blanks. Use immediate addressing and register-to-register in- structions to make the process as efficient as possible. 8. Suppose that ALPHA is an array of 100 words, as defined in Fig. 1.5(a). Write a sequence of instructions for SIC to set all 100 elements of the array to 0. 9. Suppose that ALPHA is an array of 100 words, as defined in Fig. 1.5(b). Write a sequence of instructions for SIC/XE to set all 10010. uu. 12. 13, Exercises elements of the array to 0. Use immediate addressing and register-to- register instructions to make the process as efficient as possible. Suppose that RECORD contains a 100-byte record, as in Fig. 1.7(a). Write a subroutine for SIC that will write this record onto device 05. Suppose that RECORD contains a 100-byte record, as in Fig. 1.7(b). Write a subroutine for SIC/XE that will write this record onto device 05. Use immediate addressing and register-to-register instructions to make the subroutine as efficient as possible. Write a subroutine for SIC that will read a record into a buffer, as in Fig. 1.7(a). The record may be any length from 1 to 100 bytes. The end of the record is marked with a “null” character (ASCII code 00). The subroutine should place the length of the record read into a vari- able named LENGTH. Write a subroutine for SIC/XE that will read a record into a buffer, as in Fig. 1.7(b). The record may be any length from 1 to 100 bytes. The end of the record is marked with a “null” character (ASCII code 00). The subroutine should place the length of the record read into a vari- able named LENGTH. Use immediate addressing and register-to- register instructions to make the subroutine as efficient as possible. 41Chapter 2 Assemblers In this chapter we discuss the design and implementation of assemblers. There are certain fundamental functions that any assembler must perform, such as translating mnemonic operation codes to their machine language equivalents and assigning machine addresses to symbolic labels used by the programmer. If we consider only these fundamental functions, most assemblers are very much alike. Beyond this most basic level, however, the features and design of an as- sembler depend heavily upon the source language it translates and the ma- chine language it produces. One aspect of this dependence is, of course, the existence of different machine instruction formats and codes to accomplish (for example) an ADD operation. As we shall see, there are also many subtler ways that assemblers depend upon machine architecture. On the other hand, there are some features of an assembler language (and the corresponding as- sembler) that have no direct relation to machine architecture—they are, in a sense, arbitrary decisions made by the designers of the language. We begin by considering the design of a basic assembler for the standard version of our Simplified Instructional Computer (SIC). Section 2.1 introduces the most fundamental operations performed by a typical assembler, and de- scribes common ways of accomplishing these functions. The algorithms and data structures that we describe are shared by almost all assemblers. Thus this level of presentation gives us a starting point from which to approach the study of more advanced assembler features. We can also use this basic struc- ture as a framework from which to begin the design of an assembler for a com- pletely new or unfamiliar machine. In Section 2.2, we examine some typical extensions to the basic assembler structure that might be dictated by hardware considerations. We do this by discussing an assembler for the SIC/XE machine. Although this SIC/XE as- sembler certainly does not include all possible hardware-dependent features, it does contain some of the ones most commonly found in real machines. The principles and techniques should be easily applicable to other computers. Section 2.3 presents a discussion of some of the most commonly encoun- tered machine-independent assembler language features and their implemen- tation. Once again, our purpose is not to cover all possible options, but rather 4344 Chapter 2. Assemblers to introduce concepts and techniques that can be used in new and unfamiliar situations. Section 2.4 examines some important alternative design schemes for an as- sembler. These are features of an assembler that are not reflected in the assem- bler language. For example, some assemblers process a source program in one pass instead of two; other assemblers may make more than two passes. We are concerned with the implementation of such assemblers, and also with the en- vironments in which each might be useful. Finally, in Section 2.5 we briefly consider some examples of actual assem- blers for real machines. We do not attempt to discuss all aspects of these as- semblers in detail. Instead, we focus on the most interesting features that are introduced by hardware or software design decisions. 2.1 BASIC ASSEMBLER FUNCTIONS Figure 2.1 shows an assembler language program for the basic version of SIC. We use variations of this program throughout this chapter to show different assembler features. The line numbers are for reference only and are not part of the program. These numbers also help to relate corresponding parts of differ- ent versions of the program. The mnemonic instructions used are those intro- duced in Section 1.3.1 and Appendix A. Indexed addressing is indicated by adding the modifier “,X” following the operand (see line 160). Lines beginning with ”.” contain comments only. In addition to the mnemonic machine instructions, we have used the fol- lowing assembler directives: START Specify name and starting address for the program. END Indicate the end of the source program and (optionally) specify the first executable instruction in the program. BYTE Generate character or hexadecimal constant, occupying as many bytes as needed to represent the constant. WORD _ Generate one-word integer constant. RESB Reserve the indicated number of bytes for a data area. RESW _ Reserve the indicated number of words for a data area. The program contains a main routine that reads records from an input de- vice (identified with device code F1) and copies them to an output device (code 05). This main routine calls subroutine RDREC to read a record into a buffer and subroutine WRREC to write the record from the buffer to the out-Line 10 15, 20 25, 30 35, 40 45 50 55 60 65 70 5 85 90 95 100 105 u10 15 120 125 130 135 140 145 150 155 160 165 170 175 180 185 190 195 200 205 210 215 220 225 230 235 240 245 250 255 Source statement copy START 1000 FIRST STL RETADR coop = JSUB_ REC LDA, LENGTH COMP ZERO BQ ENDFIL JSUB = WRREC z ‘cLooP ENDFIL «LOA EOF ST! BUFFER LOA, ‘THREE, STA LENGTH JSUB = WRREC LDL RETADR SUB EOF BYTE C’EOF’ THREE = WORD 3 ZERO woRD 0 RETADR © -RESW 1 LENGTH = RESW 1 BUFFER RESB 4096 2.1 Basic Assembler Functions COPY FILE FROM INPUT TO OUTPUT SAVE RETURN ADDRESS READ INPUT RECORD TEST FOR BOF (LENGTH = 0) EXIT IF BOF FOUND WRITE OUTPUT RECORD LOOP, INSERT END OF FILE MARKER SET LENGTH = 3 WRITE BOF GET RETURN ADDRESS RETURN TO CALLER LENGTH OF RECORD 4096-BYTE BUFFER AREA ‘SUBROUTINE TO READ RECORD INTO BUFFER ROREC LOK LOA RICOP TD EXIT ed 2ERO ZERO INPUT RLOOP INPUT ZERO EXIT BUFFER, X MAXLEN LOOP LENGTH xtFL 4096 CLEAR LOOP COUNTER CLEAR A TO ZERO TEST INPUT DEVICE LOOP UNTIL READY READ CHARACTER INTO REGISTER A TEST FOR END OF RECORD (X'00") EXIT LOOP IF FOR STORE CHARACTER IN BUFFER LOOP UNLESS MAX LENGTH HAS BEEN REACHED SAVE RECORD LENGTH RETURN TO CALLER CODE FOR INPUT DEVICE SUBROUTINE TO WRITE RECORD FROM BUFFER WRREC LOX woop = 1D ourpur BYTE ZERO ourpur LOOP BUFFER, X oureur LENGTH LOOP x05" FIRST CLEAR LOOP COUNTER TEST OUTPUT DEVICE LOOP UNTIL READY GET CHARACTER FROM BUFFER WRITE CHARACTER LOOP UNTIL ALL CHARACTERS HAVE BEEN WRITTEN RETURN TO CALLER CODE FOR OUTPUT DEVICE Figure 2.1. Example of a SIC assembler language program.46 Chapter 2 Assemblers put device. Each subroutine must transfer the record one character at a time because the only I/O instructions available are RD and WD. The buffer is nec- essary because the I/O rates for the two devices, such as a disk and a slow printing terminal, may be very different. (In Chapter 6, we see how to use channel programs and operating system calls on a SIC/XE system to accom- plish the same functions.) The end of each record is marked with a null charac- ter (hexadecimal 00). If a record is longer than the length of the buffer (4096 bytes), only the first 4096 bytes are copied. (For simplicity, the program does not deal with error recovery when a record containing 4096 bytes or more is read.) The end of the file to be copied is indicated by a zero-length record. When the end of file is detected, the program writes EOF on the output device and terminates by executing an RSUB instruction. We assume that this pro- gram was called by the operating system using a JSUB instruction; thus, the RSUB will return control to the operating system. 2.1.1 A Simple SIC Assembler Figure 2.2 shows the same program as in Fig. 2.1, with the generated object code for each statement. The column headed Loc gives the machine address (in hexadecimal) for each part of the assembled program. We have assumed that the program starts at address 1000. (In an actual assembler listing, of course, the comments would be retained; they have been eliminated here to save space.) The translation of source program to object code requires us to accomplish the following functions (not necessarily in the order given): 1. Convert mnemonic operation codes to their machine language equivalents—e.g,, translate STL to 14 (line 10). 2. Convert symbolic operands to their equivalent machine addresses— e.g., translate RETADR to 1033 (line 10). 3. Build the machine instructions in the proper format. 4, Convert the data constants specified in the source program into their internal machine representations—e.g,, translate EOF to 454F46 (line 80). 5. Write the object program and the assembly listing, All of these functions except number 2 can easily be accomplished by sequen- tial processing of the source program, one line at a time. The translation of addresses, however, presents a problem. Consider the statement 10 1000 FIRST = STL RETADR 141033Line 10 1s 20 25 30 35 40 45 50 55 65 70 5 20 85 90 95 100 105 110 15, 120 125 130 135, 140 145, 150 155 160 165 170 175 180 185 190 195 200 205 210 215 220 225 230 235 240 245, 250 255, Loc 1000 1000 1003 1006 1009 00c 100F 1012 1015 1018 1018 101E 1023 1024 1027 102A 02D 1030 1033 1036 1039 2039 203¢ 203F 2042 2045 2048 2048 2045 2051 2054 2057 205A 20D 205E 2061 2064 2067 206A 2060 2070 2073 2076 2079 Source statement cory First cLooP ENDFIL, RLOOP wWLoOP START sm JSUB TA COMP JEQ JSUB a A STA mA STA JSUB . 2.1 Basic Assembler Functions ‘Object code 1000 RETADR 141033 RDREC 482039 LENGTH 001036 ZERO 281030 ENDFIL 301015, WRREC 482061, ctor 31003, EOF oo102a BUFFER 0c1039 THREE 0102p LENGTH oc1036 WRREC 482061, RETADR 081033, 4co000 c'ROF’ 454F46 3 000003, o ‘00000 1 1 4096 2ERO 041030 ZERO 001030 INPUT 50208 RLOOP 30203F INPUT a20sD ZERO 281030 Bar 302057 BUFFER, X 549039 MAXLEN 2¢205B, RLOOP 38203F LENGTH 101036 4c0000 x'FL FL 4096 001000 SUBROUTINE TO WRITE RECORD FROM BUFFER ZERO 041030 ourrur £02079 WLooP 302064 BUFFER, X 509039 oureur pc2079 LENGTH 2c1036 ‘WLOOP 382064 4co000 x'05" 05 FIRST Figure 2.2 Program from Fig. 2.1 with object code. 4748 Chapter 2. Assemblers This instruction contains a forward reference—that is, a reference to a label (RETADR) that is defined later in the program. If we attempt to translate the program line by line, we will be unable to process this statement because we do not know the address that will be assigned to RETADR. Because of this, most assemblers make two passes over the source program. The first pass does little more than scan the source program for label definitions and assign addresses (such as those in the Loc column in Fig. 2.2). The second pass per- forms most of the actual translation previously described. In addition to translating the instructions of the source program, the assem- bler must process statements called assembler directives (or pseudo-instructions), These statements are not translated into machine instructions (although they may have an effect on the object program). Instead, they provide instructions to the assembler itself. Examples of assembler directives are statements like BYTE and WORD, which direct the assembler to generate constants as part of the object program, and RESB and RESW, which instruct the assembler to re- serve memory locations without generating data values. The other assembler directives in our sample program are START, which specifies the starting memory address for the object program, and END, which marks the end of the program. Finally, the assembler must write the generated object code onto some out- put device. This object program will later be loaded into memory for execution. The simple object program format we use contains three types of records: Header, Text, and End. The Header record contains the program name, start- ing address, and length. Text records contain the translated (ie., machine code) instructions and data of the program, together with an indication of the addresses where these are to be loaded. The End record marks the end of the object program and specifies the address in the program where execution is to begin. (This is taken from the operand of the program’s END statement. If no operand is specified, the address of the first executable instruction is used.) The formats we use for these records are as follows. The details of the for- mats (column numbers, etc.) are arbitrary; however, the information contained in these records must be present (in some form) in the object program. Header record: Col.1 H Col. 2-7 Program name Col. 8-13 Starting address of object program (hexadecimal) Col. 14-19 Length of object program in bytes (hexadecimal)2.1 Basic Assembler Functions Text record: Col. 1 T Col. 2-7 Starting address for object code in this record (hexadecimal) Col. 89 Length of object code in this record in bytes (hexadecimal) Col. 10-69 Object code, represented in hexadecimal (2 columns per byte of object code) End record: Col.1 E Col. 2-7 Address of first executable instruction in object program (hexadecimal) To avoid confusion, we have used the term column rather than byte to refer to positions within object program records. This is not meant to imply the use of any particular medium for the object program. Figure 2.3 shows the object program corresponding to Fig. 2.2, using this format. In this figure, and in the other object programs we display, the symbol * is used to separate fields visually. Of course, such symbols are not present in the actual object program. Note that there is no object code corresponding to addresses 1033-2038. This storage is simply reserved by the loader for use by the program during execution. (Chapter 3 contains a detailed discussion of the operation of the loader.) We can now give a general description of the functions of the two passes of our simple assembler. YEOPY 001000901074 7001009184 410334820390010362810393010154820613C10030010240C103900102D {9010181 5010364820610810334C0000454F46000003000000 9020391804 103000103002050302038D820552810393020575490392C205E38203F 790205 7,161010366c0009F 1,001000041030£02079.302064509039DC20792c1036 90207307.3820644co00905 5901000 Figure 2.3 Object program corresponding to Fig. 2.2. 4950 Chapter 2 Assemblers Pass 1 (define symbols): 1. Assign addresses to all statements in the program. 2. Save the values (addresses) assigned to all labels for use in Pass 2 3. Perform some processing of assembler directives. (This includes processing that affects address assignment, such as determining the length of data areas defined by BYTE, RESW, etc.) Pass 2 (assemble instructions and generate object program): 1. Assemble instructions (translating operation codes and looking up addresses) 2. Generate data values defined by BYTE, WORD, etc. 3. Perform processing of assembler directives not done during Pass 1. 4, Write the object program and the assembly listing. In the next section we discuss these functions in more detail, describe the in- ternal tables required by the assembler, and give an overall description of the logic flow of each pass. 2.1.2 Assembler Algorithm and Data Structures Our simple assembler uses two major internal data structures: the Operation Code Table (OPTAB) and the Symbol Table (SYMTAB). OPTAB is used to look up mnemonic operation codes and translate them to their machine language equivalents. SYMTAB is used to store values (addresses) assigned to labels. We also need a Location Counter LOCCTR. This is a variable that is used to help in the assignment of addresses. LOCCTR is initialized to the beginning address specified in the START statement. After each source statement is processed, the length of the assembled instruction or data area to be generated is added to LOCCTR. Thus whenever we reach a label in the source program, the current value of LOCCTR gives the address to be associated with that label The Operation Code Table must contain (at least) the mnemonic operation code and its machine language equivalent. In more complex assemblers, this table also contains information about instruction format and length. During Pass 1, OPTAB is used to look up and validate operation codes in the source program. In Pass 2, it is used to translate the operation codes to machine lan- guage. Actually, in our simple SIC assembler, both of these processes could be done together in either Pass 1 or Pass 2. However, for a machine (such as SIC/XE) that has instructions of different lengths, we must search OPTAB in the first pass to find the instruction length for incrementing LOCCTR.. 2.1 Basic Assembler Functions Likewise, we must have the information from OPTAB in Pass 2 to tell us which instruction format to use in assembling the instruction, and any pecu- liarities of the object eqde instruction. We have chosen to retain this structure in the current discussion because it is typical of most real assemblers. OPTAB is usually organized as a hash table, with mnemonic operation code as the key. (The information in OPTAB is, of course, predefined when the assembler itself is written, rather than being loaded into the table at execution time.) The hash table organization is particularly appropriate, since it provides fast retrieval with a minimum of searching. In most cases, OPTAB is a static table—that is, entries are not normally added to or deleted from it. In such cases it is possible to design a special hashing function or other data structure to give optimum performance for the particular set of keys being stored. Most of the time, however, a general-purpose hashing method is used. Further in formation about the design and construction of hash tables may be found in any good data structures text, such as Lewis and Denenberg (1991) or Knuth (1973). The symbol table (SYMTAB) includes the name and value (address) for each label in the source program, together with flags to indicate error condi- tions (e.g., a symbol defined in two different places). This table may also con- tain other information about the data area or instruction labeled—for example, its type or length. During Pass 1 of the assembler, labels are entered into SYMTAB as they are encountered in the source program, along with their as- signed addresses (from LOCCTR). During Pass 2, symbols used as operands are looked up in SYMTAB to obtain the addresses to be inserted in the assem- bled instructions. SYMTAB is usually organized as a hash table for efficiency of insertion and retrieval. Since entries are rarely (if ever) deleted from this table, efficiency of deletion is not an important consideration. Because SYMTAB is used heavily throughout the assembly, care should be taken in the selection of a hashing function. Programmers often select many labels that have similar characteris- tics—for example, labels that start or end with the same characters (like LOOP1, LOOP2, LOOPA) or are of the same length (like A, X, Y, Z). It is im- portant that the hashing function used perform well with such non-random keys. Division of the entire key by a prime table length often gives good results. It is possible for both passes of the assembler to read the original source program as input. However, there is certain information (such as location counter values and error flags for statements) that can or should be communi- cated between the two passes. For this reason, Pass 1 usually writes an inter- mediate file that contains each source statement together with its assigned address, error indicators, etc. This file is used as the input to Pass 2. This work- ing copy of the source program can also be used to retain the results of certain 5152 Chapter 2 Assemblers operations that may be performed during Pass 1 (such as scanning the operand field for symbols and addressing flags), so these need not be per- formed again during Pass 2. Similarly, pointers into OPTAB and SYMTAB may be retained for each operation code and symbol used. This avoids the need to repeat many of the table-searching operations. Figures 2.4(a) and (b) show the logic flow of the two passes of our assem- bler. Although described for the simple assembler we are discussing, this is also the underlying logic for more complex two-pass assemblers that we con- sider later. We assume for simplicity that the source lines are written in a fixed format with fields LABEL, OPCODE, and OPERAND. If one of these fields contains a character string that represents a number, we denote its numeric value with the prefix # (for example, (OPERAND)). At this stage, it is very important for you to understand thoroughly the al- gorithms in Fig. 2.4. You are strongly urged to follow through the logi these algorithms, applying them by hand to the program in Fig. 2.1 to produce the object program of Fig. 2.3. Much of the detail of the assembler logic has, of course, been left out to emphasize the overall structure and main concepts. You should think about these details for yourself, and you should also attempt to identify those func- tions of the assembler that should be implemented as separate procedures or modules. (For example, the operations “search symbol table” and “read input line” might be good candidates for such implementation.) This kind of thoughtful analysis should be done before you make any attempt to actually implement an assembler or any other large piece of software. Chapter 8 contains an introduction to software engineering tools and tech- niques, and illustrates the use of such techniques in designing and implement- ing a simple assembler. You may want to read this material now to gain further insight into how an assembler might be constructed. 2.2 MACHINE-DEPENDENT ASSEMBLER FEATURES In this section, we consider the design and implementation of an assembler for the more complex XE version of SIC. In doing so, we examine the effect of the extended hardware on the structure and functions of the assembler. Many real machines have certain architectural features that are similar to those we con- sider here. Thus our discussion applies in large part to these machines as well as to SIC/XE. Figure 2.5 shows the example program from Fig. 2.1 as it might be rewrit- ten to take advantage of the SIC/XE instruction set. In our assembler lan- guage, indirect addressing is indicated by adding the prefix @ to the operand+2.2 Machine-Dependent Assembler Features Pass 1: begin read first input line if OPCODE = ‘START’ then begin save #[OPERAND] as starting address initialize LOCCTR to starting address write line to intermediate file read next input line end {if START) els initialize LOCCTR to 0 while OPCODE # ‘END’ do begin if this is not a comment line then begin if there is a symbol in the LABEL field then begin search SYMTAB for LABEL 4 found then set error flag (duplicate symbol) else insert (LABEL,LOCCTR) into SYMTAB end {if symbol) search OPTAB for OPCODE if found then add 3 (instruction length} to LOccTR else if OPCODE = ‘WORD’ then add 3 to LOCCTR else if OPCODE = ‘RESW’ then add 3 * #{OPERAND) to LOCCTR else if OPCODE = ‘RESB’ then add #[OPERAND] to LOCCTR else if OPCODE = ‘BYTE’ then begin find length of constant in bytes add length to LOCCTR end (if BYTE) else set error flag (invalid operation code) end (if not a comment) write line to intermediate file read next input line end (while not END) write last line to intermediate file save (LOCCTR - starting address) as program length end (Pass 1) Figure 2.4(a) Algorithm for Pass 1 of assembler.Chapter 2 Assemblers Pass 2: begin read first input line (from intermediate file} if OPCODE = ‘START’ then begin write listing line read next input line end (if START) write Header record to object program initialize first Text record while OPCODE # ‘END’ do begin 4f this is not a comment line then begin search OPTAB for OPCODE Af found then begin if there is a symbol in OPERAND field then begin search SYMTAB for OPERAND if found then store symbol value as operand address else begin store 0 as operand address set error flag (undefined symbol) end end {if symbol} el store 0 as operand address assemble the object code instruction end (if opcode found} else if OPCODE = ‘BYTE’ or ‘WORD’ then convert constant to object code Af object code will not fit into the current Text record then begin write Text record to object program initialize new Text record end add object code to Text record end (if not comment) write listing line read next input line end (while not END) write last Text record to object program write End record to object program write last listing line : end (Pass 2) Figure 2.4(b) Algorithm for Pass 2 of assembler.Line 10 12 B 15 20 25, 30 35 40 45 50 55 60 65 70 80 95 100 105 110 115 120 125 130 132 133 135 140 145 150 155 160 165 170 175 180 185 195 200 205 210 212 215 220 225 230 235 240 245 250 255, Source statement cory smrT 0 FIRST STL RETADR 0B #LENGTH BASE LENGTH coop = -+JSUB = RDREC LDA LENGTH comp 40 JEQ ENDFIL +3SUB © WRREC gz cLooP ENDFIL «= LOA OF STA BUFFER WA 8B STA LENGTH +USUB_WRREC J @RETADR OF BYTE —C'EOF’ RETADR = -RESW 1 LENGTH «RESW1 BUFFER ESB (4096 *2.2 Machine-Dependent Assembler Features 55 COPY FILE FROM INPUT TO OUTPUT ‘SAVE. RETURN ADDRESS ESTABLISH BASE REGISTER READ INPUT RECORD TEST FOR BOF (LENGTH = 0) EXIT IF BOF FOUND WRITE OUTPUT RECORD L00P INSERT END OF FILE MARKER SET LENGTH = 3 WRITE BOF RETURN TO CALLER LENGTH OF RECORD 4096-BYTE BUFFER AREA SUBROUTINE TO READ RECORD INTO BUFFER CLEAR |X CLEAR A CLEARS sur #4096 RWOOP = 1D INPUT JEQ RLOOP RD INPUT COMPR A,S EQ maT SICH BUFFER, X TR 7 our RLOOP Exar sm LENGTH RSUB INPUT BYTE X'FL’ CLEAR LOOP COUNTER CLEAR A TO ZERO CLEAR $ TO ZERO ‘TEST INPUT DEVICE ‘WOOP UNTIL READY READ CHARACTER INTO REGISTER A TEST FOR END OF RECORD (X’00") EXIT LOOP IF FOR STORE CHARACTER IN BUFFER LOOP UNLESS MAX LENGTH HAS BEEN REACHED SAVE RECORD LENGTH RETURN TO CALLER CODE FOR INPUT DEVICE SUBROUTINE TO WRITE RECORD FROM BUFFER CLEAR xX Lor LENGTH woop = 1D ourpur BQ WLOOP LOCH BUFFER, X wo ourpur TOR oT our LOOP SUB ourpur BYTE «x"05" =D FIRST CLEAR LOOP COUNTER TEST OUTPUT DEVICE LOOP UNTIL READY GET CHARACTER FROM BUFFER WRITE CHARACTER LOOP UNTIL ALL CHARACTERS HAVE BEEN WRITTEN RETURN TO CALLER CODE FOR OUTPUT DEVICE Figure 2.5 Example of a SIC/XE program.56 Chapter 2. Assemblers (see line 70). Immediate operands are denoted with the prefix # (lines 25, 55, 133). Instructions that refer to memory are normally assembled using either the program-counter relative or the base relative mode. The assembler direc- tive BASE (line 13) is used in conjunction with base relative addressing. (See Section 2.2.1 for a discussion and examples.) If the displacements required for both program-counter relative and base relative addressing are too large to fit into a 3-byte instruction, then the 4-byte extended format (Format 4) must be used. The extended instruction format is specified with the prefix + added to the operation code in the source statement (see lines 15, 35, 65). It is the pro- grammer’s responsibility to specify this form of addressing when it is re- quired. The main differences between this version of the program and the version in Fig. 2.1 involve the use of register-to-register instructions (in place of regis- ter-to-memory instructions) wherever possible. For example, the statement on line 150 is changed from COMP ZERO to COMPR A‘S. Similarly, line 165 is changed from TIX MAXLEN to TIXR T. In addition, immediate and indirect addressing have been used as much as possible (for example, lines 25, 55, and 70). These changes take advantage of the more advanced SIC/XE architecture to improve the execution speed of the program. Register-to-register instruc- tions are faster than the corresponding register-to-memory operations because they are shorter, and, more importantly, because they do not require another memory reference. (Fetching an operand from a register is much faster than re- trieving it from main memory.) Likewise, when using immediate addressing, the operand is already present as part of the instruction and need not be fetched from anywhere. The use of indirect addressing often avoids the need for another instruction (as in the “return” operation on line 70). You may no- tice that some of the changes require the addition of other instructions to the program. For example, changing COMP to COMPR on line 150 forces us to add the CLEAR instruction on line 132. This still results in an improvement in execution speed. The CLEAR is executed only once for each record read, whereas the benefits of COMPR (as opposed to COMP) are realized for every byte of data transferred. In Section 2.2.1, we examine the assembly of this SIC/XE program, focus- ing on the differences in the assembler that are required by the new addressing modes. (You may want to briefly review the instruction formats and target ad- dress calculations described in Section 1.3.2.) These changes are direct conse- quences of the extended hardware functions. Section 2.2.2 discusses an indirect consequence of the change to SIC/XE. The larger main memory of SIC/XE means that we may have room to load and run several programs at the same time. This kind of sharing of the ma- chine between programs is called multiprogramming. Such sharing often results in more productive use of the hardware. (We discuss this concept, and its
You might also like
System Software - An Introduction To Systems Programming (PDFDrive) OCR
PDF
No ratings yet
System Software - An Introduction To Systems Programming (PDFDrive) OCR
537 pages
System Software
PDF
No ratings yet
System Software
144 pages
System Software & Operating System
PDF
No ratings yet
System Software & Operating System
104 pages
Unit 1
PDF
No ratings yet
Unit 1
32 pages
System Software (3160715) Handbook System Software (3160715) Handbook
PDF
No ratings yet
System Software (3160715) Handbook System Software (3160715) Handbook
124 pages
Visit:: Join Telegram To Get Instant Updates: Contact: MAIL: Instagram: Instagram: Whatsapp Share
PDF
No ratings yet
Visit:: Join Telegram To Get Instant Updates: Contact: MAIL: Instagram: Instagram: Whatsapp Share
137 pages
System Software An Introduction To Systems Programming by Leland L. Beck PDF
PDF
No ratings yet
System Software An Introduction To Systems Programming by Leland L. Beck PDF
537 pages
(McGraw-Hill Computer Science Series) John J. Donovan - Systems Programming-Mcgraw-Hill College (1972) - 16-34
PDF
No ratings yet
(McGraw-Hill Computer Science Series) John J. Donovan - Systems Programming-Mcgraw-Hill College (1972) - 16-34
19 pages
System Software Notes
PDF
67% (3)
System Software Notes
100 pages
SS Lession Notes
PDF
No ratings yet
SS Lession Notes
37 pages
System Software 10CS52
PDF
No ratings yet
System Software 10CS52
206 pages
Spos by Dhamdhere
PDF
50% (2)
Spos by Dhamdhere
456 pages
Module 1
PDF
No ratings yet
Module 1
16 pages
Evolution of The Components of A Programming System
PDF
No ratings yet
Evolution of The Components of A Programming System
12 pages
Unit 1
PDF
No ratings yet
Unit 1
36 pages
System Software An Introduction To Systems Programming by Leland L. Beck PDF
PDF
No ratings yet
System Software An Introduction To Systems Programming by Leland L. Beck PDF
537 pages
System Software An Introduction To Systems Programming Leland Beck Third Editionpdf PDF
PDF
No ratings yet
System Software An Introduction To Systems Programming Leland Beck Third Editionpdf PDF
114 pages
UNIT-I SPOS (Autosaved)
PDF
No ratings yet
UNIT-I SPOS (Autosaved)
139 pages
Chapter 1 Introduction
PDF
No ratings yet
Chapter 1 Introduction
34 pages
Sys Prog
PDF
No ratings yet
Sys Prog
94 pages
CHAPTER ONE - Introduction To System Programming
PDF
No ratings yet
CHAPTER ONE - Introduction To System Programming
7 pages
Software Concepts
PDF
No ratings yet
Software Concepts
6 pages
System Software Notes 10CS52
PDF
No ratings yet
System Software Notes 10CS52
206 pages
Module 1 - 3 Opersyst
PDF
No ratings yet
Module 1 - 3 Opersyst
21 pages
Assignment 1 Ankur Sir
PDF
No ratings yet
Assignment 1 Ankur Sir
14 pages
ICT Chapter 4
PDF
No ratings yet
ICT Chapter 4
7 pages
System Programming Part1&2
PDF
No ratings yet
System Programming Part1&2
27 pages
Systems Programming by Donovan (1-108) PDF
PDF
No ratings yet
Systems Programming by Donovan (1-108) PDF
108 pages
Module 1-1
PDF
No ratings yet
Module 1-1
18 pages
System Software - An Introduction To Systems Programming (PDFDrive)
PDF
No ratings yet
System Software - An Introduction To Systems Programming (PDFDrive)
537 pages
01 Introduction
PDF
No ratings yet
01 Introduction
26 pages
Ict Chapter 03 - 1397-4-31-12-03
PDF
No ratings yet
Ict Chapter 03 - 1397-4-31-12-03
6 pages
CSK W Chapter 1 - Introduction To Computer System Notes 2
PDF
No ratings yet
CSK W Chapter 1 - Introduction To Computer System Notes 2
5 pages
CS2304 System Software UNIT I NOTES PDF
PDF
No ratings yet
CS2304 System Software UNIT I NOTES PDF
9 pages
SP Lesson1
PDF
No ratings yet
SP Lesson1
20 pages
Wa0001.
PDF
No ratings yet
Wa0001.
27 pages
Лекция - 3
PDF
No ratings yet
Лекция - 3
27 pages
System Software UNIT I
PDF
80% (5)
System Software UNIT I
9 pages
Cs54 System Software
PDF
No ratings yet
Cs54 System Software
131 pages
System Software Cs2304 Notes
PDF
33% (6)
System Software Cs2304 Notes
100 pages
System Programing PDF
PDF
No ratings yet
System Programing PDF
456 pages
Unit 1
PDF
No ratings yet
Unit 1
50 pages
Introduction To Systems Programming (CSE 405) Credit: 4 Prerequisite: CSE 301
PDF
No ratings yet
Introduction To Systems Programming (CSE 405) Credit: 4 Prerequisite: CSE 301
25 pages
SP Question Bank
PDF
No ratings yet
SP Question Bank
43 pages
Unit 1 Spos Notes
PDF
No ratings yet
Unit 1 Spos Notes
23 pages
Spos by Dhamdhere PDF
PDF
No ratings yet
Spos by Dhamdhere PDF
456 pages
System Software Cs2304 Notes
PDF
No ratings yet
System Software Cs2304 Notes
100 pages
Notes 1
PDF
No ratings yet
Notes 1
131 pages
Chapter1 Background
PDF
No ratings yet
Chapter1 Background
20 pages
Software
PDF
No ratings yet
Software
50 pages
KCG College of Technology, Chennai-96 Computer Science and Engineering
PDF
No ratings yet
KCG College of Technology, Chennai-96 Computer Science and Engineering
9 pages
Computer Systems Programming: Dr. Eyas El-Qawasmeh
PDF
No ratings yet
Computer Systems Programming: Dr. Eyas El-Qawasmeh
49 pages
Operating System 1.1
PDF
No ratings yet
Operating System 1.1
14 pages
CS1203 System Software UNIT I Question Ans
PDF
No ratings yet
CS1203 System Software UNIT I Question Ans
10 pages