COS 141 Computer Architecture: Instruction Format
1 Computer Architecture: Introduction
The objectives of this module are to understand the importance of studying Computer Architecture, indicate the
basic components and working of the traditional von Neumann architecture, discuss the different types of
computer systems that are present today, look at the different types of parallelism that programs exhibit and how
the architectures exploit these various types of parallelism.
The first and foremost reason is that Computer Architecture is an exciting subject. You will find many interesting facts about the machine that you use thrown open to you, and you will find it a very interesting course. Any computer engineer or scientist should know the underlying details of the machine he or she is going to use. You may be an application programmer, a compiler writer or any software designer. Only if you know the underlying architecture will you be able to use the machine much more effectively and improve your performance. To become an expert on computer hardware, you need to know the underlying concepts of computer architecture. Even if you are only looking at becoming a software designer, you need to understand the internals of the machine in order to improve code performance. Also, to explore new opportunities, you need to stay updated about the latest technological improvements; only then will you be able to apply those improvements to your advantage. This subject has an impact on all fields of engineering and science because computers are present everywhere: whatever field of engineering or science you are in, computers are very predominantly used, and the study of computer architecture will be very useful in order to use your machine more effectively.
A computer by definition is a sophisticated electronic calculating machine that accepts input information,
processes the information according to a list of stored instructions and finally produces the resulting output
information.
Based on the functions performed by the computer, we can identify the components of a digital computer as the data path, the control path, data storage (memory) and input/output.
The data path is the path through which information flows. It contains the arithmetic and logical unit (ALU), which includes functional units like adders, subtractors, multipliers, shifters, etc., and registers, which are used as storage media within the processor because the data has to be stored somewhere for processing.
Registers are inbuilt storage mechanisms available within the processor, and the arithmetic and logic unit (ALU) is used for performing all arithmetic and logical operations.
For the control path, you need some unit that will coordinate the activities of the various units. You should know when data flows from one point to another, when an addition operation has to take place, when a subtraction operation has to take place, and so on. So, the control path coordinates the activities of the various units of the computer system.
The data path and control path put together are called the central processing unit, popularly abbreviated as the CPU.
The data storage consists of the memory unit which stores all the information that is required for processing, the
data as well as the program. The program is nothing but a list of instructions.
Computers are only dumb machines that work according to the instructions that are given. If you instruct it to add,
it will add.
Initially the program is stored in memory; the processor takes instructions from there, executes them and outputs the results to the outside world through devices like a monitor or printer.
Apart from these classical components, every machine typically has a network component for communication with other machines. We know that we don't operate computers only as stand-alone machines; we need to communicate from one machine to another, either within a very short distance or across the globe.
Computer architecture therefore comprises computer organisation and the Instruction Set Architecture (ISA).
The ISA gives a logical view of what a computer is capable of doing, and computer organization basically talks about how the ISA is implemented. Both of these put together are normally called computer architecture, and in this course we are trying to cover both the computer organisation part as well as the ISA part.
To give a basic idea of what an instruction is, we will look at some sample instructions. Instructions basically specify commands to the processor, like transferring information from one point to another within a computer: say, for example, from one register to another register, from a memory location to a register, or to or from an input/output device. You will have specific instructions which say, transfer the information from this source to this destination.
So, instructions basically specify commands to either transfer information from one point to another within a computer, or instruct the computer to perform arithmetic and logical operations, like multiply these two numbers, etc. You also need some instructions to control the flow of the program. Say, for example, I am trying to add two numbers, and if the result is greater than something I want to take one course of action, and if the result is less than something, I want to take a different course of action. These instructions allow you to control the flow of the program. Jump instructions make control transfer to a different point. You may have a subroutine call, a function call. When we do modular programming, while executing something, you may need to go execute a function, get the result and then continue with the main program. These instructions are examples of control flow instructions.
When you have a sequence of instructions to perform a particular task, it is called a program, which is stored in
memory. Say for example, if I have to add two numbers, and those numbers are stored in memory. From memory,
you have to bring the numbers to the adder unit and add. So, we need data transfer instructions to transfer the
data from memory to the processor and an add instruction to add. The processor fetches instructions that make
up a program from the memory and performs the operations stated in those instructions exactly in that order.
Suppose you have a control flow instruction in between, and it says do not execute the next instruction but jump to some other location and execute the instruction there; then control is transferred to that point.
Once we have some idea of what these instructions are, we also need to know on what data these instructions operate. The data could be decimal numbers, binary numbers, octal numbers or encoded characters. The memory unit stores instructions as well as data as a sequence of bits. A group of bits that is stored, retrieved and processed at a time is normally called a word. The word length depends upon the processor that you are looking at: if it is an 8-bit processor, the word length is eight bits; if it is a 64-bit processor, you talk about a word length of 64 bits.
In order to read from and write to the memory, we should know how to access the memory. The memory
consists of a number of memory locations, for example, if I’m looking at 1K memory, I will have 1024 memory
locations. Just like we have unique addresses to identify our houses, each memory location has a unique address
of 10 bits in this case. In order to access the memory location, we need to know the unique address of the
memory location, and the processor reads from or writes to memory based on this memory address. A random access memory provides a fixed access time, independent of the location of the word.
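A tiny C sketch of this relationship, assuming a power-of-two memory size: n address bits can name 2^n locations, so 1024 locations need 10 bits.

    #include <stdio.h>

    int main(void) {
        unsigned long locations = 1024;    /* a 1K memory */
        int bits = 0;
        while ((1UL << bits) < locations)  /* smallest n with 2^n >= locations */
            bits++;
        printf("%lu locations need %d address bits\n", locations, bits);
        return 0;                          /* prints: 1024 locations need 10 address bits */
    }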
We define memory access time as the time that elapses between the initiation of a request and the satisfaction of the request. Say, for example, I have put in a memory read request; the time between when the request is placed and when the data actually arrives is called the memory access time. The memory access time depends upon the speed of the memory unit: a slow memory has a larger access time and a fast memory has a smaller access time.
When you look at memory, we need the memory to be fast, large enough to accommodate voluminous data and also affordable. Now, all this does not come together. So, we do not look at a flat memory system, but have a hierarchical memory system. The processor and the memory have to communicate with each other in order to read and write information. So, in order to cope with the processor speed and reduce the communication time, a small amount of fast RAM, normally known as the cache, is tightly coupled with the processor, and modern computers have multiple levels of caches. Then we have the main memory, and then the secondary storage.
The fastest memory, closest to the processor, satisfies the speed requirements, and the farthest memory satisfies the capacity requirements. The cost also decreases as we move away from the innermost level. Though main memories are quite large these days, the main memory is obviously not enough to store all your programs and data, so you need to look at secondary storage, capable of storing large amounts of data. Examples are magnetic disks and tapes, and optical discs such as CDs. Access to the data stored in secondary storage is definitely slower, but you take advantage of the fact that the most frequently accessed data is placed closer to the processor.
Having looked at the basic components of a digital computer, we should also have some means of connecting
these components together and communicating between them. The connection is done by means of wires called
a bus. The bus is nothing but an interconnection of wires, capable of carrying bits of information. Functional units
are connected by means of a group of parallel wires, each wire in a bus can transfer one bit of information and
the number of parallel wires in the bus is normally equal to the word length of the computer. When you talk about a processor with a word length of, say, 64 bits, it means the processor typically operates on 64 bits of data. So, it is only reasonable that we also have a bus which can transfer 64 bits of data from one point of the computer to another.
You know that the information handled by a computer can be either instructions or data. Instructions or machine
instructions are explicit commands that govern the transfer of information within a computer as well as between
the computer and the memory and I/O devices and specify the arithmetic and logic operations to be performed.
A list of instructions that perform a task is called a program. The program is usually stored in memory and the
processor fetches these instructions one after the other and executes them. The earliest computing machines had
fixed programs. Some very simple computers still use this design, either for simplicity or training purposes. For
example, a desk calculator is a fixed program computer. It can do basic mathematics, but it cannot be used as a
word processor or to run video games. To change the program of such a machine, you have to re-wire or
reprogram the machine. Reprogramming, when it was possible at all, was a very manual process, starting with
flow charts and paper notes, followed by detailed engineering designs, and then the often-arduous process of
implementing the physical changes.
The idea of the stored-program computer changed all that. By creating an instruction set architecture and
detailing the computation as a series of instructions (the program), the machine becomes much more flexible. By
treating those instructions in the same way as data, a stored-program machine can easily change the program,
and can do so under program control. The terms “von Neumann architecture” and “stored-program computer”
are generally used interchangeably.
Instructions, as well as data, are stored in memory as a sequence of zeros and ones and the processor executes
these instructions sequentially and program flow is controlled or governed by the type of instructions and other
factors like interrupts, etc. The fetch-and-execute cycle is repeated continuously: an instruction is fetched from memory and executed, and then you go ahead and fetch the next instruction from memory. The instruction is fetched from memory using its unique address, decoded and then executed. The instruction is, after all, a sequence of zeros and ones, and you need to know what is to be done with those zeros and ones: whether an addition or some other operation is to be performed, where the operands are available, and so on. Once the entire information is available, the processor fetches the operands, goes ahead with the execution and then finally stores the result.
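As a hedged illustration, here is a toy fetch-decode-execute loop in C (the machine, its opcodes and its one-accumulator design are invented for this sketch, not taken from any real processor):

    #include <stdint.h>
    #include <stdio.h>

    /* Toy machine: 256 words of memory, one accumulator.
       Each instruction packs an opcode (high byte) and an address (low byte). */
    enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2, OP_STORE = 3 };

    int main(void) {
        uint16_t mem[256] = {
            [0] = (OP_LOAD  << 8) | 10,   /* acc = mem[10]  */
            [1] = (OP_ADD   << 8) | 11,   /* acc += mem[11] */
            [2] = (OP_STORE << 8) | 12,   /* mem[12] = acc  */
            [3] = (OP_HALT  << 8),
            [10] = 5, [11] = 7
        };
        uint16_t pc = 0, acc = 0;
        for (;;) {
            uint16_t inst = mem[pc++];    /* fetch, and advance the PC */
            uint8_t  op   = inst >> 8;    /* decode the opcode field   */
            uint8_t  addr = inst & 0xFF;  /* decode the address field  */
            if      (op == OP_HALT)  break;
            else if (op == OP_LOAD)  acc = mem[addr];    /* execute */
            else if (op == OP_ADD)   acc += mem[addr];
            else if (op == OP_STORE) mem[addr] = acc;
        }
        printf("mem[12] = %u\n", mem[12]);   /* prints mem[12] = 12 */
        return 0;
    }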
The advantage of the stored program concept is that programs can simply be shipped as files of binary numbers that maintain binary compatibility, and computers can inherit ready-made software provided they are compatible with the existing ISA.
Computer organization, as we pointed out earlier is the realization of the instruction set architecture. You will
have to look at the characteristics of the principal components that make up your computer system, ways in
which these computer systems are interconnected and how information flows between these components.
There have been a lot of technological improvements happening starting from 1951: from vacuum tubes we went to transistors, ICs, VLSI and ultra large-scale ICs, and so on. We find that processor transistor counts have increased about 32 to 40% every year, thanks to Moore's Law. Moore's Law was proposed by Gordon Moore of Intel in 1965; he proposed that transistor densities would double every 18 to 24 months, and that has really held good. Memory capacity has also gone up, at about 60% per year. All these technological advancements give room for better or new applications. The applications demand more and more, the processors become better and better, and this is a cycle that feeds itself. Performance improved greatly from 1978 to 2005. After 2005, you find that performance growth has actually slowed down due to what are called the power wall and the memory wall.
You have different classes or different types of computer systems that are available.
One is the desktop and notebook computers, the most competitive market. Here we look at general-purpose applications where you plan to run a lot of applications, and the main constraint is the cost-performance trade-off.
The next category of computer systems is the server systems, where they need to have high capacity and
performance is very important. For servers, reliability and availability are very important. Throughput needs to be
high for such systems.
We also have embedded systems, where the computers are hidden as part of a larger system. For example, when
you look at a mobile phone, you don’t realize that it is a computer system but you know that there are many
processors inside your mobile phone. A washing machine is a simple example of an embedded system. These
embedded computers have a stringent power performance requirement, they have stringent cost constraints and
they are specifically meant for a particular application. It is a processor which is meant to do a particular task; unlike a desktop processor, it is not going to run a whole range of applications. It is expected to perform well with respect to that particular application, and this is a class of computer system which covers a wide range of applications.
Your requirements may range from a very small toy-car application to a very sophisticated diagnostic system or a surveillance mechanism, for example. Depending on that, all your requirements are going to change. Last of all, you also have the personal mobile devices (PMDs), which are very predominant today, where cost is important, energy is important and media performance becomes very important. Personal mobile devices also have to place a lot of importance on responsiveness: once you put in a request to a PMD, you expect to get an answer immediately. So, responsiveness is very important when you are looking at personal mobile devices. And of course, these days you also have clusters and warehouse-scale computers that are becoming very popular.
You have a large number of computers put together, called a cluster. Here again, price-performance becomes very important and throughput is important. The number of transactions done per unit time, or the number of web requests serviced, becomes very important when you are looking at clusters. It is again much the same as for your servers, and energy proportionality also gains a lot of importance when you look at this type of computer system.
The main driving forces of computer systems are energy and cost. Today everybody is striving to design computer systems which will minimize energy and cost. Also, we have to look at the different types of parallelism that applications exhibit and try to exploit this parallelism in the computer systems that we design. So that becomes the primary driving force of a computer system. The different types of parallelism that programs may exhibit are called data level parallelism and task level parallelism, and you need to design systems that exploit them. There are different techniques that processors use to exploit parallelism. Even in a sequential execution, there are different techniques available to exploit instruction level parallelism (ILP), i.e., executing independent instructions in parallel. When there is data level parallelism available in programs, vector processors and SIMD styles of architecture try to exploit it. Thread level parallelism, where the processor has multiple threads of execution, exploits parallelism more in terms of task level parallelism, and when it is done in a more loosely coupled architecture we call it request level parallelism. So, applications exhibit different types of parallelism, and the computer hardware that you are designing should try to exploit that parallelism to give better performance.
The objectives of this module are to understand the importance of the instruction set architecture, discuss the features that need to be considered when designing the instruction set architecture of a machine, and look at an example ISA, MIPS (Microprocessor without Interlocked Pipeline Stages).
We’ve already seen that the computer architecture course consists of two components – the instruction set
architecture and the computer organization itself.
The data on which operations are performed are stored in the computer. Different computers have their own set
of instructions. The CPU or processor takes all these instructions from memory and decodes the bits to carry out
the instructions. The layout of the bits of an instruction is called the instruction format.
The instruction format defines the layout of the sequence of bits (0s and 1s) contained in a machine instruction. The machine instruction contains a number of bits (a pattern of 0s and 1s), and these bits are grouped together into what are called fields.
Instructions can be of variable length depending upon the number of addresses they contain. Generally, CPU organization is of three types, based on the number of address fields:
• Single accumulator organization
• General register organization
• Stack organization
Computers have three formats for instruction code:
memory reference, register and input/output.
An instruction format defines the different components of an instruction. The main components of an instruction are the opcode (which operation is to be executed) and the operands (the data on which the operation is to be executed).
The ISA specifies what the processor is capable of doing, and the computer organization specifies how it gets accomplished.
So, the instruction set architecture is basically the interface between your hardware and the software. The only way you can interact with the hardware is through the instruction set of the processor.
To command the computer, you need to speak its language: the instructions are the words of a computer's language, and the instruction set is basically its vocabulary. Unless you know this vocabulary well, you cannot gain the full benefit of the machine.
ISA is the portion of the machine which is visible to either the assembly language programmer or a compiler
writer or an application programmer.
It is the only interface that you have, because the instruction set architecture is the specification of what the
computer can do and the machine has to be fabricated in such a way that it will execute whatever has been
specified in your ISA. The only way that you can talk to your machine is through the ISA. This gives you an idea of
the interface between the hardware and software.
Let us assume you have a high-level program written in C, which is independent of the architecture on which you want to work. This high-level program has to be translated into an assembly language program which is specific to a particular architecture. You will find that it consists of a number of instructions like LOAD, STORE, ADD, etc., where whatever you had written in the high-level language has now been translated into a set of instructions specific to that architecture. All these instructions are part of the instruction set architecture of the MIPS architecture. These are English-like and not understandable to the processor, because the processor is after all made up of digital components which can understand only zeros and ones. So, the assembly language will have to be further translated into machine language, the object code, which consists of zeros and ones. The translation from your high-level language to assembly language and then to binary code is done by the compiler and the assembler.
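As an illustration, a simple assignment such as a = b + c might be translated into MIPS-style assembly roughly as follows (a simplified sketch: the register names $t0-$t2 are illustrative, and real MIPS loads and stores address memory through a base register, so the lw/sw forms below are assembler pseudo-instructions):

    lw  $t1, b          # LOAD: copy the value of b from memory into register $t1
    lw  $t2, c          # LOAD: copy the value of c into register $t2
    add $t0, $t1, $t2   # ADD: $t0 = $t1 + $t2
    sw  $t0, a          # STORE: write the result back to memory location a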
We shall look at the instruction set features and see what will go into the 0s and 1s, and how to interpret the 0s and 1s as data, instructions or addresses. The ISA that is designed should last through many implementations, so it should have portability and compatibility; it should be usable in many different ways, so it should have generality; and it should also provide convenient functionality to other levels. The taxonomy of ISAs is given below.
Taxonomy
ISAs differ based on the internal storage in a processor. Accordingly, the ISA can be classified as follows, based on
where the operands are stored and whether they are named explicitly or implicitly:
Single accumulator organization, which names one of the general-purpose registers as the accumulator and uses
it to necessarily store one of the operands. This indicates that one of the operands is implied to be in the
accumulator and it is enough if the other operand is specified along with the instruction.
General register organization, which specifies all the operands explicitly. Depending on whether the operands
are available in memory or registers, it can be further classified as:
• Register – register, where registers are used for storing operands. Such architectures are in fact also
called load – store architectures, as only load and store instructions can have memory operands.
• Register – memory, where one operand is in a register and the other one in memory.
• Memory – memory, where all the operands are specified as memory operands.
Stack organization, where the operands are put into the stack and the operations are carried out on the top of
the stack. The operands are implicitly specified here.
Let us assume you have to perform the operation A = B + C, where all three operands are memory operands.
• In an accumulator-based ISA, where one of the general-purpose registers is designated as the accumulator, one of the operands is assumed to be available in the accumulator. You have to initially load one operand into the accumulator, and the ADD instruction then only specifies the address of the other operand.
• In the register – memory ISA, one operand has to be in a register and the other one can be a memory operand.
• In the register – register ISA, both operands will have to be in registers and the ADD instruction will only work on registers.
• In the memory – memory ISA, both operands are memory operands, so you can directly add with a single instruction.
• In a stack-based ISA, you will have to first push both operands onto the stack and then simply give an add instruction, which will add the top two elements of the stack and then store the result on the stack.
The corresponding instruction sequences are sketched below.
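A sketch of the sequences for A = B + C in each style, in the generic source-before-destination notation used in this module (the exact mnemonics vary from machine to machine):

    Accumulator:          Load B           ; accumulator = B
                          Add C            ; accumulator = accumulator + C
                          Store A          ; A = accumulator

    Register - memory:    Move B, R1       ; R1 = B
                          Add C, R1        ; R1 = R1 + C
                          Move R1, A       ; A = R1

    Register - register:  Load B, R1       ; R1 = B
                          Load C, R2       ; R2 = C
                          Add R1, R2, R3   ; R3 = R1 + R2
                          Store R3, A      ; A = R3

    Memory - memory:      Add B, C, A      ; A = B + C

    Stack:                Push B           ; push B onto the stack
                          Push C           ; push C
                          Add              ; pop two entries, push their sum
                          Pop A            ; A = popped result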
So, you can see from these examples that you have different ways of executing the same operation, and it
obviously depends upon the ISA. Among all these ISAs, it is the register – register ISA that is very popular and
used in all RISC architectures.
We shall now look at what are the different features that need to be considered when designing the instruction
set architecture. They are:
• Types of instructions (Operations in the Instruction set)
• Types and sizes of operands
• Addressing Modes
• Addressing Memory
• Encoding and Instruction Formats
• Compiler related issues
First of all, you have to decide on the types of instructions, i.e., what are the various instructions that you want to
support in the ISA.
The tasks carried out by a computer program consist of a sequence of small steps:
• such as multiplying two numbers,
• moving a data from a register to a memory location,
• testing for a particular condition like zero,
• reading a character from the input device or sending a character to be displayed to the output device, etc.
A computer must have the following types of instructions:
• Data transfer instructions
• Data manipulation instructions
• Program sequencing and control instructions
• Input and output instructions
Data transfer instructions perform data transfer between the various storage places in the computer system, viz. registers, memory and I/O. Since both the instructions and the data are stored in memory, the processor needs to read the instructions and data from memory. After processing, the results are stored in memory. Therefore, two basic operations involving the memory are needed, namely, Load (Read or Fetch) and Store (Write).
The Load operation transfers a copy of the data from the memory to the processor and the Store operation
moves the data from the processor to memory. Other data transfer instructions are needed to transfer data from
one register to another or from/to I/O devices and the processor.
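For example, in the generic notation of this module (LOC and R2 are illustrative names):

    Move LOC, R2    ; load: copy the contents of memory location LOC into register R2
    Move R2, LOC    ; store: copy the contents of register R2 into memory location LOC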
Data manipulation instructions perform operations on data and indicate the computational capabilities for the
processor. These operations can be: arithmetic operations, logical operations or shift operations.
Arithmetic operations include: addition (with and without carry), subtraction (with and without borrow),
multiplication, division, increment, decrement and finding the complement of a number.
The logical and bit manipulation instructions include: AND, OR, XOR, Clear carry, set carry, etc. Similarly, you can
perform different types of shift and rotate operations.
We generally assume a sequential flow of instructions; that is, instructions that are stored in consecutive locations are executed one after the other. However, you have program sequencing and control instructions that help you change the flow of the program. This is best explained with an example. Consider the task of adding a list of n numbers. A possible sequence is given below.
Move DATA1, R0
Add DATA2, R0
Add DATA3, R0
...
Add DATAn, R0
Move R0, SUM
The addresses of the memory locations containing the n numbers are symbolically given as DATA1, DATA2, …, DATAn, and a separate Add instruction is used to add each number to the contents of register R0. After all the numbers have been added, the result is placed in memory location SUM.
Instead of using a long list of Add instructions, it is possible to place a single Add instruction in a program loop, as
shown below:
Move N, R1
Clear R0
LOOP Determine address of “Next” number and add “Next” number to R0
Decrement R1
Branch > 0, LOOP
Move R0, SUM
The loop is a straight-line sequence of instructions executed as many times as needed. It starts at location LOOP
and ends at the instruction Branch>0.
During each pass through this loop, the address of the next list entry is determined, and that entry is fetched and
added to R0.
The address of an operand can be specified in various ways, as will be described in the next section. For now, you
need to know how to create and control a program loop. Assume that the number of entries in the list, n, is
stored in memory location N. Register R1 is used as a counter to determine the number of times the loop is
executed. Hence, the contents of location N are loaded into register R1 at the beginning of the program. Then,
within the body of the loop, the instruction, Decrement R1 reduces the contents of R1 by 1 each time through the
loop. The execution of the loop is repeated as long as the result of the decrement operation is greater than zero.
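For comparison, here is a minimal C sketch of the same computation (the variable names mirror the registers and locations of the assembly example; the order of the decrement and the test differs slightly from the Branch>0 form):

    #include <stdio.h>

    int main(void) {
        int data[] = {3, 1, 4, 1, 5};   /* DATA1 .. DATAn */
        int n = 5;                      /* contents of memory location N */
        int r0 = 0;                     /* Clear R0 */
        int r1 = n;                     /* Move N, R1: loop counter */
        int i = 0;
        while (r1 > 0) {                /* Branch > 0, LOOP */
            r0 += data[i++];            /* add "Next" number to R0 */
            r1--;                       /* Decrement R1 */
        }
        printf("SUM = %d\n", r0);       /* Move R0, SUM: prints SUM = 14 */
        return 0;
    }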
You should now be able to understand branch instructions. This type of instruction loads a new value into the
program counter. As a result, the processor fetches and executes the instruction at this new address, called the
branch target, instead of the instruction at the location that follows the branch instruction in sequential address
order. The branch instruction can be conditional or unconditional.
An unconditional branch instruction does a branch to the specified address irrespective of any condition.
A conditional branch instruction causes a branch only if a specified condition is satisfied. If the condition is not
satisfied, the PC is incremented in the normal way, and the next instruction in sequential address order is fetched
and executed.
In the example above, the instruction Branch>0 LOOP (branch if greater than 0) is a conditional branch instruction
that causes a branch to location LOOP if the result of the immediately preceding instruction, which is the
decremented value in register R1, is greater than zero. This means that the loop is repeated as long as there are
entries in the list that are yet to be added to R0.
At the end of the nth pass through the loop, the Decrement instruction produces a value of zero, and, hence,
branching does not occur. Instead, the Move instruction is fetched and executed. It moves the final result from R0
into memory location SUM.
Some ISAs refer to such instructions as Jumps. The processor keeps track of information about the results of
various operations for use by subsequent conditional branch instructions. This is accomplished by recording the
required information in individual bits, often called condition code flags. These flags are usually grouped together
in a special processor register called the condition code register or status register.
Individual condition code flags are set to 1 or cleared to 0, depending on the outcome of the operation performed.
Some of the commonly used flags are: Sign, Zero, Overflow and Carry.
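A minimal C sketch of how such flags can be derived, assuming 8-bit arithmetic and a subtract operation (the flag conventions of real processors differ in detail):

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint8_t a = 5, b = 7;
        uint16_t wide = (uint16_t)a - (uint16_t)b;  /* widen so the borrow stays visible */
        uint8_t result = (uint8_t)wide;

        int zero  = (result == 0);                  /* Zero: result is all zeros  */
        int sign  = (result >> 7) & 1;              /* Sign: most significant bit */
        int carry = (wide >> 8) & 1;                /* Carry: borrow out of bit 7 */
        int ovf   = (((a ^ b) & (a ^ result)) >> 7) & 1;  /* signed Overflow on subtract */

        printf("Z=%d S=%d C=%d V=%d\n", zero, sign, carry, ovf);  /* Z=0 S=1 C=1 V=0 */
        return 0;
    }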
The call and return instructions are used in conjunction with subroutines. A subroutine is a self-contained
sequence of instructions that performs a given computational task.
During the execution of a program, a subroutine may be called to perform its function many times at
various points in the main program. Each time a subroutine is called, a branch is executed to the beginning of the
subroutine to start executing its set of instructions. After the subroutine has been executed, a branch is made
back to the main program, through the return instruction. Interrupts can also change the flow of a program.
A program interrupt refers to the transfer of program control from a currently running program to
another service program as a result of an external or internally generated request. Control returns to the original
program after the service program is executed.
The interrupt procedure is, in principle, quite similar to a subroutine call, except for three variations:
(1) the interrupt is usually initiated by an internal or external signal rather than by the execution of an instruction;
(2) the address of the interrupt service program is determined by the hardware, or from some information in the interrupt signal or the instruction causing the interrupt; and
(3) an interrupt procedure usually stores all the information necessary to define the state of the CPU, rather than storing only the program counter.
Therefore, when the processor is interrupted, it saves the current status of the processor, including the return
address, the register contents and the status information called the Processor Status Word (PSW), and then
jumps to the interrupt handler or the interrupt service routine. Upon completing this, it returns to the main
program. Interrupts are handled in detail in the next unit on Input / Output.
Input and output instructions are used for transferring information between the registers, memory and the input/output devices. It is possible to use special instructions that exclusively perform I/O transfers, or to use the ordinary memory-access instructions to do I/O transfers (memory-mapped I/O).
Suppose you are designing an embedded processor which is meant to perform a particular application; then you will definitely have to bring in instructions which are specific to that application. But when designing a general-purpose processor, you look at including all the general types of instructions. Examples of specialized instructions are media and signal processing related instructions, say vector types of instructions which try to exploit data level parallelism, where the same addition or subtraction is done on different data; you may also have to look at saturating arithmetic operations and multiply-and-accumulate instructions.
The data types and sizes indicate the various data types supported by the processor and their lengths. Common operand types are:
• Character (8 bits)
• Half word (16 bits)
• Word (32 bits)
• Single precision floating point (1 word)
• Double precision floating point (2 words)
• Integers – two's complement binary numbers
• ASCII characters
• Floating point numbers following the IEEE 754 standard
• Packed / unpacked decimal numbers
The operation field of an instruction specifies the operation to be performed. This operation must be executed on some data that is given straight away in the instruction, or stored in computer registers or memory words. The way the operands are chosen during program execution depends on the addressing mode of the instruction. The addressing mode specifies a rule for interpreting or modifying the address field of the instruction before the operand is actually referenced. In this section, you will learn the most important addressing modes found in modern processors.
Computers use addressing mode techniques for the purpose of accommodating one or both of the following:
1. To give programming versatility to the user by providing such facilities as pointers to memory, counters for loop control, indexing of data, and program relocation.
2. To reduce the number of bits in the addressing field of the instruction.
When you write programs in a high-level language, you use constants, local and global variables, pointers, and
arrays. When translating a high-level language program into assembly language, the compiler must be able to
implement these constructs using the facilities provided in the instruction set of the computer in which the
program will be run. The different ways in which the location of an operand is specified in an instruction are
referred to as addressing modes. Variables and constants are the simplest data types and are found in almost
every computer program. In assembly language, a variable is represented by allocating a register or a memory
location to hold its value.
Register mode — The operand is the contents of a processor register; the name (address) of the register is given
in the instruction.
Absolute mode — The operand is in a memory location; the address of this location is given explicitly in the
instruction. This is also called Direct.
Address and data constants can be represented in assembly language using the Immediate mode.
Immediate mode — The operand is given explicitly in the instruction. For example, the instruction:
Move 200 (immediate), R0
places the value 200 in register R0.
Clearly, the Immediate mode is only used to specify the value of a source operand. A common convention is to
use the sharp sign (#) in front of the value to indicate that this value is to be used as an immediate operand.
Hence, we write the instruction above in the form:
Move #200, R0.
Constant values are used frequently in high-level language programs. For example, the statement A = B + 6
contains the constant 6.
Assuming that A and B have been declared earlier as variables and may be accessed using the Absolute mode, this
statement may be compiled as follows:
Move B, R1
Add #6, R1
Move R1, A
Constants are also used in assembly language to increment a counter, test for some bit pattern, and so on.
Indirect mode — Here, the instruction does not give the operand or its address explicitly. Instead, it provides
information from which the memory address of the operand can be determined. We refer to this address as the
effective address (EA) of the operand.
In this mode, the EA of the operand is the contents of a register or memory location whose address
appears in the instruction.
We denote indirection by placing the name of the register or the memory address given in the instruction
in parentheses. For example, consider the instruction:
Add (R1), R0.
To execute the Add instruction, the processor uses the value in register R1 as the effective address of the operand.
It requests a read operation from the memory to read the contents of this location. The value read is the
desired operand, which the processor adds to the contents of register R0.
Indirect addressing through a memory location is also possible as indicated in the instruction:
Add (A), R0.
In this case, the processor first reads the contents of memory location A, then requests a second read operation
using this value as an address to obtain the operand.
The register or memory location that contains the address of an operand is called a pointer.
Indirection and the use of pointers are important and powerful concepts in programming. Changing the contents of location A in the example fetches a different operand to add to register R0.
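The same idea in C, as a hedged sketch (indirect addressing corresponds to dereferencing a pointer; the names are illustrative):

    #include <stdio.h>

    int main(void) {
        int b = 10, c = 20;
        int *a = &b;               /* "location A" holds the address of the operand */
        int r0 = 0;
        r0 += *a;                  /* Add (A), R0: fetch the operand A points to */
        a = &c;                    /* change the contents of A ...               */
        r0 += *a;                  /* ... and the same instruction fetches c     */
        printf("R0 = %d\n", r0);   /* prints R0 = 30 */
        return 0;
    }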
Index mode — This addressing mode provides a different kind of flexibility for accessing operands. It is useful in
dealing with lists and arrays.
In this mode, the effective address of the operand is generated by adding a constant value (displacement)
to the contents of a register. The register used may be either a special register provided for this purpose, or may
be any one of the general-purpose registers in the processor. In either case, it is referred to as an index register.
Symbolically, the Index mode is written as X(Ri), where X denotes the constant value contained in the instruction and Ri is the name of the register involved.
The effective address of the operand is given by EA = X + [Ri].
The contents of the index register are not changed in the process of generating the effective address.
In an assembly language program, the constant X may be given either as an explicit number or as a
symbolic name representing a numerical value.
When the instruction is translated into machine code, the constant X is given as a part of the instruction
and is usually represented by fewer bits than the word length of the computer.
Since X is a signed integer, it must be sign-extended to the register length before being added to the
contents of the register.
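A small C sketch of the Index mode computation EA = X + [Ri], assuming byte addressing (the start address of the array plays the role of the constant X, and a byte offset plays the role of the index register):

    #include <stdio.h>

    int main(void) {
        int a[5] = {10, 20, 30, 40, 50};
        long ri = 3 * sizeof(int);            /* index register: byte offset of element 3 */
        /* effective address = X + [Ri], with X = start address of the array */
        int value = *(int *)((char *)a + ri);
        printf("%d\n", value);                /* prints 40, the same element as a[3] */
        return 0;
    }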
Relative mode — We defined the Index mode above using general-purpose processor registers. A useful version of this mode is obtained if the program counter (PC) is used instead of a general-purpose register.
Then, X (PC) can be used to address a memory location that is X bytes away from the location presently pointed
to by the PC.
Since the addressed location is identified “relative” to the program counter, which always identifies the current
execution point in a program, the name Relative mode is associated with this type of addressing.
In this case, the effective address is determined by the Index mode using the program counter in place of the
general-purpose register Ri.
This addressing mode is generally used with control flow instructions. Though this mode can be used to access data operands, its most common use is to specify the target address in branch instructions. An instruction such as Branch > 0 LOOP, which we discussed earlier, causes program execution to go to the branch target location identified by the name LOOP if the branch condition is satisfied.
This location can be computed by specifying it as an offset from the current value of the PC.
Since the branch target may be either before or after the branch instruction, the offset is given as a signed
number.
Recall that during the execution of an instruction, the processor increments the PC to point to the next instruction.
Most computers use this updated value in computing the effective address in the Relative mode.
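As a small worked example (the numbers are illustrative): suppose the branch instruction sits at address 1000 on a machine with 4-byte instructions, so the updated PC holds 1004 while the branch executes. If the target LOOP is at address 988, the instruction encodes the signed offset 988 - 1004 = -16, and the branch target is computed as [PC] + (-16) = 988.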
The two modes described next are useful for accessing data items in successive locations in the memory.
Autoincrement mode — The EA of the operand is the contents of a register specified in the instruction. After
accessing the operand, the contents of this register are automatically incremented to point to the next item in a
list. We denote the Autoincrement mode by putting the specified register in parentheses, to show that the contents of the register are used as the EA, followed by a plus sign to indicate that these contents are to be incremented after the operand is accessed. Thus, we write (Ri)+.
Autodecrement mode — As a companion for the Autoincrement mode, another useful mode accesses the items
of a list in the reverse order. In the autodecrement mode, the contents of a register specified in the instruction
are first automatically decremented and are then used as the effective address of the operand.
We denote the Autodecrement mode by putting the specified register in parentheses, preceded by a minus sign
to indicate that the contents of the register are to be decremented before being used as the effective address.
Thus, we write -(Ri).
In this mode, operands are accessed in descending address order.
You may wonder why the address is decremented before it is used in the Autodecrement mode and incremented
after it is used in the Autoincrement mode. The main reason for this is that these two modes can be used
together to implement a stack.
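A hedged C sketch of that pairing, assuming a stack that grows toward lower memory addresses:

    #include <stdio.h>

    int main(void) {
        int stack[8];
        int *sp = stack + 8;       /* stack pointer starts just past the high end */

        *--sp = 1;                 /* push: autodecrement, then store   (-(SP))  */
        *--sp = 2;                 /* push another item */
        int x = *sp++;             /* pop: load, then autoincrement     ((SP)+)  */
        int y = *sp++;             /* pop the next item */
        printf("%d %d\n", x, y);   /* prints 2 1: last in, first out */
        return 0;
    }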
The sections have shown you that the processor can execute different types of instructions and there are
different ways of specifying the operands. Once all this is decided, this information has to be presented to the
processor in the form of an instruction format. The number of bits in the instruction is divided into groups called
fields. The most common fields found in instruction formats are:
1. An operation code field that specifies the operation to be performed. The number of bits will indicate the
number of operations that can be performed.
2. An address field that designates a memory address or a processor register. The number of bits depends
on the size of memory or the number of registers.
3. A mode field that specifies the way the operand or the effective address is determined. This depends on
the number of addressing modes supported by the processor.
The number of address fields may be three, two or one depending on the type of ISA used. Also, observe that,
based on the number of operands that are supported and the size of the various fields, the length of the
instructions will vary. Some processors fit all the instructions into a single sized format, whereas others make use
of formats of varying sizes. Accordingly, you have a fixed format or a variable format.
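A minimal sketch of unpacking such fields from a fixed 32-bit format (the field widths below are illustrative, not taken from any real ISA):

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        /* Illustrative layout: | opcode: 6 | mode: 2 | register: 4 | address: 20 | */
        uint32_t inst = (0x21u << 26) | (0x1u << 24) | (0x5u << 20) | 0x00ABCu;

        uint32_t opcode  = (inst >> 26) & 0x3F;   /* 6 bits -> up to 64 operations */
        uint32_t mode    = (inst >> 24) & 0x3;    /* 2 bits -> 4 addressing modes  */
        uint32_t reg     = (inst >> 20) & 0xF;    /* 4 bits -> 16 registers        */
        uint32_t address = inst & 0xFFFFF;        /* 20 bits of memory address     */

        printf("op=%u mode=%u reg=%u addr=0x%X\n", opcode, mode, reg, address);
        return 0;
    }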
Interpreting memory addresses – you basically have two interpretations of memory addresses: the big endian arrangement and the little endian arrangement. Memories are normally organized as bytes, and each unique memory address identifies a location capable of storing 8 bits of information. But the word length of the processor may be more than one byte: a 32-bit word is made up of four bytes, and these four bytes span four memory locations. When you specify the address of a word, are you going to use the address of the most significant byte as the address of the word (the big end), or the address of the least significant byte (the little end)? That distinguishes a big endian arrangement from a little endian arrangement. IBM, Motorola and HP follow the big endian arrangement, and Intel follows the little endian arrangement. Also, when data spans multiple memory locations and you access a word that starts at a word boundary, we say the access is aligned. If you try to access a word not starting at a word boundary, you can still access it, but the access is misaligned. Whether there is support for accessing misaligned data is a design issue. Even if you are allowed to access misaligned data, it normally takes more memory cycles to access the data.
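A small C sketch that checks, at run time, which arrangement the machine it runs on uses:

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint32_t word = 0x01020304;
        uint8_t *bytes = (uint8_t *)&word;   /* view the 4-byte word byte by byte */
        if (bytes[0] == 0x01)
            printf("big endian: most significant byte at the lowest address\n");
        else
            printf("little endian: least significant byte at the lowest address\n");
        return 0;
    }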