Unit I - Notes - CA - Cs8491 - Updated
CS8491 – Computer Architecture UNIT 1
Servers
Servers are built from the same basic technology as desktop
computers, but provide for greater expandability of both computing and
input/output capacity. In general, servers also place a greater emphasis on
dependability, since a crash is usually more costly than it would be on a
single-user desktop computer.
Servers span the widest range in cost and capability.
1. At the lower end, a server may be little more than a desktop computer without a
screen or keyboard and cost a thousand dollars. These low-end servers are typically
used for file storage, small business applications, or simple web serving.
2. At the other extreme are supercomputers, which at present consist of hundreds
to thousands of processors and usually terabytes of memory and petabytes of
storage, and cost millions to hundreds of millions of dollars.
Supercomputers are usually used for high-end scientific and engineering
calculations, such as weather forecasting, oil exploration, protein structure
determination, and other large-scale problems.
3. Internet datacenters used by companies like eBay and Google also contain
thousands of processors, terabytes of memory, and petabytes of storage. These are
usually considered as large clusters of computers.
Embedded Computers
Embedded computers are the largest class of computers and span the widest range
of applications and performance. Embedded computers include the microprocessor
found in your car, the computers in a cell phone, the computers in a video game or
television, and the networks of processors that control a modern airplane or cargo
ship.
Embedded computing systems are designed to run one application or one set of
related applications that are normally integrated with the hardware and delivered as
a single system.
The Hardware / Software Interface
The hardware in a computer can only execute extremely simple low-level
instructions. Going from a complex application to these simple instructions involves several
layers of software that interpret or translate high-level operations into simple computer
instructions.
Different layers of software are organized primarily in a hierarchical fashion, with
applications being the outermost ring and a variety of system software sitting between the
hardware and applications software.
Hardware refers to all visible devices that are assembled together to build a
computer system. These include various input, output, storage, processing and control
devices.
Software
It is basically the set of instructions, grouped into programs, that makes the
computer function in the desired way. It is a collection of programs to perform a
particular task.
It is responsible for controlling, integrating and managing the hardware
components of a computer and for accomplishing specific tasks.
Types of Software
System Software
System software is a collection of programs designed to operate, control and
extend the processing capabilities of the computer, making the operation of a
computer system more effective and efficient.
System software consists of several programs, which are directly responsible for
controlling, integrating, and managing the individual hardware components of a
computer system. This software manages and supports the computer system and its
information processing activities. It is more transparent to users and less noticed by them,
since they usually interact with the hardware or the applications.
Types of System Software
1. Operating System
The operating system is the first layer of software loaded into computer
memory when it starts up. It provides a software platform on top of which
other programs can run.
It is a set of programs that controls and supervises the operations of the computer
system and provides services to computer users.
The OS is defined as the program that instructs the computer how to work with its
various components.
2. Device Drivers
Device drivers are system programs, which are responsible for proper functioning
of devices.
Whenever a new device is added to the computer system, a new device driver
must be installed before the device is used.
Every device, whether it is a printer, monitor, mouse or keyboard, has a driver
associated with it for its proper functioning.
Functionalities
A driver acts like a translator between the device and program that uses the
device. Note that each device has its own set of specialized commands that
only its driver understands.
A device driver is not an independent program; it assists and is assisted by the
operating system for the proper functioning of the device.
3. Language Translators
Computers only understand a language consisting of 0s and 1s called Machine
Language. To ease the burden of programming entirely in 0s and 1s, special
programming languages called high-level programming languages were
developed that resemble natural languages like English.
Along with every programming language developed, a language translator was
also developed, which accepts the programs written in a programming language
and executes them by transforming them into a form suitable for execution.
Types of Language Translators
Compiler – Purpose
The programs written in any high-level programming language are converted
into machine language using a compiler.
As a system program, the compiler translates source code (user written form)
into object code (binary form).
Interpreter – Purpose
An interpreter analyses and executes the source code in a line-by-line manner,
without looking at the entire program.
It translates a statement in a program and executes the statement immediately,
before translating the next source language statement.
Assembler – Purpose
Compared to all types of programming languages, assembly language is closest to
the machine code. Assembly language is fundamentally a symbolic representation
of machine code.
The assembly language program must be translated into machine code by a
separate program called an assembler.
Sample Diagram for Conversion
4. System Utilities
System utility programs are used to support, enhance, and secure existing
programs and data in the computer system.
They are mainly used to perform routine functions such as loading and saving a
program and keeping track of the files on the disk.
Application Software
Application software is a set of programs that allows the computer to perform
a specific data processing job for the user. It helps the user to work faster, more
effectively and more productively.
An application is the job a user wants the computer to perform.
Application software is dependent on system software. System software acts as an
interface between the user and the computer hardware, while application software
performs specific tasks.
Without application software, the computer, no matter how powerful, will not be
1. Design for Moore's Law
The one constant for computer designers is rapid change, which is driven largely by
Moore's Law. It states that integrated circuit resources double every 18–24 months. As
computer designs can take years, the resources available per chip can easily double or
quadruple between the start and finish of the project. We use an "up and to the right"
Moore's Law graph to represent designing for rapid change.
2. Use abstraction to simplify design
Both computer architects and programmers had to invent techniques to make themselves
more productive, for otherwise design time would lengthen as dramatically as resources
grew by Moore's Law. A major productivity technique for hardware and software is to
use abstractions to represent the design at different levels of representation; lower-level
details are hidden to offer a simpler model at higher levels.
3. Make the common case fast
Making the common case fast will tend to enhance performance better than optimizing
the rare case. Ironically, the common case is often simpler than the rare case and hence is
often easier to enhance. We use a sports car as the icon for making the common case
fast, as the most common trip has one or two passengers, and it's surely easier to make a
fast sports car than a fast minivan.
4. Performance via parallelism
Since the dawn of computing, computer architects have offered designs that get more
performance by performing operations in parallel. We'll see many examples of
parallelism in this book. We use multiple jet engines of a plane as our icon for parallel
performance.
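5. Performance via pipelining
Pipelining is a particular pattern of parallelism in which successive operations overlap in stages; it is so prevalent in computer architecture that it merits its own name.
6. Performance via prediction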
Following the saying that it can be better to ask for forgiveness than to ask for
permission, the next great idea is prediction. In some cases it can be faster on average to
guess and start working rather than wait until you know for sure, assuming that the
mechanism to recover from a misprediction is not too expensive and your prediction is
relatively accurate. We use the fortune-teller's crystal ball as our prediction icon.
7. Hierarchy of memories
Programmers want memory to be fast, large, and cheap, as memory speed often shapes
performance, capacity limits the size of problems that can be solved, and the cost of
memory today is often the majority of computer cost.
Architects have found that they can address these conflicting demands with a hierarchy
of memories, with the fastest, smallest, and most expensive memory per bit at the top of
the hierarchy and the slowest, largest, and cheapest per bit at the bottom. We use a
layered triangle icon to represent the memory hierarchy. The shape indicates speed, cost,
and size.
8. Dependability via redundancy
Computers not only need to be fast; they need to be dependable. Since any physical
device can fail, we make systems dependable by including redundant components that
can take over when a failure occurs and to help detect failures. We use the tractor-trailer
as our icon, since the dual tires on each side of its rear axles allow the truck to continue
driving even when one tire fails.
COMPONENTS OF A COMPUTER SYSTEM
The five classic components of a computer are input, output, memory, datapath,
and control, with the last two sometimes combined and called the processor.
Basic Functional Units
Introduction
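[Figure: basic functional units of a computer: input, output, memory, arithmetic and logic, and control. Input and output are grouped as I/O; arithmetic and logic and control are grouped as the processor.]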
Keyboard
Mouse
Light Pen
Digitizer
Trackball
Joystick
Example for an Input Device
Mouse
A computer mouse is a handheld hardware input device that controls a cursor in
a GUI (graphical user interface) for pointing, moving and selecting text, icons, files,
and folders on your computer. In addition to these functions, a mouse can also be used
to drag-and-drop objects and give you access to the right-click menu.
For desktop computers, the mouse is placed on a flat surface (e.g., mouse pad or desk) in
front of your computer.
Output Unit
Computers can communicate with human beings using output devices. Output
devices take the machine-coded output results from the CPU and convert them into a
form that is easily readable by human beings.
Functions of output unit
The output unit is the communication link between the user and the computer.
It provides the information and results of a computation to the outside
world.
It also converts the binary data into a form that users can understand.
Output devices – Examples
VDU or Monitor
Printer
Plotter
Liquid Crystal Displays (LCDs)
TYPES OF MEMORY
Primary Memory
Also known as main memory, it stores data and instructions temporarily for
processing.
Logically it works as an integral part of the CPU, but physically it is a separate part placed on
the computer’s motherboard.
It can be further classified into random access memory (RAM) and read only memory
(ROM).
Functions of Primary Memory
o Used to hold the program being currently executed in the computer.
Cache memory consists of a small, fast memory that acts as a buffer for a slower, larger
memory. Cache is built using a different memory technology, SRAM.
A CPU cache is a cache used by the central processing unit (CPU) of a computer to
reduce the average time to access memory. The cache is a smaller, faster memory which
stores copies of the data from frequently used main memory locations. Most CPUs have
several independent caches, including instruction and data caches, where the data cache
is usually organized as a hierarchy of cache levels (L1, L2, etc.).
2. DRAM
DRAM stands for Dynamic Random Access Memory that provides random access to any
location. Several DRAMs are used together to contain the instructions and data of a
program. The RAM portion of the term DRAM means that memory accesses take
basically the same amount of time no matter what portion of the memory is used.
3. SRAM
SRAM is more expensive and less dense than DRAM and is therefore not used for
high-capacity, low-cost applications such as the main memory in personal computers.
SRAM is a type of semiconductor memory that uses bistable latching circuitry to store
each bit. The term static differentiates it from dynamic RAM (DRAM) which must be
periodically refreshed. SRAM exhibits data remanence, but it is still volatile in the
conventional sense that data is eventually lost when the memory is not powered.
Processor / CPU
The processor is the active part of the board, following the instructions of a
program to do a specific task. It adds numbers, tests numbers, signals I/O devices to
activate, and so on. People call the processor the CPU or Central Processing Unit.
The processor logically comprises two main components: datapath and control,
the respective brawn and brain of the processor. The datapath performs the arithmetic
operations, and control tells the datapath, memory, and I/O devices what to do according
to the wishes of the instructions of the program.
Transfers between the memory and the processor are started by sending the
address of the memory location to be accessed to the memory unit and issuing
the appropriate control signals. The data are then transferred to or from the memory.
PROCESSOR
The figure shows how memory & the processor can be connected. In addition to the ALU &
the control circuitry, the processor contains a number of registers used for several
different purposes.
The instruction register (IR):- Holds the instruction that is currently being
executed. Its output is available to the control circuits, which generate the timing
signals that control the various processing elements involved in executing the instruction.
The program counter PC:-
This is another specialized register that keeps track of execution of a program. It contains
the memory address of the next instruction to be fetched and executed.
Besides IR and PC, there are n general-purpose registers, R0 through Rn-1.
The other two registers, which facilitate communication with memory, are: -
1. MAR – (Memory Address Register):- It holds the address of the location to be
accessed.
2. MDR – (Memory Data Register):- It contains the data to be written into or read
out of the addressed location.
Operating steps are
1. Programs reside in the memory and usually get there through the input unit.
2. Execution of the program starts when the PC is set to point at the
first instruction of the program.
3. Contents of PC are transferred to MAR and a Read Control Signal is sent to the
memory.
4. After the time required to access the memory elapses, the addressed word is read
out of the memory and loaded into the MDR.
5. Now contents of MDR are transferred to the IR & now the instruction is
ready to be decoded and executed.
6. If the instruction involves an operation by the ALU, it is necessary to obtain the
required operands.
7. An operand in the memory is fetched by sending its address to MAR &
Initiating a read cycle.
8. When the operand has been read from the memory to the MDR, it is
transferred from MDR to the ALU.
9. After one or two such repeated cycles, the ALU can perform the desired
operation.
10. If the result of this operation is to be stored in the memory, the result is sent to
MDR.
11. Address of location where the result is stored is sent to MAR & a write cycle
is initiated.
12. The contents of PC are incremented so that PC points to the next
instruction that is to be executed.
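The operating steps above can be sketched as a small simulation. This is only an illustrative toy, not a model of any real processor: the class name ToyCPU, the tuple instruction format (opcode, source1, source2, destination) and the sample memory contents are all assumptions made for the example.

```python
class ToyCPU:
    """Toy processor with the PC, IR, MAR and MDR registers described above."""

    def __init__(self, memory):
        self.memory = memory   # address -> instruction tuple or data word
        self.pc = 0            # address of the next instruction
        self.ir = None         # instruction currently being executed
        self.mar = 0           # address of the location being accessed
        self.mdr = None        # data read from or written to memory

    def read(self, address):
        # Send the address to MAR, start a read, receive the word in MDR.
        self.mar = address
        self.mdr = self.memory[self.mar]
        return self.mdr

    def write(self, address, value):
        # Place the data in MDR and the address in MAR, then start a write.
        self.mdr = value
        self.mar = address
        self.memory[self.mar] = self.mdr

    def step(self):
        self.ir = self.read(self.pc)        # steps 3-5: fetch into IR
        opcode = self.ir[0]
        if opcode == "HALT":                # hypothetical stop instruction
            return False
        _, src1, src2, dest = self.ir
        a = self.read(src1)                 # steps 7-8: fetch the operands
        b = self.read(src2)
        if opcode == "ADD":                 # step 9: the ALU operation
            result = a + b
        elif opcode == "MUL":
            result = a * b
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        self.write(dest, result)            # steps 10-11: store the result
        self.pc += 1                        # step 12: point to next instruction
        return True

    def run(self, start=0):
        self.pc = start                     # step 2: PC points at first instruction
        while self.step():
            pass


memory = {
    0: ("ADD", 100, 101, 102),
    1: ("MUL", 102, 102, 103),
    2: ("HALT",),
    100: 3,
    101: 4,
}
cpu = ToyCPU(memory)
cpu.run()
print(cpu.memory[102], cpu.memory[103])  # 7 49
```

Every memory access in the sketch goes through MAR and MDR, which mirrors the role the two registers play in steps 3 to 11.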
PERFORMANCE
Introduction
When trying to choose among different computers, performance is an important attribute.
Accurately measuring and comparing performance of different computers is critical to
both purchasers and to designers.
Performance Measurement is a Challenging Task
Assessing the performance of computers can be quite a challenging task. The scale
and intricacy of modern software systems, together with the wide range of performance
improvement techniques employed by hardware designers, have made performance
assessment much more difficult.
Response Time / Execution Time
Response time is the time between the start and completion of a task. It is defined
as the total time required for the computer to complete a task, including disk accesses,
memory accesses, I/O activities, operating system overhead, CPU execution time, and so
on.
Throughput / Bandwidth
Throughput is the total amount of work done in a given time, that is, the number
of tasks completed per unit time. As individual computer users, we are interested in
reducing the response time; as datacenter managers, we are often interested in
increasing the throughput.
Performance – Definition
To maximize performance, we want to minimize response time or execution time
for some task. Thus, we can relate the performance and execution time for a computer X
as follows:
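\[ \text{Performance}_X = \frac{1}{\text{Execution time}_X} \]
This means that for two computers X and Y, if the performance of X is greater
than the performance of Y, then we have
\[ \text{Performance}_X > \text{Performance}_Y \;\Rightarrow\; \frac{1}{\text{Execution time}_X} > \frac{1}{\text{Execution time}_Y} \;\Rightarrow\; \text{Execution time}_Y > \text{Execution time}_X \]
If X is ‘n’ times faster than Y, then the execution time on Y is ‘n’ times longer than it is
on X:
\[ \frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n \]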
Example Problem:
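If computer A runs a program in 10 seconds and computer B runs the same program in
15 seconds, how much faster is A than B?
We know that A is n times as fast as B if
\[ \frac{\text{Performance}_A}{\text{Performance}_B} = \frac{\text{Execution time}_B}{\text{Execution time}_A} = n \]
Thus the performance ratio is 15 / 10 = 1.5, which means that A is 1.5 times as fast as B.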
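The relative-performance calculation in this example can be checked with a few lines of Python. This is a minimal sketch; the function name relative_performance is an assumption chosen for illustration.

```python
def relative_performance(time_x, time_y):
    """Return n such that computer X is n times as fast as computer Y.

    Performance is the reciprocal of execution time, so the ratio of
    performances equals the inverse ratio of execution times.
    """
    return time_y / time_x


# Computer A takes 10 seconds, computer B takes 15 seconds.
n = relative_performance(10.0, 15.0)
print(n)  # 1.5
```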
Measuring Performance
Time is the measure of computer performance: the computer that performs the
same amount of work in the least time is the fastest. Time can be measured in different
ways, depending on what we count. The most straightforward definition of time is called
wall clock time, response time, or elapsed time.
Program Execution Time
Program execution time is measured in seconds per program. It is defined as the
total time required to complete a task, including disk accesses, memory accesses,
input/output (I/O) activities, operating system overhead, etc.
\[ \text{CPU time} = \text{CPU clock cycles} \times \text{Clock cycle time} = \frac{\text{CPU clock cycles}}{\text{Clock rate}} \]
This formula makes it clear that the hardware designer can improve performance
by reducing the number of clock cycles required for a program or the length of the clock
cycle.
Example Problem:
Our favorite program runs in 10 seconds on computer A, which has a 2 GHz clock. We
are trying to help a computer designer to build a computer B, which will run this program
in 6 seconds. The designer has determined that a substantial increase in the clock rate is
possible, but this increase will affect the rest of the CPU design, causing computer B to
require 1.2 times as many clock cycles as computer A for this program. What clock rate
should we tell the designer to target?
Solution:
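Let’s first find the number of clock cycles required for the program on A:
\[ \text{CPU time}_A = \frac{\text{CPU clock cycles}_A}{\text{Clock rate}_A} \]
\[ 10 \text{ seconds} = \frac{\text{CPU clock cycles}_A}{2 \times 10^9 \text{ cycles/second}} \]
\[ \text{CPU clock cycles}_A = 10 \times 2 \times 10^9 = 20 \times 10^9 \text{ cycles} \]
CPU time for B follows from the same relation, with 1.2 times as many clock cycles:
\[ 6 \text{ seconds} = \frac{1.2 \times 20 \times 10^9 \text{ cycles}}{\text{Clock rate}_B} \]
\[ \text{Clock rate}_B = \frac{1.2 \times 20 \times 10^9}{6} = 4 \times 10^9 \text{ cycles/second} = 4 \text{ GHz} \]
To run the program in 6 seconds, B must have twice the clock rate of A.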
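The clock-rate problem above can also be worked numerically. The sketch below assumes only the relation CPU time = clock cycles / clock rate; the helper names clock_cycles and required_clock_rate are hypothetical.

```python
def clock_cycles(cpu_time_s, clock_rate_hz):
    """Clock cycles used by a program: CPU time multiplied by clock rate."""
    return cpu_time_s * clock_rate_hz


def required_clock_rate(cycles, target_time_s):
    """Clock rate needed to finish the given cycles in the target time."""
    return cycles / target_time_s


cycles_a = clock_cycles(10, 2e9)           # computer A: 10 s at 2 GHz
cycles_b = 1.2 * cycles_a                  # B needs 1.2 times as many cycles
rate_b = required_clock_rate(cycles_b, 6)  # B must finish in 6 s
print(f"{rate_b / 1e9:.1f} GHz")           # 4.0 GHz
```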
So, we can conclude that computer A is 1.2 times as fast as computer B for this
program.
These formulas are particularly useful because they separate the three key factors that
affect the performance. We can use these formulas to compare two different
Always bear in mind that the only complete and reliable measure of computer
performance is time. The performance of a program depends on the algorithm, the
language, the compiler, the architecture, and the actual hardware.
It is assumed that the computer has two processor registers, R1 and R2. The
symbol M[A] denotes the operand at memory address symbolized by A.
Two address instructions
Two address instructions are the most common in commercial computers. Here
again each address field can specify either a processor register or a memory word. The
program to evaluate X= (A+B)*(C+D) is as follows:
The MOV instruction moves or transfers the operands to and from memory and
processor registers. The first symbol listed in an instruction is assumed to be both a source
and the destination where the result of the operation is transferred.
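To make the two-address form concrete, here is a small Python interpreter for such instructions. The program listing does not survive in these notes, so the six-instruction sequence below is an assumption that matches the description (MOV for transfers, first operand acting as both source and destination); the function name run_two_address and the sample values are likewise illustrative.

```python
def run_two_address(program, memory):
    """Execute (opcode, destination, source) instructions.

    Operands named R1 or R2 are processor registers; any other name is
    treated as a memory word.
    """
    registers = {"R1": 0, "R2": 0}

    def load(name):
        return registers[name] if name in registers else memory[name]

    def store(name, value):
        if name in registers:
            registers[name] = value
        else:
            memory[name] = value

    for opcode, dest, src in program:
        if opcode == "MOV":
            store(dest, load(src))
        elif opcode == "ADD":
            store(dest, load(dest) + load(src))   # dest is also a source
        elif opcode == "MUL":
            store(dest, load(dest) * load(src))
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


# Assumed program for X = (A + B) * (C + D)
program = [
    ("MOV", "R1", "A"),   # R1 <- M[A]
    ("ADD", "R1", "B"),   # R1 <- R1 + M[B]
    ("MOV", "R2", "C"),   # R2 <- M[C]
    ("ADD", "R2", "D"),   # R2 <- R2 + M[D]
    ("MUL", "R1", "R2"),  # R1 <- R1 * R2
    ("MOV", "X", "R1"),   # M[X] <- R1
]
result = run_two_address(program, {"A": 2, "B": 3, "C": 4, "D": 5})
print(result["X"])  # 45
```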
One address instructions
One address instructions use an implied accumulator (AC) register for all data
manipulation. For multiplication and division there is a need for a second register.
However, here we will neglect the second register and assume that the AC contains the
result of all operations. The program to evaluate X= (A+B)*(C+D) is
All operations are done between the AC register and a memory operand. T is the
address of a temporary memory location required for storing the intermediate result.
Commercially available computers also use this type of instruction format.
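The accumulator style can be sketched the same way. The listing is missing from these notes, so the LOAD/ADD/STORE/MUL sequence below is an assumed program consistent with the description, using T as the temporary location; run_one_address is a hypothetical helper name.

```python
def run_one_address(program, memory):
    """Execute (opcode, address) instructions on an implied accumulator AC."""
    ac = 0
    for opcode, address in program:
        if opcode == "LOAD":
            ac = memory[address]        # AC <- M[address]
        elif opcode == "ADD":
            ac = ac + memory[address]   # AC <- AC + M[address]
        elif opcode == "MUL":
            ac = ac * memory[address]   # AC <- AC * M[address]
        elif opcode == "STORE":
            memory[address] = ac        # M[address] <- AC
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


# Assumed program for X = (A + B) * (C + D)
program = [
    ("LOAD", "A"),
    ("ADD", "B"),
    ("STORE", "T"),   # T holds the intermediate result A + B
    ("LOAD", "C"),
    ("ADD", "D"),
    ("MUL", "T"),
    ("STORE", "X"),
]
result = run_one_address(program, {"A": 2, "B": 3, "C": 4, "D": 5})
print(result["T"], result["X"])  # 5 45
```

The one-address version needs seven instructions and a temporary memory word, where a two-register machine can keep both partial sums in registers.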
Label : Opcode Destination Operand, Source1 Operand, Source2 Operand; # Comment Lines
Where
Label – a user-defined name used for referring to a
particular line of code.
Opcode – a reserved mnemonic or keyword that specifies what type of
operation is performed by the instruction.
Destination Operand – a register / location where the computation result is to be
stored.
Source1, Source2 Operands – specify the input values for the
operation.
# Comment Lines – used for writing comments about the instruction.
Example
Given a C Language code – a = b + c;
Then the compiler will translate this C language code into MIPS assembly
language instruction as follows:
add $s1, $s2, $s3;
Here, the variables a, b and c are assumed to be stored in registers $s1, $s2 and $s3.
OPERANDS OF THE COMPUTER HARDWARE
An operand is a variable used to hold some data, an instruction or an address. The usage of
operands varies from one programming language to another.
One major difference between the variables of a programming language and
registers is the limited number of registers, typically 16 to 32 on current computers.
There are 3 types of operands you can use in MIPS architecture. They are
Registers
Memory Operands / Addresses
Constant / Immediate Operands
Registers
Registers are primitives used in hardware design that are also visible to the
programmer once the computer is completely designed. So registers form the basic
building blocks of computer construction.
The size of a register in the MIPS architecture is 32-bits; groups of 32 bits occur so
frequently that they are given the name words in the MIPS architecture. A word is the
natural unit of access in a computer (32-bits) which corresponds to the size of a register
in the MIPS architecture.
Register 1, called $at, is reserved for the assembler, and registers 26−27, called
$k0−$k1, are reserved for the operating system.
Memory Operands
Programming languages have simple variables that contain single data elements
but they also have more complex data structures such as arrays and structures. These
complex data structures can contain many more data elements than there are registers in a
computer. The processor can keep only a small amount of data in registers, but computer
memory contains billions of data elements. Hence, complex data structures are kept in
memory.
Memory Address
To access a word in memory, the instruction must supply the memory address.
Memory address is a value used to identify the location of a specific data element within
a memory array. Memory is just a large, single-dimensional array, with the address acting
as the index to the array, starting at 0.
It instructs a computer to add the two variables b and c and to put their sum in a.
Each MIPS arithmetic instruction performs only one operation and must always have
exactly three variables.
For example, suppose we want to place the sum of four variables b, c, d and e into
variable a. Then the following sequence of instructions will do that work:
add a, b, c ;
add a, a, d ;
add a, a, e ;
Thus, it takes three instructions to sum the four variables.
The natural number of operands for an operation like addition is three: the two
numbers being added together and a place to put the sum.
3. Logical Operations
Logical operators are used for performing logical operations like AND, OR and
NOR. They are also used for performing shift operations like shift left and shift
right.
REPRESENTING INSTRUCTIONS
An instruction is an order or command given to the computer processor by the user
in order to perform a particular task. At the lowest level, each instruction is a sequence of
0’s and 1’s that describes a physical operation that the computer is going to perform;
the operation varies with the particular instruction type.
Instruction Format
It is a form of representation of an instruction composed of fields of binary
numbers. Each instruction is encoded in binary machine code. All MIPS instructions are
encoded as 32-bit instruction words, each divided into small segments called “fields”;
each field tells the computer something about the instruction. There is also an attempt to
reuse fields across instructions as much as possible.
The MIPS designers chose to keep all instructions the same length, thereby requiring
different kinds of instruction formats for different kinds of instructions.
In MIPS assembly language, registers $s0 to $s7 map onto registers 16 to 23 and
registers $t0 to $t7 map onto registers 8 to 15. So $s0 means register 16, $s1 means register 17, $s2 means
register 18 and so on; likewise $t0 means register 8, $t1 means register 9 and so on.
MIPS Instruction Coding Format
MIPS instructions are classified into three groups according to their coding
formats:
1. R – Type (for Register) or R – Format
2. I – Type (for Immediate) or I – Format
3. J – Type (for Jump) or J- Format
R – Type (or) R – Format
This group contains all instructions that do not require an immediate value, target
offset, memory address displacement, branch address or memory address to specify an
operand. This includes arithmetic and logic instructions with all operands in registers, shift
instructions, and register direct jump instructions (jalr and jr).
The unused fields in R-type are coded with all 0 bits, and all R-type instructions use
the opcode 000000; the operation is specified by the function field. The R-format has
6 fields as follows:
opcode   rs       rt       rd       sa       Function
6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
Where,
Opcode – Partially specifies which instruction it is
Function – Combined with the opcode, this number exactly specifies the
instruction
rs (Source Register) – Specifies the register holding the first source operand
rt (Target Register) – Specifies the register holding the second source operand
rd (Destination Register) – Specifies the register that receives the result
sa (Shift Amount) – Specifies the number of bits to be shifted
Example for Translating MIPS Assembly Language Instruction into Machine
Language Code
add $t0, $s1, $s2 ;
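MIPS R-format decimal representation is
Opcode   rs   rt   rd   sa   Function
0        17   18   8    0    32
First and last fields (0 and 32) in combination tell the computer that this
instruction performs addition.
Second field gives the first source operand for the addition (17 indicates $s1)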
Third field gives the other source operand for the addition (18 indicates $s2)
Fourth field contains the number of the register that is to receive the sum (8
indicates $t0)
Fifth field is unused in this instruction, so it is set to 0.
This instruction adds register $s1 to register $s2 and places the sum in
register $t0.
MIPS R-format Binary Representation is
This instruction can also be represented as a field of binary numbers as
opposed to decimal numbers.
000000 10001 10010 01000 00000 100000
6 bits 5 bits 5 bits 5 bits 5 bits 6 bits
Here 10001 is the binary value for 17; the remaining field
values are represented in binary in the same way.
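The packing of the six R-format fields into one 32-bit word can be sketched in Python. The function name encode_r_type is an assumption for illustration; the field widths and the values for add $t0, $s1, $s2 are the ones given above.

```python
def encode_r_type(opcode, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6, 5, 5, 5, 5, 6 bits) into a 32-bit word."""
    return (
        (opcode << 26)
        | (rs << 21)
        | (rt << 16)
        | (rd << 11)
        | (shamt << 6)
        | funct
    )


# add $t0, $s1, $s2  ->  opcode 0, rs 17, rt 18, rd 8, sa 0, function 32
word = encode_r_type(0, 17, 18, 8, 0, 32)
bits = format(word, "032b")
fields = " ".join([bits[0:6], bits[6:11], bits[11:16],
                   bits[16:21], bits[21:26], bits[26:32]])
print(fields)              # 000000 10001 10010 01000 00000 100000
print(f"0x{word:08X}")     # 0x02324020
```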
I – Type (or) I – Format
By keeping the formats similar we can reduce the complexity of the design. The
first three fields of the I and R formats are the same size and have the same names. The combined length
of the last three fields of the R-format is equal to the length of the fourth field in I-type.
Opcode   rs       rt       Immediate or Address
6 bits   5 bits   5 bits   16 bits
In I-format, the first three fields are similar to R-type. The last (fourth) field holds
the constant or address (16 bits). This 16-bit address means that a load/store word
instruction can load/store any word within a region of ±2^15 or 32,768 bytes of the address
in the base register.
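The I-format layout and the limit on the 16-bit field can be illustrated with a short Python sketch. The function name encode_i_type and the sample instruction lw $t0, 32($s3) are assumptions chosen for the example; opcode 35 for lw and the register numbers follow the conventions used in these notes.

```python
def encode_i_type(opcode, rs, rt, immediate):
    """Pack the four I-format fields (6, 5, 5, 16 bits) into a 32-bit word.

    The immediate is a signed 16-bit value, so it must lie in the range
    -32768 .. 32767; it is stored in two's complement form.
    """
    if not -(2 ** 15) <= immediate <= 2 ** 15 - 1:
        raise ValueError("immediate does not fit in 16 bits")
    return (opcode << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


# lw $t0, 32($s3): opcode 35, base register $s3 = 19, target $t0 = 8
word = encode_i_type(35, 19, 8, 32)
print(f"0x{word:08X}")  # 0x8E680020

# A negative offset keeps the same upper fields and sets the low 16 bits.
negative = encode_i_type(35, 19, 8, -4)
print(f"0x{negative & 0xFFFF:04X}")  # 0xFFFC
```

A constant outside this range cannot be encoded in a single I-format instruction, which is why larger constants need more than one instruction.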
Example:
addi $t0, $t0, 0xABABCDCD ;
becomes :
lui $at, 0xABAB ;
ori $at, $at, 0xCDCD ;
add $t0, $t0, $at ;
Example:
j 2000 ; # Jump to address 2000
jal 2500 ; # Jump and Link to address 2500
Summarization of MIPS Instruction Formats
The table shows how the hardware decodes and determines the three machine language instructions:
lw
add
sw
The lw instruction is identified by 35 in the OP field, the add instruction that follows is specified with 0
in the OP field, and the sw instruction is identified by 43 in the OP field.
The two MIPS shift instructions are called shift left logical (sll) and shift right logical (srl).
sll $t2,$s0,4 # reg $t2 = reg $s0 << 4 bits (shifted 4 places)
The shift amount is held in the shamt field of the R-format; shamt, used in shift instructions, stands for shift
amount. The encoded version of the above shift instruction is shown below.
op   rs   rt   rd   shamt   funct
0    0    16   10   4       0
Also, shifting left by i bits gives the same result as multiplying by 2^i. For example, shifting 9 left by 4 bits
gives 9 x 2^4 = 9 x 16 = 144, where i = 4.
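The link between shifting and multiplying by a power of two can be checked directly. This is a minimal sketch that models a 32-bit register with a mask; the helper names sll and srl simply mirror the MIPS mnemonics and are not a MIPS API.

```python
MASK32 = 0xFFFFFFFF  # keep results within a 32-bit register


def sll(value, shamt):
    """Shift left logical: emptied low bits are filled with 0s."""
    return (value << shamt) & MASK32


def srl(value, shamt):
    """Shift right logical: emptied high bits are filled with 0s."""
    return (value & MASK32) >> shamt


shifted = sll(9, 4)
print(shifted)           # 144
print(9 * 2 ** 4)        # 144
print(srl(shifted, 4))   # 9
```

Bits shifted past the left end of the register are lost, so the multiplication equivalence holds only while the result still fits in 32 bits.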
LOGICAL AND (and)
AND is a bit-by-bit operation that leaves a 1 in the result only if both bits of the operands
are 1.
For example, if register $t2 contains 0000 0000 0000 0000 0000 1101 1100 0000two
and register $t1 contains 0000 0000 0000 0000 0011 1100 0000 0000two
then, after executing the MIPS instruction
and $t0,$t1,$t2 # reg $t0 = reg $t1 & reg $t2
the value of register $t0 would be 0000 0000 0000 0000 0000 1100 0000 0000two
(Example of a bitwise operation; note that the values are not added:
00101
10111
Bitwise AND: 00101 (use the AND truth table for each bit))
AND is traditionally called a mask, since the mask “conceals” some bits.
LOGICAL OR (or)
It is a bit-by-bit operation that places a 1 in the result if either operand bit is a 1. To elaborate, if the
registers $t1 and $t2 are unchanged from the preceding i.e.,
register $t2 contains 0000 0000 0000 0000 0000 1101 1100 0000two
48
CS8491 – Computer Architecture UNIT 1
and register $t1 contains 0000 0000 0000 0000 0011 1100 0000 0000two
then, after executing the MIPS instruction
or $t0,$t1,$t2 # reg $t0 = reg $t1 | reg $t2
the value in register $t0 would be 0000 0000 0000 0000 0011 1101 1100 0000two
(Example of a bitwise operation:
  00101
  10111
Bitwise OR: 10111 (apply the OR truth table to each bit position).)
LOGICAL NOT (nor)
The final logical operation is a contrarian. NOT takes one operand and places a 1 in the result if the
operand bit is a 0, and vice versa. To keep the three-operand format, the designers of MIPS
decided to include the instruction NOR (NOT OR) instead of NOT. If one operand is zero, NOR is equivalent to NOT.
Step 1: Perform bitwise OR:
  00101
  00000 (dummy operand: a register filled with zeros)
Bitwise OR: 00101
Step 2: Invert the above result to get 11010.
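The two steps can be sketched in Python as follows. The function nor and its width parameter are assumptions for this illustration; Python integers have no fixed width, so the result is masked to the chosen number of bits (5 here to match the example, 32 for a MIPS register).

```python
def nor(a, b, width=32):
    """Bitwise NOR: OR the operands, then invert, keeping only `width` bits."""
    mask = (1 << width) - 1
    return ~(a | b) & mask

# Step 1 and Step 2 from the text, on 5 bits: NOR with zero acts as NOT.
result = nor(0b00101, 0b00000, width=5)
print(format(result, "05b"))       # 11010

# The same idea on a full 32-bit register value.
print(hex(nor(0x00000DC0, 0)))     # 0xfffff23f
```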
Constants are useful in AND and OR logical operations as well as in arithmetic operations, so
MIPS also provides the instructions and immediate (andi) and or immediate (ori).
==========================================================================
DECISION MAKING AND BRANCHING INSTRUCTIONS ( CONTROL OPERATIONS )
Branch and Conditional branches: Decision making is commonly represented in programming
languages using the if statement, sometimes combined with go to statements and labels. MIPS
assembly language includes two decision-making instructions, similar to an if statement with a go to.
The first instruction is
beq register1, register2, L1
This instruction means go to the statement labeled L1 if the value in register1 equals the value in
register2. The mnemonic beq stands for branch if equal. The second instruction is
bne register1, register2, L1
It means go to the statement labeled L1 if the value in register1 does not equal the value in
register2. The mnemonic bne stands for branch if not equal. These two instructions are traditionally called conditional branches.
Here, bne is used instead of beq because testing the opposite condition (not equal) lets the code branch over the then part, which is generally more efficient. This
example introduces another kind of branch, often called an unconditional branch. This instruction
says that the processor always follows the branch. To distinguish between conditional and
unconditional branches, the MIPS name for this type of instruction is jump, abbreviated as j.
(In the example, f, g, h, i, and j are variables mapped to the five registers $s0 through $s4.)
Loops:
Decisions are important both for choosing between two alternatives—found in if statements—and for
iterating a computation—found in loops. The same assembly instructions are the basic building blocks
for both cases (if and loop).
EXAMPLE: The MIPS version of the following statements is given below.
while (save[i] == k)
    i += 1;
Assume that i and k correspond to registers $s3 and $s5 and the base of the array save is in $s6.
The first step multiplies the index i by 4 to get a byte offset:
Loop: sll $t1,$s3,2 # Temp reg $t1 = i * 4
To get the address of save[i], we need to add $t1 and the base of save in $s6:
add $t1,$t1,$s6 # $t1 = address of save[i]
Now we can use that address to load save[i] into a temporary register:
lw $t0,0($t1) # Temp reg $t0 = save[i]
The next instruction performs the loop test, exiting if save[i] ≠ k:
bne $t0,$s5, Exit # go to Exit if save[i] ≠ k
The next instruction adds 1 to i:
addi $s3,$s3,1 # i = i + 1
The end of the loop branches back to the while test at the top of the loop. We just add the Exit label after it, and we're done:
j Loop # go to Loop
Exit:
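To see the loop at work, the sketch below mirrors each MIPS instruction with one Python statement. It is a hand simulation under stated assumptions, not a MIPS emulator: memory is modelled as a dictionary from byte address to word, and the base address 1000 and the contents of save are made-up values.

```python
def run_while_loop(memory, i, k, base):
    """Mirror the MIPS loop; s3 holds i, s5 holds k, s6 holds the base address of save."""
    s3, s5, s6 = i, k, base
    while True:
        t1 = s3 << 2          # sll  $t1,$s3,2    byte offset = i * 4
        t1 = t1 + s6          # add  $t1,$t1,$s6  address of save[i]
        t0 = memory[t1]       # lw   $t0,0($t1)   load save[i]
        if t0 != s5:          # bne  $t0,$s5,Exit leave when save[i] != k
            break
        s3 = s3 + 1           # addi $s3,$s3,1    i = i + 1
        # j Loop: the Python while statement jumps back to the top
    return s3                 # Exit: value of i when the loop ends

base = 1000                   # assumed base address of the array save
save = [7, 7, 7, 3]           # assumed array contents
memory = {base + 4 * n: value for n, value in enumerate(save)}
final_i = run_while_loop(memory, 0, 7, base)
print(final_i)                # 3
```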
ADDRESSING MODES
The different ways in which the location of an operand is specified in an
instruction, for operations such as load, add, or branch, are referred to as
addressing modes.
There are five different types of MIPS addressing modes.
1. Immediate Addressing mode
2. Register Addressing mode
3. Base or Displacement Addressing mode
4. PC-Relative Addressing mode
5. Pseudo Direct Addressing mode
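For the last two modes in the list, the target address is computed from the program counter. The Python sketch below shows the usual MIPS rules as we understand them: a branch adds a sign-extended 16-bit word offset to PC + 4, and a jump joins the upper 4 bits of PC + 4 with a 26-bit word address. The function names and the example PC values are assumptions for illustration only.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def branch_target(pc, offset16):
    """PC-relative addressing: target = (PC + 4) + offset * 4."""
    return (pc + 4 + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF

def jump_target(pc, addr26):
    """Pseudo-direct addressing: upper 4 bits of PC + 4, then the 26-bit field, then 00."""
    return ((pc + 4) & 0xF0000000) | ((addr26 & 0x3FFFFFF) << 2)

print(hex(branch_target(0x1000, 3)))         # 0x1010
print(hex(branch_target(0x1000, 0xFFFF)))    # 0x1000 (offset of -1 word)
print(hex(jump_target(0x10000000, 2500)))    # 0x10002710
```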
Immediate Addressing Mode
In this addressing mode, the operand is a constant which is specified as part of the
instruction itself. Immediate addressing mode has the advantage of not requiring an extra
memory access to fetch the operand, but the operand is limited to 16 bits in size. The
branch instruction format can also be considered as an example of immediate addressing
mode, since the destination address is held in the instruction itself.
MIPS provides the lui (load upper immediate) instruction to set the upper 16 bits of a
constant in a register, allowing a subsequent instruction to specify the lower 16 bits of
the constant.
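A short Python sketch of how a 32-bit constant can be built from two 16-bit immediates. The functions lui and ori below only imitate the effect of the MIPS instructions on a register value, and the constant 4,000,000 is chosen purely as an example.

```python
def lui(imm16):
    """Load upper immediate: place the 16-bit constant in bits 31..16, zero the rest."""
    return (imm16 & 0xFFFF) << 16

def ori(reg, imm16):
    """Or immediate: OR the zero-extended 16-bit constant into the register value."""
    return reg | (imm16 & 0xFFFF)

# 4,000,000 = 0x003D0900: upper half 0x003D = 61, lower half 0x0900 = 2304
s0 = lui(61)          # lui $s0, 61
s0 = ori(s0, 2304)    # ori $s0, $s0, 2304
print(s0)             # 4000000
```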
[Figure: pseudo-direct addressing. The jump address is formed as XXXX | Offset | 00, where XXXX is taken from the upper bits of the Program Counter.]
Amdahl’s Law
A rule stating that the performance enhancement possible with a given improvement is
limited by the amount that the improved feature is used. It is a quantitative version of the
law of diminishing returns.
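The rule is usually written as: execution time after improvement = (execution time affected by the improvement / amount of improvement) + execution time unaffected. The Python sketch below evaluates it for assumed numbers (a 100-second program in which 80 seconds are affected); these figures are illustrative and are not taken from the notes.

```python
def time_after_improvement(affected, unaffected, factor):
    """Amdahl's Law: only the affected part of the run time shrinks by `factor`."""
    return affected / factor + unaffected

affected, unaffected = 80.0, 20.0   # assumed split of a 100-second run
for factor in (2, 4, 16, 1000):
    new_time = time_after_improvement(affected, unaffected, factor)
    speedup = (affected + unaffected) / new_time
    print(factor, round(new_time, 2), round(speedup, 2))
```

With these assumed numbers the new time approaches 20 seconds but never reaches it, so the overall speedup stays below 5 however large the improvement factor becomes.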
Since we want the performance to be five times faster, the new execution time should be
20 seconds, giving
One alternative to time is MIPS (million instructions per second). For a given
program, MIPS is simply
MIPS = Instruction count / (Execution time x 10^6)
Finally, and most importantly, if a new program executes more instructions but each
instruction is faster, MIPS can vary independently from performance.
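A small Python sketch of this last point, using MIPS = instruction count / (execution time x 10^6). The two program versions and their figures are invented for illustration: version B has the higher MIPS rating yet takes longer to finish.

```python
def mips_rating(instruction_count, execution_time_s):
    """Million instructions per second for one program run."""
    return instruction_count / (execution_time_s * 1e6)

# Hypothetical measurements of two versions of the same program.
version_a = {"instructions": 2e9, "time_s": 2.0}
version_b = {"instructions": 5e9, "time_s": 4.0}

mips_a = mips_rating(version_a["instructions"], version_a["time_s"])
mips_b = mips_rating(version_b["instructions"], version_b["time_s"])
print(mips_a, mips_b)   # 1000.0 1250.0
```

Version B executes more instructions at a faster rate, but version A still finishes first, so the MIPS rating alone does not rank their performance.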
This mode is the same as immediate post-indexed addressing, except that a register is added
or subtracted instead of a constant.
Example: LDR r2, [r0], r1 ; r2 = mem[r0], then r0 = r0 + r1
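The effect of the example instruction can be sketched in Python. This is a simplified model under stated assumptions: registers and memory are plain dictionaries, the function name is hypothetical, and the starting values are made up.

```python
def ldr_post_indexed_register(regs, memory, rd, rn, rm):
    """Model LDR rd, [rn], rm: load from the old base address, then update the base."""
    address = regs[rn]                              # the address used is rn before the update
    regs[rd] = memory[address]                      # rd = mem[rn]
    regs[rn] = (address + regs[rm]) & 0xFFFFFFFF    # rn = rn + rm (write-back)

regs = {"r0": 0x100, "r1": 8, "r2": 0}   # assumed register contents
memory = {0x100: 42}                     # assumed memory contents
ldr_post_indexed_register(regs, memory, "r2", "r0", "r1")
print(regs["r2"], hex(regs["r0"]))       # 42 0x108
```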
****************