Unit I - Notes - CA - Cs8491 - Updated


CS8491 – Computer Architecture UNIT 1

UNIT I BASIC STRUCTURE OF A COMPUTER SYSTEM


Functional Units – Basic Operational Concepts – Performance – Instructions: Language
of the Computer – Operations, Operands – Instruction representation – Logical
operations – decision making – MIPS Addressing.
Introduction
What is a Computer?
A computer is an electronic machine devised for performing calculations and controlling operations that can be expressed in either logical or numerical terms.
A computer is an electronic device that performs diverse operations with the help of instructions to process information in order to achieve desired results.
A computer is designed in such a way that it automatically accepts and stores input data, processes it, and produces results under the direction of a detailed step-by-step program.
Computer Architecture & Organization
The internal logic design of the system is known as architecture. It determines the
various operations performed by the system.
Computer architecture is concerned with the structure and behavior of the
various functional modules of the computer and how they interact to provide the
processing needs of the user.
In short, computer architecture refers to how a computer system is designed and
what hardware and software technologies it is compatible with.
Computer organization refers to the operational units and their interconnections
that realize the architectural specification.
Why do we have to study computer architecture?


We need to understand computer architecture in order to structure the various internal components of a computer and the programs that run on it, so that programs run more efficiently on a real machine.
There are three important stages of computer architecture:
 System Design: This includes all hardware components in the system,
including data processors aside from the CPU, such as the graphics processing
unit and direct memory access. It also includes memory controllers, data paths
and miscellaneous things like multiprocessing and virtualization.
 Instruction Set Architecture (ISA): This is the embedded programming
language of the central processing unit. It defines the CPU's functions and
capabilities based on what programming it can perform or process. This
includes the word size, processor register types, memory addressing modes,
data formats and the instruction set that programmers use.
 Microarchitecture: Otherwise known as computer organization, this type of
architecture defines the data paths, data processing and storage elements, as
well as how they should be implemented in the ISA.
Different Classes of Computers
Different applications have different design requirements and employ the core
hardware technologies in different ways. In general, computers are used in three different
classes of applications:
 Desktop Computers
A computer designed for use by an individual, usually incorporating a
graphics display, a keyboard, and a mouse. Desktop computers are possibly the best-known form of computing, characterized by the personal computer, which users have used extensively. They emphasize delivery of good performance to single users at low cost and usually execute third-party software.


 Servers
Servers are built from the same basic technology as desktop
computers, but provide for greater expandability of both computing and
input/output capacity. In general, servers also place a greater emphasis on
dependability, since a crash is usually more costly than it would be on a
single-user desktop computer.
Servers span the widest range in cost and capability.
1. At the lower end, a server may be little more than a desktop computer without a
screen or keyboard and cost a thousand dollars. These low-end servers are typically
used for file storage, small business applications, or simple web serving.
2. At the other extreme are supercomputers, which at the present consist of hundreds
to thousands of processors and usually terabytes of memory and petabytes of
storage, and cost millions to hundreds of millions of dollars.
Supercomputers are usually used for high-end scientific and engineering
calculations, such as weather forecasting, oil exploration, protein structure
determination, and other large-scale problems.
3. Internet datacenters used by companies like eBay and Google also contain thousands of processors, terabytes of memory, and petabytes of storage. These are usually considered large clusters of computers.
 Embedded Computers
Embedded computers are the largest class of computers and span the widest range
of applications and performance. Embedded computers include the microprocessor
found in your car, the computers in a cell phone, the computers in a video game or
television, and the networks of processors that control a modern airplane or cargo
ship.


Embedded computing systems are designed to run one application or one set of
related applications that are normally integrated with the hardware and delivered as
a single system.
The Hardware / Software Interface
The hardware in a computer can only execute extremely simple low-level
instructions. To go from a complex application to the simple instructions involves several
layers of software that interpret or translate high-level operations into simple computer
instructions.
Different layers of software are organized primarily in a hierarchical fashion, with
applications being the outermost ring and a variety of system software sitting between the
hardware and applications software.

Hardware / Software Interface Diagram


Different Layers of Hardware / Software Interface
Hardware
The term hardware refers to the physical equipment of the computer system: the machinery and devices, usually containing electronic components, that perform various functions in information processing.


Hardware refers to all visible devices that are assembled together to build a
computer system. These include various input, output, storage, processing and control
devices.
Software
It is basically “the set of instructions grouped into programs that make the
computer to function in the desired way. It is a collection of programs to perform a
particular task.
It is responsible for controlling, integrating and managing the hardware
components of a computer and to accomplish specific tasks.
Types of Software


System Software
System software is a collection of programs designed to operate, control and extend the processing capabilities of the computer, making the operation of a computer system more effective and efficient.
System software consists of several programs that are directly responsible for controlling, integrating, and managing the individual hardware components of a computer system. This software manages and supports the computer system and its information processing activities. It is largely transparent and little noticed by users, who usually interact with the hardware or the application software.
Types of System Software
1. Operating System
 Operating system is the first layer of software loaded into computer
memory when it starts up. It provides a software platform on top of which
other programs can run.
 It is a set of programs that controls and supervises the operations of computer
system and provides the services to computer users.
 OS is defined as the program that instructs the computer how to work with its
various components.
2. Device Drivers
 Device drivers are system programs, which are responsible for proper functioning
of devices.
 Whenever a new device is added to the computer system, a new device driver
must be installed before the device is used.
 Every device, whether it is a printer, monitor, mouse or keyboard, has a driver
associated with it for its proper functioning.
Functionalities 
 A driver acts like a translator between the device and program that uses the


device. Note that each device has its own set of specialized commands that
only its driver understands.
 A device driver is not an independent program; it assists and is assisted by the
operating system for the proper functioning of the device.
3. Language Translators
 Computers only understand a language consisting of 0s and 1s called Machine
Language. To ease the burden of programming entirely in 0s and 1s, special
programming languages called high-level programming languages were
developed that resemble natural languages like English.
 Along with every programming language developed, a language translator was
also developed, which accepts the programs written in a programming language
and executes them by transforming them into a form suitable for execution.
Types of Language Translators 
Compiler – Purpose
 The programs written in any high-level programming language are converted into machine language using a compiler.
 As a system program, the compiler translates source code (the user-written form) into object code (binary form).
Interpreter – Purpose
 An interpreter analyses and executes the source code in a line-by-line manner, without looking at the entire program.
 It translates a statement in a program and executes the statement immediately, before translating the next source language statement.
Assembler – Purpose
 Compared to all other programming languages, assembly language is closest to machine code. Assembly language is fundamentally a symbolic representation of machine code.


 The assembly language program must be translated into machine code by a separate program called an assembler.
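To make the interpreter's line-by-line behaviour concrete, here is a toy sketch in Python. The mini-language, the interpret function, and its two statement forms are all invented for illustration; real interpreters are far more elaborate. Each statement is translated and executed immediately, before the next one is even looked at:

```python
# A toy line-by-line "interpreter" (hypothetical, for illustration only).
def interpret(source):
    env = {}                  # variable bindings built up as statements execute
    outputs = []
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("print "):              # e.g. "print x"
            outputs.append(env[line.split()[1]])
        else:                                      # e.g. "x = 2 + 3"
            name, expr = line.split("=", 1)
            env[name.strip()] = eval(expr, {}, env)  # execute immediately
    return outputs

program = """
x = 2 + 3
y = x * 4
print y
"""
print(interpret(program))  # -> [20]
```

A compiler, by contrast, would translate the whole program to object code first and only then run it.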
 Sample Diagram for Conversion

4. System Utilities
 System utility programs are used to support, enhance, and secure existing
programs and data in the computer system.
 They are mainly used to perform routine functions like loading, saving a
program and keep track of the files on the disk.
Application Software
 Application software is a set of programs that allows the computer to perform a specific data processing job for the user. It helps the user to work faster, more effectively and more productively.
 An application is the job a user wants the computer to perform.
 Application software is dependent on system software. System software acts as an interface between the user and the computer hardware, while application software performs specific tasks.
 Without application software, the computer, no matter how powerful, will not be

helpful in meeting user requirements.
 There are many application software packages available, ranging from simple applications to complex scientific and engineering applications.
Examples of Application Software:
 Word Processors
 Spreadsheets
 Presentation Software
 Image Editors
 DBMS
 Desktop Publishing Software
EIGHT GREAT IDEAS IN COMPUTER ARCHITECTURE
There are eight great ideas that computer architects have invented in the last 60 years of computer design. These ideas are so powerful that they have lasted long after the first computers that used them, with newer architects demonstrating their admiration by imitating their predecessors.
These eight great ideas enhance the performance of a system through techniques such as pipelining and prediction, and improve its efficiency and dependability through the memory hierarchy and redundancy.
1. Design for Moore's Law

The one constant for computer designers is rapid change, which is driven largely by
Moore's Law. It states that integrated circuit resources double every 18–24 months. As
computer designs can take years, the resources available per chip can easily double or


quadruple between the start and finish of the project. We use an "up and to the right"
Moore's Law graph to represent designing for rapid change.

2. Use Abstraction to Simplify Design

Both computer architects and programmers had to invent techniques to make themselves
more productive, for otherwise design time would lengthen as dramatically as resources
grew by Moore's Law. A major productivity technique for hardware and software is to
use abstractions to represent the design at different levels of representation; lower-level
details are hidden to offer a simpler model at higher levels.
3. Make the common case fast

Making the common case fast will tend to enhance performance better than optimizing the rare case. Ironically, the common case is often simpler than the rare case and hence is often easier to enhance. We use a sports car as the icon for making the common case fast, as the most common trip has one or two passengers, and it's surely easier to make a fast sports car than a fast minivan.
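A rough numerical sketch of this idea, with all fractions and times invented for illustration: when the common case accounts for most of the execution time, halving it buys far more than halving the rare case.

```python
# Hypothetical workload: 90% of operations are the "common case".
# All numbers below are made up for illustration.
def total_time(common_t, rare_t, common_frac=0.9):
    """Weighted execution time of a mix of common and rare operations."""
    return common_frac * common_t + (1 - common_frac) * rare_t

baseline    = total_time(10.0, 30.0)  # 0.9*10 + 0.1*30 = 12.0
fast_common = total_time(5.0, 30.0)   # halve the common case: 7.5
fast_rare   = total_time(10.0, 15.0)  # halve the rare case: 10.5

# Speeding the common case yields a 1.6x speedup; the rare case only ~1.14x.
print(baseline / fast_common, baseline / fast_rare)
```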
4. Performance via parallelism


Since the dawn of computing, computer architects have offered designs that get more
performance by performing operations in parallel. We'll see many examples of
parallelism in this book. We use multiple jet engines of a plane as our icon for parallel
performance.
5. Performance via pipelining

A particular pattern of parallelism is so prevalent in computer architecture that it merits


its own name: pipelining. For example, before fire engines, a "bucket brigade" would respond to a fire, as many cowboy movies show in response to a dastardly act by the villain. The townsfolk form a human chain to carry water from its source to the fire, since they can move buckets up the chain much more quickly than individuals running back and forth. Our pipeline icon is a sequence of pipes, with each section representing one stage of the pipeline.
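The bucket-brigade idea can be captured in a back-of-the-envelope timing model. The functions and stage counts below are illustrative, not taken from these notes:

```python
def sequential_time(n_tasks, n_stages, stage_time=1.0):
    # Without pipelining, each task passes through all stages alone.
    return n_tasks * n_stages * stage_time

def pipelined_time(n_tasks, n_stages, stage_time=1.0):
    # The first task needs n_stages steps; once the pipeline is full,
    # one task completes every stage_time thereafter.
    return (n_stages + n_tasks - 1) * stage_time

# Four tasks through a four-stage pipeline: 16 steps vs 7 steps.
print(sequential_time(4, 4), pipelined_time(4, 4))  # -> 16.0 7.0
```

As n_tasks grows, the speedup approaches the number of stages.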
6. Performance via prediction

Following the saying that it can be better to ask for forgiveness than to ask for
permission, the next great idea is prediction. In some cases it can be faster on average to
guess and start working rather than wait until you know for sure, assuming that the
mechanism to recover from a misprediction is not too expensive and your prediction is
relatively accurate. We use the fortune-teller's crystal ball as our prediction icon.


7. Hierarchy of memories
Programmers want memory to be fast, large, and cheap, as memory speed often shapes
performance, capacity limits the size of problems that can be solved, and the cost of
memory today is often the majority of computer cost.

Architects have found that they can address these conflicting demands with a hierarchy
of memories, with the fastest, smallest, and most expensive memory per bit at the top of
the hierarchy and the slowest, largest, and cheapest per bit at the bottom. We use a
layered triangle icon to represent the memory hierarchy. The shape indicates speed, cost,
and size.
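One standard way to quantify the benefit of such a hierarchy is the average memory access time for a two-level hierarchy. This formula is not derived in these notes but is widely used in textbooks; the numbers below are invented:

```python
def amat(hit_time, miss_rate, miss_penalty):
    # Average memory access time: every access pays the fast level's hit
    # time; the fraction that misses additionally pays the slower level's
    # penalty.
    return hit_time + miss_rate * miss_penalty

# e.g. 1 ns cache hit, 5% miss rate, 100 ns to reach main memory:
print(amat(1.0, 0.05, 100.0))  # -> 6.0 (ns)
```

Most accesses hit the small fast memory, so the average stays close to its speed while the capacity is that of the large slow memory.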
8. Dependability via redundancy

Computers not only need to be fast; they need to be dependable. Since any physical device can fail, we make systems dependable by including redundant components that can take over when a failure occurs and help detect failures. We use the tractor-trailer as our icon, since the dual tires on each side of its rear axles allow the truck to continue driving even when one tire fails.
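A minimal sketch of redundancy in software terms is triple modular redundancy with a majority vote; the function below is illustrative, not from these notes:

```python
def majority_vote(a, b, c):
    # With three redundant copies of a result, at least two agree
    # unless more than one component has failed.
    return a if (a == b or a == c) else b

# One faulty replica (the 0) is outvoted by the two good ones:
print(majority_vote(1, 1, 1), majority_vote(1, 0, 1), majority_vote(0, 1, 1))
```

The vote both masks a single failure and, when the inputs disagree, detects that some component has failed.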
COMPONENTS OF A COMPUTER SYSTEM
The five classic components of a computer are input, output, memory, datapath,
and control, with the last two sometimes combined and called the processor.
Basic Functional Units
Introduction

A computer system consists of a number of interrelated components that work


together with the aim of converting data into information.
To attain information, data is entered through input devices. This data is
processed using the central processing unit and then the processed data is displayed to
the users using various output devices. All these parts are referred to as Hardware of
the computer.

Basic Computer Organization - Block diagram
[Figure: the Input and Output units connect, through the I/O system, to the Memory and the Processor; the Processor consists of the Arithmetic and Logic unit and the Control unit.]
Main Parts of Computer


Input Unit
Input devices are electromechanical devices that allow the user to feed
information into the computer for analysis, storage and to give commands to the central
processing unit.
Functions of input unit
Accept data and instructions from the outside world.
Convert it to a form that the computer can understand.
Supply the converted data to the computer system for further processing.
Input devices – Examples


Keyboard
Mouse
Light Pen
Digitizer
Trackball
Joystick
Example for an Input Device
Mouse
A computer mouse is a handheld hardware input device that controls a cursor in
a GUI (graphical user interface) for pointing, moving and selecting text, icons, files,
and folders on your computer. In addition to these functions, a mouse can also be used
to drag-and-drop objects and give you access to the right-click menu.
For desktop computers, the mouse is placed on a flat surface (e.g., mouse pad or desk) in
front of your computer.

Output Unit
Computers can communicate with human beings using output devices. Output
devices take the machine-coded output results from the CPU and convert them into a
form that is easily readable by human beings.
Functions of output unit
Output unit is the communication between the user and the computer.
It provides the information and results of a computation to the outside
world.
It also converts the binary data into a form that users can understand.
Output devices – Examples

VDU or Monitor
Printer
Plotter
Liquid Crystal Displays (LCDs)

Example for an Output Device


Liquid Crystal Displays (LCDs)
All laptop and handheld computers, calculators, cellular phones, and almost all desktop computers now use liquid crystal displays (LCDs) to get a thin, low-power display. The LCD is not the source of light; instead it controls the transmission of light.
Central Processing Unit (CPU)
The CPU is the heart of the computer system; it converts data (input) into meaningful information (output). It is a highly complex, extensive set of electronic circuitry that executes stored program instructions.
Functions of CPU
It performs all calculations and all decisions.
It controls and co-ordinates all units of the computer.
It interprets instructions of a program.
It stores data temporarily and monitors external requests.
Parts of CPU
Control Unit
This unit checks the correctness of sequence of operations. It fetches program
instruction from the primary storage unit, interprets them, and ensures correct execution
of the program. It also controls the input / output devices and directs the overall
functioning of the other units of the computer.
Uses


It instructs the computer how to carry out program instructions.


It directs the flow of data between different parts of the computer.
It controls all other units in the computer
Arithmetic & Logic Unit (ALU)
It contains the electronic circuitry that executes all arithmetic and logical
operations on the data made available to it.
Uses
All calculations are performed in the ALU of the computer.
It also does comparisons and takes decisions.
Memory Unit
Computers require memory to process data and store output.
Memory refers to the electronic holding place for instructions and data.
Memory is a device, which is used to store information temporarily / permanently, it
is a place where information is safely kept.
Uses
Used to store data and retrieve for later use.

TYPES OF MEMORY
Primary Memory
It is also known as main memory, stores data and instructions temporarily for
processing.
It is an integral component of the CPU but physically, it is a separate part placed on
the computer’s motherboard.
It can be further classified into random access memory (RAM) and read only memory
(ROM).
Functions of Primary Memory
o Used to hold the program being currently executed in the computer.


o Fast Access – usually in the order of nanoseconds.


o Contents will be lost when we switch off the computer.
Secondary Memory
Also known as auxiliary memory or external memory.
Used for storing instructions and data permanently.
It is slower and cheaper than the primary memory.
Functions of Secondary Memory
o Not directly accessible – data is transferred through an I/O system.
o Non-volatile, long-term storage.
o Holds large amounts of data.
o Works slower than primary memory.
Uses
o Used mainly for taking back-ups of large data.
o To store and retrieve data for later use.
Motherboard
If we open the box containing the computer, we see a fascinating board of thin plastic, covered with dozens of small gray or black rectangles. The motherboard is a plastic board containing packages of integrated circuits or chips, including the processor, cache, memory, and connectors for I/O devices such as networks and disks.
Integrated Chips (IC’s)
The small rectangles on the motherboard contain the devices that drive our
advancing technology, called integrated circuits and nicknamed chips. The board is
composed of three pieces: the piece connecting to the I/O devices, the memory, and the
processor.
Memory
The memory is where the programs are kept when they are running; it also
contains the data needed by the running programs.
1. Cache Memory


Cache memory consists of a small, fast memory that acts as a buffer for a slower, larger
memory. Cache is built using a different memory technology, SRAM.
A CPU cache is a cache used by the central processing unit (CPU) of a computer to
reduce the average time to access memory. The cache is a smaller, faster memory which
stores copies of the data from frequently used main memory locations. Most CPUs have
different independent caches, including instruction and data caches, where the data cache
is usually organized as a hierarchy of more cache levels (L1, L2 etc.)
2. DRAM
DRAM stands for Dynamic Random Access Memory that provides random access to any
location. Several DRAMs are used together to contain the instructions and data of a
program. The RAM portion of the term DRAM means that memory accesses take
basically the same amount of time no matter what portion of the memory is used.
3. SRAM
SRAM is more expensive and less dense than DRAM and is therefore not used for high-
capacity, low-cost applications such as the main memory in personal computers.
SRAM is a type of semiconductor memory that uses bistable latching circuitry to store
each bit. The term static differentiates it from dynamic RAM (DRAM) which must be
periodically refreshed. SRAM exhibits data remanence, but it is still volatile in the
conventional sense that data is eventually lost when the memory is not powered.
Processor / CPU
The processor is the active part of the board, following the instructions of a
program to do a specific task. It adds numbers, tests numbers, signals I/O devices to
activate, and so on. People call the processor the CPU or Central Processing Unit.
The processor logically comprises two main components: datapath and control,
the respective brawn and brain of the processor. The datapath performs the arithmetic
operations, and control tells the datapath, memory, and I/O devices what to do according
to the wishes of the instructions of the program.


BASIC OPERATIONAL CONCEPTS


To perform a given task, an appropriate program consisting of a list of instructions is stored in the memory. Individual instructions are brought from the memory into the processor, which executes the specified operations. Data to be used by the program are also stored in the memory.
Example: Add LOCA, R0
This instruction adds the operand at memory location LOCA to the operand in register R0 and places the sum into register R0. This instruction requires the performance of several steps:
1. First the instruction is fetched from the memory into the processor.
2. The operand at LOCA is fetched and added to the contents of R0
3. Finally the resulting sum is stored in the register R0
The preceding Add instruction combines a memory access operation with an ALU operation. In some other types of computers, these two operations are performed by separate instructions for performance reasons:
Load LOCA, R1
Add R1, R0

Transfers between the memory and the processor are started by sending the
address of the memory location to be accessed to the memory unit and issuing
the appropriate control signals. The data are then transferred to or from the memory.


PROCESSOR

Connections between the processor and the memory

The fig shows how memory & the processor can be connected. In addition to the ALU &
the control circuitry, the processor contains a number of registers used for several
different purposes.
The instruction register (IR): Holds the instruction that is currently being executed. Its output is available to the control circuits, which generate the timing signals that control the various processing elements during the execution of the instruction.
The program counter PC:-
This is another specialized register that keeps track of execution of a program. It contains
the memory address of the next instruction to be fetched and executed.
Besides the IR and PC, there are n general-purpose registers, R0 through Rn-1.
The other two registers which facilitate communication with memory are: -
1. MAR – (Memory Address Register):- It holds the address of the location to be


accessed.
2. MDR – (Memory Data Register):- It contains the data to be written into or read out of the addressed location.
Operating steps are
1. Programs reside in the memory and usually enter it through the input unit.
2. Execution of the program starts when the PC is set to point at the
first instruction of the program.
3. Contents of PC are transferred to MAR and a Read Control Signal is sent to the
memory.
4. After the time required to access the memory elapses, the requested word is read out of the memory and loaded into the MDR.
5. Now contents of MDR are transferred to the IR & now the instruction is
ready to be decoded and executed.
6. If the instruction involves an operation by the ALU, it is necessary to obtain the
required operands.
7. An operand in the memory is fetched by sending its address to MAR &
Initiating a read cycle.
8. When the operand has been read from the memory to the MDR, it is
transferred from MDR to the ALU.
9. After one or two such repeated cycles, the ALU can perform the desired
operation.
10. If the result of this operation is to be stored in the memory, the result is sent to
MDR.
11. Address of location where the result is stored is sent to MAR & a write cycle
is initiated.
12. The contents of PC are incremented so that PC points to the next
instruction that is to be executed.
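The operating steps above can be sketched in executable form. The fragment below is a simplified, hypothetical simulation: the two-word (opcode, operand address) instruction format, the ACC accumulator, and the memory contents are all made up for illustration and do not correspond to any real ISA.

```python
# Tiny made-up program: load memory[100], add memory[101], store in memory[102].
memory = {0: ("LOAD", 100), 1: ("ADD", 101), 2: ("STORE", 102), 3: ("HALT", 0),
          100: 7, 101: 5, 102: 0}
PC, ACC = 0, 0
while True:
    MAR = PC                 # step 3: contents of PC go to MAR, Read is issued
    MDR = memory[MAR]        # step 4: addressed word arrives in MDR
    IR = MDR                 # step 5: MDR -> IR; instruction is decoded
    PC += 1                  # step 12: PC now points at the next instruction
    op, addr = IR
    if op == "LOAD":
        MAR = addr; MDR = memory[MAR]; ACC = MDR    # steps 7-8: fetch operand
    elif op == "ADD":
        MAR = addr; MDR = memory[MAR]; ACC += MDR   # step 9: ALU operation
    elif op == "STORE":
        MDR = ACC; MAR = addr; memory[MAR] = MDR    # steps 10-11: write result
    elif op == "HALT":
        break
print(memory[102])  # -> 12
```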


 Normal execution of a program may be preempted (temporarily interrupted) if some device requires urgent servicing; to do this, the device raises an interrupt signal.
 An interrupt is a request signal from an I/O device for service by the processor. The processor provides the requested service by executing an appropriate interrupt service routine.
 This diversion may change the internal state of the processor, so its state must be saved in memory locations before servicing the interrupt. When the interrupt service routine is completed, the state of the processor is restored so that the interrupted program may continue.

PERFORMANCE
Introduction
When trying to choose among different computers, performance is an important attribute.
Accurately measuring and comparing performance of different computers is critical to
both purchasers and to designers.
Performance Measurement is a Challenging Task
Assessing the performance of computers can be a quite challenging task. The scale and intricacy of modern software systems, together with the wide range of performance improvement techniques employed by hardware designers, have made performance assessment much more difficult.
Response Time / Execution Time
Response time is the time between the start and completion of a task. It is defined
as the total time required for the computer to complete a task, including disk accesses,
memory accesses, I/O activities, operating system overhead, CPU execution time, and so
on.
Throughput / Bandwidth


Throughput is the total amount of work done in a given time (or) It is the number
of tasks completed per unit time. As an individual computer user, we are interested in
reducing the response time and for as a datacenter manager, we are often interested in
increasing the throughput.
Performance – Definition
To maximize performance, we want to minimize the response time or execution time for some task. Thus, we can relate the performance and execution time for a computer X as follows:

Performance_X = 1 / Execution time_X

This means that for two computers X and Y, if the performance of X is greater than the performance of Y, then we have

Performance_X > Performance_Y
1 / Execution time_X > 1 / Execution time_Y
Execution time_Y > Execution time_X

That is, the execution time on Y is longer than that on X.
In discussing a computer design, we often want to relate the performance of two
different computers quantitatively. We will use the phrase “X is ‘n’ times faster than Y”
or equivalently “X is ‘n’ times as fast as Y”.

If X is 'n' times faster than Y, then the execution time on Y is 'n' times longer than it is on X:

Performance_X / Performance_Y = Execution time_Y / Execution time_X = n

Example Problem:


If computer A runs a program in 10 seconds and computer B runs the same program in
15 seconds, how much faster is A than B?

Solution: We know that A is 'n' times faster than B if

Performance_A / Performance_B = Execution time_B / Execution time_A = n

Thus the performance ratio is

15 / 10 = 1.5

and A is therefore 1.5 times faster than B.

In the above example, we could also say that computer B is 1.5 times slower than computer A, since

Performance_A / Performance_B = 1.5

means that

Performance_A / 1.5 = Performance_B
Measuring Performance
Time is the measure of computer performance: the computer that performs the
same amount of work in the least time is the fastest. Time can be measured in different
ways, depending on what we count. The most straight forward definition of time is called
wall clock time, response time, or elapsed time.
Program Execution Time
Program execution time is measured in seconds per program. It is defined as the
total time required to complete a task, including disk accesses, memory accesses,
input/output (I/O) activities, operating system overhead, etc.


CPU Execution Time / CPU Time


CPU time is the actual time that the CPU spends in computing for a specific task
and does not include time spent waiting for I/O or in running other programs.
Types of CPU Time
There are two types of CPU time.
1. User CPU Time – CPU time spent in executing the user program.
2. System CPU Time – CPU time spent by the operating system
performing the tasks on behalf of the program.
Clock Rate
Computer designers may want to think about a computer by using a measure that
relates to how fast the hardware can perform basic functions. Almost all computers are
constructed using a clock that determines when events take place in the hardware. These
discrete time intervals are called clock cycles (or ticks, clock ticks, clock periods, clocks,
cycles).
Designers refer to the length of a clock period as the time for a complete clock
cycle and the clock rate is the inverse of the length of the clock period.
CPU Performance & its Factors
Users and designers often examine performance using different metrics. If we
could relate these different metrics, we could determine the effect of a design change on
the performance as experienced by the user. Since we are trying to measure CPU
performance at this point, the bottom-line performance measure is the CPU execution
time. A simple formula relates the most basic metrics to CPU time:
CPU execution time for a program = CPU clock cycles for a program x Clock
cycle time
Alternatively, since clock rate and clock cycle time are inversely proportional, we
can write

CPU execution time for a program = CPU clock cycles for a program / Clock rate

This formula makes it clear that the hardware designer can improve performance
by reducing the number of clock cycles required for a program or the length of the clock
cycle.
Example Problem:
Our favorite program runs in 10 seconds on computer A, which has a 2 GHz clock. We
are trying to help a computer designer to build a computer B, which will run this program
in 6 seconds. The designer has determined that a substantial increase in the clock rate is
possible, but this increase will affect the rest of the CPU design, causing computer B to
require 1.2 times as many clock cycles as computer A for this program. What clock rate
should we tell the designer to target?
Solution:
Let’s first find the number of clock cycles required for the program on computer A:

CPU time A = CPU clock cycles A / Clock rate A

10 seconds = CPU clock cycles A / (2 x 10^9 cycles/second)

CPU clock cycles A = 10 seconds x 2 x 10^9 cycles/second = 20 x 10^9 cycles

CPU time for B can be found using this equation:

CPU time B = (1.2 x CPU clock cycles A) / Clock rate B

6 seconds = (1.2 x 20 x 10^9 cycles) / Clock rate B

Clock rate B = (1.2 x 20 x 10^9 cycles) / 6 seconds = 4 x 10^9 cycles/second = 4 GHz


To run the program in 6 seconds, B must have twice the clock rate of A.
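The arithmetic of this example can be verified with a small Python sketch (the function name required_clock_rate is illustrative, not part of any standard library):

```python
def required_clock_rate(time_a_s, rate_a_hz, cycle_ratio, target_time_s):
    """Clock rate computer B needs to finish in target_time_s, given
    that B uses cycle_ratio times as many cycles as A."""
    cycles_a = time_a_s * rate_a_hz        # clock cycles used by A
    cycles_b = cycle_ratio * cycles_a      # B needs 1.2x as many cycles
    return cycles_b / target_time_s        # rate = cycles / time

rate_b = required_clock_rate(10, 2e9, 1.2, 6)
print(rate_b / 1e9)   # → 4.0 (GHz)
```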
Instruction Performance
The performance equations above did not include any reference to the number of
instructions needed for the program. One way to think about execution time is that it is
equal to the number of instructions executed multiplied by the average time per
instruction. Therefore, the number of clock cycles required for a program can be
written as
CPU clock cycles = Instructions for a program x Average clock cycles per
instruction
Clock cycles Per Instruction (CPI)
The term clock cycles per instruction (CPI) is the average number of clock cycles
each instruction of a program takes to execute. Since different instructions may take
different amounts of time depending on what they do, CPI is the average of all the
instructions executed in the program. CPI provides one way of comparing two different
implementations of the same instruction set architecture, since the number of instructions
executed for a program will be the same.
Example Problem
Suppose we have two implementations of the same instruction set architecture. Computer
A has a clock cycle time of 250 ps and a CPI of 2.0 for some program, and computer B
has a clock cycle time of 500 ps and a CPI of 1.2 for the same program. Which computer
is faster for this program and by how much?
Solution:
We know that each computer executes the same number of instructions for the
program. Let’s assume this number as I. First let us find the number of processor clock
cycles for each computer:
CPU clock cycles A = I x 2.0
CPU clock cycles B = I x 1.2

Now we can compute the CPU time for each computer:

CPU time A = CPU clock cycles A x Clock cycle time A
           = I x 2.0 x 250 ps = 500 x I ps


Similarly for computer B,

CPU time B = CPU clock cycles B x Clock cycle time B
           = I x 1.2 x 500 ps = 600 x I ps
Clearly computer A is faster. The amount by which it is faster is given by the ratio
of the execution times:

CPU time B / CPU time A = (600 x I ps) / (500 x I ps) = 1.2
So, we can conclude that computer A is 1.2 times as fast as computer B for this
program.
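The comparison can be reproduced in Python; note that the instruction count I cancels out of the ratio (a sketch with illustrative names, not MIPS-specific code):

```python
def cpu_time_ps(instruction_count, cpi, cycle_time_ps):
    """CPU time = instruction count x CPI x clock cycle time (in ps)."""
    return instruction_count * cpi * cycle_time_ps

I = 1_000_000                      # any instruction count works; it cancels
time_a = cpu_time_ps(I, 2.0, 250)  # 500 x I ps
time_b = cpu_time_ps(I, 1.2, 500)  # 600 x I ps
print(time_b / time_a)             # → 1.2 (A is 1.2 times as fast)
```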

The Classic CPU Performance Equation


We can now write the basic performance equation in terms of instruction count,
CPI, and clock cycle time.
CPU Time = Instruction Count x CPI x Clock cycle time
Since the clock rate is the inverse of clock cycle time,

CPU Time = (Instruction Count x CPI) / Clock rate

These formulas are particularly useful because they separate the three key factors that
affect the performance. We can use these formulas to compare two different
implementations or to evaluate a design alternative if we know its impact on these three
parameters.
The basic components of performance and how each is measured:

Components of performance               Units of measure
CPU execution time for a program        Seconds for the program
Instruction count                       Instructions executed for the program
Clock cycles per instruction (CPI)      Average number of clock cycles per instruction
Clock cycle time                        Seconds per clock cycle
Always bear in mind that the only complete and reliable measure of computer
performance is time. The performance of a program depends on the algorithm, the
language, the compiler, the architecture, and the actual hardware.
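As a quick sanity check of the classic performance equation, here is a hedged Python sketch (the function name and the example numbers are mine):

```python
def cpu_execution_time_s(instruction_count, cpi, clock_rate_hz):
    """CPU Time = (Instruction Count x CPI) / Clock rate, in seconds."""
    return instruction_count * cpi / clock_rate_hz

# e.g. one billion instructions at CPI 2.0 on a 2 GHz clock:
print(cpu_execution_time_s(1e9, 2.0, 2e9))   # → 1.0 second
```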

INSTRUCTIONS: LANGUAGE OF THE COMPUTER


Instruction
To command a computer’s hardware, you must speak its language. The words of a
computer’s language are called instructions and its vocabulary is called an instruction set.
An instruction is a command given by the user to the computer so that the
computer’s hardware will understand it and performs a specific task.
Instruction Set
A set of instruction designed for a particular hardware is referred to as its
instruction set. Instruction set is the collection of all commands understood by a given
architecture. Instruction Set Architecture (ISA) refers to actual programmer visible
instruction set. It will act as a boundary between the hardware and the software.
There are 3 popular instruction sets as follows:


1. ARM – Advanced RISC Machines – the most popular 32-bit instruction
set in the world; about 4 billion ARM processors were shipped in 2008.
2. MIPS – Microprocessor without Interlocked Pipeline Stages – an elegant
instruction set quite similar to ARM, though far less widespread.
3. Intel x86 – the instruction set inside all of the 300 million PCs made in 2008.
Instruction Set Architecture (ISA) / Architecture
One of the most important abstractions is the interface between the hardware and
the lowest-level software. Because of its importance, it is given a special name: the
instruction set architecture, or simply architecture, of a computer.
The instruction set architecture includes anything programmers need to know to
make a binary machine language program work correctly, including instructions, I/O
devices, and so on. Typically, the operating system will encapsulate the details of doing
I/O, allocating memory, and other low-level system functions so that application
programmers do not need to worry about such details.
There are mainly four types of instruction formats:
 Three address instructions
 Two address instructions
 One address instructions
 Zero address instructions
Three address instructions
Computers with three address instructions formats can use each address field to
specify either a processor register or a memory operand. The program in assembly
language that evaluates X= (A+B)*(C+D) is shown below, together with comments that
explain the register transfer operation of each instruction.
ADD R1, A, B    ; R1 ← M[A] + M[B]
ADD R2, C, D    ; R2 ← M[C] + M[D]
MUL X, R1, R2   ; M[X] ← R1 x R2


It is assumed that the computer has two processor registers, R1 and R2. The
symbol M[A] denotes the operand at memory address symbolized by A.
Two address instructions
Two address instructions are the most common in commercial computers. Here
again each address field can specify either a processor register or a memory word. The
program to evaluate X = (A+B)*(C+D) is as follows:

MOV R1, A    ; R1 ← M[A]
ADD R1, B    ; R1 ← R1 + M[B]
MOV R2, C    ; R2 ← M[C]
ADD R2, D    ; R2 ← R2 + M[D]
MUL R1, R2   ; R1 ← R1 x R2
MOV X, R1    ; M[X] ← R1
The MOV instruction moves or transfers the operands to and from memory and
processor registers. The first symbol listed in an instruction is assumed to be both a source
and the destination where the result of the operation is transferred.
One address instructions
One address instructions use an implied accumulator (AC) register for all data
manipulation. For multiplication and division there is a need for a second register.
However, here we will neglect the second register and assume that the AC contains the
result of all operations. The program to evaluate X = (A+B)*(C+D) is

LOAD A     ; AC ← M[A]
ADD B      ; AC ← AC + M[B]
STORE T    ; M[T] ← AC
LOAD C     ; AC ← M[C]
ADD D      ; AC ← AC + M[D]
MUL T      ; AC ← AC x M[T]
STORE X    ; M[X] ← AC
All operations are done between the AC register and a memory operand. T is the
address of a temporary memory location required for storing the intermediate result.
Commercially available computers also use this type of instruction format.

Zero address instructions


A stack organized computer does not use an address field for the instructions ADD and
MUL. The PUSH and POP instructions, however, need an address field to specify the
operand that communicates with the stack. The following program shows how
X = (A+B)*(C+D) will be written for a stack organized computer. (TOS stands for top of
stack.)

PUSH A    ; TOS ← A
PUSH B    ; TOS ← B
ADD       ; TOS ← (A + B)
PUSH C    ; TOS ← C
PUSH D    ; TOS ← D
ADD       ; TOS ← (C + D)
MUL       ; TOS ← (C + D) x (A + B)
POP X     ; M[X] ← TOS
To evaluate arithmetic expressions in a stack computer, it is necessary to
convert the expression into reverse Polish notation. The name “zero address” is given to
this type of computer because of the absence of an address field in computational
instructions.
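To see how a zero-address machine evaluates X = (A+B)*(C+D) through its reverse-Polish form A B + C D + *, here is a toy stack-machine sketch in Python (the class and its method names are my own illustration, not a real ISA):

```python
class StackMachine:
    """Toy zero-address machine: PUSH/POP carry an address, ADD/MUL do not."""
    def __init__(self, memory):
        self.mem = dict(memory)    # symbolic memory: name -> value
        self.stack = []            # TOS is the end of the list

    def push(self, addr): self.stack.append(self.mem[addr])
    def pop(self, addr):  self.mem[addr] = self.stack.pop()
    def add(self):        self.stack.append(self.stack.pop() + self.stack.pop())
    def mul(self):        self.stack.append(self.stack.pop() * self.stack.pop())

m = StackMachine({'A': 2, 'B': 3, 'C': 4, 'D': 5})
m.push('A'); m.push('B'); m.add()   # TOS = A + B
m.push('C'); m.push('D'); m.add()   # TOS = C + D
m.mul()                             # TOS = (A + B) * (C + D)
m.pop('X')
print(m.mem['X'])   # → (2+3)*(4+5) = 45
```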
MIPS - Instruction Format
General Syntax

Label : Opcode Destination Operand, Source1 Operand, Source2 Operand; # Comment Lines

Where
Label – is a user defined variable name which is used for referencing to a
particular line of code.
Opcode – is a reserved Mnemonic code or Keyword which specifies what type of
operation is going to be performed in that instruction.
Destination Operand – is a register / location where the computation result is to be
stored.


Source1, Source2 Operands – is for specifying the input values for doing the
operation.
# Comment Lines – is used for writing comments about that instruction.
Example
Given a C Language code – a = b + c;
Then the compiler will translate this C language code into MIPS assembly
language instruction as follows:
add $s1, $s2, $s3;
Here, the variables a, b and c are assumed to be stored in register $s1, $s2 and $s3.
OPERANDS OF THE COMPUTER HARDWARE
Operand is a variable used to hold some data or instruction or an address. Usage of
operands is varied from programming language to another.
One major difference between the variables of a programming language and
registers is the limited number of registers, typically 16 to 32 on current computers.
There are 3 types of operands you can use in MIPS architecture. They are
 Registers
 Memory Operands / Addresses
 Constant / Immediate Operands
Registers
Registers are primitives used in hardware design that are also visible to the
programmer once the computer is completely designed. So registers forms the basic
building blocks of computer construction.
The size of a register in the MIPS architecture is 32-bits; groups of 32 bits occur so
frequently that they are given the name words in the MIPS architecture. A word is the
natural unit of access in a computer (32-bits) which corresponds to the size of a register
in the MIPS architecture.


MIPS – Register Set

Name        Number    Usage
$zero       0         the constant value 0
$at         1         reserved for the assembler
$v0–$v1     2–3       values for results and expression evaluation
$a0–$a3     4–7       arguments
$t0–$t7     8–15      temporaries
$s0–$s7     16–23     saved
$t8–$t9     24–25     more temporaries
$k0–$k1     26–27     reserved for the operating system
$gp         28        global pointer
$sp         29        stack pointer
$fp         30        frame pointer
$ra         31        return address

Register 1, called $at, is reserved for the assembler, and registers 26−27, called
$k0−$k1, are reserved for the operating system.

Memory Operands
Programming languages have simple variables that contain single data elements
but they also have more complex data structures such as arrays and structures. These
complex data structures can contain many more data elements than there are registers in a
computer. The processor can keep only a small amount of data in registers, but computer
memory contains billions of data elements. Hence, complex data structures are kept in
memory.


Memory Address
To access a word in memory, the instruction must supply the memory address.
Memory address is a value used to identify the location of a specific data element within
a memory array. Memory is just a large, single-dimensional array, with the address acting
as the index to the array, starting at 0.

Sequential Memory addresses and contents Actual MIPS memory addresses


and contents
of memory at those locations of memory for those words.
Examples
1. Load Instruction
The data transfer instruction that copies data from memory to a register is traditionally
called load. The actual MIPS name for this instruction is ‘lw’, standing for ‘load word’.
Compile this C assignment statement into MIPS instruction / code:
g = h + A[8] ;
Let us assume that A is an array of 100 words and the compiler has associated the
variables g and h with the registers $s1 and $s2 and uses $t0 as a temporary register. Let
us also assume that the starting address or base address of the array is in $s3.
Although this is a single assignment statement, one of the operands is in memory, so we
must first transfer A[8] to a register. The address of this array element is the sum of the
base of the array A, found in register $s3, plus the number to select element 8. The data
should be placed in a temporary register for use in the next instruction.

The equivalent MIPS code is given by:


lw $t0, 32 ($s3) ; temporary register $t0 gets A[8]
The following instruction can operate on the value in $t0 since it is in a register. The
instruction must add $s2 with $t0 and put the sum in the register corresponding to g
($s1).
add $s1, $s2, $t0 ; g = h + A[8]
The constant value in the data transfer instruction (8) is called the offset, and the register
added to form the address ($s3) is called the base register or index register.
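The offset computation the compiler performs here (array element 8 sits at byte offset 8 x 4 = 32) can be sketched in Python; the base address value is an assumption for illustration only:

```python
WORD_SIZE = 4   # bytes per MIPS word

def element_address(base_addr, index):
    """Byte address of word-array element A[index], given the base address."""
    return base_addr + index * WORD_SIZE

base_in_s3 = 0x10000000   # assumed base address of A held in $s3
print(element_address(base_in_s3, 8) - base_in_s3)   # → 32, the lw offset
```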
2. Store Instruction
The instruction complementary to load is traditionally called store. It copies data from a
register to memory. The format of a store is similar to that of a load: the name of the
operation, followed by the register to be stored, then offset to select the array element,
and finally the base register.
Once again, the MIPS address is specified in part by a constant and in part by the
contents of a register. The actual name for store instruction is ‘sw’, standing for ‘store
word’.
Compile this C assignment statement into MIPS instruction / code:
A[12] = h + A[8] ;
Let us assume that variable h is associated with register $s2 and the base address of the
array is in $s3.
Although this is a single operation in the C language, two of the operands are in memory,
so we need even more MIPS instructions. The first two instructions are the same as the
previous example.
lw $t0, 32 ($s3) ; temporary register $t0 gets A[8]
add $t0, $s2, $t0 ; temporary register $t0 gets h + A[8]
The final instruction stores the sum into A[12], using 48 (4 x 12) as the offset and register
$s3 as the base register.


sw $t0, 48 ($s3) ; stores h + A[8] back into A[12]


Both load word and store word are the instructions that copy words between memory and
registers in the MIPS architecture.
Constant / Immediate Operands
Constant values are used as one of the operand for many arithmetic operations in
MIPS architecture. Constants have been placed in memory when the program was
loaded.
For example, to add constant value 4 to register $s3, we could use the code

lw $s5, AddrConstant4($s1)   ; $s5 = constant 4
add $s3, $s3, $s5            ; $s3 = $s3 + $s5 (where $s5 = 4)

assuming that $s1 + AddrConstant4 is the memory address of the constant 4.
An alternative method that avoids the load instruction is to offer the option of
having one of the operand of the arithmetic instructions to be a constant, called an
immediate operand.
To add 4 to register $s3, we will just write
addi $s3, $s3,4 ; $s3 = $s3 + 4
This quick add instruction with one constant operand is called add immediate or
addi.

OPERATIONS OF THE COMPUTER HARDWARE


Every computer must be able to perform different kinds of arithmetic operations
based on the given input values. To perform the fundamental arithmetic operations,
programming languages need certain instructions.
The MIPS assembly language notation for performing addition operation is
add a, b, c ;


It instructs a computer to add the two variables b and c and to put their sum in a.
Each MIPS arithmetic instruction performs only one operation and must always have
exactly three variables.
For example, suppose we want to place the sum of four variables b, c, d and e into
variable a. then the following sequence of instructions will do that work
add a, b, c ;
add a, a, d ;
add a, a, e ;

Thus, it takes three instructions to sum the four variables.
The natural number of operands for an operation like addition is three: the two
numbers being added together and a place to put the sum.

TYPES OF OPERATIONS / INSTRUCTIONS


1. Arithmetic Operations
Arithmetic operators are used for performing arithmetic operations like addition,
subtraction, etc. It takes two operands as input sources and performs the specified
operation given in the opcode field and puts the result in the destination.

2. Data Transfer Operations


Data transfer operators are used to transfer data from one register or memory
location/address to another register or memory location/address. Arithmetic
operations occur only on registers in MIPS instructions; thus MIPS must include
instructions that transfer data between memory and registers. Such instructions are
called data transfer instructions.

3. Logical Operations
Logical operators are used for performing logical operations like AND, OR and
NOR operations. It is also used for performing shift operations like shift left and shift
right, etc.

4. Conditional Branch Operations


Conditional branch operators are used to execute certain instructions based on the
outcome of the condition. If the condition is satisfied it executes a set of statements
and if the condition is false then it executes a different set of statements.


5. Unconditional Branch Operations


Unconditional branch statements are used to transfer the control from one point to
another point without checking any condition. The control is transferred to the branch
target address immediately.

Examples for Converting C Language Statements into MIPS


1. C Language Statements
a=b+c;
d=a–e;
MIPS code produced by the compiler is
add a, b, c ;
sub d, a, e ;
2. C Language Statement
f=(g+h)–(i+j);
Its equivalent MIPS code is given by
add t0, g, h ; Temporary variable t0 contains g + h

add t1, i, j ; Temporary variable t1 contains i + j


sub f, t0, t1 ; f gets t0 – t1, which is ( g+ h ) – (i + j )

REPRESENTING INSTRUCTIONS
An instruction is an order or command given to the computer processor by the user
in order to perform a particular task. At the lowest level, each instruction is a sequence of
0’s and 1’s that describes a physical operation that the computer is going to perform and
depending on the particular instruction type, the operation is varied.
Instruction Format
It is a form of representation of an instruction composed of fields of binary
numbers. Each instruction is encoded in binary machine code. All MIPS instructions are
encoded as 32-bit instruction words, divided into small segments called “fields”; each
field tells the computer something about the instruction. Also there is an attempt to
reuse fields across instructions as much as possible.
The MIPS designers keep all instructions the same length, thereby requiring
different kinds of instruction formats for different kinds of instructions.
In MIPS assembly language, registers $s0 to $s7 map onto registers 16 to 23 and
registers $t0 to $t7 map onto registers 8 to 15. So $s0 means register 16, $s1 means 17,
$s2 means 18, and so on; likewise $t0 means register 8, $t1 means register 9, and so on.
MIPS Instruction Coding Format
MIPS instructions are classified into three groups according to their coding
formats:
1. R – Type (for Register) or R – Format
2. I – Type (for Immediate) or I – Format
3. J – Type (for Jump) or J- Format
R – Type (or) R – Format
This group contains all instructions that do not require an immediate value, target
offset, memory address displacement, branch address or memory address to specify an

operand. This includes arithmetic and logic with all operands in register, shift
instructions, and register direct jump instructions (jal and jr).
The unused fields in R-type are coded with all 0 bits, and all R-type instructions use
an opcode of 000000; the operation is specified by the function field. The R-format has
6 fields as follows:
opcode rs rt rd sa Function
6 bits 5 bits 5 bits 5 bits 5 bits 6 bits
Where,
Opcode – partially specifies what instruction it is
Function – combined with the opcode, this number exactly specifies the
instruction
rs (Source Register) – specifies the first source register
rt (Target Register) – specifies the second source register
rd (Destination Register) – specifies the destination register
sa (Shift Amount) – specifies the number of bits to be shifted
Example for Translating MIPS Assembly Language Instruction into Machine
Language Code
add $t0, $s1, $s2 ;
MIPS R-format decimal representation is
Opcode rs rt rd sa Function

0 17 18 8 0 32

 Each segment of this instruction is called a field.


 The first and the last fields (0 and 32) in combination tell the MIPS computer that
this instruction performs addition.
 Second field gives the number of the register that is the first source operand of the
addition operation (17 indicates $s1)


 Third field gives the other source operand for the addition (18 indicates $s2)
 Fourth field contains the number of the register that is to receive the sum (8
indicates $t0)
 Fifth field is unused in this instruction, so it is set to 0.
This instruction adds register $s1 to register $s2 and places the sum in
register $t0.
MIPS R-format Binary Representation is
This instruction can also be represented as a field of binary numbers as
opposed to decimal numbers.
000000 10001 10010 01000 00000 100000
6 bits 5 bits 5 bits 5 bits 5 bits 6 bits
Here 10001 is the binary value for 17, similar to this the remaining field
values will be represented in binary.
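The field packing above can be reproduced with a short Python encoder (a sketch of the R-format bit layout, not an official assembler):

```python
def encode_r(opcode, rs, rt, rd, sa, funct):
    """Pack the six R-format fields into one 32-bit instruction word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | \
           (rd << 11) | (sa << 6) | funct

# add $t0, $s1, $s2  ->  fields 0, 17, 18, 8, 0, 32
word = encode_r(0, 17, 18, 8, 0, 32)
print(f"{word:032b}")   # → 00000010001100100100000000100000
```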
I – Type (or) I – Format
By keeping the formats similar, we can reduce the complexity of the design. The
first three fields of the I and R formats are the same size and have the same names. The
combined length of the last three fields of the R-format equals the length of the fourth
field in the I-type.
Opcode rs Rt Immediate or Address
6 bits 5 bits 5 bits 16 bits
In the I-format, the first three fields are similar to the R-type. The last (fourth) field
indicates the constant or address (16 bits). This 16-bit address means that a load/store
word instruction can load/store any word within a region of ±2^15 or 32,768 bytes of the
address in the base register.
Example:
addi $t0, $t0, 0xABABCDCD ;
becomes :
lui $at, 0xABAB ;


ori $at, $at, 0xCDCD ;


add $t0, $t0, $at ;
Now each I-format instruction has only a 16-bit immediate. An instruction that
must be broken up is called a pseudo-instruction.
Example for Translating MIPS Assembly Language Instruction into Machine
Language Code
lw $t0, 32($s3) ; # load memory content to $t0, whose memory address is $s3 + 32.
MIPS I-format decimal representation is
Opcode rs rt Constant or Address
35 19 8 32

MIPS I-format Binary Representation is


This instruction can also be represented as a field of binary numbers as opposed to
decimal numbers.
100011 10011 01000 0000000000100000
6 bits 5 bits 5 bits 16 bits
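The same kind of encoder works for the I-format (again only a sketch of the bit layout, not a real assembler):

```python
def encode_i(opcode, rs, rt, immediate):
    """Pack the four I-format fields into one 32-bit instruction word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)

# lw $t0, 32($s3)  ->  opcode 35, rs 19 ($s3), rt 8 ($t0), immediate 32
word = encode_i(35, 19, 8, 32)
print(f"{word:032b}")   # → 10001110011010000000000000100000
```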
J-Type (or) J – Format
MIPS instruction set consists of two direct jump instructions (j and jal). These
instructions require a memory address to specify their operand. They use opcodes 000010
(j) and 000011 (jal) and require a 26-bit coded address field to specify the target of the jump.
Opcode Target
6 bits 26 bits
The coded address is formed from bits at positions 27 to 2 in the binary
representation of the address. The bits at positions 1 and 0 are always 0 since instructions
are word-aligned. When a J-type instruction is executed, a full 32-bit jump target address
is formed by concatenating the high order four bits of the PC (the address of the
instruction following the jump), the 26 bits of the target field, and two 0 bits.


Example:
j 2000 ; # Jump to address 2000
jal 2500 ; # Jump and Link to address 2500
Summarization of MIPS Instruction Formats

Format   Fields                                                 Comments
R        opcode (6) rs (5) rt (5) rd (5) sa (5) function (6)    Arithmetic instructions
I        opcode (6) rs (5) rt (5) address / immediate (16)      Transfer, branch and immediate instructions
J        opcode (6) target address (26)                         Jump instructions

FIG: 2.5 MIPS instruction encoding.

Example: MIPS instruction encoding in computer hardware.


Consider A[300] = h + A[300]; the MIPS instruction for the operations are:
lw $t0,1200($t1) # Temporary reg $t0 gets A[300]
add $t0,$s2,$t0 # Temporary reg $t0 gets h + A[300]
sw $t0,1200($t1) # Stores h + A[300] back into A[300]


The table below shows how the hardware decodes the three machine language
instructions (decimal field values; $t1 = 9, $t0 = 8, $s2 = 18):

op    rs    rt    rd    sa    funct / address
35     9     8                1200               (lw  $t0, 1200($t1))
 0    18     8     8     0    32                 (add $t0, $s2, $t0)
43     9     8                1200               (sw  $t0, 1200($t1))

The lw instruction is identified by 35 (op field), the add instruction that follows is
specified by 0 (op field) together with 32 (funct field), and the sw instruction is
identified by 43 (op field).

Binary version of the above table:

100011 01001 01000 0000010010110000
000000 10010 01000 01000 00000 100000
101011 01001 01000 0000010010110000

===================================================================
LOGICAL OPERATORS / OPERATION
List of logical operators used in MIPS and other languages, along with symbolic notation:

Logical operation    C operator    MIPS instruction
Shift left           <<            sll
Shift right          >>            srl
Bit-by-bit AND       &             and, andi
Bit-by-bit OR        |             or, ori
Bit-by-bit NOT       ~             nor
SHIFT LEFT (sll )


The first class of such operations is called shifts. They move all the bits in a word
to the left or right, filling the emptied bits with 0s. For example, if register $s0 contained

0000 0000 0000 0000 0000 0000 0000 1001two = 9ten

and the instruction to shift left by 4 was executed, the new value would be:

0000 0000 0000 0000 0000 0000 1001 0000two = 144ten
The dual of a shift left is a shift right. The actual names of the two MIPS shift
instructions are shift left logical (sll) and shift right logical (srl).
sll $t2,$s0,4 # reg $t2 = reg $s0 << 4 bits(shifted 4 places)
The shamt field in the R-format, used in shift instructions, stands for shift
amount. The encoded version of the above shift instruction is shown below
($s0 = 16, $t2 = 10, funct 0 for sll):

op    rs    rt    rd    shamt    funct
 0     0    16    10      4        0
Also, shifting left by i bits gives the same result as multiplying by 2^i (see the
representations of 9 and 144 above: 9 x 2^4 = 9 x 16 = 144, where i = 4, since the
word was shifted left 4 places).
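The shift-equals-multiply fact is easy to check in any language; in Python:

```python
x = 9                        # 0000 ... 0000 1001
shifted = x << 4             # sll by 4 places
print(shifted)               # → 144
print(shifted == x * 2**4)   # → True: shifting left by i bits multiplies by 2^i
```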
LOGICAL AND (and)
AND is a bit-by-bit operation that leaves a 1 in the result only if both bits of the operands
are 1.

For example, if register $t2 contains

0000 0000 0000 0000 0000 1101 1100 0000two

and register $t1 contains

0000 0000 0000 0000 0011 1100 0000 0000two

then, after executing the MIPS instruction

and $t0,$t1,$t2 # reg $t0 = reg $t1 & reg $t2

the value of register $t0 would be

0000 0000 0000 0000 0000 1100 0000 0000two

(Bit-wise example — do not add; apply the AND truth table to each bit pair:
    00101
    10111
AND 00101)

AND is traditionally called a mask, since the mask “conceals” some bits.
LOGICAL OR (or)
It is a bit-by-bit operation that places a 1 in the result if either operand bit is a 1. To
elaborate, if the registers $t1 and $t2 are unchanged from the preceding example, i.e.,

register $t2 contains 0000 0000 0000 0000 0000 1101 1100 0000two


and register $t1 contains 0000 0000 0000 0000 0011 1100 0000 0000two
then, after executing the MIPS instruction
or $t0,$t1,$t2 # reg $t0 = reg $t1 | reg $t2
the value in register $t0 would be 0000 0000 0000 0000 0011 1101 1100 0000two
(Bit-wise example — apply the OR truth table to each bit pair:
    00101
    10111
OR  10111)
LOGICAL NOT (nor)
The final logical operation is a contrarian. NOT takes one operand and places a 1 in each
result bit whose operand bit is 0, and vice versa. Since MIPS needs a three-operand format,
the designers of MIPS decided to include the instruction NOR (NOT OR) instead of NOT.
NOR of an operand with a register filled with zeros behaves as NOT:

Step 1: perform bit-wise OR with zero:   00101 OR 00000 = 00101
Step 2: invert the result:               NOT 00101 = 11010

Instruction: nor $t0,$t1,$t3 # reg $t0 = ~ (reg $t1 | reg $t3)

Constants are useful in AND and OR logical operations as well as in arithmetic operations, so
MIPS also provides the instructions and immediate (andi) and or immediate (ori).
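The three bit-wise operations above can be checked in Python with the same 32-bit register values; NOR is written as NOT-of-OR masked to 32 bits, since Python integers are unbounded:

```python
MASK32 = 0xFFFFFFFF
t2 = 0b0000_0000_0000_0000_0000_1101_1100_0000   # contents of $t2
t1 = 0b0000_0000_0000_0000_0011_1100_0000_0000   # contents of $t1

and_result = t1 & t2                 # 1 only where both bits are 1
or_result  = t1 | t2                 # 1 where either bit is 1
nor_zero   = ~(t1 | 0) & MASK32      # nor with $zero acts as NOT t1

print(f"{and_result:032b}")   # → 00000000000000000000110000000000
print(f"{or_result:032b}")    # → 00000000000000000011110111000000
```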
==========================================================================
DECISION MAKING AND BRANCHING INSTRUCTIONS ( CONTROL OPERATIONS )
Branch and Conditional branches: Decision making is commonly represented in programming
languages using the if statement, sometimes combined with go to statements and labels. MIPS
assembly language includes two decision-making instructions, similar to an if statement with a go to.
The first instruction is
beq register1, register2, L1
This instruction means go to the statement labeled L1 if the value in register1 equals the value in
register2. The mnemonic beq stands for branch if equal. The second instruction is
bne register1, register2, L1
It means go to the statement labeled L1 if the value in register1 does not equal the value in
register2. The mnemonic bne stands for branch if not equal. These two instructions are

traditionally called conditional branches.


EXAMPLE: if (i == j) f = g + h; else f = g – h; the MIPS version of the given statements is

      bne $s3,$s4,Else   # go to Else if i ≠ j
      add $s0,$s1,$s2    # f = g + h (skipped if i ≠ j)
      j Exit             # go to Exit
Else: sub $s0,$s1,$s2    # f = g – h (skipped if i = j)
Exit:

Here bne is used instead of beq because the bne (branch if not equal) instruction provides
better efficiency. This example introduces another kind of branch, often called an unconditional
branch. This instruction says that the processor always follows the branch. To distinguish
between conditional and unconditional branches, the MIPS name for this type of instruction is
jump, abbreviated as j.
(In this example, f, g, h, i, and j are variables mapped to the five registers $s0 through $s4.)
Loops:
Decisions are important both for choosing between two alternatives (found in if statements)
and for iterating a computation (found in loops). The same assembly instructions are the
basic building blocks for both cases (if and loop).
EXAMPLE: while (save[i] == k) i += 1;

Assume that i and k correspond to registers $s3 and $s5 and the base of the array save is
in $s6. The MIPS version of the given statement is:

Loop: sll $t1,$s3,2      # Temp reg $t1 = i * 4
      add $t1,$t1,$s6    # $t1 = address of save[i]
      lw $t0,0($t1)      # Temp reg $t0 = save[i]
      bne $t0,$s5,Exit   # go to Exit if save[i] ≠ k
      addi $s3,$s3,1     # i = i + 1
      j Loop             # go to Loop
Exit:

The first instruction multiplies i by 4 to get the byte offset; adding the base of save in
$s6 gives the address of save[i]; the lw loads save[i] into a temporary register; the bne
performs the loop test, exiting if save[i] ≠ k; the addi adds 1 to i; and the final jump
branches back to the while test at the top of the loop. The Exit label marks the end of
the loop.
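The control flow of this MIPS loop mirrors the high-level behavior below; a minimal Python sketch (the function name and sample data are mine):

```python
def run_loop(save, i, k):
    """Mirror of the MIPS loop: advance i while save[i] == k."""
    while save[i] == k:    # bne $t0,$s5,Exit is taken when save[i] != k
        i += 1             # addi $s3,$s3,1
    return i               # value of i when control reaches Exit

print(run_loop([7, 7, 7, 5, 7], 0, 7))   # → 3
```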
ADDRESSING MODES
The different ways in which the location of an operand is specified in an
instruction are referred to as addressing modes.
There are different ways to specify the address of the operands for any given
operations such as load, add or branch. The different ways of determining the address of
the operands are called addressing modes.
There are five different types of MIPS addressing modes.
1. Immediate Addressing mode
2. Register Addressing mode
3. Base or Displacement Addressing mode
4. PC-Relative Addressing mode
5. Pseudo Direct Addressing mode
Immediate Addressing Mode
In this addressing mode, the operand is a constant which is specified as part of the
instruction itself. Immediate addressing mode has the advantage of not requiring an extra
memory access to fetch the operand, but the operand is limited to 16 bits in size. The
branch instruction format can also be considered an example of immediate addressing
mode, since the destination address is held in the instruction itself.
MIPS provides the lui (load upper immediate) instruction to set the upper 16 bits of a
constant in a register, allowing a subsequent instruction to specify the lower 16 bits of
the constant.


Example: addi $s0, $s1, 20 ;   # the constant 20 is an immediate operand

Register Addressing Mode


In this addressing mode, the operand is the contents of a processor register; the
name of the register where the required operand is stored is specified in the instruction.
In immediate addressing mode, the operand size is limited to 16 bits, which makes it
difficult to work with full 32-bit values directly. To solve this problem, registers are
used as temporary storage that holds the full-width operand.
Example: add $s3, $s4, $s6 ;

Base or Displacement Addressing Mode


In this mode, the operand is stored in a memory location whose address is the sum
of a register and a constant in the instruction. The effective address of the operand is
generated by adding a constant value to the contents of a register.
The register used may be either a special register provided for this purpose or it
may be any one of a set of general-purpose registers in the processor. It is usually
referred to as Index Register (or) Base Register.
The constant value specified in the instruction is usually referred to as offset value
(or) displacement value.
Example: lw $t0, 32 ($s3) ;
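The effective-address arithmetic of lw $t0, 32($s3) can be written out as a short Python sketch (the function name is illustrative; $s3 is assumed to hold a byte address):

```python
def effective_address(base_register, offset):
    """Base/displacement addressing: effective address = base + offset.

    In MIPS the 16-bit offset is sign-extended, so it may be negative.
    """
    return base_register + offset

# lw $t0, 32($s3) with $s3 = 0x1000 loads the word at 0x1020
print(hex(effective_address(0x1000, 32)))
# a negative displacement steps backwards from the base
print(hex(effective_address(0x1000, -4)))
```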


PC – Relative Addressing Mode


This mode can be used to address a memory location that is ‘n’ bytes away from
the location presently pointed to by the program counter. Since the addressed location is
identified “relative” to the program counter, which always identifies the current
execution point in a program, the name “PC – Relative Mode” is associated with this type
of addressing.
Example: beq $s3, $s4, L1;
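In MIPS the 16-bit branch offset counts words, not bytes, and is taken relative to the address of the instruction after the branch (PC + 4). A minimal Python sketch of this target calculation (the function name is illustrative):

```python
def branch_target(pc, offset):
    """PC-relative addressing as used by MIPS beq/bne.

    The word offset is shifted left by 2 to form a byte offset,
    then added to the address of the next instruction (PC + 4).
    """
    return pc + 4 + (offset << 2)

# A branch at address 0x40000 with an offset of 5 words
# targets 0x40000 + 4 + 20 = 0x40018
print(hex(branch_target(0x40000, 5)))
```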

Pseudo-direct Addressing Mode


In this addressing mode, most of the memory address is embedded in the instruction
itself. The effective address is calculated by concatenating the upper 4 bits of the
program counter (PC) with the 26-bit immediate value, followed by lower 2 bits of 00.
Therefore, the effective address is always word-aligned; the target address of a jump
instruction can never have its lower 2 bits be anything other than 00, and the
concatenation creates a complete 32-bit address.
Example: j Label    # jump using a 26-bit target address

Jump instruction format and effective address:

    Instruction:        | opcode (6 bits) | target address (26 bits)       |
    Effective address:  | PC[31:28] (XXXX) | target address (26 bits) | 00 |
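The concatenation described above can be expressed with bit operations in Python. This sketch follows the document's description (upper 4 bits taken from the PC); the function name is illustrative:

```python
def jump_target(pc, target26):
    """Pseudo-direct addressing for the MIPS j instruction.

    The upper 4 bits of the PC are concatenated with the 26-bit target
    field shifted left by 2, yielding a word-aligned 32-bit address.
    """
    return (pc & 0xF0000000) | ((target26 & 0x03FFFFFF) << 2)

# With PC in the 0x1xxxxxxx region and target field 0x10,
# the jump lands at 0x10000040
print(hex(jump_target(0x10004000, 0x10)))
```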

Amdahl’s Law
A rule stating that the performance enhancement possible with a given improvement is
limited by the amount that the improved feature is used. It is a quantitative version of the
law of diminishing returns.

Suppose a program runs in 100 seconds on a computer, with multiply operations
responsible for 80 seconds of this time. How much do I have to improve the speed of
multiplication if I want my program to run five times faster?
For this problem:

    Execution time after improvement =
        (Execution time affected by improvement / Amount of improvement)
        + Execution time unaffected

    Execution time after improvement = (80 seconds / n) + (100 − 80 seconds)

Since we want the performance to be five times faster, the new execution time should be
20 seconds, giving

    20 seconds = (80 seconds / n) + 20 seconds
    0 = 80 seconds / n

That is, there is no amount n by which we can improve multiply to achieve a fivefold
speedup, because the 20 seconds unaffected by the improvement already equal the target
execution time.
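To make the arithmetic concrete, here is a minimal Python sketch of Amdahl's law applied to this example (the function name is illustrative):

```python
def time_after_improvement(total_time, affected_time, speedup):
    """Amdahl's law: only the affected portion of the time gets faster."""
    return affected_time / speedup + (total_time - affected_time)

# Program: 100 s total, 80 s of it in multiply.
# Even an enormous speedup of multiply leaves the 20 s unaffected part,
# so total time approaches 20 s but never drops below it.
print(time_after_improvement(100, 80, 10))    # 10x faster multiply -> 28.0 s
print(time_after_improvement(100, 80, 1e9))   # ~20 s, never less
```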
One alternative to time is MIPS (million instructions per second). For a given
program, MIPS is simply

    MIPS = Instruction count / (Execution time × 10^6)

Million instructions per second (MIPS): A measurement of program execution speed
based on the number of millions of instructions executed per second. MIPS is computed as
the instruction count divided by the product of the execution time and 10^6. There are
three problems with using MIPS as a measure for comparing computers. First, MIPS
specifies the instruction execution rate but does not take into account the capabilities
of the instructions. We cannot compare computers with different instruction sets using
MIPS, since the instruction counts will certainly differ.
Second, MIPS varies between programs on the same computer; thus, a computer cannot
have a single MIPS rating. For example, by substituting for execution time, we see the
relationship between MIPS, clock rate, and CPI:

    MIPS = Clock rate / (CPI × 10^6)

Finally, and most importantly, if a new program executes more instructions but each
instruction is faster, MIPS can vary independently from performance.
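A small Python sketch of the third pitfall; the machines and instruction counts below are invented for illustration:

```python
def mips_rating(instruction_count, exec_time_seconds):
    """MIPS = instruction count / (execution time * 10**6)."""
    return instruction_count / (exec_time_seconds * 1e6)

# Computer A runs a program as 10 million simple instructions in 2 s;
# computer B runs the same task as 2 million complex instructions in 1 s.
# B finishes twice as fast, yet A earns the higher MIPS rating.
print(mips_rating(10_000_000, 2.0))  # 5.0 MIPS (the slower machine)
print(mips_rating(2_000_000, 1.0))   # 2.0 MIPS (the faster machine)
```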

Other Types of Addressing Modes found in Different Processors


The number of instructions needed to execute a program can be reduced by shifting
arithmetic operands, by conditional instruction execution, and by rotated immediates
used along with data-processing instructions, which gives rise to various other types of
addressing modes as well.
Immediate Offset Addressing Mode
LDR (Load Register) loads a register from memory. In this mode, the effective address
is calculated by adding a constant offset to the base register.
Example: LDR r2, [r0, #8] ;


Register Offset Addressing Mode


Instead of adding a constant to the base register, another register is added to the
base register. This mode can help with an index into an array, where the array index is in
one register and the base of the array is in another.
Example: LDR r2, [r0, r1] ;

Scaled Register Offset Addressing Mode


Just as the second operand can be optionally shifted left or right in data processing
instructions, this addressing mode allows the register to be shifted before it is added to
the base register. This mode can be useful to turn an array index into a byte address by
shifting it left by 2 bits.
Example: LDR r2, [r0, r1, LSL #2] ;


Immediate Offset Pre-Indexed Addressing Mode


This mode updates the base register with the new address as part of the addressing
mode and can be useful when going sequentially through an array. There are two
versions, one where the offset is added to the base and one where the offset is
subtracted from the base, to allow the programmer to go through the array forwards and
backwards. Note that in this mode, the addition or subtraction occurs before the address
is sent to memory.
Example: LDR r2, [r0, #4]! ;

Immediate Offset Post-Indexed Addressing Mode


This mode is similar to immediate offset pre-indexed except that the address in the
base register is used to access memory first and then the constant is added or subtracted
later.
Example: LDR r2, [r0], #4;
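The pre- vs post-indexed distinction can be sketched in Python by modelling memory as a dictionary and returning both the loaded value and the updated base register (the function names are illustrative, not ARM syntax):

```python
def ldr_pre_indexed(memory, base, offset):
    """LDR r2, [r0, #off]!  -- compute base + offset first, access memory
    at the new address, and leave the base register updated."""
    addr = base + offset
    return memory[addr], addr          # (loaded value, updated base)

def ldr_post_indexed(memory, base, offset):
    """LDR r2, [r0], #off  -- access memory at the old base first,
    then add the offset to the base register."""
    return memory[base], base + offset

mem = {100: 11, 104: 22}
print(ldr_pre_indexed(mem, 100, 4))    # loads from 104, base becomes 104
print(ldr_post_indexed(mem, 100, 4))   # loads from 100, base becomes 104
```

Both leave the base register at 104, but they load different words, which is exactly the before/after distinction the two modes encode.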


Register Offset Pre-Indexed Addressing Mode


This mode is the same as immediate offset pre-indexed except that you add or subtract
a register instead of a constant.
Example: LDR r2, [r0, r1]! ;

Scaled Register Offset Pre-Indexed Addressing Mode


This mode is the same as register offset pre-indexed except that you shift the register
before adding or subtracting it.
Example: LDR r2, [r0, r1, LSL #2]! ;

Register Offset Post-Indexed Addressing Mode



This mode is the same as immediate offset post-indexed except that you add or subtract
a register instead of a constant.
Example: LDR r2, [r0], r1 ;

****************
