5-Performance Metrics - Execution Time Calculation, MIPS, MFLOPS.-08-02-2025

The document covers instruction sets and control units in computer architecture, focusing on addressing modes and the phases of the instruction cycle. It categorizes various addressing modes, such as immediate, direct, indirect, and register addressing, explaining their advantages and disadvantages. Additionally, it outlines the phases of the instruction cycle, including fetch and execute phases, and describes the role of the ALU and control unit in executing instructions.

Uploaded by

Raghav Kohli

BCSE205L

Computer Architecture and Organization


Module 3 – Instruction Sets and Control Unit

Dr. P.Keerthika
Associate Professor
School of Computer Science and Engineering
Vellore Institute of Technology, Vellore
Addressing Modes

• Instruction Set is categorized Based on Addressing Modes


• Addressing Mode
• The different ways in which the location of an operand is specified in an instruction are referred to as addressing modes.
• The way in which the effective address is obtained is called the addressing mode.
• The way the operands are specified in the instruction
• The operation to be performed is indicated by the opcode.
• Operands can be in registers, in memory, or embedded in the instruction

• Effective Address
• The address at which the actual operand is available is called the effective address.
Addressing Modes
• Terminologies
• Displacement: An 8-bit or 16-bit immediate value given in the instruction.
• Base: Contents of the base register, BX or BP.
• Index: Contents of the index register, SI or DI.
Classification of Addressing Modes
1. Stack (Implied/Implicit) Addressing mode
2. Immediate Addressing mode
3. (Memory) Direct Addressing mode
4. Register Direct Addressing mode
5. (Memory) Indirect Addressing mode
6. Register Indirect Addressing mode
7. Displacement Addressing modes
   1. Indexed Addressing mode
   2. Base register Addressing mode
   3. Relative Addressing Mode
8. Auto Increment Addressing Mode
9. Auto Decrement Addressing Mode
1. Stack (Implied/Implicit) Addressing mode
• The definition of the instruction itself specifies the operands implicitly.
• The operand is implied / specified implicitly in the instruction itself.
• Operations like PUSH and POP are used for the computation
• Zero-address instructions in a stack-organized computer are implied-mode instructions.
• Effective Address (EA) = AC or Stack[SP]
• Instruction format: Opcode only
• Examples:
• DEC (decrement the A register)
• CLC (used to reset the carry flag to 0)
• PUSH
• POP
• Advantage:
• Instruction specifies a fixed and unvarying address
• No memory references
• Disadvantage:
• Limited computational capacity
2.Immediate Addressing mode
• The simplest form of addressing
• Effective Address (EA) = Value
• Data is a part of instruction itself.
• Instruction format: Opcode | Operand
• Example:
• MOVE #100, R1
• Here the data 100 is moved to R1.
• MVI #01, A
• MVI stands for Move Immediate. It moves the value 01 to A.
• Advantage:
• This mode can be used to define and use constants or set of initial values of variables.
• No memory references
• Disadvantage:
• Limited Operand size
3. (Memory) Direct Addressing mode
• The address where the data is available is part of the instruction
• The address field contains the effective address of the operand
• Effective Address (EA) = LOC
• Example:
• MOVE A, R1
• Here the data in memory location A is moved to R1.
• Advantage:
• Large operand magnitude, simple
• Disadvantage:
• Limited address size
• If the location of the program changes, all absolute memory references must change with it.
4. Register Direct Addressing mode
• Register addressing is similar to direct addressing
• The only difference is that the address field refers to a register rather than a main memory address
• Effective Address (EA) = Ri
• Example:
• MOVE R2, R1
• Here the data in register R2 is moved to R1.
• Advantage:
• No memory reference
• Disadvantage:
• Limited number of registers
5. (Memory) Indirect Addressing mode
• The address field contains the address of the effective address of the operand
• The referenced location contains a full-length address of the operand
• Effective Address (EA) = (LOC) or [LOC]
• Example:
• MOVE (A), R1
• Here A holds another memory address (not data)
• The data at the address stored in A is moved to R1.
• Advantage:
• Large address space
• Disadvantage:
• If the location of the program changes, all absolute memory references must change with it.
6. Register Indirect Addressing mode
• Register indirect addressing is analogous to indirect addressing
• The only difference is whether the address field refers to a memory location or a register.
• Effective Address (EA) = (Ri) or [Ri]
• Example:
• MOVE (R2), R1
• Here R2 holds a memory address (not data)
• The data at the address stored in R2 is moved to R1.
• Advantage:
• Large address space
• Disadvantage:
• Extra memory reference
7. Displacement Addressing modes - Indexed Addressing mode
• The address field references a main memory address, and the referenced register contains a positive displacement from that address.
• The base register holds the beginning location of a memory array, while the index register holds the relative position of an element in the array.
• The index value of an array is stored in the index register.
• Effective Address (EA) = (Ri) + X
• Example:
• MOVE 20(R2), R1
• Here R2 holds a memory address (not data).
• The address in R2 is added to the index value 20, which gives the EA.
• The data at address (R2) + 20 is moved to R1.
• Works for arrays.
• Advantage:
• Spatial locality


7. Displacement Addressing modes – Base register Addressing mode
• The base register holds the beginning location of a memory array, while the index register holds the relative position of an element in the array.
• The address field references a main memory address, and the referenced register contains a positive displacement from that address.
• The base address of the array is stored in the base register.
• Effective Address (EA) = (Ri + BX)
• Example:
• ADD AX, [BX+SI]
• ADD R1, [R2+R3]
• Meaning: R1 ← R1 + M[R2+R3]
• Here R2 and BX are base registers
• R3 and SI are index registers
• EA = sum of the contents of the base register and the index register
• Variants:
ADD R1, (R2+3) --- Base
ADD R1, (R2+R3) --- Base with Index
ADD R1, 20(R2+R3) --- Base with Index and Offset
• Works for arrays.
• Advantage:
• It is a convenient means of implementing segmentation.
• Disadvantage:
• Complexity
7. Displacement Addressing modes – Relative Addressing Mode
• PC-relative addressing: the implicitly referenced register is the program counter (PC)
• The effective address is the offset parameter added to the address of the next instruction.
• Effective Address (EA) = (PC) + X
• Example:
• If Branch > 0, JUMP -200
• Here, if the condition holds, the offset -200 is added to the address in the PC, so control moves back by 200 locations.
• Advantage:
• With program-relative addressing the code may be position-independent
• Disadvantage:
• Complexity
8. Auto Increment Addressing mode
• The register is incremented after accessing memory
• The effective address is the content of the register; after the access, the register is incremented so that it points to the next item.
• Effective Address (EA) = (Ri); Increment
• Example:
• ADD (R2)+, R0
• Here R2 holds the operand address
• After accessing the operand, the register content is automatically incremented.
• Advantage:
• Useful while transferring large chunks of contiguous data.
• Disadvantage:
• Complexity
9. Auto Decrement Addressing mode
• The register is decremented first, and its new content is then used to access memory
• The effective address is the content of the register after the decrement.
• Effective Address (EA) = Decrement; (Ri)
• Example:
• ADD -(R2), R0
• Here R2, after the decrement, holds the operand address
• The content of the register is decremented first, and the new content gives the operand address.
• Advantage:
• Useful while transferring large chunks of contiguous data.
• Disadvantage:
• Complexity
Addressing modes
Problems
Find the effective address and the content of AC for the given data.
Addressing Mode      Effective Address   Operation           Content of AC
Direct address       500                 AC ← (500)          800
Immediate operand    201                 AC ← 500            500
Indirect address     800                 AC ← ((500))        300
Relative address     702                 AC ← (PC + 500)     325
Indexed address      600                 AC ← (XR + 500)     900
Register             -                   AC ← R1             400
Register indirect    400                 AC ← (R1)           700
Autoincrement        400                 AC ← (R1)+          700
Autodecrement        399                 AC ← -(R1)          450
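The table can be reproduced with a short Python sketch. The machine state used here is not given in the extracted text; it is inferred from the table itself (address field 500 stored at location 201, PC = 202 after the fetch, R1 = 400, XR = 100, and the listed memory words), so treat those values and all function names as assumptions.

```python
# Machine state inferred from the table above (an assumption, not given data).
MEMORY = {399: 450, 400: 700, 500: 800, 600: 900, 702: 325, 800: 300}
ADDR_FIELD = 500       # value held in the instruction's address field
ADDR_FIELD_LOC = 201   # memory location of the address field itself
PC_NEXT = 202          # PC after the instruction has been fetched
R1 = 400               # processor register
XR = 100               # index register


def effective_address(mode):
    """Return the effective address for a mode (None for register mode)."""
    if mode == "direct":
        return ADDR_FIELD
    if mode == "immediate":
        return ADDR_FIELD_LOC          # the operand sits in the instruction
    if mode == "indirect":
        return MEMORY[ADDR_FIELD]      # address of the address
    if mode == "relative":
        return PC_NEXT + ADDR_FIELD
    if mode == "indexed":
        return XR + ADDR_FIELD
    if mode == "register":
        return None                    # operand is in R1, no memory address
    if mode in ("register_indirect", "autoincrement"):
        return R1                      # autoincrement bumps R1 after the access
    if mode == "autodecrement":
        return R1 - 1                  # R1 is decremented before the access
    raise ValueError(f"unknown mode: {mode}")


def load_ac(mode):
    """Return the value loaded into AC for a mode."""
    if mode == "immediate":
        return ADDR_FIELD
    if mode == "register":
        return R1
    return MEMORY[effective_address(mode)]


for m in ("direct", "immediate", "indirect", "relative", "indexed",
          "register", "register_indirect", "autoincrement", "autodecrement"):
    print(m, effective_address(m), load_ac(m))
```

Each printed row should agree with the corresponding row of the table.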
Questions
• An instruction is stored at location 300 with its address field at location 301. The address
field has the value 400. A processor register R1 contains the number 200. Evaluate the
effective address if the addressing mode of the instruction is (a) direct; (b) immediate (c)
relative (d) register indirect; (e) index with R1 as the index register.
• Let the address stored in the program counter be designated by the symbol X1. The
instruction stored in X1 has the address part (operand reference) X2. The operand needed
to execute the instruction is stored in the memory word with address X3. An index
register contains the value X4. What is the relationship between these various quantities if
the addressing mode of the instruction is
• (a) direct (b) indirect (c) PC relative (d) indexed?
Module 3 – Phases of Instruction Cycle
ALU
Phases of Instruction Cycle

• The IAS operates repetitively performing an instruction cycle.

• Each instruction cycle consists of two sub cycles.

• Two Phases:

– Fetch

– Execute
Phases of Instruction Cycle
• 4 phases of Instruction Cycle
Phases of Instruction Cycle
• Fetch Phase:
• PC – holds the address of Instruction
• Processor – Fetches the instruction from memory
and stores in IR
• Increment PC
• Unless told Otherwise
• Processor interprets instruction and performs
required actions
• Execute Phase:
• Carry out the actions specified by the instruction in
the IR (execution phase).
• The instruction decoder and control logic unit is
responsible for implementing the action specified
by the instruction loaded in the IR
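The fetch and execute phases can be illustrated with a toy accumulator machine in Python. This is only a sketch: the four opcodes (LOAD, ADD, STORE, HALT), the tuple instruction format, and the memory layout are hypothetical and are not taken from the slides.

```python
def run(memory):
    """Run a toy program until HALT; return the final (AC, PC)."""
    pc = 0   # program counter: address of the next instruction
    ac = 0   # accumulator
    while True:
        ir = memory[pc]          # fetch: copy the instruction into IR
        pc += 1                  # increment PC unless told otherwise
        opcode, operand = ir     # decode the instruction held in IR
        # execute: carry out the action specified by the instruction
        if opcode == "HALT":
            break
        elif opcode == "LOAD":
            ac = memory[operand]
        elif opcode == "ADD":
            ac += memory[operand]
        elif opcode == "STORE":
            memory[operand] = ac
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return ac, pc


# Locations 0-3 hold instructions, locations 4-6 hold data.
program = [("LOAD", 4), ("ADD", 5), ("STORE", 6), ("HALT", None), 7, 35, 0]
final_ac, final_pc = run(program)
print(final_ac, final_pc, program[6])
```

The loop body mirrors the two phases: the first three statements are the fetch phase, and the `if` chain is the execute phase.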
Phases of Instruction Cycle
• An instruction can be executed by performing one
or more of the following operations in some
specified sequence.
• Transfer a word of data from one processor
register to another or to the ALU.
• Perform an arithmetic or a logic operation and
store the result in a processor register.
• Fetch the contents of a given memory location
and load them into a processor register.
• Store a word of data from a processor register
into a given memory location.
Instruction Cycle - State Diagram
• Instruction address calculation (IAC):
• Determine the address of the next instruction to be executed. Usually this means adding a fixed number to the address of the previous instruction.
• Instruction fetch: (IF)
• Read the instruction from its memory location
into the processor.
• Instruction operation decoding (IOD)
• Analyze instruction to determine type of
operation to be performed and operand(s) to
be used.
• Operand Address Calculation (OAC):
• If the operation involves a reference to an operand in memory or available via I/O, then determine the address of the operand.
• Operand Fetch (OF):
• Fetch the operand from memory or read it in from I/O.
• Data Operation (DO):
• Perform the operation indicated in the instruction.
• Operand Store (OS):
• Write the result into memory or out to I/O.
Interrupts
Instruction cycle with Interrupts
Instruction cycle with Interrupts – State
Diagram
Instruction Execution Cycle
ALU
• Arithmetic-Logic Unit (ALU) is the part of a CPU that carries out arithmetic and
logic operations on the operands.
• ALU is divided into two units:
• Arithmetic Unit (AU)
• Logic Unit (LU).
• Some processors contain more than one AU
• For example, one for fixed-point operations and another for floating-point
operations.
• Control Unit (CU) - supplies the data required by the ALU from memory, or from
input devices, and directs the ALU to perform a specific operation based on the
instruction fetched from the memory
ALU
Operations on ALU
• Logical operations − These include operations like AND, OR, NOT, XOR, NOR, NAND, etc.

• Bit-shifting operations − These shift the positions of the bits by a certain number of places towards the left or the right. Shifting left by one place multiplies the value by 2, and shifting right by one place divides it by 2.

• Arithmetic operations − This refers to bit addition and subtraction
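The three groups of ALU operations can be tried out with a small Python sketch. The 8-bit width and the operation names passed to `alu` are assumptions chosen for illustration; a real ALU selects the operation through control signals rather than strings.

```python
WIDTH = 8                    # assumed register width in bits
MASK = (1 << WIDTH) - 1      # 0xFF keeps results within 8 bits


def alu(op, a, b=0):
    """Apply one ALU operation to unsigned 8-bit operands."""
    if op == "AND":
        result = a & b
    elif op == "OR":
        result = a | b
    elif op == "XOR":
        result = a ^ b
    elif op == "NOT":
        result = ~a
    elif op == "SHL":        # shift left by b places: multiply by 2**b
        result = a << b
    elif op == "SHR":        # shift right by b places: divide by 2**b
        result = a >> b
    elif op == "ADD":
        result = a + b
    elif op == "SUB":
        result = a - b
    else:
        raise ValueError(f"unknown operation: {op}")
    return result & MASK     # discard bits that do not fit in the register


print(alu("AND", 0b1100, 0b1010))   # bitwise AND of the two operands
print(alu("SHL", 5, 1))             # 5 shifted left once
print(alu("SHR", 20, 2))            # 20 shifted right twice
```

Masking the result models the fixed register width: any bit shifted or carried past the eighth position is lost.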


How ALU Works?
• ALU has direct input and output
access to
• processor controller,
• main memory (random access memory
or RAM in a personal computer)
• input/output devices
• Inputs and outputs flow along an
electronic path that is called a bus
ALU
• The ALU is that part of the computer that actually performs arithmetic and logical operations on
data
• All of the other elements of the computer system—control unit, registers, memory, I/O—are
there mainly to bring data into the ALU for it to process and then to take the results back out
• We have, in a sense, reached the core or essence of a computer when we consider the ALU
• An ALU and indeed, all electronic components in the computer, are based on the use of simple
digital logic devices that can store binary digits and perform simple Boolean logic operations
• Operands for arithmetic and logic operations are presented to the ALU in registers, and the
results of an operation are stored in registers
• These registers are temporary storage locations within the processor that are connected by signal
paths to the ALU
• The ALU may also set flags as the result of an operation
• For example, an overflow flag is set to 1 if the result of a computation exceeds the length of the
register into which it is to be stored
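How an ALU might set flags can be sketched in Python for an 8-bit adder. The flag names (carry, overflow, zero) and the two's-complement interpretation of overflow are assumptions made for this illustration; the slides only state that an overflow flag is set when the result does not fit in the register.

```python
def add_with_flags(a, b, width=8):
    """Add two register values and report the flags an ALU would set."""
    mask = (1 << width) - 1
    sign_bit = 1 << (width - 1)
    a &= mask
    b &= mask
    raw = a + b
    result = raw & mask              # what actually fits in the register
    flags = {
        "carry": raw > mask,         # unsigned result needed an extra bit
        # signed overflow: operands share a sign that the result does not
        "overflow": ((a ^ result) & (b ^ result) & sign_bit) != 0,
        "zero": result == 0,
    }
    return result, flags


print(add_with_flags(100, 100))   # 200 does not fit in signed 8 bits
print(add_with_flags(200, 100))   # 300 does not fit in unsigned 8 bits
```

The first call sets the overflow flag (as signed values, 100 + 100 exceeds +127), while the second sets only the carry flag.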
ALU
• The processor provides signals that control the operation of the ALU
and the movement of the data into and out of the ALU.

Flags
Module 3 – Data Path and Control Unit
-Hardwired Control
- Microprogrammed Control
Datapath and Control
• CPU can be divided into Data section & Control Section
• The Control Section issues control signals to the datapath
Recap : Data Path and Control
Data Path and Control
• To execute an instruction, a processor must perform the following 3
steps:
Data Path and Control – Single Bus

Register Transfer

ALU Operation

Reading a word from memory

Storing a word in memory


Data Path and Control – Single Bus
[Figure: instruction execution on the single-bus datapath]
Data Path and Control – Multiple Bus
[Figure: instruction execution on the multiple-bus datapath; control signals shown include PCout, R=B, IRin and MARin]
QUIZ
Revisit - Stages of Data Path
Stages of Data Path - Examples

• The jump-and-link (JAL) instruction uses a simple datapath that branches the PC by a specified offset
Control Unit
• Control Unit (CU) - supplies the data required by the ALU from
memory, or from input devices, and directs the ALU to perform a
specific operation based on the instruction fetched from the memory
• To execute instructions – processor – generates control signals in
proper sequence
• Two ways – to generate control signals in sequence
• Hardwired Control
• Microprogrammed Control
Hardwired Control
• Operates at high speed
• Little flexibility
Hardwired Control
Separate Decoder and Encoder
Hardwired Control
Microprogrammed Control

• Popular in CISC because complex instruction sets require complex controllers that can more easily be implemented as microprograms.
Microprogrammed Control
1. Control Word: A control word is a word whose individual bits represent various control signals.
2. Micro-routine: A sequence of control words corresponding to the control sequence of a machine instruction constitutes the micro-routine for that instruction.
3. Micro-instruction: Individual control words in this micro-routine are referred to as micro-instructions.
4. Micro-program: A sequence of micro-instructions is called a micro-program, which is stored in a ROM or RAM called a Control Memory (CM).
5. Control Store: The micro-routines for all instructions in the instruction set of a computer are stored in a special memory called the Control Store.
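These definitions can be made concrete with a small Python sketch in which each control word is an integer and each bit stands for one control signal. The signal list, the bit assignment, and the three-step fetch routine are hypothetical simplifications; only names such as PCout, MARin and IRin appear elsewhere in the slides.

```python
# Hypothetical set of control signals; bit i of a control word drives SIGNALS[i].
SIGNALS = ["PCout", "MARin", "Read", "IRin", "Zin", "Zout", "PCin"]
BIT = {name: 1 << i for i, name in enumerate(SIGNALS)}


def control_word(*active):
    """Build a control word with the given signals set to 1."""
    word = 0
    for name in active:
        word |= BIT[name]
    return word


def decode(word):
    """List the signals that a control word activates."""
    return [name for name in SIGNALS if word & BIT[name]]


# A micro-routine is a sequence of control words (micro-instructions).
FETCH_ROUTINE = [
    control_word("PCout", "MARin", "Read"),   # send PC to MAR, start a read
    control_word("Zout", "PCin"),             # write the incremented PC back
    control_word("IRin"),                     # load the fetched word into IR
]

# The control store maps each routine name to its micro-routine.
CONTROL_STORE = {"FETCH": FETCH_ROUTINE}

for step, word in enumerate(CONTROL_STORE["FETCH"], start=1):
    print(step, format(word, "07b"), decode(word))
```

Stepping through `CONTROL_STORE["FETCH"]` one word at a time is the software analogue of a micro-program counter reading successive micro-instructions from the control memory.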
Microprogrammed Control
Basic Organization of Microprogrammed
Control Unit
Microprogrammed Control
Microprogrammed Control – Branch Inst.
Hardwired Vs Microprogrammed Control
Hardwired Control Unit | Microprogrammed Control Unit
Generates the control signals needed for the processor using logic circuits | Generates the control signals with the help of micro-instructions stored in control memory
Faster than a microprogrammed control unit, as the required control signals are generated by hardware | Slower, as micro-instructions are used for generating the signals
Difficult to modify, as the control signals that need to be generated are hard-wired | Easy to modify, as the modification needs to be done only at the instruction level
More costly, as everything has to be realized in terms of logic gates | Less costly than hardwired control, as only micro-instructions are used for generating control signals
Cannot handle complex instructions, as the circuit design for them becomes complex | Can handle complex instructions
Only a limited number of instructions are used, due to the hardware implementation | Control signals for many instructions can be generated
Used in Reduced Instruction Set Computers (RISC) | Used in Complex Instruction Set Computers (CISC)
Module 3 – Performance Metrics :
Execution Time Calculation, MIPS, MFLOPS
Performance
• Measure Performance : How fast the computer works?

• Time – important metric to measure performance

• A computer exhibits higher performance if it executes programs faster.


Performance Metrics
• Elapsed time/Response Time

• The time between the start and completion of a task.


• It includes time spent executing on the CPU, accessing disk and memory, waiting for
I/O and other processes, and operating system overhead.
• A useful number – but often not good for comparison purposes.
Performance Metrics
• CPU Execution Time/CPU Time/Processing Time
• Total time a CPU spends computing on a given task (excludes time for I/O or running
other programs).
• Doesn’t count waiting for I/O or time spent running other programs.
• User CPU time - CPU time spent in the program
• System CPU time - CPU time spent in the operating system

• Throughput of a CPU/Bandwidth
• The total amount of work done in a given time.
• Another measure of performance: the number of tasks completed per unit time.
Performance Metrics
• MIPS
• Millions of instructions per second
• MFLOPS
• Millions of floating point operations(FLO) per second
Performance Metrics
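• To maximize performance, we want to minimize the response time or execution time for a task. Thus, we can relate performance and execution time for a computer X:

$$\text{Performance}_X = \frac{1}{\text{Execution time}_X}$$

• This means that for two computers X and Y, if the performance of X is greater than the performance of Y, we have

$$\text{Performance}_X > \text{Performance}_Y \;\Rightarrow\; \frac{1}{\text{Execution time}_X} > \frac{1}{\text{Execution time}_Y} \;\Rightarrow\; \text{Execution time}_Y > \text{Execution time}_X$$

• That is, the execution time on Y is longer than on X, so X is faster than Y.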


Performance Metrics
• For some program running on machine X,

$$\text{Performance}(X) = \frac{1}{\text{Execution time}(X)}$$

• "X is n times faster than Y" is represented as

$$\frac{\text{Performance}(X)}{\text{Performance}(Y)} = n$$

• Problem: Machine A runs a program in 20 seconds. Machine B runs the same program in 25 seconds. How many times faster is machine A?
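The ratio in this problem can be checked with a few lines of Python. The function names are illustrative only; the calculation follows directly from performance being the reciprocal of execution time.

```python
def performance(execution_time_s):
    """Performance is the reciprocal of execution time."""
    return 1.0 / execution_time_s


def times_faster(time_x_s, time_y_s):
    """Return n such that machine X is n times faster than machine Y."""
    return performance(time_x_s) / performance(time_y_s)


n = times_faster(20, 25)   # machine A: 20 s, machine B: 25 s
print(round(n, 2))
```

Because the reciprocals cancel, n is simply the execution time of B divided by the execution time of A, that is 25 / 20 = 1.25.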
Performance Metrics - Example

PerformanceA = n × PerformanceB
Performance Metrics - Example

Hint : Performance (X) < Performance (Y)

PerformanceX = n × PerformanceY
Computer Clock
• Almost all computers are constructed using a clock that determines when
events take place in the hardware.
• These discrete time intervals are called clock cycles (ticks, clock ticks, clock periods, clocks, cycles).
• Designers refer to the length of a clock period either as the time for a complete clock cycle (e.g., 250 picoseconds) or as the clock rate/frequency (e.g., 4 GHz, 5 MHz), which is the inverse of the clock period.
Computer Clock
• Clock cycle time - the amount of time for one clock period to elapse (e.g., 5 ns, 250 picoseconds).
• Clock rate/frequency - the inverse of the clock cycle time.
• For example, if a computer has a clock cycle time of 5 ns, the clock rate is:

$$\frac{1}{5 \times 10^{-9}\ \text{s}} = 200\ \text{MHz}$$
Processor Performance Equation
• Performance Equation:

$$\text{CPU execution time} = \text{CPU clock cycles} \times \text{Clock cycle time}$$

• Alternatively,

$$\text{CPU execution time} = \frac{\text{CPU clock cycles}}{\text{Clock rate}}$$

• Execution time also depends on the number of instructions in a program.
• Execution time equals the number of instructions executed multiplied by the average time per instruction. Therefore, the number of clock cycles required for a program can be written as

$$\text{CPU clock cycles} = \text{Instruction count} \times \text{CPI}$$

• Clock cycles per instruction, the average number of clock cycles each instruction takes to execute, is often abbreviated as CPI.
Processor Performance Equation
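• Since different instructions may take different amounts of time depending on what they do, CPI is an average over all the instructions executed in the program.

• The basic performance equation can be rewritten in terms of instruction count (the number of instructions executed by the program), CPI and clock rate:

$$\text{CPU execution time} = \text{Instruction count} \times \text{CPI} \times \text{Clock cycle time} = \frac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}}$$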
Processor Performance Equation
• To summarize,

$$\text{Clock rate} = \frac{1}{\text{Clock cycle time}}$$

• CPI (cycles per instruction)
• A floating-point-intensive application might have a higher CPI
MIPS & MFLOPS
Practice Problems
• Example 1:
• CPU clock rate is 1 MHz
• Program takes 45 million cycles to execute
• What’s the CPU time?

45,000,000 * (1 / 1,000,000) = 45 seconds


Practice Problems
• Example 2:
• CPU clock rate is 500 MHz
• Program takes 45 million cycles to execute
• What’s the CPU time?

45,000,000 * (1 / 500,000,000) = 0.09 seconds
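Examples 1 and 2 both divide the cycle count by the clock rate, which a short Python helper makes explicit. The function name is illustrative.

```python
def cpu_time_from_cycles(clock_cycles, clock_rate_hz):
    """CPU time in seconds = CPU clock cycles / clock rate."""
    return clock_cycles / clock_rate_hz


time_example_1 = cpu_time_from_cycles(45_000_000, 1_000_000)      # 1 MHz clock
time_example_2 = cpu_time_from_cycles(45_000_000, 500_000_000)    # 500 MHz clock
print(time_example_1, time_example_2)
```

Raising the clock rate by a factor of 500 while the cycle count stays the same cuts the CPU time by the same factor.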


Practice Problems
• Example 3:
Suppose we have two implementations
of the same instruction set architecture (ISA).
• For some program,
• Machine A has a clock cycle time of 10 ns. and a CPI of 2.0
• Machine B has a clock cycle time of 20 ns. and a CPI of 1.2
• Which machine is faster for this program, and by how much?
• Assume that # of instructions in the program is 1,000,000,000.
Practice Problems
• Example 3: Solution
Given: Machine A has a clock cycle time of 10 ns and a CPI of 2.0. Machine B has a clock cycle time of 20 ns and a CPI of 1.2. Number of instructions = 1,000,000,000.

$$\text{CPU time}_A = 10^9 \times 2.0 \times 10 \times 10^{-9}\ \text{s} = 20\ \text{seconds}$$

$$\text{CPU time}_B = 10^9 \times 1.2 \times 20 \times 10^{-9}\ \text{s} = 24\ \text{seconds}$$

$$\frac{\text{CPU time}_B}{\text{CPU time}_A} = \frac{24}{20} = 1.2$$

Machine A is faster, by a factor of 1.2.
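The same comparison can be scripted in Python using CPU time = instruction count × CPI × clock cycle time. The function and variable names are illustrative.

```python
def cpu_time(instruction_count, cpi, clock_cycle_time_s):
    """CPU time in seconds = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * clock_cycle_time_s


INSTRUCTIONS = 1_000_000_000

time_a = cpu_time(INSTRUCTIONS, 2.0, 10e-9)   # machine A: CPI 2.0, 10 ns cycle
time_b = cpu_time(INSTRUCTIONS, 1.2, 20e-9)   # machine B: CPI 1.2, 20 ns cycle
speedup = time_b / time_a                     # how many times faster A is

print(round(time_a, 3), round(time_b, 3), round(speedup, 3))
```

Since both machines run the same instruction count, the ratio does not depend on the assumed 1,000,000,000 instructions; it cancels out.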
Practice Problems
• Example 4:
Our favorite program runs in 10 seconds on computer A, which has a 2
GHz clock. We are trying to help a computer designer build a computer,
B, which will run this program in 6 seconds. The designer has
determined that a substantial increase in the clock rate is possible, but
this increase will affect the rest of the CPU design, causing computer B
to require 1.2 times as many clock cycles as computer A for this
program. What clock rate should we tell the designer to target?
Practice Problems
• Example 4: Solution
Given:
A: execution time = 10 s, clock rate = 2 GHz
B: execution time = 6 s
Clock cycles of B = 1.2 × (clock cycles of A)
Find the clock rate of B.
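The worked solution is not preserved in the extracted text. One way to reach the answer, using only CPU time = clock cycles / clock rate and the data given in the problem, is sketched below.

```latex
\begin{align*}
\text{Clock cycles}_A &= \text{CPU time}_A \times \text{Clock rate}_A
  = 10\ \text{s} \times 2 \times 10^{9}\ \text{Hz} = 20 \times 10^{9}\ \text{cycles} \\
\text{Clock cycles}_B &= 1.2 \times \text{Clock cycles}_A = 24 \times 10^{9}\ \text{cycles} \\
\text{Clock rate}_B &= \frac{\text{Clock cycles}_B}{\text{CPU time}_B}
  = \frac{24 \times 10^{9}\ \text{cycles}}{6\ \text{s}} = 4 \times 10^{9}\ \text{Hz} = 4\ \text{GHz}
\end{align*}
```

So computer B needs twice the clock rate of A to run the program in 6 seconds.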
Practice Problems
• Example 5:

Suppose we have two implementations of the same instruction set architecture. Computer A has a clock cycle time of 250 ps and a CPI of 2.0 for some program, and computer B has a clock cycle time of 500 ps and a CPI of 1.2 for the same program. Which computer is faster for this program and by how much?
Practice Problems
• Example 5: Solution
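The worked solution is not preserved in the extracted text. A sketch of it, writing I for the (unknown but identical) instruction count of the program on both computers, follows from CPU time = instruction count × CPI × clock cycle time.

```latex
\begin{align*}
\text{CPU time}_A &= I \times 2.0 \times 250\ \text{ps} = 500\,I\ \text{ps} \\
\text{CPU time}_B &= I \times 1.2 \times 500\ \text{ps} = 600\,I\ \text{ps} \\
\frac{\text{Performance}_A}{\text{Performance}_B}
  &= \frac{\text{CPU time}_B}{\text{CPU time}_A}
  = \frac{600\,I\ \text{ps}}{500\,I\ \text{ps}} = 1.2
\end{align*}
```

Computer A is 1.2 times faster than computer B for this program.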
Average Cycles per Instruction

$$\text{Total CPI} = \frac{\sum_{i=1}^{n} IC_i \times CPI_i}{\text{Instruction count } (I_c)}$$
Practice Problems
• Example 6:
Practice Problems
• Example 6: Solution

B is faster
Since Clock cycle time is not given, we can estimate using CPU clock cycles itself
Practice Problems
• Example 6: Solution
Practice Problems
• Example 7:
Practice Problems
• Example 7: Solution
With Frequency

$$CPI = \frac{\text{CPU clock cycles } (c)}{\text{Instruction count } (I_c)}$$

$$\text{Total CPI} = \frac{\sum_{i=1}^{n} IC_i \times CPI_i}{\text{Instruction count } (I_c)}$$

• The CPI is the average number of cycles per instruction.
• If, for each instruction type, we know its frequency and the number of cycles needed to execute it, we can compute the overall CPI as follows:

$$\text{Total CPI} = \sum_{i=1}^{n} Freq_i \times CPI_i$$
Practice Problems
• Example 8: Solution
• Let us assume that a benchmark has 100 instructions:
• 25 instructions are loads/stores (each takes 2 cycles)
• 50 instructions are adds (each takes 1 cycle)
• 25 instructions are square roots (each takes 50 cycles)
• What is the CPI for this benchmark?

$$CPI = \sum_{i=1}^{n} Freq_i \times CPI_i$$

$$CPI = (0.25 \times 2) + (0.50 \times 1) + (0.25 \times 50) = 13.5$$
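The frequency-weighted CPI can be computed with a short Python function. The function name and the `(count, cycles)` input format are illustrative choices; the counts and cycle values are those of the benchmark in this example.

```python
def average_cpi(instruction_mix):
    """Average CPI for a list of (instruction count, cycles per instruction)."""
    total_instructions = sum(count for count, _ in instruction_mix)
    total_cycles = sum(count * cycles for count, cycles in instruction_mix)
    return total_cycles / total_instructions


benchmark = [
    (25, 2),    # loads/stores: 25 instructions, 2 cycles each
    (50, 1),    # adds: 50 instructions, 1 cycle each
    (25, 50),   # square roots: 25 instructions, 50 cycles each
]
cpi = average_cpi(benchmark)
print(cpi)
```

Dividing total cycles by total instructions is equivalent to summing frequency × CPI over the instruction types. The rare but expensive square-root instructions dominate the result.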
Practice Problems
• Example 9:
Two different compilers are being tested for a 500 MHz. machine
with three different classes of instructions: Class A, Class B, and
Class C, which require one, two, and three cycles (respectively).
Both compilers are used to produce code for a large piece
of software.
The first compiler's code uses 5 billion Class A instructions, 1
billion Class B instructions, and 1 billion Class C instructions.
The second compiler's code uses 10 billion Class A instructions,
1 billion Class B instructions, and 1 billion Class C instructions.
• Which sequence will be faster according to MIPS?
• Which sequence will be faster according to execution time?
Practice Problems
• Example 9: Solution

Clock rate – 500 MHz
Cycles per instruction – Class A: 1 cycle, Class B: 2 cycles, Class C: 3 cycles.
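The full solution is not preserved in the extracted text. The sketch below computes execution time and MIPS for both compilers from the data in the problem statement; the function and variable names are illustrative.

```python
CLOCK_RATE_HZ = 500_000_000                 # 500 MHz machine
CYCLES = {"A": 1, "B": 2, "C": 3}           # cycles per instruction class


def analyze(counts):
    """Return (execution time in seconds, MIPS) for an instruction mix."""
    instructions = sum(counts.values())
    cycles = sum(counts[cls] * CYCLES[cls] for cls in counts)
    time_s = cycles / CLOCK_RATE_HZ
    mips = instructions / (time_s * 1e6)
    return time_s, mips


compiler_1 = {"A": 5_000_000_000, "B": 1_000_000_000, "C": 1_000_000_000}
compiler_2 = {"A": 10_000_000_000, "B": 1_000_000_000, "C": 1_000_000_000}

time_1, mips_1 = analyze(compiler_1)
time_2, mips_2 = analyze(compiler_2)
print(time_1, mips_1)
print(time_2, mips_2)
```

The second compiler's code earns the higher MIPS rating, yet the first compiler's code finishes sooner. MIPS rewards executing many cheap instructions, so it can rank two programs in the opposite order from execution time.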
Problems with Arithmetic Mean
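• For example, two machines timed on two benchmarks (the percentage is the weight of each program in the workload):

             Machine A           Machine B
Program 1    2 seconds (20%)     6 seconds (20%)
Program 2    12 seconds (80%)    10 seconds (80%)

Average execution time (A) = (2 + 12) / 2 = 7 seconds
Average execution time (B) = (6 + 10) / 2 = 8 seconds

Weighted average execution time (A) = 2*0.2 + 12*0.8 = 10 seconds
Weighted average execution time (B) = 6*0.2 + 10*0.8 = 9.2 seconds

• The plain arithmetic mean makes Machine A look faster, while the weighted mean, which reflects how often each program runs, makes Machine B look faster.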
MIPS rate
• A common measure of performance for a processor is the rate at which instructions are executed, expressed as millions of instructions per second (MIPS), referred to as the MIPS rate.

$$\text{MIPS rate} = \frac{I_c}{T \times 10^6} = \frac{f}{CPI \times 10^6} = \frac{I_c \times f}{\text{CPU clock cycles} \times 10^6}$$

• Ic: Instruction count
• T: CPU time
• f: Clock rate
• CPI: Cycles Per Instruction
Practice Problems
• Example 10:
Practice Problems
• Example 11:
Practice Problems
• Example 12:
Assume that a benchmark has 100 instructions and runs with a clock rate of
300 MHz. 20% of the instructions are loads/stores (each takes 3 cycles), 40%
are adds (each takes 2 cycles), and 40% are square roots (each takes 60
cycles). What are the CPI and the MIPS rate for this benchmark?

Ans: CPI = 25.4, MIPS = 11.8
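The stated answer can be verified in Python using CPI = Σ Freq × CPI per type and MIPS rate = f / (CPI × 10^6). The variable names are illustrative.

```python
CLOCK_RATE_HZ = 300e6   # 300 MHz

# (fraction of instructions, cycles per instruction) for each type
mix = [
    (0.20, 3),    # loads/stores
    (0.40, 2),    # adds
    (0.40, 60),   # square roots
]

cpi_12 = sum(freq * cycles for freq, cycles in mix)
mips_12 = CLOCK_RATE_HZ / (cpi_12 * 1e6)

print(round(cpi_12, 1), round(mips_12, 1))
```

The instruction count of 100 does not enter the calculation: both CPI and the MIPS rate depend only on the instruction mix and the clock rate.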


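MFLOPS

• Floating-point performance is expressed as millions of floating-point operations per second (MFLOPS), defined as follows:

$$\text{MFLOPS rate} = \frac{\text{Number of executed floating-point operations in a program}}{\text{Execution time} \times 10^6}$$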
