Lecture 02

Uploaded by

Tinotenda Kondo
Copyright © All Rights Reserved

CHAPTER 2- Fetch-Execute Cycles and Performance Measures
Assoc. Prof. Dr. Ezgi Deniz Ülker
European University of Lefke
Department of Computer Engineering
COMP333- Computer Organization and Architecture
Outline
1. The fetch-execute cycle
2. Performance measures
The Fetch-Execute Cycle

• 1. The control unit fetches the instruction from RAM.
• 2. The control unit decodes the instruction and sends it to the ALU.
• 3. The ALU executes the instruction and writes the result to RAM or to an output device.
The Fetch-Execute Cycle

• A special register called the program counter (PC) holds a memory address.
1. The instruction is fetched from the memory address held in the PC into the instruction buffer (IB).
2. The control unit determines what to do: it decodes the instruction.
3. The execution unit executes the instruction.
4. The PC is updated, and the cycle goes back to Step 1.

The Fetch-Decode-Execute cycle of a computer is the process by which a computer:
1. fetches a program instruction from its memory,
2. determines what the instruction wants to do,
3. carries out those actions.
• This cycle is repeated continuously by the central processing unit (CPU), from boot-up until the computer is shut down.
• In modern computers this means completing the cycle billions of times a second. Without it, nothing could be calculated.
The Fetch-Execute Cycle
• Von Neumann realized that data and programs are indistinguishable and can therefore use the same memory.
• Von Neumann architecture uses a single processor.
• It follows a linear sequence of fetch–decode–execute operations for the set of instructions, i.e. the program.
• In order to do this, the processor has to use registers.
• A register is an extremely fast piece of on-chip memory, usually 32 or 64 bits in size, used for temporary storage.
• Registers are outside the immediate access store and consequently allow faster access to the data they store.
The Fetch-Execute Cycle
Special Registers:
• Program counter (PC): keeps track of where to find the next instruction so that a copy of the instruction can be placed in the current instruction register.
• Memory data register (MDR) / memory buffer register (MBR): acts as a buffer and holds anything copied from memory, ready for the processor to use.
• Memory address register (MAR): holds the memory address that contains either the next piece of data or an instruction that is to be used.
• Index register (IR): a microprocessor register used for modifying operand addresses while a program runs.
• Used when the operand address is not given directly (indexed addressing): a constant from the instruction is added to the contents of the IR to form the address of the operand/data.
• Current instruction register (CIR): holds the instruction that is to be executed.
• Status register (SR): holds the results of comparisons used to decide on later actions, the intermediate results of arithmetic, and any errors that occurred during arithmetic.
• General-purpose register: one or more registers in the CPU that temporarily store data.
• Accumulator: a single general-purpose register inside the ALU that holds all values while they are processed by arithmetic and logical operations.
Register Notation
• To describe the cycle we can use register notation. This is a very simple way of noting all the steps involved.
• Wherever you see brackets, e.g. [PC], this means that the contents of the register named inside the brackets are loaded.
• For example, in the first step, the contents of the program counter are loaded into the Memory Address Register (MAR).
1. The contents of the Program Counter, the address of the next instruction to be executed, are placed into the Memory Address Register.
2. The address is sent from the MAR along the address bus to the Main Memory. The instruction at that address is found and returned along the data bus to the Memory Buffer Register. At the same time the contents of the Program Counter are increased by 1, to reference the next instruction to be executed.
3. The MBR loads the Current Instruction Register with the instruction to be executed.
4. The instruction is decoded and executed using the ALU if necessary.
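The four steps can be traced in a small simulation. This is only a minimal sketch: the opcodes (LOAD, ADD, STORE, HALT), the memory layout and the function name run are hypothetical, invented for illustration and not taken from any real instruction set. The comments use register notation, with [X] meaning "the contents of X".

```python
# Toy machine: each memory word is either an (opcode, operand) pair or a number.
# ASSUMPTION: the opcodes LOAD/ADD/STORE/HALT are hypothetical, for illustration only.

def run(memory):
    pc = 0    # Program Counter: address of the next instruction
    acc = 0   # Accumulator: holds the value being processed by the ALU
    while True:
        mar = pc               # Step 1: MAR <- [PC]
        mbr = memory[mar]      # Step 2: MBR <- [memory word addressed by MAR]
        pc += 1                #         PC  <- [PC] + 1
        cir = mbr              # Step 3: CIR <- [MBR]
        opcode, operand = cir  # Step 4: decode, then execute
        if opcode == "LOAD":
            acc = memory[operand]
        elif opcode == "ADD":
            acc += memory[operand]
        elif opcode == "STORE":
            memory[operand] = acc
        elif opcode == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode: {opcode}")

program = [
    ("LOAD", 4),   # address 0: acc <- memory[4]
    ("ADD", 5),    # address 1: acc <- acc + memory[5]
    ("STORE", 6),  # address 2: memory[6] <- acc
    ("HALT", 0),   # address 3: stop
    7,             # address 4: data
    5,             # address 5: data
    0,             # address 6: result goes here
]
result = run(list(program))
print(result[6])  # prints 12
```

Instructions and data share one memory list here, which mirrors the Von Neumann idea that programs and data use the same memory.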
The Cycle starts again!
Performance Measures
• In computing, computer performance is the amount of useful work
accomplished by a computer system. Outside of specific contexts,
computer performance is estimated in terms of accuracy, efficiency
and speed of executing computer program instructions. When it
comes to high computer performance, one or more of the following
factors might be involved:
• Short response time for a given piece of work.
• High throughput (rate of processing work).
• Low utilization of computing resources.
• High availability of the computing system or application.
• Fast (or highly compact) data compression and
decompression.
• High bandwidth.
• Short data transmission time.
Performance Measures
• Response time (execution time or latency): the time between asking the processor to execute a particular task and getting the response from the processor.
• E.g. How long does it take for my job to run? (including memory accesses, I/O activities, CPU execution time and so on.)
• Throughput (bandwidth): the total amount of work done in a given time.
• E.g. How many jobs can the machine run at once?
• If we want to maximize performance, we need to minimize execution time. Performance is inversely related to execution time.
• Performance = 1 / Execution time
• IT managers are interested in increasing throughput.
Performance Measures
• Example: Do the following changes to a computer system increase
throughput, decrease response time, or both?
1. Replacing the processor in a computer with a faster version.
2. Adding additional processors to a system that uses multiple
processors for separate tasks. (e.g. Searching in the web, writing
an essay)
FACT: If response time is decreased, throughput is almost always increased as well.
In case 1: both response time and throughput are improved (decreased response time, increased throughput).
In case 2: no single task is executed faster, but more tasks are processed at the same time, so throughput (the total work done) is improved. Response time stays the same unless the processors are also replaced with faster ones.
Performance Measures
• Performance = 1/ Execution time
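• If a processor X is n times faster than processor Y:
• $\text{Performance}_X > \text{Performance}_Y$
• $\dfrac{1}{\text{Execution time}_X} > \dfrac{1}{\text{Execution time}_Y}$
• $\text{Execution time}_X < \text{Execution time}_Y$
• $\dfrac{\text{Performance}_X}{\text{Performance}_Y} = \dfrac{\text{Execution time}_Y}{\text{Execution time}_X} = n$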
Relative performance example:
• Example: If computer A runs a program in 10 seconds and computer B
runs the same program in 15 seconds, how much faster is A than B?
• We know that A is n times faster than B if
• $\dfrac{\text{Performance}_A}{\text{Performance}_B} = \dfrac{\text{Execution time}_B}{\text{Execution time}_A} = n = \dfrac{15}{10} = 1.5$
• So A is 1.5 times as fast as B.
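The same ratio can be checked with a few lines of code. This is a minimal sketch; the function name relative_performance is an assumption made for illustration.

```python
def relative_performance(execution_time_x, execution_time_y):
    """Return n such that X is n times as fast as Y.

    Performance = 1 / execution time, so
    Performance_X / Performance_Y = Execution time_Y / Execution time_X.
    """
    if execution_time_x <= 0 or execution_time_y <= 0:
        raise ValueError("execution times must be positive")
    return execution_time_y / execution_time_x

# Computer A takes 10 s and computer B takes 15 s for the same program.
n = relative_performance(10.0, 15.0)
print(n)  # prints 1.5
```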
Performance Measures-Computer Clock
• Clock cycle: the time for one clock period, usually of the processor clock, which runs at a constant rate.
• Clock period: the length of each clock cycle.
• Clock rate: the inverse of the clock period.

CPU execution time for a program = CPU clock cycles for a program × Clock cycle time
CPU execution time for a program = CPU clock cycles for a program / Clock rate

• This formula makes it clear that the hardware designer can improve performance by reducing the number of clock cycles required for a program or the length of the clock cycle.
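Both forms of the formula give the same result, as this minimal sketch shows. The function names and the sample figures (20×10^9 cycles on a 2 GHz clock) are chosen only for illustration.

```python
def cpu_time_from_rate(clock_cycles, clock_rate_hz):
    """CPU execution time = CPU clock cycles / clock rate."""
    return clock_cycles / clock_rate_hz

def cpu_time_from_period(clock_cycles, clock_cycle_time_s):
    """CPU execution time = CPU clock cycles x clock cycle time."""
    return clock_cycles * clock_cycle_time_s

cycles = 20e9            # illustrative cycle count
rate_hz = 2e9            # 2 GHz clock
period_s = 1 / rate_hz   # clock cycle time is the inverse of clock rate

t_rate = cpu_time_from_rate(cycles, rate_hz)
t_period = cpu_time_from_period(cycles, period_s)
print(t_rate, t_period)  # both are about 10 seconds
```

Halving either the cycle count or the clock cycle time halves the CPU time, which is the point the formula makes about the two levers available to the designer.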
Performance Measures-Computer Clock
• How many cycles are required for a program?
• Different instructions take different numbers of cycles.
Performance Measures- Units of period and frequency

1 Hz = 1 cycle / second
e.g. 1 kHz = 1000 cycles in a second
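Since the period is the inverse of the frequency, converting between the two is a one-line calculation. The sketch below assumes the standard SI prefixes; the names PREFIXES, to_hertz and period_seconds are hypothetical.

```python
# Standard SI prefixes for frequency, expressed in cycles per second (Hz).
PREFIXES = {"Hz": 1.0, "kHz": 1e3, "MHz": 1e6, "GHz": 1e9}

def to_hertz(value, unit):
    """Convert a frequency such as (2, "GHz") to cycles per second."""
    return value * PREFIXES[unit]

def period_seconds(value, unit):
    """Clock period in seconds = 1 / frequency in Hz."""
    return 1.0 / to_hertz(value, unit)

print(to_hertz(1, "kHz"))        # prints 1000.0
print(period_seconds(2, "GHz"))  # about 0.5 nanoseconds
```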
Performance Measures- Improving Performance Example
• Question: A program runs in 10 seconds on computer A, which has a 2 GHz clock. We are trying to help a computer designer build a computer B, which will run this program in 6 seconds. The designer has determined that a substantial increase in the clock rate is possible, but this increase will affect the rest of the CPU design, causing computer B to require 1.2 times as many clock cycles as computer A for this program. What clock rate should we tell the designer to target?
• Solution: First, find the number of clock cycles required for the program on computer A:
CPU time (A) = CPU clock cycles (A) / Clock rate (A)
10 seconds = CPU clock cycles (A) / (2x10^9 cycles/second)
CPU clock cycles (A) = 10 seconds x 2x10^9 cycles/second
CPU clock cycles (A) = 20x10^9 cycles
• CPU time for B can be found using this equation:
CPU time (B) = (1.2 x CPU clock cycles (A)) / Clock rate (B)
6 seconds = (1.2 x 20x10^9 cycles) / Clock rate (B)
Clock rate (B) = (1.2 x 20x10^9 cycles) / 6 seconds
= 4x10^9 cycles/second = 4 GHz
To run the program in 6 seconds, B must have twice the clock rate of A.
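The calculation above can be reproduced step by step. This is a minimal sketch; the function name required_clock_rate and its parameter names are assumptions made for illustration.

```python
def required_clock_rate(time_a_s, rate_a_hz, cycle_factor, target_time_s):
    """Clock rate B needs in order to finish in target_time_s.

    cycles(A) = CPU time(A) x clock rate(A)
    cycles(B) = cycle_factor x cycles(A)
    rate(B)   = cycles(B) / CPU time(B)
    """
    cycles_a = time_a_s * rate_a_hz
    cycles_b = cycle_factor * cycles_a
    return cycles_b / target_time_s

# A: 10 s at 2 GHz. B: 1.2 times as many cycles, target 6 s.
rate_b = required_clock_rate(10.0, 2e9, 1.2, 6.0)
print(rate_b / 1e9, "GHz")  # about 4 GHz
```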
Instruction Performance
• The execution time must depend on the number of instructions in a program.
• CPU clock cycles = Instructions for a program x average clock cycles per instruction (CPI)
• Average clock cycles per instruction (CPI): the average number of clock cycles each instruction takes to execute.
• Since different instructions may take different amounts of time depending on what they do, CPI is an average over all the instructions executed in the program.
Instruction Performance- Example
• Question: Suppose we have two implementations of the same instruction set architecture. Computer A has a clock cycle time of 250 ps and a CPI of 2.0 for some program, and computer B has a clock cycle time of 500 ps and a CPI of 1.2 for the same program. Which computer is faster for this program and by how much?
• Solution: We know that each computer executes the same number of instructions for the program; let's call this number Ins.
• First, find the number of processor clock cycles for each computer.
• CPU clock cycles = Instructions for a program x average clock cycles per instruction (CPI)
• CPU clock cycles (A) = Ins x 2.0
• CPU clock cycles (B) = Ins x 1.2
• Now, we can compute the CPU time for each computer:
• CPU time (A) = CPU clock cycles (A) x Clock cycle time (A) = Ins x 2.0 x 250 ps = 500 x Ins ps
• Likewise for B:
• CPU time (B) = Ins x 1.2 x 500 ps = 600 x Ins ps
Clearly, computer A is faster. The amount faster is given by the ratio of the execution times:
CPU performance (A) / CPU performance (B) = Execution time (B) / Execution time (A) = (600 x Ins ps) / (500 x Ins ps) = 1.2
We can conclude that computer A is 1.2 times as fast as computer B for this program.
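Because both computers execute the same number of instructions, Ins cancels in the ratio, so it is enough to compare the time per instruction. The sketch below does exactly that; the function name time_per_instruction_ps is hypothetical.

```python
def time_per_instruction_ps(cpi, clock_cycle_time_ps):
    """Average time per instruction in picoseconds = CPI x clock cycle time."""
    return cpi * clock_cycle_time_ps

time_a = time_per_instruction_ps(2.0, 250)  # computer A: CPI 2.0, 250 ps cycle
time_b = time_per_instruction_ps(1.2, 500)  # computer B: CPI 1.2, 500 ps cycle

# Performance(A) / Performance(B) = Execution time(B) / Execution time(A)
speedup_a_over_b = time_b / time_a
print(time_a, time_b, speedup_a_over_b)  # about 500 ps, 600 ps and 1.2
```

B has the lower CPI, but its longer clock cycle more than cancels that advantage, which is why neither CPI nor clock cycle time alone decides which machine is faster.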
The classic CPU performance Equation
• Instruction count: the number of instructions executed by the
program.
• CPU time= Instruction count X CPI X Clock cycle time
• Or since the clock rate is the inverse of clock cycle time:
• CPU time= (Instruction count X CPI)/clock rate
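As a quick check that the two forms agree, here is a minimal sketch. The function names and the sample figures (10^9 instructions, CPI of 2.0, 4 GHz clock) are hypothetical values chosen only for illustration.

```python
def cpu_time_with_rate(instruction_count, cpi, clock_rate_hz):
    """CPU time = (instruction count x CPI) / clock rate."""
    return instruction_count * cpi / clock_rate_hz

def cpu_time_with_cycle_time(instruction_count, cpi, clock_cycle_time_s):
    """CPU time = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * clock_cycle_time_s

# Hypothetical program: 10^9 instructions, average CPI 2.0, 4 GHz clock.
t1 = cpu_time_with_rate(1e9, 2.0, 4e9)
t2 = cpu_time_with_cycle_time(1e9, 2.0, 1 / 4e9)
print(t1, t2)  # both are about 0.5 seconds
```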
The classic CPU performance Equation- Example
• Sequence 1 executes 2+1+2=5 instructions.
• Sequence 2 executes 4+1+1=6 instructions.
• Sequence 1 executes fewer instructions.
• However, to find which one is faster, we
should find the CPU clock cycles needed for
each sequence.
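The comparison can be sketched in code as follows. The instruction counts per sequence (2, 1, 2 and 4, 1, 1) come from the example; the CPI of each instruction class is not given in the text, so the values 1, 2 and 3, the class labels A, B and C, and the assignment of counts to classes are assumptions made only so that the sketch runs.

```python
# ASSUMPTION: the per-class CPI values and class labels are illustrative only;
# the text does not state them.
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}

def clock_cycles(counts):
    """Total CPU clock cycles = sum over classes of (CPI x instruction count)."""
    return sum(CPI_BY_CLASS[cls] * n for cls, n in counts.items())

def average_cpi(counts):
    """Average CPI = total clock cycles / total instruction count."""
    return clock_cycles(counts) / sum(counts.values())

sequence_1 = {"A": 2, "B": 1, "C": 2}  # 2 + 1 + 2 = 5 instructions
sequence_2 = {"A": 4, "B": 1, "C": 1}  # 4 + 1 + 1 = 6 instructions

for name, seq in (("Sequence 1", sequence_1), ("Sequence 2", sequence_2)):
    print(name, clock_cycles(seq), "cycles, CPI =", average_cpi(seq))
```

Under these assumed CPI values, sequence 1 needs 10 cycles and sequence 2 needs 9, so the sequence with more instructions would be the faster one. With different per-class CPI values the outcome could differ.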
[Figure: CPI example]
MIPS as a Performance Metric
• MIPS (millions of instructions per second): the approximate number of instructions, in millions, that a CPU can execute in one second. For example, the Intel 80386 (386) processor was capable of performing more than five million instructions every second, or 5 MIPS.
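MIPS follows directly from the instruction count and the execution time, or equivalently from the clock rate and the CPI. The sketch below shows both forms; the function names and the sample figures are hypothetical and do not describe any real processor.

```python
def mips(instruction_count, execution_time_s):
    """MIPS = instruction count / (execution time x 10^6)."""
    return instruction_count / (execution_time_s * 1e6)

def mips_from_clock(clock_rate_hz, cpi):
    """Equivalent form: MIPS = clock rate / (CPI x 10^6)."""
    return clock_rate_hz / (cpi * 1e6)

# Hypothetical run: 10 million instructions executed in 2 seconds.
print(mips(10e6, 2.0))            # prints 5.0
# Hypothetical CPU: 4 GHz clock with an average CPI of 2.0.
print(mips_from_clock(4e9, 2.0))  # prints 2000.0
```

The second form shows that MIPS depends on the CPI, which in turn depends on the program being run, so a single MIPS figure does not describe a processor for every program.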
[Figure: MIPS example]
Summary of Performance Measures
End of the chapter.
Thank you all.
Assoc. Prof. Dr. Ezgi Deniz Ülker
