Von Neumann Architecture
Programs are stored
on storage devices
Programs are copied
into memory for
execution
CPU reads each
instruction in the
program and
executes accordingly
Von Neumann/Turing
Stored Program Computer
ALU capable of operating on binary data
Both ALU & CU contain registers.
Princeton Institute for Advanced
Studies (IAS)
First implementation of von Neumann
stored program computer – the IAS
computer
Began in 1946
Completed in 1952
Structure of IAS machine
IAS Memory
1,000 × 40-bit words, each holding either a number or
a pair of instructions
Signed magnitude binary number
1 sign bit
39 bits for magnitude
2 x 20 bit instructions
Left and right instructions (left executed first)
8-bit opcode
12-bit address
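The word layout above (two 20-bit instructions per 40-bit word, each an 8-bit opcode followed by a 12-bit address) can be sketched with a few bit operations. This is a hypothetical helper for illustration, not part of any real IAS software; the function and field names are ours:

```python
def decode_ias_word(word):
    """Split a 40-bit IAS word into its left and right 20-bit instructions.

    Each 20-bit instruction is an 8-bit opcode followed by a 12-bit address.
    Illustrative sketch only; field names are ours.
    """
    left = (word >> 20) & 0xFFFFF      # left instruction (executed first)
    right = word & 0xFFFFF             # right instruction

    def split(instr):
        opcode = (instr >> 12) & 0xFF  # high 8 bits: opcode
        address = instr & 0xFFF        # low 12 bits: address (0..999 used)
        return opcode, address

    return split(left), split(right)

# Example: left instruction opcode 0x01 at address 500,
# right instruction opcode 0x0A at address 7
word = (0x01 << 32) | (500 << 20) | (0x0A << 12) | 7
print(decode_ias_word(word))  # ((1, 500), (10, 7))
```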
IAS Registers
Set of registers (storage in CPU)
Memory Buffer Register (MBR)
Memory Address Register (MAR)
Instruction Register (IR)
Instruction Buffer Register (IBR)
Program Counter (PC)
Accumulator (AC)
Multiplier Quotient (MQ)
IAS Registers
Memory buffer register (MBR): Contains a word
to be stored in memory or sent to the I/O unit, or is
used to receive a word from memory or from the I/O
unit.
Memory address register (MAR): Specifies the
address in memory of the word to be written from or
read into the MBR.
Instruction register (IR): Contains the 8-bit
opcode instruction being executed.
IAS Registers
Instruction buffer register (IBR): Employed to hold
temporarily the right-hand instruction from a word in
memory.
Program counter (PC): Contains the address of the
next instruction-pair to be fetched from memory.
Accumulator (AC) and multiplier quotient (MQ):
Employed to hold temporarily operands and results of
ALU operations. For example, the result of multiplying two
40-bit numbers is an 80-bit number; the most significant
40 bits are stored in the AC and the least significant in the
MQ.
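The AC/MQ split described above is easy to mimic with integer arithmetic. A minimal sketch, assuming unsigned 40-bit magnitudes (the real machine's signed-magnitude handling is omitted for brevity):

```python
MASK_40 = (1 << 40) - 1

def multiply_40bit(a, b):
    """Multiply two 40-bit magnitudes; return (AC, MQ).

    The 80-bit product is split as the slides describe: the most
    significant 40 bits go to the AC, the least significant to the MQ.
    Sign handling of the real signed-magnitude format is omitted.
    """
    product = a * b          # up to 80 bits
    ac = product >> 40       # most significant 40 bits -> AC
    mq = product & MASK_40   # least significant 40 bits -> MQ
    return ac, mq

ac, mq = multiply_40bit((1 << 39) + 3, 1 << 39)
# Recombining AC and MQ recovers the full 80-bit product:
assert (ac << 40) | mq == ((1 << 39) + 3) * (1 << 39)
```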
Structure of
IAS
Figure 2.3, p. 22
Moore’s Law
Gordon Moore - cofounder of Intel
He observed (based on experience) that the number of
transistors on a chip doubled every year
Since the 1970s, growth has slowed a little:
the number of transistors now doubles roughly every 18 months
The cost of a chip has remained almost unchanged
Higher packing density means shorter electrical paths,
giving higher performance
Smaller size gives increased flexibility/portability
Reduced power and cooling requirements
Fewer system interconnections increase reliability
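The doubling rule above compounds quickly. A small sketch of the projection (the starting count of ~2,300 transistors is the Intel 4004 from 1971; the helper is ours):

```python
def transistor_count(initial, years, doubling_period_years=1.5):
    """Project transistor count under Moore's law:
    doubling every 18 months (1.5 years)."""
    return initial * 2 ** (years / doubling_period_years)

# From ~2,300 transistors (Intel 4004, 1971) over 30 years:
# 30 / 1.5 = 20 doublings, i.e. a factor of 2**20 ~ one million,
# so roughly 2.4 billion transistors.
print(round(transistor_count(2300, 30)))
```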
Growth in CPU Transistor Count
Effects of Moore’s Law
The doubling of the number of transistors on a
single chip every 18 months has had some effects on
the application of technology:
Costs have fallen dramatically: chip prices have not changed
substantially since Moore made his prediction, while capability per chip has soared
Tighter packaging has allowed for shorter electrical paths
and therefore faster execution
Smaller packaging has allowed for more applications in
more environments
Reduction in power and cooling requirements which also
helps with portability
Solder connections are less reliable than on-chip connections;
with more functions on a single chip, fewer solder
connections are needed, which improves reliability
Effects of Moore’s Law (continued)
As technology allows for higher levels of
performance, processor designers must come
up with ways to use it.
Keeping all parts of the processor busy
Coordinating multiple pipelines
Improved branch prediction
Multiple processors
Optimizing execution
Real-time analysis of code to “re-order” execution
Speculative execution of code
Incorporating multiple functions on single chip
Performance Mismatch
Experienced significant improvement
Processor speed
Memory capacity
Experienced only minor improvement
Memory speed
Bus rates
I/O device performance
Speeding it up
Pipelining
On board cache
On board L1 & L2 cache
Branch prediction
Data flow analysis
Speculative execution
Branch Prediction
The processor looks ahead in the instruction code
fetched from memory and predicts which branches,
or groups of instructions, are likely to be processed
next. If the processor guesses right most of the
time, it can prefetch the correct instructions and
buffer them so that the processor is kept busy. The
more sophisticated examples of this strategy predict
not just the next branch but multiple branches
ahead. Thus, branch prediction increases the
amount of work available for the processor to
execute.
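One classic prediction mechanism (not named in the slides, but a standard textbook example) is a 2-bit saturating counter kept per branch. A minimal sketch; real predictors index a table of such counters by branch address:

```python
class TwoBitPredictor:
    """2-bit saturating counter: states 0-1 predict not-taken,
    states 2-3 predict taken. One misprediction does not flip the
    prediction, which suits loop branches."""

    def __init__(self):
        self.state = 2  # start weakly "taken"

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Saturate at 0 and 3
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)

p = TwoBitPredictor()
# A loop branch: taken 8 times, exits once, then taken 8 times again.
outcomes = [True] * 8 + [False] + [True] * 8
correct = 0
for taken in outcomes:
    correct += (p.predict() == taken)
    p.update(taken)
print(f"{correct}/{len(outcomes)} predicted correctly")  # 16/17
```

The single loop-exit misprediction does not disturb the following iterations, which is exactly why the 2-bit scheme beats a 1-bit one on loops.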
Data Flow Analysis
The processor analyzes which instructions are
dependent on each other’s results, or data, to
create an optimized schedule of instructions. In
fact, instructions are scheduled to be executed
when ready, independent of the original program
order. This prevents unnecessary delay.
Speculative Execution
Using branch prediction and data flow analysis,
some processors speculatively execute
instructions ahead of their actual appearance in
the program execution, holding the results in
temporary locations. This enables the processor
to keep its execution engines as busy as possible
by executing instructions that are likely to be
needed.
Performance Balance (Mismatch?)
Processor speed increased
Memory capacity increased
But not the speed
Thus, memory speed lags behind
processor speed
Logic and Memory Performance Gap
Solutions
Increase number of bits retrieved at one time
Change DRAM interface
Cache
Reduce frequency of memory access
More complex cache and cache on chip
Increase interconnection bandwidth
High speed buses
Hierarchy of buses
I/O Devices
Peripherals with intensive I/O demands
Large data throughput demands
Processors can handle this
Problem moving data
Solutions:
Caching
Buffering
Higher-speed interconnection buses
More elaborate bus structures
Multiple-processor configurations
Key is Balance
Processor components
Main memory
I/O devices
Interconnection structures
Improvements in Chip Organization
and Architecture
Increase hardware speed of processor
Fundamentally due to shrinking logic gate size
More gates, packed more tightly, increasing clock
rate
Propagation time for signals reduced
Increase size and speed of caches
Dedicating part of processor chip
Cache access times drop significantly
Change processor organization and architecture
Increase effective speed of execution
Parallelism
Increased Cache Capacity
Typically two or three levels of cache
between processor and main memory
Chip density increased
More cache memory on chip
Faster cache access
Pentium chip devoted about 10% of
chip area to cache
Pentium 4 devotes about 50%
More Complex Execution Logic
Enable parallel execution of instructions
Pipeline works like assembly line
Different stages of execution of different
instructions at same time along pipeline
Superscalar allows multiple pipelines
within single processor
Instructions that do not depend on one
another can be executed in parallel
Diminishing Returns
Internal organization of processors complex
Can get a great deal of parallelism
Further significant increases likely to be
relatively modest
Benefits from cache are reaching limit
Increasing clock rate runs into power
dissipation problem
Some fundamental physical limits are being
reached
New Approach – Multiple Cores
Multiple processors on single chip
Large shared cache
Within a processor, increase in performance
proportional to square root of increase in complexity
If software can use multiple processors, doubling
number of processors almost doubles performance
So, use two simpler processors on the chip rather than
one more complex processor
With two processors, larger caches are justified
Power consumption of memory logic less than processing logic
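The square-root relation above is often called Pollack's rule (the name is ours; the slides do not use it). A quick sketch of the trade-off it implies, assuming software that scales perfectly across cores:

```python
import math

def core_performance(complexity):
    """Per-core performance grows roughly as the square root of
    design complexity (Pollack's rule)."""
    return math.sqrt(complexity)

# One core at double the complexity vs. two cores of the original
# complexity, with perfectly parallel software:
one_big = core_performance(2.0)        # ~1.41x a baseline core
two_small = 2 * core_performance(1.0)  # ~2.0x a baseline core
print(one_big, two_small)
```

Under these assumptions the two simpler cores win, which is the argument the slide makes for multiple cores.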
Performance Assessment
Performance is one of the key
parameters to consider, along with
cost,
size,
security,
reliability, and
power consumption.
Performance Assessment
Raw speed is far less important than how a
processor performs when executing a given
application.
Application performance depends not just on the
raw speed of the processor, but on the
instruction set, the choice of implementation language,
the efficiency of the compiler, and the skill of the
programmer in implementing the application.
System Clock
Performance Assessment: Clock Speed
Key parameters
Performance, cost, size, security, reliability, power
consumption
System clock speed
Measured in Hz or multiples thereof (the pulse frequency
produced by the clock)
Clock rate, clock cycle, clock tick, cycle time
Signals in a CPU take time to settle down to 1 or 0
Some signals may change at different speeds
Computer operations need to be synchronised
Instruction execution is done in discrete steps:
Fetch, decode, load and store, arithmetic or logical
Usually require multiple clock cycles per instruction
Pipelining gives simultaneous execution of instructions
So, clock speed does not portray the complete picture for
different processors
Performance Assessment: Clock Speed
A 1-GHz processor receives 1 billion clock pulses per
second.
Clock Rate/Clock Speed: The rate of pulses
Cycle Time: the time duration between pulses
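Clock rate and cycle time are reciprocals of one another; a one-line helper makes the relation concrete (illustrative only):

```python
def cycle_time(clock_rate_hz):
    """Cycle time tau is the reciprocal of the clock rate f: tau = 1 / f."""
    return 1.0 / clock_rate_hz

tau = cycle_time(1e9)  # a 1-GHz processor
print(tau)             # 1e-09 seconds, i.e. 1 ns between pulses
```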
Instruction Execution Rate
A processor is driven by a clock with a constant
frequency f, or, equivalently:
1. a constant cycle time τ = 1/f
2. Ic = instruction count: the number of machine
instructions executed for that program until it runs to
completion, or for some defined time interval
(executed instructions, not instructions in the program text)
3. CPI = average cycles per instruction
Is CPI a constant value for a processor?
Why 'average'?
Instruction Execution Rate
On any given processor, the number of clock
cycles required varies for different types of
instructions, such as load, store, and branch.
Let CPIi be the number of cycles required for
instruction type i and Ii be the number of
executed instructions of type i for a given
program
The overall CPI is:
CPI = [ Σi (CPIi × Ii) ] / Ic
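The weighted-average formula CPI = Σi (CPIi × Ii) / Ic can be sketched directly. The instruction mix below is hypothetical, chosen only for illustration:

```python
def overall_cpi(mix):
    """Overall CPI = sum(CPI_i * I_i) / Ic.

    mix maps an instruction type to (cpi_i, count_i), where count_i
    is the number of executed instructions of that type.
    """
    total_cycles = sum(cpi * count for cpi, count in mix.values())
    total_instructions = sum(count for _, count in mix.values())
    return total_cycles / total_instructions

# Hypothetical mix of 1,000 executed instructions:
mix = {
    "alu":        (1, 500),  # CPI 1, 500 instructions
    "load_store": (3, 300),  # CPI 3, 300 instructions
    "branch":     (4, 200),  # CPI 4, 200 instructions
}
print(overall_cpi(mix))  # (500 + 900 + 800) / 1000 = 2.2
```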
Instruction Execution Rate
The processor time T needed to execute a given
program can be expressed as: T = Ic × CPI × τ
A refinement of this formula is based on the fact that
memory-related processing (memory references) takes
more time than processing done by the CPU
Rewriting: T = Ic × [p + (m × k)] × τ
where p = number of processor cycles needed to
decode and execute the instruction,
m = number of memory references needed per instruction, and
k = the ratio of memory cycle time to processor
cycle time.
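The refined formula T = Ic × [p + (m × k)] × τ can be evaluated directly. The numbers below are hypothetical, chosen only to show the mechanics:

```python
def execution_time(ic, p, m, k, tau):
    """T = Ic * [p + m * k] * tau, in seconds.

    ic:  executed instruction count
    p:   processor cycles per instruction (decode + execute)
    m:   memory references per instruction
    k:   ratio of memory cycle time to processor cycle time
    tau: processor cycle time in seconds
    """
    return ic * (p + m * k) * tau

# Hypothetical: 2 million instructions on a 400-MHz clock,
# 1.5 processor cycles and 0.4 memory references per instruction,
# memory 4x slower than the processor:
t = execution_time(ic=2_000_000, p=1.5, m=0.4, k=4, tau=1 / 400e6)
print(t)  # execution time in seconds
```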
Performance Factors & System Attributes
The five performance factors in the preceding
equation (Ic, p, m, k, τ) are influenced by four
system attributes:
the design of the instruction set (known as
instruction set architecture),
compiler technology (how effective the compiler is in
producing an efficient machine language program
from a high-level language program),
processor implementation, and
cache and memory hierarchy.
MIPS
Millions of instructions per second (MIPS)
Millions of floating point instructions per
second (MFLOPS)
Heavily dependent on instruction set,
compiler design, processor
implementation, cache & memory
hierarchy
MIPS rate
The MIPS rate can be expressed in terms of the clock rate and CPI
as follows:
MIPS rate = Ic / (T × 10^6) = f / (CPI × 10^6)
MIPS
Consider the execution of a program which results in the
execution of 2 million instructions on a 400-MHz
processor. The program consists of four major types of
instructions. The instruction mix and the CPI for each
instruction type are given below, based on the result of a
program trace experiment:

Instruction type                    CPI   Instruction mix
Arithmetic and logic                 1        60%
Load/store with cache hit            2        18%
Branch                               4        12%
Memory reference with cache miss     8        10%
MIPS
The average CPI when the program is
executed on a uniprocessor with the above
trace results is:
CPI = (0.6 × 1) + (0.18 × 2) + (0.12 × 4) + (0.1 × 8) = 2.24
The corresponding MIPS rate is:
MIPS rate = (400 × 10^6) / (2.24 × 10^6) ≈ 178
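The arithmetic of the worked example can be checked in a few lines. The mix below is the commonly cited trace (60% ALU at CPI 1, 18% load/store at CPI 2, 12% branch at CPI 4, 10% cache-miss references at CPI 8), assumed here to be the one the example uses:

```python
# Check the worked example: 2 million instructions, 400-MHz clock.
mix = [  # (CPI_i, fraction of the instruction mix)
    (1, 0.60),  # arithmetic and logic
    (2, 0.18),  # load/store with cache hit
    (4, 0.12),  # branch
    (8, 0.10),  # memory reference with cache miss
]
cpi = sum(c * frac for c, frac in mix)   # weighted-average CPI
mips = 400e6 / (cpi * 1e6)               # MIPS = f / (CPI * 10^6)
print(round(cpi, 2), round(mips, 1))     # 2.24 178.6
```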