Computer Architecture and Organization: General Introduction

This document provides an overview of computer architecture and organization. It discusses the differences between architecture, which is visible to programmers, and organization, which is how features are implemented internally. The document covers the evolution of computer technology from early machines like ENIAC to modern microprocessors. It also discusses how performance has been improved through smaller components, caches, parallelism, and other architectural techniques.

Uploaded by

cudarun
Copyright
© Attribution Non-Commercial (BY-NC)

Computer Architecture and Organization

Chapter 1 General Introduction

Architecture & Organization 1


Architecture comprises the attributes visible to the programmer
  Instruction set, number of bits used for data representation, I/O mechanisms, addressing techniques
  e.g. Is there a multiply instruction?

Organization is how those features are implemented
  Control signals, interfaces, memory technology
  e.g. Is there a hardware multiply unit, or is multiplication done by repeated addition?
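The multiply example can be sketched in a few lines. This is only an illustration, not how any particular machine does it: both function names are hypothetical, and Python's built-in `*` stands in for a hardware multiply unit. The point is that the caller sees the same result (the architecture) whichever implementation (the organization) sits underneath.

```python
def multiply_with_unit(a: int, b: int) -> int:
    """Stand-in for a dedicated hardware multiply unit (assumed for illustration)."""
    return a * b


def multiply_by_repeated_addition(a: int, b: int) -> int:
    """Multiply two non-negative integers using only addition."""
    if a < 0 or b < 0:
        raise ValueError("this sketch handles non-negative operands only")
    total = 0
    for _ in range(b):  # add a to the running total, b times
        total += a
    return total


# Same visible behaviour, different internal implementation.
print(multiply_with_unit(6, 7), multiply_by_repeated_addition(6, 7))
```

The repeated-addition version takes a number of steps proportional to `b`, which is the kind of cost difference that organization decisions trade against hardware complexity.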

Architecture & Organization 2


All members of the Intel x86 family share the same basic architecture
The IBM System/370 family shares the same basic architecture
This gives code compatibility
Organization differs between different versions

Structure & Function


Structure is the way in which components relate to each other
Function is the operation of individual components as part of the structure

Function
All computer functions are:
  Data processing
  Data storage (including short-term, on-the-fly storage)
  Data movement (between the computer and the outside world)
    Input/output (peripherals)
    Data communications
  Control

Functional View

Operations (a) Data movement

Operations (b) Storage

Operations (c) Processing from/to storage

Operations (d) Processing from storage to I/O

Structure - Top Level


Computer
  Central Processing Unit
  Main Memory
  Input/Output
  Systems Interconnection
Outside the computer
  Peripherals
  Communication lines

Structure - The CPU


Computer
  I/O
  System Bus
  Memory
  CPU
CPU
  Registers
  Arithmetic and Logic Unit
  Internal CPU Interconnection
  Control Unit

Structure - The Control Unit


CPU
  ALU
  Internal Bus
  Registers
  Control Unit
Control Unit
  Sequencing Logic
  Control Unit Registers and Decoders
  Control Memory

Computer Evolution

ENIAC - background
Electronic Numerical Integrator And Computer
John Presper Eckert and John Mauchly
University of Pennsylvania
Trajectory tables for weapons
Started 1943
Finished 1946
Used until 1955

ENIAC - details
Decimal (not binary)
20 accumulators, each holding a 10 digit decimal number
Programmed manually by switches and cables
18,000 vacuum tubes
30 tons
About 1,800 square feet of floor space
140 kW power consumption
5,000 additions per second

ENIAC


von Neumann/Turing
Stored Program concept
Main memory storing programs and data
ALU operating on binary data
Control unit interpreting instructions from memory and executing them
Input and output equipment operated by control unit
Institute for Advanced Study, Princeton
  IAS computer
  Completed 1952

Structure of von Neumann machine


IAS - details
1000 x 40 bit words
  Binary number
  2 x 20 bit instructions
Set of registers (storage in CPU)
  Memory Buffer Register
  Memory Address Register
  Instruction Register
  Instruction Buffer Register
  Program Counter
  Accumulator
  Multiplier Quotient
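A short sketch of how one 40 bit word can carry two 20 bit instructions. The split of each instruction into an 8 bit opcode followed by a 12 bit address is the usual textbook description of the IAS format and is not stated on the slide, so treat it as an assumption; the function names and the sample opcode values are hypothetical.

```python
WORD_BITS = 40
INSTRUCTION_BITS = 20
ADDRESS_BITS = 12  # assumed: 8 bit opcode + 12 bit address per instruction


def decode_instruction(instruction: int) -> tuple[int, int]:
    """Split a 20 bit instruction into (opcode, address)."""
    opcode = instruction >> ADDRESS_BITS
    address = instruction & ((1 << ADDRESS_BITS) - 1)
    return opcode, address


def split_word(word: int) -> tuple[tuple[int, int], tuple[int, int]]:
    """Split a 40 bit word into its left and right instructions."""
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("word must fit in 40 bits")
    left = word >> INSTRUCTION_BITS
    right = word & ((1 << INSTRUCTION_BITS) - 1)
    return decode_instruction(left), decode_instruction(right)


# Build a word from two made-up instructions and take it apart again.
word = (0x01 << 32) | (0x0FA << 20) | (0x05 << 12) | 0x0FB
print(split_word(word))
```

A 12 bit address can name 4096 locations, which comfortably covers the 1000 words of memory.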

Structure of IAS detail


Instruction set of IAS


Commercial Computers
1947 - Eckert-Mauchly Computer Corporation
UNIVAC I (Universal Automatic Computer)
  US Bureau of Census 1950 calculations
Late 1950s - UNIVAC II
  Faster
  More memory

IBM
Punched-card processing equipment
1953 - the 701
  IBM's first stored program computer
  Scientific calculations
1955 - the 702
  More hardware features
  Business applications
Led to 700/7000 series
  Made IBM the dominant computer manufacturer

Second generation - Transistors


Replaced vacuum tubes
Smaller
Cheaper
Less heat dissipation
Solid state device
Made from silicon (sand)
Invented 1947 at Bell Labs
William Shockley et al.

Transistor Based Computers


Second generation machines
NCR & RCA produced small transistor machines
IBM 7000
DEC (Digital Equipment Corporation) - 1957
  Produced PDP-1 (Programmed Data Processor)

Third Generation : IC
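Second generation machines used discrete components
  Each had to be attached to a circuit board individually, which was difficult
  Machines of 10,000 transistors and more
1958 - the era of microelectronics begins with the integrated circuit (IC)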

Microelectronics
Literally - "small electronics"
A computer is made up of gates, memory cells and interconnections
These can be manufactured on a semiconductor, e.g. a silicon wafer

Data storage
  Memory cells
Processing
  Gates
Movement
  Paths between components
Control
  Paths also carry control signals

Generations of Computer
Vacuum tube - 1946-1957
Transistor - 1958-1964
Small scale integration - 1965 on
  Up to 100 devices on a chip
Medium scale integration - to 1971
  100-3,000 devices on a chip
Large scale integration - 1971-1977
  3,000 - 100,000 devices on a chip
Very large scale integration - 1978-1991
  100,000 - 100,000,000 devices on a chip
Ultra large scale integration - 1991 on
  Over 100,000,000 devices on a chip

Moore's Law
Increased density of components on chip
Gordon Moore - co-founder of Intel
Number of transistors on a chip will double every year
Since 1970s development has slowed a little
  Number of transistors doubles every 18 months
Cost of a chip has remained almost unchanged
Higher packing density means shorter electrical paths, giving higher performance
Smaller size gives increased flexibility
Reduced power and cooling requirements
Fewer interconnections increase reliability compared with solder connections
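The difference between the two doubling rates compounds quickly. The sketch below is a plain exponential-growth calculation; the function name, the starting count of 1,000 transistors and the three-year horizon are illustrative assumptions, not figures from the slides.

```python
def projected_transistors(initial_count: float, years: float,
                          doubling_period_years: float) -> float:
    """Project a transistor count that doubles once per doubling period."""
    return initial_count * 2 ** (years / doubling_period_years)


start = 1_000  # hypothetical starting count
yearly = projected_transistors(start, years=3, doubling_period_years=1.0)
every_18_months = projected_transistors(start, years=3, doubling_period_years=1.5)
print(yearly, every_18_months)
```

Over three years, yearly doubling gives three doublings while an 18 month period gives two, so the slower rate ends up with half as many transistors.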

Growth in CPU Transistor Count


IBM 360 series


1964
Replaced (and not compatible with) 7000 series
First planned family of computers
  Similar or identical instruction sets
  Similar or identical O/S
  Increasing speed
  Increasing number of I/O ports (i.e. more terminals)
  Increased memory size
  Increased cost

DEC PDP-8
1964
First minicomputer
Did not need air conditioned room
Small enough to sit on a lab bench
$16,000
  $100k+ for IBM 360
Embedded applications & OEM (original equipment manufacturers)
  Other manufacturers purchase it and integrate it into a new system

DEC - PDP-8 Bus Structure

Not a central-switched architecture like IBM's
96 separate signal paths
  Carry control, address and data signals
Highly flexible

Later Generations

ICs used for construction of the processor
Also used to construct memories
Magnetic core memory
  Tiny rings of ferromagnetic material, about 1/16 of an inch in diameter
  Strung up on grids of fine wires
  Magnetized one way represents a one; magnetized the other way represents a zero
  It was fast but expensive, and reading was destructive
1970 - Fairchild semiconductor memory
  Size of a single core
    i.e. 1 bit of magnetic core storage
  Holds 256 bits
  Non-destructive read
  Much faster than core
  Capacity approximately doubles each year

Designing for Performance

Microprocessor speed
Capacity of RAM
Speed gains come from reducing the distance between circuits
But raw speed is wasted unless the processor is fed a constant stream of instructions
  Branch prediction (predicts which branch, or group of instructions, is likely to be processed next and prefetches it)
  Data flow analysis (determines which instructions depend on each other's results and schedules them accordingly)
  Speculative execution (executes instructions ahead of their actual appearance in the program)
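Branch prediction can be illustrated with a two-bit saturating counter. This is a common textbook scheme chosen here as an assumption; the slides do not say which predictor any given processor uses, and the class and function names are hypothetical.

```python
class TwoBitPredictor:
    """Two-bit saturating counter: states 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self) -> None:
        self.state = 2  # start in the weakly "taken" state (arbitrary choice)

    def predict(self) -> bool:
        return self.state >= 2

    def update(self, taken: bool) -> None:
        # Move one step towards the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_correct(outcomes: list[bool]) -> int:
    """Count how many branch outcomes the predictor guesses correctly."""
    predictor = TwoBitPredictor()
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct


# A loop branch that is taken nine times and then falls through once.
loop_outcomes = [True] * 9 + [False]
print(count_correct(loop_outcomes), "of", len(loop_outcomes))
```

Because two wrong guesses are needed to flip the prediction, a single loop exit does not disturb the predictor the next time the same loop runs.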

Performance Balance
Processor speed increased
Memory capacity increased
Memory speed lags behind processor speed
Interface between processor and memory is the crucial path
  It carries a constant flow of program instructions and data
  If memory or the pathway fails to keep pace, the processor stalls in a wait state and processing time is lost

Logic and Memory Performance Gap


Solutions
Increase number of bits retrieved at one time
  Make DRAM "wider" rather than "deeper"
Change DRAM interface
  Cache
Reduce frequency of memory access
  More complex cache and cache on chip
  On chip and off chip, near to processor
Increase interconnection bandwidth
  High speed buses
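Why a cache reduces the cost of slow memory can be seen with the widely used average memory access time estimate (hit time plus miss rate times miss penalty). The formula is not on the slides, and the cycle counts below are made-up illustrative values rather than measurements.

```python
def average_access_time(hit_time: float, miss_rate: float,
                        miss_penalty: float) -> float:
    """Average memory access time = hit time + miss rate * miss penalty."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time + miss_rate * miss_penalty


# Hypothetical numbers: 1-cycle cache hit, 100-cycle trip to main memory.
no_cache = average_access_time(hit_time=0, miss_rate=1.0, miss_penalty=100)
with_cache = average_access_time(hit_time=1, miss_rate=0.05, miss_penalty=100)
print(no_cache, with_cache)
```

With these assumed values, a cache that satisfies 95% of accesses cuts the average cost from 100 cycles to roughly 6.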

I/O Devices
Peripherals with intensive I/O demands
Large data throughput demands
Processors can handle this
The problem is moving the data
Solutions:
  Caching
  Buffering
  Higher-speed interconnection buses
  More elaborate bus structures
  Multiple-processor configurations

Typical I/O Device Data Rates


Key is Balance
Balance in Processor components, Main memory, I/O devices, Interconnection structures


Improvements in Chip Organization and Architecture


Increase hardware speed of processor
  Fundamentally due to shrinking logic gate size
    More gates, packed more tightly, increasing clock rate
    Propagation time for signals reduced
Increase size and speed of caches
  Dedicating part of processor chip to cache
    Cache access times drop significantly
Change processor organization and architecture
  Increase effective speed of execution
  Parallelism

Problems with Clock Speed and Logic Density


Power
  Power density increases with density of logic and clock speed
  Dissipating the heat becomes difficult
RC delay
  Speed at which electrons flow is limited by the resistance and capacitance of the metal wires connecting the components
  Delay increases as RC product increases
  As the components on a chip shrink:
    Wire interconnects become thinner, increasing resistance
    Wires sit closer together, increasing capacitance
Memory latency
  Memory speeds lag processor speeds
Solution:
  More emphasis on organizational and architectural approaches
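A rough first-order model shows how both effects multiply. It assumes a rectangular wire (resistance = resistivity x length / cross-section) and treats two neighbouring wires as a parallel-plate capacitor; real interconnect is more complicated, and the dimensions and material constants below are illustrative assumptions only.

```python
def wire_resistance(resistivity: float, length: float,
                    width: float, thickness: float) -> float:
    """Resistance of a rectangular wire, in ohms."""
    return resistivity * length / (width * thickness)


def wire_capacitance(permittivity: float, length: float,
                     thickness: float, spacing: float) -> float:
    """Parallel-plate estimate of capacitance to a neighbouring wire, in farads."""
    return permittivity * length * thickness / spacing


def rc_product(resistivity, permittivity, length, width, thickness, spacing):
    """RC time constant of the wire under the simplified model above."""
    r = wire_resistance(resistivity, length, width, thickness)
    c = wire_capacitance(permittivity, length, thickness, spacing)
    return r * c


COPPER = 1.7e-8            # ohm-metres, approximate
OXIDE = 3.9 * 8.854e-12    # farads per metre, assumed silicon dioxide insulator

before = rc_product(COPPER, OXIDE, length=1e-3, width=200e-9,
                    thickness=200e-9, spacing=200e-9)
# Halve the wire width and the gap between wires; keep length and thickness.
after = rc_product(COPPER, OXIDE, length=1e-3, width=100e-9,
                   thickness=200e-9, spacing=100e-9)
print(after / before)
```

In this model halving the width doubles R and halving the spacing doubles C, so the RC product grows about fourfold even though the wire is no longer.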

Intel Microprocessor Performance


Increased Cache Capacity


Typically two or three levels of cache between processor and main memory
Chip density increased
  More cache memory on chip
    Faster cache access
Pentium chip devoted about 10% of chip area to cache
Pentium 4 devotes about 50%

More Complex Execution Logic


Enable parallel execution of instructions
Pipeline works like assembly line
  Different stages of execution of different instructions at same time along pipeline
Superscalar allows multiple pipelines within single processor
  Instructions that do not depend on one another can be executed in parallel
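The assembly-line effect can be quantified for an idealized pipeline. The sketch assumes every stage takes one cycle and that there are no stalls or dependencies, which real pipelines do not achieve; the function names and the 5-stage, 100-instruction figures are illustrative.

```python
def unpipelined_cycles(instructions: int, stages: int) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return instructions * stages


def pipelined_cycles(instructions: int, stages: int) -> int:
    """Ideal pipeline: fill it once, then complete one instruction per cycle."""
    if instructions == 0:
        return 0
    return stages + instructions - 1


n, k = 100, 5  # hypothetical workload and pipeline depth
speedup = unpipelined_cycles(n, k) / pipelined_cycles(n, k)
print(unpipelined_cycles(n, k), pipelined_cycles(n, k), round(speedup, 2))
```

For long instruction streams the speedup approaches the number of stages, since the one-off cost of filling the pipeline becomes negligible.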

Diminishing Returns
Internal organization of processors is already complex
  Can get a great deal of parallelism
  Further significant increases likely to be relatively modest
Benefits from cache are reaching a limit
Increasing clock rate runs into power dissipation problem
  Some fundamental physical limits are being reached

New Approach - Multiple Cores

Multiple processors on single chip
  Large shared cache
Within a processor, increase in performance is proportional to square root of increase in complexity
If software can use multiple processors, doubling number of processors almost doubles performance
So, use two simpler processors on the chip rather than one more complex processor
With two processors, larger caches are justified
  Power consumption of memory logic is less than that of processing logic
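The trade-off can be put into numbers. The single-core estimate follows the square-root rule stated above; for the multi-core case the sketch uses Amdahl's law, which is not on the slides and is assumed here as one reasonable way to model "almost doubles". The 95% parallel fraction is an illustrative guess.

```python
import math


def single_core_speedup(complexity_ratio: float) -> float:
    """Square-root rule: performance grows with the square root of complexity."""
    return math.sqrt(complexity_ratio)


def multi_core_speedup(cores: int, parallel_fraction: float) -> float:
    """Amdahl's law (assumed model): serial part is not sped up by extra cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Spend twice the hardware budget in two different ways.
one_big_core = single_core_speedup(2.0)
two_simple_cores = multi_core_speedup(cores=2, parallel_fraction=0.95)
print(round(one_big_core, 2), round(two_simple_cores, 2))
```

Under these assumptions the doubled-complexity core gains about 1.4x while two simple cores gain about 1.9x, provided the software is mostly parallel.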

x86 Evolution (1)


8080
  First general purpose microprocessor
  8 bit data path
  Used in first personal computer - Altair
8086
  5 MHz - 29,000 transistors
  Much more powerful
  16 bit
  Instruction cache, prefetches a few instructions
  8088 (8 bit external bus) used in first IBM PC
80286
  16 MByte memory addressable
  Up from 1 MByte
80386
  32 bit
  Support for multitasking
80486
  Sophisticated, powerful cache and instruction pipelining
  Built-in maths co-processor (offloads complex maths operations from the main CPU)
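The jump from 1 MByte to 16 MByte follows directly from the number of address bits. The slide does not state the bit counts; 20 address bits for the 8086 and 24 for the 80286 are assumed here from the processors' published specifications, and the function name is hypothetical.

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses reachable with the given address width."""
    return 2 ** address_bits


MBYTE = 2 ** 20

# Assumed address widths (not given on the slide).
print(addressable_bytes(20) // MBYTE, "MByte for a 20 bit address")
print(addressable_bytes(24) // MBYTE, "MByte for a 24 bit address")
```

Each extra address bit doubles the reachable memory, so four more bits give a sixteenfold increase.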

x86 Evolution (2)


Pentium
  Superscalar
  Multiple instructions executed in parallel
Pentium Pro
  Increased superscalar organization
  Branch prediction
  Data flow analysis
  Speculative execution
Pentium II
  MMX technology (multimedia extensions)
  Graphics, video & audio processing
Pentium III
  Additional floating point instructions for 3D graphics

x86 Evolution (3)


Pentium 4
  Further floating point and multimedia enhancements
Core
  First x86 with dual core (two processors on a chip)
Core 2
  64 bit architecture
Core 2 Quad - 3 GHz - 820 million transistors
  Four processors on chip
x86 architecture dominant outside embedded systems
Instruction set architecture evolved with backwards compatibility
  ~1 instruction per month added
  500 instructions available

ARM Evolution
Designed by ARM Ltd., Cambridge, England
Licensed to manufacturers
High speed, small die, low power consumption
PDAs, hand held games, phones
  E.g. iPod, iPhone
Widely used embedded processor
Acorn produced ARM1 & ARM2 in 1985 and ARM3 in 1989
Acorn, VLSI and Apple Computer founded ARM Ltd.

ARM Systems Categories


Embedded real time
  Storage systems, industrial and networking applications
Application platform
  Linux, Palm OS, Symbian OS, Windows Mobile
Secure applications
  Smart cards and payment terminals


Top Level View of Computer Function and Interconnection

Consists of CPU, memory, I/O components, and interconnection
At a top level, a computer can be described by
  The external behavior of each component: the data and control signals that it exchanges with other components
  The interconnection structure and the controls required to manage the use of the interconnection structure

Program Concept

Hardwired systems are inflexible
General purpose hardware can do different tasks, given correct control signals
Instead of re-wiring, supply a new set of control signals

What is a program?
A sequence of steps
For each step, an arithmetic or logical operation is done
For each operation, a different set of control signals is needed

Function of Control Unit


For each operation a unique code is provided
  e.g. ADD, MOVE
A hardware segment accepts the code and issues the control signals
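The idea of a code selecting a set of control signals can be sketched as a lookup table. Everything concrete here is hypothetical: the numeric codes, the signal names and the `decode` function are invented for illustration and do not describe any real control unit.

```python
# Hypothetical operation codes mapped to the control signals they assert.
CONTROL_TABLE = {
    0b0001: ("ADD", frozenset({"alu_add", "register_read", "register_write"})),
    0b0010: ("MOVE", frozenset({"register_read", "register_write"})),
}


def decode(code: int) -> tuple[str, frozenset]:
    """Return the operation name and control signals for an operation code."""
    try:
        return CONTROL_TABLE[code]
    except KeyError:
        raise ValueError(f"unknown operation code: {code:#06b}") from None


for code in (0b0001, 0b0010):
    name, signals = decode(code)
    print(name, sorted(signals))
```

Changing what the hardware does now means supplying a different code, not re-wiring the table.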

Components
The Control Unit and the Arithmetic and Logic Unit constitute the Central Processing Unit
Data and instructions need to get into the system and results out
  Input/output
Temporary storage of code and results is needed
  Main memory

Computer Components: Top Level View


Computer function
Basic function is execution of instructions
Two steps:
  Fetch
  Execute
Repeating process - the instruction cycle

Fetch Cycle
Program Counter (PC) holds address of next instruction to fetch
Processor fetches instruction from memory location pointed to by PC
Increment PC
  Unless told otherwise
Instruction loaded into Instruction Register (IR)
Processor interprets instruction and performs required actions

Execute Cycle
Processor-memory
  Data transfer between CPU and main memory
Processor-I/O
  Data transfer between CPU and I/O module
Data processing
  Some arithmetic or logical operation on data
Control
  Alteration of sequence of operations, e.g. jump
Combination of above
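The fetch and execute steps can be tied together in a small simulator. The machine below is hypothetical: a single accumulator, 16 bit instructions with a 4 bit opcode and 12 bit address, and four invented opcodes. It is a sketch of the instruction cycle in general, not of any specific processor.

```python
# Hypothetical opcodes for a one-accumulator machine.
HALT, LOAD, STORE, ADD = 0x0, 0x1, 0x2, 0x5


def run(memory: dict[int, int], start_address: int) -> int:
    """Run fetch/execute cycles until HALT; return the final accumulator value."""
    pc = start_address  # program counter
    ac = 0              # accumulator
    while True:
        # Fetch: load the instruction at PC into IR, then increment PC.
        ir = memory[pc]
        pc += 1
        # Execute: interpret the instruction and perform the action.
        opcode, address = ir >> 12, ir & 0x0FFF
        if opcode == HALT:
            return ac
        if opcode == LOAD:        # processor-memory transfer
            ac = memory[address]
        elif opcode == ADD:       # data processing
            ac = (ac + memory[address]) & 0xFFFF
        elif opcode == STORE:     # processor-memory transfer
            memory[address] = ac
        else:
            raise ValueError(f"unknown opcode {opcode:#x}")


memory = {
    0x300: 0x1940,  # LOAD  AC <- memory[0x940]
    0x301: 0x5941,  # ADD   AC <- AC + memory[0x941]
    0x302: 0x2941,  # STORE memory[0x941] <- AC
    0x303: 0x0000,  # HALT
    0x940: 3,
    0x941: 2,
}
print(run(memory, 0x300), memory[0x941])
```

A jump instruction would fit the same loop by assigning a new value to `pc` during the execute step, which is the "unless told otherwise" case of the fetch cycle.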

Example of Program Execution
