CPU Structure & Functions
CPU Function
CPU must:
1. Fetch instructions
2. Interpret/decode instructions
3. Fetch data
4. Process data
5. Write data
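The five steps above can be sketched as a toy fetch-decode-execute loop (the opcodes, instruction format, and memory layout are invented for illustration):

```python
# Toy CPU loop illustrating the five functions: fetch, decode,
# fetch data, process data, write data. Instruction format and
# opcodes are illustrative, not any real ISA.
def run(memory, pc=0):
    acc = 0                             # accumulator
    while True:
        instr = memory[pc]              # 1. fetch instruction
        pc += 1
        opcode, addr = instr            # 2. interpret/decode
        if opcode == "LOAD":
            acc = memory[addr]          # 3. fetch data
        elif opcode == "ADD":
            acc += memory[addr]         # 4. process data
        elif opcode == "STORE":
            memory[addr] = acc          # 5. write data
        elif opcode == "HALT":
            return acc

mem = [("LOAD", 4), ("ADD", 5), ("STORE", 6), ("HALT", 0), 2, 3, 0]
result = run(mem)
# mem[6] now holds 2 + 3 = 5
```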
CPU With Systems Bus
CPU Internal Structure
Registers
CPU must have some working space (temporary
storage) - registers
Number and function of registers vary between
processor designs - one of the major design decisions
Top level of memory hierarchy
Register Types
Registers can be divided into two broad categories:
1. User-visible registers
These enable the machine- or assembly-language
programmer to minimize main memory references by
optimizing use of registers
2. Control and status registers
These are used by the control unit to control the operation
of the processor
User Visible Registers
1. General Purpose
2. Data
3. Address
4. Condition Codes
General Purpose Registers
May be true general purpose
May be restricted
May be used for data or addressing
Data: accumulator (AC)
Addressing: segment, stack
General Purpose Registers
Make them general purpose
Increased flexibility and programmer options
Increased instruction size & complexity, addressing
Make them specialized
Smaller (faster) but more instructions
Less flexibility, addresses implicit in opcode
How Many GP Registers?
Typically between 8 and 32
Fewer registers mean more memory references
More registers do not noticeably reduce memory
references and take up processor real estate
How big?
Large enough to hold full address
Large enough to hold full data types
But often possible to combine two data registers or two
address registers by using more complex addressing
Condition Code Registers – Flags
Sets of individual bits, flags
e.g., result of last operation was zero
Can be read by programs
e.g., Jump if zero – simplifies branch taking
Cannot (usually) be set by programs
Control & Status Registers
1. Program Counter (PC)
2. Instruction Register (IR)
3. Memory Address Register (MAR) – connects to
address bus
4. Memory Buffer Register (MBR) – connects to data
bus, feeds other registers
Program Status Word
A set of bits
Condition Codes:
Sign (of last result)
Zero (last result)
Carry (multiword arithmetic)
Equal (two latest results)
Overflow
Interrupts enabled/disabled
Supervisor/user mode
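The PSW can be modeled as a bit vector; a minimal sketch (the bit positions here are arbitrary choices for illustration, not any real processor's layout):

```python
# Program status word as a set of flag bits; positions are illustrative.
SIGN, ZERO, CARRY, OVERFLOW, INT_ENABLE, SUPERVISOR = (1 << i for i in range(6))

def update_flags(psw, result, width=8):
    """Set condition codes from an ALU result of the given bit width."""
    mask = (1 << width) - 1
    psw &= ~(SIGN | ZERO | CARRY)       # clear codes from last operation
    if result & mask == 0:
        psw |= ZERO                     # result of last operation was zero
    if result & (1 << (width - 1)):
        psw |= SIGN                     # sign bit of last result
    if result > mask:
        psw |= CARRY                    # carry out of the top bit
    return psw

psw = update_flags(0, 0x80 + 0x80)      # 0x100: carry set, result bits zero
```

A conditional jump then just tests a bit, e.g. `if psw & ZERO: pc = target`.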
Supervisor Mode
Kernel mode
Allows privileged instructions to execute
Used by operating system
Not available to user programs
Indirect Cycle
May require memory access to fetch operands
Indirect addressing requires more memory accesses
Can be thought of as additional instruction subcycle
Instruction Cycle with Indirect
Instruction Cycle State Diagram
Data Flow (Instruction Fetch)
PC contains address of next instruction
Address moved to MAR
Address placed on address bus
Control unit requests memory read
Result placed on data bus, copied to MBR, then to IR
Meanwhile PC incremented to point to the next instruction
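The steps above map directly onto register transfers; a minimal sketch, with memory modeled as a dict lookup and register names mirroring the slides:

```python
# Register-transfer view of instruction fetch.
def fetch(cpu, memory):
    cpu["MAR"] = cpu["PC"]           # address of next instruction to MAR
    cpu["MBR"] = memory[cpu["MAR"]]  # control unit requests read; data bus -> MBR
    cpu["IR"] = cpu["MBR"]           # instruction copied on to IR
    cpu["PC"] += 1                   # meanwhile, PC incremented
    return cpu

cpu = {"PC": 0, "MAR": 0, "MBR": 0, "IR": 0}
fetch(cpu, {0: 0x1234, 1: 0x5678})
# cpu["IR"] == 0x1234, cpu["PC"] == 1
```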
Data Flow (Fetch Diagram)
Data Flow (Data Fetch)
IR is examined
If indirect addressing, indirect cycle is performed
Rightmost n bits of MBR (address part of instruction)
transferred to MAR
Control unit requests memory read
Result (address of operand) moved to MBR
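Assuming one level of indirection and an instruction whose rightmost n bits hold the address field, the data fetch with an indirect cycle can be sketched as:

```python
# Operand fetch with one level of indirection: the address field of the
# instruction points at a memory word holding the operand's address.
# The 12-bit address field is an illustrative assumption.
def fetch_operand(cpu, memory, addr_bits=12):
    mask = (1 << addr_bits) - 1
    cpu["MAR"] = cpu["IR"] & mask    # rightmost n bits: address part of instruction
    cpu["MBR"] = memory[cpu["MAR"]]  # read: address of operand arrives in MBR
    cpu["MAR"] = cpu["MBR"]          # indirect cycle: use it as the address
    cpu["MBR"] = memory[cpu["MAR"]]  # read again: the operand itself
    return cpu["MBR"]
```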
Data Flow (Indirect Diagram)
Data Flow (Execute)
May take many forms, depends on instruction being
executed
May include
Memory read/write
Input/Output
Register transfers
ALU operations
Data Flow (Interrupt)
Current PC saved to allow resumption after interrupt
Contents of PC copied to MBR
Address of a special memory location (e.g., the stack top,
given by the stack pointer) loaded to MAR
MBR written to memory according to content of MAR
PC loaded with address of interrupt handling routine
Next instruction (first of interrupt handler) can be
fetched
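The interrupt entry sequence above can be sketched as register transfers (the handler address and the downward-growing stack are illustrative assumptions):

```python
# Interrupt entry: save the current PC on a memory stack, then load
# the PC with the handler's address so the next fetch gets the
# handler's first instruction.
def enter_interrupt(cpu, memory, handler_addr):
    cpu["MBR"] = cpu["PC"]            # contents of PC copied to MBR
    cpu["SP"] -= 1                    # stack grows downward (a common choice)
    cpu["MAR"] = cpu["SP"]            # stack location to MAR
    memory[cpu["MAR"]] = cpu["MBR"]   # MBR written to memory at MAR
    cpu["PC"] = handler_addr          # PC loaded with handler address
```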
Data Flow (Interrupt Diagram)
Prefetch
Fetch involves accessing main memory
Execution of ALU operations does not access main
memory
Can fetch next instruction during execution of current
instruction
Instruction Pipelining
Instruction pipelining is similar to the use of an
assembly line in a manufacturing plant
New inputs are accepted at one end before previously
accepted inputs appear as outputs at the other end
Improved Performance
But not doubled:
Fetch usually shorter than execution
Prefetch more than one instruction?
Any jump or branch means that prefetched instructions
are not the required instructions
Add more stages to improve performance
Two Stage Instruction Pipeline
Pipelining (six stages)
1. Fetch instruction
2. Decode instruction
3. Calculate operands (i.e., EAs)
4. Fetch operands
5. Execute instruction
6. Write result
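Assuming one cycle per stage and no branches or other stalls, the timing of such a pipeline reduces to a simple formula:

```python
# Ideal pipeline timing: with k stages (one cycle each) and n independent
# instructions, the first instruction finishes after k cycles and each
# later one finishes one cycle apart: k + (n - 1) cycles in total,
# versus n * k cycles without pipelining.
def pipelined_cycles(n, k):
    return k + (n - 1)

def speedup(n, k):
    return (n * k) / pipelined_cycles(n, k)

# Six stages, nine instructions: 6 + 8 = 14 cycles instead of 54.
```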
Timing Diagram for Instruction Pipeline Operation (assuming independence)
The Effect of a Conditional Branch/Interrupt on Instruction Pipeline Operation
Six Stage Instruction Pipeline
Speedup Factors with Instruction Pipelining
Dealing with Branches
1. Prefetch Branch Target
2. Loop buffer
3. Branch prediction
4. Delayed branching
Prefetch Branch Target
Target of branch is prefetched in addition to
instructions following branch
Keep target until branch is executed
Loop Buffer
Very fast memory
Maintained by fetch stage of pipeline
Check buffer before fetching from memory
Very good for small loops or jumps
e.g., similar in principle to a cache
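A loop buffer can be sketched as a tiny address-indexed store consulted before memory (the size and FIFO eviction policy are illustrative choices):

```python
# Fetch-stage loop buffer: the n most recently fetched instructions are
# kept in fast storage, so short loops can be refetched without a
# memory access. Returns (instruction, hit).
class LoopBuffer:
    def __init__(self, size=8):
        self.size = size
        self.lines = {}                 # address -> instruction

    def fetch(self, addr, memory):
        if addr in self.lines:          # check buffer before fetching from memory
            return self.lines[addr], True
        instr = memory[addr]
        if len(self.lines) >= self.size:
            self.lines.pop(next(iter(self.lines)))  # evict oldest-inserted entry
        self.lines[addr] = instr
        return instr, False
```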
Branch Prediction
1. Predict never taken
Assume that the jump will not happen
Always fetch the next sequential instruction
2. Predict always taken
Assume that the jump will happen
Always fetch the branch target instruction
Branch Prediction
3. Predict by Opcode
Some instructions are more likely to result in a jump
than others
Can get up to 75% success
Delayed Branch
Sometimes it is possible to improve pipeline
performance by automatically rearranging instructions
within a program.
RISC processors commonly use this method
Branch Prediction Flowchart
Branch Prediction State Diagram
(two bits)
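The two-bit scheme in the state diagram is a saturating counter; a minimal sketch:

```python
# Two-bit saturating counter matching the state diagram.
# States 0,1 predict not-taken; states 2,3 predict taken.
# From a saturated state (0 or 3), two consecutive mispredictions
# are needed before the prediction flips.
class TwoBitPredictor:
    def __init__(self, state=0):
        self.state = state

    def predict(self):
        return self.state >= 2          # True means "predict taken"

    def update(self, taken):
        if taken:
            self.state = min(self.state + 1, 3)
        else:
            self.state = max(self.state - 1, 0)

p = TwoBitPredictor(state=3)            # start in "strongly taken"
p.update(False)                         # one not-taken outcome
# p.predict() is still True: a single miss from a saturated state
# does not flip the prediction; a second miss in a row would.
```

This tolerates the once-per-iteration exit misprediction of a loop branch without flipping the prediction for the whole loop.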
Intel 80486 Pipelining
1. Fetch
Put in one of two 16-byte prefetch buffers
Fill buffer with new data as soon as old data
consumed
On average, 5 instructions fetched per 16-byte load
(instructions are variable length)
Independent of other stages to keep buffers full
2. Decode stage 1
Opcode & address-mode info
At most first 3 bytes of instruction needed for this
Can direct D2 stage to get rest of instruction
Intel 80486 Pipelining
3. Decode stage 2
Expand opcode into control signals
Computation of complex addressing modes
4. Execute
ALU operations, cache access, register update
5. Writeback
Update registers & flags
Results sent to cache