Unit - 1 Microprocessor Architecture

This document discusses the internal structure of CPUs. CPUs contain registers for temporary storage of data and addresses during operations. These include general purpose registers, condition code registers, and control/status registers like the program counter. Pipelining improves CPU performance by overlapping the fetch, decode, execute, and writeback stages of instruction processing, but it can cause pipeline hazards, such as data hazards when an instruction depends on a result that has not yet been written. The document also covers techniques for dealing with pipeline hazards and branch instructions, including branch prediction, delayed branching, and maintaining multiple instruction streams.

UNIT -1

MICROPROCESSOR
ARCHITECTURE

CPU Structure
CPU must:
Fetch instructions
Interpret instructions
Fetch data
Process data
Write data

CPU With Systems Bus

CPU Internal Structure

Registers
CPU must have some working space
(temporary storage) called registers
Number and function vary between
processor designs - One of the major
design decisions
Top level of memory hierarchy

User Visible Registers

General Purpose
Data
Address
Condition Codes

General Purpose Registers (1)

May be true general purpose


May be restricted
May be used for data or addressing
Data
Accumulator

Addressing
Segment

General Purpose Registers (2)

Make them general purpose


Increase flexibility and programmer options
Increase instruction size & complexity

Make them specialized


Smaller (faster) instructions
Less flexibility

How Many GP Registers?


Between 8 - 32
Fewer = more memory references

How big?
Large enough to hold full address
Large enough to hold full word
Often possible to combine two data
registers

Condition Code Registers


Sets of individual bits
e.g. result of last operation was zero

Can be read (implicitly) by programs


e.g. Jump if zero

Can not (usually) be set by programs

Control & Status Registers

Program Counter
Instruction Register
Memory Address Register
Memory Buffer Register

Program Status Word

A set of bits
Includes Condition Codes
Sign of last result
Zero
Carry
Equal
Overflow
Interrupt enable/disable
Supervisor
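
As a rough illustration of how condition codes follow from the last operation, the sketch below computes four of the flags listed above (sign, zero, carry, overflow) for an unsigned add. The function name `add_with_flags` and the 8-bit default width are assumptions made for this example, not taken from any particular processor.

```python
def add_with_flags(a, b, bits=8):
    """Add two values in a register of the given width and derive condition codes."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    total = a + b              # full result, may exceed the register width
    result = total & mask      # value actually stored in the register
    flags = {
        "sign": bool(result & sign_bit),   # top bit of the result
        "zero": result == 0,               # result of last operation was zero
        "carry": total > mask,             # carry out of the top bit
        # signed overflow: operands share a sign that the result does not
        "overflow": bool(~(a ^ b) & (a ^ result) & sign_bit),
    }
    return result, flags


result, flags = add_with_flags(0x7F, 0x01)
wrapped, wrapped_flags = add_with_flags(0xFF, 0x01)
```

A conditional jump such as "jump if zero" then only needs to test `flags["zero"]`, which is what "read implicitly" means on the earlier slide.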

Supervisor Mode

Intel ring zero


Kernel mode
Allows privileged instructions to execute
Used by operating system
Not available to user programs

Other Registers
May have registers pointing to:
Process control blocks
Interrupt Vectors

CPU design and operating system design are closely linked

Example Register Organizations

Indirect Cycle


May require memory access to fetch
operands
Indirect addressing requires more
memory accesses
Can be thought of as additional instruction
subcycle

Instruction Cycle with Indirect

Instruction Cycle State Diagram

Data Flow (Instruction Fetch)


Depends on CPU design
In general:
Fetch
PC contains address of next instruction
Address moved to MAR
Address placed on address bus
Control unit requests memory read
Result placed on data bus, copied to MBR,
then to IR
Meanwhile PC incremented by 1
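
The fetch steps above can be sketched as register transfers. This is a minimal model, assuming the CPU state is a plain dictionary and memory is a dictionary keyed by address; the memory contents are hypothetical.

```python
def fetch(cpu, memory):
    """One instruction fetch, following the register transfers listed above."""
    cpu["MAR"] = cpu["PC"]            # address of next instruction moved to MAR
    cpu["MBR"] = memory[cpu["MAR"]]   # memory read: result copied to MBR
    cpu["IR"] = cpu["MBR"]            # then copied from MBR to IR
    cpu["PC"] += 1                    # PC incremented (in hardware this overlaps the read)
    return cpu


memory = {0: "LOAD 5", 1: "ADD 6", 5: 40, 6: 2}   # hypothetical program and data
cpu = {"PC": 0, "MAR": None, "MBR": None, "IR": None}
fetch(cpu, memory)
```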

Data Flow (Data Fetch)


IR is examined
If indirect addressing, indirect cycle is
performed
Right most N bits of MBR transferred to MAR
Control unit requests memory read
Result (address of operand) moved to MBR
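
The indirect cycle can be sketched the same way. The 8-bit address field (`N = 8`) and the instruction encoding used here are assumptions for illustration only.

```python
N = 8  # assumed width of the address field, in bits


def indirect_cycle(cpu, memory):
    """Replace the address field held in MBR with the operand address it points to."""
    cpu["MAR"] = cpu["MBR"] & ((1 << N) - 1)  # rightmost N bits of MBR transferred to MAR
    cpu["MBR"] = memory[cpu["MAR"]]           # memory read: address of operand moved to MBR
    return cpu


memory = {0x10: 0x20, 0x20: 99}       # location 0x10 holds the operand's address
cpu = {"MAR": 0, "MBR": 0x310}        # hypothetical instruction: opcode 0x3, address field 0x10
indirect_cycle(cpu, memory)
```

After the cycle, MBR holds 0x20, the address of the operand, so one more memory read is needed to obtain the operand itself. That extra access is why indirect addressing requires more memory accesses.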

Data Flow (Fetch Diagram)

Data Flow (Indirect Diagram)

Data Flow (Execute)


May take many forms
Depends on instruction being executed
May include
Memory read/write
Input/Output
Register transfers
ALU operations

Data Flow (Interrupt)


Simple
Predictable
Current PC saved to allow resumption
after interrupt
Contents of PC copied to MBR
Special memory location (e.g. stack
pointer) loaded to MAR
MBR written to memory
PC loaded with address of interrupt
handling routine
Next instruction (first of interrupt handler)
can be fetched
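
A minimal sketch of the interrupt data flow, assuming the return address is saved on a stack that grows downward and that the handler address is supplied by the caller. Both are assumptions of this example, since real processors differ in where the PC is saved.

```python
def enter_interrupt(cpu, memory, handler_address):
    """Save the current PC and redirect fetching to the interrupt handler."""
    cpu["MBR"] = cpu["PC"]             # contents of PC copied to MBR
    cpu["SP"] -= 1                     # assumed: stack grows toward lower addresses
    cpu["MAR"] = cpu["SP"]             # special memory location loaded to MAR
    memory[cpu["MAR"]] = cpu["MBR"]    # MBR written to memory
    cpu["PC"] = handler_address        # PC loaded with address of handling routine
    return cpu


memory = {}
cpu = {"PC": 42, "SP": 256, "MAR": None, "MBR": None}
enter_interrupt(cpu, memory, handler_address=1000)
```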

Data Flow (Interrupt Diagram)

Prefetch
Fetch accessing main memory
Execution usually does not access main
memory
Can fetch next instruction during
execution of current instruction
Called instruction prefetch

Improved Performance
But not doubled:
Fetch usually shorter than execution
Any jump or branch means that prefetched
instructions are not the required instructions

Add more stages to improve performance
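
To see why prefetch does not double performance, the sketch below compares total time with and without overlap for a two-stage fetch/execute pipeline. It assumes no branches and that each overlapped step takes as long as the slower of the two stages; the timings are made-up units.

```python
def sequential_time(n, fetch_time, execute_time):
    """Total time when each instruction is fetched and then executed, with no overlap."""
    return n * (fetch_time + execute_time)


def prefetch_time(n, fetch_time, execute_time):
    """Total time when the next fetch overlaps the current execution."""
    step = max(fetch_time, execute_time)   # each overlapped step waits for the slower stage
    return fetch_time + (n - 1) * step + execute_time


n = 100
slow = sequential_time(n, fetch_time=1, execute_time=3)
fast = prefetch_time(n, fetch_time=1, execute_time=3)
speedup = slow / fast
```

With fetch shorter than execution, the fetch stage spends part of every step waiting, so the speedup stays well below 2.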

Pipelining

Fetch instruction
Decode instruction
Calculate operands (i.e. EAs)
Fetch operands
Execute instructions
Write result

Overlap these operations

Two Stage Instruction Pipeline

Timing Diagram for Instruction Pipeline Operation

The Effect of a Conditional Branch on Instruction Pipeline Operation

Six Stage
Instruction Pipeline

Alternative Pipeline Depiction

Speedup Factors
with Instruction
Pipelining
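
For an idealized pipeline, the speedup can be estimated with a short calculation. The sketch assumes k stages of equal duration, n instructions, and no stalls or branches. Under those assumptions the pipelined run takes k + (n - 1) cycles, against n * k cycles without pipelining.

```python
def pipelined_cycles(k, n):
    """Cycles to run n instructions on an ideal k-stage pipeline."""
    return k + (n - 1)   # first instruction takes k cycles, then one completes per cycle


def pipeline_speedup(k, n):
    """Ratio of unpipelined time (n * k cycles) to ideal pipelined time."""
    return (n * k) / pipelined_cycles(k, n)


cycles = pipelined_cycles(6, 9)      # nine instructions, six stages
speedup = pipeline_speedup(6, 9)
```

As n grows, the speedup approaches k, so in the ideal case a deeper pipeline gives a larger potential gain. Hazards and branches reduce it in practice.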

Pipeline Hazards
Pipeline, or some portion of pipeline, must
stall
Also called pipeline bubble
Types of hazards
Resource
Data
Control

Resource Hazards

Two (or more) instructions in pipeline need same resource


Executed in serial rather than parallel for part of pipeline
Also called structural hazard
E.g. Assume simplified five-stage pipeline
Each stage takes one clock cycle

Ideal case is new instruction enters pipeline each clock cycle


Assume main memory has single port
Assume instruction fetches and data reads and writes performed
one at a time
Ignore the cache
Operand read or write cannot be performed in parallel with
instruction fetch
Fetch instruction stage must idle for one cycle before fetching I3
E.g. multiple instructions ready to enter execute instruction phase
Single ALU
One solution: increase available resources
Multiple main memory ports
Multiple ALUs

Data Hazards

Conflict in access of an operand location


Two instructions to be executed in sequence
Both access a particular memory or register operand
If in strict sequence, no problem occurs
If in a pipeline, operand value could be updated so as to
produce different result from strict sequential execution
E.g. x86 machine instruction sequence:
ADD EAX, EBX    /* EAX = EAX + EBX */
SUB ECX, EAX    /* ECX = ECX - EAX */

ADD instruction does not update EAX until end of stage 5, at clock cycle 5
SUB instruction needs value at beginning of its stage 2, at clock cycle 4
Pipeline must stall for two clock cycles
Without special hardware and specific avoidance
algorithms, results in inefficient pipeline usage
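
The two-cycle stall in the ADD/SUB example follows from simple arithmetic. The sketch assumes no forwarding hardware: a value written at the end of one cycle can first be read at the start of the next cycle. The function name is hypothetical.

```python
def stall_cycles(write_done_cycle, read_needed_cycle):
    """Cycles the consumer must wait for a producer's result, with no forwarding."""
    earliest_read = write_done_cycle + 1          # value usable from the following cycle
    return max(0, earliest_read - read_needed_cycle)


# ADD updates EAX at the end of cycle 5; SUB wants EAX at the start of cycle 4.
stalls = stall_cycles(write_done_cycle=5, read_needed_cycle=4)
```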

Data Hazard Diagram

Types of Data Hazard


Read after write (RAW), or true dependency
An instruction modifies a register or memory location
Succeeding instruction reads data in that location
Hazard if read takes place before write complete

Write after read (WAR), or antidependency


An instruction reads a register or memory location
Succeeding instruction writes to location
Hazard if write completes before read takes place

Write after write (WAW), or output dependency


Two instructions both write to same location
Hazard if writes take place in reverse of the intended order

Previous example is RAW hazard
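
The three dependency types can be detected by comparing the locations each instruction reads and writes. This is a minimal sketch: describing an instruction as a pair of `reads` and `writes` sets is an assumption of this example. It reports potential hazards only, since whether a stall actually occurs depends on pipeline timing.

```python
def classify_hazards(first, second):
    """Return the dependency types between two instructions in program order."""
    hazards = []
    if first["writes"] & second["reads"]:
        hazards.append("RAW")   # second reads what first writes (true dependency)
    if first["reads"] & second["writes"]:
        hazards.append("WAR")   # second writes what first reads (antidependency)
    if first["writes"] & second["writes"]:
        hazards.append("WAW")   # both write the same location (output dependency)
    return hazards


# ADD EAX, EBX followed by SUB ECX, EAX
add = {"reads": {"EAX", "EBX"}, "writes": {"EAX"}}
sub = {"reads": {"ECX", "EAX"}, "writes": {"ECX"}}
found = classify_hazards(add, sub)
```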

Resource Hazard Diagram

Control Hazard
Also known as branch hazard
Pipeline makes wrong decision on branch
prediction
Brings instructions into pipeline that must
subsequently be discarded

Dealing with Branches
Multiple Streams
Prefetch Branch Target
Loop buffer
Branch prediction
Delayed branching

Multiple Streams
Have two pipelines
Prefetch each branch into a separate
pipeline
Use appropriate pipeline
Leads to bus & register contention
Multiple branches lead to further pipelines
being needed

Prefetch Branch Target


Target of branch is prefetched in addition
to instructions following branch
Keep target until branch is executed
Used by IBM 360/91

Loop Buffer

Very fast memory


Maintained by fetch stage of pipeline
Check buffer before fetching from memory
Very good for small loops or jumps
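
A loop buffer behaves like a small instruction store that is checked before main memory. The sketch below is a simplification: it assumes a fixed capacity and drops the oldest entry when full. Actual designs hold a run of sequentially fetched instructions. The class and counter names are hypothetical.

```python
from collections import OrderedDict


class LoopBuffer:
    """Small, fast store of recently fetched instructions."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def fetch(self, address, memory):
        if address in self.entries:          # check buffer before fetching from memory
            self.hits += 1
            return self.entries[address]
        self.misses += 1
        instruction = memory[address]        # slow path: main memory access
        self.entries[address] = instruction
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # discard the oldest entry
        return instruction


memory = {100: "LOAD", 101: "ADD", 102: "DEC", 103: "JNZ 100"}  # hypothetical 4-instruction loop
buffer = LoopBuffer(capacity=8)
for _ in range(3):                  # three iterations of the loop
    for address in range(100, 104):
        buffer.fetch(address, memory)
```

Only the first iteration touches main memory. Every later iteration is served from the buffer, which is why the technique suits small loops.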

Loop Buffer Diagram

Branch Prediction (1)


Predict never taken
Assume that jump will not happen
Always fetch next instruction

Predict always taken


Assume that jump will happen
Always fetch target instruction
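
The two static policies can be compared on a recorded sequence of branch outcomes. The trace below is made up: it models one loop-closing branch that is taken nine times and then falls through.

```python
def accuracy(outcomes, predict_taken):
    """Fraction of branches for which a fixed prediction matches the outcome."""
    correct = sum(1 for taken in outcomes if taken == predict_taken)
    return correct / len(outcomes)


outcomes = [True] * 9 + [False]        # hypothetical trace of a 10-iteration loop
never_taken = accuracy(outcomes, predict_taken=False)
always_taken = accuracy(outcomes, predict_taken=True)
```

For this loop-heavy trace, always-taken is the better fixed policy. Code dominated by rarely taken branches would favor never-taken.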

Branch Prediction (2)


Predict by Opcode
Some instructions are more likely to result in a
jump than others
Can get up to 75% success

Taken/Not taken switch


Based on previous history
Good for loops
Refined by two-level or correlation-based branch
history

Correlation-based
In loop-closing branches, history is good predictor
In more complex structures, branch direction
correlates with that of related branches
Use recent branch history as well
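
A history-based switch can be sketched as a 2-bit saturating counter, one common form of dynamic predictor. It is an assumption of this example and not necessarily the exact state machine in the slide's diagram. States 0 and 1 predict not taken, and states 2 and 3 predict taken.

```python
class TwoBitPredictor:
    """2-bit saturating counter: the prediction flips only after two misses in a row."""

    def __init__(self, state=0):
        self.state = state              # 0..3

    def predict(self):
        return self.state >= 2          # True means "predict taken"

    def update(self, taken):
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_correct(outcomes, predictor):
    """Run the predictor over a trace and count correct predictions."""
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct


loop_pass = [True] * 9 + [False]        # hypothetical loop-closing branch
correct = count_correct(loop_pass * 2, TwoBitPredictor(state=3))
```

Over two passes of the loop, the only mispredictions are the two loop exits. A single-bit switch would also mispredict the first branch of the second pass, because one not-taken outcome is enough to flip it.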

Branch Prediction (3)


Delayed Branch
Do not take jump until you have to
Rearrange instructions

Branch Prediction Flowchart

Branch Prediction State Diagram

Dealing With
Branches
