Processor Organization & Pipelining
CSC 227
Computer Architecture and Organization I
cont….
• Desktop and server computers typically use CISC, while tablets, smartphones, and other embedded devices use RISC.
• The higher efficiency of the RISC architecture makes it desirable in these applications, where cycles and power are usually in short supply.
• A CISC instruction set typically includes many instructions with different sizes
and execution cycles, which makes CISC instructions harder to pipeline.
Characteristic of CISC Processors
• A CISC instruction can be thought of as many different types of operations bundled into one instruction.
• A large number of instructions - typically from 100 to 250 instructions.
• Some instructions that perform specialized tasks and are used infrequently.
• A large variety of addressing modes - typically 5 to 20 different modes.
• Variable-length instruction formats
• Instructions that manipulate operands in memory.
Properties of a CISC Processor
1. Richer instruction set: some instructions simple, some very complex.
2. Instructions generally take more than one clock cycle to execute.
3. Instructions of variable size.
4. Instructions interface with memory through multiple mechanisms with complex addressing modes.
5. No pipelining (in classic designs).
6. Microcode control makes the CISC instruction set possible and flexible.
7. Works well with simpler compilers.
Advantage
• Microprogramming is as easy as assembly language to implement, and much less expensive than hardwiring a control unit.
• Memory references (loads and stores) are slow and account for a significant fraction of all instructions.
• The instruction set and chip hardware become more complex with each generation of computers.
CISC Instruction Example
A CISC processor could multiply 5 by 10 as follows:
Mov ax, 10   ; load 10 into AX
Mov bx, 5    ; load 5 into BX
Mul bx       ; AX = AX * BX = 50
RISC: Reduced Instruction Set Computer
History
The first RISC projects came from IBM, Stanford, and UC Berkeley in the late 1970s and early 1980s.
The IBM 801, Stanford MIPS, and Berkeley RISC 1 and 2 were all designed
with a similar philosophy which has become known as RISC.
• Code expansion: since a CISC machine performs complex actions with a single instruction where a RISC machine may require multiple instructions for the same action, code expansion can be a problem.
• System design: another problem facing RISC machines is that they require very fast memory systems to feed them instructions. RISC-based systems typically contain large memory caches, usually on the chip itself; this is known as a first-level cache.
RISC Instruction Example
• In RISC, the microprocessor's designers might make sure that ADD executes in one clock cycle.
• Then a compiler could multiply a and b by adding a to itself b times, or b to itself a times.
Mov ax, 0    ; accumulator for the result
Mov bx, 10   ; value to be added repeatedly
Mov cx, 5    ; loop counter
Begin:
Add ax, bx   ; ax = ax + bx
Loop Begin   ; decrement cx; repeat until cx = 0 (loops cx times)
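The repeated-addition loop above can be sketched in Python (a hypothetical illustration of the idea; real RISC compilers typically use shift-and-add or a hardware multiply when one is available):

```python
def multiply(a, b):
    """Multiply a by b using only addition, the way the RISC loop
    above does: add a to itself b times."""
    result = 0
    for _ in range(b):   # mirrors the "Loop Begin" counter in CX
        result += a      # one single-cycle ADD per iteration
    return result

print(multiply(10, 5))  # 50
```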
RISC 5 Stage Pipelining
Five-stage "RISC" load-store architecture
1. Instruction fetch (IF)
• Get instruction from memory, increment PC.
2. Instruction Decode (ID)
• Translate opcode into control signals and read registers.
3. Execute (EX)
• Perform ALU operation, compute jump/branch target
4. Memory (MEM)
• Access memory if needed
5. Writeback (WB)
• Update register file
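As a rough illustration of how these five stages overlap, the following Python toy model (an assumption: an ideal pipeline with no hazards, not any real processor's logic) computes which stage each instruction occupies in a given clock cycle:

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def stage_of(instr_index, cycle):
    """Return the stage that instruction instr_index occupies in the
    given cycle of an ideal 5-stage pipeline, or None if it is not
    in the pipeline.  Instruction i enters IF at cycle i."""
    s = cycle - instr_index
    return STAGES[s] if 0 <= s < len(STAGES) else None

# Trace three instructions through cycles 0-6: a new instruction
# enters IF each cycle while earlier ones advance one stage.
for cycle in range(7):
    print(f"cycle {cycle}:", [stage_of(i, cycle) or "--" for i in range(3)])
```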
Example: CISC and RISC Instructions
Comparisons between CISC and RISC Processors
• Instructions take more cycles in CISC than in RISC.
• CISC has more complex instructions than RISC.
• A CISC program typically contains fewer instructions than the equivalent RISC program.
• CISC implementations tend to be slower than RISC implementations.
• RISC design is approximately twice as cost-effective as CISC.
• RISC architectures are designed for good cost/performance, whereas CISC architectures are designed for good performance on slow memories.
Comparisons between CISC and RISC Processors
cont….
CISC                                    RISC
Emphasis on hardware                    Emphasis on software
Slower: an instruction can take         Faster: instructions usually take
more than one clock cycle               one clock cycle
Pipelining - Introduction
In a typical system, speedup is achieved through parallelism at all levels: multi-user, multitasking, multi-processing, multi-programming, multi-threading, and compiler optimizations.
• Pipelining is a technique for overlapping operations during execution. Today this is a key feature that makes CPUs fast.
• Different types of pipeline: instruction pipelines, operation pipelines, multi-issue pipelines.
What is Pipelining? - 1
What is a Pipeline? - 2
Pipeline Characteristics
• Throughput: the number of items (cars, instructions, operations) that exit the pipeline per unit time, e.g. 1 instruction/clock cycle, 10 cars/hour, 10 floating-point operations/cycle.
• Stage time: the pipeline designer's goal is to balance the length of each pipeline stage (a balanced pipeline). In general,
Stage time = Time per instruction on non-pipelined machine / number of stages
• In many instances, stage time = max(times for all stages).
• CPI: pipelining yields a reduction in cycles per instruction; at steady state, the effective time per instruction ≈ stage time.
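The two stage-time rules above can be sketched in Python (the numbers below are hypothetical):

```python
def balanced_stage_time(unpipelined_time, num_stages):
    """Ideal stage time when the pipeline is perfectly balanced:
    non-pipelined time divided evenly across the stages."""
    return unpipelined_time / num_stages

def actual_stage_time(stage_times):
    """In practice the clock must accommodate the slowest stage."""
    return max(stage_times)

# Hypothetical numbers: an 8 ns unpipelined datapath cut into 4 stages
print(balanced_stage_time(8.0, 4))              # 2.0 ns if balanced
print(actual_stage_time([2.5, 2.0, 1.5, 2.0]))  # 2.5 ns when unbalanced
```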
Pipeline Analogy – The Laundry
Pipeline Analogy – The Laundry - Sequential Operation
(Figure: four laundry loads A-D done sequentially starting at 6 PM; each load takes four 30-minute steps, so the last load finishes at 2 AM, 8 hours in total.)
Pipeline Analogy – The Laundry - Overlapping Tasks
(Figure: the same four loads A-D with their steps overlapped; a new load starts every 30 minutes, and all four finish by 9:30 PM, 3.5 hours in total.)
Pipeline Analogy – The Laundry (cont’d)
• Key idea: break big computation up into pieces
(Figure: the computation is divided into stages of about 1 ns each, separated by pipeline registers.)
Pipelining Analogy – Grading of Exam
(Figure: a k-stage pipeline; input tasks flow through Stage 1, Stage 2, ..., Stage k, with a buffer between successive stages.)
(Figure: pipelined execution of lw $1, 100($0); lw $2, 200($0); lw $3, 300($0). Each instruction passes through Instruction Fetch, Register Read, ALU, Memory access, and Register Write, with each stage taking 200 ps; a new instruction starts every 200 ps.)
Performance Issues in Pipelining
• Speedup: how much performance improvement we gain through pipelining.
▪ n: Number of tasks to be performed
Performance Issues in Pipelining – cont’d
• Pipelined machine (k stages)
▪ tp: Clock cycle time (time to complete each sub-operation)
▪ tk: Time required to complete all n tasks
▪ tk = (k + (n − 1)) · tp
• Non-pipelined machine
▪ tn: Time required to complete each task
• Speedup
▪ Sk: Speedup
▪ Sk = n · tn / ((k + (n − 1)) · tp)
Performance Issue - Example
Example
- 4-stage pipeline
- Sub-operation in each stage: tp = 20 ns
- 100 tasks to be executed
- Time per task on the non-pipelined system: tn = 20 × 4 = 80 ns
Pipelined system
tk = (k + (n − 1)) · tp = (4 + 99) × 20 = 2060 ns
Non-pipelined system
n · tn = 100 × 80 = 8000 ns
Speedup
Sk = 8000 / 2060 = 3.88
A 4-stage pipeline is, in effect, comparable to a system with 4 identical functional units.
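The formulas and numbers from the example above can be checked with a short Python sketch:

```python
def pipelined_time(k, n, tp):
    """Total time for n tasks on a k-stage pipeline with cycle time tp:
    tk = (k + (n - 1)) * tp."""
    return (k + (n - 1)) * tp

def speedup(k, n, tp, tn):
    """Sk = n * tn / ((k + (n - 1)) * tp), with tn the time per task
    on the non-pipelined machine."""
    return (n * tn) / pipelined_time(k, n, tp)

# Numbers from the example: 4 stages, tp = 20 ns, 100 tasks, tn = 80 ns
print(pipelined_time(4, 100, 20))          # 2060 ns
print(round(speedup(4, 100, 20, 80), 2))   # 3.88
```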
Performance Issue – Example (cont’d)
(Figure: a system with four identical functional units P1-P4 operating in parallel.)
Pipeline Performance – Example 2
• Consider an unpipelined design (Design 1) with a 10 ns clock, in which ALU operations (40% of the mix) and branches (20%) take 4 cycles and memory operations (40%) take 5 cycles, and a pipelined design (Design 2) in which each instruction completes in one clock cycle with 1 ns of setup and clock-skew overhead. Compare the average instruction times.
• Design 1:
• Average instruction execution time = clock cycle time × CPI
• = 10 ns × (4 × 0.4 + 4 × 0.2 + 5 × 0.4) = 10 × (1.6 + 0.8 + 2.0)
• = 44 ns
• Design 2:
• Average instruction time at steady state equals the clock cycle time:
• = 10 ns + 1 ns (for setup and clock skew) = 11 ns
• Speedup = 44 / 11 = 4
Pipeline Performance – Example3
• Assume the times for the functional units of a pipeline are 10 ns, 8 ns, 10 ns, 10 ns, and 7 ns, with an overhead of 1 ns per stage. Compute the speedup of the datapath.
• Pipelined: stage time = max(10, 8, 10, 10, 7) + overhead
• = 10 + 1 = 11 ns.
• This is the average instruction execution time at steady state.
• Non-pipelined: 10 + 8 + 10 + 10 + 7 = 45 ns
• Speedup = 45 / 11 ≈ 4.1 times
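A sketch of the same calculation in Python, using the stage times from the example:

```python
def pipeline_clock(stage_times, overhead):
    """Clock period = slowest stage plus per-stage overhead
    (latch setup and clock skew)."""
    return max(stage_times) + overhead

stages = [10, 8, 10, 10, 7]        # ns, from the example above
unpipelined = sum(stages)          # all stages executed in sequence
clock = pipeline_clock(stages, 1)  # steady-state time per instruction
print(unpipelined, clock, round(unpipelined / clock, 1))  # 45 11 4.1
```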
Performance Issue – (cont’d)
• Efficiency: the efficiency of a pipeline can be measured as the ratio of its busy time span to the total time span, including idle time.
• Let n be the number of tasks, m the number of stages, and c the clock period of the pipeline; the efficiency E can be written as:
• E = (n · m · c) / (m · [m · c + (n − 1) · c]) = n / (m + (n − 1))
• As n → ∞, E approaches 1.
Performance Issue – (cont’d)
• Throughput: the throughput of a pipeline is the number of results it completes per unit time.
• It can be written as:
▪ T = (n / [m + (n − 1)]) / c = E / c
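Both formulas can be expressed directly in Python; the task counts and the 11 ns clock below are hypothetical:

```python
def efficiency(n, m):
    """E = n / (m + (n - 1)) for n tasks on an m-stage pipeline."""
    return n / (m + (n - 1))

def throughput(n, m, c):
    """T = E / c: results completed per unit time with clock period c."""
    return efficiency(n, m) / c

print(round(efficiency(100, 4), 3))  # 0.971: long runs keep stages busy
print(round(efficiency(4, 4), 3))    # 0.571: short runs waste fill/drain cycles
print(round(throughput(100, 4, 11), 4))  # results per ns with an 11 ns clock
```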
Speedup - Example
• Consider an unpipelined processor with a 1 ns clock cycle. It uses 4 cycles for ALU operations and branches, and 5 cycles for memory operations; the relative frequencies of these operations are 40%, 20%, and 40%, respectively. Suppose that, due to clock skew and setup, pipelining adds 0.2 ns of overhead to the clock. Ignoring any latency impact, how much speedup in the instruction execution rate do we gain from a pipeline?
Average instruction execution time
= 1 ns × ((40% + 20%) × 4 + 40% × 5)
= 4.4 ns
Speedup from pipelining
= Average instruction time unpipelined / Average instruction time pipelined
= 4.4 ns / 1.2 ns ≈ 3.7
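A quick check of this calculation as a CPI-weighted average:

```python
def avg_instr_time(clock, mix):
    """Average instruction time = clock * sum(frequency * cycles)
    over the (frequency, cycles) pairs of the instruction mix."""
    return clock * sum(freq * cycles for freq, cycles in mix)

# Mix from the example: ALU ops (40%) and branches (20%) take 4 cycles,
# memory operations (40%) take 5 cycles; 1 ns unpipelined clock.
unpipelined = avg_instr_time(1.0, [(0.4, 4), (0.2, 4), (0.4, 5)])
pipelined = 1.0 + 0.2   # one cycle per instruction plus 0.2 ns overhead
print(unpipelined)                        # 4.4 ns
print(round(unpipelined / pipelined, 1))  # 3.7
```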
Pipeline Hazards/Limitations
• Hazards reduce performance below the ideal speedup of the pipeline:
• Structural hazard: a resource conflict.
• The hardware cannot support all possible combinations of instructions in simultaneous overlapped execution.
• Data hazard:
• An instruction depends on the result of a previous instruction.
• Control hazard:
• Caused by branches and other instructions that change the PC.
Pipeline Stalls
• A stall is a delay (in cycles) caused by any of the hazards mentioned above.
• Speedup:
• Number of stages / (1 + pipeline stalls per instruction)
• The cycles needed to initially fill the pipeline may be included in the computation of the average stalls per instruction.
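A minimal sketch of the stall-speedup formula above, with hypothetical stall rates:

```python
def stall_speedup(num_stages, stalls_per_instr):
    """Speedup of a pipeline relative to no pipelining, accounting
    for hazard stalls: stages / (1 + average stalls per instruction)."""
    return num_stages / (1 + stalls_per_instr)

print(stall_speedup(5, 0.0))   # 5.0: ideal pipeline, no stalls
print(stall_speedup(5, 0.25))  # 4.0: one stall every fourth instruction
```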
Structural Hazards
• When more than one instruction in the pipeline needs to access a
resource, the datapath is said to have a structural hazard.
• Examples of resources: register file, memory, ALU.
• Solution: Stall the pipeline for one clock cycle when the conflict is
detected. This results in a pipeline bubble.
• Figures 4 & 5 illustrate the memory access conflict and how it is
resolved by stalling an instruction.
• Problem: one memory port.
Structural Hazards and Stalls - Conflicts
Structural Hazards and Stalls - Solution
Structural Hazards and Stalls - Bubble
Structural Hazard - Example
• A machine with a load structural hazard: data references constitute 40% of the instruction mix, and the ideal CPI is 1. The machine with the hazard has a clock rate 1.05 times that of the machine without it. Which machine is faster: the one without the hazard (machine A) or the one with the hazard (machine B)? Prove it.
• Solution: the hazard affects 40% of machine B's instructions.
• Average instruction time for machine A: CPI × clock cycle time = 1 × x = 1.0x
Structural Hazard - Solution
• Average instruction time for machine B:
1) The CPI is extended: 40% of instructions take 1 more cycle.
2) The clock is faster: 1.05 times machine A's. By how much does this help?
• Average instruction time for machine B: (1 + 0.4 × 1) × (clock cycle time / 1.05)
= 1.4 × x / 1.05 ≈ 1.33x
• Hence machine A (without the hazard) is faster.
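The comparison can be verified numerically; machine A's clock cycle time x is taken as 1.0 purely for illustration:

```python
def avg_time(cpi, clock):
    """Average instruction time = CPI * clock cycle time."""
    return cpi * clock

x = 1.0                           # machine A's clock cycle time (arbitrary unit)
time_a = avg_time(1.0, x)         # A: no hazard, ideal CPI of 1
time_b = avg_time(1.4, x / 1.05)  # B: 40% of instructions stall 1 cycle,
                                  #    but the clock is 1.05x faster
print(round(time_a, 2), round(time_b, 2))  # 1.0 1.33 -> A is faster
```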
Data Hazard - Example
• Consider the instruction sequence:
• ADD R1, R2, R3 ; result is in R1
• SUB R4, R5, R1
• AND R6, R1, R7
• OR R8, R1, R9
• XOR R10, R1, R11
• All the instructions after the first use R1.
Data Hazard - Solution
Data Hazard – Example
Example 1:
add $s0, $t0, $t1
sub $t2, $s0, $t3
In this example, the second instruction depends on the result left in $s0 by the first instruction:
if $s0 = −5 before the add
and $s0 = 8 after the add,
then the value 8 should be used by the second instruction (sub).
Draw the multiple-clock-cycle pipeline diagram for the execution:
Data Hazard – Example (cont'd)
The sub instruction must read the value of $s0 in its ID stage (CC3). However, the value in $s0 during CC3 is still −5, not the correct value 8; the correct value is only in $s0 at the end of clock cycle 5 (CC5).
The dependency therefore goes from CC5 to CC3, backward in time.
Data Hazard Stalls
• Not all data hazards can be solved by forwarding:
• LW R1, 0(R2)
• SUB R4, R1, R5
• AND R6, R1, R7
• OR R8, R1, R9
• Unlike the previous example, the data is not available until the MEM/WB register, so the SUB instruction's ALU cycle has to be stalled, introducing a (vertical) bubble.
Data Hazards and Stalls
Data Hazards – Time Stage Diagram
Data Hazard Classification
• RAW – Read After Write. Most common; solved by data forwarding.
• WAW – Write After Write: instruction i (e.g. a load) comes before instruction j (e.g. an add), and both write to the same register; i must do so before j. DLX avoids this by having all instructions wait until WB to write registers, so there is no WAW hazard in DLX.
• WAR – Write After Read: instruction j tries to write a destination before it is read by instruction i, so i incorrectly gets the new value. This cannot happen in DLX, since all instructions read early (in ID) but write late (in WB).
• WAR hazards can occur, however, in complex instruction sets that have auto-increment addressing modes and require operands to be read late in the cycle.
Data Hazard Classification – Exercise
Describe each of the following categories of data hazards: RAW, WAR, WAW. Using (i – iii) below, state which is RAW, WAR, or WAW, and indicate how each occurs using an arrow.
i)  R3 ← R1 op R2    ii)  R3 ← R1 op R2    iii)  R3 ← R1 op R2
    R5 ← R3 op R4         R1 ← R4 op R5          R3 ← R6 op R7
Limitations to Speedup - 1
Limitations to Speedup - 2
Limitations to Speedup - 3
• Branch instructions and interrupts in the program:
• A program is not a straight flow of sequential instructions.
• There may be branch instructions that alter the normal flow of the program, which delays pipelined execution and affects performance.