5 Pipelined Processor: Temporal Overlapping of Processing, Assembly Line
5.1 Basic concept
5.2 Design space of pipelines
5.3 Overview of pipelined instruction processing
5.4 Pipelined execution of integer and Boolean instructions
5.5 Pipelined processing of loads and stores
Repetition Rate: R
the shortest possible time interval between subsequent independent instructions in the pipeline
Performance potential of a pipeline: P
P = 1/(R * tc), where tc is the cycle time
Example: PowerPC 603, FP double-precision multiply: R = 2, tc = 12 ns
P = 1/(R * tc) = 1/(2 * 12 ns) ≈ 41.7 MFLOPS
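The formula P = 1/(R * tc) can be checked numerically. The following is a minimal sketch, not part of the original slides; the function name `performance_potential_mflops` and its parameter names are hypothetical, and tc is assumed to be the cycle time given in nanoseconds.

```python
def performance_potential_mflops(repetition_rate: int, cycle_time_ns: float) -> float:
    """Return P = 1 / (R * tc), expressed in millions of results per second.

    repetition_rate: R, in cycles between subsequent independent instructions.
    cycle_time_ns:   tc, assumed here to be the cycle time in nanoseconds.
    """
    if repetition_rate < 1 or cycle_time_ns <= 0:
        raise ValueError("R must be >= 1 and tc must be positive")
    seconds_per_result = repetition_rate * cycle_time_ns * 1e-9
    return 1.0 / seconds_per_result / 1e6


if __name__ == "__main__":
    # Values from the slide's example: R = 2, tc = 12 ns.
    print(f"{performance_potential_mflops(2, 12.0):.1f} MFLOPS")
```

Halving R (issuing a new independent instruction every cycle) doubles the performance potential at the same cycle time.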
Performance with RAW-dependent instructions
Latency: the amount of time the result of a particular instruction takes to become available in the pipeline for a subsequent dependent instruction.
Stalled: with a latency of n cycles, the immediately following RAW-dependent instruction has to be stalled in the pipeline for n-1 cycles
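The stall rule can be sketched as a small function, assuming n denotes the latency in cycles. The name `stall_cycles` is hypothetical, and the `independent_between` parameter is an added assumption not stated on the slide: each independent instruction placed between producer and consumer is taken to hide one stall cycle.

```python
def stall_cycles(latency: int, independent_between: int = 0) -> int:
    """Stall cycles seen by a RAW-dependent instruction.

    latency:             n, cycles until the producer's result is available.
    independent_between: independent instructions scheduled between producer
                         and consumer (assumption: each hides one stall cycle).
    """
    if latency < 1 or independent_between < 0:
        raise ValueError("latency must be >= 1 and independent_between >= 0")
    # An immediately following dependent instruction waits n - 1 cycles.
    return max(0, latency - 1 - independent_between)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"latency {n}: {stall_cycles(n)} stall cycle(s)")
```

With a latency of one cycle the dependent instruction can follow without any stall, which is why short latencies matter for dependent instruction sequences.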
Improve Performance
Multiple-operation instructions
HP PA 7100
FMPYADD RM1, RM2, RM3, RA1, RA2
RM3 ← RM1 * RM2
RA2 ← RA1 + RA2
PowerPC
FMA for performing (A*C) + B
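The effect of these multiple-operation instructions can be modelled in a few lines. This is an illustrative sketch only: the register file is a plain dictionary, the function names `fmpyadd` and `fma` are hypothetical, and it assumes both source operand pairs are read before either destination is written. Python evaluates `a * c + b` with two roundings, so this models the arithmetic result, not the rounding behaviour of a fused hardware unit.

```python
def fmpyadd(regs: dict, rm1: str, rm2: str, rm3: str, ra1: str, ra2: str) -> None:
    """Model FMPYADD RM1, RM2, RM3, RA1, RA2: one multiply and one add at once."""
    # Read all sources first (assumption: the two operations are independent).
    product = regs[rm1] * regs[rm2]
    total = regs[ra1] + regs[ra2]
    regs[rm3] = product   # RM3 <- RM1 * RM2
    regs[ra2] = total     # RA2 <- RA1 + RA2


def fma(a: float, c: float, b: float) -> float:
    """Model a multiply-add computing (A*C) + B."""
    return a * c + b


if __name__ == "__main__":
    regs = {"RM1": 2.0, "RM2": 3.0, "RM3": 0.0, "RA1": 1.5, "RA2": 4.0}
    fmpyadd(regs, "RM1", "RM2", "RM3", "RA1", "RA2")
    print(regs["RM3"], regs["RA2"])
    print(fma(2.0, 3.0, 1.0))
```

Either way, one instruction delivers two floating-point operations, so the operations completed per issued instruction double for code that can use them.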
In most cases: 5-10 pipeline stages
Multiplicity of pipelines
CISC pipeline:
Execution of register-register and load/store instructions
Removing the load-use delay: bring forward the calculation of the virtual address (for slow caches)