Instruction-Level Parallelism 2
dynamic scheduling
prepared and instructed by
Shmuel Wimer
Eng. Faculty, Bar-Ilan University
Load buffers:
1. hold the components of the effective address until
it is computed,
2. track outstanding loads waiting on memory, and
3. hold the results of completed loads, waiting for the
CDB.
May 2015 Instruction-Level Parallelism 2 12
Store buffers:
1. hold the components of the effective address until
it is computed,
2. hold the destination addresses of outstanding
stores waiting for the data value to store, and
3. hold the address and data to store until the memory
unit is available.
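The three roles of a store buffer entry can be sketched as a small data structure. This is a minimal illustration only: the class name, field names, and methods are hypothetical and are not taken from the slides or from any real hardware description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoreBufferEntry:
    """Hypothetical model of one store buffer entry (names are illustrative)."""
    base: Optional[int] = None     # base register value, one address component
    offset: int = 0                # immediate offset, the other address component
    address: Optional[int] = None  # effective address, once computed
    data: Optional[float] = None   # value to store, once it has arrived

    def compute_address(self) -> None:
        # Role 1: hold the address components until the address is computed.
        if self.base is not None:
            self.address = self.base + self.offset

    def ready_for_memory(self) -> bool:
        # Roles 2 and 3: the store can go to memory only when both the
        # destination address and the data value are present.
        return self.address is not None and self.data is not None


entry = StoreBufferEntry(base=1000, offset=8)
entry.compute_address()
waiting_for_data = not entry.ready_for_memory()  # address known, data missing
```

A load buffer entry could be modeled the same way, with the data field filled by memory rather than by the CDB.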
All results of the FPU and load unit are put on the CDB,
which goes to the FP registers, to the RSs and to the
store buffers.
The adder also implements subtraction, and the
multiplier also implements division.
The Steps of an Instruction
1. Issue
Get the next instruction from the head of the queue.
Instructions are kept in a FIFO queue and are hence
issued in order.
If a matching RS is empty, issue the instruction to
that RS together with its operands, if those are
currently in the RF.
If no matching RS is empty, there is a structural
hazard. The instruction stalls until an RS is freed.
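The issue step can be sketched in a few lines. This is a simplified illustration under assumptions, not the hardware's actual logic: the field names Vj, Vk, Qj, Qk follow the common textbook convention, the helper names (`make_rs`, `issue`, `UNIT`) are hypothetical, and the register status update at the end is the usual renaming step of Tomasulo's algorithm.

```python
from collections import deque

# Assumed mapping of opcodes to functional units: the adder also handles
# subtraction and the multiplier also handles division.
UNIT = {"ADD.D": "add", "SUB.D": "add", "MUL.D": "mult", "DIV.D": "mult"}


def make_rs(name):
    """Create an empty reservation station (hypothetical field layout)."""
    return {"name": name, "busy": False, "op": None,
            "Vj": None, "Vk": None, "Qj": 0, "Qk": 0}


def issue(queue, stations, regs, reg_status):
    """Issue the instruction at the head of the FIFO queue.

    Returns the name of the RS used, or None when the queue is empty or
    no matching RS is free (structural hazard: the instruction stalls).
    """
    if not queue:
        return None
    op, rd, rs, rt = queue[0]
    free = next((s for s in stations[UNIT[op]] if not s["busy"]), None)
    if free is None:
        return None  # stall: the instruction stays at the head of the queue
    queue.popleft()
    for src, v, q in ((rs, "Vj", "Qj"), (rt, "Vk", "Qk")):
        producer = reg_status.get(src, 0)
        if producer:                 # operand not yet in the register file
            free[v], free[q] = None, producer
        else:                        # operand available: copy it into the RS
            free[v], free[q] = regs[src], 0
    free["busy"], free["op"] = True, op
    reg_status[rd] = free["name"]    # destination now renamed to this RS
    return free["name"]


stations = {"add": [make_rs("Add1")], "mult": [make_rs("Mult1")]}
regs = {"F2": 2.0, "F6": 6.0, "F8": 8.0}
reg_status = {}
queue = deque([("ADD.D", "F6", "F8", "F2"), ("SUB.D", "F8", "F6", "F2")])

first = issue(queue, stations, regs, reg_status)   # takes the only adder RS
second = issue(queue, stations, regs, reg_status)  # stalls on the structural hazard
```

With a single adder RS, the second call returns None and SUB.D remains at the head of the queue, which keeps issue in order.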
Note: the instruction status table is not part of the
hardware.
If the L.D has completed, the Vk field of DIV.D stores
the result, so DIV.D is independent of ADD.D (as
shown in the instruction status).
If the L.D had not completed, the Qk field of DIV.D would
point to the Load1 RS, and DIV.D would still be
independent of ADD.D.
In either case the ADD.D can issue and execute without
affecting DIV.D.
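Both cases can be traced with a toy model of the single operand involved. The register values, the dictionary layout, and the field names Vk and Qk are illustrative assumptions; only the F6 operand of DIV.D is modeled.

```python
# Register file before ADD.D F6,F8,F2 writes its result (values are made up).
regs = {"F2": 2.0, "F6": 6.0, "F8": 8.0}

# Case 1: the L.D completed before DIV.D issued, so the value of F6 was
# copied into the reservation station at issue time.
div_rs = {"op": "DIV.D", "Vk": regs["F6"], "Qk": 0}
regs["F6"] = regs["F8"] + regs["F2"]   # ADD.D overwrites F6 early
case1_operand = div_rs["Vk"]           # DIV.D still sees the old value

# Case 2: the L.D was still pending when DIV.D issued, so the station
# records the producer tag instead of a value.
div_rs2 = {"op": "DIV.D", "Vk": None, "Qk": "Load1"}
# ADD.D may write F6 in the register file at any time; DIV.D never reads it.
tag, value = "Load1", 6.0              # the load result appears on the CDB
if div_rs2["Qk"] == tag:
    div_rs2["Vk"], div_rs2["Qk"] = value, 0
case2_operand = div_rs2["Vk"]
```

In both cases DIV.D uses the value produced by the load, not the later result of ADD.D, which is how the WAR hazard on F6 disappears.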
Example: Assume the following latencies: load 1 cycle,
add 2 cycles, multiply 6 cycles and divide 12 cycles.
What do the status tables look like when the MUL.D is
ready to write its result?
[Figure: instruction status (per-instruction latencies 1, 1, 6, 2, 12, 2), reservation stations (including Load1), and register status.]
[Table row: Issue, FP operation. Wait until: station empty.]
rs and rt are the source registers. rd is the destination
register. r is the reservation station (RS) or buffer that the
instruction is assigned to. Regs is the register file, RegisterStat is the
register status.
If the operands are available in the registers, they are
stored in the V fields. Otherwise, the Q fields are set to
indicate the RS that will produce the values needed as
source operands.
The instruction waits at the RS until both its operands are
available, indicated by zero in the Q fields.
The Q fields are set to zero either when this instruction is
issued, or when an instruction on which this instruction
depends completes and does its write back.
When an instruction has finished execution and the
CDB is available, it can do its write back.
All the buffers, registers, and RSs whose value of Qj or Qk is the
same as the completing RS update their values from the
CDB and set their Q fields to zero to indicate that
values have been received.
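The write-back broadcast can be sketched as follows. This is a simplified model under assumptions: the function names, the dictionary layout, and the station tags are hypothetical, the field names Vj, Vk, Qj, Qk follow the common textbook convention, and store buffers are omitted.

```python
def write_back(tag, value, stations, regs, reg_status):
    """Broadcast (tag, value) on the CDB to every waiting consumer."""
    for rs in stations:
        if rs["Qj"] == tag:              # first operand was waiting on tag
            rs["Vj"], rs["Qj"] = value, 0
        if rs["Qk"] == tag:              # second operand was waiting on tag
            rs["Vk"], rs["Qk"] = value, 0
    for reg, producer in reg_status.items():
        if producer == tag:              # register still expects this result
            regs[reg] = value
            reg_status[reg] = 0


def ready(rs):
    """An instruction may execute once both Q fields are zero."""
    return rs["Qj"] == 0 and rs["Qk"] == 0


stations = [
    {"name": "Add1", "Vj": None, "Vk": 2.0, "Qj": "Load2", "Qk": 0},
    {"name": "Mult1", "Vj": None, "Vk": 4.0, "Qj": "Load2", "Qk": 0},
]
regs = {"F0": 0.0, "F2": 0.0}
reg_status = {"F0": "Mult1", "F2": "Load2"}

ready_before = [ready(rs) for rs in stations]
write_back("Load2", 5.0, stations, regs, reg_status)
ready_after = [ready(rs) for rs in stations]
```

A single broadcast wakes up both waiting stations at once, while F0 keeps its tag because it is still waiting for Mult1.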
[Table: steps of the algorithm, with columns Instruction state, Wait until, and Action or bookkeeping. Rows: Issue (FP operation, loads, stores); Execute (FP operation, load step 1, load step 2, store); Write result of FP operation or load (wait until execution is complete and the CDB is available); Write result of store.]