
Computer Architecture

A Quantitative Approach, Sixth Edition

Chapter 3
Instruction-Level Parallelism
and Its Exploitation

Copyright © 2019, Elsevier Inc. All Rights Reserved


Introduction
 Pipelining became a universal technique in 1985
  Overlaps execution of instructions
  Exploits “Instruction-Level Parallelism”

 Beyond this, there are two main approaches:
  Hardware-based dynamic approaches
  Used in server and desktop processors
  Not used as extensively in PMD (personal mobile device) processors
  Compiler-based static approaches
  Not as successful outside of scientific applications

Introduction
Instruction-Level Parallelism
 When exploiting instruction-level parallelism, the goal is to minimize CPI (cycles per instruction)
  Pipeline CPI = Ideal pipeline CPI + Structural stalls + Data hazard stalls + Control stalls (a small worked sketch follows below)
  Where:
  The ideal pipeline CPI is a measure of the maximum performance attainable by the implementation
  Structural hazards arise from resource conflicts when the hardware cannot support all possible combinations of instructions simultaneously in overlapped execution
  Data hazards arise when an instruction depends on the results of a previous instruction in a way that is exposed by the overlapping of instructions in the pipeline
  Control hazards arise from the pipelining of branches and other instructions that change the PC
 Parallelism within a basic block is limited
  A basic block is a straight-line code sequence with no branches in except to the entry and no branches out except at the exit
  Typical size of a basic block = 3–6 instructions
  Must optimize across branches

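To make the equation concrete, here is a minimal C sketch that adds hypothetical stall components to the ideal CPI; every number below is invented for illustration, not taken from the text.

#include <stdio.h>

/* Hedged sketch: combine the CPI components from the equation above.
 * All stall counts are hypothetical examples, not measurements. */
int main(void) {
    double ideal_cpi   = 1.00; /* max performance attainable by the implementation */
    double structural  = 0.05; /* structural stalls per instruction */
    double data_hazard = 0.20; /* data hazard stalls per instruction */
    double control     = 0.15; /* control stalls per instruction */

    double pipeline_cpi = ideal_cpi + structural + data_hazard + control;
    printf("Pipeline CPI = %.2f\n", pipeline_cpi); /* prints 1.40 */
    return 0;
}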
Introduction
Data Dependence
 The simplest and most common way to increase ILP is to exploit parallelism among iterations of a loop
  Loop-level parallelism
  Unroll the loop statically or dynamically
  As an alternative, use SIMD (vector processors and GPUs)
 Challenges:
  Data dependence: instruction j is data dependent on instruction i if
 Instruction i produces a result that may be used by instruction j, or
 Instruction j is data dependent on instruction k, and instruction k is data dependent on instruction i
  Dependent instructions cannot be executed simultaneously

Introduction
Data Dependence
 Dependences are a property of programs
 Pipeline organization determines if a dependence is detected and if it causes a stall
 Data dependence conveys:
  Possibility of a hazard
  Order in which results must be calculated
  Upper bound on exploitable instruction-level parallelism
 Dependences that flow through memory locations are difficult to detect
Introduction
Name Dependence
 Two instructions use the same name but there is no flow of information between them
  Occurs when two instructions use the same register or memory location
 Not a true data dependence, but a problem when reordering instructions
 Two types of name dependences between an instruction i that precedes instruction j:
  Antidependence: instruction j writes a register or memory location that instruction i reads
  Initial ordering (i before j) must be preserved
  Output dependence: instruction i and instruction j write the same register or memory location
  Ordering must be preserved
 To resolve, use register renaming techniques
Introduction
Other Factors
 A hazard occurs whenever:
  There is a name or data dependence between instructions, and
  The instructions are close enough that the overlap during execution would change the order of access to the operand involved in the dependence
 Solution: preserve program order (the program should appear to execute sequentially)
 The goal of both software and hardware techniques is to exploit parallelism by preserving program order only where it affects the outcome of the program
Introduction
Other Factors
 Consider two instructions i and j, with i preceding j in program order. The possible data hazards are (a C-level illustration follows this slide):
  Read after write (RAW): j tries to read a source before i writes it
  Write after write (WAW): j tries to write an operand before it is written by i
  Write after read (WAR): j tries to write a destination before it is read by i
 Control dependence
  Determines the ordering of instruction i with respect to a branch instruction, so that instruction i is executed in correct program order and only when it should be
  An instruction control dependent on a branch cannot be moved before the branch so that its execution is no longer controlled by the branch
  An instruction not control dependent on a branch cannot be moved after the branch so that its execution is controlled by the branch
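To make the three hazard types concrete, here is a hedged C-level illustration that uses plain assignments as stand-in “instructions”; the variables are invented for this example.

/* Hedged illustration: each pair of statements plays the roles of
 * instructions i and j. Variable names are invented. */
int a = 1, b = 2, c, d;

void hazards(void) {
    c = a + b;   /* i writes c                                  */
    d = c + 1;   /* j reads c after i writes it   -> RAW on c   */

    d = a + 5;   /* i reads a                                   */
    a = 7;       /* j writes a after i reads it   -> WAR on a   */

    b = c + d;   /* i writes b                                  */
    b = 42;      /* j writes b after i writes it  -> WAW on b   */
}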


Introduction
Examples
• Example 1:
  add x1,x2,x3
  beq x4,x0,L
  sub x1,x1,x6
L: …
  or x7,x1,x8

   The or instruction is data dependent on both add and sub

• Example 2:
  add x1,x2,x3
  beq x12,x0,skip
  sub x4,x5,x6
  add x5,x4,x9
skip:
  or x7,x8,x9

   Assume x4 isn’t used after skip
   Possible to move sub before the branch


Compiler Techniques
Compiler Techniques for Exposing ILP
 Pipeline scheduling
  Find sequences of unrelated instructions that can be overlapped in the pipeline
  To avoid a pipeline stall, the execution of a dependent instruction must be separated from the source instruction by a distance in clock cycles equal to the pipeline latency of that source instruction
 A compiler’s ability to perform this scheduling depends on:
  Amount of ILP available in the program
  Latencies of the functional units in the pipeline
Compiler Techniques
Compiler Techniques for Exposing ILP
 Example:
for (i=999; i>=0; i=i-1)
    x[i] = x[i] + s;
 The loop is parallel: the body of each iteration is independent


Compiler Techniques
Pipeline Stalls

Compiler Techniques
Loop Unrolling
 Loop unrolling (a C-level version of the transformation follows below)
  Unroll by a factor of 4 (assume # elements is divisible by 4)
  Eliminate unnecessary instructions
Loop: fld    f0,0(x1)
      fadd.d f4,f0,f2
      fsd    f4,0(x1)     //drop addi & bne
      fld    f6,-8(x1)
      fadd.d f8,f6,f2
      fsd    f8,-8(x1)    //drop addi & bne
      fld    f10,-16(x1)
      fadd.d f12,f10,f2
      fsd    f12,-16(x1)  //drop addi & bne
      fld    f14,-24(x1)
      fadd.d f16,f14,f2
      fsd    f16,-24(x1)
      addi   x1,x1,-32
      bne    x1,x2,Loop

  Note the number of live registers vs. the original loop
  Runs in 26 clock cycles
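At the source level, the same unroll-by-4 transformation can be sketched in C as follows; this is a hedged sketch assuming the trip count (1000) is divisible by 4, and the function name is invented.

/* Hedged C sketch of the unrolled loop above: four copies of the body,
 * one loop-overhead (addi/bne) pair per four elements. */
void add_scalar_unrolled(double *x, double s) {
    for (int i = 999; i >= 0; i -= 4) {
        x[i]     = x[i]     + s;
        x[i - 1] = x[i - 1] + s;
        x[i - 2] = x[i - 2] + s;
        x[i - 3] = x[i - 3] + s;
    }
}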


Compiler Techniques
Loop Unrolling/Pipeline Scheduling
 Pipeline schedule the unrolled loop:
Loop: fld    f0,0(x1)
      fld    f6,-8(x1)
      fld    f10,-16(x1)
      fld    f14,-24(x1)
      fadd.d f4,f0,f2
      fadd.d f8,f6,f2
      fadd.d f12,f10,f2
      fadd.d f16,f14,f2
      fsd    f4,0(x1)
      fsd    f8,-8(x1)
      fsd    f12,-16(x1)
      fsd    f16,-24(x1)
      addi   x1,x1,-32
      bne    x1,x2,Loop

  14 cycles: 3.5 cycles per element


Compiler Techniques
Strip Mining
 Unknown number of loop iterations?
  (the upper bound on the loop is unknown)
  Number of iterations = n
  Goal: make k copies of the loop body
  Instead of a single unrolled loop, generate a pair of consecutive loops (see the C sketch below):
  First executes n mod k times
  Second executes n/k times, with its body unrolled by k
  “Strip mining”
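A hedged C sketch of strip mining, reusing the x[i] = x[i] + s loop from earlier; the function name and the unroll factor k = 4 are invented.

/* Hedged sketch: first loop runs n mod k iterations, second loop is
 * unrolled by k and runs n/k times. */
void add_scalar_strip_mined(double *x, int n, double s) {
    enum { K = 4 };              /* unroll factor k */
    int i = 0;

    for (; i < n % K; i++)       /* first loop: n mod k times */
        x[i] += s;

    for (; i < n; i += K) {      /* second loop: n/k times    */
        x[i]     += s;
        x[i + 1] += s;
        x[i + 2] += s;
        x[i + 3] += s;
    }
}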


Branch Prediction
 Basic 2-bit predictor (sketched in C below):
  For each branch:
  Predict taken or not taken
  If the prediction is wrong two consecutive times, change the prediction
 Correlating predictor / two-level predictor:
  Uses the behavior of other branches to make a prediction
  Multiple 2-bit predictors for each branch
  One for each possible combination of outcomes of the preceding m branches
  (m,n) predictor: uses the behavior of the last m branches to choose from 2^m n-bit predictors
 Tournament predictor:
  Combines a correlating predictor with a local predictor: chooses among two different predictors based on which one (local, global, or even some time-varying mix) was most effective in recent predictions
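The sketch below is a hedged C illustration of a table of 2-bit saturating counters combined with a gshare-style (PC xor global history) index; the table size and hashing details are assumptions, not any specific processor’s design.

#include <stdint.h>

enum { TABLE_BITS = 12, TABLE_SIZE = 1 << TABLE_BITS };

static uint8_t  counters[TABLE_SIZE]; /* 2-bit counters, values 0..3 */
static uint32_t ghist;                /* global branch history       */

/* Predict taken when the counter is in one of the two "taken" states. */
int predict(uint64_t pc) {
    unsigned idx = ((unsigned)(pc >> 2) ^ ghist) & (TABLE_SIZE - 1);
    return counters[idx] >= 2;
}

/* Update: two consecutive mispredictions are needed to flip a strong
 * prediction, which is exactly the 2-bit behavior described above. */
void update(uint64_t pc, int taken) {
    unsigned idx = ((unsigned)(pc >> 2) ^ ghist) & (TABLE_SIZE - 1);
    if (taken  && counters[idx] < 3) counters[idx]++;
    if (!taken && counters[idx] > 0) counters[idx]--;
    ghist = ((ghist << 1) | (taken & 1)) & (TABLE_SIZE - 1);
}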

Branch Prediction

[Figure panels: gshare and tournament predictor organizations]


Branch Prediction
Branch Prediction Performance

Branch Prediction
Tagged Hybrid Predictors
 This class of branch predictors employs a series of global predictors indexed with different-length histories
 Need to have a predictor for each branch and history
  Problem: this implies huge tables
  Solution (index sketch below):
  Use hash tables, whose hash value is based on branch address and branch history
  Longer histories may lead to an increased chance of hash collision, so use multiple tables with increasingly shorter histories
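Below is a heavily simplified, hedged sketch of the multi-history hashing idea; the history lengths, folding function, and table sizes are invented, and real tagged hybrid predictors (e.g., TAGE) also store tags and usefulness counters per entry.

#include <stdint.h>

enum { NUM_TABLES = 4, IDX_BITS = 10 };
static const int HIST_LEN[NUM_TABLES] = { 4, 8, 16, 32 }; /* shortest to longest */

static uint64_t ghist; /* global branch history, newest outcome in bit 0 */

/* Fold `len` history bits down to IDX_BITS by XOR-ing chunks together. */
static unsigned fold(uint64_t h, int len) {
    uint64_t v = (len >= 64) ? h : (h & ((1ull << len) - 1));
    unsigned f = 0;
    while (v) { f ^= (unsigned)v & ((1u << IDX_BITS) - 1); v >>= IDX_BITS; }
    return f;
}

/* Per-table index: hash of the branch address and the folded history. */
unsigned table_index(uint64_t pc, int t) {
    return ((unsigned)(pc >> 2) ^ fold(ghist, HIST_LEN[t])) & ((1u << IDX_BITS) - 1);
}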



Dynamic Scheduling
 Dynamic scheduling: rearrange the order of instruction execution to reduce stalls while maintaining data flow
 Advantages:
  Code compiled with one pipeline in mind can run efficiently on a different pipeline
  Compiler doesn’t need knowledge of the microarchitecture
  Handles cases where dependences are unknown at compile time
  Allows the processor to tolerate unpredictable delays, such as cache misses, by executing other code while waiting for the miss to resolve
 Disadvantages:
  Substantial increase in hardware complexity
  Complicates exceptions


Dynamic Scheduling
 A dynamically scheduled processor cannot change the data flow; it tries to avoid stalling when dependences are present
 Static pipeline scheduling by the compiler tries to minimize stalls by separating dependent instructions so that they will not lead to hazards


Dynamic Scheduling
 Dynamic scheduling implies:
  Out-of-order execution
  Out-of-order completion
 Example 1:
fdiv.d f0,f2,f4
fadd.d f10,f0,f8
fsub.d f12,f8,f14
  fsub.d is not dependent on the fdiv.d, so it can execute before the stalled fadd.d


Dynamic Scheduling
 Example 2:
fdiv.d f0,f2,f4
fmul.d f6,f0,f8
fadd.d f0,f10,f14
 fadd.d is not data dependent on the fdiv.d or fmul.d, but the antidependence makes it impossible to issue it earlier without register renaming
  fmul.d and fadd.d: antidependence on register f0
  If fadd.d executes before fmul.d reads f0, it will result in a WAR hazard
Dynamic Scheduling
Register Renaming
 Example 3:
fdiv.d f0,f2,f4
fadd.d f6,f0,f8
fsd    f6,0(x1)
fsub.d f8,f10,f14
fmul.d f6,f10,f8
  Antidependences (WAR hazards): fadd.d reads f8 before fsub.d writes it, and fsd reads f6 before fmul.d writes it
  Output dependence (WAW hazard): fadd.d and fmul.d both write f6, and fadd.d may finish later than fmul.d
  There are also three true data dependences:
  between the fdiv.d and the fadd.d (f0)
  between the fsub.d and the fmul.d (f8)
  between the fadd.d and the fsd (f6)


Dynamic Scheduling
Register Renaming
 Example 3, renamed: assume the existence of two temporary registers, S and T
fdiv.d f0,f2,f4
fadd.d S,f0,f8
fsd    S,0(x1)
fsub.d T,f10,f14
fmul.d f6,f10,T
 Now only the true (RAW) dependences remain, and they can be strictly ordered


Dynamic Scheduling
Register Renaming
 Tomasulo’s approach
  Tracks when operands are available: minimizes RAW hazards
  Introduces register renaming in hardware: minimizes WAW and WAR hazards
 Relies on two key principles:
  Dynamically determining when an instruction is ready to execute
  Renaming registers to avoid unnecessary hazards
 Register renaming is provided by reservation stations (RS)
  The basic idea is that an RS fetches and buffers an operand as soon as it is available, eliminating the need to get the operand from a register
 An RS contains:
  The instruction
  Buffered operand values (when available)
  The reservation station number of the instruction providing the operand values
Dynamic Scheduling
Register Renaming
 An RS fetches and buffers an operand as soon as it becomes available (not necessarily involving the register file)
 Pending instructions designate the RS to which they will send their output
  Result values are broadcast on a result bus, called the common data bus (CDB)
  Only the last output updates the register file
 As instructions are issued, the register specifiers are renamed to the reservation stations
 There may be more reservation stations than registers
 Load and store buffers
  Contain data and addresses, and act like reservation stations


Dynamic Scheduling
Tomasulo’s Algorithm
 Three steps (a C sketch of the reservation-station bookkeeping follows this slide):
  Issue
  Get the next instruction from the FIFO instruction queue
  If an RS is available, issue the instruction to the RS, with operand values if they are available
  If no RS is available, the instruction stalls; if operand values are not available, the RS tracks the stations that will produce them
  Execute
  If one or more operands is not yet available, monitor the common data bus while waiting for it to be computed
  When an operand becomes available, store it in any reservation stations waiting for it
  When all operands are ready, execute the instruction
  Loads and stores are maintained in program order through effective address calculation
  No instruction is allowed to initiate execution until all branches that precede it in program order have completed
  Write result
  Write the result on the CDB into reservation stations and store buffers
  (Stores must wait until both address and value are received)
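The following is a hedged, heavily simplified C sketch of the reservation-station bookkeeping behind these steps: issue-time renaming of sources to RS tags, and a CDB-style broadcast on write result. All names and sizes are invented; real designs add functional units, CDB arbitration, and load/store buffers.

#include <stdbool.h>

enum { NUM_RS = 8, NUM_REGS = 32 };

typedef struct {
    bool   busy;
    int    op;        /* opcode                                         */
    double vj, vk;    /* buffered operand values (when available)       */
    int    qj, qk;    /* RS number that will produce operand, 0 = ready */
} RS;

typedef struct {
    double value;
    int    qi;        /* RS that will write this register, 0 = none     */
} Reg;

static RS  rs[NUM_RS + 1]; /* entry 0 unused so tag 0 can mean "ready"  */
static Reg regs[NUM_REGS];

/* Issue: rename sources to RS tags; stall (return false) if no free RS. */
bool issue(int op, int rd, int rs1, int rs2) {
    for (int r = 1; r <= NUM_RS; r++) {
        if (rs[r].busy) continue;
        rs[r].busy = true; rs[r].op = op;
        rs[r].qj = regs[rs1].qi;
        if (rs[r].qj == 0) rs[r].vj = regs[rs1].value;
        rs[r].qk = regs[rs2].qi;
        if (rs[r].qk == 0) rs[r].vk = regs[rs2].value;
        regs[rd].qi = r;   /* rename: rd will now come from RS r */
        return true;
    }
    return false;          /* structural hazard: no free RS this cycle */
}

/* Write result: broadcast tag r and its value on the CDB; only the most
 * recent renamer of a register (qi == r) updates the register file. */
void write_result(int r, double result) {
    for (int i = 1; i <= NUM_RS; i++) {
        if (rs[i].qj == r) { rs[i].vj = result; rs[i].qj = 0; }
        if (rs[i].qk == r) { rs[i].vk = result; rs[i].qk = 0; }
    }
    for (int i = 0; i < NUM_REGS; i++)
        if (regs[i].qi == r) { regs[i].value = result; regs[i].qi = 0; }
    rs[r].busy = false;
}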


Dynamic Scheduling
Example

Dynamic Scheduling
Tomasulo’s Algorithm
 Example loop:
Loop: fld    f0,0(x1)
      fmul.d f4,f0,f2
      fsd    f4,0(x1)
      addi   x1,x1,8
      bne    x1,x2,Loop // branches if x1 != x2




Hardware-Based Speculation
 Execute instructions along predicted execution paths, but only commit the results if the prediction was correct
 Instruction commit: allowing an instruction to update the register file when the instruction is no longer speculative
 Need an additional piece of hardware to prevent any irrevocable action until an instruction commits
  i.e., updating state or taking an exception


Hardware-Based Speculation
Reorder Buffer
 Reorder buffer (ROB): holds the result of an instruction between completion and commit
 Four fields (see the C sketch below):
  Instruction type: branch/store/register operation
  Destination field: register number (or memory address for stores)
  Value field: output value
  Ready field: completed execution?
 Modify reservation stations:
  Operand source is now the reorder buffer instead of the functional unit
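A hedged C sketch of a circular reorder buffer with the four fields above and an in-order commit step; names, sizes, and the extern helpers are invented for illustration.

#include <stdbool.h>

enum { ROB_SIZE = 32 };
typedef enum { TYPE_BRANCH, TYPE_STORE, TYPE_REGISTER } InstrType;

typedef struct {
    InstrType type;         /* branch / store / register operation */
    long      dest;         /* register number, or store address   */
    long      value;        /* output value                        */
    bool      ready;        /* completed execution?                */
    bool      mispredicted; /* set when a branch resolved wrong    */
} ROBEntry;

static ROBEntry rob[ROB_SIZE];
static int head, tail, count;                /* circular-buffer state */

extern void write_register(long r, long v);  /* architectural state   */
extern void write_memory(long addr, long v);
extern void flush_pipeline(void);            /* refetch correct path  */

/* Commit: retire strictly in program order from the head of the ROB. */
void commit(void) {
    while (count > 0 && rob[head].ready) {
        ROBEntry *e = &rob[head];
        if (e->type == TYPE_BRANCH && e->mispredicted) {
            head = tail = count = 0;  /* clear the speculated entries */
            flush_pipeline();
            return;
        }
        if (e->type == TYPE_REGISTER) write_register(e->dest, e->value);
        if (e->type == TYPE_STORE)    write_memory(e->dest, e->value);
        head = (head + 1) % ROB_SIZE;
        count--;
    }
}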
Hardware-Based Speculation
Reorder Buffer
 Issue:
  Allocate an RS and a ROB entry, read available operands
 Execute:
  Begin execution when operand values are available
 Write result:
  Write the result and the ROB tag on the CDB
 Commit:
  When an instruction reaches the head of the ROB, update the register file (or memory)
  When a mispredicted branch reaches the head of the ROB, flush the ROB and restart execution at the correct successor
Hardware-Based Speculation
Reorder Buffer
 Register values and memory values are not written until an instruction commits
 On misprediction:
  Speculated entries in the ROB are cleared
 Exceptions:
  Not recognized until the instruction is ready to commit




Multiple Issue and Static Scheduling
 To achieve CPI < 1, need to complete multiple instructions per clock
 Solutions:
  Statically scheduled superscalar processors
  VLIW (very long instruction word) processors
  Dynamically scheduled superscalar processors


Multiple Issue and Static Scheduling
Multiple Issue

Multiple Issue and Static Scheduling
VLIW Processors
 Package multiple operations into one instruction
 Example VLIW processor (a sketch of such a bundle follows below):
  One integer instruction (or branch)
  Two independent floating-point operations
  Two independent memory references
 There must be enough parallelism in the code to fill the available slots
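As a hedged illustration, one such five-slot VLIW instruction could be modeled as the C structure below; the field layout is assumed from the list above, and a real encoding would pack bits rather than use structs.

typedef struct { int opcode, rd, rs1, rs2; } Op;

/* Hedged sketch: one VLIW instruction word with five operation slots.
 * The compiler must fill any slot it cannot use with a no-op. */
typedef struct {
    Op int_or_branch; /* one integer ALU operation or branch */
    Op fp[2];         /* two independent FP operations       */
    Op mem[2];        /* two independent memory references   */
} VLIWBundle;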


Multiple Issue and Static Scheduling
VLIW Processors

 Disadvantages:
 Statically finding parallelism
 Code size
 No hazard detection hardware
 Binary code compatibility
Dynamic Scheduling, Multiple Issue, and Speculation
 Modern microarchitectures:
  Dynamic scheduling + multiple issue + speculation
 Two approaches:
  Assign reservation stations and update the pipeline control table in half clock cycles
  Only supports 2 instructions/clock
  Design logic to handle any possible dependencies between the instructions
 Issue logic is the bottleneck in dynamically scheduled superscalars
Dynamic Scheduling, Multiple Issue, and Speculation
Overview of Design

Dynamic Scheduling, Multiple Issue, and Speculation
Multiple Issue
 Examine all the dependencies among the instructions in the bundle
 If dependencies exist in the bundle, encode them in the reservation stations
 Also need multiple completion/commit
 To simplify RS allocation:
  Limit the number of instructions of a given class that can be issued in a “bundle”, e.g. one FP, one integer, one load, one store
Dynamic Scheduling, Multiple Issue, and Speculation
Example
Loop: ld   x2,0(x1)    //x2 = array element
      addi x2,x2,1     //increment x2
      sd   x2,0(x1)    //store result
      addi x1,x1,8     //increment pointer
      bne  x2,x3,Loop  //branch if not last element

Dynamic Scheduling, Multiple Issue, and Speculation
Example (No Speculation)

Dynamic Scheduling, Multiple Issue, and Speculation
Example (Multiple Issue with Speculation)

Adv. Techniques for Instruction Delivery and Speculation
Branch-Target Buffer
 Need high instruction bandwidth
 Branch-target buffers (lookup sketch below)
  Next-PC prediction buffer, indexed by the current PC
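A hedged C sketch of a direct-mapped branch-target buffer lookup; the size, full-PC tags, and names are simplifying assumptions.

#include <stdbool.h>
#include <stdint.h>

enum { BTB_BITS = 9, BTB_SIZE = 1 << BTB_BITS };

typedef struct {
    uint64_t tag;    /* PC of the branch (full PC kept for simplicity) */
    uint64_t target; /* predicted next PC                              */
    bool     valid;
} BTBEntry;

static BTBEntry btb[BTB_SIZE];

/* Lookup: on a hit, fetch can redirect to the predicted target
 * immediately; on a miss, predict fall-through (pc + 4). */
bool btb_lookup(uint64_t pc, uint64_t *next_pc) {
    BTBEntry *e = &btb[(pc >> 2) & (BTB_SIZE - 1)];
    if (e->valid && e->tag == pc) { *next_pc = e->target; return true; }
    return false;
}

/* Update: install or refresh an entry when a taken branch resolves. */
void btb_update(uint64_t pc, uint64_t target) {
    BTBEntry *e = &btb[(pc >> 2) & (BTB_SIZE - 1)];
    e->tag = pc; e->target = target; e->valid = true;
}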


Adv. Techniques for Instruction Delivery and Speculation
Branch Folding
 Optimization:
  Larger branch-target buffer
  Add the target instruction into the buffer to deal with the longer decoding time required by the larger buffer
  “Branch folding”


Adv. Techniques for Instruction Delivery and Speculation
Return Address Predictor
 Most unconditional branches come from function returns
 The same procedure can be called from multiple sites
  Causes the buffer to potentially forget the return address from previous calls
 Create a return-address buffer organized as a stack (sketched below)
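A hedged C sketch of such a return-address stack: push the fall-through PC on a call, pop the prediction on a return; the depth and the wrap-around overflow policy are invented.

#include <stdint.h>

enum { RAS_DEPTH = 16 };
static uint64_t ras[RAS_DEPTH];
static int top; /* pushes so far; next free slot is top % RAS_DEPTH */

/* On a predicted call: push the return address (PC of the call + 4). */
void ras_push(uint64_t return_pc) {
    ras[top % RAS_DEPTH] = return_pc; /* overwrites the oldest on overflow */
    top++;
}

/* On a predicted return: pop the most recent return address.
 * An empty stack yields a stale value, i.e. a likely misprediction. */
uint64_t ras_pop(void) {
    if (top > 0) top--;
    return ras[top % RAS_DEPTH];
}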




Adv. Techniques for Instruction Delivery and Speculation
Integrated Instruction Fetch Unit
 Design a monolithic unit that performs:
  Branch prediction
  Instruction prefetch
  Fetch ahead
  Instruction memory access and buffering
  Deal with crossing cache lines


Adv. Techniques for Instruction Delivery and Speculation
Register Renaming
 Register renaming vs. reorder buffers
  Instead of virtual registers from reservation stations and the reorder buffer, create a single register pool
  Contains visible registers and virtual registers
  Use a hardware-based map to rename registers during issue (sketched below)
  WAW and WAR hazards are avoided
  Speculation recovery occurs by copying during commit
  Still need a ROB-like queue to update the map table in order
  Simplifies commit:
  Record that the mapping between architectural register and physical register is no longer speculative
  Free up the physical register used to hold the older value
  In other words: swap physical registers on commit
 Physical register deallocation is more difficult:
  Simple approach: deallocate the virtual register when the next instruction writes to its mapped architecturally visible register
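A hedged C sketch of issue-time renaming against a single physical register pool, using a map table and a free list; sizes and names are invented, and the commit-time swap/deallocation bookkeeping described above is omitted.

enum { ARCH_REGS = 32, PHYS_REGS = 128 };

static int map[ARCH_REGS];      /* architectural reg -> current physical reg */
static int free_list[PHYS_REGS];
static int free_top;

void rename_init(void) {
    for (int a = 0; a < ARCH_REGS; a++) map[a] = a;  /* identity mapping  */
    for (int p = ARCH_REGS; p < PHYS_REGS; p++)      /* the rest are free */
        free_list[free_top++] = p;
}

/* Rename one instruction "rd = rs1 op rs2" at issue time. Sources read
 * the current mapping; the destination gets a fresh physical register,
 * which is what removes WAW and WAR hazards on rd. */
int rename_issue(int rd, int rs1, int rs2, int *prs1, int *prs2) {
    *prs1 = map[rs1];
    *prs2 = map[rs2];
    int prd = free_list[--free_top]; /* assumes a free register exists */
    map[rd] = prd;
    return prd;
}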


Adv. Techniques for Instruction Delivery and Speculation
Integrated Issue and Renaming
 Combining instruction issue with register renaming:
 Issue logic pre-reserves enough physical registers for the
bundle
 Issue logic finds dependencies within bundle, maps registers
as necessary
 Issue logic finds dependencies between current bundle and
already in-flight bundles, maps registers as necessary

Adv. Techniques for Instruction Delivery and Speculation
How Much?
 How much to speculate?
  Mis-speculation degrades performance and power relative to no speculation
  May cause additional misses (cache, TLB)
  Prevent speculative code from causing higher-cost misses (e.g., in L2)
 Speculating through multiple branches
  Complicates speculation recovery
 Speculation and energy efficiency
  Note: speculation is only energy efficient when it significantly improves performance
Adv. Techniques for Instruction Delivery and Speculation
How Much?
[Figure: misspeculation rate on the integer benchmarks]


Adv. Techniques for Instruction Delivery and Speculation
Value Prediction
 Value prediction
  Attempts to predict the value produced by an instruction
  Uses:
  Loads that load from a constant pool
  Instructions that produce a value from a small set of values
  Not incorporated into modern processors
 A similar idea, address aliasing prediction, is used on some processors to determine whether two stores, or a load and a store, reference the same address, to allow for reordering


Fallacies and Pitfalls
 It is easy to predict the performance/energy efficiency of two different versions of the same ISA if we hold the technology constant


Fallacies and Pitfalls
 Processors with lower CPIs / faster clock rates will also be faster
  Pentium 4 had higher clock, lower CPI
  Itanium had same CPI, lower clock


Fallacies and Pitfalls
 Sometimes bigger and dumber is better
  Pentium 4 and Itanium were advanced designs, but could not achieve their peak instruction throughput because of relatively small caches as compared to the i7
 And sometimes smarter is better than bigger and dumber
  The TAGE branch predictor outperforms gshare with fewer stored predictions


Fallacies and Pitfalls
 Believing that there are large amounts of ILP available, if only we had the right techniques
