Hardware-Software Interface: Code Translation
Machine
- Available resources statically fixed
- Designed to support a wide variety of programs
- Interested in running many programs fast

Program
- Required resources dynamically varying
- Designed to run well on a variety of machines
- Interested in having itself run fast
Compiler Tasks

Code Translation
- Source language -> target language
- e.g. FORTRAN -> C, or C -> MIPS, PowerPC, or Alpha machine code (producing a MIPS binary or an Alpha binary)
Compiler Structure

Front End -> IR -> Optimizer -> IR -> Back End -> machine code
- A dependence analyzer feeds the optimizer
- The front end and optimizer are machine independent; the back end is machine dependent

Code Optimization
- Code runs faster
- Match dynamic code behavior to the static machine structure
Front End

Lexical Analysis
- Catches misspellings of an identifier, keyword, or operator (e.g. lex)

Syntax Analysis
- Catches grammar errors, such as mismatched parentheses (e.g. yacc)

Semantic Analysis
- Type checking

(Figure: in a retargetable compiler, the optimized HIL is lowered to an optimized LIL, from which each target's code generator and linker, e.g. Target-3, produce that target's executable.)
Front-end
1. Scanner - converts the input character stream into a stream of lexical tokens (see the sketch after this list)
2. Parser - derives the syntactic structure (parse tree, abstract syntax tree) from the token stream, and reports any syntax errors encountered
3. Semantic Analysis - generates the intermediate language representation from the input source program and user options/directives, and reports any semantic errors encountered
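A minimal sketch of step 1, the scanner, for a toy language with identifiers, integer literals, and single-character operators (everything here is illustrative, not from any particular compiler):

    #include <ctype.h>
    #include <stdio.h>

    /* Toy token kinds -- illustrative only */
    enum TokKind { TOK_ID, TOK_NUM, TOK_OP, TOK_EOF };

    struct Token { enum TokKind kind; char text[64]; };

    /* Scanner: converts the input character stream into lexical tokens */
    struct Token next_token(const char **src) {
        struct Token t = { TOK_EOF, "" };
        const char *p = *src;
        while (isspace((unsigned char)*p)) p++;          /* skip whitespace */
        if (*p == '\0') { *src = p; return t; }
        size_t n = 0;
        if (isalpha((unsigned char)*p)) {                /* identifier or keyword */
            t.kind = TOK_ID;
            while (isalnum((unsigned char)*p) && n < 63) t.text[n++] = *p++;
        } else if (isdigit((unsigned char)*p)) {         /* integer literal */
            t.kind = TOK_NUM;
            while (isdigit((unsigned char)*p) && n < 63) t.text[n++] = *p++;
        } else {                                         /* single-char operator */
            t.kind = TOK_OP;
            t.text[n++] = *p++;
        }
        t.text[n] = '\0';
        *src = p;
        return t;
    }

    int main(void) {
        const char *prog = "d = a * (b + c)";
        struct Token t;
        while ((t = next_token(&prog)).kind != TOK_EOF)
            printf("kind=%d text=%s\n", t.kind, t.text);
        return 0;
    }

The parser would then consume this token stream to build the syntax tree.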
High-level Optimizer
- Global intra-procedural and inter-procedural analysis of the source program's control and data flow
- Selection of high-level optimizations and transformations
- Update of the high-level intermediate language
Intermediate Representation

Achieves retargetability across:
- Different source languages
- Different target machines
Graphical Representation

  int a, b, c, d;
  d = a * (b + c);

The statement is represented as a tree of IR nodes (reconstructed from the node dump; A0-A3 are the locals' addresses):

  FND1:  ADDRL A3         ; &d
  FND2:  ADDRL A0         ; &a
  FND3:  INDIRI FND2      ; a
  FND4:  ADDRL A1         ; &b
  FND5:  INDIRI FND4      ; b
  FND6:  ADDRL A2         ; &c
  FND7:  INDIRI FND6      ; c
  FND8:  ADDI FND5, FND7  ; b + c
  FND9:  MULI FND3, FND8  ; a * (b + c)
  FND10: ASGI FND1, FND9  ; d = a * (b + c)
Machine-Independent Optimizations

Dataflow Analysis and Optimizations
- Constant propagation
- Copy propagation
- Value numbering
- Elimination of common subexpressions
- Dead code elimination
- Strength reduction
- Function/procedure inlining
Code-Optimizing Transformations

Constant folding
- (1 + 2) folds to 3; (100 > 0) folds to true

Common subexpression elimination and dead code elimination, on the slide's example:

Original:
  y = a*b + 3
  z = a*b + 3 + z
  x = 3

After common subexpression elimination:
  t = a*b + 3
  y = t
  z = t + z
  x = 3

After dead code elimination (assuming y is not used later):
  t = a*b + 3
  z = t + z
  x = 3

A folding sketch follows.
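A minimal sketch of constant folding over a toy expression tree (node layout and operator set are illustrative assumptions):

    #include <stdio.h>
    #include <stdlib.h>

    /* Toy expression tree: constant leaves and binary operators */
    struct Expr {
        char op;                 /* '+', '*', or 0 for a constant leaf */
        int  val;                /* meaningful when op == 0 */
        struct Expr *l, *r;
    };

    static struct Expr *leaf(int v) {
        struct Expr *e = calloc(1, sizeof *e);
        e->val = v;
        return e;
    }

    static struct Expr *node(char op, struct Expr *l, struct Expr *r) {
        struct Expr *e = calloc(1, sizeof *e);
        e->op = op; e->l = l; e->r = r;
        return e;
    }

    /* Fold: if both children are constants, replace the operator node
       with a constant leaf computed at compile time. */
    static struct Expr *fold(struct Expr *e) {
        if (e->op == 0) return e;
        e->l = fold(e->l);
        e->r = fold(e->r);
        if (e->l->op == 0 && e->r->op == 0) {
            int v = (e->op == '+') ? e->l->val + e->r->val
                                   : e->l->val * e->r->val;
            free(e->l); free(e->r);
            e->op = 0; e->val = v; e->l = e->r = NULL;
        }
        return e;
    }

    int main(void) {
        struct Expr *e = node('*', node('+', leaf(1), leaf(2)), leaf(4));
        e = fold(e);                  /* (1 + 2) * 4 folds to 12 */
        printf("%d\n", e->val);
        return 0;
    }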
Code Motion
- Move code between basic blocks
- E.g. move loop-invariant computations outside of loops:

  t = x/y;
  while (i < 100) {
      *p = t + i;
      i = i + 1;
  }
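For contrast, before code motion the loop presumably recomputed the invariant expression on every iteration; a reconstruction, wrapped as a compilable function:

    /* Before code motion: x/y is loop-invariant but recomputed each iteration */
    void before_hoisting(int *p, int x, int y) {
        int i = 0;
        while (i < 100) {
            *p = x / y + i;   /* the invariant x/y sits inside the loop */
            i = i + 1;
        }
    }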
Strength Reduction
- Replace complex (and costly) expressions with simpler ones
- E.g. a := b*17 becomes a := (b<<4) + b
- E.g. the multiplication in

  while (i < 100) {
      a[i] = i * 100;
      i = i + 1;
  }

  is replaced by induction-variable updates:

  p = &a[i];
  t = i * 100;
  while (i < 100) {
      *p = t;
      t = t + 100;
      p = p + 4;
      i = i + 1;
  }
Loop Optimizations
- Motivation: restructure the program so as to enable more effective back-end optimizations and hardware exploitation
- Loop transformations are useful for enhancing:
  - register allocation
  - instruction-level parallelism
  - data-cache locality
  - vectorization
  - parallelization
- Rather than recompute i*M+j for each array in each iteration, share the induction variable between arrays and increment it at the end of the loop body (see the sketch below)
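A sketch of that transformation on flattened 2-D arrays (function and parameter names are illustrative):

    /* Before: both arrays recompute the linearized index i*M + j */
    void before(int *a, int *b, int n, int M) {
        for (int i = 0; i < n; i++)
            for (int j = 0; j < M; j++)
                a[i*M + j] = b[i*M + j] + 1;
    }

    /* After: a single induction variable k is shared by both arrays and
       incremented at the end of the loop body instead of being recomputed */
    void after(int *a, int *b, int n, int M) {
        int k = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < M; j++) {
                a[k] = b[k] + 1;
                k = k + 1;
            }
    }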
Loop optimizations
- Loops are good targets for optimization. Basic loop optimizations:
  - code motion
  - induction-variable elimination
  - strength reduction (x*2 -> x<<1)
(Figure: fraction of loop-intensive code in SPEC92 benchmarks such as nasa7, matrix300, and tomcatv, from a study of loop-intensive benchmarks in the SPEC92 suite [C.J. Newburn, 1991].)
Function inlining
- Replace function calls with the function body
- Increases compilation scope (increases ILP)
- Enables e.g. constant propagation and common subexpression elimination across the former call boundary, as illustrated below
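A small illustration of why inlining enables those optimizations (illustrative code, not from the slide):

    static int square(int x) { return x * x; }

    /* Before inlining: the call is an optimization barrier */
    int caller_before(int a) {
        return square(a) + square(4);
    }

    /* After inlining: the body is substituted at each call site, and
       constant propagation can fold square(4) to 16 at compile time */
    int caller_after(int a) {
        return (a * a) + 16;
    }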
Back End

IR -> Back End -> Machine code

Stages:
- code selection
- code scheduling
- register allocation
- code emission
(Figure: in-line code expansion per benchmark, with ratios ranging from 1.00+ up to about 1.32.)
Back-end tasks (on the instruction-level IR):
- map virtual registers onto architected registers
- rearrange code
- target-machine-specific optimizations:
  - delayed branch
  - conditional move
  - instruction combining
  - auto-increment addressing mode
  - add carrying (PowerPC)
  - hardware branch (PowerPC)
Code Selection
- Map IR to machine instructions, e.g. by pattern matching over the IR tree for d = a * (b + c):

  ASGI(&d, MULI(INDIRI(&a), ADDI(INDIRI(&b), INDIRI(&c))))

    Inst *match(IR *n) {
        Inst *inst, *l, *r;
        switch (n->opcode) {
        case ..:
        case MUL:
            l = match(n->left());
            r = match(n->right());
            if (n->type == D || n->type == F)
                inst = mult_fp((n->type == D), l, r);
            else
                inst = mult_int((n->type == I), l, r);
            break;
        case ADD:
            l = match(n->left());
            r = match(n->right());
            if (n->type == D || n->type == F)
                inst = add_fp((n->type == D), l, r);
            else
                inst = add_int((n->type == I), l, r);
            break;
        case ..:
        }
        return inst;
    }
Peephole Optimizations
- Replacement of assembly instructions through template matching
- E.g. replacing one addressing mode with another on a CISC (a sketch follows)
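A minimal sketch of template matching over a linear instruction list, using a toy instruction form and a single store-then-reload template (all names and encodings are illustrative):

    #include <stdio.h>
    #include <string.h>

    /* Toy instruction: opcode plus destination and source operands */
    struct Insn { char op[8]; int dst, src; };

    /* One peephole template: a store to a memory location immediately
       followed by a load from the same location becomes a register move. */
    static int peephole(struct Insn *code, int n) {
        int changed = 0;
        for (int i = 0; i + 1 < n; i++) {
            if (strcmp(code[i].op, "store") == 0 &&
                strcmp(code[i + 1].op, "load") == 0 &&
                code[i].dst == code[i + 1].src) {       /* same memory slot */
                strcpy(code[i + 1].op, "move");
                code[i + 1].src = code[i].src;          /* forward the value */
                changed++;
            }
        }
        return changed;
    }

    int main(void) {
        struct Insn code[] = {
            { "store", 100, 3 },   /* mem[100] = r3 */
            { "load",  4, 100 },   /* r4 = mem[100]  -> becomes r4 = r3 */
        };
        int rewritten = peephole(code, 2);
        printf("%d rewrite(s); second insn is now '%s'\n", rewritten, code[1].op);
        return 0;
    }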
Code Scheduling
- Rearrange the code sequence to minimize execution time
- Hide instruction latency
- Utilize all available resources

Original sequence (4 stall cycles):

  l.d  f4, 8(r8)
  l.d  f2, 16(r8)
  fadd f5, f4, f6      ; 0 stalls
  fsub f7, f2, f6      ; 0 stalls
  fmul f7, f7, f5      ; 3 stalls
  s.d  f7, 24(r8)      ; 1 stall
  l.d  f8, 0(r9)
  s.d  f8, 8(r9)

After reordering (2 stall cycles):

  l.d  f4, 8(r8)
  fadd f5, f4, f6      ; 1 stall
  l.d  f2, 16(r8)
  fsub f7, f2, f6      ; 1 stall
  fmul f7, f7, f5
  s.d  f7, 24(r8)
  l.d  f8, 0(r9)
  s.d  f8, 8(r9)

After further reordering, moving the final load/store pair ahead of s.d f7 (this requires memory disambiguation to show the accesses do not overlap):

  l.d  f4, 8(r8)
  l.d  f2, 16(r8)
  fadd f5, f4, f6
  fsub f7, f2, f6
  fmul f7, f7, f5      ; 1 stall
  l.d  f8, 0(r9)
  s.d  f8, 8(r9)
  s.d  f7, 24(r8)
Example:

  main(int argc, char *argv[])
  {
      int a, b, c;
      a = argc;
      b = a * 255;
      c = a * 15;
      printf("%d\n", b*b - 4*a*c);
  }

Instruction-level IR (reconstructed from the op listing):

  op 10: MPY  vr2  <- param1, 255
  op 12: MPY  vr3  <- param1, 15
  op 14: MPY  vr8  <- vr2, vr2
  op 15: SHL  vr9  <- param1, 2
  op 16: MPY  vr10 <- vr9, vr3
  op 17: SUB  param2 <- vr8, vr10
  op 18: MOV  param1 <- addr("%d\n")
  op 27: PBRR vb12 <- addr(printf)
  op 20: BRL  ret_addr <- vb12
Instruction Scheduling
- Given a source program P, schedule the instructions so as to minimize the overall execution time on the functional units in the target machine

Input: a basic block represented as a DAG

(Figure: a DAG over i1..i4 with edges (i1,i4), (i2,i4), and (i3,i4); the edge (i2,i4) carries latency 1, the others latency 0.)

i2 is a load instruction. A latency of 1 on (i2,i4) means that i4 cannot start until one cycle after i2 completes.

Idle cycle due to latency -- two schedules for the above DAG, with S2 the desired sequence:

  Cycle:  1   2   3   4     5
  S1:     i1  i3  i2  idle  i4
  S2:     i1  i2  i3  i4
Register Allocation
- Map virtual registers onto physical registers
- Minimize register usage to reduce memory accesses, but reuse introduces false dependencies
- It is useful to keep variables in registers as long as possible, once they are loaded
- Registers are bounded in number, so register sharing is needed over time

Before allocation (virtual registers f2, f4, f5, f6, f7, f8):

  l.d  f4, 8(r8)
  fadd f5, f4, f6
  l.d  f2, 16(r8)
  fsub f7, f2, f6
  fmul f7, f7, f5
  s.d  f7, 24(r8)
  l.d  f8, 0(r9)
  s.d  f8, 8(r9)

After allocation (note how reusing $f0 creates false dependencies):

  l.d  $f0, 8(r8)
  fadd $f2, $f0, $f3
  l.d  $f0, 16(r8)
  fsub $f0, $f0, $f3
  fmul $f0, $f0, $f2
  s.d  $f0, 24(r8)
  l.d  $f0, 0(r9)
  s.d  $f0, 8(r9)
The Goal
- Primarily to assign registers to variables
- However, the allocator runs out of registers quite often
- Decide which variables to flush out of registers to free them up, so that other variables can be brought in: spilling
Register Allocation
- Stall cycles due to false dependencies, spill code
Performance analysis

Elements of program performance (Shaw):

  execution time = program path + instruction timing

- The path depends on data values; choose which case you are interested in
- Instruction timing depends on pipelining and cache behavior
- Hence the importance of the compiler, and of its back end in particular
Instruction timing
- Not all instructions take the same amount of time
- It is hard to get execution time data for instructions
- Instruction execution times are not independent
- Execution time may depend on operand values
Execution Frequencies
- Branch probabilities
- Average number of loop iterations
- Average number of procedure calls

Compile-time estimation:
- Default values
- Compiler analysis
- The goal is to select the same set of program regions and optimizations that would be obtained from profiled frequencies
Cost Functions
- Effectiveness of the optimizations: how well can we optimize our objective function? Impact on the running time of the compiled code, determined by the completion time.
- Efficiency of the optimization: how fast can we optimize? Impact on the time it takes to compile, i.e. the cost of gaining the benefit of fast-running code.
Basic Graphs
- A graph is made up of a set of nodes (V) and a set of edges (E)
- Each edge has a source and a sink, both of which must be members of the node set, i.e. E ⊆ V × V
- Edges may be directed or undirected
- A directed graph has only directed edges; an undirected graph has only undirected edges

Examples: (Figure: an undirected graph and a directed graph. A representation sketch follows.)
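For the later algorithms it helps to fix a concrete representation; a minimal adjacency-list sketch of a directed graph (bounds and field names are illustrative):

    #include <stdio.h>

    #define MAXV 64

    /* Directed graph as adjacency lists: each edge has a source and a sink */
    struct Graph {
        int nvertices;
        int adj[MAXV][MAXV];   /* adj[v] holds the sinks of v's out-edges */
        int degree[MAXV];      /* out-degree of v */
    };

    static void add_edge(struct Graph *g, int src, int sink) {
        g->adj[src][g->degree[src]++] = sink;
    }

    int main(void) {
        struct Graph g = { .nvertices = 3 };
        add_edge(&g, 0, 1);    /* 0 -> 1 */
        add_edge(&g, 1, 2);    /* 1 -> 2 */
        add_edge(&g, 0, 2);    /* 0 -> 2 */
        for (int v = 0; v < g.nvertices; v++)
            for (int e = 0; e < g.degree[v]; e++)
                printf("%d -> %d\n", v, g.adj[v][e]);
        return 0;
    }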
Paths and Cycles

(Figure: a path from a source node to a sink node, and cycles, illustrated for an undirected graph, a directed graph, and an acyclic directed graph.)
Connected Graphs

(Figure: a connected graph and an unconnected graph.)
Flow Graphs
- Motivation: a language-independent and machine-independent representation of the control flow in programs, used in high-level and low-level code optimizers
- The flow graph data structure lends itself to the use of several important algorithms from graph theory
Hard to optimize well without detailed knowledge of the range of the iteration.
The simpler case, which we consider first, has no branching and corresponds to a basic block of code, e.g., loop bodies. The more complicated case, scheduling programs with acyclic control flow (i.e., with branching), is considered next.
The Core Case: Scheduling Basic Blocks

Why are basic blocks easy?
- All instructions specified as part of the input must be executed
- This allows deterministic modeling of the input
- There are no branch probabilities to contend with, which makes the problem space easy to optimize using classical methods

(Figure: the DAG over i1..i4 again. i2 is a load instruction; the latency of 1 on (i2,i4) means that i4 cannot start until one cycle after i2 completes.)
The General Instruction Scheduling Problem

Input: a DAG representing each basic block, where:
1. Nodes encode unit execution time (single cycle) instructions.
2. Each node requires a definite class of FUs.
3. Additional pipeline delays are encoded as latencies on the edges.
4. The number of FUs of each type in the target machine is given.
Drawing on Deterministic Scheduling

Canonical List Scheduling Algorithm:
1. Assign a rank (priority) to each instruction (or node).
2. Sort and build a priority list of the instructions in non-decreasing order of rank, so that nodes with smaller ranks occur earlier.
Code Scheduling

Objectives: minimize the execution latency of the program
- Start instructions on the critical path as early as possible
- Help expose more instruction-level parallelism to the hardware
- Help avoid resource conflicts that increase execution time

Constraints
- Program precedences
- Machine resources

Motivations
- Dynamic/Static Interface (DSI): by employing more software (static) optimization techniques at compile time, hardware complexity can be significantly reduced
- Performance boost: even with the same complex hardware, software scheduling can provide additional performance enhancement over unscheduled code
Precedence Constraints
- Minimum required ordering and latency between definition and use
- Precedence graph: nodes are instructions; an edge a -> b means a must precede b; edges are annotated with minimum latencies

Source statements:

  w[i+k].ip = z[i].rp + z[m+i].rp;
  w[i+j].rp = e[k+1].rp * (z[i].rp - z[m+i].rp)
            - e[k+1].ip * (z[i].ip - z[m+i].ip);

Instructions:

  i1:  l.s    f2, 4(r2)
  i2:  l.s    f0, 4(r5)
  i3:  fadd.s f0, f2, f0
  i4:  s.s    f0, 4(r6)
  i5:  l.s    f14, 8(r7)
  i6:  l.s    f6, 0(r2)
  i7:  l.s    f5, 0(r3)
  i8:  fsub.s f5, f6, f5
  i9:  fmul.s f4, f14, f5
  i10: l.s    f15, 12(r7)
  i11: l.s    f7, 4(r2)
  i12: l.s    f8, 4(r3)
  i13: fsub.s f8, f7, f8
  i14: fmul.s f8, f15, f8
  i15: fsub.s f8, f4, f8
  i16: s.s    f8, 0(r8)

(Figure: the corresponding precedence graph; most edges carry a latency of 2, and the edge from i9 into i15 carries a latency of 4.)
Resource Constraints
- Bookkeeping: prevent resources from being oversubscribed
The Value of Greedy List Scheduling

Example: consider the DAG shown below (figure not reproduced), using the list <i1, i2, i3, i4, i5>. Greedy scanning produces the steps of the schedule as follows:
1. On the first scan: i1, which becomes the first step.
2. On the second and third scans, out of list order: i4 and i5, which become steps two and three of the schedule.
3. On the fourth and fifth scans: i2 and i3, scheduled in steps four and five.
(The canonical list scheduling algorithm, continued:)
3. Greedily choose one ready instruction I from the ready list with the highest priority, possibly using tie-breaking heuristics.
4. Schedule I in the current cycle.
5. Add those instructions whose precedence constraints are now satisfied to the ready list. (A sketch of the full loop follows.)
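A compact sketch of the whole loop on the earlier 4-node DAG, assuming a single functional unit, unit execution times, and critical-path length as the rank (all assumptions; a real scheduler also models FU classes):

    #include <stdio.h>

    #define N 4   /* instructions i1..i4 from the earlier DAG */

    /* latency[i][j] >= 0 means an edge i -> j with that extra delay; -1 = no edge */
    static int latency[N][N] = {
        { -1, -1, -1,  0 },   /* i1 -> i4 */
        { -1, -1, -1,  1 },   /* i2 -> i4 with latency 1 (i2 is a load) */
        { -1, -1, -1,  0 },   /* i3 -> i4 */
        { -1, -1, -1, -1 },
    };

    /* Rank: longest path in cycles from the node down to any leaf */
    static int rank_of(int i) {
        int best = 0;
        for (int j = 0; j < N; j++)
            if (latency[i][j] >= 0) {
                int r = 1 + latency[i][j] + rank_of(j);
                if (r > best) best = r;
            }
        return best;
    }

    int main(void) {
        int npred[N] = { 0 }, ready_at[N] = { 0 }, done[N] = { 0 };
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                if (latency[i][j] >= 0) npred[j]++;

        for (int cycle = 0, issued = 0; issued < N; cycle++) {
            /* greedily pick the ready instruction with the highest rank */
            int pick = -1;
            for (int i = 0; i < N; i++)
                if (!done[i] && npred[i] == 0 && ready_at[i] <= cycle &&
                    (pick < 0 || rank_of(i) > rank_of(pick)))
                    pick = i;
            if (pick < 0) { printf("cycle %d: idle\n", cycle); continue; }
            done[pick] = 1; issued++;
            printf("cycle %d: issue i%d\n", cycle, pick + 1);
            for (int j = 0; j < N; j++)          /* release successors */
                if (latency[pick][j] >= 0) {
                    npred[j]--;
                    if (cycle + 1 + latency[pick][j] > ready_at[j])
                        ready_at[j] = cycle + 1 + latency[pick][j];
                }
        }
        return 0;
    }

On this DAG the sketch issues the load i2 first (it has the longest critical path), fills the latency slot with the other leaves, and issues i4 with no idle cycle, matching the S2-style schedule.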
Rank/Priority Functions/Heuristics
- Number of descendants in the precedence graph
- Maximum latency from the root node of the precedence graph
- Length of operation latency
- Ranking of paths based on importance
- Combinations of the above

Orientation of Scheduling

Instruction oriented:
- Initialization (priority and ready list)
- Choose one ready instruction I and find a slot in the schedule, making sure the resource constraint is satisfied

Cycle oriented:
- Initialization (priority and ready list)
- Step through the schedule cycle by cycle
- For the current cycle C, choose one ready instruction I, making sure the latency and resource constraints are satisfied
Example: (a + b) * c

(Figure: the expression DAG, with loads of a, b, and c feeding fadd, fsub, fdiv, and fmul nodes.)

Operation latencies:
- load: 2 cycles
- add: 1 cycle
- sub: 1 cycle
- mul: 4 cycles
- div: 10 cycles
(Figure: animation of the schedule in progress. Ready instructions are shown in green, red indicates not ready, and black indicates under execution.)
Some Intuition
- Greediness helps make sure that idle cycles don't remain if there are available instructions further downstream
- Ranks help prioritize nodes such that choices made early on favor instructions with greater enabling power, so that there is no unforced idle cycle
- The rank/priority function is critical
Rank Functions
1. "Postpass Code Optimization of Pipeline Constraints", J. Hennessy and T. Gross, ACM Transactions on Programming Languages and Systems, vol. 5, 422-448, 1983.
2. "Scheduling Expressions on a Pipelined Processor with a Maximal Delay of One Cycle", D. Bernstein and I. Gertner, ACM Transactions on Programming Languages and Systems, vol. 11, no. 1, 57-66, Jan 1989.
An example rank function:
1. Initially label all the nodes with the same value.
2. Compute new labels from old, starting with the nodes at level zero (i4) and working towards higher levels:
   (a) All nodes at level zero get the same base rank.
A well-known classical approach is to consider traces through the (acyclic) control flow graph. We shall return to this when we cover compiling for ILP processors.

Traces

"Trace Scheduling: A Technique for Global Microcode Compaction", J.A. Fisher, IEEE Transactions on Computers, vol. C-30, 1981.

Main ideas:
- Choose a program segment that has no cyclic dependences
- Choose one of the paths out of each branch that is encountered

(Figure: a CFG of basic blocks BB-1 through BB-7 ending in STOP; the highlighted trace is BB-1, BB-4, BB-6, with a branch instruction selecting among the off-trace blocks.)
Register Allocation

(Figure: placement in the compiler: Source Program -> Front End -> Back End, where register allocation and scheduling interact -- allocation may run both before and after scheduling.)
Live Ranges

(Figure: a CFG of blocks BB1..BB7 with a conditional; a is defined in BB1 (a := ...) and used (:= a) in BB3, BB5, and BB7.)

- Live range of virtual register a = (BB1, BB2, BB3, BB4, BB5, BB6, BB7)
- Def-use chain of virtual register a = (BB1, BB3, BB5, BB7)

A variable is live at block i if there is a direct reference to the variable at block i or at some block j that succeeds i in the CFG, provided the variable in question is not redefined in the interval between i and j. (A dataflow sketch follows.)
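The definition above is usually computed with the classic backward dataflow equations IN[b] = USE[b] ∪ (OUT[b] − DEF[b]) and OUT[b] = ∪ IN[succ(b)]. A minimal bit-vector sketch on an assumed diamond CFG (block contents here are illustrative, not the slide's example):

    #include <stdio.h>

    #define NBLOCKS 4
    #define NVARS   2          /* bit 0 = a, bit 1 = b */

    /* Diamond CFG: B0 -> B1, B0 -> B2, B1 -> B3, B2 -> B3 */
    static int nsucc[NBLOCKS]   = { 2, 1, 1, 0 };
    static int succ[NBLOCKS][2] = { {1, 2}, {3, 0}, {3, 0}, {0, 0} };

    static unsigned USE[NBLOCKS] = { 0x0, 0x1, 0x2, 0x1 }; /* upward-exposed uses */
    static unsigned DEF[NBLOCKS] = { 0x3, 0x0, 0x0, 0x0 }; /* a and b defined in B0 */

    int main(void) {
        unsigned in[NBLOCKS] = { 0 }, out[NBLOCKS] = { 0 };
        int changed = 1;
        while (changed) {                       /* iterate to a fixed point */
            changed = 0;
            for (int b = NBLOCKS - 1; b >= 0; b--) {
                unsigned o = 0;
                for (int s = 0; s < nsucc[b]; s++)
                    o |= in[succ[b][s]];        /* OUT[b] = union of successors' INs */
                unsigned i = USE[b] | (o & ~DEF[b]);
                if (o != out[b] || i != in[b]) changed = 1;
                out[b] = o; in[b] = i;
            }
        }
        for (int b = 0; b < NBLOCKS; b++)
            printf("B%d: IN=%#x OUT=%#x\n", b, in[b], out[b]);
        return 0;
    }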
Example 1: (Figure: B1: a = ...; B2: b = ...; B3: .. = a; B4: .. = b.)
- Live range of a = {B1, B3}
- Live range of b = {B2, B4}
- No interference! a and b can be assigned to the same register.

Example 2: (Figure: B1: a = ...; B2: b = ...; B3: c = c + 1; B4: ... = a + b.)
- Live range of a = {B1, B2, B3, B4}
- Live range of b = {B2, B4}
- Live range of c = {B3}
- In this example, a and c interfere, and c should be given priority because it has a higher usage count.
Interference Graph

Definition: an interference graph G is an undirected graph with the following properties:
(a) each node x denotes exactly one distinct live range X, and
(b) an edge exists between nodes x and y iff X and Y interfere (overlap), where X and Y are the live ranges corresponding to nodes x and y.

Example code:

  a :=
  b :=
  c :=
     := a
     := b
  d :=
     := c
     := d

(Figure: the resulting interference graph; nodes model live ranges. A construction sketch follows.)
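A minimal sketch of building the edge set from live ranges, treating each live range as a bit set of the basic blocks it spans (bit k = block Bk+1); the three ranges are illustrative, chosen so that a and b do not overlap but both overlap c:

    #include <stdio.h>

    #define NRANGES 3

    static unsigned range[NRANGES]   = { 0x5, 0xA, 0xF }; /* a={B1,B3}, b={B2,B4}, c={B1..B4} */
    static const char *name[NRANGES] = { "a", "b", "c" };

    int main(void) {
        int interfere[NRANGES][NRANGES] = { 0 };
        for (int x = 0; x < NRANGES; x++)
            for (int y = x + 1; y < NRANGES; y++)
                if (range[x] & range[y]) {        /* live ranges overlap */
                    interfere[x][y] = interfere[y][x] = 1;
                    printf("edge %s -- %s\n", name[x], name[y]);
                }
        return 0;
    }

Here a and b share no block, so no edge is added between them; both get an edge to c.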
Graph Coloring
- Given an undirected graph G and a set of k distinct colors, compute a coloring of the nodes of the graph: assign a color to each node such that no two adjacent nodes get the same color. (Recall that two nodes are adjacent iff they have an edge between them.)
- A given graph might not be k-colorable. In general, it is a computationally hard problem to color a given graph using a given number k of colors, so register allocation uses good heuristics for coloring (a sketch follows).
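As a concrete illustration, here is a minimal sketch of the simplify/select heuristic described on the following slides, for an assumed 5-node interference graph and K = 3 registers (the graph, the names, and the absence of real spill handling are all illustrative simplifications):

    #include <stdio.h>

    #define V 5    /* nodes r1..r5 */
    #define K 3    /* number of colors (physical registers) */

    static int adj[V][V] = {           /* an illustrative interference graph */
        { 0, 1, 1, 0, 0 },
        { 1, 0, 1, 1, 0 },
        { 1, 1, 0, 1, 1 },
        { 0, 1, 1, 0, 1 },
        { 0, 0, 1, 1, 0 },
    };

    static int degree(int v, const int removed[]) {
        int d = 0;
        for (int u = 0; u < V; u++)
            if (!removed[u] && adj[v][u]) d++;
        return d;
    }

    int main(void) {
        int removed[V] = { 0 }, stack[V], top = 0;

        /* Simplify: repeatedly remove a node with fewer than K neighbors.
           (A real allocator spills a node when this step blocks.) */
        for (int pass = 0; pass < V; pass++)
            for (int v = 0; v < V; v++)
                if (!removed[v] && degree(v, removed) < K) {
                    removed[v] = 1;
                    stack[top++] = v;
                    break;
                }

        /* Select: pop nodes and give each the lowest color unused by neighbors */
        int color[V];
        for (int v = 0; v < V; v++) color[v] = -1;
        while (top > 0) {
            int v = stack[--top];
            for (int c = 0; c < K; c++) {
                int used = 0;
                for (int u = 0; u < V; u++)
                    if (adj[v][u] && color[u] == c) used = 1;
                if (!used) { color[v] = c; break; }
            }
            printf("r%d -> color %d\n", v + 1, color[v]);
        }
        return 0;
    }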
Interference Graph

(Figure: an example loop body -- r1, r2 & r3 are live-in, the loop repeats n-1 times and exits on beq r2, $0 -- and the interference graph built from it.)
Example (N = 4)

(Figure: simplification on the example graph -- r6 and then r5 are removed and pushed onto the COLOR stack; when simplification blocks, r1 is chosen for spilling. Is this a good choice??)

- Add removed nodes back one by one, picking a legal color as each is added (two nodes connected by an edge get different colors)
- This must be possible with fewer than N colors
- Complication: simplification can block if there is no node with fewer than N edges; choose one node to spill based on a spilling heuristic
Example (N = 5)

(Figure: with N = 5 colors the same graph simplifies completely -- the COLOR stack grows from {} as r1, r2, r3, r4, ... are removed -- so no spill is needed.)
Register Spilling
- When simplification is blocked, pick a node to delete from the graph in order to unblock it. Deleting a node means the variable it represents will not be kept in a register (i.e., it is spilled into memory).
- When constructing the interference graph, each node is assigned a value indicating its estimated spill cost. The estimated cost can be a function of the total number of definitions and uses of the variable, weighted by its estimated execution frequency. When the coloring procedure is blocked, the node with the least spilling cost is picked for spilling.
- When a node is spilled, spill code is added to the original code to store the spilled variable at its definition and to reload it at each of its uses.
- After spill code is added, a new interference graph is rebuilt from the modified code, and N-coloring of this graph is attempted again.
"The Priority-Based Coloring Approach to Register Allocation", F. Chow and J. Hennessy, ACM Transactions on Programming Languages and Systems, vol. 12, 501-536, 1990.

(Hennessy: founder of MIPS, and later President of Stanford Univ!)
The final major difference is the place of register allocation in the overall compilation process. In the present approach, the interference graph is considered earlier in the compilation process, using intermediate-level statements, and compiler-generated temporaries are known. In contrast, in the previous work allocation is done at the level of the machine code.
The Algorithm

For all constrained live ranges, execute the following steps:
1. Compute Priority(X) if it has not already been computed.
2. For the live range X with the highest priority:
   (a) If its priority is negative, or if no basic block i in X can be assigned a register (because every color has been assigned to a basic block that interferes with i), then delete X from the list and modify the interference graph.
   (b) Else, assign it a color that is not in its forbidden set.
   (c) Update the forbidden sets of the members of INTERFERE for X.

(A sketch of one possible priority function follows.)
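Below is a sketch of one plausible priority function in the spirit of Chow and Hennessy's allocator: estimated spill savings weighted by execution frequency and normalized by live-range size. The exact formula in the paper differs in detail, so the function and parameter names here are illustrative assumptions:

    /* Priority(X): weighted benefit of keeping live range X in a register,
       normalized by the number of basic blocks it spans.  loadsave is the
       assumed cost of a load/store; freq[] and refs[] are the per-block
       execution frequency and reference counts for X (illustrative inputs). */
    double priority(int nblocks, const double freq[], const int refs[],
                    double loadsave) {
        double savings = 0.0;
        for (int b = 0; b < nblocks; b++)
            savings += loadsave * refs[b] * freq[b];
        return savings / nblocks;      /* favor short, heavily used ranges */
    }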
Example: Live Range Splitting

(Figure: a CFG of blocks BB1..BB6; a and b are defined in BB1, c is defined in BB2, a and c are used in BB5 (:= a, := c), and b is used in BB6 (:= b).)

Live ranges:
- a: BB1, BB2, BB3, BB4, BB5
- b: BB1, BB2, BB3, BB4, BB5, BB6
- c: BB2, BB3, BB4, BB5

Assume the number of physical registers = 2.

New live ranges after splitting b:
- a: BB1, BB2, BB3, BB4, BB5
- b: BB1
- c: BB2, BB3, BB4, BB5
- b2: BB6

b and b2 are logically the same program variable; b2 is a renamed equivalent of b, and spill code is introduced to connect them. All nodes are now unconstrained.
Scheduling more flexible regions implies using features such as speculation, code duplication, and predication.

Trace scheduling
- Pick a trace in the program graph: the most frequently executed region of code
(Figure: trace selection on a weighted CFG -- blocks such as BB-4..BB-7 and C, D, E, G, H, I, with branch probabilities like 0.4/0.6, 0.8/0.2, and 0.2/0.8; the trace follows the most probable path down to STOP.)

Scheduling algorithm
- Input is the region (trace, superblock, etc.)
- Use the list scheduling algorithm
- Treat movement of instructions past branch and join points as special cases
The Four Elementary but Significant Side-effects

Consider a single instruction A moving past a conditional branch (figure: instruction A, the branch instruction, and the off-trace path):
- Moving A below the branch means the instruction sometimes executes when it ought not to have: speculatively.
- The join case is identical, except that the pseudo-dependence edge runs from A to the join instruction whenever A is a write (a def). A more general solution is to permit the code motion but undo the effect of the speculated definition by adding repair code -- an expensive proposition in terms of compilation cost.
- If instruction A would not be executed when the off-trace path is taken, then to avoid mistakes it is replicated onto the off-trace path.
Super Block
- A trace with a single entry but potentially many exits
- Simplifies code motion during scheduling:
  - upward movements past a side entry within a block are pure replication
  - downward movements past a side entry within a block are pure speculation
- The superblock is a scheduling region composed of basic blocks with a single entry but potentially many exits
- Superblock formation is done in two steps: trace selection, then tail duplication
- A larger scheduling region exposes more instructions that may be executed in parallel
(Figure: tail duplication -- after the branch that sets either y=1, u=v or y=2, u=w, and the test "if x=3", the join block containing x=y*2 and z=y*3 is duplicated, so constant propagation can fold the trace copy, e.g. to x=2 and z=6. A source-level sketch follows.)
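A source-level reconstruction of what the figure shows, under the assumption that the two paths set y to 1 and 2 respectively (the u = v / u = w assignments are elided here):

    /* Before: the block after the if-else is a control-flow merge, so the
       compiler cannot assume a value for y when compiling x = y*2. */
    void before(int cond, int *x, int *z) {
        int y;
        if (cond) y = 1;
        else      y = 2;
        *x = y * 2;
        *z = y * 3;
    }

    /* After tail duplication: each path gets its own copy of the tail, and
       constant propagation folds the multiplications on each copy. */
    void after(int cond, int *x, int *z) {
        if (cond) { *x = 2; *z = 3; }     /* y == 1 on this path */
        else      { *x = 4; *z = 6; }     /* y == 2 on this path */
    }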
Advantage of Superblock
- The replication has already been taken care of when we form the region
- Schedule the region independent of other regions! We don't have to worry about code replication each time we move an instruction around a branch
- Send the superblock to the list scheduler and it works the same as it did with basic blocks!

(Figure: superblock formation on a CFG of blocks BB1..BB10 -- the trace through the 0.8-probability branches is selected, and tail blocks such as BB2, BB5, and BB7 are duplicated as BB8, BB9, and BB10 to remove side entries.)
Loop Peeling
- Creates a bigger region for a nested loop

Node Splitting
- Eliminates dependencies created by control path merges
- Cost: large code expansion
Tail Duplication and Loop Peeling

(Figure: tail duplication and loop peeling on a CFG with tests x > 0, y > 0, and x = 1, and statements v := v*x, v := v+1, v := v-1, and u := v+y; the merge block u := v+y is duplicated onto each incoming path.)
Node Splitting

(Figure: node splitting on the same CFG -- merge blocks such as (v := v+1; k := k+1), (v := v-1), and (u := v+y; l := k+z) are duplicated so that each control path gets a private copy, eliminating the dependences created by the merge at the price of code expansion.)

Assembly Code (reconstructed from the garbled slide):

  A: ble x,0,C
  B: ble y,0,F
  D: bne x,1,F
  E: v := v+1
  F: v := v-1
  G: u := v+y
  C: v := v*x
If Conversion

Branching code:

  A: ble x,0,C
  B: ble y,0,F
  D: bne x,1,F
  E: v := v+1
  F: v := v-1
  G: u := v+y
  C: v := v*x

If-converted (predicated) code, with the guards reattached where the slide's layout scattered them:

  ble x,0,C
  d := ?(y>0)
  f := ?(y<=0)
  e := ?(x=1)    if d
  f := ?(x!=1)   if d     ; ORs with the earlier define of f
  v := v+1       if e
  v := v-1       if f
  u := v+y
  C: v := v*x
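A C-level sketch of the same transformation; the control structure is reconstructed from the garbled slide, so treat the exact conditions as assumptions:

    /* Branching version */
    int branchy(int x, int y, int v) {
        if (x > 0) {
            if (y > 0 && x == 1)
                v = v + 1;
            else
                v = v - 1;
        } else {
            v = v * x;
        }
        return v + y;   /* u := v + y */
    }

    /* If-converted version: predicates are computed once and both arms are
       guarded.  A compiler for a predicated machine emits this as
       straight-line code with no branches inside the region. */
    int predicated(int x, int y, int v) {
        int d = (x > 0);
        int e = d && (y > 0) && (x == 1);   /* predicate for v := v+1 */
        int f = d && !e;                    /* predicate for v := v-1 */
        v = e ? v + 1 : v;
        v = f ? v - 1 : v;
        v = !d ? v * x : v;
        return v + y;
    }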
Compiler Infrastructure
- Operates only on the C language as input
- Uses a general machine description language (HMDES)
- Uses a parameterized processor architecture called HPL-PD (a.k.a. PlayDoh); all architectures are mapped into and simulated in HPL-PD