RL4ReAl
● Solutions
○ Constraint-based: ILP and PBQP formulations
○ Heuristic approaches
LLVM’s Register Allocation Strategies and Heuristics
[Figure: live ranges of x, y and z under splitting (x into x1 and x2), coalescing, spilling (to memory M) and eviction (register R1)]
● No single best allocator
○ Greedy performs better in general
RL4ReAl: Objectives
Objectives: Machine Learning Framework for Register Allocation
● End-to-end application of Reinforcement Learning for register allocation
Constraints in Register Allocation
Register Allocation: Correctness constraints
Registers are complicated!
1. Register constraints
2. Type constraints
3. Congruence constraints
4. Interference constraints
Register Constraints
● Architectural constraints
○ E.g., IDIV32 → divides the contents of $eax; stores the result in $eax and $edx
Type constraints
● Different types of registers in a register file
○ General purpose registers
○ Floating point registers
○ Vector registers, …
Congruence constraints
● Real-world ISAs have a hierarchy of register classes
○ Congruent classes
x = 10
y = 20
print x
z = 20 + y
print y
z = z + 10
print z
[Figure: interval interference and interference graph for x, y and z, with register R1]
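The interference relation for the example program can be reproduced with a small sketch. It assumes a straight-line program, numbers statements from 1, and treats a live interval as running from a variable's first definition to its last mention. The `program` encoding and the function names are illustrative and are not part of LLVM.

```python
from itertools import combinations

# Each statement is (defined variable or None, [used variables]).
# This encoding of the example program is an assumption made for illustration.
program = [
    ("x", []),      # x = 10
    ("y", []),      # y = 20
    (None, ["x"]),  # print x
    ("z", ["y"]),   # z = 20 + y
    (None, ["y"]),  # print y
    ("z", ["z"]),   # z = z + 10
    (None, ["z"]),  # print z
]

def live_intervals(stmts):
    """Map each variable to (first definition, last mention), 1-indexed."""
    intervals = {}
    for idx, (defined, used) in enumerate(stmts, start=1):
        for var in ([defined] if defined else []) + used:
            start, _ = intervals.get(var, (idx, idx))
            intervals[var] = (start, idx)
    return intervals

def interference_edges(intervals):
    """Two variables interfere when their intervals strictly overlap."""
    edges = set()
    for (a, (s1, e1)), (b, (s2, e2)) in combinations(sorted(intervals.items()), 2):
        if s1 < e2 and s2 < e1:
            edges.add((a, b))
    return edges

intervals = live_intervals(program)
edges = interference_edges(intervals)
print(intervals)
print(edges)
```

Under these assumptions x interferes with y, and y with z, while x and z do not interfere and could therefore share one register.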
[Figure: MLRegAlloc and the model communicate through gRPC stubs; labels: Update, Split Info]
Interference graphs
Edges: {phy reg - vir reg, vir reg - vir reg}
Vertices
● MIR instruction representations in the live range of a variable
● Final representation: ℝ^(m × n)
MIR2Vec representations
● n-dimensional vector representation
Grouping opcodes
● MIR has specialized opcodes
● Based on width, source and destination types
○ 200 different MOV instructions
○ MOV32rm, MOVZX64rr16, MOVAPDrr, etc.
● Generic opcodes
○ Specialized opcodes are grouped together
○ {MOV32rm, MOVZX64rr16, MOVAPDrr, …} → MOV
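The grouping step can be sketched as a lookup from a specialized opcode to a generic name. The prefix rule and the `GENERIC_OPCODES` list below are assumptions made for illustration; the slide does not say how the grouping is actually derived.

```python
# Hypothetical list of generic opcode names; a real table would cover the whole target.
GENERIC_OPCODES = ["MOV", "ADD", "SUB"]

def generic_opcode(opcode, generics=GENERIC_OPCODES):
    """Return the longest generic name that prefixes the opcode, else the opcode itself."""
    matches = [g for g in generics if opcode.startswith(g)]
    return max(matches, key=len) if matches else opcode

for op in ["MOV32rm", "MOVZX64rr16", "MOVAPDrr", "ADD64rr"]:
    print(op, "->", generic_opcode(op))
```

Collapsing the specialized variants this way shrinks the vocabulary that the embedding has to learn, at the cost of discarding width and operand-kind details.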
Representing Interference graphs
● GGNNs - Gated Graph Neural Networks
○ Processing graph structured inputs
● Message passing
○ Information propagated multiple times across nodes
● ℝ^(m × n) → ℝ^k
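A much simplified sketch of message passing over an interference graph follows. It is not a GGNN: a fixed scalar `gate` stands in for the learned GRU update, there are no trainable weights, and each vertex is assumed to start from a single vector rather than an m × n matrix. It only illustrates how repeated rounds let information travel across several edges.

```python
import math

def propagate(features, edges, rounds=2, gate=0.5):
    """Run `rounds` of message passing over an undirected graph.

    features: dict mapping node -> list of floats (all the same length).
    gate: fixed mixing weight standing in for a learned GRU gate (assumption).
    """
    neighbors = {node: [] for node in features}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    state = {node: list(vec) for node, vec in features.items()}
    for _ in range(rounds):
        updated = {}
        for node, h in state.items():
            # Message: element-wise sum of the neighbors' current states.
            message = [sum(state[nb][i] for nb in neighbors[node]) for i in range(len(h))]
            # Gated update: blend the old state with the squashed message.
            updated[node] = [(1 - gate) * hi + gate * math.tanh(mi)
                             for hi, mi in zip(h, message)]
        state = updated
    return state

edges = [("x", "y"), ("y", "z")]
base = {"x": [1.0, 0.0], "y": [0.0, 1.0], "z": [1.0, 1.0]}
changed = dict(base, z=[5.0, 5.0])  # same graph, different features on z

one_a, one_b = propagate(base, edges, rounds=1), propagate(changed, edges, rounds=1)
two_a, two_b = propagate(base, edges, rounds=2), propagate(changed, edges, rounds=2)
print(one_a["x"] == one_b["x"])  # after one round x has only heard from y
print(two_a["x"] == two_b["x"])  # after two rounds z's features reach x through y
```

The comparison at the end shows why propagation is repeated: x and z share no edge, so z influences x only from the second round onward.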
Hierarchical Reinforcement Learning
● Environment - MLRegAlloc pass in LLVM
○ Generates interference graphs + representations
○ Register allocation, splitting and spilling as per the prediction
● Agents
○ Node selection
○ Task selection
○ Splitting
○ Coloring
Agents
● Node Selection Agent
○ Selects the vertex to process next
● Task Selection Agent
○ Selects between split and color
○ Action space: Split or Color; Split is allowed only if #Uses > k (k = 2)
● Splitting Agent
○ Predicts the split point in the live range of a variable
○ Reward: difference in spill weights before and after splitting
● Coloring Agent
○ Picks an appropriate color for a given vertex
○ Reward: +Spill weight, if colored; -Spill weight, if spilled
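The action restriction and the local rewards can be written down as a few small functions. This is a sketch under stated assumptions: the function names are hypothetical, and for the splitting reward both the sign and the way the new ranges' weights are combined are guesses, since the slide only says "difference in spill weights before and after splitting".

```python
SPLIT_USE_THRESHOLD = 2  # k from the slide

def allowed_tasks(num_uses, k=SPLIT_USE_THRESHOLD):
    """Split is offered only when the vertex has more than k uses."""
    return ["split", "color"] if num_uses > k else ["color"]

def coloring_reward(spill_weight, colored):
    """+spill weight when a register is assigned, -spill weight when spilled."""
    return spill_weight if colored else -spill_weight

def splitting_reward(weight_before, weights_after):
    """Difference in spill weights before and after the split.

    Assumption: summed weight of the new ranges minus the original weight.
    """
    return sum(weights_after) - weight_before

print(allowed_tasks(2))
print(allowed_tasks(3))
print(coloring_reward(4.0, colored=False))
```

Masking the split action for vertices with few uses keeps the task selection agent from exploring splits that cannot shorten a live range in any useful way.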
Materialization of splitting
● Involves inserting move instructions
● Dataflow problem
○ Similar to phi or copy placement
Global Rewards
● Based on the throughput (Th) of the generated function
● Use LLVM MCA
○ Machine Code Analyzer of LLVM
○ Static model to estimate throughput
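As a rough illustration of reading a static estimate, the sketch below parses text laid out like the summary view printed by llvm-mca. The sample numbers are invented for the example, and which statistic corresponds to Th, as well as how it is turned into a reward, is not stated on the slide.

```python
import re

# Illustrative text in the layout of llvm-mca's summary view; the numbers are
# made up for this example and are not measurements.
SAMPLE_SUMMARY = """\
Iterations:        100
Instructions:      300
Total Cycles:      110
Total uOps:        300

Dispatch Width:    2
uOps Per Cycle:    2.73
IPC:               2.73
Block RThroughput: 1.5
"""

def parse_summary(text):
    """Collect 'Name: number' lines into a dict of floats."""
    stats = {}
    for line in text.splitlines():
        match = re.match(r"^\s*([^:]+):\s+([0-9.]+)\s*$", line)
        if match:
            stats[match.group(1)] = float(match.group(2))
    return stats

stats = parse_summary(SAMPLE_SUMMARY)
print(stats["IPC"], stats["Block RThroughput"])
```

Because the estimate comes from a static model, it can be computed for every generated function during training without running the program.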
Integration with LLVM
● RL4ReAl - to-and-fro communication
○ Decisions/Actions by Python model
○ Materialization of decisions in C++ compiler
Training
1. Request: Interference graph
Training phase
Inference
[Figure: LLVM (C++) sends (1) the interference graph to the RL model (Python), which sends back (2) a reply with its decisions]
Inference phase
○ For any input code, LLVM (C++) sends a request to the trained model for a splitting decision
○ The trained model replies with its decision, and code is generated accordingly
Experiments
● MIR2Vec representations
○ 2000 source files from SPEC CPU 2017 and C++ Boost libraries
○ 100-dimensional embeddings; trained over 1000 epochs
● Evaluation
○ x86 - Intel Xeon W2133, 6 cores, 32GB RAM
○ AArch64 - ARM Cortex A72, 2 cores, 4GB RAM
● Register allocations
○ General purpose, floating point and vector registers
Runtime improvements on x86
Analysis of Hot functions
% speedups obtained by Greedy and RL4ReAl over Basic
Runtimes on AArch64
Policy Improvement on Regression cases
● Regression in performance
○ Identify → Refine heuristics → Evaluate
Trofin et al., MLGO: a machine learning guided compiler optimizations framework, arXiv, 2021
Summary
● https://compilers.cse.iith.ac.in/publications/rl4real
Thank You!
https://compilers.cse.iith.ac.in/publications/rl4real/