TechTalk Kruppe Espasa RISC V Vectors and LLVM

The document discusses RISC-V's vector extension (RVV) and LLVM support. RVV provides a simple yet high performance vector ISA that can scale to large and small cores. It includes features like mixed-width computations and vectorization support. LLVM support includes vector types, intrinsics and initial code generation work, but more remains to be done like autovectorization. RVV shows promise as an open standard for efficient vector processing.


Adventures with RISC-V

Vectors and LLVM

Robin Kruppe, Embedded Systems and Applications Group
Roger Espasa, Chief Architect

Background
• RISC-V is a new open-source ISA rapidly gaining momentum
• Definition controlled by the RISC-V Foundation
• No license fee to implement a processor using RISC-V
• Over 200 companies have joined the Foundation
• Very simple and clean ISA, with focus on extensibility
• Supports RISC-V Foundation-sponsored extensions
• As well as your proprietary “secret sauce” extensions
• There's a backend in LLVM

RISC-V Vector Extension (RVV)
• Simple, high performance, high efficiency vector processing
• Scale up & down to large & small cores
• Also base for further domain-specific extensions
• https://github.com/riscv/riscv-v-spec/
• Status: WIP but stable draft, building SW+HW and evaluating

Feature Highlight Reel
• Programmability: lots of support for vectorization
• Mixed-width computations, widening operations
• Fixed-point and f16
• Precise exceptions (with caveats for embedded platforms)
• Base for further specialized extensions, e.g. for matrix math, complex numbers, DSP, ML, graphics, …
• Wide variety of microarchitecture styles supported, yet portable code
• Yes, you can build SIMD
• Yes, you can also build temporal Vectors (Cray anyone?)

Support for Vectorization
• Strip-mined loops – no remainder handling needed
• Masking on (almost) every vector instruction
• Strided loads and stores, scatters, gathers
• Reduction instructions (sum, min/max, and/or, …)
• Orthogonal set of vector operations, parity with scalar ISA
• Fault-only-first loads for loops with data-dependent exits
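To illustrate the last bullet, here is a minimal Python sketch of how a fault-only-first load lets a loop with a data-dependent exit (a strlen-style scan) read ahead safely. The function names and the list-based memory model are assumptions made for this illustration; the only behaviour modelled is that a fault on the first element traps, while a fault on a later element merely shortens the vector length.

```python
def load_fault_only_first(memory, base, vl):
    """Model a unit-stride fault-only-first load (illustrative only).

    `memory` is a list standing in for mapped memory; any index past its
    end counts as an access fault. Returns (loaded_elements, new_vl).
    """
    if base >= len(memory):
        # A fault on element 0 is reported as a real trap.
        raise MemoryError("fault on first element")
    loaded = []
    for i in range(vl):
        addr = base + i
        if addr >= len(memory):
            break  # fault on a later element: stop and shrink vl instead
        loaded.append(memory[addr])
    return loaded, len(loaded)


def strlen_vectorized(memory, base, vlmax):
    """strlen-style loop: the exit depends on the data (a 0 terminator)."""
    length = 0
    while True:
        chunk, vl = load_fault_only_first(memory, base + length, vlmax)
        if 0 in chunk:
            return length + chunk.index(0)
        length += vl
```

The loop never needs to know in advance how far it may read: each iteration simply continues from however many elements were actually loaded.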

Register State: 32 registers of VLEN bits
• 32 register names: v0 through v31
• Each register is VLEN-bits wide
• VLEN is chosen by implementation, must be power of 2
• See spec for additional restrictions in relation to ELEN and SLEN
• Some control registers
• VL = active vector length
• SEW = standard element width, hosted in vsew[2:0]
• LMUL = grouping multiplier
SEW determines number of elements per vector
• SEW = Standard Element Width
• Dynamically settable through ‘vsew[2:0]’
• Each vector register viewed as VLEN/SEW elements, each SEW-bits wide
• Polymorphic instructions
• vadd can be an i8/i16/i32/… add depending on SEW
• Set up along with VL (vsetvli t0, a0, e32)
Example: VLEN=256b, vsew='010, SEW=32b, elements = VLEN/SEW = 8
[Figure: vector register file v0 … v31, each register VLEN = 256b wide and divided into eight 32b elements]
vfadd.vv v0, v1, v2
    for (i = 0; i < VL; ++i)
        v0[i] = v1[i] + v2[i];
    v0[VL..VLMAX] = 0;

• Lanes past VL don't trap, raise exceptions, access memory, etc.
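The element-count arithmetic and the add semantics on this slide can be sketched in Python as follows. This is an illustrative model rather than a reference implementation: the function names are made up, registers are plain lists, and the tail-zeroing behaviour is copied from the pseudocode on the slide.

```python
def elements_per_register(vlen_bits, sew_bits):
    """Number of SEW-bit elements that fit in one VLEN-bit register."""
    if vlen_bits % sew_bits != 0:
        raise ValueError("VLEN must be a multiple of SEW")
    return vlen_bits // sew_bits


def vector_add(v1, v2, vl):
    """Add the first `vl` lanes; lanes from vl up to VLMAX are zeroed,
    as in the slide's pseudocode."""
    vlmax = len(v1)
    if len(v2) != vlmax or not 0 <= vl <= vlmax:
        raise ValueError("operands must have VLMAX lanes and 0 <= vl <= VLMAX")
    body = [a + b for a, b in zip(v1[:vl], v2[:vl])]
    return body + [0] * (vlmax - vl)
```

With the slide's numbers, `elements_per_register(256, 32)` gives the 8 elements per register shown in the example.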

Register Grouping: LMUL
• Groups registers to form “longer vector”
• Reduces number of valid register names
• Number of registers in each group is LMUL
• LMUL can be 1, 2, 4, 8
• Example: when LMUL=2
• vadd v2, v4, v6 really means (v2,v3) := (v4,v5) + (v6,v7)
• Also used for widening operators (32b x 32b → 64b result)
• Like SEW, set with VL (vsetvli t0, a0, e32, m4)
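A small Python sketch of register grouping, under the assumption (taken from the vector spec rather than from this slide) that a group must start at a register number that is a multiple of LMUL. The helper names are hypothetical.

```python
VALID_LMUL = (1, 2, 4, 8)


def register_group(base_reg, lmul):
    """Registers covered when `base_reg` names a group of size LMUL."""
    if lmul not in VALID_LMUL:
        raise ValueError("LMUL must be 1, 2, 4 or 8")
    if not 0 <= base_reg < 32:
        raise ValueError("register number must be in 0..31")
    if base_reg % lmul != 0:
        # Assumption: only LMUL-aligned register names are valid.
        raise ValueError("register number must be a multiple of LMUL")
    return [f"v{base_reg + i}" for i in range(lmul)]


def vlmax(vlen_bits, sew_bits, lmul):
    """Maximum vector length: a group holds LMUL registers' worth of elements."""
    return lmul * vlen_bits // sew_bits
```

For the LMUL=2 example above, `register_group(2, 2)`, `register_group(4, 2)` and `register_group(6, 2)` give the pairs (v2,v3), (v4,v5) and (v6,v7).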
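Strip-mining
Increase each array element (length in a0, pointer in a1) by the same amount (a2)
loop:
    vsetvli t0, a0, e32     # sets SEW=32; t0 = VL = min(a0, VLMAX)
    vlw.v   v0, (a1)        # load VL 32-bit elements
    vadd.vs v2, v0, a2      # polymorphic: element width comes from SEW
    vsw.v   v2, (a1)        # store VL elements back
    sub     a0, a0, t0      # remaining elements -= VL
    ...                     # advance ptr by VL elements
    bnez    a0, loop        # repeat until no elements remain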
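The same strip-mined loop can be traced in Python to see why no remainder handling is needed. This is a sketch that assumes the simplest length rule, vl = min(remaining, VLMAX); the function names are illustrative and the list slice stands in for the vector load, add and store.

```python
def vsetvl(avl, vlmax):
    """Assumed rule for this sketch: grant min(requested length, VLMAX)."""
    return min(avl, vlmax)


def add_scalar_strip_mined(data, amount, vlmax):
    """Add `amount` to every element of `data` in place, VL elements at a time.

    Returns the vector length used on each trip through the loop.
    """
    n = len(data)   # plays the role of a0
    ptr = 0         # plays the role of a1 (as an index)
    trip_lengths = []
    while n != 0:
        vl = vsetvl(n, vlmax)
        data[ptr:ptr + vl] = [x + amount for x in data[ptr:ptr + vl]]
        n -= vl
        ptr += vl   # advance ptr by VL elements
        trip_lengths.append(vl)
    return trip_lengths
```

For 10 elements and VLMAX = 8 the loop runs twice, with vl = 8 and then vl = 2; the final short trip is handled by the same code path as the full ones.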

Mixed-precision Calculations
• Usually, biggest data type limits vector length
• Unless you want lots of shuffles
• Alternative with RISC-V V:
• pack 16b elements tightly
• 32b elements span two registers
• Switch LMUL to work with both
• No need to shuffle in registers
• Tradeoff: not a win on all uarchs
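The arithmetic behind this alternative can be checked with a short Python sketch: if the narrow type uses LMUL=1 and the wide type uses a proportionally larger LMUL, both cover the same number of elements. The function names are hypothetical and the choice of LMUL=1 for the narrow type is an assumption of this sketch.

```python
def vlmax(vlen_bits, sew_bits, lmul):
    """Elements available with the given element width and grouping."""
    return lmul * vlen_bits // sew_bits


def mixed_width_config(vlen_bits, narrow_sew, wide_sew):
    """Return (narrow_lmul, wide_lmul, elements) such that both element
    widths cover the same number of elements."""
    if wide_sew % narrow_sew != 0:
        raise ValueError("wide width must be a multiple of narrow width")
    wide_lmul = wide_sew // narrow_sew
    if wide_lmul not in (1, 2, 4, 8):
        raise ValueError("required LMUL is not one of 1, 2, 4, 8")
    elements = vlmax(vlen_bits, narrow_sew, 1)
    # The wide view spans wide_lmul registers but has the same element count.
    assert elements == vlmax(vlen_bits, wide_sew, wide_lmul)
    return 1, wide_lmul, elements
```

For VLEN = 256b, 16b elements at LMUL=1 and 32b elements at LMUL=2 both give 16 elements, which is why no in-register shuffling is required.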

LLVM Support
• Out-of-tree patches @ https://github.com/rkruppe/rvv-llvm
• Want to start upstreaming when spec frozen
• Mostly MC and CodeGen work so far
• Very interested in autovectorization, but needs groundwork
• Status: can manually write vector code in IR and CodeGen it

Strip-mined Loop in IR
loop:
    %n = phi ...
    %ptr = phi ...
    %vl = call i32 @llvm.riscv.vsetvl(i32 %n)
    %v1 = call <scalable 1 x i32> @llvm.riscv.vlw(%ptr, i32 %vl)
    %v2 = call … @llvm.riscv.vadd.sv1i32(%v1, %splat, i32 %vl)
    call void @llvm.riscv.vsw(%ptr, %v2, i32 %vl)
    %n.new = sub i32 %n, %vl
    %ptr.new = ...
    %done = icmp eq i32 %n.new, 0

IR Vector Type
• <scalable k x T> type proposed by Arm for their Scalable Vector Extension (SVE)
• Lots of common ground (even more than last year!)
• Vector register size unknown at compile time, constant at runtime
• but: known constant factor, e.g., VLEN multiple of 64b
• Want to use whatever gets accepted upstream for SVE
• References
• https://llvm.org/D32530

IR Intrinsics
• @llvm.riscv.vadd.sv1i32(op1, op2, i32 vl, mask)
• Active vector length is just another argument
• Masking as part of every operation, not external select
• Essentially like Simon Moll's Vector Predication proposal
• Note: no mention of SEW/LMUL
• References
• https://llvm.org/D57504
• Simon Moll’s talk earlier today
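The shape of such an intrinsic, with the active vector length and the mask passed as ordinary arguments, can be modelled in a few lines of Python. This is only a sketch of the calling convention described above: the function name is made up, and the value of inactive lanes (masked off or past vl) is deliberately left as `None` because the slide does not specify it.

```python
def riscv_vadd_model(op1, op2, vl, mask=None):
    """Model of an add that takes vl and mask as explicit arguments.

    Lanes that are masked off or lie at index >= vl are returned as None,
    meaning "not computed in this sketch".
    """
    lanes = len(op1)
    if len(op2) != lanes or not 0 <= vl <= lanes:
        raise ValueError("operands must match and 0 <= vl <= number of lanes")
    if mask is None:
        mask = [True] * lanes  # unmasked operation
    if len(mask) != lanes:
        raise ValueError("mask must have one entry per lane")
    result = []
    for i, (a, b) in enumerate(zip(op1, op2)):
        active = i < vl and mask[i]
        result.append(a + b if active else None)
    return result
```

Because vl and the mask travel with the operation, no separate select is needed after the add to discard inactive lanes.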

CodeGen Perspective
• VL is just another (allocatable) integer register
• Copies to/from GPR supported
• Input to most vector instructions, output of vsetvl
• Need to figure out how to “spill” it
• vtype is a reserved physical register
• Implicitly used by everything, defined by vsetvl
• Managed by backend, no IR representation
• SEW, LMUL dictated by vector types used in IR

Instruction Selection
• Straightforward mapping of intrinsics to (pseudo-)instructions
• Hardware instructions are polymorphic, but compiler needs static info
• Pseudos for each element width and LMUL
• Different LMUL also means different register classes (e.g., pairs for LMUL=2)
• e.g. <scalable 4 x i32> add → vadd_e32_m4
• VL modelled as normal integer value
• Don’t set up configuration (SEW, LMUL) yet

After ISel
• Place instructions that set up necessary SEW and LMUL
• Fold into existing vsetvl’s where possible
• MIR optimizations, e.g., removing redundant vl ↔ GPR copies
• Copying vector registers is a mess
• Need to copy whole register (vl = MAX) in general
• Should usually be able to prove that elements past current vl won't be read
• Not yet sure how to best achieve this

Next Steps needed
• Fill in more backend features
• Automatic vectorization (cf. SVE)
• Software ecosystem: vendor-tuned libraries
• Evaluate & adjust ISA
• Implementations will start popping out soon

• Please come help!

Conclusion
• RISC-V has a great, flexible vector extension
• https://github.com/riscv/riscv-v-spec/
• LLVM backend for it already started
• https://github.com/rkruppe/rvv-llvm
• Lots of industrial activity around it (even if you don’t see it)
