CREATING THE BOLT COMPILER: PART 8

# A Complete Guide to LLVM for Programming Language Creators
https://mukulrathi.com/create-your-own-programming-language/llvm-ir-cpp-api-tutorial/ 1/22
15/01/2024, 00:14 A Complete Guide to LLVM for Programming Language Creators
Update: this post has now taken off on Hacker News and Reddit. Thank you all!
If you’ve just joined the series at this stage, here’s a quick recap. We’re designing a Java-esque concurrent object-oriented programming language called Bolt. We’ve gone through the compiler frontend, where we’ve done the parsing, type-checking and dataflow analysis. We’ve desugared our language to get it ready for LLVM - the main takeaway is that objects have been desugared to structs, and their methods to functions.
Learn about LLVM and you’ll be the envy of your friends. Rust uses LLVM for its
backend, so it must be cool. You’ll beat them on all those performance
benchmarks, without having to hand-optimise your code or write machine
assembly code. Shhhh, I won’t tell them.
The C++ class definitions for our desugared representation (we call this Bolt IR) can be found in the deserialise_ir folder. The code for this post (the LLVM IR generation) can be found in the llvm_ir_codegen folder. The repo uses the Visitor design pattern and makes ample use of std::unique_ptr to ease memory management.
To cut through the boilerplate: to find out how to generate LLVM IR for a particular language expression, search for the IRCodegenVisitor::codegen method that takes in the corresponding ExprIR object, e.g. the overload for if-else statements.
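The method body itself isn’t reproduced here, but its shape is roughly the following sketch (IRCodegenVisitor is from the repo; the ExprIfElseIR name and the body are my paraphrase, not verbatim code):

```
Value *IRCodegenVisitor::codegen(const ExprIfElseIR &expr) {
  // 1. codegen the condition to get an i1 value
  // 2. create "then", "else" and "ifcont" basic blocks
  // 3. emit a conditional branch: builder->CreateCondBr(condValue, thenBB, elseBB)
  // 4. codegen each branch, ending each with a branch to "ifcont"
  // 5. merge the two branch results with a phi node in "ifcont"
}
```

We’ll see the IR this produces in the factorial example below.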
### Understanding LLVM IR
LLVM sits in the middle-end of our compiler, after we’ve desugared our language features, but before the backends that target specific machine architectures (x86, ARM etc.).
The upshot is that LLVM IR looks like a more readable form of assembly. As LLVM IR is machine-independent, we don’t need to worry about the number of registers, the size of datatypes, calling conventions or other machine-specific details.

And rather than allocating specific sizes of datatypes, we retain types in LLVM IR. Again, the backend will take this type information and map it to the size of the datatype. LLVM has types for the different sizes of ints and floats, e.g. i32, i8, i1 etc. It also has derived types: pointer types, array types, struct types, function types. To find out more, check out the Type documentation.
Now, built into LLVM are a set of optimisations we can run over the LLVM IR e.g.
dead-code elimination, function inlining, common subexpression elimination
etc. The details of these algorithms are irrelevant: LLVM implements them for
us.
Our side of the bargain is that we write LLVM IR in Static Single Assignment (SSA) form, as SSA form makes life easier for optimisation writers. SSA form sounds fancy, but it just means we define variables before use and assign to variables only once. In SSA form, we cannot reassign to a variable (e.g. x = x + 1); instead we assign to a fresh variable each time (x2 = x1 + 1).
So in short: LLVM IR looks like assembly with types, minus the messy machine-
specific details. LLVM IR must be in SSA form, which makes it easier to optimise.
Let’s look at an example!
### An example: Factorial
Let’s look at a simple factorial function in our language Bolt:
factorial.bolt

```
function int factorial(int n) {
  if (n == 0) {
    1
  } else {
    n * factorial(n - 1)
  }
}
```
factorial.ll

```llvm
define i32 @factorial(i32) {
entry:
  %1 = icmp eq i32 %0, 0               ; n == 0
  br i1 %1, label %then, label %else

then:                                  ; preds = %entry
  br label %ifcont

else:                                  ; preds = %entry
  %sub = sub i32 %0, 1                 ; n - 1
  %2 = call i32 @factorial(i32 %sub)   ; factorial(n - 1)
  %mult = mul i32 %0, %2               ; n * factorial(n - 1)
  br label %ifcont

ifcont:                                ; preds = %then, %else
  %iftmp = phi i32 [ 1, %then ], [ %mult, %else ]
  ret i32 %iftmp
}
```
At this point, I’m going to assume you have come across Control Flow
Graphs and basic blocks. We introduced Control Flow Graphs in a
previous post in the series, where we used them to perform different
dataflow analyses on the program. I’d recommend you go and check the
CFG section of that dataflow analysis post now. I’ll wait here :)
(The original post includes diagrams here showing how an LLVM module contains functions, each of which contains basic blocks of instructions.)
example_program.c

```c
// a C main function
int factorial(int n); // declaration of the Bolt-generated function

int main() {
  factorial(10);
  return 0;
}
```
We specify that we want to compile for an Intel Macbook Pro in the module
target info:
example_module.ll

```llvm
source_filename = "Module"
target triple = "x86_64-apple-darwin18.7.0"
...
define i32 @factorial(i32) {
  ...
}
define i32 @main() {
entry:
  %0 = call i32 @factorial(i32 10)
  ret i32 0
}
```
LLVM defines a whole host of classes that map to the concepts we’ve talked about:

- Value
- Module
- Type
- Function
- BasicBlock
- BranchInst
- ...
These are all in the namespace llvm. In the Bolt repo, I chose to make this namespacing explicit by referring to them as llvm::Value, llvm::Module etc.
Most of the LLVM API is quite mechanical. Now that you’ve seen the diagrams that define modules, functions and basic blocks, the relationships between their corresponding classes in the API fall out nicely. You can query a Module object to get a list of its Function objects, query a Function to get the list of its BasicBlock s, and go the other way: you can query a BasicBlock to get its parent Function object.
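For example, iterating over a module’s functions, and over each function’s basic blocks, is just nested range-based for loops (a sketch, not code from the repo):

```
for (Function &function : *module) {
  for (BasicBlock &block : function) {
    // ... and block.getParent() recovers the enclosing function
  }
}
```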
Value is the base class for any value computed by the program. This could be
a function ( Function subclasses Value ), a basic block ( BasicBlock also
subclasses Value ), an instruction, or the result of an intermediate
computation.
Our IRCodegenVisitor holds the key state for IR generation: the LLVMContext, an IRBuilder (think of it as a file pointer into the IR we’re writing: it tracks where the next instruction will be inserted) and the Module.

ir_codegen_visitor.h

```cpp
std::unique_ptr<LLVMContext> context;
std::unique_ptr<IRBuilder<>> builder;
std::unique_ptr<Module> module;
```

We’ll use the context to create just one module, which we’ll imaginatively name "Module".

ir_codegen_visitor.cc

```cpp
context = make_unique<LLVMContext>();
builder = std::unique_ptr<IRBuilder<>>(new IRBuilder<>(*context));
module = make_unique<Module>("Module", *context);
```
### IRBuilder
The builder object has Create___() methods for each of the IR instructions, e.g. CreateLoad for a load instruction, and CreateSub / CreateFSub for the integer and floating-point sub instructions respectively. Some Create___() methods take an optional Twine argument: this is used to give the result’s register a custom name, e.g. iftmp is the twine for the following instruction:
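That instruction is the phi from the factorial example earlier:

```llvm
%iftmp = phi i32 [ 1, %then ], [ %mult, %else ]
```

In the API, the twine is the trailing argument, e.g. builder->CreatePHI(type, 2, "iftmp").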
Use the IRBuilder docs to find the method corresponding to your instruction.
### Type declarations
We can declare our own custom struct types.
e.g. a Tree with an int value, and pointers to left and right subtrees. In LLVM IR, this type looks like:

```llvm
%Tree = type { i32, %Tree*, %Tree* }
```
First we create the type with that name. This adds it to the module’s symbol table. The type starts out opaque: we can reference it in other type declarations (e.g. function types, or other struct types), but we can’t create structs of that type yet, as we don’t know what’s in it.

```cpp
StructType *treeType = StructType::create(*context, StringRef("Tree"));
```
The second step is to specify an array of types that go in the struct body. Note that since we’ve declared the opaque Tree type, we can get a Tree * type using the Tree type’s getPointerTo() method.

```cpp
treeType->setBody(ArrayRef<Type *>({Type::getInt32Ty(*context),
                                    treeType->getPointerTo(),
                                    treeType->getPointerTo()}));
```
So if you have custom struct types referring to other custom struct types in their
bodies, the best approach is to declare all of the opaque custom struct types,
then fill in each of the structs’ bodies.
class_codegen.cc

```cpp
void IRCodegenVisitor::codegenClasses(
    const std::vector<std::unique_ptr<ClassIR>> &classes) {
  // create (opaque) struct types for each of the classes
  for (auto &currClass : classes) {
    StructType::create(*context, StringRef(currClass->className));
  }
  // fill in struct bodies
  for (auto &currClass : classes) {
    std::vector<Type *> bodyTypes;
    for (auto &field : currClass->fields) {
      // add field type
      bodyTypes.push_back(field->codegen(*this));
    }
    // get opaque class struct type from module symbol table
    StructType *classType =
        module->getTypeByName(StringRef(currClass->className));
    classType->setBody(ArrayRef<Type *>(bodyTypes));
  }
}
```
### Functions
The function prototype consists of the function name, the function type, the “linkage” information and the module whose symbol table we want to add the function to. We choose External linkage: the function prototype is viewable externally, so we can link in an external function definition (e.g. if using a library function), or expose our function definition to other modules. You can see the full enum of linkage options here.
function_codegen.cc

```cpp
Function::Create(functionType, Function::ExternalLinkage,
                 function->functionName, module.get());
```
To generate the function definition, we just use the API to construct the control flow graph we discussed in our factorial example:

function_codegen.cc

```cpp
BasicBlock *entryBasicBlock =
    BasicBlock::Create(*context, "entry", function);
builder->SetInsertPoint(entryBasicBlock);
...
```
### Stack allocation
There are two ways we can store values in local variables in LLVM IR. We’ve seen the first: assignment to virtual registers. The second is dynamic memory allocation on the stack using the alloca instruction. Whilst we can store ints, floats and pointers in either the stack or virtual registers, aggregate datatypes, like structs and arrays, don’t fit in registers, so they have to be stored on the stack.
Yes, you read that right. Unlike most programming language memory models,
where we use the heap for dynamic memory allocation, in LLVM we just have a
stack.
Heaps are not provided by LLVM - they are a library feature. For single-
threaded applications, stack allocation is sufficient. We’ll talk about the
need for a global heap in multi-threaded programs in the next post
(where we extend Bolt to support concurrency).
We’ve seen struct types, e.g. {i32, i1, i32}. Array types are of the form [num_elems x elem_type]. Note num_elems is a constant: you need to provide it when generating the IR, not at runtime. So [3 x i32] is valid but [n x i32] is not.
We give alloca a type and it allocates a block of memory on the stack and
returns a pointer to it, which we can store in a register. We can use this pointer
to load and store values from/onto the stack.
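A minimal illustration (the register names and values are mine):

```llvm
%x_ptr = alloca i32               ; %x_ptr has type i32*
store i32 42, i32* %x_ptr         ; write 42 into the stack slot
%x_val = load i32, i32* %x_ptr    ; read it back into a register
%arr_ptr = alloca [3 x i32]       ; %arr_ptr has type [3 x i32]*
```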
### Global variables
Just as we alloca local variables on a stack, we can create global variables
and load from them and store to them.
Global variables are declared at the start of a module, and are part of the
module symbol table.
We can use the module object to create named global variables, and to query
them.
```cpp
module->getOrInsertGlobal(globalVarName, globalVarType);
...
GlobalVariable *globalVar = module->getNamedGlobal(globalVarName);
```
We give the global an initial value with setInitializer:

```cpp
globalVar->setInitializer(initValue);
```

and we load from and store to it like any other pointer:

```cpp
builder->CreateLoad(globalVar);
builder->CreateStore(someVal, globalVar); // not for constant globals
```
### GEPs
We get a base pointer to the aggregate type (array / struct) on the stack or in
global memory, but what if we want a pointer to a specific element? We’d need
to find the pointer offset of that element within the aggregate, and then add this
to the base pointer to get the address of that element. Calculating the pointer
offset is machine-specific e.g. depends on the size of the datatypes, the struct
padding etc.
The Get Element Pointer (GEP) instruction applies the pointer offset to the base pointer and returns the resultant pointer.
Below is the GEP instruction that calculates the pointer p+1 for an array of i8 s and for an array of i32 s:

```llvm
%elt_ptr  = getelementptr i8, i8* %p, i64 1    ; p + 1 for an i8 array
%elt_ptr2 = getelementptr i32, i32* %q, i64 1  ; p + 1 for an i32 array
```

This i64 1 index adds multiples of the base type to the base pointer: p+1 for i8 adds 1 byte, whereas p+1 for i32 adds 4 bytes to p. If the index was i64 0, we’d return p itself.
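To make that scaling concrete, here’s a standalone C++ sketch (plain C++, not the LLVM API; the helper name is mine) that measures the byte step p + 1 corresponds to for each element type:

```cpp
#include <cstdint>
#include <cstddef>

// Byte distance covered by "p + 1" for a given element type:
// exactly the scaling a GEP index of 1 performs.
template <typename T>
std::ptrdiff_t stepBytes() {
  T arr[2];
  return reinterpret_cast<char *>(&arr[1]) - reinterpret_cast<char *>(&arr[0]);
}
```

stepBytes<int8_t>() gives 1 and stepBytes<int32_t>() gives 4, matching the i8 and i32 offsets described above.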
Wait, an array of indices? Yes: the GEP instruction can have multiple indices passed to it (in the API, CreateGEP takes a list of indices). We’ve looked at a simple example where we only needed one index.
Before we look at the case where we pass multiple indices, I want to reiterate
the purpose of this first index:
A pointer of type Foo * can represent (as in C) the base pointer of an array of type Foo. The first index adds multiples of this base type Foo to traverse this array.
Within a struct, we want to index specific fields. The natural way is to label them field 0, 1 and 2. We can access field 2 by passing it to the GEP instruction as another index:

```llvm
%field2_ptr = getelementptr %Foo, %Foo* %f, i64 0, i32 2
```
For structs, you’ll likely always pass the first index as 0 . The biggest confusion
with GEPs is that this 0 can seem redundant, as we want the field 2 , so why
are we passing a 0 index first? Hopefully you can see from the first example
why we need that 0 . Think of it as passing to GEP the base pointer of an
implicit Foo array of size 1.
The more nested our aggregate structure, the more indices we can provide, e.g. for element index 2 of Foo’s second field (the 4-element int array):

```llvm
%elt_ptr = getelementptr %Foo, %Foo* %f, i64 0, i32 1, i64 2
```
(In terms of the corresponding API, we’d use CreateGEP and pass the array
{0,1,2} .)
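As a sanity check on what GEP computes, here’s a hypothetical C++ mirror of Foo (assuming the layout implied above: an i32, then a 4-element i32 array, then an i32). The indices {0, 1, 2} pick out the byte offset of b[2]:

```cpp
#include <cstdint>
#include <cstddef>

// Hypothetical C++ stand-in for the Foo struct described in the text.
struct Foo {
  int32_t a;     // field 0
  int32_t b[4];  // field 1: the 4-element int array
  int32_t c;     // field 2
};

// GEP indices {0, 1, 2} address &fooPtr[0].b[2]; the byte offset is:
// 0 * sizeof(Foo) + offsetof(Foo, b) + 2 * sizeof(int32_t)
constexpr std::size_t gepByteOffset = offsetof(Foo, b) + 2 * sizeof(int32_t);
```

With 4-byte fields and no padding, that works out to 4 + 8 = 12 bytes past the base pointer.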
### mem2reg
If you remember, LLVM IR must be written in SSA form. But what happens if the
Bolt source program we’re trying to map to LLVM IR is not in SSA form?
reassign_var.bolt

```
let x = 1
x = x + 1
```
One option would be for us to rewrite the program in SSA form in an earlier
compiler stage. Every time we reassign a variable, we’d have to create a fresh
variable. We’d also have to introduce phi nodes for conditional statements.
For our example, this is straightforward, but in general this extra rewrite is a
pain we would rather not deal with.
assign_fresh_vars.bolt

```
let x1 = 1
let x2 = x1 + 1
```
We can use pointers to avoid assigning fresh variables. Note here we aren’t
reassigning the pointer x , just updating the value it pointed to. So this is valid
SSA.
```
let x = &1;
*x = *x + 1
```
reassign_var.ll

```llvm
%x = alloca i32           ; allocate a stack slot for x
store i32 1, i32* %x      ; let x = 1
%1 = load i32, i32* %x    ; read x
%2 = add i32 %1, 1        ; x + 1
store i32 %2, i32* %x     ; x = x + 1
```
Let’s revisit the LLVM IR if we were to rewrite the Bolt program to use fresh
variables. It’s only 2 instructions, compared to the 5 instructions needed if using
the stack. Moreover, we avoid the expensive load and store memory
access instructions.
assign_fresh_vars.ll

```llvm
%x1 = add i32 0, 1     ; let x1 = 1
%x2 = add i32 %x1, 1   ; let x2 = x1 + 1
```
So while we’ve made our lives easier as compiler writers by avoiding a rewrite-to-SSA pass, this has come at the expense of performance. This is where LLVM’s mem2reg pass comes in: it promotes allocas back into registers where possible, rewriting our IR into the fresh-variable form (inserting phi nodes where needed). The catch is that mem2reg only considers allocas in the entry basic block of the function. How do we arrange this if the local variable declaration occurs midway through the function, in another block? Let’s look at an example:
```llvm
; translated to LLVM IR
%x = alloca i32
store i32 someVal, i32* %x
```
We can actually move the alloca: it doesn’t matter where we allocate the stack space, so long as it is allocated before use. So we write the alloca at the very start of the parent function in which this local variable declaration occurs.
How do we do this in the API? Well, remember the analogy of the builder being
like a file pointer? We can have multiple file pointers pointing to different places
in the file. Likewise, we instantiate a new IRBuilder to point to the start of
the entry basic block of the parent function, and insert the alloca
instructions using that builder.
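A sketch of what that looks like (the variable names are mine; assume parentFunction, boundType and varName are in scope):

```
// temporary builder pointing at the start of the function's entry block
IRBuilder<> tmpBuilder(&(parentFunction->getEntryBlock()),
                       parentFunction->getEntryBlock().begin());
// the alloca lands at the top of the entry block,
// where the mem2reg pass can see it
AllocaInst *stackSlot = tmpBuilder.CreateAlloca(boundType, nullptr, varName);
```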
### LLVM Optimisations
The API makes it really easy to add passes. We create a
functionPassManager , add the optimisation passes we’d like, and then
initialise the manager.
ir_codegen_visitor.cc

```cpp
std::unique_ptr<legacy::FunctionPassManager> functionPassManager =
    make_unique<legacy::FunctionPassManager>(module.get());
// Promote allocas to registers.
functionPassManager->add(createPromoteMemoryToRegisterPass());
// Do simple "peephole" optimizations
functionPassManager->add(createInstructionCombiningPass());
// Reassociate expressions.
functionPassManager->add(createReassociatePass());
// Eliminate Common SubExpressions.
functionPassManager->add(createGVNPass());
// Simplify the control flow graph (deleting unreachable blocks etc).
functionPassManager->add(createCFGSimplificationPass());
functionPassManager->doInitialization();
```
Then, after generating each function’s IR, we run the functionPassManager over that function to apply the passes.
In particular, let’s look at the factorial LLVM IR output by our Bolt compiler before and after optimisation. You can find both files in the repo:
factorial-unoptimised.ll

```llvm
...
then:                                  ; preds = %entry
  br label %ifcont
...
```
factorial-optimised.ll

```llvm
...
else:                                  ; preds = %entry
  %sub = add i32 %0, -1
  %1 = call i32 @factorial(i32 %sub)
  %mult = mul i32 %1, %0
  br label %ifcont

ifcont:                                ; preds = %else, %entry
  %iftmp = phi i32 [ %mult, %else ], [ 1, %entry ]
  ret i32 %iftmp
}
```
Notice how we’ve actually got rid of the alloca and the associated load
and store instructions, and also removed the then basic block!
### Wrap up
This last example shows you the power of LLVM and its optimisations. You can
find the top-level code that runs the LLVM code generation and optimisation in
the main.cc file in the Bolt repository.
In the next few posts we’ll be looking at some more advanced language features: generics, inheritance and method overriding, and concurrency! Stay tuned for when they come out!