Documentation Compiler

This document outlines a compiler designed to convert C++ matrix multiplication operations into instructions for a Processor-in-Memory (PIM) architecture. It details the process of generating LLVM Intermediate Representation, extracting Three-Address Code, and creating ISA instructions for parallel execution across multiple cores. Key components include Clang/LLVM for conversion, a custom LLVM pass for TAC extraction, and a Python script for ISA generation, all aimed at optimizing matrix operations in a PIM environment.

Uploaded by

johnsneak63

Compiler Implementation for Matrix Multiplication using a PIM architecture

Faculty : Senthil Prakash

Slot : B1+TB1

------------------------------------COMPILATHON--------------------------------------

Team :

Rohit kumar singh 22BRS1258

Punish Midha 22BPS1150
Taher hussain kapadia 22BPS1113

Overview :
This document describes a compiler that transforms C++ matrix operations into custom instruction set
architecture (ISA) commands for a Processor-in-Memory (PIM) system. The process involves generating
LLVM Intermediate Representation, extracting Three-Address Code, and converting this code into machine
instructions compatible with the PIM architecture.

Process Flow :
C++ Code → LLVM IR → TAC Extraction → ISA Generation → Parallel Execution

Key Components :
• Clang / LLVM: Converts C++ to LLVM IR and provides analysis tools

• Custom LLVM Pass: Extracts Three-Address Code from LLVM IR

• Python Converter: Transforms TAC into ISA-compatible instructions

• Target Architecture: Uses a 24-bit instruction format designed for DRAM subarray parallel processing

Implementation Steps :
1. Starting Point

Begin with a C++ program containing predefined matrices and a matrix multiplication function.

2. Generate LLVM IR

Output : matrix_ops.ll
Flag used ( -O1 ) : Compiling at -O1 instead of the default -O0 keeps Clang from adding the optnone attribute to each function, so the custom LLVM pass can run on the IR.
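The exact command used for this step is not shown above. As a minimal sketch, the helper below builds one plausible Clang invocation. The flags -O1, -S and -emit-llvm are standard Clang options; the source file name matrix_ops.cpp and the helper name build_ir_command are assumptions made for illustration only.

```python
import shlex


def build_ir_command(source="matrix_ops.cpp", output="matrix_ops.ll"):
    """Return a clang++ argument list that emits textual LLVM IR at -O1.

    The default source name is hypothetical; only matrix_ops.ll is named
    in the document.
    """
    return [
        "clang++",
        "-O1",         # avoids the optnone attribute that -O0 would add
        "-S",          # together with -emit-llvm: write a readable .ll file
        "-emit-llvm",
        source,
        "-o",
        output,
    ]


# Render the argument list as a single shell command line.
command_line = shlex.join(build_ir_command())
```

The list can be passed directly to subprocess.run(..., check=True) on a machine where clang++ is installed.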

3. Extract Three-Address Code

• Custom LLVM pass (TACGenPass.cpp) identifies load, store, and arithmetic operations
• The pass outputs operations to tac_output.txt
• Compilation command:

Compile the pass : clang++ -shared -fPIC TACGenPass.cpp -o tacgen.so $(llvm-config --cxxflags --ldflags --system-libs --libs core)

Run the pass : opt -load-pass-plugin ./tacgen.so -passes="tacgen" matrix_ops.ll -o /dev/null 2> tac_output.txt
OUTPUT : tac_output.txt
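A downstream script has to read tac_output.txt back in. The sketch below shows one way to do that, assuming each line has the form "KIND: instruction" with KIND being one of OP, GEP, LOAD or STORE, as in the TAC Output Analysis section. The function names are hypothetical and the real file layout may differ.

```python
from collections import Counter

# Record kinds seen in the TAC listing of this document.
KINDS = {"OP", "GEP", "LOAD", "STORE"}


def parse_tac_line(line):
    """Split 'KIND: instruction' into (kind, instruction).

    Returns None for lines that do not start with a known kind, such as
    diagnostics that opt may also write to stderr.
    """
    kind, sep, text = line.strip().partition(":")
    if not sep or kind not in KINDS:
        return None
    return kind, text.strip()


def parse_tac(lines):
    """Parse an iterable of lines, dropping anything that is not a TAC record."""
    return [rec for rec in map(parse_tac_line, lines) if rec is not None]


sample = [
    "OP: %8 = add nuw nsw i64 %5, 1",
    "GEP: %12 = getelementptr inbounds [2 x i32], ptr %2, i64 %5, i64 %11",
    "STORE: store i32 0, ptr %12, align 4, !tbaa !8",
    "LOAD: %20 = load i32, ptr %19, align 4, !tbaa !8",
]
records = parse_tac(sample)
counts = Counter(kind for kind, _ in records)
```

For a real run, the sample list would be replaced by the lines of an opened tac_output.txt.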

4. Generate ISA Instructions

• Python script maps TAC operations to the 24-bit ISA format

• Distributes instructions across multiple processing elements
• Execution command:

Run the script : python3 modified_tac_to_isa.py

OUTPUT : parallal_output.isa
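The internals of modified_tac_to_isa.py are not reproduced in this document, so the following is only a sketch of how the mapping step might look. It uses the three opcodes listed under ISA Instruction Format. Treating add and GEP records as having no opcode of their own is an assumption, and the function name tac_to_opcode is hypothetical.

```python
# Opcode values as listed in the ISA Instruction Format section.
OPCODES = {"LOAD": 0b00, "MULT": 0b01, "STORE": 0b10}


def tac_to_opcode(kind, text):
    """Return the 2-bit opcode for one TAC record, or None if it has none.

    kind is the record tag (OP, GEP, LOAD, STORE) and text is the LLVM
    instruction that follows it.
    """
    if kind == "LOAD":
        return OPCODES["LOAD"]
    if kind == "STORE":
        return OPCODES["STORE"]
    if kind == "OP" and "= mul " in text:
        return OPCODES["MULT"]
    # Assumption: loop-counter adds and address computations (GEP) are not
    # emitted as separate ISA instructions.
    return None


mult = tac_to_opcode("OP", "%23 = mul nsw i32 %22, %20")
skipped = tac_to_opcode("OP", "%8 = add nuw nsw i64 %5, 1")
```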
ISA Instruction Format
24-bit instruction with the following fields:

• OPCODE (2 bits): 00=LOAD, 01=MULT, 10=STORE

• CODE_ID (6 bits): Processing element ID (0-3)
• Rd/Wr (2 bits): Read/Write flags
• Row Address (9 bits): DRAM row address
• Reserved (5 bits): For future expansion
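The five field widths add up to 2 + 6 + 2 + 9 + 5 = 24 bits. The sketch below packs them into one word, assuming the fields are laid out in the order listed, most significant first. The names encode and to_bits are hypothetical, and core_id stands for the 6-bit processing element ID field.

```python
def encode(opcode, core_id, rd_wr, row, reserved=0):
    """Pack the five fields into a 24-bit integer, first field in the top bits."""
    fields = [(opcode, 2), (core_id, 6), (rd_wr, 2), (row, 9), (reserved, 5)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def to_bits(word):
    """Format a 24-bit word as space-separated binary fields."""
    s = format(word, "024b")
    return " ".join([s[0:2], s[2:8], s[8:10], s[10:19], s[19:24]])


# LOAD (00) on core 1 with both Rd/Wr flags set, row 16.
word = encode(opcode=0b00, core_id=1, rd_wr=0b11, row=16)
text = to_bits(word)
```

Range checks are done per field so that an out-of-range row or core number fails loudly instead of silently overwriting a neighbouring field.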

Example Instruction

00 000001 11 000010000 00000 = LOAD from row address 0x010 (row 16) on Core 1
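To check an instruction by hand, it helps to split it back into its fields. The sketch below decodes a binary string such as the example, assuming the field order and widths given in the format description. The function name decode and the dictionary keys are hypothetical.

```python
# (name, width) pairs in the order the fields appear, most significant first.
FIELDS = [("opcode", 2), ("core_id", 6), ("rd_wr", 2), ("row", 9), ("reserved", 5)]


def decode(bits):
    """Decode a 24-bit binary string (spaces allowed) into a dict of field values."""
    s = bits.replace(" ", "")
    if len(s) != 24 or set(s) - {"0", "1"}:
        raise ValueError("expected exactly 24 binary digits")
    result, pos = {}, 0
    for name, width in FIELDS:
        result[name] = int(s[pos:pos + width], 2)
        pos += width
    return result


example = decode("00 000001 11 000010000 00000")
```

For the example, the 9-bit row field 000010000 decodes to 16, the core field to 1, and the opcode to 0 (LOAD).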

Parallel Execution Strategy

• Instructions are distributed across 4 cores using round-robin assignment
• Each DRAM subarray processes independent iterations (row 0 on Core 0, row 1 on Core 1, etc.)
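Round-robin assignment can be sketched as taking the item index modulo the number of cores. The example below distributes output-row iterations this way. The function name and the eight-row workload are illustrative assumptions, not taken from the project's script.

```python
NUM_CORES = 4  # number of cores stated in the document


def assign_round_robin(items, num_cores=NUM_CORES):
    """Return one list per core, with item n going to core n % num_cores."""
    buckets = [[] for _ in range(num_cores)]
    for index, item in enumerate(items):
        buckets[index % num_cores].append(item)
    return buckets


# Eight independent row iterations spread over four cores.
schedule = assign_round_robin(range(8))
```

With this scheme row 0 lands on core 0, row 1 on core 1, and row 4 wraps around to core 0 again, so the load stays balanced whenever the iterations cost about the same.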

TAC Output Analysis

1. OP: %8 = add nuw nsw i64 %5, 1

o This is an addition operation (likely the increment of a loop counter).

o nuw (No Unsigned Wrap) and nsw (No Signed Wrap) are LLVM flags asserting that the addition does not overflow; if it does overflow, the result is a poison value.

2. GEP: %12 = getelementptr inbounds [2 x i32], ptr %2, i64 %5, i64 %11

o This is a GetElementPtr (GEP) instruction used to calculate the address of an element in a 2D array.

o %5 and %11 are loop indices for accessing elements.

3. STORE: store i32 0, ptr %12, align 4, !tbaa !8

o This stores the value 0 into memory at %12. This likely corresponds to initializing C[i][j] = 0.

4. OP: %14 = add nuw nsw i64 %11, 1

o Another addition operation for loop iteration.

5. GEP: %19 = getelementptr inbounds [2 x i32], ptr %0, i64 %5, i64 %17

o GEP instruction to calculate the address of an element in matrix A.

6. LOAD: %20 = load i32, ptr %19, align 4, !tbaa !8

o Load the value from matrix A.

7. GEP: %21 = getelementptr inbounds [2 x i32], ptr %1, i64 %17, i64 %11

o GEP instruction to calculate the address of an element in matrix B.

8. LOAD: %22 = load i32, ptr %21, align 4, !tbaa !8

o Load the value from matrix B.

9. OP: %23 = mul nsw i32 %22, %20

o Multiply two loaded values (A[i][k] * B[k][j]) to compute a partial product.

10. OP: %24 = add nsw i32 %18, %23

o Add the partial product to the accumulator (C[i][j] += A[i][k] * B[k][j]).

11. STORE: store i32 %24, ptr %12, align 4, !tbaa !8

o Store the updated value back into matrix C.

12. OP: %25 = add nuw nsw i64 %17, 1

o Increment loop variable for the innermost loop.
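Read together, the twelve operations match the usual triple loop for matrix multiplication. The reference sketch below shows where each TAC item would sit in that loop nest. It assumes 2 x 2 matrices (the GEPs index rows of type [2 x i32]) and reads %5, %11 and %17 as the i, j and k counters, which is an interpretation of the listing rather than something the source states outright.

```python
N = 2  # assumed size; the GEP row type is [2 x i32]


def matmul(A, B):
    """Reference N x N matrix multiply, annotated with the TAC items above."""
    C = [[0] * N for _ in range(N)]
    for i in range(N):                # %5, incremented by item 1
        for j in range(N):            # %11, incremented by item 4
            C[i][j] = 0               # items 2-3: GEP for C[i][j], store 0
            for k in range(N):        # %17, incremented by item 12
                a = A[i][k]           # items 5-6: GEP and load from A
                b = B[k][j]           # items 7-8: GEP and load from B
                C[i][j] += a * b      # items 9-11: mul, add, store back
    return C


result = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

Running the reference on small inputs gives a known-good result to compare against when checking the generated ISA stream.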


