Furiosa Introduction Confidential

FuriosaAI Inc. aims to create a next-generation AI accelerator, the RNGD chip, which is designed for efficient and sustainable AI computing, targeting large language models and multimodal applications. The chip architecture, known as TCP (Tensor Contraction Processor), is recognized for its programmability, scalability, and superior performance per watt, making it suitable for future AI workloads. The RNGD chip is set to be commercially available in Q3 2024, following the successful deployment of its first-generation WARBOY chip.

Uploaded by

satasi.satasi

The next-gen AI accelerator for more powerful and sustainable AI computing

Confidential (c) 2024 FuriosaAI Inc.

Our Mission

Make AI computing sustainable, enabling access to powerful AI for everyone on Earth

“Compute is going to be the new currency of the future.”

Sam Altman, from Lex Fridman Podcast

OpenAI’s ChatGPT Compute & Cost

350B computations for every new token generated

$700K per day to run ChatGPT
$250M per year to run ChatGPT (as of April 2023)

Context (sequence of tokens): “So, the boy went to the”
New token: “Park”

96 x (12,288 × 49,152 dimensional matrix multiplication)
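The figures on this slide can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes a single token vector is multiplied against each 12,288 × 49,152 matrix (one multiply-accumulate per matrix entry) and that the daily cost holds for all 365 days. The variable names are made up for this example.

```python
# Figures quoted on the slide.
COST_PER_DAY_USD = 700_000
LAYERS = 96
MATRIX_ROWS = 12_288
MATRIX_COLS = 49_152

# Assumption: the daily cost is constant across a 365-day year.
cost_per_year_usd = COST_PER_DAY_USD * 365

# Assumption: one token vector times one (rows x cols) matrix,
# which takes one multiply-accumulate (MAC) per matrix entry.
macs_per_matrix = MATRIX_ROWS * MATRIX_COLS
macs_all_layers = LAYERS * macs_per_matrix

print(f"cost per year:   ${cost_per_year_usd:,}")
print(f"MACs per matrix: {macs_per_matrix:,}")
print(f"MACs, 96 layers: {macs_all_layers:,}")
```

This prints $255,500,000 per year, in line with the rounded $250M on the slide, and about 58B multiply-accumulates for the 96 matrices. That is well below the quoted 350B computations per token, which presumably also counts the model's other matrices (an assumption; the slide does not break the figure down).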

“One of the barriers there is actually building the great software it’s very usable, so there is a real possibility to build better chips that are optimized for not just today’s models, but to be really able to see where the models are going and making it so people can experiment very flexibly”

Greg Brockman, OpenAI

$1 Trillion Long-Term Opportunity with Fast-evolving AI

[Chart: $30B today (2023), the early stage of the dynamic and fast-changing AI landscape; $400B by 2027]

Source: AMD & Nvidia announcements

Key to winning AI chip architecture

01 Programmability and scalability

02 Superior performance per watt

03 Easy deployment at mass scale cloud, on-premise

2nd-gen chip, RNGD (Renegade)

01
Programmable and Scalable Chip Architecture (TCP)

“TCP is a tensor processor that hits a good sweet-spot of being generalized for many tensor-based workloads in modern neural architectures, yet specialized enough to exploit many of the commonly known tricks of structured parallelism in AI acceleration.

This makes TCP a great arch, not only for one use case but reasonably future-proof to host new and upcoming DNN architectures (based on current trends observed).”

– ISCA Paper Reviewer

TCP (Tensor Contraction Processor)

TCP Recognized by ISCA

“I found all architectural decisions to be well-motivated and well-thought for many of today's challenges in AI acceleration.

The arch tackles memory access challenges, compute-vs-memory boundness, full-pipelined logic, diversity in OPs, scalability, multi-tenancy, dynamic reconfigurability through extensive control logic, well-thought cost-optimized compilation flow together with a low-level API for full control, bitwidth flexibility (FP8, BF16, INT8, INT4), and more.”

– ISCA Reviewer

Context: The only other NPU architectures accepted by ISCA are Google's TPU and Groq

01
1st-gen WARBOY’s superior programmability – high performance across many models

02
RNGD can run the most advanced models with around 1/2 power consumption

Spec                 RNGD Server      H100 DGX
memory capacity      HBM3 960 GB      HBM3 640 GB
memory performance   30 TB/s          26.8 TB/s
power (TDP)          4.0 kW max       10.2 kW max
FP8                  10.2 petaFLOPS   16 petaFLOPS
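The power and performance-per-watt comparison implied by these figures can be worked out directly. The sketch below is illustrative only: it uses the peak FP8 and maximum TDP numbers from the slide, not measured throughput or measured power draw, and the `ServerSpec` class is a hypothetical helper written for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSpec:
    """Peak figures copied from the slide (hypothetical helper type)."""
    name: str
    fp8_pflops: float
    tdp_kw: float

    @property
    def pflops_per_kw(self) -> float:
        # PFLOPS per kW is numerically equal to TFLOPS per watt.
        return self.fp8_pflops / self.tdp_kw


RNGD_SERVER = ServerSpec("RNGD Server", fp8_pflops=10.2, tdp_kw=4.0)
H100_DGX = ServerSpec("H100 DGX", fp8_pflops=16.0, tdp_kw=10.2)

power_ratio = RNGD_SERVER.tdp_kw / H100_DGX.tdp_kw
efficiency_ratio = RNGD_SERVER.pflops_per_kw / H100_DGX.pflops_per_kw

for spec in (RNGD_SERVER, H100_DGX):
    print(f"{spec.name:<12} {spec.pflops_per_kw:.2f} PFLOPS/kW")
print(f"TDP ratio (RNGD / DGX):             {power_ratio:.2f}")
print(f"Peak FP8 per watt ratio (RNGD/DGX): {efficiency_ratio:.2f}")
```

On these peak numbers the RNGD server draws about 0.39 of the DGX system's maximum power and delivers roughly 1.6 times the peak FP8 compute per watt. Real workloads may behave differently, since sustained throughput depends on memory bandwidth and software as well.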

02
RNGD will be the world’s most efficient AI chip for accelerating LLMs and multimodal models in data centers

[Chart: Energy efficiency, LLaMa 7B; bars for RNGD, H100, L40S; labels 4x and 3x]

03
PyTorch 2.x Support Overview

Furiosa SW stack streamlines model execution with native PyTorch 2.x support. All done with zero code change from users.

Python -> (Dynamo tracing) -> fx.GraphModule -> (Furiosa compiler) -> LLTC IR -> (codegen) -> RNGD ISA

Furiosa SW stack: LLM engine, quantizer, device runtime, runtime, calibrator, debugger, profiler

OpenAI Triton Support Plan

Python -> (AST Visitor) -> Triton-IR -> (Triton-Furiosa compiler) -> LLVM IR -> (codegen) -> RNGD ISA

LLTC   Low-Level Tensor Contraction
AST    Abstract Syntax Tree
IR     Intermediate Representation
ISA    Instruction Set Architecture

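
The first stage of the Triton plan above, an AST visitor that turns Python source into an IR, can be illustrated with Python's standard `ast` module. This is a toy sketch, not the Triton or Furiosa implementation: the `ToyIRVisitor` class and the instruction names it emits (`mul`, `add`, `ret`) are invented for this example, and only a few arithmetic operators are handled.

```python
import ast


class ToyIRVisitor(ast.NodeVisitor):
    """Walks a Python AST and emits a flat list of toy IR instructions.

    The instruction format is hypothetical; it is not Triton-IR.
    """

    _BINOPS = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "mul"}

    def __init__(self):
        self.instructions = []
        self._next_tmp = 0

    def _fresh(self):
        # Allocate a new temporary name such as %t0, %t1, ...
        name = f"%t{self._next_tmp}"
        self._next_tmp += 1
        return name

    def visit_Name(self, node):
        return f"%{node.id}"

    def visit_Constant(self, node):
        return repr(node.value)

    def visit_BinOp(self, node):
        # Lower operands first, then emit one instruction for this node.
        lhs = self.visit(node.left)
        rhs = self.visit(node.right)
        op = self._BINOPS[type(node.op)]  # KeyError for unsupported operators
        dst = self._fresh()
        self.instructions.append(f"{dst} = {op} {lhs}, {rhs}")
        return dst

    def visit_Return(self, node):
        self.instructions.append(f"ret {self.visit(node.value)}")


def lower(source):
    """Parse Python source and return the toy IR as a list of strings."""
    visitor = ToyIRVisitor()
    visitor.visit(ast.parse(source))
    return visitor.instructions


KERNEL = """
def axpy(a, x, y):
    return a * x + y
"""

for line in lower(KERNEL):
    print(line)
```

For the `axpy` function this emits `%t0 = mul %a, %x`, `%t1 = add %t0, %y`, and `ret %t1`. A real frontend would also track types, shapes, and memory operations before handing the IR to later compiler stages.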
03
Easy deployment at mass scale cloud and on-premise

Semiconductor Productization & Commercialization

[Figure labels: Tape-out, GDS file, Photomask, ASML EUV Lithography Machine]

1st-gen WARBOY was successfully commercialized with Samsung and ASUS, and deployed with data centers and enterprise clients.

1st gen, WARBOY

2nd-gen chip, RNGD (Renegade), targeting LLMs and multimodality, is available in Q3 2024

08/11 – Renegade bring-up getting started
05/20 – 20x Renegade PCIe PCB boards ready for SW development
06/20 – 100-120x Renegade PCB boards ready
07/26 – MLPerf submission (GPT-J, Llama 70B, BERT)
08/15 – Initial Renegade SK release to early customers and more comprehensive benchmark results for LLM

Furiosa Future Roadmap

Chip        WARBOY           RNGD               RNGD-MAX            RNGD-S             RNGD-TURBO
Timeline    2022 4Q          2024 3Q            2024 4Q             2025 1Q            2025 4Q
Memory      LPDDR4X 16GB     HBM3 48GB          HBM3 96GB           LPDDR5X 64GB       HBM3E 288GB
Bandwidth   66 GB/s          1.5 TB/s           3.0 TB/s            256 GB/s           8.0 TB/s
Power       60 W             150 W              350 W               60 W               600 W
Compute     64 TOPS (INT8)   512 TFLOPS (FP8)   1024 TFLOPS (FP8)   230 TFLOPS (FP8)   2 PFLOPS (FP8)
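
Two derived metrics help when reading the roadmap: peak compute per watt, and the ratio of peak compute to memory bandwidth (operations per byte moved, the ridge point in a roofline analysis). The sketch below computes both from the roadmap figures. It is a rough illustration under stated assumptions: the numbers are peak values, the power figure is treated as a fixed budget, WARBOY's INT8 TOPS are not directly comparable to the FP8 TFLOPS of the other chips, and the `metrics` helper is hypothetical.

```python
# (peak ops per second, memory bandwidth in bytes per second, power in watts),
# copied from the roadmap. WARBOY is INT8; the other chips are FP8.
ROADMAP = {
    "WARBOY": (64e12, 66e9, 60),
    "RNGD": (512e12, 1.5e12, 150),
    "RNGD-MAX": (1024e12, 3.0e12, 350),
    "RNGD-S": (230e12, 256e9, 60),
    "RNGD-TURBO": (2e15, 8.0e12, 600),
}


def metrics(chip):
    """Return (tera-ops per watt, ops per byte of memory bandwidth)."""
    peak_ops, bytes_per_s, watts = ROADMAP[chip]
    tera_ops_per_watt = peak_ops / watts / 1e12
    ops_per_byte = peak_ops / bytes_per_s
    return tera_ops_per_watt, ops_per_byte


for chip in ROADMAP:
    per_watt, per_byte = metrics(chip)
    print(f"{chip:<11} {per_watt:5.2f} Tops/W  {per_byte:6.1f} ops/byte")
```

A lower ops-per-byte value means the chip has more bandwidth relative to its compute, which tends to matter for memory-bound workloads such as LLM token generation.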

World-class R&D & Business Team in HW, SW and AI

HW/Architecture: “TCP: A Tensor Contraction Processor for AI Workloads” (Accepted by ISCA 2024)
AI Models & Algorithms: “Can MLLMs Perform Text-to-Image In-Context Learning?” (Co-authored with UW Madison)
SW Stack: “Integrating an NPU with PyTorch 2.0 Compile” (Presented at PyTorch Conference 2023)
HW Verification: “Functional Coverage Closure with Python” (Presented at DVCON 2024)

Movement coming soon
2024

Thank you
