Computer Arithmetic
Kashif Inayat
https://www.bsc.es/inayat-kashif
Content
• Introduction
• Motivation
• Background
• Adders
• Multipliers
• Systolic Array Architecture
• Data Flows
• MACs
• Problem Statement
• Conclusions
12/29/2023 2 Kashif Inayat, 2023
Machine Learning
Machine learning is everywhere!
• Image classification
• Face recognition
• Optical character recognition
• Autonomous driving
Transistors Are Not Getting Faster
• Performance gains from transistors alone are slowing down
• Specialized / domain-specific hardware is needed for significant improvements in speed and energy efficiency
• Performance
  • Conventional arithmetic blocks inside a processing element (PE) usually have a long critical path
  • Latency, throughput
• Energy/Power
  • The MAC is the most power-consuming unit
• Hardware Cost
  • Chip storage, chip area
[Source: Computing’s Energy Problem (and what we can do about it), Mark Horowitz [2]]
• Google (TPUv1)
  • Extended in TPUv2, TPUv3 and TPUv4
• Tesla
  • 96x96 independent array in NPU
• IBM
  • 28x28 wavefront systolic array
• Samsung
  • NPU with 1024 MACs
• Intel, etc.
[Source: XDNN [5], Xilinx.] [Source: 28x28 wavefront SA [6], IBM.]
Data Flows
• Output Stationary
• Input Stationary
• Weight Stationary
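The three data flows differ in which operand stays resident in a PE while the others stream past. A minimal sketch in plain Python, assuming a matrix product C = A·W and modeling each data flow only as a loop ordering (no timing, no PE grid); the function names are hypothetical and not taken from the slides:

```python
def _zeros(rows, cols):
    return [[0] * cols for _ in range(rows)]


def matmul_output_stationary(A, W):
    """The partial sum for C[i][j] stays in place; A and W operands stream past."""
    n, k, m = len(A), len(W), len(W[0])
    C = _zeros(n, m)
    for i in range(n):
        for j in range(m):
            acc = 0  # stationary accumulator of the (i, j) PE
            for t in range(k):
                acc += A[i][t] * W[t][j]
            C[i][j] = acc
    return C


def matmul_weight_stationary(A, W):
    """Each weight W[t][j] is loaded once; inputs stream past and partial sums move on."""
    n, k, m = len(A), len(W), len(W[0])
    C = _zeros(n, m)
    for t in range(k):
        for j in range(m):
            w = W[t][j]  # stationary weight of the (t, j) PE
            for i in range(n):
                C[i][j] += A[i][t] * w
    return C


def matmul_input_stationary(A, W):
    """Each input A[i][t] is loaded once; weights stream past and partial sums move on."""
    n, k, m = len(A), len(W), len(W[0])
    C = _zeros(n, m)
    for i in range(n):
        for t in range(k):
            a = A[i][t]  # stationary input of the (i, t) PE
            for j in range(m):
                C[i][j] += a * W[t][j]
    return C


A = [[1, 2], [3, 4]]
W = [[5, 6], [7, 8]]
print(matmul_output_stationary(A, W))  # [[19, 22], [43, 50]]
```

All three orderings compute the same product; what changes is which value is reused from a local register, and therefore which operands must be moved every cycle.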
[Figure: radix-4 and radix-8 MAC block diagrams. Each shows an operand X feeding Booth recoding and Booth selection, a Wallace reduction tree, and CPA stages, grouped into a multiplier and an accumulator.]
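The Booth recoding and selection steps of a radix-4 multiplier can be sketched in software. This is a minimal behavioral model, assuming an even-width two's-complement multiplier; the function names and the 8-bit default are hypothetical, and Python's `sum` stands in for the Wallace reduction tree and the CPA rather than modeling them:

```python
def booth_radix4_digits(y, n_bits):
    """Recode an n_bits-wide two's-complement multiplier into radix-4 Booth digits.

    Digit i is -2*y[2i+1] + y[2i] + y[2i-1], with y[-1] = 0, so every digit
    lies in {-2, -1, 0, 1, 2} and sum(d_i * 4**i) equals y.
    """
    if n_bits % 2:
        raise ValueError("n_bits must be even for radix-4 recoding")
    if not -(1 << (n_bits - 1)) <= y < (1 << (n_bits - 1)):
        raise ValueError("y does not fit in n_bits")
    y_u = y & ((1 << n_bits) - 1)  # two's-complement bit pattern of y
    digits, prev = [], 0           # prev holds y[2i-1]; the implicit y[-1] is 0
    for i in range(0, n_bits, 2):
        b0 = (y_u >> i) & 1
        b1 = (y_u >> (i + 1)) & 1
        digits.append(-2 * b1 + b0 + prev)
        prev = b1
    return digits


def booth_multiply(x, y, n_bits=8):
    """Multiply x by y using n_bits/2 Booth-selected partial products."""
    # Booth selection: each digit picks 0, +/-x or +/-2x, weighted by 4**i.
    partial_products = [
        d * x * 4 ** i for i, d in enumerate(booth_radix4_digits(y, n_bits))
    ]
    # In hardware a reduction tree and a final CPA would add these up.
    return sum(partial_products)


def mac(acc, x, y, n_bits=8):
    """Multiply-accumulate: acc + x * y."""
    return acc + booth_multiply(x, y, n_bits)


print(booth_radix4_digits(7, 4))  # [-1, 2], since -1 + 2*4 = 7
print(mac(10, 3, -5))             # -5
```

Radix-4 recoding halves the number of partial products compared with a bit-by-bit multiplier. Radix-8 recoding reduces them further, to about a third, but its digit set {-4, ..., 4} needs the multiple 3x, which costs an extra addition to generate.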
Our Methodology
Conclusion
• We will focus on both:
  • µ-architectures, e.g., multipliers, adders, and pipeline stages in systolic-array-based architectures.
  • At the same time, we will explore the RISC-V architecture for similar optimization approaches.