Data Processing on FPGAs
Semantics Empowered Web 3.0: Managing Enterprise, Social, Sensor, and Cloud-based
Data and Services for Advanced Applications
Amit Sheth and Krishnaprasad Thirunarayan
2012
Declarative Networking
Boon Thau Loo and Wenchao Zhou
2012
Probabilistic Databases
Dan Suciu, Dan Olteanu, Christopher Ré, and Christoph Koch
2011
Database Replication
Bettina Kemme, Ricardo Jimenez-Peris, and Marta Patino-Martinez
2010
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in
any form or by any means—electronic, mechanical, photocopy, recording, or any other except for brief quotations
in printed reviews, without the prior permission of the publisher.
DOI 10.1007/978-3-031-01849-7
Synthesis Lectures on Data Management
Lecture #35
Series Editor: M. Tamer Özsu, University of Waterloo
Series ISSN: Print 2153-5418 Electronic 2153-5426
Data Processing on FPGAs
Jens Teubner
Databases and Information Systems Group, Dept. of Computer Science, TU Dortmund
Louis Woods
Systems Group, Dept. of Computer Science, ETH Zürich
KEYWORDS
FPGA, modern hardware, database, data processing, stream processing, parallel
algorithms, pipeline parallelism, programming models
Contents
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Moore’s Law and Transistor-Speed Scaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Memory Wall and Von Neumann Bottleneck . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.3 Power Wall . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.4 Multicore CPUs and GPUs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.5 Specialized Hardware . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.6 Field-Programmable Gate Arrays (FPGAs) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.7 FPGAs for Data Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.7.1 Stream Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.7.2 Big Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.7.3 Cloud Computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.7.4 Security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
3 FPGAs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.1 A Brief History of FPGAs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.2 Look-up Tables—The Key to Re-Programmability . . . . . . . . . . . . . . . . . . . . . . 18
3.2.1 LUT Representation of a Boolean Function . . . . . . . . . . . . . . . . . . . . . . 18
3.2.2 Internal Architecture of an LUT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.2.3 LUT (Re)programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.2.4 Alternative Usage of LUTs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.3 FPGA Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.3.1 Elementary Logic Units (Slices/ALMs) . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.4 Routing Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.4.1 Logic Islands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.4.2 Interconnect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.5 High-Speed I/O . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.6 Auxiliary On-Chip Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.6.1 Block RAM (BRAM) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.6.2 Digital Signal Processing (DSP) Units . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.6.3 Soft and Hard IP-Cores . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.7 FPGA Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.7.1 FPGA Design Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.7.2 Dynamic Partial Reconfiguration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.8 Advanced Technology and Future Trends . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.8.1 Die Stacking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.8.2 Heterogeneous Die-Stacked FPGAs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.8.3 Time-Multiplexed FPGAs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.8.4 High-level Synthesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
6 Accelerated DB Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
6.1 Sort Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
6.1.1 Sorting Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
6.1.2 BRAM-based FIFO Merge Sorter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.1.3 External Sorting with a Tree Merge Sorter . . . . . . . . . . . . . . . . . . . . . . . 74
6.1.4 Sorting with Partial Reconfiguration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
6.2 Skyline Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
6.2.1 Standard Block Nested Loops (BNL) Algorithm . . . . . . . . . . . . . . . . . . 77
6.2.2 Parallel BNL with FPGAs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
6.2.3 Performance Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
8 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
Preface
System architectures, hardware design, and programmable logic (specifically, field-programmable
gate arrays or FPGAs) are topics generally governed by electrical engineers. “Hardware people”
are in charge of embracing technological advantages (and turning them into improved
performance), preferably without breaking any of the established hardware/software interfaces, such as
instruction sets or execution models.
Conversely, computer scientists and software engineers are responsible for understanding
users’ problems and satisfying their application and functionality demands. While doing so, they
hardly care how hardware functions underneath—much as their hardware counterparts are largely
unaware of how their systems are being used for concrete problems.
As time progresses, this traditional separation between hard- and software leaves more
and more potential of modern technology unused. But giving up the separation and building
hardware/software co-designed systems requires that both parties involved understand each other’s
terminology, problems/limitations, requirements, and expectations.
With this book we want to help work toward this idea of co-designed architectures.
Most importantly, we want to give the software side of the story—the database community in
particular—a basic understanding of the involved hardware technology. We want to explain what
FPGAs are, how they can be programmed and used, and which role they could play in a database
context.
This book is intended for students and researchers in the database field, including those
who have not had much contact with hardware technology in the past but would like an
introduction to the field. At ETH Zürich/TU Dortmund, we have been teaching for several years
a course titled “Data Processing on Modern Hardware.” The material in this book is one part of
that Master-level course (which also discusses “modern hardware” other than FPGAs).
We start the book by highlighting the urgent need from the database perspective to invest
more effort into hardware/software co-design issues (Chapter 1). Chapters 2 and 3 then introduce
the world of electronic circuit design, starting with a high-level view, then looking at FPGAs
specifically. Chapter 3 also explains how FPGAs work internally and why they are particularly
attractive at the present time.
In the remaining chapters, we then show how the potential of FPGAs can be turned into
actual systems. First, we give general guidelines on how algorithms and systems can be designed
to leverage the potential of FPGAs (Chapter 4). Chapter 5 presents a number of examples
that successfully used FPGAs to improve database performance. But FPGAs may also be used to
enable new database functionality, which we discuss in Chapter 7 using the example of a database crypto
co-processor. We conclude in Chapter 8 with a wary look into the future of FPGAs in a database
context.
A short appendix points to different flavors of FPGA system integration, realized through
different plug-ins for commodity systems.