GPU Accelerated Databases, Speeding Up Database Time Series Analysis Using OpenCL
Outline
Speakers Biography
Outline
Solution Goals
OpenCL Programming Challenge
Review of GPU Accelerated Databases
Swiss Army Knife of Data
OpenCL Bindings to PostgreSQL
Challenges
Example Use Cases
Benefits of the Approach
Q&A
Speakers Bio
Tim Child
35 years' experience in software development
Formerly:
VP Engineering, Oracle Corporation
VP Engineering, BEA Systems Inc.
VP Engineering, Informix
Leader at Illustra, Autodesk, Navteq, Intuit
Goals
Develop New Applications
Develop new GPU Accelerated Database Applications that are computationally intensive.
Ease of Use
Make GPU accelerated code easier to use
Make GPU accelerated code more mainstream in Information Technology
Data Scalability
Scale GPU application data size
Possible Solutions
Other Choices ??
C/C++ Binding using Web CGI
Database Driven
Java/Perl/Python Bindings in App Server
GPU Programming
GPU Co-Process
[Diagram labels: DBMS Client, TCP/IP, DBMS Server, IPC / RPC, GPU Language Co-Process, Data Tables, PCI Bus, GPGPU DRAM, GPGPU]
Examples
2008 Bakkum, Skardon
2010 Palo OLAP
2010 ParStream
2011 Kaczmarski
[Diagram labels: DBMS Client, Data Tables, RAM Cache, GPGPU DRAM, GPGPU, 10 GB, 10 TB]
Rules System
Extensible Indices
Open Source
Vibrant Community
Native APIs
PostGIS
(Vector, Raster)
OpenCL
Image Types
image2d_t, image3d_t
[Diagram labels: Web Browser, Web Server, SQL Statement, App Server, PostgreSQL GPGPU, TCP/IP, Data Tables, Client]
__kernel void VectorAdd(__global int *id,
                        __global float *a,
                        __global float *b,
                        __global float *c)
{
    int i = get_global_id(0); /* Query OpenCL for the array subscript */
    c[i] = a[i] + b[i];
}
$BODY$
Language PgOpenCL;

Select VectorAdd(Id, a, b) from Vectors;
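For readers new to OpenCL, the VectorAdd kernel above can be read as the body of an ordinary loop. The following is a minimal CPU sketch in plain C, not PgOpenCL code; the name vector_add_ref and the explicit length n are assumptions made for illustration. The loop index i stands in for the value the kernel gets from get_global_id(0).

```c
#include <stddef.h>

/* Hypothetical CPU reference for the VectorAdd kernel.
   On the device, each work-item handles one value of i;
   here a plain loop visits every i in turn. */
void vector_add_ref(const float *a, const float *b, float *c, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}
```

A reference like this is mainly useful for checking GPU results against a known-good CPU result on small inputs.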
Comparison Table
xPU
VectorAdd(A, B) Returns C
Copy Copy
Table
CL_UNSIGNED_INT, CL_INTENSITY
CL_FLOAT, CL_INTENSITY
Pearson Match Correlation Coefficient
Correlation between two Time Series
1: Linear Relation Between Samples
-1: Inverse Linear Relation Between Samples
0.0: No Linear Relationship Between Samples
Bioinformatics
DNA & Protein Sequence Matching
Example Screen 1
Example Screen 2
Example Screen 3
Example Screen 4
Example Screen 5
Type Mapping
Challenges
Problem Size
DBMS Table Size >> GPU RAM
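One common way to handle a table that is much larger than GPU RAM is to process it in fixed-size chunks of rows. The sketch below only shows the chunk arithmetic in plain C; it is not taken from PgOpenCL, and the names chunk_count, chunk_rows, and rows_per_chunk are hypothetical.

```c
#include <stddef.h>

/* Number of chunks needed to cover total_rows when at most
   rows_per_chunk rows fit in device memory at once.
   Returns 0 if rows_per_chunk is 0. */
size_t chunk_count(size_t total_rows, size_t rows_per_chunk)
{
    if (rows_per_chunk == 0) {
        return 0;
    }
    /* Ceiling division without risking overflow in the addition. */
    return total_rows / rows_per_chunk
         + (total_rows % rows_per_chunk != 0 ? 1 : 0);
}

/* Number of rows in chunk k (0-based). The last chunk may be
   shorter; chunks past the end of the table have 0 rows. */
size_t chunk_rows(size_t total_rows, size_t rows_per_chunk, size_t k)
{
    if (rows_per_chunk == 0 || k >= chunk_count(total_rows, rows_per_chunk)) {
        return 0;
    }
    size_t start = k * rows_per_chunk;
    size_t left = total_rows - start;
    return left < rows_per_chunk ? left : rows_per_chunk;
}
```

A driver loop would then copy chunk k to the device, run the kernel, and copy results back, for each k from 0 up to chunk_count(...) - 1.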
Extended SQL Types
OpenCL Vector Types
OpenCL Image Types
Time Series
Caching kernel info CPU GPU Still present SQL Queries
Runtime Partitioning
Dynamic Simplified Return Types
Data Transfer
Device Management
CPU vs. GPU
Runtime Selection
Concurrency
No Pre-emptive Multi-Tasking
Time-out Long Queries
Partitioning / Scheduling
+ Overhead ( < 4s )
Map Array
Bulk Data Loaders
New Task
Summary
OpenCL
PostgreSQL
Q&A
PgOpenCL
Twitter: @3DMashUp
Blog: www.scribd.com/3dmashup
OpenCL