PostgreSQL OpenCL Procedural Language

The document introduces PgOpenCL, a new procedural language for PostgreSQL that allows developers to execute functions on a GPU using OpenCL. PgOpenCL maps PostgreSQL functions to OpenCL kernels that can run in parallel across hundreds or thousands of GPU threads. This unlocks the massive parallel processing power of GPUs for speeding up compute-intensive PostgreSQL queries and analytics.

Uploaded by

3dmashup

Introducing PgOpenCL

A New PostgreSQL
Procedural Language
Unlocking the Power of the GPU!
By
Tim Child
Bio

Tim Child
• 35 years of experience in software development
• Formerly
• VP Oracle Corporation
• VP BEA Systems Inc.
• VP Informix
• Leader at Illustra, Autodesk, Navteq, Intuit, …
• 30+ years of experience in 3D, CAD, GIS and DBMS
Terminology
Term                 Description
Procedural Language  Language for SQL procedures (e.g. PL/pgSQL, Perl, Tcl, Java, …)
GPU                  Graphics Processing Unit (highly specialized processor for graphics)
GPGPU                General Purpose GPU (non-graphics programming on a GPU)
CUDA                 Nvidia’s GPU programming environment
APU                  Accelerated Processing Unit (AMD’s hybrid CPU & GPU chip)
ISO C99              Modern standard version of the C language
OpenCL               Open Computing Language
OpenMP               Open Multi-Processing (parallelizing compilers)
SIMD                 Single Instruction Multiple Data (vector instructions)
SSE                  Streaming SIMD Extensions for x86 and x64 (Intel, AMD)
xPU                  Any Processing Unit device (CPU, GPU, APU)
Kernel               Function that executes on an OpenCL device
Work Item            Instance of a Kernel
Workgroup            A group of Work Items
FLOP                 Floating Point Operation (single precision = SQL real type)
MIC                  Many Integrated Core (Intel’s 50+ x86 core chip architecture)
Some Technology Trends
Impacting DBMS
• Solid State Storage
– Reduced Access Time, Lower Power, Increasing in capacity
• Virtualization
– Server consolidation, Specialized VM’s, lowers direct costs
• Cloud Computing
– EC2, Azure, … lowers capital requirements
• Multi-Core
– 2, 4, 6, 8, 12, … cores: lots of benefits for multi-threaded applications

• xPU (GPU/APU)
– GPU > 1000 Cores
– > 1 TFLOP/s @ €2500
– APU = CPU + GPU Chip Hybrids due in Mid 2011
– 2 TFLOP/s for $2.10 per hour (AWS EC2)
– Intel MIC “Knights Corner”: > 50 x86 Cores
Compute Intensive
xPU Database Applications
• Bioinformatics

• Signal/Audio/Image Processing/Video

• Data Mining & Analytics

• Searching

• Sorting

• Spatial Selections and Joins

• Map/Reduce

• Scientific Computing

• Many Others …
GPU vs CPU
Vendor                      NVidia          ATI Radeon      Intel
Architecture                Fermi           Evergreen       Nehalem
Cores                       448 (simple)    1600 (simple)   4 (complex)
Transistors                 3.1 B           2.15 B          731 M
Clock                       1.5 GHz         851 MHz         3 GHz
Peak Float Performance      1500 GFLOP/s    2720 GFLOP/s    96 GFLOP/s
Peak Double Performance     750 GFLOP/s     544 GFLOP/s     48 GFLOP/s
Memory Bandwidth            ~190 GB/s       ~153 GB/s       ~30 GB/s
Power Consumption           250 W           > 250 W         80 W
SIMD / Vector Instructions  Many            Many            SSE4+
Multi-Core Performance
[Figure: multi-core performance chart. Source: NVidia]
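Future (Mid 2011)
APU Based PC
APU (Accelerated Processing Unit)
[Diagram: an APU chip containing CPU cores, a North Bridge and an embedded GPU. Labeled links: ~20 GB/s to system RAM, ~20 GB/s at the North Bridge, ~12 GB/s over PCIe to a discrete GPU, and 150 GB/s between the discrete GPU and its graphics RAM. Callout: the APU adds an embedded GPU. Source: AMD]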
Scalar vs. SIMD
Scalar Instruction
C = A + B
1 + 2 = 3

SIMD Instruction
Vector C = Vector A + Vector B
  1  3  5  7
+ 2  4  6  8
= 3  7 11 15

OpenCL
Vector lengths 2, 4, 8, 16 for char, short, int, float, double
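The scalar/SIMD contrast on this slide can be sketched in plain C. This is only an illustration: `int4_sketch` is a hypothetical stand-in for OpenCL's built-in `int4` type, and the loop spells out what a single vector instruction would do across four lanes. In OpenCL C the body of `vector_add4` would simply be `a + b` on `int4` values.

```c
#include <stddef.h>

/* Hypothetical stand-in for OpenCL's built-in int4 vector type. */
typedef struct { int s[4]; } int4_sketch;

/* Scalar add: one operation, one result. */
static int scalar_add(int a, int b)
{
    return a + b;
}

/* Vector add: conceptually ONE SIMD instruction producing four results.
   The loop only emulates the four lanes on a machine without vector types. */
static int4_sketch vector_add4(int4_sketch a, int4_sketch b)
{
    int4_sketch c;
    for (size_t lane = 0; lane < 4; lane++)
        c.s[lane] = a.s[lane] + b.s[lane];
    return c;
}
```

Called with the slide's operands, (1, 3, 5, 7) and (2, 4, 6, 8), `vector_add4` yields (3, 7, 11, 15).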
Summarizing xPU
Trends
• Many more xPU Cores in our Future
• Compute Environment becoming Hybrid
– CPU and GPU’s
– Need CPU to give access to GPU power
• GPU Capabilities
– Lots of cores
– Vector/SIMD Instructions
– Fast Memory
• GPU Futures
– Virtual Memory
– Multi-tasking / Pre-emption
Scaling PostgreSQL Queries
on xPU’s
[Diagram: a Postgres process fans out into PgOpenCL threads, a few on the multi-core CPU and many more on the many-core GPU. Caption: Using More Transistors]
Parallel
Programming Systems
Category            CUDA     OpenMP      OpenCL
Language            C        C, Fortran  C
Cross Platform      No       Yes         Yes
Standard            Vendor   OpenMP      Khronos
CPU                 No       Yes         Yes
GPU                 Yes      No          Yes
Clusters            No       Yes         No
Compilation / Link  Static   Static      Dynamic

What is OpenCL?
• OpenCL - Open Computing Language
– Subset of C99
– Open Specification
– Proposed by Apple
– Many Companies Collaborated on the Specification
– Portable, Device Agnostic
– Specification maintained by Khronos Group
• PgOpenCL
– OpenCL as a PostgreSQL Procedural Language
System Overview
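[Diagram: a web browser talks HTTP to a web server and app server, which send SQL statements over TCP/IP to the PostgreSQL DBMS server; a PostgreSQL client also connects over TCP/IP. Inside the DBMS server, PostgreSQL reads tables via disk I/O and runs PgOpenCL SQL procedures on a GPGPU across a PCIe X2 bus.]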
OpenCL
Language
• A subset of ISO C99
– But without some C99 features such as standard C99 headers, function pointers, recursion, variable length arrays, and bit fields
• A superset of ISO C99 with additions for:
– Work-items and Workgroups
– Vector types
– Synchronization
– Address space qualifiers
• Also includes a large set of built-in functions
– Image manipulation
– Work-item manipulation
– Specialized math routines, etc.
PgOpenCL
Components
• New PostgreSQL Procedural Language
– Language handler
• Maps arguments
• Calls function
• Returns results
– Language validator
• Creates Function with parameter & syntax checking
• Compiles Function to a Binary format
• New data types
– cl_double4, cl_double8, ….
• System Admin Pseudo-Tables
– Platform, Device, Run-Time, …
PgOpenCL
Admin
PGOpenCL
Function Declaration
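CREATE OR REPLACE FUNCTION VectorAdd(IN a float[], IN b float[], OUT c float[])
AS $BODY$
/* PgOpenCL pragmas naming the target platform and device type. */
#pragma PGOPENCL Platform : ATI Stream
#pragma PGOPENCL Device : CPU

/* Require work-groups of exactly 64 x 1 x 1 work items. */
__kernel __attribute__((reqd_work_group_size(64, 1, 1)))
void VectorAdd(__global const float *a, __global const float *b, __global float *c)
{
    /* Each work item adds the one element selected by its global id. */
    int i = get_global_id(0);
    c[i] = a[i] + b[i];
}
$BODY$
LANGUAGE PgOpenCL;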
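To make the kernel's execution concrete, the following plain C sketch emulates it serially on the host. It is not PgOpenCL code and says nothing about how the language handler actually launches kernels: `vector_add_work_item`, `run_vector_add` and `padded_global_size` are hypothetical names, and the `gid` parameter plays the role of `get_global_id(0)`. The padding helper reflects the OpenCL 1.x rule that the global size must be a multiple of the work-group size, which matters here because the kernel requires groups of 64.

```c
#include <stddef.h>

/* Body of the VectorAdd kernel for ONE work item.
   gid stands in for get_global_id(0). */
static void vector_add_work_item(size_t gid, const float *a,
                                 const float *b, float *c)
{
    c[gid] = a[gid] + b[gid];
}

/* Hypothetical serial "launcher": an OpenCL device would run these
   iterations as parallel work items instead of a loop. */
static void run_vector_add(size_t global_size, const float *a,
                           const float *b, float *c)
{
    for (size_t gid = 0; gid < global_size; gid++)
        vector_add_work_item(gid, a, b, c);
}

/* Round n up to the next multiple of group_size (group_size > 0), e.g. to
   satisfy a required work-group size of 64 under OpenCL 1.x. */
static size_t padded_global_size(size_t n, size_t group_size)
{
    return ((n + group_size - 1) / group_size) * group_size;
}
```

With a required work-group size of 64, an input of 100 elements would be padded to 128 work items; the kernel as written has no bounds check, so the arrays themselves would need to be padded as well.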
PgOpenCL
Execution Model
[Diagram: data flow for VectorAdd(A, B)]
• Select tables A and B into arrays
• Copy the arrays to the xPU
• VectorAdd(A, B) runs as 100’s - 1000’s of threads (kernels), computing A + B = C
• Returns C, which is copied back
• Unnest the array C into a table
Using
Re-Shaped Tables
[Diagram: a table of arrays holding A and B is copied to the xPU, where VectorAdd(A, B) runs as 100’s - 1000’s of threads (kernels) computing A + B = C; the returned C arrays are copied into a table of arrays.]
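One plausible reading of a re-shaped table is that a long column of scalar values is stored as fewer rows, each holding a fixed-length array that can be handed to a kernel in one call. The C sketch below illustrates that packing under this assumption; `reshaped_row_count` and `fill_chunk` are hypothetical helpers, and zero-padding the last row is an illustrative choice, not something the slides specify.

```c
#include <stddef.h>
#include <string.h>

/* Number of array rows needed to hold n_values scalars
   at chunk_len values per row (chunk_len > 0). */
static size_t reshaped_row_count(size_t n_values, size_t chunk_len)
{
    return (n_values + chunk_len - 1) / chunk_len;
}

/* Copy row number `row` of the re-shaped table out of a flat column
   into `out` (chunk_len floats), zero-padding a short final row. */
static void fill_chunk(const float *column, size_t n_values,
                       size_t chunk_len, size_t row, float *out)
{
    size_t start = row * chunk_len;
    size_t avail = start < n_values ? n_values - start : 0;
    size_t n = avail < chunk_len ? avail : chunk_len;

    memset(out, 0, chunk_len * sizeof *out);
    if (n > 0)
        memcpy(out, column + start, n * sizeof *out);
}
```

Choosing `chunk_len` as a multiple of the kernel's work-group size keeps every row directly usable as kernel input.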
Today’s GPGPU
Challenges
• No Pre-emptive Multi-Tasking
• No Virtual Memory
• Limited Bandwidth to discrete GPGPU
– 1 – 8 GB/s over PCIe Bus
• Hard to Program
– New Parallel Algorithms and constructs
– “New” C language dialect
• Immature Tools
– Compilers, IDE, Debuggers, Profilers - early years
• Data organization really matters
– Types, Structure, and Alignment
– SQL needs to Shape the Data
• Profiling and Debugging are not easy

Solves Well for Problem Sets with the Right Shape!

Making a Problem
Work for You
• Determine % Parallelism Possible
for ( i = 0; i < ∞; i++ )
for ( j = 0; j < ∞; j++ )
for ( k = 0; k < ∞; k++ )

• Arrange data to fit available GPU RAM

• Ensure calculation time >> I/O transfer overhead
• Learn about Parallel Algorithms and the OpenCL language
• Learn new tools
• Carefully choose Data Types, Organization and Alignments
• Profile and Measure at Every Stage
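Two of the checks above lend themselves to back-of-the-envelope arithmetic. The sketch below is an assumption-laden illustration, not part of the talk: it uses Amdahl's law (which the slides do not name) to bound the speedup for a given parallel fraction, and a simple bytes-over-bandwidth estimate for transfer time, where the bandwidth would be something like the 1 – 8 G/s PCIe range quoted earlier. Both function names are hypothetical.

```c
/* Amdahl's law: upper bound on speedup when a fraction
   `parallel_fraction` (0..1) of the work runs on `n_units`
   processing elements and the rest stays serial. */
static double amdahl_speedup(double parallel_fraction, double n_units)
{
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_units);
}

/* Seconds needed to move `bytes` over a bus sustaining `gb_per_s`
   gigabytes per second (1 GB taken as 1e9 bytes). */
static double transfer_seconds(double bytes, double gb_per_s)
{
    return bytes / (gb_per_s * 1e9);
}
```

If the estimated transfer time is not much smaller than the measured kernel time, the work is probably better left on the CPU; profiling on the real hardware remains the deciding step.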
PgOpenCL
System Requirements
• PostgreSQL 9.x
• For GPU’s
– AMD ATI OpenCL Stream SDK 2.x
– NVidia CUDA 3.x SDK
– Recent Macs with Mac OS X 10.6
• For CPU’s (Pentium M or more recent)
– AMD ATI OpenCL Stream SDK 2.x
– Intel OpenCL SDK Alpha Release (x86)
– Recent Macs with Mac OS X 10.6
PGOpenCL
Status
[Timeline: Prototype today (2010), Beta in 1Q 2011]

• Wish List
• Beta Testers
– Existing OpenCL App?
– Have a GPU App?
• Contributors
– Code server side functions?
• Sponsors & Supporters
– AMD Fusion Fund?
– Khronos?
PgOpenCL
Future Plans
• Increase Platform Support
• Scatter/Gather Functions
• Additional Type Support
– Image Types
– Sparse Matrices
• Run-Time
– Asynchronous
– Events
– Profiling
– Debugging
Using the
Whole Brain
[Diagram: an APU chip on which Postgres runs PgOpenCL on the CPU cores and on the embedded GPU, joined by a North Bridge at ~20 GB/s]
“You can’t be in a parallel universe with a single brain!”
• Heterogeneous Compute Environments
• CPU’s, GPU’s, APU’s
• Expect 100’s – 1000’s of cores

The Future Is Parallel: What's a Programmer to Do?

Summarizing
PgOpenCL
• Supports Heterogeneous Parallel Compute Environments
• CPU’s, GPU’s, APU’s

• OpenCL
• Portable and high-performance framework
– Ideal for computationally intensive algorithms
– Access to all compute resources (CPU, APU, GPU)
– Well-defined computation/memory model
• Efficient parallel programming language
– C99 with extensions for task and data parallelism
– Rich set of built-in functions
• Open standard for heterogeneous parallel computing
• PgOpenCL
• Integrates PostgreSQL with OpenCL
• Provides Easy SQL Access to xPU’s
• APU, CPU, GPGPU
• Integrates OpenCL
• SQL + Web Apps(PHP, Ruby, … )
More
Information
• PGOpenCL
• Twitter @3DMashUp

• OpenCL

• www.khronos.org/opencl/

• www.amd.com/us/products/technologies/stream-technology/opencl/

• http://software.intel.com/en-us/articles/intel-opencl-sdk

• http://www.nvidia.com/object/cuda_opencl_new.html

• http://developer.apple.com/technologies/mac/snowleopard/opencl.html
Q&A

• Using Parallel Applications?

• Benefits of OpenCL / PgOpenCL?
• Want to Collaborate on PgOpenCL?
