EnSPy: Python Library For Computations of Ensembles of Particles On GPU
Why GPU?
GPU (Graphics Processing Unit):
- programmable
- manycore
- multithreaded
- very high memory bandwidth
GPU programming gives us:
- high performance
- transparent scalability
... and is useful for problems with high data parallelism:
- large datasets
- portions of data can be processed independently
Outline
1. NVIDIA GPU Architecture and CUDA
2. Python and CUDA
3. EnSPy functionality
4. EnSPy architecture
5. Example: D5 potential
6. Example: Hill problem
7. Example: Hill problem, N-body version
8. Performance results
9. Future development
NVIDIA GPU Architecture and CUDA
[Figure: GPU architecture diagram; surviving label: global memory]
C for CUDA
- a set of extensions to C
- runtime library
- function and variable type qualifiers
- built-in vector types: float4, double2, etc.
- built-in variables
Kernels
- map the parallel part of the program to the GPU
- execution: N times in parallel by N CUDA threads
CUDA Driver API
- low-level control over the execution
- no need for the nvcc compiler if kernels are precompiled
- only the driver is needed
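The statement that a kernel runs N times in parallel by N CUDA threads can be illustrated with a serial Python emulation. This is only a sketch of the indexing idea: the names `saxpy_kernel`, `launch` and `saxpy` are hypothetical, the loops run one after another rather than in parallel, and nothing here touches a GPU.

```python
def saxpy_kernel(block_idx, block_dim, thread_idx, n, a, x, y, out):
    """Body executed by one (emulated) thread."""
    i = block_idx * block_dim + thread_idx  # global thread index
    if i < n:  # guard: the last block may be only partially used
        out[i] = a * x[i] + y[i]


def launch(kernel, grid_dim, block_dim, *args):
    """Serial stand-in for a kernel launch over grid_dim * block_dim threads."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)


def saxpy(a, x, y, block_dim=4):
    """Compute a * x + y element-wise, one emulated thread per element."""
    n = len(x)
    out = [0.0] * n
    grid_dim = (n + block_dim - 1) // block_dim  # enough blocks to cover n
    launch(saxpy_kernel, grid_dim, block_dim, n, a, x, y, out)
    return out
```

Each emulated thread handles exactly one array element, which is the data-parallel pattern that maps well onto the GPU.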
Execution model
Python and CUDA
Why Python?
EnSPy functionality
Motivation
Combine the flexibility of Python with the efficiency of C++ CUDA for N-body simulations:
- the interface of EnSPy is written in Python
- the core of EnSPy is written in C++
- the two are joined together by scipy.weave.inline
The C++ core can be used without Python: just include the header and link with the precompiled shared library.
Easily extensible, both through the high-level Python interface and the low-level C++ core:
- new algorithms, initial distributions, etc.
- multi-GPU parallelization
It's easy to experiment with EnSPy!
Types of ensembles:
- Simple ensemble: no interaction between particles, only an external potential
- N-body ensemble: both an external potential and gravitational interaction between particles
Current algorithms:
- 4th-order Runge-Kutta for the simple ensemble
- Hermite scheme with shared time steps for the N-body ensemble
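For the simple ensemble, every particle moves independently in the external potential, so one 4th-order Runge-Kutta step can be applied particle by particle. The sketch below is a plain-Python illustration, not EnSPy's C++/CUDA core; it assumes unit mass, equations of motion dq/dt = p and dp/dt = F(q), and the helper names (`rk4_step`, `step_ensemble`, `harmonic_force`) are hypothetical.

```python
def rk4_step(q, p, force, dt):
    """One classical RK4 step for dq/dt = p, dp/dt = force(q) (unit mass)."""
    def shifted(a, b, s):
        # Return a + s * b component-wise.
        return [x + s * y for x, y in zip(a, b)]

    k1q, k1p = p, force(q)
    k2q, k2p = shifted(p, k1p, dt / 2), force(shifted(q, k1q, dt / 2))
    k3q, k3p = shifted(p, k2p, dt / 2), force(shifted(q, k2q, dt / 2))
    k4q, k4p = shifted(p, k3p, dt), force(shifted(q, k3q, dt))

    q_new = [x + dt / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(q, k1q, k2q, k3q, k4q)]
    p_new = [x + dt / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1p, k2p, k3p, k4p)]
    return q_new, p_new


def step_ensemble(ensemble, force, dt):
    """Advance every (q, p) pair independently; this loop is what a GPU parallelizes."""
    return [rk4_step(q, p, force, dt) for q, p in ensemble]


def harmonic_force(q):
    """Illustrative test force for U = q^2 / 2 (not one of the slide examples)."""
    return [-x for x in q]
```

Because no particle reads another particle's state, the loop in `step_ensemble` has no dependencies between iterations.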
Predefined initial distributions:
- uniform, point and spherical for simple ensembles
- uniform sphere with 2T/|U| = 1 for the N-body ensemble
- the user can supply functions (in Python) for initial ensemble generation
User-specified values and expressions:
- parameters of the initial distribution
- potential, forces, parameters of the integration scheme
- arbitrary number of triggers: N_i(t), the number of particles that have not crossed the given hypersurface F_i(q, p) = 0 before time t
- arbitrary number of averages: quantities F_i(q, p, t) that should be averaged over the ensemble
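The trigger N(t) described above can be sketched as a per-particle flag that is cleared the first time the surface function F(q, p) changes sign between two consecutive steps. This is an illustrative plain-Python version under that sign-change assumption; the function names are hypothetical, and free flight (no forces) is used only to keep the example short.

```python
def surface_x0(q, p):
    """Hypersurface F(q, p) = q0, i.e. the plane x = 0."""
    return q[0]


def update_trigger(alive, prev_values, states, surface):
    """Clear the flag of every particle whose F changed sign (or reached zero)."""
    new_values = [surface(q, p) for q, p in states]
    for i, (old, new) in enumerate(zip(prev_values, new_values)):
        if alive[i] and old * new <= 0.0:
            alive[i] = False
    return new_values


def free_flight_history(states, surface, dt, n_steps):
    """Return N(t) after each step for free particles: q advances by p * dt."""
    alive = [True] * len(states)
    values = [surface(q, p) for q, p in states]
    history = [sum(alive)]
    for _ in range(n_steps):
        states = [([x + v * dt for x, v in zip(q, p)], p) for q, p in states]
        values = update_trigger(alive, values, states, surface)
        history.append(sum(alive))
    return history
```

A particle that crosses the surface stays counted as escaped even if it later returns, which matches the "do not cross before time t" wording.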
Runtime generation and compilation of C and CUDA code:
- user-specified expressions (as Python strings) are wrapped by the EnSPy template subpackage into C functions and a CUDA module
- compiled at runtime
High usability and computational efficiency:
- flexible Python interface
- all actual calculations are performed by the runtime-generated C extension and the precompiled shared library
Drawback:
- extra time for generation and compilation of new code
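The wrapping of a user expression into a C function can be illustrated with the standard-library `string.Template`. The template text, the `q0`/`p1` naming convention and the function signature below are assumptions made for this sketch; they are not taken from the EnSPy template subpackage, and the compilation step is left out.

```python
import re
from string import Template

# Hypothetical C function skeleton; EnSPy's real templates may differ.
C_FUNCTION = Template(
    "static inline double ${name}(const double *q, const double *p, double t)\n"
    "{\n"
    "    return ${expr};\n"
    "}\n"
)


def to_c_expression(expr):
    """Rewrite q0, p1, ... into C array accesses q[0], p[1], ..."""
    return re.sub(r"\b([qp])(\d+)\b", r"\1[\2]", expr)


def wrap_expression(name, expr):
    """Return C source for a function that evaluates the user expression."""
    return C_FUNCTION.substitute(name=name, expr=to_c_expression(expr))
```

For example, `wrap_expression("trigger0", "q0")` yields a C function whose body is `return q[0];`, ready to be handed to a compiler at runtime.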
EnSPy architecture
Example: D5 potential
Overview
Problem: escape from a potential well.
Watched values (trigger): N(t), the number of particles remaining in the well at time t.
Potential: $U_{D5} = 2ay^2 - x^2 + xy^2 + \frac{x^4}{4}$
Critical energy: $E_{cr} = E_S = 0$
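As a quick check of the potential, the sketch below evaluates it and the corresponding force in plain Python, assuming the form U = 2ay^2 - x^2 + xy^2 + x^4/4 (the sign of the x^2 term is taken so that the saddle at the origin has energy E_S = 0). The function names are hypothetical and the parameter a must be supplied by the caller, since the slides do not give its value.

```python
import math


def d5_potential(x, y, a):
    """U_D5(x, y) = 2 a y^2 - x^2 + x y^2 + x^4 / 4 (assumed form)."""
    return 2.0 * a * y**2 - x**2 + x * y**2 + x**4 / 4.0


def d5_force(x, y, a):
    """Force = -grad U_D5, derived by hand from the expression above."""
    fx = 2.0 * x - y**2 - x**3
    fy = -4.0 * a * y - 2.0 * x * y
    return fx, fy


# The origin is a saddle with U = 0; the well at x > 0 has its minimum
# at (sqrt(2), 0), where U = -1 for any a > 0.
X_MIN = math.sqrt(2.0)
```

With this form the well in x > 0 is separated from the rest of the plane by the saddle at x = 0, which is why x = 0 is a natural escape trigger.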
EnSPy: Python library for computations of ensembles of particles on GPU Example: D5 potential
1
1
px
0 y
2
2 2 1 0 x 1 2
1 x
EnSPy: Python library for computations of ensembles of particles on GPU Example: D5 potential
Calculation setup:
- simple ensemble
- uniform initial distribution of N = 10240 particles in x > 0, U(x, y) < E
- 12 lines of simple Python code (examples/d5.py):
  - specification of integration parameters
  - trigger: x = 0, i.e. q0 = 0
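One way to generate a uniform distribution in the region x > 0, U(x, y) < E is rejection sampling from a bounding box. The sketch below is not the code in examples/d5.py: it assumes the potential U = 2ay^2 - x^2 + xy^2 + x^4/4, samples coordinates only (momenta are not treated), and uses a hypothetical box that is large enough for the illustrative values a = 1.0 and E = 0.1.

```python
import random


def d5_potential(x, y, a):
    """Assumed D5 potential: 2 a y^2 - x^2 + x y^2 + x^4 / 4."""
    return 2.0 * a * y**2 - x**2 + x * y**2 + x**4 / 4.0


def sample_well(n, energy, a, box=((0.0, 2.5), (-1.5, 1.5)), seed=0):
    """Draw n points uniformly from {x > 0, U(x, y) < energy} by rejection.

    The box must enclose the whole allowed region for the result to be uniform.
    """
    rng = random.Random(seed)
    (x_lo, x_hi), (y_lo, y_hi) = box
    points = []
    while len(points) < n:
        x = rng.uniform(x_lo, x_hi)
        y = rng.uniform(y_lo, y_hi)
        if x > 0.0 and d5_potential(x, y, a) < energy:
            points.append((x, y))
    return points
```

Calling `sample_well(10240, 0.1, 1.0)` would give an ensemble of the size used on the slide; a different a or E may need a different box.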
Results:
- regular particles are trapped in the well
- the initial mixed state splits
[Figure: curve labelled E = 0.1 plotted against t from 0 to 30, vertical axis from 0 to 1]
Example: Hill problem
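Overview
Problem: toy model of escape from a star cluster, i.e. escape of a star from the potential of a point rotating star cluster of mass Mc and a point galaxy core of mass Mg ≫ Mc.
Watched values (trigger): N(t), the number of particles remaining in the cluster at time t.
Potential in the cluster frame of reference (tidal approximation): $U_{Hill} = -\frac{3}{2}\Omega^2 x^2 - \frac{GM_c}{r}$, where $\Omega$ is the angular velocity of the cluster's orbit around the galaxy core.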
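The tidal radius that appears in the setup follows from the Hill potential: on the x axis the potential -(3/2)Ω²x² - GMc/x has a stationary point where 3Ω²x = GMc/x², i.e. at r_t = (GMc / (3Ω²))^(1/3). The sketch below checks this numerically. It assumes the standard form of the Hill potential written above with Ω as the orbital angular velocity, and the values Ω = 1/√3 and GMc = 1 are illustrative choices that give r_t = 1; the function names are hypothetical.

```python
import math


def hill_potential(x, y, omega, gm):
    """U_Hill = -(3/2) omega^2 x^2 - gm / r with r = sqrt(x^2 + y^2)."""
    r = math.hypot(x, y)
    return -1.5 * omega**2 * x**2 - gm / r


def tidal_radius(omega, gm):
    """Stationary point of U_Hill on the x axis: r_t = (gm / (3 omega^2))^(1/3)."""
    return (gm / (3.0 * omega**2)) ** (1.0 / 3.0)


def dU_dx_on_axis(x, omega, gm, h=1e-6):
    """Central-difference derivative of U_Hill along the x axis."""
    return (hill_potential(x + h, 0.0, omega, gm)
            - hill_potential(x - h, 0.0, omega, gm)) / (2.0 * h)
```

The value of the potential at (r_t, 0) is the critical energy above which a star can leave the cluster through the saddle points at x = ±r_t.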
Potential:
[Figure: Hill curves; horizontal axis x]
Calculation setup:
Simple ensemble uniform initial distribution of N = 10240 particles in |x| < rt U(x, y ) < E =
1 3
rt = 1
= 0.
Results:
Trapping of regular particles (some tricky physics here):
[Figure: N(t) from 0 to 1·10^4 plotted against nt from 0 to 5·10^4]
Example: Hill problem, N-body version
Overview
Problem: simplified model of escape from a star cluster, i.e. escape of a star from the potential of a rotating star cluster with total mass Mc and the point potential of a galaxy core with mass Mg ≫ Mc (2D).
Watched values: configuration of the cluster.
Potential of the galaxy core in the cluster frame of reference (tidal approximation): $U_{HillNB} = -\frac{3}{2}\Omega^2 x^2$, where $\Omega$ is the angular velocity of the cluster's orbit around the galaxy core.
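In the N-body version the stars also attract each other, and EnSPy integrates such ensembles with a Hermite scheme with shared time steps. The following is a minimal plain-Python sketch of one predictor-corrector Hermite step with a single fixed step shared by all particles. It is not EnSPy's implementation: the external tidal potential, softening and adaptive step selection are left out, G = 1 is assumed, and all function names are hypothetical.

```python
import math


def acc_and_jerk(pos, vel, mass, G=1.0):
    """Direct-sum gravitational acceleration and jerk for every particle."""
    n, dim = len(pos), len(pos[0])
    acc = [[0.0] * dim for _ in range(n)]
    jerk = [[0.0] * dim for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dr = [pos[j][k] - pos[i][k] for k in range(dim)]
            dv = [vel[j][k] - vel[i][k] for k in range(dim)]
            r2 = sum(c * c for c in dr)
            r3 = r2 * math.sqrt(r2)
            rv = sum(dr[k] * dv[k] for k in range(dim)) / r2
            for k in range(dim):
                acc[i][k] += G * mass[j] * dr[k] / r3
                jerk[i][k] += G * mass[j] * (dv[k] - 3.0 * rv * dr[k]) / r3
    return acc, jerk


def hermite_step(pos, vel, mass, dt, G=1.0):
    """One 4th-order Hermite step; every particle uses the same dt."""
    n, dim = len(pos), len(pos[0])
    a0, j0 = acc_and_jerk(pos, vel, mass, G)
    # Predictor: Taylor expansion using acceleration and jerk.
    pp = [[pos[i][k] + vel[i][k] * dt + a0[i][k] * dt**2 / 2 + j0[i][k] * dt**3 / 6
           for k in range(dim)] for i in range(n)]
    vp = [[vel[i][k] + a0[i][k] * dt + j0[i][k] * dt**2 / 2
           for k in range(dim)] for i in range(n)]
    a1, j1 = acc_and_jerk(pp, vp, mass, G)
    # Corrector: Hermite interpolation between the old and predicted states.
    new_vel = [[vel[i][k] + (a0[i][k] + a1[i][k]) * dt / 2
                + (j0[i][k] - j1[i][k]) * dt**2 / 12
                for k in range(dim)] for i in range(n)]
    new_pos = [[pos[i][k] + (vel[i][k] + new_vel[i][k]) * dt / 2
                + (a0[i][k] - a1[i][k]) * dt**2 / 12
                for k in range(dim)] for i in range(n)]
    return new_pos, new_vel


def total_energy(pos, vel, mass, G=1.0):
    """Kinetic plus pairwise potential energy, used to monitor accuracy."""
    kinetic = sum(0.5 * m * sum(c * c for c in v) for m, v in zip(mass, vel))
    potential = 0.0
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(pos[i], pos[j])))
            potential -= G * mass[i] * mass[j] / d
    return kinetic + potential
```

The double loop in `acc_and_jerk` costs O(N²) operations per step, which is the part that benefits most from running on the GPU.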
Calculation setup:
- N-body ensemble
- 2D (z = 0) initial distribution of N = 10240 particles inside a circle of radius R with zero initial velocities
- 14 lines of simple Python code (examples/hill nbody.py):
  - specification of integration parameters
Mc = 1, R = 200, =
1 3
[Figure: snapshots of the cluster configuration; panel titles: step = 401, 601, 801, 1001, 1201; axes span 300 units either side of 0, horizontal axis x]
Performance results
Not as good as it could be; subject to improvement. Estimate: 1 TFlops on 2x recent Fermi graphics processors.
[Figure: performance in GFlop/s (0 to 40) versus N (1·10^4 to 2·10^5) on a GTX260 in double precision (DP), for the N-body ensemble and the simple ensemble]
Future development