Introduction To CUDA Platform 1

Uploaded by

Tarun Ram


Introduction to the CUDA Platform

CUDA Parallel Computing Platform
www.nvidia.com/getcuda

Programming Approaches
• Libraries: “Drop-in” Acceleration
• OpenACC Directives: Easily Accelerate Apps
• Programming Languages: Maximum Flexibility

Development Environment
• Nsight IDE: Linux, Mac and Windows; GPU Debugging and Profiling
• CUDA-GDB debugger
• NVIDIA Visual Profiler

Open Compiler Tool Chain
• Enables compiling new languages to CUDA platform, and CUDA languages to other architectures

Hardware Capabilities
• SMX
• Dynamic Parallelism
• HyperQ
• GPUDirect
© NVIDIA 2013
3 Ways to Accelerate Applications
• Libraries: “Drop-in” Acceleration
• OpenACC Directives: Easily Accelerate Applications
• Programming Languages: Maximum Flexibility
Libraries: Easy, High-Quality Acceleration
• Ease of use: Using libraries enables GPU acceleration without in-depth knowledge of GPU programming
• “Drop-in”: Many GPU-accelerated libraries follow standard APIs, thus enabling acceleration with minimal code changes
• Quality: Libraries offer high-quality implementations of functions encountered in a broad range of applications
• Performance: NVIDIA libraries are tuned by experts

GPU-accelerated libraries shown on the slide:
• NVIDIA cuBLAS
• NVIDIA cuRAND
• NVIDIA cuSPARSE
• NVIDIA NPP
• NVIDIA cuFFT
• Vector Signal Image Processing
• GPU Accelerated Linear Algebra
• Matrix Algebra on GPU and Multicore
• IMSL Library
• ArrayFire Matrix Computations
• Building-block Algorithms for CUDA
• Sparse Linear Algebra
• C++ STL Features for CUDA
3 Steps to a CUDA-accelerated application
• Step 1: Substitute library calls with equivalent CUDA library calls
    saxpy ( … )  ->  cublasSaxpy ( … )
• Step 2: Manage data locality
    - with CUDA: cudaMalloc(), cudaMemcpy(), etc.
    - with CUBLAS: cublasAlloc(), cublasSetVector(), etc.
• Step 3: Rebuild and link the CUDA-accelerated library
    nvcc myobj.o -lcublas
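
The three steps can be sketched for SAXPY (y = a*x + y) as follows. This is an illustrative sketch, not code from the slides. The function names `saxpy_cpu`, `saxpy_cublas` and `saxpy`, and the `USE_CUBLAS` macro, are hypothetical. The GPU path assumes the handle-based cuBLAS API from `cublas_v2.h`, not the legacy `cublasAlloc()` interface named on the slide, and it omits error checking for brevity. Without `USE_CUBLAS` the file builds with any C++11 compiler and runs the CPU loop.

```cpp
// Sketch of the three steps for SAXPY (y = a*x + y).
// Default build: plain CPU loop, any C++11 compiler.
// Assumed GPU build: nvcc -DUSE_CUBLAS saxpy.cpp -lcublas   (Step 3)
#include <cassert>
#include <cstddef>
#include <vector>

#ifdef USE_CUBLAS
#include <cublas_v2.h>
#include <cuda_runtime.h>
#endif

// CPU reference, standing in for the original library call.
void saxpy_cpu(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = a * x[i] + y[i];
    }
}

#ifdef USE_CUBLAS
// Same operation through cuBLAS (return codes ignored in this sketch).
void saxpy_cublas(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const int n = static_cast<int>(x.size());
    float* d_x = nullptr;
    float* d_y = nullptr;
    cublasHandle_t handle;

    // Step 2: manage data locality (allocate on the device, copy inputs over).
    cudaMalloc(reinterpret_cast<void**>(&d_x), n * sizeof(float));
    cudaMalloc(reinterpret_cast<void**>(&d_y), n * sizeof(float));
    cublasCreate(&handle);
    cublasSetVector(n, sizeof(float), x.data(), 1, d_x, 1);
    cublasSetVector(n, sizeof(float), y.data(), 1, d_y, 1);

    // Step 1: the substituted library call.
    cublasSaxpy(handle, n, &a, d_x, 1, d_y, 1);

    // Copy the result back and release device resources.
    cublasGetVector(n, sizeof(float), d_y, 1, y.data(), 1);
    cublasDestroy(handle);
    cudaFree(d_x);
    cudaFree(d_y);
}
#endif

// Single entry point: callers do not change when the backend does.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
#ifdef USE_CUBLAS
    saxpy_cublas(a, x, y);
#else
    saxpy_cpu(a, x, y);
#endif
}
```

Calling `saxpy(2.0f, x, y)` with `x = {1, 2, 3}` and `y = {10, 20, 30}` leaves `y` holding `{12, 24, 36}` on the CPU path. Keeping one entry point is what makes the substitution "drop-in": only the body behind the call changes.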

Explore the CUDA (Libraries) Ecosystem
• CUDA Tools and Ecosystem described in detail on NVIDIA Developer Zone: developer.nvidia.com/cuda-tools-ecosystem


Simple compiler hints: the OpenACC compiler parallelizes the code enclosed by the directives. Works on many-core GPUs & multicore CPUs.

Your original Fortran or C code, with the OpenACC compiler hint (!$acc kernels ... !$acc end kernels) added:

    Program myscience
      ... serial code ...
    !$acc kernels
      do k = 1,n1
        do i = 1,n2
          ... parallel code ...
        enddo
      enddo
    !$acc end kernels
      ...
    End Program myscience
OpenACC: The Standard for GPU Directives
• Easy: Directives are the easy path to accelerate compute intensive applications
• Open: OpenACC is an open GPU directives standard, making GPU programming straightforward and portable across parallel and multi-core processors
• Powerful: GPU Directives allow complete access to the massive parallel power of a GPU
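
As a C++ counterpart to the Fortran skeleton, here is a minimal sketch of a `kernels` region around a nested loop. The function name `axpy_grid` and its parameters are hypothetical and not taken from the slides. The data clauses and `loop independent` hints are one plausible choice, and a given compiler may need different ones. A compiler without OpenACC support ignores the pragmas and runs the loops serially, so the code stays valid C++ either way.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// y[k][i] = a * x[k][i] + y[k][i] over an n1-by-n2 grid stored row-major.
// Hypothetical helper for illustration only.
void axpy_grid(int n1, int n2, float a,
               const std::vector<float>& x, std::vector<float>& y) {
    const int n = n1 * n2;
    assert(n1 >= 0 && n2 >= 0);
    assert(x.size() == static_cast<std::size_t>(n));
    assert(y.size() == static_cast<std::size_t>(n));

    const float* xp = x.data();
    float* yp = y.data();

    // Compiler hint: offload this region if OpenACC is enabled.
    // copyin: x is only read on the device; copy: y is read and written back.
    #pragma acc kernels copyin(xp[0:n]) copy(yp[0:n])
    {
        // Each (k, i) iteration touches a distinct element, so the
        // iterations are independent of each other.
        #pragma acc loop independent
        for (int k = 0; k < n1; ++k) {
            #pragma acc loop independent
            for (int i = 0; i < n2; ++i) {
                yp[k * n2 + i] = a * xp[k * n2 + i] + yp[k * n2 + i];
            }
        }
    }
}
```

The loop body is the same as in a serial version. Only the pragma lines are added, which is the "avoid restructuring existing code" property the slides emphasize.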

Directives: Easy & Powerful
• Real-Time Object Detection (Global Manufacturer of Navigation Systems): 5x in 40 Hours
• Valuation of Stock Portfolios using Monte Carlo (Global Technology Consulting Company): 2x in 4 Hours
• Interaction of Solvents and Biomolecules (University of Texas at San Antonio): 5x in 8 Hours

“Optimizing code with directives is quite easy, especially compared to CPU threads or writing CUDA kernels. The most important thing is avoiding restructuring of existing code for production applications.”
-- Developer at the Global Manufacturer of Navigation Systems
Start Now with OpenACC Directives
• Sign up for a free trial of the directives compiler now!
• Free trial license to PGI Accelerator
• Tools for quick ramp
www.nvidia.com/gpudirectives


Fortran: OpenACC, CUDA Fortran
C: OpenACC, CUDA C
C++: Thrust, CUDA C++
Python: PyCUDA, Copperhead

• Resembles C++ STL
• High-level interface
• Enhances developer productivity
• Enables performance portability between GPUs and multicore CPUs
• Flexible
• CUDA, OpenMP, and TBB backends
• Extensible and customizable
• Integrates with existing software
• Open source

    #include <thrust/host_vector.h>
    #include <thrust/device_vector.h>
    #include <thrust/generate.h>
    #include <thrust/sort.h>
    #include <thrust/copy.h>
    #include <cstdlib>

    int main(void)
    {
      // generate 32M random numbers on host
      thrust::host_vector<int> h_vec(32 << 20);
      thrust::generate(h_vec.begin(), h_vec.end(), rand);

      // transfer data to device (GPU)
      thrust::device_vector<int> d_vec = h_vec;

      // sort data on device
      thrust::sort(d_vec.begin(), d_vec.end());

      // transfer data back to host
      thrust::copy(d_vec.begin(), d_vec.end(), h_vec.begin());

      return 0;
    }

http://developer.nvidia.com/thrust or http://thrust.googlecode.com
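
To make the "Resembles C++ STL" point concrete, here is a host-only sketch of the same generate-then-sort pipeline using only the standard library. The function name `generate_and_sort` and the `seed` parameter are hypothetical, and `std::mt19937` is swapped in for `rand` so the result is reproducible. Comparing it with the Thrust snippet suggests how closely the container and algorithm calls correspond. The main difference is the explicit host/device transfer in the Thrust version.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Fill a vector with n pseudo-random non-negative ints, then sort it on the CPU.
// Hypothetical helper for illustration; it mirrors the Thrust calls one-to-one:
//   thrust::host_vector -> std::vector
//   thrust::generate    -> std::generate
//   thrust::sort        -> std::sort
std::vector<int> generate_and_sort(std::size_t n, unsigned seed) {
    std::mt19937 rng(seed);
    std::vector<int> v(n);

    // mt19937 yields 32-bit values; shifting right by one keeps them within int range.
    std::generate(v.begin(), v.end(),
                  [&rng]() { return static_cast<int>(rng() >> 1); });

    std::sort(v.begin(), v.end());
    assert(std::is_sorted(v.begin(), v.end()));
    return v;
}
```

A call such as `generate_and_sort(1000, 42u)` returns 1000 values in ascending order.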
Learn More
These languages are supported on all CUDA-capable GPUs. You might already have a CUDA-capable GPU in your laptop or desktop PC!

• CUDA C/C++: http://developer.nvidia.com/cuda-toolkit
• Thrust C++ Template Library: http://developer.nvidia.com/thrust
• CUDA Fortran: http://developer.nvidia.com/cuda-toolkit
• PyCUDA (Python): http://mathema.tician.de/software/pycuda
• GPU.NET: http://tidepowerd.com
• MATLAB: http://www.mathworks.com/discovery/matlab-gpu.html
• Mathematica: http://www.wolfram.com/mathematica/new-in-8/cuda-and-opencl-support/

• Nsight IDE (Eclipse or Visual Studio): www.nvidia.com/nsight

• Programming Guide/Best Practices: docs.nvidia.com

• Questions:
• NVIDIA Developer forums: devtalk.nvidia.com
• Search or ask on: www.stackoverflow.com/tags/cuda

• General: www.nvidia.com/cudazone
