Lab1 PGPU

This document introduces CUDA programming and provides guidance on measuring performance of CUDA applications. It describes running the DeviceQuery application to identify GPU properties. It also outlines creating a CUDA project in Visual Studio and modifying a demo application to perform vector addition with each element calculated by a separate GPU thread. Finally, it discusses two approaches to measuring execution time: using CPU timers or the CUDA Event API, and introduces the CUDA Visual Profiler tool.
Master CI, GPU Programming

Lab 1

GPU Programming – Introduction to CUDA

Hello CUDA World!

1. Run the DeviceQuery application using the NVIDIA GPU Computing SDK Browser and identify the properties of the CUDA devices installed on the lab workstations:
CUDA Device
# of Multiprocessors
# of Cores per MP
Total # of cores
Global Memory (MB)
Warp size
# of Threads per block
Minimum # of threads processed in SIMD fashion by a CUDA multiprocessor
Maximum grid dimensions
Maximum block dimensions

2. Create a CUDA project in Visual Studio.

a. Study the structure of the demo program and identify the portion of code that executes on the GPU and the number of GPU threads that execute the parallel code.
b. Modify the demo application so that the number of elements in the vectors being added can be varied and each element of the result vector is computed by a separate GPU thread. Try different values for the number of elements: 1000, 100000, 1000000, 10000000, … (make sure you supply a feasible execution configuration!)
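
Checking whether an execution configuration is feasible comes down to simple host-side arithmetic. The sketch below is plain C++ (it also compiles as CUDA host code) and assumes a one-dimensional launch with one thread per element. The names `blocksNeeded` and `isFeasible` are hypothetical, and the limits passed in are placeholders: take the actual values from the DeviceQuery output for your device.

```cpp
#include <cstdint>

// Number of blocks needed so that blocks * threadsPerBlock >= n.
// The division rounds up so the last, partially filled block is still launched.
std::uint64_t blocksNeeded(std::uint64_t n, std::uint32_t threadsPerBlock) {
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// Returns true when a 1D launch covering n elements fits within the limits
// reported for the device (maximum threads per block, maximum grid x dimension).
bool isFeasible(std::uint64_t n, std::uint32_t threadsPerBlock,
                std::uint32_t maxThreadsPerBlock, std::uint64_t maxGridDimX) {
    if (n == 0 || threadsPerBlock == 0 || threadsPerBlock > maxThreadsPerBlock) {
        return false;
    }
    return blocksNeeded(n, threadsPerBlock) <= maxGridDimX;
}
```

With 256 threads per block, 1000 elements need 4 blocks (1024 threads in total), so the kernel has to skip indices that are greater than or equal to the element count. Under an assumed grid limit of 65535 blocks, 10000000 elements need 39063 blocks and fit, while 100000000 elements would need 390625 blocks and would not.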

Follow the CUDA tutorials available at:

https://developer.nvidia.com/how-to-cuda-c-cpp
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html


Performance analysis of a CUDA application

1. Measuring execution time

Variant 1 – Using a CPU timer

cudaMemcpy(…);

t1 = myCPUTimer();
myKernel<<<……>>>(…);
cudaDeviceSynchronize();
t2 = myCPUTimer();

cudaMemcpy(…);

Note: A CUDA kernel launch is asynchronous! Control returns to the CPU immediately after the launch (very likely before the kernel has finished executing on the GPU). CPU-GPU synchronization is therefore mandatory before the second timestamp is taken.
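
The handout does not define `myCPUTimer()`. One possible implementation, given here only as a sketch, uses the monotonic `std::chrono::steady_clock` from the C++ standard library and returns the current time in milliseconds. The choice of clock and of milliseconds as the unit are assumptions, not something the lab prescribes.

```cpp
#include <chrono>

// Hypothetical implementation of myCPUTimer(): returns the current time in
// milliseconds, read from a monotonic clock so that the difference between
// two readings is not affected by system clock adjustments.
double myCPUTimer() {
    using clock = std::chrono::steady_clock;
    std::chrono::duration<double, std::milli> now = clock::now().time_since_epoch();
    return now.count();
}
```

The elapsed time is then `t2 - t1`, which is meaningful only if `t2` is read after `cudaDeviceSynchronize()` has returned.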

Variant 2 – Using the Event API

CUDA Event API Management Functions:

cudaEventCreate
cudaEventCreateWithFlags
cudaEventDestroy
cudaEventElapsedTime
cudaEventQuery
cudaEventRecord
cudaEventSynchronize

cudaEvent_t start, stop;

// Generate events
cudaEventCreate(&start);
cudaEventCreate(&stop);

// Trigger event 'start' on the default stream (0)
cudaEventRecord(start, 0);

/* CUDA Host / Device / Kernel Code ... */

// Trigger event 'stop'
cudaEventRecord(stop, 0);

// Block the host until 'stop' has actually been recorded
cudaEventSynchronize(stop);


float elapsedTime = 0.0f; // Elapsed time, filled in below

// Compute the time between 'start' and 'stop'. cudaEventElapsedTime
// returns the value in milliseconds, with a resolution of around 0.5 microseconds.
cudaEventElapsedTime(&elapsedTime, start, stop);

printf("Execution Time: %f ms\n", elapsedTime); // Print elapsed time

// Destroy CUDA Event API Events
cudaEventDestroy(start);
cudaEventDestroy(stop);

2. CUDA Visual Profiler

https://developer.nvidia.com/nvidia-visual-profiler

CUDA occupancy calculator:

http://developer.download.nvidia.com/compute/cuda/CUDA_Occupancy_calculator.xls