Sanju HPC 9,10

The document outlines practical exercises for analyzing CUDA code using Nvidia's nvprof profiler and demonstrating multithreading with Pthreads. It explains the purpose and features of nvprof, including performance analysis and optimization of CUDA applications, along with a sample CUDA program. Additionally, it covers the basics of Pthreads, including key functions and a simple multithreading example in C.

FACULTY OF ENGINEERING & TECHNOLOGY

HIGH PERFORMANCE COMPUTING (303105356)


BTECH 3RD YEAR 6TH SEMESTER
ENROLLMENT NUMBER: 2203031240285

PRACTICAL-09
AIM:
Analyse the code using NVIDIA profilers.

WHAT IS A PROFILER:
A profiler is a performance analysis tool that helps developers understand how their program
executes. It provides insights into execution time, memory usage, thread activity, and other
critical metrics. Profilers are essential for optimizing performance in high-performance
computing applications.

INTRODUCTION TO NVPROF:
nvprof is a command-line profiler provided by NVIDIA for CUDA applications. It allows
developers to collect and analyze performance data for programs running on NVIDIA GPUs.
nvprof supports performance tuning of CUDA code by reporting kernel execution times,
memory transfers between host and device, and utilization metrics.

FEATURES OF NVPROF:
• Provides detailed kernel execution timing and memory usage statistics.
• Allows analysis of memory transfer between host and device.
• Supports collecting metrics such as occupancy, compute efficiency, and memory
throughput.
• Compatible with CUDA-enabled devices and applications.
• Generates reports for further analysis in the NVIDIA Visual Profiler (nvvp). nvprof is
deprecated in favour of the newer Nsight Systems and Nsight Compute tools.

NVPROF STEPS ON CUDA PROGRAM:


Code:
%%writefile p1.cu
#include<stdio.h>
#include<cuda.h>
#include<cuda_runtime_api.h>

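// Kernel: every GPU thread prints its own thread and block index.
__global__ void kf()
{
    printf("Hello World from Thread %d of Block %d\n", threadIdx.x, blockIdx.x);
}

int main()
{
    printf("Hello world from cpu!\n");
    kf<<<3, 2>>>();            // launch 3 blocks of 2 threads each
    cudaDeviceSynchronize();   // wait for the kernel so its output is flushed
    return 0;
}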

Compilation:
!nvcc -o x p1.cu

Execution:
!./x

Profiling with NVPROF:


!nvprof ./x

CONCLUSION:

Using nvprof, the performance and behavior of CUDA programs can be analyzed
effectively. Profiling helps identify bottlenecks and optimize execution time, memory
transfers, and resource utilization, leading to more efficient CUDA applications.

PRACTICAL-10

AIM: Demonstration of OpenMP and pthread functions.

WHAT IS PTHREAD:
Pthreads, or POSIX threads, is a standardized C API for multithreading. It supports parallel
programming by letting a single process create multiple threads that run concurrently and
share the same address space.

FUNCTIONS OF PTHREAD LIBRARY:


1. pthread_create: Creates a new thread that starts executing a given function.
2. pthread_exit: Terminates the calling thread.
3. pthread_join: Waits for a specific thread to terminate.
4. pthread_self: Returns the thread ID of the calling thread.
5. pthread_equal: Compares two thread IDs. It returns a non-zero value if they refer to
the same thread and zero otherwise.
6. pthread_detach: Marks a thread as detached. A detached thread cannot be joined; its
resources are released automatically when it terminates.

SIMPLE PTHREAD PROGRAM:

%%writefile pthread.c
#include <pthread.h>
#include <stdio.h>
#define NUM_THREADS 5

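/* Thread entry point: prints the ID that main passed in through the void * argument. */
void *PrintHello(void *threadid)
{
    long tid = (long)threadid;   /* long matches pointer width on common 64-bit platforms */
    printf("Hello World! It's me, thread #%ld!\n", tid);
    pthread_exit(NULL);
}

int main(int argc, char *argv[])
{
    pthread_t threads[NUM_THREADS];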

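    int rc;
    long t;   /* long so the value survives the round trip through void * */
    for (t = 0; t < NUM_THREADS; t++) {
        printf("In main: creating thread %ld\n", t);
        rc = pthread_create(&threads[t], NULL, PrintHello, (void *)t);
        if (rc) {
            printf("ERROR; return code from pthread_create() is %d\n", rc);
        }
    }
    /* Exit only the main thread so the workers can finish printing. */
    pthread_exit(NULL);
}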

COMPILATION AND EXECUTION STEPS:

1. Compile the program:

!gcc -o pthread pthread.c -lpthread

2. Execute the program:

!./pthread
