
Tutorial No 3

This document contains a tutorial on parallel processing and CUDA. It includes: 1. A definition of GPGPU and what it means for a GPU to have general purpose capabilities. 2. An explanation of why CUDA is considered heterogeneous computing, with different processors handling low and high latency code serially and parallel respectively. 3. Definitions of key CUDA terms like device, kernel, grid of thread blocks, and warp. 4. An overview of CUDA's parallel programming model with one kernel executing at a time across a grid of thread blocks and threads in a warp executing simultaneously. 5. A listing and brief explanation of CUDA's different memory types including registers, local memory, shared memory

Uploaded by

mmed68003

King Saud University

College of Computer and Information Sciences


Department of Computer Science
CSC453 – Parallel Processing – Tutorial No 3 – Fall 2021

Questions
1. What does GPGPU stand for, and what does it mean?
General-Purpose GPU (general-purpose computing on graphics processing units): a GPU that can perform computations usually handled by the CPU, not only graphics rendering.

2. Why is CUDA described as heterogeneous computing?
Processing is handled by two different processors:
the low-latency code is executed serially by the CPU (the host);
the high-latency code is executed in parallel by the GPU (the device), which hides the latency by running many threads at once.
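
The split between host and device can be sketched with a small vector addition. This is a minimal illustration, not part of the original tutorial: the names `vec_add` and `vec_add_on_gpu`, the array size and the block size of 256 are all assumptions chosen for the example. The serial setup and the final check run on the CPU, while the additions run in parallel on the GPU.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Device code: each GPU thread adds one pair of elements, in parallel.
__global__ void vec_add(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];  // guard: the last block may be partly idle
}

// Host helper (hypothetical name): copy inputs to the device, launch the
// kernel, copy the result back. Returns 0 on success, 1 on any CUDA error.
int vec_add_on_gpu(const float *a, const float *b, float *c, int n) {
    float *d_a = nullptr, *d_b = nullptr, *d_c = nullptr;
    size_t bytes = n * sizeof(float);
    cudaError_t err = cudaMalloc((void **)&d_a, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&d_b, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&d_c, bytes);
    if (err == cudaSuccess) err = cudaMemcpy(d_a, a, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(d_b, b, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        int threads = 256;                         // assumed block size
        int blocks = (n + threads - 1) / threads;  // round up to cover n
        vec_add<<<blocks, threads>>>(d_a, d_b, d_c, n);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess) err = cudaMemcpy(c, d_c, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_a);  // cudaFree on a null pointer is a no-op
    cudaFree(d_b);
    cudaFree(d_c);
    return err == cudaSuccess ? 0 : 1;
}

int main() {
    const int n = 1 << 10;
    float a[n], b[n], c[n];
    // Serial part on the CPU: prepare the input.
    for (int i = 0; i < n; ++i) { a[i] = (float)i; b[i] = 2.0f * i; }
    // Parallel part on the GPU.
    if (vec_add_on_gpu(a, b, c, n) != 0) {
        fprintf(stderr, "CUDA error\n");
        return 1;
    }
    // Serial part on the CPU again: verify the result.
    for (int i = 0; i < n; ++i) {
        if (c[i] != 3.0f * i) { printf("mismatch at %d\n", i); return 1; }
    }
    printf("all %d sums correct\n", n);
    return 0;
}
```

The program should compile with `nvcc` on a machine with a CUDA-capable GPU.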
3. Give the definition of the following terms:
a. Device: the GPU and its memory.
b. Kernel: a function that runs on the device. One kernel is executed at a time, and many threads execute each kernel.
c. Grid of thread blocks: a kernel is executed by a grid of thread blocks. Each grid is a collection of blocks, and each block is a collection of threads.
d. Warp: a group of 32 threads of the same block.
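
The relationship between grid, block, thread and warp can be made concrete with a kernel in which every thread records where it sits in the hierarchy. This is a sketch under assumptions: the `ThreadInfo` struct, the names `record_ids` and `collect_thread_info`, and the launch of 2 blocks of 64 threads are hypothetical choices for illustration. The built-in variables `blockIdx`, `blockDim`, `threadIdx` and `warpSize` are provided by CUDA.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical record of one thread's position in the hierarchy.
struct ThreadInfo { int block; int warp; int lane; };

// Kernel: runs on the device; every thread of the grid executes this function.
__global__ void record_ids(ThreadInfo *info) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;  // unique index in the grid
    info[tid].block = blockIdx.x;                     // which block of the grid
    info[tid].warp  = threadIdx.x / warpSize;         // which warp of that block
    info[tid].lane  = threadIdx.x % warpSize;         // position inside the warp
}

// Host helper (hypothetical name): launch the grid and fetch the records.
// Returns 0 on success, 1 on any CUDA error.
int collect_thread_info(ThreadInfo *out, int blocks, int threads_per_block) {
    int n = blocks * threads_per_block;
    ThreadInfo *d_info = nullptr;
    cudaError_t err = cudaMalloc((void **)&d_info, n * sizeof(ThreadInfo));
    if (err == cudaSuccess) {
        record_ids<<<blocks, threads_per_block>>>(d_info);  // grid of thread blocks
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)
        err = cudaMemcpy(out, d_info, n * sizeof(ThreadInfo), cudaMemcpyDeviceToHost);
    cudaFree(d_info);
    return err == cudaSuccess ? 0 : 1;
}

int main() {
    const int blocks = 2, threads_per_block = 64;  // assumed sizes
    ThreadInfo info[blocks * threads_per_block];
    if (collect_thread_info(info, blocks, threads_per_block) != 0) {
        fprintf(stderr, "CUDA error\n");
        return 1;
    }
    // Print a few threads around the warp and block boundaries.
    const int samples[] = {0, 31, 32, 63, 64, 127};
    for (int s : samples) {
        printf("thread %3d: block %d, warp %d, lane %2d\n",
               s, info[s].block, info[s].warp, info[s].lane);
    }
    return 0;
}
```

With a warp size of 32, each block of 64 threads is split into two warps, so global thread 32 is lane 0 of the second warp of block 0.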

4. Explain the parallel programming model of CUDA.

One kernel is executed at a time. The kernel is executed by a grid of thread blocks, and the threads of the same warp execute the same instruction at the same time.
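
One way to see the threads of a warp working together on the same instruction is a warp-level sum. The sketch below is an illustration that goes slightly beyond the tutorial: it uses the CUDA intrinsic `__shfl_down_sync`, which assumes CUDA 9 or later, and the names `warp_sum` and `run_warp_sum` are hypothetical. A single warp of 32 threads is launched, each thread contributes its own index, and the threads exchange register values step by step until lane 0 holds the total.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Launched with exactly one warp (32 threads). All lanes execute the same
// shuffle instruction together; each step halves the number of partial sums.
__global__ void warp_sum(int *out) {
    int value = threadIdx.x;  // lane i contributes the value i
    for (int offset = warpSize / 2; offset > 0; offset /= 2) {
        // Read `value` from the lane `offset` positions higher and add it.
        value += __shfl_down_sync(0xffffffffu, value, offset);
    }
    if (threadIdx.x == 0) *out = value;  // lane 0 now holds 0 + 1 + ... + 31
}

// Host helper (hypothetical name). Returns 0 on success, 1 on any CUDA error.
int run_warp_sum(int *result) {
    int *d_out = nullptr;
    cudaError_t err = cudaMalloc((void **)&d_out, sizeof(int));
    if (err == cudaSuccess) {
        warp_sum<<<1, 32>>>(d_out);  // one block containing one warp
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)
        err = cudaMemcpy(result, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err == cudaSuccess ? 0 : 1;
}

int main() {
    int total = 0;
    if (run_warp_sum(&total) != 0) {
        fprintf(stderr, "CUDA error\n");
        return 1;
    }
    printf("warp total = %d\n", total);  // prints "warp total = 496"
    return 0;
}
```

No shared memory or explicit barrier is needed here because the exchange happens between registers of threads in the same warp.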
5. Enumerate and explain the different types of memory adopted by CUDA.

1- Registers: per thread, 32-bit, on chip
2- Local memory: per thread, relatively large, in DRAM
3- Shared memory: per block, 16 KB, on chip
4- Global memory: per grid, not cached, in DRAM
5- Constant memory: per grid, cached, in DRAM
6- Texture memory: per grid, cached, in DRAM

Registers and shared memory are on chip; the rest are in DRAM. The 16 KB figure and the uncached global memory describe early CUDA GPUs; later architectures offer more shared memory per block and cache global memory accesses in L1/L2.

6. Explain why constant memory is cached while global memory is not.

Constant memory is read-only, so caching it adds no overhead: there is no cache coherency problem.
Global memory is read/write, so caching it raises a cache coherency problem, and the overhead of maintaining coherency would be very high with thousands of threads running.

In short: cached == read-only.
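
Several of the memory spaces from question 5 can be seen side by side in one small kernel. This is a sketch under assumptions: the names `scale`, `scale_and_reverse` and `scale_and_reverse_on_gpu` and the block size of 64 are hypothetical, and the input length is assumed to be a multiple of the block size. Local memory is managed by the compiler and texture memory needs a separate API, so neither is shown.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define BLOCK 64  // assumed block size; n must be a multiple of it

// Constant memory: read-only inside kernels, cached, written by the host.
__constant__ float scale;

// `in` and `out` point to global memory (DRAM, visible to the whole grid).
__global__ void scale_and_reverse(const float *in, float *out) {
    __shared__ float tile[BLOCK];        // shared memory: one copy per block
    int t = threadIdx.x;                 // t and base normally live in registers
    int base = blockIdx.x * blockDim.x;
    tile[t] = in[base + t] * scale;      // global read, constant read, shared write
    __syncthreads();                     // wait until the whole tile is written
    out[base + t] = tile[blockDim.x - 1 - t];  // reversed within the block
}

// Host helper (hypothetical name). Returns 0 on success, 1 on a CUDA error,
// 2 if n is not a positive multiple of BLOCK.
int scale_and_reverse_on_gpu(const float *in, float *out, int n, float s) {
    if (n <= 0 || n % BLOCK != 0) return 2;
    float *d_in = nullptr, *d_out = nullptr;
    size_t bytes = n * sizeof(float);
    cudaError_t err = cudaMemcpyToSymbol(scale, &s, sizeof(float));
    if (err == cudaSuccess) err = cudaMalloc((void **)&d_in, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&d_out, bytes);
    if (err == cudaSuccess) err = cudaMemcpy(d_in, in, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        scale_and_reverse<<<n / BLOCK, BLOCK>>>(d_in, d_out);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess) err = cudaMemcpy(out, d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return err == cudaSuccess ? 0 : 1;
}

int main() {
    const int n = 2 * BLOCK;
    float in[n], out[n];
    for (int i = 0; i < n; ++i) in[i] = (float)i;
    if (scale_and_reverse_on_gpu(in, out, n, 2.0f) != 0) {
        fprintf(stderr, "CUDA error\n");
        return 1;
    }
    // Each block of 64 values comes back doubled and in reverse order.
    printf("out[0] = %.0f, out[63] = %.0f, out[64] = %.0f\n",
           out[0], out[63], out[64]);
    return 0;
}
```

Every thread reads the same `scale` value, which is the access pattern the constant cache is designed for, while `tile` lets the threads of one block exchange data without a round trip to DRAM.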

