Module 3 Quiz

This document contains four multiple-choice quiz questions about mapping thread and block indices to data indices for vector addition in CUDA. The questions cover cases where each thread calculates: 1) one output element, 2) two adjacent output elements, and 3) two output elements, where each block processes 2*blockDim.x consecutive elements in two sections and every thread handles one element per section. The last question asks how many threads are in the grid when the vector length is 8000, each thread calculates one element, the block size is 1024, and the minimum number of blocks is used. The answer is 8192 threads.

Uploaded by

sy1990010111

Quiz Questions for Module 3

1. If we need to use each thread to calculate one output element of a vector addition, what would
be the expression for mapping the thread/block indices to the data index?
(A) i=threadIdx.x + threadIdx.y;
(B) i=blockIdx.x + threadIdx.x;
(C) i=blockIdx.x*blockDim.x + threadIdx.x;
(D) i=blockIdx.x * threadIdx.x;

Answer: (C)

Explanation: This is the case we covered in Lecture 2.3.
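
As an illustration of option (C), here is a minimal sketch of a one-element-per-thread vector addition. The names vecAddKernel and vecAdd, the block size of 256, and the host wrapper are assumptions made for this example and do not come from the lecture; CUDA error checking is omitted for brevity.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// Each thread computes exactly one output element.
__global__ void vecAddKernel(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // option (C)
    if (i < n) {  // threads past the end of the vector do nothing
        c[i] = a[i] + b[i];
    }
}

// Hypothetical host wrapper: copy inputs to the device, launch, copy back.
void vecAdd(const float* h_a, const float* h_b, float* h_c, int n) {
    assert(n > 0);
    size_t bytes = n * sizeof(float);
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_c, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    const int blockSize = 256;                         // assumed block size
    int numBlocks = (n + blockSize - 1) / blockSize;   // round up to cover n
    vecAddKernel<<<numBlocks, blockSize>>>(d_a, d_b, d_c, n);

    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);  // waits for the kernel
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);
}
```

The bounds check matters because the grid is rounded up to a whole number of blocks, so the last block can contain threads whose index i is at or beyond n.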

2. We want to use each thread to calculate two (adjacent) output elements of a vector addition.
Assume that variable i should be the index for the first element to be processed by a thread.
What would be the expression for mapping the thread/block indices to data index of the first
element?
(A) i=blockIdx.x*blockDim.x + threadIdx.x +2;
(B) i=blockIdx.x*threadIdx.x*2;
(C) i=(blockIdx.x*blockDim.x + threadIdx.x)*2;
(D) i=blockIdx.x*blockDim.x*2 + threadIdx.x;

Answer: (C)

Explanation: Every thread covers two adjacent output elements, so the starting data index is
simply twice the global thread index. Another way to look at it is that all previous blocks cover
(blockIdx.x*blockDim.x)*2 elements. Within the block, each thread covers 2 elements, so the
beginning position for a thread is offset by threadIdx.x*2. Adding the two gives
(blockIdx.x*blockDim.x + threadIdx.x)*2.
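
The mapping in option (C) can be sketched as a kernel in which each thread handles elements i and i+1. The names vecAddTwoAdjacentKernel and vecAddTwoAdjacent, the block size of 256, and the host wrapper are assumptions made for this example; CUDA error checking is omitted for brevity.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// Each thread computes two adjacent output elements: i and i + 1.
__global__ void vecAddTwoAdjacentKernel(const float* a, const float* b,
                                        float* c, int n) {
    int i = (blockIdx.x * blockDim.x + threadIdx.x) * 2;  // option (C)
    if (i < n)     c[i]     = a[i]     + b[i];
    if (i + 1 < n) c[i + 1] = a[i + 1] + b[i + 1];  // separate guard for odd n
}

// Hypothetical host wrapper: copy inputs to the device, launch, copy back.
void vecAddTwoAdjacent(const float* h_a, const float* h_b, float* h_c, int n) {
    assert(n > 0);
    size_t bytes = n * sizeof(float);
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_c, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    const int blockSize = 256;                   // assumed block size
    const int perBlock = 2 * blockSize;          // each block covers 2*blockDim.x elements
    int numBlocks = (n + perBlock - 1) / perBlock;
    vecAddTwoAdjacentKernel<<<numBlocks, blockSize>>>(d_a, d_b, d_c, n);

    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);  // waits for the kernel
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);
}
```

Because each thread covers two elements, the grid needs only half as many threads as in the one-element-per-thread case, which is why the block count divides by 2*blockSize.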

3. We want to use each thread to calculate two output elements of a vector addition. Each thread
block processes 2*blockDim.x consecutive elements that form two sections. All threads in each
block will first process a section, each processing one element. They will then all move to the
next section, again each processing one element. Assume that variable i should be the index for
the first element to be processed by a thread. What would be the expression for mapping the
thread/block indices to data index of the first element?
(A) i=blockIdx.x*blockDim.x + threadIdx.x +2;
(B) i=blockIdx.x*threadIdx.x*2;
(C) i=(blockIdx.x*blockDim.x + threadIdx.x)*2;
(D) i=blockIdx.x*blockDim.x*2 + threadIdx.x;

Answer: (D)

Explanation: All previous blocks cover (blockIdx.x*blockDim.x)*2 elements. The beginning elements
of the threads in a block are consecutive in this case, so just add threadIdx.x to it.

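
The two-section scheme of option (D) can be sketched as a kernel in which a thread's second element lies blockDim.x positions after its first. The names vecAddTwoSectionsKernel and vecAddTwoSections, the block size of 256, and the host wrapper are assumptions made for this example; CUDA error checking is omitted for brevity.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// Each block covers 2*blockDim.x consecutive elements in two sections.
// A thread handles one element in each section.
__global__ void vecAddTwoSectionsKernel(const float* a, const float* b,
                                        float* c, int n) {
    int i = blockIdx.x * blockDim.x * 2 + threadIdx.x;  // option (D), first section
    if (i < n) c[i] = a[i] + b[i];
    int j = i + blockDim.x;                             // same thread, second section
    if (j < n) c[j] = a[j] + b[j];
}

// Hypothetical host wrapper: copy inputs to the device, launch, copy back.
void vecAddTwoSections(const float* h_a, const float* h_b, float* h_c, int n) {
    assert(n > 0);
    size_t bytes = n * sizeof(float);
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_c, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    const int blockSize = 256;                   // assumed block size
    const int perBlock = 2 * blockSize;          // elements covered by one block
    int numBlocks = (n + perBlock - 1) / perBlock;
    vecAddTwoSectionsKernel<<<numBlocks, blockSize>>>(d_a, d_b, d_c, n);

    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);  // waits for the kernel
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);
}
```

In this layout, neighboring threads touch neighboring elements within each section, whereas in the adjacent-pair layout of question 2 neighboring threads start two elements apart.
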
4. For a vector addition, assume that the vector length is 8000, each thread calculates one output
element, and the thread block size is 1024 threads. The programmer configures the kernel
launch to have a minimal number of thread blocks to cover all output elements. How many
threads will be in the grid?
(A) 8000
(B) 8196
(C) 8192
(D) 8200

Answer: (C)

Explanation: ceil(8000/1024)*1024 = 8 * 1024 = 8192. Another way to look at it is that the
smallest multiple of 1024 that covers 8000 is 1024*8 = 8192.
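
The same arithmetic can be written as host-side code using the usual integer round-up idiom. The helper names blocksNeeded and threadsInGrid are hypothetical and chosen only for this sketch; the code is plain host code that also compiles in a .cu file.

```cuda
#include <cassert>

// Minimal number of blocks needed so that every element gets a thread.
// Integer form of ceil(n / blockSize).
int blocksNeeded(int n, int blockSize) {
    assert(n > 0 && blockSize > 0);
    return (n + blockSize - 1) / blockSize;
}

// Total threads launched: always a whole number of blocks.
int threadsInGrid(int n, int blockSize) {
    return blocksNeeded(n, blockSize) * blockSize;
}
```

For the values in the question, blocksNeeded(8000, 1024) is 8 and threadsInGrid(8000, 1024) is 8192, so 192 threads in the last block have no element to process and must be masked off by a bounds check in the kernel.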
