GPU Programming EE 4702-1 Final Examination: Exam Total

GPU Programming, EE 4702, Louisiana State University

Name

GPU Programming
EE 4702-1
Final Examination
Tuesday, 4 December 2018 12:30–14:30 CST

Problem 1 (15 pts)

Problem 2 (30 pts)

Problem 3 (15 pts)

Problem 4 (40 pts)

Alias

Exam Total (100 pts)

Good Luck!
Problem 1: [15 pts] Appearing below is a geometric figure.
[Figure: geometric figure drawn in the plane z = 3. Axes: x and y; x-axis ticks at 2, 4, 6.]

(a) Complete the individual-triangle rendering pass below so that it renders the figure with all triangles
facing in the positive z direction and without overlapping triangles. Use the provided abbreviation glV.
Note: The part about facing the +z direction was not in the original exam.

Complete code to render shape using individual triangles.

#define glV glVertex3f

glBegin(GL_TRIANGLES);

glEnd();

(b) Complete the triangle-strip rendering pass below so that it renders the figure.

Complete code to render shape using a triangle strip.

#define glV glVertex3f

glBegin(GL_TRIANGLE_STRIP);

glEnd();
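
The relation between the two rendering passes in parts (a) and (b) can be illustrated without reference to the figure. The following C++ sketch expands a triangle-strip vertex sequence into the equivalent individual triangles, swapping the first two vertices of every odd-numbered triangle so that all triangles keep the winding of the first one, which is how OpenGL interprets GL_TRIANGLE_STRIP. The Vtx type and the strip_to_triangles helper are hypothetical names used for illustration only; they are not exam or course code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vtx { float x, y, z; };   // Hypothetical vertex type.

// Expand a triangle strip into individual triangles (3 vertices each).
// Triangle i of a strip uses strip vertices i, i+1, i+2. For odd i the
// first two are swapped so that every triangle has the winding of triangle 0.
std::vector<Vtx> strip_to_triangles(const std::vector<Vtx>& strip)
{
  std::vector<Vtx> tris;
  if ( strip.size() < 3 ) return tris;
  for ( std::size_t i = 0; i + 2 < strip.size(); i++ )
    {
      const bool odd = i & 1;
      tris.push_back( strip[ odd ? i+1 : i ] );
      tris.push_back( strip[ odd ? i : i+1 ] );
      tris.push_back( strip[ i+2 ] );
    }
  // A strip of k vertices always yields k-2 triangles.
  assert( tris.size() == 3 * ( strip.size() - 2 ) );
  return tris;
}
```

A strip of five vertices expands to three triangles, or nine vertices, which is the reason a strip sends fewer vertices than individual triangles for the same shape.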

Problem 2: [30 pts] Appearing below is shortened host and shader code based on Homework 3, in which
text was drawn on the triangular spiral. One drawback of this code is that it uses old-fashioned, deprecated,
inefficient glVertex calls. On the following pages are routines that will implement a more efficient version
of this code in which data such as ctr are placed in buffer objects and a rendering pass is performed for the
entire chain, not just one triangular spiral.

for ( int i=2; i<chain_length; i++ ) { // Host Code

  pCoor p0 = balls[i-2].position, p1 = balls[i-1].position, p2 = balls[i].position;
  pCoor ctr = (p0+p1+p2)/3; // Compute location of triangle center.
  // --- CODE REMOVED FOR BREVITY ---
  pCoor pprev = ctr;
  float delta_a = 0.6 / opt_n_segs;

  if ( opt_shader == SO_HW03 ) {
    glBegin(GL_TRIANGLE_STRIP); // Render spiral using 1 triangle strip.

    for ( int j=0; j<opt_n_segs; j++ ) {

      pVect v = va[j%3]; // Vector from center to fold.
      pCoor p = ctr + j * delta_a * v; // Coordinate of fold.
      pNorm n = cross( p - pprev, vz ); // Normal of segment.
      float tex_x = total_len_compute(j,delta_a,distv) * tex_scale;

      glNormal3fv(n);
      glTexCoord2f(tex_x,0); glVertex3fv(p + vz);
      glTexCoord2f(tex_x,1); glVertex3fv(p - vz);
      pprev = p;
    }
    glEnd();
  }
}

void vs_main_hw03() { // Vertex Shader Code
  gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
  vertex_e = gl_ModelViewMatrix * gl_Vertex;
  normal_e = normalize(gl_NormalMatrix * gl_Normal);
  tex_coord = gl_MultiTexCoord0.xy;
}

(a) But first let c denote the value of chain_length and n denote the value of opt_n_segs. Determine
the amount of data, in bytes, sent from the CPU to the GPU for the code shown above. (Do not consider
uniforms and other hidden code.)

Amount of data sent from CPU to GPU in terms of c and n, in bytes.
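
One way to organize the count for part (a) is to tally the floats passed to each immediate-mode call in one inner-loop iteration and multiply by the trip counts of the two loops. The sketch below assumes 4-byte floats, that each glNormal3fv, glTexCoord2f, and glVertex3fv call transfers exactly its float arguments with no per-call overhead, and that nothing else is sent. These are simplifying assumptions, and bytes_sent is a hypothetical helper, not part of the exam.

```cpp
#include <cassert>
#include <cstddef>

// Bytes passed to the glNormal/glTexCoord/glVertex calls in the host loop,
// for chain_length = c and opt_n_segs = n.
// Assumption: only the float arguments of those calls are counted.
std::size_t bytes_sent(int c, int n)
{
  assert( c >= 0 && n >= 0 );
  // The outer loop runs for i = 2 .. c-1, that is, c-2 times.
  const std::size_t spirals = c > 2 ? c - 2 : 0;
  // Per inner iteration: one normal (3 floats), two texture
  // coordinates (2 floats each), two vertices (3 floats each).
  const std::size_t floats_per_seg = 3 + 2*2 + 2*3;
  return spirals * n * floats_per_seg * sizeof(float);
}
```

Under these assumptions each inner-loop iteration accounts for 13 floats, so the total grows with the product of (c - 2) and n.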

(b) In the vertex shader above, label variables with the appropriate letter as requested below.

Label uniform variables with a U, shader input variables with an I, and fixed-function shader
outputs with an O.

Problem 2, continued: Appearing below is the improved triangular spiral host code and shaders. The
host code prepares four buffer objects and then starts an instanced rendering pass with line strips as the
input primitive. The rendering pass uses a vertex shader and a geometry shader to complete the primitives.
(The fragment shader is not a part of this problem.) The only inputs to the vertex shader are the vertex
and instance IDs, which should be used to retrieve or compute information about the spiral segments.

// Host Code -- DO NOT MAKE OR ASSUME MODIFICATIONS TO THIS CODE.

for ( int i=2; i<chain_length; i++ ) {
  // Put ball structures and coordinates into convenient variables.
  pCoor p0 = balls[i-2].position, p1 = balls[i-1].position, p2 = balls[i].position;
  pCoor ctr = (p0+p1+p2)/3; ctr_a.push_back(ctr);

  // -- CODE PREPARING va_a, vz_a, and distv_a NOT SHOWN --

}
glUniform1i(5, opt_n_segs);
TO_BUFFER_OBJECT(ctr_a,1); TO_BUFFER_OBJECT(va_a,2);
TO_BUFFER_OBJECT(distv_a,3); TO_BUFFER_OBJECT(vz_a,4);
glBindBuffer(GL_ARRAY_BUFFER,0);

glDrawArraysInstanced(GL_LINE_STRIP, 0, opt_n_segs, ctr_a.size() );
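
To make the structure of this rendering pass concrete, the following C++ sketch emulates on the CPU how an instanced GL_LINE_STRIP draw enumerates its work: the vertex shader runs once per (instance, vertex) pair, and each consecutive pair of vertices within one instance forms a line that reaches the geometry shader as In[0] and In[1]. The VSOut and LineInput types and the enumerate_lines function are hypothetical stand-ins, not course code.

```cpp
#include <cassert>
#include <vector>

struct VSOut { int ins_id, vtx_id; };   // Mirrors the Data interface block.
struct LineInput { VSOut in[2]; };      // What one geometry shader call sees.

// Enumerate the line primitives produced by a call equivalent to
// glDrawArraysInstanced(GL_LINE_STRIP, 0, n_vertices, n_instances).
std::vector<LineInput> enumerate_lines(int n_vertices, int n_instances)
{
  assert( n_vertices >= 0 && n_instances >= 0 );
  std::vector<LineInput> lines;
  for ( int ins = 0; ins < n_instances; ins++ )
    for ( int v = 1; v < n_vertices; v++ )
      // A strip of n vertices yields n-1 lines; lines never span instances.
      lines.push_back( LineInput{ { VSOut{ins, v-1}, VSOut{ins, v} } } );
  return lines;
}
```

With opt_n_segs vertices per instance, the geometry shader would run opt_n_segs - 1 times for each spiral, and each invocation has access to both the current and the previous vertex of its segment.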

(c) Modify the vertex and geometry shaders on the following pages to efficiently render the triangle spirals.
Make any needed changes to the interface blocks, but do not modify or assume modifications to the host
code (above). For your convenience each shader contains a copy of the host code. Cross out or modify that
code as needed. Use the handy abbreviations at the top of the page.

Update the interface blocks as needed.

#ifdef _VERTEX_SHADER_
out Data { // Out of Vertex Shader to Geometry Shader
  int ins_id, vtx_id;
  // Add any needed declarations here.

};
#endif
#ifdef _GEOMETRY_SHADER_
in Data { // In to Geometry Shader from Vertex Shader
  int ins_id, vtx_id;
  // Add any needed declarations here.

} In[2];

// Out of Geometry Shader to Fragment Shader -- No changes needed

out Data { vec3 normal_e; vec4 vertex_e; vec2 tex_coord; };
#endif

Problem 2, continued: Vertex shader code on this page.

Cross out unneeded code. Cross out data type for shader outputs. Avoid redundant computation.

Update the interface blocks as needed.

const mat4 mv = gl_ModelViewMatrix, mvp = gl_ModelViewProjectionMatrix;

const mat3 nm = gl_NormalMatrix;

void vs_main_lines() { // Vertex Shader Main Routine

  int ins_id = gl_InstanceID, vtx_id = gl_VertexID;

  int j = gl_VertexID;
  float delta_a = 0.6 / opt_n_segs;
  vec4 distv = distv_a[ins_id];
  vec3 vz = vz_a[ins_id].xyz;
  float tex_scale = 0.2;

  vec3 v = va_a[ins_id][j%3].xyz;
  vec3 p = ctr_a[ins_id].xyz + j * delta_a * v;
  vec3 pprev = vec3(0,0,0); // PLACEHOLDER. Won’t work.
  vec3 n = cross( p - pprev, vz );
  float tex_x = total_len_compute(j,delta_a,distv) * tex_scale;

Problem 2, continued: Geometry shader code on this page.

Cross out unneeded code. Cross out data type for shader outputs. Avoid redundant computation.

Update the interface blocks as needed.

layout ( lines ) in;

layout ( triangle_strip, max_vertices = 4 ) out;

const mat4 mv = gl_ModelViewMatrix, mvp = gl_ModelViewProjectionMatrix;

const mat3 nm = gl_NormalMatrix;

void gs_main_lines() { // Geometry Shader Main Routine

  int ins_id = gl_InstanceID, vtx_id = gl_VertexID; // PLACEHOLDER. Won’t work.

  int j = 0; // PLACEHOLDER. Won’t work.

  float delta_a = 0.6 / opt_n_segs;
  vec4 distv = distv_a[ins_id];
  vec3 vz = vz_a[ins_id].xyz;
  float tex_scale = 0.2;

Problem 3: [15 pts] Answer each CUDA question below.
(a) Both CUDA kernels below do the same thing, but one will execute much less efficiently. Explain why in
terms of the minimum request size.
__global__ void kmain_simple(float4 *d_in, float *d_out) {
  const int tid = threadIdx.x + blockIdx.x * blockDim.x;
  const int elt_per_thread = ( d_app.array_size + d_app.num_threads - 1 ) / d_app.num_threads;
  const int start = elt_per_thread * tid;
  const int stop = start + elt_per_thread;

  for ( int h=start; h<stop; h++ ) {

    float4 p = d_in[h];
    float sos = p.x * p.x + p.y * p.y + p.z * p.z + p.w * p.w;
    d_out[h] = sos;
  }
}

__global__ void kmain_efficient(float4 *d_in, float *d_out) {

  const int tid = threadIdx.x + blockIdx.x * blockDim.x;
  for ( int h=tid; h<d_app.array_size; h += d_app.num_threads ) {
    float4 p = d_in[h];
    float sos = p.x * p.x + p.y * p.y + p.z * p.z + p.w * p.w;
    d_out[h] = sos;
  }
}

Problem with inefficient kernel due to request size.

What is the maximum request size that will avoid this inefficiency? Explain.
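
The difference between the two access patterns can be explored with a small CPU-side model. The C++ sketch below counts how many aligned memory sectors the 32 threads of one warp touch in a single loop iteration, given the distance in elements between the addresses used by consecutive threads. In kmain_efficient that distance is 1, and in kmain_simple it is elt_per_thread. The 32-byte sector size used in the usage note, the assumption that the array starts on a sector boundary, and the sectors_touched function itself are illustrative assumptions and do not describe a specific GPU.

```cpp
#include <cassert>
#include <set>

// Number of distinct aligned sectors touched when each of the 32 threads of
// a warp reads one element of elt_bytes bytes and consecutive threads are
// stride_elts elements apart. Assumes the array starts at address 0.
int sectors_touched(int stride_elts, int elt_bytes, int sector_bytes)
{
  assert( stride_elts > 0 && elt_bytes > 0 && sector_bytes > 0 );
  const int warp_size = 32;
  std::set<long> sectors;
  for ( int lane = 0; lane < warp_size; lane++ )
    {
      const long first = long(lane) * stride_elts * elt_bytes;
      const long last = first + elt_bytes - 1;
      for ( long s = first / sector_bytes; s <= last / sector_bytes; s++ )
        sectors.insert(s);
    }
  return int(sectors.size());
}
```

For 16-byte float4 elements and an assumed 32-byte sector, sectors_touched(1, 16, 32) gives 16 sectors, all fully used, while sectors_touched(4, 16, 32) gives 32 sectors of which only half of each is used. Varying sector_bytes shows how the amount of unused data fetched depends on the request size.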

(b) A CUDA kernel is to run on a GPU with 8 SMs (MPs). Configuration A consists of 4 blocks of 32 threads
each. Configuration B consists of 8 blocks of 16 threads each. Neither is very good. Explain how each one
underutilizes the hardware on current NVIDIA GPUs.

Configuration A underutilizes hardware by . . .

Configuration B underutilizes hardware by . . .
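
As an aid to reasoning about the two launch configurations, the following C++ sketch computes two simple utilization figures: how many SMs can receive a block, and how many of the 32 lanes of a warp hold a thread. It assumes a warp size of 32 and that blocks are spread over idle SMs before any SM receives a second block. The Utilization struct and the utilization function are hypothetical and model nothing beyond those two figures.

```cpp
#include <cassert>

struct Utilization {
  int sms_with_work;    // SMs that receive at least one block.
  int warps_per_block;  // Warps needed to hold one block.
  int active_lanes;     // Threads in the last (possibly partial) warp.
};

// Assumptions: warp size 32; one block per SM until all SMs are busy.
Utilization utilization(int n_sms, int n_blocks, int block_size)
{
  assert( n_sms > 0 && n_blocks > 0 && block_size > 0 );
  const int warp_size = 32;
  Utilization u;
  u.sms_with_work = n_blocks < n_sms ? n_blocks : n_sms;
  u.warps_per_block = ( block_size + warp_size - 1 ) / warp_size;
  const int rem = block_size % warp_size;
  u.active_lanes = rem ? rem : warp_size;
  return u;
}
```

Calling utilization(8, 4, 32) reports work on 4 of the 8 SMs with full warps, and utilization(8, 8, 16) reports work on all 8 SMs but with only 16 of 32 lanes occupied in each warp. In both cases each busy SM holds a single warp.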

Problem 4: [40 pts] Answer each question below.
(a) The screenshot below is the Homework 3 triangular spiral with 37 segments per spiral. Imagine a spiral
with even more segments, say 1000.

With a large number of segments there will usually be a large computational load on both the vertex and
fragment shaders. In one of these shaders the computational load can be considered wasted, depending on
the eye location, even when the spiral is visible. In which shader is computation wasted, and why.

Which shader wastes computation? Why?

Sketch two views in which the spiral is visible. In one the computational load is high and mostly wasted. In
the other the computational load is lower and not wasted.

View with lower load and little waste. View with high load and waste.

(b) The OpenGL call glColor is used to specify a color, for example purple with glColor3f(1,0,1). In typical
use, is that the color that will be written to the frame buffer? Explain.

Are the arguments to glColor the exact color to be written to the frame buffer? Explain.

The flat qualifier’s feature, the noperspective qualifier’s feature, the smooth qualifier’s feature.
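
The three interpolation qualifiers can be compared numerically. The C++ sketch below computes the value a fragment would receive for one scalar attribute under each qualifier, given the attribute at the three triangle vertices, the fragment's screen-space barycentric weights b, and the clip-space w of each vertex. The Attr3 type and the interp_ functions are hypothetical names; the flat case assumes OpenGL's default last-vertex provoking convention.

```cpp
#include <cassert>

struct Attr3 { float a[3]; };   // Attribute value at each triangle vertex.

// flat: no interpolation, the provoking vertex's value is used
// (assumed here to be the last vertex of the triangle).
float interp_flat(const Attr3& v) { return v.a[2]; }

// noperspective: linear interpolation in screen space.
float interp_noperspective(const Attr3& v, const float b[3])
{ return b[0]*v.a[0] + b[1]*v.a[1] + b[2]*v.a[2]; }

// smooth: perspective-correct interpolation, weights divided by clip w.
float interp_smooth(const Attr3& v, const float b[3], const float w[3])
{
  assert( w[0] > 0 && w[1] > 0 && w[2] > 0 );
  float num = 0, den = 0;
  for ( int i = 0; i < 3; i++ ) { num += b[i]*v.a[i]/w[i]; den += b[i]/w[i]; }
  return num / den;
}
```

When all three w values are equal, the smooth and noperspective results coincide. They differ once the vertices are at different depths, which is the case in which the choice of qualifier becomes visible.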

(d) Consider a rendering pass using triangles as the primitive. OpenGL (compatibility profile) allows one to
specify a normal for each triangle vertex, but as we all know a triangle, geometrically, has just one normal.
Why would one specify different normals for each vertex? Explain how such normals are chosen.

Why might one choose different normals for each vertex of a triangle?

Give an example and describe how those normals are chosen.

(e) Describe what the inputs to the rasterization stage are, what the rasterization stage does, and what its
outputs are.

Rasterization stage input, rasterization stage job, rasterization stage output:
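
As a concrete picture of the stage in question, the following C++ sketch is a minimal software rasterizer: it takes one triangle in window coordinates and returns the pixels whose centers the triangle covers, using edge functions. It is a simplified illustration under stated assumptions (counterclockwise triangles only, pixels exactly on an edge are included, no attribute interpolation or fill rule), and the names P2, Fragment, edge, and rasterize are hypothetical.

```cpp
#include <cassert>
#include <vector>

struct P2 { float x, y; };       // Vertex position in window coordinates.
struct Fragment { int x, y; };   // Covered pixel (attributes omitted).

// Twice the signed area of (a,b,c); positive when counterclockwise.
float edge(P2 a, P2 b, P2 c)
{ return (b.x-a.x)*(c.y-a.y) - (b.y-a.y)*(c.x-a.x); }

// Return one fragment for each pixel center covered by triangle v0 v1 v2.
std::vector<Fragment> rasterize(P2 v0, P2 v1, P2 v2, int width, int height)
{
  assert( width >= 0 && height >= 0 );
  std::vector<Fragment> frags;
  if ( edge(v0,v1,v2) <= 0 ) return frags;  // Skip clockwise or degenerate.
  for ( int y = 0; y < height; y++ )
    for ( int x = 0; x < width; x++ )
      {
        const P2 c = { x + 0.5f, y + 0.5f };  // Pixel center.
        if ( edge(v0,v1,c) >= 0 && edge(v1,v2,c) >= 0 && edge(v2,v0,c) >= 0 )
          frags.push_back( Fragment{x,y} );
      }
  return frags;
}
```

A hardware rasterizer additionally interpolates the vertex shader outputs for each fragment and applies a fill rule so that pixels on an edge shared by two triangles are produced only once.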

(f) The true-sphere shader used in class rendered spheres perfectly. Would it make sense to use a similar
approach to write a true-cube shader that can perfectly render a cube?

Does a true cube shader make sense? Explain.

(g) The unlabeled diagram below shows how shadow volumes can be used to render shadows. On the diagram
show the location of the eye and light source, fragment(s) found to be in the shade, and fragment(s) found
to be illuminated.

Show: eye, light source, shaded fragment(s), illuminated fragment(s).
