GPU Optimisation
Your assignment is to complete our investigation into GPU speedup, occupancy, and memory
bandwidth, and to answer the question:
“Since there is an overhead in GPU processing, when does it make sense to use the GPU instead of the
CPU?”
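One way to frame this question is as a break-even calculation. The sketch below is only a starting point, not a required method: it assumes a simple linear cost model in which the GPU pays a fixed overhead per offload plus a per-element transfer and compute cost, while the CPU pays only a per-element compute cost. The struct, its field names, and the function names are hypothetical, and every value should come from your own measurements.

```c
#include <assert.h>

/* Hypothetical linear cost model. All fields are assumptions to be
   filled in from your own timings (e.g. from bandwidthTest and your
   template results); none of them are given by the assignment. */
typedef struct {
    double overhead_s;       /* fixed cost per offload (launch, sync, setup) */
    double pcie_bytes_per_s; /* measured host <-> device copy bandwidth */
    double gpu_elems_per_s;  /* measured kernel throughput, elements/second */
    double cpu_elems_per_s;  /* measured CPU throughput, elements/second */
    double bytes_per_elem;   /* bytes copied per element, both directions */
} cost_model;

/* Estimated CPU time for n elements. */
double cpu_time(const cost_model *m, double n)
{
    assert(m->cpu_elems_per_s > 0.0);
    return n / m->cpu_elems_per_s;
}

/* Estimated GPU time for n elements: fixed overhead + copies + kernel. */
double gpu_time(const cost_model *m, double n)
{
    assert(m->pcie_bytes_per_s > 0.0 && m->gpu_elems_per_s > 0.0);
    return m->overhead_s
         + n * m->bytes_per_elem / m->pcie_bytes_per_s
         + n / m->gpu_elems_per_s;
}

/* Problem size at which gpu_time == cpu_time under this model.
   Returns -1.0 if the GPU never catches up, i.e. its per-element cost
   (copy + compute) is not lower than the CPU's per-element cost. */
double break_even_elems(const cost_model *m)
{
    assert(m->cpu_elems_per_s > 0.0);
    assert(m->pcie_bytes_per_s > 0.0 && m->gpu_elems_per_s > 0.0);
    double gain_per_elem = 1.0 / m->cpu_elems_per_s
                         - m->bytes_per_elem / m->pcie_bytes_per_s
                         - 1.0 / m->gpu_elems_per_s;
    if (gain_per_elem <= 0.0)
        return -1.0;
    return m->overhead_s / gain_per_elem;
}
```

Under this model the GPU is worthwhile only for problem sizes above the value returned by break_even_elems, and never when the copy cost per element alone exceeds the CPU's compute cost per element. Real behaviour is rarely perfectly linear, so treat the result as a hypothesis to check against your measurements.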
Refer to the templates we have developed in the lab (A, B, and C), the NVIDIA webinars on CUDA
optimisation, and the other example programs in the NVIDIA GPU Computing SDK (e.g.
simpleZeroCopy and bandwidthTest).
Develop a Template D which incorporates your understanding of the CUDA C Runtime API and the
material covered in the lectures. Write a short report on your findings, and provide comparative results
(speed, occupancy, bandwidth), using your Template C results as the baseline.
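For the comparative results, the three quantities can be derived from raw measurements roughly as in the sketch below. The function names are hypothetical, and the definitions are the commonly used ones (speedup as a ratio of run times, effective bandwidth as bytes moved per second, occupancy as the fraction of resident warps); check them against the definitions used in the lectures. The maximum number of warps per multiprocessor is passed in as a parameter because it depends on the compute capability (48 for compute capability 2.0, 24 for compute capability 1.0).

```c
#include <assert.h>

/* Speedup of a candidate (e.g. Template D) over a baseline
   (e.g. Template C), both given as wall-clock times in seconds. */
double speedup(double baseline_s, double candidate_s)
{
    assert(candidate_s > 0.0);
    return baseline_s / candidate_s;
}

/* Effective bandwidth in GB/s (1 GB = 1e9 bytes): total bytes the
   kernel reads plus writes, divided by the kernel time in seconds. */
double effective_bandwidth_gbs(double bytes_read, double bytes_written,
                               double seconds)
{
    assert(seconds > 0.0);
    return (bytes_read + bytes_written) / seconds / 1e9;
}

/* Occupancy: resident warps per multiprocessor as a fraction of the
   hardware maximum for the target compute capability. */
double occupancy(int active_warps_per_sm, int max_warps_per_sm)
{
    assert(max_warps_per_sm > 0);
    assert(active_warps_per_sm >= 0);
    assert(active_warps_per_sm <= max_warps_per_sm);
    return (double)active_warps_per_sm / (double)max_warps_per_sm;
}
```

For example, a kernel that keeps 24 warps resident on a multiprocessor that supports 48 has an occupancy of 0.5. Reporting all three figures side by side for Templates C and D makes it easier to see whether a change in speed comes from better occupancy, better memory throughput, or both.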
1. Target the Fermi architecture (your GTX 470), but state in your report which features used in your
Template D are not available at CUDA compute capability 1.0.
Submit your template and report to the course website no later than first thing Monday 11th October –
please note that there are NO extensions on this deadline.