Gpu Cuda Part1

This document provides an introduction to GPUs and CUDA programming. It discusses the evolution of GPU microarchitectures from early graphics accelerators to modern GPUs with programmable hardware. GPUs have thousands of cores devoted to highly parallel computation and follow a single-program multiple-data processing model. The document outlines GPU terminology and components like streaming multiprocessors and execution cores. It also compares CPU and GPU architectures and provides examples of NVIDIA GPU microarchitectures from Fermi to Volta over several generations.

Uploaded by

Raghav Ganesh

IT301: INTRODUCTION TO CUDA
By,
Ms. Thanmayee
Adhoc Faculty,
Department of IT,
NITK, Surathkal
OUTLINE
● Introduction to GPU
● Evolution of GPU microarchitectures
● General Purpose GPU
● Introduction to CUDA
● CUDA Execution Model
● CUDA Memory Model
● Steps in GPU Execution
● Hello World Program
● CUDA Device Variables
● CUDA Programming examples
Let's learn about GPUs

● A little history:
− The first GPUs were designed as graphics accelerators
● supported only specific fixed-function pipelines.
− In the late 1990s, the hardware became increasingly programmable
● Culminating in NVIDIA's first GPU in 1999.
Graphics Processing Unit

● Has thousands of cores and ALUs.

● Can handle billions of repetitive low-level tasks.

● Is specialized for compute-intensive, highly parallel computation.

● Devotes more of its hardware to data processing than to data caching and
flow control.

● Follows the SPMD (single-program, multiple-data) processing model.
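
The SPMD model in the last bullet can be illustrated without a GPU. The sketch below is plain Python, not CUDA: a hypothetical `saxpy_program` function plays the role of the single program, and a sequential loop hands it each thread index in turn, whereas a GPU would run those instances in parallel. The names `saxpy_program` and `run_spmd` are assumptions made for illustration only.

```python
def saxpy_program(thread_id, a, x, y, out):
    """The single program: every thread runs this same code on its own element."""
    if thread_id < len(x):  # guard threads that fall past the end of the data
        out[thread_id] = a * x[thread_id] + y[thread_id]


def run_spmd(program, num_threads, *args):
    """Emulate SPMD sequentially; a GPU would execute these instances in parallel."""
    for thread_id in range(num_threads):
        program(thread_id, *args)


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)

# Launch more threads than elements to show the bounds guard at work.
run_spmd(saxpy_program, 8, 2.0, x, y, out)
print(out)  # [12.0, 24.0, 36.0, 48.0]
```

Each "thread" differs only in its index, which selects the data element it works on; the code it executes is identical.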

CPU versus GPU
A closer look at the GPU:
GPU Terms:
● Stream processing -- the processing of a stream of instructions
operating in a data-parallel fashion.
● Stream processors (SPs) -- the execution cores that execute the
stream. Each stream processor has compute resources such as a register
file and an instruction scheduler.
● Streaming multiprocessors (SMs) -- groups of stream processors
that share control logic and cache.
GPU Microarchitectures
Fermi Architecture (2010)

https://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf
What each SM has

● Memory operations are handled by a set of 16 load-store (LD/ST) units in
each SM.
● A set of four Special Function Units (SFUs) is also available to
handle transcendental and other special operations such as sin,
cos, exp, and rcp (reciprocal).
● Counting the group of 16 load-store units and the four SFUs,
there are four execution blocks per SM:
○ 16 + 16 cores (2 blocks)
○ LD/ST
○ SFU
A total of 32 instructions from one or two warps can be dispatched in
each cycle to any two of the four execution blocks within a Fermi SM.
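
The unit counts above lend themselves to a quick back-of-the-envelope calculation. The sketch below is a deliberately simplified model, not a description of the real pipeline: it assumes a warp of 32 threads (the CUDA warp size) and that each unit accepts one thread per cycle, ignoring pipelining and clock domains. The names `FERMI_SM_BLOCKS` and `cycles_per_warp` are hypothetical.

```python
import math

WARP_SIZE = 32  # threads per warp in CUDA

# Units in each of the four execution blocks of a Fermi SM, as listed above.
FERMI_SM_BLOCKS = {
    "cores_0": 16,
    "cores_1": 16,
    "ld_st": 16,
    "sfu": 4,
}


def cycles_per_warp(units, warp_size=WARP_SIZE):
    """Cycles for one block to take one instruction for a whole warp,
    assuming each unit handles one thread per cycle (a simplification)."""
    return math.ceil(warp_size / units)


total_cores = FERMI_SM_BLOCKS["cores_0"] + FERMI_SM_BLOCKS["cores_1"]
print("cores per SM:", total_cores)
for name, units in FERMI_SM_BLOCKS.items():
    print(name, "->", cycles_per_warp(units), "cycles per warp instruction")
```

Under these assumptions, a full warp occupies a 16-wide block for 2 cycles and the 4-wide SFU block for 8 cycles, which is one way to see why special-function operations are comparatively expensive.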
Kepler (2012)
Fermi versus Kepler
Maxwell (2014)
Pascal (2016)
Fermi: GTX 480, GTX 580
Kepler: GTX 680, GTX 780, GTX 780 Ti
Maxwell: GTX 980, GTX 980 Ti
Pascal: GTX 1080

Source: https://www.techspot.com/article/1191-nvidia-geforce-six-generations-tested/
Volta microarchitecture (2017)
NVIDIA Tesla GPUs
THANK YOU
