Introduction To GPU Architecture: © 2006 University of Central Florida
welcome
Raster graphics: a data structure representing a generally rectangular grid of pixels, or points of color
Mark Colbert
Basics
Textures: the disposition of the several parts of a body in connection with each other, or the manner in which the constituent parts of an object are united
Basics
Depth of field: the effect in which objects within a certain range of distances in a scene appear in focus, and objects nearer or farther than this range appear out of focus
overview
GPU Architecture
GPU Pipeline
Introduction to Cg
Implications for GPGPU
GPU
Graphics Processing Unit
May be multi-core (Tesla Fermi has up to 512 cores)
Parallelized SIMD Architecture, implemented through pipes
Example: 24 fragment pipes on the nVidia 7800
notation
Vertex
A data structure for a point in a mesh, containing position, normal, texture coordinates and more
memory constructs
Buffered Objects
Uniform Registers/State Table
Interpolated Registers
Temporary Registers
Textures
memory constructs
Buffered Objects
CPU Generated Streams of Data
Limited Modifiability
Example: Vertex Data of a Mesh
memory constructs
Uniform Registers/State Table
Constant Data through the Pipeline
Only Necessarily Constant for 1 Polygon
memory constructs
Interpolated Registers
Per Vertex Data of a Polygon
Stores Information Interpolated Across Polygon
10 General Purpose Interpolated Registers
memory constructs
Temporary Registers
Standard Notion of Registers
Temporary Registers for In-Shader Calculations
memory constructs
Textures
Closest to Random Access Memory Expensive to Access
Multiple Dependent Accesses Extremely Expensive
GPU pipeline
CPU: Program/API -> Driver
Bus
GPU: GPU Front End -> Vertex Processing -> Primitive Assembly -> Rasterization & Interpolation -> Fragment Processing -> Raster Operations -> Framebuffer
GPU Architecture & Cg
GPU Programming Seminar 1 Mark Colbert
GPU pipeline
Program
Your Program
API
Either OpenGL or DirectX Interface
OpenGL: defines a cross-language, cross-platform API for writing applications that produce 2D or 3D computer graphics (about 250 function calls)
DirectX: a collection of APIs that handles tasks related to multimedia (e.g. game programming)
GPU pipeline
Driver
Black-box
Implementations are Company Secrets
GPU pipeline
GPU Front End
Receives commands & data from driver
PCI Express helps at this stage
GPU pipeline
Vertex Processing
Normally performs transformations
Programmable
Diagram labels: Vertex Processing, textures, data for rasterization
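To illustrate the transformation step, here is a minimal sketch in plain Python rather than Cg. It assumes a column-vector convention and a simple viewport mapping with no y-flip; `mat_vec_mul`, `translation`, `transform_vertex`, and `to_screen` are hypothetical helper names, not part of any API.

```python
def mat_vec_mul(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def translation(tx, ty, tz):
    """Build a 4x4 translation matrix (column-vector convention)."""
    return [
        [1.0, 0.0, 0.0, tx],
        [0.0, 1.0, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_vertex(matrix, position):
    """Promote a 3D position to homogeneous coordinates and transform it."""
    x, y, z = position
    return mat_vec_mul(matrix, [x, y, z, 1.0])


def to_screen(clip, width, height):
    """Perspective divide, then map [-1, 1] to pixel coordinates (assumed mapping)."""
    x, y, z, w = clip
    ndc_x, ndc_y = x / w, y / w
    return ((ndc_x + 1.0) * 0.5 * width, (ndc_y + 1.0) * 0.5 * height)


# Move the top vertex of the example triangle by (1, 2, 3).
moved = transform_vertex(translation(1.0, 2.0, 3.0), (0.0, 1.0, 0.0))

# A point at the center of clip space lands at the center of a 640x480 viewport.
center = to_screen([0.0, 0.0, 0.0, 1.0], 640, 480)
```

On the GPU this work runs once per vertex, in parallel, inside the programmable vertex stage.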
GPU pipeline
Primitive Assembly
Compiles Vertices into Points, Lines and/or Polygons
Links elements and sets up the rasterizer
GPU pipeline
Rasterization
For each fragment, determine the respective areas of the triangle (Barycentric Coordinates) or of another primitive
Interpolation
Diagram labels: data for rasterization and data for interpolation (POSITION, PSIZE, FOG, TEXCOORD[0-7], COLOR[0-1]); Interpolator; interpolated data (DEPTH, TEXCOORD[0-7], COLOR[0-1])
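The barycentric step can be sketched as follows. This is an illustrative plain-Python version, not what the hardware runs; `edge`, `barycentric`, and `interpolate` are hypothetical names. The sample triangle and texture coordinates are the ones used in the OpenGL snippet of this deck.

```python
def edge(a, b, p):
    """Twice the signed area of the triangle (a, b, p) in 2D."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def barycentric(a, b, c, p):
    """Return the weights of a, b, c at point p, or None for a degenerate triangle."""
    area = edge(a, b, c)
    if area == 0:
        return None
    # Each weight is the area of the sub-triangle opposite that vertex,
    # divided by the area of the whole triangle.
    return (edge(b, c, p) / area, edge(c, a, p) / area, edge(a, b, p) / area)


def interpolate(weights, va, vb, vc):
    """Blend per-vertex attributes (tuples of equal length) with the weights."""
    wa, wb, wc = weights
    return tuple(wa * x + wb * y + wc * z for x, y, z in zip(va, vb, vc))


# Screen-space positions and texture coordinates of the example triangle.
a, b, c = (0.0, 1.0), (-1.0, -1.0), (1.0, -1.0)
uv_a, uv_b, uv_c = (1.0, 0.0), (0.0, 1.0), (0.0, 0.0)

# A fragment at the midpoint of the bottom edge.
weights = barycentric(a, b, c, (0.0, -1.0))
uv = interpolate(weights, uv_a, uv_b, uv_c)

# A fragment is inside the triangle when no weight is negative.
inside = all(w >= 0 for w in weights)
```

The same weights are reused for every interpolated register, which is why interpolation is cheap once the coordinates are known.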
GPU pipeline
Fragment Processing
Programmable
Diagram labels: rasterized data, interpolated data (DEPTH, TEXCOORD[0-7], COLOR[0-1]); textures; output COLOR[0-3], DEPTH
GPU pipeline
Raster Operations
Depth Checking
Check framebuffer to see if lesser depth already exists (Z-Buffer)
Limited Programmability
Blending
Use alpha channel to combine the fragment color with colors already in the framebuffer
Limited Programmability
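The two raster operations can be sketched together. This is a rough plain-Python model under stated assumptions: a strict less-than depth test and the common "source over destination" blend. Real hardware offers other comparison and blend modes. The names `depth_test`, `blend_over`, and `write_fragment` are hypothetical.

```python
def depth_test(stored_depth, fragment_depth):
    """Pass only if no lesser (nearer) depth is already stored. Assumes a '<' test."""
    return fragment_depth < stored_depth


def blend_over(src_rgba, dst_rgb):
    """Combine the fragment color with the stored color using the fragment's alpha."""
    r, g, b, a = src_rgba
    return tuple(a * s + (1.0 - a) * d for s, d in zip((r, g, b), dst_rgb))


def write_fragment(pixel, color_rgba, depth):
    """Apply the depth test, then blend into the pixel. Returns True if written."""
    if not depth_test(pixel["depth"], depth):
        return False  # a nearer fragment is already in the framebuffer
    pixel["color"] = blend_over(color_rgba, pixel["color"])
    pixel["depth"] = depth
    return True


# One framebuffer pixel: black, at the far plane.
pixel = {"color": (0.0, 0.0, 0.0), "depth": 1.0}

# A half-transparent red fragment in front of it is blended in.
first = write_fragment(pixel, (1.0, 0.0, 0.0, 0.5), 0.5)

# A green fragment behind the stored depth is rejected.
second = write_fragment(pixel, (0.0, 1.0, 0.0, 1.0), 0.75)
```

Both steps are configured through API state rather than written as shader code, which matches the "Limited Programmability" note.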
example
Program/API code snippet, passed to the Driver and across the Bus:
glBegin(GL_TRIANGLES);
    glTexCoord2f(1,0); glVertex3f(0,1,0);
    glTexCoord2f(0,1); glVertex3f(-1,-1,0);
    glTexCoord2f(0,0); glVertex3f(1,-1,0);
glEnd();
The remaining example slides repeat the pipeline diagram (GPU Front End, Vertex Processing, Primitive Assembly, Rasterization & Interpolation, Fragment Processing, Raster Operations, Framebuffer(s)) and label the data at successive stages:
01001001100.
viewing frustum
screen space
framebuffer
MIMD
Branches are supported with a large overhead
Examples: Cg, GLSL, HLSL
Cg
nVidia's Solution
Nearly Identical to HLSL
C Based (Cg stands for "C for Graphics")
New Intrinsic Classes
New Intrinsic Functions
Semantics
Cg
Intrinsic Classes
Vectorized Primitives
e.g. float2, float3, float4
Cg
Intrinsic Classes (contd)
Membership Access
Constructor
e.g. float4 v = float4(a,b,c,d);
Array Operator
e.g. v[0], v[1], v[2], or v[3]
Swizzle Operator
Re-order/Build Vectors
e.g. v.xyz, v.xxxz, v.yyx, v.yx, v.xyzw
Replaceable with rgba instead of xyzw
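The swizzle operator is built into Cg. The following is only a plain-Python emulation of its behavior for illustration, with a hypothetical `swizzle` helper. The sketch does not validate the pattern beyond looking up each letter.

```python
# Map each component letter to an index; rgba aliases xyzw.
_COMPONENTS = {"x": 0, "y": 1, "z": 2, "w": 3,
               "r": 0, "g": 1, "b": 2, "a": 3}


def swizzle(v, pattern):
    """Build a new vector by picking components of v in the order given."""
    return tuple(v[_COMPONENTS[ch]] for ch in pattern)


v = (1.0, 2.0, 3.0, 4.0)

repeated = swizzle(v, "xxxz")   # corresponds to v.xxxz
reordered = swizzle(v, "yx")    # corresponds to v.yx
as_color = swizzle(v, "bgr")    # corresponds to v.bgr
identity = swizzle(v, "xyzw")   # corresponds to v.xyzw
```

Components may be repeated, dropped, or re-ordered, so one swizzle can both shrink and rearrange a vector.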
Cg
Intrinsic Classes (contd)
Matrices
Compounded Vector Classes
e.g. float4x4
Samplers
Texture Data Type
sampler1D, sampler2D, samplerRECT, sampler3D
samplerRECT: same as sampler2D, but uses pixel locations as texture coordinates instead of coordinates normalized to [0,1]
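The difference between the two coordinate conventions is a scale by the texture size. A small plain-Python sketch, with hypothetical helper names and ignoring filtering and wrap modes:

```python
def normalized_to_pixel(u, v, width, height):
    """Convert sampler2D-style [0,1] coordinates to samplerRECT-style pixel locations."""
    return (u * width, v * height)


def pixel_to_normalized(px, py, width, height):
    """Convert samplerRECT-style pixel locations to sampler2D-style [0,1] coordinates."""
    return (px / width, py / height)


# The same lookup position in a 256x128 texture, in both conventions.
pixel_coords = normalized_to_pixel(0.5, 0.25, 256, 128)
normalized_coords = pixel_to_normalized(128.0, 32.0, 256, 128)
```

Pixel-based coordinates are convenient for GPGPU work, where a texture is treated as an array indexed by element position.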
Cg
Intrinsic Functions
Many have direct correspondence to assembly instructions or good approximations
Linear Algebra Functions
dot(a,b): Dot Product
mul: Matrix-Matrix, Vector-Matrix, or Matrix-Vector multiplication
Cg
Intrinsic Functions (contd)
Geometric Intrinsic
distance, faceforward, length, normalize, reflect, refract
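For readers without a Cg toolchain, a few of these intrinsics can be written out in plain Python to show what they compute. These are reference sketches, not the GPU implementations; `reflect` uses the usual formula i - 2 * dot(n, i) * n and assumes `n` is already normalized.

```python
import math


def dot(a, b):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))


def length(v):
    """Euclidean length of a vector."""
    return math.sqrt(dot(v, v))


def normalize(v):
    """Scale a non-zero vector to unit length."""
    n = length(v)
    return tuple(x / n for x in v)


def reflect(i, n):
    """Reflect incident vector i about the unit normal n."""
    d = dot(n, i)
    return tuple(x - 2.0 * d * y for x, y in zip(i, n))


unit = normalize((3.0, 0.0, 4.0))

# A ray travelling down and to the right bounces off a floor facing +y.
bounced = reflect((1.0, -1.0, 0.0), (0.0, 1.0, 0.0))
```

In a shader each of these is a single call, and several map to one or two hardware instructions.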
Cg
Semantics
Binds variables to GPU Memory Constructs
Uniform Registers
In declaration, use keyword uniform in front of variable type
Textures
Same as uniform variable
FX Composer
Program for quick shader design
Uses Cg as underlying shading language
Additional Semantic Bindings
NOTE: Uses DirectX as base, so uses vector-matrix multiplication notation (row vector on the left, e.g. mul(v, M))
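The notation matters when porting matrices between conventions. The following plain-Python sketch, with hypothetical helper names, shows that a row vector times a matrix gives the same result as the transposed matrix times a column vector:

```python
def mul_vec_mat(v, m):
    """Row vector times matrix (vector-matrix notation)."""
    rows, cols = len(m), len(m[0])
    return [sum(v[r] * m[r][c] for r in range(rows)) for c in range(cols)]


def mul_mat_vec(m, v):
    """Matrix times column vector (matrix-vector notation)."""
    return [sum(row[c] * v[c] for c in range(len(v))) for row in m]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


m = [[1, 2],
     [3, 4]]
v = [5, 6]

row_style = mul_vec_mat(v, m)                # v * M
column_style = mul_mat_vec(transpose(m), v)  # transpose(M) * v
```

A matrix built for one convention therefore needs a transpose, or a swapped argument order in `mul`, before it is used in the other.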
FX Composer
Walkthrough Example
GPGPU
General Purpose GPU Processing
Key Notes
Goal is to exploit the fragment processor
Each pixel represents a compact 4-component element of data
Best suited to gathering algorithms
Vertex shader needed to re-order output
Possibly optimal in a Unified Shading Architecture
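The packing and gathering ideas can be sketched in plain Python. This is a CPU stand-in under stated assumptions: data is padded with zeros to a multiple of four, and the "fragment program" is a simple neighbor sum with clamped edges. `pack_rgba` and `gather_sum_neighbors` are hypothetical names.

```python
def pack_rgba(values, fill=0.0):
    """Pack a flat list into 4-component texels, padding the last one with fill."""
    padded = list(values) + [fill] * (-len(values) % 4)
    return [tuple(padded[i:i + 4]) for i in range(0, len(padded), 4)]


def gather_sum_neighbors(texels):
    """Gather: each output texel reads its own input and both neighbors.

    Every output location is fixed in advance and only the read addresses
    vary, which is the access pattern a fragment processor supports.
    """
    last = len(texels) - 1
    out = []
    for i in range(len(texels)):  # on a GPU, each i would be one fragment
        left = texels[max(i - 1, 0)]
        right = texels[min(i + 1, last)]
        out.append(tuple(l + c + r for l, c, r in zip(left, texels[i], right)))
    return out


texels = pack_rgba([1, 2, 3, 4, 5, 6])
summed = gather_sum_neighbors(texels)
```

A scatter, where each element decides where to write, does not fit this pattern directly, which is the reason the notes mention re-ordering output through the vertex shader.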