Uploaded by

Sayan Banerjee
Copyright
© Attribution Non-Commercial (BY-NC)
Introduction to GPU Architecture

© 2006 University of Central Florida

Welcome
Raster graphics: a data structure representing a generally rectangular grid of pixels, or points of color, viewable on a monitor or other display medium.

Aliasing: an effect that causes different signals to become indistinguishable when sampled.

Mesh: a collection of vertices, edges, and faces that defines the shape of an object in 3D.

GPU Architecture & Cg


GPU Programming Seminar 1

Mark Colbert

Basics
Texture: in graphics, an image (an array of texels) stored in GPU memory and mapped onto the surface of an object. In the general sense, the arrangement of the constituent parts of an object and the manner in which they are united.

PCI Express: Peripheral Component Interconnect Express, a high-speed serial bus that links the motherboard to expansion cards and motherboard-mounted peripherals. An expansion card, such as a graphics card, is a printed circuit board inserted into a slot on the motherboard to add functionality to the computer system.

Basics
Depth of field: the effect in which objects within a certain range of distances in a scene appear in focus, while objects nearer or farther than this range appear out of focus.

Vector processor: a central processing unit that implements an instruction set whose instructions operate on one-dimensional arrays of data called vectors.

Pipe: a connector that passes data from one unit to another, generally implemented with data buffers.

overview
GPU Architecture
GPU Pipeline
Introduction to Cg
Implications for GPGPU

GPU
Graphics Processing Unit
May be multi-core (NVIDIA's Fermi architecture, used in Tesla cards, has up to 512 cores)
Parallelized SIMD architecture, implemented through pipes
  Example: 24 fragment pipes on the NVIDIA 7800
  Each pipe handles 4 vector operations
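A minimal sketch of the SIMD idea, in Python rather than on real hardware. It assumes that one pipe operation can be modeled as a component-wise multiply-add on a 4-component vector; the names madd4 and run_pipes are hypothetical and not part of any GPU API.

```python
# Hypothetical model of SIMD execution: one instruction, many data elements.

def madd4(a, b, c):
    """Component-wise multiply-add on 4-component vectors: a * b + c."""
    return tuple(x * y + z for x, y, z in zip(a, b, c))


def run_pipes(fragments, scale, bias):
    """Apply the same multiply-add to every fragment.

    On a GPU the fragments would be spread across the fragment pipes and
    processed in parallel; here a list comprehension stands in for that.
    """
    return [madd4(f, scale, bias) for f in fragments]


fragments = [(0.0, 0.5, 1.0, 1.0), (1.0, 1.0, 0.0, 1.0)]
shaded = run_pipes(fragments, (0.5, 0.5, 0.5, 1.0), (0.25, 0.25, 0.25, 0.0))
```

Every fragment goes through the identical instruction; only the data differs, which is what lets the hardware replicate pipes cheaply.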

rules of the game
Not a generalized vector processor
Cannot read and write to the same areas of memory
Limited output capability
  Currently, very expensive to output to arbitrary locations in memory

notation
Vertex
  A data structure for a point in a mesh, containing position, normal, texture coordinates and more
Fragment
  A pixel, possibly a sub-pixel, of a rasterized image
Shaders
  Small programs that run on the GPU at specific stages of the GPU pipeline

memory constructs
Buffered Objects
Uniform Registers/State Table
Interpolated Registers
Temporary Registers
Textures

memory constructs
Buffered Objects
  CPU-generated streams of data
  Limited modifiability
  Example: vertex data of a mesh

memory constructs
Uniform Registers/State Table
  Constant data through the pipeline
    Only necessarily constant for 1 polygon
  32 general-purpose registers
  State-table-specific registers
    Projection/model-view matrices
    Lights and more

memory constructs
Interpolated Registers
  Per-vertex data of a polygon
  Store information interpolated across the polygon
  10 general-purpose interpolated registers

memory constructs
Temporary Registers
  Standard notion of registers
  Temporary registers for in-shader calculations

memory constructs
Textures
  Closest to random access memory
  Expensive to access
    Multiple dependent accesses are extremely expensive

Rasterization: the process of converting a vertex representation to a pixel representation.

GPU pipeline
CPU: Program/API -> Driver
Bus
GPU: GPU Front End -> Vertex Processing -> Primitive Assembly -> Rasterization & Interpolation -> Fragment Processing -> Raster Operations -> Framebuffer

GPU pipeline
Program/API
  Program
    Your program
  API
    Either the OpenGL or the DirectX interface
    OpenGL: a cross-language, cross-platform API for writing applications that produce 2D or 3D computer graphics (about 250 function calls)
    DirectX: a collection of APIs that handle tasks related to multimedia (e.g. game programming)

GPU pipeline
Driver
  Black box
    Implementations are company secrets
  Largest bottleneck in many GPU programs

GPU pipeline
GPU Front End
  Receives commands and data from the driver
  PCI Express helps at this stage

GPU pipeline
Vertex Processing
  Normally performs transformations
  Programmable

[Diagram: the vertex processor (shader) and its data]
  Input (vertex): POSITION, NORMAL, BINORMAL*, TANGENT*, TEXCOORD[0-7], COLOR[0-1], PSIZE
  Input: textures
  Output (data for rasterization): POSITION, PSIZE
  Output (data for interpolation): FOG, TEXCOORD[0-7], COLOR[0-1]
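The transformation a vertex program normally performs can be sketched as a 4x4 matrix applied to a homogeneous position. This is an illustrative Python model, not Cg; the helper names (mat_vec, translate, vertex_shader) are hypothetical, and a plain translation stands in for a full model-view-projection matrix.

```python
# Hypothetical sketch of what a vertex program does to POSITION.

def mat_vec(m, v):
    """Multiply a 4x4 matrix (tuple of rows) by a 4-component column vector."""
    return tuple(sum(m[r][c] * v[c] for c in range(4)) for r in range(4))


def translate(tx, ty, tz):
    """Build a 4x4 translation matrix (stand-in for a model-view matrix)."""
    return (
        (1.0, 0.0, 0.0, tx),
        (0.0, 1.0, 0.0, ty),
        (0.0, 0.0, 1.0, tz),
        (0.0, 0.0, 0.0, 1.0),
    )


def vertex_shader(position, transform):
    """Transform one vertex; a real shader would also pass along TEXCOORD, COLOR, etc."""
    return mat_vec(transform, position)


# Push a triangle five units away from the viewer along -z.
triangle = [(0.0, 1.0, 0.0, 1.0), (-1.0, -1.0, 0.0, 1.0), (1.0, -1.0, 0.0, 1.0)]
moved = [vertex_shader(p, translate(0.0, 0.0, -5.0)) for p in triangle]
```

The matrix is the same for every vertex of the mesh, which is why it would live in a uniform register while the positions arrive as per-vertex data.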

GPU pipeline
Primitive Assembly
  Compiles vertices into points, lines and/or polygons
  Links elements and sets up the rasterizer

GPU pipeline
Rasterization & Interpolation
  Rasterization
    For each fragment, determine its respective area of the triangle (barycentric coordinates) or other primitive
  Interpolation

[Diagram: rasterizer and interpolator]
  Rasterizer: takes the primitive type from the primitive assembler and the data for rasterization (POSITION, PSIZE); produces rasterized data and barycentric coordinates
  Interpolator: takes the barycentric coordinates and the data for interpolation (FOG, TEXCOORD[0-7], COLOR[0-1]); produces interpolated data (DEPTH, TEXCOORD[0-7], COLOR[0-1])
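Barycentric interpolation can be sketched in a few lines. The Python below is an assumption-laden illustration, not the hardware algorithm: it works on 2D screen-space points, ignores perspective correction, and uses hypothetical helper names (edge, barycentric, interpolate).

```python
# Hypothetical sketch: barycentric weights of a point inside a 2D triangle,
# then interpolation of a per-vertex attribute with those weights.

def edge(a, b, p):
    """Twice the signed area of triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def barycentric(a, b, c, p):
    """Weights (w_a, w_b, w_c) of point p; each weight is the relative area
    of the sub-triangle opposite the corresponding vertex."""
    area = edge(a, b, c)
    return (edge(b, c, p) / area, edge(c, a, p) / area, edge(a, b, p) / area)


def interpolate(weights, values):
    """Blend per-vertex attribute tuples using the barycentric weights."""
    return tuple(
        sum(w * v[i] for w, v in zip(weights, values))
        for i in range(len(values[0]))
    )


# Triangle positions and texture coordinates chosen for illustration.
a, b, c = (0.0, 1.0), (-1.0, -1.0), (1.0, -1.0)
texcoords = [(1.0, 0.0), (0.0, 1.0), (0.0, 0.0)]
weights = barycentric(a, b, c, (0.0, 0.0))
uv = interpolate(weights, texcoords)
```

The weights always sum to 1 for a point in the triangle's plane, and a negative weight means the point lies outside the triangle, which is how a rasterizer can reject it.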

GPU pipeline
Fragment Processing
  Programmable

[Diagram: the fragment processor (shader) and its data]
  Input: rasterized data
  Input (interpolated data): DEPTH, TEXCOORD[0-7], COLOR[0-1]
  Input: textures
  Output (data for raster ops): COLOR[0-3], DEPTH

GPU pipeline
Raster Operations
  Depth Checking
    Check the framebuffer to see whether a lesser depth already exists (Z-buffer)
    Limited programmability
  Blending
    Use the alpha channel to combine with colors already in the framebuffer
    Limited programmability
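The two raster operations can be sketched together. This Python model makes two assumptions that the slide does not state: the depth test passes only when the new fragment is strictly nearer, and blending uses the common "source alpha, one minus source alpha" rule. The function name raster_op and the buffer layout are hypothetical.

```python
# Hypothetical sketch of raster operations on one fragment.

def raster_op(framebuffer, depthbuffer, x, y, color, depth):
    """Depth-test a fragment, then alpha-blend it into the framebuffer.

    framebuffer[y][x] holds an (r, g, b) tuple; depthbuffer[y][x] a float.
    Returns True if the fragment was written.
    """
    if depth >= depthbuffer[y][x]:
        return False  # an equal or lesser depth already exists: discard
    r, g, b, a = color
    dr, dg, db = framebuffer[y][x]
    # Blend the incoming color with the color already stored.
    framebuffer[y][x] = (
        a * r + (1.0 - a) * dr,
        a * g + (1.0 - a) * dg,
        a * b + (1.0 - a) * db,
    )
    depthbuffer[y][x] = depth
    return True


framebuffer = [[(0.0, 0.0, 0.0)]]   # a single black pixel
depthbuffer = [[1.0]]               # initialized to the far plane
wrote_near = raster_op(framebuffer, depthbuffer, 0, 0, (1.0, 0.0, 0.0, 0.5), 0.5)
wrote_far = raster_op(framebuffer, depthbuffer, 0, 0, (0.0, 1.0, 0.0, 1.0), 0.75)
```

The second fragment is farther away than the first, so it fails the depth test and leaves both buffers untouched.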

example
Code snippet (Program/API):

  glBegin(GL_TRIANGLES);
  glTexCoord2f(1,0); glVertex3f(0,1,0);
  glTexCoord2f(0,1); glVertex3f(-1,-1,0);
  glTexCoord2f(0,0); glVertex3f(1,-1,0);
  glEnd();

[Figures: six slides repeat the pipeline diagram (Program/API, Driver, Bus, GPU Front End, Vertex Processing, Primitive Assembly, Rasterization & Interpolation, Fragment Processing, Raster Operations, Framebuffer(s)) while this triangle moves through it. The slide captions, in order, are: code snippet, 01001001100., viewing frustum, screen space, framebuffer, framebuffer.]

quick architecture notes
Limits in shader size
  Pixel Shader 3.0 spec
    Vertex program: 65535 asm instructions
    Fragment program: 65535+ asm instructions
MIMD
  Branches are supported, with a large overhead
Rasterizer & Interpolator
  Programmable in DirectX 10
    Geometry shaders
Unified Shading Architecture
  Xbox 360 (ATI)
  Pool of processors with load balancing

higher level shading languages
Vectorized languages for designing shader programs
Easy way out of tedious assembly coding
Not perfect
  Results are sometimes clearly not optimized
Examples
  Cg
  GLSL
  HLSL

Cg
NVIDIA's solution
Nearly identical to HLSL
C based (with some C++-style features)
  New intrinsic classes
  New intrinsic functions
  Semantics

Cg
Intrinsic Classes
  Vectorized primitives
    e.g. float2, float3, float4
  16-bit floating-point constructs
    half, half2, half3, half4
    Not enabled in ARB shaders
  Fixed-precision decimals
    fixed, fixed2, fixed3, fixed4
    Not enabled in ARB shaders

Cg
Intrinsic Classes (cont'd)
  Member access
    Constructor
      e.g. float4 v = float4(a,b,c,d);
    Array operator
      e.g. v[0], v[1], v[2], or v[3]
    Swizzle operator
      Re-orders/builds vectors
      e.g. v.xyz, v.xxxz, v.yyx, v.yx, v.xyzw
      rgba can be used in place of xyzw
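To make the array and swizzle operators concrete, here is a small Python emulation. The class Float4 is a hypothetical stand-in written for this example, not Cg and not any real library; unlike a shader compiler, it does not reject a swizzle that mixes xyzw and rgba names.

```python
# Hypothetical Python emulation of Cg-style member access on a float4.

class Float4:
    """Stand-in for a 4-component vector with indexing and swizzles."""

    _names = {"x": 0, "y": 1, "z": 2, "w": 3, "r": 0, "g": 1, "b": 2, "a": 3}

    def __init__(self, x, y, z, w):
        self._v = (x, y, z, w)

    def __getitem__(self, i):
        """Array operator: v[0] .. v[3]."""
        return self._v[i]

    def __getattr__(self, swizzle):
        """Swizzle operator: any sequence of component names, e.g. v.xxxz."""
        try:
            indices = [self._names[ch] for ch in swizzle]
        except KeyError:
            raise AttributeError(swizzle)
        picked = tuple(self._v[i] for i in indices)
        # A single name yields a scalar, several names yield a new vector.
        return picked[0] if len(picked) == 1 else picked


v = Float4(1.0, 2.0, 3.0, 4.0)
reordered = v.yx      # (2.0, 1.0)
replicated = v.xxxz   # (1.0, 1.0, 1.0, 3.0)
```

A swizzle can reorder, drop, or repeat components, so it both re-orders existing vectors and builds new ones of a different size.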

Cg
Intrinsic Classes (cont'd)
  Matrices
    Compounded vector classes
      e.g. float4x4
    Constructed with multiple vectors
      float4 v = float4(a,b,c,d);
      float4x4 m = float4x4(v,v,v,v);
  Samplers
    Texture data type
      sampler1D, sampler2D, samplerRECT, sampler3D
    samplerRECT
      Same as sampler2D, but uses pixel locations as texture coordinates instead of coordinates in [0,1]
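The difference between the two coordinate conventions amounts to a scale by the texture size. The Python helpers below are hypothetical and only illustrate that relationship; they ignore texel-center offsets, wrapping, and filtering.

```python
# Hypothetical helpers relating sampler2D-style and samplerRECT-style coordinates.

def normalized_to_rect(u, v, width, height):
    """Map coordinates in [0,1] to pixel locations for a width x height texture."""
    return (u * width, v * height)


def rect_to_normalized(px, py, width, height):
    """Map pixel locations back to coordinates in [0,1]."""
    return (px / width, py / height)


pixel = normalized_to_rect(0.5, 0.25, 256, 128)
back = rect_to_normalized(pixel[0], pixel[1], 256, 128)
```

Pixel-location addressing is convenient when a texture holds a data array, because element (i, j) can be addressed directly without dividing by the texture size.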

Cg
Intrinsic Functions
  Many correspond directly to assembly instructions, or to good approximations
  Linear algebra functions
    dot(a,b): dot product
    mul: matrix-matrix, vector-matrix, or matrix-vector multiplication
  Texture lookup functions
    tex*(sampler* texture, float* texCoord)
    *: the dimensionality of the texture
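The argument order of mul decides whether the vector is treated as a row or a column. The Python sketch below models that with plain tuples; the function names are hypothetical, chosen only to keep the two cases apart.

```python
# Hypothetical Python model of dot and of the two vector forms of mul.

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mul_matrix_vector(m, v):
    """Matrix-vector product: v is a column vector, result[i] = dot(row i, v)."""
    return tuple(dot(row, v) for row in m)


def mul_vector_matrix(v, m):
    """Vector-matrix product: v is a row vector, result[j] = dot(v, column j)."""
    return tuple(dot(v, col) for col in zip(*m))


def transpose(m):
    """Swap rows and columns of a matrix stored as a tuple of rows."""
    return tuple(zip(*m))


m = ((1, 2), (3, 4))
v = (5, 6)
as_column = mul_matrix_vector(m, v)   # (17, 39)
as_row = mul_vector_matrix(v, m)      # (23, 34)
```

The two forms give different results unless the matrix is transposed, so a shader has to use the order that matches how its matrices were built.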

Cg
Intrinsic Functions (cont'd)
  Geometric intrinsics
    distance, faceforward, length, normalize, reflect, refract
  A good chunk of math.h
    Most are Taylor series expansions with two coefficients
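Several of the geometric intrinsics are short formulas. The Python versions below are reference sketches of the usual definitions, written for illustration; they are not the GPU implementations, and reflect assumes that the normal n is already normalized.

```python
import math

# Reference sketches of a few geometric intrinsics (usual textbook definitions).

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def length(v):
    """Euclidean length of a vector."""
    return math.sqrt(dot(v, v))


def distance(a, b):
    """Euclidean distance between two points."""
    return length(tuple(x - y for x, y in zip(a, b)))


def normalize(v):
    """Scale a non-zero vector to unit length."""
    n = length(v)
    return tuple(x / n for x in v)


def reflect(i, n):
    """Reflect incident vector i about unit normal n: i - 2 * dot(n, i) * n."""
    d = dot(n, i)
    return tuple(a - 2.0 * d * b for a, b in zip(i, n))


# A ray heading down and to the right bounces off a floor whose normal is +y.
bounced = reflect((1.0, -1.0, 0.0), (0.0, 1.0, 0.0))
```

Only the component of the incident vector along the normal changes sign; the component parallel to the surface is preserved.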

Cg
Semantics
  Bind variables to GPU memory constructs
  Uniform registers
    In the declaration, use the keyword uniform in front of the variable type
  Vertex data/interpolated registers
    float* varName : SEMANTIC
    Only used as a main function parameter or global variable
  Textures
    Same as a uniform variable

FX Composer
Program for quick shader design
Uses Cg as its underlying shading language
Additional semantic bindings
NOTE: Uses DirectX as its base, so it uses vector-matrix multiplication notation

FX Composer

Walkthrough Example


GPGPU
General-Purpose GPU Processing
Key notes
  Goal is to exploit the fragment processor
  Each pixel represents a compacted 4-component element of data
  Best suited to gather algorithms
    A vertex shader is needed to re-order output
    Possibly optimal in a unified shading architecture
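The gather pattern, and the scatter pattern it contrasts with, can be sketched on the CPU. The Python below is illustrative only: the function names are hypothetical, a 3-tap average stands in for a fragment program, and a histogram stands in for a scatter workload.

```python
# Hypothetical CPU sketch of gather versus scatter access patterns.

def gather_blur(src):
    """Gather: each output element reads several inputs and writes only itself.

    This matches a fragment program: the output location is fixed (the pixel),
    while the inputs can be fetched from anywhere (texture lookups).
    """
    n = len(src)
    out = [0.0] * n  # separate output buffer: never read and write the same memory
    for i in range(n):
        left = src[max(i - 1, 0)]
        right = src[min(i + 1, n - 1)]
        out[i] = (left + src[i] + right) / 3.0
    return out


def scatter_histogram(src, bins):
    """Scatter: each input element decides where its result is written.

    A fragment program cannot choose its output location, so this pattern
    needs another stage to re-order the output.
    """
    out = [0] * bins
    for value in src:
        out[value] += 1
    return out


blurred = gather_blur([0.0, 3.0, 6.0])
counts = scatter_histogram([0, 2, 2, 1], 3)
```

In the gather version every iteration is independent, so all output elements could be computed in parallel, one per fragment.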
