Chapter 2
Graphics hardware
By Haimano D.
Introduction
• Graphics hardware, also known as a Graphics Processing Unit (GPU),
is a specialized electronic circuit designed to rapidly manipulate and
alter memory to accelerate the creation of images in a frame buffer
intended for output to a display device.
• A frame buffer is a dedicated section of VRAM (Video RAM) or
system memory that holds image data in a rasterized format before it is
rendered on the display.
❖ It contains the color, depth (Z-buffer), and stencil information for
each pixel on the screen.
Cont'd
❖ Memory Calculation Example
❖ For a 1920 × 1080 screen at 32-bit color depth:
▪ Each pixel = 4 bytes (RGBA 32-bit)
▪ Total frame buffer size = 1920 × 1080 × 4 = 8.3 MB per frame
▪ At 60 FPS, required memory bandwidth = 8.3 MB × 60 ≈ 500 MB/s
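The calculation above can be sketched in a few lines of Python (a minimal illustration; the function name is ours, and MB here means 10^6 bytes as in the slide):

```python
# Sketch of the frame-buffer memory calculation above (pure Python).

def frame_buffer_bytes(width, height, bytes_per_pixel=4):
    """Size of one frame of RGBA pixel data in bytes."""
    return width * height * bytes_per_pixel

frame = frame_buffer_bytes(1920, 1080)   # 8,294,400 bytes
frame_mb = frame / 1_000_000             # ~8.3 MB per frame
bandwidth_mb = frame_mb * 60             # ~498 MB/s at 60 FPS
print(f"{frame_mb:.1f} MB per frame, {bandwidth_mb:.0f} MB/s at 60 FPS")
```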
❖ Optimization Techniques
▪ Frame Buffer Compression (FBC) – Reduces memory bandwidth usage.
▪ Variable Rate Shading (VRS) – Allocates resources efficiently based on
focus areas.
▪ Dynamic Resolution Scaling (DRS) – Adjusts resolution in real-time for
better performance.
Basic Concepts of Graphics Hardware
❖ Frame Buffer: This is a portion of RAM containing a bitmap that drives a
video display.
❖ It is a memory buffer containing a complete frame of data.
❖ Rasterization: This is the process by which a primitive (like a polygon) is
converted to a two-dimensional image.
❖ Each point of this image contains associated color, depth, and texture data.
❖ Pixel: The smallest controllable element of a picture represented on the
screen.
❖ Shader: A type of program used in 3D graphics to determine the final
surface properties of an object or image.
❖ This can include its color, brightness, contrast, and other attributes.
❖ Texture Mapping: A method for adding detail, surface texture, or color to a
computer-generated graphic or 3D model.
Cont'd
❖ GPU Architecture: Unlike a CPU, a GPU has a highly parallel
structure made up of thousands of smaller, more efficient cores
designed for handling multiple tasks simultaneously.
❖ Pipeline: The GPU pipeline refers to the series of steps that graphics
data goes through from the initial setup to the final rendering on the
screen.
❖ Anti-aliasing: A technique used to add greater realism to a digital
image by smoothing jagged edges on curved lines and diagonals.
❖ Z-buffering: A management technique for determining which objects
(or parts of objects) are visible and which are hidden behind other
objects.
Cont'd
❖ Ray Tracing: A rendering technique for generating an image by tracing the
path of light as pixels in an image plane and simulating the effects of its
encounters with virtual objects.
❖ Compute Shaders: These allow the GPU to be used for more than just
drawing graphics. They can perform general-purpose computing tasks, such
as physics simulations, image processing, and more.
❖ Volumetric Rendering: A technique used to display a 2D projection of a
3D discretely sampled data set. A typical 3D data set is a group of 2D slice
images acquired by a CT or MRI scanner.
❖ Tessellation: The process of dividing the polygons of a 3D model into
smaller polygons to create a finer mesh. This allows for more detailed and
smoother surfaces.
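One level of the subdivision that tessellation performs can be sketched as follows (an illustrative toy, not a real GPU tessellator; each triangle is split into four by inserting edge midpoints):

```python
# Illustrative sketch: one level of midpoint subdivision splits a
# triangle into four smaller triangles, producing a finer mesh.

def midpoint(a, b):
    return tuple((ai + bi) / 2 for ai, bi in zip(a, b))

def subdivide(tri):
    """Split one triangle (3 vertices) into 4 via edge midpoints."""
    a, b, c = tri
    ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
    return [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]

tris = [((0, 0), (1, 0), (0, 1))]
for _ in range(2):                       # two tessellation levels
    tris = [t for tri in tris for t in subdivide(tri)]
print(len(tris))                         # 16 triangles after two levels
```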
Cont'd
❖ CUDA and OpenCL: These are parallel computing platforms and
application programming interfaces (APIs) that allow software to use
certain types of GPUs for general purpose processing (an approach
known as GPGPU).
❖ Virtual Reality (VR) and Augmented Reality (AR): These
technologies rely heavily on advanced graphics hardware to render
immersive environments in real-time.
Evolution of Graphics Hardware
❖ Early 2D Accelerators (1980s-1990s)
▪ Only handled basic pixel operations.
▪ Used in arcade machines and early PCs.
❖ Fixed-Function 3D Graphics (1995-2005)
▪ Introduction of hardware-based rasterization.
▪ Examples: NVIDIA RIVA TNT, 3dfx Voodoo.
❖ Programmable Shaders (2005-2015)
▪ Vertex & Pixel Shaders introduced via DirectX 9+ and OpenGL.
▪ Allowed for advanced lighting, texturing, and effects.
▪ Examples: NVIDIA GeForce 8800 GTX, AMD HD 5000 series.
Evolution of Graphics Hardware
❖ Modern GPUs & AI Integration (2016-Present)
▪ Real-time Ray Tracing, Tensor Cores for AI acceleration.
▪ Deep learning-based DLSS (Deep Learning Super Sampling).
▪ Examples: NVIDIA GeForce RTX 4090, AMD Radeon RX 7900 XTX.
The Graphics Processing Unit (GPU)
Architecture
❖ A GPU is optimized for parallel processing, handling thousands of
calculations simultaneously.
▪ GPU vs CPU: Key Differences
Feature           | GPU                                            | CPU
Processing Type   | Parallel                                       | Serial
Cores             | Thousands (SMs, CUDA Cores, Stream Processors) | Few (4-64 cores)
Memory Bandwidth  | High (GDDR6, HBM2)                             | Lower (DDR4, DDR5)
Task Optimization | Graphics, AI, Physics, Simulations             | General Computing
Cont'd
▪ Core Components of a GPU
• Compute Units (CUs) / Streaming Multiprocessors (SMs) – Groups of cores that process shader operations.
• Ray Tracing Cores – Specialized cores for real-time light simulation.
• Tensor Cores – Used for AI-based graphics enhancements (DLSS, AI upscaling).
• Rasterization Pipeline – Converts 3D objects into 2D images.
• VRAM (Video RAM) – Stores textures, depth buffers, and frame buffers.
Cont'd
▪ GPU Memory & Cache Hierarchy
• GDDR (Graphics DDR): High-speed, standard VRAM (GDDR6X in the RTX 40 series).
• HBM (High Bandwidth Memory): Used in professional GPUs and AI computing.
• L1 Cache: Closest to the cores, smallest size.
• L2 Cache: Shared across multiple cores; slower than L1 but much faster than VRAM.
Cont'd
▪ Parallel Processing in GPUs
▪ SIMD (Single Instruction, Multiple Data):
▪ Efficient execution of the same operation on multiple data
points.
▪ Wavefronts (AMD) vs Warps (NVIDIA):
• A wavefront (AMD) groups 64 threads that execute in lockstep (32 on RDNA).
• A warp (NVIDIA) groups 32 threads that execute in lockstep.
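The SIMD model above can be illustrated with a toy in Python (purely a model of the execution style; the names `simd_execute` and `WARP_SIZE` are ours):

```python
# Toy model of SIMD execution: one instruction applied in lockstep to
# every lane of a warp (32 threads, NVIDIA) or wavefront (AMD).

WARP_SIZE = 32

def simd_execute(op, lanes):
    """Apply the same operation to every lane, as a warp would."""
    assert len(lanes) == WARP_SIZE
    return [op(x) for x in lanes]    # all lanes run the same instruction

lanes = list(range(WARP_SIZE))       # per-thread input data
result = simd_execute(lambda x: x * 2, lanes)
print(result[:4])                    # [0, 2, 4, 6]
```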
GPU Compute Models and APIs
❖ GPUs are used beyond gaming, including AI, machine learning, and
scientific simulations.
❖ Graphics APIs
❖ DirectX (Microsoft) – Used in gaming (DirectX 12 Ultimate).
❖ OpenGL & Vulkan (Khronos Group) – Cross-platform 3D
rendering.
❖ Metal (Apple) – Optimized for macOS and iOS.
Cont'd
❖ GPU Compute APIs
▪ CUDA (NVIDIA Compute Unified Device Architecture) – Allows for
high-performance computing beyond graphics.
▪ OpenCL (Open Computing Language) – Vendor-neutral compute API.
▪ DirectCompute (Microsoft) – Part of DirectX for general-purpose GPU
computing.
Introduction to the 3D Graphics Pipeline
❖ The 3D graphics pipeline is a sequence of steps that transforms a 3D
scene into a 2D image. It is divided into three main stages:
1. Application Stage
▪ Handles user input, physics, and scene updates.
▪ Sends rendering commands to the graphics API.
2. Geometry Stage
▪ Converts 3D objects into screen-space coordinates.
▪ Includes:
✔ Model Transformation: Converts object coordinates to
world coordinates.
✔ View Transformation: Positions the camera.
Cont'd
✔ Projection Transformation: Projects 3D coordinates onto the
2D image plane.
✔ Clipping & Culling: Removes unnecessary parts (e.g.,
objects outside the screen).
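The transformations of the geometry stage can be sketched with 4×4 matrices in homogeneous coordinates (a minimal illustration: both matrices here are simple translations standing in for real model and view transforms):

```python
# Minimal sketch of the geometry stage: a point in homogeneous
# coordinates is carried through model and view matrices (4x4, row-major).

def mat_vec(m, v):
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def translate(tx, ty, tz):           # stand-in for a model/view matrix
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]

point = [1.0, 2.0, 3.0, 1.0]         # object-space position (w = 1)
world = mat_vec(translate(10, 0, 0), point)   # model: object -> world
view  = mat_vec(translate(0, 0, -5), world)   # view: position the camera
print(view)                          # [11.0, 2.0, -2.0, 1.0]
```

A projection matrix would follow the same pattern, mapping the view-space point into clip space before rasterization.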
3. Rasterization Stage
▪ Converts geometric data into pixel data.
▪ Includes:
✔ Scan Conversion: Converts screen-space primitives into pixels (fragments).
✔ Shading: Determines color and lighting (Flat, Gouraud, Phong).
✔ Texturing: Maps images onto 3D surfaces.
✔ Fragment Processing: Determines the final pixel color.
Cont'd
4. Lighting & Shading Models
▪ Flat Shading: One color per polygon.
▪ Gouraud Shading: Interpolates colors across a surface.
▪ Phong Shading: Interpolates normals for smooth lighting.
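All three shading models above evaluate the same underlying lighting term, differing only in how often (per polygon, per vertex, or per pixel). A sketch of the diffuse (Lambertian) part of that term, assuming unit-length vectors:

```python
# Diffuse (Lambertian) intensity = max(0, N . L): flat shading computes
# it once per polygon, Gouraud per vertex, Phong per pixel.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def lambert(normal, light_dir):
    """Diffuse intensity for unit-length normal and light vectors."""
    return max(0.0, dot(normal, light_dir))

print(lambert((0, 0, 1), (0, 0, 1)))    # 1.0: light hits head-on
print(lambert((0, 0, 1), (0, 0, -1)))   # 0.0: surface faces away
```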
5. Texturing Techniques
▪ Texture Mapping: Applying a 2D image to a 3D object.
▪ Bump Mapping: Simulates surface roughness.
▪ Mipmap Levels: Uses lower-resolution textures for distant objects.
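Mipmap selection can be modeled roughly as picking a level from the log2 of how many texels cover one screen pixel (a simplified sketch; real GPUs derive this from texture-coordinate derivatives, and the function name is ours):

```python
# Rough sketch of mipmap level selection: the level grows with the
# log2 of the texel-to-pixel ratio, clamped to the available levels.
import math

def mip_level(texels_per_pixel, num_levels):
    level = max(0.0, math.log2(max(texels_per_pixel, 1e-6)))
    return min(int(level), num_levels - 1)

print(mip_level(1, 10))    # 0: texture shown at full resolution
print(mip_level(8, 10))    # 3: distant surface, 8 texels per pixel
```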
The Z-Buffer for Hidden Surface Removal
❖ What is the Z-Buffer?
▪ A depth buffer that stores the depth (Z-value) of each pixel.
▪ Helps determine which object is closest to the camera.
▪ When multiple objects overlap, the correct object must be displayed.
❖ How the Z-Buffer Algorithm Works
▪ Initialize the Z-buffer: Set all depth values to maximum (far plane).
▪ Render each object pixel by pixel:
✔ Compare the depth of the new pixel with the stored depth.
✔ If the new pixel is closer, update the Z-buffer and frame buffer.
▪ Display the final image.
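The algorithm above can be sketched directly in Python (a toy 2×2 screen; the `draw_pixel` helper and the convention that smaller z means closer are ours for illustration):

```python
# Direct sketch of the Z-buffer steps above. Each draw compares the
# fragment's depth with the stored value and keeps the closer one.

FAR = float("inf")
W, H = 2, 2
z_buffer = [[FAR] * W for _ in range(H)]        # 1. init to far plane
frame_buffer = [["black"] * W for _ in range(H)]

def draw_pixel(x, y, z, color):
    """2. keep the fragment only if it is closer than what is stored."""
    if z < z_buffer[y][x]:
        z_buffer[y][x] = z
        frame_buffer[y][x] = color

draw_pixel(0, 0, 5.0, "red")     # far object drawn first
draw_pixel(0, 0, 2.0, "blue")    # closer object overwrites it
draw_pixel(0, 0, 9.0, "green")   # farther object is rejected
print(frame_buffer[0][0])        # blue
```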
The Z-Buffer for Hidden Surface Removal
❖ Z-Buffering Challenges
▪ Z-Fighting: Two objects have similar depth values, causing
flickering.
▪ Precision Issues: Limited depth resolution can cause artifacts.
▪ Memory Usage: Requires extra storage for depth values.
❖ Optimizing Z-Buffer Performance
▪ Depth Pre-Pass: Render depth first to reduce overdraw.
▪ Floating-Point Depth Buffers: Increase precision.
▪ Logarithmic Depth Buffers: Reduce artifacts at long distances.