5.1.3 Universal Scalable Shader Engine (USSE) - Key Features
180 Graphics Accelerator (SGX) SPRUH73H – October 2011 – Revised April 2013
Copyright © 2011–2013, Texas Instruments Incorporated
5.2 Integration
[Figure: SGX integration. Labels: GFX Subsystem; L3 Fast Master Interconnect; L3 Fast Slave Interconnect; PRCM; CORE_CLKOUTM4 (200 MHz); pd_gfx_gfx_l3_gclk; SYSCLK; MEMCLK; pd_gfx_gfx_fclk; CORECLK; select inputs 0 and 1; /1, /2 divider; PER_CLKOUTM2 (192 MHz).]
[Figure: POWERVR SGX530 block diagram (sgx-003). Blocks: vertex data master; pixel data master; general-purpose data master; data master selector; coarse-grain scheduler; programmable data sequencer; tiling coprocessor; universal scalable shader engine (USSE); pixel coprocessor; SOCIF and BIF, each attached to the L3 interconnect.]
input control stream, which contains triangle index data and state data. The state data indicates the
PDS program, size of the vertices, and the amount of USSE output buffer resource available to the
VDM. The triangle data is parsed to determine unique indices that must be processed by the USSE.
These are grouped together according to the configuration provided by the driver and presented to the
DMS.
• The PDM is the initiator of rasterization processing within the system. Each pixel pipeline processes
pixels for a different half of a given tile, which allows for optimum efficiency within each pipe due to
locality of data. The PDM determines the amount of resource required within the USSE for each task, merges
this with the state address, and issues a request to the DMS for execution on the USSE.
• The general-purpose data master responds to events within the system (such as end of a pass of
triangles from the ISP, end of a tile from the ISP, end of render, or parameter stream breakpoint
event). Each event causes either an interrupt to the host or synchronized execution of a program on
the PDS. The program may or may not cause a subsequent task to be executed on the USSE.
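The VDM bullet above mentions parsing triangle data to find the unique indices that need processing. As a rough software analogy only (the hardware's actual method is not described here), the sketch below collects distinct indices from a triangle index list in first-seen order. The function name collect_unique_indices and the 16-bit index type are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of any SGX driver API.
 * Copies each distinct value from `indices` into `unique`, preserving
 * first-seen order, and returns how many values were written.
 * `unique` must have room for `count` entries.
 * Quadratic search is used to keep the sketch short. */
size_t collect_unique_indices(const uint16_t *indices, size_t count,
                              uint16_t *unique)
{
    size_t n_unique = 0;

    for (size_t i = 0; i < count; ++i) {
        int seen = 0;

        /* Has this index already been emitted? */
        for (size_t j = 0; j < n_unique; ++j) {
            if (unique[j] == indices[i]) {
                seen = 1;
                break;
            }
        }
        if (!seen)
            unique[n_unique++] = indices[i];
    }
    return n_unique;
}
```

For two triangles sharing an edge, such as indices 0, 1, 2 and 2, 1, 3, only four vertices would need to be processed rather than six.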
The USSE is a user-programmable processing unit. Although general in nature, its instructions and
features are optimized for three types of task: processing vertices (vertex shading), processing pixels
(pixel shading), and video/imaging processing.
The multilevel cache is a 2-level cache consisting of two modules: the main cache and the
mux/arbiter/demux/decompression unit (MADD). The MADD is a wrapper around the main cache module
designed to manage and format requests to and from the cache, and to provide Level 0 caching for
texture and USSE requests. The MADD can accept requests from the PDS, USSE, and texture address
generator modules. Arbitration, as well as any required texture decompression, is performed between
the three data streams.
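The text does not state which arbitration policy the MADD applies between its three requesters. Purely to illustrate what arbitrating three request streams can look like, the sketch below uses a round-robin scheme, which is an assumption and not a description of the hardware. The enum values and the function name madd_arbitrate_rr are hypothetical.

```c
/* Requester IDs, named after the three sources listed in the text.
 * The numbering is arbitrary. */
enum { REQ_PDS = 0, REQ_USSE = 1, REQ_TEXTURE = 2, REQ_COUNT = 3 };

/* Round-robin pick (assumed policy, for illustration only).
 * `pending` has bit i set when requester i has a request waiting.
 * `last` is the requester granted on the previous round, or -1 if none.
 * Returns the next requester to grant, or -1 when nothing is pending. */
int madd_arbitrate_rr(unsigned pending, int last)
{
    for (int step = 1; step <= REQ_COUNT; ++step) {
        int candidate = (last + step) % REQ_COUNT;

        if (pending & (1u << candidate))
            return candidate;
    }
    return -1;
}
```

Starting the search one past the previous winner means a requester that was just served is considered last, so no stream can starve the other two.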
The texturing coprocessor performs texture address generation and formatting of texture data. It receives
requests from either the iterators or the USSE modules and translates these into requests to the multilevel
cache. Data returned from the cache are then formatted according to the texture format selected, and sent
to the USSE for pixel-shading operations.
To process pixels in a tiled manner, the screen is divided into tiles and arranged as groups of tiles by the
tiling coprocessor. An inherent advantage of a tiling architecture is that a large amount of vertex data can be
rejected at this stage, thus reducing the memory storage requirements and the amount of pixel processing
to be performed.
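To make the idea of tiling and early rejection concrete, here is a minimal sketch that maps a triangle's screen-space bounding box to the range of tiles it may touch and rejects triangles that lie entirely off-screen. The 16-pixel tile size, the type tile_range_t, and the function triangle_tile_range are assumptions for illustration; the tiling coprocessor's real tile dimensions and rejection tests are not given in this text.

```c
#define TILE_SIZE 16  /* assumed tile edge in pixels, illustration only */

/* Inclusive range of tile coordinates covered by a bounding box. */
typedef struct { int x0, y0, x1, y1; } tile_range_t;

static int min3(int a, int b, int c)
{
    int m = a < b ? a : b;
    return m < c ? m : c;
}

static int max3(int a, int b, int c)
{
    int m = a > b ? a : b;
    return m > c ? m : c;
}

/* Returns 0 if the triangle's bounding box is entirely off-screen
 * (the triangle can be rejected), otherwise fills `out` with the
 * tiles overlapped by the clamped bounding box and returns 1. */
int triangle_tile_range(const int xs[3], const int ys[3],
                        int screen_w, int screen_h, tile_range_t *out)
{
    int min_x = min3(xs[0], xs[1], xs[2]);
    int max_x = max3(xs[0], xs[1], xs[2]);
    int min_y = min3(ys[0], ys[1], ys[2]);
    int max_y = max3(ys[0], ys[1], ys[2]);

    if (max_x < 0 || max_y < 0 || min_x >= screen_w || min_y >= screen_h)
        return 0;

    /* Clamp the box to the visible screen before converting to tiles. */
    if (min_x < 0) min_x = 0;
    if (min_y < 0) min_y = 0;
    if (max_x > screen_w - 1) max_x = screen_w - 1;
    if (max_y > screen_h - 1) max_y = screen_h - 1;

    out->x0 = min_x / TILE_SIZE;
    out->y0 = min_y / TILE_SIZE;
    out->x1 = max_x / TILE_SIZE;
    out->y1 = max_y / TILE_SIZE;
    return 1;
}
```

A bounding box is conservative: it can list tiles the triangle does not actually cover, so a real binner would typically refine the result further.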
The pixel coprocessor is the final stage of the pixel-processing pipeline and controls the format of the final
pixel data sent to memory. It supplies the USSE with an address into the output buffer, and the USSE then
returns the relevant pixel data. The address order is determined by the frame buffer mode. The pixel
coprocessor contains a dithering and packing function.
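The text names a dithering and packing function without describing it. As one common way such a stage can work, the sketch below applies a 2x2 ordered (Bayer) dither while packing 8-bit RGB into a 16-bit RGB565 word. The choice of RGB565, the dither matrix, and the names dither_channel and pack_rgb565_dithered are all assumptions for illustration, not the pixel coprocessor's documented behavior.

```c
#include <stdint.h>

/* 2x2 Bayer thresholds, in quarters of one quantization step. */
static const uint8_t bayer2[2][2] = { {0, 2}, {3, 1} };

/* Reduces an 8-bit channel to `bits` bits, adding a position-dependent
 * bias first so that neighboring pixels round in different directions. */
static uint8_t dither_channel(uint8_t v, unsigned bits, unsigned x, unsigned y)
{
    unsigned step = 1u << (8 - bits);                /* one quantization step */
    unsigned bias = (bayer2[y & 1][x & 1] * step) / 4;
    unsigned biased = v + bias;

    if (biased > 255)
        biased = 255;                                /* saturate, do not wrap */
    return (uint8_t)(biased >> (8 - bits));
}

/* Packs one pixel at screen position (x, y) into RGB565:
 * red in bits 15..11, green in bits 10..5, blue in bits 4..0. */
uint16_t pack_rgb565_dithered(uint8_t r, uint8_t g, uint8_t b,
                              unsigned x, unsigned y)
{
    uint16_t r5 = dither_channel(r, 5, x, y);
    uint16_t g6 = dither_channel(g, 6, x, y);
    uint16_t b5 = dither_channel(b, 5, x, y);

    return (uint16_t)((r5 << 11) | (g6 << 5) | b5);
}
```

With this matrix, a dark red value of 4 packs to 0x0000 at position (0, 0) but to 0x0800 at position (0, 1), which is how the dither trades spatial noise for apparent color depth.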
Chapter 6
Interrupts
[Figure: interrupt priority threshold. Labels: Priority Threshold; THRESHOLD; Priority Comparator, "If (INT Priority > Threshold)"; PRIORITY; PENDING_IRQp; PENDING_FIQp; IRQ_PRIORITY; Processor.]
6.1.1.2 Masking
5. The ISR saves the remaining context, identifies the interrupt source by reading the
ACTIVEIRQ/ACTIVEFIQ field, and jumps to the relevant subroutine handler as follows:
CAUTION
The code in steps 5 and 7 is assembly code compatible with ARM
architecture V6 and V7. This code was developed for the Texas Instruments Code
Composer Studio tool set. It is a draft version, tested only in an emulated
environment.
6. The subroutine handler executes code specific to the peripheral generating the interrupt by handling
the event and deasserting the interrupt condition at the peripheral side.
; IRQ0 subroutine
IRQ0handler:
; Save working registers
STMFD SP!, {R0-R1}
; Now read-modify-write the peripheral module status register
; to de-assert the M_IRQ_0 interrupt signal
; De-Assert the peripheral interrupt
MOV R0, #0x7 ; Mask for 3 flags
LDR R1, MODULE0_STATUS_REG_ADDR ; Get the address of the module Status Register
STR R0, [R1] ; Clear the 3 flags
; Restore working registers
LDMFD SP!, {R0-R1}
; Jump to the end part of the ISR
B IRQ_ISR_end ; or FIQ_ISR_end when servicing an FIQ
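Steps 5 and 6 can also be pictured in C. The sketch below is a host-side model, not target code: it reads an active interrupt number, looks up a handler in a table, and the handler clears three status flags. Everything named here is hypothetical, including the simulated registers sim_sir_irq and sim_module0_status, the 7-bit ACTIVEIRQ mask, the 128-entry table, and the write-1-to-clear behavior suggested by the "Clear the 3 flags" comment in the listing above.

```c
#include <stddef.h>
#include <stdint.h>

#define ACTIVEIRQ_MASK 0x7Fu  /* assumed width of the ACTIVEIRQ field */
#define NUM_IRQS       128u   /* assumed number of interrupt lines */

typedef void (*irq_handler_t)(void);

/* Plain variables standing in for hardware registers (simulation only). */
volatile uint32_t sim_sir_irq;         /* holds the ACTIVEIRQ field */
volatile uint32_t sim_module0_status;  /* peripheral status flags */

static irq_handler_t irq_table[NUM_IRQS];

/* Models an assumed write-1-to-clear status register:
 * every bit set in `value` clears the matching status flag. */
static void module0_status_write(uint32_t value)
{
    sim_module0_status &= ~value;
}

/* Step 6: peripheral-specific handler, clears the three flags (mask 0x7). */
void irq0_handler(void)
{
    module0_status_write(0x7u);
}

void irq_register(unsigned n, irq_handler_t handler)
{
    if (n < NUM_IRQS)
        irq_table[n] = handler;
}

/* Step 5: identify the source from ACTIVEIRQ and call its handler.
 * Returns the interrupt number served, or -1 if no handler is registered. */
int irq_dispatch(void)
{
    uint32_t active = sim_sir_irq & ACTIVEIRQ_MASK;
    irq_handler_t handler = irq_table[active];

    if (handler == NULL)
        return -1;
    handler();
    return (int)active;
}
```

On real hardware the two simulated variables would be replaced by volatile accesses to the controller and peripheral registers, and context save and restore would remain in the assembly wrapper around the dispatch call.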