SPIRV
This specification is protected by copyright laws and contains material proprietary to the Khronos Group, Inc. It or any components may not be reproduced, republished, distributed, transmitted, displayed, broadcast, or otherwise exploited in any manner
without the express prior written permission of Khronos Group. You may use this specification for implementing the functionality
therein, without altering or removing any trademark, copyright or other notice from the specification, but the receipt or possession
of this specification does not convey any rights to reproduce, disclose, or distribute its contents, or to manufacture, use, or sell
anything that it may describe, in whole or in part.
Khronos Group grants express permission to any current Promoter, Contributor or Adopter member of Khronos to copy and
redistribute UNMODIFIED versions of this specification in any fashion, provided that NO CHARGE is made for the specification
and the latest available update of the specification for any version of the API is used whenever possible. Such distributed
specification may be reformatted AS LONG AS the contents of the specification are not changed in any way. The specification
may be incorporated into a product that is sold as long as such product includes significant independent work developed by the
seller. A link to the current version of this specification on the Khronos Group website should be included whenever possible
with specification distributions.
Khronos Group makes no, and expressly disclaims any, representations or warranties, express or implied, regarding this specification, including, without limitation, any implied warranties of merchantability or fitness for a particular purpose or noninfringement of any intellectual property. Khronos Group makes no, and expressly disclaims any, warranties, express or implied,
regarding the correctness, accuracy, completeness, timeliness, and reliability of the specification. Under no circumstances will
the Khronos Group, or any of its Promoters, Contributors or Members or their respective partners, officers, directors, employees,
agents, or representatives be liable for any damages, whether direct, indirect, special or consequential damages for lost revenues,
lost profits, or otherwise, arising from or in connection with these materials.
Khronos, SYCL, SPIR, WebGL, EGL, COLLADA, StreamInput, OpenVX, OpenKCam, glTF, OpenKODE, OpenVG, OpenWF,
OpenSL ES, OpenMAX, OpenMAX AL, OpenMAX IL and OpenMAX DL are trademarks and WebCL is a certification mark
of the Khronos Group Inc. OpenCL is a trademark of Apple Inc. and OpenGL and OpenML are registered trademarks and the
OpenGL ES and OpenGL SC logos are trademarks of Silicon Graphics International used under license by Khronos. All other
product names, trademarks, and/or company names are used solely for identification and belong to their respective owners.
REVISION HISTORY
NUMBER   DATE          DESCRIPTION           NAME
         Aug 2014      Created               jk
29       Mar 2015      Provisional Release   jk
30       2-Apr-2015    Provisional Release   jk
Contents
1 Introduction
    1.1 Goals
    1.2
    1.3 Extendability
    1.4 Debuggability
    1.5 Design Principles
    1.6
    1.7 Built-In Variables
    1.8 Specialization
    1.9 Example
2 Specification
    2.1 Language Capabilities
    2.2 Terms
        2.2.1 Instructions
        2.2.2 Types
        2.2.3 Module
        2.2.4 Flow Control
    2.3
    2.4
    2.5 Instructions
        2.5.1 SSA Form
    2.6
    2.7 Execution Modes
    2.8
    2.9 Function Calling
3 Binary Form
    3.1 Magic Number
    3.2 Source Language
    3.3 Execution Model
    3.4 Addressing Model
    3.5 Memory Model
    3.6 Execution Mode
    3.7 Storage Class
    3.8 Dim
    3.9
A Simple Binary Intermediate Language for Graphical Shaders and OpenCL Compute Kernels
Contributors and Acknowledgements
Connor Abbott, Intel
Dan Baker, Oxide Games
Pat Brown, NVIDIA
Patrick Doane, Blizzard Entertainment
Tim Foley, Intel
Ben Gaster, Qualcomm
Kerch Holt, NVIDIA
Neil Henning, Codeplay
Ashwin Kolhe, NVIDIA
Graeme Leese, Broadcom
Yuan Lin, NVIDIA
Timothy Lottes, Epic Games
Daniel Koch, NVIDIA
John McDonald, Valve
Andrew Richards, Codeplay
Ian Romanick, Intel
Graham Sellers, AMD
Robert Simpson, QUALCOMM
Note
This specification can be printed with some or all language capabilities present. See Language Capabilities for details. This
printing includes the following capabilities:
Basic Shaders, which includes Matrices and vertex, fragment, and compute shaders.
Geometry Shaders.
Tessellation Shaders.
Physical Addressing.
Linking.
Kernels for OpenCL.
Introduction
Abstract. This document fully defines SPIR-V, a new binary intermediate language for representing graphical-shader stages
and compute kernels for multiple Khronos APIs. Each function in a SPIR-V module contains a control-flow graph (CFG) of
basic blocks, with additional instructions and constraints to retain source-code structured flow control. Load/store instructions
are used to access declared variables, which includes all input/output (IO). Intermediate results bypassing load/store use static
single-assignment (SSA) representation. Data objects are represented logically, with hierarchical type information: There is no
flattening of aggregates or assignment to physical register banks, etc. Selectable addressing models establish whether general
pointers may be used, or if memory access is purely logical.
1.1 Goals
1.2
1.3 Extendability
1.4 Debuggability
SPIR-V can decorate, with a text string, virtually anything created in the shader: types, variables, functions, etc. This is required for externally visible symbols, and also allowed for naming the result of any instruction. These names aid
understanding when disassembling or debugging lowered versions of SPIR-V.
Line numbers and file names can also be decorations for any type, variable, instruction result, etc.
1.5 Design Principles
Regularity. All instructions start with a word count. This allows walking a SPIR-V module without decoding each opcode. All
instructions have an opcode that dictates for all operands what kind of operand they are. For instructions with a variable number
of operands, the number of variable operands is known by subtracting the number of non-variable words from the instruction's
word count. Instructions with a string operand always take the string operand as the last operand.
Non Combinatorial. There is no combinatorial type explosion or need for large encode/decode tables for types. Rather, types
are parameterized. Sampler types declare their dimensionality, arrayness, etc. all orthogonally, which greatly simplifies code. This
is done similarly for other types. It also applies to opcodes. Operations are orthogonal to scalar/vector size, but not to integer vs.
floating-point differences.
Modeless. After a given execution model (e.g., OpenGL stage) is specified, internal operation is essentially mode-less: Generally
it will follow the rule: "same spelling, same semantics", and does not have mode bits (like version #) that modify semantics. If a
change to SPIR-V modifies semantics, it should use a different spelling. This makes downstream code much more robust. There
are execution modes declared, but these are generally to affect the way the module interacts with the environment around it, not
the internal semantics.
Declarative. SPIR-V declares externally-visible modes like "writes depth", rather than having rules that require deduction from
full shader inspection. It also explicitly declares what addressing modes, execution model, extended instruction sets, etc. will be
used. See Language Capabilities for more information.
SSA. All results of intermediate operations are strictly SSA. However, declared variables use load/store for access and variables
can be stored to multiple times.
IO. Some storage classes are for input/output (IO) and, fundamentally, IO will be done through load/store of variables declared
in these storage classes.
1.6
SPIR-V includes a phi-instruction to allow the merging together of intermediate results from split flow control. This allows
computation without load/store to variables. SPIR-V is flexible in the degree to which load/store is used; it is possible to use flow
control with no phi-instructions, while still staying in SSA form. (The store instruction does not have a result that participates in
the SSA name space.)
Some storage classes are for IO and, fundamentally, IO will be done through load/store, and initial load and final store can
never be eliminated. Other storage classes are shader local and can have their load/store eliminated. It can be considered an
optimization to largely eliminate such load/stores by moving them into intermediate results in SSA form.
1.7 Built-In Variables
SPIR-V identifies built-in variables from a high-level language with an enumerant decoration. This assigns any unusual semantics
to the variable. Built-in variables must otherwise be declared and treated the same as any other variable.
1.8 Specialization
Specialization enables creating a portable SPIR-V module outside the target execution environment, based on constant values
that won't be known until inside the execution environment. For example, this allows sizing a fixed array with a constant not known during
creation of a module, but known when the module is lowered to the target architecture.
See Specialization in the next section for more details.
1.9 Example
The SPIR-V form is binary, not human readable, and fully described in Binary Form. This is an example direct disassembly to
give a basic idea of what SPIR-V looks like.
GLSL fragment shader:
#version 450
in vec4 color1;
noperspective in vec4 color2;
out vec4 color;
uniform vec4 multiplier;
uniform bool cond;
struct S {
    bool b;
    vec4 v[5];
    int i;
};
uniform S s;
void main()
{
    vec4 scale = vec4(1.0, 1.0, 2.0, 1.0);
    if (cond)
        color = color1 + s.v[2];
    else
        color = sqrt(color2) * scale;
    for (int i = 0; i < 4; ++i)
        color *= multiplier;
}
Corresponding SPIR-V:
(Instruction text is not available for the rows through result <id> 18.)
       Result-Id  Type       Instruction
              1:
              2:
              3:
              7:
              8:
              9:
             11:  7(float)
             12:  7(float)
             13:  8(fvec4)
             14:
             15:
       16(cond):  15(ptr)
             20:
      21(color):  20(ptr)
             22:
     23(color1):  22(ptr)
             25:
             26:  25(int)
             27:
             28:
          29(S):
             30:
          31(s):  30(ptr)
             32:  28(int)
             33:  28(int)
             34:
     39(color2):  22(ptr)
             44:
             46:  28(int)
             50:  28(int)
 53(multiplier):  34(ptr)
        4(main):  2
              5:
      10(scale):  9(ptr)
          45(i):  44(ptr)
             17:  14(bool)
             18:
             24:  8(fvec4)   Load 23(color1)
             35:  34(ptr)    AccessChain 31(s) 32 33
             36:  8(fvec4)   Load 35
             37:  8(fvec4)   FAdd 24 36
                             Store 21(color) 37
                             Branch 19
             38:             Label
             40:  8(fvec4)   Load 39(color2)
             41:  8(fvec4)   ExtInst 1(GLSL.std.450) 28(sqrt) 40
             42:  8(fvec4)   Load 10(scale)
             43:  8(fvec4)   FMul 41 42
                             Store 21(color) 43
                             Branch 19
             19:             Label
                             Store 45(i) 46
                             Branch 47
             47:             Label
             49:  28(int)    Load 45(i)
             51:  14(bool)   SLessThan 49 50
                             LoopMerge 48 None
                             BranchConditional 51 52 48
             52:             Label
             54:  8(fvec4)   Load 53(multiplier)
             55:  8(fvec4)   Load 21(color)
             56:  8(fvec4)   FMul 55 54
                             Store 21(color) 56
             57:  28(int)    Load 45(i)
             58:  28(int)    IAdd 57 32
                             Store 45(i) 58
                             Branch 47
             48:             Label
                             Branch 6
              6:             Label
                             Return
                             FunctionEnd
2 Specification
2.1 Language Capabilities
Capability                                               Abbreviation  Declared Through               Depends On
Matrices.                                                Matrix        OpTypeMatrix
Vertex, fragment, and compute shaders.                   Shader        OpEntryPoint                   Matrix
Geometry Shaders.                                                      OpEntryPoint                   Shader
Tessellation Shaders.                                                  OpEntryPoint                   Shader
Physical Addressing.                                                   OpMemoryModel
Ability to have partially linked modules and libraries.  Link          Linkage Attributes Decoration
Kernels for OpenCL.                                      Kernel        OpEntryPoint
To obtain portable SPIR-V, a particular release of an API consuming SPIR-V must specify:
Which capabilities above are required to be supported.
Required limits, if they are beyond the Universal Limits.
2.2 Terms
2.2.1 Instructions
Word: 32-bits.
<id>: A numerical name; the name used to refer to an object, a type, a function, a label, etc. An <id> always consumes one
word. The <id>s defined by a module obey SSA.
Literal String: A nul-terminated stream of characters consuming an integral number of words. The character set is Unicode in
the UTF-8 encoding scheme. The UTF-8 octets (8-bit bytes) are packed four per word, following the little-endian convention
(i.e., the first octet is in the lowest-order 8-bits of the word). The final word contains the string's nul-termination character (0),
and all contents past the end of the string in the final word are padded with 0.
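The packing rule for a Literal String can be sketched in a few lines. This is only an illustration of the rule as stated above; the function names pack_literal_string and unpack_literal_string are hypothetical and not part of SPIR-V or any tool.

```python
import struct

def pack_literal_string(text):
    """Pack text into 32-bit words following the Literal String rule."""
    data = text.encode("utf-8") + b"\x00"   # the nul terminator is always present
    data += b"\x00" * (-len(data) % 4)      # pad the final word with zeros
    # "<I" reads little-endian words: the first octet lands in the low 8 bits.
    return [word for (word,) in struct.iter_unpack("<I", data)]

def unpack_literal_string(words):
    """Recover the text from a list of words produced by pack_literal_string."""
    data = b"".join(struct.pack("<I", word) for word in words)
    return data.split(b"\x00", 1)[0].decode("utf-8")

words = pack_literal_string("main")
print([hex(word) for word in words])
print(unpack_literal_string(words))
```

A four-character string such as "main" needs two words, because the terminating nul does not fit in the first one.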
Literal Number: A numeric value consuming one or more words. When a numeric value is larger than one word, low-order
words appear first.
Literal: A Literal String or a Literal Number.
Operand: A one-word argument to an instruction. E.g., it could be an <id>, or a (part of a) literal. Which form it holds is always
explicitly known from the opcode.
Immediate: Operand(s) directly holding a literal value rather than an <id>. Immediates larger than one word will consume
multiple operands, one per word. That is, operand counting is always done per word, not per immediate.
Result <id>: Most instructions produce a result, named by an <id> explicitly provided in the instruction.
WordCount: The number of words taken by an instruction, including the instruction's opcode and optional operands. That is, the
total space taken by the instruction.
Instruction: After a header, a module is simply a linear list of instructions. An instruction contains a word count, an opcode,
an optional result <id>, an optional <id> of the instruction's type, and a variable list of operands. All instruction opcodes and
semantics are listed in Instructions.
Decoration: Auxiliary information such as built-in variable, stream numbers, invariance, interpolation type, precision qualifiers,
etc., added to <id>s or structure members through decorations. Decorations are enumerated in Decoration in the Binary Form
section.
Annotation instruction: E.g., OpName, OpMemberName, OpLine, OpDecorate, OpMemberDecorate, OpGroupDecorate, OpGroupMemberDecorate, and OpDecorationGroup. These add information to an <id> created by some other instruction.
Object: An instantiation of a non-void type, either as the result <id> of an operation, or created through OpVariable.
Memory Object: An object created through OpVariable. Such an object can die on function exit, if it was a function variable, or
exist for the duration of an entry point.
Intermediate Object or Intermediate Value or Intermediate Result: An object created by an operation (not memory allocated by
OpVariable) and dying on its last consumption.
Constant Instruction: Either a specialization-constant instruction or a fixed constant instruction: Instructions that start "OpConstant" or "OpSpec".
2.2.2 Types
2.2.3 Module
Module: A single compilation unit of SPIR-V. Corresponds to one full stage of the graphical pipeline. Corresponds to a fully or
partially linked OpenCL kernel module with one or more entry points.
Execution Model: An OpenGL stage or OpenCL kernel. These are enumerated in Execution Model in the Binary Form section.
Entry Point: The function where a particular execution model will begin execution.
Execution Mode: Modes of operation relating to the interface or execution environment of the module. These are enumerated in
Execution Mode in the Binary Form section.
Vertex Processor: Any stage or execution model that processes vertices: Vertex, tessellation control, tessellation evaluation, and
geometry. Explicitly excludes fragment and compute.
2.2.4 Flow Control
Structured Loop: A loop that retains nested flow-control structure using the OpLoopMerge instruction. See Structured Control
Flow.
Structured Switch: An OpSwitch that retains nested flow-control structure using the OpSelectionMerge instruction. See Structured Control Flow.
Loop Break: An instruction within a Structured Loop that branches to the loop's merge block.
Switch Break: An instruction within a Structured Switch that branches to the switch's merge block.
Loop Continue: An instruction within a Structured Loop that branches to the loop's header block.
Break Block: A block containing a loop break or switch break instruction.
Continue Block: A block containing a loop continue instruction.
2.3
Word Number  Contents
0            Magic Number.
1            Version number. The first public version will be 100 (use 99 for pre-release).
2            Generator's magic number. It is associated with the tool that generated the module. Its value does not
             affect any semantics, and is allowed to be 0. Using a non-0 value is encouraged, and can be registered
             with Khronos.
3            Bound; where all <id>s in this module are guaranteed to satisfy
                 0 < id < Bound
             Bound should be small, smaller is better, with all <id>s in a module being densely packed and near 0.
4            0 (Reserved for instruction schema, if needed.)
5            First word of instruction stream, see below.

Contents
Opcode: The 16 high-order bits are the WordCount of the instruction. The 16 low-order bits are the opcode enumerant.
Optional instruction type <id> (presence determined by opcode).
Optional instruction result <id> (presence determined by opcode).
Operand 1 (if needed)
Operand 2 (if needed)
...
Operand N (N is determined by WordCount minus the 1 to 3 words used for the opcode, instruction type <id>, and
instruction result <id>).
Instructions are variable length due both to having optional instruction type <id> and result <id> words as well as a variable
number of operands. The details for each specific instruction are given in the Binary Form section.
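The layout above means a module can be walked without knowing what any opcode means. The following is a minimal sketch of that walk, not a reference decoder. It assumes the magic number 0x07230203 (defined in the Binary Form section, not here), and the opcode enumerants 1000 and 1001 in the sample module are hypothetical placeholders, as are the function names.

```python
MAGIC = 0x07230203  # assumed value of the SPIR-V magic number

def split_opcode_word(word):
    """Return (WordCount, opcode enumerant) from an instruction's first word."""
    return word >> 16, word & 0xFFFF

def walk_module(words):
    """Split a module into its header fields and a list of (opcode, operand words)."""
    magic, version, generator, bound, schema = words[:5]
    if magic != MAGIC:
        raise ValueError("not a SPIR-V module")
    instructions = []
    pos = 5  # first word of the instruction stream
    while pos < len(words):
        word_count, opcode = split_opcode_word(words[pos])
        if word_count == 0 or pos + word_count > len(words):
            raise ValueError("bad word count at word %d" % pos)
        # Everything after the first word belongs to this instruction.
        instructions.append((opcode, words[pos + 1 : pos + word_count]))
        pos += word_count
    return {"version": version, "bound": bound, "instructions": instructions}

# A tiny made-up module: one 3-word instruction and one 1-word instruction.
module = [MAGIC, 99, 0, 3, 0,
          (3 << 16) | 1000, 1, 2,
          (1 << 16) | 1001]
print(walk_module(module))
```

Whether the words after the first are a type <id>, a result <id>, or operands is decided per opcode, which is why the sketch leaves them undifferentiated.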
2.4
The Validation Rules section lists additional rules that must be satisfied.
2.5 Instructions
Most instructions create a result <id>, as provided in the result <id> field of the instruction. These result <id>s are then referred
to by other instructions through their <id> operands. All instruction operands are specified in the Binary Form section.
Instructions are explicit about whether they require immediates, rather than an <id> referring to some other result. This is strictly
known just from the opcode.
An immediate 32-bit (or smaller) integer is always one operand directly holding a 32-bit two's-complement value.
An immediate 32-bit float is always one operand, directly holding a 32-bit IEEE 754 floating-point representation.
An immediate 64-bit float is always two operands, directly holding a 64-bit IEEE 754 representation. The low-order 32-bits
appear in the first operand.
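As an illustration of the three immediate encodings just described, the sketch below converts Python values into lists of 32-bit operand words. The helper names are hypothetical; only the encodings themselves come from the text above.

```python
import struct

def int32_immediate(value):
    """One operand holding the 32-bit two's-complement bit pattern."""
    return [value & 0xFFFFFFFF]

def float32_immediate(value):
    """One operand holding the 32-bit IEEE 754 bit pattern."""
    return [struct.unpack("<I", struct.pack("<f", value))[0]]

def float64_immediate(value):
    """Two operands; the low-order 32 bits come first."""
    bits = struct.unpack("<Q", struct.pack("<d", value))[0]
    return [bits & 0xFFFFFFFF, bits >> 32]

print([hex(word) for word in int32_immediate(-1)])
print([hex(word) for word in float32_immediate(1.0)])
print([hex(word) for word in float64_immediate(1.0)])
```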
2.5.1 SSA Form
A module is always in static single-assignment (SSA) form. That is, there is always exactly one instruction resulting in any
particular result <id>. Storing into variables declared in memory is not subject to this; such stores do not create result <id>s.
Accessing declared variables is done through
OpVariable to allocate an object in memory and create a result <id> that is the name of a pointer to it.
OpAccessChain or OpInBoundsAccessChain to create a pointer to a subpart of a composite object in memory.
OpLoad through a pointer, giving the loaded object a result <id> that can then be used as an operand in other instructions.
OpStore through a pointer, to write a value. There is no result <id> for an OpStore.
OpLoad and OpStore instructions can often be eliminated, using intermediate results instead. When this happens in multiple
control-flow paths, these values need to be merged again at the paths' merge point. Use OpPhi to merge such values together.
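The SSA property stated in this section can be checked mechanically: every result <id> must be produced by exactly one instruction, while stores produce no result <id> and so may repeat freely. The sketch below assumes a simplified, hypothetical representation of instructions as (result id or None, opcode name, operand list) tuples; it is not how a real module is stored.

```python
def find_ssa_violations(instructions):
    """Return the result <id>s that are produced by more than one instruction."""
    seen = set()
    duplicates = []
    for result_id, opcode, operands in instructions:
        if result_id is None:      # e.g. OpStore: no result <id>, nothing to check
            continue
        if result_id in seen:
            duplicates.append(result_id)
        seen.add(result_id)
    return duplicates

# Two stores to variable 10 are fine; defining <id> 42 twice is not.
good = [
    (10, "OpVariable", []),
    (None, "OpStore", [10, 13]),
    (42, "OpLoad", [10]),
    (None, "OpStore", [10, 42]),
]
bad = good + [(42, "OpLoad", [10])]
print(find_ssa_violations(good))
print(find_ssa_violations(bad))
```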
2.6
The OpEntryPoint instruction identifies two things: an execution model and a function definition. Execution models include
Vertex, GLCompute, etc. (one for each graphical stage), as well as Kernel for OpenCL kernels. For the complete list, see
Execution Model in the Binary Form section.
2.7 Execution Modes
Information like the following is declared with OpExecutionMode instructions. For example,
number of invocations (ExecutionInvocations)
vertex-order CCW (InputVertexOrderCcw)
triangle strip generation (OutputTriangleStrip)
number of output vertices (OutputVertices)
etc.
For a complete list, see Execution Mode in the Binary Form section.
2.8
Types are built up hierarchically, using OpTypeXXX instructions. The result <id> of an OpTypeXXX instruction becomes a type
<id> for future use where type <id>s are needed. There is no type <id> of an OpTypeXXX instruction.
The "leaves" to start building with are OpTypeFloat, OpTypeInt, and OpTypeBool. Other types are built up from the result <id>
of these. The numerical types are parameterized to specify bit width and signed vs. unsigned.
Higher-level types are then constructed using opcodes like OpTypeVector, OpTypeMatrix, OpTypeSampler, OpTypeArray, OpTypeRuntimeArray, OpTypeStruct, and OpTypePointer. These are parameterized by number of components, array size, member
lists, etc. The sampler types are parameterized by the return type, dimensionality, image, depth-comparison, etc. To do texture
filtering operations, a sampler must contain both a filter (sampling state) and a texture. Such a combined sampler can be set
directly by the API, or made by combining a filter and sampler containing just a texture, which themselves were set by the API.
Types are built bottom up: A parameterizing operand in a type must be defined before being used.
Some additional information about the type of an <id> can be provided using the decoration instructions (OpDecorate, OpMemberDecorate, OpGroupDecorate, OpGroupMemberDecorate, and OpDecorationGroup). These can add, for example, Invariant
to an <id> created by another instruction. See the full list of decorations in the Binary Form section.
Types are not allowed to be aliased: Two different type <id>s must be for two different types. Non-structure types cannot be
decorated (OpDecorate). Hence, two non-structure types cannot and will not differ due to decoration differences. (If such decorations are desired, they can be applied to the object declared.) Structure type members can be decorated (OpMemberDecorate)
and structure types can be decorated (OpDecorate). To support types that differ only in decoration, two different structure types
can be made that have the same member-type operands. Types are different even if the only difference is a decoration.
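One way a tool emitting SPIR-V might honor both the bottom-up rule and the no-aliasing rule for non-structure types is to intern type declarations, so that asking for the same type twice returns the same <id>. The TypeTable class below is a hypothetical sketch of that idea, not part of the specification. It deliberately ignores structure types, which may legitimately be declared more than once when they differ in decoration.

```python
class TypeTable:
    """Hands out one <id> per distinct (opcode, operands) type declaration."""

    def __init__(self):
        self._ids = {}
        self._next_id = 1

    def declare(self, opcode, *operands):
        key = (opcode, operands)
        if key not in self._ids:          # first request: allocate a new <id>
            self._ids[key] = self._next_id
            self._next_id += 1
        return self._ids[key]             # repeat request: reuse, never alias

types = TypeTable()
float32 = types.declare("OpTypeFloat", 32)
# Operands of a type must already exist, so the leaf is declared first.
vec4 = types.declare("OpTypeVector", float32, 4)
vec4_again = types.declare("OpTypeVector", float32, 4)
print(float32, vec4, vec4_again)
```

Because an operand <id> has to be obtained before it can be passed to declare, the parameterizing type is always allocated before the type that uses it.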
Variables are declared to be of an already built type, and placed in a storage class. Storage classes include UniformConstant,
Input, WorkgroupLocal, etc. and are fully specified in Storage Class. Variables declared with the Function storage class can
have their lifetimes specified within their function using the OpLifetimeStart and OpLifetimeStop instructions.
Intermediate results are typed by the instruction's type <id>, which must validate with respect to the operation being done.
Built-in variables needing special driver handling (having unique semantics) are declared using OpDecorate or OpMemberDecorate with the decoration BuiltIn, followed by a BuiltIn enumerant. This decoration is applied to a variable or structure member.
2.9 Function Calling
To call a function defined in the current module or a function declared to be imported from another module, use OpFunctionCall
with an operand that is the <id> of the OpFunction to call, and the <id>s of the arguments to pass. All arguments are passed by
value into the called function. This includes pointers, through which a callee object could be modified.
2.10
Many operations and/or built-in function calls from high-level languages are represented through extended instruction sets.
Extended instruction sets will include things like
trigonometric functions: sin(), cos(), . . .
exponentiation functions: exp(), pow(), . . .
geometry functions: reflect(), smoothstep(), . . .
functions having rich performance/accuracy trade-offs
etc.
Non-extended instructions, those that are core SPIR-V instructions, are listed in the Binary Form section. Native operations
include:
Basic arithmetic: +, -, *, min(), scalar * vector, etc.
Texturing, to help with back-end decoding and support special code-motion rules.
Derivatives, due to special code-motion rules.
Extended instruction sets are specified in independent specifications. They can be referenced (but not specified) in this specification. The separate extended instruction set specification will specify instruction enumerants, semantics, and instruction names.
For example, the GLSL built-in functions are specified in the GLSL Built-In Functions.
To use an extended instruction set, first import it by name using OpExtInstImport and giving it a result <id>:
<extinst-id> OpExtInstImport "name-of-extended-instruction-set"
The "name-of-extended-instruction-set" is a literal string. The standard convention for this string is
"<source language name>.<package name>.<version>"
For example "GLSL.std.450" could be the name of the core built-in functions for GLSL versions 450 and earlier.
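The naming convention above is simple enough to split mechanically. The sketch below is only an illustration; parse_set_name is a hypothetical helper, and it assumes the name follows the three-part convention exactly.

```python
from collections import namedtuple

SetName = namedtuple("SetName", ["source_language", "package", "version"])

def parse_set_name(name):
    """Split "<source language name>.<package name>.<version>" into its parts."""
    parts = name.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated parts: %r" % name)
    return SetName(*parts)

print(parse_set_name("GLSL.std.450"))
```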
Note
There is nothing precluding having two "mirror" sets of instructions with different names but the same enumerants, which could,
for example, allow changing a performance/accuracy tradeoff by modifying just the import statement.
2.11
Selection and looping in the CFG have the choice to explicitly represent structured form (nested control flow). Selection includes
if-then-else using OpBranchConditional and switch using OpSwitch. Each such loop or selection construct will include:
the set of blocks forming its body,
one header block, and
one merge block. Merge blocks cannot be shared; each construct must have its own.
The construct's header and merge blocks are identified by a merge instruction: An OpLoopMerge instruction for loops and an
OpSelectionMerge instruction for selections. The block containing this merge instruction is the header block, and the block
selected by the merge instruction's Label operand is the merge block. The merge instruction is the second-to-last instruction in
the header block.
These blocks define the control-flow construct by satisfying these rules:
the header-block dominates all blocks in the construct, including the merge block
the merge-block post dominates all blocks in the construct, including the header block
with the exception of break blocks and continue blocks: Break and continue blocks, and the blocks they post dominate, are not
post dominated by the merge blocks of the nested constructs containing them.
That is, conceptually speaking, there is no branching from outside a structured control-flow construct to inside it, or from inside
it to outside it, except for breaks and continues.
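The first of these rules, that the header block dominates every block of its construct, can be checked with a standard iterative dominator computation. The sketch below covers only that rule and does not model post-dominance or the break and continue exceptions. The CFG is written as a hypothetical dictionary from block label to successor labels, using the block labels of the loop in the Introduction's example (header 47, body 52, merge 48).

```python
def dominators(successors, entry):
    """Map each block to the set of blocks that dominate it."""
    nodes = set(successors)
    preds = {node: set() for node in nodes}
    for node, succs in successors.items():
        for succ in succs:
            preds[succ].add(node)
    dom = {node: set(nodes) for node in nodes}   # start from "everything"
    dom[entry] = {entry}
    changed = True
    while changed:                               # shrink until a fixed point
        changed = False
        for node in nodes - {entry}:
            incoming = [dom[p] for p in preds[node]]
            common = set.intersection(*incoming) if incoming else set()
            new = {node} | common
            if new != dom[node]:
                dom[node] = new
                changed = True
    return dom

# Blocks of the example's for loop: 19 -> 47 (header) -> 52 (body) or 48 (merge).
cfg = {19: [47], 47: [52, 48], 52: [47], 48: [6], 6: []}
dom = dominators(cfg, entry=19)
header, body, merge = 47, 52, 48
print(header in dom[body], header in dom[merge])
```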
Note that a switch "default statement" that does nothing can be represented by the OpSwitch's Default operand being the label
of a structured switch's merge block.
2.12 Specialization
Note
Ad hoc specializing should not be done through constants (OpConstant or OpConstantComposite) that get overwritten: A
SPIR-V to SPIR-V transform might want to do something irreversible with the value of such a constant, unconstrained by the
possibility that its value could be later changed.
OpSpecConstantTrue      -> OpConstantTrue or OpConstantFalse
OpSpecConstantFalse     -> OpConstantTrue or OpConstantFalse
OpSpecConstant          -> OpConstant
OpSpecConstantComposite -> OpConstantComposite
The external specialization must indicate whether it is a partial specialization or a full specialization. If it indicates a partial specialization, then only those specialization constants provided by the external specialization can be modified. A full specialization
allows all specialization instructions to be modified.
TBD. Add these instructions to Section 3 to use in constructing derived specialization constants:
OpSpecIAdd
OpSpecISub
OpSpecIMul
OpSpecUDiv (invalid to divide by 0)
OpSpecUMod
OpSpecLogicalAnd
OpSpecLogicalOr
OpSpecLogicalXor
2.13 Linkage
The ability to have partially linked modules and libraries is provided as part of the Link capability.
By default, functions and global variables are private to a module and cannot be accessed by other modules. However, a module
may be written to export or import functions and global (module scope) variables. Imported functions and global variable
definitions are resolved at linkage time. A module is considered to be partially linked if it depends on imported values.
Within a module, imported or exported values are decorated using the Linkage Attributes Decoration. This decoration assigns
the following linkage attributes to decorated values:
A Linkage Type.
A name, which is a Literal String, and is used to uniquely identify exported values.
Note
When resolving imported functions, the Function Control and all Function Parameter Attributes are taken from the function
definition, and not from the function declaration.
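The relationship between imports, exports, and partial linking can be sketched as a name lookup. The dictionary layout below (a module as a pair of "imports" and "exports" name lists) and the function name are hypothetical simplifications made for illustration; a real linker works on Linkage Attributes decorations and also checks types.

```python
def unresolved_imports(module, libraries):
    """Return the imported names that no library exports, in sorted order."""
    exported = set()
    for library in libraries:
        exported |= set(library["exports"])
    return sorted(set(module["imports"]) - exported)

kernel = {"imports": ["helper", "table"], "exports": ["main_kernel"]}
math_lib = {"imports": [], "exports": ["helper"]}
data_lib = {"imports": [], "exports": ["table"]}

# Linked against math_lib alone, the module is still partially linked.
print(unresolved_imports(kernel, [math_lib]))
print(unresolved_imports(kernel, [math_lib, data_lib]))
```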
2.14 ES Precision
ES precision qualifiers are handled with OpDecorate and OpMemberDecorate instructions (or, if more efficient, OpGroupDecorate and OpGroupMemberDecorate). The precision decorations are:
PrecisionLow
PrecisionMedium
PrecisionHigh
The precision decoration can be applied to
The <id> of a variable, where the variable's type is numerical (including vectors, arrays, etc. of numerical types, but not a
structure).
The result <id> of an instruction, meaning the instruction is to operate at and result in the decorated precision.
2.15 Debug Information
Debug information is supplied with the annotations OpName, OpMemberName, and OpLine. A module will not lose any
semantics when all such instructions are removed.
2.15.1 Function-Name Mangling
There is no functional dependency on how functions are named. Signature-typing information is explicitly provided, without any
need for name "unmangling". (Valid modules can be created without inclusion of mangled names.)
By convention, for debugging purposes, modules with OpSource of OpenCL use the Itanium name-mangling standard.
2.16
Validation Rules
2.16.1
There is at least one OpEntryPoint instruction, unless the Link capability is being used.
No function can be targeted by both an OpEntryPoint instruction and an OpFunctionCall instruction.
Functions
A function declaration (an OpFunction with no basic blocks), must have a Linkage Attributes Decoration with the Import
Linkage Type.
A function definition (an OpFunction with basic blocks) cannot be decorated with the Import Linkage Type.
Global (Module Scope) Variables
It is illegal to initialize an imported variable. This means that a module-scope OpVariable with initialization value cannot be
marked with the Import Linkage Type.
Control-Flow Graph (CFG)
Blocks exist only within a function.
The first block in a function definition is the entry point of that function, and dominates all other blocks in the function.
The order of blocks in a function must satisfy the partial ordering imposed by the function's dominator tree (dominators
must appear before blocks they dominate).
Each block starts with a label.
* A label is made by OpLabel.
* This includes the first block of a function (OpFunction is not a label).
* Labels are used only to form blocks.
All OpPhi instructions within a block must appear before all non-OpPhi instructions in the block.
Each block ends in one branch instruction. These are:
* OpBranch
* OpBranchConditional
* OpSwitch
* OpKill
* OpReturn
* OpReturnValue
* OpUnreachable
The branch instructions can only appear as the last instruction in a block.
OpLabel instructions can only appear within a function.
All branches within a function must be to labels in that function.
Structured Loops (those having an OpLoopMerge instruction).
* The header dominates the merge block.
* The merge block post-dominates the header, except for breaks and continues, see Structured Control Flow for more detail.
* The OpLoopMerge instruction must be the second-to-last instruction in the header block.
Structured Selection (those having an OpSelectionMerge instruction)
*
*
*
*
All OpFunctionCall Function operands are an <id> of an OpFunction in the same module.
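Several of the per-block CFG rules above can be checked mechanically. The following is a minimal sketch, assuming a block is given as an ordered list of opcode names; the function name check_block and the error strings are hypothetical, and rules that need the whole function (dominance, branch targets) are not covered.

```python
# Branch instructions listed in the CFG rules above.
BRANCH_OPS = {
    "OpBranch", "OpBranchConditional", "OpSwitch", "OpKill",
    "OpReturn", "OpReturnValue", "OpUnreachable",
}


def check_block(ops):
    """Return a list of rule violations for one block (empty if none)."""
    errors = []
    if not ops or ops[0] != "OpLabel":
        errors.append("block must start with OpLabel")
    if not ops or ops[-1] not in BRANCH_OPS:
        errors.append("block must end in a branch instruction")
    if any(op in BRANCH_OPS for op in ops[:-1]):
        errors.append("branch instruction before the end of the block")
    # All OpPhi instructions must precede all non-OpPhi instructions.
    seen_non_phi = False
    for op in ops:
        if op == "OpLabel":
            continue
        if op == "OpPhi" and seen_non_phi:
            errors.append("OpPhi after a non-OpPhi instruction")
            break
        if op != "OpPhi":
            seen_non_phi = True
    # In a loop header, OpLoopMerge must be second to last.
    if "OpLoopMerge" in ops and ops.index("OpLoopMerge") != len(ops) - 2:
        errors.append("OpLoopMerge must be the second-to-last instruction")
    return errors
```

A block such as OpLabel, OpPhi, OpIAdd, OpBranch passes all of these checks, while moving the OpPhi after the OpIAdd is reported.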
Data rules
Vector types can only be parameterized with numerical types or the OpTypeBool type.
Matrix types can only be parameterized with floating-point types.
OpVariableArray can only be used for the Function storage class.
Specialization constants (see Specialization) are limited to integers, Booleans, floating-point numbers, and vectors of these.
Decoration rules
The Aliased decoration can only be applied to intermediate objects that are pointers to non-void types.
The Linkage Attributes decoration cannot be applied to Entry Point functions (functions targeted by an OpEntryPoint
instruction).
OpLoad and OpStore can only consume objects whose type is a pointer.
All OpLoad, OpStore, and OpPhi instructions must be within a function definition.
A result <id> resulting from an instruction within a function can only be used in that function.
A function call must have the same number of arguments as the function definition has parameters.
An instruction requiring a specific number of operands must have that many operands. The word count must agree.
Each opcode specifies its own requirements for number and type of operands, and these must be followed.
Atomic access rules
The pointers taken by atomic operation instructions must either be formed by a variable declaration or an OpImagePointer
instruction.
The only instructions that can operate on a pointer to the storage class AtomicCounter are
* OpAtomicLoad
* OpAtomicIIncrement
* OpAtomicIDecrement
All pointers used in atomic operation instructions must be pointers to one of the following:
* 32-bit scalar integer
* 64-bit scalar integer
2.16.2
Data rules:
Vector types can only be parameterized as having 2, 3, or 4 components.
Matrix types can only be parameterized as having 2, 3, or 4 columns.
Texturing rules:
All OpSampler instructions must appear in the first basic block of the entry point.
2.16.3
The following rules are applicable to modules that use the OpenCL1.2 or OpenCL2.0 Memory Model. Modules that adhere to
these rules can be consumed by an OpenCL runtime.
OpenCL supports the Addr, Link, and Kernel capabilities. The use of any other capability is disallowed.
The addressing model can only be Physical32 or Physical64.
OpTypeInt validation rules
The bit width operand can only be parameterized as 8, 16, 32 and 64 bit.
The sign operand must always be 0.
OpTypeFloat bit width operand can only be parameterized as 16, 32 and 64 bit.
OpTypeVector can only be parameterized as having 2, 3, 4, 8, or 16 components.
Variables used in atomic operations can only be 32-bit or 64-bit floating-point numbers, or 32-bit or 64-bit signed integers.
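As an illustration, the capability, addressing-model, and type-parameter rules above could be checked as in the sketch below. The function name, the tuple encoding of type declarations, and the error strings are assumptions made for this example only.

```python
OPENCL_CAPABILITIES = {"Addr", "Link", "Kernel"}
OPENCL_ADDRESSING_MODELS = {"Physical32", "Physical64"}
INT_WIDTHS = {8, 16, 32, 64}
FLOAT_WIDTHS = {16, 32, 64}
VECTOR_SIZES = {2, 3, 4, 8, 16}


def opencl_rule_violations(capabilities, addressing_model, types):
    """Check a hypothetical module summary against the OpenCL rules.

    types holds tuples such as ("OpTypeInt", width, signedness),
    ("OpTypeFloat", width), or ("OpTypeVector", component_count).
    """
    errors = []
    extra = set(capabilities) - OPENCL_CAPABILITIES
    if extra:
        errors.append(f"disallowed capabilities: {sorted(extra)}")
    if addressing_model not in OPENCL_ADDRESSING_MODELS:
        errors.append(f"addressing model {addressing_model} not allowed")
    for opcode, *args in types:
        if opcode == "OpTypeInt":
            width, signedness = args
            if width not in INT_WIDTHS:
                errors.append(f"OpTypeInt width {width} not allowed")
            if signedness != 0:
                errors.append("OpTypeInt sign operand must be 0")
        elif opcode == "OpTypeFloat":
            if args[0] not in FLOAT_WIDTHS:
                errors.append(f"OpTypeFloat width {args[0]} not allowed")
        elif opcode == "OpTypeVector":
            if args[0] not in VECTOR_SIZES:
                errors.append(f"OpTypeVector size {args[0]} not allowed")
    return errors
```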
2.17
Universal Limits
These quantities are minimum limits for all implementations and validators. Implementations are allowed to support larger
quantities. Specific APIs may impose larger minimums. See Language Capabilities.
Validators must either
inform when these limits are crossed, or
be explicitly parameterized with larger limits.
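A validator that follows this rule might look like the sketch below. It uses only the three limits whose entity names are legible in the table that follows (literal name, literal string, and instruction word count); the Limits class, check_limits, and the message format are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    """Defaults are the universal minimum limits; callers may raise them."""
    max_name_chars: int = 1024
    max_string_chars: int = 65_536
    max_instruction_words: int = 264  # 256 operand words plus 8 more words


def check_limits(name_len, string_len, word_count, limits=Limits()):
    """Return one message per limit that is crossed (empty list if none)."""
    findings = []
    if name_len > limits.max_name_chars:
        findings.append(f"literal name: {name_len} > {limits.max_name_chars}")
    if string_len > limits.max_string_chars:
        findings.append(f"literal string: {string_len} > {limits.max_string_chars}")
    if word_count > limits.max_instruction_words:
        findings.append(f"word count: {word_count} > {limits.max_instruction_words}")
    return findings
```

Passing a Limits instance with larger values corresponds to the second option above, a validator explicitly parameterized with larger limits.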
Limited Entity                    Minimum Limit
                                  Decimal    Hexadecimal
Characters in a literal name      1024       400
Characters in a literal string    65,536     10000
Instruction word count            264        108
                                  (256 operand words plus 8 more words)
400,000
1024
400
2.18
65,536
10000
524,288
80,000
Number of entries in the
Decoration table.
256
100
256
100
256
100
256
256
16,384
16,384
256
100
100
4000
4000
100
Memory Model
A memory model is chosen using a single OpMemoryModel instruction near the beginning of the module. This selects both an
addressing model and a memory model.
The Logical addressing model means pointers have no physical size or numeric value. In this mode, pointers can only be created
from existing objects, and they cannot be stored into an object.
The non-Logical addressing models allow physical pointers to be formed. OpVariable can be used to create objects that hold
pointers. These are declared for a specific Storage Class. Pointers for one storage class cannot be used to access objects in
another storage class. However, they can be converted with conversion opcodes. Any particular addressing model must describe
the bitwidth of pointers for each of the storage classes.
TBD: A more detailed memory model is being worked out. It will reflect realities of modern system architecture, and largely
unify multiple past memory models through parameterized use of Memory Semantics and Execution Scopes. E.g., use of acquire/release semantics within an execution scope largely gives the control needed for current and past memory models. This
includes replacing most purposes of the traditional volatile keyword in high-level languages.
2.18.1
Aliasing
2.19
Execution Model
TBD. Execution models will be expanded to include details of various precision requirements for different environments and
releases of different APIs and high-level languages. This will include IEEE floating point rules, allowed optimizations, and
how NaN, infinities, and denormalized numbers are handled.
2.19.1
Code Motion
Texturing instructions in the Fragment Execution Model that rely on an implicit derivative cannot be moved within flow control
that is not known to be uniform flow control.
TBD. Give a strict definition of uniform flow control.
Binary Form
This section contains the exact form for all instructions, starting with the numerical values for all fields. See Physical Layout for
the order words appear in.
3.1
Magic Number
Magic Number
0x07230203
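A consumer can use the magic number to sanity-check a binary and to detect its byte order. The sketch below assumes the magic number occupies the first 32-bit word of the module (per the Physical Layout section referenced above) and that a byte-swapped match means the module was written with the opposite endianness; detect_endianness is a hypothetical helper name.

```python
import struct

MAGIC = 0x07230203


def detect_endianness(data: bytes):
    """Return "little", "big", or None if the first word is not the magic number."""
    if len(data) < 4:
        raise ValueError("need at least one 32-bit word")
    if struct.unpack("<I", data[:4])[0] == MAGIC:
        return "little"
    if struct.unpack(">I", data[:4])[0] == MAGIC:
        return "big"
    return None
```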
3.2
Source Language
The source language is an annotation, with no semantics that affect the meaning of other parts of the module. Used by OpSource.
Source Language
0    Unknown
1    ESSL
2    GLSL
3    OpenCL
3.3
Execution Model
Used by OpEntryPoint.
Execution Model
Value  Name                    Required Capability  Description
0      Vertex                  Shader               Vertex shading stage.
1      TessellationControl     Tess                 Tessellation control (or hull) shading stage.
2      TessellationEvaluation  Tess                 Tessellation evaluation (or domain) shading stage.
3      Geometry                Geom                 Geometry shading stage.
4      Fragment                Shader               Fragment shading stage.
5      GLCompute               Shader               Graphical compute shading stage.
6      Kernel                  Kernel               Compute kernel.
3.4
Addressing Model
Used by OpMemoryModel.
Addressing Model
Value  Name        Required Capability  Description
0      Logical
1      Physical32  Addr                 Indicates a 32-bit module, where the address width is equal to 32 bits.
2      Physical64  Addr                 Indicates a 64-bit module, where the address width is equal to 64 bits.
3.5
Memory Model
Used by OpMemoryModel.
Memory Model
Value  Name       Required Capability  Description
0      Simple     Shader               No shared memory consistency issues.
1      GLSL450    Shader               Memory model needed by later versions of GLSL and ESSL. Works across multiple versions.
2      OpenCL1.2  Kernel               OpenCL 1.2 memory model.
3      OpenCL2.0  Kernel               OpenCL 2.0 memory model.
4      OpenCL2.1  Kernel               OpenCL 2.1 memory model.
3.6
Execution Mode
Declare the modes this module's stage will execute in. Used by OpExecutionMode.
Execution Mode
0
10
Invocations
Number of times to invoke the geometry stage for
each input primitive received. The default is to run
once for each input primitive. If greater than the
target-dependent maximum, it will fail to compile.
Only valid with the Geometry Execution Model.
SpacingEqual
Requests the tessellation primitive generator to divide
edges into a collection of equal-sized segments. Only
valid with one of the tessellation Execution Models.
SpacingFractionalEven
Requests the tessellation primitive generator to divide
edges into an even number of equal-length segments
plus two additional shorter fractional segments. Only
valid with one of the tessellation Execution Models.
SpacingFractionalOdd
Requests the tessellation primitive generator to divide
edges into an odd number of equal-length segments
plus two additional shorter fractional segments. Only
valid with one of the tessellation Execution Models.
VertexOrderCw
Requests the tessellation primitive generator to
generate triangles in clockwise order. Only valid with
one of the tessellation Execution Models.
VertexOrderCcw
Requests the tessellation primitive generator to
generate triangles in counter-clockwise order. Only
valid with one of the tessellation Execution Models.
PixelCenterInteger
Pixels appear centered on whole-number pixel
offsets. E.g., the coordinate (0.5, 0.5) appears to
move to (0.0, 0.0). Only valid with the Fragment
Execution Model.
OriginUpperLeft
Pixel coordinates appear to originate in the upper left,
and increase toward the right and downward. Only
valid with the Fragment Execution Model.
EarlyFragmentTests
Fragment tests are to be performed before fragment
shader execution. Only valid with the Fragment
Execution Model.
PointMode
Requests the tessellation primitive generator to
generate a point for each distinct vertex in the
subdivided primitive, rather than to generate lines or
triangles. Only valid with one of the tessellation
Execution Models.
Xfb
This stage will run in transform feedback-capturing
mode and this module is responsible for describing
the transform-feedback setup. See the XfbBuffer,
Offset, and Stride Decorations.
Required
Capability
Geom
Tess
Tess
Tess
Tess
Tess
Shader
Shader
Shader
Tess
Shader
Extra Operands
Literal Number
Number of invocations
Execution Mode
11
12
13
14
15
16
17
18
19
20
21
DepthReplacing
This mode must be declared if this module
potentially changes the fragment's depth. Only valid
with the Fragment Execution Model.
DepthAny
TBD: this should probably be removed. Depth
testing will always be performed after the shader has
executed. Only valid with the Fragment Execution
Model.
DepthGreater
External optimizations may assume depth
modifications will leave the fragment's depth
greater than or equal to the fragment's interpolated
depth value (given by the z component of the
FragCoord BuiltIn decorated variable). Only valid
with the Fragment Execution Model.
DepthLess
External optimizations may assume depth
modifications leave the fragment's depth less than the
fragment's interpolated depth value (given by the z
component of the FragCoord BuiltIn decorated
variable). Only valid with the Fragment Execution
Model.
DepthUnchanged
External optimizations may assume this stage did not
modify the fragment's depth. However,
DepthReplacing mode must accurately represent
depth modification. Only valid with the Fragment
Execution Model.
LocalSize
Indicates the work-group size in the x, y, and z
dimensions. Only valid with the GLCompute or
Kernel Execution Models.
LocalSizeHint
A hint to the compiler, which indicates the most
likely to be used work-group size in the x, y, and z
dimensions. Only valid with the Kernel Execution
Model.
InputPoints
Stage input primitive is points. Only valid with the
Geometry Execution Model.
InputLines
Stage input primitive is lines. Only valid with the
Geometry Execution Model.
InputLinesAdjacency
Stage input primitive is lines adjacency. Only valid
with the Geometry Execution Model.
InputTriangles
For a geometry stage, input primitive is triangles. For
a tessellation stage, requests the tessellation primitive
generator to generate triangles. Only valid with the
Geometry or one of the tessellation Execution
Models.
Required
Capability
Shader
Extra Operands
Shader
Shader
Shader
Shader
Kernel
Geom
Geom
Geom
Geom
Tess
Literal
Number
x size
Literal
Number
y size
Literal
Number
z size
Literal
Number
x size
Literal
Number
y size
Literal
Number
z size
Execution Mode
22
23
24
25
26
27
28
29
30
3.7
InputTrianglesAdjacency
Geometry stage input primitive is triangles
adjacency. Only valid with the Geometry Execution
Model.
InputQuads
Requests the tessellation primitive generator to
generate quads. Only valid with one of the
tessellation Execution Models.
InputIsolines
Requests the tessellation primitive generator to
generate isolines. Only valid with one of the
tessellation Execution Models.
OutputVertices
For a geometry stage, the maximum number of
vertices the shader will ever emit in a single
invocation. For a tessellation-control stage, the
number of vertices in the output patch produced by
the tessellation control shader, which also specifies
the number of times the tessellation control shader is
invoked. Only valid with the Geometry or one of the
tessellation Execution Models.
OutputPoints
Stage output primitive is points. Only valid with the
Geometry Execution Model.
OutputLineStrip
Stage output primitive is line strip. Only valid with
the Geometry Execution Model.
OutputTriangleStrip
Stage output primitive is triangle strip. Only valid
with the Geometry Execution Model.
VecTypeHint
A hint to the compiler, which indicates that most
operations used in the entry point are explicitly
vectorized using a particular vector type. Only valid
with the Kernel Execution Model.
ContractionOff
Indicates that floating-point-expressions contraction
is disallowed. Only valid with the Kernel Execution
Model.
Required
Capability
Geom
Extra Operands
Tess
Tess
Geom
Tess
Literal Number
Vertex count
Geom
Geom
Geom
Kernel
<id>
Vector type
Kernel
Storage Class
Class of storage for declared variables (does not include intermediate values). Used by:
OpTypePointer
OpVariable
OpVariableArray
OpGenericCastToPtrExplicit
Storage Class
0
1
2
3
4
7
8
10
3.8
UniformConstant
Shared externally, read-only memory, visible across all
instantiations or work groups. Graphics uniform memory.
OpenCL Constant memory.
Input
Input from pipeline. Read only.
Uniform
Shared externally, visible across all instantiations or work
groups.
Output
Output to pipeline.
WorkgroupLocal
Shared across all work items within a work group.
OpenGL "shared". OpenCL local memory.
WorkgroupGlobal
Visible across all work items of all work groups. OpenCL
global memory.
PrivateGlobal
Accessible across functions within a module, non-IO (not
visible outside the module).
Function
A variable local to a function.
Generic
A generic pointer, which overloads StoragePrivate,
StorageLocal, and StorageGlobal. Not a real storage class.
Private
Private to a work-item and is not visible to another
work-item. OpenCL private memory.
AtomicCounter
For holding atomic counters.
Required
Capability
Shader
Shader
Shader
Shader
Shader
Kernel
Kernel
Shader
Dim
3.9
1D
2D
3D
Cube
Rect
Buffer
Required
Capability
Shader
Shader
1
2
3
3.10
None
The image coordinates used to sample elements of the
image refer to a location inside the image, otherwise the
results are undefined.
ClampToEdge
Out-of-range image coordinates are clamped to the extent.
Clamp
Out-of-range image coordinates will return a border color.
Repeat
Out-of-range image coordinates are wrapped to the valid
range. Can only be used with normalized coordinates.
RepeatMirrored
Flip the image coordinate at every integer junction. Can
only be used with normalized coordinates.
Required
Capability
Kernel
Kernel
Kernel
Kernel
Kernel
3.11
Nearest
Use filter nearest mode when performing a read image
operation.
Linear
Use filter linear mode when performing a read image
operation.
Required
Capability
Kernel
Kernel
0x8
0x10
None
NotNaN
Assume parameters and result are not NaN.
NotInf
Assume parameters and result are not +/- Inf.
NSZ
Treat the sign of a zero parameter or result as
insignificant.
AllowRecip
Allow the usage of reciprocal rather than perform a
division.
Fast
Allow algebraic transformations according to real-number
associative and distributive algebra. This flag implies all
the others.
Required
Capability
Kernel
Kernel
Kernel
Kernel
Kernel
3.12
FP Rounding Mode
3.13
RTE
Round to nearest even.
RTZ
Round towards zero.
RTP
Round towards positive infinity.
RTN
Round towards negative infinity.
Required
Capability
Kernel
Kernel
Kernel
Kernel
Linkage Type
3.14
Export
Accessible by other modules as well.
Import
A declaration of a global variable or a function that exists
in another module.
Required
Capability
Link
Link
Access Qualifier
3.15
ReadOnly
A read-only object.
WriteOnly
A write-only object.
ReadWrite
A readable and writable object.
Adds additional information to the return type and to each parameter of a function.
Required
Capability
Kernel
Kernel
Kernel
6
7
3.16
Zext
Value should be zero extended if needed.
Sext
Value should be sign extended if needed.
ByVal
This indicates that the pointer parameter should really be
passed by value to the function. Only valid for pointer
parameters (not for the return value).
Sret
Indicates that the pointer parameter specifies the address
of a structure that is the return value of the function in the
source program. Only applicable to the first parameter,
which must be a pointer parameter.
NoAlias
Indicates that the memory pointed to by a pointer parameter
is not accessed via pointer values that are not derived
from this pointer parameter. Only valid for pointer
parameters. Not valid on return values.
NoCapture
The callee does not make a copy of the pointer parameter
into a location that is accessible after returning from the
callee. Only valid for pointer parameters. Not valid on
return values.
SVM
CL TBD
NoWrite
Can only read the memory pointed to by a pointer
parameter. Only valid for pointer parameters. Not valid
on return values.
NoReadWrite
Cannot dereference the memory pointed to by a pointer
parameter. Only valid for pointer parameters. Not valid
on return values.
Required
Capability
Kernel
Kernel
Kernel
Kernel
Kernel
Kernel
Kernel
Kernel
Kernel
Decoration
PrecisionLow
Apply as described in the ES Precision section.
PrecisionMedium
Apply as described in the ES Precision section.
PrecisionHigh
Apply as described in the ES Precision section.
Block
Apply to a structure type to establish it is a
non-SSBO-like shader-interface block.
TBD: can this be removed? Probably doesn't add
anything over a nonwritable structure in the
UniformConstant or Uniform storage class with a
Binding and DescriptorSet decoration.
Required
Capability
Shader
Shader
Shader
Shader
Extra Operands
Decoration
4
10
11
12
13
14
BufferBlock
Apply to a structure type to establish it is an
SSBO-like shader-interface block.
TBD: can this be removed? Probably doesn't add
anything over a structure in the UniformConstant or
Uniform storage class with a Binding and
DescriptorSet decoration.
RowMajor
Apply to a variable or a member of a structure. Must
decorate an entity whose type is a matrix. Indicates
that components within a row are contiguous in
memory.
ColMajor
Apply to a variable or a member of a structure. Must
decorate an entity whose type is a matrix. Indicates
that components within a column are contiguous in
memory.
GLSLShared
Apply to a structure type to get GLSL shared
memory layout.
GLSLStd140
Apply to a structure type to get GLSL std140
memory layout.
GLSLStd430
Apply to a structure type to get GLSL std430
memory layout.
GLSLPacked
Apply to a structure type to get GLSL packed
memory layout.
Smooth
Apply to a variable or a member of a structure.
Indicates that perspective-correct interpolation must
be used. Only valid for the Input and Output
Storage Classes.
Noperspective
Apply to a variable or a member of a structure.
Indicates that linear, non-perspective correct,
interpolation must be used. Only valid for the Input
and Output Storage Classes.
Flat
Apply to a variable or a member of a structure.
Indicates no interpolation will be done. The
non-interpolated value will come from a vertex, as
described in the API specification. Only valid for the
Input and Output Storage Classes.
Patch
Apply to a variable or a member of a structure.
Indicates a tessellation patch. Only valid for the
Input and Output Storage Classes.
Required
Capability
Shader
Matrix
Matrix
Shader
Shader
Shader
Shader
Shader
Shader
Shader
Tess
Extra Operands
Decoration
15
16
17
18
19
20
21
22
23
24
25
Centroid
Apply to a variable or a member of a structure. When
used with multi-sampling rasterization, allows a
single interpolation location for an entire pixel. The
interpolation location must lie in both the pixel and in
the primitive being rasterized. Only valid for the
Input and Output Storage Classes.
Sample
Apply to a variable or a member of a structure. When
used with multi-sampling rasterization, requires
per-sample interpolation. The interpolation locations
must be the locations of the samples lying in both the
pixel and in the primitive being rasterized. Only valid
for the Input and Output Storage Classes.
Invariant
Apply to a variable, to indicate that expressions
computing its value must be evaluated invariantly with
respect to other modules computing the same expressions.
Restrict
Apply to a variable, to indicate the compiler may
compile as if there is no aliasing. See the Aliasing
section for more detail.
Aliased
Apply to a variable, to indicate the compiler is to
generate accesses to the variable that work correctly
in the presence of aliasing. See the Aliasing section
for more detail.
Volatile
Apply to a variable, to indicate the memory holding
the variable is volatile. See the Memory Model
section for more detail.
Constant
Indicates that a global variable is constant and will
never be modified. Only allowed on global variables.
Coherent
Apply to a variable, to indicate the memory holding
the variable is coherent. See the Memory Model
section for more detail.
Nonwritable
Apply to a variable, to indicate the memory holding
the variable is not writable, and that this module does
not write to it.
Nonreadable
Apply to a variable, to indicate the memory holding
the variable is not readable, and that this module does
not read from it.
Uniform
Apply to a variable or a member of a structure.
Asserts that the value backing the decorated <id> is
dynamically uniform across all instantiations that
might run in parallel.
Required
Capability
Shader
Shader
Shader
Kernel
Shader
Extra Operands
Decoration
26
27
28
29
30
31
32
33
34
Required
Capability
Extra Operands
NoStaticUse
Apply to a variable to indicate that it is known that
this module does not read or write it. Useful for
establishing interface.
TBD consider removing this?
CPacked
Marks a structure type as "packed", indicating that
the alignment of the structure is one and that there is
no padding between structure members.
SaturatedConversion
Indicates that a conversion to an integer type which is
outside the representable range of Result Type will be
clamped to the nearest representable value of Result
Type. NaN will be converted to 0.
This decoration can only be applied to conversion
instructions to integer types, not including the
OpSatConvertUToS and OpSatConvertSToU
instructions.
Stream
Apply to a variable or a member of a structure.
Indicates the stream number to put an output on.
Only valid for the Output Storage Class and the
Geometry Execution Model.
Location
Apply to a variable or a structure member. Forms the
main linkage for Storage Class Input and Output
variables:
- between the API and vertex-stage inputs,
- between consecutive programmable stages, or
- between fragment-stage outputs and the API.
Only valid for the Input and Output Storage
Classes.
Component
Apply to a variable or a member of a structure.
Indicates which component within a Location will
be taken by the decorated entity. Only valid for the
Input and Output Storage Classes.
Index
Apply to a variable to identify a blend equation input
index, used as described in the API specification.
Only valid for the Output Storage Class and the
Fragment Execution Model.
Binding
Apply to a variable. Part of the main linkage between
the API and SPIR-V modules for memory buffers,
textures, etc. See the API specification for more
information.
DescriptorSet
Apply to a variable. Part of the main linkage between
the API and SPIR-V modules for memory buffers,
textures, etc. See the API specification for more
information.
Kernel
Kernel
Geom
Literal Number
Stream number
Shader
Literal Number
Location
Shader
Literal Number
Component within a vector
Shader
Literal Number
Index
Shader
Literal Number
Binding point
Shader
Literal Number
Descriptor set
Decoration
35
36
37
38
39
40
41
42
43
44
3.17
Required
Capability
Offset
Apply to a structure member. This gives the byte
offset of the member relative to the beginning of the
structure. Can be used, for example, by both uniform
and transform-feedback buffers.
Alignment
TBD: This can probably be removed.
XfbBuffer
Apply to a variable or a member of a structure.
Indicates which transform-feedback buffer an output
is written to. Only valid for the Output Storage
Class of vertex-processing Execution Models.
Stride
The stride, in bytes, of array elements or
transform-feedback buffer vertices.
BuiltIn
Apply to a variable or a member of a structure.
Indicates which built-in variable the entity represents.
FuncParamAttr
Indicates a function return value or parameter
attribute.
FP Rounding Mode
Indicates a floating-point rounding mode.
FP Fast Math Mode
Indicates a floating-point fast math flag.
Linkage Attributes
Associate linkage attributes to values. Only valid on
OpFunction or global (module scope) OpVariable.
See linkage.
SpecId
Apply to a specialization constant. Forms the API
linkage for setting a specialized value. See
specialization.
Extra Operands
Literal Number
Byte offset
Shader
Literal Number
Declared alignment
Literal Number
XFB Buffer number
Shader
Literal Number
Stride
Shader
Literal Number
See BuiltIn
Kernel
Kernel
FP Rounding Mode
floating-point rounding mode
FP Fast Math Mode
fast-math mode
Linkage Type
linkage type
Literal String
name
Kernel
Link
Shader
Literal Number
Specialization Constant ID
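As an illustration of the SaturatedConversion decoration listed in the table above, the sketch below clamps an out-of-range floating-point value to the nearest representable value of an integer result type and converts NaN to 0. The function name and parameters are hypothetical, and truncation toward zero for in-range values is an assumption of this example (rounding is governed separately, for instance by an FP Rounding Mode decoration).

```python
import math


def saturated_convert(value: float, bits: int, signed: bool) -> int:
    """Convert a float to an integer of the given width, saturating at the range ends."""
    if math.isnan(value):
        return 0  # NaN is converted to 0
    lowest = -(1 << (bits - 1)) if signed else 0
    highest = (1 << (bits - 1)) - 1 if signed else (1 << bits) - 1
    if value <= lowest:
        return lowest
    if value >= highest:
        return highest
    return int(value)  # assumption: in-range values truncate toward zero
```

With an 8-bit unsigned result type, 300.0 saturates to 255 and -5.0 saturates to 0.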
BuiltIn
Position
PointSize
ClipVertex
ClipDistance
CullDistance
VertexId
InstanceId
Required
Capability
Shader
Shader
Shader
Shader
Shader
Shader
Shader
BuiltIn
3.18
PrimitiveId
InvocationId
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
Layer
ViewportIndex
TessLevelOuter
TessLevelInner
TessCoord
PatchVertices
FragCoord
PointCoord
FrontFacing
SampleId
SamplePosition
SampleMask
FragColor
FragDepth
HelperInvocation
NumWorkgroups
WorkgroupSize
WorkgroupId
LocalInvocationId
GlobalInvocationId
LocalInvocationIndex
WorkDim
GlobalSize
EnqueuedWorkgroupSize
GlobalOffset
GlobalLinearId
WorkgroupLinearId
SubgroupSize
SubgroupMaxSize
NumSubgroups
NumEnqueuedSubgroups
SubgroupId
SubgroupLocalInvocationId
Required
Capability
Geom
Tess
Geom
Tess
Geom
Geom
Tess
Tess
Tess
Tess
Shader
Shader
Shader
Shader
Shader
Shader
Shader
Shader
Shader
Shader
Kernel
Kernel
Kernel
Kernel
Kernel
Kernel
Kernel
Kernel
Kernel
Kernel
Kernel
Kernel
Selection Control
This value is a mask; it can be formed by combining the bits from multiple rows in the table below.
Used by OpSelectionMerge.
Selection Control
0x0    None
0x1    Flatten        Strong request, to the extent possible, to remove the flow control for this selection.
0x2    DontFlatten    Strong request, to the extent possible, to keep this selection as flow control.
3.19
Loop Control
This value is a mask; it can be formed by combining the bits from multiple rows in the table below.
Used by OpLoopMerge.
Loop Control
0x0    None
0x1    Unroll        Strong request, to the extent possible, to unroll or unwind this loop.
0x2    DontUnroll    Strong request, to the extent possible, to keep this loop as a loop, without unrolling.
3.20
Function Control
This value is a mask; it can be formed by combining the bits from multiple rows in the table below.
Used by OpFunction.
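Forming and decoding such a mask can be sketched with the Function Control values from the table in this section (None 0x0, Inline 0x1, DontInline 0x2, Pure 0x4, Const 0x8). The Python member names and the decode helper are hypothetical; only the numeric values come from the table.

```python
from enum import IntFlag


class FunctionControl(IntFlag):
    NONE = 0x0
    INLINE = 0x1
    DONT_INLINE = 0x2
    PURE = 0x4
    CONST = 0x8


def decode(word):
    """List the names of the Function Control bits set in a mask word."""
    return [f.name for f in FunctionControl if f.value and (word & f.value)]


# Combine bits from multiple rows of the table into one operand value.
mask = FunctionControl.INLINE | FunctionControl.PURE
```

Here mask has the numeric value 0x5, and decode(0x5) lists INLINE and PURE.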
Function Control
0x0    None
0x1    Inline        Strong request, to the extent possible, to inline the function.
0x2    DontInline    Strong request, to the extent possible, to not inline the function.
0x4    Pure          Compiler can assume this function has no side effects, but might read global memory or read through dereferenced function parameters. Always computes the same result for the same argument values.
0x8    Const         Compiler can assume this function has no side effects, and will not access global memory or dereference function parameters. Always computes the same result for the same argument values.
3.21
Memory Semantics
OpAtomicIIncrement
OpAtomicIDecrement
OpAtomicIAdd
OpAtomicISub
OpAtomicUMin
OpAtomicUMax
OpAtomicAnd
OpAtomicOr
OpAtomicXor
OpAtomicIMin
OpAtomicIMax
Memory Semantics
0x0      None
0x1      Relaxed                  TBD
0x2      SequentiallyConsistent   All observers will see this memory access in the same order with respect to other sequentially-consistent memory accesses from this invocation.
0x4      Acquire                  All memory operations provided in program order after this memory operation will execute after this memory operation.
0x8      Release                  All memory operations provided in program order before this memory operation will execute before this memory operation.
0x10     UniformMemory            Filter the memory operations being constrained to just those accessing Uniform Storage Class memory.
0x20     SubgroupMemory           The memory semantics only have to be correct with respect to this invocation's subgroup memory.
0x40     WorkgroupLocalMemory     The memory semantics only have to be correct with respect to this invocation's local workgroup memory.
0x80     WorkgroupGlobalMemory    The memory semantics only have to be correct with respect to this invocation's global workgroup memory.
0x100    AtomicCounterMemory      Filter the memory operations being constrained to just those accessing AtomicCounter Storage Class memory.
0x200    ImageMemory              Filter the memory operations being constrained to just those accessing images (see OpTypeSampler Content).
3.22
Memory Access
This value is a mask; it can be formed by combining the bits from multiple rows in the table below.
Memory Access
0x0    None
0x1    Volatile    This access cannot be optimized away; it has to be executed.
0x2    Aligned     This access has a known alignment, provided as a literal in the next operand.
3.23
Execution Scope
OpGroupSMin
OpGroupFMax
OpGroupUMax
OpGroupSMax
OpGroupReserveReadPipePackets
OpGroupReserveWritePipePackets
OpGroupCommitReadPipe
OpGroupCommitWritePipe
OpAtomicIMin
OpAtomicIMax
2
3
3.24
Execution Scope
CrossDevice
Everything executing on all the execution devices in the
system.
Device
Everything executing on the device executing this
invocation.
Workgroup
All invocations for the invoking workgroup.
Subgroup
All invocations in the currently executing subgroup.
Group Operation
Reduce
Returns the result of a reduction operation over the values
of X specified by the work-items within a
workgroup.
Required
Capability
Kernel
Group Operation
1
3.25
InclusiveScan
The inclusive scan performs a binary operation op with an
identity I over n elements [a0, a1, ..., a(n-1)] (where n is
the size of the workgroup) and returns [a0, (a0 op a1), ...,
(a0 op a1 op ... op a(n-1))].
ExclusiveScan
The exclusive scan performs a binary operation op with an
identity I over n elements [a0, a1, ..., a(n-1)] (where n is
the size of the workgroup) and returns [I, a0, (a0 op a1), ...,
(a0 op a1 op ... op a(n-2))].
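The two scan definitions can be sketched sequentially as follows. This is only a reference model of the results, not of how a workgroup would compute them in parallel; the function names are hypothetical.

```python
import operator
from itertools import accumulate


def inclusive_scan(values, op):
    """Element i is a0 op a1 op ... op ai."""
    return list(accumulate(values, op))


def exclusive_scan(values, op, identity):
    """Element 0 is the identity; element i is a0 op ... op a(i-1)."""
    result = []
    running = identity
    for value in values:
        result.append(running)
        running = op(running, value)
    return result


values = [1, 2, 3, 4]
print(inclusive_scan(values, operator.add))     # [1, 3, 6, 10]
print(exclusive_scan(values, operator.add, 0))  # [0, 1, 3, 6]
```

The last element of the inclusive scan equals the Reduce result for the same operation.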
Required
Capability
Kernel
Kernel
NoWait
Indicates that the enqueued kernels do not need to wait
for the parent kernel to finish execution before they begin
execution.
WaitKernel
Indicates that all work-items of the parent kernel must
finish executing and all immediate side effects committed
before the enqueued child kernel may begin execution.
Note: "Immediate" excludes side effects resulting from
child kernels. Immediate side effects include stores to
global memory and pipe reads and writes.
WaitWorkGroup
Indicates that the enqueued kernels wait only for the
workgroup that enqueued the kernels to finish before they
begin execution.
Required
Capability
Kernel
Kernel
Kernel
3.26
None
Required
Capability
3.27
CmdExecTime
Indicates that the profiling info queried is the execution
time.
Required
Capability
Kernel
Instructions
Required
Capabilities
(when needed)
Instruction description.
Word Count is the high-order 16 bits of word 0 of the
instruction, holding its total WordCount. If the instruction
takes a variable number of operands, Word Count will also say
"+ variable", after stating the minimum size of the instruction.
Opcode is the low-order 16 bits of word 0 of the instruction,
holding its opcode enumerant.
Results, when present, are any Result <id> or Result Type
created by the instruction. Each one is always 32 bits.
Operands, when present, are any literals, other instructions'
Result <id>s, etc., consumed by the instruction. Each one is
always 32 bits.
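The layout of word 0 described above can be sketched as a pair of helpers that pack and unpack the Word Count and Opcode fields. The function names are hypothetical; the OpTypeInt numbers used in the example (word count 4, opcode 10) are taken from its entry later in this section.

```python
def pack_word0(word_count, opcode):
    """Build word 0: Word Count in the high 16 bits, Opcode in the low 16 bits."""
    if not 0 <= word_count <= 0xFFFF:
        raise ValueError("word count must fit in 16 bits")
    if not 0 <= opcode <= 0xFFFF:
        raise ValueError("opcode must fit in 16 bits")
    return (word_count << 16) | opcode


def unpack_word0(word):
    """Return (word_count, opcode) from word 0 of an instruction."""
    return (word >> 16) & 0xFFFF, word & 0xFFFF


# OpTypeInt: total word count 4, opcode 10.
word0 = pack_word0(4, 10)
print(hex(word0), unpack_word0(word0))
```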
Word Count
Opcode
Results
3.27.1
Operands
Miscellaneous Instructions
OpNop
Use is invalid.
1
OpUndef
Make an intermediate object with no initialization.
Result Type is the type of object to make.
3
45
<id>
Result <id>
Result Type
3.27.2
OpSource
Document what source language this module was translated from. This has no
semantic impact and can safely be removed from a module.
Version is the version of the source language.
Source Language
Literal Number
Version
OpSourceExtension
Document an extension to the source language. This has no semantic impact and can safely be removed
from a module.
Extension is a string describing a source-language extension. Its form depends on how the
source language describes extensions.
1 + variable
2
Literal String
Extension
OpName
Name a Result <id>. This has no semantic impact and can safely be removed from a module.
Target is the Result <id> to name. It can be the Result <id> of any other instruction; a variable, function, type,
intermediate result, etc.
Name is the string to name <id> with.
2 + variable
54
<id>
Target
Literal String
Name
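A possible encoding of an OpName instruction is sketched below. The packing of the Literal String is an assumption not stated in this section: the string is taken to be UTF-8, nul-terminated, packed four octets per word with the first octet in the low-order bits, and zero-padded to a word boundary. The helper names are hypothetical.

```python
import struct

OP_NAME = 54  # opcode listed for OpName above


def encode_literal_string(text):
    """Pack a string into 32-bit words (assumed: UTF-8, nul-terminated, little-endian)."""
    data = text.encode("utf-8") + b"\x00"
    data += b"\x00" * (-len(data) % 4)  # zero-pad to a whole number of words
    return list(struct.unpack("<%dI" % (len(data) // 4), data))


def encode_op_name(target_id, name):
    """Return the words of an OpName instruction naming target_id."""
    operands = [target_id] + encode_literal_string(name)
    word_count = 1 + len(operands)  # word 0 plus all operand words
    return [(word_count << 16) | OP_NAME] + operands


print([hex(w) for w in encode_op_name(4, "main")])
```

Because the string always occupies at least one word (for its terminator), the shortest OpName under this assumption is three words long.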
OpMemberName
Name a member of a structure type. This has no semantic impact and can safely be removed from a module.
Type is the <id> from an OpTypeStruct instruction.
Member is the number of the member to name in the structure. The first member is member 0, the next is member 1, . . .
Name is the string to name the member with.
3+
55
<id>
variable
Type
Literal Number
Member
Literal String
Name
OpString
Name a string for use with other debug instructions (see OpLine). This has no semantic impact and
can safely be removed from a module.
String is the literal string being assigned a Result <id>. It has no result type and no storage.
2+
56
Result <id>
Literal String
variable
String
OpLine
Add source-level location information. This has no semantic impact and can safely be removed from a module.
Target is the Result <id> to locate. It can be the Result <id> of any other instruction; a variable, function, type,
intermediate result, etc.
File is the <id> from an OpString instruction and is the source-level file name.
Line is the source-level line number.
Column is the source-level column number.
5
57
<id>
Target
3.27.3
<id>
File
Literal Number
Line
Literal Number
Column
Annotation Instructions
OpDecorationGroup
A collector of decorations from OpDecorate instructions. All such instructions must precede this instruction. Subsequent
OpGroupDecorate and OpGroupMemberDecorate instructions can consume the Result <id> to apply multiple
decorations to multiple target <id>s. Those are the only instructions allowed to consume the Result <id>.
2
49
Result <id>
OpDecorate
Add a decoration to another <id>.
Target is the <id> to decorate. It can potentially be any <id> that is a forward reference. A set of decorations can be
grouped together by having multiple OpDecorate instructions target the same OpDecorationGroup instruction.
3+
50
<id>
Decoration
literal, literal, . . .
variable
Target
See Decoration.
OpMemberDecorate
Add a decoration to a member of a structure type.
Structure type is the <id> of a type from OpTypeStruct.
Member is the number of the member to decorate in the structure. The first member is member 0, the next
is member 1, . . .
4 + variable
51
<id>
Structure type
Literal Number
Member
Decoration
literal, literal, . . .
See Decoration.
OpGroupDecorate
Add a group of decorations to another <id>.
Decoration group is the <id> of an OpDecorationGroup instruction.
Target, . . . are the target <id>s to decorate with the groups of decorations.
2+
52
<id>
<id>, <id>, . . .
variable
Decoration group
Target, Target, . . .
OpGroupMemberDecorate
Add a decoration to a member of a structure type.
Decoration group is the <id> of an OpDecorationGroup instruction.
Target, . . . are the target <id>s to decorate with the groups of decorations.
2+
53
<id>
<id>, <id>, . . .
variable
Decoration group
Target, Target, . . .
3.27.4
Extension Instructions
OpExtension
Declare use of an extension to SPIR-V. This allows
validation of additional instructions, tokens, semantics, etc.
Name is the extension's name string.
1+
3
Literal String
variable
Name
OpExtInstImport
Import an extended set of instructions. It can be later referenced by the Result <id>.
Name is the extended instruction set's name string.
See Extended Instruction Sets for more information.
2+
4
Result <id>
variable
Literal String
Name
OpExtInst
Execute an instruction in an imported set of extended instructions.
Set is the result of an OpExtInstImport instruction.
Instruction is the enumerant of the instruction to execute within the extended instruction Set.
Operand 1, . . . are the operands to the extended instruction.
5+
variable
3.27.5
44
Result <id>
<id>
Result Type
<id>
Set
Literal Number
Instruction
<id>, <id>, . . .
Operand 1,
Operand 2,
...
Mode-Setting Instructions
OpMemoryModel
Set addressing model and memory model for the entire module.
Addressing Model selects the module's addressing model, see Addressing Model.
Memory Model selects the module's memory model, see Memory Model.
3
5
Addressing Model
Memory Model
OpEntryPoint
Declare an entry point and its execution model.
Execution Model is the execution model for the entry point and its static call tree. See Execution Model.
Entry Point must be the Result <id> of an OpFunction instruction.
3
6
Execution Model
<id>
Entry Point
OpExecutionMode
Declare an execution mode for an entry point.
Entry Point must be the Entry Point <id> operand of an OpEntryPoint instruction.
Mode is the execution mode. See Execution Mode.
3 + variable
7
<id>
Entry Point
Execution Mode
Mode
literal, literal, . . .
See Execution Mode
OpCompileFlag
Capability:
Kernel
Add a compilation Flag.
1 + variable
218
Literal String
Flag
3.27.6
Type-Declaration Instructions
OpTypeVoid
Declare the void type.
Result <id> is the <id> of the new void type.
2
8
Result <id>
OpTypeBool
Declare the Boolean type. Values of this type can only be either true or false. There is no physical size or bit pattern
defined for these values. If they are stored (in conjunction with OpVariable), they can only be used with logical addressing
operations, not physical, and only with non-externally visible shader storage classes: WorkgroupLocal,
WorkgroupGlobal, PrivateGlobal, and Function.
Result <id> is the <id> of the new Boolean type.
2
9
Result <id>
OpTypeInt
Declare a new integer type.
Width specifies how many bits wide the type is. The bit pattern of a signed integer value is two's complement.
Signedness specifies whether there are signed semantics to preserve or validate.
0 indicates unsigned, or no signedness semantics
1 indicates signed semantics.
In all cases, the type of operation of an instruction comes from the instruction's opcode, not the signedness of the operands.
Result <id> is the <id> of the new integer type.
4
10
Result <id>
Literal Number
Width
Literal Number
Signedness
OpTypeFloat
Declare a new floating-point type.
Width specifies how many bits wide the type is. The bit pattern of a floating-point value is as
described by the IEEE 754 standard.
Result <id> is the <id> of the new floating-point type.
3
11
Result <id>
Literal Number
Width
OpTypeVector
Declare a new vector type.
Component type is the type of each component in the resulting type.
Component count is the number of components in the resulting type. It must be at least 2.
Result <id> is the <id> of the new vector type.
12
Result <id>
Literal Number
Component count
<id>
Component type
OpTypeMatrix
Capability:
Matrix
Literal Number
Column count
<id>
Column type
OpTypeSampler
Declare a new sampler type. Consumed, for example, by OpTextureSample. This type is opaque: values of this type have
no defined physical size or bit pattern.
Sampled Type is a scalar type, of the type of the components resulting from sampling or loading through this sampler.
Dim is the texture dimensionality.
Content must be one of the following indicated values:
0 indicates a texture, no filter (no sampling state)
1 indicates an image
2 indicates both a texture and filter (sampling state), see OpTypeFilter
Arrayed must be one of the following indicated values:
0 indicates non-arrayed content
1 indicates arrayed content
Compare must be one of the following indicated values:
0 indicates depth comparisons are not done
1 indicates depth comparisons are done
MS is multisampled and must be one of the following indicated values:
0 indicates single-sampled content
1 indicates multisampled content
Qualifier is an image access qualifier. See Access Qualifier.
Result <id> is the <id> of the new sampler type.
8 + variable
14
Result <id>
<id>
Sampled Type
Dim
Literal Number
Content
Literal Number
Arrayed
Literal Number
Compare
Literal Number
MS
Optional <id>
Qualifier
OpTypeFilter
Declare the filter type. Consumed by OpSampler. This
type is opaque: values of this type have no defined
physical size or bit pattern.
15
Result <id>
OpTypeArray
Declare a new array type: a dynamically-indexable ordered aggregate of elements all having the same type.
Element Type is the type of each element in the array.
Length is the number of elements in the array. It must come from a constant instruction of an
Integer-type scalar whose value is at least 1.
Result <id> is the <id> of the new array type.
4
16
Result <id>
<id>
Element type
OpTypeRuntimeArray
<id>
Length
Capability:
Shader
Declare a new run-time array type. Its length is not known at compile time.
Element type is the type of each element in the array. See OpArrayLength for
getting the Length of an array of this type.
Objects of this type can only be created with OpVariable using the Uniform
Storage Class.
Result <id> is the <id> of the new run-time array type.
3
17
Result <id>
<id>
Element type
OpTypeStruct
Declare a new structure type: an aggregate of heterogeneous members.
Member N type is the type of member N of the structure. The first member is member 0, the next is
member 1, . . .
Result <id> is the <id> of the new structure type.
2+
18
Result <id>
variable
OpTypeOpaque
<id>, <id>, . . .
Member 0 type,
member 1 type,
...
Capability:
Kernel
Literal String
The name of the
opaque type.
OpTypePointer
Declare a new pointer type.
Storage Class is the Storage Class of the memory holding the object pointed to.
Type is the type of the object pointed to.
Result <id> is the <id> of the new pointer type.
4
20
Result <id>
Storage Class
<id>
Type
OpTypeFunction
Declare a new function type. OpFunction and OpFunctionDecl will use this to declare the return type and parameter types
of a function.
Return Type is the type of the return value of functions of this type. If the function has no return value, Return Type should
be from OpTypeVoid.
Parameter N Type is the type <id> of the type of parameter N.
Result <id> is the <id> of the new function type.
3+
21
Result <id>
variable
OpTypeEvent
<id>
Return Type
Capability:
Kernel
Declare an OpenCL
event object.
Result <id> is the <id>
of the new event type.
2
22
OpTypeDeviceEvent
Result <id>
Capability:
Kernel
Declare an OpenCL
device-side event
object.
Result <id> is the
<id> of the new
device-side-event type.
2
23
Result <id>
<id>, <id>, . . .
Parameter 0 Type,
Parameter 1 Type,
...
OpTypeReserveId
Capability:
Kernel
Declare an OpenCL
reservation id object.
Result <id> is the
<id> of the new
reservation type.
2
24
OpTypeQueue
Result <id>
Capability:
Kernel
Declare an OpenCL
queue object.
Result <id> is the <id>
of the new queue type.
2
25
Result <id>
OpTypePipe
Capability:
Kernel
3.27.7
Access
Qualifier
Qualifier
Constant-Creation Instructions
OpConstantTrue
Declare a true Boolean-type scalar constant.
Result Type must be the scalar Boolean type.
3
27
<id>
Result <id>
Result Type
OpConstantFalse
Declare a false Boolean-type scalar constant.
Result Type must be the scalar Boolean type.
3
28
<id>
Result <id>
Result Type
OpConstant
Declare a new Integer-type or Floating-point-type scalar constant.
Value is the bit pattern for the constant. Types 32 bits wide or smaller take one word. Larger types take multiple words,
with low-order words appearing first.
Result Type must be a scalar Integer type or Floating-point type.
3+
29
<id>
Result <id>
variable
Result Type
literal, literal, . . .
Value
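The word ordering of the Value operand can be sketched as below: the constant's bit pattern is split into 32-bit words with the low-order word first. The helper names are hypothetical, and the caller is assumed to supply a non-negative bit pattern (for a negative signed integer, its two's complement pattern masked to the type's width).

```python
import struct


def constant_value_words(bit_pattern, width):
    """Split a constant's bit pattern into 32-bit words, low-order word first."""
    if width <= 0:
        raise ValueError("width must be positive")
    if bit_pattern < 0 or bit_pattern >> width:
        raise ValueError("bit pattern does not fit in the declared width")
    word_count = (width + 31) // 32  # types of 32 bits or fewer take one word
    return [(bit_pattern >> (32 * i)) & 0xFFFFFFFF for i in range(word_count)]


def double_bit_pattern(value):
    """IEEE 754 binary64 bit pattern of a Python float, as an integer."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


print([hex(w) for w in constant_value_words(0x1122334455667788, 64)])
print([hex(w) for w in constant_value_words(double_bit_pattern(1.0), 64)])
```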
OpConstantComposite
Declare a new composite constant.
Constituents will become members of a structure, or elements of an array, or components of a vector, or columns of a
matrix. There must be exactly one Constituent for each top-level member/element/component/column of the result. The
Constituents must appear in the order needed by the definition of the type of the result. The Constituents must be the <id>
of other constant declarations.
Result Type must be a composite type, whose top-level members/elements/components/columns have the same type as the
types of the operands.
3+
30
<id>
Result <id>
<id>, <id>, . . .
variable
Result Type
Constituents
OpConstantSampler
Capability:
Kernel
Literal Number
Mode
OpConstantNullPointer
Declare a new null pointer constant.
3
32
<id>
Result Type
OpConstantNullObject
Declare a new null object constant.
The object can be a queue, event or
reservation id.
3
33
<id>
Result Type
Literal Number
Param
Capability:
Addr
Result <id>
Capability:
Kernel
Result <id>
Literal Number
Filter
OpSpecConstantTrue
Capability:
Shader
Result <id>
<id>
Result Type
OpSpecConstantFalse
Capability:
Shader
Result <id>
<id>
Result Type
OpSpecConstant
Capability:
Shader
<id>
Result Type
Result <id>
literal, literal, . . .
Value
OpSpecConstantComposite
Capability:
Shader
<id>
Result Type
Result <id>
<id>, <id>, . . .
Constituents
Memory Instructions
OpVariable
Allocate an object in memory, resulting in a pointer to it, which can be used with OpLoad and OpStore.
Storage Class is the kind of memory holding the object.
Initializer is optional. If Initializer is present, it will be the initial value of the variable's memory content. Initializer must
be an <id> from a constant instruction. Initializer must have the same type as the type pointed to by Result Type.
Result Type is a type from OpTypePointer, where the type pointed to is the type of object in memory.
4 + variable
38
<id>
Result Type
Result <id>
Storage Class
Optional <id>
Initializer
OpVariableArray
Capability:
Addr
Allocate N objects sequentially in memory, resulting in a pointer to the first such object.
Storage Class is the kind of memory holding the object.
N is the number of objects to allocate.
Result Type is a type from OpTypePointer whose type pointed to is the type of one of the N
objects allocated in memory.
Note: This is not the same thing as allocating a single object that is an array.
5
39
<id>
Result <id>
Storage Class
Result Type
<id>
N
OpLoad
Load through a pointer.
Pointer is the pointer to load through. It must have a type of OpTypePointer whose operand is the same as Result
Type.
Memory Access must be a Memory Access literal. See Memory Access for more detail.
4 + variable
46
<id>
Result Type
Result <id>
<id>
Pointer
literal, literal, . . .
Memory Access
OpStore
Store through a pointer.
Pointer is the pointer to store through. It must have a type of OpTypePointer whose operand is the same as the type of
Object.
Object is the object to store.
Memory Access must be a Memory Access literal. See Memory Access for more detail.
3+
47
<id>
<id>
variable
Pointer
Object
literal, literal, . . .
Memory Access
OpCopyMemory
Copy from the memory pointed to by Source to the memory pointed to by Target. Both operands must be non-void pointers
of the same type. Matching storage class is not required. The amount of memory copied is the size of the type pointed to.
Memory Access must be a Memory Access literal. See Memory Access for more detail.
3+
65
<id>
<id>
variable
Target
Source
OpCopyMemorySized
literal, literal, . . .
Memory Access
Capability:
Addr
literal, literal,
...
Memory Access
OpAccessChain
Create a pointer into a composite object that can be used with OpLoad and OpStore.
Base must be a pointer type, pointing to the base of the object.
Indexes walk the type hierarchy to the desired depth, potentially down to scalar granularity. The type of the pointer created
will be to the type reached by walking the type hierarchy down to the last provided index.
The storage class of the pointer created will be the same as the storage class of the base operand.
4 + variable
93
<id>
Result Type
Result <id>
<id>
Base
<id>, <id>, . . .
Indexes
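The way Indexes walk the type hierarchy can be sketched with a simplified, hypothetical type model: scalars are strings, and composites are tuples such as ("struct", [member types]), ("array", element type, length) or ("vector", component type, count). Real modules supply the indexes as <id>s; plain integers are used here for illustration only.

```python
def walk_type(base_type, indexes):
    """Return the type reached by applying each index in turn to base_type."""
    current = base_type
    for index in indexes:
        kind = current[0] if isinstance(current, tuple) else current
        if kind == "struct":
            current = current[1][index]  # the index selects a member
        elif kind in ("array", "vector"):
            current = current[1]         # every element has the same type
        else:
            raise TypeError("cannot index into scalar type %r" % (current,))
    return current


vec4 = ("vector", "float", 4)
block = ("struct", ["int", ("array", vec4, 8)])

# Member 1 (the array), element 3 (a vec4), component 2 (a float).
print(walk_type(block, [1, 3, 2]))
```

In this model, the pointer created by the access chain would point to the returned type, in the same storage class as the base.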
OpInBoundsAccessChain
Has the same semantics as OpAccessChain, with the addition that the resulting pointer
is known to point within the base object.
4 + variable
94
<id>
Result Type
Result <id>
<id>
Base
<id>, <id>, . . .
Indexes
OpArrayLength
Capability:
Shader
Literal Number
Array member
OpImagePointer
Form a pointer to a texel of an image. Use of such a pointer is limited to atomic operations.
Image is a pointer to a variable of type of OpTypeSampler.
Coordinate and Sample specify which texel and sample within the image to form an address of.
TBD. This requires an Image storage class to be added.
6
190
<id>
Result <id>
Result Type
<id>
Image
OpGenericPtrMemSemantics
Returns a valid Memory Semantics value for ptr. ptr
must point to Generic.
Result Type must be a 32-bit wide OpTypeInt value.
<id>
Coordinate
Capability:
Kernel
<id>
Sample
3.27.9
233
<id>
Result Type
Result <id>
<id>
ptr
Function Instructions
OpFunction
Add a function. This instruction must be immediately followed by one OpFunctionParameter instruction per formal
parameter of this function. This function's body or declaration will terminate with the next OpFunctionEnd instruction.
Function Type is the result of an OpTypeFunction, which declares the types of the return value and parameters of the
function.
Result Type must be the same as the Return Type declared in Function Type.
5
40
<id>
Result <id>
Function Control
Result Type
<id>
Function Type
OpFunctionParameter
Declare the <id> for a formal parameter belonging to the current function.
This instruction must immediately follow an OpFunction or OpFunctionParameter instruction. The order of contiguous
OpFunctionParameter instructions is the same order arguments will be listed in an OpFunctionCall instruction to this
function. It is also the same order in which Parameter Type operands are listed in the OpTypeFunction of the Function
Type operand for this function's OpFunction instruction.
Result Type for all the OpFunctionParameter instructions for a function must be the same as, in order, the Parameter
Type operands listed in the OpTypeFunction of the Function Type operand for this function's OpFunction instruction.
3
41
<id>
Result <id>
Result Type
OpFunctionEnd
Last instruction of a function.
1
42
OpFunctionCall
Call a function.
Function is the <id> of an OpFunction instruction. This could be a forward reference.
Argument N is the <id> of the object to copy to parameter N of Function.
Result Type is the type of the return value of the function.
Note: A forward call is possible because there is no missing type information: Result Type must match the Return Type of
the function, and the calling argument types must match the formal parameter types.
4 + variable
43
<id>
Result Type
Result <id>
<id>
Function
<id>, <id>, . . .
Argument 0,
Argument 1,
...
3.27.10
Texture Instructions
OpSampler
Create a sampler containing both a filter and texture.
Sampler must be an object whose type is from an OpTypeSampler. Its type must have its Content operand set to 0,
indicating a texture with no filter.
Filter must be an object whose type is OpTypeFilter.
Result Type must be an OpTypeSampler whose Sampled Type, Dimensionality, Arrayed, Comparison, and Multisampled
operands all equal those of this instruction's Sampler operand. Further, the Result Type must have its Content operand set
to 2, indicating both a texture and filter are present.
5
67
<id>
Result <id>
<id>
<id>
Result Type
Sampler
Filter
OpTextureSample
Capability:
Shader
OpTextureSampleDref
Sample a cube-map-array texture with depth comparison using an implicit level of detail.
Result Type must be a scalar of the same type as Sampled Type of Sampler's type.
Sampler must be an object of a type made by OpTypeSampler. Its type must have its Content operand
set to 2, indicating both a texture and a filter. It must be for a Cube-arrayed depth-comparison type.
Coordinate is a vector of size 4 containing (u, v, w, array layer).
Dref is the depth-comparison reference value.
This instruction is only allowed under the Fragment Execution Model. In addition, it consumes an
implicit derivative that can be affected by code motion.
Optional <id>
[Bias]
Capability:
Shader
69
<id>
Result Type
Result <id>
<id>
Sampler
<id>
Coordinate
OpTextureSampleLod
<id>
Dref
Capability:
Shader
<id>
Coordinate
OpTextureSampleProj
<id>
Level of Detail
Capability:
Shader
Optional <id>
[Bias]
OpTextureSampleGrad
Capability:
Shader
OpTextureSampleOffset
<id>
dy
Capability:
Shader
Sample a texture with an offset from a coordinate using an implicit level of detail.
Result Type's component type must be the same as Sampled Type of Sampler's type. Result Type must be
scalar if the Sampler's type sets depth-comparison, and must be a vector of four components if the
Sampler's type does not set depth-comparison.
Sampler must be an object of a type made by OpTypeSampler. Its type must have its Content operand
set to 2, indicating both a texture and a filter.
Coordinate is a floating-point scalar or vector containing (u[, v] . . . [, array layer]) as needed by the
definition of Sampler.
Offset is added to (u, v, w) before texel lookup. It must be an <id> of an integer-based constant
instruction of scalar or vector type. It is a compile-time error if these fall outside a target-dependent
allowed range. The number of components in Offset must equal the number of components in
Coordinate, minus the array layer component, if present.
Bias is an optional operand. If present, it is used as a bias to the implicit level of detail.
This instruction is only allowed under the Fragment Execution Model. In addition, it consumes an
implicit derivative that can be affected by code motion.
6 + variable
73
<id>
Result Type
Result <id>
<id>
Sampler
<id>
Coordinate
<id>
Offset
Optional <id>
[Bias]
OpTextureSampleProjLod
Capability:
Shader
<id>
Coordinate
<id>
Level of Detail
OpTextureSampleProjGrad
Capability:
Shader
<id>
dy
OpTextureSampleLodOffset
Capability:
Shader
Sample a texture with explicit level of detail using an offset from a coordinate.
Result Type's component type must be the same as Sampled Type of Sampler's type. Result Type must be
scalar if the Sampler's type sets depth-comparison, and must be a vector of four components if the
Sampler's type does not set depth-comparison.
Sampler must be an object of a type made by OpTypeSampler. Its type must have its Content operand
set to 2, indicating both a texture and a filter.
Coordinate is a floating-point scalar or vector containing (u[, v] . . . [, array layer]) as needed by the
definition of Sampler.
Level of Detail explicitly controls the level of detail used when sampling.
Offset is added to (u, v, w) before texel lookup. It must be an <id> of an integer-based constant
instruction of scalar or vector type. It is a compile-time error if these fall outside a target-dependent
allowed range. The number of components in Offset must equal the number of components in
Coordinate, minus the array layer component, if present.
7
76
<id>
Result <id>
<id>
<id>
<id>
Result Type
Sampler
Coordinate
Level of Detail
OpTextureSampleProjOffset
<id>
Offset
Capability:
Shader
Sample a texture with an offset from a projective coordinate using an implicit level of detail.
Result Type's component type must be the same as Sampled Type of Sampler's type. Result Type must be
scalar if the Sampler's type sets depth-comparison, and must be a vector of four components if the
Sampler's type does not set depth-comparison.
Sampler must be an object of a type made by OpTypeSampler. Its type must have its Content operand
set to 2, indicating both a texture and a filter.
Coordinate is a floating-point vector of four components containing (u [, v] [, Dref], q) or (u [, v] [, w],
q), as needed by the definition of Sampler, with the q component consumed for the projective division.
That is, the actual sample coordinate will be (u/q [, v/q] [, Dref/q]) or (u/q [, v/q] [, w/q]), as needed by the
definition of Sampler.
Offset is added to (u, v, w) before texel lookup. It must be an <id> of an integer-based constant
instruction of scalar or vector type. It is a compile-time error if these fall outside a target-dependent
allowed range. The number of components in Offset must equal the number of components in
Coordinate, minus the array layer component, if present.
Bias is an optional operand. If present, it is used as a bias to the implicit level of detail.
This instruction is only allowed under the Fragment Execution Model. In addition, it consumes an
implicit derivative that can be affected by code motion.
6 + variable
77
<id>
Result Type
Result <id>
<id>
Sampler
<id>
Coordinate
<id>
Offset
Optional <id>
[Bias]
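The projective division described for Coordinate can be sketched as follows. The sketch assumes the caller passes only the components the Sampler actually needs, with q as the last one; the function name is hypothetical, and no handling of q = 0 is shown.

```python
def project_coordinate(coordinate):
    """Divide every leading component by the trailing q component."""
    *components, q = coordinate
    return tuple(component / q for component in components)


# (u, v, q) -> (u/q, v/q)
print(project_coordinate((2.0, 4.0, 2.0)))
# (u, v, w, q) -> (u/q, v/q, w/q)
print(project_coordinate((1.0, 2.0, 3.0, 4.0)))
```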
OpTextureSampleGradOffset
Capability:
Shader
OpTextureSampleProjLodOffset
<id>
Offset
Capability:
Shader
Sample a texture with an offset from a projective coordinate and an explicit level of detail.
Result Type's component type must be the same as Sampled Type of Sampler's type. Result Type must be
scalar if the Sampler's type sets depth-comparison, and must be a vector of four components if the
Sampler's type does not set depth-comparison.
Sampler must be an object of a type made by OpTypeSampler. Its type must have its Content operand
set to 2, indicating both a texture and a filter.
Coordinate is a floating-point vector of four components containing (u [, v] [, Dref], q) or (u [, v] [, w],
q), as needed by the definition of Sampler, with the q component consumed for the projective division.
That is, the actual sample coordinate will be (u/q [, v/q] [, Dref/q]) or (u/q [, v/q] [, w/q]), as needed by the
definition of Sampler.
Level of Detail explicitly controls the level of detail used when sampling.
Offset is added to (u, v, w) before texel lookup. It must be an <id> of an integer-based constant
instruction of scalar or vector type. It is a compile-time error if these fall outside a target-dependent
allowed range. The number of components in Offset must equal the number of components in
Coordinate, minus the array layer component, if present.
7
79
<id>
Result <id>
<id>
<id>
<id>
Result Type
Sampler
Coordinate
Level of Detail
<id>
Offset
OpTextureSampleProjGradOffset
Capability:
Shader
Sample a texture with an offset from a projective coordinate and an explicit gradient.
Result Type's component type must be the same as Sampled Type of Sampler's type. Result Type must be
scalar if the Sampler's type sets depth-comparison, and must be a vector of four components if the
Sampler's type does not set depth-comparison.
Sampler must be an object of a type made by OpTypeSampler. Its type must have its Content operand set
to 2, indicating both a texture and a filter.
Coordinate is a floating-point vector of four components containing (u [, v] [, Dref], q) or (u [, v] [, w], q),
as needed by the definition of Sampler, with the q component consumed for the projective division. That is,
the actual sample coordinate will be (u/q [, v/q] [, Dref/q]) or (u/q [, v/q] [, w/q]), as needed by the definition
of Sampler.
dx and dy are explicit derivatives in the x and y direction to use in computing level of detail. Each is a scalar
or vector containing (du/dx[, dv/dx] [, dw/dx]) and (du/dy[, dv/dy] [, dw/dy]). The number of components of
each must equal the number of components in Coordinate, minus the array layer component, if present.
Offset is added to (u, v, w) before texel lookup. It must be an <id> of an integer-based constant instruction
of scalar or vector type. It is a compile-time error if these fall outside a target-dependent allowed range.
The number of components in Offset must equal the number of components in Coordinate, minus the array
layer component, if present.
8
80
<id>
Result <id>
<id>
<id>
<id>
<id>
Result Type
Sampler
Coordinate
dx
dy
OpTextureFetchTexelLod
<id>
Offset
Capability:
Shader
<id>
Coordinate
<id>
Level of Detail
OpTextureFetchTexelOffset
Capability:
Shader
OpTextureFetchSample
<id>
Offset
Capability:
Shader
<id>
Sampler
<id>
Coordinate
OpTextureFetchTexel
<id>
Sample
Capability:
Shader
<id>
Sampler
<id>
Element
OpTextureGather
Capability:
Shader
OpTextureGatherOffset
<id>
Component
Capability:
Shader
OpTextureGatherOffsets
Gathers the requested component from four offset sampled texels.
Result Type must be a vector of four components of the same type as Sampled Type of Sampler's type.
The result has one component per gathered texel.
Sampler must be an object of a type made by OpTypeSampler. Its type must have its Content operand
set to 2, indicating both a texture and a filter. It must have a Dimensionality of 2D or Rect.
Coordinate is a floating-point scalar or vector containing (u[, v] . . . [, array layer] [, Dref]) as needed by
the definition of Sampler.
Component is the component number that will be gathered from all four texels. It must be 0, 1, 2 or 3.
Offsets must be an <id> of a constant instruction making an array of size four of vectors of two integer
components. Each gathered texel is identified by adding one of these array elements to the (u, v)
sampled location. It is a compile-time error if this falls outside a target-dependent allowed range.
<id>
Offset
Capability:
Shader
87
<id>
Result Type
Result <id>
<id>
Sampler
<id>
Coordinate
<id>
Component
OpTextureQuerySizeLod
<id>
Offsets
Capability:
Shader
Query the dimensions of the texture for Sampler at the mipmap level given by Level of Detail.
Result Type must be an integer type scalar or vector. The number of components must be
1 for 1D Dimensionality,
2 for 2D and Cube Dimensionalities,
3 for 3D Dimensionality,
plus 1 more if the sampler type is arrayed. This vector is filled in with (width [, height] [, depth]
[, elements]) where elements is the number of layers in a texture array, or the number of cubes in
a cube-map array.
Sampler must be an object of a type made by OpTypeSampler. Its type must have its Content
operand set to 2, indicating both a texture and a filter. Sampler must have a type with
Dimensionality of 1D, 2D, 3D, or Cube. Sampler cannot have a multisampled type. See
OpTextureQuerySize for querying texture types lacking level of detail.
Level of Detail is used to compute which mipmap level to query, as described in the API
specification.
5 | 88 | <id> Result Type | Result <id> | <id> Sampler | <id> Level of Detail
OpTextureQuerySize
Capability:
Shader
Query the dimensions of the texture for Sampler, with no level of detail.
Result Type must be an integer type scalar or vector. The number of components must be
1 for Buffer Dimensionality,
2 for 2D and Rect Dimensionalities,
plus 1 more if the sampler type is arrayed. This vector is filled in with (width [, height] [,
elements]) where elements is the number of layers in a texture array.
Sampler must be an object of a type made by OpTypeSampler. Its type must have its
Content operand set to 2, indicating both a texture and a filter. Sampler must have a type
with Dimensionality of Rect or Buffer, or be multisampled 2D. Sampler cannot have a
texture with levels of detail; there is no implicit level-of-detail consumed by this
instruction. See OpTextureQuerySizeLod for querying textures having level of detail.
4 | 89 | <id> Result Type | Result <id> | <id> Sampler
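Note (non-normative): the component counts listed for OpTextureQuerySizeLod and OpTextureQuerySize can be summarized as a small lookup. The Python sketch below is illustrative only; the function name and the string spellings of the dimensionalities are assumptions made for this example and are not part of SPIR-V.

```python
# Base component counts taken from the two size-query descriptions above.
BASE_COMPONENTS = {
    "1D": 1, "2D": 2, "3D": 3, "Cube": 2,   # OpTextureQuerySizeLod
    "Rect": 2, "Buffer": 1,                 # OpTextureQuerySize (also multisampled 2D)
}

def query_size_components(dimensionality, arrayed):
    """Number of components Result Type needs: the base count, plus one if arrayed."""
    return BASE_COMPONENTS[dimensionality] + (1 if arrayed else 0)
```

For example, an arrayed 2D sampler would need a three-component result (width, height, elements).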
OpTextureQueryLod
Capability:
Shader
Query the mipmap level and the level of detail for a hypothetical sampling of Sampler at
Coordinate using an implicit level of detail.
Result Type must be a two-component floating-point type vector.
The first component of the result will contain the mipmap array layer.
The second component of the result will contain the implicit level of detail relative to the base
level.
TBD: Does this need the GLSL pseudo code for computing array layer and LoD?
Sampler must be an object of a type made by OpTypeSampler. Its type must have its Content
operand set to 2, indicating both a texture and a filter. Sampler must have a type with
Dimensionality of 1D, 2D, 3D, or Cube.
Coordinate is a floating-point scalar or vector containing (u[, v] ... [, array layer]) as needed by
the definition of Sampler.
If called on an incomplete texture, the results are undefined.
This instruction is only allowed under the Fragment Execution Model. In addition, it consumes
an implicit derivative that can be affected by code motion.
5 | 90 | <id> Result Type | Result <id> | <id> Sampler | <id> Coordinate
OpTextureQueryLevels
Capability:
Shader
OpTextureQuerySamples
<id>
Sampler
Capability:
Shader
Query the number of samples available per texel fetch in a multisample texture.
Result Type must be a scalar integer type. The result is the number of samples.
Sampler must be an object of a type made by OpTypeSampler. Its type must have its
Content operand set to 2, indicating both a texture and a filter. Sampler must have a type
with Dimensionality of 2D and be a multisample texture.
4 | 92 | <id> Result Type | Result <id> | <id> Sampler
3.27.11 Conversion Instructions
OpConvertFToU
Convert (value preserving) Float Value from floating point to unsigned integer, with round toward 0.0.
Results are computed per component. The operand's type and Result Type must have the same number of
components. Result Type cannot be a signed integer type.
4 | 100 | <id> Result Type | Result <id> | <id> Float Value
OpConvertFToS
Convert (value preserving) Float Value from floating point to signed integer, with round
toward 0.0.
Results are computed per component. The operand's type and Result Type must have the
same number of components.
4 | 101 | <id> Result Type | Result <id> | <id> Float Value
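Note (non-normative): "round toward 0.0" means the fractional part is discarded, so negative values round up toward zero rather than down. The Python sketch below illustrates this per component; the function name is hypothetical, and behavior for values outside the range of Result Type is not modeled.

```python
import math

def convert_f_to_s(components):
    """Per-component float-to-signed-integer conversion, rounding toward zero."""
    # math.trunc discards the fractional part, so 2.7 -> 2 and -2.7 -> -2.
    return [math.trunc(c) for c in components]
```

Rounding toward zero differs from flooring for negative inputs: -2.7 converts to -2, not -3.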
OpConvertSToF
Convert (value preserving) Signed Value from signed integer to floating point.
Results are computed per component. The operand's type and Result Type must
have the same number of components.
4 | 102 | <id> Result Type | Result <id> | <id> Signed Value
OpConvertUToF
Convert (value preserving) Unsigned value from unsigned integer to floating point.
Results are computed per component. The operand's type and Result Type must
have the same number of components.
4 | 103 | <id> Result Type | Result <id> | <id> Unsigned value
OpUConvert
Convert (value preserving) the width of Unsigned value. This is either a truncate or a zero extend.
Results are computed per component. The operand's type and Result Type must have the same number of components.
The widths of the components of the operand and the Result Type must be different. Result Type cannot be a signed integer
type.
4 | 104 | <id> Result Type | Result <id> | <id> Unsigned value
OpSConvert
Convert (value preserving) the width of Signed Value. This is either a truncate or a sign extend.
Results are computed per component. The operand's type and Result Type must have the same number of components.
The widths of the components of the operand and the Result Type must be different.
4 | 105 | <id> Result Type | Result <id> | <id> Signed Value
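Note (non-normative): the difference between OpUConvert and OpSConvert shows up only when widening, where the new upper bits are filled with zeros or with copies of the sign bit. The Python sketch below models both on raw bit patterns; the function names and the explicit width parameters are assumptions made for illustration.

```python
def u_convert(bits, src_width, dst_width):
    """Truncate or zero extend an unsigned bit pattern."""
    bits &= (1 << src_width) - 1          # keep only the source bits
    return bits & ((1 << dst_width) - 1)  # truncation; widening leaves upper bits zero

def s_convert(bits, src_width, dst_width):
    """Truncate or sign extend a two's-complement bit pattern."""
    bits &= (1 << src_width) - 1
    if dst_width > src_width and bits >> (src_width - 1):
        # Sign bit is set: fill the new upper bits with ones.
        bits |= ((1 << dst_width) - 1) & ~((1 << src_width) - 1)
    return bits & ((1 << dst_width) - 1)
```

Widening the 8-bit pattern 0xFF to 16 bits gives 0x00FF with zero extension and 0xFFFF with sign extension; narrowing behaves identically for both.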
OpFConvert
Convert (value preserving) the width of Float Value.
Results are computed per component. The operand's type and Result Type must have the same number of
components. The widths of the components of the operand and the Result Type must be different.
4 | 106 | <id> Result Type | Result <id> | <id> Float Value
OpConvertPtrToU
Capability:
Addr
Convert Pointer to an unsigned integer type. A Result Type width larger than the width of
Pointer will zero extend. A Result Type width smaller than the width of Pointer will truncate.
For same-width source and target, this is the same as OpBitcast.
Result Type cannot be a signed integer type.
4 | 107 | <id> Result Type | Result <id> | <id> Pointer
OpConvertUToPtr
Capability:
Addr
Converts Integer value to a pointer. A Result Type width smaller than the width of
Integer value will truncate. A Result Type width larger than the width of Integer
value will zero extend. For same-width source and target, this is the same as
OpBitcast.
4 | 108 | <id> Result Type | Result <id> | <id> Integer value
OpPtrCastToGeneric
Capability:
Kernel
Converts Source pointer to a pointer value pointing to storage class Generic. Source
pointer must point to storage class WorkgroupLocal, WorkgroupGlobal or Private. Result
Type must be a pointer type pointing to storage class Generic.
Result Type and Source pointer must point to the same type.
4 | 109 | <id> Result Type | Result <id> | <id> Source pointer
OpGenericCastToPtr
Capability:
Kernel
<id>
Source pointer
OpBitcast
Bit-pattern preserving type conversion for Numerical-type or pointer-type vectors and scalars.
Operand is the bit pattern whose type will change.
Result Type must be different than the type of Operand. Both Result Type and the type of Operand must be
Numerical-types or pointer types. The components of Operand and Result Type must have the same bit width.
Results are computed per component. The operand's type and Result Type must have the same number of components.
4 | 111 | <id> Result Type | Result <id> | <id> Operand
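Note (non-normative): a bit-pattern preserving conversion changes only how the bits are interpreted, never the bits themselves. The Python sketch below illustrates this for a 32-bit float and a 32-bit unsigned integer using the standard struct module; the function names are hypothetical.

```python
import struct

def bitcast_f32_to_u32(value):
    """Reinterpret the 32 bits of a float as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", value))[0]

def bitcast_u32_to_f32(bits):
    """Reinterpret a 32-bit unsigned integer as a float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```

For instance, 1.0 has the bit pattern 0x3F800000, which is unrelated to the value an OpConvertFToU of 1.0 would produce.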
OpGenericCastToPtrExplicit
Capability:
Kernel
Attempts to explicitly convert Source pointer to storage storage-class pointer value. Source
pointer must point to Generic. If the cast fails, the instruction returns an
OpConstantNullPointer in storage Storage Class.
Result Type must be a pointer type pointing to storage Storage Class. storage can be one of the
following literal values: WorkgroupLocal, WorkgroupGlobal or Private.
Result Type and Source pointer must point to the same type.
5 | 232 | <id> Result Type | Result <id> | <id> Source pointer | Storage Class storage
OpSatConvertSToU
Capability:
Kernel
Convert the Signed Value from signed integer to unsigned integer. Converted values
outside the representable range of Result Type are clamped to the nearest representable
value of Result Type.
Results are computed per component. The operands type and Result Type must have the
same number of components.
4 | 263 | <id> Result Type | Result <id> | <id> Signed Value
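Note (non-normative): clamping to the nearest representable value means negative inputs become zero and inputs above the unsigned maximum become that maximum. The Python sketch below shows this for one component; the function name and the explicit destination width are assumptions made for illustration.

```python
def sat_convert_s_to_u(value, dst_bits):
    """Clamp a signed integer into the range of an unsigned type of dst_bits bits."""
    lowest, highest = 0, (1 << dst_bits) - 1
    return max(lowest, min(highest, value))
```

With an 8-bit unsigned Result Type, -5 saturates to 0 and 300 saturates to 255, while 42 is unchanged.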
OpSatConvertUToS
Capability:
Kernel
Convert Unsigned Value from unsigned integer to signed integer. Converted values
outside the representable range of Result Type are clamped to the nearest representable
value of Result Type.
Results are computed per component. The operands type and Result Type must have the
same number of components.
4 | 264 | <id> Result Type | Result <id> | <id> Unsigned Value
3.27.12 Composite Instructions
OpVectorExtractDynamic
Read a single, dynamically selected, component of a vector.
Vector must be a vector type and is the vector from which to read the component.
Index must be a scalar-integer 0-based index of which component to read.
The value read is undefined if Index's value is less than zero or greater than or equal to the number of components in
Vector.
Result Type must be the same as the component type of Vector.
5 | 58 | <id> Result Type | Result <id> | <id> Vector | <id> Index
OpVectorInsertDynamic
Write a single, variably selected, component into a vector.
Vector must be a vector type and is the vector that the non-written components will be taken from.
Index must be a scalar-integer 0-based index of which component to write.
What memory is written is undefined if Index's value is less than zero or greater than or equal to the number of
components in Vector.
The Result Type must be the same type as the type of Vector.
6 | 59 | <id> Result Type | Result <id> | <id> Vector | <id> Component | <id> Index
OpVectorShuffle
Select arbitrary components from two vectors to make a new vector.
Vector 1 and Vector 2 are logically concatenated, forming a single vector with Vector 1's components appearing before
Vector 2's. The components of this logical vector are logically numbered with a single consecutive set of numbers from 0
to one less than the total number of components. These two vectors must be of the same component type, but do not have
to have the same number of components.
Components are these logical numbers (see above), selecting which of the logically numbered components form the result.
They can select the components in any order and can repeat components. The first component of the result is selected by
the first Component operand, the second component of the result is selected by the second Component operand, etc.
Result Type must be a vector of the same component type as the Vector operands' component type. The number of
components in Result Type must be the same as the number of Component operands.
Note: A vector swizzle can be done by using the vector for both Vector operands, or using an OpUndef for one of the
Vector operands.
5 + variable | 60 | <id> Result Type | Result <id> | <id> Vector 1 | <id> Vector 2 | literal, literal, ... Components
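Note (non-normative): the selection rule can be modeled by concatenating the two vectors and indexing the result with each Component literal in turn. The Python sketch below is illustrative only; the function name is hypothetical and vectors are represented as plain lists.

```python
def vector_shuffle(vector1, vector2, components):
    """Select components from the logical concatenation of two vectors."""
    combined = list(vector1) + list(vector2)  # Vector 1's components come first
    return [combined[index] for index in components]
```

With Vector 1 = (1, 2, 3) and Vector 2 = (4, 5), the Components 4, 0, 0, 3 yield (5, 1, 1, 4). Passing the same vector for both operands gives a swizzle.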
OpCompositeConstruct
Construct a new composite object from a set of constituent objects that will fully form it.
Constituents will become members of a structure, or elements of an array, or components of a vector, or columns of a
matrix. There must be exactly one Constituent for each top-level member/element/component/column of the result, with
one exception. The exception is that for constructing a vector, a contiguous subset of the scalars consumed can be
represented by a vector operand instead. The Constituents must appear in the order needed by the definition of the type of
the result. When constructing a vector, there must be at least two Constituent operands.
Result Type must be a composite type, whose top-level members/elements/components/columns have the same type as the
types of the operands, with one exception. The exception is that for constructing a vector, the operands may also be
vectors with the same component type as the Result Type component type. When constructing a vector, the total number of
components in all the operands must equal the number of components in Result Type.
3 + variable | 61 | <id> Result Type | Result <id> | <id>, <id>, ... Constituents
OpCompositeExtract
Extract a part of a composite object.
Composite is the composite to extract from.
Indexes walk the type hierarchy, down to component granularity. All indexes must be in bounds.
Result Type must be the type of object selected by the last provided index. The instruction result is the extracted object.
4 + variable | 62 | <id> Result Type | Result <id> | <id> Composite | literal, literal, ... Indexes
OpCompositeInsert
Insert into a composite object.
Object is the object to insert.
Composite is the composite to insert into.
Indexes walk the type hierarchy to the desired depth, potentially down to component granularity. All indexes must be in
bounds.
Result Type must be the same type as Composite, and the instruction result is a modified version of Composite.
5 + variable | 63 | <id> Result Type | Result <id> | <id> Object | <id> Composite | literal, literal, ... Indexes
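Note (non-normative): the way Indexes walk the type hierarchy in OpCompositeExtract and OpCompositeInsert can be modeled with nested lists. The Python sketch below is illustrative only; the function names are hypothetical, composites are represented as nested lists, and the insert returns a modified copy rather than changing its input, mirroring the statement that the result is a modified version of Composite.

```python
import copy

def _check(obj, index):
    # All indexes must be in bounds; reject negatives, which Python would wrap.
    if not 0 <= index < len(obj):
        raise IndexError(f"index {index} out of bounds")

def composite_extract(composite, indexes):
    """Follow each index one level deeper and return the selected object."""
    obj = composite
    for index in indexes:
        _check(obj, index)
        obj = obj[index]
    return obj

def composite_insert(obj, composite, indexes):
    """Return a copy of composite with the selected part replaced by obj."""
    result = copy.deepcopy(composite)
    target = result
    for index in indexes[:-1]:
        _check(target, index)
        target = target[index]
    _check(target, indexes[-1])
    target[indexes[-1]] = obj
    return result
```

For a two-column matrix modeled as [[1, 2], [3, 4]], the indexes 1, 0 select the first component of the second column.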
OpCopyObject
Make a copy of Operand. There are no dereferences involved.
Result Type must match Operand type. There are no other restrictions
on the types.
4 | 64 | <id> Result Type | Result <id> | <id> Operand
OpTranspose
Capability:
Matrix
Transpose a matrix.
Matrix must be an intermediate <id> whose type comes from an OpTypeMatrix
instruction.
Result Type must be an <id> from an OpTypeMatrix instruction, where the number of
columns and the column size is the reverse of those of the type of Matrix.
4 | 112 | <id> Result Type | Result <id> | <id> Matrix
3.27.13 Arithmetic Instructions
OpSNegate
Signed-integer subtract of Operand from zero. The operands type and Result Type must both be
scalars or vectors of integer types with the same number of components and the same component
widths. Works with any mixture of signedness.
4 | 95 | <id> Result Type | Result <id> | <id> Operand
OpFNegate
Floating-point subtract of Operand from zero. The operands type and Result
Type must both be scalars or vectors of floating-point types with the same number
of components and the same component widths.
4 | 96 | <id> Result Type | Result <id> | <id> Operand
OpNot
Complement the bits of Operand. The operand type and Result Type
must be scalars or vectors of integer types with the same number of
components and same component widths.
4 | 97 | <id> Result Type | Result <id> | <id> Operand
OpIAdd
Integer addition of Operand 1 and Operand 2. The operands types and Result Type must all be
scalars or vectors of integer types with the same number of components and the same component
widths. Works with any mixture of signedness.
5 | 122 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpFAdd
Floating-point addition of Operand 1 and Operand 2. The operands types and Result
Type must all be scalars or vectors of floating-point types with the same number of
components and the same component widths.
5 | 123 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpISub
Integer subtraction of Operand 2 from Operand 1. The operands types and Result Type must all be
scalars or vectors of integer types with the same number of components and the same component
widths. Works with any mixture of signedness.
5 | 124 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpFSub
Floating-point subtraction of Operand 2 from Operand 1. The operands types and
Result Type must all be scalars or vectors of floating-point types with the same number
of components and the same component widths.
5 | 125 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpIMul
Integer multiplication of Operand 1 and Operand 2. The operands types and Result Type must all be
scalars or vectors of integer types with the same number of components and the same component
widths. Works with any mixture of signedness.
5 | 126 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpFMul
Floating-point multiplication of Operand 1 and Operand 2. The operands types and
Result Type must all be scalars or vectors of floating-point types with the same number
of components and the same component widths.
5 | 127 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpUDiv
Unsigned-integer division of Operand 1 divided by Operand 2. The operands types and Result Type must all be scalars or
vectors of integer types with the same number of components and the same component widths. The operands types and
Result Type cannot be signed types. The resulting value is undefined if Operand 2 is 0.
5 | 128 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpSDiv
Signed-integer division of Operand 1 divided by Operand 2. The operands types and Result Type must all be scalars or
vectors of integer types with the same number of components and the same component widths. Works with any mixture of
signedness. The resulting value is undefined if Operand 2 is 0.
5 | 129 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpFDiv
Floating-point division of Operand 1 divided by Operand 2. The operands types and Result Type must all be
scalars or vectors of floating-point types with the same number of components and the same component widths.
The resulting value is undefined if Operand 2 is 0.
5 | 130 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpUMod
Unsigned modulo operation of Operand 1 modulo Operand 2. The operands types and Result Type must all be scalars or
vectors of integer types with the same number of components and the same component widths. The operands types and
Result Type cannot be signed types. The resulting value is undefined if Operand 2 is 0.
5 | 131 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpSRem
Signed remainder operation of Operand 1 divided by Operand 2. The sign of a non-0 result comes from Operand 1. The
operands types and Result Type must all be scalars or vectors of integer types with the same number of components and
the same component widths. Works with any mixture of signedness. The resulting value is undefined if Operand 2 is 0.
5 | 132 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpSMod
Signed modulo operation of Operand 1 modulo Operand 2. The sign of a non-0 result comes from Operand 2. The
operands types and Result Type must all be scalars or vectors of integer types with the same number of components and
the same component widths. Works with any mixture of signedness. The resulting value is undefined if Operand 2 is 0.
5 | 133 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
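Note (non-normative): OpSRem and OpSMod agree when both operands have the same sign and differ otherwise, because the sign of a non-zero result follows Operand 1 for the remainder and Operand 2 for the modulo. The Python sketch below illustrates the two conventions; the function names are hypothetical, and a zero Operand 2 (undefined in the text above) simply raises ZeroDivisionError here.

```python
def s_rem(operand1, operand2):
    """Remainder of truncated division: sign follows operand1."""
    quotient = abs(operand1) // abs(operand2)
    if (operand1 < 0) != (operand2 < 0):
        quotient = -quotient  # truncate toward zero instead of flooring
    return operand1 - operand2 * quotient

def s_mod(operand1, operand2):
    """Modulo of floored division: sign follows operand2."""
    return operand1 % operand2  # Python's % already takes the divisor's sign
```

For Operand 1 = -7 and Operand 2 = 3, the remainder is -1 while the modulo is 2.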
OpFRem
Floating-point remainder operation of Operand 1 divided by Operand 2. The sign of a non-0 result comes from Operand
1. The operands types and Result Type must all be scalars or vectors of floating-point types with the same number of
components and the same component widths. The resulting value is undefined if Operand 2 is 0.
5 | 134 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpFMod
Floating-point modulo operation of Operand 1 modulo Operand 2. The sign of a non-0 result comes from Operand 2. The
operands types and Result Type must all be scalars or vectors of floating-point types with the same number of components
and the same component widths. The resulting value is undefined if Operand 2 is 0.
5 | 135 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpVectorTimesScalar
Scale a floating-point vector.
Vector must have a floating-point vector type.
Scalar must be a floating-point scalar.
Result Type must be the same as the type of Vector.
5 | 136 | <id> Result Type | Result <id> | <id> Vector | <id> Scalar
OpMatrixTimesScalar
Capability:
Matrix
<id>
Scalar
OpVectorTimesMatrix
Capability:
Matrix
OpMatrixTimesVector
<id>
Matrix
Capability:
Matrix
<id>
Vector
OpMatrixTimesMatrix
Capability:
Matrix
OpOuterProduct
<id>
RightMatrix
Capability:
Matrix
<id>
Vector 2
OpDot
Dot product of Vector 1 and Vector 2.
The operands types must be floating-point vectors with the same component type and the same
number of components.
Result Type must be a scalar of the same type as the operands component type.
5 | 142 | <id> Result Type | Result <id> | <id> Vector 1 | <id> Vector 2
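Note (non-normative): the dot product is the sum of the pairwise products of the components, which is why both vectors must have the same number of components and the result is a scalar. The Python sketch below is illustrative only; the function name is hypothetical and vectors are plain lists of floats.

```python
def dot(vector1, vector2):
    """Sum of component-wise products of two equally sized vectors."""
    if len(vector1) != len(vector2):
        raise ValueError("operands must have the same number of components")
    return sum(a * b for a, b in zip(vector1, vector2))
```

For (1, 2, 3) and (4, 5, 6) the result is 1*4 + 2*5 + 3*6 = 32.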
OpShiftRightLogical
Shift the bits in Operand 1 right by the number of bits specified in Operand 2. The most-significant bits will be zero filled.
Operand 2 is consumed as an unsigned integer. The result is undefined if Operand 2 is greater than the bit width of the
components of Operand 1.
The number of components and bit width of Result Type must match those of Operand 1 type. All types must be integer
types. Works with any mixture of signedness.
5 | 143 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpShiftRightArithmetic
Shift the bits in Operand 1 right by the number of bits specified in Operand 2. The most-significant bits will be filled with
the sign bit from Operand 1. Operand 2 is treated as unsigned. The result is undefined if Operand 2 is greater than the bit
width of the components of Operand 1.
The number of components and bit width of Result Type must match those of Operand 1 type. All types must be integer
types. Works with any mixture of signedness.
5 | 144 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
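Note (non-normative): the two right shifts differ only in what fills the vacated most-significant bits: zeros for OpShiftRightLogical, copies of the sign bit for OpShiftRightArithmetic. The Python sketch below models both on fixed-width bit patterns; the function names and the width parameter are assumptions made for illustration, and shift amounts larger than the width (undefined in the text above) are not modeled.

```python
def shift_right_logical(value, shift, width=32):
    """Shift right, filling the most-significant bits with zeros."""
    mask = (1 << width) - 1
    return (value & mask) >> shift

def shift_right_arithmetic(value, shift, width=32):
    """Shift right, filling the most-significant bits with the sign bit."""
    mask = (1 << width) - 1
    bits = value & mask
    if bits >> (width - 1):
        bits -= 1 << width  # reinterpret the pattern as a negative integer
    return (bits >> shift) & mask  # Python's >> on negatives replicates the sign
```

Shifting the 32-bit pattern 0x80000000 right by 4 gives 0x08000000 logically and 0xF8000000 arithmetically.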
OpShiftLeftLogical
Shift the bits in Operand 1 left by the number of bits specified in Operand 2. The least-significant bits will be zero filled.
Operand 2 is treated as unsigned. The result is undefined if Operand 2 is greater than the bit width of the components of
Operand 1.
The number of components and bit width of Result Type must match those of Operand 1 type. All types must be integer
types. Works with any mixture of signedness.
5 | 145 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpBitwiseOr
Result is 1 if either Operand 1 or Operand 2 is 1. Result is 0 if both Operand 1 and Operand 2 are 0.
Results are computed per component, and within each component, per bit. The operands types and Result Type must all be
scalars or vectors of integer types with the same number of components and the same component widths. Works with any
mixture of signedness.
5 | 149 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpBitwiseXor
Result is 1 if exactly one of Operand 1 or Operand 2 is 1. Result is 0 if Operand 1 and Operand 2 have the same value.
Results are computed per component, and within each component, per bit. The operands types and Result Type must all be
scalars or vectors of integer types with the same number of components and the same component widths. Works with any
mixture of signedness.
5 | 150 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpBitwiseAnd
Result is 1 if both Operand 1 and Operand 2 are 1. Result is 0 if either Operand 1 or Operand 2 is 0.
Results are computed per component, and within each component, per bit. The operands types and Result Type must all be
scalars or vectors of integer types with the same number of components and the same component widths. Works with any
mixture of signedness.
5 | 151 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
3.27.14
OpAny
Result is true if any component of Vector is true, otherwise result is false.
Vector must be a vector of Boolean type.
Result Type must be a Boolean type scalar.
4 | 98 | <id> Result Type | Result <id> | <id> Vector
OpAll
Result is true if all components of Vector are true, otherwise result is false.
Vector must be a vector of Boolean type.
Result Type must be a Boolean type scalar.
4 | 99 | <id> Result Type | Result <id> | <id> Vector
OpIsNan
Result is true if x is an IEEE NaN, otherwise result is false.
Result Type must be a scalar or vector of Boolean type, with the same number of components as the operand. Results are
computed per component. The operands type and Result Type must have the same number of components.
4 | 113 | <id> Result Type | Result <id> | <id> x
OpIsInf
Result is true if x is an IEEE Inf, otherwise result is false.
Result Type must be a scalar or vector of Boolean type, with the same number of components as the operand. Results are
computed per component. The operands type and Result Type must have the same number of components.
4 | 114 | <id> Result Type | Result <id> | <id> x
OpIsFinite
Capability:
Kernel
OpIsNormal
<id>
x
Capability:
Kernel
OpSignBitSet
<id>
x
Capability:
Kernel
Result is true if x has its sign bit set, otherwise result is false.
Result Type must be a scalar or vector of Boolean type, with the same number of
components as the operand. Results are computed per component. The operands type
and Result Type must have the same number of components.
4 | 117 | <id> Result Type | Result <id> | <id> x
OpLessOrGreater
Capability:
Kernel
Result is true if x < y or x > y, where IEEE comparisons are used, otherwise result is false.
Result Type must be a scalar or vector of Boolean type, with the same number of components as
the operands. Results are computed per component. The operands types and Result Type must
all have the same number of components.
5 | 118 | <id> Result Type | Result <id> | <id> x | <id> y
OpOrdered
Capability:
Kernel
Result is true if both x == x and y == y are true, where IEEE comparison is used, otherwise
result is false.
Result Type must be a scalar or vector of Boolean type, with the same number of components as
the operands. Results are computed per component. The operands types and Result Type must
all have the same number of components.
5 | 119 | <id> Result Type | Result <id> | <id> x | <id> y
OpUnordered
Capability:
Kernel
<id>
y
OpLogicalOr
Result is true if either Operand 1 or Operand 2 is true. Result is false if both Operand 1 and Operand 2 are false.
Operand 1 and Operand 2 must both be scalars or vectors of Boolean type.
Result Type must be a scalar or vector of Boolean type, with the same number of components as the operands. Results are
computed per component. The operands types and Result Type must all have the same number of components.
5 | 146 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpLogicalXor
Result is true if exactly one of Operand 1 or Operand 2 is true. Result is false if Operand 1 and Operand 2 have the same
value.
Operand 1 and Operand 2 must both be scalars or vectors of Boolean type.
Result Type must be a scalar or vector of Boolean type, with the same number of components as the operands. Results are
computed per component. The operands types and Result Type must all have the same number of components.
5 | 147 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpLogicalAnd
Result is true if both Operand 1 and Operand 2 are true. Result is false if either Operand 1 or Operand 2 is false.
Operand 1 and Operand 2 must both be scalars or vectors of Boolean type.
Result Type must be a scalar or vector of Boolean type, with the same number of components as the operands. Results are
computed per component. The operands types and Result Type must all have the same number of components.
5 | 148 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpSelect
Select between two objects. Results are computed per component.
Condition must be a Boolean type scalar or vector.
Object 1 is selected as the result if Condition is true.
Object 2 is selected as the result if Condition is false.
Result Type, the type of Object 1, and the type of Object 2 must all be the same. Condition must have the same number of
components as the operands.
6 | 152 | <id> Result Type | Result <id> | <id> Condition | <id> Object 1 | <id> Object 2
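Note (non-normative): for vector operands, each component of Condition independently picks the corresponding component of Object 1 or Object 2. The Python sketch below illustrates the vector case; the function name is hypothetical and vectors are plain lists.

```python
def select(condition, object1, object2):
    """Per-component choice: object1 where condition is true, else object2."""
    if not (len(condition) == len(object1) == len(object2)):
        raise ValueError("all operands must have the same number of components")
    return [a if c else b for c, a, b in zip(condition, object1, object2)]
```

With Condition = (true, false, true), Object 1 = (1, 2, 3) and Object 2 = (10, 20, 30), the result is (1, 20, 3).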
OpIEqual
Integer comparison for equality.
Result Type must be a scalar or vector of Boolean type, with the same number of
components as the operands.
5 | 153 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpFOrdEqual
Floating-point comparison for being ordered and equal.
Result Type must be a scalar or vector of Boolean type, with the same number of
components as the operands.
5 | 154 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpFUnordEqual
Floating-point comparison for being unordered or equal.
Result Type must be a scalar or vector of Boolean type, with the same number of
components as the operands.
5 | 155 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
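Note (non-normative): the ordered and unordered forms of a floating-point comparison differ only when at least one operand is a NaN. The ordered form is then false, and the unordered form is then true. The Python sketch below illustrates this for equality on scalars; the function names are hypothetical.

```python
import math

def f_ord_equal(operand1, operand2):
    """True only if neither operand is NaN and the operands are equal."""
    ordered = not (math.isnan(operand1) or math.isnan(operand2))
    return ordered and operand1 == operand2

def f_unord_equal(operand1, operand2):
    """True if either operand is NaN, or the operands are equal."""
    unordered = math.isnan(operand1) or math.isnan(operand2)
    return unordered or operand1 == operand2
```

The same pattern carries over to the other Ord and Unord comparisons by swapping the equality test for the relevant relation.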
OpINotEqual
Integer comparison for inequality.
Result Type must be a scalar or vector of Boolean type, with the same number of
components as the operands.
5 | 156 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpFOrdNotEqual
Floating-point comparison for being ordered and not equal.
Result Type must be a scalar or vector of Boolean type, with the same number of
components as the operands.
5 | 157 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpFUnordNotEqual
Floating-point comparison for being unordered or not equal.
Result Type must be a scalar or vector of Boolean type, with the same number of
components as the operands.
5 | 158 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpULessThan
Unsigned-integer comparison if Operand 1 is less than Operand 2.
Result Type must be a scalar or vector of Boolean type, with the same number of
components as the operands.
5 | 159 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpSLessThan
Signed-integer comparison if Operand 1 is less than Operand 2.
Result Type must be a scalar or vector of Boolean type, with the same number of
components as the operands.
5 | 160 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpFOrdLessThan
Floating-point comparison if operands are ordered and Operand 1 is less than Operand
2.
Result Type must be a scalar or vector of Boolean type, with the same number of
components as the operands.
5 | 161 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpFUnordLessThan
Floating-point comparison if operands are unordered or Operand 1 is less than Operand
2.
Result Type must be a scalar or vector of Boolean type, with the same number of
components as the operands.
5 | 162 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpUGreaterThan
Unsigned-integer comparison if Operand 1 is greater than Operand 2.
Result Type must be a scalar or vector of Boolean type, with the same number of
components as the operands.
5 | 163 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpSGreaterThan
Signed-integer comparison if Operand 1 is greater than Operand 2.
Result Type must be a scalar or vector of Boolean type, with the same number of
components as the operands.
5 | 164 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpFOrdGreaterThan
Floating-point comparison if operands are ordered and Operand 1 is greater than Operand
2.
Result Type must be a scalar or vector of Boolean type, with the same number of
components as the operands.
5 | 165 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpFUnordGreaterThan
Floating-point comparison if operands are unordered or Operand 1 is greater than
Operand 2.
Result Type must be a scalar or vector of Boolean type, with the same number of
components as the operands.
5 | 166 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpULessThanEqual
Unsigned-integer comparison if Operand 1 is less than or equal to Operand 2.
Result Type must be a scalar or vector of Boolean type, with the same number of
components as the operands.
5 | 167 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpSLessThanEqual
Signed-integer comparison if Operand 1 is less than or equal to Operand 2.
Result Type must be a scalar or vector of Boolean type, with the same number of
components as the operands.
5 | 168 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpFOrdLessThanEqual
Floating-point comparison if operands are ordered and Operand 1 is less than or equal to
Operand 2.
Result Type must be a scalar or vector of Boolean type, with the same number of components
as the operands.
5 | 169 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpFUnordLessThanEqual
Floating-point comparison if operands are unordered or Operand 1 is less than or equal to
Operand 2.
Result Type must be a scalar or vector of Boolean type, with the same number of components
as the operands.
5 | 170 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpUGreaterThanEqual
Unsigned-integer comparison if Operand 1 is greater than or equal to Operand 2.
Result Type must be a scalar or vector of Boolean type, with the same number of
components as the operands.
5 | 171 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpSGreaterThanEqual
Signed-integer comparison if Operand 1 is greater than or equal to Operand 2.
Result Type must be a scalar or vector of Boolean type, with the same number of
components as the operands.
5 | 172 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpFOrdGreaterThanEqual
Floating-point comparison if operands are ordered and Operand 1 is greater than or equal to
Operand 2.
Result Type must be a scalar or vector of Boolean type, with the same number of components
as the operands.
5 | 173 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
OpFUnordGreaterThanEqual
Floating-point comparison if operands are unordered or Operand 1 is greater than or equal to
Operand 2.
Result Type must be a scalar or vector of Boolean type, with the same number of components as
the operands.
5 | 174 | <id> Result Type | Result <id> | <id> Operand 1 | <id> Operand 2
3.27.15 Derivative Instructions
OpDPdx
Capability:
Shader
OpDPdy
Same result as either OpDPdyFine or OpDPdyCoarse on P. Selection of which one is
based on external factors.
P is the value to take the derivative of.
Result Type must be the same as the type of P. This type must be a floating-point scalar or
floating-point vector.
<id>
P
Capability:
Shader
176
<id>
Result Type
Result <id>
OpFwidth
Capability:
Shader
Result is the same as computing the sum of the absolute values of OpDPdx and
OpDPdy on P.
P is the value to take the derivative of.
Result Type must be the same as the type of P. This type must be a floating-point
scalar or floating-point vector.
4 | 177 | <id> Result Type | Result <id> | <id> P
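The OpFwidth relationship above is a component-wise sum of absolute values. A minimal sketch, assuming the two partial derivatives have already been computed and are passed in as equal-length lists (a scalar being a one-element list); the function name `fwidth` is illustrative.

```python
def fwidth(dpdx, dpdy):
    """Component-wise |dPdx| + |dPdy| for equal-length derivative vectors."""
    if len(dpdx) != len(dpdy):
        raise ValueError("derivatives must have the same number of components")
    return [abs(dx) + abs(dy) for dx, dy in zip(dpdx, dpdy)]
```

The result has the same number of components as P, matching the Result Type rule.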
OpDPdxFine
Capability:
Shader
Result is the partial derivative of P with respect to the window x coordinate. Will use local
differencing based on the value of P for the current fragment and its immediate
neighbor(s).
P is the value to take the derivative of.
Result Type must be the same as the type of P. This type must be a floating-point scalar or
floating-point vector.
4 | 178 | <id> Result Type | Result <id> | <id> P
OpDPdyFine
Capability:
Shader
Result is the partial derivative of P with respect to the window y coordinate. Will use local
differencing based on the value of P for the current fragment and its immediate
neighbor(s).
P is the value to take the derivative of.
Result Type must be the same as the type of P. This type must be a floating-point scalar or
floating-point vector.
4 | 179 | <id> Result Type | Result <id> | <id> P
OpFwidthFine
Capability:
Shader
Result is the same as computing the sum of the absolute values of OpDPdxFine and
OpDPdyFine on P.
P is the value to take the derivative of.
Result Type must be the same as the type of P. This type must be a floating-point scalar or
floating-point vector.
4 | 180 | <id> Result Type | Result <id> | <id> P
OpDPdxCoarse
Capability:
Shader
Result is the partial derivative of P with respect to the window x coordinate. Will use
local differencing based on the value of P for the current fragment's neighbors, and will
possibly, but not necessarily, include the value of P for the current fragment. That is, over
a given area, the implementation can compute x derivatives in fewer unique locations
than would be allowed for OpDPdxFine.
P is the value to take the derivative of.
Result Type must be the same as the type of P. This type must be a floating-point scalar or
floating-point vector.
4 | 181 | <id> Result Type | Result <id> | <id> P
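One way to picture the difference between the fine and coarse x derivatives is a 2x2 block of fragments. The sketch below is purely illustrative: the 2x2 grouping, the choice of row 0 for the coarse result, and the function names are all assumptions, not behavior required by this specification.

```python
# P sampled at a 2x2 block of fragments, indexed as QUAD[y][x] (assumed layout).
QUAD = [[1.0, 2.0],
        [4.0, 8.0]]


def dpdx_fine(quad, x, y):
    """Difference along x within the row that contains fragment (x, y)."""
    return quad[y][1] - quad[y][0]


def dpdx_coarse(quad, x, y):
    """A single x difference (row 0 here, by assumption) shared by the whole block."""
    return quad[0][1] - quad[0][0]


def unique_x_derivatives(derivative, quad):
    """Set of distinct derivative values produced over the four fragments."""
    return {derivative(quad, x, y) for y in range(2) for x in range(2)}
```

In this sketch the fine form produces two distinct values over the block, one per row, while the coarse form produces one, which is the sense in which coarse derivatives may be computed in fewer unique locations.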
OpDPdyCoarse
Capability:
Shader
Result is the partial derivative of P with respect to the window y coordinate. Will use
local differencing based on the value of P for the current fragment's neighbors, and will
possibly, but not necessarily, include the value of P for the current fragment. That is, over
a given area, the implementation can compute y derivatives in fewer unique locations
than would be allowed for OpDPdyFine.
P is the value to take the derivative of.
Result Type must be the same as the type of P. This type must be a floating-point scalar or
floating-point vector.
4 | 182 | <id> Result Type | Result <id> | <id> P
OpFwidthCoarse
Capability:
Shader
Result is the same as computing the sum of the absolute values of OpDPdxCoarse and
OpDPdyCoarse on P.
P is the value to take the derivative of.
Result Type must be the same as the type of P. This type must be a floating-point scalar or
floating-point vector.
4 | 183 | <id> Result Type | Result <id> | <id> P
3.27.16 Flow-Control Instructions
OpPhi
The SSA phi function. Operands are pairs (<id> of variable, <id> of
parent block). All variables must have a type matching Result Type.
3+ variable | 48 | <id> Result Type | Result <id> | <id>, <id>, . . .
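The operand pairs of OpPhi can be read as a lookup keyed by the block that control flow arrived from. A minimal sketch of that reading, in the usual SSA sense; the function name `eval_phi` and the string block names are hypothetical.

```python
def eval_phi(pairs, predecessor):
    """Return the value paired with the parent block control flow came from.

    pairs: list of (value, parent_block) tuples, mirroring the
    (<id> of variable, <id> of parent block) operand pairs.
    """
    for value, parent in pairs:
        if parent == predecessor:
            return value
    raise KeyError(f"no operand pair for predecessor block {predecessor!r}")
```

For example, `eval_phi([(10, "then"), (20, "else")], "else")` selects the value 20.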
OpLoopMerge
Declare and control a structured control-flow loop construct.
Label is the label of the merge block for this structured loop construct.
See Structured Control Flow for more detail.
3 | 206 | <id> Label | Loop Control
OpSelectionMerge
Declare and control a structured control-flow selection construct, used with OpBranchConditional or OpSwitch.
Label is the label of the merge block for this structured selection construct.
See Structured Control Flow for more detail.
3 | 207 | <id> Label | Selection Control
OpLabel
The block label instruction: Any reference to a block is through the Result
<id> of its label.
Must be the first instruction of any block, and appears only as the first
instruction of a block.
2 | 208 | Result <id>
OpBranch
Unconditional branch to Target Label.
Target Label must be the Result <id> of an OpLabel instruction in the current
function.
This instruction must be the last instruction in a block.
2 | 209 | <id> Target Label
OpBranchConditional
If Condition is true, branch to True Label, otherwise branch to False Label.
Condition must be a Boolean type scalar.
True Label must be an OpLabel in the current function.
False Label must be an OpLabel in the current function.
Branch weights are unsigned 32-bit integer literals. There must be either no Branch Weights or exactly two branch weights.
If present, the first is the weight for branching to True Label, and the second is the weight for branching to False Label.
The implied probability that a branch is taken is its weight divided by the sum of the two Branch weights.
This instruction must be the last instruction in a block.
4+ variable | 210 | <id> Condition | <id> True Label | <id> False Label | literal, literal, . . . Branch weights
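The implied branch probabilities described above follow directly from the two weights. A minimal sketch; the function name is illustrative, and rejecting a zero sum is an assumption, since the text does not say how that case is handled.

```python
def branch_probabilities(weights=()):
    """Return (P(True Label), P(False Label)), or None when no weights are given."""
    if len(weights) == 0:
        return None  # no Branch weights present
    if len(weights) != 2:
        raise ValueError("there must be either no branch weights or exactly two")
    for w in weights:
        if not 0 <= w <= 0xFFFFFFFF:
            raise ValueError("branch weights are unsigned 32-bit integer literals")
    true_weight, false_weight = weights
    total = true_weight + false_weight
    if total == 0:
        # Assumption: a zero sum is rejected here; the specification text is silent on it.
        raise ValueError("sum of branch weights is zero")
    return (true_weight / total, false_weight / total)
```

Weights of 3 and 1 therefore imply probabilities of 0.75 for True Label and 0.25 for False Label.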
OpSwitch
Multi-way branch to one of the operand label <id>.
Selector must be a scalar integer type. It will be compared for equality to the Target literals.
Default must be the <id> of a label. If Selector does not equal any of the Target literals, control flow will branch to the
Default label <id>.
Target must be alternating scalar-integer literals and the <id> of a label. If Selector equals one of the literals, control flow
will branch to the following label <id>. It is invalid for any two Target literals to be equal to each other. If Target is not
present, control flow will branch to the Default label <id>.
This instruction must be the last instruction in a block.
3+ variable | 211 | <id> Selector | <id> Default
OpKill
Capability:
Shader
212
OpReturn
Return with no value from a function with void return
type.
This instruction must be the last instruction in a block.
213
OpReturnValue
Return a value from a function.
Value is the value returned, by copy, and must match the Return Type operand of the OpTypeFunction type of the
OpFunction body this return instruction is in.
This instruction must be the last instruction in a block.
2 | 214 | <id> Value
OpUnreachable
Capability:
Kernel
215
OpLifetimeStart
Declare that the content of the object pointed to was not defined before this instruction.
If Operand 1 has a non-void type, Operand 2 must be 0, otherwise Operand 2 is the
amount of memory whose lifetime is starting.
3 | 216 | <id> | Literal Number
OpLifetimeStop
Declare that the content of the object pointed to is dead after this instruction. If
Operand 1 has a non-void type, Operand 2 must be 0, otherwise Operand 2 is the
amount of memory whose lifetime is ending.
3 | 217 | <id> | Literal Number
3.27.17 Atomic Instructions
OpAtomicInit
Initialize atomic memory to Value. This is not done atomically with
respect to anything.
The type of Value and the type pointed to by Pointer must be the
same type.
3 | 191 | <id> Pointer | <id> Value
OpAtomicLoad
Atomically load through Pointer using the given Semantics. All subparts of the value that is loaded will be
read atomically with respect to all other atomic accesses to it within Scope.
Result Type must be the same type as the type pointed to by Pointer.
6 | 192 | <id> Result Type | Result <id> | <id> Pointer | Execution Scope Scope | Memory Semantics Semantics
OpAtomicStore
Atomically store through Pointer using the given Semantics. All subparts of Value will be written
atomically with respect to all other atomic accesses to it within Scope.
The type pointed to by Pointer must be the same type as the type of Value.
5 | 193 | <id> Pointer | Execution Scope Scope | Memory Semantics Semantics | <id> Value
OpAtomicExchange
Perform the following steps atomically with respect to any other atomic accesses within Scope to the same location:
1) load through Pointer to get an Original Value,
2) get a New Value from copying Value, and
3) store the New Value back through Pointer.
The instruction's result is the Original Value.
Result Type, the type of Value, and the type pointed to by Pointer must all be the same type.
7 | 194 | <id> Result Type | Result <id> | <id> Pointer | Execution Scope Scope | Memory Semantics Semantics | <id> Value
OpAtomicCompareExchange
Perform the following steps atomically with respect to any other atomic accesses within Scope to the same location:
1) load through Pointer to get an Original Value,
2) get a New Value by selecting Value if Original Value equals Comparator or selecting Original Value otherwise, and
3) store the New Value back through Pointer.
The instruction's result is the Original Value.
Result Type, the type of Value, and the type pointed to by Pointer must all be the same type.
8 | 195 | <id> Result Type | Result <id> | <id> Pointer | Execution Scope Scope | Memory Semantics Semantics | <id> Value | <id> Comparator
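The three steps of OpAtomicCompareExchange can be sketched with a lock standing in for atomicity within Scope. The class `AtomicCell` and its methods are hypothetical, and this models only the load/select/store sequence, not Scope or Semantics.

```python
import threading


class AtomicCell:
    """Hypothetical memory location; the lock stands in for atomicity within Scope."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def compare_exchange(self, value, comparator):
        with self._lock:
            original = self._value                               # 1) load the Original Value
            new = value if original == comparator else original  # 2) select the New Value
            self._value = new                                    # 3) store the New Value
            return original                                      # result is the Original Value

    def load(self):
        with self._lock:
            return self._value
```

A caller can tell whether the exchange took effect by comparing the returned Original Value with the Comparator it supplied.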
OpAtomicCompareExchangeWeak
Attempts to do the following:
Perform the following steps atomically with respect to any other atomic accesses within Scope to the same location:
1) load through Pointer to get an Original Value,
2) get a New Value by selecting Value if Original Value equals Comparator or selecting Original Value otherwise, and
3) store the New Value back through Pointer.
The instruction's result is the Original Value.
Result Type, the type of Value, and the type pointed to by Pointer must all be the same type. This type must also match the type
of Comparator.
TBD. What is the result if the operation fails?
8 | 196 | <id> Result Type | Result <id> | <id> Pointer | Execution Scope Scope | Memory Semantics Semantics | <id> Value | <id> Comparator
OpAtomicIIncrement
Perform the following steps atomically with respect to any other atomic accesses within Scope to the same location:
1) load through Pointer to get an Original Value,
2) get a New Value through integer addition of 1 to Original Value, and
3) store the New Value back through Pointer.
The instruction's result is the Original Value.
Result Type must be the same type as the type pointed to by Pointer.
6 | 197 | <id> Result Type | Result <id> | <id> Pointer | Execution Scope Scope | Memory Semantics Semantics
OpAtomicIDecrement
Perform the following steps atomically with respect to any other atomic accesses within Scope to the same location:
1) load through Pointer to get an Original Value,
2) get a New Value through integer subtraction of 1 from Original Value, and
3) store the New Value back through Pointer.
The instruction's result is the Original Value.
Result Type must be the same type as the type pointed to by Pointer.
6 | 198 | <id> Result Type | Result <id> | <id> Pointer | Execution Scope Scope | Memory Semantics Semantics
OpAtomicIAdd
Perform the following steps atomically with respect to any other atomic accesses within Scope to the same location:
1) load through Pointer to get an Original Value,
2) get a New Value by integer addition of Original Value and Value, and
3) store the New Value back through Pointer.
The instruction's result is the Original Value.
Result Type, the type of Value, and the type pointed to by Pointer must all be the same type.
7 | 199 | <id> Result Type | Result <id> | <id> Pointer | Execution Scope Scope | Memory Semantics Semantics | <id> Value
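The read-modify-write atomics above all share one shape: load, compute, store, and return the Original Value. A sketch of OpAtomicIAdd under the assumption of 32-bit wraparound arithmetic; the names `AtomicInt32`, `fetch_add`, and `run_workers` are hypothetical, and a lock again stands in for atomicity within Scope.

```python
import threading


class AtomicInt32:
    """Hypothetical 32-bit integer location guarded by a lock."""

    def __init__(self, value=0):
        self._value = value & 0xFFFFFFFF
        self._lock = threading.Lock()

    def fetch_add(self, value):
        with self._lock:
            original = self._value                         # 1) load
            self._value = (original + value) & 0xFFFFFFFF  # 2) add (wraparound is an assumption), 3) store
            return original                                # result is the Original Value

    def load(self):
        with self._lock:
            return self._value


def run_workers(counter, workers=4, adds_per_worker=1000):
    """Have several threads add 1 concurrently and return the final value."""
    def work():
        for _ in range(adds_per_worker):
            counter.fetch_add(1)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.load()
```

Because each update is indivisible, no increments are lost: four workers adding 1 a thousand times each leave the counter at 4000.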
OpAtomicISub
Perform the following steps atomically with respect to any other atomic accesses within Scope to the same location:
1) load through Pointer to get an Original Value,
2) get a New Value by integer subtraction of Value from Original Value, and
3) store the New Value back through Pointer.
The instruction's result is the Original Value.
Result Type, the type of Value, and the type pointed to by Pointer must all be the same type.
7 | 200 | <id> Result Type | Result <id> | <id> Pointer | Execution Scope Scope | Memory Semantics Semantics | <id> Value
OpAtomicUMin
Perform the following steps atomically with respect to any other atomic accesses within Scope to the same location:
1) load through Pointer to get an Original Value,
2) get a New Value by finding the smallest unsigned integer of Original Value and Value, and
3) store the New Value back through Pointer.
The instruction's result is the Original Value.
Result Type, the type of Value, and the type pointed to by Pointer must all be the same type.
7 | 201 | <id> Result Type | Result <id> | <id> Pointer | Execution Scope Scope | Memory Semantics Semantics | <id> Value
OpAtomicUMax
Perform the following steps atomically with respect to any other atomic accesses within Scope to the same location:
1) load through Pointer to get an Original Value,
2) get a New Value by finding the largest unsigned integer of Original Value and Value, and
3) store the New Value back through Pointer.
The instruction's result is the Original Value.
Result Type, the type of Value, and the type pointed to by Pointer must all be the same type.
7 | 202 | <id> Result Type | Result <id> | <id> Pointer | Execution Scope Scope | Memory Semantics Semantics | <id> Value
OpAtomicAnd
Perform the following steps atomically with respect to any other atomic accesses within Scope to the same location:
1) load through Pointer to get an Original Value,
2) get a New Value by the bitwise AND of Original Value and Value, and
3) store the New Value back through Pointer.
The instruction's result is the Original Value.
Result Type, the type of Value, and the type pointed to by Pointer must all be the same type.
7 | 203 | <id> Result Type | Result <id> | <id> Pointer | Execution Scope Scope | Memory Semantics Semantics | <id> Value
OpAtomicOr
Perform the following steps atomically with respect to any other atomic accesses within Scope to the same location:
1) load through Pointer to get an Original Value,
2) get a New Value by the bitwise OR of Original Value and Value, and
3) store the New Value back through Pointer.
The instruction's result is the Original Value.
Result Type, the type of Value, and the type pointed to by Pointer must all be the same type.
7 | 204 | <id> Result Type | Result <id> | <id> Pointer | Execution Scope Scope | Memory Semantics Semantics | <id> Value
OpAtomicXor
Perform the following steps atomically with respect to any other atomic accesses within Scope to the same location:
1) load through Pointer to get an Original Value,
2) get a New Value by the bitwise exclusive OR of Original Value and Value, and
3) store the New Value back through Pointer.
The instruction's result is the Original Value.
Result Type, the type of Value, and the type pointed to by Pointer must all be the same type.
7 | 205 | <id> Result Type | Result <id> | <id> Pointer | Execution Scope Scope | Memory Semantics Semantics | <id> Value
OpAtomicIMin
Perform the following steps atomically with respect to any other atomic accesses within Scope to the same location:
1) load through Pointer to get an Original Value,
2) get a New Value by finding the smallest signed integer of Original Value and Value, and
3) store the New Value back through Pointer.
The instruction's result is the Original Value.
Result Type, the type of Value, and the type pointed to by Pointer must all be the same type.
7 | 265 | <id> Result Type | Result <id> | <id> Pointer | Execution Scope Scope | Memory Semantics Semantics | <id> Value
OpAtomicIMax
Perform the following steps atomically with respect to any other atomic accesses within Scope to the same location:
1) load through Pointer to get an Original Value,
2) get a New Value by finding the largest signed integer of Original Value and Value, and
3) store the New Value back through Pointer.
The instruction's result is the Original Value.
Result Type, the type of Value, and the type pointed to by Pointer must all be the same type.
7 | 266 | <id> Result Type | Result <id> | <id> Pointer | Execution Scope Scope | Memory Semantics Semantics | <id> Value
3.27.18 Primitive Instructions
OpEmitVertex
Capability:
Geom
184
OpEndPrimitive
Capability:
Geom
185
OpEmitStreamVertex
Capability:
Geom
<id> Stream
OpEndStreamPrimitive
Capability:
Geom
<id> Stream
3.27.19 Barrier Instructions
OpControlBarrier
Wait for other invocations of this module to reach this same point of execution.
All invocations of this module within Scope must reach this point of execution before any will proceed beyond it.
This instruction is only guaranteed to work correctly if placed strictly within dynamically uniform control flow within
Scope. This ensures that if any invocation executes it, all invocations will execute it. If placed elsewhere, an invocation
may stall indefinitely.
It is only valid to use this instruction with TessellationControl, GLCompute, or Kernel execution models.
2 | 188 | Execution Scope Scope
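The rendezvous behavior of OpControlBarrier can be illustrated on the host with threads standing in for invocations. This is only an analogy: `threading.Barrier` is Python's standard-library barrier, and the function name and the use of threads are assumptions of the sketch.

```python
import threading


def run_with_barrier(num_invocations=4):
    """Each thread records work before the barrier, then counts what it can see after it."""
    barrier = threading.Barrier(num_invocations)
    lock = threading.Lock()
    before = []
    seen_after = []

    def invocation(index):
        with lock:
            before.append(index)            # work done before the barrier
        barrier.wait()                      # nobody proceeds until all have arrived
        with lock:
            seen_after.append(len(before))  # every pre-barrier write is now present

    threads = [threading.Thread(target=invocation, args=(i,)) for i in range(num_invocations)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen_after
```

If one thread skipped `barrier.wait()`, the remaining threads would block forever, which mirrors the warning about placing the instruction outside dynamically uniform control flow.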
OpMemoryBarrier
Control the order in which memory accesses are observed.
Ensures that memory accesses issued before this instruction will be observed before memory accesses issued after this
instruction. This control is ensured only for memory accesses issued by this invocation and observed by another invocation
executing within Scope.
Semantics declares what kind of memory is being controlled and what kind of control to apply.
3 | 189 | Execution Scope Scope | Memory Semantics Semantics
3.27.20 Group Instructions
OpAsyncGroupCopy
Capability:
Kernel
Perform an asynchronous group copy of Num Elements elements from Source to Destination. The
asynchronous copy is performed by all work-items in a group.
Returns an event object that can be used by OpWaitGroupEvents to wait for the copy to finish.
Event must be OpTypeEvent.
Event can be used to associate the copy with a previous copy, allowing an event to be shared by multiple
copies. Otherwise Event should be an OpConstantNullObject.
If the Event argument is not OpConstantNullObject, the event object supplied in the Event argument will be returned.
Scope must be the Workgroup or Subgroup Execution Scope.
Destination and Source should both be pointers to the same integer or floating-point scalar or vector data type.
Destination and Source pointer storage class can be either WorkgroupLocal or WorkgroupGlobal.
When Destination pointer storage class is WorkgroupLocal, the Source pointer storage class must be
WorkgroupGlobal. In this case Stride defines the stride in elements when reading from Source pointer.
When Destination pointer storage class is WorkgroupGlobal, the Source pointer storage class must be
WorkgroupLocal. In this case Stride defines the stride in elements when writing each element to
Destination pointer.
Stride and Num Elements must be a 32 bit OpTypeInt when the Addressing Model is Physical32 and a 64 bit
OpTypeInt when the Addressing Model is Physical64.
9 | 219 | <id> Result Type | Result <id> | Execution Scope Scope | <id> Destination | <id> Source | <id> Num Elements | <id> Stride | <id> Event
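The role of Stride in OpAsyncGroupCopy depends on the copy direction, as described above. The sketch below shows only the element addressing, using plain lists and a sequential loop; it does not model asynchrony, events, or work-items, and the function names are hypothetical.

```python
def copy_global_to_local(dst_local, src_global, num_elements, stride):
    """WorkgroupGlobal -> WorkgroupLocal: Stride applies when reading from Source."""
    for i in range(num_elements):
        dst_local[i] = src_global[i * stride]


def copy_local_to_global(dst_global, src_local, num_elements, stride):
    """WorkgroupLocal -> WorkgroupGlobal: Stride applies when writing to Destination."""
    for i in range(num_elements):
        dst_global[i * stride] = src_local[i]
```

With a stride of 2, a global-to-local copy of three elements gathers source elements 0, 2 and 4 into consecutive local slots, and the reverse copy scatters three local elements to destination slots 0, 2 and 4.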
OpWaitGroupEvents
Capability:
Kernel
Wait for events generated by OpAsyncGroupCopy operations to complete. The event objects pointed
to by Events List will be released after the wait is performed.
Events List must be a pointer to OpTypeEvent.
Scope must be the Workgroup or Subgroup Execution Scope.
Num Events must be a 32 bits wide OpTypeInt.
6 | 220 | <id> Result Type | Result <id> | Execution Scope Scope | <id> Num Events | <id> Events List
OpGroupAll
Capability:
Kernel
Evaluates a predicate for all work-items in the group, and returns true if predicate evaluates to
true for all work-items in the group, otherwise returns false.
Both the Predicate and the Result Type must be of OpTypeBool.
Scope must be the Workgroup or Subgroup Execution Scope.
5 | 221 | <id> Result Type | Result <id> | Execution Scope Scope | <id> Predicate
OpGroupAny
Capability:
Kernel
Evaluates a predicate for all work-items in the group, and returns true if predicate evaluates to
true for any work-item in the group, otherwise returns false.
Both the Predicate and the Result Type must be of OpTypeBool.
Scope must be the Workgroup or Subgroup Execution Scope.
5 | 222 | <id> Result Type | Result <id> | Execution Scope Scope | <id> Predicate
OpGroupBroadcast
Capability:
Kernel
Broadcast a value for the work-item identified by the local id to all work-items in the group.
Value and Result Type must be a 32 or 64 bits wide OpTypeInt or a 16, 32 or 64 bits wide OpTypeFloat
floating-point scalar datatype.
LocalId must be an integer datatype. It can be a scalar, or a vector with 2 components or a vector
with 3 components. LocalId must be the same for all work-items in the group.
Scope must be the Workgroup or Subgroup Execution Scope.
6 | 223 | <id> Result Type | Result <id> | Execution Scope Scope | <id> Value | <id> LocalId
OpGroupIAdd
Capability:
Kernel
An integer add group operation specified for all values of X specified by work-items in the group.
X and Result Type must be a 32 or 64 bits wide OpTypeInt data type.
Scope must be the Workgroup or Subgroup Execution Scope.
The identity I is 0.
6 | 224 | <id> Result Type | Result <id> | Execution Scope Scope | Group Operation Operation | <id> X
OpGroupFAdd
Capability:
Kernel
A floating-point add group operation specified for all values of X specified by work-items in the
group.
Both X and Result Type must be a 16, 32 or 64 bits wide OpTypeFloat data type.
Scope must be the Workgroup or Subgroup Execution Scope.
The identity I is 0.
6 | 225 | <id> Result Type | Result <id> | Execution Scope Scope | Group Operation Operation | <id> X
OpGroupFMin
Capability:
Kernel
A floating-point minimum group operation specified for all values of X specified by work-items in
the group.
Both X and Result Type must be a 16, 32 or 64 bits wide OpTypeFloat data type.
Scope must be the Workgroup or Subgroup Execution Scope.
The identity I is +INF.
6 | 226 | <id> Result Type | Result <id> | Execution Scope Scope | Group Operation Operation | <id> X
OpGroupUMin
Capability:
Kernel
An unsigned integer minimum group operation specified for all values of X specified by work-items
in the group.
X and Result Type must be a 32 or 64 bits wide OpTypeInt data type.
Scope must be the Workgroup or Subgroup Execution Scope.
The identity I is UINT_MAX when X is 32 bits wide and ULONG_MAX when X is 64 bits wide.
6 | 227 | <id> Result Type | Result <id> | Execution Scope Scope | Group Operation Operation | <id> X
OpGroupSMin
Capability:
Kernel
A signed integer minimum group operation specified for all values of X specified by work-items in
the group.
X and Result Type must be a 32 or 64 bits wide OpTypeInt data type.
Scope must be the Workgroup or Subgroup Execution Scope.
The identity I is INT_MAX when X is 32 bits wide and LONG_MAX when X is 64 bits wide.
6 | 228 | <id> Result Type | Result <id> | Execution Scope Scope | Group Operation Operation | <id> X
OpGroupFMax
Capability:
Kernel
A floating-point maximum group operation specified for all values of X specified by work-items in
the group.
Both X and Result Type must be a 16, 32 or 64 bits wide OpTypeFloat data type.
Scope must be the Workgroup or Subgroup Execution Scope.
The identity I is -INF.
6 | 229 | <id> Result Type | Result <id> | Execution Scope Scope | Group Operation Operation | <id> X
OpGroupUMax
Capability:
Kernel
An unsigned integer maximum group operation specified for all values of X specified by work-items
in the group.
X and Result Type must be a 32 or 64 bits wide OpTypeInt data type.
Scope must be the Workgroup or Subgroup Execution Scope.
The identity I is 0.
6 | 230 | <id> Result Type | Result <id> | Execution Scope Scope | Group Operation Operation | <id> X
OpGroupSMax
Capability:
Kernel
A signed integer maximum group operation specified for all values of X specified by work-items in
the group.
X and Result Type must be a 32 or 64 bits wide OpTypeInt data type.
Scope must be the Workgroup or Subgroup Execution Scope.
The identity I is INT_MIN when X is 32 bits wide and LONG_MIN when X is 64 bits wide.
6 | 231 | <id> Result Type | Result <id> | Execution Scope Scope | Group Operation Operation | <id> X
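The identity values listed for the group operations above are the starting values that leave a reduction unchanged. The following sketch folds the X values of a group from the stated identity. It assumes the Group Operation is a plain reduction, models only the 32-bit identities, and ignores integer wraparound; the table `GROUP_OPS` and the function `group_reduce` are hypothetical names.

```python
import math
from functools import reduce

UINT_MAX = 0xFFFFFFFF   # 32-bit case only
INT_MAX = 0x7FFFFFFF
INT_MIN = -0x80000000

# Operation name -> (combining function, identity I as stated in the text above).
GROUP_OPS = {
    "IAdd": (lambda a, b: a + b, 0),
    "FAdd": (lambda a, b: a + b, 0.0),
    "FMin": (min, math.inf),
    "UMin": (min, UINT_MAX),
    "SMin": (min, INT_MAX),
    "FMax": (max, -math.inf),
    "UMax": (max, 0),
    "SMax": (max, INT_MIN),
}


def group_reduce(op_name, xs):
    """Fold the X values contributed by the work-items, starting from the identity."""
    combine, identity = GROUP_OPS[op_name]
    return reduce(combine, xs, identity)
```

Starting from the identity means an empty contribution yields the identity itself, and any real value replaces it on the first combine.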
3.27.21
OpEnqueueMarker
Capability:
Kernel
Enqueue a marker command to the queue object specified by q. The marker command waits for a list
of events to complete, or if the list is empty it waits for all previously enqueued commands in q to
complete before the marker completes.
Num Events specifies the number of event objects in the wait list pointed to by Wait Events and must be a 32 bit
OpTypeInt treated as an unsigned integer.
Wait Events specifies the list of wait event objects and must be an OpTypePointer to OpTypeDeviceEvent.
Ret Event must be an OpTypePointer to OpTypeDeviceEvent, which gets implicitly retained by this instruction.
If Ret Event is set to null this instruction becomes a no-op.
Result Type must be a 32 bit OpTypeInt.
These are the possible return values:
A successful enqueue is indicated by the integer value 0
A failed enqueue is indicated by the negative integer value -101
When running the clCompileProgram or clBuildProgram with -g flag, the following errors may be
returned instead of the negative integer value -101:
- When q is an invalid queue object, the negative integer value -102 is returned.
- When Wait Events is null and Num Events > 0, or if Wait Events is not null and Num Events is 0, or if
event objects in Wait Events are not valid events, the negative integer value -57 is returned.
- When the queue object q is full, the negative integer value -161 is returned.
- When Ret Event is not a null object and an event could not be allocated, the negative integer value -100
is returned.
- When there is a failure to queue Invoke in the queue q because of insufficient resources needed to
execute the kernel, the negative integer value -5 is returned.
7 | 249 | <id> Result Type | Result <id> | <id> q | <id> Num Events | <id> Wait Events | <id> Ret Event
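The OpEnqueueMarker return values listed above can be collected into a small lookup for host-side diagnostics. This is a sketch only: the constant names, the wording of the messages, and both helper functions are hypothetical; the numeric values are the ones given in the text.

```python
ENQUEUE_SUCCESS = 0
ENQUEUE_FAILURE = -101

# Detailed codes that may replace -101, per the list above.
DETAILED_ERRORS = {
    -102: "q is an invalid queue object",
    -57: "Wait Events and Num Events are inconsistent, or an event is invalid",
    -161: "the queue object q is full",
    -100: "Ret Event is not null and an event could not be allocated",
    -5: "insufficient resources to execute the kernel",
}


def describe_enqueue_result(code):
    """Map a return value to a short description."""
    if code == ENQUEUE_SUCCESS:
        return "success"
    if code == ENQUEUE_FAILURE:
        return "failed enqueue"
    return DETAILED_ERRORS.get(code, "unrecognized return value")


def wait_list_is_consistent(num_events, wait_events):
    """Check the count/list pairing behind the -57 case (event validity is not checked)."""
    if wait_events is None:
        return num_events == 0
    return num_events > 0
```

Checking the wait list before enqueueing makes the -57 condition explicit: a null list requires a count of zero, and a non-null list requires a nonzero count.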
OpEnqueueKernel
Capability:
Kernel
Enqueue the function specified by Invoke and the NDRange specified by ND Range for execution to the queue
object specified by q.
ND Range must be an OpTypeStruct created by OpBuildNDRange.
Num Events specifies the number of event objects in the wait list pointed to by Wait Events and must be a 32 bit
OpTypeInt treated as an unsigned integer.
Wait Events specifies the list of wait event objects and must be an OpTypePointer to OpTypeDeviceEvent.
Ret Event must be an OpTypePointer to OpTypeDeviceEvent, which gets implicitly retained by this instruction.
Invoke must be a OpTypeFunction with the following signature:
- Result Type must be OpTypeVoid.
- The first parameter must be OpTypePointer to 8 bits OpTypeInt.
- Optional list of parameters that must be OpTypePointer with WorkgroupLocal storage class.
Param is the first parameter of the function specified by Invoke and must be OpTypePointer to 8 bit OpTypeInt.
Param Size is the size in bytes of the memory pointed by Param and must be a 32 bit OpTypeInt treated as
unsigned int.
Param Align is the alignment of Param.
Local Size is an optional list of 32 bit OpTypeInt values which are treated as unsigned integers. Every Local Size
specifies the size in bytes of the OpTypePointer with WorkgroupLocal of Invoke. The number of Local Size
operands must match the signature of Invoke OpTypeFunction.
Result Type must be a 32 bit OpTypeInt.
These are the possible return values:
A successful enqueue is indicated by the integer value 0
A failed enqueue is indicated by the negative integer value -101
When running the clCompileProgram or clBuildProgram with -g flag, the following errors may be returned instead
of the negative value -101:
- When q is an invalid queue object, the negative integer value -102 is returned.
- When ND Range is an invalid descriptor or if the program was compiled with -cl-uniform-work-group-size and
the local work size is specified in ndrange but the global work size specified in ND Range is not a multiple of the
local work size, the negative integer value -160 is returned.
- When Wait Events is null and Num Events > 0, or if Wait Events is not null and Num Events is 0, or if event
objects in Wait Events are not valid events, the negative integer value -57 is returned.
- When the queue object q is full, the negative integer value -161 is returned.
- When one of the operands Local Size is 0, the negative integer value -51 is returned.
- When Ret Event is not a null object and an event could not be allocated, the negative integer value -100 is
returned.
- When there is a failure to queue Invoke in the queue q because of insufficient resources needed to execute the
kernel, the negative integer value -5 is returned.
13+ variable | 250 | <id> Result Type | Result <id> | <id> q | Kernel Enqueue Flags flags | <id> ND Range | <id> Num Events | <id> Wait Events | <id> Ret Event | <id> Invoke | <id> Param | <id> Param Size | <id> Param Align | <id>, <id>, ... Local Size
OpGetKernelNDrangeSubGroupCount
Capability:
Kernel
Returns the number of subgroups in each workgroup of the dispatch (except for the last in cases
where the global size does not divide cleanly into work-groups) given the combination of the
passed NDRange descriptor specified by ND Range and the function specified by Invoke.
ND Range must be a OpTypeStruct created by OpBuildNDRange.
Invoke must be a OpTypeFunction with the following signature:
- Result Type must be OpTypeVoid.
- The first parameter must be OpTypePointer to 8 bits OpTypeInt.
- Optional list of parameters that must be OpTypePointer with WorkgroupLocal storage class.
Result Type must be a 32 bit OpTypeInt.
5 | 251 | <id> Result Type | Result <id> | <id> ND Range | <id> Invoke
OpGetKernelNDrangeMaxSubGroupSize
Capability:
Kernel
Returns the maximum sub-group size for the function specified by Invoke and the NDRange
specified by ND Range.
ND Range must be a OpTypeStruct created by OpBuildNDRange.
Invoke must be a OpTypeFunction with the following signature:
- Result Type must be OpTypeVoid.
- The first parameter must be OpTypePointer to 8 bits OpTypeInt.
- Optional list of parameters that must be OpTypePointer with WorkgroupLocal storage class.
Result Type must be a 32 bit OpTypeInt.
5 | 252 | <id> Result Type | Result <id> | <id> ND Range | <id> Invoke
OpGetKernelWorkGroupSize
Capability:
Kernel
Returns the maximum work-group size that can be used to execute the function specified
by Invoke on the device.
Invoke must be a OpTypeFunction with the following signature:
- Result Type must be OpTypeVoid.
- The first parameter must be OpTypePointer to 8 bits OpTypeInt.
- Optional list of parameters that must be OpTypePointer with WorkgroupLocal storage
class.
Result Type must be a 32 bit OpTypeInt.
4 | 253 | <id> Result Type | Result <id> | <id> Invoke
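The word layouts listed for these instructions can be illustrated with a short sketch. It assumes the general SPIR-V instruction encoding, in which the first word of an instruction holds the word count in its high-order 16 bits and the opcode in its low-order 16 bits. The helper name encode_instruction and the <id> values used are hypothetical and chosen only for illustration.

```python
def encode_instruction(opcode, operands):
    """Pack one instruction into a list of 32-bit words.

    Assumption: word 0 carries the word count (high 16 bits) and the
    opcode (low 16 bits); the operands follow, one word each.
    """
    word_count = 1 + len(operands)
    if word_count > 0xFFFF or not 0 <= opcode <= 0xFFFF:
        raise ValueError("word count and opcode must each fit in 16 bits")
    return [(word_count << 16) | opcode] + [op & 0xFFFFFFFF for op in operands]


# OpGetKernelWorkGroupSize: word count 4, opcode 253.
# The <id> values 10 (Result Type), 11 (Result <id>) and 12 (Invoke) are made up.
words = encode_instruction(253, [10, 11, 12])
print([hex(w) for w in words])
```

Decoding is the reverse: shifting the first word right by 16 yields the word count, and masking it with 0xFFFF yields the opcode.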
OpGetKernelPreferredWorkGroupSizeMultiple
Capability:
Kernel
Returns the preferred multiple of work-group size for the function specified by Invoke.
This is a performance hint. Specifying a local work size that is not a multiple of the
value returned by this query will not cause the enqueue of Invoke to fail, unless the
work-group size specified is larger than the device maximum.
Invoke must be a OpTypeFunction with the following signature:
- Result Type must be OpTypeVoid.
- The first parameter must be OpTypePointer to 8-bit OpTypeInt.
- Optional list of parameters that must be OpTypePointer with WorkgroupLocal storage
class.
Result Type must be a 32 bit OpTypeInt.
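Because the returned value is only a hint, a caller might round a requested local work size to a multiple of it while staying within the maximum work-group size. The following is a minimal sketch of that idea, not a prescribed policy. The function pick_local_size and its parameters are hypothetical names; the two limits stand in for the results of OpGetKernelPreferredWorkGroupSizeMultiple and OpGetKernelWorkGroupSize.

```python
def pick_local_size(requested, preferred_multiple, max_work_group_size):
    """Round `requested` down to a multiple of `preferred_multiple`,
    never exceeding `max_work_group_size`."""
    if requested < 1 or preferred_multiple < 1 or max_work_group_size < 1:
        raise ValueError("all sizes must be positive")
    capped = min(requested, max_work_group_size)
    rounded = (capped // preferred_multiple) * preferred_multiple
    # If not even one full multiple fits, keep the capped size: a
    # non-multiple may run slower but is still a valid local work size.
    return rounded if rounded > 0 else capped


print(pick_local_size(100, 32, 256))
```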
4 | 254 | <id> Result Type | Result <id> | <id> Invoke
OpRetainEvent
Capability:
Kernel
<id> event
OpReleaseEvent
Capability:
Kernel
<id> event
OpCreateUserEvent
Capability:
Kernel
Create a user event. The execution status of the created event is set to a value of 2 (CL_SUBMITTED).
Result Type must be OpTypeDeviceEvent.
257 | <id> Result Type | Result <id>
OpIsValidEvent
Capability:
Kernel
Result <id>
OpSetUserEventStatus
Capability:
Kernel
<id> event | <id> status
OpCaptureEventProfilingInfo
Capability:
Kernel
Captures the profiling information specified by info for the command associated with the
event specified by event in the memory pointed to by value. The profiling information will
be available in value once the command identified by event has completed.
event must be a OpTypeDeviceEvent that was produced by OpEnqueueKernel or
OpEnqueueMarker.
When info is CmdExecTime, value must be a OpTypePointer with WorkgroupGlobal
storage class, to two 64-bit OpTypeInt values. The first 64-bit value describes the elapsed
time CL_PROFILING_COMMAND_END - CL_PROFILING_COMMAND_START for
the command identified by event in nanoseconds. The second 64-bit value describes the
elapsed time CL_PROFILING_COMMAND_COMPLETE - CL_PROFILING_COMMAND_START
for the command identified by event in nanoseconds.
Note: The behavior of this instruction is undefined when called multiple times for the
same event.
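As an illustration of the two values written for CmdExecTime, the sketch below computes both elapsed times from three timestamps and packs them as two consecutive 64-bit integers. The function name cmd_exec_time, the sample timestamps, and the little-endian unsigned layout are assumptions made for this example only.

```python
import struct


def cmd_exec_time(start_ns, end_ns, complete_ns):
    """Return the bytes of the two 64-bit values described for CmdExecTime.

    First value:  END - START       (nanoseconds)
    Second value: COMPLETE - START  (nanoseconds)
    Assumption: values are stored little-endian and unsigned.
    """
    elapsed_end = end_ns - start_ns
    elapsed_complete = complete_ns - start_ns
    return struct.pack("<2Q", elapsed_end, elapsed_complete)


# Hypothetical timestamps, in nanoseconds.
buf = cmd_exec_time(start_ns=1000, end_ns=4500, complete_ns=5200)
first, second = struct.unpack("<2Q", buf)
print(first, second)
```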
4 | 260 | <id> event | Kernel Profiling Info info | <id> value
OpGetDefaultQueue
Capability:
Kernel
Result <id>
OpBuildNDRange
Capability:
Kernel
Given the global work size specified by GlobalWorkSize, local work size specified by LocalWorkSize
and global work offset specified by GlobalWorkOffset, builds a 1D, 2D or 3D ND-range descriptor
structure.
GlobalWorkSize, LocalWorkSize and GlobalWorkOffset must each be a scalar or an array with 2 or 3
components, where the type of each element in the array is 32 bit OpTypeInt when the Addressing
Model is Physical32 or 64 bit OpTypeInt when the Addressing Model is Physical64.
Result Type is the descriptor and must be a OpTypeStruct with the following ordered list of members,
starting from the first to last:
- 32 bit OpTypeInt that specifies the number of dimensions used to specify the global work-items and
work-items in the work-group.
- OpTypeArray with 3 elements, where each element is 32 bit OpTypeInt when the Addressing
Model is Physical32 and 64 bit OpTypeInt when the Addressing Model is Physical64. This
member is an array of per-dimension unsigned values that describe the offset used to calculate the
global ID of a work-item.
- OpTypeArray with 3 elements, where each element is 32 bit OpTypeInt when the Addressing
Model is Physical32 and 64 bit OpTypeInt when the Addressing Model is Physical64. This
member is an array of per-dimension unsigned values that describe the number of global work-items
in the dimensions that will execute the kernel function.
- OpTypeArray with 3 elements, where each element is 32 bit OpTypeInt when the Addressing
Model is Physical32 and 64 bit OpTypeInt when the Addressing Model is Physical64. This
member is an array of per-dimension unsigned values that describe the number of
work-items that make up a work-group.
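The member order listed above can be modeled with a small sketch. It mirrors the descriptor as a Python dataclass and builds it from scalar or 2- or 3-component inputs. The names NDRange and build_ndrange, the zero padding of unused dimensions, and the requirement that all three inputs have the same number of components are assumptions of this example, not statements of the specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NDRange:
    # Members in the order listed for the descriptor struct.
    work_dim: int
    global_work_offset: List[int]
    global_work_size: List[int]
    local_work_size: List[int]


def _as_list(value):
    """Treat a scalar as a one-component array."""
    return [value] if isinstance(value, int) else list(value)


def build_ndrange(global_work_size, local_work_size, global_work_offset):
    sizes = [_as_list(v) for v in
             (global_work_size, local_work_size, global_work_offset)]
    dims = len(sizes[0])
    if dims not in (1, 2, 3) or any(len(s) != dims for s in sizes):
        raise ValueError("operands must share a dimension count of 1, 2 or 3")
    pad = [0] * (3 - dims)  # assumption: unused dimensions are zero-filled
    gws, lws, off = (s + pad for s in sizes)
    return NDRange(dims, off, gws, lws)


nd = build_ndrange([64, 32], [8, 4], [0, 0])
print(nd)
```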
6 | 262 | <id> Result Type | Result <id> | <id> GlobalWorkSize | <id> LocalWorkSize | <id> GlobalWorkOffset
3.27.22 Pipe Instructions
OpReadPipe
Capability:
Kernel
Read a packet from the pipe object specified by p into ptr. Returns 0 if the operation is
successful and a negative value if the pipe is empty.
p must be a OpTypePipe with ReadOnly Access Qualifier.
ptr must be a OpTypePointer with the same data type as p and a Generic storage class.
5 | 234 | <id> Result Type | Result <id> | <id> p | <id> ptr
OpWritePipe
Capability:
Kernel
Write a packet from ptr to the pipe object specified by p. Returns 0 if the operation is successful
and a negative value if the pipe is full.
p must be a OpTypePipe with WriteOnly Access Qualifier.
ptr must be a OpTypePointer with the same data type as p and a Generic storage class.
Result Type must be a 32-bit OpTypeInt.
5 | 235 | <id> Result Type | Result <id> | <id> p | <id> ptr
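The return-value convention of OpReadPipe and OpWritePipe can be pictured with a toy host-side model. This is only a sketch of the described behavior, not of any implementation: the class Pipe, the first-in-first-out ordering, and the specific negative value -1 are assumptions of this example (the text above only states that a negative value is returned).

```python
from collections import deque


class Pipe:
    """Toy model of a pipe holding at most `max_packets` packets."""

    def __init__(self, max_packets):
        self.max_packets = max_packets
        self._packets = deque()

    def write(self, packet):
        """Return 0 on success, a negative value if the pipe is full."""
        if len(self._packets) >= self.max_packets:
            return -1
        self._packets.append(packet)
        return 0

    def read(self, out):
        """Append one packet to `out`; return 0 on success,
        a negative value if the pipe is empty."""
        if not self._packets:
            return -1
        out.append(self._packets.popleft())
        return 0


pipe = Pipe(max_packets=2)
results = [pipe.write(7), pipe.write(8), pipe.write(9)]  # third write: pipe full
received = []
results += [pipe.read(received), pipe.read(received), pipe.read(received)]
print(results, received)
```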
OpReservedReadPipe
Capability:
Kernel
Read a packet from the reserved area specified by reserve_id and index of the pipe object specified by p
into ptr. The reserved pipe entries are referred to by indices that go from 0 . . . num_packets - 1. Returns
0 if the operation is successful and a negative value otherwise.
p must be a OpTypePipe with ReadOnly Access Qualifier.
reserve_id must be a OpTypeReserveId.
index must be a 32-bit OpTypeInt which is treated as an unsigned value.
ptr must be a OpTypePointer with the same data type as p and a Generic storage class.
Result Type must be a 32-bit OpTypeInt.
7 | 236 | <id> Result Type | Result <id> | <id> p | <id> reserve_id | <id> index | <id> ptr
OpReservedWritePipe
Capability:
Kernel
Write a packet from ptr into the reserved area specified by reserve_id and index of the pipe object
specified by p. The reserved pipe entries are referred to by indices that go from 0 . . . num_packets - 1.
Returns 0 if the operation is successful and a negative value otherwise.
p must be a OpTypePipe with WriteOnly Access Qualifier.
reserve_id must be a OpTypeReserveId.
index must be a 32-bit OpTypeInt which is treated as an unsigned value.
ptr must be a OpTypePointer with the same data type as p and a Generic storage class.
Result Type must be a 32-bit OpTypeInt.
7 | 237 | <id> Result Type | Result <id> | <id> p | <id> reserve_id | <id> index | <id> ptr
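The indexing rule for reserved entries (indices 0 through num_packets - 1) can be sketched with a toy model of a reservation. The class Reservation and the helper functions are hypothetical, and treating an out-of-range index as the failure case that yields a negative value is an assumption of this example.

```python
class Reservation:
    """Toy model of a reserved area of `num_packets` pipe entries."""

    def __init__(self, num_packets):
        self.slots = [None] * num_packets


def reserved_write(reservation, index, packet):
    """Store `packet` at `index`; return 0 on success, negative otherwise."""
    if not 0 <= index < len(reservation.slots):
        return -1  # assumption: an out-of-range index is a failure
    reservation.slots[index] = packet
    return 0


def reserved_read(reservation, index, out):
    """Append the packet at `index` to `out`; return 0 on success."""
    if not 0 <= index < len(reservation.slots):
        return -1
    out.append(reservation.slots[index])
    return 0


res = Reservation(num_packets=3)
codes = [reserved_write(res, i, i * 10) for i in range(4)]  # index 3 is out of range
got = []
codes.append(reserved_read(res, 2, got))
print(codes, got)
```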
OpReserveReadPipePackets
Capability:
Kernel
<id> num_packets
OpReserveWritePipePackets
Capability:
Kernel
<id> num_packets
OpCommitReadPipe
Capability:
Kernel
<id> reserve_id
OpCommitWritePipe
Capability:
Kernel
<id> reserve_id
OpIsValidReserveId
Capability:
Kernel
Result <id> | <id> reserve_id
OpGetNumPipePackets
Capability:
Kernel
Returns the number of available entries in the pipe object specified by p. The number of
available entries in a pipe is a dynamic value. The value returned should be considered
immediately stale.
p must be a OpTypePipe with ReadOnly or WriteOnly Access Qualifier.
Result Type must be a 32-bit OpTypeInt which should be treated as an unsigned value.
4 | 243 | <id> Result Type | Result <id> | <id> p
OpGetMaxPipePackets
Capability:
Kernel
Returns the maximum number of packets specified when the pipe object specified by p
was created.
p must be a OpTypePipe with ReadOnly or WriteOnly Access Qualifier.
Result Type must be a 32-bit OpTypeInt which should be treated as an unsigned value.
4 | 244 | <id> Result Type | Result <id> | <id> p
OpGroupReserveReadPipePackets
Capability:
Kernel
Reserve num_packets entries for reading from the pipe object specified by p at group level. Returns a
valid reservation ID if the reservation is successful.
The reserved pipe entries are referred to by indices that go from 0 . . . num_packets - 1.
Scope must be the Workgroup or Subgroup Execution Scope.
p must be a OpTypePipe with ReadOnly Access Qualifier.
num_packets must be a 32-bit OpTypeInt which is treated as an unsigned value.
Result Type must be a OpTypeReserveId.
6 | 245 | <id> Result Type | Result <id> | Execution Scope Scope | <id> p | <id> num_packets
OpGroupReserveWritePipePackets
Capability:
Kernel
Reserve num_packets entries for writing to the pipe object specified by p at group level. Returns a
valid reservation ID if the reservation is successful.
The reserved pipe entries are referred to by indices that go from 0 . . . num_packets - 1.
Scope must be the Workgroup or Subgroup Execution Scope.
p must be a OpTypePipe with WriteOnly Access Qualifier.
num_packets must be a 32-bit OpTypeInt which is treated as an unsigned value.
Result Type must be a OpTypeReserveId.
6 | 246 | <id> Result Type | Result <id> | Execution Scope Scope | <id> p | <id> num_packets
OpGroupCommitReadPipe
Capability:
Kernel
A group level indication that all reads of the num_packets entries associated with the reservation
specified by reserve_id on the pipe object specified by p are completed.
Scope must be the Workgroup or Subgroup Execution Scope.
p must be a OpTypePipe with ReadOnly Access Qualifier.
reserve_id must be a OpTypeReserveId.
4 | 247 | Execution Scope Scope | <id> p | <id> reserve_id
OpGroupCommitWritePipe
Capability:
Kernel
A group level indication that all writes to the num_packets entries associated with the reservation
specified by reserve_id on the pipe object specified by p are completed.
Scope must be the Workgroup or Subgroup Execution Scope.
p must be a OpTypePipe with WriteOnly Access Qualifier.
reserve_id must be a OpTypeReserveId.
4 | 248 | Execution Scope Scope | <id> p | <id> reserve_id
External function linkage is now done through function declarations (functions with no body) and the Linkage Attributes Decoration.
Moved to the official auto-generated header files:
- all enumerants assigned a numeric value
- C++ in the spv namespace
- C with an "Spv" prefix
TBD