P4 Language Specification
version 1.2.2
Abstract
P4 is a language for programming the data plane of network devices. This document provides a precise
definition of the P4_16 language, which is the 2016 revision of the P4 language (http://p4.org).
The target audience for this document includes developers who want to write compilers, simulators,
IDEs, and debuggers for P4 programs. This document may also be of interest to P4 programmers
who are interested in understanding the syntax and semantics of the language at a deeper level.
Contents
1. Scope
2. Terms, definitions, and symbols
3. Overview
3.1. Benefits of P4
3.2. P4 language evolution: comparison to previous versions (P4 v1.0/v1.1)
4. Architecture Model
4.1. Standard architectures
4.2. Data plane interfaces
4.3. Extern objects and functions
5. Example: A very simple switch
5.1. Very Simple Switch Architecture
5.2. Very Simple Switch Architecture Description
5.2.1. Arbiter block
5.2.2. Parser runtime block
5.2.3. Demux block
5.2.4. Available extern blocks
5.3. A complete Very Simple Switch program
6. P4 language definition
6.1. Syntax and semantics
6.1.1. Grammar
6.1.2. Semantics and the P4 abstract machines
6.2. Preprocessing
6.2.1. P4 core library
6.3. Lexical constructs
6.3.1. Identifiers
6.3.2. Comments
6.3.3. Literal constants
6.4. Naming conventions
6.5. P4 programs
6.5.1. Scopes
6.5.2. Stateful elements
6.6. L-values
6.7. Calling convention: call by copy in/copy out
6.7.1. Justification
6.7.2. Optional parameters
6.8. Name resolution
6.9. Visibility
7. P4 data types
7.1. Base types
7.1.1. The void type
7.1.2. The error type
7.1.3. The match kind type
7.1.4. The Boolean type
7.1.5. Strings
7.1.6. Integers (signed and unsigned)
7.2. Derived types
7.2.1. Enumeration types
7.2.2. Header types
7.2.3. Header stacks
7.2.4. Header unions
7.2.5. Struct types
7.2.6. Tuple types
7.2.7. Type nesting rules
7.2.8. Synthesized data types
7.2.9. Extern types
7.2.10. Type specialization
7.2.11. Parser and control blocks types
7.2.12. Package types
7.2.13. Don't care types
7.3. Default values
7.4. typedef
7.5. Introducing new types
8. Expressions
8.1. Expression evaluation order
8.2. Operations on error types
8.3. Operations on enum types
8.4. Expressions on Booleans
8.4.1. Conditional operator
8.5. Operations on bit types (unsigned integers)
8.6. Operations on fixed-width signed integers
8.6.1. Concatenation
8.6.2. A note about shifts
8.7. Operations on arbitrary-precision integers
8.8. Operations on variable-size bit types
8.9. Casts
8.9.1. Explicit casts
8.9.2. Implicit casts
8.9.3. Illegal arithmetic expressions
8.10. Operations on tuples expressions
8.11. Operations on lists
8.12. Structure-valued expressions
8.13. Operations on sets
8.13.1. Singleton sets
8.13.2. The universal set
8.13.3. Masks
8.13.4. Ranges
8.13.5. Products
8.14. Operations on struct types
8.15. Structure initializers
8.16. Operations on headers
8.17. Operations on header stacks
8.18. Operations on header unions
8.19. Method invocations and function calls
8.20. Constructor invocations
8.21. Operations on types introduced by type
8.22. Reading uninitialized values and writing fields of invalid headers
8.23. Initializing with default values
9. Function declarations
10. Constants and variable declarations
10.1. Constants
10.2. Variables
10.3. Instantiations
10.3.1. Instantiating objects with abstract methods
10.3.2. Restrictions on top-level instantiations
11. Statements
11.1. Assignment statement
11.2. Empty statement
11.3. Block statement
11.4. Return statement
11.5. Exit statement
11.6. Conditional statement
11.7. Switch statement
11.7.1. Switch statement with action_run expression
11.7.2. Switch statement with integer or enumerated type expression
11.7.3. Notes common to all switch statements
12. Packet parsing
12.1. Parser states
12.2. Parser declarations
12.3. The Parser abstract machine
12.4. Parser states
12.5. Transition statements
12.6. Select expressions
12.7. verify
12.8. Data extraction
12.8.1. Fixed width extraction
12.8.2. Variable width extraction
12.8.3. Lookahead
12.8.4. Skipping bits
12.9. Header stacks
12.10. Sub-parsers
12.11. Parser Value Sets
13. Control blocks
13.1. Actions
13.1.1. Invoking actions
13.2. Tables
13.2.1. Table properties
13.2.2. Match-action unit invocation
13.2.3. Match-action unit execution semantics
13.3. The Match-Action Pipeline Abstract Machine
13.4. Invoking controls
14. Parameterization
14.1. Direct type invocation
15. Deparsing
15.1. Data insertion into packets
16. Architecture description
16.1. Example architecture description
16.2. Example architecture program
16.3. A Packet Filter Model
17. P4 abstract machine: Evaluation
17.1. Compile-time known values
17.2. Compile-time Evaluation
17.3. Control plane names
17.3.1. Computing control names
17.3.2. Annotations controlling naming
17.3.3. Recommendations
17.4. Dynamic evaluation
17.4.1. Concurrency model
18. Annotations
18.1. Bodies of Unstructured Annotations
18.2. Bodies of Structured Annotations
18.2.1. Structured Annotation Examples
18.3. Predefined annotations
18.3.1. Optional parameter annotations
18.3.2. Annotations on the table action list
18.3.3. Control-plane API annotations
18.3.4. Concurrency control annotations
18.3.5. Value set annotations
18.3.6. Extern function/method annotations
18.3.7. Deprecated annotation
18.3.8. No warnings annotation
18.4. Target-specific annotations
A. Appendix: Revision History
A.1. Summary of changes made in version 1.2.2
A.2. Summary of changes made in version 1.2.1
A.3. Summary of changes made in version 1.2.0
A.4. Summary of changes made in version 1.1.0
B. Appendix: P4 reserved keywords
C. Appendix: P4 reserved annotations
D. Appendix: P4 core library
E. Appendix: Checksums
F. Appendix: Restrictions on compile time and run time calls
G. Appendix: Open Issues
G.1. Generalized switch statement behavior
G.2. Undefined behaviors
G.3. Structured Iteration
H. Appendix: P4 grammar
1. Scope
This specification document defines the structure and interpretation of programs in the P4_16 language.
It defines the syntax, semantic rules, and requirements for conformant implementations of the
language.
It does not define:
It is understood that some implementations may be unable to implement the behavior defined here
in all cases, or may provide options to eliminate some safety guarantees in exchange for better
performance or handling larger programs. They should document where they deviate from this specification.
2. Terms, definitions, and symbols
Throughout this document, the following terms will be used:
• Architecture: A set of P4-programmable components and the data plane interfaces between
them.
• Control plane: A class of algorithms and the corresponding input and output data that are
concerned with the provisioning and configuration of the data plane.
• Data plane: A class of algorithms that describe transformations on packets by packet-processing
systems.
• Metadata: Intermediate data generated during execution of a P4 program.
• Packet: A network packet is a formatted unit of data carried by a packet-switched network.
• Packet header: Formatted data at the beginning of a packet. A given packet may contain a
sequence of packet headers representing different network protocols.
• Packet payload: Packet data that follows the packet headers.
• Packet-processing system: A data-processing system designed for processing network packets.
In general, packet-processing systems implement control plane and data plane algorithms.
• Target: A packet-processing system capable of executing a P4 program.
Terms defined explicitly in this document should not be understood to refer implicitly to similar
terms defined elsewhere. Conversely, any terms not defined explicitly in this document should be
interpreted according to generally recognizable sources, e.g., IETF RFCs.
3. Overview
P4 is a language for expressing how packets are processed by the data plane of a programmable
forwarding element such as a hardware or software switch, network interface card, router, or network
appliance. The name P4 comes from the original paper that introduced the language, “Programming
Protocol-independent Packet Processors,” https://arxiv.org/pdf/1312.1719.pdf. While P4 was
initially designed for programming switches, its scope has been broadened to cover a large variety of
devices. In the rest of this document we use the generic term target for all such devices.
Many targets implement both a control plane and a data plane. P4 is designed to specify only the
data plane functionality of the target. P4 programs also partially define the interface by which the
control plane and the data plane communicate, but P4 cannot be used to describe the control plane
functionality of the target. In the rest of this document, when we talk about P4 as “programming a target”,
we mean “programming the data plane of a target”.
As a concrete example of a target, Figure 1 illustrates the difference between a traditional
fixed-function switch and a P4-programmable switch. In a traditional switch the manufacturer defines the
data plane functionality. The control plane controls the data plane by managing entries in tables (e.g.,
routing tables), configuring specialized objects (e.g., meters), and processing control packets (e.g.,
routing protocol packets) or asynchronous events, such as link state changes or learning notifications.
A P4-programmable switch differs from a traditional switch in two essential ways:
• The data plane functionality is not fixed in advance but is defined by a P4 program. The data plane
is configured at initialization time to implement the functionality described by the P4 program
(shown by the long red arrow) and has no built-in knowledge of existing network protocols.
Figure 1. Traditional switches vs. programmable switches.
• The control plane communicates with the data plane using the same channels as in a
fixed-function device, but the set of tables and other objects in the data plane is no longer fixed, since
they are defined by a P4 program. The P4 compiler generates the API that the control plane uses
to communicate with the data plane.
Hence, P4 can be said to be protocol independent, but it enables programmers to express a rich set of
protocols and other data plane behaviors.
The core abstractions provided by the P4 language are:
• Header types describe the format (the set of fields and their sizes) of each header within a packet.
• Parsers describe the permitted sequences of headers within received packets, how to identify
those header sequences, and the headers and fields to extract from packets.
• Tables associate user-defined keys with actions. P4 tables generalize traditional switch tables;
they can be used to implement routing tables, flow lookup tables, access-control lists, and other
user-defined table types, including complex multi-variable decisions.
• Actions are code fragments that describe how packet header fields and metadata are
manipulated. Actions can include data, which is supplied by the control plane at runtime.
• Match-action units perform the following sequence of operations:
Figure 2. Programming a target with P4.
• Control flow expresses an imperative program that describes packet-processing on a target,
including the data-dependent sequence of match-action unit invocations. Deparsing (packet
reassembly) can also be performed using a control flow.
• Extern objects are architecture-specific constructs that can be manipulated by P4 programs through
well-defined APIs, but whose internal behavior is hard-wired (e.g., checksum units) and hence
not programmable using P4.
• User-defined metadata: user-defined data structures associated with each packet.
• Intrinsic metadata: metadata provided by the architecture associated with each packet—e.g., the
input port where a packet has been received.
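The abstractions listed above can be illustrated with a short P4_16 fragment. This is a minimal sketch, not part of any architecture defined in this document: the names Ethernet_h, Headers_t, Forward, set_port, drop_packet, and dmac are hypothetical, and the value 0xF is only assumed to mean "drop". The parser and deparser are omitted.

```p4
#include <core.p4>   // provides the 'exact' match kind

// Header type: the fields of an Ethernet header and their sizes.
header Ethernet_h {
    bit<48> dstAddr;
    bit<48> srcAddr;
    bit<16> etherType;
}

// User-defined structure holding the parsed headers (hypothetical name).
struct Headers_t {
    Ethernet_h ethernet;
}

// Control flow: an imperative block that invokes a match-action unit.
// The parameter list is illustrative; a real architecture dictates it.
control Forward(inout Headers_t hdr, out bit<4> outPort) {
    // Action: 'port' is action data supplied by the control plane at runtime.
    action set_port(bit<4> port) {
        outPort = port;
    }

    // Action with no control-plane data; 0xF is an assumed "drop" value.
    action drop_packet() {
        outPort = 0xF;
    }

    // Table: associates a user-defined key with the actions above.
    table dmac {
        key = { hdr.ethernet.dstAddr : exact; }
        actions = { set_port; drop_packet; }
        default_action = drop_packet();
    }

    apply {
        dmac.apply();   // match-action unit invocation
    }
}
```

The control plane would populate dmac at runtime with entries that map destination addresses to set_port together with a port number. Packets that match no entry take the default action.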
Figure 2 shows a typical tool workflow when programming a target using P4.
Target manufacturers provide the hardware or software implementation framework, an architecture
definition, and a P4 compiler for that target. P4 programmers write programs for a specific
architecture, which defines a set of P4-programmable components on the target as well as their external
data plane interfaces.
Compiling a set of P4 programs produces two artifacts:
• a data plane configuration that implements the forwarding logic described in the input program
and
• an API for managing the state of the data plane objects from the control plane
under these assumptions, the computational complexity of a P4 program is linear in the total size of
all headers, and never depends on the size of the state accumulated while processing data (e.g., the
number of flows, or the total number of packets processed). These guarantees are necessary (but not
sufficient) for enabling fast packet processing across a variety of targets.
P4 conformance of a target is defined as follows: if a specific target T supports only a subset of the
P4 programming language, say P4T , programs written in P4T executed on the target should provide
the exact same behavior as is described in this document. Note that P4 conformant targets can provide
arbitrary P4 language extensions and extern elements.
3.1. Benefits of P4
Compared to state-of-the-art packet-processing systems (e.g., based on writing microcode on top of
custom hardware), P4 provides a number of significant advantages:
Figure 3. Evolution of the language between versions P4_14 (versions 1.0 and 1.1) and P4_16.
be in the scope of a future document describing a standard library of P4 elements. In this document
we provide several examples of extern constructs. P4_16 also introduces and repurposes some v1.1
language constructs for describing the programmable parts of an architecture. These language constructs
are: parser, state, control, and package.
One important goal of the P4_16 language revision is to provide a stable language definition. In other
words, we strive to ensure that all programs written in P4_16 will remain syntactically correct and behave
identically when treated as programs for future versions of the language. Moreover, if some future
version of the language requires breaking backwards compatibility, we will seek to provide an easy path
for migrating P4_16 programs to the new version.
4. Architecture Model
The P4 architecture identifies the P4-programmable blocks (e.g., parser, ingress control flow, egress
control flow, deparser, etc.) and their data plane interfaces.
The P4 architecture can be thought of as a contract between the program and the target. Each
manufacturer must therefore provide both a P4 compiler and an accompanying architecture
definition for their target. (We expect that P4 compilers can share a common front-end that handles all
architectures.) The architecture definition does not have to expose the entire programmable surface
of the data plane; a manufacturer may even choose to provide multiple definitions for the same
hardware device, each with different capabilities (e.g., with or without multicast support).
Figure 4 illustrates the data plane interfaces between P4-programmable blocks. It shows a target
that has two programmable blocks (#1 and #2). Each block is programmed through a separate fragment
of P4 code. The target interfaces with the P4 program through a set of control registers or signals. Input
controls provide information to P4 programs (e.g., the input port that a packet was received from),
while output controls can be written to by P4 programs to influence the target behavior (e.g., the output
port where a packet has to be directed). Control registers/signals are represented in P4 as intrinsic
metadata. P4 programs can also store and manipulate data pertaining to each packet as user-defined
metadata.
Figure 4. P4 program interfaces.
The behavior of a P4 program can be fully described in terms of transformations that map vectors
of bits to vectors of bits. To actually process a packet, the architecture model interprets the bits that
the P4 program writes to intrinsic metadata. For example, to cause a packet to be forwarded on a
specific output port, a P4 program may need to write the index of an output port into a dedicated control
register. Similarly, to cause a packet to be dropped, a P4 program may need to set a “drop” bit in
another dedicated control register. Note that the details of how intrinsic metadata are interpreted are
architecture-specific.
P4 programs can invoke services implemented by extern objects and functions provided by the
architecture. Figure 5 depicts a P4 program invoking the services of a built-in checksum computation
unit on a target. The implementation of the checksum unit is not specified in P4, but its interface is. In
general, the interface for an extern object describes each operation it provides, along with the
parameter and return types of each operation.
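As an illustration of invoking an extern through its interface, the sketch below uses the Checksum16 interface that this document declares for the Very Simple Switch architecture. The header Sample_h and the control UpdateChecksum are hypothetical names introduced only for this example. What the checksum unit actually computes is up to the target.

```p4
// Interface of the extern, repeated from the VSS architecture declaration
// in this document so that the fragment is self-contained.
extern Checksum16 {
    Checksum16();              // constructor
    void clear();              // prepare unit for computation
    void update<T>(in T data); // add data to checksum
    void remove<T>(in T data); // remove data from existing checksum
    bit<16> get();             // get the checksum for the data added since last clear
}

// Hypothetical header carrying a 16-bit checksum field.
header Sample_h {
    bit<16> payloadWord;
    bit<16> checksum;
}

// Hypothetical control block that recomputes the checksum of the header.
control UpdateChecksum(inout Sample_h h) {
    Checksum16() ck;   // instantiate the extern object

    apply {
        ck.clear();            // start a fresh computation
        h.checksum = 16w0;     // zero the field before summing
        ck.update(h);          // feed the header to the unit
        h.checksum = ck.get(); // store the result
    }
}
```

The program only sees the method signatures. How clear, update, and get are realized (hardware unit, software routine) is hidden behind the extern.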
In general, P4 programs are not expected to be portable across different architectures. For example,
executing a P4 program that broadcasts packets by writing into a custom control register will not
function correctly on a target that does not have the control register. However, P4 programs written for
a given architecture should be portable across all targets that faithfully implement the corresponding
model, provided there are sufficient resources.
This type declaration describes a block named MatchActionPipe that can be programmed using a data-
dependent sequence of match-action unit invocations and other imperative constructs (indicated by
the control keyword). The interface between the MatchActionPipe block and the other components of
the architecture can be read off from this declaration:
• The first parameter is a 4-bit value named inputPort. The direction in indicates that this
parameter is an input that cannot be modified.
• The second parameter is an object of type H named parsedHeaders, where H is a type variable
representing the headers that will be defined later by the P4 programmer. The direction inout indicates
that this parameter is both an input and an output.
• The third parameter is a 4-bit value named outputPort. The direction out indicates that this
parameter is an output whose value is undefined initially but can be modified.
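The type declaration that these bullets describe is not reproduced in the text above. Reconstructed from the bullets alone (block name, the control keyword, and the three parameters with their directions and types), it would read approximately as follows. The line layout is an assumption.

```p4
// Reconstructed from the description: a control block type with a
// type parameter H and three parameters (in, inout, out).
control MatchActionPipe<H>(in bit<4> inputPort,
                           inout H parsedHeaders,
                           out bit<4> outputPort);
```

This is a type declaration only: it has no body, because the implementation is supplied later by the P4 programmer.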
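extern Checksum16 {
    Checksum16();              // constructor
    void clear();              // prepare unit for computation
    void update<T>(in T data); // add data to checksum
    void remove<T>(in T data); // remove data from existing checksum
    bit<16> get();             // get the checksum for the data added since last clear
}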
Figure 6. The Very Simple Switch (VSS) architecture.
• Packets sent to the “CPU port” are sent to the control plane
• Packets sent to the “Drop port” are discarded
• Packets sent to the “Recirculate port” are re-injected in the switch through a special input port
The white blocks in the figure are programmable, and the user must provide a corresponding P4
program to specify the behavior of each such block. The red arrows indicate the flow of user-defined data.
The cyan blocks are fixed-function components. The green arrows are data plane interfaces used to
convey information between the fixed-function blocks and the programmable blocks, exposed in the
P4 program as intrinsic metadata.
// File "very_simple_switch_model.p4"
// Very Simple Switch P4 declaration
// core library needed for packet_in and packet_out definitions
# include <core.p4>
/* Various constants and structure declarations */
/* ports are represented using 4-bit values */
typedef bit<4> PortId;
/* only 8 ports are "real" */
const PortId REAL_PORT_COUNT = 4w8; // 4w8 is the number 8 in 4 bits
/* metadata accompanying an input packet */
struct InControl {
PortId inputPort;
}
/* special input port values */
const PortId RECIRCULATE_IN_PORT = 0xD;
const PortId CPU_IN_PORT = 0xE;
/* metadata that must be computed for outgoing packets */
struct OutControl {
PortId outputPort;
}
/* special output port values for outgoing packet */
const PortId DROP_PORT = 0xF;
const PortId CPU_OUT_PORT = 0xE;
const PortId RECIRCULATE_OUT_PORT = 0xD;
/* Prototypes for all programmable blocks */
/**
* Programmable parser.
* @param <H> type of headers; defined by user
* @param b input packet
* @param parsedHeaders headers constructed by parser
*/
parser Parser<H>(packet_in b,
out H parsedHeaders);
/**
* Match-action pipeline
* @param <H> type of input and output headers
* @param headers headers received from the parser and sent to the deparser
* @param parseError error that may have surfaced during parsing
* @param inCtrl information from architecture, accompanying input packet
* @param outCtrl information for architecture, accompanying output packet
*/
control Pipe<H>(inout H headers,
in error parseError,// parser error
in InControl inCtrl,// input port
out OutControl outCtrl); // output port
/**
* VSS deparser.
* @param <H> type of headers; defined by user
* @param b output packet
* @param outputHeaders headers for output packet
*/
control Deparser<H>(inout H outputHeaders,
packet_out b);
/**
* Top-level package declaration - must be instantiated by user.
* The arguments to the package indicate blocks that
* must be instantiated by the user.
* @param <H> user-defined type of the headers processed.
*/
package VSS<H>(Parser<H> p,
Pipe<H> map,
Deparser<H> d);
// Architecture-specific objects that can be instantiated
// Checksum unit
extern Checksum16 {
Checksum16(); // constructor
void clear(); // prepare unit for computation
void update<T>(in T data); // add data to checksum
void remove<T>(in T data); // remove data from existing checksum
bit<16> get(); // get the checksum for the data added since last clear
}
• The included file core.p4 is described in more detail in Appendix D. It defines some standard
data-types and error codes.
• The syntax 4w0xF indicates the value 15 represented using 4 bits. An alternative notation is 4w15.
In many circumstances the width modifier can be omitted, writing just 15.
• Next follows the declaration of a parser:
parser Parser<H>(packet_in b, out H parsedHeaders);
This declaration describes the interface for a parser, but not yet its implementation, which will be
provided by the programmer. The parser reads its input from a packet_in, which is a pre-defined
P4 extern object that represents an incoming packet, declared in the core.p4 library. The parser
writes its output (the out keyword) into the parsedHeaders argument. The type of this argument is
H, yet unknown—it will also be provided by the programmer.
• The declaration
package VSS<H>
depends on a type variable H. A type variable indicates a type yet unknown that must be provided by
the user at a later time. In this
case H is the type of the set of headers that the user program will be processing; the parser will pro-
duce the parsed representation of these headers, and the match-action pipeline will update the input
headers in place to produce the output headers.
• The package VSS declaration has three complex parameters, of types Parser, Pipe, and Deparser
respectively, which are exactly the declarations we have just described. To program the
target, one has to supply values for these parameters.
• In this program the inCtrl and outCtrl structures represent control registers. The content of the
headers structure is stored in general-purpose registers.
• The extern Checksum16 declaration describes an extern object whose services can be invoked to
compute checksums.
section can be seen as a simple example illustrating all the details that have to be handled when writ-
ing an architecture description. The P4 language is not intended to cover the description of all such
functional blocks—the language can only describe the interfaces between programmable blocks and
the architecture. For the current program, this interface is given by the Parser, Pipe, and Deparser dec-
larations. In practice we expect that the complete description of the architecture will be provided as an
executable program and/or diagrams and text; in this document we will provide informal descriptions
in English.
• It receives packets from one of the physical input Ethernet ports, from the control plane, or from
the input recirculation port.
• For packets received from Ethernet ports, the block computes the Ethernet trailer checksum and
verifies it. If the checksum does not match, the packet is discarded. If the checksum does match,
it is removed from the packet payload.
• Receiving a packet involves running an arbitration algorithm if multiple packets are available.
• If the arbiter block is busy processing a previous packet and no queue space is available, input
ports may drop arriving packets without giving any indication that the packets were dropped.
• After receiving a packet, the arbiter block sets inCtrl.inputPort, which is an input to the
match-action pipeline, to the identity of the input port where the packet originated. Physical
Ethernet ports are numbered 0 to 7, the input recirculation port is numbered 13, and the
CPU port is numbered 14.
• Sending the packet to the drop port causes the packet to disappear.
• Sending the packet to an output Ethernet port numbered between 0 and 7 causes it to be emitted
on the corresponding physical interface. The packet may be placed in a queue if the output inter-
face is already busy emitting another packet. When the packet is emitted, the physical interface
computes a correct Ethernet checksum trailer and appends it to the packet.
• Sending a packet to the output CPU port causes the packet to be transferred to the control plane.
In this case, the packet that is sent to the CPU is the original input packet, and not the packet
received from the deparser—the latter packet is discarded.
• Sending the packet to the output recirculation port causes it to appear at the input recirculation
port. Recirculation is useful when packet processing cannot be completed in a single pass.
• If the outputPort has an illegal value (e.g., 9), the packet is dropped.
• Finally, if the demux unit is busy processing a previous packet and there is no capacity to queue
the packet coming from the deparser, the demux unit may drop the packet, irrespective of the
output port indicated.
Please note that some of the behaviors of the demux block may be unexpected—we have highlighted
them in bold. We are not specifying here several important behaviors related to queue size, arbitration,
and timing, which also influence the packet processing.
The arrow shown from the parser runtime to the demux block represents an additional information
flow from the parser to the demux: the packet being processed as well as the offset within the packet
where parsing ended (i.e., the start of the packet payload).
• clear(): prepares the unit for a new computation.
• update<T>(in T data): adds some data to be checksummed. The data must be either a bit-string, a
header-typed value, or a struct containing such values. The fields in the header/struct are
concatenated in the order they appear in the type declaration.
• get(): returns the 16-bit one's complement checksum. When this function is invoked, the
checksum must have received an integral number of bytes of data.
• remove<T>(in T data): assuming that data was used for computing the current checksum, removes
data from the checksum.
• If any parser error has occurred, the packet is dropped (i.e., by assigning outputPort to DROP_PORT)
• The first table uses the IPv4 destination address to determine the outputPort and the IPv4 address
of the next hop. If this lookup fails, the packet is dropped. The table also decrements the IPv4 ttl
value.
• The second table checks the ttl value: if the ttl becomes 0, the packet is sent to the control plane
through the CPU port.
Figure 7. Diagram of the match-action pipeline expressed by the VSS P4 program.
• The third table uses the IPv4 address of the next hop (which was computed by the first table) to
determine the Ethernet address of the next hop.
• Finally, the last table uses the outputPort to identify the source Ethernet address of the current
switch, which is set in the outgoing packet.
The deparser constructs the outgoing packet by reassembling the Ethernet and IPv4 headers as com-
puted by the pipeline.
bit<4> version;
bit<4> ihl;
bit<8> diffserv;
bit<16> totalLen;
bit<16> identification;
bit<3> flags;
bit<13> fragOffset;
bit<8> ttl;
bit<8> protocol;
bit<16> hdrChecksum;
IPv4Address srcAddr;
IPv4Address dstAddr;
}
// Parser section
parser TopParser(packet_in b, out Parsed_packet p) {
    Checksum16() ck; // instantiate checksum unit

    state start {
        b.extract(p.ethernet);
        transition select(p.ethernet.etherType) {
            0x0800: parse_ipv4;
            // no default rule: all other packets rejected
        }
    }

    state parse_ipv4 {
        b.extract(p.ip);
        verify(p.ip.version == 4w4, error.IPv4IncorrectVersion);
        verify(p.ip.ihl == 4w5, error.IPv4OptionsNotSupported);
        ck.clear();
        ck.update(p.ip);
        // Verify that packet checksum is zero
        verify(ck.get() == 16w0, error.IPv4ChecksumError);
        transition accept;
    }
}
// Match-action pipeline section
control TopPipe(inout Parsed_packet headers,
                in error parseError, // parser error
                in InControl inCtrl, // input port
                out OutControl outCtrl) {
    IPv4Address nextHop; // local variable, written by Set_nhop
/**
* Indicates that a packet is dropped by setting the
* output port to the DROP_PORT
*/
action Drop_action() {
outCtrl.outputPort = DROP_PORT;
}
/**
* Set the next hop and the output port.
* Decrements ipv4 ttl field.
* @param ipv4_dest ipv4 address of next hop
* @param port output port
*/
action Set_nhop(IPv4Address ipv4_dest, PortId port) {
nextHop = ipv4_dest;
headers.ip.ttl = headers.ip.ttl - 1;
outCtrl.outputPort = port;
}
/**
* Computes address of next IPv4 hop and output port
* based on the IPv4 destination of the current packet.
* Decrements packet IPv4 TTL.
* @param nextHop IPv4 address of next hop
*/
table ipv4_match {
key = { headers.ip.dstAddr: lpm; } // longest-prefix match
actions = {
Drop_action;
Set_nhop;
}
size = 1024;
default_action = Drop_action;
}
/**
* Send the packet to the CPU port
*/
action Send_to_cpu() {
outCtrl.outputPort = CPU_OUT_PORT;
}
/**
* Check packet TTL and send to CPU if expired.
*/
table check_ttl {
key = { headers.ip.ttl: exact; }
actions = { Send_to_cpu; NoAction; }
const default_action = NoAction; // defined in core.p4
}
/**
* Set the destination MAC address of the packet
* @param dmac destination MAC address.
*/
action Set_dmac(EthernetAddress dmac) {
headers.ethernet.dstAddr = dmac;
}
/**
* Set the destination Ethernet address of the packet
* based on the next hop IP address.
* @param nextHop IPv4 address of next hop.
*/
table dmac {
key = { nextHop: exact; }
actions = {
Drop_action;
Set_dmac;
}
size = 1024;
default_action = Drop_action;
}
/**
* Set the source MAC address.
* @param smac: source MAC address to use
*/
action Set_smac(EthernetAddress smac) {
headers.ethernet.srcAddr = smac;
}
/**
* Set the source mac address based on the output port.
*/
table smac {
key = { outCtrl.outputPort: exact; }
actions = {
Drop_action;
Set_smac;
}
size = 16;
default_action = Drop_action;
}
apply {
    if (parseError != error.NoError) {
        Drop_action(); // invoke drop directly
        return;
    }
    ipv4_match.apply(); // match result goes into nextHop
    if (outCtrl.outputPort == DROP_PORT) return;
    check_ttl.apply();
    if (outCtrl.outputPort == CPU_OUT_PORT) return;
    dmac.apply();
    if (outCtrl.outputPort == DROP_PORT) return;
    smac.apply();
}
}
// deparser section
control TopDeparser(inout Parsed_packet p, packet_out b) {
Checksum16() ck;
apply {
b.emit(p.ethernet);
if (p.ip.isValid()) {
ck.clear(); // prepare checksum unit
p.ip.hdrChecksum = 16w0; // clear checksum
ck.update(p.ip); // compute new checksum.
p.ip.hdrChecksum = ck.get();
}
b.emit(p.ip);
}
}
6. P4 language definition
The P4 language can be viewed as having several distinct components, which we describe separately:
• The core language, comprising types, variables, scoping, declarations, statements, expressions,
etc. We start by describing this part of the language.
• A sub-language for expressing parsers, based on state machines (Section 12).
• A sub-language for expressing computations using match-action units, based on traditional im-
perative control-flow (Section 13).
• A sub-language for describing architectures (Section 16).
p4program
: /* empty */
| p4program declaration
| p4program ';'
;
Pseudo-code (mostly used for describing the semantics of various P4 constructs) is shown with
fixed-size fonts as in the following example:
}
}
6.2. Preprocessing
To aid composition of programs from multiple source files P4 compilers should support the following
subset of the C preprocessor functionality:
The preprocessor should also remove the sequence backslash newline (ASCII codes 92, 10) to facilitate
splitting content across multiple lines when convenient for formatting.
Additional C preprocessor capabilities may be supported, but are not guaranteed—e.g., macros
with arguments. Similar to C, #include can specify a file name either within double quotes or within <>.
#include <system_file>
#include "user_file"
The difference between the two forms is the order in which the preprocessor searches for header files
when the path is incompletely specified.
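As a rough illustration of how these directives combine in practice, the sketch below shows a header file protected by an include guard. It assumes that the supported subset covers object-like #define, the conditional directives (#ifdef, #ifndef, #endif), and #include; the file name my_headers.p4, the macros MY_HEADERS_P4 and MAX_PORTS, and the flag WITH_VLAN are hypothetical names chosen for this example.

```p4
// File "my_headers.p4" (hypothetical file name)

// Include guard: the body is processed only once per compilation.
#ifndef MY_HEADERS_P4
#define MY_HEADERS_P4

// System-file form of #include.
#include <core.p4>

// Object-like macro (no arguments).
#define MAX_PORTS 8

typedef bit<48> EthernetAddress;

header Ethernet_h {
    EthernetAddress dstAddr;
    EthernetAddress srcAddr;
    bit<16>         etherType;
}

// Conditional compilation controlled by a hypothetical feature flag.
#ifdef WITH_VLAN
header Vlan_h {
    bit<3>  pcp;
    bit<1>  dei;
    bit<12> vid;
    bit<16> etherType;
}
#endif

// After preprocessing this reads: const bit<4> REAL_PORTS = 8;
const bit<4> REAL_PORTS = MAX_PORTS;

#endif
```

A user program would pull this file in with the double-quoted form, e.g. #include "my_headers.p4".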
P4 compilers should correctly handle #line directives that may be generated during preprocessing.
This functionality allows P4 programs to be built from multiple source files, potentially produced by
different programmers at different times:
core library. Including the core library is done with
#include <core.p4>
• IDENTIFIER: starts with a letter or underscore, and contains only letters, digits, and underscores
• TYPE_IDENTIFIER: identifier that denotes a type name
• INTEGER: integer literals
• DONTCARE: a single underscore
• Keywords such as RETURN. By convention, each keyword terminal corresponds to a language key-
word with the same spelling but using lowercase. For example, the RETURN terminal corresponds
to the return keyword.
6.3.1. Identifiers
P4 identifiers may contain only letters, numbers, and the underscore character _, and must start with
a letter or underscore. The special identifier consisting of a single underscore _ is reserved to indicate
a “don't care” value; its type may vary depending on the context. Certain keywords (e.g., apply) can be
used as identifiers if the context makes it unambiguous.
nonTypeName
: IDENTIFIER
| APPLY
| KEY
| ACTIONS
| STATE
| ENTRIES
| TYPE
;
name
: nonTypeName
| TYPE_IDENTIFIER
;
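The nonTypeName rule above means that a few keywords remain usable as ordinary names. The following sketch, with hypothetical type and field names, shows the TYPE, KEY, and STATE keywords used as field names; it is an illustration of the grammar rule, not an excerpt from any standard library.

```p4
// Keywords that appear in nonTypeName can serve as field names.
header Icmp_h {
    bit<8>  type;      // TYPE keyword used as a field name
    bit<8>  code;
    bit<16> checksum;
}

struct Flow_t {
    bit<32> key;       // KEY keyword used as a field name
    bit<8>  state;     // STATE keyword used as a field name
}

control classify(in Icmp_h icmp, inout Flow_t flow) {
    apply {
        // The context (member access) makes these uses unambiguous.
        flow.state = icmp.type;
    }
}
```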
6.3.2. Comments
P4 supports several kinds of comments:
• Single-line comments, introduced by // and spanning to the end of line,
• Multi-line comments, enclosed between /* and */
• Nested multi-line comments are not supported.
• Javadoc-style comments, starting with /** and ending with */
Use of Javadoc-style comments is strongly encouraged for the tables and actions that are used to syn-
thesize the interface with the control-plane.
P4 treats comments as token separators and no comments are allowed within a token—e.g. bi/**/t
is parsed as two tokens, bi and t, and not as a single token bit.
6.3.3.2. Integer literals Integer literals are non-negative, arbitrary-precision integers. By default, literals
are represented in base 10. The following prefixes must be employed to specify the base explicitly:
The width of a numeric literal in bits can be specified by an unsigned number prefix consisting of a
number of bits and a signedness indicator:
Note that a leading zero by itself does not indicate an octal (base 8) constant. The underscore character
is considered a digit within number literals but is ignored when computing the value of the parsed
number. This allows long constant numbers to be more easily read by grouping digits together. The
underscore cannot be used in the width specification or as the first character of an integer literal.
Neither comments nor whitespace are allowed within a literal. Here are some examples of numeric literals:
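The declarations below are a sketch of the literal notation, written as constants so that each line is a complete declaration. The base prefixes (0x, 0o, 0d, 0b) and the width prefixes (Nw for unsigned, Ns for signed) are stated here as assumptions about the notation, and the constant names are hypothetical.

```p4
// Base prefixes: 0x/0X hexadecimal, 0o/0O octal, 0d/0D decimal, 0b/0B binary.
// Width prefixes: Nw is an N-bit unsigned value, Ns is an N-bit signed value.
const bit<32> a = 32w255;          // 32-bit unsigned number with value 255
const bit<32> b = 32w0d255;        // same value, explicit decimal prefix
const bit<32> c = 32w0xFF;         // same value, hexadecimal
const int<32> d = 32s0xFF;         // 32-bit signed number with value 255
const bit<8>  e = 8w0b1010_1010;   // 8-bit unsigned number with value 0xAA
const bit<8>  f = 8w170;           // same value as e
const int<8>  g = 8s0b1010_1010;   // 8-bit signed number with value -86
const bit<16> h = 16w0377;         // value 377: a leading zero is not octal
const bit<16> i = 16w0o377;        // value 255: explicit octal prefix
```

The underscores in e and g only group digits and do not change the value.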
6.3.3.3. String literals String literals (string constants) are specified as an arbitrary sequence of 8-
bit characters, enclosed within double quote signs " (ASCII code 34). Strings start with a double quote
sign and extend to the first double quote sign which is not immediately preceded by an odd number of
backslash characters (ASCII code 92). P4 does not make any validity checks on strings (i.e., it does not
check that strings represent legal UTF-8 encodings).
Since P4 does not provide any operations on strings, string literals are generally passed unchanged
through the P4 compiler to other third-party tools or compiler-backends, including the terminating
quotes. These tools can define their own handling of escape sequences (e.g., how to specify Unicode
characters, or handle unprintable ASCII characters).
Here are 3 examples of string literals:
"simple string"
"string \" with \" embedded \" quotes"
"string with embedded
line terminator"
6.5. P4 programs
A P4 program is a list of declarations:
p4program
: /* empty */
| p4program declaration
| p4program ';' /* empty declaration */
;
declaration
: constantDeclaration
| externDeclaration
| actionDeclaration
| parserDeclaration
| typeDeclaration
| controlDeclaration
| instantiation
| errorDeclaration
| matchKindDeclaration
| functionDeclaration
;
An empty declaration is indicated with a single semicolon. (Allowing empty declarations accommodates
the habits of C/C++ and Java programmers; e.g., certain constructs, like struct, do not require a
terminating semicolon.)
6.5.1. Scopes
Some P4 constructs act as namespaces that create local scopes for names including:
• Derived type declarations (struct, header, header_union, enum), which introduce local scopes for
field names,
• Block statements, which introduce local lexically-enclosed scopes,
• parser, table, action, and control blocks, which introduce local scopes
• Declarations with type variables, which introduce a new scope for those variables. For example,
in the following extern declaration, the scope of the type variable H extends to the end of the
declaration:
extern E<H>(/* parameters omitted */) { /* body omitted */ } // scope of H ends here.
The order of declarations is important; with the exception of parser states, all uses of a symbol must
follow the symbol's declaration. (This is a departure from P4_14, which allows declarations in any order.
This requirement significantly simplifies the implementation of compilers for P4, allowing compilers
to use additional information about declared identifiers to resolve ambiguities.)
• tables: Tables are read-only for the data plane, but their entries can be modified by the control-
plane,
• extern objects: many objects have state that can be read and written by the control plane and
data plane. All constructs from the P4_14 language version that encapsulate state (e.g., counters,
meters, registers) are represented using extern objects in P4_16.
In P4 all stateful elements must be explicitly allocated at compilation-time through the process called
“instantiation”.
In addition, parsers, control blocks, and packages may contain stateful element instantiations. Thus,
they are also treated as stateful elements, even if they appear to contain no state, and must be instanti-
ated before they can be used. However, although they are stateful, tables do not need to be instantiated
explicitly—declaring a table also creates an instance of it. This convention is designed to support the
common case, since most tables are used just once. To have finer-grained control over when a table is
instantiated, a programmer can declare it within a control.
Recall the example in Section 5.3: TopParser, TopPipe, TopDeparser, Checksum16, and VSS are types.
There are two instances of Checksum16, one in TopParser and one in TopDeparser, both called ck. The
TopParser, TopDeparser, TopPipe, and VSS are instantiated at the end of the program, in the declaration
of the main instance object, which is an instance of the VSS type (a package).
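The distinction between explicit instantiation of extern objects and implicit instantiation of tables can be sketched as follows. The extern PacketCounter and the control CountByPort are hypothetical and exist only for this illustration; NoAction and the exact match kind are assumed to come from core.p4.

```p4
#include <core.p4>

// Hypothetical extern, declared here only for illustration.
extern PacketCounter {
    PacketCounter(bit<32> size);       // constructor
    void count(in bit<32> index);
}

control CountByPort(in bit<32> port) {
    // Explicit instantiation: the extern instance is allocated at compile time.
    PacketCounter(32w16) perPort;

    action bump() {
        perPort.count(port);
    }

    // Declaring the table also creates its single instance.
    table by_port {
        key = { port: exact; }
        actions = { bump; NoAction; }
        default_action = NoAction;
    }

    apply {
        by_port.apply();
    }
}
```

The control CountByPort is itself only a type at this point; it would have to be instantiated, directly or as a package argument, before it can run.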
6.6. L-values
L-values are expressions that may appear on the left side of an assignment operation or as arguments
corresponding to out and inout function parameters. An l-value represents a storage reference. The
following expressions are legal l-values:
prefixedNonTypeName
: nonTypeName
| dotPrefix nonTypeName
;
lvalue
: prefixedNonTypeName
| lvalue '.' member
| lvalue '[' expression ']'
| lvalue '[' expression ':' expression ']'
;
The following is a legal l-value: headers.stack[4].field. Note that method and function calls cannot
return l-values.
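Each alternative of the lvalue rule can be seen in a short sketch. The types H and Headers and the control lvalues are hypothetical names used only here.

```p4
header H {
    bit<8> field;
}

struct Headers {
    H[5] stack;
}

control lvalues(inout Headers headers, out bit<4> low) {
    bit<8> tmp;                               // a plain name is an l-value
    apply {
        tmp = 8w1;                            // prefixedNonTypeName
        headers.stack[4].field = tmp;         // member access and indexing
        headers.stack[4].field[7:4] = 4w0xA;  // bit slice of an l-value
        low = headers.stack[4].field[3:0];    // out parameter used as l-value
    }
}
```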
• out parameters are, with a few exceptions listed below, uninitialized and are treated as l-values
(see Section 6.6) within the body of the method or function. An argument passed as an out
parameter must be an l-value; after the execution of the call, the value of the parameter is copied
to the corresponding storage location for that l-value.
• inout parameters are both in and out. An argument passed as an inout parameter must be an
l-value.
• No direction indicates that the value of the parameter is either:
Direction out parameters are always initialized at the beginning of execution of the portion of the pro-
gram that has the out parameters, e.g. control, parser, action, function, etc. This initialization is not
performed for parameters with any direction that is not out.
• If a direction out parameter is of type header or header_union, it is set to “invalid”.
• If a direction out parameter is of type header stack, all elements of the header stack are set to
“invalid”, and its nextIndex field is initialized to 0 (see Section 8.17).
• If a direction out parameter is a compound type, e.g. a struct or tuple, other than one of the types
listed above, then apply these rules recursively to its members.
• If a direction out parameter has any other type, e.g. bit<W>, an implementation need not initialize
it to any predictable value.
For example, if a direction out parameter has type s2_t named p:
header h1_t {
bit<8> f1;
bit<8> f2;
}
struct s1_t {
h1_t h1a;
bit<3> a;
bit<7> b;
}
struct s2_t {
h1_t h1b;
s1_t s1;
bit<5> c;
}
then at the beginning of execution of the part of the program that has the out parameter p, it must be
initialized so that p.h1b and p.s1.h1a are invalid. No other parts of p are required to be initialized.
Arguments are evaluated from left to right prior to the invocation of the function itself. The order of
evaluation is important when the expression supplied for an argument can have side-effects. Consider
the following example:
extern void f(inout bit x, in bit y);
extern bit g(inout bit z);
bit a;
f(a, g(a));
Note that the evaluation of g may mutate its argument a, so the compiler has to ensure that the value
passed to f for its first parameter is not changed by the evaluation of the second argument. The seman-
tics for evaluating a function call is given by the following algorithm (implementations can be different
as long as they provide the same result):
1. Arguments are evaluated from left to right as they appear in the function call expression.
2. If a parameter has a default value and no corresponding argument is supplied, the default value
is used as an argument.
3. For each out and inout argument the corresponding l-value is saved (so it cannot be changed by
the evaluation of the following arguments). This is important if the argument contains indexing
operations into a header stack.
4. The value of each argument is saved into a temporary.
5. The function is invoked with the temporaries as arguments. We are guaranteed that the tempo-
raries that are passed as arguments are never aliased to each other, so this “generated” function
call can be implemented using call-by-reference if supported by the architecture.
6. On function return, the temporaries that correspond to out or inout arguments are copied in
order from left to right into the l-values saved in step 3.
According to this algorithm, the previous function call is equivalent to the following sequence of state-
ments:
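A sketch of that expansion for the call f(a, g(a)) is given below. The statements are wrapped in a hypothetical control c so that they form a complete declaration, and the temporaries tmp1 and tmp2 are illustrative names.

```p4
extern void f(inout bit x, in bit y);
extern bit g(inout bit z);

control c(inout bit a) {
    apply {
        bit tmp1 = a;      // evaluate a; save the result
        bit tmp2 = g(a);   // evaluate g(a); save the result; may modify a
        f(tmp1, tmp2);     // invoke f on the temporaries; may modify tmp1
        a = tmp1;          // copy the inout result back into a
    }
}
```

Because tmp1 is copied back last, any change that g made to a is overwritten by the value that f left in its first parameter.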
To see why Step 3 in the above algorithm is important, consider the following example:
header H { bit z; }
H[2] s;
f(s[a].z, g(a));
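Following the algorithm, the call f(s[a].z, g(a)) can be sketched as the expansion below. The surrounding control c and the temporaries tmp1, tmp2, and tmp3 are hypothetical; the point is that the index is captured before g runs.

```p4
extern void f(inout bit x, in bit y);
extern bit g(inout bit z);

header H { bit z; }

control c(inout bit a) {
    H[2] s;
    apply {
        bit tmp1 = a;           // save the index, fixing the l-value s[tmp1].z
        bit tmp2 = s[tmp1].z;   // evaluate the first argument
        bit tmp3 = g(a);        // evaluate the second argument; may modify a
        f(tmp2, tmp3);          // invoke f on the temporaries; may modify tmp2
        s[tmp1].z = tmp2;       // copy back to the saved l-value, not to s[a].z
    }
}
```

If the l-value were not saved, a change to a made by g would redirect the final copy to a different element of the stack.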
When used as arguments, extern objects can only be passed as directionless parameters—e.g., see the
packet argument in the very simple switch example.
6.7.1. Justification
The main reason for using copy-in/copy-out semantics (instead of the more common call-by-reference
semantics) is for controlling the side-effects of extern functions and methods. extern methods and
functions are the main mechanism by which a P4 program communicates with its environment. With
copy-in/copy-out semantics extern functions cannot hold references to P4 program objects; this en-
ables the compiler to limit the side-effects that extern functions may have on the P4 program both in
space (they can only affect out parameters) and in time (side-effects can only occur at function call
time).
In general, extern functions are arbitrarily powerful: they can store information in global storage,
spawn separate threads, “collude” with each other to share information — but they cannot access any
variable in a P4 program. With copy-in/copy-out semantics the compiler can still reason about P4
programs that invoke extern functions.
There are additional benefits of using copy-in/copy-out semantics:
• It enables P4 to be compiled for architectures that do not support references (e.g., where all data
is allocated to named registers). Such architectures may require indices into header stacks that
appear in a program to be compile-time known values.
• It simplifies some compiler analyses, since function parameters can never alias to each other
within the function body.
parameterList
: /* empty */
| nonEmptyParameterList
;
nonEmptyParameterList
: parameter
| nonEmptyParameterList ',' parameter
;
parameter
: optAnnotations direction typeRef name
| optAnnotations direction typeRef name '=' expression
;
direction
: IN
| OUT
| INOUT
| /* empty */
;
extern objects. Values for these parameters must be specified at compile-time, and must evaluate
to compile-time known values. See Section 14 for further details.
• For actions all directionless parameters must be at the end of the parameter list. When an ac-
tion appears in a table's actions list, only the parameters with a direction must be bound. See
Section 13.1 for further details.
• Actions can also be explicitly invoked using function call syntax, either from a control block or
from another action. In this case, values for all action parameters must be supplied explicitly, in-
cluding values for the directionless parameters. The directionless parameters in this case behave
like in parameters. See Section 13.1.1 for further details.
• Default parameter values are only allowed for in or directionless parameters; these values must
evaluate to compile-time constants.
Here the target architecture could implement the elided optional argument using an empty pipeline.
The following example shows optional parameters and parameters with default values.
// assumed declaration of h (the type of a is illustrative); b defaults to true
extern void h(in bit<32> a, in bool b = true);

// function calls
h(10); // same as h(10, true);
h(a = 10); // same as h(10, true);
h(a = 10, b = true);

struct Empty {}
control nothing(inout Empty h, inout Empty m) {
    apply {}
}

parser parserProto<H, M>(packet_in p, out H h, inout M m);
control controlProto<H, M>(inout H h, inout M m);

package pack<HP, MP, HC, MC>(
    @optional parserProto<HP, MP> _parser, // optional parameter
    controlProto<HC, MC> _control = nothing()); // default parameter value
const bit<32> x = 2;
control c() {
int<32> x = 0;
apply {
x = x + (int<32>).x; // x is the int<32> local variable,
// .x is the top-level bit<32> variable
}
}
References to resolve an identifier are attempted inside-out, starting with the current scope and pro-
ceeding to all lexically enclosing scopes. The compiler may provide a warning if multiple resolutions
are possible for the same name (name shadowing).
const bit<4> x = 1;
control p() {
const bit<8> x = 8; // x declaration shadows global x
const bit<4> y = .x; // reference to top-level x
const bit<8> z = x; // reference to p's local x
apply {}
}
6.9. Visibility
Identifiers defined in the top-level namespace are globally visible. Declarations within a parser or con-
trol are private and cannot be referred to from outside of the enclosing parser or control.
7. P4 data types
P4_16 is a statically-typed language. Programs that do not pass the type checker are considered invalid
and rejected by the compiler. P4 provides a number of base types as well as type operators that con-
struct derived types. Some values can be converted to a different type using casts. However, to make
user intents clear, implicit casts are only allowed in a few circumstances and the range of casts available
is intentionally restricted.
• The void type, which has no values and can be used only in a few restricted circumstances.
• The error type, which is used to convey errors in a target-independent, compiler-managed way.
• The string type, which can be used only for compile-time constant string values.
• The match_kind type, which is used for describing the implementation of table lookups,
• bool, which represents Boolean values
• int, which represents arbitrary-sized constant integer values
• Bit-strings of fixed width, denoted by bit<>
• Fixed-width signed integers represented using two's complement int<>
• Bit-strings of dynamically-computed width with a fixed maximum width varbit<>
baseType
: BOOL
| ERROR
| BIT
| INT
| STRING
| BIT '<' INTEGER '>'
| INT '<' INTEGER '>'
| VARBIT '<' INTEGER '>'
| BIT '<' '(' expression ')' '>'
| INT '<' '(' expression ')' '>'
| VARBIT '<' '(' expression ')' '>'
;
errorDeclaration
: ERROR '{' identifierList '}'
;
All error constants are inserted into the error namespace, irrespective of the place where an error is
defined. error is similar to an enumeration (enum) type in other languages. A program can contain
multiple error declarations, which the compiler will merge together. It is an error to declare the same
identifier multiple times. Expressions of type error are described in Section 8.2.
For example, the following declaration creates two constants of error type (these errors are de-
clared in the P4 core library):
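The declaration itself is not reproduced here. As a sketch of the same shape, the example below declares two error constants and then a third in a separate declaration; it borrows the error names used by the verify calls in the VSS parser rather than core library names, and the control check is hypothetical.

```p4
// Two constants of type error.
error { IPv4IncorrectVersion, IPv4OptionsNotSupported }

// A second declaration is merged into the same error namespace.
error { IPv4ChecksumError }

control check(in error e, out bool badChecksum) {
    apply {
        // Error constants are always referenced through the error namespace.
        badChecksum = (e == error.IPv4ChecksumError);
    }
}
```

Declaring IPv4ChecksumError a second time anywhere in the program would be rejected, since the same identifier may not be declared twice.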
matchKindDeclaration
: MATCH_KIND '{' identifierList '}'
;
match_kind {
exact,
ternary,
lpm
}
Architectures may support additional match_kinds. The declaration of new match_kinds can only occur
within model description files; P4 programmers cannot declare new match kinds.
7.1.5. Strings
The type string represents strings. There are no operations on string values; one cannot declare vari-
ables with a string type. Parameters with type string can be only directionless (see Section 6.7). P4
does not support string manipulation in the dataplane; the string type is only allowed for denoting
compile-time constant string values. These may be useful; for example, a specific target architecture
may support an extern function for logging with the following signature:
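The signature does not appear at this point in the text. A plausible form, given that string parameters must be directionless, is sketched below; the function name log, the parameter name message, and the control trace are assumptions made for illustration.

```p4
// Assumed signature: a directionless parameter of type string.
extern void log(string message);

control trace() {
    apply {
        // Only a compile-time constant string literal can be passed.
        log("packet reached the trace control");
    }
}
```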
The only strings that can appear in a P4 program are constant string literals, described in Section 6.3.3.3.
For example, the following annotation indicates that a specific name should be used for a table when
generating the control-plane API:
@name("acl") table t1 { /* body omitted */ }
• Inspired by C: Typing of integers is modeled after the well-defined parts of C, expanded to cope
with arbitrary fixed-width integers. In particular, the type of the result of an expression only de-
pends on the expression operands, and not on how the result of the expression is consumed.
• No undefined behaviors: P4 attempts to avoid many of C's undefined or implementation-defined
behaviors, which include the size of an integer (int), the results produced on overflow, and the
results produced for some input combinations (e.g., shifts with negative amounts, overflows on
signed numbers, etc.). P4 computations on integer types have no undefined behaviors.
• Least surprise: The P4 typing rules are chosen to behave as closely as possible to traditional well-
behaved C programs.
• Forbid rather than surprise: Rather than provide surprising or undefined results (e.g., in C com-
parisons between signed and unsigned integers), we have chosen to forbid expressions with am-
biguous interpretations. For example, P4 does not allow binary operations that combine signed
and unsigned integers.
The priority of arithmetic operations is identical to C—e.g., multiplication binds tighter than addition.
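The "forbid rather than surprise" principle can be sketched with a small example. The control mix and its parameter names are hypothetical; the rejected statement is left as a comment so that the fragment remains well-typed.

```p4
control mix(in bit<8> u, in int<8> s, out int<8> r) {
    apply {
        // r = u + s;          // rejected: combines bit<8> with int<8>
        r = (int<8>)u + s;     // an explicit cast states the intended meaning
    }
}
```

The cast keeps the same eight bits and reinterprets them as a signed value, so the programmer, not the compiler, chooses the interpretation.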
7.1.6.1. Portability No P4 target can support all possible types and operations. For example, the
type bit<23132312> is legal in P4, but it is highly unlikely to be supported on any target in practice.
Hence, each target can impose restrictions on the types it can support. Such restrictions may include:
The documentation supplied with a target should clearly specify restrictions, and target-specific com-
pilers should provide clear error messages when such restrictions are encountered. An architecture
may reject a well-typed P4 program and still be conformant to the P4 spec. However, if an architecture
accepts a P4 program as valid, the runtime program behavior should match this specification.
7.1.6.2. Unsigned integers (bit-strings) An unsigned integer (which we also call a “bit-string”)
has an arbitrary width, expressed in bits. A bit-string of width W is declared as: bit<W>. W must be an ex-
pression that evaluates to a compile-time known value (see Section 17.1) that is a non-negative integer.
When using an expression for the size, the expression must be parenthesized. Bit-strings with width 0
are allowed; they have no actual bits, and can only have the value 0. See the section on uninitialized
values and writing invalid headers for additional details.
const bit<32> x = 10; // 32-bit constant with value 10.
const bit<(x + 2)> y = 15; // 12-bit constant with value 15.
// expression for width must use ()
Bits within a bit-string are numbered from 0 to W-1. Bit 0 is the least significant, and bit W-1 is the most
significant.
For example, the type bit<128> denotes the type of bit-string values with 128 bits numbered from 0
to 127, where bit 127 is the most significant.
The type bit is a shorthand for bit<1>.
P4 architectures may impose additional constraints on bit types: for example, they may limit the
maximum size, or they may only support some arithmetic operations on certain sizes (e.g., 16-, 32-,
and 64-bit values).
All operations that can be performed on unsigned integers are described in Section 8.5.
7.1.6.3. Signed Integers Signed integers are represented using two's complement. An integer
with W bits is declared as: int<W>. W must be an expression that evaluates to a compile-time known
value that is a positive integer.
Bits within an integer are numbered from 0 to W-1. Bit 0 is the least significant, and bit W-1 is the sign
bit.
For example, the type int<64> describes the type of integers represented using exactly 64 bits with
bits numbered from 0 to 63, where bit 63 is the most significant (sign) bit.
P4 architectures may impose additional constraints on signed types: for example, they may limit
the maximum size, or they may only support some arithmetic operations on certain sizes (e.g., 16-, 32-,
and 64-bit values).
All operations that can be performed on signed integers are described in Section 8.6.
A signed integer with width 1 can only have two legal values: 0 and -1.
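The two's complement rule can be illustrated with a small Python model. This is an informal sketch and not part of the P4 specification; the helper names to_signed and signed_range are hypothetical. It shows why int<1> can only hold 0 and -1: the single bit is the sign bit.

```python
def to_signed(raw: int, width: int) -> int:
    """Interpret the low `width` bits of `raw` as a two's complement integer."""
    if width < 1:
        raise ValueError("int<W> requires a positive width")
    raw &= (1 << width) - 1          # keep only the low `width` bits
    sign_bit = 1 << (width - 1)      # bit W-1 is the sign bit
    return raw - (1 << width) if raw & sign_bit else raw


def signed_range(width: int) -> tuple:
    """Smallest and largest values representable in int<width>."""
    return -(1 << (width - 1)), (1 << (width - 1)) - 1
```

With this model, signed_range(1) is (-1, 0), and signed_range(64) matches the usual 64-bit signed range.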
7.1.6.4. Dynamically-sized bit-strings Some network protocols use fields whose size is only
known at runtime (e.g., IPv4 options). To support restricted manipulations of such values, P4 provides
a special bit-string type whose size is set at runtime, called a varbit.
The type varbit<W> denotes a bit-string with a width of at most W bits, where W must be a non-negative
integer that is a compile-time known value. For example, the type varbit<120> denotes the type of bit-
string values that may have between 0 and 120 bits. Most operations that are applicable to fixed-size
bit-strings (unsigned numbers) cannot be performed on dynamically sized bit-strings.
P4 architectures may impose additional constraints on varbit types: for example, they may limit
the maximum size, or they may require varbit values to always contain an integer number of bytes at
runtime.
All operations that can be performed on varbits are described in Section 8.8.
7.1.6.5. Infinite-precision integers The infinite-precision data type describes integers with an
unlimited precision. This type is written as int.
This type is reserved for integer literals and expressions that involve only literals. No P4 runtime
value can have an int type; at compile time the compiler will convert all int values that have a runtime
component to fixed-width types, according to the rules described below.
All operations that can be performed on infinite-precision integers are described in Section 8.7.
The following example shows three constant definitions whose values are infinite-precision integers.
const int a = 5;
const int b = 2 * a;
const int c = b - a + 3;
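As an informal analogy (Python, not P4), Python's built-in int also has unlimited precision, so the three constants above can be mirrored directly. The names big and small are hypothetical and only illustrate that intermediate results never overflow.

```python
# Python's built-in int has unlimited precision, like the P4 type int.
a = 5
b = 2 * a
c = b - a + 3             # same arithmetic as the P4 constants above

# Intermediate results may exceed any fixed width without losing precision.
big = (1 << 200) + c
small = big - (1 << 200)  # recovers c exactly
```

Unlike Python, P4 only allows such values at compile time; any value with a runtime component is converted to a fixed-width type.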
7.1.6.6. Integer literal types The types of integer literals (constants) are as follows:
The table below shows several examples of integer literals and their types. For additional examples of
literals see Section 6.3.3.
Literal    Interpretation
10         Type is int, value is 10
8w10       Type is bit<8>, value is 10
8s10       Type is int<8>, value is 10
2s3        Type is int<2>, value is -1 (last 2 bits), overflow warning
1w10       Type is bit<1>, value is 0 (last bit), overflow warning
1s1        Type is int<1>, value is -1, overflow warning
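The rows of the table can be reproduced with a short Python model. This is a sketch under stated assumptions, not the compiler's algorithm: the function literal_type is hypothetical, it only handles decimal digits (no 0x, 0b, or 0o prefixes and no underscores), and it reports overflow as a boolean rather than emitting a warning.

```python
import re

# Optional "<width>w" or "<width>s" prefix followed by decimal digits.
_LITERAL = re.compile(r"^(?:(\d+)([ws]))?(\d+)$")


def literal_type(text: str):
    """Return (type_name, value, overflow) for a decimal P4 integer literal."""
    match = _LITERAL.match(text)
    if match is None:
        raise ValueError(f"unsupported literal: {text!r}")
    width_text, kind, digits = match.groups()
    raw = int(digits)
    if kind is None:                       # no width prefix: infinite precision
        return "int", raw, False
    width = int(width_text)
    truncated = raw & ((1 << width) - 1)   # keep the last `width` bits
    if kind == "w":                        # unsigned: bit<W>
        return f"bit<{width}>", truncated, raw != truncated
    if width < 1:
        raise ValueError("int<W> requires a positive width")
    # signed: int<W>, two's complement interpretation of the kept bits
    value = truncated - (1 << width) if truncated >> (width - 1) else truncated
    return f"int<{width}>", value, value != raw
```

For instance, literal_type("2s3") keeps the bit pattern 11, which reads as -1 in two's complement, and flags the overflow.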
• enum
• header
• header stacks
• struct
• header_union
• tuple
• type specialization
• extern
• parser
• control
• package
The types header, header_union, enum, struct, extern, parser, control, and package can only be used in
type declarations, where they introduce a new name for the type. The type can subsequently be referred
to using this identifier.
Other types cannot be declared, but are synthesized by the compiler internally to represent the type
of certain language constructs. These types are described in Section 7.2.8: set types and function types.
For example, the programmer cannot declare a variable with type “set”, but she can write an expression
whose value evaluates to a set type. These types are used during type-checking.
typeDeclaration
: derivedTypeDeclaration
| typedefDeclaration
| parserTypeDeclaration ';'
| controlTypeDeclaration ';'
| packageTypeDeclaration ';'
;
derivedTypeDeclaration
: headerTypeDeclaration
| headerUnionDeclaration
| structTypeDeclaration
| enumDeclaration
;
typeRef
: baseType
| typeName
| specializedType
| headerStackType
| tupleType
;
namedType
: typeName
| specializedType
;
prefixedType
: TYPE_IDENTIFIER
| dotPrefix TYPE_IDENTIFIER
;
typeName
: prefixedType
;
enumDeclaration
: optAnnotations ENUM name '{' identifierList '}'
| optAnnotations ENUM typeRef name '{' specifiedIdentifierList '}'
;
identifierList
: name
| identifierList ',' name
;
specifiedIdentifierList
: specifiedIdentifier
| specifiedIdentifierList ',' specifiedIdentifier
;
specifiedIdentifier
: name '=' initializer
;
introduces a new enumeration type, which contains four constants (e.g., Suits.Clubs). An enum
declaration introduces a new identifier in the current scope for naming the created type. The underlying
representation of the Suits enum is not specified, so its “size” in bits is not specified (it is
target-specific).
It is also possible to specify an enum with an underlying representation. These are sometimes called
serializable enums, because headers are allowed to have fields with such enum types. This requires the
programmer to provide both the fixed-width unsigned (or signed) integer type and an associated integer
value for each symbolic entry in the enumeration. The symbol typeRef in the grammar above must be
one of the following types:
introduces a new enumeration type, which contains five constants (e.g., EtherType.IPV4). This enum
declaration specifies the fixed-width unsigned integer representation for each entry in the enum and
provides an underlying type: bit<16>. This type of enum declaration can be thought of as declaring a new
bit<16> type, where variables or fields of this type are expected to be unsigned 16-bit integer values, and
the mapping of symbolic to numeric values defined by the enum is effectively a set of constants defined as
part of this type. In this way, an enum with an underlying type can be thought of as being a type derived
from the underlying type carrying equality, assignment, and casts to/from the underlying type.
Compiler implementations are expected to raise an error if the fixed-width integer representation
for an enumeration entry falls outside the representation range of the underlying type.
For example, the declaration
would raise an error because 300, the value associated with FailingExample.unrepresentable, cannot be
represented as a bit<8> value.
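The range check a compiler is expected to perform can be sketched in Python as follows. This is an illustrative model, not the behavior of any particular compiler; the helpers representable and check_enum are hypothetical, and the entry named ok is invented for the example (only unrepresentable = 300 comes from the text above).

```python
def representable(value: int, width: int, signed: bool) -> bool:
    """True if `value` fits in bit<width> (unsigned) or int<width> (signed)."""
    if signed:
        low, high = -(1 << (width - 1)), (1 << (width - 1)) - 1
    else:
        low, high = 0, (1 << width) - 1
    return low <= value <= high


def check_enum(entries: dict, width: int, signed: bool = False) -> list:
    """Return the names of entries whose value is outside the underlying type."""
    return [name for name, value in entries.items()
            if not representable(value, width, signed)]


# "ok" is a hypothetical entry; "unrepresentable" mirrors the failing example.
failing = check_enum({"ok": 1, "unrepresentable": 300}, width=8)
```

Here failing contains only "unrepresentable", because bit<8> can hold values from 0 to 255.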
The initializer expression must be a compile-time known value.
Annotations, represented by the non-terminal optAnnotations, are described in Section 18.
Operations on enum values are described in Section 8.3.
headerTypeDeclaration
: optAnnotations HEADER name optTypeParameters '{' structFieldList '}'
;
structFieldList
: /* empty */
| structFieldList structField
;
structField
: optAnnotations typeRef name ';'
;
where each typeRef is restricted to a bit-string type (fixed or variable), a fixed-width signed integer
type, a boolean type, or a struct that itself contains other struct fields, nested arbitrarily, as long as all
of the “leaf” types are bit<W>, int<W>, a serializable enum, or a bool. If a bool is used inside a P4 header,
all implementations encode the bool as a one-bit field, with the value 1 representing true and 0
representing false.
A header declaration introduces a new identifier in the current scope; the type can be referred to
using this identifier. A header is similar to a struct in C, containing all the specified fields. However,
in addition, a header also contains a hidden Boolean “validity” field. When the “validity” bit is true
we say that the “header is valid”. When a local variable with a header type is declared, its “validity”
bit is automatically set to false. The “validity” bit can be manipulated by using the header methods
isValid(), setValid(), and setInvalid(), as described in Section 8.16.
Note that nesting of headers is not supported. One reason is that it leads to complications in
defining the behavior of arbitrary sequences of setValid, setInvalid, and emit operations. Consider an
example where header h1 contains header h2 as a member, both currently valid. A program executes
h2.setInvalid() followed by packet.emit(h1). Should all fields of h1 be emitted, but skipping h2? Simi-
larly, should h1.setInvalid() invalidate all headers contained within h1, regardless of how deeply they
are nested?
Header types may be empty:
header Empty_h { }
struct ipv6_addr {
    bit<32> Addr0;
    bit<32> Addr1;
    bit<32> Addr2;
    bit<32> Addr3;
}
header ipv6_t {
    bit<4>    version;
    bit<8>    trafficClass;
    bit<20>   flowLabel;
    bit<16>   payloadLen;
    bit<8>    nextHdr;
    bit<8>    hopLimit;
    ipv6_addr src;
    ipv6_addr dst;
}
Headers that do not contain any varbit field are “fixed size.” Headers containing varbit fields have
“variable size.” The size (in bits) of a fixed-size header is a constant, and it is simply the sum of the sizes
of all component fields (without counting the validity bit). There is no padding or alignment of the
header fields. Targets may impose additional constraints on header types—e.g., restricting headers to
sizes that are an integer number of bytes.
For example, the following declaration describes a typical Ethernet header:
header Ethernet_h {
    bit<48> dstAddr;
    bit<48> srcAddr;
    bit<16> etherType;
}
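The size rule for fixed-size headers (the sum of the field widths, with no padding and without the validity bit) can be checked with a small Python model. This is an informal sketch; the function header_bits and the list encoding of fields are hypothetical, while the widths are copied from the Ethernet_h, ipv6_addr, and ipv6_t declarations in this section.

```python
# Field widths in bits, copied from the declarations in this section.
IPV6_ADDR = [32, 32, 32, 32]                         # struct ipv6_addr
IPV6_T = [4, 8, 20, 16, 8, 8, IPV6_ADDR, IPV6_ADDR]  # header ipv6_t
ETHERNET_H = [48, 48, 16]                            # header Ethernet_h


def header_bits(fields) -> int:
    """Sum the widths of all leaf fields; nested lists model nested structs."""
    total = 0
    for field in fields:
        total += header_bits(field) if isinstance(field, list) else field
    return total


def is_byte_aligned(fields) -> bool:
    """A constraint some targets may impose: size is a whole number of bytes."""
    return header_bits(fields) % 8 == 0
```

Under this model Ethernet_h occupies 112 bits (14 bytes) and ipv6_t occupies 320 bits (40 bytes).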
The following variable declaration uses the newly introduced type Ethernet_h:
Ethernet_h ethernetHeader;
P4's parser language provides an extract method that can be used to “fill in” the fields of a header from
a network packet, as described in Section 12.8. The successful execution of an extract operation also
sets the validity bit of the extracted header to true.
Here is an example of an IPv4 header with variable-sized options:
header IPv4_h {
    bit<4>      version;
    bit<4>      ihl;
    bit<8>      diffserv;
    bit<16>     totalLen;
    bit<16>     identification;
    bit<3>      flags;
    bit<13>     fragOffset;
    bit<8>      ttl;
    bit<8>      protocol;
    bit<16>     hdrChecksum;
    bit<32>     srcAddr;
    bit<32>     dstAddr;
    varbit<320> options;
}
As demonstrated by a code example in Section 12.8.2, another way to support headers that contain
variable-length fields is to define two headers – one fixed length, one containing a varbit field – and
extract each part in separate parsing steps.
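The bound of 320 bits on the options field follows from the IPv4 protocol rather than from P4: ihl is a 4-bit count of 32-bit words for the whole header, and the fixed part takes five words. The Python sketch below computes the runtime size of options from ihl; the function name ipv4_option_bits is hypothetical, and the model is illustrative only.

```python
FIXED_WORDS = 5        # the fixed IPv4 header is 5 x 32 bits = 160 bits
MAX_OPTION_BITS = 320  # bound declared as varbit<320> in IPv4_h


def ipv4_option_bits(ihl: int) -> int:
    """Number of bits occupied by the options for a given ihl value."""
    if not 0 <= ihl <= 15:
        raise ValueError("ihl is a 4-bit field")
    if ihl < FIXED_WORDS:
        raise ValueError("ihl below 5 describes a malformed header")
    return (ihl - FIXED_WORDS) * 32
```

The largest legal ihl, 15, yields exactly 320 bits, so every well-formed options field fits in the declared varbit.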
headerStackType
: typeName '[' expression ']'
| specializedType '[' expression ']'
;
where typeName is the name of a header type. For a header stack hs[n], the term n is the maximum
defined size, and must be a positive integer that is a compile-time known value. Nested header stacks
are not supported. At runtime a stack contains n values with type typeName, only some of which may be
valid. Expressions on header stacks are discussed in Section 8.17.
For example, the following declarations,
header Mpls_h {
    bit<20> label;
    bit<3>  tc;
    bit     bos;
    bit<8>  ttl;
}
Mpls_h[10] mpls;
introduce a header stack called mpls containing ten entries, each of type Mpls_h.
headerUnionDeclaration
: optAnnotations HEADER_UNION name optTypeParameters '{' structFieldList '}'
;
This declaration introduces a new type with the specified name in the current scope. Each element of
the list of fields used to declare a header union must be a header type. However, the empty list of fields
is legal.
As an example, the type IP_h below represents the union of IPv4 and IPv6 headers:
header_union IP_h {
    IPv4_h v4;
    IPv6_h v6;
}
structTypeDeclaration
: optAnnotations STRUCT name optTypeParameters '{' structFieldList '}'
;
This declaration introduces a new type with the specified name in the current scope. An empty struct
(with no fields) is legal. For example, the structure Parsed_headers below contains the headers recog-
nized by a simple parser:
Udp_h udp;
}
tupleType
: TUPLE '<' typeArgumentList '>'
;
Operations that manipulate tuple types are described in Sections 8.10 and 8.11.
The type tuple<> is a tuple type with no components.
The table below lists all types that may appear as base types in a typedef or type declaration.
Base type B       typedef B <name>    type B <name>
bit<W>            allowed             allowed
int<W>            allowed             allowed
varbit<W>         allowed             error
int               allowed             error
void              error               error
error             allowed             error
match_kind        error               error
bool              allowed             allowed
enum              allowed             error
header            allowed             error
header stack      allowed             error
header_union      allowed             error
struct            allowed             error
tuple             allowed             error
a typedef name    allowed             allowed (see footnote 3)
a type name       allowed             allowed
7.2.8.1. Set types The type set<T> describes sets of values of type T. Set types can only appear
in restricted contexts in P4 programs. For example, the range expression 8w5 .. 8w8 describes a set
containing the 8-bit numbers 5, 6, 7, and 8, so its type is set<bit<8>>. This expression can be used as
a label in a select expression (see Section 12.6), matching any value in this range. Set types cannot be
named or declared by P4 programmers; they are only synthesized by the compiler internally and used
for type-checking. Expressions with set types are described in Section 8.13.
7.2.8.2. Function types Function types are created by the P4 compiler internally to repre-
sent the types of functions (explicit functions or extern functions) and methods during type-checking.
We also call the type of a function its signature. Libraries can contain functions and extern function
declarations.
For example, consider the following declarations:
– the result type is void
– the function has two inputs
– first input has direction in, type bit<5>, and name logRange
– second input has direction out, type bit<32>, and name value
externDeclaration
: optAnnotations EXTERN nonTypeName optTypeParameters '{' methodPrototypes '}'
| optAnnotations EXTERN functionPrototype ';'
;
7.2.9.1. Extern functions An extern function declaration describes the name and type sig-
nature of the function, but not its implementation.
functionPrototype
: typeOrVoid name optTypeParameters '(' parameterList ')'
;
7.2.9.2. Extern objects An extern object declaration declares an object and all methods that
can be invoked to perform computations and to alter the state of the object. Extern object declarations
can also optionally declare constructor methods; these must have the same name as the enclosing
extern type, no type parameters, and no return type. Extern declarations may only appear as allowed
by the architecture model and may be specific to a target.
methodPrototypes
: /* empty */
| methodPrototypes methodPrototype
;
methodPrototype
: optAnnotations functionPrototype ';'
| optAnnotations TYPE_IDENTIFIER '(' parameterList ')' ';' //constructor
| optAnnotations ABSTRACT functionPrototype ';'
;
typeOrVoid
: typeRef
| VOID
| IDENTIFIER // may be a type variable
;
optTypeParameters
: /* empty */
| typeParameters
;
typeParameters
: '<' typeParameterList '>'
;
typeParameterList
: name
| typeParameterList ',' name
;
For example, the P4 core library introduces two extern objects packet_in and packet_out used for ma-
nipulating packets (see Sections 12.8 and 15). Here is an example showing how the methods of these
objects can be invoked on a packet:
extern packet_out {
    void emit<T>(in T hdr);
}
control d(packet_out b, in Hdr h) {
    apply {
        b.emit(h.ipv4); // write ipv4 header into output packet
    }                   // by calling emit method
}
Functions and methods are the only P4 constructs that support overloading: there can exist multiple
methods with the same name in the same scope. When overloading is used, the compiler must be able
to disambiguate at compile-time which method or function is being called, either by the number of
arguments or by the names of the arguments, when calls are specifying argument names. Argument
type information is not used in disambiguating calls.
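The disambiguation rule can be modeled in Python as follows. This is a simplified sketch and not the algorithm of any P4 compiler: the function resolve and the overload set OVERLOADS are hypothetical, optional parameters are ignored, and a call is assumed to name either all of its arguments or none of them.

```python
def resolve(overloads, positional=0, named=()):
    """Pick the overload matching a call, using only argument count or names.

    `overloads` maps a label to the tuple of parameter names of that overload.
    Argument types are deliberately ignored, as in the rule above.
    """
    named = tuple(named)
    if named and positional:
        raise ValueError("this sketch does not mix positional and named arguments")
    if named:
        matches = [label for label, params in overloads.items()
                   if sorted(params) == sorted(named)]
    else:
        matches = [label for label, params in overloads.items()
                   if len(params) == positional]
    if len(matches) != 1:
        raise LookupError(f"cannot disambiguate call: candidates {matches}")
    return matches[0]


# Hypothetical overload set: three functions named f in the same scope.
OVERLOADS = {
    "f(a)": ("a",),
    "f(a, b)": ("a", "b"),
    "f(x, y)": ("x", "y"),
}
```

In this model a call with one positional argument resolves to f(a), a call naming x and y resolves to f(x, y), and a call with two positional arguments is rejected because the count alone matches two candidates.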
Abstract methods Typical extern object methods are built-in, and are implemented by the target
architecture. P4 programmers can only call such methods.
However, some types of extern objects may provide methods that can be implemented by the P4
programmers. Such methods are described with the abstract keyword prior to the method definition.
Here is an example:
extern Balancer {
    Balancer();
    // get the number of active flows
    bit<32> getFlowCount();
    // return port index used for load-balancing
    // @param address: IPv4 source address of flow
    abstract bit<4> on_new_flow(in bit<32> address);
}
When such an object is instantiated, the user has to supply an implementation of all the abstract
methods (see Section 10.3.1).
specializedType
: prefixedType '<' typeArgumentList '>'
;
For example, the following extern declaration describes a generic block of registers, where the type of
the elements stored in each register is an arbitrary T.
extern Register<T> {
    Register(bit<32> size);
    T read(bit<32> index);
    void write(bit<32> index, T value);
}
The type T has to be specified when instantiating a set of registers, by specializing the Register type:
Register<bit<32>>(128) registerBank;
The instantiation of registerBank uses the Register type specialized with bit<32> bound
to the type parameter T.
struct, header, header_union and header stack types can be generic as well. In order to use such a
generic type it must be specialized with appropriate type arguments. For example
struct G<T> {
    S<T> s;
}
// Header union with a type obtained by specializing a generic header union type
HU<bit> hu;
7.2.11.1. Parser type declarations A parser type declaration describes the signature of a parser.
A parser should have at least one argument of type packet_in, representing the received packet that is
processed.
parserTypeDeclaration
: optAnnotations PARSER name optTypeParameters
'(' parameterList ')'
;
For example, the following is a type declaration of a parser type named P that is parameterized on a
type variable H. The parser receives as input a packet_in value b and produces two values:
7.2.11.2. Control type declarations A control type declaration describes the signature of a con-
trol block.
controlTypeDeclaration
: optAnnotations CONTROL name optTypeParameters
'(' parameterList ')'
;
packageTypeDeclaration
: optAnnotations PACKAGE name optTypeParameters
'(' parameterList ')'
;
All parameters of a package are evaluated at compilation-time, and in consequence they must all be
directionless (they cannot be in, out, or inout). Otherwise package types are very similar to parser type
declarations. Packages can only be instantiated; there are no runtime behaviors associated with them.
• For bool the default value is false.
• For error the default value is error.NoError (defined in core.p4).
• For string the default value is the empty string "".
• For varbit<N> the default value is a string of zero bits (there is currently no P4 literal to represent
such a value).
• For enum values with an underlying type the default value is 0, even if 0 is actually not one of the
named values in the enum.
• For enum values without an underlying type the default value is the first value that appears in the
enum type declaration.
• For header types the default value is invalid.
• For header stacks the default value is that all elements are invalid and the nextIndex is 0.
• For header_union values the default value is that all union elements are invalid.
• For struct types the default value is a struct where each field has the default value of the suitable
field type – if all such default values are defined.
• For a tuple type the default value is a tuple where each field has the default value of the suitable
type – if all such default values are defined.
Note that some types do not have default values, e.g., match_kind, set types, function types, extern types,
parser types, control types, package types.
7.4. typedef
A typedef declaration can be used to give an alternative name to a type.
typedefDeclaration
: optAnnotations TYPEDEF typeRef name ';'
| optAnnotations TYPEDEF derivedTypeDeclaration name ';'
;
The two types are treated as synonyms, and all operations that can be executed using the original type
can be also executed using the newly created type.
While similar to typedef, the type keyword in fact introduces a new type, which is not a synonym for
the original type: values of the original type and the newly introduced type cannot be mixed in
expressions.
One important use of such types is in describing P4 values that need to be exchanged with the
control-plane through communication channels (e.g., through the control-plane API or through net-
work packets sent to the control-plane). For example, a P4 architecture may define a type for the switch
ports:
This declaration will prevent PortId_t values from being used in arithmetic expressions. Moreover,
this declaration may enable special manipulation of such values by software that lies outside of the
datapath (e.g., a target-specific tool-chain could include software that automatically converts values of
type PortId_t to a different representation when exchanged with the control-plane software).
8. Expressions
This section describes all expressions that can be used in P4, grouped by the type of value they produce.
The grammar production rule for general expressions is as follows:
expression
: INTEGER
| TRUE
| FALSE
| STRING_LITERAL
| nonTypeName
| dotPrefix nonTypeName
| expression '[' expression ']'
| expression '[' expression ':' expression ']'
| '{' expressionList '}'
| '{' kvList '}'
| '(' expression ')'
| '!' expression
| '~' expression
| '-' expression
| '+' expression
| typeName '.' member
| ERROR '.' member
| expression '.' member
| expression '*' expression
| expression '/' expression
| expression '%' expression
| expression '+' expression
| expression '-' expression
| expression SHL expression // SHL is <<
| expression '>''>' expression // check that >> are contiguous
| expression LE expression // LE is <=
| expression GE expression
| expression '<' expression
| expression '>' expression
| expression NE expression // NE is !=
| expression EQ expression // EQ is ==
| expression '&' expression
| expression '^' expression
| expression '|' expression
| expression PP expression // PP is ++
| expression AND expression // AND is &&
| expression OR expression // OR is ||
| expression '?' expression ':' expression
| expression '<' realTypeArgumentList '>' '(' argumentList ')'
| expression '(' argumentList ')'
| namedType '(' argumentList ')'
| '(' typeRef ')' expression
;
expressionList
: /* empty */
| expression
| expressionList ',' expression
;
member
: name
;
argumentList
: /* empty */
| nonEmptyArgList
;
nonEmptyArgList
: argument
| nonEmptyArgList ',' argument
;
argument
: expression
;
typeArg
: DONTCARE
| typeRef
| nonTypeName
| VOID
;
typeArgumentList
: /* empty */
| typeArg
| typeArgumentList ',' typeArg
;
• Boolean operators && and || use short-circuit evaluation—i.e., the second operand is only eval-
uated if necessary.
• The conditional operator e1 ? e2 : e3 evaluates e1, and then either evaluates e2 or e3.
• All other expressions are evaluated left-to-right as they appear in the source program.
• Method and function calls are evaluated as described in Section 6.7.
error errorFromParser;
enum X { v1, v2, v3 }
X.v1 // reference to v1
v1 // error - v1 is not in the top-level namespace
Similar to errors, enum expressions without a specified underlying type only support equality (==) and
inequality (!=) comparisons. Expressions whose type is an enum without a specified underlying type
cannot be cast to or from any other type.
An enum may also specify an underlying type, such as the following:
enum bit<8> E {
    e1 = 0,
    e2 = 1,
    e3 = 2
}
More than one symbolic value in an enum may map to the same fixed-width integer value.
An enum with an underlying type also supports explicit casts to and from the underlying type. For in-
stance, the following code:
bit<8> x;
E a = E.e2;
E b;
x = (bit<8>) a; // sets x to 1
b = (E) x; // sets b to E.e2
casts a, which was initialized to E.e2, to a bit<8>, using the specified fixed-width unsigned integer
representation for E.e2, 1. The variable b is then set to the symbolic value E.e2, which corresponds to the
fixed-width unsigned integer value 1.
Because it is always safe to cast from an enum to its underlying fixed-width integer type, implicit
casting from an enum to its fixed-width (signed or unsigned) integer type is also supported:
E a = E.e2;
bit<8> y = a << 3; // sets y to 8 (a is automatically cast to bit<8> and then shifted)
enum bit<8> E1 {
    e1 = 0, e2 = 1, e3 = 2
}
enum bit<8> E2 {
    e1 = 10, e2 = 11, e3 = 12
}
E1 a = E1.e1;
E2 b = E2.e2;
a = (E1) b; // OK
A reasonable compiler might generate a warning in cases that involve multiple automatic casts.
E1 a = E1.e1;
E2 b = E2.e2;
bit<8> c;
Note that while it is always safe to cast from an enum to its fixed-width unsigned integer type, and vice
versa, there may be cases where casting a fixed-width unsigned integer value to its related enum type
produces an unnamed value.
bit<8> x = 5;
E e = (E) x; // sets e to an unnamed value
sets e to an unnamed value, since there is no symbol corresponding to the fixed-width unsigned integer
value 5.
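The cast behavior can be mimicked with a small Python model. It is an informal sketch, not P4: the dictionary E mirrors the enum bit<8> E declared earlier in this section, and the helpers enum_to_bits and bits_to_enum are hypothetical. A plain dictionary is used on purpose, since the model must tolerate values with no symbolic name.

```python
E = {"e1": 0, "e2": 1, "e3": 2}   # mirrors: enum bit<8> E { e1 = 0, e2 = 1, e3 = 2 }


def enum_to_bits(enum: dict, name: str) -> int:
    """Cast from the enum to its underlying type: always yields a named value."""
    return enum[name]


def bits_to_enum(enum: dict, value: int) -> list:
    """Cast from the underlying type to the enum.

    Returns the symbolic names carrying this value (several names may share
    one value); an empty list models an unnamed value.
    """
    return [name for name, v in enum.items() if v == value]


x = enum_to_bits(E, "e2")          # like x = (bit<8>) a, with a = E.e2
named = bits_to_enum(E, x)         # like b = (E) x
unnamed = bits_to_enum(E, 5)       # like E e = (E) x, with x = 5
shifted = (enum_to_bits(E, "e2") << 3) & 0xFF   # like bit<8> y = a << 3
```

In this model, unnamed is the empty list, so the value compares unequal to every symbol of E.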
For example, in the following code, the else clause of the if/else if/else block can be reached
even though the matches on x are complete with respect to the symbols defined in MyPartialEnum_t:
if (x == MyPartialEnum_t.VALUE_A) {
    // some code here
} else if (x == MyPartialEnum_t.VALUE_B) {
    // some code here
} else if (x == MyPartialEnum_t.VALUE_C) {
    // some code here
} else {
    // A P4 compiler MUST ASSUME that this branch can be executed
    // some code here
}
Additionally, if an enumeration is used as a field of a header, we would expect the transition select to
match default when the parsed integer value does not match one of the symbolic values of EtherType
in the following example:
header ethernet {
    // Some fields omitted
    EtherType etherType;
}
EtherType.IPV6: parse_ipv6;
default: reject;
}
}
Any variable with an enum type that contains an unnamed value, whether as the result of a cast to an
enum with an underlying type, a parse into a field whose type is an enum with an underlying type, or simply
the declaration of any enum without a specified initial value, will not be equal to any of the values defined
for that type. Such an unnamed value should still lead to predictable behavior in cases where any legal
value would match, e.g., it should match in any of these situations:
Note that if an enum value lacking an underlying type appears in the control-plane API, the compiler
must select a suitable serialization data type and representation. For enum values with an underlying
type and representations, the compiler should use the specified underlying type as the serialization
data type and representation.
if (x != 0) /* body omitted */
See the discussion on infinite-precision types and implicit casts in Section 8.9.2 for details on how the
0 in this expression is evaluated.
8.5. Operations on bit types (unsigned integers)
This section discusses all operations that can be performed on expressions of type bit<W> for some
width W, also known as bit-strings.
Arithmetic operations “wrap-around”, similar to C operations on unsigned values (i.e., represent-
ing a large value on W bits will only keep the least-significant W bits of the value). In particular, P4 does
not have arithmetic exceptions—the result of an arithmetic operation is defined for all possible inputs.
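The wrap-around behavior can be modeled in Python by masking every result to W bits. This is an informal sketch, not part of the specification; the helper names mask, add, neg, sub, and mul are hypothetical.

```python
def mask(width: int) -> int:
    """Bit mask with the low `width` bits set."""
    return (1 << width) - 1


def add(a: int, b: int, width: int) -> int:
    """bit<W> addition: keep the least-significant W bits of the sum."""
    return (a + b) & mask(width)


def neg(a: int, width: int) -> int:
    """bit<W> negation: 2**W - a, reduced to W bits (so neg(0) is 0)."""
    return (-a) & mask(width)


def sub(a: int, b: int, width: int) -> int:
    """bit<W> subtraction: add the negation of the second operand."""
    return add(a, neg(b, width), width)


def mul(a: int, b: int, width: int) -> int:
    """bit<W> multiplication: truncate the product to W bits."""
    return (a * b) & mask(width)
```

Every function is total: no input combination raises an exception, which mirrors the absence of arithmetic exceptions in P4.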
P4 target architectures may optionally support saturating arithmetic. All saturating operations are
limited to a fixed range between a minimum and maximum value. Saturating arithmetic has advantages,
in particular when used for counters. The result of a saturating counter maxing out is much
closer to the real result than that of a counter that overflows and wraps around. According to the
Wikipedia article on saturation arithmetic, saturating arithmetic is as numerically close to the true answer
as possible; for 8-bit binary signed arithmetic, when the correct answer is 130, it is considerably less
surprising to get an answer of 127 from saturating arithmetic than to get an answer of −126 from modular
arithmetic. Likewise, for 8-bit binary unsigned arithmetic, when the correct answer is 258, it is less
surprising to get an answer of 255 from saturating arithmetic than to get an answer of 2 from modular
arithmetic. At this time, P4 defines saturating operations only for addition and subtraction. For an
unsigned integer with bit-width of W, the minimum value is 0 and the maximum value is 2^W-1. The
precedence of saturating addition and subtraction operations is the same as for modulo arithmetic addition
and subtraction.
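The numeric examples in this paragraph can be reproduced with a short Python model of saturating and modular arithmetic. This is a sketch for illustration only; the helper names are hypothetical, and real targets implement these operations in hardware.

```python
def clamp(value: int, low: int, high: int) -> int:
    """Limit `value` to the closed range [low, high]."""
    return max(low, min(high, value))


def sat_add_unsigned(a: int, b: int, width: int) -> int:
    """Saturating addition on bit<W>: the result stays within 0 .. 2**W - 1."""
    return clamp(a + b, 0, (1 << width) - 1)


def sat_sub_unsigned(a: int, b: int, width: int) -> int:
    """Saturating subtraction on bit<W>: the result never drops below 0."""
    return clamp(a - b, 0, (1 << width) - 1)


def sat_add_signed(a: int, b: int, width: int) -> int:
    """Saturating addition on int<W>: clamp to the two's complement range."""
    return clamp(a + b, -(1 << (width - 1)), (1 << (width - 1)) - 1)


def wrap_signed(value: int, width: int) -> int:
    """Modular (wrap-around) interpretation of `value` as int<W>."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value
```

For 8-bit signed operands 100 and 30 (true sum 130), sat_add_signed gives 127 while wrap_signed(130, 8) gives -126; for 8-bit unsigned operands 250 and 8 (true sum 258), sat_add_unsigned gives 255 while the modular result is 2.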
All binary operations (except shifts) require both operands to have the same exact type and width;
supplying operands with different widths produces an error at compile time. No implicit casts are in-
serted by the compiler to equalize the widths. There are no binary operations that combine signed and
unsigned values (except shifts). The following operations are provided on bit-string expressions:
• Test for equality between bit-strings of the same width, designated by ==. The result is a Boolean
value.
• Test for inequality between bit-strings of the same width, designated by !=. The result is a Boolean
value.
• Unsigned comparisons <,>,<=,>=. Both operands must have the same width and the result is a
Boolean value.
Each of the following operations produces a bit-string result when applied to bit-strings of the same
width:
• Negation, denoted by unary -. The result is computed by subtracting the value from 2^W. The
result is unsigned and has the same width as the input. The semantics is the same as the C negation
of unsigned numbers.
• Unary plus, denoted by +. This operation behaves like a no-op.
• Addition, denoted by +. This operation is associative. The result is computed by truncating the
result of the addition to the width of the output (similar to C).
• Subtraction, denoted by -. The result is unsigned, and has the same type as the operands. It is
computed by adding the negation of the second operand (similar to C).
• Multiplication, denoted by *. The result has the same width as the operands and is computed by
truncating the result to the output's width. P4 architectures may impose additional restrictions—
e.g., they may only allow multiplication by a power of two.
• Bitwise “and” between two bit-strings of the same width, denoted by &.
• Bitwise “or” between two bit-strings of the same width, denoted by |.
• Bitwise “complement” of a single bit-string, denoted by ~.
• Bitwise “xor” of two bit-strings of the same width, denoted by ^.
• Saturating addition, denoted by |+|.
• Saturating subtraction, denoted by |-|.
• Extraction of a set of contiguous bits, also known as a slice, denoted by [m:l], where m and l must
be non-negative integers that are compile-time known values, and m >= l. The result is a bit-string of
width m - l + 1, including the bits numbered from l (which becomes the least significant bit of
the result) to m (the most significant bit of the result) from the source operand. The conditions
0 <= l < W and l <= m < W are checked statically (where W is the length of the source bit-string). Note
that both endpoints of the extraction are inclusive. The bounds are required to be compile-time
known values so that the result width can be computed at compile time. Slices are also l-values,
which means that P4 supports assigning to a slice: e[m:l] = x. The effect of this statement is to
set bits m to l of e to the bit-pattern represented by x, leaving all other bits of e unchanged. A
slice of an unsigned integer is an unsigned integer.
• Logical shift left and right with a runtime known unsigned integer value, denoted by << and >>
respectively. In a shift, the left operand is unsigned, and right operand must be either an ex-
pression of type bit<S> or a non-negative integer literal. The result has the same type as the left
operand. Shifting by an amount greater than or equal to the width of the input produces a result
where all bits are zero.
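The slice and shift rules above can be sketched as follows. The control name SliceSketch and its parameter are hypothetical and exist only to give the statements a legal context:

```p4
control SliceSketch(inout bit<16> e) {
    apply {
        bit<4> nib = e[7:4];  // bits 7 down to 4; result width is 7 - 4 + 1 = 4
        e[3:0] = nib;         // slice used as an l-value; bits 15..4 are unchanged
        bit<16> z = e << 17;  // shift amount exceeds the width: all bits are zero
    }
}
```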
• Comparison for equality and inequality, denoted == and != respectively. These operations pro-
duce a Boolean result.
• Numeric comparisons, denoted by <,<=,>, and >=. These operations produce a Boolean result.
• Multiplication, denoted by *. Result has the same width as the operands. P4 architectures may
impose additional restrictions—e.g., they may only allow multiplication by a power of two.
• Saturating addition, denoted by |+|.
• Saturating subtraction, denoted by |-|.
• Arithmetic shift left and right denoted by << and >>. The left operand is signed and the right
operand must be either an unsigned number of type bit<S> or a non-negative integer literal. The
result has the same type as the left operand. Shifting left produces the exact same bit pattern as
a shift left of an unsigned value. Shift left can thus overflow, when it leads to a change of the sign
bit. Shifting by an amount greater than the width of the input produces a “correct” result:
• Extraction of a set of contiguous bits, also known as a slice, denoted by [m:l], where m and l must
be non-negative integers that are compile-time known values, and m >= l. The result is an unsigned
bit-string of width m - l + 1, including the bits numbered from l (which becomes the least
significant bit of the result) to m (the most significant bit of the result) from the source operand.
The conditions 0 <= l < W and l <= m < W are checked statically (where W is the length of the source
bit-string). Note that both endpoints of the extraction are inclusive. The bounds are required to
be compile-time known values so that the result width can be computed at compile time. Slices
are also l-values, which means that P4 supports assigning to a slice: e[m:l] = x. The effect of
this statement is to set bits m to l of e to the bit-pattern represented by x, leaving all other bits
of e unchanged. A slice of a signed integer is treated like an unsigned integer.
8.6.1. Concatenation
Concatenation is applied to two bit-strings (signed or unsigned). It is denoted by the infix operator ++.
The result is a bit-string whose length is the sum of the lengths of the inputs where the most significant
bits are taken from the left operand; the sign of the result is taken from the left operand.
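A small sketch of concatenation, using illustrative constant names:

```p4
const bit<8>  hi  = 8w0xAB;
const bit<4>  lo  = 4w0xC;
const bit<12> cat = hi ++ lo;  // width 8 + 4 = 12; hi supplies the top bits: 12w0xABC
```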
• Right shift behaves differently for signed and unsigned values: right shift for signed values is an
arithmetic shift.
• Shifting with a negative amount does not have a clear semantics: the P4 type system makes it
illegal to shift with a negative amount.
• Unlike C, shifting by an amount larger or equal to the number of bits has a well-defined result.
• Finally, depending on the capabilities of the target, shifting may require doing work which is
exponential in the number of bits of the right-hand-side operand.
bit<8> x;
bit<16> y;
bit<16> z = y << x;
bit<16> w = y << 1024;
As mentioned above, P4 gives a precise meaning to shifting by an amount larger than the size of the
shifted value, unlike C.
P4 targets may impose additional restrictions on shift operations such as forbidding shifts by non-
constant expressions, or by expressions whose width exceeds a certain bound. For example, a target
may forbid shifting an 8-bit value by a non-constant value whose width is greater than 3 bits.
Each operand that participates in any of these operations must have type int. Binary operations
cannot be used to combine values of type int with values of a fixed-width type. However, the compiler
automatically inserts casts from int to fixed-width types in certain situations—see Section 8.9.
All computations on int values are carried out without loss of information. For example, multiply-
ing two 1024-bit values may produce a 2048-bit value (note that concrete representation of int values is
not specified). int values can be cast to bit<w> and int<w> values. Casting an int value to a fixed-width
type will preserve the least-significant bits. If truncation causes significant bits to be lost, the compiler
should emit a warning.
Note: bitwise-operations (|,&,^,~) are not defined on expressions of type int. In addition, it is illegal
to apply division and modulo to negative values.
Note: saturating arithmetic is not supported for arbitrary-precision integers.
Variable-length bit-strings support a limited set of operations:
• Assignment to another variable-sized bit-string. The target of the assignment must have the same
static width as the source. When executed, the assignment sets the dynamic width of the target
to the dynamic width of the source.
• Comparison for equality or inequality with another varbit field. Two varbit fields can be com-
pared only if they have the same type. Two varbits are equal if they have the same dynamic width
and all the bits up to the dynamic width are the same.
The following operations are not supported directly on a value of type varbit, but instead on any type for
which extract and emit operations are supported (e.g. a value with type header) that may contain a field
of type varbit. They are mentioned here only to ease finding this information in a section dedicated to
type varbit.
• Parser extraction into a header containing a variable-sized bit-string using the two-argument
extract method of a packet_in extern object (see Section 12.8.2). This operation sets the dynamic
width of the field.
• The emit method of a packet_out extern object can be performed on a header and a few other
types (see Section 15) that contain a field with type varbit. Such an emit method call inserts a
variable-sized bit-string with a known dynamic width into the packet being constructed.
8.9. Casts
P4 provides a limited set of casts between types. A cast is written (t) e, where t is a type and e is an
expression. Casts are only permitted between base types. While this design is arguably more onerous
for programmers, it has several benefits:
• bit<1> <-> bool: converts the value 0 to false, the value 1 to true, and vice versa.
• int -> bool: only if the int value is 0 (converted to false) or 1 (converted to true)
• int<W> -> bit<W>: preserves all bits unchanged and reinterprets negative values as positive values
• bit<W> -> int<W>: preserves all bits unchanged and reinterprets values whose most-significant
bit is 1 as negative values
• bit<W> -> bit<X>: truncates the value if W > X, and otherwise (i.e., if W <= X) pads the value with
zero bits.
• int<W> -> int<X>: truncates the value if W > X, and otherwise (i.e., if W < X) extends it with the
sign bit.
• bit<W> -> int: preserves the value unchanged but converts it to an unlimited-precision integer;
the result is always non-negative
• int<W> -> int: preserves the value unchanged but converts it to an unlimited-precision integer;
the result may be negative
• int -> bit<W>: converts the integer value into a sufficiently large two's complement bit string
to avoid information loss, and then truncates the result to W bits. The compiler should emit a
warning on overflow or on conversion of negative value.
• int -> int<W>: converts the integer value into a sufficiently-large two's complement bit string
to avoid information loss, and then truncates the result to W bits. The compiler should emit a
warning on overflow.
• casts between two types that are introduced by typedef and are equivalent to one of the above
combinations.
• casts between a type introduced by type and the original type.
• casts between an enum with an explicit type and its underlying type
enum bit<8> E {
a = 5;
}
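Several of the explicit casts listed above can be sketched as follows (the constant names are illustrative only):

```p4
const bit<8>  u      = 8w0xFF;
const int<8>  s      = (int<8>)u;    // same bits, most-significant bit is 1: value -1
const bit<16> wide   = (bit<16>)u;   // padded with zero bits: 16w0x00FF
const int<16> swide  = (int<16>)s;   // extended with the sign bit: value -1
const bit<4>  narrow = (bit<4>)u;    // truncated to the low 4 bits: 4w0xF
const bool    flag   = (bool)1w1;    // bit<1> value 1 converts to true
```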
bit<8> x;
bit<16> y;
int<8> z;
• x + 1 becomes x + (bit<8>)1
• z < 0 becomes z < (int<8>)0
• x << 13 becomes 0; overflow warning
• x | 0xFFF becomes x | (bit<8>)0xFFF; overflow warning
• x + E.a becomes x + (bit<8>)E.a
bit<8> x;
bit<16> y;
int<8> z;
The table below shows several expressions which are illegal because they do not obey the P4 typing
rules. For each expression we provide several ways that the expression could be manually rewritten
into a legal expression. Note that for some expressions there are several legal alternatives, which may
produce different results! The compiler cannot guess the user intent, so P4 requires the user to disam-
biguate.
Expression   Why it is illegal                    Alternatives
x + y        Different widths                     (bit<16>)x + y
                                                  x + (bit<8>)y
x + z        Different signs                      (int<8>)x + z
                                                  x + (bit<8>)z
(int<8>)y    Cannot change both sign and width    (int<8>)(bit<8>)y
                                                  (int<8>)(int<16>)y
y + z        Different widths and signs           (int<8>)(bit<8>)y + z
                                                  y + (bit<16>)(bit<8>)z
                                                  (bit<8>)y + (bit<8>)z
                                                  (int<16>)y + (int<16>)z
x << z       RHS of shift cannot be signed        x << (bit<8>)z
x < z        Different signs                      x < (bit<8>)z
                                                  (int<8>)x < z
1 << x       Width of 1 is unknown                32w1 << x
~1           Bitwise operation on int             ~32w1
5 & -3       Bitwise operation on int             32w5 & -3
The fields of a tuple can be accessed using array index syntax x[0], x[1]. The array indexes must be
compile-time constants, to enable the type-checker to identify the field types statically.
Currently tuple fields are not left-values, even if the tuple itself is. (I.e. a tuple can only be assigned
monolithically, and the field values cannot be changed individually.) This restriction may be lifted in
a future version of the language.
expression ...
| '{' expressionList '}'
expressionList
: /* empty */
| expression
| expressionList ',' expression
;
The type of a list expression is a tuple type (Section 7.2.8). List expressions can be assigned to expres-
sions of type tuple, struct or header, and can also be passed as arguments to methods. Lists may be
nested. However, list expressions are not l-values.
For example, the following program fragment uses a list expression to pass several header fields
simultaneously to a learning provider:
extern LearningProvider {
void learn<T>(in T data);
}
LearningProvider() lp;
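A call that passes a list expression to the provider might look like the following sketch. The field names hdr.ethernet.srcAddr and hdr.ipv4.src are assumptions; any expressions could appear in the list:

```p4
// Hypothetical header fields, bundled into a single list-expression argument.
lp.learn({ hdr.ethernet.srcAddr, hdr.ipv4.src });
```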
A list may be used to initialize a structure if the list has the same number of elements as fields in the
structure. The effect of such an initializer is to assign the ith element of the list to the ith field in the
structure:
struct S {
bit<32> a;
bit<32> b;
}
const S x = { 10, 20 }; //a = 10, b = 20
List expressions can also be used to initialize variables whose type is a tuple type.
The empty list expression has type tuple<> - a tuple with no components.
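For instance, a tuple-typed variable could be initialized as in this sketch (the variable names are illustrative):

```p4
tuple<bit<32>, bool> x = { 10, false };  // x[0] is 10, x[1] is false
tuple<> empty = {};                      // the empty list expression has type tuple<>
```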
expression ...
| '{' kvList '}'
| '(' typeRef ')' expression
;
kvList
: kvPair
| kvList "," kvPair
;
kvPair
: name "=" expression
;
For a structure-valued expression typeRef is the name of a struct or header type. The typeRef can be
omitted if it can be inferred from context, e.g., when initializing a variable with a struct type. The fol-
lowing example shows a structure-valued expression used in an equality comparison expression:
struct S {
bit<32> a;
bit<32> b;
}
S s;
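A comparison of s against a structure-valued expression could then be written as in the sketch below. The result name same is illustrative, and the explicit (S) supplies the typeRef:

```p4
// Compare s field by field with a structure-valued expression of type S.
bool same = s == (S) { a = 1, b = 2 };
```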
select (expression) {
set1: state1;
set2: state2;
// More labels omitted
}
Here the expressions set1, set2, etc. evaluate to sets of values and the select expression tests whether
expression belongs to the sets used as labels.
keysetExpression
: tupleKeysetExpression
| simpleKeysetExpression
;
tupleKeysetExpression
: "(" simpleKeysetExpression "," simpleExpressionList ")"
| "(" reducedSimpleKeysetExpression ")"
;
simpleExpressionList
: simpleKeysetExpression
| simpleExpressionList ',' simpleKeysetExpression
;
reducedSimpleKeysetExpression
: expression "&&&" expression
| expression ".." expression
| DEFAULT
| "_"
;
simpleKeysetExpression
: expression
| DEFAULT
| DONTCARE
| expression MASK expression
| expression RANGE expression
;
The mask (&&&) and range (..) operators have the same precedence, which is just higher than &.
select (hdr.ipv4.version) {
4: continue;
}
select (hdr.ipv4.version) {
4: continue;
_: reject;
}
8.13.3. Masks
The infix operator &&& takes two arguments of type bit<W> or serializable enum, and creates a value of
type set<ltype>, where ltype is the type of the left argument. The right value is used as a “mask”, where
each bit set to 0 in the mask indicates a “don't care” bit. More formally, the set denoted by a &&& b is
defined as follows:
a &&& b = { c of type bit<W> where a & b = c & b }
For example:
8w0x0A &&& 8w0x0F
denotes a set that contains 16 different 8-bit values, whose bit-pattern is XXXX1010, where the value of an
X can be any bit. Note that there may be multiple ways to express a keyset using a mask operator—e.g.,
8w0xFA &&& 8w0x0F denotes the same keyset as in the example above.
P4 architectures may impose additional restrictions on the expressions on the left and right-hand
side of a mask operator: for example, they may require that either or both sub-expressions be compile-
time known values.
8.13.4. Ranges
The infix operator .. takes two arguments of the same type T, where T is either bit<W> or int<W>, and
creates a value of type set<T>. The set contains all values numerically between the first and the second,
inclusively. For example:
4w5 .. 4w8
8.13.5. Products
Multiple sets can be combined using Cartesian product:
select(hdr.ipv4.ihl, hdr.ipv4.protocol) {
(4w0x5, 8w0x1): parse_icmp;
(4w0x5, 8w0x6): parse_tcp;
(4w0x5, 8w0x11): parse_udp;
(_, _): accept;
}
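Masks, ranges, and products can be combined in a single select expression. The following is a minimal sketch; the header Tag_h, the struct Hdrs, the parser name, and the state names are all hypothetical:

```p4
#include <core.p4>

header Tag_h { bit<8> kind; bit<8> port; }   // hypothetical header
struct Hdrs { Tag_h tag; }

parser KeysetSketch(packet_in b, out Hdrs h) {
    state start {
        b.extract(h.tag);
        transition select(h.tag.kind, h.tag.port) {
            (8w0x0A &&& 8w0x0F, _) : kind_low_a;  // kind has bit-pattern XXXX1010
            (_, 8w5 .. 8w8)        : small_port;  // port is 5, 6, 7, or 8
            default                : accept;
        }
    }
    state kind_low_a { transition accept; }
    state small_port { transition accept; }
}
```

Labels are tested in order, so a packet matching both of the first two labels takes the first one.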
8.15. Structure initializers
Structures can be initialized using a structure-valued expression (8.12). The following example shows a
structure initialized using a structure-valued expression:
struct S {
bit<32> a;
bit<32> b;
}
const S x = { a = 10, b = 20 };
const S x = (S){ a = 10, b = 20 }; // equivalent
The compiler must raise an error if a field name appears more than once in the same structure initial-
izer.
See Section 8.22 for a description of the behavior if struct fields are read without being initialized.
• The method isValid() returns the value of the “validity” bit of the header.
• The method setValid() sets the header's validity bit to “true”. It can only be applied to an l-value.
• The method setInvalid() sets the header's validity bit to “false”. It can only be applied to an l-
value.
The expression h.minSizeInBits() is defined for any value h that has a header type. The expression is
equal to the sum of the sizes of all of header h's fields in bits, counting all varbit fields as length 0. An
expression h.minSizeInBits() is a compile-time constant with type int.
The expression h.minSizeInBytes() is similar to h.minSizeInBits(), except that it returns the total
size of all of the header's fields in bytes, rounding up to the next whole number of bytes if the header's
size is not a multiple of 8 bits long. h.minSizeInBytes() is equal to (h.minSizeInBits() + 7) >> 3.
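As a worked sketch, consider a hypothetical header Opt_h whose fixed fields total 12 bits (some targets may require header sizes to be a multiple of 8 bits; the language itself does not):

```p4
header Opt_h {           // hypothetical header type
    bit<12>    a;
    varbit<64> opts;     // counted as length 0
}
Opt_h h;
// h.minSizeInBits()  is 12
// h.minSizeInBytes() is (12 + 7) >> 3 = 2
```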
Similar to a struct, a header object can be initialized with a list expression 8.11 — the list fields are
assigned to the header fields in the order they appear — or with a structure initializer expression 8.14.
When initialized the header automatically becomes valid:
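A minimal sketch of both forms of initialization, using a hypothetical header type Pair_h:

```p4
header Pair_h {            // hypothetical header type
    bit<32> x;
    bit<32> y;
}
Pair_h h;                  // initially invalid
h = { 10, 12 };            // list expression: x = 10, y = 12; h becomes valid
h = { y = 12, x = 10 };    // structure initializer: same effect
```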
Two headers can be compared for equality (==) or inequality (!=) only if they have the same type. Two
headers are equal if and only if they are both invalid, or they are both valid and all their corresponding
fields are equal.
See Section 8.22 for a description of the behavior if header fields are read without being initialized,
or header fields are written to a currently invalid header.
8.17. Operations on header stacks
A header stack is a fixed-size array of headers with the same type. The valid elements of a header stack
need not be contiguous. P4 provides a set of computations for manipulating header stacks. A header
stack hs of type h[n] can be understood in terms of the following pseudocode:
// type declaration
struct hs_t {
bit<32> nextIndex;
bit<32> size;
h[n] data; // Ordinary array
}
Intuitively, a header stack can be thought of as a struct containing an ordinary array of headers hs and
a counter nextIndex that can be used to simplify the construction of parsers for header stacks, as dis-
cussed below. The nextIndex counter is initialized to 0.
Given a header stack value hs of size n, the following expressions are legal:
• hs[index]: produces a reference to the header at the specified position within the stack; if hs is
an l-value, the result is also an l-value. The header may be invalid. Some implementations may
impose the constraint that the index expression evaluates to a compile-time known value. A P4
compiler must give an error if an index value that is a compile-time constant is out of range.
Accessing a header stack hs with an index less than 0 or greater than or equal to hs.size results
in an undefined value. See Section 8.22 for more details.
The index is an expression that must be one of the following types:
• hs.size: produces a 32-bit unsigned integer that returns the size of the header stack (a compile-
time constant).
• assignment from a header stack hs into another stack requires the stacks to have the same types
and sizes. All components of hs are copied, including its elements and their validity bits, as well
as nextIndex.
To help programmers write parsers for header stacks, P4 also offers computations that automatically
advance through the stack as elements are parsed:
• hs.next: produces a reference to the element with index hs.nextIndex in the stack. May only be
used in a parser. If the stack's nextIndex counter is greater than or equal to size, then evaluating
this expression results in a transition to reject and sets the error to error.StackOutOfBounds. If hs
is an l-value, then hs.next is also an l-value.
• hs.last: produces a reference to the element with index hs.nextIndex - 1 in the stack, if such an
element exists. May only be used in a parser. If the nextIndex counter is less than 1, or greater
than size, then evaluating this expression results in a transition to reject and sets the error to
error.StackOutOfBounds. Unlike hs.next, the resulting reference is never an l-value.
• hs.lastIndex: produces a 32-bit unsigned integer that encodes the index hs.nextIndex - 1. May
only be used in a parser. If the nextIndex counter is 0, then evaluating this expression produces
an undefined value.
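The use of next and last can be sketched with a parser that loops over a stack of MPLS-like labels. The type and field names below are assumptions made for illustration:

```p4
#include <core.p4>

header Mpls_h {              // hypothetical label header
    bit<20> label;
    bit<3>  tc;
    bit<1>  bos;             // bottom-of-stack marker
    bit<8>  ttl;
}
struct Pkthdr { Mpls_h[3] mpls; }

parser StackSketch(packet_in b, out Pkthdr p) {
    state start {
        b.extract(p.mpls.next);              // fills element nextIndex, then advances it
        transition select(p.mpls.last.bos) { // last refers to the element just extracted
            0: start;                        // more labels follow
            1: accept;
        }
    }
}
```

If a fourth label is present, evaluating p.mpls.next with nextIndex equal to the stack size causes a transition to reject with error.StackOutOfBounds.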
Finally, P4 offers the following computations that can be used to manipulate the elements at the front
and back of the stack:
• hs.push_front(int count): shifts hs “right” by count. The first count elements become invalid. The
last count elements in the stack are discarded. The hs.nextIndex counter is incremented by count.
The count argument must be a positive integer that is a compile-time known value. The return
type is void.
• hs.pop_front(int count): shifts hs “left” by count (i.e., element with index count is copied in stack
at index 0). The last count elements become invalid. The hs.nextIndex counter is decremented
by count. The count argument must be a positive integer that is a compile-time known value. The
return type is void.
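The effect of push_front and pop_front on a three-element stack can be sketched as follows. The header Label_h and the control name are hypothetical:

```p4
header Label_h { bit<32> label; }   // hypothetical header type

control StackOpsSketch(inout Label_h[3] hs) {
    apply {
        // Before: hs[0] = A, hs[1] = B, hs[2] = C
        hs.push_front(1);
        // After:  hs[0] invalid, hs[1] = A, hs[2] = B; C is discarded
        hs[0].setValid();
        hs[0].label = 32w7;
        hs.pop_front(1);
        // After:  hs[0] = A, hs[1] = B, hs[2] invalid
    }
}
```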
}
if (this.nextIndex >= count) {
this.nextIndex = this.nextIndex - count;
} else {
this.nextIndex = 0;
}
// Note: this.last, this.next, and this.lastIndex adjust with this.nextIndex
}
Two header stacks can be compared for equality (==) or inequality (!=) only if they have the same el-
ement type and the same length. Two stacks are equal if and only if all their corresponding elements
are equal. Note that the nextIndex value is not used in the equality comparison.
header H1 {
bit<8> f;
}
header H2 {
bit<16> g;
}
header_union U {
H1 h1;
H2 h2;
}
U u; // u invalid
This also implies that each of the headers h1 through hn contained in a header union are also initially
invalid. Unlike headers, a union cannot be initialized. However, the validity of a header union can be
updated by assigning a valid header to one of its elements:
U u;
H1 my_h1 = { 8w0 }; // my_h1 is valid
u.h1 = my_h1; // u and u.h1 are both valid
U u;
u.h2 = { 16w1 }; // u and u.h2 are both valid
U u;
u.h1.setValid(); // u and u.h1 are both valid
H1 my_h1 = u.h1; // my_h1 is now valid, but contains an undefined value
Note that reading an uninitialized header produces an undefined value, even if the header is itself valid.
More formally, if u is an expression whose type is a header union U with fields ranged over by hi,
then the following operations can be used to manipulate u:
• u.hi.setValid(): sets the valid bit for header hi to true and sets the valid bit for all other headers
to false, which implies that reading these headers will return an unspecified value.
• u.hi.setInvalid(): if the valid bit for any member header of u is true then sets it to false, which
implies that reading any member header of u will return an unspecified value.
u.hi = e
as equivalent to
u.hi.setValid();
u.hi = e;
if e is valid and
u.hi.setInvalid();
otherwise.
Assignments between variables of the same type of header union are permitted. The assignment
u1 = u2 copies the full state of header union u2 to u1. If u2 is valid, then there is some header u2.hi that
is valid. The assignment behaves the same as u1.hi = u2.hi. If u2 is not valid, then u1 becomes invalid
(i.e. if any header of u1 was valid, it becomes invalid).
u.isValid() returns true if any member of the header union u is valid, otherwise it returns false.
setValid() and setInvalid() methods are not defined for header unions.
Supplying an expression with a union type to emit simply emits the single header that is valid, if
any.
The following example shows how we can use header unions to represent IPv4 and IPv6 headers
uniformly:
header_union IP {
IPv4 ipv4;
IPv6 ipv6;
}
struct Parsed_packet {
Ethernet ethernet;
IP ip;
}
parser top(packet_in b, out Parsed_packet p) {
state start {
b.extract(p.ethernet);
transition select(p.ethernet.etherType) {
16w0x0800 : parse_ipv4;
16w0x86DD : parse_ipv6;
}
}
state parse_ipv4 {
b.extract(p.ip.ipv4);
transition accept;
}
state parse_ipv6 {
b.extract(p.ip.ipv6);
transition accept;
}
}
As another example, we can also use unions to parse (selected) TCP options:
header Tcp_option_end_h {
bit<8> kind;
}
header Tcp_option_nop_h {
bit<8> kind;
}
header Tcp_option_ss_h {
bit<8> kind;
bit<32> maxSegmentSize;
}
header Tcp_option_s_h {
bit<8> kind;
bit<24> scale;
}
header Tcp_option_sack_h {
bit<8> kind;
bit<8> length;
varbit<256> sack;
}
header_union Tcp_option_h {
Tcp_option_end_h end;
Tcp_option_nop_h nop;
Tcp_option_ss_h ss;
Tcp_option_s_h s;
Tcp_option_sack_h sack;
}
typedef Tcp_option_h[10] Tcp_option_stack;
struct Tcp_option_sack_top {
bit<8> kind;
bit<8> length;
}
Two header unions can be compared for equality (==) or inequality (!=) if they have the same type. The
unions are equal if and only if all their corresponding fields are equal (i.e., either all fields are invalid
in both unions, or in both unions the same field is valid, and the values of the valid fields are equal as
headers).
expression
: ...
| expression '<' realTypeArgumentList '>' '(' argumentList ')'
| expression '(' argumentList ')'
argumentList
: /* empty */
| nonEmptyArgList
;
nonEmptyArgList
: argument
| nonEmptyArgList ',' argument
;
argument
: expression /* positional argument */
| name '=' expression /* named argument */
| DONTCARE
;
realTypeArgumentList
: realTypeArg
| realTypeArgumentList ',' typeArg
;
realTypeArg
: DONTCARE
| typeRef
| VOID
;
A function call or method invocation can optionally specify for each argument the corresponding pa-
rameter name. It is illegal to use names only for some arguments: either all or no arguments should
specify the parameter name. Function arguments are evaluated in the order they appear, left to right,
before the function invocation takes place.
f(xa, ya); // match arguments by position
f(x = xa, y = ya); // match arguments by name
f(y = ya, x = xa); // match arguments by name in any order
//f(x = xa); -- error: not enough arguments
//f(x = xa, x = ya); -- error: argument specified twice
//f(x = xa, ya); -- error: some arguments specified by name
//f(z = xa, w = yz); -- error: no parameter named z or w
//f(x = xa, y = 0); -- error: y must be a left-value
The calling convention is copy-in/copy-out (Section 6.7). For generic functions the type arguments
can be explicitly specified in the function call. The compiler only inserts implicit casts for direction in
arguments to methods or functions as described in Section 8.9. The types for all other arguments must
match the parameter types exactly.
The result returned by a function call is discarded when the function call is used as a statement.
The “don't care” identifier (_) can only be used for an out function/method argument, when the
value returned in that argument is ignored by subsequent computations. When used in generic
functions or methods, the compiler may reject the program if it is unable to infer a type for the don't
care argument.
• extern objects
• parsers
• control blocks
• packages
• Using constructor invocations, which are expressions that return an object of the corresponding
type.
• Using instantiations, described in Section 10.3.
The syntax for a constructor invocation is similar to a function call; constructors can also be called
using named arguments. Constructors are evaluated entirely at compilation-time (see Section 17). In
consequence, all constructor arguments must also be expressions that can be evaluated at compilation
time.
The following example shows a constructor invocation for setting the target-dependent implemen-
tation property of a table:
extern ActionProfile {
ActionProfile(bit<32> size); // constructor
}
table tbl {
actions = { /* body omitted */ }
implementation = ActionProfile(1024); // constructor invocation
}
8.21. Operations on types introduced by type
Values with a type introduced by the type keyword provide only a few operations:
Calling the isValid() method on an element of a header stack, where the index is out of range, returns
an undefined boolean value, i.e., it is either true or false, but the specification does not require one or
the other, nor that a consistent value is returned across multiple such calls. Assigning an out of range
header stack element to another header variable h leads to a state where h is undefined in all of its field
values, and its validity is also undefined.
Where a header is mentioned, it may be a member of a header_union, an element in a header stack,
or a normal header. This unspecified value could differ from one such read to another.
For an uninitialized field or variable with a type of enum or error, the unspecified value that is read
might not be equal to any of the values defined for that type. Such an unspecified value should still
lead to predictable behavior in cases where any legal value would match, e.g. it should match in any of
these situations:
Consider a situation where a header_union u1 has member headers u1.h1 and u1.h2, and at a given point
in the program's execution u1.h1 is valid and u1.h2 is invalid. If a write is attempted to a field of the
invalid member header u1.h2, then any or all of the fields of the valid member header u1.h1 may change
as a result. Such a write must not change the validity of any member headers of u1, nor any other state
that is currently defined in the system, whether it is defined state in header fields or anywhere else.
If any of these kinds of writes are performed:
• a write to a field in a currently invalid header, either a regular header or an element of a header
stack with an index that is in range, and that header is not part of a header_union
• a write to a field in an element of a header stack, where the index is out of range
• a method call of setValid() or setInvalid() on an element of a header stack, where the index is
out of range
then that write must not change any state that is currently defined in the system, neither in header fields
nor anywhere else. In particular, if an invalid header is involved in the write, it must remain invalid.
Any writes to fields in a currently invalid header, or to header stack elements where the index is out
of range, are allowed to modify state whose values are not defined, e.g. the values of fields in headers
that are currently invalid.
For a top level parser or control in an architecture, it is up to that architecture to specify whether pa-
rameters with direction in or inout are initialized when the control is called, and under what conditions
they are initialized, and if so, what their values will be.
Since P4 allows empty tuples and structs, one can construct types whose values carry no “useful”
information, e.g.:
struct Empty {
tuple<> t;
}
• a tuple having all fields of an empty type
• a struct having all fields of an empty type
Values with empty types carry no useful information. In particular, they do not have to be explicitly
initialized to have a legal value.
(Header types with no fields always have a validity bit.)
struct S {
bit<32> b32;
bool b;
}
enum int<8> N0 {
one = 1,
zero = 0,
two = 2
}
enum N1 {
A, B, C, F
}
struct T {
S s;
N0 n0;
N1 n1;
}
header H {
bit<16> f1;
bit<8> f2;
}
T t0 = ...; // initialize t0 with the value { { 0, false }, 0, N1.A }
T t1 = { s = ..., ... }; // initialize t1 with the value { { 0, false }, 0, N1.A }
T t2 = { s = ... }; // error: no initializer specified for fields n0 and n1
tuple<N0, N1> p = { ... }; // initialize p with default value { 0, N1.A }
T t3 = { ..., n0 = 2}; // error: ... must be last
H h1 = ...; // initialize h1 with a header that is invalid
H h2 = { f2=5, ... }; // initialize h2 with a header that is valid, field f1 0, field f2 5
H h3 = { ... }; // initialize h3 with a header that is valid, field f1 0, field f2 0
9. Function declarations
Functions can only be declared at the top-level and all parameters must have a direction. P4 functions
are modeled after functions as found in most other programming languages; however, the language
does not permit recursive functions.
functionDeclaration
: functionPrototype blockStatement
;
functionPrototype
: typeOrVoid name optTypeParameters '(' parameterList ')'
;
Here is an example of a function that returns the maximum of two 32-bit values:
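A minimal sketch of such a function, with illustrative parameter names left and right:

```p4
bit<32> max(in bit<32> left, in bit<32> right) {
    if (left > right)
        return left;   // unsigned comparison on bit<32>
    return right;      // every execution path returns a bit<32> value
}
```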
A function returns a value using the return statement. A function that returns void can simply use the
return statement with no arguments. A function with a non-void return type must return a value of the
suitable type on all possible execution paths.
constantDeclaration
: optAnnotations CONST typeRef name '=' initializer ';'
;
initializer
: expression
;
Such a declaration introduces a constant whose value has the specified type. The following are all legal
constant declarations:
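The declarations below are a sketch of the forms the grammar permits. The names COUNTER, Version, and version are illustrative:

```p4
const bit<32> COUNTER = 32w0x0;          // base-type constant
struct Version {
    bit<32> major;
    bit<32> minor;
}
const Version version = { 32w0, 32w0 };  // struct constant from a list expression
```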
10.2. Variables
Local variables are declared with a type, a name, and an optional initializer (as well as an optional
annotation):
variableDeclaration
: annotations typeRef name optInitializer ';'
| typeRef name optInitializer ';'
;
optInitializer
: /* empty */
| '=' initializer
;
Variable declarations without an initializer are uninitialized (except for headers and other header-
related types, which are initialized to invalid in the same way as described for direction out parameters
in Section 6.7). The language places few restrictions on the types of the variables: most P4 types that
can be written explicitly can be used (e.g., base types, struct, header, header stack, tuple). However, it
is impossible to declare variables with types that are only synthesized by the compiler (e.g., set). In ad-
dition, variables of type parser, control, package, or extern types must be declared using instantiations
(see Section 10.3).
Reading the value of a variable that has not been initialized yields an undefined result. The compiler
should attempt to detect and emit a warning in such situations.
Variable declarations can appear in the following locations within a P4 program:
• In a block statement,
• In a parser state,
• In an action body,
• In a control block apply block,
• In the list of local declarations in a parser, and
• In the list of local declarations in a control.
Variables have local scope, and behave like stack-allocated variables in languages such as C. The value
of a variable is never preserved from one invocation of its enclosing block to the next. In particular,
variables cannot be used to maintain state between different network packets.
10.3. Instantiations
Instantiations are similar to variable declarations, but are reserved for the types with constructors
(extern objects, control blocks, parsers, and packages):
instantiation
: typeRef '(' argumentList ')' name ';'
| annotations typeRef '(' argumentList ')' name ';'
;
lvalue:
...
| THIS
expression:
...
| THIS
instantiation:
...
| annotations typeRef "(" argumentList ")" name "=" objInitializer ";"
| typeRef "(" argumentList ")" name "=" objInitializer ";"
objInitializer
: "{" objDeclarations "}"
;
objDeclarations
: /* empty */
| objDeclarations objDeclaration
;
objDeclaration
: functionDeclaration
| instantiation
;
The abstract methods can only use the supplied arguments or refer to values that are in the top-level
scope. When calling another method of the same instance the this keyword is used to indicate the
current object instance:
// Instantiate a balancer
Balancer() b = { // provide an implementation for the abstract methods
bit<4> on_new_flow(in bit<32> address) {
// uses the address and the number of flows to load balance
bit<32> count = this.getFlowCount(); // call method of the same instance
return (address + count)[3:0];
}
}
Abstract methods may be invoked by users explicitly, or they may be invoked by the target architec-
ture. The architectural description has to specify when the abstract methods are invoked and what
the meaning of their arguments and return values is; target architectures may impose additional con-
straints on abstract methods.
// Program
control c(/* parameters omitted */) { /* body omitted */ }
c() c1; // illegal top-level instantiation
because control c1 is instantiated at the top-level. Note that top-level declarations of constants and
instantiations of extern objects are permitted.
11. Statements
Every statement in P4 (except block statements) must end with a semicolon. Statements can appear in
several places:
There are restrictions for the kinds of statements that can appear in each of these places. For example,
returns are not supported in parsers, and switch statements are only supported in control blocks. We
present here the most general case, for control blocks.
statement
: assignmentOrMethodCallStatement
| conditionalStatement
| emptyStatement
| blockStatement
| exitStatement
| returnStatement
| switchStatement
;
assignmentOrMethodCallStatement
: lvalue '(' argumentList ')' ';'
| lvalue '<' typeArgumentList '>' '(' argumentList ')' ';'
| lvalue '=' expression ';'
;
11.2. Empty statement
The empty statement, written ;, is a no-op.
emptyStatement
: ';'
;
blockStatement
: optAnnotations '{' statOrDeclList '}'
;
statOrDeclList
: /* empty */
| statOrDeclList statementOrDeclaration
;
statementOrDeclaration
: variableDeclaration
| constantDeclaration
| statement
| instantiation
;
returnStatement
: RETURN ';'
| RETURN expression ';'
;
11.5. Exit statement
The exit statement immediately terminates the execution of all the blocks currently executing: the
current action (if invoked within an action), the current control, and all its callers. exit statements are
not allowed within parsers or functions.
Any copy-out behavior due to direction out or inout parameters of the enclosing action or control,
and of all of its callers, is still performed after the execution of the exit statement. See Section 6.7 for
details on copy-out behavior.
exitStatement
: EXIT ';'
;
conditionalStatement
: IF '(' expression ')' statement
| IF '(' expression ')' statement ELSE statement
;
When several if statements are nested, the else applies to the innermost if statement that does not
have an else statement.
switchStatement
: SWITCH '(' expression ')' '{' switchCases '}'
;
switchCases
: /* empty */
| switchCases switchCase
;
switchCase
: switchLabel ':' blockStatement
| switchLabel ':' // fall-through
;
switchLabel
: DEFAULT
| nonBraceExpression
;
nonBraceExpression
: INTEGER
| TRUE
| FALSE
| STRING_LITERAL
| nonTypeName
| dotPrefix nonTypeName
| nonBraceExpression '[' expression ']'
| nonBraceExpression '[' expression ':' expression ']'
| '(' expression ')'
| '!' expression %prec PREFIX
| '~' expression %prec PREFIX
| '-' expression %prec PREFIX
| '+' expression %prec PREFIX
| typeName '.' member
| ERROR '.' member
| nonBraceExpression '.' member
| nonBraceExpression '*' expression
| nonBraceExpression '/' expression
| nonBraceExpression '%' expression
| nonBraceExpression '+' expression
| nonBraceExpression '-' expression
| nonBraceExpression '|+|' expression
| nonBraceExpression '|-|' expression
| nonBraceExpression '<<' expression
| nonBraceExpression '>>' expression
| nonBraceExpression '<=' expression
| nonBraceExpression '>=' expression
| nonBraceExpression '<' expression
| nonBraceExpression '>' expression
| nonBraceExpression '!=' expression
| nonBraceExpression '==' expression
| nonBraceExpression '&' expression
| nonBraceExpression '^' expression
| nonBraceExpression '|' expression
| nonBraceExpression '++' expression
| nonBraceExpression '&&' expression
| nonBraceExpression '||' expression
| nonBraceExpression '?' expression ':' expression
| nonBraceExpression '<' realTypeArgumentList '>' '(' argumentList ')'
| nonBraceExpression '(' argumentList ')'
| namedType '(' argumentList ')'
| '(' typeRef ')' expression
;
The nonBraceExpression is the same as expression as defined in Section 8, except it does not include any
cases that can begin with a left brace { character, to avoid syntactic ambiguity with a block statement.
There are two kinds of switch expressions allowed, described separately in the following two sub-
sections.
switch (t.apply().action_run) {
action1: // fall-through to action2:
action2: { /* body omitted */ }
action3: { /* body omitted */ } // no fall-through from action2 to action3 labels
default: { /* body omitted */ }
}
Note that the default label of the switch statement is used to match on the kind of action executed, no
matter whether there was a table hit or miss. The default label does not indicate that the table missed
and the default_action was executed.
• bit<W>
• int<W>
• enum, either with or without an underlying representation specified
• error
All switch labels must be expressions with compile-time known values, and must have a type that can
be implicitly cast to the type of the switch expression (see Section 8.9.2). Switch labels must not begin
with a left brace character {, to avoid ambiguity with a block statement.
Figure 8. Parser FSM structure.
• if there is a default label, the case with the default label is executed.
• if there is no default label, then no switch case is executed, and execution continues after the
end of the switch statement, with no side effects (except any that were caused by evaluating the
switch expression).
distinct from the states provided by the programmer and are logically outside of the parser. Figure 8
illustrates the general structure of a parser state machine.
parserTypeDeclaration
: optAnnotations PARSER name optTypeParameters
'(' parameterList ')'
;
parserDeclaration
: parserTypeDeclaration optConstructorParameters
'{' parserLocalElements parserStates '}'
;
parserLocalElements
: /* empty */
| parserLocalElements parserLocalElement
;
parserStates
: parserState
| parserStates parserState
;
For a description of optConstructorParameters, which are useful for building parameterized parsers,
see Section 14.
Unlike parser type declarations, parser declarations may not be generic—e.g., the following decla-
ration is illegal:
Hence, used in the context of a parserDeclaration the production rule parserTypeDeclaration should
not yield type parameters.
At least one state, named start, must be present in any parser. A parser may not define two states
with the same name. It is also illegal for a parser to give explicit definitions for the accept and reject
states—those states are logically distinct from the states defined by the programmer.
State declarations are described below. Preceding the parser states, a parser may also contain a
list of local elements. These can be constants, variables, or instantiations of objects that may be used
within the parser. Such objects may be instantiations of extern objects, or other parsers that may be
invoked as subroutines. However, it is illegal to instantiate a control block within a parser.
parserLocalElement
: constantDeclaration
| variableDeclaration
| valueSetDeclaration
| instantiation
;
ParserModel {
error parseError;
onPacketArrival(packet p) {
ParserModel.parseError = error.NoError;
goto start;
}
}
An architecture must specify the behavior when the accept and reject states are reached. For example,
an architecture may specify that all packets reaching the reject state are dropped without further pro-
cessing. Alternatively, it may specify that such packets are passed to the next block after the parser, with
intrinsic metadata indicating that the parser reached the reject state, along with the error recorded.
parserState
: optAnnotations STATE name
'{' parserStatements transitionStatement '}'
;
Each state has a name and a body. The body consists of a sequence of statements that describe the
processing performed when the parser transitions to that state including:
– Invoking functions (e.g., using verify to check the validity of data already parsed), and
– Invoking methods (e.g., extracting data out of packets or computing checksums) and other
parsers (see Section 12.10), and
• Conditional statements,
• Transitions to other states (discussed in Section 12.5).
The syntax for parser statements is given by the following grammar rules:
parserStatements
: /* empty */
| parserStatements parserStatement
;
parserStatement
: assignmentOrMethodCallStatement
| directApplication
| variableDeclaration
| constantDeclaration
| parserBlockStatement
| emptyStatement
| conditionalStatement
;
parserBlockStatement
: optAnnotations '{' parserStatements '}'
;
Architectures may place restrictions on the expressions and statements that can be used in a parser—
e.g., they may forbid the use of operations such as multiplication or place restrictions on the number
of local variables that may be used.
In terms of the ParserModel, the statements in a state are executed sequentially.
transitionStatement
: /* empty */
| TRANSITION stateExpression
;
stateExpression
: name ';'
| selectExpression
;
The execution of the transition statement causes stateExpression to be evaluated, and transfers control
to the resulting state.
In terms of the ParserModel, the semantics of a transition statement can be formalized as follows:
goto eval(stateExpression)
For example, this statement:
transition accept;
terminates execution of the current parser and transitions immediately to the accept state.
If the body of a state block does not end with a transition statement, the implied statement is
transition reject;
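The state-machine behavior described above can be mimicked in ordinary code. The following Python sketch is an illustrative model only and is not part of the specification: run_parser, the dictionary-of-callables encoding, and the state names are hypothetical. Each state body returns the name of the next state, or None when it has no transition statement, in which case the implied transition to reject applies.

```python
# Illustrative model (an assumption, not the spec's ParserModel):
# each state is a function that returns the next state's name, or None
# when the state body has no transition statement.
ACCEPT, REJECT = "accept", "reject"

def run_parser(states, max_steps=1000):
    """Run from 'start' until accept or reject is reached."""
    current = "start"
    for _ in range(max_steps):
        if current in (ACCEPT, REJECT):
            return current
        next_state = states[current]()
        # A state without a transition statement implies 'transition reject'.
        current = REJECT if next_state is None else next_state
    raise RuntimeError("parser did not terminate within max_steps")

states = {
    "start": lambda: "parse_payload",    # transition parse_payload;
    "parse_payload": lambda: ACCEPT,     # transition accept;
}
no_transition = {"start": lambda: None}  # body ends without a transition

print(run_parser(states))         # accept
print(run_parser(no_transition))  # reject
```

The max_steps bound is a convenience of the sketch so that a looping state graph cannot hang the model; it does not correspond to anything in the language.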
selectExpression
: SELECT '(' expressionList ')' '{' selectCaseList '}'
;
selectCaseList
: /* empty */
| selectCaseList selectCase
;
selectCase
: keysetExpression ':' name ';'
;
In a select expression, if the expressionList has type tuple<T>, then each keysetExpression must have
type set<tuple<T>>.
In terms of the ParserModel, the meaning of a select expression:
select(e) {
ks[0]: s[0];
ks[1]: s[1];
/* more labels omitted */
ks[n-2]: s[n-2];
_ : sd; // ks[n-1] is default
}
key = eval(e);
for (int i=0; i < n; i++) {
keyset = eval(ks[i]);
if (keyset.contains(key)) return s[i];
}
verify(false, error.NoMatch);
Some targets may require that all keyset expressions in a select expression be compile-time known
values. Keysets are evaluated in order, from top to bottom as implied by the pseudo-code above; the
first keyset that includes the value in the select argument provides the result state. If no label matches,
the execution triggers a runtime error with the standard error code error.NoMatch.
Note that this implies that all cases after a default or _ label are unreachable; the compiler should
emit a warning if it detects unreachable cases. This constitutes an important difference between select
expressions and the switch statements found in many programming languages since the keysets of a
select expression may “overlap”.
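The first-match rule and its consequence for overlapping keysets can be tried out with a small Python model. This is a sketch under stated assumptions: keysets are represented as Python sets, DEFAULT stands in for the default or _ label, and the function and state names are hypothetical.

```python
# Illustrative model of select: keysets are checked top to bottom and the
# first one containing the key wins. Names here are hypothetical.
class NoMatch(Exception):
    """Stands in for the runtime error error.NoMatch."""

DEFAULT = object()  # models the 'default' or '_' keyset, which matches anything

def select(key, cases):
    """cases: list of (keyset, state) pairs; keyset is a set or DEFAULT."""
    for keyset, state in cases:
        if keyset is DEFAULT or key in keyset:
            return state
    raise NoMatch("no keyset contains the key")

cases = [
    ({0x0800}, "parse_ipv4"),
    ({0x0800, 0x86DD}, "parse_ip_any"),  # overlaps the first keyset
    (DEFAULT, "other"),
    ({0x8847}, "parse_mpls"),            # unreachable: follows the default
]

print(select(0x0800, cases))  # parse_ipv4
print(select(0x86DD, cases))  # parse_ip_any
print(select(0x8847, cases))  # other
```

The key 0x0800 belongs to two keysets, yet only the first one is ever chosen, and the last case can never be reached because the default label precedes it.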
The typical way to use a select expression is to compare the value of a recently-extracted header
field against a set of constant values, as in the following example:
For example, to detect TCP reserved ports (< 1024) one could write:
select (p.tcp.port) {
16w0 &&& 16w0xFC00: well_known_port;
_: other_port;
}
The expression 16w0 &&& 16w0xFC00 describes the set of 16-bit values whose most significant six bits are
zero.
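The claim that this mask describes exactly the ports below 1024 can be checked by brute force. The Python sketch below assumes the usual reading of a mask keyset, namely that value &&& mask contains every key k for which (k & mask) equals (value & mask); the helper name mask_contains is hypothetical.

```python
# A mask keyset 'value &&& mask' contains every key k with
# (k & mask) == (value & mask). Function name is hypothetical.
def mask_contains(value, mask, key):
    return (key & mask) == (value & mask)

# 16w0 &&& 16w0xFC00: the six most significant of 16 bits must be zero,
# which is exactly the range 0..1023.
matching = [k for k in range(1 << 16) if mask_contains(0, 0xFC00, k)]
print(len(matching), matching[0], matching[-1])  # 1024 0 1023
```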
Some targets may support parser value sets (see Section 12.11). Given a type T for the type parameter
of the value set, the type of the value set is set<T>. The type of the value set must match the type of all
other keysetExpressions in the same select expression. If there is a mismatch, the compiler must raise
an error. The type of the values in the set must be one of bit<>, tuple, or struct.
For example, to allow the control plane API to specify TCP reserved ports at runtime, one could
write:
struct vsk_t {
@match(ternary)
bit<16> port;
}
value_set<vsk_t>(4) pvs;
select (p.tcp.port) {
pvs: runtime_defined_port;
_: other_port;
}
The above example allows the runtime API to populate up to 4 different keysetExpressions in the value_set.
If the value_set takes a struct as type parameter, the runtime API can use the struct field names to name
the objects in the value set. The match type of the struct field is specified with the @match annotation. If
the @match annotation is not specified on a struct field, by default it is assumed to be @match(exact). A
single non-exact field must be placed into a struct by itself, with the desired @match annotation.
12.7. verify
The verify statement provides a simple form of error handling. verify can only be invoked within a
parser; it is used syntactically as if it were a function with the following signature:
If the first argument is true, then executing the statement has no side effect. However, if the first
argument is false, it causes an immediate transition to reject, which causes immediate parsing
termination; at the same time, the parseError associated with the parser is set to the value of the second
argument.
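As an informal illustration of this behavior, the following Python sketch models verify with an exception standing in for the transition to reject. It is a model under assumptions, not the specification's pseudo-code: the class names, the parse_error attribute, and the use of strings for error codes are all hypothetical.

```python
# Illustrative model only: class and method names are hypothetical.
class Reject(Exception):
    """Raised to model an immediate transition to the reject state."""

class ParserModel:
    def __init__(self):
        self.parse_error = "NoError"  # set when a packet arrives

    def verify(self, condition, error):
        if condition:
            return                    # true: no side effect
        self.parse_error = error      # false: record the error...
        raise Reject(error)           # ...and stop parsing immediately

model = ParserModel()
model.verify(True, "InvalidIPv4Header")
print(model.parse_error)              # NoError
try:
    model.verify(False, "InvalidIPv4Header")
except Reject:
    print(model.parse_error)          # InvalidIPv4Header
```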
In terms of the ParserModel the semantics of a verify statement is given by:
extern packet_in {
void extract<T>(out T headerLvalue);
void extract<T>(out T variableSizeHeader, in bit<32> varFieldSizeBits);
T lookahead<T>();
bit<32> length(); // This method may be unavailable in some architectures
void advance(bit<32> bits);
}
To extract data from a packet represented by an argument b with type packet_in, a parser invokes the
extract methods of b. There are two variants of the extract method: a one-argument variant for ex-
tracting fixed-size headers, and a two-argument variant for extracting variable-sized headers. Because
these operations can cause runtime verification failures (see below), these methods can only be exe-
cuted within parsers.
When extracting data into a bit-string or integer, the first packet bit is extracted to the most signifi-
cant bit of the integer.
Some targets may perform cut-through packet processing, i.e., they may start processing a packet
before its length is known (i.e., before all bytes have been received). On such a target calls to the
packet_in.length() method cannot be implemented. Attempts to call this method should be flagged as
errors (either at compilation time by the compiler back-end, or when attempting to load the compiled
P4 program onto a target that does not support this method).
In terms of the ParserModel, the semantics of packet_in can be captured using the following abstract
model of packets:
packet_in {
unsigned nextBitIndex;
byte[] data;
unsigned lengthInBits;
void initialize(byte[] data) {
this.data = data;
this.nextBitIndex = 0;
this.lengthInBits = data.sizeInBytes * 8;
}
bit<32> length() { return this.lengthInBits / 8; }
}
The expression headerLvalue must evaluate to an l-value (see Section 6.6) of type header with a fixed
width. If this method executes successfully, on completion the headerLvalue is filled with data from
the packet and its validity bit is set to true. This method may fail in various ways, e.g., if there are not
enough bits left in the packet to fill the specified header.
For example, the following program fragment extracts an Ethernet header:
In terms of the ParserModel, the semantics of the single-argument extract is given in terms of the fol-
lowing pseudo-code method, using data from the packet class defined above. We use the special valid$
identifier to indicate the hidden valid bit of a header, isNext$ to indicate that the l-value was obtained
using next, and nextIndex$ to indicate the corresponding header stack properties.
void packet_in.extract<T>(out T headerLValue) {
bitsToExtract = sizeof(headerLValue);
lastBitNeeded = this.nextBitIndex + bitsToExtract;
ParserModel.verify(this.lengthInBits >= lastBitNeeded, error.PacketTooShort);
headerLValue = this.data.extractBits(this.nextBitIndex, bitsToExtract);
headerLValue.valid$ = true;
if headerLValue.isNext$ {
verify(headerLValue.nextIndex$ < headerLValue.size, error.StackOutOfBounds);
headerLValue.nextIndex$ = headerLValue.nextIndex$ + 1;
}
this.nextBitIndex += bitsToExtract;
}
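The bounds check, the most-significant-bit-first ordering, and the cursor update can be exercised with a small executable model. The Python sketch below is illustrative only: PacketIn, extract_bits, and PacketTooShort are hypothetical names, and header validity and header stacks are left out.

```python
# Illustrative model only; PacketIn and its methods are hypothetical names.
class PacketTooShort(Exception):
    """Stands in for error.PacketTooShort."""

class PacketIn:
    def __init__(self, data: bytes):
        self.data = data
        self.next_bit_index = 0
        self.length_in_bits = len(data) * 8

    def extract_bits(self, width: int) -> int:
        """Return the next `width` bits as an integer and advance the cursor."""
        last_bit_needed = self.next_bit_index + width
        if self.length_in_bits < last_bit_needed:
            raise PacketTooShort()
        value = 0
        for i in range(self.next_bit_index, last_bit_needed):
            byte = self.data[i // 8]
            bit = (byte >> (7 - i % 8)) & 1   # first packet bit is the MSB
            value = (value << 1) | bit
        self.next_bit_index = last_bit_needed
        return value

pkt = PacketIn(bytes([0x45, 0x00]))
version = pkt.extract_bits(4)
ihl = pkt.extract_bits(4)
print(version, ihl)  # 4 5
```

The byte 0x45 is 0100 0101 in binary, so the first four bits on the wire yield 4 and the next four yield 5, matching the layout of the first byte of an IPv4 header.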
The expression variableSizeHeader must be an l-value representing a header that contains exactly one varbit
field. The expression varFieldSizeBits must evaluate to a bit<32> value that indicates the number of
bits to be extracted into the unique varbit field of the header (i.e., this size is not the size of the complete
header, just the varbit field).
In terms of the ParserModel, the semantics of the two-argument extract is captured by the following
pseudo-code:
The following example shows one way to parse IPv4 options—by splitting the IPv4 header into two
separate headers:
// IPv4 header without options
header IPv4_no_options_h {
bit<4> version;
bit<4> ihl;
bit<8> diffserv;
bit<16> totalLen;
bit<16> identification;
bit<3> flags;
bit<13> fragOffset;
bit<8> ttl;
bit<8> protocol;
bit<16> hdrChecksum;
bit<32> srcAddr;
bit<32> dstAddr;
}
header IPv4_options_h {
varbit<320> options;
}
struct Parsed_headers {
// Some fields omitted
IPv4_no_options_h ipv4;
IPv4_options_h ipv4options;
}
error { InvalidIPv4Header }
parser Top(packet_in b, out Parsed_headers headers) {
// Some states omitted
state parse_ipv4 {
b.extract(headers.ipv4);
verify(headers.ipv4.ihl >= 5, error.InvalidIPv4Header);
transition select (headers.ipv4.ihl) {
5: dispatch_on_protocol;
_: parse_ipv4_options;
}
}
state parse_ipv4_options {
// use information in the ipv4 header to compute the number
// of bits to extract
b.extract(headers.ipv4options,
(bit<32>)(((bit<16>)headers.ipv4.ihl - 5) * 32));
transition dispatch_on_protocol;
}
}
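The size passed to the two-argument extract in this example follows from simple arithmetic: ihl counts 32-bit words and the fixed part of the header occupies five of them. The short Python sketch below works through the numbers; the function name is hypothetical.

```python
# Worked arithmetic for the second argument of the two-argument extract:
# ihl counts 32-bit words, and the fixed part of the header is 5 words.
def ipv4_options_bits(ihl: int) -> int:
    if not 5 <= ihl <= 15:   # ihl is bit<4>; values below 5 are rejected by verify
        raise ValueError("ihl out of range")
    return (ihl - 5) * 32

print(ipv4_options_bits(5))   # 0
print(ipv4_options_bits(6))   # 32
print(ipv4_options_bits(15))  # 320
```

The largest possible result, 320 bits, is the reason the options field is declared as varbit<320>.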
12.8.3. Lookahead
The lookahead method provided by the packet_in packet abstraction evaluates to a set of bits from the
input packet without advancing the nextBitIndex pointer. Similar to extract, it will transition to reject
and set the error if there are not enough bits in the packet. The lookahead method can be invoked as
follows:
b.lookahead<T>()
where T must be a type with fixed width. On success, lookahead returns a value of type T.
In terms of the ParserModel, the semantics of lookahead is given by the following pseudo-code:
T packet_in.lookahead<T>() {
bitsToExtract = sizeof(T);
lastBitNeeded = this.nextBitIndex + bitsToExtract;
ParserModel.verify(this.lengthInBits >= lastBitNeeded, error.PacketTooShort);
T tmp = this.data.extractBits(this.nextBitIndex, bitsToExtract);
return tmp;
}
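The key property of lookahead, that bits are read without moving the cursor, can be seen in a small Python model. This is a sketch only: peek_bits and PacketTooShort are hypothetical names, and the packet is treated as one big-endian integer so that the first packet bit is the most significant.

```python
# Illustrative model only; names are hypothetical.
class PacketTooShort(Exception):
    """Stands in for error.PacketTooShort."""

def peek_bits(data: bytes, next_bit_index: int, width: int) -> int:
    """Return `width` bits starting at next_bit_index without consuming them."""
    total_bits = len(data) * 8
    if total_bits < next_bit_index + width:
        raise PacketTooShort()
    as_int = int.from_bytes(data, "big")   # first packet bit is the MSB
    shift = total_bits - (next_bit_index + width)
    return (as_int >> shift) & ((1 << width) - 1)

data = bytes([0x05, 0x0A, 0xFF])
cursor = 0
kind = peek_bits(data, cursor, 8)    # like b.lookahead<bit<8>>()
again = peek_bits(data, cursor, 8)   # cursor did not move, same result
print(kind, again)  # 5 5
```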
The TCP options example from Section 8.18 also illustrates how lookahead can be used:
state start {
transition select(b.lookahead<bit<8>>()) {
0: parse_tcp_option_end;
1: parse_tcp_option_nop;
2: parse_tcp_option_ss;
3: parse_tcp_option_s;
5: parse_tcp_option_sack;
}
}
state parse_tcp_option_sack {
bit<8> n = b.lookahead<Tcp_option_sack_top>().length;
b.extract(vec.next.sack, (bit<32>) (8 * n - 16));
transition start;
}
One way is to extract to the underscore identifier, explicitly specifying the type of the data:
b.extract<T>(_)
Another way is to use the advance method of the packet when the number of bits to skip is known.
In terms of the ParserModel, the meaning of advance is given in pseudo-code as follows:
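As a rough executable analogue in Python, and assuming that advance amounts to a bounds check followed by a cursor move, the behavior can be sketched as follows. The function and exception names are hypothetical, and any target-specific restrictions on the argument are not modeled.

```python
# Illustrative model only; assumes advance is a bounds check followed by a
# cursor move. Names are hypothetical.
class PacketTooShort(Exception):
    """Stands in for error.PacketTooShort."""

def advance(next_bit_index: int, length_in_bits: int, bits: int) -> int:
    """Return the new cursor position after skipping `bits` bits."""
    last_bit_needed = next_bit_index + bits
    if length_in_bits < last_bit_needed:
        raise PacketTooShort()
    return last_bit_needed

cursor = advance(0, 64, 32)       # skip a 32-bit field
cursor = advance(cursor, 64, 32)  # skip the next one
print(cursor)  # 64
```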
header Mpls_h {
bit<20> label;
bit<3> tc;
bit bos;
bit<8> ttl;
}
Mpls_h[10] mpls;
The expression mpls.next represents an l-value of type Mpls_h that references an element in the mpls
stack. Initially, mpls.next refers to the first element of the stack. It is automatically advanced on each
successful call to extract. The mpls.last property refers to the element immediately preceding next if such
an element exists. Attempting to access the mpls.next element when the stack's nextIndex counter is greater
than or equal to size causes a transition to reject and sets the error to error.StackOutOfBounds. Like-
wise, attempting to access mpls.last when the nextIndex counter is equal to 0 causes a transition to
reject and sets the error to error.StackOutOfBounds.
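The interplay between next, last, and the nextIndex counter can be illustrated with a small Python model. It is a sketch under assumptions: HeaderStack, extract_next, and StackOutOfBounds are hypothetical names, headers are plain dictionaries, and an exception stands in for the transition to reject.

```python
# Illustrative model only; HeaderStack and its members are hypothetical names.
class StackOutOfBounds(Exception):
    """Stands in for error.StackOutOfBounds."""

class HeaderStack:
    def __init__(self, size: int):
        self.size = size
        self.elements = [None] * size
        self.next_index = 0

    def extract_next(self, header):
        """Model of extract(stack.next): fill the slot, then advance."""
        if self.next_index >= self.size:
            raise StackOutOfBounds()
        self.elements[self.next_index] = header
        self.next_index += 1

    @property
    def last(self):
        if self.next_index == 0:
            raise StackOutOfBounds()
        return self.elements[self.next_index - 1]

mpls = HeaderStack(3)
for label, bos in [(100, 0), (200, 1)]:
    mpls.extract_next({"label": label, "bos": bos})
print(mpls.next_index, mpls.last["label"])  # 2 200
```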
The following example shows a simplified parser for MPLS processing:
struct Pkthdr {
Ethernet_h ethernet;
Mpls_h[3] mpls;
// other headers omitted
}
parser P(packet_in b, out Pkthdr p) {
state start {
b.extract(p.ethernet);
transition select(p.ethernet.etherType) {
0x8847: parse_mpls;
0x0800: parse_ipv4;
}
}
state parse_mpls {
b.extract(p.mpls.next);
transition select(p.mpls.last.bos) {
0: parse_mpls; // This creates a loop
1: parse_ipv4;
}
}
// other states omitted
}
12.10. Sub-parsers
P4 allows parsers to invoke the services of other parsers, similar to subroutines. To invoke the services
of another parser, the sub-parser must be first instantiated; the services of an instance are invoked by
calling it using its apply method.
The following example shows a sub-parser invocation:
• The state invoking the sub-parser is split into two half-states at the parser invocation statement.
• The top half includes a transition to the sub-parser start state.
• The sub-parser's accept state is identified with the bottom half of the current state
• The sub-parser's reject state is identified with the reject state of the current parser.
Figure 9. Semantics of invoking a sub-parser: top: original program, bottom: equivalent program.
valueSetDeclaration
: optAnnotations
VALUESET '<' baseType '>' '(' expression ')' name ';'
| optAnnotations
VALUESET '<' tupleType '>' '(' expression ')' name ';'
| optAnnotations
VALUESET '<' typeName '>' '(' expression ')' name ';'
;
Parser Value Sets support a size argument to provide hints to the compiler to reserve hardware resources
to implement the value set. For example, this parser value set:
value_set<bit<16>>(4) pvs;
creates a value_set of size 4 with entries of type bit<16>.
The semantics of the size argument is similar to the size property of a table. If a value set has a
size argument with value N, a compiler should choose a data plane implementation
that is capable of storing N value set entries. See “Size property of P4 tables and parser value
sets” [P4SizeProperty] for further discussion on the implementation of parser value set size.
The value set is populated by the control-plane by methods specified in the P4Runtime specifica-
tion.
controlDeclaration
: controlTypeDeclaration optConstructorParameters
/* controlTypeDeclaration cannot contain type parameters */
'{' controlLocalDeclarations APPLY controlBody '}'
;
controlLocalDeclarations
: /* empty */
| controlLocalDeclarations controlLocalDeclaration
;
controlLocalDeclaration
: constantDeclaration
| variableDeclaration
| actionDeclaration
| tableDeclaration
| instantiation
;
controlBody
: blockStatement
;
It is illegal to instantiate a parser within a control block. For a description of the optConstructorParam-
eters, which can be used to build parameterized control blocks, see Section 14.
Unlike control type declarations, control declarations may not be generic—e.g., the following dec-
laration is illegal:
Figure 10. Actions contain code and data. The code is in the P4 program, while the data is
provided in the table entries, typically populated by the control plane. Other parameters are
bound by the data plane.
P4 does not support exceptional control-flow within a control block. The only statement which has a
non-local effect on control flow is exit, which causes execution of the enclosing control block to im-
mediately terminate. That is, there is no equivalent of the verify statement or the reject state from
parsers. Hence, all error handling must be performed explicitly by the programmer.
The rest of this section describes the core components of a control block, starting with actions.
13.1. Actions
Actions are code fragments that can read and write the data being processed. Actions may contain data
values that can be written by the control plane and read by the data plane. Actions are the main
construct by which the control plane can dynamically influence the behavior of the data plane. Figure 10
shows the abstract model of an action.
actionDeclaration
: optAnnotations ACTION name '(' parameterList ')' blockStatement
;
Syntactically actions resemble functions with no return value. Actions may be declared within a control
block; in this case they can only be used within instances of that control block.
The following example shows an action declaration:
Action parameters may not have extern types. Action parameters that have no direction (e.g., port
in the previous example) indicate “action data.” All such parameters must appear at the end of the
parameter list. When used in a match-action table (see Section 13.2.1.2), these parameters will be
provided by the table entries (e.g., as specified by the control plane, the default_action table property,
or the const entries table property).
The body of an action consists of a sequence of statements and declarations. No switch statements
are allowed within an action—the grammar permits them, but a semantic check should reject them.
Some targets may impose additional restrictions on action bodies—e.g., only allowing straight-line
Figure 11. Match-Action Unit Dataflow.
13.2. Tables
A table describes a match-action unit. The structure of a match-action unit is shown in Figure 11.
Processing a packet using a match-action table executes the following steps:
• Key construction.
• Key lookup in a lookup table (the “match” step). The result of key lookup is an “action”.
• Action execution (the “action step”) over the input data, resulting in mutations of the data.
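The three steps above can be sketched as a small Python model. It is an illustration under assumptions, not a description of any target: Table, apply, no_action, and set_port are hypothetical names, only exact matching is modeled, and packets are plain dictionaries.

```python
# Illustrative model only; Table, apply, and the action names are hypothetical.
def no_action(packet):
    return packet

class Table:
    def __init__(self, key_fields, default_action=no_action):
        self.key_fields = key_fields          # how the key is constructed
        self.entries = {}                     # lookup table, filled by the control plane
        self.default_action = default_action  # used on a miss

    def apply(self, packet):
        # 1. Key construction.
        key = tuple(packet[f] for f in self.key_fields)
        # 2. Key lookup; a miss selects the default action.
        hit = key in self.entries
        action, args = self.entries.get(key, (self.default_action, ()))
        # 3. Action execution over the input data.
        return hit, action(packet, *args)

def set_port(packet, port):
    return {**packet, "out_port": port}

fwd = Table(["dst"])
fwd.entries[("10.0.0.1",)] = (set_port, (7,))   # control-plane write

print(fwd.apply({"dst": "10.0.0.1"}))  # (True, {'dst': '10.0.0.1', 'out_port': 7})
print(fwd.apply({"dst": "10.0.0.9"}))  # (False, {'dst': '10.0.0.9'})
```

In the sketch the argument stored with an entry plays the role of action data, while the packet passed to the action corresponds to the parameters bound by the data plane.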
A table declaration introduces a table instance. To obtain multiple instances of a table, it must be
declared within a control block that is itself instantiated multiple times.
The look-up table is a finite map whose contents are manipulated asynchronously (read/write) by
the target control-plane, through a separate control-plane API (see Figure 11). Note that the term “ta-
ble” is overloaded: it can refer to the P4 table objects that appear in P4 programs, as well as the internal
look-up tables used in targets. We will use the term “match-action unit” when necessary to disam-
biguate.
Syntactically a table is defined in terms of a set of key-value properties. Some of these properties are
“standard” properties, but the set of properties can be extended by target-specific compilers as needed.
tableDeclaration
: optAnnotations TABLE name '{' tablePropertyList '}'
;
tablePropertyList
: tableProperty
| tablePropertyList tableProperty
;
tableProperty
: KEY '=' '{' keyElementList '}'
| ACTIONS '=' '{' actionList '}'
| optAnnotations CONST ENTRIES '=' '{' entriesList '}' /* immutable entries */
| optAnnotations CONST nonTableKwName '=' initializer ';'
| optAnnotations nonTableKwName '=' initializer ';'
;
nonTableKwName
: IDENTIFIER
| TYPE_IDENTIFIER
| APPLY
| STATE
| TYPE
;
• key: An expression that describes how the key used for look-up is computed.
• actions: A list of all actions that may be found in the table.
• default_action: an action to execute when the lookup in the lookup table fails to find a match for
the key used.
• size: an integer specifying the desired size of the table.
The compiler must set the default_action to NoAction (and also insert it into the list of actions) for ta-
bles that do not define the default_action property. This is consistent with the semantics given in Sec-
tion 13.2.1.3. Hence, all tables can be thought of as having a default_action property, either implicitly
or explicitly.
In addition, tables may contain architecture-specific properties (see Section 13.2.1.6).
A property marked as const cannot be changed dynamically by the control-plane. The key, actions,
and size properties are always constant, so the const keyword is not needed for these.
13.2.1. Table properties
13.2.1.1. Keys The key is a table property which specifies the data plane values that should be used
to look up an entry. A key is a list of pairs of the form (e : m), where e is an expression that describes
the data to be matched in the table, and m is a match_kind constant that describes the algorithm used to
perform the lookup (see Section 7.1.3).
keyElementList
: /* empty */
| keyElementList keyElement
;
keyElement
: expression ':' name optAnnotations ';'
;
table Fwd {
key = {
ipv4header.dstAddress : ternary;
ipv4header.version : exact;
}
// more fields omitted
}
Here the key comprises two fields from the ipv4header header: dstAddress and version. The match_kind
constants serve three purposes:
• They specify the algorithm used to match data plane values against the entries in the table at
runtime.
• They are used to synthesize the control-plane API that is used to populate the table.
• They are used by the compiler back-end to allocate resources for the implementation of the table.
match_kind {
exact,
ternary,
lpm
}
These identifiers correspond to the P4_14 match kinds with the same names. The semantics of these
match kinds is actually not needed to describe the behavior of the P4 abstract machine; how they are
used influences only the control-plane API and the implementation of the look-up table. From the
point of view of the P4 program, a look-up table is an abstract finite map that is given a key and produces
as a result either an action or a “miss” indication, as described in Section 13.2.3.
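For readers who want a concrete picture of what a control plane and a target typically mean by these three kinds, the following Python sketch gives their conventional readings. These readings are assumptions stated for illustration; as noted above, the P4 abstract machine does not depend on them, and all function names are hypothetical.

```python
# Conventional readings of the three core match kinds, stated here as
# assumptions; the P4 abstract machine itself does not depend on them.
def exact_match(key, value):
    return key == value

def ternary_match(key, value, mask):
    return (key & mask) == (value & mask)

def lpm_match(key, prefix, prefix_len, width=32):
    # Compare only the top prefix_len bits of a width-bit key.
    mask = ((1 << prefix_len) - 1) << (width - prefix_len)
    return (key & mask) == (prefix & mask)

def lpm_lookup(key, entries, width=32):
    """entries: list of (prefix, prefix_len, result); longest matching prefix wins."""
    best = None
    for prefix, prefix_len, result in entries:
        if lpm_match(key, prefix, prefix_len, width):
            if best is None or prefix_len > best[0]:
                best = (prefix_len, result)
    return None if best is None else best[1]

routes = [(0x0A000000, 8, "coarse"), (0x0A010000, 16, "fine")]
print(lpm_lookup(0x0A010203, routes))  # fine
print(lpm_lookup(0x0A020304, routes))  # coarse
print(lpm_lookup(0x0B000000, routes))  # None
```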
If a table has no key property, then it contains no look-up table, just a default action; i.e., the
associated lookup table is always the empty map.
Each key element can have an optional @name annotation which is used to synthesize the control-
plane visible name for the key field.
13.2.1.2. Actions A table must declare all possible actions that may appear within the associated
lookup table or in the default action. This is done with the actions property; the value of this property
is always an actionList:
actionList
: /* empty */
| actionList optAnnotations actionRef ';'
;
actionRef
: prefixedNonTypeName
| prefixedNonTypeName '(' argumentList ')'
;
To illustrate, recall the example Very Simple Switch program in Section 5.3:
action Drop_action() {
outCtrl.outputPort = DROP_PORT;
}
table smac {
key = { outCtrl.outputPort : exact; }
actions = {
Drop_action;
Rewrite_smac;
}
}
• The entries in the smac table may contain two different actions: Drop_action and Rewrite_smac.
• The Rewrite_smac action has one parameter, sourceMac, which in this case will be provided by the
control plane.
Each action in the list of actions for a table must have a distinct name—e.g., the following program
fragment is illegal:
action a() {}
control c() {
action a() {}
// Illegal table: two actions with the same name
table t { actions = { a; .a; } }
}
Each action parameter that has a direction (in, inout, or out) must be bound in the actions list specifi-
cation; conversely, no directionless parameters may be bound in the list. The expressions supplied as
arguments to an action are not evaluated until the action is invoked.
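The binding rule above can be sketched as follows. The actions a and b and the variable z are hypothetical names introduced only for this illustration; they are not part of the Very Simple Switch example.

```p4
// Hypothetical actions: a has only a directional parameter,
// b has one directional and one directionless parameter.
action a(in bit<32> x) { /* body omitted */ }
action b(inout bit<32> x, bit<8> data) { /* body omitted */ }

control c() {
    bit<32> z = 0;
    table t {
        actions = {
            a(5);       // binds the in parameter x to 5
            b(z);       // binds the inout parameter x to z; data is left to the control plane
            // a;       -- illegal: the in parameter x must be bound
            // b(z, 3); -- illegal: the directionless parameter data cannot be bound here
            // b();     -- illegal: the inout parameter x must be bound
        }
    }
    apply { t.apply(); }
}
```

In this sketch the expressions 5 and z are evaluated only when the corresponding action is actually invoked by the table.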
13.2.1.3. Default action The default action for a table is an action that is invoked automatically
by the match-action unit whenever the lookup table does not find a match for the supplied key.
If present, the default_action property must appear after the actions property. It may be declared as
const, indicating that it cannot be changed dynamically by the control-plane. The default action must
be one of the actions that appear in the actions list. In particular, the expressions passed as in, out, or
inout parameters must be syntactically identical to the expressions used in one of the elements of the
actions list.
For example, in the above table we could set the default action as follows (marking it also as con-
stant):
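A minimal sketch of such a declaration, assuming the smac table and the Rewrite_smac action (with its single directionless sourceMac parameter) from Section 5.3. The MAC address value is a placeholder chosen for illustration, and the fragment relies on the declarations of outCtrl, Drop_action, and Rewrite_smac from that program:

```p4
table smac {
    key = { outCtrl.outputPort : exact; }
    actions = {
        Drop_action;
        Rewrite_smac;
    }
    // Placeholder source MAC address; const prevents the control plane
    // from changing the default action at runtime.
    const default_action = Rewrite_smac(48w0xAA_BB_CC_DD_EE_FF);
}
```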
Note that the specified default action must supply arguments for the control-plane bound parameters
(i.e., the directionless parameters), since the action is synthesized at compilation time. The expressions
supplied as arguments for parameters with a direction (in, inout, or out) are evaluated when the action
is invoked while the expressions supplied as arguments for directionless parameters are evaluated at
compile time.
Continuing the example from the previous section, following are several legal and illegal specifica-
tions of default actions for the table t:
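The sketch below illustrates these rules under stated assumptions: the actions a and b, the variable z, and the bindings in the actions list are hypothetical and are declared here so that the fragment is self-contained.

```p4
action a(in bit<32> x) { /* body omitted */ }
action b(inout bit<32> x, bit<8> data) { /* body omitted */ }

control c() {
    bit<32> z = 0;
    table t {
        actions = { a(5); b(z); }

        // Legal: x is bound to z exactly as in the actions list,
        // and the directionless parameter data is supplied (8w8).
        default_action = b(z, 8w8);

        // default_action = a(5);  -- also legal: a has no directionless parameters
        // default_action = a(z);  -- illegal: x is bound to 5 in the actions list
        // default_action = b(z);  -- illegal: data is not supplied
        // default_action = b(z + 1, 3); -- illegal: x must be bound to z, as in the actions list
    }
    apply { t.apply(); }
}
```

Only one default_action property is active in the sketch, so the alternatives are shown as comments.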
If a table does not specify the default_action property and no entry matches a given packet, then the
table does not affect the packet and processing continues according to the imperative control flow of
the program.
13.2.1.4. Entries While table entries are typically installed by the control plane, tables may also
be initialized at compile-time with a set of entries. This is useful in situations where tables are used
to implement fixed algorithms: defining table entries statically enables expressing these algorithms
directly in P4, which allows the compiler to infer how the table is actually used and potentially make
better allocation decisions for targets with limited resources. Entries declared in the P4 source are
installed in the table when the program is loaded onto the target.
Table entries are defined using the following syntax:
tableProperty
: const ENTRIES '=' '{' entriesList '}' /* immutable entries */
entriesList
: entry
| entriesList entry
;
entry
: keysetExpression ':' actionRef optAnnotations ';'
;
Table entries are immutable (const)—i.e., they can only be read and cannot be changed or removed by
the control plane. It follows that tables that define entries in the P4 source are immutable. This design
choice has important ramifications for the P4 runtime since it does not have to keep track of different
types of entries in one table (mutable and immutable). Future versions of P4 may add the ability to mix
mutable and immutable entries in the same table, by declaring additional entries properties without
the const keyword.
The keysetExpression component of an entry is a tuple that must provide a field for each key in the
table keys (see Sec. 13.2.1). The table key type must match the type of the element of the set. The
actionRef component must be an action which appears in the table actions list, with all its arguments
bound.
If the runtime API requires a priority for the entries of a table—e.g. when using the P4 Runtime
API, tables with at least one ternary search key field—then the entries are matched in program order,
stopping at the first matching entry. Architectures should define the significance of entry order (if any)
for other kinds of tables.
Depending on the match_kind of the keys, key set expressions may define one or multiple entries.
The compiler will synthesize the correct number of entries to be installed in the table. Target con-
straints may further restrict the ability of synthesizing entries. For example, if the number of synthe-
sized entries exceeds the table size, the compiler implementation may choose to issue a warning or an
error, depending on target capabilities.
To illustrate, consider the following example:
header hdr {
bit<8> e;
bit<16> t;
bit<8> l;
bit<8> r;
bit<1> v;
}
struct Header_t {
hdr h;
}
struct Meta_t {}
table t_exact_ternary {
key = {
h.h.e : exact;
h.h.t : ternary;
}
actions = {
a;
a_with_control_params;
}
default_action = a;
const entries = {
(0x01, 0x1111 &&& 0xF ) : a_with_control_params(1);
(0x02, 0x1181 ) : a_with_control_params(2);
(0x03, 0x1111 &&& 0xF000) : a_with_control_params(3);
(0x04, 0x1211 &&& 0x02F0) : a_with_control_params(4);
(0x04, 0x1311 &&& 0x02F0) : a_with_control_params(5);
(0x06, _ ) : a_with_control_params(6);
_ : a;
}
}
In this example we define a set of 7 entries, all of which invoke action a_with_control_params except for
the final entry which invokes action a. Once the program is loaded, these entries are installed in the
table in the order they are enumerated in the program.
13.2.1.5. Size The size is an optional property of a table. When present, its value is always an inte-
ger compile-time known value. It is specified in units of number of table entries.
If a table has a size value specified for it with value N, it is recommended that a compiler should
choose a data plane implementation that is capable of storing N table entries. This does not guarantee
that an arbitrary set of N entries can always be inserted in such a table, only that there is some set of
N entries that can be inserted. For example, attempts to add some combinations of N entries may fail
because the compiler selected a hash table with O(1) guaranteed search time. See “Size property of P4
tables and parser value sets” [P4SizeProperty] for further discussion on some P4 table implementations
and what they are able to guarantee.
If a P4 implementation must dimension table resources at compile time, it may treat a table with no
size property as an error.
Some P4 implementations may be able to dynamically dimension table resources at run time. If
a size value is specified in the P4 program, it is recommended that such an implementation uses the
size value as the initial capacity of the table.
13.2.1.6. Additional properties A table declaration defines its essential control and data plane
interfaces—i.e., keys and actions. However, the best way to implement a table may actually depend
on the nature of the entries that will be installed at runtime (for example, tables could be dense or
sparse, could be implemented as hash-tables, associative memories, tries, etc.) In addition, some ar-
chitectures may support extra table properties whose semantics lies outside the scope of this specifi-
cation. For example, in architectures where table resources are statically allocated, programmers may
be required to define a size table property, which can be used by the compiler back-end to allocate
storage resources. However, these architecture-specific properties may not change the semantics of
table lookups, which always produce either a hit and an action or a miss—they can only change how
those results are interpreted on the state of the data plane. This restriction is needed to ensure that it is
possible to reason about the behavior of tables during compilation.
As another example, an implementation property could be used to pass additional information to
the compiler back-end. The value of this property could be an instance of an extern block chosen from
a suitable library of components. For example, the core functionality of the P4_14 table action_profile
constructs could be implemented on architectures that support this feature using a construct such as
the following:
extern ActionProfile {
ActionProfile(bit<32> size); // number of distinct actions expected
}
table t {
key = { /* body omitted */ }
size = 1024;
implementation = ActionProfile(32); // constructor invocation
}
Here the action profile might be used to optimize for the case where the table has a large number of
entries, but the actions associated with those entries are expected to range over a small number of dis-
tinct values. Introducing a layer of indirection enables sharing identical entries, which can significantly
reduce the table's storage requirements.
enum action_list(T) {
// one field for each action in the actions list of table T
}
struct apply_result(T) {
bool hit;
bool miss;
action_list(T) action_run;
}
The evaluation of the apply method sets the hit field to true and the field miss to false if a match is
found in the lookup-table; if a match is not found hit is set to false and miss to true. These bits can be
used to drive the execution of the control-flow in the control block that invoked the table:
if (ipv4_match.apply().hit) {
// there was a hit
} else {
// there was a miss
}
if (ipv4_host.apply().miss) {
ipv4_lpm.apply(); // Lookup the route only if host table missed
}
The action_run field indicates which kind of action was executed (irrespective of whether it was a hit or
a miss). It can be used in a switch statement:
switch (dmac.apply().action_run) {
Drop_action: { return; }
}
m.apply();
apply_result(m) m.apply() {
apply_result(m) result;
The behavior of the buildKey call in the pseudocode above is to evaluate each key expression in the order
they appear in the table key definition. The behavior must be the same as if the result of evaluating each
key expression is assigned to a fresh temporary variable, before starting the evaluation of the following
key expression. For example, this P4 table definition and apply call:
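The following is a sketch of such a table definition, written to be consistent with the temporaries used in the equivalent form that follows it. The body of f1, the enclosing control c, the action nop, and the values assigned to x and y are assumptions made for illustration.

```p4
// Hypothetical function with a side effect on its inout argument,
// so that the evaluation order of the key expressions is observable.
bit<8> f1(in bit<8> a, inout bit<8> b) {
    b = a + 5;
    return a >> 1;
}

control c() {
    bit<8> x;
    bit<8> y;
    action nop() { }
    table t1 {
        key = {
            y & 0x7  : exact @name("masked_y");
            f1(x, y) : exact @name("f1");   // may modify y before the next key is read
            y        : exact;
        }
        actions = { nop; }
    }
    apply {
        x = 1;
        y = 2;
        t1.apply();
    }
}
```

is equivalent in behavior to the following table definition and apply call: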
// same definition of f1, x, and y as before, so they are not repeated here
bit<8> tmp_1;
bit<8> tmp_2;
bit<8> tmp_3;
table t1 {
key = {
tmp_1 : exact @name("masked_y");
tmp_2 : exact @name("f1");
tmp_3 : exact @name("y");
}
// ... rest of table properties defined here, not relevant to example
}
apply {
// assign values to x and y here, not relevant to example
tmp_1 = y & 0x7;
tmp_2 = f1(x, y);
tmp_3 = y;
t1.apply();
}
Note that the second code example above is given in order to specify the behavior of the first one. An
implementation is free to choose any technique that achieves this behavior.
• At runtime, statements within a block are executed in the order they appear in the control block.
• Execution of the return statement causes immediate termination of the execution of the current
control block, and a return to the caller.
• Execution of the exit statement causes the immediate termination of the execution of the current
control block and of all the enclosing caller control blocks.
• Applying a table executes the corresponding match-action unit, as described above.
control Callee(inout IPv4 ipv4) { /* body omitted */ }
control Caller(inout Headers h) {
Callee() instance; // instance of callee
apply {
instance.apply(h.ipv4); // invoke control
}
}
14. Parameterization
In order to support libraries of useful P4 components, both parsers and control blocks can be addi-
tionally parameterized through the use of constructor parameters.
Consider again the parser declaration syntax:
parserDeclaration
: parserTypeDeclaration optConstructorParameters
'{' parserLocalElements parserStates '}'
;
optConstructorParameters
: /* empty */
| '(' parameterList ')'
;
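From this grammar fragment we infer that a parser declaration may have two sets of parameters: the
runtime parameters declared in the parserTypeDeclaration, and the optional constructor parameters
(optConstructorParameters).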
Constructor parameters must be directionless (i.e., they cannot be in, out, or inout) and when the
parser is instantiated, it must be possible to fully evaluate the expressions supplied for these parame-
ters at compilation time.
Consider the following example:
6: tcp;
17: tryudp;
}
}
state tryudp {
transition select(udpSupport) {
false: accept;
true : udp;
}
}
state udp {
// body omitted
}
}
When instantiating the GenericParser it is necessary to supply a value for the udpSupport parameter, as
in the following example:
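A minimal sketch of such an instantiation, assuming the GenericParser declaration above with a single bool constructor parameter udpSupport. The instance name topParser is hypothetical.

```p4
// topParser is an instance of GenericParser in which udpSupport = false,
// so the tryudp state always transitions to accept.
GenericParser(false) topParser;
```

The argument false is a compile-time known value, as required for constructor parameters.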
This feature is intended to streamline the common case where a type is instantiated exactly once. For
completeness, the behavior of directly invoking the same type more than once is defined as follows.
• Direct type invocation in different scopes will result in different local instances with different
fully-qualified control names.
• In the same scope, direct type invocation will result in a different local instance per invocation—
however, instances of the same type will share the same global name, via the @name annotation. If
the type contains controllable entities, then invoking it directly more than once in the same scope
is illegal, because it will produce multiple controllable entities with the same fully-qualified con-
trol name.
15. Deparsing
The inverse of parsing is deparsing, or packet construction. P4 does not provide a separate language
for packet deparsing; deparsing is done in a control block that has at least one parameter of type
packet_out.
For example, the following code sequence writes first an Ethernet header and then an IPv4 header
into a packet_out:
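A minimal sketch of such a deparser follows. The header layouts, the Parsed_packet struct, and the control name TopDeparser are assumptions made so that the fragment is self-contained; packet_out is taken from the core library.

```p4
#include <core.p4>

header Ethernet_h {
    bit<48> dstAddr;
    bit<48> srcAddr;
    bit<16> etherType;
}

header IPv4_h {
    bit<4>  version;
    bit<4>  ihl;
    bit<8>  diffserv;
    bit<16> totalLen;
    bit<16> identification;
    bit<3>  flags;
    bit<13> fragOffset;
    bit<8>  ttl;
    bit<8>  protocol;
    bit<16> hdrChecksum;
    bit<32> srcAddr;
    bit<32> dstAddr;
}

struct Parsed_packet {
    Ethernet_h ethernet;
    IPv4_h     ip;
}

control TopDeparser(inout Parsed_packet p, packet_out b) {
    apply {
        b.emit(p.ethernet);   // written first
        b.emit(p.ip);         // written after the Ethernet header
    }
}
```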
Emitting a header appends the header to the packet_out only if the header is valid. Emitting a header
stack will emit all elements of the stack in order of increasing indexes.
extern packet_out {
void emit<T>(in T data);
}
The emit method supports appending the data contained in a header, header stack, struct, or header
union to the output packet.
• When applied to a header, emit appends the data in the header to the packet if it is valid and
otherwise behaves like a no-op.
• When applied to a header stack, emit recursively invokes itself on each element of the stack.
• When applied to a struct or header union, emit recursively invokes itself on each field. Note, a
struct must not contain a field of type error or enum because these types cannot be serialized.
It is illegal to invoke emit on an expression whose type is a base type, enum, or error.
We can define the meaning of the emit method in pseudo-code as follows:
packet_out {
    byte[] data;
    unsigned lengthInBits;
    void initializeForWriting() {
        this.data.clear();
        this.lengthInBits = 0;
    }
    /// Append data to the packet. Type T must be a header, header
    /// stack, header union, or struct formed recursively from those types
    void emit<T>(T data) {
        if (isHeader(T)) {
            // an invalid header contributes nothing to the packet
            if (data.valid$) {
                this.data.append(data);
                this.lengthInBits += data.lengthInBits;
            }
        } else if (isHeaderStack(T)) {
            for (e : data)
                emit(e);
        } else if (isHeaderUnion(T) || isStruct(T)) {
            for (f : data.fields$)
                emit(data.f);
        }
        // Other cases for T are illegal
    }
}
Here we use the special valid$ identifier to indicate the hidden valid bit of headers and fields$ to indi-
cate the list of fields for a struct or header union. We also use standard for notation to iterate through
the elements of a stack (e : data) and list of fields for header unions and structs (f : data.fields$).
The iteration order for a struct is the order those fields appear in the type declaration.
Figure 12. Fragment of example switch architecture.
Just from these declarations, even without reading a precise description of the target, the programmer
can infer some useful information about the architecture of the described switch, as shown in Figure 12:
the Ingress.IPipe block has an input of type Ingress.IH, which is an output of the Ingress.Parser.
• Similarly, the Parser, EPipe, and Deparser are chained in the Egress package.
• The Ingress.IPipe is connected to the Egress.EPipe, because the first outputs a value of type T,
which is an input to the second. Note that the occurrences of the type variable T are instantiated
with the same type in Switch. In contrast, the Ingress type IH and the Egress type IH may
be different. To force them to be the same, we could instead declare IH and OH at the switch level:
package Switch<T,IH,OH>(Ingress<T, IH, OH> ingress, Egress<T, IH, OH> egress).
Hence, this architecture models a target switch that contains two separate channels between the ingress
and egress pipeline:
• A channel that can pass data directly via its argument of type T. On a software target with shared
memory between ingress and egress this could be implemented by passing directly a pointer; on
an architecture without shared memory presumably the compiler will need to synthesize auto-
matically serialization code.
• A channel that can pass data indirectly using a parser and deparser that serializes data into a
packet and back.
The latter declaration is incorrect because the parser P requires T to be bit<32>, while Pipe2 requires T
to be bit<8>.
Figure 13. A packet filter target model. The parser computes a Boolean value, which is used to
decide whether the packet is dropped.
The user can also explicitly specify values for the type variables (otherwise the compiler has to infer
values for these type variables):
• static evaluation: at compile time the P4 program is analyzed and all stateful blocks are instanti-
ated.
• dynamic evaluation: at runtime each P4 functional block is executed to completion, in isolation,
when it receives control from the architecture.
• Integer literals, Boolean literals, and string literals.
• Identifiers declared in an error, enum, or match_kind declaration.
• The default identifier.
• The size field of a value with type header stack.
• The _ identifier when used as a select expression label.
• Identifiers that represent declared types, actions, tables, parsers, controls, or packages.
• List expression where all components are compile-time known values.
• Structure initializer expressions, where all fields are compile-time known values.
• Instances constructed by instance declarations (Section 10.3) and constructor invocations.
• The following expressions (+, -, *, / , %, cast, !, &, |, &&, ||, << , >> , ~ , >, <, ==, !=, <=, >=, ++, [:])
when their operands are all compile-time known values.
• Identifiers declared as constants using the const keyword.
• Expressions of the form e.minSizeInBits() and e.minSizeInBytes().
// architecture declaration
parser P(/* parameters omitted */);
control C(/* parameters omitted */);
control D(/* parameters omitted */);
// user code
Checksum16() ck16; // checksum unit instance
Figure 14. Evaluation result.
Switch(TopParser(ck16),
Pipe(),
TopDeparser(ck16)) main;
5. The result of the program evaluation is the value of the main variable, which is the above instance
of the Switch package.
Figure 14 shows the result of the evaluation in a graphical form. The result is always a graph of in-
stances. There is only one instance of Checksum16, called ck16, shared between the TopParser and TopDe-
parser. Whether this is possible is architecture-dependent. Specific target compilers may require dis-
tinct checksum units to be used in distinct blocks.
17.3. Control plane names
Every controllable entity exposed in a P4 program must be assigned a unique, fully-qualified name,
which the control plane may use to interact with that entity. The following entities are controllable.
• tables
• keys
• actions
• extern instances
A fully qualified name consists of the local name of a controllable entity prepended with the fully qual-
ified name of its enclosing namespace. Hence, the following program constructs, which enclose con-
trollable entities, must themselves have unique, fully-qualified names.
• control instances
• parser instances
Evaluation may create multiple instances from one type, each of which must have a unique, fully-
qualified name.
17.3.1.1. Tables For each table construct, its syntactic name becomes the local name of the table.
For example:
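A minimal sketch, using a hypothetical control c and table t and following this document's convention of omitting bodies:

```p4
control c() {
    table t { /* body omitted */ }   // the local name of this table is t
    apply { t.apply(); }
}
```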
17.3.1.2. Keys Syntactically, table keys are expressions. For simple expressions, the local key name
can be generated from the expression itself. In the following example, the table t has keys with names
data.f1 and hdrs[3].f2.
table t {
key = {
data.f1 : exact;
hdrs[3].f2 : exact;
}
actions = { /* body omitted */ }
}
The following kinds of expressions have local names derived from their syntactic names:
Kind                     Example            Name
The isValid() method.    h.isValid()        "h.isValid()"
Array accesses.          header_stack[1]    "header_stack[1]"
Constants.               1                  "1"
Field projections.       data.f1            "data.f1"
Slices.                  f1[3:0]            "f1[3:0]"
All other kinds of expressions must be annotated with a @name annotation (Section 18.3.3), as in the
following example.
table t {
key = {
data.f1 + 1 : exact @name("f1_mask");
}
actions = { /* body omitted */ }
}
Here, the @name("f1_mask") annotation assigns the local name "f1_mask" to this key.
17.3.1.3. Actions For each action construct, its syntactic name is the local name of the action. For
example:
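A minimal sketch, using a hypothetical control c and action a and following this document's convention of omitting bodies:

```p4
control c() {
    action a(/* parameters omitted */) { /* body omitted */ }   // the local name of this action is a
    apply { /* body omitted */ }
}
```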
17.3.1.4. Instances The local names of extern, parser, and control instances are derived based on
how the instance is used. If the instance is bound to a name, that name becomes its local control plane
name. For example, if control C is declared as,
C() c_inst;
C(E()) c_inst;
Note that in this example, the architecture will supply an instance of the extern when it applies the
instance of MyC passed to the Arch package. The fully-qualified name of that instance is main.c.e2.
Next, consider a larger example that demonstrates name generation when there are multiple in-
stances.
control Callee() {
table t { /* body omitted */ }
apply { t.apply(); }
}
control Caller() {
Callee() c1;
Callee() c2;
apply {
c1.apply();
c2.apply();
}
}
control Simple();
package Top(Simple s);
Top(Caller()) main;
The compile-time evaluation of this program produces the structure in Figure 15. Notice that there are
two instances of the table t. These instances must both be exposed to the control plane. To name
an object in this hierarchy, one uses a path composed of the names of containing instances. In this
case, the two tables have names s.c1.t and s.c2.t, where s is the name of the argument to the package
instantiation, which is derived from the name of its corresponding formal parameter.
Figure 15. Evaluating a program that has several instantiations of the same component.
• The @name annotation may be used to change the local name of a controllable entity.
Programs that yield the same fully-qualified name for two different controllable entities are invalid.
17.3.3. Recommendations
The control plane may refer to a controllable entity by a postfix of its fully qualified name when it is
unambiguous in the context in which it is used. Consider the following example.
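A sketch of the kind of program this recommendation has in mind is shown below. The control c, its action a and table t, and the instance name c_inst follow the names used in the discussion; the enclosing control top is an assumption added so that the instantiation appears in a legal position.

```p4
control c() {
    action a(/* parameters omitted */) { /* body omitted */ }
    table t {
        key = { /* body omitted */ }
        actions = { a; }
    }
    apply { t.apply(); }
}

control top() {
    c() c_inst;   // the names of a and t end in c_inst.a and c_inst.t
    apply { c_inst.apply(); }
}
```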
Control plane software may refer to action c_inst.a as a when inserting rules into table c_inst.t, be-
cause it is clear from the definition of the table which action a refers to.
Not all unambiguous postfix shortcuts are recommended. For instance, consider the first example
in Section 17.3. One might be tempted to refer to s.c1 simply as c1, as no other instance named c1
appears in the program. However, this leads to a brittle program since future modifications can never
introduce an instance named c1, or include libraries of P4 code that contain instances with that name.
17.4.1. Concurrency model
A typical packet processing system needs to execute multiple simultaneous logical “threads.” At the
very least there is a thread executing the control plane, which can modify the contents of the tables.
Architecture specifications should describe in detail the interactions between the control-plane and
the data-plane. The data plane can exchange information with the control plane through extern func-
tion and method calls. Moreover, high-throughput packet-processing systems may be processing mul-
tiple packets simultaneously, e.g., in a pipelined fashion, or concurrently parsing a first packet while
performing match-action operations on a second packet. This section specifies the semantics of P4
programs with respect to such concurrent executions.
Each top-level parser or control block is executed as a separate thread when invoked by the archi-
tecture. All the parameters of the block and all local variables are thread-local—i.e., each thread has a
private copy of these resources. This applies to the packet_in and packet_out parameters of parsers and
deparsers.
As long as a P4 block uses only thread-local storage (e.g., metadata, packet headers, local vari-
ables), its behavior in the presence of concurrency is identical with the behavior in isolation, since any
interleaving of statements from different threads must produce the same output.
In contrast, extern blocks instantiated by a P4 program are global, shared across all threads. If ex-
tern blocks mediate access to state (e.g., counters, registers)—i.e., the methods of the extern block read
and write state, these stateful operations are subject to data races. P4 mandates that execution of a
method call on an extern instance is atomic.
To allow users to express atomic execution of larger code blocks, P4 provides an @atomic annotation,
which can be applied to block statements, parser states, control blocks, or whole parsers.
Consider the following example:
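The sketch below shows one way such a program could look. The Register extern is declared with its body omitted, and the Meta struct, the flow_ipg field, and the FLOWLET_INACTIVE_TIMEOUT constant are hypothetical names introduced for illustration.

```p4
const bit<32> FLOWLET_INACTIVE_TIMEOUT = 100;   // hypothetical threshold

struct Meta {
    bit<32> flow_ipg;   // hypothetical inter-packet gap of the current flow
}

extern Register {
    Register();
    /* methods omitted */
}

control Ingress(inout Meta ingress_metadata) {
    Register() r;
    table flowlet     { /* body omitted: an action reads the state of r */ }
    table new_flowlet { /* body omitted: an action writes the state of r */ }
    apply {
        // The read in flowlet and the write in new_flowlet
        // must appear as one indivisible step to other packets.
        @atomic {
            flowlet.apply();
            if (ingress_metadata.flow_ipg > FLOWLET_INACTIVE_TIMEOUT)
                new_flowlet.apply();
        }
    }
}
```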
This program accesses an extern object r of type Register in actions invoked from tables flowlet (read-
ing) and new_flowlet (writing). Without the @atomic annotation these two operations would not execute
atomically: a second packet may read the state of r before the first packet had a chance to update it.
Note that even within an action definition, if the action does something like reading a register, mod-
ifying it, and writing it back, in a way that only the modified value should be visible to the next packet,
then, to guarantee correct execution in all cases, that portion of the action definition should be en-
closed within a block annotated with @atomic.
A compiler backend must reject a program containing @atomic blocks if it cannot implement the
atomic execution of the instruction sequence. In such cases, the compiler should provide reasonable
diagnostics.
18. Annotations
Annotations are similar to C# attributes and Java annotations. They are a simple mechanism for ex-
tending the P4 language to some limited degree without changing the grammar. To some degree they
subsume C #pragmas. Annotations are attached to types, fields, variables, etc. using the @ syntax (as
shown explicitly in the P4 grammar). Unstructured annotations, or just “annotations,” have an optional
body; structured annotations have a mandatory body, containing at least a pair of square brackets [].
optAnnotations
: /* empty */
| annotations
;
annotations
: annotation
| annotations annotation
;
annotation
: '@' name
| '@' name '(' annotationBody ')'
| '@' name '[' structuredAnnotationBody ']'
;
Structured annotations and unstructured annotations on any one element must not use the same name.
Thus, a given name can only be applied to one type of annotation or the other for any one element. An
annotation used on one element does not affect the annotation on another because they have different
scope.
This is legal:
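A minimal sketch of a legal combination, using the hypothetical annotation name my_anno on two different tables:

```p4
@my_anno(1) table T { /* body omitted */ }
@my_anno[2] table U { /* body omitted */ } // OK - different annotation types on different elements
```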
This is illegal:
@my_anno(1)
@my_anno[2] table U { /* body omitted */ } // Error - changed type of anno on an element
Multiple unstructured annotations using the same name can appear on a given element; they are cu-
mulative. Each one will be bound to that element. In contrast, only one structured annotation using a
given name may appear on an element; multiple uses of the same name will produce an error.
This is legal:
@my_anno(1)
@my_anno(2) table U { /* body omitted */ } // OK - unstructured annos accumulate
This is illegal:
@my_anno[1]
@my_anno[2] table U { /* body omitted */ } // Error - reused the same structured anno on an element
annotationBody
: /* empty */
| annotationBody '(' annotationBody ')'
| annotationBody annotationToken
;
Unstructured annotations may impose additional structure on their bodies, and are not confined to
the P4 language. For example, the P4Runtime specification defines a @pkginfo annotation that expects
key-value pairs.
structuredAnnotationBody
: expressionList
| kvList
;
...
expressionList
: /* empty */
| expression
| expressionList ',' expression
;
...
kvList
: kvPair
| kvList ',' kvPair
;
kvPair
: name '=' expression
;
@Empty[]
table t {
/* body omitted */
}
kvList of Strings
@MixedKV[label="text", my_bool=true, int_val=2*3]
table t {
/* body omitted */
}
@DupAnno[k1=4]
@DupAnno[k2=5] // illegal duplicate name
table t {
/* body omitted */
}
@MixAnno("Anything")
@MixAnno[k2=5] // illegal use in both annotation types
table t {
/* body omitted */
}
list will grow. We encourage custom architectures to define annotations starting with a manufacturer
prefix: e.g., an organization named X would use annotations named like @X_annotation.
• @tableonly: actions with this annotation can only appear within the table, and never as default
action.
• @defaultonly: actions with this annotation can only appear in the default action, and never in the
table.
table t {
actions = {
a; // can appear anywhere
@tableonly b; // can only appear in the table
@defaultonly c; // can only appear in the default action
}
/* body omitted */
}
The @hidden annotation hides a controllable entity, e.g. a table, key, action, or extern, from the control
plane. This effectively removes its fully-qualified name (Section 17.3). This annotation does not have
a body.
18.3.3.1. Restrictions Each element may be annotated with at most one @name or @hidden anno-
tation, and each control plane name must refer to at most one controllable entity. This is of special
concern when using an absolute @name annotation: if a type containing a @name annotation with an ab-
solute pathname (i.e., one starting with a dot) is instantiated more than once, it will result in the same
name referring to two controllable entities.
control noargs();
package top(noargs c1, noargs c2);
control c() {
@name(".foo.bar") table t { /* body omitted */ }
apply { /* body omitted */ }
}
top(c(), c()) main;
Without the @name annotation, this program would produce two controllable entities with fully-qualified
names main.c1.t and main.c2.t. However, the @name(".foo.bar") annotation renames table t in both
instances to foo.bar, resulting in one name that refers to two controllable entities, which is illegal.
• @pure - Describes a function that depends solely on its in parameter values, and has no effect
other than returning a value, and copy-out behavior on its out and inout parameters. No hidden
state is recorded between calls, and its value does not depend on any hidden state that may be
changed by other calls. An example is a hash function that computes a deterministic hash of its
arguments, and its return value does not depend upon any control-plane writable seed or initial-
ization vector value. A @pure function whose results are unused may be safely eliminated with no
adverse effects, and multiple calls with identical arguments may be combined into a single call
(subject to the limits imposed by copy-out behavior of out and inout parameters). @pure func-
tions may also be reordered with respect to other computations that are not data dependent.
• @noSideEffects - Weaker than @pure and describes a function that does not change any hidden
state, but may depend on hidden state. One example is a hash function that computes a deter-
ministic hash of its arguments, plus some internal state that can be modified via control plane
API calls such as a seed or initialization vector. Another example is a read of one element of a
register array extern object. Such a function may be dead code eliminated, and may be reordered
or combined with other @noSideEffects or @pure calls (subject to the limits imposed by copy-out
behavior of out and inout parameters), but not with other function calls that may have side effects
that affect the function.
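As an illustration of the optimizations these two annotations permit, the Python sketch below removes dead calls and combines duplicate @pure calls. It is a simplified model, not a description of any actual compiler pass: the Call record and the simplify function are hypothetical, arguments are assumed to be compile-time constants, and all parameters are assumed to have direction in (so copy-out behavior is ignored). Duplicate @noSideEffects calls are conservatively left alone, because combining them is only valid when no intervening call changes the hidden state they read.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Call:
    func: str          # name of the called function (hypothetical)
    args: tuple        # argument values, assumed to be compile-time constants
    kind: str          # "pure", "noSideEffects", or "effectful"
    result_used: bool  # whether the returned value is consumed

def simplify(calls):
    """Drop dead pure/noSideEffects calls and merge repeated pure calls."""
    kept, seen_pure = [], set()
    for call in calls:
        if call.kind in ("pure", "noSideEffects") and not call.result_used:
            continue                      # dead call: removing it changes nothing
        if call.kind == "pure":
            key = (call.func, call.args)
            if key in seen_pure:
                continue                  # same inputs: reuse the earlier result
            seen_pure.add(key)
        kept.append(call)
    return kept

calls = [
    Call("hash", (1, 2), "pure", True),
    Call("hash", (1, 2), "pure", True),             # duplicate, can be combined
    Call("read_reg", (0,), "noSideEffects", False), # unused, can be eliminated
    Call("read_reg", (0,), "noSideEffects", True),
    Call("write_reg", (0, 5), "effectful", False),  # must always be kept
]
simplified = simplify(calls)
```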
• Errors when annotations are used incorrectly (e.g., an annotation expecting a parameter but used
without arguments, or with arguments of the wrong type).
• Warnings for unknown annotations.
A. Appendix: Revision History
A.2. Summary of changes made in version 1.2.1
• Added structure-value expressions (Section 8.12).
• Added support for default values (Section 7.3).
• Added support for concatenating signed strings (Section 8.6.1).
• Added key-value and list-structured annotations (Section 18).
• Added @pure and @noSideEffects annotations (Section 18.3.6).
• Added @noWarn annotation (Section 18.3.8).
• Generalized typing for masks to allow serializable enums (Section 8.13.3).
• Restricted the right operands of bit shifts involving infinite-precision integers to be constant and
positive (Section 8.7).
• Clarified copy-out behavior for return (Section 11.4) and exit (Section 11.5) statements.
• Clarified semantics of invalid header stacks (Section 8.22).
• Clarified initialization semantics (Section 6.6 and 6.7), especially for headers and local variables.
• Clarified evaluation order for table keys (Section 13.2.3).
• Fixed grammar to clarify parsing of right shift operator (>>), allow empty statements in parser
(Section 12.4), and eliminate annotations on const entries (Section 13.2.1.4).
– value_set objects for control-plane programmable select labels.
C. Appendix: P4 reserved annotations
The following table shows all P4 reserved annotations.
Annotation     Purpose                                             See Section
atomic         specify atomic execution                            17.4.1
defaultonly    action can only appear in the default action        18.3.2
hidden         hides a controllable entity from the control plane  17.3.2
match          specify match_kind of a field in a value_set        18.3.5
name           assign local control-plane name                     17.3.2
optional       parameter is optional                               18.3.1
tableonly      action cannot be a default_action                   18.3.2
deprecated     Construct has been deprecated                       18.3.7
pure           pure function                                       18.3.6
noSideEffects  function with no side effects                       18.3.6
noWarn         Has a string argument; inhibits compiler warnings   18.3.8
/// Standard error codes. New error codes can be declared by users.
error {
    NoError,               /// No error.
    PacketTooShort,        /// Not enough bits in packet for 'extract'.
    NoMatch,               /// 'select' expression has no matches.
    StackOutOfBounds,      /// Reference to invalid element of a header stack.
    HeaderTooShort,        /// Extracting too many bits into a varbit field.
    ParserTimeout,         /// Parser execution time limit exceeded.
    ParserInvalidArgument  /// Parser operation was called with a value
                           /// not supported by the implementation.
}
extern packet_in {
    /// Read a header from the packet into a fixed-sized header @hdr
    /// and advance the cursor.
    /// May trigger error PacketTooShort or StackOutOfBounds.
    /// @T must be a fixed-size header type
    void extract<T>(out T hdr);
    /// Read bits from the packet into a variable-sized header @variableSizeHeader
    /// and advance the cursor.
    /// @T must be a header containing exactly 1 varbit field.
    /// May trigger errors PacketTooShort, StackOutOfBounds, or HeaderTooShort.
    void extract<T>(out T variableSizeHeader,
                    in bit<32> variableFieldSizeInBits);
    /// Read bits from the packet without advancing the cursor.
    /// @returns: the bits read from the packet.
    /// T may be an arbitrary fixed-size type.
    T lookahead<T>();
    /// Advance the packet cursor by the specified number of bits.
    void advance(in bit<32> sizeInBits);
    /// @return packet length in bytes. This method may be unavailable on
    /// some target architectures.
    bit<32> length();
}
extern packet_out {
    /// Write @data into the output packet, skipping invalid headers
    /// and advancing the cursor
    /// @T can be a header type, a header stack, a header_union, or a struct
    /// containing fields with such types.
    void emit<T>(in T data);
}
action NoAction() {}
/// Standard match kinds for table key fields.
/// Some architectures may not support all these match kinds.
/// Architectures can declare additional match kinds.
match_kind {
    /// Match bits exactly.
    exact,
    /// Ternary match, using a mask.
    ternary,
    /// Longest-prefix match.
    lpm
}
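The cursor behavior documented for packet_in can be pictured with a small Python model. This is a sketch under stated assumptions rather than a reference implementation: the class PacketIn and the exception PacketTooShort are hypothetical stand-ins, the methods take a bit width instead of a header type, and it is assumed here that lookahead and advance also report PacketTooShort when too few bits remain.

```python
class PacketTooShort(Exception):
    """Stands in for error.PacketTooShort in this model."""

class PacketIn:
    """Toy model of a packet with a bit-granular read cursor."""

    def __init__(self, data: bytes):
        self._bits = "".join(f"{byte:08b}" for byte in data)
        self._cursor = 0

    def _require(self, width):
        # Fail if fewer than `width` bits remain after the cursor.
        if self._cursor + width > len(self._bits):
            raise PacketTooShort(f"need {width} bits past offset {self._cursor}")

    def lookahead(self, width):
        """Read `width` bits without moving the cursor."""
        self._require(width)
        chunk = self._bits[self._cursor:self._cursor + width]
        return int(chunk, 2) if chunk else 0

    def extract(self, width):
        """Read `width` bits and advance the cursor past them."""
        value = self.lookahead(width)
        self._cursor += width
        return value

    def advance(self, size_in_bits):
        """Skip `size_in_bits` bits."""
        self._require(size_in_bits)
        self._cursor += size_in_bits

    def length(self):
        """Packet length in bytes."""
        return len(self._bits) // 8

# First four bytes of an IPv4 header: version/IHL, TOS, total length.
pkt = PacketIn(bytes([0x45, 0x00, 0x00, 0x73]))
version = pkt.extract(4)
peeked_ihl = pkt.lookahead(4)   # cursor stays put
ihl = pkt.extract(4)            # same bits, cursor now moves
pkt.advance(8)                  # skip the TOS byte
total_len = pkt.extract(16)
```

Because lookahead leaves the cursor untouched, peeked_ihl and ihl read the same four bits; any further extract on this four-byte packet raises PacketTooShort.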
E. Appendix: Checksums
There are no built-in constructs in P4_16 for manipulating packet checksums. We expect that checksum
operations can be expressed as extern library objects that are provided in target-specific libraries. The
standard architecture library should provide such checksum units.
For example, one could provide an incremental checksum unit Checksum16 (also described in the
VSS example in Section 5.2.4) for computing 16-bit one's complement checksums using an extern object
with a signature such as:
extern Checksum16 {
    Checksum16();              // constructor
    void clear();              // prepare unit for computation
    void update<T>(in T data); // add data to checksum
    void remove<T>(in T data); // remove data from existing checksum
    bit<16> get();             // get the checksum for the data added since last clear
}
h.ipv4.hdrChecksum = 16w0;
ck16.clear();
ck16.update(h.ipv4);
h.ipv4.hdrChecksum = ck16.get();
Moreover, some switch architectures do not perform checksum verification, but only update checksums
incrementally to reflect packet modifications. This can be achieved as well, as the following P4
program fragment illustrates:
ck16.clear();
ck16.update(h.ipv4.hdrChecksum); // original checksum
ck16.remove( { h.ipv4.ttl, h.ipv4.proto } );
h.ipv4.ttl = h.ipv4.ttl - 1;
ck16.update( { h.ipv4.ttl, h.ipv4.proto } );
h.ipv4.hdrChecksum = ck16.get();
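The arithmetic behind such an incremental update can be checked independently of any target. The Python sketch below is not a model of the Checksum16 extern, whose exact semantics are defined by the library that provides it. It only applies the incremental update formula of RFC 1624 (equation 3, HC' = ~(~HC + ~m + m')) and compares the result with a full recomputation. The function names and the sample header words are chosen here for illustration.

```python
def fold16(total):
    """Add carries back in (end-around carry) until the value fits in 16 bits."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total

def checksum16(words):
    """One's complement of the one's complement sum of the 16-bit words."""
    return ~fold16(sum(words)) & 0xFFFF

def incremental_update(old_checksum, old_word, new_word):
    """RFC 1624, eqn. 3: HC' = ~(~HC + ~m + m')."""
    total = (~old_checksum & 0xFFFF) + (~old_word & 0xFFFF) + new_word
    return ~fold16(total) & 0xFFFF

# Sample IPv4 header as 16-bit words, with the checksum word left out.
# Word 4 holds ttl (high byte) and protocol (low byte).
header = [0x4500, 0x0073, 0x0000, 0x4000, 0x4011,
          0xC0A8, 0x0001, 0xC0A8, 0x00C7]
old_checksum = checksum16(header)

ttl_proto_old = header[4]
ttl_proto_new = ttl_proto_old - 0x0100      # decrement ttl by one
header[4] = ttl_proto_new

recomputed = checksum16(header)             # full pass over the header
patched = incremental_update(old_checksum, ttl_proto_old, ttl_proto_new)
```

Here patched and recomputed agree, while the incremental path touches only the old checksum and the one modified word.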
• Controls are not allowed to call parsers, and vice versa, so there is no use in passing one type to
the other in constructor parameters or run-time parameters.
• At run time, after a control is called, and before that call is complete, there can be no recursive
calls between controls, nor from a control to itself. Similarly for parsers. There can be loops
among states within a single parser.
• Externs are not allowed to call parsers or controls, so there is no use in passing objects of those
types to them.
• Tables are always instantiated directly in their enclosing control, and cannot be instantiated at
the top level. There is no syntax for specifying parameters that are tables. Tables are only intended
to be used from within the control where they are defined.
• Value-sets can be instantiated in an enclosing parser or at the top level. There is no syntax for
specifying parameters that are value-sets. Value-sets can be shared between the parsers as long
as they are in the scope.
A note on recursion: It is expected that some architectures will define capabilities for recirculating a
packet to be processed again as if it were a newly arriving packet, or to make “clones” of packets that
are then processed by parsers and/or control blocks that the original packet has already completed.
This does not change the notes above on recursion that apply while a parser or control is executing.
The first table lists restrictions on what types can be passed as constructor parameters to other
types.
             can be a constructor parameter for this type
This type    package  parser  control  extern
package      yes      no      no       no
parser       yes      yes     no       no
control      yes      no      yes      no
extern       yes      yes     yes      yes
function     no       no      no       no
table        no       no      no       no
value-set    no       no      no       no
value types  yes      yes     yes      yes
The next table lists restrictions on where one may perform instantiations (see Section 10.3) of different
types. The answer for package is always “no” because there is no “inside a package” where instantiations
can be written in P4_16. One can definitely make constructor calls and use instances of stateful types as
parameters when instantiating a package, and restrictions on those types are in the table above.
For externs, one can only specify their interface in P4_16, not their implementation. Thus there is no
place to instantiate objects within an extern.
You may declare variables and constants of any of the value types within a parser, control, or func-
tion (see Section 10.2 for more details). Declaring a variable or constant is not the same as instantiation,
hence the answer “N/A” (for not applicable) in those table entries. Variables may not be declared at
the top level of your program, but constants may.
             can be instantiated in this place
This type    top level  package  parser  control  extern  function
package      yes        no       no      no       no      no
parser       no         no       yes     no       no      no
control      no         no       no      yes      no      no
extern       yes        no       yes     yes      no      no
function     yes        no       no      no       no      no
table        no         no       no      yes      no      no
value-set    yes        no       yes     no       no      no
value types  N/A        N/A      N/A     N/A      N/A     N/A
The next table lists restrictions on what types can be passed as run-time parameters to other callable
things that have run-time parameters: parsers, controls, extern methods, actions, and functions.
             can be a run-time parameter to this callable thing
This type    parser  control  method  action  function
package      no      no       no      no      no
parser       no      no       no      no      no
control      no      no       no      no      no
extern       yes     yes      yes     no      no
table        no      no       no      no      no
value-set    no      no       no      no      no
action       no      no       no      no      no
function     no      no       no      no      no
value types  yes     yes      yes     yes     yes
Extern method calls may only return a value that is a value type, or no value at all (specified by a return
type of void).
The next table lists restrictions on what kinds of calls can be made from which places in a P4 pro-
gram. Calling a parser, control, or table means invoking its apply() method. Calling a value-set means
using it in a select expression. The row for extern describes where extern method calls can be made
from.
One way that an extern can be called from the top level of a parser or control is in an initializer
expression for a declared variable, e.g. bit<32> x = rand.get();.
             can be called at run time from this place in a P4 program
                     control  parser or
             parser  apply    control
This type    state   block    top level  action  extern  function
package      N/A     N/A      N/A        N/A     N/A     N/A
parser       yes     no       no         no      no      no
control      no      yes      no         no      no      no
extern       yes     yes      yes        yes     no      no
table        no      yes      no         no      no      no
value-set    yes     no       no         no      no      no
action       no      yes      no         yes     no      no
function     yes     yes      no         yes     no      yes
value types  N/A     N/A      N/A        N/A     N/A     N/A
There may not be any recursion in calls, neither by a thing calling itself directly, nor mutual recursion.
An extern can never cause any other type of P4 program object to be called. See Section 6.7.1.
Actions may be called directly from a control apply block.
Note that while the extern row shows that extern methods can be called from many places, partic-
ular externs may have additional restrictions not listed in this table. Any such restrictions should be
documented in the description for each extern, as part of the documentation for the architecture that
defines the extern.
In many cases, the restriction will be “from a parser state only” or “from a control apply block or
action only”, but it may be even more restrictive, e.g. only from a particular kind of control block in-
stantiated in a particular role in an architecture.
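The call table and the no-recursion rule of this appendix can be encoded as data and checked mechanically. The Python sketch below is one possible encoding, given only as an illustration: the names PLACES, CALL_TABLE, may_call, and has_recursion are hypothetical, and the package and value types rows (all N/A) are left out.

```python
PLACES = ("parser state", "control apply block", "parser or control top level",
          "action", "extern", "function")

# One row per callee kind, one column per calling place, as in the table above.
CALL_TABLE = {
    "parser":    (True,  False, False, False, False, False),
    "control":   (False, True,  False, False, False, False),
    "extern":    (True,  True,  True,  True,  False, False),
    "table":     (False, True,  False, False, False, False),
    "value-set": (True,  False, False, False, False, False),
    "action":    (False, True,  False, True,  False, False),
    "function":  (True,  True,  False, True,  False, True),
}

def may_call(callee_kind, place):
    """Look up whether `callee_kind` can be called from `place`."""
    return CALL_TABLE[callee_kind][PLACES.index(place)]

def has_recursion(call_graph):
    """Detect direct or mutual recursion with a depth-first search.

    call_graph maps each caller name to an iterable of callee names.
    """
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True               # reached a node already on the current path
        visiting.add(node)
        cyclic = any(visit(callee) for callee in call_graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return cyclic

    return any(visit(node) for node in call_graph)

legal_graph = {"ingress": ["t1", "drop"], "t1": ["drop"]}
mutual_graph = {"f": ["g"], "g": ["f"]}
```

A checker built this way would reject mutual_graph, and would also reject any call site for which may_call returns False.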
G. Appendix: Open Issues
There are a number of open issues that are currently under discussion in the P4 design working group.
A brief summary of these issues is highlighted in this section. We seek input on these issues from the
community, and encourage experimenting with different implementations in the compiler before con-
verging on the specification.
Here, the value being scrutinized is given by a tuple (e1,/* parameters omitted */,en), and the patterns
are given by expressions that denote sets of values. The value matches a branch if it is an element of the
set denoted by the pattern. Unlike C and C++, there is no break statement so control “falls through” to
the next case only when there is no statement associated with the case label.
This design is intended to capture the standard semantics of switch statements as well as a common
idiom in P4 parsers where they are used to control transitions to different parser states depending on
the values of one or more previously-parsed values. Using switch statements, we can also generalize
the design for parsers, eliminating select and lifting most restrictions on which kinds of statements may
appear in a state. In particular, we allow conditional statements and select statements, which may be
nested arbitrarily. This language can be translated into more restricted versions, where the body of each
state comprises a sequence of variable declarations, assignments, and method invocations followed
by a single transition statement, by introducing new states.
We also generalize the design for processing of table hit/miss and actions in control blocks, by gen-
erating implicit types for actions and results.
The counter-argument to this proposal is that the semantics of select in the parser is sufficiently
distinct from the switch statement, and moreover these are constructs that network programmers are
already familiar with, and they are typically mapped very efficiently onto a variety of targets.
above. Given the concern for performance, we propose to define compiler flags and/or pragmas that
can override the safe behavior. However, our expectation is that programmers should be guided toward
writing safe programs, and encouraged to think carefully before opting out of the safe behavior.
Since the sizes of header stacks are always known statically (at compile time), the compiler could
transform the foreach statement into replicated code with explicit index references. This has the
advantage of allowing the code to be written without regard to a parameterized header stack length.
Since the compiler can statically determine the number of operations that would result from the
foreach, it can also reject a program if the result requires more action resources than are available, or
can split the action code up to fit available resources as needed.
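The unrolling described above amounts to a textual replication that a compiler could perform once the stack size is known. The Python sketch below illustrates the idea only: foreach is an open issue rather than part of the language, so the loop variable, the statement text, and the helpers unroll_foreach and within_budget are all hypothetical, and the substitution works on plain strings rather than on a real syntax tree.

```python
import re

def unroll_foreach(stack_name, stack_size, loop_var, body):
    """Replicate `body` once per stack element with explicit index references."""
    pattern = re.compile(rf"\b{re.escape(loop_var)}\b")
    unrolled = []
    for index in range(stack_size):
        element = f"{stack_name}[{index}]"
        for statement in body:
            # Replace every standalone use of the loop variable by the element.
            unrolled.append(pattern.sub(lambda _match: element, statement))
    return unrolled

def within_budget(statements, max_operations):
    """Reject the expansion if it needs more operations than are available."""
    return len(statements) <= max_operations

# Hypothetical loop: foreach (h in hdr.mpls) { h.ttl = h.ttl - 1; }
unrolled = unroll_foreach("hdr.mpls", 3, "h", ["h.ttl = h.ttl - 1;"])
```

The length of the expanded list is what a compiler would compare against the resources of the target before accepting the program or splitting the action.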
H. Appendix: P4 grammar
This is the grammar of P4_16 written using the YACC/bison language. Absent from this grammar is the
precedence of various operations.
The grammar is actually ambiguous, so the lexer and the parser must collaborate for parsing the
language. In particular, the lexer must be able to distinguish two kinds of identifiers:
• Type names previously introduced (TYPE_IDENTIFIER tokens)
• Regular identifiers (IDENTIFIER tokens)
The parser has to use a symbol table to indicate to the lexer how to parse subsequent appearances of
identifiers. For example, given the following program fragment:
typedef bit<4> t;
struct s { /* body omitted */}
t x;
parser p(bit<8> b) { /* body omitted */ }
t - TYPE_IDENTIFIER
s - TYPE_IDENTIFIER
x - IDENTIFIER
p - TYPE_IDENTIFIER
b - IDENTIFIER
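The cooperation between parser and lexer shown in this example can be imitated with a shared set of known type names. The Python sketch below is a rough approximation, not how any particular P4 front end is implemented: classify_identifiers and the keyword sets are hypothetical, only a few declaration forms are recognized, scoping is ignored, and the whole fragment is scanned before names are classified, whereas a real lexer decides token by token as parsing proceeds.

```python
import re

# Keywords after which the next identifier names a new type (subset only).
DECLARES_TYPE = {"struct", "header", "header_union", "parser", "control", "package"}
KEYWORDS = DECLARES_TYPE | {"typedef", "type", "bit", "int", "varbit", "bool"}
IDENT = re.compile(r"[A-Za-z_]\w*$")

def classify_identifiers(source):
    """Return the token kind the lexer should use for later uses of each name."""
    source = re.sub(r"/\*.*?\*/", " ", source, flags=re.S)   # drop block comments
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", source)
    type_names = set()      # the symbol table the parser shares with the lexer
    for i, tok in enumerate(tokens):
        if tok in DECLARES_TYPE and i + 1 < len(tokens) and IDENT.match(tokens[i + 1]):
            type_names.add(tokens[i + 1])
        elif tok == "typedef":
            end = tokens.index(";", i)       # typedef <type> <name> ;
            type_names.add(tokens[end - 1])
    kinds = {}
    for tok in tokens:
        if IDENT.match(tok) and tok not in KEYWORDS:
            kinds[tok] = "TYPE_IDENTIFIER" if tok in type_names else "IDENTIFIER"
    return kinds

fragment = """
typedef bit<4> t;
struct s { /* body omitted */ }
t x;
parser p(bit<8> b) { /* body omitted */ }
"""
kinds = classify_identifiers(fragment)
```

For this fragment the sketch classifies t, s, and p as TYPE_IDENTIFIER and x and b as IDENTIFIER, matching the list above.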
This grammar has been heavily influenced by limitations of the Bison parser generator tool.
Several other constant terminals appear in these rules:
- MASK is &&&
- RANGE is ..
- DONTCARE is _
The STRING_LITERAL token corresponds to a string literal enclosed within double quotes, as described
in Section 6.3.3.3.
All other terminals are uppercase spellings of the corresponding keywords. For example, RETURN is
the terminal returned by the lexer when parsing the keyword return.
p4program
: /* empty */
| p4program declaration
| p4program ';' /* empty declaration */
;
declaration
: constantDeclaration
| externDeclaration
| actionDeclaration
| parserDeclaration
| typeDeclaration
| controlDeclaration
| instantiation
| errorDeclaration
| matchKindDeclaration
| functionDeclaration
;
nonTypeName
: IDENTIFIER
| APPLY
| KEY
| ACTIONS
| STATE
| ENTRIES
| TYPE
;
name
: nonTypeName
| TYPE_IDENTIFIER
;
nonTableKwName
: IDENTIFIER
| TYPE_IDENTIFIER
| APPLY
| STATE
| TYPE
;
optAnnotations
: /* empty */
| annotations
;
annotations
: annotation
| annotations annotation
;
annotation
: '@' name
| '@' name '(' annotationBody ')'
| '@' name '[' structuredAnnotationBody ']'
;
parameterList
: /* empty */
| nonEmptyParameterList
;
nonEmptyParameterList
: parameter
| nonEmptyParameterList ',' parameter
;
parameter
: optAnnotations direction typeRef name
| optAnnotations direction typeRef name '=' expression
;
direction
: IN
| OUT
| INOUT
| /* empty */
;
packageTypeDeclaration
: optAnnotations PACKAGE name optTypeParameters
'(' parameterList ')'
;
instantiation
: typeRef '(' argumentList ')' name ';'
| annotations typeRef '(' argumentList ')' name ';'
| annotations typeRef '(' argumentList ')' name '=' objInitializer ';'
| typeRef '(' argumentList ')' name '=' objInitializer ';'
;
objInitializer
: '{' objDeclarations '}'
;
objDeclarations
: /* empty */
| objDeclarations objDeclaration
;
objDeclaration
: functionDeclaration
| instantiation
;
optConstructorParameters
: /* empty */
| '(' parameterList ')'
;
dotPrefix
: '.'
;
parserDeclaration
: parserTypeDeclaration optConstructorParameters
/* no type parameters allowed in the parserTypeDeclaration */
'{' parserLocalElements parserStates '}'
;
parserLocalElements
: /* empty */
| parserLocalElements parserLocalElement
;
parserLocalElement
: constantDeclaration
| variableDeclaration
| instantiation
| valueSetDeclaration
;
parserTypeDeclaration
: optAnnotations PARSER name optTypeParameters '(' parameterList ')'
;
parserStates
: parserState
| parserStates parserState
;
parserState
: optAnnotations STATE name '{' parserStatements transitionStatement '}'
;
parserStatements
: /* empty */
| parserStatements parserStatement
;
parserStatement
: assignmentOrMethodCallStatement
| directApplication
| parserBlockStatement
| constantDeclaration
| variableDeclaration
| emptyStatement
| conditionalStatement
;
parserBlockStatement
: optAnnotations '{' parserStatements '}'
;
transitionStatement
: /* empty */
| TRANSITION stateExpression
;
stateExpression
: name ';'
| selectExpression
;
selectExpression
: SELECT '(' expressionList ')' '{' selectCaseList '}'
;
selectCaseList
: /* empty */
| selectCaseList selectCase
;
selectCase
: keysetExpression ':' name ';'
;
keysetExpression
: tupleKeysetExpression
| simpleKeysetExpression
;
tupleKeysetExpression
: "(" simpleKeysetExpression "," simpleExpressionList ")"
| "(" reducedSimpleKeysetExpression ")"
;
simpleExpressionList
: simpleKeysetExpression
| simpleExpressionList ',' simpleKeysetExpression
;
reducedSimpleKeysetExpression
: expression "&&&" expression
| expression ".." expression
| DEFAULT
| "_"
;
simpleKeysetExpression
: expression
| DEFAULT
| DONTCARE
| expression MASK expression
| expression RANGE expression
;
valueSetDeclaration
: optAnnotations
VALUESET '<' baseType '>' '(' expression ')' name ';'
| optAnnotations
VALUESET '<' tupleType '>' '(' expression ')' name ';'
| optAnnotations
VALUESET '<' typeName '>' '(' expression ')' name ';'
;
controlDeclaration
: controlTypeDeclaration optConstructorParameters
/* no type parameters allowed in controlTypeDeclaration */
'{' controlLocalDeclarations APPLY controlBody '}'
;
controlTypeDeclaration
: optAnnotations CONTROL name optTypeParameters
'(' parameterList ')'
;
controlLocalDeclarations
: /* empty */
| controlLocalDeclarations controlLocalDeclaration
;
controlLocalDeclaration
: constantDeclaration
| actionDeclaration
| tableDeclaration
| instantiation
| variableDeclaration
;
controlBody
: blockStatement
;
externDeclaration
: optAnnotations EXTERN nonTypeName optTypeParameters '{' methodPrototypes '}'
| optAnnotations EXTERN functionPrototype ';'
;
methodPrototypes
: /* empty */
| methodPrototypes methodPrototype
;
functionPrototype
: typeOrVoid name optTypeParameters '(' parameterList ')'
;
methodPrototype
: optAnnotations functionPrototype ';'
| optAnnotations TYPE_IDENTIFIER '(' parameterList ')' ';'
;
typeRef
: baseType
| typeName
| specializedType
| headerStackType
| tupleType
;
namedType
: typeName
| specializedType
;
prefixedType
: TYPE_IDENTIFIER
| dotPrefix TYPE_IDENTIFIER
;
typeName
: prefixedType
;
tupleType
: TUPLE '<' typeArgumentList '>'
;
headerStackType
: typeName '[' expression ']'
| specializedType '[' expression ']'
;
specializedType
: prefixedType '<' typeArgumentList '>'
;
baseType
: BOOL
| ERROR
| STRING
| INT
| BIT
| BIT '<' INTEGER '>'
| INT '<' INTEGER '>'
| VARBIT '<' INTEGER '>'
| BIT '<' '(' expression ')' '>'
| INT '<' '(' expression ')' '>'
| VARBIT '<' '(' expression ')' '>'
;
typeOrVoid
: typeRef
| VOID
| IDENTIFIER // may be a type variable
;
optTypeParameters
: /* empty */
| typeParameters
;
typeParameters
: '<' typeParameterList '>'
;
typeParameterList
: name
| typeParameterList ',' name
;
realTypeArg
: DONTCARE
| typeRef
| VOID
;
typeArg
: DONTCARE
| typeRef
| nonTypeName
| VOID
;
realTypeArgumentList
: realTypeArg
| realTypeArgumentList COMMA typeArg
;
typeArgumentList
: /* empty */
| typeArg
| typeArgumentList ',' typeArg
;
typeDeclaration
: derivedTypeDeclaration
| typedefDeclaration
| parserTypeDeclaration ';'
| controlTypeDeclaration ';'
| packageTypeDeclaration ';'
;
derivedTypeDeclaration
: headerTypeDeclaration
| headerUnionDeclaration
| structTypeDeclaration
| enumDeclaration
;
headerTypeDeclaration
: optAnnotations HEADER name optTypeParameters '{' structFieldList '}'
;
headerUnionDeclaration
: optAnnotations HEADER_UNION name optTypeParameters '{' structFieldList '}'
;
structTypeDeclaration
: optAnnotations STRUCT name optTypeParameters '{' structFieldList '}'
;
structFieldList
: /* empty */
| structFieldList structField
;
structField
: optAnnotations typeRef name ';'
;
enumDeclaration
: optAnnotations ENUM name '{' identifierList '}'
| optAnnotations ENUM typeRef name '{' specifiedIdentifierList '}'
;
errorDeclaration
: ERROR '{' identifierList '}'
;
matchKindDeclaration
: MATCH_KIND '{' identifierList '}'
;
identifierList
: name
| identifierList ',' name
;
specifiedIdentifierList
: specifiedIdentifier
| specifiedIdentifierList ',' specifiedIdentifier
;
specifiedIdentifier
: name '=' initializer
;
typedefDeclaration
: optAnnotations TYPEDEF typeRef name ';'
| optAnnotations TYPEDEF derivedTypeDeclaration name ';'
| optAnnotations TYPE typeRef name ';'
| optAnnotations TYPE derivedTypeDeclaration name ';'
;
assignmentOrMethodCallStatement
: lvalue '(' argumentList ')' ';'
| lvalue '<' typeArgumentList '>' '(' argumentList ')' ';'
| lvalue '=' expression ';'
;
emptyStatement
: ';'
;
returnStatement
: RETURN ';'
| RETURN expression ';'
;
exitStatement
: EXIT ';'
;
conditionalStatement
: IF '(' expression ')' statement
| IF '(' expression ')' statement ELSE statement
;
statement
: assignmentOrMethodCallStatement
| directApplication
| conditionalStatement
| emptyStatement
| blockStatement
| exitStatement
| returnStatement
| switchStatement
;
blockStatement
: optAnnotations '{' statOrDeclList '}'
;
statOrDeclList
: /* empty */
| statOrDeclList statementOrDeclaration
;
switchStatement
: SWITCH '(' expression ')' '{' switchCases '}'
;
switchCases
: /* empty */
| switchCases switchCase
;
switchCase
: switchLabel ':' blockStatement
| switchLabel ':'
;
switchLabel
: DEFAULT
| nonBraceExpression
;
statementOrDeclaration
: variableDeclaration
| constantDeclaration
| statement
| instantiation
;
tablePropertyList
: tableProperty
| tablePropertyList tableProperty
;
tableProperty
: KEY '=' '{' keyElementList '}'
| ACTIONS '=' '{' actionList '}'
| optAnnotations CONST ENTRIES '=' '{' entriesList '}' /* immutable entries */
| optAnnotations CONST nonTableKwName '=' initializer ';'
| optAnnotations nonTableKwName '=' initializer ';'
;
keyElementList
: /* empty */
| keyElementList keyElement
;
keyElement
: expression ':' name optAnnotations ';'
;
actionList
: /* empty */
| actionList optAnnotations actionRef ';'
;
actionRef
: prefixedNonTypeName
| prefixedNonTypeName '(' argumentList ')'
;
entriesList
: entry
| entriesList entry
;
entry
: keysetExpression ':' actionRef optAnnotations ';'
;
actionDeclaration
: optAnnotations ACTION name '(' parameterList ')' blockStatement
;
variableDeclaration
: annotations typeRef name optInitializer ';'
| typeRef name optInitializer ';'
;
constantDeclaration
: optAnnotations CONST typeRef name '=' initializer ';'
;
optInitializer
: /* empty */
| '=' initializer
;
initializer
: expression
;
functionDeclaration
: functionPrototype blockStatement
;
argumentList
: /* empty */
| nonEmptyArgList
;
nonEmptyArgList
: argument
| nonEmptyArgList ',' argument
;
argument
: expression
| name '=' expression
| DONTCARE
;
kvList
: kvPair
| kvList ',' kvPair
;
kvPair
: name '=' expression
;
expressionList
: /* empty */
| expression
| expressionList ',' expression
;
annotationBody
: /* empty */
| annotationBody '(' annotationBody ')'
| annotationBody annotationToken
;
structuredAnnotationBody
: expressionList
| kvList
;
annotationToken
: ABSTRACT
| ACTION
| ACTIONS
| APPLY
| BOOL
| BIT
| CONST
| CONTROL
| DEFAULT
| ELSE
| ENTRIES
| ENUM
| ERROR
| EXIT
| EXTERN
| FALSE
| HEADER
| HEADER_UNION
| IF
| IN
| INOUT
| INT
| KEY
| MATCH_KIND
| TYPE
| OUT
| PARSER
| PACKAGE
| PRAGMA
| RETURN
| SELECT
| STATE
| STRING
| STRUCT
| SWITCH
| TABLE
| TRANSITION
| TRUE
| TUPLE
| TYPEDEF
| VARBIT
| VALUESET
| VOID
| "_"
| IDENTIFIER
| TYPE_IDENTIFIER
| STRING_LITERAL
| INTEGER
| "&&&"
| ".."
| "<<"
| "&&"
| "||"
| "=="
| "!="
| ">="
| "<="
| "++"
| "+"
| "|+|"
| "-"
| "|-|"
| "*"
| "/"
| "%"
| "|"
| "&"
| "^"
| "~"
| "["
| "]"
| "{"
| "}"
| "<"
| ">"
| "!"
| ":"
| ","
| "?"
| "."
| "="
| ";"
| "@"
| UNKNOWN_TOKEN
;
member
: name
;
prefixedNonTypeName
: nonTypeName
| dotPrefix nonTypeName
;
lvalue
: prefixedNonTypeName
| THIS
| lvalue '.' member
| lvalue '[' expression ']'
| lvalue '[' expression ':' expression ']'
;
%left ','
%nonassoc '?'
%nonassoc ':'
%left '||'
%left '&&'
%left '==' '!='
%left '<' '>' '<=' '>='
%left '|'
%left '^'
%left '&'
%left '<<' '>>'
%left '++' '+' '-' '|+|' '|-|'
%left '*' '/' '%'
%right PREFIX
%nonassoc ']' '(' '['
%left '.'
expression
: INTEGER
| TRUE
| FALSE
| THIS
| STRING_LITERAL
| nonTypeName
| dotPrefix nonTypeName
| expression '[' expression ']'
| expression '[' expression ':' expression ']'
| '{' expressionList '}'
| '{' kvList '}'
| '(' expression ')'
| '!' expression %prec PREFIX
| '~' expression %prec PREFIX
| '-' expression %prec PREFIX
| '+' expression %prec PREFIX
| typeName '.' member
| ERROR '.' member
| expression '.' member
| expression '*' expression
| expression '/' expression
| expression '%' expression
| expression '+' expression
| expression '-' expression
| expression '|+|' expression
| expression '|-|' expression
| expression '<<' expression
| expression '>>' expression
| expression '<=' expression
| expression '>=' expression
| expression '<' expression
| expression '>' expression
| expression '!=' expression
| expression '==' expression
| expression '&' expression
| expression '^' expression
| expression '|' expression
| expression '++' expression
| expression '&&' expression
| expression '||' expression
| expression '?' expression ':' expression
| expression '<' realTypeArgumentList '>' '(' argumentList ')'
| expression '(' argumentList ')'
| namedType '(' argumentList ')'
| '(' typeRef ')' expression
;
nonBraceExpression
: INTEGER
| STRING_LITERAL
| TRUE
| FALSE
| THIS
| nonTypeName
| dotPrefix nonTypeName
| nonBraceExpression '[' expression ']'
| nonBraceExpression '[' expression ':' expression ']'
| '(' expression ')'
| '!' expression %prec PREFIX
| '~' expression %prec PREFIX
| '-' expression %prec PREFIX
| '+' expression %prec PREFIX
| typeName '.' member
| ERROR '.' member
| nonBraceExpression '.' member
| nonBraceExpression '*' expression
| nonBraceExpression '/' expression
| nonBraceExpression '%' expression
| nonBraceExpression '+' expression
| nonBraceExpression '-' expression
| nonBraceExpression '|+|' expression
| nonBraceExpression '|-|' expression
| nonBraceExpression '<<' expression
| nonBraceExpression '>>' expression
| nonBraceExpression '<=' expression
| nonBraceExpression '>=' expression
| nonBraceExpression '<' expression
| nonBraceExpression '>' expression
| nonBraceExpression '!=' expression
| nonBraceExpression '==' expression
| nonBraceExpression '&' expression
| nonBraceExpression '^' expression
| nonBraceExpression '|' expression
| nonBraceExpression '++' expression
| nonBraceExpression '&&' expression
| nonBraceExpression '||' expression
| nonBraceExpression '?' expression ':' expression
| nonBraceExpression '<' realTypeArgumentList '>' '(' argumentList ')'
| nonBraceExpression '(' argumentList ')'
| namedType '(' argumentList ')'
| '(' typeRef ')' expression
;