Tutorial - Introduction To P4
Recap: Switch/ Router Architectures
[Figure: Control plane: the control-plane CPU runs routing protocols (OSPF, BGP, …) and holds the Routing Information Base (RIB) and the Forwarding Information Base (FIB). Through a Runtime API it programs the IPv4 match-action/lookup tables of the switching ASIC. Data plane: the switching ASIC forwards the packets.]
Example switching ASICs:
• Broadcom: Tomahawk5
• Cisco: Silicon One
• Intel: Tofino2
• Juniper: Trio6
Source: https://sdn.systemsapproach.org/intro.html
Bottom-up Design
[Figure: Network Demands, Switch OS, Run-time API, Driver, Fixed-function ASIC. The link between Network Demands and the Switch OS is marked "?".]
“This is how I know to process packets”
(i.e. the ASIC datasheet makes the rules)
[Figure: Network Demands, Switch OS, Run-time API, Driver, P4 Programmable Device, with Feedback between Network Demands and the Switch OS. The P4 program configures the device.]
“This is how I want the network to behave and how to switch packets…”
(the user / controller makes the rules)
Reconfigurable Match-action Tables (RMT)
Benefits of Data Plane Programmability
• New Features – Add new protocols
• Reduce complexity – Remove unused protocols
• Efficient use of resources – flexible use of tables
• Greater visibility – New diagnostic techniques, telemetry, etc.
• SW style development – rapid design cycle, fast innovation, fix data
plane bugs in the field
• You keep your own ideas
Introduction to P4
Language evolution
P4_16 Specification
Available Software Tools
P4 Approach
[Table: Term / Explanation. Rows not recovered. Legend: Community-Developed, Vendor-supplied]
Mapping a simple logical pipeline on PISA
P4 programs and architectures
Example Architectures and Targets
[Figure: example architectures and their targets. Recoverable labels: V1Model/ PISA Architecture, Software Switch, TM, Tofino Native Architecture* (P4_14)]
• Parsers: State machine, bitfield extraction
• Controls: Tables, Actions, control flow statements
• Expressions: Basic operations and operators
• Data Types: Bitstrings, headers, structures, arrays
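Programming a P4 Target
[Figure: User supplied: the P4 Program and the Control Plane. Vendor supplied: the P4 Compiler, the P4 Architecture Model, and the Target. The compiler turns the P4 program into a target-specific configuration binary that is loaded onto the target's Data Plane (Tables, Extern objects). At RUNTIME the control plane adds/removes table entries, controls externs, and exchanges packet-in/out over the CPU port.]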
Protocol-Independent Switch (PISA)/ V1Model Architecture
• Programmer declares the headers that should be recognized and their order in the packet
• Programmer defines the tables and the exact processing algorithm
• Programmer declares how the output packet will look on the wire
P4 Program Template (V1Model)
#include <core.p4>
#include <v1model.p4>
/* HEADERS */
struct metadata { ... }
struct headers {
    ethernet_t ethernet;
    ipv4_t ipv4;
}
/* PARSER */
parser MyParser(packet_in packet,
                out headers hdr,
                inout metadata meta,
                inout standard_metadata_t smeta) {
    ...
}
/* CHECKSUM VERIFICATION */
control MyVerifyChecksum(inout headers hdr,
                         inout metadata meta) {
    ...
}
/* INGRESS PROCESSING */
control MyIngress(inout headers hdr,
                  inout metadata meta,
                  inout standard_metadata_t std_meta) {
    ...
}
/* EGRESS PROCESSING */
control MyEgress(inout headers hdr,
                 inout metadata meta,
                 inout standard_metadata_t std_meta) {
    ...
}
/* CHECKSUM UPDATE */
control MyComputeChecksum(inout headers hdr,
                          inout metadata meta) {
    ...
}
/* DEPARSER */
control MyDeparser(packet_out packet,
                   in headers hdr) {
    ...
}
/* SWITCH */
V1Switch(
    MyParser(),
    MyVerifyChecksum(),
    MyIngress(),
    MyEgress(),
    MyComputeChecksum(),
    MyDeparser()
) main;
V1Model Standard Metadata
struct standard_metadata_t {
    bit<9> ingress_port;
    bit<9> egress_spec;
    bit<9> egress_port;
    bit<32> clone_spec;
    bit<32> instance_type;
    bit<1> drop;
    bit<16> recirculate_port;
    bit<32> packet_length;
    bit<32> enq_timestamp;
    bit<19> enq_qdepth;
    bit<32> deq_timedelta;
    bit<19> deq_qdepth;
    bit<48> ingress_global_timestamp;
    bit<32> lf_field_list;
    bit<16> mcast_grp;
    bit<1> resubmit_flag;
    bit<16> egress_rid;
    bit<1> checksum_error;
}
• ingress_port - the port on which the packet arrived
• egress_spec - the port to which the packet should be sent
• egress_port - the port that the packet will be sent out of (read only in egress pipeline)
Simple P4 Program Example
#include <core.p4>
#include <v1model.p4>
struct metadata {}
struct headers {}
parser MyParser(packet_in packet,
                out headers hdr,
                inout metadata meta,
                inout standard_metadata_t standard_metadata) {
    state start { transition accept; }
}
control MyVerifyChecksum(inout headers hdr, inout metadata meta) { apply { } }
control MyEgress(inout headers hdr,
                 inout metadata meta,
                 inout standard_metadata_t standard_metadata) {
    apply { }
}
control MyComputeChecksum(inout headers hdr, inout metadata meta) {
    apply { }
}
control MyDeparser(packet_out packet, in headers hdr) {
    apply { }
}
Defining and (De-)Parsing Headers
P4 Types (Basic and Header Types)
Basic Types
• bit<n>: Unsigned integer (bitstring) of size n
• bit is the same as bit<1>
• int<n>: Signed integer of size n (>=2)
• varbit<n>: Variable-length bitstring
Header Types: Ordered collection of members
• Can contain bit<n>, int<n>, and varbit<n>
• Byte-aligned
• Can be valid or invalid
• Provides several operations to test and set validity bit: isValid(), setValid(), and setInvalid()
typedef bit<48> macAddr_t;
header ethernet_t {
    macAddr_t dstAddr;
    macAddr_t srcAddr;
    bit<16> etherType;
}
Example: IPv4 Header
typedef bit<32> ip4Addr_t;
header ipv4_t {
bit<4> version;
bit<4> ihl;
bit<8> diffserv;
bit<16> totalLen;
bit<16> identification;
bit<3> flags;
bit<13> fragOffset;
bit<8> ttl;
bit<8> protocol;
bit<16> hdrChecksum;
ip4Addr_t srcAddr;
ip4Addr_t dstAddr;
}
P4 Types (Other Types)
Other useful types
• Struct: Unordered collection of members (with no alignment restrictions)
• Header Stack: array of headers
• Header Union: one of several headers
/* Architecture */
struct standard_metadata_t {
    bit<9> ingress_port;
    bit<9> egress_spec;
    bit<9> egress_port;
    bit<32> clone_spec;
    bit<32> instance_type;
    bit<1> drop;
    bit<16> recirculate_port;
    bit<32> packet_length;
    ...
}
/* User program */
struct metadata {
    ...
}
struct headers {
    ethernet_t ethernet;
    ipv4_t ipv4;
}
Programmable Parsers
/* User Program */
parser MyParser(packet_in packet,
                out headers hdr,
                inout metadata meta,
                inout standard_metadata_t std_meta) {
    state start {
        packet.extract(hdr.ethernet);
        transition accept;
    }
}
[Figure: MyParser block with inputs packet_in, meta, and standard_meta, and outputs hdr and meta]
The platform initializes User Metadata to 0
“select” statement
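A minimal sketch of a parser that uses select to branch on a header field. It assumes the ethernet_t and ipv4_t headers and the headers struct from the earlier slides; the constant TYPE_IPV4 and the state name parse_ipv4 are illustrative names, not taken from the slides.

```p4
const bit<16> TYPE_IPV4 = 0x0800;  // EtherType value for IPv4

parser MyParser(packet_in packet,
                out headers hdr,
                inout metadata meta,
                inout standard_metadata_t std_meta) {
    state start {
        packet.extract(hdr.ethernet);
        // Choose the next state based on the field that was just extracted.
        transition select(hdr.ethernet.etherType) {
            TYPE_IPV4: parse_ipv4;
            default: accept;       // any other EtherType: stop parsing
        }
    }
    state parse_ipv4 {
        packet.extract(hdr.ipv4);
        transition accept;
    }
}
```

Packets that are not IPv4 leave hdr.ipv4 invalid, which later stages can check with hdr.ipv4.isValid().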
(Stateless) Packet Processing
P4 Control Blocks
Example: Reflector (V1Model)
Example: Simple Actions
control MyIngress(inout headers hdr,
                  inout metadata meta,
                  inout standard_metadata_t std_meta) {
    action swap_mac(inout bit<48> src,
                    inout bit<48> dst) {
        bit<48> tmp = src;
        src = dst;
        dst = tmp;
    }
    apply {
        swap_mac(hdr.ethernet.srcAddr,
                 hdr.ethernet.dstAddr);
        std_meta.egress_spec = std_meta.ingress_port;
    }
}
• Very similar to C functions
• Can be declared inside a control or globally
• Parameters have type and direction
• Variables can be instantiated inside
• Many standard arithmetic and logical operations are supported
◦ +, -, *
◦ ~, &, |, ^, >>, <<
◦ ==, !=, >, >=, <, <=
◦ No division/modulo
• Non-standard operations:
◦ Bit-slicing: [m:l] (works as l-value too)
◦ Bit Concatenation: ++
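The two non-standard operations can be sketched as follows. This is an illustrative fragment, assuming the ethernet_t header from the earlier slides; the control name SliceDemo and the action name mix_mac are hypothetical.

```p4
control SliceDemo(inout headers hdr,
                  inout metadata meta,
                  inout standard_metadata_t std_meta) {
    action mix_mac() {
        // Bit-slicing as an r-value: upper and lower 24 bits of two MAC addresses.
        bit<24> upper = hdr.ethernet.dstAddr[47:24];
        bit<24> lower = hdr.ethernet.srcAddr[23:0];
        // Concatenation: 24 bits ++ 24 bits gives a 48-bit value.
        hdr.ethernet.srcAddr = upper ++ lower;
        // Bit-slicing as an l-value: overwrite only the lowest byte.
        hdr.ethernet.srcAddr[7:0] = 8w0xFF;
    }
    apply {
        mix_mac();
    }
}
```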
P4 Match-action Tables
• The fundamental unit of a Match-Action Pipeline
◦ Specifies what data to match on and match kind
◦ Specifies a list of possible actions
◦ Optionally specifies a number of table properties
■ Size
■ Default action
■ Static entries
■ …
• Each table contains one or more entries (rules)
• An entry contains:
◦ A specific key to match on
◦ A single action that is executed when a packet matches the entry
◦ Action data (possibly empty)
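As a sketch of how these pieces fit together, the control below declares a table with a key, a list of actions, a size, a default action, and static entries. The table name port_fwd and the action set_egress are hypothetical; mark_to_drop(std_meta) assumes the current v1model.p4, where the primitive takes the standard metadata as an argument.

```p4
control MyIngress(inout headers hdr,
                  inout metadata meta,
                  inout standard_metadata_t std_meta) {
    action set_egress(bit<9> port) {      // "port" is the action data of an entry
        std_meta.egress_spec = port;
    }
    action drop() {
        mark_to_drop(std_meta);
    }
    table port_fwd {
        key = { std_meta.ingress_port: exact; }  // what to match on, and the match kind
        actions = { set_egress; drop; }          // list of possible actions
        size = 16;
        default_action = drop();                 // executed when no entry matches
        const entries = {                        // static entries: key : action(action data)
            1 : set_egress(2);
            2 : set_egress(1);
        }
    }
    apply {
        port_fwd.apply();
    }
}
```

Without const entries, the same table would start empty and be filled by the control plane through the run-time API.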
Example: IPv4_LPM Table
table ipv4_lpm {
key = {
hdr.ipv4.dstAddr: lpm;
}
actions = {
ipv4_forward;
drop;
NoAction;
}
size = 1024;
default_action = NoAction();
}
Defining Actions
action NoAction() {
}
/* basic.p4 */
action drop() {
    mark_to_drop(standard_metadata);
}
/* basic.p4 */
action ipv4_forward(macAddr_t dstAddr,
                    bit<9> port) {
    ...
}
• Two types of parameters
◦ Directional (from the Data Plane)
◦ Directionless (from the Control Plane)
• Actions that are called directly:
◦ Only use directional parameters
• Actions used in tables:
◦ Typically use directionless parameters (action data)
◦ May sometimes use directional parameters too
[Figure: Action Execution. Directional parameters (from the data plane) and directionless parameters (action data) both feed the action code.]
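Applying Tables in Controls
control MyIngress(inout headers hdr,
                  inout metadata meta,
                  inout standard_metadata_t standard_metadata) {
    table ipv4_lpm {
        ...
    }
    apply {
        ...
        ipv4_lpm.apply();
        ...
    }
}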
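A slightly fuller sketch of applying a table, assuming the headers and the ipv4_lpm table from the earlier slides. The body of ipv4_forward shown here is only one plausible choice and is not taken from the slides; mark_to_drop(std_meta) assumes the current v1model.p4 signature.

```p4
control MyIngress(inout headers hdr,
                  inout metadata meta,
                  inout standard_metadata_t std_meta) {
    action drop() {
        mark_to_drop(std_meta);
    }
    action ipv4_forward(macAddr_t dstAddr, bit<9> port) {
        // Assumed body: pick the output port and rewrite the destination MAC.
        std_meta.egress_spec = port;
        hdr.ethernet.dstAddr = dstAddr;
    }
    table ipv4_lpm {
        key = { hdr.ipv4.dstAddr: lpm; }
        actions = { ipv4_forward; drop; NoAction; }
        size = 1024;
        default_action = NoAction();
    }
    apply {
        if (hdr.ipv4.isValid()) {
            // apply() reports whether an entry matched.
            if (ipv4_lpm.apply().hit) {
                hdr.ipv4.ttl = hdr.ipv4.ttl - 1;
            }
        } else {
            drop();   // actions without directionless parameters can be called directly
        }
    }
}
```

Guarding the lookup with isValid() avoids matching on a header the parser never extracted.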
Demo
Questions?
(Stateless) Packet Processing - cont.
Hashing (V1Model)
enum HashAlgorithm {
    csum16,
    xor16,
    crc32,
    crc32_custom,
    crc16,
    crc16_custom,
    random,
    identity
}
extern void hash<O, T, D, M>(
    out O result,
    in HashAlgorithm algo,
    in T base,
    in D data,
    in M max);
• Computes the hash of data (using algo) modulo max and adds it to base
• Uses type variables (like C++ templates / Java Generics) to allow the hashing primitive to be used with many different types.
bit<10> variable_to_output;
hash(variable_to_output, HashAlgorithm.crc32, 0,
     {hdr.ipv4.srcAddr, … }, 1024);
// here we have the hash value ready
ECMP as a Use Case
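One way the hash primitive could drive ECMP is sketched below, assuming the ipv4_t header from the earlier slides and four equal-cost next hops. The names EcmpSelect, ecmp_index, ecmp_nhop, and set_nhop are hypothetical, and the table entries (index to port) would be installed by the control plane.

```p4
control EcmpSelect(inout headers hdr,
                   inout metadata meta,
                   inout standard_metadata_t std_meta) {
    bit<16> ecmp_index;   // selected next hop, in the range 0..3

    action set_nhop(bit<9> port) {
        std_meta.egress_spec = port;
    }
    table ecmp_nhop {
        key = { ecmp_index: exact; }
        actions = { set_nhop; NoAction; }
        size = 4;
        default_action = NoAction();
    }
    apply {
        if (hdr.ipv4.isValid()) {
            // result = base + (hash(data) mod max), here 0 + (crc16 mod 4).
            hash(ecmp_index, HashAlgorithm.crc16, 16w0,
                 { hdr.ipv4.srcAddr, hdr.ipv4.dstAddr, hdr.ipv4.protocol },
                 32w4);
            ecmp_nhop.apply();
        }
    }
}
```

Because the hash input is the same for every packet of a flow, all packets of that flow take the same path, while different flows spread across the next hops.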
Mirroring (V1Model)
Multicast (V1Model)
Stateful Packet Processing
What is “Stateful Packet Processing”?
Registers in V1 Model
Registers (V1Model)
Registers are used to maintain state in the data plane
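A minimal sketch of a register in use: a per-port packet count that the data plane both reads and writes. It assumes the V1Model register extern with its read(result, index) and write(index, value) methods; the instance name pkts_per_port is hypothetical.

```p4
control MyIngress(inout headers hdr,
                  inout metadata meta,
                  inout standard_metadata_t std_meta) {
    // One 32-bit cell per possible 9-bit port number.
    register<bit<32>>(512) pkts_per_port;

    apply {
        bit<32> count;
        bit<32> idx = (bit<32>) std_meta.ingress_port;
        pkts_per_port.read(count, idx);        // state left by earlier packets
        pkts_per_port.write(idx, count + 1);   // state seen by later packets
    }
}
```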
Example: Stateful Firewall
● Assume a network admin:
○ Allows connections to be initiated only from/ within an enterprise/home network
○ Blocks incoming connection requests from outside
[Figure: Server in the External Network; Alice and Bob in the enterprise/home network]
Example: Stateful Firewall
[Figure: Server in the External Network; Alice and Bob in the enterprise/home network. Labels: “2. SYN(Alice->Server)”, “1. SYN-ACK(Server->Alice)”]
Example: Stateful Firewall
[Figure: Server in the External Network; Alice and Bob in the enterprise/home network. Label: “4. SYN(Server->Bob)”]
Bloom Filters
Wikipedia:
“A Bloom filter is a space-efficient probabilistic data structure, conceived by
Burton Howard Bloom in 1970, that is used to test whether an element is
a member of a set.”
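A Bloom filter can be built from registers and hashes, which is one way to approximate the stateful firewall in the data plane. The sketch below makes several simplifying assumptions that are not in the slides: port 1 faces the internal network, a flow is identified only by its pair of 32-bit IPv4 addresses, and every outgoing packet (not just SYNs) is recorded. All names are hypothetical; mark_to_drop(std_meta) assumes the current v1model.p4 signature.

```p4
control BloomFirewall(inout headers hdr,
                      inout metadata meta,
                      inout standard_metadata_t std_meta) {
    // Two bit arrays, one per hash function.
    register<bit<1>>(4096) bloom_1;
    register<bit<1>>(4096) bloom_2;
    bit<32> pos_1;
    bit<32> pos_2;
    bit<1> bit_1;
    bit<1> bit_2;

    // Map an (inside, outside) address pair to one position in each array.
    action compute_positions(in bit<32> inside, in bit<32> outside) {
        hash(pos_1, HashAlgorithm.crc16, 32w0, { inside, outside }, 32w4096);
        hash(pos_2, HashAlgorithm.crc32, 32w0, { inside, outside }, 32w4096);
    }
    apply {
        if (hdr.ipv4.isValid()) {
            if (std_meta.ingress_port == 1) {
                // Outgoing packet: insert the flow into the set.
                compute_positions(hdr.ipv4.srcAddr, hdr.ipv4.dstAddr);
                bloom_1.write(pos_1, 1);
                bloom_2.write(pos_2, 1);
            } else {
                // Incoming packet: swap the addresses so both directions hash alike.
                compute_positions(hdr.ipv4.dstAddr, hdr.ipv4.srcAddr);
                bloom_1.read(bit_1, pos_1);
                bloom_2.read(bit_2, pos_2);
                if (bit_1 == 0 || bit_2 == 0) {
                    mark_to_drop(std_meta);   // definitely not initiated from inside
                }
            }
        }
    }
}
```

The filter never drops a flow that was recorded, but hash collisions can let an unrecorded flow through, which is the false-positive behaviour inherent to Bloom filters.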
Slides from: https://courses.cs.washington.edu/courses/cse312/20au/files/slides/10-23-annotated.pdf
Meters in V1 Model Architecture
Counters in V1 Model Architecture
Counters (V1Model)
Counters can be updated from the P4 program, but can only be read from the
control plane.
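A minimal sketch of a V1Model counter, assuming the counter extern with its count(index) method; the instance name port_counter is hypothetical.

```p4
control MyIngress(inout headers hdr,
                  inout metadata meta,
                  inout standard_metadata_t std_meta) {
    // One counter cell per possible 9-bit port number, tracking packets and bytes.
    counter(512, CounterType.packets_and_bytes) port_counter;

    apply {
        // The P4 program can only increment; there is no read method here.
        port_counter.count((bit<32>) std_meta.ingress_port);
    }
}
```

Unlike a register, the counter exposes no read operation to the P4 program, so the accumulated values have to be fetched by the control plane.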
Using the V1 Model Counters
Questions?
Interested in working with us?
Quiz 3
Question 1:
Question 2:
Question 3:
Question 4:
Question 5:
Question 6:
References
1. ONF P4 Tutorial: https://opennetworking.org/wp-content/uploads/2020/12/P4_tutorial_01_basics.gslide.pdf
2. ONF ONS Tutorial: https://events19.linuxfoundation.org/wp-content/uploads/2017/12/Tutorial-P4-and-P4Runtime-Technical-Introduction-and-Use-Cases-for-Service-Providers-Carmelo-Cascone-Open-Networking-Foundation.pdf
3. Stanford CS344 Lecture 2: https://cs344-stanford.github.io/lectures/Lecture-2-P4-tutorial.pdf
4. P4 Introduction: https://conferences.sigcomm.org/sigcomm/2018/files/slides/hda/paper_2.2.pdf
5. ETH Zurich Adv Topics in Communication Networks: https://polybox.ethz.ch/index.php/s/dP7zuZH5Y9amcGG
NetCache: Balancing Key-Value Stores with Fast In-Network Caching
Xin Jin (1), Xiaozhou Li (2), Haoyu Zhang (3), Robert Soule (2,4), Jeongkeun Lee (2), Nate Foster (2,5), Changhoon Kim (2), Ion Stoica (6)
(1) Johns Hopkins University, (2) Barefoot Networks, (3) Princeton University, (4) Università della Svizzera italiana, (5) Cornell University, (6) UC Berkeley
Goal: Fast and cost-efficient rack-scale KV stores
❑ Store, retrieve, manage key-value objects
▪ Critical building block for large-scale cloud services
Key challenge:
Highly skewed and rapidly changing workloads
Long tail
distribution
[1] Atikoglu et al., Workload Analysis of a Large-scale Key-value Store. 2012. ACM SIGMETRICS.
Opportunity:
Fast, small cache can ensure load balancing
Cache absorbs hottest queries
Balanced load
Requirement: cache throughput ≥ backend aggregate throughput
“I need to be able to absorb and handle all the load for the cached items!”
[1] Fan et al., Small Cache, Big Effect: Provable Load Balancing for Randomly Partitioned Cluster Services. 2011. ACM SOCC.
NetCache: Towards billions QPS KV store racks
[Figure: an in-network cache, O(1) BQPS, in front of in-memory servers, each O(10) MQPS, total O(1) BQPS]
Small on-chip memory? 10s of MB only.
Only cache O(N log N) small items!
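NetCache Rack-scale architecture
[Figure: Network Management and Cache Management, connected to the switch via PCIE and the Run-time API]
[Figure: query handling between Client, Cache (switch), and Server]
• Read Query (cache hit), steps 1-2: Client, Hit in Cache and Update Stats, back to Client
• Read Query (cache miss), steps 1-4: Client, Miss in Cache and Update Stats, Server, back to Client
• Write Query, steps 1-4: Client, Invalidate Cache entry and Stats, Server, back to Client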
The “boring life” of a NetCache switch
Backup Slides
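[Figure: match-action table execution. The Control Plane populates table entries (Key, Action ID, Action Data). Headers and Metadata form the Lookup Key; the lookup yields Hit/Miss, and on a Hit the Selector uses the Action ID to choose the Action Code. Directional parameters come from the Data Plane; directionless parameters are the Action Data.]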
Recap