An Exhaustive Survey On P4 Programmable Data Plane Switches: Taxonomy, Applications, Challenges, and Future Trends
[Fig. 2. Paper roadmap. Section I: Introduction; Section II: Related Surveys; Section III: Traditional Control Plane and SDN; Section IV: Programmable Switches; Section V: Methodology and Taxonomy; Sections VI-XII: Surveyed Work; Section XIII: Challenges and Future Trends.]
facto standard for defining the forwarding behavior is the P4 language [9], which stands for Programming Protocol-independent Packet Processors. Essentially, P4 programmable switches have removed the entry barrier to network design, previously reserved to network vendors.

The momentum of programmable switches is reflected in the global ecosystem around P4. Operators such as AT&T [10], Comcast [11], NTT [12], KPN [13], Turk Telekom [14], Deutsche Telekom [15], and China Unicom [14] are now using P4-based platforms and applications to optimize their networks. Companies with large data centers such as Facebook [16], Alibaba [17], and Google [18] operate on programmable platforms running customized software, in contrast to the fully proprietary implementations of just a few years ago [19]. Switch manufacturers such as Edgecore [20], Stordis [21], Cisco [22], Arista [23], Juniper [24], and Interface Masters [25] are now manufacturing P4 programmable switches with multiple deployment models, from fully programmable or white boxes to hybrid schemes. Chip manufacturers such as Barefoot Networks (Intel) [26], Xilinx [27], Pensando [28], Mellanox [29], and Innovium [30] have embraced programmable data planes without compromising performance.

The availability of tools and the agility of software development have opened an unprecedented possibility of experimentation and innovation by enabling network owners to build custom protocols and process them using protocol-independent primitives, reprogram the data plane in the field, and run P4 code on diverse platforms. Major agencies supporting engineering research and education worldwide are investing in programmable networks as well [31–34].

A. Contribution
Despite the increasing interest in P4 switches, previous work has only partially covered this technology. As shown in Table I, there is currently no updated and comprehensive material. This paper addresses this gap by providing an overview of the evolution of networks from legacy to programmable; describing the essentials of programmable switches and P4; and summarizing the advantages of programmable switches over SDN and legacy devices. The paper continues by presenting a taxonomy of applications developed with P4; surveying, classifying, analyzing, and comparing more than 150 articles; discussing challenges and considerations; and putting forward future perspectives and open research issues.

B. Paper Organization
The road-map of this survey is illustrated in Fig. 2. Section II studies and compares existing surveys on various P4-related topics and demonstrates the added value of the offered work. Section III describes the traditional and SDN devices, and the evolution toward programmable data planes. Section IV introduces programmable switches and their features and explains the Protocol Independent Switch Architecture (PISA), a pipeline forwarding model. Section V describes the survey methodology and the proposed taxonomy. Subsequent sections (from Section VI to Section XII) explore the works pertaining to various categories proposed in the taxonomy, and compare the P4 approaches in each category, as well as with the legacy-enabled solutions. Section XIII outlines challenges and considerations extracted and induced from the literature, and pinpoints directions that can be explored in the future to improve the state-of-the-art solutions. Finally, Section XIV concludes the survey. The abbreviations used in this article are summarized in Table XIV, at the end of the article.

II. RELATED SURVEYS

The advantages of programmable switches attracted considerable attention from the research community. They were described in previous surveys.

Stubbe et al. [35] discussed various P4 compilers and interpreters in a short survey. This work provided a background on the P4 language and demonstrated the main building blocks that describe packet processing in a programmable switch. It outlined reference hardware and software programmable switch implementations. The survey lacks discussions on existing application schemes, challenges, and potential future work.

Dargahi et al. [36] focused on stateful data planes and the security implications. There are two main objectives of this survey. First, it introduces the reader to recent trends and technologies pertaining to stateful data planes. Second, it discusses relevant security issues by analyzing selected use cases. The scope of the survey is not limited to P4 for programming the data plane. Instead, it describes other schemes such as OpenState [44], Flow-level State Transitions (FAST) [45], etc. When reviewing the security properties of stateful data planes, the authors described a mapping between potential attacks and corresponding vulnerabilities.

Cordeiro et al. [37] discussed the evolution of SDN from OpenFlow to data plane programmability. The survey briefly explained the layout of a P4 program and how it is mapped to
TABLE I
COMPARISON WITH RELATED SURVEYS
the abstract forwarding model. It then listed various compilers, tools, simulators, and frameworks for P4 development. The authors categorized the literature into two categories: 1) programmable security and dependability management; 2) enhanced accounting and performance management. In the first category, the authors listed works pertaining to policy modeling, analysis, and verification, as well as intrusion detection and prevention, and network survivability. In the second category, the authors focused on network monitoring, traffic engineering, and load balancing. The survey only lists a limited set of papers without providing much detail or explaining how the papers differ from each other. Moreover, the survey was published in 2017, so a significant share of the P4-related work published since then is missing.

Satapathy et al. [38] presented a short description about the pitfalls of traditional networks and the evolution of SDN. The report briefly described elements of the P4 language. The authors then discussed the control plane and P4Runtime [46], and enumerated three use cases of P4 applications. The report concludes with potential future work.

The short survey presented by Bifulco et al. [39] reviews the trends and issues of abstractions and architectures that realize programmable networks. The authors discussed the motivation of packet processing devices in the networking field and described the anatomy of a programmable switch. The proposed taxonomy categorizes the literature as state-based, abstraction-based, implementation-based, and layer-based. The layer-based category consists of the control/intent layer and the data plane layer; the implementation-based category encompasses software and hardware switches; the abstraction-based category includes data flow graphs and match-action pipelines; and the state-based category differentiates between stateful and stateless data planes.

Kaljic et al. [40] presented a survey on data plane flexibility and programmability in SDN networks. The authors evaluated data plane architectures through several definitions of flexibility and programmability. In general, flexibility in SDN refers to the ability of the network to adapt its resources (e.g., changes in the topology or the network requirements). Afterwards, the authors identified key factors that influence the deviation from the original data plane given with OpenFlow. The survey concludes with future research directions.

Kannan et al. [41] presented a short survey related to the evolution of programmable networks. This work described the pre-SDN model and the evolution to SDN and programmable data plane. The authors highlighted some features of programmable switches such as stateful processing, accurate timing information, and flexible packet cloning and recirculation. The survey categorized data plane applications into two categories, namely, network monitoring and in-network computing. While this survey listed a considerable number of papers belonging to these categories, it barely explained the operation and main ideas of each paper.

Tan et al. [42] presented a survey describing In-band Network Telemetry (INT). The survey explained the development stages and classifications of network measurement (traditional, SDN-based, and P4-based). It also outlined some existing applications that leverage INT such as congestion control, troubleshooting, etc. The survey concludes with discussions and potential future work related to INT.

Zhang et al. [43] presented a survey that focuses on stateful data planes. The survey starts with an overview of stateless and stateful data planes, then overviews and compares some stateful platforms (e.g., OpenState, FAST, FlowBlaze, etc.). The paper reviews a handful of stateful data plane applications and discusses challenges and future perspectives.

Table I summarizes the topics and the features described in the related surveys. It also highlights how this paper differs from the existing surveys. All previous surveys lack a microscopic comparison between the intra-category works. Also, none of them compare switch-based schemes against legacy server-based schemes. To the best of the authors' knowledge, this work is the first to exhaustively explore the whole programmable data plane ecosystem. Specifically, the paper describes P4 switches and provides a detailed taxonomy of applications using P4 switches. It categorizes and compares the applications within each category as well as with legacy approaches, and provides challenges and future perspectives.

III. TRADITIONAL CONTROL PLANE AND SDN

A. Traditional and SDN Devices
With traditional devices, networks are connected using protocols such as Open Shortest Path First (OSPF) and Border Gateway Protocol (BGP) [47] running in the control plane
TABLE II
FEATURES, TRADITIONAL, SDN, AND P4 PROGRAMMABLE DEVICES
at each device. Both control and data planes are under full control of vendors. On the other hand, SDN delineates a clear separation between the control plane and the data plane, and consolidates the control plane so that a single centralized controller can control multiple remote data planes. The controller is implemented in software, under the control of the network owner. The controller computes the tables used by each switch and distributes them via a well-defined Application Programming Interface (API), such as OpenFlow [48]. While SDN allows for the customization of the control plane, it is limited to the OpenFlow specifications and the fixed-function data plane.

B. Comparison of Traditional, SDN, and Programmable Data Plane Devices
Table II contrasts the main characteristics of traditional, SDN, and P4 programmable devices. In the latter, the forwarding behavior is defined by the user's code. Other advantages include the program-dependent APIs, where the same P4 program running on different targets requires no modifications in the runtime applications (i.e., the control plane and the interface between control and data planes are target agnostic); the protocol-independent primitives used to process packets; the more powerful computation model where the match-action stages can not only be in series but also in parallel; and the in-field reprogrammability at runtime. On the other hand, the technology maturity and support for P4 devices can still be considered low in contrast to traditional and SDN devices.

C. Network Evolution and Analogy with other Domain Specific Processors
The introduction of the general-purpose computers in the early 1970s enabled programmers to develop applications running on CPUs. The use of high-level languages accelerated innovation by hiding the target hardware (e.g., x86). In signal processing, Digital Signal Processors (DSPs) were developed in the late 1970s and early 1980s with instruction sets optimized for digital signal processing. Matlab is used for developing DSP applications. In graphics, Graphics Processing Units (GPUs) were developed in the late 1990s and early 2000s with instruction sets for graphics. Open Computing Language (OpenCL) is one of the main languages for developing graphic applications. In machine learning, Tensor Processor Units (TPUs) and TensorFlow were developed in the mid-2010s with instruction sets optimized for machine learning.

Programmable forwarding is part of the larger information technology evolution observed above. Specifically, over the last few years, a group of researchers developed a machine model for networking, namely the Protocol Independent Switch Architecture (PISA) [49]. PISA was designed with instruction sets optimized for network operations. The high-level language for programming PISA devices is P4.

IV. PROGRAMMABLE SWITCHES

A. PISA Architecture
PISA is a packet processing model that includes the following elements: programmable parser, programmable match-action pipeline, and programmable deparser, see Fig. 3. The programmable parser permits the programmer to define the headers (according to custom or standard protocols) and to parse them. The parser can be represented as a state machine. The programmable match-action pipeline executes the operations over the packet headers and intermediate results. A single match-action stage has multiple memory blocks (tables, registers) and Arithmetic Logic Units (ALUs), which allow for simultaneous lookups and actions. Since some action results may be needed for further processing (e.g., data dependencies),
stages are arranged sequentially. The programmable deparser assembles the packet headers back and serializes them for transmission. A PISA device is protocol-independent.

In Fig. 3, the P4 program defines the format of the keys used for lookup operations. Keys can be formed using the packet header's information. The control plane populates table entries with keys and action data. Keys are used for matching packet information (e.g., destination IP address) and action data is used for operations (e.g., output port).

Fig. 3. A PISA-based data plane and its interaction with the control plane.

B. Programmable Switch Features
The main features of programmable switches are [50]:
• Agility: the programmer can design, test, and adopt new protocols and features in significantly shorter times (i.e., weeks or months rather than years).
• Top-down design: for decades the networking industry operated in a bottom-up approach. Fixed-function ASICs are at the bottom and enforce available protocols and features to the programmer at the top. With programmable switches, the programmer describes protocols and features in the ASICs. Note that the physical layer and parts of the MAC layer may not be programmable.
• Visibility: programmable switches provide greater visibility [...] resources and add complexity to the processing logic, which is hard-coded in silicon. With programmable switches, the programmer has the option to implement only those protocols that are needed.
• Differentiation: the customized protocol or feature implemented by the programmer need not be shared with the chip manufacturer.
• Enhanced performance: programmable switches do not introduce a performance penalty. On the contrary, they may produce better performance than fixed-function switches. Table III shows a comparison between a programmable switch and a fixed-function switch, reproduced from [51]. Note the enhanced performance of the former (e.g., maximum forwarding rate, latency, power draw).

Fig. 4. (a) Distribution of surveyed data plane research works per year. (b) Implementation platform distribution. The shares are calculated based on the studied papers in this survey.

C. P4 Language
P4 has a reduced instruction set and has the following goals:
• Reconfigurability: the parser and the processing logic can be redefined in the field.
• Protocol independence: the switch is protocol-agnostic. The programmer defines the protocols, the parser, and the operations to process the headers.
• Target independence: the underlying ASIC is hidden from the programmer. The compiler takes the switch's capabilities into account when turning a target-independent P4 program into a target-dependent binary.
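As an informal illustration of the PISA model of Section IV-A (parser, match-action stages, deparser) and of the protocol independence goal, the following Python sketch models a toy pipeline. It is only a sketch under stated assumptions: the two-field header layout `TOY_HEADER`, the `Table` class, and the `set_egress` action are hypothetical names introduced here, and the code is not P4 and does not reflect any particular target.

```python
from dataclasses import dataclass, field

# Hypothetical header layout for a toy protocol: (field name, width in bytes).
# The "switch" knows nothing about the protocol beyond this description.
TOY_HEADER = [("dst", 1), ("ttl", 1)]


def parse(packet: bytes, layout=TOY_HEADER):
    """Programmable parser: extract the fields named in `layout`; the rest is payload."""
    headers, offset = {}, 0
    for name, width in layout:
        headers[name] = int.from_bytes(packet[offset:offset + width], "big")
        offset += width
    return headers, packet[offset:]


@dataclass
class Table:
    """One match-action table: exact-match key -> (action, action data)."""
    key_field: str
    entries: dict = field(default_factory=dict)  # populated by the control plane

    def apply(self, headers: dict, metadata: dict) -> None:
        hit = self.entries.get(headers[self.key_field])
        if hit is not None:  # on a miss, the packet is left unchanged
            action, data = hit
            action(headers, metadata, data)


def set_egress(headers: dict, metadata: dict, port: int) -> None:
    """Action: choose the output port (action data) and decrement the TTL."""
    metadata["egress_port"] = port
    headers["ttl"] = max(headers["ttl"] - 1, 0)


def deparse(headers: dict, payload: bytes, layout=TOY_HEADER) -> bytes:
    """Programmable deparser: serialize the (possibly modified) headers."""
    out = b"".join(headers[name].to_bytes(width, "big") for name, width in layout)
    return out + payload


def pipeline(packet: bytes, stages):
    """Parser -> sequential match-action stages -> deparser."""
    headers, payload = parse(packet)
    metadata = {"egress_port": None}
    for table in stages:
        table.apply(headers, metadata)
    return metadata["egress_port"], deparse(headers, payload)


# Control plane: install one entry (key: destination 7, action data: port 3).
forwarding = Table(key_field="dst")
forwarding.entries[7] = (set_egress, 3)

port, out = pipeline(bytes([7, 64]) + b"hello", [forwarding])
```

In this sketch, changing `TOY_HEADER` and the installed entries changes what the pipeline parses and matches without touching `pipeline` itself, which loosely mirrors how a P4 program, rather than the switch, defines the protocols.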
The original specification of the P4 language was released in 2014, and is referred to as P4_14. In 2016, a new version of the language was drafted, which is referred to as P4_16. P4_16 is a more mature language which extended the P4 language to broader underlying targets: ASICs, Field-Programmable Gate Arrays (FPGAs), Network Interface Cards (NICs), etc.

V. METHODOLOGY AND TAXONOMY

This section describes the systematic methodology that was adopted to generate the proposed taxonomy. The results of this literature survey represent derived findings by thoroughly exploring more than 150 data plane-related research works starting from 2016 up to late 2020. Their distribution per year is summarized in Fig. 4 (a).

Fig. 4 (b) depicts the share of each implementation platform used in the surveyed papers, grouped by software (e.g., BMv2, PISCES), ASIC (e.g., Tofino, Cavium), NetFPGA (e.g., NetFPGA SUME), and SmartNICs (e.g., Netronome NFP). The graph shows that the vast majority of the works were implemented on software switches. Note that behavioral software switches (e.g., BMv2 [203]) are not suitable indicators of whether the program could run on a hardware target; they are typically used for prototyping ideas and to foster innovation. On the other hand, non-behavioral software switches (e.g., PISCES [204], derived from Open vSwitch (OVS) [205]) are production-grade and can be deployed in data centers.

Hardware targets constitute a smaller share of the platform distribution than software switches. A possible reason behind this is that the technology is still recent and targets are still not widely available for sale to the public. For example, to acquire a switch equipped with a Tofino chip (e.g., Edgecore Wedge100BF-32 [20]), and to get the development environment and the customer support, a Non-Disclosure Agreement (NDA) with Barefoot Networks should be signed. Additionally, the client should attend a training course (e.g., [206]) to understand the architecture and the specifics of the platform. This process is considered lengthy and costly, and not every institution is capable of affording it.

The proposed taxonomy is demonstrated in Fig. 5. The taxonomy was meticulously designed to cover the most significant works related to data plane programmability and P4. The aim is to categorize the surveyed works based on various high-level disciplines. The taxonomy provides a clear separation of categories so that a reader interested in a specific discipline can only read the works pertaining to the said discipline. The correctness of the taxonomy was verified by carefully examining the related work of each paper to correlate them into high-level categories. Each high-level category is further divided into sub-categories. For instance, various measurements works belong to the sub-category “Measurements” under the high-level category “Network Performance”.

Further, the survey compares the results and the features offered by programmable data plane approaches (intra-category), as well as with those of the contemporary and legacy ones. This detailed comparison is elaborated upon for each sub-category, giving the interested reader a comprehensive view of the state-of-the-art findings of that sub-category. Additionally, the survey presents various challenges and considerations, as
[Figure: taxonomy tree rooted at "Programmable Switches Literature". Sub-categories recoverable from the figure, with their references: Variations [52–57]; Congestion Control [63–68]; AQM [91–95]; QoS and TM [96–99]; Multicast [100–102]; Load Balancing [103–109]; Telecom Services [118–124]; Pub/Sub [125–128]; Consensus [129–136]; Miscellaneous [143–151]; Aggregation [152–155]; Heavy Hitter [158–164]; Anonymity [169–172]; Access Control [173–176]; Attacks and Defenses [177–188]; Troubleshoot [189–193].]
Fig. 5. Taxonomy of programmable switches literature based upon relevant, explored research areas.
well as some current and future trends that could be explored as future work.

VI. IN-BAND NETWORK TELEMETRY (INT)

Fig. 6. In-band Network Telemetry (INT).

Fig. 7. Example of how INT can be used to provide the path traversed by a packet in the network. The INT source inserts its label (S1) as well as the INT headers to instruct subsequent switches about the required operations (i.e., push their labels). Finally, switch S4 strips the INT headers from the packet and forwards them to a collector, while forwarding the original packet to the receiver.
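The path-tracing scenario of Fig. 7 can be sketched in a few lines of Python, using the INT roles described in this section (source, transit hop, sink, collector). This is a minimal sketch, not an implementation of the INT specification: the `Packet` class, the `"switch_id"` instruction name, and the use of a plain list as the collector are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Packet:
    payload: bytes
    # Hypothetical stand-ins for the INT header: the instruction list set by
    # the source, and the metadata stack that each hop pushes onto.
    int_instructions: Optional[list] = None   # None means "no INT header"
    int_metadata: list = field(default_factory=list)


def int_transit(pkt: Packet, switch_id: str) -> Packet:
    """Transit hop: read the instructions and push the requested metadata."""
    if pkt.int_instructions and "switch_id" in pkt.int_instructions:
        pkt.int_metadata.append(switch_id)
    return pkt


def int_source(pkt: Packet, switch_id: str) -> Packet:
    """Source: insert the instructions, then add its own label like any hop."""
    pkt.int_instructions = ["switch_id"]
    return int_transit(pkt, switch_id)


def int_sink(pkt: Packet, switch_id: str, collector: list) -> Packet:
    """Sink: add its label, report to the collector, strip the INT header."""
    int_transit(pkt, switch_id)
    collector.append(list(pkt.int_metadata))  # report sent to the collector
    pkt.int_instructions = None                # receiver sees the original packet
    pkt.int_metadata = []
    return pkt


collector = []
pkt = int_source(Packet(payload=b"data"), "S1")
for hop in ("S2", "S3"):
    pkt = int_transit(pkt, hop)
pkt = int_sink(pkt, "S4", collector)
```

After the packet leaves S4, the collector holds the ordered list of labels (S1 through S4) while the packet delivered to the receiver carries only its original payload, which is the transparency property attributed to the INT sink.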
Conventional monitoring and collecting tools and protocols (e.g., ping, traceroute, Simple Network Management Protocol (SNMP), NetFlow, sFlow) are by no means sufficiently accurate to troubleshoot the network, especially in the presence of congestion. These methods provide millisecond accuracy at best and cannot capture events that happen on a microsecond scale. Moreover, they cannot provide per-packet visibility across the network.

In-band Network Telemetry (INT) [207] is one of the earliest key applications of programmable data plane switches. It enables querying the internal state of the switch and provides fine-grained and precise telemetry measurements (e.g., queue occupancy, link utilization, queuing latency, etc.). INT handles events that occur on a microsecond scale, also known as microbursts. Collecting and reporting the network state is performed entirely by the data plane, without any intervention from the control plane. Due to the increased visibility achieved with INT, network operators are able to troubleshoot problems more efficiently. Additionally, it is possible to perform instant processing in the data plane after measuring telemetry data (e.g., reroute flows when a link is congested), without having to interact with the control plane. Fig. 6 shows an INT-enabled network. INT enables network administrators to determine the following:
• The path a packet took when traversing the network (see Fig. 7). Such information is difficult to learn using existing technologies when multi-path routing strategies (e.g., Equal-cost Multi-Path Routing (ECMP) [208], flowlet switching [209]) are used.
• The matched rules that forwarded the packets (e.g., ACL entry, routing lookup).

[...] was the main concern to monitor, the programmer inserts queue metadata and transit latency. An INT-enabled network has the following entities: 1) INT source: a trusted entity that inserts the initial instruction set specifying what metadata other INT-capable devices should add to the packet; 2) INT transit hop: a device adding its own metadata to an INT packet after examining the INT instructions inserted by the INT source; 3) INT sink: a trusted entity that extracts the INT headers in order to keep the INT operation transparent for upper-layer applications; and 4) INT collector: a device that receives and processes INT packets.

The location of an INT header in the packet is intentionally not enforced in the specifications document. For example, it can be inserted as a payload on top of TCP, UDP, and NSH, as a Geneve option on top of Geneve, and as a VXLAN payload on top of VXLAN.

A. Postcard-based Telemetry (PBT)
[Fig. 8 (image not recovered). Labels: Postcard-based Telemetry; Flow watchlist; Event detection.]

INT provides the exact forwarding path, the timestamp and latency at each network node, and other information. Such detailed information is derived by augmenting user packets with data collected by each switch. Postcard-based Telemetry (PBT) is an alternative to INT which does not modify user packets. Fig. 8 shows an example of PBT. As a user packet traverses the network, each switch generates a postcard and sends it to the monitor. The event that triggers the generation of the postcard is defined by the programmer, according to the application's need. Examples include start and/or end of a
TABLE IV
INT VARIATIONS COMPARISON
Variation | Name | Overhead reduction strategy | Metadata collection | Operator intervention | Implementation
[52] | NetVision | On-demand probing | Active (segment routing) | High; telemetry through queries | Mininet
[53] | N/A | Flow subset selection by the knowledge plane | Passive | Low; closed-loop network | Software (BMv2) w/ ONOS controller
[54] | sINT | Monitoring ratio adjustment based on network changes | Passive | Low; telemetry based on network behavior | Software (BMv2)
[55] | INTO | Telemetry orchestration based on heuristics | Passive | High; telemetry specified by operators | N/A
[56] | ML-INT | Per-flow packet subset selection through sampling | Passive | High; telemetry specified by operators | ASIC (Tofino) and SmartNIC (NFP-4000)
[57] | PINT | Telemetry encoding on multiple packets | Passive | High; telemetry through queries | ASIC (Tofino)
flow, sampling (e.g., one report per second), packet dropped by the switch, queue congestion, etc.

B. INT Variations
B.1. Background
Despite the improvements that INT brings compared to legacy monitoring schemes, it introduces bandwidth overhead when enabled unconditionally by network operators. In such scenarios, INT headers are added to every packet traversing the switch, increasing bandwidth overhead, which decreases the overall network throughput. To mitigate this limitation, conditional statements are included in the P4 program to send reports only when certain events occur (e.g., queue utilization exceeds a threshold). This solution requires network operators to adjust thresholds and parameters manually based on the usual network traffic patterns. Consequently, several variations of INT have been developed, aiming at customizing its functionalities and addressing its limitations. Mainly, recent works focus on minimizing the bandwidth overhead of INT by adjusting thresholds and parameters automatically, based on measured traffic patterns and the desired application type.

B.2. Literature Review
Liu et al. [52] proposed NetVision, a telemetry system that aims at minimizing the traffic overhead of INT by using probing. NetVision actively sends the appropriate amount and format of probe packets depending on the telemetry application (e.g., traffic engineering, network visualization). Hyun et al. [53] proposed an architecture for self-driving networks that uses INT to collect packet-level network telemetry, and Knowledge-Defined Networking (KDN) to add intelligence to network management, considering the collected telemetry data. KDN accepts the network information as input and generates policies to improve the network performance. Kim et al. [54] proposed selective INT (sINT), a scheme that dynamically adjusts the insertion frequency of INT headers. A monitoring engine observes changes in consecutive INT metadata and applies a heuristic algorithm to compute the insertion ratio. Marques et al. [55] described the orchestration problem in INT, which is associated with the optimal use of network resources for collecting the state and behavior of forwarding devices through INT. Niu et al. [56] proposed multilayer INT (ML-INT), a system that visualizes IP-over-optical networks in real time. The proposed system encodes INT headers in a subset of packets pertaining to an IP flow. The encoded headers contain metadata that describes statistics of electrical and optical network elements on the flow's routing path. Ben et al. [57] proposed Probabilistic INT (PINT), an approach that probabilistically adds telemetry information into a collection of packets to minimize the per-packet overhead associated with regular INT.

B.3. INT Variations, Comparison, and Discussions
Table IV compares the aforementioned INT variation solutions. The main motivation behind these solutions is that the majority of applications that leverage INT (e.g., congestion control, fast reroute) only require approximations of the telemetry data and therefore do not need to gather per-packet per-hop INT information. NetVision uses probing to reduce the overhead of INT. The main limitation of this approach is that probing might result in poor accuracy and timeliness as the probes might experience different network conditions than actual packets. All other works collect INT information passively. [53] and sINT select flows based on current network conditions, ML-INT uses a fixed sampling scheme to select a small portion of packets in a flow, and PINT uses a probabilistic approach to encode telemetry on multiple packets. Sampling and anomaly-based monitoring might lead to information loss since not all packets are being reported. Some solutions require manual intervention from the operators to configure the telemetry process. The simplicity of the configuration interface is vital to make the solution attractive to network operators. Finally, some solutions were implemented on software switches, while others were implemented on hardware. It is important to note that not all software implementations can fit into the pipeline of the hardware.

B.4. INT, PBT, and Traditional Telemetry Comparison
Table V compares INT, PBT, and traditional telemetry. INT has higher potential vulnerabilities than PBT, such as eavesdropping and tampering. Adding extra protective measures (e.g., encryption) is difficult on the fast data path. On the other hand, PBT packets tolerate additional processing to enhance security. The flow tracking process is simpler with INT than with PBT. The latter requires the server receiving INT reports (i.e., INT collector, explained in Section VI-C)
TABLE V
IN-BAND, POSTCARD-BASED, AND TRADITIONAL NETWORK TELEMETRY
to correlate multiple postcards of a single flow packet passing through the network, to form the packet history at the monitor. This process also adds delay in reporting and tracking.
Legacy schemes that rely on sampling and polling suffer from accuracy issues, especially when links are congested. INT, on the other hand, is push-based, has better accuracy, and is more granular (microsecond scale). Reports sent by an INT-capable device contain rich information (e.g., the path a packet took) that can aid in troubleshooting the network. Such visibility is minimal in legacy monitoring schemes.
Programmable switches permit reporting telemetry after the occurrence of specific events (e.g., congestion). Moreover, they provide flexibility in programming reactive logic that executes promptly in the data plane. One drawback of INT is that it imposes bandwidth overhead if configured to report for every packet; however, when event-based reports are considered, the bandwidth overhead significantly decreases.
C. INT Collectors
C.1. Background
An INT collector is a component in the network that processes telemetry reports produced by INT devices. It parses and filters metrics from the collected reports, then optionally stores the results persistently into a database. Since a large number of reports is typically produced in INT, having a high-performance collector is essential to avoid missing important network events. To this end, a number of research works focus on developing and enhancing the performance of INT collectors running on commodity servers.
C.2. Literature Review
IntMon [58] is an ONOS-based collector application for INT reports. It includes a web-based interface that allows controlling which flows to monitor and the specific metadata to collect. Another INT collector is the Prometheus INT exporter [59], which extracts information from every INT packet and pushes it to a gateway. A database server then periodically pulls information from the gateway. INTCollector [60] is a collector that extracts events, which are important network information, from INT raw data. It uses in-kernel processing to further improve the performance. INTCollector has two processing flows: the fast path, which processes INT reports and needs to execute quickly, and the normal path, which processes events sent from the fast path and stores information in the database. Deep Insight [61] is a proprietary solution provided by Barefoot Networks that leverages INT capabilities to provide services such as real-time anomaly detection, congestion analysis, packet-drop analysis, etc. Another proprietary solution is BroadView Analytics used on Broadcom Trident 3 devices by Broadcom [62].
Fig. 9. CPU efficiency with the three INT collectors. Source: INTCollector paper [60].
C.3. INT Collectors Comparison, Discussions, and Limitations
Fig. 9 and Table VI compare the aforementioned INT collectors. IntMon and Prometheus INT exporter were among the earliest collectors. Both have low processing rates since they are implemented without kernel nor hardware accelera-
TABLE VI
INT COLLECTORS COMPARISON
tion. Also, they are very limited with respect to the features they provide (e.g., lack of event detection, limited analytics, historical data unavailability, etc.). Prometheus INT exporter also suffers from the increased overhead of sending the data for every INT packet to the gateway, and from the potential loss of network events, as the database only stores the latest data pulled from the gateway. INTCollector, on the other hand, has a higher rate and uses the eXpress Data Path (XDP) [211] to accelerate the packet processing in the kernel space. It filters the data to be published based on significant changes in the network through its event detection mechanism. DeepInsight Analytics has a modular architecture and runs on commodity servers. It executes the Barefoot SPRINT data plane telemetry, which consists of a P4 program (INT.p4) encompassing intelligent triggers. It also provides open northbound RESTful APIs that allow customers to integrate their third-party network management solutions. DeepInsight Analytics is advanced with respect to the features it provides (real-time anomaly detection, congestion analysis, packet-drop analysis, etc.). However, it is a closed-source solution and lacks reports of performance benchmarks.
Fig. 9 demonstrates the CPU efficiency of three INT collectors (IntMon, Prometheus INT exporter, and INTCollector) [60]. IntMon has the lowest throughput, and is 57 times slower than Prometheus INT exporter. INTCollector, on the other hand, has the highest throughput and is 27 times faster than Prometheus INT exporter.
C.4. Collectors in INT and Legacy Monitoring Schemes Comparison
Generally, collectors used with both INT and legacy monitoring schemes run on general purpose CPUs, and hence, have comparable performance. INT produces excessive amounts of reports when compared with legacy monitoring schemes (e.g., NetFlow), and therefore, requires having a collector with high processing capability. INT-based collectors are typically accelerated with in-kernel fast packet processing technologies (e.g., XDP) and kernel-bypass packet processing frameworks (e.g., Data Plane Development Kit (DPDK)).
D. Summary and Lessons Learned
Legacy telemetry tools and protocols are not capable of capturing microbursts or providing fine-grained telemetry measurements. INT was developed to address these challenges; it enables the data plane developer to query the internal state of switches with high precision. Telemetry data are then embedded into packets and forwarded to a high-performance collector. The collector typically performs analysis and applies actions accordingly (e.g., informs the control plane to update table entries). Current research efforts mainly focus on developing variations of INT to decrease its telemetry traffic overhead, considering the overhead-accuracy trade-off. Other works aim at accelerating INT collectors to handle large volumes of traffic (in the scale of Kpps). Future work could possibly investigate further improvements for INT such as compressing packets’ headers, broadening coverage and visibility, enriching the telemetry information, and simplifying the deployment.
VII. NETWORK PERFORMANCE
Measuring and improving network performance is critical in today’s infrastructures. Low latency and high bandwidth are key requirements to operate modern applications that continuously generate enormous amounts of data [212]. Congestion control (CC), which aims at avoiding network overload, is critical to meet these requirements. Another important concept for expediting these applications is managing the queues that form in routers and switches through Active Queue Management (AQM) algorithms. This section explores the literature related to measuring and improving the performance of programmable networks.
A. Congestion Control (CC)
A.1. Background
One of the most challenging tasks in the Internet today is congestion control and collapse avoidance [213]. The difficulty in controlling the congestion is increasing due to factors such as high-speed links, traffic diversity and burstiness, and buffer sizes [63]. Today’s CC algorithms aim at shortening delays, maximizing throughput, and improving the fairness and utilization of network resources.
A tremendous amount of research work has been done on congestion control, including end-host algorithms such as loss-based CC algorithms (e.g., CUBIC [214], Hamilton TCP (HTCP) [215], etc.), model-based algorithms (e.g., Bottleneck Bandwidth and Round-trip Time (BBR) [216]), congestion-signalling mechanisms (e.g., Explicit Congestion Notification (ECN) [217]), data-center specific schemes (e.g., TIMELY [218], Data Center Quantized Congestion Notification (DCQCN) [219], Data Center TCP (DCTCP) [220], pFabric [221],
Performance-oriented Congestion Control (PCC) [222], etc.), and application-specific schemes (e.g., QUIC [223]).
With the advent of programmable data plane switches, researchers are investigating new methods to provide network-assisted congestion feedback for end-hosts.
A.2. Literature Review
Handley et al. [63] proposed NDP, a novel protocol architecture for datacenters that aims at achieving low completion latency for short flows and high throughput for longer flows. NDP avoids core network congestion by applying per-packet multipath load balancing, which comes at the cost of reordering. It also trims the payloads of packets, similar to what is done in Cut Payload (CP) [224], whenever the queues of the switches become saturated. Once the payload is trimmed, the headers are forwarded using high-priority queues. Consequently, a Negative ACK (NACK) is generated and sent through high-priority queues so that a retransmission is sent before draining the low-priority queue. Similarly, Feldmann et al. [66] proposed a method that uses network-assisted congestion feedback (NCF) in the form of NACKs generated entirely in the data plane. NACKs are sent to throttle elephant-flow senders in case of congestion. The method maintains three separate queues for mice flows, elephant flows, and control packets to ensure fair sharing of resources.
Li et al. [65] proposed High Precision Congestion Control (HPCC), a new CC mechanism that leverages INT-based data added by P4 switches to obtain precise link load information. HPCC computes accurate flow rate by using only one rate update, as opposed to legacy approaches that require a large number of iterations to determine the rate. HPCC provides near-zero queueing, while being almost parameterless. Fig. 10 shows the mechanism of HPCC. The switches add INT headers to every packet, and then the INT information is piggybacked into the TCP/RDMA Acknowledgement (ACK) packet. The end-hosts then use this information to adjust the sending rate through their smart Network Interface Controllers (NICs).
Fig. 10. HPCC: INT-based high precision congestion control.
Kfoury et al. [67] proposed a P4-based method to automate end-hosts’ TCP pacing. It supplies the bottleneck bandwidths and the number of elephant flows to senders so that they can pace their rates to safe targets, avoiding filling routers’ buffers.
Turkovic et al. [64] proposed a P4-based method that reroutes flows to backup paths during congestion. The system detects congestion by continuously monitoring the queueing delays of latency-critical flows. The same authors [68] proposed a method that separates the senders based on their congestion control algorithm. Each congestion control algorithm uses a separate queue in order to enforce the fairness among its competing flows.
A.3. CC Schemes Comparison, Discussions, and Limitations
Table VII compares the aforementioned CC schemes. NDP and NCF are similar in the sense that both use NACKs as congestion feedback. NDP avoids congestion by applying per-packet multipath load balancing. This approach works adequately with symmetric topologies, but fails when topologies are asymmetric (e.g., BCube, Jellyfish), especially during heavy network load. Another limitation of NDP is the excessive retransmissions produced by the server. NCF adopted the idea of packet trimming from NDP, but generates NACKs from the trimmed packet and sends them directly to the sender. Such an approach removes the receiver from the feedback loop, improving the sender’s reaction time. One limitation of NCF is that it requires operators to manually tune some of the predefined parameters (e.g., threshold, queue size, etc.). Additionally, NCF might disclose network congestion information, making it less attractive to operators. Finally, the authors of NCF claim that the approach works with both datacenters and Internet-wide scenarios. However, no implementation results were presented to evaluate the effectiveness of the solution.
HPCC leverages INT data to control network congestion. It enhances the convergence time by using a Multiplicative-Increase Multiplicative-Decrease (MIMD) scheme. Note that previous TCP variants use the Additive-Increase Multiplicative-Decrease (AIMD), which is conservative when increasing the rate, and hence has a slow convergence time. The reason AIMD schemes are slow is that they use a single-
TABLE VII
CONGESTION CONTROL SCHEMES COMPARISON
TABLE VIII
CONGESTION CONTROL SCHEMES. 1) PROGRAMMABLE SWITCHES (HPCC); 2) END-HOSTS; AND 3) LEGACY NETWORK-ASSISTED (ECN)
bit congestion information (packet loss, ECN). With HPCC, switches on the other hand require few parameters (e.g.,
end-hosts can perform aggressive increase as INT metadata en- marking threshold) to adapt to different network conditions.
compasses precise link utilization and timely queue statistics.
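The convergence argument can be made concrete with a small sketch. It is not HPCC's actual algorithm: it only counts how many update rounds an additive-increase sender and a multiplicative-increase sender need to reach an available rate, and then shows a single utilization-scaled update of the kind that precise link-utilization feedback makes possible. The step size, gain, target utilization, rates, and the function names are all assumptions chosen for illustration.

```python
def rounds_to_reach(target, start, increase):
    """Count update rounds until the rate first reaches the target."""
    rate, rounds = start, 0
    while rate < target:
        rate = increase(rate)
        rounds += 1
    return rounds


def additive_increase(step):
    # AIMD-style probing: add a fixed step per round.
    return lambda rate: rate + step


def multiplicative_increase(gain):
    # MIMD-style probing: scale the rate by a fixed gain per round.
    return lambda rate: rate * gain


def utilization_scaled_rate(rate, measured_utilization, target_utilization=0.95):
    """One-shot update (assumed simplification): scale the rate by the ratio
    of the target utilization to the utilization reported by the network."""
    return rate * target_utilization / measured_utilization


# Hypothetical numbers: sender starts at 1 Gbps, 100 Gbps is available.
START_GBPS, TARGET_GBPS = 1.0, 100.0
aimd_rounds = rounds_to_reach(TARGET_GBPS, START_GBPS, additive_increase(1.0))
mimd_rounds = rounds_to_reach(TARGET_GBPS, START_GBPS, multiplicative_increase(2.0))
print("additive rounds:", aimd_rounds)
print("multiplicative rounds:", mimd_rounds)
print("one-shot rate at 50% utilization:", utilization_scaled_rate(10.0, 0.5))
```

With single-bit feedback the sender only learns whether to keep probing, so it needs many rounds. With a utilization value in hand, one update can move the rate close to its target in either direction.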
HPCC demonstrated promising results with respect to latency,
B. Measurements
bandwidth, and convergence time. The authors however did
not evaluate the performance of HPCC with conventional B.1. Background
congestion control algorithms in the Internet (e.g., CUBIC, Gaining an overall understanding of the network behavior
BBR). Note that achieving inter-protocol fairness is essential is an increasingly complex task, especially when the size
so that the solution is adopted by operators. of the network is large and the bandwidth is high. Legacy
The method in [67] uses TCP pacing. Pacing decreases measurements schemes have accuracy limitations since they
throughput variations and traffic burstiness, and hence, minimizes rely on polling and sampling-based methods to gather traffic
queuing delays. However, this method works well only statistics. Typically, sampling methods have coarse sampling
in networks where the number of large-flow senders is small rates (e.g., one in every 30,000 packets) and polling methods
(e.g., in science Demilitarized Zone (DMZ) [212]). have large polling intervals. The literature [225] has shown that
P4Air, which applies traffic separation, demonstrated sig- such methods are only suitable for coarse-grained visibility.
nificant improvements in fairness compared to contemporary The accuracy limitation of sampling and polling techniques
solutions. However, it requires allocating a queue for each hampers the development of measurement applications. For
congestion control algorithm group (e.g., loss-based (Cubic), instance, it is not possible to accurately measure frequently
delay-based (TCP Vegas), etc.). Note that the number of changing TCP-specific fields such as congestion window,
queues is limited in switches, and production networks often receive window, and sending rate.
reserve them for other applications’ QoS [65]. Data streaming or sketching algorithms [226–230] were
Note that some schemes require modifying the end-hosts proposed to answer the limitation of sampling and polling.
(e.g., HPCC) while others are fully in-network (e.g., P4Air). They address the following problem: an algorithm is allowed
to perform a constant number of passes over a data stream
A.4. End-hosts, Programmable Switches, and Legacy Devices’
(input sequence of items) while using sub-linear space com-
CC Schemes
pared to the dataset and the dictionary sizes; desired statistical
Table VIII compares the CC schemes assisted by pro- properties (e.g., median) on the data stream are then estimated
grammable switches (e.g., HPCC) with end-hosts CC al- by the algorithm. The main problem with such algorithms is
gorithms (e.g., CUBIC) and legacy congestion signalling that they are tightly coupled to the metrics of interest. This
schemes (e.g., ECN). End-hosts CC infer congestion through means that switch vendors should build specialized algorithms,
packet drops and estimations (e.g., btlbw and Round-trip Time data structures, and hardware for specific monitoring tasks.
(RTT) estimation with BBR), which is not always sufficient to With the constraints of CPU and memory in networking
infer the existence of congestion. Legacy devices use classic devices, it is challenging to support a wide spectrum of
ECN to signal congestion so that end-hosts slow down their monitoring tasks that satisfy all customers. Legacy devices also
transmission rates. Classic ECN is limited as it only marks lack the capability of customizing the processing behavior so
a single bit to signal congestion, and is not aggressive nor that switches co-operate in the measurement process.
immediate. Programmable switches on the other hand use With the emergence of programmable switches, it is now
fine-grained prompt measurements to signal congestion (e.g., possible to perform fine-grained measurements in the data
INT metadata), which results in higher detection accuracy, plane at line rate. Moreover, data structures such as sketches
near-zero queueing delays, and faster convergence time. The and bloom filters can be easily implemented and customized
distributed nature of end-hosts CC schemes allows them to op- for specific metrics of interest. Programmable switches pave
erate without modifying the network infrastructure and without the way for new areas of research in measurements since not
tweaking parameters. ECN-enabled devices and programmable only they provide flexibility in inspecting with high accuracy
the traffic statistics, but also allow programmers to express reactive processing in real time (e.g., dropping a packet when a threshold is exceeded, as done in Random Early Detection (RED) [231]).
B.2. Literature Review
INT provides path-level metrics, with data similar to that of polling-based techniques. Note that the metrics themselves are fixed; for instance, it is possible to determine the flow-level latency, but not the latency variation (jitter) [71]. The fixed metrics of INT also prevent performing network-wide measurements; note that the INT standard specification document does not mention methods to aggregate metadata and perform complex analytics in the data plane.
This section focuses on techniques that provide measurements that go beyond the fixed metrics extracted from the internal state of the switch.
Generic Query-based Monitoring. Operators constantly change their monitoring specifications. Adding new monitoring requirements on the fixed-function switching ASIC is expensive. Recent work explored the idea of providing a query-driven interface that allows operators to express their monitoring requirements. The queries can then be converted into switch programs (e.g., P4) to be deployed in the network. Alternatively, the queries can be executed on the control plane considering the measured information extracted from the data plane.
A simplistic attempt is FlowRadar [69], a system that stores counters for all flows in the data plane with low memory footprint, then exports them periodically (every 10 ms) to a remote collector. Liu et al. [70] proposed Universal Monitoring (UnivMon), an application-agnostic monitoring framework that provides accuracy and generality across a wide range of monitoring tasks. UnivMon benefits from the granularity of the data plane to improve accuracy and runs different estimation algorithms on the control plane. Narayana et al. [71] presented Marple, a query language based on common query constructs (i.e., map, filter, group by). Marple allows performing advanced aggregation (e.g., moving average of latencies) at line rate in the data plane. Similarly, Sonata [79] provides a unified query interface that uses common dataflow operators, and partitions each query across the stream processor and the data plane. PacketScope [85] also uses dataflow constructs but allows querying the internal switch processing, both in the ingress and the egress pipelines.
Many of the previous works use the sketch data structure. The work in [88] extended the sketching approach used in previous works to support the notion of time. The motivation of this work is that recently captured traffic trends are the most relevant in network monitoring. Huang et al. [89] proposed OmniMon, an architectural design that coordinates flow-level network telemetry operations between programmable switches, end-hosts, and controllers. Such coordination aims at achieving high accuracy while maintaining low resource overhead. Chen et al. [90] proposed BeauCoup, a P4-based measurement system that handles multiple heterogeneous queries in the data plane. It offers a general query abstraction that counts the attributes across related packets identified by keys, and flags packets that surpass a defined threshold.
Other approaches such as Elastic sketch [73] perform measurements that are adaptive to changes in network conditions (e.g., bandwidth, packet rate and flow size distribution). *Flow [77] supports concurrent measurements and dynamic queries. Such an approach aims at minimizing the concurrency problems and the network disruption resulting from compiling excessive queries into the data plane. TurboFlow [78] aims at achieving high coverage without sacrificing information richness. Bai et al. [86] proposed FastFE, a system that performs traffic feature extraction by leveraging programmable data planes. Features are then used by traffic analysis and behavior detector ML techniques.
Performance Diagnosis Systems. Recent works are leveraging programmable data planes to diagnose network performance. The main motivation here is that fine-grained information can be monitored at line rate, mitigating the slow reaction to “gray failures” experienced by diagnosing end-hosts in legacy approaches.
Ghasemi et al. [72] proposed Dapper, an in-network TCP performance diagnosis system. Dapper analyzes packets in real time, and identifies and pinpoints the root cause of the bottleneck (sender, network, or receiver). Blink [82] also diagnoses TCP-related issues. In particular, it detects failures in the data plane based on retransmissions, and consequently, reroutes traffic. Other approaches attempt to diagnose performance degradation manifested by an increase of latency. Wang et al. [84] proposed SpiderMon, a system that performs network-wide performance degradation diagnosis. The key idea is to have every switch maintain fine-grained telemetry data for a short period of time, and upon detecting performance degradation (e.g., increased delay), the information is offloaded to a collector. Liu et al. [81] proposed a memory-efficient approach for network performance monitoring. This solution only monitors the top-k problematic flows.
Queue and Other Metrics Measurement. Programmable data planes allow querying the internal state of the queue with fine-grained visibility. Recent works leveraged this feature to provide better queueing information which can be used by various applications (e.g., AQMs, congestion control, etc.). Chen et al. [80] proposed ConQuest, a P4-based queue measurement solution that determines the size of flows occupying the queue in real time, and identifies flows that are grabbing a significant portion of the queue. Joshi et al. [75] proposed BurstRadar, a system that uses programmable switches to monitor microbursts in the data plane. Microbursts are events of sporadic congestion that last for tens or hundreds of microseconds. Microbursts increase latency, jitter, and packet loss, especially when links’ speeds are high and switch buffers are small.
Other works enabled measuring further metrics. For instance, Ding et al. [83] proposed P4Entropy, an algorithm to estimate network traffic entropy (Shannon entropy) in the data plane. Tracking entropy is useful for calculating traffic distribution in order to understand the network behavior. Another example is the system proposed by Chen et al. [87] which passively
TABLE IX
MEASUREMENTS SCHEMES COMPARISON
measures the RTT of TCP traffic in ISP networks. RTT measurement is important for detecting spoofing and routing attacks, ensuring Service Level Agreements (SLAs) compliance, measuring the Quality of Experience (QoE), improving congestion control, and many others.
B.3. Measurements Schemes Comparison, Discussions, and Limitations
Table IX compares the aforementioned measurements schemes.
Generic Query-based Monitoring. Some schemes (e.g., Sonata, FlowRadar, UnivMon) performed approximations of the metrics by using probabilistic data structures (e.g., sketch, bloom filter, etc.), sampling methods, and top-k counting. In addition, some focused on a subset of traffic by leveraging event matching techniques. Such techniques are primarily used to achieve high resource efficiency (i.e., low memory footprint), but cannot achieve full accuracy. On the other hand, systems like OmniMon carefully coordinate the collaboration among different types of entities in the network. Such coordination results in efficient resource utilization and full accuracy. OmniMon follows a split-merge strategy where the split operation decomposes telemetry operations into partial operations and schedules them among the entities (switches, end-hosts, and controller), and the merge operation coordinates the collaboration among these entities. The idea is to leverage the strength of the data plane in the switches and end-hosts (i.e., per-flow measurements with high accuracy) and the control plane (i.e., network-wide collaboration). OmniMon also ensures consistency through a synchronization mechanism and
accountability through a system of linear equations considering Size (MSS), sender’s reaction time (time between received
packet loss and other data center characteristics. Results show ACK and new transmission), loss rate, latency, congestion
that OmniMon reduces the memory by 33%-96% and the window (CWND), receiver window (RWND), and delayed
number of actions by 66%-90% when compared to state-of- ACKs. Based on the inferred variables, Dapper can identify
the-art solutions. the root cause of the bottleneck. Similarly, the authors in
Another criterion that differentiates the measurements [81] monitored conditions such as retransmissions, packet
schemes is whether there are computations being performed loss, round-trip-time, out-of-order packets to identify the top-k
outside the data plane. Most of the systems use the control problematic flows. Furthermore, Blink detects failures based
plane or external servers to perform complex computations on the predictable behavior of TCP, which retransmits packets
since the data plane has limited support to complex arithmetic at epochs exponentially spaced in time, in the presence of
functions. While some systems (e.g., BeauCoup) do not re- failure. Other schemes (i.e., SpiderMon) identify failures based
quire an external computation device, they often support less on the increase of latency.
measurement operations. Some schemes use reactive processing to mitigate the net-
The selection of the data structure to be used in the data work performance issue. For instance, Blink promptly reroutes
plane strongly affects the measurement features supported traffic whenever failure signals are generated by the data
by a certain scheme. For instance, the goal of BeauCoup plane, while SpiderMon limits the sending rate of the root
is to enable simultaneous distinct counting queries; for such cause hosts.
task, the authors based their design on the coupon-collection Finally, it is worth mentioning that some systems (e.g.,
problem [232], which computes the number of random draws Blink, Dapper) considered traces from real-world captures
from n coupons such that all coupons are drawn at least such as the ones provided by CAIDA for evaluation. Using
once. For example, if the threshold of distinct destination IPs real-world traces gives more credibility to the proposed solu-
for detecting superspreaders is 130, instead of recording all tion.
distinct destination IPs, 32 coupons are defined. Consequently,
Queue and other Metrics Measurement. Understanding
the destination IPs of incoming packets are mapped to those
the occupancy of the queue is useful for use cases such
32 coupons. While this data structure uses less memory than
as mitigating congestion-based attacks, avoiding conflicting
the other state-of-the-art measurement sketches, it is limited
workloads, implementing new AQMs, optimizing switch con-
to specific objectives (distinct counting). Other works (e.g.,
figurations, debugging switch implementation, off-path mon-
UnivMon) focused on generalizing the measurement scenarios,
itoring of queues in legacy devices, etc. ConQuest performs
and hence, used universal sketches as data structures.
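The relation between the threshold of 130 and the 32 coupons in the superspreader example can be checked numerically. The sketch below is a simplification and does not reproduce BeauCoup's design: it assumes every distinct item is mapped to one of n coupons uniformly at random, computes the classic coupon-collector expectation n * H_n, and confirms it with a short simulation that keeps only an n-bit bitmap. The function names, seed, and trial count are illustrative assumptions.

```python
import random


def expected_distinct_items(num_coupons):
    """Expected number of uniformly random draws needed to see every coupon
    at least once (coupon-collector result: n * H_n)."""
    return num_coupons * sum(1.0 / k for k in range(1, num_coupons + 1))


def draws_until_all_collected(num_coupons, rng):
    """Simulate distinct items, each mapped to one uniformly random coupon.
    Only a bitmap of collected coupons is stored, never the items."""
    collected = 0
    full = (1 << num_coupons) - 1
    draws = 0
    while collected != full:
        collected |= 1 << rng.randrange(num_coupons)
        draws += 1
    return draws


expected = expected_distinct_items(32)
rng = random.Random(7)  # fixed seed only to make the run repeatable
trials = [draws_until_all_collected(32, rng) for _ in range(2000)]
average = sum(trials) / len(trials)
print("expected distinct items for 32 coupons:", round(expected, 2))
print("simulated average:", round(average, 1))
```

Under these assumptions, collecting all 32 coupons takes about 130 distinct items on average, which is why a 32-bit state can stand in for a list of 130 destination IPs. The price is that the answer is probabilistic rather than exact.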
queue measurements and identifies flows depending on the
Qiu et al. [88] focused on capturing traffic trends that are the purpose (e.g., detecting bursty connections). It maintains
most relevant in network monitoring and attacks’ detection. compact snapshots of the queue, updated on each incoming
The notion of time is not supported by native streaming packet. The snapshots are then aggregated in a round-robin
algorithms. For instance, count-min sketch, which is a data fashion to approximate the queue occupancy. Afterwards, it
structure that uses a constant amount of memory to record data, cleans the previous snapshots to reuse them for further packets.
is oblivious to the passage of time. Existing solutions that Similarly, BurstRadar detects microbursts, which can increase
consider recency are easily implemented on software, but not latency, jitter, and packet loss, especially when links’ speeds
on programmable ASICs. For example, resetting a sketch after are high and switch buffers are small. It is almost impossible
a timer expires requires iterating over the elements in the to detect microbursts in legacy switches which use sampling
sketch, an operation that cannot be implemented in the data and polling-based techniques. BurstRadar detects microbursts,
plane due to the lack of loops. Likewise, creating multiple and captures a snapshot of the telemetry information of all
sketches requires additional stages, which are limited in the the involved packets. Afterwards, an analysis is conducted
hardware. Time-adaptive sketches utilize the idea of Dolby on the snapshot to identify the microburst-contributing flow
noise reduction [233, 234]; a pre-emphasis function inflates and the burst characteristics. Note that BurstRadar does not
the update when a new key is inserted and a de-emphasis support measuring the queues of legacy devices passively, but
function restores the original value. This mechanism ages the ConQuest does. In addition, BurstRadar performs the analysis
old events over time, and therefore, improves the accuracy on the control plane, while ConQuest uses the data plane for
of recent events. The authors implemented the pre-emphasis analysis.
function in the data plane using simple bit shifts, and the de-
emphasis function in the control plane. B.4. In-Network versus Legacy Measurements
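The pre-emphasis and de-emphasis idea behind time-adaptive sketches can be illustrated with a small count-min sketch. This is a sketch under stated assumptions, not the implementation from [88]: the pre-emphasis function is taken to be 2^epoch so that it reduces to a bit shift, the epoch counter is hypothetical, and de-emphasis is applied at query time as a stand-in for the control plane. The class and method names are made up for the example.

```python
import hashlib


class TimeAdaptiveCountMin:
    """Count-min sketch whose updates are inflated by 2**epoch (pre-emphasis)
    and whose estimates are deflated at query time (de-emphasis)."""

    def __init__(self, rows=3, cols=64):
        self.rows, self.cols = rows, cols
        self.table = [[0] * cols for _ in range(rows)]

    def _index(self, row, key):
        # One salted hash per row; any family of independent hashes would do.
        digest = hashlib.blake2b(
            key.encode(), digest_size=8, salt=bytes([row])
        ).digest()
        return int.from_bytes(digest, "big") % self.cols

    def update(self, key, epoch):
        weight = 1 << epoch  # pre-emphasis as a simple bit shift
        for row in range(self.rows):
            self.table[row][self._index(row, key)] += weight

    def query(self, key, epoch):
        raw = min(
            self.table[row][self._index(row, key)] for row in range(self.rows)
        )
        return raw / (1 << epoch)  # de-emphasis restores the current scale


sketch = TimeAdaptiveCountMin()
for _ in range(8):
    sketch.update("old-flow", epoch=0)
for _ in range(8):
    sketch.update("new-flow", epoch=3)
old_estimate = sketch.query("old-flow", epoch=3)
new_estimate = sketch.query("new-flow", epoch=3)
print("old flow:", old_estimate, "new flow:", new_estimate)
```

Both flows sent eight packets, yet the older flow is reported with a much smaller weight, and no loop over the table was needed to age it. A practical design would also have to bound counter growth, which this example ignores.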
Finally, some systems considered network-wide monitoring,
Fig. 11 compares the legacy measurements to those con-
while others only restricted their capabilities to local per-
ducted on programmable switches. There are two main
switch measurements. Network-wide measurement is essential
classes of legacy measurements techniques. First, there are
and can significantly improve the visibility of traffic, as
techniques that rely on polling and sampling (e.g., Net-
discussed in Section XIII-D.
Flow). The differences between in-network measurements and
Performance Diagnosis Systems. Some performance diag- polling/sampling-based schemes are closely related to the dif-
nosis schemes restricted their scope to troubleshooting TCP. ferences between legacy measurements and INT (see Table V).
For instance, Dapper infers sending rate, Maximum Segment For instance, the granularity of the measurements conducted in
Application-specific
computation
Configure
Control Plane Control Plane
Data structures
Flow reports
(e.g., Sketch)
Traffic Traffic
(a) (b)
Fig. 11. (a) Traditional measurements with sampling/polling. The switch uses sampling and polling protocols (e.g., NetFlow, SNMP) to generate fixed network flow records. Instead of collecting every packet, sampling collects only one out of every N packets. Records are then exported to an external server for further analysis. (b) Measurements with programmable switches (e.g., UnivMon [70]). The switch runs a universal algorithm over a universal data structure (e.g., universal sketch). The control plane then estimates a wide range of metrics for various applications. Note that this is not the only possible design for measurement tasks with programmable switches. The programmer has the flexibility to use customized algorithms that run at line rate in the data plane. Such algorithms can leverage various data structures in the P4 program (e.g., sketch, bloom filter) to store flow statistics. The switch then pushes statistics reports to the control plane for further analysis and reactive processing.
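To make the data structures named in Fig. 11(b) more concrete, the following is a minimal Python sketch of a count-min sketch that keeps approximate per-flow packet counts. It is an illustration only: a P4 target would use register arrays and hardware hash units, and the class name, the default sizes, and the hash choice (BLAKE2b with a per-row salt) are assumptions made for this example rather than details taken from the surveyed systems.

```python
import hashlib


class CountMinSketch:
    """Approximate per-key counters in depth x width cells (illustrative only)."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One independent hash per row, obtained by salting with the row number.
        digest = hashlib.blake2b(
            key.encode(), digest_size=8, salt=bytes([row])
        ).digest()
        return int.from_bytes(digest, "big") % self.width

    def update(self, key, count=1):
        # Every packet increments one counter in each row.
        for row in range(self.depth):
            self.rows[row][self._index(row, key)] += count

    def estimate(self, key):
        # Collisions can only inflate a counter, so the minimum is the best guess.
        return min(
            self.rows[row][self._index(row, key)] for row in range(self.depth)
        )


sketch = CountMinSketch(width=64, depth=3)
for _ in range(50):
    sketch.update("10.0.0.1->10.0.0.2")
heavy_flow_estimate = sketch.estimate("10.0.0.1->10.0.0.2")
```

Because counters are shared between flows, the estimate never undercounts a flow but may overcount it; widening the rows reduces that error at the cost of switch memory.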
the data plane is much higher than those collected in traditional measurements (e.g., NetFlow). Further, it is not possible to conduct event-based monitoring in legacy approaches, whereas with in-network measurements, the programmer has the flexibility of customizing the monitoring based on conditions and thresholds. Second, there are techniques that rely on sketching or streaming algorithms to estimate the metric of interest. Such methods are tightly coupled with the metric, which forces hardware vendors to invest time and effort in building customized algorithms and data structures that might not be used by various customers. Moreover, with the constraints of routers and switches, it is not possible to implement a variety of monitoring tasks while still supporting the standard routing/switching functionalities. Therefore, such approaches are not scalable for the long run.
With programmable switches, it is possible to customize the monitoring tasks by implementing customized sketching/streaming algorithms as P4 programs. This advantage improves scalability as the operator can always modify the algorithms whenever needed.
C. Active Queue Management (AQM)
C.1. Background
A fundamental component in network devices is the queue, which temporarily buffers packets. As data traffic is inherently bursty, routers have been provisioned with large queues to absorb this burstiness and to maintain high link utilization. The majority of the delay encountered in a communication session is a result of large backlogs formed in queues. Legacy devices are limited in the visibility of the queue as they provide little or no insight about which flows are occupying or sharing the queue [80]. Consequently, researchers have been investigating queue management algorithms to shorten the delay and mitigate packet losses, while providing fairness among flows.
AQM is a set of algorithms designed to shorten the queueing delay by preventing buffers on devices from becoming full. The undesirable latency that results from a device buffering too much data is known as "Bufferbloat". Bufferbloat not only increases the end-to-end delay, but also decreases the throughput and increases the jitter of a communication session. Modern AQMs help in mitigating the bufferbloat problem [235–238]. Unfortunately, modern AQMs are typically not available in state-of-the-art network equipment; for instance, Controlled Delay (CoDel) AQM, which was proposed in 2013, and was proven in the literature to be effective in mitigating Bufferbloat [239], is still not available in most network equipment. With programmable switches, it is now possible to implement AQMs as P4 programs, which not only accelerates support for new AQMs, but also provides means to customize their parameters programmatically in response to network traffic. Moreover, programmable switches foster innovation on newer AQMs that can be easily implemented and rapidly tested.
C.2. Literature Review
Kundel et al. [91] implemented the CoDel queueing discipline on a programmable switch. CoDel eliminates Bufferbloat, even in the presence of large buffers [240]. Sharma et al. [92] proposed Approximate Fair Queueing (AFQ), a mechanism built on top of programmable switches that approximates fair queueing at line rate. Fair Queueing (FQ) aims at fairly dividing the bandwidth allocation among active flows. Laki et al. [93] described an AQM evaluation testbed with P4 in a demo paper. The authors tested the framework with two AQMs: Proportional Integral Controller Enhanced (PIE) and RED. Mushtaq et al. [241] approximated Shortest Remaining Processing Time (SRPT). Papagianni et al. [94] implemented the Proportional Integral PI2 AQM on a programmable switch. PI2 is an extension of PIE AQM to support coexistence between classic and scalable congestion controls in the public Internet. Kumazoe et al. [95] implemented the MTQ/QTL scheme on P4.
C.3. AQM Schemes Comparison, Discussions, and Limitations
Table X compares the aforementioned AQM schemes. Some schemes require tuning a number of parameters and thresholds
TABLE X
AQM SCHEMES COMPARISON
Scheme | Name | Idea | Params & thresholds | Multiple queues | Data structure | Implementation
[91] | P4-CoDel | Implementation of CoDel on P4 | 2 | × | Registers | BMv2
[92] | AFQ | Approximate fair queueing in the switch | 4 | ✓ | Count-min sketch | Cavium OCTEON
[93] | N/A | Evaluation testbed for PIE and RED | RED 1, PIE 5 | × | Registers | BMv2
[94] | PI2 for P4 | Implementation of PI2 on P4 | 3 | × | Registers | BMv2
[95] | MTQ/QTL | Implementation of MTQ/QTL on P4 | 3 | × | Registers | BMv2
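As an illustration of why a scheme such as CoDel can operate with only two parameters (Table X), the following Python sketch models a simplified CoDel-style drop decision taken when a packet leaves the queue. It assumes the commonly cited defaults of a 5 ms target and a 100 ms interval, omits several refinements of the full algorithm, and is not the P4 implementation of [91]; the class and attribute names are hypothetical.

```python
import math


class CoDelSketch:
    """Simplified CoDel-style controller; all times are in milliseconds."""

    def __init__(self, target_ms=5.0, interval_ms=100.0):
        self.target_ms = target_ms        # acceptable standing queue delay
        self.interval_ms = interval_ms    # how long delay must stay above target
        self.first_above_ms = None        # earliest time dropping may begin
        self.dropping = False
        self.count = 0                    # drops in the current dropping episode
        self.drop_next_ms = 0.0

    def should_drop(self, now_ms, sojourn_ms):
        if sojourn_ms < self.target_ms:
            # Delay is acceptable again: forget the episode.
            self.first_above_ms = None
            self.dropping = False
            return False
        if self.first_above_ms is None:
            # First packet above target: start the observation window.
            self.first_above_ms = now_ms + self.interval_ms
            return False
        if not self.dropping:
            if now_ms < self.first_above_ms:
                return False
            # Delay stayed above target for a whole interval: start dropping.
            self.dropping = True
            self.count = 1
            self.drop_next_ms = now_ms + self.interval_ms / math.sqrt(self.count)
            return True
        if now_ms >= self.drop_next_ms:
            # Drop again, with the spacing shrinking as 1/sqrt(count).
            self.count += 1
            self.drop_next_ms += self.interval_ms / math.sqrt(self.count)
            return True
        return False


codel = CoDelSketch()
# One packet dequeued every 10 ms, each having waited 20 ms in the queue.
drop_times = [t for t in range(0, 300, 10) if codel.should_drop(t, 20.0)]
```

The logic uses only comparisons, a counter, and basic arithmetic; the square root is the one operation that a switch implementation would typically need to approximate, for example with a lookup table.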
so that they operate well in certain network conditions. It is worth mentioning that a scheme becomes hard to manage and less autonomous when the number of parameters and thresholds is high.
Some schemes are simple to implement in the data plane. CoDel’s algorithm can be easily expressed in the data plane as it consists of comparisons, counting, basic arithmetic, and dropping packets. Similarly, PI2 is simple to implement as it is mostly based on basic bit manipulations. FQ algorithms, on the other hand, are difficult to implement on hardware as they require complex flow classification, per-packet scheduling, and buffer allocation. Such requirements make FQ algorithms expensive to implement on high-speed devices. AFQ aims at approximating fair queueing by using programmable switches’ features such as mutating switch state, performing basic calculations, and selecting the egress queue of a packet. AFQ’s operations can be summarized as follows: 1) per-flow state, which includes the number and timing information of the previous packet pertaining to that flow, is approximated; 2) the position of each packet in the output schedule is determined; 3) the egress queue to use is selected; and 4) the packet is dequeued based on the approximate sorted order. Note that AFQ uses a probabilistic data structure (count-min sketch) since it only approximates the states, and uses multiple queues in its implementation.
C.4. AQMs on Programmable Switches and Fixed-function Devices
Inventing novel AQMs that control queueing delay, mitigate bufferbloat, and achieve fairness with different network conditions (e.g., short/long RTTs, lossy networks, WANs) is an active research area. Typically, new AQMs are implemented and tested in software (e.g., as a Linux queueing discipline (qdisc) used with traffic control (tc)), which is limited when the objective is to deploy the AQMs on production networks. With programmable switches, AQMs are implemented in P4 programs, which fosters innovation and enhances testing with production networks. Additionally, operators can create their own customized AQMs that perform efficiently with their typical network traffic. Historically, deploying AQMs on network devices is a lengthy and costly process; once an effective AQM is published and thoroughly tested, equipment vendors start investigating whether it is feasible to implement it on future devices. Such a process might take years to finish, and by then, new network conditions evolve, requiring new AQMs. With programmable switches, this process is cost-efficient and relatively fast (can be completed in weeks). Table XI compares the features of AQMs on programmable switches versus fixed-function devices.
TABLE XI
AQMS ON PROGRAMMABLE AND FIXED-FUNCTION SWITCHES
Feature | Programmable switches | Fixed-function devices
Innovation | Higher; new AQMs are expressed in P4 programs | Lower; only developed by equipment vendors
Exclusivity | Higher; operators can implement their own custom AQMs without disclosing technical information | Lower; most supported AQMs are standards
Readiness | Faster (weeks to months); once an AQM is expressed in P4, it can be immediately available | Slower (years)
Cost | Lower | Higher
Tweakable | Higher; even standard AQMs can be customized and tweaked based on network traffic | Lower; only through parameters
D. Quality of Service and Traffic Management
D.1. Background
Meeting diverse Quality of Service (QoS) requirements is a fundamental challenge in today’s networks. Traffic Management (TM) provides access control that guarantees that the traffic admitted to the network conforms to the defined QoS specifications. TM often regulates the rate of a flow by applying traffic policing. The new generation of programmable switches facilitates traffic policing and differentiation by allowing network operators to express their logic in a programming language (P4). This section explores the works on programmable switches that involve QoS and TM.
D.2. Literature Review
Bhat et al. [96] described a system where programmable switches route traffic intelligently by inspecting application headers (layer-5) to improve users’ QoE. Lee et al. [97] implemented a traffic meter based on Multi-Color Markers (MCM) on programmable switches to support multi-tenancy environments. Tokmakov et al. [98] proposed RL-SP-DRR, a traffic management system that combines Rate-limited Strict Priority (RL-SP) and Deficit round-robin (DRR) to achieve low latency and fair scheduling while improving link utilisation, prioritization and scalability. Chen et al. [99] proposed a bandwidth manager for end-to-end QoS provisioning using programmable switches. The system classifies packets into
TABLE XV
LOAD BALANCING SCHEMES COMPARISON
Scheme | Name | Stateful | Centralized | Active probing | MP-TCP support | Failure handling | Platform (Hardware / Software)
[103] HULA × ×
[104] SilkRoad × × ×
[105] MP-HULA ×
[106] Beamer × ×
[108] Dash × ×
[109] Contra × ×
TABLE XVI
SWITCH-BASED AND SERVER-BASED LOAD BALANCERS
Feature | Switch-based | Server-based
Throughput | Higher (e.g., SilkRoad with 6.4Tbps ASIC can achieve about 10Gpps) | Lower (e.g., 9Mpps per core [247])
Latency | Lower; sub-microseconds from ingress to egress | Higher; additional latency when processing new requests∗
Scalability | Lower; connection is stored in limited SRAM | Higher
Policy flexibility | Limited; hash-based flow assignments may lead to imbalance | Flexible policies can be written in software
System complexity | Simpler; it requires a customized parser, match-action tables | More complex; it requires coordination with routers, tunneling (e.g., GRE encapsulation)
∗ After the first packet is processed, no additional latency is observed [247].
balancers, which makes it more scalable than switch-based load balancing schemes. Moreover, software load balancers are more flexible in assigning flow identification policies. Finally, switch-based schemes are simpler as the whole logic is expressed in a program (customized parser and match-action tables), whereas server-based balancers might require additional coordination with routers (e.g., tunneling).
B. Caching
B.1. Background
Modern applications (e.g., online banking, social networks) rely on key-value stores. For example, retrieving a single web page may require thousands of storage accesses. As the number of users increases to millions or billions, the need for higher throughput and lower latency grows. A challenge of key-value stores is the non-uniform access of items: popular items, referred to as “hot items”, receive more queries than others. Furthermore, popular items may change rapidly due to popular posts, limited-time offers, and trending events [110]. Fig. 13(a) shows a typical skewed key-value store system, which presents load imbalance among servers storing key-value objects. The performance of such systems may present reduced throughput and long latencies. For example, server 2 may add substantial latency as a result of storing a hot item and being over-utilized, while server 1 is under-utilized.
B.2. Literature Review
Fig. 13(b) illustrates a system where a programmable switch receives a query before forwarding it to the server storing the key. The switch is used as an “in-network cache”, where the hottest items are stored. When a read request for a hot key is received, the switch consults its local table and returns the value corresponding to that key. If the key is missed (i.e., the case for non-hot keys) then the switch forwards the request to the appropriate server. When a write request is received, the switch checks its local table and evicts the entry if the key is stored there. It then forwards the request to the appropriate backend server. A controller periodically collects statistics to update the cache with the current hot items.
A noteworthy approach is NetCache [110], an in-network architecture that uses programmable switches to store hot items and balance the load across storage nodes. Similarly, Liu et al. [112] proposed IncBricks, a caching fabric for key-value pairs with basic computing primitives in the data plane. Cidon et al. [111] proposed AppSwitch, a packet switch that performs load balancing for key-value storage systems. Signorello et al. [113] developed a preliminary implementation of a Named Data Networking (NDN) instance using P4. Grigoryan et al. [114] proposed a system that caches Forwarding Information Base (FIB) entries (the most popular entries) in fast memory in order to minimize the TCAM consumption and to avoid the TCAM overflow problem. Zhang et al. [115] proposed B-Cache, a framework that bypasses the original processing pipeline to improve the performance of caching. Vestin et al. [116] proposed FastReact, a system that enables caching for industrial control networks. Finally, Woodruff et al. [117] proposed P4DNS, an in-network cache for Domain Name System (DNS) entries.
B.3. Caching Schemes Comparison, Discussions, and Limitations
Table XVII compares the aforementioned caching schemes. Schemes can be separated based on the type of data they aim to cache. For instance, NetCache, AppSwitch, and IncBricks cache arbitrary key-value pairs, while NDN.p4 caches only NDN names. Further, some schemes (e.g., NetCache, P4DNS, etc.) automatically index entries to be cached based on their access frequencies, while others require the operators to manually specify the entries. Another important distinction is whether the scheme uses a custom protocol or not. For instance, switches in NetCache parse a custom protocol that carries key-value pairs, while switches in P4DNS parse standard DNS headers.
The main motivation of switch-based caching schemes is to address the performance issues of server-based schemes.
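The read and write handling described in the literature review for switch-based caching (serve hot keys from the switch table, forward misses to the server, evict on writes, and let a controller refresh the hot set from statistics) can be modelled in a few lines of Python. This is a behavioural sketch under simplifying assumptions, not the design of NetCache or any other surveyed system; `BackendServer`, `SwitchCache`, and `controller_refresh` are hypothetical names.

```python
class BackendServer:
    """Stand-in for a storage server holding the full key-value store."""

    def __init__(self):
        self.store = {}

    def get(self, key):
        return self.store.get(key)

    def put(self, key, value):
        self.store[key] = value


class SwitchCache:
    """Behavioural model of a switch that caches the hottest keys."""

    def __init__(self, server, capacity=2):
        self.server = server
        self.capacity = capacity   # switch memory is small, so few entries fit
        self.table = {}            # hot key -> value (the local table)
        self.hits = {}             # per-key query statistics

    def read(self, key):
        self.hits[key] = self.hits.get(key, 0) + 1
        if key in self.table:
            return self.table[key], "switch"      # answered in the data plane
        return self.server.get(key), "server"     # miss: forward to the server

    def write(self, key, value):
        self.table.pop(key, None)   # evict a possibly stale cached entry
        self.server.put(key, value)  # then forward the write to the server

    def controller_refresh(self):
        # Periodic control-plane step: install the currently hottest keys.
        hottest = sorted(self.hits, key=self.hits.get, reverse=True)
        self.table = {k: self.server.get(k) for k in hottest[: self.capacity]}
        self.hits.clear()


server = BackendServer()
for k, v in [("a", "A"), ("b", "B"), ("c", "C")]:
    server.put(k, v)
switch = SwitchCache(server, capacity=2)
```

Evicting on write keeps the model simple and avoids serving stale values; a real system has to decide how and when the updated value is reinstalled in the switch.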
Fig. 13. (a) Traditional software-based caching. (b) Switch-based caching.
For instance, NetCache, which efficiently detects hot key-value items and serves them in the data plane, was capable of handling two billion queries per second for 64,000 items with 16-bytes keys and 128-bytes values. Compared to commodity servers, NetCache improves the throughput by 3-10 times and reduces the latency of 40% of queries by 50%. In addition to the throughput, the latency of the queries is also a major metric to improve. In IncBricks, the latency of requests is reduced by over 30% compared to client-side caching systems.
Similarly, B-Cache aims at improving the performance by caching into a single cache match-action table. The motivation behind B-Cache is that the performance of the data plane
decreases significantly as the complexity of the P4 program and the packet processing pipeline grows. When a match occurs, the packet bypasses the original pipeline, making the performance of caching independent of the pipeline length. Note however that this system was evaluated on a software switch (BMv2), and it is not certain whether this design is always feasible on hardware targets.
TABLE XVII
CACHING SCHEMES COMPARISON
Other caching schemes are more targeted for specific applications. As examples, FastReact enables caching for industrial control networks, while P4DNS caches DNS entries. Note that some schemes require a custom protocol to operate (e.g., NetCache), while others (e.g., P4DNS) work with standard protocols (e.g., DNS). Finally, some schemes offer multi-level caching (e.g., level-1 and level-2 caches).
B.4. Comparison between Switch-based and Server-based Caching
Table XVIII compares the switch-based versus server-based caching schemes. The throughput when data is cached on the switch is an order of magnitude larger than that of general purpose servers. The latency is also reduced by 50%, and most of it is induced by the client. Switch-based caching solves the load imbalance problem and is simpler as the whole logic is expressed in a program. Server-based caching, on the other hand, is more flexible regarding cache policies, as well as keys, values, and tables’ sizes.
TABLE XVIII
SWITCH-BASED AND SERVER-BASED CACHING
Feature | Switch-based | Server-based
Throughput | Higher (e.g., NetCache, 2BQPS¹) | Lower; 0.2BQPS
Latency | Lower (e.g., NetCache, 7 μs, mostly caused by the client) | Higher; 15 μs
Key size | Not flexible (limited by packet header length) | Arbitrary
Value size | Not flexible (limited by the amount of state accessed when processing a packet) | Arbitrary
Load imbalance | No | Yes
System complexity | Simpler; it requires a customized parser, match-action tables | More complex; it requires coordination with routers, tunneling (e.g., GRE encapsulation)
Table size | Limited by RAM | Arbitrary
Cache policies | Limited by table size | Arbitrary
¹ BQPS: Billion Queries Per Second.
C. Telecommunication Services
C.1. Background
The evolution of the current mobile network to the emerging Fifth-Generation (5G) technology implies significant improvements of the network infrastructure. Such improvements are necessary in order to meet the Key Performance Indicators (KPIs) and requirements of 5G [248]. 5G requires ultra-reliable low latency and jitter (microseconds-scale). As programmable switches fulfill these requirements, researchers are investigating the idea of offloading telecom-oriented VNFs running on x86 servers to programmable hardware.
C.2. Literature Review
Ricart-Sanchez et al. [118] proposed a system that uses a programmable data plane to enhance the performance of the data path from the edge to the core network, also known as the backhaul, in a 5G multi-tenant network. The same authors [119] proposed a 5G firewall that detects, differentiates and selectively blocks 5G network traffic in the backhaul network.
In parallel, attempts such as TurboEPC [120] proposed offloading a subset of user state in the mobile packet core to programmable switches in order to perform signaling in the data plane. Similarly, Singh et al. [121] designed a P4-based element of 5G Mobile Packet Core (MPC) that merges the functions of both the Serving Gateway (SGW) and the Packet Data Network Gateway (PGW). Additionally, Voros et al. [122] proposed a hybrid next-generation NodeB (gNB) that combines the capabilities of P4 switches and the external services built on top of NIC accelerators (DPDK).
Another important function required in 5G is handover. Palagummi et al. [123] proposed SMARTHO, a system that uses programmable switches to perform handover efficiently in a wireless network.
Finally, Kfoury et al. [124] proposed a system for offloading conversational media traffic (e.g., Voice over IP (VoIP), Voice over LTE (VoLTE), WebRTC, media conferencing, etc.) from x86-based relay servers to programmable switches. While this system is not tailored for 5G networks specifically, it provides significant performance improvements for Over-The-Top (OTT) VoIP systems.
TABLE XIX
TELECOM SCHEMES COMPARISON
TABLE XX
SWITCH-BASED AND SERVER-BASED MEDIA RELAYING
D.2. Literature Review
Jepsen et al. [125] presented “packet subscription”, a new abstraction that generalizes the forwarding rules by evaluating stateful predicates on input packets. Wernecke et al. [126, 127] presented distribution strategies for content-based publish/subscribe systems using programmable switches. The authors described a system where the notification distribution tree (i.e., the subscribers that should receive the notification) is encoded in the packet headers, similar to multicast source routing. Similarly, Kundel et al. [128] implemented a publish/subscribe system on programmable switches. The system is flexible in encoding attributes/values in packet headers.
D.3. Publish/Subscribe Schemes Comparison, Discussions, and Limitations
Table XXI compares the aforementioned pub/sub schemes. In [125], the authors described a compiler that generates P4 tables from logical predicates. It utilizes a novel algorithm based on Binary Decision Diagrams (BDD) to preserve switch resources (TCAM and SRAM). This feature simplifies the configuration as operators do not need to manually install table entries on switches, which is a cumbersome process when the topology is large. The prototype was evaluated on a hardware switch (Tofino), and the authors considered Nasdaq’s ITCH protocol as the pub/sub use case. Results show that the system was able to process messages at line rate while using the full switch capacity (6.5 Tbps). The other systems considered different encoding strategies. For example, in [126, 127], the authors described a system where the notification distribution tree (i.e., the subscribers that should receive the notification) is encoded in the packet headers, similar to multicast source routing. The advantage of storing the distribution tree in the packet header instead of storing it in the switch is that rules in the switches do not need to be updated when subscriptions change. Another distinction between the pub/sub systems is whether they require a dedicated language to describe the subscriptions, and the configuration complexity.
D.4. Comparison between Switch-based and Server-based Pub/Sub Systems
Fig. 15 illustrates the operations of traditional software-based pub/sub systems (a) and switch-based pub/sub systems (b). Latency and its variations are significantly reduced when the switch acts as a pub/sub broker. However, the size of memory in the switch limits the amount of data to be distributed. Moreover, implementing features provided by software-based pub/sub systems such as QoS levels, session persistence, message retaining, last will and testament (notify users after a device disconnects) in hardware is challenging.
E. Summary and Lessons Learned
Programmable switches offer the flexibility of customizing the data plane to enable middlebox functions. A middlebox can be defined as a device that performs functions that are beyond the standard capabilities of routers and switches. A number of works demonstrated the implementation of middlebox functions such as caching, load balancing, offloading services, and others on programmable switches. The majority of load balancing schemes took advantage of the stateful nature of the data plane to store the load balancing connection table. Future work should consider minimizing the storage requirement to
Fig. 15. (a) Traditional software-based pub/sub architecture. (b) Pub/sub implemented on a programmable switch.
improve the scalability, supporting flow priority, and developing further variations for novel multipath transport protocols
TABLE XXIII
MACHINE LEARNING SCHEMES COMPARISON
B.3. ML Schemes Comparison, Discussions, and Limitations
Table XXIII compares the aforementioned ML schemes. While the goal of DAIET is to discuss what computations the network can perform, the authors did not design a complete system, nor did they address the major challenges of supporting ML applications. Moreover, their proof-of-concept presented a simple MapReduce application on a software switch, and it is not certain whether the system can be implemented on a hardware switch. Compared to DAIET, SwitchAgg does not require modifying the network architecture, and offers better processing abilities with a significant data reduction rate. Moreover, SwitchAgg was implemented on an FPGA, and the results show that the job completion time can be reduced by as much as 50%.
SwitchML extended the literature on accelerating ML model training by providing a complete implementation and evaluation on a hardware switch. A commonly used training technique for deep neural networks is synchronous stochastic gradient descent [257]. In this technique, each worker has a copy of the model that is being trained. The training is an iterative process where each iteration consists of: 1) reading the sample of the dataset and locally performing some computation-intensive learning using the worker’s accelerators. This yields to a gradient vector; and 2) updating the model by computing the mean of all gradient vectors. The main motivation of this idea is that the aggregation is computationally cheap (takes 100ms), but is communication-intensive (transferring hundreds of megabytes each iteration). SwitchML uses computation on the switch to aggregate model updates in the network as the workers are sending them (see Fig. 17). An advantage is that there is minimal communication; each worker sends its update vector and receives back the aggregated updates. The design challenges of this system include: 1) the limitation of storage available on the switch, addressed by using a streaming approach; 2) switches cannot perform much computation per packet, addressed by partitioning the work between the switch and the workers; 3) ML systems use floating point numbers, addressed by quantization approaches; and 4) failure recovery is needed to ensure correctness. The system is implemented on a hardware switch (Tofino), and results show that the system speeds up training by up to 300% compared to existing distributed learning approaches.
With respect to in-network inference, it is challenging to implement full-fledged models as they require extensive computations (e.g., multiplications and activation functions). A simple variation such as the Binary Neural Network (BNN)
Fig. 17. (a) ML model updates in legacy networks. The aggregation process is communication-intensive and follows an all-to-all communication pattern.
This means that the workers should receive all the other workers’ updates. Since accelerators on end-hosts are becoming faster, the network should speed up
so that it does not become the bottleneck. Therefore, it is expensive to deploy additional accelerators since it requires re-architecting the network. The red
arrow in (a) shows that the bottleneck source is the network. (b) ML model updates accelerated by the network. Aggregation is performed in the network by
the programmable switches while the workers are sending them. The workers do not need to obtain the updates of all other workers, hence there is minimal
communication. They only obtain the aggregated model from the switch. The red arrow in (b) shows that the bottleneck source is the worker rather than the
network [141, 256]
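The in-network aggregation pictured in Fig. 17(b) can be illustrated with a small Python model: workers quantize their floating-point gradients to integers, the switch adds integers one chunk at a time so that it never holds a whole update, and the workers scale the sum back to obtain the mean. This is a sketch under stated assumptions, not SwitchML's actual protocol; the fixed-point scale of 2^16, the chunk size, and all function names are choices made for the example.

```python
def quantize(gradient, scale):
    """Worker side: convert floats to fixed-point integers."""
    return [round(g * scale) for g in gradient]


def switch_aggregate(worker_chunks):
    """Switch side: element-wise integer addition over one chunk per worker."""
    slot = [0] * len(worker_chunks[0])
    for chunk in worker_chunks:
        for i, value in enumerate(chunk):
            slot[i] += value
    return slot


def aggregate_updates(gradients, scale=1 << 16, chunk_size=2):
    """Stream all workers' updates through the switch and return their mean."""
    workers = len(gradients)
    dimension = len(gradients[0])
    quantized = [quantize(g, scale) for g in gradients]
    mean = []
    for start in range(0, dimension, chunk_size):
        # Only chunk_size values per worker are in the switch at any time.
        chunks = [q[start:start + chunk_size] for q in quantized]
        summed = switch_aggregate(chunks)
        # Workers undo the scaling and divide by the number of workers.
        mean.extend(s / (scale * workers) for s in summed)
    return mean


mean_update = aggregate_updates([[0.5, -1.0, 2.0], [1.5, 1.0, 0.0]])
```

The switch only ever performs integer additions on a bounded number of values, which mirrors the storage, computation, and floating-point constraints listed for this design.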
TABLE XXIV
SWITCH-BASED AND SERVER-BASED ML APPROACHES
Feature | Inference, switch-based | Inference, server-based | Training, switch-based | Training, server-based
Speed | Faster, inference at line rate | Slower | Faster, aggregations at line rate | Slower; aggregations on an x86 server
Complex computations support | Lower, basic arithmetic and bitwise logic function | Higher | Lower | Higher
Communication overhead | Low | Low | Lower, switch is the centralized aggregator | Higher, updates are exchanged with a remote aggregator
Storage | Lower | Higher | Lower, update is not stored entirely at once | Higher
Encrypted traffic | Difficult | Easy | Difficult | Easy
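The "basic arithmetic and bitwise logic" entry in Table XXIV is what makes binarized models a natural fit for switch-based inference. As a hedged illustration, the Python sketch below evaluates one binarized neuron with XNOR, a population count, and a sign test. The bit encoding (1 for +1, 0 for -1) and the function names are assumptions for this example and are not taken from N2Net or the other surveyed systems.

```python
def bnn_neuron(inputs, weights, n_bits):
    """Binarized neuron over n_bits packed inputs and weights.

    Bit value 1 encodes +1 and bit value 0 encodes -1 (assumed encoding).
    """
    mask = (1 << n_bits) - 1
    agree = ~(inputs ^ weights) & mask      # XNOR: 1 where input and weight match
    ones = bin(agree).count("1")            # POPCNT: number of matching positions
    # Dot product of the +/-1 vectors is ones - (n_bits - ones) = 2*ones - n_bits.
    return 1 if 2 * ones >= n_bits else 0   # SIGN: non-negative maps to +1


def bnn_layer(inputs, weight_rows, n_bits):
    """Pack the outputs of several neurons into one integer, neuron 0 in bit 0."""
    out = 0
    for position, weights in enumerate(weight_rows):
        out |= bnn_neuron(inputs, weights, n_bits) << position
    return out


activation = bnn_layer(0b1011, [0b1001, 0b0100], n_bits=4)
```

No multiplication is needed: the dot product of two ±1 vectors reduces to counting agreeing bit positions.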
only requires bitwise logic functions (e.g., XNOR, POPCNT, SIGN). N2Net provides a compiler that translates a given BNN model to a switching chip’s configuration (P4 program). The authors did not mention on which platform N2Net was evaluated; however, based on their evaluations, they concluded that a BNN can be implemented on most current switching chips, and with small additions to the chip design, more complex models can be implemented. IIsy studied other ML models. The authors of IIsy acknowledged that the work is limited in scope as it does not address popular ML algorithms such as neural networks. Furthermore, it is bounded to the type of features it can extract (i.e., packet headers), and has accuracy limitations. IIsy tries to find a balance between the limited resources on the switch and the classification accuracy. Finally, BaNaNa Split took a different approach by partitioning the processing of NN to offload a subset of layers from the CPU to a different processor. Note that the solution is far from complete, and the authors evaluated a single binary fully connected layer with 4096 neurons using a network processor-based SmartNIC.
C. Comparison between Switch-based and Server-based ML
Table XXIV shows a comparison between switch-based and server-based ML approaches. ML works that were extracted from the literature can be divided into two main categories: 1) expedited inference in the data plane, and 2) accelerated training in the network. The main advantage of switch-based over server-based inference is the ability to execute at line rate, which provides faster results to the clients. Performing complex computations in the switch is achieved through estimations, and hence is limited. Moreover, the SRAM capacity of the switch is small, impeding the storage of large models. Such limitations are not problematic with server-based inference approaches.
Distributed training can be significantly faster when aggregations are offloaded to a centralized switch. However, due to the small capacity of the switch memory, it is not possible to store the whole model update at once. Additionally, encrypted traffic remains a challenge when inference or training is handled by the switch.
D. Summary and Lessons Learned
Accelerating computations by leveraging programmable switches is becoming a trend in data centers and backbone networks. Although switches only support basic and limited operations, it was shown in the literature that the performance of various tasks (e.g., consensus, training models in machine learning) could significantly improve if computations are delegated to the network.
The majority of the in-network consensus works aim at implementing common consensus protocols such as Paxos and Raft in the data plane. Due to the hardware constraints, current schemes implement only simplified variations of the protocols. Future work could investigate implementing novel consensus algorithms that diverge from the existing complex ones. Further, such schemes should encompass failure recovery mechanisms.
Another interesting in-network application is ML training/inference acceleration. The literature has shown that significant performance improvements are attained when the switch aggregates model updates or classifies new samples. Future systems could explore developing further ML models for various tasks such as classification, regression, clustering, etc.
In addition to the aforementioned categories, data plane programming is being used for stream processing [143, 144], parallel processing [145], string searching [146], erasure coding [147], in-network lock managers [148], database queries acceleration [149], in-network compression [150], and computer vision offloading [151].
X. INTERNET OF THINGS (IOT)
The Internet of Things (IoT) is a novel paradigm in which pervasive devices equipped with sensors and actuators collect physical environment information and control the outside world. IoT applications include smart water utilities, smart grid, smart manufacturing, smart gas, smart metering, and many others. Typical IoT scenarios entail a large number of devices periodically transmitting their sensors’ readings to remote servers. Data received on those collectors is then processed and analyzed to assist organizations in taking data-driven intelligence decisions.
A. Aggregation
A.1. Background
Since IoT devices are constrained in size and processing capabilities, they typically generate packets that carry small payloads (e.g., temperature sensor readings). While such packets are small in size, their headers occupy a significant
portion of the total packet size. For instance, Sigfox Low-Power Wide Area Network (LPWAN) [258] can support a maximum payload size of 12 bytes per packet. The overhead of headers is 42 bytes (Ethernet 14 bytes + IP 20 bytes + UDP 8 bytes), which represents approximately 78% of the total packet size. When numerous devices continuously transmit IoT packets, a significant percentage of network bandwidth is wasted on transmitting these headers. Packet aggregation is a mechanism in which the payloads of small packets are aggregated into a single larger packet in order to mitigate the bandwidth overhead caused by transmitting multiple headers.
Legacy packet aggregation mechanisms operate on the CPUs of servers or on the control plane of switches [259–264]. While legacy mechanisms reduce the overhead of packet headers, they unquestionably increase the end-to-end latency and decrease the throughput. As a result, some studies have suggested aggregating only packets that are not real-time.
A.2. Literature Review
Wang et al. [152] presented an approach where small IoT packets are aggregated into a larger packet in the switch data plane (see Fig. 18). The goal of performing this aggregation is to minimize the bandwidth overhead of packets’ headers. The same authors [153] extended this work to solve some constraints related to the payload size and the number of aggregated packets. Similarly, Madureira et al. [155] proposed IoTP, a layer-2 communication protocol that enables the aggregation of IoT data in programmable switches. The solution gathers network information that includes the Maximum Transmission Unit (MTU), link bandwidths, underlying protocol, and delays. These properties are used to drive the aggregation algorithm.
TABLE XXV
IOT AGGREGATION SCHEMES COMPARISON
receiving a packet, the P4 switch parses its headers and identifies whether the packet is an IoT packet. If the packet is identified as an IoT packet, the switch parses and extracts the payload. Afterwards, the payload is stored in switch registers along with some other metadata, and the packet is dropped. Once packets are aggregated, the resulting packet is sent across the WAN to reach the remote server. Before the packet reaches the server, it is disaggregated by another P4 switch situated close to the server, and several packets identical to the original ones are generated. An important observation is that the aggregation/disaggregation processes are transparent to both the IoT devices and the servers; hence, no modifications are required on either end. The main advantages of [153] over [152] are: 1) packets can have different payload sizes; 2) the payload size is no longer limited to 16 bytes; 3) the number of packets is dynamic and only limited by the packet MTU; and 4) both the disaggregation and the aggregation run at line rate.
A.4. Comparison between Server-based and Switch-based Aggregation
Table XXVI shows a comparison between switch-based and server-based packet aggregation. When aggregation is performed on the switch (ASIC), the throughput is higher while the latency and jitter are lower than those of the server-based approaches (e.g., switch CPU or x86-based server). On the other hand, server-based aggregation has more flexibility in defining the number of packets and the amount of data that can be aggregated.
B. Service Automation
in such technologies can be divided into two distinct types, peripheral and central. Peripheral devices, which consist of sensors and actuators, receive commands and execute subsequent actions. Central devices, on the other hand, run applications that analyze information collected from peripheral devices and subsequently issue commands.
The interconnection of devices and services can follow a Peer-to-Peer (P2P) model or a cloud-centric approach. In the P2P model, the automation service runs on the central device, which processes and analyzes sensor data published by peripheral devices in order to issue commands. The main advantages of the P2P model include the low end-to-end latency and the low power consumption, as devices are physically close to each other. The drawbacks of the P2P model include poor scalability, short reachability, and inflexibility of policy enforcement. The cloud-centric model addresses the limitations of the P2P model by adding a gateway node that connects peripheral devices to a middleware hosted on the cloud (Internet). While this approach solves the poor scalability and the policy enforcement flexibility issues, it incurs additional delays and jitter in collecting and reacting to data. Moreover, the middleware represents a single point of failure which can shut down the whole service in the event of an outage. With programmable switches, researchers are investigating in-network approaches to manage transactional relationships between low-power, low-range IoT devices.
B.2. Literature Review
Uddin et al. [156] proposed Bluetooth Low Energy Service Switch (BLESS), a programmable switch that automates IoT application services by encoding their transactions in the data plane. It maintains link-layer connections to the devices to support P2P connectivity. The same authors proposed Muppet [157], an extension to BLESS to support multiple non-IP protocols.
B.3. Service Automation Comparison, Discussions, and Limitations
In BLESS, the data plane operations are performed at the Attribute Protocol (ATT) service layer, which consists of three operations: read attributes, write attributes, and attributes’ notification. BLESS parses ATT packets, then processes and forwards them to the devices. The control plane, on the other hand, is responsible for address assignment, device and service discovery, policy enforcement, and subscription management. The switch was implemented on a software switch (PISCES), and the results show that BLESS combines the advantages of the P2P and the cloud-centric approaches. Specifically, it achieves small communication latency, low device power consumption, high scalability, and flexible policy enforcement. Muppet extended this approach to support multiple IoT protocols. The system studied two popular IoT protocols, namely BLE and Zigbee. Being in the middle, the Muppet switch is responsible for translating actions (e.g., on/off switch of a light bulb) between the Zigbee and BLE protocols, as well as logging important events to a database which resides on the Internet via the Hypertext Transfer Protocol (HTTP). Note that parsers and action policies have to be implemented for each supported protocol. Another difference from BLESS is that the implementation of Muppet’s control plane leverages the ONOS controller with the Protocol Independent (PI) framework.
B.4. Comparison between Server-based and Switch-based Service Automation
Table XXVII shows a comparison between switch-based, P2P, and cloud-based service automation. Generally, the switch-based approach overcomes the limitations of both the P2P and the cloud-based approaches. It achieves the low energy and latency characteristics of P2P while increasing scalability and reachability.
TABLE XXVII
SWITCH-BASED, P2P, AND CLOUD SERVICE AUTOMATION
Feature        Switch-based   Peer-to-peer   Cloud-based
Latency        Low            Low            High
IoT energy     Low            Low            High
Scalability    High           Low            High
Reachability   High           Low            High
C. Summary and Lessons Learned
In the context of IoT, there exist broadly two categories, namely, packet aggregation and service automation. The goal of packet aggregation is to minimize the overhead of IoT packets’ headers. Typically, headers in IoT packets represent a significant portion of the whole packet size. By aggregating several packets into a single packet, the bandwidth overhead is reduced. Future work should study the performance side-effects (e.g., delay, jitter, loss rate, retransmission) that aggregation causes to packets. Furthermore, timers should be implemented to avoid excessive delays resulting from waiting for enough packets to be aggregated.
With respect to service automation, the goal is to automate IoT application services by encoding their transactions in the data plane while improving scalability, reachability, energy consumption, and latency. Future work should design and develop translators for non-IP IoT protocols so that applications on various devices that run different protocols can exchange data. Additionally, production-grade software switches should be leveraged to support non-Ethernet IoT protocols.
Other works that involve IoT include flowlet-based stateful multipath forwarding [268] and SDN/NFV-based architecture for IoT networks [269].
XI. CYBERSECURITY
Extensive research efforts have been devoted to deploying programmable switches to perform various security-related functions in the data plane. Such functions include heavy hitter detection, traffic engineering, DDoS attack detection and mitigation, anonymity, and cryptography. Fig. 19 demonstrates the difference between contemporary security appliances and programmable switches with respect to the layers they inspect in the OSI model. Although programmable switches are limited in computation power, they are capable of inspecting upper layers (e.g., application layer) at line rate. Such functionality is not available in any of the existing solutions.
[Figure 19: two OSI layer stacks (Application, Presentation, Session, Transport, Network, Data Link, Physical). Panel (a), software inspection: next-generation firewall and IDS/IPS; traditional firewall and flow-based IDS; ACL and packet filter. Panel (b), hardware inspection: programmable switch.]
Fig. 19. Layers inspection in the OSI model. (a) Contemporary security appliances. (b) Programmable switch.
A. Heavy Hitter
A.1. Background
Heavy hitters are a small number of flows that constitute most of the network traffic over a certain amount of time. They are identified based on the port speed, network RTT, traffic distribution, application policy, and others. Heavy hitters increase the flow completion time for delay-sensitive mice flows, and represent the major source of congestion. It is important to promptly detect heavy hitters in order to react to them; for instance, redirect them to a low priority queue, perform rate control and traffic engineering, block volumetric DDoS attacks, and diagnose congestion. Traditionally, packet sampling techniques (e.g., NetFlow) were used to detect heavy hitters. The main problem with such techniques is the limited accuracy due to the CPU and bandwidth overheads of processing samples in software. Advancements in programmable switches paved the way to detect heavy hitters in the data plane, which is not only orders of magnitude faster than sampling, but also enables additional applications (e.g., flow-size aware routing).
A.2. Literature Review
Sivaraman et al. [158] proposed HashPipe, a heavy hitter detection algorithm that operates entirely in the data plane. It detects the k heaviest flows within the constraints of programmable switches while achieving high accuracy. A subsequent work proposed by Harrison et al. [159] considers network-wide distributed heavy-hitter detection. Furthermore, Kučera et al. [160] proposed Elastic Trie, a solution that detects hierarchical heavy hitters, in-network traffic changes, and superspreaders in the data plane. Hierarchical heavy hitters include the total activity of all traffic matching relevant IP prefixes. Basat et al. [161] proposed PRECISION, a heavy hitter detection algorithm that probabilistically recirculates a fraction of packets for a second pipeline traversal. The recirculation idea greatly simplifies the access pattern of memory without significantly degrading throughput. Ding et al. [162] proposed an approach for incrementally deploying programmable switches in a network consisting of legacy devices with the goal of monitoring as many distinct network flows as possible. Tang et al. [163] proposed MV-Sketch, a solution that exploits the idea of majority voting to track the candidate heavy flows inside the sketch data structure. Finally, Silva et al. [164] proposed a solution that identifies elephant flows in Internet eXchange Points (IXP) networks.
A.3. Heavy Hitter Detection Comparison, Limitations, and Discussions
Table XXVIII compares the aforementioned heavy hitter schemes. The main criterion that differentiates the solutions is the selection and the implementation of the data structure. Hash tables and sketches are frequently used to store counters for heavy flows. Note that several variations of such data structures are being used in the literature, mainly to tackle the memory-accuracy tradeoff; the choice of data structure affects the accuracy of the performed measurements. For example, with probabilistic data structures, only approximations are performed.
In HashPipe, the programmable switch stores the flow identifiers and their byte counts in a pipeline of hash tables. HashPipe adapts the space saving algorithm which is described in [270]. The system was evaluated using an ISP trace provided by CAIDA (400,000 flows), and the results show that HashPipe needed only 80 KB of memory to identify the 300 heaviest flows, with an accuracy of 95%. Another hashtable-based solution is Elastic Trie, which consists of a prefix tree that expands or collapses to focus only on the prefixes that grab a
large share of the network. The data plane informs the control plane about high-volume traffic clusters in an event-based push approach only when some conditions are met. Other systems explored different data structures for the task. For instance, in [162] the authors used the HyperLogLog algorithm [271], which approximates the number of distinct elements in a multiset. The solution is capable of detecting heavy hitters by only using partial input from the data plane.
Another important criterion is whether the scheme tracks heavy hitters across the whole network. For example, unlike HashPipe, which considers a single switch, [159] tracks network-wide heavy hitters. Tracking network-wide heavy hitters is important as some applications (e.g., port scanners, superspreaders, etc.) cannot be detected from a single location. Moreover, aggregating the results that switches obtain separately is not sufficient for detecting heavy hitters; flows might not exceed a threshold locally, but when the total volume is considered, the threshold might be crossed.
TABLE XXVIII
HEAVY HITTER SCHEMES COMPARISON
A.4. Comparison between P4-based and Traditional Heavy Hitter Detection
The main advantage of heavy hitter detection schemes in the data plane over sampling-based approaches is the ability to operate at line rate. This means that every packet is considered in the detection algorithm, which improves accuracy and the speed of detection. Moreover, additional applications that exploit reactive processing can be implemented. For instance, switches can perform a flow-size aware routing method to redirect traffic upon detecting a heavy hitter.
B. Cryptography
B.1. Background
Performing cryptographic functions in the data plane is useful for a variety of applications (e.g., protecting layer 2 with cryptographic integrity checks and encryption, mitigating hash collisions, etc.). Computations in cryptographic operations (e.g., hashing, encryption, decryption) are known to be complex and resource-intensive. The supported operations in switch targets and in the P4 language are limited to basic arithmetic (e.g., additions, subtractions, bit concatenation, etc.). Recently, a handful of works have started studying the possibility of performing cryptographic functions in the data plane.
B.2. Literature Review
The authors in [165] argue for the need to implement cryptographic hash functions in the data plane to mitigate potential attacks targeting hash collisions. Consequently, they presented prototype implementations of cryptographic hash functions in three different P4 target platforms (CPU, SmartNIC, NetFPGA SUME). Another work by Hauser et al. [166] attempted to implement host-to-site IPsec in P4 switches. For simplification, only Encapsulating Security Payload (ESP) in tunnel mode with different cipher suites is implemented. The same authors also proposed P4-MACsec, an implementation of MACsec on P4 switches. MACsec is an IEEE standard for securing Layer 2 infrastructure by encrypting, decrypting, and performing integrity checks on packets.
The previous works delegated the complex computations to the control plane. Chen et al. [168] implemented the Advanced Encryption Standard (AES) in the data plane using scrambled lookup tables. AES is one of the most widely used symmetric cryptography algorithms; it applies several encryption rounds on 128-bit input data blocks.
TABLE XXIX
CRYPTOGRAPHY SCHEMES COMPARISON
B.3. Cryptography Schemes Comparison, Discussions and Limitations
Table XXIX compares the aforementioned cryptography schemes. With respect to hashing, P4 currently implements hash functions that do not have the characteristics of cryptographic hashing. For example, Cyclic Redundancy Check (CRC), which is commonly used in P4 targets, was originally developed for error detection. CRC can be easily implemented in embedded hardware, and is computationally much less complex than cryptographic hash functions (e.g., Secure Hash Algorithm (SHA)-256); however, it is not secure and has a high collision rate. Evaluation results in [165] show that 1) implementing cryptographic hash functions on CPU is easy, but has high latency (several milliseconds); 2) SmartNICs have the highest throughput, but can only process packets up to 900 bytes; and 3) NetFPGA has the lowest latency, but cannot be integrated using native P4 features. The authors found that the performance of hashing is highly dependent on the application, the input type, and the hashing algorithm, and therefore there is no single solution that fits all requirements. However, P4 targets should benefit from the characteristics of each solution (CPU, SmartNICs, FPGA, and ASICs) to implement cryptographic hashing.
As for more complex protocol suites (e.g., IPsec), Hauser
...
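The contrast drawn in the cryptography comparison between CRC and cryptographic hash functions can be made concrete with a small experiment. The following is a minimal sketch, assuming Python's zlib.crc32 as a stand-in for the CRC units available on P4 targets; the function names are illustrative and are not taken from [165]. It checks two properties that make CRC unsuitable against adversarial inputs: CRC-32 is affine over XOR for equal-length messages, so relations between outputs can be predicted without any secret, and its 32-bit output allows collisions to be found with a short birthday search. SHA-256 is used only as a reference point.

```python
import hashlib
import random
import zlib


def xor_bytes(*chunks: bytes) -> bytes:
    """XOR equal-length byte strings together."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, value in enumerate(chunk):
            out[i] ^= value
    return bytes(out)


def crc32_is_affine(a: bytes, b: bytes, c: bytes) -> bool:
    """Check crc(a^b^c) == crc(a)^crc(b)^crc(c) for equal-length inputs."""
    combined = zlib.crc32(xor_bytes(a, b, c))
    return combined == (zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c))


def sha256_is_affine(a: bytes, b: bytes, c: bytes) -> bool:
    """Same check for SHA-256; not expected to hold for arbitrary inputs."""
    def digest(message: bytes) -> bytes:
        return hashlib.sha256(message).digest()

    return digest(xor_bytes(a, b, c)) == xor_bytes(digest(a), digest(b), digest(c))


def find_crc32_collision(seed: int = 0, limit: int = 2_000_000):
    """Birthday search over pseudo-random 8-byte messages (assumed input model).

    Returns two distinct messages with the same CRC-32, or None if the
    limit is reached first.
    """
    rng = random.Random(seed)
    seen = {}
    for _ in range(limit):
        message = rng.getrandbits(64).to_bytes(8, "big")
        checksum = zlib.crc32(message)
        previous = seen.get(checksum)
        if previous is not None and previous != message:
            return previous, message
        seen[checksum] = message
    return None
```

With a 32-bit output, a collision is expected after roughly 2^16 random messages, which is why the search above finishes quickly on a commodity CPU; the same search against a 256-bit digest is infeasible.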
which either requires network operators to anonymize packet traces before sharing them with other researchers and analysts, or anonymize traffic online but with significant overhead. ONTAS provides a policy language used by operators for expressing anonymization tasks, which makes the system flexible and scalable. The system was implemented and tested on a hardware switch, and results show that ONTAS entails 0% packet processing overhead and requires half the storage of existing offline tools. A limitation of this system is that it does not anonymize TCP/UDP field values. Another limitation is that it does not support applying multiple privacy policies concurrently.
Another line of research (i.e., PANEL, SPINE) focused on protecting the identities of Internet users. PANEL overcomes the performance limitations of popular anonymity systems (e.g., Tor), and does not require entirely modifying the Internet routing and forwarding protocols as proposed in [273] and [274]. Partial deployment is possible as PANEL can co-exist with legacy devices. The solution involves: 1) source address rewriting to hide the origin of the packet; 2) source information normalization (IP identification and TCP sequence randomization) to mitigate fingerprinting attacks; and 3) path information hiding (TTL randomization) to hide the distance to the original sender at any given vantage point.
As for SPINE, it does not require cooperation between switches and end-hosts, but assumes that at least two entities (typically two ASes or two ISPs) are trusted. Fig. 20 shows the SPINE architecture. The solution encrypts the IP addresses before the packets enter the intermediary ASes. Therefore, adversarial devices only see the encrypted addresses in the headers. It also encrypts the TCP sequence and ACK numbers to prevent packets from being attributed to flows. SPINE transforms IPv4 headers into IPv6 headers when packets leave the trusted entity and restores the IPv4 headers upon entering the trusted entity. These operations enable routing to be performed in intermediary networks. The encrypted IPv4 address is inserted in the last 32 bits of the IPv6 destination address. The encryption works by XORing the IP address with the hash of a pre-shared key and a nonce. The system uses SipHash since it is easily implemented in the data plane.
C.4. Privacy and Anonymity in Switch-based and Legacy Systems
Contemporary approaches that provide privacy and anonymity on the Internet use special routing overlay networks to hide the physical location of each node from other participants (e.g., Tor). Such approaches have performance limitations as proxy servers (overlays) are maintained by volunteers and have no performance guarantees. Moreover, they often require performing advanced encryption routines to obfuscate where the packet originated (e.g., the onion routing technique used by Tor involves encapsulating messages in several layers of encryption). On the other hand, approaches that are based on programmable switches often rely on header modification and simplified encryption and hashing to conceal information (e.g., SPINE [172]).
D. Access Control
D.1. Background
The selective restriction of access to digital resources is known as access control in cybersecurity. Typically, access control begins with “authentication” in order to verify the identity of a party. Afterwards, “authorization” is enforced through policies to specify access rights to resources. To authenticate parties, methods such as passwords, biometric analysis, cryptographic keys, and others are used. With respect to authorization, methods such as ACL are used to describe what operations are allowed on given objects.
With the advent of programmable switches, it is now possible to delegate authentication and authorization to the data plane. As a result, access can be promptly granted or denied at line rate, before reaching the target server. A clear advantage of this approach is that servers are no longer busy processing access verification routines, which increases their service throughput.
D.2. Literature Review
Datta et al. [173] presented P4Guard, a P4-based configurable firewall that acts based on predefined policies set by the controller. Kang et al. [175] presented a scheme that implements context-aware security policies (see Fig. 21). The policies are applicable to enterprise and campus networks with diverse devices, i.e., Bring Your Own Device (BYOD) (e.g., laptops, mobile devices, tablets, etc.).
[Figure 21 labels: Context packets; End devices; P4 switches.]
Fig. 21. Overview of Poise [175]. A compiler translates high-level policies into P4 programs and device configurations. Context packets are continuously sent from the clients to the network, where the switches enforce the policies.
Almain et al. [174] proposed delegating the authentication of end hosts to the data plane. The method is based on port knocking, in which hosts deliver a sequence of packets addressed to an ordered list of closed ports. If the ports match the ones configured by the network administrators, then end
host is authenticated, and subsequent packets are allowed. Finally, Bai et al. [176] presented P40f, a tool that performs OS fingerprinting on programmable switches, and consequently, applies security policies (e.g., allow, drop, redirect) at line rate.
TABLE XXXI
ACCESS CONTROL SCHEMES COMPARISON
Scheme | Goal | Strategy | Scope | Limitations
[173] | Simple firewall-based access control | Translates from high-level security policies to table entries | Header-based firewall (layer-4) | Lacks NGFW capabilities
[174] | User-authentication in the data plane | Uses port knocking technique for authentication | Unencrypted sequence-based authentication | Unencrypted sequence vulnerable to packet sniffing
[175] | Context-aware policies enforcement | Translates from high-level security policies to P4 programs | CAS dynamic policies based on runtime contexts | External encryptions are slow; lack of authentication
[176] | OS fingerprinting and policy enforcement | Compares TCP/IP headers to a fingerprint database file | Uses p0f to filter connections | Lack of advanced built-in actions (e.g., rate-limiting)
D.3. Access Control Comparison, Discussions, and Limitations
Table XXXI compares the aforementioned access control schemes. P4Guard provides access control based on high-level security policies that are translated to table entries. Note that P4Guard only operates up to the transport layer (e.g., source/destination IP addresses, source/destination ports, protocol, etc.), similar to a traditional firewall. While programmable switches provide increased flexibility in the parser (e.g., parse beyond the transport layer) and the packet processing logic, P4Guard did not leverage such capabilities. It would be interesting to investigate additional capabilities such as those enabled by next-generation firewalls (NGFW).
The solution in [174] controls access by performing authentication in the data plane. The solution has several limitations since it relies on port knocking, a technique that has several security implications. For instance, programmable switches do not use cryptographic hashes, making the solution vulnerable to IP address spoofing attacks. Additionally, unencrypted port knocking is vulnerable to packet sniffing. Furthermore, port knocking relies on security through obscurity.
In [175], the scheme dynamically enforces access control to users based on contexts (e.g., if the user’s device uses Secure Shell (SSH) 2.0 or higher, then the switch forwards the packets of this flow. Otherwise, the switch drops the packets). The scheme requires user devices to run an application which communicates with the switch using a custom protocol (context packets). The context packets are generated on a per-flow basis. The switch tracks flows using a match-action table and registers in the data plane. Actions over a packet are dropping, allowing, and forwarding to other appliances for deep packet inspection. Data packets are not modified. Evaluations show that the proposed approach can operate (install new flows and update rules) with minimal latency, even under heavy DoS attacks. On the other hand, such attacks can decimate similar SDN-based systems. One of the main drawbacks of the proposed system is the lack of authentication, integrity, and confidentiality of the context packets. Thus, the system can be subject to attacks such as snooping (i.e., eavesdropping) on communication between user devices and the switch, impersonation, and others.
Finally, [176] proposes OS fingerprinting in the data plane. The main motivation behind this work is that software-based passive fingerprinting tools (e.g., p0f [275]) are neither practical nor sufficient with large amounts of traffic on high-speed links. Furthermore, out-of-band monitoring systems cannot promptly take actions (e.g., drop, forward, rate-limit) on traffic at line rate. The main drawback of the solution is that it lacks sophisticated policies that involve rate-limiting traffic.
D.4. Comparison between Switch-based and Server-based Access Control
Controlling access to resources often starts with authentication. While server-based approaches are more flexible in the methods of authentication they can provide, they typically require client connections to reach the server before the communication starts. In switch-based approaches, the authentication can be done in-network at the edge, eliminating unnecessary latency incurred from traversing the network and from software processing.
Access to resources can be controlled after fingerprinting end hosts’ OSs. Software-based passive fingerprinting tools cannot keep up with the high load (gigabits/s links). The literature has shown that such tools lead to a 38% degradation in throughput [276]. Additionally, such tools are out-of-band, meaning that it is not possible to apply policies on traffic (e.g., after fingerprinting an OS). On the other hand, switch hardware is able to perform OS fingerprinting and apply security policies at line rate.
Context-aware policies applied on nodes (clients/servers) have local visibility. A newer approach is to use a centralized SDN controller (e.g., [277]), but such a scheme is vulnerable to control plane saturation attacks and is subject to increased delays. Switch-based schemes, on the other hand, are able to provide access control at line rate.
E. Defenses
E.1. Background
DDoS attacks remain among the top security concerns despite the continuous efforts towards the development of their detection and mitigation schemes. This concern is exacerbated not only by the frequency of said attacks, but also by their high volumes and rates. Recent attacks (e.g., [278, 279]) reached the order of terabits per second, a rate that existing defense mechanisms cannot keep up with.
There are two main concerns with existing defense methods handled by end-hosts or deployed as middlebox functions on x86-based servers. First, they dramatically degrade the throughput and increase latency and jitter, impacting the performance of the network. Second, they present severe consequences on the network operation when they are installed at the last mile (i.e., far from the edge).
The escalation of volumetric DDoS attacks and the lack of robust and efficient defense mechanisms motivated the idea of architecting defenses into the network. Up until recently, in-network security methods were restricted to simple access control lists encoded into the switching and routing devices. The main reason is that the data plane was fixed in function, impeding the development of customized and dynamic algorithms that can assist in detecting attacks. With the advent of programmable data planes, it is possible to develop systems that detect and mitigate various types of attacks without imposing significant overhead on the network.
E.2. Literature Review
Li et al. [177] presented NETHCF, a Hop-Count Filtering (HCF) defense mechanism that mitigates spoofed IP traffic. HCF schemes filter spoofed traffic with an IP-to-hop-count mapping table. Another attack-specific scheme proposed by Febro et al. [180] mitigates distributed SIP DDoS in the data plane. Furthermore, Scholz et al. [183, 280] presented a scheme that defends against SYN flood attacks.
Alternatively, some schemes are generic and aim at addressing multiple attacks concurrently. For instance, Xing et al. [178] proposed FastFlex, an abstraction that architects defenses into the network paths based on changing attacks. Kang et al. [179] presented an automated approach for discovering sensitivity attacks targeting the data plane programs. Sensitivity attacks in this context are intelligently crafted traffic patterns that exploit the behavior of the P4 program. Lapolli et al. [181] implemented a mechanism to perform real-time DDoS attack detection based on entropy changes. Such changes are used to compute anomaly detection thresholds. Mi et al. [182] proposed ML-Pushback, a P4-based implementation of the Pushback method [281].
Zhang et al. [184] proposed Poseidon, a system that mitigates volumetric DDoS attacks through programmable switches. It provides a language in which operators can express a range of security policies. Friday et al. [185] proposed a unified in-network DDoS detection and mitigation strategy that considers both volumetric and slow/stealthy DDoS attacks. Xing et al. [186] proposed NetWarden, a broad-spectrum defense against network covert channels in a performance-preserving manner. The method in [187] models a stateful security monitoring function as an Extended Finite State Machine (EFSM) and expresses the EFSM using P4 abstractions. Finally, Ripple [188] provides decentralized link-flooding defense against dynamic adversaries.
TABLE XXXII
DEFENSES SCHEMES COMPARISON
E.3. Defense Schemes Comparison, Discussions, and Limitations
Table XXXII compares the aforementioned defense schemes. Broadly, defense schemes can be grouped into two main categories: attack-specific and generic. The attack-specific category consists of works that address a specific attack (e.g., NETHCF for IP spoofing, [180] for SIP DDoS, etc.), while the generic category aims at addressing various types of attacks (e.g., FastFlex for various availability attacks, Ripple for link flooding attacks, etc.).
The significant advantage of architecting defenses in the data plane is the performance improvement of the application. For instance, NETHCF is motivated by the fact that traditional HCF-based schemes are implemented on end-hosts,
which delays the filtering of spoofed packets and increases the bandwidth overhead. Moreover, since traditional schemes are implemented in server-based middleboxes, low latency and minimal jitter are hard to achieve. Similarly, FastFlex advocates offloading the defenses to the data plane. Specifically, it tackles the following key challenges that are faced when programming defenses in the data plane: 1) resource multiplexing; 2) optimal placement; 3) distributed control; and 4) dynamic scaling.
When deploying defenses in the data plane, operators must be aware of the capabilities of the constrained targets. Many operations that require extensive computations cannot be easily implemented on the data plane. Existing works either approximate the computations in the data plane (considering the trade-off between computation complexity and measurement accuracy), or delegate the computations to external processors (e.g., CPU on the switch, external server, SDN controller, etc.). For instance, NETHCF decouples the HCF defense into a cache running in the data plane and a mirror in the control plane. The cache serves the legitimate packets at line rate, while the mirror processes the missed packets, maintains the IP-to-hop-count mapping table, and adjusts the state of the system based on network dynamics. In Poseidon, the defense primitives are partitioned to be executed on switches and on servers, based on their properties. On the other hand, in [181], the authors estimated the entropies of source and destination IP addresses of incoming packets for consecutive partitions (observation windows) in the data plane, without consulting external devices.
Network-wide defenses are those that are not restricted to a single switch, and require multiple switches to cooperate in the attack detection and mitigation phases. Such cooperation significantly improves the accuracy and the promptness of the detection. More details on network-wide data plane systems are given in Section XIII-D.
Finally, Table XXXII lists some limitations of the existing schemes, which can be explored in future work to advance the state-of-the-art.
E.4. Comparison between P4-based and Traditional Defense Schemes
Network attacks such as large-scale DDoS and link flooding may have substantial impact on the network operation. For such attacks, server-based defenses deployed at the last mile are problematic and inherently insufficient, especially when attacks target the network core. Moreover, it is not feasible to detect and mitigate large volumes of attack traffic (e.g., SYN flood) on end-hosts without impacting the throughput of the network. When defenses are architected into the network (i.e., detection and mitigation are programmed into the forwarding devices), it is easy to detect, throttle, or drop suspicious traffic at any vantage point, at line rate.
F. Summary and Lessons Learned
In the context of cybersecurity, a wide range of works leveraged programmable switches to achieve the following goals: 1) detect heavy hitters and apply countermeasures; 2) execute cryptographic primitives in the data plane to enable further applications; 3) protect the identity and the behavior of end-hosts, as well as obfuscate the network topology; 4) enforce access control policies in the network while considering network dynamics; and 5) architect defenses in the data plane to accelerate the detection and mitigation processes.
Identifying heavy hitters at line rate has several advantages. Recent works considered various data structures and streaming algorithms to detect heavy hitters. Future systems could explore more complex data structures that reduce the amount of state storage required on the switches. Furthermore, novel systems must minimize the false positives and the false negatives compared to both P4-based and legacy heavy hitter detection systems. Finally, new schemes should explore strategies for incremental deployment while maximizing flow visibility across the network.
There is an absolute necessity to implement cryptographic functions (e.g., hash, encrypt, decrypt) in the data plane. Such functions can be used by various applications that require low hashing collisions (e.g., load balancing) and strong data protection. Most existing efforts delegate the complex computations to the control plane. However, recent systems have demonstrated that AES, a well-known symmetric key encryption algorithm, can be implemented in the data plane.
Another interesting line of work provided privacy and anonymity to the network. Recent efforts obfuscated the network topology in order to mitigate topology-centric attacks (e.g., LFA). Such systems must preserve the practicality of path tracing tools, while being robust against obfuscation inversion. Additionally, link failures in the physical topology should remain visible after obfuscation. Furthermore, when randomizing identifiers to achieve session unlinkability, the identifiers must fit into the small fixed header space so that compatibility with legacy networks is preserved. Other efforts considered rewriting source information and concealing headers to protect the identity of Internet users.
Finally, access control methods and in-network defenses were proposed. Future access control schemes should explore further in-network methods to authenticate the users. Additionally, since switches are capable of inspecting upper-layer headers, it is worth exploring offloading some next generation firewall functionalities to the data plane. For instance, in [146], the authors proposed a system that allows searching for keywords in the payload of the packet. Similar techniques could be leveraged to achieve URL filtering at line rate. Additionally, schemes should mitigate stealthy DDoS attacks.
XII. NETWORK TESTING
Although programmable switches provide flexibility in defining the packet processing logic, they introduce potential risks of having erroneous and buggy programs. Such bugs may cause fatal damages, especially when they are unexpectedly triggered in production networks. In such scenarios, the network starts experiencing a degradation in performance as well as disruption in its operation. Bugs can occur in various phases in the P4 program development workflow (e.g., in
the P4 program itself, in the controller updating data plane table entries, in the target compiler, etc.). Bugs are usually manifested after processing a sequence of packets with certain combinations not envisioned by the designer of the code. This section gives an overview of the troubleshooting and verification schemes for P4 programmable networks.
A. Troubleshooting
A.1. Background
Troubleshooting the network has attracted intensive research interest. Previous efforts are mainly based on passive packet behavior tracking through the usage of monitoring technologies (e.g., NetSight [282], EverFlow [283]). Other techniques (e.g., Automatic Test Packet Generation (ATPG) [284]) send probing packets to proactively detect network bugs. Such techniques have two main problems. First, the number of probe packets increases exponentially as the size of the network increases. Second, the coverage is limited by the number of probe-generating servers. Despite the flexibility that programmable switches offer, writing data plane programs increases the chance of introducing bugs into the network. Programs are inevitably prone to faults which could significantly compromise the performance of the network and incur high penalty costs.
A.2. Literature Review
Zhang et al. [189] proposed P4DB, an on-the-fly runtime debugging platform. The system debugs P4 programs in three levels of visibility by provisioning operator-friendly primitives: watch, break, and next. Zhou et al. [190] proposed P4Tester, a troubleshooting system for data plane runtime faults. It generates an intermediate representation of P4 programs and table rules based on the BDD data structure. Dumitru et al. [191] examined how three different targets, BMv2, P4-NetFPGA, and Barefoot’s Tofino, behave when undesired behaviors are triggered. Kodeswaran et al. [192] proposed a data plane primitive for detecting and localizing bugs as they occur in real time. Finally, Zhou et al. [193] proposed KeySight, a platform that troubleshoots programmable switches with high scalability and high coverage. It uses the Packet Equivalence Class (PEC) abstraction when generating probes.
A.3. Troubleshooting Schemes Comparison, Discussions, and Limitations
Table XXXIII compares the aforementioned troubleshooting schemes. Essentially, the schemes either passively track how packets are processed inside switches (e.g., [189, 192]) or diagnose faults by injecting probes (e.g., [190, 193]). The main limitation of passive detection is that schemes can only detect rule faults that have been triggered by existing packets, and cannot check the correctness of all table rules. On the other hand, probing-based schemes may incur large control and probe overheads.
TABLE XXXIII. TROUBLESHOOTING SCHEMES COMPARISON
Examples of probing-based schemes include P4Tester and KeySight. P4Tester generates an intermediate representation of P4 programs and table rules based on the BDD data structure. Afterwards, it performs an automated analysis to generate probes. Probes are sent using source routing to achieve high rule coverage while maintaining low overheads. The system was prototyped on a hardware switch (Tofino), and results show that it can check all rules efficiently and that the probe count is smaller than that of server-based probe injection systems (i.e., ATPG and Pronto).
Other schemes that use passive fault detection (e.g., P4DB) assume that packets consistently trigger the runtime bugs. P4DB debugs P4 programs in three levels of visibility by provisioning operator-friendly primitives: watch, break, and next. P4DB does not require modifying the implementation of the data plane. It was implemented and evaluated on a software switch (BMv2), and the results show that it is capable of troubleshooting runtime bugs with a small throughput penalty and little latency increase.
Another important criterion that differentiates the troubleshooting schemes is the memory footprint they require. Some schemes (e.g., P4DB) require more memory than others (e.g., KeySight), which bound the memory usage.
Finally, the work in [191] is different from the others. The authors examined how three different targets, BMv2, P4-NetFPGA, and Barefoot’s Tofino, behave when undesired behaviors are triggered. The authors first developed buggy programs in order to observe the actual behavior of targets. Then, they examined the most complex P4 program publicly available, switch.p4, and found that it can be exploited when attackers know the specifics of the implementation. In summary, the paper suggests that BMv2 leaks information from previous packets. This behavior is not observed with the other two targets. Furthermore, the authors were able to perform privilege escalation on switch.p4 due to a header intended to ensure communication between the CPU and the P4 data plane.
A.4. Comparison Legacy vs. P4-based Debugging
In legacy networks, network devices are equipped with fixed-function services that operate on standard protocols. Troubleshooting these networks often involves testing protocols and typical data plane functions (e.g., layer-3 routing) through rigid probing. On the other hand, with programmable networks, since operators have the flexibility of defining custom data plane functions and protocols, testing is more complex and is program-dependent. Probing-based approaches should craft patterns depending on the deployed P4 program. Other approaches proposed primitives that increase the levels of visibility when debugging P4 programs. The surveyed literature shows that it is essential to develop flexible mechanisms that operate dynamically on diverse P4 programs and targets.
B. Verification
B.1. Background
Program verification consists of tools and methods that ensure correctness of programs with respect to specifications and properties. Verification of P4 programs is an active area as bugs can cause faults that have drastic impacts on the performance and the security of networking systems. Static P4 verification handles programs before deployment to the network, and hence, cannot detect faults that occur at runtime. On the other hand, runtime verification uses passive measurements and proactive network testing. This section describes the major verification work pertaining to P4 programs.
B.2. Literature Review
Lopes et al. [194] proposed P4NOD, a tool that compiles P4 specifications to Datalog rules. The main motivation behind this work is that existing static checking tools (e.g., Header Space Analysis (HSA) [285], VeriFlow [286]) are not capable of handling changes to forwarding behaviors without reprogramming tool internals. The authors introduced the “well formedness” bugs, a class of bugs arising due to the capabilities of modifying and adding headers.
Another interesting work is ASSERT-P4 [195, 196], a network verification technique that checks at compile-time the correctness and the security properties of P4 programs. ASSERT-P4 offers a language with which programmers express their intended properties with assertions. After annotating the program, a symbolic execution takes place with all the assertions being checked while the paths are tested.
Further, Liu et al. [197] proposed p4v, a practical verification tool for P4. It allows the programmer to annotate the program with Hoare logic clauses in order to perform static verification. To improve scalability, the system suggests adding assumptions about the control plane and domain-specific optimizations. The control plane interface is manually written by the programmer and is not verified, which makes it error-prone and cumbersome. The authors evaluated p4v on both open source and proprietary P4 programs (e.g., switch.p4) that have different sizes and complexities.
Nötzli et al. [198] proposed p4pktgen, a tool that automatically generates test cases for P4 programs using symbolic execution and concrete paths. The tool accepts as input a JSON representation of the P4 program (output of the p4c compiler for BMv2), and generates test cases. These test cases consist of packets, table configurations, and expected paths. Similarly, Lukács et al. [199] described a framework for verifying functional and non-functional requirements of protocols in P4. The system translates a P4 program into a versatile symbolic formula to analyze various performance costs. The proposed approach estimates the performance cost of a P4 program prior to its execution.
Stoenescu et al. [200] proposed Vera, a symbolic execution-based verification tool for P4 programs. The authors argue in this paper that a data plane program should be verified before deployment to ensure safe operations. Vera accepts as input a P4 program, and translates it to a network verification language, SEFL. It then relies on SymNet [287], a network static analysis tool based on symbolic execution, to analyze the behavior of the resulting program. Essentially, Vera generates all possible packet layouts after inspecting the program’s parser and assumes that the header fields can accept any value. Afterwards, it tracks the paths when processing these packets in the program following all branches to completion. For scalability improvements, Vera utilizes a novel match-forest data structure to optimize updates and verification time. Parsing/deparsing errors, invalid memory accesses, loops, among others, can be detected by Vera.
A different approach that uses reinforcement learning is P4RL [201], a fuzz testing system that automatically verifies P4 switches at runtime. The authors described a query language p4q in which operators express their intended switch behavior. A prototype that executes verification on a layer-3 switch was implemented, and results show that P4RL detects various bugs and outperforms the baseline approach.
Finally, Dumitrescu et al. [202] proposed bf4, an end-to-end P4 program verification tool. It aims at guaranteeing that deployed P4 programs are bug-free. First, bf4 finds potential bugs at compile-time. Second, it automatically generates predicates that must be followed by the controller whenever a rule is to be inserted. Third, it proposes code changes if additional bugs remain reachable. bf4 executes a monitor at runtime that inspects the rules inserted by the controller and raises an exception whenever a predicate is not satisfied. The authors executed bf4 on various data plane programs and interesting bugs that were not detected in state-of-the-art approaches were discovered.
B.3. Verification Schemes Discussions
Table XXXIV compares the aforementioned verification schemes. Essentially, some schemes translate P4 programs to verification languages and engines. For instance, in [194], P4
TABLE XXXIV
VERIFICATION SCHEMES COMPARISON
Scheme   Name        Engine, language   Evaluated programs   Inconsistency detection
[194]    P4NOD       NOD                2                    ×
[195]    ASSERT-P4   KLEE               5                    ×
[197]    p4v         Z3                 23                   ×
[198]    p4pktgen    SMT                4                    ×
[199]    N/A         Pure               0                    ×
[200]    Vera        SEFL               11                   ×
[201]    P4RL        DDQN               1
[202]    bf4         Z3                 21                   ×
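The runtime half of the bf4 workflow described above (predicates derived at compile time, then checked against every rule the controller inserts) can be pictured with a minimal Python sketch. Everything in it is a hypothetical illustration and is not taken from bf4 itself: the `Rule` and `RuleMonitor` classes, the table and field names, and the example predicate are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Rule:
    """A table entry the controller wants to install (hypothetical shape)."""
    table: str
    match: Dict[str, int]
    action: str
    params: Dict[str, int] = field(default_factory=dict)


class PredicateViolation(Exception):
    """Raised when an inserted rule does not satisfy a registered predicate."""


Predicate = Callable[[Rule], bool]


class RuleMonitor:
    """Checks every inserted rule against the predicates of its table."""

    def __init__(self) -> None:
        self._predicates: Dict[str, List[Predicate]] = {}
        self.installed: List[Rule] = []

    def require(self, table: str, predicate: Predicate) -> None:
        # Register a predicate that all rules of `table` must satisfy.
        self._predicates.setdefault(table, []).append(predicate)

    def insert(self, rule: Rule) -> None:
        # Reject the rule as soon as one predicate fails; otherwise install it.
        for predicate in self._predicates.get(rule.table, []):
            if not predicate(rule):
                raise PredicateViolation(f"rule rejected for table {rule.table!r}")
        self.installed.append(rule)


def forward_needs_valid_port(rule: Rule) -> bool:
    # Assumed predicate: a 'forward' action must carry an egress port in 0..255.
    if rule.action != "forward":
        return True
    return 0 <= rule.params.get("port", -1) <= 255


monitor = RuleMonitor()
monitor.require("ipv4_lpm", forward_needs_valid_port)
monitor.insert(Rule("ipv4_lpm", {"dst": 0x0A000001}, "forward", {"port": 3}))

try:
    # Missing 'port' parameter: the monitor should refuse this rule.
    monitor.insert(Rule("ipv4_lpm", {"dst": 0x0A000002}, "forward", {}))
    rejected = False
except PredicateViolation:
    rejected = True
```

The design choice shown here is that the check sits on the insertion path, so a rule that would make a bug reachable never reaches the table; how the predicates themselves are inferred is outside the scope of this sketch.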
Traditional verification techniques that address the security properties in computer networks are mainly related to host reachability, isolation, blackholes, and loop-freedom. Techniques that check for the aforementioned properties include Anteater [288], which models the data plane as boolean functions to be used in a Boolean Satisfiability Problem (SAT) solver, NetPlumber [289], which uses header space algebra [285], and others (e.g., VeriFlow [286], DeltaNet [290], Flover [291], and VMN [292]).
Since P4 programs incorporate customized protocols and processing logic to be used in the data plane, traditional tools are not capable of handling changes to forwarding behaviors without reprogramming their internals. Therefore, verification techniques in programmable networks rely on analyzing the P4 programs themselves since they define the behavior of the data plane.
C. Summary and Lessons Learned
Network testing can generally be divided into debugging/troubleshooting network problems and verifying the behavior of forwarding devices. While traditional tools and techniques were adequate for non-programmable networks, they are insufficient for programmable ones due to their inability to handle changes to forwarding behaviors without reprogramming and restructuring their internals. A variety of works were proposed to analyze and model P4 programs in order to troubleshoot and verify the correctness of networks’ operations.
XIII. CHALLENGES AND FUTURE TRENDS
In this section, a number of research and operational challenges that correspond to the proposed taxonomy are outlined. The challenges are extracted after comprehensively reviewing and diving into each work in the described literature. Further, the section discusses and pinpoints several initiatives for future work which could be worthy of being pursued in this imperative field of programmable switches. The challenges and the future trends are illustrated in Fig. 22.
Fig. 22. Challenges and future trends. The references represent examples of existing works that tackle the corresponding future trends.
A. Memory Capacity (SRAM and TCAM)
Stateful processing is a key enabler for programmable data planes as it allows applications to store and retrieve data across different packets. This advantage enabled a wide range of novel applications (e.g., in-network caching, fine grained measurements, stateful load balancing, etc.) that were not possible in non-programmable networks. The amount of data stored in the switch is limited by the size of the on-chip memory, which ranges from tens to hundreds of megabytes at most. Consequently, the majority of stateful applications face trade-offs between performance and memory usage. For instance, the efficiency of caching, which is determined by the hit rate, is directly affected by the memory size. Furthermore, the vast majority of measurement applications require storing statistics in the data plane (e.g., byte/packet counters). The number of flows to be measured and the richness of measurement information are bounded by the size of the memory in the switch.
Current and future initiatives. A notable work by Kim et al. [295, 296] suggests accessing remote Dynamic Random Access Memory (DRAM) installed on data center servers purely from the data plane to expand the available memory on the switch. The bandwidth of the chip is traded for the bandwidth needed to access the external DRAM. The approach is cheap and flexible since it reuses existing resources in commodity hardware without adding infrastructure costs. The system is realized by allowing the data plane to access remote memory through an access channel (RDMA over Converged Ethernet (RoCE)) as shown in Fig. 23. The implementation shows that the proposal achieves throughput close to the line rate, and incurs only 1-2 microseconds of extra latency (Fig. 24). There are some limitations in this approach that can be explored in the future.
• The current implementation only supports address-based memory access, and hence, complicated data layouts and ternary matching in remote memory should be explored.
• Frequent updates in the remote memory requires several
Fig. 24. Accessing remote DRAM latency overhead. Achieved throughput close to the line rate (≈ 37.5 Gbps) [295].
D. Network-wide Cooperation
The SDN architecture suggests using a centralized controller for network-wide switches management. Through centralization, the state of each programmable switch can be shared with other switches. Consequently, applications will have the ability to make better decisions as network-wide data is available locally on the switch. The problem with such architecture is the requirement of having a continuous exchange of packets with a software-based system. As an alternative, switches can exchange messages to synchronize their states in a decentralized manner.
Consider Fig. 25, which shows an in-network DDoS defense solution. Each switch maintains a list of senders and their corresponding numbers of bytes. A switch compares the number of bytes transmitted from a given flow to a threshold. When the threshold is crossed, the flow is blocked and the device is identified as a malicious DDoS sender. Assume that the network implements a load balancing mechanism that distributes traffic across the switches. In the scenario where switches do not consider the byte counts of other switches (Fig. 25 (a)), the traffic of a DDoS device might remain under the threshold. On the other hand, when switches synchronize their states by sharing the byte counts (Fig. 25 (b)), the total number of bytes is compared against the threshold. Consequently, the total load of a DDoS device is considered. This example demonstrates an application that heavily depends on network-wide cooperation and hence motivates the need for state synchronization.
Fig. 25. (a) Local detection of DDoS attacks. (b) Network-wide detection of DDoS attacks. [In (a), switch S1 holds only its local count C1 for sender IPA, with C1 < T; in (b), S1 also holds the total count C1 + C2, with C1 + C2 > T.]
Current and Future Initiatives. Arashloo et al. [298] proposed SNAP, a centralized stateful programming model that aims at solving the synchronization problem. SNAP introduced the idea of writing programs for “one big switch” instead of many. Essentially, developers write stateful applications without caring about the distribution, placement, and optimization of access to resources. SNAP is limited to one replica of each state in the network. Sviridov et al. [299, 300] proposed LODGE and LOADER to extend SNAP and enable multiple replicas. Luo et al. [301] proposed Swing State, a framework for runtime state migration and management. This approach leverages existing traffic to piggyback state updates between cooperating switches. Swing State overcomes the challenges of the SDN-based architecture by synchronizing the states entirely in the data plane, at line rate, and without intervention from the control plane. There are several limitations with this approach. First, there are no message delivery guarantees (i.e., packets dropped/reordered are not retransmitted), leading to inconsistency in the states among the switches. Second, it does not merge the states if two switches share common states. Third, the overhead can significantly increase if a single state is mirrored several times. Finally, there is no authentication of data or senders. Xing et al. [302] proposed P4Sync, a system that migrates states between switches in the data plane while guaranteeing the authenticity of the senders and the exchanged data. P4Sync addresses the limitations of existing approaches. It guarantees the completeness of the migration, ensuring that the snapshot transfer is completed. Moreover, it solves the overhead of the repeatedly retransmitted updates. An interesting aspect of P4Sync is its ability to control the migration traffic rate depending on the changing network conditions. Zeno et al. [303] presented a design of SwiSh-mem, a management layer that facilitates the deployment of network functions (NFs) on multiple switches by managing the distributed shared states.
Future work in this area should consider handling frequent state migrations. Some systems require migration packets to be generated each RTT, causing increased traffic overhead and additional expensive authentication operations. For instance, P4Sync uses public key cryptography in the control plane to sign and verify the end of the migration sequence chain (2.15ms for signing and 0.07ms to verify using RSA-2048 signature). Frequent migrations would cause this signature to be invoked repeatedly. Another major concern that should be handled in future work is denial of service. Even when migration updates are authenticated, changes in the packets cause the receiver to reject updates, leading to state inconsistency among switches.
E. Control Plane Intervention
Delegating tasks to the control plane incurs latency and affects the application’s performance. For instance, in congestion control, rerouting-based schemes often use tables to store alternative routes. Since the data plane cannot directly modify table entries, intervention from the control plane is required. The interaction with the control plane in this application hampers the promptness of rerouting. Another example is methods that use collision-free hashing. For example, cuckoo hash [305], which rearranges items to solve collisions, uses a complex search algorithm that cannot run on the switch ASIC, and is often executed on the switch CPU. Ideally, the control plane intervention should be minimized when possible. For example, to synchronize the state among switches, in-network cooperation should be considered.
Current and Future Initiatives. The design of the interaction between the control plane and the data plane is fully decided by the developer. Experienced developers might have enough background to immediately minimize such interaction. Future work should devise algorithms and tools that automatically determine the excessive interaction between the control/data
planes, and suggest alternative workflows (ideally, as generated deploy programmable switches in an incremental fashion. That
codes) to minimize such interaction. is, P4 switches will be added to the network alongside the
existing legacy devices. While this solution seems simplistic
F. Security at first, studies have showed that partial deployment leads
to reduced effectiveness [162]. For instance, the accuracy of
When designing a system for the data plane, the developer
heavy hitter detection schemes is strongly affected by the flow
must envision the kind of traffic a malicious user can initiate
visibility. The work in [162] devised a greedy algorithm that
to corrupt the operation of the system. This class of attacks is
attempts to strategically position P4 switches in the network,
referred to as sensitivity attacks as coined in [179]. Essentially,
with the goal of monitoring as many distinct network flows
an attacker can intelligently craft traffic patterns to trigger
as possible. The F1 score is used to quantify correctness of
unexpected behaviors of a system in the data plane. For
switches placement. Future work in this area should consider
instance, a load balancer that balances traffic through packet
generalizing and enhancing this approach to work with any P4
headers hashing without cyrptographic support (e.g., modulo
application, and not only heavy hitter detection. For instance,
operator on the number of available paths) can be tricked by an
a future work could suggest the positioning of P4 switches in
attacker that craft skewed traffic patterns. This results in traffic
being forwarded to a single path, leading to congestion, link saturation, and denial of service. Another example is attacks against in-network caching. Caching in the data plane performs well when requests are mostly reads rather than writes. If an attacker continuously generates highly skewed write requests, the load on the storage servers becomes imbalanced. If the system is designed to handle write queries on hot items in the switch, a random failure in the switch causes data to be lost. Further, an attacker can exploit the memory limitation of the switch and request diverse values, causing the pre-cached values to be evicted.
Current and Future Initiatives. To mitigate sensitivity attacks, a developer attempts to discover various unpredicted traffic patterns and develops defense strategies accordingly. Such a solution is highly unreliable, time-consuming, and error-prone. Recent efforts [179] aimed at automatically discovering sensitivity attacks in the data plane. Essentially, the proposed system aims at deriving traffic patterns that would drive the program away from common-case behavior as much as possible. Other efforts focused on architecting defenses in the data plane that perform distributed mode changes upon attack discovery [178]. Future work in this direction should consider achieving high assurance by formally verifying the code. Additionally, the stability of the data plane should be carefully handled with fast mode changes; future work could consider integrating self-stabilizing systems for this purpose. Finally, future work should provide security interfaces for collaborating switches that belong to different domains. It is also worth exposing sensitivity attack patterns for different application types so that data plane developers can avoid the vulnerabilities that trigger those attacks in their code.

applications such as in-network caching, accelerated consensus, and in-network defenses, while taking into account the current topology consisting of legacy devices.

H. Programming Simplicity and Modularity
Writing in-network applications using the P4 language is not an easy task. Recent studies have shown that many existing P4 programs have several bugs that might lead to network disruption [191]. For several decades, the networking industry operated in a bottom-up approach, where switches are equipped with fixed-function ASICs. Consequently, little to no programming skills were needed by network operators. With the advent of programmable switches, operators are now expected to have experience in programming the ASIC².
Current and Future Initiatives. Since programming the ASIC is not a straightforward task, future research endeavours should consider simplifying the programming workflow for the operators and generating code (e.g., [293]). For instance, graphical tools can be developed to translate workflows (e.g., flowcharts) to P4 programs that can fit into the hardware. Further, future work should develop tools that allow operators to enable features (i.e., program modules) that will translate to P4 programs. As an analogy, consider the mobile application stores (e.g., Play store, Apple store). The user simply downloads and installs an application on the device, without having to understand anything about programming. An interesting line of work could investigate the idea of creating a store for P4 applications where operators select the “apps” they want to activate, and the result is a generated P4 program optimized to fit in the hardware, considering the different targets available in the market today (e.g., Tofino). Recent efforts attempted to merge and test modular programs in P4 [294].
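
As a rough illustration of the "store for P4 applications" idea, the following Python sketch lets an operator pick modules by name and produces a program skeleton only if the selection fits the target. Everything in it is an assumption made for illustration: the catalog, the module and file names, the per-module stage counts, and the 12-stage budget are hypothetical, and the fit check simply sums stages, which is a pessimistic stand-in for what a real compiler would do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """One selectable "app" in a hypothetical P4 module catalog."""
    include: str   # hypothetical P4 source file providing the module
    control: str   # hypothetical name of the control block to invoke
    stages: int    # assumed number of match-action stages it occupies


# Hypothetical catalog; names and stage counts are illustrative only.
CATALOG = {
    "l3_forwarding": Module("l3_forwarding.p4", "L3Forwarding", 3),
    "int_telemetry": Module("int_telemetry.p4", "IntTelemetry", 4),
    "heavy_hitter": Module("heavy_hitter.p4", "HeavyHitter", 5),
    "kv_cache": Module("kv_cache.p4", "KvCache", 6),
}


def generate_program(selected, stage_budget):
    """Return a P4-style skeleton for the selected modules.

    Raises KeyError for unknown module names and ValueError when the
    selection does not fit. Fit is checked by summing stages, i.e. by
    assuming modules run back to back and never share a stage.
    """
    unknown = [name for name in selected if name not in CATALOG]
    if unknown:
        raise KeyError(f"unknown modules: {unknown}")
    modules = [CATALOG[name] for name in selected]
    needed = sum(m.stages for m in modules)
    if needed > stage_budget:
        raise ValueError(f"need {needed} stages, target offers {stage_budget}")
    lines = [f'#include "{m.include}"' for m in modules]
    lines.append("// Ingress apply order (one call per selected module):")
    lines += [f"//   {m.control}.apply(...);" for m in modules]
    return "\n".join(lines)


if __name__ == "__main__":
    # Assumed budget of 12 stages for the hypothetical target.
    print(generate_program(["l3_forwarding", "int_telemetry"], stage_budget=12))
```

A real tool would replace the stage sum with feedback from the target compiler, since stage sharing, table dependencies, and memory limits all affect whether a composed program actually fits.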
G. Interoperability
Programmable switches pave the way for a wide range of innovative in-network applications. The literature has shown that significant performance improvements are achieved when applications offload their processing logic to the network. Despite such facts, it is very unlikely that mobile operators will replace their current infrastructure with programmable switches in one shot. This unlikelihood comes from the fact that major operational and budgeting costs will be incurred.
Current and Future Initiatives. Network operators might

² Note that most vendors (e.g., Barefoot Networks) provide a program (switch.p4) that expresses the forwarding plane of a switch, with the typical features of an advanced layer-2 and layer-3 switch. If the goal is to simply deploy a switch with no in-network applications, then the operators are not required to program the chip. They just need to learn the interaction between the control plane and the data plane (e.g., to populate table entries).

XIV. CONCLUSIONS
This article presents an exhaustive survey on programmable data planes. The survey describes the evolution of networking by discussing the traditional control plane and the transition to
SDN. Afterwards, the survey motivates the need for programming the data plane and delves into the general architecture of a programmable switch (PISA). A brief description of P4, the de facto language for programming the data plane, is then presented. Motivated by the increasing trend in programming the data plane, the survey provides a taxonomy that sheds light on numerous significant works and compares schemes within each category in the taxonomy and with those in legacy approaches. The survey concludes by discussing challenges and considerations as well as various future trends and initiatives.

ACKNOWLEDGEMENT
This material is based upon work supported by the National Science Foundation under grant numbers 1925484 and 1829698, funded by the Office of Advanced Cyberinfrastructure (OAC).

TABLE XXXV
ABBREVIATIONS USED IN THIS ARTICLE

Abbreviation  Term
ABR           Adaptive Bit Rate
ACK           Acknowledgement
ACL           Access Control List
AFQ           Approximate Fair Queueing
AIMD          Additive Increase Multiplicative Decrease
ALU           Arithmetic Logical Unit
API           Application Programming Interface
AQM           Active Queue Management
AS            Autonomous System
ASIC          Application-specific Integrated Circuit
ATPG          Automatic Test Packet Generation
ATT           Attribute Protocol
BBR           Bottleneck Bandwidth and Round-trip Time
BDD           Binary Decision Diagram
BFT           Byzantine Fault Tolerance
BGP           Border Gateway Protocol
BIER          Bit Index Explicit Replication
BLE           Bluetooth Low Energy
BLESS         Bluetooth Low Energy Service Switch
BMv2          Behavioral Model Version 2
BNN           Binary Neural Network
BQPS          Billion Queries Per Second
BYOD          Bring Your Own Device
CAIDA         Center of Applied Internet Data Analysis
CC            Congestion Control
CNN           Convolutional Neural Network
CoDel         Controlled Delay
CPU           Central Processing Unit
CRC           Cyclic Redundancy Check
CWND          Congestion Window
DCQCN         Data Center Quantized Congestion Notification
DCTCP         Data Center Transmission Control Protocol
DDoS          Distributed Denial-of-Service
DIP           Direct Internet Protocol
DMA           Direct Memory Access
DMZ           Demilitarized Zone
DNS           Domain Name Server
DPDK          Data Plane Development Kit
DRAM          Dynamic Random Access Memory
DSP           Digital Signal Processors
ECMP          Equal-Cost Multi-Path Routing
ECN           Explicit Congestion Notification
ESP           Encapsulating Security Payload
FAST          Flow-level State Transitions
FCT           Flow Completion Time
FIB           Forwarding Information Base
FPGA          Field-programmable Gate Array
FQ            Fair Queueing
GPU           Graphics Processing Unit
GRE           Generic Routing Encapsulation
HCF           Hop-Count Filtering
HSA           Header Space Analysis
HTCP          Hamilton Transmission Control Protocol
HTTP          Hypertext Transfer Protocol
IDS           Intrusion Detection System
IGMP          Internet Group Management Protocol
IKE           Internet Key Exchange
ILP           Integer Linear Programming
INT           In-band Network Telemetry
IoT           Internet of Things
IP            Internet Protocol
ISP           Internet Service Provider
JSON          JavaScript Object Notation
KDN           Knowledge-defined Networking
KPI           Key Performance Indicator
LAN           Local Area Network
LFA           Link Flooding Attack
LPM           Longest Prefix Match
LPWAN         Low Power Wide Area Network
LTE           Long Term Evolution
MAC           Medium Access Control
MAU           Match-Action Unit
MCM           Multicolor Markers
MIMD          Multiplicative Increase Multiplicative Decrease
ML            Machine Learning
MOS           Mean Opinion Score
MPC           Mobile Packet Core
MQTT          Message Queueing Telemetry Transport
MSS           Maximum Segment Size
MPTCP         Multipath Transmission Control Protocol
MTU           Maximum Transmission Unit
NACK          Negative Acknowledgement
NAT           Network Address Translation
NDA           Non-disclosure Agreement
NDN           Named Data Networking
NFV           Network Functions Virtualization
NIC           Network Interface Controller
NN            Neural Networks
NSH           Network Service Header
ONOS          Open Network Operating System
OSPF          Open Shortest Path First
OUM           Ordered Unreliable Multicast
OVS           Open Virtual Switch
P2P           Peer-to-peer
PBT           Postcard-Based Telemetry
PCC           Performance-oriented Congestion Control
PCC           Per-Connection Consistency
PD            Program Dependent
PGW           Packet Data Network Gateway
PI            Protocol Independent
PIE           Proportional Integral Controller Enhanced
PISA          Protocol Independent Switch Architecture
QoE           Quality of Experience
QoS           Quality of Service
RAM           Random-Access Memory
RDMA          Remote Direct Memory Access
RED           Random Early Detection
REST          Representational State Transfer
RFC           Request for Comments
RMT           Reconfigurable Match-action Tables
RSA           Rivest-Shamir-Adleman
RSS           Really Simple Syndication
RTT           Round-trip Time
RWND          Receiver Window
SAD           Security Association Database
SAT           Boolean Satisfiability Problem
SDN           Software Defined Networking
SHA           Secure Hash Algorithm
SIP           Session Initiation Protocol
SLA           Service Level Agreement
SNMP          Simple Network Management Protocol
SPD           Security Policy Database
SRAM          Static Random-Access Memory
SSH           Secure Shell
TCAM          Ternary Content-Addressable Memory
TCP           Transmission Control Protocol
TM            Traffic Management
ToR           The Onion Router
TPU           Tensor Processing Unit
TTL           Time-to-Live
UDP           User Datagram Protocol
UE            User Equipment
VIP           Virtual Internet Protocol
VMN           Verifying Mutable Networks
VN            Virtual Network
VoLTE         Voice over Long-term Evolution
VXLAN         Virtual eXtensible Local Area Network
WAN           Wide Area Network
XDP           eXpress Data Path

REFERENCES
[1] N. McKeown, “How we might get humans out of the way.” Open Networking Foundation (ONF) Connect 19, Sep. 2019. [Online]. Available: https://tinyurl.com/y4dnxacz.
[2] RFC Editor, “Number of RFCs published per year.” [Online]. Available: https://www.rfc-editor.org/rfcs-per-year/.
[3] B. Trammell and M. Kuehlewind, “Report from the IAB workshop on stack evolution in a middlebox Internet (SEMI),” RFC7663. [Online]. Available: https://tools.ietf.org/html/rfc7663.
[4] G. Papastergiou, G. Fairhurst, D. Ros, A. Brunstrom, K.-J. Grinnemo, P. Hurtig, N. Khademi, M. Tüxen, M. Welzl, D. Damjanovic, and S. Mangiante, “De-ossifying the Internet transport layer: a survey and future perspectives,” IEEE Communications Surveys & Tutorials, vol. 19, no. 1, pp. 619–639, 2016.
[5] “VMware, Cisco stretch virtual LANs across the heavens.” in The Register, Aug. 2011. [Online]. Available: https://tinyurl.com/y6mxhqzn.
[6] M. Mahalingam, D. Dutt, K. Duda, P. Agarwal, L. Kreeger, T. Sridhar, M. Bursell, and C. Wright, “Virtual eXtensible Local Area Network (VXLAN): a framework for overlaying virtualized layer 2 networks over layer 3 networks,” RFC7348. [Online]. Available: http://www.rfc-editor.org/rfc/rfc7348.txt.
[7] M. Casado, M. J. Freedman, J. Pettit, J. Luo, N. McKeown, and S. Shenker, “Ethane: taking control of the enterprise,” ACM SIGCOMM computer communication review, vol. 37, no. 4, pp. 1–12, 2007.
[8] D. Kreutz, F. M. Ramos, P. E. Verissimo, C. E. Rothenberg, S. Azodolmolky, and S. Uhlig, “Software-defined networking: a comprehensive survey,” Proceedings of the IEEE, vol. 103, no. 1, pp. 14–76, 2014.
[9] P. Bosshart, D. Daly, G. Gibb, M. Izzard, N. McKeown, J. Rexford, C. Schlesinger, D. Talayco, A. Vahdat, and G. Varghese, “P4: programming protocol-independent packet processors,” ACM SIGCOMM Computer Communication Review, vol. 44, no. 3, pp. 87–95, 2014.
[10] Barefoot Networks, “Use cases.” [Online]. Available: https://www.barefootnetworks.com/use-cases/.
[11] A. Weissberger, “Comcast: ONF Trellis software is in production together with L2/L3 white box switches.” [Online]. Available: https://tinyurl.com/y69jc7sv.
[12] N. Akiyama, M. Nishiki, “P4 and Stratum use case for new edge cloud.” [Online]. Available: https://tinyurl.com/yxuoo9qv.
[13] Stordis GmbH, “New STORDIS advanced programmable switches (APS) first to unlock the full potential of P4 and next generation software defined networking (NG-SDN).” [Online]. Available: https://tinyurl.com/y3kjnypl.
[14] Open Networking Foundation (ONF), “Stratum – ONF launches major new open source SDN switching platform with support from Google.” [Online]. Available: https://tinyurl.com/yy3ykw7g.
[15] P4.org Community, “P4 gains broad adoption, joins Open Networking Foundation (ONF) and Linux Foundation (LF) to accelerate next phase of growth and innovation.” [Online]. Available: https://p4.org/p4/p4-joins-onf-and-lf.html.
[16] Facebook engineering, “Disaggregate: networking recap.” [Online]. Available: https://tinyurl.com/yxoaj7kw.
[17] Open Compute Project (OCP), “Alibaba DC network evolution with open SONiC and programmable HW.” [Online]. Available: https://www.opencompute.org/files/OCP2018.alibaba.pdf.
[18] S. Heule, “Using P4 and P4Runtime for optimal L3 routing.” [Online]. Available: https://tinyurl.com/y365gnqy.
[19] N. McKeown, “SDN phase 3: getting the humans out of the way. ONF Connect 19.” [Online]. Available: https://tinyurl.com/tp9bxw4.
[20] Edgecore, “Wedge 100BF-32X, 100GbE data center switch,” 2020. [Online]. Available: https://tinyurl.com/sy2jkqe.
[21] STORDIS, “The new advanced programmable switches are available.” [Online]. Available: https://www.stordis.com/products/.
[22] Cisco, “Cisco Nexus 34180YC and 3464C programmable switches data sheet.” [Online]. Available: https://tinyurl.com/y92cbdxe.
[23] Arista, “Arista 7170 series.” [Online]. Available: https://www.arista.com/en/products/7170-series.
[24] Juniper Networks, “Juniper advancing disaggregation through P4Runtime integration.” [Online]. Available: https://tinyurl.com/yygz547t.
[25] Interface Masters, “Tahoe 2624.” [Online]. Available: https://interfacemasters.com/products/switches/10g-40g/tahoe-2624/.
[26] Barefoot Networks, “Tofino ASIC.” [Online]. Available: https://www.barefootnetworks.com/products/brief-tofino/.
[27] Xilinx, “Xilinx solutions.” [Online]. Available: https://www.xilinx.com/products/silicon-devices.html.
[28] Pensando, “The Pensando distributed services platform.” [Online]. Available: https://pensando.io/our-platform/.
[29] Mellanox, “Empowering the next generation of secure cloud SmartNICs.” [Online]. Available: https://www.mellanox.com/products/smartnic.
[30] Innovium, “Teralynx switch silicon.” [Online]. Available: https://www.innovium.com/teralynx/.
[31] I. Baldin, J. Griffioen, K. Wang, I. Monga, A. Nikolich, “Mid-Scale RI-1 (M1:IP): FABRIC: adaptive programmable research infrastructure for computer science and science applications.” [Online]. Available: https://tinyurl.com/y463v9z9.
[32] FABRIC, “About FABRIC.” [Online]. Available: https://fabric-testbed.net/about/.
[33] J. Mambretti, J. Chen, F. Yeh, and S. Y. Yu, “International P4 networking testbed,” in SC19 Network Research Exhibition, 2019.
[34] 2STiC, “A national programmable infrastructure to experiment with next-generation networks.” [Online]. Available: https://www.2stic.nl/national-programmable-infrastructure.html.
[35] H. Stubbe, “P4 compiler & interpreter: a survey,” Future Internet (FI) and Innovative Internet Technologies and Mobile Communication (IITM), vol. 47, 2017.
[36] T. Dargahi, A. Caponi, M. Ambrosin, G. Bianchi, and M. Conti, “A survey on the security of stateful SDN data planes,” IEEE Communications Surveys & Tutorials, vol. 19, no. 3, pp. 1701–1725, 2017.
[37] W. L. da Costa Cordeiro, J. A. Marques, and L. P. Gaspary, “Data plane programmability beyond OpenFlow: opportunities and challenges for network and service operations and management,” Journal of Network and Systems Management, vol. 25, no. 4, pp. 784–818, 2017.
[38] A. Satapathy, “Comprehensive study of P4 programming language and software-defined networks,” 2018. [Online]. Available: https://tinyurl.com/y4d4zma9.
[39] R. Bifulco and G. Rétvári, “A survey on the programmable data plane: abstractions, architectures, and open problems,” in 2018 IEEE 19th International Conference on High Performance Switching and Routing (HPSR), pp. 1–7, IEEE, 2018.
[40] E. Kaljic, A. Maric, P. Njemcevic, and M. Hadzialic, “A survey on data plane flexibility and programmability in software-defined networking,” IEEE Access, vol. 7, pp. 47804–47840, 2019.
[41] P. G. Kannan and M. C. Chan, “On programmable networking evolution,” CSI Transactions on ICT, vol. 8, no. 1, pp. 69–76, 2020.
[42] L. Tan, W. Su, W. Zhang, J. Lv, Z. Zhang, J. Miao, X. Liu, and N. Li, “In-band network telemetry: A survey,” Computer Networks, p. 107763, 2020.
[43] X. Zhang, L. Cui, K. Wei, F. P. Tso, Y. Ji, and W. Jia, “A survey on stateful data plane in software defined networks,” Computer Networks, p. 107597, 2020.
[44] G. Bianchi, M. Bonola, A. Capone, and C. Cascone, “OpenState: programming platform-independent stateful OpenFlow applications inside the switch,” ACM SIGCOMM Computer Communication Review, vol. 44, no. 2, pp. 44–51, 2014.
[45] M. Moshref, A. Bhargava, A. Gupta, M. Yu, and R. Govindan, “Flow-level state transition as a new switch primitive for SDN,” in Proceedings of the third workshop on Hot topics in software defined networking, pp. 61–66, 2014.
[46] P4 Language Consortium, “P4Runtime.” [Online]. Available: https://github.com/p4lang/PI/.
[47] Y. Rekhter, T. Li, and S. Hares, “A border gateway protocol 4 (bgp-4),” RFC4271. http://www.rfc-editor.org/rfc/rfc4271.txt.
[48] N. McKeown, T. Anderson, H. Balakrishnan, G. Parulkar, L. Peterson, J. Rexford, S. Shenker, and J. Turner, “Openflow: enabling innovation in campus networks,” ACM SIGCOMM Computer Communication Review, vol. 38, no. 2, pp. 69–74, 2008.
[49] N. McKeown, “Why does the Internet need a programmable forwarding plane.” [Online]. Available: https://tinyurl.com/y6x7qqpm.
[50] A. Shapiro, “P4-programming data plane use-cases.” in P4 Expert Roundtable Series, April 28-29, 2020. [Online]. Available: https://tinyurl.com/y5n4k83h.
[51] C. Kim, “Evolution of networking, Networking Field Day 21, 2:01,” 2019. [Online]. Available: https://tinyurl.com/y9fkj7qx.
[52] Z. Liu, J. Bi, Y. Zhou, Y. Wang, and Y. Lin, “Netvision: towards network telemetry as a service,” in 2018 IEEE 26th International Conference on Network Protocols (ICNP), pp. 247–248, IEEE, 2018.
[53] J. Hyun, N. Van Tu, and J. W.-K. Hong, “Towards knowledge-defined networking using in-band network telemetry,” in NOMS 2018-2018 IEEE/IFIP Network Operations and Management Symposium, pp. 1–7, IEEE, 2018.
[54] Y. Kim, D. Suh, and S. Pack, “Selective in-band network telemetry for overhead reduction,” in 2018 IEEE 7th International Conference on Cloud Networking (CloudNet), pp. 1–3, IEEE, 2018.
[55] J. A. Marques, M. C. Luizelli, R. I. T. da Costa Filho, and L. P. Gaspary, “An optimization-based approach for efficient network monitoring using in-band network telemetry,” Journal of Internet Services and Applications, vol. 10, no. 1, p. 12, 2019.
[56] B. Niu, J. Kong, S. Tang, Y. Li, and Z. Zhu, “Visualize your IP-over-optical network in realtime: a P4-based flexible multilayer in-band network telemetry (ML-INT) system,” IEEE Access, vol. 7, pp. 82413–82423, 2019.
[57] R. Ben Basat, S. Ramanathan, Y. Li, G. Antichi, M. Yu, and M. Mitzenmacher, “PINT: probabilistic in-band network telemetry,” in Proceedings of the Annual conference of the ACM Special Interest Group on Data Communication on the applications, technologies, architectures, and protocols for computer communication, pp. 662–680, 2020.
[58] N. Van Tu, J. Hyun, and J. W.-K. Hong, “Towards ONOS-based SDN monitoring using in-band network telemetry,” in 2017 19th Asia-Pacific Network Operations and Management Symposium (APNOMS), pp. 76–81, IEEE, 2017.
[59] Serkant, “Prometheus INT exporter.” [Online]. Available: https://github.com/serkantul/prometheus_int_exporter/.
[60] N. Van Tu, J. Hyun, G. Y. Kim, J.-H. Yoo, and J. W.-K. Hong, “IntCollector: a high-performance collector for in-band network telemetry,” in 2018 14th International Conference on Network and Service Management (CNSM), pp. 10–18, IEEE, 2018.
[61] Barefoot Networks, “Barefoot Deep Insight - product brief.” [Online]. Available: https://tinyurl.com/u2ncvry.
[62] Broadcom, “BroadView Analytics, Trident 3 in-band telemetry.” [Online]. Available: https://tinyurl.com/yxr2qydb.
[63] M. Handley, C. Raiciu, A. Agache, A. Voinescu, A. W. Moore, G. Antichi, and M. Wójcik, “Re-architecting datacenter networks and stacks for low latency and high performance,” in Proceedings of the Conference of the ACM Special Interest Group on Data Communication, pp. 29–42, 2017.
[64] B. Turkovic, F. Kuipers, N. van Adrichem, and K. Langendoen, “Fast network congestion detection and avoidance using P4,” in Proceedings of the 2018 Workshop on Networking for Emerging Applications and Technologies, pp. 45–51, 2018.
[65] Y. Li, R. Miao, H. H. Liu, Y. Zhuang, F. Feng, L. Tang, Z. Cao, M. Zhang, F. Kelly, and M. Y. Alizadeh, Mohammad, “HPCC: high precision congestion control,” in Proceedings of the ACM Special Interest Group on Data Communication, pp. 44–58, 2019.
[66] A. Feldmann, B. Chandrasekaran, S. Fathalli, and E. N. Weyulu, “P4-enabled network-assisted congestion feedback: a case for NACKs,” 2019.
[67] E. F. Kfoury, J. Crichigno, E. Bou-Harb, D. Khoury, and G. Srivastava, “Enabling TCP pacing using programmable data plane switches,” in 2019 42nd International Conference on Telecommunications and Signal Processing (TSP), pp. 273–277, IEEE, 2019.
[68] B. Turkovic and F. Kuipers, “P4air: Increasing fairness among competing congestion control algorithms,” 2020.
[69] Y. Li, R. Miao, C. Kim, and M. Yu, “Flowradar: A better NetFlow for data centers,” in 13th USENIX Symposium on Networked Systems Design and Implementation (NSDI), pp. 311–324, 2016.
[70] Z. Liu, A. Manousis, G. Vorsanger, V. Sekar, and V. Braverman, “One sketch to rule them all: rethinking network flow monitoring with UnivMon,” in Proceedings of the 2016 ACM SIGCOMM Conference, pp. 101–114, 2016.
[71] S. Narayana, A. Sivaraman, V. Nathan, P. Goyal, V. Arun, M. Alizadeh, V. Jeyakumar, and C. Kim, “Language-directed hardware design for network performance monitoring,” in Proceedings of the Conference of the ACM Special Interest Group on Data Communication, pp. 85–98, 2017.
[72] M. Ghasemi, T. Benson, and J. Rexford, “Dapper: data plane performance diagnosis of TCP,” in Proceedings of the Symposium on SDN Research, pp. 61–74, 2017.
[73] T. Yang, J. Jiang, P. Liu, Q. Huang, J. Gong, Y. Zhou, R. Miao, X. Li, and S. Uhlig, “Elastic sketch: adaptive and fast network-wide measurements,” in Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication, pp. 561–575, 2018.
[74] N. Yaseen, J. Sonchack, and V. Liu, “Synchronized network snapshots,” in Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication, pp. 402–416, 2018.
[75] R. Joshi, T. Qu, M. C. Chan, B. Leong, and B. T. Loo, “Burstradar: practical real-time microburst monitoring for datacenter networks,” in Proceedings of the 9th Asia-Pacific Workshop on Systems, pp. 1–8, 2018.
[76] M. Lee and J. Rexford, “Detecting violations of service-level agreements in programmable switches,” 2018. [Online]. Available: https://p4campus.cs.princeton.edu/pubs/mackl_thesis_paper.pdf.
[77] J. Sonchack, O. Michel, A. J. Aviv, E. Keller, and J. M. Smith, “Scaling hardware accelerated network monitoring to concurrent and dynamic queries with *Flow,” in 2018 USENIX Annual Technical Conference (USENIX ATC 18), pp. 823–835, 2018.
[78] J. Sonchack, A. J. Aviv, E. Keller, and J. M. Smith, “Turboflow: Information rich flow record generation on commodity switches,” in Proceedings of the Thirteenth EuroSys Conference, pp. 1–16, 2018.
[79] A. Gupta, R. Harrison, M. Canini, N. Feamster, J. Rexford, and W. Willinger, “Sonata: query-driven streaming network telemetry,” in Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication, pp. 357–371, 2018.
[80] X. Chen, S. L. Feibish, Y. Koral, J. Rexford, O. Rottenstreich, S. A. Monetti, and T.-Y. Wang, “Fine-grained queue measurement in the data plane,” in Proceedings of the 15th International Conference on Emerging Networking Experiments And Technologies, pp. 15–29, 2019.
[81] Z. Liu, S. Zhou, O. Rottenstreich, V. Braverman, and J. Rexford, “Memory-efficient performance monitoring on programmable switches with lean algorithms,” in Symposium on Algorithmic Principles of Computer Systems (APoCS), 2020.
[82] T. Holterbach, E. C. Molero, M. Apostolaki, A. Dainotti, S. Vissicchio, and L. Vanbever, “Blink: fast connectivity recovery entirely in the data plane,” in 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19), pp. 161–176, 2019.
[83] D. Ding, M. Savi, and D. Siracusa, “Estimating logarithmic and exponential functions to track network traffic entropy in P4,” in IEEE/IFIP Network Operations and Management Symposium (NOMS), 2019.
[84] W. Wang, P. Tammana, A. Chen, and T. E. Ng, “Grasp the root causes in the data plane: diagnosing latency problems with SpiderMon,” in Proceedings of the Symposium on SDN Research, pp. 55–61, 2020.
[85] R. Teixeira, R. Harrison, A. Gupta, and J. Rexford, “PacketScope: monitoring the packet lifecycle inside a switch,” in Proceedings of the Symposium on SDN Research, pp. 76–82, 2020.
[86] J. Bai, M. Zhang, G. Li, C. Liu, M. Xu, and H. Hu, “FastFE: accelerating ML-based traffic analysis with programmable switches,” in Proceedings of the Workshop on Secure Programmable Network Infrastructure, SPIN ’20, p. 1–7, Association for Computing Machinery, 2020.
[87] X. Chen, H. Kim, J. M. Aman, W. Chang, M. Lee, and J. Rexford, “Measuring TCP round-trip time in the data plane,” in Proceedings of the Workshop on Secure Programmable Network Infrastructure, pp. 35–41, 2020.
[88] Y. Qiu, K.-F. Hsu, J. Xing, and A. Chen, “A feasibility study on time-aware monitoring with commodity switches,” in Proceedings of the Workshop on Secure Programmable Network Infrastructure, pp. 22–
[128] R. Kundel, C. Gärtner, M. Luthra, S. Bhowmik, and B. Koldehofe, “Flexible content-based publish/subscribe over programmable data planes,” in NOMS 2020-2020 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5, IEEE, 2020.
[129] J. Li, E. Michael, N. K. Sharma, A. Szekeres, and D. R. Ports, “Just say NO to paxos overhead: replacing consensus with network ordering,” in 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI), pp. 467–483, 2016.
[130] H. T. Dang, M. Canini, F. Pedone, and R. Soulé, “Paxos made switch-y,” ACM SIGCOMM Computer Communication Review, vol. 46, no. 2, pp. 18–24, 2016.
[131] J. Li, E. Michael, and D. R. Ports, “Eris: coordination-free consistent transactions using in-network concurrency control,” in Proceedings of the 26th Symposium on Operating Systems Principles, pp. 104–120, 2017.
[132] B. Han, V. Gopalakrishnan, M. Platania, Z.-L. Zhang, and Y. Zhang, “Network-assisted raft consensus protocol,” Feb. 13 2020. US Patent App. 16/101,751.
[133] X. Jin, X. Li, H. Zhang, N. Foster, J. Lee, R. Soulé, C. Kim, and I. Stoica, “Netchain: scale-free sub-rtt coordination,” in 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI 18), pp. 35–49, 2018.
[134] H. T. Dang, P. Bressana, H. Wang, K. S. Lee, N. Zilberman, H. Weatherspoon, M. Canini, F. Pedone, and R. Soulé, “Partitioned Paxos via the network data plane,” arXiv preprint arXiv:1901.08806, 2019.
[135] E. Sakic, N. Deric, E. Goshi, and W. Kellerer, “P4BFT: hardware-accelerated byzantine-resilient network control plane,” arXiv preprint arXiv:1905.04064, 2019.
[136] H. T. Dang, P. Bressana, H. Wang, K. S. Lee, N. Zilberman, H. Weatherspoon, M. Canini, F. Pedone, and R. Soulé, “P4xos: Consensus as a network service,” IEEE/ACM Transactions on Networking, 2020.
[137] A. Sapio, I. Abdelaziz, A. Aldilaijan, M. Canini, and P. Kalnis, “In-network computation is a dumb idea whose time has come,” in Proceedings of the 16th ACM Workshop on Hot Topics in Networks, pp. 150–156, 2017.
[138] G. Siracusano and R. Bifulco, “In-network neural networks,” arXiv preprint arXiv:1801.05731, 2018.
[139] D. Sanvito, G. Siracusano, and R. Bifulco, “Can the network be the AI accelerator?,” in Proceedings of the 2018 Morning Workshop on In-Network Computing, pp. 20–25, 2018.
[140] F. Yang, Z. Wang, X. Ma, G. Yuan, and X. An, “SwitchAgg: a further step towards in-network computation,” arXiv preprint arXiv:1904.04024, 2019.
[141] A. Sapio, M. Canini, C.-Y. Ho, J. Nelson, P. Kalnis, C. Kim, A. Krishnamurthy, M. Moshref, D. R. Ports, and P. Richtárik, “Scaling distributed machine learning with in-network aggregation,” arXiv preprint arXiv:1903.06701, 2019.
[142] Z. Xiong and N. Zilberman, “Do switches dream of machine learning? toward in-network classification,” in Proceedings of the 18th ACM Workshop on Hot Topics in Networks, pp. 25–33, 2019.
[143] T. Jepsen, M. Moshref, A. Carzaniga, N. Foster, and R. Soulé, “Life in the fast lane: a line-rate linear road,” in Proceedings of the Symposium on SDN Research, pp. 1–7, 2018.
[144] T. Kohler, R. Mayer, F. Dürr, M. Maaß, S. Bhowmik, and K. Rothermel, “P4CEP: towards in-network complex event processing,” in Proceedings of the 2018 Morning Workshop on In-Network Computing, pp. 33–38, 2018.
[145] L. Chen, G. Chen, J. Lingys, and K. Chen, “Programmable switch as a parallel computing device,” arXiv preprint arXiv:1803.01491, 2018.
[146] T. Jepsen, D. Alvarez, N. Foster, C. Kim, J. Lee, M. Moshref, and R. Soulé, “Fast string searching on PISA,” in Proceedings of the 2019 ACM Symposium on SDN Research, pp. 21–28, 2019.
[147] Y. Qiao, X. Kong, M. Zhang, Y. Zhou, M. Xu, and J. Bi, “Towards in-network acceleration of erasure coding,” in Proceedings of the Symposium on SDN Research, pp. 41–47, 2020.
[148] Z. Yu, Y. Zhang, V. Braverman, M. Chowdhury, and X. Jin, “NetLock: fast, centralized lock management using programmable switches,” in Proceedings of the Annual conference of the ACM Special Interest Group on Data Communication on the applications, technologies, architectures, and protocols for computer communication, pp. 126–138, 2020.
16th International Conference on emerging Networking EXperiments and Technologies, pp. 399–405, 2020.
[151] R. Glebke, J. Krude, I. Kunze, J. Rüth, F. Senger, and K. Wehrle, “Towards executing computer vision functionality on programmable network devices,” in Proceedings of the 1st ACM CoNEXT Workshop on Emerging in-Network Computing Paradigms, pp. 15–20, 2019.
[152] S.-Y. Wang, C.-M. Wu, Y.-B. Lin, and C.-C. Huang, “High-speed data-plane packet aggregation and disaggregation by P4 switches,” Journal of Network and Computer Applications, vol. 142, pp. 98–110, 2019.
[153] S.-Y. Wang, J.-Y. Li, and Y.-B. Lin, “Aggregating and disaggregating packets with various sizes of payload in P4 switches at 100 Gbps line rate,” Journal of Network and Computer Applications, p. 102676, 2020.
[154] Y.-B. Lin, S.-Y. Wang, C.-C. Huang, and C.-M. Wu, “The SDN approach for the aggregation/disaggregation of sensor data,” Sensors, vol. 18, no. 7, p. 2025, 2018.
[155] A. L. R. Madureira, F. R. C. Araújo, and L. N. Sampaio, “On supporting IoT data aggregation through programmable data planes,” Computer Networks, p. 107330, 2020.
[156] M. Uddin, S. Mukherjee, H. Chang, and T. Lakshman, “SDN-based service automation for IoT,” in 2017 IEEE 25th International Conference on Network Protocols (ICNP), pp. 1–10, IEEE, 2017.
[157] M. Uddin, S. Mukherjee, H. Chang, and T. Lakshman, “SDN-based multi-protocol edge switching for IoT service automation,” IEEE Journal on Selected Areas in Communications, vol. 36, no. 12, pp. 2775–2786, 2018.
[158] V. Sivaraman, S. Narayana, O. Rottenstreich, S. Muthukrishnan, and J. Rexford, “Heavy-hitter detection entirely in the data plane,” in Proceedings of the Symposium on SDN Research, pp. 164–176, 2017.
[159] R. Harrison, Q. Cai, A. Gupta, and J. Rexford, “Network-wide heavy hitter detection with commodity switches,” in Proceedings of the Symposium on SDN Research, pp. 1–7, 2018.
[160] J. Kučera, D. A. Popescu, G. Antichi, J. Kořenek, and A. W. Moore, “Seek and push: detecting large traffic aggregates in the dataplane,” arXiv preprint arXiv:1805.05993, 2018.
[161] R. Ben-Basat, X. Chen, G. Einziger, and O. Rottenstreich, “Efficient measurement on programmable switches using probabilistic recirculation,” in 2018 IEEE 26th International Conference on Network Protocols (ICNP), pp. 313–323, IEEE, 2018.
[162] D. Ding, M. Savi, G. Antichi, and D. Siracusa, “An incrementally-deployable P4-enabled architecture for network-wide heavy-hitter detection,” IEEE Transactions on Network and Service Management, vol. 17, no. 1, pp. 75–88, 2020.
[163] L. Tang, Q. Huang, and P. P. Lee, “A fast and compact invertible sketch for network-wide heavy flow detection,” IEEE/ACM Transactions on Networking, vol. 28, no. 5, pp. 2350–2363, 2020.
[164] M. V. B. da Silva, J. A. Marques, L. P. Gaspary, and L. Z. Granville, “Identifying elephant flows using dynamic thresholds in programmable ixp networks,” Journal of Internet Services and Applications, vol. 11, no. 1, pp. 1–12, 2020.
[165] D. Scholz, A. Oeldemann, F. Geyer, S. Gallenmüller, H. Stubbe, T. Wild, A. Herkersdorf, and G. Carle, “Cryptographic hashing in P4 data planes,” in 2019 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS), pp. 1–6, IEEE, 2019.
[166] F. Hauser, M. Häberle, M. Schmidt, and M. Menth, “P4-IPsec: implementation of IPsec gateways in P4 with SDN control for host-to-site scenarios,” arXiv preprint arXiv:1907.03593, 2019.
[167] F. Hauser, M. Schmidt, M. Häberle, and M. Menth, “P4-MACsec: dynamic topology monitoring and data layer protection with MACsec in P4-based SDN,” IEEE Access, 2020.
[168] X. Chen, “Implementing AES encryption on programmable switches via scrambled lookup tables,” in Proceedings of the Workshop on Secure Programmable Network Infrastructure, SPIN ’20, p. 8–14, Association for Computing Machinery, 2020.
[169] R. Meier, P. Tsankov, V. Lenders, L. Vanbever, and M. Vechev, “NetHide: secure and practical network topology obfuscation,” in 27th USENIX Security Symposium (USENIX Security 18), pp. 693–709, 2018.
[170] H. M. Moghaddam and A. Mosenia, “Anonymizing masses: practical light-weight anonymity at the network level,” arXiv preprint arXiv:1911.09642, 2019.
[149] M. Tirmazi, R. Ben Basat, J. Gao, and M. Yu, “Cheetah: Accelerating [171] H. Kim and A. Gupta, “ONTAS: flexible and scalable online network
database queries with switch pruning,” in Proceedings of the 2020 ACM traffic anonymization system,” in Proceedings of the 2019 Workshop
SIGMOD International Conference on Management of Data, pp. 2407– on Network Meets AI & ML, pp. 15–21, 2019.
2422, 2020. [172] T. Datta, N. Feamster, J. Rexford, and L. Wang, “{SPINE}: surveil-
[150] S. Vaucher, N. Yazdani, P. Felber, D. E. Lucani, and V. Schiavoni, lance protection in the network elements,” in 9th {USENIX} Workshop
“Zipline: in-network compression at line speed,” in Proceedings of the on Free and Open Communications on the Internet (FOCI), 2019.
49
[173] R. Datta, S. Choi, A. Chowdhary, and Y. Park, “P4Guard: designing M. Barcellos, “Uncovering bugs in P4 programs with assertion-based
P4 based firewall,” in MILCOM 2018-2018 IEEE Military Communi- verification,” in Proceedings of the Symposium on SDN Research,
cations Conference (MILCOM), pp. 1–6, IEEE, 2018. pp. 1–7, 2018.
[174] A. Almaini, A. Al-Dubai, I. Romdhani, and M. Schramm, “Delegation [196] M. Neves, L. Freire, A. Schaeffer-Filho, and M. Barcellos, “Verification
of authentication to the data plane in software-defined networks,” of P4 programs in feasible time using assertions,” in Proceedings of the
in 2019 IEEE International Conferences on Ubiquitous Computing 14th International Conference on emerging Networking EXperiments
& Communications (IUCC) and Data Science and Computational and Technologies, pp. 73–85, 2018.
Intelligence (DSCI) and Smart Computing, Networking and Services [197] J. Liu, W. Hallahan, C. Schlesinger, M. Sharif, J. Lee, R. Soulé,
(SmartCNS), pp. 58–65, IEEE, 2019. H. Wang, C. Caşcaval, N. McKeown, and N. Foster, “P4v: practical
[175] Q. Kang, L. Xue, A. Morrison, Y. Tang, A. Chen, and X. Luo, verification for programmable data planes,” in Proceedings of the 2018
“Programmable in-network security for context-aware BYOD policies,” Conference of the ACM Special Interest Group on Data Communica-
arXiv preprint arXiv:1908.01405, 2019. tion, pp. 490–503, 2018.
[176] S. Bai, H. Kim, and J. Rexford, “Passive OS fingerprinting on com- [198] A. Nötzli, J. Khan, A. Fingerhut, C. Barrett, and P. Athanas, “P4pktgen:
modity switches,” automated test case generation for P4 programs,” in Proceedings of the
[177] G. Li, M. Zhang, C. Liu, X. Kong, A. Chen, G. Gu, and H. Duan, Symposium on SDN Research, pp. 1–7, 2018.
“NetHCF: enabling line-rate and adaptive spoofed IP traffic filtering,” [199] D. Lukács, M. Tejfel, and G. Pongrácz, “Keeping P4 switches fast and
in 2019 IEEE 27th International Conference on Network Protocols fault-free through automatic verification,” Acta Cybernetica, vol. 24,
(ICNP), pp. 1–12, IEEE, 2019. no. 1, pp. 61–81, 2019.
[178] J. Xing, W. Wu, and A. Chen, “Architecting programmable data plane [200] R. Stoenescu, D. Dumitrescu, M. Popovici, L. Negreanu, and C. Raiciu,
defenses into the network with FastFlex,” in Proceedings of the 18th “Debugging P4 programs with Vera,” in Proceedings of the 2018 Con-
ACM Workshop on Hot Topics in Networks, pp. 161–169, 2019. ference of the ACM Special Interest Group on Data Communication,
[179] Q. Kang, J. Xing, and A. Chen, “Automated attack discovery in pp. 518–532, 2018.
data plane systems,” in 12th {USENIX} Workshop on Cyber Security [201] A. Shukla, K. N. Hudemann, A. Hecker, and S. Schmid, “Runtime ver-
Experimentation and Test (CSET), 2019. ification of P4 switches with reinforcement learning,” in Proceedings
[180] A. Febro, H. Xiao, and J. Spring, “Distributed SIP DDoS defense of the 2019 Workshop on Network Meets AI & ML, pp. 1–7, 2019.
with P4,” in 2019 IEEE Wireless Communications and Networking [202] D. Dumitrescu, R. Stoenescu, L. Negreanu, and C. Raiciu, “bf4: to-
Conference (WCNC), pp. 1–8, IEEE, 2019. wards bug-free P4 programs,” in Proceedings of the Annual conference
[181] Â. C. Lapolli, J. A. Marques, and L. P. Gaspary, “Offloading real- of the ACM Special Interest Group on Data Communication on the
time DDoS attack detection to programmable data planes,” in 2019 applications, technologies, architectures, and protocols for computer
IFIP/IEEE Symposium on Integrated Network and Service Management communication, pp. 571–585, 2020.
(IM), pp. 19–27, IEEE, 2019. [203] A. Bas and A. Fingerhut, “P4 tutorial, slide 22.” [Online]. Available:
[182] Y. Mi and A. Wang, “ML-pushback: machine learning based pushback https://tinyurl.com/tb4m749.
defense against DDoS,” in Proceedings of the 15th International [204] M. Shahbaz, S. Choi, B. Pfaff, C. Kim, N. Feamster, N. McKeown, and
Conference on emerging Networking EXperiments and Technologies, J. Rexford, “PISCES: A programmable, protocol-independent software
pp. 80–81, 2019. switch,” in Proceedings of the 2016 ACM SIGCOMM Conference,
[183] D. Scholz, S. Gallenmüller, H. Stubbe, B. Jaber, M. Rouhi, and pp. 525–538, 2016.
G. Carle, “Me love (SYN-) cookies: SYN flood mitigation in pro- [205] B. Pfaff, J. Pettit, T. Koponen, E. Jackson, A. Zhou, J. Rajahalme,
grammable data planes,” arXiv preprint arXiv:2003.03221, 2020. J. Gross, A. Wang, J. Stringer, P. Shelar, et al., “The design and
[184] M. Zhang, G. Li, S. Wang, C. Liu, A. Chen, H. Hu, G. Gu, Q. Li, implementation of open vswitch,” in 12th {USENIX} Symposium on
M. Xu, and J. Wu, “Poseidon: mitigating volumetric DDoS attacks Networked Systems Design and Implementation (NSDI), pp. 117–130,
with programmable switches,” in Proceedings of NDSS, 2020. 2015.
[185] K. Friday, E. Kfoury, E. Bou-Harb, and J. Crichigno, “Towards a [206] Barefoot Networks, “Barefoot Academy,” 2020. [Online]. Available:
unified in-network DDoS detection and mitigation strategy,” in 2020 https://www.barefootnetworks.com/barefoot-academy/.
6th IEEE Conference on Network Softwarization (NetSoft), pp. 218– [207] C. Kim, A. Sivaraman, N. Katta, A. Bas, A. Dixit, and L. J. Wobker,
226, 2020. “In-band network telemetry via programmable dataplanes,” in ACM
[186] J. Xing, Q. Kang, and A. Chen, “NetWarden: mitigating network covert SIGCOMM, 2015.
channels while preserving performance,” in 29th {USENIX} Security [208] C. Hopps et al., “Analysis of an equal-cost multi-path algorithm,” tech.
Symposium ({USENIX} Security 20), 2020. rep., RFC 2992, November, 2000.
[187] A. Laraba, J. François, I. Chrisment, S. R. Chowdhury, and R. Boutaba, [209] S. Sinha, S. Kandula, and D. Katabi, “Harnessing TCP’s burstiness
“Defeating protocol abuse with p4: Application to explicit conges- with flowlet switching,” in Proc. 3rd ACM Workshop on Hot Topics in
tion notification,” in 2020 IFIP Networking Conference (Networking), Networks (Hotnets-III), Citeseer, 2004.
pp. 431–439, IEEE, 2020. [210] C. Kim, P. Bhide, E. Doe, H. Holbrook, A. Ghanwani, D. Daly,
[188] “Ripple: A programmable, decentralized link-flooding defense against M. Hira, and B. Davie, “In-band network telemetry (INT),” technical
adaptive adversaries,” in 30th USENIX Security Symposium (USENIX specification, 2016.
Security 21), (Vancouver, B.C.), USENIX Association, 2021. [211] M. A. Vieira, M. S. Castanho, R. D. Pacífico, E. R. Santos, E. P. C.
[189] C. Zhang, J. Bi, Y. Zhou, J. Wu, B. Liu, Z. Li, A. B. Dogar, and Júnior, and L. F. Vieira, “Fast packet processing with eBPF and XDP:
Y. Wang, “P4DB: on-the-fly debugging of the programmable data concepts, code, challenges, and applications,” ACM Computing Surveys
plane,” in 2017 IEEE 25th International Conference on Network (CSUR), vol. 53, no. 1, pp. 1–36, 2020.
Protocols (ICNP), pp. 1–10, IEEE, 2017. [212] J. Crichigno, E. Bou-Harb, and N. Ghani, “A comprehensive tutorial
[190] Y. Zhou, J. Bi, Y. Lin, Y. Wang, D. Zhang, Z. Xi, J. Cao, and C. Sun, on science DMZ,” IEEE Communications Surveys & Tutorials, vol. 21,
“P4tester: efficient runtime rule fault detection for programmable data no. 2, pp. 2041–2078, 2018.
planes,” in Proceedings of the International Symposium on Quality of [213] J. F. Kurose and K. W. Ross, “Computer networking a top down
Service, pp. 1–10, 2019. approach featuring the intel,” 2016.
[191] M. V. Dumitru, D. Dumitrescu, and C. Raiciu, “Can we exploit buggy [214] S. Ha, I. Rhee, and L. Xu, “CUBIC: a new TCP-friendly high-speed
P4 programs?,” in Proceedings of the Symposium on SDN Research, TCP variant,” ACM SIGOPS operating systems review, vol. 42, no. 5,
pp. 62–68, 2020. pp. 64–74, 2008.
[192] S. Kodeswaran, M. T. Arashloo, P. Tammana, and J. Rexford, “Tracking [215] D. Leith and R. Shorten, “H-TCP: TCP congestion control for
P4 program execution in the data plane,” in Proceedings of the high bandwidth-delay product paths,” draft-leith-tcp-htcp-06 (work in
Symposium on SDN Research, pp. 117–122, 2020. progress), 2008.
[193] Y. Zhou, J. Bi, T. Yang, K. Gao, C. Zhang, J. Cao, and Y. Wang, [216] N. Cardwell, Y. Cheng, C. S. Gunn, S. H. Yeganeh, and V. Jacobson,
“Keysight: Troubleshooting programmable switches via scalable high- “BBR: congestion-based congestion control,” Communications of the
coverage behavior tracking,” in 2018 IEEE 26th International Confer- ACM, vol. 60, no. 2, pp. 58–66, 2017.
ence on Network Protocols (ICNP), pp. 291–301, IEEE, 2018. [217] S. Floyd, “TCP and explicit congestion notification,” ACM SIGCOMM
[194] N. Lopes, N. Bjørner, N. McKeown, A. Rybalchenko, D. Talayco, Computer Communication Review, vol. 24, no. 5, pp. 8–23, 1994.
and G. Varghese, “Automatically verifying reachability and well- [218] R. Mittal, V. T. Lam, N. Dukkipati, E. Blem, H. Wassel, M. Ghobadi,
formedness in P4 networks,” Technical Report, Tech. Rep, 2016. A. Vahdat, Y. Wang, D. Wetherall, and D. Zats, “TIMELY: RTT-based
[195] L. Freire, M. Neves, L. Leal, K. Levchenko, A. Schaeffer-Filho, and congestion control for the data center,” ACM SIGCOMM Computer
50
Communication Review, vol. 45, no. 4, pp. 537–550, 2015. protocol specification (revised).,” [Online]. Available: https://tools.ietf.
[219] Y. Zhu, H. Eran, D. Firestone, C. Guo, M. Lipshteyn, Y. Liron, org/html/rfc7761.
J. Padhye, S. Raindel, M. H. Yahia, and M. Zhang, “Congestion control [244] H. Holbrook, B. Cain, and B. Haberman, “Using Internet group man-
for large-scale RDMA deployments,” ACM SIGCOMM Computer agement protocol version 3 (IGMPv3) and multicast listener discovery
Communication Review, vol. 45, no. 4, pp. 523–536, 2015. protocol version 2 (MLDv2) for source-specific multicast,” RFC 4604
[220] M. Alizadeh, A. Greenberg, D. A. Maltz, J. Padhye, P. Patel, B. Prab- (Proposed Standard), Internet Engineering Task Force, 2006.
hakar, S. Sengupta, and M. Sridharan, “Data Center TCP (DCTCP),” [245] I. Wijnands, E. C. Rosen, A. Dolganow, T. Przygienda, and S. Aldrin,
in Proceedings of the ACM SIGCOMM 2010 conference, pp. 63–74, “Multicast using bit index explicit replication (BIER),” in RFC Editor,
2010. 2017.
[221] M. Alizadeh, S. Yang, M. Sharif, S. Katti, N. McKeown, B. Prabhakar, [246] B. Carpenter and S. Brim, “Middleboxes: taxonomy and issues,” 2002.
and S. Shenker, “pFabric: minimal near-optimal datacenter transport,” [Online]. Available: https://tools.ietf.org/html/rfc3234.
ACM SIGCOMM Computer Communication Review, vol. 43, no. 4, [247] J. McCauley, A. Panda, A. Krishnamurthy, and S. Shenker, “Thoughts
pp. 435–446, 2013. on load distribution and the role of programmable switches,” ACM
[222] M. Dong, Q. Li, D. Zarchy, P. B. Godfrey, and M. Schapira, “{PCC}: SIGCOMM Computer Communication Review, vol. 49, no. 1, pp. 18–
Re-architecting congestion control for consistent high performance,” 23, 2019.
in 12th {USENIX} Symposium on Networked Systems Design and [248] T. Norp, “5G Requirements and key performance indicators,” Journal
Implementation (NSDI), pp. 395–408, 2015. of ICT Standardization, vol. 6, no. 1, pp. 15–30, 2018.
[223] A. Langley, A. Riddoch, A. Wilk, A. Vicente, C. Krasic, D. Zhang, [249] G. Xylomenos, C. N. Ververidis, V. A. Siris, N. Fotiou, C. Tsilopou-
F. Yang, F. Kouranov, I. Swett, J. Iyengar, et al., “The QUIC transport los, X. Vasilakos, K. V. Katsaros, and G. C. Polyzos, “A survey
protocol: design and Internet-scale deployment,” in Proceedings of the of information-centric networking research,” IEEE communications
Conference of the ACM Special Interest Group on Data Communica- surveys & tutorials, vol. 16, no. 2, pp. 1024–1049, 2013.
tion, pp. 183–196, 2017. [250] D. L. Tennenhouse and D. J. Wetherall, “Towards an active network
[224] P. Cheng, F. Ren, R. Shu, and C. Lin, “Catch the whole lot in an action: architecture,” in Proceedings DARPA Active Networks Conference and
rapid precise packet loss notification in data center,” in 11th {USENIX} Exposition, pp. 2–15, IEEE, 2002.
Symposium on Networked Systems Design and Implementation (NSDI), [251] E. F. Kfoury, J. Gomez, J. Crichigno, E. Bou-Harb, and D. Khoury,
pp. 17–28, 2014. “Decentralized distribution of PCP mappings over blockchain for
[225] A. Ramachandran, S. Seetharaman, N. Feamster, and V. Vazirani, “Fast end-to-end secure direct communications,” IEEE Access, vol. 7,
monitoring of traffic subpopulations,” in Proceedings of the 8th ACM pp. 110159–110173, 2019.
SIGCOMM conference on Internet measurement, pp. 257–270, 2008. [252] S. A. Weil, S. A. Brandt, E. L. Miller, D. D. Long, and C. Maltzahn,
[226] N. Alon, Y. Matias, and M. Szegedy, “The space complexity of “Ceph: A scalable, high-performance distributed file system,” in Pro-
approximating the frequency moments,” Journal of Computer and ceedings of the 7th symposium on Operating systems design and
system sciences, vol. 58, no. 1, pp. 137–147, 1999. implementation, pp. 307–320, 2006.
[227] V. Braverman and R. Ostrovsky, “Zero-one frequency laws,” in Pro- [253] L. Lamport et al., “Paxos made simple,” ACM Sigact News, vol. 32,
ceedings of the forty-second ACM symposium on Theory of computing, no. 4, pp. 18–25, 2001.
pp. 281–290, 2010. [254] D. Ongaro and J. Ousterhout, “In search of an understandable con-
[228] M. Charikar, K. Chen, and M. Farach-Colton, “Finding frequent items sensus algorithm,” in 2014 {USENIX} Annual Technical Conference
in data streams,” in International Colloquium on Automata, Languages, (USENIX ATC 14), pp. 305–319, 2014.
and Programming, pp. 693–703, Springer, 2002. [255] Huynh Tu Dang, “Consensus as a network service.” [Online]. Avail-
[229] G. Cormode and S. Muthukrishnan, “An improved data stream sum- able: https://tinyurl.com/y2t9plsu.
mary: the count-min sketch and its applications,” Journal of Algorithms, [256] J. Nelson, “SwitchML scaling distributed machine learning with in net-
vol. 55, no. 1, pp. 58–75, 2005. work aggregation.” [Online]. Available: https://tinyurl.com/y53upm7k.
[230] M. Datar, A. Gionis, P. Indyk, and R. Motwani, “Maintaining stream [257] D. Das, S. Avancha, D. Mudigere, K. Vaidynathan, S. Sridharan,
statistics over sliding windows,” SIAM journal on computing, vol. 31, D. Kalamkar, B. Kaul, and P. Dubey, “Distributed deep learn-
no. 6, pp. 1794–1813, 2002. ing using synchronous stochastic gradient descent,” arXiv preprint
[231] S. Floyd and V. Jacobson, “Random early detection gateways for arXiv:1602.06709, 2016.
congestion avoidance,” IEEE/ACM Transactions on networking, vol. 1, [258] S. Farrell, “Low-power wide area network (LPWAN) overview,”
no. 4, pp. 397–413, 1993. RFC8376. [Online]. Available: https://tools.ietf.org/html/rfc8376.
[232] P. Flajolet, D. Gardy, and L. Thimonier, “Birthday paradox, coupon [259] A. Koike, T. Ohba, and R. Ishibashi, “IoT network architecture using
collectors, caching algorithms and self-organizing search,” Discrete packet aggregation and disaggregation,” in 2016 5th IIAI International
Applied Mathematics, vol. 39, no. 3, pp. 207–229, 1992. Congress on Advanced Applied Informatics (IIAI-AAI), pp. 1140–1145,
[233] R. Dolby, “Noise reduction systems,” Nov. 5 1974. US Patent IEEE, 2016.
3,846,719. [260] J. Deng and M. Davis, “An adaptive packet aggregation algorithm
[234] S. V. Vaseghi, Advanced digital signal processing and noise reduction. for wireless networks,” in 2013 International Conference on Wireless
John Wiley & Sons, 2008. Communications and Signal Processing, pp. 1–6, IEEE, 2013.
[235] J. Gettys, “Bufferbloat: dark buffers in the Internet,” IEEE Internet [261] Y. Yasuda, R. Nakamura, and H. Ohsaki, “A probabilistic interest
Computing, no. 3, p. 96, 2011. packet aggregation for content-centric networking,” in 2018 IEEE 42nd
[236] M. Allman, “Comments on bufferbloat,” ACM SIGCOMM Computer Annual Computer Software and Applications Conference (COMPSAC),
Communication Review, vol. 43, no. 1, pp. 30–37, 2013. vol. 2, pp. 783–788, IEEE, 2018.
[237] Y. Gong, D. Rossi, C. Testa, S. Valenti, and M. D. Täht, “Fighting the [262] A. S. Akyurek and T. S. Rosing, “Optimal packet aggregation schedul-
bufferbloat: on the coexistence of AQM and low priority congestion ing in wireless networks,” IEEE Transactions on Mobile Computing,
control,” Computer Networks, vol. 65, pp. 255–267, 2014. vol. 17, no. 12, pp. 2835–2852, 2018.
[238] C. Staff, “Bufferbloat: what’s wrong with the Internet?,” Communica- [263] K. Zhou and N. Nikaein, “Packet aggregation for machine type commu-
tions of the ACM, vol. 55, no. 2, pp. 40–47, 2012. nications in LTE with random access channel,” in 2013 IEEE Wireless
[239] V. G. Cerf, “Bufferbloat and other internet challenges,” IEEE Internet Communications and Networking Conference (WCNC), pp. 262–267,
Computing, vol. 18, no. 5, pp. 80–80, 2014. IEEE, 2013.
[240] F. Schwarzkopf, S. Veith, and M. Menth, “Performance analysis of [264] A. Majeed and N. B. Abu-Ghazaleh, “Packet aggregation in multi-
CoDel and PIE for saturated TCP sources,” in 2016 28th International rate wireless LANs,” in 2012 9th Annual IEEE Communications
Teletraffic Congress (ITC 28), vol. 1, pp. 175–183, IEEE, 2016. Society Conference on Sensor, Mesh and Ad Hoc Communications and
[241] A. Mushtaq, R. Mittal, J. McCauley, M. Alizadeh, S. Ratnasamy, Networks (SECON), pp. 452–460, IEEE, 2012.
and S. Shenker, “Datacenter congestion control: identifying what is [265] D. SIG, “Bluetooth core specification version 4.2,” Specification of the
essential and making it practical,” ACM SIGCOMM Computer Com- Bluetooth System, 2014.
munication Review, vol. 49, no. 3, pp. 32–38, 2019. [266] S. Farahani, ZigBee wireless networks and transceivers. Newnes, 2011.
[242] K. Nichols, S. Blake, F. Baker, and D. Black, “Definition of the [267] O. Hersent, D. Boswarthick, and O. Elloumi, The Internet of things:
differentiated services field (DS field) in the IPv4 and IPv6 headers,” key applications and protocols. John Wiley & Sons, 2011.
RFC8376. [Online]. Available: https://tools.ietf.org/html/rfc8376. [268] J. Shi, W. Quan, D. Gao, M. Liu, G. Liu, C. Yu, and W. Su,
[243] B. Fenner, M. Handley, H. Holbrook, I. Kouvelas, R. Parekh, Z. Zhang, “Flowlet-based stateful multipath forwarding in heterogeneous Internet
and L. Zheng, “Protocol independent multicast-sparse mode (PIM-SM): of things,” IEEE Access, vol. 8, pp. 74875–74886, 2020.
51
[269] S. Do, L.-V. Le, B.-S. P. Lin, and L.-P. Tung, “SDN/NFV-based network checking invariant security properties in OpenFlow,” in 2013 IEEE
infrastructure for enhancing IoT gateways,” in 2019 International Con- international conference on communications (ICC), pp. 1974–1979,
ference on Internet of Things (iThings) and IEEE Green Computing and IEEE, 2013.
Communications (GreenCom) and IEEE Cyber, Physical and Social [292] A. Panda, O. Lahav, K. Argyraki, M. Sagiv, and S. Shenker, “Verifying
Computing (CPSCom) and IEEE Smart Data (SmartData), pp. 1135– reachability in networks with mutable datapaths,” in 14th {USENIX}
1142, IEEE, 2019. Symposium on Networked Systems Design and Implementation (NSDI),
[270] A. Metwally, D. Agrawal, and A. El Abbadi, “Efficient computation pp. 699–718, 2017.
of frequent and top-k elements in data streams,” in International [293] X. Gao, T. Kim, M. D. Wong, D. Raghunathan, A. K. Varma, P. G.
Conference on Database Theory, pp. 398–412, Springer, 2005. Kannan, A. Sivaraman, S. Narayana, and A. Gupta, “Switch code
[271] S. Heule, M. Nunkesser, and A. Hall, “HyperLogLog in practice: generation using program synthesis,” in Proceedings of the Annual
algorithmic engineering of a state of the art cardinality estimation conference of the ACM Special Interest Group on Data Communication
algorithm,” in Proceedings of the 16th International Conference on on the applications, technologies, architectures, and protocols for
Extending Database Technology, pp. 683–692, 2013. computer communication, pp. 44–61, 2020.
[272] M. G. Reed, P. F. Syverson, and D. M. Goldschlag, “Anonymous [294] P. Zheng, T. A. Benson, and C. Hu, “Building and testing modular
connections and onion routing,” IEEE Journal on Selected areas in programs for programmable data planes,” IEEE Journal on Selected
Communications, vol. 16, no. 4, pp. 482–494, 1998. Areas in Communications, vol. 38, no. 7, pp. 1432–1447, 2020.
[273] V. Liu, S. Han, A. Krishnamurthy, and T. Anderson, “Tor instead of IP,” [295] D. Kim, Y. Zhu, C. Kim, J. Lee, and S. Seshan, “Generic external
in Proceedings of the 10th ACM Workshop on Hot Topics in Networks, memory for switch data planes,” in Proceedings of the 17th ACM
pp. 1–6, 2011. Workshop on Hot Topics in Networks, pp. 1–7, 2018.
[274] C. Chen, D. E. Asoni, D. Barrera, G. Danezis, and A. Perrig, “HOR- [296] D. Kim, Z. Liu, Y. Zhu, C. Kim, J. Lee, V. Sekar, and S. Seshan, “TEA:
NET: high-speed onion routing at the network layer,” in Proceedings of enabling state-intensive network functions on programmable switches,”
the 22nd ACM SIGSAC Conference on Computer and Communications in Proceedings of the 2020 ACM SIGCOMM Conference, 2020.
Security, pp. 1441–1454, 2015. [297] S. Chole, A. Fingerhut, S. Ma, A. Sivaraman, S. Vargaftik, A. Berger,
[275] M. Zalewski and W. Stearns, “p0f,” see http://lcamtuf. coredump. G. Mendelson, M. Alizadeh, S.-T. Chuang, I. Keslassy, et al., “dRMT:
cx/p0f3, 2006. disaggregated programmable switching,” in Proceedings of the Con-
[276] J. Barnes and P. Crowley, “k-p0f: A high-throughput kernel passive OS ference of the ACM Special Interest Group on Data Communication,
fingerprinter,” in Architectures for Networking and Communications pp. 1–14, 2017.
Systems, pp. 113–114, IEEE, 2013. [298] M. T. Arashloo, Y. Koral, M. Greenberg, J. Rexford, and D. Walker,
[277] S. Hong, R. Baykov, L. Xu, S. Nadimpalli, and G. Gu, “Towards SDN- “SNAP: stateful network-wide abstractions for packet processing,” in
defined programmable BYOD (bring your own device) security,” in Proceedings of the 2016 ACM SIGCOMM Conference, pp. 29–43,
NDSS, 2016. 2016.
[278] S. Hilton, “Dyn analysis summary of Friday October 21 [299] G. Sviridov, M. Bonola, A. Tulumello, P. Giaccone, A. Bianco,
Attack, 2016..” [Online]. Available: https://dyn.com/blog/ and G. Bianchi, “LODGE: Local decisions on global states in pro-
dyn-analysis-summary-of-friday-october-21-attack/. grammable data planes,” in 2018 4th IEEE Conference on Network
[279] S. Kottler, “February 28th DDoS incident report, March, 2018.” [On- Softwarization and Workshops (NetSoft), pp. 257–261, IEEE, 2018.
line]. Available: https://githubengineering.com/ddos-incident-report/. [300] G. Sviridov, M. Bonola, A. Tulumello, P. Giaccone, A. Bianco,
[280] D. Scholz, S. Gallenmüller, H. Stubbe, and G. Carle, “Syn flood defense and G. Bianchi, “Local decisions on replicated states (LOADER) in
in programmable data planes,” in Proceedings of the 3rd P4 Workshop programmable data planes: programming abstraction and experimental
in Europe, pp. 13–20, 2020. evaluation,” arXiv preprint arXiv:2001.07670, 2020.
[281] J. Ioannidis and S. M. Bellovin, “Implementing pushback: router-based [301] S. Luo, H. Yu, and L. Vanbever, “Swing state: consistent updates
defense against DDoS attacks,” in NDSS, 2016. for stateful and programmable data planes,” in Proceedings of the
[282] N. Handigol, B. Heller, V. Jeyakumar, D. Mazières, and N. McKeown, Symposium on SDN Research, pp. 115–121, 2017.
“I know what your packet did last hop: using packet histories to [302] J. Xing, A. Chen, and T. E. Ng, “Secure state migration in the data
troubleshoot networks,” in 11th {USENIX} Symposium on Networked plane,” in Proceedings of the Workshop on Secure Programmable
Systems Design and Implementation ({NSDI} 14), pp. 71–85, 2014. Network Infrastructure, pp. 28–34, 2020.
[283] Y. Zhu, N. Kang, J. Cao, A. Greenberg, G. Lu, R. Mahajan, D. Maltz, [303] L. Zeno, D. R. Ports, J. Nelson, and M. Silberstein, “Swishmem:
L. Yuan, M. Zhang, B. Y. Zhao, and H. Zheng, “Packet-level telemetry Distributed shared state abstractions for programmable switches,” in
in large datacenter networks,” in Proceedings of the 2015 ACM Confer- Proceedings of the 19th ACM Workshop on Hot Topics in Networks,
ence on Special Interest Group on Data Communication, pp. 479–491, pp. 160–167, 2020.
2015. [304] P. Bosshart, G. Gibb, H.-S. Kim, G. Varghese, N. McKeown, M. Iz-
[284] H. Zeng, P. Kazemian, G. Varghese, and N. McKeown, “Automatic test zard, F. Mujica, and M. Horowitz, “Forwarding metamorphosis: fast
packet generation,” in Proceedings of the 8th international conference programmable match-action processing in hardware for SDN,” ACM
on Emerging networking experiments and technologies, pp. 241–252, SIGCOMM Computer Communication Review, vol. 43, no. 4, pp. 99–
2012. 110, 2013.
[285] P. Kazemian, G. Varghese, and N. McKeown, “Header space anal- [305] R. Pagh and F. F. Rodler, “Cuckoo hashing,” J. Algorithms, vol. 51,
ysis: static checking for networks,” in Presented as part of the 9th p. 122–144, May 2004.
{USENIX} Symposium on Networked Systems Design and Implemen-
tation ({NSDI} 12), pp. 113–126, 2012.
[286] A. Khurshid, X. Zou, W. Zhou, M. Caesar, and P. B. Godfrey,
“Veriflow: verifying network-wide invariants in real time,” in Presented
as part of the 10th {USENIX} Symposium on Networked Systems
Design and Implementation (NSDI), pp. 15–27, 2013.
[287] R. Stoenescu, M. Popovici, L. Negreanu, and C. Raiciu, “Symnet:
scalable symbolic execution for modern networks,” in Proceedings of
the 2016 ACM SIGCOMM Conference, pp. 314–327, 2016.
[288] H. Mai, A. Khurshid, R. Agarwal, M. Caesar, P. B. Godfrey, and S. T.
King, “Debugging the data plane with Anteater,” ACM SIGCOMM
Computer Communication Review, vol. 41, no. 4, pp. 290–301, 2011.
[289] P. Kazemian, M. Chang, H. Zeng, G. Varghese, N. McKeown, and
S. Whyte, “Real time network policy checking using header space
analysis,” in Presented as part of the 10th {USENIX} Symposium on
Networked Systems Design and Implementation (NSDI), pp. 99–111,
2013.
[290] A. Horn, A. Kheradmand, and M. Prasad, “Delta-net: real-time network
verification using atoms,” in 14th {USENIX} Symposium on Networked
Systems Design and Implementation (NSDI), pp. 735–749, 2017.
[291] S. Son, S. Shin, V. Yegneswaran, P. Porras, and G. Gu, “Model