White Paper

Simplify Packet Processing Design with P4 and Vivado Tools

WP555 (v1.0) January 24, 2024

Abstract
The AMD Vitis™ Networking P4 tool (VNP4) is a high-level design environment used to simplify
the design of packet-processing data planes that target FPGAs and adaptive SoCs. It converts
designs coded in P4—the ubiquitous network programming language—into device-ready RTL
code for optimal hardware implementation. By using this tool, you can significantly reduce
engineering effort required to develop device-based packet-processing systems, while still
achieving high quality results in terms of performance per LUT or performance per RAM. The
benefits of designing with VNP4 are outlined in this document.


Introduction
The benefits of VNP4 fall broadly into two categories: reduced engineering effort and
high-quality, performant results.

Figure 1: VNP4 Benefits

[Figure: the benefits listed below, grouped under two headings, Reduced Engineering Effort and
High Quality, Performant Results.]

• Productivity: The solution reduces development effort.

• Rapid Prototyping and Time to Market: Getting your product to market is faster with the
accelerated design cycle. Iterating through multiple design options is simple and quick.

• Features: Extensive features differentiate your product, including options in User Metadata
and User Externs.

• Migration: The design intent can be migrated from one FPGA or SoC to another.

• Expansion: Packet-processing blocks generated by VNP4 can be deployed in parallel or
serially to support capabilities such as multi-level parsing and multi-data-pipeline systems.

• Domain Specificity: This high-level abstraction solution is domain specific, allowing you to
take advantage of abstraction without sacrificing performance.

• FPGA Expertise for Packet Processing: The solution and quality of hardware implementation
reflects years of experience in high-speed FPGA design and memory subsystems for high
throughput packet processing.

• Performance: The system has been engineered from the ground up to ensure high throughput,
low latency, and minimized resource utilization.


Programming Protocol-independent Packet Processing

Programming Protocol-independent Packet Processors (P4), an industry-standard, domain-specific
language, is used for requirements capture. VNP4 converts the P4 design intent into an
AMD FPGA or adaptive SoC design solution and allows programmers to build new data planes by
explicitly specifying the header and packet processing requirements. To implement a P4 design,
the compiler maps the intended functionality onto a custom data plane architecture of VNP4 RTL
engines and software drivers. This mapping chooses appropriate engine types and customizes
each of them based on the P4-specified processing. The specialized engines used to achieve this
goal include parsing engines, match-action engines, and deparsing engines, each generated
according to an application-specific requirement.

The generated RTL is integrated in a packaged IP in the AMD Vivado™ Design Suite where it can
immediately be combined with other standard IPs, such as media access controllers, to create a
complete device design. The design is then synthesized and a bitstream is generated for the
targeted device. Even before synthesis, critical design metrics such as latency and required
memory resources are available.

The current AMD solution was designed based upon feedback from hundreds of customers and
information gathered from earlier iterations. The three key elements that differentiate this latest
generation of the tool are:

• Native support for the P4-16 language
• Algorithmic content-addressable memory (CAM) technology
• Dedication to efficient resource utilization and robust timing closure

Productivity
A major advantage of designing with VNP4 is the savings in development time and effort. This
applies to the generation of the RTL itself, and the savings might be even more significant
during RTL verification. To highlight where the savings exist, the following figure compares the
different phases of a typical RTL development flow against the approach using VNP4.


Figure 2: Conceptual Breakdown of Project Effort

[Figure: project effort for the RTL development flow compared with the P4 development flow,
broken down into the definition phase, design phase, implementation phase, verification phase,
system integration, and system verification.]

The definition phase includes defining the scope of a project and capturing the important details
in a requirements specification document. When it comes to packet processing, the requirements
can be specified more efficiently and effectively with the use of P4 code compared to a
requirements specification document. The P4 code is concise and less ambiguous, which helps to
avoid misinterpretation later in the project. This consequently saves effort and time in those later
stages. Many examples within the industry highlight the benefits of P4 as a specification
language [4][5][6].

The definition phase also includes test planning, which involves decisions about the design of a
test bench, the nature of the stimulus that is needed to test the design, and the nature of the
checking mechanisms. VNP4 provides an example design including a SystemVerilog test bench
with automated self-checking against the P4 behavioral model, allowing you to focus more on
the stimulus side. The verification can be run using the P4 behavioral model, which has much
faster runtimes than RTL simulation. The higher level of abstraction of P4 also makes debugging
much easier, because the model outputs a detailed log of each step through the P4 program as a
packet is processed. RTL simulation is still recommended when integrating into a larger system
design.
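
The self-checking idea can be illustrated with a minimal Python sketch. This is not the VNP4 test bench: the function name compare_against_golden and the in-memory packet lists are assumptions made for illustration, standing in for the golden output of the behavioral model and the packets observed at the DUT output.

```python
from typing import List, Tuple


def compare_against_golden(golden: List[bytes], observed: List[bytes]) -> List[Tuple[int, str]]:
    """Return (packet index, reason) for every disagreement with the golden data."""
    problems = []
    # Compare packets pairwise, in order, for as long as both lists have entries.
    for index, (expected, actual) in enumerate(zip(golden, observed)):
        if expected != actual:
            problems.append((index, "payload mismatch"))
    # A missing or extra packet is reported at the first index that has no partner.
    if len(golden) != len(observed):
        problems.append((min(len(golden), len(observed)), "packet count mismatch"))
    return problems


# Hypothetical data: the second observed packet differs in its last byte.
golden = [bytes.fromhex("aabbcc"), bytes.fromhex("010203")]
observed = [bytes.fromhex("aabbcc"), bytes.fromhex("010204")]
print(compare_against_golden(golden, observed))
```

With a check of this kind in place, most of the test effort goes into choosing stimulus packets rather than into writing checking logic.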


Figure 3: VNP4 Example Design Test Bench

[Block diagram. Elements shown: behavioral model with its .json file and CLI commands text file;
input packet data file and input metadata file; golden packet data file and golden metadata
file; stimulus block (Packet Data In, Meta Data In); DUT wrapper; AXI control (Init Complete);
checker (Packet Data Out, Meta Data Out).]

The design phase, which would otherwise involve the detailed inner workings and interface
connection specifications of the RTL modules, can be simplified to a few top-level VNP4
parameters and clocking decisions. The use of standard interfaces (AXI4-Stream and AXI4-Lite)
simplifies the connection to other parts of the system. The user metadata structure can also be
customized to carry any side-band signals that the user application needs for interconnection.

One of the biggest savings when using VNP4 is the reduction in RTL and driver coding in the
implementation phase. If the functionality can be described in P4 without user externs, then no
RTL coding is required. The engineering effort saved in RTL coding is magnified for more complex
P4 designs. These savings are further multiplied in cases of changing requirements, scope creep,
and new features.

Similarly, the verification phase is much shorter because little or no RTL test bench
coding is involved. The P4 code can be verified using the behavioral model. The runtime and
iteration cycles are faster here compared to an RTL test bench. Detailed log information is
provided by the model to indicate how each packet is processed by the P4 code step by step,
allowing for easier debug compared to reviewing RTL waveforms.


Both flows require a system integration stage, in either RTL or IP integrator. However, timing
closure can be a significantly lower risk with the P4 flow. In the context of a hardware debug
iteration cycle, this becomes even more pronounced. The P4 code can be quickly simplified (for
example, by reducing table sizes) to generate test bitstreams with a quicker, more reliable
turnaround time, before later switching back to the full P4 functionality.

Ease of Use
Technology parameters can be customized via a graphical user interface (GUI), which provides
visual feedback, such as memory utilization for tables. The GUI displays the specific features and
allowable parameter values tailored to the P4 program.


Figure 4: VNP4 Customization GUI


IP integrator can be used for connectivity between IPs, because VNP4 uses standard interfaces
(AXI4-Stream and AXI4-Lite). These IPs can easily be connected in IP integrator, along with
straightforward address mapping for AXI4-Lite.

Figure 5: VNP4 within Vivado IP Integrator

Typical configurations of the IP are supplied in the form of Example Designs. This allows you to
see IP features operating in simulation and synthesis.

Robust Designs
The use of correct-by-design code generation reduces the effort required for verification. Once
the P4 code functionality has been verified using the P4 behavioral model, you can have
confidence that the design intent in the form of P4 is properly converted into a working design
ready for deployment in the FPGA or adaptive SoC.

Rapid Prototyping and Time to Market

Getting your product to market is faster with the accelerated design cycle. Once the P4 is
created, detailed information on the latency of the design and the memory requirements of the
system are available. This aids in high-level design decisions such as device selection. Different
options can be trialed. After initial decisions are made, the design can then be synthesized and
brought through the back-end flow in the Vivado tools to assess the feasibility of the design in
terms of resource allocation and timing closure. The P4 code can be verified much more quickly
than an alternative RTL approach, which provides earlier confidence in the design sizing
information from the Vivado tools.

Features
P4 provides extensive features to differentiate your product. These include options in user
metadata and user externs.


User Metadata
User metadata, inferred in the P4 code, provides side-band signaling alongside the packet data at
the input to VNP4. It can be processed within the parser and the match-action pipeline, then it is
output again alongside the packet data. One common use provides ingress and egress port
number information to indicate where each packet is coming from and going to.

Targeting an FPGA or adaptive SoC means that VNP4 has the benefit of being able to scale the
side-band signaling to any width to suit each application. The user metadata structure can be
broken down into further struct definitions for convenient grouping of fields. It can be defined to
align with other standard architectures (such as Portable NIC Architecture), but it is not restricted
to any one of these definitions.
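
As a rough illustration of how a user metadata structure maps onto a side-band bus of arbitrary width, the following Python sketch packs named fields into a single integer and unpacks them again. The field names, the bit widths, and the first-field-in-the-most-significant-bits ordering are all assumptions chosen for this example; they are not taken from VNP4.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Field:
    name: str
    width: int  # width in bits


# Hypothetical metadata layout; names and widths are illustrative only.
USER_METADATA: List[Field] = [
    Field("ingress_port", 9),
    Field("egress_port", 9),
    Field("flow_tag", 16),
]


def total_width(fields: List[Field]) -> int:
    """Side-band bus width in bits needed to carry all fields."""
    return sum(field.width for field in fields)


def pack(fields: List[Field], values: Dict[str, int]) -> int:
    """Concatenate field values, first field in the most significant bits."""
    word = 0
    for field in fields:
        value = values[field.name]
        if not 0 <= value < (1 << field.width):
            raise ValueError(f"{field.name} does not fit in {field.width} bits")
        word = (word << field.width) | value
    return word


def unpack(fields: List[Field], word: int) -> Dict[str, int]:
    """Inverse of pack: peel fields off starting from the least significant bits."""
    values = {}
    for field in reversed(fields):
        values[field.name] = word & ((1 << field.width) - 1)
        word >>= field.width
    return values


sample = {"ingress_port": 3, "egress_port": 5, "flow_tag": 0xBEEF}
print(total_width(USER_METADATA), hex(pack(USER_METADATA, sample)))
```

Adding or widening a field only changes the list of fields; the bus width follows from it, which mirrors the point that the side-band signaling can scale to suit each application.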

User Externs
User externs provide an interface between the match-action control block and your own RTL
module that resides outside of VNP4. You can implement whatever function is useful for
your design, such as a custom checksum calculation or a hash function.

Migration
VNP4 allows for easy migration in several ways. Migrating to a different line rate is
straightforward (for example, 100 Gb/s to 200 Gb/s), by scaling clock frequencies and bus widths
in the GUI. No changes are required to the P4 code, and confidence can be maintained that the
original implementation of the requirements remains faithful to the design intent. This can be
very effective if developing a family of products with each family member targeting a specific line
rate. The same P4 code generates the packet processing RTL in each family member, saving time.

Prototyping with smaller table implementations before moving to a larger number of table entries
is also trivial, which enables even more rapid prototyping through to hardware implementation.
This can include starting off with on-chip SRAM before later increasing the size of the same P4
table to target off-chip DRAM. In a pure RTL design flow, this can be time consuming and
introduce new risks, because any change to the latency or performance of one table can force
consequential changes in other parts of the P4 pipeline to keep the whole design in sync. VNP4
automatically manages all these latency and alignment challenges. More extensive design changes
are also possible where evolving functionality and requirements can be supported through small
P4 updates.

Expansion
Designs can be deployed with multiple instances in parallel and multiple instances in series to
support capabilities such as multi-level parsing and multi-data-pipeline systems. The flexible user
metadata structure allows information to be passed between P4 instances to support this. There
are several reasons to do this:

• To achieve higher total line rate processing or total packet rate

• To optimize more complex parts of the design that do not require high data rates or increased
packet rates
• To create a more modular system, restricting the size of each P4 program. Modularity is also a
re-use benefit.

Figure 6: Multi-stage Ingress and Egress Pipelines

[Block diagram. Elements shown: 100G MAC, Ingress_stage1.p4, Ingress_stage2.p4, Switch, PCIe,
Egress_stage1.p4, Egress_stage2.p4.]

Figure 7: 4x100G Pipelines

[Block diagram. Elements shown: 400G MAC, 1:4 split, and four instances of 100g_pipeline.p4.]
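
The white paper does not describe how a 1:4 split assigns packets to the parallel pipelines. One common approach, shown here purely as an assumption, is to hash the flow identifier so that all packets of a flow go to the same pipeline and per-flow ordering is preserved. The function name and the choice of CRC-32 are illustrative.

```python
import zlib

NUM_PIPELINES = 4  # four parallel 100G pipeline instances, as in Figure 7


def pipeline_for_flow(src_ip: str, dst_ip: str, protocol: int,
                      src_port: int, dst_port: int) -> int:
    """Map a five-tuple to a pipeline index in the range 0..NUM_PIPELINES-1."""
    key = f"{src_ip}|{dst_ip}|{protocol}|{src_port}|{dst_port}".encode("ascii")
    # CRC-32 is used only as a convenient, deterministic hash for this sketch.
    return zlib.crc32(key) % NUM_PIPELINES


# Every packet of the same flow is steered to the same pipeline.
lane = pipeline_for_flow("10.0.0.1", "10.0.0.2", 6, 49152, 443)
print(lane)
```

A round-robin split would balance load more evenly but can reorder packets within a flow, so the choice depends on what the downstream system tolerates.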

Domain Specificity
This high-level abstraction solution is domain specific, allowing you to take advantage of
abstraction without sacrificing performance. The number of degrees of freedom in the packet
processing design space is orders of magnitude less than in a general-purpose data processing
solution. This means the solution can be optimized for a more efficient implementation while
maintaining the flexibility to implement arbitrary packet processing functions.


Performance
The pipelined nature of the P4 designs allows VNP4 to process one packet every clock cycle. The
only exception to this is for HBM/DDR binary CAM (BCAM) table look-ups, where the DRAM
bandwidth might become a limiting factor. All elements of the design can scale in complexity
without reducing the performance. This includes a complex header parsing tree, many different
table look-ups and actions, and many packet editing operations. VNP4 does not set limitations
on these complexities, for example:

• There is no fixed limit to the number of parsing states or header extracts

• There is no fixed limit to the number of headers that can be modified, removed, or inserted
• There is no fixed limit to the number of tables in the match-action block
• There is no fixed limit on the size of the user metadata

All elements can scale up to a large value without impacting the performance. Ultimately, designs
reach a natural limit in terms of the device resource utilization (for example, if all block RAMs and
UltraRAMs are exhausted). The performance is also not impacted by how deep the parsing goes
into a packet.

The packet bus width and the clock frequency can be chosen to achieve the desired
performance. The packet rate can also be configured to allow for further optimizations. Some
common examples are shown in the following table.

Table 1: Examples of Parameter Settings for Different Throughputs

Throughput (Gb/s)   Packet Rate (Mp/s)   Data Bus Width (Bytes)   Clock Frequency (MHz)
200                 300                  128                      336
100                 150                  64                       300
50                  75                   32                       300
10                  15                   4                        312.5

Note: Higher clock frequencies and packet rates can also be achieved; a trade-off is then needed between
function complexity and timing closure.
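
The arithmetic behind Table 1 can be checked with a short Python sketch. The raw bus bandwidth is the bus width in bits multiplied by the clock frequency. The second function is an assumption added here for context, not something stated in the white paper: it computes the packet rate of back-to-back minimum-size Ethernet frames (64 bytes plus 20 bytes of preamble and inter-frame gap), which comes out just under the packet rates listed in the table.

```python
# Rows copied from Table 1:
# (throughput in Gb/s, packet rate in Mp/s, bus width in bytes, clock in MHz)
TABLE_1 = [
    (200, 300, 128, 336.0),
    (100, 150, 64, 300.0),
    (50, 75, 32, 300.0),
    (10, 15, 4, 312.5),
]


def bus_bandwidth_gbps(width_bytes: int, clock_mhz: float) -> float:
    """Raw bus bandwidth: bits per cycle times cycles per second."""
    return width_bytes * 8 * clock_mhz / 1000.0


def min_frame_rate_mpps(throughput_gbps: float, frame_bytes: int = 64,
                        overhead_bytes: int = 20) -> float:
    """Packets per second (in millions) for back-to-back minimum-size frames."""
    return throughput_gbps * 1000.0 / ((frame_bytes + overhead_bytes) * 8)


for throughput, packet_rate, width, clock in TABLE_1:
    bandwidth = bus_bandwidth_gbps(width, clock)
    worst_case = min_frame_rate_mpps(throughput)
    print(f"{throughput:>3} Gb/s: bus {bandwidth:7.1f} Gb/s, "
          f"min-frame rate {worst_case:6.1f} Mp/s, configured {packet_rate} Mp/s")
```

In every row the bus bandwidth is at least the target throughput, and the configured packet rate is at or above the minimum-frame rate under the stated framing assumption.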

Resource Utilization
Very complex parsers can be supported in VNP4, while still operating at 200 Gb/s line rate and
300 million packets per second. To illustrate this point, a consolidated switch P4 example
was taken and ported to the VNP4 XSA target, with the match-action section removed to focus
on the parser. This example has 130,000 unique paths through the P4 parser (including error
conditions), and it uses 31k LUTs. The example showcases the level of complexity in terms of
parsing that can be enabled by VNP4. While a robust example, it is not the limit of the parsing
complexity that can be supported in VNP4.
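
To see why the number of unique parser paths grows so quickly, the following Python sketch counts paths through a small parse graph. The graph is a hypothetical one loosely modeled on a VLAN/IPv4/TCP/UDP parser; its edges are assumptions made for this example and are not taken from any VNP4 design or figure.

```python
from functools import lru_cache
from typing import Dict, List

# Hypothetical parse graph: each state lists the states it can transition to.
PARSE_GRAPH: Dict[str, List[str]] = {
    "start": ["parse_vlan", "parse_ipv4", "accept"],
    "parse_vlan": ["parse_ipv4", "accept"],
    "parse_ipv4": ["parse_tcp", "parse_udp", "accept"],
    "parse_tcp": ["accept"],
    "parse_udp": ["accept"],
    "accept": [],
}


def count_paths(graph: Dict[str, List[str]], source: str, sink: str) -> int:
    """Count distinct paths from source to sink in an acyclic parse graph."""

    @lru_cache(maxsize=None)
    def walk(state: str) -> int:
        if state == sink:
            return 1
        # The number of paths from a state is the sum over its successors.
        return sum(walk(successor) for successor in graph[state])

    return walk(source)


print(count_paths(PARSE_GRAPH, "start", "accept"))  # prints 8
```

Even this six-state graph has eight paths. Because the counts of successive branching layers multiply, a parser with many optional and tunneled headers reaches very large path counts.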


Figure 8: Parse Graph for consolidated_switch_xsa.p4

[Parse graph running from the start state to the end state, with states for Ethernet, fabric
headers, LLC/SNAP, VLAN and QinQ, ARP/RARP, IPv4, IPv6, MPLS, ICMP, TCP, UDP, sFlow, VXLAN and
VXLAN-GPE, INT metadata values, GRE, NVGRE, Geneve, ERSPAN, EoMPLS, and inner Ethernet, IPv4,
IPv6, TCP, UDP, and ICMP headers.]


For comparison, the following figure shows the parse graph of the less complicated FiveTuple
example design available within the tool.

Figure 9: Parse Graph for FiveTuple.p4

[Parse graph with the states start, parse_vlan, parse_ipv4, parse_tcp, and parse_udp.]

Typically, the CAMs, along with other parts of the logic design outside VNP4, have a much larger
utilization than the packet parsing and editing functions. Designers can therefore focus their
efforts on system-level trade-offs such as table entry numbers. Some other examples of resource
utilization are given in the following section.

P4 Examples
P4 Language Tutorials
The P4 language tutorials [2] provide a set of 12 P4 programs that were created independently
from VNP4. They target the BMv2 simple switch (v1model architecture). With a few updates, the
programs can be modified to retarget the VNP4 pipeline architecture (XSA). The following table
provides a summary of the designs along with device resource utilization numbers for those that
are currently supported in VNP4. These designs were configured for a 100 Gb/s setup. The last
column demonstrates that for the larger designs, it is typically the P4 tables that dominate the
logic utilization, rather than the packet parsing and editing. To put these utilization numbers in
context, a 10G Ethernet MAC and PCS/PMA IP use approximately 10k LUTs [7]. Three of these
P4 programs include language or extern features that are not yet supported in VNP4 (for
example, the register extern).

Note: Utilization numbers are based on 100 Gb/s setup, where TDATA_NUM_BYTES = 64 and the
PKT_RATE = 150.

Table 2: P4 Design Examples

P4 Program Name                    Supported   LUTs      Flip-Flops   Block RAMs   UltraRAMs   Latency    Tables as
                                   with VNP4   (Total)                                         (Cycles)   % of LUTs
Basic Forwarding                   Yes         28339     43929        138          0           53         91%
Basic Tunnel                       Yes         30993     48291        146          0           54         88%
P4 Runtime                         Yes         35825     56894        158          32          83         77%
Explicit Congestion Notification   Yes         28346     44070        138          0           53         90%
Multi-hop Route Inspection         Yes         29715     46448        138          0           60         86%
Source Routing                     Yes         2675      5472         2            0           30         0%
Calculator                         Yes         2382      5208         3            0           24         5%
Load Balancing                     No
Quality of Service                 Yes         28341     44000        138          0           53         90%
Firewall                           Support for register extern underway
Link Monitoring                    Support for register extern underway
MultiCast                          Yes         3154      5554         10           0           37         69%

Notes:
1. Testing and analysis by AMD as of 11/24/23, using AMD Vivado™ Design Suite 2023.2 and an AMD Virtex™ UltraScale+™ device
(xcvu37p-fsvh2892-2L-e), with out-of-context synthesis and implementation, and utilization numbers from a post-place utilization
report. Actual results can vary. VIV-009.
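
The last column of Table 2 can be turned into an approximate LUT count for everything other than the tables (parsing, editing, and infrastructure). The Python sketch below does this for a few rows copied from the table. Because the percentages in the table are rounded, the results are estimates only, and the function name is an illustrative choice.

```python
# (program name, total LUTs, tables as % of LUTs), copied from Table 2
TABLE_2_ROWS = [
    ("Basic Forwarding", 28339, 91),
    ("P4 Runtime", 35825, 77),
    ("Source Routing", 2675, 0),
    ("MultiCast", 3154, 69),
]

# Approximate LUT count quoted in the text for a 10G Ethernet MAC and PCS/PMA IP [7].
MAC_LUTS = 10_000


def non_table_luts(total_luts: int, table_percent: int) -> float:
    """Estimate the LUTs not attributed to P4 tables."""
    return total_luts * (100 - table_percent) / 100


for name, luts, percent in TABLE_2_ROWS:
    estimate = non_table_luts(luts, percent)
    print(f"{name:<18} ~{estimate:7.0f} non-table LUTs "
          f"({estimate / MAC_LUTS:.0%} of the 10G MAC figure)")
```

For these rows the non-table logic stays below the 10k LUT reference point, which is consistent with the observation that the tables dominate utilization in the larger designs.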

VNP4 Example Designs

The following table provides a summary of the device resource utilization numbers for the
example designs that are released with VNP4. These example designs are primarily to showcase
the various features of the P4 language that are supported in VNP4, rather than complete
applications, but the resource numbers still highlight the efficiency of implementing various
features.

Note: Utilization numbers are based on 100 Gb/s setup, where TDATA_NUM_BYTES = 64 and the PKT_RATE =
150.


Table 3: VNP4 Example Designs

P4 Program Name       LUTs      Flip-Flops   Block RAMs   UltraRAMs   Latency    Tables as
                      (Total)                                         (Cycles)   % of LUTs
Echo                  3106      6784         2            0           26         0%
FiveTuple             8807      15702        6            16          49         33%
FiveTuple_tinycam     8605      15215        6            4           30         38%
Forward               65251     85615        250          0           68         93%
Forward_tinycam       11923     18680        2            0           32         65%
Calculator            2542      5149         2            0           26         5%
Advanced Calculator   2969      5814         3            0           59         4%

Notes:
1. Testing and analysis by AMD as of 11/24/23, using AMD Vivado™ Design Suite 2023.2 and an AMD Virtex™ UltraScale+™ device
(xcvu37p-fsvh2892-2L-e), with out-of-context synthesis and implementation, and utilization numbers from a post-place utilization
report. Actual results can vary. VIV-009.

Conclusion
Designing for high-speed packet processing in programmable logic can be challenging. VNP4
simplifies the process by using higher-level abstraction with the P4 language, without
compromising on efficiency of resource utilization. For further information, download the Vivado
Design Suite and select Vitis Networking P4 during installation. Evaluation licenses are available from the
product page, along with further documentation. You can also learn more about the P4 language
at p4.org.

References
These documents provide supplemental material useful with this guide:

1. P4 Language Specification: https://p4.org/p4-spec/docs/p4-16-working-draft.html
2. P4 Language Tutorials: https://github.com/p4lang/tutorials
3. GitHub, Consolidated switch repository (API, SAI, and Nettlink): p4lang/switch
4. Parveen Patel (Google), P4 Workshop 2023 Keynote: P4 HAL for Network Virtualization (YouTube)
5. Nick McKeown, Fireside Chat, 2023 P4 Workshop (YouTube)
6. Rob Sherwood (Intel), Keynote: The Power of Fully-Specified Data Planes (YouTube)
7. Resource Utilization for 10G/25G Ethernet Subsystem v4.1
8. Vitis Networking P4 User Guide (UG1308)

Revision History
The following table shows the revision history for this document.


Section            Revision Summary
1/24/2024 Version 1.0
Initial release.   N/A

Please Read: Important Legal Notices

The information presented in this document is for informational purposes only and may contain
technical inaccuracies, omissions, and typographical errors. The information contained herein is
subject to change and may be rendered inaccurate for many reasons, including but not limited to
product and roadmap changes, component and motherboard version changes, new model and/or
product releases, product differences between differing manufacturers, software changes, BIOS
flashes, firmware upgrades, or the like. Any computer system has risks of security vulnerabilities
that cannot be completely prevented or mitigated. AMD assumes no obligation to update or
otherwise correct or revise this information. However, AMD reserves the right to revise this
information and to make changes from time to time to the content hereof without obligation of
AMD to notify any person of such revisions or changes. THIS INFORMATION IS PROVIDED "AS
IS." AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE
CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES,
ERRORS, OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION. AMD SPECIFICALLY
DISCLAIMS ANY IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY, OR
FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY
PERSON FOR ANY RELIANCE, DIRECT, INDIRECT, SPECIAL, OR OTHER CONSEQUENTIAL
DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF
AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

AUTOMOTIVE APPLICATIONS DISCLAIMER

AUTOMOTIVE PRODUCTS (IDENTIFIED AS "XA" IN THE PART NUMBER) ARE NOT
WARRANTED FOR USE IN THE DEPLOYMENT OF AIRBAGS OR FOR USE IN APPLICATIONS
THAT AFFECT CONTROL OF A VEHICLE ("SAFETY APPLICATION") UNLESS THERE IS A
SAFETY CONCEPT OR REDUNDANCY FEATURE CONSISTENT WITH THE ISO 26262
AUTOMOTIVE SAFETY STANDARD ("SAFETY DESIGN"). CUSTOMER SHALL, PRIOR TO USING
OR DISTRIBUTING ANY SYSTEMS THAT INCORPORATE PRODUCTS, THOROUGHLY TEST
SUCH SYSTEMS FOR SAFETY PURPOSES. USE OF PRODUCTS IN A SAFETY APPLICATION
WITHOUT A SAFETY DESIGN IS FULLY AT THE RISK OF CUSTOMER, SUBJECT ONLY TO
APPLICABLE LAWS AND REGULATIONS GOVERNING LIMITATIONS ON PRODUCT
LIABILITY.

© Copyright 2024 Advanced Micro Devices, Inc. AMD, the AMD Arrow logo, UltraScale+, Virtex,
Vitis, Vivado, and combinations thereof are trademarks of Advanced Micro Devices, Inc. PCI,
PCIe, and PCI Express are trademarks of PCI-SIG and used under license. Other product names
used in this publication are for identification purposes only and may be trademarks of their
respective companies.


Lewensoriëntering Graad 11 Taak 5.3: Liggaamlike Opvoedingstaak 3 TYD: 2 Weke Totaal: 10 Instruksies
3 pages
030 - Mathematics and Mathematics Difficulties
No ratings yet
030 - Mathematics and Mathematics Difficulties
159 pages
Definite Ness
No ratings yet
Definite Ness
398 pages
Fpga Smartnic n6000 PL Platform Product Brief
No ratings yet
Fpga Smartnic n6000 PL Platform Product Brief
5 pages
Evolve 1 Unit 1 PPT Lesson 3
No ratings yet
Evolve 1 Unit 1 PPT Lesson 3
9 pages
Đề 1 DEN 10
No ratings yet
Đề 1 DEN 10
59 pages
2022 Nicmem Slides
No ratings yet
2022 Nicmem Slides
26 pages
Lesson 4 - Sentence Structures
No ratings yet
Lesson 4 - Sentence Structures
48 pages
Opium of The People - Wikipedia
No ratings yet
Opium of The People - Wikipedia
6 pages
Assembly Language For x86 Processors: Chapter 1: Basic Concepts
No ratings yet
Assembly Language For x86 Processors: Chapter 1: Basic Concepts
41 pages
AI Unit 1 Notes
No ratings yet
AI Unit 1 Notes
15 pages
Xilinx Alveo sn1000 Product Brief
No ratings yet
Xilinx Alveo sn1000 Product Brief
2 pages
The Hydrogen Cipher (Judy Beebe)
100% (1)
The Hydrogen Cipher (Judy Beebe)
14 pages
MATH Q2 With TOS FINALE
No ratings yet
MATH Q2 With TOS FINALE
6 pages
A Bridge: - Verb (Used With Object), A Bridged, A Bridg Ing
No ratings yet
A Bridge: - Verb (Used With Object), A Bridged, A Bridg Ing
7 pages
Still Vs Yet
No ratings yet
Still Vs Yet
5 pages
ART Py-Pde A Python Package For Solving Partial Differential Equtions
No ratings yet
ART Py-Pde A Python Package For Solving Partial Differential Equtions
4 pages
United Independent Bengal
No ratings yet
United Independent Bengal
3 pages
Vasiliev PDF Istoria Imperiului Bizantin A A Imperiului Bizantin A A Vasiliev
0% (1)
Vasiliev PDF Istoria Imperiului Bizantin A A Imperiului Bizantin A A Vasiliev
4 pages
Unit 2 - Lecture 10 - RDBMS
No ratings yet
Unit 2 - Lecture 10 - RDBMS
15 pages
2.5 The Sylow Theorems
No ratings yet
2.5 The Sylow Theorems
2 pages
Narrative Report - INSET DAY 4
No ratings yet
Narrative Report - INSET DAY 4
2 pages
Concepts: Discourse Analysis Conversation Analysis
No ratings yet
Concepts: Discourse Analysis Conversation Analysis
9 pages
1 Syllabus For IINDUSTRY 4.0 - 20 Use Cases
No ratings yet
1 Syllabus For IINDUSTRY 4.0 - 20 Use Cases
5 pages
Rubrik Penilaian Pembentangan
No ratings yet
Rubrik Penilaian Pembentangan
3 pages

wp555 Vitis Networking p4

Uploaded by

wp555 Vitis Networking p4

Uploaded by

White Paper

Simplify Packet Processing Design

WP555 (v1.0) January 24, 2024

WP555 (v1.0) January 24, 2024

Figure 1: VNP4 Benefits (panels: Reduced Engineering Effort; High Quality, Performant Results)

FPGA Expertise for Packet Processing

• Productivity: The solution reduces development effort.

• Expansion: Packet-processing blocks generated by VNP4 can be deployed in parallel or

Programming Protocol-independent Packet Processing

• Native support for the P4₁₆ language

Figure 2: Conceptual Breakdown of Project Effort (panels: RTL Development Flow; P4 Development Flow)

Figure 3: VNP4 Example Design Test Bench (block labels include: Stimulus, Meta Data In, Golden)

Figure 4: VNP4 Customization GUI

Figure 5: VNP4 within Vivado IP Integrator

Rapid Prototyping and Time to Market

• To achieve higher total line rate processing or total packet rate

Figure 6: Multi-stage Ingress and Egress Pipelines

Figure 7: 4x100G Pipelines

• There is no fixed limit to the number of parsing states or header extracts

Table 1: Examples of Parameter Settings for Different Throughputs

Throughput Packet Rate Data Bus Width Clock Frequency

200 300 128 336
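As a rough cross-check of the Packet Rate column, the worst-case Ethernet packet rate for a given line rate can be estimated as follows. This is a minimal sketch and not part of VNP4: the function name is hypothetical, and it assumes that throughput is given in Gb/s, that packet rate is given in millions of packets per second, and that the worst case is a 64-byte minimum frame plus 20 bytes of preamble and inter-frame gap on the wire.

```python
def max_packet_rate_mpps(line_rate_gbps, frame_bytes=64, overhead_bytes=20):
    """Worst-case packet rate in millions of packets per second.

    Assumes standard Ethernet framing: a 64-byte minimum frame plus
    20 bytes of per-packet wire overhead (8-byte preamble/SFD and a
    12-byte inter-frame gap). The units are assumptions, not taken
    from the table itself.
    """
    bits_per_packet = (frame_bytes + overhead_bytes) * 8
    packets_per_second = line_rate_gbps * 1e9 / bits_per_packet
    return packets_per_second / 1e6


rate = max_packet_rate_mpps(200)
print(f"{rate:.1f} Mpps")  # about 297.6 for a 200 Gb/s line rate
```

Under these assumptions, a 200 Gb/s line rate gives roughly 298 million packets per second, which is in line with the value of 300 listed in the row above.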

Figure 8: Parse Graph for consolidated_switch_xsa.p4

Parser states shown in the figure: parse_fabric_header_mirror, parse_fabric_header_multicast, parse_fabric_sflow_header, parse_arp_rarp, parse_ipv4, parse_ipv6, parse_mpls, parse_set_prio_high, parse_icmp, parse_tcp, parse_ipv4_in_ip, parse_udp, parse_ipv6_in_ip, parse_int_val_22, parse_gre, parse_mpls_bos, parse_geneve, parse_int_val_23, parse_nvgre, parse_erspan_t3, parse_eompls, parse_gre_ipv4, parse_inner_ethernet, parse_mpls_inner_ipv4, parse_gre_ipv6, parse_mpls_inner_ipv6, parse_inner_tcp, parse_inner_udp, parse_inner_icmp

Figure 9: Parse Graph for FiveTuple.p4

Table 2: P4 Design Examples

Supported Latency Tables as % of

VNP4 Example Designs

Table 3: VNP4 Example Designs

1. P4 Language Specification: https://p4.org/p4-spec/docs/p4-16-working-draft.html

Section Revision Summary

Please Read: Important Legal Notices

AUTOMOTIVE APPLICATIONS DISCLAIMER

AUTOMOTIVE PRODUCTS (IDENTIFIED AS "XA" IN THE PART NUMBER) ARE NOT
