2022 Nicmem Slides

The Benefits of General-Purpose On-NIC Memory

Boris Pismenny, Liran Liss, Adam Morrison, Dan Tsafrir
1
Data movers – definition
Apps that:
1. Are network-intensive
2. Process message metadata
3. Do not process message data

(Figure: a message split into metadata, which is processed, and data, which is not)
2
Data movers – types
1. Apps that process headers but not payload
− Examples: SW routers, NAT, load balancers, multicast, …

2. Apps that associate item key with item data


− Examples: key-value stores (Memcached, …), static webservers (Apache, …)

This talk is about the first type; the second is covered in the paper

4
Data movers – cost
Example: software router

Routing table:
Dst IP          Output port
172.16.1.0/24   0
172.16.2.0/24   1
172.16.3.0/24   0

Unnecessary, wasteful data movement!

(Figure: the NIC DMAs the whole packet over PCIe to the CPU and back; only the 64B header is rewritten (srcMAC/dstMAC replaced with newSrcMAC/newDstMAC; srcIP=172.16.1.100, dstIP=172.16.2.1), while the 1400B payload crosses PCIe unmodified)
5
Data movers – cost
Waste
• PCIe bandwidth
• Memory bandwidth
• CPU cycles (if mover isn’t zero-copy)
• LLC space & bandwidth
− DDIO allows the NIC to directly access the LLC

6
What we do in a nutshell
• Leave data on nicmem
• Copy only metadata

(Figure: the payload stays in nicmem; only the metadata is copied to the host)

7
NIC memory (nicmem) today
• Most NICs have internal SRAM memory
− For stateful offloading
▪ RDMA, steering, SR-IOV, …
− Size: a few MBs

• Nicmem is underutilized
− Only 15% is used by default in recent NVIDIA (Mellanox) NICs

• Nicmem is cheap & can easily be enlarged
− About $0.2 per MB at 7nm
− 3D stacking further reduces area and cost
8
Nicmem is like regular memory
• Expose nicmem as regular memory
− MMIO (like GPU frame buffers)
− Map into process virtual address space
− Dereference via regular pointers
− NIC queues can point to nicmem

struct packet {
    char *header; /* points into hostmem */
    char *data;   /* points into nicmem */
};
9
Leveraging Nicmem for NFV
• Baseline: host memory stores header and payload
1. NIC DMA writes the packet
2. NF processes the packet header
3. NIC DMA reads the packet

• Nicmem
− Splits header and payload
− Stores the payload in NIC memory

• Header inlining
− Writes the header inside the descriptor
− Back to one descriptor per packet

(Figure: three variants, (a) host mem, (b) nicmem, (c) nicmem + inline, each showing steps 1–3 across the Rx/Tx rings and PCIe; in (b) and (c) the payload stays in nicmem)
12
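The header-inlining variant can be sketched with a hypothetical descriptor layout (not any actual NIC's format): the header travels inline in the descriptor the NIC writes to host memory, the payload is referenced only by its nicmem address, and the NF's rewrite (as in the software-router example) touches just the inlined bytes:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical Rx descriptor for variant (c): the header bytes are
 * inlined in the descriptor, while the payload stays in nicmem and
 * is referenced by address only. */
struct rx_desc {
    uint16_t hdr_len;      /* bytes of inlined header */
    uint8_t  hdr[64];      /* inlined Ethernet/IP header */
    uint64_t payload_addr; /* payload location inside nicmem */
    uint16_t payload_len;  /* payload size (e.g. 1400B) */
};

/* NF step: rewrite the MAC addresses in the inlined header, as in
 * the software-router example; the payload is never touched. */
static void rewrite_macs(struct rx_desc *d,
                         const uint8_t new_dst[6], const uint8_t new_src[6])
{
    memcpy(d->hdr + 0, new_dst, 6); /* Ethernet dst MAC */
    memcpy(d->hdr + 6, new_src, 6); /* Ethernet src MAC */
}
```

On transmit, the NF hands the same descriptor back, so the NIC reads the payload from nicmem and only the small header recrosses PCIe, restoring one descriptor per packet.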
Bottlenecks
• NIC
• PCIe
• Memory bandwidth

13
Bottleneck: inside the NIC
• NIC Tx queue overflows

• Nicmem avoids the issue

(DPDK l3fwd running on a single core)

14
Bottleneck: PCIe
• PCIe links towards the host are full
− Increasing latency by 3x

• Nicmem avoids the issue

(DPDK l3fwd running on two cores)

15
Bottleneck: memory bandwidth
• Memory bandwidth use is 2.5x higher
− 15% lower throughput
− 10x higher latency

• Nicmem avoids the issue

(DPDK l3fwd running on eight cores)

16
Additional experimental results
• Nicmem improves scalability
• Nicmem is better than DDIO
• Nicmem outperforms NFV hardware acceleration

18
Nicmem improves scalability

(FastClick NAT loaded with 200Gbps)

19
Nicmem reduces DDIO use

(FastClick NAT running on 14 cores and loaded with 200Gbps)

20
Nicmem is preferable to NIC acceleration
• NIC memory can be used by
− Software as nicmem; or
− Hardware for per-flow acceleration state
• NIC acceleration eliminates CPU overhead
− But it doesn’t scale

(DPDK per-flow packet and byte counters running on 2 queues)


21
Conclusion
• Nicmem benefits data-mover applications

• Nicmem eliminates NIC, PCIe, and memory bandwidth bottlenecks

• Nicmem complements DDIO and outperforms hardware NFV acceleration

Have questions? Send me an email

Boris Pismenny: [email protected]

23
Non-data mover applications (1)

25
Non-data mover applications (2)

We find that header-data split is not free, because it requires both the CPU and the NIC to process two buffers per packet.

26
Practical considerations
• Today’s nicmem is small
− Each core’s queue is 1.5MB

• A single nicmem queue eliminates the PCIe bottleneck

• Using nicmem for all queues reduces memory bandwidth use
(FastClick NAT running on 14 cores with 200Gbps)

27