Lecture 2 - Developing Virtual Network Functions

The document discusses the development of Virtual Network Functions (VNFs) in cloud computing, emphasizing the need to virtualize network functions to enhance performance and portability. It introduces the Data Plane Development Kit (DPDK) as a solution for efficient user-space packet processing, highlighting its features such as bypassing the Linux kernel and enabling zero-copy packet processing. The lecture also covers performance issues and implementation strategies for VNFs, using a load balancer as a concrete example.


System Issues in Cloud Computing

Mini-course: Network Function Virtualization

KISHORE RAMACHANDRAN, PROFESSOR


School of Computer Science
College of Computing
Lecture 2 - Developing Virtual Network Functions
Opening headshot
In the first lecture, we introduced the role played by network functions in the
enterprise computing ecosystem. We also identified the need for liberating
network functions from vendor-locked middleboxes and implementing them as
software entities running on commodity servers. Further, to make such functions
portable across platforms, we argued that virtualizing the network functions so
that they can run on hypervisors is the right approach. In this lecture, we will
take an in-depth look at virtual network functions. In particular, we will discuss the
issues in developing virtual network functions and the emerging technologies for
aiding the performance-conscious development of virtual network functions.
Outline
● Virtual network functions (VNF): revisiting “load balancer” as a
concrete example
● Performance issues in implementing virtual network functions
● Performance-conscious implementation of virtual network functions
● Data Plane Development Kit (DPDK) - an exemplar for user-space
packet processing
● Implementation of VNF on commodity hardware using DPDK
● Putting it together: load balancer example using DPDK
Virtual Network Functions
● Network functions implemented in user-space on top of hypervisor
● Load balancer as a concrete example
○ Keeps a pool of backend service instances (e.g., HTTP Server)
○ Distributes incoming packet flows to a specific instance to exploit inherent
parallelism in the hardware platform and balance the load across all the service
instances.
Architecture of a load balancer network function
● Distribute client connections to a pool of backend service instances
○ For example, HTTP servers
● Use the packet's 5-tuple to choose the backend instance
○ Provides connection-level affinity
○ Same connection is sent to the same backend instance
[Figure: the load balancer updates and looks up a table keyed by the packet 5-tuple, then forwards each packet to one of the backend service instances (Backend Instance 0, 1, 2) in the backend pool]
Architecture of a load balancer network function
[Figure: packet-processing flow inside the load balancer]
● Read packet via the recvfrom() system call (crossing from kernel space into user space)
● Extract connection info (the 5-tuple) from the packet header
● Look up the connection info in the table
○ If a match is found, send the packet to the corresponding backend service instance
○ If a match is not found, select a backend instance, add it to the table, and then send the packet
● Send the packet out via the sendto() call
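To make the lookup-or-install step concrete, here is a minimal C sketch of a connection table providing connection-level affinity. The fixed-size hash table, FNV-1a hash, and round-robin backend selection are illustrative choices for this sketch, not part of any particular load balancer implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE   1024
#define NUM_BACKENDS 4

struct five_tuple {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;
};

struct entry {
    struct five_tuple key;
    int backend;                 /* chosen backend instance */
    int in_use;
};

static struct entry table[TABLE_SIZE];
static int next_backend;         /* round-robin selector for new connections */

static uint32_t hash_tuple(const struct five_tuple *t) {
    /* FNV-1a over the tuple bytes (illustrative hash) */
    const uint8_t *p = (const uint8_t *)t;
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < sizeof *t; i++) { h ^= p[i]; h *= 16777619u; }
    return h;
}

/* Look up the connection; on a miss, pick a backend and remember it,
 * so the same connection is always sent to the same backend. */
int balance(const struct five_tuple *t) {
    uint32_t idx = hash_tuple(t) % TABLE_SIZE;
    for (uint32_t i = 0; i < TABLE_SIZE; i++) {   /* linear probing */
        struct entry *e = &table[(idx + i) % TABLE_SIZE];
        if (e->in_use && memcmp(&e->key, t, sizeof *t) == 0)
            return e->backend;                    /* match found */
        if (!e->in_use) {                         /* match not found: install */
            e->key = *t;
            e->backend = next_backend++ % NUM_BACKENDS;
            e->in_use = 1;
            return e->backend;
        }
    }
    return -1;  /* table full */
}
```

Calling balance() twice with the same 5-tuple returns the same backend, while a different 5-tuple may be mapped elsewhere.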
Outline
● Virtual network functions (VNF): revisiting “load balancer” as a
concrete example
● Performance issues in implementing virtual network functions
● Performance-conscious implementation of virtual network functions
● Data Plane Development Kit (DPDK) - an exemplar for user-space
packet processing
● Implementation of VNF on commodity hardware using DPDK
● Putting it together: load balancer example using DPDK
Eliminating the overhead of virtualization
● Network function is in the critical path of packet processing
● Need to eliminate the overhead of virtualization
○ Intel VT-d allows the NIC to bypass the VMM (i.e., the hypervisor) by direct
mapping user-space buffers for DMA and passing the device interrupt directly to
the VM above the VMM
● Is that enough?
○ Unfortunately no…
○ To fully appreciate why
■ let’s look at the path of packet processing in an OS like Linux...
Packet Processing in Linux
● NIC uses DMA to write incoming
packet to a receive ring buffer
allocated to the NIC
● NIC generates an interrupt which is
delivered to the OS by the CPU
● OS handles the interrupt, allocates
kernel buffer and copies DMA’d
packet into the kernel buffer for IP
and TCP processing
● After protocol processing, packet
payload is copied to application
buffer (user-space) for processing by
the application
An example networking app on the Linux kernel
● A web server on Linux
● 83% of the CPU time spent in the kernel
● This is not good news even for a networking app such as a web server
● This is REALLY BAD news if the app is a network function...
Source: mTCP: A Highly Scalable User-level TCP Stack for Multicore Systems, https://www.usenix.org/node/179774
Network Functions on Linux kernel
● Performance hits
○ One interrupt for each incoming packet
○ Dynamic memory allocation (packet buffer) on a per packet basis
○ Interrupt service time
○ Context switch to kernel and then to the application implementing the NF
○ Copying packets multiple times
■ From DMA buffer to kernel buffer
■ Kernel buffer to user-space application buffer
■ Note that an NF may or may not need TCP/IP protocol stack traversal in
the kernel, depending on its functionality
Outline
● Virtual network functions (VNF): revisiting “load balancer” as a
concrete example
● Performance issues in implementing virtual network functions
● Performance-conscious implementation of virtual network functions
● Data Plane Development Kit (DPDK) - an exemplar for user-space
packet processing
● Implementation of VNF on commodity hardware using DPDK
● Putting it together: load balancer example using DPDK
Circling back to Virtualizing Network
Functions...
● Intel VT-d provides the means to bypass the hypervisor and go
directly to the VM (i.e., the Guest Kernel, which is usually Linux)
○ NF is the app on top of Linux
● Guest kernel presents a new bottleneck
○ Specifically for NF applications
○ Sole purpose of NF applications is to read/write packets from/to NICs
● Slowdown due to kernel is prohibitive
○ Dennard scaling broke down in 2006 => CPU clock frequencies are not increasing
significantly from generation to generation => CPU speeds are not keeping up
with network speeds
○ NICs can handle more packets per second => increasing pressure on CPU
● So it is not sufficient to bypass the VMM for NF virtualization
○ We have to bypass the kernel as well...
Performance-Conscious Packet Processing
Alternatives
Bypassing the Linux kernel
● Netmap, PF_RING ZC, and Linux Foundation DPDK
These alternatives share common features:
● Rely on polling to read packets instead of interrupts
● Pre-allocate buffers for packets
● Zero-copy packet processing
○ NIC uses DMA to write packets into pre-allocated application
buffers
● Process packets in batches
Outline
● Virtual network functions (VNF): revisiting “load balancer” as a
concrete example
● Performance issues in implementing virtual network functions
● Performance-conscious implementation of virtual network functions
● Data Plane Development Kit (DPDK) - an exemplar for user-space
packet processing
● Implementation of VNF on commodity hardware using DPDK
● Putting it together: load balancer example using DPDK
Data Plane Development Kit
● Developed by Intel in 2010
○ Now an open source project under the Linux Foundation (https://www.dpdk.org/)
● Libraries to accelerate packet processing
● Targets wide variety of CPU architectures
● User-space packet processing to avoid overheads of Linux Kernel
Features of DPDK
● Buffers for storing incoming and outgoing packets are in user-space memory
○ Directly accessed by the NIC via DMA
● NIC configuration registers are mapped in user-space memory
○ PCIe configuration space (https://en.wikipedia.org/wiki/PCI_configuration_space)
○ Can be modified directly by the userspace application
● Effectively bypasses the kernel for interacting with the NIC
[Figure: the transmit buffer, receive buffer, and NIC configuration registers are located in user-space memory; the NIC DMAs packets between them and the network while the app accesses them directly]
DPDK is a user-space library
● Very small component in the kernel
○ Used for initialization of userspace packet processing
○ https://doc.dpdk.org/guides/linux_gsg/linux_drivers.html
● Needed to initialize the NIC to DMA to appropriate memory locations
● Sets up memory mapping for configuration registers on the NIC
○ PCI configuration space
○ Updating those registers is then done directly from user space
Image source: https://www.slideshare.net/LiorBetzalel/introduction-to-dpdk-and-exploration-of-acceleration-techniques
Poll Mode Driver
● Allows accessing Receive (RX) and Transmit (TX) queues
● Interrupts on packet arrival are disabled
● CPU is always busy polling for packets even if there are no packets to be received
● Receive and Transmit in batches for efficiency

    while (true) {
        buff ← bulk_receive(in_port)
        for pkt in buff:
            out_port ← look_up(pkt.header)
            # Handle failed lookup somehow
            out_buffs[out_port].append(pkt)
        for out_port in out_ports:
            bulk_transmit(out_buffs[out_port])
    }
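The poll-mode loop above can be rendered as plain C with the NIC I/O stubbed out in software. In a real DPDK application, the receive and transmit steps would be burst calls into the poll-mode driver (e.g., rte_eth_rx_burst()/rte_eth_tx_burst()); here the packet format, look_up(), and the port layout are purely illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PORTS 2
#define BURST     32

struct pkt { uint32_t dst_ip; };

/* Illustrative lookup: route by the low bit of the destination address. */
static int look_up(const struct pkt *p) { return (int)(p->dst_ip & 1); }

/* One iteration of the poll loop: batch in, classify each packet,
 * batch out. n must be <= BURST. The number of packets handed to each
 * output port is reported via tx_counts (standing in for bulk_transmit). */
void process_burst(const struct pkt *rx, int n, int tx_counts[]) {
    const struct pkt *out_buffs[NUM_PORTS][BURST];
    int out_len[NUM_PORTS] = {0};

    for (int i = 0; i < n; i++) {                 /* classify */
        int port = look_up(&rx[i]);
        out_buffs[port][out_len[port]++] = &rx[i];
    }
    for (int p = 0; p < NUM_PORTS; p++)           /* bulk "transmit" */
        tx_counts[p] = out_len[p];
    (void)out_buffs;  /* a real driver would DMA these out */
}
```

Batching amortizes per-call overhead across many packets, which is the efficiency argument behind the bulk receive/transmit bullets above.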
NIC Ring Buffer
● Each NIC queue is implemented as a ring buffer
[Not specific to DPDK; https://www.linuxjournal.com/content/queueing-linux-network-stack]
● Each slot in the ring buffer holds a "descriptor" for a packet
○ Descriptor contains a pointer to the actual packet data, and other metadata
○ The actual packet is stored in another buffer data structure
[Figure: ring buffer with a read pointer (advanced when the CPU reads packets) and a write pointer (advanced when the NIC receives packets)]
NIC Ring Buffer
● Upon packet arrival, the NIC populates the next vacant slot with the packet's descriptor
● The CPU core running the NF polls the ring for unread slots
● When new descriptors are found
○ CPU reads the packet data for those descriptors
○ Returns packets to the application
● No need for locking: producer and consumer are decoupled in the ring buffer
● If there are no vacant descriptor slots in the ring buffer, the NIC drops packets
[Figure: the same ring buffer, with the read pointer advanced by the CPU and the write pointer advanced by the NIC]
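A minimal single-producer/single-consumer ring in C illustrates why no locking is needed: the producer (standing in for the NIC) advances only the write pointer, and the consumer (the polling core) advances only the read pointer. This single-threaded sketch uses ints as stand-in descriptors and omits the memory-ordering barriers a real multi-core ring would need.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8   /* power of two, so modulo stays cheap */

struct ring {
    int      slots[RING_SIZE];
    uint32_t write;   /* advanced only by the producer (the "NIC") */
    uint32_t read;    /* advanced only by the consumer (the CPU core) */
};

/* Producer side: returns 0 and drops the "packet" if the ring is full,
 * mirroring NIC behaviour when no descriptor slot is vacant. */
int ring_put(struct ring *r, int desc) {
    if (r->write - r->read == RING_SIZE)
        return 0;                         /* full: packet dropped */
    r->slots[r->write % RING_SIZE] = desc;
    r->write++;
    return 1;
}

/* Consumer side: poll for an unread slot; returns 0 if nothing new. */
int ring_get(struct ring *r, int *desc) {
    if (r->read == r->write)
        return 0;                         /* nothing to read */
    *desc = r->slots[r->read % RING_SIZE];
    r->read++;
    return 1;
}
```

Because each side mutates only its own pointer, the producer and consumer never race on the same field, which is the decoupling the slide refers to.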
Pre-allocated buffers for storing packets
● Instead of allocating a buffer for each incoming packet, DPDK preallocates multiple buffers on initialization
● Each RX queue in the NIC can hold no more packets than the capacity of the ring buffer
○ Total size of packet buffers is thereby known = capacity of the ring
[Figure: each ring slot points to a pre-allocated buffer for holding incoming packets]
Pre-allocated buffers for storing packets
● An incoming packet is DMA'd into a pre-allocated buffer at the same time its descriptor is added to the ring buffer
● DPDK uses hugepages to maintain large pools of memory
○ Each page is 2 MB in size (compared to traditional 4 KB pages)
○ Fewer pages ⇒ fewer TLB misses ⇒ improved performance
[Figure: the NIC DMAs packet data into a pre-allocated buffer and advances the ring write pointer]
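The pre-allocation idea can be sketched as a simple free-list pool in C, loosely standing in for DPDK's mempool: every buffer is carved out once at initialization, so the per-packet fast path never calls malloc and total memory use is bounded and known up front. The pool size, buffer size, and names here are illustrative.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4
#define BUF_BYTES 2048

struct pktbuf {
    struct pktbuf *next;          /* free-list link */
    char data[BUF_BYTES];         /* packet payload area */
};

static struct pktbuf pool[POOL_SIZE];   /* allocated once, up front */
static struct pktbuf *free_list;

/* Thread all buffers onto the free list at initialization time. */
void pool_init(void) {
    free_list = NULL;
    for (int i = 0; i < POOL_SIZE; i++) {
        pool[i].next = free_list;
        free_list = &pool[i];
    }
}

/* O(1) "allocation": pop a free-list entry; no malloc on the fast path. */
struct pktbuf *buf_alloc(void) {
    struct pktbuf *b = free_list;
    if (b) free_list = b->next;
    return b;                     /* NULL when the pool is exhausted */
}

/* O(1) release: push the buffer back for reuse. */
void buf_free(struct pktbuf *b) {
    b->next = free_list;
    free_list = b;
}
```

When the pool is exhausted the allocator returns NULL rather than growing, which matches the bounded-capacity property of the ring described above.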
No overhead of copying packet data
● NIC DMA transfers packets directly to userspace buffers
● Protocol processing (TCP/IP) is done using those buffered packets in place
○ ...if needed by the network function [Note: not all NFs require TCP/IP processing]
Upshot of NF using DPDK and Intelligent NICs
● All the kernel overheads in packet processing (alluded to earlier)
mitigated/eliminated
● Results in performance-conscious implementation of the VNF
● Developer of NF can concentrate on just the functionality of the NF
○ DPDK alleviates all the packet processing overheads for any NF
Outline
● Virtual network functions (VNF): revisiting “load balancer” as a
concrete example
● Performance issues in implementing virtual network functions
● Performance-conscious implementation of virtual network functions
● Data Plane Development Kit (DPDK) - an exemplar for user-space
packet processing
● Implementation of VNF on commodity hardware using DPDK
● Putting it together: load balancer example using DPDK
DPDK optimizations
Various optimization opportunities are available in DPDK to improve packet processing.

Each optimization attempts to eliminate a particular source of performance loss in the Linux kernel and/or to exploit hardware features in NICs and modern CPUs.

Now let's look at using DPDK to implement NFs on commodity hardware.
Implementing NFs using DPDK on commodity
H/W
● Modern commodity servers contain multi-core CPUs
● Using multiple cores for packet processing can allow us to match
the increasing capacities of NICs
● NUMA servers
○ Multiple sockets each with given number of cores and local RAM
○ Accessing remote RAM is much more expensive than local RAM
Upshot
● Need to carefully design the packet processing path from NIC to NF, taking these hardware trends into account
● Partnership between system software and hardware
DPDK Application Model
1. Run-to-completion model
● Polling for incoming packets, processing on packet, and
transmission of output packet all done by the same core
● Each packet is handled by a unique core
2. Pipelined model
● Dedicated cores for polling and processing packets
● Inter-core packet transfer using ring buffers
Run-to-completion model
● All cores responsible for both I/O and packet processing
● Simplest model
● Each packet sees only one core
○ Works for monolithic packet processing code
○ When all the packet processing logic is contained inside a single thread
○ Simple to implement but less expressive
[Figure: each core polls the RX queues, processes packets, and inserts them into the TX queues]
Pipelined execution model
● Dedicate cores for processing NF logic
● Some cores are dedicated for reading packets
● Each packet sees multiple cores
○ Can be used to chain multiple packet processing logics (within an NF!)
○ E.g., IN → Firewall → Router → OUT
● Inter-core communication done using queue buffers in memory
● Also useful when packet processing is CPU bound, so having number of polling cores < number of processing cores is a good choice
○ E.g., Intrusion Detection System
[Figure: polling cores pull packets from the RX queues and hand them via in-memory queues to processing cores, which insert them into the TX queues]
Multi-core implementation challenges
● How to ensure that processing done by two distinct cores doesn't interfere with each other?
○ If different packets of the same connection are sent to different cores, sharing NF state will be a nightmare.
● How to ensure that cores participating in inter-core communication are on the same NUMA node?
● How to ensure that the cores processing packets are on the same NUMA socket as the NIC?
Receive side scaling: Useful hardware technology
● Enabler of multi-core processing
● Use hashing to distribute incoming packets to individual cores
○ Hash function takes the 5-tuple of the packet as input
○ src_ip, dst_ip, src_port, dst_port, proto
● Each core is assigned a unique ring buffer to poll
○ No contention among threads
● Different connection ⇒ different queue (ring) ⇒ different core
○ Per-connection state is accessed only by a single core, so state management is easy
[Figure: the NIC's hash function spreads incoming packets across per-core ring buffers]
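In software, the RSS dispatch decision looks like the following C sketch. Real NICs typically use a Toeplitz hash with a configurable key; the FNV-1a hash, struct layout, and queue count here are illustrative stand-ins for that hardware step.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define NUM_QUEUES 4   /* one ring (and core) per queue */

struct five_tuple {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;
};

/* FNV-1a, standing in for the NIC's Toeplitz hash. */
static uint32_t fnv1a(const void *buf, size_t len) {
    const uint8_t *p = buf;
    uint32_t h = 2166136261u;
    while (len--) { h ^= *p++; h *= 16777619u; }
    return h;
}

/* Map a packet to a queue (and hence a core). The same 5-tuple always
 * lands on the same queue, so per-connection state stays core-local. */
int rss_queue(const struct five_tuple *t) {
    return (int)(fnv1a(t, sizeof *t) % NUM_QUEUES);
}
```

Because the mapping is a pure function of the 5-tuple, no two cores ever share a connection's state, which is exactly the property the slide relies on.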
Multi-core support in DPDK
Allows the admin to specify the following (hardware/software partnership):
● Mapping of a specific RX queue to a specific CPU core
○ Port 0 - RX queue 1 → CPU core 6
○ CPU core 6 → Port 1 - TX queue 2
○ Flexible to create as many queues as the admin wants
● Each thread is pinned to a specific core
○ To avoid contention
● Each thread/core runs the same code
NUMA awareness in DPDK
● DPDK creates memory pools for inter-core communication on the
same NUMA socket as the cores involved
● Ring buffers are allocated on the same socket as the NIC and cores
selected for processing
● Remote memory access is minimized
Outline
● Virtual network functions (VNF): revisiting “load balancer” as a
concrete example
● Performance issues in implementing virtual network functions
● Performance-conscious implementation of virtual network functions
● Data Plane Development Kit (DPDK) - an exemplar for user-space
packet processing
● Implementation of VNF on commodity hardware using DPDK
● Putting it together: load balancer example using DPDK
Putting it all together: Load balancer
application
● Multi-threaded load balancer
○ Run-to-completion model
○ Each thread performing identical processing
● Dedicated Rx and Tx queues for each core
Scalable implementation of Load Balancer on Multi-core CPU using DPDK
[Figure: RSS on the NIC distributes incoming packets across per-core ring buffers; each CPU core runs its own Load Balancer thread with its own connection table]
Each thread within a core
[Figure: per-thread packet-processing flow, entirely in user space]
● Poll for new packet descriptors (the NIC DMAs packets in and advances the ring write pointer)
● Read packet data for new descriptors
● Extract connection info (the 5-tuple) from the packet header
● Look up the connection info in the table
○ If a match is found, send the packet to the corresponding backend service instance
○ If a match is not found, select a backend instance and add it to the table
● Write the output packet and advance the write pointer; the packet is DMA'd to the NIC
Closing headshot
In this lecture we saw how to build performance-conscious virtual network
functions. The status quo is to implement network functions on top of the Linux
kernel. We identified the sources of performance bottlenecks with this approach
to network function implementation. We discussed the general trend towards
accelerating packet processing via mechanisms that bypass the kernel. We used
Intel's DPDK as an exemplar to understand how the kernel overhead can be
mitigated. Further, we also saw the optimization opportunities offered by DPDK
for exploiting the inherent multi-core capabilities of modern processors. Thanks to the
hardware/software partnership offered by technologies such as VT-d and DPDK, it
is now possible to have a very performance-conscious user-space implementation
of virtual network functions.
Resources
1. Comparison of Frameworks for High-Performance Packet IO
https://www.net.in.tum.de/publications/papers/gallenmueller_ancs2015.pdf
2. mTCP: A Highly Scalable User-level TCP Stack for Multicore Systems
https://www.usenix.org/node/179774
3. On Kernel-Bypass Networking and Programmable Packet Processing
https://medium.com/@penberg/on-kernel-bypass-networking-and-programmable-packet-processing-799609b06898
4. Introduction to DPDK: Architecture and Principles
https://blog.selectel.com/introduction-dpdk-architecture-principles/
5. DPDK architecture
https://doc.dpdk.org/guides/prog_guide/overview.html
6. A Look at Intel's Dataplane Development Kit
https://www.net.in.tum.de/fileadmin/TUM/NET/NET-2014-08-1/NET-2014-08-1_15.pdf
7. Linux drivers
https://doc.dpdk.org/guides/linux_gsg/linux_drivers.html
Credits for figures used in this presentation
1. mTCP: A Highly Scalable User-level TCP Stack for Multicore Systems
https://www.usenix.org/node/179774
2. Introduction to DPDK and exploration of acceleration techniques
https://www.slideshare.net/LiorBetzalel/introduction-to-dpdk-and-exploration-of-acceleration-techniques
3. On Kernel-Bypass Networking and Programmable Packet Processing
https://medium.com/@penberg/on-kernel-bypass-networking-and-programmable-packet-processing-799609b06898
4. Dynamic Tracing with DTrace & SystemTap: Network stack
https://myaut.github.io/dtrace-stap-book/kernel/net.html
5. Introduction to DPDK: Architecture and Principles
https://blog.selectel.com/introduction-dpdk-architecture-principles/
