Throughput and Latency of Virtual Switching With Open Vswitch: A Quantitative Analysis
Abstract Virtual switches, like Open vSwitch, have emerged as an important part of today's data centers. They connect interfaces of virtual machines and provide an uplink to the physical network via network interface cards. We discuss usage scenarios for virtual switches involving physical and virtual network interfaces. We present extensive black-box tests to quantify the throughput and latency of software switches with emphasis on the market leader, Open vSwitch. Finally, we explain the observed effects using white-box measurements.

Keywords Network measurement · Cloud · Performance evaluation · Performance characterization · MoonGen

P. Emmerich · D. Raumer · S. Gallenmüller · F. Wohlfart · G. Carle
Technical University of Munich, Department of Informatics, Chair of Network Architectures and Services
Boltzmannstr. 3, 85748 Garching, Germany
E-mail: {emmericp|raumer|gallenmu|wohlfart|carle}@net.in.tum.de

1 Introduction

Software switches form an integral part of any virtualized computing setup. They provide network access for virtual machines (VMs) by linking virtual and physical network interfaces. The deployment of software switches in virtualized environments has led to the extended term virtual switches and paved the way for the mainstream adoption of software switches [37], which did not receive much attention before. In order to meet the requirements of a virtualized environment, new virtual switches have been developed that focus on performance and provide advanced features in addition to the traditional benefits of software switches: high flexibility, vendor independence, low costs, and conceptual benefits for switching without Ethernet bandwidth limitations. The most popular virtual switch implementation, Open vSwitch (OvS [43]), is heavily used in cloud computing frameworks like OpenStack [7] and OpenNebula [6]. OvS is an open source project that is backed by an active community and supports common standards such as OpenFlow, SNMP, and IPFIX.

The performance of packet processing in software depends on multiple factors, including the underlying hardware and its configuration, the network stack of the operating system, the virtualization hypervisor, and traffic characteristics (e.g., packet size, number of flows). Each factor can significantly hurt the performance, which motivates systematic experiments to study the performance of virtual switching. We carry out experiments to quantify performance-influencing factors and describe the overhead that is introduced by the network stack of virtual machines, using Open vSwitch in representative scenarios.

Knowing the performance characteristics of a switch is important when planning or optimizing the deployment of a virtualization infrastructure. We show how one can drastically improve performance by using a different IO backend for Open vSwitch. Explicitly mapping virtual machines and interrupts to specific cores is also an important configuration of a system, as we show with a measurement.

The performance of virtual data plane forwarding capabilities is a key issue for migrating existing services into VMs when moving from a traditional data center to a cloud system like OpenStack. This is especially important for applications like web services which make extensive use of the VM's networking capabilities.

The remainder of this paper is structured as follows: Section 2 provides an overview of software switching. We explain recent developments in hardware and software that enable sufficient performance in general purpose PC systems based on commodity hardware, highlight challenges, and provide an overview of Open vSwitch. Furthermore, we present related work on performance measurements in Section 3. Section 4 explains the different test setups for the measurements of this paper. Section 5 and Section 6 describe our study on the performance of software switches and their delay, respectively. Ultimately, Section 7 sums up our results and gives advice for the deployment of software switches.
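The explicit mapping of virtual machines and interrupts to specific cores mentioned above can be expressed with standard Linux tools. A configuration sketch, assuming a libvirt guest named vm0, a NIC called eth0, and an example IRQ number taken from /proc/interrupts (all of these names are placeholders, not the paper's test configuration):

```shell
# Pin the VM's two virtual CPUs to host cores 2 and 3
# (taskset on the QEMU PID works for non-libvirt setups).
virsh vcpupin vm0 0 2
virsh vcpupin vm0 1 3

# Find the NIC's queue interrupts ...
grep eth0 /proc/interrupts

# ... and steer them to a disjoint core (here: core 1; 42 is an
# example IRQ number read from the output above).
echo 1 > /proc/irq/42/smp_affinity_list
```
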
2 Software Switches

A traditional hardware switch relies on special purpose hardware, e.g., content addressable memory to store the forwarding or flow table, to process and forward packets. In contrast, a software switch is the combination of commodity PC hardware and software for packet switching and manipulation. Packet switching in software grew in importance with the increasing deployment of host virtualization. Virtual machines (VMs) running on the same host system must be interconnected and connected to the physical network. If the focus lies on switching between virtual machines, software switches are often referred to as virtual switches. A virtual switch is an addressable switching unit of potentially many software and hardware switches spanning over one or more physical nodes (e.g., the "One Big Switch" abstraction [29]). Compared to the default VM bridging solutions, software switches like OvS are more flexible and provide a whole range of additional features like advanced filter rules to implement firewalls and per-flow statistics tracking.

[Fig. 1 Application scenario of a virtual switch: VMs attach via vNICs to the software switch, which connects via pNICs to the physical switches and crosslink connections]

Figure 1 illustrates a typical application for virtual switching with both software and hardware switches. The software switch connects the virtual network interface cards (vNICs) with the physical NICs (pNICs). Typical applications in virtualization environments include traffic switching from pNIC to vNIC, vNIC to pNIC, and vNIC to vNIC. For example, OpenStack recommends multiple physical NICs to separate networks and forwards traffic between them on network nodes that implement firewalling or routing functionality [39]. Packet flows traversing a chain of VMs are also discussed as components of future network architectures [33].

Although hardware switches are currently the dominant way to interconnect physical machines, software switches like Open vSwitch come with broad support of OpenFlow features and were the first to support new versions. Therefore, pNIC to pNIC switching allows software switches to be an attractive alternative to hardware switches in certain scenarios. For software switches the number of entries in the flow table is just a matter of configuration, whereas it is limited to a few thousand in hardware switches [49].

2.1 State of the Art

Multiple changes in the system and CPU architectures significantly increase the packet processing performance of modern commodity hardware: integrated memory controllers in CPUs, efficient handling of interrupts, and offloading mechanisms implemented in the NICs. Important support mechanisms are built into the network adapters: checksum calculations and distribution of packets directly to the addressed VM [8]. NICs can transfer packets into memory (DMA) and even into the CPU caches (DCA) [4] without involving the CPU. DCA improves the performance by reducing the number of main memory accesses [26]. Further methods such as interrupt coalescence aim at allowing batch-style processing of packets. These features mitigate the effects of interrupt storms and therefore reduce the number of context switches. Network cards support modern hardware architecture principles such as multi-core setups: Receive Side Scaling (RSS) distributes incoming packets among queues that are attached to individual CPU cores to maintain cache locality on each packet processing core.

These features are available in commodity hardware, and the driver needs to support them. These considerations apply to packet switching in virtual host environments as well as between physical interfaces. As the CPU proves to be the main bottleneck [18, 35, 13, 47], features like RSS and offloading are important to reduce CPU load and help to distribute load among the available cores.

Packet forwarding apps such as Open vSwitch [43, 5], the Linux router, or the Click Modular Router [31] avoid copying packets when forwarding between interfaces by performing the actual forwarding in a kernel module.
However, forwarding a packet to a user space application or a VM requires a copy operation with the standard Linux network stack. There are several techniques based on memory mapping that can avoid this by giving a user space application direct access to the memory used by the DMA transfer. Prominent examples of frameworks that implement this are PF_RING DNA [16], netmap [46], and DPDK [9, 1]. E.g., with DPDK running on an Intel Xeon E5645 (6x 2.4 GHz cores), an L3 forwarding performance of 35.2 Mpps can be achieved [9]. We showed in previous work that these frameworks not only improve the throughput but also reduce the delay [21]. Virtual switches like VALE [48] achieve over 17 Mpps vNIC to vNIC bridging performance by utilizing shared memory between VMs and the hypervisor.

[…] general purpose software switch that connects physically separated nodes. It supports OpenFlow and provides advanced features for network virtualization.

[Fig. 2 Open vSwitch architecture: the user space daemon ovs-vswitchd (ofproto, dpif) implements the slow path and talks to an external OpenFlow controller; the kernel datapath module implements the fast path between vNICs and pNICs]

[…] against OpenFlow rules, which can be added by an external OpenFlow controller or via a command line interface. The daemon derives datapath rules for packets based on the OpenFlow rules and installs them in the kernel module so that future packets of this flow can take the fast path. All rules in the datapath are associated with an inactivity timeout. The flow table in the datapath therefore only contains the rules required to handle the currently active flows, so it acts as a cache for the bigger and more complicated OpenFlow flow table in the slow path.

3 Related Work

Detailed performance analyses of PC-based packet processing systems have been continuously addressed in the past. In 2005, Tedesco et al. [51] presented measured latencies for packet processing in a PC and subdivided them into different internal processing steps. In 2007, Bolla and Bruschi [13] performed pNIC to pNIC measurements (according to RFC 2544 [14]) on a software router based on Linux 2.6¹. Furthermore, they used profiling to explain their measurement results. Dobrescu et al. [18] revealed performance influences of multi-core PC systems under different workloads [17]. Contributions to the state of the art of latency measurements in software routers were also made by Angrisani et al. [10] and Larsen et al. [32], who performed a detailed analysis of TCP/IP traffic latency. However, they only analyzed the system under low load, while we look at the behavior under increasing load up to 10 Gbit/s. A close investigation of the latency in packet processing software like OvS is presented by Beifuß et al. [11].

In the context of different modifications to the guest and host OS network stack (cf. Section 2.1), virtual switching performance was measured [48, 33, 15, 44, 47, 27], but the presented data provide only limited possibility for direct comparison. Other studies addressed the performance of virtual switching within a performance analysis of cloud data centers [53], but provide less detailed information on virtual switching performance.

Running network functions in VMs and connecting them via a virtual switch can be used to implement network function virtualization (NFV) with service function chaining (SFC) [23]. Martins et al. present ClickOS, a software platform for small and resource-efficient virtual machines implementing network functions [34]. Niu et al. discuss the performance of ClickOS [34] and SoftNIC [24] when used to implement SFC [38]. Panda et al. consider the overhead of virtualization too high to implement SFC and present NetBricks [40], an NFV framework for writing fast network functions in the memory-safe language Rust. Our work does not focus on NFV: we provide benchmark results for Open vSwitch, a mature and stable software switch that supports arbitrary virtual machines.

The first two papers from the OvS developers [41, 42] only provide coarse measurements of throughput performance in bits per second in vNIC to vNIC switching scenarios with Open vSwitch. Neither frame lengths nor measurement results in packets per second (pps) nor delay measurements are provided. In 2015 they published design considerations for efficient packet processing and how they are reflected in the OvS architecture [43]. In this publication, they also presented a performance evaluation with focus on the FIB lookup, as this is supported by hierarchical caches in OvS. In [12] the authors measured a software OpenFlow implementation in the Linux kernel that is similar to OvS. They compared the performance of the data plane of the Linux bridge-utils software, the IP forwarding of the Linux kernel, and the software implementation of OpenFlow, and studied the influence of the size of the used lookup tables. A basic study on the influence of QoS treatment and network separation on OvS can be found in [25]. The authors of [28] measured the sojourn time of different OpenFlow switches. Although the main focus was on hardware switches, they measured a delay between 35 and 100 microseconds for the OvS datapath. Whiteaker et al. [54] observed a long-tail distribution of latencies when packets are forwarded into a VM, but their measurements were restricted to a 100 Mbit/s network due to hardware restrictions of their time stamping device. Rotsos et al. [49] presented OFLOPS, a framework for OpenFlow switch evaluation. They applied it, amongst others, to Open vSwitch. Deployed on systems with a NetFPGA, the framework measures accurate time delay of OpenFlow table updates but not the data plane performance. Their study revealed actions that can be performed faster by software switches than by hardware switches, e.g., requesting statistics. We previously presented delay measurements of VM network packet processing in selected setups on an application level [21].

Latency measurements are sparse in the literature as they are hard to perform in a precise manner [20]. Publications often rely on either special-purpose hardware, often only capable of low rates (e.g., [13, 54]), or on crude software measurement tools that are not precise enough to get insights into latency distributions on low-latency devices such as virtual switches. We use our packet generator MoonGen that supports hardware […]

¹ The "New API" network interface was introduced with this kernel version.
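The OvS flow caching behavior described above (OpenFlow rules in the slow path, derived datapath rules with inactivity timeouts in the fast path) can be observed with the standard OvS management tools. A hedged sketch, where the bridge name br0 and the flow specification are illustrative, not the paper's test configuration:

```shell
# Install an OpenFlow rule in the slow path (ovs-vswitchd).
ovs-ofctl add-flow br0 "in_port=1,actions=output:2"

# Once traffic hits the rule, the derived datapath (fast path)
# entries appear in the kernel flow table, each with a "used" age;
# idle entries expire and the next packet takes the slow path again.
ovs-dpctl dump-flows

# The full OpenFlow table stays in the slow path.
ovs-ofctl dump-flows br0
```
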
4 Test Setup

The description of our test setup reflects the specific hardware and software used for our measurements and includes the various VM setups investigated. Figure 3 shows the server setup.

[Fig. 3 Test setups, including (c) vNIC to vNIC switching and (d) pNIC forwarding]

[Figure residue: throughput plots comparing OvS versions 1.9.3, 1.10.2, and 1.11.0; packet rate in Mpps over the offered load]

[…] packet queues. Figure 9 plots the CPU load caused by context switches (kernel function switch_to) and functions related to virtual NIC queues at the tested offered loads, with a run time of five minutes per run. This indicates that congestion occurs at the vNICs and the system tries to resolve this by forcing a context switch to the network task of the virtual machine to retrieve the packets. This additional overhead leads to drops.

² Lower than the previously stated figure of 1.88 Mpps due to active profiling.

[Fig. 11 Throughput without explicitly pinning all tasks to CPU cores]

Packet sizes are also relevant in comparison to the pNIC to pNIC scenario because the packet needs to be copied to the user space to forward it to a VM. Figure 10 plots the throughput and the CPU load of the kernel function copy_user_enhanced_fast_string, which copies a packet into the user space, in the forwarding scenario shown in Figure 3a. The throughput
drops only marginally from 0.85 Mpps to 0.8 Mpps until it becomes limited by the line rate with packets larger than 656 Byte. Copying packets poses a measurable but small overhead. The reason for this is the high memory bandwidth of modern servers: our test server has a memory bandwidth of 200 Gbit per second. This means that VMs are well-suited for running network services that rely on bulk throughput with large packets, e.g., file servers. Virtualizing packet processing or forwarding systems that need to be able to process a large number of small packets per second is, however, problematic.

We derive another test case from the fact that the DuT runs multiple applications: OvS and the VM receiving the packets. This is relevant on a virtualization server where the running VMs generate substantial CPU load. The VM was pinned to a different core than the NIC interrupt for the previous test. Figure 11 shows the throughput in the same scenario under increasing offered load, but without pinning the VM to a core. This behavior can be attributed to a scheduling conflict because the Linux kernel does not measure the load caused by interrupts properly by default. Figure 12 shows the average CPU load of a core running only OvS as seen by the scheduler (read from the procfs pseudo filesystem with the mpstat utility) and compares it to the actual average load measured by reading the CPU's cycle counter with the profiling utility perf.

Guideline 2 Virtual machine cores and NIC interrupts should be pinned to disjoint sets of CPU cores. The Linux scheduler does not measure the CPU load caused by hardware interrupts properly and therefore schedules the VM on the same core, which impacts the performance. CONFIG_IRQ_TIME_ACCOUNTING is a kernel option which can be used to enable accurate reporting […]

[…] should be measured with hardware counters using perf.

5.5 Conclusion

Virtual switching is limited by the number of packets, not the overall throughput. Applications that require a large number of small packets, e.g., virtualized network functions, are thus more difficult for a virtual switch than applications relying on bulk data transfer, e.g., file servers. Overloading virtual ports on the switch can lead to packet loss before the maximum throughput is achieved.

Using the DPDK backend in OvS can improve the throughput by a factor of 7 when no VMs are involved. With VMs, an improvement of a factor of 3 to 4 can be achieved, cf. Table 1. However, DPDK requires statically assigned CPU cores that are constantly being utilized by a busy-wait polling logic, causing 100% load on these cores. Using the slower default Linux IO backend results in a linear correlation between network load and CPU load, cf. Figure 12.

6 Latency Measurements

In another set of measurements we address the packet delay introduced by software switching in OvS. We investigate two different scenarios. In the first experiment, traffic is forwarded between two physical interfaces (cf. Figure 3d). In the second scenario, the packets are not forwarded between the physical interfaces directly but through a VM, as shown in Figure 3b.

6.1 Forwarding between Physical Interfaces

Figure 13 shows the measurement for forwarding between two pNICs by Open vSwitch. This graph features four different levels of delay. The first level has an average latency of around 15 µs and a packet transfer rate of up to 179 kpps. Above that transfer rate, the second level has a delay value of around 28 µs and lasts up to 313 kpps. Beyond that rate, the third level offers a latency of around 53 µs up to a transfer rate of 1.78 Mpps. We selected three different points (P1–P3), one as a representative for each of the levels before the system becomes overloaded (cf. Figure 13). Table 2 also includes these points to give typical values for their corresponding level.

[Fig. 13 Latency of packet forwarding between pNICs: 25th, 50th, 75th, and 99th percentiles over the offered load, with measurement points P1–P3]

The reason for the shape and length of the first three levels is the architecture of the ixgbe driver as described by Beifuß et al. [11]. This driver limits the interrupt rate to 100k per second for packet rates lower than 156.2 kpps, which relates to the highest transfer rate measured for the first level in Figure 13. The same observation holds for the second and the third level: the interrupt rate is limited to 20k per second for transfer rates lower than 312.5 kpps, and to 8k per second above that. These packet rates equal the steps into the next plateaus of the graph.

At the end of the third level the latency drops again right before the switch is overloaded. Note that the latency then increases to about 1 ms as Open vSwitch can no longer cope with the load and all queues fill up completely.

We visualized the distributions of latency at three measurement points P1–P3 (cf. Figure 13 and Table 2). The distributions at these three measurements are plotted as histograms with a bin width of 0.25 µs in Figure 14. The three selected points show the typical shapes of the probability density function of their respective levels.

[Fig. 14 Latency histograms at P1 (44.6 kpps), P2, and P3]

For measurement P3 the distribution depicts a high load at which both the interrupt throttle rate and the poll mechanism of the NAPI affect the distribution. A significant number of packets accumulates on the NIC before the processing is finished. Linux then polls the NIC again after processing, without re-enabling interrupts in between, and processes a second smaller batch. This causes an overlay of the previously seen uniform distribution with additional peaks caused by the NAPI processing.

Overloading the system leads to an unrealistic, excessive latency of ≈ 1 ms, and its exact distribution is of little interest. Even the best-case 1st percentile shows a latency of about 375 µs in all measurements during overload conditions, far higher than even the worst case of the other scenarios.

Guideline 4 Avoid overloading ports handling latency-critical traffic.
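Percentile curves like the 25th to 99th percentiles plotted over the offered load can be computed from raw per-packet latencies with standard Unix tools. A minimal sketch using the nearest-rank method; the inline sample values are placeholders, not measurement data:

```shell
# One latency value (in µs) per line; sample values stand in
# for real per-packet measurements.
printf '%s\n' 12 14 15 13 16 15 14 15 90 15 > latencies.txt

percentile() {  # $1 = percentile (1-100), $2 = sorted input file
    local n rank
    n=$(wc -l < "$2")
    # nearest-rank method: rank = ceil(p/100 * n)
    rank=$(( ( $1 * n + 99 ) / 100 ))
    [ "$rank" -lt 1 ] && rank=1
    sed -n "${rank}p" "$2"
}

sort -n latencies.txt > sorted.txt
echo "50th: $(percentile 50 sorted.txt)"
echo "99th: $(percentile 99 sorted.txt)"
```

The long-tail distributions discussed below are exactly the case where such percentiles are more informative than the average: a single outlier (the 90 above) dominates the 99th percentile while leaving the median untouched.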
[…]flection points and also increase the delay.

[Fig. 15 Latency of packet forwarding through a VM: percentiles over the offered load with measurement points V1–V3, compared to the pNIC 25th and 50th percentiles]

The histograms for the latency at the lowest investigated packet rate, selected as representative (P1 in Figure 14 and V1 in Figure 16), have a similar shape. The shape of the histogram in V3 is a long-tail distribution, i.e., while the average latency is low, there is a significant number of packets with a high delay. This distribution was also observed by Whiteaker et al. [54] in virtualized environments. However, we could only observe this type of traffic under an overload scenario like V3. Note that the maximum packet rate for this scenario was previously given as 300 kpps in Table 1, so V3 is already an overload scenario. We could not observe such a distribution under normal load. The worst-case latency is also significantly higher than in the pNIC scenario due to the additional buffers in the vNICs.

[Figure residue: latency histograms at V1 (39.0 kpps) and DP1 (4.1 Mpps)]

Guideline 5 Avoid virtualizing services that are sen[…]

[…] is used. The load caused by processing packets on the hypervisor should also be considered when allocating CPU resources to VMs. Even a VM with only one virtual CPU core can load two CPU cores due to virtual switching. The total system load of Open vSwitch can […]
[…]nitude. However, moving packet processing systems or virtual switches and routers into VMs is problematic because of the high overhead per packet that needs to cross the VM/host barrier and because of their latency-sensitive nature.

The shift to user space packet-processing frameworks like DPDK promises substantial improvements for both throughput (cf. Section 5.1) and latency (cf. Section 6). DPDK is integrated, but disabled by default, in Open vSwitch. However, the current version we evaluated still had stability issues and is not yet fit for production. Further issues with the DPDK port are usability, as complex configuration is required, and the lack of debugging facilities, as standard tools like tcpdump are currently not supported. Intel is currently working on improving these points to get DPDK vSwitch into production [30].

Acknowledgments

This research has been supported by the DFG (German Research Foundation) as part of the MEMPHIS project (CA 595/5-2) and in the framework of the CELTIC EUREKA project SENDATE-PLANETS (Project ID C2015/3-1), partly funded by the German BMBF (Project ID 16KIS0460K). The authors alone are responsible for the content of the paper.

References

1. Intel DPDK: Data Plane Development Kit. http://dpdk.org. Last visited 2016-03-27
2. Intel DPDK vSwitch. https://01.org/sites/default/files/page/intel_dpdk_vswitch_performance_figures_0.10.0_0.pdf. Last visited 2016-03-27
3. Intel DPDK vSwitch. https://github.com/01org/dpdk-ovs. Last visited 2016-03-27
4. Intel I/O Acceleration Technology. http://www.intel.com/content/www/us/en/wireless-network/accel-technology.html. Last visited 2016-03-27
5. Open vSwitch. http://openvswitch.org. Last visited 2016-03-27
6. OpenNebula. https://opennebula.org. Last visited 2016-03-27
7. OpenStack. https://openstack.org. Last visited 2016-03-27
8. Virtual Machine Device Queues: Technical White Paper (2008)
9. Impressive Packet Processing Performance Enables Greater Workload Consolidation (2013)
10. Angrisani, L., Ventre, G., Peluso, L., Tedesco, A.: Measurement of Processing and Queuing Delays Introduced by an Open-Source Router in a Single-Hop Network. IEEE Transactions on Instrumentation and Measurement 55(4), 1065–1076 (2006)
11. Beifuß, A., Raumer, D., Emmerich, P., Runge, T.M., Wohlfart, F., Wolfinger, B.E., Carle, G.: A Study of Networking Software Induced Latency. In: 2nd International Conference on Networked Systems 2015. Cottbus, Germany (2015)
12. Bianco, A., Birke, R., Giraudo, L., Palacin, M.: OpenFlow Switching: Data Plane Performance. In: International Conference on Communications (ICC). IEEE (2010)
13. Bolla, R., Bruschi, R.: Linux Software Router: Data Plane Optimization and Performance Evaluation. Journal of Networks 2(3), 6–17 (2007)
14. Bradner, S., McQuaid, J.: Benchmarking Methodology for Network Interconnect Devices. RFC 2544 (Informational) (1999)
15. Cardigliano, A., Deri, L., Gasparakis, J., Fusco, F.: vPF_RING: Towards Wire-Speed Network Monitoring Using Virtual Machines. In: ACM Internet Measurement Conference (2011)
16. Deri, L.: nCap: Wire-speed Packet Capture and Transmission. In: IEEE Workshop on End-to-End Monitoring Techniques and Services, pp. 47–55 (2005)
17. Dobrescu, M., Argyraki, K., Ratnasamy, S.: Toward Predictable Performance in Software Packet-Processing Platforms. In: USENIX Conference on Networked Systems Design and Implementation (NSDI) (2012)
18. Dobrescu, M., Egi, N., Argyraki, K., Chun, B., Fall, K., Iannaccone, G., Knies, A., Manesh, M., Ratnasamy, S.: RouteBricks: Exploiting Parallelism To Scale Software Routers. In: 22nd ACM Symposium on Operating Systems Principles (SOSP) (2009)
19. DPDK Project: DPDK 16.11 Release Notes. http://dpdk.org/doc/guides/rel_notes/release_16_11.html (2016). Last visited 2016-03-27
20. Emmerich, P., Gallenmüller, S., Raumer, D., Wohlfart, F., Carle, G.: MoonGen: A Scriptable High-Speed Packet Generator. In: 15th ACM SIGCOMM Conference on Internet Measurement (IMC'15) (2015)
21. Emmerich, P., Raumer, D., Wohlfart, F., Carle, G.: A Study of Network Stack Latency for Game Servers. In: 13th Annual Workshop on Network and Systems Support for Games (NetGames'14). Nagoya, Japan (2014)
22. Emmerich, P., Raumer, D., Wohlfart, F., Carle, G.: Performance Characteristics of Virtual Switching. In: 2014 IEEE 3rd International Conference on Cloud Networking (CloudNet'14). Luxembourg (2014)
23. ETSI: Network Functions Virtualisation (NFV); Architectural Framework, V1.1.1 (2013)
24. Han, S., Jang, K., Panda, A., Palkar, S., Han, D., Ratnasamy, S.: SoftNIC: A Software NIC to Augment Hardware. Tech. Rep. UCB/EECS-2015-155, EECS Department, University of California, Berkeley (2015)
25. He, Z., Liang, G.: Research and Evaluation of Network Virtualization in Cloud Computing Environment. In: Networking and Distributed Computing (ICNDC), pp. 40–44. IEEE (2012)
26. Huggahalli, R., Iyer, R., Tetrick, S.: Direct Cache Access for High Bandwidth Network I/O. In: Proceedings of the 32nd Annual International Symposium on Computer Architecture, pp. 50–59 (2005)
27. Hwang, J., Ramakrishnan, K.K., Wood, T.: NetVM: High Performance and Flexible Networking Using Virtualization on Commodity Platforms. In: 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI 14), pp. 445–458. USENIX Association, Seattle, WA (2014)
28. Jarschel, M., Oechsner, S., Schlosser, D., Pries, R., Goll, S., Tran-Gia, P.: Modeling and Performance Evaluation of an OpenFlow Architecture. In: Proceedings of the 23rd International Teletraffic Congress. ITCP (2011)
29. Kang, N., Liu, Z., Rexford, J., Walker, D.: Optimizing the "One Big Switch" Abstraction in Software-Defined Networks. In: Proceedings of the Ninth ACM Conference on Emerging Networking Experiments and Technologies, CoNEXT '13, pp. 13–24. ACM, New York, NY, USA (2013). DOI 10.1145/2535372.2535373
30. Traynor, K.: OVS, DPDK and Software Dataplane Acceleration. https://fosdem.org/2016/schedule/event/ovs_dpdk/attachments/slides/1104/export/events/attachments/ovs_dpdk/slides/1104/ovs_dpdk_fosdem_16.pdf (2016). Last visited 2016-03-27
31. Kohler, E., Morris, R., Chen, B., Jannotti, J., Kaashoek, M.F.: The Click Modular Router. ACM Transactions on Computer Systems (TOCS) 18(3), 263–297 (2000). DOI 10.1145/354871.354874
32. Larsen, S., Sarangam, P., Huggahalli, R., Kulkarni, S.: Architectural Breakdown of End-to-End Latency in a TCP/IP Network. International Journal of Parallel Programming 37(6), 556–571 (2009)
33. Martins, J., Ahmed, M., Raiciu, C., Olteanu, V., Honda, M., Bifulco, R., Huici, F.: ClickOS and the Art of Network Function Virtualization. In: 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI 14), pp. 459–473. USENIX Association, Seattle, WA (2014)
34. Martins, J., Ahmed, M., Raiciu, C., Olteanu, V., Honda, M., Bifulco, R., Huici, F.: ClickOS and the Art of Network Function Virtualization. In: 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI 14), pp. 459–473. USENIX Association, Seattle, WA (2014)
35. Meyer, T., Wohlfart, F., Raumer, D., Wolfinger, B., Carle, G.: Validated Model-Based Prediction of Multi-Core Software Router Performance. Praxis der Informationsverarbeitung und Kommunikation (PIK) (2014)
36. Tsirkin, M., Huck, C.: Virtual I/O Device (VIRTIO) Version 1.0, Committee Specification 04. OASIS (2016)
37. Munch, B.: Hype Cycle for Networking and Communications. Report, Gartner (2013)
38. Niu, Z., Xu, H., Tian, Y., Liu, L., Wang, P., Li, Z.: Benchmarking NFV Software Dataplanes. arXiv:1605.05843 (2016)
39. OpenStack: Networking Guide: Deployment Scenarios. http://docs.openstack.org/liberty/networking-guide/deploy.html (2015). Last visited 2016-03-27
40. Panda, A., Han, S., Jang, K., Walls, M., Ratnasamy, S., Shenker, S.: NetBricks: Taking the V out of NFV. In: 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pp. 203–216. USENIX Association, GA (2016)
41. Pettit, J., Gross, J., Pfaff, B., Casado, M., Crosby, S.: Virtual Switching in an Era of Advanced Edges. In: 2nd Workshop on Data Center Converged and Virtual Ethernet Switching (DC-CAVES) (2011)
42. Pfaff, B., Pettit, J., Koponen, T., Amidon, K., Casado, M., Shenker, S.: Extending Networking into the Virtualization Layer. In: Proc. of Workshop on Hot Topics in Networks (HotNets-VIII) (2009)
43. Pfaff, B., Pettit, J., Koponen, T., Jackson, E., Zhou, A., Rajahalme, J., Gross, J., Wang, A., Stringer, J., Shelar, P., Amidon, K., Casado, M.: The Design and Implementation of Open vSwitch. In: 12th USENIX Symposium on Networked Systems Design and Implementation (NSDI 15). USENIX Association (2015)
44. Pongracz, G., Molnar, L., Kis, Z.L.: Removing Roadblocks from SDN: OpenFlow Software Switch Performance on Intel DPDK. In: Second European Workshop on Software Defined Networks (EWSDN'13), pp. 62–67 (2013)
45. Ram, K.K., Cox, A.L., Chadha, M., Rixner, S.: Hyper-Switch: A Scalable Software Virtual Switch […]
ing Architecture. In: Presented as part of the 2013
USENIX Annual Technical Conference (USENIX
ATC 13), pp. 13–24. USENIX, San Jose, CA (2013) Sebastian Gallenmüller is a Ph.D. student at the Chair
of Network Architectures and Services at Technical Univer-
46. Rizzo, L.: netmap: a novel framework for fast sity of Munich. There he received his M.Sc. in Informatics in
packet I/O. In: USENIX Annual Technical Con- 2014. He focuses on the topic of assessment and evaluation of
ference (2012) software systems for packet processing.
47. Rizzo, L., Carbone, M., Catalli, G.: Transparent
Acceleration of Software Packet Forwarding using
Netmap. In: INFOCOM, pp. 2471–2479. IEEE Florian Wohlfart is a Ph.D. student working at the Chair of
(2012) Network Architectures and Services at Technical University
of Munich. He received his M.Sc. in Informatics at Technical
48. Rizzo, L., Lettieri, G.: VALE, a switched ethernet University of Munich in 2012. His research interests include
for virtual machines. In: C. Barakat, R. Teixeira, software packet processing, middlebox analysis, and network
K.K. Ramakrishnan, P. Thiran (eds.) CoNEXT, performance measurements.
pp. 61–72. ACM (2012)
49. Rotsos, C., Sarrar, N., Uhlig, S., Sherwood, R.,
Moore, A.W.: Oflops: An Open Framework for Georg Carle is professor at the Department of Informat-
OpenFlow Switch Evaluation. In: Passive and Ac- ics at Technical University of Munich, holding the Chair of
Network Architectures and Services. He studied at University
tive Measurement, pp. 85–95. Springer (2012)
of Stuttgart, Brunel University, London, and Ecole Nationale
50. Salim, J.H., Olsson, R., Kuznetsov, A.: Beyond Superieure des Telecommunications, Paris. He did his Ph.D.
Softnet. In: Proceedings of the 5th annual Linux in Computer Science at University of Karlsruhe, and worked
Showcase & Conference, vol. 5, pp. 18–18 (2001) as postdoctoral scientist at Institut Eurecom, Sophia Antipo-
lis, France, at the Fraunhofer Institute for Open Communi-
51. Tedesco, A., Ventre, G., Angrisani, L., Peluso, L.:
cation Systems, Berlin, and as professor at the University of
Measurement of Processing and Queuing Delays In- Tübingen.
troduced by a Software Router in a Single-Hop Net-
work. In: IEEE Instrumentation and Measurement
Technology Conference, pp. 1797–1802 (2005)
52. Thomas Monjalon: dropping librte ivshmem.
http://dpdk.org/ml/archives/dev/2016-June/
040844.html (2016). Mailing list discussion
53. Wang, G., Ng, T.E.: The impact of virtualization
on network performance of amazon ec2 data center.
In: INFOCOM, pp. 1–9. IEEE (2010)
54. Whiteaker, J., Schneider, F., Teixeira, R.: Explain-
ing Packet Delays under Virtualization. ACM SIG-
COMM Computer Communication Review 41(1),
38–44 (2011)
Author Biographies