VxLAN rfc7348
M. Mahalingam
Storvisor
D. Dutt
Cumulus Networks
K. Duda
Arista
P. Agarwal
Broadcom
L. Kreeger
Cisco
T. Sridhar
VMware
M. Bursell
Intel
C. Wright
Red Hat
August 2014
Mahalingam, et al.
Informational
[Page 1]
RFC 7348
VXLAN
August 2014
Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document.
Table of Contents
1. Introduction
   1.1. Acronyms and Definitions
2. Conventions Used in This Document
3. VXLAN Problem Statement
   3.1. Limitations Imposed by Spanning Tree and VLAN Ranges
   3.2. Multi-tenant Environments
   3.3. Inadequate Table Sizes at ToR Switch
4. VXLAN
   4.1. Unicast VM-to-VM Communication
   4.2. Broadcast Communication and Mapping to Multicast
   4.3. Physical Infrastructure Requirements
5. VXLAN Frame Format
6. VXLAN Deployment Scenarios
   6.1. Inner VLAN Tag Handling
7. Security Considerations
8. IANA Considerations
9. References
   9.1. Normative References
   9.2. Informative References
10. Acknowledgments
1. Introduction
Server virtualization has placed increased demands on the physical
network infrastructure. A physical server now has multiple Virtual
Machines (VMs) each with its own Media Access Control (MAC) address.
This requires larger MAC address tables in the switched Ethernet
network due to potential attachment of and communication among
hundreds of thousands of VMs.
In the case when the VMs in a data center are grouped according to
their Virtual LAN (VLAN), one might need thousands of VLANs to
partition the traffic according to the specific group to which the VM
may belong. The current VLAN limit of 4094 is inadequate in such
situations.
Data centers are often required to host multiple tenants, each with
their own isolated network domain. Since it is not economical to
realize this with dedicated infrastructure, network administrators
opt to implement isolation over a shared network. In such scenarios,
a common problem is that each tenant may independently assign MAC
addresses and VLAN IDs leading to potential duplication of these on
the physical network.
An important requirement for virtualized environments using a Layer 2
physical infrastructure is having the Layer 2 network scale across
the entire data center or even between data centers for efficient
allocation of compute, network, and storage resources. In such
networks, using traditional approaches like the Spanning Tree
Protocol (STP) for a loop-free topology can result in a large number
of disabled links.
The last scenario is the case where the network operator prefers to
use IP for interconnection of the physical infrastructure (e.g., to
achieve multipath scalability through Equal-Cost Multipath (ECMP),
thus avoiding disabled links). Even in such environments, there is a
need to preserve the Layer 2 model for inter-VM communication.
The scenarios described above lead to a requirement for an overlay
network. This overlay is used to carry the MAC traffic from the
individual VMs in an encapsulated format over a logical "tunnel".
This document details a framework termed "Virtual eXtensible Local
Area Network (VXLAN)" that provides such an encapsulation scheme to
address the various requirements specified above. This memo
documents the deployed VXLAN protocol for the benefit of the Internet
community.
1.1. Acronyms and Definitions
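ACL            Access Control List
ECMP           Equal-Cost Multipath
IGMP           Internet Group Management Protocol
IHL            Internet Header Length
MTU            Maximum Transmission Unit
PIM            Protocol Independent Multicast
SPB            Shortest Path Bridging
STP            Spanning Tree Protocol
ToR            Top of Rack
TRILL          Transparent Interconnection of Lots of Links
VLAN           Virtual Local Area Network
VM             Virtual Machine
VNI            VXLAN Network Identifier (or VXLAN Segment ID)
VTEP           VXLAN Tunnel End Point
VXLAN          Virtual eXtensible Local Area Network
VXLAN Segment  VXLAN Layer 2 overlay network over which VMs communicate
VXLAN Gateway  an entity that forwards traffic between VXLANs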
2. Conventions Used in This Document
3. VXLAN Problem Statement
3.1. Limitations Imposed by Spanning Tree and VLAN Ranges
Current Layer 2 networks use the IEEE 802.1D Spanning Tree Protocol
(STP) [802.1D] to avoid loops in the network due to duplicate paths.
STP blocks the use of links to avoid the replication and looping of
frames. Some data center operators see this as a problem with Layer
2 networks in general, since with STP they are effectively paying for
more ports and links than they can really use. In addition,
resiliency due to multipathing is not available with the STP model.
Newer initiatives, such as TRILL [RFC6325] and SPB [802.1aq], have
been proposed to help with multipathing and surmount some of the
problems with STP. STP limitations may also be avoided by
configuring servers within a rack to be on the same Layer 3 network,
with switching happening at Layer 3 both within the rack and between
racks. However, this is incompatible with a Layer 2 model for
inter-VM communication.
A key characteristic of Layer 2 data center networks is their use of
Virtual LANs (VLANs) to provide broadcast isolation. A 12-bit VLAN
ID is used in the Ethernet data frames to divide the larger Layer 2
network into multiple broadcast domains. This has served well for
many data centers that require fewer than 4094 VLANs. With the
growing adoption of virtualization, this upper limit is seeing
pressure. Moreover, due to STP, several data centers limit the
number of VLANs that could be used. In addition, requirements for
multi-tenant environments accelerate the need for larger VLAN limits,
as discussed in Section 3.2.
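
The arithmetic behind the 4094 figure can be sketched as follows. This is only an illustration: the two reserved values (0x000 and 0xFFF) come from IEEE 802.1Q rather than from this document, and the names used are hypothetical.

```python
# A 12-bit VLAN ID field yields 2**12 = 4096 code points.
VLAN_ID_BITS = 12

# Assumption (from IEEE 802.1Q, not stated in this document):
# 0x000 and 0xFFF are reserved and cannot identify a VLAN.
RESERVED_VLAN_IDS = {0x000, 0xFFF}


def usable_vlan_ids(bits: int = VLAN_ID_BITS) -> int:
    """Return the number of VLAN IDs available as broadcast domains."""
    return 2 ** bits - len(RESERVED_VLAN_IDS)


print(usable_vlan_ids())  # 4094
```
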
3.2. Multi-tenant Environments
4. VXLAN
VXLAN (Virtual eXtensible Local Area Network) addresses the above
requirements of the Layer 2 and Layer 3 data center network
infrastructure in the presence of VMs in a multi-tenant environment.
It runs over the existing networking infrastructure and provides a
means to "stretch" a Layer 2 network. In short, VXLAN is a Layer 2
overlay scheme on a Layer 3 network. Each overlay is termed a VXLAN
segment. Only VMs within the same VXLAN segment can communicate with
each other.
multicast routing protocols like Protocol Independent Multicast -
Sparse Mode (PIM-SM -- see [RFC4601]) will provide efficient multicast
trees within the Layer 3 network.
The VTEP will use (*,G) joins. This is needed as the set of VXLAN
tunnel sources is unknown and may change often, as the VMs come up /
go down across different hosts. A side note here is that since each
VTEP can act as both the source and destination for multicast
packets, a protocol like bidirectional PIM (BIDIR-PIM -- see
[RFC5015]) would be more efficient.
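
On a host-based VTEP, a join that names only the group and no source might be requested as in the Python sketch below. It assumes an IPv4 underlay and a standard sockets API; the function names are illustrative only. The host's membership report is what the first-hop multicast router would typically turn into a (*,G) join.

```python
import socket
import struct

VXLAN_UDP_PORT = 4789  # IANA-assigned VXLAN port


def build_membership_request(group_ip: str, local_ip: str = "0.0.0.0") -> bytes:
    """Pack an ip_mreq structure: group address, then local interface."""
    return struct.pack("4s4s",
                       socket.inet_aton(group_ip),
                       socket.inet_aton(local_ip))


def join_vxlan_group(group_ip: str, local_ip: str = "0.0.0.0") -> socket.socket:
    """Open a UDP socket and join group_ip without naming any source.

    Because no source is specified, traffic from any VTEP sending to
    the group is accepted, which matches the any-source (*,G) model.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", VXLAN_UDP_PORT))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_membership_request(group_ip, local_ip))
    return sock
```

A caller would invoke, for example, join_vxlan_group("239.1.1.22"), where the group address is a hypothetical value chosen by the operator for one VXLAN segment.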
The destination VM sends a standard ARP response using IP unicast.
This frame will be encapsulated back to the VTEP connecting the
originating VM using IP unicast VXLAN encapsulation. This is
possible since the mapping of the ARP response's destination MAC to
the VXLAN tunnel end point IP was learned earlier through the ARP
request.
Note that multicast frames and "unknown MAC destination" frames are
also sent using the multicast tree, similar to the broadcast frames.
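
The forwarding choice described above can be sketched in Python as follows. This is a minimal model, not a definitive implementation: the class and method names are hypothetical, the per-VNI multicast group mapping is assumed to be provisioned by the operator, and the addresses are documentation values.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


def is_group_mac(mac: str) -> bool:
    """True for broadcast and multicast MACs (group bit of first octet set)."""
    return int(mac.split(":")[0], 16) & 0x01 == 1


@dataclass
class Vtep:
    # Assumption: one IP multicast group is configured per VNI.
    vni_to_group: Dict[int, str]
    # Learned mapping: (VNI, inner MAC) -> IP address of the remote VTEP.
    mac_table: Dict[Tuple[int, str], str] = field(default_factory=dict)

    def learn(self, vni: int, inner_src_mac: str, outer_src_ip: str) -> None:
        """Record which remote VTEP a MAC was seen behind (e.g., from an ARP request)."""
        self.mac_table[(vni, inner_src_mac.lower())] = outer_src_ip

    def outer_destination(self, vni: int, inner_dst_mac: str) -> str:
        """Pick the outer IP destination for a frame in the given segment."""
        mac = inner_dst_mac.lower()
        if is_group_mac(mac):
            # Broadcast and multicast frames use the multicast tree.
            return self.vni_to_group[vni]
        # Known unicast goes to the learned VTEP; unknown unicast is flooded.
        return self.mac_table.get((vni, mac), self.vni_to_group[vni])


vtep = Vtep(vni_to_group={22: "239.1.1.22"})
vtep.learn(22, "00:00:5e:00:53:01", "192.0.2.10")
print(vtep.outer_destination(22, "00:00:5e:00:53:01"))  # 192.0.2.10
print(vtep.outer_destination(22, BROADCAST_MAC))        # 239.1.1.22
print(vtep.outer_destination(22, "00:00:5e:00:53:99"))  # 239.1.1.22
```
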
4.3. Physical Infrastructure Requirements
5. VXLAN Frame Format
Destination Port: IANA has assigned the value 4789 for the
VXLAN UDP port, and this value SHOULD be used by default as the
destination UDP port. Some early implementations of VXLAN have
used other values for the destination port. To enable
interoperability with these implementations, the destination
port SHOULD be configurable.
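
One way an implementation might honor this is sketched below in Python: the destination port defaults to 4789 but remains a configuration field. The sketch also packs the 8-byte VXLAN header defined by RFC 7348 (an 8-bit flags field carrying the I flag, 24 reserved bits, a 24-bit VNI, and 8 reserved bits). The names VtepConfig, pack_vxlan_header, and unpack_vni are illustrative and not part of the specification.

```python
import struct
from dataclasses import dataclass

VXLAN_DEFAULT_PORT = 4789      # IANA-assigned destination UDP port
VXLAN_FLAG_VNI_VALID = 0x08    # the I flag in the 8-bit flags field


@dataclass
class VtepConfig:
    # Configurable so that early implementations using other ports
    # can still be reached; 4789 is the default.
    dst_port: int = VXLAN_DEFAULT_PORT


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 0: flags in the top byte, 24 reserved bits set to zero.
    # Word 1: VNI in the top 24 bits, 8 reserved bits set to zero.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Return the VNI from a VXLAN header, requiring the I flag to be set."""
    word0, word1 = struct.unpack("!II", header[:8])
    if not (word0 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("I flag not set; VNI is not valid")
    return word1 >> 8


default_cfg = VtepConfig()
# 8472 is used here purely as an example of a non-default value.
legacy_cfg = VtepConfig(dst_port=8472)
print(default_cfg.dst_port, pack_vxlan_header(22).hex())
```
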
+------------+-------------+
|        Server 1          |
| +----+----+  +----+----+ |
| |VM1-1    |  |VM1-2    | |
| |VNI 22   |  |VNI 34   | |
| |         |  |         | |
| +---------+  +---------+ |
|                          |
| +----+----+  +----+----+ |
| |VM1-3    |  |VM1-4    | |
| |VNI 74   |  |VNI 98   | |
| |         |  |         | |
| +---------+  +---------+ |
| Hypervisor VTEP (IP1)    |
+--------------------------+
                      |
                      |
                      |
                      |   +-------------+
                      |   |   Layer 3   |
                      |---|   Network   |
                          |             |
                          +-------------+
                              |
                              |
                              +-----------+
                                          |
                                          |
                                   +------------+-------------+
                                   |        Server 2          |
                                   | +----+----+  +----+----+ |
                                   | |VM2-1    |  |VM2-2    | |
                                   | |VNI 34   |  |VNI 74   | |
                                   | |         |  |         | |
                                   | +---------+  +---------+ |
                                   |                          |
                                   | +----+----+  +----+----+ |
                                   | |VM2-3    |  |VM2-4    | |
                                   | |VNI 98   |  |VNI 22   | |
                                   | |         |  |         | |
                                   | +---------+  +---------+ |
                                   | Hypervisor VTEP (IP2)    |
                                   +--------------------------+
Figure 3: VXLAN Deployment - VTEPs across a Layer 3 Network
+---+-----+---+                                    +---+-----+---+
|    Server 1 |                                    |  Non-VXLAN  |
(VXLAN enabled)<-----+                       +---->|  server     |
+-------------+      |                       |     +-------------+
                     |                       |
+---+-----+---+      |                       |     +---+-----+---+
|Server 2     |      |                       |     |  Non-VXLAN  |
(VXLAN enabled)<-----+   +---+-----+---+     +---->|    server   |
+-------------+      |   |Switch acting|     |     +-------------+
                     |---|  as VXLAN   |-----|
+---+-----+---+      |   |   Gateway   |
| Server 3    |      |   +-------------+
(VXLAN enabled)<-----+
+-------------+      |
                     |
+---+-----+---+      |
| Server 4    |      |
(VXLAN enabled)<-----+
+-------------+
Figure 4: VXLAN Deployment - VXLAN Gateway
6.1. Inner VLAN Tag Handling
Inner VLAN Tag Handling in VTEP and VXLAN gateway should conform to
the following:
Decapsulated VXLAN frames with the inner VLAN tag SHOULD be discarded
unless configured otherwise. On the encapsulation side, a VTEP
SHOULD NOT include an inner VLAN tag on tunnel packets unless
configured otherwise. When a VLAN-tagged packet is a candidate for
VXLAN tunneling, the encapsulating VTEP SHOULD strip the VLAN tag
unless configured otherwise.
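
The default behavior above, together with its configuration overrides, might be sketched in Python as follows. The function and parameter names are hypothetical, and the sketch assumes a single IEEE 802.1Q tag identified by TPID 0x8100 immediately after the source MAC address.

```python
import struct
from typing import Optional

# Assumption: only a single 802.1Q C-tag (TPID 0x8100) is considered.
TPID_8021Q = 0x8100


def has_vlan_tag(frame: bytes) -> bool:
    """True if the Ethernet frame carries an 802.1Q tag after the MACs."""
    return len(frame) >= 18 and struct.unpack("!H", frame[12:14])[0] == TPID_8021Q


def strip_vlan_tag(frame: bytes) -> bytes:
    """Remove the 4-byte tag (TPID + TCI) following the two MAC addresses."""
    return frame[:12] + frame[16:]


def on_encapsulate(frame: bytes, keep_inner_vlan: bool = False) -> bytes:
    """Inner frame to tunnel: the tag is stripped unless configured otherwise."""
    if has_vlan_tag(frame) and not keep_inner_vlan:
        return strip_vlan_tag(frame)
    return frame


def on_decapsulate(frame: bytes, accept_inner_vlan: bool = False) -> Optional[bytes]:
    """Inner frame to deliver, or None if it is discarded.

    Tagged inner frames are discarded unless configured otherwise.
    """
    if has_vlan_tag(frame) and not accept_inner_vlan:
        return None
    return frame
```

Both flags default to the behavior recommended above, so an operator would have to opt in explicitly to carry or accept inner VLAN tags.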
7. Security Considerations
Traditionally, Layer 2 networks can only be attacked from within by
rogue end points -- either by having inappropriate access to a LAN
and snooping on traffic, by injecting spoofed packets to take over
another MAC address, or by flooding and causing denial of service. A
MAC-over-IP mechanism for delivering Layer 2 traffic significantly
extends this attack surface. This can happen by rogues injecting
themselves into the network by subscribing to one or more multicast
groups that carry broadcast traffic for VXLAN segments and also by
sourcing MAC-over-UDP frames into the transport network to inject
spurious traffic, possibly to hijack MAC addresses.
8. IANA Considerations
A well-known UDP port (4789) has been assigned by the IANA in the
Service Name and Transport Protocol Port Number Registry for VXLAN.
See Section 5 for discussion of the port number.
9. References
9.1. Normative References
9.2. Informative References
[802.1aq] IEEE, "Standard for Local and metropolitan area networks --
Media Access Control (MAC) Bridges and Virtual Bridged
Local Area Networks -- Amendment 20: Shortest Path
Bridging", IEEE P802.1aq-2012, 2012.
[802.1D]
[802.1X]
10. Acknowledgments
Authors' Addresses
Dinesh G. Dutt
Cumulus Networks
140C S. Whisman Road
Mountain View, CA 94041
USA
EMail: [email protected]
Kenneth Duda
Arista Networks
5453 Great America Parkway
Santa Clara, CA 95054
USA
EMail: [email protected]
Puneet Agarwal
Broadcom Corporation
3151 Zanker Road
San Jose, CA 95134
USA
EMail: [email protected]
Lawrence Kreeger
Cisco Systems, Inc.
170 W. Tasman Avenue
San Jose, CA 95134
USA
EMail: [email protected]
T. Sridhar
VMware, Inc.
3401 Hillview
Palo Alto, CA 94304
USA
EMail: [email protected]
Mike Bursell
Intel
Bowyers, North Road
Great Yeldham
Halstead
Essex. C09 4QD
UK
EMail: [email protected]
Chris Wright
Red Hat, Inc.
100 East Davie Street
Raleigh, NC 27601
USA
EMail: [email protected]