Network Working Group L. Kreeger, Ed.
Internet-Draft Cisco Systems, Inc.
Intended status: Standards Track U. Elzur, Ed.
Expires: October 23, 2016 Intel
April 21, 2016
Generic Protocol Extension for VXLAN
draft-ietf-nvo3-vxlan-gpe-02.txt
Abstract
This draft describes extending Virtual eXtensible Local Area Network
(VXLAN), via changes to the VXLAN header, with three new
capabilities: support for multi-protocol encapsulation; operations,
administration, and management (OAM) signaling; and explicit
versioning.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on October 23, 2016.
Copyright Notice
Copyright (c) 2016 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Kreeger & Elzur Expires October 23, 2016 [Page 1]

Internet-Draft Generic Protocol Extension for VXLAN April 2016
1. Introduction
Virtual eXtensible Local Area Network (VXLAN) [RFC7348] defines an
encapsulation format that encapsulates Ethernet frames in an outer
UDP/IP transport. As data centers evolve, they need to carry other
protocols encapsulated in an IP packet and to provide increased
visibility and diagnostic capabilities within the overlay. The VXLAN
header does not specify the protocol being encapsulated and is
therefore currently limited to encapsulating only Ethernet frames;
nor does it provide the ability to define OAM protocols. In
addition, [RFC6335] requires that new transports not use transport
layer port numbers to identify tunnel payload; rather, it encourages
encapsulations to use their own identifiers for this purpose. VXLAN
GPE is intended to extend the existing VXLAN protocol to provide
protocol typing, OAM, and versioning capabilities.
The Version and OAM bits are introduced in Section 3, and the choice
of location for these fields is driven by minimizing the impact on
existing deployed hardware.
In order to facilitate deployments of VXLAN GPE with hardware
currently deployed to support VXLAN, changes from legacy VXLAN have
been kept to a minimum. Section 5 provides a detailed discussion
about how VXLAN GPE addresses the requirement for backward
compatibility with VXLAN.

2. VXLAN Without Protocol Extension
VXLAN provides a method of creating multi-tenant overlay networks by
encapsulating packets in IP/UDP along with a header containing a
network identifier, which is used to isolate the tenant traffic in
each overlay network from that of the others. This allows the
overlay networks to run over an existing IP network.
Through this encapsulation, VXLAN creates stateless tunnels between
VXLAN Tunnel End Points (VTEPs) which are responsible for adding/
removing the IP/UDP/VXLAN headers and providing tenant traffic
isolation based on the VXLAN Network Identifier (VNI). Tenant
systems are unaware that their networking service is being provided
by an overlay.
When encapsulating packets, a VTEP must know the IP address of the
proper remote VTEP at the far end of the tunnel that can deliver the
inner packet to the Tenant System corresponding to the inner
destination address. In the case of tenant multicast or broadcast,
the outer IP address may be an IP multicast group address, or the
VTEP may replicate the packet and send it to all known VTEPs. If
multicast is used in the underlay network to send encapsulated
packets to remote VTEPs, Any Source Multicast is used and each VTEP
serving a particular VNI must perform a (*, G) join to the same group
IP address.
Inner to outer address mapping can be determined in two ways. One is
source based learning in the data plane, and the other is
distribution via a control plane.
Source based learning requires a receiving VTEP to create an inner to
outer address mapping by gleaning the information from the received
packets by correlating the inner source address to the outer source
IP address. When a mapping does not exist, a VTEP forwards the
packets to all remote VTEPs participating in the VNI by using IP
multicast in the IP underlay network. Each VTEP must be configured
with the IP multicast address to use for each VNI. How this occurs
is out of scope.
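The source based learning described above can be illustrated with a
small sketch. The class name LearningTable, its fields, and the use
of string MAC and IP addresses are hypothetical choices made for this
illustration; they are not defined by this draft or by [RFC7348].

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class LearningTable:
    """Hypothetical per-VTEP table: (VNI, inner MAC) -> outer VTEP IP."""

    # VNI -> IP multicast group configured for that VNI (assumed input).
    flood_group: Dict[int, str]
    # Learned inner-to-outer mappings, scoped by VNI.
    entries: Dict[Tuple[int, str], str] = field(default_factory=dict)

    def learn(self, vni: int, inner_src_mac: str, outer_src_ip: str) -> None:
        """Correlate the inner source address with the outer source IP."""
        self.entries[(vni, inner_src_mac)] = outer_src_ip

    def lookup(self, vni: int, inner_dst_mac: str) -> str:
        """Return the remote VTEP IP, or the VNI's multicast group on a miss."""
        return self.entries.get((vni, inner_dst_mac), self.flood_group[vni])
```

Because the key includes the VNI, the same MAC address may appear in
two different VNIs without conflict.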
The control plane used to distribute inner to outer mappings is also
out of scope. It could use a centralized authority or be
distributed, or use a hybrid.
The VXLAN Network Identifier (VNI) provides scoping for the addresses
in the header of the encapsulated PDU. If the encapsulated packet is
an Ethernet frame, this means the Ethernet MAC addresses are only
unique within a given VNI and may overlap with MAC addresses within a
different VNI. If the encapsulated packet is an IP packet, this

3. Generic Protocol Extension for VXLAN (VXLAN GPE)
3.1. VXLAN GPE Header
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|R|R|Ver|I|P|R|O|       Reserved                |Next Protocol  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                VXLAN Network Identifier (VNI) |   Reserved    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 2: VXLAN GPE Header
Flags (8 bits): The first 8 bits of the header are the flag field.
The bits designated "R" above are reserved flags. These MUST be
set to zero on transmission and ignored on receipt.
Version (Ver): Indicates VXLAN GPE protocol version. The initial
version is 0. If a receiver does not support the version
indicated it MUST drop the packet.
Instance Bit (I bit): The I bit MUST be set to indicate a valid VNI.
Next Protocol Bit (P bit): The P bit is set to indicate that the
Next Protocol field is present.
OAM Flag Bit (O bit): The O bit is set to indicate that the packet
is an OAM packet.
Next Protocol: This 8 bit field indicates the protocol header
immediately following the VXLAN GPE header.
VNI: This 24 bit field identifies the VXLAN overlay network the
inner packet belongs to. Inner packets belonging to different
VNIs cannot communicate with each other (unless explicitly allowed
by policy).
Reserved: Reserved fields MUST be set to zero on transmission and
ignored on receipt.
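The header layout above can be made concrete with a minimal packing
and parsing sketch. The names GpeHeader, pack_gpe, and unpack_gpe are
hypothetical and exist only for this illustration. The flag masks
follow the figure, where bit 0 is the most significant bit of the
first octet.

```python
import struct
from typing import NamedTuple

# Masks within the 8-bit flags octet (bit 0 is the most significant bit).
FLAG_I = 0x08  # bit 4: Instance bit
FLAG_P = 0x04  # bit 5: Next Protocol bit
FLAG_O = 0x01  # bit 7: OAM bit


class GpeHeader(NamedTuple):
    vni: int
    next_protocol: int = 0
    version: int = 0
    i_bit: bool = True
    p_bit: bool = False
    o_bit: bool = False


def pack_gpe(h: GpeHeader) -> bytes:
    """Build the 8-octet header; all reserved bits are sent as zero."""
    if not 0 <= h.vni < (1 << 24):
        raise ValueError("VNI must fit in 24 bits")
    if not 0 <= h.version <= 3:
        raise ValueError("version must fit in 2 bits")
    if not 0 <= h.next_protocol <= 0xFF:
        raise ValueError("Next Protocol must fit in 8 bits")
    flags = h.version << 4
    flags |= FLAG_I if h.i_bit else 0
    flags |= FLAG_P if h.p_bit else 0
    flags |= FLAG_O if h.o_bit else 0
    word0 = (flags << 24) | h.next_protocol
    word1 = h.vni << 8
    return struct.pack("!II", word0, word1)


def unpack_gpe(data: bytes) -> GpeHeader:
    """Parse the first 8 octets; reserved bits are ignored on receipt."""
    if len(data) < 8:
        raise ValueError("VXLAN GPE header is 8 octets")
    word0, word1 = struct.unpack("!II", data[:8])
    flags = word0 >> 24
    return GpeHeader(
        vni=word1 >> 8,
        next_protocol=word0 & 0xFF,
        version=(flags >> 4) & 0x3,
        i_bit=bool(flags & FLAG_I),
        p_bit=bool(flags & FLAG_P),
        o_bit=bool(flags & FLAG_O),
    )
```

For example, a header with the I and P bits set, Next Protocol 0x1,
and VNI 0x123456 serializes to the octets 0c 00 00 01 12 34 56 00.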

3.2. Multi Protocol Support
This draft defines the following two changes to the VXLAN header in
order to support multi-protocol encapsulation:
P Bit: Flag bit 5 is defined as the Next Protocol bit. The P bit
MUST be set to 1 to indicate the presence of the 8 bit next
protocol field.
When the UDP destination port is 4790 and P = 0, the Next Protocol
field must be set to zero and the payload MUST be Ethernet (L2) as
defined by [RFC7348].
Flag bit 5 was chosen as the P bit because this flag bit is
currently reserved in VXLAN.
Next Protocol Field: The lower 8 bits of the first word are used to
carry a next protocol. This next protocol field contains the
protocol of the encapsulated payload packet. A new protocol
registry will be requested from IANA, see section 9.2.
This draft defines the following Next Protocol values:
0x1 : IPv4
0x2 : IPv6
0x3 : Ethernet
0x4 : Network Service Header [NSH]
0x5 : Multiprotocol Label Switching [RFC3031]. Please see
[idrtun] for more details.
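The Next Protocol values listed above, together with the P bit rule,
can be sketched as a small lookup. The names NextProtocol and
payload_type are hypothetical. Two behaviors in this sketch are
assumptions that the draft does not specify: the Next Protocol field
is ignored when P = 0, and an unrecognized value raises an error.

```python
from enum import IntEnum


class NextProtocol(IntEnum):
    """Next Protocol values defined by this draft."""

    IPV4 = 0x1
    IPV6 = 0x2
    ETHERNET = 0x3
    NSH = 0x4
    MPLS = 0x5


def payload_type(p_bit: bool, next_protocol: int) -> NextProtocol:
    """Determine the encapsulated payload type for a received packet."""
    if not p_bit:
        # With P = 0 the payload is Ethernet as in VXLAN. The sender sets
        # the field to zero; ignoring it here is an assumption.
        return NextProtocol.ETHERNET
    # Assumption: values outside the defined set are rejected (ValueError).
    return NextProtocol(next_protocol)
```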
3.3. OAM Support
Flag bit 7 is defined as the O bit. When the O bit is set to 1, the
packet is an OAM packet and OAM processing MUST occur. Other header
fields including Next Protocol MUST adhere to the definitions in
section 3. The OAM protocol details are out of scope for this
document. As with the P-bit, bit 7 is currently a reserved flag in
VXLAN.
3.4. Version Bits
VXLAN GPE bits 2 and 3 are defined as version bits. These bits are
reserved in VXLAN. The version field is used to ensure backward
compatibility going forward with future VXLAN GPE updates.
The initial version for VXLAN GPE is 0.
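A receiver's handling of the Version and O bits might be sketched as
follows. The function name classify, the returned labels, and the
SUPPORTED_VERSIONS set are hypothetical. The only rules taken from
this draft are that a packet with an unsupported version is dropped
and that a packet with the O bit set receives OAM processing.

```python
SUPPORTED_VERSIONS = frozenset({0})  # the initial VXLAN GPE version

VERSION_SHIFT = 4   # Ver occupies flag bits 2 and 3 (bit 0 is the MSB)
VERSION_MASK = 0x3
FLAG_O = 0x01       # flag bit 7


def classify(flags: int) -> str:
    """Classify a packet from the first octet of the VXLAN GPE header."""
    version = (flags >> VERSION_SHIFT) & VERSION_MASK
    if version not in SUPPORTED_VERSIONS:
        return "drop"   # unsupported version: the packet is dropped
    if flags & FLAG_O:
        return "oam"    # OAM processing; its details are out of scope
    return "data"
```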

4. Outer Encapsulations
In addition to the VXLAN GPE header, the packet is further
encapsulated in UDP and IP. Data centers based on Ethernet will
then send this IP packet over Ethernet.
Outer UDP Header:
Destination UDP Port: IANA has assigned the value 4790 for the VXLAN
GPE UDP port. This well-known destination port is used when sending
VXLAN GPE encapsulated packets.
Source UDP Port: The source UDP port is used as entropy for devices
forwarding encapsulated packets across the underlay (ECMP for IP
routers, or load splitting for link aggregation by bridges). All
packets of a given tenant traffic flow should use the same source UDP
port to lower the chances of packet reordering by the underlay. It is
recommended for VTEPs to generate this port number using a hash of
the inner packet headers. Implementations MAY use the entire 16 bit
source UDP port for entropy.
UDP Checksum: Source VTEPs MAY either calculate a valid checksum, or
if this is not possible, set the checksum to zero. When calculating
a checksum, it MUST be calculated across the entire packet (outer IP
header, UDP header, VXLAN GPE header and payload packet). All
receiving VTEPs must accept a checksum value of zero. If the
receiving VTEP is capable of validating the checksum, it MAY validate
a non-zero checksum and MUST discard the packet if the checksum is
determined to be invalid.
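The source UDP port guidance above can be illustrated with a short
sketch. The function name source_port, the choice of the inner
5-tuple as hash input, and the use of CRC-32 are assumptions made for
this illustration; the draft only recommends hashing the inner packet
headers and does not mandate a particular hash.

```python
import zlib


def source_port(src_ip: str, dst_ip: str, proto: int,
                src_port: int, dst_port: int) -> int:
    """Derive a 16-bit outer source UDP port from inner flow fields.

    The same inner flow always yields the same port, so the underlay
    keeps that flow on one path and is less likely to reorder it.
    """
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    # This sketch uses the entire 16-bit range for entropy.
    return zlib.crc32(key) & 0xFFFF
```

An implementation that prefers to stay within a narrower port range
could map the hash into that range instead of masking to 16 bits.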
Outer IP Header:
This is the header used by the underlay network to deliver packets
between VTEPs. The destination IP address can be a unicast or a
multicast IP address. The source IP address must be the source VTEP
IP address which can be used to return tenant packets to the tenant
system source address within the inner packet header.
When the outer IP header is IPv4, VTEPs MUST set the DF bit.
Outer Ethernet Header:
Most data center networks are built on Ethernet. Assuming the outer
IP packet is being sent across Ethernet, there will be an Ethernet
header used to deliver the IP packet to the next hop, which could be
the destination VTEP or be a router used to forward the IP packet
towards the destination VTEP. If VLANs are in use within the data
center, then this Ethernet header would also contain a VLAN tag.

4.1. Inner VLAN Tag Handling
If the inner packet (as indicated by the VXLAN GPE Next Protocol
field) is an Ethernet frame, it is recommended that it does not
contain a VLAN tag. In the most common scenarios, the tenant VLAN
tag is translated into a VXLAN Network Identifier. In these
scenarios, VTEPs should never send an inner Ethernet frame with a
VLAN tag, and a VTEP performing decapsulation should discard any
inner frames received with a VLAN tag. However, if the VTEPs are
specifically configured to support it for a specific VXLAN Network
Identifier, a VTEP may support transparent transport of the inner
VLAN tag between all tenant systems on that VNI. The VTEP never
looks at the value of the inner VLAN tag, but simply passes it across
the underlay.
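The decapsulation behavior described above might be sketched as
follows. The names has_vlan_tag and accept_inner_frame, and the
transparent_vnis configuration set, are hypothetical. This sketch
assumes that a tagged frame is recognized by the IEEE 802.1Q TPID
0x8100 in the EtherType position; other tag types are not considered.

```python
from typing import FrozenSet

TPID_8021Q = 0x8100  # IEEE 802.1Q tag protocol identifier


def has_vlan_tag(frame: bytes) -> bool:
    """Return True if the inner Ethernet frame carries an 802.1Q tag."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    # Octets 12-13 follow the destination and source MAC addresses.
    return int.from_bytes(frame[12:14], "big") == TPID_8021Q


def accept_inner_frame(frame: bytes, vni: int,
                       transparent_vnis: FrozenSet[int]) -> bool:
    """Decide whether a decapsulating VTEP keeps the inner frame.

    Tagged frames are discarded unless the VNI has been specifically
    configured for transparent transport of the inner VLAN tag.
    """
    return (not has_vlan_tag(frame)) or vni in transparent_vnis
```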
4.2. Fragmentation Considerations
VTEPs MUST NOT fragment an encapsulated VXLAN GPE packet, and when
the outer IP header is IPv4, VTEPs MUST set the DF bit in the outer
IPv4 header. It is recommended that the underlay network be
configured to carry an MTU at least large enough to accommodate the
added encapsulation headers. It is recommended that VTEPs perform
Path MTU discovery [RFC1191] [RFC1981] to determine if the underlay
network can carry the encapsulated payload packet.
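The MTU recommendation above comes down to simple arithmetic, which
the following sketch illustrates. The function name max_inner_size
is hypothetical, and the header sizes assume an outer IPv4 header
without options or an outer IPv6 header without extension headers.

```python
OUTER_IPV4 = 20  # octets, assuming no IPv4 options
OUTER_IPV6 = 40  # octets, assuming no IPv6 extension headers
OUTER_UDP = 8    # octets
VXLAN_GPE = 8    # octets


def max_inner_size(underlay_mtu: int, ipv6: bool = False) -> int:
    """Largest inner packet that fits in the underlay IP MTU unfragmented."""
    overhead = (OUTER_IPV6 if ipv6 else OUTER_IPV4) + OUTER_UDP + VXLAN_GPE
    if underlay_mtu <= overhead:
        raise ValueError("underlay MTU cannot carry the encapsulation")
    return underlay_mtu - overhead
```

With an underlay IP MTU of 1500 octets over IPv4, the overhead is 36
octets, leaving 1464 octets for the inner packet. Carrying a full
1514-octet inner Ethernet frame (1500-octet payload plus a 14-octet
header) would therefore need an underlay IP MTU of at least 1550
octets.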

5. Backward Compatibility
5.1. VXLAN VTEP to VXLAN GPE VTEP
A VXLAN VTEP conforms to the VXLAN frame format and uses UDP
destination port 4789 when sending traffic to a VXLAN GPE VTEP. As
per VXLAN, reserved bits 5 and 7 (the VXLAN GPE P and O bits,
respectively) must be set to zero. The remaining reserved bits must
also be zero, including the VXLAN GPE version field, bits 2 and 3.
The encapsulated payload MUST be Ethernet.
5.2. VXLAN GPE VTEP to VXLAN VTEP
A VXLAN GPE VTEP MUST NOT encapsulate non-Ethernet frames to a VXLAN
VTEP. When encapsulating Ethernet frames to a VXLAN VTEP, the VXLAN
GPE VTEP MUST conform to the VXLAN frame format and hence will set
the P bit to 0, set the Next Protocol to 0, and use UDP destination
port 4789. A VXLAN GPE VTEP MUST also set O = 0 and Ver = 0 when
encapsulating Ethernet frames to a VXLAN VTEP. The receiving VXLAN
VTEP will treat this packet as a VXLAN packet.
A method for determining the capabilities of a VXLAN VTEP (GPE or
non-GPE) is out of the scope of this draft.
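The sender-side rules above can be summarized in a short sketch. The
names EncapParams and select_encap are hypothetical, and the
peer_supports_gpe input stands in for whatever capability discovery
method is used, which this draft leaves out of scope. The sketch
covers data packets only, so the O bit is always 0.

```python
from typing import NamedTuple

VXLAN_PORT = 4789      # UDP destination port for VXLAN
VXLAN_GPE_PORT = 4790  # UDP destination port for VXLAN GPE
ETHERNET = 0x3         # Next Protocol value for Ethernet


class EncapParams(NamedTuple):
    udp_dst_port: int
    p_bit: int
    next_protocol: int
    o_bit: int
    version: int


def select_encap(peer_supports_gpe: bool, payload_protocol: int) -> EncapParams:
    """Choose header settings for a data packet sent to a remote VTEP."""
    if peer_supports_gpe:
        # VXLAN GPE peer: the payload is typed explicitly via Next Protocol.
        return EncapParams(VXLAN_GPE_PORT, 1, payload_protocol, 0, 0)
    if payload_protocol != ETHERNET:
        # Non-Ethernet payloads are not sent to a VXLAN (non-GPE) VTEP.
        raise ValueError("a VXLAN VTEP can only receive Ethernet payloads")
    # VXLAN peer: plain VXLAN format with P = 0, Next Protocol = 0,
    # O = 0, and Ver = 0.
    return EncapParams(VXLAN_PORT, 0, 0, 0, 0)
```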
5.3. VXLAN GPE UDP Ports
VXLAN GPE uses an IANA-assigned UDP destination port, 4790, when
sending traffic to VXLAN GPE VTEPs.
5.4. VXLAN GPE and Encapsulated IP Header Fields
When encapsulating and decapsulating IPv4 and IPv6 packets, certain
fields from the inner IP header, such as the IPv4 Time to Live (TTL),
need to be considered. VXLAN GPE IP encapsulation and decapsulation
utilizes the techniques described in [RFC6830], section 5.3.