Bug Reference

Branch

Introduction

The proposed plugin will bring a new network isolation method, "VXLAN", to CloudStack.
The plugin is for the KVM hypervisor and works with the native VXLAN support in Linux kernel 3.7+.

Purpose

This is a functional specification of the VXLAN Network Plugin, which has Jira ID 2328.

The goal of this feature is to remove the limitations associated with VLANs.
Among the limitations are:

Scaling: a maximum of 4094 VLANs per datacenter is possible. Even this number is only a theoretical maximum; the actual number of VLANs that can be configured is often limited by the capabilities of the physical switches in the data center, as they need to maintain a distinct forwarding table for each VLAN.
Configuration complexity: VLAN information has to be consistently provisioned on all networking hardware.
Broadcast containment: broadcasts within one VLAN cause needless flooding on links not carrying that VLAN.
Flexibility: since VLANs are terminated at Layer 2, they cannot define virtual networks that span different L2 broadcast domains.

Glossary

VTEP

VXLAN Tunnel End Point - an entity that originates and/or terminates VXLAN tunnels

CS

CloudStack

Feature Specifications

Summary
The proposed plugin will bring a new network isolation method, "VXLAN", to CloudStack.
VXLAN is one of the emerging technologies for overcoming VLAN limitations. VXLAN enables Layer 2 tunneling over UDP/IP with VLAN-like encapsulation and allows up to 16M (2^24) isolated networks in a domain.

Interoperability and compatibility requirements:

OS and Hypervisors
It supports the KVM hypervisor.
It requires Linux kernel 3.7 or later with VXLAN support enabled, and a corresponding version of iproute2.
Early VXLAN implementations may use a UDP port different from the one IANA assigned (the historical Linux default is 8472, while IANA assigned 4789), so you may need to match kernel versions across hosts or configure the kernel to use the same port number.
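To illustrate the port issue, the UDP destination port can be pinned explicitly when a vxlan interface is created with iproute2, avoiding a mismatch between hosts. A minimal sketch, assuming root privileges and illustrative interface names, VNI, and multicast group:

```shell
# Create a vxlan interface with the IANA-assigned port pinned explicitly
# (without "dstport", older kernels default to 8472).
# vxlan1001, VNI 1001, group 239.0.3.233, and eth0 are illustrative values.
ip link add vxlan1001 type vxlan id 1001 group 239.0.3.233 dev eth0 dstport 4789

# The detail view ("-d") shows the configured dstport for verification.
ip -d link show vxlan1001
```

This only applies to iproute2 versions that support the `dstport` option; on mismatched setups the simpler fix is to run the same kernel version on all hypervisors.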

storage, networks, other
Hypervisors must have an IP address that can be used as the VXLAN endpoint.
The network must not filter multicast packets, and hypervisors must be able to communicate with each other on the VXLAN UDP port.
It is recommended to enable IGMP/MLD snooping in the network to filter unnecessary L2 flooding, but this plugin will work without IGMP/MLD snooping.
Multicast routing is required if you'd like to create a VXLAN isolated network over multiple L2 segments.
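As a quick sanity check that the network is not filtering the VXLAN UDP port between hypervisors, a unicast UDP probe with netcat can be used (this does not verify multicast delivery, only that the port is reachable). The port 4789 and the endpoint IP 192.0.2.10 below are assumptions for illustration:

```shell
# On hypervisor A: listen on the VXLAN UDP port
# (BSD netcat syntax; traditional netcat uses "nc -u -l -p 4789").
nc -u -l 4789 &

# On hypervisor B: send a probe to hypervisor A's endpoint IP.
echo probe | nc -u -w1 192.0.2.10 4789
```

Multicast group membership and IGMP/MLD behavior need separate verification with the tools available in your environment.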

Performance and Scalability consideration
TBD.
VXLAN introduces an additional header, so the inner MTU available to VMs is reduced.
Without adjusting the (outer/inner) MTU accordingly, network throughput will be reduced.
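The overhead arithmetic can be made concrete. For VXLAN over IPv4, the encapsulation adds the inner Ethernet header plus outer IPv4, UDP, and VXLAN headers, which is the commonly cited 50-byte overhead; the 1500-byte outer MTU below is an assumption:

```shell
#!/bin/sh
# Illustrative MTU arithmetic for VXLAN over IPv4.
OUTER_MTU=1500          # physical NIC MTU (assumed)
ETH_HDR=14              # inner Ethernet header carried as payload
IP_HDR=20               # outer IPv4 header
UDP_HDR=8               # outer UDP header
VXLAN_HDR=8             # VXLAN header
OVERHEAD=$((ETH_HDR + IP_HDR + UDP_HDR + VXLAN_HDR))
INNER_MTU=$((OUTER_MTU - OVERHEAD))
echo "VXLAN overhead: ${OVERHEAD} bytes, guest MTU: ${INNER_MTU}"
```

So either the guest MTU must be lowered to 1450, or the physical network must carry a larger outer MTU (e.g. jumbo frames) to keep 1500 inside the tunnel.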

Use cases

There won't be any change to the existing CloudStack workflow.

Architecture and Design description

Software design and architecture:

We still use vNets. However, the vNet ID now represents a VNI rather than a VLAN ID.
When a network is implemented, the VXLAN network guru allocates a vNet whose identifier will be used for allocating the VNI for the network.

CloudStack configures Linux bridge instances and VXLAN tunnel interfaces as required: configuration occurs only when VMs are actually started on hosts.

At VM startup, during network preparation for a given NIC, the newly implemented VXLAN VifDriver (com.cloud.hypervisor.kvm.resource.VxlanBridgeVifDriver) calls the newly implemented VXLAN manipulation script "modifyvxlan.sh". The script works in the same manner as "modifyvlan.sh": it creates the bridge and vxlan interface if required and attaches the VM to the bridge. This is achieved by dispatching commands to the hypervisor resource using CloudStack's agent framework.
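The setup performed by modifyvxlan.sh can be sketched with plain iproute2/brctl commands. This is an illustrative approximation, not the script's actual contents; the interface names, VNI, multicast group, and physical device are assumptions, and root privileges plus kernel 3.7+ are required:

```shell
#!/bin/sh
# Sketch of the per-VNI setup: create the vxlan interface bound to the
# segment's multicast group, create the bridge, and enslave the interface.
VNI=1001               # vNet ID used as the VNI (illustrative)
GROUP=239.0.3.233      # multicast group assigned by the management server (assumed)
PHYS_DEV=eth0          # physical interface carrying VXLAN traffic (assumed)

ip link add vxlan${VNI} type vxlan id ${VNI} group ${GROUP} dev ${PHYS_DEV}
brctl addbr brvx-${VNI}
brctl addif brvx-${VNI} vxlan${VNI}
ip link set vxlan${VNI} up
ip link set brvx-${VNI} up
# The VM's tap device is subsequently attached to brvx-${VNI} by libvirt.
```

Both steps are idempotent in the real script, since the bridge and vxlan interface are created only "if required".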

When a host no longer has any VM belonging to a VXLAN segment, the bridge and vxlan interface are unconfigured. This is achieved by the KVM server resource (com.cloud.hypervisor.kvm.resource.LibvirtComputingResource) calling the VXLAN manipulation script "modifyvxlan.sh".
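The teardown is the reverse of the setup; again a hedged sketch with assumed interface names, not the script's actual contents:

```shell
#!/bin/sh
# Sketch of the per-VNI teardown when the last VM of the segment leaves
# the host: detach and delete the bridge, then delete the vxlan interface.
VNI=1001               # illustrative VNI

ip link set brvx-${VNI} down
brctl delif brvx-${VNI} vxlan${VNI}
brctl delbr brvx-${VNI}
ip link del vxlan${VNI}
```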

Network design and architecture:

With Linux kernel support, we can simply replace the VLAN sub-interface with a VXLAN interface on KVM hypervisors.
The figure below shows a comparison between VLAN bridging and VXLAN bridging on a KVM hypervisor.

Each VXLAN segment has its own VNI and multicast group; all VTEPs (the KVM hypervisors) that have VMs belonging to a VXLAN segment must be configured to use the same VNI and multicast group for that segment.
The CS management server manages the mapping between VNI and multicast group.
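One way to make the VNI-to-group relationship concrete is a deterministic mapping that every VTEP can compute identically. The scheme below (embedding the 24-bit VNI into a 239.0.0.0/8 administratively scoped address) is purely a hypothetical illustration; the plugin's actual mapping is owned by the CS management server and may differ:

```shell
#!/bin/sh
# Hypothetical sketch: derive a multicast group from a VNI by packing the
# VNI's three bytes into the low octets of a 239.x.y.z address.
VNI=1001                          # illustrative VNI
B2=$(( (VNI >> 16) & 255 ))       # high byte of the 24-bit VNI
B3=$(( (VNI >> 8) & 255 ))        # middle byte
B4=$(( VNI & 255 ))               # low byte
GROUP="239.${B2}.${B3}.${B4}"
echo "VNI ${VNI} -> multicast group ${GROUP}"
```

Whatever scheme is used, the essential property is the one stated above: every VTEP participating in a segment must end up with the same (VNI, group) pair.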