Let’s expand the small network described in the previous post a bit, adding a second ESX server and another switch. Both ESX servers are connected to both switches (resulting in a fully redundant design) and the switches have been configured as an MLAG cluster (using VSS with Catalyst 6500, vPC with Nexus 7000 or Nexus 5000, or IRF with HP switches). Link aggregation is not used between the physical switches and the ESX servers due to the lack of LACP support in ESX.

The physical switches are unaware of the physical connectivity the ESX servers have. Assuming that vSwitches use per-VM load balancing and each VM is pinned to one of the uplinks, source MAC address learning in the physical switches produces the following logical topology:
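The resulting MAC tables can be sketched with a toy model (all VM, port, and switch names below are illustrative, not taken from any real device):

```python
# Toy model of source-MAC learning on the two physical switches.
# Each VM is pinned to one ESX uplink, so each switch only ever
# sees a given source MAC on one port.

def learn(mac_table, src_mac, ingress_port):
    """A switch associates a source MAC with the port it arrived on."""
    mac_table[src_mac] = ingress_port

left_switch = {}
right_switch = {}

# VM A and VM C are pinned to uplinks facing the left switch ...
learn(left_switch, "MAC-A", "uplink-to-ESX1")
learn(left_switch, "MAC-C", "uplink-to-ESX2")

# ... while VM D is pinned to an uplink facing the right switch,
# so the left switch only ever sees MAC-D on the inter-switch link.
learn(right_switch, "MAC-D", "uplink-to-ESX2")
learn(left_switch, "MAC-D", "inter-switch-link")

# Result: the left switch forwards A-to-D traffic across the
# inter-switch link even though ESX2 (hosting D) is directly attached.
print(left_switch["MAC-D"])   # -> inter-switch-link
```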

Each VM appears to be single-homed to one of the switches. The traffic between VM A and VM C is thus forwarded locally by the left-hand switch; the traffic between VM A and VM D has to traverse the inter-switch link because neither switch knows VM D can also be reached by a shorter path.

In a multi-chassis link aggregation scenario it’s thus almost mandatory to configure a static port channel on the switch ports to which the vSphere servers are connected; otherwise you risk overloading the inter-switch link, as the traffic between adjacent ESX servers is needlessly sent across it. When doing that, you (probably) have to configure IP-hash-based load balancing in the vSwitch (there are additional vSwitch-side implications if the NICs in ESX are configured in an active/standby configuration).
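On the vSwitch side, IP-hash load balancing selects the uplink from the source/destination IP pair rather than from the VM’s port ID. A minimal sketch of the idea (the XOR-and-modulo formula is an assumption for illustration, not VMware’s exact algorithm):

```python
import ipaddress

def ip_hash_uplink(src_ip: str, dst_ip: str, num_uplinks: int) -> int:
    """Pick an uplink index from the src/dst IP pair.

    Illustrative only: the point is that the chosen uplink depends on
    the IP pair of each flow, not on which VM is sending, so the
    traffic of a single VM can use all members of the port channel.
    """
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    return (src ^ dst) % num_uplinks

# The same VM talking to two different destinations may use
# different uplinks:
print(ip_hash_uplink("10.0.0.1", "10.0.0.2", 2))
print(ip_hash_uplink("10.0.0.1", "10.0.0.3", 2))
```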


12 comments:

It's not necessarily as bad as the diagram indicates... yes, this is what happens when the VMs are on different ESX hosts, but any VMs in the same port group on the same host will be switched within the vSwitch. So in the diagram, if A and D are running on the same ESX host and are on the same VLAN, they can be in the same port group and none of the traffic between them will leave the host. For this reason, when VMs send high amounts of data to each other over the network, we will often add DRS affinity rules to keep them on the same host.

But I disagree with "Each VM appears to be single-homed to one of the switches."

From the pSwitch perspective, each VM is homed to an *aggregation*, and MAC learning will happen on the aggregation, regardless of the link member where they arrive.

Taking just the left pSwitch, all of A's frames will arrive on link member 0 and all of C's frames will arrive on link member 1... But the pSwitch won't notice this. The pSwitch will associate both MACs with the aggregate interface, and will forward downstream frames according to whichever hashing method is configured, totally ignoring which MAC showed up on which link member.

Traffic won't flow across the inter-switch link. It will flow *down* the aggregate, and only *maybe* get delivered to the VMs.
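The mechanism this comment describes, learning against the aggregate and hashing on egress, can be sketched as a toy model (interface names and the member-selection hash are illustrative, not any vendor's actual implementation):

```python
# Toy model of MAC learning on a port channel: the switch records the
# *aggregate* interface, not the member link the frame arrived on, and
# picks an egress member link by hashing.

mac_table = {}

def learn(src_mac, ingress_iface, members_by_aggregate):
    # If the ingress interface is a member of an aggregate, learn the
    # MAC against the aggregate itself.
    for agg, members in members_by_aggregate.items():
        if ingress_iface in members:
            mac_table[src_mac] = agg
            return
    mac_table[src_mac] = ingress_iface

def egress_member(dst_mac, src_mac, members_by_aggregate):
    iface = mac_table[dst_mac]
    members = members_by_aggregate.get(iface)
    if not members:
        return iface
    # The hash on the MAC pair decides the member link, ignoring
    # which member link the destination MAC actually arrived on.
    return members[hash((src_mac, dst_mac)) % len(members)]

aggs = {"Po10": ["link0", "link1"]}
learn("MAC-A", "link0", aggs)   # A's frames arrive on member 0 ...
learn("MAC-C", "link1", aggs)   # ... C's frames on member 1 ...
print(mac_table["MAC-A"], mac_table["MAC-C"])   # ... both learned on Po10
```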

Chris, I think Ivan's statement, "Each VM appears to be single-homed to one of the switches," refers to the default port ID vSwitch NIC teaming policy, not the IP-hash policy, so there are no port channels to the ESX hosts in the scenario where that statement applies...

"the switches have been configured as a MLAG cluster"..."link aggregation is not used"

Ah! Okay, I'd missed the "aggregation is not used" sentence until just now. Looking back at your previous comment, I guess this is the part that got clarified (after my misunderstanding of the topology was firmly cemented in my brain).

VM A and VM D are connected to different vSwitches in the above diagram, so traffic between them still has to go through the physical switch stack even if they're on the same physical host and on the same VLAN. There is no way for traffic to flow between different vSwitches within the same host unless you purposely introduce bridging between the two vSwitches.

The author

Ivan Pepelnjak (CCIE#1354 Emeritus), Independent Network Architect at ipSpace.net,
has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced internetworking technologies since 1990.