When you use IRF to group multiple Comware switches into one logical device, it is generally recommended to enable a split-brain detection mechanism (a split brain happens when all the stacking links are down).

Until now, only a Comware switch could act as the peer device for the MAD LACP method. The Provision switch firmware has now been updated, so an LACP link between a Provision switch and a Comware IRF can be used for MAD LACP.

Background

The split brain detection mechanism, which is known as Multiple Active Detection (MAD), is available through LACP, BFD, ARP and ND. I personally only use the LACP and BFD methods.

The LACP method is easy, since you can use an existing link-aggregation to a peer switch for the detection. However, it relies on an extended LACP PDU that carries an additional TLV. The TLV contains the active master ID of the IRF system.

When the peer LACP device receives this information, it should proxy this information back to the original IRF system over all the other ports of the link-aggregation.

As a result, the IRF system will receive its own ID information back on the other ports of the link-aggregation.

When the ID is the same, everything is OK. When the IDs are different however, it means there is a split brain.
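
The exchange described above can be modelled in a few lines of Python. This is only a sketch to illustrate the logic, not the actual packet format: the class and function names are made up for this example, and including the IRF domain ID next to the active ID in the TLV is an assumption on my side.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MadTlv:
    """Hypothetical stand-in for the extra TLV in the extended LACP PDU."""
    domain_id: int  # IRF domain of the sender (assumed to be part of the TLV)
    active_id: int  # member ID of the sender's active master


def relay(tlv, ingress_port, lag_ports):
    """Peer switch: copy the received TLV to every other port of the link-aggregation."""
    return {port: tlv for port in lag_ports if port != ingress_port}


def is_split_brain(own, received):
    """IRF side: same domain but a different active ID means a split brain."""
    return received.domain_id == own.domain_id and received.active_id != own.active_id


lag_ports = ["port1", "port2"]

# Healthy IRF: member 1 is master, its TLV comes back unchanged on port2.
sw1 = MadTlv(domain_id=1, active_id=1)
relayed = relay(sw1, "port1", lag_ports)
print(is_split_brain(sw1, relayed["port2"]))  # False

# After a split, member 2 becomes master of its own fragment and
# receives the TLV of member 1 through the peer switch.
sw2 = MadTlv(domain_id=1, active_id=2)
print(is_split_brain(sw2, relayed["port2"]))  # True
```

The peer switch does not interpret the TLV at all in this model; it only passes it on, which is exactly the part that other LACP implementations do not do.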

Provision support

So far, only Comware devices could be used as the peer detection device, since other vendors' LACP implementations would not proxy the additional TLV back over the other link-aggregation ports.

Now the Provision firmware has received an update and Provision switches can be used to provide the MAD LACP support.

This is good news for any mixed Provision-Comware networks, where MAD BFD may not have been possible for whatever reason.

Example configuration

The example uses a 3500 with K.15.16.0004, so all 5400/3500/3800/2920 switches have support for it. The 2620 can also be used with current firmware (check the release notes).
The Comware IRF is based on a 3600, a 100 Mbit Comware switch (the ports used are Ex/0/x, as opposed to Gx/0/x for a Gigabit switch).

The example assumes an IRF system is running already, so it shows only the steps to enable MAD LACP between the Comware IRF and Provision switch.

Steps:

On Comware IRF, define an LACP link-aggregation to Provision

On Comware IRF, enable the link-aggregation for MAD LACP

On Provision, define an LACP link-aggregation to Comware IRF

On Provision, enable the link-aggregation to perform MAD LACP pass-through

Supplemental validation

Run another split-brain check, to see the console output. This can be forced using the mad restore command.

The mad restore command was originally intended for this rare situation:

* IRF configured between switches (example SW1/SW2)
* Split brain occurs, SW2 MAD detects it and shuts down all interfaces
* SW1 is the only surviving node, network still ok
* SW1 encounters a power failure, so the network is down (SW2 has all ports down, so no more network)
* Instead of performing a full reboot of SW2 to get it online again, the admin can use mad restore on SW2 to enable the interfaces again. Network will be back online after this command, since there is no more split brain condition (SW1 is powered down).

You can abuse this functionality to run multiple split brain tests without having to do a full reboot of the switches. This is what is done in this example.

Since the mad restore will be done on member2, the interfaces will come UP, MAD LACP will detect the split brain again, and all interfaces will be SHUTDOWN again. But this time you can follow the process on the console log output as well.
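
To make the difference between the intended use and the test trick explicit, here is a small Python sketch of the port states on the member where mad restore is entered. The function name and the state labels are invented for this illustration; they are not actual console output.

```python
def mad_restore(own_active_id, peer_active_id):
    """Return the port states seen on the member where 'mad restore' is run.

    peer_active_id is None when the other IRF fragment is powered off.
    """
    states = ["DOWN (MAD shutdown)", "UP (mad restore)"]
    if peer_active_id is not None and peer_active_id != own_active_id:
        # The other fragment is still alive and reports a different active ID,
        # so MAD LACP detects the split brain again and shuts the ports down.
        states.append("DOWN (MAD shutdown)")
    return states


# Intended use: SW1 is powered off, so SW2 stays up after the restore.
print(mad_restore(own_active_id=2, peer_active_id=None))

# Test trick from this example: SW1 is still active, so member2 goes down again.
print(mad_restore(own_active_id=2, peer_active_id=1))
```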

Hello, quick question about enabling LACP MAD on a bridge aggregation link: it prompts you for the domain ID, which I leave as the ID of the IRF pair I am on. Why does it prompt for this, and in what scenarios would you choose a domain ID different from that of the switch you are on?

Hi, I have a couple of questions.
1. I am using route aggregation interfaces exclusively between the core/distribution layers. Can MAD LACP be enabled on a route aggregation interface in the same way as a bridge aggregation interface? The command is accepted, but I cannot find a reference to a MAD/RAGG config example.
2. Some references state that MAD needs to be enabled on both ends of a link and others do not. Is this necessary for the Comware LACP extensions to work properly?

Hi Rob,
1. The MAD information is part of the LACP packet exchange, so if LACP is OK, MAD is OK. In fact, from a link-agg point of view, there is no difference between a BAGG with LACP and a RAGG with LACP. The difference is only in the switch-local config for routing/switching over the link-aggregation, so it has no impact on the MAD LACP process.
2. Both ends must understand the MAD LACP extensions. Comware switches understand these LACP extensions by default, no config required except for enabling LACP (so target does not need to be an IRF, just a Comware device). When you have a non-Comware device, such as the ArubaOS-switch (Provision), you need to enable support for the MAD LACP extensions manually.