FabricPath Multidestination Trees

FabricPath has many advantages over the classical Spanning Tree Protocol, mainly because it can use ECMP (Equal-Cost Multipath) routing.

For unicast frames, it uses the well-known Switch-ID that is inserted in the FabricPath header. This will be explained in a future post for sure. I have been intrigued by how multidestination frames (unknown unicast, multicast, broadcast) are forwarded within a FabricPath network.

FabricPath forwards these frames along multidestination trees. This will sound familiar to many network engineers, as it has some similarities with Spanning Tree. The protocol starts by building tree 1, and to do that the topology elects a root based on two factors:

FabricPath priority : <1-255> Unlike spanning-tree where lower is better, the election will elect the switch with the highest priority.

SystemID : It is a MAC address that belongs to the backplane of the switch.

If there is a tie in the priority, the system-id will be the tie-breaker.

Once the root of tree 1 has been elected, FabricPath will elect the root of the second tree based on the same factors described above. Each tree is identified by an FTag field within the FabricPath header.

The multidestination tree 1 is bound to FTag 1 and will forward unknown unicast, broadcast and multicast frames.

The multidestination tree 2 is bound to FTag 2 and will forward multicast frames only. We can see here that multicast frames will be load balanced on both trees in the topology. Broadcast and unknown unicast will be forwarded on the multidestination tree 1 only.
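The mapping between traffic type and tree can be sketched in a few lines of Python. This is only an illustration: the function name `select_ftag` and the CRC-based hash are assumptions made for the example, and the real switch uses its own hardware hash to spread multicast across the two trees.

```python
import zlib

BROADCAST = "broadcast"
UNKNOWN_UNICAST = "unknown_unicast"
MULTICAST = "multicast"


def select_ftag(frame_type: str, flow_key: bytes = b"") -> int:
    """Return the FTag (tree number) used for a multidestination frame."""
    if frame_type in (BROADCAST, UNKNOWN_UNICAST):
        # Broadcast and unknown unicast always use tree 1.
        return 1
    if frame_type == MULTICAST:
        # Hypothetical hash: any deterministic per-flow hash that picks
        # tree 1 or tree 2 illustrates the load balancing.
        return 1 + zlib.crc32(flow_key) % 2
    raise ValueError(f"not a multidestination frame type: {frame_type}")
```

With this sketch, `select_ftag("broadcast")` always returns 1, while multicast flows land on either tree depending on their flow key.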

Here is the topology we will be working on:

Let’s assume that the FabricPath adjacencies are formed and stable.

We want the Nexus 7K spine switches to be roots of the multidestination trees. N7K-1 will be root of multidestination tree 1 and N7K-2 will be root of the multidestination tree 2.

N7K-1# show run | sec "fabricpath domain default"
fabricpath domain default
  root-priority 255

N7K-2# show run | sec "fabricpath domain default"
fabricpath domain default
  root-priority 254

Now let’s check how the system has built the trees and how we can verify it.

N7K-1# show fabricpath isis switch-id
Fabricpath IS-IS domain: default
Fabricpath IS-IS Switch-ID Database
Legend: C - Confirmed, T - tentative, W - swap
        S - sticky, E - Emulated Switch
        '*' - this system
System-ID        Primary   Secondary  Reachable  Bcast-Priority
MT-0
0005.73bf.3bbc   52[C]     0[C]       Yes        64[S]
0005.73ca.9001   51[C]     0[C]       Yes        64[S]
0026.980d.3cc4   72[C]     0[C]       Yes        254[S]
18ef.63e3.cec4*  71[C]     0[C]       Yes        255[S]
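Using the values from the output above, the root election can be sketched as a simple sort. This is a rough model, not the actual IS-IS implementation: it assumes that the highest system ID wins a priority tie and that the root of tree 2 is simply the next-best candidate.

```python
# (system_id, switch_id, root_priority) taken from the switch-id database
switches = [
    ("0005.73bf.3bbc", 52, 64),
    ("0005.73ca.9001", 51, 64),
    ("0026.980d.3cc4", 72, 254),
    ("18ef.63e3.cec4", 71, 255),
]


def election_key(switch):
    """Sort key: priority first, then the system ID as a 48-bit integer."""
    system_id, _switch_id, priority = switch
    return (priority, int(system_id.replace(".", ""), 16))


# Highest priority wins; the system ID only matters when priorities tie
# (assumption: the highest system ID is preferred).
ranked = sorted(switches, key=election_key, reverse=True)
tree1_root, tree2_root = ranked[0], ranked[1]

print("tree 1 root: switch-ID", tree1_root[1])
print("tree 2 root: switch-ID", tree2_root[1])
```

With the priorities configured earlier, switch-ID 71 (N7K-1) comes out as the root of tree 1 and switch-ID 72 (N7K-2) as the root of tree 2.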

As you can see here, the system-ID looks like a MAC address and, as explained above, it is indeed a MAC address that belongs to the backplane of the Nexus.

For example, on N5K-1 (SystemID: 0005.73ca.9001), the backplane has 96 usable MAC addresses:

N5K-1# show sprom backplane | inc MAC
MAC Addresses : 00-05-73-ca-8f-c0
Number of MACs : 96

Starting at the base address and counting 96 addresses, the range of MAC addresses for that particular switch is [00-05-73-ca-8f-c0 to 00-05-73-ca-90-1f], which contains the system-ID.

Similarly, on N7K-1 (SystemID: 18ef.63e3.cec4), the backplane has 128 usable MAC addresses:

N7K-1# show sprom back 1 | inc MAC
MAC Addresses : 18-ef-63-e3-ce-80
Number of MACs : 128

N7K-1# show sprom back 2 | inc MAC
MAC Addresses : 18-ef-63-e3-ce-80
Number of MACs : 128

Starting at the base address and counting 128 addresses, the range of MAC addresses for that particular switch is [18-ef-63-e3-ce-80 to 18-ef-63-e3-ce-ff], which again contains the system-ID.
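The arithmetic behind these ranges is easy to check with a short script. The sketch below assumes the block is contiguous and starts at the base address reported by the sprom output; the helper names `mac_block` and `in_block` are made up for the example.

```python
def mac_block(base: str, count: int) -> tuple[str, str]:
    """Return the first and last MAC of a block of `count` addresses."""
    first = int(base.replace("-", ""), 16)
    last = first + count - 1

    def fmt(value: int) -> str:
        digits = f"{value:012x}"
        return "-".join(digits[i:i + 2] for i in range(0, 12, 2))

    return fmt(first), fmt(last)


def in_block(system_id: str, base: str, count: int) -> bool:
    """Check whether a dotted system-ID falls inside the backplane block."""
    value = int(system_id.replace(".", ""), 16)
    first = int(base.replace("-", ""), 16)
    return first <= value < first + count


print(mac_block("00-05-73-ca-8f-c0", 96))    # N5K-1
print(mac_block("18-ef-63-e3-ce-80", 128))   # N7K-1
print(in_block("0005.73ca.9001", "00-05-73-ca-8f-c0", 96))
print(in_block("18ef.63e3.cec4", "18-ef-63-e3-ce-80", 128))
```

Both membership checks return True, which confirms that each system-ID is taken from the backplane MAC block of its switch.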

From the root of the tree (switch-ID 71) to switch 52, the metric is 40 (the default for a 10 Gb/s link in FabricPath IS-IS), and N5K-1 has to use interface E1/4.

From the root of the tree (switch-ID 71) to switch 71 (itself), the metric is 0 (obviously), and N5K-1 has to use interface E1/4.

From the root of the tree (switch-ID 71) to switch 72, the metric is 80 (crossing two 10 Gb/s links), and N5K-1 has to use interface E1/4.
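These metrics are just shortest-path costs from the root, with every 10 Gb/s link counting for 40. Here is a small sketch that reproduces them. The link list is an assumption inferred from the metrics quoted above (each N5K connected to both N7Ks, no direct link between the two N7Ks), not something read from the switches.

```python
import heapq

LINK_METRIC_10G = 40

# Assumed FabricPath core links, written as pairs of switch-IDs.
links = [(51, 71), (52, 71), (51, 72), (52, 72)]


def metrics_from(root, links, metric=LINK_METRIC_10G):
    """Dijkstra from the tree root; every link has the same metric."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)

    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist[node]:
            continue  # stale heap entry
        for neighbor in graph[node]:
            new_cost = cost + metric
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return dist


for switch_id, cost in sorted(metrics_from(71, links).items()):
    print(f"root 71 -> switch {switch_id}: metric {cost}")
```

Under that assumed topology the script gives 0 for switch 71, 40 for switches 51 and 52, and 80 for switch 72, which matches the values above.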

If you type the same command on all the switches in the fabricpath domain, you can draw the following topology:

Reminder: the metric is 40 for a 10 Gb/s link.

N5K-1# show fabric isi interface brief | inc 1/2|Type
Interface    Type  Idx  State     Circuit  MTU   Metric  Priority  Adjs/AdjsUp
Ethernet1/2  P2P   5    Up/Ready  0x01/L1  1500  40      64        1/1

N5K-1# sh int e1/2 | inc Gb/s
  full-duplex, 10 Gb/s, media type is 10G

Let’s see how a broadcast is then forwarded within the FabricPath domain and more specifically on the multidestination tree 1:

Let’s say Server1 (which is on the left) sends a regular broadcast Ethernet frame (Destination MAC: FFFF.FFFF.FFFF) to find out which MAC should be used to reach Server2. The broadcast is forwarded within the classical Ethernet domain and reaches N5K-1.

N5K-1 recognizes that the frame is a broadcast and decides to forward it on tree one using port E1/4. A FabricPath header is added to the ethernet frame with the following characteristics:

OUTER DA : FFFF.FFFF.FFFF . It is a special case where the inner DA is copied to outer DA when a frame is a broadcast. (For your reference, unknown unicast FabricPath frames have a special OUTER DA: 01:0F:FF:C1:01:C0)

OUTER SA : SwitchID 51 ( I do not cover SSID and LID on purpose 🙂 )

FTAG : 1

TTL : 32
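The header fields listed above can be modelled with a small data class. This is a deliberately simplified sketch: the class and function names are invented for the example, the outer SA only carries the switch-ID (SSID and LID are left out, as in the text), and nothing here reflects the real bit layout of the header.

```python
from dataclasses import dataclass

BROADCAST_MAC = "ffff.ffff.ffff"


@dataclass
class FabricPathHeader:
    outer_da: str
    outer_sa_switch_id: int
    ftag: int
    ttl: int


def encapsulate_broadcast(inner_da: str, ingress_switch_id: int) -> FabricPathHeader:
    """Build the simplified header an edge switch adds to a broadcast frame."""
    if inner_da.lower() != BROADCAST_MAC:
        raise ValueError("this sketch only handles broadcast frames")
    return FabricPathHeader(
        outer_da=inner_da.lower(),           # inner DA copied to the outer DA
        outer_sa_switch_id=ingress_switch_id,
        ftag=1,                              # broadcast always uses tree 1
        ttl=32,                              # initial TTL from the list above
    )


print(encapsulate_broadcast("FFFF.FFFF.FFFF", 51))
```

Calling it with switch-ID 51 gives the header N5K-1 would add in this scenario: outer DA ffff.ffff.ffff, outer SA switch-ID 51, FTag 1, TTL 32.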

N7K-1 receives the frame and recognizes that it is a broadcast assigned to FTag 1. Normally a FabricPath switch uses the Switch-ID to forward frames, but since this is a broadcast, the FTag is used to make the forwarding decision.

On N7K-1, FTag 1 has one other link, to N5K-2 (E1/28). The TTL is decremented and the frame is forwarded on that link.

And so on, switch after switch, along the tree.

Obviously, if a broadcast was sent from Server2, N5K-2 would forward the frame onto multidestination tree 1 using ports E1/8 to N7K-2 and E1/4 to N7K-1.

The fact that there is no redundant link for multidestination frames guarantees that there will be no looped frames in the FabricPath domain.
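To make the loop-free property concrete, here is a small simulation of a broadcast flooded along tree 1. The tree links are an assumption reconstructed from the walkthrough above (71-51, 71-52 and 52-72), and the model is simplified: the ingress switch sends the frame with TTL 32 and every transit switch decrements it before forwarding.

```python
from collections import deque

# Assumed links of multidestination tree 1, as pairs of switch-IDs.
TREE1_LINKS = [(71, 51), (71, 52), (52, 72)]


def flood(ingress, links, ttl=32):
    """Flood a frame along a tree; return the TTLs each switch received."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    received = {}
    # Each entry is (sender, receiver, TTL carried on the wire).
    queue = deque((ingress, nbr, ttl) for nbr in neighbors[ingress])
    while queue:
        sender, node, ttl_in = queue.popleft()
        received.setdefault(node, []).append(ttl_in)
        ttl_out = ttl_in - 1
        if ttl_out <= 0:
            continue  # frame would be dropped instead of forwarded
        for nbr in neighbors[node]:
            if nbr != sender:  # never send back on the arrival link
                queue.append((node, nbr, ttl_out))
    return received


for switch_id, ttls in sorted(flood(51, TREE1_LINKS).items()):
    print(f"switch {switch_id} received {len(ttls)} copy, TTL {ttls}")
```

Whichever edge switch the broadcast enters from, every other switch receives exactly one copy, because a tree offers a single path between any two switches.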