Enable DCB on Dell N4000 and Dell Force10 S6000-on

For our Dataon S2D-3212 setup I had planned to use our Dell N4000 switches for DCB/RDMA, as we have them in both our datacenters. During the first install I ran into problems when enabling DCB with no-drop on the N4000. The N4000 was running firmware version 6.3.2.3, and we were losing connectivity to some servers when no-drop was enabled. So we ended up buying some new Dell Force10 S6000-on switches, as the NICs in our servers are Mellanox ConnectX-4 40Gbit cards.

Now I have just set up our second Dataon S2D-3212 cluster in our other datacenter with the same N4000 switches. These were running version 6.3.0.16, and when I enabled no-drop on the interfaces everything worked perfectly. Happy with that, I decided to upgrade to the latest version, 6.3.2.4, which Dell had released with a fix for MAC address dropping over the stack. But no go, it had the same issue as 6.3.2.3. So I set the backup firmware as the boot firmware, reloaded the switches, and all was OK again.

DO NOT tag Port-Channel 100 on any VLAN. That will cause connectivity problems. The VLT will transmit the data over the link between the switches by itself if traffic comes in on one switch and has to go out of a port on the second switch.

Now duplicate the setup on the other switch. Make sure the system-mac address in the VLT setup is identical on both sides, and set the back-up destination to different IP addresses on the same subnet.
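As a rough illustration of the rule above, here is a small Python sketch that checks the two VLT values for a pair of switches before you paste them in. This is not part of the switch config. The function name check_vlt_peers, the dictionary keys, the example addresses and the /24 prefix length are all made up for the example, so adjust them to your own management network.

```python
import ipaddress


def normalize_mac(mac):
    """Lowercase a MAC address and strip the usual separators."""
    return mac.lower().replace(":", "").replace("-", "").replace(".", "")


def check_vlt_peers(peer_a, peer_b, prefix_len=24):
    """Return a list of problems found in two VLT peer settings.

    Each peer is a dict with the hypothetical keys 'system_mac' and
    'backup_destination'. prefix_len is an assumed subnet size.
    """
    problems = []

    # The system-mac must be identical on both sides.
    if normalize_mac(peer_a["system_mac"]) != normalize_mac(peer_b["system_mac"]):
        problems.append("system-mac differs between peers")

    ip_a = ipaddress.ip_interface(f"{peer_a['backup_destination']}/{prefix_len}")
    ip_b = ipaddress.ip_interface(f"{peer_b['backup_destination']}/{prefix_len}")

    # The back-up destinations must be different addresses...
    if ip_a.ip == ip_b.ip:
        problems.append("back-up destination is the same address on both peers")

    # ...but they must sit on the same subnet.
    if ip_a.network != ip_b.network:
        problems.append("back-up destinations are not on the same subnet")

    return problems


# Example values only (documentation address range).
switch1 = {"system_mac": "00:11:22:33:44:55", "backup_destination": "192.0.2.11"}
switch2 = {"system_mac": "0011.2233.4455", "backup_destination": "192.0.2.12"}
print(check_vlt_peers(switch1, switch2))
```

An empty list means the two values follow the rule. Anything else is a description of what does not match.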

I’ve just come across your blog, having followed you on Twitter for a while. I’ve been playing with S2D since December and have had lots of issues causing a BSOD. I have a case open with Microsoft, and they’re looking at lsass.exe as the cause. Whilst they’re doing that, I’ve been doing a lot of troubleshooting myself. I’ve just noticed I’m getting a few “Packets Received Discarded” on the Mellanox Traffic Counters. It’ll start with about 811 and then hold that count. I’m not sure if that’s normal? Is it because one of the nodes is booting up? Anyway, I’m okay with the client side of things, but the Dell N4000 series switches are completely new to me. We’re a school so finances are tight, however we’ve been lucky to be able to buy two stacked N4032s. There’s not a whole lot on the Internet about configuring a two-node RDMA setup using these other than your post. I’m wondering if I’m missing something that causes a few discarded packets. Is this a result of the firmware issue you found?

Keep up the good work. People like me really appreciate the work you guys put into sharing experiences.

Are you running the N4032F with SFP+? When it comes to RDMA, make sure you are running the correct N4000 firmware, which is 6.3.0.16, otherwise no-drop will not work. Follow my config for the N4000. All the config is set on the interfaces for the S2D servers.

Also use the standard MS config for RDMA on Windows. What Mellanox cards are you using? ConnectX-3 EN? And are the servers brand new, or is this just for testing?