Tacking with Captain Ron and the Nexus 7000

So, I am watching Captain Ron (hey, we all have our guilty pleasures) and it reminded me of a conversation I had with a customer back in Vegas at VMworld. The customer was giving me some friendly grief about the recent NetworkWorld lab test of the Cisco Nexus 7000 and the high availability features of the platform. At this point, you are probably wondering where Captain Ron figures into this. Well, in pretty much every nautically-themed comedy, there is a scene where the captain is tacking and, as the wind shifts, the boom shifts sides and knocks someone into the water. I see the same thing happen with vendors, where a sea change (yes, pun intended) in technology will catch a vendor off-guard, and boom, into the water they go.

So, in this case, the customer took exception to David Newman’s stateful process restart test of OSPF (256 adjacencies, advertising traffic to 50K routes, traffic on all networks), where he was able to kill the OSPF process without losing a packet or causing the routes to be recalculated. The customer’s contention was that a better test would have been to see how quickly the Nexus 7000 could re-converge the network. I told the customer this sounded like having an argument about whose airbag deploys faster when it would be better if the brakes worked well enough that we didn’t hit things in the first place.

In today’s operational environment, minimizing service loss is, to paraphrase my boss, using yesterday’s thinking to solve tomorrow’s problems. To meet these increasing demands in the data center, the conversation needs to evolve from “how do we drop fewer packets” to “how do we not drop any packets in the first place,” and the data center infrastructure needs to keep up... wait, was that a splash I just heard?

1 Comment.

Omar, have to agree. Not sure why in that scenario a reconvergence test would be appropriate at all.

Certainly testing reconvergence is a good thing to do. Networks converge when links fail and for a host of other reasons. Knowing how quickly a device can converge after these types of events is valuable. However, some of these parameters are gated by the protocol standards themselves, and 'tuning' past a certain point can start creating problems.

The purpose of the test NWW ran was somewhat different. Software is fallible. Let's accept that as a fact of life. As such, we have all seen the OSPF LSDB get corrupted and had to go through fixing it in a CCIE exam - usually by rebooting the router. Option A: reboot. Option B: restart the process. If you can restart the process without dropping a single packet, I would imagine that would be the preferred method of correcting this problem.
