Sunday, March 20, 2011

After a long delay, let's pick up where we left off with our OTV deep dive. This post will focus on a key OTV concept that is critical to understand: how we localize our First Hop Redundancy Protocols (FHRPs). These protocols are Hot Standby Router Protocol (HSRP v1 and v2), Virtual Router Redundancy Protocol (VRRP), and Gateway Load Balancing Protocol (GLBP). They allow two network devices to share a common IP address to be used as the default gateway on a subnet, providing redundancy and load balancing to clients in that subnet.

Before we can discuss FHRP localization, let's review why it might be significant to our design. Typically with FHRPs the members of the group are local to each other both logically and physically. Depending on the FHRP, traffic is load balanced or redirected between the devices to the "active" member. This works well when everything is local, and most of us use it without a second thought.

When we start to stretch or extend our VLANs across distances, latency is introduced. While a 1ms one-way latency may not sound significant, accumulated over a complete flow or transaction it can become quite detrimental to performance. This is exacerbated when two devices are in the same location but have their default gateway in another data center: suboptimal switching and routing at its finest. This effect is referred to as tromboning traffic and is illustrated below, where device A needs to talk with device B and the default gateway resides across a stretched VLAN.

We address this with OTV by implementing filters that prevent the FHRP peers in opposite data centers from seeing each other, thereby localizing the FHRP in each site. There are two approaches: one uses a MAC access list, which we won't cover; the other, recommended, approach uses an IP ACL applied as a VLAN ACL (VACL). To be fair, both work equally well in my experience, but the IP ACL is easier to operationalize, and I am a staunch believer in making networks easier to maintain and avoiding what I refer to as Science Fair Projects. We've all worked on, inherited or (hopefully not!) created a Science Fair Project - let's avoid that. ;)

This access list matches the multicast addresses for HSRPv1 and HSRPv2, though it can be modified for VRRP and GLBP. The access list is then applied as a VACL to filter the FHRP hellos from entering OTV through the internal interfaces. The VACL looks like the below, where we'll filter HSRP on VLANs 31-33.
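A sketch of such a filter, matching the well-known HSRPv1 (224.0.0.2) and HSRPv2 (224.0.0.102) destinations on UDP port 1985 - the ACL and access-map names here are illustrative, not from the original config:

```
ip access-list ALL_HSRP
  10 permit udp any 224.0.0.2/32 eq 1985
  20 permit udp any 224.0.0.102/32 eq 1985

vlan access-map HSRP_Localize 10
  match ip address ALL_HSRP
  action drop
vlan access-map HSRP_Localize 20
  action forward

vlan filter HSRP_Localize vlan-list 31-33
```

The second access-map sequence with action forward is what lets all other traffic in those VLANs pass untouched; only the HSRP hellos are dropped.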

If you are like me and want to verify your VACL is applied and matching, the steps are not as easy as we'd like them to be, but the capability does exist. *NOTE* that I am not responsible for you monkeying around with any of the other commands available when you attach to the module. You've been warned. :)

The first thing to do is attach to the module where your internal interfaces physically reside. In the example below, it's module 1. If your OTV is configured in a non-default VDC, you'll need to set the parser to use that VDC as below.
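A rough sketch of those steps - the module number is from this example, the VDC number is an assumption, and the exact internal show commands vary by NX-OS release, so treat this as a starting point rather than gospel:

```
cmhlab-dc2-sw2-otv1# attach module 1
module-1# vdc 2
module-1# show system internal access-list vlan 31 input statistics
```

Counters incrementing against the drop entry are your confirmation that HSRP hellos are being filtered before they reach the Overlay.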

With this configuration, the FHRP in each data center will be locally active and mitigate the tromboning we mentioned earlier. This has a significant impact in that now we only send traffic across the Data Center Interconnect (DCI) that needs to go across as the local routers in each site can service the traffic.

Note that this technique is useful for optimizing egress traffic but does nothing to help draw or “attract” traffic into the right data center. Other technologies that provide that functionality will be the topic of future blogs. ;)

One last step to undertake when performing FHRP isolation is to exclude the FHRP MAC addresses from being advertised by OTV. You might be thinking OTV won't know about the FHRP MACs because of the VACL, right? Wrong. :) Due to the nature of MAC address learning, OTV will learn the MAC addresses before the VACL drops them, so we need to tell OTV not to advertise them. This is a three-part process: we'll define the MAC access list, add it to a route-map and then apply it to the OTV IS-IS process as shown below.
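A sketch of that three-part config, assuming the standard HSRP virtual MAC ranges (0000.0c07.acxx for v1, 0000.0c9f.fxxx for v2); the mac-list and route-map names are illustrative:

```
mac-list OTV_HSRP_VMAC_deny seq 10 deny 0000.0c07.ac00 ffff.ffff.ff00
mac-list OTV_HSRP_VMAC_deny seq 11 deny 0000.0c9f.f000 ffff.ffff.f000
mac-list OTV_HSRP_VMAC_deny seq 20 permit 0000.0000.0000 0000.0000.0000

route-map OTV_HSRP_filter permit 10
  match mac-list OTV_HSRP_VMAC_deny

otv-isis default
  vpn Overlay1
    redistribute filter route-map OTV_HSRP_filter
```

The trailing permit-any in the mac-list matters: without it, the route-map would stop advertising every MAC, not just the virtual ones. If you run VRRP or GLBP, you'd add their virtual MAC ranges to the same list.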

Sunday, February 20, 2011

Now that we've covered OTV theory and nomenclature, let's dig in to the fun stuff and talk about the CLI and what OTV looks like when it's setup. We'll be using the topology below comprised of four Nexus 7000s and eight VDCs.

We'll focus first on the minimum configuration required to get basic OTV adjacency up and working and then add in multi-homing for redundancy. First, make sure the L3 network that OTV will be traversing is multicast enabled. Today with current shipping code, neighbor discovery is done via multicast which helps facilitate easy additions and removal of sites from the OTV network. With this requirement met, we can get rolling.
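On the routers in the transit network, enabling multicast might look like the sketch below - the RP address and the sparse-mode design are illustrative assumptions, not details from this topology:

```
feature pim
ip pim rp-address 10.1.1.1 group-list 239.192.0.0/16

interface Ethernet1/7
  ip pim sparse-mode
```

The OTV edge devices themselves don't run PIM on the join interface; they behave like multicast hosts and use IGMP to join the control group.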

A simple initial config is below and we'll dissect it.

First, we enable the feature:

feature otv

Then we configure the Overlay interface:

interface Overlay1

Next we configure the join interface. This is the interface that will be used for the IGMP join and will be the source IP address of all packets after encapsulation:

otv join-interface Ethernet1/7.1

Now we'll configure the control group. As its name implies, the control group is the multicast group used by all OTV speakers in an Overlay network. This should be a unique multicast group in the multicast network:

otv control-group 239.192.1.1

Then we configure the data group, which is used to encapsulate any L2 multicast traffic that is being extended across the Overlay. Any L3 multicast will be routed off of the VLAN through whatever regular multicast mechanisms exist on the network:

otv data-group 239.192.2.0/24

Next-to-last bare minimum config to add is the list of VLANs to be extended:

otv extend-vlan 31-33,100,1010,1088-1089

Finally, no shut to enable the interface:

no shutdown

We can now look at the Overlay interface but honestly, won't see much. Force of habit after a no shut on an interface. :)

show int o1
Overlay1 is up
BW 1000000 Kbit
Last clearing of "show interface" counters never
RX
0 unicast packets 77420 multicast packets
77420 input packets 574 bits/sec 0 packets/sec
TX
0 unicast packets 0 multicast packets
0 output packets 0 bits/sec 0 packets/sec

If we configure the other hosts in our network and multicast is working, we'll see adjacencies form as below.

With this in place, we now have a basic config and will be able to extend VLANs between the four devices.

The last thing we'll cover in this post is how multi-homing can be enabled. First, to level set: multi-homing in this context refers to the ability to have redundancy in each site without creating a crippling loop.

The way this is accomplished in OTV is by the use of the concept of a site VLAN. The site VLAN is a VLAN that's dedicated to OTV and NOT extended across the Overlay but is trunked between the two OTV edge devices. This VLAN doesn't need any IP addresses or SVIs created, it just needs to exist and be added to the OTV config as shown below.

otv site-vlan 99

With the simple addition of this command, the OTV edge devices will discover each other locally and then use an algorithm to determine the role each edge device will assume on a per-VLAN basis. This role is called the Authoritative Edge Device (AED). The AED is responsible for forwarding all traffic for a given VLAN, including broadcast and multicast traffic. Today the algorithm aligns with the VLAN ID, with one edge device supporting the odd-numbered VLANs and the other supporting the even-numbered VLANs. This can be seen by reviewing the output below.

1000 champs2-OTV inactive(Non AED) Overlay1
1010 champs2-OTV inactive(Non AED) Overlay1
1088 champs2-OTV inactive(Non AED) Overlay1
1089* champs1-OTV active Overlay1

If we look at the output above we can see that this edge device is the AED for VLANs 31, 33 and 1089 and is the non-AED for 32, 1000, 1010 and 1088. In the event of a failure of champs2, champs1 will take over and become the AED for all VLANs.

We'll explore FHRP localization and what happens across the OTV control group in the next post. As always, your thoughts, comments and feedback are welcome.

Wednesday, February 16, 2011

I've been meaning to do this for a long time and now that I have the blog and am awake in the hotel room at 3AM, what better thing to do than talk about a technology I've been fortunate enough to work with for almost a year. This will be a series of posts as I'd like to take a structured approach to the technology and dig into the details and mechanics as well as operational aspects of the technology.

Overlay Transport Virtualization (OTV) is a feature available on the Nexus 7000 series switches that enables extension of VLANs across Layer 3 networks. This enables new options for data center scale and design that have not been available in the past. The two common use cases I've worked with customers to implement are data center migration and workload mobility. Interestingly, many people jump straight to a multiple physical data center scenario, start to consider stretched clusters and worry about data sync issues. While OTV can provide value in those scenarios, it is also a valid solution inside the data center, where L3 interconnects may segment the network but the need for mobility is present.

OTV is significant in its ability to provide this extension without the hassles and challenges associated with traditional Layer 2 extension, such as merging STP domains, MAC learning and flooding. OTV is designed to drop STP BPDUs across the Overlay interface, which means STP domains on each side of the L3 network are not merged. This is significant in that it minimizes fate sharing, where an STP event in one domain ripples to other domains. Additionally, OTV uses IS-IS as its control plane to advertise MAC addresses and provide capabilities such as loop avoidance and optimized traffic handling. Finally, OTV doesn't have state that needs to be maintained as is required with pseudowire transports like EoMPLS and VPLS. OTV is an encapsulating technology and as such adds a 42 byte header to each frame transported across the Overlay. Below is the frame format in more detail.

We'll start by defining the components and interfaces used when discussing OTV. Refer to the topology below.

We have a typical data center aggregation layer based on Nexus 7000 which is our boundary between Layer 2 and Layer 3. The two switches, Agg1 and Agg2 utilize a Nexus technology, virtual Port Channel (vPC) to provide multi-chassis Etherchannel (MCEC) to the OTV Edge devices. In this topology, the OTV edge devices happen to be Virtual Device Contexts (VDC) that share the same sheet metal as the Agg switches but are logically separate. We'll dig into VDCs more in future blog posts, but know that VDCs are a very, very powerful feature within NX-OS on the Nexus 7000.

Three primary interfaces are used in OTV. The internal interface, as its name implies, is internal to OTV and is where the VLANs that are to be extended are brought in to the OTV network. These are normal Ethernet interfaces running at Layer 2 and can be trunks or access ports depending on your network's needs. It is important to note that the internal interfaces *DO* participate in STP and as such, considerations such as root guard and appropriate STP prioritization should be taken into account. In most topologies you wouldn't want, or need, the OTV edge device to be the root, though if that works in your topology, OTV will work as desired.

The next interface is the join interface which is where the encapsulated L2 frames are placed on the L3 network for transport to the appropriate OTV edge device. The join interface has an IP address and behaves much as a client in that it issues IGMP requests to join the OTV multicast control group. In some topologies it is desirable to have the join interface participate in a dynamic routing protocol and that is not a problem either. As mentioned earlier, OTV encapsulates traffic and adds a 42 byte header to each packet so it may be prudent to ensure your transit network can support packets larger than 1500 bytes. Though not a requirement, performance may suffer if jumbo frames are not supported.
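Since the 42 byte OTV header pushes a full-size 1500 byte frame past the standard MTU, one common approach is to raise the MTU on the join interface's parent and across the transit network. A sketch, where 9216 is an assumption - size it to what your transport actually supports:

```
interface Ethernet1/7
  mtu 9216
```

The same MTU needs to be honored hop by hop across the DCI, since OTV does not fragment the encapsulated frames.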

Finally, the Overlay interface is where OTV specific configuration options are applied to define key attributes such as multicast control groups, VLANs to be extended and join interfaces. The Overlay interface is where the (in)famous 5 commands to enable OTV are entered, though anyone who's worked with the technology recognizes that more than 5 commands are needed for a successful implementation. :) The Overlay interface is similar to a Loopback interface in that it's a virtual interface.

In the next post, we'll discuss the initial OTV configuration and multi-homing capabilities in more detail. As always, I welcome your comments and feedback.

Saturday, February 12, 2011

One of the most difficult components in any data center architecture to design and plan for is the access layer. In a traditional network hierarchy the access layer is where the most dynamic and changing requirements exist. Myriad technologies abound and can tell a history of the data center as new technologies were introduced with the progression from 100Mb Ethernet to 1G to 10G and the emergence of Unified Fabric (FCoE). Scaling these access layers has been a black art at times because of the changing pace of technology. What if you could have an access layer that meets your current 100M/1G Ethernet needs as well as 10G, provides a reduction in management points and helps tame the Spanning Tree beast? Enter the Nexus 7000 with support for Nexus 2000 Fabric Extenders (FEX).

The Nexus 7000s have been shipping for close to 3 years now, have a well established install base and mature software, and have proven themselves as scalable Data Center platforms. The Nexus 2000 has been shipping for over 2 years and has been solving access layer challenges for customers very well when paired with the Nexus 5000 switch. Combining the two technologies provides similar benefits to the traditional FEX architectures, only at a larger scale. Today the Nexus 5000 series supports up to a maximum of 16 FEX, while the Nexus 7000 supports 32 with current code and plans for more in the future. Let’s dig into the details.

First, what are the requirements for FEX support on the Nexus 7000? Three primary requirements must be met:

1. NX-OS 5.1(1) or higher must be installed on the Nexus 7000
2. 32 port M1 10GE modules (part number)
3. EPLD must be current to support VNTag

Once these requirements are met we can connect the FEX to the Nexus 7000. The options supported include traditional 10G Short Reach (SR) optics, 10G Long Reach (LR) optics and the Fabric Extender Transceiver (FET) for the M1 32 port card. The M1 32 “L” card adds support for active Twinax cables, which are currently available in 7 and 10m lengths. In our example, we’ll be using SR optics.

Let’s start by verifying we meet the requirements. We see below we are running NX-OS 5.1(2) so we’re good to go there.

cmhlab-dc2-sw2-agg1# show ver
Cisco Nexus Operating System (NX-OS) Software
TAC support: http://www.cisco.com/tac
Documents: http://www.cisco.com/en/US/products/ps9372/tsd_products_support_series_home.html
Copyright (c) 2002-2010, Cisco Systems, Inc. All rights reserved.
The copyrights to certain works contained in this software are
owned by other third parties and used and distributed under
license. Certain components of this software are licensed under
the GNU General Public License (GPL) version 2.0 or the GNU
Lesser General Public License (LGPL) Version 2.1. A copy of each
such license is available at
http://www.opensource.org/licenses/gpl-2.0.php and
http://www.opensource.org/licenses/lgpl-2.1.php

Now let’s check the EPLD.

*NOTE* This must be done from the default VDC, and if an EPLD upgrade is required, it is disruptive, so plan accordingly.

cmhlab-dc2-sw2-otv1# install all epld bootflash:n7000-s1-epld.5.1.1.img

So we’re in good shape there, too. It’s like I’ve done this before….. :)

Now that we’re ready, we’ve cabled the FEX to the switch via ports e3/1-4 and we’ll be creating a topology that looks like this.

First, we need to install the FEX feature set. This is a bit different than what we’ve done with features in the past and must be done from the default VDC.

cmhlab-dc2-sw2-otv1# show run | i fex
cmhlab-dc2-sw2-otv1# conf t
Enter configuration commands, one per line. End with CNTL/Z.
cmhlab-dc2-sw2-otv1(config)# install feature-set fex
cmhlab-dc2-sw2-otv1(config)# show run | i fex
install feature-set fex
allow feature-set fex
allow feature-set fex
allow feature-set fex
allow feature-set fex
cmhlab-dc2-sw2-otv1(config)#

Note that each VDC now has a config for allow feature-set fex.

Next, we’ll go to our VDC where we want the FEX configured and get it set up.

cmhlab-dc2-sw2-agg1# confi
Enter configuration commands, one per line. End with CNTL/Z.
cmhlab-dc2-sw2-agg1(config)# feature-set fex

Then we’ll define the FEX and specify the model. While this isn’t required because the FEX will identify itself to the Nexus switch, I think it makes the config more readable and is somewhat self documenting.
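As a sketch of what that definition and the fabric uplinks might look like - the FEX number, description and model type here are illustrative assumptions, with only the e3/1-4 ports taken from this example:

```
fex 101
  description Rack12-FEX101
  type N2248T

interface Ethernet3/1-4
  switchport
  switchport mode fex-fabric
  fex associate 101
  channel-group 101

interface port-channel101
  switchport
  switchport mode fex-fabric
  fex associate 101
```

Once the fabric links come up, the FEX ports appear on the parent switch as regular Ethernet101/1/x interfaces and are configured there.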

Note that today we cannot have a FEX multi-homed into two Nexus 7000s like we can on the Nexus 5000. Look for that capability in a future release along with support for additional FEX platforms.

When you think of the scale – 32 FEX x 48 ports = 1,536 ports – that’s pretty impressive. Being able to take advantage of the cable savings with localized, in-rack cabling without the challenges of increased STP diameter, the FEX and Nexus 7000 make a powerful impact on the data center topology.

Sunday, January 23, 2011

How many times have you filled out a change control document to upgrade code on your network devices, detailing the redundancy, the portions of the network impacted and the application owners notified, only to have it rejected due to "impact"? Prior to my current job at Cisco, this was a common theme. I wished I had a device that would let me roll code without impacting traffic. Fast forward a few years and my wishes have come true with In Service Software Upgrade (ISSU) within NX-OS.

A brief history lesson - Storage switches have had this capability for a long time in the higher end platforms that are considered director class. It makes sense to have ISSU functionality on fibre channel switches because fibre channel as a protocol relies on the network to guarantee delivery of frames. Dropping frames means bad things for storage traffic. Moving the capability for ISSU to Ethernet/IP networks makes sense in a modern data center where high density virtualization and the "always on" mindset prevail. Networking teams have been clamoring for ISSU for a long time. Let's face it, rolling code isn't one of the more exciting things to do on a network, but it's a necessary function. The good news is that we now have it.

We'll focus on ISSU on the Nexus series of devices, though know that other products in Cisco's portfolio support it. To provide a hitless upgrade capability, the device and software require an intrinsic separation of the control plane and data plane. This allows changes to be made in the control plane, like software version, without affecting the data plane, through which the packets and frames that traverse the device pass. NX-OS has been engineered from day one to have this separation of planes. Couple that with years of experience in ISSU on the Cisco MDS, and one of my favorite features of NX-OS is born.

So enough talk, let's get into the action. To start an ISSU we use the install all command as shown below where we specify the kickstart image and system image to use.
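The command takes the new kickstart and system images; a sketch with assumed image filenames (substitute the versions you are installing):

```
cmhlab-dc2-sw2-otv1# install all kickstart bootflash:n7000-s1-kickstart.5.1.2.bin system bootflash:n7000-s1-dk9.5.1.2.bin
```

The installer then runs its compatibility checks before touching anything, which is where the per-module impact table below comes from.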

Once that is completed, the install routine also shows the type of upgrade per module, reflecting a rolling upgrade for line cards and reset for the supervisors. Rolling upgrades are non-disruptive as the modules have been engineered to provide this functionality and not drop link to ports or disrupt switching.

Compatibility check is done:

Module bootable Impact Install-type Reason

------ -------- -------------- ------------ ------

2 yes non-disruptive rolling

5 yes non-disruptive reset

6 yes non-disruptive reset

9 yes non-disruptive rolling

Finally, a nice table is presented showing the details of the upgrade, and the installer waits for the green light to continue.

At this point, the supervisor that was the secondary (module 6 in my example) has reloaded and come up with the new code. This triggers the primary to initiate a Stateful Switch Over (SSO) to the new code running in the control plane. Meanwhile, data is still traversing the switch with no impact. :)

Since our telnet session was disconnected during the SSO (telnet isn't SSO aware), we need to re-establish the session and issue a command to continue monitoring the upgrade.

rfuller@cmhlab-tools:~$ telnet cmhlab-dc2-sw2-otv1

Trying 10.2.0.4...

Connected to cmhlab-dc2-sw2-otv1.csc.dublin.cisco.com.

Escape character is '^]'.

User Access Verification
login: admin
Password:
Cisco Nexus Operating System (NX-OS) Software
TAC support: http://www.cisco.com/tac
Copyright (c) 2002-2010, Cisco Systems, Inc. All rights reserved.
The copyrights to certain works contained in this software are
owned by other third parties and used and distributed under
license. Certain components of this software are licensed under
the GNU General Public License (GPL) version 2.0 or the GNU
Lesser General Public License (LGPL) Version 2.1. A copy of each
such license is available at
http://www.opensource.org/licenses/gpl-2.0.php and
http://www.opensource.org/licenses/lgpl-2.1.php

cmhlab-dc2-sw2-otv1# show install all status
There is an on-going installation...
Enter Ctrl-C to go back to the prompt.
Continuing with installation, please wait

Module 2: Non-disruptive upgrading.
-- SUCCESS
Module 9: Non-disruptive upgrading.
-- SUCCESS
Install has been successful.

With that, we've upgraded our NX-OS, had the system automatically copy the files to the right locations and modify the boot values, and didn't drop a frame. How's that for hot?

cmhlab-dc2-sw2-otv1# show ver | i uptime

Kernel uptime is 0 day(s), 0 hour(s), 26 minute(s), 50 second(s)

*NOTE* The kernel has only been up for a short while, but we'll see that the overall system has been up much longer.

Tuesday, January 18, 2011

I finally decided I needed to do some blogging, so here we go. Before we get into the fun stuff, let's talk a bit about who I am. This will help you decide if you are in the right place or not.

My name is Ron Fuller and I work as a Technology Solutions Architect with Cisco in Dublin, Ohio. I work with our Enterprise customers on data center architecture, which means I'm not a product guy per se. Architectures can be enabled by a product or suite of products, though I happen to think some enable it better than others. ;) I am a dual CCIE #5851 (Routing and Switching and Storage Networking) and have held a myriad of certifications from other vendors including Novell - where I started my certification track and was a Master CNE, VMware, SNIA, Microsoft, HP, Okidata, IBM, ISC2, CompTIA and more. Certifications were a focal point for me early in my career and certainly opened doors that would have otherwise remained closed in tough times.

I have had the opportunity to be published a few times and my latest effort was a collaboration with two great guys who I am lucky to call friends as well, David Jansen and Kevin Corbin. We created NX-OS and Cisco Nexus Switching: Next-Generation Data Center Architectures with CiscoPress. The book was released last June and we're already working on a 2nd Edition because of the many changes and innovations NX-OS has brought to market in the last few months and those coming! I have a passion for NX-OS and if you've been following me on Twitter (@ccie5851) you might have picked up on it. ;) I have a sticker on my laptop that says it all.

On a personal front, my wife and I have four awesome, smart, creative, cute....you get the picture...kids. We live north of Columbus OH and love to travel- WITH the kids - especially if there is a F1 race involved. We've become very adept at long haul travel with kids and have taken them with us to Japan, England, France, Germany, Australia and our last big adventure, China. I may blog about the science of traveling with little ones in the future. We think we've got a good system but may be biased.

As I mentioned earlier, F1 is a great excuse to travel and for that matter, I'm a fan of most autosports though F1 holds a special place in my heart. It is the perfect integration of technology (I'm a geek after all!) and speed, exotic locations and competition. I do watch Indycar and it's probably best to say I monitor NASCAR. NASCAR has so many races and they are so long that it becomes quite the commitment to actually WATCH every race. I still miss the days of Dale and Rusty beating and banging on each other, but as with all things, change happens.

I'm sure more of my idiosyncrasies will emerge as I write, but know that I plan to discuss NX-OS and Nexus switching, some UCS action, MDS and whatever else comes up. It's an exciting time in the Data Center space and I couldn't be happier to be hip-deep in the action!


About Me

Field Engineer at VMware focused on NSX though blog posts are all my own. Husband, father, F1 fanatic and geek.
Ron Fuller is a Staff Engineer in the Network and Security Business Unit (NSBU) focused on NSX for VMware. He has 22 years of experience in the industry and has held certifications from VMware, Novell, HP, Microsoft, ISC2, SNIA, and Cisco including two CCIEs No. 5851 (Routing and Switching/Storage Networking). His focus is working with customers to address their challenges with comprehensive end-to-end Data Center architectures and how they can best utilize VMware technology to their advantage. He is the co-author of the VMware Press NSX Fundamentals LiveLesson video series. This adds to his existing body of work with CiscoPress. He has had the opportunity to speak in Europe, Australia and the United States on multiple networking and security topics. He lives in Ohio with his wife and four wonderful children and enjoys travel and auto racing. He can be found on Twitter @ccie5851.