LIFE of a NETwork admin
A blog focused on networking and other ideas

Extreme Networks – EXOS BGP MED
Tue, 19 Mar 2019

I came across a scenario where a user had two data centers in different locations connecting back to the same ISP via BGP. Each data center advertised its own unique /24. However, the user also wanted each site to advertise the other DC's /24, but in an inactive state for failover. Since the user connected back to the same provider AS at both sites, I decided to test the BGP MED (Multi Exit Discriminator) attribute to determine which /24 would be the preferred route from the provider end. The route with the lowest MED value takes priority.

We're using Extreme Networks Summit series switches, so I tested the configuration on EXOS 22.6 using my EXOS virtual lab. I applied a lower MED value to the /24 I wanted to prioritize at each primary site and a higher MED value to the backup /24 advertised from the opposite site.

On Summit series switches you start with a policy file that matches the network address used in the BGP network statement, then apply a MED value to that match. EXOS uses vi when creating these policy files. Here are the commands:
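Here's a reconstructed sketch of the policy file and commands. The policy name, prefix, and neighbor address are my own placeholders; only the med set 100; action is taken from this post, so check the syntax against your EXOS release:

```
# Create/edit the policy (EXOS drops you into vi):
edit policy dc1-primary

# dc1-primary.pol - match the prefix from the BGP network statement
# and set the MED (lower MED wins from the provider's perspective):
entry dc1_primary_med {
    if match all {
        nlri 203.0.113.0/24;
    } then {
        med set 100;
        permit;
    }
}

# Validate and apply outbound toward the provider neighbor:
check policy dc1-primary
configure bgp neighbor 198.51.100.1 route-policy out dc1-primary
```

A mirrored policy with a higher MED (say 200) would then be applied to the backup /24 at the opposite site.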

In my lab, I used two different AS numbers representing each DC connected back to the same provider AS. Therefore I also had to use the "enable bgp always-compare-med" command on the simulated provider's EXOS virtual switch, as MED values are not compared between routes advertised from different autonomous systems by default.

Of course, your provider has to be willing to accept MED. If not, you could also try prepending your AS number to the AS_PATH, which is another way to make a route less preferred. However, this method is not always reliable, as some providers ignore duplicate ASNs in the AS_PATH. The change is simple: just replace med set 100; with as-path "2020"; in the policy file. This example prepends AS 2020 and should be applied to the BGP network statement that serves as the backup route at the opposite DC location.
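For reference, the same policy entry with the MED action swapped for the prepend (again a sketch; the prefix is a placeholder of mine, while the as-path "2020"; line comes from the post):

```
entry dc2_backup_prepend {
    if match all {
        nlri 203.0.113.0/24;
    } then {
        as-path "2020";
        permit;
    }
}
```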

As a systems engineer for Extreme Networks, I like to get as much hands-on lab gear as I can within a reasonable budget. I have quite a large lab setup at home, as you can see.

One of my goals was to build something a bit more portable and powerful enough to run ESXi with a few VMs. I also like things that don't draw too much power sitting idle. My test lab configurations usually consist of different virtual network operating systems such as Extreme Networks EXOS, as well as Extreme virtual Wi-Fi controllers, Extreme Control VMs, and a host of other VMs. I usually don't generate large amounts of traffic or massive compute load in my lab. If I need more computing power, I move to my physical switches and higher-end Sandy Bridge-based Xeon servers.

The x86 Based ODROID H2

After a bit of hunting, I found a nice and small Supermicro rig, the SYS-E200. You can check out an awesome portable rig built using a few SYS-E200 units at tinkertry.com. One of the downsides is that the SYS-E200 can be quite expensive per unit, so I started to look for a smaller x86 system-on-chip (SoC) solution. SoCs tend to be cheaper and draw less power, but the downside is that they typically aren't very powerful. I then came across the x86 SoC ODROID-H2 system by Hardkernel, which sports a Gemini Lake Intel CPU with VT-x virtualization support. This board looked like a perfect small and portable solution to run ESXi on. I quickly placed a pre-order as the board started at just $112. Here are the full specifications:

This tiny x86 lab box has some impressive specifications. I didn’t need lots of CPU power, but with dual SATA ports, 32GB max RAM, M.2 support, VT-x, and 2 Gigabit interfaces I couldn’t pass this board up. I’m glad I preordered because the units sold out pretty quick. Here are the total build costs so far:

The power switch isn’t required as the board does have a small power and reset button that’s accessible through a small opening in the ODROID case, but I thought it would function a bit better with a larger power button. You could even run with NVMe or eMMC storage only and go with the smaller Type 2 case.

Once I received the board, I inserted the first 4GB RAM module (Patriot), and to my dismay, I couldn't get the unit to POST. Nothing I tried worked. I posted to the ODROID forum and noticed other people were having issues with various RAM modules. I quickly ordered a second RAM stick that was on the officially supported list, and the board finally posted into the BIOS. I would have gone with a larger-capacity module but didn't want to spend the extra cash in case the board was DOA. One thing is for sure: this unit is picky about RAM, so make sure you order from the official Hardkernel supported list. From reading through the rest of the forums, a BIOS update is in the works to correct some of the RAM compatibility issues.

ODROID H2 and ESXi

The next task was to get ESXi installed. I did some preliminary research on the NICs, which are Realtek RTL8111G units. I quickly found that newer versions of ESXi don't have these drivers baked in, so I started looking at how to add the drivers to an ESXi image. I followed a how-to article from sysadminstories.com. I started with the ESXi 6.5.0 Update 2 offline bundle, as I couldn't find a 6.7 offline bundle for free ESXi.

Once you have a USB ESXi image ready, plug it in, set the BIOS boot order to USB, and install ESXi. I decided to install a SATA SSD in my ODROID as I had an extra 64GB drive lying around, which saved on the build cost. I also had to manually modify precheck.py during the ESXi installation since the system detected less than 4GB of RAM while my other RAM module wasn't working. Another article from simon-simonnaes.blogspot.com shows the steps.
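That memory pre-check lives in the installer's upgrade_precheck.py, where a MEM_MIN_SIZE constant is compared against detected RAM. As a rough sketch of the one-line change (hedged: the exact expression and file path vary by ESXi build, and the constant shown here comes from community write-ups, not from verifying every image):

```python
import re

def lower_mem_check(src: str, new_gib: int = 2) -> str:
    """Rewrite the installer's minimum-RAM constant so ESXi will
    install on a box reporting less than 4GB of memory."""
    return re.sub(r"MEM_MIN_SIZE\s*=\s*\(\s*\d+\s*\*\s*1024",
                  f"MEM_MIN_SIZE = ({new_gib} * 1024", src)

# The stock line, per community guides, looks roughly like this:
patched = lower_mem_check("MEM_MIN_SIZE = (4 * 1024 - 32) * SIZE_MiB")
# -> "MEM_MIN_SIZE = (2 * 1024 - 32) * SIZE_MiB"
```

In practice you make this edit in the installer shell (Alt+F1) before letting the install proceed.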

I then loaded up one of my favorite virtual network operating systems, the Extreme Networks virtual EXOS ISO from the Extreme Networks GitHub page. If you get the following error during the EXOS installation, "mount: mounting /dev/hdc on /mnt/a failed: No such device or address", make sure your VM CD-ROM drive is on IDE controller 1 and set to Master in VMware. I assigned the first Realtek NIC (on vSwitch0) to port 2 of EXOS and the second Realtek NIC (on a new vSwitch) to port 3. I'm now running a two-port virtual EXOS system that bridges traffic across both ODROID-H2 NICs.

With three H2s, I could also try and build an HCI demo lab possibly running Nutanix CE. A FreeNAS build would also be a cool project. You’d have to check driver support, but as of the end of December 2018, the ODROID H2 is still on back order. Overall, this was a pretty fun build.

Welcome to Extreme Networks
Tue, 23 Oct 2018

Since my last post, I've had quite a few life-changing events occur. I recently accepted a position with Extreme Networks as a Senior Systems Engineer and moved from Northwest Indiana to North Carolina. How did that happen after almost eight years at my previous role and fresh into an interim position? Well, my wife and I had been thinking about relocating for quite some time. We had visited numerous warmer states within the last couple of years and ended up revisiting Raleigh, North Carolina quite a few times. The only thing stopping us from making the jump was a job, of course. However, I didn't want to apply for just any IT job, so I spent quite some time searching and applying to specific positions. One of those happened to be with Extreme Networks. At Purdue Northwest, we were a legacy Enterasys customer that transitioned to Extreme Networks after its acquisition of Enterasys. I'd become very familiar with Extreme Networks products and had showcased how we used them at PNW numerous times. I loved working with Extreme products and thought it would be even better working for Extreme Networks.

I’m now two months into my new position and I’m enjoying every minute of it. We’re settling into the area and are also looking for a local church home. The weirdest transition is the kids are now on a year-round schooling schedule, but we’re getting used to it. I also work from home and work out of our main office once a week. My home lab is growing fast and getting to meet new and potential Extreme Networks customers is fun. Purdue Northwest was great. However, I felt very comfortable leaving the University in the hands of some great folks. I know my previous team will do a fantastic job.

Now I'm focused more than ever on my passion for networking. I'm learning a great deal, and I'm working with some of the brightest minds in the networking community. My future posts will start diving back down the technical track, but I'll make sure to share the culture side of things as well. I'd like to thank God first and foremost, along with my wife and family for all the support. I'd also like to thank everyone else who helped me get to where I am.

What did I get myself into?
Wed, 11 Jul 2018

New challenges tend to surprise you sometimes. I was pleasantly surprised when I was recently asked to serve as the Interim Assistant Director of Information Security Services for Purdue University Northwest. I currently manage a team of seven full-time individuals and two student workers that make up the networking, infrastructure, and telecom team. The group isn't that big, but I'd just found a great rhythm managing across the considerable breadth of IT services my team supports. The security team consists of a security engineer, an analyst, and a student worker. I'd done InfoSec work before, but I gave myself some time to think about the opportunity.

One thing that helped me make a decision was that I have a fabulous team. I've always set a model of allowing others to grow and have empowered individuals to take on leadership responsibilities without micromanagement. In the past, I served as interim lead for the server administration team while we went through a merger in the middle of an Outlook and AD migration, which was a lot of work but very successful. The networking team had also served in a security operations capacity until a dedicated security department formed two years ago. I believe these factors played into why I was asked to serve as interim. It was a great honor to be asked to help serve others, and I also have a passion for teaching, so I accepted the position.

Then human nature kicked in, and I started to ask myself what I'd gotten into when I accepted the position. Information security is no joke and there's lots of work to do, but I know that I've surrounded myself with supportive individuals who will help along the way. It's been about three weeks thus far. I've received lots of positive feedback and have a long list of goals to accomplish. However, my primary objective is to promote teaming and collaboration across the division and the Information Security Services team. We have lots of smart individuals, so together I know that we can accomplish any task. I look forward to diving back into InfoSec and plan to share the journey.

Layer 2 Bridging

If you make your way into the world of networking, you’re bound to come across a decision path on how you should handle network expansion. Should your default method always be to extend or stretch your layer 2 bridge domain? The root of the answer can be found when discussing the why. Let’s take a look at some of the use cases I’ve come across within enterprise network environments:

Device requirement: Device "A" needs to communicate with device "B," and those two devices are "required" to live on the same layer 2 broadcast domain. I haven't come across any new devices or applications that fall into this category, and it's 2018. However, some enterprise organizations may still have legacy devices or poorly engineered devices/applications with no foreseeable updates that fall into it.

Customer demand: A customer you service in area "A" needs network services expanded to area "B," and they want their equipment to stay on the same subnet. Cough, cough: point-of-sale systems. I believe modern POS systems can talk via IP across different subnets, but this use case still comes up.

Data center disaster recovery: Or should I say "specific" DR models. I say "specific" because not all DC DR needs to be built around an absolute layer 2 extension requirement. Short-sighted applications will list layer 2 extension as a requirement: someone insists that a VM pinned to a specific IP move from region "A" to region "B" with the IP staying the same. What!?! Let's think of better ways to do this: DNS, automated IP provisioning. Still, this remains a real use case.

Ease of use: Sometimes, if you're uncomfortable with routing protocols, it may seem easier to span a VLAN across the core of the network. Less IP provisioning, fewer ACLs, potentially fewer firewall rules, and less management of those dreaded IP routing protocols. However, this is something we are in control of, so it's OK to take the time to research and learn which routing protocol would work best for your environment. Don't let a lack of information drive your operation.

I can confirm that extending hundreds of VLANs through your core, along with multiple instances of STP and a sprinkle of HSRP, is NOT scalable. You will run into issues at some point. Others would say, "but my superior wants things done yesterday." That's another topic, perhaps worth blogging about in the future, but hang in there.

You get the point. There are better ways to accomplish the listed use cases, but I understand that sometimes you may not be able to work with vendor X, customer Y, or technician Z to remove the necessity of layer 2 extension. Maybe your options are limited, but you're a rock star network admin/engineer, so can we design around the "end user requirements"? If you must, you have quite a few options to extend layer 2 through the use of overlays. There will be some added complexity, but overlays may be worth considering instead of spanning layer 2 segments across the core.

Over What?

OK, so what's this overlay stuff? Say you designed your network with proper layer 2 segmentation along with a layer 3 routing protocol. Everything is working great. Your layer 2 fault domains are isolated, you don't have STP running across your core, and you're taking advantage of multipath layer 3 routing. All is wonderful in the world. You then get a "hard" requirement, maybe one listed above, to extend layer 2. Do you go back and span a VLAN through your core? No, overlays to the rescue! Overlays have been around for quite some time; think GRE, pseudowires, etc. Some of the latest overlays you may have heard of are VXLAN and EVPN. Basically, you encapsulate traffic from one segment of your network, forward it across the existing layer 3 underlay, and de-encapsulate it at the far endpoint. Voilà, you've extended layer 2 across your layer 3 network. I know, easier said than done. There are plenty of resources out there on how to set up overlay protocols, so I won't go into the details.
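To make "encapsulate, forward, de-encapsulate" concrete, here's a toy Python sketch of just the VXLAN framing (header layout per RFC 7348; the outer UDP/IP headers a real VTEP would add are omitted, and the VNI value is arbitrary):

```python
import struct

def vxlan_encap(vni: int, inner_frame: bytes) -> bytes:
    """Prepend a VXLAN header (RFC 7348) to an inner Ethernet frame.

    The 8-byte header: flags byte 0x08 (VNI-valid bit set), 3 reserved
    bytes, a 24-bit VNI, and 1 more reserved byte. A real VTEP wraps
    this in UDP/IP toward the remote tunnel endpoint.
    """
    header = struct.pack("!B3xI", 0x08, vni << 8)  # VNI sits in the top 24 bits
    return header + inner_frame

pkt = vxlan_encap(10010, b"inner-ethernet-frame")
vni = int.from_bytes(pkt[4:7], "big")  # the far-end VTEP recovers the VNI
```

The inner frame rides along untouched, which is why the two endpoints believe they share one layer 2 segment.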

Alternative Thinking

Now let's say you want to build your network from the ground up with extension services in mind. This would allow you to have a robust layer 2 transport natively built into your network. Extreme Networks has something called Fabric Connect, a technology acquired through its purchase of Avaya's networking business. Fabric Connect is designed around Shortest Path Bridging MAC (SPBM) as the forwarding plane and IS-IS as the control plane. You forward traffic not by IP routes but by using an I-SID, or service instance identifier. You can create a layer 2 virtual service network (VSN) that's more "circuit" based. The core of your network becomes a Fabric Connect mesh, and from an operational perspective, you configure services at the edge. You no longer have to segment devices to only certain parts of your network. Extreme Networks' claim is that you get something like MPLS (however different) without the complexity.

Fabric Connect gets me thinking about the Locator/ID Separation Protocol (LISP), which focuses on separating a device's location (think IP address) from its identity (think IP address again). If you separate location from identity, you can create two namespaces: in LISP, those are the endpoint identifier (EID) and the routing locator (RLOC). What you then build is a mapping architecture, similar to how DNS maps a name to an IP, for determining forwarding. In fact, Cisco Campus Fabric uses LISP and VXLAN to create another overlay solution that allows client mobility across a network.
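A minimal sketch of that mapping idea in Python, with made-up addresses: the EID stays with the host, and mobility is just an update to the EID-to-RLOC mapping, much like updating a DNS record:

```python
class MapServer:
    """Toy LISP-style mapping system: EIDs (endpoint identifiers)
    resolve to RLOCs (routing locators), the way DNS resolves names
    to addresses. Addresses are illustrative only."""

    def __init__(self):
        self._db = {}

    def register(self, eid, rloc):
        # A site's tunnel router registers where an EID currently lives.
        self._db[eid] = rloc

    def resolve(self, eid):
        # An ingress router asks where to forward traffic for this EID.
        return self._db.get(eid)

ms = MapServer()
ms.register("10.1.1.10", "192.0.2.1")     # host lives at site A
ms.register("10.1.1.10", "198.51.100.7")  # host moves to site B: only the RLOC changes
```

The host keeps its 10.1.1.10 identity through the move; only the locator side of the mapping is rewritten.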

The next time you have the opportunity to design or redesign a network, take time to study the why before you implement the how. And most importantly have fun!

I finally made it out to a CHI-NOG event, the Chicago Network Operators Group. Experienced network engineers and architects put the group together to focus on all things networking. The yearly events concentrate on vendor-neutral topics and encourage other network enthusiasts within the Chicagoland region to attend. This year's gathering had more than a dozen sessions and a lineup of excellent guest speakers. If you're ever in the area and love networking with technology and people, I highly recommend you go. I attended quite a few of the sessions, but I'll start with one of my favorites.

Rethinking BGP in the Data Center

BGP, the chosen EGP of the Internet, has taken quite a hold in large-scale data centers at companies such as Facebook, Microsoft, LinkedIn, and Google. You can do all kinds of clever traffic engineering using BGP, but should it be the chosen IGP for data centers? The companies mentioned above are now looking into, or already deploying, other technologies such as Open/R, OpenFabric, and Firepath as BGP replacements. Russ challenged BGP's deployment complexity and talked about some of the most significant hurdles, delay and jitter, within the hyperscale arena. Flooding also becomes an issue, along with autoconfiguration of devices.

I think it's important not to overcomplicate existing protocols to make them fit what we want. We need to become better engineers and try something different. That's where white box switching and new protocols such as draft-white-openfabric come into play. White box switching allows for the deployment of newly developed routing protocols that are more appropriate for what we wish to accomplish. Automation is also critical for successful manageability. Russ talked about having a router or switch that you never have to configure or CLI into; a little tough to swallow for us network operators.

Closing Thoughts

I couldn’t help but think about wireless controllers. When’s the last time you ever ssh’d into your wireless access points? We couldn’t imagine going back to individually configuring access points, what a nightmare! Centralized automated management for our switches and routers makes complete sense. Are we ready for the transition? The thought of what will happen to our existing jobs always comes up. However, I say we can then transition into working on solving other problems that we never had time to complete. Overall CHI-NOG was an awesome experience. I have lots more notes, so hopefully I can come up with more stuff that you’ll enjoy reading.

White box switching seems to be all the networking hype. For some in-depth research, check out the Packet Pushers podcast about AT&T making its move into white box switching. Cisco is also committed to offering a version of IOS XR decoupled from Cisco hardware, enabling its NOS to run on OCP (Open Compute Project) compliant hardware, aka "white box switching." Fascinating stuff, but what's the big deal? Well, I'm going to try to make a comparison.

A Lego comparison

I'm a huge adult fan of Lego (AFOL). I remember dumping old tin popcorn bins of Lego all over my bedroom floor as a child. I'm more organized today, but I can't help tearing down and building new creations. Now imagine you have an advanced Lego Technic set put together. You have gears that move, hinges that open and close, wheels that turn, etc. Now imagine all those connecting pieces glued together: a nightmare for AFOLs who want to rebuild something special.

Picture that glued-together Lego set as a networking switch or router. Sure, you can plug and unplug a few items, configure features within the CLI, and even get some sweet stats via SNMP. However, your switch or router's underlying code is static; you can't change it. You're at the mercy of the vendor's nicely glued-together product. I'm not suggesting that's necessarily a bad thing, but you get where I'm going. With white box switching, you finally get to be a bit more creative with your switch or router. You can unload the default network operating system and load up something completely different. You've just expanded your imagination beyond one vendor and their fixed code.

A modular future

Maybe we'll start to see advanced hardware modularity for white box switching as well. Need more processing power? Upgrade your CPU. Need more space for your NOS apps or massively large routing tables? Go ahead and add more RAM. Are you a Cisco or a Cumulus fan? Who cares; you choose which NOS to run. Now you're building like an AFOL. The possibilities for customization that deliver high flexibility are endless.

Now that I have my Nutanix CE lab set up, I wanted to get some of my virtual network operating systems installed in my home lab. One of the NOSes I've been running is Extreme Networks virtual EXOS. My last EXOS VM lived in VirtualBox and ESXi. Extreme Networks has a GitHub page with all the information you need to get started with running the VM in a VirtualBox or ESXi environment.

Issue and Solution with Nutanix

Following the EXOS installation guide using the downloadable ISO and mimicking the VMware/VirtualBox VM settings within Nutanix CE didn't work. I kept hitting an issue where the disk wasn't correctly detected by the Nutanix CE hypervisor, so I started over with a fresh installation of EXOS via the ISO in VirtualBox. Once I completed that installation, I exported the VM and extracted the VMDK, then imported the VMDK into Nutanix using the Prism image configuration GUI utility.

I then built a fresh VM with the proper specifications, mounted the imported image on an ide.0 disk and booted the VM within Nutanix. The disk was detected correctly and EXOS booted with no issues.

I mapped the management interface to the first NIC on vlan0. I then added three NICs on an unused VLAN I created in order to prevent any loops. Unfortunately, Nutanix doesn't have an option to disable or virtually unplug network adapters.

You can set up a management IP on the virtual EXOS switch from the Nutanix Prism console using the following EXOS command:
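From memory, the usual EXOS incantation looks like the following (the addresses are examples, and your management VLAN or virtual router names may differ by image):

```
configure vlan Mgmt ipaddress 192.168.254.220 255.255.255.0
configure iproute add default 192.168.254.1 vr VR-Mgmt
enable ssh2
```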

SSHing into the EXOS virtual machine running in Nutanix CE from my workstation was successful, hooray! Now I can continue testing my virtual NOSes. Next on the list are Cumulus and OpenSwitch. Given the disk issues I experienced, I'll start with a VMDK import of a working VirtualBox image for the next round of NOS deployments.

Today's hyper-converged infrastructure (HCI) vendors have some exciting product offerings. HCI ultimately provides scalable, flexible storage coupled with compute resources. Recently at work, our KPI metrics started to show that some of our SAN hardware was having trouble keeping up with production workloads. So we looked at a few HCI vendors: SimpliVity, Nutanix, Pivot3, and VMware vSAN. After our initial investigations, it became clear that we weren't quite ready to step into HCI just yet. Our project scope explicitly called for storage performance; additional compute wasn't necessary and wasn't budgeted for the project. Since HCI solutions couple storage with compute costs, we ended up investing in an all-flash SAN solution.

However, I was very intrigued with the different HCI platforms. What interested me the most was the ability to scale storage using x86 based systems. During our Nutanix research, I came across their community edition. I decided to load Nutanix CE on my home virtualization server and give it a whirl. There are lots of other great sites with information on how to get the initial setup going, so I’ll focus more on some of my specific findings during my home lab testing.

Nutanix CE Single Node Setup

I started out with a single-node cluster as I didn't have enough spare SSD parts for a three-node setup. Unfortunately, I can't review storage clustering yet, but that will come soon as I've ordered two extra 240GB SSDs. The single-node setup consists of the hypervisor (Acropolis) installed on the host, a VM called the CVM (Nutanix controller VM), and Prism, the management software running on the CVM. A few things I could test with the single-node setup were in-line compression, hybrid storage presentation, importing of OVAs, and overall management of the system. The minimum specifications are at least 16GB RAM, four CPU cores, one SSD greater than 200GB, and one spinning disk greater than 500GB. For my lab setup, I have a Samsung 240GB SSD and a 1TB 7200RPM spinning disk. The default storage configuration after installation was one storage pool with both the SSD and HDD in the same pool. I enabled in-line compression on the default storage container. I couldn't set up dedupe because I didn't have enough RAM installed.

To test file transfer rates, I set up a quick NFS share. To get this going, you simply add a whitelisted IP range to the Nutanix storage container under the advanced settings, then mount the share on the client (my Windows 10 machine). You will need to know the Nutanix container name, which you can find under the Nutanix Prism storage settings. Then run the following in cmd on the client machine:

mount -o nolock &lt;nutanix-cvm-ip&gt;:/&lt;nutanix-container-name&gt; &lt;windows-drive-letter&gt;:

mount -o nolock 192.168.254.211:/default-container-50404077655521 P:

I started copying files directly to the Nutanix CE host without needing to set up another VM. I copied a few gigs of family pictures and videos to test throughput and topped out at around 500Mbps. These results may have been due to disk bottlenecks, as I was copying from an older 1TB 5200RPM drive in the Windows 10 client machine. To make sure my NIC or switch wasn't the issue, I set up an Ubuntu Server VM on Nutanix and ran iperf, which runs TCP/UDP tests from memory. Those tests topped out at the NIC interface speed of 1Gbps between my physical desktop and the Linux server. I'm also seeing a total data reduction ratio of 1.67:1, but that's a combination of compression, any cloning, and VM thin provisioning. The compression-only ratio is 1.03:1 with a data reduction of 5.89GB; then again, most of the large data I have stored consists of already-compressed JPG and AVI files. Below is a screen capture of the Prism storage web interface:

Nutanix CE Conclusion

The methods I've used so far aren't the best way to store data files, so for my next test I'll set up a Windows Server VM and enable file server services. So far, I'm having lots of fun learning the ins and outs of Nutanix CE. I've started testing backups, imports, and a few other things. Once my SSD parts arrive, I'd like to rebuild the setup as a three-node cluster to see how storage clustering functions across three nodes.

My previous IT roles have revolved around the administration of different technologies, specifically networking. However, I've always been willing to perform other job functions as needed. That's led me to learn all types of new things, such as tower climbing, billing, phone support, inventory tracking, training, and the list goes on. At my current employer, I started as a network administrator, moved into a network supervisor position within three years, and was then asked to serve as an interim supervisor for another area through a merger. I'm now the supervisor of networking and infrastructure.

Transitioning from a network administrator to a supervisor isn't always a breeze. When you've spent lots of time administering systems, you become ingrained in build, support, and fix mode. Supervising is different, because those duties should now be your team members' primary functions; managing your team while they build, support, and fix is now one of your primary responsibilities. During my transition, I found myself occasionally working on tickets. I'm not saying that supervisors shouldn't assist the team when required, but if you continue to pick up the more difficult tickets, it only hurts your team in the long run. I would ponder the idea that something would get done quicker if I just did it. Stay away from that mindset. Your team will only get better with practice. The more they perform a task, the better they will become, and they may one day become quicker than you. That's what a supervisor should want for their team: to grow. I've even shared this message with some of my peers, so it's not just a challenge for supervisors.

You need to find your "value-added" as a supervisor/manager. Setting priorities, training, planning, and growing your team become where you add value. Maybe you lack documentation or process in your area. Perhaps you need to start tracking changes better, or you need an assessment of how your services are performing. This is where you can become a change agent. You can now provide a framework for your team that will help develop growth in one of these areas. A supervisor isn't just someone who tells people what to do, but someone who helps show others what they can do. If you ever have the opportunity to lead a group, think about how you will be able to help others in their journey. Leading isn't always easy, but it can be quite rewarding to help others grow.

“I mostly know what not to do by experiencing the mistakes of others around me. Remember to watch, learn, and grow.” – Javier Solis