I recently came across a customer with limited space, power and cooling in their datacentre (few don't, right?) who wanted to put in a Vblock. To do so they would need to split most of the UCS chassis across multiple racks, at opposing ends of the data hall. Traditionally, when I design and spec a UCS system I use the default 'passive' copper Twinax SFP+ cables. If I need to cable Fabric Interconnects that are more than 5 metres away from the chassis, I use 'active' copper Twinax SFP+, which can go up to 10 metres.

But in this case the distances are over 30 metres. The alternative is to go optical using SFP+ modules (SFP-H10GB-SR), which cover virtually any datacentre distance (300 m or so).

A few of you may have noticed I said this was a Vblock, and you may be thinking you won't be allowed to do this because it breaks the default spec. While it does go against the standard Vblock design, it is a great example of how Vblock products are actually more flexible than people may think: exceptions can be raised when genuine requirements demand them.

Click here for more info on installing and configuring a UCS chassis and cabling it up.

UPDATE:

Thanks to Andrew Sharrock (@AndrewSharrock) for pointing this one out. As of UCS software release 1.4, Fabric Extender Transceivers (FETs) are supported as an alternative to the above. A FET can reach up to 100 m and supports OM2, OM3 and OM4 cable. I have a feeling not many people have deployed this, as Google doesn't bring back many results on the subject, but it's an option. I'm not sure if VCE supports it within a Vblock either (VCE folks are welcome to confirm or deny this in the comments).
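Putting the distance limits above together, here is a minimal sketch of the media choice as a function of cable run length. The cut-offs are the ones quoted in this post (passive Twinax 5 m, active Twinax 10 m, FET 100 m, SR optics ~300 m); treat them as planning figures, not a substitute for the official data sheets.

```python
def pick_media(distance_m: float) -> str:
    """Return a suitable 10GE chassis-to-FI link option for a given run length."""
    if distance_m <= 5:
        return "passive copper Twinax SFP+"
    if distance_m <= 10:
        return "active copper Twinax SFP+"
    if distance_m <= 100:
        return "Fabric Extender Transceiver (FET) over OM2/OM3/OM4"
    if distance_m <= 300:
        return "SFP-H10GB-SR optical SFP+"
    raise ValueError("run too long for short-reach 10GE options")
```

For the 30 m runs in this customer's data hall, `pick_media(30)` lands on the FET (or SR optics if FETs aren't an option in your support matrix).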

From Cisco:

To replace a copper Twinax SFP+ transceiver with an optical SFP+ transceiver, follow these steps:

Step 1 Remove the copper Twinax SFP+ from the I/O module port by pulling gently on the rubber loop (see Figure 2-19). The cable and SFP+ transceiver come out as a single unit, leaving the I/O module port empty.

Step 2 Insert the optical SFP+ transceiver into the I/O module port. Make sure that it clicks firmly into place.

Trying to create the service profile failed. Initially we saw cryptic error messages, leading us to believe it was management IP or vNIC pool related. The second error message does refer to a pool-based issue.

There were no clues elsewhere: the service profile creation simply fails, so the profile takes nothing from any of the pools.

So we manually rolled back the firmware on the new blades before trying to create a new SP. We could not roll back the BIOS, however, unless there was a spare existing SP we could use to create a BIOS policy. In our case there was, so we tried the BIOS version downgrade, but it did not fix the problem.

Resolution:

The exact fix: create a new UUID pool and add enough UUIDs, then point the service profile at this new pool. Result: both the clone and template methods now work.

The "explanation" is that, somehow, rolling the firmware back on the blades leaves the existing UUID pool entries that would normally be eligible stranded. Below is the link to the Cisco TAC explanation. It is not quite the scenario we saw, and the labs at Cisco have not reproduced it, but the fix worked. We did not delete the existing UUID pool; we didn't see the point in taking any risk. The Fabric Interconnects were at 1.3(1T) throughout.

So if you hit trouble with service profile creation, my first step would be to create a new UUID pool and reference that new pool in the SP. I would probably do the same with the other pools one by one in case that leads to a resolution; you can always delete them if it turns out they don't help.
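The workflow of that fix can be sketched with a toy in-memory model. To be clear, this is an illustration of the logic only, not the UCSM API; in practice you would perform the same steps in the UCSM GUI or via its XML API, and the pool/SP names here are made up.

```python
import uuid

class UuidPool:
    """Toy stand-in for a UCSM UUID suffix pool."""
    def __init__(self, name: str, size: int):
        self.name = name
        self.free = [str(uuid.uuid4()) for _ in range(size)]

    def allocate(self) -> str:
        if not self.free:
            raise RuntimeError(f"pool {self.name} exhausted")
        return self.free.pop()

def create_service_profile(name: str, pool: UuidPool) -> dict:
    # In UCSM the SP draws its UUID from the referenced pool at creation time;
    # the stranded entries in the old pool were what made creation fail.
    return {"name": name, "uuid": pool.allocate()}

# The fix: make a brand-new pool with enough entries and reference that
# instead of the original (now stranded) pool.
new_pool = UuidPool("uuid-pool-2", size=16)
sp = create_service_profile("esx-host-09", new_pool)
```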

I listened to the excellent Infosmack podcast, a deep dive into blade servers vs rack servers. I guess it had the desired effect, as it really got me thinking. Not so much about the main objective of the podcast, comparing blades to rack-mount servers, but about rack servers vs traditional blades vs Cisco UCS.

Over the last six months I have been neck deep in the Cisco UCS platform from both a blade and rack-mount server perspective. It struck me that many of the challenges raised by the panel are addressed with UCS.

Each topic I touch on below is probably a blog post in its own right, so I have skimmed over them. My goal is to highlight that vendors are aware of these issues and are actively working to resolve them.

I'd like to highlight that this post does not go into 'what is UCS'; for that I recommend:

Life Cycle for Chassis:

Nigel raised a very real concern for server architects and engineers around the longevity of the blade environment. With traditional rack and even tower servers, replacing them with the latest and greatest was an easy task. However, when you introduce a blade environment, an element of longevity is built into the infrastructure: the blade chassis is a fixture that can last two or three times the life of the server and I/O components that it houses. So how do vendors get round this? The answer is to make the chassis as basic as possible. With the Cisco UCS 5100 series chassis you get power, front-to-back airflow and a midplane. This midplane can handle up to 1.2 Tb of aggregate (Ethernet) throughput. It is both the single point of failure and the life-cycle limiter for a chassis: all other parts are easy to upgrade or replace, but the midplane is built into the chassis and is not a quick fix should it fail or need upgrading.

The chassis midplane supports two 10-Gbps unified fabric connections per half slot for today's server blades, with the ability to scale up to two 40-Gbps connections using future blades and fabric extenders. For this reason I'm fairly confident the Cisco UCS 5100 series chassis will be future-proof for a significant amount of time.
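As a back-of-envelope check, the per-slot figures above can be reconciled with the headline ~1.2 Tb aggregate if you count both directions of the future 40-Gbps links. That full-duplex interpretation is my assumption, not a Cisco statement, but the arithmetic lines up:

```python
HALF_SLOTS = 8        # half-width blade slots per 5100 series chassis
LINKS_PER_SLOT = 2    # two unified fabric connections per half slot

today_gbps = HALF_SLOTS * LINKS_PER_SLOT * 10    # 10GE links shipping now
future_gbps = HALF_SLOTS * LINKS_PER_SLOT * 40   # future 40G links

# Counting both directions (full duplex) yields the headline aggregate:
aggregate_tbps = future_gbps * 2 / 1000          # ~1.28 Tbps, i.e. the ~1.2 Tb figure
```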

For me all of this shows that, while blades add some complexity, the concern over longevity and upgrades should be minimal, especially as UCS is easily managed and has only one non-hot-swappable unit (the chassis midplane).

NOTE: traditional blade chassis have in-built management modules and the like, which add an additional point of ageing compared with UCS blades.

Reliability

As I touched on above, the midplane is a single point of failure in the 5100 series chassis. Within the chassis it is the only component whose failure would result in the loss of the whole chassis. You could ask yourself 'why would you put all your eggs into one basket?'; for me the risk is very small. However, no self-respecting architect/engineer would recommend putting this kind of risk into a production datacentre, so this is where you would recommend multiple blade chassis. Now we get some testing questions:

Am I actually saving on rack space?

Have I increased my failure risk %?

Have I added additional cost and complexity?


There will be plenty of times when the answers to those questions leave little choice but to go for rack-mount servers. But for me this will only be for small businesses or small projects. When consolidating racks of servers or considering cloud-based architectures, blades make a lot more sense.
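The "failure risk %" question can be made concrete with a quick sketch. Spreading blades across more chassis raises the chance that *some* chassis fails, but shrinks the blast radius of any single midplane failure. The per-chassis annual failure probability `p` below is purely illustrative, not a vendor figure.

```python
def risk_profile(chassis_count: int, p: float = 0.01):
    """Chance of at least one chassis failure, and fraction of blades lost per failure."""
    p_any_failure = 1 - (1 - p) ** chassis_count
    blast_radius = 1 / chassis_count
    return p_any_failure, blast_radius
```

With four chassis instead of one, the chance of seeing *a* failure roughly quadruples, but each failure now takes out only a quarter of your blades, which is usually the trade you want in production.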

Ownership

Big, big issue, this one. Just as when virtualisation first came around, questions of ownership cause problems. With virtualisation it became obvious (or evolved) that a new role was necessary; this was possible because virtualisation covers each discipline only to a certain level, leaving the SAN, network and compute teams segregated. UCS muddies the water because it encroaches further into the network engineer's realm than ever before (especially when you add the Nexus 1000V into the mix). The same goes for SAN and compute.

This is not solved with UCS; if anything it can be exacerbated. However, careful planning and understanding can go a long way towards improving the existing relationships between the disciplines.

Rack Space & Co-Location

It's hard to beat blades when it comes to space in a rack. It's fairly obvious, when you look at a Cisco B230 M2 and how powerful it is for a half-width blade, that you can fill a rack with a very large amount of compute. Co-location of racks and blades is of course possible, and with UCS you can manage both from the UCSM console.

Where things get complicated with blades and rack space is power and air conditioning. A common rack setup from a power point of view would be twin 16-amp feeds, which should be enough to fully populate a rack with chassis. However, you run into issues with air conditioning and meeting what are often relatively low BTU (British thermal unit) budgets. I once worked on a datacentre that had plentiful electricity but could not fit more than one fully populated chassis and one half-populated chassis in a single rack. Unless you have a brand new datacentre built with blades in mind, you are unlikely to be able to fit out a full chassis environment (like the pic above).

Management

One big advantage of UCS is that both blades and rack servers can be managed the same way through UCSM via the Fabric Interconnects. In a traditional rack-server topology each server is a point of management; with traditional blades each chassis is the management point; with UCS this is aggregated up another layer, to the Cisco 6100 series Fabric Interconnects.

UCSM is a Linux-based OS run from ROM and delivered through a web server hosted on the FI. This software is the coalface of UCS and allows centralised management of every chassis connected to the FI. Where this becomes appealing in a cloud environment is that it allows a topology similar to VMware vCenter: it can be contacted through an API, and all components connected to the FI are treated as objects. Compare this with traditional blade chassis, where each chassis is the management point, meaning an individual connection to each chassis. When you start looking at a lot of chassis, that becomes a bottleneck and a logistical problem.
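To give a flavour of that API, UCSM exposes an XML interface on the Fabric Interconnect, and `aaaLogin` is the call that opens a session and returns a cookie for subsequent requests. The sketch below only builds the login request; the hostname and credentials are placeholders, and you would parse `outCookie` from the real reply before querying objects.

```python
import urllib.request

UCSM_HOST = "ucsm.example.com"  # placeholder for the FI cluster address

def login_payload(user: str, password: str) -> bytes:
    # aaaLogin is the UCSM XML API's session-open method.
    return f'<aaaLogin inName="{user}" inPassword="{password}" />'.encode()

def login(user: str, password: str) -> str:
    req = urllib.request.Request(
        f"https://{UCSM_HOST}/nuova",          # UCSM XML API endpoint
        data=login_payload(user, password),
        headers={"Content-Type": "application/xml"},
    )
    with urllib.request.urlopen(req) as resp:  # parse outCookie from the reply
        return resp.read().decode()
```

Everything in UCSM (chassis, blades, pools, service profiles) is addressable this way as a managed object, which is exactly what makes it attractive for cloud-style orchestration on top.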

Density

Are blades as dense as their rack equivalents? The guys on the podcast discussed what is quite a common occurrence: the perception that you can pack more compute into a rack server than a blade. It's a complex question. There are rack servers that can outgun the blades, but their number is falling. Then you have to look at things like how many U the rack server occupies; e.g. two 4U servers packed full of compute will probably lose out to two UCS chassis packed with RAM and the new Westmere Intel chips.

Obviously this will change from deployment to deployment, but in general, if you plan carefully, you can top-trump the rack-mount equivalent.
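A quick whole-rack comparison shows why careful planning usually favours the blades. All the counts below are my own illustrative assumptions (a 42U rack, 4U two-socket rack servers, 6U chassis holding eight half-width two-socket blades), not vendor figures:

```python
RACK_U = 42

# Rack-mount option: 4U servers, 2 sockets each
rack_servers = RACK_U // 4           # servers per rack
rack_sockets = rack_servers * 2      # CPU sockets per rack

# Blade option: 6U UCS 5108-style chassis, 8 half-width 2-socket blades each
chassis = RACK_U // 6                # chassis per rack
blade_sockets = chassis * 8 * 2      # CPU sockets per rack
```

Under those assumptions the blade rack carries several times the socket count, before you even factor in power and cooling limits that stop either rack being fully populated.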

One point raised in the podcast was around local disk. Local disk has been minimised in virtual environments, generally just hosting the hypervisor. However, with VDI vendors utilising local disk, I can see this being a potential issue for blades going forward. Having said that, companies like Atlantis Computing are working on running VDI desktops directly out of memory, and with memory density only set to get higher, this potentially enables a SAN-less environment (blog post pending on that).

The Virtual Interface Card (VIC) is a converged network adaptor (Ethernet and Fibre Channel) designed with virtualisation in mind. VN-Link technology enables policy-based virtual machine connectivity and mobility of network and security policy that is persistent throughout the virtual machine lifecycle, including VMware VMotion. It delivers dual 10-GE ports and dual Fibre Channel ports to the midplane.

Cable Management

So let's think of a fairly common example: 10 Dell R710 2U servers, each with an additional quad-port PCI network card and an additional PCI HBA with two dual ports. Let's assume every PCI card port is in use.

With traditional blade chassis this is reduced, as you can add switches internally to the chassis. However, this model will not always work and you may need to use pass-through modules, which keep the number of Ethernet and fibre cables high.

With a UCS blade system cabling is reduced significantly by the introduction of FCoE. This FCoE strategy operates only between the chassis and the FIs, allowing up to 40 Gbps per chassis up to the FIs, with a maximum of 8 Twinax cables per chassis (4 per 2100 series fabric extender).
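Counting the cables in that example makes the gap obvious. The per-server port breakdown below is my assumption for a fully cabled R710 (4 onboard GbE, the quad-port NIC, two dual-port FC links from the HBA, plus an iDRAC management port):

```python
SERVERS = 10

# Rack-mount option: every port cabled on each R710 (assumed port counts)
per_server = 4 + 4 + (2 * 2) + 1      # onboard GbE + quad NIC + FC + iDRAC = 13
rack_cables = SERVERS * per_server     # cables for the whole estate

# UCS option: the same 10 servers as half-width blades fit in 2 chassis,
# each uplinked with at most 4 Twinax per 2100 series FEX, 2 FEX per chassis
CHASSIS = 2
ucs_cables = CHASSIS * 2 * 4           # worst case, fully uplinked
```

That's 130 cables against 16, carrying both LAN and SAN traffic over the converged links, which is where FCoE between chassis and FI really earns its keep.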

Nigel mentioned wireless as a possible alternative in future. Personally I don't think that's just around the corner technology-wise, but it is within the realms of possibility.

Cost

This is where Cisco UCS can fall down, because of the Fabric Interconnects. However, concerns over how much an empty chassis impacts rack server vs blade server capex are a little unfounded. Where things get expensive with Cisco UCS blades is the fabric interconnects and the fabric extenders. Each can be purchased singly (i.e. one fabric extender per chassis and a single managing FI), but this introduces the single points of failure we want to avoid.