I found it interesting because while it was rather “new” and interesting back then, if you ‘s/virtualization/cloud‘ especially from the perspective of heavily virtualized or cloud computing environments, it’s even more relevant today! Virtualization and the abstraction it brings to network architecture, design and security makes for interesting challenges. Not much has changed in two years, sadly.

Allan sets up the discussion describing how we’ve typically plumbed disparate physical appliances into our network infrastructure to provide discrete network and security capabilities such as load balancers, VPNs, SSL termination, firewalls, etc. He then goes on to describe the stunted evolution of virtual appliances:

To be sure, some networking devices and appliances are now available in virtual form. Switches and routers have begun to move toward virtualization with VMware’s vSwitch, Cisco’s Nexus 1000v, the open sourceOpen vSwitch and routers and firewalls running in various VMs from the company I helped found, Vyatta. For load balancers, Citrix has released a version of its Netscaler VPX software that runs on top of its virtual machine, XenServer; and Zeus Systems has an application traffic controller that can be deployed as a virtual appliance on Amazon EC2, Joyent and other public clouds.

Ultimately I think it prudent for discussion’s sake to separate routing, switching and load balancing (connectivity) from functions such as DLP, firewalls, and IDS/IPS (security) as lumping them together actually abstracts the problem which is that the latter is completely dependent upon the capabilities and functionality of the former. This is what Allan almost gets to when describing his lament with the virtual appliance ecosystem today:

Yet the fundamental problem remains: Most networking appliances are still stuck in physical hardware — hardware that may or may not be deployed where the applications need them, which means those applications and their associated VMs can be left with major gaps in their infrastructure needs. Without a full-featured and stateful firewall to protect an application, it’s susceptible to various Internet attacks. A missing load balancer that operates at layers three through seven leaves a gap in the need to distribute load between multiple application servers. Meanwhile, the lack of an SSL accelerator to offload processing may lead to performance issues and without an IDS device present, malicious activities may occur. Without some (or all) of these networking appliances available in a virtual environment, a VM may find itself constrained, unable to take full advantage of the possible economic benefits.

I’ve written about this many, many times. In fact almost three years ago I created a presentation called “The Four Horsemen of the Virtualization Security Apocalypse” which described in excruciating detail how network virtual appliances were a big ball of fail and would be for some time. I further suggested that much of the “best-of-breed” products would ultimately become “good enough” features in virtualization vendor’s hypervisor platforms.

Why? Because there are some very real problems with virtualization (and Cloud) as it relates to connectivity and security:

Most of the virtual network appliances, especially those “ported” from the versions that usually run on dedicated physical hardware (COTS or proprietary) do not provide feature, performance, scale or high-availability parity; most are hobbled or require per-platform customization or re-engineering in order to function.

The resilience and high availability options from today’s off-the-shelf virtual connectivity does not pair well with the mobility and dynamism of de-coupled virtual machines; VMs are ultimately temporal and networks don’t like topological instability due to key components moving or disappearing

The performance and scale of virtual appliances still suffer when competing for I/O and resources on the same physical hosts as the guests they attempt to protect

Virtual connectivity is a generally a function of the VMM (or a loadable module/domain therein.) The architecture of the VMM has dramatic impact upon the architecture of the software designed to provide the connectivity and vice versa.

Security solutions are incredibly topology sensitive. Given the scenario in #1 when a VM moves or is distributed across the pooled infrastructure, unless the security capabilities are already present on the physical host or the connectivity and security layers share a control plane (or at least can exchange telemetry,) things will simply break

Many virtualization (and especially cloud) platforms do not support protocols or topologies that many connectivity and security virtual appliances require to function (such as multicast for load balancing)

It’s very difficult to mimic the in-line path requirements in virtual networking environments that would otherwise force traffic passing through the connectivity layers (layers 2 through 7) up through various policy-driven security layers (virtual appliances)

There is no common methodology to express what security requirements the connectivity fabrics should ensure are available prior to allowing a VM to spool up let alone move

Virtualization vendors who provide solutions for the enterprise have rich networking capabilities natively as well as with third party connectivity partners, including VM and VMM introspection capabilities. As I wrote about here, mass-market Cloud providers such as Amazon Web Services or Rackspace Cloud have severely crippled networking.

Much of the basic networking capabilities are being pushed lower into silicon (into the CPUs themselves) which makes virtual appliances even further removed from the guts that enable them

Physical appliances (in the enterprise) exist en-mass. Many of them provide highly scalable solutions to the specific functions that Alan refers to. The need exists, given the limitations I describe above, to provide for integration/interaction between them, the VMM and any virtual appliances in order to offload certain functions as well as provide coverage between the physical and the logical.

What does this mean? It means that ultimately to ensure their own survival, virtualization and cloud providers will depend less upon virtual appliances and add more of the basic connectivity AND security capabilities into the VMMs themselves as its the only way to guarantee performance, scalability, resilience and satisfy the security requirements of customers. There will be new generations of protocols, APIs and control planes that will emerge to provide for this capability, but this will drive the same old integration battles we’re supposed to be absolved from with virtualization and Cloud.

Connectivity and security vendors will offer virtual replicas of their physical appliances in order to gain a foothold in virtualized/cloud environments in order to intercept traffic (think basic traps/ACL’s) and then interact with higher-performing physical appliance security service overlays or embedded line cards in service chassis. This is especially true in enterprises but poses many challenges in software-only, mass-market cloud environments where what you’ll continue to get is simply basic connectivity and security with limited networking functionality. This implies more and more security will be pushed into the guest and application logic layers to deal with this disconnect.

This is exactly where we are today with Cloud providers like Amazon Web Services: basic ingress-only filtering with a very simplistic, limited and abstracted set of both connectivity and security capability. See “Dear Public Cloud Providers: Please Make Your Networking Capabilities Suck Less. Kthxbye” Will they add more functionality? Perhaps. The question is whether they can afford to in order to limit the impact that connecitivity and security variability/instability can bring to an environment.

That said, it’s certainly achievable, if you are willing and able to do so, to construct a completely software-based networking environment, but these environments require a complete approach and stack re-write with an operational expertise that will be hard to support for those who have spent the last 20 years working in a different paradigm and that’s a huge piece of this problem.

The connectivity layer — however integrated into the virtualized and cloud environments they seem — continues to limit how and what the security layers can do and will for some time, thus limiting the uptake of virtual network and security appliances.

Some suggest in discussing the role and long-term sustainable value of infrastructure versus software in cloud that software will marginalize bespoke infrastructure and the latter will simply commoditize.

I find that an interesting assertion, given that it tends to ignore the realities that both hardware and software ultimately suffer from a case of Moore’s Law — from speeds and feeds to the multi-core crisis, this will continue ad infinitum. It’s important to keep that perspective.

In discussing this, proponents of software domination almost exclusively highlight Amazon Web Services as their lighthouse illustration. For the purpose of simplicity, let’s focus on compute infrastructure.

Here’s why pointing to Amazon Web Services (AWS) as representative of all cloud offerings in general to anchor the hardware versus software value debate is not a reasonable assertion:

AWS delivers a well-defined set of services designed to be deployed without modification across a massive number of customers; leveraging a common set of standardized capabilities across these customers differentiates the service and enables low cost

AWS enjoys general non-variability in workload from *their* perspective since they offer fixed increments of compute and memory allocation per unit measure of exposed abstracted and virtualized infrastructure resources, so there’s a ceiling on what workloads per unit measure can do. It’s predictable.

From AWS’ perspective (the lens of the provider) regardless of the “custom stuff” running within these fixed-sized containers, the main focus of their core “cloud” infrastructure actually functions like a grid — performing what amounts to a few tasks on a finely-tuned platform to deliver such

This yields the ability for homogeneity in infrastructure and a focus on standardized and globalized power efficient, low cost, and easy-to-replicate components since the problem of expansion beyond a single unit measure of maximal workload capacity is simply a function of scaling out to more of them (or stepping up to one of the next few rungs on the scale-up ladder)

Yup, I just said that AWS is actually a grid whose derivative output is a set of cloud services.

Why does this matter? Because not all IaaS cloud providers are architected to achieve this — by design — and this has dramatic impact on where hardware and software, leveraged independently or as a total solution, play in the debate.

This is because AWS built and own the entire “CloudOS” stack from customized hardware through to the VMM, management and security planes (much as Google does the same) versus other providers who use what amounts to more generic software offerings from the likes of VMware and lean on API’s and an ecosystem to extend it’s capabilities as well as big iron to power it. This will yield more customizable offerings that likely won’t scale as highly as AWS.

That’s because they’re not “grids” and were never designed to be.

Many other IaaS providers that have evolved from hosting are building their next-generation offerings from unified fabric and unified computing platforms (so-called “big iron”) which are the furtherest thing from “commodity” hardware you can get. Further, SaaS and PaaS providers generally tend to do the same based on design goals and business models. Remember, IaaS is not representative of all things cloud — it’s only one of the service models.

Comparing AWS to most other IaaS cloud providers is a false argument upon which to anchor the hardware versus software debate.

Within the construct of virtualization, and especially VMware, an awful lot of time is spent on VM Mobility but as numerous polls and direct customer engagements have shown, the majority (50% and higher) do not use VMotion. I talked about this in a post titled “The VM Mobility Myth:”

…the capability to provide for integrated networking and virtualization coupled with governance and autonomics simply isn’t mature at this point. Most people are simply replicating existing zoned/perimertized non-virtualized network topologies in their consolidated virtualized environments and waiting for the platforms to catch up. We’re really still seeing the effects of what virtualization is doing to the classical core/distribution/access design methodology as it relates to how shackled much of this mobility is to critical components like DNS and IP addressing and layer 2 VLANs. See Greg Ness and Lori Macvittie’s scribblings.

Furthermore, Workload distribution (Ed: today) is simply impractical for anything other than monolithic stacks because the virtualization platforms, the applications and the networks aren’t at a point where from a policy or intelligence perspective they can easily and reliably self-orchestrate.

That last point about “monolithic stacks” described what I talked about in my last post “Virtual Machines Are the Problem, Not the Solution” in which I bemoaned the bloat associated with VM’s and general purpose OS’s included within them and the fact that VMs continue to hinder the notion of being able to achieve true workload portability within the construct of how programmatically one might architect a distributed application using an SOA approach of loosely coupled services.

Combined with the VM bloat — which simply makes these “workloads” too large to practically move in real time — if one couples the annoying laws of physics and current constraints of virtualization driving the return to big, flat layer 2 network architecture — collapsing core/distribution/access designs and dissolving classical n-tier application architectures — one might argue that the proposition of VMotion really is a move backward, not forward, as it relates to true agility.

That’s a little contentious, but in discussions with customers and other Social Media venues, it’s important to think about other designs and options; the fact is that the Metastructure (as it pertains to supporting protocols/services such as DNS which are needed to support this “infrastructure 2.0”) still isn’t where it needs to be in regards to mobility and even with emerging solutions like long-distance VMotion between datacenters, we’re butting up against laws of physics (and costs of the associated bandwidth and infrastructure.)

While we do see advancements in network-driven policy stickiness with the development of elements such as distributed virtual switching, port profiles, software-based vSwitches and virtual appliances (most of which are good solutions in their own right,) this is a network-centric approach. The policies really ought to be defined by the VM’s themselves (similar to SOA service contracts — see here) and enforced by the network, not the other way around.

Further, what isn’t talked about much is something that @joe_shonk brought up, which is that the SAN volumes/storage from which most of these virtual machines boot, upon which their data is stored and in some cases against which they are archived, don’t move, many times for the same reasons. In many cases we’re waiting on the maturation of converged networking and advances in networked storage to deliver solutions to some of these challenges.

In the long term, the promise of mobility will be delivered by a split into three four camps which have overlapping and potentially competitive approaches depending upon who is doing the design:

Integration distribution and “mobility” at the application/OS layer [application architect,] or

The more traditional network-based load balancing of traffic to replicated/distributed images [network architect.]

Moving or redirecting pointers to large pools of storage where all the images/data(bases) live [Ed. forgot to include this from above]

Depending upon the need and capability of your application(s), virtualization/Cloud platform, and network infrastructure, you’ll likely need a mash-up of all three four. This model really mimics the differences today in architectural approach between SaaS and IaaS models in Cloud and further suggests that folks need to take a more focused look at PaaS.

Don’t get me wrong, I think VMotion is fantastic and the options it can ultimately delivery intensely useful, but we’re hamstrung by what is really the requirement to forklift — network design, network architecture and the laws of physics. In many cases we’re fascinated by VM Mobility, but a lot of that romanticization plays on emotion rather than utilization.

Greg Ness touched off an interesting discussion when he asked “Will Virtualization Undermine Network Equipment Vendors?” It’s a great read summarizing how virtualization (and Cloud) are really beginning to accelerate how classical networking equipment vendors are re-evaluating their portfolios in order to come to terms with these disruptive innovations.

I’ve written so much about this over the last three years and my response is short and sweet:

Virtualization has actually long been an enabler for network equipment vendors — not server virtualization, mind you, but network virtualization. The same goes in the security space. The disruption caused by server virtualization is only acting as an accelerant — pushing the limits of scale, redefining organizational and operational boundaries, and acting as a forcing function causing wholesale reconsideration of archetypal network (and security) topologies.

The compressed timeframe associated with the disruption caused by virtualization and its adoption in conjunction with the arrival of Cloud Computing may seem unnatural given the relatively short window associated with its arrival, but when one takes the longer-term view, it’s quite natural. We’ve seen it before in vignettes across the evolution of computing, but the convergence of economics, culture, technology and consumerism have amplified its relevance.

To answer Greg’s question, Virtualization will only undermine those network equipment vendors who were not prepared for it in the first place. Those that were building highly virtualized, context-enabled routing, switching and security products will embrace this swing in the hardware/software pendulum and develop hybrid solutions that span the physical and virtual manifestations of what the “network” has become.

Specifically, as it comes to understanding how the network plays in virtual and Cloud architectures, it’s not where the network *is* in the increasingly complex virtualized, converged and unified computing architectures, it’s where networking *isn’t.*

Where ISN'T The Network?

Take a look at your network equipment vendors. Where do they play in that stack above? Compare and contrast that with what is going on with vendors like Citrix/Xen with the Open vSwitch, Vyatta, Arista with vEOS and Cisco with the Nexus 1000v*…interesting times for sure.

I’ve covered this before in more complex terms, but I thought I’d reintroduce the topic due to a very relevant discussion I just had recently (*cough cough*)

So here’s an interesting scenario in virtualized and/or Cloud environments that make use of virtual appliances to provide security capabilities*:

Since virtual appliances (VAs) are just virtual machines (VMs) what happens when a SysAdmin spins down or moves one that happens to be your shiny new firewall protecting your production VMs behind it, accidentally or maliciously? Brings new meaning to the phrase “failing closed.”

Without getting into the vagaries of vendor specific mobility-enabled/enabling technologies, one of the issues with VMs/VAs is that there’s not really a good way of designating one as being “more important” or functionally differentiated such as “security” or “critical application” that would otherwise ensure a higher priority for service availability (read: don’t spin this down unless…) or provide a topological dependency hierarchy in virtualized network constructs.

Unlike physical environments where system administrators (servers) are segregated from access to network and security appliances, this isn’t the case in virtual environments. In Cloud environments (especially public, multi-tenant) where we are often reliant only upon virtual security capabilities since we have no option for physical alternatives, this is an interesting corner case.

We’ve talked a lot about visibility, audit and policy management in virtual environments and this is a poignant example.

In my Four Horsemen presentation, I made reference to one of the challenges with how the networking function is being integrated into virtual environments. I’ve gone on to highlight how this is exacerbated in Cloud networking, also.

Specifically, as it comes to understanding how the network plays in virtual and Cloud architectures, it’s not where the network *is* in the increasingly complex virtualized, converged and unified computing architectures, it’s where networking *isn’t.*

What do I mean by this? Here’s a graphical representation that I built about a year ago. It’s well out-of-date and overly-simplified, but you get the picture:

There’s networking at almost every substrate level — in the physical and virtual construct. In our never-ending quest to balance performance, agility, resiliency and security, we’re ending up with a trade-off I call simplexity: the most complex simplicity in networking we’ve ever seen. I wrote about this in a blog post last year titled “The Network Is the Computer…(Is the Network, Is the Computer…)”

If you take a look at some of the more recent blips to appear on the virtual and Cloud networking radar, you’ll see examples such as:

This list is far from inclusive. Yes, I know I’ve left off blade server manufacturers and other players like HP (ProCurve) and Juniper, etc. as well as ADC vendors like f5. It’s not that I don’t appreciate their solutions, it’s just that I have a couple of free cycles to write this, and the list above appear on the top of my stack.

I plan on writing in more detail about the impact some of these technologies are having on next generation datacenters and Cloud deployments, as it’s a really interesting subject for me coming from my background at Crossbeam.

The most startling differences are in the approach of either putting the networking (and all its attendant capabilities) back in the hands of the network folks or allowing the server/virtual server admins to continue to leverage their foothold in the space and manage the network as a component of the converged and virtualized solution as a whole.

My friend @aneel (Twitter) summed it up really well this morning when comparing the Blade Network Technology VMready offering and the Cisco Nexus 100ov:

huh.. where cisco uses nx1kv to put net control more in hands of net ppl, bnt uses vmready to put it further in server/virt admin hands

Looking at just the small sampling of solutions above, we see the diversity in integrated networking, external fabrics, converged fabrics (including storage) and add-on network processors.