In the midst of the Cold War in October of 1962, the United States and the Soviet Union stood periously on the brink of nuclear war as a small island some 90 miles off the coast of Florida became the focal point of intense foreign policy scrutiny, challenges to sovereignty and political arm wrestling the likes of which were never seen before.

Photographic evidence provided by a high altitude U.S. spy plane exposed the until-then secret construction of medium and intermediate ballistic nuclear missile silos, constructed by the Soviet Union, which were deliberately placed so as to be close enough to reach the continental United States.

The United States, alarmed by this unprecedented move by the Soviets and the already uneasy relations with communist Cuba, unsuccessfully attempted a CIA-led forceful invasion and overthrow of the Cuban regime at the Bay of Pigs.

This did not sit well with either the Cubans or Soviets. A nightmare scenario ensued as the Soviets responded with threats of its own to defend its ally (and strategic missile sites) at any cost, declaring the American’s actions as unprovoked and unacceptable.

During an incredibly tense standoff, the U.S. mulled over plans to again attack Cuba both by air and sea to ensure the disarmament of the weapons that posed a dire threat to the country.

As posturing and threats continued to escalate from the Soviets, President Kennedy elected to pursue a less direct military action; a naval blockade designed to prevent the shipment of supplies necessary for the completion and activation of launchable missiles. Using this as a lever, the U.S. continued to demand that Russia dismantle and remove all nuclear weapons as they prevented any and all naval traffic to and from Cuba.

Soviet premier Krustchev protested such acts of “direct aggression” and communicated to president Kennedy that his tactics were plunging the world into the depths of potential nuclear war.

While both countries publicly traded threats of war, the bravado, posturing and defiance were actually a cover for secret backchannel negotiations involving the United Nations. The Soviets promised they would dismantle and remove nuclear weapons, support infrastructure and transports from Cuba, and the United States promised not to invade Cuba while also removing nuclear weapons from Turkey and Italy.

The Soviets made good on their commitment two weeks later. Eleven months after the agreement, the United States complied and removed from service the weapons abroad.

The Cold War ultimately ended and the Soviet Union fell, but the political, economic and social impact remains even today — 40 years later we have uneasy relations with (now) Russia and the United States still enforces ridiculous economic and social embargoes on Cuba.

What does this have to do with Cloud?

Well, it’s a cute “movie of the week” analog desperately in need of a casting call for Nikita Khrushchev and JFK. I hear Gary Busey and Aston Kutcher are free…

As John Furrier, Dave Vellante and I were discussing on theCUBE recently at VMworld 2012, there exists an uneasy standoff — a cold war — between the so-called “super powers” staking a claim in Cloud. The posturing and threats currently in process don’t quite have the world-ending outcomes that nuclear war would bring, but it could have devastating technology outcomes nonetheless.

In this case, the characters of the Americans, Soviets, Cubans and the United Nations are played by networking vendors, SDN vendors, virtualization/abstraction vendors, cloud “stack” projects/efforts/products and underlying CPU/chipset vendors (not necessarily in that order…) The rest of the world stands by as their fate is determined on the world’s stage.

If we squint hard enough at Cloud, we might find out very own version of the “Bay of Pigs,” with what’s going on with OpenStack.

The “community” effort behind OpenStack is one largely based on “industry” and if we think of OpenStack as Cuba, it’s being played as pawn in the much larger battle for global domination. The munitions being stocked in this tiny little enclave threatens to disrupt relations of epic proportions. That’s why we now see so much strategic movement around an initiative and technology that many outside of the navel gazers haven’t really paid much attention to.

Then there are players like Amazon Web Services who, like China of today, quietly amass their weapons of mass abstraction as the industry-jockeying and distractions play on (but that’s a topic for another post)

Cutting to the chase…if we step back for a minute

Intel is natively bundling more and more networking and virtualization capabilities into their CPU/Chipsets and a $7B investment in security company McAfee makes them a serious player there. VMware is de-emphasizing the “hypervisor” and is instead positioning they are focused on end-to-end solutions which include everything from secure mobility, orchestration/provisioning and now, with Nicira, networking. Networking companies like Cisco and Juniper continue to move up-stack to deeper integrate networking and security along with service overlays in order to remain relevant in light of virtualization and SDN.

…and OpenStack’s threat of disrupting all of those plays makes it important enough to pay attention to. It’s a little island of technology that is causing huge behemoths to collide. A molehill that has become a mountain.

If today’s announcements of VMware and Intel joining OpenStack as Gold Members along with the existing membership by other “super powers” doesn’t make it clear that we’re in the middle of an enormous power struggle, I’ve got a small Island to sell you 😉

Me? I’m going to make some Lechon Asado, enjoy a mojito and a La Gloria Cubana.

The last 3 years have been very interesting when engaging with large enterprises and service providers as they set about designing, selecting and deploying their “next generation” network architecture. These new networks are deployed in timescales that see them collide with disruptive innovation such as fabrics, cloud, big data and DevOps.

In most cases, these network platforms must account for the nuanced impact of virtualized design patterns, refreshes of programmatic architecture and languages, and the operational model differences these things introduce. What’s often apparent is that no matter how diligent the review, by the time these platforms are chosen, many tradeoffs are made — especially when it comes to security and compliance — and we arrive at the old adage: “You can get fast, cheap or secure…pick two.”

…And In the Beginning, There Was Spanning Tree…

The juxtaposition of flatter and flatter physical networks, nee “fabrics” (compute, network and storage,) with what seems to be a flip-flop transition between belief systems and architects who push for either layer 2 or layer 3 (or encapsulated versions thereof) segmentation at the higher layers is again aggravated by continued push for security boundary definition that yields further segmentation based on policy at the application and information layer.

So what we end up with is the benefits of flatter, any-to-any connectivity at the physical networking layer with a “software defined” and virtualized networking context floating both alongside (Nicira, BigSwitch, OpenFlow) as well as atop it (VMware, Citrix, OpenStack Quantum, etc.) with a bunch of protocols ladled on like some protocol gravy blanketing the Chicken Fried Steak that represents the modern data center.

Oh! You Mean the Cloud…

Now, there are many folks who don’t approach it this way, and instead abstract away much of what I just described. In Amazon Web Services’ case as a service provider, they dumb down the network sufficiently and control the abstracted infrastructure to the point that “flatness” is the only thing customers get and if you’re going to run your applications atop, you must keep it simple and programmatic in nature else risk introducing unnecessary complexity into the “software stack.”

The customers who then depend upon these simplified networking services must then absorb the gaps introduced by a lack of features by architecturally engineering around them, becoming more automated, instrumented and programmatic in nature or add yet another layer of virtualized (and generally encrypted) transport and execution above them.

This works if you’re able to engineer your way around these gaps (or make them less relevant,) but generally this is where segmentation becomes an issue due to security and compliance design patterns which depend on the “complexity” introduced by the very flexible networking constructs available in most enterprise of SP networks.

It’s like a layered cake that keeps self-frosting.

Software Defined Architecture…

You can see the extreme opportunity for Software Defined *anything* then, can’t you? With SDN, let the physical networks NOT be complex but rather more simple and flat and then unify the orchestration, traffic steering, service insertion and (even) security capabilities of the physical and virtual networks AND the virtualization/cloud orchestration layers (from the networking perspective) into a single intelligent control plane…

That’s a big old self-frosting cake.

Basically, this is what AWS has done…but all that intelligence provided by the single pane of glass is currently left up to the app owner atop them. That’s the downside. Those sufficiently enlightened AWS’ customers are aware generally of this and understand the balance of benefits and limitations of this path.

In an enterprise environment, however, it’s a timing game between the controller vendors, the virtualization/cloud stack providers, the networking vendors, and security vendors …each trying to offer up this capability either as an “integrated” capability or as an overlay…all under the watchful eye of the auditor who is generally unmotivated, uneducated and unnerved by all this new technology — especially since the compliance frameworks and regulatory elements aren’t designed to account for these dramatic shifts in architecture or operation (let alone the threat curve of advanced adversaries.)

Back To The Future…Hey, Look, It’s Token Ring and DMZs!

As I sit with these customers who build these nextgen networks, the moment segmentation comes up, the elegant network and application architectures rapidly crumble into piles of asset-based rubble as what happens next borders on the criminal…

Thanks to compliance initiatives — PCI is a good example — no matter how well scoped, those flat networks become more and more logically hierarchical. Because SDN is still nascent and we’re lacking that unified virtualized network (and security) control plane, we end up resorting back to platform-specific “less flat” network architectures in both the physical and virtual layers to achieve “enclave” like segmentation.

But with virtualization the problem gets more complex as in an attempt to be agile, cost efficient and in order to bring data to the workloads to reduce heaving lifting of the opposite approach, out-of-scope assets can often (and suddenly) be co-resident with in-scope assets…traversing logical and physical constructs that makes it much more difficult to threat model since the level of virtualized context supports differs wildly across these layers.

Architects are then left to think how they can effectively take all the awesome performance, agility, scale and simplicity offered by the underlying fabrics (compute, network and storage) and then layer on — bolt on — security and compliance capabilities.

What they discover is that it’s very, very, very platform specific…which is why we see protocols such as VXLAN and NVGRE pop up to deal with them.

Lego Blocks and Pig Farms…

These architects then replicate the design patterns with which they are familiar and start to craft DMZs that are logically segmented in the physical network and then grafted on to the virtual. So we end up with relying on what Gunnar Petersen and I refer to as the “SSL and Firewall” lego block…we front end collections of “layer 2 connected” assets based on criticality or function, many of which stretched across these fabrics, and locate them behind layer 3 “firewalls” which provide basic zone-based isolation and often VPN connectivity between “trusted” groups of other assets.

In short, rather than build applications that securely authenticate, communicate — or worse yet, even when they do — we pigpen our corralled assets and make our estate fatter instead of flatter. It’s really a shame.

I’ve made the case in my “Commode Computing” presentation that one of the very first things that architects need to embrace is the following:

…by not artificially constraining the way in which we organize, segment and apply policy (i.e. “put it in a DMZ”) we can think about how design “anti-patterns” may actually benefit us…you can call them what you like, but we need to employ better methodology for “zoning.”

These trust zones or enclaves are reasonable in concept so long as we can ultimately further abstract their “segmentation” and abstract the security and compliance policy requirements by expressing policy programmatically and taking the logical business and functional use-case PROCESSES into consideration when defining, expressing and instantiating said policy.

You know…understand what talks to what and why…

A great way to think about this problem is to apply the notion of application mobility — without VM containers — and how one would instantiate a security “policy” in that context. In many cases, as we march up the stack to distributed platform application architectures, we’re not able to depend upon the “crutch” that hypervisors or VM packages have begun to give us in legacy architectures that have virtualization grafted onto them.

Since many enterprises are now just starting to better leverage their virtualized infrastructure, there *are* some good solutions (again, platform specific) that unify the physical and virtual networks from a zoning perspective, but the all-up process-driven, asset-centric (app & information) view of “policy” is still woefully lacking, especially in heterogeneous environments.

Wrapping Up…

In enterprise and SP environments where we don’t have the opportunity to start anew, it often feels like we’re so far off from this sort of capability because it requires a shift that makes software defined networking look like child’s play. Most enterprises don’t do risk-driven, asset-centric, process-mapped modelling, [and SP’s are disconnected from this,] so segmentation falls back to what we know: DMZs with VLANs, NAT, Firewalls, SSL and new protocol band-aids invented to cover gaping arterial wounds.

In environments lucky enough to think about and match the application use cases with the highly-differentiated operational models that virtualized *everything* brings to bear, it’s here today — but be prepared and honest that the vendor(s) you chose must be strategic and the interfaces between those platforms and external entities VERY well defined…else you risk software defined entropy.

I wish I had more than the 5 minutes it took to scratch this out because there’s SO much to talk about here…

It’s not really *that* incomplete of a thought, but I figure I’d get it down on vPaper anyway…be forewarned, it’s massively over-simplified.

Over the last five years or so, I’ve spent my time working with enterprises who are building and deploying large scale (relative to an Enterprise’s requirements, that is) virtualized data centers and private cloud environments.

For the purpose of this discussion, I am referring to VMware-based deployments given the audience and solutions I will reference.

To this day, I’m often shocked with regard to how many of these organizations that seek to provide contextualized security for intra- and inter-VM traffic seem to position an either-or decision with respect to the use of physical or virtual security solutions.

If you’ve seen/read the FHOTVA, you will recollect that there are many tradeoffs involved when considering the use of virtual security appliances and their integration with physical solutions. Notably, an all-virtual or all-physical approach will constrain you in one form or another from the perspective of efficacy, agility, and the impact architecturally, operationally, or economically.

The topic that has a bunch of hair on it is where I see many enterprises trending: obviating virtual solutions and using physical appliances only:

…the bit that’s missing in the picture is the external physical firewall connected to that physical switch. People are still, in this day and age, ONLY relying on horseshoeing all traffic between VMs (in the same or different VLANs) out of the physical cluster machine and to an external firewall.

Now, there are many physical firewalls that allow for virtualized contexts, zoning, etc., but that’s really dependent upon dumping trunked VLAN ports from the firewall/switches into the server and then “extending” virtual network contexts, policies, etc. upstream in an attempt to flatten the physical/virtual networks in order to force traffic through a physical firewall hop — sometimes at layer 2, sometimes at layer 3.

It’s important to realize that physical firewalls DO offer benefits over the virtual appliances in terms of functionality, performance, and some capabilities that depend on hardware acceleration, etc. but from an overall architectural positioning, they’re not sufficient, especially given the visibility and access to virtual networks that the physical firewalls often do not have if segregated.

Here’s a hint, physical-only firewall solutions alone will never scale with the agility required to service the virtualized workloads that they are designed to protect. Further, a physical-only solution won’t satisfy the needs to dynamically provision and orchestrate security as close to the workload as possible, when the workloads move the policies will generally break, and it will most certainly add latency and ultimately hamper network designs (both physical and virtual.)

…which is to say that there exists the capability to utilize virtual solutions for “east-west” traffic and physical solutions for “north-south” traffic, regardless of whether these VMs are in the same or different VLAN boundaries or even across distributed virtual switches which exist across hypervisors on different physical cluster members.

It’s probably important to mention that while the next slide is out-of-date from the perspective of the advancement of VMsafe APIs, there’s not only the ability to inject a slow-path (user mode) virtual appliance between vSwitches, but also utilize a set of APIs to instantiate security policies at the hypervisor layer via a fast path kernel module/filter set…this means greater performance and the ability to scale better across physical clusters and distributed virtual switching:

Interestingly, there also exists the capability to actually integrate policies and zoning from physical firewalls and have them “flow through” to the virtual appliances to provide “micro-perimeterization” within the virtual environment, preserving policy and topology.

There are at least three choices for hypervisor management-integrated solutions on the market for these solutions today:

VMware vShield App,

Cisco VSG+Nexus 1000v and

Juniper vGW

Note that the solutions above can be thought of as “layer 2” solutions — it’s a poor way of describing them, but think “inter-VM” introspection for workloads in VLAN buckets. All three vendors above also have, or are bringing to market, complementary “layer 3” solutions that function as virtual “edge” devices and act as a multi-function “next-hop” gateway between groups of VMs/applications (nee vDC.) For the sake of brevity, I’m omitting those here (they are incredibly important, however.)

They (layer 2 solutions) are all reasonably mature and offer various performance, efficacy and feature set capabilities. There are also different methods for plumbing the solutions and steering traffic to them…and these have huge performance and scale implications.

It’s important to recognize that the lack of thinking about virtual solutions often seem to be based largely on ignorance of need and availability of solutions.

However, other reasons surface such as cost, operational concerns and compliance issues with security teams or assessors/auditors who don’t understand virtualized environments well enough.

From an engineering and architectural perspective, however, obviating them from design consideration is a disappointing concern.

Enterprises should consider a hybrid of the two models; virtual where you can, physical where you must.

If you’ve considered virtual solutions but chose not to deploy them, can you comment on why and share your thinking with us (even if it’s for the reasons above?)

This thoughtful piece was constructed in order to point out the challenges involved in providing auditability, visibility, and transparency in service — specifically cloud computing — in which the notion of building in or bolting on security is debated.

I think this is timely. I have thought about this a couple of times with one piece aligned heavily with Chris’ thoughts:

Chris’ discussion really contrasted the delivery/deployment models against the availability and operationalization of controls:

If we’re building security in, then how do we audit the controls?

Will platform as a service (PaaS) give us a way to build security in such that it can be evaluated independently of the custom code running on it?

Further, as part of some good examples, he points out the notion that with separation of duties, the ability to apply “defense in depth” (hate that term,) and the ability to respond to new threats, the “bolt-on” approach is useful — if not siloed:

There lies the issue – bolt on security is easy to audit. There’s a separate thing, with a separate bit of config (administered by a separate bunch of people) that stands alone from the application code.

…versus building secure applications:

Code security is hard. We know that from the constant stream of vulnerabilities that get found in the tools we use every day. Auditing that specific controls implemented in code are present and effective is a big problem, and that is why I think we’re still seeing so much bolting on rather than building in.

I don’t disagree with this at all. Code security is hard. People look for gap-fillers. The notion that Chris finds limited options for bolting security on versus integrating security (building it in) programmatically as part of the security development lifecycle leaves me a bit puzzled.

This identifies both the skills and cultural gap between where we are with security and how cloud changes our process, technology, and operational approaches but there are many options we should discuss.

Thus what was interesting (read: I disagree with) is what came next wherein Chris maintained that one “can’t bolt on in the cloud”:

One of the challenges that cloud services present is an inability to bolt on extra functionality, including security, beyond that offered by the service provider. Amazon, Google etc. aren’t going to let me or you show up to their data centre and install an XML gateway, so if I want something like schema validation then I’m obliged to build it in rather than bolt it on, and I must confront the audit issue that goes with that.

While it’s true that CSP’s may not enable/allow you to show up to their DC and “…install and XML gateway,” they are pushing the security deployment model toward the virtual networking hooks, the guest based approach within the VMs and leveraging both the security and service models of cloud itself to solve these challenges.

I allude to this below, but as an example, there are now cloud services which can sit “in-line” or in conjunction with your cloud application deployments and deliver security as a service…application, information (and even XML) security as a service are here today and ramping!

While immature and emerging in some areas, I offer the following suggestions that the “bolt-on” approach is very much alive and kicking. Given that the “code security” is hard, this means that the cloud providers harden/secure their platforms, but the app stacks that get deployed by the customers…that’s the customers’ concerns and here are some options:

Introspection APIs (VMsafe)

Security as a Service (Cloudflare, Dome9, CloudPassage)

Auditing frameworks (CloudAudit, STAR, etc)

Virtual networking overlays & virtual appliances (vGW, VSG, Embrane)

Software defined networking (Nicira, BigSwitch, etc.)

Yes, some of them are platform specific and I think Chris was mostly speaking about “Public Cloud,” but “bolt-on” options are most certainly available an are aggressively evolving.

I totally agree that from the PaaS/SaaS perspective, we are poised for many wins that can eliminate entire classes of vulnerabilities as the platforms themselves enforce better security hygiene and assurance BUILT IN. This is just as emerging as the BOLT ON solutions I listed above.

As I mention in my Cloudifornication presentation, I think that from a security perspective, PaaS offers the potential of eliminating entire classes of vulnerabilities in the application development lifecycle by enforcing sanitary programmatic practices across the derivate works built upon them. I look forward also to APIs and standards that allow for consistency across providers. I think PaaS has the greatest potential to deliver this.

There are clearly trade-offs here, but as we start to move toward the two key differentiators (at least for public clouds) — management and security — I think the value of PaaS will really start to shine.

My opinion is that given the wide model of integration between various delivery and deployment models, we’re gonna need both for quite some time.

Back to Chris’ original point, the notion that auditors will in any way be able to easily audit code-based (built-in) security at the APPLICATION layer or the PLATFORM layer versus the bolt-on layer is really at the whim on the skillset of the auditors themselves and the checklists they use which call out how one is audited:

Infrastructure as a service shows us that this can be done e.g. the AWS firewall is very straightforward to configure and audit (without needing to reveal any details of how it’s actually implemented). What can we do with PaaS, and how quickly?

This is a very simplistic example (more infrastructure versus applistructure perspective) but represents the very interesting battleground we’ll be entrenched in for years to come.

In the related posts below, you’ll see I’ve written a bunch about this and am working toward ensuring that as really smart folks work to build it in, the ecosystem is encouraged to provide bolt ons to fill those gaps.

I admit I was enticed by the title of the blog and the introductory paragraph certainly reeled me in with the author creds:

This post was written with Andrew Lambeth. Andrew has been virtualizing networking for long enough to have coined the term “vswitch”, and led the vDS distributed switching project at VMware

I can only assume that this is the same Andrew Lambeth who is currently employed at Nicira. I had high expectations given the title, so I sat down, strapped in and prepared for a fire hose.

Boy did I get one…

27 paragraphs amounting to 1,601 words worth that basically concluded that server virtualization is not the same thing as network virtualization, stateful L2 & L3 network virtualization at scale is difficult and ultimately virtualizing the data plane is the easy part while the hard part of getting the mackerel out of the tin is virtualizing the control plane – statefully.*

*[These are clearly *my* words as the only thing fishy here was the conclusion…]

It seems the main point here, besides that mentioned above, is to stealthily and diligently distance Nicira as far from the description of “…could be to networking something like what VMWare was to computer servers” as possible.

This is interesting given that this is how they were described in a NY Times blog some months ago. Indeed, this is exactly the description I could have sworn *used* to appear on Nicira’s own about page…it certainly shows up in Google searches of their partners o_O

In his last section titled “This is all interesting … but why do I care?,” I had selfishly hoped for that very answer.

Sadly, at the end of the day, Lambeth’s potentially excellent post appears more concerned about culling marketing terms than hammering home an important engineering nuance:

Perhaps the confusion is harmless, but it does seem to effect how the solution space is viewed, and that may be drawing the conversation away from what really is important, scale (lots of it) and distributed state consistency. Worrying about the datapath , is worrying about a trivial component of an otherwise enormously challenging problem

This smacks of positioning against both OpenFlow (addressed here) as well as other network virtualization startups.

Given the recent influx of virtual networking solutions, many of which are OpenFlow-based, what possible in-roads and value can they hope to offer in heavily virtualized enterprise environments wherein the virtual networking is owned and controlled by VMware?

Specifically, if the only third-party VMware virtual switch to date is Cisco’s and access to this platform is limited (if at all available) to startup players, how on Earth do BigSwitch, Nicira, vCider, etc. plan to insert themselves into an already contentious environment effectively doing mindshare and relevance battle with the likes of mainline infrastructure networking giants and VMware?

If you’re answer is “OpenFlow and OpenStack will enable this access,” I’ll follow along with a question that asks how long a runway these startups have hanging their shingle on relatively new efforts (mainly open source) that the enterprise is not a typically early adopter of.

I keep hearing notional references to the problems these startups hope to solve for the “Enterprise,” but just how (and who) do they think they’re going to get to consider their products at a level that gives them reasonable penetration?

Service providers, maybe?

Enterprises…?

It occurs to me that most of these startups are being built to be acquired by traditional networking vendors who will (or will not) adopt OpenFlow when significant enterprise dollars materialize in stacks that are not VMware-centric.

Not meaning to piss anyone off, but many of these startups’ business plans are shrouded in the mystical vail of “wait and see.”

So I do.

/Hoff

Ed: To be clear, this post isn’t about “OpenFlow” specifically (that’s only one of many protocols/approaches,) but rather the penetration of a virtual networking solution into a “closed” platform environment dominated by a single vendor.

If you want a relevant analog, look at the wasteland that represents the virtual security startups that tried to enter this space (and even the larger vendors’ solutions) and how long this has taken/fared.

If you read the comments below, you’ll see people start to accidentally tease out the real answer to the question I was asking…about the value of these virtual networking solutions providers. The funny part is that despite the lack of comments from most of the startups I mention, it took Brad Hedlund (from Cisco) to recognize why I wrote the post, which is the following:

“The *real* reason I wrote this piece was to illustrate that really, these virtual networking startups are really trying to invade the physical network in virtual sheep’s clothing…”

…in short, the problem space they’re trying to solve is actually in the physical network, or more specifically bridge the gap between the two.