So-called Next Generation Firewalls (NGFW) are those that extend “traditional port firewalls” with the added context of policy with application visibility and control to include user identity while enforcing security, compliance and productivity decisions to flows from internal users to the Internet.

NGFW, as defined, is a campus and branch solution. Campus and Branch NGFW solves the “inside-out” problem: applying policy from a number of known/identified users on the “inside” to a potentially infinite number of applications and services “outside” the firewall, generally connected to the Internet. They function generally as forward proxies with various network insertion strategies.

Campus and Branch NGFW is NOT a Data Center NGFW solution.

Data Center NGFW addresses the inverse of the “inside-out” problem. These firewalls solve the “outside-in” problem: applying policy from a potentially infinite number of unknown (or potentially unknown) users/clients on the “outside” to a nominally diminutive number of well-known applications and services “inside” the firewall that are exposed generally to the Internet. They function generally as reverse proxies with various network insertion strategies.

Campus and Branch NGFWs need to provide application visibility and control across potentially tens of thousands of applications, many of which are evasive.

Data Center NGFWs need to provide application visibility and control across significantly fewer well-known, managed applications, many of which are bespoke.

There are wholesale differences in performance, scale and complexity between “inside-out” and “outside-in” firewalls. They solve different problems.
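To make the two policy shapes concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `Flow` type, the function names, and the example users and applications are hypothetical and do not represent any vendor's policy model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Flow:
    user: Optional[str]  # known for inside-out flows, usually None for outside-in
    app: str             # application identified for the flow


# Inside-out (Campus & Branch): a bounded set of known users reaching an
# effectively unbounded set of applications. Policy keys on user identity
# plus application; anything not explicitly denied is allowed.
USER_APP_DENY = {("alice", "p2p-filesharing"), ("bob", "webmail")}


def inside_out_allows(flow: Flow) -> bool:
    if flow.user is None:
        return False  # hypothetical rule: unidentified inside users are blocked
    return (flow.user, flow.app) not in USER_APP_DENY


# Outside-in (Data Center): an unbounded set of unknown clients reaching a
# small set of published applications. Policy keys on the application only;
# anything not explicitly published is denied.
PUBLISHED_APPS = {"storefront", "partner-api"}


def outside_in_allows(flow: Flow) -> bool:
    return flow.app in PUBLISHED_APPS
```

Note how the defaults point in opposite directions: the inside-out check enumerates what to deny per user, while the outside-in check enumerates what is published and ignores user identity entirely.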

The things that make a NGFW supposedly “special” and different from a “traditional port firewall” in a Campus & Branch environment are largely irrelevant in the Data Center. Speaking of which, you’d find it difficult to find solutions today that are simply “traditional port firewalls”; the notion that firewalls integrated with IPS, UTM, ALGs, proxies, integrated user authentication, application identification/granular control (AVC), etc., are somehow incapable of providing the same outcome is now largely a marketing distinction.

While both sets of NGFW solutions share a valid deployment scenario at the “edge” or perimeter of a network (C&B or DC,) a further differentiation in DC NGFW is the notion of deployment in the so-called “core” of a network. The requirements in this scenario mean comparing the deployment scenarios is comparing apples and oranges.

Firstly, the notion of a “core” is quickly becoming an anachronism from the perspective of architectural references, especially given the advent of collapsed network tiers and fabrics as well as the impact of virtualization, cloud and network virtualization (née SDN) models. Shunting a firewall into these models is often difficult, no matter how many interfaces it has. Flows are also asymmetric and oftentimes stateless.

Traditional Data Center segmentation strategies are becoming a blended mix of physical isolation (usually for compliance and/or peace of mind o_O) with a virtualized overlay provided in the hypervisor and/or virtual appliances. Traffic patterns have shifted, too: majority machine-to-machine flows moving east-west within intra-enclave “pods” are far more common. Dumping all flows through one firewall (or a cluster of them) at the “core” does what, exactly, besides adding latency and oftentimes obscured or unnecessary inspection?

Add to this the complexity of certain verticals in the DC where extreme low-latency “firewalls” are needed with requirements at 5 microseconds or less. The sorts of things people care about enforcing from a policy perspective aren’t exactly “next generation.” Or, then again, how about DC firewalls that work at the mobile service provider eNodeB, mobile packet core or Gi with specific protocol requirements not generally found in the “Enterprise?”

In these scenarios, claims that a Campus & Branch NGFW is tuned to defend against “outside-in” application level attacks against workloads hosted in a Data Center are specious at best. Slapping a bunch of those Campus & Branch firewalls together in a chassis and calling it a Data Center NGFW invokes ROFLcoptr.

Show me how a forward-proxy optimized C&B NGFW deals with a DDoS attack (assuming the pipe isn’t flooded in the first place.) Show me how a forward-proxy optimized C&B NGFW deals with application level attacks manipulating business logic and webapp attack vectors across known-good or unknown inputs.

They don’t. So don’t believe the marketing.

I haven’t even mentioned the operational model and expertise deltas needed to manage the two. Or integration between physical and virtual zoning, or on/off-box automation and visibility to orchestration systems such that policies are more dynamic and “virtualization aware” in nature…

In my opinion, NGFW is being redefined by the addition of functionality that again differentiates C&B from DC based on use case. Here are JUST two of them:

C&B NGFW is becoming what I call C&B NGFW+, specifically the addition of advanced anti-malware (AAMW) capabilities at the edge to detect and prevent infection as part of the “inside-out” use case. This includes adjacent solutions that include other components and delivery models.

DC NGFW is becoming DC NGFW+, specifically the addition of (web) application security capabilities and DoS/DDoS capabilities to prevent (generally) externally-originated attacks against internally-hosted (web) applications. This, too, requires the collaboration of other solutions specifically designed to enable security in this use case.

There are hybrid models that often take BOTH solutions to adequately protect against client infection, distribution and exploitation in the C&B to prevent attacks against DC assets connected over the WAN or a VPN.

Pretending both use cases are the same is farcical.

It’s unlikely you’ll see a shift in analyst “Enchanted Dodecahedrons” relative to functionality/definition of NGFW because…strangely…people aren’t generally buying Campus and Branch NGFW for their datacenters because they’re trying to solve different problems. At different levels of scale and performance.

A Campus and Branch NGFW is “No Good For Workloads” in the Data Center.

Many people who may only casually read my blog or peer at the timeline of my tweets may come away with the opinion that I suffer from confirmation bias when I speak about security and Cloud.

That is, many conclude that I am pro Private Cloud and against Public Cloud.

I find this deliciously ironic and wildly inaccurate. However, I must also take responsibility for this, as anytime one threads the needle and attempts to present a view from both sides with regard to incendiary topics without planting a polarizing stake in the ground, it gets confusing.

Let me clear some things up.

Digging deeper into what I believe, one would actually find that my blog, tweets, presentations, talks and keynotes highlight deficiencies in current security practices and solutions on the part of providers, practitioners and users in both Public AND Private Cloud, and in my own estimation, deliver an operationally-centric perspective that is reasonably critical and yet sensitive to emergent paths as well as the well-trodden path behind us.

I’m not a developer. I dabble in little bits of code (interpreted and compiled) for humor and to try and remain relevant. Nor am I an application security expert for the same reason. However, I spend a lot of time around developers of all sorts, those that write code for machines whose end goal isn’t to deliver applications directly, but rather help deliver them securely. Which may seem odd as you read on…

The name of this blog, Rational Survivability, highlights my belief that the last two decades of security architecture and practices, while useful in foundation, require a rather aggressive tune-up of priorities.

Our trust models, architecture, and operational silos have not kept pace with the velocity of the environments they were initially designed to support and unfortunately as defenders, we’ve been outpaced by both developers and attackers.

Since we’ve come to the conclusion that there’s no such thing as perfect security, “survivability” is a better goal. Survivability leverages “security” and is ultimately a subset of resilience but is defined as the “…capability of a system to fulfill its mission, in a timely manner, in the presence of attacks, failures, or accidents.” You might be interested in this little ditty from back in 2007 on the topic.

Sharp readers will immediately recognize the parallels between this definition of “survivability,” how security applies within context, and how phrases like “design for failure” align. In fact, this is one of the calling cards of a company that has become synonymous with (IaaS) Public Cloud: Amazon Web Services (AWS.) I’ll use them as an example going forward.

So here’s a line in the sand that I think will be polarizing enough:

I really hope that AWS continues to gain traction with the Enterprise. I hope that AWS continues to disrupt the network and security ecosystem. I hope that AWS continues to pressure the status quo and I hope that they do it quickly.

Why?

Almost a decade ago, the Open Group’s Jericho Forum published their Commandments. Designed to promote a change in thinking and operational constructs with respect to security, what they presciently released upon the world describes a point at which one might imagine taking one’s most important assets and connecting them directly to the Internet and the shifts required to understand what that would mean to “security”:

The scope and level of protection should be specific and appropriate to the asset at risk.

Security mechanisms must be pervasive, simple, scalable, and easy to manage.

Assume context at your peril.

Devices and applications must communicate using open, secure protocols.

All devices must be capable of maintaining their security policy on an un-trusted network.

All people, processes, and technology must have declared and transparent levels of trust for any transaction to take place.

Mutual trust assurance levels must be determinable.

Authentication, authorization, and accountability must interoperate/exchange outside of your locus/area of control.

Access to data should be controlled by security attributes of the data itself.

Data privacy (and security of any asset of sufficiently high value) requires a segregation of duties/privileges.

By default, data must be appropriately secured when stored, in transit, and in use.

These seem harmless enough today, but were quite unsettling when paired with the notion of “de-perimeterization,” which was often misconstrued to mean the immediate disposal of firewalls. Many security professionals appreciated the commandments for what they expressed, but the design patterns, availability of solutions and belief systems of traditionalists constrained traction.

Interestingly enough, now that the technology, platforms, and utility services have evolved to enable these sorts of capabilities, and in fact have stressed our approaches to date, these exact tenets are what Public Cloud forces us to come to terms with.

If one were to look at what public cloud services like AWS mean when aligned to traditional “enterprise” security architecture, operations and solutions, and map that against the Jericho Forum’s Commandments, it enables such a perfect rethink.

Instead of being focused on implementing “security” to protect applications and information based at the network layer — which is more often than not blind to both, contextually and semantically — public cloud computing forces us to shift our security models back to protecting the things that matter most: the information and the conduits that traffic in them (applications.)
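One way to picture a control that follows the information rather than the network is a check driven entirely by attributes carried with the data, in the spirit of the commandment that access should be controlled by security attributes of the data itself. The sketch below is an assumption-laden illustration: `Document`, `Requester`, `may_read` and the classification labels are hypothetical names, not part of the Jericho Forum text or any product.

```python
from dataclasses import dataclass, field
from typing import FrozenSet


@dataclass(frozen=True)
class Document:
    body: str
    classification: str  # hypothetical labels: "public" or "restricted"
    allowed_roles: FrozenSet[str] = field(default_factory=frozenset)


@dataclass(frozen=True)
class Requester:
    name: str
    roles: FrozenSet[str]
    authenticated: bool


def may_read(doc: Document, who: Requester) -> bool:
    """Decide access from the data's own attributes and the requester's
    identity. No IP address, subnet or zone appears anywhere in the check."""
    if doc.classification == "public":
        return True
    if not who.authenticated:
        return False
    # Restricted data: the requester must hold at least one allowed role.
    return bool(doc.allowed_roles & who.roles)
```

The point of the sketch is what is missing: the decision never asks where on the network the request came from.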

As networks become more abstracted, existing security models become more abstracted, too. This means that we must think about security programmatically, embedded as a functional delivery requirement of the application.

“Security” in complex, distributed and networked systems is NOT a tidy simple atomic service. It is, unfortunately, represented as such because we choose to use a single noun to represent an aggregate of many sub-services, shotgunned across many layers, each with its own context, metadata, protocols and consumption models.

As the use cases for public cloud obscure and abstract these layers, flattening them, we’re left with the core of that on which we should focus:

Build secure, reliable, resilient, and survivable systems of applications, comprised of secure services, atop platforms that are themselves engineered to do the same in a way in which the information which transits them inherits these qualities.

So if Public Cloud forces one to think this way, how does one relate this to practices of today?

Frankly, enterprise (network) security design patterns are a crutch. The screened-subnet DMZ patterns with perimeters are outmoded. As Gunnar Peterson eloquently described, our best attempts at “security” over time are always some variation of firewalls and SSL. This is the sux0r. Importantly, this is not stated to blame anyone or suggest that a bad job is being done, but rather that a better one can be.

It’s not like we don’t know *what* the problems are, we just don’t invest in solving them as long term projects. Instead, we deploy compensation that defers what is now becoming more inevitable: the compromise of applications that are poorly engineered and defended by systems that have no knowledge or context of the things they are defending.

We all know this, but yet looking at most private cloud platforms and implementations, we gravitate toward replicating these traditional design patterns logically after we’ve gone to so much trouble to articulate our way around them. Public clouds make us approach what, where and how we apply “security” differently because we don’t have these crutches.

Either we learn to walk without them or simply not move forward.

Now, let me be clear. I’m not suggesting that we don’t need security controls, but I do mean that we need a different and better application of them at a different level, protecting things that aren’t tied to physical topology or addressing schemes…or operating systems (inclusive of things like hypervisors, also.)

I think we’re getting closer. Beyond infrastructure as a service, platform as a service gets us even closer.

Interestingly, at the same time we see the evolution of computing with Public Cloud, networking is also undergoing a renaissance, and as this occurs, security is coming along for the ride. Because it has to.

As I was writing this blog (ironically in the parking lot of VMware awaiting the start of a meeting to discuss abstraction, networking and security,) James Staten (Forrester) tweeted something from @Werner Vogels’ keynote at AWS re:Invent:

Werner: “There’s no excuse not to use fine grained security to make your apps secure from the start.” Echoing @kindervag Zero Trust

So while I may have been, and will continue to be, a thorn in the side of platform providers to improve the “survivability” capabilities to help us get from here to there, I reiterate the title of this scribbling: Amazon Web Services (AWS) Is the Best Thing To Happen To Security & I Desperately Want It To Succeed.

I trust that’s clear?

/Hoff

P.S. There’s so much more I could/should write, but I’m late for the meeting 🙂

Every day for the last week or so after their launch, I’ve been asked left and right about whether I’d spoken to CloudPassage and what my opinion was of their offering. In full disclosure, I spoke with them when they were in stealth almost a year ago and offered some guidance as well as the day before their launch last week.

Disappointing as it may be to some, this post isn’t really about my opinion of CloudPassage directly; it is, however, the reaffirmation of the deployment & delivery models for the security solution that CloudPassage has employed. I’ll let you connect the dots…

Specifically, in public IaaS clouds where homogeneity of packaging, standardization of images and uniformity of configuration enables scale, security has lagged. This is mostly due to the fact that for a variety of reasons, security itself does not scale (well.)

In an environment where the underlying platform cannot be counted upon to provide “hooks” to integrate security capabilities in at the “network” level, all that’s left is what lies inside the VM packaging:

If we focus on the first item in that list, you’ll notice that generally to effect policy in the guest, you must have a footprint on said guest — however thin — to provide the hooks that are needed to either directly effect policy or redirect back to some engine that offloads this functionality. There’s a bit of marketing fluff associated with using the word “agentless” in many applications of this methodology today, but at some point, the endpoint needs some sort of “agent” to play*

So that’s where we are today. The abstraction offered by virtualized public IaaS cloud platforms is pushing us back to the guest-centric-based models of yesteryear.

This will bring challenges with scale, management, efficacy, policy convergence between physical and virtual and the overall API-driven telemetry driven by true cloud solutions.

Finally, since I used them for eyeballs, please do take a look at CloudPassage — their first (free) offerings are based upon leveraging small footprint Linux agents and a cloud-based SaaS “grid” to provide vulnerability management, and firewall/zoning in public cloud environments.

/Hoff

* There are exceptions to this rule depending upon *what* you’re trying to do, such as anti-malware offload via a hypervisor API, but this is not generally available to date in public cloud. This will, I hope, one day soon change.

There are lots of reasons one might use to illustrate why operationalizing security — both from the human and technology perspectives — doesn’t scale.

I’ve painted numerous pictures highlighting the cyclical nature of technology transitions, the supply/demand curve related to threats, vulnerabilities, technology and compensating controls and even relevant anecdotes involving the intersection of Moore’s and Metcalfe’s laws. This really was a central theme in my Cloudinomicon presentation; “idempotent infrastructure, building survivable systems and bringing sexy back to information centricity.”

Batting around how public “commodity” cloud solutions forces us to re-evaluate how, where, why and who “does” security was an interesting journey. Ultimately, it comes down to architecture and poking at the sanctity of models hinged on an operational premise that may or may not be as relevant as it used to be.

However, I think the most poignant and yet potentially obvious answer to the “why doesn’t security scale?” question is the fact that security products, by design, don’t scale because they have not been created to allow for automation across almost every aspect of their architecture.

Automation and the interfaces (read: APIs) by which security products ought to be provisioned, orchestrated, and deployed are simply lacking in most security products.
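As a rough illustration of what an automatable interface would enable, the sketch below computes the changes needed to move a device from its current rule set to a desired one, so that re-running it against an already-correct device does nothing. The `Rule` type and the `plan_changes` and `apply_changes` functions are hypothetical; no real product API is implied, and a real device would need its own calls to read and write rules.

```python
from typing import List, NamedTuple, Set, Tuple


class Rule(NamedTuple):
    src: str
    dst: str
    port: int
    action: str  # "allow" or "deny"


def plan_changes(current: Set[Rule], desired: Set[Rule]) -> Tuple[List[Rule], List[Rule]]:
    """Return (rules to add, rules to remove) to reach the desired state."""
    to_add = sorted(desired - current)
    to_remove = sorted(current - desired)
    return to_add, to_remove


def apply_changes(current: Set[Rule], to_add: List[Rule], to_remove: List[Rule]) -> Set[Rule]:
    """Stand-in for pushing the plan to a device: returns the new rule set."""
    return (set(current) - set(to_remove)) | set(to_add)
```

Because the plan is derived from state rather than from a sequence of operator clicks, an orchestration system could run it on every change without a human pushing the go button.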

Yes, there exist security products that are distributed but they are still managed, provisioned and deployed manually, generally using a management hub-spoke model that doesn’t lend itself to automated “anything” that does not otherwise rely upon bubble-gum and baling wire scripting…

Sure, we’ve had things like SNMP as a “standard interface” for “management” for a long while 😉 We’ve had common ways of describing threats and vulnerabilities. Recently we’ve seen XML-based APIs emerge as a function of the latest generation of (mostly virtualized) firewall technologies, but most products still rely upon stand-alone GUIs, CLIs, element managers and a meat cloud of operators to push the go button (or reconfigure.)

Really annoying.

Alongside the lack of standard API-based management planes, control planes are largely proprietary and the output for correlated event-driven telemetry at all layers of the stack is equally lacking. Of course the applications and security layers that run atop infrastructure are still largely discrete thus making the problem more difficult.

The good news is that virtualization in the enterprise and the emergence of the cultural and operational models predicated upon automation are starting to influence product roadmaps in ways that will positively affect the problem space described above but we’ve got a long haul as we make this transition.

Security vendors are starting to realize that they must retool many of their technology roadmaps to deal with the impact of dynamism and automation. Some, not all, are discovering painfully the fact that simply creating a virtualized version of a physical appliance doesn’t make it a virtual security solution (or cloud security solution) in the same way that moving an application directly to cloud doesn’t necessarily make it a “cloud application.”

In the same way that one must often re-write or specifically design applications “designed” for cloud, we have to do the same for security. Arguably there are things that can and should be preserved; the examples of the basic underpinnings such as firewalls that at their core don’t need to change but their “packaging” does.

I’m privy to lots of the underlying mechanics of these activities — from open source to highly-proprietary — and I’m heartened by the fact that we’re beginning to make progress. We shouldn’t have to make a distinction between crafting and deploying security policies in physical or virtual environments. We shouldn’t be held hostage by the separation of application logic from the underlying platforms.

I use my “Security Hamster Sine Wave of Pain” to illustrate the cyclical nature of security investment and deployment models over time and how disruptive innovation and technology impacts the flip-flop across the horizon of choice.

To wit: most mass-market Public Cloud providers such as Amazon Web Services rely on highly-abstracted and limited exposure of networking capabilities. This means that most traditional network-based security solutions are impractical or non-deployable in these environments.

Network-based virtual appliances which expect generally to be deployed in-line with the assets they protect are at a disadvantage given their topological dependency.

So what we see are security solution providers simply re-marketing their network-based solutions as host-based solutions instead…or confusing things with Barney announcements.

Snort and Sourcefire Vulnerability Research Team(TM) (VRT) rules are now available through the Amazon Elastic Compute Cloud (Amazon EC2) in the form of an Amazon Machine Image (AMI), enabling customers to proactively monitor network activity for malicious behavior and provide automated responses.

Leveraging Snort installed on the AMI, customers of Amazon Web Services can further secure their most critical cloud-based applications with Sourcefire’s leading protection. Snort and Sourcefire(R) VRT rules are also listed in the Amazon Web Services Solution Partner Directory, so that users can easily ensure that their AMI includes the latest updates.

As far as I can tell, this means you can install a ‘virtual appliance’ of Snort/Sourcefire as a standalone AMI, but there’s no real description on how one might actually implement it in an environment that isn’t topologically-friendly to this sort of network-based implementation constraint.*

Since you can’t easily “steer traffic” through an IPS in the model of AWS, can’t leverage promiscuous mode or taps, what does this packaging implementation actually mean? Also, if one has a few hundred AMIs which contain applications spread out across multiple availability zones/regions, how does a solution like this scale (from both a performance and management perspective?)

Ultimately, expect that Public Cloud will force the return to host-based HIDS/HIPS deployments — the return to agent-based security models. This poses just as many operational challenges as those I allude to above. We *must* have better ways of tying together network and host-based security solutions in these Public Cloud environments that make sense from an operational, cost, and security perspective.

* I “spoke” with Marty Roesch on the Twitter and he filled in the gaps associated with how this version of Snort works – there’s a host-based packet capture element with a “network” redirect to a stand-alone AMI:

I’ve covered this before in more complex terms, but I thought I’d reintroduce the topic due to a very relevant discussion I just had recently (*cough cough*)

So here’s an interesting scenario in virtualized and/or Cloud environments that make use of virtual appliances to provide security capabilities*:

Since virtual appliances (VAs) are just virtual machines (VMs) what happens when a SysAdmin spins down or moves one that happens to be your shiny new firewall protecting your production VMs behind it, accidentally or maliciously? Brings new meaning to the phrase “failing closed.”

Without getting into the vagaries of vendor specific mobility-enabled/enabling technologies, one of the issues with VMs/VAs is that there’s not really a good way of designating one as being “more important” or functionally differentiated such as “security” or “critical application” that would otherwise ensure a higher priority for service availability (read: don’t spin this down unless…) or provide a topological dependency hierarchy in virtualized network constructs.
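Here is a minimal sketch of the kind of designation that is missing: VMs carry a role tag and a list of what they protect, and a power-off request for a security appliance is refused while anything behind it is still running. The `VM` type, the `role` tag, and `can_power_off` are assumptions made up for illustration and are not features of any particular hypervisor or cloud platform.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VM:
    name: str
    role: str = "workload"  # hypothetical tag: "workload" or "security"
    running: bool = True
    protects: List[str] = field(default_factory=list)  # names of VMs behind this one


def can_power_off(target: str, inventory: Dict[str, VM]) -> bool:
    """Allow power-off unless the target is a security appliance that still
    has running VMs depending on it."""
    vm = inventory[target]
    if vm.role != "security":
        return True
    return not any(inventory[name].running for name in vm.protects)
```

A platform that understood this dependency could at least force an explicit override before the firewall in front of production goes dark.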

Unlike physical environments where system administrators (servers) are segregated from access to network and security appliances, this isn’t the case in virtual environments. In Cloud environments (especially public, multi-tenant) where we are often reliant only upon virtual security capabilities since we have no option for physical alternatives, this is an interesting corner case.

We’ve talked a lot about visibility, audit and policy management in virtual environments and this is a poignant example.

So here’s an interesting spin on de/re-perimeterization…if people think we cannot achieve and cannot afford to wait for secure operating systems, secure protocols and self-defending information-centric environments but need to "secure" their environments today, I have a simple question supported by a simple equation for illustration:

For the majority of mobile and internal users in a typical corporation who use the basic set of applications:

Assume a company that: …fits within the 90% of those who still have data centers, isn’t completely outsourced/off-shored for IT and supports a remote workforce that uses Microsoft OS and the usual suspect applications and doesn’t plan on utilizing distributed grid computing and widespread third-party SaaS

You Get: Less Risk. Less Cost. Better Control Over Data. More "Secure" Operations. Better Resilience. Assurance of Information. Simplified Operations. Easier Backup. One Version of the Truth (data.)

I really just don’t get why we continue to deploy and are forced to support remote platforms we can’t protect, allow our data to inhabit islands we can’t control and at the same time admit the inevitability of disaster while continuing to spend our money on solutions that can’t possibly solve the problems.

If we’re going to be information centric, we should take the first rational and reasonable steps toward doing so. Until the operating systems are more secure, the data can self-describe and cause the compute and network stacks to "self-defend," why do we continue to focus on the endpoint, which is a waste of time?

If we can isolate and reduce the number of avenues of access to data and leverage dumb presentation platforms to do it, why aren’t we?

…I mean besides the fact that an entire industry has been leeching off this mess for decades…

I’ll Gladly Pay You Tuesday For A Secure Solution Today…

The technology exists TODAY to centralize the bulk of our most important assets and allow our workforce to accomplish their goals and the business to function just as well (perhaps better) without the need for data to actually "leave" the data centers in whose security we have already invested so much money.

Many people are already doing that with their servers through the adoption of virtualization. Now they need to do it with their clients.

The only reason we’re now going absolutely stupid and spending money on securing endpoints in their current state is because we’re CAUSING (not just allowing) data to leave our enclaves. In fact with all this blabla2.0 hype, we’ve convinced ourselves we must.

Hogwash. I’ve posted on the consumerization of IT where companies are allowing their employees to use their own compute platforms. How do you think many of them do this?

Relax, Dude…Keep Your Firewalls…

In the case of centralized computing and streamed desktops to dumb/thin clients, the "perimeter" still includes our data centers and security castles/moats, but also encapsulates a streamed, virtualized, encrypted, and authenticated thin-client session bubble. Instead of worrying about the endpoint, it’s nothing more than a flickering display with a keyboard/mouse.

Let your kid use Limewire. Let Uncle Bob surf pr0n. Let wifey download spyware. If my data and applications don’t live on the machine and all the clicks/mouseys are just screen updates, what do I care?

Yup, you can still use a screen scraper or a camera phone to use data inappropriately, but this is where balancing risk comes into play. Let’s keep the discussion within the 80% of reasonable factored arguments. We’ll never eliminate 100% and we don’t have to in order to be successful.

Sure, there are exceptions and corner cases where data *does* need to leave our embrace, but we can eliminate an entire class of problem if we take advantage of what we have today and stop this endpoint madness.

This goes for internal corporate users who are chained to their desks and not just mobile users.

I had an interesting email this last week from a former co-worker that I found philosophically interesting (if not alarming.) It was slightly baited, but the sender is a smart cookie who was obviously looking for a little backup.

Not being one to shy away from discourse (or a good old-fashioned geek debate on security philosophy) I pondered the topic.

Specifically, the query posed was centered on a suggested diametrically-opposed set of opinions on how, if at all, IPS devices and firewalls ought to behave differently when they fail:

I was having a philosophical discussion with [He who shall not be named] today about uptime expectations of IPS vs. Firewall. The discussion was in reference to a security admin's expectation of IPS "upness" vs. Firewall's.

Basic question: if a firewall goes down we naturally expect it to BLOCK all traffic. However, if an IPS goes down, the prevailing theory is that the IPS should ALLOW all traffic, or in other words fail open.

[He who shall not be named] says this is because best practices say that a firewall is a default DENY ALL device, whereas an IPS is a default ALLOW ALL device.

My thinking is trying to be a little more progressive. If Firewalls protect at Layer 3 and IPSes at L4-7, then why would you open yourself up at L4-7 when the device fails? I know that the concept of "firewall" is morphing these days especially to include more L4-7 inspection. But the question is the same. Are security admins starting to consider protocol and payload analysis as important as IP and Port protection? Or are we all still playing with sticks and fire in the mud?

I know you're all focused on virtualization these days, but how about a good old religious firewall debate!
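For anyone who wants the two behaviors in the email side by side, here is a small sketch of an in-line device whose failure mode is a configuration choice. `InlineDevice`, `FailMode`, and the toy inspection functions in the usage note are hypothetical and only model the decision being debated, not how any real firewall or IPS is built.

```python
from enum import Enum
from typing import Callable


class FailMode(Enum):
    OPEN = "open"      # on failure, pass all traffic uninspected
    CLOSED = "closed"  # on failure, block all traffic


class InlineDevice:
    def __init__(self, inspect: Callable[[str], bool], fail_mode: FailMode):
        self.inspect = inspect      # returns True if the packet may pass
        self.fail_mode = fail_mode
        self.healthy = True

    def forward(self, packet: str) -> bool:
        """Return True if the packet is forwarded, False if it is dropped."""
        if not self.healthy:
            return self.fail_mode is FailMode.OPEN
        return self.inspect(packet)
```

For example, `InlineDevice(lambda p: "exploit" not in p, FailMode.OPEN)` models the IPS position in the email and `InlineDevice(lambda p: p.startswith("tcp/443"), FailMode.CLOSED)` models the firewall position; setting `healthy = False` on each shows what an outage does to traffic.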

I responded to this email with my own set of beliefs and foundational arguments which challenged several of the statements above, but I’m interested in two things from you, dear reader, and hope you’ll comment back with your opinions:

1. Do you recognize that there are two valid perspectives here? Would you fail open on one and closed on another?

2. If your answer to question #1 is yes, which do you support and why?

You can assume, for the sake of argument, that you have only a firewall, only an IPS or both devices in-line with one another. Talk amongst yourselves…