I look forward to the time (hopefully soon) when an industry consortium or worldwide standards effort brings together legitimate ISVs to create a shareable whitelist for all to use.

Whitelisting is foundational to any information security protection strategy. It is key to one of my areas of research on Application Control. At the application level, the problem I see is that there are multiple, overlapping efforts to build an industry-wide database of “known good” applications.

Bit9 is an Application Control vendor that has built a significant repository with its Global Software Registry.

SignaCert is a whitelisting vendor primarily used for configuration and drift management that has built its Global Trust Repository.

The US National Drug Intelligence Center within the US Department of Justice has created HashKeeper to assist in forensic investigations (by enabling investigators to eliminate known good application and system files or to focus quickly on files/content known to be bad).
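The "eliminate known good, focus on known bad" idea behind these repositories comes down to a hash lookup. The sketch below is illustrative only: the function names, the use of SHA-256, and the in-memory sets are assumptions for the example, not a description of how HashKeeper or any vendor repository is actually implemented.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a file's contents as a hex string."""
    return hashlib.sha256(data).hexdigest()


def triage(files, known_good, known_bad):
    """Sort files into good / bad / unknown buckets by content hash.

    files:      dict mapping a file name to its bytes (hypothetical input shape)
    known_good: set of hex digests of known good files
    known_bad:  set of hex digests of known bad files
    """
    buckets = {"good": [], "bad": [], "unknown": []}
    for name, data in files.items():
        digest = sha256_hex(data)
        if digest in known_bad:
            buckets["bad"].append(name)       # focus here first
        elif digest in known_good:
            buckets["good"].append(name)      # can be eliminated from review
        else:
            buckets["unknown"].append(name)   # needs a human or further checks
    return buckets
```

Everything that lands in the "unknown" bucket is exactly the material the rest of this discussion is about.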

If anyone knows of more, please add them as a comment. The point is, this is a problem the software industry can help solve. Why do we need multiple, competing efforts to build this database? Why don’t legitimate ISVs get together and agree on a standard so that ISV-level data can be gathered directly from authors and shared as a public service? A standards group like the TCG could help define the application metadata exchange format with broad industry support.

Neil MacDonald is a vice president, distinguished analyst and Gartner Fellow in Gartner Research. Mr. MacDonald is a member of Gartner's information security and privacy research team, focusing on operating system and application-level security strategies. Specific research areas include Windows security…

Thoughts on We Need a Global Industry-wide Application Whitelist

Unfortunately, I have to disagree. The act of defining every version of every language of every binary from every vendor that should be trusted is effectively intractable. Microsoft releases patches the second Tuesday of every month. In those are numerous distinct binaries, depending on the patch. These patches are localized into every language Windows, Office, Visual Studio, Exchange, etc. support. That’s thousands of binaries per month. And that’s just from Microsoft. Add in Adobe, Apple, Sun, Oracle, and every other major vendor, and all the locales and service pack/service release iterations of each, and you are dealing with tens to hundreds of thousands of binaries per month.

Then we have to discuss the applications that are actually critical to the life of any enterprise: 1) legacy LOB applications that are either completely unmaintained or nearly abandoned, yet still mission critical, and 2) even _current_ LOB applications, where you are dealing with custom code that was never, and will never be, published to the outside world for “trusting”. And if you ask an enterprise to define the binaries that comprise that application, the exercise is intractable as well. If you miss one binary when rolling out your whitelist, you have a work stoppage, and instantly AV becomes the “go to guy” again.

AV has stood the test of time and has lasted for over 20 years because it was “good enough”. It fails today because of its latency in catching new exploits, inability to stop zero-day or targeted attacks, and inability to stop most buffer overflows – again, the chief mechanism used to infect Windows via the web browser. It has survived not because it was the right tool for the job, not even because it worked well. Instead, it provided a fair trade-off of security vs. management friction. Rarely was it the case that AV took down your system when you updated it with legitimate software. Yet today, that is becoming more frequent with blacklists. But immediately, you can see the rub here. If you have a cloudlist of software that you trust, and it can only be updated when it knows of new content, yet you must push out updates AS SOON AS THEY ARE AVAILABLE, you have a paradox. You need to wait for your cloud to be updated, or you risk wounding or killing a Windows system with binaries that weren’t in the cloud yet. This is the reason why CoreTrace built the Trusted Change mechanism we did – because the best people to decide which applications and updates should be installable (or which management tools should be able to delegate that task) are the management team you already have.

The goal of whitelisting MUST BE to be as near-zero-friction as AV, yet provide a layer of security that AV simply cannot provide in 2009. Doing that in a way that doesn’t risk a work stoppage doesn’t start with a cloud (that can be potentially poisoned, or more importantly suffers from latency), and it cannot ask an admin to sort through 15,000 binaries on a system and ask themselves, “is this a good witch, or is this a bad witch?”. It has to just work. That was, is, and will be our goal at CoreTrace. Not building a cloud, but using the tools an organization already has at hand to dramatically increase their security footprint, with as close to zero friction footprint as possible.

It is great to see this dialog on not only the need for application whitelisting, but what makes the approach viable.

While there may be some debate about the most pragmatic way to enable customers to realize the value of the Application Whitelisting approach, all of the vendors in the industry have a mutual interest in educating customers on the compelling benefits of “only allowing the good to prevent the bad.”

With the experience of successful deployments at enterprise accounts we have lived what Neil is saying – it is all about managing the list. And that list is all about trust which, in turn, is based on organization-specific policies. In fact, as our CTO likes to say, it is not a list at all, but rather a profile of what each organization chooses to allow in its environment.

Trust is subjective and based on familiarity. Putting software aside for a moment, I may trust an individual because I am more familiar with that person than you. That is, that person is a “known”. If I do not know that person, but I know you do, I may ask you about him/her, similar to doing a blind reference when vetting a candidate for a job opening. Now back to software. I will have known software and I will have unknown software – software with which I am not especially familiar, and therefore cannot easily assign trust. So then the question becomes, how do I decide if I authorize (i.e. whitelist) that software?

Starting with the known software: Company A may allow a certain application to run, Company B may not. An appropriate example of such an application is Skype. Some companies want the cost benefits and authorize its usage, while some do not want the network bandwidth hit and are concerned about security and not being able to audit these connections. In the case that I DO trust Skype, I need to be able to set a policy that not only allows the known good and trusted software to run today, but also to run tomorrow when it updates. This makes the list of the known manageable.
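A policy that lets trusted software keep running after it updates is usually keyed on something more stable than a file hash, such as the verified publisher and product name. The sketch below assumes that approach; the `SignedFile` and `PublisherRule` types and their field names are hypothetical, and the signature check itself is taken as already done elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignedFile:
    publisher: str         # publisher name taken from the code signature
    product: str
    version: str
    signature_valid: bool  # result of a signature check performed elsewhere


@dataclass(frozen=True)
class PublisherRule:
    publisher: str
    product: str
    allow: bool            # True = authorized, False = explicitly not authorized


def is_allowed(f, rules, default=False):
    """Apply organization-specific rules; the version is deliberately ignored."""
    if not f.signature_valid:
        # An unverified publisher claim cannot be matched against any rule.
        return default
    for rule in rules:
        if rule.publisher == f.publisher and rule.product == f.product:
            return rule.allow
    return default


# Company A authorizes Skype; Company B does not.
company_a = [PublisherRule("Skype", "Skype", allow=True)]
company_b = [PublisherRule("Skype", "Skype", allow=False)]
today = SignedFile("Skype", "Skype", "4.0", signature_valid=True)
tomorrow = SignedFile("Skype", "Skype", "4.1", signature_valid=True)
```

Because the rule never mentions a version, the updated binary is covered by the same decision the organization already made.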

And what about the unknowns? My two most recent examples are DVD-MP4 conversion and music metadata tagging software. IT professionals cannot possibly be familiar with all such software such that they know whether or not to trust that software to run on their endpoints. In this case what is needed is the ability to identify that software against a registry to learn more about it to determine trust. Call it a blind reference or background check on software. In this case, the database of software in the cloud is not, in fact, a global whitelist, but rather a software registry. I like the terms “cloudlisting” and “crowdlisting” and would use them as well if I didn’t have a software registry, but as we educate customers, we should not mislead on the use case for the software databases Neil references.

So, the net-net is you need to have both – policies to allow what is known and thus trusted, or not, to run, or not, and the ability to identify new and unknown software against a software registry to learn more about that software to vet trust and thus policy.

To be clear, I think a hybrid model is what is needed. Agree, there will always be LOB applications that I only have knowledge of within my organization. I’m talking about the applications that come from the outside where you need to look up in some type of centralized database to see if they are known.

Wes said: “The act of defining every version of every language of every binary from every vendor that should be trusted is effectively intractable. Microsoft releases patches the second Tuesday of every month. In those are numerous distinct binaries, depending on the patch. These patches are localized into every language Windows, Office, Visual Studio, Exchange, etc support. That’s thousands of binaries per month. And that’s just from Microsoft.”

That’s exactly my point. This is why the ISVs should directly feed this publicly available repository rather than have multiple third parties do it after the fact. This could be as simple as changing Microsoft’s software publishing process to update the global catalog at the same time the binaries are made available to the world for download (local languages and all). The problem is not intractable if the majority of ISVs that publish code participate.
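To make the "publish to the catalog at release time" idea concrete, here is a rough sketch of what an ISV's build process might emit alongside each release. No such standard format exists in this discussion, so every field name, the JSON encoding, and the choice of SHA-256 are assumptions made purely for illustration.

```python
import hashlib
import json


def build_catalog_entries(vendor, product, version, binaries):
    """Create one catalog record per shipped binary, per locale.

    binaries: dict mapping (file_name, locale) -> file bytes (hypothetical shape)
    """
    entries = []
    for file_name, locale in sorted(binaries):
        data = binaries[(file_name, locale)]
        entries.append({
            "vendor": vendor,
            "product": product,
            "version": version,
            "file_name": file_name,
            "locale": locale,
            "sha256": hashlib.sha256(data).hexdigest(),
            "size": len(data),
        })
    return entries


def to_feed(entries) -> str:
    """Serialize the records as JSON for upload to a shared catalog."""
    return json.dumps(entries, indent=2, sort_keys=True)


entries = build_catalog_entries(
    "ExampleSoft", "ExampleApp", "1.2.3",
    {("app.dll", "en-US"): b"english build", ("app.dll", "de-DE"): b"german build"},
)
feed = to_feed(entries)
```

The per-locale volume raised earlier in the thread is handled by the publisher's own build pipeline, which already knows every file it ships.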

Thanks for forking this dialog as when we really drill down on “whitelist methods” I think we’ll find three different components of the solution(s):

1. The “instrumentation” that lives client-side for determining whether the code that the platform is trying to interpret/load/run is in fact known/trusted or unknown/untrusted (definition to be addressed later) AND the linkage/policy framework for what to DO when the answer to the known/unknown question is determined.

2. And, as you state, the resource(s) ABOVE the platform to check the answer from the question generated in 1. above. This resource must be generated from both an industry-fed view (ISV’s) and a line of business (LOB) view of what is authorized to run/not run on a given platform at a given time (and potentially location, and other policy attributes).

3. And a higher-level resource where software measurements (meta-structure and high-resolution cryptographic statements of code authenticity) are kept – what you are referring to as a “Global Industry-wide Application Whitelist”. (This resource should follow the 80:20 rule where the goal is NOT to capture the entire whitelist world (this is intractable) but rather to capture high-quality, known provenance signatures of what is most commonly seen on all standard build platforms. Platform-specific and LOB-based deltas must be accommodated with the resource in 2. above.)
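The three components above can be sketched as a layered lookup: the client-side check consults the organization's own LOB resource first, then the global resource, and finally applies a policy for anything still unknown. The function name, the dictionary and set representations, and the action strings are all assumptions for this example.

```python
def decide(digest, lob_profile, global_registry, unknown_action="block"):
    """Return (action, source) for a code measurement.

    digest:          hash of the code the platform is about to load (component 1)
    lob_profile:     dict digest -> "allow" or "deny", the organization's own view (component 2)
    global_registry: set of digests with known provenance (component 3)
    unknown_action:  policy for code that neither resource knows
    """
    # Organization-specific decisions override the global view in either direction.
    if digest in lob_profile:
        return lob_profile[digest], "lob"
    # Commonly seen code with known provenance is covered by the global resource.
    if digest in global_registry:
        return "allow", "global"
    # Everything else falls through to the local policy for unknowns.
    return unknown_action, "unknown"


lob_profile = {"aaa": "allow", "bbb": "deny"}   # placeholder digests
global_registry = {"bbb", "ccc"}
```

Note that "bbb" is in the global resource but denied locally, which is the kind of platform-specific or LOB-based delta the second component is meant to carry.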

So my answer to your question is YES. It is not only desirable, but necessary to have a common global view of whitelist measurements as a resource (per 3. above).

One of the key notions that has not yet been evident in this discussion is the notion of software “provenance”. In my opinion, when the ISV’s begin to support this notion of “trusted reference” and the ability to establish code validity – this must be BOTH from customer/platform AND the supply-side perspective. This means that some chain of custody or evidence must be maintained from an ISV’s point of view that “this is the code we built”.

Provenance or Origin must be established and maintained in order to establish trust/responsibility delineation.

Only with a provenance metric set, established and maintained in the s/w supply cycle, can both the customer and supply-side needs be fully addressed.

I disagree with Wes’s notion that a global whitelist is “intractable” – it is not. It is certainly challenging, as it has never really been done before in the way that this discussion is suggesting.

It will require vendor-independent specifications for how “software measurements” are derived, stored, and maintained. And the bigger hurdle (as both you and I have alluded to) is getting the ISV’s to “play” behind these methods. If we can enable this, then IMHO the business model for it will be evident.

This gets me back to another comment that I didn’t address in the last thread – that the whitelist question is “academic”……

I didn’t at all mean to trivialize the need/value for new methods for trusted policy to be enforced on all types of platforms from servers to consumer endpoints. I agree with you on this point entirely.

My point was just to say that only WHEN the ISV’s and Platform vendors fully embrace the need for new and improved platform-intrinsic methods for software validation will this movement really hit full stride.

At that point I think you will see vendors (several of them represented in this thread including us) focus their efforts and expertise on one or more of the three buckets listed above (client actions and policy, LOB whitelist resources and global whitelist resources).

We look forward with great anticipation to seeing this space move forward on all fronts this year and next.

(P.S. I will opine on the “third-party” client/agent question in a future post perhaps).

I should point out that “known-good,” “white” and “black” are judgements relative to policy and context, not absolutes. For instance, to certain repressive governments, something accessing TOR would be “black” and email encryption with a built-in backdoor would be “white”…when deployed on citizen machines. And a vendor may have a history of producing shoddy code. Is that “known good”?

Wyatt’s comment about provenance is a key issue here. Unless I can be very, very certain about the origin of the code, I cannot judge it by my policies because I can’t be sure what it is.

Donn Parker augmented the hoary C-I-A model with 3 other attributes; authenticity is one of them (see “Parkerian hexad” in Wikipedia; “control” and “utility” are the other two). He got it right on this one.

Having a “community” list isn’t going to provide sufficient confidence that I know where something actually originated — it only tells me that a bunch of people have seen it.

Agree that “known good” is not an absolute. Trust is not binary and is indeed shades of grey. We’ve experienced something like this with signed device drivers on 64-bit Windows. Even badly written code can be digitally signed by a known vendor (the “good code gone bad” problem I have discussed), and the bad guys can get signatures and sign their code as well, which means we have to be picky about who signed the code and the process by which the certificates were issued. We are seeing a similar problem now with https and EV-Certs. The process for getting certificates was weak, and bad guys figured out they could make their sites look more legitimate with the use of https, so we had to tighten up the process.

Disagree that having a community isn’t useful. I would say that community visibility is one factor in a multi-faceted decision of trust which also includes digital signatures, the location where the code was received, how many others have seen it (or not seen it), whether a static analysis of the code shows any abnormalities, whether a multi-engine AV scan of the code shows anything known to be bad, and so on.

One advantage of community visibility is the inverse of what you point out. If a user claims they need this common application and no one has ever seen it, it becomes suspect. It’s the opposite of a blacklist where the bad guys can “fly under the radar”. With whitelisting, the stealthiness of targeted attacks works against the bad guys. If no one anywhere has ever seen this, it becomes more suspect.
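One way to picture this multi-faceted trust decision is as a simple score over the factors listed in this comment. The weights, thresholds, and field names below are invented for illustration and carry no empirical weight; the only point is that community visibility is one input among several, and that code nobody has ever seen scores lower.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    signed_by_trusted_publisher: bool
    from_trusted_location: bool
    community_sightings: int     # how many others have seen this exact file
    static_analysis_clean: bool
    av_engines_flagging: int     # engines in a multi-engine scan that flag it


def trust_score(e: Evidence) -> int:
    """Combine the factors into a score; all weights are hypothetical."""
    if e.av_engines_flagging > 0:
        return 0                 # known bad overrides everything else
    score = 0
    if e.signed_by_trusted_publisher:
        score += 3
    if e.from_trusted_location:
        score += 1
    if e.static_analysis_clean:
        score += 2
    if e.community_sightings >= 1000:
        score += 3               # widely seen
    elif e.community_sightings > 0:
        score += 1               # seen, but rarely
    return score                 # never seen anywhere adds nothing


def verdict(e: Evidence, allow_at: int = 6, review_at: int = 3) -> str:
    score = trust_score(e)
    if score >= allow_at:
        return "allow"
    if score >= review_at:
        return "review"
    return "block"
```

Under these made-up weights, a clean but unsigned file that no one has ever seen scores 2 and is blocked, while the same file with a trusted signature scores 5 and goes to review.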

[…] at the application level. Rather than take an approach solely rooted in whitelisting or building a global whitelist, Symantec is instead using the Quorum technology to focus on the vast greyspace between blacklists […]
