This is my PERSONAL blog; as of August 1, 2011, it focuses on personal matters and various things I find to be fun.

Saturday, June 20, 2009

Why No Open Source SIEM, EVER?

Here is a perfect weekend post – on SIEM :-) Ok, all this Google web traffic from people searching for “open source SIEM” (sometimes “open source SIM”, almost never “open source SEM” {Is SEM .. dead? :-)}) continues to fill my web server logs, and it finally prompted me to write this post rather than simply whine about it, like I have been doing for 3 years :-)

It all started here, when the Matasano folks (sockpuppet.org at the time), in a rare bout of punditry, proclaimed back in 2005 (!):

“A Credible Open-Source SIM

I predicted that, just as SourceFire commoditized and co-opted the IDS market, a nascent open source project would challenge SIM products like ArcSight and Cisco MARS.

Result: No Credit [A.C. – this is a later addition to their post, made when they scored their 2006 predictions]

What’s taking you guys so long? Getting spooked that all the money seems to be going to log management? That’s exactly the dynamic Snort charged in to! Get with the program!”

“There's about $100MM spent annually on products that manage and correlate logs. Guess what? None of it is hard to do. The underlying tools are there. Customers know how to do this better than the vendors do. Expect a mainstream open-source combination of Argus <http://www.qosient.com/argus/> and Sguil <http://sguil.sourceforge.net/> to own the security management conversation next year.”

Building a SIEM is fun (perfect for open source), BUT a SIEM is inherently “high-maintenance”: it requires a lot of boring, manual tasks (one example: check Cisco.com weekly for changes to the log messages of their hundreds of devices, THEN pull your hair out anyway when logs change without any documentation). Maintenance is NOT an open-source forte, and for SIEM, “no meticulous maintenance –> no value.” The open source community is not so great with eternal commitments.

To analyze logs, you need to have logs. Either you get the logging devices (expensive –> not for open source) or you get the logs. Many people said “oh, open source community will collaborate on that.” Guess what? It didn’t (attempts here, here, here (now redirects)). When log standards (CEE) emerge, this will change; today it is impossible.

Can the task of log analysis be pushed to the end users of the open source tool (after all, they are getting it for free, they can do some work…)? Yes, it can, provided there are tools to drastically simplify the logs->intelligence path (at one point, I hoped Splunk’s “Event type discovery” would do it, but it didn’t); such tools do not exist. And, sadly, normal people don’t write regexes (good joke about it). To top it off, writing parsing rules is nowhere near as much fun as writing IDS sigs or vulnerability checks – and then packet headers don’t change on you, while log headers do.
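To make the "log headers change on you" point concrete, here is a minimal sketch of one hand-written parsing rule and how it fails. The message layout, the field names and the "firmware update" that renames a keyword are all invented for illustration; this is not any real vendor's format.

```python
import re

# Hypothetical parsing rule for an invented firewall "deny" message.
# The layout below is an assumption made for this example only.
DENY_RULE = re.compile(
    r"^(?P<ts>\S+) (?P<host>\S+) DENY (?P<proto>\w+) "
    r"src=(?P<src>[\d.]+) dst=(?P<dst>[\d.]+)$"
)

def parse(line):
    """Return a dict of named fields, or None if the rule does not match."""
    m = DENY_RULE.match(line)
    return m.groupdict() if m else None

old_format = "2009-06-20T10:00:00 fw1 DENY tcp src=10.0.0.5 dst=192.0.2.7"
# The same event after an imagined update that renames one keyword.
new_format = "2009-06-20T10:00:00 fw1 DENIED tcp src=10.0.0.5 dst=192.0.2.7"

print(parse(old_format))  # fields extracted as expected
print(parse(new_format))  # None: the event now goes unparsed
```

Nothing crashes when the format changes; the rule simply stops matching, which is why somebody has to keep checking it. Multiply this by every message type of every supported device and you get the maintenance burden described above.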

A log analysis or SIEM system needs to be able to handle volume: not only live flow, but also storage. A lot of tools work well on 10MB of logs, but then again, so does the human brain. When you move to TB volumes, a lot of simple things start to require engineering marvels. Is it as hard as getting the Linux kernel (the pinnacle of open source engineering) to perform? Probably not as hard, but the OSS SIEM project creator needs to be BOTH a log expert and a performance expert.

SIEM is also a lot about integration and not just hard-core coding. I believe in an open-source correlation engine (SEC, OSSEC, general-purpose Esper), maybe in an open source parser generator, possibly in an open source data presentation UI, but definitely not in all the pieces working together, pulling log data and context data from all the required sources and then making sense of it. There are way too many moving pieces – as we all know, many SIEM deployments fail not because of crappy technology, but because of politics.
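For a sense of why the correlation piece alone is the tractable part, here is a rough sketch of one correlation rule in plain Python. It does not use the rule syntax of SEC, OSSEC or Esper; the event tuple shape, the thresholds and the user names are assumptions made up for this example.

```python
from collections import defaultdict, deque

WINDOW = 60     # seconds; illustrative value
THRESHOLD = 3   # recent failures that make a later success suspicious

def correlate(events):
    """Yield (time, user) when a success follows THRESHOLD recent failures.

    `events` is an iterable of (time_seconds, user, outcome) tuples in time
    order. This normalized shape is an assumption; real logs would have to
    be collected and parsed into it first.
    """
    failures = defaultdict(deque)
    for t, user, outcome in events:
        recent = failures[user]
        # Drop failures that fell out of the time window.
        while recent and t - recent[0] > WINDOW:
            recent.popleft()
        if outcome == "fail":
            recent.append(t)
        elif outcome == "ok":
            if len(recent) >= THRESHOLD:
                yield (t, user)
            recent.clear()

events = [
    (0, "bob", "fail"), (5, "bob", "fail"), (9, "bob", "fail"),
    (12, "bob", "ok"),
    (20, "eve", "fail"), (200, "eve", "ok"),
]
print(list(correlate(events)))
```

The rule itself is a couple of dozen lines. Getting clean, normalized `events` out of every log source and context source in the environment is the integration work that the sketch conveniently skips.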

Some people (in the same DD thread) even suggested that the reason the open source community didn’t get to tackle the above problems is simple: SIEM products aren’t really needed (Richard doesn’t have much love for them, for example), and the community will find some other way of solving the problem (“a small, useful, standalone tool will almost always be more functional and more reliable than a merit badge feature equivalent in a commercial product”). I agree with that in principle, but if part of SIEM’s value-add is "tying stuff together", then having analysts watching 10 "small, useful, standalone tools" is actually a way back, not forward.

Maybe an open source SIEM project can only support a few “right” log messages? This was a very popular view in the 90s: just filter the logs and see the important ones. But do you know why Marcus created “artificial ignorance”? ‘Cause the “filter the logs” approach doesn’t really work: you never know which are the right ones until you look at all of them.
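As a rough illustration of the idea as it is commonly described (throw away what you already know is boring, then look at whatever is left), here is a minimal sketch. The patterns and sample lines are invented, and this is not Marcus's original tooling.

```python
import re
from collections import Counter

# Patterns for messages already reviewed and judged uninteresting.
# Both the patterns and the sample lines are made up for illustration.
KNOWN_BORING = [
    re.compile(r"session opened for user \w+"),
    re.compile(r"cron\[\d+\]: .* CMD"),
]

def unusual(lines):
    """Count lines that match no known-boring pattern, with digits masked."""
    counts = Counter()
    for line in lines:
        if any(p.search(line) for p in KNOWN_BORING):
            continue
        # Mask numbers so near-identical messages group together.
        counts[re.sub(r"\d+", "#", line)] += 1
    return counts

sample = [
    "sshd[411]: session opened for user alice",
    "cron[902]: (root) CMD (run-parts /etc/cron.hourly)",
    "kernel: disk sda1 error at sector 18844",
    "kernel: disk sda1 error at sector 18845",
]
for msg, n in unusual(sample).most_common():
    print(n, msg)
```

Note the direction: nobody had to predict that a disk error was a “right” message. It surfaced because it was not on the boring list, which is the opposite of filtering for a short list of important messages.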

Sguil is not a SIEM. It is based on a different model, assumptions (=intelligent user) and use cases.

OSSEC is awesome, but also not a SIEM. It has correlation now and “wide-ish” log source support, but doesn’t measure up to SIEM in many dimensions.

OSSIM is indeed an open-source SIEM. Now that it has a full-blown corporate parent, it has potential. In fact, when I first saw it in 2005 (maybe before, not sure), it had potential too. It is just that now it has more of it!

Now, more on OSSIM: Dominic and the crew are awesome, but I think that the above considerations will prevent OSSIM from becoming widely adopted. Here is why: how many open source NIDS do you know? 94% [source: srand() :-)] of folks in security will say: one (Snort), another 3% will say two (Snort, Bro), another 2% will say three (Snort, Bro, Prelude), another 1% will say something else. Now, try that with open source SIEM: there is no “snort of SIEM” and the result will be different. IMHO this is inherent (=not a question of time) due to the incompatibility of SIEM and the open source model, shown in the five points above (maintenance, access to logs, parsing effort, volume, integration).

The OSSIM team, in “Can be OSSIM considered a SIEM? Is it enterprise ready?” (sadly, it has some misquotes of myself, mostly corrected here in this post), states that OSSIM has large and small deployments already (including ones not managed by its creators). That is why I think it has potential; the above explains why I think that it won’t spread, despite the potential.

About Me

Dr. Anton Chuvakin is a recognized security expert in the field of log management and PCI DSS compliance. He is an author of the books "Security Warrior", "PCI Compliance", "Logging and Log Management" and a contributor to "Know Your Enemy II", "Information Security Management Handbook" and others. Anton has published dozens of papers on log management, correlation, data analysis, PCI DSS, security management, honeypots, etc. His blog securitywarrior.org was one of the most popular in the industry.

In addition, Anton teaches classes (including his own SANS class on log management) and presents at many security conferences across the world; he recently addressed audiences in the United States, UK, Singapore, Spain, Russia and other countries. He worked on emerging security standards and served on the advisory boards of several security start-ups.

Before joining Gartner in 2011, Anton was running his own security consulting practice www.securitywarriorconsulting.com, focusing on logging and PCI DSS compliance for security vendors and Fortune 500 organizations. Dr. Anton Chuvakin was formerly a Director of PCI Compliance Solutions at Qualys. Previously, Anton worked at LogLogic as a Chief Logging Evangelist, tasked with educating the world about the importance of logging for security, compliance and operations. Before LogLogic, Anton was employed by a security vendor in a strategic product management role. Anton earned his Ph.D. degree from Stony Brook University.