This is my PERSONAL blog, as as of August 1, 2011, it focuses on personal matters and various things I find to be fun.

Tuesday, June 14, 2011

Algorithmic SIEM “Correlation” Is Back?

Back in 2002 when I was at a SIEM vendor that shall remain nameless (at least until they finally die), I fell in love with algorithmic "correlation." Yes, I can write correlation rules like there is no tomorrow (and have fun doing it!), but that’s just me – I am funny that way. A lot of organizations today will rely on default correlation rules (hoping that SIEM is some kinda “improved host IDS” of yesteryear … remember those?) and then quickly realize that logs are too darn diverse across environments to make such naïve pattern matching useful for many situations. Other organizations will just start hating SIEM in general for all the false default rule alerts and fall back in the rathole of log search aka “we can figure out what happened in days , not months” mindset.

That problem becomes even more dramatic especially when they try to use mostly simple filtering rules (IF username=root AND ToD>10:00PM AND ToD<7:00AM AND Source_Country=China, THEN ALERT “Root Login During Non-Biz Hours from Foreign IP”) and not stateful correlation rules, written with their own applications in mind. As a result, you'd be stuck with ill-fitting default rules and no ability to create custom, site-specific rules or even intelligently modify the default rules to fit your use cases better. Not a good situation - well, unless you are a consultant offering rule correlation tuning services ;-)

One of the ways out, in my opinion, is in wide use of event scoring algorithms and other ruleless methods. These methods, while not without known limitations, can be extremely useful in environment where correlation rule tuning is not likely to happen, no matter how many times we say it should happen. By the way, algorithmic or "statistical" correlation has typically little to do with correlation or statistics. A more useful way to think about is weighted event scoring or weighted object (such as IP address, port, username, asset or a combination of these) scoring

So, in many cases back then people used a naïve risk scoring where:risk (for each destination IP inside) = threat (=event severity derivative) x value (=user-entered for each targeted “asset” - obviously a major Achilles heel in the real world implementations) x vulnerability (=derived from vulnerability scan results)

It mostly failed to work when used for real-time visualization (not historical profiling) and was also really noisy for alerting. But even such simplistic algorithm, however, still presents a very useful starting point to develop better methods, post-process and baseline the data, add dimensions, etc. It was also commonly not integrated with rules, extended asset data, user identity, etc.

Let’s now fast forward to 2011. People still hate the rules AND rules still remain a mainstay of SIEM technology. However, it seems like the algorithmic ruleless methods are making a comeback, with better analysis, profiling, baselining and with better rule integration. For example, this recent whitepaper from NitroSecurity (here, with registration) covers the technology they acquired when LogMatrix/OpenService crashed and now integrated into NitroESM. The paper covers some methods of event scoring that I personally know to work well. For example, a trick I used to call “2D baselining”: not just tracking the user actions over time and activities on destination assets over time, but tracking pair of user<->asset over time. So, “jsmith” might be a frequent user on “server1”, but only rarely goes to “server2”, and such pair scoring will occasionally show some fun things from the “OMG, he is really doing it!” category

So, when you think SIEM, don’t just think “how many rules?” – think “what other methods for real-time and historical event analysis do they use?”

About Me

He is a recognized security expert in the field of log management and PCI DSS compliance. He is an author of books "Security Warrior", "PCI Compliance", "Logging and Log Management" and a contributor to "Know Your Enemy II", "Information Security Management Handbook" and others. Anton has published dozens of papers on log management, correlation, data analysis, PCI DSS, security management, honeypots, etc . His blog securitywarrior.org was one of the most popular in the industry.

In addition, Anton teaches classes (including his own SANS class on log management) and presents at many security conferences across the world; he recently addressed audiences in United States, UK, Singapore, Spain, Russia and other countries. He worked on emerging security standards and served on the advisory boards of several security start-ups.

Before joining Gartner in 2011, Anton was running his own security consulting practice www.securitywarriorconsulting.com, focusing on logging and PCI DSS compliance for security vendors and Fortune 500 organizations. Dr. Anton Chuvakin was formerly a Director of PCI Compliance Solutions at Qualys. Previously, Anton worked at LogLogic as a Chief Logging Evangelist, tasked with educating the world about the importance of logging for security, compliance and operations. Before LogLogic, Anton was employed by a security vendor in a strategic product management role. Anton earned his Ph.D. degree from Stony Brook University.