Just a fun observation from a CSO Summit about data theft "Russian-style."

So, you are a CSO for a major org (say a government agency, a bank or an Internet provider); you walk down the street and pass a typical street vendor selling books, software, etc. Suddenly you see "a database on DVD" for sale. You look closely and - oops! - it is your customer database with names, passport numbers, addresses, etc. Fun! :-)

Sunday, March 23, 2008

Am I a leading visionary in the field of log management? :-) Who cares - I will now pontificate as if I am :-) It is about time: specifically, timing logs. As I said in my Log Trust and Protecting Logs from Admins posts, the issue of trust is critical in the logging world. After all, logs = accountability; and the latter is unthinkable without trust. If we are to at least pretend that logs objectively record events and user actions, we need to unambiguously establish WHAT happened and WHEN. This post deals with the 'WHEN' issue.

So, can we trust that the time stamp in the log file or the one added by the log management system correctly describes when the event actually happened?

Let's start by locating the timestamps in logs. Most log formats contain a timestamp: file-based logs (web, application, some security gear, etc.), syslog, Windows event logs, database audit tables, proprietary formats and so on. In fact, I once saw somebody use a timestamp to define logs as "timed records of IT activity." So, time is critical for logs being, well, logs :-) At this point it is worthwhile to note that file-based logs will contain a timestamp IN the file, while syslog records arriving over a UDP or TCP port 514 connection are usually timestamped upon arrival BY the receiving syslog daemon (using its own "knowledge" of time) - and that timestamp is what shows up in the syslog files in /var/log.
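To make the "timestamped upon arrival BY the syslog daemon" point concrete, here is a tiny Python sketch (a toy, not a real syslog daemon; the function names are mine):

```python
import socket
from datetime import datetime, timezone

def stamp_on_arrival(raw_message, source_ip, clock=None):
    # A syslog daemon stamps each record with the RECEIVER's clock at the
    # moment the datagram arrives; whatever the sender thought the time
    # was never makes it into the line written to /var/log.
    arrival = clock if clock is not None else datetime.now(timezone.utc)
    return f"{arrival:%b %d %H:%M:%S} {source_ip} {raw_message}"

def receive_loop(port=514):
    # Minimal UDP receive loop (sketch only; binding to port 514 needs root).
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("0.0.0.0", port))
        while True:
            data, peer = sock.recvfrom(8192)
            print(stamp_on_arrival(data.decode(errors="replace"), peer[0]))
```

In other words, if the message sat in a forwarder's buffer for an hour, the stamped time is an hour wrong - and nothing in the record tells you so.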

Let's assess whether this "in-log timestamp" provides an adequate way of timing the actual event being logged. Answering this question is important for investigations and troubleshooting, but becomes nearly a matter of life and death in the case of log forensics.

Here are some fun cases and issues to consider:

First, what are the chances of a completely false timestamp in logs (BTW, today is Jan 1, 1970!)? When might that happen? Typically when a logging system's own clock is reset or not set correctly. This timestamp clearly should NOT be trusted.

Second, we can say that it’s always 5PM somewhere: in other words, what time zone are your logs in? EST? PDT? GMT? UTC? Or any of more than 24 other possibilities. If you have no idea, you should not trust the timestamp.

Third, are you in drift? Is your system clock? Those pesky drift seconds turn into minutes, which then work to undermine the accuracy of timing the records (and thus your certainty and trust in evidence quality).

Fourth, syslog forwarder mysteries are plentiful: some syslog messages will be delayed in transit and then be timestamped by the final recipient daemon, thus completely losing track of when the event was originally logged. Admittedly, this delayed syslog is rare, but as more people employ buffering syslog daemons (e.g. syslog-ng), it might happen more often.

Fifth, more esoteric, but still real (and really annoying): some system logs will contain two timestamps. If you don't possess in-depth knowledge of the specific log, confusion can cut into trust as well (so, which timestamp should I use?).

Sixth, most people will not think that they could fall for something this stupid: 24- vs 12-hour time. However, when facing an unknown (and poorly designed!) log format, beware that 5:17 might well be 17:17...

Finally, if you know that something got logged at 5:17AM, then when did it actually happen? Beware of "log lag!" This issue is actually too tricky to do justice here... The simplest example is a process that leaves a log record when it exits, not when it starts, possibly days earlier (thus creating the log lag).
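Several of the cases above (the missing year, the unknown time zone, the 12- vs 24-hour trap) come down to one fact: a classic syslog timestamp simply does not carry enough information, so the parser has to assume the rest. A quick Python illustration (a sketch; it assumes an RFC 3164-style "Mar 23 05:17:02" prefix):

```python
from datetime import datetime, timezone

def parse_syslog_timestamp(line, assumed_year, assumed_tz):
    # Classic syslog lines start with e.g. "Mar 23 05:17:02 host sshd[42]: ..."
    # That format carries NO year and NO time zone, so BOTH must be assumed
    # by whoever parses the log - two separate chances to get the time wrong.
    stamp = datetime.strptime(line[:15], "%b %d %H:%M:%S")
    return stamp.replace(year=assumed_year, tzinfo=assumed_tz)

# And the 12- vs 24-hour trap: "05:17 PM" read with the 12-hour convention
# is 17:17 - exactly twelve hours away from a naive 24-hour reading.
afternoon = datetime.strptime("05:17 PM", "%I:%M %p")
```

Every `assumed_*` parameter above is a decision somebody has to make - and defend, if the log ever ends up in court.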

As we dive into more issues with timing logs, we also need to think about sequence timing and absolute timing. The sequence of logged events is a critical fact! Miss the sequence and the whole "house of cards" comes down... But absolute time is also important! Can we be assured of both all the time? (hint: no)

So, when you look at logs next time and you see a timestamp there - start thinking about all this :-)

Thursday, March 20, 2008

Why the sudden blogging frenzy? Well, I am sitting here in BMI Lounge at Heathrow waiting for a flight to Moscow (having just crossed the ocean and having been blessed with an upgrade to First :-)) and I have time, Internet access and my "to_blog" list :-)

So, here is one more noteworthy piece, which contains a bizarre quote: "And then there’s the fact that not many companies are aware of the need for log management as an element of compliance."

Really? Is anybody really that ... you know ... dim? I really want to get a copy of a "PCI Compliance" book and slap them with it :-)

I really wanted to play some kind of joke about a vendor being "knacked off," but everything I came up with sounded too stupid to post here :-) In any case, some people were saying that there are way too many NAC vendors around, especially given the slow adoption of this technology. Now they have proof.

Now, it is a sad thing to see a security company "go poof" and I am sure this one had good people, but I think certain market common sense should apply... For example, I know some people who want to launch a DLP vendor. Now, if their data loss "prevention" technology is better than anybody else's, they will probably fail. However, if they looked at the problem from a different angle and solved some of the challenges that nobody can touch (and which are real), now we are talking....

Wednesday, March 19, 2008

As I am boarding the plane for Moscow to give a keynote at the First Russian CSO Summit, I have to warn that comment moderation on my blog will be slow in the next few days. I will post the presentation here when I am back.

"Welcome to the 4th Carnival of the Security Catalyst Community. Each week, a different member of the Security Catalyst Community takes a turn pointing out three to five posts from the community and three links to blog articles by members of the security catalyst community. If you are not a member, but would like to join us, information on the community is included at the bottom of the post."

Here are some forum posts that I enjoyed in the last few days that may also benefit you.

Compliance Measurement and Verification Solutions covers tools and some fun discussion on whether such tools to "verify compliance" [with what?] are even possible. If you are looking to automate the WHOLE compliance .... keep looking (and see ya in year 3000 :-))

Do you trust small vendors? made my blood curdle at first (darn it, why do you trust LARGE vendors? :-)), but there is some discussion on why "small is better" (sometimes).

Here are some recent blog posts from the members of the Security Catalyst Community

"To create your account, point your browser to: http://www.securitycatalyst.org/forums/ and register an account. Please register using your real full name in the following format: firstname.lastname (we generally use all lower case and separate the names with a period). This is important for our community of professionals. Accounts are reviewed quickly and activated. Your currency of the community is your participation. We look forward to learning from you!"

Richard has a fun post called "How Many Burning Homes" which talks about firefighting. He talks about "How many burning houses can you stand in your town?" and other fun metrics "Number of burning homes at any sampled time", "Average length of time any home is burning","Average time from detection to response", etc.

Friday, March 14, 2008

As I mentioned before, I received a lot of fun questions from the audience during our "Log Management Thought Leadership Roundtable Webcast" (recording, some comments). Since they would be useful to my readers, I am answering some of them here (questions are anonymous and slightly rewritten for clarity):

Q1: When you mention "forensics", are you speaking in term of legal forensic terminology - or in terms of incident investigation?

A1: When I say "forensics", I usually mean it in the legal sense. I call other investigations simply "incident investigations;" forensics carries an extra burden of proof and seeks to establish facts, not just "good hunches."

Q2: Are there solutions that can handle 2-3 Terabytes of log data per minute?

A2: No. Easy, huh? :-) See this for a specific example. Well, let me take this back: theoretically, you can always use a vendor that can handle a lot of data (like LogLogic) AND that has the ability to run a distributed operation across many appliances. The catch? You will need a lot of appliances, since 2-3 TB/minute works out to roughly 170-250 million log messages per second (assuming an optimistic 200 bytes/message).
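For the curious, here is the back-of-the-envelope arithmetic as a few lines of Python (decimal terabytes assumed):

```python
def msgs_per_second(tb_per_minute, bytes_per_msg=200):
    # terabytes per minute -> log messages per second,
    # at an assumed average message size in bytes
    return tb_per_minute * 10**12 / bytes_per_msg / 60

low = msgs_per_second(2)   # ~1.67e8: about 170 million messages/second
high = msgs_per_second(3)  # 2.5e8: 250 million messages/second
```

Even shaving the message size up or down by half keeps the answer in the hundreds of millions per second - far beyond any single appliance.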

Q3: I have terabytes of log data, but how can all this data be analyzed? Are there products that can process it all and extract valuable information?

A3: Yes, but you need to ask one question first: analyze why (example reasons here)? To discover something "interesting" (my favorite reason)? To find some specific artifact that you need in the logs? Or for some other reason? Before anybody can answer a question about "are there tools to 'analyze this'?", you'd need to answer that dreaded "why" question.

Q4: We were told to log every access to every SQL database in our environment. Is this even feasible with the best products on the market?

A4: Yes, it is. However, one needs to be extra careful with this. Look at this post for options and ideas. It may turn out that logging every SELECT statement and then collecting those native database logs will not be the best approach (mostly for database performance reasons) and a dedicated tool will need to be used. Databases' built-in auditing is better used for selective auditing.

Q5: Once logs are captured, and centrally stored, who should be responsible for the management and review of those logs?

A5: Good question! Really, this is a very good question that a) is important to have answered and b) does not have an "accepted," standard answer. It also depends upon what logs those are; let's assume the most complex scenario of a diverse set of logs from networks, systems and applications. So, the choices are: the security team (sometimes: CIRT, i.e. the incident response team), some dedicated team in IT that provides "log services" (an uncommon option, but growing in popularity) or some unit in IT that is responsible for regulatory projects (if compliance-driven). If your answer is "nobody," then you will be in trouble :-) If you answer wrong, you might have to fight to access your own logs (example).

Q6: Most of the discussion so far is about how to get started. What about after the system is deployed? Products tend to focus on collection and not on action or response. Where are the tools heading in terms of usability, incident tracking, collaboration?

A6: That's a long story, really, and it is hard to provide a short answer to this. Yes, collection has been a focus of products in the last few years, but now we are at a point where analysis and various uses of the data will come to the forefront. At the very least, you should be able to run reports and searches on the logs that you collected.

Q7: Do vendors typically offer a template of which logs to collect based on the desired use cases?

A7: They should, yes :-) In some cases what you have is a bit of a push-pull between a vendor and a customer: "Tell us what to do?" - "First, you tell us what you would like to accomplish?" - "No, really, you tell me what I should be looking to accomplish." - ... sometimes ad infinitum. Also, for some use cases it is hard to come up with a credible list (see this discussion about PCI DSS here).

Q8: What are the biggest difficulties when the log management solution is going to be integrated and deployed in an organization with a lot of different log sources?

A8: Political boundaries and "log ownership issues" (see some discussion here). If you need to submit a paper form in triplicate to add a line to /etc/syslog.conf, and then send more forms when something doesn't work right and you need to troubleshoot it (a real story), everything becomes painfully slow and inefficient.

Wednesday, March 12, 2008

Balabit.hu, the creators of syslog-ng, seem to have developed larger plans: taking over the world of logging :-)

In this post called "A silent explosion", they say: "At first sight, logging infrastructure might seem simple, and log management trivial. This might have been true in the past, but nowadays it is unarguably a process of strategic importance, and not only because of the standards or regulations. Information is power, and you cannot guarantee the security of a large IT system without logs. The idea is simple: Collect the logs to a central place, preferably using an encrypted channel. Get proper filtering and archiving. Finally, add some intelligence and analyzing capabilities, and you will know what is happening on your network."

So, Anton Security Tip of the Day #14: More access_log Fun: What Are You Not GETting?

In this tip, we will look at some bizarre artifacts that show up in web server access logs today. Here we have a production log from an Apache web server that is full of interesting (and sometimes ominous!) little mysteries that we will investigate in order to determine their impact on security and operational health of the site.

Logs contain more mysteries than we have time for, so we will focus on a few of them: specifically, unusual web request methods. Let's see who is trying to POST or use some other method (OPTIONS, HEAD, PUT, etc. - see a list here) on our site, instead of just GET'ting the content (the GET method is used by web browsers to retrieve pages, while POST is used to upload content, press buttons, etc. - at least in "web 1.0" land - see earlier tip #12, where a POST request was found in proxy logs).

Here is one little artifact that attracted my attention due to a POST request to a web forum, as well as a battery of slashes (which actually increases in subsequent requests - of which there were many).

This one really is a mystery; what do we know about it? The server responded to the request OK (code 200), so the POST actually happened. The first request was a request to register with a web discussion board and the second was a request to log in. Multiple slashes are actually ignored by the web server, so why put them in the request (no answer)? Also, I think that the User-Agent is spoofed ... do you know why? Finally, if I see something like that in my logs, I will definitely investigate it, primarily because Apache responded with a 200 OK code.

The next one is so classic it's dumb (and so dumb, it's a classic :-))

It is probably one of the ancient IIS attacks (check out this fun BlackHat preso on that, circa 2003) - why someone would probe for it now is beyond me. In any case, Apache on Linux and "*.exe" don't mix :-)

The above uses a PUT request, which is pretty much deprecated now; its purpose is clearly malicious. In fact, modern Apache shouldn't even allow it, and thus responds with code 405 "Method Not Allowed." Nothing to worry about (even though some poor critter got owned with that! BTW, if you follow that link, check out HTTP response code 201 - if you see it in your logs, run! :-))

Overall, this tip teaches you to look for unusual request methods (POSTs to strange pages; all PUT, DELETE and OPTIONS requests, etc.) and then check the response codes to assess the impact. If your web server happily executed such a strange request (code 200), then you need to dig further. And if you are "lucky" :-) and see response code 201 "Run for the Hills" (in reality, it stands for "Created"), then you can go straight into incident response mode.

Another lesson to learn is that if you see too many POSTs or too many "GET then POST" sequences from the same IP in rapid succession, investigate it since no legitimate access should produce such a pattern...
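The whole lesson can be sketched in a few lines of Python (a toy example; it assumes Apache "combined"-format lines, and the regex and function names are mine):

```python
import re

# Flag non-GET requests in an Apache access log and rank them by how
# worried we should be, based on the response code: 201 ("Created" -
# something got uploaded!) first, then other 2xx, then errors.
LINE_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) \S+" (?P<status>\d{3})')

def suspicious_requests(log_lines):
    hits = []
    for line in log_lines:
        m = LINE_RE.search(line)
        if m and m.group("method") != "GET":
            hits.append((m.group("method"), m.group("path"), int(m.group("status"))))
    # Scariest first: 201, then any other success, then rejected attempts
    return sorted(hits, key=lambda h: (h[2] != 201, h[2] >= 300))
```

Feed it a day's worth of access_log lines and start your review from the top of the returned list.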

Now, some say that Bruce is starting to lose it after being a spokesperson (not a doer) for too long, but this proves that he is still a security visionary and can start a fun, thought-inspiring controversy. This post called "Security Products: Suites vs. Best-of-Breed" is a VERY fun read (you MUST also read the discussion that followed - and think!)

A few representative quotes: "Honestly, no one wants to buy IT security. People want to buy whatever they want -- connectivity, a Web presence, email, networked applications, whatever -- and they want it to be secure." (do they really?)

"And sooner or later the need to buy security will disappear." (bullshit, I say! :-) - analogous to 'some day the need to have police will disappear...')

"It will disappear because IT vendors are starting to realize they have to provide security as part of whatever they're selling. " (year 3000?)

"IT is infrastructure. Infrastructure is always outsourced. And the details of how the infrastructure works are left to the companies that provide it." (hmmmm... is your information infrastructure? no!)

Mike R comments on that (here): "But the idea that the answer is neither and that outsourcing will be the death knell in the security business is interesting, but ultimately wrong. [...] Trying to wait for Big Security to die would give new meaning to the long and slow goodbye."

Richard "IDS is dead" Stiennon throws a bomb: "First, esoteric matters like IT security really do not matter to the overall performance of a retailer. Customers, employees, stakeholders, apparently don’t care. Second, no matter what the security industry says, you should not justify security spending based on potential impact of a data breach on your stock price. That theory is completely disproved by TJX."

Enraged? Think he is pushing it too far? Being illogical? Me too :-) I don't think the TJX example just goes and "disproves" it; we don't really know how it works with breaches and stock prices (some say 4-8% down, some say none, some say 'major impact', whatever...)

He then clarifies: "But let me point out that TJX has attributed $200 million in direct costs to this breach. It is easy to surmise this is bigger than just about anyone’s security budget. In TJX’s case some well known security practices and a little security spending would have avoided this whole incident."

Overall, a fun read. Still, I think breach impact assessment and breach's impact on anything (much less the stock price...) is not really well-defined or understood yet ...

Tuesday, March 11, 2008

I spent the past 3 hours wandering around the vendor expo at InfoSecWorld 2008 in Orlando, FL. I used to like wandering around vendor shows, but somehow didn't have a chance to do it in the past few months. I guess I can also consider this "preparing for RSA," the mother of all vendor shows (where I will be present as an "all-powerful member of the press" :-) - BTW, if you are a security blogger you can try getting a press pass here).

Many of the "usual suspects" were there; some of the "die-will-you-die-already ... please" vendors made a showing (probably by selling those newly unneeded chairs to pay for the booth space).

I love to talk to people in the same or adjacent markets as LogLogic (euphemism for "competitors" :-)), some are friendly and you can have a fun and insightful conversation with them (with neither of us disclosing any deep and dark secrets about our solutions ...), others are obnoxious and think you are "out to steal their brochures."

However, the most fun part will definitely happen on Thursday - the Log Management Summit. MISTI folks planned a few very fun panels; will there be a vendor fight? A mud-slinging match? We'll see ...

As you know, I have long been on a quest to save the world from having to write long and ugly regular expressions (regexes) for log analysis. Back in 2005 (post, big discussion that ensued) and later in 2007 (post, another big discussion that again ensued), I tried to poll people for approaches that convert logs into useful information without messing with massive quantities of regular expressions, and also performed some research on my own. In all honesty, I didn't notice a major breakthrough.

Until now? Here ("prequel" here and follow-up here) is what looks like an interesting and major development along that line. Indeed, one can automate the processing of some "self-describing" log formats (name=value pairs, comma/tab delimited with descriptive header, sequential names and values [yuck!], XML, etc) to obtain a semblance of structured data (not just a flow of text logs) from logs without any human involvement.

But is that the endgame, the "holy grail" of log analysis, or yet another step towards it? First, bad logs break it (e.g. names with spaces, or values with spaces and without quotes) and thus call for the return of a human logging expert to write an even fancier regex that can deal with it (then again, bad logs often break human-written rules as well). Second, there is a more important issue that I will bring up. So, if logs contain "user=jsmith" we can certainly learn a new piece of info (that the "user" was probably "jsmith"). But what if they contain "bla_bla=huh_huh" - and we don't know what "bla_bla" and "huh_huh" mean? Do we really have more information at hand if we tokenize it as "object called 'bla_bla' has the value of 'huh_huh'" compared to just having a single blurb of text "bla_bla=huh_huh"? I personally don't think so - but I've been known to be wrong before :-)
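For illustration, here is roughly what such name=value field extraction boils down to, including the "value with a space and no quotes" failure mode (a toy sketch; the regex is mine):

```python
import re

# Algorithmic field extraction on "self-describing" logs: pull name=value
# pairs with no per-log rules. Quoted values survive; an UNQUOTED value
# containing a space silently truncates - the failure mode that brings
# the human logging expert back into the loop.
PAIR_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')

def extract_fields(log_line):
    return {k: v.strip('"') for k, v in PAIR_RE.findall(log_line)}

extract_fields('user=jsmith action=login src=10.0.0.5')
# well-behaved: every field recovered
extract_fields('msg=failed login user=jsmith')
# "login" is silently dropped from msg - bad log, broken extraction
```

Note that the sketch happily extracts "bla_bla=huh_huh" too - structure recovered, meaning still absent, which is exactly the point above.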

So, let's review what we have: I decided to organize the current approaches to logs in the form of this table (hoping to start a discussion!)

Text Indexing - easy: no human effort needed; just collect the logs and go.

Field Extraction (Algorithmic) - easy: no per-log effort on behalf of the log analyst (but some creative code needs to be written).

Rule-based Parsing (Manual) - hard: an expensive logging expert must first understand the logs and then write the rules; normalization across devices implies having a uniform data store for logs.

So, what can we conclude? It is too early to retire the human-written rules (so people will still have '\s' and '\w' coming up in bad dreams... :-)), but this automated approach should definitely be used on the logs that will "allow you to do it to them." :-) Personally, I am also very happy that somebody is thinking about such matters ...

OK, not really mad :-) In fact, pretty intelligent :-) But a new salvo has been fired in a "great security ROI war." Counter-salvos have been fired as well :-)

The salvo is the paper called “The Fallacy of Information Security ROI” by Jos Pols ("ISSA Journal", February 2008), where he argues against ROI for security (since there is no money earned by security, just savings, which are NOT the same thing); he proposes a "security as insurance" model which, in all honesty, I am not too comfortable with (since security doesn't "pay you back" after the breach).

ROI proponents "hit hard" in return: 'One is Jos Pols who, in his recent article “The Fallacy of Information Security ROI” in the February 2008 issue of the ISSA Journal (membership required to access link resource), claims that one cannot have a return where there is no income.' They next bring back the "return in the form of savings" (which many disagree with ...): 'this is an overly restrictive view of the meaning of the word “income.” The avoidance of potential losses redounds to the bottom line, as does revenue, so that a cost saving is a return on an investment.' Read the whole pro-ROI counter-point here.

Friday, March 07, 2008

My next fun logging poll is here - please vote! It is about tools for centralized collection of Windows Event Log from servers and other systems. One of the somewhat surprising discoveries from my previous poll was that few people look at Windows logs; this poll drills down into it.

UPDATE: just looked at the results collected so far, and I would like to say this: why - oh - why some people want to turn an honest research effort into a vendor war? Ye bastards, :-) you know who you are ...

Thursday, March 06, 2008

This poll on looking at logs was relatively popular; let's see what we can learn (live results are also here).

First, what are the top 3 log types that people look at? They are:

Unix/Linux server syslog

Web server logs

Firewall logs

How does that compare with the top 3 log types that people collect (see picture showing results from my previous poll below)?

These are:

Unix/Linux server syslog

Firewall logs

Web server logs

Huh? They are the same - doesn't it just make sense? What are the possibilities here?

a. People only collect the logs they plan to look at, OR

b. People look at logs they collect (duh!).

Strangely, I find a) unlikely; I think most people collect more than they can review and that the incident/issue response and compliance needs drive collection more than review or analysis.

Another observation is that all of the "big 3" log types are useful for security, operations and compliance and not just for security (like NIDS/NIPS logs). Is that why they are so popular?

Second, I was fearful that "I only look at whatever logs needed for the incident/issue investigation" will win. It didn't!!! This to me indicates that proactive log review is not as unpopular as I feared. Good! It is working.

Third, many more people look at Unix/Linux logs than Windows server logs (a factor of 3x); this is not entirely unexpected and my next poll will drill down into this.

Finally, I am SHOCKED that people don't look at NIDS/NIPS logs (only 11% do). People, what's wrong with you? :-) Why have you deployed those beasts if you don't look at what they produce? Then again, maybe you haven't :-(

This is very fun and insightful read from Gunnar Peterson: "When Will We See Market Forces in Infosec?" Example fun quote: "... Wait - they listen to customers, innovate new things, control costs, and deliver safety mechanisms to market while growing their business? When will Silicon Valley answer the bell on this model?" Read on.

Monday, March 03, 2008

I saw this idea of a monthly blog round-up and I liked it. In general, blogs are a bit "stateless" and a lot of good content gets lost since many people, sadly, only pay attention to what they see today.

PCI compliance is still all the rage! So, MUST-DO Logging for PCI? post was propelled to a place in my Top5 popular posts list. It discusses the fact that there is no "easy list" of what you MUST do to comply.

About Me

He is a recognized security expert in the field of log management and PCI DSS compliance. He is an author of the books "Security Warrior", "PCI Compliance" and "Logging and Log Management", and a contributor to "Know Your Enemy II", "Information Security Management Handbook" and others. Anton has published dozens of papers on log management, correlation, data analysis, PCI DSS, security management, honeypots, etc. His blog securitywarrior.org was one of the most popular in the industry.

In addition, Anton teaches classes (including his own SANS class on log management) and presents at many security conferences across the world; he recently addressed audiences in the United States, the UK, Singapore, Spain, Russia and other countries. He has worked on emerging security standards and served on the advisory boards of several security start-ups.

Before joining Gartner in 2011, Anton was running his own security consulting practice www.securitywarriorconsulting.com, focusing on logging and PCI DSS compliance for security vendors and Fortune 500 organizations. Dr. Anton Chuvakin was formerly a Director of PCI Compliance Solutions at Qualys. Previously, Anton worked at LogLogic as a Chief Logging Evangelist, tasked with educating the world about the importance of logging for security, compliance and operations. Before LogLogic, Anton was employed by a security vendor in a strategic product management role. Anton earned his Ph.D. degree from Stony Brook University.