Thursday, October 21, 2010

I have just published a first preview of the libee library API. This work is obviously far from being finished, but I think the preview provides at least some idea of how it will materialize.

To tell a bit about the status in general: I have completed a first round of evaluation of CEE objects and concepts, based on the draft standard. Note that the draft standard is far from being finished and will probably undergo serious changes. As such, I have not literally taken object definitions from there. Instead, I have mostly pulled what I am currently interested in and made some additions here and there, as I need them to do what I intend to do.

My primary goal is to get some proof of concept working first, then re-evaluate everything and probably do a rewrite of some parts. For the same reason, performance is currently not on my radar (but of course I always think about it ;)).

I would also like to note that the libee git repository is available -- a good way to follow libee development for those interested. Comments are always appreciated and especially useful in this early phase.

Tuesday, October 19, 2010

I am happy to feature a guest post written by Champ Clark III, the author of Sagan, a real-time, "Snort-like" event and system log sniffing tool. Champ will explain a bit about it and how he made it work with rsyslog. I think it is a very interesting project and I am glad it now has full rsyslog support.

But enough of my words, enjoy the real thing ;)

Rainer

I admit it, I'm a recent convert to rsyslog. I've known about rsyslog for years, but have only recently started using it in production environments. The primary reason for looking into rsyslog is that users of Sagan were requesting support for it. I'm very glad they pushed me in that direction. I knew how popular rsyslog was, but the 'hassle' of changing our core logging facilities seemed like a pain.

So I bit the bullet, and started working with Sagan and rsyslog. I haven't looked back since.

I work in the network & computer security field. I've known for years the importance of log management. One thing I had noticed was a lack of open source log & packet-level correlation engines. This is essentially what Sagan provides. Sagan is commonly compared to Cisco's MARS. Sagan reads in your logs and attempts to correlate the information with your intrusion detection/prevention system's packet-level information.

At Softwink, Inc., my place of employment, we monitor security events for various clients. For packet-level inspection for 'bad events' (security related), we use Snort. Snort 'watches' the network connections and sends out an 'alert' when it sees nefarious traffic. We configure Snort to send the 'alert' to a MySQL database for further analysis. We can then monitor these Snort sensors for 'bad events/attacks' in real time.

However, we found that we were missing the 'bigger picture' without logs. This is where rsyslog and Sagan come into play. Essentially, we take the logs from all machines and equipment on a network and forward them to a centralized server. Rsyslog is the 'receiver', and sometimes the sender, of these log messages. In many cases, we find that centralized secure logging is a requirement for clients. With rsyslog, we have the ability to store log information in a MySQL database for archival purposes. We can then give the client access to the log information via Loganalyzer for easy, simple retrieval.

How does Sagan fit into this picture? For security analysis, we only want key, typically security-related, events from the logs. Manually searching databases for 'security related' events is prone to error. It is easy to 'miss' key events. Sagan is the 'eyes on the logs', watching for security-related events in real time. First, Sagan has to have access to the logs coming into the network. This is very simple with rsyslog:

# rsyslog.conf file.
#
# As rsyslog receives logs from remote systems, we put them into a format
# that Sagan can understand:
#
$template sagan,"%fromhost-ip%|%syslogfacility-text%|%syslogpriority-text%|%syslogseverity-text%|%syslogtag%|%timegenerated:1:10:date-rfc3339%|%timegenerated:12:19:date-rfc3339%|%programname%|%msg%\n"
# We now take the logs, in the above format, and send them to a 'named pipe'
# or FIFO.
*.* |/var/run/sagan.fifo;sagan
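On the receiving side, a consumer of this FIFO only needs to split each record on the pipe character. Here is a quick Python sketch of that idea (illustrative only, not Sagan's actual code; the field names are taken from the template above):

```python
# Illustrative sketch (not Sagan's actual code): split one record,
# as produced by the rsyslog template above, into named fields.

FIELDS = ["fromhost-ip", "facility", "priority", "severity",
          "tag", "date", "time", "program", "msg"]

def parse_record(line):
    """Split one template-formatted log record into a dict.

    maxsplit keeps any '|' inside the message text intact."""
    parts = line.rstrip("\n").split("|", len(FIELDS) - 1)
    return dict(zip(FIELDS, parts))

# Example record in the template's format:
rec = parse_record(
    "10.0.0.5|auth|info|info|sshd[1234]:|2010-10-19|12:13:14|sshd|"
    "Invalid user admin from 192.168.1.10")
print(rec["program"], "->", rec["msg"])
```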

Sagan can now 'read' the logs as they come into rsyslog from /var/run/sagan.fifo (a named pipe/FIFO) in real time. rsyslog actually performs double duty for us: logging to our MySQL database for archival purposes and handing Sagan log information for analysis.

Overall, there is nothing really new about this concept. However, Sagan does something a bit different from other log analysis engines. When Sagan sees a 'bad event', it will log that to your Snort IDS/IPS MySQL/PostgreSQL database. What does this mean? Packet-level security events and log events reside in the same database for correlation. There are several advantages: for one, we can now have a single, unified console for log and IDS/IPS events! Second, we can now take advantage of Snort front-end software to view log events. For example, if you use BASE or Snorby to view packet-level IDS/IPS events, you can use the same software to view log-level Sagan events. Maybe your shop uses report generation tools that query the Snort database to show 'bad packet events' in your network. Guess what: you can use those same reporting tools for your log information as well. I've posted some example screen shots of Snort & Sagan working together here. The idea is that we take advantage of the Snort community's work on consoles.

Correlation with Sagan and Snort, at the engine level, works in several different ways. First, Sagan can in some cases pull network information directly from the log message and use that for correlation in the SQL database. For example, let's say an attacker is probing your network and attempting to get information on the SMTP port. The attacker sends your SMTP server 'expn root'. Your IDS/IPS engine will 'detect' this traffic and store it. It'll record the source IP, destination IP, packet dump, time stamp, etc. Sagan will do the same at the log level. Sagan will 'extract' as much information as possible from the log message for further correlation with the packet level.

Recently, Rainer announced liblognorm (early liblognorm website). This is an exciting project. The idea is to "normalize" log information into a nice, standard, usable format. I plan on putting as much support and effort as I can into this project, because it's an important step. For Sagan, it means we will be able to better correlate information. In the time I've had to ponder it since its recent announcement, I can see liblognorm being extremely useful for many different projects.

Sagan also shares another feature with Snort: it uses the same rule-set format. Sagan rule sets are very much 'Snort like'. Here is an example rule (this is a single line, broken just for readability):
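A rule of this kind might look roughly as follows (a hypothetical sketch in Snort-like syntax, matching the sshd example discussed next; the exact Sagan keywords and values are assumptions):

```
alert syslog $EXTERNAL_NET any -> $HOME_NET any (msg: "[SSHD] invalid or illegal user";
pcre: "/invalid user|illegal user/i"; program: sshd; classtype: attempted-user;
threshold: type limit, track by_src, count 5, seconds 300; sid: 5000001; rev: 1;)
```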

If you're already a Snort user, then the format and syntax should be very simple to understand. We use 'pcre' (regular expressions) to 'look' for a message from the program 'sshd' that contains the term 'invalid user' or 'illegal user' (case insensitive). We set the classifications, just as Snort does (for further correlation). We can 'threshold' the rule, so we don't get flooded with events.

Sagan uses this format for a variety of reasons. For one, it's a well-known format in the security field. Second, we can now take advantage of Snort rule maintenance software, for example 'oinkmaster' or 'pulled pork'. The idea is that with Sagan, you don't need to 're-tool' your network in order for it to work.

Using Sagan with your Snort-based IDS/IPS system is just one feature of Sagan. Sagan can operate independently of Snort databases, and offers the normal bells and whistles you'd expect in a SIEM (e-mailing alerts, etc.).

To tie all this together, it means we can simply monitor packet level threats and log level events from a unified console. We can monitor just about everything in a network from the log level standpoint. We can monitor Cisco gear, Fortigate firewalls, Linux/*nix servers, wireless access points, etc.

Sagan is a relatively new project and still under development. Like rsyslog, Sagan is built from the ground up with performance in mind. Sagan is multi-threaded and written in C with the thought that it should be as efficient with memory and the CPU(s) as possible. Rsyslog seems to follow the same philosophy, yet another reason I made the switch.

The more information you know about a threat to your network/system, the better off you'll be. That is what the mix of rsyslog and Sagan offers. Throw in IDS/IPS (Snort) monitoring, and you can get a complete view about 'bad things' happening in your environment.

Thursday, October 14, 2010

After some discussions, we have finally decided to name the CEE part of the original liblognorm project "libee". Note the missing "c" ;) We originally thought that libcee would be a smart name for a library implementing an API on top of the upcoming CEE standard. However, there were some complexities associated with that because "libcee" sounds much like *the* official library. While I hope I can finish a decent implementation of the full CEE standard, I think it is a bit bold to try to write the standard lib when the standard is not even finished.

So the compromise is to drop the "c". The "ee" in libee means "Event Expression", much as CEE means "Common Event Expression". If CEE evolves as we all hope and if I manage to actually create a decent implementation of it, we may reconsider calling the library libcee. But that's not important now and definitely something that will involve a lot of people.

In the meantime, I wanted to make sure you all understand what I mean when I talk about "libee". I hope to populate its public git with at least a skeleton soon (liblognorm is reaching a point where I need some CEE definitions, so this becomes more important).

Wednesday, October 13, 2010

While there is not yet much content, the liblognorm site has been put online today. Over time, this will become an important place to both learn about liblognorm AND share log samples. It will most probably also contain the area that you can use to download new log samples (much like you download virus patterns for a scanner). But for now, I just wanted to share the good news.

I thought a while about how to support Unicode in liblognorm. The final decision is to use UTF-8, which is by far the most popular encoding under Linux. A core driver behind this decision is the ability to save lots of space (and thus also cache space, and so processing time as well), as the majority of log content is written in US-ASCII. This is even the case in Asian countries, where large parts of the log message are usually ASCII and only a few select fields are in the local language (like names). Even if the message itself is in a local language, there is a lot of punctuation and there are numbers in it, so I think the overall result will not use up notably more space than a UTF-16 implementation. I18N-wise, it must also be noted that the fixed-width 16-bit code space (UCS-2) covers only a small (but important) subset of full Unicode, while UTF-8 gives us the ability to encode full 32-bit UCS-4 characters should there be need to do so.
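The space argument is easy to check. A quick Python illustration (not part of the library):

```python
# Mostly-ASCII log lines cost one byte per character in UTF-8,
# but two bytes per character in UTF-16.
line = "sshd[1234]: Invalid user admin from 192.168.1.10"
utf8 = line.encode("utf-8")
utf16 = line.encode("utf-16-le")   # without BOM, for a fair comparison
print(len(utf8), len(utf16))       # UTF-16 doubles the size for pure ASCII

# Even a line with some non-ASCII content stays compact in UTF-8,
# because the punctuation, numbers and keywords remain single-byte:
mixed = "login von Benutzer 'Jürgen' abgelehnt"
print(len(mixed.encode("utf-8")), len(mixed.encode("utf-16-le")))
```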

The same decision will apply to the CEE library (whatever it will be named). This is also nicely in line with libxml2, which I intend to use for XML parsing.

Tuesday, October 12, 2010

While there is not yet much there, I have populated liblognorm's git repository with some early (and very incomplete) code. Most importantly, it contains a first glimpse of the liblognorm API, which I tried to keep as simple as possible.

All of this material is, for obvious reasons, very preliminary and incomplete. But it may be of interest for those folks interested in the library.

I need to make progress while waiting. So I have decided to name the library libcee for now, and rename it if Mitre decides this is not possible. That should not be much of a problem, as that decision will probably be made much more quickly than I can write the first releasable version ;)

Monday, October 11, 2010

I have dug into the design of my upcoming event/log normalization library. As it will be based on CEE, I intend to pull in CEE definitions for the types defined there, like tags or field types. Also, I thought about what the library should output. An obvious choice for many use cases is an in-memory object model describing the normalized form of the event that was passed in. This is probably most convenient for applications that want to do further processing on the event.

However, it also seems useful to have the ability to serialize this data in the form of a text string. That string could be stored in a file for later reference, forensics, or to feed some other tool capable of understanding the file format. And as the in-memory object model will be CEE based, and CEE defines such serialization formats, it seems obvious that the library should be able to generate serializations based on the CEE-defined and supported formats (note that this does not necessarily mean XML; it may be JSON or syslog structured data as well).

Looking at all this, the normalization library seems to consist of two largely independent (but co-operating) parts:

the parser engine itself, i.e. the part that is used to actually normalize the input string according to the provided sample base and CEE definitions

a CEE support library, which provides the plumbing for everything that is defined in CEE (like tags, field types and serialization formats)

Now consider that I intended to create the normalization feature as a separate library, instead of an rsyslog plugin, because I hope that other projects can reuse it. Looking at the above bullet points, it seems natural to also split the core parser from the CEE functionality. Again, there seems to be a broader potential user base for generic CEE functionality than for normalization. For example, a CEE support library could also be used by projects that natively support CEE. It would hopefully save them the hassle of coding all CEE functionality just to do some basic things. Think, for example, of some application that would "just" like to emit a well-formed CEE log record (a very common case, I guess). With a library, it could simply generate (via the library) a proper in-memory representation of the event and then have the library process it. The library could then also check if it is syntactically correct and contains all the fields necessary to conform to a specific CEE profile.
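That "emit a well-formed record" use case might look roughly like this. A Python mock-up of the idea only, not the actual C API; the field names and the profile are invented for illustration:

```python
import json

# Mock-up of the idea only -- not the real libcee/libee API. An application
# builds an in-memory event representation, the library validates required
# fields and serializes it to one of the supported formats (JSON here).

REQUIRED_FIELDS = {"time", "host", "action"}   # stand-in for a CEE profile

def validate(event):
    """Check that all fields required by the (hypothetical) profile exist."""
    missing = REQUIRED_FIELDS - event.keys()
    if missing:
        raise ValueError("event lacks required fields: %s" % sorted(missing))

def serialize(event):
    """Validate the event against the profile, then emit JSON."""
    validate(event)
    return json.dumps(event, sort_keys=True)

evt = {"time": "2010-10-11T10:00:00Z", "host": "mail01",
       "action": "login", "status": "failure", "user": "admin"}
print(serialize(evt))
```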

The more I think about it, the more useful it seems. So I'll probably split the core normalization library from the CEE part. This is not much effort, but opens up additional uses. I'll call the normalization part liblognorm (or libeventnorm) and the CEE part libcee, or something along these lines. In this light, liblognorm may actually be the better name, because the parser part is more concerned with logs and log files than with generic events (which often come in other formats).

Sunday, October 10, 2010

Names are always important. While thinking about liblognorm's design, it occurred to me that the name may be "too limited". In fact, the library will make heavy use of CEE, the Common Event Expression standard. In CEE, the term "log" was deliberately avoided in favor of the broader term "event". And thinking about this, my library is much more about normalizing all types of event representations than it is about normalizing "just" logs. So libeventnorm would probably be a better name, and for now, I think I will use it.

Any feedback on this issue is appreciated. I am also open to new name suggestions. But I think I will populate a public git early next week, so we should have fixed a name by then ;)

Thursday, October 07, 2010

With this posting, I introduce a new project of mine: liblognorm. This library shall help to make sense out of syslog data, or, actually, any event data that is present in text form.

In short, one will be able to throw arbitrary log messages at liblognorm, one at a time, and for each message it will output well-defined name-value pairs and a set of tags describing the message.

So, for example, if you have traffic logs from three different firewalls, liblognorm will be able to "normalize" the events into generic ones. Among other things, it will extract source and destination IP addresses and ports and make them available via well-defined fields. As the end result, a common log analysis application will be able to work on that common set, and so this backend will be independent from the actual firewalls feeding it. Even better, once we have a well-understood interim format, it is also easy to convert it into any vendor-specific format, so that you can use that vendor's analysis tool.

So liblognorm will be able to provide interoperability between systems that were never designed to be interoperable.
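The firewall example above can be sketched in a few lines of Python. This is a toy illustration with made-up message formats; liblognorm itself will be driven by sample definitions, not hard-coded regexes:

```python
import re

# Toy sketch of the idea: two made-up firewall log formats are mapped
# to the same normalized name-value pairs.
PATTERNS = [
    # "vendor A" style: DENY tcp 10.1.1.1:4711 -> 192.0.2.9:22
    re.compile(r"DENY \w+ (?P<src_ip>[\d.]+):(?P<src_port>\d+) -> "
               r"(?P<dst_ip>[\d.]+):(?P<dst_port>\d+)"),
    # "vendor B" style: drop src=10.1.1.1 dst=192.0.2.9 sport=4711 dport=22
    re.compile(r"drop src=(?P<src_ip>[\d.]+) dst=(?P<dst_ip>[\d.]+) "
               r"sport=(?P<src_port>\d+) dport=(?P<dst_port>\d+)"),
]

def normalize(msg):
    """Return normalized name-value pairs, or None for unknown messages."""
    for pat in PATTERNS:
        m = pat.search(msg)
        if m:
            return m.groupdict()
    return None

a = normalize("DENY tcp 10.1.1.1:4711 -> 192.0.2.9:22")
b = normalize("drop src=10.1.1.1 dst=192.0.2.9 sport=4711 dport=22")
print(a == b)   # both vendors map to the same normalized fields
```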

This sounds bold, so why am I thinking I can do this?

Well, I have been working in this field for quite some years and have accumulated a lot of experience, including a sufficient number of implementations where I failed in one way or another. You can read about this in my previous blog post on "syslog normalization". So the know-how is available. The core technology is actually not that complex. I hope to code the core parts of the lib within one month (but it may take some more real-world time, as I cannot yet work on it full time). Also, very importantly, there is the Common Event Expression (CEE) standard coming up. Its aim is nothing less than to provide the plumbing that is needed for log normalization. CEE is driven by the same folks that brought successful standards like CVE to life, so it is something we can hope will succeed. Thankfully, I have recently become a member of the editorial board, so I have good insight into what it takes to write a library that utilizes CEE.

This all sounds good, but there is one ingredient missing: liblognorm will not be able to magically know the format of each message. We must teach it the formats. This involves some community effort, as a single person cannot compile all the message formats this IT world has to offer. Not even a small group can. I hope that we can get sufficient log samples from the rsyslog community, and hopefully later from a growing liblognorm community. I have thought long about how this can be done. A core finding is that it must be dead easy to do. This is how it is intended to work:

When you feed liblognorm a message that it cannot recognize, it will spit that message out. Let's assume we have the following message:

The strings "reject" and "AAA" are tags. Tags are placed in comma-delimited format in front of the actual message sample and are terminated by a colon. Everything after the colon is actual message text. The fields between percent signs reflect well-known properties (which are taken from the CEE base definitions and/or are custom defined). Their syntax will be taken from a data dictionary, so the user does not need to bother with it in most cases. So creating a message sample out of an unknown message type should be fairly easy.
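Following that description, a tagged sample line might look like this (a made-up illustration; the final syntax and field names may differ):

```
reject,AAA: connection from %src-ip% port %src-port% to %dst-ip% port %dst-port% rejected
```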

The idea now is that we gather these one-line message samples and compile them into a central repository. Then, all users can download fresh versions of the "sample base" and use them to normalize their logs -- much like a virus scanner works. Of course, there are a number of questions in regard to how trustworthy samples are. But this is definitely something we can solve. To get started, we could simply do some manual review first, much like code integration. At later stages, some trust-level policy could be applied. Well-known technology...

None of this is cast in stone, as I am at a very early stage of this project. Your feedback definitely makes a difference, so be sure to provide ample amounts of it ;)

That's it for today, but I plan to do a series of blog posts on the new system within the next couple of days. Please provide feedback whenever you have it. It's definitely something very useful to have. I'll also make sure that I set up a new list for this effort, but initially I will be abusing the rsyslog mailing list, as I assume folks on that list will definitely be interested in liblognorm as well. And, of course, once the library is ready I'll utilize it in various ways inside rsyslog.

Wednesday, October 06, 2010

I was able to place my conference paper "rsyslog: going up from 40K messages per second to 250K" online. It provides a good drill-down of what we did during the first performance tuning effort. Hopefully, it is also useful to other developers enhancing userland single-threaded applications to multi-threading.

I do not only focus on what we did well, but also provide quite some insight on where we (I!!) failed.

Tuesday, October 05, 2010

I will be releasing rsyslog 5.7.1 today, a member of the v5-devel branch. With this version, omhdfs debuts. This is a specially-crafted output module to support Hadoop's HDFS file system. The new module was a sponsored project and is useful for folks expecting enormous amounts of data or having high processing time needs for analysis of the logs.

The module has undergone basic testing and is considered (almost) production-ready. However, I myself have limited test equipment and limited needs for, and know-how of, Hadoop, so it will probably be interesting to see how real-world users perceive this module. I am looking forward to any experiences, be they good or bad!

One thing that is a bit unfortunate at the moment is the way omhdfs is built: Hadoop is Java-based, and so is HDFS. There is a C library, libhdfs, available to handle the details, but it uses JNI. That seems to result in a lot of dependencies on environment variables. Not knowing better, I require the user to set these before ./configure is called. If someone has a better way, I would really appreciate hearing about it.

Please also note that it was almost impossible to check omhdfs under valgrind: the Java VM created an enormous amount of memory violations under the debugger and made the output unusable. So far I have not bothered to create a suppression file, but may try this in the future.

All in all, I am very happy we now have native output capability for HDFS as well. Adding the module also proved how useful the idea of a rsyslog core paired up with relatively light-weight output/action modules is.

Finally, Google has created some useful tools to help fight comment spam. As a result, I was able to quickly delete the remaining 100 (or so) spam comments, and so I can now show comments on this blog again. This is useful, as they contain a lot of insight. Also, I hope that my readers will now comment again!