Remote Logging with SSH and Syslog-NG

Hal Pomeranz, Deer Run Associates

One of the points I make repeatedly in my training classes
is the value of centralized logging. Keeping an off-line copy of your site's
logs on some central, secure log server not only gives you greater visibility
from a systems management perspective, but can prove invaluable after a
security incident when the local copies of the log files on the target
system(s) have been compromised by the attacker.

The difficulty is that the standard Unix Syslog daemon uses
unauthenticated UDP messages to transmit log messages to remote servers. This
makes drilling holes in your firewalls to accept Syslog messages from remote
locations very undesirable, to say nothing of the security implications of
having critical system log messages traveling in clear text over public
networks. Use of IPSEC or some other strong VPN product can certainly help
mitigate these concerns, but if all you care about is obtaining logging
information from some remote site then firing up a full-bore VPN session may
seem like overkill.

However, the fact that UDP is not a guaranteed delivery
protocol also means that important log messages can be dropped entirely. While
lack of guaranteed delivery can be a factor for Syslog messages in LAN
environments, the risk becomes much greater when trying to drive remote log
messages across highly congested public networks. Simply using a VPN to
protect the security of the remote log stream does nothing to address the
guaranteed delivery concern. This is where Syslog-NG becomes attractive,
because two Syslog-NG servers can share remote logging information using TCP
rather than UDP. But once you're logging via TCP, then it is also possible to
tunnel this TCP communication via SSH rather than firing up a full VPN-- the
"best of both worlds" if you're looking for a quick and dirty
solution.

The rest of this article covers the basic configuration for
establishing an SSH tunnel between two servers and configuring Syslog-NG at
both ends to communicate log messages down this tunnel. Because Syslog-NG is
capable of both accepting UDP-based log messages from standard Unix Syslog
daemons as well as forwarding those messages to another machine, it is possible
to set up a single Syslog-NG server at a remote site which acts as a collector
and relay for the log messages generated by all machines at that location, but
this configuration is largely outside of the scope of this article (though I'll
give you some pointers in that direction as we go along).

Start with SSH

The first step is to get the SSH tunnel set up between the
two machines. My personal preference is to originate the SSH tunnel on my
central "loghost" machine at the primary site, and have it connect to
the machine at the "remote" site that I want to get logs from.
Typically this involves drilling out through the firewall at the primary site--
often the site's default firewall rules will allow this connection without any
reconfiguration-- and allowing the connection "inward" through the
firewall at the remote end, which usually requires some firewall ruleset tweaks
on the remote site's firewall.

However, since we want the remote log server to be sending
logs back to the central loghost at the primary site, we need to use a reverse
tunnel (that's the "-R"
option on the SSH command line) to get things working properly. This is
actually one of very few places where I find reverse tunnels to be useful.
Figure A, below, shows a high-level picture of how the traffic is flowing in
this design.

We need to make sure that the SSH session and tunnel are set
up automatically when the central log host boots. If the SSH session dies for
some reason (intermittent network outage, system administration
"accident", etc) we'd also like the connection to be re-established
as quickly as possible. In situations like this, I like to have the init process fire off the SSH
connection with a line like this in /etc/inittab:
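
As a sketch, an entry along the following lines would do the job. The
/usr/bin/ssh path and the host name "remote.domain.com" are placeholders
assumed for illustration; the remaining fields and options are the ones
discussed below.

```conf
log1:3:respawn:/usr/bin/ssh -n -N -T -x
        -R 514:loghost.domain.com:514
        remote.domain.com >/dev/null 2>&1
# Assumed for illustration: the ssh path and the name remote.domain.com.
# Format is id:runlevel:action:command.
```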

The example above must appear as a single long line in /etc/inittab-- I've just broken it
onto multiple lines for clarity.

Let's examine the SSH command line first. The "-R 514:loghost.domain.com:514" on
the second line of the example sets up the reverse tunnel from 514/tcp on the
remote server to "loghost.domain.com:514"--
in other words, port 514/tcp on the central loghost machine. While it seems
natural to use 514/tcp for Syslog-NG logging, you have to remember that 514/tcp
is the reserved port for the Unix rsh/rcp ("shell") service, so you're going to
run into a port conflict if you still have that service enabled. I generally
turn off unencrypted network protocols like telnet, FTP, and rlogin/rsh/rcp
on my servers and use SSH instead, so it's not an issue for me, but you can run
this tunnel over any free port if there is a conflict at your site.

As for the other SSH command line options above, the "-n" flag tells SSH to associate
the standard input with /dev/null.
There won't be any command line input since we're essentially going to be
running the SSH client as a "daemon" via init. As you can see at the end of the command line in
the example, we're also sending the standard output and standard error to /dev/null ("... >/dev/null 2>&1").
Since we're never going to be issuing remote commands via this SSH connection
(we only care about the tunnel), the "-N"
option to SSH tells the SSH client to only set up the tunnel and to not bother
preparing a command stream for issuing commands on the remote system, while
"-T" says to not
bother allocating a pseudo-tty on the remote system. The "-x" option disables X11
forwarding, just as a defense-in-depth gesture.

Turning our attention to the rest of the /etc/inittab entry, the first field
("log1") is just an
identifier for this entry in the inittab
file. These identifiers can be any sequence of 1-4 alphanumeric characters;
the only requirement is that each be unique among the identifiers used in
the file. I've chosen "log1"
here because it's usually the case that I have multiple SSH tunnels set up to
different remote log sources, and I typically name the inittab entries "log1",
"log2", etc. The
second field in the inittab file
("3") is the run level
where this entry should be fired. Make sure to start this SSH process after
the network interfaces have been initialized but before the
Syslog-NG daemon is started.

The "respawn"
option in the third field is the reason I like to use init for spawning processes like this. When the "respawn" option is enabled, the init process will automatically fire
off a new SSH process if the old one dies for any reason. In other words, init acts like a "watchdog"
type daemon and makes sure that the SSH tunnel is always up and running. This
is an extremely useful technique, but one that a lot of system admins seem to
have forgotten.

Once you've got your inittab
entry all set up, HUP the init process ("kill -HUP 1"). This should cause
the init process to re-read the inittab file and spawn the SSH
connection. You should be able to verify that the SSH client is running with
the ps command and verify the
existence of the tunnel using netstat.
Once you've got all that working, it's time to turn our attention to
configuring Syslog-NG.
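
The netstat check can be scripted if you'd rather not eyeball the output.
What follows is only a sketch: the helper name has_listener is my own
invention, and it assumes Linux-style "netstat -an" columns, where the local
address is the fourth field. Other platforms lay the output out differently.

```shell
# has_listener PORT  (hypothetical helper, not part of any package)
# Reads `netstat -an` output on stdin and succeeds if a TCP socket is in
# LISTEN state on PORT. Assumes Linux-style columns, where the local
# address is the fourth field.
has_listener() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $NF == "LISTEN" && $4 ~ ("[.:]" port "$") { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# On the central loghost, confirm init has spawned the SSH client:
#   ps -ef | grep '[s]sh .*-R 514'
# On the remote log server, confirm the tunnel endpoint is listening:
#   netstat -an | has_listener 514 && echo "tunnel endpoint is up"
```

If you moved the tunnel to another port, pass that port number instead of 514.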

Configuring Syslog-NG

In general, configuration of Syslog-NG is well covered by
Balazs Scheidler's reference manual[1] and Nate Campi's excellent FAQ[2]. So
allow me to just present complete configuration examples for the main loghost
and remote log server and point out the critical bits.

As far as the options
go, "check_hostname(yes)"
forces Syslog-NG to do a little bit of sanity checking on the incoming remote
hostname in the log message. In our destination
directive we'll be creating directories for each system's logs by hostname and
it wouldn't be good if an attacker could embed shell meta-characters in the
hostname to cause us problems. "keep_hostname(yes)"
means to use the hostname that's presented in the actual message from the
remote log server rather than using the hostname we get by resolving the source
of the remote Syslog connection. After all, since we expect remote messages to
be coming down our SSH tunnel, the source IP address of these messages will be
the loopback address (127.0.0.1), and having all messages tagged with "localhost" is not what we want.
"chain_hostnames(no)"
causes Syslog-NG just to show the original hostname in the message rather than
a chain of all the hops the message has been through to get to its final
destination. This becomes a lot more relevant when you start relaying messages
through multiple servers.
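
Put together, the three options discussed above would form a block something
like this (a minimal sketch; the comments are mine):

```conf
options {
    check_hostname(yes);     # sanity-check hostnames in incoming messages
    keep_hostname(yes);      # use the hostname carried in the message itself
    chain_hostnames(no);     # record only the original hostname, not each hop
};
```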

The inputs
cover all of the various places we can get logging information from. "internal()" is internal messages
from the Syslog-NG daemon itself. "unix-stream("/dev/log")"
is the normal /dev/log device
that Linux systems use for local logging. Note that if you're on a non-Linux
platform like Solaris, HP-UX, or one of the *BSD operating systems then your
local log channel is likely to be very different (examples of
appropriate configurations for various operating systems can be found in the
Syslog-NG source distribution). Some sites actually run the vendor Syslog in
parallel with Syslog-NG rather than having to deal with the problem of
emulating the standard vendor Syslog interfaces-- the vendor Syslog daemon can
just relay messages to Syslog-NG via the standard UDP Syslog channel, even
within the same machine. The "udp()"
line means to listen on the standard 514/udp Syslog channel and "tcp()" means to listen on 514/tcp
for messages from another Syslog-NG server (or in our case, the SSH tunnel).
Note that both the "tcp()"
and "udp()" options
accept the "port()"
option to specify a different port. For example, if you wanted your Syslog-NG
server to listen on port 5014/tcp to avoid conflicts with the rsh
daemon you would write:

    tcp(port(5014) max-connections(100));

Note also the use of the "max-connections()" option to increase the number of
simultaneous TCP sessions the logging daemon can handle.
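
Pulling the four inputs described above together, the source declaration on
the central loghost might look like the sketch below. It assumes a Linux host
and the default ports; the connection limit of 100 is simply the value used in
the port example.

```conf
source inputs {
    internal();                    # messages from Syslog-NG itself
    unix-stream("/dev/log");       # local logging channel on Linux
    udp();                         # standard Syslog, 514/udp
    tcp(max-connections(100));     # Syslog-NG peers and the SSH tunnel, 514/tcp
};
```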

The destination
clause allows us to specify a "log sink", or place where we want our
logs to end up. Here we're using some built-in Syslog-NG macros to force
incoming log messages to be divided out into directories: first by hostname,
and then by year and month. Within each directory, messages will go into log
files named for the Syslog facility the message was logged to (mail, auth,
kern, local0, etc), with each file having a date stamp
attached. Notice that with Syslog-NG automatically creating a new file for
each day of logs, we don't even need a separate log rotation program! This is
just one more useful feature of Syslog-NG. The other options to the "file()" directive make sure that
directories will be created as needed and set sensible ownerships and
permissions on the newly created files and directories.
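
A destination of the kind just described could be sketched as follows. The
name "logfiles", the top-level /logs directory, and the particular ownership
and permission values are all assumptions chosen for illustration; the macros
and file() options themselves are standard Syslog-NG.

```conf
# "logfiles", /logs, and the owner/permission values are illustrative only
destination logfiles {
    file("/logs/$HOST/$YEAR/$MONTH/$FACILITY.$YEAR$MONTH$DAY"
         owner(root) group(root) perm(0600)
         create_dirs(yes) dir_perm(0700));
};
```

With this template, a message logged to the auth facility on 15 January 2004
by a (hypothetical) host named www1 would land in
/logs/www1/2004/01/auth.20040115.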

Having defined our inputs
and destination directives, we
combine them into log
declarations to actually tell the Syslog-NG daemon what to do with the incoming
messages. Here we're just doing the trivial rule that sends all of our
incoming messages from all sources into the log file directory hierarchy we
defined in the destination
directive above.
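
In its simplest form that rule is a one-liner. The sketch below assumes the
source is named "inputs", as in the text, and that the destination was given
the (hypothetical) name "logfiles"; use whatever names your own declarations
carry.

```conf
# Send everything from every configured input to the log file hierarchy
log { source(inputs); destination(logfiles); };
```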

With the basic configuration of the central loghost out of
the way, let's take a look at a sample configuration for the remote log server
on the other end of the SSH tunnel. It's actually not too much different from
the configuration for the central loghost:
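
A sketch of what such a remote configuration might contain is shown here.
The destination name "logfiles", the /logs directory, and the ownership and
permission values are assumptions for illustration; the "remote" destination
and the two log statements are the parts discussed below.

```conf
options {
    check_hostname(yes);
    keep_hostname(yes);
    chain_hostnames(no);
};

source inputs {
    internal();
    unix-stream("/dev/log");       # Linux local logging channel
    udp();                         # other hosts at the remote site
    tcp(max-connections(100));     # other Syslog-NG servers or tunnels
};

# Local copy; "logfiles" and /logs are illustrative names
destination logfiles {
    file("/logs/$HOST/$YEAR/$MONTH/$FACILITY.$YEAR$MONTH$DAY"
         owner(root) group(root) perm(0600)
         create_dirs(yes) dir_perm(0700));
};

# Reverse tunnel endpoint on this machine, default port 514/tcp
destination remote { tcp("localhost"); };

log { source(inputs); destination(logfiles); };
log { source(inputs); destination(remote); };
```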

Basically, all we've done here is add an additional destination directive and an
additional log directive. The
"remote" destination
says to log via TCP to "localhost"
using the default port 514/tcp (since we didn't specify an alternate port).
"localhost:514" should
be the location of our reverse tunnel endpoint. Note that if you used an
alternate port for the tunnel endpoint, you can specify it:

    destination remote { tcp("localhost" port(5014)); };

Our first log
declaration keeps a local copy of all log messages received in a directory
structure on the remote log server that parallels the one on the central
loghost. The second log
directive also relays a copy of all messages back to the central log server via
the SSH tunnel. It's up to you whether you keep a local copy of the logs on
the remote log server, but most likely the admins at the remote site will
appreciate having this copy of the logs.

Note that in the inputs
section above, we've configured the standard "udp()" input for normal UDP Syslog messages. This
means that other hosts at the remote site can send Syslog messages to the
remote log server and those messages will be relayed by the Syslog-NG server back
through the SSH tunnel to the central log host at home base. We've also
configured the remote log server to listen for messages on the "tcp()" input channel. Maybe
there are other Syslog-NG servers at the remote location, or perhaps there is
an SSH tunnel from the remote log server to some other remote site and we're
chaining log messages through multiple hops!

Conclusion

I think you'll find this a very easy little recipe to
implement, and yet it achieves a very powerful goal. Of course, once you have
this big pile of logs you're going to want some sort of tool that actually
reads the logs for you and sends you the "interesting" events. You
could use a simple tool like Logcheck[3] or Swatch[4], or investigate some of
the newer, fancier tools out there like Logsurfer+[5], SEC[6], or Lire[7].
Whatever solution you end up with, let me assure you that I never regret the
effort I expend to set up centralized logging and log monitoring, because the
visibility I get as far as what's happening on my networks is enormously
useful.

About the Author

Hal Pomeranz (hal@deer-run.com)
has been doing IT for more than 15 years. His favorite activity is being up at
midnight on New Year's Eve so he can hear the disk drives on his log servers
spin as the logging directory hierarchy for the new year is created.