Interface

s6-log reads and compiles logging script to an internal
form. Then it reads its standard input, line by line, and performs actions
on it, following the script it is given. It does its best to ensure there
is never any log loss. It exits cleanly when stdin closes or when
it receives a SIGTERM or a SIGHUP.

Options

-b : blocking mode. With this option, s6-log stops
reading its standard input while it has unflushed buffers. This ensures that
every log line has been fully processed before reading the next one; this is also
multilog's behaviour.
By default, s6-log keeps reading from stdin even if its buffers still
contain data. -b is safer, but may slow down your service; the default
is faster, but may lead to unbound memory use if you have a lot of output to
write to a slow file system.

-p : protect against SIGTERM. Do not exit on receipt of a
SIGTERM; only exit on a SIGHUP or when reading EOF on stdin. This is useful
for a logger that you really do not want to lose even if automated
administration (e.g. the downing of a supervision tree) would kill it.

-t : timestamp. Prepends every log line that is written to
a logging directory with a
TAI64N
timestamp. This option is now deprecated.

-e : timestamp alerts. Prepends every "alert" line with a
TAI64N timestamp. This option is now deprecated.

-l linelimit : if a log line is longer than
linelimit bytes, cut it in two by inserting a newline. The newline
may appear anywhere after linelimit or more bytes. linelimit
cannot be less than 48. Using this option ensures that an ill-formatted log
stream will not make s6-log uncontrollably eat memory by trying to read and
process an extremely long line in one go. The default is no cap on line
length.

Logdirs

A logdir (logging directory) is a place where logs are
stored. s6-log can be scripted to write into one or more logdirs.

A logdir may contain the following files:

lock: this file is locked by s6-log at start,
to make sure only one instance is running at the same time.

current: the file where selected log lines are appended.
If current has the executable-by-user flag, it means that no
s6-log process is currently writing to it and the previous
s6-log process managed to cleanly finalize it. If it does not,
either a s6-log process is writing to it or the previous one
has been interrupted without finalizing it.

state: last processor's output, see below.

previous: a rotation is happening in that logdir.

processed, newstate: a rotation is happening
in that logdir and the processor script is running.

timestamped files: those files are @timestamp.s or
@timestamp.u and are old log files that have been processed
and rotated (if they're ending in .s) or that were the current
file when s6-log got interrupted (if they're ending in .u), in which
case they have not been processed.

Rotation

In a logdir, selected lines are appended to the current file.
When current becomes too big, a rotation happens.
The current file will be possibly processed, then
it will become an archived log file named
@timestamp.s, where timestamp, a
TAI64N
timestamp, is the absolute time of the rotation. If there are too
many archived log files in the logdir, the older ones are then
suppressed. Logging then resumes, to a brand new current
file.

You can use this mechanism to ensure that your logs never fill up
the available disk space, for instance: something that neither
syslogd, nor syslog-ng, nor rsyslog offers.

Processors

A processor script can be set for every logdir. When a rotation
occurs, current (which has then been renamed previous) is
fed to processor's stdin, and processor's stdout is saved
and archived. processor can also read the state file
on its fd 4; what it writes to its fd 5 will be saved as the next
state file, for the next rotation. A processor script runs with
the logdir as its working directory.

Processors should not background themselves: s6-log considers the
processing done when its processor direct child dies.
Processors should exit 0 on success and nonzero on failure; if a
processor fails, s6-log will try it again after some
cooldown time.

Processors make s6-log Turing-complete by allowing you to use any
external program to handle log files that are going to be archived.

Logging script syntax

When starting up, s6-log reads its arguments one by one; this
argument sequence, or directive sequence, forms a
logging script which tells s6-log what to log, where, and how.

Every directive can be a selection directive, a control
directive or an action directive. A valid logging script always
contains at least one action directive; every action directive can be
preceded by zero or more selection or control directives. s6-log will exit 100
if the script is invalid. If it can process the script but the last directive
is not an action directive, s6-log will emit a warning.

Selection directives

These directives tell s6-log whether to select or deselect lines it reads
from stdin; actions will only happen on selected lines. By default, every
line is selected.

-regexp: deselect yet-selected lines that match regexp, which must be a
POSIX Extended Regular Expression.

f: select exactly lines that have not yet been acted
upon (i.e. that were always deselected when the script encountered an action
directive).

Control directives

These directives tune s6-log's behaviour for the next actions.

nnumber: next logdirs will contain up to
number archived log files. If there are more, the oldest archived
log files will be suppressed, only the latest number will be kept.
By default, number is 10.

sfilesize: next rotations will occur when
current log files approach filesize bytes. By default,
filesize is 99999; it cannot be set lower than 4096 or
higher than 268435455.

Stotalsize: next logdirs will contain up
to totalsize bytes of archived (and maybe processed) log files. If
archived log files take more space than that, the older ones are suppressed
until the total size fits. A totalsize of zero means no such limit;
use n0 instead if you don't want any archived log files. By
default, totalsize is 0 (unlimited).

ltolerance: next rotations will be triggered
when the size of current goes higher than filesize minus
tolerance. tolerance cannot be more than half of
filesize. By default, tolerance is 2000.

rcooldown: if an error occurs during operations
on the next logdirs, retry every cooldown milliseconds. By default,
cooldown is 2000; it's strongly discouraged to set it to a value
under 50.

Ealertsize: only the first alertsize
bytes of the selected lines will be used in the next alerts. An
alertsize of 0 means no limit. By default, alertsize is
200.

^statussize: only the first statussize
bytes of the selected lines will be used in the next status file updates.
If a line is shorter than statussize bytes, the status file will be
padded with newlines so it is always statussize bytes long. 0 means
an unpadded, unlimited status file. By default, statussize is 1001.

!processor: registers
execlineb -Pc processor as a processor for the next logdirs;
execlineb must be found in s6-log's PATH.
If processor is empty, no processor will be set for the next logdirs.
By default, no processor is set.

t: the logged line will be prepended with a
TAI64N
timestamp (and a space) before being processed by the next action
directive. Giving the t directive several times before an action
directive has no effect.

T: the selected line will be prepended with a
ISO 8601
timestamp for combined date and time representing local time
according to the system's timezone, with a space (not a 'T')
between the date and the time and two spaces after the time, before being
processed by the next action directive. Giving the T directive
several times before an action directive has no effect.

Note that unlike the other control directives, the t and
T directives are not sticky: their effect will
disappear after the next action directive, and they need to be
reapplied if necessary. If both a t and a T
directives are given before an action directive, the TAI64N timestamp
will always appear before the ISO 8601 timestamp.

Action directives

These directives determine what s6-log actually does with the
selected lines.

2 or e: alert. s6-log will print "s6-log: alert: ",
possibly prepended with a timestamp, followed by the first
alertsize bytes of the line, to its standard error.
The e form is deprecated.

1: forward to stdout. s6-log will print the selected
line to its stdout. If any error occurs, e.g. if stdout was a pipe and the
reading end closed, this directive will be ignored for the rest of
s6-log's lifetime.

=statusfile: status. s6-log will atomically
update the statusfile file with the first statussize
bytes of the line, and pad it with newlines. s6-log must have the right
to write to statusfile and to statusfile's directory.

dir (must start with '/' or '.'): logdir.
s6-log will log the line into the logdir dir. s6-log must have
the right to write to dir.

Signals

SIGTERM: If s6-log has been run with the -p option, does nothing.
Without this option, SIGTERM instructs s6-log to stop reading stdin after the
next newline and exit after logging the last line.

SIGALRM: triggers a rotation on every logdir s6-log is monitoring,
as if the current file in those logdirs had reached the size
limit.

Examples

s6-log -b n20 s1000000 t /var/log/services/stuff

Logs all of stdin, prepending every line with a TAI64N timestamp, into the
/var/log/services/stuff logdir, with a maximum archive of
20 log files of 1 MB each; makes sure every line has been written
before reading the next one.

Sends alerts to stderr with the 500 first bytes of lines containing "fatal:".

Maintains the /var/log/foobard/status file at 1001 bytes,
updating it when it finds a log line starting with "STAT".

Logs all other lines to logdir /var/log/foobard, prepending
them with an ISO 8601 timestamp. When current
reaches at least 9998 kB (i.e. 10 MB filesise minus 2kB tolerance), pipe it
through gzip -nq9 and save the result into a timestamped archive file, with
a maximum of 30 such files or a total of 15 MB of compressed archive files.

Why use execlineb to interpret the "processor" string ?

Directly executing processor is not flexible enough. We want
to be able to run a complete command line, with an executable name and its
arguments.

We could interpret the processor string via /bin/sh.
This is what multilog
does. However, /bin/sh, despite being the traditional Unix interpreter,
is overpowered for this. We don't need a complete shell script interpreter:
most processor commands will be very simple, with only two or three
words, and we only need a way to turn a string into an argv, i.e. a
command line.

execlineb
was designed just for this: to turn simple strings into command lines.
It is a very fast and lightweight script launcher, that does not do any heavy
startup initialization like /bin/sh does. It happens to be the perfect
tool for the job.

To be perfectly honest: I also did this on purpose so people have a
reason to use the
execline language. But
seriously, it really is the perfect tool for the job.

Why have another logging mechanism ?

Because the syslog mechanism and all its implementations (save one) suck.
I'm not being judgmental; I'm just stating the obvious.

When
asked why he started rsyslog, Rainer Gerhards came up with a lot of
hand-waving and not a single word about technical points. There is a
reason for that: rsyslog is forked from sysklogd! So, no matter how
many bells and whistles are added to it, it still suffers from the same
basic flaws.

The problem with syslogd does not come from such or such implementation.
The problem comes from syslog's design in the first place.

syslog makes you send all your logs to the same place.
The logs from a zillion processes are read by a single syslogd server.
The server checks log lines against system-wide regular expressions
to decide where to write them. This raises the following issues:

Unless the client explicitly mentions its name in every log
line, there is no way for log readers to know what process generated a
given line. Some syslogd implementations can log the pid of the client;
big deal.

Log lines from every client have to run through the same regular
expression matching. This requires huge regular expression sets, and an
obvious performance impact, to do anything meaningful. And as a matter
of fact, standard syslogd configurations don't do anything meaningful:
they separate the logs into a few streams such as /var/log/messages,
/var/log/auth.log, /var/log/daemon.log or
/var/log/syslog with very vague semantics. All of syslogd's
line processing power remains unused, because making real use of it would
be too complex.

syslogd logs to files. This is wrong, because files grow
and disks fill up. Sure, there are utilities such as logrotate
to perform cleaning up, but as long as logging and log rotation are
kept separate, there is a race condition: a log file can grow and fill
up a disk before a rotation occurs. I am all for separating tasks that
can be separated, but there is no choice here: logging and log
rotation management must be done by the same tool.
Only a few non-mainstream implementations of syslogd do this, including the
Busybox one - and that is a
feature added by the Busybox developers who are aware of the problem
but want to maintain compatibility with the historical syslogd.
Neither syslogd (-ng or not) nor rsyslogd manages its log files: that's a
flaw that no amount of external tools is going to fix.

syslogd is a complex process that runs as root. We all know what
complex processes running as root mean: bugs turning into security holes.

syslog requires a syslogd service, and fails otherwise. A syslogd
service may not be present, it may fail... or it may want to log stuff.
Who's going to take care of syslogd's error messages ?

syslog is slow, it's unsafe, and it's incomplete. The only reason people
use it is because it's historical, it exists, and there hasn't been any
serious alternative yet, except maybe
multilog, which
s6-log improves upon.

Unix distributions already do this to some extent, but it's at best
unclear where the logs go for any given program.

Every program, without exception, should send its logs (be it
error messages, warning messages, or anything) to its standard
error descriptor, i.e. fd 2. This is why it's open for.

When process 1 starts, the logging chain is rooted to the
machine console: anything process 1 sends to its stderr
appears, without modification, on the machine console, which should
at any time remain the last resort place where logs are sent.

Process 1 should spawn and supervise a catch-all logging
mechanism that handles logs from every service that does not
take care of its own logging. Error messages from this logging
mechanism naturally go to the machine console.

Process 1's own error messages can go to the machine console,
or dirty tricks can be used so they
go to the catch-all logging mechanism.

Services that are spawned by process 1 should come with their
own logger service; the supervision mechanism offered by
s6-svscan makes it easy. Error messages
from the loggers themselves naturally go to the catch-all
mechanism.

User login mechanisms such as getty, xdm or
sshd are services: they should be started with their own
loggers. Of course, when a user gets a terminal and a shell, the
shell's stderr should be redirected to the terminal: interactive
programs break the automatic logging chain and delegate responsibility
to the user.

A syslogd service may exist, to catch logs sent via
syslog() by legacy programs. But it is a normal service, and logs
caught by this syslogd service are not part of the logging chain.
It is probably overkill to provide the syslogd service with its own
logger; error messages from syslogd can default to the catch-all logger.
The s6 package, including the ucspilogd program,
provides enough tools to easily implement
a complete syslogd system, for a small fraction of the resource needs and
the complexity of native syslogd implementations.

So, given a program, where are its logs sent ?

Logs sent via syslog() will be handled by the syslogd service as
usual. Smart administrators will make sure that those ones are as few as
possible. The rest of this analysis is about logs sent to stderr.

If the program is descended from a user's interactive program,
its logs are sent to the user's terminal or the user's choice redirection
target.

If the program is descended from a logged service, its logs are
naturally sent to the service's logger.

Else the logs are sent to the catch-all logger.

Only the catch-all logger's error messages, the kernel's fatal
error messages, and maybe process 1's error messages, are sent to the
system console.

In a logging chain situation, every service must have
its own logger. To avoid syslogd's design mistakes, one logger process
per service must be run. s6-log fits that role. Using s6-log as
your one-stop logger offers the following benefits:

Every instance of s6-log can run as a different user, so it's
easy to give different access rights to different logs. It is also
more secure not to have any logger process running as root.

s6-log consumes very little memory per instance (unless it
accumulates unflushed log lines, which you can avoid with the
-b option). So, launching a lot of s6-log processes does
not waste resources.

s6-log is vastly configurable via logging scripts; every instance
is as powerful as a traditional syslogd.

s6-log can log to a RAM filesystem and thus is suitable as a
catch-all logger. Clever tricks like Upstart's logd or daemontools'
readproctitle
are just that: tricks. s6-log gives a unified interface to all of
your system's loggers.

You mean you want to send, live, every log line
over the network via UDP ? You can't be serious.

Do yourself a favor and use s6-log to write log lines to a logdir,
with a processor script that sends files-being-archived to the
network, possibly after compressing them. More reliability, less
log lines lost, less network traffic, better engineering. If you
have no disk to even write the current files to, write
to a small RAM filesystem.

If you have to log stuff live via the network, you
do not need any local logging software. You don't even need syslogd.
Just filter your stderr via some grep that selects lines for
you, then sends them to a network socket. A trivial shell script, or
execline
script, can do that for you.

Do not insist on using syslogd. It does nothing magical, and nothing
that can't be done in a simpler way using simpler tools.