LogCabin.appendEntry(3, "Server Stats")

This is the third in a series of blog posts detailing the ongoing development
of LogCabin. This entry describes a new
tool to extract information from LogCabin servers and a tricky bug that
occurred when unregistering handlers from the event loop.

Server Stats

Nate Hardt has been integrating LogCabin into Scale's software stack and
testing it, and he's run into a few issues lately that were needlessly
time-consuming to diagnose. Some of the problems were silly, like when one test
stopped the LogCabin daemon and it was never restarted for the next test.
Still, even this one was hard to spot, since LogCabin had no tooling for
diagnostics.

To make our lives easier, we wanted a tool that would collect potentially
useful information about each server's internals, then report it back in a
readable way.

The first step was to collect some statistics into a Protocol Buffer, and have
servers dump these out to their debug logs. They now do this periodically
(defaulting to once per minute) and also upon the SIGUSR1 signal. Protocol
Buffers are nice for this sort of thing, since we expect the number of
stats to grow over time, and not every tool needs to understand each stat.

Here's an example of the raw output (using the Protocol Buffers text format):

The second step was to collect all this information with a new RPC, named
getServerStats(). This RPC is different from existing ones in that it's not
destined for the cluster leader. That's because we want statistics from every
individual server, and they need to be collected even if the cluster has no
leader. The client library had to be refactored a little bit as a result, so
that the event loop and TCP connection rate-limiting mechanism are available
outside of Client::LeaderRPC. A new client binary called ServerStats
invokes the RPC and prints out the results.

Finally, the scripts/serverstats.py script understands the meaning of the
ServerStats fields, and it tries to present something a little nicer for human
consumption. Some of the fields in the raw output (like nanoseconds since the
Unix epoch) are not easy for humans to read, so the script translates those to more
meaningful formats. It also uses colors to highlight interesting things, and
it'll output a warning if the cluster has no leader (more warnings to come in
the future).

Here's an example of the serverstats.py output for the same server as above:

The ServerStats structure will need to include much more information to be
comprehensive, but it's probably easier to do that over time than all at once.
It currently includes just the state that was easily available in the Raft
module. There's also room for improvement on formatting the information nicely
and highlighting problems. Still, it's a useful tool already, and it's a good
starting point for further iteration.

DumpTree

It's also important to be able to see what a LogCabin state machine is storing
in its Tree structure. I created a simple client called DumpTree to
recursively list out the values for each key. This will only work for small
state machines with human-readable values, but it's so much better than having
no way to see inside at all, or writing a new C++ client every time.

Event Loop Crashes

Nate kept running into crashes like this
one, which came with pretty
mysterious error messages:

pure virtual method called
terminate called without an active exception
Program received signal SIGABRT, Aborted.

After seeing a few of these with similar stack traces, we took a closer look
and found the problem. It's subtle but worth discussing.

LogCabin has a set of classes that wrap epoll and provide an event loop.
Event::File used to be an abstract base class that would register a file
descriptor with the event loop and arrange for the event loop thread to call a
virtual handleFileEvent() method appropriately. The destructor on File would
first interrupt the event loop thread, then unregister the file handler.

RPC::ReceiveSocket is one example of a class that derives from Event::File,
representing the receiving end of a TCP connection. When ReceiveSocket is
destroyed, C++ calls its destructor, starts to destroy the ReceiveSocket
members, probably scribbles on the vtable, and only then does it call the
destructor on the base class, Event::File. If the event loop thread called
handleFileEvent at this point (just before the Event::File destructor), the
process would crash.

The desired behavior is to run through the shutdown procedure for Event::File
first, interrupting the event loop thread and unregistering the file handler.
Only then would it be safe to start destroying the ReceiveSocket.

One way to achieve this behavior is to require all classes that derive from
Event::File to call a shutdown method on Event::File as the first step in
their destructors. However, this approach would be error-prone: if someone
forgot to call this method in any one class, they might get crashes, either now
or anytime in the future.

Instead, the approach I implemented split Event::File into two classes: one
to provide a handler, and a second one, called a Monitor, to register the
handler with the event loop. The lifetime of the handler must extend beyond
that of the monitor; thus, the monitor's destructor will interrupt the event
loop and unregister the handler before the handler begins to be destroyed. This
is relatively easy for users of these classes to get right: they generally
declare a monitor following a handler in the same object or scope. The
monitor's constructor needs a reference to the handler, so it's hard to declare
these variables backwards. And if someone forgets to create a monitor
altogether, their handler will never fire, so they will probably notice in
testing.

Next

Scale is in the middle of an office move that's making some of the development
and test clusters unavailable, so it's probably a good time to work through
some of the issue backlog next.
Thanks to Scale Computing for supporting this
work. And congrats to Nate on his first two pull requests
(#79 and
#81).