Unintended consequences of daemonizing C++ code

I created an interesting problem for myself the other day while working
on some C++ code for a client. It was a direct result of unintended
consequences after completing a feature request, and it also had a touch
of "heisenbug" type behavior for extra fun.

As part of this project, I've written something which talks to a
third-party persistent storage backend. The backend has a client
library which is just an ordinary C library and header file. I have a
class of my own which links against that in order to read and store
things.

One of the feature requests which came down the pipe was to make this
program turn itself into a daemon when it was run. This isn't a big
deal, and I've done it plenty of times. I just wrote the usual "call
fork, parent exits, child closes stdin/out/err and becomes a session
leader before carrying on" code. That part worked just fine.
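
For anyone who hasn't written that dance before, here is a rough sketch
of it. This is not the code from the project: Daemonize() and its
foreground flag are names made up for illustration, and it points
stdin/out/err at /dev/null instead of simply closing them so that fds
0-2 can't be reused by something else later. Note that the parent
leaves via plain exit(), as in the first version described here.

```cpp
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#include <cstdlib>

// Hypothetical helper, not the code from this project.
// Returns true in the process that should carry on running the server:
// the backgrounded child, or the original process when `foreground` is
// set. Returns false if any step fails.
bool Daemonize(bool foreground) {
  if (foreground) {
    return true;  // testing switch: stay put and keep stderr for logging
  }

  pid_t pid = fork();
  if (pid < 0) {
    return false;  // fork failed; still in the original process
  }
  if (pid > 0) {
    std::exit(0);  // parent exits; plain exit(), as in this first version
  }

  // Child: become a session leader, detaching from the terminal.
  if (setsid() < 0) {
    return false;
  }

  // Replace stdin/out/err with /dev/null so fds 0-2 stay occupied.
  int devnull = open("/dev/null", O_RDWR);
  if (devnull < 0) {
    return false;
  }
  dup2(devnull, STDIN_FILENO);
  dup2(devnull, STDOUT_FILENO);
  dup2(devnull, STDERR_FILENO);
  if (devnull > STDERR_FILENO) {
    close(devnull);
  }
  return true;
}
```

A caller would typically invoke this once near the top of main(), before
starting any threads, and bail out if it returns false.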

I also added a testing switch to make it run in the foreground during
development work. It's a lot easier to test things on the real binary
when you don't have to chase down something which has escaped into the
background. It also means you have stderr right there for logging
purposes.

This was all okay, but then something happened when I went to stand up a
test instance of this server. It started up and went into the
background just fine, but then things started misbehaving. I'd poke
this server with a request which needed to hit the backend storage, and
it would fail. The third-party library said something like "server
closed the connection". That made no sense.

This library used either a Unix domain or TCP socket to talk to the
storage server. I started up strace and watched it send the request
along that socket, only for the write to fail with EPIPE, aka "broken
pipe". Something was going on which made that socket go bad, but what?

Further confounding matters, running the same code but skipping the call
to drop into the background made the problem disappear. I could watch
it with strace in test mode, and it would send the same data over the
socket and work just fine. WTF?

I got the idea to strace the program from the beginning so I could
follow the actions of both parent and child as it entered the
background. Something weird was happening which I couldn't see when I
attached to the already-running background process, and watching from
the beginning was the only way to see it.

That's when I noticed that the parent was explicitly closing its file
descriptor to the server. It happened right after fork() returned. I
thought "close-on-exec" for a moment, but that didn't make any sense
since I wasn't calling exec. That line of reasoning wasn't directly
useful, but it did get me thinking generally about "stuff that runs when
things shut down", and that's when it hit me.

Destructors.

As previously mentioned, I have a class which wraps this third-party C
library. It has a simple little constructor which does no real work, an
Init() function which calls the library's "open connection" function,
and a destructor which calls the library's "finish connection" function.

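What was happening was now obvious: when the parent returns from the
fork() and calls exit(), the destructors for global and static objects
run (exit() does not unwind the stack, so locals are left alone). My
wrapper object was among them, and its destructor calls the "finish
connection" code in that client library.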
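
This is easy to reproduce in miniature. The sketch below is a stand-in,
not the real wrapper: the Connection class, g_conn and
DestructorRanOnExit() are all invented for the demo, and "finish
connection" is faked by writing a single byte to a pipe. The roles are
also flipped for convenience: here the forked process is the one that
exits, so the caller can watch what it does on the way out.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cstdlib>

// Stand-in for a class wrapping a C client library (hypothetical).
class Connection {
 public:
  ~Connection() {
    if (fd_ >= 0) {
      // Stand-in for the library's "finish connection" call.
      ssize_t ignored = write(fd_, "Q", 1);
      (void)ignored;
    }
  }
  void Init(int fd) { fd_ = fd; }

 private:
  int fd_ = -1;
};

Connection g_conn;  // static storage duration, so exit() destroys it

// Forks a process which leaves via exit() or _exit(). Returns true if
// that process ran ~Connection() on its way out.
bool DestructorRanOnExit(bool use_underscore_exit) {
  int fds[2];
  if (pipe(fds) != 0) std::abort();

  pid_t pid = fork();
  if (pid < 0) std::abort();
  if (pid == 0) {
    close(fds[0]);
    g_conn.Init(fds[1]);
    if (use_underscore_exit) {
      _exit(0);    // no destructors, no atexit handlers
    }
    std::exit(0);  // runs destructors of static objects first
  }

  close(fds[1]);
  char c;
  ssize_t n = read(fds[0], &c, 1);  // 1 byte if "Q" arrived, 0 on EOF
  close(fds[0]);
  waitpid(pid, nullptr, 0);
  return n == 1;
}
```

With this, DestructorRanOnExit(false) reports that the destructor fired,
and DestructorRanOnExit(true) reports that it did not.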

Now, granted, the child process has its own address space courtesy of
copy-on-write behavior, but fork() does not duplicate the connection
itself. Parent and child each held a file descriptor, and both
referred to the same socket to that storage server. When the parent
shut down, it wrote something down that fd which said "I'm done here",
and the server disconnected it.
The child had no idea it was now sitting on top of a useless file
descriptor where the far end had disconnected. As a result, when it did
its next write(), that failed with the "broken pipe" error.
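
The same failure can be staged with a socketpair. This is only a sketch
under some assumptions: WriteAfterSiblingSaysGoodbye() is a made-up
name, a one-byte "Q" stands in for whatever the real library sends when
it finishes a connection, and the calling process plays both the storage
server and the surviving daemon to keep things short. SIGPIPE is
ignored so the failed write shows up as an errno instead of killing the
process.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cerrno>
#include <csignal>
#include <cstdlib>

// Returns the errno from the "daemon" write which follows a sibling's
// goodbye on the shared connection, or 0 if that write succeeded.
int WriteAfterSiblingSaysGoodbye() {
  std::signal(SIGPIPE, SIG_IGN);  // get EPIPE from write(), not a signal

  int sv[2];  // sv[0]: client end, sv[1]: stand-in "storage server" end
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) std::abort();

  pid_t pid = fork();
  if (pid < 0) std::abort();
  if (pid == 0) {
    // Plays the exiting process from the story: it has its own copy of
    // the descriptor, but the same socket underneath.
    close(sv[1]);
    ssize_t ignored = write(sv[0], "Q", 1);  // "I'm done here"
    (void)ignored;
    _exit(0);
  }
  waitpid(pid, nullptr, 0);

  // Server side: see the goodbye and hang up.
  char c;
  if (read(sv[1], &c, 1) != 1) std::abort();
  close(sv[1]);

  // Daemon side: the next request goes down a dead connection.
  errno = 0;
  ssize_t n = write(sv[0], "x", 1);
  int result = (n < 0) ? errno : 0;
  close(sv[0]);
  return result;
}
```

The process which never said goodbye is the one left holding the broken
socket, which is exactly why the error looked so baffling from the
child's point of view.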

I did a couple of things about this. First, I decided that calling
destructors when the parent exits is inappropriate for this particular
program and switched over to using _exit() instead, which ends the
process without running them. Next, I did some digging around in that
library's API and found a way to detect failed connections and a way to
request a retry.

With those changes, I don't break my server connection, and I'm also
resilient to other things which might cause it to go down. If that
happens for whatever reason, it will bring the link back up.
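
The retry side might look something like the sketch below. Everything
in it is hypothetical, since the real library's API isn't named here:
Backend is a made-up stand-in with a send call which reports failure and
a reconnect call, and SendWithRetry() is just one simple policy (try,
reconnect, try again, give up after a fixed number of attempts).

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical stand-in for the third-party client library.
struct Backend {
  // Returns false when the connection has failed.
  std::function<bool(const std::string&)> send;
  // Tries to bring the link back up; returns false if that fails too.
  std::function<bool()> reconnect;
};

// Sends `request`, reconnecting and retrying after a failure. Makes at
// most `max_attempts` send calls and gives up early if a reconnect
// fails. Returns true once a send succeeds.
bool SendWithRetry(const Backend& backend, const std::string& request,
                   int max_attempts) {
  assert(max_attempts > 0);
  for (int attempt = 1; attempt <= max_attempts; ++attempt) {
    if (backend.send(request)) {
      return true;
    }
    if (attempt == max_attempts || !backend.reconnect()) {
      break;
    }
  }
  return false;
}
```

A real version would probably want a delay between attempts so it
doesn't hammer a server which is genuinely down.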

If you have a C++ program which works fine in the foreground but always
fails its first external communication when it's allowed to background
itself, something like this might just be happening in there.