fork/exec is forked up

I don’t know why it’s taken me so long to realise this, but the whole fork/exec shebang is screwed. There doesn’t seem to be any simple standards-conformant way (or even a generally portable way) to execute another process in parallel and be certain that the exec() call was successful. The problem is that once you’ve fork()d and then successfully exec()d, the child can’t communicate with the parent process to tell it that the exec() succeeded. If the exec() fails, the child can tell the parent (via a signal, for instance), but it can’t report success. The only way the parent can be sure the exec() succeeded is to wait() for the child process to finish (and check that there is no failure indication), and that of course is not parallel execution.

If you have a working vfork() then you can communicate with the parent process using a shared memory location (any variable declared “volatile” should do the trick, bearing in mind that “volatile” is not particularly well defined) although doing so is not standards conformant (see http://opengroup.org/onlinepubs/007908775/xsh/vfork.html for instance). By “working” vfork() I mean a vfork() which 1. causes the address space to be shared between the child and parent process (rather than copying the address space as fork() does) and 2. causes execution of the parent process to be suspended until the child either successfully exec()s or _exit()s. The 2nd requirement is necessary to avoid the race condition whereby the parent process checks the status variable before the child process has actually tried to exec() or similar.

Even the vfork() solution, if it can be used, is reliant on the compiler doing the right thing both in terms of volatile access and in terms of vfork() generally. (Note that the “right thing” in terms of volatile access is the generally accepted definition, not the one in the C standard. Also, the fact that vfork()’s correct operation really requires compiler support is something that I’m not willing to go into right now, other than to say that the compiler is allowed to assume that “if (vfork()) { } else { }” will always follow one branch and not the other, which is not the case here, as both branches will be executed within the same address space.)

The only other solution I can think of is to use pipe() to create a pipe, set the write end to be close-on-exec, then fork() (or vfork()), exec(), and write something (perhaps errno) to the pipe if the exec() fails (before calling _exit()). The parent process can read from the pipe and will get an immediate end-of-input if the exec() succeeds, or some data if the exec() failed. This seems like it should be fairly portable, as long as FD_CLOEXEC is available, but I haven’t actually tried it out yet.

Update 8/1/2009: The latter solution really requires fork() instead of vfork(), because vfork() might suspend the parent and the write by the child might then block, causing a deadlock. (In practice a pipe is always going to have enough of a buffer for this not to happen). POSIX doesn’t even let a vfork()d child call write(), anyway.
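The pipe approach described above can be sketched roughly as follows. This is only a minimal sketch under some assumptions: the helper name spawn_checked is hypothetical (it is not from any library), it uses fork() rather than vfork() for the reasons given in the update, and it assumes FD_CLOEXEC and execvp() are available.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, then exec argv[0] (searched via PATH).
 * Returns 0 if the exec succeeded (child pid stored in *pid_out),
 * otherwise an errno value describing why pipe/fork/exec failed. */
int spawn_checked(char *const argv[], pid_t *pid_out)
{
    int fds[2];
    assert(argv != NULL && argv[0] != NULL);

    if (pipe(fds) == -1)
        return errno;

    /* Close-on-exec on the write end: a successful exec closes it,
     * which the parent observes as end-of-file on the read end. */
    if (fcntl(fds[1], F_SETFD, FD_CLOEXEC) == -1) {
        int e = errno;
        close(fds[0]);
        close(fds[1]);
        return e;
    }

    pid_t pid = fork();
    if (pid == -1) {
        int e = errno;
        close(fds[0]);
        close(fds[1]);
        return e;
    }

    if (pid == 0) {
        /* Child: only reaches the write() if the exec failed. */
        close(fds[0]);
        execvp(argv[0], argv);
        int e = errno;
        if (write(fds[1], &e, sizeof e) == -1) {
            /* Nothing useful can be done here. */
        }
        _exit(127);
    }

    /* Parent: must close its own copy of the write end, or the
     * read below would never see end-of-file. */
    close(fds[1]);

    int child_errno = 0;
    ssize_t n;
    do {
        n = read(fds[0], &child_errno, sizeof child_errno);
    } while (n == -1 && errno == EINTR);
    int read_errno = errno;
    close(fds[0]);

    if (n == 0) {               /* EOF: the exec succeeded */
        *pid_out = pid;
        return 0;
    }

    waitpid(pid, NULL, 0);      /* exec failed: reap the child */
    if (n == -1)
        return read_errno;
    return n == (ssize_t)sizeof child_errno ? child_errno : EIO;
}
```

Note that the read() in the parent blocks until the child has either exec()d or failed to, so the parent still waits briefly; it just doesn't have to wait for the child to finish.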

Update 13/11/2015: It is also worth mentioning that priority inversion can occur if the child process is to be run at a lower priority than the parent and a third process is running at some priority level between the two. The parent, which is running at the highest priority, should receive notification of fork-exec status as soon as possible; however, the child (running at the lowest priority) will not send this notification while the third process is scheduled.

This can be avoided only by not waiting for the exec to succeed or fail, and instead processing the exec status (data coming over the pipe, or pipe closure) asynchronously. (Update 17/11/2015: or by, e.g., using ptrace() to stop the child after exec() so that its priority can be set at that point; this wouldn’t work, however, if the child was setuid [unless the parent was root], since you can’t change the priority of a process you don’t own.)
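As an illustration of the asynchronous option, the read end of the pipe can be made non-blocking and checked from an event loop (with poll(), say) whenever it becomes readable. The following is a sketch only; the names spawn_async, exec_status and the exec_state enum are hypothetical, and it assumes O_NONBLOCK and poll() are available.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum exec_state { EXEC_PENDING, EXEC_OK, EXEC_FAILED };

/* Hypothetical helper: start the child and return the (non-blocking)
 * read end of the status pipe, or -1 with errno set on failure. */
int spawn_async(char *const argv[], pid_t *pid_out)
{
    int fds[2];
    assert(argv != NULL && argv[0] != NULL);

    if (pipe(fds) == -1)
        return -1;
    if (fcntl(fds[1], F_SETFD, FD_CLOEXEC) == -1 ||
        fcntl(fds[0], F_SETFL, O_NONBLOCK) == -1) {
        int e = errno;
        close(fds[0]);
        close(fds[1]);
        errno = e;
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        int e = errno;
        close(fds[0]);
        close(fds[1]);
        errno = e;
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        execvp(argv[0], argv);
        int e = errno;          /* only reached if the exec failed */
        if (write(fds[1], &e, sizeof e) == -1) {
            /* Nothing useful can be done here. */
        }
        _exit(127);
    }

    close(fds[1]);
    *pid_out = pid;
    return fds[0];
}

/* Hypothetical helper: check the status pipe without blocking.
 * Closes fd (and reaps the child on failure) once the result is known. */
enum exec_state exec_status(int fd, pid_t pid, int *err_out)
{
    int e = 0;
    ssize_t n = read(fd, &e, sizeof e);
    int read_errno = errno;

    if (n == -1 && (read_errno == EAGAIN || read_errno == EWOULDBLOCK ||
                    read_errno == EINTR))
        return EXEC_PENDING;    /* child has not reached exec yet */

    close(fd);
    if (n == 0)
        return EXEC_OK;         /* EOF: the exec succeeded */

    waitpid(pid, NULL, 0);
    if (n == -1)
        *err_out = read_errno;
    else
        *err_out = (n == (ssize_t)sizeof e) ? e : EIO;
    return EXEC_FAILED;
}
```

The parent would add the returned descriptor to whatever set it already polls, and call exec_status() when the descriptor is reported readable or hung up, so a low-priority child never holds up the parent.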

Update 18/9/2012: There’s a reasonably portable solution to this whole mess in the form of posix_spawn. (Update again 11/11/2015 – no. posix_spawn allows various errors to result in the child process exiting with a status of 127, so implementations may exhibit the exact same issue as the fork/exec combination. I somehow missed this earlier).
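To make the posix_spawn problem concrete, here is a rough sketch of what a caller has to cope with: the failure may show up as a non-zero return value, or only later as a child exit status of 127, depending on the implementation. The function name spawn_failure_mode is hypothetical, and the sketch waits for the child purely to demonstrate the two cases (so it is not a parallel launch).

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical demonstration helper. Returns:
 *   1  if posix_spawn() itself reported the failure,
 *   2  if the failure only showed up as exit status 127,
 *   0  if the child ran and exited with some other status,
 *  -1  if waitpid() failed.
 * A child that legitimately exits with 127 is indistinguishable
 * from case 2, which is exactly the problem. */
int spawn_failure_mode(char *const argv[])
{
    pid_t pid;
    int rc = posix_spawn(&pid, argv[0], NULL, NULL, argv, environ);
    if (rc != 0)
        return 1;               /* error reported immediately */

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 127)
        return 2;               /* error deferred to the exit status */
    return 0;
}
```

Which of the two failure results you get for a non-existent program depends on the C library in use, so portable code has to handle both.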

TL;DR – it’s not easy to determine immediately and without blocking whether a fork/exec pair of calls succeeded. If the child process is to run at the same or higher priority as the parent, you can do it with a CLOEXEC pipe trick; if the child process is to run at lower priority, the pipe trick suffers from potential priority inversion problems.

11 thoughts on “fork/exec is forked up”

Umm.. the child can easily getppid() and send a SIGUSR1 to the parent that indicates “Dad, I started!” … or the child can open a named pipe for writing which tells the parent (who opened it for reading) the same. Or, depending on the signal type, a pointer to some meaningful information can be sent.

Tim, I was referring to exec()ing an arbitrary child process… one you don’t control (such as a user-specified command). You can’t make such a process send a signal back to the parent. Once the exec() call succeeds, you no longer have control.

It doesn’t matter if it *does* block. You only need write to the pipe if the exec failed; in this case the parent process is waiting to determine the exec status anyway, so it will read the pipe shortly. The worst case is that the forked child survives slightly longer than it technically needs to.

Ah, unless using vfork of course – I guess that’s what you were referring to. However, if you use vfork you’re not allowed to ‘call any other functions’ before calling _exit or exec, which precludes writing to a pipe anyway.

Sam: In theory you could implement posix_spawn via fork() and exec() but still return a suitable error code if the exec() fails, using the techniques I’ve outlined in the post. I’m not particularly worried about this implementation detail, it’s more the awkwardness of the interface (particularly, in discovering whether and why the exec() call failed) that bothered me.

However, Glibc doesn’t do this, instead deferring reporting of exec() failures via a child exit status code of 127 (which of course doesn’t give information as to *why* the exec failed) and what’s worse is that, on closer reading, the POSIX specification for the function actually seems to allow this kind of weak implementation.