Re: [Sbcl-devel] sigchld handlers

Richard M Kreuter writes:
>
> So I think it's desirable to do something to support composition of
> independent child-creating programs here, but I'm not sure what's right.
> These options seem obvious to me...
Another approach would be to make SB-POSIX:FORK not return pids, but
instances of SB-IMPL:PROCESS. This would be an incompatible breakage
for any program that calls SB-POSIX:FORK, however, though it would make
SIGCHLD handling invisible to the user. (Though it would be arguably
more aesthetic to give users one high-level interface, instead of two
different interfaces at different abstraction levels that are able to
break each other.)

Daniel Herring writes:
> On Wed, 19 Dec 2007, Richard M Kreuter wrote:
>
> > So I think it's desirable to do something to support composition of
> > independent child-creating programs here, but I'm not sure what's
> > right.
>
> Interesting. I thought unix signals were designed to allow
> daisy-chaining. When you install a handler, you keep the old one to
> restore when your handler "exits scope". Similarly your handler may call
> the old handler (unless null), especially when it doesn't know how to
> handle a signal.
What you say is technically true, but it doesn't help us much.
(1) While any particular program that wants to install a signal handler
can hold onto a previously-installed signal handler function,
independent libraries that want to install handlers can't generally
coordinate: suppose that libraries A and B each install a handler
for a signal and arrange to call whatever handler was in place
before installation. So let's say A installs a handler, and then B
installs a handler and arranges to invoke A's handler under some
circumstances. Now suppose that A later wants to uninstall its
handler. Since A's handler is now invoked by B, A has to know how
to tell B not to run A's handler. So daisy-chaining is hard to pull
off without coordination among different users of a signal.
This problem applies to the present case: RUN-PROGRAM installs a
sigchld handler, and so can users.
(2) A SIGCHLD handler serves two different purposes: it gives the parent
an opportunity to take action when a child's status changes, and it
gives the parent an opportunity to call a function in the wait()
family. And you must wait for each child at least once, or else run
the risk of exhausting the parent's resource limits or even the
operating system's process id space. But there are many functions
in the wait family, and mixing handlers that use different wait
functions can screw things up. So even if one wanted to allow
different libraries to daisy-chain signal handlers, you'd need to
establish an unenforceable protocol about which wait functions
handlers are allowed to call.
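Point (1) can be illustrated without touching real signals. The sketch
below is only a model: *HANDLER* stands in for the single per-signal
handler slot, and INSTALL-CHAINED, DELIVER and DEMO are hypothetical
names, not SBCL functions. It shows that when A "uninstalls" by restoring
the handler it saved, B's handler disappears as well.

```lisp
;;; A model of one handler slot shared by two independent libraries.
;;; Every name here is hypothetical; no real signals are involved.
(defvar *handler* nil "Stand-in for the per-signal handler slot.")
(defvar *log* '() "Names of the handlers that ran, most recent first.")

(defun install-chained (name)
  "Install a handler that records NAME, then calls the handler it replaced.
Returns a thunk that uninstalls by restoring the saved handler."
  (let ((previous *handler*))
    (setf *handler*
          (lambda ()
            (push name *log*)
            (when previous (funcall previous))))
    (lambda () (setf *handler* previous))))

(defun deliver ()
  "Simulate one signal delivery; return the handlers that ran, in order."
  (setf *log* '())
  (when *handler* (funcall *handler*))
  (reverse *log*))

(defun demo ()
  "Return what ran before and after library A uninstalls its handler."
  (setf *handler* nil)
  (let ((uninstall-a (install-chained :a))
        (uninstall-b (install-chained :b)))
    (declare (ignorable uninstall-b))
    (let ((before (deliver)))       ; B runs first, then chains to A
      ;; A restores what it saved (NIL), which also throws away B's
      ;; handler, because both lived in the same slot.
      (funcall uninstall-a)
      (values before (deliver)))))
```

Evaluating (demo) should show both handlers running on the first
delivery and none on the second. If A instead left the slot alone, B
would keep calling A's handler, which is the coordination problem
described in (1).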
> I take it RUN-PROGRAM doesn't follow this pattern?
That's correct: RUN-PROGRAM doesn't save any existing handler. But as I
explain above, it wouldn't help if it did: signal handlers are rivalrous
goods and SIGCHLD handling has to be done consistently by all handlers
in order to work out.
> I'd vote for adding another keyword argument to RUN-PROGRAM that basically
> follows the semantics you suggested for FORK-WITH-HANDLER. Something like
>
> (sb-ext:run-program program args
> :sigchld (lambda () (terpri "child died")))
You can already do this for processes created with RUN-PROGRAM:
  (run-program "ls" '("-1" "/")
               :status-hook (lambda (proc)
                              (format t "~&~D is ~@:(~A~).~%"
                                      (process-pid proc)
                                      (process-status proc)))
               :search t :output t)
The trouble is that to achieve analogous functionality for processes
created by the lower-level SB-POSIX:FORK, you have to install a SIGCHLD
handler of your own, which breaks the functionality of the PROCESSes
RUN-PROGRAM creates. (Until the next invocation of RUN-PROGRAM, when
any custom SIGCHLD handler gets replaced...)
Thanks,
RmK

Richard M Kreuter writes:
> About the attached files: the first is a loadable file that effects
> the discussed changes. It should be loadable into a running SBCL.
> The second is a handful of simple exercises for bookkeeping, and shows
> off the ability to supply a STATUS-HOOK to SB-POSIX:FORK, (which
> should make custom SIGCHLD handlers unnecessary in most cases). To
> run the tests, which take about a minute, do this:
>
> sbcl --load process-management.lisp --load process-management-test.lisp
>
> Could people who have a stake in this issue try your programs with an
> SBCL that has loaded the first file?
Did anybody who cares about fork/wait/sigchld handlers have a chance to
try this code? Any outstanding objections?
The files are available for download here:
http://www.progn.net/static/cl/process-management.lisp
http://www.progn.net/static/cl/process-management-test.lisp
--
RmK

Richard M Kreuter writes:
> Richard M Kreuter writes:
> >
> > So I think it's desirable to do something to support composition of
> > independent child-creating programs here, but I'm not sure what's right.
> > These options seem obvious to me...
>
> Another approach would be to make SB-POSIX:FORK not return pids, but
> instances of SB-IMPL:PROCESS. This would be an incompatible breakage
> for any program that calls SB-POSIX:FORK, however, though it would make
> SIGCHLD handling invisible to the user.
The more I think about this, the more I think having an SB-POSIX:FORK
that worked like this would be the right thing. Suppose we had an
SB-POSIX:FORK that returned a PROCESS to the parent and NIL to the child
(and signaled an error when fork() fails), and added a "pid designator" to
SB-POSIX with the following definition:
* an integer designates itself,
* an instance of SB-IMPL:PROCESS designates its PID slot.
Then programs that used synchronous children could call SB-POSIX:WAITPID
on the new PROCESS, and programs that used asynchronous children could
employ the STATUS-HOOK slot for reacting to child state changes. Users
wouldn't need to install SIGCHLD handlers, and the SIGCHLD machinery in
RUN-PROGRAM would suffice for all asynchronous child processes.
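A pid designator along these lines might be resolved as in the following
sketch. CHILD is a hypothetical stand-in for SB-IMPL:PROCESS, and
DESIGNATED-PID is an invented name, not an existing SB-POSIX function; a
binding such as WAITPID would call something like it on its first
argument.

```lisp
;;; CHILD is a hypothetical stand-in for SB-IMPL:PROCESS, reduced to the
;;; slots this discussion mentions.
(defstruct child
  pid
  (status :running)
  (status-hook nil))

(defun designated-pid (designator)
  "Resolve a pid designator to an integer pid.
An integer designates itself; a CHILD designates its PID slot.
Anything else signals a TYPE-ERROR."
  (etypecase designator
    (integer designator)
    (child (child-pid designator))))
```

With this in place, (designated-pid 42) and (designated-pid (make-child
:pid 42)) name the same process, so callers that still hold bare
integers and callers that hold process objects can use the same wait
bindings.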
Changing SB-POSIX:FORK to return PROCESSes now, however, would break
every program that uses it, since probably every program that calls
SB-POSIX:FORK does something like this:
  (let ((pid (sb-posix:fork)))
    (if (zerop pid)
        (progn #| child |# ...)
        (progn #| parent |# ...)))
Not all such programs are problematic: if the parent waits for the child
immediately, it doesn't have to touch the SIGCHLD handler. And even
those programs that clobber the SIGCHLD handler can work; they just
won't work properly in a Lisp where any other programs also use the
SIGCHLD handler.
So here's a slightly fishy compromise that might suffice: have
SB-POSIX:FORK create a PROCESS, stash it in SB-IMPL::*ACTIVE-PROCESSES*,
return the PROCESS as a second return value, and have the bindings to
the wait family of functions update the processes in *ACTIVE-PROCESSES* and
prune exited processes from *ACTIVE-PROCESSES* behind the scenes. This
way, programs that now call fork() and wait() synchronously will work
unchanged, and ones that now install a SIGCHLD handler could instead
be written as
  (let ((process (nth-value 1 (sb-posix:fork))))
    (if (null process)
        (progn #| child |# ...)
        (progn (setf (process-status-hook process) #'...))))
So user programs should never have to install a SIGCHLD handler. (Maybe
we could signal a warning when the user tries to install a SIGCHLD
handler, even.)
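The bookkeeping half of this compromise can be modelled in a few lines.
This is only a sketch under assumptions: it never calls fork() or wait(),
*ACTIVE-CHILDREN* is a stand-in for SB-IMPL::*ACTIVE-PROCESSES*, and
NOTE-FORK, NOTE-WAIT and DEMO are hypothetical names for what the FORK
and wait bindings would do around the real system calls.

```lisp
;;; A model of the proposed bookkeeping. All names are hypothetical.
(defstruct child pid (status :running) (status-hook nil))

(defvar *active-children* (make-hash-table)
  "Stand-in for SB-IMPL::*ACTIVE-PROCESSES*, keyed by pid.")

(defun note-fork (pid &key status-hook)
  "What a bookkeeping FORK would do in the parent: intern a record for PID.
Returns the pid and the record, mirroring the proposed two return values."
  (let ((child (make-child :pid pid :status-hook status-hook)))
    (setf (gethash pid *active-children*) child)
    (values pid child)))

(defun note-wait (pid new-status)
  "What a bookkeeping wait binding would do once the real wait call returns:
update the record, run its hook, and prune it if the child is gone."
  (let ((child (gethash pid *active-children*)))
    (when child
      (setf (child-status child) new-status)
      (when (child-status-hook child)
        (funcall (child-status-hook child) child))
      (when (member new-status '(:exited :signaled))
        (remhash pid *active-children*)))
    child))

(defun demo ()
  "Return the statuses the hook saw and the number of records left over."
  (clrhash *active-children*)
  (let ((seen '()))
    (note-fork 1234 :status-hook (lambda (c) (push (child-status c) seen)))
    (note-wait 1234 :stopped)
    (note-wait 1234 :exited)
    (values (reverse seen) (hash-table-count *active-children*))))
```

A caller that ignores the second value of NOTE-FORK sees only the pid, as
existing programs do, while the record is still pruned once the child is
reported as exited.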
Or, if you wanted an API closer to RUN-PROGRAM's, we could have
SB-POSIX:FORK take a STATUS-HOOK keyword argument, and set the slot when
the process is constructed, so that user programs could be written as
either
  (let ((pid (sb-posix:fork :status-hook #'...)))
    (when (zerop pid)
      (progn #| child |# ...)))
or
  (let ((process (nth-value 1 (sb-posix:fork :status-hook #'...))))
    (when (null process)
      (progn #| child |# ...)))
This does deviate from SB-POSIX's notional goal of being a low level
interface, but if you grant that it's a flaw to give users a FORK that
encourages clobbering a global resource in potentially destabilizing
ways, then maybe you'll agree that this is one of those points where we
shouldn't expose quite so low level an interface.
Any opinions?
--
RmK

Richard M Kreuter <kreuter@...> writes:
> Any opinions?
[ without having thought about it for myself -- I'm sorry that I seem
to be saying this more and more... ]
Is there a way to preserve the return value of sb-posix:fork as an
integer, but also to provide a make-process-from-pid function for the
users who wish to interact with the hooks that run-program-like things
give? This, combined with what I think is a good idea of having
PROCESS objects designate PIDs for sb-posix, would match the
established semantics of being liberal in what sb-posix functions
accept, and somewhat conservative in what they return.
Cheers,
Christophe

I'd rather we have two layers, one that is a thin wrapper over POSIX
and returns integers and for which you have to waitpid and/or use
custom sigchld handlers (with proper API), and one on top that returns
process objects. The first layer may allow one to write semi-portable
programs that do the right thing. You'll need that internally, anyway
-- I think it would be a good policy to export it.
NB: I'm writing a library that uses fork for robust(er) multitasking.
[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ]
If mice were the ultimate input device, humans would be born
with one arm and three fingers.

"Faré" writes:
> I'd rather we have two layers, one that is a thin wrapper over POSIX
> and returns integers and for which you have to waitpid and/or use
> custom sigchld handlers (with proper API), and one on top that returns
> process objects.
Note that my compromise proposal does return pids as the first value
from SB-POSIX:FORK, so existing programs should continue to work
unchanged.
What I am proposing is that FORK implicitly create and intern PROCESS
instances, and that the WAIT family implicitly do bookkeeping on those
processes. This bookkeeping I'm proposing will be invisible to you if
you don't hold onto the second return value from SB-POSIX:FORK, even, I
think, if you install your own SIGCHLD handler that uses the wait()
family of functions in SB-POSIX, because the bindings to the wait()
functions will use the machinery like what we already have for cleaning
up PROCESS instances.
I have not yet tried implementing the proposal to see how well it works,
so there might be something I'm not thinking of. But I don't think that
having SB-POSIX:FORK and SB-POSIX:WAIT* share bookkeeping machinery with
RUN-PROGRAM should adversely affect any uses of FORK or WAIT.
> The first layer may allow to write semi-portable programs that do the
> right thing.
Note that such programs won't port to Windows, for example, where we
don't have SB-POSIX:FORK. I don't know whether other Lisps expose so
primitive a fork() binding as we do; if they do, they may have similar
flaws in this area. So if your definition of "portable" is "runs on
SBCL where SBCL runs well", I might call the definition laudable, but
perhaps unusual ;)
> You'll need that internally, anyway -- I think it would be a good
> policy to export it.
I don't agree: the general argument that "operation O requires primitives
P1 and P2, therefore there should be a supported interface to the raw P1
and P2 operations" is faulty. For example, IIRC, we don't have a supported
interface for circumventing the handler that invokes the garbage
collector.
In the case of fork(), we do export the "raw" primitive, and I am
arguing that it is destabilizing in an analogous way. If we changed
things so that our FORK and WAIT did a bit of bookkeeping behind the
scenes, but you wanted to circumvent our attempts to keep things
consistent, it would be trivial to get your own destabilizing fork():
(define-alien-routine "fork" sb-alien:int)
But I'd hope that users would look for FORK in SB-POSIX first, and
prefer the FORK we offer there, which we can at least try to implement
in a manner that doesn't break other parts of the system.
> NB: I'm writing a library that uses fork for robust(er) multitasking.
Okay. Does your library set a SIGCHLD handler? Were you aware that
every call to RUN-PROGRAM (including, for instance, invocations of
the C compiler in ASDF systems) will replace your handler, and that
installation of your handler will break RUN-PROGRAM's process
management, with the consequence that some ways that you might handle
SIGCHLD can prevent processes created by RUN-PROGRAM from ever being
wait()ed for? You might be aware of these details; but maybe you
weren't.
ISTM that we'd be making your library more reliable by giving you an
SB-POSIX that ensured that all of SBCL's children were taken care of, if
we can do so.
--
RmK

On Dec 20, 2007 8:21 PM, Richard M Kreuter <kreuter@...> wrote:
I'm sure having different parts of the system agree with the way
books are kept is all good. I'm less sure about the secondary
return value, though, but I'm not sure why. Perhaps they just
don't seem like the kind of thing SB-POSIX should be returning?
There is also other gunk that fork() should be aware of, mostly
to do with pthreads -- but my pthread book is not next to me, and
I cannot remember. ...but baby steps are probably the way to go.
> I don't agree: the general argument that "operation O requires primitives
> P1 and P2, therefore there should be a supported interface to the raw P1
> and P2 operations" is faulty. For example, IIRC, we don't have a supported
> interface for circumventing the handler that invokes the garbage
> collector.
>
> In the case of fork(), we do export the "raw" primitive, and I am
> arguing that it is destabilizing in an analogous way. If we changed
> things so that our FORK and WAIT did a bit of bookkeeping behind the
> scenes, but you wanted to circumvent our attempts to keep things
> consistent, it would be trivial to get your own destabilizing fork():
>
> (define-alien-routine "fork" sb-alien:int)
>
> But I'd hope that users would look for FORK in SB-POSIX first, and
> prefer the FORK we offer there, which we can at least try to implement
> in a manner that doesn't break other parts of the system.
This is veering off topic, but that's never stopped me. :)
I agree with the basic sentiment, but I also submit that...
<blind-faith>
...often that is also a signal that there is something wrong with the layering
-- typically the lower-level api pushing concerns from below itself onto its
callers, but not doing it via an established protocol, but instead just saying
"callers must know what they are doing".
</blind-faith>
Cheers,
-- Nikodemus

On Dec 21, 2007 3:29 AM, Nikodemus Siivola <nikodemus@...> wrote:
> On Dec 20, 2007 8:21 PM, Richard M Kreuter <kreuter@...> wrote:
>
> I'm sure having different parts of the system agree with the way
> books are kept is all good. I'm less sure about the secondary
> return value, though, but I'm not sure why. Perhaps they just
> don't seem like the kind of thing SB-POSIX should be returning?
I agree with Nikodemus that it seems out of whack with what sb-posix
does, but I do like the idea of cleaning up the sigchld mess. Maybe
sb-posix:fork could return an integer like it currently does, but it
could already have interned that as a PROCESS for the sigchld handler
to clean up. The user who cares about the exit status of the child
can look up the process object from the PID ... except for the race
condition there, which leaves me in the end liking the idea of the
second return value.
Oh, and I'd hate to see the ability to unconditionally set/block the
handler for sigchld go away. The vast majority of the time that's not
what you want to be doing, but on those occasions where it is, I like
how SBCL (in contrast to Allegro) gives you pretty easy, open access
to the direct unix level. Hypothesizing that Python can already see
the vast majority of cases where the user is setting the handler for a
specific signal, if a more elaborate protocol goes into place, a
style-warning about setting the handler for sigchld would be in order.