My employer recently upgraded our development machines from relatively
old Pentiums to AMD Opterons (x86_64, easily an order of magnitude
faster). This naturally required a re-compile of my Ocaml installation
(3.09.0) and the supporting libraries for my application.
I am using Unix.system to invoke external commands from within Ocaml. On
the old machines (with the 32-bit version of Ocaml), I would
occasionally get the exception Unix_error(ECHILD,"waitpid","") from
Unix.system. With the new machines, I'm seeing this on every call to
Unix.system. I have investigated the behavior of the sub-processes, and
they are terminating normally, with no indication of any error.
I have a theory why this is happening, but no supporting evidence. It
goes like this: system is basically fork+execv+waitpid; on the new
machines, the child started by fork+execv terminates so quickly that the
process is gone before the parent calls waitpid. (NOTE: The external
commands are relatively small, and take on the order of 0.05s of user
time.)
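
To make the fork+execv+waitpid description concrete, here is a minimal sketch of that sequence. It is an approximation for illustration, not the actual library source; the name system_sketch and the use of /bin/sh -c are assumptions.

```ocaml
(* Sketch only: approximates the fork+execv+waitpid sequence described
   above. system_sketch is a hypothetical name, not a library function. *)
let system_sketch (cmd : string) : Unix.process_status =
  match Unix.fork () with
  | 0 ->
    (* Child: replace the process image with a shell running [cmd].
       If execv itself fails, exit with the conventional status 127. *)
    (try Unix.execv "/bin/sh" [| "/bin/sh"; "-c"; cmd |]
     with _ -> exit 127)
  | pid ->
    (* Parent: wait for that specific child and return its status.
       This is the call that would raise Unix_error (ECHILD, ...). *)
    snd (Unix.waitpid [] pid)
```

In this sketch the ECHILD error can only come from the waitpid call in the parent branch, so that is the step the theory is about.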
There's one rather large piece of evidence cutting against this theory:
I can't reproduce this behavior using the simple test case below, even
using precisely the same command arguments that get used in my full app.
I don't know how anything else going on in my code could interfere with
the result of Unix.system. Maybe the context-switching time with a
bigger executable makes a difference?
This isn't the biggest deal in the world, but I'd like to know for sure
that this behavior is harmless before I write code that ignores the
exception. If my theory is correct, it would be nice to see this fixed
(since this behavior will become more and more common with time).
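
A wrapper that ignores the exception might look like the following minimal sketch. The name system_tolerant is hypothetical, and the sketch assumes that losing the child's exit status is acceptable, which is exactly the open question here.

```ocaml
(* Hypothetical helper: run [cmd] through Unix.system, but treat ECHILD
   as "status unknown" instead of letting the exception propagate.
   None means the exit status could not be collected. *)
let system_tolerant (cmd : string) : Unix.process_status option =
  try Some (Unix.system cmd)
  with Unix.Unix_error (Unix.ECHILD, _, _) -> None
```

Returning an option keeps the two cases distinct at the call site: Some status when waitpid succeeded, None when it failed with ECHILD. Any other Unix_error still propagates.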
Thanks,
Chris
system.ml:
let n = ref "echo 'no command'" in
Arg.parse [] (fun s -> n := s) "Usage: system <cmd>" ;
try
  ignore (Unix.system !n); (* returns normally *)
  print_endline "Success"
with
| Unix.Unix_error(err,fn,arg) ->
  print_endline
    ("Unix_error: \"" ^ (Unix.error_message err) ^ "\" in "
     ^ fn ^ "(" ^ arg ^ ")")