caml_sys_open in byterun/sys.c uses fcntl to set the FD_CLOEXEC flag (where the platform supports it) on the file descriptor after opening a file. Unfortunately, this happens while the OCaml runtime lock is held, i.e. outside the blocking section, so all other OCaml threads stall if fcntl blocks.

We have observed this issue when e.g. opening files on NFS. It seems glibc wraps the fcntl system call in a way that requires it to also call "stat", which is known to block under some circumstances.

Could you please change caml_sys_open to call fcntl in the "blocking section" used to open the file? - Thanks!

I failed to reproduce this bug and doubt that, as of glibc 2.13, fcntl calls stat while setting FD_CLOEXEC. The kernel, however, takes a small lock around the fd, and it is possible that, on NFS, some operations hold that lock for a bit too long.
The patch to do the fcntl outside the OCaml runtime lock is very simple and probably worth applying just to stay on the safe side.
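
For illustration only, here is a minimal sketch of the shape such a change could take. It is not the actual patch. The names open_cloexec, enter_blocking_section, leave_blocking_section and runtime_lock_held are hypothetical stand-ins so the snippet compiles without the OCaml headers; in the runtime the corresponding calls would be caml_enter_blocking_section() and caml_leave_blocking_section().

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-ins for caml_enter_blocking_section() and
   caml_leave_blocking_section(); here they only track a flag. */
static int runtime_lock_held = 1;

static void enter_blocking_section(void)
{
    assert(runtime_lock_held);
    runtime_lock_held = 0;      /* lock released: other threads may run */
}

static void leave_blocking_section(void)
{
    assert(!runtime_lock_held);
    runtime_lock_held = 1;      /* lock re-acquired */
}

/* Open `path` and mark the descriptor close-on-exec, with both the
   open and the fcntl calls made while the lock is released.
   Returns the descriptor, or -1 if open fails. */
static int open_cloexec(const char *path, int flags, mode_t mode)
{
    int fd;

    enter_blocking_section();
    fd = open(path, flags, mode);
#ifdef FD_CLOEXEC
    if (fd != -1) {
        int fdflags = fcntl(fd, F_GETFD);
        if (fdflags != -1)
            fcntl(fd, F_SETFD, fdflags | FD_CLOEXEC);
    }
#endif
    leave_blocking_section();
    return fd;
}
```

The only point of the sketch is ordering: fcntl runs before the lock is re-acquired, so a slow fcntl (e.g. on NFS) delays only the calling thread.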

The straces back then definitely showed that fcntl was calling "stat", though the problem might also be due to kernel locks. Hard to say, the problem was not reliably reproducible. Newer glibc versions may also behave differently. I agree we should fix this in any case to be on the safe side.