четверг, 16 февраля 2017 г.

I have noticed multiple times that when a process seems to hang, strace sometimes can kick it forward and cause it to proceed and either terminate as it should or fail with a signal.

There was a strange hang within systemd (PID 1), it apparently created a forked copy of itself which hung, and when I tried to strace the child process it failed with a signal. Then the parent (PID 1) ran pause() but any tries to send it SIGTERM or SIGUSR1 caused nothing, the pause() system call did not interrupt. I think that all signals were blocked. The systemd (PID 1) did not process parentless zombies either. Reboot was required.

One time when restarting mysqld it took forever to terminate, but finished in a second when I ran strace on it.

Another time, df hung due to unresponsive nfs. After I had unmounted the nfs filesystem using "umount -l -f" the df still lingered for minutes until I ran strace on it.