Re: "device busy" reading file from NFS

On Sun, Apr 26, 2009 at 08:19:24AM -0500, John D. Baker wrote:
> I've been having problems with 5.0_RCx NFS clients (i386) on my 4.0_STABLE
> NFS servers (sparc64 and i386). I use 'amd' to automount a lot of
> things. Mostly, I'm rebuilding the system with sources on NFS and all
> other directories on local disk.
>
> The build will run fine until an attempt to read some file causes
> the process (usually the compiler) to stop, with a complaint similar to:
>
> In file included from /amd/kob/r0/nbsd/netbsd-5/src/dist/ntp/include/ntpd.h:9,
>                  from /amd/kob/r0/nbsd/netbsd-5/src/dist/ntp/libntp/iosignal.c:43:
> /amd/kob/r0/nbsd/netbsd-5/src/dist/ntp/include/ntp.h:17:22: error:
>   /amd/kob/r0/nbsd/netbsd-5/src/usr.sbin/ntp/libntp/../include/isc/list.h: Device busy
...
> Has anyone else seen something similar?

http://www.netbsd.org/cgi-bin/query-pr-single.pl?number=38141

From the PR:
--
I'm guessing how amd works - haven't researched it yet - but I think the
main problem will be new calls to VFS_ROOT() from lookup(). While an idle
file system is being garbage collected by amd, those will fail with EBUSY.
So there's a small window across the unmount where operations would fail
instead of causing an automount to occur.
For an unmount that's not forced, those should be easy enough to gate
because it's OK to wait there. The deadlocks (some of which have been there
for a long time) start happening when we cause threads already in the guts
of the file system code to wait, because there is a tangled mess of locks.
--
Code hasn't been written to deal with this properly yet. I didn't envision
that it would fire with any regularity, and we haven't had any problem
reports up until this week. :-( I will see about fixing it, but it will not
be in time for 5.0.
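
To make the window described in the PR a little more concrete, here is a
toy userland model of the two behaviours: a new lookup that fails with
EBUSY while an unmount is in progress, and a gated lookup that simply waits
for the unmount attempt to finish. This is only a sketch and not NetBSD
kernel code. The struct and every gate_* function name are hypothetical,
and pthread primitives stand in for whatever locking the kernel would
actually use. It also says nothing about threads already inside the file
system code, which is where the PR says the deadlocks come from.

```c
/* Toy userland model only; not NetBSD kernel code. All names are hypothetical. */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

struct mount_gate {
	pthread_mutex_t lock;
	pthread_cond_t  idle;        /* broadcast when an unmount attempt ends */
	bool            unmounting;  /* true while an unmount is in progress */
};

static void
gate_init(struct mount_gate *g)
{
	assert(g != NULL);
	pthread_mutex_init(&g->lock, NULL);
	pthread_cond_init(&g->idle, NULL);
	g->unmounting = false;
}

/* Unmount side: open the window in which new lookups are affected. */
static void
gate_unmount_begin(struct mount_gate *g)
{
	pthread_mutex_lock(&g->lock);
	g->unmounting = true;
	pthread_mutex_unlock(&g->lock);
}

/* Unmount side: close the window and wake any lookups that waited. */
static void
gate_unmount_end(struct mount_gate *g)
{
	pthread_mutex_lock(&g->lock);
	g->unmounting = false;
	pthread_cond_broadcast(&g->idle);
	pthread_mutex_unlock(&g->lock);
}

/* Ungated lookup: fails with EBUSY if it lands inside the window. */
static int
gate_try_enter(struct mount_gate *g)
{
	int error;

	pthread_mutex_lock(&g->lock);
	error = g->unmounting ? EBUSY : 0;
	pthread_mutex_unlock(&g->lock);
	return error;
}

/* Gated lookup: sleeps until the unmount attempt is over, then proceeds. */
static int
gate_enter(struct mount_gate *g)
{
	pthread_mutex_lock(&g->lock);
	while (g->unmounting)
		pthread_cond_wait(&g->idle, &g->lock);
	pthread_mutex_unlock(&g->lock);
	return 0;
}
```

In this model a caller that uses gate_enter() never sees EBUSY. Once the
unmount has finished it would go on to retry the lookup, which in the amd
case is what would trigger a fresh automount.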