On Tue, Jun 08, 2004 at 11:35:58PM +0200, Lars Ellenberg wrote:
> / 2004-06-08 22:58:21 +0200
> \ Bernd Schubert:
> > I still do not understand what's causing this, but I'm pretty sure that the debian
> > nfs-kernel-server script cannot stop the nfs-server when nfs is started from
> > heartbeat. Just check it yourself, after running '/etc/init.d/heartbeat
> > stop', a 'ps ax' should show running nfs-daemons on this system. Those nfsd
> > processes can only be killed with 'killall -9 nfsd'. So I think after
> > stopping nfs, the nfs-daemons will survive and cause a stale filehandle when
> > drbd is stopped, probably they also ignore the 'exportfs -au' command.
In my case it _only_ works _with_ 'exportfs -au'. And that delay I
already mentioned.
> > I'm also still wondering about the way the debian script is stopping nfs, I
> > checked the script of several distributions and either nfsd's are immediately
> > stopped with signal 9 or first with signal 15 and after a short break with
> > signal 9, however the debian script only stops the daemons with signal 1.
Interesting. I'll look at Suse/Redhat scripts then...
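
For reference, the "signal 15 first, then signal 9 after a short break"
pattern described above can be sketched roughly as below. This is only a
minimal illustration, not taken from any distribution's init script; the
function name stop_pid and the default grace period are made up here, and
whether the nfsd threads react to it at all is exactly the open question.

```shell
# Hypothetical helper: send SIGTERM, wait up to $2 seconds (default 5),
# then fall back to SIGKILL if the process is still there.
stop_pid() {
    pid=$1
    grace=${2:-5}
    # If the signal cannot be delivered, the process is already gone.
    kill -TERM "$pid" 2>/dev/null || return 0
    while [ "$grace" -gt 0 ]; do
        # kill -0 only checks whether the PID can still be signalled.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        grace=$((grace - 1))
    done
    # Still present after the grace period: use signal 9.
    kill -KILL "$pid" 2>/dev/null
    return 0
}
```

Something like 'stop_pid $pid 3' for each PID reported by 'pidof nfsd'
would be one way to use it.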
> strange thing is, that there is a fuser -k -m $device in the drbd script,
> which *should* deliver a kill -9 ...
Yes, I saw that. That's why I tested it in that simple case. I think
the problem is that it doesn't find any PID to kill. The ha-debug
file contains:
datadisk: ERROR: fuser -k -m /dev/nbd/0 [1]:
datadisk: ERROR: NO OUTPUT
datadisk: ERROR: umount -v /dev/nbd/0 [1]:
datadisk: ERROR: umount: /drbd/0: device is busy
> maybe retrying some more times does help?
That message is actually repeating quite often, since heartbeat
retries a few times to stop drbd, before it reboots the machine.
Haven't tried your patch yet, but I have little hope. Again that
test-case:
root@yang:~> mount /dev/hda3 /mountpoint
root@yang:~> exportfs -vi yang:/mountpoint
This is basically what I have if I remove that "exportfs -au" from my
init-scripts. nfs-kernel-server is stopped, nfs-common is stopped.
Still, umount won't succeed:
root@yang:~> umount /mountpoint
umount: /mountpoint: device is busy
umount: /mountpoint: device is busy
In this situation fuser doesn't find any PID:
root@yang:~> fuser -m /mountpoint
root@yang:~>
So it doesn't kill anything. If I cd into that dir, it behaves as
expected:
root@yang:~> cd /mountpoint
root@yang:/mountpoint> fuser -m /mountpoint
/mountpoint: 8752c
root@yang:/mountpoint> echo $$
8752
root@yang:/mountpoint> fuser -k -m /mountpoint
/mountpoint: 8752c
Connection to yang closed.
What PID is fuser _supposed_ to find in this situation? There is no
nfsd running. What exactly does "exportfs yang:/mountpoint" do? It
seems to update the xtab file and pass some info to the kernel (shows
up in /proc/fs/nfs/exports). Is that what the umount is blocked by?
If so, what other than "exportfs -u" can one do?
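
To check that theory, one could test whether the directory is still
listed in the kernel's export table before trying to umount. A rough
sketch follows; the helper name is_exported is made up, and it assumes
that the first whitespace-separated field of each line in
/proc/fs/nfs/exports is the exported path.

```shell
# Hypothetical helper: succeed if directory $1 appears as the first field
# of an exports-style table. The table defaults to /proc/fs/nfs/exports;
# a different file can be passed as $2.
is_exported() {
    dir=$1
    table=${2:-/proc/fs/nfs/exports}
    # Return 2 if the table cannot be read at all.
    [ -r "$table" ] || return 2
    # Exit status 0 if any line's first field equals the directory, 1 if not.
    awk -v d="$dir" '$1 == d { found = 1 } END { exit !found }' "$table"
}
```

With that, 'is_exported /mountpoint && exportfs -u yang:/mountpoint' right
before the umount would unexport only when the kernel still knows about
the export.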
I'll look at this a little more tomorrow. Thanks for your help so far!
Jens.