I'm in the process of setting up a new server, bringing functions up one-by-one. I'm stuck with nfsv4 at the moment.

Both the old and new servers, as well as the client I'm using for debugging, are running nfs-utils-1.2.9 with identical USE="caps libmount nfsidmap nfsv4 tcpd uuid -ipv6 -kerberos -nfsdcld -nfsv41 (-selinux)"

When I try to mount a filesystem from the new server, I get this in /var/log/messages:
Dec 31 19:52:57 anastasia kernel: RPC: AUTH_GSS upcall timed out.
Dec 31 19:52:57 anastasia kernel: Please check user daemon is running.
But as can be seen above, none of these installations use Kerberos. The GSS in the message makes me think that this is a Kerberos-related suggestion, and Kerberos isn't needed to connect to the old server anyway.

Have you tried explicitly mounting it with sec=sys to disable any Kerberos auth? Like this:

Quote:

mount -t nfs4 -o sec=sys server:/share /mnt/point

Edit: Note: this is only meant for testing; it disables Kerberos authentication/encryption. You should not use this in production, especially not on untrusted networks. Thanks to BitPit for pointing that out more verbosely.
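One way to settle whether sec=sys actually took effect is to read the options back from /proc/mounts after mounting. What follows is a minimal sketch, assuming a Linux client whose NFS entries in /proc/mounts carry a sec= option; the function name nfs_sec_flavors is made up for illustration.

```shell
# Print "<mountpoint> <sec flavor>" for every nfs/nfs4 entry in a
# /proc/mounts-style file (default: /proc/mounts).
# nfs_sec_flavors is a hypothetical helper name, not part of nfs-utils.
nfs_sec_flavors() {
    awk '$3 ~ /^nfs4?$/ {
        flavor = "(none listed)"
        n = split($4, opts, ",")          # field 4 holds the mount options
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^sec=/)
                flavor = substr(opts[i], 5)   # text after "sec="
        print $2, flavor
    }' "${1:-/proc/mounts}"
}

# Usage (after mounting): nfs_sec_flavors
```

If the flavor printed for the mount is sys, the option was honoured; anything starting with krb5 would mean Kerberos is still being negotiated.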

Last edited by Kompi on Mon Mar 17, 2014 3:49 pm; edited 1 time in total

It seems to be working. I'm not sure that the "sec=sys" option did it; there appear to be a number of little problems here, and they all need to be flattened in order to get a clean mount.

First, half of my RAID-1 is an eSATA drive that doesn't reliably mount at power up. So I've de-auto'ed all of that stuff. After power up, I need to:
1 - See if the eSATA drive is there, and replug it if it isn't.
2 - Start the RAID
3 - Mount the RAID
4 - Bind-mount the /exports/home

Next, it appears that nfs doesn't do a clean start, first time. I get this error message:

Second time it's OK. I may have seen this before: on my old server I found that I needed to restart nfs after boot, and I never got any further into debugging once it worked. So:
5 - Restart nfs, after having started it once already.
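The five manual steps above could be collected into one script. This is only a sketch under several assumptions: that the RAID-1 is Linux software RAID managed by mdadm, that the service is the OpenRC /etc/init.d/nfs script, and that the device names and paths (all placeholders here, none taken from the thread) are adjusted to the real system. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
# Placeholder names: none of these devices or paths come from the thread.
ESATA_DEV=${ESATA_DEV:-/dev/sdX1}
MD_DEV=${MD_DEV:-/dev/md0}
RAID_MNT=${RAID_MNT:-/mnt/raid}
EXPORT_SRC=${EXPORT_SRC:-/mnt/raid/home}
EXPORT_DST=${EXPORT_DST:-/exports/home}

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

bring_up_exports() {
    # 1 - make sure the eSATA half of the mirror is present
    if [ "${DRY_RUN:-0}" != 1 ] && [ ! -b "$ESATA_DEV" ]; then
        echo "$ESATA_DEV missing: replug the eSATA drive and retry" >&2
        return 1
    fi
    run mdadm --assemble --scan || return 1                   # 2 - start the RAID
    run mount "$MD_DEV" "$RAID_MNT" || return 1               # 3 - mount the RAID
    run mount --bind "$EXPORT_SRC" "$EXPORT_DST" || return 1  # 4 - bind-mount
    run /etc/init.d/nfs start      # first start may fail; carry on regardless
    run /etc/init.d/nfs restart    # 5 - restart nfs after the first start
}

# Usage: DRY_RUN=1 bring_up_exports   (preview), then bring_up_exports as root
```

The first nfs start is deliberately not checked, since it is reported to fail the first time; only the restart's exit status is returned.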

Next, I've noticed that nfs is a bit slow to start on my other systems, and haven't really looked into it. I suspect that they're seeing the GSS timeout. Incidentally, even after adding the "sec=sys" option I still see the GSS timeout, which is what makes me think that getting it working was a bunch of i-dotting and t-crossing.
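The kernel message asks to check that the "user daemon" is running; for AUTH_GSS upcalls on a client that daemon is rpc.gssd. A quick way to see the state of both pieces is sketched below. It assumes the relevant kernel module is named rpcsec_gss_krb5 and that pgrep is available; whether a loaded module without a running daemon is really the cause of the timeout here is a guess, not something established in this thread. The helper names are made up.

```shell
# True if the named module appears in a /proc/modules-style file.
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}"
}

# True if a process with exactly this name is running.
daemon_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

# Report the state of the GSS kernel module and the rpc.gssd daemon.
# Optional $1: alternative modules file (defaults to /proc/modules).
gss_report() {
    if module_loaded rpcsec_gss_krb5 "${1:-/proc/modules}"; then
        echo "rpcsec_gss_krb5: loaded"
    else
        echo "rpcsec_gss_krb5: not loaded"
    fi
    if daemon_running rpc.gssd; then
        echo "rpc.gssd: running"
    else
        echo "rpc.gssd: not running"
    fi
}

# Usage: gss_report
```

If the report shows the module loaded but no rpc.gssd, that combination would at least be consistent with an upcall that nobody answers.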

I'm not marking this [SOLVED] yet because I suspect I've been running for years with all sorts of cruft and bugs, tucking it under the rug. It would be nice to see if that can be cleaned up. (Maybe I should be trying to secure this better, too.)

There is still something wrong with NFS4 using Kerberos. I have this problem too, but have not figured out why it came about or how to fix it.

It worked until last month on a mostly stable Gentoo system. It may have broken in Linux 3.10.17, but I didn't see really bad things until Linux 3.12.13, where it stopped working. I noticed it in Linux 3.12.13 because I share /usr/portage with several machines, and a remote update suddenly wanted to revert a lot of stuff to a much older version. The cause was that /usr/portage had not been NFS-mounted over the original local portage files.
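A guard like the following, run before any update on a client, would catch the case where /usr/portage silently falls back to the stale local tree. It is a sketch only, assuming a Linux client with /proc/mounts; is_nfs_mounted is a hypothetical helper name.

```shell
# Succeed only if DIR ($1) is the mount point of an nfs or nfs4 filesystem
# according to a /proc/mounts-style file ($2, default: /proc/mounts).
is_nfs_mounted() {
    awk -v dir="$1" '$2 == dir && $3 ~ /^nfs4?$/ { found = 1 }
        END { exit !found }' "${2:-/proc/mounts}"
}

# Usage:
#   is_nfs_mounted /usr/portage || { echo "/usr/portage is not NFS-mounted" >&2; exit 1; }
```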

ssh with Kerberos still works, but it is really slow to make a connection.