SUMMARY: /tmp/.getwdxnnnnn (again)

It seems that I summarized too soon. Right after I sent out
my summary, I got a pile of messages, some of which
contained additional information that I thought was useful
enough to warrant a follow-up, to wit:

=================================================================
From Jim Mattson <uunet!cs.UCSD.EDU!mattson>:
=================================================================
There are two solutions. One is to make sure that as root
you do a pwd in some deeper directory than / each time your
mount table changes (i.e. each time you mount or umount a
file system). This will force an update of /tmp/.getwd and
keep it in synch with /etc/mtab. If you run an
automounter, though, this may not be so easy. (Maybe you
could do it out of cron every half hour, but what a
kludge.) The other solution is to upgrade to 4.1 or 4.1.1,
where this problem has been fixed. (I don't know about
your Solbournes.)
=================================================================
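
A minimal sketch of the first workaround, in Python rather than
anything the respondents actually used. It only runs pwd from
inside a directory below /, which per the message above is what
makes getwd() refresh /tmp/.getwd on the affected releases. The
function name force_getwd and the default directory /usr are
assumptions made for illustration.

```python
import subprocess


def force_getwd(directory="/usr"):
    """Run pwd from inside `directory` and return the path it prints.

    Assumption: on the affected systems, a pwd run by root somewhere
    deeper than / makes getwd() rebuild /tmp/.getwd. On any other
    system this is just an ordinary pwd.
    """
    result = subprocess.run(
        ["pwd"], cwd=directory, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Meant to be run as root right after each mount or umount,
    # for example from whatever script updates the mount table.
    print(force_getwd())
```

Run from cron, this would be the half-hourly kludge mentioned
above; hooked into the script that changes the mount table, it
runs only when it is needed.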

=================================================================
From Paul O'Neill, OSU--Oceanography, Corvallis, OR:
=================================================================
Here are the last 2 lines of my /etc/rc.local. I'm not
enough of a system hacker to know why this works. I saw it
on the net, tried it, and it's gotten rid of all those
/tmp/.get?????'s. It's been so long, I forget the
explanation.
=================================================================

=================================================================
From Daniel Trinkle, Purdue University:
=================================================================
Sun decided to speed up getwd() by adding a cache that
contains information relevant to NFS-mounted filesystems.
This is a big win over probing each NFS server every time.
The cache is in /tmp/.getwd. If /etc/mtab has changed
since /tmp/.getwd was written, then getwd() rebuilds the
cache in /tmp/.getwda<pid> and then moves the new cache into
place (removing /tmp/.getwd in the process). However, if
the user who happens to trigger the rebuild does not own the
previous cache, and the /tmp directory has the sticky bit
set, the move fails and the /tmp/.getwda<pid> file is left
behind.

There are two workarounds. The first is to remove the
sticky bit from /tmp. This is a slight security loss. The
second is to make sure root runs getwd any time a
filesystem is mounted or unmounted. This is not a big deal
for most systems, and may be the preferred option.
=================================================================
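
The rebuild-and-rename sequence described above can be sketched
as follows. This is an illustration of the pattern only, not
Sun's code: the helper names are made up, and the cache contents
(here, just the mtab lines whose type field is nfs) are a
stand-in, since the real cache format is not given in any of the
replies.

```python
import os


def cache_is_stale(mtab_path, cache_path):
    """True if the cache is missing or older than the mount table."""
    try:
        cache_mtime = os.stat(cache_path).st_mtime
    except FileNotFoundError:
        return True
    return os.stat(mtab_path).st_mtime > cache_mtime


def rebuild_cache(mtab_path, cache_path):
    """Write a new cache beside the old one, then rename it into place."""
    # Temporary name in the style of /tmp/.getwda<pid>.
    tmp_path = "%sa%d" % (cache_path, os.getpid())
    with open(mtab_path) as mtab, open(tmp_path, "w") as tmp:
        for line in mtab:
            fields = line.split()
            # Hypothetical cache contents: keep only the NFS entries.
            if len(fields) >= 3 and fields[2] == "nfs":
                tmp.write(line)
    # In a sticky directory this rename raises PermissionError when the
    # caller owns neither the old cache nor the directory and is not
    # root. The temporary file then stays behind, which is how the
    # leftover .getwda* files accumulate.
    os.rename(tmp_path, cache_path)
```

A caller would test cache_is_stale("/etc/mtab", "/tmp/.getwd")
and call rebuild_cache() only when it returns true.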

=================================================================
From Andie Ness, CSTR, University of Edinburgh, EDINBURGH:
=================================================================
At a guess, I would say you have the sticky bit set on /tmp.

Once this is closed, it tries to rename /tmp/.getwda{pid} to
"/tmp/.getwd", which will fail thus:

rename ("/tmp/.getwda14170", "/tmp/.getwd") = -1 EPERM (Not owner)

because the first pwd since /tmp got cleared managed to
rename ... Since you have the sticky bit set for /tmp,
only the owner can remove .getwd, so you'll get left with
the /tmp/.getwda* files owned by everyone except the owner
of /tmp/.getwd (if you see what I mean).

One way to test this would be to check that the owner of
.getwd doesn't own any of the .getwda* files.

We get round this by running a script every hour to clear
out /tmp/.getwd* files.
=================================================================
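
The ownership test and the hourly cleanup suggested above could
look something like this. It is a sketch under the assumptions in
that message (cache in .getwd, leftovers named .getwda*); all of
the function names are hypothetical, and the actual cleanup
script was not posted.

```python
import glob
import os
import stat


def sticky_bit_set(directory):
    """True if `directory` has the sticky bit set."""
    return bool(os.stat(directory).st_mode & stat.S_ISVTX)


def leftover_owners(directory):
    """Map each uid to the names of the .getwda* files it owns."""
    owners = {}
    pattern = os.path.join(glob.escape(directory), ".getwda*")
    for path in glob.glob(pattern):
        uid = os.lstat(path).st_uid
        owners.setdefault(uid, []).append(os.path.basename(path))
    return owners


def cache_owner_has_leftovers(directory):
    """True if the owner of .getwd also owns some .getwda* file.

    If the sticky-bit explanation is right, this should come out
    False on an affected machine.
    """
    cache = os.path.join(directory, ".getwd")
    if not os.path.exists(cache):
        return False
    return os.stat(cache).st_uid in leftover_owners(directory)


def clear_getwd_files(directory):
    """Remove .getwd and every .getwda* file; return the names removed."""
    removed = []
    pattern = os.path.join(glob.escape(directory), ".getwd*")
    for path in glob.glob(pattern):
        os.remove(path)
        removed.append(os.path.basename(path))
    return sorted(removed)
```

Calling sticky_bit_set("/tmp") and
cache_owner_has_leftovers("/tmp") checks the diagnosis;
clear_getwd_files("/tmp"), run as root, is the hourly cleanup.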

On reviewing all this, I am curious about one thing: We do
not run any automounter, but we were collecting thousands
of .getwdxxx files with a stable mount table. We *do*,
however, have our fstabs automated, so it should be
possible to force the getwd every time the mtab changes.
In any event, I think that we might try one or some
combination of these suggestions.