My current guess is that GlusterFS reports the mount as complete to
AutoFS before the mount actually takes effect. About half the time,
GlusterFS finishes the mount before AutoFS lets the user continue, and
all is well. The other half of the time, GlusterFS has not quite
finished the mount, and AutoFS hands the user a broken directory.
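The shape of this race can be reproduced without GlusterFS at all: a
helper that backgrounds its real work and returns immediately leaves a
window in which the caller still sees the old, empty directory. A
minimal shell illustration ("fake_mount" is a made-up stand-in for a
mount helper that daemonizes too early, not anything real):

```shell
#!/bin/sh
# Simulate a mount helper that returns before its work takes effect.
dir=$(mktemp -d)

fake_mount() {
    # Background the real work and return immediately, like a
    # helper that daemonizes before the mount is established.
    ( sleep 1; touch "$1/content" ) &
}

fake_mount "$dir"
ls "$dir"        # raced: the directory still looks empty here
sleep 2
ls "$dir"        # now "content" is visible

rm -rf "$dir"
```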

I might try to prove this by adding a "sleep 5" to
/sbin/mount.glusterfs, although I do not consider that a valid fix: it
only narrows the race window, it does not eliminate the race.

Uhh... Hmm... It already has a "sleep 3", and changing it to "sleep 5"
does not reduce the frequency of the problem. Changing it to "sleep
10" also has no effect.

Why does it sometimes work and sometimes not?

I note that fusermount from the FUSE libraries does not seem to have
the same problem:

Note that the first 'ls' returns 'hi', and a second later, 'ls' returns
the glusterfs content.

For fusexmp, the mount is complete before the command returns. For
glusterfs, the mount seems to complete a short time after the command
returns.
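One external way to tell whether a mount has actually taken effect is
to compare the device number of the directory with that of its parent;
once a filesystem is mounted there, the two differ. A sketch (assumes
GNU coreutils stat; "is_mountpoint" is a hypothetical helper for
illustration, not something autofs or mount.glusterfs provides):

```shell
#!/bin/sh
# is_mountpoint DIR: succeed if DIR is the root of a mounted filesystem.
# Caveat: a bind mount of the same filesystem would fool this check.
is_mountpoint() {
    [ "$(stat -c %d "$1")" != "$(stat -c %d "$1/..")" ]
}
```

Running this in a tight loop right after mount.glusterfs returns would
show the gap between the helper returning and the FUSE mount appearing.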

I think this is where autofs is getting confused, serving the handle
to the directory to the client too early. It thinks glusterfs has
finished mounting and gives the handle to the client, but the handle
is broken and fails. Glusterfs then completes the mount, and a short
time later the lookups succeed. Adding a 'sleep' to mount.glusterfs
does not seem to be good enough, as 'sleep 1' and 'sleep 20' do not
change the frequency. The existing 'sleep 3' in /sbin/mount.glusterfs
should be completely unnecessary. Instead, we should figure out why
GlusterFS cannot ensure the mount is in place before it returns.
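If the helper cannot be made synchronous, a bounded poll on the
kernel's mount table would at least be a more honest workaround than a
fixed sleep. A sketch (Linux-specific, reads /proc/mounts;
"wait_for_mount" is hypothetical, not part of mount.glusterfs):

```shell
#!/bin/sh
# wait_for_mount MOUNTPOINT [TIMEOUT_SECS]
# Return 0 once MOUNTPOINT appears in /proc/mounts, 1 after the timeout.
wait_for_mount() {
    mnt=$1
    tries=$(( ${2:-10} * 10 ))    # poll every 0.1s
    while [ "$tries" -gt 0 ]; do
        # Field 2 of each /proc/mounts line is the mount point
        # (spaces in paths appear escaped as \040).
        if awk '{print $2}' /proc/mounts | grep -qxF "$mnt"; then
            return 0
        fi
        sleep 0.1
        tries=$((tries - 1))
    done
    return 1
}
```

mount.glusterfs could then call something like wait_for_mount
"$mountpoint" instead of sleeping, returning as soon as the mount is
visible and failing loudly if it never appears.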