On Wed, 1 Oct 2014 13:30:19 -0400 (EDT)
Benjamin Kaduk <kaduk@MIT.EDU> wrote:
> On Tue, 30 Sep 2014, Eric Shell wrote:
>
> > # vos listaddrs -noresolve -localauth
> > vos: could not list the server addresses
> > Possible communication failure
> >
> > # pts listmax -localauth
> > pts: server or network not responding getting maximum user id
> >
> > # backup listhosts -localauth
> > backup: server or network not responding ; Can't access backup database
> > backup: server or network not responding ; Can't initialize backup
>
> So, it seems like no authentication is working yet.

But 'bos -localauth' is working, so auth is working for bosserver at
least. Maybe check that the bosserver binary and the ptserver/etc
binaries are actually from the same build? At this low level of the
code, the behavior is supposed to be identical across the different
servers.
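
One possible way to compare the builds is to ask each running server
what version it reports. This is only a rough sketch: it assumes
rxdebug is installed and in PATH, that the servers run on the local
host, and the check_versions helper name is made up for illustration.

```shell
# Hypothetical helper: query the version string reported on each AFS port.
# 7007 = bosserver, 7002 = ptserver, 7003 = vlserver.
check_versions() {
    host=$1
    for port in 7007 7002 7003; do
        printf 'port %s: ' "$port"
        if command -v rxdebug >/dev/null 2>&1; then
            # -version asks the server for its build/version string.
            rxdebug "$host" "$port" -version
        else
            echo "rxdebug not found in PATH"
        fi
    done
}

check_versions localhost
```

If the three version strings differ, that would point at mismatched
binaries.
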

Eric, would you also be willing to provide a packet trace for when this
happens? Even if you did do something wrong and something is wacky with
the auth or config or whatever, you shouldn't be getting these "server
or network not responding" errors just from using -localauth. The ports
to capture are UDP port 7007 for the 'bos' command (for a baseline "it's
working" case), UDP port 7002 for 'pts', and UDP port 7003 for 'vos'.
(If you're comfortable with that, send it to me or Ben or someone
directly, not to the list.)
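
Something like the following could be used to build the capture
command. It is just a sketch: the interface name (em0) and the output
file name are placeholders, not anything from your setup, and it only
prints the tcpdump command instead of running it, since capturing
needs root.

```shell
# Placeholder values: adjust the interface and output file for your host.
iface=em0
outfile=afs-localauth.pcap

# Build a capture filter covering bos (7007), pts (7002), and vos (7003).
filter=""
for port in 7007 7002 7003; do
    # Join the per-port clauses with "or"; the first one has no prefix.
    filter="${filter:+$filter or }udp port $port"
done

# -s 0 keeps full packets; -w writes a pcap file instead of decoding.
echo "tcpdump -i $iface -s 0 -w $outfile $filter"
```

Start the capture, run the failing 'pts' and 'vos' commands plus a
working 'bos' command, then stop it with Ctrl-C.
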

As another guess, maybe something is causing the connection structures
to get corrupted when we create the authenticated connections (FreeBSD
is a less-tested platform, of course). So maybe we are indeed trying to
reach some other IP for some reason, but that's just a guess.

Or another guess is that the non-bosserver servers are
crashing/hanging/something when you do this, which is why you get a
network error. Try checking for that. ('bos status -long' will give you
some more info about restarts.)

--
Andrew Deason
adeason@sinenomine.net