This isn't quite a Torque question, but I ran into it while using Torque,
so I thought I'd ask here.
I have a 32-node cluster. From the head node, I have no problem starting
a PVM using all 32 nodes. However, when I start it from any of the compute
nodes, a few nodes fail to start pvmd (the only error message is that
getenv(HOME) failed and that they are using / as the home directory).
If I wait a couple of seconds and then try to add them again, it works.
So if I do something like:
echo add `cat nodes` | pvm | grep "Can't start pvmd" | cut -b19-24 > xxx
sleep 5
echo add `cat xxx` | pvm | grep "Can't start pvmd" | cut -b19-24 > xxx
sleep 5
...
After a few passes it works out fine. Of course, that's just a
nasty hack.
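For what it's worth, the same retry idea can be wrapped in a loop instead
of repeating the pipeline by hand. This is only a rough sketch: the
function name add_with_retry and the PVM_CMD, MAX_TRIES, and DELAY
variables are made up for illustration, and it relies on the same
grep/cut parsing as above, so it assumes the failing host name sits in
bytes 19-24 of the "Can't start pvmd" line.

```shell
# Hypothetical knobs for this sketch (not part of PVM or Torque).
PVM_CMD=${PVM_CMD:-pvm}      # console command that reads "add ..." on stdin
MAX_TRIES=${MAX_TRIES:-5}    # give up after this many passes
DELAY=${DELAY:-5}            # seconds to wait between passes

# add_with_retry FILE
# Try to add every host listed in FILE, then keep re-adding only the
# hosts that reported "Can't start pvmd" until none are left or
# MAX_TRIES passes have been made. Returns 0 if all hosts were added.
add_with_retry() {
    pending=$(cat "$1")
    try=0
    while [ -n "$pending" ] && [ "$try" -lt "$MAX_TRIES" ]; do
        if [ "$try" -gt 0 ]; then
            sleep "$DELAY"
        fi
        # $pending is left unquoted on purpose so that the host names
        # are joined onto a single "add" line.
        pending=$(echo add $pending | $PVM_CMD \
            | grep "Can't start pvmd" | cut -b19-24)
        try=$((try + 1))
    done
    [ -z "$pending" ]
}
```

Called as "add_with_retry nodes", it returns non-zero if some hosts still
fail after MAX_TRIES passes. It is still the same workaround, not a fix
for the getenv(HOME) failure.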
While I'm at it, has anyone written a wrapper that lets PVM use pbsdsh
in place of rsh, rather than actually having rsh installed?
--
------------------------------------------------------
| Josh Lauricha | Ford, you're turning |
|laurichj at bioinfo.ucr.edu | into a penguin. Stop |
| Bioinformatics, UCR | it |
|----------------------------------------------------|
| OpenPG: |
| 4E7D 0FC0 DB6C E91D 4D7B C7F3 9BE9 8740 E4DC 6184 |
|----------------------------------------------------|
Josh Lauricha
laurichj at bioinfo.ucr.edu
OpenPGP: 5A0D 92D3 D093 79DE F724 1137 6DF1 B5EB D9CE AAA8