Hello all,
I am sorry to ask what is probably a newbie question. I have searched the archives, but I am probably not using the right keywords to find the answer.
I am working on an atmospheric model that uses Open MPI/OpenRTE. I have two nodes set up, but the model only runs on one node.
I can use mpirun to execute an application on the other node by entering the following on HOST1:
mpirun --np 2 --host HOST2 APPNAME
In this scenario, the system connects via ssh to HOST2 and runs the application without a problem.
If I attempt to run:
mpirun --np 2 --nolocal APPNAME
I get:
[virtualModel1:03939] [0,0,0] ORTE_ERROR_LOG: Temporarily out of resource in file base/rmaps_base_support_fns.c at line 168
[virtualModel1:03939] [0,0,0] ORTE_ERROR_LOG: Temporarily out of resource in file rmaps_rr.c at line 402
[virtualModel1:03939] [0,0,0] ORTE_ERROR_LOG: Temporarily out of resource in file base/rmaps_base_map_job.c at line 210
[virtualModel1:03939] [0,0,0] ORTE_ERROR_LOG: Temporarily out of resource in file rmgr_urm.c at line 372
[virtualModel1:03939] mpirun: spawn failed with errno=-3
Looking at the source code, that is the area where the available nodes are enumerated. If I am interpreting this correctly, the error appears to indicate that no "non-local" node is available.
I have the hosts file correct, along with the ssh key, so the user can log in without a password. What I don't know is where the system looks to identify the node IPs so that they can be enumerated.
Can someone give me a quick pointer to the correct location in the manual? I realize the answer is RTM, but I have not found it in the manual so far, so I figured I would throw it out there to the experts.
Thanks for your patience with my query.