IPython is not well suited for remote process management; in fact, that&#39;s what IPython is worst at. Where IPython is helpful is handling coordination and communication among processes *once they have been started*. If you want a tool to help start processes, better choices include Celery, Salt, Puppet, etc.<div>
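For instance, here is a minimal sketch of what that coordination looks like. All function names are illustrative, and it assumes a controller and engines were already started separately (e.g. with `ipcluster start -n 4`; in later releases `IPython.parallel` became the separate `ipyparallel` package):

```python
# Minimal sketch of "coordination once processes are started": a status
# probe fanned out to already-running engines. Names are illustrative;
# assumes a cluster started separately, e.g. `ipcluster start -n 4`.

def check_status(job_id):
    """Hypothetical status probe, executed on each engine."""
    import socket
    return (socket.gethostname(), job_id, "running")

def poll_all(job_id):
    """Fan the probe out to every engine (needs a running cluster)."""
    from IPython.parallel import Client   # `ipyparallel` in newer releases
    rc = Client()
    dview = rc[:]                         # DirectView on all engines
    return dview.apply_sync(check_status, job_id)
```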

2012/2/7 Brian Granger &lt;<a href="mailto:ellisonbg@gmail.com">ellisonbg@gmail.com</a>&gt;:<br>
&gt; Florian,<br>
&gt;<br>
&gt; On Tue, Feb 7, 2012 at 6:04 AM, Florian Lindner &lt;<a href="mailto:mailinglists@xgm.de">mailinglists@xgm.de</a>&gt; wrote:<br>
&gt;&gt; 2012/2/6 Brian Granger &lt;<a href="mailto:ellisonbg@gmail.com">ellisonbg@gmail.com</a>&gt;:<br>
<div class="im">&gt;&gt;&gt; On Sat, Feb 4, 2012 at 2:03 PM, Florian Lindner &lt;<a href="mailto:mailinglists@xgm.de">mailinglists@xgm.de</a>&gt; wrote:<br>
&gt;&gt;<br>
</div><div><div class="h5">&gt;&gt;&gt;&gt; I&#39;m currently working on a control-/queue-management software for a<br>
&gt;&gt;&gt;&gt; CFD simulation system. If consists of three parts:<br>
&gt;&gt;&gt;&gt;<br>
&gt;&gt;&gt;&gt; - A client communicates with the CFD system. It can be long running.<br>
&gt;&gt;&gt;&gt; The client can also be run standalone.<br>
&gt;&gt;&gt;&gt;<br>
&gt;&gt;&gt;&gt; - A server which does the queue management and starts up the clients.<br>
&gt;&gt;&gt;&gt; It is non-interactive<br>
&gt;&gt;&gt;&gt;<br>
&gt;&gt;&gt;&gt; - A server interface which the user uses to talk to the server, e.g.<br>
&gt;&gt;&gt;&gt; to enqueue new jobs.<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; I am following your design here, but the naming of things is a bit<br>
&gt;&gt;&gt; backwards from IPython. Here is our terminology:<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; * Engine = runs on a compute node and does the actual computation.<br>
&gt;&gt;&gt; This is where the CFD sim would run.<br>
&gt;&gt;&gt; * Controller = Schedules tasks to engines using a lightweight,<br>
&gt;&gt;&gt; low-latency scheduler.<br>
&gt;&gt;&gt; * Cluster = Starts Engines/Controller using a batch system.<br>
&gt;&gt;&gt; * Client = Frontend process that the user uses to talk to the above.<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt;&gt; Currently they are communicating via XMLRPC (from python stdlib):<br>
&gt;&gt;&gt;&gt;<br>
&gt;&gt;&gt;&gt; client &lt;---- server &lt;---- server interface.<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; This architecture is partially reinventing what is already in<br>
&gt;&gt;&gt; IPython.parallel. I would just use IPython.parallel and take<br>
&gt;&gt;&gt; advantage of everything we have there. It is extremely powerful and<br>
&gt;&gt;&gt; will outperform XMLRPC by a long shot.<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt;&gt; At this time the system works only on localhost and with one client.<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; IPython supports a wide range of cluster configurations (PBS, Torque,<br>
&gt;&gt;&gt; mpiexec, SSH, etc.) and multiple engines and clients.<br>
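For reference, a sketch of what the SSH launcher configuration can look like. Hostnames and engine counts here are made up, and the option names (taken from the parallel process docs) may differ between IPython versions:

```python
# Sketch of ipcluster_config.py for an SSH cluster profile (created with
# `ipython profile create --parallel --profile=ssh`). Hostnames and
# engine counts are illustrative; passwordless SSH is assumed.
c = get_config()

# launch the engines over SSH instead of locally
c.IPClusterEngines.engine_launcher_class = 'SSH'

# host -> number of engines to start there
c.SSHEngineSetLauncher.engines = {
    'node01.example.com': 2,
    'node02.example.com': 2,
}
```

The cluster is then started with `ipcluster start --profile=ssh`.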
&gt;&gt;&gt;<br>
&gt;&gt;&gt;&gt; Before continuing to extend it, I wonder if IPython could be useful for<br>
&gt;&gt;&gt;&gt; network communication and process management. I browsed through the<br>
&gt;&gt;&gt;&gt; docs but I&#39;m not entirely sure I got the ideas of IPython right.<br>
&gt;&gt;&gt;&gt;<br>
&gt;&gt;&gt;&gt; The user should not come into contact with IPython. The software does<br>
&gt;&gt;&gt;&gt; not do, and probably never will do, any numerically demanding<br>
&gt;&gt;&gt;&gt; calculations itself.<br>
&gt;&gt;&gt;&gt;<br>
&gt;&gt;&gt;&gt; Is IPython useful in this scenario?<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; It would be extremely useful. I would check out our cluster docs here:<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; <a href="http://ipython.org/ipython-doc/dev/parallel/index.html" target="_blank">http://ipython.org/ipython-doc/dev/parallel/index.html</a><br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; The notebook would also be useful:<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; <a href="http://ipython.org/ipython-doc/dev/interactive/htmlnotebook.html" target="_blank">http://ipython.org/ipython-doc/dev/interactive/htmlnotebook.html</a><br>
&gt;&gt;<br>
</div></div>&gt;&gt; Ok, I&#39;m still unsure about the ideas behind it all...<br>
&gt;&gt;<br>
&gt;&gt; Given I have a working SSH connection with passwordless authentication.<br>
&gt;&gt;<br>
&gt;&gt; Currently my engine is invoked like: flof.py case.conf. case.conf<br>
&gt;&gt; contains the steps for pre-processing, solving and post-processing.<br>
&gt;&gt; The data (= the case) is a directory structure and thus file-based.<br>
&gt;&gt; flof.py would be the engine.<br>
&gt;<br>
&gt; You would have to do some refactoring and make the code in flof.py a<br>
&gt; Python library that you can import and run.<br>
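Something like this hypothetical refactor of flof.py (none of these names come from the real code) keeps the standalone `flof.py case.conf` usage while exposing a function an engine can run via `view.apply(run_case, "case.conf")`:

```python
# Hypothetical refactor of flof.py: the case-running logic becomes an
# importable function instead of script-level code. Names and step list
# are illustrative placeholders, not taken from the real flof.py.
import sys

def run_case(conf_path):
    """Run pre-processing, solving and post-processing for one case."""
    steps = ["pre", "solve", "post"]      # normally parsed from conf_path
    results = []
    for step in steps:
        # placeholder for the real work done per step
        results.append((step, "ok"))
    return results

if __name__ == "__main__" and len(sys.argv) > 1:
    # standalone use stays possible: python flof.py case.conf
    print(run_case(sys.argv[1]))
```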
&gt;<br>
&gt;&gt; - Code Distribution: Does IPython provide a method for code<br>
&gt;&gt; distribution, or does the code need to be copied manually before<br>
&gt;&gt; starting it on a remote compute node? Does IPython start the remote process?<br>
&gt;<br>
&gt; Yes, in some cases it can do that, but in most cases you want to have<br>
&gt; the code installed on the compute nodes.<br>
&gt;<br>
&gt;&gt; - Data Distribution: Does IPython provide a method for data<br>
&gt;&gt; distribution (copying the case over to a remote node)? I think I read<br>
&gt;&gt; that there is no data distribution method.<br>
&gt;<br>
&gt; Yes, definitely.<br>
<br>
<a href="http://ipython.org/ipython-doc/stable/parallel/parallel_process.html" target="_blank">http://ipython.org/ipython-doc/stable/parallel/parallel_process.html</a> says:<br>
&quot;SSH mode does not do any file movement, so you will need to<br>
distribute configuration files manually.&quot; I need to copy over files (a<br>
directory).<br>
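One workaround sketch, under the assumption that reading the case into memory is acceptable: walk the local case directory and recreate it on each engine through a DirectView. `write_files` runs engine-side; all names are illustrative:

```python
# Ship a file-based case directory to engines by pushing file contents
# through a DirectView (a sketch around the "SSH mode does not do any
# file movement" limitation; suitable only for modestly sized cases).
import os

def write_files(file_map):
    """Runs on an engine: materialise {path: bytes} on its local disk."""
    import os
    for path, data in file_map.items():
        os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
        with open(path, "wb") as f:
            f.write(data)
    return sorted(file_map)

def ship_case(case_dir, view):
    """Read a local case directory and recreate it on each engine."""
    file_map = {}
    for root, _dirs, names in os.walk(case_dir):
        for name in names:
            path = os.path.join(root, name)
            with open(path, "rb") as f:
                file_map[path] = f.read()
    return view.apply_sync(write_files, file_map)
```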
<br>
&gt;&gt; My server (controller) starts up the engines.<br>
&gt;&gt;<br>
&gt;&gt; - Can I tell the controller to start up a specific job on a specific machine?<br>
&gt;<br>
&gt; Yes, see the DirectView docs.<br>
<br>
I&#39;ve seen it and played around with it, but found no way to<br>
launch an engine on a specific host. I didn&#39;t dive further into this.<br>
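One way to approximate host targeting, sketched under the assumption that the engines are already running: DirectView selects by integer engine id, not hostname, so you can ask each engine where it lives and build the id list yourself ("node03" is a made-up hostname):

```python
# Build a host -> engine-id mapping so a DirectView can be restricted
# to the engines on one specific machine. Sketch only.

def hostname():
    """Executed on each engine."""
    import socket
    return socket.gethostname()

def ids_for_host(rc, host):
    """Return the engine ids that live on `host`."""
    dview = rc[:]                           # all engines
    hosts = dview.apply_sync(hostname)      # one hostname per engine id
    return [eid for eid, h in zip(rc.ids, hosts) if h == host]

# usage (needs a running cluster):
#   from IPython.parallel import Client    # `ipyparallel` in newer releases
#   rc = Client()
#   view = rc[ids_for_host(rc, "node03")]  # DirectView on that machine only
#   view.apply_sync(some_job)
```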
<br>
&gt;&gt; The thing is that I don&#39;t need queue management, load balancing, etc.<br>
&gt;&gt; I&#39;m looking for a tool that helps me control the remote jobs,<br>
&gt;&gt; check and exchange information about their status and, if needed,<br>
&gt;&gt; kill them. (A job is a process instance of flof.py.)<br>
&gt;<br>
&gt; I strongly encourage you to read the docs I linked to above. It will<br>
&gt; answer all of these questions and more.<br>
<br>
I did read them, and unfortunately they do not.<br>
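For what it's worth, the control/status part does map onto IPython.parallel's AsyncResult handles. A sketch with illustrative names; note that abort() only cancels tasks still waiting in the queue, so a job that is already running has to be stopped by shutting down its engine:

```python
# Sketch of job tracking with AsyncResult-style handles (illustrative
# names; a real run needs a cluster and a real view from Client()).

def submit_and_track(view, conf_paths, run_case):
    """Start one job per case file and keep the AsyncResult handles."""
    return {conf: view.apply_async(run_case, conf) for conf in conf_paths}

def status(handles):
    """Non-blocking status summary for every submitted job."""
    return {conf: ("done" if ar.ready() else "running")
            for conf, ar in handles.items()}

# usage (needs a running cluster):
#   handles = submit_and_track(rc[:], ["case1.conf", "case2.conf"], run_case)
#   print(status(handles))
#   handles["case2.conf"].abort()   # only works while still queued
```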
<br>
Well, I will try to split up my questions into follow-up threads.<br>
<br>
Regards,<br>
<div class="HOEnZb"><div class="h5"><br>
Florian<br>
_______________________________________________<br>
IPython-User mailing list<br>
<a href="mailto:IPython-User@scipy.org">IPython-User@scipy.org</a><br>
<a href="http://mail.scipy.org/mailman/listinfo/ipython-user" target="_blank">http://mail.scipy.org/mailman/listinfo/ipython-user</a><br>
</div></div></blockquote></div><br></div></div>