<br>My intention is to parallelize a small Python script that calls an external set of scripts to process the dataset I have in hand. It is not a hugely computing-power-demanding task, but on my 2.5 GHz Intel Core 2 Duo it takes about 1.5 hours to process the whole dataset. Looking at the system monitor, I see that the workload is not equally distributed between the CPUs (one of them is usually much lazier than the other), so I am sure parallelizing the run would boost the processing speed. My dataset consists of 17 folders, each independent of the others. My script visits each folder and calls the main external script via the subprocess module&#39;s call function. Processing starts with the first folder and does not move on to the next folder until processing of the previous one has finished. Basically, what I really want is to run the externally called scripts in separate threads, so that I don&#39;t have to wait for the previous job to finish before starting the next.<br>

<br>From the IPython parallel computing documentation, it seems that what I want is doable in IPython. However, I need some advice on whether my understanding is correct in this respect, and also on how to resolve the warning messages below. <br>

<br></blockquote></div><div><br>Yes, I think it would work just fine for that. If you have the names of the folders and a function that computes what you want given a folder name, you should be able to just use MultiEngineClient.map.<br>
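For concreteness, here is a rough sketch of what that suggestion might look like with the old-style (IPython 0.x) multiengine interface. It assumes a cluster is already running (e.g. via ipcluster local -n 2); process_folder is a hypothetical per-folder worker I made up for illustration, not code from this thread:

```python
from subprocess import call

def process_folder(folder):
    # Hypothetical worker: run the external pipeline in one dataset folder.
    # Passing cwd= runs the script in that folder without chdir-ing the
    # whole process.
    return call(['postprocessing_saudi'], cwd=folder)

def run_parallel(folders):
    # Requires a running ipcluster; the import is kept inside the function
    # so this sketch can be read/loaded without IPython's parallel support.
    from IPython.kernel import client
    mec = client.MultiEngineClient()
    # map distributes one folder per engine task and collects the
    # return codes in order.
    return mec.map(process_folder, folders)
```

The folder names would be the 17 dataset directories, collected however is convenient (e.g. with os.listdir or os.walk).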
</div></div></blockquote><div><br>This is the script in hand that I want to parallelize:<br><br><br>import os<br>from subprocess import call<br><br>init = os.getcwd()<br><br>for root, dirs, files in os.walk(&#39;.&#39;):<br>&nbsp;&nbsp;&nbsp;&nbsp;dirs.sort()<br>&nbsp;&nbsp;&nbsp;&nbsp;for file in files:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if file.endswith(&#39;.sea&#39;):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;print file<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;os.chdir(root)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;print os.getcwd()<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;call([&#39;postprocessing_saudi&#39;, file])<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;os.chdir(init)<br><br>I call this script from the top of the dataset folder hierarchy, and whenever a file ending in &quot;.sea&quot; is encountered it executes a set of external scripts, starting with the postprocessing_saudi bash script and continuing with IDL, Perl, and Python scripts until processing of that &quot;.sea&quot; file finishes, and so on until the directories are exhausted. <br>
<br>If I can get the parallel functionality working, will I need to make any changes to this code? And could you be a little more descriptive about the use of MultiEngineClient.map?<br><br>Thanks for your comments.<br><br> </div>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="gmail_quote"><div>
<br>Cheers,<br><font color="#888888"><br>Brian<br> <br></font></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div><div></div><div class="h5">
Thanks.<br><br><br>[gsever@ccn Desktop]$ ipcluster local -n 4<br>
/usr/lib/python2.6/site-packages/Twisted-8.2.0-py2.6-linux-i686.egg/twisted/python/filepath.py:12: DeprecationWarning: the sha module is deprecated; use the hashlib module instead<br>