Hi,
I would strongly suggest not using 'from numpy import *' etc., but
rather 'import numpy'. In particular, you want to ensure that you are
using numpy.exp, not math.exp, on numpy objects.
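For example (a quick illustration: math.exp only accepts scalars, so it
fails outright on an array, while numpy.exp works elementwise):

```python
import math
import numpy

y = numpy.ones(5)
print(numpy.exp(y))  # elementwise exp over the whole array

try:
    math.exp(y)  # math.exp expects a scalar, not an array
except TypeError as err:
    print('math.exp failed: %s' % err)
```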
Also, please ensure that you have at least 3 processors available (the
default). If not, you may introduce problems, especially if you only
have two processors, because one processor will be used by the system
for other tasks.
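One way to check how many processors you have (assuming Python 2.6 or
later, where the multiprocessing module is available):

```python
import multiprocessing

# handythread uses 3 worker threads by default, so ideally you want
# at least that many cores free
print(multiprocessing.cpu_count())
```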
Without knowing your 'simple for-loop', I do not see what you
apparently see. Here is what I tried:
from numpy import ones, exp
from handythread import foreach  # handythread.py from the Cookbook entry
import time

def f(x):
    y = ones(10000000)
    exp(y)

if __name__ == '__main__':
    t1 = time.time()
    foreach(f, range(100))
    t2 = time.time()
    for ndx in range(100):
        y = ones(10000000)
        exp(y)
    t3 = time.time()
    print 'simple loop / handythread =', (t3 - t2) / (t2 - t1)
With this code, the 'for loop' takes about 2.7 times as long as the
handythread loop on a quad-core system. Further, on my Linux system I
can see via 'top' that handythread is using 3 of the four cores, and
this drops to 1 with the loop. Note this is not the 3-to-1 ratio that
would be expected from a linear speedup, but it is close; there is
overhead involved. If you have limited resources (i.e. memory or
processors) or another OS that is not fully multithreaded, you may run
into additional problems, since handythread.py assumes everything is
possible.
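For reference, the core idea behind foreach can be sketched in a few
lines (a simplified version, not the Cookbook's exact implementation;
note it only helps when f releases the GIL, e.g. inside numpy.exp):

```python
import threading

def foreach_sketch(f, items, threads=3):
    """Call f on every element of items using a pool of worker threads."""
    items = list(items)
    # deal items round-robin into one chunk per thread
    chunks = [items[i::threads] for i in range(threads)]

    def worker(chunk):
        for x in chunk:
            f(x)

    workers = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```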
Regards
Bruce
On Wed, Feb 20, 2008 at 10:22 PM, Anand Patil
<anand.prabhakar.patil@gmail.com> wrote:
> Hi all,
>
> I have a question primarily for Anne Archibald, the author of the
> cookbook entry on multithreading,
> http://www.scipy.org/Cookbook/Multithreading
>
> I tried replacing the "if __name__ == '__main__'" clause in the attachment
> handythread.py with
>
> from numpy import ones, exp
> def f(x):
>     print x
>     y = ones(10000000)
>     exp(y)
>
> and the wall-clock time with foreach was 4.72s vs 6.68s for a simple for-loop.
>
> First of all, that's amazing! I've been internally railing against the
> GIL for months. But it looks like only a portion of f is being done
> concurrently. In fact if I comment out the 'exp(y)', I don't see any
> speedup at all.
>
> It makes sense that you can't malloc simultaneously from different
> threads... but if I replace 'ones' with 'empty', the time drops
> precipitously, indicating that most of the time taken by 'ones' is
> spent actually filling the array with ones. It seems like you should
> be able to do that concurrently.
>
> So my question is, what kinds of numpy functions tend to release the
> GIL? Is there a system to it, so that one can figure out ahead of time
> where a speedup is likely, or do you have to try and see? Do
> third-party f2py functions with the 'threadsafe' option release the
> GIL?
>
> Thanks,
> Anand
> _______________________________________________
> SciPy-user mailing list
> SciPy-user@scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user