I'm relatively certain it's possible, but then you have to deal with
locks, semaphores, synchronization, etc.
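
A minimal sketch of the shared-memory route, assuming the data fits in one
float64 buffer and that each worker only touches its own slice. The helper
names (as_ndarray, add_to_slice) are made up for illustration and are not
part of numpy or multiprocessing. It uses multiprocessing.Array, which
allocates the buffer in shared memory and carries its own lock, and
np.frombuffer to view that buffer as an array without copying:

```python
import multiprocessing as mp

import numpy as np


def as_ndarray(shared):
    """View a synchronized multiprocessing.Array as a NumPy array (no copy)."""
    return np.frombuffer(shared.get_obj(), dtype=np.float64)


def add_to_slice(shared, start, stop, value):
    """Worker: add `value` to shared[start:stop] while holding the array's lock."""
    with shared.get_lock():
        as_ndarray(shared)[start:stop] += value


def main():
    n = 8
    # "d" means C doubles; the buffer is zero-initialized and lives in shared memory.
    shared = mp.Array("d", n)
    half = n // 2
    workers = [
        mp.Process(target=add_to_slice, args=(shared, 0, half, 1.0)),
        mp.Process(target=add_to_slice, args=(shared, half, n, 2.0)),
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    # The parent sees the workers' writes because the memory is shared, not pickled.
    print(as_ndarray(shared))


if __name__ == "__main__":
    main()
```

Since the two slices here do not overlap, the lock is not strictly needed; it
is included because that is the bookkeeping you take on as soon as workers can
touch the same elements.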
On Thu, Jul 2, 2009 at 12:04 PM, Sebastian Haase<seb.haase@gmail.com> wrote:
> On Thu, Jul 2, 2009 at 5:38 PM, Chris Colbert<sccolbert@gmail.com> wrote:
>> Who are you quoting, Sebastian?
>>
>> Multiprocessing is a Python package that spawns multiple Python
>> processes, effectively side-stepping the GIL, and provides easy
>> mechanisms for IPC. Hence the need for serialization.
>>
> I was replying to the OP's email.
>
> Regarding your comment: can separate processes not access the same
> memory space, via shared memory?
> I think there was a discussion about this not too long ago on this list.
>
> -S.
>
>>
>> On Thu, Jul 2, 2009 at 11:30 AM, Sebastian Haase<seb.haase@gmail.com> wrote:
>>> On Thu, Jul 2, 2009 at 5:14 PM, Chris Colbert<sccolbert@gmail.com> wrote:
>>>> Can you hold the entire file in memory as a single array with room to spare?
>>>> If so, you could use multiprocessing and load a bunch of smaller
>>>> arrays, then join them all together.
>>>>
>>>> It won't be super fast, because serializing a numpy array is somewhat
>>>> slow when using multiprocessing. That said, it's still faster than disk
>>>> transfers.
>>>>
>>>> I'm sure some numpy expert will come on here though and give you a
>>>> much better idea.
>>>>
>>>> On Wed, Jul 1, 2009 at 7:57 AM, Mag Gam<magawake@gmail.com> wrote:
>>>>> Is it possible to use loadtxt in a multi-threaded way? Basically, I want
>>>>> to process a very large CSV file (100+ million records) and instead of
>>>>> loading a thousand elements into a buffer, processing them, then loading
>>>>> another thousand elements and processing them, and so on...
>>>>>
>>>>> I was wondering if there is a technique where I can use multiple
>>>>> processors to do this faster.
>>>>>
>>>>> TIA
>>>
>>> Do you know about the GIL (global interpreter lock) in Python?
>>> It means that Python isn't doing "real" multithreading...
>>> Only if one thread is e.g. doing some slow or blocking I/O stuff, the
>>> other thread could keep working, e.g. doing CPU-heavy numpy stuff.
>>> But you would not get 2-CPU numpy code - except for some C-implemented
>>> "long running" operations -- these should be programmed in a way that
>>> releases the GIL so that the other CPU could go on doing its Python
>>> code.
>>>
>>> HTH,
>>> Sebastian Haase
>>> _______________________________________________
>>> Numpy-discussion mailing list
>>> Numpy-discussion@scipy.org
>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion