Oops! Having just done an svn update, I now see that David appears to
have done most of this about a week ago...
I'm behind the times.
-tim
Tim Hochberg wrote:
>I've finally got around to looking at numexpr again. Specifically, I'm
>looking at Francesc Altet's numexpr-0.2, with the idea of harmonizing
>the two versions. Let me go through his list of enhancements and comment
>(my comments are dedented):
>
> - Addition of a boolean type. This allows better array copying times
>   for large arrays (lightweight computations are typically bounded by
>   memory bandwidth).
>
>Adding this to numexpr looks like a no-brainer. The behaviour of booleans
>is different from that of integers, so in addition to being more memory
>efficient, this enables boolean &, |, ~, etc. to work properly.
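To make the boolean-versus-integer difference concrete, here is a small sketch in plain NumPy (not numexpr code; the variable names are made up for illustration). It shows why ~ needs a real boolean type to behave as logical NOT, and the per-element storage difference:

```python
import numpy as np

flags = np.array([True, False, True])
as_ints = flags.astype(np.int32)

# On a boolean array, ~ is logical NOT.
not_flags = ~flags

# On an integer array, ~ is bitwise complement, so 1 -> -2 and 0 -> -1.
not_ints = ~as_ints

# A bool array stores one byte per element; int32 stores four.
bool_bytes = flags.itemsize
int_bytes = as_ints.itemsize
```

With integers standing in for booleans, an expression such as ~(a > 10) would yield -1 and -2 rather than True and False, which is presumably the kind of breakage a dedicated boolean type avoids.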
>
> - Enhanced performance for strided and unaligned data, especially for
>   lightweight computations (e.g. 'a>10'). With this and the addition of
>   the boolean type, we can get up to 2x better times than previous
>   versions. Also, most of the supported computations go faster than
>   with numpy or numarray, even the simplest ones.
>
>Francesc, if you're out there, can you briefly describe what this
>support consists of? It's been long enough since I was messing with this
>that it's going to take me a while to untangle NumExpr_run, where I
>expect it's lurking, so any hints would be appreciated.
>
> - Addition of ~, & and | operators (a la numarray.where)
>
>Sounds good.
>
> - Support for both numpy and numarray (use the flag --force-numarray
>   in setup.py).
>
>At first glance this looks like it doesn't make things too messy, so I'm
>in favor of incorporating this.
>
> - Added a new benchmark for testing boolean expressions and
>   strided/unaligned arrays: boolean_timing.py
>
>Benchmarks are always good.
>
> Things that I want to address in the future:
>
> - Add tests on strided and unaligned data (currently only tested
>   manually)
>
>Yep! Tests are good.
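As a rough idea of what such test inputs could look like, the sketch below builds a strided and an unaligned float64 array using NumPy only. The helper name check is hypothetical, and the comparison is done with NumPy itself; a real test would presumably put the numexpr result for 'a>10' on the left-hand side instead.

```python
import numpy as np

n = 1000
base = np.arange(2 * n, dtype=np.float64)

# Strided: every other element, so the stride is 16 bytes rather than 8.
strided = base[::2]

# Unaligned: view float64 data starting one byte into a byte buffer.
raw = np.zeros(n * 8 + 1, dtype=np.uint8)
unaligned = raw[1:].view(np.float64)
unaligned[:] = strided


def check(a):
    # Hypothetical helper: the result on the awkward layout must match
    # the result on a fresh contiguous, aligned copy of the same data.
    reference = a.copy()
    return np.array_equal(a > 10, reference > 10)
```

The flags attribute (for example unaligned.flags.aligned) can be used inside a test to confirm that the input really has the layout it is meant to exercise.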
>
> - Add types for int16, int64 (in 32-bit platforms), float32,
>   complex64 (single prec.)
>
>I have some specific ideas about how this should be accomplished.
>Basically, I don't think we want to support every type in the same way,
>since this is going to make the case statement blow up to an enormous
>size. This may slow things down and at a minimum it will make things
>less comprehensible. My thinking is that we only add casts for the extra
>types and do the computations at high precision. Thus adding two int16
>numbers compiles to two OP_CAST_Ffs followed by an OP_ADD_FFF, and then
>an OP_CAST_fF. The details are left as an exercise to the reader ;-).
>So, adding int16, float32, complex64 should only require the addition of
>6 casting opcodes plus appropriate modifications to the compiler.
>
>For large arrays, this should have most of the benefits of giving each
>type its own opcode, since the memory bandwidth used is still small, while
>keeping the interpreter relatively simple.
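The cast-then-compute idea can be mimicked in NumPy as a rough sketch. Everything here is an assumption made for illustration: the function name add_narrow, the block size of 128, and the choice of int32 as the wide type for int16. It is not the numexpr interpreter, only the same sequence of steps (two widening casts, one wide add, one narrowing cast) applied block by block:

```python
import numpy as np


def add_narrow(a, b, wide=np.int32, block=128):
    """Add two narrow-typed arrays using only wide-typed arithmetic."""
    out = np.empty(len(a), dtype=a.dtype)
    for start in range(0, len(a), block):
        stop = start + block
        wa = a[start:stop].astype(wide)        # cast, narrow -> wide
        wb = b[start:stop].astype(wide)        # cast, narrow -> wide
        ws = wa + wb                           # the existing wide add
        out[start:stop] = ws.astype(a.dtype)   # cast, wide -> narrow
    return out


a = np.arange(300, dtype=np.int16)
b = np.full(300, 7, dtype=np.int16)
result = add_narrow(a, b)
```

Only the block-sized temporaries are wide; the full input and output arrays stay at int16 width, which is why traffic to main memory should remain close to what a dedicated int16 opcode would need.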
>
>Unfortunately, int64 doesn't fit under this scheme; is it used enough to
>matter? I hate to pile a whole pile of new opcodes on for something that's
>rarely used.
>
>Regards,
>
>-tim
>
>_______________________________________________
>Numpy-discussion mailing list
>Numpy-discussion at lists.sourceforge.net
>https://lists.sourceforge.net/lists/listinfo/numpy-discussion