The Cython function below returns a long int:
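import numpy as np
cimport numpy as np
cimport cython

@cython.boundscheck(False)
def mysum(np.ndarray[np.int64_t, ndim=1] a):
    "sum of 1d numpy array with dtype=np.int64."
    cdef Py_ssize_t i
    cdef int asize = a.shape[0]
    cdef np.int64_t asum = 0
    # typed loop over a typed buffer, so this compiles to a plain C loop
    for i in range(asize):
        asum += a[i]
    return asum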
What's the best way to make it return a NumPy integer scalar (np.int64,
if that's the right name for it) that has dtype, ndim, size, etc.
attributes? The only thing I could come up with is changing the last
line to
return np.array(asum)[()]
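To show what that last line produces, here is a small plain-Python sketch (no Cython needed). The variable value is just a stand-in for asum, and calling np.int64(...) directly is an assumed alternative that is not part of the function above; I have not checked how it behaves inside the compiled function.

```python
import numpy as np

value = 45  # stand-in for asum once it has been converted to a Python int

# Approach from above: wrap in a 0-d array, then index with an empty tuple
via_array = np.array(value, dtype=np.int64)[()]

# Assumed alternative (not in the function above): call the scalar type directly
via_scalar = np.int64(value)

# Both objects expose the array-like attributes dtype, ndim and size
for result in (via_array, via_scalar):
    print(type(result).__name__, result.dtype, result.ndim, result.size)
```

Both objects are NumPy scalars rather than Python ints, so either one would provide the attributes asked about.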
It works, but it adds some overhead (mysum2 is mysum with the modified last line):
>> a = np.arange(10)
>> timeit mysum(a)
10000000 loops, best of 3: 167 ns per loop
>> timeit mysum2(a)
1000000 loops, best of 3: 984 ns per loop
And for scale:
>> timeit np.sum(a)
100000 loops, best of 3: 3.3 us per loop
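The wrapping overhead on its own can be estimated outside Cython with the standard timeit module. This is only a rough sketch: best_ns is a hypothetical helper, value is a stand-in for asum, and np.int64(value) is an assumed alternative to the 0-d array trick. The numbers will depend on the machine and the NumPy version, so none are quoted here.

```python
import timeit

# Setup run once per timing; value stands in for the computed sum
SETUP = "import numpy as np; value = 45"

def best_ns(stmt, number=100000, repeat=3):
    """Best-of-`repeat` time per call, in nanoseconds (hypothetical helper)."""
    times = timeit.repeat(stmt, setup=SETUP, number=number, repeat=repeat)
    return min(times) / number * 1e9

# Cost of turning an integer into a NumPy scalar, two ways
t_array = best_ns("np.array(value)[()]")
t_scalar = best_ns("np.int64(value)")

print("np.array(value)[()]: %.0f ns per call" % t_array)
print("np.int64(value):     %.0f ns per call" % t_scalar)
```

This measures only the conversion step from pure Python, not the compiled function, so it is a lower-effort way to compare the two conversions before rebuilding the extension.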
I'm new to Cython. Did I miss any optimizations in the mysum function above?